Table Stream
Table Stream is the Lakehouse's Change Data Capture (CDC) mechanism. It captures the INSERT, UPDATE, and DELETE changes made to a table so that downstream tasks can consume them.
Analogy: a Table Stream works like a change-log subscription on a table. Each time you consume from the stream, you receive only the change records added since the previous consumption.
Core Concepts
Offset: The position up to which the stream has been consumed. After each query against the stream, the offset advances automatically, so the next query returns only the changes made after that position.
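Change Type Column: Each row in a stream query result carries a _change_type column whose value is one of INSERT, UPDATE_BEFORE, UPDATE_AFTER, or DELETE. An update is represented by a pair of records: UPDATE_BEFORE (the row before the change) and UPDATE_AFTER (the row after it).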
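The offset and change-type concepts above can be illustrated with a small in-memory model. This is only a conceptual sketch in Python, not the Lakehouse API: the names ToyTable, ToyStream, and consume are hypothetical, and the model returns every logged change as-is, without any merging of changes that a real stream might apply.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical in-memory table that appends every change to a log."""
    rows: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def _record(self, key, value, change_type):
        self.log.append({"key": key, "value": value, "_change_type": change_type})

    def insert(self, key, value):
        self.rows[key] = value
        self._record(key, value, "INSERT")

    def update(self, key, value):
        old = self.rows[key]
        self.rows[key] = value
        # An update yields two records: the old row image, then the new one.
        self._record(key, old, "UPDATE_BEFORE")
        self._record(key, value, "UPDATE_AFTER")

    def delete(self, key):
        old = self.rows.pop(key)
        self._record(key, old, "DELETE")


@dataclass
class ToyStream:
    """Hypothetical stream: remembers how far into the log it has read."""
    table: ToyTable
    offset: int = 0

    def consume(self):
        # Return only changes past the offset, then advance the offset.
        changes = self.table.log[self.offset:]
        self.offset = len(self.table.log)
        return changes


table = ToyTable()
stream = ToyStream(table)

table.insert(1, "a")
table.update(1, "b")
first = stream.consume()
print([c["_change_type"] for c in first])   # ['INSERT', 'UPDATE_BEFORE', 'UPDATE_AFTER']

table.delete(1)
second = stream.consume()
print([c["_change_type"] for c in second])  # ['DELETE']

third = stream.consume()
print(third)                                # []
```

The third call returns nothing because no changes occurred after the offset advanced, which mirrors the "only changes since the last consumption" behavior described above.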
Comparison with Pipe
| Scenario | Recommended Solution |
|---|---|
| Continuously import data from Kafka/OSS | Pipe |
| Capture changes from Lakehouse tables to drive downstream processing | Table Stream |
| CDC sync of external database changes to Lakehouse | Studio real-time sync task |
