READ_KAFKA
Syntax
The read_kafka function reads data from Kafka. It supports the following parameters:
- bootstrap: Kafka server address, such as 1.2.3.1:9092,1.2.3.2:9092.
- topic: Kafka topic name; multiple topics are separated by commas, such as topicA,topicB.
- topic_pattern: Topic regular expression. Not supported yet; leave it empty (the default), i.e. ''.
- group_id: Kafka consumer group ID.
- STARTING_OFFSETS: Specify the starting offset to read from; default is earliest.
- ENDING_OFFSETS: Specify the ending offset to read to; default is latest.
. - STARTING_OFFSETS_TIMESTAMP: Specify the timestamp for the starting offset.
- ENDING_OFFSETS_TIMESTAMP: Specify the timestamp for the ending offset.
- KEY_FORMAT: Format of the key, as a case-insensitive STRING. Currently only the raw format is supported.
- VALUE_FORMAT: Format of the value, as a case-insensitive STRING. Currently only the raw format is supported.
- MAX_ERROR_NUMBER: The maximum number of error rows allowed within the read window. Must be in the range 0-100000. Default is 0, meaning no error rows are allowed.
- MAP(): Additional parameters passed through to Kafka. Each key is prefixed with kafka. followed by the native Kafka configuration name, for example MAP('kafka.security.protocol', 'PLAINTEXT', 'kafka.auto.offset.reset', 'earliest'). Refer to the Kafka documentation for the supported keys and values.
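Putting the parameters above together, a call might look like the sketch below. The named-argument (=>) invocation style and the group ID are assumptions based on the parameter list; check your engine's actual table-function syntax:

```sql
SELECT *
FROM read_kafka(
  bootstrap        => '1.2.3.1:9092,1.2.3.2:9092',  -- Kafka broker list
  topic            => 'topicA,topicB',              -- comma-separated topics
  group_id         => 'example_group',              -- hypothetical consumer group ID
  STARTING_OFFSETS => 'earliest',
  ENDING_OFFSETS   => 'latest',
  KEY_FORMAT       => 'raw',
  VALUE_FORMAT     => 'raw',
  MAX_ERROR_NUMBER => 0,
  MAP('kafka.security.protocol', 'PLAINTEXT')       -- pass-through Kafka config
);
```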
The read_kafka function returns the following fields:

| Field | Meaning | Type |
| --- | --- | --- |
| topic | Kafka topic name | STRING |
| partition | Data partition ID | INT |
| offset | Offset within the Kafka partition | BIGINT |
| timestamp | Kafka message timestamp | TIMESTAMP_LTZ |
| timestamp_type | Kafka message timestamp type | STRING |
| headers | Kafka message headers | MAP<STRING, BINARY> |
| key | Kafka message key | BINARY |
| value | Kafka message value | BINARY |
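Because key and value are returned as BINARY, a typical follow-up step is to cast them to STRING before further processing. This is a sketch; the CAST behavior for raw-format payloads and the argument style are assumptions:

```sql
SELECT
  topic,
  partition,
  offset,
  CAST(key AS STRING)   AS key_str,   -- decode raw-format key
  CAST(value AS STRING) AS value_str  -- decode raw-format value
FROM read_kafka(
  bootstrap => '1.2.3.1:9092',
  topic     => 'topicA'
);
```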
Notes
- When using read_kafka, ensure the Kafka cluster is network-reachable from Lakehouse.
Examples
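The sketch below reads a bounded window of messages by timestamp. The invocation style, the consumer group ID, and the timestamp unit are assumptions (Kafka timestamps are typically epoch milliseconds):

```sql
-- Read messages produced between two points in time from topicA
SELECT
  timestamp,
  CAST(value AS STRING) AS payload
FROM read_kafka(
  bootstrap                  => '1.2.3.1:9092,1.2.3.2:9092',
  topic                      => 'topicA',
  group_id                   => 'example_group',   -- hypothetical
  STARTING_OFFSETS_TIMESTAMP => 1700000000000,     -- epoch ms (assumed unit)
  ENDING_OFFSETS_TIMESTAMP   => 1700003600000,
  VALUE_FORMAT               => 'raw',
  MAP('kafka.security.protocol', 'PLAINTEXT')
);
```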