Using Java to Upload Data in Batch (BulkLoadV1)
Maven Dependencies
You can include the clickzetta-java SDK via Maven dependency:
Search for clickzetta-java in the Maven repository to find the latest version.
Creating a BulkloadStream
To create a bulk write stream through the Singdata client, refer to the following example code:
Options
Starting from clickzetta-java version 3.0.18, options are available for specifying upload settings, including the partition specification.
- withPartitionSpecs is used to specify partition information for the target table, controlling the partition behavior for data writes.
  - Non-partitioned table: Ignore this parameter or set it to empty.
  - Partitioned table:
    - Static partition write: Writes all data to a designated fixed partition. Regardless of the actual values in the partition column of the source data, the partition_spec value is used when writing to the target table, and all data is written to the same specified partition. The parameter format is 'partition_col1=value1,partition_col2=value2'.
    - Dynamic partition write: Automatically writes to the corresponding partition based on the actual values of the partition column in the data. By ignoring this parameter, the system automatically creates or writes to the appropriate partition based on the values of the partition column in the data.
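The static partition format described above, 'partition_col1=value1,partition_col2=value2', can be assembled from column/value pairs rather than concatenated by hand. The following is a minimal sketch in plain Java. PartitionSpecs and its format method are hypothetical helpers written for illustration and are not part of clickzetta-java. The sketch also assumes that column names and values contain no '=' or ',' characters, since the source does not describe any escaping rules.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; NOT part of the clickzetta-java SDK.
final class PartitionSpecs {
    private PartitionSpecs() {}

    /**
     * Formats column/value pairs as "col1=value1,col2=value2".
     * Pairs are emitted in the map's iteration order, so pass a LinkedHashMap
     * if the order of the partition columns matters.
     * An empty map yields an empty string (non-partitioned or dynamic write).
     */
    static String format(Map<String, String> partitions) {
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, String> entry : partitions.entrySet()) {
            String column = entry.getKey();
            String value = entry.getValue();
            if (column == null || column.isEmpty() || value == null || value.isEmpty()) {
                throw new IllegalArgumentException("partition column and value must be non-empty");
            }
            // Assumption: no escaping exists, so reject the two separator characters.
            if (containsSeparator(column) || containsSeparator(value)) {
                throw new IllegalArgumentException("'=' and ',' are not allowed in: " + column + "=" + value);
            }
            joiner.add(column + "=" + value);
        }
        return joiner.toString();
    }

    private static boolean containsSeparator(String s) {
        return s.indexOf('=') >= 0 || s.indexOf(',') >= 0;
    }
}
```

The resulting string is what would be passed as the static partition specification. For a dynamic partition write, the parameter is simply not set.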
Operation Types
When creating a Bulkload, you can specify the following operation types via the operate method:
- RowStream.BulkLoadOperate.APPEND: Append mode, adds data to the table.
- RowStream.BulkLoadOperate.OVERWRITE: Overwrite mode, deletes existing data in the table before writing new data.
Writing Data
Use the Row object to represent the specific data to be written. Encapsulate data into the Row object by calling the row.setValue method.
- The createRow method creates a Row object and requires an integer as a shard ID. This ID can be used with multi-threading/multi-processing techniques, where multiple mutually distinct shard IDs are used to write data, effectively improving data write speed.
- The first parameter of the setValue method is the field name, and the second parameter is the specific data. The data type must match the table column type.
- The apply method is used to write data, requiring the Row object and the corresponding shard ID.
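The idea of writing with mutually distinct shard IDs from several threads can be sketched as follows. This is an illustration of the threading pattern only, not SDK code. ShardedWriter and the ShardSink interface are hypothetical stand-ins. In a real program, the body of ShardSink.write would be where a Row is created for the given shard ID, filled with setValue, and written with apply. The round-robin assignment of records to shards is likewise an assumption chosen for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; neither class is part of the clickzetta-java SDK.
final class ShardedWriter {
    private ShardedWriter() {}

    /** Stand-in for "create a Row for this shard ID, set its values, apply it". */
    interface ShardSink<T> {
        void write(int shardId, T record);
    }

    /**
     * Distributes records round-robin over shardCount worker threads.
     * Each worker uses its own shard ID (0 .. shardCount - 1), so no two
     * threads ever write with the same shard ID.
     */
    static <T> void writeAll(List<T> records, int shardCount, ShardSink<T> sink) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(shardCount);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int shard = 0; shard < shardCount; shard++) {
                final int shardId = shard;
                Runnable task = () -> {
                    // Worker shardId handles records shardId, shardId + shardCount, ...
                    for (int i = shardId; i < records.size(); i += shardCount) {
                        sink.write(shardId, records.get(i));
                    }
                };
                futures.add(pool.submit(task));
            }
            // Wait for every worker so failures surface before any commit step.
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while writing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard writer failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Waiting on all futures before returning means the caller only moves on once every shard has finished writing or one of them has failed.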
Writing Complex Type Data
Committing Data
Batch-written data becomes visible only after it is committed, so the commit step must not be skipped.
- Use bulkloadStream.getState() to get the status of the BulkloadStream.
- If the commit fails, use bulkloadStream.getErrorMessage() to get the error message.
Usage Example
The following is an example of using Bulkload to write complex type data:
- The Lakehouse URL can be found in Lakehouse Studio under Management -> Workspace by checking the JDBC connection string.

