Studio Task Development and Operations in Practice

Studio tasks are the units that Singdata Lakehouse schedules and executes. Several task types are supported, including SQL, Python, and Shell. This series covers practical task development scenarios, and each article contains complete, runnable code along with verified output.

Series Articles

Studio Python Task Development Guide (ZettaPark) — Read and write Lakehouse data via the ZettaPark DataFrame API, using RFM customer segmentation as a complete development walkthrough
Studio Python Task Development Guide (Python Connector) — Obtain a PEP 249 cursor via engine.raw_connection(), use executemany for bulk writes and cursor queries, using user behavior funnel analysis as an example
Studio Shell Task Development Guide — Run Bash scripts in a server-side Linux environment, call HTTP APIs to pull data and write to Lakehouse, including runtime environment details and common patterns
Studio JDBC Task Development Guide — Connect to external databases (MySQL, etc.) via JDBC to execute SQL, covering two scenarios: pre-sync exploration and data source issue investigation
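
As a rough illustration of the PEP 249 pattern named in the Python Connector guide above, the sketch below bulk-inserts rows with executemany and then reads a funnel count back through the cursor. It uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; in a Studio task the connection would presumably come from engine.raw_connection() instead. The table name user_events, its columns, and the sample rows are hypothetical, and the `?` placeholder style is specific to sqlite3 (a different driver may expect another paramstyle).

```python
import sqlite3

# Stand-in connection (assumption): in a Studio task this would be
# the object returned by engine.raw_connection().
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table for user behavior events.
cur.execute("CREATE TABLE user_events (user_id INTEGER, step TEXT)")

events = [
    (1, "view"), (1, "cart"), (1, "pay"),
    (2, "view"), (2, "cart"),
    (3, "view"),
]

# executemany runs one parameterized statement once per row in the sequence.
cur.executemany(
    "INSERT INTO user_events (user_id, step) VALUES (?, ?)", events
)
conn.commit()

# Query through the same cursor: distinct users reaching each funnel step.
cur.execute(
    "SELECT step, COUNT(DISTINCT user_id) AS users "
    "FROM user_events GROUP BY step ORDER BY users DESC"
)
funnel = cur.fetchall()

cur.close()
conn.close()
```

The linked guide documents the actual connection setup and the placeholder style the Lakehouse connector expects.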

Practice Scenarios

Incremental Sync from External Data Sources (Shell + Python) — Shell task pulls external API data and writes to the raw layer; Python task cleans and standardizes data and writes to the clean layer; the two tasks are chained via dependency
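
The clean-and-standardize step in the scenario above might look something like the following sketch. It is plain Python with no Lakehouse calls, and the field names (user_id, event, ts) and the cleaning rules are assumptions chosen for illustration, not the rules used in the article.

```python
from datetime import datetime, timezone


def clean_record(raw):
    """Return a standardized record, or None if a required field is missing."""
    user_id = str(raw.get("user_id", "")).strip()
    ts = raw.get("ts")
    if not user_id or ts is None:
        return None
    return {
        "user_id": user_id,
        # Normalize event names to lowercase without surrounding whitespace.
        "event": str(raw.get("event", "")).strip().lower(),
        # Convert epoch seconds to an ISO 8601 timestamp in UTC.
        "event_time": datetime.fromtimestamp(int(ts), tz=timezone.utc).isoformat(),
    }


# Hypothetical rows as they might arrive in the raw layer.
raw_rows = [
    {"user_id": " 42 ", "event": "Login", "ts": 0},
    {"user_id": "", "event": "login", "ts": 0},  # dropped: empty user_id
]

clean_rows = [r for r in map(clean_record, raw_rows) if r is not None]
```

In the chained setup, a function like this would run in the downstream Python task after the Shell task has written the raw layer.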
