Introduction to Airbyte

Airbyte is an open-source data integration platform for building ELT (Extract, Load, Transform) pipelines that move data from APIs, databases, and files into databases, data warehouses, and data lakes. It provides a web interface and a catalog of connectors for configuring and running data synchronization.

Airbyte Architecture Diagram

Local Docker Installation

System Requirements

This guide has been tested on the following operating systems: macOS, Windows 10, and Ubuntu 22.04.
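
Beyond the operating system, the installation steps depend on Docker Engine and the Docker Compose plugin. The following is a minimal sketch of a pre-flight check; the function name check_docker_prereqs is hypothetical and not part of Airbyte or Docker.

```shell
# Hypothetical helper: report whether the Docker CLI and the Compose plugin
# are available in the current shell.
check_docker_prereqs() {
  # "command -v" succeeds only if a docker command can be found
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker: not found"
    return 1
  fi
  # "docker compose version" fails if the Compose plugin is missing
  if ! docker compose version >/dev/null 2>&1; then
    echo "docker compose: not found"
    return 1
  fi
  echo "docker and docker compose: ok"
}
```

Calling check_docker_prereqs before cloning the repository gives an early hint about what is missing, rather than a failure partway through startup.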

Installation Steps

  1. Ensure that Docker Engine and the Docker Compose plugin are installed on your computer. For installation instructions, refer to the official Docker documentation.
  2. After installation, start Airbyte locally with the following commands:
# Clone the Airbyte repository from GitHub
git clone --depth=1 https://github.com/airbytehq/airbyte.git

# Switch to the Airbyte directory
cd airbyte

# Start Airbyte
./run-ab-platform.sh
  3. Visit http://localhost:8000 to open the Airbyte web interface in your browser.
  4. The system will prompt you to enter a username and password. By default, the username is airbyte and the password is password. You can modify these credentials in the .env file:
# Proxy Configuration
# Set BASIC_AUTH_USERNAME and BASIC_AUTH_PASSWORD to empty values, such as "", to disable basic authentication
BASIC_AUTH_USERNAME=your_new_username_here
BASIC_AUTH_PASSWORD=your_new_password_here
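
The two settings above can also be changed from a script instead of a text editor. The following is a sketch only: set_basic_auth is a hypothetical helper, and it assumes the .env file already contains both BASIC_AUTH_ lines. It rewrites the file line by line so that special characters in the password are copied literally.

```shell
# Hypothetical helper: replace the basic-auth credentials in an .env file.
# Arguments: path to the .env file, new username, new password.
set_basic_auth() {
  env_file=$1
  user=$2
  pass=$3
  tmp=$(mktemp) || return 1
  # Copy every line, swapping in the new values for the two auth settings
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      BASIC_AUTH_USERNAME=*) printf '%s\n' "BASIC_AUTH_USERNAME=$user" ;;
      BASIC_AUTH_PASSWORD=*) printf '%s\n' "BASIC_AUTH_PASSWORD=$pass" ;;
      *) printf '%s\n' "$line" ;;
    esac
  done < "$env_file" > "$tmp"
  # Write back in place so the original file permissions are kept
  cat "$tmp" > "$env_file" && rm -f "$tmp"
}

# Usage (run from the airbyte directory):
# set_basic_auth .env your_new_username_here your_new_password_here
```

Because the .env file is read when the containers start, the new credentials will most likely apply only after Airbyte is restarted.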

Deploy on Windows

After installing the WSL 2 backend and Docker, you can run containers from Windows PowerShell. Airbyte also requires Docker Compose; following the guide below installs it as part of Docker Desktop. This is the recommended way to install Airbyte on Windows.

Setup Guide

  1. Please review the system requirements in the Docker documentation.
  2. Complete the steps listed under the system requirements, including downloading and installing the Linux kernel update package.
  3. Download and install Docker Desktop on Windows.
  4. Make sure to select the following options during installation:
    • Enable Hyper-V Windows feature
    • Install the Windows components necessary for WSL 2 (a computer restart is required after installation)
  5. Clone the Airbyte repository and start Airbyte:
git clone --depth=1 https://github.com/airbytehq/airbyte.git
cd airbyte
bash run-ab-platform.sh
  6. Open http://localhost:8000 in your browser.
  7. The system will prompt you to enter a username and password. By default, the username is airbyte and the password is password. Please change these credentials after deploying Airbyte to the server.
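
Airbyte can take a while to start, so http://localhost:8000 may not answer immediately. The following sketch polls the address until it responds. The function name wait_for_airbyte and its default values are hypothetical, and treating HTTP 401 as "up but asking for credentials" is an assumption about how the basic-authentication prompt is served.

```shell
# Hypothetical helper: poll the web UI until it answers or the attempts run out.
# Arguments: URL, maximum attempts, seconds to wait between attempts.
wait_for_airbyte() {
  url=${1:-http://localhost:8000}
  attempts=${2:-30}
  delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || true
    case $code in
      # 200: the UI answered; 401: it answered but wants credentials
      200|401) echo "Airbyte responded with HTTP $code"; return 0 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Airbyte did not respond after $attempts attempts"
  return 1
}
```

Running wait_for_airbyte with no arguments checks the default address for up to about five minutes before giving up.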

Install Singdata Lakehouse Destination Connector in Airbyte

Configuration Reference

Connector display name: Clickzetta Lakehouse

Docker repository name: clickzetta/clickzetta-airbyte

Docker image tag: 0.1.0

Connector documentation URL (optional): https://www.yunqi.tech

  1. Create a new connector in Airbyte using the values from the configuration reference, with "Clickzetta Lakehouse" as the display name.
  2. Configure the connector by filling in the necessary parameters, such as database address, port, username, and password.
  3. Create a data sync connection from your data sources to Singdata Lakehouse and start data synchronization.
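
The repository name and tag from the configuration reference combine into a single Docker image reference. As a sketch, pulling that image by hand before adding the connector may help surface typos or registry access problems early. The variable and function names here are hypothetical; only the repository and tag values come from this guide.

```shell
# Values taken from the configuration reference in this guide
CONNECTOR_REPOSITORY="clickzetta/clickzetta-airbyte"
CONNECTOR_TAG="0.1.0"

# Hypothetical helper: join repository and tag into a Docker image reference
connector_image() {
  printf '%s:%s\n' "$CONNECTOR_REPOSITORY" "$CONNECTOR_TAG"
}

# Usage (downloads the image, so it is left commented out here):
# docker pull "$(connector_image)"
```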

Establish Connection and Sync Data to Singdata Lakehouse

  1. Create a new connection and select the "Clickzetta Lakehouse" connector that was just created.
  2. Fill in the connection configuration information, such as database address, port, username, and password.
  3. Configure the sync task: select the source and the target table, then set the sync frequency and any filter conditions.
  4. Start the sync task to begin synchronizing data from the source to Singdata Lakehouse.
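
A sync can also be started outside the web interface. The sketch below assumes that the Airbyte Configuration API is reachable under /api/v1 on the same address and behind the same basic authentication as the web UI, which may not hold for every Airbyte version. The function name trigger_sync and the AIRBYTE_USER and AIRBYTE_PASSWORD variables are hypothetical, and the connection ID must be taken from your own deployment.

```shell
# Hypothetical helper: request a manual sync for one connection.
# Assumption: the Configuration API is served at /api/v1 behind the same
# basic authentication as the web UI; verify this for your Airbyte version.
# Arguments: connection ID, base URL (optional).
trigger_sync() {
  connection_id=$1
  base_url=${2:-http://localhost:8000}
  # AIRBYTE_USER / AIRBYTE_PASSWORD are illustrative variables that fall
  # back to the default credentials mentioned in this guide
  curl -s -u "${AIRBYTE_USER:-airbyte}:${AIRBYTE_PASSWORD:-password}" \
    -X POST "$base_url/api/v1/connections/sync" \
    -H 'Content-Type: application/json' \
    -d "{\"connectionId\": \"$connection_id\"}"
}

# Usage:
# trigger_sync <your-connection-id>
```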

Data Sync Configuration
