Introduction to LlamaIndex
LlamaIndex is a data framework for applications built on large language models (LLMs), designed for context-augmented use cases. Such a system is commonly called a RAG ("Retrieval-Augmented Generation") system. LlamaIndex provides the abstractions needed to ingest, structure, and access private or domain-specific data more easily, so that this data can be securely and reliably injected into the LLM for more accurate text generation.
Data ingestion in LlamaIndex consists of two steps: loading and transforming. After loading documents, you can process them through transformations, which output nodes. LlamaIndex can load data stored in the Lakehouse cloud data warehouse through the DatabaseReader.
Example Code
from llama_index.core import VectorStoreIndex
from llama_index.readers.database import DatabaseReader
import streamlit as st
username = st.secrets.lakehouse.username
password = st.secrets.lakehouse.password
account = st.secrets.lakehouse.account
endpoint = st.secrets.lakehouse.endpoint
workspace = st.secrets.lakehouse.workspace
schema = st.secrets.lakehouse.schema
virtualcluster = st.secrets.lakehouse.virtualcluster
CONNECTION_STRING = (
    f"clickzetta://{username}:{password}@"
    f"{account}.{endpoint}/{workspace}?schema={schema}&virtualcluster={virtualcluster}"
)
db = DatabaseReader(uri=CONNECTION_STRING, schema=schema)
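If the username or password contains characters such as "@" or "/", a URI assembled by plain string formatting can be parsed incorrectly. The following is a minimal sketch of one way to escape those values, assuming the clickzetta dialect follows the usual SQLAlchemy-style URL rules (the source does not confirm this). The helper name build_lakehouse_uri and all sample values are hypothetical.

```python
from urllib.parse import quote_plus, urlencode


def build_lakehouse_uri(username, password, account, endpoint,
                        workspace, schema, virtualcluster):
    """Assemble a clickzetta:// URI with URL-escaped credentials and query values."""
    # Hypothetical helper: not part of LlamaIndex or any Lakehouse SDK.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    query = urlencode({"schema": schema, "virtualcluster": virtualcluster})
    return f"clickzetta://{credentials}@{account}.{endpoint}/{workspace}?{query}"


# Made-up sample values; real ones would come from st.secrets as in the example.
uri = build_lakehouse_uri(
    username="demo_user",
    password="p@ss/word",
    account="acme",
    endpoint="example.com",
    workspace="analytics",
    schema="public",
    virtualcluster="default",
)
print(uri)
```

The resulting string could then be passed as the uri argument of DatabaseReader in place of CONNECTION_STRING.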
Available Methods for SQLDatabase
print(type(db.sql_database.from_uri))
print(type(db.sql_database.get_single_table_info))
print(type(db.sql_database.get_table_columns))
print(type(db.sql_database.get_usable_table_names))
print(type(db.sql_database.insert_into_table))
print(type(db.sql_database.run_sql))
query = """
    SELECT
        CONCAT(name, ' is ', age, ' years old.') AS text
    FROM public.users
    WHERE age >= 18
"""
documents = db.load_data(query=query)
index = VectorStoreIndex.from_documents(documents)
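Before running the query against the warehouse, it can help to preview the text that each row would contribute. The sketch below uses an in-memory SQLite database as a local stand-in for the Lakehouse. The table contents are made up, and SQLite's || operator replaces CONCAT, which older SQLite versions lack. It shows only the strings the SELECT produces, not how DatabaseReader wraps them into documents.

```python
import sqlite3

# In-memory SQLite stands in for the Lakehouse; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("Alice", 30), ("Bob", 15), ("Carol", 18)],
)

# Same shape as the example query: one sentence per adult user.
query = """
    SELECT name || ' is ' || age || ' years old.' AS text
    FROM users
    WHERE age >= 18
    ORDER BY name
"""
texts = [row[0] for row in conn.execute(query)]
conn.close()

for text in texts:
    print(text)
```

Rows that fail the age filter never reach the reader, so they never become documents or enter the index.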
Reference Documentation
Llama-Index Database Reader