Indexes

Singdata Lakehouse supports several index types that speed up query filtering and reduce the volume of data scanned. Each type targets a different query pattern.

Index Type Comparison

| Index Type | Applicable Queries | Applicable Fields | Typical Scenario |
|---|---|---|---|
| Bloomfilter Index | Equality queries (=, IN) | High-cardinality fields, such as user ID, order number | Quickly skip data files that do not contain the target value |
| Inverted Index | Full-text search (MATCH), keyword search | Text fields, JSON fields | Log search, document search, multi-keyword filtering |
| Vector Index | Approximate Nearest Neighbor search (ANN) | VECTOR type fields | Semantic search, image similarity, RAG retrieval |
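
To make the Bloomfilter Index scenario in the comparison above concrete, here is a minimal conceptual sketch in Python of how a per-file bloom filter lets an equality query skip data files. It is an illustration under stated assumptions, not Singdata Lakehouse's implementation: the class BloomFilter, the function files_to_scan, the file names, and the parameters num_bits and num_hashes are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter (hypothetical; not Singdata's internal structure)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored as a Python integer

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def files_to_scan(file_filters, target):
    """Return only the files whose filter cannot rule out the target value."""
    return [name for name, bf in file_filters.items() if bf.might_contain(target)]


# Hypothetical data files, each holding a set of order numbers.
files = {
    "part-0": ["order-1001", "order-1002"],
    "part-1": ["order-2001", "order-2002"],
}
file_filters = {}
for name, values in files.items():
    bf = BloomFilter()
    for v in values:
        bf.add(v)
    file_filters[name] = bf

candidates = files_to_scan(file_filters, "order-2001")
```

A bloom filter never reports a stored value as absent, so the file that contains the target always remains in the candidate list. It can occasionally report a false positive, in which case a file is scanned unnecessarily but the query result stays correct.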

Selection Advice

  • The field is a high-cardinality ID field with frequent equality queries --> Bloomfilter Index
  • The field is text content requiring keyword or phrase search --> Inverted Index
  • The field is a vector embedding requiring similarity search --> Vector Index
  • Unsure which to use --> Refer to Index Best Practices
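
The selection advice above can be sketched as a small decision function. This is only an illustration of the rules as listed; FieldProfile and suggest_index are hypothetical names and are not part of any Singdata Lakehouse API.

```python
from dataclasses import dataclass


@dataclass
class FieldProfile:
    """Hypothetical description of a column and how it is queried."""
    is_vector: bool = False
    is_text_or_json: bool = False
    needs_keyword_search: bool = False
    high_cardinality: bool = False
    frequent_equality_filters: bool = False


def suggest_index(field: FieldProfile) -> str:
    """Map a field profile to the index type named in the selection advice."""
    if field.is_vector:
        return "Vector Index"
    if field.is_text_or_json and field.needs_keyword_search:
        return "Inverted Index"
    if field.high_cardinality and field.frequent_equality_filters:
        return "Bloomfilter Index"
    # None of the listed rules matched.
    return "Refer to Index Best Practices"


user_id = FieldProfile(high_cardinality=True, frequent_equality_filters=True)
log_message = FieldProfile(is_text_or_json=True, needs_keyword_search=True)
embedding = FieldProfile(is_vector=True)
```

Here suggest_index(user_id) yields the Bloomfilter Index, suggest_index(log_message) the Inverted Index, and suggest_index(embedding) the Vector Index; any profile that matches none of the rules falls through to the best-practices guide.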

Index Management Commands