Data Type Compatibility Reference

When migrating DDL and SQL from existing databases to Lakehouse, this document lists common compatibility differences to help you avoid issues in advance.

Data Type Mapping

Lakehouse supports multiple type aliases, so DDL from other databases can be used directly without manually replacing type names. Aliases are immediately converted to Lakehouse canonical types during parsing; DESCRIBE and SHOW CREATE TABLE display canonical names, not the original aliases.

MySQL

MySQL typeLakehouse typeNotes
TINYINTTINYINTDirectly supported
SMALLINTSMALLINTDirectly supported
INT / INTEGERINTDirectly supported; INTEGER is an alias
BIGINTBIGINTDirectly supported
FLOATFLOATDirectly supported
DOUBLEDOUBLEDirectly supported
DECIMAL(p,s) / NUMERIC(p,s)DECIMAL(p,s)Directly supported; NUMERIC is an alias
VARCHAR(n)VARCHAR(n)Directly supported; max length 1048576
CHAR(n)CHAR(n)Directly supported; max length 255
TEXTSTRINGDirectly supported; TEXT is an alias
DATEDATEDirectly supported
DATETIMETIMESTAMP_NTZMySQL DATETIME has no timezone; maps to TIMESTAMP_NTZ
TIMESTAMPTIMESTAMP (i.e. TIMESTAMP_LTZ)Both are timezone-aware timestamps
BOOLEAN / BOOLBOOLEANDirectly supported; BOOL alias not supported, change to BOOLEAN

PostgreSQL

PostgreSQL typeLakehouse typeNotes
SMALLINT / INT2SMALLINTINT2 alias not supported; change to SMALLINT or SHORT
INTEGER / INT4INTINT4 alias not supported; change to INT or INTEGER
BIGINT / INT8BIGINTINT8 alias not supported; change to BIGINT or LONG
REAL / FLOAT4FLOATREAL supported; FLOAT4 not supported, change to FLOAT
DOUBLE PRECISION / FLOAT8DOUBLEFLOAT8 not supported; change to DOUBLE
NUMERIC(p,s) / DECIMAL(p,s)DECIMAL(p,s)Directly supported
VARCHAR(n)VARCHAR(n)Directly supported
CHAR(n)CHAR(n)Directly supported
TEXTSTRINGDirectly supported; TEXT is an alias
DATEDATEDirectly supported
TIMESTAMP (no timezone)TIMESTAMP_NTZPostgreSQL default TIMESTAMP has no timezone
TIMESTAMPTZTIMESTAMP (i.e. TIMESTAMP_LTZ)Timezone-aware timestamp
BOOLEANBOOLEANDirectly supported
BYTEABINARYBinary type; must be manually replaced

Hive

Hive typeLakehouse typeNotes
TINYINTTINYINTDirectly supported
SMALLINTSMALLINTDirectly supported
INTINTDirectly supported
BIGINTBIGINTDirectly supported
FLOATFLOATDirectly supported
DOUBLEDOUBLEDirectly supported
DECIMAL(p,s)DECIMAL(p,s)Directly supported
STRINGSTRINGDirectly supported
VARCHAR(n)VARCHAR(n)Directly supported
CHAR(n)CHAR(n)Directly supported
DATEDATEDirectly supported
TIMESTAMPTIMESTAMP_NTZHive TIMESTAMP has no timezone semantics
BOOLEANBOOLEANDirectly supported
BINARYBINARYDirectly supported
ARRAY<T>ARRAY<T>Directly supported
MAP<K,V>MAP<K,V>Directly supported
STRUCT<...>STRUCT<...>Directly supported

Spark SQL

Spark typeLakehouse typeNotes
BooleanTypeBOOLEANDirectly supported
ByteTypeTINYINTSpark Byte maps to Lakehouse TINYINT
ShortTypeSMALLINTDirectly supported
IntegerTypeINTDirectly supported
LongTypeBIGINTDirectly supported
FloatTypeFLOATDirectly supported
DoubleTypeDOUBLEDirectly supported
DecimalTypeDECIMAL(p,s)Directly supported
StringTypeSTRINGDirectly supported
BinaryTypeBINARYDirectly supported
DateTypeDATEDirectly supported
TimestampTypeTIMESTAMP_LTZSpark default is timezone-aware
TimestampNTZTypeTIMESTAMP_NTZNew timezone-naive type in Spark 3.4+
ArrayTypeARRAY<T>Directly supported
MapTypeMAP<K,V>Directly supported
StructTypeSTRUCT<...>Directly supported

Behavioral Differences

Type Conversion Failure Handling

ScenarioMySQL / PostgreSQLSpark SQLLakehouse default
CAST('abc' AS INT)ErrorReturns NULLReturns NULL
Numeric overflow (e.g. CAST(200 AS TINYINT))ErrorReturns NULLReturns NULL
Invalid date (e.g. CAST('2023-02-29' AS DATE))ErrorReturns NULLReturns NULL

Lakehouse runs in lenient mode by default, consistent with Spark SQL: conversion failures do not raise errors and silently return NULL. When migrating from strict-mode databases like MySQL/PostgreSQL, add explicit NULL checks at critical ETL nodes, or enable strict mode:

SET cz.sql.cast.mode = strict;

CHAR Does Not Pad with Spaces

MySQL and standard SQL CHAR(n) pads short strings with spaces to length n. Lakehouse does not pad. LENGTH(CAST('abc' AS CHAR(10))) returns 3, not 10.

Comparisons also use actual content: 'abc' = 'abc ' returns false.

VARCHAR/CHAR Silently Truncates Overflow

Strings exceeding the declared length are silently truncated on write without error. When migrating from databases with strict length validation, it is recommended to retain length validation logic in the application layer.

Strings in INSERT VALUES Are Not Implicitly Converted to Date Types

In INSERT VALUES, strings are not implicitly converted to DATE, TIMESTAMP, or TIMESTAMP_NTZ — explicit type declaration is required:

-- Error INSERT INTO t(dt) VALUES ('2024-01-15'); -- Correct INSERT INTO t(dt) VALUES (DATE'2024-01-15'); INSERT INTO t(dt) VALUES (CAST('2024-01-15' AS DATE));

Strings can implicitly match date columns in SELECT; this restriction applies only to INSERT VALUES.

NOT IN Containing NULL

v NOT IN (1, NULL) always returns NULL under three-valued logic (not true or false), causing the filter condition to fail. This is consistent with MySQL, PostgreSQL, and Spark SQL behavior, but is easy to overlook:

-- The following query always returns empty results because the subquery may contain NULL SELECT * FROM t WHERE id NOT IN (SELECT id FROM other); -- Safe alternative SELECT * FROM t WHERE NOT EXISTS (SELECT 1 FROM other WHERE other.id = t.id);

Spark SQL Compatibility Notes

Spark SQL and Lakehouse are highly consistent in type conversion behavior (both return NULL in lenient mode). For the complete Spark SQL migration guide (including CREATE TABLE syntax, JSON processing, window functions, LATERAL VIEW, configuration mapping, etc.), see Spark SQL Migration Guide.

Key notes:

  • TimestampTypeTIMESTAMP_LTZ; TimestampNTZTypeTIMESTAMP_NTZ
  • DataFrame writes must include all columns; PK tables not supported
  • Function compatibility is good; 40+ common functions fully supported

Unsupported Aliases

The following aliases are common in other databases but not supported in Lakehouse — replace them manually during migration:

Unsupported syntaxReplace withSource
BOOLBOOLEANMySQL
INT2SMALLINTPostgreSQL
INT4INTPostgreSQL
INT8BIGINTPostgreSQL
FLOAT4FLOATPostgreSQL
FLOAT8DOUBLEPostgreSQL
NUMBER(p,s)DECIMAL(p,s)Oracle
DOUBLE PRECISIONDOUBLEOracle/PostgreSQL
INT16SMALLINTSpark
INT32INTSpark
INT64BIGINTSpark
FLOAT32FLOATSpark
FLOAT64DOUBLESpark