SQL User-Defined Function Guide

SQL user-defined functions (SQL UDFs) let you encapsulate reusable computation logic in SQL expressions and call them just like built-in functions inside queries.

Choosing the Right Type

TypeBest forLimitations
SQL function (this guide)Pure SQL logic: data cleansing, calculation formulas, conditional branchingSQL expressions only — cannot call external services
External functionCalling Python/Java code, external APIs, or ML modelsRequires deploying an external service
Built-in functionStandard math, string, and date operationsNot customizable

When to use a SQL function:

  • The same computation logic appears repeatedly across multiple queries
  • Business rules need to be maintained in one place (e.g., discount calculations, classification rules)
  • You need to encapsulate complex CASE WHEN logic or multi-step calculations

SQL Commands Involved

CommandPurpose
CREATE FUNCTIONCreate a scalar or table function
CREATE OR REPLACE FUNCTIONUpdate an existing function definition
DROP FUNCTIONDelete a function
DESC FUNCTIONView a function definition
SHOW EXTERNAL FUNCTIONSList user-defined functions in the current SCHEMA

Prerequisites

CREATE TABLE doc_orders ( order_id INT, order_date DATE, amount DOUBLE, category STRING ); INSERT INTO doc_orders VALUES (1, DATE '2024-01-01', 99.9, 'electronics'), (2, DATE '2024-01-03', 299.0, 'clothing'), (3, DATE '2024-01-03', 49.5, 'food'), (4, DATE '2024-01-05', 199.0, 'electronics'), (5, DATE '2024-01-05', 0.0, 'food');


Scenario 1: Scalar Function — Encapsulating a Calculation Formula

Goal: Centrally manage discount calculation logic so it stays consistent wherever it is called.

-- Create a discount price function; rate defaults to 0.9 (10% off) CREATE OR REPLACE FUNCTION public.discount_price(price DOUBLE, rate DOUBLE DEFAULT 0.9) RETURNS DOUBLE RETURN ROUND(price * rate, 2);

-- Use the default discount (10% off) SELECT public.discount_price(100.0); -- 90.0 -- Specify a discount (20% off) SELECT public.discount_price(100.0, 0.8); -- 80.0 -- Use inside a query SELECT order_id, amount, public.discount_price(amount) AS discounted FROM doc_orders WHERE category = 'electronics';

order_idamountdiscounted
199.989.91
4199.0179.1

Scenario 2: Scalar Function — Encapsulating Classification Rules

Goal: Tier orders by amount with rules maintained in one place.

CREATE OR REPLACE FUNCTION public.order_tier(amount DOUBLE) RETURNS STRING RETURN CASE WHEN amount >= 200 THEN 'Premium' WHEN amount >= 50 THEN 'Standard' WHEN amount > 0 THEN 'Basic' ELSE 'Free' END;

SELECT order_id, amount, public.order_tier(amount) AS tier FROM doc_orders ORDER BY amount DESC;

order_idamounttier
2299.0Premium
4199.0Standard
199.9Standard
349.5Basic
50.0Free

Scenario 3: Scalar Function — Data Cleansing

Goal: Clean phone numbers by stripping non-numeric characters.

CREATE OR REPLACE FUNCTION public.clean_phone(phone STRING) RETURNS STRING RETURN REGEXP_REPLACE(TRIM(phone), '[0-9]', '');

SELECT public.clean_phone(' 138-1234-5678 '); -- 13812345678 SELECT public.clean_phone('+86 (010) 8888-9999'); -- 8601088889999


Scenario 4: Table Function — Returning Multiple Rows

A table function returns a virtual table and can be used in a FROM clause just like a regular table.

Goal: Generate a consecutive date sequence to populate a calendar dimension.

CREATE OR REPLACE FUNCTION public.date_range(start_date DATE, end_date DATE) RETURNS TABLE (dt DATE) RETURN SELECT DATE_ADD(start_date, pos) AS dt FROM (SELECT POSEXPLODE(SPLIT(SPACE(DATEDIFF(end_date, start_date)), ' '))) t(pos, v);

-- Direct call SELECT * FROM public.date_range(DATE '2024-01-01', DATE '2024-01-05');

dt
2024-01-01
2024-01-02
2024-01-03
2024-01-04
2024-01-05

-- LEFT JOIN with the orders table to find dates with no orders SELECT d.dt, COALESCE(SUM(o.amount), 0) AS daily_revenue FROM public.date_range(DATE '2024-01-01', DATE '2024-01-05') d LEFT JOIN doc_orders o ON o.order_date = d.dt GROUP BY d.dt ORDER BY d.dt;

dtdaily_revenue
2024-01-0199.9
2024-01-020
2024-01-03348.5
2024-01-040
2024-01-05199

Function Management

View a Function Definition

DESC FUNCTION public.order_tier;

Returns the function name, creation time, full SQL definition, and more.

Update a Function

Use CREATE OR REPLACE FUNCTION to overwrite in place — no need to drop first:

CREATE OR REPLACE FUNCTION public.order_tier(amount DOUBLE) RETURNS STRING RETURN CASE WHEN amount >= 300 THEN 'VIP' -- new VIP tier WHEN amount >= 200 THEN 'Premium' WHEN amount >= 50 THEN 'Standard' WHEN amount > 0 THEN 'Basic' ELSE 'Free' END;

Delete a Function

DROP FUNCTION IF EXISTS public.order_tier;


Notes

  • Schema prefix is required: You must write schema_name.function_name when calling a function, otherwise you will get a "function not found" error. You can change the resolution policy with SET cz.sql.remote.udf.lookup.policy = builtin_first — see SET (session parameters).
  • Default parameters must come last: Parameters with default values must be placed after parameters without default values.
  • Table functions must use a query: The body of a RETURNS TABLE function must be a SELECT statement, not an expression.
  • DML is not supported in function bodies: SQL functions cannot execute INSERT/UPDATE/DELETE.
  • Recursion is not supported: A function body cannot call itself — doing so results in a "function not found" error.
  • Calling other user-defined functions: A function body can call other SQL functions in the same SCHEMA; use the SCHEMA prefix.
  • Name collision with built-in functions: Call your custom function with the SCHEMA prefix; call the built-in function without a prefix. The two do not interfere with each other.
  • NULL input: A NULL argument participates in CASE WHEN evaluation. NULL >= 200 is false, so the ELSE branch is taken.
  • OR REPLACE and IF NOT EXISTS cannot be used together: Combining them causes a syntax error.