STDDEV_SAMP

stddev_samp([distinct] expr)

Function Description

The STDDEV_SAMP function is used to calculate the sample standard deviation of a set of numerical data. The standard deviation is a statistical measure of data dispersion, used to indicate the degree of variation in a set of values. This function can handle various numerical types including tinyint, smallint, int, bigint, float, double, and decimal.

Parameter Description

  • expr: A column or expression of numerical type, such as tinyint, smallint, int, bigint, float, double, or decimal.
  • distinct: An optional parameter. When set to distinct, the function calculates the standard deviation of the distinct set. If distinct is not set, the function calculates the standard deviation of the set including duplicate values.

Return Results

  • Returns a double type value representing the sample standard deviation of the dataset.
  • If the input dataset contains null values, the null values will not be included in the calculation.

Usage Example

Example 1: Basic Usage

SELECT stddev_samp(col) FROM VALUES (1), (2), (3), (3), (null) AS tab(col);
+--------------------+
| `stddev_samp`(col) |
+--------------------+
| 0.9574271077563381 |
+--------------------+

Example 2: Using the distinct keyword

SELECT stddev_samp(DISTINCT col) FROM VALUES (1), (2), (3), (3), (null) AS tab(col);
+-----------------------------+
| `stddev_samp`(DISTINCT col) |
+-----------------------------+
| 1.0                         |
+-----------------------------+

By the above example, you can better understand the usage and application scenarios of the STDDEV_SAMP function. In actual data analysis, this function can help you quickly assess the dispersion of data, thereby providing a basis for decision-making.