Functions
General functions
Mathematical functions and operators work in one of two ways:
-
Some mathematical functions perform calculations based on a single row. For example, rounding, taking a square root, multiplying, and similar functions can be used for values in a single row, returning a distinct value for each and every row. All mathematical operators, such as +, are applied one row at a time.
- Other mathematical functions, like averages and running totals, operate over many rows. These functions take many rows and reduce them to a single number, then display that same number on every row.
Function | Syntax | Purpose |
---|---|---|
abs | abs(value) | Returns the absolute value of value. |
acos | acos(value) | Returns the inverse cosine of value. |
asin | asin(value) | Returns the inverse sine of value. |
atan | atan(value) | Returns the inverse tangent of value. |
beta_dist | beta_dist(value, alpha, beta, cumulative) | Returns the position of value on the beta distribution with parameters alpha and beta. If cumulative = yes, returns the cumulative probability. |
beta_inv | beta_inv(probability, alpha, beta) | Returns the position of probability on the inverse cumulative beta distribution with parameters alpha and beta. |
binom_dist | binom_dist(num_successes, num_tests, probability, cumulative) | Returns the probability of getting num_successes successes in num_tests tests with the given probability of success. If cumulative = yes, returns the cumulative probability. |
binom_inv | binom_inv(num_tests, test_probability, target_probability) | Returns the smallest number k such that binom(k, num_tests, test_probability, yes) >= target_probability. |
ceiling | ceiling(value) | Returns the smallest integer greater than or equal to value. |
chisq_dist | chisq_dist(value, dof, cumulative) | Returns the position of value on the gamma distribution with dof degrees of freedom. If cumulative = yes, returns the cumulative probability. |
chisq_inv | chisq_inv(probability, dof) | Returns the position of probability on the inverse cumulative gamma distribution with dof degrees of freedom. |
chisq_test | hisq_test(actual, expected) | Returns the probability for the chi-squared test for independence between actual and expected data. actual can be a column or a column of lists, and expected must be the same type. |
combin | combin(set_size, selection_size) | Returns the number of ways of choosing selection_size elements from a set of size set_size. |
confidence_norm | confidence_norm(alpha, stdev, n) | Returns half the width of the normal confidence interval at significance level alpha, standard deviation stdev, and sample size n. |
confidence_t | confidence_t(alpha, stdev, n) | Returns half the width of theStudent’st-distribution confidence intervalat significance level alpha, standard deviation stdev, and sample size n (https://en.wikipedia.org/wiki/Student's_t-distribution#Confidence_intervals). |
correl | correl(column_1, column_2) | Returns the correlation coefficient of column_1 and column_2. |
cos | cos(value) | Returns the cosine of value. |
count | count(expression) | Returns the count of non-null values in the column defined by expression, unless expression defines a column of Lists, in which case returns the count in each List. |
count_distinct | count_distinct(expression) | Returns the count of distinct non-null values in the column defined by expression, unless expression defines a column of Lists, in which case returns the count in each List. |
covar_pop | covar_pop(column_1, column_2) | Returns the population covariance of column_1 and column_2. |
covar_samp | covar_samp(column_1, column_2) | Returns the sample covariance of column_1 and column_2. |
degrees | degrees(value) | Converts value from radians to degrees. |
exp | exp(value) | Returns e to the power of value. |
expon_dist | expon_dist(value, lambda, cumulative) | Returns the position of value on the exponential distribution with parameter lambda. If cumulative = yes, returns the cumulative probability. |
f_dist | f_dist(value, dof_1, dof_2, cumulative) | Returns the position of value on the F distribution with parameters dof_1 and dof_2. If cumulative = yes, returns the cumulative probability. |
f_inv | f_inv(probability, dof_1, dof_2) | Returns the position of probability on the inverse cumulative F distribution with parameters dof_1 and dof_2. |
fact | fact(value) | Returns the factorial of value. |
floor | floor(value) | Returns the largest integer less than or equal to value. |
gamma_dist | gamma_dist(value, alpha, beta, cumulative) | Returns the position of value on the gamma distribution with parameters alpha and beta. If cumulative = yes, returns the cumulative probability. |
gamma_inv | gamma_inv(probability, alpha, beta) | Returns the position of probability on the inverse cumulative gamma distribution with parameters alpha and beta. |
geomean | geomean(expression) | Returns the geometric mean of the column created by expression unless expression defines a column of Lists, in which case returns the geometric mean of each List. |
hypgeom_dist | hypgeom_dist (sample_successes, sample_size, population_successes, population_size, cumulative) | Returns the probability of getting sample_successes from the given sample_size, number of population_successes, and population_size. If cumulative = yes, returns the cumulative probability. |
intercept | intercept(y_column, x_column) | Returns the intercept of the linear regression line through the points determined by y_column and x_column. |
kurtosis | kurtosis(expression) | Returns the sample excess kurtosis of the column created by expression unless expression defines a column of Lists, in which case returns the sample excess kurtosis of each List. |
large | large(expression, k) | Returns the kth largest value of the column created by expression unless expression defines a column of Lists, in which case returns the kth largest value of each List. |
ln | ln(value) | Returns the natural logarithm of value. |
log | log(value) | Returns the base 10 logarithm of value. |
match | match(value, expression) | Returns the row number of the first occurence of value in the column created by expression unless expression defines a column of Lists, in which case returns the position of value in each List. |
max | max(expression) | Returns the max of the column created by expression unless expression defines a column of Lists, in which case returns the max of each List. |
mean | mean(expression) | Returns the mean of the column created by expression unless expression defines a column of Lists, in which case returns the mean of each List. |
median | median(expression) | Returns the median of the column created by expression unless expression defines a column of Lists, in which case returns the median of each List. |
min | min(expression) | Returns the min of the column created by expression unless expression defines a column of Lists, in which case returns the min of each List. |
mod | mod(value, divisor) | Returns the remainder of dividing value by divisor. |
mode | mode(expression) | Returns the mode of the column created by expression unless expression defines a column of Lists, in which case returns the mode of each List. |
multinomial | multinomial(value_1, value_2, ...) | Returns the factorial of the sum of the arguments divided by the product of each of their factorials. |
negbinom_dist | negbinom_dist(num_failures, num_successes, probability, cumulative) | Returns the probability of getting num_failures failures before getting num_successes successes, with the given probability of success. If cumulative = yes, returns the cumulative probability. |
norm_dist | norm_dist(value, mean, stdev, cumulative) | Returns the position of value on the normal distribution with the given mean and stdev. If cumulative = yes, then returns the cumulative probability. |
norm_inv | norm_inv(probability, mean, stdev) | Returns the position of probability on the inverse normal cumulative distribution. |
norm_s_dist | norm_s_dist(value, cumulative) | Returns the position of value on the standard normal distribution. If cumulative = yes, returns the cumulative probability. |
norm_s_inv | norm_s_inv(probability) | Returns the position of probability on the inverse standard normal cumulative distribution. |
percent_rank | percent_rank(column, value) | Returns the rank of value in column as a percentage from 0 to 1 inclusive. |
percentile | percentile(value_column, percentile_value) | Returns the value from the column created by expression corresponding to the given percentile_value, unless expression defines a column of Lists, in which case returns the percentile value for each List. Note: percentile_value must be between 0 and 1, else this returns null. |
pi | pi() | Returns the value of pi. |
poisson_dist | poisson_dist(value, lambda, cumulative) | Returns the position of value on the poisson distribution with parameter lambda. If cumulative = yes, returns the cumulative probability. |
power | power(base, exponent) | Returns base raised to the power of exponent. |
product | product(expression) | Returns the product of the column created by expression unless expression defines a column of Lists, in which case returns the product of each List. |
radians | radians(value) | Converts value from degrees to radians. |
rand | rand() | Returns a random number between 0 and 1. |
rank | rank(value, expression) | Returns the rank of value in the column created by expression. For example, if you want to rank orders by their total sale price, you could use rank(${order_items.total_sale_price},${order_items.total_sale_price}), which gives a rank for each value of order_items.total_sale_price in your query when comparing it to the entire column of order_items.total_sale_price in your query. In the case where the expression defines multiple lists, this function returns the relative size of the value in each list. |
rank_avg | rank_avg(value, expression) | Returns the average rank of value in the column created by expression unless expression defines a column of lists, in which case returns the average rank of value in each list. |
round | round(value, num_decimals) | Returns value rounded to num_decimals decimal places. |
running_product | running_product (value_column) | Returns a running product of the values in value_column. |
running_total | running_total(value_column) | Returns a running total of the values in value_column. |
skew | skew(expression) | Returns the sample skewness of the column created by expression unless expression defines a column of Lists, in which case returns the sample skewness of each List. |
slope | slope(y_column, x_column) | Returns the slope of the linear regression line through points determined by y_column and x_column. |
small | small(expression, k) | Returns the kth smallest value of the column created by expression unless expression defines a column of Lists, in which case returns the kth smallest value of each List. |
sqrt | sqrt(value) | Returns the square root of value. |
stddev_pop | stddev_pop(expression) | Returns the standard deviation (population) of the column created by expression unless expression defines a column of Lists, in which case returns the standard deviation (population) of each List. |
stddev_samp | stddev_pop(expression) | Returns the standard deviation (sample) of the column created by expression unless expression defines a column of Lists, in which case returns the standard deviation (sample) of each List. |
sum | sum(expression) | Returns the sum of the column created by expression unless expression defines a column of Lists, in which case returns the sum of each List. |
t_dist | t_dist(value, dof, cumulative) | Returns the position of value on the Student’s t-distribution with dof degrees of freedeom. If cumulative = yes, returns the cumulative probability (https://en.wikipedia.org/wiki/Student's_t-distribution). |
t_inv | t_inv(probability, dof) | Returns the position of probability on the inverse normal cumulative distribution with dof degrees of freedom. |
t_test | t_test(column_1, column_2, tails, type) | Returns the result of a Student’s t-test on the data from column_1 and column_2, using 1 or 2 tails. type: 1 = paired, 2 = homoscedastic, 3 = heteroscedastic (https://en.wikipedia.org/wiki/Student%27s_t-test). |
tan | tan(value) | Returns the tangent of value. |
var_pop | var_pop(expression) | Returns the variance (population) of the column created by expression unless expression defines a column of Lists, in which case returns the variance (population) of each List. |
var_samp | var_pop(expression) | Returns the variance (sample) of the column created by expression unless expression defines a column of Lists, in which case returns the variance (sample) of each List. |
weibull_dist | weibull_dist(value, shape, scale, cumulative) | Returns the position of value on the Weibull distribution with parameters shape and scale. If cumulative = yes, returns the cumulative probability. |
z_test | z_test(data, value, stdev) | Returns the one-tailed p-value of the z-test using the existing data and stdev on the hypothesized mean value. |
String and Date functions
String Functions
String functions operate on sentences, words, or letters, which are collectively called “strings.” You can use string functions to capitalize words and letters, extract parts of a phrase, check to see if a word or letter is in a phrase, or replace elements of a word or phrase. They can also be used to format the data returned in the table.
Function | Syntax | Purpose |
---|---|---|
concat | concat(value_1, value_2, ...) | Returns value_1, value_2, ..., value_n joined as one string |
contains | contains(string, search_string) | Returns Yes if string contains search_string, and No otherwise |
length | length(string) | Returns the number of characters in string |
lower | lower(string) | Returns string with all characters converted to lower case |
position | position(string, search_string) | Returns the start index of search_string in string if it exists, and 0 otherwise |
replace | replace(string, old_string, new_string) | Returns string with all occurrences of old_string replaced with new_string |
substring | substring(string, start_position, length) | Returns the substring of string beginning at start_position consisting of length characters |
to_number | to_number(string) | Returns the number represented by string, or null if the string cannot be converted |
to_string | to_string(value) | Returns the string representation of value, or an empty string if value is null |
upper | upper(string) | Returns string with all characters converted to upper case |
Date Functions
Date functions enable you to work with dates and times.
Function | Syntax | Purpose |
---|---|---|
add_days | add_days(number, date) | Adds number days to date |
add_hours | add_hours(number, date) | Adds number hours to date |
add_minutes | add_minutes(number, date) | Adds number minutes to date |
add_months | add_months(number, date) | Adds number months to date |
add_seconds | add_seconds(number, date) | Adds number seconds to date |
add_years | add_years(number, date) | Adds number years to date |
date | date(year, month, day) | Returns “year-month-day” date or null if the date would be invalid |
date_time | date_time(year, month, day, hours, minutes, seconds) | Returns “year-month-day hours:minutes:seconds” date or null if the date would be invalid |
diff_days | diff_days(start_date, end_date) | Returns the number of days between start_date and end_date |
diff_hours | diff_hours(start_date, end_date) | Returns the number of hours between start_date and end_date |
diff_minutes | diff_minutes(start_date, end_date) | Returns the number of minutes between start_date and end_date |
diff_months | diff_months(start_date, end_date) | Returns the number of months between start_date and end_date |
diff_seconds | diff_seconds(start_date, end_date) | Returns the number of seconds between start_date and end_date |
diff_years | diff_years(start_date, end_date) | Returns the number of years between start_date and end_date |
extract_days | extract_days(date) | Extracts the days from date |
extract_hours | extract_hours(date) | Extracts the hours from date |
extract_minutes | extract_minutes(date) | Extracts the minutes from date |
extract_months | extract_months(date) | Extracts the months from date |
extract_seconds | extract_seconds(date) | Extracts the seconds from date |
extract_years | extract_years(date) | Extracts the years from date |
now | now() | Returns the current date and time |
to_date | to_date(string) | Returns the date and time corresponding to string (YYYY, YYYY-MM, YYYY-MM-DD, YYYY-MM-DD hh, YYYY-MM-DD hh:mm, or YYYY-MM-DD hh:mm:ss) |
trunc_days | trunc_days(date) | Truncates date to days |
trunc_hours | trunc_hours(date) | Truncates date to hours |
trunc_minutes | trunc_minutes(date) | Truncates date to minutes |
trunc_months | trunc_months(date) | Truncates date to months |
trunc_years | trunc_years(date) | Truncates date to years |
Logical Functions, Operators and Constants
Logical functions and operators are used to assess whether something is true or false. Expressions using these elements take a value, evaluate it against some criteria, return Yes if the criteria are met, and No if the criteria are not met. There are also various logical operators for comparing values and combining logical expressions.
Function | Syntax | Purpose |
---|---|---|
coalesce | coalesce(value_1, value_2, ...) | Returns the first non-null value in value_1, value_2, ..., value_n if found and null otherwis |
if | if(yesno_expression, value_if_yes, value_if_no) | If yesno_expression evaluates to Yes, returns the value_if_yes value. Otherwise, returns the value_if_no value |
is_null | is_null(value) | Returns Yes if value is null, and No otherwise |
Operator | Syntax | Purpose | Note |
---|---|---|---|
= | value_1 = value_2 | Returns Yes if value_1 is equal to value_2, and No otherwise | The following comparison operators can be used with any data type |
!= | value_1 != value_2 | Returns Yes if value_1 is not equal to value_2, and No otherwise | The following comparison operators can be used with any data type |
> | value_1 > value_2 | Returns Yes if value_1 is greater than value_2, and No otherwise | The following comparison operators only can be used with numbers and dates |
< | value_1 < value_2 | Returns Yes if value_1 is less than value_2, and No otherwise | The following comparison operators only can be used with numbers and dates |
>= | value_1 >= value_2 | Returns Yes if value_1 is greater than or equal to value_2, and No otherwise | The following comparison operators only can be used with numbers and dates |
<= | value_1 <= value_2 | Returns Yes if value_1 is less than or equal to value_2, and No otherwise | The following comparison operators only can be used with numbers and dates |
AND | value_1 AND value_2 | Returns Yes if both value_1 and value_2 are Yes, and No otherwise | These logical operators must be capitalized. Logical operators written in lowercase will not work |
OR | value_1 OR value_2 | Returns Yes if either value_1 or value_2 is Yes, and No otherwise | These logical operators must be capitalized. Logical operators written in lowercase will not work |
NOT | NOT value | Returns Yes if value is No, and Yes otherwise | These logical operators must be capitalized. Logical operators written in lowercase will not work |
Logical Constants
You can use logical constants within expressions. These constants are always written in lowercase and have the following meanings.
Constant | Meaning |
---|---|
yes | True |
no | False |
null | There is no value |
Note that the constants yes and no, are the special symbols that mean true or false within expressions. In contrast, using quotes such as in "yes" and "no" creates literal strings with those values.
Logical expressions evaluate to true or false without requiring an if function. For example, this:
if(${field} > 100, yes, no)
is equivalent to this:
${field} > 100
You also can use null to indicate no value. For example, you may want to determine if a field is empty, or assign an empty value in a certain situation. This formula returns no value if the field is less than 1, or the value of the field if it is more than 1:
if(${field} < 1, null, ${field})
Combining AND and OR Operators
AND operators are evaluated before OR operators, if you don’t otherwise specify the order with parentheses. Thus, the following expression without additional parentheses:
if (
${order_items.days_to_process}>=4 OR
${order_items.shipping_time}>5 AND
${order_facts.is_first_purchase},
"review", "okay")
would be evaluated as:
if (
${order_items.days_to_process}>=4 OR
(${order_items.shipping_time}>5 AND ${order_facts.is_first_purchase}),
"review", "okay")
Positional Functions
When creating custom calculations, you can use positional transformation functions to extract information about fields in different rows or pivot columns. You can also create lists and retrieve the current row or pivot column index.
If your Explore contains totals, you can reference total values for columns and rows:
Function | Syntax | Purpose |
---|---|---|
:total | ${field:total} | Returns the column total of the field |
:row_total | ${field:row_total} | Returns the row total of the field |
Row-Related Functions
Some of these functions use the relative positions of rows, so changing the sort order of the rows affects the results of the functions.
Function | Syntax | Purpose |
---|---|---|
index | index(expression, n) | Returns the value of the nth element of the column created by expression, unless expression defines a column of Lists, in which case returns the nth element of each list |
list | list(value_1, value_2, ...) | Creates a List out of the given values |
lookup | lookup(value, lookup_column, result_column) | Returns the value in result_column that is in the same row as value is in lookup_column |
offset | offset(column, row_offset) | Returns the value of row (n + row_offset) in column, where n is the current row number |
offset_list | offset_list(column, row_offset, num_values) | Returns a List of the num_values values starting at row (n + row_offset) in column, where n is the current row number |
row | row() | Returns the current row number |
Pivot-Related Functions
Some of these functions use the relative positions of pivot columns, so changing the sort order of the pivoted dimension affects the results of those functions.
Function | Syntax | Purpose |
---|---|---|
pivot_column | pivot_column() | Returns the index of the current pivot column |
pivot_index | pivot_index(expression, pivot_index) | Evaluates expression in the context of the pivot column at position pivot_index (1 for first pivot, 2 second pivot, etc.). Returns null for unpivoted result |
pivot_offset | pivot_offset(pivot_expression, col_offset) | Returns the value of the pivot_expression in position (n + col_offset), where n is the current pivot column position. Returns null for unpivoted results |
pivot_offset_list | pivot_offset_list(pivot_expression, col_offset, num_values) | Returns a List of the the num_values values in pivot_expression starting at position (n + col_offset), where n is the current pivot index. Returns null for unpivoted results |
pivot_row | pivot_row(expression) | Returns the pivoted values of expression as a List. Returns null for unpivoted results. |
pivot_where | pivot_where(select_expression, expression) | Returns the value of expression for the pivot column which uniquely satisfies select_expression or null if such a unique column does not exist. |
The specific pivot functions you use determine whether the table calculation is displayed next to each pivoted column, or is displayed as a single column at the end of the table.
Updated almost 5 years ago