Spark SQL functions

You can use several built-in Spark SQL functions to extend SQL functionality with Adobe Experience Platform Query Service. This document lists the Spark SQL functions that are supported by Query Service.

For more detailed information about the functions, including their syntax, usage, and examples, read the Spark SQL function documentation.

NOTE
Not all functions in the external documentation are supported.

Math and statistical operators and functions math

Operator/Function
Description
%
Returns the remainder of the two numbers
*
Multiplies the two numbers
+
Adds the two numbers
-
Subtracts the two numbers
/
Divides the two numbers
abs
Returns the absolute value of the input
acos
Returns the inverse cosine value
approx_count_distinct
Returns the estimated cardinality by HyperLogLog++
approx_percentile
Returns the approximate percentile value at the given percentage
asin
Returns the inverse sine value
atan
Returns the inverse tangent value
atan2
Returns the angle between the positive x-axis plane and the points given by the coordinates
avg
Returns the average value
cbrt
Returns the cube root
ceil or ceiling
Returns the smallest integer not larger than the inputted value
conv
Convert from one base to another
corr
Returns the Pearson coefficient between the numbers
cos
Returns the cosine value
cosh
Returns the hyperbolic cosine value
cot
Returns the cotangent value
dense_rank
Returns the rank of a value in a group of values
e
Returns Euler’s number
exp
Returns e to the power of the value
expm1
Returns e to the power of the value minus 1
factorial
Returns the factorial of the value
floor
Returns the largest integer not smaller than the value
greatest
Returns the largest value of all the parameters
hypot
Returns the hypotenuse of the two values given
kurtosis
Returns the kurtosis value from the group
least
Returns the smallest value of all the parameters
ln
Returns the natural logarithm of the value
log
Returns the logarithm of the value
log10
Returns the logarithm, in base 10, of the value
log1p
Returns the logarithm of the value plus 1
log2
Returns the logarithm, in base 2, of the value
max
Returns the maximum value of the expression
mean
Returns the mean calculated from the values
min
Returns the minimum value of the expression
monotonically_increasing_id
Returns monotonically increasing IDs
negative
Returns the negated value
percent_rank
Returns the percentage ranking of a value
percentile
Returns the exact percentile at a given percentage
percentile_approx
Returns the approximate percentile at a given percentage
pi
Returns pi
pmod
Returns the positive modulo between two values
positive
Returns the positive value
pow, power
Returns the first value to the power of the second value
radians
Converts the value to radians
rand
Returns a random number from 0 through 1
randn
Returns a random value
rint
Returns the closest double value
round
Returns the closest rounded value
sign, signum
Returns the number’s sign
sin
Returns sine of the value
sinh
Returns hyperbolic sine of the value
sqrt
Returns the square root of the value
stddev
Returns the standard deviation of the value
sttdev_pop
Returns the population standard deviation of the value
stddev_samp
Returns the sample standard deviation of the value
sum
Returns the sum of the values
tan
Returns tangent of the value
tanh
Returns hyperbolic tangent of the value
var_pop
Returns the calculated population variance
var_samp, variance
Returns the calculated sample variance

Logical operators and functions logical-operators

Operator/Function
Description
! or not
Logical not
<
Less than
<=
Less than or equal to
=
Equal to
>
Greater than
>=
Greater than or equal to
^
Bitwise exclusive or
|
Bitwise or
~
Bitwise not
arrays_overlap
Returns the common elements
assert_true
Asserts if the expression is true
if
If the expression evaluates to true, return the second expression. Otherwise, return the third expression.
ifnull
If the expression is null, it returns the second expression. Otherwise, it returns the first expression.
in
Returns true if the first expression is in any of the subsequent expressions.
isnan
Returns true if the value is not a number
isnotnull
Returns true if the value is not null
isnull
Returns true if the value is null
nanvl
Returns the first expression if not a number, returns the second expression otherwise
or
Logical or
when
When can be used to create branch conditions for comparison
xpath_boolean
Returns true if the XPath expression evaluates to true or if a matching node is found

Date/time functions datetime-functions

Function
Description
add_months
Add months to date
date_add
Add days to date
date_format
Modify date format
date_sub
Subtract days from date
date_trunc
Returns the date truncated to the specified unit
datediff
Returns the difference between dates in days
day, dayofmonth
Returns the day of the month
dayofweek
Returns the day of week (1-7)
dayofyear
Returns the day of year
from_unixtime
Returns date in UNIX® time
from_utc_timestamp
Returns date in UTC time
hour
Returns the hour of the input
last_day
Returns the last day of the month that the date belongs to
minute
Returns the minute of the input
month
Returns the month of the input
months_between
Number of months between
next_day
Returns the first day later than the input
quarter
Returns the quarter of the input
second
Returns the second of the string
to_date
Converts the string to a date. Note: The string must be in the format yyyy-mm-ddTHH24:MM:SS.
to_timestamp
Converts the string to a timestamp. Note: The string must be in the format yyyy-mm-ddTHH24:MM:SS.
to_unix_timestamp
Converts the string to a UNIX® timestamp
to_utc_timestamp
Converts the string to a UTC timestamp
trunc
Truncates the date
unix_timestamp
Returns the UNIX® timestamp
weekday
Day of the week (0-6)
weekofyear
Returns the week of the year for a given date
year
Returns the year of the string

Arrays arrays

Function
Description
array
Creates an array with the given elements
array_contains
Checks if the array contains the value
array_distinct
Removes duplicate values from the array
array_except
Returns an array of the elements in the first array, but not the second
array_intersect
Returns the intersection of the two arrays
array_join
Joins two arrays together
array_max
Returns the maximum value of the array
array_min
Returns the minimum value of the array
array_position
Returns the 1-based position of the element
array_remove
Removes all elements that are equal to the element
array_repeat
Creates an array containing the value counted times
array_sort
Sorts the array
array_union
Joins the array together, without any duplicates
arrays_zip
Combines the values of given arrays with the values of the original collection at a given index
cardinality
Return the size of the array
element_at
Return the element at position
explode
Separate elements of array into multiple rows, excluding null
explode_outer
Separate elements of array into multiple rows, including null
find_in_set
Returns the 1 based position of array
flatten
Flattens an array of arrays
inline
Separate array of structs into a table, excluding null
inline_outer
Separate array of structs into a table, including null
posexplode
Separate elements of an array into multiple rows with positions, excluding null
reverse
Reverse elements of the array
shuffle
Return a random permutation of the array
slice
Subsets an array
sort_array
Sort an array, given an order
zip_with
Merges the two arrays into a single array, before applying a function

Datatype casting functions datatype-casting

Function
Description
bigint
Change the data type to bigint
binary
Change the data type to binary
boolean
Change the data type to boolean
type
Change the data type to the specified type
date
Change the data type to date
decimal
Change the data type to decimal
double
Change the data type to double
float
Change the data type to float
int
Change the data type to int
smallint
Change the data type to smallint
str_to_map
Create a map from a string
string
Change the data type to string
struct
Create a struct
tinyint
Change the data type to tinyint

Conversion and formatting functions conversion

Function
Description
ascii
Return the numeric (ASCII) value
base64
Change the argument to a base64 string
bin
Change the argument to a binary value
bit_length
Return the bit length
char, chr
Return the ASCII character
char_length, character_length
Return the string length
crc32
Returns the cyclic redundancy check value
degrees
Convert radians to degrees
format_number
Change the number’s format
from_json, get_json_object
Get data from JSON
hash
Return the hash value
hex
Convert the argument to a hexadecimal value
initcap
Changes the string to be title case
lcase, lower
Changes the string to be all lowercase
lpad
Pads the left side of a string
map
Create a map
map_from_arrays
Create a map from an array
map_from_entries
Create a map from an array of structs
md5
Return the md5 value
rpad
Pads the right side of a string
rtrim
Removes trailing spaces
sha, sha1
Return the SHA1 value
sha2
Return the SHA2 value
soundex
Return the soundex code
stack
Separate values into rows
substr, substring
Return the substring
to_json
Returns a JSON string
translate
Replace values within string
trim
Remove leading and trailing characters
ucase, upper
Change the string to be all uppercase
unbase64
Convert the base64 string to binary
unhex
Convert the hexadecimal to binary
uuid
Return a UUID

Data evaluation data-evaluation

Function
Description
coalesce
Return the first non-null argument
collect_list
Return a list of non-unique elements
collect_set
Return a set of unique elements
concat
Concatenation
concat_ws
Concatenation with separator
count
Returns the total count for rows
decode
Decode using a character set
elt
Return the nth input
encode
Encode using a character set
first, first_value
Returns the first value
grouping
Indicates if a column is grouped
grouping_id
Returns the level of grouping
instr
Returns a 1-based index of character occurrence
json_tuple
Returns a tuple from a JSON input
lag, lead
Returns the value before the offset
last, last_value
Returns the last value
left
Returns the first n characters
length
Returns the length of the string
levenshtein
Returns the Levenshtein distance between strings
locate, position
Returns the position of the first occurrence of a substring
map_concat
Concatenate a map
map_keys
Return a map’s keys
map_values
Return a map’s values
ntile
Divide rows into partitions
nullif
Returns null if true
nvl
Returns value if null
nvl2
Returns value if not null
parse_url
Extracts part of a URL
rank
Computes rank of a value
regexp_extract
Extracts something that matches the regex
regex_replace
Replaces something that matches the regex
repeat
Returns a string that repeats
replace
Replace all instances of a string
rollup
Create a multi-dimensional rollup
row_number
Assigns a unique row number
schema_of_json
Returns the schema of the JSON
sentences
Splits string into an array of words
sequence
Generates an array of elements
shiftleft
Signed bitwise shift left
shiftright
Signed bitwise shift right
shiftrightunsigned
Unsigned bitwise shift right
size
Return the size of the array
space
Return a string with n spaces
split
Split string
substring_index
Return index of substring
window
Window
xpath
Parse XML nodes
xpath_double, xpath_number
Parse XML nodes for double
xpath_float
Parse XML nodes for float
xpath_int
Parse XML nodes for integer
xpath_long
Parse XML nodes for long
xpath_short
Parse XML nodes for short integer
xpath_string
Parse XML nodes for string

Current information current-information

Function
Description
current_database
Returns current database
current_date
Returns current date
current_timestamp, now
Returns current timestamp

Higher-order functions higher-order

Function
Description
transform
Transform elements in an array
exists
Check if element exists
filter
Filter the input array
aggregate
Apply a binary operator to all elements
recommendation-more-help
ccf2b369-4031-483f-af63-a93b5ae5e3fb