Phoenix Language Reference

Phoenix is a SQL layer over HBase delivered as an embedded JDBC driver. The top level commands, SQL syntax, built-in functions, and supported data types are documented in the separate tabs below.


Aggregate Functions

AVG
COUNT
MAX
MIN
SUM
PERCENTILE_CONT
PERCENTILE_DISC
PERCENT_RANK
STDDEV_POP
STDDEV_SAMP

String Functions

SUBSTR
TRIM
LTRIM
RTRIM
LENGTH
REGEXP_SUBSTR
REGEXP_REPLACE
UPPER
LOWER
REVERSE
TO_CHAR

Time and Date Functions

ROUND
TRUNCATE
TO_DATE
CURRENT_DATE
CURRENT_TIME

Other Functions

MD5
INVERT
TO_NUMBER
COALESCE

AVG

AVG ( { numericTerm } )
AVG ( numericTerm )

The average (mean) value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.

Example:

AVG(X)

COUNT

COUNT( [ DISTINCT ] { * | { term } } )
COUNT (
 
DISTINCT
*
term
)

The count of all row, or of the non-null values. This method returns a long. When DISTINCT is used, it counts only distinct values. If no rows are selected, the result is 0. Aggregates are only allowed in select statements.

Example:

COUNT(*)

MAX

MAX(term)
MAX ( term )

The highest value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.

Example:

MAX(NAME)

MIN

MIN(term)
MIN ( term )

The lowest value. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.

Example:

MIN(NAME)

SUM

SUM( { numericTerm } )
SUM ( numericTerm )

The sum of all values. If no rows are selected, the result is NULL. Aggregates are only allowed in select statements. The returned value is of the same data type as the parameter.

Example:

SUM(X)

PERCENTILE_CONT

PERCENTILE_CONT( { numeric } ) WITHIN GROUP (ORDER BY { numericTerm } { ASC | DESC } )
PERCENTILE_CONT ( numeric ) WITHIN GROUP ( ORDER BY numericTerm
ASC
DESC
)

The nth percentile of values in the column. The percentile value can be between 0 and 1 inclusive. Aggregates are only allowed in select statements. The returned value is of decimal data type.

Example:

PERCENTILE_CONT( 0.9 ) WITHIN GROUP (ORDER BY X ASC)

PERCENTILE_DISC

PERCENTILE_DIST( { numeric } ) WITHIN GROUP (ORDER BY { numericTerm } { ASC | DESC } )
PERCENTILE_DIST ( numeric ) WITHIN GROUP ( ORDER BY numericTerm
ASC
DESC
)

PERCENTILE_DISC is an inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set. Nulls are ignored in the calculation.

Example:

PERCENTILE_DISC( 0.9 ) WITHIN GROUP (ORDER BY X DESC)

PERCENT_RANK

PERCENT_RANK( { numeric } ) WITHIN GROUP (ORDER BY { numericTerm } { ASC | DESC } )
PERCENT_RANK ( numeric ) WITHIN GROUP ( ORDER BY numericTerm
ASC
DESC
)

The percentile rank for a hypothetical value, if inserted into the column. Aggregates are only allowed in select statements. The returned value is of decimal data type.

Example:

PERCENT_RANK( 100 ) WITHIN GROUP (ORDER BY X ASC)

STDDEV_POP

STDDEV_POP( { numericTerm } )
STDDEV_POP ( numericTerm )

The population standard deviation of all values. Aggregates are only allowed in select statements. The returned value is of decimal data type.

Example:

STDDEV_POP( X )

STDDEV_SAMP

STDDEV_SAMP( { numericTerm } )
STDDEV_SAMP ( numericTerm )

The sample standard deviation of all values. Aggregates are only allowed in select statements. The returned value is of decimal data type.

Example:

STDDEV_SAMP( X )

MD5

MD5( term )
MD5 ( term )

Computes the MD5 hash of the argument, returning the result as a BINARY(16).

Example:

MD5(my_column)

INVERT

INVERT( term )
INVERT ( term )

Inverts the bits of the argument. The return type will be the same as the argument.

Example:

INVERT(my_column)

TO_NUMBER

TO_NUMBER( stringTerm | timeStampTerm [, formatString] )
TO_NUMBER ( stringTerm
timeStampTerm
 
, formatString
)

Formats a string or date/time/timeStamp as a number, optionally accepting a format string. For details on the format, see java.text.DecimalFormat. For date, time, and timeStamp terms, the result is the time in milliseconds since the epoch. This method returns a decimal number.

Example:

TO_NUMBER('$123.33', '\u00A4###.##')

COALESCE

COALESCE( firstTerm, secondTerm )
COALESCE ( firstTerm , secondTerm )

Returns the value of the first argument if not null and the second argument otherwise. Useful to guarantee that a column in an UPSERT SELECT command will evaluate to a non null value.

Example:

COALESCE(last_update_date, CURRENT_DATE())

SUBSTR

SUBSTR( stringTerm, startInt [, lengthInt ] )
SUBSTR ( stringTerm , startInt
 
, lengthInt
)

Returns a substring of a string starting at the one-based position. If zero is used, the position is zero-based. If the start index is negative, then the start index is relative to the end of the string. The length is optional and if not supplied, the rest of the string will be returned.

Example:

SUBSTR('[Hello]', 2, 5)
SUBSTR('Hello World', -5)

TRIM

TRIM( stringTerm )
TRIM ( stringTerm )

Removes leading and trailing spaces from the input string.

Example:

TRIM('  Hello  ')

LTRIM

LTRIM( stringTerm )
LTRIM ( stringTerm )

Removes leading spaces from the input string.

Example:

LTRIM('  Hello')

RTRIM

RTRIM( stringTerm )
RTRIM ( stringTerm )

Removes trailing spaces from the input string.

Example:

RTRIM('Hello   ')

LENGTH

LENGTH( stringTerm )
LENGTH ( stringTerm )

Returns the length of the string in characters.

Example:

LENGTH('Hello')

REGEXP_SUBSTR

REGEXP_SUBSTR( stringTerm, patternString [, startInt ] )
REGEXP_SUBSTR ( stringTerm , patternString
 
, startInt
)

Returns a substring of a string by applying a regular expression start from the offset of a one-based position. Just like with SUBSTR, if the start index is negative, then it is relative to the end of the string. If not specified, the start index defaults to 1.

Example:

REGEXP_SUBSTR('na1-appsrv35-sj35', '[^-]+') evaluates to 'na1'

REGEXP_REPLACE

REGEXP_REPLACE( stringTerm, patternString [, replacementString ] )
REGEXP_REPLACE ( stringTerm , patternString
 
, replacementString
)

Returns a string by applying a regular expression and replacing the matches with the replacement string. If the replacement string is not specified, it defaults to an empty string.

Example:

REGEXP_REPLACE('abc123ABC', '[0-9]+', '#') evaluates to 'abc#ABC'

UPPER

UPPER( stringTerm )
UPPER ( stringTerm )

Returns upper case string of the string argument.

Example:

UPPER('Hello')

LOWER

LOWER( stringTerm )
LOWER ( stringTerm )

Returns lower case string of the string argument.

Example:

LOWER('HELLO')

REVERSE

REVERSE( stringTerm )
REVERSE ( stringTerm )

Returns reversed string of the string argument.

Example:

REVERSE('Hello')

TO_CHAR

TO_CHAR( timestampTerm | numberTerm [, formatString] )
TO_CHAR ( timestampTerm
numberTerm
 
, formatString
)

Formats a date, time, timestamp, or number as a string. The default date format is yyyy-MM-dd HH:mm:ss and the default number format is #,##0.###. For details, see java.text.SimpleDateFormat for date/time values and java.text.DecimalFormat for numbers. This method returns a string.

Example:

TO_CHAR(myDate, '2001-02-03 04:05:06')
TO_CHAR(myDecimal, '#,##0.###')

ROUND

ROUND(timestampTerm, {'DAY' | 'HOUR' | 'MINUTE' | 'SECOND' | 'MILLISECOND'} [, multiplierNumber])
ROUND ( timestampTerm ,
' DAY '
' HOUR '
' MINUTE '
' SECOND '
' MILLISECOND '
 
, multiplierNumber
)

Rounds the timestamp to the nearest time unit specified. The multiplier is used to round to a multiple of a time unit (i.e. 10 minute) and defaults to 1 if not specified. This method returns a date.

Example:

ROUND(date, 'MINUTE', 30)
ROUND(time, 'HOUR')

TRUNCATE

TRUNC(timestampTerm, {'DAY' | 'HOUR' | 'MINUTE' | 'SECOND' | 'MILLISECOND'} [, multiplierInt])
TRUNC ( timestampTerm ,
' DAY '
' HOUR '
' MINUTE '
' SECOND '
' MILLISECOND '
 
, multiplierInt
)

Truncates the timestamp to the next time unit closer to 0. The multiplier is used to truncate to a multiple of a time unit (i.e. 10 minute) and defaults to 1 if not specified. This method returns a date.

Example:

TRUNCATE(timestamp, 'SECOND', 30)
TRUNCATE(date, 'DAY', 7)

TO_DATE

TO_DATE( stringTerm [, formatString] )
TO_DATE ( stringTerm
 
, formatString
)

Parses a string and returns a date. The most important format characters are: y year, M month, d day, H hour, m minute, s second. The default format string is yyyy-MM-dd HH:mm:ss. For details of the format, see java.text.SimpleDateFormat.

Example:

TO_DATE('Sat, 3 Feb 2001 03:05:06 GMT', 'EEE, d MMM yyyy HH:mm:ss z')

CURRENT_DATE

CURRENT_DATE()
CURRENT_DATE ( )

Returns the current server-side date, bound at the start of the execution of a query based on the current time on the region server owning the metadata of the table being queried.

Example:

CURRENT_DATE()

CURRENT_TIME

CURRENT_TIME()
CURRENT_TIME ( )

Same as CURRENT_DATE(), except returns a value of type TIME. In either case, the underlying representation is the epoch time as a long value.

Example:

CURRENT_TIME()