Commands
SELECT UPSERT VALUES UPSERT SELECT DELETE |
CREATE DROP ALTER TABLE CREATE INDEX |
DROP INDEX ALTER INDEX EXPLAIN |
Other Grammar
SELECT
SELECT |
|
| selectExpression |
|
FROM tableExpression |
|
|
|
|
|
|
Selects data from a table. DISTINCT
filters out duplicate results while ALL
, the default, includes all results. FROM
identifies the table being queried (single table only currently - no joins or derived tables yet). Dynamic columns not declared at create time may be defined in parenthesis after the table name and then used in the query. GROUP BY
groups the the result by the given expression(s). HAVING
filter rows after grouping. ORDER BY
sorts the result by the given column(s) or expression(s) and is only allowed for aggregate queries or queries with a LIMIT
clause. LIMIT
limits the number of rows returned by the query with no limit applied if specified as null or less than zero. The LIMIT
clause is executed after the ORDER BY
clause to support TopN type queries. An optional hint overrides the default query plan.
Example:
SELECT * FROM TEST;
SELECT a.* FROM TEST;
SELECT DISTINCT NAME FROM TEST;
SELECT ID, COUNT(1) FROM TEST GROUP BY ID;
SELECT NAME, SUM(VAL) FROM TEST GROUP BY NAME HAVING COUNT(1) > 2;
SELECT 'ID' COL, MAX(ID) AS MAX FROM TEST;
SELECT * FROM TEST LIMIT 1000;
UPSERT VALUES
UPSERT INTO tableName |
| VALUES ( constantTerm |
| ) |
Inserts if not present and updates otherwise the value in the table. The list of columns is optional and if not present, the values will map to the column in the order they are declared in the schema. The values must evaluate to constants.
Example:
UPSERT INTO TEST VALUES('foo','bar',3);
UPSERT INTO TEST(NAME,ID) VALUES('foo',123);
UPSERT SELECT
Inserts if not present and updates otherwise rows in the table based on the results of running another query. The values are set based on their matching position between the source and target tables. The list of columns is optional and if not present will map to the column in the order they are declared in the schema. If auto commit is on, and both a) the target table matches the source table, and b) the select performs no aggregation, then the population of the target table will be done completely on the server-side (with constraint violations logged, but otherwise ignored). Otherwise, data is buffered on the client and, if auto commit is on, committed in row batches as specified by the UpsertBatchSize connection property (or the phoenix.mutate.upsertBatchSize HBase
config property which defaults to 10000 rows)
Example:
UPSERT INTO test.targetTable(col1, col2) SELECT col3, col4 FROM test.sourceTable WHERE col5 < 100
UPSERT INTO foo SELECT * FROM bar;
DELETE
Deletes the rows selected by the where clause. If auto commit is on, the deletion is performed completely server-side.
Example:
DELETE FROM TEST;
DELETE FROM TEST WHERE ID=123;
DELETE FROM TEST WHERE NAME LIKE 'foo%';
CREATE
CREATE |
|
| tableRef |
( columnDef |
|
| ) |
|
|
Creates a new table or view. For the creation of a table, the HBase
table and any column families referenced are created if they don't already exist (using uppercase names unless they are double quoted in which case they are case sensitive). Column families outside of the ones listed are not affected. At create time, an empty key value is added to the first column family of any existing rows. Upserts will also add this empty key value. This is done to improve query performance by having a key value column we can guarantee always being there (minimizing the amount of data that must be projected). Alternately, if a view is created, the HBase
table and column families must already exist. No empty key value is added to existing rows and no data mutations are allowed - the view is read-only. Query performance for a view will not be as good as performance for a table. For a table only, HBase
table and column configuration options may be passed through as key/value pairs to setup the HBase
table as needed.
Example:
CREATE TABLE my_schema.my_table ( id BIGINT not null primary key, date DATE not null)
CREATE TABLE my_table ( id INTEGER not null primary key desc, date DATE not null,
m.db_utilization DECIMAL, i.db_utilization)
m.DATA_BLOCK_ENCODING='DIFF'
CREATE TABLE stats.prod_metrics ( host char(50) not null, created_date date not null,
txn_count bigint CONSTRAINT pk PRIMARY KEY (host, created_date) )
CREATE TABLE IF NOT EXISTS my_table ( id char(10) not null primary key, value integer)
DATA_BLOCK_ENCODING='NONE',VERSIONS=?,MAX_FILESIZE=2000000 split on (?, ?, ?)
DROP
DROP |
|
| tableRef |
Drops a table or view. When dropping a table, the data in the table is deleted. For a view, on the other hand, the data is not affected. Note that the schema is versioned, such that snapshot queries connecting at an earlier time stamp may still query against the dropped table, as the HBase
table itself is not deleted.
Example:
DROP TABLE my_schema.my_table
DROP VIEW my_view
ALTER TABLE
Alters an existing table by adding or removing a column or updating table options. When a column is dropped from a table, the data in that column is deleted as well. PK
columns may not be dropped, and only nullable PK
columns may be added. For a view, the data is not affected when a column is dropped. Note that creating or dropping columns only affects subsequent queries and data modifications. Snapshot queries that are connected at an earlier timestamp will still use the prior schema that was in place when the data was written.
Example:
ALTER TABLE my_schema.my_table ADD d.dept_id char(10) VERSIONS=10
ALTER TABLE my_table ADD dept_name char(50)
ALTER TABLE my_table ADD parent_id char(15) null primary key
ALTER TABLE my_table DROP COLUMN d.dept_id
ALTER TABLE my_table DROP COLUMN dept_name
ALTER TABLE my_table DROP COLUMN parent_id
ALTER TABLE my_table SET IMMUTABLE_ROWS=true
CREATE INDEX
CREATE INDEX |
| indexName |
ON tableRef ( columnRef |
|
| ) |
|
|
|
Creates a new secondary index on a table or view. The index will be automatically kept in sync with the table as the data changes. At query time, the optimizer will use the index if it contains all columns referenced in the query and produces the most efficient execution plan. If a table has rows that are write-once and append-only, then the table may set the IMMUTABLE_ROWS
property to true (either up-front in the CREATE TABLE
statement or afterwards in an ALTER TABLE
statement). This reduces the overhead at write time to maintain the index. Otherwise, if this property is not set on the table, then incremental index maintenance will be performed on the server side when the data changes.
Example:
CREATE INDEX my_idx ON sales.opportunity(last_updated_date DESC)
CREATE INDEX my_idx ON log.event(created_date DESC) INCLUDE (name, payload) SALT_BUCKETS=10
CREATE INDEX IF NOT EXISTS my_comp_idx ON server_metrics ( gc_time DESC, created_date DESC )
DATA_BLOCK_ENCODING='NONE',VERSIONS=?,MAX_FILESIZE=2000000 split on (?, ?, ?)
DROP INDEX
Drops an index from a table. When dropping an index, the data in the index is deleted. Note that since metadata is versioned, snapshot queries connecting at an earlier time stamp may still use the index, as the HBase
table backing the index is not deleted.
Example:
DROP INDEX my_idx ON sales.opportunity
DROP INDEX IF EXISTS my_idx ON server_metrics
ALTER INDEX
Alters the state of an existing index. DISABLE
will cause the no further index maintenance to be performed on the index and it will no longer be considered for use in queries. REBUILD
will completely rebuild the index and upon completion will enable the index to be used in queries again. UNUSABLE
will cause the index to no longer be considered for use in queries, however index maintenance will continue to be performed. USABLE
will cause the index to again be considered for use in queries. Note that a disabled index must be rebuild and cannot be set as USABLE
.
Example:
ALTER INDEX my_idx ON sales.opportunity DISABLE
ALTER INDEX IF EXISTS my_idx ON server_metrics REBUILD
EXPLAIN
EXPLAIN |
|
Computes the logical steps necessary to execute the given command. Each step is represented as a string in a single column result set row.
Example:
EXPLAIN SELECT NAME, COUNT(*) FROM TEST GROUP BY NAME HAVING COUNT(*) > 2;
EXPLAIN SELECT entity_id FROM CORE.CUSTOM_ENTITY_DATA WHERE organization_id='00D300000000XHP' AND SUBSTR(entity_id,1,3) = '002' AND created_date < CURRENT_DATE()-1;
Constraint
CONSTRAINT constraintName PRIMARY KEY ( columnName |
|
| ) |
Defines a multi-part primary key constraint. Each column may be declared to be sorted in ascending or descending ordering. The default is ascending.
Example:
CONSTRAINT my_pk PRIMARY KEY (host,created_date)
CONSTRAINT my_pk PRIMARY KEY (host ASC,created_date DESC)
Options
| name = |
|
|
Sets an option on an HBase
table or column by modifying the respective HBase
metadata. The option applies to the named family or if omitted to all families if the name references an HColumnDescriptor
property. Otherwise, the option applies to the HTableDescriptor
.
One built-in option is SALT_BUCKETS
. This option causes an extra byte to be transparently prepended to every row key to ensure an even distribution of write load across all your region servers. This is useful when your row key is always monotonically increasing causing hot spotting on a single region server. The byte is determined by hashing the row key and modding it with the SALT_BUCKETS
value. The value may be from 1 to 256. If not split points are defined for the table, it will automatically be pre-split at each possible salt bucket value. For an excellent write-up of this technique, see http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
Another built-in options is IMMUTABLE_ROWS
. Only tables with immutable rows are allowed to have indexes. Immutable rows are expected to be inserted once in their entirety and then never updated. This limitation will be removed once incremental index maintenance has been implemented. The current implementation inserts the index rows when the data row is inserted.
Example:
IMMUTABLE_ROWS=true
SALT_BUCKETS=10
DATA_BLOCK_ENCODING='NONE',a.VERSIONS=10
MAX_FILESIZE=2000000000,MEMSTORE_FLUSHSIZE=80000000
Hint
name |
|
Advanced features that overrides default query processing behavior. The supported hints include 1) SKIP_SCAN
to force a skip scan to be performed on the query when it otherwise would not be. This option may improve performance if a query does not include the leading primary key column, but does include other, very selective primary key columns. 2) RANGE_SCAN
to force a range scan to be performed on the query. This option may improve performance if a query filters on a range for non selective leading primary key column along with other primary key columns 3) NO_INTRA_REGION_PARALLELIZATION
to prevent the spawning of multiple threads to process data within a single region. This option is useful when the overall data set being queries is known to be small. 4) NO_INDEX
to force the data table to be used for a query, and 5) INDEX
(<table_name> <index_name>...) to suggest which index to use for a given query. Double quotes may be used to surround a table_name and/or index_name that is case sensitive.
Example:
/*+ SKIP_SCAN */
/*+ RANGE_SCAN */
/*+ NO_INTRA_REGION_PARALLELIZATION */
/*+ NO_INDEX */
/*+ INDEX(employee emp_name_idx emp_start_date_idx) */
Column Def
columnRef dataType |
|
|
Define a new primary key column. The column name is case insensitive by default and case sensitive if double quoted. The sort order of a primary key may be ascending (ASC
) or descending. The default is ascending.
Example:
id char(15) not null primary key
key integer null
m.response_time bigint
Table Ref
| tableName |
References a table with an optional schema name qualifier
Example:
Sales.Contact
HR.Employee
Department
Column Ref
| columnName |
References a column with an optional family name qualifier
Example:
e.salary
dept_name
Select Expression
* | ||||||||||||||||||
| ||||||||||||||||||
|
An expression in a SELECT
statement. All columns in a table may be selected using *, and all columns in a column family may be selected using <familyName>.*.
Example:
*
cf.*
ID AS VALUE
VALUE + 1 VALUE_PLUS_ONE
Split Point
value | ||
bindParameter |
Defines a split point for a table. Use a bind parameter with preparedStatement.setBinary(int,byte[]) to supply arbitrary bytes.
Example:
'A'
Table Expression
| tableName |
|
A reference to a table. Joins and sub queries are not currently supported.
Example:
PRODUCT_METRICS AS PM
Order
expression |
|
|
Sorts the result by an expression.
Example:
NAME DESC NULLS LAST
Expression
andCondition |
|
Value or condition.
Example:
ID=1 OR NAME='Hi'
And Condition
condition |
|
Value or condition.
Example:
ID=1 AND NAME='Hi'
Condition
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Boolean value or condition. When comparing with LIKE
, the wildcards characters are _
(any one character) and %
(any characters). To search for the characters %
and _
, the characters need to be escaped. The escape character is \
(backslash). Patterns that end with an escape character are invalid and the expression returns NULL
. BETWEEN
does an inclusive comparison for both operands.
Example:
NAME LIKE 'Jo%'
Compare
| |||
| |||
| |||
= | |||
< | |||
> | |||
|
Comparison operator. The operator != is the same as <>.
Example:
<>
Operand
summand |
|
A string concatenation.
Example:
'foo'|| s
Summand
factor |
|
An addition or subtraction of numeric or date type values
Example:
a + b
a - b
Factor
term |
|
A multiplication or division.
Example:
c * d
e / 5
Term
value | |||||||||||
bindParameter | |||||||||||
Function | |||||||||||
case | |||||||||||
caseWhen | |||||||||||
| |||||||||||
| |||||||||||
rowValueConstructor |
A value.
Example:
'Hello'
Row Value Constructor
( term , term |
| ) |
A row value constructor is a list of other terms which are treated together as a kind of composite structure. They may be compared to each other or to other other terms. The main use case is 1) to enable efficiently stepping through a set of rows in support of query-more type functionality, or 2) to allow IN
clause to perform point gets on composite row keys.
Example:
(col1, col2, 5)
Bind Parameter
? | |||
|
A parameters can be indexed, for example :1
meaning the first parameter.
Example:
:1
?
Value
string | ||
numeric | ||
boolean | ||
null |
A literal value of any data type, or null.
Example:
10
Case
CASE term WHEN expression THEN term |
|
| END |
Returns the first expression where the value is equal to the test expression. If no else part is specified, return NULL
.
Example:
CASE CNT WHEN 0 THEN 'No' WHEN 1 THEN 'One' ELSE 'Some' END
Case When
CASE WHEN expression THEN term |
|
| END |
Returns the first expression where the condition is true. If no else part is specified, return NULL
.
Example:
CASE WHEN CNT<10 THEN 'Low' ELSE 'High' END
Name
| ||||||||||||||||||||||||
quotedName |
Unquoted names are not case sensitive. There is no maximum name length.
Example:
my_column
Quoted Name
" anything " |
Quoted names are case sensitive, and can contain spaces. There is no maximum name length. Two double quotes can be used to create a single double quote inside an identifier.
Example:
"first-name"
Alias
name
An alias is a name that is only valid in the context of the statement.
Example:
A
Null
NULL
NULL
is a value without data type and means 'unknown value'.
Example:
NULL
Data Type
A type name.
Example:
CHAR(15)
VARCHAR
VARCHAR(1000)
INTEGER
BINARY(200)
String
' anything ' |
A string starts and ends with a single quote. Two single quotes can be used to create a single quote inside a string.
Example:
'John''s car'
Boolean
TRUE | ||
FALSE |
A boolean value.
Example:
TRUE
Numeric
int | ||
long | ||
decimal |
The data type of a numeric value is always the lowest possible for the given value. If the number contains a dot this is decimal; otherwise it is int, long, or decimal (depending on the value).
Example:
SELECT -10.05
SELECT 5
SELECT 12345678912345
Int
| number |
The maximum integer number is 2147483647, the minimum is -2147483648.
Example:
10
Long
| number |
Long numbers are between -9223372036854775808 and 9223372036854775807.
Example:
100000
Decimal
| number |
|
A decimal number with fixed precision and scale. Internally, java.lang.BigDecimal
is used.
Example:
SELECT -10.5
Number
0-9 |
|
The maximum length of the number depends on the data type used.
Example:
100
Comments
| |||
| |||
|
Comments can be used anywhere in a command and are ignored by the database. Line comments end with a newline. Block comments cannot be nested, but can be multiple lines long.
Example:
// This is a comment