$Revision: 6758 $
Copyright 2002-2024 Fred Toussi. Permission is granted to distribute this document without any alteration under the terms of the HSQLDB license. Additional permission is granted to the HSQL Development Group to distribute this document with or without alterations under the terms of the HSQLDB license.
2024-06-01
Table of Contents
The SQL language consists of statements for different operations. HyperSQL 2.x supports the dialect of SQL defined progressively by ISO (also ANSI) SQL standards 92, 1999, 2003, 2008, 2011, 2016 and 2023. This means the syntax specified by the Standard text is accepted for any supported operation. Almost all features of SQL-92 up to Advanced Level are supported, as well as the additional features that make up the SQL:2023 core and many optional features of this standard.
At the time of this release, HyperSQL supports the widest range of SQL Standard features among all open source RDBMS.
Various chapters of this guide list the supported syntax. When writing or converting existing SQL DDL (Data Definition Language), DML (Data Manipulation Language) or DQL (Data Query Language) statements for HSQLDB, you should consult the supported syntax and modify the statements accordingly.
Over 300 words are reserved by the Standard and should not be used
as table or column names. For example, the word POSITION is reserved as it
is a function defined by the Standards with a similar role as
String.indexOf(String)
in Java. By default,
HyperSQL does not prevent you from using a reserved word if it does not
support its use or can distinguish it. For example, CUBE is a reserved
word for a feature that is supported by HyperSQL from version 2.5.1.
Before this version, CUBE was allowed as a table or column name but it is
no longer allowed. You should avoid using such names as future versions of
HyperSQL are likely to support the reserved words and may reject your
table definitions or queries. The full list of SQL reserved words is in
the appendix Lists of Keywords . You
can set a property to disallow the use of reserved keywords for names of
tables and other database objects. There are several other user-defined
properties to control the strict application of the SQL Standard in
different areas.
If you have to use a reserved keyword as the name of a database object, you can enclose it in double quotes.
HyperSQL also supports enhancements with keywords and expressions
that are not part of the SQL standard. Expressions such as SELECT
TOP 5 FROM ..
, SELECT LIMIT 0 10 FROM ...
or
DROP TABLE mytable IF EXISTS
are among such
constructs.
Many books cover SQL Standard syntax and can be consulted.
In HyperSQL version 2, all features of JDBC4 that apply to the
capabilities of HSQLDB are fully supported. The relevant JDBC classes are
thoroughly documented with additional clarifications and HyperSQL specific
comments. See the JavaDoc for the
org.hsqldb.jdbc.*
classes.
The following sections list the keywords that start various SQL statements grouped by their function.
Definition statements create, modify, or remove database objects. Tables and views are objects that contain data. There are other types of objects that do not contain data. These statements are covered in the Schemas and Database Objects chapter.
CREATE
Followed by { SCHEMA | TABLE | VIEW | SEQUENCE | PROCEDURE | FUNCTION | USER | ROLE | ... }, the keyword is used to create the database objects.
ALTER
Followed by the same keywords as CREATE, the keyword is used to modify the object.
DROP
Followed by the same keywords as above, the keyword is used to remove the object. If the object contains data, the data is removed too.
GRANT
Followed by the name of a role or privilege, the keyword assigns a role or gives permissions to a USER or role.
REVOKE
Followed by the name of a role or privilege, REVOKE is the opposite of GRANT.
COMMENT ON
Followed by the same keywords as CREATE, the keyword is used to add a text comment to TABLE, VIEW, COLUMN, ROUTINE, and TRIGGER objects.
EXPLAIN REFERENCES
These keywords are followed by TO or FROM to list the other database objects that reference the given object, or vice versa.
DECLARE
This is used for declaring temporary session tables and variables.
Data manipulation statements add, update, or delete data in tables and views. These statements are covered in the Data Access and Change chapter.
INSERT
Inserts one or more rows into a table or view.
UPDATE
Updates one or more rows in a table or view.
DELETE
Deletes one or more rows from a table or view.
TRUNCATE
Deletes all the rows in a table.
MERGE
Performs a conditional INSERT, UPDATE or DELETE on a table or view using the data given in the statement.
Data query statements retrieve and combine data from tables and views and return result sets. These statements are covered in the Data Access and Change chapter.
SELECT
Returns a result set formed from a subset of rows and columns in one or more tables or views.
VALUES
Returns a result set formed from constant values.
WITH ...
This keyword starts a series of SELECT statements that form a query. The first SELECTs act as subqueries for the final SELECT statement in the same query.
EXPLAIN PLAN
These keywords are followed by the full text of any DQL or DML statement. The result set shows the anatomy of the given DQL or DML statement, including the indexes used to access the tables.
CALL
Calls a procedure or function. Calling a function can return a result set or a value, while calling a procedure can return one or more result sets and values at the same time. This statement is covered in the SQL-Invoked Routines chapter.
SET
The SET statement has many variations and is used for setting the values of the general properties of the database or the current session. Usage of the SET statement for the database is covered in the System Management chapter. Usage for the session is covered in the Sessions and Transactions chapter.
General operations on the database include backup, checkpoint, and other operations. These statements are covered in detail in the System Management chapter.
BACKUP
Creates a backup of the database in a target directory.
PERFORM
Includes commands to export and import SQL scripts from / to the database. Also includes a command to check the consistency of the indexes.
SCRIPT
Creates a script of SQL statements that creates the database objects and settings.
CHECKPOINT
Saves all the changes to the database up to this point to disk files.
SHUTDOWN
Shuts down the database after saving all the changes.
These statements are used in a session to start, end or control transactions. They are covered in the Sessions and Transactions chapter.
START TRANSACTION
This statement initiates a new transaction with the given transaction characteristics
SET TRANSACTION
Introduces one of more characteristics for the next transaction.
COMMIT
Commits the changes to data made in the current transaction.
ROLLBACK
Rolls back the changes to data made in the current transaction. It is also possible to roll back to a savepoint.
SAVEPOINT
Records a point in the current transaction so that future changes can be rolled back to this point.
RELEASE SAVEPOINT
Releases an existing savepoint.
LOCK
Locks a set of tables for transaction control.
CONNECT
Starts a new session and continues operations in this session.
DISCONNECT
Ends the current session.
Any SQL statement can include comments. The comments are stripped before the statement is executed.
SQL style line comments start with two dashes
--
and extend to the end of the line.
C style comments can cover part of the line or multiple lines.
They start with /*
and end with
*/
.
The body of user-defined SQL procedures and functions (collectively called routines) may contain several other types of statements and keywords in addition to DML and DQL statements. These include: BEGIN and END for blocks; FOR, WHILE and REPEAT loops; IF, ELSE and ELSEIF blocks; SIGNAL and RESIGNAL statements for handling exceptions.
These statements are covered in detail in the SQL-Invoked Routines chapter.
All data is stored in tables. Therefore, creating a database requires defining the tables and their columns. The SQL Standard supports temporary tables, which are for temporary data managed by each session, and permanent base tables, which are for persistent data shared by different sessions.
A HyperSQL database can be an all-in-memory mem: database with no automatic persistence, or a file-based, persistent file: database.
Standard SQL is not case sensitive, except when names of objects
are enclosed in double-quotes. SQL keywords can be written in any case;
for example, sElect
, SELECT
and
select
are all allowed and converted to uppercase.
Identifiers, such as names of tables, columns and other objects defined
by the user, are also converted to uppercase. For example,
myTable
, MyTable
and
MYTABLE
all refer to the same table and are stored in
the database in the case-normal form, which is all uppercase for
unquoted identifiers. When the name of an object is enclosed in double
quotes when it is created, the exact name is used as the case-normal
form and it must be referenced with the exact same double-quoted string.
For example, "myTable"
and
"MYTABLE"
are different tables. When the
double-quoted name is all-uppercase, it can be referenced in any case;
"MYTABLE"
is the same as myTable
and MyTable
because they are all converted to
MYTABLE
.
HyperSQL supports the Standard definition of persistent base table, but defines three types according to the way the data is stored. These are MEMORY tables, CACHED tables, and TEXT tables.
Memory tables are the default type when the CREATE TABLE command
is used. Their data is held entirely in memory. In file-based databases,
MEMORY tables are persistent and any change to their structure or
contents is written to the *.log
and
*.script
files. The *.script
file and the *.log
file are read the next time the
database is opened, and the MEMORY tables are recreated with all the
data. This process may take a long time if the database is larger than
tens of megabytes. When the database is shutdown, all the data is
saved.
CACHED tables are created with the CREATE CACHED TABLE command. Only part of their data or indexes is held in memory, allowing large tables that would otherwise take up to several hundred megabytes of memory. Another advantage of cached tables is that the database engine takes less time to start up when a cached table is used for large amounts of data. The disadvantage of cached tables is a reduction in speed. Do not use cached tables if your data set is relatively small. In an application with some small tables and some large ones, it is better to use the default, MEMORY mode for the small tables.
TEXT tables use a CSV (Comma Separated Value) or other delimited text file as the source of their data. You can specify an existing CSV file, such as a dump from another database or program, as the source of a TEXT table. Alternatively, you can specify an empty file to be filled with data by the database engine. TEXT tables are efficient in memory usage as they cache only part of the text data and all of the indexes. The Text table data source can always be reassigned to a different file if necessary. The commands are needed to set up a TEXT table as detailed in the Text Tables chapter.
With all-in-memory mem: databases, both MEMORY table and CACHED table declarations are treated as declarations for MEMORY tables which last only for the duration of the Java process. In the latest versions of HyperSQL, TEXT table declarations are allowed in all-in-memory databases.
The default type of tables resulting from future CREATE TABLE statements can be specified with the SQL command:
SET DATABASE DEFAULT TABLE TYPE { CACHED | MEMORY };
The type of an existing table can be changed with the SQL command:
SET TABLE <table name> TYPE { CACHED | MEMORY };
SQL statements such as INSERT or SELECT access different types of tables uniformly. No change to statements is needed to access different types of table.
Data in TEMPORARY tables is not saved and lasts only for the lifetime of the session. The contents of each TEMP table are visible only from the session that is used to populate it.
HyperSQL supports two types of temporary tables.
The GLOBAL TEMPORARY
type is a schema object.
It is created with the CREATE GLOBAL TEMPORARY TABLE
statement. The definition of the table persists, and each session has
access to the table. But each session sees its own copy of the table,
which is empty at the beginning of the session.
The LOCAL TEMPORARY
type is not a schema
object. It is created with the DECLARE LOCAL TEMPORARY
TABLE
statement. The table definition lasts only for the
duration of the session and is not persisted in the database. The table
can be declared in the middle of a transaction without committing the
transaction. If a schema name is needed to reference these tables in a
given SQL statement, the pseudo schema name SESSION
can be used.
When the session commits, the contents of all temporary tables are
cleared by default. If the table definition statement includes
ON COMMIT PRESERVE ROWS
, then the contents are kept
when a commit takes place.
The rows in temporary tables are stored in memory by default. If
the hsqldb.result_max_memory_rows property has been
set or the SET SESSION RESULT MEMORY ROWS <row
count>
has been specified, tables with row count above the
setting are stored on disk.
The SQL Standard is strongly typed and completely type-safe. It supports the following basic types, which are all supported by HyperSQL.
Numeric types TINYINT, SMALLINT, INTEGER and BIGINT are types with fixed binary precision. These types are more efficient to store and retrieve. NUMERIC and DECIMAL are types with user-defined decimal precision. They can be used with zero scale to store very large integers, or with a non-zero scale to store decimal fractions. The DOUBLE type is a 64-bit, approximate floating point types. HyperSQL even allows you to store infinity in this type.
The BOOLEAN type is for logical values and can hold TRUE, FALSE or UNKNOWN. Although HyperSQL allows you to use one and zero in assignment or comparison, you should use the standard values for this type.
Character string types are CHAR(L), VARCHAR(L) and CLOB (here, L stands for length parameter, an integer). CHAR is for fixed width strings and any string that is assigned to this type is padded with spaces at the end. If you use CHAR without the length L, then it is interpreted as a single character string. Do not use this type for general storage of strings. Use VARCHAR(L) for general strings. There are only memory limits and performance implications for the maximum length of VARCHAR(L). If the strings are larger than a few kilobytes, consider using CLOB. The CLOB types is a better choice for very long strings. Do not use this type for short strings as there are performance implications. By default LONGVARCHAR is a synonym for a long VARCHAR and can be used without specifying the size. You can set LONGVARCHAR to map to CLOB, with the sql.longvar_is_lob connection property or the SET DATABASE SQL LONGVAR IS LOB TRUE statement.
Binary string types are BINARY(L), VARBINARY(L) and BLOB. Do not use BINARY(L) unless you are storing fixed length strings such as UUID. This type pads short binary strings with zero bytes. BINARY without the length L means a single byte. Use VARBINARY(L) for general binary strings, and BLOB for large binary objects. You should apply the same considerations as with the character string types. By default, LONGVARBINARY is a synonym for a long VARBINARY and can be used without specifying the size. You can set LONGVARBINARY to map to BLOB, with the sql.longvar_is_lob connection property or the SET DATABASE SQL LONGVAR IS LOB TRUE statement.
The BIT(L) and BITVARYING(L) types are for bit maps. Do not use them for other types of data. BIT without the length L argument means a single bit and is sometimes used as a logical type. Use BOOLEAN instead of this type.
The UUID type is for UUID (also called GUID) values. The value is stored as BINARY. UUID character strings, as well as BINARY strings, can be used to insert or to compare.
The datetime types DATE, TIME, and TIMESTAMP, together with their WITH TIME ZONE variations are available. Read the details in this chapter on how to use these types.
The INTERVAL type is very powerful when used together with the
datetime types. This is very easy to use, but is supported mainly by
enterprise database systems. Note that functions that add days or
months to datetime values are not really a substitute for the INTERVAL
type. Expressions such as (datecol - 7 DAY) >
CURRENT_DATE
are optimised to use indexes when it is
possible, while the equivalent function calls are not
optimised.
The OTHER type is for storage of Java objects. If your objects are large, serialize them in your application and store them as BLOB in the database.
The ARRAY type supports all base types except LOB and OTHER types. ARRAY data objects are held in memory while being processed. It is therefore not recommended to store more than about a thousand objects in an ARRAY in normal operations with disk-based databases. For specialised applications, use ARRAY with as many elements as your memory allocation can support.
HyperSQL 2.7 has several compatibility modes which allow the type names that are used by other RDBMS to be accepted and translated into the closest SQL Standard type. For example, the type TEXT, supported by MySQL and PostgreSQL is translated in these compatibility modes.
Table 2.1. List of SQL types
Type | Description |
---|---|
TINYINT, SMALLINT, INT or INTEGER, BIGNIT | binary number types with 8, 16, 32, 64 bit precision respectively |
DOUBLE or FLOAT | 64 bit precision floating point number |
DECIMAL(P,S), DEC(P,S) or NUMERIC(P,S) | identical types for fixed precision number (*) |
BOOLEAN | boolean type supports TRUE, FALSE and UNKNOWN |
CHAR(L) or CHARACTER(L) | fixed-length UTF-16 string type - padded with space to length L (**) |
VARCHCHAR(L) or CHARACTER VARYING(L) | variable-length UTF-16 string type (***) |
CLOB(L) | variable-length UTF-16 long string type (***) |
LONGVARCHAR(L) | a non-standard synonym for VARCHAR(L) (***) |
BINARY(L) | fixed-length binary string type - padded with zero to length L (**) |
VARBINARY(L) or BINARY VARYING(L) | variable-length binary string type (***) |
BLOB(L) | variable length binary string type (***) |
LONGVARBINARY(L) | a non-standard synonym for VARBINARY(L) (***) |
BIT(L) | fixed-length bit map - padded with 0 to length L - maximum value of L is 1024 |
BIT VARYING(L) | variable-length bit map - maximum value of L is 1024 |
UUID | 16 byte fixed binary type represented as UUID string |
DATE | date |
TIME(S) | time of day (****) |
TIME(S) WITH TIME ZONE | time of day with zone displacement value (****) |
TIMESTAMP(S) | date with time of day (****) |
TIMESTAMP(S) WITH TIME ZONE | timestamp with zone displacement value (****) |
INTERVAL | date or time interval - has many variants |
OTHER | non-standard type for Java serializable object |
ARRAY | array of a base type |
In the table above: (*) The parameters are optional. P is used for maximum precision and S for scale of DECIMAL and NUMERIC. If only P is used, S defaults to 0. If none is used, P defaults to 128 and S defaults to 0. The maximum value of each parameter is unlimited. (**) The parameter L is used for fixed length. If not used, it defaults to 1. (***) The parameter L is used for maximum length. It is optional. If not used, it defaults to 32K for VARCHAR and VARBINARY, 1G for BLOB or CLOB, and 16M for the LONGVARCHAR and LONGVARBINARY. The maximum value of the parameter is unlimited for CLOB and BLOB. It is 2 * 1024 *1024 *1024 for other string types. (****) The parameter S is optional and indicates sub-second fraction precision of time (0 to 9). When not used, it defaults to 6 for TIMESTAMP and 0 for TIME.
HyperSQL supports all the types defined by SQL-92, plus BOOLEAN, BINARY, ARRAY and LOB types that were later added to the SQL Standard. It also supports the non-standard OTHER type to store serializable Java objects.
SQL is a strongly typed language. All data stored in specific columns of tables and other objects (such as sequence generators) have specific types. Each data item conforms to the type limits such as precision and scale for the column. It also conforms to any additional integrity constraints that are defined as CHECK constraints in domains or tables. Types can be explicitly converted using the CAST expression, but in most expressions, they are converted automatically.
Data is returned to the user (or the application program) as a result of executing SQL statements such as query expressions or function calls. All statements are compiled prior to execution and the return type of the data is known after compilation and before execution. Therefore, once a statement is prepared, the data type of each column of the returned result is known, including any precision or scale property. The type does not change when the same query that returned one row, returns many rows as a result of adding more data to the tables.
Some SQL functions used within SQL statements are polymorphic, but the exact type of the argument and the return value is determined at compile time.
When a statement is prepared, using a JDBC
PreparedStatement
object, it is compiled by the
engine and the type of the columns of its ResultSet
and / or its parameters are accessible through the methods of
PreparedStatement
.
TINYINT, SMALLINT, INTEGER, BIGINT, NUMERIC and DECIMAL (without a
decimal point) are the supported integral types. They correspond
respectively to byte
,
short
, int
,
long
, BigDecimal
and
BigDecimal
Java types in the range of values that
they can represent (NUMERIC and DECIMAL are equivalent). The type
TINYINT is an HSQLDB extension to the SQL Standard, while the others
conform to the Standard definition. The SQL type dictates the maximum
and minimum values that can be held in a field of each type. For example
the value range for TINYINT is -128 to +127. The bit precision of
TINYINT, SMALLINT, INTEGER and BIGINT is respectively 8, 16, 32 and 64.
For NUMERIC and DECIMAL, decimal precision is used.
DECIMAL and NUMERIC with decimal fractions are mapped to
java.math.BigDecimal
and can have very large
numbers of digits. In HyperSQL the two types are equivalent. These
types, together with integral types, are called exact numeric
types.
In HyperSQL, REAL, FLOAT and DOUBLE are equivalent: they are all
mapped to double
in Java. These types are defined
by the SQL Standard as approximate numeric types. The bit-precision of
all these types is 64 bits.
The decimal precision and scale of NUMERIC and DECIMAL types can be optionally defined. For example, DECIMAL(10,2) means maximum total number of digits is 10 and there are always 2 digits after the decimal point, while DECIMAL(10) means 10 digits without a decimal point. The bit-precision of FLOAT can be defined but it is ignored and the default bit-precision of 64 is used. The default precision of NUMERIC and DECIMAL (when not defined) is 128.
Note: If a database has been set to ignore type precision limits with the SET DATABASE SQL SIZE FALSE command, then a type definition of DECIMAL with no precision and scale is treated as DECIMAL(128,32). In normal operation, it is treated as DECIMAL(128).
Integral Types
In expressions, values of TINYINT, SMALLINT, INTEGER, BIGINT, NUMERIC and DECIMAL (without a decimal point) types can be freely combined and no data narrowing takes place. The resulting value is of a type that can support all possible values.
If the SELECT statement refers to a simple column or function, then the return type is the type corresponding to the column or the return type of the function. For example:
CREATE TABLE t(a INTEGER, b BIGINT); SELECT MAX(a), MAX(b) FROM t;
will return a ResultSet
where the type of
the first column is java.lang.Integer
and the
second column is java.lang.Long
. However,
SELECT MAX(a) + 1, MAX(b) + 1 FROM t;
will return java.lang.Long
and
BigDecimal
values, generated as a result of
uniform type promotion for all possible return values. Note that type
promotion to BigDecimal
ensures the correct value
is returned if MAX(b)
evaluates to
Long.MAX_VALUE
.
There is no built-in limit on the size of intermediate integral
values in expressions. As a result, you should check for the type of the
ResultSet
column and choose an appropriate
getXXXX()
method to retrieve it. Alternatively,
you can use the getObject()
method, then cast
the result to java.lang.Number
and use the
intValue()
or
longValue()
if the value is not an instance of
java.math.BigDecimal
.
When the result of an expression is stored in a column of a
database table, it has to fit in the target column, otherwise an error
is returned. For example, when 1234567890123456789012 /
12345687901234567890
is evaluated, the result can be stored in
any integral type column, even a TINYINT column, as it is a small
value.
In SQL Statements, an integer literal is treated as INTEGER, unless its value does not fit. In this case it is treated as BIGINT or DECIMAL, depending on the value.
Depending on the types of the operands, the result of the
operation is returned in a JDBC ResultSet
in any
of the related Java types: Integer
,
Long
or BigDecimal
. The
ResultSet.getXXXX()
methods can be used to
retrieve the values so long as the returned value can be represented by
the resulting type. This type is deterministically based on the query,
not on the actual rows returned.
Other Numeric Types
In SQL statements, number literals with a decimal point are
treated as DECIMAL unless they are written with an exponent. Thus
0.2
is considered a DECIMAL value but
0.2E0
is considered a DOUBLE value.
When an approximate numeric type, REAL, FLOAT or DOUBLE (all
synonymous) is part of an expression involving different numeric types,
the type of the result is DOUBLE. DECIMAL values can be converted to
DOUBLE unless they are beyond the Double.MIN_VALUE -
Double.MAX_VALUE
range. For example, A * B, A / B, A + B,
etc., will return a DOUBLE value if either A or B is a DOUBLE.
Otherwise, when no DOUBLE value exists, if a DECIMAL or NUMERIC value is part an expression, the type of the result is DECIMAL or NUMERIC. Similar to integral values, when the result of an expression is assigned to a table column, the value has to fit in the target column, otherwise an error is returned. This means a small, 4 digit value of DECIMAL type can be assigned to a column of SMALLINT or INTEGER, but a value with 15 digits cannot.
When a DECIMAL value is multiplied by a DECIMAL or integral type, the resulting scale is the sum of the scales of the two terms. When they are divided, the result is a value with a scale (number of digits to the right of the decimal point) equal to the larger of the scales of the two terms. The precision for both operations is calculated (usually increased) to allow all possible results.
The distinction between DOUBLE and DECIMAL is important when a
division takes place. For example, 10.0/8.0
(DECIMAL)
equals 1.2
but 10.0E0/8.0E0
(DOUBLE) equals 1.25
. Without division operations,
DECIMAL values represent exact arithmetic.
REAL, FLOAT and DOUBLE values are all stored in the database as
java.lang.Double
objects. Special values such as
NaN and +-Infinity are also stored and supported. These values can be
submitted to the database via JDBC
PreparedStatement
methods and are returned in
ResultSet
objects. In order to allow division by
zero of DOUBLE values in SQL statements (which returns NaN or
+-Infinity) you should set the property
hsqldb.double_nan as false (SET DATABASE SQL DOUBLE
NAN FALSE). The double values can be retrieved from a
ResultSet
in the required type so long as they
can be represented. For setting the values, when
PreparedStatement.setDouble()
or
setFloat()
is used, the value is treated as a
DOUBLE automatically.
In short,
<numeric type> ::= <exact numeric type> |
<approximate numeric type>
<exact numeric type> ::= NUMERIC [ <left
paren> <precision> [ <comma> <scale> ] <right
paren> ] | { DECIMAL | DEC } [ <left paren> <precision> [
<comma> <scale> ] <right paren> ] | TINYINT | SMALLINT
| INTEGER | INT | BIGINT
<approximate numeric type> ::= FLOAT [ <left
paren> <precision> <right paren> ] | REAL | DOUBLE
PRECISION
<precision> ::= <unsigned
integer>
<scale> ::= <unsigned
integer>
The BOOLEAN type conforms to the SQL Standard and represents the
values TRUE
, FALSE
and
UNKNOWN
. This type of column can be initialised with
Java boolean values, or with NULL
for the
UNKNOWN
value.
The three-value logic is sometimes misunderstood. For example, x IN (1, 2, NULL) does not return true if x is NULL.
In previous versions of HyperSQL, BIT was simply an alias for BOOLEAN. In version 2, BIT is a single-bit bit map.
<boolean type> ::= BOOLEAN
The SQL Standard does not support type conversion to BOOLEAN apart from character strings that consists of boolean literals. Because the BOOLEAN type is relatively new to the Standard, several database products used other types to represent boolean values. For improved compatibility, HyperSQL allows some type conversions to boolean.
Values of BIT and BIT VARYING types with length 1 can be converted to BOOLEAN. If the bit is set, the result of conversion is the TRUE value, otherwise it is FALSE.
Values of TINYINT, SMALLINT, INTEGER and BIGINT types can be converted to BOOLEAN. If the value is zero, the result is the FALSE value, otherwise it is TRUE.
The CHARACTER, CHARACTER VARYING and CLOB types are the SQL Standard character string types. CHAR, VARCHAR and CHARACTER LARGE OBJECT are synonyms for these types. HyperSQL also supports LONGVARCHAR as a synonym for VARCHAR. If LONGVARCHAR is used without a length, then a length of 16M is assigned. You can set LONGVARCHAR to map to CLOB, with the sql.longvar_is_lob connection property or the SET DATABASE SQL LONGVAR IS LOB TRUE statement..
HyperSQL's default character set is Unicode, therefore all possible character strings can be represented by these types.
The SQL Standard behaviour of the CHARACTER type is a remnant of legacy systems in which character strings are padded with spaces to fill a fixed width. These spaces are sometimes significant while in other cases they are silently discarded. It would be best to avoid the CHARACTER type altogether. With the rest of the types, the strings are not padded when assigned to columns or variables of the given type. The trailing spaces are still considered discardable for all character types. Therefore, if a string with trailing spaces is too long to assign to a column or variable of a given length, the spaces beyond the type length are discarded and the assignment succeeds (provided all the characters beyond the type length are spaces).
The VARCHAR and CLOB types have length limits, but the strings are not padded by the system. Note that if you use a large length for a VARCHAR or CLOB type, no extra space is used in the database. The space used for each stored item is proportional to its actual length.
If CHARACTER is used without specifying the length, the length
defaults to 1. For the CLOB type, the length limit can be defined in
units of kilobyte (K, 1024), megabyte (M, 1024 * 1024) or gigabyte (G,
1024 * 1024 * 1024), using the <multiplier>
. If
CLOB is used without specifying the length, the length defaults to
1GB.
<character string type> ::= { CHARACTER | CHAR }
[ <left paren> <character length> <right paren> ] | {
CHARACTER VARYING | CHAR VARYING | VARCHAR } <left paren>
<character length> <right paren> | LONGVARCHAR [ <left
paren> <character length> <right paren> ] | <character
large object type>
<character large object type> ::= { CHARACTER
LARGE OBJECT | CHAR LARGE OBJECT | CLOB } [ <left paren>
<character large object length> <right paren>
]
<character length> ::= <unsigned integer>
[ <char length units> ]
<large object length> ::= <length> [
<multiplier> ] | <large object length
token>
<character large object length> ::= <large
object length> [ <char length units> ]
<large object length token> ::= <digit>...
<multiplier>
<multiplier> ::= K | M | G
<char length units> ::= CHARACTERS |
OCTETS
Each character type has a collation. This is either a default collation or stated explicitly with the COLLATE clause. Collations are discussed in the Schemas and Database Objects chapter.
CHAR(10) CHARACTER(10) VARCHAR(2) CHAR VARYING(2) CLOB(1000) CLOB(30K) CHARACTER LARGE OBJECT(1M) LONGVARCHAR
The BINARY, BINARY VARYING and BLOB types are the SQL Standard binary string types. VARBINARY and BINARY LARGE OBJECT are synonyms for BINARY VARYING and BLOB types. HyperSQL also supports LONGVARBINARY as a synonym for VARBINARY. You can set LONGVARBINARY to map to BLOB, with the sql.longvar_is_lob connection property or the SET DATABASE SQL LONGVAR IS LOB TRUE statement.
Binary string types are used in a similar way to character string types. There are several built-in functions that are overloaded to support character, binary and bit strings.
The BINARY type represents a fixed width-string. Each shorter string is padded with zeros to fill the fixed width. Similar to the CHARACTER type, the trailing zeros in the BINARY string are simply discarded in some operations. For the same reason, it is best to avoid this particular type and use VARBINARY instead.
When two binary values are compared, if one is of BINARY type, then zero padding is performed to extend the length of the shorter string to the longer one before comparison. No padding is performed with other binary types. If the bytes compare equal to the end of the shorter value, then the longer string is considered larger than the shorter string.
If BINARY is used without specifying the length, the length
defaults to 1. For the BLOB type, the length limit can be defined in
units of kilobyte (K, 1024), megabyte (M, 1024 * 1024) or gigabyte (G,
1024 * 1024 * 1024), using the <multiplier>
. If
BLOB is used without specifying the length, the length defaults to
1GB.
The UUID type represents a UUID string. The type is similar to
BINARY(16) but with the extra enforcement that disallows assigning,
casting, or comparing with shorter or longer strings. Strings such as
'24ff1824-01e8-4dac-8eb3-3fee32ad2b9c' or
'24ff182401e84dac8eb33fee32ad2b9c' are allowed. When a value of the UUID
type is converted to a CHARACTER type, the hyphens are inserted in the
required positions. Java UUID objects can be used with
java.sql.PreparedStatement
to insert values of
this type. The getObject()
method of ResultSet
returns the Java object for UUID column data.
<binary string type> ::= BINARY [ <left
paren> <length> <right paren> ] | { BINARY VARYING |
VARBINARY } <left paren> <length> <right paren> |
LONGVARBINARY [ <left paren> <length> <right paren> ]
| UUID | <binary large object string type>
<binary large object string type> ::= { BINARY
LARGE OBJECT | BLOB } [ <left paren> <large object length>
<right paren> ]
<length> ::= <unsigned
integer>
BINARY(10) VARBINARY(2) BINARY VARYING(2) BLOB(1000) BLOB(30G) BINARY LARGE OBJECT(1M) LONGVARBINARY
The BIT and BIT VARYING types are the supported bit string types. These types were defined by SQL:1999 but were later removed from the Standard. Bit types represent bit maps of given lengths. Each bit is 0 or 1. The BIT type represents a fixed width-string. Each shorter string is padded with zeros to fill the fixed with. If BIT is used without specifying the length, the length defaults to 1. The BIT VARYING type has a maximum width and shorter strings are not padded.
Before the introduction of the BOOLEAN type to the SQL Standard, a single-bit string of the type BIT(1) was commonly used. For compatibility with other products that do not conform to, or extend, the SQL Standard, HyperSQL allows values of BIT and BIT VARYING types with length 1 to be converted to and from the BOOLEAN type. BOOLEAN TRUE is considered equal to B'1', BOOLEAN FALSE is considered equal to B'0'.
For the same reason, numeric values can be assigned to columns and variables of the type BIT(1). For assignment, the numeric value zero is converted to B'0', while all other values are converted to B'1'. For comparison, numeric values 1 is considered equal to B'1' and numeric value zero is considered equal to B'0'.
It is not allowed to perform other arithmetic or boolean operations involving BIT(1) and BIT VARYING(1). The kid of operations allowed on bit strings are analogous to those allowed on BINARY and CHARACTER strings. Several built-in functions support all three types of string.
<bit string type> ::= BIT [ <left paren>
<length> <right paren> ] | BIT VARYING <left paren>
<length> <right paren>
BIT BIT(10) BIT VARYING(2)
BLOB and CLOB are lob types. These types are used for very long strings that do not necessarily fit in memory. Small lobs that fit in memory can be accessed just like BINARY or VARCHAR column data. But lobs are usually much larger and therefore accessed with special JDBC methods.
To insert a lob into a table, or to update a column of lob type
with a new lob, you can use the
setBinaryStream()
and
setCharacterStream()
methods of JDBC
java.sql.PreparedStatement
. These are very
efficient methods for long lobs. Other methods are also supported. If
the data for the BLOB or CLOB is already a memory object, you can use
the setBytes()
or
setString()
methods, which are efficient for
memory data. Another method is to obtain a lob with the
getBlob()
and
getClob()
methods of
java.sql.Connection
, populate its data, then use
the setBlob()
or
setClob()
methods of
PreparedStatement
. Yet another method allows to
create instances of org.hsqldb.jdbc.JDBCBlobFile
and org.hsqldb.jdbc.JDBCClobFile
and construct a
large lob for use with setBlob()
and
setClob()
methods.
A lob is retrieved from a ResultSet
with
the getBlob()
or
getClob()
method. The steaming methods of the
lob objects are then used to access the data. HyperSQL also allows
efficient access to chunks of lobs with
getBytes()
or
getString()
methods. Furthermore, parts of a
BLOB or CLOB already stored in a table can be modified. An updatable
ResultSet
is used to select the row from the
table. The getBlob()
or
getClob()
methods of
ResultSet
are used to access the lob as a
java.sql.Blob
or
java.sql.Clob
object. The
setBytes()
and
setString()
methods of these objects can be
used to modify the lob. Finally the updateRow()
method of the ResultSet
is used to update the lob
in the row. Note these modifications are not allowed with compressed or
encrypted lobs.
Lobs are logically stored in columns of tables. Their physical
storage is a separate *.lobs
file. This file is
created as soon as a BLOB or CLOB is inserted into the database. The
file will grow as new lobs are inserted into the database. In version 2,
the *.lobs
file is never deleted even if all lobs
are deleted from the database. In this case you can delete the
*.lobs
file after a SHUTDOWN. When a CHECKPOINT
happens, the space used for deleted lobs is freed and is reused for
future lobs. By default, clobs are stored without compression. You can
use a database setting to enable compression of clobs. This can
significantly reduce the storage size of clobs.
From version 2.3.4 there are two options for storing Java Objects.
The default option allows storing Serializable object. The objects remain serialized inside the database until they are retrieved. The application program that retrieves the object must include in its classpath the Java Class for the object, otherwise it cannot retrieve the object.
Any serializable Java Object can be inserted directly into a
column of type OTHER using any variation of
PreparedStatement.setObject()
methods.
The alternative Live Object option is for
mem: databases only and is enabled when the
database property sql.live_object=true is appended
to the connection property that creates the mem database. For example
'jdbc:hsqldb:mem:mydb;sql.live_object=true'
. With
this option, any Java object can be stored as it is not serialized. The
SQL statement SET DATABASE SQL LIVE OBJECT TRUE
can
be also used. Note the SQL statement must be executed on the first
connection to the database before any data is inserted. No data access
should be made from this connection. Instead, new connections should be
used for data access.
For comparison purposes and in indexes, any two Java Objects are considered equal unless one of them is NULL. You cannot search for a specific object or perform a join on a column of type OTHER.
Java Objects can simply be stored internally and no operations can
be performed on them other than assignment between columns of type OTHER
or checking for NULL. Tests such as WHERE object1 = object2
do not mean what you might expect, as any non-null object
would satisfy such a tests. But WHERE object1 IS NOT
NULL
is perfectly acceptable.
The engine does not allow normal column values to be assigned to
Java Object columns (for example, assigning an INTEGER or STRING to such
a column with an SQL statement such as UPDATE mytable SET
objectcol = intcol WHERE ...
).
<java object type> ::= OTHER
The default method of storage is used when the objects and their state needs to be saved and retrieved in the future. This method is also used when memory resources are limited and collections of objects are stored and retrieved only when needed.
The Live Object option uses the database table as a collection of objects. This allows storing some attributes of the objects in the same table alongside the object itself and fast search and retrieval of objects on their attributes. For example, when many thousands of live objects contain details of films, the film title and the director can be stored in the table and searches can be performed for films on these attributes:
CREATE TABLE movies (director VARCHAR(30), title VARCHAR(40), obj OTHER) SELECT obj FROM movies WHERE director LIKE 'Luc%'
In any case, at least one attribute of the object should be stored to allow efficient retrieval of the objects from both Live Object and Serialized storage. An ID number is often used as the stored column attribute.
In HyperSQL, column length, precision and scale qualifiers are
required and are always enforced. The VARCHAR and VARBINARY types
require a size parameter and do not have a default. For compatibility
with CREATE TABLE statements from other databases that do not have size
parameters for VARCHAR column, the URL property
hsqldb.enforce_size=false or the SQL statement
SET DATABASE SQL SIZE FALSE
can be used to allow the
table creation and automatically apply a large value for the maximum
size of the VARCHAR column. You should test your application to ensure
the length, precision and scale that is used for column definitions is
appropriate for the application data.
All other types have defaults for size or precision parameters. However, the defaults may not be what your application requires and you may have to specify the parameters.
String types, including all BIT, BINARY and CHAR string types plus CLOB and BLOB, are generally defined with a length. If no length is specified for BIT, BINARY and CHAR, the default length is 1. For CLOB and BLOB an implementation defined length of 1G is used.
TIME and TIMESTAMP types can be defined with a fractional second precision between 0 and 9. INTERVAL type definition may have precision and, in some cases, fraction second precision. DECIMAL and NUMERIC types may be defined with precision and scale. For all of these types a default precision or scale value is used if one is not specified. The default scale is 0. The default fractional precision for TIME is 0, while it is 6 for TIMESTAMP.
Values can be converted from one type to another in two different ways: by using explicit CAST expression or by implicit conversion used in assignment, comparison, and aggregation.
String values cannot be assigned to VARCHAR columns if they are longer than the defined type length. For CHARACTER columns, a long string can be assigned (with truncation) only if all the characters after the length are spaces. Shorter strings are padded with the space character when inserted into a CHARACTER column. Similar rules are applied to VARBINARY and BINARY columns. For BINARY columns, the padding and truncation rules are applied with zero bytes, instead of spaces.
Explicit CAST of a value to a CHARACTER or VARCHAR type will
result in forced truncation or padding. So a test such as CAST
(mycol AS VARCHAR(2)) = 'xy'
will find the values beginning
with 'xy'. This is the equivalent of SUBSTRING(mycol FROM 1 FOR
2)= 'xy'
.
For all numeric types, the rules of explicit cast and implicit conversion are the same. If cast or conversion causes any digits to be lost from the fractional part, it can take place. If the non-fractional part of the value cannot be represented in the new type, cast or conversion cannot take place and will result in a data exception.
There are special rules for DATE, TIME, TIMESTAMP and INTERVAL casts and conversions.
HSQLDB fully supports datetime and interval types and operations, including all relevant optional features, as specified by the SQL Standard since SQL-92. The two groups of types are complementary.
The DATE type represents a calendar date with YEAR, MONTH and DAY fields.
The TIME type represents time of day with HOUR, MINUTE and SECOND fields, plus an optional SECOND FRACTION field.
The TIMESTAMP type represents the combination of DATE and TIME types.
TIME and TIMESTAMP types can include WITH TIME ZONE or WITHOUT TIME ZONE (the default) qualifiers. They can have fractional second parts. For example, TIME(6) has six fractional digits for the second field.
If fractional second precision is not specified, it defaults to 0 for TIME and to 6 for TIMESTAMP.
<datetime type> ::= DATE | TIME [ <left
paren> <time precision> <right paren> ] [ <with or
without time zone> ] | TIMESTAMP [ <left paren> <timestamp
precision> <right paren> ] [ <with or without time zone>
]
<with or without time zone> ::= WITH TIME ZONE |
WITHOUT TIME ZONE
<time precision> ::= <time fractional seconds
precision>
<timestamp precision> ::= <time fractional
seconds precision>
<time fractional seconds precision> ::=
<unsigned integer>
DATE TIME(6) TIMESTAMP(2) WITH TIME ZONE
TIME or TIMESTAMP literals containing a zone displacement value are WITH TIME ZONE. Examples of the string literals used to represent date time values, some with time zone, some without, are below:
DATE '2008-08-22' TIMESTAMP '2008-08-08 20:08:08' TIMESTAMP '2008-08-08 20:08:08+8:00' /* Beijing */ TIME '20:08:08.034900' TIME '20:08:08.034900-8:00' /* US Pacific */
Time Zone
DATE values do not take time zones. For example, United Nations designates 5 June as World Environment Day, which was observed on DATE '2008-06-05' in different time zones.
TIME and TIMESTAMP values without time zone, usually have a context that indicates some local time zone. For example, a database for college course timetables usually stores class dates and times without time zones. This works because the location of the college is fixed and the time zone displacement is the same for all the values. Even when the events take place in different time zones, for example international flight times, it is possible to store all the datetime information as references to a single time zone, usually GMT. For some databases it may be useful to store the time zone displacement together with each datetime value. SQL’s TIME WITH TIME ZONE and TIMESTAMP WITH TIME ZONE values include a time zone displacement value.
The time zone displacement is of the type INTERVAL HOUR TO MINUTE. This data type is described in the next section. The legal values are between '–18:00' and '+18:00'.
Operations on Datetime Types
The expression <datetime expression> AT TIME ZONE {
<interval primary> | <time zone name> }
evaluates to
a datetime value representing exactly the same point of time in the
specified <time displacement>
or the geographical
<time zone name>
. The expression, AT
LOCAL
is equivalent to AT TIME ZONE <local time
displacement>
. If AT TIME ZONE
is used
with a datetime operand of type WITHOUT TIME ZONE, the operand is first
converted to a value of type WITH TIME ZONE using the session's calendar,
then the specified time zone displacement is set for the value. Therefore,
in these cases, the final value depends on the time zone of the session in
which the statement was used, calculated at the exact point of time (of
the input) and accounting for daylight saving time at that point of
time.
From version 2.7.0 it is possible to use regional time zones with AT TIME ZONE. Any zone name used must match exactly a TimeZone id supported by the JVM. These include names such as 'America/New_York'. Some zones include daylight saving time periods which are used when converting the time zone.
See also the FROM_TZ function which allows you to convert to a time zone without changing the date-time values such as hour and minute.
AT TIME ZONE, modifies the field values of the datetime operand. This is done by the following procedure:
determine the corresponding datetime at UTC using the session's calendar.
find the datetime value at the given time zone that corresponds with the UTC value from step 1.
Example a:
VALUES TIMESTAMP'2022-03-28 11:00:00' AT TIME ZONE INTERVAL '-5:00' HOUR TO MINUTE C1 ------------------------ 2022-03-28 14:00:00-5:00 VALUES TIMESTAMP'2022-03-28 11:00:00+4:00' AT TIME ZONE 'America/Chicago' C1 ------------------------ 2022-03-28 02:00:00-5:00
In the first example above, the session's time zone displacement is '-8:00'. In step 1, time '11:00:00' is converted to UTC, which is time '19:00:00+0:00'. In step 2, this value is expressed as time '14:00:00-5:00' in the target zone.
In the second example, the session's time zone displacement is not considered because the timestamp has a time zone. In step 1, time is converted to UTC, which is time '07:00:00+0:00', In step 2, this value is expressed as time '02:00:00-5:00' in the target zone.
Example b:
TIME '12:00:00-5:00' AT TIME ZONE INTERVAL '1:00' HOUR TO MINUTE
Because the operand has a time zone, the result is independent of the session time zone displacement. Step 1 results in TIME '17:00:00+0:00', and step 2 results in TIME '18:00:00+1:00'
Note that the operand is not limited to datetime literals used in these examples. Any valid expression that evaluates to a datetime value can be the operand.
Type Conversion
CAST is used for all other conversions. Examples:
CAST (<value> AS TIME WITHOUT TIME ZONE) CAST (<value> AS TIME WITH TIME ZONE)
In the first example, if <value>
has a time
zone component, it is simply dropped. For example, TIME '12:00:00-5:00' is
converted to TIME '12:00:00'
In the second example, if <value>
has no
time zone component, the current time zone displacement of the session is
added. For example, TIME '12:00:00' is converted to TIME '12:00:00-8:00'
when the session time zone displacement is '-8:00'.
Conversion between DATE and TIMESTAMP is performed by removing the TIME component of a TIMESTAMP value or by setting the hour, minute and second fields to zero. TIMESTAMP '2008-08-08 20:08:08+8:00' becomes DATE '2008-08-08', while DATE '2008-08-22' becomes TIMESTAMP '2008-08-22 00:00:00'.
Conversion between TIME and TIMESTAMP is performed by removing the DATE field values of a TIMESTAMP value or by appending the fields of the TIME value to the fields of the current session date value.
Assignment
When a value is assigned to a datetime target, e.g., a value is used to update a row of a table, the type of the value must be the same as the target, but the WITH TIME ZONE or WITHOUT TIME ZONE characteristics can be different. If the types are not the same, an explicit CAST must be used to convert the value into the target type.
Comparison
When values WITH TIME ZONE are compared, they are converted to UTC values before comparison. If a value WITH TIME ZONE is compared to another WITHOUT TIME ZONE, then the WITH TIME ZONE value is converted to AT LOCAL, then converted to WITHOUT TIME ZONE before comparison.
It is not recommended to design applications that rely on comparisons and conversions between TIME values WITH TIME ZONE. The conversions may involve normalisation of the time value, resulting in unexpected results. For example, the expression: BETWEEN(TIME '12:00:00-8:00', TIME '22:00:00-8:00') is converted to BETWEEN(TIME '20:00:00+0:00', TIME '06:00:00+0:00') when it is evaluated in the UTC zone, which is always FALSE.
Functions
Several functions return the current session timestamp in different datetime types:
CURRENT_DATE |
DATE |
CURRENT_TIME |
TIME WITH TIME ZONE |
CURRENT_TIMESTAMP |
TIMESTAMP WITH TIME ZONE |
LOCALTIME |
TIME WITHOUT TIME ZONE |
LOCALTIMESTAMP |
TIMESTAMP WITHOUT TIME ZONE |
HyperSQL supports a very extensive range of functions for conversion, extraction and manipulation of DATE and TIMESTAMP values. See the Built In Functions chapter.
Session Time Zone Displacement
When an SQL session is started (with a JDBC connection) the local time zone of the client JVM (including any seasonal time adjustments such as daylight-saving time) is used as the session time zone displacement. In version 2.7.0 a Java Calendar object with the local time zone of the client JVM is created and used. Therefore when a seasonal time adjustment for daylight saving time is made while the session is open, the SQL session zone displacement is changed. In some older versions of HyperSQL, the SQL session time displacement was not changed when a seasonal time adjustment took place.
To change the SQL session time zone displacement, use the following commands:
SET TIME ZONE <time
displacement>
SET TIME ZONE LOCAL
The first command sets the displacement to the given value. The second command restores the original, real time zone displacement of the session.
Datetime Values and Java
When datetime values are sent to the database using the
PreparedStatement
or
CallableStatement
interfaces, the Java object is
converted to the type of the prepared or callable statement parameter.
This type may be DATE, TIME, or TIMESTAMP (with or without time zone). The
time zone displacement is the time zone of the JDBC session.
When datetime values are retrieved from the database using the
ResultSet
interface, there are two representations. The
getString(…)
methods of the
ResultSet
interface, return an exact representation
of the value in the SQL type as it is stored in the database. This
includes the correct number of digits for the fractional second field, and
for values with time zone displacement, the time zone displacement.
Therefore, if TIME '12:00:00' is stored in the database, all users in
different time zones will get '12:00:00' when they retrieve the value as a
string. The getTime(…)
and
getTimestamp(…)
methods of the
ResultSet
interface return Java objects that are
corrected for the session time zone. The UTC millisecond value contained
the java.sql.Time
or
java.sql.Timestamp
objects will be adjusted to the
time zone of the session, therefore the
toString()
method of these objects return the
same values in different time zones.
If you want to store and retrieve UTC values that are independent of
any session's time zone, you can use a TIMESTAMP WITH TIME ZONE column.
The setTime(...)
and
setTimestamp(...)
methods of the
PreparedStatement interface which have a Calendar parameter can be used to
assign the values. The time zone of the given Calendar argument is used as
the time zone. Conversely, the getTime(...)
and
getTimestamp(...)
methods of the ResultSet
interface which have a Calendar parameter can be used with a Calendar
argument to retrieve the values.
Java 8 Extensions
JDBC 4 and JAVA6 did not include type codes for SQL datetime types that have a TIME ZONE property. Therefore, HyperSQL reported these types by default as datetime types without TIME ZONE.
JAVA 8 introduced new type codes for TIMESTAMP WITH TIME ZONE and
TIME WITH TIME ZONE. HyperSQL supports this in
ResultSet
, PreparedStatement
and CallableStatement
.
The getObject(int columnIndex)
method
on a column of TIMESTAMP WITH TIME ZONE returns an
java.time.OffsetDateTime
object.
The getObject(int columnIndex)
method
on a column of TIME WITH TIME ZONE returns an
java.time.OffsetTime
object.
The getObject(int columnIndex, Class
type)
method on any date, time and timestamp supports the
java.time
package types:
LocalDate
, LocalTime
,
LocalDateTime
,
OffsetTime
,
OffsetDateTime
, and
Instance
as well as
java.sql
package types,
Date
, Time
and
Timestamp
.
The setObject
methods also support Java
objects of the types listed above.
The getObject
and
setObject
methods with column name parameters
behave just like their counterparts with columnIndexe
parameters.
Non-Standard Extensions
HyperSQL version 2.7 supports some extensions to the SQL standard treatment of datetime and interval types. For example, the Standard expression to add a number of days to a date has an explicit INTERVAL value but HSQLDB also allows an integer to be used without specifying DAY. Examples of some Standard expressions and their non-standard alternatives are given below:
-- standard forms CURRENT_DATE + '2' DAY SELECT (LOCALTIMESTAMP - atimestampcolumn) DAY TO SECOND FROM atable -- non-standard forms CURRENT_DATE + 2 SELECT LOCALTIMESTAMP - atimestampcolumn FROM atable
It is recommended to use the SQL Standard syntax as it is more precise and avoids ambiguity.
Interval types are used to represent differences between date time values. The difference between two date time values can be measured in seconds or in months. For measurements in months, the units YEAR and MONTH are available, while for measurements in seconds, the units DAY, HOUR, MINUTE, SECOND are available. The units can be used individually, or as a range. An interval type can specify the precision of the most significant field and the second fraction digits of the SECOND field (if it has a SECOND field). The default precision is 2, following the Standard. The default second precision is 0. The default precision is too small for many applications and should be overridden.
<interval type> ::= INTERVAL <interval
qualifier>
<interval qualifier> ::= <start field> TO
<end field> | <single datetime field>
<start field> ::= <non-second primary datetime
field> [ <left paren> <interval leading field precision>
<right paren> ]
<end field> ::= <non-second primary datetime
field> | SECOND [ <left paren> <interval fractional seconds
precision> <right paren> ]
<single datetime field> ::= <non-second primary
datetime field> [ <left paren> <interval leading field
precision> <right paren> ] | SECOND [ <left paren>
<interval leading field precision> [ <comma> <interval
fractional seconds precision> ] <right paren>
]
<primary datetime field> ::= <non-second
primary datetime field> | SECOND
<non-second primary datetime field> ::= YEAR |
MONTH | DAY | HOUR | MINUTE
<interval fractional seconds precision> ::=
<unsigned integer>
<interval leading field precision> ::=
<unsigned integer>
Examples of INTERVAL type definition:
INTERVAL YEAR TO MONTH INTERVAL YEAR(3) INTERVAL DAY(4) TO HOUR INTERVAL MINUTE(4) TO SECOND(6) INTERVAL SECOND(4,6)
The word INTERVAL indicates the general type name. The rest of the
definition is called an <interval qualifier>
.
This designation is important, as in most expressions
<interval qualifier>
is used without the word
INTERVAL.
Interval Values
An interval value can be negative, positive or zero. An interval type has all the datetime fields in the specified range. These fields are similar to those in the TIMESTAMP type. The differences are as follows:
The first field of an interval value can hold any numeric value up to the specified precision. For example, the hour field in HOUR(2) TO SECOND can hold values above 23 (up to 99). The year and month fields can hold zero (unlike a TIMESTAMP value) and the maximum value of a month field that is not the most significant field, is 11.
The standard function ABS(<interval value
expression>)
can be used to convert a negative interval value
to a positive one.
The literal representation of interval values consists of the type definition, with a string representing the interval value inserted after the word INTERVAL. Some examples of interval literal below:
INTERVAL '145 23:12:19.345' DAY(3) TO SECOND(3) INTERVAL '3503:12:19.345' HOUR TO SECOND(3) /* equal to the first value */ INTERVAL '19.345' SECOND(4,3) /* maximum number of digits for the second value is 4, and each value is expressed with three fraction digits. */ INTERVAL '-23-10' YEAR(2) TO MONTH
Interval values of the types that are based on seconds can be cast into one another. Similarly, those that are based on months can be cast into one another. It is not possible to cast or convert a value based on seconds to one based on months, or vice versa.
When a cast is performed to a type with a smaller least-significant field, nothing is lost from the interval value. Otherwise, the values for the missing least-significant fields are discarded. Examples:
CAST ( INTERVAL '145 23:12:19' DAY TO SECOND AS INTERVAL DAY TO HOUR ) = INTERVAL '145 23' DAY TO HOUR CAST(INTERVAL '145 23' DAY TO HOUR AS INTERVAL DAY TO SECOND) = INTERVAL '145 23:00:00' DAY TO SECOND
A numeric value can be cast to an interval type. In this case the numeric value is first converted to a single-field INTERVAL type with the same field as the least significant field of the target interval type. This value is then converted to the target interval type For example CAST( 22 AS INTERVAL YEAR TO MONTH) evaluates to INTERVAL '22' MONTH and then INTERVAL '1 10' YEAR TO MONTH. Note that SQL Standard only supports casts to single-field INTERVAL types, while HyperSQL allows casting to multi-field types as well.
An interval value can be cast to a numeric type. In this case the interval value is first converted to a single-field INTERVAL type with the same field as the least significant filed of the interval value. The value is then converted to the target type. For example, CAST (INTERVAL '1-11' YEAR TO MONTH AS INT) evaluates to INTERVAL '23' MONTH, and then 23.
An interval value can be cast into a character type, which results in an INTERVAL literal. A character value can be cast into an INTERVAL type so long as it is a string with a format compatible with an INTERVAL literal.
Two interval values can be added or subtracted so long as the types of both are based on the same field, i.e., both are based on MONTH or SECOND. The values are both converted to a single-field interval type with same field as the least-significant field between the two types. After addition or subtraction, the result is converted to an interval type that contains all the fields of the two original types.
An interval value can be multiplied or divided by a numeric value. Again, the value is converted to a numeric, which is then multiplied or divided, before converting back to the original interval type.
An interval value is negated by simply prefixing with the minus sign.
Interval values used in expressions are either typed values,
including interval literals, or are interval casts. The expression:
<expression> <interval qualifier>
is a cast
of the result of the <expression>
into the
INTERVAL type specified by the <interval qualifier>. The
cast can be formed by adding the keywords and parentheses as follows: CAST
( <expression> AS INTERVAL <interval qualifier>
).
The examples below feature different forms of expression
that represent an interval value, which is then added to the given date
literal.
DATE '2000-01-01' + INTERVAL '1-10' YEAR TO MONTH /* interval literal */ DATE '2000-01-01' + '1-10' YEAR TO MONTH /* the string '1-10' is cast into INTERVAL YEAR TO MONTH */ DATE '2000-01-01' + 22 MONTH /* the integer 22 is cast into INTERVAL MONTH, same value as above */ DATE '2000-01-01' - 22 DAY /* the integer 22 is cast into INTERVAL DAY */ DATE '2000-01-01' + COL2 /* the type of COL2 must be an INTERVAL type */ DATE '2000-01-01' + COL2 MONTH /* COL2 may be a number, it is cast into a MONTH interval */
Datetime and Interval Operations
An interval can be added to or subtracted from a datetime value so long as they have some fields in common. For example, an INTERVAL MONTH cannot be added to a TIME value, while an INTERVAL HOUR TO SECOND can. The interval is first converted to a numeric value, then the value is added to, or subtracted from, the corresponding field of the datetime value.
If the result of addition or subtraction is beyond the permissible range for the field, the field value is normalised and carried over to the next significant field until all the fields are normalised. For example, adding 20 minutes to TIME '23:50:10' will result successively in '23:70:10', '24:10:10' and finally TIME '00:10:10'. Subtracting 20 minutes from the result is performed as follows: '00:-10:10', '-1:50:10', finally TIME '23:50:10'. Note that if DATE or TIMESTAMP normalisation results in the YEAR field value out of the range (1,10000), then an exception condition is raised.
If an interval value based on MONTH is added to, or subtracted from a DATE or TIMESTAMP value, the result may have an invalid day (30 or 31) for the given result month. In this case an exception condition is raised.
The result of subtraction of two datetime expressions is an interval
value. The two datetime expressions must be of the same type. The type of
the interval value must be specified in the expression, using only the
interval field names. The two datetime expressions are enclosed in
parentheses, followed by the <interval qualifier>
fields. In the first example below, COL1 and COL2 are of the same datetime
type, and the result is evaluated in INTERVAL YEAR TO MONTH type.
(COL1 – COL2) YEAR TO MONTH /* the difference between two DATE or two TIMESTAMP values in years and months */ (CURRENT_DATE – COL3) DAY /* the number of days between the value of COL3 and the current date */ (CURRENT_DATE - DATE '2000-01-01') YEAR TO MONTH /* the number of years and months since the beginning of this century */ CURRENT_DATE - 2 DAY /* the date of the day before yesterday */ (CURRENT_TIMESTAMP - TIMESTAMP '2009-01-01 00:00:00') DAY(4) TO SECOND(2) /* days to seconds since the given date */
The individual fields of both datetime and interval values can be extracted using the EXTRACT function. The same function can also be used to extract the time zone displacement fields of a datetime value.
EXTRACT ({YEAR | MONTH | DAY | HOUR | MINUTE | SECOND |
TIMEZONE_HOUR | TIMEZONE_MINUTE | DAY_OF_WEEK | WEEK_OF_YEAR } FROM
{<datetime value> | <interval value>})
The dichotomy between interval types based on seconds, and those based on months, stems from the fact that the different calendar months have different numbers of days. For example, the expression, “nine months and nine days since an event” is not exact when the date of the event is unknown. It can represent a period of around 284 days give or take one. SQL interval values are independent of any start or end dates or times. However, when they are added to or subtracted from certain date or timestamp values, the result may be invalid and cause an exception (e.g. adding one month to January 30 results in February 30, which is invalid).
JDBC has an unfortunate limitation and does not include type codes for SQL INTERVAL types. Therefore, for compatibility with database tools that are limited to the JDBC type codes, HyperSQL reports these types by default as VARCHAR. You can use the URL property hsqldb.translate_dti_types=false to override the default behaviour.
Java 8 Extensions
JAVA 8 does not have SQL type codes for INTERVAL types. HyperSQL
supports java.time
types for INTERVAL types in
ResultSet
, PreparedStatement
and CallableStatement
.
The getObject(int columnIndex, Class
type)
method on an INTERVAL supports the
java.time.Period
type for YEAR and MONTH
interval and the java.time.Duration
type for
other interval types that cover DAY to SECOND.
The setObject(int columnIndex)
method
accepts java.time.Period
and
java.time.Duration
objects for columns of
relevant INTERVAL types.
The getObject
and
setObject
methods with column name parameters
behave just like their counterparts with columnIndexe
parameters.
Array are a powerful feature of SQL:2023 and can help solve many common problems. Arrays should not be used as a substitute for tables.
HyperSQL supports arrays of values according to the Standard.
Elements of the array are either NULL, or of the same data type. It is possible to define arrays of all supported types, including the types covered in this chapter and user-defined types, except LOB types. An SQL array is one dimensional and is addressed from position 1. An empty array can also be used, which has no element.
Arrays can be stored in the database, as well as being used as temporary containers of values for simplifying SQL statements. They facilitate data exchange between the SQL engine and the user's application.
The full range of supported syntax allows array to be created, used in SELECT or other statements, combined with rows of tables, and used in routine calls.
The type of a table column, a routine parameter, a variable, or the return value of a function can be defined as an array.
<array type> ::= <data type> ARRAY [ <left
bracket or trigraph> <maximum cardinality> <right bracket or
trigraph> ]
The word ARRAY is added to any valid type definition except BLOB
and CLOB type definitions. If the optional <maximum
cardinality>
is not used, the default value is 1024. The
size of the array cannot be extended beyond maximum cardinality.
In the example below, the table contains a column of INTEGER
arrays and a column of VARCHAR arrays. The VARCHAR array has an explicit
maximum size of 10, which means each array can have between 0 and 10
elements. The INTEGER array has the default maximum size of 1024. The
scores column has a default clause with an empty array. The default
clause can be defined only as DEFAULT NULL
or
DEFAULT ARRAY[]
and does not allow arrays containing
elements.
CREATE TABLE t (id INT PRIMARY KEY, scores INT ARRAY DEFAULT ARRAY[], names VARCHAR(20) ARRAY[10])
An array can be constructed from value expressions or a query expression.
<array value constructor by enumeration> ::= ARRAY
<left bracket or trigraph> <array element list> <right
bracket or trigraph>
<array element list> ::= <value expression> [
{ <comma> <value expression> }... ]
<array value constructor by query> ::= ARRAY
<left paren> <query expression> [ <order by clause> ]
<right paren>
In the examples below, arrays are constructed from values, column references or variables, function calls, or query expressions.
ARRAY [ 1, 2, 3 ] ARRAY [ 'HOT', 'COLD' ] ARRAY [ var1, var2, CURRENT_DATE ] ARRAY (SELECT lastname FROM namestable ORDER BY id)
Inserting and updating a table with an ARRAY column can use array constructors, not only for updated column values, but also in equality search conditions:
INSERT INTO t VALUES 10, ARRAY[1,2,3], ARRAY['HOT', 'COLD'] UPDATE t SET names = ARRAY['LARGE', 'SMALL'] WHERE id = 12 UPDATE t SET names = ARRAY['LARGE', 'SMALL'] WHERE id < 12 AND scores = ARRAY[3,4]
When using a PreparedStatement with an ARRAY parameter, a Java
array or a java.sql.Array
object may be used to
set the parameter.
In the example below prepared statements for INSERT and UPDATE with array parameters are used.
// create String create = "CREATE TABLE t (id INT PRIMARY KEY, scores INT ARRAY DEFAULT ARRAY[], names VARCHAR(20) ARRAY[10])"; Statement st = connection.createStatement(); st.execute(create); // insert String insert = "INSERT INTO t VALUES ?, ?, ?"; PreparedStatement ps = connection.prepareStatement(insert); Object[] numbers = new Object[]{17, 19}; Object[] data = new Object[]{"one", "two"}; ps.setInt(1, 10); ps.setObject(2, numbers); ps.setObject(3, data); ps.execute(); // update String update_a = "UPDATE t SET names = ? WHERE id = ?"; ps = connection.prepareStatement(update_a); data = new Object[]{"I", "II", "III", "IV"}; ps.setObject(1, data); ps.setInt(2, 10); ps.executeUpdate();
In the example below, a java.sql.Array
is
used for the update, using the same data as above:
data = new Object[]{"I", "II", "III", "IV"}; Array array = connection.createArrayOf("VARCHAR", data); ps.setArray(1, array); ps.setInt(2, 10); ps.executeUpdate();
A trigraph is a substitute for <left bracket> and <right bracket>.
<left bracket trigraph> ::= ??(
<right bracket trigraph> ::= ??)
The example below shows the use of trigraphs instead of brackets.
INSERT INTO t VALUES 10, ARRAY??(1,2,3??), ARRAY['HOT', 'COLD'] UPDATE t SET names = ARRAY ??('LARGE', 'SMALL'??) WHERE id = 12 UPDATE t SET names = ARRAY['LARGE', 'SMALL'] WHERE id < 12 AND scores = ARRAY[3,4]
The most common operations on an array are element reference and assignment, which are used when reading or writing an element of the array. Unlike Java and many other languages, arrays are extended if an element is assigned to an index beyond the current length. This can result in gaps containing NULL elements. Array length cannot exceed the maximum cardinality.
Elements of all arrays, including those that are the result of function calls or other operations can be referenced for reading.
<array element reference> ::= <array value
expression> <left bracket> <numeric value expression>
<right bracket>
Elements of arrays that are table columns or routine variables can be referenced for writing. This is done in a SET statement, either inside an UPDATE statement, or as a separate statement in the case of routine variables, OUT and INOUT parameters.
<target array element specification> ::= <target
array reference> <left bracket or trigraph> <simple value
specification> <right bracket or trigraph>
<target array reference> ::= <SQL parameter
reference> | <column reference>
Note that only simple values or variables are allowed for the array index when an assignment is performed. The examples below demonstrate how elements of the array are referenced in SELECT and UPDATE statements.
SELECT scores[ranking], names[ranking] FROM t JOIN t1 on (t.id = t1.tid) UPDATE t SET scores[2] = 123, names[2] = 'Reds' WHERE id = 10
Several SQL operations and functions can be used with arrays.
CONCATENATION
Array concatenation is performed similar to string concatenation. All elements of the array on the right are appended to the array on left.
<array concatenation> ::= <array value
expression 1> <concatenation operator> <array value
expression 2>
<concatenation operator> ::= ||
FUNCTIONS
Functions listed below operate on arrays. Details are described in the Built In Functions chapter.
ARRAY_AGG
is an aggregate function and produces
an array containing values from different rows of a SELECT statement.
Details are described in the Data Access and Change chapter.
SEQUENCE_ARRAY
creates an array with sequential
elements.
CARDINALITY <left paren> <array value
expression> <right paren>
MAX_CARDINALITY <left paren> <array value
expression> <right paren>
Array cardinality and max cardinality are functions that return an integer. CARDINALITY returns the element count, while MAX_CARDINALITY returns the maximum declared cardinality of an array.
POSITION_ARRAY <left paren> <value
expression> IN <array value expression> [FROM <numeric value
expression>] <right paren>
The POSITION_ARRAY function returns the position of the first match for the <value expression> from the start or from the given start position when <numeric value expression> is used.
TRIM_ARRAY <left paren> <array value
expression> <comma> <numeric value expression> <right
paren>
The TRIM_ARRAY function returns a copy of an array with the
specified number of elements removed from the end of the array. The
<array value expression>
can be any expression
that evaluates to an array.
SORT_ARRAY <left paren> <array value
expression> [ { ASC | DESC } ] [ NULLS { FIRST | LAST } ] <right
paren>
The SORT_ARRAY function returns a sorted copy of an array. NULL elements appear at the beginning of the new array. You can change the sort direction or the position of NULL elements with the option keywords.
CAST
An array can be cast into an array of a different type. Each element of the array is cast into the element type of the target array type. For example:
SELECT CAST(scores[ranking] AS VARCHAR(6) ARRAY), names[ranking] FROM t JOIN t1 on (t.id = t1.tid)
UNNEST
Arrays can be converted into table references with the UNNEST keyword.
UNNEST(<array value expression>) [ WITH ORDINALITY
]
The <array value expression>
can be any
expression that evaluates to an array. A table is returned that contains
one column when WITH ORDINALITY is not used, or two columns when WITH
ORDINALITY is used. The first column contains the elements of the array
(including all the nulls). When the table has two columns, the second
column contains the ordinal position of the element in the array. When
UNNEST is used in the FROM clause of a query, it implies the LATERAL
keyword, which means the array that is converted to table can belong to
any table that precedes the UNNEST in the FROM clause. This is explained
in the Data Access and Change chapter.
INLINE CONSTRUCTOR
Array constructors can be used in SELECT and other statements. For example, an array constructor with a subquery can return the values from several rows as one array.
The example below shows an ARRAY constructor with a correlated subquery to return the list of order values for each customer. The CUSTOMER table that is included for tests in the DatabaseManager GUI app is the source of the data.
SELECT FIRSTNAME, LASTNAME, ARRAY(SELECT INVOICE.TOTAL FROM INVOICE WHERE CUSTOMERID = CUSTOMER.ID) AS ORDERS FROM CUSTOMER FIRSTNAME LASTNAME ORDERS --------- --------- -------------------------------------- Laura Steel ARRAY[2700.90,4235.70] Robert King ARRAY[4761.60] Robert Sommer ARRAY[] Michael Smith ARRAY[3420.30]
COMPARISON
Arrays can be compared for equality. It is possible to define a
UNIQUE constraint on a column of ARRAY type. Two arrays are equal if
they have the same length and the values at each index position are
either equal or both NULL. Array expressions cannot be used in a
comparison expression such as GREATER THAN but they can be used in an
ORDER BY clause. For example, it is possible to add ORDER BY
ORDERS
to the above SELECT statement,
USER DEFINED FUNCTIONS and PROCEDURES
Array parameters, variables and return values can be specified in user defined functions and procedures, including aggregate functions. An aggregate function can return an array that contains all the scalar values that have been aggregated. These capabilities allow a wider range of applications to be covered by user defined functions and easier data exchange between the engine and the user's application.
$Revision: 6752 $