Ignite connector#
The Ignite connector allows querying an Apache Ignite database from Trino.
Requirements#
To connect to an Ignite server, you need:
Ignite version 2.9.0 or later
Network access from the Trino coordinator and workers to the Ignite server. Port 10800 is the default port.
Specify --add-opens=java.base/java.nio=ALL-UNNAMED in the jvm.config file when starting the Trino server.
Configuration#
The Ignite connector exposes the public schema by default.
The connector can query an Ignite instance. Create a catalog properties file
that specifies the Ignite connector by setting the connector.name to ignite.
For example, to access an instance as example, create the file
etc/catalog/example.properties. Replace the connection properties as
appropriate for your setup:
connector.name=ignite
connection-url=jdbc:ignite:thin://host1:10800/
connection-user=exampleuser
connection-password=examplepassword
The connection-url defines the connection information and parameters to pass
to the Ignite JDBC driver. The parameters for the URL are available in the
Ignite JDBC driver documentation.
Some parameters can have adverse effects on the connector behavior or not work
with the connector.
The connection-user and connection-password are typically required and
determine the user credentials for the connection, often a service user. You can
use secrets to avoid storing actual values in the catalog properties files.
Multiple Ignite servers#
If you have multiple Ignite servers, you need to configure one catalog for each server. To add another catalog:
Add another properties file to etc/catalog.
Save it with a different name that ends in .properties.
For example, if you name the property file sales.properties, Trino uses the
configured connector to create a catalog named sales.
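The naming rule above can be illustrated with a short Python sketch. It only models how a catalog name follows from the properties file name. The helper catalog_name is hypothetical and is not part of Trino.

```python
from pathlib import PurePosixPath


def catalog_name(properties_file):
    """Return the catalog name implied by a catalog properties file path."""
    path = PurePosixPath(properties_file)
    if path.suffix != ".properties":
        raise ValueError(f"not a catalog properties file: {properties_file}")
    # The catalog name is the file name without the .properties suffix.
    return path.stem


print(catalog_name("etc/catalog/example.properties"))  # example
print(catalog_name("etc/catalog/sales.properties"))    # sales
```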
General configuration properties#
The following table describes general catalog configuration properties for the connector:
Property name | Description |
---|---|
case-insensitive-name-matching | Support case insensitive schema and table names. Defaults to false. |
case-insensitive-name-matching.cache-ttl | Duration for which case insensitive schema and table names are cached. Defaults to 1m. |
case-insensitive-name-matching.config-file | Path to a name mapping configuration file in JSON format that allows Trino to disambiguate between schemas and tables with similar names in different cases. Defaults to null. |
case-insensitive-name-matching.config-file.refresh-period | Frequency with which Trino checks the name matching configuration file for changes. The duration value defaults to 0s (refresh disabled). |
metadata.cache-ttl | Duration for which metadata, including table and column statistics, is cached. Defaults to 0s (caching disabled). |
metadata.cache-missing | Cache the fact that metadata, including table and column statistics, is not available. Defaults to false. |
metadata.schemas.cache-ttl | Duration for which schema metadata is cached. Defaults to the value of metadata.cache-ttl. |
metadata.tables.cache-ttl | Duration for which table metadata is cached. Defaults to the value of metadata.cache-ttl. |
metadata.statistics.cache-ttl | Duration for which table statistics are cached. Defaults to the value of metadata.cache-ttl. |
metadata.cache-maximum-size | Maximum number of objects stored in the metadata cache. Defaults to 10000. |
write.batch-size | Maximum number of statements in a batched execution. Do not change this setting from the default. Non-default values may negatively impact performance. Defaults to 1000. |
dynamic-filtering.enabled | Push down dynamic filters into JDBC queries. Defaults to true. |
dynamic-filtering.wait-timeout | Maximum duration for which Trino waits for dynamic filters to be collected from the build side of joins before starting a JDBC query. Using a large timeout can potentially result in more detailed dynamic filters. However, it can also increase latency for some queries. Defaults to 20s. |
Appending query metadata#
The optional parameter query.comment-format allows you to configure a SQL
comment that is sent to the datasource with each query. The format of this
comment can contain any characters and the following metadata:
$QUERY_ID: The identifier of the query.
$USER: The name of the user who submits the query to Trino.
$SOURCE: The identifier of the client tool used to submit the query, for example trino-cli.
$TRACE_TOKEN: The trace token configured with the client tool.
The comment can provide more context about the query. This additional
information is available in the logs of the datasource. To include environment
variables from the Trino cluster with the comment, use the
${ENV:VARIABLE-NAME} syntax.
The following example sets a simple comment that identifies each query sent by Trino:
query.comment-format=Query sent by Trino.
With this configuration, a query such as SELECT * FROM example_table; is
sent to the datasource with the comment appended:
SELECT * FROM example_table; /*Query sent by Trino.*/
The following example improves on the preceding example by using metadata:
query.comment-format=Query $QUERY_ID sent by user $USER from Trino.
If Jane sent the query with the query identifier
20230622_180528_00000_bkizg, the following comment string is sent to the
datasource:
SELECT * FROM example_table; /*Query 20230622_180528_00000_bkizg sent by user Jane from Trino.*/
Note
Certain JDBC driver settings and logging configurations might cause the comment to be removed.
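The substitution described in this section can be sketched in Python as follows. This is an illustration of the documented behavior, not Trino's implementation. The function names are hypothetical, and replacing an unset environment variable with an empty string is an assumption of this sketch.

```python
import os
import re

# Matches ${ENV:NAME} or one of the documented $NAME metadata placeholders.
_PLACEHOLDER = re.compile(r"\$\{ENV:([^}]+)\}|\$(QUERY_ID|USER|SOURCE|TRACE_TOKEN)")


def render_comment(fmt, query_id, user, source="", trace_token="", env=None):
    """Fill the placeholders of a query.comment-format value in a single pass."""
    env = os.environ if env is None else env
    values = {
        "QUERY_ID": query_id,
        "USER": user,
        "SOURCE": source,
        "TRACE_TOKEN": trace_token,
    }

    def replace(match):
        if match.group(1) is not None:
            # Assumption: an unset environment variable becomes an empty string.
            return env.get(match.group(1), "")
        return values[match.group(2)]

    return _PLACEHOLDER.sub(replace, fmt)


def append_comment(sql, fmt, **context):
    """Append the rendered comment to the SQL text, as in the examples above."""
    return f"{sql} /*{render_comment(fmt, **context)}*/"


print(append_comment(
    "SELECT * FROM example_table;",
    "Query $QUERY_ID sent by user $USER from Trino.",
    query_id="20230622_180528_00000_bkizg",
    user="Jane",
))
```

Run with the format string from the second example, the sketch prints the same statement and comment shown there.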
Domain compaction threshold#
Pushing down a large list of predicates to the data source can compromise
performance. Trino compacts large predicates into a simpler range predicate
by default to ensure a balance between performance and predicate pushdown.
If necessary, the threshold for this compaction can be increased to improve
performance when the data source is capable of taking advantage of large
predicates. Increasing this threshold may improve pushdown of large
dynamic filters.
The domain-compaction-threshold catalog configuration property or the
domain_compaction_threshold catalog session property can be used to adjust
the default value of 1000 for this threshold.
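The idea behind compaction can be sketched in Python. This is a simplified model under stated assumptions, not Trino's actual algorithm: it treats a predicate as a list of discrete values and collapses it into a single min/max range once the list exceeds the threshold. The function names are hypothetical.

```python
def compact_domain(values, threshold=1000):
    """Keep a value list as-is, or collapse it to one range above the threshold."""
    distinct = sorted(set(values))
    if len(distinct) <= threshold:
        return ("in", distinct)
    # Too many values: fall back to a single range spanning all of them.
    return ("range", distinct[0], distinct[-1])


def to_sql(column, domain):
    """Render the (possibly compacted) domain as a SQL predicate."""
    if domain[0] == "in":
        return f"{column} IN ({', '.join(map(str, domain[1]))})"
    return f"{column} BETWEEN {domain[1]} AND {domain[2]}"


print(to_sql("id", compact_domain([5, 1, 3], threshold=3)))
print(to_sql("id", compact_domain([5, 1, 3, 9], threshold=3)))
```

In this model the compacted range can match more rows than the original list, which is why a higher threshold can help when the data source handles long value lists efficiently.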
Procedures#
system.flush_metadata_cache()
Flush JDBC metadata caches. For example, the following system call flushes the metadata caches for all schemas in the example catalog:
USE example.example_schema;
CALL system.flush_metadata_cache();
Case insensitive matching#
When case-insensitive-name-matching is set to true, Trino
is able to query non-lowercase schemas and tables by maintaining a mapping of
the lowercase name to the actual name in the remote system. However, if two
schemas and/or tables have names that differ only in case (such as “customers”
and “Customers”) then Trino fails to query them due to ambiguity.
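The ambiguity can be illustrated with a minimal Python sketch. It assumes the mapping is keyed by the lowercased remote name and is not Trino's implementation. The function name and the sample table names are hypothetical.

```python
from collections import defaultdict


def lowercase_mapping(remote_names):
    """Map lowercase names to remote names and report case-only collisions."""
    groups = defaultdict(list)
    for name in remote_names:
        groups[name.lower()].append(name)
    # A lowercase name is usable only if exactly one remote name maps to it.
    mapping = {key: names[0] for key, names in groups.items() if len(names) == 1}
    ambiguous = {key: names for key, names in groups.items() if len(names) > 1}
    return mapping, ambiguous


mapping, ambiguous = lowercase_mapping(["Orders", "customers", "Customers"])
print(mapping)    # {'orders': 'Orders'}
print(ambiguous)  # {'customers': ['customers', 'Customers']}
```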
In these cases, use the case-insensitive-name-matching.config-file catalog
configuration property to specify a configuration file that maps these remote
schemas/tables to their respective Trino schemas/tables:
{
  "schemas": [
    {
      "remoteSchema": "CaseSensitiveName",
      "mapping": "case_insensitive_1"
    },
    {
      "remoteSchema": "cASEsENSITIVEnAME",
      "mapping": "case_insensitive_2"
    }
  ],
  "tables": [
    {
      "remoteSchema": "CaseSensitiveName",
      "remoteTable": "tablex",
      "mapping": "table_1"
    },
    {
      "remoteSchema": "CaseSensitiveName",
      "remoteTable": "TABLEX",
      "mapping": "table_2"
    }
  ]
}
Queries against one of the tables or schemas defined in the mapping
attributes are run against the corresponding remote entity. For example, a query
against tables in the case_insensitive_1 schema is forwarded to the
CaseSensitiveName schema and a query against case_insensitive_2 is forwarded
to the cASEsENSITIVEnAME schema.
At the table mapping level, a query on case_insensitive_1.table_1 as
configured above is forwarded to CaseSensitiveName.tablex, and a query on
case_insensitive_1.table_2 is forwarded to CaseSensitiveName.TABLEX.
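The forwarding rules described above can be modeled with a short Python sketch that reads the same mapping file content. The resolve function is hypothetical, and passing unmapped names through unchanged is a simplifying assumption of this sketch rather than documented connector behavior.

```python
import json

MAPPING_JSON = """
{
  "schemas": [
    {"remoteSchema": "CaseSensitiveName", "mapping": "case_insensitive_1"},
    {"remoteSchema": "cASEsENSITIVEnAME", "mapping": "case_insensitive_2"}
  ],
  "tables": [
    {"remoteSchema": "CaseSensitiveName", "remoteTable": "tablex", "mapping": "table_1"},
    {"remoteSchema": "CaseSensitiveName", "remoteTable": "TABLEX", "mapping": "table_2"}
  ]
}
"""


def resolve(config, trino_schema, trino_table=None):
    """Translate a Trino schema (and optional table) name to the remote names."""
    schemas = {s["mapping"]: s["remoteSchema"] for s in config.get("schemas", [])}
    tables = {
        (t["remoteSchema"], t["mapping"]): t["remoteTable"]
        for t in config.get("tables", [])
    }
    # Assumption: names without an explicit mapping are passed through as-is.
    remote_schema = schemas.get(trino_schema, trino_schema)
    if trino_table is None:
        return remote_schema, None
    return remote_schema, tables.get((remote_schema, trino_table), trino_table)


config = json.loads(MAPPING_JSON)
print(resolve(config, "case_insensitive_1", "table_1"))
print(resolve(config, "case_insensitive_1", "table_2"))
print(resolve(config, "case_insensitive_2"))
```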
By default, when a change is made to the mapping configuration file, Trino must
be restarted to load the changes. Optionally, you can set the
case-insensitive-name-matching.config-file.refresh-period property to have
Trino refresh the mapping without requiring a restart:
case-insensitive-name-matching.config-file.refresh-period=30s
Non-transactional INSERT#
The connector supports adding rows using INSERT statements.
By default, data insertion is performed by writing data to a temporary table.
You can skip this step to improve performance and write directly to the target
table. Set the insert.non-transactional-insert.enabled catalog property
or the corresponding non_transactional_insert catalog session property to true.
Note that with this property enabled, data can be corrupted in rare cases where exceptions occur during the insert operation. With transactions disabled, no rollback can be performed.
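The trade-off can be demonstrated with a small Python sketch that uses an in-memory SQLite database as a stand-in. It is an analogy only and does not reflect how the connector talks to Ignite. The table names target and staging and both function names are hypothetical.

```python
import sqlite3


def insert_rows(conn, rows, non_transactional=False):
    """Insert rows into target, either directly or through a staging table."""
    if non_transactional:
        # Direct writes: rows written before a failure remain in the target.
        for row in rows:
            conn.execute("INSERT INTO target (id) VALUES (?)", row)
        return
    # Default path: stage all rows first, then copy them over in one statement.
    conn.execute("CREATE TEMP TABLE staging (id INTEGER NOT NULL)")
    try:
        conn.executemany("INSERT INTO staging (id) VALUES (?)", rows)
        conn.execute("INSERT INTO target (id) SELECT id FROM staging")
    finally:
        conn.execute("DROP TABLE staging")


def rows_after_failed_insert(non_transactional):
    """Return how many rows reach the target when the third row is invalid."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    conn.execute("CREATE TABLE target (id INTEGER NOT NULL)")
    try:
        insert_rows(conn, [(1,), (2,), (None,)], non_transactional)
    except sqlite3.IntegrityError:
        pass  # the third row violates the NOT NULL constraint
    count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
    conn.close()
    return count


print(rows_after_failed_insert(non_transactional=False))
print(rows_after_failed_insert(non_transactional=True))
```

In this stand-in, the staged insert leaves the target table empty after the failure, while the direct insert leaves the two rows written before the error.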
Table properties#
Table property usage example:
CREATE TABLE public.person (
    id BIGINT NOT NULL,
    birthday DATE NOT NULL,
    name VARCHAR(26),
    age BIGINT,
    logdate DATE
)
WITH (
    primary_key = ARRAY['id', 'birthday']
);
The following are supported Ignite table properties from https://ignite.apache.org/docs/latest/sql-reference/ddl
Property name | Required | Description |
---|---|---|
primary_key | No | The primary key of the table. You can choose multiple columns as the primary key. The table must contain at least one column that is not part of the primary key. |
primary_key#
This is a list of columns to be used as the table’s primary key. If not specified, a VARCHAR
primary key column named DUMMY_ID is generated. Its value is derived from the
value generated by the UUID function in Ignite.
Type mapping#
The following are supported Ignite SQL data types from https://ignite.apache.org/docs/latest/sql-reference/data-types
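Ignite SQL data type name | Map to Trino type | Possible values |
---|---|---|
BOOLEAN | BOOLEAN | TRUE and FALSE |
BIGINT | BIGINT | -9223372036854775808, 9223372036854775807, etc. |
DECIMAL | DECIMAL | Data type with fixed precision and scale |
DOUBLE | DOUBLE | 3.14, -10.24, etc. |
INT | INTEGER | -2147483648, 2147483647, etc. |
REAL | REAL | 3.14, -10.24, etc. |
SMALLINT | SMALLINT | -32768, 32767, etc. |
TINYINT | TINYINT | -128, 127, etc. |
CHAR | CHAR | hello, Trino, etc. |
VARCHAR | VARCHAR | hello, Trino, etc. |
DATE | DATE | 1972-01-01, 2021-07-15, etc. |
BINARY | VARBINARY | Represents a byte array. |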
SQL support#
The connector provides read access and write access to data and metadata in Ignite. In addition to the globally available and read operation statements, the connector supports the following features:
UPDATE#
Only UPDATE statements with constant assignments and predicates are
supported. For example, the following statement is supported because the values
assigned are constants:
UPDATE table SET col1 = 1 WHERE col3 = 1
Arithmetic expressions, function calls, and other non-constant UPDATE
statements are not supported. For example, the following statement is not
supported because arithmetic expressions cannot be used with the SET command:
UPDATE table SET col1 = col2 + 2 WHERE col3 = 1
The =, !=, >, <, >=, <=, IN, NOT IN operators are supported in
predicates. The following statement is not supported because the AND operator
cannot be used in predicates:
UPDATE table SET col1 = 1 WHERE col3 = 1 AND col2 = 3
You cannot update all columns of a table row in a single statement. For a three column table, the following statement is not supported:
UPDATE table SET col1 = 1, col2 = 2, col3 = 3 WHERE col3 = 1
ALTER TABLE RENAME TO#
The connector does not support renaming tables across multiple schemas. For example, the following statement is supported:
ALTER TABLE example.schema_one.table_one RENAME TO example.schema_one.table_two
The following statement attempts to rename a table across schemas, and therefore is not supported:
ALTER TABLE example.schema_one.table_one RENAME TO example.schema_two.table_two
Pushdown#
The connector supports pushdown for a number of operations:
Aggregate pushdown for the following functions:
Predicate pushdown support#
The connector does not support pushdown of any predicates on columns with
textual types like CHAR or VARCHAR.
This ensures correctness of results since the data source may compare strings
case-insensitively.
In the following example, the predicate is not pushed down for either query
since name is a column of type VARCHAR:
SELECT * FROM nation WHERE name > 'CANADA';
SELECT * FROM nation WHERE name = 'CANADA';
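The textual-type rule can be modeled with a brief Python sketch. It only captures the rule stated in this section: whether a predicate on a non-textual column is actually pushed down depends on other factors that this sketch ignores. The function names and the nationkey column are hypothetical.

```python
def is_textual(trino_type):
    """Return True for CHAR and VARCHAR types, with or without a length."""
    base = trino_type.lower().split("(", 1)[0].strip()
    return base in ("char", "varchar")


def split_predicates(predicates, column_types):
    """Split predicates into pushdown candidates and ones kept in Trino."""
    candidates, kept_in_trino = [], []
    for column, condition in predicates:
        if is_textual(column_types[column]):
            # Textual comparisons stay in Trino to keep results correct.
            kept_in_trino.append((column, condition))
        else:
            candidates.append((column, condition))
    return candidates, kept_in_trino


column_types = {"name": "varchar(25)", "nationkey": "bigint"}
predicates = [("name", "> 'CANADA'"), ("nationkey", "= 3")]
print(split_predicates(predicates, column_types))
```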