# Release 405 (28 Dec 2022)

## General

* Add Trino version to the output of `EXPLAIN`. (#15317)
* Add task input/output size distribution to the output of `EXPLAIN ANALYZE VERBOSE`. (#15286)
* Add stage skewness warnings to the output of `EXPLAIN ANALYZE`. (#15286)
* Add support for the `ALTER COLUMN ... SET DATA TYPE` statement, as shown in the example after this list. (#11608)
* Allow configuring a refresh interval for the database resource group manager with the `resource-groups.refresh-interval` configuration property. (#14514)
* Improve performance of queries that compare `date` columns with `timestamp(n) with time zone` literals. (#5798)
* Improve performance and resource utilization when inserting into tables. (#14718, #14874)
* Improve performance for `INSERT` queries when fault-tolerant execution is enabled. (#14735)
* Improve planning performance for queries with many `GROUP BY` clauses. (#15292)
* Improve query performance for large clusters and skewed queries. (#15369)
* Rename the `node-scheduler.max-pending-splits-per-task` configuration property to `node-scheduler.min-pending-splits-per-task`. (#15168)
* Ensure that the configured number of task retries is not larger than 126. (#14459)
* Fix incorrect rounding of `time(n)` and `time(n) with time zone` values near the top of the range of allowed values. (#15138)
* Fix incorrect results for queries involving window functions without a `PARTITION BY` clause followed by the evaluation of window functions with a `PARTITION BY` and `ORDER BY` clause. (#15203)
* Fix incorrect results when adding or subtracting an `interval` from a `timestamp with time zone`. (#15103)
* Fix potential incorrect results when joining tables on indexed and non-indexed columns at the same time. (#15334)
* Fix potential failure of queries involving `MATCH_RECOGNIZE`. (#15343)
* Fix incorrect reporting of `Projection CPU time` in the output of `EXPLAIN ANALYZE VERBOSE`. (#15364)
* Fix `SET TIME ZONE LOCAL` to correctly reset to the initial time zone of the client session. (#15314)
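The new statement changes a column's type in place. A minimal sketch, assuming a table named `orders` with a `price` column (both names are hypothetical):

```sql
-- Widen an existing column; the table and column names are hypothetical.
ALTER TABLE orders ALTER COLUMN price SET DATA TYPE decimal(18, 4);
```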
## Security

* Add support for string replacement as part of impersonation rules. (#14962)
* Add support for fetching access control rules via HTTPS. (#14008)
* Fix some `system.metadata` tables improperly showing the names of catalogs which the user cannot access. (#14000)
* Fix `USE` statement improperly disclosing the names of catalogs and schemas which the user cannot access. (#14208)
* Fix improper HTTP redirect after OAuth 2.0 token refresh. (#15336)
## Web UI

* Display operator CPU time in the “Stage Performance” tab. (#15339)
## JDBC driver

* Return correct values in `NULLABLE` columns of the `DatabaseMetaData.getColumns` result. (#15214)
## BigQuery connector

* Improve read performance with experimental support for Apache Arrow serialization when reading from BigQuery. This can be enabled with the `bigquery.experimental.arrow-serialization.enabled` catalog configuration property. (#14972)
* Fix queries incorrectly executing with the project ID specified in the credentials instead of the project ID specified in the `bigquery.project-id` catalog property. (#14083)
## Delta Lake connector

* Add support for views. (#11609)
* Add support for configuring batch size for reads on Parquet files using the `parquet.max-read-block-row-count` configuration property or the `parquet_max_read_block_row_count` session property. (#15474)
* Improve performance and reduce storage requirements when running the `vacuum` procedure on S3-compatible storage. (#15072)
* Improve memory accounting for `INSERT`, `MERGE`, and `CREATE TABLE ... AS SELECT` queries. (#14407)
* Improve performance of reading Parquet files for `boolean`, `tinyint`, `short`, `int`, `long`, `float`, `double`, `short decimal`, `UUID`, `time`, `decimal`, `varchar`, and `char` data types. This optimization can be disabled with the `parquet.optimized-reader.enabled` catalog configuration property. (#14423, #14667)
* Improve query performance when the `nulls fraction` statistic is not available for some columns. (#15132)
* Improve performance when reading Parquet files. (#15257, #15474)
* Improve performance of reading Parquet files for queries with filters. (#15268)
* Improve `DROP TABLE` performance for tables stored on AWS S3. (#13974)
* Improve performance of reading Parquet files for `timestamp` and `timestamp with time zone` data types. (#15204)
* Improve performance of queries that read a small number of columns and queries that process tables with large Parquet row groups or ORC stripes. (#15168)
* Improve stability and reduce peak memory requirements when reading from Parquet files. (#15374)
* Allow registering existing table files in the metastore with the new `register_table` procedure, as shown in the example after this list. (#13568)
* Deprecate creating a new table with existing table content. This can be re-enabled using the `delta.legacy-create-table-with-existing-location.enabled` configuration property or the `legacy_create_table_with_existing_location_enabled` session property. (#13568)
* Fix query failure when reading Parquet files with large row groups. (#5729)
* Fix `DROP TABLE` leaving files behind when using managed tables stored on S3 and created by the Databricks runtime. (#13017)
* Fix query failure when the path contains special characters. (#15183)
* Fix potential `INSERT` failure for tables stored on S3. (#15476)
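The `register_table` procedure attaches table files that already exist in storage to the metastore. A minimal sketch, assuming a Delta Lake catalog named `delta`; the schema, table, and S3 location are hypothetical:

```sql
-- Register existing Delta Lake table files with the metastore;
-- the catalog name, schema, table, and location are hypothetical.
CALL delta.system.register_table(
    schema_name => 'analytics',
    table_name => 'events',
    table_location => 's3://example-bucket/warehouse/events');
```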
## Google Sheets connector

* Add support for setting a read timeout with the `gsheets.read-timeout` configuration property. (#15322)
* Add support for `base64`-encoded credentials using the `gsheets.credentials-key` configuration property. (#15477)
* Rename the `credentials-path` configuration property to `gsheets.credentials-path`, `metadata-sheet-id` to `gsheets.metadata-sheet-id`, `sheets-data-max-cache-size` to `gsheets.max-data-cache-size`, and `sheets-data-expire-after-write` to `gsheets.data-cache-ttl`. (#15042)
## Hive connector

* Add support for referencing nested fields in columns with the `UNIONTYPE` Hive type. (#15278)
* Add support for configuring batch size for reads on Parquet files using the `parquet.max-read-block-row-count` configuration property or the `parquet_max_read_block_row_count` session property, as shown in the example after this list. (#15474)
* Improve memory accounting for `INSERT`, `MERGE`, and `CREATE TABLE AS SELECT` queries. (#14407)
* Improve performance of reading Parquet files for `boolean`, `tinyint`, `short`, `int`, `long`, `float`, `double`, `short decimal`, `UUID`, `time`, `decimal`, `varchar`, and `char` data types. This optimization can be disabled with the `parquet.optimized-reader.enabled` catalog configuration property. (#14423, #14667)
* Improve performance for queries which write data into multiple partitions. (#15241, #15066)
* Improve performance when reading Parquet files. (#15257, #15474)
* Improve performance of reading Parquet files for queries with filters. (#15268)
* Improve `DROP TABLE` performance for tables stored on AWS S3. (#13974)
* Improve performance of reading Parquet files for `timestamp` and `timestamp with time zone` data types. (#15204)
* Improve performance of queries that read a small number of columns and queries that process tables with large Parquet row groups or ORC stripes. (#15168)
* Improve stability and reduce peak memory requirements when reading from Parquet files. (#15374)
* Disallow creating transactional tables when not using the Hive metastore. (#14673)
* Fix query failure when reading Parquet files with large row groups. (#5729)
* Fix incorrect `schema already exists` error caused by a client timeout when creating a new schema. (#15174)
* Fix failure when an access denied exception happens while listing tables or views in a Glue metastore. (#14746)
* Fix `INSERT` failure on ORC ACID tables when Apache Hive 3.1.2 is used as a metastore. (#7310)
* Fix failure when reading Hive views with `char` types. (#15470)
* Fix potential `INSERT` failure for tables stored on S3. (#15476)
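The catalog-level property is set in the catalog properties file, while the session property can be adjusted per query. A minimal sketch, assuming a Hive catalog named `hive`; the row count shown is illustrative:

```sql
-- Override the Parquet read batch size for the current session only;
-- the catalog name (hive) and the value are illustrative.
SET SESSION hive.parquet_max_read_block_row_count = 4096;
```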
## Hudi connector

* Improve performance of reading Parquet files for `boolean`, `tinyint`, `short`, `int`, `long`, `float`, `double`, `short decimal`, `UUID`, `time`, `decimal`, `varchar`, and `char` data types. This optimization can be disabled with the `parquet.optimized-reader.enabled` catalog configuration property. (#14423, #14667)
* Improve performance of reading Parquet files for queries with filters. (#15268)
* Improve performance of reading Parquet files for `timestamp` and `timestamp with time zone` data types. (#15204)
* Improve performance of queries that read a small number of columns and queries that process tables with large Parquet row groups or ORC stripes. (#15168)
* Improve stability and reduce peak memory requirements when reading from Parquet files. (#15374)
* Fix query failure when reading Parquet files with large row groups. (#5729)
## Iceberg connector

* Add support for configuring batch size for reads on Parquet files using the `parquet.max-read-block-row-count` configuration property or the `parquet_max_read_block_row_count` session property. (#15474)
* Add support for the Iceberg REST catalog. (#13294)
* Improve memory accounting for `INSERT`, `MERGE`, and `CREATE TABLE AS SELECT` queries. (#14407)
* Improve performance of reading Parquet files for `boolean`, `tinyint`, `short`, `int`, `long`, `float`, `double`, `short decimal`, `UUID`, `time`, `decimal`, `varchar`, and `char` data types. This optimization can be disabled with the `parquet.optimized-reader.enabled` catalog configuration property. (#14423, #14667)
* Improve performance when reading Parquet files. (#15257, #15474)
* Improve performance of reading Parquet files for queries with filters. (#15268)
* Improve `DROP TABLE` performance for tables stored on AWS S3. (#13974)
* Improve performance of reading Parquet files for `timestamp` and `timestamp with time zone` data types. (#15204)
* Improve performance of queries that read a small number of columns and queries that process tables with large Parquet row groups or ORC stripes. (#15168)
* Improve stability and reduce peak memory requirements when reading from Parquet files. (#15374)
* Fix incorrect results when predicates over `row` columns on Parquet files are pushed into the connector. (#15408)
* Fix query failure when reading Parquet files with large row groups. (#5729)
* Fix `REFRESH MATERIALIZED VIEW` failure when the materialized view is based on non-Iceberg tables. (#13131)
* Fix failure when an access denied exception happens while listing tables or views in a Glue metastore. (#14971)
* Fix potential `INSERT` failure for tables stored on S3. (#15476)
## Kafka connector

* Add support for Protobuf encoding. (#14734)
## MongoDB connector

* Add support for fault-tolerant execution. (#15062)
* Add support for setting a file path and password for the truststore and keystore. (#15240)
* Add support for case-insensitive name-matching in the `query` table function, as shown in the example after this list. (#15329)
* Rename the `mongodb.ssl.enabled` configuration property to `mongodb.tls.enabled`. (#15240)
* Delete a MongoDB field from collections when dropping a column. Previously, the connector deleted only metadata. (#15226)
* Remove deprecated `mongodb.seeds` and `mongodb.credentials` configuration properties. (#15263)
* Fix failure when an unauthorized exception happens while listing schemas or tables. (#1398)
* Fix `NullPointerException` when a column name contains uppercase characters in the `query` table function. (#15294)
* Fix potential incorrect results when the `objectid` function is used more than once within a single query. (#15426)
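A minimal sketch of the `query` table function, assuming a MongoDB catalog named `mongodb`; the database, collection, and filter shown are hypothetical. With case-insensitive name-matching, the collection name no longer has to match the stored case exactly:

```sql
-- Pass a raw MongoDB filter through the query table function;
-- the catalog, database, collection, and filter are hypothetical.
SELECT *
FROM TABLE(mongodb.system.query(
    database => 'sales',
    collection => 'Orders',
    filter => '{ "status": "shipped" }'));
```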
## MySQL connector

* Fix failure when the `query` table function contains a `WITH` clause, as in the example below. (#15332)
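A minimal sketch of a pass-through query containing a `WITH` clause, assuming a MySQL catalog named `mysql`; the table and column names inside the query are hypothetical:

```sql
-- A WITH clause inside the pass-through query no longer causes a failure;
-- the catalog name and the query text are hypothetical.
SELECT *
FROM TABLE(mysql.system.query(query => '
    WITH recent AS (
        SELECT * FROM orders WHERE created_at > NOW() - INTERVAL 7 DAY
    )
    SELECT COUNT(*) AS order_count FROM recent'));
```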
## PostgreSQL connector

* Fix query failure when a `FULL JOIN` is pushed down. (#14841)
## Redshift connector

* Add support for aggregation, join, and `ORDER BY ... LIMIT` pushdown. (#15365)
* Add support for `DELETE`. (#15365)
* Add schema, table, and column name length checks. (#15365)
* Add full type mapping for Redshift types. The previous behavior can be restored via the `redshift.use-legacy-type-mapping` configuration property. (#15365)
## SPI

* Remove deprecated `ConnectorNodePartitioningProvider.getBucketNodeMap()` method. (#14067)
* Use the `MERGE` APIs in the engine to execute `DELETE` and `UPDATE`. Require connectors to implement `beginMerge()` and related APIs. Deprecate `beginDelete()`, `beginUpdate()`, and `UpdatablePageSource`, which are unused and do not need to be implemented. (#13926)