E-MapReduce:Release notes for EMR Serverless StarRocks

Last Updated: May 07, 2025

This topic describes the release notes for E-MapReduce (EMR) Serverless StarRocks.

3.3

Note

EMR Serverless StarRocks 3.3 is developed based on StarRocks, a Linux Foundation project. For more information, see StarRocks version 3.3.

Minor version

Release date

Description

3.3.8-1.91

March 20, 2025

[New features]

  • Partial and conditional updates are supported for StarRocks shared-data instances.

  • The dynamic configuration item enable_delete_with_condition is added for backend nodes (BEs). This configuration item specifies whether to allow deletion with specific conditions.

  • The configuration item enable_corrupt_check_set_bad is added for frontend nodes (FEs). The default value is false. If you set the configuration item to true, the system checks whether replica files are damaged when you query data from tables created in StarRocks shared-nothing instances. If the replica files are damaged, the system automatically marks the files as bad files to improve data reliability.

  • The AccessKey pair can be explicitly configured in the statements used to create an external catalog, as shown in the sketch after this list.
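
The following SQL sketch illustrates how the new items above might be used, assuming the ADMIN SET FRONTEND CONFIG syntax and the aliyun.oss.* catalog property names from community StarRocks. The catalog name, metastore address, endpoint, and AccessKey values are placeholders.

    -- Hedged sketch: turn on the new FE replica corruption check (default: false).
    ADMIN SET FRONTEND CONFIG ("enable_corrupt_check_set_bad" = "true");
    -- enable_delete_with_condition is a dynamic BE item; it is typically changed
    -- through the BE configuration in the EMR console rather than through SQL.

    -- Hedged sketch: create an external catalog with an explicitly configured
    -- AccessKey pair (all values below are placeholders).
    CREATE EXTERNAL CATALOG hive_catalog_demo
    PROPERTIES (
        "type" = "hive",
        "hive.metastore.type" = "hive",
        "hive.metastore.uris" = "thrift://<metastore_host>:9083",
        "aliyun.oss.access_key" = "<your_access_key_id>",
        "aliyun.oss.secret_key" = "<your_access_key_secret>",
        "aliyun.oss.endpoint" = "<your_oss_endpoint>"
    );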

[Optimized features]

Performance is improved for the SHOW MATERIALIZED VIEWS statement and for queries against the information_schema.materialized_views table when a large number of materialized views (MVs) exist.

[Fixed issues]

  • The transactional mechanism conflicts with the Stream Load mode in StarRocks shared-nothing instances.

  • Compute nodes (CNs) cannot respond in community version 3.3.8.

  • After MVs are refreshed, the physical data files of expired partitions in the recycle bin are not deleted.

  • After you enable the multi-warehouse feature, Kudu external tables cannot be accessed.

  • Spill tasks cannot end as expected after the corresponding query context is released.

  • The BEs cannot respond when Flat JSON is used to read non-JSON data.

  • Fields of the timestamp type in Paimon tables fail to be converted.

  • A null pointer exception occurs when the SHOW PARTITIONS statement is executed for non-partitioned Paimon tables.

3.3.8-1.90

March 3, 2025

[New features]

Statistics calculation and collection at the DataCache layer are supported for StarRocks shared-data instances.

[Optimized features]

  • The logic of obtaining MVs is optimized to improve system performance.

  • Compaction stability is enhanced to fix the following issues:

    • The compaction operation may be blocked after FEs are restarted.

    • The compaction operation does not automatically end.

  • The request logic of the Object Storage Service (OSS) API is optimized.

  • Logs and observability information are provided for batch publish operations.

  • More compaction-related diagnostic analysis information is provided, including status details and performance metrics.

  • The access logic of OSS-HDFS is optimized to enhance the stability and efficiency of data read and write operations.

[Fixed issues]

  • The lake compaction transaction is repeatedly triggered during primary/secondary switchovers of FEs.

  • When the transactional mechanism is used together with the Stream Load mode, the multi-warehouse feature cannot work as expected.

  • The warehouse attribute is missing from the information_schema table.

  • The Paimon cache is not automatically refreshed.

  • RAM user attributes are lost after the FEs are restarted.

  • An error occurs during Data Lake Formation (DLF) 2.0 authentication.

3.3.8-1.87

February 06, 2025

[New features]

  • The SHOW PARTITIONS statement is supported to allow you to view the partition information of Paimon tables (see the sketch after this list).

  • Data can be written to Paimon tables.

  • DLF 2.0 is supported.
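
A minimal sketch of the two Paimon-related features above; paimon_catalog, db.orders, and the source table are placeholder names.

    -- Hedged sketch: view the partition information of a Paimon table.
    SHOW PARTITIONS FROM paimon_catalog.db.orders;

    -- Hedged sketch: write query results into a Paimon table.
    INSERT INTO paimon_catalog.db.orders
    SELECT * FROM default_catalog.demo_db.orders_staging;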

[Optimized features]

  • Profile-related modifications are deprecated.

  • Pure code related to data lake analysis is removed.

  • Detailed observability information about compaction tasks is added.

  • Hints can be configured in SELECT statements to control cache-related behaviors (see the sketch after this list).

  • Forced refresh strategies of MVs are optimized.

  • The UPDATE statement is optimized to allow you to repeatedly perform assignments on a column.

  • Cloud-native indexes of tables created in StarRocks shared-data instances can be deleted.

  • Statistics on Paimon tables can be collected to optimize query performance.

  • Limit pushdown is supported when a query on a Paimon table contains only partition predicates.

  • Log printing is supported in Metrics Action.

  • Metrics related to the primary key are added to cloud-native indexes.

  • The OSS API calls can be counted.

  • The timeout issue that occurs in gRPC is fixed.

  • Scheduling logs are generated during a compaction operation.

  • Manifest cache refreshing is supported for Paimon tables.

  • The profile generated when you query data from a Paimon table is optimized.

  • Index caching is supported for tables created in StarRocks shared-data instances.

  • Metrics related to write performance are supported in the profiles of tables created in StarRocks shared-data instances.

  • MV refresh tasks are optimized to fix the memory-related issues of FEs.

  • The metadata of Paimon tables can be cached.

  • Column-level lineage can be delivered to DataWorks.

  • The execution efficiency of COUNT(1) is optimized.
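
A minimal sketch of a cache-related hint, assuming the SET_VAR hint syntax and the enable_scan_datacache session variable from community StarRocks; the table name is a placeholder.

    -- Hedged sketch: disable the data cache for a single query by using a
    -- SET_VAR hint (the variable name is an assumption based on community StarRocks).
    SELECT /*+ SET_VAR(enable_scan_datacache = false) */ COUNT(1)
    FROM demo_db.orders;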

[Fixed issues]

  • The "thrift class not found" error message appears when you query data from Paimon tables created in DLF.

  • The type conversion issue occurs when you query data from Paimon tables.

  • The configuration of aliyun.oss.endpoint cannot be correctly identified when you query data from Paimon catalogs.

  • The results of aggregation queries are inaccurate.

  • Lake table statistics are incorrect.

  • Abnormal segment files are generated after you perform a partial update on internal tables created in StarRocks shared-data instances.

  • The compaction profile is not displayed as expected.

  • The lock manager of FEs fails to wake up all callers.

  • Sort keys cannot be modified.

  • The warehouse attribute does not take effect.

  • The admin user cannot manage warehouses.

  • Abnormal reduce rules are configured for the cost-based optimizer (CBO).

  • When you query data from a Hive table, the table cannot be found.

  • The publish time is not updated after a table created in a StarRocks shared-data instance is updated.

  • An error occurs when you read data from Parquet files generated by Spark.

3.3.2-1.77

November 19, 2024

  • Cloud-native primary key indexes are supported.

  • Metrics statistics of OSS API access are provided for cost optimization.

  • The fault tolerance mechanism is optimized for FEs to improve stability.

  • The following issue is fixed: In community version 3.3, stability is affected due to the preparation of log records.

  • By default, catalog-level caching is enabled for Paimon to accelerate data reads from Paimon tables.

  • EXPLAIN ANALYZE is supported for performance analysis of Paimon tables (see the sketch after this list).

  • The execution plans for Paimon tables are optimized.

  • The following issue is fixed: Deadlocks may occur when you use a Hive catalog or DLF catalog to manage metadata in Paimon tables.

  • The issue that data in Paimon system tables cannot be queried is fixed.

  • The issue that DLF external tables cannot be accessed is fixed.
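
A minimal sketch of EXPLAIN ANALYZE on a Paimon table; the catalog, database, table, and column names are placeholders.

    -- Hedged sketch: profile the runtime execution of a query on a Paimon table.
    EXPLAIN ANALYZE
    SELECT o_orderstatus, COUNT(1)
    FROM paimon_catalog.db.orders
    GROUP BY o_orderstatus;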

3.2

Note

EMR Serverless StarRocks 3.2 is developed based on StarRocks, a Linux Foundation project. For more information, see StarRocks version 3.2.

Minor version

Release date

Description

3.2.15-1.92

March 21, 2025

[New features]

The window functions MAX_BY() and MIN_BY() are supported.
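
A minimal sketch of MAX_BY() and MIN_BY() used as window functions; the table and column names are placeholders.

    -- Hedged sketch: for each user, return the page with the longest and the
    -- shortest dwell time.
    SELECT
        user_id,
        page,
        dwell_time,
        MAX_BY(page, dwell_time) OVER (PARTITION BY user_id) AS longest_page,
        MIN_BY(page, dwell_time) OVER (PARTITION BY user_id) AS shortest_page
    FROM demo_db.page_visits;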

[Optimized features]

  • The time ranges during which base compaction operations are not allowed for tables can be specified.

  • Statistics on Paimon tables can be collected.

  • Metrics related to nodes and histograms are added.

  • Timeout-related parameters can be configured for the StarRocks client.

  • Automatic detection of replica files is supported. If the system detects that specific replica files are damaged, the system automatically marks them as bad files.

[Fixed issues]

  • After the SHOW ROUTINE LOAD statement is executed, 0 is returned for the loadRowsRate field.

  • The Files() function cannot correctly read columns that were not queried.

  • FEs cannot respond because of the ARRAY_MAP function.

  • FEs cannot respond because of metadata caching.

  • Routine Load jobs are canceled due to transaction timeout.

  • When low cardinality optimization is enabled, the execution plan for the nested aggregate function MAX(COUNT(DISTINCT)) is incorrect.

  • Stream Load jobs are submitted to non-alive nodes.

  • After BEs are restarted, specific bRPC requests continuously report errors.

  • Stream Load jobs fail to be submitted by using HTTP 1.0.

  • The job status of the leader and follower FEs is inconsistent.

  • The number of rows in an MV is inaccurate.

3.2.11-1.79

November 20, 2024

[Optimized features]

  • By default, partition-level caching is enabled for Paimon to accelerate queries.

  • The statistics collection feature of Paimon is optimized.

  • Error messages are returned when audit logs fail to be queried.

[Fixed issues]

  • Colocate tablets are frequently migrated.

  • The OSS scheme cannot be found during the access to Paimon tables stored in OSS or OSS-HDFS.

  • Paimon tables cannot be created in DLF 2.0.

  • A partition-related error occurs when data is written to Hive partitioned tables.

  • An error occurs when the array_to_bitmap function is called to process constant arrays.

3.2.11-1.76

October 30, 2024

[Optimized features]

  • The profile collection strategies are optimized, and specific crashes are fixed.

  • The exception handling mechanism of the shared-data architecture is enhanced to improve system fault tolerance.

[Fixed issues]

  • An error occurs when data is inserted into an external partitioned table.

  • FEs become abnormal due to thread leaks.

  • The system crashes when primary key tables created in StarRocks shared-data instances contain fields of the bitmap type.

  • Data in ToDataCacheInfo is leaked. This fix prevents an out-of-memory (OOM) error from occurring in the FEs of StarRocks shared-data instances.

  • Incorrect query results are returned due to query caching.

  • Incorrect results are returned by specific bucket shuffling operations.

  • A null pointer exception occurs during the access to table functions.

3.2.9-1.71

September 14, 2024

[New features]

  • Data in Paimon tables can be written to StarRocks instances.

  • Read and write operations can be performed on Paimon catalogs created in DLF 2.0.

[Optimized features]

  • The EXPLAIN ANALYZE statement is supported for Paimon tables.

  • The pruning and statistics collection features are no longer supported for Paimon tables.

  • The SQL statements used by a query can be viewed.

  • The performance of the SELECT COUNT statement is optimized.

  • An interface is provided for obtaining migration progress.

  • The dlf.catalog.id parameter can be configured when you create an Iceberg table.

[Fixed issues]

  • An error occurs during a LIKE query.

  • Inaccurate data is returned when the SHOW DATA statement is executed.

3.2.9-1.67

August 16, 2024

[Optimized features]

The performance of Hive Sink is improved.

[Fixed issues]

  • Hive catalogs are not adapted to Ranger.

  • The system crashes due to the optimization of the COUNT statement.

  • The refresh efficiency of MVs is low because obtaining snapshot information takes a long time.

  • Compaction Manager generates too much metadata for a table created in StarRocks shared-data instances.

  • CNs crash in specific scenarios.

  • After the minor version of a StarRocks instance is upgraded to 3.2.9, the PREPARE statements cannot work as expected.

  • Data inconsistency occurs after the spilling feature is enabled.

3.2.9-1.66

August 09, 2024

[New features]

  • Data can be written to OSS-HDFS files even if the specific parent directory does not exist.

  • Data stored in Jindo can be read by using the Broker Load mode.

  • Different types of engines can be automatically identified by OSS-HDFS.

[Optimized features]

  • Compaction logs are optimized for more efficient diagnostics and analysis.

  • The star mgr directory is created when CNs are started.

  • The default redirection configuration of FEs is optimized.

  • Unnecessary logs and configuration items are optimized to improve performance.

  • More I/O monitoring metrics are added for StarRocks shared-data instances to enhance the overall O&M capabilities.

  • The adaptive I/O strategies are optimized for StarRocks shared-data instances.

[Fixed issues]

  • External tables cannot be used to access data in Google Cloud Storage (GCS) and Microsoft Azure.

  • The data size of the spilling result exceeds 4 GB.

  • A pre-aggregation issue occurs during the spilling process.

  • MVs are not refreshed as expected.

  • An error occurs when memory statistics about primary keys are collected.

  • The statistical information is inaccurate.

  • Insert tasks cannot be canceled.

  • An incorrect execution plan is generated due to schema changes.

  • External tables cannot obtain information about CNs.

3.2.9-1.65

July 19, 2024

[New features]

  • Jindo SDK is upgraded to 6.5.0.

  • DLF catalogs are supported by Iceberg.

  • Shard rebalancing is supported by StarRocks shared-data instances.

  • The regexp_split function is supported.
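
A minimal sketch of the regexp_split function; the input string is arbitrary, and the shape of the result is an assumption based on community StarRocks.

    -- Hedged sketch: split a string by a regular expression into an array of
    -- substrings, for example ["a", "b", "c", "d"].
    SELECT regexp_split('a1b22c333d', '[0-9]+');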

[Optimized features]

  • The default value of the get_txn_status_internal_sec parameter is reduced from 30 seconds to 10 seconds to prevent delays in the publish phase of Stream Load.

  • The pindex_shared_data_gc_evict_interval_seconds parameter can be modified to change the garbage collection (GC) interval for the local persistent indexes of primary key tables created in StarRocks shared-data instances.

[Fixed issues]

  • CRC mismatch occurs when data is exported to OSS by using Jindo.

  • Files do not exist during the access to OSS-HDFS.

  • The "Lost Connection" error message appears when specific SQL statements are executed.

3.2.8-1.62

June 27, 2024

[New features]

  • Kudu and Paimon are supported by unified catalogs, and unified catalogs of the DLF type can be created.

  • Generation of lineage logs is supported.

  • The comments of external tables can be viewed by executing the DESCRIBE and SHOW CREATE statements.

[Fixed issues]

  • Data cannot be written to the external tables of StarRocks shared-data instances.

  • When the partition columns contain NULL values, the MVs of Paimon cannot be refreshed.

  • The memory statistics are inaccurate.

3.2.6-1.60

June 6, 2024

[New features]

  • SQL statements that fail to be parsed are marked as bad SQL statements.

  • Kudu connectors are supported.

[Optimized features]

The enable_pipeline_engine parameter is added.
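
A minimal sketch, assuming the parameter is exposed as a session variable as in community StarRocks.

    -- Hedged sketch: enable the pipeline engine for the current session or globally.
    SET enable_pipeline_engine = true;
    SET GLOBAL enable_pipeline_engine = true;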

[Fixed issues]

  • BEs crash because a migration task fails to obtain the schema of the source.

  • The memory statistics are inaccurate.

  • A performance issue occurs when Paimon reads data from read-only tables.

3.2.6-1.59

May 31, 2024

[New features]

Information about bad SQL statements is displayed in the query details.

[Optimized features]

Delete vectors can be used in queries on Paimon tables.

[Fixed issues]

  • Paimon catalogs cannot access custom DLF directories.

  • A warehouse must be specified during the creation of a Paimon catalog.

3.2.6-1.57

May 23, 2024

Note

If you use a StarRocks instance whose minor version is earlier than 3.2.6-1.57, we recommend that you upgrade the instance to 3.2.6-1.57 or a later minor version.

[New features]

The OPTIMIZE command can be used to optimize the bucket storage layout of internal tables created in StarRocks shared-data instances.

[Optimized features]

  • The fragment_profile_drop_threshold_ms configuration item is added for the FEs and can be configured in the EMR console based on your business requirements. The default value is 0, which disables this feature (see the sketch after this list).

  • The lake_flush_thread_num_per_store configuration item is added to control the number of threads that flush data of internal tables created in StarRocks shared-data instances. This improves write I/O throughput. The default value is calculated by using the following formula: 2 × Number of CPU cores.

  • I/O merge strategies are optimized to merge small files. This allows you to directly read the merged file.

  • The default values of the configuration items that are used for cross-instance migration are changed.
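
A minimal sketch of adjusting the new FE item at runtime; the value is a placeholder, and a change made this way is not persisted, so the EMR console configuration mentioned above is still needed for a permanent change. lake_flush_thread_num_per_store is a BE-side item and is not covered here.

    -- Hedged sketch: set the threshold (in milliseconds) for dropping fragment
    -- profiles; 0, the default, disables the feature. The value below is a placeholder.
    ADMIN SET FRONTEND CONFIG ("fragment_profile_drop_threshold_ms" = "5000");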

[Fixed issues]

  • The connection to the RPC interface report_exec_stat fails and cannot be established after retries. This issue may cause failures of INSERT INTO operations and profile collection.

  • After MV indexes are introduced, FEs frequently crash during metadata replay.

  • After the abstract syntax tree (AST) cache mechanism is introduced from version 3.2.6, specific materialized views cannot be created.

3.2.6-1.52

May 8, 2024

[New features]

DLF databases and tables can be created and managed in EMR Serverless StarRocks.

[Fixed issues]

  • After the lake_tablet_internal_parallel parameter is set to true, tablet metadata fails to be obtained and an error message is displayed.

  • An import job in INSERT INTO mode times out. The thrift_rpc_timeout_ms configuration item is added for the BEs to adjust the RPC timeout.

  • After an operation related to schema change is performed, no response is returned for a long period of time.

3.2.4-1.37

March 08, 2024

  • Cross-instance data migration is supported.

  • Size-tiered compaction operations can be performed on primary key tables.