ApsaraDB for SelectDB

A cloud-native real-time data warehouse based on Apache Doris, providing high-performance and easy-to-use data analysis services.

Overview

ApsaraDB for SelectDB is a next-generation cloud-native real-time data warehouse service based on Apache Doris. It is easy to use and open source, and it offers cloud-native storage-computing separation, real-time query performance, quick integration, and excellent compatibility. ApsaraDB for SelectDB provides real-time report queries at tens of thousands of queries per second (QPS), multidimensional ad hoc analysis with subsecond latency, log analysis solutions that are 10 times more cost-effective than other solutions, and data lakehouse analysis platforms that can reduce costs by up to 80%.

Benefits

  • Cloud-native architecture: resolve cost and scalability issues

    The cloud-native architecture separates storage and computing resources, so you can scale each independently based on your business requirements. All data is stored in Object Storage Service (OSS) buckets, which are stable and cost-effective. The unit price for storage is reduced by 90%. Data can be shared among multiple computing clusters, which prevents storage redundancy and provides powerful physical and logical isolation capabilities. The total cost of ownership (TCO) of ApsaraDB for SelectDB is 50% lower than that of a self-managed data warehouse.

  • Real-time query: resolve performance issues

    ApsaraDB for SelectDB provides excellent query performance in wide table aggregation, multi-table association analysis, and high-concurrency point query scenarios. It ranks at the top of the ClickBench global analytical database list and leads worldwide on multiple metrics. ApsaraDB for SelectDB supports high-concurrency data import and update in real time, so data can be analyzed within seconds of being generated.

  • Ease of use: resolve usability issues

    ApsaraDB for SelectDB supports various easy-to-use methods for importing data quickly. It is compatible with the MySQL connection protocol and syntax, and can be seamlessly integrated with dozens of database and big data ecosystem services, which lowers the learning curve for users. ApsaraDB for SelectDB also provides visualized development tools to simplify the data development process.

Features

Real-time Scaling

ApsaraDB for SelectDB supports multiple computing clusters. You can scale your clusters out or in based on your business requirements. Scaling completes within minutes without business interruptions.

Data Lake Analysis

ApsaraDB for SelectDB supports various types of data lakes, such as Hive, Iceberg, and Hudi, and allows you to query and write back data in data lakes.

Semi-structured Data Analysis

ApsaraDB for SelectDB provides simple and fast semi-structured data analysis capabilities and supports all types of variants and inverted indexes.

Scenarios

Traditional solutions face several challenges: latency of hours between data generation and data visibility, slow query response, low concurrency (only tens of concurrent queries are supported), data loss or duplication, and poor service availability. ApsaraDB for SelectDB allows you to process online, high-concurrency report queries and provides real-time, fast, stable, and highly available services.

Benefits

  • Real-time data write

    ApsaraDB for SelectDB supports writing millions of data records per second and can be integrated with database ecosystems such as MySQL, PostgreSQL, and Oracle, and with big data ecosystems such as Flink, Kafka, and DataWorks. This simplifies the data writing process.

  • Subsecond query response

    ApsaraDB for SelectDB uses a new query optimizer, a high-performance pipeline execution engine, and various types of indexes to accelerate queries by orders of magnitude. It provides highly consistent materialized views that are created from the results of multi-table aggregate queries, and supports automatic query rewrite to deliver subsecond responses for statistical aggregation queries.

  • Up to 10,000 concurrent queries

    ApsaraDB for SelectDB supports data pruning by partitions and buckets, data skipping indexes (zone map and Bloom filter), and point query indexes (primary key and inverted index). These reduce the amount of data to be read and increase the number of queries that can run concurrently. Combined with hybrid row-column storage and a custom query optimizer, a single machine can support highly concurrent point queries at tens of thousands of QPS.
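
The data skipping idea behind a zone map index can be illustrated with a small sketch. This is a toy Python model under stated assumptions, not the ApsaraDB for SelectDB or Apache Doris implementation: the `Block` class, the `scan_equal` function, and the sample values are all hypothetical. Each block exposes the minimum and maximum of its values, and a lookup skips any block whose range cannot contain the target.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A hypothetical storage block holding the values of one column."""
    values: List[int]

    # A real index would store min/max precomputed; here they are
    # derived on demand to keep the sketch short.
    @property
    def min_value(self) -> int:
        return min(self.values)

    @property
    def max_value(self) -> int:
        return max(self.values)


def scan_equal(blocks: List[Block], target: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of blocks actually read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Zone map check: skip the block if target lies outside [min, max].
        if target < block.min_value or target > block.max_value:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if v == target)
    return matches, blocks_read


blocks = [Block([1, 5, 9]), Block([10, 12, 19]), Block([20, 25, 30])]
matches, blocks_read = scan_equal(blocks, 12)
print(matches, blocks_read)  # [12] 1
```

In this example only one of the three blocks is read, which is the effect the text describes as reducing the amount of data to be read.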

Traditional solutions have several issues: complex computing and analysis processes, slow query response, inflexible table schemas that adapt poorly to business changes, and delayed updates when data changes. ApsaraDB for SelectDB allows you to build a multidimensional analysis platform and implement customized, fine-grained operations such as user profile and behavior analysis. This way, you can target users more accurately and drive business development.

Benefits

  • High-performance data update

    ApsaraDB for SelectDB provides the high-concurrency data update capability that supports data updates in rows or specific columns without periodically recalculating a large amount of historical data offline. This ensures data timeliness within seconds. ApsaraDB for SelectDB provides simple and efficient built-in extract, transform, and load (ETL) capabilities. You can execute SQL statements to easily process and convert data.

  • Lightweight table schema change

    ApsaraDB for SelectDB supports lightweight table schema changes that complete online within seconds. It provides various semi-structured data types such as Map, Array, and JSON, and a high-performance wide table processing capability that handles thousands of columns. This fully meets the requirements for business flexibility and diversity.

  • Interactive analysis in seconds

    ApsaraDB for SelectDB provides various ad hoc analysis functions such as retention analysis functions and profile analysis functions, and orthogonal bitmap processing capabilities. This greatly simplifies the development process of multidimensional ad hoc analysis and implements interactive data analysis within seconds.

Logging scenarios involve large volumes of logs and require high-throughput writes and real-time data visibility, which makes reducing resource costs a major challenge. Logging scenarios also require fast text retrieval for tasks such as troubleshooting and full-text search. ApsaraDB for SelectDB uses capabilities such as storage-computing separation, column-oriented data storage, and inverted indexes to implement real-time query, low-cost storage, and high-efficiency processing of massive logs. This is an alternative solution that is 10 times more cost-effective than Elasticsearch solutions.

Benefits

  • Real-time writing of massive data

    ApsaraDB for SelectDB provides optimized high-performance inverted indexes. The write speed is four times faster than that of Elasticsearch inverted indexes. The server uses a group commit mechanism, which raises real-time write throughput to several gigabytes per second while keeping written data visible within seconds.

  • Cost-effective data storage

    ApsaraDB for SelectDB uses column-oriented storage, simplified inverted indexes, and high-ratio compression. The occupied storage space is 20% of that of Elasticsearch. ApsaraDB for SelectDB uses the architecture that separates storage and computing resources. The cost of a unit storage space is 33.3% of that of Elasticsearch, and the total cost is 6.7% of that of Elasticsearch.

  • Efficient query process

    ApsaraDB for SelectDB allows you to narrow down the data scope of queries by partitioning and bucketing, and filtering time ranges. You can use inverted indexes and keywords to quickly find the matched log rows. This prevents large-scale data scans and achieves response in seconds.
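
The total-cost figure quoted in the storage comparison above follows from multiplying the two ratios: the relative storage space and the relative cost per unit of space. A minimal Python sketch of that arithmetic, where the function name `relative_total_cost` is illustrative and the inputs are the percentages quoted above:

```python
def relative_total_cost(space_ratio: float, unit_cost_ratio: float) -> float:
    """Storage cost relative to a baseline.

    space_ratio:     occupied space as a fraction of the baseline's space
    unit_cost_ratio: cost per unit of space as a fraction of the baseline's
    """
    return space_ratio * unit_cost_ratio


# Figures from the text: 20% of the space at 33.3% of the unit cost.
total = relative_total_cost(0.20, 0.333)
print(f"{total:.1%}")  # 6.7%
```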

Traditional big data platform solutions meet the complex and diverse requirements for big data analysis by combining multiple sets of data lake query engines and data warehouse systems. This causes issues such as high costs of labor and resources, complex data development and use, and poor timeliness of data analysis. ApsaraDB for SelectDB allows you to build a data lakehouse analysis system to quickly meet your data analysis requirements in a low-cost and efficient way.

Benefits

  • Significantly reduced costs

    ApsaraDB for SelectDB can support various analysis requirements with a single system, which greatly reduces the need to build redundant systems. This lowers the labor and maintenance costs of big data platforms and the overhead of redundant resources. Overall costs can be reduced by up to 80%.

  • Simplified and unified development

    ApsaraDB for SelectDB supports data lakehouses and lightweight extract, load, and transform (ELT) capabilities. This allows you to seamlessly complete data synchronization and data cleansing from data source to data warehouse without relying on Spark or Flink. You can use ApsaraDB for SelectDB as a unified query gateway without the need to switch among multiple systems or consider the compatibility of SQL syntax.

  • Quick data analysis

    ApsaraDB for SelectDB provides a leading query and analysis engine, and supports data caching and statistics collection features. Its analysis performance can be three to five times that of Presto and Trino. You can use elastic computing resources and internal tables to accelerate views and further improve performance.
