Presto is an open-source distributed SQL query engine. It is used to run interactive analytic queries.

Basic features

Presto is implemented in Java. It is easy to use and offers high performance and strong scalability. Presto provides the following features:
  • Supports American National Standards Institute (ANSI) SQL.
  • Supports various data sources:
    • Hive
    • Cassandra
    • Kafka
    • MongoDB
    • MySQL
    • PostgreSQL
    • SQL Server
    • Redis
    • Redshift
    • Local files
  • Supports advanced data structures:
    • Array and map data
    • JSON data
    • GIS data
    • Color data
  • Offers strong extensibility:
    • Custom data connectors
    • Custom data types
    • Custom SQL functions

    You can add these extensions based on your business requirements to improve processing efficiency.

  • Uses a pipelined execution model, so results are returned as they are produced rather than after the entire query finishes.
  • Provides monitoring interfaces:
    • A web UI, on which you can view how queries are executed.
    • Support for Java Management Extensions (JMX).
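
The pipeline model mentioned in the feature list can be illustrated with a small sketch. This is not Presto code. It uses Python generators to show the general idea of a pipeline in which each stage passes rows downstream as soon as they are ready. The stage names (scan, filter_stage, project) and the sample orders data are hypothetical and exist only for this illustration.

```python
from typing import Dict, Iterable, Iterator

Row = Dict[str, int]


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Source stage: hand rows downstream one at a time."""
    for row in rows:
        yield row


def filter_stage(rows: Iterable[Row], min_amount: int) -> Iterator[Row]:
    """Keep only rows whose amount is at least min_amount."""
    for row in rows:
        if row["amount"] >= min_amount:
            yield row


def project(rows: Iterable[Row], column: str) -> Iterator[int]:
    """Emit a single column from each row."""
    for row in rows:
        yield row[column]


# Hypothetical sample data.
orders = [
    {"id": 1, "amount": 5},
    {"id": 2, "amount": 50},
    {"id": 3, "amount": 75},
]

# Stages are chained lazily; nothing runs until a result is requested.
pipeline = project(filter_stage(scan(orders), min_amount=10), "id")

# The first matching result is available before the last row is scanned.
first = next(pipeline)
rest = list(pipeline)
```

Because every stage is lazy, the first result is produced after only two input rows have been read. A real distributed engine adds parallelism and network exchange on top of this idea.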

Scenarios

Presto is a distributed SQL query engine for data warehousing and data analytics services. It can be used in the following scenarios:
  • Extract, transform, load (ETL)
  • Ad hoc queries
  • Analysis of large volumes of structured or semi-structured data
  • Aggregation of large volumes of multidimensional data, and report analysis
Notice: Presto is a data warehousing product. It offers limited support for transactions and is not suitable for online business scenarios.

Benefits

EMR Presto has the following advantages over open-source Presto:
  • You can quickly deploy a Presto cluster with hundreds of nodes.
  • EMR Presto supports auto scaling. You can easily scale out a Presto cluster.
  • EMR Presto can process data stored in OSS buckets.
  • EMR Presto requires no operations and maintenance (O&M) work on your part and is backed by 24/7 service.

References

The Presto version depends on the EMR version that you select when you create a cluster. For the mapping between EMR versions and Presto versions, see Version overview.

The URL of the open-source Presto documentation depends on the Presto version.
  • If the Presto version is 3XX, visit prestosql.io/docs/3XX/.

    For example, visit prestosql.io/docs/331/ to view the Presto 331 documentation.

  • If the Presto version is 0.2XX, visit prestodb.io/docs/0.2XX/.

    For example, visit prestodb.io/docs/0.228/ to view the Presto 0.228 documentation.
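
The version-to-URL rule above can be sketched as a small helper. The function name presto_docs_url is hypothetical. The sketch assumes that "3XX" means the digit 3 followed by two digits and that "0.2XX" means 0.2 followed by two digits. Any other version string is rejected rather than guessed at.

```python
import re


def presto_docs_url(version: str) -> str:
    """Return the documentation URL for a Presto version string.

    Only the two version patterns described above are handled
    (assumption: XX stands for exactly two digits).
    """
    if re.fullmatch(r"3\d{2}", version):
        # 3XX releases are documented on prestosql.io.
        return f"prestosql.io/docs/{version}/"
    if re.fullmatch(r"0\.2\d{2}", version):
        # 0.2XX releases are documented on prestodb.io.
        return f"prestodb.io/docs/{version}/"
    raise ValueError(f"No documentation URL rule for Presto version {version!r}")


url_331 = presto_docs_url("331")
url_0228 = presto_docs_url("0.228")
```

With the two example versions from this section, the helper returns prestosql.io/docs/331/ and prestodb.io/docs/0.228/.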