A Semantic View decouples business semantics from the physical data model in AnalyticDB for MySQL, providing structured semantic context that improves the accuracy of LLM-generated SQL in NL2SQL workflows.
Semantic View is in private preview. To request access, submit a ticket or contact technical support.
Comparison
In traditional NL2SQL workflows, an LLM relies solely on table schemas to generate SQL. With multi-table joins and complex calculations, this approach is error-prone. For example, when asking an agent to "query net revenue by country in Asia for 1997" against a TPC-H dataset, common errors include:
Wrong JOIN paths: joining unrelated supply chain tables (partsupp) or querying the wrong database.
Incorrect calculations: subtracting supply cost (
ps_supplycost) when the metric requires revenue only.High token cost: full schemas for 5+ tables must be provided, often requiring multiple correction rounds.
Semantic View solves this by predefining dimensions, metrics, and relationships. The LLM reads these definitions and generates correct SQL in a single round.
Metric | Without Semantic View | With Semantic View |
Accuracy | Incorrect JOIN paths and calculation logic | Correct SQL in one shot |
Conversation rounds | 3 to 5 (multiple corrections) | 1 |
Context required | Full table schemas (5+ tables) | Semantic summary only |
Reusability | Must re-explain context each time | Define once, reuse permanently |
Prerequisites
An AnalyticDB for MySQL cluster is created. For more information, see Create a cluster.
Semantic View private preview access is enabled. Submit a ticket or contact technical support to request access.
An AI coding tool that supports Skills is available, such as Claude Code.
Procedure
Step 1: Prepare data and environment
Create a database and TPC-H tables.
CREATE DATABASE IF NOT EXISTS tpch_demo; USE tpch_demo;-- Create the orders table CREATE TABLE IF NOT EXISTS orders ( o_orderkey BIGINT NOT NULL, o_custkey BIGINT NOT NULL, o_orderstatus VARCHAR(1), o_totalprice DECIMAL(15,2), o_orderdate DATE, o_orderpriority VARCHAR(15), o_clerk VARCHAR(15), o_shippriority INT, o_comment VARCHAR(79), PRIMARY KEY (o_orderkey) ); -- Create the lineitem table CREATE TABLE IF NOT EXISTS lineitem ( l_orderkey BIGINT NOT NULL, l_partkey BIGINT NOT NULL, l_suppkey BIGINT NOT NULL, l_linenumber INT NOT NULL, l_quantity DECIMAL(15,2), l_extendedprice DECIMAL(15,2), l_discount DECIMAL(15,2), l_tax DECIMAL(15,2), l_returnflag VARCHAR(1), l_linestatus VARCHAR(1), l_shipdate DATE, l_commitdate DATE, l_receiptdate DATE, l_shipinstruct VARCHAR(25), l_shipmode VARCHAR(10), l_comment VARCHAR(44), PRIMARY KEY (l_orderkey, l_linenumber) ); -- Create the customer table CREATE TABLE IF NOT EXISTS customer ( c_custkey BIGINT NOT NULL, c_name VARCHAR(25), c_address VARCHAR(40), c_nationkey BIGINT, c_phone VARCHAR(15), c_acctbal DECIMAL(15,2), c_mktsegment VARCHAR(10), c_comment VARCHAR(117), PRIMARY KEY (c_custkey) );Load TPC-H sample data. Generate data with the
dbgentool or download a dataset from the TPC-H website.NoteIf you already have a database with TPC-H data, skip the table creation and replace
tpch_demowith your actual database name in the following examples.Set database connection environment variables in your terminal.
export ADB_MYSQL_HOST="amv-xxxxx.ads.aliyuncs.com" export ADB_MYSQL_PORT="3306" export ADB_MYSQL_USER="your_username" export ADB_MYSQL_PASSWORD="your_password"
Step 2: Install the Skill
Install the AnalyticDB for MySQL Smart Analyst Skill to manage Semantic Views and run NL2SQL queries through natural language.
Run the following command to install the Skill:
npx skills add aliyun/alibabacloud-adb-mysql-mcp-serverAlternatively, install manually:
git clone https://github.com/aliyun/alibabacloud-adb-mysql-mcp-server cd alibabacloud-adb-mysql-mcp-server/skill mkdir -p ~/.claude/skills/ cp -r alibabacloud-adb-smart-analyst ~/.claude/skills/Verify database connectivity.
/alibabacloud-adb-smart-analyst connectivity testExpected output on success:
Connectivity test passed SQL executed: SELECT 1 AS connectivity_test ┌───────────────────┐ │ connectivity_test │ ├───────────────────┤ │ 1 │ └───────────────────┘ Database connection is active.If the connection fails, the Skill prompts you to set environment variables. Refer to Step 1.
Step 3: Create and manage a Semantic View
A Semantic View uses YAML to define business semantics, embedded in a CREATE OR REPLACE SEMANTIC VIEW statement. The YAML contains the following elements:
Element | Description |
dimensions | Fields used in |
time_dimensions | Date fields used for time-range filtering and aggregation |
facts | Raw numeric fields (price, discount) that serve as the basis for metric calculations |
metrics | Aggregate expressions based on facts (total revenue, average order value) representing business KPIs |
filters | Named filter conditions that encapsulate common business filtering logic |
relationships | Table join paths that enable correct multi-table queries |
Create a sales analytics Semantic View based on the lineitem, orders, and customer tables:
CREATE OR REPLACE SEMANTIC VIEW tpch_sales_analytics
LANGUAGE YAML
AS
$$
name: tpch_sales_analytics
description: Sales analytics semantic view for TPC-H. Supports analysis by customer, order, and time dimensions.
tables:
- name: lineitem
base_table:
schema: tpch_demo
table: lineitem
dimensions:
- name: return_flag
description: Return status. R = returned, A = accepted, N = not processed.
expr: l_returnflag
data_type: varchar
- name: line_status
description: Line item status. O = open, F = fulfilled.
expr: l_linestatus
data_type: varchar
- name: ship_mode
description: Shipping method. Values include TRUCK, AIR, SHIP, RAIL, and MAIL.
expr: l_shipmode
data_type: varchar
time_dimensions:
- name: ship_date
description: Date the item was shipped
expr: l_shipdate
data_type: date
facts:
- name: extended_price
description: Original item price before discount
expr: l_extendedprice
data_type: decimal
- name: discount
description: Discount rate, ranging from 0 to 1
expr: l_discount
data_type: decimal
- name: quantity
description: Number of items
expr: l_quantity
data_type: decimal
metrics:
- name: revenue
description: Total revenue after discount, calculated as SUM(price * (1 - discount))
expr: SUM(l_extendedprice * (1 - l_discount))
- name: total_quantity
description: Total number of items
expr: SUM(l_quantity)
- name: avg_price
description: Average item price
expr: AVG(l_extendedprice)
- name: late_shipping_rate
description: Percentage of items received after the committed date
expr: COUNT(CASE WHEN l_receiptdate > l_commitdate THEN 1 END) * 1.0 / COUNT(*)
filters:
- name: completed_only
description: Include only fulfilled line items
expr: l_linestatus = 'F'
- name: orders
base_table:
schema: tpch_demo
table: orders
dimensions:
- name: order_status
description: Order status. O = open, F = fulfilled, P = partially fulfilled.
expr: o_orderstatus
data_type: varchar
- name: order_priority
description: Order priority. Values are 1-URGENT, 2-HIGH, 3-MEDIUM, 4-NOT SPECIFIED, and 5-LOW.
expr: o_orderpriority
data_type: varchar
time_dimensions:
- name: order_date
description: Date the order was placed
expr: o_orderdate
data_type: date
facts:
- name: total_price
description: Total order amount
expr: o_totalprice
data_type: decimal
metrics:
- name: order_count
description: Total number of orders
expr: COUNT(*)
- name: avg_order_value
description: Average order amount
expr: AVG(o_totalprice)
- name: customer
base_table:
schema: tpch_demo
table: customer
dimensions:
- name: market_segment
description: Customer market segment. Values are AUTOMOBILE, BUILDING, FURNITURE, HOUSEHOLD, and MACHINERY.
expr: c_mktsegment
data_type: varchar
- name: customer_name
description: Customer name
expr: c_name
data_type: varchar
facts:
- name: account_balance
description: Customer account balance
expr: c_acctbal
data_type: decimal
relationships:
- name: lineitem_to_orders
left_table: lineitem
right_table: orders
relationship_type: many_to_one
on:
- left_expr: l_orderkey
right_expr: o_orderkey
- name: orders_to_customer
left_table: orders
right_table: customer
relationship_type: many_to_one
on:
- left_expr: o_custkey
right_expr: c_custkey
$$;Verify the Semantic View:
SELECT view_schema, view_name, definition
FROM information_schema.semantic_views
WHERE view_name = 'tpch_sales_analytics';You can also manage Semantic Views through natural language with the Skill:
-- Query a specific semantic view
/alibabacloud-adb-smart-analyst show the tpch_sales_analytics semantic view
-- List all semantic views
/alibabacloud-adb-smart-analyst list all semantic views
-- Delete a semantic view
/alibabacloud-adb-smart-analyst delete tpch_demo.tpch_sales_analyticsStep 4: Run NL2SQL queries
With the Semantic View created, use the Skill to query data in natural language. The Skill automatically performs semantic search, schema validation, and SQL execution.
Example:
/alibabacloud-adb-smart-analyst total revenue by shipping methodResult:
Query results
SQL executed:
SELECT l_shipmode, SUM(l_extendedprice * (1 - l_discount)) AS revenue
FROM tpch_demo.lineitem
GROUP BY l_shipmode
┌───────────┬─────────────┐
│ l_shipmode│ revenue │
├───────────┼─────────────┤
│ AIR │ 258,217,431 │
│ FOB │ 261,609,629 │
│ MAIL │ 259,103,165 │
│ RAIL │ 260,370,880 │
│ REG AIR │ 260,012,778 │
│ SHIP │ 261,260,339 │
│ TRUCK │ 261,419,202 │
└───────────┴─────────────┘More query examples:
Natural language query | Generated SQL |
Order count and average order value by market segment in H2 1995 |
|
Shipping method with the highest late delivery rate |
|
Design guidelines
Write detailed descriptions
The description field is the primary source of business context for the LLM. Include the business meaning, allowed values (for enums), and calculation logic (for metrics).
dimensions:
- name: order_status
# Good: lists all values and their meanings
description: Order status. O = open, F = fulfilled, P = partially fulfilled.
expr: o_orderstatus
data_type: varcharModel dimensions, metrics, and relationships correctly
Properly classify fields and define precise table joins:
Dimensions (dimensions/time_dimensions): fields typically used in
GROUP BY, such as categorical attributes or time fields.Facts → Metrics: define raw numeric fields as facts first, then build aggregate metrics on top of them.
Relationships: use
many_to_onefrom fact tables to dimension tables. Include only the joins actually needed.
Use filters for common conditions
Define frequently used filter conditions as named filters. The LLM applies them directly when users phrase queries with business terminology.
filters:
- name: completed_only
description: Include only fulfilled line items
expr: l_linestatus = 'F'
- name: high_value_orders
description: Orders with a total amount greater than 10000
expr: o_totalprice > 10000Split by topic and iterate
Focus each Semantic View on a single analytical domain (such as "sales" or "supply chain"). Exclude internal technical fields like partition keys and audit columns.
Use CREATE OR REPLACE to iterate: add descriptions, define filters, or adjust metric expressions based on observed NL2SQL error patterns.
Limits
Item | Description |
Feature status | Semantic View is in private preview. Features and syntax may change before general availability. |
YAML syntax | YAML is indentation-sensitive. Ensure correct indentation in the definition. |
Metric expressions | The |
Table references | The |