This topic compares the new and old versions of the data transformation feature, and provides guidance on how to choose between the two versions.
Version comparison
Comparison item | New version | Old version |
Data transformation syntax | Processing language (SPL). For more information, see SPL syntax. | Domain-specific language (DSL). For more information, see Data transformation syntax. |
Scenarios | | |
Dependency on consumer groups of source logstores | Does not depend on consumer groups. | Depends on consumer groups. |
Version selection
Scenario considerations
The new version does not support data enrichment. Use the old version if you need to join data with dimension tables whose sources are Simple Log Service logstores, Object Storage Service (OSS) objects, or ApsaraDB RDS tables. Typical cases include mapping IP addresses to geographic locations and synchronizing data across regions.
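To make the term "data enrichment" concrete, the following minimal sketch shows the general idea of joining a log with a dimension table in plain Python. It does not use DSL functions or any Simple Log Service API. The table contents, the field names `client_ip` and `location`, and the helper `enrich` are all hypothetical and chosen only for illustration.

```python
# Hypothetical dimension table: IP prefix -> location label.
# The prefixes are documentation-only address ranges, not real data.
DIMENSION_TABLE = {
    "192.0.2.": "Region-A",
    "198.51.100.": "Region-B",
}


def enrich(log: dict, table: dict) -> dict:
    """Return a copy of the log with a 'location' field looked up in the table."""
    enriched = dict(log)  # leave the input log unchanged
    ip = log.get("client_ip", "")
    for prefix, location in table.items():
        if ip.startswith(prefix):
            enriched["location"] = location
            break
    else:
        # No prefix matched: mark the location as unknown.
        enriched["location"] = "unknown"
    return enriched


logs = [
    {"client_ip": "192.0.2.17", "status": "200"},
    {"client_ip": "203.0.113.5", "status": "500"},
]
enriched_logs = [enrich(log, DIMENSION_TABLE) for log in logs]
```

In the old version of data transformation, this kind of lookup is what associating a dimension table with a data source provides. The sketch only models the concept.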
Expense considerations
Thanks to technological upgrades, data transformation in the new version costs only 33.33% of what it costs in the old version. For more information, see Billable items of pay-by-feature. The new version is therefore recommended whenever it fits your scenario.
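As a rough illustration of the ratio stated above, the following sketch estimates the cost of the new version from the cost of the old version. Only the 33.33% ratio comes from this topic. The monthly cost figure and the function name `estimated_new_cost` are hypothetical, and actual fees depend on the billable items of your account.

```python
# Ratio stated in this topic: new-version cost is 33.33% of old-version cost.
NEW_TO_OLD_COST_RATIO = 0.3333


def estimated_new_cost(old_cost: float) -> float:
    """Estimate the new-version cost from the old-version cost."""
    return old_cost * NEW_TO_OLD_COST_RATIO


old_monthly_cost = 300.0  # hypothetical figure, in any currency unit
new_monthly_cost = estimated_new_cost(old_monthly_cost)
monthly_savings = old_monthly_cost - new_monthly_cost
```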
Syntax comparison
SPL in the new version is easier to use than DSL in the old version, for the following reasons.
DSL is a subset of the Python syntax. DSL rules are written as function calls and require many syntax symbols, which complicates their use. By contrast, SPL uses shell-style commands that keep syntax symbols to a minimum.
DSL uses the v function to reference field values, such as v("field"), while SPL references fields directly, such as where field='ERROR'.
DSL invokes functions in the form func(arg1, arg2), while SPL uses the more concise command form | cmd arg1, arg2.
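The difference between nested function calls and piped commands can be modeled in plain Python. The sketch below is only an analogy: the helpers `where`, `extend`, and `pipeline` are hypothetical Python functions named after the style of SPL commands, not the SPL implementation, and the sample rows are invented.

```python
from functools import reduce


def where(predicate):
    """Build a command that keeps only the rows matching the predicate."""
    return lambda rows: [row for row in rows if predicate(row)]


def extend(name, compute):
    """Build a command that adds a computed field to every row."""
    return lambda rows: [{**row, name: compute(row)} for row in rows]


def pipeline(rows, *commands):
    """Apply commands left to right, feeding each output into the next command."""
    return reduce(lambda acc, command: command(acc), commands, rows)


rows = [
    {"level": "ERROR", "sec": "0.5"},
    {"level": "INFO", "sec": "0.25"},
]

# Reads in the same order as a piped rule: filter first, then add a field.
result = pipeline(
    rows,
    where(lambda row: row["level"] == "ERROR"),
    extend("ms", lambda row: float(row["sec"]) * 1000),
)
```

Each stage consumes the output of the previous one, so the rule reads top to bottom instead of inside out.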
In DSL, field values are always strings, and the intermediate results of type conversion are not retained. SPL, however, retains the type of a temporary field during processing. For more information, see the Type retention section of the "General references" topic.
DSL in the old version:
The ct_float function must be invoked twice, because the value written by e_set is stored as a string.
e_set("ms", ct_float(v("sec"))*1000)
e_keep(ct_float(v("ms")) > 500)
SPL in the new version:
Type conversion is performed only once, so the processing logic is more concise.
| extend ms=cast(sec as double)*1000
| where ms>500
SPL reuses the SQL functions of Simple Log Service (SLS), which makes SPL rules easier to learn. For more information, see SQL functions.
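The effect of type retention described in this section can be sketched in plain Python. This is a conceptual model under stated assumptions, not the behavior of the actual DSL or SPL engines: the helper `set_field_as_string` is hypothetical and simply imitates a store in which every field value is a string, and the sample event is invented.

```python
def set_field_as_string(event: dict, name: str, value) -> dict:
    """Model a string-only field store: every written value becomes a string."""
    updated = dict(event)
    updated[name] = str(value)
    return updated


event = {"sec": "0.75"}

# String-only model: the computed value is stored as text,
# so the comparison needs a second conversion.
string_event = set_field_as_string(event, "ms", float(event["sec"]) * 1000)
keep_in_string_model = float(string_event["ms"]) > 500

# Type-retaining model: the temporary field stays a float,
# so it can be compared directly.
typed_event = {**event, "ms": float(event["sec"]) * 1000}
keep_in_typed_model = typed_event["ms"] > 500
```

Both models reach the same decision. The type-retaining model reaches it with one conversion instead of two.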