DataWorks: JSON parsing

Last Updated: Feb 26, 2026

DataWorks Data Integration lets you use the JSON Parsing Component in real-time single-table synchronization Tasks to parse JSON data from a Source into structured table data.

Create and configure the JSON parsing component

Step 1: Configure a data integration task

  1. Create a data source. For more information, see Data source management.

  2. Create a Data Integration task. For more information, see Configure a real-time sync task in Data Integration.

    Note

    When the synchronization type for a Data Integration task is Single-table Real-time, you can add a data processing component between the Source and Destination components. For more information, see Supported data sources and synchronization solutions.

Step 2: Add the JSON parsing component

  1. In your real-time single-table synchronization Task, enable the Data Processing option, click +Add Node, and select the JSON Parsing Component.

  2. Enter a name and description for the Node, and then configure the JSON Parsing Component.

    Important

    To retrieve the JSON data structure, first perform Data Sampling on the Source, such as Kafka.

    Fixed fields

    • Get formatted JSON data.

      • Get JSON from data sampling: After Data Sampling, click Add Fixed Field For JSON Parsing to open the Fixed Field For JSON Parsing dialog box. From the Select Field list, choose the source Field that contains the JSON data. Then, click Get JSON Data Structure to retrieve the structure.

      • Get JSON from manual input: If you have not performed Data Sampling, or if the source data is empty, you can provide the JSON structure manually. Click the Edit JSON Text button to open the editor. In the Edit JSON Text window, enter your JSON content, and then click Return to Selection to choose Fields.

      Parse Leaf Nodes

      • In the formatted JSON Data Structure view, click the icon next to a leaf Field. A corresponding parsing rule is automatically added to the Fixed Output Fields section.

      • [Image: example of a table generated from parsing Leaf Nodes.]

      Parse JSON Objects

      In the JSON Data Structure view, locate the Field that you want to parse. If the Field is a JSON object, such as the address Field in the sample JSON, click the icon next to it. A dialog box appears with the following parsing options:

      • Add each key-value pair in the JSON object as a separate field. The key is used as the field name and is assigned its corresponding value.

        Result: The object is parsed into three Fields, street, city, and zip, each with its corresponding value.

      • Add the entire JSON object as a single field. The value is the JSON string of the object.

        Result: The entire address object is parsed into a single Field. The value of this Field is a JSON string that contains the street, city, and zip key-value pairs.
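The two object options can be illustrated with a minimal Python sketch. This is not the component's implementation: the function names, the output field naming, and the address values are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical sample record; the address values are illustrative only.
record = {"address": {"street": "1 Main St", "city": "Hangzhou", "zip": "310000"}}


def expand_object(rec, key):
    """Option 1: each key-value pair of the object becomes its own field."""
    return dict(rec[key])


def object_as_string(rec, key):
    """Option 2: the whole object becomes one field holding its JSON string."""
    return {key: json.dumps(rec[key])}


print(expand_object(record, "address"))     # three fields: street, city, zip
print(object_as_string(record, "address"))  # one field: address
```

The first option suits objects with a stable set of keys. The second keeps the object intact so that a downstream consumer can parse it later.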

      Parse JSON Arrays

      In the JSON Data Structure view, select the target Field. If the Field is a JSON array, a dialog box appears with the following parsing options:

      • Add the array as a multi-row output.

      • Add the entire array as a single field. The value is the JSON string of the array.

      The following examples use the sample JSON:

      • Click the icon next to the array field and select Add the array as a multi-row output.

      • Click the icon next to the array1 and array2 fields, and select Add the array as a multi-row output.

        Note

        This option does not recursively parse nested arrays within the key-value pairs of the primary array.

      • Click the icon next to the array field and select Add the entire array as a single field. The value is the JSON string of the array.

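As a rough illustration of the two array options, the following Python sketch expands an array into rows or keeps it as one JSON string. The function names, the sample record, and the choice to reuse the array's key as the output field name are assumptions, not documented behavior of the component.

```python
import json

# Hypothetical sample record with one array field.
record = {"id": 1, "array": ["a", "b"]}


def array_as_rows(rec, key):
    """Multi-row option: one output row per element, other fields repeated."""
    rest = {k: v for k, v in rec.items() if k != key}
    rows = []
    for element in rec[key]:
        row = dict(rest)
        # Elements are copied as-is; arrays nested inside an element
        # are not expanded further.
        row[key] = element
        rows.append(row)
    return rows


def array_as_single_field(rec, key):
    """Single-field option: the array is kept as its JSON string."""
    out = dict(rec)
    out[key] = json.dumps(rec[key])
    return out


print(array_as_rows(record, "array"))
print(array_as_single_field(record, "array"))
```
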
    • Alternatively, click Add A Field to add a Field manually. Use this option if you cannot retrieve values from an upstream Field or have not provided sample JSON. To define a parsing rule, manually edit the value path to retrieve the JSON content. The parameters are described below:

      • Field Name: The name of the new Field, which downstream Nodes reference.

      • Value: The JSON path to parse. The syntax is as follows:

        • $: Represents the Root Node.

        • .: Represents a Child Node.

        • [number]: Specifies an array index, starting from 0.

        • [*]: Expands an array into multiple rows. Each element is combined with the other Fields in the record to form a separate row that is output to downstream Nodes.

        Note

        JSON Field names in the path can contain only letters, numbers, hyphens (-), and underscores (_).

      • Default Value: The value to use when the JSON path does not exist, for example, because the Fields of the upstream table changed.

        • NULL: Sets the Field value to NULL.

        • Do Not Fill: The Field is not populated. Unlike NULL, if the corresponding field in the destination table has a default value configured, that default value is used instead of NULL.

        • Dirty Data: The record is counted as Dirty Data for the synchronization task. The task might stop based on your dirty data tolerance settings.

        • Manually enter a constant: Uses a user-defined constant as the Field's value.
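The path syntax for fixed fields can be sketched as a small evaluator. This is a minimal Python illustration under stated assumptions, not the component's implementation: parse_path, evaluate, PathMissing, and the sample document are hypothetical names, and failing the whole lookup when any step is missing is a simplification. The point where PathMissing is raised is where the Default Value setting would apply.

```python
import re

# One path step: .name, [number], or [*]. Names allow letters, digits, - and _.
_STEP = re.compile(r"\.([A-Za-z0-9_-]+)|\[(\d+)\]|\[(\*)\]")


class PathMissing(Exception):
    """Raised when the path does not exist in the document."""


def parse_path(path):
    """Split a path such as $.items[*].id into (kind, argument) steps."""
    if not path.startswith("$"):
        raise ValueError("path must start with $")
    steps, pos = [], 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at offset {pos}: {path!r}")
        name, index, _star = match.groups()
        if name is not None:
            steps.append(("key", name))
        elif index is not None:
            steps.append(("index", int(index)))
        else:
            steps.append(("expand", None))
        pos = match.end()
    return steps


def evaluate(doc, path):
    """Return the list of selected values; [*] can yield several."""
    values = [doc]
    for kind, arg in parse_path(path):
        selected = []
        for value in values:
            if kind == "key" and isinstance(value, dict) and arg in value:
                selected.append(value[arg])
            elif kind == "index" and isinstance(value, list) and arg < len(value):
                selected.append(value[arg])
            elif kind == "expand" and isinstance(value, list):
                selected.extend(value)
            else:
                raise PathMissing(path)
        values = selected
    return values


# Hypothetical sample document.
doc = {"user": {"name": "jack", "tags": ["a", "b"]}, "items": [{"id": 1}, {"id": 2}]}
print(evaluate(doc, "$.user.name"))
print(evaluate(doc, "$.user.tags[1]"))
print(evaluate(doc, "$.items[*].id"))
```

In this sketch, each value returned for a [*] path would become one output row, combined with the other Fields of the record.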

    Dynamic fields

    • In the formatted JSON view, select the target JSON object that you want to parse dynamically. The system automatically adds a parsing configuration for each field under that JSON object.

    • When the Task runs, the system processes each Field within the specified JSON object path. It uses the original JSON Field name and value (as a STRING) and adds them to the record before sending it to downstream Nodes. This ensures that any structural changes or new Fields added during synchronization are automatically identified and passed downstream.

    • Get formatted JSON data.

      • Get JSON from data sampling: After data sampling, click Add Dynamic Field For JSON Parsing to open the Dynamic Output Field For JSON Parsing dialog box. Select a source Field from the list, and then click Get JSON Data Structure to retrieve the JSON data structure.

      • Get JSON from manual input: If you have not performed data sampling, or if the source data is empty, you can provide the JSON manually. Click the Edit JSON Text button, enter your JSON content, and then click Return to Selection to choose Fields.

    • Dynamically parse JSON Objects

      • Configuration: [Image: dynamic field configuration.]

      • Assume a new field, c3, is added to the dynamic object. The parsing results before and after the change compare as follows:

        Before c3 is added

        _value_ (STRING):

        {
            "dynamic": {
                "c1": 2,
                "c2": ["a1","b1"]
            }
        }

        • c1 (STRING): 2

        • c2 (STRING): ["a1","b1"]

        • c3 (STRING): Not populated

        After c3 is added

        _value_ (STRING):

        {
            "dynamic": {
                "c1": 2,
                "c2": ["a1","b1"],
                "c3": {"name": "jack"}
            }
        }

        • c1 (STRING): 2

        • c2 (STRING): ["a1","b1"]

        • c3 (STRING): {"name": "jack"}
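The dynamic behavior in this comparison can be approximated with a short Python sketch. It assumes that each first-level key of the chosen object becomes a STRING field and that non-string values are serialized as JSON. The function name is hypothetical, and the exact whitespace of the serialized strings is an assumption.

```python
import json


def expand_dynamic(rec, key):
    """Add every first-level key of rec[key] as a STRING field."""
    fields = {}
    for name, value in rec[key].items():
        # Strings pass through; other values are serialized to a JSON string.
        fields[name] = value if isinstance(value, str) else json.dumps(value)
    return fields


before = expand_dynamic({"dynamic": {"c1": 2, "c2": ["a1", "b1"]}}, "dynamic")
after = expand_dynamic(
    {"dynamic": {"c1": 2, "c2": ["a1", "b1"], "c3": {"name": "jack"}}}, "dynamic"
)
print(before)  # no c3 field
print(after)   # c3 appears without any configuration change
```

Because the keys are read from each record at run time, a newly added key such as c3 flows downstream without editing the parsing rule.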

    • Manually Add a Field.

      You can manually define a dynamic Field parsing rule when you cannot retrieve values from upstream or have not provided sample JSON. Customize the rule by manually editing the path to the JSON object:

      • Specify JSON Object: The path to the JSON object to parse dynamically. The syntax is as follows:

        • $: Represents the Root Node.

        • .: Represents a Child Node.

        • [number]: Specifies an array index, starting from 0.

        Note: JSON Field names in the path can contain only letters, numbers, hyphens (-), and underscores (_).

      • Default Value: The behavior when the JSON path fails to resolve or the corresponding object does not exist.

        • Ignore: Dynamic parsing is not performed.

        • Dirty Data: The record is counted as Dirty Data for the synchronization task. The Task might stop based on your dirty data tolerance settings.

    • Policy for handling duplicate Field names.

      When a dynamic JSON object is expanded, only the first level of key-value pairs is processed. This policy determines how to handle conflicts if an expanded Field has the same name as an existing Field. The policies are as follows:

      • Overwrite: The value of the new, expanded Field replaces the value of the existing Field.

      • Discard: The value of the existing Field is kept, and the value of the new, expanded Field is discarded.

      • Error: The Task stops and reports an error.
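The three duplicate-name policies can be sketched as a merge function. This is a minimal Python illustration: merge_expanded, the lowercase policy strings, and the sample fields are hypothetical, and raising ValueError stands in for the Task stopping with an error.

```python
def merge_expanded(existing, expanded, policy):
    """Merge dynamically expanded fields into a record under a conflict policy."""
    if policy not in ("overwrite", "discard", "error"):
        raise ValueError(f"unknown policy: {policy}")
    merged = dict(existing)
    for name, value in expanded.items():
        if name not in merged:
            merged[name] = value  # no conflict: always added
        elif policy == "overwrite":
            merged[name] = value  # expanded value replaces the existing one
        elif policy == "error":
            raise ValueError(f"duplicate field name: {name}")
        # policy == "discard": keep the existing value, drop the expanded one
    return merged


existing = {"id": "1", "c1": "old"}
expanded = {"c1": "new", "c2": "x"}
print(merge_expanded(existing, expanded, "overwrite"))
print(merge_expanded(existing, expanded, "discard"))
```
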

Next steps

After you configure the Source and JSON Parsing Nodes, click Run Simulation to preview the output data from the current Node and verify that it meets your requirements.