Writing a JavaScript Interpreter – A Detailed Interpretation of AST and Its Application

This article introduces the concept of the Abstract Syntax Tree (AST), how an AST-based workflow operates, and what AST makes possible.

By Chengwei

1. What Is AST?

1.1 AST: Abstract Syntax Tree

An AST is a tree representation of source code in which each node stands for a construct such as a declaration, an expression, or an identifier. The devDependencies of mainstream projects include various modules, such as JavaScript transpilers, CSS preprocessors, ESLint, and Prettier. These modules are all based on AST. They are not used in the production environment, but they play an important role in the development process.

1.2 AST Workflow

  • Parse: Parses the source code into an AST.
  • Transform: Adds, deletes, replaces, or appends nodes in the AST. In business development, about 95% of the code you write belongs to this step.
  • Generate: Converts the AST back into code.
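
The three stages can be illustrated without any library. The sketch below skips the parse stage by writing the AST for `num * num` by hand, using the `BinaryExpression` and `Identifier` node shapes, and then applies a toy transform and a toy generator. The functions `transform` and `generate` are illustrative stand-ins written for this example, not Babel's API, and they only understand these two node types.

```javascript
// Hand-written AST for the expression `num * num` (what a parser would produce).
const ast = {
  type: 'BinaryExpression',
  operator: '*',
  left: { type: 'Identifier', name: 'num' },
  right: { type: 'Identifier', name: 'num' },
};

// Transform: return a copy of the tree with every Identifier `from` renamed to `to`.
function transform(node, from, to) {
  if (node.type === 'Identifier') {
    return node.name === from ? { ...node, name: to } : node;
  }
  if (node.type === 'BinaryExpression') {
    return {
      ...node,
      left: transform(node.left, from, to),
      right: transform(node.right, from, to),
    };
  }
  throw new Error(`Unsupported node type: ${node.type}`);
}

// Generate: turn the tree back into source text.
function generate(node) {
  if (node.type === 'Identifier') return node.name;
  if (node.type === 'BinaryExpression') {
    return `${generate(node.left)} ${node.operator} ${generate(node.right)}`;
  }
  throw new Error(`Unsupported node type: ${node.type}`);
}

const output = generate(transform(ast, 'num', 'n'));
console.log(output); // n * n
```

A real toolchain replaces the hand-written tree with the parser's output, but the division of labor between the stages stays the same.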

1.3 AST Preview

AST Auxiliary Development Tool

2. Start from a Simple Requirement

Pseudo-Requirement for Code Compression: Simplify the parameter of the square function and its references by renaming the variable num to n.

Solution 1: Using Replace Statement for Brute-Force Conversion

const sourceText = `function square(num) {
  return num * num;
}`;
const result = sourceText.replace(/num/g, 'n'); // Replaces every occurrence of "num"

This approach is crude and bug-prone, so it is rarely usable in practice. For example, any occurrence of "num" inside a string literal is converted as well:

// Before conversion
function square(num) {
  return num * num;
}
console.log('param 2 result num is ' + square(2));

// After conversion
function square(n) {
  return n * n;
}
console.log('param 2 result n is ' + square(2));
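
A tighter pattern does not rescue this approach. The sketch below adds word boundaries so that a name such as `number` is left alone, yet the text inside the string literal is still rewritten, because a regular expression has no notion of what is code and what is a string. The sample source is made up for illustration.

```javascript
const source = [
  'function square(num) {',
  '  return num * num;',
  '}',
  "console.log('param 2 result num is ' + square(2));",
  'const number = 1;',
].join('\n');

// \b keeps `number` intact, but the word "num" inside the string still matches.
const replaced = source.replace(/\bnum\b/g, 'n');
console.log(replaced);
```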

Solution 2: Using Babel for Conversion

module.exports = () => {
  return {
    visitor: {
      // Define the visitor and traverse the Identifier
      Identifier(path) {
        if (path.node.name === 'num') {
          path.node.name = 'n'; // Convert the variable name
        }
      }
    }
  }
};

The plugin defines an Identifier visitor, which is called for every Identifier node in the AST. If the identifier's name is "num", it is renamed. This solves the problem of "num" being converted inside strings, because a string literal is not an Identifier. However, another problem remains. For example, the output is wrong for code like this:

// Before conversion
function square(num) {
  return num * num;
}
console.log('global num is ' + window.num);

// After conversion
function square(n) {
  return n * n;
}
console.log('global num is ' + window.n); // Wrong: window.num was renamed too

The property in window.num is also an Identifier node, so the visitor matches and renames it. The output code reads window.n, which changes the behavior of the program. The requirement contains three keywords: the square function, its parameter, and the references to that parameter. The code needs to be refined around these keywords.

Solution 2 Upgrade: Finding the Reference Relationship

module.exports = () => {
  return {
    visitor: {
      Identifier(path) {
        // Three criteria
        if (path.node.name !== 'num') { // The variable must be named num
          return;
        }
        if (path.parent.type !== 'FunctionDeclaration') { // The parent must be a function declaration
          return;
        }
        if (!path.parent.id || path.parent.id.name !== 'square') { // The function name must be square
          return;
        }
        const binding = path.scope.bindings['num']; // Find the binding of the parameter
        if (!binding) {
          return;
        }
        // Rename every reference to the parameter
        // (path.scope.rename('num', 'n') is a built-in alternative)
        binding.referencePaths.forEach(refPath => { refPath.node.name = 'n'; });
        path.node.name = 'n'; // Rename the parameter itself
      },
    }
  }
};

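The preceding code can be described like this: the visitor ignores every identifier except a num whose parent is the declaration of the square function, looks up the binding of that parameter in the function scope, and then renames the parameter together with every reference recorded in the binding.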

Conversion Result:

// Before conversion
function square(num) {
  return num * num;
}
console.log('global num is ' + window.num);

// After conversion
function square(n) {
  return n * n;
}
console.log('global num is ' + window.num);

When writing AST transforms for business code, the checks that a human would make by eye must be spelled out explicitly as conditions in the code.

3. Babel in AST

3.1 API Overview

// Three steps of the API workflow
const parser = require('@babel/parser').parse;
const traverse = require('@babel/traverse').default;
const generate = require('@babel/generator').default;

// Supporting package
const types = require('@babel/types');

// Template package
const template = require('@babel/template').default;

3.2 @babel/parser

Use @babel/parser to parse the source code into an AST:

const ast = parser(rawSource, {
  sourceType: 'module',
  plugins: [
    "jsx",
  ],
});

3.3 @babel/traverse

@babel/traverse is the core of AST development: more than 95% of the code you write consists of visitors passed to it.

const ast = parser(`function square(num) {
  return num * num;
}`);

traverse(ast, { // Traverse the AST and apply the visitors
  Identifier(path) { // Visitor called for each Identifier node
    // ...
  },
  // Other visitors
});

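The first parameter of a visitor is path, which is not the node itself. A path wraps the node (path.node) together with its position in the tree, such as path.parent, path.parentPath, and path.scope, and it provides methods for modifying the tree, such as path.replaceWith() and path.remove().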
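
To make the difference between a path and a node concrete, here is a toy traversal written without Babel. The `traverse` function and the shape of its path object (`node`, `parent`, `key`) are simplified assumptions for illustration; Babel's real path carries much more, such as scope information and mutation helpers. The point is that a path records where a node sits, which is what lets a visitor tell the property in `window.num` apart from a free-standing `num`.

```javascript
// Hand-written AST for `window.num + num`.
const ast = {
  type: 'BinaryExpression',
  operator: '+',
  left: {
    type: 'MemberExpression',
    computed: false,
    object: { type: 'Identifier', name: 'window' },
    property: { type: 'Identifier', name: 'num' },
  },
  right: { type: 'Identifier', name: 'num' },
};

// Toy traversal: calls visitor[node.type] with a path-like wrapper.
// It only follows single child nodes, which is enough for this tree.
function traverse(node, visitor, parent = null, key = null) {
  const visit = visitor[node.type];
  if (visit) visit({ node, parent, key });
  for (const [childKey, child] of Object.entries(node)) {
    if (child && typeof child === 'object' && typeof child.type === 'string') {
      traverse(child, visitor, node, childKey);
    }
  }
}

const renamed = [];
traverse(ast, {
  Identifier(path) {
    if (path.node.name !== 'num') return;
    // Skip `num` when it is the property of a non-computed member access.
    const isProperty =
      path.parent !== null &&
      path.parent.type === 'MemberExpression' &&
      path.key === 'property' &&
      !path.parent.computed;
    if (isProperty) return;
    path.node.name = 'n';
    renamed.push(path.key);
  },
});

console.log(renamed); // [ 'right' ]
console.log(ast.left.property.name); // num
```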

3.4 @babel/generator

Use @babel/generator to generate source code from the transformed AST:

const output = generate(ast, { /* options */ });
console.log(output.code); // The generated source code

3.5 @babel/types

@babel/types is used to create and identify AST nodes. It is often used in actual development:

// Functions in @babel/types that start with "is" check the type of a node
types.isObjectProperty(node);
types.isObjectMethod(node);

// Create a null literal node
const nullNode = types.nullLiteral();
// Create an identifier node named square
const squareNode = types.identifier('square');

3.6 @babel/template

@babel/types can create AST nodes, but building them by hand is tedious. @babel/template creates an entire AST node from source text. The following code compares the two ways of creating the AST node for import React from 'react':

// Use @babel/types
// Look up the builder API for each node and pass the parameters in the right order
const types = require('@babel/types');
const astFromTypes = types.importDeclaration(
  [ types.importDefaultSpecifier(types.identifier('React')) ],
  types.stringLiteral('react')
);

// path.replaceWith(astFromTypes) // Node replacement

// Use @babel/template
// Write the source code and get the node back, which is clear and easy to understand
const template = require('@babel/template').default;
const astFromTemplate = template.ast(`import React from 'react'`);

// path.replaceWith(astFromTemplate) // Node replacement

3.7 Define a Babel Plugin for Common Use

Packaging the transform as a standard Babel plugin makes it easy to integrate with Webpack. An example is shown below:

// Define a plugin
const { declare } = require('@babel/helper-plugin-utils');

module.exports = declare((api, options) => {
  return {
    name: 'your-plugin', // Define the plugin name
    visitor: { // Write the business visitor
      Identifier(path) {
        // ...
      },
    }
  }
});

// Configure babel.config.js
module.exports = {
    presets: [
        require('@babel/preset-env'), // It can be used together with the common presets.
    ],
    plugins: [
        require('your-plugin'),
        // require('./your-plugin') can be a relative directory.
    ]
};

Developing a Babel plugin amounts to writing AST transform callbacks. There is no need to call @babel/parser, @babel/traverse, or @babel/generator directly, because Babel calls them internally.

When the capabilities of @babel/types are required, it is recommended to access them through @babel/core. As shown in the source code [1], @babel/core re-exports several of the Babel modules mentioned above.

const core = require('@babel/core');
const types = core.types; // const types = require('@babel/types');

4. ESLint in AST

After mastering the core principles of AST, it is easier to customize ESLint rules:

// eslint-plugin-my-eslint-plugin
module.exports.rules = {
  "var-length": { // Define the var-length rule to check the variable name length
    meta: { type: "suggestion", schema: [] },
    create: context => ({
      VariableDeclarator: (node) => {
        // Only check plain identifiers; destructuring patterns have no name
        if (node.id.type === 'Identifier' && node.id.name.length <= 1) {
          context.report({ node, message: 'Variable name length must be greater than 1' });
        }
      }
    })
  }
};

// .eslintrc.js
module.exports = {
  root: true,
  parserOptions: { ecmaVersion: 6 },
  plugins: [
    "my-eslint-plugin"
  ],
  rules: {
    "my-eslint-plugin/var-length": "warn"
  }
};
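
The logic of such a rule can be exercised without running ESLint by calling the handler directly. In the sketch below, `fakeContext` and the hand-written `VariableDeclarator` nodes are assumptions made for illustration; in real use ESLint parses the file and supplies both. The rule is written in the object form with a `create` function, and it skips destructuring patterns, whose `id` has no `name`.

```javascript
const rule = {
  meta: { type: 'suggestion', schema: [] },
  create: (context) => ({
    VariableDeclarator(node) {
      // Only plain identifiers have a name; `var { x } = obj` has an ObjectPattern id.
      if (node.id.type === 'Identifier' && node.id.name.length <= 1) {
        context.report({ node, message: 'Variable name length must be greater than 1' });
      }
    },
  }),
};

// Stub context that simply collects reports (hypothetical, for illustration only).
const reports = [];
const fakeContext = { report: (descriptor) => reports.push(descriptor) };
const handlers = rule.create(fakeContext);

// Simplified nodes for `var a = 1`, `var total = 1`, and `var { x } = obj`.
handlers.VariableDeclarator({ type: 'VariableDeclarator', id: { type: 'Identifier', name: 'a' } });
handlers.VariableDeclarator({ type: 'VariableDeclarator', id: { type: 'Identifier', name: 'total' } });
handlers.VariableDeclarator({ type: 'VariableDeclarator', id: { type: 'ObjectPattern', properties: [] } });

console.log(reports.length); // 1
console.log(reports[0].node.id.name); // a
```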

Effects

[Screenshot: the IDE prompt for the rule]

[Screenshot: the warning reported when the ESLint command is executed]

For more information about the ESLint API, please see the official documentation [2].

5. Obtain the JSX Interpretation

Most people first meet JSX syntax when learning React, and React is where JSX is used most. However, JSX is not the same thing as React, and it can be used without React.

// Code based on React
const name = 'John';
const element = <div>Hello, {name}</div>;

// Code converted by @babel/preset-react
const name = 'John';
const element = React.createElement("div", null, "Hello, ", name);

As a tag syntax, JSX is neither a string nor HTML. It is a syntax extension to JavaScript that describes what the UI should look like and how it should interact [3]. JSX resembles a template language to some degree, but it comes with the full power of JavaScript. The following part shows how to give JSX your own interpretation by writing a Babel plugin.

5.1 JSX Babel Plugin

HTML is a language that describes web pages, while AXML and VXML are languages that describe mini app pages. These markup languages run in different containers and are not compatible with each other. However, all of them are built on the JavaScript technology stack. Is it therefore possible to define one set of JSX conventions and generate equivalent pages for each container?

5.2 Goal

// JSX source code
export default (
  <view>
    hello <text style={{ fontWeight: 'bold' }}>world</text>
  </view>
);

<!-- Output Web HTML -->
<div>
  hello <span style="font-weight: bold;">world</span>
</div>

<!-- Output mini app AXML -->
<view>
  hello <text style="font-weight: bold;">world</text>
</view>

Babel's AST tooling only transforms JavaScript, so how can markup languages such as HTML and AXML be produced? One way to think about it: compile the preceding JSX code to plain JavaScript, and let runtime code on the web side and the mini app side consume the result as components. This is a common design in AST development: the AST tool only compiles code, and the actual consumption is left to the downstream runtime. @babel/preset-react and React divide the work in the same way.

// JSX source code
module.exports = function () {
  return (
    <view
      visible
      onTap={e => console.log('clicked')}
    >ABC<button>login</button></view>
  );
};

// Goal: convert to plain JavaScript code
module.exports = function () {
  return {
    "type": "view",
    "visible": true,
    "onTap": e => console.log('clicked'),
    "children": [
      "ABC",
      {
        "type": "button",
        "children": [
          "login"
        ]
      }
    ]
  };
};

As the goal is clear, the implementation steps are listed below:

  1. Convert a JSX tag to an Object. The tag name is type. Example: <view /> to { type: 'view' }
  2. Transfer the attributes on the tag to the attributes of the Object. Example: <view onTap={e => {}} /> to { type: 'view', onTap: e => {} }
  3. Migrate the child elements of JSX to the children attribute. The children attribute is an array. Example: { type: 'view', style, children: [...] }
  4. Repeat the previous three steps to handle the child elements.

The following part is the sample code:

const { declare } = require('@babel/helper-plugin-utils');
const jsx = require('@babel/plugin-syntax-jsx').default;
const core = require('@babel/core');
const t = core.types;

/*
 Convert a JSX element node to an object expression. For example,
 node = <view onTap={e => console.log('clicked')} visible>ABC<button>login</button></view>
*/
const handleJSXElement = (node) => {
  const tag = node.openingElement;
  const type = tag.name.name; // Obtain the tag name, for example 'view'
  const properties = []; // Store the properties of the object
  properties.push( // Add the type property, for example type: 'view'
    t.objectProperty(
      t.identifier('type'),
      t.stringLiteral(type)
    )
  );
  const attributes = tag.attributes || []; // Attributes on the tag
  attributes.forEach(jsxAttr => { // Traverse the attributes on the tag
    switch (jsxAttr.type) {
      case 'JSXAttribute': { // Handle JSX attributes
        const key = t.identifier(jsxAttr.name.name); // Obtain the attribute name, such as onTap or visible
        const convertAttributeValue = (node) => {
          if (t.isJSXExpressionContainer(node)) { // The value of the attribute is an expression (such as a function)
            return node.expression; // Return the expression
          }
          // An attribute without a value is null; convert it to true. Example: <view visible /> to {type:'view', visible: true}
          if (node === null) {
            return t.booleanLiteral(true);
          }
          return node;
        }
        const value = convertAttributeValue(jsxAttr.value);
        properties.push( // Obtain {type:'view', onTap: e => console.log('clicked'), visible: true}
          t.objectProperty(key, value)
        );
        break;
      }
    }
  });
  const children = node.children.map((e) => {
    switch(e.type) {
      case 'JSXElement': {
        return handleJSXElement(e); // A child JSX element is converted recursively
      }
      case 'JSXText': {
        return t.stringLiteral(e.value); // Convert JSX text to a string literal
      }
      case 'JSXExpressionContainer': {
        return e.expression; // Unwrap a child written as {expression}
      }
    }
    return e;
  });
  properties.push( // Convert the child elements in JSX to the children property of the object
    t.objectProperty(t.identifier('children'), t.arrayExpression(children))
  );
  const objectNode = t.objectExpression(properties); // Build the object expression node
  /* Finally, convert to
  {
    "type": "view",
    "onTap": e => console.log('clicked'),
    "visible": true,
    "children": [
      "ABC",
      {
        "type": "button",
        "children": [
          "login"
        ]
      }
    ]
  }
  */
  return objectNode;
}

module.exports = declare((api, options) => {
  return {
    inherits: jsx, // Inherit the foundation of the JSX parsing provided by Babel
    visitor: {
      JSXElement(path) { // Traverse JSX tags, for example, <view />
        // Convert the JSX tag to an Object
        path.replaceWith(handleJSXElement(path.node));
      },
    }
  }
});
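
How the compiled object is consumed is left to each platform. As one possible web-side consumer, the sketch below renders such an object tree to an HTML string. The `render` function, the `TAG_MAP` table, and the rule for turning a `style` object into a CSS string are all assumptions for illustration and are not part of the plugin; event handlers are skipped and HTML escaping is left out to keep the example short.

```javascript
// Hypothetical mapping from the shared tag names to HTML tags.
const TAG_MAP = { view: 'div', text: 'span' };

// Turn { fontWeight: 'bold' } into "font-weight: bold;".
function styleToCss(style) {
  return Object.entries(style)
    .map(([key, value]) => {
      const cssKey = key.replace(/[A-Z]/g, (ch) => `-${ch.toLowerCase()}`);
      return `${cssKey}: ${value};`;
    })
    .join(' ');
}

// Render a node of the object tree; plain strings are text children.
function render(node) {
  if (typeof node === 'string') return node;
  const { type, children = [], ...props } = node;
  const tag = TAG_MAP[type] || type;
  const attrs = Object.entries(props)
    .filter(([, value]) => typeof value !== 'function') // skip handlers such as onTap
    .map(([key, value]) => {
      if (key === 'style') return ` style="${styleToCss(value)}"`;
      return value === true ? ` ${key}` : ` ${key}="${value}"`;
    })
    .join('');
  return `<${tag}${attrs}>${children.map(render).join('')}</${tag}>`;
}

// Hand-written tree in the shape the plugin aims to produce.
const tree = {
  type: 'view',
  children: [
    'hello ',
    { type: 'text', style: { fontWeight: 'bold' }, children: ['world'] },
  ],
};

const html = render(tree);
console.log(html);
// <div>hello <span style="font-weight: bold;">world</span></div>
```

A mini app consumer could follow the same structure with a different tag table, which is what keeps the compile step independent of the target container.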

6. Summary

This article introduced the concept of AST, how the AST workflow operates, and what AST makes possible. Some business scenarios for AST are listed below. AST is a powerful aid when you:

  • need to build secondary development on top of existing infrastructure
  • need to support visual programming
  • need to customize code specifications

Note: The code snippet and test method demonstrated in this article are available at this GitHub link. Check it out to learn more.

References

[1] https://github.com/babel/babel/blob/main/packages/babel-core/src/index.js#L10-L14

[2] https://cn.eslint.org/docs/developer-guide/working-with-rules

[3] https://reactjs.bootcss.com/docs/introducing-jsx.html
