Unlocking the Power of DSLs: Stateless State Machines

After reading "Domain-Specific Languages" by Martin Fowler, Zhang Jianfei explains a new angle to view, understand, and apply DSLs and state machines.

By Zhang Jianfei

What is a domain-specific language (DSL)? A DSL is a tool that helps communicate the intent of a part of a system more clearly. In this article, we implement a state machine to provide insight into the nature of DSLs. We also introduce semantic models and fluent interfaces and discuss the performance issues of state machines.

We used a state machine to keep track of frequent transitions in a recent project because the expressiveness of state machine DSLs facilitates better understanding than if-else statements. What's more, state machines are easier to implement and use and are less complicated than processing engines.

At first, we used an open-source state machine, which fell short of our expectations. So we built a lightweight state machine following the "keep it simple, stupid" (KISS) principle.

This state machine (cola-statemachine) was added to the Clean Object-Oriented and Layered Architecture (COLA) and is now open source.

While working on the state machine, I read Domain-Specific Languages by Martin Fowler, which totally reshaped my understanding of DSLs. That is also why I wrote this article: to offer a new angle from which to view and apply DSLs and state machines.

DSL

The book, Domain-Specific Languages, begins by discussing state machines, and gradually progresses into a deeper understanding of DSLs. I recommend this book to anyone interested in DSLs and state machines. The following sections summarize the main points of the book.

Let's first take a look at the definition of DSLs as described by Martin Fowler in Domain-Specific Languages.

What is a DSL?

"DSLs are a tool whose cutting edge lies in its ability to provide a means to more clearly communicate the intent of a part of a system."

This clarity isn't just an aesthetic desire. The easier it is to read the code, the easier it is to find mistakes, and the easier it is to modify the system. Thus, we encourage meaningful variable names, documentation, and clear coding constructs, and should, for the same reason, encourage DSL usage.

By definition, a DSL is a computer programming language of limited expressiveness focused on a particular domain. There are three key elements to this definition:

Language nature: A DSL is a programming language, so its expressive power should come not only from individual expressions but also from the way expressions can be combined.
Limited expressiveness: A general-purpose programming language supports varied data, control, and abstraction structures. These capabilities are useful, but they also make the language harder to learn and use. A DSL supports only the bare minimum of features required for its domain. A DSL can't be used to build an entire software system, but it can be employed for one particular aspect of a system.
Domain focus: A limited language is only useful if it has a clear focus on a small domain. The domain focus is what makes a limited language worthwhile.

Consider this regular expression:

/\d{3}-\d{3}-\d{4}/

It is a typical DSL that solves the problems in string matching.
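To see how a host language hands work over to this small DSL, here is a minimal Java sketch that applies the expression above. The class and method names are assumptions made for illustration; only java.util.regex.Pattern comes from the standard library.

```java
import java.util.regex.Pattern;

// Illustrative only: the class and method names are assumptions, not from the article.
final class PhoneNumberMatcher {
    // The same expression as above; backslashes are doubled inside a Java string literal.
    private static final Pattern PATTERN = Pattern.compile("\\d{3}-\\d{3}-\\d{4}");

    // True only when the whole input has the form ddd-ddd-dddd.
    static boolean isPhoneNumber(String input) {
        return PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPhoneNumber("555-867-5309")); // true
        System.out.println(isPhoneNumber("5558675309"));   // false
    }
}
```

The Java code only compiles and invokes the expression. The matching rules themselves are stated entirely in the regular expression, which is what makes it a DSL with limited expressiveness and a narrow domain focus.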

Categories of DSLs:

DSLs can be divided into three main categories: external DSLs, internal DSLs, and language workbenches. The following are Martin Fowler's definitions:

"An internal DSL is a particular way of using a general-purpose language. A script in an internal DSL is valid code in its general-purpose language, but it has a specific style and only uses a subset of the language's features to handle the issues of one small aspect of the overall system. The result should have the feel of a custom language, rather than its host language." For example, the state machine we built is an internal DSL that does not support scripting and resides in Java, but it is a DSL in nature.

builder.externalTransition()
  .from(States.STATE1)
  .to(States.STATE2)
  .on(Events.EVENT1)
  .when(checkCondition())
  .perform(doAction());

An external DSL is a language separate from the main language of the application it works with. Usually, an external DSL has a custom syntax, but using another language's syntax is also common. XML is a frequent choice. Examples of external DSLs include the XML configuration files for systems, like Struts and Hibernate.

A language workbench is a specialized IDE for defining and building DSLs. Simply put, it is the product and visualization of a DSL.

The order of categories indicates a progressive pattern, where the internal DSLs are the simplest and the lowest-cost DSLs but do not support external configuration. Language workbenches are the most configurable but also come with the highest costs. The following diagram illustrates this pattern:

How to Choose between DSLs

You may get a clearer understanding of what type of DSLs to use after learning how they are used differently:

Internal DSLs: They are simple, convenient, and intuitive. Internal DSLs are recommended for improving code readability in cases where external configuration is not required.
External DSLs: They are a suitable choice if configuration must happen at runtime, or if configuration changes should take effect without redeploying code. For instance, you may want to add a rule to a rule engine without republishing the code afterward.
Language workbenches: Configuration files and DSL scripts are not user-friendly for non-developers, and this is where a language workbench can be useful. For example, the promotions and rules on Taobao require complex settings and must be constantly updated, which is a lot for the sales operations team to manage. We can provide a language workbench that allows them to set rules that take effect immediately.
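To make the external DSL option more concrete, the following is a minimal sketch of a parser for a hypothetical one-transition-per-line text format. The syntax ("STATE1 -> STATE2 on EVENT1") and every name in the code are assumptions made for illustration; cola-statemachine does not provide such a format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical external DSL: one transition per line, e.g. "STATE1 -> STATE2 on EVENT1".
// The syntax and all names here are assumptions made for illustration.
final class TransitionScriptParser {

    // Plain value object holding one parsed rule.
    static final class Rule {
        final String from;
        final String to;
        final String event;

        Rule(String from, String to, String event) {
            this.from = from;
            this.to = to;
            this.event = event;
        }
    }

    // Parses the script text; blank lines are skipped, malformed lines are rejected.
    static List<Rule> parse(String script) {
        List<Rule> rules = new ArrayList<>();
        for (String line : script.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] tokens = trimmed.split("\\s+");
            if (tokens.length != 5 || !tokens[1].equals("->") || !tokens[3].equals("on")) {
                throw new IllegalArgumentException("Cannot parse: " + trimmed);
            }
            rules.add(new Rule(tokens[0], tokens[2], tokens[4]));
        }
        return rules;
    }

    public static void main(String[] args) {
        // In practice the text would come from a file or a configuration service.
        String script = "STATE1 -> STATE2 on EVENT1\nSTATE2 -> STATE3 on EVENT2\n";
        for (Rule rule : parse(script)) {
            System.out.println(rule.from + " --" + rule.event + "--> " + rule.to);
        }
    }
}
```

Because the script is plain text read at runtime, rules can change without recompiling the application. The price is the parser, its error handling, and the loss of compiler checks that an internal DSL gets for free.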

In short, there is no one-size-fits-all solution when it comes to DSLs. Processing engines are a particularly negative example: they are overused and overdesigned, and they can make simple tasks complicated.

Complexity is best avoided, yet it is common in practice. Developers at big companies are expected not only to write code but also to improve, extend, and innovate in pursuit of state-of-the-art technology. As Nassim Nicholas Taleb said in Antifragile:

"...simplicity has been difficult to implement in modern life because it is against the spirit of a certain brand of people who seek sophistication so they can justify their profession."

Fluent Interfaces

We are faced with two choices when writing software libraries: one is to provide command-query APIs, and the other is to provide fluent interfaces. For example, the Mockito API:

when(mockedList.get(anyInt())).thenReturn("element")

This showcases the typical use of a fluent interface.

Fluent interfaces are an important means for implementing internal DSLs.

The fluency of fluent interfaces improves the code readability and comprehensibility, making them internal DSLs that go beyond providing APIs.

With its fluent interface, Mockito is effectively a DSL for the unit-testing domain. If we replaced the fluent interface with a command-query API, the intent of the test would not be expressed as clearly.

String element = mockedList.get(anyInt());
boolean isExpected = "element".equals(element);

Note: Fluent interfaces are often implemented through cascading calls such as method chaining or the builder pattern, as with OkHttpClient.Builder():

import java.util.concurrent.TimeUnit;
import okhttp3.OkHttpClient;

// Each setter returns the builder, so the calls can be chained.
OkHttpClient okHttpClient = new OkHttpClient.Builder()
    .readTimeout(5 * 1000, TimeUnit.MILLISECONDS)
    .writeTimeout(5 * 1000, TimeUnit.MILLISECONDS)
    .connectTimeout(5 * 1000, TimeUnit.MILLISECONDS)
    .build();

But a more significant function of fluent interfaces is to constrain the sequence of method calls. For example, when building a state machine, to() can only be called after from() has been called. The plain builder pattern cannot enforce this.

To achieve this we can use the builder pattern together with the fluent interfaces. The details are included in the Implement the State Machine section in this article.

State Machines

The following sections describe how to implement an internal DSL state machine.

State Machine Selection

As I said earlier, I do not endorse the overuse of processing engines. State machines, however, can be a helpful tool for two major reasons:

Firstly, state machines can be implemented in a very lightweight way. The simplest state machine can be implemented at near-zero cost by using only an enum.
Secondly, using a state machine DSL to express transitions makes the semantics clearer and improves the readability and maintainability of the code.
However, the enum approach only supports linear transitions between states, which is not sufficient in our case, so we needed something more capable.
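The enum-only approach mentioned above can be sketched as follows. The order states are hypothetical examples, and the code is a minimal illustration rather than anything taken from our project.

```java
// A minimal enum-only state machine. The order states are hypothetical examples.
enum OrderState {
    CREATED, PAID, SHIPPED, COMPLETED;

    // Advances to the next state in declaration order; the last state is terminal.
    OrderState next() {
        OrderState[] all = values();
        int nextIndex = ordinal() + 1;
        if (nextIndex >= all.length) {
            throw new IllegalStateException(this + " is a terminal state");
        }
        return all[nextIndex];
    }

    public static void main(String[] args) {
        System.out.println(CREATED.next()); // PAID
    }
}
```

This costs almost nothing to write, but every state has exactly one successor. As soon as a state needs several outgoing transitions that depend on the event or on a condition, the enum alone is no longer enough.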

Open-Source State Machines are Too Complex

Just like processing engines, there are quite a few open-source state machines around. I checked the designs of the top 2 state machine implementations on GitHub, namely Spring Statemachine and Squirrel State Machine. They are both very powerful frameworks, but this can also be a disadvantage.

This is understandable, since most open-source projects set out to support every feature described in UML state machine diagrams.

For most projects, ours included, advanced state machine features such as substates, fork/join for parallel execution, and submachines are far more than we need.

Open-Source State Machines have Poor Performance

We must acknowledge that these open-source state machines are all stateful; by definition, a state machine is supposed to maintain state. However, precisely because they are stateful, they are not thread-safe. Our application servers are multi-threaded and deployed in a distributed environment, so a new state machine instance has to be built every time a request arrives.

Take e-commerce order management for example. After a user places an order, we change the status of the order to "Order Placed" by calling the state machine instance. When the user pays for the order, the request may be handled by a separate thread or another server. So, we have to build a new state machine instance because the original instance is not thread-safe.

This incurs high overhead and consumes more power. It may also cause performance issues if the state machine is complex to build or the queries-per-second (QPS) rate is high.

For simplicity, better performance, and reduced electricity use, we built a state machine on our own, with two simple, clear goals:

To build a lightweight state machine that forgoes features such as substates or parallel execution.
To build a stateless state machine that follows the singleton design pattern, so that all transitions can be handled by a single instance.

Implement the State Machine

State Machine Domain Model

As is illustrated in the following diagram, the core concepts of our lightweight state machine include:

State: the state
Event: the entity that drives state changes
Transition: the change from one state to another
External Transition: the transition in which the source state is exited and the target state is entered
Internal Transition: the transition that executes without exiting or re-entering the state in which it is defined
Condition: the condition that allows or stops the transition to a certain state
Action: the behavior executed during the triggering of the transition
StateMachine: the state machine

The following diagram illustrates the semantic model of the state machine:

Note: The term "semantic model" comes from the book Domain-Specific Languages. You can think of it as the domain model. Fowler uses "semantic" because the DSL script represents the syntax, while the model represents the semantics. I think this is a good word choice.

The following is the core code of the state machine's semantic model:

//StateMachine
public class StateMachineImpl<S, E, C> implements StateMachine<S, E, C> {

  private String machineId;
  private final Map<S, State<S, E, C>> stateMap;

  ...
}

//State
public class StateImpl<S, E, C> implements State<S, E, C> {
  protected final S stateId;
  private Map<E, Transition<S, E, C>> transitions = new HashMap<>();

  ...
}

//Transition
public class TransitionImpl<S, E, C> implements Transition<S, E, C> {

  private State<S, E, C> source;
  private State<S, E, C> target;
  private E event;
  private Condition<C> condition;
  private Action<S, E, C> action;

  ...
}

Fluent API for Creating the State Machine

I wrote more lines for the builder and the fluent interface than I did for the core code. The following is the code for TransitionBuilder:

class TransitionBuilderImpl<S, E, C> implements ExternalTransitionBuilder<S, E, C>, InternalTransitionBuilder<S, E, C>, From<S, E, C>, On<S, E, C>, To<S, E, C> {
  ...
  @Override
  public From<S, E, C> from(S stateId) {
    source = StateHelper.getState(stateMap, stateId);
    return this;
  }

  @Override
  public To<S, E, C> to(S stateId) {
    target = StateHelper.getState(stateMap, stateId);
    return this;
  }
  ...
}

The fluent interface enforces the calling sequence, as shown in the following figure: only from() can be called after externalTransition(), and only to() can be called after from(). In this way, the semantic correctness and consistency of the state machine can both be guaranteed.
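The way return types enforce the calling sequence can be shown in isolation. The following is a simplified sketch, not the cola-statemachine source: the interfaces FromStep, ToStep, and OnStep and the class TransitionSketchBuilder are hypothetical names, and states and events are plain strings to keep the example short.

```java
// A sketch of the "staged" fluent interface idea with simplified, hypothetical types.
// Each step interface exposes only the method that may legally come next.
interface FromStep { ToStep from(String state); }
interface ToStep { OnStep to(String state); }
interface OnStep { String on(String event); }

final class TransitionSketchBuilder implements FromStep, ToStep, OnStep {
    private String source;
    private String target;

    private TransitionSketchBuilder() { }

    // The entry point returns the narrowest view: only from() is visible.
    static FromStep externalTransition() {
        return new TransitionSketchBuilder();
    }

    @Override
    public ToStep from(String state) {
        this.source = state;
        return this; // same object, but typed so that only to() can follow
    }

    @Override
    public OnStep to(String state) {
        this.target = state;
        return this;
    }

    @Override
    public String on(String event) {
        // A real builder would register a transition; here we just describe it.
        return source + " --" + event + "--> " + target;
    }

    public static void main(String[] args) {
        System.out.println(
            TransitionSketchBuilder.externalTransition().from("STATE1").to("STATE2").on("EVENT1"));
        // externalTransition().to("STATE2") would not compile: FromStep has no to() method.
    }
}
```

One object implements every step, and each method returns that object typed as the next step's interface. An out-of-order call therefore becomes a compile-time error rather than a runtime surprise.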

Stateless Design of the State Machine

This section provides a solution to the performance issue: make the state machine stateless.

Most open-source state machines are stateful because they maintain two pieces of state: the initial state and the current state. To make a state machine stateless, we simply remove these two variables.

Can we dispense with these two states? Of course we can.

The only downside is that once we do this, we can't know the current state of the instance. Since we will only be using the state machine to accept the source state, check the condition, execute the action, and return the target state, we can certainly do without the ability to know the current state. After all, it just implements a state transition DSL expression.

After adopting the stateless design, we can use a single state machine instance to serve all requests, which can significantly improve performance.
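The idea can be illustrated with a small sketch. It assumes a hypothetical order flow and hypothetical names (StatelessMachineSketch, fireEvent, and the enums), and it is not the cola-statemachine implementation: the machine holds only the transition table, while the caller supplies the source state and receives the target state.

```java
import java.util.HashMap;
import java.util.Map;

// A sketch of a stateless machine: it stores transitions but no "current state".
// All names are hypothetical; this is not the cola-statemachine implementation.
final class StatelessMachineSketch {
    enum State { PLACED, PAID, SHIPPED }
    enum Event { PAY, SHIP }

    // One shared instance can serve every request because nothing mutates after setup.
    static final StatelessMachineSketch INSTANCE = new StatelessMachineSketch();

    private final Map<State, Map<Event, State>> transitions = new HashMap<>();

    private StatelessMachineSketch() {
        addTransition(State.PLACED, Event.PAY, State.PAID);
        addTransition(State.PAID, Event.SHIP, State.SHIPPED);
    }

    private void addTransition(State from, Event on, State to) {
        transitions.computeIfAbsent(from, key -> new HashMap<>()).put(on, to);
    }

    // The caller passes in the source state (for example, loaded from the order record)
    // and gets the target state back; the machine itself remembers nothing.
    State fireEvent(State source, Event event) {
        Map<Event, State> outgoing = transitions.get(source);
        State target = (outgoing == null) ? null : outgoing.get(event);
        return target != null ? target : source; // no matching transition: stay put
    }

    public static void main(String[] args) {
        System.out.println(INSTANCE.fireEvent(State.PLACED, Event.PAY)); // PAID
    }
}
```

Returning the source state when no transition matches is a choice made for this sketch. The point is that fireEvent reads shared, unchanging data and writes nothing, so concurrent requests on any thread can use the same instance.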

Use the State Machine

Using the state machine is as straightforward as implementing it. The following code shows the three kinds of transitions supported by cola-statemachine.

StateMachineBuilder<States, Events, Context> builder = StateMachineBuilderFactory.create();

//external transition
builder.externalTransition()
  .from(States.STATE1)
  .to(States.STATE2)
  .on(Events.EVENT1)
  .when(checkCondition())
  .perform(doAction());

//internal transition
builder.internalTransition()
  .within(States.STATE2)
  .on(Events.INTERNAL_EVENT)
  .when(checkCondition())
  .perform(doAction());

//external transitions
builder.externalTransitions()
  .fromAmong(States.STATE1, States.STATE2, States.STATE3)
  .to(States.STATE4)
  .on(Events.EVENT4)
  .when(checkCondition())
  .perform(doAction());

builder.build(machineId);

The internal DSL state machine noticeably improves code readability and comprehensibility, especially in the context of complex transitions. The following is the PlantUML sequence diagram of our project modeled with the cola-statemachine. Without state machines, business code like this would be unintelligible and hard to maintain.

This is where DSLs have their edge: they provide a means to more clearly communicate the intent of a part of a system. cola-statemachine does not yet support a more configurable and flexible external DSL, because the current internal DSL already serves our needs well.
