Polymath Robotics Blog | State Machines vs Behavior Trees: designing a decision-making architecture for robotics

State Machines vs Behavior Trees: designing a decision-making architecture for robotics

Zeerek from our Controls team explains the difference between state machine and behavior tree architectures, and when to use each one.

Zeerek Ahmad

January 26, 2023

State Machines vs. Behavior Trees

‍

You might have heard about state machines and behavior trees before, or heard debate about which is better in robotics and AI.

Let’s dig into them and why we’ve navigated from one to the other at Polymath.

State Machines and Behavior Trees are different ways of organizing code implementation, and both methods have pros and cons. Depending on your project, both are valid options.

Today, let’s talk about why you may lean one way vs the other.

‍

State Machines

‍

Starting with the simpler of the two concepts, state machines are a static and efficient way to organize your code. A state machine is a model of computation that describes a system as a set of states and the transitions between them.

In the context of robotics, a state machine might be used to describe the different actions that a robot can take, and the conditions under which it will transition between those actions.

In a simple path-following robot, this might mean having a “wait” state, which transitions to a “run” state on receiving a command, and back to a “wait” state once it completes the command.

Example of a State Machine

For example, let’s say you have two programs that need to communicate with each other.

Your state tree may sit “waiting” until it receives a message from that other program. For example’s sake, we’ll say the state tree is waiting to hear “PING”. On receiving a “PING,” the State Machine transitions to a response state such as “responding,” where it sends back some messages containing “PONG” to confirm receipt. Once it’s done responding, it then goes back to “waiting”.

Where Things Get Messy

State machines are elegant, simple, and excellent at being rigid and efficient. Where they start to struggle is when you start adding more and more states due to having complicated needs.

For example, state machines work great with a few defined states, but things get messy once you get up to 20 (let alone over 100) states. With such large state machines, it gets harder to maintain your codebase, let alone debug problematic behaviors.

In robotics, when handling complex tasks that require multiple steps of decision making, a state machine and its extensive conditions and transitions quickly becomes large and unwieldy.

State Machines in Robotics

Let’s talk about how state machines operate in a more complicated scenario.

In a state machine-powered robotics program, the program might wait for customer input. Once it gets that input, it will go into a new state where it is now checking for required information, commanding navigation to do something, and returning the information to the customer.

If someone were to hit an emergency stop (e-stop) on the robot, it might go into an e-stop state; then, if that e-stop is released, the state machine would go back into a standby state. That’s already a lot of states!

To take the concept further, we can add additional states to do your tasks with “waiting”, “responding”, “running”, “paused”, “completed”, “stopped” etc. And we’d build a network of conditions and descriptions that define the transitions and possible transitions this machine could do.

Mapped out, this state machine might look like the image below (note all the paths that must be coded in as well as state behaviors).

‍

Streamlining State Machines

As you expand your state machine beyond 5 or 6 states, you can see how the interconnections may get messy and disorganized. Finding which state is doing what only gets more difficult as you expand.

State machines are great if you can keep things streamlined and simple, but you won’t always have that option.

So, what’s the alternative to these rigid architectures?

‍

Behavior Trees

‍

Behavior trees originated from the video game industry to model more complex behaviors for NPCs. Since then, they’ve been adopted by a wide variety of industries including Robotics.

Unlike state machines, behavior trees tick through “leaf nodes” iteratively until they reach an “action” node that specifies an action. Each tick, the tree is re-evaluated to reach an action. This approach adds up-front complexity, but can help you manage more intricate code through modularity.

Unlike the more systematic state machine, behavior trees are more agile in swapping between actions. Different states, such as “stop”, are logically reached without repetition of conditions and subsequent code.

Rather than going from state A to state B, you are ticking through different nodes in the tree that will trigger multiple small actions.

Example of a Behavior Tree

Using the “PING, PONG” example, we can see immediately how behavior trees operate differently from state machines. The behavior tree is always checking for a message (instead of being in a “wait” loop). It will take the “PING” as it appears and run it through its behavior tree, eventually responding appropriately with “PONG” to the other program and will also set off any other triggered actions.

Staying Modular

Behavior trees have modular nodes, meaning it’s relatively easy to expand their parts without getting messy.

If we look back at our e-stop example, behavior trees may have a separate node just for checking for that trigger. The tree will always look out for that message without repeating code or computation.

Similarly, if you want different parts of a behavior tree to receive input from the customer, such as a "PING," you can tell the tree to run that specific node in a different section. It’s much easier to pop one thing into multiple sections and allow the tree to run its course.

This trickle-down methodology makes scalability easier and makes your code more accessible to understand throughout the different nodes of the tree.

Behavior Trees in Robotics

Ok, let’s make an example behavior tree.

First thing to note is that we will not have to develop explicit transitions like we did with a state machine; these come naturally as a benefit of a well structured tree.

As a preface we will quickly explain some of the “Control” nodes that control the flow of the tree.

Sequence

Continues to next branch if it gets Success
If all branches return Success, it returns Success
If any branch returns Failure, it returns Failure
If any branch returns Running, returns Running

Fallback

Continues to next branch only if it gets “Failure”
If all branches return Failure, it returns Failure
If any branch returns Success, it returns Success
If any branch returns Running, returns Running

ReactiveSequence or ReactiveFallback

Same as its non-reactive counterpart except that it re-ticks all prior nodes up to the running node

The tree shown below uses these to implement the StateMachine shown above in a clear way. It checks for e-stop continuously, and stops if it gets an e-stop OR if the goal fails. It keeps checking the goal until it gets one, and removes the need for an “idle” state entirely. It then responds to the goal, and runs it while checking and implementing pause.

‍

‍

Look how much cleaner the flow is! The tree also fails into a Stop condition by default, and checks for e-stops without redundant code.

Even more important: adding to this tree is as simple as adding or removing a node. Say we want to remove the Pause functionality. We can remove the ReactiveSequence that checks for the Pause and remove the Pause node, and the resulting tree has modified behavior ready. Editing the tree requires minimal code changes, making it easy to scale.

It is also easier to debug any nodes that are working incorrectly, as well as move them around for better performance!

‍

Developing Robots and Ping Pong

‍

Starting Robotics with State Machines

Here at Polymath, we started our robotics stack with state machines.

State machines have a lower performance cost and library dependency than behavior trees. We also had to consider the complexity of ROS and ROS2 within a behavior tree, particularly the way they handle topics, nodes and parameters that have to be reconciled with the behavior tree library’s desired functionality. A state machine, on the other hand, can be programmed simply without an external library to do our core task.

So, state machines were a no-brainer for our simple early implementation.

Data management is also easier in a state machine, where we’re in full control of how it’s running. In a behavior tree, we have to use a blackboard which can get more complicated because, for example, it doesn't handle non-thread-safe stuff as well. You don't contain things in their one single state like you can in a state machine.

But while starting with state machines was the right choice when things were simpler and we were growing, it didn't stay that way.

As we continued building for our customers, the number of decisions to be made exploded. This meant we needed additional states that were bigger and created lots of redundant transitions.

We eventually decided to move from a state machine to behavior tree, to try to get ahead of this issue and future-proof our stack.

The Pros of Behavior Trees

Behavior trees effectively guarantee a path to complexity-handling and upgradability.

The modular nature makes management more straightforward, letting us update just one piece of the tree when needed.

Specifically, with ROS, the behavior tree CPP library has an associated GUI library that can physically show you what the tree looks like. It also allows you to drag and drop modules, allowing for significantly more visual editing and management.

The benefit? It’s a lot harder for your code to get messy when everything is self-contained, with a sleek GUI to go with it.

The Ping Pong Experiment

Before moving our core decision-making to behavior trees, we needed to test this all out.

We developed a simple “PING” “PONG” example (as we hinted above), which leverages ROS2 and the BehaviorTree.CPP library.

This experiment creates two ROS2 lifecycle nodes that will ping and pong off of each other.

The behavior tree each node runs can be defined by an XML file, and we can edit these to do whatever they need.

The trees use ROS2 to PING and PONG each other. This example also tests multiple ways that a behavior tree running via ROS2 can subscribe to data and push the results back to the blackboard in a thread-safe way. This allows for the nodes within it to pass data, as well as use it when transmitting PONGs back.

By selecting the behavior tree, we can change which version the nodes use, as well as how the nodes decide how many PONGs to shoot back for each PING.

Check out the example here: https://github.com/polymathrobotics/ros2_behavior_tree_example

ROS and ROS2 Problem Solving

So, where has this behavior tree vs state machine decision come into play for us at Polymath Robotics?

I’ve been working on making the "brain" that helps bring together all the moving pieces on our robot.

This brain takes the customer's command (the PING in our simplified example), then tells ROS what to do, while monitoring the actions taken. It will eventually also give feedback as needed to the customer (the PONG).

This handler is making high-level decisions on all of these moving pieces, such as the system's health, in conjunction with the customer inputs, to ensure it is arriving at the correct solution.

By developing this "brain," we’re learning how to better leverage behavior trees with ROS2. We’re also laying building blocks so that once we start developing more complex systems, we can avoid miniscule bugs and save time in our troubleshooting.

In other words, we’re building the piping so that when we go build something with our new behavior tree codebase, we know how it is going to work.

This testing also gives us a sandbox so that we can make mistakes and run extreme experiments without doing so on a real robot, where there are many more factors to consider.

‍

Lessons Learned

‍

Working on this behavior tree project I learned 3 major things:

Record Thoroughly
Sandbox Extensively
Behavior Trees are Awesome AND a Pain

Recording Thoroughly

When you’re drawing from multiple sources such as Nav2’s intricate and complex implementation of Behavior Trees, ROS2’s extensive options and features, and BehaviorTree.CPP’s newest revision, it is very easy to lose things along the way.

If you don’t organize, record and synthesize your learnings in a readable way for yourself, you will find yourself unable to find it when you need it!

Using Notion helped me keep track of all the moving pieces so I could compile an example more applicable to my use case.

Just like commenting code, it’s critical to save those sources so that as you build up a knowledge base, you can continue to reference those ideas and share them.

Sandbox Extensively

If I’d jumped straight into building a behavior tree for the brain of our robot, I would have had a much harder time.

By building out an extensive sandbox that had only the core of ROS2 and BehaviorTree.CPP, I was able to go deeper into the different ways I can leverage the tree for use. It also let me clear up bugs and build issues related to syntax and implementation.

Sandboxing also allowed me to show my team the core of the BehaviorTree.CPP library as well as new features ROS2 brings such as lifecycle management, and the node interfaces. This provides utility beyond just behavior trees, but for leveraging ROS2 as a whole.

Behavior Trees are Awesome AND a Pain

The last thing I’ve learned is that Behavior Trees take more time and effort to set up (...and are thus a pain).

You have to put in time, effort and more planning than you do with state machines, to make sure you have the nodes you need and a suitable structure.

But behavior trees are also awesome! They’re so much simpler, more concise, and clear as a code base. Not to mention the scalability, flexibility and robustness they bring when developing robots.

Have you had experience with state machines vs behavior trees? Which did you use, and why?

‍