Building a Trading System, First Steps

General considerations for creating a trading system and simple structure for a limit order book.

In an high-frequency trading environment, we need to have a trading system which provides interfaces for receiving data and interacting with the market for our trading models (the mathematical or algorithmic logic that analyses data and decides how to trade). One really important thing to first consider in our trading system is to separate it from our trading models. If our trading system is coupled with our trading models, it can become a nightmare situation to modify model behaviour or to add new, fundamentally different models as the time it will take to implement them will be too much.

I’m sure the reader is thinking that the above point sounds obvious; of course, concerns should be separated. The problem is that it is sometimes very difficult to see “what” should belong to a trading system. One example is to consider a trading model that takes a single tick of market data, does some calculations, and then decides whether or not to place an order.¹ This seems reasonable, right?

Unfortunately, this will lead us to making trading decisions based on false views of the market. The reason for this is that, while a stream of market data is serial (it has to be), the events the data represents are not: some of the events are truly simultaneous. For example, a large market order that sweeps the book will generate a set of executions at the range of limits the order filled at, all of which happened at the time the market order was placed. A model that trades off of the first couple ticks of these will have an incorrect belief about the current price. A correct implementation would allow a model to detect simultaneous events and delay a trading decision until they have all been processed in the system.

So, we say that a general trading model should be the source for every behaviour that we could conceivably want to control or vary, and its interface with the trading system should be very carefully defined and limited so that it can be easily tested. A trading system, on the other hand, should be everything that is static; primarily, the connectivity with markets, databases, and other computers, an order management system, and interfaces for the models to access these resources.

A good way to think of our trading system is to define it as an API that the trading models use, which would give us the maximum flexibility during the process of model development.

The Order Management System

Let’s now consider the order management system (OMS) as it is arguably one of the most important parts of a trading system, and like the trading model interface, it is one where the most obvious implementations will lead to headaches down the road. These issues arise when, let’s say, all of our trading models have direct access to the exchange (so they can add/execute orders themselves), with no intermediary process that has information about all open orders. Or, it can also happen when the OMS is designed as a simple conduit for reaching the market, so model logic that requires orders to be in a certain state must block waiting for a response from the trading system. Two problems occur with this solution:

A blocked model will be unable to react to changing market conditions and will likely build up a backlog of stale market data while it waits for trading confirmations.
These designs don’t contain the possibility of optimising the trading behaviour across all of the models that are using the OMS. For example, if two models under this system are placing orders in opposite directions for the same instrument, an intelligent OMS should combine them into a single net order.

A good way to implement an OMS is to separate the intentions of a trading model from the actual resulting positions. With this design, a trading model can, asynchronous from actual trading, update its intentions. And, the OMS can intelligently work to implement them, notifying the model when they are fulfilled. This leaves the model free to continuously process data and potentially change intentions before they are fulfilled. An extremely intelligent OMS, however, might be better thought of as a special kind of model that gathers and transforms the trading requests of all of the other models, since it is likely that an intelligent OMS would need to be tuned and otherwise parameterized.

Limit Order Book

For a trading system to be a trading system, it must have an order book implementation that follows the market feed, updating its internal data structures, which happens to be the primary source of market information for the trading models, so it is very important to make our order book both absolutely correct and extremely fast. For example, the recommended bandwidth for Nasdaq TotalView ITCH Feed is about 15-20 mbit/s.² Since messages arrive are 20 bytes on average, our system needs to handle 125,000 messages per second.

A limit order book is a data structure that contains orders that market participants have made and that tries matching these orders for them to be executed. An order can be of type BUY or SELL, has a volume and a price, and is identified by a u64. Our order book implementation must have three basic operations: add, cancel, and execute. The goal is to implement these operations in $O(1)$ time (though, that might not be the fastest always…) while making it possible for the trading model to efficiently ask questions like:

What are the best bid and offers?
How much volume is there between prices X and Y?
What is order A’s current position in the book?

One inherent observation we notice is to separate buys and sells into two sorted structures as we need to know the highest bid and the lowest offer at an instant. We also notice that for a price point, there may be many orders. So, our order book should also have a list for each limit price, sorted by who came first.

In particular, an add operation places an order at the end of a list of orders to be executed at a particular limit proce, a cancel operation removes and order from anywhere in the book, and an execution removes an order from the inside of the book, which is defined as the oldest buy order at the highest buying price and the oldest sell order at the lowest selling price. We define these structures to help us:

1struct Order {
2	Id id;
3	Side side; // buy or sell
4	Price price;
5	Volume volume;
6	u64 entry_time;
7	u64 event_time;
8	Volume initial_volume;
9	Volume remaining_volume;
10	Order *next_order;
11	Order *prev_order;
12	Limit *parent_limit;
13};

1struct Limit { // represengint a single limit price
2	Price price;
3	Volume total_volume;
4	Limit *parent;
5	Limit *left_child;
6	Limit *right_child;
7	Order *head_order;
8	Order *tail_order;
9};

1struct Book {
2	Limit *buy_tree;
3	Limit *sell_tree;
4	Limit *lowest_sell;
5	Limit *highest_buy;
6};

The idea is to have a binary tree of Limit objects sorted by Limit.price, each of which is itself a doubly linked list of Order objects. Each side of the book, the buy and sell limits, should be in separate trees so that the inside of the book corresponds to the end and beginning of the buy Limit tree and sell Limit tree, respectively. Each order is also an entry in a map/hash map keyed off Order.id, and each limit is also an entry in a map/hash map keyed off Limit.price.

Note that these could also be implemented with C++ STL: std::map, std::unordered_map, and std::list, which I have a basic implementation of. In fact, our Limit trees would just be a std::map with key being the price and the value being the std::list of orders. This would also allow the values of our orders hash map to be just an iterator to our std::list.

Anyhow, we can easily implement these key operations with good performance (see here for a crude implementation):

Add: $O(\log M)$ for the first order at a limit, $O(1)$ for all others
Cancel: $O(1)$
Execute: $O(1)$
Get volume at limit: $O(1)$
Get best bid/offer: $O(1)$

Here, $M$ is the number of price limits (which is generally $\ll N$ , the number of orders). Some strategy for keeping the limit tree balanced should be used because the nature of markets is such that orders will be removed from one side of the tree as they’re being added to the other; of course, not needed if we’re using C++ STL.

One thing, however, as with everything, to consider is that C++ STL has been implemented and designed to be very general, so if you are actually implementing these structures for a real trading system, I would recommend for you to use a special container or just create your own.

What about the cache?

The choice of a linked list because $O(1)$ operations often miss the fact that the operation cost $C$ is constant, but can be large. A simple array might only have $c$ as constant but with $O(n)$ . It’s really important to recognise that $c \times n$ isn’t guaranteed to be larger than $C$ , and oftentimes it’s not!

This is also true in our previous structure which has trees and linked lists, meaning that the created objects will be all over memory, which, of course, would inhibit the cache. So, a variation on our previous structure is to store the limits in an array instead of a tree/std::map. This will always give $O(1)$ for add operations, but at the cost of making deletion/execution of the last order at the inside limit $O(M)$ as Book.lowest_sell and Book.highest_buy have to be updated.

This might be beneficial if your order book is not sparse, meaning that most orders have the same limit price, and it exponentially decays as you go further away from the “market value.” Look at voyager for an implementation using a flat linear array indexed by the numeric price value and an arena for limits to avoid incurring the costs of dynamic heap-based memory allocation.

Matching engine (the execute operation)

Once all of our structures are defined, the execution is left as the single non-trivial thing to implement. An execution (or multiple!) happens when the order book has this state: highest_bid.price >= lowest_sell.price. The execution continues, matching orders, until either there’s no order left on either side or highest_bid.price < lowest_sell.price. Do note that it can happen such that a big volumed order is filled by multiple orders from the other side, so it’s not exactly one-to-one.

What’s Next?

This is only a small percentage of the picture. For building a trading system, one needs to consider the networking, multithreading, and get very close to the hardware. Even so, one might need to implement their trading systems through FPGAs or through specialised circuits.

For more information about trading system and building one, I recommend:³⁴⁵⁶⁷

Footnotes

Subscribe!

Contents

The Order Management System
Limit Order Book
- What about the cache?
- Matching engine (the execute operation)
What's Next?
Footnotes