Technical architecture for prediction markets
By Rob Knight on March 9th, 2009
One key question for the Wisdom Hive project is this: what is the best way of implementing a prediction market given the technologies available to us? Most of the answers to this question are obvious and even quite conventional, but it’s worth examining the question - and the answers - in more detail.
First of all, what requirements does a prediction market have?
- Some form of data storage, to store the current state of the market and its participants
- Some interface by which that state can be altered
- A means for participants to discover the state of the market and observe changes
- The means to create new markets and share them with others
- To achieve all of the above using open standards
The first requirement is easily handled by any kind of database, though we might return to the specifics later. The second requirement is where things start to get interesting, especially as the decision here will also determine the decisions made for the third and fourth requirements.
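To make the first two requirements concrete, here is a minimal sketch of market state held in memory, with a single method through which that state is altered. Everything in it is an assumption for illustration: the `Market` class, its field names, and above all the rule that a trade simply sets the new price. A real engine would sit on a database and use a proper pricing mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class Market:
    """Hypothetical in-memory record of one market's state."""
    question: str
    price: float = 0.5                             # current probability estimate, 0..1
    holdings: dict = field(default_factory=dict)   # participant -> net shares held
    trades: list = field(default_factory=list)     # history of (participant, shares, price)

    def record_trade(self, participant: str, shares: int, price: float) -> None:
        """The single interface through which the state is altered."""
        if not 0.0 <= price <= 1.0:
            raise ValueError("price must be between 0 and 1")
        self.holdings[participant] = self.holdings.get(participant, 0) + shares
        # Assumption for illustration only: the last traded price becomes the market price.
        self.price = price
        self.trades.append((participant, shares, price))


market = Market("Will it rain in London tomorrow?")
market.record_trade("alice", 10, 0.6)
market.record_trade("bob", -4, 0.55)
```

Keeping every change behind one method is what makes the later requirements tractable: it gives us one obvious place from which to notify observers when the state changes.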
Web Interfaces
It can be taken for granted that we want Wisdom Hive markets to be accessible via a web browser; such browsers are ubiquitous and provide all of the necessary tools to display and update data in a usable format. But browsers are becoming increasingly complex. AJAX allows content to be loaded in the background, and also allows for data formats other than HTML: XML, JSON and other formats can be transported over AJAX. Techniques that build on AJAX, such as Comet and BOSH, allow the implementation of a ‘messaging’ pattern on the web, breaking the normal request-response cycle. And the HTML5 specification, already appearing in part in some cutting-edge browser releases, will allow for the creation of socket connections through which raw data can be sent and received by a web browser. The use of a web browser no longer implies the serving of HTML over a plain old HTTP request-response connection.
That plain old request-response cycle is the “pull” approach, where a user requests a piece of information (a web page, image or piece of data in XML format) from a web server, and receives a copy of that data in response. In reality, the data often doesn’t come directly from the central web server; cached copies of this data are kept, and it’s a lot quicker to return a copy from memory on the cache server than it is to send the whole request through to the central server for processing and an eventual reply. This works well for data that doesn’t change often, but works less well when we positively expect the data to change frequently.
Messaging
To understand the importance of messaging, it’s important to read closely the second part of the third requirement: “A means for participants to discover the state of the market and observe changes”. Messaging involves a “push” approach to data, where the server proactively sends out messages to connected clients when the state of the market changes. This is the pattern that “real” financial markets use: millions of messages per second are sent over the networks of the major financial markets. The velocity of information is so great that it’s almost impossible to comprehend.
It is, in some ways, this great decrease in latency that has undermined the traditional advantage of investment bankers in the marketplace; information is cheap and can be acquired in the blink of an eye by just about anyone. An old, possibly apocryphal tale has it that, in 1815, the banker Nathan Mayer Rothschild was able to make a vast profit because his messenger was the first to arrive with news of Napoleon’s defeat at the Battle of Waterloo. Nowadays, we no longer expect that anyone will have such an advantage in access to information.
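The “push” pattern can be sketched in a few lines. This is an in-process stand-in and not a real messaging protocol: the `MarketFeed` class, the market identifier and the shape of the event are all hypothetical, chosen only to show the server sending an update to every connected client at the moment the state changes.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class MarketFeed:
    """In-process stand-in for a messaging server (hypothetical, not a real protocol)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, market_id: str, handler: Handler) -> None:
        """Register a client callback for one market."""
        self._subscribers.setdefault(market_id, []).append(handler)

    def publish(self, market_id: str, event: dict) -> int:
        """Push the event to every subscriber; return how many received it."""
        handlers = self._subscribers.get(market_id, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


feed = MarketFeed()
alice_inbox: List[dict] = []
bob_inbox: List[dict] = []
feed.subscribe("rain-london", alice_inbox.append)
feed.subscribe("rain-london", bob_inbox.append)

# A trade happens: the server pushes it out without anyone having asked.
delivered = feed.publish("rain-london", {"type": "trade", "price": 0.6})
```

Neither client had to ask whether anything had changed; both inboxes are filled as a side effect of the trade being published.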
At present, there are many rival messaging standards. There are only two really worth considering for our purposes: XMPP and AMQP. XMPP scores points for having many working implementations in existence and already gaining traction in the open source and open standards world. AMQP is, objectively, a more efficient protocol and was designed for use in financial markets, making it eminently suitable for use in a prediction market application; on the negative side, there are not many working implementations and the protocol remains subject to revision.
Personally, I favour XMPP - it may be a less efficient protocol, but it’s good enough for our purposes. It’s also well-supported by open source libraries and software.
What does this mean?
In short, what this means is that if we expose the prediction market via XMPP, individual users will be able to subscribe and receive near-instant updates when trades occur. The experience of trading will become much more ‘live’, provided that there are enough participants to drive regular updates. In addition, third-party XMPP clients can be built which can monitor trades in real time, and perhaps even make automated trades themselves. This is far more efficient than the approach of having an RSS feed of trades which needs to be checked (‘polled’) at regular intervals, or some kind of API that, again, needs to be polled for updates regularly. Ultimately, it will mean that prediction markets can work more like instant messaging and less like e-mail; more like Twitter and less like blogging.
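A little arithmetic shows why polling is the less efficient option. The figures below (a 30-second polling interval, three trades in an hour) are invented for illustration, and the helper functions are hypothetical.

```python
def polling_requests(duration_s: int, interval_s: int) -> int:
    """Requests one client makes when polling at a fixed interval."""
    return duration_s // interval_s


def wasted_polls(duration_s: int, interval_s: int, trades: int) -> int:
    """Polls that return nothing new, assuming at most one useful poll per trade."""
    return max(polling_requests(duration_s, interval_s) - trades, 0)


hour = 3600
polls = polling_requests(hour, 30)           # 120 requests from a single client
wasted = wasted_polls(hour, 30, trades=3)    # 117 of them find nothing new
pushes = 3                                   # with messaging: one message per trade
```

Under these assumptions a single polling client makes 120 requests to learn about three trades, and still hears about each one up to 30 seconds late. Shortening the interval reduces the delay but increases the number of wasted requests.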
Is messaging enough?
Messaging is great for broadcasting updates. But is it good for other purposes? The simple answer is “yes”, but this misses some subtleties. One of our long-term considerations in making any decision on technical architecture has to be scalability. Now, “scalability” is an over-used word which is often employed by people who most certainly do not have a scalability problem in the first place. And neither do we, yet. Nevertheless, there’s no excuse for taking bad decisions simply because, at present, there aren’t enough users to expose the fact that the system would not cope with heavy traffic.
So, here’s the problem: messages are often one-to-one and unique. If Alice requests a list of all current markets, and then Bob requests the same list a second later, it’s necessary to send the list out twice. And for data that doesn’t change often, this can lead to a lot of wasted effort. The great advantage of the “pull” approach of HTTP is that it’s possible to cache copies of the output which can be distributed across the web to myriad cache servers. Often, ISPs operate cache servers so that when their users request a web page, that request can be satisfied without it ever leaving the ISP’s own network, which saves the ISP the cost of sending the traffic elsewhere. Messages cannot be cached in this manner.
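The Alice and Bob example can be sketched with a toy cache sitting in front of the origin server. This is not how a real HTTP cache is implemented; the `TTLCache` class, the 60-second lifetime, the `/markets` path and the market names are all assumptions made for the sake of the example.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Toy stand-in for an HTTP cache server in front of the origin (hypothetical)."""

    def __init__(self, fetch: Callable[[str], str], ttl_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # function that asks the origin server
        self._ttl_s = ttl_s          # how long a stored copy stays fresh
        self._clock = clock          # injectable so the example can control time
        self._store: Dict[str, Tuple[float, str]] = {}
        self.origin_hits = 0

    def get(self, path: str) -> str:
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and now - cached[0] < self._ttl_s:
            return cached[1]         # fresh copy: the origin is never contacted
        self.origin_hits += 1
        body = self._fetch(path)
        self._store[path] = (now, body)
        return body


def origin(path: str) -> str:
    """Pretend origin server returning the list of current markets."""
    return '["rain-london", "election-2010"]'


fake_now = [0.0]
cache = TTLCache(origin, ttl_s=60, clock=lambda: fake_now[0])
alice = cache.get("/markets")    # goes through to the origin
fake_now[0] = 1.0                # one second later...
bob = cache.get("/markets")      # ...Bob is served from the cache
```

Alice and Bob receive identical lists, but the origin does the work only once. With one-to-one messages there is no equivalent place to keep a shared copy.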
Therefore, it’s prudent to supplement the messaging system with a traditional HTTP-based means of acquiring information about the current state of the market.
RESTful APIs
REST is another much-abused buzzphrase which refers to a set of principles for how resources (a broad definition that includes data sources and services) should be exposed via HTTP. In short, REST sets out some rules for how we might want to provide an API for accessing data about markets, and for updating, adding to and deleting that data. I won’t go into specifics here; suffice it to say that a RESTful approach would ensure that the data provided could be easily cached, and would also make it easy for third-party client software to interface with our prediction market engine.
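Without going into specifics, the general shape of such an API can be sketched as a single function that maps an HTTP method and path to an action on the market data. The `/markets` paths, the JSON fields and the 60-second cache lifetime are hypothetical; this is not the Wisdom Hive API, nor anyone else’s. Error handling for malformed request bodies is left out to keep the sketch short.

```python
import json
from typing import Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], str]   # (status, headers, body)

# Hypothetical data store: market id -> market details.
MARKETS: Dict[str, dict] = {
    "rain-london": {"question": "Will it rain in London tomorrow?"},
}


def handle(method: str, path: str, body: Optional[str] = None) -> Response:
    """Map an HTTP method and path onto reading, adding, updating or deleting a market."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "markets" or len(parts) > 2:
        return 404, {}, ""
    if len(parts) == 1:                       # the collection: /markets
        if method == "GET":
            # Cache-Control tells intermediate caches they may reuse this reply.
            return 200, {"Cache-Control": "max-age=60"}, json.dumps(sorted(MARKETS))
        if method == "POST":
            data = json.loads(body or "{}")
            MARKETS[data["id"]] = {"question": data["question"]}
            return 201, {"Location": "/markets/" + data["id"]}, ""
        return 405, {"Allow": "GET, POST"}, ""
    market_id = parts[1]                      # a single market: /markets/<id>
    if market_id not in MARKETS:
        return 404, {}, ""
    if method == "GET":
        return 200, {"Cache-Control": "max-age=60"}, json.dumps(MARKETS[market_id])
    if method == "PUT":
        MARKETS[market_id] = json.loads(body or "{}")
        return 204, {}, ""
    if method == "DELETE":
        del MARKETS[market_id]
        return 204, {}, ""
    return 405, {"Allow": "GET, PUT, DELETE"}, ""


status, headers, listing = handle("GET", "/markets")
```

The point to notice is that reads carry a `Cache-Control` header and have no side effects, which is what allows cache servers to answer them on our behalf, while writes go through to the engine every time.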
In the prediction market world, Inkling Markets do offer a RESTful API for accessing their markets, but they appear to be alone at present.
Our aim is to provide the best of both worlds: the rapid flow of data provided by a messaging pattern, along with efficient access to less frequently changing data via a RESTful HTTP interface. It’s on top of this framework that we will be building our own site, and we aim to create open standard APIs that will allow third-party clients accessing our markets to flourish.
Tags: open source, open standards, REST, xmpp