One of the early trials of Clockwork’s clock synchronization technology was conducted at Nasdaq, as featured in the New York Times. In June 2018, co-founder Balaji Prabhakar spoke with Tom Fay, then SVP of Nasdaq’s Enterprise Architecture and System Engineering, about their work together leveraging Stanford-based research on self-driving networks and high precision clock synchronization. We’ve summarized it here, and you can also listen to the full podcast here.
Self-driving Networks
Prabhakar explains the research done by his group at Stanford and how that led to breakthroughs in clock synchronization. “Managing a data center or what is basically the networks that are underlying cloud computing infrastructures is a complex task. You have to schedule many different resources. The workload that is hitting these networks is not very clear, very well-defined, always changing quite like traffic.”
Just as adding lidar sensors to a regular car and controlling the steering and gas pedals transforms it into a self-driving car that can navigate complex environments and avoid obstacles, his group is trying to make data centers self-driving (or self-programming). “The goal is to get networks to be able to sense themselves, their operation, and learn from their operation, and then to control how they operate. Automatically, without human intervention.”
Clock Synchronization
A key foundational technology to enable self-programming networks is high precision clock synchronization, without which the accurate measurement of delays and the ability to apply controls in real-time would not be possible. In modern distributed computing, where big jobs are executed on many small machines rather than on a single supercomputer, all these machines must coordinate with each other to function as a single big system and achieve consensus on the current state.
“A critical quantity about which these nodes can agree is ‘what is the time now?’ This is called clock synchronization, meaning that if you all agree on time, it is equal to having clocks be synchronized. Now, what is the degree to which the clocks should be synchronized?” Prabhakar said.
Hyperscale data centers, with tens of thousands or millions of servers, can be synchronized to within 100s of microseconds. This is adequate for batch processing large quantities of data (e.g., business records and operations logs) using big data or machine learning algorithms. But, for any real-time sensing or control such as is needed for trading on a platform like Nasdaq’s, this is wholly inadequate.
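To make "agreeing on the time" concrete, here is a minimal sketch of the classic two-way timestamp exchange that protocols such as NTP use to estimate how far one clock is from another. It is not Clockwork's algorithm, which this post does not describe; the function name and the numbers are illustrative assumptions. The second example shows why accuracy is hard to come by: the estimate is only exact when the network delay is the same in both directions.

```python
def estimate_offset_and_delay(t1, t2, t3, t4):
    """Classic two-way exchange (as used by NTP), all times in nanoseconds.

    t1: client sends request   (client clock)
    t2: server receives it     (server clock)
    t3: server sends reply     (server clock)
    t4: client receives reply  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # estimated server clock minus client clock
    round_trip = (t4 - t1) - (t3 - t2)     # time spent on the wire, both directions
    return offset, round_trip


# Hypothetical scenario: the server clock is 500 ns ahead of the client clock.
TRUE_OFFSET_NS = 500

# Symmetric path: 2000 ns each way, 100 ns of server processing.
symmetric = estimate_offset_and_delay(t1=0, t2=2500, t3=2600, t4=4100)

# Asymmetric path: 3000 ns out, 1000 ns back, same clocks and processing time.
asymmetric = estimate_offset_and_delay(t1=0, t2=3500, t3=3600, t4=4100)

print(symmetric)   # offset estimate matches the true 500 ns
print(asymmetric)  # offset estimate is off by half the path asymmetry
```

With symmetric delays the estimate recovers the true 500 ns offset. With a 2000 ns difference between the forward and return paths, the estimate is wrong by 1000 ns, which is one reason naive synchronization struggles to reach nanosecond-level accuracy.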
Nasdaq: Fairness in Trading
Fairness, meaning that trades are executed in the order in which they are submitted regardless of which gateway they arrive at, is one of the key properties of a financial trading platform. Since a packet takes only single-digit microseconds to travel between two servers in a modern high-speed data center, gateway clocks must be synchronized to nanosecond-level accuracy (or at most 100s of nanoseconds); otherwise the clock error would be comparable to the gap between competing orders, and timestamps could not be trusted to order them.
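The effect of clock error on fairness can be illustrated with a small sketch. It assumes a simplified model in which each gateway stamps incoming orders with its own local clock and the exchange sequences orders by those stamps. The `Order` type, the gateway names, and the offsets are hypothetical and are not a description of Nasdaq's systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    gateway: str
    true_time_ns: int  # actual arrival time; not observable in a real system


def stamped_sequence(orders, clock_offset_ns):
    """Return order ids sorted by the timestamp each gateway's local clock assigns."""
    def local_stamp(order):
        # A gateway's stamp is the true time plus that gateway's clock error.
        return order.true_time_ns + clock_offset_ns[order.gateway]
    return [o.order_id for o in sorted(orders, key=local_stamp)]


orders = [
    Order("A", "gw1", 1_000_000),
    Order("B", "gw2", 1_002_000),  # arrives 2 microseconds after A
]

# Gateway clocks disagree by 100 microseconds: B is wrongly sequenced first.
coarse = stamped_sequence(orders, {"gw1": 100_000, "gw2": 0})

# Gateway clocks disagree by 50 nanoseconds: the true order is preserved.
fine = stamped_sequence(orders, {"gw1": 50, "gw2": 0})

print(coarse)
print(fine)
```

In this toy model the order that arrived first loses its place whenever the clock disagreement between gateways exceeds the gap between the two arrivals, which is why the required accuracy is tied to how closely spaced competing orders can be.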
Prabhakar’s research group had developed a software-based clock synchronization system that had an accuracy of 10s of nanoseconds when they met Tom Fay’s team at Nasdaq. Nasdaq’s architecture is highly distributed, and time is critically important. “Trading decisions are based upon time, the ability to lift up and hit an offer, is based upon time. This is something that as a world provider of markets—we have markets all over the world—and ultimately to build a synchronization tool that will allow those markets to interoperate with each other or share data with high precision is something that is very, very interesting to us,” said Fay.
Synchronizing Across Heterogeneous Environments
Clockwork’s algorithm is able to synchronize clocks with nanosecond-level precision in heterogeneous environments, such as third-party data centers where there’s no control over the length of cables and where boxes could all be from different manufacturers with different pass-through times. It is also able to synchronize clocks to within a few microseconds across data centers whose servers are physically separated by large distances (e.g., on different continents).
“We go to great lengths in our data centers right now to make sure we do have fair access,” Fay said. “We use homogeneous infrastructure, equal cable lengths, all of this. And now as we look to move those markets into the cloud where we have to deal with that heterogeneous infrastructure problem, this technology actually becomes very foundational for our framework to provide and maintain that type of fairness in terms of access.”
Deploying to the Cloud
“As we look at what we’re doing, and how self-programming networks relate to what’s happening in the cloud computing world, we see that our initial work on sensing how this network behaves using just software without the need for specialized hardware is a great benefit,” Prabhakar says.
Tapping into the benefits of the cloud also means living in “virtual machine bubbles”, where a user is unable to access the underlying hardware (especially the network interface cards or switches) and certainly cannot install specialized hardware. So “it’s very nice in such a world if you could simply have software you could send over that acts on your behalf, sensing the environment, synchronizing clocks, and allowing you to run distributed applications. Several distributed applications rely on accurate clocks or perform better with synchronized clocks, like distributed databases.”
Meeting Customer Expectations
“We went from seconds to nanoseconds. Customers are demanding that type of precision and will continue to drive us toward that,” Fay said.
Customers across industries have similar expectations — contact us to learn more about how clock synchronization can help your business meet customer expectations and schedule a demo.
Interested in solving challenging engineering problems and building the platform that powers the next generation of time-sensitive applications? Join our world-class engineering team.