Uber revamps its operating model with real-time data and microservices orchestration
Founded in 2009, Uber Technologies Inc. started as a small tech disruptor and subsequently revolutionized urban transportation, popularizing concepts such as ride-sharing.
In the years since, the company has recognized the intrinsic importance of data as a driver of business outcomes, from differentiation to diversification. Harnessing its vast data resources, Uber has built an expansive architecture that powers real-time capabilities such as logistics services and ticket bookings.
“When I started, it was a very tiny startup and definitely has grown to run these millions of apps in so many devices across the world,” said Madan Thangavelu, senior engineering director at Uber. “I run our Rider engineering team, which is responsible for the flagship app that you download. Over the last couple of years, we’ve transitioned from an app that’s only an A-to-B travel app to one that gives you so much more related to package delivery, renting a car, and we can even do a train booking. The app complexity has exploded over the last few years.”
Thangavelu spoke with theCUBE Research’s George Gilbert during a CUBE Conversation as part of “The Road to Intelligent Data Apps” podcast series. They discussed the evolution of Uber’s Rider app, exploring the intricate balance between real-time functionality, microservice architectures and seamless user experiences.
Real-time operation: Uber’s key differentiator
Central to Uber’s application is its real-time nature, a fundamental aspect that sets it apart from many other applications. While traditional apps often deal with static content or delayed updates, Uber’s ecosystem thrives on instantaneous changes: A canceled ride, a driver’s acceptance or car availability all affect the user experience in real time. This requires separating concerns between what resides in the app itself and what’s managed on the backend platform, according to Thangavelu.
“Things that can happen on your app that somebody else changed; the driver can cancel, or your order is now ready, and all those updates are now happening on your app, which is completely distinct from somewhere else,” he said. “That real-time nature is what sets the Uber app apart. To your question about how to separate the data, we need to think about what data the user can create and which is your app.”
The data generated by users (such as location, preferences and interactions) and server-side data (such as driver availability, trip states and fares) need to be meticulously synchronized. The real-time interactions demand that the server — rather than the app — initiate much of the data push to ensure all parties are consistently updated, Thangavelu added.
“A lot of data push comes from the server … and from a separation of concern [standpoint], we definitely look at where all these things have to come together, meaning the Uber Driver app and the Rider app cannot independently operate on their own data layers and microservices,” he said.
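To make that separation of concerns concrete, here is a minimal, illustrative Python sketch, with names and structure that are assumptions rather than Uber's actual code, of a server-owned trip state that is pushed out to rider and driver sessions instead of being polled by the apps:

```python
# Minimal sketch (not Uber's code): server-initiated push for trip-state changes.
# The server owns the authoritative trip state; rider and driver apps only
# receive updates -- they never mutate another party's data directly.
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class TripStateHub:
    # session_id -> callback that delivers a payload to that device
    subscribers: Dict[str, Callable[[dict], None]] = field(default_factory=dict)
    state: dict = field(default_factory=dict)

    def subscribe(self, session_id: str, deliver: Callable[[dict], None]) -> None:
        self.subscribers[session_id] = deliver

    def apply_event(self, event: dict) -> None:
        """Apply a server-side event (e.g. a driver cancel) and fan out the new state."""
        self.state.update(event)
        for deliver in self.subscribers.values():
            deliver(dict(self.state))  # server pushes; the app does not poll


hub = TripStateHub()
hub.subscribe("rider-123", lambda s: print("rider sees:", s))
hub.subscribe("driver-456", lambda s: print("driver sees:", s))
hub.apply_event({"trip_status": "driver_cancelled"})
```

The point of the pattern is that both apps observe the same server-side source of truth, which is what keeps their views consistent when one party's action changes the other's experience.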
Real-time orchestration and machine learning integration
Much of the secret sauce behind the dynamism of Uber's apps comes from machine learning algorithms. By rapidly processing incoming data streams, the system can adapt product recommendations based on factors such as user behavior or real-time demand. By combining that real-time user data with machine learning inference systems, the app's logic is informed by both current context and historical patterns, according to Thangavelu.
“There are pricing systems that determine how much to charge,” he said. “There are domains with multiple intra-services and multiple services that represent all the fulfillment states that combine the Rider states, but that’s a domain by itself. These very core domains are at the lowest layer.”
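A hypothetical sketch of how live and historical signals can feed a single scoring decision is shown below; the feature names and the toy linear model are illustrative assumptions, not Uber's pricing or ranking implementation:

```python
# Illustrative sketch only: combining a live demand signal with historical
# rider features to rank product options. Feature names and the scoring
# function are assumptions standing in for a trained model.
from dataclasses import dataclass


@dataclass
class RiderContext:
    recent_trips_per_week: float   # historical pattern
    live_surge_multiplier: float   # real-time demand signal
    prefers_shared_rides: bool     # stored preference


def score_option(option: str, ctx: RiderContext) -> float:
    """Toy linear score standing in for a trained ranking model."""
    base = {"uberx": 1.0, "shared": 0.8, "rental": 0.5}.get(option, 0.1)
    score = base + 0.2 * ctx.recent_trips_per_week
    if option == "shared" and ctx.prefers_shared_rides:
        score += 0.5
    # demand-sensitive options lose ground as surge rises
    score -= 0.3 * (ctx.live_surge_multiplier - 1.0)
    return score


ctx = RiderContext(recent_trips_per_week=3, live_surge_multiplier=1.4, prefers_shared_rides=True)
print(sorted(["uberx", "shared", "rental"], key=lambda o: score_option(o, ctx), reverse=True))
```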
Behind the scenes, Uber’s infrastructure combines various technologies to handle these complex processes. The company uses Google Spanner as the transactional database and custom frameworks for real-time event propagation and orchestration. Events and metadata are sent into Kafka for processing and storage in Uber’s Hive tables, enabling near real-time insights and historical analysis, according to Thangavelu.
“The way we do that [is] the trip state machine is entirely backed by Spanner,” he said. “In order to keep a copy of this historical context or offline analysis, at the nuts-and-bolts level, we have built frameworks; think of them [as] the state machines. What we have done is that at the framework level, at the state machines, we’ve created ways to emit events and metrics.”
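The framework-level hook Thangavelu describes, where every state transition also produces an analytics event, might look roughly like the following Python sketch. The transition table, event schema and emitter are assumptions for illustration; in production the emitter would be a Kafka producer feeding downstream Hive tables rather than a print call:

```python
# Sketch of the pattern described above (assumed details, not Uber's framework):
# a trip state machine whose transitions emit events through a pluggable sink.
import json
import time
from typing import Callable

VALID_TRANSITIONS = {
    "requested": {"accepted", "cancelled"},
    "accepted": {"en_route", "cancelled"},
    "en_route": {"completed"},
}


class TripStateMachine:
    def __init__(self, trip_id: str, emit: Callable[[str], None]):
        self.trip_id = trip_id
        self.state = "requested"
        self.emit = emit  # in production: send to a Kafka topic for offline analysis

    def transition(self, new_state: str) -> None:
        if new_state not in VALID_TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        event = {
            "trip_id": self.trip_id,
            "from": self.state,
            "to": new_state,
            "ts": time.time(),
        }
        self.state = new_state
        # framework-level hook: every transition is also an analytics event
        self.emit(json.dumps(event))


trip = TripStateMachine("trip-42", emit=print)  # print stands in for a Kafka producer
trip.transition("accepted")
trip.transition("en_route")
trip.transition("completed")
```

Emitting events at the framework layer, rather than in each service, is what gives every domain the same near real-time and historical visibility without extra per-team work.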
Here’s theCUBE’s complete video interview with Madan Thangavelu: