Uber Open Sources AthenaX, Its Streaming Analytics Platform

In production for six months, the platform powers Uber's business, running more than 220 applications in the company's data centers.

Christine Hall

October 17, 2017

Uber in London (Leon Neal/Getty Images)

Uber has open sourced AthenaX, the streaming analytics platform that runs its business. Simply put, the platform is the ride-hailing company's way of channeling data from a variety of real-time sources while running streaming analytics using Structured Query Language (SQL). There's no need to ask if it'll scale. If it's being used by Uber, at scale is where it starts.

This is only the latest example of a big corporation making the platform that powers its business available to anyone who needs it. Doing so puts Uber on a long list that includes Facebook, Walmart, and General Electric, to name just a few.

In production for six months, AthenaX currently runs more than 220 applications in multiple Uber data centers, where the company says it's processing billions of messages every day. It's being used with Michelangelo, the company's machine learning platform; with UberEATS Restaurant Manager, which analyzes data for restaurants using its food delivery service; and with UberPOOL, its carpooling service.

"To meet the needs of Uber’s scale, AthenaX compiles and optimizes SQL queries down to distributed streaming applications that can process up to several million messages per second using only eight YARN containers," Haohui Mai, Bill Liu and Naveen Cherukuri said in a fairly detailed how-it-works article on Uber's website. "AthenaX also manages the applications end-to-end, including continuously monitoring their health, scaling them automatically based on the size of inputs, and gracefully recovering them from node failures or data center failovers."
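For readers who haven't worked with streaming SQL, here is a rough idea of the kind of computation such a query expresses. This is a minimal Python sketch, not AthenaX code: the function name, the event format, and the one-minute window are all assumptions made for illustration, and it runs over a small finite list rather than an unbounded stream. It counts events per key within fixed, non-overlapping time windows, which is roughly what a SQL query that groups by a time window and a key would ask for.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window, key).

    `events` is an iterable of (timestamp_in_seconds, key) pairs.
    Both the event shape and the default window size are hypothetical.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of the fixed window it falls in.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Made-up sample events: (seconds since start, city).
events = [
    (0, "london"),
    (15, "london"),
    (59, "paris"),
    (60, "london"),
    (125, "paris"),
]

result = tumbling_window_counts(events)
for (window_start, key), count in sorted(result.items()):
    print(window_start, key, count)
```

A real streaming engine does the same grouping continuously and in parallel across many machines, which is where the compilation and scaling work described by Uber's engineers comes in.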

Out of the box, it comes with resource estimation and auto scaling, so an application uses enough resources to keep up with its input without over-provisioning. And because Uber's business requires four nines (99.99 percent) of uptime, it has built-in monitoring and automatic failure recovery.
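To put the four-nines requirement in concrete terms, the arithmetic can be sketched in a few lines of Python. The function name and the 365-day year are assumptions chosen for illustration; the article does not say how Uber measures uptime.

```python
def allowed_downtime_minutes(availability, period_days=365):
    """Minutes of downtime permitted at a given availability level.

    `availability` is a fraction, e.g. 0.9999 for four nines.
    The 365-day period is an assumption, not a figure from Uber.
    """
    minutes_in_period = period_days * 24 * 60
    return (1 - availability) * minutes_in_period


four_nines = allowed_downtime_minutes(0.9999)
print(round(four_nines, 2))  # about 52.56 minutes per year
```

In other words, four nines leaves less than an hour of total downtime in a year, which is why automatic recovery matters more than manual intervention at that level.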

AthenaX has been released under the Apache 2.0 license, which means it can be used in projects licensed under practically any other open source license. And because Apache 2.0 is a "permissive" license, the code can also be rolled into proprietary projects.

The latter is why I expect this to see a lot of quick uptake. It's mobile-ready, it scales, it can analyze massive amounts of data, and it can be taken private. That's just what a lot of companies, startups and otherwise, need to launch that killer app that would otherwise be too complex to develop on a small budget.

Want to take a look? It's available on GitHub.

About the Author(s)

Christine Hall

Freelance author

Christine Hall has been a journalist since 1971. In 2001 she began writing a weekly consumer computer column and began covering IT full time in 2002, focusing on Linux and open source software. Since 2010 she's published and edited the website FOSS Force. Follow her on Twitter: @BrideOfLinux.
