Taking Omid to the Clouds: Fast, Scalable Transactions for Real-Time Cloud Analytics

Nov 30, 2018

We describe how we evolve Omid, a transaction processing system for Apache HBase, to power Apache Phoenix, a cloud-grade real-time SQL analytics engine.

Omid was originally designed for data processing pipelines at Yahoo, which are, by and large, throughput-oriented monolithic NoSQL applications. Providing a platform to support converged real-time transaction processing and analytics applications – dubbed translytics – introduces new functional and performance requirements. For example, SQL support is key for developer productivity, multi-tenancy is essential for cloud deployment, and latency is cardinal for just-in-time data ingestion and analytics insights.

We discuss our efforts to adapt Omid to these new domains, as part of the process of integrating it into Phoenix as the transaction processing backend. A central piece of our work is latency reduction in Omid’s protocol, which also improves scalability. Under light load, the new protocol’s latency is 4x to 5x smaller than the legacy Omid’s, whereas under increased loads it is an order of magnitude faster. We further describe a fast path protocol for single-key transactions, which enables processing them almost as fast as native HBase operations.

  • International Conference on Very Large Data Bases (PVLDB 2018)
  • Journal