TriAD: A Distributed Shared-Nothing RDF Engine based on Asynchronous Message Passing

Publication
Jun 22, 2014
[Work published prior to Yahoo]
Abstract

We investigate a new approach to the design of distributed, shared-nothing RDF engines. Our engine, coined “TriAD”, combines join-ahead pruning via a novel form of RDF graph summarization with a locality-based, horizontal partitioning of RDF triples into a grid-like, distributed index structure. The multi-threaded and distributed execution of joins in TriAD is facilitated by an asynchronous Message Passing protocol which allows us to run multiple join operators along a query plan in a fully parallel, asynchronous fashion. We believe that our architecture provides a so far unique approach to join-ahead pruning in a distributed environment, as the more classical form of sideways information passing would not permit executing distributed joins in an asynchronous way. Our experiments over the LUBM, BTC and WSDTS benchmarks demonstrate that TriAD consistently outperforms centralized RDF engines by up to two orders of magnitude, while gaining a factor of more than three compared to the currently fastest distributed engines. To our knowledge, we are thus able to report the fastest query response times so far for the above benchmarks using a mid-range server and regular Ethernet setup.
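The general idea of join-ahead pruning over a summary graph can be illustrated with a toy, single-process sketch. It is not TriAD's implementation: the data, the node-to-partition assignment `PARTITION`, and the function names `build_summary`, `prune_with_summary`, and `join_path` are all hypothetical, and the distributed, asynchronous execution described in the abstract is left out entirely. The sketch only shows how a two-pattern path query is first matched on a small summary graph, so that triples in partitions that cannot contribute to the join are skipped before the actual join runs.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples; purely illustrative data.
TRIPLES = [
    ("alice", "worksAt", "uniA"),
    ("bob", "worksAt", "uniB"),
    ("dave", "worksAt", "startupX"),
    ("uniA", "locatedIn", "berlin"),
    ("uniB", "locatedIn", "paris"),
    ("carol", "livesIn", "rome"),
]

# Hypothetical assignment of nodes to partitions ("supernodes").
# A real engine would derive this from a graph partitioner.
PARTITION = {
    "alice": 0, "uniA": 0, "berlin": 0,
    "bob": 1, "uniB": 1, "paris": 1,
    "carol": 2, "rome": 2,
    "dave": 3, "startupX": 3,
}


def build_summary(triples, partition):
    """Collapse triples into summary edges (subject partition, predicate, object partition)."""
    return {(partition[s], p, partition[o]) for s, p, o in triples}


def prune_with_summary(summary, pred1, pred2):
    """Match the path ?x pred1 ?y . ?y pred2 ?z on the summary graph.

    Returns the set of (px, py, pz) partition bindings that may hold results.
    """
    second_hop = defaultdict(set)
    for ps, p, po in summary:
        if p == pred2:
            second_hop[ps].add(po)
    return {
        (ps, po, pz)
        for ps, p, po in summary
        if p == pred1
        for pz in second_hop.get(po, ())
    }


def join_path(triples, partition, pred1, pred2):
    """Join the two patterns, scanning only triples in partitions that survive pruning."""
    allowed = prune_with_summary(build_summary(triples, partition), pred1, pred2)
    allowed_first = {(px, py) for px, py, _ in allowed}
    allowed_second = {(py, pz) for _, py, pz in allowed}

    # Left input: pred1 triples whose partition pair survived pruning.
    left = [
        (s, o) for s, p, o in triples
        if p == pred1 and (partition[s], partition[o]) in allowed_first
    ]
    # Right input: pred2 triples, hashed on the join variable ?y.
    right = defaultdict(list)
    for s, p, o in triples:
        if p == pred2 and (partition[s], partition[o]) in allowed_second:
            right[s].append(o)

    return sorted((x, y, z) for x, y in left for z in right.get(y, ()))


if __name__ == "__main__":
    print(join_path(TRIPLES, PARTITION, "worksAt", "locatedIn"))
```

In this toy data set the triple about `dave` is never scanned by the join, because its partition has no `locatedIn` edge in the summary graph. The pruning step is cheap because the summary has at most one edge per predicate and partition pair, regardless of how many triples each partition holds.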

  • ACM SIGMOD International Conference on Management of Data (SIGMOD 2014)
  • Conference/Workshop Paper
