Random Forests of Very Fast Decision Trees on GPU for Mining Evolving Big Data Streams

Publication
Aug 18, 2014
Abstract

Random Forest is a classical ensemble method used to improve the performance of single-tree classifiers. It obtains superior performance by increasing the diversity of the individual classifiers. However, in the more challenging context of evolving data streams, the classifier also has to be adaptive and work under very strict constraints of space and time. Furthermore, the computational load of using a large number of classifiers can make its application extremely expensive. In this work, we present a method for building Random Forests that use Very Fast Decision Trees for data streams on GPUs. We show how this method can benefit from the massively parallel architecture of GPUs, which are becoming an efficient hardware alternative to large clusters of computers. Moreover, our algorithm minimizes the communication between CPU and GPU by building the trees directly inside the GPU. We run an empirical evaluation and compare our method with two well-known machine learning frameworks, VFML and MOA. Random Forests on the GPU are at least 300x faster while maintaining a similar accuracy.
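To illustrate the idea of growing the ensemble directly on the device, the following CUDA sketch assigns one GPU thread to each Very Fast Decision Tree and updates leaf sufficient statistics from a batch of streaming instances held in device memory. This is a minimal illustration under assumed data structures (`Tree`, binary attributes, the constants below), not the paper's implementation; the actual split logic, drift handling, and finer-grained parallelism are omitted.

```cuda
// Hedged sketch (not the authors' code): one CUDA thread per ensemble tree.
// Each tree keeps flat node arrays in device memory, so the host only streams
// raw instance batches to the GPU; tree growth happens entirely on-device.

#include <cuda_runtime.h>
#include <math.h>

#define NUM_TREES   128   // ensemble size (illustrative)
#define MAX_NODES   1024  // per-tree node budget (illustrative)
#define NUM_ATTRS   16    // binary attributes, for simplicity
#define NUM_CLASSES 2

struct Tree {
    int   split_attr[MAX_NODES];                          // -1 marks a leaf
    int   child[MAX_NODES][2];                            // children per branch
    float counts[MAX_NODES][NUM_ATTRS][2][NUM_CLASSES];   // leaf sufficient statistics
    int   n_seen[MAX_NODES];                              // instances seen at each leaf
    int   n_nodes;
};

// Hoeffding bound used by Very Fast Decision Trees: with probability 1 - delta,
// the observed mean of a range-R variable is within eps of its true mean after n samples.
__device__ float hoeffding_bound(float R, float delta, int n) {
    return sqrtf(R * R * logf(1.0f / delta) / (2.0f * n));
}

__global__ void update_forest(Tree *forest, const int *batch_x,
                              const int *batch_y, int batch_size) {
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= NUM_TREES) return;
    Tree *tree = &forest[t];        // each thread owns one tree: no synchronization needed

    for (int i = 0; i < batch_size; ++i) {
        const int *x = &batch_x[i * NUM_ATTRS];
        int y = batch_y[i];

        // Sort the instance down to a leaf.
        int node = 0;
        while (tree->split_attr[node] >= 0)
            node = tree->child[node][x[tree->split_attr[node]]];

        // Update the leaf's sufficient statistics.
        for (int a = 0; a < NUM_ATTRS; ++a)
            tree->counts[node][a][x[a]][y] += 1.0f;
        tree->n_seen[node] += 1;

        // A full VFDT would now compare the information gain of the best and
        // second-best attributes against hoeffding_bound(...) and split the
        // leaf on-device when the gap exceeds the bound (omitted for brevity).
    }
}
```

In this sketch the host loop only copies each incoming mini-batch of instances to the GPU and launches `update_forest`; all tree state stays resident in device memory, which is one way to keep CPU-GPU communication to a minimum as described in the abstract.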

  • ECAI 2014
  • Conference/Workshop Paper
