As new Yahoo Mail or Flickr features go from prototype to full production, monitoring vital system components for problems (i.e., anomalies) is critical. Manually setting static thresholds to detect large spikes in traffic is impractical, both because of the sheer number of time series that must be monitored and because of the many false positives and false negatives this simple approach produces.
Today Yahoo Labs is announcing the open-sourcing of our Extendible and Generic Anomaly Detection System, or EGADS, which detects such anomalies automatically in a robust and scalable fashion. At Yahoo, we collect very large amounts of data over time, and it is critical to detect unusual or anomalous activity in this data. For example, Yahoo's membership service is continuously monitored: many measurements of system health and user traffic are aggregated every hour and broken down by geographical location and other categories. When working with as much data as we have at Yahoo, it is important to automatically identify when things go wrong, or simply change, in such data. (Note that we also recently open sourced a complementary Anomalous R package, in collaboration with Rob Hyndman, an upcoming Big Thinker speaker at Yahoo, which finds anomalous time series in the context of other time series.)
There are a number of anomaly detection systems [1,2,3,4], but they are use-case specific and not extendible. They do not serve the needs of the many projects at Yahoo that require different types of anomaly detection, with the additional requirements that the models be easily configurable, scalable, and dependency-free for simple deployment. Motivated by the need for a lightweight library that can be used as a service and that contains a set of established anomaly detection methods applicable to many use cases, we created EGADS.
Today EGADS supports over 20 forecasting and anomaly detection models, implemented in Java and used by Yahoo Membership, Yahoo Mail, and internal monitoring tools. While scientists and engineers continue to innovate on forecasting and anomaly detection models written in higher-level languages, EGADS offers Yahoo and the external community a standardized anomaly detection library that is production ready, lightweight, and has no dependencies beyond Java. In other words, EGADS does not interfere with current “hack and deploy” solutions; instead, it is a collection of more mature models that have been tested over time.
Figures 1 and 2 (below) show, respectively, the F1-Score and performance results of various EGADS models compared to open-sourced counterparts on our recently released anomaly detection dataset.
Figure 1: Average F1-Score on EGADS compared to [1,4]
Figure 2: Average running time on EGADS compared to [1,4]
Getting started: To build the jar library, clone the EGADS repo and run mvn clean compile assembly:single, which produces a single jar with all dependencies included, ready for deployment.
An example: EGADS supports a number of time-series and anomaly detection models, all of which are specified in the configuration file. The example below uses the OlympicModel (a seasonal model) for time-series modeling together with the ExtremeLowDensityModel for anomaly detection, a density-based method for finding unusual error regions. To run the example on a sample time series, type: java -cp target/egads-jar-with-dependencies.jar com.yahoo.egads.Egads src/test/resources/sample_config.ini src/test/resources/sample_input.csv which produces a plot similar to Figure 3 (when OUTPUT is set to GUI). Notice that the time-series behavior changes around point 1100. To capture change points only, we can switch the anomaly detection model to AdaptiveKernelDensityChangePointDetector, which slides two adjacent windows over the series and flags a point p as a change point if the residual distributions in the two windows differ sufficiently. Figure 4 (below) shows the detected change points. Many more fast and accurate models are available within EGADS, and researchers are encouraged to experiment with them and contribute.
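For reference, the model selection in the configuration file might look like the following. This is a sketch modeled on the sample_config.ini shipped with the repo; treat the exact key names (TS_MODEL, AD_MODEL, OUTPUT) and the key/value separator as assumptions, and consult that file for the authoritative set of options.

```ini
; Hypothetical excerpt of a sample_config.ini-style configuration (key names assumed).

; Time-series model used to forecast expected values:
TS_MODEL = OlympicModel

; Anomaly detection model applied to the forecast residuals.
; To detect change points instead of outliers, set this to
; AdaptiveKernelDensityChangePointDetector.
AD_MODEL = ExtremeLowDensityModel

; Set OUTPUT to GUI to produce a plot like Figure 3:
OUTPUT = GUI
```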
Figure 3: Detected outliers
Figure 4: Detected change-points
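To make the two-window change-point idea concrete, here is a minimal, self-contained sketch. This is not EGADS code: the class and method names are invented for illustration, and it compares only the means of two adjacent sliding windows, whereas EGADS's AdaptiveKernelDensityChangePointDetector compares the full residual distributions in the two windows via kernel density estimation, which is far more robust.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative two-window change-point detector (NOT the EGADS
 * implementation; names are hypothetical). A point p is flagged when
 * the mean of the trailing window differs from the mean of the leading
 * window by more than `threshold`.
 */
public class TwoWindowChangePointSketch {

    // Mean of series[from..to).
    static double mean(double[] series, int from, int to) {
        double sum = 0.0;
        for (int i = from; i < to; i++) sum += series[i];
        return sum / (to - from);
    }

    /** Returns every index p where |mean(left window) - mean(right window)| > threshold. */
    public static List<Integer> detect(double[] series, int window, double threshold) {
        List<Integer> changePoints = new ArrayList<>();
        for (int p = window; p + window <= series.length; p++) {
            double left = mean(series, p - window, p);   // trailing window
            double right = mean(series, p, p + window);  // leading window
            if (Math.abs(left - right) > threshold) {
                changePoints.add(p);
            }
        }
        return changePoints;
    }

    public static void main(String[] args) {
        // Synthetic series with a level shift from 0 to 5 at index 50.
        double[] series = new double[100];
        for (int i = 50; i < 100; i++) series[i] = 5.0;
        // Flagged indices cluster around the true change point at 50.
        System.out.println(detect(series, 10, 2.5));
    }
}
```

A real detector would replace the mean comparison with a distributional distance between the two windows, which is the essence of the kernel-density approach described above.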