Anomaly detection in dynamic systems using weak estimators
Original version
Zhan, J., Oommen, B. J., & Crisostomo, J. (2011). Anomaly detection in dynamic systems using weak estimators. ACM Transactions on Internet Technology, 11(1), 1-16. doi: 10.1145/1993083.1993086
Abstract
Anomaly detection involves identifying observations that deviate from the normal behavior of a system. One
of the ways to achieve this is by identifying the phenomena that characterize “normal” observations. Subsequently,
based on the characteristics of data learned from the “normal” observations, new observations are
classified as being either “normal” or not. Most state-of-the-art approaches, especially those which belong to
the family of parameterized statistical schemes, work under the assumption that the underlying distributions
of the observations are stationary. That is, they assume that the distributions that are learned during
the training (or learning) phase, though unknown, are not time-varying. They further assume that the
same distributions are relevant even as new observations are encountered. Although such a “stationarity”
assumption is relevant for many applications, there are some anomaly detection problems where stationarity
cannot be assumed. For example, in network monitoring, the patterns which are learned to represent
normal behavior may change over time due to several factors such as network infrastructure expansion, new
services, growth of user population, and so on. Similarly, in meteorology, identifying anomalous temperature
patterns involves taking into account seasonal changes of normal observations. Detecting anomalies
or outliers under these circumstances introduces several challenges. Indeed, the ability to adapt to changes
in nonstationary environments is necessary so that anomalous observations can be identified even with
changes in what would otherwise be classified as “normal” behavior. In this article, we propose to apply
a family of weak estimators for anomaly detection in dynamic environments. In particular, we apply this
theory to spam email detection. Our experimental results demonstrate that our proposal is both feasible
and effective for the detection of such anomalous emails.
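To illustrate why the stationarity assumption matters, the following minimal sketch contrasts a running mean, which implicitly assumes a fixed distribution, with an estimator that discounts older observations. It is not taken from the article: the abstract does not state the update rule, so the multiplicative update below (of the kind used by stochastic-learning weak estimators for a binary variable) is an assumption, as are the names weak_estimate and running_mean, the parameter lam, and the synthetic stream.

```python
import random


def weak_estimate(observations, lam=0.9, p0=0.5):
    """Track P(x = 1) for a binary stream with a multiplicative update.

    lam (assumed, 0 < lam < 1) controls how quickly old data is forgotten.
    """
    p = p0
    history = []
    for x in observations:
        # Shrink the old estimate by lam and move toward 1 when x == 1.
        p = lam * p + (1.0 - lam) * (1.0 if x == 1 else 0.0)
        history.append(p)
    return history


def running_mean(observations):
    """Stationary baseline: the fraction of ones seen so far."""
    total = 0
    history = []
    for n, x in enumerate(observations, start=1):
        total += x
        history.append(total / n)
    return history


# Hypothetical nonstationary stream: P(x = 1) switches from 0.2 to 0.8 halfway.
rng = random.Random(0)
stream = [1 if rng.random() < 0.2 else 0 for _ in range(500)]
stream += [1 if rng.random() < 0.8 else 0 for _ in range(500)]

weak = weak_estimate(stream)
mean = running_mean(stream)
print(f"weak estimator: {weak[-1]:.2f}, running mean: {mean[-1]:.2f}")
```

In a run like this, the running mean ends up between the two regimes because it weights all 1000 observations equally, whereas the discounted estimate follows the most recent regime. A smaller lam adapts faster but gives a noisier estimate.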
Description
Accepted version of an article from the journal ACM Transactions on Internet Technology. Published version available from the ACM: http://dx.doi.org/10.1145/1993083.1993086