Algebraic and spectral methods for network anomaly detection.
NATO school on Mining Massive Datasets for Security
Italy, Sep. 2007
In: NATO school on Mining Massive Datasets for Security, 5-29 Sep, 2007, Gazzada, Italy.
The tutorial will discuss two central issues: (i) information-theoretic principles and algorithms for extracting predictive statistics in distributed networks, and (ii) algebraic and spectral methods for network anomaly detection. The first part will deal with the concept of predictive information (the mutual information between the past and the future of a process), its sub-extensive properties, and algorithms for estimating it from data. We will argue that information-theoretic predictability quantifies the complexity of a process and provides effective ways of detecting anomalies and surprises in it. Using Information Bottleneck algorithms, one can extract approximate sufficient statistics of the past for predicting the future of the process and use them as anomaly detectors on multiple time scales. In the second part we will discuss ways of analyzing network activity using spectral methods (distributed PCA and network Laplacian analysis) to identify regular temporal patterns of connected network components. By combining the two approaches, we will suggest new techniques for network anomaly detection for security.
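As a rough illustration of the network Laplacian analysis mentioned in the second part, the sketch below builds the combinatorial Laplacian L = D - A of a small undirected graph and counts its connected components from the number of (near-)zero eigenvalues. This is not taken from the tutorial itself: the toy six-host topology, the function names `laplacian_spectrum` and `count_components`, and the tolerance `tol` are all assumptions made for the example.

```python
import numpy as np


def laplacian_spectrum(adjacency):
    """Eigenvalues (ascending) and eigenvectors of L = D - A.

    Assumes `adjacency` is a symmetric matrix of an undirected graph.
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    L = np.diag(degrees) - A
    # eigh is appropriate because L is real and symmetric
    return np.linalg.eigh(L)


def count_components(adjacency, tol=1e-9):
    """Number of connected components = multiplicity of eigenvalue 0 of L.

    `tol` is an illustrative threshold for treating an eigenvalue as zero.
    """
    eigenvalues, _ = laplacian_spectrum(adjacency)
    return int(np.sum(eigenvalues < tol))


# Hypothetical toy network: two isolated triangles of hosts, 0-2 and 3-5.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0

n_before = count_components(A)

# A new link between the two groups merges them into one component,
# which shows up as a change in the low end of the Laplacian spectrum.
A[2, 3] = A[3, 2] = 1.0
n_after = count_components(A)

print(n_before, n_after)
```

In a monitoring setting one might recompute such a spectrum over successive time windows and flag windows in which the low eigenvalues change abruptly; how to set the window length and threshold is left open here.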