FloCon 2017 has ended
Back To Schedule
Thursday, January 12 • 9:00am - 9:30am
Discovering Deep Patterns in Large-scale Network Flows using Tensor Decompositions

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

We present an approach to a cyber security workflow based on ENSIGN, a high-performance implementation of tensor decomposition algorithms that enable the unsupervised discovery of subtle undercurrents and deep, cross-dimensional correlations within multi-dimensional data. This new approach of starting from identified patterns in the data complements traditional workflows that focus on highlighting individual suspicious activities. This enhanced workflow assists in identifying attackers who craft their actions to
subvert signature-based detection methods and automates much of the labor intensive forensic process of connecting isolated incidents into a coherent attack profile.

Tensor decompositions accept network metadata as multidimensional arrays, for example sender, receiver, port, and query type information, and produce components - weighted fragments of data that each capture a specific pattern. These components are the product of computationally intensive model-fitting routines that, with ENSIGN, have been aggressively optimized for the cyber domain. What ENSIGN provides is superior to other classical unsupervised machine learning approaches, such as dimensionality reduction or clustering, in that a decomposition into components can capture patterns that span the entire multidimensional data space. This can include patterns that reflect multiple sources, multiple receivers, periodic time intervals, and other complex correlations. From unsupervised discovery, domain knowledge attaches meaning to a handful of components each isolating
a key contributing pattern to the overall network flow. In most cases, the story underpinning the existence of a component is a self-evident, easily recognizable pattern of expected, benign activity. However, in other cases, patterns emerge among one or more dimensions  - regular time intervals, a common destination, a common request type - that reflect a deeper, more directed, intent.

Operating last year in the Security Operations Center (SOC) at SCinet - the large-scale research network stood up each year in support of the annual Supercomputing Conference (SC) - ENSIGN analyzed metadata collected for more than 600 million flows over a two-day span. ENSIGN tensor decomposition methods isolated activities of concern including the evolution of an SSH attack from scan to exploitation and a subtle, persistent attempt at DNS exfiltration. We present results from an updated and more advanced deployment of ENSIGN at SCinet as part of SC16. We highlight how the ENSIGN analytics used at SC are suited for automated post-processing and recurrent pattern detection, making them ideal for nightly reports. We demonstrate how novel joint tensor decompositions enable data fusion, allowing patterns to be discovered from multiple data sources with common elements. Finally, we illustrate an end-to-end workflow where ENSIGN builds on R-Scope (www.reservoir.com/product/ensign-cyber), a scalable and hardened network security monitor based on Bro (www.bro.org) that collects the rich contextual metadata crucial to the success of unsupervised discovery, and Splunk as a metadata access store. We show how this combination provides a powerful analytic tool curity professionals in capturing and visualizing - and ultimately comprehending - the patterns contained within the vast volumes of traffic on a large-scale network.


James Ezick

Reservoir Labs
James Ezick is the lead for Reservoir's Analytics, Reasoning, and Verification Team. Since joining Reservoir in 2004, he has developed solutions addressing a broad range of research and commercial challenges in verification, compilers, cyber security, software-defined radio, high-performance... Read More →

Thursday January 12, 2017 9:00am - 9:30am PST
Great Room V-VIII 7450 Hazard Center Dr.