This event has ended. View the official site or create your own event → Check it out
This event has ended. Create your own
View analytic
Thursday, January 12 • 9:30am - 10:00am
Scalable Temporal Analytics to Detect Automation and Coordination

Sign up or log in to save this to your schedule and see who's attending!

Temporal analysis of cyber data can be leveraged in a number of ways to identify automated behavior, to include: periodic, "bursty", and coordinated activity. Malware frequently makes use of regular or periodic polling in order to receive updates or commands. Bursty and coordinated activity can be indicative of scanning, denial of service, as well as exfiltration among victims. Automated behaviors discovered through temporal analysis can be fed into post-processing analytics, such as whitelisting/filtering and clustering, to identify anomalous or outlier automated behaviors on cyber networks.

This presentation will focus on scalable and flexible techniques for applying analytics on various types of logs/features, as well as methodologies to further narrow the results to anomalous/outlier cases that may be indicative of a cyber security event. Operational use-cases leveraging these techniques on real-world data will be presented. For example, in Kaspersky's recent (July 2016) report on the "Project Sauron" advanced persistent threat (footnote: https://securelist.com/files/2016/07/The-ProjectSauron-APT_research_KL.pdf) their research identifies the use of DNS and/or HTTP to poll/check-in to C2 at specific times, supporting up to 31 unique date/time parameters. Scalable, flexible temporal analysis of network traffic would allow for identification of such automated behavior.

The specific algorithms used to identify periodic behavior include a Fourier transform used to identify candidate periodicities which are then filtered down and refined using the autocorrelation function of the time series. A fast Fourier transform algorithm is used to compute the transform on each time series while an inverse fast Fourier transform is used on the resulting periodogram to obtain the autocorrelation function. These operations are performed at scale in parallel across millions of entities (e.g. IP addresses). "Bursty" behavior is detected based on comparing time series values to summary statistics of the series over a sliding window in time for each entity. Coordinated activity is found by performing a nearest neighbor search across entities in various metric spaces using Jaccard, Cosine, or Euclidean distance. Distance is measured on feature spaces to include Fourier
coefficients, sets of time stamps where activity is observed or spikes (referred to as time signatures), and shingles of inter-arrival time sequences. The nearest neighbor search is performed using a scalable locality sensitive hashing algorithm that allows us to filter down large sets of data to entities with similar temporal behavior. We can apply this technique across multiple data sources, leveraging the commonality of a time dimension in each, in order to identify entities that are acting in an apparently coordinated manner, while accounting for possible offsets in log synchronization. Post processing on the set of 'similar' entities discovered in this manner may include applying unsupervised learning techniques to flag anomalous coordinated activity as well as supervised techniques to classify coordinated activity that has been whitelisted.


Lauren Deason

Data Scientist, DZYNE Technologies
Lauren Deason is a Data Scientist with DZYNE Technologies working on the DARPA Network Defense program, focusing on applying digital signal processing and machine learning techniques to detect automated and coordinated behavior in cyber data. Lauren holds a PhD in Economics from University of Maryland, College Park, a Master’s degree in Mathematics from the University of California, Berkeley, and a Bachelor’s degree in Applied... Read More →

Thursday January 12, 2017 9:30am - 10:00am
Great Room V-VIII 7450 Hazard Center Dr.

Attendees (21)