FloCon 2017: Full Schedule

8:30am PST

Uncovering Beacons Using Behavioral Analytics and Information Theory

A beacon, or a heartbeat, is machine-generated traffic leaving the network to confirm availability to or seek new instructions from an external system. Beacons may be used for innocuous purposes (such as checking for Microsoft updates) or for malicious purposes (such as registering an infected host to a C2 server). In this presentation, we will demonstrate how to detect beacons using a combination of packet count entropy, producer-consumer ratios, and dynamically generated hostname detection across a Bro dataset. Packet count entropy is used to measure variance in the number of packets transmitted in a set of connections, with the assumption being that human-driven traffic will exhibit a wide distribution of different packet counts across connections and beaconing traffic will exhibit a comparably narrow distribution of different packet counts. Producer-consumer ratios compare the number of bytes leaving a client with the number of bytes returning to a client to detect clients regularly transmitting data outward without receiving data in return. Dynamically generated hostname detection looks for hosts with machine-generated hostnames to root out hosts that may attempt to escape detection by constantly changing hostnames. We combine these three independent signals to detect potential hosts that are attracting beacon connections from inside our network. We can then cross-reference this data against open-source and proprietary threat intelligence to detect possible C2 servers.

In this presentation, we will demonstrate that these tasks can be accomplished using a small number of SQL scripts that can be easily parameterized, with results aggregated by a Python or shell script. As such, they can easily be automated to run on a set frequency or when new batches of data are available.
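
As a rough illustration of two of the signals described above, the following minimal Python sketch computes a packet count entropy and a producer-consumer ratio. The function names, the use of Shannon entropy in bits, and the normalized ratio (bytes out minus bytes in, divided by their sum) are assumptions made for this example; the abstract does not specify the exact formulas or the SQL used by the speakers.

```python
import math
from collections import Counter


def packet_count_entropy(packet_counts):
    """Shannon entropy (in bits) of the per-connection packet counts.

    Assumption: each element is the packet count of one connection
    between the same pair of hosts. Low entropy suggests uniform,
    machine-like connections.
    """
    total = len(packet_counts)
    if total == 0:
        return 0.0
    frequencies = Counter(packet_counts)
    return -sum(
        (n / total) * math.log2(n / total) for n in frequencies.values()
    )


def producer_consumer_ratio(bytes_out, bytes_in):
    """Normalized ratio in [-1, 1] (assumed form, not from the abstract).

    +1 means the client only sends data, -1 means it only receives.
    """
    total = bytes_out + bytes_in
    if total == 0:
        return 0.0
    return (bytes_out - bytes_in) / total


# Hypothetical data: a beacon-like host sends the same number of packets
# on every connection, while a human-driven host varies widely.
beacon_like = [4, 4, 4, 4, 4, 4, 4, 4]
human_like = [3, 17, 42, 8, 120, 5, 64, 11]

beacon_entropy = packet_count_entropy(beacon_like)
human_entropy = packet_count_entropy(human_like)
outbound_heavy = producer_consumer_ratio(bytes_out=900, bytes_in=100)
```

In this toy data the beacon-like host has the lower entropy, and the outbound-heavy client has a ratio close to +1; thresholds for flagging either signal would need to be tuned on real traffic.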

Speakers

Eric Dull

Specialist Leader, Deloitte & Touche, LLP

Eric Dull is a Specialist Leader at Deloitte, leading large-scale data science and cyber security applications for a variety of United States Government and commercial clients. He is an expert in applied graph theory, data mining, and anomaly detection. His work includes machine...

Brian Sacash

Specialist Senior, Deloitte & Touche, LLP

Brian Sacash is a Specialist Senior at Deloitte, focusing on data science and software development in the cyber security sector. He has experience employing natural language processing, statistical analysis, and machine learning, using big data technologies, for analytic-based decision...

Uncovering Beacons Using Behavioral Analytics and Information Theory pptx

Thursday January 12, 2017 8:30am - 9:00am PST
Great Room V-VIII 7450 Hazard Center Dr.

General Session, Behavior and Patterns

9:00am PST

Discovering Deep Patterns in Large-scale Network Flows using Tensor Decompositions

We present an approach to a cyber security workflow based on ENSIGN, a high-performance implementation of tensor decomposition algorithms that enable the unsupervised discovery of subtle undercurrents and deep, cross-dimensional correlations within multi-dimensional data. This new approach of starting from identified patterns in the data complements traditional workflows that focus on highlighting individual suspicious activities. This enhanced workflow assists in identifying attackers who craft their actions to subvert signature-based detection methods and automates much of the labor intensive forensic process of connecting isolated incidents into a coherent attack profile.

Tensor decompositions accept network metadata as multidimensional arrays, for example sender, receiver, port, and query type information, and produce components - weighted fragments of data that each capture a specific pattern. These components are the product of computationally intensive model-fitting routines that, with ENSIGN, have been aggressively optimized for the cyber domain. What ENSIGN provides is superior to other classical unsupervised machine learning approaches, such as dimensionality reduction or clustering, in that a decomposition into components can capture patterns that span the entire multidimensional data space. This can include patterns that reflect multiple sources, multiple receivers, periodic time intervals, and other complex correlations. From unsupervised discovery, domain knowledge attaches meaning to a handful of components each isolating a key contributing pattern to the overall network flow. In most cases, the story underpinning the existence of a component is a self-evident, easily recognizable pattern of expected, benign activity. However, in other cases, patterns emerge among one or more dimensions - regular time intervals, a common destination, a common request type - that reflect a deeper, more directed, intent.
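
To make the idea of a component concrete, here is a minimal Python sketch that counts flow records into a sparse three-way tensor (sender, receiver, port) and fits a single rank-1 component by alternating updates. This is the simplest textbook case of a tensor decomposition, not ENSIGN's algorithm; the flow records, the choice of three modes, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical flow records: (sender, receiver, destination port).
flows = (
    [("10.0.0.5", "203.0.113.9", 53)] * 6
    + [("10.0.0.7", "203.0.113.9", 53)] * 3
    + [("10.0.0.5", "198.51.100.2", 443)]
)


def build_tensor(records):
    """Count identical tuples into a sparse 3-way tensor {index: count}."""
    return Counter(records)


def rank1_component(tensor, iterations=25):
    """Fit one component weight * (a x b x c) by alternating updates.

    Each pass recomputes one factor from the other two and normalizes
    it; the final norm is the weight of the component.
    """
    if not tensor:
        raise ValueError("tensor has no entries")
    labels = [sorted({key[m] for key in tensor}, key=str) for m in range(3)]
    factors = [dict.fromkeys(mode_labels, 1.0) for mode_labels in labels]
    weight = 0.0
    for _ in range(iterations):
        for m in range(3):
            p, q = [n for n in range(3) if n != m]
            updated = dict.fromkeys(labels[m], 0.0)
            for key, count in tensor.items():
                updated[key[m]] += count * factors[p][key[p]] * factors[q][key[q]]
            weight = math.sqrt(sum(v * v for v in updated.values()))
            factors[m] = {label: v / weight for label, v in updated.items()}
    return weight, factors


weight, (senders, receivers, ports) = rank1_component(build_tensor(flows))
top_sender = max(senders, key=senders.get)
```

The factor dictionaries give each sender, receiver, and port a score within the component, so the highest-scoring labels describe the pattern the component captures. A full decomposition would fit several such components at once and typically add further modes such as time.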

Operating last year in the Security Operations Center (SOC) at SCinet - the large-scale research network stood up each year in support of the annual Supercomputing Conference (SC) - ENSIGN analyzed metadata collected for more than 600 million flows over a two-day span. ENSIGN tensor decomposition methods isolated activities of concern including the evolution of an SSH attack from scan to exploitation and a subtle, persistent attempt at DNS exfiltration. We present results from an updated and more advanced deployment of ENSIGN at SCinet as part of SC16. We highlight how the ENSIGN analytics used at SC are suited for automated post-processing and recurrent pattern detection, making them ideal for nightly reports. We demonstrate how novel joint tensor decompositions enable data fusion, allowing patterns to be discovered from multiple data sources with common elements. Finally, we illustrate an end-to-end workflow where ENSIGN builds on R-Scope (www.reservoir.com/product/ensign-cyber), a scalable and hardened network security monitor based on Bro (www.bro.org) that collects the rich contextual metadata crucial to the success of unsupervised discovery, and Splunk as a metadata access store. We show how this combination provides a powerful analytic tool to aid security professionals in capturing and visualizing - and ultimately comprehending - the patterns contained within the vast volumes of traffic on a large-scale network.

Speakers

James Ezick

Reservoir Labs

James Ezick is the lead for Reservoir's Analytics, Reasoning, and Verification Team. Since joining Reservoir in 2004, he has developed solutions addressing a broad range of research and commercial challenges in verification, compilers, cyber security, software-defined radio, high-performance...

Discovering Deep Patterns in Large scale Network Flows using Tensor Decompositions pdf

Thursday January 12, 2017 9:00am - 9:30am PST
Great Room V-VIII 7450 Hazard Center Dr.

General Session, Behavior and Patterns

9:30am PST

Scalable Temporal Analytics to Detect Automation and Coordination

Temporal analysis of cyber data can be leveraged in a number of ways to identify automated behavior, including periodic, "bursty", and coordinated activity. Malware frequently makes use of regular or periodic polling in order to receive updates or commands. Bursty and coordinated activity can be indicative of scanning, denial of service, or exfiltration among victims. Automated behaviors discovered through temporal analysis can be fed into post-processing analytics, such as whitelisting/filtering and clustering, to identify anomalous or outlier automated behaviors on cyber networks.

This presentation will focus on scalable and flexible techniques for applying analytics on various types of logs/features, as well as methodologies to further narrow the results to anomalous/outlier cases that may be indicative of a cyber security event. Operational use-cases leveraging these techniques on real-world data will be presented. For example, Kaspersky's recent (July 2016) report on the "Project Sauron" advanced persistent threat (footnote: https://securelist.com/files/2016/07/The-ProjectSauron-APT_research_KL.pdf) identifies the use of DNS and/or HTTP to poll/check in to C2 at specific times, supporting up to 31 unique date/time parameters. Scalable, flexible temporal analysis of network traffic would allow for identification of such automated behavior.

The specific algorithms used to identify periodic behavior include a Fourier transform used to identify candidate periodicities which are then filtered down and refined using the autocorrelation function of the time series. A fast Fourier transform algorithm is used to compute the transform on each time series while an inverse fast Fourier transform is used on the resulting periodogram to obtain the autocorrelation function. These operations are performed at scale in parallel across millions of entities (e.g. IP addresses). "Bursty" behavior is detected based on comparing time series values to summary statistics of the series over a sliding window in time for each entity. Coordinated activity is found by performing a nearest neighbor search across entities in various metric spaces using Jaccard, Cosine, or Euclidean distance. Distance is measured on feature spaces to include Fourier coefficients, sets of time stamps where activity is observed or spikes (referred to as time signatures), and shingles of inter-arrival time sequences. The nearest neighbor search is performed using a scalable locality sensitive hashing algorithm that allows us to filter down large sets of data to entities with similar temporal behavior. We can apply this technique across multiple data sources, leveraging the commonality of a time dimension in each, in order to identify entities that are acting in an apparently coordinated manner, while accounting for possible offsets in log synchronization. Post processing on the set of 'similar' entities discovered in this manner may include applying unsupervised learning techniques to flag anomalous coordinated activity as well as supervised techniques to classify coordinated activity that has been whitelisted.
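
The periodicity steps described above can be sketched for a single entity as follows. This is a minimal Python illustration under stated assumptions, not the speakers' implementation: a naive O(n^2) discrete Fourier transform stands in for a library FFT, the `power_fraction` threshold for selecting candidate periodicities is hypothetical, and the Jaccard helper compares time signatures directly rather than through locality sensitive hashing.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT routine)."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(
            v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t, v in enumerate(values)
        )
        result.append(total / n if inverse else total)
    return result


def detect_period(series, power_fraction=0.5):
    """Return (period in samples, normalized autocorrelation at that lag).

    Candidate periods come from strong periodogram bins; the circular
    autocorrelation, obtained as the inverse transform of the
    periodogram, then selects among them.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    periodogram = [abs(c) ** 2 for c in dft(centered)]
    acf = [c.real for c in dft(periodogram, inverse=True)]
    if n < 4 or acf[0] <= 1e-12:
        return None, 0.0  # too short or constant: nothing to detect
    bins = range(2, n // 2 + 1)  # skip bin 1 (period equal to the window)
    peak = max(periodogram[k] for k in bins)
    candidates = {
        round(n / k) for k in bins if periodogram[k] >= power_fraction * peak
    }
    best = max(candidates, key=lambda lag: acf[lag])
    return best, acf[best] / acf[0]


def jaccard_distance(signature_a, signature_b):
    """Jaccard distance between two time signatures (sets of time stamps)."""
    a, b = set(signature_a), set(signature_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical entity that is active once every 10 time bins.
activity = [1 if t % 10 == 0 else 0 for t in range(60)]
period, strength = detect_period(activity)
```

An impulse train like this one has equally strong harmonics in its periodogram, which is why the autocorrelation step is needed to settle on the fundamental period. In practice a library FFT (for example `numpy.fft.fft`) would replace the naive transform.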

Speakers

Lauren Deason

Data Scientist, DZYNE Technologies

Lauren Deason is a Data Scientist with DZYNE Technologies working on the DARPA Network Defense program, focusing on applying digital signal processing and machine learning techniques to detect automated and coordinated behavior in cyber data. Lauren holds a PhD in Economics from...

Scalable Temporal Analytics to Detect Automation and Coordination pptx

Thursday January 12, 2017 9:30am - 10:00am PST
Great Room V-VIII 7450 Hazard Center Dr.

General Session, Behavior and Patterns

10:00am PST

'Lions and Tigers and Bears, Mirai!': Tracking IoT-Based Malware w/Netflow

The Mirai malware rose to prominence in late 2016 with record-breaking Distributed Denial of Service (DDoS) attacks from a botnet built largely from the unlikeliest of sources - various Linux-based devices that make up the so-called Internet-of-Things (IoT). "Are we vulnerable to Mirai? Do we have any active infections? Are we participating in the DDoS attacks? What can we do to protect ourselves?" These are all questions that should immediately come to mind for IT managers and network defenders. The NCCIC/US-CERT Network Analysis Team leveraged the National Cybersecurity Protection System (NCPS), better known as EINSTEIN, to answer these questions for U.S. Federal Government entities.

This presentation will begin with an overview of Mirai and why it is notable, then discuss some key aspects of Mirai's behavior drawn from analysis of the Mirai source code and community open source research. Next, we will present the analysis methodology that we employed, leveraging both netflow and content-based network traffic analysis to correlate known indicators and infrastructure with behavioral characteristics, and discuss how they were used to complement one another. Finally, we will discuss some lessons learned and share some thoughts on the future of IoT-based threats and defensive strategies.
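
As one hedged example of a flow-based behavioral check, the Python sketch below flags internal hosts that contact many distinct destinations on telnet ports. It assumes, based on public analyses of the Mirai source code rather than on this abstract, that infected devices scan TCP ports 23 and 2323; the record layout, threshold, and function name are hypothetical and do not describe the US-CERT methodology.

```python
from collections import defaultdict

# Assumption from public Mirai analyses, not stated in the abstract.
TELNET_PORTS = {23, 2323}


def telnet_scanner_candidates(flows, min_targets=50):
    """Return {source IP: distinct telnet destinations} for likely scanners.

    flows: iterable of (src_ip, dst_ip, dst_port, protocol) tuples,
    a simplified stand-in for real netflow records.
    """
    targets = defaultdict(set)
    for src_ip, dst_ip, dst_port, protocol in flows:
        if protocol == "tcp" and dst_port in TELNET_PORTS:
            targets[src_ip].add(dst_ip)
    return {
        src_ip: len(dst_ips)
        for src_ip, dst_ips in targets.items()
        if len(dst_ips) >= min_targets
    }


# Hypothetical records: one camera sweeping many addresses, one admin
# workstation opening a single telnet session.
sample_flows = [
    ("192.0.2.10", f"198.51.100.{i}", 23 if i % 10 else 2323, "tcp")
    for i in range(1, 101)
]
sample_flows.append(("192.0.2.20", "198.51.100.5", 23, "tcp"))

suspects = telnet_scanner_candidates(sample_flows)
```

A fan-out count like this only identifies candidates; correlating them with known indicators and content-based analysis, as the abstract describes, would still be needed to confirm an infection.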

Speakers

Kevin Breeden

Kevin Breeden is a network security analyst currently supporting the United States Computer Emergency Readiness Team (US-CERT) Network Analysis branch. Kevin's primary responsibilities are network traffic analysis through various proactive and reactive analysis techniques centered...

'Lions and Tigers and Bears, Mirai!' Tracking IoT Based Malware wNetflow pptx

Thursday January 12, 2017 10:00am - 10:30am PST
Great Room V-VIII 7450 Hazard Center Dr.

General Session, Behavior and Patterns