ARGUS Sensor
Sensor Development
The sensor part of Argus is all about generating bi-directional network flow data. Here, we focus on many parts of the sensor development problem: design, implementation, portability, deployment, configuration, maintenance and testing. But when we talk about sensor development, we try to focus on the technical issues of high-performance packet processing: packet header parsing, bi-directional classification, sessionization, feature capture, data formats, and data transport.
The Argus sensor is both an operational real-time network monitor, capable of generating Argus flow data at 100Gbps on commercially available hardware, and a set of network traffic analytics that process packets for analysis and investigation, providing packet classification, sessionization, packet dynamics measurement, aggregation and periodic reporting.
Written in C, Argus has been ported to over 25 platforms and can generate network flow data just about anywhere. The goal is to provide the same dense network data no matter where it is deployed.
Design Goals
Argus is designed around a few principal goals.
- Account for every packet observed.
- Provide a rich set of network features in its data model.
- Operate at very high performance levels (100Gbps+).
- Support an extensible data model.
In order to provide comprehensive network flow data, Argus cannot be statistical. When utilities MUST know what is going on in a network, the data source can't provide a statistical look. This complete accounting of all the packets observed is a primary goal of Argus and its data, and it differentiates Argus from other flow systems.
Supporting operations, performance and security management analytics means that Argus data must carry the attributes these tasks require. Reachability indicators for operations management, loss detection for performance measurement, and content for security analytics are just a few examples.
Network security has advanced considerably in the last 10 years, and Argus has kept pace, providing the richest general network flow data available today. Its comprehensive (non-statistical) transaction model is designed to support the complete NIST Cybersecurity Framework: Identify, Protect, Detect, Respond and Recover. Advanced comprehensive network flow data, with metadata enhancements, embedded protocol verification and payload capture, provides the information needed to find network evidence of malicious activity: intrusion, exploitation, exfiltration, shadow IT. Whether the network activity reflects malware attempting to discover nodes in the local network, attempts to break into adjacent systems, or stepping-stone behavior, Argus data is rich enough to provide the basic information needed to identify and detect bad-actor behavior in the network.
Network Flows
The Argus sensor is, first and foremost, a network flow monitor. This is in contrast to IPFIX, NetFlow v5 and v9, Jflow, and Qflow, which are IP network flow monitors. Argus generates flow data for most Layer 2 network protocols, including Ethernet, InfiniBand, ATM, FDDI, Frame Relay, USB, PPP, ARP, HDLC, L2TP, SLIP, VLAN and Token Ring, and it generates flow data regardless of the Layer 3 protocol in use. This is what makes Argus a good choice for cyber security: you never know what protocols an attacker may want to use, and covert channels are an everyday occurrence.
The second biggest thing about Argus is that it is a bi-directional network flow monitor. The monitor correlates both directions of a network flow and reports on its state, ... for whatever flow is being tracked. This is accomplished through careful packet classification and cache management strategies that track packets in both directions. Because of asymmetric routing, Argus may not see both sides of every connection. When it doesn't, it still tracks the bi-directional flow state, so you can still get status on reachability, connectivity and availability.
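The direction-independent classification described above can be illustrated with a minimal sketch. It is not Argus's actual implementation (Argus is written in C and its cache strategy is not described here); the `BiFlow` record, `canonical_key` function and `FlowCache` class are hypothetical names. The idea is to order the two endpoints so that packets travelling in either direction map to the same cache entry, while counters are kept per direction.

```python
from dataclasses import dataclass


@dataclass
class BiFlow:
    """Hypothetical bi-directional flow record (not Argus's record layout)."""
    src: tuple         # endpoint that sent the first packet observed
    dst: tuple         # the other endpoint
    proto: int
    src_pkts: int = 0  # packets seen from src to dst
    dst_pkts: int = 0  # packets seen from dst to src


def canonical_key(proto, a, b):
    """Order the two endpoints so both directions produce the same key."""
    lo, hi = sorted((a, b))
    return (proto, lo, hi)


class FlowCache:
    """Hypothetical cache keyed by the direction-independent key."""

    def __init__(self):
        self.flows = {}

    def observe(self, proto, sender, receiver):
        key = canonical_key(proto, sender, receiver)
        flow = self.flows.get(key)
        if flow is None:
            # The first packet seen decides which side is reported as "src".
            flow = self.flows[key] = BiFlow(src=sender, dst=receiver, proto=proto)
        if sender == flow.src:
            flow.src_pkts += 1
        else:
            flow.dst_pkts += 1
        return flow


cache = FlowCache()
client, server = ("10.0.0.5", 51000), ("10.0.0.9", 443)
cache.observe(6, client, server)         # request direction
flow = cache.observe(6, server, client)  # response lands in the same record
```

In this sketch, a record whose `dst_pkts` stays at zero is a flow for which only one direction was observed, which is the kind of state that still says something about reachability.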
And third, Argus is a near-realtime end-to-end transport layer flow monitor. The transport layer has all the end-to-end goodies that are important for operations, performance and security. The transport layer is transitional in nature, which is huge for the theory of audit and its use in anomaly and fraud detection, and it carries the network state and the network and performance metrics you need to work out what happened.
Data Models
Argus supports 14 fundamental flow models to monitor all the traffic it sees, whether it is a Type-P or Type-P1-P2 flow, connection-oriented or connectionless, reliable or unreliable, unicast, multicast or broadcast. A flow model defines the identifiers in the packet that are used to make a flow unique, and is the basis for classification, tracking and cache management. Depending on the observed protocols, Argus by default uses 2-tuple, 3-tuple, 5-tuple, and 6-tuple flow models to track bi-directional flows. You can modify these basic models to include sub-Layer 3 identifiers such as Ethernet addresses, VLAN IDs, and MPLS labels, which makes for a very large number of supported flow types.
As an example, when Argus encounters traffic that uses an unknown ethertype, it uses a 3-tuple flow model, with the src and dst Ethernet addresses and the ethertype as the bi-directional flow key. This results in Argus tracking all packets between the 2 Ethernet addresses that use the unknown ethertype as a single bi-flow. We may not know what it is, but you'll know that it's there.
For DNS traffic, Argus uses a 6-tuple flow model: the standard IP flow 5-tuple key (src and dst IP addresses, the transport protocol id, and the src and dst transport port numbers) plus the DNS transaction id. This results in Argus tracking individual DNS transactions.
Type-P1-P2 flows, where the flow is made up of 2 different packet types, may seem weird, but they are very common and very important events. Argus supports a complete family of Type-P1-P2 flows by correlating ICMP unreachable packets with the flow that caused the ICMP to be sent. Any packet goes out, and an ICMP unreachable comes back. Argus will make that sequence into a bi-directional flow, using a modified 5-tuple flow model.
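The two examples above (a 3-tuple key for an unknown ethertype, a 6-tuple key for DNS) can be sketched as a key-selection function. This is an illustration only: the `select_flow_key` name and the dict field names are hypothetical, treating every non-IPv4 ethertype as "unknown" is a simplification, and recognizing DNS by UDP port 53 is an assumption, not a statement about how Argus identifies DNS.

```python
ETHERTYPE_IPV4 = 0x0800  # ethertype for IPv4
IPPROTO_UDP = 17         # IP protocol number for UDP
DNS_PORT = 53            # well-known DNS port


def select_flow_key(pkt):
    """Pick the identifiers that make a flow unique for one parsed packet.

    `pkt` is a hypothetical dict of already-parsed header fields.
    Returns (model_name, key_tuple). Direction is not normalized here.
    """
    if pkt["ethertype"] != ETHERTYPE_IPV4:
        # Simplification: anything that is not IPv4 is treated as "unknown".
        return "3-tuple", (pkt["eth_src"], pkt["eth_dst"], pkt["ethertype"])

    five = (pkt["ip_src"], pkt["ip_dst"], pkt["proto"], pkt["sport"], pkt["dport"])
    if pkt["proto"] == IPPROTO_UDP and DNS_PORT in (pkt["sport"], pkt["dport"]):
        # Assumption: DNS is recognized by UDP port 53.
        return "6-tuple", five + (pkt["dns_id"],)
    return "5-tuple", five


# An arbitrary non-IPv4 ethertype between two made-up Ethernet addresses.
unknown = {"ethertype": 0x88B5,
           "eth_src": "02:00:00:00:00:01", "eth_dst": "02:00:00:00:00:02"}
dns_query = {"ethertype": ETHERTYPE_IPV4,
             "ip_src": "10.0.0.5", "ip_dst": "10.0.0.53",
             "proto": IPPROTO_UDP, "sport": 40000, "dport": DNS_PORT,
             "dns_id": 0x1A2B}

print(select_flow_key(unknown))
print(select_flow_key(dns_query))
```

Because the DNS transaction id is part of the key, two queries sent from the same socket to the same server end up as two separate flows in this sketch, matching the per-transaction tracking described above.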
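As a rough sketch of this correlation: an ICMP Destination Unreachable message (type 3 in RFC 792) quotes the header of the packet that triggered it, so the identifiers of the original flow can be recovered from the ICMP payload and looked up in the flow cache. The `Tracker` class and its method names are hypothetical, the quoted header is assumed to be parsed already, and the plain 5-tuple key used here stands in for Argus's modified 5-tuple model, whose details are not given in this section.

```python
ICMP_DEST_UNREACHABLE = 3  # ICMP type for Destination Unreachable (RFC 792)


class Tracker:
    """Hypothetical tracker that ties ICMP unreachables back to their flows."""

    def __init__(self):
        # 5-tuple of the initiating packet -> flow state
        self.flows = {}

    def on_packet(self, src, dst, proto, sport, dport):
        key = (src, dst, proto, sport, dport)
        flow = self.flows.setdefault(key, {"pkts": 0, "unreachable_code": None})
        flow["pkts"] += 1
        return flow

    def on_icmp(self, icmp_type, icmp_code, quoted):
        """`quoted` is the 5-tuple parsed from the header quoted in the ICMP payload."""
        if icmp_type != ICMP_DEST_UNREACHABLE:
            return None
        flow = self.flows.get(quoted)
        if flow is not None:
            # Record the response as the reverse half of the same flow.
            flow["unreachable_code"] = icmp_code
        return flow


tracker = Tracker()
probe = ("10.0.0.5", "10.0.0.9", 17, 40000, 9999)  # UDP packet to a closed port
tracker.on_packet(*probe)
flow = tracker.on_icmp(ICMP_DEST_UNREACHABLE, 3, probe)  # code 3: port unreachable
```

After these calls, the single flow record holds both the outbound packet count and the unreachable code that came back, rather than two unrelated records.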
Implementation
Argus is a multi-threaded packet processor that relies on native operating system support to process live network packet streams, or to read packets from named pipes or files. It parses all network headers until it finds the end-to-end transport-layer (Layer 4) header, and then tracks the transport state until the flow times out after an idle period.
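The idle-timeout behavior mentioned above can be sketched as a small cache that remembers when each flow was last seen and evicts flows that have been quiet for too long. The `IdleTimeoutCache` class and the 60-second default are hypothetical choices for illustration; the timeout values Argus actually uses are not stated in this section. Timestamps are passed in explicitly so the example is deterministic.

```python
class IdleTimeoutCache:
    """Hypothetical flow cache that expires entries after an idle period."""

    def __init__(self, idle_timeout=60.0):  # illustrative value, in seconds
        self.idle_timeout = idle_timeout
        self.last_seen = {}  # flow key -> timestamp of the most recent packet

    def touch(self, key, now):
        """Record that a packet for `key` was observed at time `now`."""
        self.last_seen[key] = now

    def expire(self, now):
        """Remove and return the keys idle for longer than idle_timeout."""
        expired = [k for k, t in self.last_seen.items()
                   if now - t > self.idle_timeout]
        for k in expired:
            del self.last_seen[k]
        return expired


cache = IdleTimeoutCache(idle_timeout=60.0)
cache.touch("flow-a", now=0.0)
cache.touch("flow-b", now=50.0)
expired = cache.expire(now=70.0)  # flow-a idle for 70 s, flow-b for 20 s
```

A real sensor would typically emit a final flow record for each expired key before dropping its state; that step is left out of this sketch.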
Sensor History
Argus was started by Carter Bullard as part of his research at Georgia Tech in the early '80s. GaTech was a network center of excellence. It maintained possibly the largest X.25 research lab, it built the largest campus Ethernet network in the world (at the time), it was one of the top 5 Usenet nodes in the world (a big deal), and it was a key part of the emerging NSFnet and SURAnet networks. We were developing and testing TCP/IP stacks for a number of computer vendors, and the research group was early in developing Voice over Ethernet and Voice over IP technology, which ran over the CS Ethernet networks, so simple awareness of what was on the wire was important.
At the time there were only two approaches for network analysis and awareness: interface counters, accessed through consoles and later through SNMP, and wireline awareness through packet capture. SNMP interface statistics didn't provide enough information to 'understand' how large university networks were being used, and large-scale packet capture at 1Mbps was impossible, since disk space was very, very, very expensive.
Argus was at first developed as an independent network monitor that could provide near realtime summarized network activity for network debugging. It was the first technology to use the 5-tuple IP flow spec to classify packet data and to visualize network flow activity. Argus quickly moved out of the lab and into the operational network to provide operational support for uptime and network faults, and then into the NSF networks infrastructure, giving the network operators unprecedented views into how GaTech's networks were being used. It was key to detecting that GaTech was a launch point for attacks on Equifax by the Legion of Doom, and it was operational during the Morris Worm, the most devastating cyber attack in network history.
When Carter moved to CMU's Software Engineering Institute, as the networking guy at the Computer Emergency Response Team, work started on Argus as a cyber security technology. We continued to develop Argus as a real-time operational sensor at the SEI, but we also integrated Argus data into the cyber security workflow. In the early 1990s we started to evangelize the idea that network accountability was critically important to network forensics and incident response, at the IETF, at NANOG, and to vendors such as Cisco in 1993. In 1995 Argus was released as an open source project by CMU and Carter Bullard, under the GNU General Public License, and it has been maintained by Carter since.