Argus, Netflow, Flow Tools, Sflow and Jflow


Argus now reads native Cisco Netflow, Juniper Jflow, and Flow Tools data, and we are working on Sflow support. Netflow v1-8 are supported, and v9 is being developed. With the basic completion of the IETF IPFIX WG's original charter, there is no doubt that Netflow v9 will be the prevalent flow data generated by commercial vendors for quite some time.

All argus-client programs can currently read native Netflow data, versions 1-8, and we've started work on supporting Netflow v9. Argus data is a superset of Netflow's data schema, and so there is no data loss in the conversion. Argus data that is derived from Netflow data does have a specific designation, so that any argus data processing system can realize that the data originated from a Netflow data source.

Argus can read Flow Tools data and Juniper's Jflow data, and an incomplete implementation of Inmon's Sflow support is in argus-3.0.6. If you have an interest in using Sflow data or IPFIX data, send us email.

Reading and Processing Netflow Data

All argus client programs can read Cisco Netflow data, versions 1-8, and convert it to argus data streams. This enables you to filter, sort, enhance, print, graph, aggregate, label, geolocate, analyze, store, and archive Netflow data along with your argus data. The data can be read from the network, which is the preferred method, or from files that use the flow-tools format, such as those provided by the Internet2 Observatory.

ra -r netflow.file

When reading from the network, argus clients are normally expecting Argus records, so we have to tell the ra* program that the data source and format are Netflow, what port, and optionally, what interface to listen on. This is currently done using the "-C [host:]port" option.

ra -C 9996

If the machine ra* is running on has multiple interfaces, you may need to provide the IP address of the interface you want to listen on. This address should be the same as that used by the Netflow exporter.

ra -C

While all ra* programs can read Netflow data, if you are going to be collecting Netflow persistently, the preferred method is to use radium() to collect and redistribute the data. Radium() can collect from up to 256 Netflow and Argus data sources simultaneously, and provides you with a single point of access to all your flow data. radium() supports distributing the output stream to as many as 256 client programs. Some can act as IDS/IPS applications, others can build near real-time displays and some can manage the flows as an archive, which can be a huge performance bottleneck.

All argus records contain a "source id", which allows us to discriminate flow data from multiple sources for aggregation, storage, graphing, etc. The source ID used for Netflow v1-8 data is the IP address of the transmitter.
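As a rough illustration of what a collector has to decide for each incoming datagram, the Python sketch below reads the first four bytes of a Netflow export packet and tags the result with the sender's IP address as the source ID. It assumes the common header layout in which a 16-bit version and a 16-bit record count lead the packet in network byte order. The function name `classify_datagram` and the returned dictionary are hypothetical and are not part of argus; the 1-8 range is taken from the text above.

```python
import struct

# Versions the text above says argus clients can read (1 through 8).
SUPPORTED_VERSIONS = range(1, 9)


def classify_datagram(payload: bytes, sender_ip: str) -> dict:
    """Peek at a Netflow export header and tag it with a source ID.

    Hypothetical helper for illustration only. Assumes the packet starts
    with a 16-bit version and a 16-bit record count, big-endian.
    """
    if len(payload) < 4:
        raise ValueError("datagram too short to hold a Netflow header")
    version, count = struct.unpack("!HH", payload[:4])
    return {
        "source_id": sender_ip,   # transmitter's IP address, as for Netflow v1-8
        "version": version,
        "count": count,
        "supported": version in SUPPORTED_VERSIONS,
    }
```

In a real listener the payload and sender address would come from a UDP socket bound to the export port; here they are passed in directly so the sketch stays self-contained.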

There are a lot of differences between argus data and Netflow data: protocol support, encapsulation reporting, time precision, size of records, and the style and type of metrics covered. These differences are getting smaller with Netflow v9, but the biggest difference with regard to processing Netflow data is the directional data model.

Argus is a bi-directional flow monitor. Argus will track both sides of a network conversation when possible, and report the metrics for the complete conversation in the same flow record. The bi-directional monitor approach enables argus to provide Availability, Connectivity, Fault, Performance and Round Trip metrics. Netflow, on the other hand, is a uni-directional flow monitor, reporting only on the status and state of each half of each conversation, independently. This is a huge difference, not only in the amount of data needed to report the stats (two records per transaction vs one) but also in the kind of information that the sensor can report on. There are benefits from an implementation perspective (performance) to reporting only half-duplex flow statistics. That said, argus sensors work great in asymmetric routing environments, where they see only one half of the connection. In these situations, argus works just like Netflow.

Argus-client aggregating programs like racluster(), ratop(), rasqlinsert() and rabins() can stitch uni-directional flow records into bi-directional flow records. In its default mode, racluster() will perform RACLUSTER_AUTO_CORRECTION, which takes a flow record and generates both uni-directional keys for cache hits and merging. The result is "Two flows enter, one flow leaves" (to quote Mad Max Beyond Thunderdome).

thoth:tmp carter$ ra -r /tmp/ra.netflow.out
   StartTime  Proto  SrcAddr  Sport  Dir  DstAddr  Dport  SrcPkt  DstPkt  SrcBytes  DstBytes
12:34:31.658    udp                   ->                       1       0        74         0
12:34:31.718    udp                   ->                       1       0        74         0
12:35:31.848    udp                   ->                      10       0       796         0
12:35:31.938    udp                   ->                       1       0        74         0
12:35:31.941    udp                   ->                       1       0        78         0
12:35:31.851    udp                   ->                      10       0       861         0

thoth:tmp carter$ racluster -r /tmp/ra.netflow.out
   StartTime  Proto  SrcAddr  Sport  Dir  DstAddr  Dport  SrcPkt  DstPkt  SrcBytes  DstBytes
12:34:31.658    udp                   ->                       1       0        74         0
12:34:31.718    udp                   ->                       1       0        74         0
12:35:31.848    udp                   ->                      10      10       796       861
12:35:31.938    udp                   ->                       1       1        74        78
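The stitching idea can be pictured with a small Python sketch. It illustrates the "look up both uni-directional keys, then merge" approach only and is not racluster()'s implementation. The `Flow` class and the `stitch` function are hypothetical, and so are the addresses and ports in the demo, since the listings above do not show them; the packet and byte counts mirror two of the records above.

```python
from dataclasses import dataclass, replace


@dataclass
class Flow:
    start: str        # "HH:MM:SS.mmm"; same-format strings compare correctly
    proto: str
    src: str
    sport: int
    dst: str
    dport: int
    src_pkts: int
    dst_pkts: int
    src_bytes: int
    dst_bytes: int


def stitch(records):
    """Merge uni-directional records into bi-directional ones (sketch)."""
    cache = {}
    for r in records:
        fwd = (r.proto, r.src, r.sport, r.dst, r.dport)
        rev = (r.proto, r.dst, r.dport, r.src, r.sport)
        if fwd in cache:
            # Same direction as the cached flow: add counters as they are.
            m = cache[fwd]
            m.src_pkts += r.src_pkts
            m.dst_pkts += r.dst_pkts
            m.src_bytes += r.src_bytes
            m.dst_bytes += r.dst_bytes
        elif rev in cache:
            # Opposite direction: this record's source side is the
            # cached flow's destination side, so swap the counters.
            m = cache[rev]
            m.dst_pkts += r.src_pkts
            m.src_pkts += r.dst_pkts
            m.dst_bytes += r.src_bytes
            m.src_bytes += r.dst_bytes
        else:
            cache[fwd] = replace(r)   # copy, so the input is left untouched
            continue
        m.start = min(m.start, r.start)
    return list(cache.values())


# Hypothetical endpoints; counters taken from the example records above.
request = Flow("12:35:31.848", "udp", "192.0.2.1", 5000, "192.0.2.2", 53, 10, 0, 796, 0)
reply = Flow("12:35:31.851", "udp", "192.0.2.2", 53, "192.0.2.1", 5000, 10, 0, 861, 0)
merged = stitch([request, reply])
```

Here `merged` holds a single record whose source side carries the counters of `request` and whose destination side carries those of `reply`, which is the shape of the third row in the racluster() output.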

When uni-directional flows are merged together, racluster() will create some of the metrics that argus would have generated, such as duration statistics and some TCP state indications. As a result, filters like "con" (flows that were connected) and aggregations oriented around Availability (racluster -A) work.

When establishing an archive of argus data, most sites process their files with racluster() early on, but this step is optional. When the data is derived from Netflow data, the case for racluster() is compelling, and its use should be considered a MUST.