This brief section will introduce new features in the 2.8 release of ourmon. The features include:
In this document we give an introduction and overview of Ourmon and its individual filters.
Ourmon is an open-source tool for network monitoring and anomaly detection. It runs on FreeBSD, Linux, and Solaris. Goals include:
The front-end packet capture engine has three forms of filters: 1. hardwired filters (programmed in C), 2. user-programmable filters using the BSD Berkeley Packet Filter (BPF), and 3. top N style filters that produce sorted lists of the largest flows, interesting IP hosts, or top talker TCP/UDP ports, and which can also use Perl Compatible Regular Expressions (PCRE) for "flow" identification. User-programmable BPF filters or BPF filter sets may be added to the system by the user and allow considerable customization. For example, a user might choose to create one or more graphs that watch local subnets, local hosts, or local ports on local hosts. Hardwired and BPF filters use the RRDTOOL package for the display and baselining of data. Top N filters use html iframes and log one week's worth of data (typically off-line in the ourmon/logs directory) for later analysis. Some summarization-style logs are available from the web pages. The user-programmable PCRE filters are described in Layer 7 matching.
The back-end system displays network information on the web using either:
In the last few years ourmon has been modified to enhance its abilities to detect network anomalies associated with various forms of attacks including TCP syn scanning, UDP scanning, DOS attacks, and distributed zombie attacks. A number of features have been added, some of which use the BPF to give us an overall network view of TCP and ICMP control data, some of which use new top N filters to show information about particular host IP systems engaged in scanning, and some of which provide RRDTOOL graphs for carefully chosen and proven useful metadata. We will go over these particular filters below in a section of their own entitled anomaly detection filters.
The ourmon architectural flow diagram is intended to give a rough
analysis of data flow through the ourmon system.
For network setup, the ourmon "probe" box is assumed to be directly
connected to an Ethernet switch. The switch must have port mirroring
turned on so that promiscuous mode filtering on the ourmon front-end "probe" can see all
desired network packets passing through the Ethernet switch. Network packets
are passed to the NIC card of the front-end probe, and in FreeBSD are stored
in the BPF buffer. (In Linux, the details are different, but there is still
a kernel buffer associated with packet flow to the front-end application).
Ourmon architecturally consists of two parts, known as the front-end probe, written in C, and the back-end graphics engine, mostly written in Perl. The front-end is a form of network analyzer that reads network packets from an Ethernet interface out of a kernel buffer, and runs each packet through a set of filters. (It should be noted that the front-end program is also called ourmon). For each filter, byte counts or packet counts are periodically stored in an output report file, and that information is passed to the back-end. The back-end then takes the output of the front-end and creates web-based graphics. It also logs top N flows to ourmon logs (that are similar to syslog in style) and produces some daily summarization reports on the logs, which are rolled over daily for roughly a week, giving you one week of summarizations.
The front-end uses the Berkeley Packet Filter and pcap library to fetch packets from a promiscuous mode Ethernet device. The filters used are specified in an input configuration file, called ourmon.conf . One may add or remove filters as desired. 68 bytes maximum per packet (as with tcpdump) are captured. Thus only the protocol parts of packets are actually captured via the BPF. Ourmon has the ability to examine L7 data. This is optional and will only be done if the feature is enabled in the configuration file. When disabled, Ourmon still allows (L2) Ethernet and (L3) IP addresses, and L4 ports to be examined at higher speeds. Ourmon can thus be run as both an anomaly detector and a signature-based tool.
Internally the front-end program looks at the Ethernet header, IP source and destination addresses, IP next protocol, and TCP and UDP ports, or in the case of ICMP, major and minor ICMP codes. Each configured-in filter is executed per packet. Thus if there are 5 filters in use, they will be executed in order, one after the other on each packet. In general, the average filter will count bytes or packets and represents this with integer counters. Top N filters keep hash lists of flows or IP addresses associated with counters. At the end of the sample period (every thirty seconds) the output is written out in a very simple format, called the mon.lite file, counters are reinitialized to zero and dynamically allocated lists are also freed. See mon.lite for an example. This file is then processed by the back-end to create various kinds of graphics available from a web server. The "mon.lite" file may be viewed as a summarization of a great deal of packet information into a small intermediate and condensed form.
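The per-packet filter loop and the end-of-period reset described above can be sketched roughly as follows. This is only an illustration in Python, not the C code of the front-end: the packet tuples, the `FILTERS` list, and the report line format are all assumptions made for the example, and the real mon.lite format is not reproduced here.

```python
from collections import Counter

# Hypothetical packet records: (ip_protocol, length_in_bytes).
# Hypothetical filters: (name, predicate) pairs, run in order on every packet.
FILTERS = [
    ("tcp", lambda pkt: pkt[0] == "tcp"),
    ("udp", lambda pkt: pkt[0] == "udp"),
]

def run_filters(packets, filters):
    """Run every configured filter on every packet and count matching bytes."""
    counters = Counter()
    for pkt in packets:
        for name, matches in filters:
            if matches(pkt):
                counters[name] += pkt[1]
    return counters

def end_of_period(counters):
    """Emit one summary line per filter, then reset the counters to zero."""
    report = [f"{name}: {counters[name]}" for name in sorted(counters)]
    counters.clear()  # counters start over for the next sample period
    return report
```

Calling `end_of_period` once per sample period mirrors the idea that a large amount of packet information is condensed into a small report and that nothing is carried over to the next period.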
The front-end is programmed with some hardwired "application" flags which are really a form of signature detection. The "app flags" are also combined with a programmable PCRE mechanism which can currently be used to tag quite a range of traffic. The tags are available with most top_N reports and are also used to varying extents in the TCP and UDP port signature reports. The basic idea here for signatures is that we can determine that a host has performed some Gnutella or Bittorrent traffic during a period. This gives us a way to say that host IP X has done Bittorrent (or telnet for that matter). The hardwired signature detection is very efficient and is done with a small amount of C code. PCRE tags are not very efficient but can be programmed by the user.
The back-end is Perl-based and consists of several programs. The most important back-end program is called omupdate.pl. It creates and updates graphs with the help of the RRDTOOL library. For top N display, it also dynamically creates second-level web pages using iframes and preinstalled small PNG files. These are used to make horizontal per-flow histograms, with a label that in nearly all cases can use the previously mentioned tag mechanism. There are some additional sorting and logging functions that will not be covered here.
Omupdate.pl both creates and updates per-filter RRD (RRDTOOL) databases, depending on which set of filters you wish to use. Omupdate.pl must be driven by a crontab entry (and is typically called twice a minute). It takes the current copy of the mon.lite file, processes it a bit, and stores information in various per-filter rrds (RRDTOOL log databases), as well as creates the current set of RRD-based graphics. It is possible, and probably a good idea, to run the front-end and back-end on two separate hosts. How "mon.lite" is copied from the front-end to the back-end is an "out of band" problem, and is not solved by the ourmon system. But it is an easy thing to do and ssh or the wget application can be used to solve the problem. One very nice side effect of using RRDTOOL is that all the RRD filters, including BPF-based filters, produce a year's worth of baseline data in graphical format. Thus per-filter, one gets current (daily), weekly, monthly, and yearly pictures.
In the next two sections, we will discuss most of the individual filters in detail.
Please note that ourmon as a system supports three kinds of filters:
1. hardwired filters, that are done in C code. These have
names and specific functions. There are not many of them.
2. user-programmable BPF filters. Ourmon supports arbitrary user-space
(as opposed to kernel-space) Berkeley Packet Filter (BPF)-based filters.
One can have 1-6 BPF expressions per RRDTOOL graph (more than 6 produces a cluttered
graph, and in fact 6 is probably too many). These filters use
the same language specification mechanism used by tcpdump.
At the time of writing, the one deployed ourmon front-end system has a total of 60 BPF expressions
and in general does not lose packets.
3. top N filter mechanisms that produce various kinds of top N information
including a traditional flow monitor. Other top N lists exist that are focused
on anomaly detection. These lists typically show information associated with
individual IP source addresses, although the more traditional basic flow
filter shows a more conventional flow tuple (flows here are stateless though,
state is not carried over from one 30-second sample to the next).
In the next section we begin a detailed discussion of the various filters supplied by ourmon. Note that filters have names; for example, the filter that displays different IP level protocols is called "fixed_ipproto" . Filter names are important because they tie the inputs and outputs of the ourmon system together. Filters are named in the ourmon.conf file, and output appropriate to that name appears in the mon.lite file, and again appears in rrdtool libraries, log files, or png images created by the back end. It should also be pointed out that data associated with a filter presented by the back end, is usually interpreted as packets or bytes, and may therefore be presented as bits/sec, or pkts/sec. In some cases, a data item may be presented as data/period. Period means the ourmon sample period of 30 seconds.
In this section we provide a brief overview of the web page layout. The layout is hierarchical with two levels consisting of a main page (index.html) and any number of secondary detail pages. The primary function of the main page is to provide current status -- the goal is to tell you what is going on now. Secondary pages, which are accessed as links from the main page, show more runtime statistical details. In general the graphics-engine back-end of ourmon makes reports and graphics. Secondary pages that use RRDTOOL must be designed and installed by hand (and linked up to the main page). Secondary pages that display top N information are now created dynamically. The design and layout of the html pages can (and should at least when using the user-customizable BPFs) be altered according to your own needs.
The main page includes RRDTOOL current graphs for the current day in terms of hours, the "top-most" top N histogram graphs, and various reports including reports for the last sample period and some hourly/daily summarizations. For example, this would include the RRDTOOL basic pkts graph at the top layer. From the main page, find basic network information and look right underneath it for the probe #1 pkts/drops graph. Essentially the graph has a label and both graph and label are links as well. In general the label (when a hypertext link) will take you to the info.html page in order to provide more details on the features of the graph. The graph will take you in turn to a secondary details page that provides more runtime statistical details. Ourmon's notion of help consists of links that jump back and forth between the main index.html page, the info.html help page, and the secondary detailed statistics pages referenced by index.html. (This system works best with mozilla/firefox).
Top_n graphs are often presented as histograms. On the main page find the ASCII heading top talker (top_n) flows based on IP source. Underneath it is the histogram graph that shows the top IP-based flows. This page is an iframe and can either be viewed here by scrolling or you can jump to the secondary page by clicking on the label "Top N IP flows(expand)". If you click on it, you go to the related secondary page(s), which should show more bars. The format for these top N pages is now a string label, followed by a bar to show the relative size of the flow. The string label typically has the flow identifier, the size (say in bytes), and an application tags field which may or may not help ID the flow.
The main index.html page is structurally broken up into several sections including:
Here we only wish to discuss the quick jump directory as the other sections should prove self-explanatory. The quick jump directory consists of tables of links into various parts of ourmon. The first table is called important security and availability reports/web pages. It has links that are either security related or have bearing on whether or not ourmon is actually running. The second table, entitled main page sections, simply divides the main page up into various sub-sections and can be used to directly jump to those sub-sections. For example, in the first table we have the TCP-based "worm" port report.txt report that gives information about noisy TCP hosts. One also finds the similar UDP-based udpreport.txt that gives information about noisy UDP hosts. Both of these links are very fundamental to security and may show current attacks. The following links will take you to the relevant "info" section below for various important security links in the two tables:
In general one uses a web browser to browse ourmon. There are a few things to note. The top-page comes in two flavors, index.html and indexstatic.html. index.html self-updates every 30 seconds (which may be annoying). indexstatic.html should have the same content but does NOT update every thirty seconds. All the second-level web pages do update every thirty seconds, but ASCII reports are not currently written in html and do not update themselves. You must use your web browser reload button to update an ASCII report. Just remember that the sample period is thirty seconds long, and an occasional refresh if looking at an ASCII page is helpful. If you think this behavior is WRONG, send email and lobby for change.
Now we will discuss the filters in detail.
This release supports an experimental threaded front-end.
The threaded front-end
is not the default at this point. Threads are not useful without
at least a quad CPU. They should not be used on a single threaded CPU.
(A dual threaded CPU is useful for ourmon performance because
that gives the operating system interrupt side a thread for packet
reception, and provides a remaining thread for ourmon. However
the thread feature in ourmon is NOT useful until you have
at least four CPU hardware threads).
threads - overview
The threads code currently works on FreeBSD and Linux. On BSD
we use the rfork system call to create threads. On Linux we use
the clone call to create threads. In both cases, an x86 (amd compatible)
primitive is used for spinlocks.
It is important to point out the following:
There are several important technical details that are worth
discussion. For example,
ourmon compiled without -DTHREAD leaves out the threaded code
and simply uses the libpcap library in the standard way. On
the other hand,
the THREADED code is invasive of the libpcap library;
i.e., we modified the read code to include a spinlock in order
to serialize thread access to the kernel. The packet model
(at least on BSD since Linux can be said to only have a BAD model)
is that we want to read out as many packets as possible per read.
In previous experimentation we discovered that large BSD BPF (kernel)
buffers helped the overall system not lose packets under some load
conditions. This is true for Linux as well. However with
a per thread buffer we have also discovered that the kernel BPF
buffer does not need to be as big as before. For example with
a single threaded ourmon probe, the buffer might be set to 16 Mbytes.
With a threaded ourmon probe, we are setting the buffer to 1 Mbyte
and that seems to be sufficient.
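The idea of serializing reads with a lock while letting each thread work on its own batch of packets can be sketched as follows. This is a loose Python analogy only: the real code is C, uses a spinlock inside a modified libpcap read path, and creates threads with rfork or clone. The `run_workers` function, the deque standing in for the kernel buffer, and the batch size are all assumptions made for illustration.

```python
import threading
from collections import deque

def run_workers(packet_lengths, n_threads=4, batch=64):
    """Share one packet source between threads; return per-thread byte totals."""
    source = deque(packet_lengths)   # stands in for the kernel packet buffer
    read_lock = threading.Lock()     # stands in for the spinlock around the read
    totals = [0] * n_threads         # each thread only touches its own slot

    def worker(idx):
        while True:
            with read_lock:          # only one thread reads at a time
                count = min(batch, len(source))
                chunk = [source.popleft() for _ in range(count)]
            if not chunk:
                return               # nothing left to read
            totals[idx] += sum(chunk)  # per-packet work happens outside the lock

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

Reading a whole batch per lock acquisition reflects the stated goal of reading out as many packets as possible per read, so the lock is held briefly relative to the filtering work.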
In order to install the threaded ourmon, several changes are needed.
For example, if you have drops,
you can always choose to lose a filter. PCRE pattern matching takes
a lot of CPU resources. So do malloc-based top N filters.
The BPF mechanism itself can consume a lot of CPU unless the C-based
version is being used.
Note that the pkts
(caught and dropped) counters are zeroed out (in SNMP terms, pkts is a GAUGE),
at the mon.lite write period time. All of the individual ourmon
filters are currently
zeroed out at mon.lite output time. Thus the counters start over
from zero for the next round (and at the moment they are more or
less all GAUGEs anyway).
Typical "mon.lite" output is as follows:
The mon.lite output looks like this:
In the ourmon.conf file, BPF filters default to displaying bits per second
(bpf-bytes is used above to force this mode). The config file
has two modes which can be toggled on and off between filter sets (but not
inside a filter set) using the following two commands:
The first filter above is supplied with the default ourmon config file
and serves to graph pkts/sec for basic IP protocols, including any IP protocol itself,
TCP, UDP, ICMP, and "other" protocols not of an IP nature.
We get one RRDTOOL graph with five lines in it.
Any use of "bpf" as the first
config token starts a new BPF graph.
The name of the graph is "protopkts". "bpf" and "bpf-next" tags are used
to introduce (label, BPF expression) pairs. For example,
The "ports" filter set also gives us one RRDTOOL graph with 5 lines in it.
The name of the graph (BPF filter set) is "ports".
The BPF tag called "bpf" indicates to ourmon that we are starting
a new BPF filter set graph. "ports" is the name and label of the entire graph
and is used in the backend for creating unique filenames both in the web and
rrddata directories. This first bpf line also includes the first line label "ssh" and a
BPF filter specification associated with that line label. The BPF expression
"tcp port 22" is designed to capture all secure-shell traffic on port 22.
The subsequent "bpf-next" lines add additional lines
to the graph for p2p traffic (kazaa, bittorrent, edonkey, gnutella, and the like --
by definition p2p apps can use whatever port they want, so this isn't perfect,
it's just an informed guess), the web, ftp, and email.
Each bpf-next line has a line label and filter specification.
A BPF graph is either terminated by another filter or by a BPF-noxtra label
which tells ourmon to not collect and graph bytes that fail to
match any of the BPF filter specifications in this graph. One
may choose to have remaining bytes shown in the final "xtra" graph,
or choose to ignore them (which means xtra is *still* shown in the graphs but has a zero
value as nothing is ever stored in its counter).
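The structure of a BPF filter set, with its (label, expression) pairs and the optional "xtra" counter, can be modelled with a small sketch. This is not ourmon code and does not use the real config syntax: the `FilterSet` class, the dictionary packet records, and the Python predicates standing in for BPF expressions are assumptions. It also assumes that every expression in a set is tested against each packet independently.

```python
MAX_LINES = 6  # the text allows up to six expressions per graph

class FilterSet:
    """One graph: several labelled expressions plus an 'xtra' catch-all."""

    def __init__(self, name, count_xtra=True):
        self.name = name
        self.lines = []               # (label, predicate) pairs
        self.count_xtra = count_xtra  # False models the "noxtra" behaviour
        self.counts = {"xtra": 0}     # xtra is always present in the graph

    def add(self, label, predicate):
        if len(self.lines) >= MAX_LINES:
            raise ValueError("too many lines for one graph")
        self.lines.append((label, predicate))
        self.counts[label] = 0

    def feed(self, pkt):
        matched = False
        for label, predicate in self.lines:
            if predicate(pkt):
                self.counts[label] += pkt["len"]
                matched = True
        if not matched and self.count_xtra:
            self.counts["xtra"] += pkt["len"]  # bytes that matched nothing
```

With `count_xtra=False` the "xtra" entry still exists but stays at zero, which matches the description of xtra still being shown in the graph with a zero value.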
The supplied ourmon.conf file has many examples of BPF filters, and you
should make your own. For example, you could graph subnet traffic for
3 subnets, 10.0.1.0/24, 10.0.2.0/24, and 10.0.3.0/24 as follows:
As a runtime optimization that only involves the front-end probe,
we have developed a sub-system we call CBPF.
The goal of CBPF is to allow the user to increase the number
of BPF filter-set expressions while at the same time decreasing the odds
that packets will be lost. CBPF allows us to
hand code C functions to replace some commonly used BPF expressions
in the front-end. Unlike the BPF expression mechanism, CBPF is
not general. There is only a limited set of expressions
that can be used. The syntax in the config file is also slightly
different and in some cases cumbersome. Therefore BPF expressions will exist
that cannot be replaced by the existing set of CBPF functions.
However the runtime advantage of a CBPF function over its equivalent
BPF expression is significant. Typically CBPF is at least ten times
faster. One can use CBPF to replace common subnet expressions (net IP)
and also in our supplied BPF filter-sets for ports and P2P applications.
(See the supplied ourmon.conf file). In the configuration file,
CBPF tags where they exist may be dropped into the user BPF filter-set
at any point, replacing an equivalent BPF expression. CBPF config tags
can be used to replace either "bpf" or "bpf-next" as follows:
Commentary: the BPF lacks a range operator. BPF expressions
can consist of logical ORs but this results (particularly in
an attempt to catch the bittorrent default port range, which
may actually extend to port 6899) in a very inefficient
set of BPF instructions. Keep in mind that the BPF is interpreted
at runtime, and CBPF is compiled C code. In general, BPF
expressions have linear cost. Longer expressions cost more
at runtime. Note that this CBPF operator (as is true in
similar cases) can be viewed as a set of logical ORs in terms
of its operands.
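The cost argument in this commentary can be illustrated by comparing a chain of ORed port tests with a single range test. The sketch below is Python rather than BPF or C, and it assumes that the operator under discussion is a port-range test. The function names and the port range 6881 to 6889 (a commonly cited Bittorrent default) are illustrative assumptions, not values taken from the ourmon configuration.

```python
def or_chain(ports):
    """Linear-cost matcher: one comparison per listed port, like a long OR."""
    port_list = list(ports)

    def match(port):
        for candidate in port_list:
            if port == candidate:
                return True
        return False

    return match

def port_range(low, high):
    """Constant-cost matcher: two comparisons however wide the range is."""
    def match(port):
        return low <= port <= high

    return match

slow = or_chain(range(6881, 6890))  # nine separate equality tests
fast = port_range(6881, 6889)       # one range test with the same meaning
```

Both matchers accept exactly the same ports. The difference is that the first grows with the number of ports listed, which is the linear cost described above.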
Commentary: Mixed sets of UDP and TCP ports can be OR'ed together.
At most ten (u port) or (t port) pairs combined are allowed.
The above is the functional equivalent of the following:
Commentary: The CBPF ping tag takes no expression string.
The front-end ourmon program performs a hash function, and creates
a dynamic hash list of flows at runtime. At report time,
it sorts the list with quicksort, and writes out the
top N total IP "flows" (src to destination), top N TCP, top N UDP, top N ICMP
and top packet flows. The number N is configured into the ourmon config file.
Note that the numbers are simply 30 second
byte counts represented as bits/sec.
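The hash, sort, and top N steps, together with the conversion of a 30-second byte count into bits/sec, can be sketched as follows. This is an illustrative Python version, not the front-end's C code: the `top_n_flows` function and the (source, destination, length) packet tuples are assumptions, and Python's built-in sort stands in for quicksort.

```python
from collections import defaultdict

SAMPLE_SECONDS = 30  # the ourmon sample period

def top_n_flows(packets, n):
    """Return the n largest (src, dst) flows as (flow, bits_per_sec) pairs."""
    byte_counts = defaultdict(int)          # hash list keyed by flow
    for src, dst, length in packets:
        byte_counts[(src, dst)] += length
    ranked = sorted(byte_counts.items(), key=lambda kv: kv[1], reverse=True)
    # A 30-second byte count becomes bits/sec: bytes * 8 / 30.
    return [(flow, count * 8 // SAMPLE_SECONDS) for flow, count in ranked[:n]]
```

For example, a flow that moved 3750 bytes in one sample period is reported as 3750 * 8 / 30 = 1000 bits/sec.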
The back-end program omupdate.pl is used to create dynamic html iframes for
preinstalled png histograms
for the results. In addition, logging is done for the flow
types in separate logging files (this is true for all top N filters).
(These syslog-style logs are available for analysis by the administrator,
but they are not available on the web).
In addition "top of top" reports or daily summarizations
are generated for the topn_ip, topn_udp, topn_icmp and pkt filters,
on an hourly or daily basis for a week. The sorting functionality
provided allows one to see the top flows over a period of time.
Top of top reports are also created for topn_ip, topn_udp, and topn_icmp
that report on the IP source and destination addresses that generate the biggest flows for these
three flow types.
The mon.lite output is as follows:
All of the flows may have application tags associated with them if a flow
packet is matched by any of the hardwired or user-supplied PCRE tags.
There are a couple of important design ideas here driving this work.
One is simply the observation that it can be useful to look at network
control data to observe a baseline and then detect anomalies in that baseline.
There is in general much LESS network control data than data itself (one hopes).
There are fewer ICMP error messages than UDP data packets (usually). There
are fewer TCP control packets (SYNS/FINS/RESETS) than TCP data packets.
As two more concrete examples of this notion, one would expect that
TCP SYNS might somehow be related graphically to TCP FINS. Large
numbers of SYNS that have nothing to do with the FIN count thus
indicate that something abnormal is happening. One can also look
at ICMP error data, and see that for example, large numbers of ICMP
host unreachables are being returned because of an active UDP
scanning system. Second-order data may also be of interest from
the viewpoint of network control. For example, instead of focusing
on individual top network flows, we can focus on the overall count
of network flows. Perhaps one has on the order of 1000 ICMP flows per
second. 100000 ICMP flows per second would then indicate that
something was very wrong. (This actually happened due to an outbreak
of the Welchia/Nachi worm on one campus in Fall 2003).
We use
various kinds of weights (small stats) to detect scanners. There are three
new top N list mechanisms that in some cases use weights
to try and detect scanning systems, and in some cases simply
sort on integer counts. For example, we have a top N syn tuple list that currently
does both of these things. In one form, it sorts on the top N ip src sending
TCP syn packets, and at the same time, produces various metrics
that, for example, tell us what percentage of the TCP packets sent and
received by that host during the sample period were TCP control packets.
We call this weight the TCP work weight. The work weight appears
to be very useful as it allows us to see blatant scanning and to
some extent rule out most P2P-using hosts,
which often also send large numbers of TCP SYNS during the sample period
but in fact have a lower (SYNS + FINS + RESETS) / total pkt count ratio
than most SYN scanning worms. The top N syn tuple
also produces several other outputs including the tworm graph ,
which represents a total count of anomalous ip sources in terms of SYN counts.
The tworm graph can show parallel scans or DDOS attacks.
In addition, the syn tuple
produces a specific email report tuned to email ports by the front-end probe,
a p2p report based on the application tags mechanism that shows hosts using IRC, Bittorrent, and Gnutella,
and a so-called syndump report that shows statistics for
all local hosts (but not remote hosts as due to possible P2P IP address fanout, remote
hosts would be too much work). The
tcpworm.txt report can be viewed via several different outputs, most
notably via the so-called port signature report.
The synreport in the front-end during a sample period gathers all
IP hosts with their related SYN counts and other stats. It then
puts a subset of them into the TCP port report (and worm graph) based
on two conditions: 1. the work weight is non-zero, and 2. enough
packets were sent to make inclusion worthwhile, roughly as follows:
In other anomaly-oriented top N filters, we also
count the total number of ICMP error packets sent back to a host,
and a weighted measure of UDP errors as well.
We also have a conventional IP scan catcher that sorts on the top N IP
source addresses with the most unique destination IP addresses. This scan
filter also produces graphs that show the mapping of an IP source
to N L4 TCP or UDP destination ports.
In this section we discuss the TCP syn list tuple and various outputs associated with it.
We will first present a general overview of the TCP syn list mechanism, then give
configuration info, and finally discuss each output presented above in turn.
Ourmon if so configured stores a TCP syn tuple with various counters
that look at the two-way nature of TCP traffic from an IP source address.
This syn tuple includes information about SYNS sent by an ip source, the total number
of SYNS+ACK packets sent by an ip source, FINS returned to that ip source,
RESETS returned, ICMP errors returned, total pkts sent from the IP source, total
pkts returned to the IP source, counts of unique IP destinations, counts of unique
L4 TCP ports, application flag information, one sample IP destination, and a source and destination port sampling scheme.
The source and destination port sampling scheme stores the first N TCP destination ports seen associated
with SYNs and stores packet counts for SYN packets only sent to those ports.
This is a sampling scheme as only the first N ports seen are stored. Not all
ports are stored. The SYN+ACK count packets are only those packets from a given ip source that are sent
as the second packet of the TCP 3-way handshake. In TCP state machine terms, these
packets traditionally are associated with a service on a host. Thus it is fair to say that a percent score
here of 100% means a host may be a server.
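The "first N ports seen" sampling scheme described above can be sketched as a small class. This is a Python illustration only: the `PortSampler` name and the `max_ports` parameter are hypothetical, and the actual value of N used by the front-end is not stated in this text.

```python
class PortSampler:
    """Keep SYN counts for only the first max_ports destination ports seen."""

    def __init__(self, max_ports):
        self.max_ports = max_ports
        self.syn_counts = {}  # port -> SYN packets seen, in first-seen order

    def record_syn(self, dport):
        if dport in self.syn_counts:
            self.syn_counts[dport] += 1      # already sampled: count it
        elif len(self.syn_counts) < self.max_ports:
            self.syn_counts[dport] = 1       # still room: start sampling it
        # Otherwise the table is full, so this port is simply not recorded.
```

Because later ports are dropped once the table is full, the result is a sample rather than a complete per-port count, which keeps the per-host tuple a fixed size.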
As one example, the topn_syn list shows individual IP source addresses sending the most TCP syns
in a period.
The following outputs are currently produced by the topn_syn list tuple:
The topn_syn tuple produces a large set of IP hosts, local and remote. The topn_syn
top N list is the subset of those hosts that have sent the most SYN packets. The basic
TCP work weight port signature report consists of those hosts that have a non-zero work weight
and have sent some non-trivial amount of packets. This is typically (barring DDOS and large parallel scans)
a small set. The p2p version is the initial set filtered by some hardwired application
flags (Bittorrent and the like). The syndump set of IPs consists of only local home IPs.
The email set consists of all hosts local and remote who have sent SYN packets
on the defined set of email ports (25, 587, 465). The potdump report
shows those hosts that have written into darknet networks which
must be specified with the P/D flags via honeynet or darknet config
directives. A honeynet address is supplied and on by default BUT
the address is not useful. If you can't configure it to something
useful, then just ignore the outputs.
Note that these different outputs are all subsets
of the initial topn_syn tuple viewed in different ways.
Note that three different top N mechanisms currently rely on each other. The IRC
summarization mechanism uses the topn_syn work weight. The topn_syn list
itself also uses data from the topn_scan IP and port lists. The topn_syn list
also uses data in application flags from the IRC filter. All three (topn_syn,
topn_scan, IRC) of these lists should be turned on together.
Two major weight functions are used in various places above. These are called
the work weight and the worm weight respectively.
The work weight is approximately the number of TCP control packets captured divided
by the total number of packets sent both ways, expressed as a percentage. The work weight for each ip source is roughly:
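One plausible reading of the work weight, based only on the syn tuple fields described earlier (SYNS sent, FINS returned, RESETS returned, and total packets in both directions), is sketched below in Python. The exact formula used by the front-end is not reproduced in this text, so the function name, its parameters, and the cap at 100 are assumptions.

```python
def work_weight(syns_sent, fins_returned, resets_returned,
                pkts_sent, pkts_returned):
    """Approximate TCP control packets as a percentage of all TCP packets."""
    total = pkts_sent + pkts_returned
    if total == 0:
        return 0.0                      # no traffic, so no weight
    control = syns_sent + fins_returned + resets_returned
    return min(100.0, 100.0 * control / total)
```

Under this reading, a host that sends 100 SYNs and nothing else scores 100, while a host that exchanges 1000 packets with 20 control packets among them scores 2.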
The front-end collects a topn syn list which may of course have 1000s of "syn flows" in it.
It sorts the syn list for the topn_syn report, and does additional processing for
the various other syn_list reports (e.g., the port signature in tcpworm.txt ).
As a result, the top N syn list when viewed as a sorted list of SYN sending IPs
is bounded as of course it is a top N list. The tcpworm.txt list is not bounded.
However typically barring large distributed attacks, it is far smaller than the total number of IP sources
in the complete top N syn tuple list.
Let us first look at the setup in the ourmon.conf file and then we will consider the
results in more detail:
The first command
topn_syn turns on the basic topn_syn graph.
The argument to topn_syn (60) specifies how many hosts
you want to capture in terms of pictures. This argument should be a multiple of 10, from 10 up to a maximum
of 100. A value of 60 will produce 60 bars' worth of individual SYN "flows" (per-host stats), assuming you have 60 flows
to capture.
topn_syn_wormfile turns on the tworm RRDTOOL graph output and also the tcpworm.txt port signature report.
In other words, output based on the worm metric is produced. There are actually
several front-end outputs including the tworm graph, which graphs integer counts showing
the number of external ip sources and internal ip sources appearing in the tcpworm.txt file,
and the tcpworm.txt file itself, which like the mon.lite file is passed to the back-end
for further processing. Note that we assume you will use this function, and it is probably not wise to turn
it off.
The second argument to topn_syn_wormfile is a directory path which specifies
where to place the tcpworm.txt report in the front-end file system.
This argument is currently overridden with the -D option to the front-end ourmon probe
(see bin/ourmon.sh). -D
causes all ourmon probe output files to be placed in a specified directory, typically
/home/mrourmon/tmp.
If two systems are used, one as the front-end probe, and the
other as the back-end graphics engine,
the administrator must arrange that the tcpworm.txt and other probe output files will be copied to the back-end,
and further processed by shellscripts produced in the configuration process.
(Note that the current back-end shellscript driver (bin/omupdate.sh) has commented out code
in it that uses wget to get all needed files from the probe to the back-end ourmon code).
Tcpworm.txt
is the rough output that the back-end processes further to produce the port signature report
and other outputs.
We will discuss these reports more below.
topn_syn_homeip specifies what you consider to be the "internal" subnet used for "us" in the tworm.html output.
This allows the front-end to decide what is "us" versus "them" in terms of the tworm RRDTOOL graph.
The first argument is a network IP and the second argument is a netmask. If you are using ourmon
on a host-basis and not watching an entire network, you can set this value to be your host IP,
and then use a /32 netmask. This net info is also used in the syndump report.
Note that 1 to 10 such declarations can be made in the config file.
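The "us" versus "them" decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the probe's C code: the function names and the example networks (taken from the documentation address ranges) are assumptions.

```python
import ipaddress

# Hypothetical home networks written as network-IP/netmask pairs, in the
# spirit of topn_syn_homeip. The addresses are documentation examples only.
HOME_NETS = [
    ipaddress.ip_network("192.0.2.0/255.255.255.0"),
    ipaddress.ip_network("198.51.100.7/255.255.255.255"),  # single host (/32)
]

def is_home(ip, home_nets=HOME_NETS):
    """Return True if ip falls inside any configured home network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in home_nets)

def count_us_them(sources, home_nets=HOME_NETS):
    """Count internal ("us") and external ("them") IP sources."""
    us = sum(1 for ip in sources if is_home(ip, home_nets))
    return us, len(sources) - us
```

The pair returned by count_us_them corresponds loosely to the two integer counts (internal and external sources) that the tworm graph plots.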
topn_syn_p2pfile causes the creation of the p2pfile.txt probe output file. This file is the subset
of all syn tuples where the host in question has matched one of the hardwired application flags
for a well-known P2P application including Bittorrent, Gnutella, Kazaa, Limewire/Morpheus, and Edonkey (which
is unfortunately prone to false positives).
These flags are based on very low-cost pattern matching (signature checking is done in C code).
It also includes hosts using Internet Relay Chat. This file is processed by the back-end to create the p2p port report,
which is viewable on the IRC web page as well as at the top of the main page.
Note that this report does NOT include application flags set by the user programmable
PCRE/Layer 7 matching mechanism.
topn_syn_syndump causes the creation of the syndump.txt probe output file. This file is the subset
of all syn tuples that belong to the home subnet (see topn_syn_homeip above).
The daily summarization for this file is probably the most useful output as it can be used to check
gross statistics for any home IP address including initial and final timestamps.
topn_syn_emaildump
causes the creation of the emaildump.txt probe output file. This file is the subset
of all syn tuples where a special email port only TCP work weight has been computed. The resulting
port reports (now and summarizations) are thus focused ONLY on email application hosts. The email ports
in question are 25 (normal), 587 (submission) and 465 (ssl-based smtp). The goal is that outputs
can be used to check for large volume email hosts that are not on the usual list of email servers (i.e.,
hacked spambot boxes). Of course this may also help you determine which hosts are actually sending email
and which hosts are the largest senders of email.
honeynet
may be used to specify a darknet as a net/mask to the ourmon probe.
If packets are seen directed to the honeynet the P application flag
will be associated with them.
A darknet of /24 will do the trick. This is a very useful malware
collector as typically only
malware scanners will send packets into a darknet. P may very well
mark the dubious IP address
as a result. This feature is well worth using. This feature
is on by default but has a ridiculous IP address (127.0.0.2).
You must configure it to a reasonable darknet if at all possible.
It won't be useful until you do.
Finally tworm_weight can be used to specify a value ranging from 0..100 percent that is used to filter
the number of hosts shown in the tworm RRDTOOL graph. It is hard to say what is a reasonable number here.
It may be that a low number such as 30 will capture Layer 7 password guessing attacks. Higher numbers
(like 70) will tend to catch scanners.
A lower number may be useful because a common attack against one of the SQL-slammer ports (1433)
may actually result in two-way data exchange. This is because SQL servers are returning Layer 7 data that
is basically saying: "the password guess is wrong". As a result, the work metric for an attacking IP source
may be lower if data (albeit negative data) is being returned by the attacked host.
If this switch is not used, the default filter weight is 0%. This means no filtering is done.
There is at this time no way to filter the port report itself or the tcpworm.txt file as an input.
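As a rough sketch of what the tworm_weight filter does, the following Python keeps only hosts at or above a given work weight. Whether the probe compares with "greater than" or "greater than or equal" is not stated here, so the comparison below is an assumption, as are the function and variable names.

```python
def filter_by_weight(hosts, tworm_weight=0):
    """Keep (ip, work_weight) pairs whose weight is at least tworm_weight.

    tworm_weight is a percentage in 0..100. The default of 0 keeps every
    host, which matches "no filtering is done".
    """
    if not 0 <= tworm_weight <= 100:
        raise ValueError("tworm_weight must be in the range 0..100")
    return [(ip, w) for ip, w in hosts if w >= tworm_weight]
```

For example, with hosts at weights 95, 35, and 5, a setting of 30 keeps the first two, while a setting of 70 keeps only the first.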
For each IP address in the topn_syn histogram,
we show per 30-second sample period syns, fins, resets, total packets, the work weight metric, and the application flags.
All IPs are sorted by max syns hence top N syns.
The topn_syn back-end graph information per ip source address
is presented roughly as follows in the legend above the associated bar. The bars taken
together give a relative strength approximation of the senders. The first bar is always
taken as 100% strength. Others are relative to it.
The IP address is the IP source address for the SYNS being sent. It is followed
by total syns, fins, resets, and total packets for the IP host. This is followed
by the work weight and application flags if any. This host appears to be a scanner
sending packets into a darknet.
Although false positives are possible,
we have observed the following about the work weight system:
mon.lite output for tworm appears as follows:
The TCP "port report" or "port signature report" has this name because it includes a small sample of TCP destination
ports from the IP source in question. Thus it gives a set of destination ports
which we call a "port signature". This port signature may allow you to see a new virus
at work and often some virus/worms have distinctive "port signatures". For example, see
ports 5554 and 9898 as found below in a small example of real port report output.
This is a signature of the dabber worm.
The port signature report has three sections. The first and most important section
is the port signature report itself, which is sorted in ascending order of IP source
address. This sort may allow you to see infected hosts that share the same subnet.
It also lets you see parallel scans and attacks at one glance.
Each line is for one IP address and the statistics associated with that IP address including the TCP destination
port signature at the end of the line.
The second section is based on a daily "db" file database that tells you
if the current set of destination ports seen are "new" in this port sample, or "old".
This database is started again every night at midnight.
New simply means so far today we have not seen that particular set of ports before. "old" means
we have seen that particular port set before. For example, note the destination ports
given below (5554, 9898). This set might have appeared in previous port signature reports
for the day, or it might be new for this particular report. New port signatures are
also stored in the new port signature log.
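The "new" versus "old" bookkeeping can be pictured as a set of port sets that is emptied once a day. The class below is a minimal in-memory sketch under that assumption. The real back-end uses a "db" file, and the class and method names here are hypothetical.

```python
class PortSignatureDb:
    """Remember which destination-port sets have been seen so far today."""

    def __init__(self):
        self._seen = set()

    def classify(self, ports):
        """Return "new" the first time a port set is seen, "old" afterwards."""
        key = tuple(sorted(set(ports)))  # order and duplicates do not matter
        if key in self._seen:
            return "old"
        self._seen.add(key)
        return "new"

    def reset(self):
        """Forget everything; intended to be called at midnight."""
        self._seen.clear()
```

With this sketch the port set (5554, 9898) is reported as "new" once per day and "old" on every later sighting until the reset.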
The final small chunk
of information in the port signature report is a simple condensed help reminder
that is a guide to the hardwired application flags only. If application flags
are implemented via PCRE pattern matching, you have to remember what the flags mean.
They aren't shown in this guide.
With the main port signature, we sort first by IP source address from small to large.
Thus one can easily note attacks that come from the same IP source "locale",
including attacks from the same IP subnet possibly synchronized with some bot mechanism.
Each line represents culled information for one IP source deemed anomalous in terms
of its SYN output. Let's look at some examples:
The ip source address is given, followed respectively by the (worm) flags,
application flags,
work,
SA/S,
unique L3 IP destination address counts and L4 TCP port counts,
1-10 max L4 TCP source port counts and one sampled port,
sampled IP destination port,
total sent and received TCP packet counts,
sampled dst pkt counts/total dst pkt counts,
and port signature tuple fields.
The flags field attempts to convey whether the "work" in question is two-way or simply
one way from the IP source and provides details about the nature of TCP control data as well.
The total set of flags is as follows with a rough explanation given for each:
Scanners or worms commonly produce WORM or WOM, although a network administrator who chooses to
might produce more RESETS from local routers and/or hosts, and this could be useful in detection
of scans. The flags field is explained in more detail in the verbose tcpworm.txt output file.
The work weight as before roughly represents the number of TCP control packets divided by all
TCP packets sent to and received from the IP source address in question. 100% means more or less
all control and no data. From experience we suggest that work weights in the range of 60% or more
should be deemed highly suspicious, although there are rare cases of noisy clients that
for some reason cannot reach their server, or email servers that are trying to reply to spam
(which will always fail). We have observed over many months that in general anything
with a weight over 60% is "anomalous" and in most cases not benign.
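The rough work weight calculation described above can be written as a small Python sketch. It assumes the caller already has a count of TCP control packets and a count of all TCP packets for one IP source. Exactly which control packets the probe counts is not restated here, so the inputs and function names are illustrative.

```python
def work_weight(control_pkts, total_pkts):
    """Return control packets as a percentage (0..100) of all TCP packets."""
    if total_pkts <= 0:
        return 0.0  # no traffic seen, so no weight can be computed
    return min(100.0, 100.0 * control_pkts / total_pkts)

def is_suspicious(weight, threshold=60.0):
    """Apply the 60% rule of thumb from the text."""
    return weight >= threshold
```

A host that sent and received 100 TCP packets, 90 of them control packets, gets a weight of 90% and would be flagged. A host with 5 control packets in 100 would not.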
Low work weights may mean you have a noisy web server
that is sending large numbers of small packets per connection to a client.
On the other hand, low work weights may be associated with P2P applications, Layer 7 password attacks on web servers
(port 80 as a destination port may indeed be significant, but it could be an attack or
a benign web client use), or irc server bots with a high rate of JOIN message churn.
The latter very well may be a trojan. Thus it is important to note:
A low work is not necessarily a benign indication but often other indicators
taken together will paint a benign picture.
The application flags field (which is also used in the p2p version of the port report and in fact
defines that report) gives various hints in some cases derived from L7 packet payload scanning about
the nature of the applications on the host in question. Ourmon has several flags included by default and
the user can also create new flags together with their accompanying patterns as
described in Layer 7 matching.
In general the list below should be regarded as reserved letters. We use letters A-Z and a-z
for flags. Any remaining letters can be used with PCRE-based Layer 7 matching patterns to
create new tags that will show up in the port report and in various other places including
most top N lists. The tag items given below are very efficient and are done with a few lines
of C code. In general the P2P applications
here (and IRC) indicate the start of a peer to peer exchange (not a client/server exchange).
Thus it is possible that a host may appear to have a "worm" or TCP scanner when using
Gnutella and not show a "G" flag simply because it is completely failing to ever
find a real Gnutella peer. With Gnutella this
seems to be a fairly common occurrence.
The application flags by default include the following reserved tags out of the A-Z,a-z space:
The SA/S field expresses the total percent (0..100) of SYN+ACK packets typically sent as the
second packet of the TCP 3-way initial handshake divided by the total number of SYN
packets sent from the IP source in question. There are three possible thresholds here.
0 means the system in question is a client. 100 means the system in question is a server.
A number in-between (which is commonly found with P2P systems) shows that the system
in question has both server and client functionality. One should correlate this information
with the port signature information, in particular when the sample space of 10 possible
ports is filled up. A low work weight, 100% for SA/S and 10 out of 10 ports in the port signature
typically indicates a server (which is typically either a web server or a server undergoing a L7
attack returning error messages like "404 not found", or "password failed").
One interesting facet of this field is that occasionally one will see a work weight of
100% and an SA/S value of 100%. This can mean that the host in question is performing
SYN/ACK scanning.
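The reading of the SA/S field given above can be sketched as follows. The sketch assumes that the SYN count includes every packet with the SYN flag set (so SYN+ACK packets are a subset of it). That assumption and all the names below are illustrative and not taken from the probe source.

```python
def sa_s_percent(synacks_sent, syns_sent):
    """Return SYN+ACK packets as a percentage (0..100) of SYN packets sent."""
    if syns_sent <= 0:
        return 0.0
    return min(100.0, 100.0 * synacks_sent / syns_sent)

def role(sa_s):
    """Interpret an SA/S value using the three cases from the text."""
    if sa_s <= 0:
        return "client"
    if sa_s >= 100:
        return "server"
    return "client and server"

def possible_synack_scan(work_weight, sa_s):
    """Flag the 100% work weight plus 100% SA/S combination."""
    return work_weight >= 100 and sa_s >= 100
```

The last function only marks a host as worth a closer look, since the text says this combination "can mean" SYN/ACK scanning rather than proving it.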
The L3D/L4D destination field is derived from the ourmon probe scanner module.
The L3D field shows the exact number of unique L3 IP destinations during the sampled period.
Although in practical experience this field is not misleading, it should be noted that there
is no guarantee that the IP destinations are indeed TCP peers (they might be UDP peers). (This
is a bug and will hopefully be fixed in a future release). The L4D field shows the unique
count of L4 TCP ports seen during the sample period. These fields can often be used to
give you an approximation of the behavior of the IP host in question especially if it is a scanner.
The L4S/src field
gives information about L4 source port fields for the IP source in question.
It samples 1 to 10 ports maximum during the sample period and displays
a count of the number of sampled ports seen (L4S). This count cannot
go higher than 10. Typically servers will have low counts. Scanners, web clients,
and p2p clients typically have high counts due to client-side threading.
One sampled src port is shown (src). This may match the most used server port
for a busy web server.
The ip dst field
gives information about one sampled IP destination address. This
may not be useful normally but it may show many remote hosts attacking
one campus host.
The snt/rcv field
gives counts for the total TCP packets sent and received from
the host during the sample period.
The sdst/total field gives an estimate of how well the port signature mechanism worked.
The sdst number represents the total number of sampled packets stored as counts in the port signature tuple.
The port signature tuple samples the first 10 destination ports seen coming from the IP source
in question during a sample period. Each tuple consists of a TCP destination port, and the packet count associated
with that destination port. The total field consists of the total number of packets coming
from that IP source. If the sdst value is much less than the total and all 10 port tuple
buckets are full, then this is a good indication that the port signature field failed to
catch much in terms of its samples. These two numbers are not expressed as a percent because
the two values can give you some idea of how many packets per sample period are being
sent by the IP source in question. For example if you see that the work weight is 100%,
and that packets are being sent to say 5554, 9898 as above, you can then estimate from the numbers
given above that probably 3 syns per second are being sent.
The port signature field itself results from a possible sample size of 10 TCP destination ports
captured for the IP source. The destination port is the first field in the port 2-tuple.
Note that in order to help spot similar attacks, the ports are sorted in ascending order
from left to right.
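To make the sdst/total relationship concrete, here is a small Python sketch of a port signature sampler. It assumes the input is simply the sequence of TCP destination ports seen from one IP source during one sample period. The function names are hypothetical, and the 30-second period is taken from the sample period mentioned elsewhere in this document.

```python
def port_signature(dst_ports, max_buckets=10):
    """Sample the first max_buckets distinct destination ports.

    Returns (signature, sdst, total), where signature is a list of
    (port, packet_count) tuples sorted by port, sdst is the number of
    packets that landed in a bucket, and total counts every packet.
    """
    buckets = {}
    total = 0
    for port in dst_ports:
        total += 1
        if port in buckets:
            buckets[port] += 1
        elif len(buckets) < max_buckets:
            buckets[port] = 1
        # otherwise all buckets are full and the packet is not sampled
    signature = sorted(buckets.items())
    sdst = sum(buckets.values())
    return signature, sdst, total

def packets_per_second(total, period_secs=30):
    """Estimate the send rate from the per-period total."""
    return total / period_secs
```

When a source touches more than 10 distinct ports, sdst falls behind total, which is the "signature failed to catch much" case described above. A total of 90 packets in a 30-second period corresponds to 3 packets per second.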
In addition there is a "guess" field that is not available at this time, and as a result
has the NA value in all cases.
We can sometimes tell by the port number itself that an attack is occurring. For example,
in general one should be suspicious of any of the following Microsoft ports in the UDP or TCP
port signature report.
The following page gives an hourly summarization for the TCP port report:
Below we explain one sample entry taken from a summarized port report.
The above per IP summarization has four lines.
This basic form for summarizations is used in the p2preport,
the syndump report, and is more or less what is used with the email
syn port report as well, although that form has some email syn
specific information in it.
See
email syn port signature report
below for more information on the latter form of the port report.
Hourly summarization is available for this report
as well. The hourly summarized version is sorted by total packet counts.
See
daily summarized version of p2p port report
for a sample version of the summarized form of this report. The
weekly set of summarized versions is available (as usual) at the bottom
of the main web page.
The syndump report is sorted by total packet count for both 30-second and summarized
versions. It is relatively expensive in terms of CPU.
The tuple and file format is the same as with the TCP port signature report.
Except in this case it is not sorted by IP address, but instead
by total packet count. Thus it can be viewed
as a form of Top N talker in terms of IP packets.
This file is the natural target for PCRE tags which can be used
to determine what local hosts are doing in terms of traffic analysis.
See the
PCRE section below for more information.
The daily and hourly summarized version of this report may
be found here:
daily summarized version of syndump port report
A weekly set of daily reports for the syndump is available near the
bottom of the main page with the other summarizations.
The goal is to help you identify local email systems in order to
know about possible anomalous email systems.
The email dump has a summarization as well that is sorted by email SYN packets.
The summarization form is mostly similar to the TCP port signature summarization except
for one email specific line (line 4) which might appear as follows:
Furthermore it may be used in conjunction
with the
In addition to the P and D in the TCP and UDP port reports, this
feature can also be used to cause an event log warning about writes
to the specific subnet. See
system event log
section for more information.
Furthermore it may be used in conjunction
with the
Currently the batch report gives the total number of instances for the IP source in question,
total pkt counts for syns/fins/resets and all pkts (data too),
as well as an average work weight.
It also presents a small sample (5) of port signatures. The port signatures presented here
are of the form [port_destination, packet count].
First and last instance time seen are also given.
One list is generated for ICMP errors associated with individual IP source addresses, and an additional list is generated
for UDP-based IP hosts. The ICMP list is important for catching
scanners generating ICMP errors (which may or may not be generating ICMP packets
themselves). The latter is focused on UDP-based anomalies
which may include UDP scanners, DOS attacks, or badly-behaved UDP-based applications.
The rationale for these top N lists is the same -- they focus on hosts that
generate large numbers of ICMP errors. In addition, the UDP list can be quite helpful
in catching UDP specific scanners.
In the mon.lite file, these lists are shown as follows:
In the graphical histogram representation for the ICMP icmperror_list
the current label appears as follows:
In the next few sections,
we will discuss the UDP list in more detail as it has several features.
The UDP work weight is used for sorting. The weight is basically:
The UDP port signature report is basically a TOP N report sorted by
weight. It tends to make those IP hosts causing network errors
to be higher in the report. Thus scanners, DOSsers, and hosts
running p2p will tend to appear at the top. If such hosts
do not exist, it is quite possible that local DNS servers will appear
at the top as they often have ICMP errors being returned to them
for various reasons (spam is one possibility).
In the graphical representation for the udperror_list,
we sort on the computed weight as mentioned before.
A typical UDP label in the graphical version might appear as follows:
The ASCII UDP port report is created by omupdate.pl and is more verbose
than the histogram-based graphical system.
The report presents the per IP source information
in a form similar to the TCP port signature report. Port packet tuples
are of the form [destination port, frequency]. In other words, packet counts
are not given here, but instead are given as a percent of the total packets
captured in the port signature sample. For example, one might see the following:
Three hosts are shown in sorted order. The first is
performing an IP address scan of the local address space
and is sending so-called SPIM (hence the s flag). The second
is performing a slower scan and may be a botnet client or worm.
(port 137 is a possible giveaway here as it is a Microsoft file share
port). Note the P in the appflags. This host has scanned into
the local "darknet".
Ports 137 and 1434 (SQL) should be viewed with suspicion.
The last entry is a local DNS server. Note port 53 as dominant
in the port signature.
Given the apparently sporadic nature
of UDP attacks (compared to 7x24 TCP attacks), we have three additional mechanisms to help in analysis. These include the
udpweight graph discussed in the next section, the
system event log,
and the new trigger system,
which can be set up to automatically capture the top UDP weight based on a pre-configured
threshold. See triggers - automated packet capture for more information
on the latter.
The system event log is automatically configured to capture any UDP anomaly in this list
with a weight greater than 10,000,000. Basically omupdate.pl stores the first line of the ASCII
UDP port signature report in the event log if such an event occurs. The threshold in question
is configured in bin/omupdate.sh as a parameter (-u) passed to omupdate.pl and may be changed if
so desired. However we find that our supplied 10,000,000 value works well.
Note that the UDP port signature report is logged in /home/mrourmon/logs/udpreport.
However look in the
event log for the precise time
for any large attack that exceeds the UDPWEIGHT value
found in bin/omupdate.sh. omupdate.pl will put UDP port report style
info in the event log in this case. This includes the IP address
of the attacker and other information. A sample UDP weight event log
entry is given below:
Using this UDP work weight graph you may be able to
decide to increase or decrease the UDP weight threshold passed to omupdate.pl
or given to the ourmon probe to trigger automated packet storage.
If you want to modify the UDP work weight, modify it in
bin/omupdate.sh, which is back-end code.
The topn_port_scans filter presents three separate graphs, but in general,
looks at single IP sources
sending packets to many unique L4 TCP and UDP destination ports.
We sort on the maximum L4 destination ports. There are three graphs because
the ip_portscan graph counts both TCP and UDP ports (and does not discriminate between the two),
while the tcp_portscan and udp_portscan graphs only show TCP and UDP destinations respectively.
Thus both topn_scans and topn_port_scans are 1-N in terms of their basic mapping.
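The three port scan lists can be pictured with the following Python sketch. It assumes the input is a stream of (source IP, protocol, destination port) records. The record layout and names are illustrative only, and the combined "ip" list treats TCP and UDP port numbers as one space, following the description above.

```python
from collections import defaultdict

def topn_port_scans(packets, n=10):
    """Rank sources by the number of unique L4 destination ports.

    packets is an iterable of (src_ip, proto, dst_port) with proto being
    "tcp" or "udp". Returns a dict with "ip", "tcp", and "udp" lists of
    (src_ip, unique_port_count), largest count first.
    """
    seen = {"ip": defaultdict(set), "tcp": defaultdict(set), "udp": defaultdict(set)}
    for src, proto, port in packets:
        if proto not in ("tcp", "udp"):
            continue
        seen["ip"][src].add(port)   # combined view, TCP and UDP not distinguished
        seen[proto][src].add(port)  # per-protocol view
    result = {}
    for name, table in seen.items():
        ranked = sorted(((src, len(ports)) for src, ports in table.items()),
                        key=lambda item: (-item[1], item[0]))
        result[name] = ranked[:n]
    return result
```

Each list is a 1-N mapping: one source address paired with the count of distinct destination ports it touched.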
The ourmon.conf configuration is setup as follows:
We give an IP address, the count of destinations, and the
application flags field which shows that this host is actually
doing the P2P Bittorrent protocol (and it isn't a scanner
in the malware sense, it merely
has many peers).
Results are sorted in order of unique destinations.
It should be pointed out that information here may be correlated
with information shown in the UDP errors or TCP syn top N mechanisms.
There currently is no summarization although averaged L3D/L4D values
may be found in various places in summarized versions of the TCP port
report.
The top N flow mechanism may also be of use for network security
in terms of anomaly detection. There are fundamentally two
different sets of graphs shown here. First of all, it has proven
very useful to use a RRDTOOL graph to display the count of
all four kinds of flows. (The count is the tag that follows
the flow tag in the mon.lite file. It is the count of unique
flows of that type, of which only the top N are provided
as tuples in the mon.lite output).
We call this the "flow_count" graph.
This graph shows small and large attacks and classifies them
as to whether they are TCP, UDP, or ICMP in origin. It shows
the attacks simply because "scanning" means variation in IP destination,
and/or L4 ports (or ICMP major/minor numbers) which are classified as separate flows as a result,
even though they may be flows of one packet only. This graph
has proved to be a particularly good indicator of network-wide scanning
attacks, including single instances of worms like the UDP slammer.
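A short Python sketch shows why scanning inflates the flow count. It assumes a flow is identified by the tuple (protocol, source IP, destination IP, source port, destination port). The probe's actual flow key and the function name are assumptions made for illustration.

```python
def flow_counts(packets):
    """Count unique flows per protocol.

    packets is an iterable of (proto, src_ip, dst_ip, src_port, dst_port)
    tuples. Repeated packets of the same flow are counted once.
    """
    flows = {}
    for pkt in packets:
        flows.setdefault(pkt[0], set()).add(pkt)
    return {proto: len(members) for proto, members in flows.items()}
```

Under this model a bulk transfer of 100 packets between two hosts is a single flow, while a scanner sending one packet to each of 100 destinations creates 100 flows, so the scan stands out in the count even though it carries very little data.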
The companion graph, "topn_ip insert count", shows the counts
of inserts in the flow hash list, which again are due to separate flows.
Inserts result in malloc(3) calls and this graph also tracks attacks
fairly well. This graph is probably useful mostly for possible
malloc problems with the ourmon probe though.
In addition, it should be pointed out that the ICMP top N flow graph
may be useful. In particular, the ICMP flow summarization report
has a tendency to reveal hosts that are engaged in long term scanning
simply because they pile up extraordinary amounts of errors. (It may
also show a long term ping of course). These
scanners may be either TCP or UDP-based. In the TCP case, errors
may occur due to administratively prohibited ICMP unreachables,
ICMP redirects, or TTL exceeded errors. In the UDP case, UDP scanners
may in addition pile up ICMP port unreachable errors over time.
As a result, summarization (or the current ICMP flow page)
may be useful for detecting
scanning systems as well. Mass quantities of
ICMP unreachables, TTL exceeded, and
routing redirects may be treated with suspicion.
Typically a trigger event occurs when a runtime current value
as seen with a graphics-engine graph exceeds some
administratively set ourmon probe configuration-time threshold integer.
The threshold integer is calculated in terms of back-end values,
either in bits per second, or packets per second.
A per trigger type dump directory must be specified (say /usr/dumps)
and created before the ourmon probe is started.
Each runtime trigger when enabled places a unique instance of a tcpdump
dumpfile filled with the specified count of packets in an instance file.
The instance file name is always of the form:
Each trigger has some
sort of implicit BPF expression capability
associated with it, which is not user specifiable at this time.
For example the UDP weight trigger
dynamically determines the top IP host address, and stores count
packets of the form: "host IP and (udp or icmp)", thus storing
UDP packets and any ICMP errors associated with the host IP that
caused the trigger to activate.
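The implicit expression for the UDP weight trigger can be illustrated by building the equivalent BPF filter string in Python. This is only a sketch of the string format quoted above. The function name is hypothetical, and the probe builds its expression internally in C.

```python
import ipaddress

def udp_trigger_filter(host_ip):
    """Return a BPF expression matching UDP and ICMP traffic for one host."""
    ipaddress.ip_address(host_ip)  # raises ValueError if not a valid address
    return "host %s and (udp or icmp)" % host_ip
```

The resulting string can be handed to tcpdump when reading a dump file by hand, which is a convenient way to reproduce the same view of the traffic.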
Current trigger types include:
In ourmon.conf, triggers in general should be placed at the
end of the configuration file as they assume that certain
filters exist and are turned on. They have the following
general syntax:
trigger_name threshold packet_count dump_dirname.
threshold - specify a threshold in bits/sec or pkts/sec or units
depending on the trigger type. When the named filter reaches that threshold,
packet capture will begin. The threshold is always specified in
terms of a backend graphics-engine value.
Packet dumping is turned off when either the count is exceeded
or the current runtime value falls below the trigger threshold,
whichever comes first. Packet dumping after being triggered cannot recur
between trigger on and trigger off times.
For example, assume the packet capture count is 10000 and the threshold value is 2000.
If at mon.lite creation time the value is 2000, a trigger_on event
is started and up to 10000 packets may be stored. If the count is exhausted and
the value is still above the threshold, no more packets will be stored in
the file. Thus it is not possible for one trigger_on event to cause
more than one output file concurrently. When an event is retriggered,
a separate file with a different timestamp is created.
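The on/off behaviour described above can be modelled as a small state machine. The Python class below is a sketch under stated assumptions: the trigger turns on when the sampled value reaches the threshold, turns off when the value drops below it, and stores at most packet_count packets per on period. The class, its method names, and the exact comparison operators are illustrative and not taken from the probe.

```python
class Trigger:
    """Toy model of one ourmon-style trigger."""

    def __init__(self, threshold, packet_count):
        self.threshold = threshold
        self.packet_count = packet_count
        self.active = False
        self.remaining = 0
        self.files_opened = 0  # one dump file per trigger_on event

    def on_sample(self, value):
        """Feed one per-period value; return "trigger_on", "trigger_off", or None."""
        if not self.active and value >= self.threshold:
            self.active = True
            self.remaining = self.packet_count
            self.files_opened += 1
            return "trigger_on"
        if self.active and value < self.threshold:
            self.active = False
            return "trigger_off"
        return None

    def on_packet(self):
        """Return True if the packet should be written to the dump file."""
        if self.active and self.remaining > 0:
            self.remaining -= 1
            return True
        return False
```

Once the packet budget is used up, on_packet keeps returning False even while the value stays above the threshold. A new file is only opened after the value has dropped below the threshold and risen again.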
Trigger_on and trigger_off messages are put into mon.lite as "elog"
messages and are recorded in the back-end daily
event log.
As a result
it is possible to easily see when trigger on and off events have
occurred. The trigger dump filename is passed to the event log.
Note that if the probe and back-end graphics engine run on separate
computers, due to the imprecision of synchronization, a given
on or off message may NOT appear. However usually both messages
will appear, or at least the on or off message will appear.
Trigger messages in the event log appear as follows:
The trigger on message means that the trigger for tworm is on because
the current total "worm" count is 20 and has exceeded the configured
threshold of 10. The trigger off message indicates that packet
capture is done. The trigger will capture and store packets
into the file:
Note: at this time ourmon has no way to determine if the
dynamically created file has been created successfully
or if there is runtime storage room within the file system
in question. If the file cannot be created an elog message
will be sent. The ourmon probe will not exit.
In the config file one might use the following as an example:
The filename has the form:
For example, assume that the portsignature.txt
file shows that port 6666 was attacked, then one can do:
# tcpdump -n -r "dumpfile" tcp port 6666
This helps to narrow down the packet dump to a more
relevant set of packets.
This trigger is based on the
udperror_list (topn_icmperror). More information
can be found in the following sections:
Note that this threshold should be set the same both
in the front-end config file in the trigger config and also in the backend
omupdate.sh file. The latter file causes the offending
line from the UDP port signature report to be placed in
the elog event file. As a result it becomes simple to identify
the IP source address of the offending UDP system. Also
the UDP destination ports in question are also supplied.
The ourmon.conf trigger config value of course causes the trigger
to happen and try to capture UDP packets from the IP source in question.
The capture filename has the form:
topn_udp_err.< timestamp >.dump
This trigger will capture and store UDP packets sent to and from
the top host in udperror_list. The internal BPF expression used is:
"ip_src and (udp or icmp)".
The event log entries for topn_udperror will appear roughly as follows:
The dump filename is created by concatenating the major and minor labels
along with a prefix of "bpf" and a timestamp to make the dump file unique.
For example, a dump for the above could have the following name:
The capture filename has the form:
IRC information is enabled in the front-end probe by
the ourmon.conf config tags:
For the IRC messages the probe creates two sets of IRC tuples
including irc host tuples and channel tuples. The host tuple
contains statistical information about IRC usage for a given host participating in IRC communication.
The channel tuple contains statistics about IRC channels including
the set of hosts participating in the channels.
The information produced also includes data taken from the topn_syn
module and includes a TCP work weight for each IRC host. This work
weight is computed via the rough control packets divided by total
pkts function discussed in the
port signature report
but does not include a strength metric (number of syn packets sent).
As a result it is a weaker estimate of control.
The work weight is also the maximum value of any work weights seen
during the sample period (30-seconds or hourly/daily summarizations).
The goal is to see if
a number of hosts in a channel are scanning and if so one might
have found a botnet client mesh.
Thus it is possible to determine if an IRC host or channel may be infected
by a worm/scanner/bot. Tuple information for IRC hosts and channels
is placed in the per sample period irc.txt file and handed over
to the back-end for further processing.
The web
page has several kinds of data including:
The IRC report format may be broken down in the following way.
First the file itself consists of three sub-sections including:
Of all this information probably the evil channel, max message channel,
and channel host sub-reports are the most important. Overall there
are really two kinds of formats used in the IRC report. We might
call one per channel and the other per host. Channel statistics
roughly look like the following:
As an example, say
we have a channel named "exploit" with a total of 33 messages (all PRIVMSGS). Five hosts appeared
in the channel and 4/5 had a work weight greater than 60%. (The work weight in the IRC summarization
is not an average but the maximum work weight seen as we are looking for any signs of suspicious
behavior). An E flag is awarded if a channel has more than one wormy host and at least half of
its hosts (discounting servers which may be botnet controllers) are infected. The small e flag
is given if a channel only has one host with a "bad" work weight.
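The E and e flag rules can be sketched in Python as follows. The sketch assumes each host is described by its maximum work weight and a server/client guess, that servers are left out of the count entirely, and that "bad" means a weight greater than 60%. The function name and data layout are hypothetical.

```python
def channel_flag(hosts, bad_weight=60):
    """Return "E", "e", or "" for one IRC channel.

    hosts is a list of (max_work_weight, is_server) pairs.
    """
    clients = [weight for weight, is_server in hosts if not is_server]
    wormy = sum(1 for weight in clients if weight > bad_weight)
    if wormy > 1 and 2 * wormy >= len(clients):
        return "E"  # more than one wormy host and at least half infected
    if wormy == 1:
        return "e"  # exactly one host with a bad work weight
    return ""
```

Applied to the "exploit" example above, where four of five hosts exceed 60%, the sketch returns "E".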
Next we show a sample host-oriented format. This is used
with a few variations in all of the host sub-reports and also
in the very important channel host report which represents a breakout
of hosts and their statistics in their associated channel.
The lsass445 channel perhaps showed up in the evil channel report
or you decided to check it out because you looked at a TCP port report
and found that there were hosts scanning and they were marked with
the I flag for IRC. Or perhaps you thought the channel name
was "interesting". It has two local client hosts in it and one server.
The first host has a total message count
of 161 messages broken down into total
counts for JOINS, PINGS, PONGS, and PRIVMSGS.
The total number of channels for the host
in question is shown.
The maximum work weight
for all instances is also shown. We also
show whether we believe the host is an
IRC server or IRC client. Note that a real
channel/host breakdown minimally has at least
two hosts in it (one client and one server).
The server/client guess is only a guess as to status but
is fairly accurate. Not all IRC implementations necessarily
follow it - it is not unusual for games to register every host as a "server"
in some sense. This is followed by a feeble attempt by the probe
to sample a source and destination TCP port for the IRC host in question.
This is only a sampling technique. Ideally the sample would be kept per host per channel, but
it is only kept per host and may well be wrong if a host is in multiple channels.
It can however sometimes be used to determine the server's listening port.
The last column (which is only available in the summarization channel host report)
gives a timestamp for the first time a given host/channel was seen during the day.
This is in some sense a sign-on time for IRC. It may also be near the time
of an infection in the case of a spreading botnet.
What is shown in the real report is actually the name of the first
30-second port report file as found in /home/mrourmon/logs/irc (rawirc really)
in which the host/channel combination appeared.
In this particular case it is a fair bet that this is a botnet client
mesh because all of the clients have been scanning. One should look
at associated TCP port reports to determine the target ports. Given
the channel name we suspect port 445 for Microsoft file share may
be a target. In this particular case using a tool like ngrep
on the server IP address (and possibly its port number) may be a good idea.
For example, one possible very general invocation could be:
The most important sections of the IRC summarization in security terms are probably
the first three channel sub-reports, evil, max message, and channel hosts.
The second important security section appears at the absolute bottom of the
irc summarization
and gives a list of hosts that may be infected according to the work weight.
Channel names in IRC are case insensitive. Thus it is possible that a channel
named "exploit" and "EXPLOIT" are the same channel. This is probably true if
they have the same set of IP hosts associated with them. The report
mechanism takes all names and reduces them to lower-case. However the chanmap
sub-section gives the mapping of possibly upper-case or mixed-case names
to the lower-case name used by ourmon. If you want to know the possible
true spelling of the IRC channel name, refer to that section.
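The idea of a lower-case key with a map back to the spellings seen on the wire can be sketched as follows. This is a minimal illustration only; the function name and data layout are assumptions and are not taken from the ourmon scripts.

```python
from collections import defaultdict

def build_chanmap(seen_names):
    """Map each lower-case channel name to the set of spellings actually seen."""
    chanmap = defaultdict(set)
    for name in seen_names:
        chanmap[name.lower()].add(name)
    return dict(chanmap)
```

For example, "exploit" and "EXPLOIT" both end up under the key "exploit", and the set stored there lists both spellings.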
We are going to assume that you would use the
bleedingthreats file as an input source for IP addresses
to check against IRC IP addresses. See the
Automated Blacklist
section for more information. The
stirc.pl script must be used. It takes snort rules
and extracts the IP addresses storing only the IP addresses
in a db "database file" suitable for fast lookup via a perl
script. stirc.pl requires snort rules in a certain format.
Ironically the input needed really is only a list of
single IP addresses (/32). If that is desired, take the stirc.pl
script and modify it to make a script that would simply take a list
of IP addresses, one per line, and turn those into the db format.
Note that the IRC blacklist mechanism only uses IP addresses,
and does not use port addresses.
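If the input is just a list of single IP addresses, the first half of such a replacement script (reading and validating the list) might look like the sketch below. It is written in Python purely for illustration; the function name and the '#' comment convention are assumptions, and writing the result into a db file that irc.pl can read is deliberately left out.

```python
import ipaddress

def clean_ip_list(lines):
    """Return the unique, valid IPv4 addresses found one per line."""
    ips = set()
    for line in lines:
        text = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not text:
            continue
        try:
            ips.add(str(ipaddress.IPv4Address(text)))
        except ipaddress.AddressValueError:
            continue  # skip anything that is not a single IPv4 address
    return sorted(ips, key=ipaddress.IPv4Address)
```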
Below we show how the script stirc.pl, supplied in src/scripts,
might be used to process snort rules to produce an output
file called ourshadow.db. Ourshadow is a ".db" file that can be used
by the bin/irc.pl script for checking IRC IP addresses against
the blacklist.
Once you have the .db file, edit bin/batchip.sh to
include a -B parameter with a file argument as follows
Note that the db filename does NOT include the final .db suffix.
For example, the completed modified invocation of the irc.pl
script would now look like this.
Ourmon has two kinds of blacklist systems based on either looking
for known bad IP addresses or known bad DNS names. These systems
are currently used in three ways.
Blacklist matches produce event log messages in the ourmon event log.
See
event log
for more information
about event log messages. Most event log messages are included
in that section and explained.
Blacklists are not on by default
and must be manually configured into the ourmon config file.
It is also possible to use crontab to set up an automated script
system that might, for example, fetch at least the IP addresses
from a remote site, dynamically replace the IP blacklist
in the probe, and reboot the probe. There is a section
on how to do that below called
Automated Blacklists.
In this blacklist section we first show how to
manually configure the IP blacklists. Then we refer
you via hypertext links to the DNS and IRC blacklist
mechanisms, as they are parts of wider feature sets. Finally
we discuss how to automate the system, at least for IP addresses,
for the front-end and IRC back-end mechanisms.
The two front-end blacklists are called the
IP blacklist and the
DNS blacklist, and naturally the first works
with IP addresses placed in a file loaded into the ourmon probe
at boot, and the second works with DNS names loaded via config files at boot.
The back-end blacklist must be configured into the irc.pl script.
See IRC blacklist for information on
how to accomplish this form of configuration.
The IP blacklist is in some sense just a general part of ourmon
and may be seen as roughly being part of the "flow analysis" side of ourmon.
TCP or UDP packets may be flagged by this subsystem.
Packets that match the IP dst or IP src address, and/or the port, given
in the boot-time IP blacklist config file (see below)
produce ourmon event log messages. Flows are 1-way and there is only one
event log message per flow per 30 seconds. If two flows are detected
in thirty seconds (bi-directional), then two event log messages will
be produced. A packet count is given.
In addition to event log messages,
packets are also dumped according to the ourmon snap length
size into a "blacklist" file. There is only one blacklist file
per boot of ourmon. If a new config blacklist is somehow generated,
one must arrange to reboot ourmon (manually or with crontab) so that it knows
about the new list and also produces a new blacklist dump file.
The blacklist packet dump file is in tcpdump file format and has a timestamp
suffix as is the case with tcpdump automated output files in ourmon.
Thus the odds are rather high that each ourmon reboot will produce
a unique blacklist file. Again see the
event log section
for more information about the format of event log messages.
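The "one event per 1-way flow per sample period, with a packet count" behaviour described above can be sketched as follows. This is an illustration under simplifying assumptions, not the probe's C code: ports are ignored, the function name and tuple layout are hypothetical, and the blacklist is a plain dictionary from IP address to list label.

```python
from collections import Counter

def blacklist_events(packets, blacklist):
    """packets: iterable of (src_ip, dst_ip) seen in one 30-second sample period.
    blacklist: dict mapping a blacklisted IP address to its list label.
    Returns one (src, dst, label, packet_count) event per 1-way flow.
    """
    counts = Counter()
    labels = {}
    for src, dst in packets:
        label = blacklist.get(src) or blacklist.get(dst)
        if label is not None:
            counts[(src, dst)] += 1          # flows are 1-way: (src, dst) is the key
            labels[(src, dst)] = label
    return [(src, dst, labels[(src, dst)], n)
            for (src, dst), n in sorted(counts.items())]
```

A bi-directional conversation with a blacklisted host therefore yields two events, one for each direction.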
The config code for the IP blacklist function looks as follows:
There is no limit on the number of blist_include files. The second
argument specifies a list label that is put in the event log
for the flow hit. This is used to help the analyst quickly
determine which list caused the hit. Note the double quotes around
the list name. These must be provided.
One IP address per line must be provided, and a port number must be used.
Here are two examples:
Then one should arrange to reboot the ourmon probe via crontab
AFTER new information is dynamically provided. It is also
necessary to modify the batchip.sh script so that the irc.pl
script can find its own blacklist file. More details
for that modification are found in the irc section.
Topn DNS is a new module in ourmon and is an emerging "work in progress".
The basic goal is to analyze Layer 7 DNS traffic in various ways from
both network monitoring and network security points of view.
At this point there are two features (although we should
point out that it is easy to create a BPF graph
to watch packet counts to/from internal DNS servers and this
is a very very good idea). One feature is a simple RRDTOOL
graph of DNS traffic statistics. The other is the DNS blacklist.
The former is on by default and the latter must be configured in.
Configuration in the ourmon config file is as follows:
The dns_include file config line allows a DNS name based blacklist
to be configured into the ourmon probe. Any number of files
may be given. The format of individual entries in the DNS
blacklist is as follows:
DNS query responses are unwrapped and if the question is found
to match the DNS name in the blacklist, an ourmon event log
message is generated. Note that it is important to know
the IP addresses of local DNS servers.
See the
system event log
section for more information. The event log section
shows most events in terms of examples and in particular,
the DNS blacklist log entry is explained there.
This event log message has the prefix:
This information does not directly appear on the web.
It may be searched via find or grep (see
security analysis and logging)
as desired.
Information includes top N logs, front-end files like mon.lite,
and important security files including the TCP port signature report
in the logs/portreport directory, and the UDP port signature
report in the logs/udpreport directory. Back end summarization
scripts run at hourly intervals and summarize this data displaying
some (but not all) of it in the form of web-pages including IRC, top
N, and other kinds of summarizations.
Note that after running
ourmon for one week, in general the logs directory does not grow
any bigger as the next day is zeroed out - its data will disappear
and be replaced during the current day. Also note that log files
are typically NOT web summarizations. They represent 30-second
sample reports. Taken together they can be used to produce
a web summarization which typically is done on an hourly or
daily basis during the current day and then rolled over in the web
directory to become the previous day, etc.
A typical log directory (say portreport) has daily sub-directories
inside it. For example, Mon, Tue, Wed, Thu, Fri, Sat, Sun. Near midnight
the current day's directory is summarized in some cases. The next
day's log information is deleted and can thus be replaced on that day.
So for example if it is now Tue all Tue directories will be removed
and replaced. Each log directory has a symlink file with the form
reportname_today which is created to point at the current day.
The log directories and their contents are as follows:
Note that each event log message is preceded by a timestamp.
However only the first event log message shown here has a timestamp.
Important event log entries include the following types:
Note that this message will not appear unless the darknet/honeynet
(potreport) is turned on in the system. See
honeynet tag feature
for more information about how that may be done.
The event log is summarized in a
weekly summary.
The current
daily event log
is updated
hourly and at midnight becomes the event log for the previous day.
Summarized data at the bottom of the main web page includes:
Format details for TCP port report style summarization may be found in
the
port signature report
section. (Both 30-second and hourly summarization formats are discussed).
Other summarizations exist and are discussed elsewhere including
the
IRC data
and the
TCP scanner log.
Another search technique may be of use specifically
in the portreport directory in conjunction with spikes
in the worm graph. If you see a spike in the worm graph that is
the largest spike for a day this particular search can be very useful.
One can easily find the associated port
report simply by sorting on the size in lines of the port report file.
This is because each IP source address in the port report
gets its own line.
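Sorting the port report files by their size in lines can be done with standard shell tools, or with a few lines of script. The following Python sketch shows the idea; the function name is an assumption and the directory passed in would be one of the daily portreport sub-directories.

```python
import os

def reports_by_line_count(directory):
    """Return (line_count, filename) pairs, largest report first."""
    sizes = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            with open(path, "r", errors="replace") as fh:
                sizes.append((sum(1 for _ in fh), name))
    return sorted(sizes, reverse=True)
```

The first entry returned is the report with the most IP source addresses, which is a reasonable candidate for the sample period behind the spike.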
Roughly the idea is that
the user (you) provides a pattern via a pattern statement in the configuration file.
If a packet matches the pattern the associated tag character (tag)
is displayed in the apps field in the back-end report. The basic idea
is that we are not looking for payloads in individual packets for
intrusion detection purposes. We tag packets in order to do traffic characterization.
For example you can learn that local hosts are using the UDP version of Bittorrent,
or are acting as web servers. All of this is done using PCRE which
can be viewed as grep-like patterns. The patterns must be placed into
the ourmon.conf config file.
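To make the tagging idea concrete, here is a small Python sketch. It is not ourmon configuration syntax and not the probe's implementation: the tag characters, the two patterns, and the function name are all hypothetical, and Python's re module is only broadly similar to PCRE.

```python
import re

# Hypothetical tag table: (tag character, compiled pattern over payload bytes).
PATTERNS = [
    ("F", re.compile(rb"^220[\x09-\x0d -~]*ftp", re.IGNORECASE)),  # FTP server banner
    ("H", re.compile(rb"^HTTP/1\.[01] \d{3}")),                    # HTTP server response
]

def tags_for_payload(payload, snaplen=256):
    """Return the tag characters whose pattern matches the start of the payload."""
    window = payload[:snaplen]  # only the captured part of the packet is searched
    return "".join(tag for tag, pat in PATTERNS if pat.search(window))
```

A payload that begins with an FTP greeting would be tagged "F", and one that matches no pattern gets an empty apps field.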
IMPORTANT NOTE
For PCRE and IRC pattern-matching to work it is necessary for the ourmon front-end
probe to read more than the traditional small default packet size used with tcpdump
and the like. As a result we read and buffer the first 256 bytes of any packet
so that pattern matching may work in the Layer 7 payload area. Thus the
-s 256 switch is used with the ourmon probe as follows:
Here is an example pattern that would match instances of FTP servers:
With the topn_syn filter the "flow" in question is tagged.
However because that filter is IP source oriented,
the IP destination for that packet
is NOT tagged. Thus it is possible to use this mechanism to
determine clients versus servers (with reports like syndump based on topn_syn).
Patterns based on the topn_syn tuple are thus source host IP specific.
This tagging mechanism uses the PCRE (perl compatible regular expression) library.
It is expensive in terms of computation and should not be lightly used.
As a result it is not turned on by default.
Note that the patterns used here are compatible with the patterns
found at:
l7-filter project
Any (TCP) pattern there may potentially be used with ourmon.
The general syntax for PCRE pattern matching is as follows:
Note the difference between the following two patterns:
In addition the following two stateful configuration tokens exist and apply to
the patterns that they encapsulate:
threads
A -T count parameter is
provided to the front-end ourmon executable to specify the number
of child threads. For example,
# ourmon -T 3 etc...
creates four ourmon threads in all. There is always a "master"
thread that is in charge of synchronizing the other threads and forcing
them to stop servicing packets when the alarm period comes, and
statistical counters need to be written to various output files
and zeroed out. Spin locks are used for synchronization and
at this time are not very aggressive in terms of granularity.
Ourmon has two runtime stages, packet-gathering, and report-writing.
During the packet-gathering stage all the child threads and
the master thread try to read packets from the kernel and process
them through all the configured ourmon filters. At report-writing time,
the master thread tells the child threads to line up, and when
they do so, it then writes out the "data store" and reinitializes it,
after which it unleashes itself and the children to read more packets.
This is a comparatively simple model, but has proved difficult to
debug.
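The two-stage model can be sketched with Python threads. This is a toy illustration only: the real probe is written in C and uses spin locks, whereas this sketch uses a mutex and two barriers, and the function name, the pre-built packet batches, and the tiny "data store" are all assumptions.

```python
import threading

def run_probe(batches, n_children):
    """batches[t][p] is the list of packet sizes thread t reads in period p.
    Thread 0 is the master. Returns one report (a dict) per period.
    """
    periods = len(batches[0])
    store = {"pkts": 0, "bytes": 0}
    lock = threading.Lock()
    line_up = threading.Barrier(n_children + 1)   # children plus master
    release = threading.Barrier(n_children + 1)
    reports = []

    def gather(batch):
        # Packet-gathering stage: every thread updates the shared store.
        for size in batch:
            with lock:
                store["pkts"] += 1
                store["bytes"] += size

    def child(tid):
        for p in range(periods):
            gather(batches[tid][p])
            line_up.wait()    # stop servicing packets at report time
            release.wait()    # wait until the master has written the report

    threads = [threading.Thread(target=child, args=(t,))
               for t in range(1, n_children + 1)]
    for t in threads:
        t.start()
    for p in range(periods):
        gather(batches[0][p])         # the master reads packets too
        line_up.wait()
        reports.append(dict(store))   # report-writing stage
        store["pkts"] = store["bytes"] = 0
        release.wait()                # unleash itself and the children
    for t in threads:
        t.join()
    return reports
```

Calling run_probe with n_children set to 3 mirrors the -T 3 case: four threads in all, with the master doing the report writing.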
#ifdef THREAD
is used everywhere in the code to distinguish between THREADED
code and the original single thread model.
Don't use the threads unless you have at least a quad CPU.
Threading with our current architecture is not useful on either a
single, or dual CPU. In fact, it is probably harmful. A single
threaded ourmon on a dual CPU is a good thing simply because
the operating system and the associated NIC card can use up
one thread. We are currently running a threaded probe
on a dell 1950 running FreeBSD current (else the disk controller
wouldn't work).
threads - benchmark results
We are not going to present much in the way of details here.
Informally the best performance can be had with FreeBSD (probably
any BSD). Of course the operating system must be running in SMP mode.
We tried various forms of linux including the MMAP version,
and the stock version. Without threading,
we found that BSD gave the best performance,
followed by the linux MMAP version, followed by the stock
linux version. Linux's performance with libpcap is "less good"
because each packet read requires two system calls.
BSD can read many packets in one read. We did not modify
the MMAP version of Linux to make it threaded. We found
that standard linux with our modified libpcap showed improved
performance but was still not as good as FreeBSD with our
threaded scheme. See the code for more details.
If you are bound and determined to use linux, deploy
it with the MMAP option (both the libpcap and kernel,
where I believe it is on by default). See
Phil Wood
for more details. There is a small section in the snort
manual on this subject.
threads - compilation/installation
Threads in the ourmon probe must be turned on in the per release BSD or
Linux src/ourmon/Makefile* with -DTHREADS and recompiled and installed.
See the relevant Makefile (Makefile.bsd or Makefile.linux) for information
about how to compile with threads enabled.
1. compile - Edit the Makefile so that -DTHREADS is set, then
rebuild and reinstall ourmon in the bin directory.
2. modify the bin/ourmon.sh startup script -
Change the runtime startup script
bin/ourmon.sh.
In the shell function called
start_om
Find the line that starts ourmon which looks something like:
/home/mrourmon/bin/ourmon -s 256 -a 30 -f /etc/ourmon.conf -i em0 -D /home/mrourmon/web.pages &
}
and change it to add a -T parameter which specifies the number
of child threads (more on this below). For example, let's assume
we have an SMP processor with 4 hardware "threads" of some sort. On
BSD, it would make sense to do this as a result:
/home/mrourmon/bin/ourmon -T 3 -s 256 -a 30 -f /etc/ourmon.conf -i em0 -D /home/mrourmon/web.pages &
This means there will be four ourmon processes as shown with top or ps.
In order to stop ourmon, modify the
stop_om
function as follows:
stop_om()
{
kill -TERM `cat /var/run/ourmon.pid`
echo -n ' ourmon'
}
The initial "master" process can be signaled with -TERM which will stop
all the processes. Note that killing one process will NOT stop the others (it will make a mess).
# killall ourmon
may work depending on the OS.
Network Management
In this section, we will look at filters that may be deemed to
be of general interest from the network monitoring point of view, as opposed
to the anomaly detection view. This is really a false distinction in some
ways, as very blatant attacks (for example the UDP slammer worm coming from
one local host) can cause many of the graphs in ourmon to indicate an anomaly.
pkts
The pkts filter (which shows one or two input interfaces, both
assumed to be in promiscuous mode) displays the number of total packets
caught and/or dropped by the kernel packet mechanism. This particular filter
should always be included and is typically not directly specified in the ourmon
configuration file. It should not be removed. Drops are nature's way
of telling you that you have overloaded the
ourmon system. How much the ourmon system can do is hard to say,
and may take some tinkering on the part of the administrator.
However if you are often losing 50% of your packets, that may be a sign
that you need to do less or buy a faster computer for the front-end probe.
Dropped packets can also occur during attacks. For example,
if you are hit with a distributed TCP syn attack, you may drop packets
depending on your front-end load. Small packets are a general problem
for all "sniffing" based systems as there is simply not enough time
between packets to do arbitrary amounts of computational processing.
pkts: caught:9440 : drops:0: caught2:0 : drop2:0
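A mon.lite line of this shape is easy to pick apart in a script. The sketch below is an illustration in Python and is not how omupdate.pl does it; the function name is an assumption, and it only handles lines made of name:count pairs like the one above.

```python
import re

def parse_counters(line):
    """Split a mon.lite line into (filter_tag, {name: count})."""
    tag, _, rest = line.partition(":")
    fields = {name: int(value)
              for name, value in re.findall(r"(\w+)\s*:\s*(\d+)", rest)}
    return tag.strip(), fields
```

A monitoring script could, for instance, raise a warning whenever the "drops" value returned here is non-zero for several sample periods in a row.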
fixed_ipproto
The fixed_ipproto filter is very simple. It simply
counts up TCP versus UDP versus ICMP bits, dumping any other IP protocol
into the "xtra" bin. It has a user-BPF friend that lives next door that
shows similar information in pkts/second as opposed to bits/sec.
Typical "mon.lite" output is as follows:
fixed_ipproto: tcp:67402888 : udp:30976940 : icmp:23158 : xtra:47623:
A sneaky technique is used in the mon.lite file in that in some cases
mon.lite counts bytes per period, and omupdate.pl converts bytes to bits per second.
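The conversion itself is simple arithmetic. The following sketch assumes a 30-second sample period and assumes that the counter in question is a byte count; the function name is hypothetical.

```python
def bits_per_second(byte_count, period_seconds=30):
    """Convert a per-period byte count into bits per second."""
    return byte_count * 8 / period_seconds
```

If the tcp value 67402888 above is a byte count, it corresponds to roughly 17.97 Mbits/sec.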
fixed_tcp3
The fixed_tcp3 filter accepts two TCP dst OR src ports. It then
displays bits/sec for traffic with src/dst port1 versus port2,
dumping everything else
into the "xtra" bin. Note that xtra is all other bits, not just
TCP (this may be a silly idea).
Ourmon (the front-end) takes a configuration
file. The entry for this filter might take the following form:
fixed_tcp3 119 80
What this means is that we are interested in capturing USENET NEWS (port 119)
and HTTP traffic (port 80). As supplied, this filter captures email (25) versus
web (80) traffic.
Typical "mon.lite" output is as follows:
fixed_tcp3: 119:39908731 : 80:11033941 : xtra:47506497:
fixed_cast
The fixed_cast filter is performed at the Ethernet header
layer, not the IP layer, if and only if the supplied link device
is an Ethernet device. It displays bits/sec.
It classifies packets as Ethernet multicast,
Ethernet unicast, and Ethernet broadcast, based on the Ethernet
destination address. Note that this filter is potentially useful
for observing possible broadcast storms, whatever the cause, or multicast
routing meltdowns. At this time, assuming Ethernet inputs,
"xtra" packets should be 0.
The configuration file looks as follows:
fixed_cast 127.0.0.0 255.0.0.0
This is because we require an IP net/mask pair, just in case
one of the input interfaces is not Ethernet-based (the localhost device).
These addresses are ignored if Ethernet is used, but they must
still be supplied (just leave them alone in the supplied configuration).
fixed_cast: mcast:191214 : ucast:98257955 : bcast:0 : xtra:1440:
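The classification depends only on the Ethernet destination address: the all-ones address is broadcast, and any other address with the low-order bit of the first octet set is multicast. A minimal Python sketch (the function name and the colon-separated text form are assumptions):

```python
def classify_dst_mac(mac):
    """Return "bcast", "mcast", or "ucast" for a destination MAC such as "01:00:5e:00:00:fb"."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected six octets")
    if octets == b"\xff" * 6:
        return "bcast"
    if octets[0] & 0x01:      # group bit set: Ethernet multicast
        return "mcast"
    return "ucast"
```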
fixed_size
The fixed_size filter is performed at the Ethernet header
layer, not the IP layer, if and only if the supplied BPF device
is an Ethernet device. It displays pkts/sec. This filter counts packets within
four fixed byte bucket sizes, where the packet is <= 100 bytes (tiny), <= 500 (small), <= 1000 (medium),
or <= 1500 bytes (big). These packets may include errors.
The configuration file looks as follows:
fixed_size
fixed_size: tiny:51732 : small:13795 : med:13544 : big:32876:
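The bucket boundaries given above translate directly into code. In this sketch the function name is an assumption, and packets larger than 1500 bytes return None because the text does not say where they are counted.

```python
def size_bucket(length):
    """Return the fixed_size bucket name for a packet length in bytes."""
    if length <= 100:
        return "tiny"
    if length <= 500:
        return "small"
    if length <= 1000:
        return "med"
    if length <= 1500:
        return "big"
    return None  # larger frames are not described in the text
```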
fixed_l2proto
The fixed_l2proto filter is performed at the Ethernet header
layer, not the IP layer, if and only if the supplied link device
is an Ethernet device. It displays pkts/sec.
This filter counts packets according to the L2 ethernet header
for IP protocols only, including IPv4, IPv6, and ARP. Other
packet types are considered as "xtra". Note that this filter
might show an ARP storm.
The configuration file looks as follows:
fixed_l2proto
fixed_l2proto: ip:647819 : ipv6:1 : arp:48 : xtra:19:
user designed RRDTOOL-based BPF graphs
The user-mode BPF filters are a powerful part of
ourmon and allow programmable back-end RRD-based graphs.
They allow the user to design his or her own RRDTOOL graphs.
For example, we might have the following BPF filter set
examples in our ourmon.conf
filter specification file:
bpf-packets
bpf "protopkts" "ip" "ip"
bpf-next "tcp" "tcp"
bpf-next "udp" "udp"
bpf-next "icmp" "icmp"
bpf-bytes
bpf "ports" "ssh" "tcp port 22"
bpf-next "p2p" "port 1214 or port 6881 or port 4662 or port 6346 or port 6257 or port 6699"
bpf-next "web" "tcp port 80 or tcp port 443"
bpf-next "ftp" "tcp port 20 or tcp port 21"
bpf-next "email" "tcp port 25"
bpf-noxtra
bpf-bytes
bpf-packets
These two commands cause an entire BPF filter set to
produce either bits/sec or pkts/sec.
bpf-next "tcp" "tcp"
means the label is "tcp" followed by the BPF expression which is also "tcp" in this
case. Let us look at a more complex example.
bpf-bytes
bpf "subnets1" "net1" "net 10.0.1.0/24"
bpf-next "net2" "net 10.0.2.0/24"
bpf-next "net3" "net 10.0.3.0/24"
bpf-noxtra
Or you can easily make up graphs that might watch a local server using expressions like:
bpf "hostA" "total" "host 10.0.0.1"
bpf-next "email" "host 10.0.0.1 and port 25"
bpf-next "web" "host 10.0.0.1 and (tcp port 80 or tcp port 443)"
bpf-noxtra
See the INSTALL file for more information on BPF graph customization.
CBPF BPF Optimization
cbpf "ctag" "graph-label" "line-label" "cfilter"
cbpf-next "ctag" "per-filter-label" "cfilter"
The "ctag" is used to signal to the configuration the syntactic form
of the CBPF filter. The "graph-label" and "line-labels" have the
same function as with the pure BPF filter set syntax. However "ctag"
and "cfilter" are totally different from BPF. Here are usage examples
for the existing set of CBPF tags shown with equivalent BPF
expressions:
1. network tag ("net")
cbpf "net" "subnets" "subnet8" "10.0.8.0/24"
bpf "subnets" "subnet8" "net 10.0.8.0/24"
cbpf-next "net" "subnet8" "10.0.8.0/24"
bpf-next "subnet8" "net 10.0.8.0/24"
2. tcp port range ("tcprange")
cbpf-next "tcprange" "bittorrent" "6881 6889"
bpf-next "bittorrent" "tcp and (port 6881 or port 6882 etc.)"
3. udp port range ("udprange")
cbpf-next "udprange" "someports" "6881 6889"
bpf-next "someports" "udp and (port 6881 or port 6882 etc.)"
4. udp and tcp mixed single ports ("ports")
cbpf-next "ports" "edonkey" "u 4665 u 4672 t 4665 t 4661 t 4662"
bpf-next "edonkey" "(udp and port 4665 or udp port 4672) or (tcp and (port 4665 or port 4661 or port 4662))"
5. tcp control flags ("tcpflag")
cbpf-next "tcpflag" "syn" "s"
cbpf-next "tcpflag" "fin" "f"
cbpf-next "tcpflag" "rst" "r"
bpf-next "fin" "tcp[tcpflags] & tcp-fin != 0"
bpf "tcpcontrol" "rst" "tcp[tcpflags] & tcp-rst != 0"
6. ping ("ping")
cbpf-next "ping" "pingline"
bpf-next "bpfping" "(icmp[icmptype] == icmp-echoreply) || (icmp[icmptype] == icmp-echo)"
7. icmp major and minor code ("icmp")
cbpf-next "icmp" "ccportun" "3 3"
bpf-next "bpfportun" "icmp[icmptype] == icmp-unreach && icmp[icmpcode] == 3"
8. ip next protocol ("ipproto")
cbpf-next "ipproto" "tcp" "tcp"
bpf-next "tcp" "tcp"
cbpf-next "ipproto" "esp" "50"
bpf-next "esp" "ip proto 50"
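To see what the "etc." in the range examples stands for, the following Python sketch spells out the BPF expression that corresponds to a port range. It only illustrates the equivalence shown above and says nothing about how ourmon evaluates CBPF filters internally; the function name is an assumption.

```python
def range_to_bpf(proto, first, last):
    """Write out the BPF expression equivalent to a tcprange/udprange entry."""
    ports = " or ".join(f"port {p}" for p in range(first, last + 1))
    return f"{proto} and ({ports})"
```

For the bittorrent example, range_to_bpf("tcp", 6881, 6889) yields an expression with nine port terms, which gives some feel for why the short CBPF form is convenient.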
topn flow graphs
The following links show the outputs for the various topn flow graphs:
topn_ip
topn_tcp
topn_udp
topn_icmp
topn_ip_pkts
The topn_ip filter provides a traditional top N flow
point of view for IP (any IP flow), TCP, UDP, ICMP flows, and all flows
together in terms of top packet count.
It shows the top N flows for IP/TCP/UDP/ICMP in bits/sec. An IP flow is defined
as a 5 tuple having this form: (IP src, IP dst, next IP protocol, L4 src port,
L4 dst port). TCP and UDP flows of course do not have the next IP protocol field.
ICMP flows display major and minor ICMP codes as opposed to L4 ports.
The ICMP major value is displayed as the L4 "source port", that is,
it is put on the left-hand side of the flow.
The ICMP minor value is displayed as the L4 "destination port" on the right-hand
side of the flow. The top packets flow is of course shown in pkts/sec.
topn_ip : 6954 : 131.252.208.43.65529->131.252.120.170.119(tcp): 18320510 :
128.223.220.30.40165->131.252.208.43.119(tcp): ETC ...
topn_tcp : 5596 : 131.252.208.43.65529->131.252.120.170.119: 18320510 : ETC...
topn_udp : 1257 : 209.70.46.6.27968->131.252.77.153.6974: 269300 : ETC ...
topn_icmp: 2: 131.252.3.1.0->131.252.2.1.0: 5234: 131.252.2.1.8->131.252.3.1.0: 5234: ETC ...
topn_ip_pkts : 61109 : 38.99.15.80.80->131.252.77.126.4496(tcp): ETC ...
Note that the number following the topn_ip tag value above is the count of
distinct IP flows seen during the sample period. This is not the same as the top N flows
shown as tuples in the mon.lite file. It is a count of the unique flows seen during
the sample period, all of which have been stored in the hash list itself. But of course,
not all of them are printed out if the number of flows exceeds the top N value
supplied in the config file.
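The difference between the flow count and the top N list can be sketched as follows. This is an illustration in Python, not the probe's hash list; the function name and the flow tuple layout are assumptions.

```python
from collections import Counter

def topn_flows(packets, n):
    """packets: iterable of (flow_tuple, byte_count) for one sample period.
    Returns (number of distinct flows, top n flows by byte count).
    """
    flows = Counter()
    for flow, nbytes in packets:
        flows[flow] += nbytes     # every distinct flow is stored
    return len(flows), flows.most_common(n)   # only the top n are printed
```

The first value corresponds to the number that follows the topn_ip tag, and the second to the tuples written after it.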
The IP/TCP/UDP/ICMP flow count itself is currently
graphed in an RRDTOOL-style graph as it is very useful for anomaly detection.
See below in the anomaly detection section for more information on that graph.
topn ports graph
The following links show the output for the topn ports graph:
topn_tcp_port
topn_udp_port
The topn_port
filter displays the top N ports used in TCP and UDP
flows. The top N ports are sorted by packet byte count and expressed in bits/sec.
The topn_tcp_port graph has the following format in its graphs:
port, bits/sec, L4 src_count,L4 dst_count, application flags
The top port is displayed followed by its bits/sec as a total port.
These in turn are followed by L4 src_count and dst_count in the "legend" or top part of the label on the graph.
The port value may be either a src or destination L4 port. We do not distinguish.
Effectively the port is the key, and the bit count is used for sorting.
The src_count/dst_count denote how many times the L4 port was a
source/destination *port*. Thus it may be possible to determine that a particular
port is only being used as a destination (or source) port.
Also the src and dst counters are packet counters, not byte counters.
The entry for this filter in the configuration file may be as follows:
topn_port 60
This indicates how many top port tuples should be written to the mon.lite file.
mon.lite output is as follows:
tcp_ports: 8472 : 80:56316509:49409:39209 : 6881:20155127:13459:13166 : ETC...
udp_ports: 2237 : 49156:3834617:3693:2758 : 49302:3834617:2758:3693 : ETC...
The number value following the filter tag in the mon.lite output (tcp_ports : number : ...)
represents the number of distinct port tuples seen during the sample period.
N (e.g., 20) 4-tuples follow with the tuple format: (port, byte count,
L4 src port packet count, L4 dst port packet count). omupdate.pl rearranges this information
into a tuple more suitable for display.
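How such port tuples might be accumulated is sketched below in Python. The function name and input layout are assumptions, as is the choice to count a packet's bytes only once when its source and destination ports are the same; the probe itself may behave differently.

```python
from collections import defaultdict

def port_tuples(packets, n):
    """packets: iterable of (src_port, dst_port, byte_count).
    Returns (distinct port count, top n (port, bytes, src_pkts, dst_pkts) tuples).
    """
    stats = defaultdict(lambda: [0, 0, 0])   # port -> [bytes, src_pkts, dst_pkts]
    for sport, dport, nbytes in packets:
        for port in {sport, dport}:          # the port is the key, src or dst alike
            stats[port][0] += nbytes
        stats[sport][1] += 1                 # packet counters, not byte counters
        stats[dport][2] += 1
    ranked = sorted(stats.items(), key=lambda item: item[1][0], reverse=True)
    return len(stats), [(port, *counts) for port, counts in ranked[:n]]
```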
Anomaly Detection
Ourmon provides a fair number of anomaly detection filters using
supplied BPF filters, top N filters, and a few carefully chosen meta-data
RRDTOOL-based graphs. We will discuss the top N filters here first,
including some interesting features, and then go on to the supplied BPF
filters. It should be pointed out that in general the default supplied BPF filters
look at the big picture. For example, the TCP control BPF filter set
shows the overall number of TCP SYNS, FINS, and RESETS in a network.
These may be customized to particular networks if desired.
On the other hand, the top N TCP syn filter shows the top IP hosts
sending out TCP SYNS. The former gives you a network-wide picture.
The latter helps show individual hosts that may be taking part
in an attack (or running gnutella).
TCP SYNS sent - TCP FINS returned > 30.
This mechanism seems to do a good
job of showing distributed zombie attacks simply because the number
of hosts matching the SYNS - FINS metric increases over some normal
bound by a significant number. The "sent" metric can be viewed as a low-pass
filter that leaves out "normal" applications that typically produce
roughly equivalent amounts of SYNS and FINS in a sample period. We
have also done some normalization experiments and can state that
unusual numbers of TCP SYNS aimed at TCP ports 445 and 139 do not
occur with normal Microsoft file sharing.
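The metric is easy to express in code. The sketch below is a Python illustration only; the function name and the per-host counter layout are assumptions, and the threshold of 30 is the one given above.

```python
def syn_heavy_hosts(counters, threshold=30):
    """counters: {ip: (syns_sent, fins_returned)} for one sample period.
    Returns the hosts for which SYNS sent - FINS returned exceeds the threshold.
    """
    return sorted(ip for ip, (syns, fins) in counters.items()
                  if syns - fins > threshold)
```

The length of the returned list is the kind of value one would expect to jump during a distributed zombie attack.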
TCP Syn List Overview
The following links provide outputs for information derived from the topn tcp syn list.
Note that some of the links are for 30-second "now" versions and some are for daily
summarizations.
now - topn_syn histograms
RRDTOOL (now and baselined) tworm graphs
now - the TCP port signature report
daily summary of TCP port report
- the TCP port signature report
daily summary of TCP port report - hosts with work weight >= 40
daily summary of TCP port report - hosts mentioning port 445
now - detailed TCP port signature/tcpworm.txt report (debug)
big SYNNERS - daily summarized TCP port signature report
the TCP port signature new scanner log
the p2p application version of the port report
daily summarized version of p2p port report
now - the syndump version of the port report
daily summarized version of syndump port report
now - the email version of the port report
daily summarized version of email port report
1. the topn_syn list,
2. the tworm RRDTOOL graph,
3. the detailed tcpworm.txt output report,
4. the port signature report (which is a condensed parallel view of the previous item).
5. the ourmon scanner log which records new instances of port signatures in a day.
6. the port signature hourly summary batch reports.
7. the p2p port signature report which shows hosts using major P2P applications
like bittorrent.
8. the syndump port signature report which shows all local hosts.
9. the email port signature report which shows all hosts using an email port.
Email ports of interest are ports 25, 587, 465. See below for more information.
10. the potdump port signature report which shows hosts that
have written to either the P "honeypot" network and/or D "darknet" network.
(SYNS sent + FINS sent + RESETS returned) / total 2-way packet count.
Typical benign hosts score low work weights of 0%. A work weight of 100% means all control and no data and may
be deemed truly "anomalous". Significantly anomalous hosts may have a work weight between 50..100%.
Intuitively we are measuring the amount of TCP control packets versus the total number of packets.
If there are large amounts of data the work weight will be low. Obviously a SYN
scanner will score a 100% work weight. Of course an anomaly may represent a client that has no server.
Or it may represent a badly coded client or poorly performing TCP application. For example it is not unusual to
spot hosts using Gnutella with a high work weight because the Gnutella application is failing to
find Gnutella peers.
However in many months, we have seen only a handful of cases of such anomalies that were not worms,
and 1000s of cases that were worms. More details on the work weight are given below
when we talk about the port signature report.
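As a rough illustration of the work weight, the sketch below computes control packets over total packets as a percentage. The function and parameter names are ours, and the clamp at 100 and the handling of a zero packet count are assumptions, not something taken from the ourmon source.

```python
def tcp_work_weight(syns_sent, fins_sent, resets_returned, total_packets):
    """Return the TCP work weight as a percentage in the range 0..100."""
    if total_packets <= 0:
        return 0.0  # assumption: no traffic means no weight
    control = syns_sent + fins_sent + resets_returned
    # Assumption: clamp at 100 so the value stays a percentage.
    return min(100.0, 100.0 * control / total_packets)

# A SYN scanner: every packet is a control packet.
print(tcp_work_weight(100, 0, 0, 100))
# A bulk transfer: a few control packets among many data packets.
print(tcp_work_weight(5, 5, 0, 1000))
```

The first call gives 100.0 and the second gives 1.0, which matches the intuition that large amounts of data drive the work weight down.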
TCP Syn list ourmon.conf configuration
topn_syn 60
topn_syn_wormfile /home/mrourmon/tmp
topn_syn_homeip 10.0.0.0/8
topn_syn_p2pfile /home/mrourmon/tmp
topn_syn_syndump /home/mrourmon/tmp
topn_syn_emaildump /home/mrourmon/tmp
honeynet 10.1.2.0/24
topn_syn_potdump /home/mrourmon/tmp
darknet 10.1.2.0/24
tworm_weight 30
topn_syn_potdump
causes the creation of a special version of the TCP syn report
that only shows those IP hosts that have written into the honeynet
(and/or darknet). This is on by default but dependent on
what you do with the honeynet address above. Event log messages
may be logged if there is a HOME address supplied. Note
that the event log messages are only logged for HOME addresses,
and not for remote addresses as someone is always scanning you.
darknet
may be used in addition to the honeynet tag to supply
the system with an additional tag. The D tag is used. P and D
tags are both shown in TCP port signature reports and in the special
filtered version called the "potdump" which only shows IP hosts
writing to P and/or D marked addresses. Note that the associated
address can be a net or host as it uses a subnet mask.
TCP syn list outputs -
topn_syn graph
ip: 10.190.133.152, per period: s:296, f:1, r:0, total: 296, ww: 100%, apps: P
TCP syn list outputs -
The tworm graph
The tworm graph attempts to capture the number of "wormy" hosts according to hosts
put in tcpworm.txt at one time. It is graphing the number of hosts that appear
in the tcpworm.txt file (which is the same as the number of lines in the file).
Hosts placed in this file are deemed to be "noisy" in that for some reason
they generate more TCP SYNS than TCP FINS.
Spikes in this curve may correspond to automated distributed
bot attacks which may be performing a DOS attack or simply scanning for exploits in parallel.
By default the tworm count information counts all IP sources appearing in the tcpworm.txt
file and classifies them as to whether or not they appear to be from the internal network,
or from an external network. (If internal versus external doesn't make sense, best to do something
like make the internal network 10.0.0.0 with netmask 255.0.0.0, thus making all IPs external).
The ourmon.conf config variable
tworm_weight 80
may optionally be used to filter the tworm count by the work metric. Thus one can approximate
the number of "real" worms as opposed to noisy P2P hosts or noisy web servers.
tworm: 9: 3: 6:
This tuple is placed in the mon.lite file and processed by omupdate.pl in the backend. This produces
the tworm RRDTOOL graph. The three numbers in turn represent:
1. a total count of "worms" (by default
the count here is the number of IP hosts found in the tcpworm.txt file).
2. the total count of systems in the home network that appear in the tcpworm.txt file.
3. the total count of external systems not in the home subnet.
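For readers who want to post-process the tworm tuple themselves, here is a minimal parsing sketch. It assumes the line looks exactly like the sample above (tag, then three colon-terminated integers); the function name and the dictionary keys are hypothetical.

```python
def parse_tworm(line):
    """Split a 'tworm: total: us: them:' line into three integer counts."""
    tag, *fields = [part.strip() for part in line.split(":")]
    if tag != "tworm":
        raise ValueError("not a tworm line: %r" % line)
    # Drop the empty field left behind by the trailing colon.
    total, home, external = (int(f) for f in fields if f)
    return {"total": total, "home": home, "external": external}

counts = parse_tworm("tworm: 9: 3: 6:")
print(counts)
```

In the sample, the home count (3) and the external count (6) add up to the total (9).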
We have observed external attacks in the 1000s made on one campus. These attacks are real
and this mechanism is very useful. It is fair to view this graph as a botnet attack detector.
TCP syn list outputs -
the port signature report
The ourmon system presents two processed versions of the front-end tcpworm.txt
file called "tcpworm.txt" and "portreport.txt" respectively. Here we will focus only
on the TCP port report or port signature report as it may also be called.
(The more verbose tcpworm.txt report actually includes the portreport.txt information as well,
but for reasons of quick web lookup, the port report is broken out into a separate report
file.) Both files are updated every thirty seconds but as they are ASCII outputs,
you must hit "reload" yourself on your web client.
ip src: flags apps: work: SA/S: L3D/L4D: L4S/src: ip dst snt/rcv sdst/total port signature
10.82.196.58 (WOM) 100: 0: 423/1 10/2314 192.1.2.3 107/0 107/107 [5554,40][9898,59])
192.168.245.29 (O) B 6: 67: 622/438 10/6345 10.10.10.10 277/343 277/411 [6880,0][6881,42][6883,6][... more]
E - An anomalous amount of ICMP errors are being returned to the IP source.
W - The work weight is greater than or equal to 90%. A W means the IP source
is highly suspicious.
w - The work weight is greater than or equal to 50% but less than 90%.
O - very few fins are being returned. O stands for "output" (or "ouch"?).
R - TCP resets are being returned.
M - no TCP packets are being returned to the IP src in question from receivers.
So for example, M here means there is no 2-way exchange of data.
B - a Bittorrent application was detected.
G - A Gnutella application was detected.
K - a Kazaa application was detected.
M - a Limewire/Morpheus application was detected.
I - an IRC application was detected.
e - an Edonkey/emule application was detected. This flag is prone
to false positives. The others are not as prone (understatement).
E - is used to indicate that packets are sent to destination port 25 and related ports (see below).
The goal
here is to alert you to a possible spammer or a benign email server.
H - packets are being sent from port 80 or port 443. The sender could
be a noisy web server or nmap for that matter.
P - if configured in (see honeynet below), a P is shown to indicate that
the IP source in question was sending packets to the configured darknet.
This is useful for catching scanners. Of these tags, this one is also
available in the UDP port report.
s - is reserved for the UDP port report and means that an all too common "SPIM"
packet was seen.
It should be noted that in general the applications flag field does not
usually find false positives, although the Edonkey flag is more prone to false positives
than the other fields. Also note that the E for email and H for web server flags
are only based on the existence of port 25, port 587, and port 465 as a TCP port for email,
and ports 80 and 443 as destination ports for web servers. An attacker may very well be scanning
on those ports. The use of E and H based on TCP ports compared to B and G is a mixed metaphor, but we felt
it was useful to have a mechanism that would give us possible application clues and that
clues about email and web are important indeed.
[5554,40][9898,59])
For example [5554,40] means destination port 5554 was being scanned. The second field in
the port 2-tuple gives the overall percent or frequency of packets for that port in the total number
of sampled ports. In this case 40% of the total port count of packets were aimed at port 5554,
and 59% were aimed at port 9898. (For the actual numbers look in tcpworm.txt).
It should be noted that this is only a sample. It is not unusual to
see all 10 port tuples "full" of something that seems to be evenly distributed at 10%.
Such occurrences are often due to web-based "noisy" client/server relationships and may be
benign. However in some cases this may represent a remote scanner that is simply walking the port space.
Scanners may be spotted by looking in the scanning log as their port signatures will be "new"
and will change over time. (Of course they may also show up in the top N scanner part of
ourmon as well).
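The port signature field is easy to pick apart with a regular expression. The sketch below is our own helper, not part of ourmon; it assumes each tuple has the form [port,percent] as in the samples above and silently skips anything else, such as a trailing "[... more]".

```python
import re

# One [port,percent] tuple, for example [5554,40].
TUPLE_RE = re.compile(r"\[(\d+),\s*(\d+)\]")

def parse_port_signature(text):
    """Return a list of (destination port, percent) pairs found in text."""
    return [(int(port), int(pct)) for port, pct in TUPLE_RE.findall(text)]

sig = parse_port_signature("[5554,40][9898,59]")
print(sig)
```

The result is a list of pairs, so the dominant destination port can be found with max(sig, key=lambda pair: pair[1]).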
135-139, 445, 1433, 1434
In typical use one does not see large discrepancies in ordinary
use of the Microsoft File Share distributed file system or with the use of a SQL server.
The work weight here simply does not matter (and high is still bad!), and in fact with port 1433,
a low work weight is a bad sign because it means hosts at your site may be responding at Layer 7
to attacks. Ports 80 and 443 of course may or may not represent attacks on web servers.
A low work weight with port 80 might mean that attacks are being launched at a web server
and it is returning failure messages at Layer 7. It might also mean that a web server
is sending lots of small connections back to a web client and in that case is benign.
Certain other port combinations represent well-known viruses; (5554, 9898) is an example.
If you see new port signatures, it can be useful to search google for suspicious port combinations
(as well as the dshield site and web pages offered up by various anti-virus vendors).
daily summary of TCP port report
- summarized daily TCP port signature report
The TCP port report in its hourly summarization form is sorted by
instance count (the number of 30-second samples seen).
Other forms of the port report are sorted by packet count.
Web summarizations in general are available at the end of
the main page (roughly one week's worth). The daily summary
is rolled over approximately at midnight to the previous day.
The daily page is at the left and the previous day is moved to
the right. For example, yesterday is next to today (to the right),
etc.
ip src        flags  apps  ww:min/avg/max  sa/s  l3d/l4d    syn/fin/rst  tsent/trecv
10.1.200.176         B     ( 1: 5: 10:)    0:    (140/116)  (57:14:0)    (10130:8012)
dns
dns: host-200-176.pubnet.pdx.edu
instance count  start time                     stop time
963             Fri_Sep_22_00:00:09_PDT_2006:  Fri_Sep_22_08:01:03_PDT_2006:
porttuples[count]:[port, pkt count] ...
portuples[10]: [6881, 2057931][6346, 466092] ...
line 1 - has the IP address, an OR'ed version of the flags field,
an OR'ed version of the application flags field,
the TCP work weight as a 3-tuple giving minimum, average
across all instances, and maximum values, average SA/S,
average L3D/L4D, average SYNS/FINS/RESETS, and average
TCP pkts sent and received. Averages are across all
instances.
line 2 - has the resolved DNS address (if available).
line 3 - has the instance count (number of 30-second TCP port reports),
the first timestamp for the IP address (for the first port report
in which it appears), and the last timestamp (for the last seen
port report) during the day for
the IP address.
line 4 - contains a summarization for all porttuples seen. However in
this case the port samples are sorted according to the max
packets seen across all instances sent to the port in question.
For example, port 6881 above had 2057931 packets sent to it
from the host in question.
TCP syn list outputs -
the tcpworm report
The tcpworm report is essentially an expanded "translated" version of the raw tcpworm.txt file.
It is sorted by the syn list; that is, IP sources with more syns will appear before
IP sources with fewer syns. We may divide it up into three parts: 1. per IP source statistics,
which include all the information the front-end has gathered about a specific IP source, 2. various
summary statistics, and 3. the port signature report, which is also included in here. The port signature
report is included so that one can use a text editor and simply pattern-match back and forth
to learn more details about IP sources in the port signature report deemed of interest.
TCP syn list outputs -
the p2p report
The p2p report is a separate front-end
output and is another topn syn feature. It is functionally all syn
list IP src-based tuples that have set the hardwired application flag field
for one of the hardwired p2p tags (see list above, Bittorrent, etc).
These hosts must be local as well. (Otherwise the fanout for non-local hosts
could be potentially deadly to ourmon).
In other words, it
is the set of local IP hosts seen during the sample period that initiated
P2P exchanges for Bittorrent, Kazaa, Gnutella, Morpheus/Limewire.
This set of P2P hosts does not yet include PCRE tags.
For example if the user adds a PCRE tag for a new P2P protocol,
and only the PCRE tag is matched, the host in question is not
yet added to the P2P host set (this needs to be fixed).
Only the older built-in tags cause set inclusion at this time.
However if a set member has a PCRE tag set, it will show it
in the various forms of the port report but it will not be added to
this port report due to a PCRE tag.
This report includes any host that sent or received IRC messages including
JOIN, PRIVMSG, PING, or PONG. See the previous
portreport section or the
PCRE section
for more format information. This file uses the same
format as the TCP port report,
but is an entirely different subset of IP sources taken from the entire set of
hosts sending TCP SYN packets. It is sorted by total packet counts.
Note that there is no port signature database
associated with this file.
TCP syn list outputs -
the syndump port signature report
The topn_syn tuple may be large in terms of the set of IP addresses
collected as individual IP source address tuples. The TCP port signature
report roughly only includes those tuples where there are a sufficiency
of unFIN'ed SYNS and the TCP work weight is positive. The P2P
tuple includes those hosts which have set one of the built-in hardwired P2P
(non PCRE) tags (B for Bittorrent, etc). The syndump report provides more
information and an even bigger set of local TCP hosts.
The syndump in general dumps all
"home" IP hosts that have done something non-trivial. For example
it would include all local email servers, local web servers, etc.
It does
not include non-local hosts due to P2P fanout (that would increase
the report size by two to three orders of magnitude).
TCP syn list outputs -
the email syn port signature report
A new and special form of the TCP port report now exists that focuses on
hosts using email ports including port 25, 587 (submission) and 465 (smtps).
This version of the port report focuses on hosts doing email (and probably
acting as email servers). Both 30-second and hourly summarizations exist.
All email syn reports are sorted by email syn counts. Thus they can be viewed
as top N email syn reports. They also include a special application TCP work
weight that is calculated for the email port TCP packets only.
For example the email version of the 30-second TCP port report
adds the fields
esyn/eww
as a column. This means the count of EMAIL syns seen during the
sample period and the email specific TCP work weight (eww). The "eww"
field may not be the same as the TCP work weight because the former
applies only to packets having one of the 3 port addresses for email
and the latter applies to all packets seen.
email: syns: 4147, synavg: 4, wwavg: 67
This is telling us that the system in question sent 4k syns total so far during the day,
and that it averaged 4 syns per sample period. Its email-specific work weight average was 67%.
The daily email dump summarization may be found here:
daily summarized version of email port report
TCP syn list outputs -
honeynet tag feature
The following config tag may be placed in the ourmon.conf file
in order to turn on a P tag in both topn_syn reports and
the UDP port signature report. P stands for "POT", but really
stands for a darknet (empty net) that should by definition have no hosts on it.
(It could run a honeynet though).
honeynet network/netmask
For example, if one had the following /24 subnet free:
honeynet 10.0.8.0/24
this will cause packets sent to the "honeynet" (or darknet) to
be flagged with a P in the apps column in various ASCII reports such
as the tcp port signature report. P can be taken to mean that the
IP source in question is scanning. This can be useful for distinguishing
between some P2P using hosts and true "worms". For example, Gnutella
clients may sometimes scan (of course use of a P2P application like
gnutella might violate your local security policy). Of course the host
in question may simply be doing a scan. Note that the darknet
is best if unpopulated with real hosts.
This is a very useful feature and you are well advised to make use of it.
The config tag
topn_syn_potdump directory
causes the creation of a special TCP port report
that only includes P or D references.
This special TCP port signature report will cause event log
messages at 30-second periods if there are writes
to the specific subnet/s. Event log entries are only done for home addresses
and not done for away addresses as someone is always scanning you.
See the system event log section and synlistconf for more information.
TCP syn list outputs -
darknet tag feature
In addition to the P for POT tag,
the following config tag may be placed in the ourmon.conf file
in order to turn on a D tag in both topn_syn reports and
the UDP port signature report.
darknet network/netmask
D stands for "DARKNET". Any IP address that writes to
the darknet net is marked with a D tag.
This network may be a different network or subset of the P "POTNET"
network.
Note that either the darknet or the so-called potnet may be used
for watching a specific subnet more closely than other networks.
The config tag
topn_syn_potdump directory
causes the creation of a special TCP port report
that only includes P or D references. See
synlistconf
for more information.
TCP syn list outputs -
the TCP port signature new scanner log
The TCP scanner log is a daily log that provides
new port signatures -- where new means a TCP port signature that has not been observed
since midnight of the current day.
New port signatures are stored in a db database that is turned over every night at midnight.
Port signatures are put in the scanning log with the following form:
Tue Nov 2 08:00:29 PST 2004: new worm signature from:192.168.186.238 [5554][9898][16881]
A TCP scanner seen scanning over a long period may have multiple entries in the log.
Note that this log is potentially updated every 30 seconds.
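Scanner log entries are regular enough to parse with a short script. The sketch below assumes every entry follows the sample form shown above; the regular expression and the function name are ours and are not taken from the ourmon distribution.

```python
import re

LOG_RE = re.compile(
    r"^(?P<when>.+?): new worm signature from:"
    r"(?P<ip>[\d.]+)\s+(?P<ports>(?:\[\d+\])+)\s*$"
)

def parse_scanner_entry(line):
    """Return (timestamp text, source IP, list of ports), or None if no match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    ports = [int(p) for p in re.findall(r"\[(\d+)\]", m.group("ports"))]
    return m.group("when"), m.group("ip"), ports

entry = ("Tue Nov 2 08:00:29 PST 2004: new worm signature "
         "from:192.168.186.238 [5554][9898][16881]")
print(parse_scanner_entry(entry))
```

Grouping the parsed entries by source IP gives a quick view of scanners whose signatures change during the day.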
TCP syn list outputs -
batch port signature reports
The ourmon back-end generates an hourly report that is a SYN-focused
summarization of the TCP port report.
This report is generated by the tool ombatchsyn.pl which sorts IPs by max syn count.
By default it filters out IPs according to a hardwired variable (-W 100) so that
only those systems having a work weight of 100 are counted in. It also only
lists the top 1000 of such systems (-c 1000). Essentially this gives you a list
of potentially grievous offenders in terms of systems having high work weights and sending
out lots of SYNs.
The goal is to leave out sporadic weights (of say email servers) where once in a while an IP source
will have a high work weight, but on average does not have a high work weight.
If you wish to tune this weight modify the back-end bin/batchip.sh script where it
calls ombatchsyn.pl to make this log. As this is a batch report, reverse DNS lookup is
performed. The output filename is "allworm_today.txt".
icmp and udp error list
The following links show various outputs associated with the topn icmperror list:
topn_icmperror
topn_udperror
udp port signature report
udpweight graph
ourmon system event log
The icmperror_list top n mechanism is specified in
the ourmon.conf file as follows:
topn_icmperror 60
This produces a maximum of 60 horizontal label/bar combinations
as with the other top N filters.
Note that this config
command turns on both the ICMP histogram and the UDP error histogram
as well as UDP weight mechanisms.
Both sets of secondary html files are dynamically created by omupdate.pl.
icmperror_list: 19579 : 10.22.17.77:940:2728:106:468:2:0:468:0: ETC.
udperror_list: 19579 : 10.17.226.58:1605867:3852:1:208:0:2544:1:0:40000:0:1:1026,3852,:
As usual, the total number of list entries in the hash list for the 30-second
sample follows the tag.
Refer to the source code in src/ourmon for an explanation of the tuples.
ip: 131.252.177.77, icmps/period: 940, Tcp: 2728, Udp: 106, Pings: 468, Unr: 2, Red: 0, ttlx: 468, flags:
First the IP address is given, followed by the total ICMP packets in the sample period,
followed by TCP, UDP, and PING packet counts, followed by counts for destination unreachable, redirect,
and TTL exceeded ICMP errors. Flags are not implemented for ICMP at this time.
topn_udperror and udp port signature report
The UDP port signature report comes in two flavors. As the UDP port signature
information is fundamentally a top N list we include a histogram-based graphic,
and in addition an ASCII cousin which is similar to the TCP port report. The former
may be called the "topn_udperror list" (graphic) and the latter is simply
called the "UDP port report". The UDP port report layout is similar to the TCP report,
but it is sorted by the UDP work weight, not IP address. However the UDP port signature field
of the ASCII report is otherwise similar to the TCP port signature field.
UDP pkts sent * ICMP unreachable errors + UDP packets received.
The function
is quadratic and unbounded unlike the TCP work weight
and we have seen instances of this value above
one billion (a DOS attack). ICMP errors consist of ICMP unreachables, redirects, and TTL exceeded pkts
sent back to the IP source in question. ICMP (port) unreachables weigh slightly more and will raise the weight a great deal.
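To see why the UDP work weight grows so quickly, here is a minimal sketch of the formula exactly as stated above. The function name is ours, and we treat the ICMP errors as a single combined count; since the text says unreachables weigh slightly more, do not expect this to reproduce every value in a real report.

```python
def udp_work_weight(udp_sent, icmp_errors, udp_received):
    """UDP pkts sent * ICMP errors + UDP packets received, as stated above."""
    return udp_sent * icmp_errors + udp_received

# Illustrative inputs: 138 packets sent, 28 ICMP errors, nothing received.
print(udp_work_weight(138, 28, 0))   # 3864
# Doubling both the packets sent and the errors quadruples the product.
print(udp_work_weight(276, 56, 0))   # 15456
```

Because the two largest terms are multiplied, a scanner that doubles its rate and draws twice the errors sees its weight grow fourfold, which is the quadratic behavior mentioned above.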
ip: 10.47.186.58, weight/period: 1605867, Snd: 3852, Rcv: 1, Unr: 208, PNG: 0, L3D/L4D: 2544/1, flags: s
The IP address is followed by the UDP work weight, UDP packets sent and received by the host, ICMP unreachables
sent to the host, PING packets sent by the host, unique L3 IP destination and L4 UDP destination counts and L4 src counts (here 10 means MANY as this
counter cannot go beyond 10 - this is unusual for UDP sockets),
a histogram that shows sent packet sizes in terms of the layer 7
byte payload counts, a running average for
UDP packets sent by the host (sa), and UDP packets sent back
to the host (ra) given as the average of the L7 UDP payload size,
and application flags. The latter currently has
two hardwired flags (s/P) and can use PCRE tags. If the darknet
or honeynet features are used, D, or P flags will appear in
the appflags field. Note that PCRE tags for bittorrent/dht,
and gnutella are supplied in the default ourmon config file.
The packet size histogram is based on the size of L7 UDP payload
packets. There are six buckets shown ranging on the left from
smallest to largest. The bucket divisions are currently set at
less than or equal to these values: 40, 90, 200, 500, 1000, mtu.
The sa and ra averages are a horrible hack and may overflow
under high speed conditions. However they are intended to
capture small packet sizes and will work under most normal
conditions. Note they may be especially small with DOS attacks
as large packets do not make for a good DOS attack. Note
that the "guess" field is under development at this time,
and may appear in a future version of ourmon. "NA" means
not available.
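The packet size histogram can be sketched as follows. The bucket bounds come from the text above; the helper name is hypothetical, and we assume that the six numbers in the report are truncated integer percentages, which is a guess based on the sample rows rather than something stated in the source.

```python
from bisect import bisect_left

# Upper bounds (inclusive) of the first five buckets, from the text above.
# Anything larger falls in the last bucket, which runs up to the MTU.
BOUNDS = [40, 90, 200, 500, 1000]

def size_histogram(payload_sizes):
    """Return six bucket values, smallest bucket first."""
    counts = [0] * (len(BOUNDS) + 1)
    for size in payload_sizes:
        counts[bisect_left(BOUNDS, size)] += 1
    total = sum(counts)
    if total == 0:
        return counts
    # Assumption: each bucket is reported as a truncated integer percentage.
    return [100 * c // total for c in counts]

print("/".join(str(p) for p in size_histogram([68, 68, 70, 90])))
```

With four payloads that all fall between 41 and 90 bytes, this prints 0/100/0/0/0/0.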
ip src: weight: udp_sent: udp_recv: unreach: ping: L3D/L4D/L4S: sizes: sa/ra: appflags port_count: port signature
10.10.10.10 12149968 6564 4 1736 0 446/2/7 0/0/0/100/0/0 214/67 s 2: [1026,51][1027,48]
10.2.196.58 3864 138 0: 28 1 10/1/1 0/100/0/0/0/0 68/0 P 1: [137,100])
192.168.1.2 2442 2116 326 0 0 808/649/2 7/49/23/19/0/0 103/122 10: [53,98][3122,0][etc.]
udpweight graph
Every thirty seconds omupdate.pl takes the first IP
address found in the UDP port report and graphs
its UDP work weight in this RRDTOOL-based graph.
This allows you to get some feeling for the magnitude of possible
attacks or scans and also helps to pin down a rough time of an attack.
Thu Sep 28 02:13:33 PDT 2006: udpweight threshold exceeded:10.16.208.23 30828784 22904 0 605 0 4270/2 Ps 2: [1026,50][1027,49]
This system is sending us SPIM (spam for Internet Messaging services). The work weight
is around 31 million. It sent us 22K UDP messages in 30 seconds and got back 605 ICMP
unreachables. Ports 1026 and 1027 were attacked. See the
udp port signature report above
for information about the format of this event log message (the format is the same).
UDPWEIGHT=10000000
Note that it is possible to have an automated packet capture trigger that will capture
these packets. See the
trigger sections and the
event log sections for more information
on this entire subject.
topn scans
The following links show output associated with the topn scan lists:
ip_scans
ip_portscan
tcp_portscan
udp_portscan
The topn_scans filter
(when producing graph ip_scan above)
looks at packets from an IP source and
counts the unique IP destination addresses
that source sends to
during the sample period. Therefore we can say it is 1-N in terms of IP source
to IP destination mapping. We sort on the maximum unique IP destinations.
topn_scans 20
topn_port_scans 20
The number supplied should vary from 10..100 by values of 10.
As with all the topn filters bar charts with labels are produced.
There are four mon.lite outputs
(discounting STATS which are used for internal tuning).
mon.lite output is roughly as follows:
ip_scan: 18908 : 10.0.0.1:2340:0:0:0: ETC.
ip_portscan: 18563 : 10.0.0.1:2366:0:0:0: ETC.
tcp_portscan: 16303 : 10.0.0.1:2366:0:0:0: ETC.
udp_portscan: 2526 : 10.0.0.2:242:0:0:0: ETC.
The topn_scans filter produces one output called ip_scan.
The topn_port_scans filter produces three outputs, called
ip_portscan, tcp_portscan, and udp_portscan, respectively.
Each output has the number of tuples following the tag,
and each tuple is a 5-tuple of (ip source address, unique destination count,
application flags fields(3)).
The destination count is the number of unique IP or L4 port destinations.
Below is one sample label field taken from the ip_scan graph.
scanner: 131.252.244.255, ip dsts: 1911,flags: B
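The scan list lines in mon.lite can be split into 5-tuples with a few lines of Python. This is a sketch under assumptions: we assume tuples simply follow one another separated by colons, as the sample lines suggest, and the function name and the two-host input line are made up for illustration. The authoritative format is in the ourmon source.

```python
def parse_scan_list(line):
    """Parse 'tag: count : ip:dsts:f1:f2:f3: ...' into (tag, count, tuples)."""
    tag, count, rest = line.split(":", 2)
    fields = [f.strip() for f in rest.split(":")]
    while fields and fields[-1] == "":
        fields.pop()  # drop the empty field after a trailing colon
    tuples = []
    # Five fields per tuple: source IP, unique destination count, 3 flag fields.
    # Leftover fields that do not fill a whole tuple are ignored.
    for i in range(0, len(fields) - len(fields) % 5, 5):
        ip, dsts, *flags = fields[i:i + 5]
        tuples.append((ip, int(dsts), flags))
    return tag.strip(), int(count), tuples

# Hypothetical two-host line built from the sample values above.
print(parse_scan_list("ip_scan: 2 : 10.0.0.1:2340:0:0:0:10.0.0.2:242:0:0:0:"))
```

Note that this sketch handles IPv4 addresses only, since it splits on colons.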
topn flows and anomaly detection
The following links show various aspects of the topn flow mechanism
and anomaly detection:
topn_ip flow RRDTOOL count graph
topn_ip insert count
topn_icmp
topicmp_today.txt
BPF graphs and network anomaly detection
A number of BPF filter sets are provided in the default configuration that show overall network
error and control information.
BPF network errors
The graph above shows a total
count of TCP resets, ICMP unreachables, ICMP pings, and ICMP ttl
exceeded errors. TCP resets often correlate with TCP scanning attacks.
ICMP unreachables may correlate with a UDP attack.
BPF view of ICMP unreachable packets
The bpf-unreachable BPF filter set breaks out a number of different
kinds of ICMP unreachable errors. Network, host, port, and administrative
prohibited unreachable packet counts are shown. Keep in mind that UDP-based attacks
may cause large numbers of port unreachables. TCP and UDP attacks may
produce administrative unreachable errors if ACLs in Firewalls or Routers
return those kinds of errors. Yes, Virginia, it might just be a good idea
to do that to detect scanners, as opposed to returning nothing.
BPF view of TCP control packets
The bpf-tcpcontrol BPF filter set shows a breakdown of network-wide
TCP control packets, including SYNS, FINS, and RESETS. This may
be of use for spotting SYN anomalies as well as other kinds of anomalies.
It can sometimes be possible to spot an attack on this graph,
and then look through the topn_syn logs
(as well as the tcpworm.txt and tworm graphs)
to determine more information about the source of an attack, possibly
including IP source addresses.
Triggers - automated packet capture
The automated trigger mechanism allows the probe to dynamically store
a set of "interesting" packets which may be filtered internally
with various BPF expressions and then written out to a tcpdump style dump file.
trigger_tag.timestamp.dmp.
The timestamp combined with the trigger_tag makes the filename
unique. In general, only one trigger of each trigger type may be
active at a time. In this release, there are two kinds of
triggers. At this time we include a trigger for tworm events
and an additional trigger for UDP weight (udp error) events.
More trigger types may be released in the future.
tworm - a trigger may be set on the total worm count.
topn_icmperror (udp weight trigger) - a trigger may be set on the top udp error
weight for a specific IP host.
bpf_trigger - a trigger may be set on any BPF expression in a BPF filter set.
drop-trigger - a trigger may be set on the pkts filter when the number of drops exceeds
a certain threshold.
It is important to note that triggers may be more useful in terms of
the packets captured if the particular BPF expression
is more specific. This problem can be called the "trigger signal to noise problem".
For example, a BPF expression that captures all TCP packets may not show
anything useful. Compare this to the udp error weight trigger, which only
captures UDP packets for a particular IP source. Thus the tcpdump
capture file in question is more likely to show the exact nature of an attack.
packet_count - terminate when count packets are stored.
dump_dirname - directory in which to store the dump file.
This directory should be created before the ourmon probe is run.
Tue Mar 1 09:37:01 PST 2005: ourmon front-end event: tworm trigger on, current count: 45, threshold 40, dumpfile: /usr/dumps/tworm.<03.01.2005|09:36:53>.dmp
Tue Mar 1 09:38:01 PST 2005: ourmon front-end event: tworm trigger OFF, current count is 20, threshold: 40
/home/mrourmon/dumps/tworm.<03.01.2005|09:36:53>.dmp.
The contents of this file may be viewed with tcpdump.
Since the timestamp typically includes shell metacharacters one
can usually cut and paste the timestamp name in between double
quotes (or just use * globbing) as follows:
# tcpdump -n -r "tworm.<03.01.2005|09:36:53>.dmp"
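If the tcpdump command is built by a script rather than typed by hand, the quoting can be left to the Python standard library. The sketch below only builds and prints the command line; the variable names are ours, and the file name is the one from the event log example above.

```python
import shlex

# Dump file name taken from the event log example above.
dump_name = "tworm.<03.01.2005|09:36:53>.dmp"

# shlex.quote wraps the name so the shell treats <, | and > literally.
command = "tcpdump -n -r " + shlex.quote(dump_name)
print(command)
```

This prints the command with the file name in single quotes, which is safe to paste into a POSIX shell.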
tworm trigger
In the ourmon configuration file this trigger
has the following syntax:
trigger_worm threshold packet_count dump_dirname
The trigger_worm trigger is associated with the tworm graph.
One should determine a suitable configuration threshold by
watching this graph over time. Note that the threshold applies
to the tworm total count (not us, not them, the total).
If the total value for tworm counts (us+them) exceeds the supplied
threshold value, a packet dump will begin for count packets in the
dump_dirname in a dynamically created filename.
Only TCP SYN packets are stored.
trigger_worm 60 10000 /usr/dumps
This would mean that if 60 scanners are seen, 10000 packets are stored
in the trigger file.
tworm.< timestamp >.dmp.
It can be extremely useful to use the
saved back-end portreport.txt file for analysis here (see
logging
below). The relevant TCP port report file here
may give important clues about the nature of the attack
in terms of TCP destination port numbers, the number
of IP destination addresses, or the IP source addresses
involved in the attack. This information may
be useful in helping both to ignore irrelevant TCP syns
gathered during the attack and for searching the tcpdump
file for the IP addresses or ports in question (simply
tack on a BPF expression at the end of the tcpdump search).
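As a sketch of that last step, the addresses and ports taken from the port report can be turned into a BPF expression mechanically. The helper name build_bpf is hypothetical; the expression uses only the standard "src host" and "dst port" BPF primitives.

```python
from typing import Iterable


def build_bpf(src_ips: Iterable[str], dst_ports: Iterable[int]) -> str:
    """Build a BPF expression limited to TCP, the given sources, and the given ports."""
    parts = ["tcp"]
    hosts = " or ".join("src host " + ip for ip in src_ips)
    ports = " or ".join("dst port " + str(port) for port in dst_ports)
    if hosts:
        parts.append("(" + hosts + ")")
    if ports:
        parts.append("(" + ports + ")")
    return " and ".join(parts)


# The result is appended to: tcpdump -n -r "<dump file>"
print(build_bpf(["10.0.0.1"], [445, 139]))
```

When the expression is pasted onto a shell command line, it should be wrapped in single quotes so that the parentheses are not interpreted by the shell.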
udp weight trigger
In the ourmon configuration file this trigger has the following syntax:
udperror_trigger threshold packet_count dump_dirname
An example ourmon.conf trigger config might be as follows:
udperror_trigger 10000000 10000 /usr/dumps
icmp and udp error list
udp weight graph
udp port signature report
In terms of the back-end the threshold value is simply
the UDP weight calculated for the first host in the udperror_list.
This value is graphed in the udp weight graph.
We suggest that the threshold be set at 10000000 to start.
It should be tuned down or up depending upon the incidence
of false positives versus "interesting events". In general
UDP scanning does occur and this mechanism will catch outrageous
incidents. It may also catch dysfunctional P2P applications or
multimedia games. Of course, what you do with that information
will depend upon your security policy.
Fri Mar 4 01:17:33 PST 2005: udpweight threshold exceeded:10.0.0.1 14841330 7284 1194 1218 0 10: [1649,0][7674,80][22321,18][33068,0][40167,0][54156,0][55662,0][61131,0][64734,0][65021,0]
Fri Mar 4 01:17:33 PST 2005: ourmon front-end event: topn_udp_err: trigger on, current count: 14841330, threshold 10000000, dumpfile: /usr/dumps/topn_udp_err.<03.04.2005|01:17:05>.dmp
There are two event log messages shown associated with the UDP error event.
The first message is simply the first line of the UDP port report
and is generated by the backend. It has nothing to do with the trigger
and simply gives you the UDP port signature report information for the incident
in question. The offending IP is given, along
with various statistics taken from the UDP port signature report including
ports. The second line is generated by the probe and its trigger and gives the
trigger capture filename. There should also be a trigger off event line which
is not shown.
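Trigger-on messages have a regular shape, so a script can pull the trigger name, count, threshold, and dump file out of the event log. The regular expression below is a sketch written against the sample messages in this document only; it is an assumption that other ourmon versions print the same layout, and the name parse_trigger_on is hypothetical.

```python
import re
from typing import Dict, Optional

TRIGGER_ON = re.compile(
    r"ourmon front-end event: (?P<name>\w+):?\s+trigger on, "
    r"current count: (?P<count>\d+), threshold:? (?P<threshold>\d+), "
    r"dumpfile: (?P<dumpfile>\S+)"
)


def parse_trigger_on(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of a trigger-on event, or None if the line is something else."""
    match = TRIGGER_ON.search(line)
    return match.groupdict() if match else None


sample = ("Tue Mar 1 09:37:01 PST 2005: ourmon front-end event: tworm trigger on, "
          "current count: 45, threshold 40, "
          "dumpfile: /usr/dumps/tworm.<03.01.2005|09:36:53>.dmp")
print(parse_trigger_on(sample))
```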
bpf trigger
In the ourmon configuration file this trigger
has the following syntax:
bpf_trigger "filterset_label" "bpf_line_label" threshold packet_count dump_dirname
The BPF trigger capability works with any individual BPF expression in a BPF filter set.
The bpf_trigger that matches a BPF expression must be declared in the config file
after the filter set declaration. The trigger expression first names the filter
set and then names the BPF line label as a two-tuple - thus determining the BPF
expression used for the trigger. All packets stored match the BPF expression.
The threshold value is either packets or bytes depending upon the declaration of
the filter set and as always is expressed in terms of back-end graphics output.
If the user bpf filter set was declared as follows:
bpf-packets
bpf "protopkts" "ip" "ip"
bpf-next "tcp" "tcp"
and the desired target BPF expression is "tcp", then the bpf_trigger
to match that expression would be:
bpf_trigger "protopkts" "tcp" 1000 10000 /usr/dumps
which means that the trigger should be turned on to store 10000 packets
when a threshold of 1000 packets per second is reached.
bpf_protopkts_tcp.<06.03.2005|01:02:37>.dmp
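The dump file name above appears to be built from the filter set label, the BPF line label, and a timestamp. The sketch below reproduces that naming; the month.day.year|hour:minute:second layout is inferred from the sample file names in this document and is an assumption, as are the function names. (The blacklist dump files shown elsewhere use a slightly different separator.)

```python
from datetime import datetime


def bpf_trigger_prefix(filterset_label: str, line_label: str) -> str:
    """Name prefix for a bpf_trigger dump, e.g. bpf_protopkts_tcp."""
    return "bpf_" + filterset_label + "_" + line_label


def dump_filename(prefix: str, when: datetime) -> str:
    # Timestamp layout inferred from the samples: <MM.DD.YYYY|HH:MM:SS>
    return prefix + ".<" + when.strftime("%m.%d.%Y|%H:%M:%S") + ">.dmp"


name = dump_filename(bpf_trigger_prefix("protopkts", "tcp"),
                     datetime(2005, 6, 3, 1, 2, 37))
print(name)
```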
drops trigger
In the ourmon configuration file this trigger
has the following syntax:
drop_trigger threshold packet_count dump_dirname
The drop trigger is associated with the
pkts graph. It is triggered by
the (first) drops counter. The assumption here is that this
trigger does not normally fire, and fires only during
large DOS attacks. It is possible that the trigger might
thus catch a DOS attack (especially if the DOS attack
is the major cause of packets).
The supplied packet count value should probably not be small as it is quite
possible that this filter might store "good" packets as opposed
to bad packets.
drop.timestamp.dmp
The threshold for this trigger is based on packet counts.
IRC information
IRC information is presented on its own web page.
IRC data is reported for two reasons, both security related.
First, it has long been known that IRC may be used by "hackers"
as the control plane for hosts that have been captured and
become part of a remote botnet. Client botnet hosts may
in turn be used for launching attacks on local (or remote) hosts,
including denial of service attacks and "fan-out" attacks using
various well-known exploits. Thus client IRC hosts may act
as TCP-based scanning hosts (UDP activity is not ruled out, but we ignore
it here apart from the UDP port signature report).
Second, it is always possible that IRC may be used for the dissemination
of "warez" (meaning files possibly sans copyright).
Whether this is allowed, or whether IRC itself is allowed within an administrative domain,
depends on the local security policy. That said, ourmon looks at statistical
data for IRC and does not look at PRIVMSG content. The main
goal is to identify IRC channels that appear to be under the control
of computers (and malware at that) as opposed to people.
topn_irc 0
topn_irc_file /home/mrourmon/tmp
These two switches should always be used together and never be
given separately (0 is a placeholder for a possible future feature).
The latter switch directs where
the front-end output irc.txt file should be stored.
It may be overridden by the use of the -D parameter
with the ourmon probe. The ourmon probe uses a lightweight
layer 7 hand-coded scanner to look for certain IRC messages including
JOIN, PRIVMSG, PING, and PONG and also gathers a small set of
RRDTOOL statistics based on global counts for those messages.
Note that the IRC filter is dependent on the topn_syn module.
Ironically the topn_syn module in its application flags also needs
the IRC module.
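The idea of a lightweight scanner that counts a handful of IRC message types can be sketched as follows. This is not the probe's hand-coded C scanner; the function name count_irc_messages is hypothetical, and the sketch assumes the payload holds complete newline-terminated IRC lines.

```python
from collections import Counter

IRC_COMMANDS = {"JOIN", "PRIVMSG", "PING", "PONG"}


def count_irc_messages(payload: bytes) -> Counter:
    """Count JOIN, PRIVMSG, PING, and PONG messages in one TCP payload."""
    counts = Counter()
    for raw in payload.split(b"\n"):
        tokens = raw.strip().decode("latin-1").split()
        # An IRC message may begin with an optional ":prefix" naming the sender.
        if tokens and tokens[0].startswith(":"):
            tokens = tokens[1:]
        if tokens and tokens[0].upper() in IRC_COMMANDS:
            counts[tokens[0].upper()] += 1
    return counts


payload = (b":bot!u@h JOIN #exploit\r\n"
           b"PING :irc.example\r\n"
           b":bot!u@h PRIVMSG #exploit :scan\r\n"
           b"NICK bot\r\n")
print(count_irc_messages(payload))
```

Only the command word is inspected; the PRIVMSG text itself is never examined, which matches the statistical approach described above.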
1. A 30-second version of the IRC report (the format is the
same for both this version and the more important summarized version).
Because IRC data may take a while to be collected, this 30-second
view may only be useful for debugging or large security incidents.
2. A daily summarization compiled at the hour, and then the typical
week's worth of rolled-over summarizations. Typically today's and
yesterday's views may be quite useful in looking for IRC-related malware
problems. We will explain the report format below in more detail.
3. RRD graphs for total IRC message counts are also provided.
These may be useful for detecting large changes in normal network
packet types including PRIVMSG, PING, PONG, and JOIN counts. Large
changes in for example PRIVMSG or PING/PONG may indicate the presence
of a bot server on a local network.
1. global statistics, 2. channel statistics, and 3. IRC host statistics.
Channel statistics are given in major sections including:
channels sorted by wormy (evil) hosts - call this the
evil channel report
channels sorted by max messages - call this the
max message channel report
channels with associated host IPs and host stats - call this the
channel host report
channels with no PRIVMSGS, only JOINS
channels with any wormy hosts
chanmap table - mapping of channel names to a case-insensitive form.
Host statistics follow:
servers sorted by max messages
most busy server in terms of max messages
hosts with JOINS but no PRIVMSGS
hosts with any sign of worminess
channel   msgs   joins   privmsgs   ipcount   wormyhosts   evil?
exploit   33     0       33         5         4            E
Most of the channel reports (e.g., the evil channel report or
max message channel report) have this format. The channel host report
has a different per host format.
channel   ip_src     tmsg  tjoin  tping  tpong  tprivmsg  maxchans  maxworm  Server?  sport/dport  first_ts
lsass445  192.1.1.1  161   30     67     63     1         1         99       H        3245/6667    12:43:20 PDT
          192.1.1.2  151   31     62     65     1         1         89       H        3245/6667    12:43:20 PDT
          10.0.0.1   151   31     62     65     1         1         22       S        6667/3245    12:43:20 PDT
# ngrep host 10.0.0.1
IRC blacklist
The irc.pl script in the back-end may optionally be given
a list of IP addresses in the form of a perl db (database) file.
This can be done for the 30-second summarization by modifying the
irc.pl script call in bin/omupdate.sh, which produces event log
messages more quickly, or for the hourly summarization by modifying
the irc.pl script call in bin/batchip.sh. We will assume that the
modifications are made to the hourly bin/batchip.sh script for
our discussion here.
/home/mrourmon/bin/stirc.pl /home/mrourmon/etc/rawbots.txt /home/mrourmon/etc/ourshadow
-B /home/mrourmon/etc/ourshadow .
Also include an event log parameter as follows:
-e $WEBDIR/event_today.txt
Thus if the script finds a matching IRC address, it will
post an event log warning in the event log.
$IRC -d -B /home/mrourmon/etc/ourshadow -e $WEBDIR/event_today.txt
$OURPATH/logs/rawirc/rawirc_today > $WEBDIR/ircreport_today.txt
If an event log notification is found, look at the hourly IRC
summarization file, and go to the end of it. There is a heading
there as follows in the host stats section:
hosts appear in blacklist!!! - assume channel is infected
Any IP addresses there matched IP addresses in the blacklist.
If a match is found, search the IRC report for any and all
channels that match the IP address. You may have found
a botnet.
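The search for channels that contain a blacklisted address amounts to a set intersection per channel. The sketch below illustrates that step only; it is not how irc.pl is written, and the function name and the in-memory data layout are assumptions made for the example.

```python
from typing import Dict, Set


def blacklisted_channels(channel_hosts: Dict[str, Set[str]],
                         blacklist: Set[str]) -> Dict[str, Set[str]]:
    """Map each channel name to the blacklisted IPs seen in it (channels with no hit are omitted)."""
    hits = {}
    for channel, hosts in channel_hosts.items():
        matched = hosts & blacklist
        if matched:
            hits[channel] = matched
    return hits


channels = {
    "lsass445": {"192.1.1.1", "192.1.1.2", "10.0.0.1"},
    "chat": {"192.1.1.9"},
}
print(blacklisted_channels(channels, {"10.0.0.1"}))
```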
Blacklists
1. the front-end IP blacklist mechanism
2. the front-end DNS blacklist mechanism
3. the back-end IRC blacklist mechanism.
Packets to/from certain "evil" hosts may be flagged with
the IP blacklist mechanism. This system works with known
suspect hosts expressed as IP addresses.
DNS query packets to/from certain
"evil" hosts as expressed with DNS names may also be flagged.
The ourmon front-end processes both the IP blacklist and the DNS
blacklist. The IP blacklist outputs may include event log
messages and a tcpdump style dump file that catches all packets
sent to blacklist IPs subject to ourmon's snap length size.
The DNS list mechanism merely produces event log messages.
In addition, it is possible to provide a list of IPs to
the back-end IRC module (bin/irc.pl) to produce special
IRC event log messages if any blacklisted host is found in the IRC
list of known IRC-speaking hosts.
ourmon probe - IP blacklist
# blacklist of evil ips
blist_dumpfile 10000 /usr/dumps
blist_include "list1" /etc/ourmon_list1.txt
blist_include "list2" /etc/ourmon_list2.txt
The blist_dumpfile directory provides the directory location for the output
blacklist dump file. A maximum number of packets to be placed in that file
must be specified. The file when created has an appended timestamp
as part of the filename so that the filename is unique. For example,
two such filenames might appear as follows:
-rw-r--r-- 1 root wheel 24 Dec 20 00:10 /usr/dumps/blacklist.<12.20.2007.00:10:00>.dmp
-rw-r--r-- 1 root wheel 831095 Dec 22 00:10 /usr/dumps/blacklist.<12.21.2007.00:10:01>.dmp
Note that 24 bytes means the file is empty and has no packets in it.
Tcpdump, wireshark, or other sniffers that understand the
pcap library's tcpdump file format can be used to play back the packets.
One can use cut and paste (with quotes) or wildcards as follows
to play back the file:
# tcpdump -n -XX -r blacklist*12*20*2007*
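The 24-byte figure is the size of the classic pcap global file header, so a file of exactly that size holds a header and no packet records. The following sketch builds such a header and checks for packets; the function name pcap_has_packets is hypothetical, and only the classic microsecond-resolution magic number is recognized.

```python
import struct

PCAP_GLOBAL_HEADER_LEN = 24
PCAP_MAGICS = (b"\xd4\xc3\xb2\xa1", b"\xa1\xb2\xc3\xd4")  # little- and big-endian


def pcap_has_packets(data: bytes) -> bool:
    """True if a classic pcap file holds anything beyond its global header."""
    if len(data) < PCAP_GLOBAL_HEADER_LEN or data[:4] not in PCAP_MAGICS:
        raise ValueError("not a classic pcap file")
    return len(data) > PCAP_GLOBAL_HEADER_LEN


# magic, version 2.4, timezone offset, timestamp accuracy, snap length, link type
empty_dump = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
print(len(empty_dump), pcap_has_packets(empty_dump))
```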
10.0.0.1 7777
10.0.0.2 any
"any" is a wildcard and matches any port.
DNS blacklist
This blacklist is part of the topn_dns module feature set.
We explain how to configure it in the
Topn DNS section.
Automated Blacklists
At this point in time, it is possible to dynamically download
a set of botcc names from the snort bleedingthreats site.
The bleeding-snort site is getting rules for a set
of botnet C/Cs from shadowserver.org. These rules
have been made available at:
www.bleedingthreats.net/rules/bleeding-botcc.rules
We provide three scripts in src/scripts which can be used to turn these
snort rules into a config file for ourmon. We first assume
that the IP blacklist mechanism is on and configured into the ourmon
config file (as discussed above). The three scripts are as follows:
getssbots.sh: main script
stirc.pl: parses snort rules and turns them into simple IRC blacklist
stoo.pl: parses snort rules and turns them into front-end IP blacklist.
The getssbots.sh script should be driven by crontab and uses wget
to download a snort rules file from bleedingthreats. It then
calls the stoo.pl and stirc.pl scripts to produce various
IP blacklist files that can either be used with the ourmon
front-end IP blacklist or with the back-end irc.pl script (or both).
The scripts should be read and tested by executing getssbots.sh
until it is working properly and producing output files
that can be configured into the various parts of ourmon.
Topn DNS.
dns 60
dns_include /etc/dns_blacklist.txt
The "dns 60" line turns on the DNS module and provides one
RRDTOOL graph to the back-end. Currently the DNS module only
works with UDP port 53 (needs fixing) and basically very crudely
counts queries (qy), query responses (qr), and breaks query responses
up into A, AAAA, MX, and PTR counts.
lsass.exploited.org A
One entry should be given per line.
IP dnslist event.
Ourmon Logging
Logging in ourmon takes various forms.
1. RRD data is stored in the base back-end rrddata directory. The
file rrddata/ourmon.log records RRD errors and should
be consulted when one is attempting a new user BPF filter-set,
especially when the filter-set graphics do not appear. The RRD
data gives a year's worth of baselined data for RRD graphics.
Note that RRD data files reach their maximum size at creation time
and do not grow larger over time.
2. The event log
gives significant events in the life of the ourmon system
including probe reboots and important security events.
More information is available below.
3. Web summarizations
represent hourly (and eventually daily) reports
available via the ourmon web pages. For example,
various top N summarizations and IRC summarizations
are available. Most summarizations are available at the
bottom of the main web page.
IRC summarizations are on a separate page.
In general one week's worth of summarizations are available.
4. Basic log information is stored in the back-end logs directory.
Depending on where you installed ourmon this directory is called logs
and might be found at /home/mrourmon/logs or /usr/local/mrourmon/logs.
Log information can be divided into "raw" logs which more or less directly
come from the probe, and processed log information which is generated
by various back-end scripts. For example the rawirc directory has IRC
information as generated by the front-end. The irc directory has
30-second reports which represent the processed version of the raw reports.
Summarization is always done with raw versions.
Mon, Tue, Wed, Thu, Fri, Sat, Sun - daily raw top_n log files (Mon Tue etc) -
in some cases these are used for top talker daily summarizations.
The symlink file just inside the top /home/mrourmon/logs directory
called topn_today points at the current day.
Note that the top_n log file directory also contains the daily
scanner db database.
irc - processed IRC reports.
mon.lite - raw probe mon.lite file, contains top talker, hardwired,
and BPF stats. top_n tuples are broken out by the back-end into
various per top_n log files (Mon-Sun) as above.
p2preport - processed TCP
syn p2p port report files. These show which systems are doing IRC,
and various forms of p2p activity based on currently hardwired p2p
application flags.
portreport - processed TCP port
signature report files (portreport.txt). These can be crucial to
security analysis.
rawemail - raw EMAIL topn_syn files, processed into email summarization.
rawirc - raw IRC files, processed into IRC summarization.
rawp2p - raw p2p files, processed into P2P summarization.
rawsyndump - raw syndump files, processed into syndump summarization.
tworm - raw tcp port report files from the probe. These are processed
into 30-second and hourly TCP port report summarizations.
udpreport - processed UDP port signature
reports. These files may be useful for security analysis.
There is unfortunately no summarization script at this time.
However typically the top offender will be captured in the event log.
Event Log
The event log records important events in the life of ourmon
including probe messages of interest like reboots and nasty back-end
errors or important security events detected anywhere in
the ourmon system. The event log is also closely coupled
to the automated packet capture facility. Any trigger will
generate on and off messages placed in the event log.
For trigger info, please see
triggers - automated packet capture.
Tue Oct 3 00:55:00 PDT 2006: ourmon front-end event: ourmon restarted
This means that the front-end was rebooted.
ourmon front-end event: topn_udp_err trigger on, current count: 21587544, threshold 10000000, dumpfile: /usr/dumps/topn_udp_err.<10.03.2006|00:54:55>.dmp
This message tells us that the topn_udp_err automated packet capture trigger was turned on, and that packets were stored in the named file. tcpdump can be used to review the packets.
ourmon back-end elapsed time too long
This means that the omupdate.pl script was not able to do its work
in the 30 seconds required. This may indicate a serious system bug
which may or may not have been caused by ourmon. One possibility is
that the system simply has too much work to do in the time allotted.
botnet client mesh?: irc channel X has bad #hosts:
This means that the named channel with N hosts (the channel name and host count
are filled in) may be a botnet
client channel and the IRC and TCP report information should be closely
examined. This message is triggered if 3 "wormy" hosts are found
in an evil channel report. The number can be adjusted by modifying
the constant:
$K_badips
in bin/irc.pl.
botserver?: irc channel X has #hosts:
This means that an IRC channel with 150 or more IP hosts was detected.
This may indicate the presence of an IRC botserver in your local network.
The number can be adjusted by modifying the constant:
$K_ipcount
in bin/irc.pl.
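The two channel checks can be summarized in a short sketch. The constants mirror the $K_badips and $K_ipcount values quoted above, but the code is an illustration and not an excerpt of bin/irc.pl; the function name is hypothetical, and treating "3 wormy hosts" as "3 or more" is an assumption.

```python
from typing import List

K_BADIPS = 3     # stands in for $K_badips in bin/irc.pl
K_IPCOUNT = 150  # stands in for $K_ipcount in bin/irc.pl


def channel_warnings(ipcount: int, wormyhosts: int) -> List[str]:
    """Return the event log warning labels that one channel would raise."""
    warnings = []
    if wormyhosts >= K_BADIPS:   # assumed to mean "3 or more"
        warnings.append("botnet client mesh?")
    if ipcount >= K_IPCOUNT:     # "150 or more IP hosts"
        warnings.append("botserver?")
    return warnings


# The "exploit" channel from the sample report: 5 hosts, 4 of them wormy.
print(channel_warnings(ipcount=5, wormyhosts=4))
```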
irc blacklist hit for (ip/count): IP address
This means that the hourly back-end irc script run has been supplied
with a known list of evil IRC hosts (the IP blacklist) and that one such
"evil" IRC host has been encountered in a channel.
See blacklists for information on how to
configure this feature.
TCP port report darknet/honeynet violation:192.168.3.4 (O) D 0 67 69/53 10/8293 10.1.2.3 2k/2k 786/2k
[80,37][2520,4][3462,11][4383,4][6441,4][11310,5][15552,8][23121,11][54425,5][60528,7]
So this message means that 192.168.3.4 wrote into the (D) darknet.
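The bracketed pairs at the end of a port report line are easy to extract for further analysis. The sketch below returns them as integer pairs; the first number of each pair is the port, and the meaning of the second number should be taken from the port signature report section. The function name parse_port_tuples is hypothetical.

```python
import re
from typing import List, Tuple

PORT_TUPLE = re.compile(r"\[\s*(\d+)\s*,\s*(\d+)\s*\]")


def parse_port_tuples(text: str) -> List[Tuple[int, int]]:
    """Extract every [port,value] pair from a port report line."""
    return [(int(port), int(value)) for port, value in PORT_TUPLE.findall(text)]


line = ("[80,37][2520,4][3462,11][4383,4][6441,4]"
        "[11310,5][15552,8][23121,11][54425,5][60528,7]")
pairs = parse_port_tuples(line)
print(pairs[0], len(pairs))
```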
IP dnslist event: dns query:: lsass.exploited.org. from
192.168.1.1->192.168.1.2: count 2
Interpretation of this message depends upon knowledge of your
own DNS servers and the position in your network of the ourmon probe.
If we assume 192.168.1.1 is a local DNS server and that 192.168.1.2
is a host, this means that the local DNS server has returned
a query response to the host for "lsass.exploited.org".
The DNS name (lsass.exploited.org) was supplied to the front-end in
the DNS blacklist file. It can be assumed that the host
has made a DNS query with the "evil" name. This may be part of
an attempt by a botnet client to download a payload. This is a per-30 second event.
See blacklists
for information on how to configure this feature.
Also see topn_dns
for information on how to turn on the topn_dns feature in the front-end.
ourmon front-end event: IP blacklist event:: (list1) 10.0.0.1->192.168.1.2:80 count 2
These messages are produced at 30-second intervals by the ourmon
front-end probe if and only if a packet sent to or from a blacklist
address is detected by ourmon. This message shows a flow of packets from
10.0.0.1 to 192.168.1.2 at port 80 on 192.168.1.2. There were 2 packets.
The packet data, subject to the ourmon probe snap size,
will have been automatically captured in the blacklist
tcpdump dump file. The list name (list1) is provided so
that the analyst can tell which input blacklist file caused
the event.
See blacklists
for information on this feature including configuration info.
Web summarizations
Web summarizations are produced for some
top N data
and other data as well.
Summarizations
are a form of logging in which statistics are aggregated together
and summarized in various ways. Daily summaries are updated
at the hour and typically rolled back a day at midnight producing
a week's worth of summaries. Summarized data is stored
in the ourmon web directory of course and not in the logs directory.
However the logging directory data is used to produce the
web summarization.
(The work weight parameter (-W) may be modified
by changing bin/batchip.sh on the line that begins:
$OMBATCHSYN -W 100 -N -c 1000 etc. )
Sorting within the above restrictions
is done by the maximum number of syns. Data is represented in
the same fashion as with the top N syn summarization above.
Security Analysis and Logging
There are a number of useful tricks for searching the logging directories
(or the web directory).
For example assume
that you have an IP source address of interest and you wish
to search the TCP port report logging directory to see if that
IP source is mentioned. Assume that you know Friday
is also a day of interest.
# cd /home/mrourmon/logs/portreport/Fri
# find . -type f | xargs grep -F 10.0.0.1
The use of xargs here combined with grep allows you
to easily search the log directory for the IP source
address 10.0.0.1. This technique of course can be used
in other logging directories as well.
# cd /home/mrourmon/logs/portreport/Fri
# find . -type f | xargs wc -l | sort -n
Wc is used to count the lines in the file, and sort will
sort out the results showing you the biggest file.
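The same two searches can be combined in a small script when the shell is not convenient. This is a sketch under stated assumptions: the function name files_mentioning is hypothetical, the match is a plain substring test (so 10.0.0.1 also matches 110.0.0.10), and the demonstration runs against a temporary directory rather than a real logs directory.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def files_mentioning(root: Path, needle: str) -> List[Tuple[str, int]]:
    """List (relative file name, matching line count), busiest file first."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        with path.open(errors="replace") as handle:
            count = sum(1 for line in handle if needle in line)
        if count:
            hits.append((str(path.relative_to(root)), count))
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("10.0.0.1 scan\n10.0.0.1 scan again\n")
    (root / "b.txt").write_text("192.168.1.1 quiet\n")
    (root / "c.txt").write_text("seen 10.0.0.1 once\n")
    demo_hits = files_mentioning(root, "10.0.0.1")
print(demo_hits)
```

For real use, pass a directory such as the Fri port report directory in place of the temporary one.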
PCRE - Layer 7 pattern matching
Ourmon has a new feature that allows programmable signature tags based
on PCRE pattern matching to be associated with various kinds of flow
reports including the top N talker reports and the TCP and UDP port signature reports.
For example, see the
port signature report above.
For another example, see the
topn flow graphs above.
This feature applies to wherever the apps field appears in
the TCP port report, p2p port report, and new syndump reports
or top talker graphs like top_n (not ICMP), topn_syn, the UDP
part of the icmp and udp error list, etc.
ourmon -s 256 ... additional parameters
If pattern matching is not of any interest and there is no desire to capture
IRC information either, the -s 256 flag to ourmon in bin/ourmon.sh should be removed.
pattern TCP F DEFAULT DEFAULT "^220.*ftp"
This could result in an F in the backend tags field indicating that
the ip_src with an F in its apps field acted as an FTP server.
configuration syntax for pattern matching
The tag space available for all patterns logically is a-z, and A-Z.
Thus there could be a maximum of 52 patterns. However for now,
because this mechanism overloads the same previous generation tags
used in the apps field of the TCP port signature report, consider
the following letters as reserved words, and do not overload
them.
reserved tags: B, G, I, K, M, P,
E, H, lowercase e, and lowercase s.
See the TCP port signature
report section for more information. You may use the remaining letters in the a-z,
and A-Z spaces for PCRE tags. These tags will appear in the apps field of
the various tcp syn tuple reports (tcp port signature, p2p, syndump) and top_n
reports that currently can use the tags.
pattern transport_layer tagchar max_packets max_bytes pcre_pattern
for example:
pattern TCP b 100 2000 "^\x13bittorrent protocol"
(No need for this one though - we cover it already in a faster
manner).
pattern - indicates a pattern definition in the configuration file.
transport_layer - only payloads on this transport layer will be matched.
In theory TCP, UDP, and ICMP will be available. Currently only TCP
is supported.
tagchar - One alphabetical character [a-zA-Z] tag associated with the
pattern. This tag will be used to indicate a successful
match in the application flags field of the port report and its relatives.
It should be unique. Note that the tagchar is independent from the layer being
matched. That is, if the same tagchar is associated with
TCP and UDP payload then the second tag will be ignored as
it is considered to be a duplicate. Put another way, use each
tag only once. You cannot use tag A with both TCP and UDP (when in the future
UDP tags become useful).
max_packets - Maximum number of packets associated with an IP src that
will be searched for this pattern.
max_bytes - Maximum bytes associated with an IP src that will be
searched for this pattern. During one ourmon 30-second sample period, the payload will be
searched for this pattern until max_packets or
max_bytes, whichever comes first, is reached. These
can also have the value DEFAULT, which causes the built-in
defaults, 8 for max_packets and 2048 for max_bytes, to be used.
These default values have been adopted from the l7-filter project.
If a pattern has a value of zero for both maximums, then the pattern
will always be searched for in packet payloads. In effect these
two fields mean "search for a while and give up" during the sampling
time. Note again that the search period is per IP source.
It is not for all IP sources found during the sample period.
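The field layout and the tag rules above can be checked mechanically before a configuration is handed to the probe. The following sketch is not part of ourmon; Pattern, parse_pattern_line, and check_tags are hypothetical names, and the reserved set simply copies the reserved tag list given earlier.

```python
from typing import List, NamedTuple

RESERVED_TAGS = set("BGIKMPEHes")
DEFAULT_MAX_PACKETS = 8
DEFAULT_MAX_BYTES = 2048


class Pattern(NamedTuple):
    transport: str
    tag: str
    max_packets: int
    max_bytes: int
    regex: str


def parse_pattern_line(line: str) -> Pattern:
    """Parse 'pattern transport_layer tagchar max_packets max_bytes pcre_pattern'."""
    fields = line.split(None, 5)  # the pattern itself may contain spaces
    if len(fields) != 6 or fields[0] != "pattern":
        raise ValueError("not a pattern line: " + line)
    _, transport, tag, max_packets, max_bytes, regex = fields
    if len(tag) != 1 or not (tag.isascii() and tag.isalpha()):
        raise ValueError("tagchar must be one letter in a-z or A-Z")
    if len(regex) >= 2 and regex[0] == regex[-1] == '"':
        regex = regex[1:-1]  # drop optional surrounding double quotes
    packets = DEFAULT_MAX_PACKETS if max_packets == "DEFAULT" else int(max_packets)
    nbytes = DEFAULT_MAX_BYTES if max_bytes == "DEFAULT" else int(max_bytes)
    return Pattern(transport, tag, packets, nbytes, regex)


def check_tags(patterns: List[Pattern]) -> List[str]:
    """Report reserved tags and tags that are used more than once."""
    problems, seen = [], set()
    for pattern in patterns:
        if pattern.tag in RESERVED_TAGS:
            problems.append("tag " + pattern.tag + " is reserved")
        if pattern.tag in seen:
            problems.append("tag " + pattern.tag + " is used more than once")
        seen.add(pattern.tag)
    return problems


ftp = parse_pattern_line('pattern TCP F DEFAULT DEFAULT "^220.*ftp"')
print(ftp, check_tags([ftp, ftp]))
```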
#ftp server - limited search
pattern TCP F DEFAULT DEFAULT "^220.*ftp"
#ftp server - search during entire sample period
pattern TCP F 0 0 "^220.*ftp"
In the first case we look per IP source for only a limited number of packets
or bytes for the pattern. In the second case, there is no restriction placed
on counts. The second form is more costly in terms of CPU
utilization.
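The difference between the two forms can be sketched as a per-source search budget. The code below uses Python's re module as a stand-in for PCRE and is not the probe's implementation; the function name source_matches is hypothetical, case-insensitive matching is an assumption borrowed from l7-filter conventions, and the behavior when only one maximum is zero is left undefined here, as it is in the text.

```python
import re
from typing import Iterable


def source_matches(payloads: Iterable[bytes], regex: bytes,
                   max_packets: int = 8, max_bytes: int = 2048) -> bool:
    """Search one IP source's payloads (one sample period) until the budget runs out."""
    pattern = re.compile(regex, re.IGNORECASE | re.DOTALL)
    unlimited = max_packets == 0 and max_bytes == 0
    packets = nbytes = 0
    for payload in payloads:
        # Give up once either limit is reached, whichever comes first.
        if not unlimited and (packets >= max_packets or nbytes >= max_bytes):
            return False
        if pattern.search(payload):
            return True
        packets += 1
        nbytes += len(payload)
    return False


# Ten unrelated packets, then an FTP banner in the eleventh.
traffic = [b"junk"] * 10 + [b"220 ProFTPD ftp server ready"]
print(source_matches(traffic, rb"^220.*ftp"))        # DEFAULT DEFAULT
print(source_matches(traffic, rb"^220.*ftp", 0, 0))  # 0 0
```

With the default budget the banner is never reached, while the unrestricted form finds it at the cost of searching every payload.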
pattern-keepNULL
pattern-removeNULL
In order to interoperate with patterns found at
the l7 filter project,
by default NULL characters will be removed from the L7 payload
before any pattern matching is performed. This can be changed
by preceding a section of patterns with the tag "pattern-keepNULL".
Using the tag "pattern-removeNULL" before a section of patterns
will restore the default. For example,
pattern-keepNULL
# note that you don't actually need this one as ourmon has it hardwired
pattern TCP B 100 2000 "^\x13bittorrent protocol"
pattern-removeNULL
pattern TCP a DEFAULT DEFAULT ajprot\x0d\x0a
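The effect of the two tags can be shown with a payload that has a NULL byte in the middle of the text to be matched. As before, this sketch uses Python's re module in place of PCRE; the function name match_payload and the keep_null parameter are hypothetical.

```python
import re


def match_payload(regex: bytes, payload: bytes, keep_null: bool = False) -> bool:
    """Match regex against a payload, stripping NULL bytes first unless keep_null is set."""
    data = payload if keep_null else payload.replace(b"\x00", b"")
    return re.search(regex, data, re.IGNORECASE) is not None


payload = b"aj\x00prot\r\n"
print(match_payload(rb"ajprot\x0d\x0a", payload))                  # pattern-removeNULL
print(match_payload(rb"ajprot\x0d\x0a", payload, keep_null=True))  # pattern-keepNULL
```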
PCRE - sample patterns
These patterns and possibly others are provided in the sample ourmon.conf file.
Many more patterns are provided at the L7 filter project web site.
# gnutella and similar - probably more comprehensive than ourmon builtin
pattern TCP g DEFAULT DEFAULT ^(gnutella connect/[012]\.[0-9]\x0d\x0a|get /uri-res/n2r\?urn:sha1:|get /.*user-agent: (gtk-gnutella|bearshare|mactella|gnucleus|gnotella|limewire|imesh)|get /.*content-type: application/x-gnutella-packets|giv [0-9]*:[0-9a-f]*/|queue [0-9a-f]* [1-9][0-9]?[0-9]?\.[1-9][0-9]?[0-9]?\.[1-9][0-9]?[0-9]?\.[1-9][0-9]?[0-9]?:[1-9][0-9]?[0-9]?[0-9]?|gnutella.*content-type: application/x-gnutella|..................lime)
#
# ftp server
pattern TCP F DEFAULT DEFAULT "^220.*ftp"
#
# symantec Update message
pattern TCP S DEFAULT DEFAULT "User-Agent: Symantec LiveUpdate"
#
# web server message
pattern TCP W DEFAULT DEFAULT "^http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*(server:)"
#
# soulseek p2p
pattern TCP z DEFAULT DEFAULT ^(\x05..?|.\x01.[ -~]+\x01F..?.?.?.?.?.?.?)$
#
# web client
pattern TCP w DEFAULT DEFAULT "^get .*http[\x09-\x0d -~]*(user-agent:)"
#
# ares p2p
pattern TCP A DEFAULT DEFAULT ^\x03\x5a.?.?\x05.?\x38
#
# bittorrent in udp dress
pattern UDP j DEFAULT DEFAULT "d1:ad2:id20:"