Traffic Accounting System (TAS)

© Anton Voronin (anton@urc.ac.ru), 2000-2001.


Introduction

TAS is designed to gather and process traffic statistics from PC or Cisco routers at the IP level, and from specific applications at the application level. With slight modifications it can work with any device capable of traffic accounting.

The application level is needed because some "intermediate" services (such as HTTP proxy servers, news servers or mail relays) hide a user's actual traffic from the IP level. For example, suppose a client requests a large file from abroad via your HTTP proxy server. At the IP level you can see only the traffic between the client and your proxy server. So if you wish to know all traffic flows initiated by or destined for your clients (for billing, for setting traffic limits, or just for estimating network usage per client), you have to account for the traffic at the application level as well. TAS can work with the following applications: squid, sendmail and MailGate.
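To illustrate the proxy example, here is a minimal Python sketch. It is not TAS code: the record layouts, the addresses, and the function name are all hypothetical, chosen only to show that IP-level records alone attribute no remote traffic to the client, while the proxy's own log does.

```python
from collections import defaultdict

PROXY = "10.0.0.1"  # hypothetical address of your HTTP proxy

# Hypothetical IP-level records: (source, destination, bytes).
# The router only sees the proxy delivering the file to the client.
ip_records = [(PROXY, "192.168.1.5", 5_000_000)]

# Hypothetical application-level records taken from the proxy log:
# (client, remote host, bytes).
proxy_records = [("192.168.1.5", "files.example.com", 5_000_000)]

def remote_traffic_per_client(ip_records, proxy_records, proxy=PROXY):
    """Sum bytes each client received from remote hosts, per accounting level."""
    totals = defaultdict(lambda: {"ip": 0, "app": 0})
    for src, dst, nbytes in ip_records:
        if src != proxy:  # traffic coming from the proxy itself is local
            totals[dst]["ip"] += nbytes
    for client, _remote, nbytes in proxy_records:
        totals[client]["app"] += nbytes
    return {client: dict(levels) for client, levels in totals.items()}

totals = remote_traffic_per_client(ip_records, proxy_records)
```

In this sketch the client's remote traffic shows up only under the "app" counter, which is the reason for accounting at both levels.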


Model of work

TAS is written completely in Perl and consists of the following components:

Most of them use configuration files (see below).

The first four programs collect accounting data from routers or specific applications. AcctMax performs the processing that IP data requires before it is handled by AcctLog. AcctLog builds arbitrary reports according to the rules specified in its configuration. AcctJoin summarizes daily databases into current-month databases. The periodic scripts run the other TAS components, send the reports to the operator, and archive them.
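The summarizing step performed by AcctJoin can be pictured with a short Python sketch. This is only an illustration under assumptions: the key layout (a source/destination pair) and the function name are hypothetical, and plain dictionaries stand in for the real database tables.

```python
def join_daily_into_monthly(monthly, daily):
    """Add every counter of a daily table to the matching monthly counter."""
    for key, nbytes in daily.items():
        monthly[key] = monthly.get(key, 0) + nbytes
    return monthly

# Hypothetical tables keyed by (source, destination), values in bytes.
monthly = {("192.168.1.5", "10.0.0.1"): 1000}
daily = {("192.168.1.5", "10.0.0.1"): 250,
         ("192.168.1.7", "10.0.0.1"): 40}

join_daily_into_monthly(monthly, daily)
```

Keys already present in the monthly table are increased, and keys seen for the first time are simply added.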

Accounting data is stored in Berkeley DB tables. I know it is not a very smart idea to use db for this task, because selecting data for a report requires a sequential search of the full database. But it is very simple and convenient to summarize the data in hash tables, because that eliminates duplicate keys (in comparison to storing data in plain text files).
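The trade-off described above can be sketched in a few lines of Python, with a dictionary standing in for a db hash table. The key format ("source destination") and the function names are assumptions made for the example, not the actual TAS layout.

```python
def account(table, src, dst, nbytes):
    """Add one traffic record; a repeated (src, dst) pair updates the same key."""
    key = f"{src} {dst}"
    table[key] = table.get(key, 0) + nbytes

def select_by_source(table, src):
    """Selection for a report: every key has to be inspected (a full scan)."""
    return {k: v for k, v in table.items() if k.split(" ")[0] == src}

table = {}
for record in [("192.168.1.5", "10.0.0.1", 100),
               ("192.168.1.5", "10.0.0.1", 250),
               ("192.168.1.7", "10.0.0.1", 40)]:
    account(table, *record)

selected = select_by_source(table, "192.168.1.5")
```

Three input records leave only two keys in the table, since the repeated pair is summed in place. The selection, however, cannot use the hash to find "all keys starting with this source", so it walks the whole table.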


Installation

After you have unpacked the archive, you'll see the Makefile. You don't need to build or configure anything before installing. To install TAS, just type:

make install

By default all components are installed under /usr/local. If you want to use another prefix (for example, /usr/local/tas), type:

make PREFIX=/usr/local/tas install

After the files are copied, you need to perform some installation steps manually. See the next chapter for the steps required by each TAS component.


The TAS components


Configuration

TAS uses three configuration files: /usr/local/etc/tas/tas.conf for the AcctFetch, AcctSquid, AcctSendmail and AcctMailgate programs, /usr/local/etc/tas/AcctLog.conf for the AcctLog program, and /usr/local/etc/tas/accounting.conf for the periodic scripts.


tas.conf has a single parameter, $prefix, that defines the directory where the accounting databases reside.


AcctLog.conf has three complex parameters: @local_nets, @lists, and %tables.


The accounting.conf file has three scalar parameters. All of them must be defined explicitly.


Configuration tips


Performance

The most time consuming operation of the TAS is report building. To make it more efficient, the following measures have been taken:

On my PII 266MHz server, AcctLog spends 23 minutes building a report of 6 tables with 8 columns each from a database of more than 100,000 records (the daily amount of IP traffic statistics I currently have). Processing a monthly database takes 2-3 hours.

I doubt that there's any way to increase processing speed further (at least in the current implementation, where selection requires a full sequential search of the database).


Planned enhancements

In the future it is planned to drop DNS resolution of addresses into names, and grouping by names, when building a report. Instead AcctLog should connect to a MySQL database that keeps all the information about clients, find out who owns a given address, and so be able to aggregate hosts by client in the report tables rather than by IP networks or domain names. Of course, DNS resolution and grouping by names will be kept as an option.

Also, the traffic totals computed for each client should be stored automatically in the client database, not only in the report.

Accounting data itself also needs to be stored in a real database, like MySQL, rather than in db tables, because that would eliminate the full sequential retrieval of all records when building a report.

For domain names this would be possible if names were stored in reversed form (top-level domain on the left), so that all names within a domain share a common prefix.
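A minimal Python sketch of this idea follows. A sorted list with binary search stands in for a database index, and the function names are hypothetical. Once the names are reversed, a domain and all of its subdomains can be found by a range lookup instead of a full scan.

```python
import bisect

def reverse_name(name):
    """'www.example.com' -> 'com.example.www' (top-level domain on the left)."""
    return ".".join(reversed(name.lower().split(".")))

def select_domain(sorted_keys, domain):
    """Return the reversed names equal to `domain` or lying below it."""
    prefix = reverse_name(domain)
    result = []
    i = bisect.bisect_left(sorted_keys, prefix)
    if i < len(sorted_keys) and sorted_keys[i] == prefix:
        result.append(prefix)                  # the domain itself
    # Subdomains start with prefix + "."; "/" is the character right
    # after "." in ASCII, so it serves as the exclusive upper bound.
    lo = bisect.bisect_left(sorted_keys, prefix + ".")
    hi = bisect.bisect_left(sorted_keys, prefix + "/")
    result.extend(sorted_keys[lo:hi])
    return result

names = ["www.example.com", "example.com", "mail.example.com",
         "example-foo.com", "example.org", "badexample.com"]
keys = sorted(reverse_name(n) for n in names)
matches = select_domain(keys, "example.com")
```

The exact match is checked separately because a name such as example-foo.com sorts between the domain and its subdomains without belonging to it.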

For IP addresses it would be possible to select by the first N octets of an address (where N * 8 is less than or equal to the length of the mask that identifies the subnet to which the selected addresses should belong), and only then to apply the mask and compare the result with the subnet address.

This might be achieved even more simply with some other database system that supports indexing by arbitrary functions of the fields.
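The two-stage selection can be sketched in Python as follows. This is an illustration only, limited to IPv4 and with hypothetical function names. The string prefix test plays the role of the indexed lookup, and the standard ipaddress module applies the mask afterwards.

```python
import ipaddress

def octet_prefix(network):
    """Text prefix covering the first N whole octets, where N * 8 <= mask length."""
    n = network.prefixlen // 8
    octets = str(network.network_address).split(".")
    if n == 4:                        # a /32 mask: the whole address
        return ".".join(octets)
    return "".join(octet + "." for octet in octets[:n])

def select_subnet(addresses, subnet):
    """Stage 1: cheap prefix filter. Stage 2: exact mask comparison."""
    network = ipaddress.ip_network(subnet)
    prefix = octet_prefix(network)
    candidates = [a for a in addresses if a.startswith(prefix)]
    return [a for a in candidates if ipaddress.ip_address(a) in network]

addresses = ["192.168.4.10", "192.168.7.200", "192.168.8.1",
             "192.169.4.10", "10.0.0.1"]
selected = select_subnet(addresses, "192.168.4.0/22")
```

For a /22 subnet N is 2, so the first stage keeps everything under "192.168." and the second stage discards 192.168.8.1, which shares the prefix but lies outside the mask.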


History of changes


Download

Note: these links are relative, so if you obtained this document within the TAS package, they won't work. Please open this page from one of the web mirrors: