User Guide for CCCC

Table of Contents

Introduction

Report Contents

Counting Methods

Command line syntax

Configuration

Getting CCCC

Introduction

CCCC is a tool for the analysis of source code in various languages (primarily C++), which generates a report in HTML format on various measurements of the code processed. Although the tool was originally implemented to process C++ and ANSI C, facilities have recently been added to allow Java and Ada 95 source files to be recognized and processed as well. The name CCCC stands for 'C and C++ Code Counter'.

Measurements of source code of this kind are generally referred to as 'software metrics', or more precisely 'software product metrics' (as the term 'software metrics' also covers measurements of the software process, which are called 'software process metrics'). There is a reasonable consensus among modern opinion leaders in the software engineering field that measurement of some kind is probably a Good Thing, although there is less consensus on what is worth measuring and what the measurements mean.

CCCC has been developed as freeware, and is released in source code form, although precompiled binaries for Linux and Windows are included in the distribution for the convenience of users. Users are encouraged to compile the program themselves, and to modify the source to reflect their preferences and interests. The CCCC Reference Guide is intended to document the internals of the program to enable interested users to get started on hacking the source in this way.

The simplest way of using CCCC is just to run it with the names of a selection of files on the command line like this:

cccc my_types.h big.h small.h *.cc

CCCC will open each of the files specified on the command line (using standard wildcard processing where appropriate), and parse it using a parser selected to match the filename extension. As the parser processes each file, recognition of certain constructs will cause records to be written into an internal database. When all files have been processed, a report on the contents of the internal database will be generated in HTML format. By default the HTML report is written to the file cccc.htm in the current working directory, although the output filename is configurable.

The report contains a number of tables identifying the modules in the files submitted and covering:

Some of the data presented in the report may be displayed in an emphasized form (either with a bold or italic font, or with a red or yellow background). These are items which have been identified as lying outside ranges which have been laid down as desirable for the particular items. A bold font or red background indicates a value which exceeds a threshold defined as being dangerous for that measure, while italic fonts and yellow backgrounds indicate values below the danger threshold but still above a second lower threshold which has been laid down to indicate cause for concern. The two thresholds are configurable by the user of the tool: see the section below on configuring metric treatment for more details.
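
The two-threshold rule described above can be sketched as follows. This is a minimal illustration in Python, not CCCC's own code: the function name emphasis, its parameter names and the returned labels are hypothetical, and the strict "greater than" comparison at the boundary is an assumption.

```python
def emphasis(value, concern, danger):
    """Classify a metric value against two thresholds.

    `concern` is the lower threshold and `danger` the upper one.
    The names and the strict '>' comparison are assumptions made for
    illustration; CCCC's own boundary handling may differ.
    """
    if value > danger:
        return "extreme"   # shown in bold or on a red background
    if value > concern:
        return "high"      # shown in italics or on a yellow background
    return "normal"        # no emphasis


# Example using thresholds of 10 and 30 for a per-function measure.
for v in (4, 12, 45):
    print(v, emphasis(v, concern=10, danger=30))
```

With these example thresholds, a value of 12 would be flagged as a cause for concern and a value of 45 as dangerous.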

Report Contents

The report generated by CCCC normally consists of six tables plus a table of contents at the beginning and some informational material about CCCC itself at the end.

Tables generated

Table name    Description
Project Summary This table presents summary values of various measures over the body of source code submitted.
Procedural Summary This table presents values of procedural measures summed for each module identified in the code submitted.
Procedural Details This table presents values of the same procedural measures covered in the procedural summary report, but this time broken down within each module into the contributions of each member function of the module.
Structural Summary This table presents counts of fan-in and fan-out relationships to each module identified, and a derived metric called the Henry-Kafura/Shepperd measure, which is calculated as the square of the product of the fan-in and fan-out counts.
Structural Details This table presents lists of the modules contributing to the relationship counts reported in the structural summary.
Rejected Extents This table presents a list of code regions which the analyser was unable to parse.

Metrics displayed

Tag Metric Name Description
LOC Lines of Code This metric counts the lines of non-blank, non-comment source code in a function (LOCf), module (LOCm), or project (LOCp). LOC was one of the earliest metrics to come into use (principally because it is straightforward to measure).

It has an obvious relation to the size or complexity of a piece of code, and can be calibrated for use in prediction of maintenance effort, although concern has been expressed that use of this metric as a measure of programmer productivity may tend to encourage verbose programming practices and discourage desirable simplification.

MVG McCabe's Cyclomatic Complexity A measure of a body of code based on analysis of the cyclomatic complexity of the directed graph which represents the flow of control within each function. First proposed as a measure of the minimum number of test cases needed to ensure all parts of each function are exercised, it is now widely accepted as a measure for the detection of code which is likely to be error-prone and/or difficult to maintain.
COM Comment Lines A crude measure comparable to LOC of the extent of commenting within a region of code. Not very meaningful in isolation, but sometimes used in ratio with LOC or MVG to ensure that comments are distributed proportionately to the bulk or complexity of a region of code.
L_C,M_C LOC/COM, MVG/COM See above
FO,FOc,FOv
FI,FIc,FIv
Fan-out, Fan-in For a given module A, the fan-out is the number of other modules which the module A uses, while the fan-in is the number of other modules which use A.
See the section below on counting methods for a discussion of the distinction between the variants on each of these measures.
HKS, HKSv, HKSc Henry-Kafura/Shepperd measure This metric is derived by squaring the product of the fan-in and fan-out of each module. The original Henry-Kafura measure, which has been described as a measure of 'information flow complexity', includes a term for the length of the module under consideration, but CCCC uses the measure as modified by Shepperd, which omits this term on the basis that it debases the measure by combining two attributes which can and should be separately measured.
Corresponding to the variants on the fan-in and fan-out measures described above, similar variants are calculated on this metric.
NOM Number of modules Number of modules identified in the project. See discussion below about what constitutes a module.
WMC Weighted methods per class This measure, proposed by Chidamber and Kemerer, is a count of the number of functions defined in a module multiplied by a weighting factor. The only weighting algorithm suggested in the original formulation is a uniform weighting of one unit per function.
REJ Rejected lines This is a measure of the number of non-blank non-comment lines of code which were not successfully analysed by the parser. This is more of a validity check on the report generated than a metric of the code submitted: if the amount of code rejected is more than a small fraction (say 10%) of the total code processed, the meaningfulness of the numbers generated by the run must be in doubt.
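
The fan-in, fan-out and Henry-Kafura/Shepperd definitions in the table above can be illustrated with a short Python sketch. The module names and the uses mapping are invented for the example, the helper names fan_counts and hks are hypothetical, and the sketch does not model the visible and concrete variants; it is not how CCCC itself stores or counts relationships.

```python
from collections import defaultdict


def fan_counts(uses):
    """Return {module: (fan_in, fan_out)} from {module: set of modules it uses}."""
    fan_in = defaultdict(int)
    modules = set(uses)
    for client, suppliers in uses.items():
        for supplier in set(suppliers):
            modules.add(supplier)
            if supplier != client:          # a module using itself is not counted
                fan_in[supplier] += 1
    result = {}
    for m in modules:
        fan_out = len(set(uses.get(m, ())) - {m})
        result[m] = (fan_in[m], fan_out)
    return result


def hks(fan_in, fan_out):
    """Henry-Kafura/Shepperd measure: the square of fan-in times fan-out."""
    return (fan_in * fan_out) ** 2


# Hypothetical project with four modules.
uses = {
    "Parser": {"Lexer", "Database"},
    "Report": {"Database", "Parser"},
    "Database": {"Lexer"},
}
counts = fan_counts(uses)
for name in sorted(counts):
    fi, fo = counts[name]
    print(name, "FI =", fi, "FO =", fo, "HKS =", hks(fi, fo))
```

Note that a module with no clients or no suppliers always scores zero on this measure, because the product of the two counts is zero.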

Counting methods

CCCC implements simple algorithms to calculate each of the measures presented. The algorithms are intended to present a useful approximation to the underlying quantities, rather than meticulously exact counting: in general, manual counts based on the same definitions should agree with CCCC to within 2-3%. If larger discrepancies are discovered, or if this level of agreement is not considered adequate, users are welcome to modify the source code to implement closer agreement, or to change the counting behaviour to reflect a desired basis of calculation. The basic definitions of each count are as follows:

Command-line syntax

The following flags control the operation of CCCC:

The descriptions above are correct as of version 2.1.1. Additional functions may be available in that version or any other: please consult the source code file ccccmain.cc for full details of command line handling.

Configuration

CCCC can be configured by editing various configuration files, all of which are found in the library directory described above.

Treatment of metric values

The file cccc_tmt.dat allows the user to configure the thresholds at which the HTML report presents measures in different ways. The version of this file shipped with CCCC describes the configuration data and format:
# cccc_tmt.dat
#
# configuration file for treatment of metric values in CCCC
#
# lines in this file starting with '#' are treated as comments
# all other lines are treated as defining a record in a table of
# treatments for different metrics, which controls the display of
# values of that metric
#
# all metric values are displayed using the class CCCC_Metric, which may be
# viewed as ratio of two integers associated with a character string tag
# the denominator of the ratio defaults to 1, allowing simple counts to
# be handled by the same code as is used for ratios
#
# the tag associated with a metric is used as a key to lookup a record
# describing a policy for its display (class Metric_Treatment)
#
# the fields of each treatment record are as follows:
# TAG     the short string of characters used as the lookup key 
# T1, T2  two numeric thresholds which are the lower bounds for the ratio of
#         the metric's numerator and denominator beyond which the 
#         value is treated as high or extreme by the analyser
#         these will be displayed in emphasized fonts, and if the browser
#         supports the BGCOLOR attribute, extreme values will have a red
#         background, while high values will have a yellow background
#         the intent is that high values should be treated as suspicious but
#         tolerable in moderation, whereas extreme values should almost always
#         be regarded as defects (not necessarily that you will fix them)
# NT      a third threshold which suppresses calculation of ratios where
#         the numerator is lower than NT
#         the principal reason for doing this is to prevent ratios like L_C
#         being shown as *** (infinity) and displayed as extreme when the 
#         denominator is 0, providing the numerator is sufficiently low
#         suitable values are probably similar to those for T1
# W       the width of the metric (total number of digits)
# P       the precision of the metric (digits after the decimal point)
# Comment a free form field extending to the end of the line
#
#TAG      T1     T2     NT W P Comment
LOCf      30    100      0 6 0 Lines of code/function
LOCm     500   2000      0 6 0 Lines of code/module
LOCp  999999 999999      0 6 0 Lines of code/project
MVGf      10     30      0 6 0 Cyclomatic complexity/function
MVGm     200   1000      0 6 0 Cyclomatic complexity/module
MVGp  999999 999999      0 6 0 Cyclomatic complexity/project
COM   999999 999999      0 6 0 Comment lines
M_C        5     10      5 6 3 MVG/COM McCabe/comment line
L_C        7     30     20 6 3 LOC/COM Lines of code/comment line
CGS       10     30      0 6 3 Card & Glass Structural Complexity
CGSv       7     20      0 6 3 Card & Glass Structural Complexity (visible)
CGSc       7     20      0 6 3 Card & Glass Structural Complexity (concrete)
FI        12     20      0 6 0 Fan in (overall)
FIv        6     12      0 6 0 Fan in (visible uses only)
FIc        6     12      0 6 0 Fan in (concrete uses only)
FO        12     20      0 6 0 Fan out (overall)
FOv        6     12      0 6 0 Fan out (visible uses only)
FOc        6     12      0 6 0 Fan out (concrete uses only)
HKS      100   1000      0 6 0 Henry-Kafura/Shepperd measure (overall)
HKSv      30    100      0 6 0 Henry-Kafura/Shepperd measure (visible only)
HKSc      30    100      0 6 0 Henry-Kafura/Shepperd measure (concrete only)
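
The record layout described in the file's comments (TAG, T1, T2, NT, W, P, then a free-form comment) can be read with a few lines of Python, for instance to check an edited copy before running CCCC. This is only a sketch based on that description: the names Treatment and parse_treatments are hypothetical, whitespace-separated fields are assumed, and the thresholds are read as floating-point numbers as a precaution. CCCC's own parsing may differ.

```python
from typing import NamedTuple


class Treatment(NamedTuple):
    tag: str
    t1: float        # lower bound for "high" values
    t2: float        # lower bound for "extreme" values
    nt: float        # numerator threshold for suppressing ratios
    width: int       # total number of digits
    precision: int   # digits after the decimal point
    comment: str


def parse_treatments(text):
    """Parse treatment records, skipping blank lines and '#' comment lines."""
    table = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Six whitespace-separated fields, then a comment to end of line.
        fields = stripped.split(None, 6)
        tag, t1, t2, nt, width, precision = fields[:6]
        comment = fields[6] if len(fields) > 6 else ""
        table[tag] = Treatment(tag, float(t1), float(t2), float(nt),
                               int(width), int(precision), comment)
    return table


# Excerpt of the shipped file shown above.
SAMPLE = """\
#TAG      T1     T2     NT W P Comment
LOCf      30    100      0 6 0 Lines of code/function
L_C        7     30     20 6 3 LOC/COM Lines of code/comment line
"""

table = parse_treatments(SAMPLE)
for record in table.values():
    print(record.tag, record.t1, record.t2, record.comment)
```

A record that has fewer than six fields, or a non-numeric threshold, raises a ValueError in this sketch, which is usually what is wanted when checking a hand-edited file.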

Ignoring compiler-specific keywords

Some C++ compilers define keywords additional to the ones supported by CCCC, for example the keywords 'far' and 'near', which are used to specify the size of pointers in the segmented 16-bit MS-DOS architecture. CCCC will normally recognize these keywords as identifiers, and may well be unable to parse some constructs as a result. If problems of this kind occur, it is possible to configure CCCC to ignore the offending keyword by listing it in the configuration file cccc_ign.dat.

Supporting information

The report generated by CCCC contains a section of supporting information which is copied from the file cccc_inf.dat. This file can be changed to present whatever information is required (e.g. to make links point to a local copy of the manual, or to refer directly to information on metrics).

Getting CCCC

The best place to look for information about CCCC is the CCCC home page at http://www.fste.ac.cowan.edu.au.

Recent versions of CCCC are usually available for download from ftp://www.fste.ac.cowan.edu.au/pub/tlittlef.

Significant new production versions are announced in the comp.software.measurement USENET news group.