simonkagstrom wrote,

Kcov - gcov, lcov and bcov

(Short version: Kcov, a new project of mine for code coverage testing)

When developing software, I've often found measuring code coverage useful. A few years ago, I wrote shcov, which provides code coverage measurement for shell scripts. Traditionally, however, C and C++ code is more important, and the standard tool for measuring code coverage on UNIX is Gcov.

Gcov is unfortunately a bit crude: you have to compile your program with special switches, running the program outputs lots and lots of temporary files, and collecting the data is done with a separate command-line tool. It also adds instructions to count each basic block, so it makes your program (slightly) slower as well. The output can be made nicer by using Lcov (I used the lcov HTML for shcov), but this introduces yet another step in the build/run/collection sequence. Because of all this, I seldom use gcov despite its usefulness.

But can't code coverage collection be done in some better way? Yes, it can.

Thomas Neumann has written a nice tool called Bcov which improves a lot on the gcov experience: no need for special build switches, a single collection stage and a single report stage, and lcov-style output directly. How does it do this? Well, Thomas realised that all the information required for code coverage is already present in the binary in the form of DWARF debugging information. After all, GDB and addr2line are clearly able to translate instruction addresses into code lines already.
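To make the address-to-line idea a bit more concrete, here is a minimal sketch of the kind of lookup the debugging information enables. It is not bcov's or libdwarf's actual code: the struct line_row type, the lookup_line function and the sample addresses are all made up for illustration, and a real DWARF line table is decoded from a more elaborate format. The sketch only assumes that the table ends up as rows sorted by address.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a hypothetical address-to-line table, as a DWARF reader
 * might produce it. Rows are assumed to be sorted by ascending address. */
struct line_row {
    uintptr_t addr;     /* first instruction address belonging to this line */
    const char *file;   /* source file name */
    unsigned line;      /* source line number */
};

/* Return the row covering pc, i.e. the last row whose addr <= pc,
 * or NULL if pc lies below the first row. Plain binary search. */
const struct line_row *lookup_line(const struct line_row *rows, size_t n,
                                   uintptr_t pc)
{
    size_t lo = 0, hi = n;  /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (rows[mid].addr <= pc)
            lo = mid + 1;   /* candidate row is at mid or later */
        else
            hi = mid;       /* candidate row is before mid */
    }
    return lo == 0 ? NULL : &rows[lo - 1];
}

/* Made-up sample data, for illustration only. */
const struct line_row sample_rows[] = {
    { 0x401000, "main.c", 10 },
    { 0x401008, "main.c", 11 },
    { 0x401020, "main.c", 14 },
};
```

With this table, lookup_line(sample_rows, 3, 0x40100c) returns the row for main.c line 11, since 0x40100c falls between the second and third rows.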

Bcov, I presume, is short for Breakpoint COVerage. It uses ptrace to set up temporary breakpoints in the binary and then runs the application until it stops, notes which line the instruction corresponds to, clears the breakpoint, lets the application continue until it stops the next time, and so on. So bcov replaces gcov and lcov, with the restriction that it doesn't count the number of hits per line, only whether the line has been hit at all. And all of that in less than 1500 lines of code!
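The breakpoint trick described above can be sketched roughly as follows. This is not code from bcov or kcov, just a minimal illustration under a few assumptions: x86-64 Linux, a forked child instead of a separately executed program, and a single breakpoint on one function entry instead of one per source line. The names covered_function and run_with_breakpoint are hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical function whose entry we want coverage for. */
__attribute__((noinline)) void covered_function(void)
{
    volatile int x = 0; /* volatile keeps the body from being optimised away */
    x++;
}

/* Fork a child that calls fn() twice, plant a one-shot breakpoint on the
 * first instruction of fn, and return how many times the breakpoint fired
 * (1 if everything works), or -1 on error. x86-64 Linux only. */
int run_with_breakpoint(void (*fn)(void))
{
    void *addr = (void *)(uintptr_t)fn;
    int status = 0, hits = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP); /* stop so the parent can plant the breakpoint */
        fn();           /* first call hits the breakpoint */
        fn();           /* second call runs freely: breakpoint is gone */
        _exit(0);
    }

    /* Parent: wait for the child's initial stop. */
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* Save the original word and replace its lowest byte with int3 (0xcc). */
    errno = 0;
    long orig = ptrace(PTRACE_PEEKTEXT, pid, addr, NULL);
    if (orig == -1 && errno != 0)
        return -1;
    long patched = (orig & ~0xffL) | 0xcc;
    if (ptrace(PTRACE_POKETEXT, pid, addr, (void *)patched) < 0)
        return -1;

    ptrace(PTRACE_CONT, pid, NULL, NULL);
    while (waitpid(pid, &status, 0) > 0 && WIFSTOPPED(status)) {
        int sig = WSTOPSIG(status);

        if (sig == SIGTRAP) {
            struct user_regs_struct regs;

            ptrace(PTRACE_GETREGS, pid, NULL, &regs);
            if (regs.rip - 1 == (uintptr_t)addr) {
                hits++; /* a real tool would mark the source line as hit */
                /* Clear the breakpoint and re-run the original instruction. */
                ptrace(PTRACE_POKETEXT, pid, addr, (void *)orig);
                regs.rip -= 1;
                ptrace(PTRACE_SETREGS, pid, NULL, &regs);
            }
            sig = 0;    /* do not deliver the SIGTRAP to the child */
        }
        ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig);
    }
    return WIFEXITED(status) ? hits : -1;
}
```

Calling run_with_breakpoint(covered_function) should report a single hit even though the child calls the function twice, which mirrors the "hit or not hit" restriction: once the breakpoint is cleared, later executions of the same line go unnoticed.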


So what is Kcov then? Well, while bcov is a great tool, I still have a few problems with it, so I decided to try to make it better. First, while it's much easier to use than gcov+lcov, it's still two steps (collection and reporting), and I'd like to reduce that to a single one. Second, many applications are long-running (or even, like daemons, never terminate), which makes it hard to collect all output at the end, so I'd like continuous output. Third, I think the same principle will work for the kernel, which explains the name. That particular part isn't finished yet though, so for now K will have to stand for something else.

The output is still Lcov-like, and the usage is as simple as I wanted it:
  kcov /path/to/outdir executable [args for the executable]

The output will end up in /path/to/outdir and is written to disk around once per second. There are a couple of command-line options to control what should be included in the output, and skipping system headers often makes sense.


A few words on the implementation as well (the source is kept at GitHub). Bcov is written in C++, but my limited mental powers couldn't quite manage to get the hang of the source code, so I rewrote it in C. You can still see sections of Thomas's source code in many places, especially the ptrace code and the report code. To build kcov, you'll need glib, libdwarf and libelf, which should be available on all Linux distributions.

Have fun, and thanks to Thomas for the excellent Bcov!