So you want to write a Valgrind tool? Here are some instructions
that may help.
2.1. Introduction
The key idea behind Valgrind's architecture is the division
between its core and tools.
The core provides the common low-level infrastructure to
support program instrumentation, including the JIT
compiler, low-level memory manager, signal handling and a
thread scheduler. It also provides certain services that
are useful to some but not all tools, such as support for error
recording, and support for replacing heap allocation functions such as
malloc.
But the core leaves certain operations undefined, which
must be filled in by tools. Most notably, tools define how program
code should be instrumented. They can also call certain
functions to indicate to the core that they would like to use
certain services, or be notified when certain interesting events
occur. But the core takes care of all the hard work.
2.2. Basics
2.2.1. How tools work
Tools must define various functions, called by Valgrind's core, for
instrumenting programs. They are then linked against Valgrind's core
to define a complete Valgrind tool, which is used when the --tool
option selects it.
2.2.2. Getting the code
To write your own tool, you'll need the Valgrind source code. You'll
need a clone from the git repository for the automake/autoconf
build instructions to work. See the Valgrind website for information
about how to clone the repository.
2.2.3. Getting started
Valgrind uses GNU automake and
autoconf for the creation of Makefiles
and configuration. But don't worry, these instructions should be enough
to get you started even if you know nothing about those tools.
In what follows, all filenames are relative to Valgrind's
top-level directory valgrind/.
Choose a name for the tool, and a two-letter abbreviation that can
be used as a short prefix. We'll use
foobar and
fb as an example.
Make three new directories foobar/,
foobar/docs/ and
foobar/tests/.
Create an empty file foobar/tests/Makefile.am.
Copy none/Makefile.am into
foobar/. Edit it by replacing all
occurrences of the strings
"none",
"nl_" and
"nl-" with
"foobar",
"fb_" and
"fb-" respectively.
Copy none/nl_main.c into
foobar/, renaming it as
fb_main.c. Edit it by changing the
details lines in
nl_pre_clo_init to something appropriate for the
tool. These fields are used in the startup message, except for
bug_reports_to which is used if a
tool assertion fails. Also, replace the string
"nl_" throughout with
"fb_" again.
Edit Makefile.am, adding the new directory
foobar to the
TOOLS or
EXP_TOOLS variable.
Edit configure.ac, adding
foobar/Makefile and
foobar/tests/Makefile to the
AC_OUTPUT list.
Run:
./autogen.sh
./configure --prefix=`pwd`/inst
make
make install
It should automake, configure and compile without errors,
putting copies of the tool in
foobar/ and
inst/lib/valgrind/.
You can test it with a command like:
inst/bin/valgrind --tool=foobar date
(almost any program should work;
date is just an example).
The output should be something like this:
==738== foobar-0.0.1, a foobarring tool.
==738== Copyright (C) 2002-2017, and GNU GPL'd, by J. Programmer.
==738== Using Valgrind-3.14.0.GIT and LibVEX; rerun with -h for copyright info
==738== Command: date
==738==
Tue Nov 27 12:40:49 EST 2017
==738==
The tool does nothing except run the program uninstrumented.
These steps don't have to be followed exactly -- you can choose
different names for your source files, and use a different
--prefix for
./configure.
Now that we've set up, built and tested the simplest possible tool,
onto the interesting stuff...
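2.2.4. Writing the code
A tool must define at least these four functions:
pre_clo_init()
post_clo_init()
instrument()
fini()
The names can be different to the above, but these are the usual
names. The first one is registered using the macro
VG_DETERMINE_INTERFACE_VERSION.
The last three are registered using the
VG_(basic_tool_funcs) function.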
In addition, if a tool wants to use some of the optional services
provided by the core, it may have to define other functions and tell the
core about them.
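The registration pattern and the order of calls can be modelled outside
Valgrind. The following is a self-contained sketch, not Valgrind code:
ToolFuncs, register_basic_tool_funcs and run_core are hypothetical
stand-ins for the core, and a real instrument function receives a VEX IR
superblock rather than an integer. It only illustrates when each of the
tool's functions would be called.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the core's table of tool callbacks. */
typedef struct {
    void (*post_clo_init)(void);
    int  (*instrument)(int block_id);  /* real tools receive VEX IR, not an int */
    void (*fini)(int exitcode);
} ToolFuncs;

static ToolFuncs registered;  /* filled in by the tool during pre_clo_init */
static char call_log[64];     /* order of calls: P(re), O(post), I(nstrument), F(ini) */

static void log_call(char c)
{
    size_t n = strlen(call_log);
    if (n + 1 < sizeof call_log) {
        call_log[n] = c;
        call_log[n + 1] = '\0';
    }
}

/* Hypothetical stand-in for VG_(basic_tool_funcs): remember the last three functions. */
static void register_basic_tool_funcs(void (*post)(void),
                                      int (*instr)(int),
                                      void (*fini)(int))
{
    registered.post_clo_init = post;
    registered.instrument = instr;
    registered.fini = fini;
}

/* The tool's own functions, using the fb_ prefix from the example above. */
static void fb_post_clo_init(void)      { log_call('O'); }
static int  fb_instrument(int block_id) { log_call('I'); return block_id; }
static void fb_fini(int exitcode)       { (void)exitcode; log_call('F'); }

static void fb_pre_clo_init(void)
{
    log_call('P');
    register_basic_tool_funcs(fb_post_clo_init, fb_instrument, fb_fini);
}

/* Hypothetical stand-in for the core: drives the tool through its lifecycle. */
static const char *run_core(void (*pre_clo_init)(void), int n_blocks)
{
    call_log[0] = '\0';
    pre_clo_init();
    assert(registered.instrument != NULL);  /* the tool must have registered */
    /* command line options would be processed here */
    registered.post_clo_init();
    for (int i = 0; i < n_blocks; i++)
        registered.instrument(i);           /* once per translated code block */
    registered.fini(0);
    return call_log;
}
```

Calling run_core(fb_pre_clo_init, 2) returns the string "POIIF":
pre_clo_init first, post_clo_init once option processing is done,
instrument once per code block, and fini last.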
2.2.5. Initialisation
Most of the initialisation should be done in
pre_clo_init. Only use
post_clo_init if a tool provides command line
options and must do some initialisation after option processing takes
place ("clo" stands for "command line
options").
First of all, various "details" need to be set for a tool, using
the functions VG_(details_*). Some are
compulsory, some aren't. Some are used when constructing the startup
message; the bug_reports_to detail is used
if VG_(tool_panic) is ever called or
a tool assertion fails. Others have other uses.
Second, various "needs" can be set for a tool, using the functions
VG_(needs_*). They are mostly booleans, and can
be left untouched (they default to False). They
determine whether a tool can do various things such as: record, report
and suppress errors; process command line options; wrap system calls;
record extra information about heap blocks; etc.
For example, if a tool wants the core's help in recording and
reporting errors, it must call
VG_(needs_tool_errors) and provide definitions of
eight functions for comparing errors, printing out errors, reading
suppressions from a suppressions file, etc. While writing these
functions requires some work, it's much less than doing error handling
from scratch because the core is doing most of the work.
Third, the tool can indicate which events in the core it wants to be
notified about, using the functions VG_(track_*).
These include things such as heap blocks being allocated, the stack
pointer changing, a mutex being locked, etc. If a tool wants to know
about such an event, it should provide a pointer to a function, which
will be called when that event happens.
For example, if the tool wants to be notified when a new heap block
is allocated, it should call
VG_(track_new_mem_heap) with an appropriate
function pointer, and the assigned function will be called each time
this happens.
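The callback mechanism can be sketched in plain C. This is a model under
stated assumptions, not the Valgrind API: track_new_mem_heap and
core_heap_block_allocated below are hypothetical stand-ins, and the
callback's parameter list is illustrative. The real callback types are
declared in include/pub_tool_tooliface.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative callback type; see pub_tool_tooliface.h for the real one. */
typedef void (*NewMemHeapFn)(uintptr_t addr, size_t len, bool is_inited);

/* NULL means no tool has asked to be notified about this event. */
static NewMemHeapFn new_mem_heap_cb = NULL;

/* Hypothetical stand-in for a VG_(track_*) registration function. */
static void track_new_mem_heap(NewMemHeapFn f)
{
    assert(f != NULL);  /* a tool that registers must supply a function */
    new_mem_heap_cb = f;
}

/* Hypothetical stand-in for the core noticing a heap allocation. */
static void core_heap_block_allocated(uintptr_t addr, size_t len)
{
    if (new_mem_heap_cb != NULL)
        new_mem_heap_cb(addr, len, false);
}

/* Tool side: keep running totals of what has been allocated. */
static size_t fb_heap_blocks = 0;
static size_t fb_heap_bytes = 0;

static void fb_new_mem_heap(uintptr_t addr, size_t len, bool is_inited)
{
    (void)addr;
    (void)is_inited;
    fb_heap_blocks++;
    fb_heap_bytes += len;
}
```

In this model the tool would call track_new_mem_heap(fb_new_mem_heap)
during initialisation. Events that occur before registration are simply
not delivered, which mirrors how an untracked event costs the tool
nothing.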
More information about "details", "needs" and "trackable events"
can be found in
include/pub_tool_tooliface.h.
2.2.6. Instrumentation
instrument is the interesting one. It
allows you to instrument VEX IR, which is
Valgrind's RISC-like intermediate language. VEX IR is described
in the comments of the header file
VEX/pub/libvex_ir.h.
The easiest way to instrument VEX IR is to insert calls to C
functions when interesting things happen. See the tool "Lackey"
(lackey/lk_main.c) for a simple example of this, or
Cachegrind (cachegrind/cg_main.c) for a more
complex example.
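The idea of inserting calls to C functions can be illustrated with a toy
model. The sketch below does not use VEX IR or any Valgrind type: Stmt,
Block, add_stmt and run_block are hypothetical, and the statement kinds
are drastically simplified. It shows only the shape of an instrumentation
pass, which copies a block and inserts a helper call before each
interesting statement (here, loads).

```c
#include <assert.h>
#include <stddef.h>

/* Toy statement kinds; real VEX IR is far richer. */
typedef enum { ST_OTHER, ST_LOAD, ST_STORE, ST_CALL } StmtKind;

typedef struct {
    StmtKind kind;
    void (*helper)(void);  /* only meaningful for ST_CALL */
} Stmt;

#define MAX_STMTS 32

typedef struct {
    Stmt   stmts[MAX_STMTS];
    size_t n;
} Block;

static void add_stmt(Block *b, StmtKind kind, void (*helper)(void))
{
    assert(b->n < MAX_STMTS);
    b->stmts[b->n].kind = kind;
    b->stmts[b->n].helper = helper;
    b->n++;
}

/* The C function the instrumented code will call at run time. */
static unsigned long n_loads = 0;
static void count_load(void) { n_loads++; }

/* Instrumentation pass: copy the block, inserting a call before each load. */
static Block fb_instrument(const Block *in)
{
    Block out = { .n = 0 };
    for (size_t i = 0; i < in->n; i++) {
        if (in->stmts[i].kind == ST_LOAD)
            add_stmt(&out, ST_CALL, count_load);
        add_stmt(&out, in->stmts[i].kind, in->stmts[i].helper);
    }
    return out;
}

/* "Execute" a block: in this model only inserted calls have an effect. */
static void run_block(const Block *b)
{
    for (size_t i = 0; i < b->n; i++)
        if (b->stmts[i].kind == ST_CALL)
            b->stmts[i].helper();
}
```

The point of the model is that the block is instrumented once but the
inserted calls fire every time it runs, so the helper should be cheap.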
2.2.7. Finalisation
This is where you can present the final results, such as a summary
of the information collected. Any log files should be written out at
this point.
2.2.8. Other Important Information
Please note that the core/tool split infrastructure is quite
complex and not brilliantly documented. Here are some important points,
but there are undoubtedly many others that I should note but haven't
thought of.
The files include/pub_tool_*.h contain all the
types, macros, functions, etc. that a tool should (hopefully) need, and are
the only .h files a tool should need to
#include. They have a reasonable amount of
documentation in them that should hopefully be enough to get you going.
Note that you can't use anything from the C library (there
are deep reasons for this, trust us). Valgrind provides an
implementation of a reasonable subset of the C library, details of which
are in pub_tool_libc*.h.
When writing a tool, in theory you shouldn't need to look at any of
the code in Valgrind's core, but in practice it might be useful sometimes to
help understand something.
The include/pub_tool_basics.h and
VEX/pub/libvex_basictypes.h files have some basic
types that are widely used.
Ultimately, the tools distributed (Memcheck, Cachegrind, Lackey, etc.)
are probably the best documentation of all, for the moment.
The VG_ macro is used
heavily. This just prepends a longer string to names to avoid
potential namespace clashes. It is defined in
include/pub_tool_basics.h.
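The effect of such a macro can be seen in a standalone sketch. The two
definitions below are assumed to resemble those in
include/pub_tool_basics.h, so check the header for the authoritative
form. The add function and the STRINGIFY helpers are hypothetical and
exist only for the demonstration.

```c
#include <string.h>

/* Helpers for turning an expanded macro into a string (demonstration only). */
#define STRINGIFY_(x) #x
#define STRINGIFY(x)  STRINGIFY_(x)

/* Assumed to resemble the definitions in include/pub_tool_basics.h. */
#define VGAPPEND(str1, str2) str1##str2
#define VG_(str)             VGAPPEND(vgPlain_, str)

/* Hypothetical function: this actually defines a symbol named vgPlain_add. */
static int VG_(add)(int a, int b)
{
    return a + b;
}

/* Returns the name the preprocessor produced for VG_(add). */
static const char *mangled_name_of_add(void)
{
    return STRINGIFY(VG_(add));
}
```

Under these definitions, VG_(add)(2, 3) and vgPlain_add(2, 3) are the
same call, and mangled_name_of_add() returns "vgPlain_add".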
There are some assorted notes about various aspects of the
implementation in docs/internals/. Much of it
isn't that relevant to tool-writers, however.
2.3. Advanced Topics
Once a tool becomes more complicated, there are some extra
things you may want/need to do.
2.3.1. Debugging Tips
Writing and debugging tools is not trivial. Here are some
suggestions for solving common problems.
If you are getting segmentation faults in C functions used by your
tool, the usual GDB command:
gdb <prog> core
usually gives the location of the segmentation fault.
If you want to debug C functions used by your tool, there are
instructions on how to do so in the file
README_DEVELOPERS.
If you are having problems with your VEX IR instrumentation, it's
likely that GDB won't be able to help at all. In this case, Valgrind's
--trace-flags option is invaluable for observing the
results of instrumentation.
If you just want to know whether a program point has been reached,
using the OINK macro (in
include/pub_tool_libcprint.h) can be easier than
using GDB.
The other debugging command line options can be useful too (run
valgrind --help-debug for the
list).
2.3.2. Suppressions
If your tool reports errors and you want to suppress some common
ones, you can add suppressions to the suppression files. The relevant
files are *.supp; the final suppression
file is aggregated from these files by combining the relevant
.supp files depending on the versions of Linux, X
and glibc on a system.
Suppression types have the form
tool_name:suppression_name. The
tool_name here is the name you specify
for the tool during initialisation with
VG_(details_name).
2.3.3. Documentation
If you are feeling conscientious and want to write some
documentation for your tool, please use XML as the rest of Valgrind does.
The file docs/README has more details on getting
the XML toolchain to work; this can be difficult, unfortunately.
To write the documentation, follow these steps (using
foobar as the example tool name
again):
The docs go in
foobar/docs/, which you will
have created when you started writing the tool.
Copy the XML documentation file for the tool Nulgrind from
none/docs/nl-manual.xml to
foobar/docs/, and rename it to
foobar/docs/fb-manual.xml.
Note: there is a tetex bug
involving underscores in filenames, so don't use '_'.
Write the documentation. There are some helpful bits and
pieces on using XML markup in
docs/xml/xml_help.txt.
Include it in the User Manual by adding the relevant entry to
docs/xml/manual.xml. Copy and edit an
existing entry.
Include it in the man page by adding the relevant entry to
docs/xml/valgrind-manpage.xml. Copy and
edit an existing entry.
Validate foobar/docs/fb-manual.xml using
the following command from within docs/:
make valid
You may get errors that look like this:
./xml/index.xml:5: element chapter: validity error : No declaration for
attribute base of element chapter
Ignore (only) these -- they're not important.
Because the XML toolchain is fragile, it is important to ensure
that fb-manual.xml won't break the documentation
set build. Note that just because an XML file happily transforms to
html does not necessarily mean the same holds true for pdf/ps.
You can (re-)generate the HTML docs while you are writing
fb-manual.xml to help you see how it's looking.
The generated files end up in
docs/html/. Use the following
command, within docs/:
make html-docs
When you have finished, try to generate PDF and PostScript output to
check all is well, from within docs/:
make print-docs
Check the output .pdf and
.ps files in
docs/print/.
Note that the toolchain is even more fragile for the print docs,
so don't feel too bad if you can't get it working.
2.3.4. Regression Tests
Valgrind has some support for regression tests. If you want to
write regression tests for your tool:
The tests go in foobar/tests/,
which you will have created when you started writing the tool.
Write foobar/tests/Makefile.am. Use
memcheck/tests/Makefile.am as an
example.
Write the tests, .vgtest test
description files, .stdout.exp and
.stderr.exp expected output files.
(Note that Valgrind's output goes to stderr.) Some details on
writing and running tests are given in the comments at the top of
the testing script
tests/vg_regtest.
Write a filter for stderr results
foobar/tests/filter_stderr. It can
call the existing filters in
tests/. See
memcheck/tests/filter_stderr for an
example; in particular note the
$dir trick that ensures the filter
works correctly from any directory.
2.3.5. Profiling
Lots of profiling tools have trouble running Valgrind. For example,
trying to use gprof is hopeless.
Probably the best way to profile a tool is with OProfile on Linux.
You can also use Cachegrind on it. Read
README_DEVELOPERS for details on running Valgrind under
Valgrind; it's a bit fragile but can usually be made to work.
2.3.6. Other Makefile Hackery
If you add any directories under
foobar/, you will need to add
an appropriate Makefile.am to it, and add a
corresponding entry to the AC_OUTPUT
list in configure.ac.
If you add any scripts to your tool (see Cachegrind for an
example) you need to add them to the
bin_SCRIPTS variable in
foobar/Makefile.am and possibly also to the
AC_OUTPUT list in
configure.ac.
2.3.7. The Core/tool Interface
The core/tool interface evolves over time, but it's pretty stable.
We deliberately do not provide backward compatibility with old interfaces,
because it is too difficult and too restrictive. We view this as a good
thing -- if we had to be backward compatible with earlier versions, many
improvements now in the system could not have been added.
Because tools are statically linked with the core, if a tool compiles
successfully then it should be compatible with the core. We would not
deliberately violate this property by, for example, changing the behaviour
of a core function without changing its prototype.
2.4. Final Words
Writing a new Valgrind tool is not easy, but the tools you can write
with Valgrind are among the most powerful programming tools there are.
Happy programming!