Valgrind Home Information Source Code Documentation Contact How to Help Gallery

COMPILE TIME

Debugging Memory Problems

by Steve Best
May 2003

Courtesy of Linux Magazine

Dynamic memory allocation seems straightforward enough: you allocate memory on demand -- using malloc() or one of its variants -- and free memory when it's no longer needed. Indeed, memory management would be that easy -- if only we programmers never made mistakes. Alas, we do make mistakes (from time to time) and memory management problems do occur.

For example, a memory leak occurs when memory is allocated but never freed. Leaks can obviously be caused if you malloc() without a corresponding free(), but leaks can also be inadvertently caused if a pointer to dynamically-allocated memory is deleted, lost, or overwritten. Memory corruption can occur when allocated (and in use) memory is overwritten accidentally. Buffer overruns -- caused by writing past the end of a block of allocated memory -- frequently corrupt memory.

Regardless of the root cause, memory management errors can have unexpected, even devastating effects on application and system behavior. With dwindling available memory, processes and entire systems can grind to a halt, while corrupted memory often leads to spurious crashes. System security is also susceptible to buffer overruns. Worse, it might take days before evidence of a real problem appears. Today, it's common for Linux systems to have a gigabyte of main memory. If a program leaks a small amount of memory, it'll take some time before the application and system show symptoms of a problem. Memory management errors can be quite insidious and very difficult to find and fix.

This month, let's take a look at Valgrind, a tool that can help detect common memory management errors. We'll review the basics, write some "buggy" code, and then use Valgrind to find the mistakes. Valgrind was written by Julian Seward and is available under the GNU Public License.

Dynamic Memory Functions

Of all of the library calls in Linux, only four manage memory: malloc(), calloc(), realloc(), and free(). All of these functions have prototypes in the stdlib.h include file.

  • malloc() allocates a memory block. It's prototype is void* malloc(size_t size) and the single argument is the number of bytes of memory to allocate. If the allocation is successful, malloc returns a pointer to the memory. If memory allocation fails for some reason (for example, if the system is out of memory), malloc() returns a NULL pointer.
  • calloc() allocates an array in memory and initializes all of the memory to zero (with malloc(), the allocated memory is uninitialized). void* calloc(size_t nmemb, size_t size) is the prototype. The first argument is the number of elements in the array and the second argument is the size (in bytes) of each element. Like malloc(), calloc() returns a pointer if the memory allocation was successful, and NULL otherwise.
  • realloc() is defined as void* realloc (void *ptr, size_t size). realloc() changes the size of the object referenced by the pointer to a new size specified by the second argument. realloc() returns a pointer to the moved block of memory.
  • free() deallocates a memory block. It takes a pointer as an argument, as shown in its prototype, void free (void *ptr), and releases that memory.

While the API for memory management is unusually small, the number and kind of memory errors that can occur is quite substantial, including reading and using uninitialized memory; reading/writing memory after it has been freed; reading/writing from memory past the allocated size; reading/writing inappropriate areas on the stack; and memory leaks.

Luckily, Valgrind can detect all of those problems. When a program is run under Valgrind, all memory reads and writes are inspected and all calls to malloc()/new() and free()/ delete() are intercepted.

Installing Valgrind

Valgrind is closely tied to the architecture of the operating system. Currently, it's only supported on Linux x86 machines with kernels from the 2.2.x and 2.4.x series and glibc 2.1.x or 2.2.x.

You can get the source for Valgrind at http://www.valgrind.org/. (At the time of this writing, the latest stable release of Valgrind is 1.0.4. The latest development release is 1.9.4.) Download the latest stable release (or the latest development release, depending on your sense of adventure) and build the software:

% bunzip2 valgrind-1.0.4.tar.bz2
% tar xvf valgrind-1.0.4.tar
% cd valgrind-1.0.4
% ./configure
% make
% make install

One great feature of Valgrind is that it doesn't require you to build (or re-build) your application in any special way. Simply place valgrind right in front of the program you want to inspect. For example, the command

% valgrind ls -all

inspects and monitors the ls command. (Running this command on Red Hat Linux 8.0 showed no errors.)

The output from Valgrind has the following format.

==20691== 8192 bytes in 1 blocks are definitely lost in loss record 1 of 1
==20691==    at 0x40048434: malloc (vg_clientfuncs.c:100)
==20691==    by 0x806910C: fscklog_init (fsckwsp.c:2491)
==20691==    by 0x806E7D0: initial_processing (xchkdsk.c:2101)
==20691==    by 0x806C70D: main (xchkdsk.c:289)

The ==I<xxxxx>== string prefixes each line of Valgrind-specific output. (Application-specific output does not have the prefix.) In the sample output shown above, the 20691 is the process ID. The message indicates that there is a memory leak of 8192 bytes and provides diagnostics and a kind of trace to direct you to the error. The second and subsequent lines indicate that the memory is initially allocated on line 2491 in the routine fscklog_init() (in the file fsckwsp.c); fscklog_init() was called from initial_ processing() at line number 2101; and main() called initial_processing(). By the way, if fscklog_init() is called more than once in the initial_processing() routine, the line number clearly identifies which call caused the problem.

Losing Your Memory

Let's use Valgrind to find some common memory errors. The first sample program, sample1.c, shown in Listing One, has more than one memory leak. The code in Listing One allocates two 512-byte blocks of memory and then sets the pointer to the block to the second block. As a result, the address of the second block is lost, causing a memory leak. There are also 512 10-byte blocks of memory that are never freed.

Listing One: sample1.c, a program with a memory leak

#include <stdlib.h>
#include <stdio.h>

void get_mem() {
  char *ptr;
  ptr = (char *) malloc (10);  /* memory not freed */
}

int main(void) {
  int i;
  char *ptr1, *ptr2;
  ptr1 = (char *) malloc (512);
  ptr2 = (char *) malloc (512);
  ptr2 = ptr1; /* causes the memory leak of ptr1 */
  free(ptr2);
  free(ptr1);
  for ( i = 0; i < 512; i++ ) {
    get_mem();
  }
}

Compile and analyze sample1.c using the following commands:

% gcc sample1.c -o sample1 
% valgrind -v --leak-check=yes ./sample1

Valgrind produces the output shown in Figure One, correctly identifying the 512-byte and the 512 10-byte memory leaks. The -v provides verbose feedback and the --leak-check=yes option searches for memory leaks when the client program exits.

Figure One: Valgrind output for Listing One

==2009== 512 bytes in 1 blocks are definitely lost in loss record 1 of 2
==2009==    at 0x400483E4: malloc (vg_clientfuncs.c:100)
==2009==    by 0x80484D4: main (in /jfs/article/sample1)
==2009==    by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==2009==    by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample1)
==2009== 
==2009== 5120 bytes in 512 blocks are definitely lost in loss record 2 of 2
==2009==    at 0x400483E4: malloc (vg_clientfuncs.c:100)
==2009==    by 0x80484A0: get_mem (in /jfs/article/sample1)
==2009==    by 0x8048519: main (in /jfs/article/sample1)
==2009==    by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==2009== 
==2009== LEAK SUMMARY:
==2009==    definitely lost: 5632 bytes in 513 blocks.

sample2.c, shown in Listing Two, demonstrates another common memory error: reading beyond the end of an array of bytes. Again, build the sample code and run Valgrind to analyze it.

Listing Two: sample2.c, a program that tries to access memory 
beyond the end of an array

#include <stdlib.h>
#include <stdio.h>

int main(void) {
  char *chptr;
  char *chptr1;
  int i = 1;
  chptr = (char *) malloc(512);
  chptr1 = (char *) malloc (512);
  for ( i; i <= 513; i++ ) {
    chptr[i] = '?';        /* error when i = 513 invalid write */
    chptr1[i] = chptr[i];  /* error when i = 513 invalid read and write */
  }

  free(chptr1);
  free(chptr);
}
% gcc sample2.c -o sample2
% valgrind ./sample2

As you can see from the output in Figure Two, references to element 513 in the two arrays cause a write error, a read error, and another write error. The message Address 0x40CA0224 is 0 bytes after a block of size 512 alloc'd indicates that there is no storage beyond the end of the array of 512 bytes.

Figure Two: Valgrind output for Listing Two

==3016==  ...
==3016== Invalid write of size 1
==3016==    at 0x80484DA: main (in /jfs/article/sample2)
==3016==    by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016==    by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
==3016==    Address 0x40CA0224 is 0 bytes after a block of size 512 alloc'd
==3016==    at 0x400483E4: malloc (vg_clientfuncs.c:100)
==3016==    by 0x80484AA: main (in /jfs/article/sample2)
==3016==    by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016==    by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
==3016== 
==3016== Invalid read of size 1
==3016==    at 0x80484EB: main (in /jfs/article/sample2)
==3016==    by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016==    by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
==3016==    Address 0x40CA0224 is 0 bytes after a block of size 512 alloc'd
==3016==    at 0x400483E4: malloc (vg_clientfuncs.c:100)
==3016==    by 0x80484AA: main (in /jfs/article/sample2)
==3016==    by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016==    by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
==3016== 
==3016== Invalid write of size 1
==3016==    at 0x80484EB: main (in /jfs/article/sample2)
==3016==    by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016==    by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
==3016==    Address 0x40CA0454 is 0 bytes after a block of size 512 alloc'd
==3016==    at 0x400483E4: malloc (vg_clientfuncs.c:100)
==3016==    by 0x80484BF: main (in /jfs/article/sample2)
==3016==    by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016==    by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)

Finally, to show how Valgrind finds invalid use of uninitialized memory, let's look at the results of analyzing the Journaled File System's (JFS) fsck utility. As before, we fun fsck under the auspices of Valgrind:

% valgrind -v -leak-check=yes \ fsck.jfs /dev/hdb1 

Figure Three shows a snippet of the output.

Figure Three: Valgrind output for the Journaled File System utility fsck.jfs
...
==12903== Conditional jump or move depends on uninitialised value(s)
==12903==    at 0x8079FCC: __divdi3 (in /sbin/fsck.jfs)
==12903==    by 0x805CB0E: validate_super (fsckmeta.c:2331)
==12903==    by 0x805C266: validate_repair_superblock (fsckmeta.c:1833)
==12903==    by 0x806E2B5: initial_processing (xchkdsk.c:1968)

The validate_super() routine can be found in the jfsutils package in jfsutils-1.x.x/fsck/fsckmeta.c. Listing Three shows a portion of the code:

Listing Three: A code snippet from fsckmeta.c

int validate_super( int which_super ) {
  int64_t bytes_on_device;

  /* get physical device size */
  vfs_rc = ujfs_get_dev_size(Dev_IOPort, &bytes_on_device);
  .
  .
  .
  dev_blks_on_device = bytes_on_device / Dev_blksize; /* Line 2331 */
  if (sb_ptr->s_pbsize != Dev_blksize) {
    ... 

The output from Valgrind indicates that an uninitialized variable is used on line 2331 -- that's the line that says, dev_blks_on_device = bytes_ on_device / Dev_blksize. As you can see, bytes_on_device is not set before its used. Using Valgrind, this memory management problem was identified and fixed before an end-user ever came across it.

Cache Profiling

Valgrind can also perform cache simulations and annotate your source line-by-line with the number of cache misses. In particular, it records:

  • L1 instruction cache reads and misses
  • L1 data cache reads and read misses, and writes and write misses
  • L2 unified cache reads and read misses, and writes and writes misses

L1 is a small amount of SRAM memory that's used as a cache. L1 temporarily stores instructions and data, ensuring that the processor has a steady supply of data to process while memory catches up delivering new data. L1 is integrated or packaged within the same module as the processor. Level 2 caching is performed in L2.

Valgrind's cachegrind tool is used to do cache profiling -- you use it just like valgrind. For example, the following command looks at the fsck.jfs program:

% cachegrind fsck.jfs -n -v /dev/hdb1

The output of cachegrind is collected in the file cachegrind.out. Sample output from analyzing fsck.jfs is shown in Figure Four.

Figure Four: cachegrind.out, cachegrind's analysis of fsck.jfs

==11004== I   refs:      99,813,615
==11004== I1  misses:         4,301
==11004== L2i misses:         3,210
==11004== I1  miss rate:        0.0%
==11004== L2i miss rate:        0.0%
==11004== 
==11004== D   refs:      68,846,938  (65,916,678 rd + 2,930,260 wr)
==11004== D1  misses:        63,883  (    37,768 rd +    26,115 wr)
==11004== L2d misses:        37,485  (    14,330 rd +    23,155 wr)
==11004== D1  miss rate:        0.0% (       0.0%   +       0.8%  )
==11004== L2d miss rate:        0.0% (       0.0%   +       0.7%  )
==11004== 
==11004== L2 refs:           68,184  (    42,069 rd +    26,115 wr)
==11004== L2 misses:         40,695  (    17,540 rd +    23,155 wr)
==11004== L2 miss rate:         0.0% (       0.0%   +       0.7%  )

Events recorded abbreviations are:
    Ir : I cache reads (ie. instructions executed) 
    I1mr: I1 cache read misses 
    I2mr: L2 cache instruction read misses 
    Dr : D cache reads (ie. memory reads) 
    D1mr: D1 cache read misses 
    D2mr: L2 cache data read misses 
    Dw : D cache writes (ie. memory writes) 
    D1mw: D1 cache write misses 
    D2mw: L2 cache data write misses

Next, you can annotate the output from cachegrind, by using vg_annotate:

% vg_annotate

vg_annotate produces output like that shown in Figure Five. The figure shows one annotation for the routine dmap_pmap_ verify(). The entry states that 88,405,584 instructions of 99,813,615 total instructions were spent in dmap_pmap_ verify(). This information is invaluable for deciding where to tune the program. You can also further annotate dmap_pmap_ verify() to find the actual instructions executed in that routine.

Figure Five: annotation of one entry of cachegrind for fsck.jfs

------------------------------
Ir I1mr I2mr    Dr   D1mr   D2mr    Dw   D1mw   D2mw  
------------------------------
88,405,584   23   23 61,740,960 14,535   98 576,828  9  9  fsckbmap.c:dmap_pmap_verify

END

For a complete description of cachegrind, see the Valgrind User's Manual in the Valgrind distribution.

Some Limitations Of Valgrind

There are two issues that you should be aware of when analyzing an application with Valgrind. First, an application running under Valgrind consumes more memory. Second, your program will run slower.

However, these two minor annoyances shouldn't stop you from using this powerful memory management debug tool.


About the Author

Steve Best works in the Linux Technology Center of IBM in Austin, Texas. He is currently working on the Journaled File System (JFS) for Linux project. Steve has done extensive work in operating system development with a focus in the areas of file systems, internationalization, and security. He can be reached at sbest@us.ibm.com.

You can download the sample programs used in this article here.



Bad, Bad Bug!

Copyright © 2000-2013 Valgrind™ Developers

Best Viewed With A(ny) Browser