gdb

CS 271

Prof. Calvin

async

slice

Announcements

You have a robust code base and should just be debugging.
Let's learn a debugger.
I stole this from someone who spent 13 years working on a Ph.D. and is now at AWS
(Samuel Huang)

Today

gdb: intro/invocation; breakpoints; debugging; conditions; pointers
valgrind: memory leaks

What is gdb?

“GNU Debugger”
A debugger for several languages, including C and C++
It allows you to inspect what the program is doing at a certain point during execution.
Errors like segmentation faults may be easier to find with the help of gdb.
Online manual

Additional step when compiling program

Normally, you would compile a program like:

gcc [flags] <source files> -o <output file>

For example: gcc hi.c -o hi.x
Now you add a -g option to enable built-in debugging support (which gdb needs)
For example: gcc -g hi.c -o hi.x

The example used: gcc -Wall -Werror -std=c89 -pedantic-errors -g hi.c -o hi.x but I wasn't good enough at writing sneaky segfaults to get anything through that.

Starting up gdb

Just try gdb or gdb hi.x. You’ll get a prompt that looks like this:

(gdb)

If you didn’t specify a program to debug, you’ll have to load it in now:

(gdb) file hi.x

Here, hi.x is the program you want to load, and file is the command to load it.

Before we go any further

gdb has an interactive shell, like Python. It can recall history with the arrow keys, auto-complete words (most of the time) with the TAB key, and has other nice features.
Tip: If you’re ever confused about a command or just want more information, use the help command, with or without an argument:

(gdb) help [command]

You should get a nice description and maybe some more useful tidbits.

A bad program

int *add_ptr(int *a, int *b)
{
    int val = *a + *b ;
    return &val ;
}

int main() 
{
    int a = 2, b = 3, *ptr ;
    ptr = add_ptr(&a, &b) ;
    printf("Sum = %d\n", *ptr) ;
    return 0;
}

Running the program

To run the program, just use:

(gdb) run

This runs the program.
If it has no serious problems (i.e. the normal program didn’t get a segmentation fault, etc.), the program should run fine here too.
If the program did have issues, then you (should) get some useful information like the line number where it crashed, and parameters to the function that caused the error:

0x00005555555551fc in main () at hi.c:16
16          printf("Sum = %d\n", *ptr) ;

So what if I have bugs?

Okay, so you’ve run it successfully. But you don’t need gdb for that. What if the program isn’t working?
Basic idea: Chances are if this is the case, you don’t want to run the program without any stopping, breaking, etc. Otherwise, you’ll just rush past the error and never find the root of the issue. So, you’ll want to step through your code a bit at a time, until you arrive upon the error.
This brings us to the next set of commands...

Setting breakpoints

Breakpoints can be used to stop the program run in the middle, at a designated point. The simplest way is the command break.
This sets a breakpoint at a specified file-line pair:
(gdb) break hi.c:15
This sets a breakpoint at line 15, of hi.c. Now, if the program ever reaches that location when running, the program will pause and prompt you for another command.
Tip: You can set as many breakpoints as you want, and the program should stop execution if it reaches any of them.

More fun with breakpoints

You can also tell gdb to break at a particular function. Suppose you have a function int *add_ptr(int *a, int *b) ;
You can break anytime this function is called:
(gdb) break add_ptr

Now what?

Once you’ve set a breakpoint, you can try using the run command again. This time, it should stop where you tell it to (unless a fatal error occurs before reaching that point).
You can proceed to the next breakpoint by typing continue.
Typing run again would restart the program from the beginning, which isn’t always very useful.
You can single-step (execute just the next line of code) by typing step. This gives you really fine-grained control over how the program proceeds. You can do this a lot...

Now what? (even more!)

Similar to step, the command next single-steps as well, except that it doesn’t walk through each line of a sub-routine; it treats the whole call as one instruction.
Tip: Typing step or next a lot of times can be tedious. If you just press ENTER, gdb will repeat the same command you just gave it. You can do this a bunch of times.

Querying other aspects of the program

So far you’ve learned how to interrupt program flow at fixed, specified points, and how to continue stepping line-by-line. However, sooner or later you’re going to want to see things like the values of variables, etc. This might be useful in debugging.
The print command prints the value of the variable specified, and print/x prints the value in hexadecimal:
Breakpoint 1, add_ptr (a=0x7fffffffdd88, b=0x7fffffffdd8c) at hi.c:7
7       {
(gdb) print/x a
$3 = 0x7fffffffdd88

Setting watchpoints

Whereas breakpoints interrupt the program at a particular line or function, watchpoints act on variables. They pause the program whenever a watched variable’s value is modified.
For example, the following watch command:
(gdb) watch val
Now, whenever the value of val is modified, the program will interrupt and print out the old and new values.
Tip: You may wonder how gdb determines which variable named val to watch if there is more than one declared in your program.
The answer (perhaps unfortunately) is that it relies upon the variable’s scope, relative to where you are in the program at the time of the watch. This just means that you have to remember the tricky nuances of scope and extent.
To learn about scope, take this class when we have more than 14 weeks to get through the entirety of computer science.

Other useful commands

backtrace - produces a stack trace of the function calls that led to a seg fault (should remind you of Java exceptions)
where - same as backtrace - you can think of this version as working even when you’re still in the middle of the program
finish - runs until the current function is finished
delete - deletes a specified breakpoint
info breakpoints - shows information about all declared breakpoints
Look at sections 5 and 9 of the manual mentioned at the beginning of this tutorial to find other useful commands, or just try help

More about breakpoints

Breakpoints by themselves may seem too tedious. You have to keep stepping, and stepping, and stepping...
Basic idea: Once we develop an idea for what the error could be (like dereferencing a NULL pointer, or going past the bounds of an array), we probably only care if such an event happens; we don’t want to break at each iteration regardless.
So ideally, we’d like to condition on a particular requirement (or set of requirements). Using conditional breakpoints allows us to accomplish this goal...

Conditional breakpoints

Just like regular breakpoints, except that you get to specify some criterion that must be met for the breakpoint to trigger. We use the same break command as before:
(gdb) break file1.c:6 if i >= ARRAYSIZE
This command sets a breakpoint at line 6 of file file1.c, which triggers only if the variable i is greater than or equal to the size of the array (which is probably bad if line 6 does something like arr[i]). Conditional breakpoints can let you skip most of the unnecessary stepping.

Fun with pointers

Who doesn’t have fun with pointers? First, let’s assume we have the following structure defined:
struct entry
{
    int key;
    char *name;
    float price;
    long serial_number;
};
Maybe this struct is used in some sort of hash table as part of a catalog for products, or something related.

Using pointers with gdb I

Now, let’s assume we’re in gdb, and are at some point in the execution after a line that looks like:
struct entry * e1 = <something>;
We can do a lot of stuff with pointer operations, just like we could in C.
See the value (memory address) of the pointer:
(gdb) print e1
See a particular field of the struct the pointer is referencing:
(gdb) print e1->key
(gdb) print e1->name
(gdb) print e1->price
(gdb) print e1->serial_number

Using pointers with gdb II

You can also use the dereference star * and reference dot . operators in place of the arrow operator ->
(gdb) print (*e1).key
(gdb) print (*e1).name
(gdb) print (*e1).price
(gdb) print (*e1).serial_number
See the entire contents of the struct the pointer references (you can’t do this as easily in C!):
(gdb) print *e1
You can also follow pointers iteratively, like in a linked list:
(gdb) print list_prt->next->next->next->data

Today

✓ gdb: ✓ intro/invocation; ✓ breakpoints; ✓ debugging; ✓ conditions; ✓ pointers
valgrind: memory leaks

Valgrind

Valgrind is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools.

valgrind ./a.out

I'll be using my matrix mult code from disc, which contains errors since I wrote it on a midterm.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* take an m by n and an n by p and return an m by p */
uint64_t **mmult(size_t m, size_t n, size_t p, uint64_t **a, uint64_t **b)
{
    uint64_t **ret = malloc(m * sizeof(uint64_t *)) ;
    size_t i, j, k ;
    for ( i = 0 ; i < m ; i++ )
    {
        ret[i] = malloc(p * sizeof(uint64_t)) ;
        for ( j = 0 ; j < p ; j++ )
        {
            ret[i][j] = 0 ;
            for ( k = 0 ; k < n ; k++ )
            {
                ret[i][j] += a[i][k] * b[k][j] ;
            }
        }
    }
    return ret ;
}

void mprint(size_t m, size_t n, uint64_t **a)
{
    size_t i, j ;
    putchar('{') ;
    for ( i = 0 ; i < m ; i++ )
    {
        putchar('{') ;
        for ( j = 0 ; j < n - 1 ; j++ )
        {
            printf("%lu, ", a[i][j]) ;
        }
        printf("%lu}", a[i][n - 1]) ;
        if (i < m - 1)
        {
            printf(",\n ") ;
        }
    }
    printf("}\n") ;
}

int main()
{
    size_t m = 2, n = 3, p = 2, i, j ;
    uint64_t **a, **b ;

    a = (uint64_t **)malloc(m * sizeof(uint64_t *)) ;
    for ( i = 0 ; i < m ; i++ )
    {
        a[i] = malloc(p * sizeof(uint64_t)) ;
        for ( j = 0 ; j < n ; j++ )
        {
            a[i][j] = i + j ;
        }
    }
    b = (uint64_t **)malloc(m * sizeof(uint64_t *)) ;
    for ( i = 0 ; i < n ; i++ )
    {
        b[i] = malloc(p * sizeof(uint64_t)) ;
        for ( j = 0 ; j < p ; j++ )
        {
            b[i][j] = i + j ;
        }
    }
    mprint(m, n, a);
    mprint(n, p, b);
    mprint(m, p, mmult(m,n,p,a,b));
    return 0;
}
Let they who is without sin throw the first stone.
==32743== ERROR SUMMARY: 17 errors from 8 contexts (suppressed: 0 from 0)
Make sure you still use gcc -g.

Understanding errors

Let's look at the first chunk
==34585== Invalid write of size 8
==34585==    at 0x1094E5: main (in /home/user/a.out)
==34585==  Address 0x4a8d0a0 is 0 bytes after a block of size 16 alloc'd
==34585==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==34585==    by 0x1094AB: main (in /home/user/a.out)
Wow I have no idea where that is, recompile with -g.
==35544== Invalid write of size 8
==35544==    at 0x1094E5: main (hi.c:56)
==35544==  Address 0x4a8d0a0 is 0 bytes after a block of size 16 alloc'd
==35544==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==35544==    by 0x1094AB: main (hi.c:53)
Ah, lines 53 and 56.

Understanding errors

These are lines 53 through 56:
        a[i] = malloc(p * sizeof(uint64_t)) ;
        for ( j = 0 ; j < n ; j++ )
        {
            a[i][j] = i + j ;
The variables are defined here:
    size_t m = 2, n = 3, p = 2, i, j ;
    uint64_t **a, **b ;
    a = (uint64_t **)malloc(m * sizeof(uint64_t *)) ;
The error was "Address 0x4a8d0a0 is 0 bytes after a block of size 16 alloc'd"
What does that mean?

Understanding errors

These are lines 50 through 54. Notice anything?
    a = (uint64_t **)malloc(m * sizeof(uint64_t *)) ;
    for ( i = 0 ; i < m ; i++ )
    {
        a[i] = malloc(p * sizeof(uint64_t)) ;
        for ( j = 0 ; j < n ; j++ )
Valgrind has detected an anomaly between our malloc size and our loop size.
We malloc "m" by "p"
    a = (uint64_t **)malloc(m * sizeof(uint64_t *)) ;
    ...
        a[i] = malloc(p * sizeof(uint64_t)) ;
but we step "m" by "n" times.
    for ( i = 0 ; i < m ; i++ )
    ...
        for ( j = 0 ; j < n ; j++ )
I made this error when creating both testing matrices, and when I corrected it...
==39032== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Understanding leaks

After we fix memory errors we can fix memory leaks
POP QUIZ: What's the difference?
==39032== LEAK SUMMARY:
==39032==    definitely lost: 56 bytes in 3 blocks
==39032==    indirectly lost: 128 bytes in 7 blocks
==39032==      possibly lost: 0 bytes in 0 blocks
==39032==    still reachable: 0 bytes in 0 blocks
==39032==         suppressed: 0 bytes in 0 blocks
==39032== Rerun with --leak-check=full to see details of leaked memory
Here's your hint: mprint(m, p, mmult(m,n,p,a,b));

Understanding leaks

==39032== definitely lost: 56 bytes in 3 blocks

56 = 7 * 8
I lost 7 pointers of size 8.
The mmult was a 2 by 3 times a 3 by 2 that returned a 2 by 2.
That is, 2 + 3 + 2 = 7 total rows, in 3 distinct chunks (blocks), one array of row pointers per matrix.
Each of those 7 pointers led to the block holding one row's values.
Together the rows were big enough to hold a total of 2*3 + 3*2 + 2*2 = 16 values.
16 values of size 8 is 16 * 8 = 128 bytes, in 7 blocks (one for each row).
==43596== indirectly lost: 128 bytes in 7 blocks

Understanding leaks

We can trivially fix the code by adding a free call inside the printer, which necessarily reads all malloc'ed memory.
    // capture the return value
    c = mmult(m, n, p, a, b) ;
    // include free's in print
    mprint(m, n, a);
    mprint(n, p, b);
    mprint(m, p, c);
Then we see a happy valgrind:
==47140== HEAP SUMMARY:
==47140==     in use at exit: 0 bytes in 0 blocks
==47140==   total heap usage: 11 allocs, 11 frees, 1,208 bytes allocated
==47140==
==47140== All heap blocks were freed -- no leaks are possible

Today

✓ gdb: ✓ intro/invocation; ✓ breakpoints; ✓ debugging; ✓ conditions; ✓ pointers
✓ valgrind: ✓ memory leaks

Valgrind

I fixed my matrix mult code.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* take an m by n and an n by p and return an m by p */
uint64_t **mmult(size_t m, size_t n, size_t p, uint64_t **a, uint64_t **b)
{
    uint64_t **ret = malloc(m * sizeof(uint64_t *)) ;
    size_t i, j, k ;
    for ( i = 0 ; i < m ; i++ )
    {
        ret[i] = malloc(p * sizeof(uint64_t)) ;
        for ( j = 0 ; j < p ; j++ )
        {
            ret[i][j] = 0 ;
            for ( k = 0 ; k < n ; k++ )
            {
                ret[i][j] += a[i][k] * b[k][j] ;
            }
        }
    }
    return ret ;
}

void mprint(size_t m, size_t n, uint64_t **a)
{
    size_t i, j ;
    putchar('{') ;
    for ( i = 0 ; i < m ; i++ )
    {
        putchar('{') ;
        for ( j = 0 ; j < n - 1 ; j++ )
        {
            printf("%lu, ", a[i][j]) ;
        }
        printf("%lu}", a[i][n - 1]) ;
        free(a[i]) ;
        if (i < m - 1)
        {
            printf(",\n ") ;
        }
    }
    free(a) ;
    printf("}\n") ;
}

int main()
{
    size_t m = 2, n = 3, p = 2, i, j ;
    uint64_t **a, **b, **c ;

    a = (uint64_t **)malloc(m * sizeof(uint64_t *)) ;
    for ( i = 0 ; i < m ; i++ )
    {
        a[i] = malloc(n * sizeof(uint64_t)) ;
        for ( j = 0 ; j < n ; j++ )
        {
            a[i][j] = i + j ;
        }
    }
    b = (uint64_t **)malloc(n * sizeof(uint64_t *)) ;
    for ( i = 0 ; i < n ; i++ )
    {
        b[i] = malloc(p * sizeof(uint64_t)) ;
        for ( j = 0 ; j < p ; j++ )
        {
            b[i][j] = i + j ;
        }
    }
    // capture the return value
    c = mmult(m, n, p, a, b) ;
    // include free's in print
    mprint(m, n, a);
    mprint(n, p, b);
    mprint(m, p, c);
    return 0;
}