Advanced Python Debugging with pdb
Python's built-in pdb module is extremely useful for interactive debugging, but has a bit of a learning curve. For a long time, I stuck to basic print debugging and used pdb on a limited basis, which meant I missed out on a lot of features that would have made debugging faster and easier.
In this post I will show you a few tips I've picked up over the years to level up my interactive debugging skills.
Print debugging vs. interactive debugging
First, why would you want to use an interactive debugger instead of inserting print or logging statements into your code?
With pdb, you have a lot more flexibility to run, resume, and alter the execution of your program without touching the underlying source. Once you get good at this, it means more time spent diving into issues and less time context switching back and forth between your editor and the command line.
Also, by not touching the underlying source code, you will have the ability to step into third party code (e.g. modules installed from PyPI) and the standard library.
Post-mortem debugging
The first workflow I used after moving away from print debugging was pdb's "post-mortem debugging" mode. This is where you run your program as usual, but whenever an unhandled exception is raised, you drop into the debugger to poke around in the program state. After that, you attempt to make a fix and repeat the process until the problem is resolved.
You can run an existing script with the post-mortem debugger by using Python's -mpdb option:
python3 -mpdb path/to/script.py
From here, you are dropped into a (Pdb) prompt. To start execution, you use the continue or c command. If the program executes successfully, you will be taken back to the (Pdb) prompt where you can restart the execution again. At this point, you can use quit / q or Ctrl+D to exit the debugger.
If the program raises an unhandled exception, you'll also see a (Pdb) prompt, but with the program execution stopped at the line that raised the exception. From here, you can run Python code and debugger commands at the prompt to inspect the current program state.
Testing our basic workflow
To see how these basic debugging steps work, I'll be using this (buggy) program:
import random

MAX = 100

def main(num_loops=1000):
    for i in range(num_loops):
        num = random.randint(0, MAX)
        denom = random.randint(0, MAX)
        result = num / denom
        print("{} divided by {} is {:.2f}".format(num, denom, result))

if __name__ == "__main__":
    import sys
    arg = sys.argv[-1]
    if arg.isdigit():
        main(arg)
    else:
        main()
We're expecting the program to do some basic math operations on random numbers in a loop and print the result. Try running it normally and you will see one of the bugs:
$ python3 script.py
2 divided by 30 is 0.07
65 divided by 41 is 1.59
0 divided by 70 is 0.00
...
38 divided by 26 is 1.46
Traceback (most recent call last):
  File "script.py", line 16, in <module>
    main()
  File "script.py", line 7, in main
    result = num / denom
ZeroDivisionError: division by zero
Let's try post-mortem debugging this error:
$ python3 -mpdb script.py
> ./src/script.py(1)<module>()
-> import random
(Pdb) c
49 divided by 46 is 1.07
...
Traceback (most recent call last):
  File "/usr/lib/python3.4/pdb.py", line 1661, in main
    pdb._runscript(mainpyfile)
  File "/usr/lib/python3.4/pdb.py", line 1542, in _runscript
    self.run(statement)
  File "/usr/lib/python3.4/bdb.py", line 431, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "./src/script.py", line 1, in <module>
    import random
  File "./src/script.py", line 7, in main
    result = num / denom
ZeroDivisionError: division by zero
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> ./src/script.py(7)main()
-> result = num / denom
(Pdb) num
76
(Pdb) denom
0
(Pdb) random.randint(0, MAX)
56
(Pdb) random.randint(0, MAX)
79
(Pdb) random.randint(0, 1)
0
(Pdb) random.randint(1, 1)
1
Once the post-mortem debugger kicks in, we can inspect all of the variables in the current frame and even run new code to help us figure out what's wrong and attempt to make a fix.
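The randint(1, 1) experiment above hints at one possible fix: draw the denominator from 1 to MAX so it can never be zero. The sketch below applies that change and also converts the command-line argument with int() before passing it to main, since sys.argv values are strings and range() rejects a string. This is one reasonable reading of the bugs, not the only way to fix them.

```python
import random

MAX = 100

def main(num_loops=1000):
    for i in range(num_loops):
        num = random.randint(0, MAX)
        # Start at 1 so the denominator can never be zero.
        denom = random.randint(1, MAX)
        result = num / denom
        print("{} divided by {} is {:.2f}".format(num, denom, result))

if __name__ == "__main__":
    import sys
    arg = sys.argv[-1]
    if arg.isdigit():
        # sys.argv values are strings; range() needs an int.
        main(int(arg))
    else:
        main()
```

Whether zero should be excluded or handled some other way (for example by skipping that iteration) depends on what the program is meant to do.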
Dropping into the debugger from Python code using pdb.set_trace
Another technique that I used early on, after starting to use pdb, was forcing the debugger to run at a certain line of code before an error occurred. This is a common next step after learning post-mortem debugging because it feels similar to debugging with print statements.
For example, in the above code, if we want to stop execution before the division operation, we could add a pdb.set_trace call to our program here:
        import pdb; pdb.set_trace()
        result = num / denom
And then run our program without -mpdb:
$ python3 script.py
> ./src/script.py(10)main()
-> result = num / denom
(Pdb) num
94
(Pdb) denom
19
The problem with this method is that you have to constantly drop these statements into your source code, remember to remove them afterwards, and switch between running your code with python vs. python -mpdb.
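Since Python 3.7, the built-in breakpoint() function does the same job as import pdb; pdb.set_trace() by default, and it can be switched off without editing the source by running with the PYTHONBREAKPOINT=0 environment variable. A minimal sketch, assuming Python 3.7 or later; the divide helper is hypothetical and not part of the example script:

```python
def divide(num, denom):
    # Hypothetical helper for illustration; not part of the original script.
    if denom == 0:
        # By default this calls pdb.set_trace(), stopping just before the
        # failing division. With PYTHONBREAKPOINT=0 it does nothing.
        breakpoint()
    return num / denom

# Non-zero denominators never reach the breakpoint, so this runs straight through.
results = [divide(1, 2), divide(3, 4)]
print(results)
```

Guarding the call with a condition keeps the debugger out of the way until the interesting case shows up, which softens the "constantly drop these statements in" problem but does not remove it.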
Using pdb.set_trace gets the job done, but breakpoints are an even more flexible way to stop the debugger at any line (even third party or standard library code), without needing to modify any source code. Let's learn about breakpoints and a few other useful commands.
Debugger commands
There are over 30 commands you can give to the interactive debugger. You can list them with the help command at the (Pdb) prompt:
(Pdb) help
Documented commands (type help <topic>):
========================================
EOF c d h list q rv undisplay
a cl debug help ll quit s unt
alias clear disable ignore longlist r source until
args commands display interact n restart step up
b condition down j next return tbreak w
break cont enable jump p retval u whatis
bt continue exit l pp run unalias where
You can use help <topic> for more information on a given command.
Instead of walking through each command, I'll list out the ones I've found most useful and what arguments they take.
Setting breakpoints:
- l(ist): displays the source code of the currently running program, with line numbers, for the 11 lines around the current statement.
- l 1,999: displays the source code of lines 1-999. I regularly use this to see the source for the entire program. If your program only has 20 lines, it'll just show all 20 lines.
- b(reak): displays a list of current breakpoints.
- b 10: sets a breakpoint at line 10. Breakpoints are referred to by a numeric ID, starting at 1.
- b main: sets a breakpoint at the function named main. The function name must be in the current scope. You can also set breakpoints on functions in other modules in the current scope, e.g. b random.randint.
- b script.py:10: sets a breakpoint at line 10 in script.py. This gives you another way to set breakpoints in another module.
- clear: clears all breakpoints (after asking for confirmation).
- clear 1: clears breakpoint 1.
Stepping through execution:
- c(ontinue): execute until the program finishes, an exception is raised, or a breakpoint is hit.
- s(tep): execute the next line, whatever it is (your code, stdlib, third party code, etc.). Use this when you want to step down into function calls you're interested in.
- n(ext): execute the next line in the current function (will not step into downstream function calls). Use this when you're only interested in the current function.
- r(eturn): execute the remaining lines in the current function until it returns. Use this to skip over the rest of the function and go up a level, for example if you've stepped down into a function by mistake.
- unt(il) [lineno]: without an argument, execute until a line with a number greater than the current one is reached. This behaves like next, except that it will not stop again inside a loop body you have already stepped through once, so it is useful for letting a loop finish without manually stepping through every iteration. With a line number, execute until a line with a number greater than or equal to lineno is reached.
Moving up and down the stack:
- w(here): shows an annotated view of the stack trace, with your current frame marked by >.
- u(p): move up one frame in the current stack trace. For example, when post-mortem debugging, you'll start off on the lowest level of the stack and typically want to move up a few times to help figure out what went wrong.
- d(own): move down one frame in the current stack trace.
Additional commands and tips:
- pp <expression>: this will "pretty print" the result of the given expression using the pprint module. Example:
(Pdb) stuff = "testing the pp command in pdb with a big list of strings"
(Pdb) pp [(i, x) for (i, x) in enumerate(stuff.split())]
[(0, 'testing'),
(1, 'the'),
(2, 'pp'),
(3, 'command'),
(4, 'in'),
(5, 'pdb'),
(6, 'with'),
(7, 'a'),
(8, 'big'),
(9, 'list'),
(10, 'of'),
(11, 'strings')]
- !<python code>: sometimes the Python code you run in the debugger will be confused for a command. For example, c = 1 will trigger the continue command. To force the debugger to execute Python code, prefix the line with !, e.g. !c = 1.
- Pressing the Enter key at the (Pdb) prompt will execute the previous command again. This is most useful after the s/n/r/unt commands to quickly step through execution line-by-line.
- You can run multiple commands on one line by separating them with ;;, e.g. b 8 ;; c.
- The pdb module can take multiple -c arguments on the command line to execute commands as soon as the debugger starts. Example:
python3 -mpdb -cc script.py # run the program without you having to enter an initial "c" at the prompt
python3 -mpdb -c "b 8" -cc script.py # sets a breakpoint on line 8 and runs the program
Restart behavior
Another thing that can shave time off debugging is understanding how pdb's restart behavior works. You may have noticed that after execution stops, pdb will give a message like, "The program finished and will be restarted," or "The script will be restarted." When I first started using pdb, I would always quit and re-run python -mpdb ... to make sure that my code changes were getting picked up, which was unnecessary in most cases.
When pdb says it will restart the program, or when you use the restart command, code changes to the script you're debugging will be reloaded automatically. Breakpoints will still be set after reloading, but may need to be cleared and re-set due to line numbers shifting. Code changes to other imported modules will not be reloaded; you will need to quit and re-run the -mpdb command to pick those up.
Watches
One feature you may miss from other interactive debuggers is the ability to "watch" a variable change throughout the program's execution. pdb does not include a watch command by default, but you can get something similar by using commands, which lets you run arbitrary Python code whenever a breakpoint is hit.
To watch what happens to the denom variable in our example program:
$ python3 -mpdb script.py
> ./src/script.py(1)<module>()
-> import random
(Pdb) b 9
Breakpoint 1 at ./src/script.py:9
(Pdb) commands
(com) silent
(com) print("DENOM: {}".format(denom))
(com) c
(Pdb) c
DENOM: 77
71 divided by 77 is 0.92
DENOM: 27
100 divided by 27 is 3.70
DENOM: 10
82 divided by 10 is 8.20
DENOM: 20
...
We first set a breakpoint (which is assigned ID 1), then use commands to start entering a block of commands. These commands function as if you had typed them at the (Pdb) prompt. They can be either Python code or additional pdb commands.
Once we start the commands block, the prompt changes to (com). The silent command suppresses the message pdb normally prints each time it stops at the breakpoint, which makes reading the output a little easier.
After that, we run a print call to inspect the variable, similar to what we might do when print debugging. Finally, we end with a c to continue execution, which ends the command block. Typing c again at the (Pdb) prompt starts execution and we see our new print call running.
If you'd rather stop execution instead of continuing, you can use end instead of c in the command block.
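If you want change tracking without setting a breakpoint by hand, the idea can be approximated in plain Python. The sketch below is not a pdb feature and makes no claim about how pdb is implemented; it is a small illustration built on the standard sys.settrace hook, and the names watch_local and halve_until_small are made up for this example.

```python
import sys

def watch_local(func, var_name, *args, **kwargs):
    """Call func and collect each new value taken by its local var_name."""
    changes = []
    unset = object()          # sentinel: variable not assigned yet
    last = unset
    target_code = func.__code__

    def tracer(frame, event, arg):
        nonlocal last
        if frame.f_code is not target_code:
            return None       # ignore frames of other functions
        if event in ("line", "return"):
            value = frame.f_locals.get(var_name, unset)
            if value is not unset and (last is unset or value != last):
                changes.append(value)
                last = value
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return result, changes

def halve_until_small(n):
    # Toy function whose local "denom" changes on every loop iteration.
    denom = n
    while denom > 1:
        denom = denom // 2
    return denom

result, seen = watch_local(halve_until_small, "denom", 8)
print(result, seen)  # 1 [8, 4, 2, 1]
```

Tracing every line is slow, so this is only practical for small functions; for anything larger, the breakpoint plus commands approach above is the better fit.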
Running pdb from the interpreter
Another way to run pdb is via the interpreter, which is useful when you're experimenting interactively and would like to drop into pdb without running a standalone script.
For post-mortem debugging, all you need is a call to pdb.pm() after an exception has occurred:
$ python3
>>> import script
>>> script.main()
17 divided by 60 is 0.28
...
56 divided by 94 is 0.60
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./src/script.py", line 9, in main
    result = num / denom
ZeroDivisionError: division by zero
>>> import pdb
>>> pdb.pm()
> ./src/script.py(9)main()
-> result = num / denom
(Pdb) num
4
(Pdb) denom
0
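pdb.pm() works on the last exception the interactive interpreter recorded. Inside a script you can get a similar effect with pdb.post_mortem(), which takes a traceback object explicitly. A minimal sketch: the call_with_post_mortem helper and its debugger parameter are hypothetical names introduced here so the debugger call can be swapped out when running unattended.

```python
import pdb
import sys

def call_with_post_mortem(func, *args, debugger=pdb.post_mortem):
    """Call func(*args); if it raises, open the debugger on the traceback."""
    try:
        return func(*args)
    except Exception:
        # sys.exc_info()[2] is the traceback of the exception being handled.
        debugger(sys.exc_info()[2])
        raise

# A call that succeeds never touches the debugger.
print(call_with_post_mortem(divmod, 7, 2))  # (3, 1)
```

With the example program, call_with_post_mortem(script.main) would drop you at the (Pdb) prompt on the failing division, and the exception is re-raised once you quit the debugger.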
If you want to step through normal execution instead, use the pdb.run() function:
$ python3
>>> import script
>>> import pdb
>>> pdb.run("script.main()")
> <string>(1)<module>()
(Pdb) b script:6
Breakpoint 1 at ./src/script.py:6
(Pdb) c
> ./src/script.py(6)main()
-> for i in range(num_loops):
(Pdb) n
> ./src/script.py(7)main()
-> num = random.randint(0, MAX)
(Pdb) n
> ./src/script.py(8)main()
-> denom = random.randint(0, MAX)
(Pdb) n
> ./src/script.py(9)main()
-> result = num / denom
(Pdb) n
> ./src/script.py(10)main()
-> print("{} divided by {} is {:.2f}".format(num, denom, result))
(Pdb) n
66 divided by 70 is 0.94
> ./src/script.py(6)main()
-> for i in range(num_loops):
This one is a little trickier than -mpdb because you don't have the ability to step through an entire program. Instead, you'll need to manually set a breakpoint, e.g. on the first statement of the function you're trying to execute.
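The pdb module also provides runcall(), which calls a function and stops as soon as the function is entered, so no manual breakpoint is needed. At the interpreter, pdb.runcall(script.main) is enough. The sketch below shows the same mechanism running unattended, assuming the documented stdin and stdout arguments of pdb.Pdb are used to feed commands from a string; the divide function is a stand-in and not part of the example script.

```python
import io
import pdb

def divide(num, denom):
    result = num / denom
    return result

# Commands the debugger reads instead of keyboard input:
# step over the division, print the result, then continue.
scripted_input = io.StringIO("n\np result\nc\n")
transcript = io.StringIO()

debugger = pdb.Pdb(stdin=scripted_input, stdout=transcript,
                   nosigint=True, readrc=False)
value = debugger.runcall(divide, 6, 3)

print(transcript.getvalue())
print("returned:", value)
```

Scripting the session this way is mostly useful for trying out commands or capturing a transcript; for day-to-day debugging the interactive prompt is what you want.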
Conclusion
Hopefully these tips have given you a few new ideas on how to use pdb more effectively. After getting a handle on these, you should be able to pick up the other commands and start customizing pdb via a .pdbrc file.
You can also look into other front-ends for debugging, like pdbpp, pudb, and ipdb, or GUI debuggers like the one included in PyCharm. Happy debugging!
Hi Steven,
This is a very useful post.
I am looking for one more thing. I have a Python script calling a function from C Library.
I want to know, how to debug the functions from C Library using this debugger?
How to set breakpoints in these C functions?
Please do suggest.
Regards,
Narendra
Hi Narendra. You need to use gdb (or whatever is used for C debugging on your platform of choice, e.g. Visual Studio on Windows) to debug C extensions and core CPython code.
Here’s an example. Say I want to debug the StringIO code in Python 3, which is implemented in C. Let’s say I want to set a breakpoint on this method:
https://github.com/python/cpython/blob/4a21e57fe55076c77b0ee454e1994ca544d09dc0/Modules/_io/stringio.c#L345-L373
I will use this program to trigger the execution of this method:
If I have gdb and Python 3 installed (with debug symbols) I can set a breakpoint on _stringio_readline, execute my program, have it break into the debugger when that method is executed, and then use gdb's commands and some Python extensions to step through the code, examine the contents of variables, grab a Python traceback, etc.

Hi Steven,
Nice article about debugging Python.
I’m looking for a way to stop the program execution (and jump into pdb) for any exception (handled & unhandled).
Thanks,
Chaim.
Hi Chaim. See the post-mortem debugging section of this article. Running your program with python -mpdb script.py then using the c command will run your program and drop you into a debugger prompt for unhandled exceptions.
Re: triggering the debugger on handled exceptions, can you explain more about why you want to do that? Exceptions are commonly used for control flow, so your program might handle hundreds or thousands of exceptions during normal execution (especially in the standard library and installed packages).
That said, you could probably do it using the trace module. Here’s one way to display handled exceptions (this is a small program that imports a few modules from the standard library, then tries to run 1/0 and catch the ZeroDivisionError):
You can see how the standard library raised and handled 11 exceptions before one line of my code even ran.
It might be possible to subclass trace.Trace to trigger the debugger on caught exceptions, but I would want to know more about why you want to do that before pursuing that.

Hi Steven,
My problem is: I’m running a 3rd party Python package (Django), which in turn calls other Python packages. I’m getting the following error from Django: TypeError: unsupported operand type(s) for &: ‘str’ and ‘int’. Because of this error the application fails to start (open a web page). So, I want to know where the error comes from, so I can fix it or avoid it! Because it is a handled exception, I want to force the debugger (pdb) to stop on ‘handled’ exceptions!
How are you getting that error? Is it in the console when you run Django’s runserver command, or does it display in the browser from Django’s error page? Do you get a full traceback, or just that one “TypeError” line? I wouldn’t jump to the debugger yet, especially trying to debug all caught exceptions. I think there’s a simpler way to solve it. Let me know if you’d like to start a session so I can take a look at it.

Hi Steven,
Thanks for your help. I finally found the problem. It came from the ‘flup’ package. After I downgraded ‘flup’ from version 1.0.3 to 1.0.2 the problem disappeared. I ran the program via the debugger, and before starting the Django service I set ‘option.traceback’ to 1 (in pdb), which caused the traceback to be printed. (BTW, I assume that there should be a configuration parameter to set it, but I didn’t find it yet.)
Anyway, for next time: I’m still interested in a way to force pdb to stop (or hook) on any exception (handled & unhandled). I think that it can be very useful for debugging problems like this.
Thanks,
Chaim.