Codementor Events

Python Recursion by Example

Published Sep 20, 2018Last updated Mar 18, 2019
Python Recursion by Example

Table of Contents

Introduction

Recursion can be a bit of a rat's nest. It confuses a lot of people (with good reason). Common questions are: What is it? Why is it useful? Why can't we just use loops? Why won't it stop? Please make it stop? And what on Earth is that maximum recursion depth that I just exceeded? By the end of this article, all will be made clear.

We'll start off with an introduction to the call stack and a comparison between recursive functions and loops. We'll then hit a few fun examples.

Prerequisites

I've done my best to keep the code as clear and concise as possible. This means I have made use of a few concepts that deserve tutorials of their own:

  • List comprehensions: You can think of them as compressed for loops. You can learn more about them here. And if you are having trouble understanding, please ask. If there is enough interest, I'll probably write an article explaining them.
  • Mutable data types: we'll specifically be dealing with dictionaries. If you don't know what that means, then I suggest you familiarize yourself with the concept of mutability. I recently wrote an article about Python lists that explains mutability, which may be of use to you.

Version

The examples below are all written in Python3.

The Python Call Stack

So imagine that you are a dish washer at a restaurant.

Imagine you have a big stack of dirty dishes that you need to wash. You are restricted to cleaning only one at a time. It would make sense to grab the top plate and wash that one. Then grab the next one from the top. Keep going until you have cleaned the entire stack.

Right, but the plates keep coming in because this is a functioning restaurant. Plates get pushed on top of your stack, and you pop them off. You wonder about the state of your life and dream about one day becoming a software developer.

Which brings us neatly to... Python!

Python (like many languages) maintains a kind of stack. It's a stack of function calls instead of a stack of dishes.

Consider:

def parrot():
    print("parrot start")
    print("parrot end")

def cheese():
    print("cheese start")
    parrot()
    print("cheese end")

cheese()

This outputs:

cheese start
parrot start
parrot end
cheese end

Which is pretty sensible. What is happening under the hood is:

We call cheese() so cheese() is pushed onto the call stack. Python starts executing on cheese and then cheese adds parrot to the call stack. So Python drops what it is doing and starts dealing with parrot. Once it is done with parrot, then parrot is removed from the stack. It returns to cheese and continues until it is done. cheese is then popped off the stack.

Going back to the restaurant example, it would be sortof like having dirty dishes that manufacture more dirty dishes when you try to clean them. In the restaurant business, this would be very depressing, but in Python, it's pretty powerful.

Going deeper under the hood: every time a function is called you add a stackframe (aka activation records) to the stack. The stackframe keeps track of what has been executed and what comes next. There are also some local variables associated with the stack frame.

Let's look at some more complex code. Because I like you folks, I'll throw in a picture. Consider this:

def check_cheese(cheese):
    """ return true if the cheese is in stock
    """
    if cheese == "Gouda":
        logging.info("ooh we have that one")
        return True
    logging.info(f"we don't have any {cheese}")
    return False

def find_first_available_cheese():
    """iterates over preferred cheeses, returns the first one that is available"""
    for cheese in ['Emmental','Gouda']:
        if check_cheese(cheese):
            return cheese
    logging.error("and you call yourself a cheeseshop!")

find_first_available_cheese()   ### start

The line labeled ### start is where we begin. When this line executes, then the stack looks a lil something like:

find_first_available_cheese
(function) (local variables)

So we have one stack frame representing the call to find_first_available_cheese. There are no local variables just yet.

We now enter the for loop. As we start the first iteration the stack is updated. It now looks like:

find_first_available_cheese cheese=Emmental
(function) (local variables)

Next up we call check_cheese and pass in our cheese. This pushes a frame onto the stack:

check_cheese cheese=Emmental
find_first_available_cheese cheese=Emmental
(function) (local variables)

Now Python starts executing check_cheese. cheese is compared to Gouda. The function returns False and the the frame is popped from the stack. We are now left with:

find_first_available_cheese cheese=Emmental
(function) (local variables)

Execution of find_first_available_cheese continues. In the next iteration of the loop, the value of cheese is updated. So our stack looks like:

find_first_available_cheese cheese=Gouda
(function) (local variables)

We now call check_cheese. The stack now looks like:

check_cheese cheese=Gouda
find_first_available_cheese cheese=Gouda
(function) (local variables)

Now Python starts executing check_cheese. cheese is compared to Gouda. The function returns True and the the frame is popped from the stack. We are now left with:

find_first_available_cheese cheese=Gouda
(function) (local variables)

Since check_cheese returned True, find_first_available_cheese returns the current cheese and is popped off the stack. And that is the end of the script.

Make sense? Good.

Of course, this is a grand over-simplification of what the call stack does under the hood. But it should give you a clear enough picture of what is going on so that the rest of this article makes sense.

Recursion versus Loops

So now you are familiar with the call stack and how it behaves when you call a function in a loop. In the code below, we will demonstrate something a bit different

def foo(iteration=0):
    print(f"spam number {iteration}")
    foo(iteration+1)

foo() ### start

So this useless function is a basic example of recursion. Let's run through the changes to the stack just like before.

We first execute the line marked with ### start. This gives us a stack like:

foo iteration=0
(function) (local variables)

The function then prints out the current spam iteration number and then calls foo again. Note that the initial foo call did not return so it remains on the stack.

foo iteration=1
foo iteration=0
(function) (local variables)

Now Python deals with the top frame and calls foo again. And again. and again...

foo iteration=4
foo iteration=3
foo iteration=2
foo iteration=1
foo iteration=0
(function) (local variables)

This is recursion.

Now our recursion function doesn't do a lot. The logic can be represented in a loop like this

iteration = 0
while True:
    print(f"spam number {iteration}")
    iteration += 1

This is an infinite loop because there is no statement that causes the loop to stop. So if you were to run the code above, it would just print stuff until you force the program to exit. Paste it in and see for yourself.

Now paste this into your Python shell:

def foo(iteration=0):
    print(f"spam number {iteration}")
    foo(iteration+1)
    print("exiting foo")  # this never executes

foo() ### start

There will be a whole lot of output and the final line will be something like:

RuntimeError: maximum recursion depth exceeded ...

So you can have infinite loops, but you can't have infinite recursion. Going back to the restaurant from before, if you keep pushing plates to the stack, you'll eventually hit the ceiling.

Now usually in while True kinds of loops you will have some breaking condition. For example:

iteration = 0
while True:
    if iteration >= 5:
        break
    print(f"spam number {iteration}")
    iteration += 1

print("finished")

We can also force our recursion to stop. We just need to have foo NOT call foo under certain conditions.

def foo(iteration=0):
    if iteration >= 5:    ###
        return            ### dont call foo. stop the recursion
    print(f"spam number {iteration}")
    foo(iteration+1)
    print("exiting foo")


foo() ### start
print("finished")

So let's say we let this program run for a while. Eventually, the stack will look like:

foo iteration=5
foo iteration=4
foo iteration=3
foo iteration=2
foo iteration=1
foo iteration=0
(function) (local variables)

Now since iteration >= 5, the current foo simply returns. It does not recurse any further because it does not make any new function calls. It is thus popped off the stack. This is called a termination mechanism. Remember that well, it'll be useful later.

The stack now looks like:

foo iteration=4
foo iteration=3
foo iteration=2
foo iteration=1
foo iteration=0
(function) (local variables)

Now execution of foo(4) continues. We print "exiting foo" then return to foo(3)

foo iteration=3
foo iteration=2
foo iteration=1
foo iteration=0
(function) (local variables)

And so on until there are no more foos on the stack. At this point we print "finished" and we are done.

So does that mean recursion is just a more verbose, confusing, and error prone version of looping? Not quite. There are a lot of things you can do with recursion that you just can't achieve with loops. This section demonstrated the basic mechanisms of recursion. In the next section we can do something a little bit more interesting.

Visiting leaves on a tree

Let's say you have a bunch of files on your computer and these files are organized into directories. Your directories can be nested to an arbitrary depth (meaning you can have directories inside directories inside directories inside directories for as many levels as you want). Seems reasonable enough.

Now let's say you want to write a script that could be used to search for files that match some arbitrary condition. For example, we could require that our function returns all files with .png extension that are more than 5MB in size.

As an aside, we'll be making use of Python's wonderful os package in the code that follows. os is used to interact with your OS (operating system). I'll explain all of the function calls in comments in the code.

Alrighty, now for some code. The comparison part is fairly straightforward so let's just get that out of the way:

def is_big_png(file_path):
    """ return True if the file at the given path is a png and is at least 5MB in size
    """
    if file_path.endswith('.png'):
        # get file statistics
        statinfo = os.stat(file_path)
        # st size is the size of the file in Bytes
        return statinfo.st_size >= 5*1024*1024
    return False

Now for the fun part 😃

def find_files(root_path,condition_function):
    """ return a list of file paths for files within the given root directory, where the condition function returns true """
    ...

Since Python functions are objects, they can be assigned to variables and used as arguments to other functions. So when we call find_files, we will pass in is_big_png as our condition function.

Let's first assume that the directory has no subdirectories. That is, we are only looking at files within the root directory.

def find_files(root_path,condition_function):
    """ return a list of file paths for files within the given root 
    directory, where the condition function returns true """
    
    # get a list of file and directory names within the given path
    
    all_directory_contents = os.listdir(root_path)
    
    # for each of the names, get its path
    
    all_content_paths = [
        os.path.join(root_path,name)
        for name in all_directory_contents
    ]
    
    # now we only care about files, not directories
    
    file_paths = [
        path
        for path in all_content_paths
        if not os.path.isdir(path) # isdir checks if the path is a directory
    ]
    
    # now find the files matching the given condition
    
    matching_file_paths = [
        path for path in file_paths
        if condition_function(path)
    ]
    
    # and return them
    
    return matching_file_paths

The in line comments should make the above example make sense. One possible point of confusion is the use of list comprehensions. You can think of them as compressed for loops. They can be used to make your code a lot more concise.

List comprehensions are a bit outside the scope of this article, but you can learn more about them here. And if you are having trouble understanding, please ask. If there is enough interest, I'll probably write an article explaining them.

Now for the recursive part. We assumed that the directory at root_path only contained files. Now, let's fix that assumption by allowing child directories.

Most of the code from before doesn't change, the new code is clearly marked.

def find_files(root_path,condition_function):
    """ return a list of file paths for files within the given root 
    directory, where the condition function returns true """
    
    # get a list of file and directory names within the given path
    
    all_directory_contents = os.listdir(root_path)
    
    # for each of the names, get its path
    
    all_content_paths = [
        os.path.join(root_path,name)
        for name in all_directory_contents
    ]
    
    # now we only care about files, not directories
    
    file_paths = [
        path
        for path in all_content_paths
        if not os.path.isdir(path)
    ]
    
    # now find the files matching the given condition
    
    matching_file_paths = [
        path for path in file_paths
        if condition_function(path)
    ]

    ########################
    # new code begins here #
    ########################

    # now we care about directories, not files
    
    directory_paths = [
        path
        for path in all_content_paths
        if os.path.isdir(path)
    ]
    
    # we need to look inside each of the child directories 
    # and add the results to the returned list
    
    for child_directory_path in directory_paths:
    
    	# Recurse !!!
        
        child_matching_file_paths = find_files(
            child_directory_path,
            condition_function
            )
            
        # we now add the newly found paths to the list that we want to return
        
        matching_file_paths.extend(child_matching_file_paths)

    ######################
    # new code ends here #
    ######################

    # and return them
    
    return matching_file_paths

Before we continue, see if you can spot the termination mechanism...

The part of the function that actually performs the recursion is here:

for child_directory_path in directory_paths:
    
    	# Recurse !!!
        
        child_matching_file_paths = find_files(
            child_directory_path,
            condition_function
            )
            
        # we now add the newly found paths to the list that we want to return
        
        matching_file_paths.extend(child_matching_file_paths)

Our termination mechanism isn't an explicit if statement. It's in the structure of the for loop. If directory_paths is empty, then the body of the for loop never executes. Thus, the funcction does not recurse any further. So if a directory contains no child directories, then the recursion terminates.

Now, to call this function on your home directory (if you are running Linux) you would do this:

find_files(root_path='/home/your_name',condition_function=is_big_png)

Awesome! We wrote some code that can be genuinely useful in your life. In terms of recursion, we have traversed a tree. Trees are a common structure in computer programming and life. You just traversed a directory tree, you could traverse a family tree, a decision tree, an org chart or any other strictly hierarchical structure in this way.

In the next section, we'll be dealing with a structure that is a bit more complex. Please make sure you understand the concepts up until this point before continuing.

Visiting nodes on a graph. With cycles 😃

Before we get cracking with this section, it's important to know what a graph is. In this section, we won't be talking about line graphs and other kinds of charts, we'll be talking about a bunch of nodes connected by edges. Take a look at the Wikipedia entry if you want to see what it's all about.

We won't be doing anything too hardcore to do with graph theory, it's just good to know what a graph is.

In the last section, we dealt with some simple trees. A tree is a spacial kind of graph in which there is a clear parent-child relationship between nodes (our nodes were files and directories). In this section, we will deal with traversing a graph in which any node can connect to any other node, and those connections have directions.

Don't feel intimidated, you deal with graphs like these all the time in life.

Okay, ready?

Here's the situation we'll be dealing with. Let's say we have a bunch of web pages with links between them (much like this web page linked to Wikipedia at the start of this section).

Now, let's say we want to count the number of links to each web page (to figure out how popular a web page is). And also, we only know one web page's URL to begin with. Consider the diagram below:

recursion graph 1.png

Looking at the diagram, we have 6 nodes (A, B, C ... F). There are a bunch of arrows between them. Each arrow represents a link, and the direction is significant (I linked to Wikipedia above, but Wikipedia didn't link to this page. The direction is from me to Wikipedia). Each of the links in the graph is given a name, e.g., edge ef represents a link from E to F.

Okay, before we get to the code, let's talk through the problem. Let's say we start off just knowing A's URL. At this point, all of the other nodes on the graph are invisible to us. Now A does have one link to it (ba) but we haven't discovered B yet so we don't know that.

Since A is a web page we can fetch it and check what it links to. It links to C, D and F (via edges ac,ad and af respectively). Let's tackle C first. (This should make you think of a for loop)

C has one known incoming link (we just traversed ac). And it links to B and D. Cool, let's deal with B first.
B has one known incoming link (we just traversed cb). And it links to A and F. That means A's incoming link count can be incremented to 1. Since we have already visited A, we won't check its links again or we will just keep going through the same cycle. So we would process F next.

Eventually all the nodes would be visited (except E since nobody links to it), and the link counts will be as follows (note, F's link count doesn't take efinto account since we never traverse that edge).

recursion graph 2.png

Okay, now for some Python code 😃

Now the graph above is totally made up so actually fetching those URLs wont work out. So let's start off by making a new Internet.

# internet.py

class Internet:
    """ this class has knowledge of all the web pages that exist. 
    Given the url of a web page its contents can be fetched """
    
    # the dictionary below represents the graph
    
    pages = {
        'http://A.com' : ['http://C.com','http://D.com','http://F.com'], # [ac,ad,af]
        'http://B.com' : ['http://A.com','http://F.com'], # [ba, bf]
        'http://C.com' : ['http://B.com','http://D.com'], # [cb,cd]
        'http://D.com' : [], # D doesn't link anywhere
        'http://E.com' : ['http://F.com'], # ef
        'http://F.com' : []  # F doesn't link anywhere
    }

    @classmethod
    def fetch_page_links(Cls,url):
        """ given a page url, return a list of all the urls that the page links to
        """
        if url not in Cls.pages:
            # the url does not exist on the Internet
            raise Exception(404)
        return Cls.pages[url]

Now our code will look something like this:

# simple_page_rank.py

from internet import Internet
...

Treat the Internet a a bit of a black box. We can query it only by calling fetch_page_links.

As an aside, it would be useful to understand that dictionaries are "mutable." If you don't know what that means, then I suggest you familiarize yourself with the concept of mutability. I recently wrote an article about Python lists that explains mutability. It may be of use to you.

Okay, back to the problem at hand. The code that follows is the actual solution:

def traverse_graph(url, visited_pages=None):
    """ this will do the actual work of traversing the graph.
    if the url has already been visited then return
    otherwise fetch all the links and visit each linked to page
    increment the known incoming link count as needed
    """
    visited_pages = visited_pages or {}
    
    # the visited_pages dictionary will be used to keep track of 
    # what we have visited and how many incoming links we know about for each page
    
    # visited_pages will look like this:
    #  { url_1 : known_incomming_link_count_for_url_1, url_2 ... }

    if url in visited_pages:
        return visited_pages # been there done that. Return

    # add the page to visited pages so that we don't exceed our maximum recursion
    
    visited_pages[url] = 0

    # visit each link
    
    for link_url in Internet.fetch_page_links(url):
    
    	# Recurse !!!
    
        traverse_graph(link_url,visited_pages)
        
        # and increment the incomming link count
        
        visited_pages[link_url] += 1

    # finally, return our dictionary
    
    return visited_pages

Can you spot the termination mechanism here? Here's a hint, there are actually two.

The first one is quite explicit:

if url in visited_pages:
        return visited_pages # been there done that. Return

If we have already dealt with this URL, then there is nothing left to do. Simply return, recurse no further.

The next termination mechanism is similar to what we say in the last example. The actual recursion happens here:

    for link_url in Internet.fetch_page_links(url):
    
    	# Recurse !!!
    
        traverse_graph(link_url,visited_pages)
        
        # and increment the incomming link count
        
        visited_pages[link_url] += 1

Look familiar?

So if Internet.fetch_page_links(url) returns an empty list, then the loop body does not execute, so there is no further recursion.

The other point of possible confusion is that darn mutability I keep mentioning. visited_pages is instantiated once, and then passed to every recursive call. It is present in every stack frame.

We aren't creating multiple copies of visited_pages and storing them in different memory locations or anything like that. Each stack frame has a reference to the same piece of memory that contains the contents of visited_pages.

So if you make a change to visited_pages in one stack frame, then the result will be available to whatever stack frame we process next (so long as the function was passed visited_pages, it's not like it is a global variable or anything like that.

If you are struggling to grasp the concept in the context of recursion, then take a look at this. It's a much simpler illustration of dictionary mutability:

>>> def bar(some_dict):
...     some_dict['a']='b'
... 
>>> def foo():
...     d = {}
...     bar(d)
...     return d
... 
>>> print(foo())
{'a': 'b'}
>>> print(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'd' is not defined

This kind of behavior can often lead to lots of weird and wonderful bugs, so it is worth knowing about it. In our case, this is very useful just because it means we can keep our code concise.

Now let's call our shiny new recursive function...

d = traverse_graph('http://A.com')
print(d)

# this outputs:
# {'http://A.com': 1, 'http://F.com': 2, 'http://B.com': 1, 'http://C.com': 1, 'http://D.com': 2}

As a bonus, here is a little trick you can use to make your printed dictionary look a little prettier:

import json
d = traverse_graph('http://A.com')
print(json.dumps(d,sort_keys=True,indent=2))

# this outputs:
# {
#   "http://A.com": 1,
#   "http://B.com": 1,
#   "http://C.com": 1,
#   "http://D.com": 2,
#   "http://F.com": 2
# }

So you can see that the numbers match those in the diagram, and E was not visited. As expected.

You can play around with different URLs as well. For example, you could start at E:

d = traverse_graph('http://E.com')
print(json.dumps(d,sort_keys=True,indent=2))

# this outputs:
# {
#   "http://E.com": 0,
#   "http://F.com": 1
# }

What we implemented above is a very simplified page rank algorithm. A more complex algorithm would add "weights" to the edges in the graph because some links are worth more than others. For example, a link from D is worth more than a link from B, because D is more popular than B.

Other examples of these kinds of graphs are all around us. The road system is an example, some streets are one way, some aren't. Another example is the low level infrastructure of the Internet (the real one), it is made up of lots of computers joined by a web of routers and other devices. Social networks as well. These things are everywhere.

Conclusion

Well done for getting this far! I suspect it was heavy going for most people. In this article, we covered the Python call stack and illustrated how it behaves when looping and when calling functions. We then introduced recursion (and probably the most common recursion error, maximum recursion depth exceeded) in terms of the stack.

We then got out of the kitchen and wrote some practical code. We traversed a directory tree to find some files and we met a more complex graph structure and implemented a simple page rank algorithm. The techniques we used in these examples can be applied to other graphs.

Discover and read more posts from Sheena
get started
post comments1Reply
Sourav Singha
6 years ago

Thanks for sharing your Idea and make mine.