Veteran Python developers (Pythonistas) preach about writing Pythonic code. If you've spent some time writing Python, you will have come across its best practices. But what exactly is Pythonic code, and how can you remember the major pain points and avoid the obvious bad practices?
Fortunately, the Python community is blessed with a relatively simple and complete set of code style guidelines and "Pythonic" idioms. These are among the key reasons for the high readability of Pythonic code. Readability and simple syntax are at the heart of Python.
In this post, we are going to talk about a few very important style guidelines and Pythonic idioms, and how to deal with legacy code.
One Statement of Code per Line
If you're writing disjointed statements on a single line, you're violating the essence of Python. The only exceptions are list comprehensions and a few other compound statements. These are allowed and accepted for their brevity and expressiveness.
Bad practice
print('foo'); print('bar')
if x == 1: print('foo')
if <complex comparison> and <other complex comparison>:
    # do something
Best practice
print('foo')
print('bar')
if x == 1:
    print('foo')
cond1 = <complex comparison>
cond2 = <other complex comparison>
if cond1 and cond2:
    # do something
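As a small runnable sketch of the last pattern, the snippet below names each half of a compound condition before combining them. The function can_checkout and its parameters are hypothetical, chosen only for illustration.

```python
def can_checkout(cart_total, balance, items_in_stock):
    """Hypothetical example: decide whether an order can go through."""
    # Name each condition so the combined check reads like a sentence.
    has_funds = balance >= cart_total
    has_stock = items_in_stock > 0
    return has_funds and has_stock


order_ok = can_checkout(cart_total=10, balance=20, items_in_stock=1)
order_rejected = can_checkout(cart_total=30, balance=20, items_in_stock=1)
```

Each named condition can also be inspected separately in a debugger, which is harder with one long expression.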
Explicit code
The simplest (and easiest to understand) way of writing code is always the best.
Bad practice
def make_complex(*args):
    x, y = args
    return dict(**locals())
Calling make_complex(1, 2) returns:
{'args': (1, 2), 'x': 1, 'y': 2}
This is redundant as we needed just x and y whereas it returns ‘args’ as well.
Also, if we had:
def make_complex(*args):
    x, y = args
    z = x + y
    return dict(**locals())
The above function will also return 'z': 3 in the locals dict. Best practice is to specify what's needed and follow the most straightforward approach. Remember to keep things simple and explicit.
Another pitfall of this bad practice is that if you pass more than two arguments when calling the function, for example make_complex(1, 2, 3), it raises a ValueError because there are too many values to unpack into x and y.
Best practice
def make_complex(x, y):
    return {'x': x, 'y': y}
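To see both versions side by side, here is a minimal sketch. The name make_complex_implicit is introduced here only to keep the two variants apart; it is not part of the original example.

```python
def make_complex_implicit(*args):
    # Implicit version: leaks every local name, including 'args'.
    x, y = args
    return dict(**locals())


def make_complex(x, y):
    # Explicit version: returns exactly what the caller needs.
    return {'x': x, 'y': y}


implicit = make_complex_implicit(1, 2)
explicit = make_complex(1, 2)

# A third argument cannot be unpacked into x and y.
try:
    make_complex_implicit(1, 2, 3)
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)
```

With the explicit signature, the same mistake surfaces as a TypeError at the call site, which points directly at the wrong call.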
Passing args to Functions
There are four different ways of passing arguments to a function:
- Positional arguments: These are the simplest form of arguments. Positional arguments are fully part of the function's meaning, and their order is the order in which they were defined. For instance, in cal_area(length, breadth) or send_msg(message, recipient), the developer doesn't have to worry about remembering that these two functions require two arguments, or their order.
  Note: In the above two examples, you can also call the functions with the arguments in a different order by using keywords, like cal_area(breadth=40.0, length=90) or send_msg(recipient='Mak', message='Hello there!').
- Keyword arguments: Often loosely called kwargs, these are typically used as optional parameters passed to the function. When a function has more than two or three positional parameters, its signature gets difficult to remember. Keyword arguments come in handy with default values. For example, a better way of writing a send_msg function would be send_message(message, recipient, cc=None, bcc=None). Here, cc and bcc are optional, and evaluate to None if no value is passed.
- Arbitrary argument list: If the business logic of the function requires an extensible number of positional arguments, it can be defined with the *args construct. Inside the function, args will be a tuple of all the remaining positional arguments. For example, send_msg(message, *args) can be called with each recipient as an argument: send_msg('Hello there!', 'God', 'Mom', 'Cthulhu'), and the function scope will have args equal to ('God', 'Mom', 'Cthulhu').
- Arbitrary keyword argument dictionary: If your function requires an undetermined series of named arguments, it is possible to use the **kwargs construct. In the function body, kwargs will be a dictionary of all the passed named arguments that have not been caught by other keyword arguments in the function signature.
  With *args, Python passes a variable number of non-keyword arguments to the function. With **kwargs, we can pass a variable number of keyword arguments instead.
For example, for the function below:
def introduction(**data):
    print("\nData type of argument:", type(data))
    for key, value in data.items():
        print("{} is {}".format(key, value))

introduction(Firstname="Sita", Lastname="Sharma", Age=22, Phone=1234567890)
introduction(Firstname="John", Lastname="Wood", Email="johnwood@nomail.com", Country="Wakanda", Age=25, Phone=9876543210)
The output will be:
Data type of argument: <class 'dict'>
Firstname is Sita
Lastname is Sharma
Age is 22
Phone is 1234567890
Data type of argument: <class 'dict'>
Firstname is John
Lastname is Wood
Email is johnwood@nomail.com
Country is Wakanda
Age is 25
Phone is 9876543210
Note: The same caution as for arbitrary argument lists is necessary. The reasons are similar: these powerful techniques are only to be used when there is a proven necessity, and should not be used if the simpler and clearer construct is sufficient to express the function’s intention.
If the coding style guide is followed wisely, your Python functions will be:
- easy to read (the name and arguments need no explanations)
- easy to change (adding a new keyword argument does not break other parts of the code)
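The four kinds of arguments can be combined in one signature. The sketch below is a hypothetical variant of send_message; the extra_recipients and headers names are assumptions made for illustration.

```python
def send_message(message, recipient, *extra_recipients, cc=None, bcc=None, **headers):
    """Hypothetical sketch combining all four kinds of arguments."""
    return {
        'message': message,                       # positional
        'to': (recipient,) + extra_recipients,    # positional + *args tuple
        'cc': cc,                                 # keyword with default
        'bcc': bcc,                               # keyword with default
        'headers': headers,                       # **kwargs dictionary
    }


sent = send_message('Hello there!', 'Mom', 'Dad', cc='Sis', priority='high')
```

Here 'Dad' lands in extra_recipients, cc is matched by name, and priority, which the signature does not name, is collected into headers.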
Return Statements
As a function grows in complexity, it becomes susceptible to having multiple return statements inside the function’s body. However, in order to keep a clear intent and a sustained readability level, it is preferable to avoid returning meaningful values at multiple output points in the function body.
For instance, take a look at the example below (explained by the inline comments) on how to avoid adding multiple output points and raise exceptions instead:
Bad practice
def complex_function(a, b, c):
    if not a:
        return None
    if not b:
        return None
    # Some complex code trying to compute x from a, b and c
    if x:
        return x
    if not x:
        # Some Plan-B computation of x
        return x
Best practice
def complex_function(a, b, c):
    if not a or not b or not c:
        # Raising an exception is better
        raise ValueError("The args can't be None")
    # Some complex code trying to compute x from a, b and c
    # Resist temptation to return x if succeeded
    if not x:
        # Some Plan-B computation of x
    return x  # One single exit point for the returned value x will help when maintaining the code.
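The outline above leaves the actual computation as comments. A runnable sketch of the same shape might look like this; average_score and its default parameter are hypothetical and stand in for the "complex code".

```python
def average_score(scores, default=0.0):
    """Hypothetical example: one exit point for the computed value."""
    if scores is None:
        # Invalid input is reported with an exception, not a silent None.
        raise ValueError("scores must not be None")
    result = None
    if scores:
        result = sum(scores) / len(scores)
    if result is None:
        # Plan B: fall back to the default for an empty sequence.
        result = default
    return result  # single exit point for the value


typical = average_score([2, 4])
fallback = average_score([])
```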
Writing Idiomatic Python
An idiom is a phrase that doesn't make literal sense, but makes sense once you're acquainted with the culture in which it arose. Programming idioms are no different. They are the little things you do daily in a particular programming language or paradigm that only make sense to a person familiar with its culture.
Python beginners may not yet know how to write idiomatic Python, so we've listed some common Python idioms:
Unpacking
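If you know the length of a list or tuple, you can assign names to its elements by unpacking it. For example, enumerate() yields a two-element tuple for each item in a list, which can be unpacked directly in the loop:
for index, item in enumerate(some_list):
    # do something with index and item
You can also use unpacking to swap variables: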
a, b = b, a
Nested unpacking works too:
a, (b, c) = 1, (2, 3)
In Python 3, PEP 3132 introduced extended unpacking:
a, *rest = [1, 2, 3]
# a = 1, rest = [2, 3]
a, *middle, c = [1, 2, 3, 4]
# a = 1, middle = [2, 3], c = 4
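The unpacking forms above can be tried together in one short script. The variable names are arbitrary.

```python
# enumerate() yields (index, item) tuples that unpack in the loop header.
pairs = []
for index, item in enumerate(['a', 'b']):
    pairs.append((index, item))

# Swap two variables without a temporary.
a, b = 1, 2
a, b = b, a

# Nested unpacking mirrors the shape of the right-hand side.
x, (y, z) = 1, (2, 3)

# Extended unpacking (PEP 3132): the starred name collects a list.
first, *rest = [1, 2, 3]
head, *middle, tail = [1, 2, 3, 4]
```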
Creating throwaway variables
If you need to assign something (for instance, in unpacking), but will not need that variable, use __:
filename = 'foobar.txt'
basename, __, ext = filename.rpartition('.')
Note: Many Python style guides recommend the use of a single underscore _ for throwaway variables rather than the double underscore __ recommended here. The issue is that _ is commonly used as an alias for the gettext() function, and is also used at the interactive prompt to hold the value of the last operation.
Using a double underscore instead is just as clear and almost as convenient. The benefit of this practice is eliminating the risk of accidentally interfering with either of these other use cases.
Create a length-N list of the same thing
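Use the Python list * operator to create a list that repeats the same item:
nones = [None] * 4
fours_of_fours = [[4]] * 5
Output:
[None, None, None, None]
[[4], [4], [4], [4], [4]]
Note that the * operator copies references, not objects: the five inner lists in fours_of_fours are the same list object, so changing one changes them all. When the repeated item is mutable, build the list with a list comprehension instead.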
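One caveat is worth demonstrating: with the * operator, a nested list is repeated by reference. The short sketch below compares that with a list comprehension, which builds a fresh inner list each time.

```python
# Every element of `shared` is the same inner list object.
shared = [[4]] * 3
shared[0].append(5)

# A comprehension evaluates [4] once per iteration, giving independent lists.
independent = [[4] for __ in range(3)]
independent[0].append(5)
```

After this runs, the append shows up in all three elements of shared but only in the first element of independent.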
Search for an item in a collection
Sometimes we need to search through a collection. Let’s look at two options: lists and sets. Take the following code for example:
def in_test(iterable):
    for i in range(1000):
        if i in iterable:
            pass
from timeit import timeit
timeit(
"in_test(iterable)",
setup="from __main__ import in_test; iterable = set(range(1000))",
number=10000)
Output: 0.5591847896575928
timeit(
"in_test(iterable)",
setup="from __main__ import in_test; iterable = list(range(1000))",
number=10000)
Output: 50.18339991569519
timeit(
"in_test(iterable)",
setup="from __main__ import in_test; iterable = tuple(range(1000))",
number=10000)
Output: 51.597304821014404
The same function is run against each collection, yet the timings differ greatly, because a membership test on a set takes advantage of the fact that sets in Python are hash tables. A set lookup takes O(1) time on average, whereas a lookup in a list or tuple has a time complexity of O(n).
To determine whether an item is in a list, Python will have to go through each item until it finds a matching item. This is time consuming, especially for long lists. In a set, on the other hand, the hash of the item will tell Python where in the set to look for a matching item. As a result, the search can be done quickly, even if the set is large.
Because of these differences in performance, it is often a good idea to use sets or dictionaries instead of lists in cases where:
- the collection will contain a large number of items
- you will be repeatedly searching for items in the collection
- you do not have duplicate items
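In practice this often means converting a list to a set once, before a loop of repeated lookups. A minimal sketch, where count_known and its arguments are hypothetical names:

```python
def count_known(words, vocabulary):
    """Count how many of `words` appear in `vocabulary`."""
    # Build the set once; each `in` test below is then a hash lookup
    # instead of a scan through the whole vocabulary list.
    lookup = set(vocabulary)
    return sum(1 for word in words if word in lookup)


known = count_known(['a', 'b', 'z'], ['a', 'b', 'c'])
```

The conversion itself costs one pass over the vocabulary, so it pays off when the number of lookups is large.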
Access a Dictionary Element
Don't use the dict.has_key() method, which was removed in Python 3.x. Instead, use the x in d syntax, or pass a default argument to dict.get(), which is more Pythonic.
Note: Python 2 reached end of life on January 1, 2020. It is advised to use Python 3.x for any sort of development, as most Python packages have stopped releasing updates for Python 2.x.
Bad practice
d = {'foo': 'bar'}
if d.has_key('foo'):  # has_key() exists only in Python 2
    print(d['foo'])  # prints 'bar'
else:
    print('default_value')
Best practice
d = {'foo': 'bar'}
print(d.get('foo', 'default_value'))  # prints 'bar'
print(d.get('thingy', 'default_value'))  # prints 'default_value'
# alternative
if 'foo' in d:
    print(d['foo'])
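The same idioms in a form that runs on Python 3. The settings dictionary and its keys are made up for illustration.

```python
settings = {'theme': 'dark'}

# get() returns the stored value, or the fallback when the key is missing.
theme = settings.get('theme', 'light')
font = settings.get('font', 'monospace')

# `in` tests for the key without raising KeyError.
has_font = 'font' in settings
```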
Filtering a List
Never remove items from a list while you are iterating over it. Why? Removing an item shifts the remaining items one position to the left while the loop's internal index keeps advancing, so the item that follows each removed one is skipped. This can lead to subtle, disastrous bugs.
Bad practice
# Remove elements greater than 2
num_list = [1, 2, 3]
for i in num_list:
    if i > 2:
        num_list.remove(i)
Don't make multiple passes through the list.
while i in num_list:
    num_list.remove(i)
Best practice
Use a list comprehension or generator expression:
# comprehensions create a new list object
filtered_values = [value for value in sequence if value != x]
# generators don't create another list
filtered_values = (value for value in sequence if value != x)
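To make the risk concrete, the sketch below removes items during iteration and compares the result with a list comprehension. Both function names are hypothetical.

```python
def remove_while_iterating(numbers, limit):
    numbers = list(numbers)  # work on a copy of the caller's list
    for n in numbers:
        if n > limit:
            # Removing shifts later items left, so the next one is skipped.
            numbers.remove(n)
    return numbers


def filter_with_comprehension(numbers, limit):
    # Builds a new list and never mutates the one being iterated.
    return [n for n in numbers if n <= limit]


buggy = remove_while_iterating([1, 5, 6, 2], 4)
correct = filter_with_comprehension([1, 5, 6, 2], 4)
```

In the first version, 6 survives because it slides into the position the loop has already visited when 5 is removed.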
Updating Values in a List
Remember that assignment never creates a new object. If two or more variables refer to the same list, changing one of them changes them all.
Bad practice
# Add three to all list members.
a = [3, 4, 5]
b = a  # a and b refer to the same list object
for i in range(len(a)):
    a[i] += 3  # b[i] also changes
Best practice
a = [3, 4, 5]
b = a
# assign the variable "a" to a new list without changing "b"
a = [i + 3 for i in a]
b = a[:]  # a slice makes an independent copy of a list
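The difference between an alias, a copy, and a rebound name can be checked directly. The names alias, snapshot, and rebound are chosen here for illustration.

```python
a = [3, 4, 5]
alias = a          # second name for the same list object
snapshot = a[:]    # shallow copy: a new list with the same items

a[0] = 99          # in-place change, visible through every alias

# The comprehension builds a new list; `a` itself is left untouched.
rebound = [i + 3 for i in a]
```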
Read From a File
Use the with open syntax to read from files. This will automatically close files for you.
Bad practice
f = open('file.txt')
a = f.read()
print(a)
f.close()
Best practice
with open('file.txt') as f:
    for line in f:
        print(line)
The with statement is better because it ensures you always close the file, even if an exception is raised inside the block.
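A self-contained way to try this is to write a temporary file first. The sketch below uses the standard tempfile module so it does not depend on file.txt existing; the file contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'file.txt')

    # Write two lines, then read them back; both files close automatically.
    with open(path, 'w') as f:
        f.write('first\nsecond\n')

    with open(path) as f:
        lines = [line.rstrip('\n') for line in f]

    # The file object still exists after the block, but it is closed.
    closed_after_block = f.closed
```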
Dealing with Legacy Code
We’ve covered the basics of writing good code in Python. It’s now worth looking at the art of handling big projects in Python. How can you take up new open-source or closed-source projects? What are the steps to refactor legacy code? What are the best practices to get yourself up to speed on a new project?
Often when you join a new organization, you're given a codebase to comprehend and refactor, or you need to take up legacy code to refactor. In this situation, you may find yourself in deep distress, unable to figure out where to start.
At this point, it's important to define “legacy code/project” so that we're all on the same page. Here's what you'll come across:
- an “older” project that has been around forever
- a code base without any kind of tests
- the project that no one wants to work on
- “Everyone who worked on this left the company years ago…”
All of the above are somewhat right, but sometimes projects are done in haste and put into production before everyone realizes that there is a lot of scope for improvement. So, how shall we tackle a legacy project?
Below is a quick list of steps you should follow in order to make your journey of refactoring simpler and smoother:
- First and foremost, make sure the project is in a version control system.
- Delete commented out code. (Once the project is in production, always make sure that you remove the commented code.)
- Run tests/add tests. Make sure that you have at least 80% test coverage. Use pytest together with a coverage tool such as coverage.py or the pytest-cov plugin to track test coverage.
- Use Pylint/Vulture. Always consider running some type of linter over the code to see how "healthy" it is. Try to look for:
  - Unused variables
  - Anything that is noted as a potential bug
- Use style checkers and formatters like Flake8 or autopep8. These tools can be used to check and reformat Python code to make it more PEP 8 compliant.
- Write more idiomatic Python (as described above).
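For the "add tests" step, one common starting point is to pin down what the legacy code currently does before changing it. The sketch below assumes a hypothetical legacy function, legacy_price, and two tests written in the test_ naming style that pytest collects by default.

```python
def legacy_price(quantity, unit_price):
    """Stand-in for an untested legacy function (hypothetical)."""
    total = quantity * unit_price
    if quantity >= 10:
        total = total * 0.9  # bulk discount buried in the old code
    return round(total, 2)


def test_legacy_price_small_order():
    # Records current behaviour so a refactor cannot change it unnoticed.
    assert legacy_price(2, 5.0) == 10.0


def test_legacy_price_bulk_discount():
    assert legacy_price(10, 5.0) == 45.0
```

Once tests like these pass against the untouched code, each refactoring step can be checked by rerunning them.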
Conclusion
With the exploding Python community and budding Pythonistas, we have Python in almost all development fields such as data science, web development, mobile development, and AI, etc. As such, it is increasingly important to make sure we always ship enterprise-grade code following proper guidelines.
Thanks to these basic tools — and the beauty of the Python language itself — producing awesome code and products doesn’t have to be a scary proposition. Now that you’ve gone through these guidelines, go ahead and try these on an open source Python project!
For more Python best practices, check out these posts:
- Python Exception Handling
- Python Lists in Depth
- A Python Import Tutorial for Beginners
- Learn Python by building projects with DevProjects
- Refer to the code-style documentation by Kenneth Reitz for more comprehensive guidance.
Notes and references:
[1] One Statement of Code per line - Code Style from The Hitchhiker's Guide to Python
[2] Passing args to function - Code Style from The Hitchhiker's Guide to Python