Comprehensions in Python
Python provides comprehensions as a handy way to create containers such as lists, dictionaries, and sets, as well as generators. They are not only a short and concise way to construct these objects, but they also often perform better than the equivalent explicit loop.
Python supports four types of comprehensions, each creating a different kind of object.
- List comprehension
- Dictionary comprehension
- Set comprehension
- Generator comprehension
In this article, we’ll explore all of them with some basic examples. Once you are familiar with one of them, you’ll be able to work with the other types easily.
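The performance remark in the introduction can be checked on your own machine. The sketch below is only an illustration: the helper names squares_loop and squares_comprehension are made up for this example, and the timings depend on the machine and the Python version, so no expected numbers are shown.

```python
import timeit


def squares_loop(n):
    # Hypothetical helper: build the list with an explicit loop and append().
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n):
    # Hypothetical helper: build the same list with a list comprehension.
    return [i * i for i in range(n)]


# Both approaches must produce identical lists.
same_result = squares_loop(1000) == squares_comprehension(1000)

# Time 200 calls of each helper; results vary from run to run.
loop_time = timeit.timeit(lambda: squares_loop(1000), number=200)
comp_time = timeit.timeit(lambda: squares_comprehension(1000), number=200)
print(f"loop: {loop_time:.4f}s, comprehension: {comp_time:.4f}s")
```

On CPython the comprehension is usually somewhat faster, but it is worth measuring rather than assuming.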
List Comprehension
List comprehensions are an elegant way to create lists, and they are the most widely used of the four types. A list comprehension generates an output list from an input iterable, with optional intermediate logic. The basic syntax is as follows.
list_out = [<expression(item)> for item in iter_in if <condition(item)>]
This process iterates over the iterable object iter_in and feeds each item to the intermediate logic <expression(item)> before appending the result to list_out. If the <condition> is specified, the items are filtered by that condition before being passed to the expression. This process is equivalent to the following typical loop.
list_out = []
for item in iter_in:
    if condition(item):
        item_new = expression(item)
        list_out.append(item_new)
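To make the equivalence concrete, here is a small runnable sketch. The functions expression and condition and the sample iter_in are placeholders chosen for illustration only; any transformation and any filter would work the same way.

```python
def expression(item):
    # Hypothetical transformation: double the item.
    return item * 2


def condition(item):
    # Hypothetical filter: keep items greater than 2.
    return item > 2


iter_in = [1, 2, 3, 4, 5]

# Comprehension form.
list_comp = [expression(item) for item in iter_in if condition(item)]

# Equivalent explicit loop.
list_loop = []
for item in iter_in:
    if condition(item):
        list_loop.append(expression(item))

print(list_comp)               # [6, 8, 10]
print(list_comp == list_loop)  # True
```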
✨ Example | List Comprehension
Let’s use the range function to create a list of numbers (from -10 to 10), then we’ll apply list comprehensions to generate several new lists.
n_list = list(range(-10, 11))
[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Now, we can use a list comprehension to create a list containing the squares of the items in n_list.
n_list_1 = [num ** 2 for num in n_list]
When the condition is omitted, all the items from n_list are passed to the expression num ** 2 and added to the new list n_list_1.
[100, 81, 64, 49, 36, 25, 16, 9, 4, 1, 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
If we want to keep only the non-negative values, we can add the condition num >= 0 as follows.
n_list_2 = [num ** 2 for num in n_list if num >= 0]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
We can also add further if clauses, which work like nested if blocks or like conditions combined with and. A second condition can be used to keep only the odd values as follows:
n_list_3 = [num ** 2 for num in n_list if num >= 0 if num % 2 != 0]
So now we have only the squares of positive odd numbers from the original list.
[1, 9, 25, 49, 81]
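As a quick check of the claim that chained if clauses behave like an and combination, the following sketch builds the list both ways and compares the results. The names chained and combined are introduced here for illustration.

```python
n_list = list(range(-10, 11))

# Two chained if clauses, as in n_list_3.
chained = [num ** 2 for num in n_list if num >= 0 if num % 2 != 0]

# A single if clause that joins both conditions with "and".
combined = [num ** 2 for num in n_list if num >= 0 and num % 2 != 0]

print(chained)              # [1, 9, 25, 49, 81]
print(chained == combined)  # True
```

The and form is often easier to read, since a reader does not have to notice the second if.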
It’s also possible to use other kinds of iterables as input, such as an enumerate object.
n_list_4 = [num ** 2 for (i, num) in enumerate(n_list) if num >= 0 and i % 2 == 0]
The enumerate function yields (index, value) tuples from the input list, so both the index i and the value num can be used in the filtering condition or in the expression.
The previous comprehension results in a list containing the squares of the non-negative values that sit at even indices in the list.
[0, 4, 16, 36, 64, 100]
Dictionary Comprehension
Dictionaries can be created using dictionary comprehension with the following basic syntax:
dict_out = {<get_key(item)>: <get_value(item)> for item in iter_in if <condition(item)>}
Here, <get_key(item)> is an expression representing the key and <get_value(item)> is an expression representing the value for each entry in the output dictionary. Both expressions can be functions of the item from the input iterable. Finally, the optional <condition(item)> filters the items before they are added to the dictionary.
This process is similar to the following typical syntax:
dict_out = dict()
for item in iter_in:
    if condition(item):
        key = get_key(item)
        value = get_value(item)
        dict_out[key] = value
✨ Example | Dictionary Comprehension
Let’s split the English idiom “Better late than never” into a set of unique words, then we’ll apply a dictionary comprehension to create a new dictionary containing the words as keys and their indices in the set as values.
words = set("Better late than never".split(" "))
{'Better', 'late', 'never', 'than'}
After getting the set of words, we can construct the dictionary of words and their indices as follows.
words_ind = {word: i for i, word in enumerate(words)}
{'Better': 0, 'late': 1, 'never': 2, 'than': 3}
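One caveat: a set has no guaranteed iteration order, so the indices produced by enumerate(words) can differ from the output shown above and may change between runs. If reproducible indices matter, one option (an assumption about what you need, not part of the original example) is to sort the words first, as in this sketch.

```python
words = set("Better late than never".split(" "))

# Sorting gives a stable order, so each word always gets the same index.
words_ind = {word: i for i, word in enumerate(sorted(words))}

print(words_ind)  # {'Better': 0, 'late': 1, 'never': 2, 'than': 3}
```

Note that the default string sort places uppercase letters before lowercase ones, which is why 'Better' comes first.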
When working with text data, it’s common to invert the word mapping so that we can look up in both directions between words and their indices. We can do so in our example as follows.
ind_words = {val: key for key, val in words_ind.items()}
{0: 'Better', 1: 'late', 2: 'never', 3: 'than'}
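Inverting a dictionary this way is only lossless when the values are unique, which holds here because every word has its own index. The sketch below checks the round trip and then shows what happens with a hypothetical mapping, dup, in which two keys share a value.

```python
words_ind = {"Better": 0, "late": 1, "never": 2, "than": 3}

# Invert the mapping, then invert it back again.
ind_words = {i: word for word, i in words_ind.items()}
round_trip = {word: i for i, word in ind_words.items()}
print(round_trip == words_ind)  # True

# Hypothetical mapping with a duplicated value: the later key wins.
dup = {"a": 1, "b": 1}
inverted_dup = {value: key for key, value in dup.items()}
print(inverted_dup)  # {1: 'b'}
```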
Optionally, we can add a condition to filter the entries, for example keeping only words shorter than six characters.
ind_words_lim = {val: key for key, val in words_ind.items() if len(key) < 6}
{1: 'late', 2: 'never', 3: 'than'}
Set Comprehension
As with list comprehensions, we can create a set using the same syntax, but wrapped in curly braces {} instead of square brackets.
set_out = {<expression(item)> for item in iter_in if <condition(item)>}
This syntax is similar to the following typical code.
set_out = set()
for item in iter_in:
    if condition(item):
        item_new = expression(item)
        set_out.add(item_new)
✨ Example | Set Comprehension
Let’s create a list of random numbers from 0 to 10 with possible duplicates, then we’ll apply a set comprehension to create a set of the unique numbers in the list.
from random import randint
random_list = [randint(0, 10) for i in range(10)]
[3, 4, 6, 9, 0, 9, 7, 4, 5, 9]
Now, we can apply the set comprehension on the list as follows.
n_set_1 = {num for num in random_list}
{0, 3, 4, 5, 6, 7, 9}
Additionally, we can specify a condition to keep only the odd values.
n_set_2 = {num for num in random_list if num % 2 != 0}
{3, 5, 7, 9}
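Because randint produces different numbers on every run, your output will not match the one above. To reproduce the example exactly, the sketch below hard-codes the sample list shown earlier instead of generating it. The results are sorted before printing only to make the output order predictable.

```python
# Fixed copy of the sample values shown above, instead of fresh random numbers.
random_list = [3, 4, 6, 9, 0, 9, 7, 4, 5, 9]

n_set_1 = {num for num in random_list}
n_set_2 = {num for num in random_list if num % 2 != 0}

print(sorted(n_set_1))  # [0, 3, 4, 5, 6, 7, 9]
print(sorted(n_set_2))  # [3, 5, 7, 9]

# Without a transformation or filter, the comprehension matches set().
print(n_set_1 == set(random_list))  # True
```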
Generator Comprehension
Finally, a generator comprehension (called a generator expression in the official Python documentation) has almost the same syntax as a list comprehension, but it is wrapped in parentheses () instead of square brackets.
One more significant difference is that a generator comprehension doesn’t allocate memory for all items at once like a list does. Instead, it produces the items lazily, one by one, as the iteration proceeds, so it can use much less memory.
gen_out = (<expression(item)> for item in iter_in if <condition(item)>)
It’s possible to achieve the same result with an ordinary generator function.
def generate_out(iter_in):
    for item in iter_in:
        if condition(item):
            item_new = expression(item)
            yield item_new

gen_out = generate_out(iter_in)
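The memory difference mentioned above can be observed roughly with sys.getsizeof. This is only a sketch: getsizeof reports the size of the list or generator object itself, not of the integers it refers to, and the exact byte counts depend on the Python implementation, so none are shown here.

```python
import sys

n = 100_000

# The list stores references to all n results up front.
squares_list = [i * i for i in range(n)]

# The generator stores only its current state and computes items on demand.
squares_gen = (i * i for i in range(n))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)  # True on CPython

# The generator can still feed any consumer that accepts an iterable.
total = sum(squares_gen)
print(total == sum(squares_list))  # True
```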
✨ Example | Generator Comprehension
For this example, let’s create a list of numbers from 0 to 9, then we’ll apply a generator comprehension to create a generator of the odd numbers in that list.
n_list = list(range(10))
gen_out = (num for num in n_list if num % 2 != 0)
Now we can iterate over the generated items using the built-in next function or a for .. in .. statement.
print(next(gen_out))
print(next(gen_out))
print(next(gen_out))
1
3
5
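The for .. in .. form works as well, and it also shows that a generator can be consumed only once. In this sketch, the names first, rest, and leftover are introduced for illustration.

```python
n_list = list(range(10))
gen_out = (num for num in n_list if num % 2 != 0)

# Take the first item manually.
first = next(gen_out)

# A for loop continues from where next() stopped.
rest = []
for num in gen_out:
    rest.append(num)

# The generator is now exhausted; a default avoids a StopIteration error.
leftover = next(gen_out, None)

print(first)     # 1
print(rest)      # [3, 5, 7, 9]
print(leftover)  # None
```

To iterate again, recreate the generator or store the items in a list.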
Finally!
Now that you have learned how to create these objects in a Pythonic way using comprehensions, you’ll be able to write more concise and readable code. Keep in mind, though, that overusing them, especially as nested comprehensions, can have the opposite effect and make your code harder to read.
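As an illustration of that trade-off, the sketch below flattens a small hypothetical matrix and keeps its even values, first with a nested comprehension and then with explicit loops. Both give the same result, and which one reads better is a judgment call.

```python
# Hypothetical sample data.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Nested comprehension: the for clauses run left to right, like nested loops.
flat_even = [value for row in matrix for value in row if value % 2 == 0]

# The same logic written as explicit loops.
flat_even_loop = []
for row in matrix:
    for value in row:
        if value % 2 == 0:
            flat_even_loop.append(value)

print(flat_even)                    # [2, 4, 6, 8]
print(flat_even == flat_even_loop)  # True
```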
I find it really interesting, good article.
Thank you! I’m glad you like it
Why not directly do
words_ind = {word: i for word, i in enumerate(words)}
instead of
words_ind = {word: i for i, word in enumerate(words)}
to have index first ?
In some cases (eg. NLP) you may need the mapping to be in both directions “word:num” and “num:word” so you can easily go back if needed. So you’ll end up with two dictionaries by reversing key and value anyways.
What does NLP mean?
Natural Language Processing