Codementor Events

Python's Counter - Part 2

Published May 20, 2019Last updated May 29, 2019
Python's Counter - Part 2

This is a continuation from the introductory post on python's Counter tool. If you want to read that before coming back to this, you can find it here.

Can we apply Counter to any form of data?

Counter is an amazing tool that simplifies the task of counting items, but, No! it works only on iterables - there is more to this, so keep reading. In the meantime, what is an iterable? In basic terms, an iterable stores a sequence of values, or characters, which you can traverse.

Let's look at examples:

>>> a = [1, 2, 3, 4, 5, 6, 7, 3, 3]
>>> b = (1, 2, 3, 4, 5, 6, 7, 2, 2)
>>> c = "This is a string."
>>> m = 123456

a is an iterable. It is a list of integers. You can traverse these numbers, one after the other.
b is also an iterable. It is a tuple, containing values, one after the other as well.
c is also an iterable. It is a sequence of characters.
m is not iterable. It is an integer and not a sequence.

What happens when we apply a Counter to a, b, c, and m?
Firstly, we will import Counter from collections

>>> from collections import Counter 

Next we will apply Counter to a, b, and c.

>>> Counter(a)
Counter({3: 3, 1: 1, 2: 1, 4: 1, 5: 1, 6: 1, 7: 1})
>>> Counter(b)
Counter({2: 3, 1: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1})
>>> Counter(c)
Counter({'i': 3, 's': 3, ' ': 3, 'T': 1, 'h': 1, 'a': 1, 't': 1, 'r': 1, 'n': 1, 'g': 1, '.': 1})

Just as always, Counter has given us the number of times each element occurs in the given iterables.

For a, the integer 3 occurs three times and each of the other numbers occur just once.
For b, the integer 2 occurs three times and each of the other numbers occur just once.
The behaviour with c is interesting. Counter gave us the number of times each letter or character occured in the text. i, s, and the space character occured three times each, and each of the other characters occured only once.

What happens with m?

>>> Counter(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\DELL\AppData\Local\Programs\Python\Python37-32\lib\collections\__init__.py", line 566, in __init__
    self.update(*args, **kwds)
  File "C:\Users\DELL\AppData\Local\Programs\Python\Python37-32\lib\collections\__init__.py", line 653, in update
    _count_elements(self, iterable)
TypeError: 'int' object is not iterable

anger-angry-annoyed-987585.jpg

We get a TypeError, screaming that m is not iterable 😃.
We can safely settle with this: Counters only work with iterables.

Can we use a Counter meaningfully with every iterable?

Let's see...

>>> d = {1, 2, 3, 4, 5, 6, 7, 3, 3}
>>> Counter(d)
Counter({1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1})

This is not what we expected. Counter says every number occured just once, but we know 3 occured three times.

The reason is simple. d is a set and cannot contain duplicates. Even though we assigned several numbers with 3 occurring several times, the duplicates were removed before it was stored in the variable d. Let's break that down in code.

>>> d = {1, 2, 3, 4, 5, 6, 7, 3, 3}
>>>
>>> # what does d contain?
>>> d
{1, 2, 3, 4, 5, 6, 7}
>>>
>>> Counter(d)
Counter({1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1})

We can now understand why we got that result. The problem was that d didn't have duplicates, and Counter rightfully counted only one of each number.
So, we cannot use Counter meaningfully on a set, even though a set is an iterable.

Let's see what happens with a dictionary.

>>> e = {'num1': 1, 'num2':2, 'num3':3, 'num4':3, 'num5':3}
>>> 
>>> # What does e contain?
>>> e
{'num1': 1, 'num2': 2, 'num3': 3, 'num4': 3, 'num5': 3}
>>> Counter(e)
Counter({'num3': 3, 'num4': 3, 'num5': 3, 'num2': 2, 'num1': 1})
>>> 
>>> 
>>>
>>> f = {'str1':'this', 'str2':'that', 'str3':'then', 'str4':'then', 'str5':'that'}
>>> 
>>> # What does f contain?
>>> f
{'str1': 'this', 'str2': 'that', 'str3': 'then', 'str4': 'then', 'str5': 'that'}
>>> Counter(f)
Counter({'str1': 'this', 'str3': 'then', 'str4': 'then', 'str2': 'that', 'str5': 'that'})
>>> 
>>>

Above, we have two dictionaries, containing different value types. The first one, e, contains numbers, while the second, f, contains strings.
They behave similarly with Counter; Counter seems not to have done any counting and returned the key-value pairs in the dictionaries as is. Except, for a change in order - the values appear to be sorted in descending order. Let's reserve the reason for another post 😃.

Apparently, using Counter on a dictionary is not very meaningful as well, unless, the dictionary values are numbers. In that case, the values are treated as the counts of the keys, as demonstrated with e.

Thus, even though we can traverse a dictionary - I partially consider it an iterable - it is not very meaningful using a Counter on it. I mentioned that a dictionary is partially iterable because you can iterate over its keys and not its values. Let's demonstrate this:


>>> # mydict is a sample dictionary containing
>>> # information about John Doe
>>> mydict = {'name':'John Doe', 'id':'DST024315', 'age':40, 'country':'Germany'}
>>>
>>> # let's iterate over it
>>> for item in mydict:
...    print(item)
...
name
id
age
country
>>>
>>> # the keys of the dictionary were printed
>>> # instead of the key-value pairs or values

As we have seen,
a. Counters work only on iterables
b. The use of Counter may not be appreciable on all iterables.

We observed that it is useful to apply Counters to lists, tuples, and strings, but not sets and dictionaries, when we want to count individual values. As a side note, it works well on a deque too, another tool similar to list but with added features. You can find out more on the use of a deque.

Finally, when we applied Counter on a text, it gave us the count of characters. What if we wanted to count words and not characters? Keep tuned for the next post on Counter.

Discover and read more posts from Victor Ayi
get started