Improve Your Code With Atomic Functions
In your studies, you may have encountered the terms "atomic function" and "Don't Repeat Yourself (DRY)". Today, I'm going to demonstrate how these concepts work together to provide easily maintainable, easily testable, and beautiful code.
The Problem
Often, beginners write functions that simply do too much. Such functions may take a large number of arguments, perform several logic checks, and possibly manage multiple loops. They instantiate variables internally which affect further internal operations.
The Python function below briefly exemplifies these dynamics:
import math
def find_prime_factors(number):
limit = int(math.sqrt(number)) + 1
factors = [(num, int(number / num)) for num in range(1, limit) if number % num == 0]
prime_factors = []
if len(factors) == 1:
return factors
for num in factors[1]:
limit = int(math.sqrt(num)) + 1
factors2 = [(x, int(num / x)) for x in range(1, limit) if num % x == 0]
if len(factors2) == 1:
prime_factors.append(num)
else: prime_factors.extend(find_prime_factors(num))
return prime_factors
Functions designed in this manner have predictability problems, which in turn make them difficult to test and troubleshoot. If the function fails a test or provides an unexpected result, the reason is simply not clear. Certainly, one of the internal logic tests may have failed or an internal loop could have behaved unexpectedly. In any case, troubleshooting it requires working through every line until the problem is found.
Atomicity
Atomic functions resolve the above issues by reducing each function to a single operation. I have refactored find_prime_factors() into a set of atomic functions below:
import math
def find_prime_factors(number):
if is_prime(number):
return [1, number]
return _find_prime_factors(factor_number(number)[1])
def is_prime(number):
return len(factor_number(number)) == 1
def factor_number(number):
limit = int(math.sqrt(number)) + 1
return [(num, int(number / num)) for num in range(1, limit) if number % num == 0]
def _find_prime_factors(factor_list):
primes = []
for num in factor_list:
if is_prime(num):
primes.append(num)
else: primes.extend(find_prime_factors(num))
return primes
Notice how short each function is. They range from one to seven lines each. Often, they return an expression or another function without instantiating a variable to contain the resulting value. Some functions accept the output of other functions as arguments allowing them to be strung together, effectively combining their internal codes.
Atomic functions benefit us in several ways. First, they are simple to test. Because each function has as single responsibility, we can easily predict what the output of each function will be with a given input. We know the output of factor_number() for any given number without working through dozens of lines of logical tests. The function is straightforward in what it does. Since we have a straightforward expectation from each function, our tests can be straightforward and concise.
Second, since each function is simple to test, they are simple to troubleshoot as well. Each function has so few lines that the errors have no room to hide. If is_prime() gives us an erroneous result and factor_number() does not, then the error must be on the single line of is_prime().
Third, just like atoms combine into molecules, atomic functions combine to have different effects. The factor_number() function is used twice - once to determine if a number is prime and once to return the factors of the number. This also means that we can easily elaborate upon our design. We can write a dozen more functions that employ these functions in various different ways. Because we can reuse them infinitely, we can satisfy the DRY principle as well by using these functions instead of repeating ourselves.
Fourth, our end result is easier to maintain. If we determine that factor_number() isn't working correctly, we can correct the function and that correction affects the entire program. Any given problem can be traced to a single line. Because atomic functions result in DRY code, any problem we encounter can be traced to a few lines of code in a single function. This drastically reduces the time needed to correct a problem.
Thanks for this article. I agree with the general idea, but maybe the example is not the best to illustrate it. Personally, I would not decompose the code of find_prime_factors() as you propose. Having to write _find_prime_factors() is awkward in my opinion; just the name of the function indicates that it has no clear purpose, and additionally, it tends to hide the recursive nature of the algorithm.
Admittedly, it’s a manufactured problem. I needed something to demonstrate and this came to mind first. It isn’t necessarily the best example.