How to Refactor Messy Code: A Step-by-Step Guide
🍀 Introduction: A Tale of Two Projects
Imagine this: You’re working on a university project with a tight deadline. Your team decides to divide the tasks and code individually. A week later, it’s time to integrate everything, but there’s a problem—no one understands each other’s code. Variable names like x and temp123, functions hundreds of lines long, magic numbers, and inconsistent formatting turn what should be a simple integration into a debugging nightmare.
The frustration is real. But the problem isn’t just the code—it’s dirty code. Writing clean code isn’t just about aesthetics; it’s about creating something maintainable, readable, and functional for you and others. Here’s how you can make your Python code cleaner, one step at a time.
💢 What is Dirty Code, and Why is it Problematic?
Dirty code refers to code that is hard to read, understand, or maintain. It may function correctly but is riddled with issues like:
- Unclear variable names
- Overly complicated logic
- Magic numbers or strings
- Long, monolithic functions
- Redundant or dead code
- Tight coupling between components
Why is it a problem? Because debugging and extending dirty code takes far more time than writing it in the first place. Plus, messy code often results in more bugs and technical debt over time.
For this discussion, I'm using Python because it's a versatile, beginner-friendly language. But the clean code practices covered here apply to any programming language.
💫 Effective Ways for Writing Clean Code
- Follow PEP 8: The official style guide helps keep your code consistent. This one is Python-specific; for other languages, follow that language's official style guide.
- Get Feedback: Ask peers or mentors to review your code. This one is super effective.
- Refactor Often: Revisit your code from time to time to simplify and clean it.
🧹 Start Cleaning Your Code Now
Let's try out some examples in Python. Remember, the language here is just a tool; the principles of clean code apply to any programming language.
- Use Meaningful Variable & Function Names: Instead of x or temp123, use descriptive names like count or total_area.
Before
    def calc(x, y):
        z = x * y
        return z
After
    def calculate_area(length, width):
        return length * width
Why? Descriptive names clarify the intent behind your code without needing comments.
- Apply the DRY Principle (Don’t Repeat Yourself): Refactor repetitive code into reusable functions or loops.
Before
    print("Hello John!")
    print("Hello Jane!")
    print("Hello Jim!")
After
    def greet(names):
        for name in names:
            print(f"Hello {name}!")

    greet(["John", "Jane", "Jim"])
Why? This reduces redundancy and makes updates simpler.
- Use Comments and Docstrings Wisely: Explain complex logic or algorithms in comments and docstrings. Document why something is done, not just what the code does.
Before
    def calculate_area(radius):
        return 3.14159 * radius * radius
After
    def calculate_area(radius):
        """Calculate the area of a circle given its radius."""
        PI = 3.14159
        return PI * radius ** 2
Caution: Over-commenting can clutter code; strike a balance.
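To illustrate the "why, not what" advice, here is a minimal sketch. The function name average_rating and the product-page scenario are made up for this example; the point is only that the comment explains a decision the code alone cannot.

```python
def average_rating(ratings: list[int]) -> float:
    """Return the mean rating, or 0.0 when there are no ratings yet."""
    # Why: a brand-new product has no ratings, and dividing by zero would
    # crash the (hypothetical) product page, so "no ratings" is treated as 0.0.
    if not ratings:
        return 0.0
    return sum(ratings) / len(ratings)


print(average_rating([4, 5]))
print(average_rating([]))
```

A comment such as "check if the list is empty" would only repeat what the if statement already says.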
- Use Type Hints: Specify function inputs and outputs with type hints for clarity.
Before
    def add_numbers(numbers):
        total = 0
        for num in numbers:
            total += num
        return total

    def add(a, b):
        return a + b

    def get_max_min(numbers):
        return max(numbers), min(numbers)
After
    def add_numbers(numbers: list[int]) -> int:
        total = 0
        for num in numbers:
            total += num
        return total

    def add(a: int, b: int) -> int:
        return a + b

    def get_max_min(numbers: list[int]) -> tuple[int, int]:
        return max(numbers), min(numbers)
Why? Type hints make your code more readable and help catch bugs early.
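Keep in mind that Python does not enforce type hints at runtime; they pay off when you read the code or run a static type checker (mypy is one such tool, which this post does not otherwise cover). The sketch below uses hypothetical helper names to show how an Optional return type pushes you to handle the "missing value" case up front.

```python
from typing import Optional


def find_user_age(ages: dict[str, int], name: str) -> Optional[int]:
    """Return the age stored for `name`, or None if the name is unknown."""
    return ages.get(name)


def years_until_retirement(age: int, retirement_age: int = 65) -> int:
    """Return how many years are left, never going below zero."""
    return max(0, retirement_age - age)


ages = {"Jane": 30}
age = find_user_age(ages, "Jane")
# The Optional[int] hint signals that `age` may be None, so we check
# before doing arithmetic with it.
if age is not None:
    print(years_until_retirement(age))
```

Without the check, a type checker would typically complain about passing a possibly-None value where an int is expected, which is the kind of bug that otherwise only shows up at runtime.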
- Avoid Overly Complicated Logic: Simplify nested if-else statements and loops.
Before
    def calculate_discount(price, is_member, has_coupon):
        if is_member:
            if has_coupon:
                discount = price * 0.25  # 25% discount
            else:
                discount = price * 0.15  # 15% discount
        else:
            if has_coupon:
                discount = price * 0.08  # 8% discount
            else:
                discount = 0  # No discount
        return discount

    def get_filtered_numbers(numbers):
        filtered = []
        for n in numbers:
            if (n % 3 == 0 and n % 5 == 0) or (n > 40 and n % 2 == 0):
                filtered.append(n)
        return filtered
After
    def calculate_discount(price: float, is_member: bool, has_coupon: bool) -> float:
        if is_member:
            return price * 0.25 if has_coupon else price * 0.15
        return price * 0.08 if has_coupon else 0

    def is_valid_number(num: int) -> bool:
        return (num % 3 == 0 and num % 5 == 0) or (num > 40 and num % 2 == 0)

    def get_filtered_numbers(numbers: list) -> list:
        return [num for num in numbers if is_valid_number(num)]
Result: Simplified logic makes your code easier to understand and maintain.
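Another way to flatten branching like the discount example is a lookup table. This is a sketch of an alternative, not a replacement for the version above; DISCOUNT_RATES is a name introduced here, and the rates are the same 25%, 15%, 8%, and 0% used in the example.

```python
# Discount rates keyed by (is_member, has_coupon).
DISCOUNT_RATES = {
    (True, True): 0.25,
    (True, False): 0.15,
    (False, True): 0.08,
    (False, False): 0.0,
}


def calculate_discount(price: float, is_member: bool, has_coupon: bool) -> float:
    """Return the discount amount for the given membership and coupon status."""
    # bool() keeps the lookup safe if a caller passes a truthy non-bool value.
    rate = DISCOUNT_RATES[(bool(is_member), bool(has_coupon))]
    return price * rate


print(calculate_discount(200, True, False))
```

The table makes every combination visible at a glance, and changing a rate means editing data rather than control flow.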
- Handle Exceptions Gracefully: Use try and except blocks to make your code robust and user-friendly.
Before
    value = int(input("Enter a number: "))
    print(10 / value)
After
    import logging

    try:
        value = int(input("Enter a number: "))
        print(10 / value)
    except (ValueError, ZeroDivisionError) as err:
        # Catch only the failures we expect: non-numeric input and division by zero
        logging.error(f"Error: {err}")
Result: Your program becomes more reliable and user-focused.
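One possible refinement is to handle each expected failure separately, so the message can say what actually went wrong. The sketch below wraps the same "10 divided by the input" logic in a function; divide_ten_by is a hypothetical name, and returning None on failure is a design choice made for this example.

```python
import logging
from typing import Optional


def divide_ten_by(raw_value: str) -> Optional[float]:
    """Parse `raw_value` as an integer and return 10 divided by it.

    Returns None, after logging the reason, when the input is unusable.
    """
    try:
        return 10 / int(raw_value)
    except ValueError:
        logging.error("%r is not a whole number", raw_value)
    except ZeroDivisionError:
        logging.error("Cannot divide by zero")
    return None


print(divide_ten_by("5"))
```

Taking the raw string as a parameter instead of calling input() inside the function also makes the error handling easy to test.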
- Avoid Hard-coded Values: Replace magic numbers or strings with constants, variables or enums.
Before
    def calculate_salary(hours, rate):
        if hours > 40:
            overtime = (hours - 40) * (rate * 1.5)
            salary = 40 * rate + overtime
        else:
            salary = hours * rate
        return salary
After
    REGULAR_HOURS = 40
    OVERTIME_RATE = 1.5

    def calculate_salary(hours: int, rate: float) -> float:
        overtime_hours = max(0, hours - REGULAR_HOURS)
        regular_hours = hours - overtime_hours
        return regular_hours * rate + overtime_hours * rate * OVERTIME_RATE
Result: Now you know what 40 and 1.5 mean without reading the entire function.
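The same idea works for magic strings. Here is a minimal sketch using Python's standard enum module; the OrderStatus class and the can_cancel rule are invented for illustration.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    DELIVERED = "delivered"


def can_cancel(status: OrderStatus) -> bool:
    """In this example, only orders that have not shipped can be cancelled."""
    return status is OrderStatus.PENDING


print(can_cancel(OrderStatus.PENDING))
print(can_cancel(OrderStatus("shipped")))
```

With plain strings, a typo such as "shiped" would silently compare as unequal; with an enum, OrderStatus("shiped") raises a ValueError right away.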
- Keep Functions Short and Modularize: Aim for functions that do one thing well and are less than 20 lines long.
Before
    def process_user_data(user_data):
        # Validate user data
        if not user_data.get('email'):
            raise ValueError("Email is required")
        if not user_data.get('age'):
            raise ValueError("Age is required")

        # Get user's name
        user_data['name'] = user_data['email'].split('@')[0]
        user_data['name'] = user_data['name'].capitalize()

        # Calculate user score
        age = user_data['age']
        activities = user_data.get('activities', [])
        score = age * 2
        for activity in activities:
            if activity == "sports":
                score += 10
            elif activity == "music":
                score += 5
            elif activity == "travel":
                score += 7

        # Save to database
        try:
            with open('database.txt', 'a') as db:
                db.write(f"{user_data['name']},{user_data['age']},{score}\n")
        except Exception as e:
            raise IOError(f"Failed to save user data: {e}")

        return {"name": user_data['name'], "age": user_data['age'], "score": score}
After
    def validate_user_data(user_data):
        """Validate the user data."""
        if not user_data.get('email'):
            raise ValueError("Email is required")
        if not user_data.get('age'):
            raise ValueError("Age is required")

    def calculate_score(age, activities):
        """Calculate the user's score based on age and activities."""
        score = age * 2
        activity_scores = {"sports": 10, "music": 5, "travel": 7}
        for activity in activities:
            score += activity_scores.get(activity, 0)
        return score

    def save_user_data(name, age, score):
        """Save the user data to the database."""
        try:
            user_data_string = f"{name},{age},{score}\n"
            with open('database.txt', 'a') as db:
                db.write(user_data_string)
        except Exception as e:
            raise IOError(f"Failed to save user data: {e}")

    def process_user_data(user_data):
        validate_user_data(user_data)
        name = user_data['email'].split('@')[0].capitalize()
        score = calculate_score(user_data['age'], user_data.get('activities', []))
        save_user_data(name, user_data['age'], score)
        return {"name": name, "age": user_data['age'], "score": score}

    process_user_data({"email": "tushar", "age": 25, "activities": ['sports', 'music']})
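Why? Small, single-purpose functions are easier to read, test, and reuse. As a project grows, apply the same idea one level up: instead of having all logic in one file, split related functionality into separate files or classes.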
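One practical payoff of small functions is that each can be checked on its own. The sketch below repeats calculate_score from the example above (with type hints added) and tests it with plain assert statements, no file or database required.

```python
def calculate_score(age: int, activities: list[str]) -> int:
    """Calculate the user's score based on age and activities."""
    score = age * 2
    activity_scores = {"sports": 10, "music": 5, "travel": 7}
    for activity in activities:
        score += activity_scores.get(activity, 0)
    return score


# Each check exercises one behaviour of the function in isolation.
assert calculate_score(25, []) == 50
assert calculate_score(25, ["sports", "music"]) == 65
assert calculate_score(25, ["chess"]) == 50  # unknown activities add nothing
print("All checks passed")
```

Testing the original monolithic function would have meant writing to database.txt on every run just to verify the scoring rule.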
Remember: clean code is a journey, not a destination. Keep refining your code over time.
🔧 Tools and Resources for Writing Clean Python Code
- Linters: Use tools like flake8 or pylint to catch formatting issues.
- Code Formatters: Automatically format your code with black or autopep8.
- Integrated Development Environments (IDEs): Leverage IDEs like PyCharm or VSCode with Python extensions to enforce clean code practices.
💪 Bonus: Let's Clean Code Together
✨ Here’s a small challenge for you ✨ Refactor the function below, then share your solution in the comment section and discuss 💬
    def process_purchase_data(data):
        # Validate purchase data
        if not data.get('item_id'):
            raise ValueError("Item ID is required")
        if not data.get('quantity'):
            raise ValueError("Quantity is required")

        # Calculate total cost
        a = data['price']
        b = data['quantity']
        total_sum = a * b

        # Apply discounts based on conditions
        if total_sum > 1000:
            total_sum -= total_sum * 0.15
        elif total_sum > 500:
            total_sum -= total_sum * 0.05

        # Apply additional fee based on item category
        category = data.get('category', '')
        if category == 'electronics':
            total_sum += 50  # Electronics fee
        elif category == 'furniture':
            total_sum += 30  # Furniture fee

        return {"item_id": data['item_id'], "quantity": data['quantity'], "total_cost": total_sum}

    data = process_purchase_data({'item_id': 1, 'price': 200, 'quantity': 10, 'category': 'electronics'})
Conclusion: Clean Code is Worth the Effort
Clean code might take a bit longer to write upfront, but it saves hours of debugging and frustration later. Remember, you’re not just writing for the computer—you’re writing for yourself and anyone who might work on your code in the future. Start small, and over time, you’ll develop habits that make clean code second nature.
Keep practicing, and happy coding! 🚀