
How to Refactor Messy Code: A Step-by-Step Guide

Published Nov 24, 2024
🍀 Introduction: A Tale of Two Projects

Imagine this: You’re working on a university project with a tight deadline. Your team decides to divide the tasks and code individually. A week later, it’s time to integrate everything, but there’s a problem—no one understands each other’s code. Variable names like x and temp123, functions hundreds of lines long, magic numbers, and inconsistent formatting turn what should be a simple integration into a debugging nightmare.

The frustration is real. But the problem isn’t just the code—it’s dirty code. Writing clean code isn’t just about aesthetics; it’s about creating something maintainable, readable, and functional for you and others. Here’s how you can make your Python code cleaner, one step at a time.


💢 What is Dirty Code, and Why is it Problematic?

Dirty code refers to code that is hard to read, understand, or maintain. It may function correctly but is riddled with issues like:

  • Unclear variable names
  • Overly complicated logic
  • Magic numbers or strings
  • Long, monolithic functions
  • Redundant or dead code
  • Tight coupling between components

Why is it a problem? Because debugging and extending dirty code takes far more time than writing it in the first place. Plus, messy code often results in more bugs and technical debt over time.

For the sake of discussion, I'm using Python in this blog because it's a versatile, beginner-friendly language. But the clean code topics discussed here can be applied to any programming language.


💫 Effective Ways for Writing Clean Code

  1. Follow PEP 8: The official style guide helps maintain consistency. This one is Python-specific; for other languages, follow that language's official style guide.
  2. Get Feedback: Ask peers or mentors to review your code. This one is super effective.
  3. Refactor Often: Revisit your code from time to time to simplify and clean it.

🧹 Start Cleaning Your Code Now

Let's try out some examples with Python. Remember, the language here is just a tool; the principles of clean code apply to any programming language.

  1. Use Meaningful Variable & Function Names: Instead of x or temp123, use descriptive names like count or total_area.

    Before

    def calc(x, y):
        z = x * y
        return z
    

    After

    def calculate_area(length, width):
        return length * width
    

    Why? Descriptive names clarify the intent behind your code without needing comments.

  2. Apply the DRY Principle (Don’t Repeat Yourself): Refactor repetitive code into reusable functions or loops.

    Before

    print("Hello John!")
    print("Hello Jane!")
    print("Hello Jim!")
    

    After

    def greet(names):
       for name in names:
           print(f"Hello {name}!")
    
    greet(["John", "Jane", "Jim"])
    

    Why? This reduces redundancy and makes updates simpler.

  3. Use Comments and Docstrings Wisely: Explain complex logic or algorithms in comments and docstrings. Document why something is done, not just what the code does.

    Before

    def calculate_area(radius):
        return 3.14159 * radius * radius
    

    After

    def calculate_area(radius):
        """Calculate the area of a circle given its radius."""
        PI = 3.14159
        return PI * radius ** 2
    

    Caution: Over-commenting can clutter code; strike a balance.
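To illustrate documenting the why rather than the what, here is a minimal sketch. The function name `apply_late_fee` and both constants are hypothetical, invented only for this example.

```python
GRACE_PERIOD_DAYS = 3  # hypothetical value, chosen only for this example
LATE_FEE_RATE = 0.02   # hypothetical value: 2% of the amount due


def apply_late_fee(amount_due: float, days_late: int) -> float:
    """Return the amount due, including a late fee when one applies.

    In this made-up scenario the fee is skipped during a grace period
    because bank transfers can take a few days to arrive.
    """
    # Why, not what: payments inside the grace period count as on time.
    if days_late <= GRACE_PERIOD_DAYS:
        return amount_due
    return amount_due * (1 + LATE_FEE_RATE)
```

The comment does not restate the comparison; it records the reason the comparison exists, which the code alone cannot tell you.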

  4. Use Type Hints: Specify function inputs and outputs with type hints for clarity.

    Before

    def add_numbers(numbers):
        total = 0
        for num in numbers:
            total += num
        return total
    
    def add(a, b):
        return a + b
    
    def get_max_min(numbers):
        return max(numbers), min(numbers)
    

    After

    def add_numbers(numbers: list[int]) -> int:
        total = 0
        for num in numbers:
            total += num
        return total
    
    def add(a: int, b: int) -> int:
        return a + b
    
    def get_max_min(numbers: list[int]) -> tuple[int, int]:
        return max(numbers), min(numbers)
    

    Why? Type hints make your code more readable and help catch bugs early.
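One caveat worth sketching: Python does not enforce type hints at runtime. The early bug-catching comes from a static type checker (mypy is one such tool) or from editor warnings. A small illustration, assuming no checker has been run:

```python
def add(a: int, b: int) -> int:
    return a + b


# Python ignores the hints at runtime, so this call still runs and
# concatenates the two strings instead of adding numbers.
result = add("1", "2")  # a static type checker would flag this call

print(result)     # the strings are joined: 12
print(add(1, 2))  # the intended use: 3
```

So the hints pay off most when a checker is part of your workflow.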

  5. Avoid Overly Complicated Logic: Simplify nested if-else statements and loops.

    Before:

    def calculate_discount(price, is_member, has_coupon):
       if is_member:
           if has_coupon:
               discount = price * 0.25  # 25% discount
           else:
               discount = price * 0.15  # 15% discount
       else:
           if has_coupon:
               discount = price * 0.08  # 8% discount
           else:
               discount = 0  # No discount
       return discount
    
    def get_filtered_numbers(numbers):
       filtered = []
       for n in numbers:
           if (n % 3 == 0 and n % 5 == 0) or (n > 40 and n % 2 == 0):
               filtered.append(n)
    
       return filtered
    

    After:

    def calculate_discount(price: float, is_member: bool, has_coupon: bool) -> float:
       if is_member:
           return price * 0.25 if has_coupon else price * 0.15
       return price * 0.08 if has_coupon else 0
    
    def is_valid_number(num: int) -> bool:
       return (num % 3 == 0 and num % 5 == 0) or (num > 40 and num % 2 == 0)
    
    def get_filtered_numbers(numbers: list) -> list:
       return [num for num in numbers if is_valid_number(num)]
    

    Result: Simplified logic makes your code easier to understand and maintain.
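Another way to flatten nested conditions, shown here as one possible sketch rather than the only approach, is a lookup table. The rates are the same ones used in the example above; the name `DISCOUNT_RATES` is an assumption made for this illustration.

```python
# Discount rates keyed by (is_member, has_coupon).
DISCOUNT_RATES = {
    (True, True): 0.25,
    (True, False): 0.15,
    (False, True): 0.08,
    (False, False): 0.0,
}


def calculate_discount(price: float, is_member: bool, has_coupon: bool) -> float:
    """Return the discount amount for the given price and customer flags."""
    return price * DISCOUNT_RATES[(is_member, has_coupon)]
```

With a table, adding or changing a rate means editing data, not control flow.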

  6. Handle Exceptions Gracefully: Use try and except blocks to make your code robust and user-friendly.

    Before

    value = int(input("Enter a number: "))
    print(10 / value)
    

    After

    import logging
    try:
        value = int(input("Enter a number: "))
        print(10 / value)
    except Exception as err:
        logging.error(f"Error: {err}")
    

    Result: Your program becomes more reliable and user-focused.
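A possible refinement, sketched under the assumption that only two failures are expected here (bad input and division by zero), is to catch those specific exceptions so that unrelated bugs are not silently swallowed. The helper name `divide_ten_by` is hypothetical.

```python
import logging
from typing import Optional


def divide_ten_by(raw_value: str) -> Optional[float]:
    """Return 10 divided by the number in raw_value, or None on failure."""
    try:
        return 10 / int(raw_value)
    except ValueError:
        # int() raises ValueError when the text is not a whole number.
        logging.error("%r is not a whole number", raw_value)
    except ZeroDivisionError:
        logging.error("Cannot divide by zero")
    return None
```

Taking the raw text as a parameter, instead of calling input() inside the function, also makes the error handling easy to test.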

  7. Avoid Hard-coded Values: Replace magic numbers or strings with constants, variables or enums.

    Before

    def calculate_salary(hours, rate):
       if hours > 40:
           overtime = (hours - 40) * (rate * 1.5)
           salary = 40 * rate + overtime
       else:
           salary = hours * rate
       return salary
    

    After

    REGULAR_HOURS = 40
    OVERTIME_RATE = 1.5
    
    def calculate_salary(hours: int, rate: float) -> float:
       overtime_hours = max(0, hours - REGULAR_HOURS)
       regular_hours = hours - overtime_hours
       return regular_hours * rate + overtime_hours * rate * OVERTIME_RATE
    

    Result: Now you know what 40 and 1.5 mean without reading the entire function.
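The same idea applies to magic strings, where an enum can help. The sketch below uses a hypothetical `OrderStatus` enum and `can_cancel` function, both invented for this example.

```python
from enum import Enum


class OrderStatus(Enum):
    """Hypothetical order states, invented for this example."""
    PENDING = "pending"
    SHIPPED = "shipped"
    DELIVERED = "delivered"


def can_cancel(status: OrderStatus) -> bool:
    # Comparing enum members means a typo such as OrderStatus.SHIPED
    # fails loudly, while a misspelled string would just never match.
    return status is OrderStatus.PENDING
```

Compared with comparing raw strings like "pending" throughout the code, the set of valid values now lives in one place.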

  8. Keep Functions Short and Modularize: Aim for functions that do one thing well and are fewer than 20 lines long.

    Before

    def process_user_data(user_data):
        # Validate user data
        if not user_data.get('email'):
            raise ValueError("Email is required")
        if not user_data.get('age'):
            raise ValueError("Age is required")
    
        # get user's name
        user_data['name'] = user_data['email'].split('@')[0]
        user_data['name'] = user_data['name'].capitalize()
    
        # Calculate user score
        age = user_data['age']
        activities = user_data.get('activities', [])
        score = age * 2
        for activity in activities:
            if activity == "sports":
                score += 10
            elif activity == "music":
                score += 5
            elif activity == "travel":
                score += 7
    
        # Save to database
        try:
            with open('database.txt', 'a') as db:
                db.write(f"{user_data['name']},{user_data['age']},{score}\n")
        except Exception as e:
            raise IOError(f"Failed to save user data: {e}")
    
        return {"name": user_data['name'], "age": user_data['age'], "score": score}
    

    After

    def validate_user_data(user_data):
       """Validate the user data."""
       if not user_data.get('email'):
           raise ValueError("Email is required")
       if not user_data.get('age'):
           raise ValueError("Age is required")
    
    def calculate_score(age, activities):
       """Calculate the user's score based on age and activities."""
       score = age * 2
       activity_scores = {"sports": 10, "music": 5, "travel": 7}
       for activity in activities:
           score += activity_scores.get(activity, 0)
       return score
    
    def save_user_data(name, age, score):
       """Save the user data to the database."""
       try:
           user_data_string = f"{name},{age},{score}\n"
           with open('database.txt', 'a') as db:
               db.write(user_data_string)
       except Exception as e:
           raise IOError(f"Failed to save user data: {e}")
    
    def process_user_data(user_data):
       validate_user_data(user_data)
       name = user_data['email'].split('@')[0].capitalize()
       score = calculate_score(user_data['age'], user_data.get('activities', []))
       save_user_data(name, user_data['age'], score)
       return {"name": name, "age": user_data['age'], "score": score}
    
    process_user_data({"email": "tushar@example.com", "age": 25, "activities": ["sports", "music"]})
    

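    Why? Small functions that each do one thing are easier to read, test, and reuse. As a project grows, the same idea applies at a larger scale: instead of keeping all logic in one file, split related functionality into separate files or classes.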
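One practical payoff of small functions is that each can be checked on its own. As a rough sketch, the scoring logic from the refactored example is repeated below (slightly condensed so the block runs by itself) together with a hypothetical test function.

```python
def calculate_score(age, activities):
    """Same scoring rules as the refactored example, condensed."""
    activity_scores = {"sports": 10, "music": 5, "travel": 7}
    return age * 2 + sum(activity_scores.get(a, 0) for a in activities)


def test_calculate_score():
    # No file or database is needed to check the scoring rules.
    assert calculate_score(25, []) == 50
    assert calculate_score(25, ["sports", "music"]) == 65
    assert calculate_score(25, ["chess"]) == 50  # unknown activity adds 0


test_calculate_score()
```

Testing the original monolithic function would have required a valid email and a writable database.txt just to verify the arithmetic.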

Remember: clean code is a journey, not a destination. Keep refining your code over time.


🔧 Tools and Resources for Writing Clean Python Code

  • Linters: Use tools like flake8 or pylint to catch formatting issues.
  • Code Formatters: Automatically format your code with black or autopep8.
  • Integrated Development Environments (IDEs): Leverage IDEs like PyCharm or VS Code with Python extensions to enforce clean code practices.

💪 Bonus: Let's Clean Code Together

✨ Here's a small challenge for you ✨ Share your solutions in the comment section and discuss 💬

def process_purchase_data(data):
    # Validate purchase data
    if not data.get('item_id'):
        raise ValueError("Item ID is required")
    if not data.get('quantity'):
        raise ValueError("Quantity is required")

    # Calculate total cost
    a = data['price']
    b = data['quantity']
    total_sum = a * b

    # Apply discounts based on conditions
    if total_sum > 1000:
        total_sum -= total_sum * 0.15
    elif total_sum > 500:
        total_sum -= total_sum * 0.05

    # Apply additional fee based on item category
    category = data.get('category', '')
    if category == 'electronics':
        total_sum += 50  # Electronics fee
    elif category == 'furniture':
        total_sum += 30  # Furniture fee

    return {"item_id": data['item_id'], "quantity": data['quantity'], "total_cost": total_sum}

data = process_purchase_data({'item_id': 1, 'price': 200, 'quantity': 10, 'category': 'electronics'})

Conclusion: Clean Code is Worth the Effort

Clean code might take a bit longer to write upfront, but it saves hours of debugging and frustration later. Remember, you’re not just writing for the computer—you’re writing for yourself and anyone who might work on your code in the future. Start small, and over time, you’ll develop habits that make clean code second nature.

Keep practicing, and happy coding! 🚀

Discover and read more posts from Abdullah Al Masud Tushar