Benjamin Hackl fe1638cacb Add README to demo-03 explaining the refactoring exercise

- Documents the current state (messy but functional code)
- Lists all code smells present in the original implementation
- Explains the refactoring goals and learning objectives
- Provides guidance for the refactoring path

2026-01-15 19:03:18 +01:00

3.1 KiB


Demo 03: Refactoring Exercise - CSV Data Processor

Overview

This demo shows how to approach refactoring messy but functional code. The employee_data_processor.py script works correctly, but it contains numerous code smells and anti-patterns that make it difficult to maintain, test, or extend.

The Current State (Before)

employee_data_processor.py is a functional CSV processing tool that:

Reads employee data from employees.csv
Validates records (email, salary, department, hire_date)
Transforms data (salary to annual, department codes to full names)
Outputs to report.json, report.html, and console

Code Smells Present

🚨 God Function: process_employee_data() does everything in one 169-line function

🚨 Global Variables: 4 globals (processed_records, skipped_records, total_salary, dept_count)

🚨 Hardcoded Values: File paths, department mappings, validation rules scattered throughout

🚨 Mixed Concerns: Validation logic mixed with file I/O mixed with output generation

🚨 Copy-Paste Code: Validation blocks repeated unnecessarily

🚨 Poor Naming: Variables like d, dt, sal, f, jf, hf

🚨 Nested Conditionals: 4-5 levels of nested if-else statements

🚨 String Concatenation: Building HTML strings in loops (inefficient)

🚨 Limited Error Handling: Generic try/except that doesn't provide actionable feedback
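
To make two of these smells concrete, here is a minimal sketch of how nested conditionals can be flattened into guard clauses and how HTML rows can be built with a single join instead of repeated concatenation. It is not taken from employee_data_processor.py: the function names, the department codes, the ISO date format, and the row layout (a dict of strings per CSV row) are all assumptions made for illustration.

```python
from datetime import datetime
from html import escape

# Hypothetical department codes; the real mapping lives in the script.
VALID_DEPARTMENTS = {"ENG", "HR", "SAL"}


def find_problem(row):
    """Return a reason for the first failed check, or None if the row is valid.

    Each check returns early, so nesting never grows past one level.
    """
    if "@" not in (row.get("email") or ""):
        return "bad email"
    try:
        salary = float(row.get("salary"))
    except (TypeError, ValueError):
        return "salary is not a number"
    if salary <= 0:
        return "salary must be positive"
    if row.get("department") not in VALID_DEPARTMENTS:
        return "invalid department"
    try:
        # Assumes ISO dates (YYYY-MM-DD); adjust to the real file format.
        datetime.strptime(row.get("hire_date"), "%Y-%m-%d")
    except (TypeError, ValueError):
        return "bad date"
    return None


def render_rows(rows):
    """Build all table rows with one join instead of += inside a loop."""
    return "\n".join(
        f"<tr><td>{escape(r['email'])}</td><td>{escape(r['department'])}</td></tr>"
        for r in rows
    )
```

Returning a reason string rather than a bare boolean also addresses the error handling smell, since a skipped record can then be reported with the specific check it failed.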

The Goal (After)

The refactored version should demonstrate:

✅ Single Responsibility: Each function/class has one clear purpose

✅ Separation of Concerns: Validation, transformation, and output are independent

✅ Configuration Management: Constants and config objects replace magic values

✅ Testable Design: Pure functions, dependency injection, no globals

✅ Clear Naming: Descriptive variable and function names

✅ Error Handling: Proper exception handling and logging

✅ Extensibility: Easy to add new validation rules or output formats
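
One possible shape for these goals, sketched under assumptions: the four globals become fields of a returned object, and validation rules are plain functions passed in by the caller. The Summary class, the check_* functions, and summarize() are hypothetical names; only the four field names come from the globals listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Summary:
    """Holds what the four module-level globals used to hold."""
    processed_records: list = field(default_factory=list)
    skipped_records: list = field(default_factory=list)
    total_salary: float = 0.0
    dept_count: dict = field(default_factory=dict)


def check_email(row):
    """Hypothetical rule: return an error message, or None if the row passes."""
    return None if "@" in row["email"] else "bad email"


def check_salary(row):
    """Hypothetical rule; assumes the salary field is numeric text."""
    return None if float(row["salary"]) > 0 else "salary must be positive"


def summarize(rows, validators):
    """Validate rows with the injected rules and return a fresh Summary."""
    summary = Summary()
    for row in rows:
        errors = [msg for check in validators if (msg := check(row)) is not None]
        if errors:
            summary.skipped_records.append((row, errors))
            continue
        summary.processed_records.append(row)
        summary.total_salary += float(row["salary"])
        dept = row["department"]
        summary.dept_count[dept] = summary.dept_count.get(dept, 0) + 1
    return summary
```

Because summarize() touches no files and no module state, a test can call it with a few in-memory dicts, and adding a validation rule means appending one function to the list passed in.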

Sample Data

employees.csv contains 10 employee records with intentional issues:

6 valid records
4 invalid records (negative salary, bad email, invalid department, bad date)

Running the Script

python3 employee_data_processor.py

This will:

Read and validate employees.csv
Print a summary to console
Generate report.json with structured data
Generate report.html with a formatted table

Refactoring Path

Recommended refactoring steps:

Extract constants and configuration
Separate validation logic into validators
Create data classes/structures for employee records
Extract output generators (JSON, HTML, console)
Implement proper error handling and logging
Write tests to verify behavior is preserved
Consider dataclasses for the record structure, and pydantic or a similar library for validation
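
The first, third, and fourth steps above might look roughly like the following. This is a sketch, not the reference solution: Config, Employee, to_json, to_console, and WRITERS are hypothetical names, and the Employee fields are guessed from the validated columns. Only the three file names come from this README.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Config:
    """File paths gathered in one place instead of scattered literals."""
    input_path: str = "employees.csv"
    json_path: str = "report.json"
    html_path: str = "report.html"


@dataclass(frozen=True)
class Employee:
    """One validated, transformed record (field set is an assumption)."""
    email: str
    department: str
    annual_salary: float
    hire_date: str


def to_json(employees):
    """Render employees as a JSON array; the caller decides where to write it."""
    return json.dumps([asdict(e) for e in employees], indent=2)


def to_console(employees):
    """Render a one-line summary for the console."""
    total = sum(e.annual_salary for e in employees)
    return f"Employees: {len(employees)}, total annual salary: {total:.2f}"


# Adding an output format means adding one entry here.
WRITERS = {"json": to_json, "console": to_console}
```

Each generator returns a string and leaves file I/O to the caller, which keeps the formatting logic testable without touching the disk.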

Learning Objectives

This exercise demonstrates:

Identifying code smells and anti-patterns
Planning a refactoring strategy
Applying SOLID principles
Maintaining functionality while improving code quality
Testing refactored code to ensure no regressions
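
One way to check for regressions is a characterization test: run the old and the new implementation on the same inputs and require identical results. The sketch below uses stand-in functions, since the real script is not reproduced here. old_full_name imitates the legacy style, department_full_name is a hypothetical refactored replacement, and the department codes are invented.

```python
import unittest


def old_full_name(d):
    """Stand-in for legacy code: nested conditionals and a terse name."""
    if d == "ENG":
        return "Engineering"
    else:
        if d == "HR":
            return "Human Resources"
        else:
            return None


# Hypothetical mapping extracted into a constant during refactoring.
DEPARTMENT_NAMES = {"ENG": "Engineering", "HR": "Human Resources"}


def department_full_name(code):
    """Refactored lookup; returns None for unknown codes, like the original."""
    return DEPARTMENT_NAMES.get(code)


class DepartmentNameRegressionTest(unittest.TestCase):
    def test_matches_legacy_behavior(self):
        # Include known, unknown, and empty codes to pin down edge cases.
        for code in ["ENG", "HR", "XX", ""]:
            with self.subTest(code=code):
                self.assertEqual(department_full_name(code), old_full_name(code))
```

Saved to a file whose name starts with test_, this can be run with python3 -m unittest. Once the legacy function is deleted, its recorded outputs can be kept as literal expected values.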

Ready to turn this mess into maintainable, professional code! 🛠️
