From fe1638cacb9b692e394b09af60f21c6be1dcdbf9 Mon Sep 17 00:00:00 2001
From: Benjamin Hackl
Date: Thu, 15 Jan 2026 19:03:18 +0100
Subject: [PATCH] Add README to demo-03 explaining the refactoring exercise

- Documents the current state (messy but functional code)
- Lists all code smells present in the original implementation
- Explains the refactoring goals and learning objectives
- Provides guidance for the refactoring path
---
 demo-03/README.md | 93 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)
 create mode 100644 demo-03/README.md

diff --git a/demo-03/README.md b/demo-03/README.md
new file mode 100644
index 0000000..70cfa71
--- /dev/null
+++ b/demo-03/README.md
@@ -0,0 +1,93 @@
+# Demo 03: Refactoring Exercise - CSV Data Processor
+
+## Overview
+
+This demo shows how to approach refactoring messy but functional code. The `employee_data_processor.py` script works correctly but contains numerous code smells and anti-patterns that make it difficult to maintain, test, or extend.
+
+## The Current State (Before)
+
+`employee_data_processor.py` is a functional CSV processing tool that:
+- Reads employee data from `employees.csv`
+- Validates records (email, salary, department, hire_date)
+- Transforms data (salary to annual, department codes to full names)
+- Outputs to `report.json`, `report.html`, and console
+
+### Code Smells Present
+
+🚨 **God Function:** `process_employee_data()` does everything in one 169-line function
+
+🚨 **Global Variables:** 4 globals (`processed_records`, `skipped_records`, `total_salary`, `dept_count`)
+
+🚨 **Hardcoded Values:** File paths, department mappings, validation rules scattered throughout
+
+🚨 **Mixed Concerns:** Validation logic mixed with file I/O mixed with output generation
+
+🚨 **Copy-Paste Code:** Validation blocks repeated unnecessarily
+
+🚨 **Poor Naming:** Variables like `d`, `dt`, `sal`, `f`, `jf`, `hf`
+
+🚨 **Nested Conditionals:** 4-5 levels of nested if-else statements
+
+🚨 **String Concatenation:** Building HTML strings in loops (inefficient)
+
+🚨 **Limited Error Handling:** Generic `try`/`except` that doesn't provide actionable feedback
+
+## The Goal (After)
+
+The refactored version should demonstrate:
+
+✅ **Single Responsibility:** Each function/class has one clear purpose
+
+✅ **Separation of Concerns:** Validation, transformation, and output are independent
+
+✅ **Configuration Management:** Constants and config objects replace magic values
+
+✅ **Testable Design:** Pure functions, dependency injection, no globals
+
+✅ **Clear Naming:** Descriptive variable and function names
+
+✅ **Error Handling:** Proper exception handling and logging
+
+✅ **Extensibility:** Easy to add new validation rules or output formats
+
+## Sample Data
+
+`employees.csv` contains 10 employee records with intentional issues:
+- 6 valid records
+- 4 invalid records (negative salary, bad email, invalid department, bad date)
+
+## Running the Script
+
+```bash
+python3 employee_data_processor.py
+```
+
+This will:
+1. Read and validate `employees.csv`
+2. Print a summary to console
+3. Generate `report.json` with structured data
+4. Generate `report.html` with a formatted table
+
+## Refactoring Path
+
+Recommended refactoring steps:
+1. Extract constants and configuration
+2. Separate validation logic into validators
+3. Create data classes/structures for employee records
+4. Extract output generators (JSON, HTML, console)
+5. Implement proper error handling and logging
+6. Write tests to verify behavior is preserved
+7. Consider using dataclasses, pydantic, or similar for validation
+
+## Learning Objectives
+
+This exercise demonstrates:
+- Identifying code smells and anti-patterns
+- Planning a refactoring strategy
+- Applying SOLID principles
+- Maintaining functionality while improving code quality
+- Testing refactored code to ensure no regressions
+
+---
+
+*Ready to turn this mess into maintainable, professional code! 🛠️*
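
As a rough illustration of steps 1 to 3 of the refactoring path (extract constants, separate validators, introduce a data class), a minimal sketch might look like the following. It is not taken from `employee_data_processor.py`: the department codes, the email pattern, the assumption that the CSV stores monthly salaries, the ISO date format, and the names `Employee`, `validate_row`, and `to_employee` are all hypothetical and would need to be aligned with the actual script.

```python
import re
from dataclasses import dataclass
from datetime import date

# Hypothetical configuration values; the real mappings and rules live in
# employee_data_processor.py and are not shown in the README.
DEPARTMENT_NAMES = {"ENG": "Engineering", "MKT": "Marketing", "HR": "Human Resources"}
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MONTHS_PER_YEAR = 12  # assumption: the CSV stores monthly salaries


@dataclass(frozen=True)
class Employee:
    """Validated, transformed employee record."""
    email: str
    annual_salary: float
    department: str
    hire_date: date


def validate_row(row: dict) -> list:
    """Return a list of problems; an empty list means the row is valid."""
    errors = []
    if not EMAIL_PATTERN.match(row.get("email") or ""):
        errors.append("invalid email")
    try:
        if float(row.get("salary") or "") <= 0:
            errors.append("salary must be positive")
    except ValueError:
        errors.append("salary is not a number")
    if row.get("department") not in DEPARTMENT_NAMES:
        errors.append("unknown department")
    try:
        # assumption: hire dates are written as YYYY-MM-DD
        date.fromisoformat(row.get("hire_date") or "")
    except ValueError:
        errors.append("invalid hire_date")
    return errors


def to_employee(row: dict) -> Employee:
    """Transform a row that has already passed validate_row()."""
    return Employee(
        email=row["email"],
        annual_salary=float(row["salary"]) * MONTHS_PER_YEAR,
        department=DEPARTMENT_NAMES[row["department"]],
        hire_date=date.fromisoformat(row["hire_date"]),
    )
```

Because both functions are pure (no globals, no file I/O), they can be tested with plain dictionaries before any output generator is extracted, which fits the "Testable Design" goal listed in the README.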