agentic-coding-demo/demo-03
Benjamin Hackl fe1638cacb Add README to demo-03 explaining the refactoring exercise
- Documents the current state (messy but functional code)
- Lists all code smells present in the original implementation
- Explains the refactoring goals and learning objectives
- Provides guidance for the refactoring path
2026-01-15 19:03:18 +01:00

Demo 03: Refactoring Exercise - CSV Data Processor

Overview

This demo shows how to approach refactoring messy but functional code. The employee_data_processor.py script works correctly but contains numerous code smells and anti-patterns that make it difficult to maintain, test, or extend.

The Current State (Before)

employee_data_processor.py is a functional CSV processing tool that:

  • Reads employee data from employees.csv
  • Validates records (email, salary, department, hire_date)
  • Transforms data (salary to annual, department codes to full names)
  • Outputs to report.json, report.html, and console
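
The transformation step can be pictured as a small pure function. This is a minimal sketch, not the script's actual code: the department codes, the mapping, the `name` column, and the assumption that the CSV stores a monthly salary are all hypothetical and chosen only for illustration.

```python
# Hypothetical mapping; the real codes live in employee_data_processor.py.
DEPARTMENT_NAMES = {
    "ENG": "Engineering",
    "MKT": "Marketing",
    "HR": "Human Resources",
}

MONTHS_PER_YEAR = 12  # assumption: the CSV stores a monthly salary


def transform_record(record: dict) -> dict:
    """Return a new record with an annual salary and a full department name."""
    return {
        **record,
        "annual_salary": float(record["salary"]) * MONTHS_PER_YEAR,
        "department": DEPARTMENT_NAMES[record["department"]],
    }


row = {"name": "Ada", "salary": "5000", "department": "ENG"}
print(transform_record(row))
```

Because the function returns a new dict instead of mutating its input or touching globals, it can be tested without reading any file.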

Code Smells Present

🚨 God Function: process_employee_data() does everything in one 169-line function

🚨 Global Variables: 4 globals (processed_records, skipped_records, total_salary, dept_count)

🚨 Hardcoded Values: File paths, department mappings, validation rules scattered throughout

🚨 Mixed Concerns: Validation logic mixed with file I/O mixed with output generation

🚨 Copy-Paste Code: Validation blocks repeated unnecessarily

🚨 Poor Naming: Variables like d, dt, sal, f, jf, hf

🚨 Nested Conditionals: 4-5 levels of nested if-else statements

🚨 String Concatenation: Building HTML strings in loops (inefficient)

🚨 Limited Error Handling: Generic try/except that doesn't provide actionable feedback
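
To make two of these smells concrete, the sketch below contrasts a deeply nested check with a flat version that reports why a record was rejected. It is illustrative only: the field names follow the validated columns listed above, but the specific rules (an "@" in the email, a positive salary) and both function names are assumptions, not the script's actual checks.

```python
def validate_nested(record: dict) -> bool:
    """Smelly style: each check adds a nesting level and the reason is lost."""
    if record.get("email"):
        if "@" in record["email"]:
            if record.get("salary") is not None:
                if float(record["salary"]) > 0:
                    return True
    return False


def validate_flat(record: dict) -> list[str]:
    """Flat style: run each check independently and collect readable messages."""
    errors = []

    email = record.get("email") or ""
    if "@" not in email:
        errors.append(f"invalid email: {email!r}")

    try:
        salary = float(record.get("salary"))
    except (TypeError, ValueError):
        # Catch only the conversion failures we expect, and say what went wrong.
        errors.append(f"salary is not a number: {record.get('salary')!r}")
    else:
        if salary <= 0:
            errors.append(f"salary must be positive, got {salary}")

    return errors


print(validate_flat({"email": "nope", "salary": "-5"}))
```

Returning a list of messages rather than a bare boolean lets the caller log or display exactly which rule each skipped record failed.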

The Goal (After)

The refactored version should demonstrate:

Single Responsibility: Each function/class has one clear purpose

Separation of Concerns: Validation, transformation, and output are independent

Configuration Management: Constants and config objects replace magic values

Testable Design: Pure functions, dependency injection, no globals

Clear Naming: Descriptive variable and function names

Error Handling: Proper exception handling and logging

Extensibility: Easy to add new validation rules or output formats
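
One possible shape for the extensibility and testability goals is a registry of output writers that each receive their destination as an argument. This is a sketch under assumptions, not the target design: `WRITERS`, `render`, `write_json`, and `write_html` are hypothetical names, and the HTML layout is invented for the example.

```python
import io
import json
from html import escape
from typing import Callable, TextIO

Record = dict
Writer = Callable[[list[Record], TextIO], None]


def write_json(records: list[Record], out: TextIO) -> None:
    """Serialize the records as indented JSON to the given stream."""
    json.dump(records, out, indent=2)


def write_html(records: list[Record], out: TextIO) -> None:
    """Write the records as a bare HTML table to the given stream."""
    # Build the rows in a list and join once instead of concatenating in a loop.
    rows = [
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in r.values()) + "</tr>"
        for r in records
    ]
    out.write("<table>\n" + "\n".join(rows) + "\n</table>\n")


# Adding a format means adding one entry here; callers stay unchanged.
WRITERS: dict[str, Writer] = {"json": write_json, "html": write_html}


def render(records: list[Record], fmt: str) -> str:
    """Render records in the requested format into an in-memory string."""
    buffer = io.StringIO()
    WRITERS[fmt](records, buffer)
    return buffer.getvalue()


print(render([{"name": "A & B", "salary": 1}], "html"))
```

Because each writer takes a stream instead of opening a hardcoded path, a test can pass an `io.StringIO` while production code passes an open file.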

Sample Data

employees.csv contains 10 employee records with intentional issues:

  • 6 valid records
  • 4 invalid records (negative salary, bad email, invalid department, bad date)

Running the Script

python3 employee_data_processor.py

This will:

  1. Read and validate employees.csv
  2. Print a summary to console
  3. Generate report.json with structured data
  4. Generate report.html with a formatted table

Refactoring Path

Recommended refactoring steps:

  1. Extract constants and configuration
  2. Separate validation logic into validators
  3. Create data classes/structures for employee records
  4. Extract output generators (JSON, HTML, console)
  5. Implement proper error handling and logging
  6. Write tests to verify behavior is preserved
  7. Consider dataclasses for the record structures and pydantic or a similar library for validation
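
Steps 1 and 3 might start out roughly as follows. This is a hedged sketch rather than the intended solution: the `Config` and `Employee` classes, the `name` column, and the `%Y-%m-%d` date format are assumptions; only the file names and the validated fields come from this README.

```python
from dataclasses import dataclass
from datetime import date, datetime
from pathlib import Path


@dataclass(frozen=True)
class Config:
    """Step 1: one place for values that were previously hardcoded."""

    input_path: Path = Path("employees.csv")
    json_path: Path = Path("report.json")
    html_path: Path = Path("report.html")
    date_format: str = "%Y-%m-%d"  # assumed format of hire_date


@dataclass(frozen=True)
class Employee:
    """Step 3: a typed record instead of a loose dict or positional list."""

    name: str  # hypothetical column, not confirmed by the README
    email: str
    department: str
    salary: float
    hire_date: date

    @classmethod
    def from_row(cls, row: dict, config: Config) -> "Employee":
        """Convert one CSV row (all strings) into a typed Employee."""
        return cls(
            name=row["name"],
            email=row["email"],
            department=row["department"],
            salary=float(row["salary"]),
            hire_date=datetime.strptime(row["hire_date"], config.date_format).date(),
        )


row = {
    "name": "Ada",
    "email": "ada@example.com",
    "department": "ENG",
    "salary": "5000",
    "hire_date": "2020-01-15",
}
print(Employee.from_row(row, Config()))
```

Note that a plain dataclass only converts and stores values; rules such as "salary must be positive" still need separate validators or a validation library.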

Learning Objectives

This exercise demonstrates:

  • Identifying code smells and anti-patterns
  • Planning a refactoring strategy
  • Applying SOLID principles
  • Maintaining functionality while improving code quality
  • Testing refactored code to ensure no regressions
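
One way to check for regressions is to keep the report.json produced by the original script as a baseline and compare the refactored output against it. The sketch below assumes that approach; `reports_match` is a hypothetical helper, and the `processed` and `skipped` keys are invented for the demonstration and may not match the real report structure.

```python
import json
import tempfile
from pathlib import Path


def reports_match(baseline: Path, candidate: Path) -> bool:
    """Compare two JSON reports by parsed content, ignoring whitespace and key order."""
    expected = json.loads(baseline.read_text(encoding="utf-8"))
    actual = json.loads(candidate.read_text(encoding="utf-8"))
    return expected == actual


# Demonstration with throwaway files standing in for the real reports.
with tempfile.TemporaryDirectory() as tmp:
    baseline = Path(tmp) / "baseline.json"
    candidate = Path(tmp) / "candidate.json"
    baseline.write_text('{"processed": 6, "skipped": 4}', encoding="utf-8")
    candidate.write_text('{\n  "skipped": 4,\n  "processed": 6\n}', encoding="utf-8")
    same = reports_match(baseline, candidate)

print(same)
```

Comparing parsed JSON rather than raw text keeps the check stable when the refactoring changes only indentation or key order.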

Ready to turn this mess into maintainable, professional code! 🛠️