
# Demo 03: Refactoring Exercise - CSV Data Processor

## Overview

This demo shows how to approach refactoring messy but functional code. The `employee_data_processor.py` script produces correct output, but it contains numerous code smells and anti-patterns that make it difficult to maintain, test, or extend.

## The Current State (Before)

`employee_data_processor.py` is a functional CSV processing tool that:
- Reads employee data from `employees.csv`
- Validates records (email, salary, department, hire_date)
- Transforms data (salary to annual, department codes to full names)
- Writes output to `report.json`, `report.html`, and the console

### Code Smells Present

🚨 **God Function:** `process_employee_data()` does everything in one 169-line function

🚨 **Global Variables:** 4 globals (`processed_records`, `skipped_records`, `total_salary`, `dept_count`)

🚨 **Hardcoded Values:** File paths, department mappings, validation rules scattered throughout

🚨 **Mixed Concerns:** Validation logic mixed with file I/O mixed with output generation

🚨 **Copy-Paste Code:** Validation blocks repeated unnecessarily

🚨 **Poor Naming:** Variables like `d`, `dt`, `sal`, `f`, `jf`, `hf`

🚨 **Nested Conditionals:** 4-5 levels of nested if-else statements

🚨 **String Concatenation:** Building HTML strings in loops (inefficient)

🚨 **Limited Error Handling:** Generic `try`/`except` that doesn't provide actionable feedback
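
As one illustration of the string-concatenation smell above, the sketch below collects HTML fragments in a list and joins them once at the end. The function name `render_html_table` and the row shape (a list of dicts) are assumptions made for this example, not the script's actual interface.

```python
from html import escape


def render_html_table(rows):
    """Build an HTML table from a list of dicts (hypothetical row shape)."""
    if not rows:
        return "<table></table>"

    headers = list(rows[0].keys())
    parts = ["<table>"]

    # Header row: escape every value so special characters cannot break the markup.
    header_cells = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    parts.append(f"<tr>{header_cells}</tr>")

    # One fragment per record; nothing is appended to a growing string.
    for row in rows:
        cells = "".join(f"<td>{escape(str(row[h]))}</td>" for h in headers)
        parts.append(f"<tr>{cells}</tr>")

    parts.append("</table>")
    return "\n".join(parts)
```

Because the function returns a string instead of writing a file, it can be checked in a test without touching disk.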

## The Goal (After)

The refactored version should demonstrate:

✅ **Single Responsibility:** Each function/class has one clear purpose

✅ **Separation of Concerns:** Validation, transformation, and output are independent

✅ **Configuration Management:** Constants and config objects replace magic values

✅ **Testable Design:** Pure functions, dependency injection, no globals

✅ **Clear Naming:** Descriptive variable and function names

✅ **Error Handling:** Proper exception handling and logging

✅ **Extensibility:** Easy to add new validation rules or output formats
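
A minimal sketch of what independent, extensible validation could look like: each rule is a pure function that returns an error message or `None`, and a list of rules drives the check. The department codes, the email pattern, the `YYYY-MM-DD` date format, and all function names are assumptions for illustration; the real rules live in `employee_data_processor.py`.

```python
import re
from datetime import datetime

# Hypothetical values; the real codes and rules come from the original script.
VALID_DEPARTMENTS = {"ENG", "MKT", "HR"}
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_email(record):
    if not EMAIL_PATTERN.match(record.get("email") or ""):
        return "invalid email"
    return None


def validate_salary(record):
    try:
        salary = float(record.get("salary"))
    except (TypeError, ValueError):
        return "salary is not a number"
    if salary <= 0:
        return "salary must be positive"
    return None


def validate_department(record):
    if record.get("department") not in VALID_DEPARTMENTS:
        return "unknown department"
    return None


def validate_hire_date(record):
    try:
        datetime.strptime(record.get("hire_date") or "", "%Y-%m-%d")
    except ValueError:
        return "hire_date is not YYYY-MM-DD"
    return None


VALIDATORS = [validate_email, validate_salary, validate_department, validate_hire_date]


def validate_record(record, validators=VALIDATORS):
    """Run every rule and return the list of error messages (empty if valid)."""
    errors = []
    for check in validators:
        error = check(record)
        if error is not None:
            errors.append(error)
    return errors
```

Adding a rule then means writing one more function and appending it to `VALIDATORS`; nothing else needs to change.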

## Sample Data

`employees.csv` contains 10 employee records with intentional issues:
- 6 valid records
- 4 invalid records (negative salary, bad email, invalid department, bad date)

## Running the Script

```bash
python3 employee_data_processor.py
```

This will:
1. Read and validate `employees.csv`
2. Print a summary to the console
3. Generate `report.json` with structured data
4. Generate `report.html` with a formatted table

## Refactoring Path

Recommended refactoring steps:
1. Extract constants and configuration
2. Separate validation logic into validators
3. Create data classes/structures for employee records
4. Extract output generators (JSON, HTML, console)
5. Implement proper error handling and logging
6. Write tests to verify behavior is preserved
7. Consider using `dataclasses` for record structures and `pydantic` (or similar) for validation
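
Steps 1, 3, and 4 above might be sketched as follows. Everything here is an assumption for illustration: the class and function names, the `name` column, the department mapping, the report shape, and the idea that the CSV stores monthly salaries that are multiplied by 12.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class ProcessorConfig:
    """Settings that were previously hardcoded (values are assumptions)."""
    input_path: Path = Path("employees.csv")
    json_path: Path = Path("report.json")
    html_path: Path = Path("report.html")
    months_per_year: int = 12  # assumes salaries in the CSV are monthly
    department_names: dict = field(
        default_factory=lambda: {
            "ENG": "Engineering",
            "MKT": "Marketing",
            "HR": "Human Resources",
        }
    )


@dataclass(frozen=True)
class Employee:
    """One validated, transformed record (field names are hypothetical)."""
    name: str
    email: str
    department: str
    annual_salary: float
    hire_date: str


def to_employee(row, config):
    """Transform a validated CSV row (dict of strings) into an Employee."""
    return Employee(
        name=row["name"],
        email=row["email"],
        department=config.department_names[row["department"]],
        annual_salary=float(row["salary"]) * config.months_per_year,
        hire_date=row["hire_date"],
    )


def build_json_report(employees):
    """Return the report as a JSON string; the caller decides where to write it."""
    payload = {
        "count": len(employees),
        "total_annual_salary": sum(e.annual_salary for e in employees),
        "employees": [asdict(e) for e in employees],
    }
    return json.dumps(payload, indent=2)
```

Passing `config` in as an argument, rather than reading module-level globals, is what makes each piece testable in isolation; a thin wrapper can write the returned string to `config.json_path`.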

## Learning Objectives

This exercise demonstrates:
- Identifying code smells and anti-patterns
- Planning a refactoring strategy
- Applying SOLID principles
- Maintaining functionality while improving code quality
- Testing refactored code to ensure no regressions
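
One way to check for regressions is to run the old and new code on the same inputs and compare the results. The helper below is a minimal sketch of that idea; `find_regressions` and the two salary functions are hypothetical stand-ins, not code from the script.

```python
def find_regressions(legacy, refactored, sample_inputs):
    """Return (input, legacy_result, refactored_result) for every mismatch."""
    regressions = []
    for sample in sample_inputs:
        expected = legacy(sample)
        actual = refactored(sample)
        if expected != actual:
            regressions.append((sample, expected, actual))
    return regressions


# Hypothetical stand-ins for a piece of the original and refactored logic.
def legacy_annual_salary(sal):
    return sal * 12


def refactored_annual_salary(monthly_salary, months_per_year=12):
    return monthly_salary * months_per_year


if __name__ == "__main__":
    mismatches = find_regressions(
        legacy_annual_salary, refactored_annual_salary, [0, 1000, 2500.5]
    )
    print("regressions:", mismatches)
```

The same comparison can be applied to whole outputs, for example by saving `report.json` from the original script and comparing it with the refactored version's output for the same `employees.csv`.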

---

*Ready to turn this mess into maintainable, professional code! 🛠️*