# Benchmarking in hpfracc — layout, coupling, and maintenance

This note complements [CONTRIBUTING.md](https://github.com/dave2k77/hpfracc/blob/main/CONTRIBUTING.md), [ALGORITHMS_ARCHITECTURE.md](ALGORITHMS_ARCHITECTURE.md), [ANALYTICS_ARCHITECTURE.md](ANALYTICS_ARCHITECTURE.md), [SPECIAL_ARCHITECTURE.md](SPECIAL_ARCHITECTURE.md), [UTILS_ARCHITECTURE.md](UTILS_ARCHITECTURE.md), [VALIDATION_ARCHITECTURE.md](VALIDATION_ARCHITECTURE.md), and [SOLVERS_ARCHITECTURE.md](SOLVERS_ARCHITECTURE.md). It maps **where benchmarks live**, how they differ, naming pitfalls, and lightweight import rules.

---

## 1. Four different “benchmark” surfaces

| Location | Role | Typical entry |
|----------|------|----------------|
| **`hpfracc/benchmarks/benchmark_runner.py`** | General numerical benchmarks (array sizes, synthetic test functions, timing/memory, optional plots + CSV export). | `BenchmarkRunner`, `BenchmarkConfig`, `BenchmarkResult` — used by `scripts/run_benchmarks.py` and `examples/benchmarks/benchmark_demo.py`. |
| **`hpfracc/benchmarks/ml_performance_benchmark.py`** | Torch-heavy ML layer benchmarks (`FractionalNeuralNetwork`, conv/LSTM/transformer, etc.). | **Canonical:** `MLPerformanceBenchmark`, `MLBenchmarkConfig`, `MLBenchmarkResult`. **Deprecated aliases:** `BenchmarkConfig` / `BenchmarkResult` (subclasses that emit `DeprecationWarning` on construction). |
| **`hpfracc/validation/benchmarks.py`** | Validation-oriented `PerformanceBenchmark` helpers (warmup, repeat timing) for numerical method comparison. | Used by validation workflows; **not** the same API as `BenchmarkRunner`. |
| **Repo root `benchmarks/`** | Standalone scripts (e.g. intelligent backend timing). | Run as scripts; not necessarily imported as a package. |

---

## 2. Naming and package surface

- **Numerical** (`benchmark_runner.py`): `BenchmarkConfig`, `BenchmarkResult`, `BenchmarkRunner` — these names are **only** for the numerical runner.
- **ML** (`ml_performance_benchmark.py`): use **`MLBenchmarkConfig`** and **`MLBenchmarkResult`** for new code. The old names `BenchmarkConfig` / `BenchmarkResult` in this module remain as **deprecated subclasses** (same constructor shape, `DeprecationWarning` in `__post_init__`).
- **`hpfracc/benchmarks/__init__.py`**: imports the numerical runner eagerly; exposes **`MLPerformanceBenchmark`**, **`MLBenchmarkConfig`**, **`MLBenchmarkResult`** via **`__getattr__`** so `import hpfracc.benchmarks` does **not** load PyTorch until you touch an ML symbol.
- **`validation.method_benchmarks.BenchmarkResult`** is a third, unrelated dataclass (includes `BenchmarkType`, `success`, etc.).
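
The lazy-export behaviour described for `hpfracc/benchmarks/__init__.py` can be sketched with a module-level `__getattr__` (PEP 562). This is a minimal illustration, not the package's actual code: `make_lazy_getattr`, `demo_benchmarks`, and `HeavySymbol` are hypothetical names, and `fractions.Fraction` stands in for a Torch-backed ML symbol so the example runs without PyTorch.

```python
import importlib
import types


def make_lazy_getattr(package_name, lazy_exports, importer=importlib.import_module):
    """Build a PEP 562 module-level __getattr__ that imports on first access."""

    def __getattr__(name):
        try:
            module_path, attribute = lazy_exports[name]
        except KeyError:
            # Unknown names must raise AttributeError so hasattr() keeps working.
            raise AttributeError(
                f"module {package_name!r} has no attribute {name!r}"
            ) from None
        return getattr(importer(module_path), attribute)

    return __getattr__


imported = []  # records which modules the lazy hook actually imported


def recording_importer(module_path):
    imported.append(module_path)
    return importlib.import_module(module_path)


# Throwaway module object standing in for a package __init__ (hypothetical name).
demo = types.ModuleType("demo_benchmarks")
demo.__getattr__ = make_lazy_getattr(
    "demo_benchmarks",
    {"HeavySymbol": ("fractions", "Fraction")},  # stand-in for an ML export
    recording_importer,
)

before = list(imported)   # snapshot taken before any lazy symbol is touched
heavy = demo.HeavySymbol  # first access triggers the import
```

In a real `__init__.py` the same idea is usually written as a plain `def __getattr__(name): ...` at module level; the factory form here only exists so the sketch can run against a throwaway module object.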

---

## 3. Dependencies and import coupling

**`benchmark_runner.py`**

- Core: **NumPy**, **psutil**, **json**, **logging** (no root `logging.basicConfig`; configure logging in applications or scripts).
- **Matplotlib** is imported **only inside** `_plot_performance_results`, `_plot_accuracy_results`, and `_plot_memory_results` so `import hpfracc.benchmarks.benchmark_runner` does not load pyplot for CSV-only or non-plotting paths.
- **pandas** is imported lazily inside the CSV export helper.
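
The function-local import pattern above can be illustrated as follows. This is a sketch under assumptions: `export_csv` and `plot_results` are hypothetical stand-ins, not the runner's real helpers, and the column names are invented for the example. Importing the module and calling the CSV path never touches matplotlib; only the plotting function pays the import cost.

```python
import csv
import io


def export_csv(rows):
    """CSV-only path: runs without importing matplotlib."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["size", "seconds"])  # hypothetical column names
    writer.writerows(rows)
    return buffer.getvalue()


def plot_results(rows, path):
    """Plotting path: matplotlib is imported only when this function runs."""
    import matplotlib

    matplotlib.use("Agg")  # headless backend, suitable for CI
    import matplotlib.pyplot as plt

    sizes, seconds = zip(*rows)
    fig, ax = plt.subplots()
    ax.plot(sizes, seconds, marker="o")
    ax.set_xlabel("size")
    ax.set_ylabel("seconds")
    fig.savefig(path)
    plt.close(fig)


# Only the CSV path is exercised here, so matplotlib need not be installed.
csv_text = export_csv([(100, 0.01), (1000, 0.12)])
```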

**`ml_performance_benchmark.py`**

- Eager: **torch**, **numpy**, **psutil**, and **hpfracc.ml** components (heavy).
- **Matplotlib / seaborn** are imported only inside **`_generate_visualizations`**.

**`validation/benchmarks.py`**

- **NumPy**, **psutil**, **warnings** — no matplotlib at module level.

---

## 4. Outputs and risks

| Risk | Mitigation |
|------|------------|
| **Default `output_dir="benchmark_results"`** (runner) is CWD-relative | Set `BenchmarkConfig(output_dir=...)` to a temp or project artifacts directory in CI. |
| **Name collision across numerical vs ML** | Use **`MLBenchmark*`** for ML; numerical `BenchmarkConfig` stays on `benchmark_runner`. Deprecated ML aliases will be removed in a future major release after a deprecation window. |
| **ML benchmark cost** | Full `run_comprehensive_benchmark` is expensive; gate behind explicit scripts or reduced configs in CI. |
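
The deprecated-alias arrangement referred to in the table (a subclass that warns on construction) might look roughly like the sketch below. The field names `batch_size` and `num_runs` are hypothetical placeholders, and the warning text is illustrative; only the class names and the use of `DeprecationWarning` in `__post_init__` come from this note.

```python
import warnings
from dataclasses import dataclass


@dataclass
class MLBenchmarkConfig:
    """Canonical config; the fields below are hypothetical placeholders."""

    batch_size: int = 32
    num_runs: int = 5


@dataclass
class BenchmarkConfig(MLBenchmarkConfig):
    """Deprecated alias: same constructor shape, warns when constructed."""

    def __post_init__(self):
        warnings.warn(
            "BenchmarkConfig is deprecated; use MLBenchmarkConfig instead.",
            DeprecationWarning,
            stacklevel=3,  # skip __post_init__ and the generated __init__
        )


# Record warnings to show that only the deprecated name emits one.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_style = MLBenchmarkConfig(batch_size=8)
    old_style = BenchmarkConfig(batch_size=8)
```

Because the alias is a subclass, `isinstance(old_style, MLBenchmarkConfig)` still holds, so code that type-checks against the canonical class keeps working during the deprecation window.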

---

## 5. Tests

- **`tests/test_benchmarks/`** — smoke tests: subprocess import guard (matplotlib not loaded for `benchmark_runner`), `BenchmarkRunner` with **temp `output_dir`**, JSON/CSV `save_results`, lazy ML exports on `hpfracc.benchmarks`, and ML **canonical vs deprecated** `DeprecationWarning` behaviour (skipped if PyTorch is unavailable).

When extending tests:

- Prefer a **temp `output_dir`** and **`MPLBACKEND=Agg`**, and **do not patch `builtins.open` globally** while matplotlib is in use (see [ANALYTICS_ARCHITECTURE.md](ANALYTICS_ARCHITECTURE.md) §7 for the same font-manager footgun).

```bash
python -m pytest tests/test_benchmarks/ -q
```
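
A subprocess import guard of the kind mentioned above can be sketched as below. It is an illustration, not the test suite's actual code: `import_leaves_unloaded` is a hypothetical helper, and the demo uses stdlib modules so it runs anywhere. In the real suite the arguments would presumably be `hpfracc.benchmarks.benchmark_runner` and `matplotlib`. A fresh interpreter is used because `sys.modules` in the test process may already contain the forbidden module from an earlier test.

```python
import subprocess
import sys

_FORBIDDEN_LOADED = 3  # exit code reserved for "forbidden module was imported"


def import_leaves_unloaded(module, forbidden):
    """Return True if importing `module` in a fresh interpreter does not load `forbidden`."""
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"sys.exit({_FORBIDDEN_LOADED} if {forbidden!r} in sys.modules else 0)\n"
    )
    completed = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    if completed.returncode not in (0, _FORBIDDEN_LOADED):
        # The import itself failed; surface the child's traceback.
        raise RuntimeError(completed.stderr)
    return completed.returncode == 0


# Stdlib stand-ins: importing json does not pull in sqlite3.
clean = import_leaves_unloaded("json", "sqlite3")
# Sanity check of the guard: json is trivially loaded after importing json.
dirty = import_leaves_unloaded("json", "json")
```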

---

## 6. Related documentation

- [ANALYTICS_ARCHITECTURE.md](ANALYTICS_ARCHITECTURE.md)
- [ALGORITHMS_ARCHITECTURE.md](ALGORITHMS_ARCHITECTURE.md)
- [SPECIAL_ARCHITECTURE.md](SPECIAL_ARCHITECTURE.md)
- [UTILS_ARCHITECTURE.md](UTILS_ARCHITECTURE.md)
- [VALIDATION_ARCHITECTURE.md](VALIDATION_ARCHITECTURE.md)
- [SOLVERS_ARCHITECTURE.md](SOLVERS_ARCHITECTURE.md)
- Scripts: `scripts/run_benchmarks.py`
- Examples: `examples/benchmarks/benchmark_demo.py`