hpfracc.analytics — architecture, dependencies, and maintenance
This note complements CONTRIBUTING.md, ALGORITHMS_ARCHITECTURE.md, SPECIAL_ARCHITECTURE.md, SOLVERS_ARCHITECTURE.md, UTILS_ARCHITECTURE.md, and VALIDATION_ARCHITECTURE.md. It describes how the analytics package is structured, what it depends on, how data flows, known risks, and how to exercise tests locally.
1. Design goals
- Opt-in, telemetry-style tracking of estimator/method names, parameters, array sizes, and success flags, persisted locally (SQLite by default) and not sent to a remote service.
- Four concerns, four submodules: usage popularity, performance timing/memory, error/reliability patterns, and workflow/session sequences.
- Single façade (AnalyticsManager + AnalyticsConfig) for coordinated tracking, export (json/csv/html), and retention cleanup.
- Isolation from numerical core: hpfracc.core and hpfracc.algorithms do not import hpfracc.analytics; integration is call-site only (examples, demos, or future explicit hooks).
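The isolation goal lends itself to a small static check. The following is a minimal sketch, not part of hpfracc: forbidden_imports is a hypothetical helper that parses source text with the standard ast module and lists any absolute import that falls under a forbidden package. Relative imports and dynamic imports (for example via importlib) are outside what it can see.

```python
import ast


def forbidden_imports(source: str, forbidden: str) -> list[str]:
    """Return the imported names in `source` that fall under `forbidden`.

    Hypothetical helper for illustration; only absolute imports are checked.
    """
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Cover both "from pkg.sub import x" and "from pkg import sub".
            names = [node.module]
            names += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if name == forbidden or name.startswith(forbidden + "."):
                hits.add(name)
    return sorted(hits)
```

A test could read each file under hpfracc/core and hpfracc/algorithms and assert that forbidden_imports(text, "hpfracc.analytics") returns an empty list.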
2. Module layout (mental model)
| Component | File | Responsibility |
|---|---|---|
| Facade | | |
| Usage | | |
| Performance | | |
| Errors | | |
| Workflow | | |
| Package surface | | Re-exports the six public symbols listed in |
3. Dependency diagram
AnalyticsManager is the only importer of all four submodules at module level. Submodules do not import each other.
flowchart TB
subgraph analytics_pkg["hpfracc.analytics"]
CFG["AnalyticsConfig"]
MGR["AnalyticsManager"]
UT["UsageTracker"]
PM["PerformanceMonitor"]
EA["ErrorAnalyzer"]
WI["WorkflowInsights"]
end
subgraph storage["Local persistence"]
SQL["SQLite\n(manager: under report dir)"]
FS["Report dir\n(analytics_reports /)"]
end
subgraph heavy["Other deps"]
PD["pandas\n(lazy: CSV export)"]
PLT["matplotlib + seaborn\n(lazy: HTML plots)"]
PSU["psutil"]
NP["numpy"]
end
CFG --> MGR
MGR --> UT
MGR --> PM
MGR --> EA
MGR --> WI
UT --> SQL
PM --> SQL
EA --> SQL
WI --> SQL
MGR --> FS
PM --> PSU
PM --> NP
Import cost: analytics_manager does not import pandas or matplotlib at module load. pandas is imported inside _generate_csv_report only; matplotlib and seaborn inside _create_analytics_plots (HTML report path). JSON-only workflows avoid those imports. The diagram keeps pandas/matplotlib in a separate box as a reminder, not as eager imports from MGR.
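The deferred-import pattern described here can be sketched with standard-library modules only. This is an illustration of the technique, not the code in analytics_manager: rows_to_csv and loaded are hypothetical names, and the csv module stands in for pandas so the example runs without extra dependencies.

```python
import sys


def rows_to_csv(header, rows):
    """Serialize rows to CSV text, importing the CSV machinery on demand."""
    # Deferred imports: these statements execute only when the function is
    # called, so callers that never export CSV never trigger them.
    import csv
    import io

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def loaded(*module_names):
    """Report which of the given modules are already in sys.modules."""
    return [name for name in module_names if name in sys.modules]
```

A regression test for import cost could import the package in a fresh interpreter and assert that loaded("pandas", "matplotlib", "seaborn") is empty before any report is generated.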
4. Data flow (typical use)
- Caller constructs AnalyticsConfig and AnalyticsManager. SQLite files default to <report_output_dir>/_analytics_data/*.db unless database_dir is set explicitly.
- On each logical “method run”, caller invokes track_method_call(...) (and optionally wraps execution in monitor_method_performance(...)). AnalyticsManager forwards to:
  - UsageTracker.track_usage
  - WorkflowInsights.track_workflow_event
  - ErrorAnalyzer.track_error (only if an exception object is passed).
- Performance events are recorded separately via PerformanceMonitor’s context manager (used from monitor_method_performance).
- Aggregation/reporting: get_comprehensive_analytics, generate_analytics_report, export_all_data, cleanup_old_data.
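The forwarding in the steps above can be pictured with a toy model. Everything in this sketch is an assumption made for illustration: RecordingTracker and ToyManager are in-memory stand-ins, and the track_method_call signature shown is invented, so it may differ from the real AnalyticsManager method.

```python
class RecordingTracker:
    """Toy in-memory tracker; a stand-in for the SQLite-backed submodules."""

    def __init__(self):
        self.events = []

    def record(self, **event):
        self.events.append(event)


class ToyManager:
    """Hypothetical facade that mimics the forwarding order described above."""

    def __init__(self):
        self.usage = RecordingTracker()
        self.workflow = RecordingTracker()
        self.errors = RecordingTracker()

    def track_method_call(self, method_name, parameters, array_size, error=None):
        success = error is None
        # Usage and workflow trackers see every call.
        self.usage.record(method=method_name, parameters=parameters,
                          array_size=array_size, success=success)
        self.workflow.record(method=method_name, success=success)
        # The error tracker is involved only when an exception object is passed.
        if error is not None:
            self.errors.record(method=method_name,
                               error_type=type(error).__name__,
                               message=str(error))
        return success
```

The point of the shape is that the caller makes one call per method run and the façade decides which trackers receive it.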
There is no automatic instrumentation of Caputo / RiemannLiouville / etc.; any integration must be added explicitly in application or example code.
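Explicit call-site instrumentation usually takes the form of a context manager around the numerical call. The sketch below shows that pattern with the standard library only. It is not PerformanceMonitor: monitor and its event fields are hypothetical, it records wall-clock time and a success flag, and it does not sample memory.

```python
import time
from contextlib import contextmanager


@contextmanager
def monitor(method_name, sink):
    """Time the wrapped block and append one event dict to `sink`."""
    start = time.perf_counter()
    success = True
    try:
        yield
    except Exception:
        # Record the failure, then let the exception propagate to the caller.
        success = False
        raise
    finally:
        sink.append({
            "method": method_name,
            "seconds": time.perf_counter() - start,
            "success": success,
        })


events = []
with monitor("toy_sum", events):
    total = sum(i * i for i in range(1000))
```

Because the event is appended in the finally clause, a failing call still leaves a record, which is what makes error-rate reporting possible.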
5. Naming and boundaries
- AnalyticsManager vs AnalyticsConfig: manager holds runtime state (session_id, subcomponents, output_dir); config is a frozen-style dataclass of feature flags and export settings.
- Database filenames (usage_analytics.db, performance_analytics.db, …) are defaults; tests should override with temp paths (see tests/test_analytics/).
- No naming collision with hpfracc.ml or benchmarks modules; the word “analytics” here means library usage telemetry, not autograd “forward pass analytics”.
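Keeping database files out of the working directory in tests can be sketched as follows. resolve_db_path is a hypothetical helper, not an hpfracc function; it follows the default layout this note describes (an _analytics_data directory under the report directory unless a database directory is given), and the table it creates is invented for the demonstration.

```python
import sqlite3
import tempfile
from pathlib import Path


def resolve_db_path(report_output_dir, filename, database_dir=None):
    """Place the database under an explicit directory instead of the CWD."""
    if database_dir:
        base = Path(database_dir)
    else:
        base = Path(report_output_dir) / "_analytics_data"
    base.mkdir(parents=True, exist_ok=True)
    return base / filename


def demo():
    # A temporary directory plays the role of pytest's tmp_path.
    with tempfile.TemporaryDirectory() as tmp:
        db_path = resolve_db_path(tmp, "usage_analytics.db")
        conn = sqlite3.connect(db_path)
        try:
            conn.execute("CREATE TABLE usage (method TEXT)")
            conn.execute("INSERT INTO usage VALUES (?)", ("caputo",))
            conn.commit()
            count = conn.execute("SELECT COUNT(*) FROM usage").fetchone()[0]
        finally:
            # Close before the directory is removed (required on Windows).
            conn.close()
        return db_path.parent.name, db_path.name, count
```

Resolving the path once, up front, also makes the location easy to log when a stray .db file turns up.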
6. Risk register and mitigations
| Risk | Mitigation / note |
|---|---|
| SQLite relative to CWD (standalone trackers) | |
| | Still relative to CWD ( |
| Headless / optional plotting | HTML report path imports matplotlib/seaborn; may need a GUI backend or |
| | Declared in |
| Swallowed failures | |
| Privacy / portability | Parameters are JSON-serialized into SQLite; callers should avoid putting secrets into |
| HTML reports embed emoji in static strings | Cosmetic; harmless for file output, irrelevant for numerical correctness. |
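For the privacy risk, one caller-side option is to mask sensitive values before they are handed to any tracker. The sketch below is an assumption, not existing hpfracc behaviour: redact_parameters and the key list are hypothetical, and the key list would need to match whatever a given application actually passes.

```python
import json

# Illustrative list only; extend it to match the caller's own parameter names.
SENSITIVE_KEYS = frozenset({"password", "token", "api_key", "secret"})


def redact_parameters(parameters, sensitive_keys=SENSITIVE_KEYS):
    """Return a JSON string with sensitive values masked."""
    cleaned = {
        key: "<redacted>" if key.lower() in sensitive_keys else value
        for key, value in parameters.items()
    }
    # default=str keeps serialization from failing on non-JSON types
    # by storing their string form instead.
    return json.dumps(cleaned, sort_keys=True, default=str)
```

Matching on key names is coarse; it will not catch a secret stored under an innocuous key, so the safest habit remains not passing such values at all.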
7. Tests and coverage
Pytest tree (representative):
- tests/test_analytics/: expanded and comprehensive tests per submodule.
- tests_unittest/test_analytics.py: lighter unittest-style smoke paths.
Example focused run from repo root:
python -m pytest tests/test_analytics/ tests_unittest/test_analytics.py -q
Optional coverage (whole package avoids some Windows/JAX + pytest-cov edge cases—same guidance as ALGORITHMS_ARCHITECTURE.md §6):
python -m pytest tests/test_analytics/ --cov=hpfracc --cov-report=term-missing:skip-covered -q
HTML / matplotlib tests: Avoid patch("builtins.open", mock_open()) while exercising _generate_html_report / generate_analytics_report with plotting: matplotlib’s font manager opens real font paths via open, and a global open mock can raise PytestUnraisableExceptionWarning (FT2Font / expected bytes, str found). Prefer tmp_path for report_output_dir, set MPLBACKEND=Agg, and let HTML files use the real open (see tests/test_analytics/test_analytics_manager_comprehensive_coverage.py).
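The recommended test shape (temporary directory, Agg backend, real open) can be sketched like this. write_html_report and the file name analytics_report.html are hypothetical stand-ins for the real report generator, which is not reproduced here; the sketch only shows the hygiene around it.

```python
import os
import tempfile
from pathlib import Path

# Select a non-interactive backend before matplotlib is first imported.
os.environ.setdefault("MPLBACKEND", "Agg")


def write_html_report(report_output_dir: Path) -> Path:
    """Hypothetical stand-in for the report generator; uses the real open()."""
    report_output_dir.mkdir(parents=True, exist_ok=True)
    target = report_output_dir / "analytics_report.html"
    with open(target, "w", encoding="utf-8") as handle:
        handle.write("<html><body><h1>Analytics</h1></body></html>")
    return target


def test_report_is_written(tmp_path: Path) -> None:
    # pytest injects tmp_path; builtins.open is never patched.
    report = write_html_report(tmp_path / "reports")
    assert report.exists()
    assert "<h1>Analytics</h1>" in report.read_text(encoding="utf-8")


if __name__ == "__main__":
    # Outside pytest, a temporary directory plays the role of tmp_path.
    with tempfile.TemporaryDirectory() as tmp:
        test_report_is_written(Path(tmp))
```

If a test does need to intercept file writes, patching the specific function under test is less invasive than replacing open globally.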
9. Consolidation / deprecation candidates (no action required unless you choose)
These are observations, not committed roadmap items:
- Single SQLite module: The four trackers repeat similar _setup_database / export / retention patterns; a small internal sqlite_store.py could deduplicate boilerplate without changing public API.
- Further lazy imports: seaborn is only needed inside _create_analytics_plots; could defer its import to the first line of that helper (minor).
- Optional extra: Declare an analytics optional extra in pyproject.toml if the project ever splits “minimal numerical install” from “telemetry + reporting”; today analytics ships with the main package surface in hpfracc/analytics/.
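To make the first candidate concrete, here is one possible shape for a shared store. It is a sketch under stated assumptions: SqliteStore, its two-column schema (timestamp, payload), and its method names are all hypothetical and do not reflect the existing trackers' tables.

```python
import sqlite3
import time


class SqliteStore:
    """Hypothetical shared helper: table setup, insert, retention cleanup."""

    def __init__(self, db_path, table):
        # Table names cannot be bound as SQL parameters, so validate them.
        if not table.isidentifier():
            raise ValueError(f"invalid table name: {table!r}")
        self.table = table
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(timestamp REAL NOT NULL, payload TEXT NOT NULL)"
        )

    def insert(self, payload, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                f"INSERT INTO {self.table} VALUES (?, ?)", (ts, payload)
            )

    def cleanup_older_than(self, max_age_seconds, now=None):
        """Delete rows older than the cutoff and return how many were removed."""
        cutoff = (time.time() if now is None else now) - max_age_seconds
        with self.conn:
            cursor = self.conn.execute(
                f"DELETE FROM {self.table} WHERE timestamp < ?", (cutoff,)
            )
        return cursor.rowcount

    def count(self):
        query = f"SELECT COUNT(*) FROM {self.table}"
        return self.conn.execute(query).fetchone()[0]

    def close(self):
        self.conn.close()
```

Accepting now and timestamp as optional arguments keeps retention logic testable without sleeping or patching the clock.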