Local setup
Use an editable install with development dependencies. Windows contributors need Visual Studio Build Tools with the C++ workload because Arnio builds a native extension.
python -m pip install -U pip setuptools wheel
python -m pip install -e ".[dev]"
pre-commit install
Testing expectations
Every behavior change needs tests. Add focused regressions for bug fixes, edge cases for new APIs, and smoke checks for examples or docs-facing scripts.
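As a rough illustration of what these tests can look like, the sketch below exercises a hypothetical helper, normalize_header, which is not part of Arnio and only stands in for whatever function a change touches. The tests use plain assert statements so pytest can collect them as written.

```python
def normalize_header(name):
    """Hypothetical stand-in for the API under test: trim and lowercase a name."""
    if not isinstance(name, str):
        raise TypeError(f"name must be str, got {type(name).__name__}")
    cleaned = name.strip().lower()
    if not cleaned:
        raise ValueError("name must contain a non-whitespace character")
    return cleaned


def test_valid_input():
    assert normalize_header("  Customer_ID ") == "customer_id"


def test_invalid_input_raises_type_error():
    try:
        normalize_header(42)
    except TypeError:
        pass
    else:
        raise AssertionError("expected TypeError for non-string input")


def test_edge_case_whitespace_only():
    try:
        normalize_header("   ")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for blank input")


def test_return_type():
    assert isinstance(normalize_header("id"), str)
```

In a real test module, pytest.raises is the more idiomatic way to assert on exceptions; the try/except form is used here only to keep the sketch dependency-free.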
| Change type | Expected checks |
|---|---|
| Python API | Pytest coverage for valid input, invalid input, edge cases, and return types. |
| C++ parser/core | Python regression tests plus C++ tests when native behavior changes directly. |
| Docs or examples | Example smoke tests where possible and link/text checks for stale APIs. |
| Benchmarks | Dry-run mode and smoke tests so benchmark scripts remain CI-safe. |
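The Benchmarks row can be pictured with a minimal sketch of a dry-run flag. Only the --dry-run flag name comes from this guide; the function names, row counts, and placeholder workload below are hypothetical and are not taken from benchmarks/benchmark_vs_pandas.py.

```python
import argparse
import time


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative benchmark runner")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="run a tiny workload once so CI can check the script still works",
    )
    return parser


def run_benchmark(rows, repeats):
    """Time a placeholder workload; a real script would call the code under test."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        sum(range(rows))  # placeholder workload
        timings.append(time.perf_counter() - start)
    return min(timings)


def main(argv=None):
    args = build_parser().parse_args(argv)
    # A dry run shrinks the workload instead of skipping it, so import errors
    # and API drift still surface in CI.
    rows, repeats = (1_000, 1) if args.dry_run else (5_000_000, 5)
    best = run_benchmark(rows, repeats)
    print(f"rows={rows} repeats={repeats} best={best:.6f}s")
    return rows, repeats
```

A smoke test can then call main(["--dry-run"]) and assert that it returns without raising, which keeps the check fast enough for every CI run.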
pytest
ruff check .
python benchmarks/benchmark_vs_pandas.py --dry-run
Pull request process
- Branch from current origin/main.
- Use a Conventional Commit PR title such as feat:, fix:, docs:, test:, perf:, or refactor:.
- Keep the PR focused on one issue or closely related behavior group.
- Include what changed, how it was tested, and any user-facing docs impact.
- Do not overwrite unrelated files or generated local artifacts.
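The title convention above can be checked locally with a small sketch like the following. The type list mirrors the examples in this guide, which are introduced with "such as", so the project may accept other types too. The optional scope and "!" marker follow the Conventional Commits format and are an assumption about what the project's tooling accepts.

```python
import re

# Types named in this guide; treat the list as illustrative, not exhaustive.
ALLOWED_TYPES = ("feat", "fix", "docs", "test", "perf", "refactor")

TITLE_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"  # optional "(scope)"
    r"(?P<breaking>!)?"              # optional breaking-change marker
    r": (?P<summary>\S.*)$"          # colon, one space, non-empty summary
)


def is_conventional_title(title):
    """Return True when the title looks like 'type(scope): summary'."""
    return TITLE_PATTERN.match(title) is not None
```

For example, is_conventional_title("fix: handle empty rows") passes, while a title without a type prefix or without the space after the colon does not.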
Contribution areas
CSV parser and C++ core
Malformed row handling, type inference, memory correctness, threading, and native transforms.
Python API
Frame ergonomics, validation, better errors, pandas/Arrow/DuckDB/Parquet workflows, and typing.
Quality and schema
Profiling signals, gates, schema fields, custom validators, exports, and privacy-aware reporting.
Docs, examples, website
Keep examples current with public APIs and update the website whenever shipped behavior changes.
Custom steps
Pipeline steps should accept an ArFrame or pandas DataFrame as documented, return the expected type, validate arguments, and preserve row/column contracts unless the function clearly documents otherwise.
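One way to picture the row/column contract is a wrapper that compares a frame before and after a step runs. This is a sketch under stated assumptions, not an Arnio feature: preserves_shape and TinyFrame are hypothetical names, and the wrapper assumes the frame supports len() and a columns attribute, which holds for pandas DataFrames and is only assumed for ArFrame.

```python
import functools


def preserves_shape(step):
    """Wrap a step so it fails loudly if the type, rows, or columns change."""

    @functools.wraps(step)
    def wrapper(frame, *args, **kwargs):
        rows_before, cols_before = len(frame), list(frame.columns)
        result = step(frame, *args, **kwargs)
        if type(result) is not type(frame):
            raise TypeError(
                f"{step.__name__} returned {type(result).__name__}, "
                f"expected {type(frame).__name__}"
            )
        if len(result) != rows_before or list(result.columns) != cols_before:
            raise ValueError(f"{step.__name__} changed the row/column contract")
        return result

    return wrapper


class TinyFrame:
    """Minimal stand-in frame used only to demonstrate the wrapper."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [tuple(row) for row in rows]

    def __len__(self):
        return len(self.rows)


@preserves_shape
def keep_all(frame):
    return TinyFrame(frame.columns, frame.rows)


@preserves_shape
def drop_first_row(frame):
    return TinyFrame(frame.columns, frame.rows[1:])
```

Here keep_all passes the check, while drop_first_row raises ValueError. A step that intentionally filters rows would skip such a wrapper and document the change instead.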
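# Assumes the package is imported under the alias used below.
import arnio as ar

def trim_customer_id(frame):
    # Strip surrounding whitespace from the customer_id column only.
    return ar.strip_whitespace(frame, subset=["customer_id"])

ar.register_step("trim_customer_id", trim_customer_id, overwrite=True)
Release flow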
Releases are automated through Release Please and GitHub Actions. User-facing changes should use release-note-friendly PR titles. Maintainers verify wheels, PyPI, GitHub release notes, and smoke installs before announcing a release.
- Merge user-facing changes into main with conventional PR titles.
- Review and merge the Release Please PR.
- Confirm build and publish workflows succeed for the tag.
- Smoke test pip install arnio and relevant extras.
- Update website/docs when public behavior has changed.
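The smoke-install step could be scripted roughly as follows. This is a sketch rather than the maintainers' actual tooling: the helper names are hypothetical, and it assumes the importable module is named arnio, matching the package name.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def venv_python(venv_dir):
    """Return the interpreter path inside a virtual environment."""
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return Path(venv_dir) / bin_dir / exe


def smoke_commands(python, extras=()):
    """Build the install and import commands without running them."""
    target = "arnio" + (f"[{','.join(extras)}]" if extras else "")
    return [
        [str(python), "-m", "pip", "install", target],
        [str(python), "-c", "import arnio"],  # assumed module name
    ]


def run_smoke_test(extras=()):
    """Install into a throwaway venv and import the package (needs network)."""
    with tempfile.TemporaryDirectory() as tmp:
        venv.create(tmp, with_pip=True)
        for command in smoke_commands(venv_python(tmp), extras):
            subprocess.run(command, check=True)
```

Calling run_smoke_test() checks the base install. Pass the names of the extras being released as a tuple to cover them as well.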