Local setup

Use an editable install with development dependencies. Windows contributors need Visual Studio Build Tools with the C++ workload because Arnio builds a native extension.

Shell
python -m pip install -U pip setuptools wheel
python -m pip install -e ".[dev]"
pre-commit install

Testing expectations

Every behavior change needs tests. Add focused regressions for bug fixes, edge cases for new APIs, and smoke checks for examples or docs-facing scripts.

Change type        Expected checks
Python API         Pytest coverage for valid input, invalid input, edge cases, and return types.
C++ parser/core    Python regression tests plus C++ tests when native behavior changes directly.
Docs or examples   Example smoke tests where possible and link/text checks for stale APIs.
Benchmarks         Dry-run mode and smoke tests so benchmark scripts remain CI-safe.

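
As a rough illustration of the Python API expectations, the sketch below covers valid input, invalid input, an edge case, and the return type for a stand-in helper. `parse_bool` is hypothetical and not part of Arnio; it only exists so the tests have something to exercise. The tests use plain asserts, so pytest collects them without extra imports.

```python
def parse_bool(text):
    """Hypothetical helper standing in for a public API under test."""
    if not isinstance(text, str):
        raise TypeError("text must be a str")
    value = text.strip().lower()
    if value in {"true", "yes", "1"}:
        return True
    if value in {"false", "no", "0"}:
        return False
    raise ValueError(f"cannot parse {text!r} as a boolean")


def test_valid_input():
    assert parse_bool("yes") is True
    assert parse_bool("0") is False


def test_edge_case_surrounding_whitespace():
    assert parse_bool("  TRUE \n") is True


def test_return_type():
    assert isinstance(parse_bool("no"), bool)


def test_invalid_input():
    # pytest.raises is the usual idiom; try/except keeps the sketch import-free.
    for bad in ("maybe", ""):
        try:
            parse_bool(bad)
        except ValueError:
            continue
        raise AssertionError(f"{bad!r} should have raised ValueError")
```

A regression test for a bug fix would typically follow the same shape: one small test named after the behavior that used to break.
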
Shell
pytest
ruff check .
python benchmarks/benchmark_vs_pandas.py --dry-run
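
One way a benchmark script might stay CI-safe is to shrink its workload when `--dry-run` is passed while still exercising the same code path. The sketch below is not the actual `benchmarks/benchmark_vs_pandas.py`; the `--rows` flag, the row counts, and `run_benchmark` are assumptions made for illustration.

```python
import argparse
import time


def run_benchmark(rows):
    """Placeholder workload: sum a range and report elapsed seconds."""
    start = time.perf_counter()
    total = sum(range(rows))
    elapsed = time.perf_counter() - start
    return {"rows": rows, "total": total, "seconds": elapsed}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Benchmark sketch")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="use a tiny workload so CI finishes quickly",
    )
    parser.add_argument("--rows", type=int, default=5_000_000)
    args = parser.parse_args(argv)

    # In dry-run mode, cap the workload but keep the same code path.
    rows = min(args.rows, 1_000) if args.dry_run else args.rows
    result = run_benchmark(rows)
    print(f"rows={result['rows']} seconds={result['seconds']:.4f}")
    return result
```

Accepting `argv` as a parameter lets a smoke test call `main(["--dry-run"])` directly instead of spawning a subprocess; a real script would also call `main()` under the usual `__main__` guard.
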

Pull request process

  • Branch from current origin/main.
  • Use a Conventional Commit PR title such as feat:, fix:, docs:, test:, perf:, or refactor:.
  • Keep the PR focused on one issue or a closely related group of behaviors.
  • Include what changed, how it was tested, and any user-facing docs impact.
  • Do not overwrite unrelated files or generated local artifacts.
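
The title convention above could be checked locally with a small script. This is a sketch under assumptions, not the project's CI check: the accepted types are only the ones named in the list, and the optional `(scope)` and `!` marker follow the general Conventional Commits format. The project may accept other types as well.

```python
import re

# Types taken from the list above; "(scope)" and "!" are optional parts
# of the Conventional Commits format.
TITLE_PATTERN = re.compile(
    r"^(feat|fix|docs|test|perf|refactor)(\([\w./-]+\))?!?: \S.*$"
)


def is_conventional_title(title):
    """Return True if the PR title starts with an accepted type prefix."""
    return TITLE_PATTERN.match(title) is not None
```

For example, `is_conventional_title("fix: handle empty rows")` passes, while a title such as `"Fix stuff"` does not because it lacks the lowercase type prefix and colon.
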

Contribution areas

CSV parser and C++ core

Malformed row handling, type inference, memory correctness, threading, and native transforms.

Python API

Frame ergonomics, validation, better errors, pandas/Arrow/DuckDB/Parquet workflows, and typing.

Quality and schema

Profiling signals, gates, schema fields, custom validators, exports, and privacy-aware reporting.

Docs, examples, website

Keep examples current with public APIs and update the website whenever shipped behavior changes.

Custom steps

Pipeline steps should accept an ArFrame or pandas DataFrame as documented, return the expected type, validate arguments, and preserve row/column contracts unless the function clearly documents otherwise.

Python
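import arnio as ar  # alias assumed from the ar.* calls below

def trim_customer_id(frame):
    # Limit whitespace stripping to the customer_id column.
    return ar.strip_whitespace(frame, subset=["customer_id"])

# overwrite=True replaces any step already registered under this name.
ar.register_step("trim_customer_id", trim_customer_id, overwrite=True)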
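
To make the row/column contract explicit while developing a step, a wrapper can compare the frame before and after the call. The decorator below is a hypothetical helper, not part of Arnio. It assumes the frame supports `len()` and exposes `.columns`, as a pandas DataFrame does; whether ArFrame offers the same interface is an assumption to verify.

```python
from functools import wraps


def preserves_contract(step):
    """Wrap a step so it raises if the row count, columns, or type change."""

    @wraps(step)
    def wrapper(frame, *args, **kwargs):
        rows_before = len(frame)
        columns_before = list(frame.columns)
        result = step(frame, *args, **kwargs)

        if not isinstance(result, type(frame)):
            raise TypeError(
                f"{step.__name__} returned {type(result).__name__}, "
                f"expected {type(frame).__name__}"
            )
        if len(result) != rows_before:
            raise ValueError(f"{step.__name__} changed the row count")
        if list(result.columns) != columns_before:
            raise ValueError(f"{step.__name__} changed the columns")
        return result

    return wrapper
```

A step that intentionally drops rows or adds columns would skip this wrapper and document the change instead.
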

Release flow

Releases are automated through Release Please and GitHub Actions. User-facing changes should use release-note-friendly PR titles. Maintainers verify wheels, PyPI, GitHub release notes, and smoke installs before announcing a release.

  1. Merge user-facing changes into main with conventional PR titles.
  2. Review and merge the Release Please PR.
  3. Confirm build and publish workflows succeed for the tag.
  4. Smoke test pip install arnio and relevant extras.
  5. Update the website and docs when public behavior has changed.
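
The smoke-install step can be scripted so it behaves the same on every platform. The sketch below uses only the standard library; `smoke_check` is a hypothetical helper, and it assumes the import name matches the distribution name `arnio`.

```python
from importlib import import_module
from importlib.metadata import PackageNotFoundError, version


def smoke_check(dist_name, module_name=None):
    """Return the installed version after importing the package, else None."""
    try:
        installed = version(dist_name)
    except PackageNotFoundError:
        return None
    # Raises ImportError if the package (or its native extension) cannot load.
    import_module(module_name or dist_name)
    return installed


if __name__ == "__main__":
    print("arnio", smoke_check("arnio"))
```

Running this in a fresh virtual environment after `pip install arnio` would be expected to print the released version; `None` means the distribution was not found in that environment.
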