v1.18.0 (Released)

Arrow export and bounded schema validation

Latest public release. It adds to_arrow, bool dtype detection, and max_errors for schema validation.
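To illustrate what a bounded validation pass means in practice, here is a minimal plain-Python sketch. It does not use this library's API: the `validate_rows` function, its `checks` mapping, and the error message format are all hypothetical, and only the `max_errors` name is borrowed from the release note. The idea shown is simply that error collection stops once the cap is reached instead of scanning every row.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional


def validate_rows(
    rows: Iterable[Dict[str, Any]],
    checks: Dict[str, Callable[[Any], bool]],
    max_errors: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: collect validation errors, stopping at max_errors."""
    errors: List[str] = []
    for index, row in enumerate(rows):
        for column, check in checks.items():
            if not check(row.get(column)):
                errors.append(f"row {index}: column {column!r} failed validation")
                # Stop early once the cap is hit so bad inputs stay cheap to report.
                if max_errors is not None and len(errors) >= max_errors:
                    return errors
    return errors


# Illustrative data only: three of the four rows violate the check.
rows = [{"age": 31}, {"age": -4}, {"age": "n/a"}, {"age": None}]
checks = {"age": lambda value: isinstance(value, int) and value >= 0}

capped = validate_rows(rows, checks, max_errors=2)
uncapped = validate_rows(rows, checks)
```

With the cap set to 2, the scan returns after the second failing row; without a cap, all three failures are reported. The real feature may differ in how it counts and reports errors.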

v1.17.x (Released)

Large production feature wave

  • CSV hardening: on_bad_lines, dtype overrides, skip rows, decimal separators, encoding errors, Unicode paths, duplicate headers, and clearer row diagnostics.
  • Frame ergonomics: from_records, schema_summary, describe, to_dict, and stronger column guards.
  • Interop: DuckDB registration, Parquet writing, Arrow-facing work, pandas accessor improvements, and scikit-learn safety checks.
  • Quality and schema: near-constant warnings, high-cardinality warnings, schema YAML export, custom validators, and validation Markdown/Pandas output.

Current main (Unreleased)

Polish before the next release

  • Frame selection and dropping methods are now more ergonomic.
  • Quality reports can exclude sensitive columns and export JSON.
  • Benchmark regression checks and dry-run tests improve maintenance confidence.
  • Whitespace-only duplicate headers and tuple mapping keys are handled more safely.

Next (Planned)

Performance parity and streaming depth

  • Continue optimizing parser and cleaning paths using benchmark-driven changes.
  • Expand chunked processing beyond reading into more streaming-friendly transformation and validation workflows.
  • Add clearer benchmark baselines for release comparisons across supported platforms.

Maintenance (Ongoing)

Docs, release, and contributor experience

  • Keep website docs aligned with public releases and current main.
  • Preserve release automation through Release Please, GitHub Actions, wheels, and PyPI Trusted Publishing.
  • Maintain label quality, issue scope, GSSoC review flow, and test expectations for merged contributions.