v1.18.0 (Released)
Arrow export and bounded schema validation
Latest public release. It adds to_arrow, bool dtype detection, and max_errors for schema validation.
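The idea behind a max_errors bound can be sketched in plain Python: stop collecting validation failures once a cap is reached, so a badly malformed file does not produce an unbounded report. This is a minimal illustration only. The function validate_rows, its checks mapping, and the message format are hypothetical and are not the library's actual API; only the parameter name max_errors comes from the release note.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional


def validate_rows(
    rows: Iterable[Dict[str, Any]],
    checks: Dict[str, Callable[[Any], bool]],
    max_errors: Optional[int] = None,
) -> List[str]:
    """Collect validation failures, stopping early once max_errors is reached.

    Hypothetical helper for illustration; not the library's implementation.
    """
    errors: List[str] = []
    for index, row in enumerate(rows):
        for column, check in checks.items():
            if not check(row.get(column)):
                errors.append(f"row {index}: column {column!r} failed validation")
                # Bounded mode: return as soon as the cap is hit.
                if max_errors is not None and len(errors) >= max_errors:
                    return errors
    return errors


rows = [{"age": -1}, {"age": -2}, {"age": 3}, {"age": -5}]
checks = {"age": lambda value: isinstance(value, int) and value >= 0}

bounded = validate_rows(rows, checks, max_errors=2)
unbounded = validate_rows(rows, checks)
```

With the cap set to 2, the loop returns after the second failing row and never inspects the remaining rows, which is the main benefit on large inputs.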
v1.17.x (Released)
Large production feature wave
- CSV hardening: on_bad_lines, dtype overrides, skip rows, decimal separators, encoding errors, Unicode paths, duplicate headers, and clearer row diagnostics.
- Frame ergonomics: from_records, schema_summary, describe, to_dict, and stronger column guards.
- Interop: DuckDB registration, Parquet writing, Arrow-facing work, pandas accessor improvements, and scikit-learn safety checks.
- Quality and schema: near-constant warnings, high-cardinality warnings, schema YAML export, custom validators, and Markdown/pandas output for validation results.
Current main (Unreleased)
Polish before the next release
- Frame selection and dropping methods are now more ergonomic.
- Quality reports can exclude sensitive columns and export JSON.
- Benchmark regression checks and dry-run tests improve maintenance confidence.
- Whitespace-only duplicate headers and tuple mapping keys are handled more safely.
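To make the whitespace-only duplicate header case concrete, here is one possible way such headers could be normalized: blank names get a positional placeholder and repeated names get a numeric suffix. This is a sketch under assumptions. The function dedupe_headers and the column_N and name_N naming schemes are hypothetical, and the project may resolve these cases differently.

```python
from typing import List


def dedupe_headers(headers: List[str]) -> List[str]:
    """Return unique, non-blank column names.

    Hypothetical helper for illustration; the naming scheme is an assumption.
    """
    used = set()
    result = []
    for position, raw in enumerate(headers):
        # A header made only of whitespace strips to "", so give it a placeholder.
        base = raw.strip() or f"column_{position}"
        name = base
        suffix = 1
        # Keep bumping the suffix until the name collides with nothing seen so far.
        while name in used:
            name = f"{base}_{suffix}"
            suffix += 1
        used.add(name)
        result.append(name)
    return result


cleaned = dedupe_headers(["id", " ", "  ", "id"])
```

Without the strip step, the headers " " and "  " would count as distinct names even though both are effectively empty, which is the kind of silent ambiguity this handling is meant to avoid.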
Next (Planned)
Performance parity and streaming depth
- Continue optimizing parser and cleaning paths using benchmark-driven changes.
- Expand chunked processing beyond reading into more streaming-friendly transformation and validation workflows.
- Add clearer benchmark baselines for release comparisons across supported platforms.
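As a rough picture of what streaming-friendly validation could look like, the sketch below reads a CSV in fixed-size chunks with the standard library and accumulates a check across chunks without holding the whole file in memory. The names iter_chunks and count_missing are hypothetical and illustrate the general pattern only; they do not describe the planned API.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def iter_chunks(lines: Iterable[str], chunk_size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most chunk_size parsed rows (hypothetical helper)."""
    reader = csv.DictReader(lines)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def count_missing(lines: Iterable[str], column: str, chunk_size: int = 2) -> int:
    """Count blank values in one column, one chunk at a time."""
    missing = 0
    for chunk in iter_chunks(lines, chunk_size):
        missing += sum(1 for row in chunk if not (row.get(column) or "").strip())
    return missing


sample = "id,name\n1,Ada\n2,\n3,Grace\n4, \n5,Linus\n"
missing_names = count_missing(io.StringIO(sample), "name")
chunk_sizes = [len(chunk) for chunk in iter_chunks(io.StringIO(sample), 2)]
```

Only one chunk is resident at a time, so peak memory depends on chunk_size and not on file length.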
Maintenance (Ongoing)
Docs, release, and contributor experience
- Keep website docs aligned with public releases and current main.
- Preserve release automation through Release Please, GitHub Actions, wheels, and PyPI Trusted Publishing.
- Maintain label quality, issue scope, GSSoC review flow, and test expectations for merged contributions.