feat(pkg-py): Replace pandas with narwhals #175
base: main
09816ed to 1081e4d
- """A DataSource implementation that wraps a pandas DataFrame using DuckDB."""
+ """A DataSource implementation that wraps a DataFrame using DuckDB."""
_df: nw.DataFrame | nw.LazyFrame
I'm pretty sure it's going to make sense for us to have a separate LazyFrameSource, which I'll do in a follow-up PR (before the next release). The benefit is that we can be lazier about computation, and possibly have .df() also return a LazyFrame in that scenario.
# Ensure we're working with a DataFrame, not a LazyFrame
ndf = (
    self._df.head(10).collect()
Note that downstream calculation of ranges and unique values wasn't working properly because it was based on only the first 10 rows. I'll address this when doing the new LazyFrameSource implementation.
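To illustrate why a 10-row preview skews those statistics, here is a minimal, library-free sketch. Plain Python lists stand in for a DataFrame column, and `column_summary` is a hypothetical helper for illustration only, not part of querychat or narwhals.

```python
def column_summary(values):
    """Return (min, max, sorted unique values) for a list of values."""
    return min(values), max(values), sorted(set(values))


# Hypothetical column with 25 rows; the larger values only appear after row 10.
full_column = list(range(1, 11)) + [50, 75, 100] * 5

# Summary computed on a 10-row preview, mirroring `.head(10)`.
preview_summary = column_summary(full_column[:10])

# Summary computed on every row.
full_summary = column_summary(full_column)

print(preview_summary[:2])  # range as seen by the preview
print(full_summary[:2])     # true range of the column
```

Any range or unique-value list derived from the preview misses every value that first appears after row 10, which is the kind of mismatch described in the comment above.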
65e778b to 167317d
Remove pandas as a required dependency in favor of narwhals, which provides a unified DataFrame interface supporting both pandas and polars backends.

Changes:
- Add _df_compat.py module with read_csv, read_sql, and duckdb_result_to_nw helpers
- Update DataSource classes to return narwhals DataFrames
- Update df_to_html to generate HTML without pandas dependency
- Make pandas and polars optional dependencies
- Add comprehensive tests for DataFrameSource and df_compat module

Users can now install with either `pip install querychat[pandas]` or `pip install querychat[polars]`. Use `.to_native()` on returned DataFrames to get the underlying pandas or polars DataFrame.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
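As a rough idea of what generating HTML without pandas can look like, here is a minimal standard-library sketch. It is not the PR's `df_to_html` implementation: `rows_to_html`, its `max_rows` parameter, and the sample data are all hypothetical names introduced for illustration.

```python
from html import escape


def rows_to_html(columns, rows, max_rows=5):
    """Render column names and row tuples as an HTML table string."""
    # Escape every header and cell so values like "a<b" cannot break the markup.
    header = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = []
    for row in rows[:max_rows]:
        cells = "".join(f"<td>{escape(str(v))}</td>" for v in row)
        body.append(f"<tr>{cells}</tr>")
    return (
        "<table>"
        f"<thead><tr>{header}</tr></thead>"
        f"<tbody>{''.join(body)}</tbody>"
        "</table>"
    )


# Hypothetical sample data: two columns, two rows.
html_table = rows_to_html(["name", "score"], [("a<b", 1), ("c", 2)])
print(html_table)
```

With a narwhals DataFrame, the inputs could plausibly come from `df.columns` and `df.rows()`; whether querychat does it that way is an assumption, not something this page shows.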
167317d to 0ffa6e7
98ef15d to 288cd55
This PR removes `pandas` as a required dependency and replaces it with `narwhals`, a lightweight DataFrame abstraction layer that supports both pandas and polars backends. Users can now choose their preferred DataFrame library.

Motivation

Changes

Dependencies
- Remove `pandas` from required dependencies
- Add `pandas` and `polars` as optional dependencies
- Install with `pip install querychat[pandas]` or `pip install querychat[polars]`

API Changes (Breaking)
- `execute_query()`, `get_data()`, and `df()` now return narwhals DataFrames instead of pandas DataFrames
- Use `.to_native()` on returned DataFrames to get the underlying pandas/polars DataFrame

Internal Changes
- `_df_compat.py` module handles backend selection (prefers polars when available)
- `df_to_html()` generates HTML directly without pandas dependency
- `DataFrameSource` accepts pandas, polars, or narwhals DataFrames

Tests
- `test_df_compat.py` for the compatibility layer
- `test_dataframe_source.py` with comprehensive DataFrameSource tests

Migration Guide