deps: remove scikit-learn and sqlalchemy as required dependencies#1296
deps: remove scikit-learn and sqlalchemy as required dependencies#1296Genesis929 merged 25 commits intomainfrom
Conversation
bigframes/ml/metrics/_metrics.py
Outdated
| if len(x_pandas) < 2: | ||
| raise ValueError( | ||
| f"At least 2 points are needed to compute area under curve, but x.shape = {len(x_pandas)}" | ||
| ) | ||
|
|
||
| if x_pandas.is_monotonic_decreasing: | ||
| d = -1 | ||
| elif x_pandas.is_monotonic_increasing: | ||
| d = 1 | ||
| else: | ||
| raise ValueError(f"x is neither increasing nor decreasing : {x_pandas}.") | ||
|
|
||
| if hasattr(np, "trapezoid"): | ||
| # new in numpy 2.0 | ||
| return d * np.trapezoid(y_pandas, x_pandas) | ||
| # np.trapz has been deprecated in 2.0 | ||
| return d * np.trapz(y_pandas, x_pandas) # type: ignore |
There was a problem hiding this comment.
If this code is copied from sklearn, we should put it in the third_party/bigframes_vendored directory.
There was a problem hiding this comment.
This is a simplified version from sklearn logic, because we need it to work for pandas series only. In this case should we put it in third-party?
There was a problem hiding this comment.
Yes. If it originally came from somewhere else, we need to show attribution for the copyright. Even if we modified it.
There was a problem hiding this comment.
Moved to third party
noxfile.py
Outdated
| session.install( | ||
| "scikit-learn>=1.2.2", | ||
| ) |
There was a problem hiding this comment.
Do not install optional dependencies in unit_noextras. The intention of such test sessions is to make sure that dependencies that are marked as optional are truly optional. This risks making a forced dependency on this package.
Also, prefer removing the package from other sessions as well.
There was a problem hiding this comment.
Updated, now test skipped is sklearn not installed.
For docs and docfx session, it's required to have sklearn, so changed the way of installation instead of remove it.
noxfile.py
Outdated
| session.install( | ||
| "scikit-learn>=1.2.2", | ||
| ) |
There was a problem hiding this comment.
If there is optional functionality that needs scikit-learn, please use the "extras" section in setup.py. Don't install packages this way.
|
This is not a |
tswast
left a comment
There was a problem hiding this comment.
Please do not merge as is. We are opening our customers up to a big risk of having a broken installation by installing scikit-learn in so many of our test sessions.
a14a66a to
207a6e3
Compare
da02654 to
5a221ec
Compare
tswast
left a comment
There was a problem hiding this comment.
You did not address my feedback. You just changed how you install scikit-learn. You're still installing scikit-learn in way too many of our test sessions.
noxfile.py
Outdated
| "3.9": ["scikit-learn"], | ||
| "3.10": ["scikit-learn"], | ||
| "3.11": ["scikit-learn"], | ||
| "3.12": ["polars", "scikit-learn"], | ||
| "3.13": ["scikit-learn"], |
There was a problem hiding this comment.
No.
We can't install it in every session. We need to know that bigframes works without this package!
| "3.9": ["scikit-learn"], | |
| "3.10": ["scikit-learn"], | |
| "3.11": ["scikit-learn"], | |
| "3.12": ["polars", "scikit-learn"], | |
| "3.13": ["scikit-learn"], | |
| "3.12": ["polars", "scikit-learn"], |
There was a problem hiding this comment.
Updated, but I assume in this case unit noextras sessions runs the same thing for 3.13?
noxfile.py
Outdated
| SYSTEM_TEST_LOCAL_DEPENDENCIES: List[str] = [] | ||
| SYSTEM_TEST_DEPENDENCIES: List[str] = [] | ||
| SYSTEM_TEST_EXTRAS: List[str] = ["tests"] | ||
| SYSTEM_TEST_EXTRAS: List[str] = ["tests", "scikit-learn"] |
There was a problem hiding this comment.
No.
We can't install it in every session. We need to know that bigframes works without this package! Use SYSTEM_TEST_EXTRAS_BY_PYTHON, instead.
| """Build the docfx yaml files for this library.""" | ||
|
|
||
| session.install("-e", ".") | ||
| session.install("-e", ".[scikit-learn]") |
There was a problem hiding this comment.
I'm very concerned if we need to install scikit-learn just to build the docs. Please remove.
| session.install("-e", ".[scikit-learn]") | |
| session.install("-e", ".") |
There was a problem hiding this comment.
Not sure if it's possible to remove this as we are calling publish_api_coverage.py, and one of the things it do is go through sklearn APIs and generate a coverage report.
| """Build the docs for this library.""" | ||
|
|
||
| session.install("-e", ".") | ||
| session.install("-e", ".[scikit-learn]") |
There was a problem hiding this comment.
I'm very concerned if we need to install scikit-learn just to build the docs. Please remove.
| session.install("-e", ".[scikit-learn]") | |
| session.install("-e", ".") |
There was a problem hiding this comment.
Not sure if it's possible to remove this as we are calling publish_api_coverage.py, and one of the things it do is go through sklearn APIs and generate a coverage report.
There was a problem hiding this comment.
Good point. Thanks for clarifying.
Co-authored-by: Tim Sweña (Swast) <swast@google.com>
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕