feat: support bytes type in remote_function#761
Conversation
| "Int64": int, | ||
| "Float64": float, | ||
| "string": str, | ||
| "binary[pyarrow]": base64.b64decode, |
There was a problem hiding this comment.
This dictionary worries me. It seems pretty brittle to rely on the string representations of pandas dtypes. We should revisit this in b/345222844
chelsea-lin
left a comment
There was a problem hiding this comment.
LGTM overall with some minor comments.
|
|
||
|
|
||
| @pytest.mark.flaky(retries=2, delay=120) | ||
| def test_series_map(session_with_bq_connection, scalars_dfs): |
There was a problem hiding this comment.
do we still need the old test_series_map test?
There was a problem hiding this comment.
I think test_series_map_bytes should be sufficent. We have other tests that check integer input and output with remote_function.
bigframes/dataframe.py
Outdated
| # in the remote function | ||
| # NOTE: Keep in sync with the value converters used in the gcf code | ||
| # generated in generate_cloud_function_main_code in remote_function.py | ||
| # generated in in remote_function_template.py |
There was a problem hiding this comment.
in generate_cloud_function_main_code? or just remove extra in here?
| def_, | ||
| cf_name, | ||
| *, | ||
| input_types: Tuple[str], |
There was a problem hiding this comment.
Do input_types and output_types need default values, when they are defined after "*"?
There was a problem hiding this comment.
No, the * means they are keyword-only. Without a default value they are required.
| def_, | ||
| directory, | ||
| *, | ||
| input_types: Tuple[str], |
There was a problem hiding this comment.
Similar questions: whether they need default values or not.
|
I will update the validation to check the arrow dtype as well in the case of arrow values. |
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
🦕