chore: support addition between a timestamp and a timedelta#1369
chore: support addition between a timestamp and a timedelta#1369
Conversation
bigframes/core/compile/ibis_types.py
Outdated
|
|
||
| scalar_expr = bigframes_vendored.ibis.literal(literal) | ||
| if ibis_dtype: | ||
| if isinstance(literal, datetime.timedelta): |
There was a problem hiding this comment.
Should we allow pandas, numpy, and pyarrow scalars here, too?
| if isinstance(literal, datetime.timedelta): | |
| if isinstance(literal, (datetime.timedelta, numpy.timedelta64, pandas.Timedelta, pyarrow.DurationScalar)): |
There was a problem hiding this comment.
Sure! Pandas timedelta is a subclass of datetime.timedelta, but I can make it more explicit.
Added support for numpy.
I eventually decided not to implement pyarrow duration because:
- It causes more unrelated failures than I expected in Python 3.9 and 3.10 envs for some reasons.
- Pandas does not support adding pyarrow duration literals to series
Maybe we can add support for pyarrow in the future, but at this moment it may not be worth it
bigframes/core/utils.py
Outdated
| def timedelta_to_micros(td: typing.Union[pd.Timedelta, datetime.timedelta]) -> int: | ||
| if isinstance(td, pd.Timedelta): | ||
| # td.value returns total nanoseconds. | ||
| return td.value // 1000 |
There was a problem hiding this comment.
Looks like pandas.Timedelta has units: https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html
Please make this robust to various units.
There was a problem hiding this comment.
that is for the constructor arg, but as per the docs, "The .value attribute is always in ns."
There was a problem hiding this comment.
Right. Pandas always converts the input to values in nanoseconds, no matter what unit we use in the constructor.
bigframes/dtypes.py
Outdated
| return DATE_DTYPE | ||
| if issubclass(type, datetime.time): | ||
| return TIME_DTYPE | ||
| if issubclass(type, datetime.timedelta): |
There was a problem hiding this comment.
Likewise, do we need numpy, pandas, and pyarrow object detection here too?
There was a problem hiding this comment.
Done. Left the PyArrow out for this one.
146c690 to
c680948
Compare
c9a5021 to
b5b69cc
Compare
bigframes/core/compile/ibis_types.py
Outdated
| if isinstance( | ||
| literal, | ||
| (datetime.timedelta, pd.Timedelta, numpy.timedelta64), | ||
| ): | ||
| # numpy timedelta is compatible with Ibis, so we process them separately. | ||
| return bigframes_vendored.ibis.literal( | ||
| utils.timedelta_to_micros(literal), ibis_dtype |
There was a problem hiding this comment.
Why don't we just replace literal expressions (as well as column defs in leaves) in the tree when we replace all the ops?
There was a problem hiding this comment.
Good call. I moved the conversion to the rewrite module
There was a problem hiding this comment.
I am going to rename the module rewrite.operators to something like rewrite.timedelta_expression to better reflect its behavior, perhaps in another PR
| @scalar_op_compiler.register_binary_op(ops.timestamp_add_op) | ||
| def timestamp_add_op_impl(x: ibis_types.TimestampValue, y: ibis_types.IntegerValue): | ||
| return x + y.to_interval("us") |
There was a problem hiding this comment.
What does this syntax look like from this? Is it a clean TIMESTAMP_ADD(timestamp_col, INTERVAL duration MICROSECOND)?
There was a problem hiding this comment.
Yes https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions#timestamp_add
Though this has less to do with SQL syntax than the ease of wiring code. Ibis does support interval + timestamp and timestamp + interval, but that means I also need to check which side of the input is integer here.
If I can standardize the order of types in the rewrite module, then I don't need to check ibis type here :)
No description provided.