feat: Support table format: Iceberg, Delta, and Hudi#5650
protos/feast/core/DataSource.proto
string date_partition_column_format = 5;

// Table Format (e.g. iceberg, delta, etc)
string table_format = 6;
TODO, create TableFormat proto, consolidate with FileFormat proto
+1 on the inclusion of all 3 formats. Still, I think we might be able to better design the data-source side so that data source definitions don't tie the sources to specific offline stores. For example, right now I think we can have the best of both worlds if we instead add all these formats as separate, independent data sources (
query: The query to be executed in Spark.
path: The path to file data.
file_format: The format of the file data.
file_format: The underlying file format (parquet, avro, csv, json).
why not consolidate now?
+1
@franciscojavierarceo @tokoko Consolidating with FileFormat and adding new data sources could break backward compatibility, so I want to do it step by step.
That makes sense
@HaoXuAI Why would new data sources break backwards compatibility though? |
There will be some proto changes. I'm not 100% sure whether there will be API changes exposed to users, but I think that might be the case.
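To make the distinction in this thread concrete, here is a minimal Python sketch of a source definition that keeps `file_format` and `table_format` as separate, independent settings. It is not Feast's API: `SourceSpec`, `describe`, and the two format sets are hypothetical names used only for illustration, and the allowed values are taken from the docstring and PR title above. The optional `table_format` with a `None` default shows one way older definitions could keep working unchanged.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical value sets, taken from the docstring and PR title in this thread.
FILE_FORMATS = {"parquet", "avro", "csv", "json"}
TABLE_FORMATS = {"iceberg", "delta", "hudi"}


@dataclass(frozen=True)
class SourceSpec:
    """Hypothetical source definition; not a Feast class."""

    path: str
    file_format: str
    # Optional with a None default, so definitions written before the
    # field existed still construct without changes.
    table_format: Optional[str] = None

    def __post_init__(self) -> None:
        if self.file_format not in FILE_FORMATS:
            raise ValueError(f"unknown file_format: {self.file_format!r}")
        if self.table_format is not None and self.table_format not in TABLE_FORMATS:
            raise ValueError(f"unknown table_format: {self.table_format!r}")


def describe(spec: SourceSpec) -> str:
    """Return a short human-readable summary of the source."""
    if spec.table_format is None:
        return f"{spec.file_format} files at {spec.path}"
    return f"{spec.table_format} table ({spec.file_format} data files) at {spec.path}"


# A definition without table_format behaves as plain files.
legacy = SourceSpec(path="s3://bucket/events", file_format="parquet")
# The same definition with a table format layered on top of the file format.
iceberg = SourceSpec(path="s3://bucket/events", file_format="parquet", table_format="iceberg")

print(describe(legacy))
print(describe(iceberg))
```

Keeping the new field optional is only one possible design; the separate-data-source approach suggested earlier in the thread would look different.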
@franciscojavierarceo @ntkathole mind taking a look?
franciscojavierarceo
left a comment
@HaoXuAI I don't see us actually using or testing the Spark Table, Iceberg, or Hudi formats outside of our definitions. Can you add that?
Can you also add documentation that these formats are now supported?
Otherwise lgtm.
Gonna update to add the TableFormat proto in the next PR; after that I'll add the docs. I think the tests will need to be changed as well.
Signed-off-by: hao-xu5 <hxu44@apple.com>
@franciscojavierarceo mind taking another look?
* add support for table format such as Iceberg, Delta, Hudi etc.
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* linting
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* linting
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* add tests
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* fix tests
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* fix tests
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* linting
  Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
* add tableformat proto
  Signed-off-by: hao-xu5 <hxu44@apple.com>
* update
  Signed-off-by: hao-xu5 <hxu44@apple.com>
* update doc
  Signed-off-by: hao-xu5 <hxu44@apple.com>
* fix linting
  Signed-off-by: hao-xu5 <hxu44@apple.com>
* fix test
  Signed-off-by: hao-xu5 <hxu44@apple.com>

---------

Signed-off-by: HaoXuAI <sduxuhao@gmail.com>
Signed-off-by: hao-xu5 <hxu44@apple.com>
Co-authored-by: hao-xu5 <hxu44@apple.com>
# [0.57.0](v0.56.0...v0.57.0) (2025-11-13)

### Bug Fixes

* Improve trino to feast type mapping with (real,varchar,timestamp,decimal) ([#5691](#5691)) ([f855ad2](f855ad2))
* Materialize API - ODFV views not looked-up (thinks views non existant) - crashes materialize ([#5716](#5716)) ([1b050b3](1b050b3))
* Support historical feature retrieval with start_date/end_date in RemoteOfflineStore ([#5703](#5703)) ([ad32756](ad32756))
* Thread safe Clickhouse offline store ([#5710](#5710)) ([5f446ed](5f446ed))

### Features

* Add annotations to cronjob CRDs ([#5701](#5701)) ([be6e6c2](be6e6c2))
* Add batch commit mode for MySQL OnlineStore ([#5699](#5699)) ([3cfe4eb](3cfe4eb))
* Add possibility to materialize only latest values, to increase performance ([#5713](#5713)) ([8d77b72](8d77b72))
* Support table format: Iceberg, Delta, and Hudi ([#5650](#5650)) ([2915ad1](2915ad1))
What this PR does / why we need it:
examples:
Which issue(s) this PR fixes:
Misc