IceKhan13 (Contributor) commented Aug 13, 2025

Summary

Adds version support for DeltalakeTable.

Details

Versions are supported via VersionedTableProtocol. What a version means depends on the table type; for DeltalakeTable, a version is a combination of uri and schema. This PR implements versions for DeltalakeTable, as requested in the original issue.

Example

import pyarrow as pa

table = DeltalakeTable(
    name="my_delta_table",
    schema=pa.schema(
        [
            ("implant_id", pa.int64()),
            ("date", pa.string()),
            ("uniq", pa.string()),
            ("value", pa.int64()),
        ]
    ),
    uri=".../somepath/v1/",
    ...
)

new_schema = pa.schema(
    [
        ("implant_id", pa.int64()),
        ("uniq", pa.string()),
        ("value", pa.int64()),
    ]
)

# add a second version with its own uri and schema
table.add_version(version="v2", uri=".../somepath/v2/", schema=new_schema)

# list all versions
table.get_versions()
# ["v1", "v2"]

# switch back to the v1 table
table.change_version("v1")
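
For reviewers who want the shape of the abstraction without reading the diff, here is a rough, standalone sketch of what a protocol like VersionedTableProtocol could look like. It is not the code in this PR: the method names are taken from the example above, while `ToyVersionedTable`, the `initial_version` field, and the choice to label the constructor's table "v1" are assumptions made purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class VersionedTableProtocol(Protocol):
    """Hypothetical protocol shape; method names follow the example above."""

    def add_version(self, version: str, uri: str, schema: Any) -> None: ...

    def get_versions(self) -> list[str]: ...

    def change_version(self, version: str) -> None: ...


@dataclass
class ToyVersionedTable:
    """In-memory stand-in for a versioned table (illustrative only)."""

    name: str
    uri: str
    schema: Any
    # Assumption: the table passed to the constructor is registered as "v1".
    initial_version: str = "v1"
    _versions: dict[str, tuple[str, Any]] = field(
        default_factory=dict, init=False, repr=False
    )

    def __post_init__(self) -> None:
        # Register the initial (uri, schema) pair under the initial version.
        self._versions[self.initial_version] = (self.uri, self.schema)

    def add_version(self, version: str, uri: str, schema: Any) -> None:
        if version in self._versions:
            raise ValueError(f"version already exists: {version!r}")
        self._versions[version] = (uri, schema)

    def get_versions(self) -> list[str]:
        # Dicts preserve insertion order, so versions come back in the order added.
        return list(self._versions)

    def change_version(self, version: str) -> None:
        if version not in self._versions:
            raise KeyError(f"unknown version: {version!r}")
        # Point the table at the uri and schema registered for this version.
        self.uri, self.schema = self._versions[version]
```

In this sketch a version is just a named (uri, schema) pair, so switching versions only repoints the table; it does not touch the Delta transaction log.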

Closes #42

IceKhan13 (Contributor, Author) commented

@houqp requesting a review here as well. Thanks!

houqp requested review from PeterKeDer and asura-io on August 14, 2025, 17:15

houqp (Collaborator) commented Aug 18, 2025

This can be a bit confusing for Delta tables, because Delta tables themselves have a native concept of versions.

@IceKhan13 what's the use case you have in mind? Is this abstraction created to group multiple Delta tables that are logically related?

IceKhan13 (Contributor, Author) commented Aug 19, 2025

@houqp to be honest, I was just going through open issues and opened up PRs with possible solutions.

As I understood it, the ask in #42 was to support versions for tables in general. From the issue:

"if we make changes to the schema and/or the business logic pertaining to data underlying the table, we'd like a way to version the table such that we can access prior versions of the data with a different schema"

"Is this abstraction created to group multiple delta tables that are logically related?"

Yes, that is how #42 read to me.

IMO #42 might not even be needed. If one needs a table on any backend at a specific version, they can just create two separate tables.

Let me know whether I should close this as not needed, or whether #42 should be clarified a bit more.

Thanks!

Development

Successfully merging this pull request may close these issues.

Support accessing prior versions of a DeltalakeTable with an older schema
