fix: use removeprefix to decode DynamoDB namespace property keys#3602
Open
anxkhn wants to merge 1 commit into
Open
fix: use removeprefix to decode DynamoDB namespace property keys#3602anxkhn wants to merge 1 commit into
anxkhn wants to merge 1 commit into
Conversation
_remove_property_prefix used str.lstrip(PROPERTY_KEY_PREFIX), which treats "p." as a set of characters and strips every leading "p"/"." from the key instead of removing the literal "p." prefix. Namespace property keys that begin with those characters were therefore corrupted on read: "provider" became "rovider", "path" became "ath", and "ppp" became "". Existing tests did not catch this because their fixture keys avoid a leading "p" or ".". Switch to str.removeprefix, which strips only the literal prefix. The single caller already guards with startswith(PROPERTY_KEY_PREFIX), so the behavior for all valid keys is unchanged. Add a round-trip regression test covering keys that start with "p" and ".".
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Rationale for this change
DynamoDbCatalogstores each namespace property under a key prefixed withPROPERTY_KEY_PREFIX("p.") via_add_property_prefix, and decodes it on readwith
_remove_property_prefix:str.lstriptreats its argument as a set of characters to strip, not a literalprefix. With
PROPERTY_KEY_PREFIX == "p."it removes every leadingpand.from the decoded key, so any namespace property whose name starts with one of
those characters is corrupted when read back through
load_namespace_properties:providerroviderpathathpartitionartitionppp.hiddenhiddenupdate_namespace_propertiesreads the same decoded properties, so it operates onthe corrupted keys too. Property names that do not start with
por.(such ascomment,owner,location) round-trip correctly, which is why the existingtests, whose fixtures avoid those leading characters, never surfaced the bug.
The fix uses
str.removeprefix, which strips only the literal"p."prefix. Thesingle caller (
_get_namespace_properties) already guards each key withstartswith(PROPERTY_KEY_PREFIX), so behavior for every valid key is unchangedand only the previously corrupted keys are affected.
str.removeprefixisavailable in Python 3.9+ and this project requires
>=3.10.Are these changes tested?
Yes. Added
test_load_namespace_properties_with_prefix_char_keysintests/catalog/test_dynamodb.py, acreate_namespace/load_namespace_propertiesround trip (backed by
moto'smock_aws) over keysprovider,path,ppp,.hidden, andcomment. The test fails on the previouslstripimplementation(keys come back corrupted) and passes with
removeprefix.tests/catalog/test_dynamodb.pypasses (40 tests), andmake lint(ruff,ruff-format, mypy, and the other pre-commit hooks) is clean. The DynamoDB catalog
has no separate integration test path; behavior is covered by the mocked unit test.
Are there any user-facing changes?
Yes: a bug fix. Reading namespace properties from a DynamoDB-backed catalog now
returns property keys that begin with
por.correctly instead of truncatingtheir leading characters. No public API changes.