Diagram.trace: descend through master-part boundaries (Merge-table upstream tracing) #1481

Description

@dimitri-yatsenko

Summary

Diagram.trace() (and therefore self.upstream) walks ancestor FK edges only. It does not descend from an ancestor Master into that Master's Parts. In the merge-table shape used by Spyglass:

Parent → Master.Part → Master → Child

trace(Child & key) reaches Master, but trace[Master.Part] and trace[Parent] raise DataJointError. The true upstream source (Parent) is unreachable through the merge point.

This matches the shipped spec (provenance.md §2, "Allowed table set": an ancestor's Part is included only when the Part itself lies on an FK path to the seed) and is now pinned by test_trace_stops_at_master_no_part_down_collection. However, the design comment on discussion #1232 described a symmetric down-collection ("when a Master is reached, add its Parts to the trace, and continue upward from the Parts' FK parents") that was never implemented. A correction has been posted there. This issue tracks building that capability.

Proposed behavior (opt-in or default — design question)

When the upward walk reaches a Master, additionally:

  1. Restrict the Master's Parts downward from the Master's restriction (the existing forward rules).
  2. Continue upward from the Parts' other FK parents (the upward rules), OR-merging into the trace.
  3. Multi-pass until stable (mirror of the downstream part_integrity="cascade" mechanics — this is its upward analog, cf. _propagate_part_to_master).
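
The steps above can be sketched at the table-set level with a toy graph. This is not DataJoint code: FK_PARENTS, PARTS, and trace_tables are hypothetical names for illustration, restrictions and OR-merging are left out entirely, and the edge directions assume that a Part references both its Master and Parent by FK (so the arrows in the shape above are read as provenance flow). The through_parts flag mirrors the opt-in name floated in this issue, and a worklist with a visited set stands in for "multi-pass until stable".

```python
from collections import deque

# Toy FK graph for the merge-table shape in this issue (table names only,
# no restrictions). Maps each table to the tables it references by FK.
# Assumption: Master.Part references both its Master and Parent.
FK_PARENTS = {
    "Parent": [],
    "Master": [],
    "Master.Part": ["Master", "Parent"],
    "Child": ["Master"],
}
# Maps each Master to its Part tables.
PARTS = {"Master": ["Master.Part"]}


def trace_tables(seed, through_parts=False):
    """Return the set of tables reached by an upward walk from `seed`.

    through_parts=False models the current ancestors-only walk.
    through_parts=True additionally descends from each reached Master
    into its Parts and keeps walking upward from the Parts' FK parents.
    """
    reached = {seed}
    queue = deque([seed])
    while queue:  # ends once no new table is added, i.e. the set is stable
        table = queue.popleft()
        neighbors = list(FK_PARENTS.get(table, []))
        if through_parts:
            neighbors += PARTS.get(table, [])
        for nxt in neighbors:
            if nxt not in reached:  # visited check guarantees termination
                reached.add(nxt)
                queue.append(nxt)
    return reached


current = trace_tables("Child")
proposed = trace_tables("Child", through_parts=True)
print(sorted(current))
print(sorted(proposed))
```

In this toy model the default walk stops at Master, while through_parts=True also reaches Master.Part and Parent. A real implementation would additionally have to carry restrictions through each step, which is where the OR-merge and dedup questions below come in.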

Design questions

  • Default-on (a Part is semantically an extension of its Master, so "what contributed" arguably includes contributions through Parts) vs. opt-in flag (trace(expr, through_parts=True)) to preserve current semantics.
  • OR-merge and termination through alias nodes; interaction with the (part, master)-pair dedup introduced for the downstream analog.
  • Effect on self.upstream's allowed-table set under strict_provenance — the merge pattern is exactly where downstream make()s need to read through the merge point.

Motivation

  • Spyglass Merge tables (discussion #1232, "Cascading Restrictions", @CBroz1): versioned pipelines route provenance through Master/Part merge points; upstream tracing that stops at the merge master cannot answer "which Parent produced this result."
  • Without this capability, strict-provenance mode makes such reads impossible rather than merely unergonomic, because the true parent is not in the allowed set.

Labels: enhancement
