Skip to content

Kotlin extractor: fix K2 regressions without expected churn#22076

Draft
andersfugmann wants to merge 6 commits into
mainfrom
andersfugmann/fix_kotlin1_tests
Draft

Kotlin extractor: fix K2 regressions without expected churn#22076
andersfugmann wants to merge 6 commits into
mainfrom
andersfugmann/fix_kotlin1_tests

Conversation

@andersfugmann

@andersfugmann andersfugmann commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR groups the Kotlin extractor fixes needed to keep the migrated Kotlin1 and Kotlin2 integration tests stable when compiled with Kotlin 2.4 and -language-version 2.0. The objective is parity with the previous extraction output: no loss of extracted information, and no expected-output churn that is not justified by a genuine behavioural difference between K1 and K2.

What changed

Extractor changes, in dependency order:

  1. Restore external file-class locations under K2. Under K2, top-level declarations from external binaries attach to IrExternalPackageFragment rather than to an IrClass file-class parent, which bypassed the normal class-source location path. Binary paths are now resolved from the container source (JvmPackagePartSource) so external file-class entities keep stable binary file locations.

  2. Anchor local variable locations to the identifier. K2 resolves some local-variable locations to a wider declaration span (covering type, annotations, and modifiers). The variable-name offset is now derived by scanning the declaration slice, restoring name-anchored locations, with a fallback to the previous provider when source offsets are unavailable.

  3. Reconcile Java binary signatures under K2. K2 represents binary Java symbols differently from K1 (VirtualFileBasedSourceElement rather than JavaSourceElement, with metadata often absent). ASM-based fallback inspection keeps callable matching, callable labels (primitive versus boxed), and external Java stubs stable.

  4. Roll Kotlin1 suites forward to language-version 2.0. With the extractor fixes in place, the migrated Kotlin1 tests run directly under -language-version 2.0. This also removes the obsolete CODEOWNERS entry for the retired java/ql/test-kotlin1/ path.

  5. Disambiguate binary overload probing. Fixes a DB-CHECK INVALID_KEY failure caused by ambiguous name-plus-arity matching when both primitive and boxed overloads exist. The probe now gathers all candidates and returns a concrete answer only when they agree, otherwise it returns null.

  6. Keep synthetic locations for unresolved file classes. Avoids forcing concrete external paths for stdlib-style unresolved container-source paths, preserving stable receiver-location output for multiple_files/method_accesses.

  7. Scope Object-method redeclaration recovery. Constrains the equals(Object) recovery to non-generic parents, skips it when a concrete sibling declaration already exists, and treats Kotlin binary classes (carrying kotlin.Metadata) as non-Java for redeclaration recovery.

Notes on the approach

Most of these are genuine K2 adaptations (1, 2, 3, 5). Two are conservative stability measures rather than behavioural improvements: the synthetic-location handling (6) preserves prior output for unresolved paths, and the Object-method scoping (7) keeps K1-era expectations intact. Both are candidates for a follow-up that aligns K1 and K2 behaviour uniformly and updates the expectations instead.

The standalone 2.4.0 extractor target builds cleanly under -Werror.

This addresses investigation needed for Kotlin1 tests noted in https://github.com/github/codeql-kotlin-team/issues/196

@andersfugmann andersfugmann force-pushed the andersfugmann/fix_kotlin1_tests branch 2 times, most recently from 831e87b to 6f3f2fc Compare June 30, 2026 07:11
Why this is needed:
- Under K2, top-level declarations from external binaries are attached directly
  to IrExternalPackageFragment rather than to an IrClass file-class parent.
- That bypassed the normal class-source location path, so some external file-class
  entities ended up without stable binary file locations.
- Missing/unstable locations caused drift in tests that depend on external file
  class member resolution and location facts.

What this changes:
- Resolve binary paths from IrMemberWithContainerSource (JvmPackagePartSource)
  via a dedicated getContainerSourceBinaryPath helper.
- In KotlinUsesExtractor, when extracting top-level external declarations,
  attach file-class location from container-source binary path when available.
- Track external file classes whose locations were emitted to avoid duplicate
  hasLocation facts.

This targets the K2 external file-class location gap (for example file_classes and
external-property-overloads parity).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@andersfugmann andersfugmann force-pushed the andersfugmann/fix_kotlin1_tests branch 4 times, most recently from 4303989 to 8157a86 Compare June 30, 2026 14:33
andersfugmann and others added 5 commits June 30, 2026 17:46
Why this is needed:
- With Kotlin 2.0 analysis, some local-variable locations resolve to a wider
  declaration span than before.
- The previous extractor logic used provider-based ranges that can cover type,
  annotations, and modifiers, which shifts expected variable location facts.
- This caused parity drift in tests that expect the location to point at the
  variable name token itself.

What this changes:
- Cache current source text per file during extraction.
- Derive variable-name offsets by scanning the declaration slice and locating
  the declared identifier token.
- Emit local-variable declaration/expr locations from that identifier span,
  with fallback to the previous provider when source offsets are unavailable.

This restores stable name-anchored variable locations under Kotlin 2.0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why this is needed:
- Under K2, binary Java symbols are represented differently from K1:
  JavaSourceElement metadata is often absent and sources are exposed through
  VirtualFileBasedSourceElement.
- Without recovery logic, callable matching can miss declared Java methods,
  callable labels can drift (primitive vs boxed reference types), and external
  Java declaration stubs can gain wildcard noise when Java signatures are not
  available.
- These differences produced Kotlin 2.0 parity drift in tests that rely on
  stable Java/Kotlin cross-extractor callable identity.

What this changes:
- Add K2-aware Java binary inspection helpers (ASM-based fallback) to detect
  declared methods and parameter/return reference-vs-primitive shape when
  JavaSourceElement metadata is unavailable.
- Recover Java callables more reliably in KotlinUsesExtractor, including a
  binary-class fallback path.
- Normalise callable labels and call result typing to boxed Java classes when
  K2 enhanced reference types appear as Kotlin primitives.
- Accept K2's `Any` form for Object.equals(Object) and keep binary declaration
  checks stable.
- Suppress default wildcard insertion for external Java declaration stubs when
  no Java callable metadata is available, preventing synthetic wildcard drift.

This commit restores Java interop parity for Kotlin 2.0 extraction paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why this is needed:
- The extractor compatibility fixes now preserve the information these Kotlin1-era
  tests were protecting, even when compiled with Kotlin 2.4 and
  `-language-version 2.0`.
- Keeping mixed legacy language-version wiring in individual tests is no longer
  necessary and obscures the intended steady-state execution mode.

What this changes:
- Update all affected Kotlin1 compatibility integration tests to run with
  `-language-version 2.0` directly.
- Keep the expected extraction signal aligned for extractor information output.
- Remove the obsolete CODEOWNERS entry for the retired `java/ql/test-kotlin1/`
  path.

This consolidates the language-version transition into a single test rollup
commit, as requested.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why this is needed:
- library-tests/exprs/DB-CHECK was failing with INVALID_KEY and INVALID_KEY_SET in params for kotlin.jvm.internal.Intrinsics.areEqual.
- The Java binary probing code matched methods by name plus arity and used the first match, which is ambiguous when both primitive and boxed overloads exist.
- Under that ambiguity, callable labels could be boxed while extracted params remained primitive (or vice versa), creating conflicting rows for the same key.

What changed:
- For both parameter and return-type probing, gather all matching overloads and compute classifier-vs-primitive from the full candidate set.
- Return a concrete answer only when all matches agree; return null when matches disagree.
- Apply the same unambiguous matching rule in both K1 metadata and K2 ASM fallback paths.

Effect:
- The boxing fallback now activates only when the Java binary evidence is deterministic, preventing callable-label collisions and restoring DB integrity in the affected Kotlin2 dataset check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why this is needed:
- library-tests/multiple_files/method_accesses.ql regressed because receiver class locations for external file-class members became concrete file paths.
- For stdlib-style unresolved container-source paths, forcing a concrete location changed stable output from synthetic unknown location to external path-based locations.

What changed:
- Added shouldUseConcreteExternalFileClassLocation to distinguish reliable concrete paths from unresolved placeholders.
- In external package-fragment parent handling, only write an external file-class location when the normalized path is concrete and stable.
- If no reliable path is available, keep prior synthetic behaviour by not forcing a concrete location.

Effect:
- Restores stable receiver-location output for method_accesses while preserving concrete locations when we have trustworthy binary-path information.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@andersfugmann andersfugmann force-pushed the andersfugmann/fix_kotlin1_tests branch from 8157a86 to b82ec0e Compare June 30, 2026 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant