feat(types): scope-based receiver-type call resolution for all 13 languages (SCIP base case)#10
Merged
Conversation
Resolve method-call targets by the receiver's locally-known TYPE instead of
name-string only, so x.M() on a common method name resolves to the exact method
(the SCIP base case), not an arbitrary same-named symbol.
Mechanism (write-path only; reads stay lock-free RCU):
- Per-function local type env {name -> type}, built syntactically in the Go
extractor from the receiver, typed params, `var x T`, and `x := T{}/&T{}`.
Cleared per function; closures inherit the enclosing env.
- A method call `recv.M()` whose receiver type is known is emitted as a
receiver-type-qualified ref `Type.M`.
- resolve_reference_target: a dotted `Type.M` resolves to the method named M
whose receiver/owning type is Type (Go receiver parsed from signature; class
langs matched via scope_chain — ready for later phases). Unknown/dynamic
receiver falls back to the existing name-based path (degrades to a candidate,
honest — same as gopls/SCIP for interface dispatch).
Verified:
- Controlled 2-type corpus: A.Do->a.helpA resolves to A.helpA, B.Do->b.helpB to
B.helpB (no same-name collision); callers correct.
- chi: param-typed receivers resolve (r *http.Request -> r.Context() ->
Request.Context); field-access chains (mx.pool.Get) degrade to bare name.
- Full unit 1692/1692; integration 128/128; no golden regen (synthetic golden
corpus has no typed method calls).
Design + per-language rollout: docs/plans/2026-06-17-scope-type-resolution.md.
Phases 2-4 (Java/C#/TS, Python/Rust, JS/C++/Kotlin/PHP/Ruby/Zig) follow this
template per language.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend the SCIP-base-case type resolution to Python, and fix the same method-call caller gap Go had: process_python_reference tagged a method call's name as the un-resolvable "obj.M" Call + a Usage on "M", so Python methods had no callers. Now tag the attribute (method name) as the Call, qualified to "Type.M" when the receiver type is known. Local type env (UNAMBIGUOUS sources only — `x = Foo()` is skipped because constructor vs factory call is syntactically identical in Python): - self / cls -> enclosing class (via enclosing_class_name() over the scope stack; resolver matches the class through scope_chain). - annotated params `def m(self, x: T)` and annotated assignments `x: T`. py_bare_type strips quotes (string annotations), subscripts (List[Foo]->List), and module qualifiers. Verified: - Controlled 2-class corpus: A.do/self.helpA() -> A.helpA, B.do -> B.helpB (no same-name collision). - Full unit 1692/1692; MCP goldens 14/14 clean (earlier batch failures were the pre-existing MCP-readiness flake under load — 73ms pre-index responses — and reproduce on the Go phase too; get_context passes at 5082ms unloaded). Reused for all class-based languages: enclosing_class_name() + scope_chain receiver matching. Next: TS/JS, Java, C#, Rust, C++, Kotlin, PHP, Ruby, Zig. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend SCIP-base-case type resolution to JavaScript/TypeScript and fix the same method-call caller gap: a call obj.M() tagged the un-resolvable "obj.M" member_expression as the Call (method name M only got a Usage), so JS/TS methods had no callers. Now tag the PROPERTY (M) as the Call, qualified to "Type.M" when the receiver type is known. Local type env: - this -> enclosing class (enclosing_class_name()). - TS-annotated params `(x: T)` and variable annotations `const x: T`. - `new T()` constructor inference (`const x = new T()` -> x: T) — unambiguous in JS/TS unlike Python. js_bare_type strips ": ", generics (Foo<Bar>->Foo), array suffix, qualifier. Verified: - Controlled TS 2-class corpus: A.do/this.helpA()->A.helpA, B.do->B.helpB; run(): const a:A=new A(); a.do()->A.do and const b=new B(); b.do()->B.do (annotation + new() inference both resolve). - Full unit 1692/1692; trpc TS real-project 4/4. (MCP batch goldens flake on the pre-existing MCP-readiness race under load — 73ms pre-index responses; pass at ~5s when given time; multi-lang corpus has no JS/TS file so this change cannot affect them.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e case)
Phase 4 of scope-type resolution. C/C++ method calls now resolve by the
receiver's TYPE, not just the method name, so same-named methods across
different classes (run/ServeHTTP/Close/…) attribute to the correct symbol.
- process_scope_node: a named struct_specifier/class_specifier/union_specifier
*with a body* opens a Class scope named after the aggregate. This gives
member methods an owning-class entry in their scope_chain, which the resolver
matches against a scope-typed `T.m` ref. Bodyless forms (forward decls,
`struct A a;` uses) are excluded so they don't nest the surrounding scope.
- process_reference_node (cpp branch): builds a per-function local var->type env
(this -> enclosing class; `T x;` / `T x = ...` decls) and emits field-call
refs as receiver-type-qualified `Type.m` when the receiver type is known.
Unknown receivers fall back to the bare name (today's behavior).
- Relocated go/js/py bare-type helpers to the top anon namespace so the cpp
branch can use go_bare_type (was defined after the use site).
Verified on a controlled corpus: go() -> {A.run -> A.helpA, B.run -> B.helpB}
resolves both edges distinctly (previously both collapsed onto A.run).
Added ReferenceTrackerTest.ResolvesByReceiverTypeScope. Full unit suite
1693/1693 green; no regressions from the new C/C++ class scopes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e remaining 7 languages
Java, C#, Rust, PHP, Kotlin, Ruby, and Zig previously emitted ZERO call
references — they had no call graph at all, so there was nothing to
type-resolve. This adds, in the single extraction pass, both the call-reference
extraction and the SCIP-base-case receiver-type env for each:
- process_<lang>_reference: emit method/function calls as ReferenceType::Call,
tagging the method-name node (not the un-resolvable receiver.method selector),
and qualify to "Type.method" when the receiver's type is locally known
(this/self/$this -> enclosing class; typed params; `T x`/`new T()`/`T::new()`/
`T{}`/`T.new` locals). Shared qualify_and_push() helper.
- Class-scope prerequisites so the resolver can match an owning type:
Rust impl_item/struct_item, Zig `const A = struct{…}`. (C/C++ landed earlier.)
- Kotlin symbol extraction was entirely broken: the fieldless tree-sitter-kotlin
grammar has no `name` field, so extract_function/extract_class/process_scope_node
produced zero symbols. Added first_named_child_typed() fallback
(simple_identifier / type_identifier) — Kotlin now indexes and resolves.
Verified per language on controlled corpora: go() resolves a.run()/b.run() to the
distinct run() of each class (previously collapsed onto the first same-named
symbol). Added ScopeTypeResolution.* (7 langs). Full unit suite 1700/1700.
Known base-case limits documented in the design doc: Ruby bare no-paren calls
(parse as identifier, not call) aren't edges; Kotlin/Zig constructor calls show
as a bare Call on the type. Unknown receivers degrade to the bare name, never a
fabricated edge.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e env `A* a = new A();` is init_declarator > pointer_declarator > identifier, so the prior identifier-only unwrap never recorded `a:A` and `a->run()` stayed unqualified — the common C++ receiver shape. Peel init/pointer/reference/array declarators down to the identifier; the `*`/`&` live on the declarator, not the type token, so the recorded type stays "A". Added ScopeTypeResolution.CppQualifiesPointerAndValueReceivers. 1701/1701. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Call-graph method calls now resolve by the receiver's type, not just the method-name string. Same-named methods across classes (
run/ServeHTTP/Close/String…) attribute to the correct symbol instead of collapsing onto the first match. SCIP base case — no type checker, no generics instantiation, no flow analysis.Covers all 13 languages. Stacked on #9 (base branch
fix-mcp-getcontext-name); retarget tomainonce #9 merges. The 5 type-resolution commits:af9f738) — added Class scope for named struct/class/union,this/typed-local env, qualified emission.635b746) — Java, C#, Rust, PHP, Kotlin, Ruby, Zig.The key finding
7 of 13 languages (Java, C#, Rust, PHP, Kotlin, Ruby, Zig) emitted zero call references — they had no call graph at all, so there was nothing to type-resolve. This PR builds the call graph and the receiver-type env for each, in the single extraction pass. Kotlin was worse: its fieldless tree-sitter grammar broke symbol extraction entirely (zero symbols) — fixed with a fieldless-name fallback.
How it works (write-path only; reads stay lock-free RCU)
local_var_types_, per function):this/self/$this→ enclosing class; typed params;T x/new T()/T::new()/T{}/T.newlocals.recv.m()whose receiver type is known emits a ref namedType.m(tagging the method-name node, not the un-resolvablerecv.mselector).resolve_reference_target): splits the dotted name, picks the candidate whose owning type matches viasymbol_matches_receiver_type— Go parses the receiver from the signature; class-based languages match the owning type inscope_chain. Unknown/dynamic receiver → bare-name fallback. Never a fabricated edge.Per-language logic isolated in
process_<lang>_reference. Class-scope prerequisites added where missing so the resolver can match an owning type: C/C++ specifiers, Rustimpl_item/struct_item, Zigconst A = struct{}.Honest base-case limits (documented in the design doc)
help_a) parses asidentifier, notcall, so it is not emitted as an edge. Receiver calls (a.run,self.help_a) andT.new-typed locals resolve.val a = A()/const a = A{}constructor calls emit as a bare Call on the type name (shows construction; resolves to the type).Verification
go()resolvesa.run()/b.run()to the distinctrunof each class (previously both collapsed onto the first same-named symbol).ScopeTypeResolution.*(7 langs, extraction-level qualified-ref assertions) +ReferenceTrackerTest.ResolvesByReceiverTypeScope(resolver-level). Full suite 1700/1700.🤖 Generated with Claude Code