Ref api by YishaiGlasner · Pull Request #3191 · Sefaria/Sefaria-Project

YishaiGlasner · 2026-03-25T12:27:32Z

Summary

Add a new /api/ref/<tref> endpoint that returns structured metadata for any Sefaria Ref
Provide consistent navigation and structure metadata across node types (JaggedArrayNode, SchemaNode, Dictionary, Sheets, etc.)
Avoid redundant MongoDB queries by ensuring a single vstate fetch per request for standard text refs (i.e. not virtual nodes)
Fix prev_segment_ref / next_segment_ref to correctly handle virtual nodes (e.g. Siddur)
Add optional state_ja parameter to avoid redundant DB calls when state is already available
Add a pymongo QueryCounter listener for asserting query counts in tests
Add comprehensive test coverage and OpenAPI documentation

API Details

New endpoint: GET /api/ref/<tref>

Returns a JSON object with:

is_ref — whether the input resolves to a valid ref (returns {is_ref: false} for invalid input)
normalized, hebrew, url_ref — normalized representations
index_title, node_type — index and node metadata
depth, address_types, section_names — structure info (for JaggedArrayNode / DictionaryEntryNode)
start_indexes, start_labels, end_indexes, end_labels — section position
navigation_refs — contextual navigation:
- lineage_refs_top_down — ancestor refs from root to parent
- first_available_section_ref — first section with content
- first_subref / last_subref — child navigation (non-segment, non-range)
- prev_section_ref / next_section_ref — section-level navigation
- prev_segment_ref / next_segment_ref — segment-level navigation
children — child node titles (for SchemaNode)
default_child_node — default child metadata when applicable
sheet_id, lexicon_name, headword — type-specific fields
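
A minimal sketch of how a client might read this response, assuming the JSON has already been fetched and decoded into a Python dict. The helper name `describe_ref` and the sample payloads are hypothetical and the values are illustrative only; the field names are the ones listed above.

```python
from typing import Any, Dict


def describe_ref(payload: Dict[str, Any]) -> str:
    """Summarize a decoded /api/ref/<tref> response (hypothetical helper)."""
    # Invalid input is reported as {"is_ref": false} rather than as an error.
    if not payload.get("is_ref"):
        return "not a ref"
    nav = payload.get("navigation_refs", {})
    # next_section_ref may be omitted (not applicable) or null (no next section).
    next_section = nav.get("next_section_ref")
    suffix = f", next section: {next_section}" if next_section else ""
    return f"{payload['normalized']} ({payload['node_type']}){suffix}"


# Illustrative payloads, not captured from a real server.
valid = {
    "is_ref": True,
    "normalized": "Genesis 1",
    "node_type": "JaggedArrayNode",
    "navigation_refs": {"prev_section_ref": None, "next_section_ref": "Genesis 2"},
}
print(describe_ref(valid))
print(describe_ref({"is_ref": False}))
```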

Considerations

Navigation scope
prev_* and next_* are only defined for section-level and segment-level refs.
Navigation at higher levels is intentionally not exposed to avoid ambiguity. Consumers can traverse upward (via lineage_refs_top_down) and derive such relationships if needed.
Field presence
Fields that are not applicable to a given ref type are omitted.
Fields that are applicable but have no value (e.g. no previous or next ref exists) are returned as null.
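
The omitted-versus-null distinction can be handled on the consumer side roughly as follows. This is a sketch under assumptions: the helper names `nav_status` and `parent_ref` are hypothetical, and the sample `navigation_refs` object is illustrative rather than real server output.

```python
from typing import Any, Dict, Optional


def nav_status(nav: Dict[str, Any], field: str) -> str:
    """Tell apart an omitted field, a null field, and a populated field."""
    if field not in nav:
        return "not applicable"   # field omitted for this ref type
    if nav[field] is None:
        return "none exists"      # applicable, but there is no such ref
    return nav[field]


def parent_ref(nav: Dict[str, Any]) -> Optional[str]:
    """Derive the immediate parent from the top-down lineage (last entry)."""
    lineage = nav.get("lineage_refs_top_down") or []
    return lineage[-1] if lineage else None


# Illustrative navigation_refs for a first segment (assumed shape).
nav = {
    "lineage_refs_top_down": ["Genesis", "Genesis 1"],
    "prev_segment_ref": None,
    "next_segment_ref": "Genesis 1:2",
}
print(nav_status(nav, "prev_segment_ref"))  # applicable but null
print(nav_status(nav, "first_subref"))      # omitted
print(parent_ref(nav))
```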

Changes in Ref

Fix prev_segment_ref and next_segment_ref to support DictionaryEntryNode
Add optional state_ja parameter to selected methods (already supported in others) to improve performance

pymongo listener

Adds QueryCounter, a pymongo CommandListener used in tests to:

Count MongoDB queries per request
Record query tracebacks for debugging
Tests reset the counter before each API call and assert on QueryCounter.count. On failure, full query tracebacks are printed to help identify unnecessary database hits.
The listener is only registered in test environments (sys._called_from_test), so there is zero production overhead.
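
A rough sketch of what such a listener can look like, not the PR's actual implementation. The class name `QueryCounterSketch`, the default command set, and `register_for_tests` are assumptions made for illustration; `pymongo.monitoring.CommandListener` and `monitoring.register` are the real pymongo hooks. The import is guarded only so the sketch can be exercised without pymongo installed.

```python
import sys
import traceback
from types import SimpleNamespace
from typing import Iterable, List

try:
    from pymongo import monitoring
    _ListenerBase = monitoring.CommandListener
except ImportError:  # allows running the sketch without pymongo
    monitoring = None
    _ListenerBase = object


class QueryCounterSketch(_ListenerBase):
    """Counts selected MongoDB commands and records where they were issued."""

    def __init__(self, commands: Iterable[str] = ("find", "aggregate")):
        self.commands = set(commands)  # which command names count (assumed default)
        self.count = 0
        self.tracebacks: List[str] = []

    def reset(self) -> None:
        self.count = 0
        self.tracebacks.clear()

    def started(self, event) -> None:
        # pymongo passes a CommandStartedEvent for every command it sends.
        if event.command_name in self.commands:
            self.count += 1
            self.tracebacks.append("".join(traceback.format_stack()))

    def succeeded(self, event) -> None:
        pass

    def failed(self, event) -> None:
        pass


def register_for_tests(listener) -> bool:
    """Register globally only under the test flag; must run before MongoClient is created."""
    if monitoring is None or not getattr(sys, "_called_from_test", False):
        return False
    monitoring.register(listener)
    return True


# Simulated events stand in for a real database round trip.
counter = QueryCounterSketch()
counter.reset()
counter.started(SimpleNamespace(command_name="find"))
counter.started(SimpleNamespace(command_name="hello"))  # handshake, not counted
print(counter.count)
```

In a test, the pattern described above would be `counter.reset()`, perform the API call, then assert on `counter.count` and print `counter.tracebacks` on failure.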

Note on tests

api/tests.py is currently not part of the CI suite (historical decision).
All new tests were added there and can be run locally.

Copilot

Pull request overview

Adds a new GET /api/ref/<tref> API endpoint to validate and introspect Sefaria refs, returning structured node/structure metadata and navigation refs, with accompanying OpenAPI documentation and tests. The PR also updates core Ref navigation helpers to better support virtual nodes and to reduce redundant DB work by allowing callers to pass a pre-fetched VersionState.

Changes:

Add RefView (/api/ref/<tref>) returning normalized/hebrew/url forms, node metadata, structure fields, and navigation refs.
Introduce a pymongo QueryCounter listener (test-only) to assert Mongo command counts in API tests.
Extend/refine Ref navigation/state helpers (prev_segment_ref, next_segment_ref, first_available_section_ref, get_state_ja, is_empty) to accept an optional vstate.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `sefaria/urls_shared.py` | Routes the new `/api/ref/<tref>` endpoint to `RefView`. |
| `api/views.py` | Implements `RefView` response construction and navigation metadata. |
| `sefaria/model/text.py` | Updates `Ref` navigation + state access to support `vstate` and virtual-node behavior. |
| `sefaria/system/database.py` | Adds `QueryCounter` and registers it as a pymongo listener in test environments. |
| `api/tests.py` | Adds comprehensive tests for the new endpoint + query-count assertions. |
| `docs/openAPI.json` | Documents `/api/ref/{tref}` and the `RefJSON` response schema. |

Copilot · 2026-03-25T12:32:51Z

sefaria/model/text.py

-        # return db.texts.find(self.condition_query(), {"_id": 1}).count() == 0
+        if vstate and not self.index_node.is_virtual:
+            state_ja = self.get_state_ja(vstate=vstate)
+            return state_ja.sub_array_length([i - 1 for i in self.sections]) in (0, None)


is_empty()'s new vstate fast-path is incorrect for many refs. VersionState.state_node(...).ja() returns a JaggedIntArray whose leaf values are ints; calling sub_array_length() after indexing down to a leaf hits the TypeError path and returns 0, which makes segment-level refs (and other fully-specified refs) appear empty even when text exists. Use a content check that works at arbitrary depth (e.g., state_ja.subarray_with_ref(self).is_empty() or get_element() for segment-level) instead of sub_array_length(self.sections).

Suggested change

-            return state_ja.sub_array_length([i - 1 for i in self.sections]) in (0, None)
+            subarray = state_ja.subarray_with_ref(self)
+            return subarray.is_empty()
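
The failure mode described in this comment can be reproduced with a toy model. This is not Sefaria's JaggedArray code: `sub_array_length` and `has_content` below are hypothetical stand-ins that only mimic the behavior the comment describes, using nested lists whose leaves are integer counts.

```python
from typing import List, Optional, Sequence, Union

Nested = Union[int, List["Nested"]]


def sub_array_length(ja: Nested, indexes: Sequence[int]) -> Optional[int]:
    """Toy length lookup: None if out of range, 0 if the target has no len()."""
    node = ja
    for i in indexes:
        try:
            node = node[i]
        except (IndexError, TypeError):
            return None
    try:
        return len(node)
    except TypeError:  # node is an int leaf, so it has no length
        return 0


def has_content(ja: Nested, indexes: Sequence[int]) -> bool:
    """Depth-agnostic check: any positive count at or below the addressed node."""
    node = ja
    for i in indexes:
        try:
            node = node[i]
        except (IndexError, TypeError):
            return False
    if isinstance(node, int):
        return node > 0
    return any(has_content(child, []) for child in node)


# Two sections; leaf ints are per-segment counts (illustrative data).
state = [[1, 1, 0], [1]]

# Section level works, but a segment-level lookup reports 0 although text exists.
print(sub_array_length(state, [0]), sub_array_length(state, [0, 0]))
print(has_content(state, [0, 0]), has_content(state, [0, 2]))
```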

Copilot · 2026-03-25T12:32:52Z

sefaria/model/text.py

            if not r:
                return None
+            if self.index_node.is_virtual:
+                return r.all_subrefs()[0]


In prev_segment_ref() for virtual nodes, when the current ref is the first segment of a section, the previous segment should be the last segment of the previous section. Returning r.all_subrefs()[0] returns the first segment instead (and can also raise IndexError if the previous section has no subrefs). Adjust this to return the last available subref (and handle empty subref lists).

Suggested change

-                return r.all_subrefs()[0]
+                subrefs = r.all_subrefs()
+                if not subrefs:
+                    # No subrefs available in the previous section; fall back to the section ref itself
+                    return r
+                return subrefs[-1]
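
The difference between the original and suggested behavior can be shown with plain strings standing in for refs. The function name `prev_segment_from_section` is hypothetical; it only mirrors the selection logic of the suggested change.

```python
from typing import List


def prev_segment_from_section(section_ref: str, subrefs: List[str]) -> str:
    """Pick the segment that precedes the first segment of the following section."""
    if not subrefs:
        # Nothing to pick from: fall back to the section ref itself.
        return section_ref
    # The last segment of the previous section, not the first.
    return subrefs[-1]


segments = ["Genesis 1:1", "Genesis 1:2", "Genesis 1:3"]
print(prev_segment_from_section("Genesis 1", segments))
print(prev_segment_from_section("Genesis 1", []))
```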

Changing to -1.
The assumption that the previous section has segments is also made when the node is not virtual.

stevekaplan123 · 2026-03-29T14:06:27Z

docs/openAPI.json

+            "type": "object",
+            "properties": {
+              "lineage_refs_top_down": {
+                "description": "An array of references representing the hierarchical path from the book level down to the immediate parent. Empty array if at book level.",


I'm not crazy about the name 'lineage_refs_top_down'. What about 'ancestor_refs'? When do you foresee it being used by API users?

It's essentially the shortest path to the root so that fact could also factor in the name: 'shortest_path_to_root', but 'ancestor_refs' might be sufficient

I'm not crazy about it either, but I didn't want to use 'ancestor', because we already use that word for nodes: for example, Ramban on Genesis is the father of Ramban on Genesis, Introduction. So I'm not sure we want to call Genesis 1 the father of Genesis 1:5. What do you think?

stevekaplan123 · 2026-03-29T14:13:48Z

sefaria/model/text.py

        r = self.ending_ref()
        if not r.is_segment_level():
            return r
+        if self.index_node.is_virtual:


Instead of special casing for VirtualNodes, I'm wondering if you could take advantage of the fact that VirtualNodes and JaggedArrayNodes inherit from TitledTreeNodes and create a function in TitledTreeNodes that handles the base case. Then you can override that function in VirtualNode. That way, here you can just call this function on self.index_node rather than having a special case both here and in prev_segment_ref.

I don't think the node is the right place for a function that finds the next segment. The place to find it is the vstate.

That may be true, and we special case VirtualNodes a lot it seems. And in this function half of it is dealing with the special case of VirtualNodes. What do you mean the place to find it is the vstate? This is the Ref class.

I mean that we find the next segment by checking the vstate, while the node is just the jagged array. It can tell you its depth, and maybe you could expect it to tell you the next jagged array node, but it doesn't know what populates it. (In some cases we do have a limit at the top level, like 50 chapters in Genesis, but not on the verses within any chapter.) So I don't think the node should have a method for finding the next segment.
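
The point made in this reply, that the next segment depends on what populates the array rather than on the node's shape, can be sketched with a toy. It assumes a depth-2 structure with per-segment counts; `next_segment` is a hypothetical function, not `Ref.next_segment_ref`.

```python
from typing import List, Optional, Tuple


def next_segment(counts: List[List[int]], sec: int, seg: int) -> Optional[Tuple[int, int]]:
    """Return the 0-based (section, segment) after (sec, seg) that has content."""
    s, g = sec, seg + 1
    while s < len(counts):
        while g < len(counts[s]):
            if counts[s][g] > 0:
                return (s, g)
            g += 1
        # Section exhausted (or empty): continue from the start of the next one.
        s, g = s + 1, 0
    return None


# Illustrative state: section 1 is empty, and section 2 starts with an empty segment.
counts = [[1, 1], [], [0, 1]]
print(next_segment(counts, 0, 0))
print(next_segment(counts, 0, 1))
print(next_segment(counts, 2, 1))
```

The node alone would only know that the structure has depth 2; the skipping of empty sections and segments comes entirely from the state data.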

YishaiGlasner added 29 commits January 18, 2026 12:35

feat(ref api): endpoint and part of the view/

c084080

fix(ref api): add signature to get and wrap dict with jsonResponse

4cbd204

feat(ref api): remove index data

5c8a62b

feat(Ref): ref function returns its parts, and adding it to ref api

2a076bd

feat(Ref): ref function returns its parts, and adding it to ref api

a887c41

feat(Ref): change attr name for default node metadata.

d0068dd

feat(Ref): add navigation_refs.

da29cca

refactor(Ref): add ja param to more functionss for saving mongo calls.

11d3a72

refactor(ref api): call the vstate when index node is ja, and use it …

c1680da

…in ref methods.

feat(ref api): add next and prev.

ad09089

refactor(Ref): change ref_parts to get_lineage_titles_top_down, and d…

b1e7991

…o not return the sections (section are not defined well as parts when the ref is range).

refactor(Ref): remove get_lineage_titles_top_down as we have `address…

47f06c9

…` function on the node.

feat(ref api): change re parts to get_lineage_titles_top_down and add…

aa6cc3f

… normalized sections.

feat(ref api): many things.

f481237

chore(ref api): remove todo.

12175ea

fix(ref api): move prev and next into navigation_refs.

7ae1801

feat(ref api): add parent ref.

a56151e

fix(ref api): some virtual nodes problems.

0c9e506

fix(Ref): fix next and prev segment for virtual node.

9a6245b

text(ref api)

c1174b4

docs(ref api): add documentation to openAPI.json.

536653b

refactor(ref api): lineage_refs_top_down rather than `lineage_title…

d0cf2d3

…s_top_down` and `parent_ref`.

feat(Ref): add vstate param to functions that already have `state_j…

d90c186

…a` param, but call other functions that use `vstate`.

feat(ref api): add vstate to improve performance.

4ccb547

refactor(Ref): use vstate rather than state_ja.

d4d9e80

refactor(Ref): use vstate in is_empty and get_state_ja rather tha…

043c5ab

…n having private functions doing it.

feat(database): add QueryCounter class and use it in tests env to mon…

7f4bdfc

…itor number of times mongo is queried.

test(ref api): add get_ref function with assertion about number of ca…

566274a

…lls.

chore(ref api): remove redundant import.

93ca8aa

YishaiGlasner requested a review from akiva10b March 25, 2026 12:27

YishaiGlasner requested review from Copilot, saengel, stevekaplan123 and yitzhakc March 25, 2026 12:27

Copilot started reviewing on behalf of YishaiGlasner March 25, 2026 12:28

Copilot AI reviewed Mar 25, 2026

YishaiGlasner and others added 13 commits March 25, 2026 14:52

fix(ref api): check first available section exists before normalizing.

9aed4a3

docs(ref api): first available section can be null

6d51a27

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

fix(Ref.prev_segment_ref): change 0 to -1 for it should be the last s…

006b950

…egment of previous section.

Merge remote-tracking branch 'origin/ref-api' into ref-api

1e82043

fix(Ref.is_empty): fix call on segment ref with vstate.

d0dc947

feat(QueryCounter): do not count any command to avoid counting connec…

b2094ea

…tion/handshake commands. Rather give the user to pass what commands he wants to count.

Merge branch 'master' into ref-api

2e5240d

fix(tests): remove redundant non-existing import

28adb83

fix(ref api): fix lineage refs to be top down.

004e891

docs(ref api): remove false comment.

e80cf89

feat(ref api): add address types and scection names to default child.

d87135b

docs(ref api): add nullable to prev_segment_ref.

0798337

fix(ref api): handle DictionaryEntryNotFoundError as not being a ref.

4836c97

stevekaplan123 reviewed Mar 29, 2026

stevekaplan123 requested changes Mar 29, 2026

YishaiGlasner requested a review from stevekaplan123 March 30, 2026 09:23

3 participants
