
Update Citation model's full span and regexes to account for ReferenceCitation overlaps #209

Open
1 of 1 issue completed
@grossir


With the introduction of ReferenceCitations we noticed they sometimes overlapped with other citation models.

A Reference may be a standalone name ("As seen in Roe, ...") or a name and pin cite combination ("As seen in Roe at 223"). A reference extractor that does not take the other citation models into account may therefore extract references that are actually part of a fuller citation.

Currently, this is handled by eyecite.helpers.filter_citations, but we have been running into bugs caused by incorrect full span calculations or by incomplete extractors.
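To make the failure mode concrete, here is a minimal sketch of containment-based filtering. It is not eyecite's `filter_citations`: the `Cite` class, its `kind` and `full_span` fields, and `drop_contained_references` are hypothetical names used only for illustration. The point is that such a filter can only drop a reference when the fuller citation's span actually covers it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Cite:
    """Hypothetical stand-in for a citation object (not eyecite's model)."""

    kind: str                   # e.g. "reference", "supra", "short"
    full_span: tuple[int, int]  # (start, end) character offsets, end exclusive


def contains(outer: tuple[int, int], inner: tuple[int, int]) -> bool:
    """True when `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def drop_contained_references(cites: list[Cite]) -> list[Cite]:
    """Drop references whose span falls inside any non-reference citation."""
    others = [c for c in cites if c.kind != "reference"]
    return [
        c
        for c in cites
        if c.kind != "reference"
        or not any(contains(o.full_span, c.full_span) for o in others)
    ]


text = "Twombly, supra, at 553-554"
reference = Cite("reference", (0, 7))  # "Twombly"

# Span covers only the "supra" token, so the reference survives (the bug).
narrow = Cite("supra", (9, 14))
# Span runs from the antecedent through the pin cite, so the reference is dropped.
wide = Cite("supra", (0, len(text)))

kept_with_narrow = drop_contained_references([reference, narrow])
kept_with_wide = drop_contained_references([reference, wide])
```

With the narrow span both citations are kept; with the wide span only the supra citation remains.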

overlap with supra

From Example 1

  • overlap with supra citation: Twombly, supra, at 553-554

A Reference would be found inside of the Supra due to incomplete full span calculation:

eyecite/eyecite/find.py

Lines 313 to 324 in 32ee756

# Return SupraCitation
return SupraCitation(
    cast(SupraToken, words[index]),
    index,
    span_end=span_end,
    metadata={
        "antecedent_guess": antecedent_guess,
        "pin_cite": pin_cite,
        "parenthetical": parenthetical,
        "volume": volume,
    },
)
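One way to picture a complete full span for the supra case is a single pattern that runs from the antecedent name through the pin cite. The regex below is a simplified assumption made for this example, not eyecite's tokenizer or regexes, and `supra_full_span` is a hypothetical helper.

```python
import re

# Simplified, assumed pattern: antecedent name, "supra", optional pin cite.
SUPRA_RE = re.compile(
    r"(?P<antecedent>[A-Z][A-Za-z]+),?\s+"
    r"supra"
    r"(?:,?\s+(?:at\s+)?(?P<pin_cite>\d+(?:-\d+)?))?"
)


def supra_full_span(text: str):
    """Return (start, end) of the first supra citation in `text`, or None."""
    match = SUPRA_RE.search(text)
    return match.span() if match else None


text = "See Twombly, supra, at 553-554."
span = supra_full_span(text)
covered = text[span[0]:span[1]]
```

Here `covered` includes the antecedent "Twombly", so a Reference extracted from that name falls inside the supra citation's span and can be filtered out.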

overlap with short case citation

From Example 1

  • overlap with ShortCaseCitation: Twombly, 550 U. S. (I think this has been solved recently)

overlap with single-name and pincite full case citation

Example 2:

  • Nobelman at 332, 113 S.Ct. 2106 is actually a pincited case citation (?); currently we would identify it as a Reference followed by a full citation, or maybe a short case citation
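If this string should indeed be treated as one citation, its full span would have to start at the name rather than at the volume. The pattern below is a rough sketch under that assumption. It handles only this exact shape (name, "at", pin cite, volume, reporter, page), and both the regex and the group names are hypothetical rather than taken from eyecite.

```python
import re

# Assumed shape: "<Name> at <pin>, <volume> <reporter> <page>"
NAME_PIN_CITE_RE = re.compile(
    r"(?P<name>[A-Z][A-Za-z]+) at (?P<pin_cite>\d+), "
    r"(?P<volume>\d+) (?P<reporter>[A-Z][A-Za-z. ]*?) (?P<page>\d+)"
)

text = "Nobelman at 332, 113 S.Ct. 2106"
match = NAME_PIN_CITE_RE.search(text)
parts = match.groupdict()
full_span = match.span()
```

With a span like this, the "Nobelman at 332" Reference is contained in the case citation and would no longer be reported separately.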

overlap with single name full case citation

From Example 1

The following is not strictly related to References but to parallel citations. It should probably be split into a separate issue, but I am noting it here so it can be added as test cases that we know will fail.

Example

  • State v. Howard, supra 128-129, 539 A.2d 1203. is a single citation that lists all the parallels, but our system will recognize it as a SupraCitation followed by a CaseCitation

On the same example, something similar happens with an IdCitation and parallel citations
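As a rough illustration of what treating these as a single citation could look like, the sketch below groups consecutive spans that are separated only by a comma. This rule is an assumption made for the example, not how eyecite handles parallel citations, and `span_of` and `group_parallel` are hypothetical helpers.

```python
import re

# Assumed rule for this sketch: two citations are parallel when only a comma
# and optional whitespace separate them.
SEPARATOR_RE = re.compile(r",\s*")


def span_of(text, fragment):
    """Return the (start, end) offsets of the first occurrence of `fragment`."""
    start = text.index(fragment)
    return (start, start + len(fragment))


def group_parallel(text, spans):
    """Group consecutive (start, end) spans separated only by a comma."""
    groups = []
    for span in sorted(spans):
        if groups:
            previous_end = groups[-1][-1][1]
            gap = text[previous_end:span[0]]
            if SEPARATOR_RE.fullmatch(gap):
                groups[-1].append(span)
                continue
        groups.append([span])
    return groups


text = "State v. Howard, supra 128-129, 539 A.2d 1203."
supra = span_of(text, "Howard, supra 128-129")
case = span_of(text, "539 A.2d 1203")
groups = group_parallel(text, [supra, case])
```

Both spans end up in one group here, whereas citations separated by other text would stay in separate groups.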
