Improve benchmark #212

Open
@grossir

First, readability can be improved. The branches are labeled "branch1" and "branch2", but which one is "main" and which one is the PR?

Second, we could return some statistics that are already implied by the processing.
In this recent PR, there was a noticeable increase in processing time, which is an important metric when evaluating changes.

Also, with the introduction of ReferenceCitations, citations can now overlap; it would be useful to return the citation type in the JSON itself.
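As a rough illustration of what returning the type could look like, here is a minimal sketch. The two dataclasses are hypothetical stand-ins for the real citation classes, and the `type`, `text`, and `span` keys are assumed names, not the benchmark's current output format.

```python
import json
from dataclasses import dataclass


# Hypothetical stand-ins for the real citation classes; only the class
# name and the matched span matter for this sketch.
@dataclass
class FullCaseCitation:
    text: str
    start: int
    end: int


@dataclass
class ReferenceCitation:
    text: str
    start: int
    end: int


def serialize_citation(citation):
    """Turn one citation into a JSON-ready dict that records its type."""
    return {
        "type": type(citation).__name__,  # class name doubles as the type label
        "text": citation.text,
        "span": [citation.start, citation.end],
    }


# Made-up, overlapping citations for illustration only.
citations = [
    FullCaseCitation("Foo v. Bar, 1 U.S. 1", 0, 20),
    ReferenceCitation("Foo", 0, 3),
]
print(json.dumps([serialize_citation(c) for c in citations], indent=2))
```

With the type in each entry, two overlapping matches over the same span can still be told apart when diffing the results of two branches.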

About the processing time calculation: it can be derived from the results JSON files themselves.

import statistics

import requests

url1 = "https://raw.githubusercontent.com/freelawproject/eyecite/artifacts/203/results/8981703e7cc27067adcb39f66346dc62248974cf.json"
url2 = "https://raw.githubusercontent.com/freelawproject/eyecite/artifacts/203/results/bb9ca00f64c5aa47d0eb85a16e38bc03a6bf0b61.json"

def get_time_stats(url):
    # Treats each item's 'time' as cumulative, so the per-item processing
    # time is the difference from the previous item's 'time'.
    results = requests.get(url).json()
    prev_time = results[0]['time']
    actual_times = [prev_time]
    for item in results[1:]:
        actual_times.append(item['time'] - prev_time)
        prev_time = item['time']

    print("Mean: ", sum(actual_times) / len(actual_times))
    print("Median: ", statistics.median(actual_times))
    print("Sample size: ", len(actual_times))

get_time_stats(url1)
get_time_stats(url2)

Yields

Mean:  0.08226925288831836
Median:  0.014779999999994686
Sample size:  779

Mean:  0.05099591142490372
Median:  0.009577000000000169
Sample size:  779
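
To address both points at once (clear branch labels and timing statistics in the report), something like the following sketch could work. It follows the snippet above in treating each `time` as cumulative; the function names, the `main`/`pr` keys, and `mean_change_pct` are assumptions for illustration, not part of the existing benchmark.

```python
import statistics


def per_item_durations(cumulative_times):
    """Convert cumulative timestamps into per-item durations."""
    durations = [cumulative_times[0]]
    for previous, current in zip(cumulative_times, cumulative_times[1:]):
        durations.append(current - previous)
    return durations


def time_stats(cumulative_times):
    """Summarize per-item durations for one branch."""
    durations = per_item_durations(cumulative_times)
    return {
        "mean": statistics.fmean(durations),
        "median": statistics.median(durations),
        "sample_size": len(durations),
    }


def compare_branches(main_times, pr_times):
    """Label each branch explicitly and report the relative change in mean time."""
    main, pr = time_stats(main_times), time_stats(pr_times)
    return {
        "main": main,
        "pr": pr,
        "mean_change_pct": 100 * (pr["mean"] - main["mean"]) / main["mean"],
    }


# Made-up cumulative timestamps, for illustration only.
report = compare_branches([1.0, 2.0, 4.0], [2.0, 4.0, 8.0])
print(report)
```

Keying the report by "main" and "pr" instead of "branch1" and "branch2" removes the ambiguity, and the percentage change makes a processing-time regression visible at a glance.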
