First, readability could be improved: the branches are named "branch1" and "branch2", but which one is "main" and which is the PR?
Second, we could return some statistics that the processing already implies.
In this recent PR, there was a noticeable increase in processing time, which is an important metric when evaluating changes.
Also, with the introduction of ReferenceCitations, overlapping citations become possible; it would be useful to return the citation type in the JSON itself.
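To make the idea concrete, here is a minimal sketch of what recording the type next to each citation could look like. The classes below are hypothetical stand-ins that only borrow the names, not eyecite's real classes, and the field names `type`, `text` and `span` are assumptions rather than the current output format.

```python
import json
from dataclasses import dataclass


@dataclass
class _StandIn:
    # Hypothetical stand-in for a citation object; not eyecite's real class.
    text: str
    start: int
    end: int

    def span(self):
        return (self.start, self.end)


class FullCaseCitation(_StandIn):
    pass


class ReferenceCitation(_StandIn):
    pass


def serialize(citations):
    """Return JSON-ready dicts that record each citation's type and span."""
    return [
        {"type": type(c).__name__, "text": c.text, "span": list(c.span())}
        for c in citations
    ]


def overlapping_pairs(rows):
    """Return index pairs (i, j) whose character spans overlap."""
    pairs = []
    for i, a in enumerate(rows):
        for j in range(i + 1, len(rows)):
            b = rows[j]
            if a["span"][0] < b["span"][1] and b["span"][0] < a["span"][1]:
                pairs.append((i, j))
    return pairs


# Made-up example data: a full citation and a reference that overlaps it.
cites = [
    FullCaseCitation("Roe v. Wade, 410 U.S. 113", 0, 25),
    ReferenceCitation("Roe", 0, 3),
    ReferenceCitation("Roe", 40, 43),
]
rows = serialize(cites)
print(json.dumps(rows, indent=2))
print(overlapping_pairs(rows))
```

With the type in each row, a diff between the two branches could tell a newly found ReferenceCitation apart from a changed full citation at the same position.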
Processing time can already be calculated from the results JSON files themselves:
import statistics

import requests

url1 = "https://raw.githubusercontent.com/freelawproject/eyecite/artifacts/203/results/8981703e7cc27067adcb39f66346dc62248974cf.json"
url2 = "https://raw.githubusercontent.com/freelawproject/eyecite/artifacts/203/results/bb9ca00f64c5aa47d0eb85a16e38bc03a6bf0b61.json"


def get_time_stats(url):
    """Print mean, median and sample size of per-item processing times."""
    data = requests.get(url).json()
    # 'time' is treated as cumulative, so each item's own duration is the
    # difference from the previous entry; the first entry is used as-is.
    prev_start = data[0]['time']
    actual_times = [prev_start]
    for item in data[1:]:
        actual_times.append(item['time'] - prev_start)
        prev_start = item['time']
    print("Mean: ", sum(actual_times) / len(actual_times))
    print("Median: ", statistics.median(actual_times))
    print("Sample size: ", len(actual_times))


get_time_stats(url1)
get_time_stats(url2)
Yields
Mean: 0.08226925288831836
Median: 0.014779999999994686
Sample size: 779
Mean: 0.05099591142490372
Median: 0.009577000000000169
Sample size: 779
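If the benchmark returned these statistics itself, it could also report the relative difference between the two runs. The following is only a sketch: the function names are hypothetical, the input lists are made-up numbers, and it assumes the same cumulative `time` convention as the snippet above.

```python
import statistics


def per_item_times(cumulative):
    """Convert cumulative timestamps into per-item durations."""
    times = [cumulative[0]]  # the first entry is used as-is
    for prev, cur in zip(cumulative, cumulative[1:]):
        times.append(cur - prev)
    return times


def summarize(times):
    """Collect the same three figures the snippet above prints."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "n": len(times),
    }


def relative_change(base, other):
    """Fractional change of `other` relative to `base` for mean and median."""
    return {k: (other[k] - base[k]) / base[k] for k in ("mean", "median")}


# Made-up cumulative times for two runs over the same three documents.
run_a = summarize(per_item_times([1.0, 3.0, 6.0]))
run_b = summarize(per_item_times([2.0, 6.0, 12.0]))
change = relative_change(run_a, run_b)
print(run_a, run_b, change)
```

Applied to the figures above, the first URL's mean is roughly 61% higher than the second's, and its median roughly 54% higher.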