
Empty submission detection #570

Open
jon-bell wants to merge 7 commits into staging from cursor/empty-submission-detection-5082

Conversation

@jon-bell
Contributor

@jon-bell jon-bell commented Jan 14, 2026

Adds automatic detection and flagging of empty submissions by comparing submitted files against stored handout file hashes for all template repo versions.



Summary by CodeRabbit

  • New Features

    • Detects and flags "empty" submissions that match instructor handout files; assignments can permit or reject such submissions, and rejected submissions are cleaned up automatically.
    • Stores per-assignment handout file fingerprints to enable reliable empty-submission detection.
  • Chores

    • Database schema updated to record empty-submission status and handout fingerprints; assignment settings added to control behavior.
    • Added local e2e test script.
  • Tests

    • End-to-end tests added to validate empty-submission detection, acceptance, and rejection.


cursoragent and others added 2 commits January 14, 2026 21:07
Co-authored-by: jon <jon@jonbell.net>
Co-authored-by: jon <jon@jonbell.net>
@cursor

cursor bot commented Jan 14, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

@coderabbitai
Contributor

coderabbitai bot commented Jan 14, 2026

Walkthrough

Adds empty-submission detection: stores per-handout file hashes from template commits, computes SHA-256 fingerprints for submitted files, flags submissions that match handout fingerprints, and optionally rejects and cleans up submissions when an assignment disallows empty submissions.
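
For orientation, the detection step condenses to roughly the following TypeScript sketch. The table and column names come from this PR; the function shapes, the names combinedHash and isEmptySubmission, and the client wiring are illustrative only.

import { createHash } from "node:crypto";
import { Buffer } from "node:buffer";
import type { SupabaseClient } from "@supabase/supabase-js";

// Canonical fingerprint: sort paths, join NUL-delimited name/hash pairs with
// trailing newlines, then SHA-256 the UTF-8 bytes (this mirrors the
// computeCombinedHashFromFileHashes helper quoted later in this review).
function combinedHash(fileHashes: Record<string, string>): string {
  const input = Object.keys(fileHashes)
    .sort()
    .map((name) => `${name}\0${fileHashes[name]}\n`)
    .join("");
  return createHash("sha256").update(Buffer.from(input, "utf-8")).digest("hex");
}

// A submission is "empty" when its combined fingerprint matches any stored
// handout fingerprint for the assignment.
async function isEmptySubmission(
  db: SupabaseClient,
  assignmentId: number,
  fileHashes: Record<string, string>
): Promise<boolean> {
  const { data, error } = await db
    .from("assignment_handout_file_hashes")
    .select("id")
    .eq("assignment_id", assignmentId)
    .eq("combined_hash", combinedHash(fileHashes))
    .limit(1);
  if (error) throw error;
  return (data ?? []).length > 0;
}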

Changes

  • Database migrations & schema (supabase/migrations/20260114180000_empty_submission_detection.sql, supabase/migrations/20260114193000_permit_empty_submissions.sql): Add assignment_handout_file_hashes table, indexes, and RLS policy; add submissions.is_empty_submission column and assignments.permit_empty_submissions column with defaults and comments.
  • Server functions — autograder (supabase/functions/autograder-create-submission/index.ts): Add SHA-256 hashing utilities, compute per-file & combined submission hash, look up handout hashes, set is_empty_submission, optionally reject/rollback and clean up rejected submissions, preserve error handling.
  • Server functions — GitHub webhook (supabase/functions/github-repo-webhook/index.ts): Fetch commit tree/blobs via Octokit, compute per-file & combined handout hashes (micromatch filtering), store hashes into assignment_handout_file_hashes, adjust template SHA selection and status timestamps.
  • Type definitions (supabase/functions/_shared/SupabaseTypes.d.ts, utils/supabase/SupabaseTypes.d.ts): Add assignment_handout_file_hashes table types and relationships; add permit_empty_submissions to assignments; add is_empty_submission to submissions and grading_review types.
  • Tests & utilities (tests/e2e/TestingUtils.ts, tests/e2e/create-submission.test.tsx): Extend insertAssignment to accept permit_empty_submissions; add e2e tests covering allowed/rejected/non-empty scenarios and test-side hashing utilities to validate behavior.
  • Package scripts (package.json): Add test:e2e:local npm script to run Playwright tests against localhost.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Autograder as Autograder Service
    participant Storage as File Storage
    participant DB as Database

    User->>Autograder: Submit assignment files
    Autograder->>Storage: Read submitted file blobs
    Storage-->>Autograder: File contents
    Autograder->>Autograder: Compute per-file SHA-256 & combined hash
    Autograder->>DB: Query assignment_handout_file_hashes by assignment_id + combined_hash
    DB-->>Autograder: Matching handout record? (yes/no)
    alt Match found
        Autograder->>DB: Set submissions.is_empty_submission = true
        alt assignment.permit_empty_submissions = false and user not grader/instructor
            Autograder->>DB: Cleanup related reviews/files and remove submission
            Autograder-->>User: Reject submission (UserVisibleError)
        else
            Autograder->>DB: Persist submission (flagged empty)
            Autograder-->>User: Accept submission (empty)
        end
    else No match
        Autograder->>DB: Set submissions.is_empty_submission = false
        Autograder->>DB: Persist submission
        Autograder-->>User: Accept submission
    end
sequenceDiagram
    participant GitHub
    participant Webhook as Webhook Handler
    participant Octokit as GitHub API
    participant DB as Database

    GitHub->>Webhook: Push event (template repo)
    Webhook->>Webhook: Determine commit SHA
    Webhook->>Octokit: Fetch commit tree & blobs
    Octokit-->>Webhook: File blobs
    Webhook->>Webhook: Filter expected files, compute per-file & combined hashes
    Webhook->>DB: Insert/update assignment_handout_file_hashes (assignment_id, sha, combined_hash, file_hashes, class_id)
    Webhook->>DB: Update assignment.latest_template_sha
    Webhook-->>GitHub: Acknowledge event

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

backend, database, testing

Suggested reviewers

  • jon-bell

Poem

Handouts hashed and fingerprints set,
Submissions checked so none are met,
Empty ones flagged, some swept away,
Tests confirm the gating play.
🔐✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 8.33%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check (✅ Passed): The title 'Empty submission detection' accurately summarizes the main change: implementing automatic detection and flagging of empty submissions.




📜 Recent review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 070610e and 4c882ef.

📒 Files selected for processing (1)
  • package.json
🔇 Additional comments (1)
package.json (1)

19-19: LGTM!

Clean addition that follows existing patterns in the project. Using cross-env for the BASE_URL environment variable ensures cross-platform compatibility, and localhost:3000 is a sensible default for local Next.js development. This nicely complements the existing test:e2e script for CI environments.
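
For reference, the script's likely shape, reconstructed from this description (the exact command in the PR may differ):

"test:e2e:local": "cross-env BASE_URL=http://localhost:3000 playwright test"

Run it with npm run test:e2e:local against a locally running Next.js dev server.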




cursoragent and others added 3 commits January 14, 2026 21:16
Co-authored-by: jon <jon@jonbell.net>
Co-authored-by: jon <jon@jonbell.net>
Co-authored-by: jon <jon@jonbell.net>
@jon-bell jon-bell marked this pull request as ready for review January 14, 2026 21:18
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
supabase/functions/autograder-create-submission/index.ts (1)

1309-1391: Blocker: permit_empty_submissions is never selected, so rejections won’t happen.

You check repoData.assignments.permit_empty_submissions === false, but the repo lookup selects only assignments(class_id, due_date, allow_not_graded_submissions, autograder(*)), so permit_empty_submissions will be undefined at runtime.

Also: consider wrapping the GitHub updateCheckRun in a try/catch so a GitHub API failure can’t prevent cleanup + the intended rejection.

Proposed fix (include permit_empty_submissions in the repoData select)
-    .select("*, assignments(class_id, due_date, allow_not_graded_submissions, autograder(*))")
+    .select("*, assignments(class_id, due_date, allow_not_graded_submissions, permit_empty_submissions, autograder(*))")
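
For the try/catch point, a minimal sketch: bestEffort is a hypothetical helper, not code from this PR, while updateCheckRun is the real export from _shared/GitHubWrapper.ts (its arguments are elided here).

// Hypothetical wrapper: run a best-effort GitHub call and log instead of
// throwing, so a GitHub API failure cannot block cleanup and rejection.
async function bestEffort(op: () => Promise<unknown>): Promise<void> {
  try {
    await op();
  } catch (err) {
    console.error("Non-fatal GitHub API failure:", err);
  }
}

// Usage sketch: await bestEffort(() => updateCheckRun(/* ...check-run params... */));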
🧹 Nitpick comments (4)
tests/e2e/create-submission.test.tsx (2)

28-50: Keep hash canonicalization in one place (avoid subtle drift).

This duplicates the combined-hash algorithm now implemented in edge functions; consider extracting to a shared helper to prevent mismatches (delimiter/sorting/encoding) across test + webhook + autograder.


210-297: Make handout-hash setup idempotent to avoid rare “duplicate key” flakes.

These inserts into assignment_handout_file_hashes are fine for fresh assignments, but using upsert(..., { onConflict: "assignment_id,sha" }) (or a unique sha per test run) would make reruns more robust.
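
A sketch of that suggestion, assuming the supabase-js client already used by these tests; upsertHandoutHashes is an illustrative name, and the column list matches this PR's migration:

import type { SupabaseClient } from "@supabase/supabase-js";

// Idempotent setup: re-running the test overwrites rather than colliding on
// the UNIQUE (assignment_id, sha) constraint added by this PR's migration.
async function upsertHandoutHashes(
  db: SupabaseClient,
  row: {
    assignment_id: number;
    class_id: number;
    sha: string;
    file_hashes: Record<string, string>;
    combined_hash: string;
  }
): Promise<void> {
  const { error } = await db
    .from("assignment_handout_file_hashes")
    .upsert(row, { onConflict: "assignment_id,sha" });
  if (error) throw error;
}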

utils/supabase/SupabaseTypes.d.ts (1)

696-768: Consider stronger typing for file_hashes (shape appears well-known).

If file_hashes is a "path -> hash" map, keeping it as Json pushes a lot of validation to runtime. If feasible, consider adding a separate app-level type (e.g., type HandoutFileHashes = Record<string, string>) and casting at the boundary, rather than relying on Json everywhere.
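
Concretely, that suggestion might look like the following sketch. HandoutFileHashes is named in the comment above; asHandoutFileHashes is a hypothetical boundary guard, not code from the PR.

// App-level alias proposed above.
type HandoutFileHashes = Record<string, string>;

// Hypothetical runtime guard for the Json -> map boundary.
function asHandoutFileHashes(json: unknown): HandoutFileHashes {
  if (typeof json !== "object" || json === null || Array.isArray(json)) {
    throw new Error("file_hashes must be a JSON object");
  }
  for (const value of Object.values(json)) {
    if (typeof value !== "string") {
      throw new Error("file_hashes values must be hex digest strings");
    }
  }
  return json as HandoutFileHashes;
}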

supabase/migrations/20260114180000_empty_submission_detection.sql (1)

28-38: Constraint existence check could be more precise.

The pg_constraint query only filters by conname, which could theoretically match a constraint with the same name on a different table or schema (unlikely but possible). A more robust check would also verify the constraint belongs to the correct table.

♻️ Optional: More precise constraint existence check
 DO $$
 BEGIN
   IF NOT EXISTS (
     SELECT 1
-    FROM pg_constraint
-    WHERE conname = 'assignment_handout_file_hashes_assignment_id_sha_key'
+    FROM pg_constraint c
+    JOIN pg_class t ON c.conrelid = t.oid
+    JOIN pg_namespace n ON t.relnamespace = n.oid
+    WHERE c.conname = 'assignment_handout_file_hashes_assignment_id_sha_key'
+      AND t.relname = 'assignment_handout_file_hashes'
+      AND n.nspname = 'public'
   ) THEN
     ALTER TABLE public.assignment_handout_file_hashes
       ADD CONSTRAINT assignment_handout_file_hashes_assignment_id_sha_key UNIQUE (assignment_id, sha);
   END IF;
 END $$;

That said, given the verbose constraint name, a collision is highly improbable in practice.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc6091f and f63f036.

📒 Files selected for processing (8)
  • supabase/functions/_shared/SupabaseTypes.d.ts
  • supabase/functions/autograder-create-submission/index.ts
  • supabase/functions/github-repo-webhook/index.ts
  • supabase/migrations/20260114180000_empty_submission_detection.sql
  • supabase/migrations/20260114193000_permit_empty_submissions.sql
  • tests/e2e/TestingUtils.ts
  • tests/e2e/create-submission.test.tsx
  • utils/supabase/SupabaseTypes.d.ts
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-12T05:59:48.513Z
Learnt from: koyohashi
Repo: pawtograder/platform PR: 514
File: supabase/migrations/20251204070101_live-polls.sql:38-38
Timestamp: 2025-12-12T05:59:48.513Z
Learning: In SQL migrations affecting live polls, ensure the UNIQUE constraint on (live_poll_id, public_profile_id) behaves with NULLs as intended: anonymous (public_profile_id NULL) can submit multiple responses, while duplicates are prevented only when require_login is true and public_profile_id IS NOT NULL. Review migration 20251204070101_live-polls.sql to confirm this constraint and related columns (public_profile_id nullable, require_login flag) are implemented accordingly and documented.

Applied to files:

  • supabase/migrations/20260114193000_permit_empty_submissions.sql
  • supabase/migrations/20260114180000_empty_submission_detection.sql
📚 Learning: 2025-12-12T06:13:23.017Z
Learnt from: koyohashi
Repo: pawtograder/platform PR: 514
File: supabase/migrations/20251204070101_live-polls.sql:0-0
Timestamp: 2025-12-12T06:13:23.017Z
Learning: Migrations that implement RLS should clearly document the intended access boundaries. For live_polls (and similar features), ensure that unrestricted read access (USING (true)) is intentional and that real security checks occur at the response level (e.g., require_login and can_access_poll_response()). In the migration, add a comment explaining why read-level policy is permissive and that the enforcement happens downstream, so future reviewers understand the design choice and potential risks.

Applied to files:

  • supabase/migrations/20260114193000_permit_empty_submissions.sql
  • supabase/migrations/20260114180000_empty_submission_detection.sql
📚 Learning: 2025-10-19T01:15:20.591Z
Learnt from: jon-bell
Repo: pawtograder/platform PR: 409
File: supabase/migrations/20251017190202_improve_help_request_rls_perf.sql:1-235
Timestamp: 2025-10-19T01:15:20.591Z
Learning: Before suggesting the creation of database indices, always check prior migration files to verify whether the index already exists. Use shell scripts to search for CREATE INDEX statements and PRIMARY KEY constraints in the supabase/migrations directory.

Applied to files:

  • supabase/migrations/20260114180000_empty_submission_detection.sql
🧬 Code graph analysis (4)
tests/e2e/create-submission.test.tsx (2)
utils/supabase/DatabaseTypes.d.ts (1)
  • Assignment (42-42)
tests/e2e/TestingUtils.ts (3)
  • insertAssignment (1142-1427)
  • insertSubmissionViaAPI (1429-1536)
  • supabase (12-12)
supabase/functions/autograder-create-submission/index.ts (4)
supabase/functions/_shared/SupabaseTypes.d.ts (1)
  • Database (3-11377)
utils/supabase/SupabaseTypes.d.ts (1)
  • Database (3-11377)
supabase/functions/_shared/GitHubWrapper.ts (1)
  • updateCheckRun (1569-1580)
supabase/functions/_shared/HandlerUtils.ts (1)
  • UserVisibleError (316-324)
supabase/functions/_shared/SupabaseTypes.d.ts (2)
utils/supabase/SupabaseTypes.d.ts (1)
  • Json (1-1)
utils/supabase/DatabaseTypes.d.ts (1)
  • Json (3-3)
supabase/functions/github-repo-webhook/index.ts (3)
supabase/functions/_shared/GitHubWrapper.ts (1)
  • getOctoKit (244-319)
supabase/functions/_shared/PawtograderYml.d.ts (1)
  • PawtograderConfig (42-49)
utils/PawtograderYml.d.ts (1)
  • PawtograderConfig (42-49)
⏰ Context from checks skipped due to timeout of 180000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: lint (22.x)
  • GitHub Check: deploy
  • GitHub Check: lint (22.x)
🔇 Additional comments (12)
supabase/functions/_shared/SupabaseTypes.d.ts (3)

696-768: No action needed—schema is correctly designed.

The file_hashes column is jsonb NOT NULL DEFAULT '{}'::jsonb in the migration, which fully supports the TypeScript type contract. Insert/Update fields can safely omit file_hashes because the database applies the default; Row correctly shows it as required because the column is never null. The same pattern applies to is_empty_submission and permit_empty_submissions, both with explicit NOT NULL DEFAULT constraints. The schema is sound.

Likely an incorrect or invalid review comment.


825-826: TypeScript pattern and database constraints confirmed as correct.

The field is properly typed as required in the Row type (line 825) but optional in Insert/Update types (lines 860, 895), following the expected "defaulted column" pattern. The migration file correctly applies both NOT NULL and DEFAULT true constraints, ensuring backwards compatibility with older insert paths and preventing runtime failures. The default value of true aligns with the semantic meaning ("allow empty submissions by default"), maintaining safe backwards-compatible behavior.


8095-8096: submissions.is_empty_submission and is_not_graded: Defaults correctly configured in migrations.

Both fields have DEFAULT false in their migrations (20260114180000_empty_submission_detection.sql and 20250712150000_update-due-date-logic-for-lab-sections.sql respectively), ensuring existing rows are backfilled and new inserts without explicit values default safely. The TypeScript schema correctly reflects this: Row types are non-nullable while Insert/Update types are optional.
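
The pattern being verified here, reduced to a skeleton (field name from this PR; surrounding fields trimmed for illustration):

// Row: the column is NOT NULL, so reads always see a boolean.
type SubmissionsRow = {
  is_empty_submission: boolean;
};

// Insert/Update: optional, because the database supplies DEFAULT false.
type SubmissionsInsert = {
  is_empty_submission?: boolean;
};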

tests/e2e/TestingUtils.ts (1)

1142-1206: Permit-empty-submissions plumbing looks correct (defaulting to true).

permit_empty_submissions: permit_empty_submissions ?? true matches the DB default and keeps existing call sites stable.

supabase/migrations/20260114193000_permit_empty_submissions.sql (1)

1-8: Migration is clean and idempotent.

tests/e2e/create-submission.test.tsx (1)

84-101: Good coverage: explicit assignments for empty allowed vs prohibited.

utils/supabase/SupabaseTypes.d.ts (2)

8095-8146: Issue verified as resolved: is_empty_submission is properly defined with NOT NULL DEFAULT false in the migration.

The migration 20260114180000_empty_submission_detection.sql correctly adds the column with both the NOT NULL constraint and DEFAULT false value, which handles backfilling existing rows automatically. No further action needed.


825-906: No action needed — assignments.permit_empty_submissions already has NOT NULL constraint with DEFAULT true.

The column is properly defined in migration 20260114193000_permit_empty_submissions.sql with NOT NULL DEFAULT true, which safely backfills existing rows and prevents runtime surprises. The default value permits empty submissions by default, with the ability to restrict when needed.

supabase/migrations/20260114180000_empty_submission_detection.sql (4)

5-10: LGTM! Clean column addition with sensible defaults.

Adding is_empty_submission as NOT NULL DEFAULT false ensures backward compatibility with existing rows. The comment clearly documents the column's purpose.


12-23: Table structure looks solid.

Good choices here:

  • GENERATED BY DEFAULT AS IDENTITY for the PK is the modern PostgreSQL approach
  • ON DELETE CASCADE on both FKs prevents orphaned records when assignments or classes are removed
  • jsonb for file_hashes is appropriate for the key-value mapping of paths to hashes

One minor note: the file_hashes default is '{}'::jsonb which is an empty object — this aligns well with the intended usage.


40-45: Index choices are appropriate.

  • The composite index on (assignment_id, combined_hash) directly supports the empty-submission lookup pattern described in the comments.
  • The class_id index will help with RLS policy evaluation since authorizeforclassinstructor(class_id) needs to filter by class.

Based on learnings, I verified these are new indexes being introduced in this migration.


50-57: RLS policy is correctly designed — reads restricted to instructors/graders, writes via service role.

The SELECT policy appropriately limits read access to instructors and graders for debugging and admin visibility. All writes (upsert in github-repo-webhook and autograder-create-submission) use adminSupabase, which operates as the service role and automatically bypasses RLS. No INSERT or UPDATE policies are needed.


Comment on lines +58 to +74
async function safeCleanupRejectedSubmission(params: {
  adminSupabase: SupabaseClient<Database>;
  submissionId: number;
}) {
  const { adminSupabase, submissionId } = params;
  // Break circular-ish references: submissions.grading_review_id -> submission_reviews
  await adminSupabase.from("submissions").update({ grading_review_id: null, is_active: false }).eq("id", submissionId);

  // Delete any auto-created review row(s) for this submission.
  await adminSupabase.from("submission_reviews").delete().eq("submission_id", submissionId);

  // Remove files (these are the only child rows we definitely created in this path).
  await adminSupabase.from("submission_files").delete().eq("submission_id", submissionId);

  // Finally, remove the submission row itself.
  await adminSupabase.from("submissions").delete().eq("id", submissionId);
}

⚠️ Potential issue | 🟠 Major

Cleanup helper silently ignores PostgREST failures (won’t clean up in practice).

safeCleanupRejectedSubmission never inspects { error }, so cleanup can fail silently, leaving rejected submissions behind.

Proposed fix
 async function safeCleanupRejectedSubmission(params: {
   adminSupabase: SupabaseClient<Database>;
   submissionId: number;
 }) {
   const { adminSupabase, submissionId } = params;
   // Break circular-ish references: submissions.grading_review_id -> submission_reviews
-  await adminSupabase.from("submissions").update({ grading_review_id: null, is_active: false }).eq("id", submissionId);
+  {
+    const { error } = await adminSupabase
+      .from("submissions")
+      .update({ grading_review_id: null, is_active: false })
+      .eq("id", submissionId);
+    if (error) throw error;
+  }

   // Delete any auto-created review row(s) for this submission.
-  await adminSupabase.from("submission_reviews").delete().eq("submission_id", submissionId);
+  {
+    const { error } = await adminSupabase.from("submission_reviews").delete().eq("submission_id", submissionId);
+    if (error) throw error;
+  }

   // Remove files (these are the only child rows we definitely created in this path).
-  await adminSupabase.from("submission_files").delete().eq("submission_id", submissionId);
+  {
+    const { error } = await adminSupabase.from("submission_files").delete().eq("submission_id", submissionId);
+    if (error) throw error;
+  }

   // Finally, remove the submission row itself.
-  await adminSupabase.from("submissions").delete().eq("id", submissionId);
+  {
+    const { error } = await adminSupabase.from("submissions").delete().eq("id", submissionId);
+    if (error) throw error;
+  }
 }

Comment on lines +73 to +138
function sha256Hex(buf: Uint8Array): string {
  const hash = createHash("sha256");
  hash.update(buf);
  return hash.digest("hex");
}

function computeCombinedHashFromFileHashes(file_hashes: Record<string, string>): string {
  const combinedInput = Object.keys(file_hashes)
    .sort()
    .map((name) => `${name}\0${file_hashes[name]}\n`)
    .join("");
  return sha256Hex(Buffer.from(combinedInput, "utf-8"));
}

async function computeHandoutFileHashesForCommit(params: {
  templateRepo: string;
  commitSha: string;
  expectedFiles: string[];
  scope: Sentry.Scope;
}): Promise<{ file_hashes: Record<string, string>; combined_hash: string }> {
  const { templateRepo, commitSha, expectedFiles, scope } = params;
  const octokit = await getOctoKit(templateRepo, scope);
  if (!octokit) {
    throw new Error(`No octokit found for repository ${templateRepo}`);
  }
  const [owner, repo] = templateRepo.split("/");

  // Resolve commit -> tree sha
  const { data: commit } = await octokit.request("GET /repos/{owner}/{repo}/git/commits/{commit_sha}", {
    owner,
    repo,
    commit_sha: commitSha
  });
  const treeSha = commit.tree.sha;

  // List all files in tree
  const { data: tree } = await octokit.request("GET /repos/{owner}/{repo}/git/trees/{tree_sha}", {
    owner,
    repo,
    tree_sha: treeSha,
    recursive: "true"
  });

  const wantedPaths = (tree.tree || [])
    .filter((item) => item.type === "blob" && !!item.path && !!item.sha)
    .map((item) => ({ path: item.path!, sha: item.sha! }))
    .filter(({ path }) => expectedFiles.some((pattern) => micromatch.isMatch(path, pattern)));

  // Hash file contents for all wanted paths
  const file_hashes: Record<string, string> = {};
  for (const { path, sha } of wantedPaths) {
    const { data: blob } = await octokit.request("GET /repos/{owner}/{repo}/git/blobs/{file_sha}", {
      owner,
      repo,
      file_sha: sha
    });
    if (blob.encoding !== "base64") {
      throw new Error(`Unexpected blob encoding for ${path}: ${blob.encoding}`);
    }
    const bytes = Buffer.from(blob.content, "base64");
    file_hashes[path] = sha256Hex(bytes);
  }

  const combined_hash = computeCombinedHashFromFileHashes(file_hashes);
  return { file_hashes, combined_hash };
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub REST API git trees recursive parameter format "true" vs "1" octokit

💡 Result:

Both work — GitHub’s git/trees endpoint treats the recursive query as “any value enables recursion.” You can pass recursive=1 or recursive=true (or 0/"false" to disable). Octokit docs likewise note values like 0, 1, "true", and "false" are accepted. (docs.github.com)



🏁 Script executed:

# Search the codebase to understand how the function is called and what payload.commits looks like
rg -A 5 -B 5 "computeHandoutFileHashesForCommit" /

# Check if there are multiple calls to this function or if it's called once per commit
rg "payload\.commits" /

# Look for how templateRepo is passed and if there's any caching mechanism
rg "assignment_handout_file_hashes" /

Repository: pawtograder/platform

Length of output: 7580


🏁 Script executed:

# Check the Octokit documentation or type definitions in the codebase
fd "octokit" -t f | head -20

# Look for package.json to see what Octokit version is being used
fd "package.json" -t f | xargs grep -l "octokit" 2>/dev/null | head -5

Repository: pawtograder/platform

Length of output: 76


🏁 Script executed:

# View the context around where computeHandoutFileHashesForCommit is called (around line 140-160)
sed -n '135,160p' /home/jailuser/git/supabase/functions/github-repo-webhook/index.ts

Repository: pawtograder/platform

Length of output: 956


🏁 Script executed:

# View the lines 549-599 to see if this pattern repeats elsewhere
sed -n '540,605p' /home/jailuser/git/supabase/functions/github-repo-webhook/index.ts

Repository: pawtograder/platform

Length of output: 2267


🏁 Script executed:

# Also check the overall structure of the main function to understand commit iteration
sed -n '140,250p' /home/jailuser/git/supabase/functions/github-repo-webhook/index.ts

Repository: pawtograder/platform

Length of output: 5059


Handout hashing makes duplicate GitHub API blob calls across assignments; consider caching.

For each assignment with the same template repo and commit SHA, the function makes separate blob API calls to fetch the same files. With multiple assignments per webhook and broad globs (e.g., **/*.java), this quickly compounds: if 10 assignments share a template repo, you fetch the same N blobs 10 times. For large template repos with many matched files, this risks rate limiting or webhook timeouts. Consider caching blob fetches within a single webhook invocation, or enforcing a reasonable cap on matched files per assignment.

The recursive: "true" parameter is valid—Octokit accepts it without issue.
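
A sketch of one possible cache, keyed by blob sha (content-addressed, so safe to share across assignments within a single webhook invocation); fetchBlob stands in for the Octokit blob request shown above and is an assumption, not an existing helper.

import { createHash } from "node:crypto";

// Per-invocation cache: blob sha -> hex SHA-256 of the blob's contents.
const blobDigestCache = new Map<string, string>();

async function hashBlobCached(
  sha: string,
  fetchBlob: (sha: string) => Promise<Uint8Array>
): Promise<string> {
  const cached = blobDigestCache.get(sha);
  if (cached) return cached;
  const digest = createHash("sha256").update(await fetchBlob(sha)).digest("hex");
  blobDigestCache.set(sha, digest);
  return digest;
}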

cursoragent and others added 2 commits January 14, 2026 21:31
Co-authored-by: jon <jon@jonbell.net>
@argos-ci

argos-ci bot commented Jan 14, 2026

The latest updates on your projects.

Build: default (Inspect)
Status: ⚠️ Changes detected (Review), 61 changed, 15 removed
Updated (UTC): Jan 14, 2026, 10:46 PM
