Skip to content

Conversation

Rajgupta36
Copy link
Collaborator

@Rajgupta36 Rajgupta36 commented Aug 6, 2025

closes(#2078 )

Checklist

  • Added models IssueComment and interested contributors they both only tied to mentorship app
  • Created command to sync issue comment related to mentorship and find interested contributers.

Preview

image

@Rajgupta36 Rajgupta36 requested a review from arkid15r as a code owner August 6, 2025 13:49
Copy link
Contributor

coderabbitai bot commented Aug 6, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Summary by CodeRabbit

  • New Features
    • GitHub issue comments are now synced and linked to issues.
    • Issues show associated comments, including the most recent one.
    • Automatically detects contributor interest on mentorship issues (e.g., comments containing “interested”) and records it.
    • New admin pages for managing GitHub comments and recorded contributor interests.
  • Chores
    • Added a Makefile target to update mentorship-related issue comments and interests.
  • Admin
    • Streamlined issue list display in admin by removing creation date.

Walkthrough

Adds a Comment model and migration, links comments to Issue (M2M) with a latest_comment property, exports and registers Comment and IssueUserInterest in admin, implements sync_issue_comments and a mentorship management command to sync comments and register interest, updates Makefiles, and removes created_at from IssueAdmin.list_display.

Changes

Cohort / File(s) Summary
GitHub comment model & relation
backend/apps/github/models/comment.py, backend/apps/github/migrations/0034_comment_issue_comments.py, backend/apps/github/models/issue.py, backend/apps/github/models/__init__.py
Add Comment model (github_id, author FK, body, created_at, updated_at), migration creating the model and adding Issue.comments M2M, add Issue.latest_comment property, re-export Comment from models package.
GitHub sync logic
backend/apps/github/common.py
Add sync_issue_comments(gh_client, issue: Issue) to fetch new GitHub comments since the issue's latest local comment, deduplicate, create/update Comment instances (bulk-save) and associate them with the Issue; add Comment import and logging/error handling.
GitHub admin
backend/apps/github/admin/comment.py, backend/apps/github/admin/issue.py, backend/apps/github/admin/__init__.py
Add CommentAdmin (list_display, list_filter, search_fields), remove created_at from IssueAdmin.list_display, export CommentAdmin from admin package.
Mentorship models & admin
backend/apps/mentorship/models/issue_user_interest.py, backend/apps/mentorship/migrations/0004_issueuserinterest.py, backend/apps/mentorship/models/__init__.py, backend/apps/mentorship/admin/issue_user_interest.py, backend/apps/mentorship/admin/__init__.py
Add IssueUserInterest model (module, issue, user) with unique constraint and str, migration to create it, re-export model, and add/register IssueUserInterestAdmin.
Management command
backend/apps/mentorship/management/commands/mentorship_update_comments.py
Add mentorship_update_comments Django command: iterate published modules, collect repo IDs, query open issues, call sync_issue_comments, detect "interested" via regex in issue comments, and bulk-create IssueUserInterest records.
Makefile changes
backend/apps/mentorship/Makefile, backend/Makefile
Add mentorship-update-comments Makefile target and include backend/apps/mentorship/Makefile into top-level backend/Makefile.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • arkid15r
  • kasya
✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a89007 and 81a8713.

📒 Files selected for processing (4)
  • backend/apps/github/admin/issue.py (2 hunks)
  • backend/apps/github/common.py (3 hunks)
  • backend/apps/github/migrations/0034_issue_interested_users.py (1 hunks)
  • backend/apps/github/models/issue.py (1 hunks)
🔇 Additional comments (7)
backend/apps/github/migrations/0034_issue_interested_users.py (1)

1-22: LGTM! Well-structured Django migration.

The migration correctly adds the ManyToManyField with appropriate configuration including blank=True for optional relationships, proper related_name for reverse queries, and verbose_name for admin display.

backend/apps/github/models/issue.py (1)

73-79: LGTM! Proper ManyToManyField implementation.

The interested_users field is correctly configured and positioned within the M2Ms section. The field configuration matches the migration and follows Django conventions.

backend/apps/github/admin/issue.py (2)

12-18: LGTM! Appropriate admin field configuration.

Adding interested_users to autocomplete_fields enables efficient user selection in the Django admin interface.


25-29: LGTM! Useful filter addition.

Adding created_at to list_filter provides valuable filtering capability for issues in the admin interface.

backend/apps/github/common.py (3)

6-6: LGTM! Necessary import for regex functionality.

The regex import is appropriately added to support pattern matching for interested user detection.


131-138: LGTM! Good refactoring of issue creation.

Moving issue creation outside the milestone conditional block improves code clarity and ensures the issue object is available for subsequent processing.


179-179: Consider API rate limiting implications.

The get_comments() call for every issue could quickly consume GitHub API rate limits, especially for repositories with many issues and comments.

Consider implementing:

  1. Comment pagination handling
  2. Caching mechanism for recently processed comments
  3. Rate limiting checks before making API calls
  4. Only processing comments newer than a certain threshold

Copy link
Collaborator

@arkid15r arkid15r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please apply some separation of concerns here:

logger.info("Skipping issues sync for %s", repository.name)

try:
comments = gh_issue.get_comments()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's separate mentorship/github sync logic.
This file should have 0 knowledge about mentorship app. It should be able to sync issue comments though (when needed). We need to sync issue comments for projects that participate in currently active mentorship programs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to sync new comments only.

Comment on lines 160 to 181
interest_patterns = [
r"assign.*me",
r"i(?:'d| would)? like to work on",
r"can i work on",
r"i(?:'ll| will)? take",
r"i want to work on",
r"i am interested",
r"can i be assigned",
r"please assign.*me",
r"i can (?:help|work|fix|handle)",
r"let me (?:work|take|handle)",
r"i(?:'ll| will).*(?:fix|handle|work)",
r"assign.*to.*me",
r"i volunteer",
r"count me in",
r"i(?:'m| am) up for",
r"i could work",
r"happy.*work",
r"i(?:'d| would) love to work",
]
if any(re.search(pattern, body) for pattern in interest_patterns):
user_obj = User.update_data(comment.user)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic should be extracted to mentorship app job that's responsible for issue processing.

@@ -70,6 +70,13 @@ class Meta:
)

# M2Ms.
interested_users = models.ManyToManyField(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not that simple.
Let's create a through table for program/user/issue combinations as within time issue can be involved in multiple different programs.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

♻️ Duplicate comments (1)
backend/apps/github/common.py (1)

233-239: Neutralize mentorship mention in docstring (this module should stay app-agnostic).

Per earlier feedback, the github app shouldn’t carry mentorship-specific semantics. The function is already generic; let’s update the docstring to match.

-def sync_issue_comments(gh_app, issue: Issue):
-    """Sync new comments for a mentorship program specific issue on-demand.
+def sync_issue_comments(gh_app, issue: Issue):
+    """Sync new comments for a given GitHub issue on-demand.
🧹 Nitpick comments (14)
backend/manage.py (1)

8-11: Guard against missing dependency and prefer robust .env discovery

Unconditional import will break manage.py in environments where python-dotenv isn’t installed. Also, prefer find_dotenv() so local runs from different CWDs can locate backend/.env reliably.

Apply this diff to make the import fail-soft and to discover .env via find_dotenv without overriding existing env vars:

-from dotenv import load_dotenv
-
-load_dotenv()
+try:
+    from dotenv import load_dotenv, find_dotenv
+except ImportError:
+    load_dotenv = None
+
+if load_dotenv:
+    load_dotenv(find_dotenv(), override=False)
backend/Makefile (1)

3-3: Optional: Align target naming and wire into update-data

Most targets in this Makefile use hyphenated names. Consider adding a hyphenated alias for sync_mentorship_issue_comments for consistency and optionally include it in update-data if this should be part of routine data refresh.

Example adjustments (outside this file for the alias) in backend/apps/mentorship/Makefile:

# Alias to match hyphenated naming convention
sync-mentorship-issue-comments: sync_mentorship_issue_comments
	@:

Then, to include it in update-data (in this file):

 update-data: \
 	github-update-owasp-organization \
 	owasp-scrape-chapters \
 	owasp-scrape-committees \
 	owasp-scrape-projects \
 	github-add-related-repositories \
 	github-update-related-organizations \
 	github-update-users \
 	owasp-aggregate-projects \
 	owasp-update-events \
 	owasp-sync-posts \
 	owasp-update-sponsors \
-	slack-sync-data
+	slack-sync-data \
+	sync_mentorship_issue_comments

If this should not run by default, skip wiring it into update-data and keep it manual.

backend/apps/github/admin/__init__.py (1)

4-4: Admin exposure is fine; consider truncating comment body in list_display

Importing IssueCommentAdmin is correct. As a follow-up, consider truncating the body column in the admin to avoid very wide rows and heavy rendering when comments are long.

Suggested change in backend/apps/github/admin/issue_comment.py (outside this file):

import textwrap
from django.contrib import admin
from backend.apps.github.models import IssueComment

@admin.register(IssueComment)
class IssueCommentAdmin(admin.ModelAdmin):
    list_display = ("short_body", "issue", "author", "created_at", "updated_at")
    list_filter = ("created_at", "updated_at")
    search_fields = ("body", "issue__title", "author__login")

    def short_body(self, obj):
        return textwrap.shorten(obj.body or "", width=80, placeholder="…")
    short_body.short_description = "body"
backend/apps/github/admin/issue_comment.py (2)

1-1: Fix inaccurate module docstring (refers to Issue instead of IssueComment).

The docstring should reflect that this admin manages IssueComment, not Issue.

-"""GitHub app Issue model admin."""
+"""GitHub app IssueComment model admin."""

11-19: Improve admin usability and performance (truncate body, add select_related, search by author login).

Listing full comment bodies can be unwieldy and costly. Also, resolving FK fields without select_related causes N+1 queries. Add a short_body column, use select_related, and expand search fields.

 class IssueCommentAdmin(admin.ModelAdmin):
     """Admin for IssueComment model."""
 
-    list_display = (
-        "body",
+    list_display = (
+        "short_body",
         "issue",
         "author",
         "created_at",
         "updated_at",
     )
     list_filter = ("created_at", "updated_at")
-    search_fields = ("body", "issue__title")
+    search_fields = ("body", "issue__title", "author__login")
+    list_select_related = ("issue", "author")
+    date_hierarchy = "created_at"
+    ordering = ("-created_at",)
+
+    def short_body(self, obj):
+        return (obj.body[:80] + "…") if obj.body and len(obj.body) > 80 else obj.body
+    short_body.short_description = "Body"
backend/apps/mentorship/Makefile (1)

1-3: Declare the target as phony and keep linter noise contained.

Add a .PHONY declaration for clarity; for the checkmake warnings about all/clean/test on this included Makefile, prefer excluding it from standalone linting rather than adding unrelated targets here.

+ .PHONY: sync_mentorship_issue_comments
 sync_mentorship_issue_comments:
 	@echo "Syncing Github Comments related to issues"
 	@CMD="python manage.py sync_mentorship_issue_comments" $(MAKE) exec-backend-command
backend/apps/github/common.py (1)

261-283: Optional: keep the in-memory dedupe set in sync after inserts.

Not strictly necessary, but adding new IDs to existing_github_ids reduces any chance of processing duplicates within the same iterator window.

         for gh_comment in gh_comments:
             if gh_comment.id in existing_github_ids:
                 logger.info("Skipping existing comment %s", gh_comment.id)
                 continue
@@
                     comment_obj = IssueComment.update_data(gh_comment, issue, author_obj)
                     if comment_obj:
+                        existing_github_ids.add(gh_comment.id)
                         comments_synced += 1
backend/apps/github/migrations/0034_issuecomment.py (1)

22-26: Index timestamps used for ordering/filtering to improve query performance.

created_at is used for ordering (and likely filtering). Adding db_index=True on created_at/updated_at can materially speed up admin and sync queries on large tables. Since this is a new migration, consider amending it before merge.

-                ("created_at", models.DateTimeField()),
-                ("updated_at", models.DateTimeField()),
+                ("created_at", models.DateTimeField(db_index=True)),
+                ("updated_at", models.DateTimeField(db_index=True)),

Alternatively, add an index combining issue + created_at for common access patterns:

options = {
    "indexes": [
        models.Index(fields=["issue", "-created_at"], name="issue_cmt_issue_created_idx"),
    ],
}

I can prepare the model/migration adjustments if you prefer.

backend/apps/github/models/issue_comment.py (1)

33-35: Truncate body in str to avoid overly long representations.

Long comment bodies can bloat logs/admin lists.

-        return f"{self.issue} - {self.author} - {self.body}"
+        safe_author = self.author or "Unknown"
+        snippet = (self.body or "").replace("\n", " ")[:80]
+        return f"{self.issue} - {safe_author} - {snippet}"
backend/apps/mentorship/migrations/0004_participantinterest.py (1)

46-49: Prefer UniqueConstraint over unique_together for forward-compatibility.

Django recommends using models.UniqueConstraint in Meta.constraints over unique_together. This is optional for now; consider using a named UniqueConstraint in future migrations for clarity and better schema control.

backend/apps/mentorship/models/interested_contributors.py (2)

21-24: Avoid heavy DB work in str; limit output and queries.

Joining all user logins can be slow and produce very long strings in admin lists. Show a small sample plus count.

-    def __str__(self):
-        """Return a human-readable representation of the participant interest."""
-        user_list = ", ".join(self.users.values_list("login", flat=True))
-        return f"Users [{user_list}] interested in '{self.issue.title}' for {self.program.name}"
+    def __str__(self):
+        """Return a human-readable representation of the participant interest."""
+        # Show up to 5 logins and the total count.
+        logins = list(self.users.values_list("login", flat=True)[:5])
+        total = self.users.count()
+        suffix = f" (+{total - len(logins)})" if total > len(logins) else ""
+        users_str = ", ".join(logins) + suffix
+        return f"Users [{users_str}] interested in '{self.issue.title}' for {self.program.name}"

17-20: Optional: use UniqueConstraint instead of unique_together.

Keeping aligned with Django’s modern API improves readability and migration diffs.

Example:

class Meta:
    constraints = [
        models.UniqueConstraint(fields=["program", "issue"], name="uniq_participant_interest_program_issue"),
    ]
    verbose_name_plural = "Participant Interests"

Note: Changing this would require a follow-up migration.

backend/apps/mentorship/management/commands/sync_mentorship_issue_comments.py (2)

59-61: Grammar: use “has” (singular) instead of “have”.

Minor readability fix in output.

-                self.stdout.write(
-                    f"Program '{program.name}' have {program_repos.count()} repositories."
-                )
+                self.stdout.write(
+                    f"Program '{program.name}' has {program_repos.count()} repositories."
+                )

73-74: Incomplete status message.

The sentence is cut off. Consider including the repository count for context.

-                self.stdout.write(f"Found {relevant_issues.count()} open issues across ")
+                self.stdout.write(
+                    f"Found {relevant_issues.count()} open issues across {program_repos.count()} repositories."
+                )
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1ee00ae and 5fd03c7.

📒 Files selected for processing (16)
  • backend/Makefile (1 hunks)
  • backend/apps/github/admin/__init__.py (1 hunks)
  • backend/apps/github/admin/issue.py (1 hunks)
  • backend/apps/github/admin/issue_comment.py (1 hunks)
  • backend/apps/github/common.py (2 hunks)
  • backend/apps/github/migrations/0034_issuecomment.py (1 hunks)
  • backend/apps/github/models/__init__.py (1 hunks)
  • backend/apps/github/models/issue.py (0 hunks)
  • backend/apps/github/models/issue_comment.py (1 hunks)
  • backend/apps/mentorship/Makefile (1 hunks)
  • backend/apps/mentorship/admin/interested_contributors.py (1 hunks)
  • backend/apps/mentorship/management/commands/sync_mentorship_issue_comments.py (1 hunks)
  • backend/apps/mentorship/migrations/0004_participantinterest.py (1 hunks)
  • backend/apps/mentorship/models/__init__.py (1 hunks)
  • backend/apps/mentorship/models/interested_contributors.py (1 hunks)
  • backend/manage.py (1 hunks)
💤 Files with no reviewable changes (1)
  • backend/apps/github/models/issue.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/apps/github/admin/issue.py
🧰 Additional context used
🧬 Code Graph Analysis (11)
backend/apps/github/models/issue_comment.py (1)
backend/apps/github/models/issue.py (1)
  • update_data (187-211)
backend/apps/github/admin/issue_comment.py (1)
backend/apps/github/models/issue_comment.py (1)
  • IssueComment (6-35)
backend/apps/github/models/__init__.py (1)
backend/apps/github/models/issue_comment.py (1)
  • IssueComment (6-35)
backend/apps/github/admin/__init__.py (1)
backend/apps/github/admin/issue_comment.py (1)
  • IssueCommentAdmin (8-19)
backend/apps/mentorship/models/__init__.py (1)
backend/apps/mentorship/models/interested_contributors.py (1)
  • ParticipantInterest (6-24)
backend/apps/github/migrations/0034_issuecomment.py (1)
backend/apps/mentorship/migrations/0004_participantinterest.py (1)
  • Migration (7-51)
backend/apps/mentorship/migrations/0004_participantinterest.py (1)
backend/apps/github/migrations/0034_issuecomment.py (1)
  • Migration (7-45)
backend/apps/mentorship/admin/interested_contributors.py (1)
backend/apps/mentorship/models/interested_contributors.py (1)
  • ParticipantInterest (6-24)
backend/apps/mentorship/management/commands/sync_mentorship_issue_comments.py (4)
backend/apps/github/common.py (1)
  • sync_issue_comments (232-316)
backend/apps/github/models/issue.py (1)
  • Issue (18-211)
backend/apps/mentorship/models/interested_contributors.py (1)
  • ParticipantInterest (6-24)
backend/apps/mentorship/models/program.py (2)
  • Program (12-74)
  • ProgramStatus (19-22)
backend/apps/github/common.py (2)
backend/apps/github/models/issue_comment.py (2)
  • IssueComment (6-35)
  • update_data (19-31)
backend/apps/github/models/user.py (2)
  • User (19-153)
  • update_data (124-153)
backend/apps/mentorship/models/interested_contributors.py (2)
backend/apps/github/models/issue.py (1)
  • Meta (24-30)
backend/apps/github/models/common.py (1)
  • title (40-44)
🪛 checkmake (0.2.2)
backend/apps/mentorship/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (9)
backend/manage.py (1)

8-11: python-dotenv is declared in backend dependencies — no action required

Confirmed: python-dotenv is listed in the backend manifests.

  • backend/pyproject.toml (line ~70): python-dotenv = "^1.0.1"
  • backend/poetry.lock contains python-dotenv entries
backend/Makefile (1)

3-3: Good integration: mentorship Makefile now available at top-level

Including backend/apps/mentorship/Makefile cleanly exposes the new sync target(s) to the main Makefile. No issues spotted.

backend/apps/mentorship/models/__init__.py (1)

1-1: Re-export looks good

Re-exporting ParticipantInterest at the package level makes it easy to import via mentorship.models without apparent downsides here.

backend/apps/github/models/__init__.py (1)

3-3: Exporting IssueComment at package level is appropriate

This keeps import ergonomics consistent with other models. No functional concerns.

backend/apps/github/models/issue_comment.py (1)

10-10: No accessor clash — keep related_name="comments"

Issue has no field named "comments" (it defines comments_count), and the reverse accessor issue.comments is intentionally used across the codebase and in migrations. Do not apply the suggested rename.

  • Evidence:
    • backend/apps/github/models/issue.py — defines comments_count (PositiveIntegerField) (around line 46)
    • backend/apps/github/models/issue_comment.py — ForeignKey(..., related_name="comments") (line 10)
    • backend/apps/github/migrations/0034_issuecomment.py — migration references related_name="comments" (line 39)
    • Call sites using issue.comments:
      • backend/apps/mentorship/management/commands/sync_mentorship_issue_comments.py:99
      • backend/apps/github/common.py:249
      • backend/apps/github/common.py:261

Action: ignore the original suggested diff; no change required.

Likely an incorrect or invalid review comment.

backend/apps/mentorship/migrations/0004_participantinterest.py (1)

13-51: Migration looks structurally correct.

  • FKs use CASCADE appropriately.
  • M2M to users is optional (blank=True).
  • unique_together on (program, issue) enforces the key semantic.
backend/apps/mentorship/models/interested_contributors.py (1)

6-16: LGTM on model semantics.

The relationships and related_names look appropriate for querying from Program and Issue contexts.

backend/apps/mentorship/management/commands/sync_mentorship_issue_comments.py (2)

38-44: LGTM overall command structure.

Clear flow: enumerate programs, resolve repositories, gather open issues, sync comments, and register interest.


99-99: No change required — IssueComment reverse accessor is comments.

The IssueComment FK sets related_name="comments" so issue.comments.select_related("author") is correct; do not change to issue.issue_comments.

  • Evidence: backend/apps/github/models/issue_comment.py contains:
    issue = models.ForeignKey("github.Issue", on_delete=models.CASCADE, related_name="comments")
  • Keep: backend/apps/mentorship/management/commands/sync_mentorship_issue_comments.py — the line using issue.comments.select_related("author").all() is correct.

Likely an incorrect or invalid review comment.

@Rajgupta36 Rajgupta36 requested a review from arkid15r August 14, 2025 21:02
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
backend/apps/github/common.py (2)

235-241: Remove mentorship-specific phrasing from docstring

This module should remain domain-agnostic per earlier feedback. The function indeed syncs comments generically; the docstring should reflect that.

-def sync_issue_comments(gh_app, issue: Issue):
-    """Sync new comments for a mentorship program specific issue on-demand.
+def sync_issue_comments(gh_app, issue: Issue):
+    """Sync new comments for a GitHub issue.

251-259: Fix naive vs. aware datetime comparison; avoid TypeError

You normalize since to naive UTC, but later compare gh_comment.created_at (naive) to last_comment.created_at (likely aware). Comparing naive vs. aware raises TypeError. Normalize once and reuse for both the since filter and comparisons.

Apply this diff:

-        last_comment = issue.comments.order_by("-created_at").first()
-        since = None
+        last_comment = issue.comments.order_by("-created_at").first()
+        since = None
+        last_comment_created_naive_utc = None
@@
-        if last_comment:
-            # Convert Django datetime to naive datetime for GitHub API
-            since = last_comment.created_at
-            if timezone.is_aware(since):
-                since = timezone.make_naive(since, UTC)
-            logger.info("Found last comment at: %s, fetching newer comments", since)
+        if last_comment:
+            # Convert Django datetime (likely aware) to naive UTC for GitHub API and comparisons
+            last_created_at = last_comment.created_at
+            last_comment_created_naive_utc = (
+                timezone.make_naive(last_created_at, UTC)
+                if timezone.is_aware(last_created_at)
+                else last_created_at
+            )
+            since = last_comment_created_naive_utc
+            logger.info("Found last comment at: %s, fetching newer comments", since)
@@
-            if last_comment and gh_comment.created_at <= last_comment.created_at:
+            if last_comment_created_naive_utc and gh_comment.created_at <= last_comment_created_naive_utc:
                 logger.info("Skipping comment %s - not newer than our last comment", gh_comment.id)
                 continue

Note: See the next comment for a stronger recommendation to base the logic on updated_at so edits are synced.

Also applies to: 273-275

🧹 Nitpick comments (3)
backend/apps/github/common.py (3)

6-6: Make UTC import compatible with Python < 3.11

from datetime import UTC is only available in Python 3.11+. If the project runs on 3.10 or earlier, this import will fail. Use a safe fallback to datetime.timezone.utc.

Apply this diff at the imports:

-from datetime import UTC
+try:
+    from datetime import UTC  # Python 3.11+
+except ImportError:  # Python < 3.11
+    from datetime import timezone as dt_timezone
+    UTC = dt_timezone.utc

If the project guarantees Python 3.11+, please ignore and mark this resolved. Otherwise, this prevents runtime ImportError on older environments.


271-272: Reduce log verbosity in the inner loop

These per-comment info logs will be noisy on busy issues. Use debug for “skipping existing” and “synced new comment” to keep INFO-level logs actionable.

-                logger.info("Skipping existing comment %s", gh_comment.id)
+                logger.debug("Skipping existing comment %s", gh_comment.id)
@@
-                        logger.info(
-                            "Synced new comment %s for issue #%s", gh_comment.id, issue.number
-                        )
+                        logger.debug("Synced new comment %s for issue #%s", gh_comment.id, issue.number)

Also applies to: 284-286


296-305: Return a result to aid callers and observability

Returning the number of comments synced/updated gives the caller (e.g., management command) a simple success metric and simplifies testing.

-        if comments_synced > 0:
+        if comments_synced > 0:
             logger.info(
                 "Synced %d new comments for issue #%s in %s",
                 comments_synced,
                 issue.number,
                 issue.repository.name,
             )
-        else:
-            logger.info("No new comments found for issue #%s", issue.number)
+            return comments_synced
+        else:
+            logger.info("No new comments found for issue #%s", issue.number)
+            return 0
@@
-    except UnknownObjectException as e:
+    except UnknownObjectException as e:
         logger.warning(
             "Could not access issue #%s in %s/%s. Error: %s",
             issue.number,
             issue.repository.owner.login,
             issue.repository.name,
             str(e),
         )
-    except Exception:
+        return 0
+    except Exception:
         logger.exception(
             "An unexpected error occurred during comment sync for issue #%s",
             issue.number,
         )
+        return 0

If you adopt the “updated existing comments” change, you may prefer returning a tuple (synced, updated) or a dataclass for richer telemetry.

Also applies to: 306-318

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5fd03c7 and e2b66b7.

📒 Files selected for processing (4)
  • backend/apps/github/common.py (2 hunks)
  • backend/apps/mentorship/admin/__init__.py (1 hunks)
  • backend/apps/mentorship/admin/interested_contributors.py (1 hunks)
  • backend/apps/mentorship/management/commands/sync_mentorship_issue_comments.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/apps/mentorship/management/commands/sync_mentorship_issue_comments.py
  • backend/apps/mentorship/admin/interested_contributors.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
backend/apps/mentorship/admin/__init__.py (1)
backend/apps/mentorship/admin/interested_contributors.py (1)
  • ParticipantInterestAdmin (8-19)
backend/apps/github/common.py (2)
backend/apps/github/models/issue_comment.py (2)
  • IssueComment (6-35)
  • update_data (19-31)
backend/apps/github/models/user.py (2)
  • User (19-153)
  • update_data (124-153)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (2)
backend/apps/mentorship/admin/__init__.py (2)

3-3: LGTM: Importing ParticipantInterestAdmin here aligns with the admin autodiscover pattern

Consistent with the existing approach—importing admin submodules in init to trigger registrations on autodiscover. Import order also remains alphabetical by module name.


3-3: ParticipantInterest is registered in admin — no action required

Confirmed: admin.site.register(ParticipantInterest, ParticipantInterestAdmin) is present.

Files:

  • backend/apps/mentorship/admin/interested_contributors.py — registration at line ~22
  • backend/apps/mentorship/admin/init.py — imports ParticipantInterestAdmin

Comment on lines 263 to 276
existing_github_ids = set(issue.comments.values_list("github_id", flat=True))

comments_synced = 0

gh_comments = gh_issue.get_comments(since=since) if since else gh_issue.get_comments()

for gh_comment in gh_comments:
if gh_comment.id in existing_github_ids:
logger.info("Skipping existing comment %s", gh_comment.id)
continue
if last_comment and gh_comment.created_at <= last_comment.created_at:
logger.info("Skipping comment %s - not newer than our last comment", gh_comment.id)
continue

Copy link
Contributor

@coderabbitai coderabbitai bot Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Existing comments are never updated; switch to updated_at-based sync and update-if-changed

Current logic skips any comment whose GitHub ID already exists locally. That means edited comments (updated_at newer) are never refreshed, leaving stale bodies/timestamps. Also, the GitHub API’s since for comments filters by update time, not creation time.

Refactor to:

  • Compute since from the most recent local updated_at.
  • Maintain a map of existing comment updated_at values (normalized).
  • Update an existing comment if gh_comment.updated_at is newer.

Suggested diff:

-        existing_github_ids = set(issue.comments.values_list("github_id", flat=True))
+        # Map existing comments by GitHub ID -> last local updated_at (naive UTC) for change detection
+        existing_pairs = dict(issue.comments.values_list("github_id", "updated_at"))
+        existing_updated_naive = {
+            gid: (timezone.make_naive(dt, UTC) if timezone.is_aware(dt) else dt)
+            for gid, dt in existing_pairs.items()
+        }
@@
-        comments_synced = 0
+        comments_synced = 0
+        comments_updated = 0
@@
-        gh_comments = gh_issue.get_comments(since=since) if since else gh_issue.get_comments()
+        gh_comments = gh_issue.get_comments(since=since) if since else gh_issue.get_comments()
@@
-            if gh_comment.id in existing_github_ids:
-                logger.info("Skipping existing comment %s", gh_comment.id)
-                continue
-            if last_comment and gh_comment.created_at <= last_comment.created_at:
-                logger.info("Skipping comment %s - not newer than our last comment", gh_comment.id)
-                continue
+            # If already present, update if the remote comment was edited after our local copy
+            if gh_comment.id in existing_updated_naive:
+                local_updated = existing_updated_naive[gh_comment.id]
+                if gh_comment.updated_at > local_updated:
+                    author_obj = User.update_data(gh_comment.user)
+                    if author_obj:
+                        try:
+                            IssueComment.update_data(gh_comment, issue, author_obj)
+                            comments_updated += 1
+                            logger.info("Updated existing comment %s for issue #%s", gh_comment.id, issue.number)
+                        except Exception:
+                            logger.exception(
+                                "Failed to update comment %s for issue #%s",
+                                gh_comment.id,
+                                issue.number,
+                            )
+                    else:
+                        logger.warning("Could not update author for comment %s", gh_comment.id)
+                else:
+                    logger.debug("Skipping unchanged existing comment %s", gh_comment.id)
+                continue
+
+            # New comment (not seen locally)
+            if last_comment_created_naive_utc and gh_comment.created_at <= last_comment_created_naive_utc:
+                logger.debug("Skipping comment %s - not newer than our last known created_at", gh_comment.id)
+                continue
@@
-                        logger.info(
-                            "Synced new comment %s for issue #%s", gh_comment.id, issue.number
-                        )
+                        logger.info("Synced new comment %s for issue #%s", gh_comment.id, issue.number)

And adjust the computation of since to be based on updated_at:

-        last_comment = issue.comments.order_by("-created_at").first()
+        last_comment = issue.comments.order_by("-updated_at").first()

Finally, consider reflecting comments_updated in the summary log (see next comment for a return value suggestion).

Also applies to: 269-293

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are interested only to find user who are interested to work on the issue m so maybe we can ignore update comments. Am I right @arkid15r

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (5)
backend/apps/github/common.py (5)

249-257: Build repo_full_name robustly (prefer Repository.path; don’t require owner to exist)

Owner can be nullable and Repository often exposes a precomputed path. Current early-return on missing owner prevents syncing when path could suffice.

-        if not repo.owner:
-            logger.warning("Repository for issue #%s has no owner, skipping", issue.number)
-            return
-
-        repo_full_name = f"{repo.owner.login}/{repo.name}"
+        # Prefer model-provided path; fallback to owner_login/name if available
+        repo_full_name = getattr(repo, "path", None)
+        if not repo_full_name:
+            owner_login = getattr(repo, "owner_login", None) or (
+                getattr(repo.owner, "login", None) if getattr(repo, "owner", None) else None
+            )
+            if not owner_login:
+                logger.warning("Repository for issue #%s missing owner login, skipping", issue.number)
+                return
+            repo_full_name = f"{owner_login}/{repo.name}"
         logger.info("Fetching repository: %s", repo_full_name)

312-318: Avoid referencing repo_full_name if it’s not defined (safer logging)

If an exception occurs before repo_full_name is set, this will raise UnboundLocalError while logging.

-        logger.warning(
-            "Could not access issue #%s in %s. Error: %s",
-            issue.number,
-            repo_full_name,
-            str(e),
-        )
+        safe_repo_ref = 'unknown'
+        try:
+            safe_repo_ref = repo_full_name  # set earlier in the try-block
+        except NameError:
+            safe_repo_ref = getattr(getattr(issue, "repository", None), "path", "unknown")
+        logger.warning(
+            "Could not access issue #%s in %s. Error: %s",
+            issue.number,
+            safe_repo_ref,
+            str(e),
+        )

259-267: Fix naive vs aware datetime handling for ‘since’ to avoid TypeError at runtime

Django stores aware datetimes (commonly UTC), PyGithub returns naive UTC. Comparing them or passing mixed types can raise TypeError. Normalize to a single naive UTC value and log that value.

-        last_comment = issue.comments.order_by("-created_at").first()
-        since = None
+        last_comment = issue.comments.order_by("-created_at").first()
+        since = None
+        since_naive_utc = None
 
         if last_comment:
-            since = last_comment.created_at
-            logger.info("Found last comment at: %s, fetching newer comments", since)
+            since = last_comment.created_at
+            # Normalize to naive UTC for GitHub API and comparisons
+            since_naive_utc = (
+                timezone.make_naive(since, timezone.utc) if timezone.is_aware(since) else since
+            )
+            logger.info("Found last comment at: %s, fetching newer comments", since_naive_utc)
         else:
             logger.info("No existing comments found, fetching all comments")

272-272: Pass the normalized ‘since’ to GitHub API

Ensure the call uses the normalized naive UTC value to match PyGithub/GitHub expectations.

-        gh_comments = gh_issue.get_comments(since=since) if since else gh_issue.get_comments()
+        gh_comments = gh_issue.get_comments(since=since_naive_utc) if since_naive_utc else gh_issue.get_comments()

279-281: Compare normalized datetimes to prevent naive/aware mismatch

Use the normalized timestamp for comparisons to avoid exceptions.

-            if since and gh_comment.created_at <= since:
+            if since_naive_utc and gh_comment.created_at <= since_naive_utc:
                 logger.info("Skipping comment %s - not newer than our last comment", gh_comment.id)
                 continue
🧹 Nitpick comments (4)
backend/apps/github/common.py (4)

234-240: Make the docstring app-agnostic (avoid mentorship-specific wording in common layer)

This module is intended to be generic. Suggest rewording to avoid coupling to mentorship domain.

-    """Sync new comments for a mentorship program specific issue on-demand.
+    """Sync new comments for a GitHub issue on-demand.
@@
-        issue (Issue): The local database Issue object to sync comments for.
+        issue (Issue): The local database Issue to sync comments for.
     """

276-277: Reduce log noise: downgrade “Skipping existing comment” to debug

This message can be very chatty at info level during routine syncs.

-                logger.info("Skipping existing comment %s", gh_comment.id)
+                logger.debug("Skipping existing comment %s", gh_comment.id)

268-289: Consider updating edited comments (updated_at-based sync) rather than skipping by ID

Right now, existing GitHub IDs are skipped, so edited comments are never refreshed. If consumers need accurate interest detection or timestamps, update when remote updated_at is newer. If mentorship flow truly doesn’t need edits, feel free to ignore this.

Do you want to update edited comments? If yes, I can provide a full patch to:

  • compute since from latest local updated_at,
  • map id -> local updated_at (normalized),
  • update existing rows when remote updated_at is newer,
  • keep the new-comment path as-is.

259-259: Optional: Align ‘since’ semantics with GitHub by basing on updated_at

GitHub’s “since” for issue comments filters by last update time, not creation time. If you care about edits, compute last_comment from updated_at instead of created_at.

-        last_comment = issue.comments.order_by("-created_at").first()
+        last_comment = issue.comments.order_by("-updated_at").first()
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e2b66b7 and d924096.

📒 Files selected for processing (1)
  • backend/apps/github/common.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
backend/apps/github/common.py (3)
backend/apps/github/models/issue_comment.py (2)
  • IssueComment (6-35)
  • update_data (19-31)
backend/apps/github/models/issue.py (2)
  • Issue (18-212)
  • update_data (188-212)
backend/apps/github/models/user.py (2)
  • User (19-153)
  • update_data (124-153)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (1)
backend/apps/github/common.py (1)

12-12: Import of IssueComment looks correct and scoped appropriately

No concerns with this import; aligns with the new model for persisting comments.

updated_at = models.DateTimeField()

@classmethod
def update_data(cls, gh_comment, issue_obj, author_obj):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not how update_data works. Please check other model examples. This needs bulk save support and should never create an object unless save=True is passed.

class IssueComment(models.Model):
"""Represents a comment on a GitHub issue."""

github_id = models.BigIntegerField(unique=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the values here, do you have examples?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@@ -0,0 +1,3 @@
sync_mentorship_issue_comments:
@echo "Syncing Github Comments related to issues"
@CMD="python manage.py sync_mentorship_issue_comments" $(MAKE) exec-backend-command
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the naming patterns we use for management commands.


logger = logging.getLogger(__name__)

INTEREST_PATTERNS = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not overcomplicate it and use /interested as the only pattern/command for now.

logger.info("Starting comment sync for issue #%s", issue.number)

try:
repo = issue.repository
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use full names repository

"""Sync new comments for a mentorship program specific issue on-demand.

Args:
gh_app (Github): An authenticated PyGithub instance.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure it's an app here?

gh_repo = gh_app.get_repo(repo_full_name)
gh_issue = gh_repo.get_issue(number=issue.number)

last_comment = issue.comments.order_by("-created_at").first()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this as a property for Issue -- we have latest_release, latest_pull_request as examples.

logger.info("Skipping comment %s - not newer than our last comment", gh_comment.id)
continue

author_obj = User.update_data(gh_comment.user)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how many names with _obj suffix you can find in the code base. This is redundant, they are all objects obviously. You don't even have author taken by some other value. Why this kind of naming 🤷‍♂️ ?

class ParticipantInterest(models.Model):
"""Represents users interested in a specific issue within a program."""

program = models.ForeignKey(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be a module?

from django.db import models


class ParticipantInterest(models.Model):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's with the naming here? Interested contributors, participants, interest, and users in the model eventually. Could you improve this?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (5)
backend/apps/mentorship/Makefile (1)

1-3: Confirm command naming pattern alignment across the repo

A prior review asked to follow established naming patterns for management commands. Please verify that mentorship_update_comments aligns with existing conventions.

Run:

#!/bin/bash
# List existing management commands to compare naming patterns
echo "Existing management commands:"
fd -t f 'commands/.*\.py$' backend | sed -E 's|.*/commands/([^/]+)\.py|\1|' | sort -u

echo
echo "Existing Makefile targets under backend/apps/*:"
fd -t f '^Makefile$' backend | while read -r f; do
  echo "---- $f ----"
  rg -n '^[A-Za-z0-9_.-]+:' "$f" | sed -E 's/:.*$/:/'
done
backend/apps/mentorship/models/contributor_interest.py (1)

6-16: Design note: consider a through model for per-user interest metadata

If you need to track when/how a user expressed interest (e.g., timestamp, source comment, signal type), a custom through table on users with fields like created_at, source_comment, program context is more extensible than a bare M2M.

Example (outside selected lines):

class ContributorInterestUser(models.Model):
    contributor_interest = models.ForeignKey(
        "mentorship.ContributorInterest", on_delete=models.CASCADE, related_name="user_links"
    )
    user = models.ForeignKey("github.User", on_delete=models.CASCADE)
    source_comment = models.ForeignKey(
        "github.IssueComment", on_delete=models.SET_NULL, null=True, blank=True
    )
    created_at = models.DateTimeField(auto_now_add=True)
    class Meta:
        unique_together = ("contributor_interest", "user")

class ContributorInterest(models.Model):
    ...
    users = models.ManyToManyField(
        "github.User",
        related_name="mentorship_interests",
        through="mentorship.ContributorInterestUser",
        through_fields=("contributor_interest", "user"),
    )
backend/apps/github/common.py (2)

249-255: Build repository full name robustly; avoid assuming owner is present

Repository.owner can be null and projects prefer using the model’s full path/full name. Use repository.path (or similar) when available and fall back safely. Also drop the hard early-return on missing owner.

Apply this diff:

-        if not repository.owner:
-            logger.warning("Repository for issue #%s has no owner, skipping", issue.number)
-            return
-
-        repository_full_name = f"{repository.owner.login}/{repository.name}"
-        logger.info("Fetching repository: %s", repository_full_name)
+        # Prefer stored full path (e.g., "owner/name"), else derive safely.
+        repository_full_name = getattr(repository, "path", None)
+        if not repository_full_name:
+            owner_login = getattr(getattr(repository, "owner", None), "login", None)
+            if not owner_login:
+                logger.warning(
+                    "Cannot determine repository owner login for issue #%s (repo id=%s), skipping",
+                    issue.number,
+                    getattr(repository, "id", "unknown"),
+                )
+                return
+            repository_full_name = f"{owner_login}/{repository.name}"
+        logger.info("Fetching repository: %s", repository_full_name)

259-267: Naive vs. aware datetime comparison will raise TypeError; normalize once and reuse

PyGithub often returns naive UTC datetimes; Django likely stores aware datetimes. Comparing them directly (created_at <= since) can crash. Normalize the last local timestamp to naive UTC for use in the API since and subsequent comparisons.

Apply this diff:

-        last_comment = issue.latest_comment
-        since = None
-
-        if last_comment:
-            since = last_comment.created_at
-            logger.info("Found last comment at: %s, fetching newer comments", since)
+        last_comment = issue.latest_comment
+        since = None
+        last_comment_created_naive_utc = None
+
+        if last_comment:
+            # Convert Django datetime (likely aware) to naive UTC for GitHub API and comparisons
+            last_created_at = last_comment.created_at
+            last_comment_created_naive_utc = (
+                timezone.make_naive(last_created_at, timezone.utc)
+                if timezone.is_aware(last_created_at)
+                else last_created_at
+            )
+            since = last_comment_created_naive_utc
+            logger.info("Found last comment at: %s, fetching newer comments", since)
         else:
             logger.info("No existing comments found, fetching all comments")
backend/apps/github/models/issue_comment.py (1)

3-6: Import timezone to normalize datetimes when mapping from GitHub

You’re assigning datetimes straight from PyGithub; normalize to aware UTC to avoid inconsistent storage under USE_TZ.

Apply this diff:

 from django.db import models
+from django.utils import timezone
 
 from apps.common.models import BulkSaveModel
🧹 Nitpick comments (14)
backend/apps/github/models/issue.py (2)

90-94: Docstring type nit: refer to IssueComment explicitly

Minor clarity tweak: use the actual model name in the return description.

Apply this diff:

-        Returns:
-            Comment | None: The most recently created comment, or None if no comments exist.
+        Returns:
+            IssueComment | None: The most recently created comment, or None if no comments exist.

86-95: Avoid N+1 queries when accessing latest_comment in bulk

Accessing latest_comment on each Issue instance (e.g. in admin or list templates) will emit one query per issue. To load the “latest comment” more efficiently:

• Annotate timestamps only

from django.db.models import Max
Issue.objects.annotate(
    latest_comment_at=Max("comments__created_at")
)
# Render {{ issue.latest_comment_at }} without extra queries

• Prefetch the latest comment object

from django.db.models import Prefetch
from apps.github.models.issue_comment import IssueComment

Issue.objects.prefetch_related(
    Prefetch(
        "comments",
        queryset=IssueComment.objects.order_by("-created_at")[:1],
        to_attr="latest_comment_prefetch"
    )
)
# Access issue.latest_comment_prefetch[0] if it exists

Pinpoint location:

  • backend/apps/github/models/issue.py lines 86–95 (the latest_comment property)
backend/apps/mentorship/admin/contributor_interest.py (1)

11-20: Admin UX/perf polish: use @admin.display, and avoid per-row COUNT queries

  • Prefer the modern decorator over setting short_description.
  • Counting M2M per row does an extra query per row. Annotate once in get_queryset and read the annotated value.
  • Consider raw_id_fields to avoid loading huge dropdowns for issue/module/users in forms, and list_select_related to reduce FK queries in the changelist.

Apply this diff to modernize the column definition:

 class ContributorInterestAdmin(admin.ModelAdmin):
     """ContributorInterest admin."""

     list_display = ("module", "issue", "users_count")
     search_fields = ("module__name", "users__login", "issue__title")
     list_filter = ("module", "issue")

-    def users_count(self, obj):
-        """Return the count of users interested in the issue."""
-        return obj.users.count()
-
-    users_count.short_description = "Interested Users"
+    @admin.display(description="Interested Users")
+    def users_count(self, obj):
+        """Return the count of users interested in the issue."""
+        # Prefer annotated value when available; fall back to .count()
+        return getattr(obj, "users_count", None) or obj.users.count()

Outside the selected lines, you can further optimize:

# Add near the top of the file:
from django.db.models import Count

# Inside ContributorInterestAdmin:
raw_id_fields = ("module", "issue", "users")
list_select_related = ("module", "issue")

def get_queryset(self, request):
    qs = super().get_queryset(request)
    return qs.annotate(users_count=Count("users"))
backend/apps/mentorship/Makefile (1)

1-3: Add .PHONY and fix GitHub capitalization in the message

Small hygiene improvements.

Apply this diff:

+ .PHONY: mentorship-update-comments
 mentorship-update-comments:
-	@echo "Syncing Github Comments related to issues"
+	@echo "Syncing GitHub comments related to issues"
 	@CMD="python manage.py mentorship_update_comments" $(MAKE) exec-backend-command
backend/apps/mentorship/migrations/0004_contributorinterest.py (1)

13-51: Prefer UniqueConstraint over unique_together for forward-compatibility

Django recommends UniqueConstraint; unique_together is deprecated. Not a blocker, but consider switching in a follow-up migration to avoid deprecation noise.

Example (for a future migration):

from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [("mentorship", "0004_contributorinterest")]

    operations = [
        migrations.AlterUniqueTogether(
            name="contributorinterest",
            unique_together=set(),  # remove legacy constraint
        ),
        migrations.AddConstraint(
            model_name="contributorinterest",
            constraint=models.UniqueConstraint(
                fields=["module", "issue"], name="unique_module_issue_interest"
            ),
        ),
    ]
backend/apps/mentorship/models/contributor_interest.py (3)

1-1: Align module docstring with model name

Use “Contributor interest model.” to match the class name and terminology used elsewhere.

Apply this diff:

-"""Participant interest model."""
+"""Contributor interest model."""

17-20: Model Meta: switch to UniqueConstraint

unique_together is deprecated; switching to UniqueConstraint improves clarity and future-compatibility. This will require an accompanying migration.

Apply this diff:

     class Meta:
-        unique_together = ("module", "issue")
-        verbose_name_plural = "Contributor Interests"
+        constraints = [
+            models.UniqueConstraint(
+                fields=["module", "issue"], name="unique_module_issue_interest"
+            )
+        ]
+        verbose_name_plural = "Contributor Interests"

21-24: Avoid heavy queries and long strings in str

Building a string with all user logins issues a query and can get unwieldy. Prefer a concise representation.

Apply this diff:

-    def __str__(self):
-        """Return a human-readable representation of the contributor interest."""
-        user_list = ", ".join(self.users.values_list("login", flat=True))
-        return f"Users [{user_list}] interested in '{self.issue.title}' for {self.module.name}"
+    def __str__(self):
+        """Return a concise representation."""
+        return f"Interest(issue={self.issue_id}, module={self.module_id}, users={self.users.count()})"
backend/apps/github/common.py (3)

233-240: Docstring leaks mentorship context into github.common

This module should stay generic. Reword the docstring to avoid mentorship-specific phrasing.

Apply this diff:

-    """Sync new comments for a mentorship program specific issue on-demand.
+    """Sync GitHub issue comments on-demand.

273-283: Use the normalized timestamp for comparisons and avoid mismatching created vs updated filters

GitHub’s comments API filters by updated_at when using since. At minimum, compare using the same normalized timestamp you computed; optionally, consider shifting to updated_at as the baseline.

Apply this minimal consistency fix:

-        for gh_comment in gh_comments:
+        for gh_comment in gh_comments:
             if gh_comment.id in existing_github_ids:
                 logger.info("Skipping existing comment %s", gh_comment.id)
                 continue
 
-            if since and gh_comment.created_at <= since:
+            if last_comment_created_naive_utc and gh_comment.created_at <= last_comment_created_naive_utc:
                 logger.info("Skipping comment %s - not newer than our last comment", gh_comment.id)
                 continue

Optional (recommended): base the window off updated_at to align with the API and not miss edited-but-previously-unsynced comments:

-        last_comment = issue.latest_comment
+        last_comment = issue.comments.order_by("-updated_at").first()

315-320: Enrich error log with repository context

Since we now compute repository_full_name defensively, include it in the warning for better traceability.

Apply this diff:

-        logger.warning(
-            "Could not access issue #%s. Error: %s",
-            issue.number,
-            str(e),
-        )
+        logger.warning(
+            "Could not access issue #%s in %s. Error: %s",
+            issue.number,
+            locals().get("repository_full_name", getattr(getattr(issue, "repository", None), "path", "unknown")),
+            str(e),
+        )
backend/apps/mentorship/management/commands/mentorship_update_comments.py (2)

16-18: Broaden the interested trigger to allow whitespace around the slash command

As written, only exactly "/interested" matches. Allow leading/trailing whitespace to reduce false negatives.

Apply this diff:

-INTEREST_PATTERNS = [
-    re.compile(r"^/interested$", re.IGNORECASE),
-]
+INTEREST_PATTERNS = [
+    re.compile(r"^\s*/interested\s*$", re.IGNORECASE),
+]

87-96: Trim comment body before matching; keep it cheap and robust

Trimming avoids missing valid matches due to trailing newlines/spaces from GitHub clients.

Apply this diff:

-            body = comment.body or ""
-            if any(p.search(body) for p in INTEREST_PATTERNS):
+            body = (comment.body or "").strip()
+            if any(p.search(body) for p in INTEREST_PATTERNS):
                 to_add.append(comment.author)
backend/apps/github/models/issue_comment.py (1)

20-23: Trim str to avoid dumping large bodies in admin/logs

Show a short preview to keep admin lists readable.

Apply this diff:

-    def __str__(self):
-        """Return a string representation of the issue comment."""
-        return f"{self.issue} - {self.author} - {self.body}"
+    def __str__(self):
+        """Return a concise string representation of the issue comment."""
+        preview = (self.body or "")[:80].replace("\n", " ")
+        return f"{self.issue} - {self.author} - {preview}"
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d924096 and b77a27c.

📒 Files selected for processing (10)
  • backend/apps/github/common.py (2 hunks)
  • backend/apps/github/models/issue.py (1 hunks)
  • backend/apps/github/models/issue_comment.py (1 hunks)
  • backend/apps/mentorship/Makefile (1 hunks)
  • backend/apps/mentorship/admin/__init__.py (1 hunks)
  • backend/apps/mentorship/admin/contributor_interest.py (1 hunks)
  • backend/apps/mentorship/management/commands/mentorship_update_comments.py (1 hunks)
  • backend/apps/mentorship/migrations/0004_contributorinterest.py (1 hunks)
  • backend/apps/mentorship/models/__init__.py (1 hunks)
  • backend/apps/mentorship/models/contributor_interest.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/apps/mentorship/models/init.py
  • backend/apps/mentorship/admin/init.py
🧰 Additional context used
🧬 Code Graph Analysis (4)
backend/apps/mentorship/admin/contributor_interest.py (1)
backend/apps/mentorship/models/contributor_interest.py (1)
  • ContributorInterest (6-24)
backend/apps/github/models/issue_comment.py (2)
backend/apps/common/models.py (1)
  • BulkSaveModel (10-34)
backend/apps/github/models/issue.py (4)
  • from_github (96-135)
  • bulk_save (187-189)
  • update_data (198-222)
  • save (175-184)
backend/apps/mentorship/management/commands/mentorship_update_comments.py (5)
backend/apps/github/common.py (1)
  • sync_issue_comments (233-325)
backend/apps/github/models/issue.py (1)
  • Issue (18-222)
backend/apps/mentorship/models/contributor_interest.py (1)
  • ContributorInterest (6-24)
backend/apps/mentorship/models/module.py (1)
  • Module (15-80)
backend/apps/mentorship/models/program.py (2)
  • Program (12-74)
  • ProgramStatus (19-22)
backend/apps/github/common.py (4)
backend/apps/github/models/issue_comment.py (3)
  • IssueComment (8-66)
  • update_data (46-66)
  • bulk_save (41-43)
backend/apps/github/models/issue.py (5)
  • Issue (18-222)
  • latest_comment (87-94)
  • update_data (198-222)
  • save (175-184)
  • bulk_save (187-189)
backend/apps/github/models/user.py (3)
  • User (19-153)
  • update_data (124-153)
  • bulk_save (104-106)
backend/apps/common/models.py (1)
  • bulk_save (19-34)
🪛 checkmake (0.2.2)
backend/apps/mentorship/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Run spell check
  • GitHub Check: Run pre-commit checks
  • GitHub Check: Run frontend checks
  • GitHub Check: CodeQL (python)
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (3)
backend/apps/github/models/issue.py (1)

86-95: LGTM: latest_comment is a useful, safe accessor

The implementation is concise and correct. Ordering by created_at desc and taking first is the right call.

backend/apps/mentorship/management/commands/mentorship_update_comments.py (1)

69-76: Good separation of concerns: command pulls comments via github.sync and then derives interest

This honors the earlier feedback about keeping mentorship logic out of github.common. No issues here.

backend/apps/github/models/issue_comment.py (1)

45-66: update_data bulk-friendly flow looks good

Upsert-by-github_id, populate via from_github, and defer save when requested enables bulk_save to do its job. No changes needed here.

@Rajgupta36 Rajgupta36 requested a review from arkid15r August 20, 2025 18:57
from django.db import models


class ContributorInterest(models.Model):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IssueUserInterest

issue = models.ForeignKey(
"github.Issue", on_delete=models.CASCADE, related_name="participant_interests"
)
users = models.ManyToManyField("github.User", related_name="mentorship_interests", blank=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a single user in order to keep it granular

)
users = models.ManyToManyField("github.User", related_name="mentorship_interests", blank=True)

class Meta:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meta lacks all needed fields and normally goes at the top (in our code base)

@@ -23,6 +23,7 @@ class IssueAdmin(admin.ModelAdmin):
)
list_filter = (
"state",
"created_at",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding this?

from apps.common.models import BulkSaveModel


class IssueComment(BulkSaveModel, models.Model):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design tightly couples comment to issue. Let's have Comment model instead and add m2m on Issue (and later we may use them for PRs as well if needed)

logger.warning("Issue #%s has no repository, skipping", issue.number)
return

if not repo.owner:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant we don't have repositories w/o owners in our DB.

logger = logging.getLogger(__name__)

INTEREST_PATTERNS = [
re.compile(r"^/interested$", re.IGNORECASE),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pattern is to narrow - let's check if the entire comment body just contains /interested somewhere (there may be some other words in the comment)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
backend/apps/mentorship/Makefile (1)

1-3: Add PHONY and fix GitHub casing

Small polish for make hygiene and messaging.

+.PHONY: mentorship-update-comments
 mentorship-update-comments:
-	@echo "Syncing Github Comments related to issues"
+	@echo "Syncing GitHub comments related to issues"
 	@CMD="python manage.py mentorship_update_comments --verbosity 2" $(MAKE) exec-backend-command

The previous review about following naming patterns for management commands is satisfied; thanks for aligning with our conventions.

🧹 Nitpick comments (9)
backend/apps/github/models/issue.py (1)

93-102: Annotate return type and ensure fast path relies on indexed field

Add a return type for clarity. Also ensure Comment.created_at is indexed (proposed in migration/model review) since this orders on created_at frequently.

-    def latest_comment(self):
+    def latest_comment(self) -> "Comment | None":
         """Get the latest comment for this issue.
backend/apps/github/models/comment.py (3)

23-25: Safer str: include id and avoid long bodies

Large bodies can clutter the admin list and logs. Consider a shorter preview and include the GitHub id.

-        return f"{self.author} - {self.body[:40]}"
+        return f"#{self.github_id} • {self.author or 'Unknown'} • {self.body[:60]}"

27-41: Ensure timezone-aware datetimes

PyGithub typically returns timezone-aware datetimes, but it’s worth guarding against naive datetimes to avoid Django warnings and incorrect comparisons when computing “since”.

Optionally coerce to aware if naive:

+from django.utils import timezone
...
         for model_field, gh_field in field_mapping.items():
             value = getattr(gh_comment, gh_field, None)
             if value is not None:
+                if isinstance(value, datetime.datetime) and timezone.is_naive(value):
+                    value = timezone.make_aware(value, timezone=timezone.utc)
                 setattr(self, model_field, value)

Please confirm whether gh_comment.created_at/updated_at are guaranteed aware in your GitHub client wrapper. If not, the above guard is recommended.


47-71: Map author when not explicitly provided; optional optimization

If author isn’t passed, you could resolve it from gh_comment.user to reduce call-site burden. Also consider select_for_update when updating in high-concurrency syncs (optional).

-    def update_data(gh_comment, *, author=None, save: bool = True):
+    def update_data(gh_comment, *, author=None, save: bool = True):
         """Update or create a Comment instance from a GitHub comment object.
@@
-        try:
+        try:
             comment = Comment.objects.get(github_id=gh_comment.id)
         except Comment.DoesNotExist:
             comment = Comment(github_id=gh_comment.id)
 
-        comment.from_github(gh_comment, author=author)
+        # Fallback author mapping if not provided by caller.
+        if author is None and getattr(gh_comment, "user", None):
+            # Expect a caller-level mapping to github.User before saving if needed.
+            # This keeps Comment decoupled from User.update_data to avoid circular deps.
+            author = getattr(gh_comment, "user", None)
+        comment.from_github(gh_comment, author=author)

If you prefer centralizing user mapping in the sync layer (backend/apps/github/common.py), ignore this and keep the current signature.

backend/apps/github/admin/comment.py (1)

11-18: Tighten admin list UX and performance

  • Avoid rendering full bodies in lists.
  • Add github_id to display and search for quick lookups.
  • Use raw_id_fields for author to prevent huge dropdowns.
  • Date hierarchy helps navigation.
 class CommentAdmin(admin.ModelAdmin):
     """Admin for Comment model."""
 
-    list_display = (
-        "body",
-        "author",
-        "created_at",
-        "updated_at",
-    )
-    list_filter = ("created_at", "updated_at")
-    search_fields = ("body", "author__login")
+    list_display = ("github_id", "author", "short_body", "created_at", "updated_at")
+    list_filter = ("created_at", "updated_at")
+    search_fields = ("github_id", "body", "author__login")
+    raw_id_fields = ("author",)
+    date_hierarchy = "created_at"
+
+    def short_body(self, obj):
+        return (obj.body or "")[:80]
+    short_body.short_description = "body"
backend/apps/mentorship/admin/issue_user_interest.py (1)

11-13: Optional: improve discoverability and performance in admin.

  • Show the interested user in the list.
  • Reduce N+1 queries using list_select_related.

Apply this diff:

-    list_display = ("module", "issue")
+    list_display = ("module", "issue", "user")
@@
-    list_filter = ("module",)
+    list_filter = ("module",)
+    list_select_related = ("module", "issue", "user")

If your related admins have search_fields defined, consider also:

+    autocomplete_fields = ("module", "issue", "user")
backend/apps/mentorship/models/issue_user_interest.py (3)

9-12: Prefer UniqueConstraint over unique_together (modern style, clearer error surfaces).

Functionally equivalent here, but UniqueConstraint is the recommended approach and allows naming, deferrable options, and better control.

Apply this diff:

 class IssueUserInterest(models.Model):
@@
-    class Meta:
-        verbose_name = "Issue User Interest"
-        verbose_name_plural = "Issue User Interests"
-        unique_together = ("module", "issue", "user")
+    class Meta:
+        verbose_name = "Issue User Interest"
+        verbose_name_plural = "Issue User Interests"
+        constraints = [
+            models.UniqueConstraint(
+                fields=["module", "issue", "user"],
+                name="uniq_issue_user_interest",
+            )
+        ]

Note: This change will require a follow-up migration.


1-2: Nit: file-level docstring can be more specific.

Align with the model’s purpose to improve readability.

Apply this diff:

-"""Participant interest model."""
+"""Model for tracking when a user expresses interest in a GitHub issue within a mentorship module."""

20-24: Optional: capture creation timestamp for analytics and auditing.

Storing when the interest was registered can help with reporting and deduplication strategies.

Apply this diff:

     user = models.ForeignKey(
         "github.User",
         related_name="mentorship_interests",
         on_delete=models.CASCADE,
     )
+
+    created_at = models.DateTimeField(auto_now_add=True)

Requires a migration.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b77a27c and bc1d0d6.

📒 Files selected for processing (15)
  • backend/apps/github/admin/__init__.py (1 hunks)
  • backend/apps/github/admin/comment.py (1 hunks)
  • backend/apps/github/admin/issue.py (1 hunks)
  • backend/apps/github/common.py (2 hunks)
  • backend/apps/github/migrations/0034_comment_issue_comments.py (1 hunks)
  • backend/apps/github/models/__init__.py (1 hunks)
  • backend/apps/github/models/comment.py (1 hunks)
  • backend/apps/github/models/issue.py (2 hunks)
  • backend/apps/mentorship/Makefile (1 hunks)
  • backend/apps/mentorship/admin/__init__.py (1 hunks)
  • backend/apps/mentorship/admin/issue_user_interest.py (1 hunks)
  • backend/apps/mentorship/management/commands/mentorship_update_comments.py (1 hunks)
  • backend/apps/mentorship/migrations/0004_issueuserinterest.py (1 hunks)
  • backend/apps/mentorship/models/__init__.py (1 hunks)
  • backend/apps/mentorship/models/issue_user_interest.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • backend/apps/mentorship/admin/init.py
  • backend/apps/github/admin/issue.py
  • backend/apps/github/models/init.py
  • backend/apps/github/admin/init.py
  • backend/apps/github/common.py
  • backend/apps/mentorship/management/commands/mentorship_update_comments.py
🧰 Additional context used
🧬 Code graph analysis (7)
backend/apps/github/admin/comment.py (1)
backend/apps/github/models/comment.py (1)
  • Comment (8-70)
backend/apps/mentorship/migrations/0004_issueuserinterest.py (1)
backend/apps/github/migrations/0034_comment_issue_comments.py (1)
  • Migration (7-46)
backend/apps/github/migrations/0034_comment_issue_comments.py (1)
backend/apps/mentorship/migrations/0004_issueuserinterest.py (1)
  • Migration (7-54)
backend/apps/github/models/comment.py (2)
backend/apps/common/models.py (1)
  • BulkSaveModel (10-34)
backend/apps/github/models/issue.py (5)
  • Meta (24-30)
  • from_github (103-142)
  • bulk_save (194-196)
  • update_data (205-229)
  • save (182-191)
backend/apps/mentorship/admin/issue_user_interest.py (1)
backend/apps/mentorship/models/issue_user_interest.py (1)
  • IssueUserInterest (6-30)
backend/apps/mentorship/models/issue_user_interest.py (1)
backend/apps/github/models/issue.py (1)
  • Meta (24-30)
backend/apps/mentorship/models/__init__.py (1)
backend/apps/mentorship/models/issue_user_interest.py (1)
  • IssueUserInterest (6-30)
🪛 checkmake (0.2.2)
backend/apps/mentorship/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (5)
backend/apps/github/admin/comment.py (1)

21-21: LGTM: admin registration

Registration looks good and matches the model wiring.

backend/apps/mentorship/models/__init__.py (1)

1-1: Re-export looks good; package-level exposure is appropriate.

This makes IssueUserInterest available via apps.mentorship.models, which is consistent with how other models here are surfaced.

backend/apps/mentorship/models/issue_user_interest.py (1)

6-24: Model structure and relations look correct.

The FK targets and related_names align with the intended traversals, and the uniqueness across (module, issue, user) captures the deduping requirement.

backend/apps/mentorship/migrations/0004_issueuserinterest.py (2)

7-54: Migration matches the model; dependencies look correct.

Creates the model with the expected FKs and the uniqueness constraint. Good to go.


48-52: Verify model⇄migration sync (django-configurations)

Before running the checks, ensure your environment is configured for django-configurations. Otherwise, manage.py will error with ModuleNotFoundError: No module named 'configurations'.

  1. In your shell, set:
    • export DJANGO_SETTINGS_MODULE=your_project.settings
    • export DJANGO_CONFIGURATION=Dev (or whichever configuration you use)
  2. From the project root, run:
    • python manage.py makemigrations --check
    • python manage.py migrate --plan
  3. Smoke-check the admin import:
    python - <<'PY'
    import django, importlib, sys
    django.setup()
    try:
        importlib.import_module("apps.mentorship.admin.issue_user_interest")
        print("Admin import OK")
    except Exception as e:
        print("Admin import failed:", e)
        sys.exit(1)
    PY

If you see no pending migrations, an empty migrate plan, and no import errors, your migration is in sync.


Optional: Consider migrating to a named UniqueConstraint in the model rather than using unique_together. If you do:

  • Add the UniqueConstraint to your model’s Meta.constraints.
  • Generate and commit a follow-up migration so the DB constraint is explicitly named and matches the model.

Comment on lines 8 to 18
class Comment(BulkSaveModel, models.Model):
"""Represents a comment on a GitHub entity (Issue, PR, etc.)."""

github_id = models.BigIntegerField(unique=True)
author = models.ForeignKey(
"github.User", on_delete=models.SET_NULL, null=True, related_name="comments"
)
body = models.TextField()
created_at = models.DateTimeField()
updated_at = models.DateTimeField()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Scope Comment to Issue (or use GFK) and index timestamps

Align the model to one-to-many Issue↔Comment or adopt a GFK for polymorphism. For the mentorship workflow (issue comments), a direct FK is simplest.

-class Comment(BulkSaveModel, models.Model):
-    """Represents a comment on a GitHub entity (Issue, PR, etc.)."""
+class Comment(BulkSaveModel, models.Model):
+    """Represents a comment on a GitHub Issue."""
 
     github_id = models.BigIntegerField(unique=True)
     author = models.ForeignKey(
         "github.User", on_delete=models.SET_NULL, null=True, related_name="comments"
     )
+    issue = models.ForeignKey(
+        "github.Issue", on_delete=models.CASCADE, related_name="comments"
+    )
     body = models.TextField()
-    created_at = models.DateTimeField()
-    updated_at = models.DateTimeField()
+    created_at = models.DateTimeField(db_index=True)
+    updated_at = models.DateTimeField(db_index=True)
 
     class Meta:
         verbose_name = "Comment"
         verbose_name_plural = "Comments"
+        ordering = ("-created_at",)

If you intend to support PR comments in the near term, replace the FK with a GenericForeignKey instead. I can draft that if needed.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
class Comment(BulkSaveModel, models.Model):
"""Represents a comment on a GitHub entity (Issue, PR, etc.)."""
github_id = models.BigIntegerField(unique=True)
author = models.ForeignKey(
"github.User", on_delete=models.SET_NULL, null=True, related_name="comments"
)
body = models.TextField()
created_at = models.DateTimeField()
updated_at = models.DateTimeField()
class Comment(BulkSaveModel, models.Model):
"""Represents a comment on a GitHub Issue."""
github_id = models.BigIntegerField(unique=True)
author = models.ForeignKey(
"github.User", on_delete=models.SET_NULL, null=True, related_name="comments"
)
issue = models.ForeignKey(
"github.Issue", on_delete=models.CASCADE, related_name="comments"
)
body = models.TextField()
created_at = models.DateTimeField(db_index=True)
updated_at = models.DateTimeField(db_index=True)
class Meta:
verbose_name = "Comment"
verbose_name_plural = "Comments"
ordering = ("-created_at",)
🤖 Prompt for AI Agents
In backend/apps/github/models/comment.py around lines 8 to 18, the Comment model
is missing a relation to the Issue (or a GFK for polymorphism) and should index
timestamp fields; add a foreign key field like issue =
models.ForeignKey("github.Issue", on_delete=models.CASCADE,
related_name="comments") to scope comments to an Issue (or replace with a
GenericForeignKey if you need PR/other-target polymorphism), and set created_at
and updated_at to include db_index=True; run makemigrations/migrate after
updating the model.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
backend/apps/github/models/comment.py (2)

8-14: Wrong cardinality: comments should be 1:many with Issue (not M2M) + clarify scope in docstring

GitHub issue comments belong to exactly one issue. Modeling via Issue.comments M2M leads to unnecessary join table, risk of accidental cross-linking, and complicates cascading deletes. Scope this model directly to Issue via FK (or use a GFK if you truly need PR/other targets). Also align the docstring with the actual scope. Echoing the earlier review on the same point.

Apply in this file:

 class Comment(BulkSaveModel, models.Model):
-    """Represents a comment on a GitHub entity (Issue, PR, etc.)."""
+    """Represents a comment on a GitHub Issue."""
@@
     class Meta:
         verbose_name = "Comment"
         verbose_name_plural = "Comments"
+        # Efficient recency queries like Issue.latest_comment
+        ordering = ("-created_at",)

And add the FK in the fields block (see next comment for full fields diff).

If you decide you really need polymorphism (Issue/PR/Commit), I can draft a GenericForeignKey alternative. Would you like that?


15-21: Add Issue FK and index timestamps for query performance

  • Missing FK to Issue; add it here to enforce 1:many.
  • Index created_at/updated_at to speed recency queries, admin list ordering, and sync diffs. This was previously requested.
     github_id = models.BigIntegerField(unique=True)
     author = models.ForeignKey(
         "github.User", on_delete=models.SET_NULL, null=True, related_name="comments"
     )
+    issue = models.ForeignKey(
+        "github.Issue", on_delete=models.CASCADE, related_name="comments"
+    )
     body = models.TextField()
-    created_at = models.DateTimeField()
-    updated_at = models.DateTimeField()
+    created_at = models.DateTimeField(db_index=True)
+    updated_at = models.DateTimeField(db_index=True)

Follow-ups I can help with:

  • Generate the migration altering relationships and backfilling existing M2M links into the FK field.
  • Update sync code to pass the Issue when building Comment instances.
🧹 Nitpick comments (5)
backend/apps/github/models/comment.py (5)

23-26: Make str resilient to missing author/body

Minor polish to avoid rendering “None - …” and to protect against unexpected None body.

-        return f"{self.author} - {self.body[:40]}"
+        author = str(self.author) if self.author else "unknown"
+        preview = (self.body or "")[:40]
+        return f"{author} - {preview}"

27-42: Plumb Issue through from_github and tighten signature

If we add the Issue FK, let from_github accept the issue and set it. Also make params keyword-only to prevent accidental arg order bugs and add lightweight typing.

-    def from_github(self, gh_comment, author=None):
+    def from_github(self, gh_comment, *, author=None, issue=None) -> None:
         """Populate fields from a GitHub API comment object."""
@@
         for model_field, gh_field in field_mapping.items():
             value = getattr(gh_comment, gh_field, None)
             if value is not None:
                 setattr(self, model_field, value)
 
-        self.author = author
+        self.author = author
+        # Set owning issue when provided (enables FK cardinality)
+        if issue is not None:
+            self.issue = issue

I can also wire this into your sync function to pass issue=... if you share that file.


48-70: Eliminate race on create path and add return typing

The get-then-create flow can raise an IntegrityError under concurrent syncs (unique github_id). Use get_or_create in a transaction. Also add return typing for clarity.

-    def update_data(gh_comment, *, author=None, save: bool = True):
+    def update_data(gh_comment, *, author=None, issue=None, save: bool = True) -> "Comment":
@@
-        try:
-            comment = Comment.objects.get(github_id=gh_comment.id)
-        except Comment.DoesNotExist:
-            comment = Comment(github_id=gh_comment.id)
-
-        comment.from_github(gh_comment, author=author)
-
-        if save:
-            comment.save()
+        from django.db import transaction, IntegrityError
+        try:
+            with transaction.atomic():
+                comment, _created = Comment.objects.get_or_create(github_id=gh_comment.id)
+                comment.from_github(gh_comment, author=author, issue=issue)
+                if save:
+                    comment.save()
+        except IntegrityError:
+            # In case of a concurrent creator, re-fetch and update
+            comment = Comment.objects.get(github_id=gh_comment.id)
+            comment.from_github(gh_comment, author=author, issue=issue)
+            if save:
+                comment.save()
 
         return comment

Question: Can mentorship_update_comments run in parallel across workers or repos? If yes, the above change is advisable; otherwise current code works but is brittle under concurrency.


8-21: Naming mismatch with PR description (“IssueComment” vs “Comment”)

The PR summary mentions an “IssueComment” model, but this file defines Comment. If the intent is Issue-only (as the management command suggests), consider renaming to IssueComment for clarity and to avoid future collisions with PR review comments. If you plan polymorphism later, keep Comment but adopt a GFK now.

I can provide either:

  • A safe rename (model + migrations + admin + imports), or
  • A GFK version (content_type, object_id, GenericForeignKey) with indexes and constraints.

11-21: Add DB constraints and admin QoL (optional)

  • Consider Index(author, created_at) to optimize per-user timelines.
  • Add constraints.Index for lower(body) if you later do case-insensitive searches (optional).
  • Ensure USE_TZ is on; PyGithub returns aware datetimes—be consistent.

I can draft the migration if you confirm the query patterns you expect (latest per issue, by author, date ranges).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bc1d0d6 and 686e6c1.

📒 Files selected for processing (2)
  • backend/apps/github/admin/issue.py (0 hunks)
  • backend/apps/github/models/comment.py (1 hunks)
💤 Files with no reviewable changes (1)
  • backend/apps/github/admin/issue.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/apps/github/models/comment.py (2)
backend/apps/common/models.py (1)
  • BulkSaveModel (10-34)
backend/apps/github/models/issue.py (5)
  • Meta (24-30)
  • from_github (103-142)
  • bulk_save (194-196)
  • update_data (205-229)
  • save (182-191)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (1)
backend/apps/github/models/comment.py (1)

42-46: Wrapper around BulkSaveModel is fine

The thin wrapper that binds the model class improves callsites and mirrors Issue.bulk_save; no issues.

@Rajgupta36 Rajgupta36 requested a review from arkid15r August 23, 2025 21:47
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
backend/apps/github/migrations/0034_comment_issue_comments.py (1)

42-46: Replace Issue↔Comment M2M with a Comment.issue FK (one-to-many is the correct GitHub cardinality).

Each GitHub issue comment belongs to exactly one issue. The M2M on Issue.comments introduces incorrect semantics, complicates queries, and risks data skew. Model this as a ForeignKey on Comment instead and drop the M2M.

Apply this diff to the migration:

@@
         migrations.CreateModel(
             name="Comment",
             fields=[
@@
                 (
                     "author",
                     models.ForeignKey(
                         null=True,
                         on_delete=django.db.models.deletion.SET_NULL,
                         related_name="comments",
                         to="github.user",
                     ),
                 ),
+                (
+                    "issue",
+                    models.ForeignKey(
+                        on_delete=django.db.models.deletion.CASCADE,
+                        related_name="comments",
+                        to="github.issue",
+                    ),
+                ),
             ],
             options={
                 "verbose_name": "Comment",
                 "verbose_name_plural": "Comments",
                 "ordering": ("-created_at",),
             },
         ),
-        migrations.AddField(
-            model_name="issue",
-            name="comments",
-            field=models.ManyToManyField(blank=True, related_name="issues", to="github.comment"),
-        ),

If 0034 is already applied anywhere, do not rewrite it in-place. Instead:

  1. Add a new migration to AddField Comment.issue (null=True).
  2. Data-migrate from the existing M2M through table to set Comment.issue.
  3. Alter Comment.issue to null=False.
  4. Remove the Issue.comments M2M field (and its through table).

I can generate that follow-up migration with a reversible RunPython if helpful.

backend/apps/github/models/comment.py (1)

8-15: Model says “comment on a GitHub Issue” but there’s no Issue relation; add Comment.issue FK.

The docstring and PR intent are issue comments. Without an FK to Issue you’re forced into an incorrect Issue↔Comment M2M (see migration). Add the FK here and align the migration accordingly.

Apply this change:

 class Comment(BulkSaveModel, models.Model):
     """Represents a comment on a GitHub Issue."""
@@
     github_id = models.BigIntegerField(unique=True)
     author = models.ForeignKey(
         "github.User", on_delete=models.SET_NULL, null=True, related_name="comments"
     )
+    issue = models.ForeignKey(
+        "github.Issue", on_delete=models.CASCADE, related_name="comments"
+    )
     body = models.TextField()
     created_at = models.DateTimeField(db_index=True)
     updated_at = models.DateTimeField(db_index=True)
🧹 Nitpick comments (3)
backend/apps/github/migrations/0034_comment_issue_comments.py (1)

36-41: Optional: add a compound index for common queries.

Once Comment.issue exists, consider an additional index to speed lookups of the latest comments for a given issue.

         migrations.CreateModel(
             name="Comment",
             fields=[
@@
             ],
             options={
                 "verbose_name": "Comment",
                 "verbose_name_plural": "Comments",
                 "ordering": ("-created_at",),
+                "indexes": [
+                    models.Index(
+                        fields=["issue", "-created_at"],
+                        name="comment_issue_createdat_idx",
+                    ),
+                ],
             },
         ),
backend/apps/github/models/comment.py (2)

11-15: Optional: add a compound index in Meta for issue + created_at.

Once the FK exists, this index accelerates “latest comments per issue” and timelines.

     class Meta:
         verbose_name = "Comment"
         verbose_name_plural = "Comments"
         ordering = ("-created_at",)
+        indexes = [
+            models.Index(fields=["issue", "-created_at"], name="comment_issue_createdat_idx"),
+        ]

24-26: Minor: guard str when author is None.

Avoid “None - …” in admin lists.

-        return f"{self.author} - {self.body[:40]}"
+        author = str(self.author) if self.author else "Unknown"
+        return f"{author} - {self.body[:40]}"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 686e6c1 and 67290f9.

📒 Files selected for processing (4)
  • backend/apps/github/migrations/0034_comment_issue_comments.py (1 hunks)
  • backend/apps/github/models/comment.py (1 hunks)
  • backend/apps/mentorship/admin/issue_user_interest.py (1 hunks)
  • backend/apps/mentorship/migrations/0004_issueuserinterest.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/apps/mentorship/admin/issue_user_interest.py
  • backend/apps/mentorship/migrations/0004_issueuserinterest.py
🧰 Additional context used
🧬 Code graph analysis (2)
backend/apps/github/migrations/0034_comment_issue_comments.py (1)
backend/apps/mentorship/migrations/0004_issueuserinterest.py (1)
  • Migration (7-54)
backend/apps/github/models/comment.py (2)
backend/apps/common/models.py (1)
  • BulkSaveModel (10-34)
backend/apps/github/models/issue.py (5)
  • Meta (24-30)
  • from_github (103-142)
  • bulk_save (194-196)
  • update_data (205-229)
  • save (182-191)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run backend tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (3)
backend/apps/github/migrations/0034_comment_issue_comments.py (1)

24-25: Good: timestamp indexes and default ordering.

Indexing created_at/updated_at and ordering by -created_at will keep “latest comment per issue” queries snappy once the FK exists.

Also applies to: 39-39

backend/apps/github/models/comment.py (2)

28-41: Input mapping looks fine; confirm timezone-awareness.

created_at/updated_at from PyGithub are tz-aware; your DateTimeFields are naive by default unless USE_TZ is True (likely is). Just ensure settings.USE_TZ is enabled so ORM stores aware datetimes.


20-22: Security note: HTML/Markdown in body must be escaped on render.

Storing raw comment text is correct. When you render it, ensure the frontend applies safe HTML sanitization (e.g., bleach) if any HTML is allowed.

@@ -17,7 +17,6 @@ class IssueAdmin(admin.ModelAdmin):
)
list_display = (
"repository",
"created_at",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Comment on lines 244 to 245
repository = issue.repository
if not repository:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use :=

logger.warning("Issue #%s has no repository, skipping", issue.number)
return

repository_full_name = f"{repository.owner.login}/{repository.name}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you use repository model existing properties like path

Comment on lines 269 to 271
if gh_comment.id in existing_github_ids:
logger.info("Skipping existing comment %s", gh_comment.id)
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't since make sure that there will be only updated comments?

Copy link
Collaborator Author

@Rajgupta36 Rajgupta36 Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since parameter tells gitHub to return comments that were created or updated after the given timestamp.

def handle(self, *args, **options):
self.stdout.write(self.style.SUCCESS("Starting mentorship issue processing job..."))

gh = get_github_client()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need it up here?

f"Found {relevant_issues.count()} open issues across repositories"
)

except GithubException as e:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What code/line throws this exception in covered try/except block?

interests_to_create = []
new_user_logins = []

for comment in issue.comments.select_related("author").all():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need all() here?


self.stdout.write(self.style.SUCCESS("\nProcessed successfully!"))

def _find_and_register_interest(self, issue: Issue, module: Module):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you loop over all issue comments instead of updated ones only. It's not optimal.
The second suggestion is that we need to unregister user's interest too if their comment was updated and /interested marker was removed.

logger = logging.getLogger(__name__)

INTEREST_PATTERNS = [
re.compile(r"interested", re.IGNORECASE),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I recommended /interested as a marker. Like /assign

"""Represents users interested in a specific issue within a module."""

class Meta:
verbose_name = "Issue User Interest"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The db table name is missing.

@arkid15r arkid15r changed the base branch from main to feature/mentorship-portal September 5, 2025 01:36
Copy link

sonarqubecloud bot commented Sep 7, 2025

@Rajgupta36 Rajgupta36 requested a review from arkid15r September 7, 2025 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants