Conversation
- Make debugpy optional in worker hot-reload (set AIRWEAVE_WORKER_DEBUG=true to enable) - Make --wait-for-client optional (set AIRWEAVE_DEBUGPY_WAIT_FOR_CLIENT=true to enable) - Fix graceful shutdown when metrics server not fully initialized - Fix watch paths for hot-reload script - Add Monke troubleshooting guide for entity ID collisions and deletion order - Minor monke logging improvements
There was a problem hiding this comment.
1 issue found across 19 files
Prompt for AI agents (all 1 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="monke/bongos/google_sheets.py">
<violation number="1" location="monke/bongos/google_sheets.py:202">
P1: Passing `self._test_spreadsheets` directly causes list modification during iteration. Inside `delete_specific_entities`, items are removed from `_test_spreadsheets` while iterating over the same list, which can cause elements to be skipped. Pass a copy instead, similar to how `cleanup` does it.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
| "google_drive": "googledrive", | ||
| "google_calendar": "googlecalendar", | ||
| "google_docs": "googledocs", | ||
| "google_sheets": "googledocs", # Reuse Google Docs OAuth (compatible scopes) |
There was a problem hiding this comment.
does this require users to create a "googledocs" Composio connection and use that to create a "google_sheets" Airweave conncection?
There was a problem hiding this comment.
I'm using googledocs just temporarily, because I don't have access to composio.
| class GoogleSheetsConfig(SourceConfig): | ||
| """Google Sheets configuration schema.""" | ||
|
|
||
| include_trashed: bool = Field( |
There was a problem hiding this comment.
not sure if we should include this option. Normally we do not synced deleted data.
There was a problem hiding this comment.
This follows the existing pattern for Google connectors - GoogleDocsConfig and GoogleSlidesConfig already have this same option on main. It defaults to False (trashed items excluded), so the default behavior matches our principle of not syncing deleted data. The opt-in exists for edge cases like compliance audits or data recovery scenarios.
|
we do have a xlsx converter. Did you consider processing the data as a file? or do you think this is a better way? |
@marc-rutzou Good question! I considered the XLSX export approach but went with the Sheets API directly for a few reasons:
|
|
@marc-rutzou btw the test that is failing is unrelated to the changes here. It's likely a bug :/ |
marc-rutzou
left a comment
There was a problem hiding this comment.
once the composio name mapping is fixed, its good to go!
|
@marc-rutzou could you please re-review? Composio is set |
There was a problem hiding this comment.
2 issues found across 11 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/airweave/platform/sources/google_sheets.py">
<violation number="1" location="backend/airweave/platform/sources/google_sheets.py:279">
P1: Dead code: `_latest_new_start_page_token` is never assigned in this class</violation>
</file>
<file name="backend/airweave/platform/auth/yaml/dev.integrations.yaml">
<violation number="1" location="backend/airweave/platform/auth/yaml/dev.integrations.yaml:161">
P2: Using placeholder OAuth credentials will break Google Sheets authentication in the dev integration config.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
4640847 to
19f76e2
Compare
| from .google_sheets import ( | ||
| GoogleSheetsSheetEntity, | ||
| GoogleSheetsSpreadsheetEntity, | ||
| ) |
There was a problem hiding this comment.
We don't have a Deletion Entity, is that expected? _process_changes filters for removed == False but never yields a deletion entity for removed == True. So this will work for the monke since we set force_full_sync to true, but regular users will accumulate stale data over time.
| spreadsheet_title = file_data.get("name", "Untitled Spreadsheet") | ||
|
|
||
| # Parse timestamps | ||
| created_time = self._parse_datetime(file_data.get("createdTime")) or datetime.utcnow() |
There was a problem hiding this comment.
We're in the process of bumping to Python 3.14. AFAIK datetime.utcnow() is deprecated for 3.12 and higher. Can you use datetime.now(timezone.utc) instead?
|
Thanks!! I believe we should park this connector for now. We are unable to parse sheets in a meaningful way yet that a) doesn't crash our infra with mem-usage Let's revisit once our platform is ready. |
Summary by cubic
Adds a Google Sheets connector (ENG-31) with OAuth and full/incremental sync via Drive Changes, plus docs and UI icons. Includes E2E tests and local worker QoL fixes for safer shutdowns.
New Features
Bug Fixes
Written for commit 19f76e2. Summary will update on new commits.