@huydhn huydhn commented Aug 16, 2025

This route is no longer used and always fails. I will clean up the write into DynamoDB in another PR, and remove the DynamoDB table torchci-oss-ci-benchmark itself later.

UPSERTING 3 INTO benchmark.oss_ci_benchmark_v2
--
/var/task/lambda_function.py:206: UserWarning: Failed to upsert 3 into benchmark.oss_ci_benchmark_v2: HTTPDriver for https://hyt81izu0c.us-east-1.aws.clickhouse.cloud:8443 received ClickHouse error code 60
Code: 60. DB::Exception: Table benchmark.oss_ci_benchmark_v2 does not exist. Maybe you meant benchmark.oss_ci_benchmark_v3?. (UNKNOWN_TABLE) (version 25.4.1.37615 (official build))
  warn(f"Failed to upsert {len(documents)} into {table}: {error}")
[ERROR] DatabaseError: HTTPDriver for https://hyt81izu0c.us-east-1.aws.clickhouse.cloud:8443 received ClickHouse error code 60
Code: 60. DB::Exception: Table benchmark.oss_ci_benchmark_v2 does not exist. Maybe you meant benchmark.oss_ci_benchmark_v3?. (UNKNOWN_TABLE) (version 25.4.1.37615 (official build))
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 58, in lambda_handler
    handle_event(event, False)
  File "/var/task/lambda_function.py", line 83, in handle_event
    upsert_documents(table, ids_and_docs, dry_run)
  File "/var/task/lambda_function.py", line 219, in upsert_documents
    raise error
  File "/var/task/lambda_function.py", line 203, in upsert_documents
    res = get_clickhouse_client().query(query)
  File "/var/task/clickhouse_connect/driver/client.py", line 228, in query
    return self._query_with_context(query_context)
  File "/var/task/clickhouse_connect/driver/httpclient.py", line 239, in _query_with_context
    response = self._raw_request(body,
  File "/var/task/clickhouse_connect/driver/httpclient.py", line 472, in _raw_request
    self._error_handler(response)
  File "/var/task/clickhouse_connect/driver/httpclient.py", line 395, in _error_handler
    raise OperationalError(err_str) if retried else DatabaseError(err_str) from None

The error dropped to 0 after I removed this line:

[Screenshot 2025-08-15 at 18 30 13]

Also remove the json.dumps(body) part of the print statement because it writes too much information into the log. We needed it at the start to debug issues, but it's too verbose now. We could remove the whole print statement if no one really looks at it. @clee2000 Your thoughts on this? I'll do this in a separate PR to improve the logging mechanism here instead of relying on poor man's print.

@huydhn huydhn requested a review from clee2000 August 16, 2025 01:36

vercel bot commented Aug 16, 2025

@huydhn is attempting to deploy a commit to the Meta Open Source Team on Vercel.

A member of the Team first needs to authorize it.

@meta-cla meta-cla bot added the CLA Signed label Aug 16, 2025
@huydhn huydhn changed the title from "Clean up the unused sync to benchmark.oss_ci_benchmark_v2" to "Clean up the unused sync from DynamoDB to benchmark.oss_ci_benchmark_v2" Aug 16, 2025

vercel bot commented Aug 16, 2025

The latest updates on your projects.

1 Skipped Deployment
Project Deployment Preview Updated (UTC)
torchci Ignored Ignored Preview Aug 18, 2025 7:34pm

@@ -177,7 +176,7 @@ def get_doc_for_upsert(record: Any) -> Optional[Tuple[str, str, Any]]:
     body = handle_workflow_job(body)
 
     id = extract_dynamodb_key(record)
-    print(f"Parsing {id}: {json.dumps(body)}")
+    print(f"Parsing {id}")
Collaborator

Could we use something like log.debug() so the debug statement is not printed unless we need it?

That way we can keep the line but have it off by default.
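
A minimal sketch of that suggestion, assuming Python's standard logging module. The logger name, the LOG_LEVEL environment variable, and the log_parsing helper are hypothetical and are not taken from the lambda's code:

```python
import json
import logging
import os
from typing import Any

# Hypothetical logger name; the real lambda may organize logging differently.
logger = logging.getLogger("dynamo_replicator")

# Assumption: verbosity is controlled by a LOG_LEVEL environment variable,
# defaulting to INFO so the verbose body dump is off unless requested.
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO").upper())


def log_parsing(doc_id: str, body: Any) -> None:
    """Log the record id at INFO and the full body only at DEBUG."""
    logger.info("Parsing %s", doc_id)
    # Guard the json.dumps call so large bodies are not serialized
    # when DEBUG logging is disabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Body of %s: %s", doc_id, json.dumps(body))
```

With this shape, the detailed line stays in the code but only appears in the log when the level is lowered to DEBUG.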

Contributor Author

This makes sense, let me do that. The logging of this important lambda could be improved.

Contributor Author

Now, I think it's better to break this into two PRs. This one will just remove the unused sync from DynamoDB to benchmark.oss_ci_benchmark_v2. I will do the logging part separately as it's more involved.

@huydhn huydhn requested a review from zxiiro August 18, 2025 19:33
@huydhn huydhn merged commit 47f28a3 into pytorch:main Aug 18, 2025
5 checks passed