Handling rate limiting for external services #7189

ramyaragupathy · 2026-03-09T06:43:14Z

ramyaragupathy
Mar 9, 2026
Maintainer

Recent 429 Too Many Requests errors are causing login blockages across the ecosystem. This has been observed not only in Tasking Manager but also in uMap, HOT Export Tool, and fAIr.

Symptom: The OSM stats endpoint/auth redirect is failing or timing out.
Impact: Because the stats fetch is a blocking part of the login flow in TM, our users cannot work even if the rest of our infrastructure is healthy.

Our login flow treats non-essential metadata (user stats) as a hard dependency for authentication. When OSM-wide rate limits are triggered, this blocks our workflow.

Proposal:

Upon login, the backend should immediately return the session and existing DB stats. A background worker (Celery/Redis) should then be triggered to update the stats asynchronously.
Maybe introduce a TTL for stats?
Ensure our code looks for Retry-After information in 429 response headers and then makes a call after the timeout expires?
Also, if the OSM/OhSome stats service fails consistently (e.g., 10-20 times in 1 minute), automatically disable stats fetching for all users for 15 minutes to allow the upstream service to recover and to prevent TM from being flagged as abusive.
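
To make the last two points concrete, here is a rough sketch of how the Retry-After handling and the "disable for 15 minutes" switch could look. It is not taken from the TM codebase: `parse_retry_after` and `StatsCircuitBreaker` are hypothetical names, the 60-second fallback when the header is missing is an assumption, and the thresholds (10 failures in 1 minute, 15-minute cooldown) are simply the numbers suggested above.

```python
import time
from collections import deque
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None, default=60.0):
    """Return the number of seconds to wait, given a Retry-After header value.

    The header holds either delay-seconds ("120") or an HTTP-date.
    `default` (an assumed fallback) is used when the header is missing or unparsable.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


class StatsCircuitBreaker:
    """Pause stats fetching after repeated upstream failures (hypothetical helper)."""

    def __init__(self, max_failures=10, window=60.0, cooldown=900.0,
                 clock=time.monotonic):
        self.max_failures = max_failures  # failures tolerated inside `window`
        self.window = window              # seconds over which failures are counted
        self.cooldown = cooldown          # seconds to stop calling upstream
        self.clock = clock
        self._failures = deque()
        self._open_until = 0.0

    def allow_request(self):
        """True when it is currently acceptable to call the upstream service."""
        return self.clock() >= self._open_until

    def record_failure(self, retry_after=0.0):
        """Register a failed call; `retry_after` is the parsed header, if any."""
        now = self.clock()
        self._failures.append(now)
        # Forget failures that fall outside the counting window.
        while self._failures and now - self._failures[0] > self.window:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self._open_until = now + self.cooldown
            self._failures.clear()
        elif retry_after > 0:
            # Honour the upstream hint even before the breaker trips.
            self._open_until = max(self._open_until, now + retry_after)

    def record_success(self):
        self._failures.clear()
```

The background worker would call `allow_request()` before each stats fetch and `record_failure(parse_retry_after(...))` on a 429. Note that this state lives in one process; with several workers it would presumably need to be kept in shared storage such as Redis.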

ramyaragupathy
Mar 9, 2026
Maintainer Author

cc @emi420 @dakotabenjamin @prabinoid @suzit-10 @kshitijrajsharma @omranlm @petya-kangalova

0 replies

ramyaragupathy · 2026-03-10T10:01:09Z

ramyaragupathy
Mar 10, 2026
Maintainer Author

Related task: #7196

0 replies

Adrian-Shobrooke · 2026-03-10T11:21:37Z

Adrian-Shobrooke
Mar 10, 2026

Proposals

As I understand it, the stats come from two sources: (1) the TM db and (2) the external OSM/OhSome db. Was it the external db response that was problematic?

So, are the stats important enough that they need to be updated every time a task is submitted and a user looks at their contribution stats? Would getting the external stats every 15 minutes cover most users, since that is around the suggested ideal time to map a task?

Approximately how many times is the external service called in 24 hours?

So proposal 1 would be to get external stats every 15 minutes, while TM stats could be refreshed more frequently, although I think every 15 minutes for both sources would be adequate.

2 TTL - Time To Live? If collating stats every 15 minutes, would this cover TTL?

3 No comment

4 15-minute intervals overall might help here? Try several (5?) times every 15 minutes, but after 5 failures, wait until the next interval.
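
A minimal sketch of how points 2 and 4 could fit together, assuming a 15-minute TTL and a budget of 5 fetch attempts per 15-minute window. `CachedStats` and the `fetch` callable are hypothetical names, not existing TM code; the idea is that a stale or missing value is returned as-is rather than raising an error, so login never waits on the external service.

```python
import time

STATS_TTL_SECONDS = 15 * 60     # assumed refresh interval from the discussion
MAX_ATTEMPTS_PER_WINDOW = 5     # assumed retry budget per interval


class CachedStats:
    """Serve cached stats and refresh them at most a few times per window."""

    def __init__(self, fetch, ttl=STATS_TTL_SECONDS,
                 max_attempts=MAX_ATTEMPTS_PER_WINDOW, clock=time.monotonic):
        self.fetch = fetch            # callable that queries the external service
        self.ttl = ttl
        self.max_attempts = max_attempts
        self.clock = clock
        self.value = None             # last known stats (None for a new user)
        self.fetched_at = None
        self._window_start = None
        self._attempts = 0

    def is_fresh(self):
        return (self.fetched_at is not None
                and self.clock() - self.fetched_at < self.ttl)

    def _attempts_left(self):
        now = self.clock()
        # Start a new window (and reset the budget) once the old one has elapsed.
        if self._window_start is None or now - self._window_start >= self.ttl:
            self._window_start = now
            self._attempts = 0
        return self.max_attempts - self._attempts

    def get(self):
        """Return the best available stats; never raises on upstream failure."""
        if self.is_fresh():
            return self.value
        if self._attempts_left() > 0:
            self._attempts += 1
            try:
                self.value = self.fetch()
                self.fetched_at = self.clock()
            except Exception:
                # Real code would catch narrower errors (timeouts, HTTP 429).
                pass  # keep serving the last known value
        return self.value
```

Each call to `get()` makes at most one upstream attempt, so five failed attempts exhaust the budget and later calls in the same window return the cached value without touching the external service.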

Open questions

1 The new-user stats message could read something more like 'Statistics will start to appear after you have saved your work to OSM and completed tasks.'

2 I can think of no reason why coordinating with OSM Ops would not be a good idea. I would hope that any API could respond with the throttle or ban response rather than having to involve a person. I guess it's a person for now. If we call every 15 minutes or when authenticating a user, I think we would not put much load on OSM.

0 replies
