BYOC: Fix Orchestrator streaming reserve capacity#3857
BYOC: Fix Orchestrator streaming reserve capacity#3857ad-astra-video merged 6 commits intolivepeer:masterfrom
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3857 +/- ##
===================================================
- Coverage 32.25814% 32.24679% -0.01135%
===================================================
Files 169 169
Lines 41202 41201 -1
===================================================
- Hits 13291 13286 -5
- Misses 26907 26915 +8
+ Partials 1004 1000 -4
... and 1 file with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
eliteprox
left a comment
There was a problem hiding this comment.
Changes LGTM! Added small nit on related lines
byoc/job_gateway.go
Outdated
There was a problem hiding this comment.
Should this be respTimeout or timeout from getJobOrchestrators() instead of 500 milliseconds? Note that respTimeout at the getJobOrchestrators() scope is unused.
Small nit, there is some ambiguity with the respTimeout variable name reused in getOrchJobToken, would suggest to rename one of them or remove the first if it remains unused
There was a problem hiding this comment.
Good catch, updated and included a small improvement to the get orchestrators loop with the timeout in 545b14e.
Note that the getOrchSerachTimeouts applies defaults if no timeouts are set. respTimeout default is 500ms and the searchTimeout is 1s. The searchTimeout is what it waits for all the responses and the respTimeout applies to each request to the Orchestrator.
| Balance: capBalInt, | ||
| Price: jobPrice, | ||
| ServiceAddr: orch.ServiceURI().String(), | ||
| AvailableCapacity: capacity, |
There was a problem hiding this comment.
I think it's worth noting that JobTokens generally can't be changed during the active stream, so this value will always reflect the capacity available when the orchestrator was selected for the job, is that correct?
Not sure if a code comment is needed or anything, just adding context here
There was a problem hiding this comment.
Yes, each JobToken is a point in time representation of information from the Orchestrator. They are requested and generated on demand for a stream start and refreshed when failing over to another Orchestrator to ensure fresh. The stream stop request gets a fresh token as well. Status and Update requests do not use the token since its just using a stream already running and paid for.
|
Please also update CHANGELOG_PENDING.md |
Updated in eef7567 |
What does this pull request do? Explain your changes. (required)
Fixes bug introduces when spliting BYOC to separate gateway and orchestrator streaming files.
Part of the fix is also returning a JobToken even when there is no capacity. This enables reporting capacity available when job token sent
No net new features added, just bug fix and refactor.
Specific updates (required)
verifyJobCredsfunction in BYOCOrchestratorServer and simplified same function for Gateway to only parse the request.CheckExternalCapabilityCapacityto return net capacity availableGetTokento return JobToken even if no capacity. No major change on this one but removing the if block makes the diff kind of ugly.getJobOrchestratorsto usetoken.AvailableCapacityto determine if Orchestrator should be in working set (0 = excluded)How did you test each of these updates (required)
Added test and built and ran local docker containers for Gateway and Orchestrator to test full stack.
Does this pull request close any open issues?
Checklist:
makeruns successfully./test.shpass