Some background: when you start a new run with the orchestrator, the run is associated with a runName
in addition to its usual ID. This name is necessary to support some features, such as persistence.
The orchestrator can start multiple runs at the same time and automatically split the input.
Let's take the start functions as an example:
const orchestrator = new Orchestrator();
const client = await orchestrator.apifyClient();
const actor = client.actor('actor-id');
// Start a single run and get the run object
const run = await actor.start('my-run', { ...input }, { ...options });
// Start multiple runs and get the map [runName:runObject]
const runRecord = await actor.startRuns(
    {
        runName: 'my-run-1',
        input: { ... },
        options: { ... },
    },
    {
        runName: 'my-run-2',
        input: { ... },
        options: { ... },
    },
    ...
);
// Automatically split input, generate a sequence of run names,
// start multiple runs and get the map [runName:runObject]
const batchRecord = await actor.startBatch(
    'my-run', // will be used as a prefix
    [...urls], // some "sources"
    (urls) => ({ startUrls: urls }), // a function mapping sources to an input object
    { ...splitRules }, // rules for generating multiple inputs
    { ...options }, // actor options
);
// You will get something like: { 'my-run-1/2': run1, 'my-run-2/2': run2 }
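
To make the naming scheme in the last comment concrete, here is a minimal sketch of how sources might be split and named. The helpers splitIntoChunks and nameBatchInputs are hypothetical and are not part of the orchestrator; the prefix-i/n name format is only inferred from the example output above, and splitting by a fixed chunk size is an assumption made for illustration.

```javascript
// Hypothetical helper: split an array into chunks of at most `maxSize` items.
function splitIntoChunks(items, maxSize) {
    const chunks = [];
    for (let i = 0; i < items.length; i += maxSize) {
        chunks.push(items.slice(i, i + maxSize));
    }
    return chunks;
}

// Hypothetical helper: build a map { runName: input } with names like 'my-run-1/2'.
function nameBatchInputs(prefix, sources, inputGenerator, maxSize) {
    const chunks = splitIntoChunks(sources, maxSize);
    const named = {};
    chunks.forEach((chunk, index) => {
        named[`${prefix}-${index + 1}/${chunks.length}`] = inputGenerator(chunk);
    });
    return named;
}

const named = nameBatchInputs('my-run', ['a', 'b', 'c'], (urls) => ({ startUrls: urls }), 2);
console.log(Object.keys(named)); // [ 'my-run-1/2', 'my-run-2/2' ]
```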
Although it is a bit more complicated, using startBatch instead of start helps to avoid API errors, such as those caused by an input that is too large for a single request. Instead of doing:
const run = await actor.start('my-run', { startUrls: urls });
Do:
const runRecord = await actor.startBatch(
    'my-run',
    urls,
    (urls) => ({ startUrls: urls }),
    { respectApifyMaxPayloadSize: true },
);
Note that the orchestrator provides tools for working with a "run record" as if it were a single run, e.g., for reading dataset items.
I would like to simplify this interface and make it more versatile. For instance, the orchestrator could provide a separate function for splitting inputs, instead of embedding this functionality in the startBatch (and callBatch) methods. But then, what about the run names?
We could discuss how to achieve this.
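
As a starting point for that discussion, one possible shape is sketched below: a standalone splitter that returns entries in the same { runName, input } form used in the startRuns example, so that naming stays with the splitter and starting stays with startRuns. The names splitInput and maxPayloadBytes are hypothetical, and measuring the serialized JSON size is only a guess at what respectApifyMaxPayloadSize does internally.

```javascript
// Hypothetical standalone splitter, not part of the current orchestrator API.
// Greedily packs sources into inputs whose JSON size stays under maxPayloadBytes.
// A single source that is already too large is still emitted on its own.
function splitInput(prefix, sources, inputGenerator, maxPayloadBytes) {
    const sizeOf = (chunk) => Buffer.byteLength(JSON.stringify(inputGenerator(chunk)), 'utf8');
    const chunks = [];
    let current = [];
    for (const source of sources) {
        const candidate = [...current, source];
        if (current.length > 0 && sizeOf(candidate) > maxPayloadBytes) {
            chunks.push(current); // close the current chunk and start a new one
            current = [source];
        } else {
            current = candidate;
        }
    }
    if (current.length > 0) chunks.push(current);
    // Names are assigned only once the total number of chunks is known.
    return chunks.map((chunk, index) => ({
        runName: `${prefix}-${index + 1}/${chunks.length}`,
        input: inputGenerator(chunk),
    }));
}

const urls = ['https://a.example', 'https://b.example', 'https://c.example'];
const runRequests = splitInput('my-run', urls, (chunk) => ({ startUrls: chunk }), 60);
console.log(runRequests.map((request) => request.runName)); // [ 'my-run-1/2', 'my-run-2/2' ]
```

If startRuns keeps accepting a list of such objects, as in the example above, the result could then be passed along as actor.startRuns(...runRequests). The open question of who owns the run names is answered here by the splitter, which is only one of several options.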