@kohankhaki
Collaborator

@kohankhaki kohankhaki commented Aug 26, 2025

PR Type

Feature

Short Description

This PR adds a new agentic system for task generation. It also introduces a structured prompt/response contract (JSON) to include thoughts and integrates Langfuse for logging LLM outputs and key events.

Tests Added

None


…tputs, and updated corresponding output parser.
@kohankhaki kohankhaki requested a review from afkanpour August 26, 2025 17:50
@kohankhaki kohankhaki closed this Aug 26, 2025
@kohankhaki kohankhaki reopened this Aug 26, 2025
Collaborator

@afkanpour afkanpour left a comment

@afkanpour reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @kohankhaki)


src/utils/agentic_prompts.py line 209 at r1 (raw file):

Please return your proposal and your thoughts and reasoning in the following format:
{{
  "thought": "Your reasoning and thought process about the kind of tasks you're proposing",

"Thought": "Your reasoning and thought process for designing the tasks and ensuring diversity in content and difficulty of tasks"

Code quote:

"Your reasoning and thought process about the kind of tasks you're proposing"

src/utils/agentic_prompts.py line 211 at r1 (raw file):

  "thought": "Your reasoning and thought process about the kind of tasks you're proposing",
  "problems": {{
    "problem_0": "TASK_TEXT_1",

These could be replaced with "PROBLEM_1_DESCRIPTION"

Code quote:

TASK_TEXT_1

src/utils/agentic_prompts.py line 242 at r1 (raw file):

    "solution_1": "SOLUTION_TEXT_2",
    ...
  }}

We should give one problem at a time for solving, so I expect the solution JSON to contain only one solution.

We should add a sentence to the prompt asking for the final numerical solution, so that parsing and verification become easy.

Code quote:

  "solutions": {{
    "solution_0": "SOLUTION_TEXT_1",
    "solution_1": "SOLUTION_TEXT_2",
    ...
  }}
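
A minimal sketch of the single-solution contract requested above, assuming the prompt is a Python `str.format` template (which would explain the doubled braces in the quoted prompt). The template wording, the `final_numerical_answer` key, and both helper names are hypothetical and not taken from the PR.

```python
import json

# Hypothetical template: one problem in, one solution out.
# Literal braces are doubled so that str.format leaves them in place.
SOLVER_PROMPT = """Solve the following problem.

Problem: {problem_text}

Return your answer in the following format:
{{
  "thought": "Your step-by-step reasoning",
  "solution": "SOLUTION_TEXT",
  "final_numerical_answer": "The final numerical result only"
}}"""

# Keys assumed for this sketch; the real contract may differ.
REQUIRED_KEYS = {"thought", "solution", "final_numerical_answer"}


def build_solver_prompt(problem_text: str) -> str:
    """Fill the template for exactly one problem."""
    return SOLVER_PROMPT.format(problem_text=problem_text)


def parse_solution(raw: str) -> dict:
    """Parse a single-solution reply and check that the expected keys exist."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

Asking for the numerical result in its own key means verification can compare one field instead of searching free text.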

Collaborator Author

@kohankhaki kohankhaki left a comment

Reviewable status: 0 of 27 files reviewed, 3 unresolved discussions (waiting on @afkanpour)


src/utils/agentic_prompts.py line 209 at r1 (raw file):

Previously, afkanpour (Arash) wrote…

"Thought": "Your reasoning and thought process for designing the tasks and ensuring diversity in content and difficulty of tasks"

Done.


src/utils/agentic_prompts.py line 211 at r1 (raw file):

Previously, afkanpour (Arash) wrote…

These could be replaced with "PROBLEM_1_DESCRIPTION"

Done.


src/utils/agentic_prompts.py line 242 at r1 (raw file):

Previously, afkanpour (Arash) wrote…

We should give one problem at a time for solving, so I expect the solution JSON to contain only one solution.

We should add a sentence to the prompt asking for the final numerical solution, so that parsing and verification become easy.

Done.

Collaborator

@afkanpour afkanpour left a comment

@afkanpour reviewed 27 of 27 files at r2.
Reviewable status: 17 of 27 files reviewed, 9 unresolved discussions (waiting on @kohankhaki)


src/task_solver/moderator.py line 49 at r2 (raw file):

        num_solvers: int,
        max_rounds: int,
        output_dir: Path,

Please add a description for the class attributes in the docstring.

Code quote:

        model_client: ChatCompletionClient,
        num_solvers: int,
        max_rounds: int,
        output_dir: Path,

src/cfg/agentic_config.yaml line 11 at r2 (raw file):

# Debate configuration (shared across all stages)
debate_cfg:
  max_round: 5

Is this the number of rounds of debate between two agents? Isn't 5 too large?

Code quote:

5

src/utils/agentic_prompts.py line 285 at r2 (raw file):

Provide your solution in JSON format with the following structure:
- thought: Your detailed reasoning and step-by-step solution process
- final_answer: Your complete answer with explanation

Should we remove the requirement for 'explanation' in the final answer?

Code quote:

with explanation

src/utils/agentic_prompts.py line 286 at r2 (raw file):

- thought: Your detailed reasoning and step-by-step solution process
- final_answer: Your complete answer with explanation
- numerical_answer: The final numerical result (if applicable, otherwise null)

Having both final_answer and numerical_answer could be confusing.
I suggest we provide only one field for the final solution.

Code quote:

numerical_answer:
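
To make the trade-off concrete, here is a rough sketch of how a checker could treat the two fields if both are kept: `numerical_answer` is the only field used for automatic comparison, and `final_answer` stays free text. The helper names and the tolerance are hypothetical.

```python
import json
import math
from typing import Optional


def extract_numeric(raw: str) -> Optional[float]:
    """Return numerical_answer as a float, or None when it is null or absent."""
    data = json.loads(raw)
    value = data.get("numerical_answer")
    if value is None:
        return None
    return float(value)


def answers_match(raw: str, expected: float, tol: float = 1e-9) -> bool:
    """Compare the numeric field against an expected value, if one was given."""
    numeric = extract_numeric(raw)
    return numeric is not None and math.isclose(numeric, expected, abs_tol=tol)
```

With a single combined field, the same check would first have to extract a number from prose, which is the complication a separate numeric field avoids.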

src/cfg/agentic_config.yaml line 7 at r2 (raw file):

global_cfg:
  domain: math
  output_dir: /fs01/projects/aieng/public/ace/agentic_outputs/

Curious where this is specified?

Code quote:

/fs01/projects/aieng/public/ace/

src/task_solver/generator.py line 66 at r2 (raw file):

                        seed=cfg.agents.moderator.get("seed"),
                    ),
                    num_solvers=2,

Can this be specified in the config? How hard is it to change the logic to work with >2 solvers?

Code quote:

2

README.md line 89 at r2 (raw file):

# Generate tasks for each capability
python -m src.agentic_task_generator

Where is the capability for which tasks are to be generated specified? Please add a comment for that in the README.

Code quote:

agentic_task_generator

README.md line 92 at r2 (raw file):

# Generate tasks for all capabilities
python -m src.agentic_task_generator pipeline_tags.capabilities_tag=_20250902_030203

Is this tag auto-generated by a previous job (for example, capability generator)? Please explain in the README how this tag should be specified.

In general the README file should provide sufficient information for running all steps easily by someone unfamiliar with the codebase.

Code quote:

_20250902_030203
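
The tag has the shape of a timestamp. Assuming it follows the pattern `_YYYYMMDD_HHMMSS` (an assumption based only on the two tags quoted in this review, not confirmed by the PR), a tag could be produced and validated like this. Both function names are hypothetical.

```python
from datetime import datetime
from typing import Optional

# Assumed tag layout, inferred from tags such as "_20250902_030203".
TAG_FORMAT = "_%Y%m%d_%H%M%S"


def make_tag(now: Optional[datetime] = None) -> str:
    """Build a run tag from a timestamp (defaults to the current time)."""
    return (now or datetime.now()).strftime(TAG_FORMAT)


def parse_tag(tag: str) -> datetime:
    """Parse a tag back into a datetime; raises ValueError if malformed."""
    return datetime.strptime(tag, TAG_FORMAT)
```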

README.md line 95 at r2 (raw file):

# Generate solutions for tasks using multi-agent debate
python -m src.agentic_task_solver pipeline_tags.tasks_tag=_20250905_153532

ditto

Code quote:

_20250905_153532

@kohankhaki
Collaborator Author

src/cfg/agentic_config.yaml line 7 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

Curious where this is specified?

/fs01/projects/aieng/public/ace/ needs to be set in output_dir. I intentionally set it to agentic_outputs/ so that someone new to the repo does not make any changes to our primary storage.

@kohankhaki
Collaborator Author

src/cfg/agentic_config.yaml line 11 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

Is this the number of rounds of debate between two agents? Isn't 5 too large?

This is just a placeholder for now. We can change these in the experiments. That said, with 3 rounds, the agents did not reach consensus.
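
A minimal sketch of how a round cap interacts with consensus, assuming a round means every solver answers once and consensus means all answers are identical. The function name, the solver signature, and this definition of consensus are hypothetical and not taken from the PR.

```python
from typing import Callable, List, Optional, Tuple

# A solver sees the previous round's answers and returns its current answer.
Solver = Callable[[List[str]], str]


def run_debate(solvers: List[Solver], max_round: int) -> Tuple[Optional[str], int]:
    """Run rounds until all solvers agree or max_round is reached.

    Returns (agreed_answer, rounds_used); agreed_answer is None
    when the cap is hit without consensus.
    """
    answers: List[str] = []
    for round_idx in range(1, max_round + 1):
        answers = [solver(answers) for solver in solvers]
        if len(set(answers)) == 1:
            return answers[0], round_idx
    return None, max_round
```

Because the loop exits as soon as the answers agree, a larger cap only costs extra calls on tasks where the agents keep disagreeing.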

@kohankhaki
Collaborator Author

src/task_solver/generator.py line 66 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

Can this be specified in the config? How hard is it to change the logic to work with >2 solvers?

It is not easy. Needs lots of refactoring.
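
Until such a refactoring happens, the value could still be read from the config and rejected loudly when unsupported. A sketch under that assumption; the `num_solvers` config key, the plain-dict config, and the helper name are hypothetical.

```python
# Only two solvers are supported by the current logic, per the reply above.
SUPPORTED_NUM_SOLVERS = {2}


def get_num_solvers(debate_cfg: dict) -> int:
    """Read num_solvers from the config, defaulting to 2."""
    num_solvers = int(debate_cfg.get("num_solvers", 2))
    if num_solvers not in SUPPORTED_NUM_SOLVERS:
        raise ValueError(
            f"num_solvers={num_solvers} is not supported; "
            f"supported values: {sorted(SUPPORTED_NUM_SOLVERS)}"
        )
    return num_solvers
```

This keeps the hard-coded `2` out of the call site without implying that other values work.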

@kohankhaki
Collaborator Author

src/utils/agentic_prompts.py line 285 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

Should we remove the requirement for 'explanation' in the final answer?

Not sure about this. I guess we need to run experiments to finalize these details.

@kohankhaki
Collaborator Author

src/utils/agentic_prompts.py line 286 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

Having both final_answer and numerical_answer could be confusing.
I suggest we provide only one field for the final solution.

Not having that also had its own complications. This way it is easier to evaluate the end result if it is numerical. I'd say let's modify these details later on, when we run experiments and find the best setting.

Collaborator Author

@kohankhaki kohankhaki left a comment

Reviewable status: 17 of 27 files reviewed, 9 unresolved discussions (waiting on @afkanpour)


README.md line 89 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

Where is the capability for which tasks are to be generated specified? Please add a comment for that in the README.

Done.


README.md line 92 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

Is this tag auto-generated by a previous job (for example, capability generator)? Please explain in the README how this tag should be specified.

In general the README file should provide sufficient information for running all steps easily by someone unfamiliar with the codebase.

Done.


README.md line 95 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

ditto

Done.


src/task_solver/moderator.py line 49 at r2 (raw file):

Previously, afkanpour (Arash) wrote…

Please add a description for the class attributes in the docstring.

Done.

Collaborator

@afkanpour afkanpour left a comment

@afkanpour reviewed 7 of 10 files at r3, 3 of 3 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @kohankhaki)


src/cfg/agentic_config.yaml line 7 at r2 (raw file):

Previously, kohankhaki (Farnaz Kohankhaki) wrote…

/fs01/projects/aieng/public/ace/ needs to be set in output_dir. I intentionally set it to agentic_outputs/ so that someone new to the repo does not make any changes to our primary storage.

So if someone wants to run the pipeline, should they add /fs01/projects/aieng/public/ace/ to the config file? If so, I suggest we simply hard-code it there for now to make the runs easier for everyone. If different paths have to be specified in the config, let's have it as a base_dir or root_dir somewhere in the config and then in the code append it to paths.
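
The base-directory idea could look roughly like the sketch below, assuming a `base_dir` key next to a relative `output_dir`. The key layout, the plain-dict config, and the helper name are hypothetical; the two path values are the ones mentioned in this thread.

```python
from pathlib import Path

# Hypothetical config layout: one shared root plus relative sub-paths.
cfg = {
    "global_cfg": {
        "base_dir": "/fs01/projects/aieng/public/ace",
        "output_dir": "agentic_outputs",
    }
}


def resolve_output_dir(cfg: dict) -> Path:
    """Join the shared base directory with the relative output directory."""
    global_cfg = cfg["global_cfg"]
    return Path(global_cfg["base_dir"]) / global_cfg["output_dir"]
```

Changing `base_dir` in one place would then move every derived path, which is the convenience the suggestion is after.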


src/task_solver/generator.py line 66 at r2 (raw file):

Previously, kohankhaki (Farnaz Kohankhaki) wrote…

It is not easy. Needs lots of refactoring.

OK

@kohankhaki
Collaborator Author

kohankhaki commented Nov 7, 2025

@afkanpour Added a comment in src/cfg/agentic_config.yaml regarding the output_dir.

Collaborator

@afkanpour afkanpour left a comment

:lgtm:

Reviewable status: 26 of 27 files reviewed, 2 unresolved discussions (waiting on @kohankhaki)

@kohankhaki kohankhaki merged commit eea799f into main Nov 7, 2025
1 of 2 checks passed