Analyzer_agent#67
d2f0444 to 7b6bfa4
ebzych left a comment
I think we need to migrate to OpenAI instead of GigaChat, and generalize this for any model, as it is implemented in perf_analyzer.
    def create_model():
        """Create GigaChat model."""
        credentials = os.getenv("GIGACHAT_CREDENTIALS")
        if not credentials:
            raise ValueError("GigaChat credentials environment variable not set")

        model = GigaChat(
            credentials=credentials,
            scope="GIGACHAT_API_PERS",
            model="GigaChat-2-pro",
            verify_ssl_certs=False,
            timeout=120,
            temperature=0.3,
        )
        return model
You should think about how the tool will be used: a user may want to use their own model.
Look at perf_analyzer: it is implemented generically there via environment variables. If generalization is a problem in langchain, try it with openai.
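A minimal sketch of what "generic via environment variables" could look like. The variable names (`LLM_MODEL`, `LLM_API_KEY`, `LLM_BASE_URL`, `LLM_TEMPERATURE`), the default model name, and the `ModelConfig` helper are assumptions for illustration, not taken from perf_analyzer. The `langchain_openai` import is deferred so the configuration part can be checked on its own.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ModelConfig:
    """Provider-agnostic model settings (hypothetical helper)."""
    model: str
    api_key: str
    base_url: Optional[str]
    temperature: float


def load_model_config(env: Optional[Mapping[str, str]] = None) -> ModelConfig:
    """Read model settings from environment variables (names are assumptions)."""
    env = os.environ if env is None else env
    api_key = env.get("LLM_API_KEY")
    if not api_key:
        raise ValueError("LLM_API_KEY environment variable not set")
    return ModelConfig(
        model=env.get("LLM_MODEL", "gpt-4o-mini"),  # assumed default
        api_key=api_key,
        base_url=env.get("LLM_BASE_URL") or None,   # any OpenAI-compatible endpoint
        temperature=float(env.get("LLM_TEMPERATURE", "0.3")),
    )


def create_model(config: ModelConfig):
    """Build a chat model for any OpenAI-compatible endpoint."""
    # Imported lazily so the module loads without langchain_openai installed.
    from langchain_openai import ChatOpenAI

    return ChatOpenAI(
        model=config.model,
        api_key=config.api_key,
        base_url=config.base_url,
        temperature=config.temperature,
    )
```

With this shape a user switches models by changing the environment, and the tool code no longer names a specific provider.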
    import yaml
    from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage
    from langchain_core.tools import tool
    from langchain_gigachat.chat_models import GigaChat
Why do we need to hard-wire GigaChat into the tool?
    """
    Returns file tree in the project. Each line contains relative path to one file.
    Returns max 300 files to avoid token limits.

    Args:
        proj_path: Absolute path to the project directory
    """
This is Google docstring style; we use reST style everywhere, check the other modules.
Don't forget about types:
:param str proj_path: bubuububu
:rtype:
:return:
I hope it does not cause problems with the tool description passed to the LLM.
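For reference, a sketch of the same docstring in reST style with types. The wording of the descriptions is mine; the function body is the one from the diff.

```python
from pathlib import Path

MAX_FILES_IN_TREE = 300


def directory_tree(proj_path: str) -> str:
    """Return the file tree of the project, one relative path per line.

    At most ``MAX_FILES_IN_TREE`` paths are returned to avoid token limits.

    :param str proj_path: Absolute path to the project directory.
    :rtype: str
    :return: Newline-separated relative file paths.
    """
    base = Path(proj_path)
    paths = [str(p.relative_to(base)) for p in base.rglob("*") if p.is_file()]
    return "\n".join(sorted(paths)[:MAX_FILES_IN_TREE])
```

Whether the reST fields end up verbatim in the tool description depends on how the tool decorator builds it, so it is worth printing the generated description once and checking what the model actually receives.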
        proj_path: Absolute path to the project directory
    """
    base = Path(proj_path)
    paths = [str(p.relative_to(base)) for p in base.rglob("*") if p.is_file()]
What if the LLM calls this tool several times but never finds a file in the project root, because rglob is implemented as DFS? You can look at general.BuildSystem.find_relative_path; BFS is implemented there.
    """
    base = Path(proj_path)
    paths = [str(p.relative_to(base)) for p in base.rglob("*") if p.is_file()]
    return "\n".join(sorted(paths)[:MAX_FILES_IN_TREE])
E.g., the src directory in the grpc repository has ~4700 files; if src is the first directory walked, you may not find CMakeLists.txt.
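A minimal sketch of a breadth-first variant, so that files near the project root are listed before the limit is exhausted by a deep directory. This is not the code of general.BuildSystem.find_relative_path; the name `directory_tree_bfs` and the `limit` parameter are hypothetical.

```python
from collections import deque
from pathlib import Path

MAX_FILES_IN_TREE = 300


def directory_tree_bfs(proj_path: str, limit: int = MAX_FILES_IN_TREE) -> str:
    """List up to ``limit`` files level by level, shallowest first.

    :param str proj_path: Absolute path to the project directory.
    :param int limit: Maximum number of file paths to return.
    :rtype: str
    :return: Newline-separated relative file paths.
    """
    base = Path(proj_path)
    queue = deque([base])
    found = []
    while queue and len(found) < limit:
        current = queue.popleft()
        try:
            entries = sorted(current.iterdir(), key=lambda p: p.name)
        except OSError:
            continue  # unreadable directory, skip it
        # Record this level's files before descending any further.
        for entry in entries:
            if entry.is_file():
                found.append(str(entry.relative_to(base)))
                if len(found) >= limit:
                    break
        # Subdirectories are visited only after all shallower levels.
        queue.extend(e for e in entries if e.is_dir() and not e.is_symlink())
    return "\n".join(found)
```

With a limit of 300, a root-level CMakeLists.txt is always reported as long as the root itself holds fewer than 300 files, no matter how large src is.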
    1. Use directory_tree to discover the project structure
    2. Get information about project by presence of files and directories
    3. Use get_file_content to examine build configs (CMakeLists.txt, meson.build, Makefile, etc.) and CI files
    4. Analyze CMakeLists.txt, meson.build, CI configs, etc. for test/benchmark paths
    5. Analyze all build system files to find what systems are used
    6. Analyze third-party directory, CMakeLists.txt find_package, etc. to find dependencies
    7. Put found information in YAML file format
    8. Repeat until you have all information
OpenAI supports skills; with them you can write clearer and more comprehensive instructions for a concrete build system or anything else, because they do not enlarge the system prompt and are loaded only when a condition is satisfied.
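To picture the "loaded only when a condition is satisfied" idea without depending on any vendor feature, here is a plain-Python sketch: build-system-specific guidance lives in separate snippets, and a snippet is attached only when its marker file exists. This is not the OpenAI skills API; `GUIDES`, `select_guides`, and the guidance texts are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Marker file in the project root -> guidance attached only when it is present.
GUIDES: Dict[str, str] = {
    "CMakeLists.txt": "CMake: check add_test() and find_package() calls.",
    "meson.build": "Meson: check test() and dependency() calls.",
    "Makefile": "Make: look for test and check targets.",
}


def select_guides(proj_path: str) -> List[str]:
    """Return guidance only for build systems detected in the project root."""
    base = Path(proj_path)
    return [text for marker, text in GUIDES.items() if (base / marker).is_file()]
```

The base prompt stays short, and a CMake-only project never pays for Meson or Make instructions.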
    except ValueError as e:
        raise e

    model_with_tools = model.bind_tools(TOOLS)
Maybe TOOLS_MAP would be better? You may also need to specify model.tool_choice, because it is None by default?
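A sketch of the TOOLS_MAP idea: a name-to-callable mapping makes dispatching a tool call a single lookup. The two functions are placeholders for the real tools, and `run_tool_call` is a hypothetical helper. The call dict mirrors the `name`/`args`/`id` shape of LangChain tool calls; with real `@tool` objects the key would presumably be the tool's `name` attribute and the call would go through `invoke`, which I have not checked against this module.

```python
from typing import Any, Callable, Dict


def directory_tree(proj_path: str) -> str:
    """Placeholder standing in for the real tool."""
    return f"tree of {proj_path}"


def get_file_content(file_path: str) -> str:
    """Placeholder standing in for the real tool."""
    return f"content of {file_path}"


TOOLS = [directory_tree, get_file_content]
# Name -> callable, so a tool call is resolved without scanning the list.
TOOLS_MAP: Dict[str, Callable[..., str]] = {fn.__name__: fn for fn in TOOLS}


def run_tool_call(call: Dict[str, Any]) -> str:
    """Execute one tool call of the form {"name": ..., "args": {...}, "id": ...}."""
    fn = TOOLS_MAP.get(call["name"])
    if fn is None:
        # Report the error to the model instead of crashing the agent loop.
        return f"Unknown tool: {call['name']}"
    return fn(**call["args"])
```

On tool_choice: `bind_tools` accepts a `tool_choice` argument, but the accepted values differ between providers, so the value should be checked against the chosen model's documentation.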
    messages.append(result)

    if len(messages) > MAX_MESSAGES:
        messages = messages[-MAX_MESSAGES:]
Do you take messages from the end in order to keep the part of the conversation with the YAML formatting?
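One thing to watch with a plain tail slice: once the history exceeds the limit, the first message (normally the system prompt with the instructions and the YAML format) is dropped as well. A minimal sketch that pins the first message and trims only the middle, assuming the system prompt is `messages[0]`; `trim_history` is a hypothetical helper.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def trim_history(messages: Sequence[T], max_messages: int) -> List[T]:
    """Keep the first message plus the most recent ones, at most max_messages total."""
    if max_messages < 2:
        raise ValueError("max_messages must be at least 2")
    if len(messages) <= max_messages:
        return list(messages)
    # The head carries the instructions, the tail carries the latest context.
    return [messages[0]] + list(messages[-(max_messages - 1):])
```

A tail cut can also separate a tool result from the assistant message that requested it, which some providers reject, so the cut point may need adjusting to a message boundary that keeps such pairs together.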
    last_message = messages[-1]
    content = last_message.content
    if isinstance(content, list):
        output_text = str(content)

    from langchain_core.tools import tool
    from langchain_gigachat.chat_models import GigaChat

    LLM_ANALYSIS_FILE = "amphimixis_llm.analyzed"
Maybe centralize this data in one file so that other modules can reuse it?
    if use_llm is True:
        _logger.info("Analyzing with llm")
        analyze_with_agent(proj_path)
        _logger.info("Analyzing with llm done")
You don't return after this.
I understand that you do a primary analysis with heuristics in case the LLM fails, but I don't see the LLM's results reflected in the final results anywhere.
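A sketch of the control flow I would expect: return the agent's result when it succeeds, and fall through to the heuristics only on failure or when the LLM is disabled. The function name `analyze`, the injected callables, and the dict return type are assumptions made to keep the example self-contained.

```python
import logging
from typing import Any, Callable, Dict

_logger = logging.getLogger(__name__)

Analysis = Dict[str, Any]


def analyze(
    proj_path: str,
    use_llm: bool,
    run_agent: Callable[[str], Analysis],
    run_heuristics: Callable[[str], Analysis],
) -> Analysis:
    """Prefer the LLM analysis; fall back to heuristics if it fails or is disabled."""
    if use_llm:
        _logger.info("Analyzing with llm")
        try:
            result = run_agent(proj_path)
        except Exception:
            # Keep the tool usable when the model or network is unavailable.
            _logger.exception("LLM analysis failed, falling back to heuristics")
        else:
            _logger.info("Analyzing with llm done")
            return result
    return run_heuristics(proj_path)
```

If both analyses should contribute, the heuristic result could be computed first and then updated with the agent's fields; either way the agent's output has to reach the returned value.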
Add agent for optional analysis with LLM