Add comprehensive PromQL extraction with multiple fallback patterns#21
Add comprehensive PromQL extraction with multiple fallback patterns#21yashpal2104 wants to merge 2 commits intosodafoundation:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves the robustness of extracting executable PromQL from LLM (Ollama) responses and updates the DynamicPrompt templates to steer the model toward syntactically valid PromQL, while also expanding Prometheus config parsing to support multiple-instance config formats.
Changes:
- Added multiple fallback extraction patterns and basic PromQL syntax heuristics in
get_promql_from_ollama. - Updated Prometheus connection setup to accept
prometheus_instances-based config and additional base URL key fallbacks. - Refined prompt templates to request “PromQL-only” output and added a PromQL syntax reference to reduce LLM formatting/syntax errors.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/copilot/DP_logic/dp_logic.py | Adds multi-pattern PromQL extraction/cleanup + syntax heuristics; expands Prometheus config parsing logic. |
| pkg/copilot/DP_logic/DynamicPrompt/dynamic_prompt/prompt_builder.py | Reorders prompt content to more directly instruct the model to generate PromQL for the user question. |
| pkg/copilot/DP_logic/DynamicPrompt/config/template_sections/postamble.md | Tightens instructions to output only a PromQL code block and avoid placeholders/comments. |
| pkg/copilot/DP_logic/DynamicPrompt/config/template_sections/domain.md | Adds an embedded PromQL syntax reference and common pitfalls to guide query generation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Step 5: Create the core `query` expression. This is the PromQL string. If necessary, build the query using aggregation operators, functions, or vector selectors, based on what the input text describes. Explicitly explain how you construct this expression. | ||
|
|
||
| Step 6: Display the final `query` expression in a labelled markdown block like so: | ||
| ```promql | ||
| [final query here] | ||
| ``` | ||
|
|
||
| IMPORTANT: Do not ask the user for clarification or additional details. Always infer the best possible query and format. | ||
| Step 6: Output ONLY the PromQL query expression inside a promql markdown code block. | ||
| The code block must contain the actual executable PromQL query you generated - nothing else. | ||
|
|
There was a problem hiding this comment.
The instructions are internally inconsistent: Step 5 asks to "Explicitly explain" how the query is constructed, but Step 6 now requires outputting ONLY the PromQL code block and nothing else. This conflict can cause the LLM to emit extra text and break downstream extraction; consider removing the Step 5 explanation requirement or clarifying that any explanation must be outside the final response (or omitted entirely).
pkg/copilot/DP_logic/dp_logic.py
Outdated
| # Clean up any square brackets or placeholder text that might have been included | ||
| promql = re.sub(r'^\s*\[\s*', '', promql) # Remove leading [ | ||
| promql = re.sub(r'\s*\]\s*$', '', promql) # Remove trailing ] | ||
| promql = promql.strip() |
There was a problem hiding this comment.
The cleanup that strips a trailing ']' will break valid PromQL expressions that legitimately end with a range selector (e.g., metric[5m]), turning them into invalid syntax. Consider removing this normalization, or only stripping surrounding brackets when the entire extracted string is clearly a placeholder wrapper (e.g., starts with '[' and ends with ']' and contains no other '['/']' pairs).
| # Clean up any square brackets or placeholder text that might have been included | |
| promql = re.sub(r'^\s*\[\s*', '', promql) # Remove leading [ | |
| promql = re.sub(r'\s*\]\s*$', '', promql) # Remove trailing ] | |
| promql = promql.strip() | |
| # Clean up any placeholder-style square bracket wrappers like "[ query ]" | |
| promql = promql.strip() | |
| if promql.startswith("[") and promql.endswith("]") and promql.count("[") == 1 and promql.count("]") == 1: | |
| promql = promql[1:-1].strip() |
pkg/copilot/DP_logic/dp_logic.py
Outdated
| # Remove comments (# and everything after on the same line) | ||
| promql = re.sub(r'#.*', '', promql) |
There was a problem hiding this comment.
Removing everything after # can corrupt otherwise-valid PromQL when # appears inside a quoted label value (e.g., {label="a#b"}). Since PromQL doesn’t have comments, it may be better to avoid this transformation or only strip lines where # is the first non-whitespace character and not within quotes.
| # Remove comments (# and everything after on the same line) | |
| promql = re.sub(r'#.*', '', promql) | |
| # Remove comment lines where '#' is the first non-whitespace character | |
| promql = "\n".join( | |
| line for line in promql.splitlines() | |
| if not line.lstrip().startswith("#") | |
| ) |
| # Handle both direct config and prometheus_instances structure | ||
| if "prometheus_instances" in prom_config: | ||
| # Extract first instance from list | ||
| instance = prom_config["prometheus_instances"][0] | ||
| base_url = instance["base_url"] | ||
| disable_ssl = instance.get("disable_ssl", False) | ||
| else: | ||
| # Fallback logic: check for 'base_url', then 'prometheus_url', then 'url' | ||
| base_url = prom_config.get("base_url") or \ | ||
| prom_config.get("prometheus_url") or \ | ||
| prom_config.get("url") | ||
| disable_ssl = prom_config.get("disable_ssl", True) | ||
|
|
||
| prom = PrometheusConnect( | ||
| url=prom_config["base_url"], | ||
| disable_ssl=True | ||
| url=base_url, | ||
| disable_ssl=disable_ssl | ||
| ) |
There was a problem hiding this comment.
prometheus_instances handling assumes the list is non-empty and that base_url is present; otherwise this will raise (IndexError/KeyError) or pass None into PrometheusConnect. Also, any per-instance headers in config (supported elsewhere in the repo) are ignored here, which can break auth-required Prometheus setups. Suggest validating the config shape (non-empty list, required keys) and passing through headers when constructing PrometheusConnect.
|
@rohithvaidya @Anagha9812 pls help to review |
|
@yashpal2104 updating this PR with reworked code? |
|
@sushanthakumar reworked as in? |
|
Hi @yashpal2104, Thanks for raising a PR for the fix, by adding syntax checking and promql extraction routines. Please incorporate the below points for modularity and provide clarifications for the functionality being added.
|
3502f24 to
31d3e37
Compare
|
- Implement robust syntax validation for common PromQL errors - Detect 'over' vs 'by'/'without' aggregation mistakes - Validate bracket/parentheses/brace balancing - Check for range vectors used directly in aggregations - Verify over-time functions have required time ranges - Improve PromQL cleaning: strip comments, placeholders, and language markers - Add flexible Prometheus config handling for multiple instance formats - Update prompt templates with PromQL syntax reference and clearer instructions Signed-off-by: Yash Pal <yashpal2104@gmail.com>
…r modularity Signed-off-by: Yash Pal <yashpal2104@gmail.com>
31d3e37 to
8a0d632
Compare
Fixes #20
What type of PR is this?
/kind bug fix
/kind enhancement
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #20
Test Report Added?:
Test Report:
Special notes for your reviewer:


