|
1 | 1 | --- |
2 | 2 | title: Advanced Filtering |
3 | | -description: Understand how to add filter expressions to your Knowledge search requests. |
4 | | -keywords: [search, retrieval, rag, agentic rag, agentic search, search, filters, metadata] |
| 3 | +description: Use filter expressions (EQ, AND, OR, NOT) for complex logical filtering of knowledge base searches. |
| 4 | +keywords: [search, retrieval, rag, agentic rag, agentic search, filters, metadata, filter expressions] |
5 | 5 | --- |
6 | 6 |
|
7 | | -When agents search through knowledge bases, sometimes you need more control than just "find similar content." Maybe you want to search only within specific documents, exclude outdated information, or focus on content from particular sources. That's where advanced filtering comes in—it lets you precisely target which content gets retrieved. |
| 7 | +When basic dictionary filters aren't enough, filter expressions give you powerful logical control over knowledge searches. Use them to combine multiple conditions with AND/OR logic, exclude content with NOT, or perform comparisons like "greater than" and "less than". |
8 | 8 |
|
9 | | -## How Knowledge Filtering Works |
| 9 | +For basic filtering with dictionary format, see [Search & Retrieval](/concepts/knowledge/core-concepts/search-retrieval). |
10 | 10 |
|
11 | | -Think of filtering like adding smart constraints to a library search. Instead of searching through every book, you can tell the librarian: "Only look in the science section, published after 2020, but exclude textbooks." Knowledge filtering works the same way—you specify criteria based on the metadata attached to your content. |
12 | | - |
13 | | -<Steps> |
14 | | - <Step title="Metadata Assignment"> |
15 | | - When you add content, attach metadata like department, document type, date, or any custom attributes. |
16 | | - </Step> |
17 | | - <Step title="Filter Construction"> |
18 | | - Build filter expressions using comparison and logical operators to define your search criteria. |
19 | | - </Step> |
20 | | - <Step title="Targeted Search"> |
21 | | - The knowledge base only searches through content that matches your filter conditions. |
22 | | - </Step> |
23 | | - <Step title="Contextual Results"> |
24 | | - You get precisely the information you need from exactly the right sources. |
25 | | - </Step> |
26 | | -</Steps> |
27 | | - |
28 | | -## Available Filter Expressions |
| 11 | +## Filter Expression Operators |
29 | 12 |
|
30 | 13 | Agno provides a rich set of filter expressions that can be combined to create sophisticated search criteria: |
31 | 14 |
|
@@ -314,74 +297,7 @@ async def progressive_search(agent, query, base_filters=None): |
314 | 297 | return broad_results |
315 | 298 | ``` |
316 | 299 |
|
317 | | -## Working with Metadata |
318 | | - |
319 | | -### Designing Effective Metadata |
320 | | - |
321 | | -Good filtering starts with thoughtful metadata design: |
322 | | - |
323 | | -```python |
324 | | -# ✅ Rich, searchable metadata |
325 | | -good_metadata = { |
326 | | - "document_type": "policy", |
327 | | - "department": "hr", |
328 | | - "category": "benefits", |
329 | | - "audience": "all_employees", |
330 | | - "last_updated": "2024-01-15", |
331 | | - "version": "2.1", |
332 | | - "tags": ["health_insurance", "401k", "vacation"], |
333 | | - "sensitivity": "internal" |
334 | | -} |
335 | | - |
336 | | -# ❌ Sparse, hard to filter metadata |
337 | | -poor_metadata = { |
338 | | - "type": "doc", |
339 | | - "id": "12345" |
340 | | -} |
341 | | -``` |
342 | | - |
343 | | -### Dynamic Metadata Assignment |
344 | | - |
345 | | -Add metadata programmatically based on content: |
346 | | - |
347 | | -```python |
348 | | -def assign_metadata(file_path: str) -> dict: |
349 | | - """Generate metadata based on file characteristics.""" |
350 | | - metadata = {} |
351 | | - |
352 | | - # Extract from filename |
353 | | - if "policy" in file_path.lower(): |
354 | | - metadata["document_type"] = "policy" |
355 | | - elif "guide" in file_path.lower(): |
356 | | - metadata["document_type"] = "guide" |
357 | | - |
358 | | - # Extract department from path |
359 | | - if "/hr/" in file_path: |
360 | | - metadata["department"] = "hr" |
361 | | - elif "/engineering/" in file_path: |
362 | | - metadata["department"] = "engineering" |
363 | | - |
364 | | - # Add timestamp |
365 | | - metadata["indexed_at"] = datetime.now().isoformat() |
366 | | - |
367 | | - return metadata |
368 | | - |
369 | | -# Use when adding content |
370 | | -for file_path in document_files: |
371 | | - knowledge.add_content( |
372 | | - path=file_path, |
373 | | - metadata=assign_metadata(file_path) |
374 | | - ) |
375 | | -``` |
376 | | - |
377 | | -## Best Practices for Advanced Filtering |
378 | | - |
379 | | -### Metadata Strategy |
380 | | - |
381 | | -- **Be Consistent**: Use standardized values (e.g., always "hr" not sometimes "HR" or "human_resources") |
382 | | -- **Think Hierarchically**: Use nested categories when appropriate (`department.team`, `location.region`) |
383 | | -- **Include Temporal Data**: Add dates, versions, or other time-based metadata for lifecycle management |
384 | | -- **Add Semantic Tags**: Include searchable tags or keywords that might not appear in the content |
| 300 | +## Best Practices for Filter Expressions |
385 | 301 |
|
386 | 302 | ### Filter Design |
387 | 303 |
|
@@ -418,6 +334,89 @@ def safe_filter_search(agent, query, filters): |
418 | 334 | return agent.get_relevant_docs_from_knowledge(query=query) |
419 | 335 | ``` |
420 | 336 |
|
| 337 | +## Troubleshooting |
| 338 | + |
| 339 | +### Filter Not Working |
| 340 | + |
| 341 | +<AccordionGroup> |
| 342 | + <Accordion title="Verify metadata keys exist"> |
| 343 | + Check that the keys you're filtering on actually exist in your knowledge base: |
| 344 | + |
| 345 | + ```python |
| 346 | + # Add content with explicit metadata |
| 347 | + knowledge.add_content( |
| 348 | + path="doc.pdf", |
| 349 | + metadata={"status": "published", "category": "tech"} |
| 350 | + ) |
| 351 | + |
| 352 | + # Now the filter has a matching metadata key to work with |
| 353 | + filter_expr = EQ("status", "published") |
| 354 | + ``` |
| 355 | + </Accordion> |
| 356 | + |
| 357 | + <Accordion title="Check filter structure"> |
| 358 | + Print the filter to verify it's constructed correctly: |
| 359 | + |
| 360 | + ```python |
| 361 | + from agno.filters import EQ, GT, AND |
| 362 | + |
| 363 | + filter_expr = AND(EQ("status", "published"), GT("views", 100)) |
| 364 | + print(filter_expr.to_dict()) |
| 365 | + ``` |
| 366 | + </Accordion> |
| 367 | +</AccordionGroup> |
| 368 | + |
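One way to automate the "verify metadata keys exist" check above is to walk the filter's dictionary form and compare the keys it references against the metadata you actually attached. The sketch below is stand-alone and does not import Agno. The dictionary shape it walks (`op`, `key`, `value`, `conditions`, `condition`) is an assumption made for illustration and may not match what `to_dict()` returns in your version, and the helper names `referenced_keys` and `missing_keys` are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Set


def referenced_keys(filter_dict: Dict[str, Any]) -> Set[str]:
    """Collect every metadata key mentioned anywhere in a nested filter dict."""
    keys: Set[str] = set()
    if "key" in filter_dict:
        keys.add(filter_dict["key"])
    # Assumed shape: AND/OR keep children under "conditions",
    # NOT keeps a single child under "condition".
    for child in filter_dict.get("conditions", []):
        keys |= referenced_keys(child)
    if "condition" in filter_dict:
        keys |= referenced_keys(filter_dict["condition"])
    return keys


def missing_keys(
    filter_dict: Dict[str, Any],
    documents_metadata: Iterable[Dict[str, Any]],
) -> List[str]:
    """Return filter keys that appear in none of the documents' metadata."""
    seen: Set[str] = set()
    for metadata in documents_metadata:
        seen.update(metadata)
    return sorted(referenced_keys(filter_dict) - seen)


# Hypothetical filter: status == "published" AND views > 100
example_filter = {
    "op": "AND",
    "conditions": [
        {"op": "EQ", "key": "status", "value": "published"},
        {"op": "GT", "key": "views", "value": 100},
    ],
}
# Metadata as attached in the accordion example above
example_metadata = [{"status": "published", "category": "tech"}]

print(missing_keys(example_filter, example_metadata))  # ['views']
```

An empty list means every key in the filter exists on at least one document. A non-empty list points at the keys most likely to make the filter return nothing.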
| 369 | +### Complex Filters Failing |
| 370 | + |
| 371 | +<AccordionGroup> |
| 372 | + <Accordion title="Break down into smaller filters"> |
| 373 | + Test each condition individually: |
| 374 | + |
| 375 | + ```python |
| 376 | + # Test each part separately |
| 377 | + filter1 = EQ("status", "published") # Test |
| 378 | + filter2 = GT("date", "2024-01-01") # Test |
| 379 | + filter3 = IN("region", ["US", "EU"]) # Test |
| 380 | + |
| 381 | + # Then combine |
| 382 | + combined = AND(filter1, filter2, filter3) |
| 383 | + ``` |
| 384 | + </Accordion> |
| 385 | + |
| 386 | + <Accordion title="Verify filter structure"> |
| 387 | + Confirm that the filter serializes to valid JSON, which catches values that cannot be serialized: |
| 388 | + |
| 389 | + ```python |
| 390 | + import json |
| 391 | + |
| 392 | + try: |
| 393 | + filter_dict = filter_expr.to_dict() |
| 394 | + json_str = json.dumps(filter_dict) |
| 395 | + json.loads(json_str) # Verify it parses |
| 396 | + print("Valid filter structure") |
| 397 | + except (TypeError, ValueError) as e: |
| 398 | + print(f"Invalid filter: {e}") |
| 399 | + ``` |
| 400 | + </Accordion> |
| 401 | + |
| 402 | + <Accordion title="Check operator precedence"> |
| 403 | + Nesting determines how conditions are grouped, so make the grouping explicit: |
| 404 | + |
| 405 | + ```python |
| 406 | + # Clear nested structure |
| 407 | + filter_expr = OR( |
| 408 | + AND(EQ("a", 1), EQ("b", 2)), |
| 409 | + EQ("c", 3) |
| 410 | + ) |
| 411 | + |
| 412 | + # Break down complex filters for readability |
| 413 | + condition1 = AND(EQ("a", 1), EQ("b", 2)) |
| 414 | + condition2 = EQ("c", 3) |
| 415 | + filter_expr = OR(condition1, condition2) |
| 416 | + ``` |
| 417 | + </Accordion> |
| 418 | +</AccordionGroup> |
| 419 | + |
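To see why the grouping in nested filters matters, the following stand-alone sketch evaluates two differently nested expressions against the same metadata. It does not use Agno: the `matches` evaluator, the `eq` helper, and the dictionary shape are hypothetical, chosen only to illustrate the logic, and they say nothing about how a vector database actually applies filters.

```python
from typing import Any, Dict


def eq(key: str, value: Any) -> Dict[str, Any]:
    """Build a hypothetical equality condition as a plain dict."""
    return {"op": "EQ", "key": key, "value": value}


def matches(expr: Dict[str, Any], metadata: Dict[str, Any]) -> bool:
    """Evaluate a nested filter dict against one document's metadata."""
    op = expr["op"]
    if op == "AND":
        return all(matches(child, metadata) for child in expr["conditions"])
    if op == "OR":
        return any(matches(child, metadata) for child in expr["conditions"])
    if op == "NOT":
        return not matches(expr["condition"], metadata)
    if op == "EQ":
        return metadata.get(expr["key"]) == expr["value"]
    raise ValueError(f"Unsupported operator: {op}")


# (a == 1 AND b == 2) OR c == 3
left_grouped = {
    "op": "OR",
    "conditions": [
        {"op": "AND", "conditions": [eq("a", 1), eq("b", 2)]},
        eq("c", 3),
    ],
}

# a == 1 AND (b == 2 OR c == 3)
right_grouped = {
    "op": "AND",
    "conditions": [
        eq("a", 1),
        {"op": "OR", "conditions": [eq("b", 2), eq("c", 3)]},
    ],
}

doc = {"a": 0, "b": 0, "c": 3}
print(matches(left_grouped, doc))   # True: the c == 3 branch is enough
print(matches(right_grouped, doc))  # False: a == 1 is required and fails
```

The same three conditions select different documents depending on how they are nested, which is why naming intermediate conditions, as in the accordion above, helps keep complex filters readable.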
421 | 420 | ### Vector Database Support |
422 | 421 |
|
423 | 422 | Advanced filter expressions (using `FilterExpr` like `EQ()`, `AND()`, etc.) have varying support across vector databases: |
|