Terminology & Concepts

Standard vocabulary for AI agent systems - speak the same language

Core Concepts

Agent

A system that uses an LLM to autonomously make decisions and take actions to accomplish a goal.

Not to be confused with:

Chatbot - Conversational interface (may or may not be an agent)
Assistant - Often marketing term; may be simple RAG + chat
Co-pilot - Usually assists humans rather than acting autonomously

Orchestrator

The component that coordinates agent behavior - decides what to do next.

Also called:

Controller
Planner
Reasoner

Responsibilities:

Parse LLM output
Decide next action
Manage execution flow
Handle errors and retries

Tool

An external function or API that an agent can invoke.

Examples:

Web search API
Database query
Calculator function
File system access
Email sender

Key properties:

Name - Identifier the LLM uses
Description - What the tool does (critical for LLM understanding)
Parameters - Input schema (usually JSON)
Return type - Output format

Tool Calling (Function Calling)

The mechanism by which an LLM requests to execute a tool.

Flow:

LLM receives prompt + tool definitions
LLM outputs structured tool request (JSON)
Orchestrator parses and executes tool
Result returned to LLM for next step

Formats:

OpenAI function calling - Native support in API
Anthropic tool use - Similar structured format
ReAct format - Text-based with Thought/Action/Observation
Custom JSON - Roll your own schema

Memory

The ability to store and retrieve information across interactions.

Types:

Short-term Memory (Working Memory)

Current conversation or task context
Stored in prompt/context window
Cleared after task completes

Long-term Memory (Persistent)

Facts, preferences, history stored externally
Survives across sessions
Requires retrieval mechanism (vector DB, key-value store)

Semantic Memory

General knowledge and learned patterns
Often implemented via RAG

Episodic Memory

Specific past interactions or events
"Remember when we discussed X?"

Context Window

The amount of text (tokens) an LLM can process at once.

Implications:

Limits how much conversation history you can include
Affects cost (longer context = more tokens)
Trade-off: More context vs. speed/cost

Typical sizes (as of 2026):

Small models: 4K-8K tokens
Standard: 32K-128K tokens
Long-context: 200K-1M+ tokens

Prompt

The input text sent to the LLM, including instructions, context, and user query.

Components:

System message - Role and behavior instructions
Few-shot examples - Example inputs/outputs
Context - Relevant background information
User query - The actual task/question
Tool definitions - Available functions
History - Previous conversation turns

Reasoning

The LLM's internal process of deciding what to do.

Patterns:

ReAct (Reason + Act)

Thought: I need to find the weather
Action: search_weather(location="Seattle")
Observation: 65°F, partly cloudy
Thought: Now I can answer the user
Final Answer: It's 65°F in Seattle

Chain-of-Thought (CoT)

Let's think step by step:
1. First, I need the user's location
2. Then I'll check the weather API
3. Finally, I'll format a friendly response

Plan-and-Execute

Plan:
- Step 1: Get location
- Step 2: Fetch weather
- Step 3: Respond

Execution: [execute plan steps]

Observation

The result returned after an action (tool call) is executed.

Example flow:

Action: calculate(2 + 2)
Observation: 4
Next thought: "The answer is 4"

Router

A component or agent that dispatches tasks to specialized handlers.

Types:

LLM-based - Uses reasoning to choose route
Rule-based - Keyword matching or classifiers
Hybrid - Combines both approaches

Example:

User: "I need a refund"
Router: [classifies as billing issue]
→ Routes to billing_agent

Sub-agent (Worker Agent)

A specialized agent that handles a specific domain or task type.

Example hierarchy:

Manager agent - Coordinates overall task
- Research agent - Gathers information
- Writing agent - Generates content
- Editing agent - Proofreads output

Workflow vs. Agent

Workflow	Agent
Predefined steps	Dynamic decisions
Deterministic	Non-deterministic
If-then logic	LLM reasoning
State machine	Autonomous executor

When terms blur: Some systems are "agentic workflows" - workflows with LLM decision points.

Execution Patterns

Sequential Execution

Tasks run one after another; output of step N feeds into step N+1.

Step 1 → Step 2 → Step 3 → Result

Parallel Execution

Multiple tasks run simultaneously; results combined afterward.

     ┌─ Task A ─┐
Start├─ Task B ─┤→ Combine → Result
     └─ Task C ─┘

Conditional Execution

Next step depends on a decision or result.

Start → Check condition
         ├─ If true → Path A
         └─ If false → Path B

Iterative Execution (Loop)

Repeat actions until goal achieved or limit reached.

Start → Action → Check goal
         ↑           |
         └───────────┘ (loop if not done)

Error Handling Concepts

Retry Strategy

Automatically re-attempt failed actions.

Approaches:

Simple retry - Try again N times
Exponential backoff - Wait longer between retries
Conditional retry - Retry only for specific errors

Fallback

Alternative action when primary fails.

Example:

Try API call → If fails → Use cached data → If no cache → Return error message

Validation

Check LLM output before executing.

Types:

Schema validation - Is JSON formatted correctly?
Constraint validation - Are parameters within bounds?
Semantic validation - Does action make sense in context?

Circuit Breaker

Stop calling a failing service after repeated errors.

States:

Closed - Normal operation
Open - Reject requests immediately (service is down)
Half-open - Test if service recovered

Performance Concepts

Latency

Time from request to response.

Sources:

LLM inference time
Tool execution time
Network delays
Orchestration overhead

Optimization:

Parallel execution
Caching
Streaming responses
Smaller models

Throughput

Number of requests handled per unit time.

Bottlenecks:

Rate limits (API quotas)
Model concurrency limits
Database connections
Cost budgets

Token

The basic unit of text for LLMs.

Rough conversion:

1 token ≈ 4 characters
1 token ≈ 0.75 words
100 tokens ≈ 75 words

Why it matters:

Pricing is per token
Context window measured in tokens
Latency increases with token count

Token Budget

The maximum tokens you're willing to spend per task.

Components:

Input tokens (prompt + context)
Output tokens (LLM response)
Tool call overhead

Example:

Max budget: 10K tokens
Prompt: 2K tokens
LLM response: 500 tokens
Tool results: 1K tokens
Remaining: 6.5K tokens for additional steps

Observability Concepts

Trace

A record of all steps taken by an agent to complete a task.

Contains:

LLM calls with prompts and responses
Tool executions with inputs and outputs
Timestamps and durations
Errors and retries

Span

A single operation within a trace.

Example span types:

LLM inference
Tool call
Database query
External API request

Logging

Recording events for debugging and monitoring.

Levels:

Debug - Detailed internal state
Info - Key milestones
Warn - Recoverable issues
Error - Failures requiring attention

Framework-Specific Terms

LangChain

Chain - Sequence of components
Agent executor - Runs agent loop
Callback - Hook for observability

LlamaIndex

Query engine - Executes data retrieval
Agent runner - Orchestration layer
Index - Data structure for retrieval

Semantic Kernel

Kernel - Core orchestration engine
Plugin - Reusable skill or tool
Planner - Generates execution plans

Quick Reference

Term	One-Line Definition
Agent	Autonomous system using LLM to take actions toward goals
Tool	External function agent can invoke
Memory	Ability to store/retrieve info across interactions
Orchestrator	Coordinates agent behavior and execution flow
Router	Dispatches tasks to specialized handlers
ReAct	Reason → Act → Observe loop pattern
Token	Basic unit of LLM text (≈0.75 words)
Trace	Record of all agent steps in a task
Context window	Max tokens LLM can process at once
Fallback	Alternative when primary action fails

Next Steps

Understand the terms? → Use the Decision Tree to pick your pattern
Ready to implement? → Jump to Core Patterns
Need production concepts? → Check Production Engineering

FilesExpand file tree

terminology.md

Latest commit

History

terminology.md

File metadata and controls

Terminology & Concepts

Core Concepts

Agent

Orchestrator

Tool

Tool Calling (Function Calling)

Memory

Short-term Memory (Working Memory)

Long-term Memory (Persistent)

Semantic Memory

Episodic Memory

Context Window

Prompt

Reasoning

ReAct (Reason + Act)

Chain-of-Thought (CoT)

Plan-and-Execute

Observation

Router

Sub-agent (Worker Agent)

Workflow vs. Agent

Execution Patterns

Sequential Execution

Parallel Execution

Conditional Execution

Iterative Execution (Loop)

Error Handling Concepts

Retry Strategy

Fallback

Validation

Circuit Breaker

Performance Concepts

Latency

Throughput

Token

Token Budget

Observability Concepts

Trace

Span

Logging

Framework-Specific Terms

LangChain

LlamaIndex

Semantic Kernel

Quick Reference

Next Steps

Further Reading