---
title: "Agent Architecture"
type: concept
tags: [planning, reasoning, tool-use, memory, multi-agent]
created: 2025-01-01
updated: 2025-07-14
status: complete
---
Agent Architecture
Agent architecture describes the structural design of an AI agent system, specifying how components such as perception, reasoning, memory, tool use, and action-selection are organized and interact.
Overview
Agent architecture is the blueprint that determines how an AI agent thinks, remembers, acts, and communicates. A well-designed architecture integrates a foundation model as the core reasoning engine with surrounding components that extend its capabilities: memory systems for context retention, tool interfaces for environmental interaction, and orchestration logic for managing multi-step task execution.
The field has converged on several recurring structural patterns. The ReAct Framework introduced a widely adopted loop of interleaved reasoning and acting steps. Plan-and-Execute architectures separate high-level goal decomposition from low-level execution. More recent designs add self-refinement loops, where agents evaluate and revise their own outputs before committing to an action.
Google Cloud's component model for agent architecture identifies four core building blocks that every agent defines: a persona (role, personality, and communication style), memory (short-term, long-term, episodic, and consensus), tools (external resources and APIs), and the model (the LLM foundation that processes language and generates reasoning). These components interact within an operational loop of observe → reason → plan → act → self-refine.
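The four components can be sketched as a simple data model. This is an illustrative sketch only, not part of any Google Cloud API; all class and field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Persona:
    role: str    # e.g. "research assistant"
    style: str   # communication style, e.g. "concise"

@dataclass
class Memory:
    short_term: list = field(default_factory=list)   # current working context
    long_term: dict = field(default_factory=dict)    # persistent facts
    episodic: list = field(default_factory=list)     # traces of past tasks

@dataclass
class Agent:
    persona: Persona
    memory: Memory
    tools: dict   # tool name -> callable
    model: str    # identifier of the underlying LLM
```

In a real system the `model` field would be a client for an LLM endpoint and `tools` would carry schemas the model can read; here they are reduced to their structural roles.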
Architecture choices directly determine an agent's capability ceiling, latency profile, cost, and safety properties. Single-agent architectures optimize for simplicity and well-defined tasks; multi-agent architectures built from specialized sub-agents can tackle more complex problems at the cost of coordination overhead. See Multi-Agent Coordination and Agent Orchestration for the governance layer that sits above individual agent designs.
How It Works
A standard agent architecture consists of the following layers:
┌──────────────────────────────────────────────────┐
│ GOAL / TASK INPUT │
└───────────────────────┬──────────────────────────┘
│
┌───────────────────────▼──────────────────────────┐
│ REASONING ENGINE (LLM) │
│ - Persona definition │
│ - Instruction following │
│ - Chain-of-thought / planning │
└──────┬────────────────┬────────────────┬─────────┘
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ MEMORY │ │ TOOLS │ │ ACTION │
│ Short-term │ │ APIs, DBs │ │ Execution │
│ Long-term │ │ Code, Search│ │ & Output │
│ Episodic │ │ External sys│ └─────────────┘
│ Consensus │ └─────────────┘
└─────────────┘
Core loop:
- The agent receives a goal or task
- The LLM reasons over available context (memory + observations)
- The agent selects and executes a tool or action
- Observations from the action feed back into memory and the next reasoning step
- The loop continues until the goal is achieved or a stopping criterion is met
- Self-refinement may trigger re-evaluation of outputs before they are committed
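The core loop above can be sketched in a few lines of Python. The `reason` function stands in for an LLM call and the decision format is hypothetical; a real implementation would parse structured model output.

```python
def run_agent(goal, reason, tools, max_steps=10):
    """Minimal observe -> reason -> act loop (illustrative sketch)."""
    memory = [f"goal: {goal}"]
    for _ in range(max_steps):
        decision = reason(memory)            # LLM call in a real system
        if decision["type"] == "final":
            return decision["answer"]        # stopping criterion met
        tool = tools[decision["tool"]]       # agent selects a tool
        observation = tool(decision["input"])
        memory.append(f"observed: {observation}")  # feed back into context
    return None  # step budget exhausted without reaching the goal
```

The `max_steps` cap is one form of stopping criterion; production systems typically add cost budgets and self-refinement checks before an action is committed.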
Key Properties / Characteristics
- Persona-driven behavior: Each agent defines a role and communication style that shapes how it interprets goals and formats outputs
- Four-component structure: Persona, memory, tools, and model are the canonical building blocks per Google Cloud's framework
- Observe-reason-plan-act loop: The core operational cycle shared across most modern agent architectures
- Modularity: Components can be swapped or upgraded independently (e.g., changing the LLM without changing the memory system)
- Scalability via composition: Single agents can be composed into multi-agent networks, with each agent potentially using a different foundation model optimized for its role
- Tool extensibility: Architecture is open-ended; new capabilities are added by registering new tools rather than retraining the model
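The tool-extensibility property can be made concrete with a small registry: new capabilities are added by registering callables with descriptions the model can read, with no retraining involved. The class and method names below are illustrative.

```python
class ToolRegistry:
    """Hypothetical registry: tools are plain callables plus a description."""

    def __init__(self):
        self._tools = {}

    def register(self, name, fn, description=""):
        # Adding a capability is just adding an entry; the model is untouched.
        self._tools[name] = {"fn": fn, "description": description}

    def describe(self):
        # The descriptions are what the reasoning engine sees when choosing a tool.
        return {name: t["description"] for name, t in self._tools.items()}

    def call(self, name, *args, **kwargs):
        return self._tools[name]["fn"](*args, **kwargs)
```

Swapping the LLM or the memory backend leaves this registry untouched, which is the modularity property in practice.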
Variants & Related Approaches
- ReAct Framework: Interleaved reasoning traces and tool-calling actions in a single context window; the dominant pattern for single-agent systems
- Plan-and-Execute: A planner agent decomposes the goal into a sequence of steps; an executor agent carries them out; reduces error propagation by separating concerns
- Reflection architectures: Add an explicit self-critique step where the agent evaluates its own output and revises before acting
- Surface vs. background agents: Surface agents interact conversationally with users; background agents run autonomously on event-driven queues — a distinction based on interaction mode rather than internal architecture
- Single-agent vs. multi-agent: Single agents use one foundation model; multi-agent systems allow different models per agent, matched to task requirements
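The Plan-and-Execute pattern reduces to two cooperating roles. In the sketch below both `planner` and `executor` are plain functions standing in for what would be separate LLM calls (possibly to different models); the separation of concerns, not the function bodies, is the point.

```python
def plan_and_execute(goal, planner, executor):
    """Illustrative Plan-and-Execute skeleton."""
    steps = planner(goal)   # high-level decomposition into a step sequence
    results = []
    for step in steps:
        # Each step sees prior results, so errors surface per-step
        # rather than propagating silently through one long trace.
        results.append(executor(step, results))
    return results
```

Because planning and execution are distinct calls, a failed step can trigger re-planning without discarding the whole trace, which is how this pattern limits error propagation.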
Strengths & Limitations
Strengths
- Modular design allows incremental capability expansion without full system redesign
- LLM-as-brain abstraction means advances in foundation models directly improve agent performance
- Tool-use extensibility means agents can interact with virtually any external system
- Multi-agent composition enables specialization and parallelism beyond any single agent's capacity
Limitations
- Context window limits constrain how much short-term memory and reasoning history the LLM can utilize
- Tool selection errors compound across multi-step tasks, leading to error cascades
- Persona and instruction design significantly affects reliability — poorly specified agents behave inconsistently
- Multi-agent architectures introduce coordination complexity and consensus memory consistency challenges
- Latency and cost scale with the number of LLM calls in the reasoning loop
Notable Uses / Applications
- Google Cloud Vertex AI Agent Builder: Builds agents using the four-component architecture; supports both single and multi-agent configurations
- Google Agent Development Kit (ADK): Open-source Python SDK implementing multi-agent orchestration with memory and tool integration
- Code agents: Specialized architectures for code generation, testing, and debugging workflows
- Security agents: Architectures tuned for threat detection and response across the security lifecycle
Source Material
- What are AI agents? Definition, examples, and types | Google Cloud — Source for the four-component architecture model (persona, memory, tools, model), single vs. multi-agent distinctions, surface vs. background agent types, and operational loop.
- Yao et al. 2022 — ReAct — Foundational paper establishing the Reason+Act loop as a core architectural pattern.
Related Pages
- Is a type of: Agentic AI
- Implements: ReAct Framework
- Uses / Depends on: Tool Use, Memory Systems
- Governed by: Agent Orchestration
- Scales to: Multi-Agent Coordination
- Safety considerations: AI Safety and Alignment
Open Questions
- How should persona specifications be formally validated to ensure consistent, safe agent behavior?
- What are the optimal context management strategies when short-term memory approaches context window limits?
- How do different foundation models perform as the reasoning engine within the same architectural shell?
- Is the four-component model (persona, memory, tools, model) sufficient, or are additional structural elements needed for more complex agentic systems?
- How should architecture evolve to support real-time, streaming task execution?
Page type: concept | Status: complete