
Overview

What an Agent is and how the system is structured

The Agent is the core building block of AIDevKit. It connects your Unity project to any supported AI provider, manages the full conversation lifecycle, and handles everything from streaming responses to tool calls.


What Is an Agent?

Think of an Agent as an AI character or assistant living in your Unity scene. You give it:

  • Instructions — how it should behave and what role it plays

  • A model — which AI (GPT-4o, Gemini, Claude, etc.) powers it

  • Tools (optional) — functions it can invoke to interact with your game world

  • Conversation memory (optional) — how much context it remembers

The Agent handles everything else: making API calls, streaming tokens, executing tool calls, persisting conversation history, and dispatching Unity events your UI can react to.


Before v4: Multiple Agent Types

In earlier versions of AIDevKit, you had to choose between several specialized classes:

[Old Design - Deprecated]
ChatAgent        → text chat only
VoiceAgent       → voice input / output
AssistantsAgent  → OpenAI Assistants API

Switching from text chat to voice meant replacing your entire Agent class, and sharing logic between types was awkward.


v4+: One Agent Class

Starting with v4, all of those types are merged into a single Agent class. Behavior is driven by configuration, not inheritance.

You switch modes by changing ChatApiType in AgentSettings. No code changes needed.
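As a sketch of what that configuration-driven switch might look like in code (the enum values and property shown are assumptions for illustration, not confirmed AIDevKit API):

```csharp
using UnityEngine;

// Hypothetical sketch: the ChatApiType values and the settings field layout
// are assumptions, not confirmed AIDevKit API.
public class ModeSwitcher : MonoBehaviour
{
    [SerializeField] private AgentSettings settings; // assigned in the Inspector

    public void SwitchToTextChat()
    {
        settings.ChatApiType = ChatApiType.ChatCompletions; // plain text chat
    }

    public void SwitchToVoice()
    {
        settings.ChatApiType = ChatApiType.Realtime; // voice input/output
    }
}
```

The same Agent instance keeps running; only its configured API type changes.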

Note: In Unity scenes, you don't use Agent directly. You attach the AgentBehaviour MonoBehaviour to a GameObject, which creates and manages the Agent for you.
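A minimal usage sketch, assuming AgentBehaviour exposes a method for sending user input (the method name SendUserMessage is an assumption, not confirmed API):

```csharp
using UnityEngine;

public class NpcDialogue : MonoBehaviour
{
    [SerializeField] private AgentBehaviour agent; // assigned in the Inspector

    public void OnPlayerAsks(string question)
    {
        // AgentBehaviour forwards this to the Agent it created and manages.
        // SendUserMessage is a hypothetical name for illustration.
        agent.SendUserMessage(question);
    }
}
```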


The Two Key Assets

Every Agent in Unity is configured through two ScriptableObjects:

Asset            Purpose
AgentSettings    Technical config: API type, model, audio flags, conversation settings
AgentProfile     Identity config: agent name, instructions, personality traits, starting message

You assign both to the AgentBehaviour component in the Inspector. Because they are separate assets, you can swap personalities or backends independently — and the same AgentProfile can be shared across multiple agents.
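For example, swapping personalities at runtime might look like the following sketch (the Profile property name is an assumption, not confirmed AIDevKit API):

```csharp
using UnityEngine;

public class PersonalitySwapper : MonoBehaviour
{
    [SerializeField] private AgentBehaviour agent;
    [SerializeField] private AgentProfile merchantProfile;
    [SerializeField] private AgentProfile guardProfile;

    // Swap identity only; the AgentSettings (backend, model) stay unchanged.
    public void BecomeGuard()    => agent.Profile = guardProfile;    // hypothetical property
    public void BecomeMerchant() => agent.Profile = merchantProfile;
}
```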


What's in This Section

  • Internal architecture and the lifecycle of a message

  • Step-by-step setup guide from scratch

  • Conversation history, context windows, and long-term retrieval

Learn the fundamentals of working with agents:

  • Core Components: the agent system's building blocks

  • Customization: extending agents with custom implementations

  • Architecture

Key Features

Unified API

Interact with any LLM provider using a consistent interface:

  • OpenAI (Chat Completions, Assistants, Responses, Realtime)

  • Google Gemini

  • Anthropic Claude

  • And more...

Stateful Conversations

Maintain context across multiple interactions:

  • Automatic history management

  • Multiple conversation support

  • Persistent storage options

Multimodal Support

Handle text, voice, images, and more:

  • Voice input/output

  • Image generation

  • Document understanding

  • Vision capabilities

Tool Integration

Extend agent capabilities with tools:

  • Built-in tools (search, calculation, etc.)

  • Custom tool creation

  • MCP (Model Context Protocol) support

  • Automatic function calling
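A custom tool might be declared roughly as below; the attribute name and signature are assumptions for illustration, not confirmed AIDevKit API:

```csharp
// Hypothetical sketch of a custom tool the model can invoke
// through automatic function calling.
public class GameTools
{
    [FunctionCall("give_item", "Gives the player an item by id")] // assumed attribute
    public string GiveItem(string itemId, int amount)
    {
        // ... add the item to the player's inventory here ...
        return $"Gave {amount}x {itemId}"; // returned to the model as the tool result
    }
}
```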

Event System

Monitor and respond to agent activities:

  • Status changes

  • Streaming updates

  • Tool execution

  • Conversation events
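Subscribing from UI code might look like the following sketch; the event names are assumptions chosen to mirror the bullet list above, not confirmed AIDevKit API:

```csharp
using UnityEngine;
using UnityEngine.UI;

public class ChatView : MonoBehaviour
{
    [SerializeField] private AgentBehaviour agent;
    [SerializeField] private Text chatText;

    void OnEnable()
    {
        // Event names below are hypothetical, for illustration only.
        agent.OnStatusChanged += s     => Debug.Log($"Agent status: {s}");
        agent.OnStreamDelta   += delta => chatText.text += delta; // streaming tokens
        agent.OnToolCall      += call  => Debug.Log($"Tool invoked: {call}");
    }
}
```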

Next Steps

Start building with agents:

  1. Create your first agent with Your First Agent

  2. Understand the architecture in How Agents Work

  3. Explore Event Router for advanced patterns
