AI Dev Kit 3.7.0
AIDevKit lets you integrate advanced AI functionality directly into Unity, even as a beginner developer, and simplifies your game development workflow.
With just a few clicks and simple text prompts, you can create code, generate images, produce sound effects, and even synthesize voices. AIDevKit offers broad API integrations, rich editor tools, extensive voice synthesis options, and unique audio generation capabilities.
Generate text, images, speech, or code using .GEN*() extension methods directly on Unity objects.
No boilerplate. No wrappers. Just write:
Supported host types:
string → text, speech, image, code, sound effects
Texture2D → image editing, video generation
AudioClip → transcription, translation, voice change, cleanup
Everything is powered by OpenAI, Google Gemini, ElevenLabs, and more — wrapped in a unified fluent API.
This workflow is ideal for rapid prototyping, tool development, and AI-native content pipelines.
Now they can.
No setup. No config. Just type .GEN*() and let the magic happen.
Whether you're building dialogue systems, generating assets, or automating pipelines — this is the fastest way to get started with generative AI in Unity.
AI Dev Kit gives you instant access to text, image, audio, and code generation — with zero boilerplate.
Just call .GEN*() on your Unity objects and chain your desired behavior. It's fast, readable, and production-ready.
Works out of the box. Fully extensible. No extra SDKs required.
OpenAI is one of the most widely used AI platforms, offering powerful models like GPT-4o. It supports a broad range of APIs for text, audio, image, and assistant functionality, making it ideal for building intelligent, multimodal applications.
🧠 Embeddings
🛠️ Models, Fine-Tuning
📦 Batch
🧪 Evals
📝 Graders
🗂️ Vector Stores, Vector Store Files, Vector Store File Batches
🤖 Assistants, Threads, Messages, Runs, Run steps, Streaming (Pro-only)
📡 Responses, Responses Streaming (Coming Soon for Pro)
Google Gemini is a cutting-edge multimodal AI service by Google, capable of both text and image generation. It is well suited to developers who need tight integration with Google's cloud infrastructure and fast performance.
🧠 Embeddings
🛠️ Models, Tuning, Permissions
🗂️ Vector Stores, Vector Store Files, Vector Store File Batches
ElevenLabs is a state-of-the-art voice platform specializing in high-quality Text-to-Speech, Speech-to-Text, and voice transformation. Perfect for voice assistants, narration, and real-time character dubbing.
🧠 Embeddings
🛠️ Audio Isolation
🛠️ Audio Native
🛠️ Forced Alignment
🗂️ Vector Stores, Vector Store Files, Vector Store File Batches
Worried about costs? No problem! With Ollama you can run LLMs like LLaMA and Mistral on your own machine for free. Ideal for offline development, edge applications, and private inference without relying on cloud services.
🧠 Embeddings
🛠️ Models, Model Management
🛠️ Version
OpenRouter is a gateway to multiple third-party LLMs, offering 300+ models including Claude, Mistral, Command R, and more, all accessible via a single API. Great for experimenting with various models without switching providers.
Supported APIs | OpenAI | OpenAI, ElevenLabs, Ollama, OpenRouter
Supported Tasks | Response Generation (Text), Image Generation, Text To Speech, Speech To Text | Response Generation (Text), Image Generation, Text To Speech, Speech To Text, SoundFX Generation, Video Generation, Voice Change, Audio Isolation
Number of AI Models | 120+ | 450+
Number of AI Voices | 10+ | 4000+
Unity Components | – | Chatbot, Chatbot (with Assistants API), Realtime Assistant
Modular Components (components you can use with 'Unity Components') | – | Image Generator, Text To Speech, Speech To Text, Voice Changer
Advanced Integration | – | Chat Session, Assistants API, Realtime API
Editor Tools | Model / Voice Library | Model / Voice Library, Editor Chat, Code Generator, Unity Component Generator, Avatar Generator, Icon Generator, Background Generator, Mesh Texture Generator, Speech Generator, SoundFX Generator, Video Generator
Fully supported: Windows, OSX, Linux
Partially supported: Unity WebGL
Fully supported: Android, iOS, Windows Phone/Store
Fully supported: PlayStation, Xbox, PS Vita/PSM, Switch
OpenAI
Chat Completions, Streaming, Completions (Legacy)
Image Creation, Image Edit, Image Variation
Speech, Transcript, Translation
Text Moderations, Image Moderations
Files, Uploads
Realtime, Session tokens, Client events, Server events (Pro-only)
Google Gemini
Generate Content - Text generation, Vision, JSON Mode, Function calling
Predict - Image generation (via Gemini)
Predict - Image generation (via Imagen 3)
Predict Long Running - Video generation (via Veo)
Tokens
Files, Caching
Live API - WebSockets API
ElevenLabs
Text to Speech, Stream
Text to Speech (WebSockets)
Multi-Context Text to Speech (WebSockets)
Speech to Text
Sound Effects
Voice Changer
Text to Voice (Voice design)
Dubbing
Models, Voices
Voice Library
Ollama
Chat Completion, Stream, Completion
OpenRouter
Chat Completion - Stream, Vision, Structured Outputs, Function calling, Web Search
More API integrations are on the way. Have something you'd like to see next? Let us know; your feedback shapes the roadmap.