AI DevKit

Speech to Text (STT)

Convert spoken words from an AudioClip into text using powerful AI transcription models.

Ideal for voice commands, user feedback, subtitles, or audio-driven gameplay systems.


✅ Basic Usage

AudioClip recording = MicrophoneCapture.GetLastClip();

string result = await recording
    .GENTranscript()
    .SetModel(OpenAIModel.Whisper)
    .SetLanguage(SystemLanguage.Korean)
    .ExecuteAsync();

Debug.Log("Transcript: " + result);

🔊 The AudioClip can be from a microphone, file, or any runtime source.
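If no clip is available yet, one can be captured with Unity's built-in `Microphone` class. The following is a minimal sketch, not part of AI DevKit: the class name `MicRecorderSketch`, the `RecordFor` coroutine, and the 16 kHz / 10 second limits are all assumptions chosen for illustration.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical helper (not an AI DevKit type): records from the default microphone.
public class MicRecorderSketch : MonoBehaviour
{
    private const int SampleRate = 16000; // assumed sample rate for speech input
    private const int MaxSeconds = 10;    // assumed upper bound on clip length

    // Holds the most recent recording, or null if nothing was captured.
    public AudioClip LastClip { get; private set; }

    public IEnumerator RecordFor(float seconds)
    {
        if (Microphone.devices.Length == 0)
        {
            Debug.LogWarning("No microphone device found.");
            yield break;
        }

        // Passing null selects the default input device.
        AudioClip clip = Microphone.Start(null, false, MaxSeconds, SampleRate);
        yield return new WaitForSeconds(Mathf.Min(seconds, MaxSeconds));
        Microphone.End(null);

        LastClip = clip;
    }
}
```

Start it with `StartCoroutine(recorder.RecordFor(5f))` and pass `LastClip` to the transcription call once the coroutine finishes. Note that the clip keeps its full `MaxSeconds` length, so a shorter recording ends with trailing silence.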


⚙️ Configuration Options

| Method | Description |
| --- | --- |
| SetLanguage(SystemLanguage) | Optional hint to improve transcription accuracy |
| SetModel(model) | Choose which STT model to use (Whisper, Gemini STT, etc.) |
| SetOutputPath(path) | Save transcription to file (optional) |


🌍 Translation Mode

You can also translate speech into English using GENTranslation():

string english = await recording
    .GENTranslation()
    .SetModel(OpenAIModel.Whisper)
    .ExecuteAsync();

🗣️ This takes the same audio input but returns the text translated into English instead of a transcript in the spoken language.


📦 Example Result

Audio Input: "안녕하세요, 오늘 날씨 어때요?"
Transcript: "안녕하세요, 오늘 날씨 어때요?"
Translation: "Hello, how's the weather today?"


🧠 Tips

  • Works best with clean, mono audio at 16 kHz or higher.

  • SetLanguage is optional: the model can auto-detect the language, but a hint improves accuracy.

  • For multilingual games or voice input, pair this with Text Generation to produce natural responses.
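To illustrate the mono-audio tip, here is a minimal sketch of downmixing interleaved multi-channel samples to mono in plain C#. `AudioPrep` and `DownmixToMono` are hypothetical names, not part of AI DevKit, and simple channel averaging is only one possible downmix strategy.

```csharp
using System;

// Hypothetical helper (not an AI DevKit type).
public static class AudioPrep
{
    // Averages interleaved multi-channel samples into one mono channel.
    // Any incomplete trailing frame is dropped by the integer division.
    public static float[] DownmixToMono(float[] interleaved, int channels)
    {
        if (interleaved == null) throw new ArgumentNullException(nameof(interleaved));
        if (channels < 1) throw new ArgumentOutOfRangeException(nameof(channels));

        // Already mono: return a copy so the caller's buffer is not shared.
        if (channels == 1) return (float[])interleaved.Clone();

        int frames = interleaved.Length / channels;
        var mono = new float[frames];

        for (int f = 0; f < frames; f++)
        {
            float sum = 0f;
            for (int c = 0; c < channels; c++)
            {
                sum += interleaved[f * channels + c];
            }
            mono[f] = sum / channels;
        }

        return mono;
    }
}
```

In Unity, the interleaved input can be read with `AudioClip.GetData` into a buffer of `clip.samples * clip.channels` floats, and the result written to a new single-channel clip with `AudioClip.Create` followed by `SetData`.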

