AIDevKit - AI Suite for Unity

Voice Generator


The VoiceGenerator is a Scriptable Toolkit in the OpenAI with Unity asset that converts text into speech. Text-to-speech is essential for interactive and accessible applications, from dynamic character dialogue to voice-driven instructions and information delivery.

Step 1: Creating a VoiceGenerator Instance

  1. In the Unity Editor's top menu, go to Assets > Create > Glitch9/OpenAI/Toolkits > Voice Generator.

  2. This creates a VoiceGenerator asset in your Project window. Select it to view and modify its properties in the Inspector.

Step 2: Configuring VoiceGenerator

  1. Audio File Path: Specify the directory where the audio files will be saved after generation.

  2. Review the other settings, such as the available voices and the default save path, which may be populated automatically from your project settings.

Step 3: Implementing VoiceGenerator in Your Scene

  1. Drag the VoiceGenerator ScriptableObject onto a field of the script component in your scene that will control the text-to-speech operations.

  2. Reference the VoiceGenerator in your script to call its methods when needed, as in the example below.

using UnityEngine;

public class SpeechController : MonoBehaviour
{
    public VoiceGenerator voiceGenerator;

    public async void GenerateSpeechFromText(string textToSpeak)
    {
        AudioFile result = await voiceGenerator.Create(textToSpeak, VoiceActor.Alloy);
        // Handle the generated speech audio (e.g., play it back)
    }
}

Step 4: Generating Voice Audio

  1. To create a voice audio clip from text, call the Create method with the text string and the desired VoiceActor.

  2. Use await (or UniTask) to manage the asynchronous operation and obtain the AudioFile result, which represents the generated speech audio; see the sketch after this list.
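
If your project uses UniTask, the same call can be driven without async void. The following is a minimal sketch, assuming Create returns an awaitable as in the Step 3 example; the class and method names here are illustrative:

using Cysharp.Threading.Tasks;
using UnityEngine;

public class SpeechRunner : MonoBehaviour
{
    public VoiceGenerator voiceGenerator;

    // Fire-and-forget entry point that can be wired to UI events.
    public void Speak(string text) => SpeakAsync(text).Forget();

    private async UniTaskVoid SpeakAsync(string text)
    {
        AudioFile result = await voiceGenerator.Create(text, VoiceActor.Alloy);
        // Handle the generated speech audio here
    }
}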

Step 5: Customizing the Generated Voice

  1. Voice: Select the desired voice for the speech from the available VoiceActor options (see the sketch after this list).

  2. Speed: Adjust the speed of speech to match your application's needs (e.g., slow for clarity or fast for brevity).
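
As an illustration, you might map game characters to voices. Alloy appears in the Step 3 example; Onyx and Nova are assumptions that mirror OpenAI's built-in voice names, so verify them against the VoiceActor enum in your version:

public static class VoicePicker
{
    public static VoiceActor PickVoice(string characterId)
    {
        switch (characterId)
        {
            case "narrator": return VoiceActor.Onyx; // assumed enum member
            case "guide": return VoiceActor.Nova;    // assumed enum member
            default: return VoiceActor.Alloy;        // used in the Step 3 example
        }
    }
}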

Step 6: Retrieving and Using Generated Audio

  • Access the generated audio through the AudioFile returned by the Create method.

  • Play the audio clip within your scene or attach it to UI elements as required, as sketched below.
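
Playback itself is standard Unity; the only uncertain part is how AudioFile exposes the decoded clip. The AudioClip property below is an assumed name, so substitute whatever member your version of the asset provides:

using UnityEngine;

public class SpeechPlayback : MonoBehaviour
{
    public AudioSource audioSource;

    public void Play(AudioFile file)
    {
        // 'AudioClip' is an assumed property name on AudioFile.
        AudioClip clip = file.AudioClip;
        if (clip != null)
        {
            audioSource.PlayOneShot(clip);
        }
    }
}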

Step 7: Managing Audio Files

  • Use the GetRecordings method to retrieve a list of all the generated audio clips.

  • Implement a cleanup routine (such as the sketch below) if your application dynamically generates a lot of speech, to avoid excessive memory or storage use.
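
A cleanup routine can be as simple as deleting files older than a cutoff from the directory configured in Step 2. This sketch uses plain System.IO and assumes the default output format is MP3:

using System;
using System.IO;

public static class AudioFileCleanup
{
    // Deletes generated audio files older than maxAge from the given directory.
    public static void DeleteOldRecordings(string directory, TimeSpan maxAge)
    {
        if (!Directory.Exists(directory)) return;

        foreach (string file in Directory.GetFiles(directory, "*.mp3"))
        {
            if (DateTime.UtcNow - File.GetLastWriteTimeUtc(file) > maxAge)
            {
                File.Delete(file);
            }
        }
    }
}

For example, calling AudioFileCleanup.DeleteOldRecordings(audioFilePath, TimeSpan.FromDays(7)) on startup keeps only the last week of output.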

Step 8: Error Handling and Validation

  • Validate the input text for length and content to ensure it meets OpenAI's API guidelines.

  • Implement error handling to catch and manage exceptions or failed operations gracefully; a combined sketch follows.
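
A minimal sketch combining both points, building on the Step 3 example; the 4096-character ceiling reflects OpenAI's documented text-to-speech input limit at the time of writing:

using System;
using UnityEngine;

public class SafeSpeech : MonoBehaviour
{
    public VoiceGenerator voiceGenerator;

    // OpenAI's documented input limit for text-to-speech requests.
    private const int MaxInputLength = 4096;

    public async void SpeakSafely(string text)
    {
        if (string.IsNullOrWhiteSpace(text) || text.Length > MaxInputLength)
        {
            Debug.LogWarning("Input text is empty or exceeds the API input limit.");
            return;
        }

        try
        {
            AudioFile result = await voiceGenerator.Create(text, VoiceActor.Alloy);
            // Handle the generated audio
        }
        catch (Exception e)
        {
            Debug.LogError($"Voice generation failed: {e.Message}");
        }
    }
}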

Best Practices

  • Keep the generated speech concise to maintain user engagement and reduce processing time.

  • Regularly back up or clear out the audio file directory based on the frequency and number of generations.

  • Utilize user feedback to refine the choice of VoiceActor and speech speed for the best user experience.