Audio Isolation

Use .GENAudioIsolation() to isolate speech in an audio clip or enhance it by removing background noise.

Basic Usage

// Load a noisy clip from a Resources folder.
AudioClip noisyAudio = Resources.Load<AudioClip>("NoisyRecording");
// Run isolation and await the cleaned clip.
AudioClip clean = await noisyAudio
    .GENAudioIsolation()
    .ExecuteAsync();

Input Types

AudioClip Input

AudioClip noisy = Resources.Load<AudioClip>("Audio");
AudioClip isolated = await noisy
    .GENAudioIsolation()
    .ExecuteAsync();

File Input

// Wrap an existing AudioClip (audioClip) together with a file name.
var audioFile = new File<AudioClip>(audioClip, "recording.wav");
AudioClip isolated = await audioFile
    .GENAudioIsolation()
    .ExecuteAsync();

Common Use Cases

1. Voice Isolation

Remove background noise and isolate speech:
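
A minimal sketch of voice isolation on a clip assigned in the Inspector, played back once it has been cleaned. It relies only on the `.GENAudioIsolation().ExecuteAsync()` chain from Basic Usage. The class name `VoiceIsolationExample` and the `noisyClip` field are illustrative, and the `using` directive for the SDK's own namespace is not given on this page, so it appears only as a comment.

```csharp
using System;
using UnityEngine;
// Also add the using directive for the SDK namespace that defines
// GENAudioIsolation() (not shown on this page).

[RequireComponent(typeof(AudioSource))]
public class VoiceIsolationExample : MonoBehaviour
{
    // Illustrative field: assign a speech recording with background noise.
    [SerializeField] private AudioClip noisyClip;

    private async void Start()
    {
        try
        {
            // Same call chain as in Basic Usage.
            AudioClip voiceOnly = await noisyClip
                .GENAudioIsolation()
                .ExecuteAsync();

            // Play the isolated voice on the attached AudioSource.
            var source = GetComponent<AudioSource>();
            source.clip = voiceOnly;
            source.Play();
        }
        catch (Exception e)
        {
            // async void cannot propagate exceptions to a caller, so log here.
            Debug.LogException(e);
        }
    }
}
```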

2. Noise Reduction

Clean up audio recordings:

3. Audio Enhancement

Improve audio clarity:

Unity Integration Examples

Example 1: Voice Chat Cleaner

Example 2: Recording Cleanup
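
One way this example might look: record from the default microphone with Unity's `Microphone` API, then pass the recording through `.GENAudioIsolation()`. The class name, method names, and the 10-second / 44.1 kHz settings are assumptions chosen for illustration, not values from the SDK.

```csharp
using System;
using UnityEngine;
// Also add the using directive for the SDK namespace that defines
// GENAudioIsolation() (not shown on this page).

[RequireComponent(typeof(AudioSource))]
public class RecordingCleanup : MonoBehaviour
{
    private const int MaxSeconds = 10;     // illustrative recording buffer length
    private const int SampleRate = 44100;  // illustrative sample rate

    private AudioClip recording;

    // Hook this up to a "Record" button.
    public void StartRecording()
    {
        // Passing null selects the default microphone.
        recording = Microphone.Start(null, false, MaxSeconds, SampleRate);
    }

    // Hook this up to a "Stop" button.
    public async void StopAndPlayCleaned()
    {
        if (recording == null) return;

        // Stops capture; the clip keeps its full MaxSeconds length,
        // so trailing silence is not trimmed in this sketch.
        Microphone.End(null);

        try
        {
            AudioClip cleaned = await recording
                .GENAudioIsolation()
                .ExecuteAsync();

            var source = GetComponent<AudioSource>();
            source.clip = cleaned;
            source.Play();
        }
        catch (Exception e)
        {
            Debug.LogException(e);
        }
    }
}
```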

Example 3: Batch Audio Processor
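
A possible batch helper, sketched under the assumption that clips are processed one at a time. Whether the provider accepts concurrent requests is not stated on this page, so the loop stays sequential. `BatchIsolation` and `CleanAllAsync` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using UnityEngine;
// Also add the using directive for the SDK namespace that defines
// GENAudioIsolation() (not shown on this page).

public static class BatchIsolation
{
    // Cleans each clip in order and returns the results in the same order.
    public static async Task<List<AudioClip>> CleanAllAsync(IEnumerable<AudioClip> clips)
    {
        var cleaned = new List<AudioClip>();
        foreach (AudioClip clip in clips)
        {
            // One request at a time; each await completes before the next starts.
            AudioClip result = await clip
                .GENAudioIsolation()
                .ExecuteAsync();
            cleaned.Add(result);
        }
        return cleaned;
    }
}
```

For instance, `await BatchIsolation.CleanAllAsync(Resources.LoadAll<AudioClip>("Recordings"))` would clean every clip under a `Resources/Recordings` folder, where the folder name is only an example.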

Example 4: Real-time Audio Filter

Example 5: Podcast Editor

Example 6: Voice Command Preprocessor

Provider Support

ElevenLabs

Features:

  • ✅ Voice isolation

  • ✅ Noise reduction

  • ✅ Audio enhancement

  • ✅ Background removal

Note: Currently, ElevenLabs is the primary provider for audio isolation.

What Gets Removed

Audio isolation typically removes:

  • Background noise

  • Environmental sounds

  • Music (in voice isolation mode)

  • Echo and reverb

  • Static and hiss

  • Wind noise

What Gets Preserved

  • Primary speech/voice

  • Speech clarity

  • Timing and rhythm

  • Emotional tone

  • Word pronunciation

Best Practices

✅ Good Practices

❌ Bad Practices

Audio Requirements

Input:

  • Any audio with speech

  • Background noise is okay

  • Any format (WAV, MP3, etc.)

Quality:

  • Better input = better output

  • Extreme noise may reduce quality

  • Very short clips may not process well

Limits:

  • Max file size: varies by provider

  • Duration: varies by provider

Use Cases

| Use Case | Example |
| --- | --- |
| Voice Chat | Clean real-time voice communication |
| Podcasts | Remove background noise |
| Interviews | Isolate speaker from environment |
| Voice Commands | Improve recognition accuracy |
| Recordings | Clean up low-quality recordings |
| Transcription | Pre-process for better accuracy |

Error Handling
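
This page does not list the exception types the SDK throws, so the sketch below catches `Exception` broadly and falls back to the original clip. Treat it as a starting point and narrow the catch once the actual exception types are known. `SafeIsolation` and `IsolateOrFallbackAsync` are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;
using UnityEngine;
// Also add the using directive for the SDK namespace that defines
// GENAudioIsolation() (not shown on this page).

public static class SafeIsolation
{
    // Returns the cleaned clip, or the untouched input if isolation fails.
    public static async Task<AudioClip> IsolateOrFallbackAsync(AudioClip input)
    {
        try
        {
            return await input
                .GENAudioIsolation()
                .ExecuteAsync();
        }
        catch (Exception e)
        {
            // Broad catch on purpose: the specific exception types are an unknown here.
            Debug.LogException(e);
            return input;
        }
    }
}
```

Falling back to the original clip keeps playback or downstream processing working when a request fails, at the cost of passing noisy audio through.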

Performance Tips
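
Because processing takes time that depends on audio length, one general-purpose option is to avoid cleaning the same clip twice. The cache below is a sketch of that idea, not an SDK feature, and `IsolationCache` is a hypothetical class.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using UnityEngine;
// Also add the using directive for the SDK namespace that defines
// GENAudioIsolation() (not shown on this page).

public class IsolationCache
{
    // Maps each source clip to its already-cleaned version.
    private readonly Dictionary<AudioClip, AudioClip> cleanedBySource =
        new Dictionary<AudioClip, AudioClip>();

    public async Task<AudioClip> GetOrIsolateAsync(AudioClip source)
    {
        if (cleanedBySource.TryGetValue(source, out AudioClip cached))
        {
            return cached; // reuse the earlier result, no new request
        }

        AudioClip cleaned = await source
            .GENAudioIsolation()
            .ExecuteAsync();

        cleanedBySource[source] = cleaned;
        return cleaned;
    }
}
```

Two overlapping calls for the same clip would both send a request in this version; storing the pending task instead of the result would close that gap.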

Workflow: Clean → Transcribe

Common pattern for voice recognition:
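
A sketch of the two-step pattern. The transcription call is not shown on this page, so it is passed in as a delegate: `transcribe` is a hypothetical placeholder for whatever speech-to-text call the project uses, and `CleanThenTranscribe` is an illustrative class name.

```csharp
using System;
using System.Threading.Tasks;
using UnityEngine;
// Also add the using directive for the SDK namespace that defines
// GENAudioIsolation() (not shown on this page).

public static class CleanThenTranscribe
{
    // Step 1: isolate the voice. Step 2: hand the cleaned clip to a transcriber.
    public static async Task<string> RunAsync(
        AudioClip raw,
        Func<AudioClip, Task<string>> transcribe) // hypothetical speech-to-text step
    {
        AudioClip clean = await raw
            .GENAudioIsolation()
            .ExecuteAsync();

        return await transcribe(clean);
    }
}
```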

Limitations

  1. Speech Focus: Optimized for speech, not music

  2. Cannot Restore: Can't restore missing/corrupted audio

  3. Quality Dependent: Very noisy input = lower quality output

  4. Processing Time: Processing time depends on audio length
