Audio Isolation
Isolate or enhance audio elements using .GENAudioIsolation().
Basic Usage
AudioClip noisyAudio = Resources.Load<AudioClip>("NoisyRecording");
AudioClip clean = await noisyAudio
    .GENAudioIsolation()
    .ExecuteAsync();
Input Types
AudioClip Input
AudioClip noisy = Resources.Load<AudioClip>("Audio");
AudioClip isolated = await noisy
    .GENAudioIsolation()
    .ExecuteAsync();
File Input
var audioFile = new File<AudioClip>(audioClip, "recording.wav");
AudioClip isolated = await audioFile
    .GENAudioIsolation()
    .ExecuteAsync();
Common Use Cases
1. Voice Isolation
Remove background noise and isolate speech:
2. Noise Reduction
Clean up audio recordings:
3. Audio Enhancement
Improve audio clarity:
Unity Integration Examples
Example 1: Voice Chat Cleaner
Example 2: Recording Cleanup
Example 3: Batch Audio Processor
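A minimal sketch of one way to batch-process clips, assuming they are handled one at a time and that a failure on one clip should not stop the rest. The `BatchIsolation` class and its `RunAsync` method are hypothetical helpers, not part of the SDK. The isolation step is passed in as a delegate, so the sketch makes no assumption about the SDK beyond the `.GENAudioIsolation().ExecuteAsync()` call shown under Basic Usage.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the SDK.
public static class BatchIsolation
{
    // Processes the inputs sequentially, in order.
    // An input whose isolation step throws is recorded in Failed
    // together with its exception, and processing continues.
    public static async Task<(List<T> Succeeded, List<(T Item, Exception Error)> Failed)> RunAsync<T>(
        IEnumerable<T> inputs,
        Func<T, Task<T>> isolate)
    {
        var succeeded = new List<T>();
        var failed = new List<(T Item, Exception Error)>();

        foreach (T input in inputs)
        {
            try
            {
                succeeded.Add(await isolate(input));
            }
            catch (Exception ex)
            {
                failed.Add((input, ex));
            }
        }

        return (succeeded, failed);
    }
}
```

In a Unity script this could be called as `var result = await BatchIsolation.RunAsync(clips, async clip => await clip.GENAudioIsolation().ExecuteAsync());`, where `clips` is any collection of `AudioClip`. Sequential processing is chosen here because provider rate limits are not documented on this page.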
Example 4: Real-time Audio Filter
Example 5: Podcast Editor
Example 6: Voice Command Preprocessor
Provider Support
ElevenLabs
Features:
✅ Voice isolation
✅ Noise reduction
✅ Audio enhancement
✅ Background removal
Note: Currently, ElevenLabs is the primary provider for audio isolation.
What Gets Removed
Audio isolation typically removes:
Background noise
Environmental sounds
Music (in voice isolation mode)
Echo and reverb
Static and hiss
Wind noise
What Gets Preserved
Primary speech/voice
Speech clarity
Timing and rhythm
Emotional tone
Word pronunciation
Best Practices
✅ Good Practices
❌ Bad Practices
Audio Requirements
Input:
Any audio with speech
Background noise is okay
Any format (WAV, MP3, etc.)
Quality:
Better input = better output
Extreme noise may reduce quality
Very short clips may not process well
Limits:
Max file size: varies by provider
Duration: varies by provider
Use Cases
Voice Chat: Clean real-time voice communication
Podcasts: Remove background noise
Interviews: Isolate speaker from environment
Voice Commands: Improve recognition accuracy
Recordings: Clean up low-quality recordings
Transcription: Pre-process for better accuracy
Error Handling
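One common pattern, sketched here under assumptions, is to wrap the isolation call in try/catch and fall back to the original audio if the call fails. The `IsolationFallback` class and `RunOrFallbackAsync` method are hypothetical helpers, not part of the SDK. Because this page does not list the exception types the SDK throws, the sketch catches `Exception` broadly and only lets cancellation propagate.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the SDK.
public static class IsolationFallback
{
    // Runs the supplied isolation step. If it throws, the error is passed
    // to onError (when provided) and the original input is returned, so the
    // caller still has usable audio. Cancellation is rethrown, not swallowed.
    public static async Task<T> RunOrFallbackAsync<T>(
        T original,
        Func<T, Task<T>> isolate,
        Action<Exception> onError = null)
    {
        try
        {
            return await isolate(original);
        }
        catch (OperationCanceledException)
        {
            throw;
        }
        catch (Exception ex)
        {
            onError?.Invoke(ex);
            return original;
        }
    }
}
```

In a Unity script this could be used as `AudioClip result = await IsolationFallback.RunOrFallbackAsync(noisy, async clip => await clip.GENAudioIsolation().ExecuteAsync(), Debug.LogException);`. Falling back to the unprocessed clip suits playback scenarios; for transcription it may be better to surface the error instead.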
Performance Tips
Workflow: Clean → Transcribe
Common pattern for voice recognition:
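The two-step pattern can be sketched as a small pipeline that awaits the cleaning step and then hands its result to the transcription step. The `CleanThenTranscribe` class is a hypothetical helper, not part of the SDK. Both steps are passed in as delegates: the cleaning delegate would wrap `.GENAudioIsolation().ExecuteAsync()`, and the transcription delegate stands in for whatever speech-to-text call the project uses (no specific call is assumed here).

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the SDK.
public static class CleanThenTranscribe
{
    // Cleans the input first, then transcribes the cleaned result.
    // An exception from either step propagates to the caller.
    public static async Task<string> RunAsync<TAudio>(
        TAudio input,
        Func<TAudio, Task<TAudio>> clean,
        Func<TAudio, Task<string>> transcribe)
    {
        TAudio cleaned = await clean(input);
        return await transcribe(cleaned);
    }
}
```

In a Unity script the first delegate could be `async clip => await clip.GENAudioIsolation().ExecuteAsync()`. Keeping the steps as delegates makes it easy to compare transcription results with and without the cleaning step.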
Limitations
Speech Focus: Optimized for speech, not music
Cannot Restore: Can't restore missing/corrupted audio
Quality Dependent: Very noisy input = lower quality output
Processing Time: Takes time depending on audio length
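Because processing time grows with audio length, callers may want to stop waiting after a fixed period. The following is a minimal sketch of a timeout guard built on `Task.WhenAny` and `Task.Delay`. The `IsolationTimeout` class is a hypothetical helper, not part of the SDK, and the sketch assumes a platform where `System.Threading.Tasks` timers are available. It only stops waiting; it does not cancel the underlying request.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the SDK.
public static class IsolationTimeout
{
    // Returns the result of 'work' if it completes within 'timeout';
    // otherwise throws TimeoutException. The work itself keeps running
    // in the background because no cancellation is requested.
    public static async Task<T> WithTimeoutAsync<T>(Task<T> work, TimeSpan timeout)
    {
        Task finished = await Task.WhenAny(work, Task.Delay(timeout));
        if (finished != work)
        {
            throw new TimeoutException(
                $"Audio isolation did not finish within {timeout.TotalSeconds} s.");
        }

        // Awaiting again returns the result or rethrows the work's exception.
        return await work;
    }
}
```

To use it with the SDK call, wrap the call in a method that returns a `Task<AudioClip>`, for example `async Task<AudioClip> Isolate(AudioClip c) => await c.GENAudioIsolation().ExecuteAsync();`, then call `await IsolationTimeout.WithTimeoutAsync(Isolate(noisy), TimeSpan.FromSeconds(30))`. The 30-second value is illustrative only.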
Next Steps
Voice Change - Modify voice characteristics
Speech to Text - Transcribe audio
Text to Speech - Generate speech
Sound Effects - Generate sound effects