The VoiceGenerator is a component of the OpenAI with Unity asset that converts text into speech. It is useful for building interactive and accessible applications, from dynamic character dialogue to voice-driven instructions and information delivery.
In the Unity Editor's top menu, go to Assets > Create > Glitch9/OpenAI/Toolkits > Voice Generator.
This action will generate a VoiceGenerator instance in your Project pane. Select it to view and modify its properties in the Inspector.
Audio File Path: Specify the directory where the audio files will be saved after generation.
Familiarize yourself with the other settings, such as the supported voices and the default save path, which may be populated automatically from your project settings.
Drag the VoiceGenerator ScriptableObject onto the script component in your scene that will control the text-to-speech operations.
Reference the VoiceGenerator in your script to call its methods when needed.
To create a voice audio clip from text, call the Create method with the text string and the desired VoiceActor.
Use await or UniTask to handle the asynchronous operation and obtain the AudioFile result, which represents the generated speech audio.
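The calling pattern described above can be sketched as follows. This is only an illustration: the asset's real signatures are not shown in this guide, so the sketch declares hypothetical stand-ins (IVoiceGenerator, FakeVoiceGenerator, the VoiceActor members, and an AudioFile with a FilePath property) and uses Task instead of UniTask so that it compiles without Unity or the asset.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-ins so the sketch compiles on its own. In a real project
// VoiceGenerator, VoiceActor, and AudioFile come from the asset instead.
public enum VoiceActor { Alloy, Nova } // member names are assumptions

public sealed class AudioFile
{
    public string FilePath { get; } // assumed property, for illustration only
    public AudioFile(string filePath) { FilePath = filePath; }
}

public interface IVoiceGenerator
{
    // Mirrors the described call: text plus the desired voice.
    Task<AudioFile> Create(string text, VoiceActor voice);
}

// Fake generator used only to exercise the calling pattern.
public sealed class FakeVoiceGenerator : IVoiceGenerator
{
    public async Task<AudioFile> Create(string text, VoiceActor voice)
    {
        await Task.Delay(10); // simulate the network round trip
        return new AudioFile($"{voice}_{text.Length}.mp3");
    }
}

public static class NarrationExample
{
    // Awaits the generator and hands back the resulting audio file.
    public static async Task<AudioFile> SpeakAsync(IVoiceGenerator generator, string line)
    {
        AudioFile file = await generator.Create(line, VoiceActor.Alloy);
        return file;
    }
}
```

In a Unity script, the same await would sit inside an async method on the component that holds the VoiceGenerator reference.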
Voice: Choose from the available VoiceActor options to select the desired voice for the speech.
Speed: Adjust the speed of speech to match your application's needs (e.g., slow for clarity or fast for brevity).
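When the speed comes from user input, it may help to clamp it before passing it on. The sketch below assumes the range accepted by OpenAI's speech endpoint (0.25 to 4.0, with 1.0 as the default); whether the asset already enforces this range is not stated here, and SpeechSpeed is a hypothetical helper, not part of the asset.

```csharp
using System;

// Hypothetical helper; not part of the asset.
public static class SpeechSpeed
{
    public const float Min = 0.25f;    // assumed lower bound of the speech endpoint
    public const float Max = 4.0f;     // assumed upper bound of the speech endpoint
    public const float Default = 1.0f; // normal speaking rate

    // Returns a speed inside [Min, Max]; NaN falls back to the default.
    public static float Clamp(float requested)
    {
        if (float.IsNaN(requested)) return Default;
        return Math.Max(Min, Math.Min(Max, requested));
    }
}
```

For example, SpeechSpeed.Clamp(10f) yields 4.0 and SpeechSpeed.Clamp(0.1f) yields 0.25, while values already in range pass through unchanged.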
Access the generated audio through the AudioFile returned by the Create method.
Play the audio clip within your scene or attach it to UI elements as required.
The GetRecordings method retrieves a list of all the generated audio clips.
Implement a cleanup routine if your application dynamically generates a lot of speech to avoid excessive memory use or storage consumption.
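One possible shape for such a cleanup routine is sketched below. It works directly on the audio directory with standard .NET file APIs rather than through the asset, keeps the newest files, and deletes the rest. AudioCacheCleaner, the "*.mp3" pattern, and the retention count are all assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper; not part of the asset.
public static class AudioCacheCleaner
{
    // Deletes the oldest files matching the pattern until at most maxFiles remain.
    // Returns the number of files deleted.
    public static int Prune(string directory, int maxFiles, string pattern = "*.mp3")
    {
        if (maxFiles < 0) throw new ArgumentOutOfRangeException(nameof(maxFiles));
        if (!Directory.Exists(directory)) return 0;

        var stale = new DirectoryInfo(directory)
            .GetFiles(pattern)
            .OrderByDescending(f => f.LastWriteTimeUtc) // newest first
            .Skip(maxFiles)                             // keep the newest maxFiles
            .ToList();

        foreach (var file in stale) file.Delete();
        return stale.Count;
    }
}
```

A call such as AudioCacheCleaner.Prune(audioDirectory, 50) could run at startup or after each generation, where audioDirectory is whatever folder the Audio File Path setting points to.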
Validate the input text for length and content to ensure it meets OpenAI's API guidelines.
Implement error handling to catch and manage exceptions or failed operations gracefully.
Keep the generated speech concise to maintain user engagement and reduce processing time.
Regularly back up or clear out the audio file directory based on the frequency and number of generations.
Use feedback from your users to refine the choice of VoiceActor and speech speed for the best user experience.
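The validation and error-handling advice above could look roughly like this. The 4096-character limit is the documented maximum input length of OpenAI's speech endpoint; string.Length counts UTF-16 code units, so treat the check as approximate. SpeechRequestGuard and its members are hypothetical names, and the wrapper accepts any asynchronous operation rather than a specific asset method.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; not part of the asset.
public static class SpeechRequestGuard
{
    public const int MaxInputLength = 4096; // assumed limit of the speech endpoint

    // Rejects empty input and input longer than the assumed limit.
    public static bool TryValidate(string text, out string error)
    {
        if (string.IsNullOrWhiteSpace(text))
        {
            error = "Text is empty.";
            return false;
        }
        if (text.Length > MaxInputLength)
        {
            error = $"Text has {text.Length} characters; the limit is {MaxInputLength}.";
            return false;
        }
        error = null;
        return true;
    }

    // Runs an async operation; on failure, reports the exception and returns null.
    public static async Task<T> TryRunAsync<T>(Func<Task<T>> operation, Action<Exception> onError)
        where T : class
    {
        try
        {
            return await operation();
        }
        catch (Exception ex)
        {
            onError?.Invoke(ex);
            return null;
        }
    }
}
```

A caller would validate first, then pass the generation call as the operation and log or display the exception in onError, so a failed request does not interrupt the scene.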