Text to Audio

With Text to Audio, you convert text into an audio file. This is useful for scripts, explainer videos, listening material, read-aloud texts, and language education.

Getting started from the dashboard

On the dashboard, select Text to Audio below the input field. It is the button showing a chat icon, an arrow, and a sound wave icon. A larger input field then appears for the text you want spoken.

Choose Text to Audio on the dashboard

The larger input field makes it easier to enter longer scripts. Enter your text, then generate the audio.

Settings

Use the settings button next to the input field to adjust the speech settings.

Setting	Description
Model	Choose the text-to-speech model.
Language	Choose the language in which the text should be spoken.
Voice	Choose a voice suitable for the chosen language.
System prompt	Provide instructions for pronunciation, tone, tempo, accent, and special terms.
Style reference	Add extra cues about the desired speaking style.

The voice list is filtered by the chosen language. If a voice is intended for certain languages only, those languages are listed with the voice.
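The filtering described above can be pictured with a small sketch. This is not AI-School's implementation: the `Voice` type, the `voices_for` function, the voice names, and the language codes are all hypothetical and only illustrate the rule that a voice is shown when it has no language restriction or when it lists the chosen language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Hypothetical voice record; an empty tuple means no language restriction."""
    name: str
    languages: tuple[str, ...] = ()


def voices_for(language: str, voices: list[Voice]) -> list[Voice]:
    """Keep voices that are unrestricted or that list the chosen language."""
    return [v for v in voices if not v.languages or language in v.languages]


# Made-up catalogue, for illustration only.
CATALOGUE = [
    Voice("Voice A"),                # usable with any language
    Voice("Voice B", ("nl",)),       # Dutch only
    Voice("Voice C", ("en", "fr")),  # English and French only
]

print([v.name for v in voices_for("nl", CATALOGUE)])  # ['Voice A', 'Voice B']
```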

Pronunciation and style

The system prompt controls how the voice should sound. You can indicate, for example:

that the speaker should sound like a native Dutch speaker;
that words such as AI, AI-School, ChatGPT, OpenAI, and Gemini may be pronounced in English;
that Claude should be pronounced as a French name;
or that the tone should be calm, warm, formal, informal, low, or energetic.
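As an illustration, the instructions above could be combined into a single system prompt. The sketch below assembles one from the examples in the list. The function `build_system_prompt` and the exact wording of each sentence are assumptions made for this example; the default instructions that AI-School generates may read differently.

```python
def build_system_prompt(accent, english_terms, special_names, tone):
    """Assemble a plain-text system prompt from a few pronunciation and tone hints.

    All parameter names and the sentence wording are illustrative assumptions.
    """
    lines = [f"Speak like {accent}."]
    if english_terms:
        lines.append(
            "Pronounce these terms in English: " + ", ".join(english_terms) + "."
        )
    for name, hint in special_names.items():
        lines.append(f"Pronounce {name} {hint}.")
    lines.append(f"Keep the tone {tone}.")
    return "\n".join(lines)


prompt = build_system_prompt(
    accent="a native Dutch speaker",
    english_terms=["AI", "AI-School", "ChatGPT", "OpenAI", "Gemini"],
    special_names={"Claude": "as a French name"},
    tone="calm and warm",
)
print(prompt)
```

The printed text can be pasted into the System prompt setting and adjusted by hand.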

When you choose another language, AI-School adjusts the default instructions to that language.

Save and restore

You can save your settings to your account. AI-School then remembers, among other things, the model, language, voice, and system prompt. Select Restore defaults to remove these saved preferences.
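The save and restore behavior can be sketched as a small settings store. This is a conceptual model only: the `SpeechSettings` and `SettingsStore` names, the field names, and the default values are hypothetical and do not reflect how AI-School stores preferences.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SpeechSettings:
    """Hypothetical settings record; the default values are placeholders."""
    model: str = "default-model"
    language: str = "nl"
    voice: str = "default-voice"
    system_prompt: str = ""


DEFAULTS = SpeechSettings()


class SettingsStore:
    """Keeps at most one saved set of preferences per account."""

    def __init__(self):
        self._saved = None

    def save(self, settings: SpeechSettings) -> None:
        self._saved = settings

    def load(self) -> SpeechSettings:
        # Fall back to the defaults when nothing has been saved.
        return DEFAULTS if self._saved is None else self._saved

    def restore_defaults(self) -> SpeechSettings:
        # Restoring removes the saved preferences rather than overwriting them.
        self._saved = None
        return DEFAULTS


store = SettingsStore()
store.save(replace(DEFAULTS, language="en", voice="example-voice"))
print(store.load().language)              # en
print(store.restore_defaults() == DEFAULTS)  # True
```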

Result

After generation, the audio file appears directly in the chat. You can play it there with the audio player and download it with the download button.

During generation, the input form is temporarily disabled to prevent multiple audio generations from running simultaneously.
