Applio
A simple, high-quality voice conversion tool focused on ease of use and performance.
Select the voice model to use for the conversion.
Select the index file to use for the conversion.
Select the audio to convert.
Select the speaker ID to use for the conversion.
Split the audio into chunks for inference to obtain better results in some cases.
Apply a soft autotune to your inferences, recommended for singing conversions.
Clean your audio output using noise detection algorithms, recommended for spoken audio.
Enable formant shifting. Used for male-to-female conversions and vice versa.
Post-process the audio to apply effects to the output.
Presets are located in the /assets/formant_shift folder.
Apply reverb to the audio.
Apply pitch shift to the audio.
Apply limiter to the audio.
Apply gain to the audio.
Apply distortion to the audio.
Apply chorus to the audio.
Apply bitcrush to the audio.
Apply clipping to the audio.
Apply compressor to the audio.
Apply delay to the audio.
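As a rough illustration of what three of the effects listed above do to a signal, here is a minimal sketch in plain Python. It assumes audio is a list of float samples in the range [-1, 1]; the function names and parameters (gain_db, threshold_db, bit_depth) are hypothetical and are not taken from Applio's code, which may implement these effects differently.

```python
def apply_gain(samples, gain_db):
    """Scale every sample by a gain expressed in decibels."""
    factor = 10 ** (gain_db / 20)  # dB to linear amplitude ratio
    return [s * factor for s in samples]


def apply_clipping(samples, threshold_db):
    """Hard-clip samples so their magnitude never exceeds the threshold."""
    limit = 10 ** (threshold_db / 20)  # 0 dB corresponds to an amplitude of 1.0
    return [max(-limit, min(limit, s)) for s in samples]


def apply_bitcrush(samples, bit_depth):
    """Snap each sample to a coarse grid with spacing 1 / 2**(bit_depth - 1)."""
    steps = 2 ** (bit_depth - 1)
    return [round(s * steps) / steps for s in samples]
```

Gain changes loudness without altering the waveform shape, while clipping and bitcrush deliberately distort it: clipping flattens the peaks, and bitcrush reduces the number of distinct amplitude values.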
Please ensure compliance with the terms and conditions detailed in this document before proceeding with your inference.
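The option to split the audio into chunks for inference can be pictured with the sketch below. It assumes fixed-length chunks over a list of samples, which is a simplification chosen for illustration; the function name split_into_chunks and the chunk_size parameter are hypothetical, and Applio itself may choose chunk boundaries differently.

```python
def split_into_chunks(samples, chunk_size):
    """Split a sample list into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # The final chunk is shorter when the length is not a multiple of chunk_size.
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


def join_chunks(chunks):
    """Concatenate chunks back into a single sample list."""
    return [s for chunk in chunks for s in chunk]
```

Each chunk would be converted on its own and the results joined in order, which keeps the memory needed per inference step bounded by the chunk size.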
Name of the new model.
Path to the dataset folder.
It's recommended to deactivate this option if your dataset has already been processed.
It's recommended to keep this option deactivated if your dataset has already been processed.
Enabling this setting will result in the G and D files saving only their most recent versions, effectively conserving storage space.
This setting enables you to save the weights of the model at the conclusion of each epoch.
Utilize pretrained models when training your own. This approach reduces training duration and enhances overall quality.
Enable this setting only if you are training a new model from scratch or restarting the training. Deletes all previously generated weights and tensorboard logs.
Cache the dataset in GPU memory to speed up the training process.
Enables memory-efficient training. This reduces VRAM usage at the cost of slower training speed. It is useful for GPUs with limited memory (e.g., <6GB VRAM) or when training with a batch size larger than what your GPU can normally accommodate.
Utilizing custom pretrained models can lead to superior results, as selecting the most suitable pretrained models tailored to the specific use case can significantly enhance performance.
Detect overtraining to prevent the model from learning the training data too well and losing the ability to generalize to new data.
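One common way to detect overtraining is to stop once the loss has not improved for a set number of epochs. The sketch below shows that idea only; the class name OvertrainingDetector and the patience parameter are hypothetical, and the criterion Applio actually uses is not stated here.

```python
class OvertrainingDetector:
    """Signal a stop after `patience` consecutive epochs without a new best loss."""

    def __init__(self, patience):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def update(self, loss):
        """Record one epoch's loss and return True when training should stop."""
        if loss < self.best_loss:
            # New best: remember it and reset the counter.
            self.best_loss = loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience
```

A larger patience tolerates more noise in the loss curve but lets training run longer before it is stopped.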
Select the custom pretrained model for the generator.
Select the custom pretrained model for the discriminator.
Please ensure compliance with the terms and conditions detailed in this document before proceeding with your training.
The 'Upload' button is only for Google Colab: it uploads the exported files to the ApplioExported folder in your Google Drive.
Select the pth file to be exported.
Select the index file to be exported.
Select the voice model to use for the conversion.
Select the index file to use for the conversion.
Applio is speech-to-speech conversion software that uses EdgeTTS as middleware to run the Text-to-Speech (TTS) component. Read more about it here!
Select the TTS voice to use for the conversion.
Select the speaker ID to use for the conversion.
Split the audio into chunks for inference to obtain better results in some cases.
Apply a soft autotune to your inferences, recommended for singing conversions.
Clean your audio output using noise detection algorithms, recommended for spoken audio.
Please ensure compliance with the terms and conditions detailed in this document before proceeding with your inference.
Voice Blender
Select two voice models, set your desired blend percentage, and blend them into an entirely new voice.
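Blending two models by a percentage can be sketched as a weighted average of their parameters. The example below assumes each model is a dictionary mapping parameter names to equal-length lists of floats and that blend_percent is the share taken from the first model; these names and the linear-interpolation approach are illustrative assumptions, not a description of Applio's implementation.

```python
def blend_weights(weights_a, weights_b, blend_percent):
    """Return a weighted average of two models; blend_percent is model A's share."""
    if not 0 <= blend_percent <= 100:
        raise ValueError("blend_percent must be between 0 and 100")
    if weights_a.keys() != weights_b.keys():
        raise ValueError("both models must have the same parameter names")

    alpha = blend_percent / 100
    blended = {}
    for name in weights_a:
        a, b = weights_a[name], weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"parameter {name!r} has mismatched sizes")
        # Linear interpolation between corresponding values.
        blended[name] = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
    return blended
```

With blend_percent set to 100 the result equals the first model, with 0 it equals the second, and 50 gives the midpoint of the two.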
Download Model
Drop files
Download Pretrained Models
Select the pretrained model you want to download.
Then select the sampling rate.
How to Report an Issue on GitHub
- Click on the 'Record Screen' button below to start recording the issue you are experiencing.
- Once you have finished recording the issue, click on the 'Stop Recording' button (the same button, but the label changes depending on whether you are actively recording or not).
- Go to GitHub Issues and click on the 'New Issue' button.
- Complete the provided issue template, including details as needed, and use the assets section to upload the recorded file from the previous step.
Enables displaying the current Applio activity in Discord.
Select the theme you want to use. (Requires restarting Applio)
Select the language you want to use. (Requires restarting Applio)