Whisper Finetuning with PyTorch Lightning

This code implements finetuning of OpenAI Whisper models using PyTorch Lightning. Most of the code is inspired by (and partly copied directly from) whisper-finetuning. However, because this code is built on PyTorch Lightning, it also supports multi-GPU training. It additionally supports training with SpecAugment (based on the implementation in ESPNet).

The finetuning method implemented here (and in whisper-finetuning) is quite different from the finetuning code in HuggingFace Transformers and ESPNet. It is closer to the method that was actually used to train Whisper (according to the paper): finetuning is done on 30-second chunks extracted from long audio recordings, with intra-utterance timestamps. As a result, the finetuned model works very well for transcribing long audio recordings, e.g. with faster-whisper.
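To make the idea of 30-second chunks with intra-utterance timestamps more concrete, here is a minimal sketch of how a text target for one chunk could be assembled from timed transcript segments. It is not the code from this repository: the `Segment` class and the `timestamp_token` and `chunk_target` helpers are hypothetical names introduced for illustration, and the sketch simply skips segments that cross a chunk boundary. It assumes Whisper-style timestamp tokens of the form `<|1.00|>`, quantized to 0.02 seconds and measured from the start of the chunk.

```python
from dataclasses import dataclass
from typing import List

CHUNK_SECONDS = 30.0   # length of one Whisper input window
TIME_PRECISION = 0.02  # spacing of Whisper timestamp tokens, in seconds


@dataclass
class Segment:
    """One transcribed utterance (hypothetical helper type)."""
    start: float  # seconds from the start of the recording
    end: float
    text: str


def timestamp_token(seconds: float) -> str:
    """Format an offset within the chunk as a Whisper-style timestamp token."""
    quantized = round(seconds / TIME_PRECISION) * TIME_PRECISION
    return f"<|{quantized:.2f}|>"


def chunk_target(segments: List[Segment], chunk_start: float) -> str:
    """Build the text target for the 30-second window starting at chunk_start.

    Only segments lying fully inside the window are kept. Handling of
    segments that cross the window boundary is left out of this sketch.
    """
    chunk_end = chunk_start + CHUNK_SECONDS
    parts = []
    for seg in segments:
        if seg.start >= chunk_start and seg.end <= chunk_end:
            parts.append(
                timestamp_token(seg.start - chunk_start)
                + " " + seg.text.strip()
                + timestamp_token(seg.end - chunk_start)
            )
    return "".join(parts)


# Example: the window covers seconds 30 to 60 of a longer recording.
segments = [
    Segment(31.0, 33.4, "Hello there."),
    Segment(34.0, 36.5, "How are you?"),
    Segment(58.0, 63.0, "This one crosses the boundary."),
]
print(chunk_target(segments, chunk_start=30.0))
# <|1.00|> Hello there.<|3.40|><|4.00|> How are you?<|6.50|>
```

Because every utterance in the target is wrapped in start and end timestamps relative to the chunk, a model trained on such targets learns to emit timestamps itself, which is what long-form decoders rely on when they slide over a long recording.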

Usage

TODO

Repository: alumae/pl-whisper-finetuner