transformers

mirror of https://github.com/saymrwulf/transformers.git synced 2026-05-14 20:58:08 +00:00

Matthijs Hollemans e4bacf6614 [WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00

* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* more better tokenizer
* add CTC tokenizer
* fix decode and batch_code in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash

benchmark
deepspeed	[examples/deepspeed] fix renamed api (#21283)	2023-01-24 09:54:33 -08:00
extended	[bnb optim] fixing test (#21030)	2023-01-12 08:52:54 -08:00
fixtures	[WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00
generation	🚨🚨 Generate: standardize beam search behavior across frameworks (#21368)	2023-02-03 10:24:02 +00:00
mixed_int8	[`bnb`] Fine-tuning HF 8-bit models (#21290)	2023-02-02 16:39:23 +01:00
models	[WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00
onnx	Add Onnx Config for PoolFormer (#20868)	2022-12-23 01:30:57 -05:00
optimization
pipelines	Fix some pipeline tests (#21401)	2023-02-02 19:03:31 +01:00
repo_utils
sagemaker
tokenization
trainer	Add AWS Neuron torchrun support (#20806)	2023-01-18 11:21:19 -05:00
utils	Add the GeLU activation from pytorch with the tanh approximation (#21345)	2023-02-02 09:33:04 -05:00
__init__.py
test_configuration_common.py
test_feature_extraction_common.py	Add test_image_processing_common.py (#20785)	2023-01-23 13:48:30 +00:00
test_image_processing_common.py	Add test_image_processing_common.py (#20785)	2023-01-23 13:48:30 +00:00
test_image_transforms.py	Move convert_to_rgb to image_transforms module (#20784)	2022-12-15 18:47:04 +00:00
test_modeling_common.py	Add variant to transformers (#21332)	2023-02-01 09:21:52 +01:00
test_modeling_flax_common.py	Generate: save generation config with the models' `.save_pretrained()` (#21264)	2023-01-23 16:21:44 +00:00
test_modeling_tf_common.py	Generate: fix TF XLA tests on models with `max_position_embeddings` or `max_target_positions` (#21389)	2023-01-31 15:49:34 +00:00
test_sequence_feature_extraction_common.py
test_tokenization_common.py