transformers

mirror of https://github.com/saymrwulf/transformers.git synced 2026-05-14 20:58:08 +00:00

Matthijs Hollemans e4bacf6614 [WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00

* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* more better tokenizer
* add CTC tokenizer
* fix decode and batch_code in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash

albert.mdx
altclip.mdx	Add AltCLIP (#20446 )	2023-01-04 09:18:57 +01:00
audio-spectrogram-transformer.mdx	Add resources (#20872 )	2023-01-17 17:42:33 +01:00
auto.mdx	Add Universal Segmentation class + mapping (#20766 )	2022-12-16 14:22:46 +01:00
bart.mdx	Add TFBartForSequenceClassification (#20570 )	2022-12-07 18:05:39 +01:00
barthez.mdx
bartpho.mdx
beit.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
bert-generation.mdx
bert-japanese.mdx
bert.mdx	Add BERT resources (#19852 )	2022-11-01 11:09:53 -07:00
bertweet.mdx
big_bird.mdx	Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068 )	2022-11-04 11:32:44 -04:00
bigbird_pegasus.mdx	Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068 )	2022-11-04 11:32:44 -04:00
biogpt.mdx	Add BioGPT (#20420 )	2022-12-05 10:12:03 -05:00
bit.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
blenderbot-small.mdx	Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068 )	2022-11-04 11:32:44 -04:00
blenderbot.mdx	Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068 )	2022-11-04 11:32:44 -04:00
blip.mdx	`blip` support for training (#21021 )	2023-01-18 11:24:37 +01:00
bloom.mdx	docs: Resolve many typos in the English docs (#20088 )	2022-11-07 09:19:04 -05:00
bort.mdx
bridgetower.mdx	Add BridgeTower model (#20775 )	2023-01-25 14:04:32 -05:00
byt5.mdx
camembert.mdx
canine.mdx
chinese_clip.mdx	Add Chinese-CLIP implementation (#20368 )	2022-11-30 19:22:23 +01:00
clip.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
clipseg.mdx	[CLIPSeg] Add resources (#20118 )	2022-11-09 18:31:22 +01:00
codegen.mdx	Add CodeGen model (#17443 )	2022-06-24 17:10:38 +02:00
conditional_detr.mdx	Add segmentation + object detection image processors (#20160 )	2022-11-30 10:24:03 +00:00
convbert.mdx
convnext.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
cpm.mdx	Allow all imports from transformers (#17050 )	2022-05-02 12:47:39 -04:00
ctrl.mdx
cvt.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
data2vec.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
deberta-v2.mdx	Add DebertaV2ForMultipleChoice (#17135 )	2022-05-10 16:21:44 -04:00
deberta.mdx	Add to DeBERTa resources (#20155 )	2022-11-15 13:26:07 -05:00
decision_transformer.mdx
deformable_detr.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
deit.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
deta.mdx	[Docs] Minor fixes (#21383 )	2023-01-31 15:13:12 +01:00
detr.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
dialogpt.mdx
dinat.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
distilbert.mdx	add resources for distilbert (#19930 )	2022-10-28 13:16:07 -07:00
dit.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
donut.mdx	Add Donut image processor (#20425 )	2022-11-29 10:38:01 +00:00
dpr.mdx
dpt.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
efficientformer.mdx	Efficientformer (#20459 )	2023-01-20 11:35:42 +03:00
electra.mdx	[FlaxBert] Add ForCausalLM (#16995 )	2022-05-03 11:26:19 +02:00
encoder-decoder.mdx	[EncoderDecoder] Improve docs (#18271 )	2022-07-27 10:08:59 +02:00
ernie.mdx	add task_type_id to BERT to support ERNIE-2.0 and ERNIE-3.0 models (#18686 )	2022-09-09 07:36:46 -04:00
esm.mdx	Add ESMFold (#19977 )	2022-10-31 21:32:58 -04:00
flan-t5.mdx	Update flan-t5 original model link (#20897 )	2022-12-27 02:26:14 -05:00
flaubert.mdx
flava.mdx	AutoImageProcessor (#20111 )	2022-11-08 19:54:41 +00:00
fnet.mdx
fsmt.mdx
funnel.mdx
git.mdx	Add resources (#20872 )	2023-01-17 17:42:33 +01:00
glpn.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
gpt-sw3.mdx	Add gpt-sw3 model to transformers (#20209 )	2022-12-12 13:12:13 -05:00
gpt2.mdx	add in layer gpt2 tokenizer (#20421 )	2022-11-29 10:02:40 -05:00
gpt_neo.mdx
gpt_neox.mdx	[WIP] Adding GPT-NeoX-20B (#16659 )	2022-05-24 09:31:10 -04:00
gpt_neox_japanese.mdx	Add support for Japanese GPT-NeoX-based model by ABEJA, Inc. (#18814 )	2022-09-14 10:17:40 -04:00
gptj.mdx	Adding resource section to GPT-J docs (#21270 )	2023-01-30 16:48:04 -05:00
graphormer.mdx	Graphormer model for Graph Classification (#20968 )	2023-01-19 13:05:59 -05:00
groupvit.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
herbert.mdx
hubert.mdx
ibert.mdx
imagegpt.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
jukebox.mdx	Add Jukebox model (replaces #16875 ) (#17826 )	2022-11-10 21:05:27 +01:00
layoutlm.mdx	Added model resources for LayoutLM Issue#19848 (#21377 )	2023-02-03 08:53:16 -05:00
layoutlmv2.mdx	AutoImageProcessor (#20111 )	2022-11-08 19:54:41 +00:00
layoutlmv3.mdx	AutoImageProcessor (#20111 )	2022-11-08 19:54:41 +00:00
layoutxlm.mdx
led.mdx	Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068 )	2022-11-04 11:32:44 -04:00
levit.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
lilt.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
longformer.mdx	docs: Resolve many typos in the English docs (#20088 )	2022-11-07 09:19:04 -05:00
longt5.mdx	Update longt5.mdx (#18634 )	2022-08-16 10:20:46 -05:00
luke.mdx	Adding fine-tuning models to LUKE (#18353 )	2022-08-01 11:09:47 -04:00
lxmert.mdx
m2m_100.mdx	Fix `m2m_100.mdx` doc example missing `labels` (#19149 )	2022-09-29 13:27:58 +02:00
marian.mdx	Replace `as_target` context managers by direct calls (#18325 )	2022-07-29 08:09:09 -04:00
markuplm.mdx	Fix doctest for `MarkupLM` (#19845 )	2022-10-24 17:54:23 +02:00
mask2former.mdx	[Mask2Former] Add doc tests (#21232 )	2023-01-25 12:34:43 +01:00
maskformer.mdx	Add Mask2Former (#20792 )	2023-01-16 20:37:07 +03:00
mbart.mdx	Replace `as_target` context managers by direct calls (#18325 )	2022-07-29 08:09:09 -04:00
mctct.mdx	[Past CI] 🔥 Leave Past CI failures in the past 🔥 (#20861 )	2022-12-27 18:37:25 +01:00
megatron-bert.mdx
megatron_gpt2.mdx
mluke.mdx
mobilebert.mdx
mobilenet_v1.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
mobilenet_v2.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
mobilevit.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
mpnet.mdx
mt5.mdx	docs: Resolve many typos in the English docs (#20088 )	2022-11-07 09:19:04 -05:00
mvp.mdx	Add MVP model (#17787 )	2022-06-29 09:30:55 -04:00
nat.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
nezha.mdx	Nezha Pytorch implementation (#17776 )	2022-06-23 12:36:22 -04:00
nllb.mdx	Replace `as_target` context managers by direct calls (#18325 )	2022-07-29 08:09:09 -04:00
nystromformer.mdx
oneformer.mdx	[Mask2Former] Add doc tests (#21232 )	2023-01-25 12:34:43 +01:00
openai-gpt.mdx	Very small edit to change name to OpenAI GPT (#20722 )	2022-12-12 09:43:43 -05:00
opt.mdx	Add `OPTForQuestionAnswering` (#19402 )	2022-10-10 09:30:59 -04:00
owlvit.mdx	Improve OWL-ViT postprocessing (#20980 )	2023-01-03 19:25:09 +03:00
pegasus.mdx
pegasus_x.mdx	PEGASUS-X (#18551 )	2022-09-02 19:54:02 +02:00
perceiver.mdx	AutoImageProcessor (#20111 )	2022-11-08 19:54:41 +00:00
phobert.mdx
plbart.mdx	Replace `as_target` context managers by direct calls (#18325 )	2022-07-29 08:09:09 -04:00
poolformer.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
prophetnet.mdx	Update documentation on seq2seq models with absolute positional embeddings, to be in line with Tips section for BERT and GPT2 (#20068 )	2022-11-04 11:32:44 -04:00
qdqbert.mdx
rag.mdx
realm.mdx
reformer.mdx
regnet.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
rembert.mdx
resnet.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
retribert.mdx
roberta-prelayernorm.mdx	Implement Roberta PreLayerNorm (#20305 )	2022-12-19 09:30:17 +01:00
roberta.mdx	Add RoBERTa resources (#19911 )	2022-10-27 11:33:15 -07:00
roc_bert.mdx	Add RocBert (#20013 )	2022-11-08 10:03:43 -05:00
roformer.mdx
segformer.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
sew-d.mdx
sew.mdx
speech-encoder-decoder.mdx	Replace `as_target` context managers by direct calls (#18325 )	2022-07-29 08:09:09 -04:00
speech_to_text.mdx	Fix some doctests after PR 15775 (#20036 )	2022-11-03 14:18:45 +01:00
speech_to_text_2.mdx	Generate: move generation_.py src files into generation/.py (#20096 )	2022-11-09 15:34:08 +00:00
speecht5.mdx	[WIP] add SpeechT5 model (#18922 )	2023-02-03 12:43:46 -05:00
splinter.mdx	docs: Resolve many typos in the English docs (#20088 )	2022-11-07 09:19:04 -05:00
squeezebert.mdx
swin.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
swin2sr.mdx	Add Swin2SR (#19784 )	2022-12-16 16:24:01 +01:00
swinv2.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
switch_transformers.mdx	Add Switch transformers (#19323 )	2022-11-15 13:06:45 +01:00
t5.mdx	Generate: move generation_.py src files into generation/.py (#20096 )	2022-11-09 15:34:08 +00:00
t5v1.1.mdx	docs: Resolve many typos in the English docs (#20088 )	2022-11-07 09:19:04 -05:00
table-transformer.mdx	Update doc examples feature extractor -> image processor (#20501 )	2022-11-30 14:50:55 +00:00
tapas.mdx	Fix tapas scatter (#20149 )	2022-11-14 01:04:26 -05:00
tapex.mdx
time_series_transformer.mdx	time series forecasting model (#17965 )	2022-09-30 15:32:59 -04:00
timesformer.mdx	[New Model] Add TimeSformer model (#18908 )	2022-12-02 09:13:25 +01:00
trajectory_transformer.mdx	Add trajectory transformer (#17141 )	2022-05-17 19:07:43 -04:00
transfo-xl.mdx
trocr.mdx	Update doc examples feature extractor -> image processor (#20501 )	2022-11-30 14:50:55 +00:00
ul2.mdx	Add UL2 (just docs) (#17740 )	2022-06-21 10:24:50 +02:00
unispeech-sat.mdx	Add Wav2Vec2Conformer (#16812 )	2022-05-17 00:43:16 +02:00
unispeech.mdx	Add Wav2Vec2Conformer (#16812 )	2022-05-17 00:43:16 +02:00
upernet.mdx	[Docs] Minor fixes (#21383 )	2023-01-31 15:13:12 +01:00
van.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
videomae.mdx	Update doc examples feature extractor -> image processor (#20501 )	2022-11-30 14:50:55 +00:00
vilt.mdx	AutoImageProcessor (#20111 )	2022-11-08 19:54:41 +00:00
vision-encoder-decoder.mdx	Update doc examples feature extractor -> image processor (#20501 )	2022-11-30 14:50:55 +00:00
vision-text-dual-encoder.mdx	docs: Resolve many typos in the English docs (#20088 )	2022-11-07 09:19:04 -05:00
visual_bert.mdx	Update doc examples feature extractor -> image processor (#20501 )	2022-11-30 14:50:55 +00:00
vit.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
vit_hybrid.mdx	Add BiT + ViT hybrid (#20550 )	2022-12-07 11:03:39 +01:00
vit_mae.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
vit_msn.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
wav2vec2-conformer.mdx	[Wav2Vec2Conformer] Official release (#17709 )	2022-06-15 18:34:15 +02:00
wav2vec2.mdx	Add wav2vec2 resources (#19931 )	2022-10-28 13:28:18 -07:00
wav2vec2_phoneme.mdx
wavlm.mdx	Add Wav2Vec2Conformer (#16812 )	2022-05-17 00:43:16 +02:00
whisper.mdx	Generate: move generation_.py src files into generation/.py (#20096 )	2022-11-09 15:34:08 +00:00
xclip.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
xglm.mdx	Add TF implementation of `XGLMModel` (#16543 )	2022-08-24 10:51:05 +01:00
xlm-prophetnet.mdx
xlm-roberta-xl.mdx
xlm-roberta.mdx	Remove Roberta Dependencies from XLM Roberta Flax and Tensorflow models (#21047 )	2023-01-18 07:49:39 -05:00
xlm.mdx
xlnet.mdx
xls_r.mdx
xlsr_wav2vec2.mdx
yolos.mdx	Add batch of resources (#20647 )	2023-01-17 17:18:56 +01:00
yoso.mdx