transformers/tests
Khai Mai c5c69096b3
Exclude the load balancing loss of padding tokens in Mixtral-8x7B (#28517)
* fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask

* format code using black and ruff

* skip computing mask if attention_mask=None

* add tests for load balancing loss Mixtral-Moe

* fix assert loss is different in mixtral_test

* fix pad_leng

* use assertNotAlmostEqual and print to debug

* remove print for debug

* minor updates

* reduce rtol and atol
2024-01-24 10:12:14 +01:00
benchmark
bettertransformer
deepspeed Fix initialization for missing parameters in from_pretrained under ZeRO-3 (#28245) 2024-01-09 14:58:21 +00:00
extended
fixtures
fsdp fix resuming from ckpt when using FSDP with FULL_STATE_DICT (#27891) 2023-12-16 19:41:43 +05:30
generation Fix _speculative_sampling implementation (#28508) 2024-01-19 14:07:31 +00:00
models Exclude the load balancing loss of padding tokens in Mixtral-8x7B (#28517) 2024-01-24 10:12:14 +01:00
optimization
peft_integration [Peft] modules_to_save support for peft integration (#27466) 2023-11-14 10:32:57 +01:00
pipelines [Whisper] Finalize batched SOTA long-form generation (#27658) 2024-01-19 14:04:17 +02:00
quantization [GPTQ] Fix test (#28018) 2024-01-15 11:22:54 -05:00
repo_utils Allow # Ignore copy (#27328) 2023-12-07 10:00:08 +01:00
sagemaker Broken links fixed related to datasets docs (#27569) 2023-11-17 13:44:09 -08:00
tokenization [Styling] stylify using ruff (#27144) 2023-11-16 17:43:19 +01:00
tools
trainer Avoid root logger's level being changed (#28638) 2024-01-22 14:45:30 +01:00
utils Enable instantiating model with pretrained backbone weights (#28214) 2024-01-23 11:01:50 +00:00
__init__.py
test_backbone_common.py Align backbone stage selection with out_indices & out_features (#27606) 2023-12-20 18:33:17 +00:00
test_cache_utils.py Generate: SinkCache can handle iterative prompts (#27907) 2023-12-08 20:02:20 +00:00
test_configuration_common.py [PretrainedConfig] Improve messaging (#27438) 2023-11-15 14:10:39 +01:00
test_configuration_utils.py Config: warning when saving generation kwargs in the model config (#28514) 2024-01-16 18:31:01 +00:00
test_feature_extraction_common.py
test_feature_extraction_utils.py Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
test_image_processing_common.py Fix a couple of typos and add an illustrative test (#26941) 2023-12-11 15:51:51 +00:00
test_image_processing_utils.py Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
test_image_transforms.py Normalize floating point cast (#27249) 2023-11-10 15:35:27 +00:00
test_modeling_common.py Fix SDPA tests (#28552) 2024-01-17 17:29:18 +01:00
test_modeling_flax_common.py
test_modeling_flax_utils.py Enable safetensors conversion from PyTorch to other frameworks without the torch requirement (#27599) 2024-01-23 10:28:23 +01:00
test_modeling_tf_common.py Replace build() with build_in_name_scope() for some TF tests (#28046) 2023-12-14 17:42:25 +00:00
test_modeling_tf_utils.py Replace build() with build_in_name_scope() for some TF tests (#28046) 2023-12-14 17:42:25 +00:00
test_modeling_utils.py Use LoggingLevel context manager in 3 tests (#28575) 2024-01-18 13:41:25 +00:00
test_pipeline_mixin.py Shorten the conversation tests for speed + fixing position overflows (#26960) 2023-10-31 14:20:04 +00:00
test_processing_common.py Don't save processor_config.json if a processor has no extra attribute (#28584) 2024-01-19 09:59:14 +00:00
test_sequence_feature_extraction_common.py
test_tokenization_common.py [TokenizationUtils] Fix add_special_tokens when the token is already there (#28520) 2024-01-16 16:36:29 +01:00
test_tokenization_utils.py Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00