Commit graph

  • cb5de19d8f add key_mapping keyword refactor-from-pretrained Cyril Vallez 2025-02-10 15:09:45 +0100
  • 65d0cbc495 Update modeling_utils.py Cyril Vallez 2025-02-10 13:58:13 +0100
  • 8971d57339 fix fix-bnb-warning Marc Sun 2025-02-10 13:41:38 +0100
  • 0bd85ed634 Merge branch 'main' into skip_blip2_torchscript skip_blip2_torchscript Yih-Dar 2025-02-10 13:37:04 +0100
  • 3b465d3472 add forgotten check Cyril Vallez 2025-02-10 13:04:30 +0100
  • bea683cf18 style Cyril Vallez 2025-02-10 13:01:03 +0100
  • bd3b5fa50f finalize sound renaming logic Cyril Vallez 2025-02-10 13:00:36 +0100
  • b6b0981483 Merge branch 'main' into revert-checkpoint-tmp-dir revert-checkpoint-tmp-dir Marc Sun 2025-02-10 12:54:22 +0100
  • f11085d6ea Update modeling_utils.py Cyril Vallez 2025-02-10 12:35:57 +0100
  • 7a86256c89 add sanity check back (a test depends on it) Cyril Vallez 2025-02-10 12:23:55 +0100
  • 6bbe016c78 remove unused check Cyril Vallez 2025-02-10 12:10:39 +0100
  • 89c740d07a simplify renaming logic Cyril Vallez 2025-02-10 11:52:55 +0100
  • 1c0c5cf7b4 simplify Cyril Vallez 2025-02-10 11:15:18 +0100
  • c9d2b88ffe Revert "Save checkpoint to temporary directory to handle partial saves during failures (#35580)" Marc Sun 2025-02-10 10:58:06 +0100
  • b3d1068951 Revert "Fix OS err (#36094)" Marc Sun 2025-02-10 10:57:34 +0100
  • f7f2c896c2 Update modeling_utils.py Cyril Vallez 2025-02-10 10:45:46 +0100
  • dd4207c3ed Update hub.py Cyril Vallez 2025-02-10 10:41:13 +0100
  • 820ae396a4 update adding_fp8_quantization MekkCyber 2025-02-10 07:06:53 +0000
  • f22ff5c2e1 CIs Cyril Vallez 2025-02-08 22:39:27 +0100
  • c7b175ec95 simplify Cyril Vallez 2025-02-08 20:15:48 +0100
  • e6e75102e8 fix tp plan registry Cyril Vallez 2025-02-08 19:14:34 +0100
  • 7a87812f3e Remove unused function and move back _get_tp_registry Cyril Vallez 2025-02-08 17:12:48 +0100
  • aca9b22c25 remove old functions Cyril Vallez 2025-02-08 16:44:50 +0100
  • 1a5bfc1de5 fix style MekkCyber 2025-02-08 14:52:28 +0000
  • d41e11b3f1 update MekkCyber 2025-02-08 14:37:19 +0000
  • 98aa2bdad6 remove old function Cyril Vallez 2025-02-08 15:30:23 +0100
  • 68a3234aa8 fix consistency MekkCyber 2025-02-08 12:25:38 +0000
  • 8d637f15de fix imports MekkCyber 2025-02-08 11:36:54 +0000
  • 3ead98b2f6 fix style MekkCyber 2025-02-08 11:23:12 +0000
  • 33f73712dc end2end MekkCyber 2025-02-08 11:18:52 +0000
  • 83912834cc fix quantization logic MekkCyber 2025-02-05 09:44:31 +0000
  • 70749dfd9b fix create_quantized_param MekkCyber 2025-02-04 17:09:58 +0000
  • 3700bbc09f adding kernels MekkCyber 2025-02-04 14:57:18 +0000
  • b0c3641f56 first commit MekkCyber 2025-02-04 09:58:42 +0000
  • 33f9e49a5b fast download first prototype Cyril Vallez 2025-02-08 00:10:30 +0100
  • 0642700ecd fix Cyril Vallez 2025-02-07 23:18:10 +0100
  • 9f66405c2e add sync test-deepseek-fp8 MekkCyber 2025-02-07 21:04:53 +0000
  • f8a963c116 Loss_function muellerzr-more-models-sadface [[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL 2025-02-07 12:35:22 -0500
  • f839aa20fe Style [[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL 2025-02-07 12:23:57 -0500
  • f4c3dadfb1 Update modeling_utils.py Cyril Vallez 2025-02-07 18:21:47 +0100
  • 919bcbeca7 Moar bronked [[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL 2025-02-07 12:20:52 -0500
  • f459bf8177 Update modeling_utils.py Cyril Vallez 2025-02-07 17:19:48 +0100
  • 87e97ea54c Update modeling_utils.py Cyril Vallez 2025-02-07 16:24:42 +0100
  • 8aa45e177e Fix bug in apply_rotary_pos_emb_flashatt: in Qwen2-5-VL (#36065) DeepWave 2025-02-07 17:43:45 +0800
  • 128b840247 Adding RT-DETRv2 for object detection (#34773) Jade Choghari 2025-02-06 11:28:45 -0800
  • 4d1480d567 [docs] fix outdated example code in trainer.md (#36066) Fanli Lin 2025-02-07 02:54:22 +0800
  • 9c3009a391 Fix StopStringCriteria to handle tokens above len(tokenizer) (#35797) Matt 2025-02-06 16:53:28 +0000
  • e173ffd3ba Fix model kwargs (#35875) Zach Mueller 2025-02-06 11:35:25 -0500
  • b0cb3d0cc5 Fix words typos in ggml test. (#36060) 湛露先生 2025-02-06 23:32:40 +0800
  • c3e530a184 Nail in edge case of torch dtype being overriden permantly in the case of an error (#35845) Zach Mueller 2025-02-06 09:05:23 -0500
  • 20d17358c4 Save checkpoint to temporary directory to handle partial saves during failures (#35580) SilverSoldier 2025-02-06 19:18:05 +0530
  • 987e09e549 Paligemma: fix generation with Gemma2 (#36044) Raushan Turganbay 2025-02-06 14:31:32 +0100
  • 4886cb79c1 Update test_flash_attn_2_can_dispatch_composite_models (#36050) Yih-Dar 2025-02-06 12:09:49 +0100
  • 8201506d28 Fix repo consistency (#36063) Yih-Dar 2025-02-06 11:53:15 +0100
  • 1bfcddbba7 Fix usage of unpad_input function (#35925) Pavel Gein 2025-02-06 15:33:42 +0500
  • d68779b358 Iterative generation using Input embeds and past_key_values (#35890) Yaswanth Gali 2025-02-06 15:36:05 +0530
  • b44d36e10f Add Qwen2VLImageProcessorFast into Qwen2VLProcessor (#35987) Ye Liu 2025-02-06 17:03:09 +0800
  • 5f597b7460 Fix Audio Classification Pipeline top_k Documentation Mismatch and Bug #35736 (#35771) Sambhav Dixit 2025-02-05 21:55:08 +0530
  • 31bbef04b9 Fix how we compute the final non-padding token for ForSequenceClassification models (#35911) Matt 2025-02-05 16:23:33 +0000
  • ac4acde462 [docs] no hard-coding cuda (#36043) Fanli Lin 2025-02-06 00:22:33 +0800
  • 8e499a203c [docs] fix bugs in the bitsandbytes documentation (#35868) Fanli Lin 2025-02-06 00:21:20 +0800
  • 741865eebf [docs] no hard coding cuda as bnb has multi-backend support (#35867) Fanli Lin 2025-02-06 00:20:02 +0800
  • 1e929f357f DeepSpeed github repo move sync (#36021) Stas Bekman 2025-02-05 08:19:31 -0800
  • c14c2cc006 add support for empty list as input to create_model_card (#36042) ROZBEH 2025-02-05 04:29:17 -0800
  • c21f13fe21 Add XPU type for work-around -inf mask causing sdpa NaN issue in modeling files (#35647) Liangliang Ma 2025-02-05 20:28:31 +0800
  • dc3a523e9d Fix synced multi-GPU generation with LLMs and VLMs (#35893) ManukyanD 2025-02-05 14:15:11 +0400
  • 99f9afb079 commit MekkCyber 2025-02-07 10:41:53 +0000
  • c341f8c6fb style Cyril Vallez 2025-02-07 11:41:07 +0100
  • c95dc4ebe7 update Cyril Vallez 2025-02-07 11:39:28 +0100
  • f88bb46428 update Cyril Vallez 2025-02-07 11:07:53 +0100
  • 298b3f1930 v4.48.3 v4.48.3 v4.48-release Arthur Zucker 2025-02-07 10:32:49 +0100
  • d28f0207d5 GPTNeoX needs kwargs Arthur Zucker 2025-02-07 10:14:53 +0100
  • 8025c92c7b trigger CIs Cyril Vallez 2025-02-07 02:03:21 +0100
  • a17393187f Update test_modeling_common.py Cyril Vallez 2025-02-07 01:54:55 +0100
  • 42f02f6ee1 update Cyril Vallez 2025-02-07 01:47:44 +0100
  • 3d6e55c7e7 Fix model kwargs (#35875) Zach Mueller 2025-02-06 11:35:25 -0500
  • 45c858a5d8 fix Cyril Vallez 2025-02-06 18:38:30 +0100
  • 3b56ad06c4 much clearer renaming of keys Cyril Vallez 2025-02-06 18:32:25 +0100
  • c631f36cca quality muellerzr-speedup-modular-conversion [[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL 2025-02-06 12:26:38 -0500
  • 139f2cae03 Brr [[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL 2025-02-06 12:22:44 -0500
  • 421bf8611a fix 2 nit_cleanup ydshieh 2025-02-06 15:58:29 +0100
  • 99f9e044d7 commit MekkCyber 2025-02-06 14:56:12 +0000
  • f0adc01768 add special tokens update-special-tokens Arthur Zucker 2025-02-06 15:14:03 +0100
  • 093bebcdd9 Paligemma: fix generation with Gemma2 (#36044) Raushan Turganbay 2025-02-06 14:31:32 +0100
  • 97a6cf9072 Fix device in rope module when using dynamic updates (#35608) Cyril Vallez 2025-01-13 10:11:17 +0100
  • 8aca12c774 Update modeling_utils.py Cyril Vallez 2025-02-06 11:19:55 +0100
  • c4cbed8081 Processor: prevent duplicated tokens processor-template-duplicated-tokens Pedro Cuenca 2025-02-06 10:41:05 +0100
  • b149b1f6fe trigger CIs Cyril Vallez 2025-02-05 23:36:06 +0100
  • 11c0bde616 fix weird in-place op Cyril Vallez 2025-02-05 23:28:26 +0100
  • 27e1615466 style Cyril Vallez 2025-02-05 22:12:55 +0100
  • ff1078387e new first tp loading version Cyril Vallez 2025-02-05 21:44:46 +0100
  • c3e818561e Update modeling_utils.py Cyril Vallez 2025-02-05 15:35:10 +0100
  • 1bdb7bba52 Update modeling_utils.py Cyril Vallez 2025-02-05 15:27:08 +0100
  • a3401c3e23 keep improving Cyril Vallez 2025-02-05 15:14:50 +0100
  • 574e3f76c9 remove _fast_init Cyril Vallez 2025-02-05 13:57:47 +0100
  • a6ffd7b4d2 Update modeling_utils.py Cyril Vallez 2025-02-05 13:53:54 +0100
  • 11e378024d add type hints/docstring Cyril Vallez 2025-02-05 13:52:57 +0100
  • 6ba13f577b update ydshieh 2025-02-05 13:44:09 +0100
  • 82ca6920c6 update ydshieh 2025-02-05 13:40:59 +0100
  • 11e31ec24f Add future import for Py < 3.10 (#35666) Matt 2025-01-15 12:45:43 +0000