transformers/docker/transformers-quantization-latest-gpu/Dockerfile

FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04
LABEL maintainer="Hugging Face"

ARG DEBIAN_FRONTEND=noninteractive

# Use login shell to read variables from `~/.profile` (to pass dynamic created variables between RUN commands)
SHELL ["sh", "-lc"]

# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).

ARG PYTORCH='2.3.0'
# Example: `cu102`, `cu113`, etc.
ARG CUDA='cu121'

RUN apt update
RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python python3-pip ffmpeg
RUN python3 -m pip install --no-cache-dir --upgrade pip

ARG REF=main
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF

RUN [ ${#PYTORCH} -gt 0 ] && VERSION='torch=='$PYTORCH'.*' ||  VERSION='torch'; echo "export VERSION='$VERSION'" >> ~/.profile
RUN echo torch=$VERSION
# `torchvision` and `torchaudio` should be installed along with `torch`, especially for nightly build.
# Currently, let's just use their latest releases (when `torch` is installed with a release version)
RUN python3 -m pip install --no-cache-dir -U $VERSION torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/$CUDA

RUN python3 -m pip install --no-cache-dir -e ./transformers[dev-torch]

RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate

# needed in bnb and awq
RUN python3 -m pip install --no-cache-dir einops

# Add bitsandbytes for mixed int8 testing
RUN python3 -m pip install --no-cache-dir bitsandbytes

# Add auto-gptq for gtpq quantization testing
RUN python3 -m pip install --no-cache-dir auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

# Add optimum for gptq quantization testing
RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/optimum@main#egg=optimum

# Add aqlm for quantization testing
RUN python3 -m pip install --no-cache-dir aqlm[gpu]==1.0.2

# Add hqq for quantization testing
RUN python3 -m pip install --no-cache-dir hqq

# Add autoawq for quantization testing
# >=v0.2.3 needed for compatibility with torch 2.2.1
RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.3/autoawq-0.2.3+cu118-cp38-cp38-linux_x86_64.whl

# Add quanto for quantization testing
RUN python3 -m pip install --no-cache-dir quanto

# Add eetq for quantization testing
RUN python3 -m pip install git+https://github.com/NetEase-FuXi/EETQ.git

# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.
RUN cd transformers && python3 setup.py develop
Use `torch 2.3` for CI (#30837) 2.3 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> 2024-05-15 17:31:52 +00:00			`FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04`
[CI] Quantization workflow (#29046) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> 2024-02-28 15:09:25 +00:00			`LABEL maintainer="Hugging Face"`

			`ARG DEBIAN_FRONTEND=noninteractive`

			# Use login shell to read variables from `~/.profile` (to pass dynamic created variables between RUN commands)
			`SHELL ["sh", "-lc"]`

			# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
			`# to be used as arguments for docker build (so far).`

Use `torch 2.3` for CI (#30837) 2.3 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> 2024-05-15 17:31:52 +00:00			`ARG PYTORCH='2.3.0'`
[CI] Quantization workflow (#29046) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> 2024-02-28 15:09:25 +00:00			# Example: `cu102`, `cu113`, etc.
Use `torch 2.3` for CI (#30837) 2.3 Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> 2024-05-15 17:31:52 +00:00			`ARG CUDA='cu121'`
[CI] Quantization workflow (#29046) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> 2024-02-28 15:09:25 +00:00
			`RUN apt update`
			`RUN apt install -y git libsndfile1-dev tesseract-ocr espeak-ng python python3-pip ffmpeg`
			`RUN python3 -m pip install --no-cache-dir --upgrade pip`

			`ARG REF=main`
			`RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF`

			`RUN [ ${#PYTORCH} -gt 0 ] && VERSION='torch=='$PYTORCH'.*' \|\| VERSION='torch'; echo "export VERSION='$VERSION'" >> ~/.profile`
			`RUN echo torch=$VERSION`
			# `torchvision` and `torchaudio` should be installed along with `torch`, especially for nightly build.
			# Currently, let's just use their latest releases (when `torch` is installed with a release version)
			`RUN python3 -m pip install --no-cache-dir -U $VERSION torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/$CUDA`

			`RUN python3 -m pip install --no-cache-dir -e ./transformers[dev-torch]`

			`RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate`

Fix quantization tests (#29914) * revert back to torch 2.1.1 * run test * switch to torch 2.2.1 * udapte dockerfile * fix awq tests * fix test * run quanto tests * update tests * split quantization tests * fix * fix again * final fix * fix report artifact * build docker again * Revert "build docker again" This reverts commit 399a5f9d9308da071d79034f238c719de0f3532e. * debug * revert * style * new notification system * testing notfication * rebuild docker * fix_prev_ci_results * typo * remove warning * fix typo * fix artifact name * debug * issue fixed * debug again * fix * fix time * test notif with faling test * typo * issues again * final fix ? * run all quantization tests again * remove name to clear space * revert modfiication done on workflow * fix * build docker * build only quant docker * fix quantization ci * fix * fix report * better quantization_matrix * add print * revert to the basic one 2024-04-09 15:10:29 +00:00			`# needed in bnb and awq`
			`RUN python3 -m pip install --no-cache-dir einops`

[CI] Quantization workflow (#29046) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> 2024-02-28 15:09:25 +00:00			`# Add bitsandbytes for mixed int8 testing`
			`RUN python3 -m pip install --no-cache-dir bitsandbytes`

			`# Add auto-gptq for gtpq quantization testing`
			`RUN python3 -m pip install --no-cache-dir auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/`

			`# Add optimum for gptq quantization testing`
			`RUN python3 -m pip install --no-cache-dir git+https://github.com/huggingface/optimum@main#egg=optimum`

			`# Add aqlm for quantization testing`
			`RUN python3 -m pip install --no-cache-dir aqlm[gpu]==1.0.2`

Add HQQ quantization support (#29637) * update HQQ transformers integration * push import_utils.py * add force_hooks check in modeling_utils.py * fix \| with Optional * force bias as param * check bias is Tensor * force forward for multi-gpu * review fixes pass * remove torch grad() * if any key in linear_tags fix * add cpu/disk check * isinstance return * add multigpu test + refactor tests * clean hqq_utils imports in hqq.py * clean hqq_utils imports in quantizer_hqq.py * delete hqq_utils.py * Delete src/transformers/utils/hqq_utils.py * ruff init * remove torch.float16 from __init__ in test * refactor test * isinstance -> type in quantizer_hqq.py * cpu/disk device_map check in quantizer_hqq.py * remove type(module) nn.linear check in quantizer_hqq.py * add BaseQuantizeConfig import inside HqqConfig init * remove hqq import in hqq.py * remove accelerate import from test_hqq.py * quant config.py doc update * add hqqconfig to main_classes doc * make style * __init__ fix * ruff __init__ * skip_modules list * hqqconfig format fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * test_hqq.py remove mistral comment * remove self.using_multi_gpu is False * torch_dtype default val set and logger.info * hqq.py isinstance fix * remove torch=None * torch_device test_hqq * rename test_hqq * MODEL_ID in test_hqq * quantizer_hqq setattr fix * quantizer_hqq typo fix * imports quantizer_hqq.py * isinstance quantizer_hqq * hqq_layer.bias reformat quantizer_hqq * Step 2 as comment in quantizer_hqq * prepare_for_hqq_linear() comment * keep_in_fp32_modules fix * HqqHfQuantizer reformat * quantization.md hqqconfig * quantization.md model example reformat * quantization.md # space * quantization.md space }) * quantization.md space }) * quantization_config fix doc Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * axis value check in quantization_config * format * dynamic config explanation * quant config method in quantization.md * remove shard-level progress * .cuda fix modeling_utils * test_hqq fixes * make fix-copies --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-05-02 16:51:49 +00:00			`# Add hqq for quantization testing`
			`RUN python3 -m pip install --no-cache-dir hqq`

[CI] Quantization workflow (#29046) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> 2024-02-28 15:09:25 +00:00			`# Add autoawq for quantization testing`
Fix quantization tests (#29914) * revert back to torch 2.1.1 * run test * switch to torch 2.2.1 * udapte dockerfile * fix awq tests * fix test * run quanto tests * update tests * split quantization tests * fix * fix again * final fix * fix report artifact * build docker again * Revert "build docker again" This reverts commit 399a5f9d9308da071d79034f238c719de0f3532e. * debug * revert * style * new notification system * testing notfication * rebuild docker * fix_prev_ci_results * typo * remove warning * fix typo * fix artifact name * debug * issue fixed * debug again * fix * fix time * test notif with faling test * typo * issues again * final fix ? * run all quantization tests again * remove name to clear space * revert modfiication done on workflow * fix * build docker * build only quant docker * fix quantization ci * fix * fix report * better quantization_matrix * add print * revert to the basic one 2024-04-09 15:10:29 +00:00			`# >=v0.2.3 needed for compatibility with torch 2.2.1`
			`RUN python3 -m pip install --no-cache-dir https://github.com/casper-hansen/AutoAWQ/releases/download/v0.2.3/autoawq-0.2.3+cu118-cp38-cp38-linux_x86_64.whl`
[CI] Quantization workflow (#29046) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> 2024-02-28 15:09:25 +00:00
[Quantization] Quanto quantizer (#29023) * start integration * fix * add and debug tests * update tests * make pytorch serialization works * compatible with device_map and offload * fix tests * make style * add ref * guard against safetensors * add float8 and style * fix is_serializable * Fix shard_checkpoint compatibility with quanto * more tests * docs * adjust memory * better * style * pass tests * Update src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add is_safe_serialization instead * Update src/transformers/quantizers/quantizer_quanto.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add QbitsTensor tests * fix tests * simplify activation list * Update docs/source/en/quantization.md Co-authored-by: David Corvoysier <david.corvoysier@gmail.com> * better comment * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: David Corvoysier <david.corvoysier@gmail.com> * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: David Corvoysier <david.corvoysier@gmail.com> * find and fix edge case * Update docs/source/en/quantization.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * pass weights_only_kwarg instead * fix shard_checkpoint loading * simplify update_missing_keys * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * recursion to get all tensors * block serialization * skip serialization tests * fix * change by cuda:0 for now * fix regression * update device_map * fix doc * add noteboon * update torch_dtype * update doc * typo * typo * remove comm --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: David Corvoysier <david.corvoysier@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Younes Belkada <younesbelkada@gmail.com> 2024-03-15 15:51:29 +00:00			`# Add quanto for quantization testing`
			`RUN python3 -m pip install --no-cache-dir quanto`

[FEAT]: EETQ quantizer support (#30262) * [FEAT]: EETQ quantizer support * Update quantization.md * Update docs/source/en/main_classes/quantization.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update docs/source/en/quantization.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update docs/source/en/quantization.md Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/__init__.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/__init__.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/eetq_integration/test_eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/eetq_integration/test_eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/eetq_integration/test_eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/eetq_integration/test_eetq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * [FEAT]: EETQ quantizer support * [FEAT]: EETQ quantizer support * remove whitespaces * update quantization.md * style * Update docs/source/en/quantization.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add copyright * Update quantization.md * Update docs/source/en/quantization.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/quantization.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address the comments by amyeroberts * style --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Marc Sun <marc@huggingface.co> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> 2024-04-22 19:38:58 +00:00			`# Add eetq for quantization testing`
			`RUN python3 -m pip install git+https://github.com/NetEase-FuXi/EETQ.git`

[CI] Quantization workflow (#29046) * [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> 2024-02-28 15:09:25 +00:00			# When installing in editable mode, `transformers` is not recognized as a package.
			`# this line must be added in order for python to be aware of transformers.`
			`RUN cd transformers && python3 setup.py develop`