onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-25 02:50:42 +00:00

Author	SHA1	Message	Date
pengwa	2c6b31c5aa	FP16 optimizer automatically detect DeepSpeed compatibility (#18084 ) ### FP16 optimizer automatically detect DeepSpeed compatibility Optimum/Transformers are using accelerate lib to prepare models, so our FP16 optimizer wrapper does not work for long time. Because the namespace is `accelerate.utils.deepspeed.DeepSpeedOptimizerWrapper`, which underlying is still calling into DeepSpeed stage1and2 optimizer. This PR includes following changes: 1. Add `accelerate.utils.deepspeed.DeepSpeedOptimizerWrapper` in the modifier registry, plus a check on its contained `optimizer` property MUST be DeepSpeed stage 1 and 2 optimizer. (let's cover Stage 3 optimizer later) 2. For DeepSpeed version > 0.9.1, we will store the source code in a version list. As long as the related function in DeepSpeed remains unchanged during its new release, we won't need manually upgrade the version check any more. If some day, the source code did not match, a warning will be raised to users, to add a new version of source code in the list. With the above change, we will have our FP16 Optimizer working again in Optimum. ![image](https://github.com/microsoft/onnxruntime/assets/10530022/d35b4aa9-b371-46f1-98ae-73114f91179b)	2023-10-25 15:11:02 +08:00
Justin Chu	be7541ef4a	[Linter] Bump ruff and remove pylint (#17797 ) Bump ruff version and remove pylint from the linter list. Fix any new error detected by ruff. ### Motivation and Context Ruff covers many of the pylint rules. Since pylint is not enabled in this repo and runs slow, we remove it from the linters	2023-10-05 21:07:33 -07:00
Justin Chu	d79515041c	[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789 ) Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #16789 Bump ruff to 0.0.278 and fix new lint errors. I added noqa to all existing RUF012 errors which requires mutable class variables to be annotated with `ClassVar`, as well as all PERF issues. Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-07-21 12:53:41 -07:00
Xavier Dupré	b508c7236f	Replace call to deprecated torch.norm (#16758 ) ### Description torch.norm is deprecated as mentioned in issue #16751. This PR replaces the call to torch.norm by the options suggested by torch documentation.	2023-07-20 19:52:19 -07:00
jingyanwangms	5dcaf70501	Adding this set_to_none flag to zero_grad to have signature parity with pytorch Adam (#16375 ) ### Description torch.optim Adam zero_grad() signature is zero_grad(set_to_none=True) https://pytorch.org/docs/stable/generated/torch.optim.Adam.html#torch.optim.Adam.zero_grad We set this flag in initialization, similar to deepspeed: https://deepspeed.readthedocs.io/en/latest/optimizers.html#deepspeed.ops.adam.FusedAdam Adding this flag to have signature parity with pytorch Adam ### Motivation and Context Easier model integration Co-authored-by: Jingyan Wang <jingywa@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-06-19 17:27:41 -07:00
Rui Ren	db6a9bc033	support latest deepspeed version for optim (#15682 ) ### Description <!-- Describe your changes. --> support the latest deepspeed 0.9.1 for the next release ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> This will avoid the warn message `Skip modifying optimizer because of unsupported DeepSpeed version` --------- Co-authored-by: ruiren <ruiren@microsoft.com>	2023-04-25 20:12:23 -07:00
Justin Chu	a36caba073	Bump ruff in CI (#15533 ) ### Description Bump ruff version in CI and fixed new lint errors. - This change enables the flake8-implicit-str-concat rules which helps detect unintended string concatenations: https://beta.ruff.rs/docs/rules/#flake8-implicit-str-concat-isc - Update gitignore to include common python files that we want to exclude. ### Motivation and Context Code quality	2023-04-17 10:11:44 -07:00
Rui Ren	5e2f46df2b	update deepspeed version 0.8.3 (#15415 ) ### Description <!-- Describe your changes. --> Update the support deepspeed to 0.8.3 as it's the latest version ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> This will fix the error of `Skip modifying optimizer because of unsupported DeepSpeed version` Co-authored-by: ruiren <ruiren@microsoft.com>	2023-04-07 17:59:50 -07:00
Justin Chu	938e2136c6	Enable pylint and numpy rules (#15218 ) ### Description Enable pylint and numpy rules ### Motivation and Context Modernize numpy usage and enable more quality checks	2023-03-27 20:37:53 -07:00
Justin Chu	d834ec895a	Adopt linrtunner as the linting tool - take 2 (#15085 ) ### Description `lintrunner` is a linter runner successfully used by pytorch, onnx and onnx-script. It provides a uniform experience running linters locally and in CI. It supports all major dev systems: Windows, Linux and MacOs. The checks are enforced by the `Python format` workflow. This PR adopts `lintrunner` to onnxruntime and fixed ~2000 flake8 errors in Python code. `lintrunner` now runs all required python lints including `ruff`(replacing `flake8`), `black` and `isort`. Future lints like `clang-format` can be added. Most errors are auto-fixed by `ruff` and the fixes should be considered robust. Lints that are more complicated to fix are applied `# noqa` for now and should be fixed in follow up PRs. ### Notable changes 1. This PR removed some suboptimal patterns: - `not xxx in` -> `xxx not in` membership checks - bare excepts (`except:` -> `except Exception`) - unused imports The follow up PR will remove: - `import *` - mutable values as default in function definitions (`def func(a=[])`) - more unused imports - unused local variables 2. Use `ruff` to replace `flake8`. `ruff` is much (40x) faster than flake8 and is more robust. We are using it successfully in onnx and onnx-script. It also supports auto-fixing many flake8 errors. 3. Removed the legacy flake8 ci flow and updated docs. 4. The added workflow supports SARIF code scanning reports on github, example snapshot: ![image](https://user-images.githubusercontent.com/11205048/212598953-d60ce8a9-f242-4fa8-8674-8696b704604a.png) 5. Removed `onnxruntime-python-checks-ci-pipeline` as redundant ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Unified linting experience in CI and local. Replacing https://github.com/microsoft/onnxruntime/pull/14306 --------- Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-03-24 15:29:03 -07:00
Abhishek Jindal	3d388a1aea	change deepspeed version in warning from 0.7.3 to 0.8.0 (#14527 ) ### Description change deepspeed version in warning from 0.7.3 to 0.8.0 ### Motivation and Context The version was updated for Deepspeed support in ORT from 0.7.3 to 0.8.0 but wasn't updated in the warnings message and this PR is to fix that.	2023-02-01 12:00:43 -08:00
Abhishek Jindal	6fa4555a06	Including support for Deepspeed 0.8.0 (#14506 ) ### Description Including Support for Deepspeed 0.8.0. ### Motivation and Context Deepspeed 0.8.0 has a bug fix and mlfow integration.	2023-02-01 06:19:41 -08:00
Vincent Wang	6fb70a82df	[ORTModule] Update Supported DeepSpeed Version for FP16_Optimizer (#13305 ) Update supported deepspeed highest version from 0.7.1 to 0.7.3 for FP16_Optimizer. Also add version info to warning log.	2022-10-13 13:03:01 +08:00
pengwa	a0c25e5c2f	Fix segment fault for alltoall (#12701 ) * fix segment fault * formatting	2022-08-30 11:27:14 +08:00
Vincent Wang	53ecb9e635	Update Supporting DS Version to 0.7.1 for ORTModule (#12696 ) update ds version support for fp16_optimizer	2022-08-24 14:56:12 +08:00
Vincent Wang	a078c8d99b	Update Supporting Deepspeed Version of ORTModule's FP16_Optimizer (#12668 )	2022-08-22 22:22:53 +08:00
Vincent Wang	a7eb9fe3ac	Remove Apex Dependency For Deepspeed FP16_Optimizer (#12077 ) * remove apex dependency * fix amd build	2022-07-14 11:15:53 +08:00
Vincent Wang	04f7c2deda	FP16_Optimizer Support for more Deepspeed Versions (#12046 ) * fp16_optimizer for more ds versions * change ds version * bugfix * fix bug	2022-06-30 18:36:17 +08:00
zhijxu	9f260fb60f	resolve comments	2022-06-30 11:26:13 +08:00
zhijxu	100aebbd26	resolve comments	2022-06-30 11:26:13 +08:00
zhijxu	2295b24cd5	support optimizer opt for deepspeed 0.5.9	2022-06-30 11:26:13 +08:00
Justin Chu	fdce4fa6af	Format all python files under onnxruntime with black and isort (#11324 ) Description: Format all python files under onnxruntime with black and isort. After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame. #11315, #11316	2022-04-26 09:35:16 -07:00
pengwa	89ef987ab1	Improve NonZero on CUDA/ROCM (#10307 ) * improve NonZero * fix megatron_fp16 optimzier, fix the doc * multi_tensor_applier * resolve comment * fix building warning * fix build error when enabling training and use tensorrt	2022-03-25 07:35:45 +08:00
Baiju Meswani	141606534c	Add support for FusedAdam to be mathematically equivalent to pytorch/AdamW (#10106 )	2022-01-21 13:37:59 -08:00
pengwa	b125446f9c	Optimize python overhead of APEX amp (#9447 ) * optimize python overhead of _post_amp_backward * overwrite apex amp's zero_grad for faster implementation * move unscale_fp16_grads_into_fp32_grads into C++ impl * improve the efficiency furthur, reducing 3.5ms to 1.7ms for unilm. * unilm 1.7ms to 338us: 1). optimize python list <==> std::vector copy, 2). launch the kernels as long as num_elem reach thresh hold. This help reduce the CUDA idel time. * refine the logic a bit after validating Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2021-10-26 13:13:49 +08:00
ashbhandare	0270ff7951	Minor import fix (#9538 )	2021-10-25 21:29:31 -07:00
baijumeswani	5da4e07daa	Make FusedAdam mathematically equivalent to Transformers AdamW (#9343 )	2021-10-18 16:03:18 -07:00
pengwa	f05c285a58	Exception when duplicated autograd.Function name detected (#9351 ) * Exception when duplicated autograd.Function name detected * reorder a bit for a bittle bit better perf * fix a bug in previous PR :( * correct the error message a bit	2021-10-15 12:23:13 +08:00
pengwa	5ee47e3ffa	legacy_megatron-lm/deepspeed_ZERO1&2 FP16_Optimizer wrapper (#9184 ) * megatron-lm FP16_Optimizer Wrap, allow model parallelism aggregation optional * add deepspeed zero1 and zero2 - checkoverflow & clip norm * re-structure code and add the copyright * update the document * refine the code after validation	2021-10-14 09:01:23 +08:00
baijumeswani	bcdb411c8d	Implement FusedAdam for ORT adapted from DeepSpeed (#9266 )	2021-10-05 20:50:34 -07:00
pengwa	453431f7bb	Add max_norm for gradient clipping. (#6289 ) * add max_norm as user option for gradient clipping * add adam and lamb test cases for clip norm * add frontend tests	2021-01-21 01:01:11 +08:00
Thiago Crepaldi	6594d6672f	Move onnxruntime.experiment to onnxruntime.training namespace (#5045 )	2020-09-09 09:46:06 -07:00

32 commits