1. Fix the nuget cpu pipeline and put code coverage pipeline back.
2. Reduce onnx_test_runner's default logging level from WARNING to ERROR because there are too many log messages now.
3. Enlarge the protobuf read buffer size for onnx_test_runner. This was missed in PR #4020.
- Add support for ENABLE_LANGUAGE_INTEROP_OPS in training build which is enabled for nightly builds
- Fix passing of environment variables to `sudo docker run` in build definitions
- Fix setup.py package naming logic
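The environment variable fix can be illustrated with a small sketch. `sudo` typically resets the caller's environment, so variables generally have to be forwarded to the container explicitly with `docker run -e NAME=value`. The helper name `docker_run_command`, the image name, and the variable names below are hypothetical and are not taken from the build definitions.

```python
import shlex


def docker_run_command(image, env, command):
    """Build a `sudo docker run` argument list that forwards env explicitly.

    Hypothetical helper: each variable becomes its own `-e NAME=value` pair,
    so nothing depends on sudo preserving the calling environment.
    """
    args = ["sudo", "docker", "run", "--rm"]
    for name, value in sorted(env.items()):
        args += ["-e", f"{name}={value}"]
    args.append(image)
    args += list(command)
    return args


# Illustrative values only.
cmd = docker_run_command(
    "example/image:latest",
    {"NIGHTLY_BUILD": "1", "BUILD_CONFIG": "Release"},
    ["python3", "build.py"],
)
print(shlex.join(cmd))
```

Passing an argument list rather than a single shell string avoids quoting problems when a value contains spaces.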
The default branch for the spdlog repository on GitHub recently changed from "master"
to "v1.x", which has a different API for `syslog_sink::syslog_sink()`. This breaks
builds of the server for anyone who has checked out the submodules since that change.
Fixes #4077.
Co-authored-by: Derek Murray <demurra@microsoft.com>
* Add flake8 to Win CI build so it's re-enabled. It was in the static analysis build that is currently disabled so checks are not running.
Fix build.py to be compliant again.
Add prefix to flake8 output so it's (hopefully) easier to identify the errors in build output.
* Add to all builds in Windows CPU CI so they all fail quickly if there's an issue.
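A minimal sketch of the output-prefixing idea, assuming flake8 is run as a subprocess and its output is captured as text. The function name `prefix_lines` and the prefix string are hypothetical; the actual change in build.py may look different.

```python
def prefix_lines(output, prefix="flake8: "):
    """Prepend a marker to every non-empty line of captured tool output."""
    return "\n".join(
        prefix + line for line in output.splitlines() if line.strip()
    )


# Sample text in flake8's default "path:line:col: code message" format.
sample = (
    "tools/ci_build/build.py:10:80: E501 line too long\n"
    "\n"
    "tools/ci_build/build.py:42:1: W391 blank line at end of file\n"
)
print(prefix_lines(sample))
```

With a fixed prefix, the lint errors can be found in a long build log with a single text search.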
* Handle edge case where an implicit input for a subgraph may not get wired in correctly.
Conditions required:
- two or more levels of nested subgraph
- an implicit input from above the bottom two levels is used in both levels of subgraph
- this creates a NodeArg for the implicit input at both levels
- something changes to the first level subgraph to no longer use the implicit input
- could be constant folding, or partitioning of nodes that results in a copy of the implicit input being made to a different device
When that occurs we lose the wiring through to the second level of nested subgraph as there's a NodeArg in the first level but the implicit input is no longer used there. Fix that by doing a final check for outer scope values once we know all the outputs produced by the current graph.
Found by commenting out the CUDA implementations of the control flow nodes and running ssd_mobilenet_300 from the mlperf models.
* Add test case.
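The edge case can be sketched with a toy model. This is not the onnxruntime Graph implementation; the `Graph` class and `outer_scope_values` function are hypothetical and only illustrate why the final check has to consider what nested subgraphs need, not just what the current level consumes directly.

```python
class Graph:
    """Toy graph: tracks produced values, directly consumed values, subgraphs."""

    def __init__(self, produced, consumed, subgraphs=()):
        self.produced = set(produced)    # inputs, initializers, node outputs
        self.consumed = set(consumed)    # values used by nodes at this level
        self.subgraphs = list(subgraphs)


def outer_scope_values(graph):
    """Values this graph must receive from an enclosing graph.

    Computed once everything the graph produces is known, and including
    what nested subgraphs need even if this level no longer uses the value.
    """
    needed = set(graph.consumed)
    for sub in graph.subgraphs:
        needed |= outer_scope_values(sub)
    return needed - graph.produced


# "x" is produced in the main graph and used two levels down.
level2 = Graph(produced=["y2"], consumed=["x"])
# After e.g. constant folding, level 1 no longer consumes "x" itself.
level1 = Graph(produced=["y1"], consumed=[], subgraphs=[level2])
main = Graph(produced=["x"], consumed=[], subgraphs=[level1])

print(outer_scope_values(level1))  # level 1 must still pass "x" through
print(outer_scope_values(main))    # nothing unresolved at the top level
```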
* publish mloperatorauthor.h in the nuget
* build dmlep into arm/arm64 builds
* update to not use --use_dml everywhere, but enable custom ops everywhere
* always download directml nuget in winml builds
* always build with dml
* dont build dml for arm
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* fixed seg fault when using concrete shape
disable gradient as output
* fix evaluation hang issue for multiple gpu run
* Remove dead code, ORTModel and improve docstrings (#3814)
* Refine ORTTrainer docstring descriptions (#3907)
Description:
Fix 2 edge cases as described here: #3755 (comment)
Create a NodeArg for subgraph inputs even if they have no type. If they are only used as an implicit input to another level of nested subgraph we will not create a NodeArg via any other path
Allow an If output to have no shape. Obscure edge case where a loop carried dependency to a Loop node is passed through a nested If node subgraph (i.e. the Loop subgraph contains an If node with a nested subgraph for the else_branch/then_branch). We can't infer a shape for a loop carried dependency (they may change across iterations), which means we can't infer a shape for the nested If subgraph output either. We have delayed allocation support for If outputs so use that.
Motivation and Context
#3755
* update benchmark_gpt2 to use past state only
* update dynamic axes of input/output tensors
* Remove --use_openmp option since it is default for onnxruntime 1.3 cpu.
* Use same option names as benchmark.py
This class is already part of the protobuf-lite library. We don't need a copy here.
And if we do, we must ensure the signature of every function is exactly the same as the original. However, the upstream code may change over time. For example, protobuf recently added a "const" modifier to FileInputStream::GetErrno(), which may break the build if a user wants to use the latest protobuf.
* Enable optimizer on models with external data (>2GB)
* Refactoring optimizer: move fusion to separate file
* Update benchmark: (1) output datetime to csv (2) Add option --onnx_dir to benchmark.py for onnx model directory path (3) add gpt2-large (4) loosen thresholds for fp16 validation
* update optimizer (1) Add attribute of ConstantOfShape in fp16 conversion (2) Use OnnxRuntime level 1 optimization
* update bert_perf_test.py: rename --input_ids to --input_ids_name
Add transformer glue test example to show how to use ORTTrainer to fine-tune a transformer model
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Pad: Add support for all datatypes in opset-11 spec
Pad opset-11 implementation supports:
int32, int64, float & double
Per specification, Pad opset-11 also supports:
uint8, uint16, uint32, uint64, int8, int16 & float16
This commit adds support for those types to get full coverage of the Pad opset-11 operator.
* Pad: Remove 16-bit datatypes support
These types are unused at the moment and binary size is impacted. Remove support for those types to lower binary size.
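As a rough illustration of a Pad that is generic over element type, here is a sketch of constant-mode padding for 1-D data using Python's `array` module. It is not the onnxruntime kernel; the function name `pad_constant_1d` is hypothetical, and the typecodes are stand-ins for some of the tensor types listed above (`B` for uint8, `b` for int8, `q` for int64, `Q` for uint64, `f` for float, `d` for double).

```python
from array import array


def pad_constant_1d(data, pad_before, pad_after, value=0):
    """Constant-mode padding of a 1-D typed array; element type is preserved."""
    out = array(data.typecode, [value] * pad_before)
    out.extend(data)
    out.extend(array(data.typecode, [value] * pad_after))
    return out


# The same code path handles every element type.
for typecode in ("B", "b", "q", "Q", "f", "d"):
    padded = pad_constant_1d(array(typecode, [1, 2, 3]), 2, 1)
    print(typecode, padded.tolist())
```

In a compiled kernel each supported type adds an instantiation of the same logic, which is the binary size cost that the second commit trades against coverage.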
Clean up a CUDAExecutionProvider's associated PerThreadContext instances when that CUDAExecutionProvider is destroyed.
Revert workaround (introduced in #3767) to lazily initialize CUDA handles to avoid segmentation fault. For that case, the CUDA handle cleanup was happening quite a bit later than the CUDAExecutionProvider destructor. This should be a cleaner way to fix that.
* online partition
* fix when multiple consumer nodes are in cut info
* fix windows build
* address feedback
* adding test
* feedback
* address feedback
* add parser for cut edge
* windows build
* Add amd migraphx execution provider to onnx runtime
* rename MiGraphX to MIGraphX
* remove unnecessary changes in migraphx_execution_provider.cc
* add migraphx EP to tests
* add input requests of the batchnorm operator
* add support for the ONNX PRelu operator
* update migraphx dockerfile and remove one unused line
* sync submodules with master branch
* fixed a small bug
* fix various bugs to run msft real models correctly
* some code cleanup
* fix python file format
* fixed a code style issue
* add default provider for migraphx execution provider
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
* Optimize for OneHot with zero off value.
* Add test cases for indices out of range.
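The idea behind a zero off value fast path can be sketched as "bulk fill, then scatter": fill the whole output with the off value (which can be a plain zero fill when the off value is zero) and write the on value only at the valid index positions. This is a pure-Python illustration under that assumption, not the kernel from this commit; the function name `one_hot` and the flat-list layout are hypothetical. Per the ONNX OneHot specification, indices outside `[-depth, depth - 1]` produce a row containing only the off value.

```python
def one_hot(indices, depth, on_value=1, off_value=0):
    """One-hot encode a 1-D list of indices into a flat row-major list."""
    # Bulk fill; with off_value == 0 this corresponds to zeroing the buffer.
    out = [off_value] * (len(indices) * depth)
    for row, index in enumerate(indices):
        # Out-of-range indices leave the row as all off_value.
        if -depth <= index < depth:
            # Negative indices count from the end of the row.
            out[row * depth + index % depth] = on_value
    return out


print(one_hot([0, 2, -1, 5], depth=3))
```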
Co-authored-by: Vincent Wang <weicwang@microsoft.com>
Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>