onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-11 17:48:34 +00:00

Author	SHA1	Message	Date
Edward Chen	215732f74b	Ignore saved runtime optimizations when updating ORT format model <v5. (#13393 ) The old runtime optimization format is not readily convertible to the new one without extra information for translating kernel def hashes. Ignore such saved runtime optimizations and output a warning for now.	2022-11-08 13:36:46 -08:00
Peter Salas	b383312f4c	[tvm] Add support for int8 models, update TVM revision (#13519 ) ### Description In the TVM EP, this adds more entries to the conversion from `ONNXTensorElementDataType` to `DLDataType`. Additionally, it removes an unused function and updates the TVM revision to allow running models from recent revisions of TVM. ### Motivation and Context In the TVM EP, the mapping from `ONNXTensorElementDataType` to `DLDataType` was incomplete and neglected several integer types (in particular `ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8` and `ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8`) which prevented some models from running. Co-authored-by: Peter Salas <psalas@octoml.ai>	2022-11-08 11:28:32 -08:00
Edward Chen	9e65f3bfdb	Replace deprecated Python dependency sklearn with scikit-learn. (#13585 )	2022-11-08 09:08:29 -08:00
Changming Sun	efcbdac58e	Remove the cmake option: onnxruntime_DEV_MODE (#13573 ) 1. Remove the cmake option onnxruntime_DEV_MODE and replace it with "--compile-no-warning-as-error" 2. Suppress some GSL warnings because now we treat nvcc diag warnings as errors	2022-11-07 09:06:28 -08:00
Changming Sun	6201593f24	Remove the dependency on CentOS EPEL (#13567 ) ### Description The yum repo is called: ["Extra Packages for Enterprise Linux (EPEL)"](https://docs.fedoraproject.org/en-US/epel/#what_is_extra_packages_for_enterprise_linux_or_epel) . It is provided by the Fedora community for RHEL/CentOS/... Linux distros. However, we do not really need it. ### Motivation and Context To minimize the number of dependencies. Also, the command "yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm" often fails because the website is often not responding.	2022-11-06 21:28:16 -08:00
Changming Sun	23da468154	Upgrade cmake version to 3.24 (#13569 ) ### Description Upgrade cmake version to 3.24 because I need to use a new feature that is only provided in that version and later. Starting from cmake 3.24, the [FetchContent](https://cmake.org/cmake/help/latest/module/FetchContent.html#module:FetchContent) module and the [find_package()](https://cmake.org/cmake/help/latest/command/find_package.html#command:find_package) command now support integration capabilities, which means calls to "FetchContent" can be implicitly redirected to "find_package", and vice versa. Users can use a cmake variable to control the behavior. So, we don't need to provide such a build option. We can delete our "onnxruntime_PREFER_SYSTEM_LIB" build option and let cmake handle it. This also makes things easier for those who want to use vcpkg. ### Motivation and Context Provide a unified package management method, and get aligned with the community. This change is split from #13523 for easier review.	2022-11-04 22:58:51 -07:00
yf711	8b9065a396	Add getter/setter of C# OrtEnv log level (#13402 ) ### Description * Add getter/setter to access and update C# OrtEnv log level * Add C API about updating ort env with custom log level to support the setter above (Following [pybind implementation](`952c99304a/onnxruntime/python/onnxruntime_pybind_state.cc (L923-L924)`)) * Add test case to verify getter & setter ### Motivation and Context * For C++/Python, the log level can be adjusted via OrtEnv, and this feature is missing in C# binding	2022-11-04 21:46:00 -07:00
George Nash	0296bc74c1	oneDNN ep bf16 enabling (#13484 ) ### Description This adds bfloat16 support to the oneDNN ep. When using the oneDNN ep this enables bfloat16 support for the following ops: Exp, Sigmoid, Tanh, Relu, MatMul, Gelu, BiasGelu, Add, Sub, Mul, Div, Sqrt, Pow, ReduceMean, Abs, Cast, Equal, FastGelu, FusedMatMul, Gemm, Greater, GreaterOrEqual, LeakyRelu, Less, LessOrEqual, LRN, ReduceOps, Reshape, Squeeze, Transpose, and Unsqueeze. LayerNorm is supported with some internal casting. BatchNorm only enables BFloat16 for input and output; scale and bias still need fp32 input. Added bfloat16 unit tests for all of the operators in question. When possible we reused the already existing unit tests that were added by the CUDA and ROCM eps. In many of the unit tests an unusual pattern will be seen: #if defined(USE_DNNL) TEST(Test, bfloat16_test) { #if defined(USE_DNNL) // oneDNN ep specific code #endif //test code } #endif Although it looks unusual, this was done on purpose: if another ep implements bfloat16 support for that operator, it can enable the unit test by adding its execution provider to the first line without needing to edit inside the test. Example: `#if defined(USE_CUDA) || defined(USE_DNNL)`; see the MatMul_float16 test in matmul_test.cc for an example of how this is useful. Additionally, two new ISA checks (AVX512_BF16 and AMX-BF16) were added to the cpuid_info code. This was important for detecting whether bfloat16 operations are supported by the CPU. ### Motivation and Context This expands the capabilities of the oneDNN execution provider to support models containing bfloat16 operations. Signed-off-by: George Nash <george.nash@intel.com> Signed-off-by: Ruihan-Yin <ruihan.yin@intel.com>	2022-11-04 18:25:09 -07:00
Edward Chen	4401f50c5e	Change GSL download to use HTTPS URL. (#13563 )	2022-11-04 18:01:18 -07:00
pengwa	ab9ac2acc4	Add guidelines for ORTModule (#13553 ) ### Add guidelines for ORTModule As title. Feel free to let me know if I missed something.	2022-11-04 19:42:10 +08:00
Changming Sun	433f262dd5	Disable some tests for 32-bit Windows (#13551 ) ### Description The failed tests are: ``` [ FAILED ] ModelTests/ModelTest.Run/cpu__models_zoo_opset7_ResNet101_DUC_HDC_ResNet101DUC7, where GetParam() = L"cpu_..\\models\\zoo\\opset7\\ResNet101_DUC_HDC\\ResNet101-DUC-7.onnx" [ FAILED ] ModelTests/ModelTest.Run/cpu__models_zoo_opset12_ResNet101_DUC_HDC12_ResNet101DUC12, where GetParam() = L"cpu_..\\models\\zoo\\opset12\\ResNet101_DUC_HDC-12\\ResNet101-DUC-12.onnx" [ FAILED ] ModelTests/ModelTest.Run/cpu__models_zoo_opset11_FCN_ResNet101_model, where GetParam() = L"cpu_..\\models\\zoo\\opset11\\FCN ResNet-101\\model.onnx" [ FAILED ] ModelTests/ModelTest.Run/cpu__models_zoo_opset10_SSD_model, where GetParam() = L"cpu_..\\models\\zoo\\opset10\\SSD\\model.onnx" ``` They are unstable. Sometimes they fail with error "Message: bad allocation". Sample job: https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=797861&view=logs&j=cceb3ef3-4a22-5fef-c5e9-ef6abe6579ed&t=fa89271b-d780-55e6-8822-71317e62ce21	2022-11-03 20:34:03 -07:00
zhijiang	1977b7ed6a	Fix pythonop training_mode in evaluation mode (#13514 ) Customer reported this issue: they see many warnings when doing the evaluation using ORTModule. ![image](https://user-images.githubusercontent.com/10530022/199371757-5fed7d05-a951-4f1b-8f88-049c5ab89886.png) After investigation, we found that `training_mode` is exported with the wrong value in evaluation mode: its value should be 0, but we found it is 1. Fix: fix pythonop training mode. If training_mode's type is torch._C._onnx.TrainingMode, then no matter whether it is EVAL or TRAINING, "if training_mode" will always be true.	2022-11-04 08:47:01 +08:00
Ye Wang	df796bbb62	cast logits to half when T=MLFloat16 (#13454 )	2022-11-03 16:40:19 -07:00
Edward Chen	b4a1ae8350	Use narrow instead of gsl::narrow. (#13555 )	2022-11-03 16:24:11 -07:00
cloudhan	2de883c592	Update CK and fix performance issue on dev machine (#13531 ) 1. Update CK to its latest develop branch 2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's performance, ensure it is properly configured. - The flags are propagated from target `hip-lang::device`'s `INTERFACE_COMPILE_OPTIONS`, we must not manually add the flags. - Instead, we must ensure this target is properly configured by checking _CMAKE_HIP_DEVICE_RUNTIME_TARGET is set. TL;DR: `hip-lang::device` will sometimes not be properly configured if our `CMAKE_PREFIX_PATH` is not configured carefully. In the CI docker, the configuration is in a good state, but on a dev machine it is not, which then silently results in poor performance for kernels. We fixed it in this PR and added a guard to avoid unsuccessful future editing and to prevent a convoluted debugging process. `_CMAKE_HIP_DEVICE_RUNTIME_TARGET` is shared in `/opt/rocm/lib/cmake/hip-lang/hip-lang-config.cmake` and it is internal to [CMake](https://gitlab.kitware.com/cmake/cmake/-/merge_requests/6121/diffs), the variable name will not be changed in the foreseeable future.	2022-11-03 19:32:30 +08:00
Yi Zhang	7c3a23c186	extend some timeout value (#13552 ) ### Motivation and Context these workflows are prone to timeout.	2022-11-03 15:11:41 +08:00
pengwa	a3e7da60e7	Trade subgraph recompute for memory (#12852 ) Description: Subgraph-level recompute This PR adds an optional capability trading additional re-computation for better memory efficiency. Specifically, a pre-defined operator list is used to iterate the Graph to find some subgraphs for recompute, to reduce some stashed activations whose lifetime spans the forward and backward passes. When training with ORTModule, by default, the graph transformer will scan the execution graph to find all eligible subgraphs to recompute, along with the sizes that can be saved. An example looks like below. If we want to enable some of them to recompute, we can define the env variable this way: `export ORTMODULE_ENABLE_MEMORY_ALLEVIATION="Mul+FusedMatMul+Cast+Unsqueeze+Unsqueeze+Cast+Sub+Mul+Add+BiasSoftmaxDropout+Cast+:1:-1,BiasGelu+:1:-1,BitmaskDropout+Cast+:1:-1,FusedMatMul+:1:-1,Cast+:1:-1,Mul+Add+:1:-1,Mul+Sub+:1:-1"` ``` [1,0]<stderr>:2022-10-12 14:47:39.302954530 [W:onnxruntime:, memory_alleviation.cc:595 PrintSummary] [1,0]<stderr>:MemoryAlleviation Summary: [1,0]<stderr>: User config: [1,0]<stderr>: Mul+FusedMatMul+Cast+Unsqueeze+Unsqueeze+Cast+Sub+Mul+Add+BiasSoftmaxDropout+Cast+:1,BiasGelu+:1,BitmaskDropout+Cast+:1,FusedMatMul+:1,Cast+:1,Mul+Add+:1,Mul+Sub+:1 [1,0]<stderr>: ================================= [1,0]<stderr>: Subgraph: BitmaskDropout+ [1,0]<stderr>: AlleviationType: Disabled [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:input_ids_dim0 x 1024 x Frequency:1 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: Subgraph: BiasGelu+ [1,0]<stderr>: AlleviationType: Recompute [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:input_ids_dim0 x input_ids_dim1 x 4096 x Frequency:24 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: Subgraph: Reshape[1,0]<stderr>:+ [1,0]<stderr>: AlleviationType: Disabled [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:labels_dim0 x Frequency:1 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: 
Subgraph: Unsqueeze+Unsqueeze+Cast+Sub+Mul+Mul+FusedMatMul+Cast+Add+BiasSoftmaxDropout+Cast+ [1,0]<stderr>: AlleviationType: Disabled [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:input_ids_dim0 x 16 x input_ids_dim1 x input_ids_dim1 x Frequency:23 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: Subgraph: Mul+FusedMatMul+Cast+Unsqueeze+Unsqueeze+Cast+Sub+Mul+Add+BiasSoftmaxDropout+Cast+ [1,0]<stderr>: AlleviationType: Recompute [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:input_ids_dim0 x 16 x input_ids_dim1 x input_ids_dim1 x Frequency:1 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: Subgraph: Mul+Add+ [1,0]<stderr>: AlleviationType: Recompute [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:input_ids_dim0 x 16 x input_ids_dim1 x 1 x Frequency:24 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: Subgraph: FusedMatMul+Cast+Add+Reshape+Cast+ [1,0]<stderr>: AlleviationType: Disabled [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:input_ids_dim0 x 16 x input_ids_dim1 x 2 x 4 x Frequency:24 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: Subgraph: Mul+Sub+ [1,0]<stderr>: AlleviationType: Recompute [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:input_ids_dim0 x 16 x input_ids_dim1 x 1 x Frequency:24 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: Subgraph: Cast+ [1,0]<stderr>: AlleviationType: Recompute [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:1024 x 1024 x Frequency:97 [1,0]<stderr>: PatternShape:3 x 1024 x Frequency:1 [1,0]<stderr>: PatternShape:8 x 64 x Frequency:24 [1,0]<stderr>: PatternShape:1024 x 4096 x Frequency:24 [1,0]<stderr>: PatternShape:4096 x Frequency:24 [1,0]<stderr>: PatternShape:4096 x 1024 x Frequency:24 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: Subgraph: FusedMatMul+ [1,0]<stderr>: AlleviationType: Recompute [1,0]<stderr>: Patterns: [1,0]<stderr>: PatternShape:input_ids_dim0 x input_ids_dim1 x 4096 x 
Frequency:24 [1,0]<stderr>: -------------------------------- [1,0]<stderr>: ================================= ``` "Type config:" indicates whether recompute is enabled by users: 0 - disable, 1 - enable. "Subgraph" means what kind of subgraph will be recomputed; in this case, it is a single node "Gelu", and it will be "Recompute". "Shape && Frequency" means that, for this recompute, one tensor of size (batch size, 500) will be saved because it will be recomputed. Baseline: On a 1P model (DEBERTA V2), sequence length 256, training with 16 A100 GPUs. With the latest main branch, we can run batch size 16, and the maximum batch size is < 32, so 16 is usually chosen by data scientists. 65% of 40GB memory is used during training. SamplesPerSec=479.2543353561354. ![image](https://user-images.githubusercontent.com/10530022/188320941-13dde5e7-c32b-4399-a64b-6803fbb9dcda.png) With this PR: Gelu is recomputed to reduce peak memory, so batch size 32 can be run. 97% of the 40GB A100 memory is used, and SamplesPerSec=562.041593991271 (1.17X of baseline). ![image](https://user-images.githubusercontent.com/10530022/188321081-f64811bf-9637-4873-8095-349de8d498cc.png)	2022-11-03 13:49:41 +08:00
George Nash	77be22f379	[oneDNN ep] Update from oneDNN v2.7.0 to oneDNN v2.7.1 (#13536 ) The oneDNN 2.7.1 release includes multiple functional and performance improvements. Signed-off-by: George Nash <george.nash@intel.com> ### Description Update the oneDNN library from 2.7.0 to 2.7.1. This contains multiple functional and performance improvements. ### Motivation and Context This is a minor point release from the oneDNN library that gives performance and functional fixes that were found in the oneDNN 2.7 library shortly after release. Signed-off-by: George Nash <george.nash@intel.com>	2022-11-02 15:57:49 -07:00
Changming Sun	b1e1b25e04	Delete CUB (#13534 ) ### Description Delete CUB ### Motivation and Context Because it is already in CUDA SDK.	2022-11-02 13:06:22 -07:00
Changming Sun	5914a7e0ae	Fix an error in the python packaging pipeline (#13538 ) ### Description It missed a space there. ### Motivation and Context Right now the pipeline is failing because GSL was just converted from a submodule to a cmake external project.	2022-11-02 07:55:20 -07:00
Wei-Sheng Chin	b5904c40dd	Enable ORT in TorchDynamo (#13259 ) This PR enables ORT to execute graphs captured by TorchDynamo. Major compilation code is in `OrtBackend.compile` in ort_backend.py. `register_backend.py` is for plugging `OrtBackend` into TorchDynamo as a compiler.	2022-11-01 11:19:29 -07:00
PeixuanZuo	6740528b98	[ROCm] Fix bug for rocm ep build using MS GSL 4.0.0 (#13525 )	2022-11-01 13:05:55 +08:00
PeixuanZuo	c8886c5b4c	Revert "Update CK and fix performance due to lacking -amdgpu-early-inline-all=true (#13493 )" This reverts commit `4dd053cc15`.	2022-11-01 13:05:55 +08:00
Baiju Meswani	c557a55816	Fix on-device training ExportModelForInferencing api (#13510 )	2022-10-31 21:29:06 -07:00
Vincent Wang	17f0ffd1c8	Support More Cases in NoOpElimination (#13460 ) Current NoOpElimination can support only Add node. This PR adds support for: x-0, x*1, 1*x and x/1 besides x+0 and 0+x. With this PR, all Div(x,1) and their gradients (also Div(x,1)) in Huggingface's diffusers model can be removed, which previously took ~1% of total compute time.	2022-11-01 10:39:52 +08:00
Patrice Vignola	3d0db47c17	[DML EP] Fix variable shadowing in EinSum (#13520 ) ### Description Fix variable shadowing in the DML EP's implementation of EinSum ### Motivation and Context An SDL bug was opened because of shadowing of the variable `i` in a nested loop of the EinSum operator.	2022-10-31 19:27:43 -07:00
Patrice Vignola	74f905b237	DML EP enable the provider in the op tests (#13441 ) ### Description Enables the DML provider in the op tests to allow for better CI coverage. ### Motivation and Context Some of the CI tests for DML were actually running on the CPU because there was no default DML provider, so it was returning a `nullptr`. This should add better coverage, and it already uncovered some failures and asserts hitting in a few tests, which need to be investigated separately.	2022-10-31 15:49:03 -07:00
Adrian Lizarraga	9d867a07c0	Fix regression in CustomOpApi::GetTensorData (#13450 ) - Reverts change to CustomOpApi::GetTensorData introduced by commit `5dae0c477d`, which causes infinite recursion. - Moves EndsProfilingAllocated to non-const session implementation (C++ API header).	2022-10-31 12:20:49 -07:00
Edward Chen	2ecd1d6622	Switch GSL to MS GSL 4.0.0 (#13416 )	2022-10-29 04:15:20 -07:00
Edward Chen	7fbfbf789f	Increase timeout for binary-size-checks-pipeline. (#13498 )	2022-10-28 23:15:56 -07:00
zhangyaobit	33b8778a46	Minor improvement for the documentation of kernel explorer (#13490 ) ### Description Fix the input shape of FastGelu Minor improvement for the documentation of kernel explorer	2022-10-28 22:57:53 -07:00
Fei Hu	943e156f4c	Allow custom ops to set input memory type (#10879 )	2022-10-28 21:45:26 -07:00
Hector Li	1b494daffa	Add yml file for Snpe EP build (#13494 ) Add yml file for Snpe EP build	2022-10-28 19:47:50 -07:00
Changming Sun	689e524c58	Move DML packaging pipelines to aiinfra-dml-winbuild machine pool (#13487 ) 1. Move DML packaging pipelines to aiinfra-dml-winbuild machine pool 2. Delete tools/ci_build/github/azure-pipelines/templates/windowsai-nuget-build.yml because the pipeline has been migrated to Onebranch. I monitored it for months, it worked well.	2022-10-28 10:30:16 -07:00
Numfor Tiapo	49e5a11ccd	Fix SDL and Prefast Errors (#13465 ) Fixes Errors 1978844, 1978870, 1978850, 1978855, and 9245 Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>	2022-10-28 09:41:18 -07:00
zhangyaobit	0a524cfe1c	Fix the input shape of FastGelu (#13488 ) ### Description Fix the input shape of FastGelu	2022-10-28 09:36:31 -07:00
cloudhan	4dd053cc15	Update CK and fix performance due to lacking -amdgpu-early-inline-all=true (#13493 ) 1. Update CK to its latest develop branch 2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's performance, add it.	2022-10-28 09:36:00 -07:00
Vincent Wang	8b0669bf63	QuickGelu Fusion (#12417 ) Some models have QuickGelu(x)=x*sigmoid(1.702x), which has 3 Ops for forward and 5 Ops for backward. The PR is to fuse this to a single Op named QuickGelu and its gradient QuickGeluGrad. For CUDA, tested in V100 using input tensor with shape [64,128,2048] and float16 type: Before, FW takes 335us, BW takes 614us ![image](https://user-images.githubusercontent.com/11661208/182291335-15188709-ffe7-44d1-9d14-0b544cbe5e55.png) After, FW takes 115us, BW takes 139us, which is much faster. ![image](https://user-images.githubusercontent.com/11661208/182291502-f0b5161c-b95c-45fc-90f8-ad0c592d2433.png) For CPU kernel, using same shape and float type: Before, FW takes 10us, BW takes 49us Mul: 3480[µs] Sigmoid: 1996[µs] Mul: 4789[µs] Mul: 4642[µs] Mul: 4195[µs] SigmoidGrad: 18328[µs] Mul: 2988[µs] Sum: 18576[µs] After, FW takes 4us, BW takes 5us, which is also much faster. QuickGelu: 3939[µs] QuickGeluGrad: 5089[µs] Co-authored-by: Vincent Wang <weicwang@microsoft.com>	2022-10-28 18:12:07 +08:00
JiCheng	20c3c35c33	[XNNPACK] support building xnnpack EP for IOS (#13461 ) ### Description support building xnnpack for IOS	2022-10-28 15:03:04 +08:00
Changming Sun	07271b6c8a	Update docs/OperatorKernels.md (#13485 )	2022-10-27 20:11:49 -07:00
Jian Chen	f9378c5cca	Cjian/c4244 round 2 (#13473 ) ### Description Round 2 of fixing C4244	2022-10-27 18:50:26 -04:00
Changming Sun	4a20c0d98b	Delete zlib.cmake (#13467 ) Delete the file because it is not included by any other file.	2022-10-27 15:36:04 -07:00
Yi Zhang	67074851a3	Skip failed models on training ci and openvino ci (#13477 )	2022-10-27 15:22:47 -07:00
Changming Sun	35659d9021	Increase the timeout value for linux-gpu-tensorrt-ci-pipeline.yml (#13481 ) Now it takes about 55-60 minutes. It is on the edge so it often fails.	2022-10-27 14:26:22 -07:00
Scott McKay	ab71c4bbc0	Document generation CI is broken (#13308 ) ### Description Fix document generation CI. It's not currently updating the docs as we're skipping the tests, which is the invocation of build.py that would have generated the documentation. Setup specific task to generate documentation for greater clarity. ### Motivation and Context Operator kernel documentation is not getting updated and is now out of date.	2022-10-28 07:20:48 +10:00
Patrice Vignola	0b29f64dba	[DML EP] Enable all datatypes for Abs and Sign (#13470 ) ### Description Enables all datatypes supported for DML for `Abs` and `Sign`. ### Motivation and Context `Abs` and `Sign` haven't been updated since DML started to support all datatypes for them. These ops are used in some transformer models and were forcing unnecessary copies between the CPU and the GPU.	2022-10-27 11:36:11 -07:00
Dmitri Smirnov	0e2087acff	Add extension method to compensate for Contains() absence (#13466 ) ### Description The targeted framework does not contain `Contains(string, orginal)`. Add extension method to compensate in following the suggestion [here](https://learn.microsoft.com/en-us/dotnet/api/system.string.contains?view=net-7.0). ### Motivation and Context Packaging pipeline fails.	2022-10-27 10:00:47 -07:00
Baiju Meswani	a46c599a40	Training API to export the eval model to an inference model (#13345 )	2022-10-27 09:34:01 -07:00
Jian Chen	8827c4bdbc	First round of fixes. (#13452 ) ### Description First round of fixes for C4244 error.	2022-10-26 23:05:45 -04:00
Edward Chen	601b74b904	Add '$schema' entry to cgmanifest.json files. (#13444 )	2022-10-26 16:15:05 -07:00
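
The QuickGelu fusion in commit 8b0669bf63 (#12417) rests on the formula QuickGelu(x) = x*sigmoid(1.702x). The following is a minimal scalar sketch of that formula and of its derivative, which is worked out here by the product rule and is not taken from the commit. The function names `quick_gelu` and `quick_gelu_grad` are hypothetical and this is not the ONNX Runtime kernel code; it only illustrates why the forward and backward passes can each be computed as a single step.

```python
import math

ALPHA = 1.702  # scaling constant quoted in the commit message


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def quick_gelu(x: float) -> float:
    """Forward pass: x * sigmoid(ALPHA * x), computed in one step."""
    return x * sigmoid(ALPHA * x)


def quick_gelu_grad(dy: float, x: float) -> float:
    """Backward pass derived by the product rule (an assumption of this sketch).

    d/dx [x * s(ax)] = s(ax) + a * x * s(ax) * (1 - s(ax))
    """
    s = sigmoid(ALPHA * x)
    return dy * (s + ALPHA * x * s * (1.0 - s))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.5):
        print(x, quick_gelu(x), quick_gelu_grad(1.0, x))
```

Comparing `quick_gelu_grad` against a central finite difference of `quick_gelu` is a quick way to check the derivation.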

7662 commits