onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-13 18:08:13 +00:00

Author	SHA1	Message	Date
Chen Fu	df4cb6f301	Adding pytorch cpuinfo as dependency (#8178 ) Pytorch cpuinfo library allows us to query current cpu features, micro-architecture and cache size, etc. These information is needed for targeted performance optimizations. Unfortunately it does not work under Windows/ARM. We need to develop our own later	2021-07-12 14:21:12 -07:00
Sheil Kumar	eec8e1394a	Memory map files on windows to speed up model load (#8349 ) * Memory map files on windows to speed up model load * fix custom ops Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2021-07-12 11:52:08 -07:00
Yufeng Li	f6956e0259	Refactor qgemm file (#8322 ) This PR purely extracts each kernel to a standalone file. No functionality change. It includes specifically: leave the MlasGemm function and thread handling in the qgemm.cc put dispatcher functions and the template functions (interfaces) that are required to implement a kernel into qgemm.h put each kernel implementation in a separate file, which implements/specialize template functions: MlasGemmU8X8FixupZeroPointB, MlasGemmU8X8CopyPackA, MlasGemmU8X8CopyPackB, MlasGemmU8X8Kernel determine the files to be compiled in cmake file	2021-07-12 10:13:20 -07:00
KeDengMS	b7c9696ac3	Symbolic_shape_infer fixes (#8280 ) 1. Add support for sequence ops: ConcatFromSequence, SequenceAt, SequenceInsert. There are other sequence ops supported by onnx that worked well after adding these ops, so no need to add all of them in symbolic_shape_infer 2. For If node, the two branches output might have different shapes. In that case, for sequence output, use None in dimension; For tensor output, create a new symbolic dimension. 3. Fix a bug in Tile, where input for repeats might be of unknown value 4. Topological sort of nodes in graph need to consider implicit input in subgraphs for If/Loop/Scan ops 5. Generate unique prefix for new dimensions inside subgraph	2021-07-09 19:14:26 -07:00
Guoyu Wang	10142f9510	Add metadata_props to ORT model (#8340 ) * Add metadata_props to ORT model * Minor update * Update python binding, and increase the minimal pipeline size threshold * Fixed a small bug in serializing ir_version * Remove temp ort.py.fbs and add it to .gitignore	2021-07-09 11:28:27 -07:00
Changming Sun	60641a19e4	Add "/external:templates-" to VC++ flags (#8338 )	2021-07-09 11:23:53 -07:00
Tang, Cheng	e467d78a11	fix a typo (#8334 ) Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-07-09 09:24:43 -07:00
Tang, Cheng	598454bb5f	Fix the mix precision handle for square case (#8333 ) * handle unsqueeze change in opset13 * fix the node arguments index check for square case (x * x) * Revert "fix the node arguments index check for square case (x * x)" This reverts commit c66344f0a82c35d8c24d31f2264cf7e9b235ce22. * handle the square case (x * x) for node argument search Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-07-09 09:24:19 -07:00
Rachel Guo	187743726b	[CoreML EP] Add Int32<->Int64 handling around coreml ep (#8183 ) * initial int32-int64 type handling * initial * clean and fix UT error * modify code comments * address partial pr comments * minor update * address pr comments Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local> Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2021-07-09 09:08:05 -07:00
Hariharan Seshadri	5369821ad6	Support SpaceDepth ops in the CUDA and ROCM EPs (#7960 )	2021-07-09 01:00:22 -07:00
Scott McKay	1b2e1a7e0c	Refactor QDQ optimizers to enable future usage in minimal build (#8191 ) * Add new transformer that can split node selection from node modification to allow just the modifications to be applied at runtime in a minimal build. This is the first step of a few to enable a QDQ model to be optimized for the NNAPI EP and/or the CPU EP at runtime in a mobile scenario. Add generic and QDQ specific helpers for selection and modification. Replace existing QDQ optimizers with optimizer based on new approach.	2021-07-09 16:11:43 +10:00
Hariharan Seshadri	46e5c8d4b9	Cosmetic change in test infrastructure (#8292 )	2021-07-08 21:52:02 -07:00
pengwa	5454af4b95	decouple the shared python dependency (#8294 ) * remove warnining message for non-training build * move to/from dlpack for onnxruntime_python back into python project	2021-07-09 11:47:11 +08:00
Dmitry Yutkin	067759b387	Fix bad URL to huggingface onnx-export example notebook	2021-07-08 15:01:46 -07:00
satyajandhyala	84bc20fe9d	Enable cast propagation with level one by default. (#8286 )	2021-07-08 14:38:09 -07:00
RandySheriffH	f40df30219	Replace functions with secured version for OSX compliance (#7586 ) * replace strlen with strnlen * replace vsnprintf with vsnprintf_l * add macro * switch to std numeric::limits * apply uint16 max * fix build err * fix mac build * define MAX_STR_LEN * define MAX_STR_LEN * fix typo * trim empty lines * apply constexpr * fix typo * add namespace * fix build err * rename global constant Co-authored-by: Randy <Randy@randysmac.attlocal.net> Co-authored-by: Randy Shuai <rashuai@microsoft.com> Co-authored-by: Randy <Randy@randysmac.local>	2021-07-08 11:02:36 -07:00
pengwa	6dbfb8db0e	autograd function fallback perf (#8312 ) * fix known issues * Update orttraining/orttraining/test/python/orttraining_test_ortmodule_autograd.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2021-07-09 00:29:40 +08:00
Edward Chen	c254c3c355	Fix issue with ONNX to ORT format model conversion script when given single model file as input. (#8323 )	2021-07-07 14:08:47 -07:00
baijumeswani	6652d17dcd	Support lists as inputs to ORTModule (#8311 )	2021-07-07 13:04:19 -07:00
Thiago Crepaldi	9a855fe9e7	Make Torch CPP extension build optional for packaging pipelines (#8305 )	2021-07-07 07:24:58 -07:00
Tang, Cheng	d7c3703371	handle unsqueeze change in opset13 (#8308 ) Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-07-06 22:30:24 -07:00
pengwa	2347a0aca8	Autograd Function Fallback bug fix - moe support (#8105 ) * Support forward inputs orders like "Non_tensor/Tensor/Non_tensor". Correspondingly, support "None/Tensor_Grad/None" fpr backward outputs. * Report RuntimeError when PythonOp detected but _enable_custom_autograd_function is enabled. * Fix "PoliCheck ] - Defect : Term "hang", Component : orttraining\orttraining\python\training\ortmodule\__init__.py (1 issue)" * rename call_convention->input_convention, input_tensor_requires_grads->input_requires_grads * fix minor comment * revert polycheck fix in case of conflict * Update orttraining/orttraining/core/graph/training_op_defs.cc Co-authored-by: Tim Harris <tiharr@microsoft.com> * Apply suggestions from code review Refine the schema description Co-authored-by: Tim Harris <tiharr@microsoft.com> * Resolve review comments Co-authored-by: Tim Harris <tiharr@microsoft.com>	2021-07-07 08:58:01 +08:00
Nick Kreeger	40e5279f8f	Drop unused functions from math.h (#8304 ) * Drop unused functions from math.h * fix dnnl_conv.h	2021-07-06 19:18:18 -05:00
Nick Kreeger	62d1458ea8	Move kernel implementations outside of lookup table utility functions. (#8306 )	2021-07-06 18:31:05 -05:00
baijumeswani	090bae21ab	Pinning pillow version to 8.2.0 to circumvent regression introduced by 8.3.0 (#8303 )	2021-07-06 13:02:39 -07:00
Suffian Khan	008c5f7640	Use single builder image across Python versions for ROCm wheels (#8302 ) * first attempt share docker image across python and torch versons * set dependency between jobs * fix yaml grammer * remove python version from first stage * clean deepspeed directroy * split into two images according torch version * fix yaml syntax * invalidate cache * remove DS to prevent torch 1.9.0 upgrade	2021-07-06 11:56:00 -07:00
RandySheriffH	56e4dd1d3e	Fix optimizer crash (#8274 )	2021-07-02 17:19:15 -07:00
Suffian Khan	e71846b029	fix ld_preload for rocm (#8290 )	2021-07-02 17:15:28 -07:00
Suffian Khan	036eee5b66	register softmaxinternal with rocm (#8289 )	2021-07-02 16:29:18 -07:00
Pranav Sharma	969eb545d1	Update issue template to ask users to check known issues to avoid repetition. (#8288 )	2021-07-02 15:36:14 -07:00
Tiago Koji Castro Shibata	0fa9ac3648	Remove path from telemetry strings (#8281 )	2021-07-02 10:49:59 -07:00
Nick Kreeger	552806f3be	Fix lamda function formatting in layer_norm.cc (#8276 )	2021-07-02 12:30:16 -05:00
baijumeswani	2bda2a62fd	Pin version of Pillow to 8.2.0 to circumvent noncompatibility with numpy (#8278 )	2021-07-02 09:05:49 -07:00
Vincent Wang	88ec95ea96	Support OrtMemTypeCPUInput for ATenOp/ATenOpGrad (#8116 )	2021-07-02 23:04:43 +08:00
Edward Chen	b42e7d2c78	Add iOS packaging pipeline (#8264 ) Create a pipeline to produce the iOS package artifacts.	2021-07-02 06:21:59 -07:00
Tang, Cheng	a9a2394fa5	disable computation reduction optimization for non-gpu build (#8251 ) * disable computation reduction optimization for non-gpu build * fix comments in pr * add cpu execution provider * apply the core provider list to computation reduction optimizer * try macro Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-07-01 16:43:51 -07:00
Vincent Wang	9cfe642b34	enable BN training in cpu inference build (#8269 )	2021-07-01 13:15:59 -07:00
Tang, Cheng	996a98b3ac	fix the shared provider test for training build; expose more symbols to non cuda build (#8249 ) * expose more symbols for non cuda build * fix the test execution provider for training build Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-07-01 11:03:02 -07:00
Zuwei Zhao	b46310b349	Integrate onnxruntime-extensions into onnxruntime. (#8143 ) Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>	2021-07-01 09:34:03 -07:00
baijumeswani	f616cd07b4	Provide torch module interface for ORTModule (#8148 ) * Interface for the module manager and implementation of the torch module manager	2021-07-01 09:15:16 -07:00
Vincent Wang	ce9d134952	gather elements optimization (#8154 )	2021-07-01 14:30:00 +08:00
Vincent Wang	ef8f50c4ab	ScatterNDGrad (#8261 )	2021-07-01 13:49:49 +08:00
Thiago Crepaldi	97f1eea2ea	Propagate ROCM version to onnxruntime wheel package (#8247 )	2021-06-30 13:52:22 -07:00
Edward Chen	665ecdf9ce	[CoreML EP] Use partitioning utils in CoreMLExecutionProvider::GetCapability(). (#8179 ) Use partitioning utils in CoreMLExecutionProvider::GetCapability().	2021-06-30 09:57:36 -07:00
Scott McKay	4993680e56	Graph::GetNodeProvidesGraphOutput -> NodeProducesGraphOutput (#8243 ) 'GetNode' is a little confusing as it returns a bool. Update a couple more places where GetNodeOutputsInGraphOutputs was being used unnecessarily.	2021-06-30 20:43:33 +10:00
Scott McKay	b3479367cf	Add helper to check if node provides a graph output. (#8186 ) * Add helper to check if node provides a graph output. The current approach unnecessarily creates a vector when most of the optimizers only care about a true/false response. * Undo accidental change * Fix a couple of issues due to copying from larger set of changes.	2021-06-30 12:15:42 +10:00
Scott McKay	17d4545ccb	Improve readability of Graph::PerformTopologicalSortAndCheckIsAcyclic. (#8187 )	2021-06-30 12:15:17 +10:00
Guoyu Wang	9b19241b27	Disable update database for Android code coverage (#8182 )	2021-06-29 18:50:16 -07:00
Ankur Verma	fa8768723a	Allow custom loaders for testing (#8150 )	2021-06-29 16:54:36 -07:00
Nick Kreeger	507d97b200	Add initializer for embed layer norm unit tests. (#8196 )	2021-06-29 17:57:06 -05:00

5176 commits