onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-12 17:57:38 +00:00

Author	SHA1	Message	Date
Tianlei Wu	403f99cd77	Use yapf to format python (#3276 ) Update ReformatSourcePython.bat to use YAPF to format python code, and add onnxruntime\test directory to be formatted. Add onnxruntime\.style.yapf for configuration. The style is based on google, except max column width 120. Format python scripts using ReformatSourcePython.bat.	2020-03-20 14:34:10 -07:00
Pranav Sharma	84015d9491	Fix post merge test. This doesn't get triggered as part of gated PR checks. (#3277 )	2020-03-20 13:23:09 -07:00
Dmitri Smirnov	b880c48c4c	Make reduction ops handle Scalar input (#3260 ) Handle Scalar values for CPU and GPU Ifdef CUDA and TVM as they require more changes.	2020-03-20 12:04:47 -07:00
Ye Wang	c5149e89d9	Wangye/shortgraindropper (#3273 ) (#3274 ) * Featurizer Library update * update Featurizer Library * add short_grain_dropper_transformer * resolve comments * resolve comments * resolve comments	2020-03-20 11:48:31 -07:00
Tianlei Wu	1d9be2baed	Add Notebook for Bert Model exported by Keras2onnx (#3271 ) * Add notebook for bert squad model exported by python 1.4 * update bert performance test tool: (1) set OpenMP environment variable before importing onnxruntime. (2) launch new process for each test. * Add notebook Reduce combinations in perf test * update readme * fix quote * Allow test multiple batch_size * Add latency percentile * Add warm up run Reset logger for notebook * refine default settings to test for cpu/gpu * Add script to dump machine info * Add notebooks for PyTorch SQuAD model GPU and CPU inference * Update machineinfo.py: add license header; format by yapf * Do not reset log handler. Skip adding handler if existed. * Add comments about GPU result diff. Filter rows of batch set to keep only one setting. * update according to review feedback * Download script from master branch * Add notebook for bert model exported by keras2onnx * format columns in result table * re-run and update notebook	2020-03-20 11:37:25 -07:00
Yufeng Li	a69d859912	fix quantize_bias (#3270 )	2020-03-20 11:36:47 -07:00
Scott McKay	6dc25a60f8	Make the reduction ops more consistent in checking if no transpose is required and skipping the copy of the input data if that is the case. Significantly better performance when this is done (2x faster for model calling ReduceSumSquare with input of {2048,10}). (#3265 )	2020-03-20 06:55:38 +10:00
Changming Sun	8f00147c14	Fix a few warnings	2020-03-19 09:22:28 -07:00
Tiago Koji Castro Shibata	3bdb0b620a	Fix WCOS/Win32 linking bugs (#3126 ) * Fix WCOS/Win32 linking bugs * Remove unused NODEFAULTLIB flags * Avoid plain target_link_libraries signature * Avoid plain target_link_libraries signature * Fix library list escaping * Use library list instead of string * Remove duplicate link to windowsapp.lib * Remove Win32 build workarounds * Specify CMake policies before initializing language * Expose Win32 header definitions during build * Force set API family * Enable Win32 APIs in featurizer * Use MT dynamic CRT * Expose Win32 specific functions * Disable app container globally * Disable default wide functions in featurizers * Add featurizers to test include path * Workaround https://gitlab.kitware.com/cmake/cmake/issues/19428 * Revert pipeline debugging hacks * Skip /FI in CUDA sources * Default to Win32 builds * Enable WCOS when using WinML * Use generator expression to apply CMAKE_MSVC_RUNTIME_LIBRARY to C++ only	2020-03-19 08:52:40 -07:00
Pranav Sharma	435f014d71	Add support for sessions to share a global threadpool. (#3177 ) * Add support for sessions to share a global threadpool. * Fix build issues * Add tests, fix build issues. * Added some documentation * Fix centos issue when threadpools become nullptr due to 1 core. * Fix mac and x86 build issues * Address some PR comments * Disabled test for android, added few more tests and addressed more PR comments. * const_cast	2020-03-18 15:42:46 -07:00
edgchen1	e03b8a1e2f	Move path_lib from onnxruntime/core/framework to onnxruntime/core/platform. (#3253 ) Moved path_lib.h/cc from onnxruntime/core/framework to onnxruntime/core/platform and from the onnxruntime_framework to the onnxruntime_common libraries.	2020-03-18 11:53:46 -07:00
Xiang Zhang	61621d4053	Add extra fields to ORT telemetry (#3234 ) * Add extra fields to ORT telemetry * fix linux build failure caused by using HRESULT * little refactor	2020-03-18 09:37:35 -07:00
Xavier Dupré	bd348ec6ca	Add unit test to cover TreeEnsembleClassifier applied to binary classification and 2 classes (#3230 ) * Add unit test to cover TreeEnsembleClassifier for binary classification	2020-03-18 11:32:58 +01:00
jaka.katrasnik	88c65f8add	Fixes GTest deprecation warnings	2020-03-17 16:38:55 -07:00
Tianlei Wu	0700d13ece	Add Bert Optimization Notebooks (#3204 ) * Add notebooks for GPU and CPU inference of PyTorch BERT SQuAD model * update bert_optimization.py: Do not add duplicated logger handler * Add machineinfo.py to show machine configuration for notebook. * Update bert performance test tool: (1) Set OpenMP environment variable before importing onnxruntime. (2) Use sub-process for each test (3) Allow test multiple batch_size (4) Add latency percentile (5) Add warmup	2020-03-17 11:56:36 -07:00
Faith Xu	8bc4e3195d	Updates to roadmap (#3155 ) * Updates to roadmap * remove redundant directML * Add JS to future investments	2020-03-16 18:19:07 -07:00
Ori Levari	e63f817eb6	avoid IDXGIFactory 6 where possible to enable WinML GPU Path downlevel to RS3 (#3180 )	2020-03-16 15:25:32 -07:00
Xiang Zhang	682dde2b3b	add dml_ep_lock (#3200 ) * add dml_ep_lock * Move Winml process-wide lock back to individual sessions	2020-03-16 14:32:12 -07:00
Xavier Dupré	6319357a99	Reduce number of allocations in TreeEnsemble (#3217 ) * reduce number of allocations in TreeEnsemble * Fix probabilities for binary case. * fix outbound access Co-authored-by: xavier dupré <xavier.dupre@gmail.com>	2020-03-16 12:22:15 +01:00
Changming Sun	0fceb33288	Fix onnxruntime server docker file build failure (#3219 ) 1. Fix onnxruntime server docker file build failure. Tested with the notebook in ONNX tutorial, it works well. 2. Delete the docker files for the other EPs, because currently they don't work and I don't have enough time to update them.	2020-03-15 14:46:46 -07:00
Tracy Sharpe	88c20eaef1	MLAS: rename AVX512BW->AVX512Core (#3216 ) Cleanup change: remap functions and files with Avx512BW to Avx512Core.	2020-03-13 22:45:51 -07:00
Dmitri Smirnov	2a6e5ce978	Speedup and reduce binary size for TfIdfVectorizer (#3197 ) Speed up TfIdf. Build Trie like structure to quickly exclude dead-ends. Use ParallelFor() for each of the rows processing. Make it non-template, batch it. Check for short tail within the inner loop.	2020-03-13 17:00:59 -07:00
Tracy Sharpe	fe0b2b2abd	QLinearConv speed up (#3196 ) For x86/x64 builds, change the QLinearConv op to use MLAS for the u8u8=s32 GEMM, then requantize the intermediate buffer to u8.	2020-03-13 16:54:55 -07:00
Changming Sun	0a1257e467	Adjust the grouping logic in ThreadPool::TryBatchParallelFor (#3207 ) 1. No more plus 1. 2. Use MlasPartitionWork function to calculate the work index range.	2020-03-13 12:49:17 -07:00
Yulong Wang	5bc0d8be5c	Fix TopK Cuda implementation (#3176 ) Fixes a bug in TopK cuda implementation when input size is between GridDim::maxThreadsPerBlock and GridDim::maxThreadsPerBlock * 2. In this case, the BitonicTopK will generate all-zero outputs.	2020-03-13 11:46:17 -07:00
Ori Levari	93569bf0f4	fix regex to populate dll version information correctly	2020-03-13 11:35:49 -07:00
Yufeng Li	c69194ec4c	fix the missing return in _get_quantize_input_nodes and format code with yapf (#3199 ) * fix the missing return for function _get_quantize_input_nodes * format quantization code with yapf	2020-03-13 09:28:41 -07:00
Xavier Dupré	d99554bea1	Improves implementation of tree ensemble regressor and classifier (4 to 5 times faster) (#2692 ) * Improves implementation of tree ensemble regressor (4 to 5 times faster) * Use ORT_THROW	2020-03-13 14:10:37 +01:00
Scott McKay	e9d5ed270f	Normalizer performance improvements (#3201 ) * Simplify Normalizer as the spec only requires support for 2D input. Tried using eigen (LpNorm<1>(), and norm()) on each row but that was much slower. * Remove unused variable	2020-03-13 22:15:44 +10:00
Scott McKay	890cb78b20	Use Eigen::logistic instead of manually computing values. (#3186 ) * Use MlasComputeLogistic instead of manually computing values. * Update test script to allow the tolerance to be specified when checking float output from logreg_iris.onnx.	2020-03-13 20:27:25 +10:00
Hariharan Seshadri	b8575dda7b	Avoid some heap allocations in the InferenceSession and Model classes (#3103 ) * Avoid some heap allocations in the InferenceSession and Model classes	2020-03-12 18:38:10 -07:00
Changming Sun	a02638eb46	Adjust the threading logic in ThreadPool::ParallelFor (#3178 ) 1. Do not reuse the main thread. 2. Do not plus one when mlas calculate the number of tasks to schedule. (It was me put the plus one there) This is the second try of #1839 It's known that this change has negative performance impact on some of the models.	2020-03-12 11:33:33 -07:00
Scott McKay	f49912c42a	Performance improvement to Transpose when moving single axis. (#3173 ) * Avoid use of vectors for tracking reader/writer offsets as it adds too much overhead if there are a lot of readers or writers. Tracy found improvements in resnet34-ssd1200 and BERT Squad with this approach.	2020-03-12 14:49:02 +10:00
Paul McDaniel	6791ed0217	Documentation updates for 1.2 for WinML (#3149 ) * api governance draft * Update CONTRIBUTING.md updated for ABI proposals * Update CONTRIBUTING.md * Update CONTRIBUTING.md * Incomplete, a draft iteration of 2 more changes - api docs and high level design * pushing to see how the picture size works on screen. * added 2 charts on api choice and distribution choice * details on contract checking * lint cleanup and links * PR feedback. * fixed markdown and lists * more markdown and lists * fixed broken links * PR feedback * commas * PR comments from nick * PR feedback * fixed build section Co-authored-by: Nick Geisler <36938193+ngeisler11@users.noreply.github.com>	2020-03-11 14:19:30 -07:00
Hariharan Seshadri	a912415bac	Support custom ops targeting the CUDA EP (#3165 ) * Initial commit * Minor nit * Comment * Fix build * Fix build	2020-03-11 00:49:01 -07:00
Hariharan Seshadri	3464801c3e	Explicitly specify NugetPackage parameter while validating nuget in some release pipelines (#3139 )	2020-03-10 15:14:09 -07:00
Yufeng Li	3de1fc096d	Move zero point inputs of MatmulInteger to CPU memory (#3159 )	2020-03-10 13:56:23 -07:00
Tianlei Wu	51a8c82908	Update bert optimization script for SQuAD model exported by keras2onnx (#3163 ) Update script to make it work on fine-tuned bert model exported by keras2onnx	2020-03-10 12:57:49 -07:00
Yufeng Li	876d0c5430	Make quantization parameters as constant weight instead of overrideable (#3160 )	2020-03-10 08:35:02 -07:00
Scott McKay	3d928de778	Use GEMM for LinearRegressor and LinearClassifier operators to improve performance (#3154 )	2020-03-10 20:24:25 +10:00
Dmitri Smirnov	f87b6913cd	Add package download step before pushing to feeds (#3162 ) Add package download step before publishing.	2020-03-09 14:32:18 -07:00
Changming Sun	6ed5d7c332	Update post_binary_sizes_to_dashboard.py (#3161 ) Discussed with Faith, because the data size is very small and changes are gradual, there is no need to delete the old data. We want to keep all the history.	2020-03-09 13:21:58 -07:00
Tiago Koji Castro Shibata	a59243090a	Publish release symbols (#3152 ) * Publish release symbols * Publish symbols if IsReleaseBuild	2020-03-05 22:32:18 -08:00
Andrew Kane	781a6ebb06	Updated Ruby supported versions	2020-03-05 19:50:41 -08:00
pranavm-nvidia	cfd18b583a	Help output typo fix Fixes a typo in the help output for `symbolic_shape_infer`	2020-03-05 19:50:13 -08:00
Tianlei Wu	5be6665b86	Update Gelu Fusion to support new graph pattern from PyTorch 1.4 (#3148 ) * update GeluFusion to support pattern from PyTorch 1.4; * Fix a bug that missing the check of an edge between mul2 and root. * update script to fuse gelu from PyTorch 1.4 * Add test for python optimizer	2020-03-05 18:31:52 -08:00
Dmitri Smirnov	e2894c5ffb	Fix package name overrides (#3150 ) Add env var with the package name.	2020-03-05 17:10:55 -08:00
Yufeng Li	1d2b8115e2	Support u8u8 in quantization tool (#3140 )	2020-03-05 14:42:46 -08:00
KeDengMS	ade4fa108f	Disable delayload for cuda dlls (#3147 ) This change fixes #3129. When running onnxruntime as dll on Windows, CUDA does some internal cleanups when process exits. After this, any call to CUDA would cause crash. Delayload makes thread_local destructor to happen after CUDA cleanup, thus the crash.	2020-03-05 14:40:22 -08:00
Dmitri Smirnov	2c446a7f2f	Add push to ORT-NIGHTLY. (#3146 )	2020-03-05 11:38:22 -08:00

1980 commits