onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-25 22:26:24 +00:00

Author	SHA1	Message	Date
Adam Pocock	e9dc8954ac	Adding support for ACL and DML to the Java API.	2020-04-14 20:35:03 -07:00
Ori Levari	f564569a80	Adapter Model and Environment tests (#3469 ) Adapter Model and Environment tests winml test macro clean up and extension	2020-04-14 13:36:31 -07:00
Du Li	621b3ac03a	FFT contrib ops (#3381 ) * add custom op skeleton * Adding Rfft, Irfft kernels. * Fix a few errors: 1. make kernel stateless to avoid race condition 2. reclaim cufft plan * Adding MLFloat16 support * Adding fp16 support for fft ops. * Adding cufft plan cache. * adding a util func * adding copyright info. * Accommodating PR comments.	2020-04-14 10:12:04 -07:00
Ye Wang	66a79d2c9f	fix (#3512 )	2020-04-13 18:30:58 -07:00
Ye Wang	cbe30f3e19	update FeaturizersLibrary (#3511 )	2020-04-13 15:47:51 -07:00
Ye Wang	438353abcd	Fix TruncatedSVDFeaturizer's test failure and re-enable it's kernel test (#3458 ) * checkin * fix linux & macos build * fix test * revert the changes for a single-aimed PR * fix	2020-04-13 13:59:38 -07:00
Tiago Koji Castro Shibata	d09d4a6b0d	Fix OS build (#3481 )	2020-04-09 21:46:01 -07:00
Yufeng Li	a443b1b6b9	Revert "Use IMMA for int8 matmul to leverage Turing Tensor Core (#3413 )" (#3472 ) This reverts commit `4d71958ccf`. Revert the PR. Looks like it triggers a bug in nvcc and failes the GPU pipeline.	2020-04-09 15:59:52 -07:00
Yufeng Li	4d71958ccf	Use IMMA for int8 matmul to leverage Turing Tensor Core (#3413 ) Use IMMA for int8 matmul to leverage Turing Tensor Core Format files under onnxruntime/core/providers/cude	2020-04-07 15:22:04 -07:00
Ye Wang	4ebad8805b	change (#3431 )	2020-04-06 11:30:21 -07:00
Changming Sun	0dcc6035b1	Disable strong inline (#3399 ) To bypass a MSVC bug. Without this change, people can't use VS2017 to build onnxruntime in Release or RelWithDebInfo mode.	2020-04-06 11:19:09 -07:00
Changming Sun	33006f48c0	Update onnx submodule to 1.7.0 release candidate (#3405 ) Update onnx submodule to 1.7.0 release candidate. This isn't a release tag, but it will be released soon, in 1-2 weeks.	2020-04-04 16:23:42 -07:00
Pranav Sharma	14f4c3e25f	Fix issue in construction of DummyArena. (#3416 )	2020-04-03 08:28:05 -07:00
Tiago Koji Castro Shibata	1671072b6b	[WIP] Port image tests from WAI (#3365 ) * Copy image tests from ADO * wip * Port tests to googletest * Add FNS-Candy license * Add missing collaterals * Remove brand images * Fix typos * Use PrepareModelSessionBinding in MnistImageTest * Fix typos	2020-04-01 15:38:44 -07:00
Changming Sun	accffded5d	Build options for enabling AVX/AVX2/AVX512 (#3373 ) 1. Add build options for enabling AVX/AVX2/AVX512 2. Update eigen to a newer version, because the current one doesn't work with VC and AVX512.	2020-04-01 10:07:22 -07:00
Dmitri Smirnov	a4fe60c4d3	OpSet 12 ops (#3341 ) Advance ONNX commit to pickup the latest ArgMax, ArgMin, ReduceMax/ReduceMin, MaxPool Declare new versions for CPU/CUDA. Implement infrastructure support for int8/uint8. Adust GatherOp test for a new error. Adjust Scan9.BadShape test. Add exclusions for index out of bounds checks. Rework result verification for SVDTransformer.	2020-03-31 15:31:06 -07:00
stevenlix	2332a93db0	Update onnx-tensorrt parser (#3369 ) * sync onnx-tensorrt parser and update TensorRT doc * remove --msvc_toolset 14.16 in tensorrt ci pipeline	2020-03-30 20:31:59 -07:00
Jan Scholz	ce9acf0c21	iOS crosscompilation under linux (#3298 ) * added support for ios crosscompilation under linux * reverted cmake generator change * if --ios is added protoc can be compiled for host system * accidently reverted change to compile protoc for host system for ios if protoc exe is not set * wdata is now used * accidentally pasted CMAKE_OSX_ARCHITECTURES into CmakeLists.txt, also made bad merge on build.py previously * removed print * fixed typeo, deleted commented statements for earlier debugging * reverted accidental delete * added asmmacro.h for aarch64 asm now MlasSgemmKernel**** gets underscore added if needed no need anymote to differentiate between iOS arm64 and normal amr64 build onnxruntime.cmake: added check if iOSCross is set to properly set RPATH * removed 2 spaces * fix: logcial error fixed, now protoc gets compiled if not supplied with --path_to_protoc_exe * removed unecessarily added spaces * removed some more spaces	2020-03-30 19:39:17 -07:00
Changming Sun	06fc9506fd	Thread pool changes (#3153 ) 1. Copy tensorflow's thread pool class to ORT, so that we can get a better implementation of thread pool based parallelfor 2. Copy Eigen's thread pool class to ORT 3. Support thread affinity 4. Remove RNN kernel’s private thread pool 5. Modify pool kernels to use the thread pool when openmp is disabled.	2020-03-30 12:18:40 -07:00
George Wu	355f39ddee	fix cuda build for cmake >= 3.17.0 (#3362 )	2020-03-30 00:38:57 -07:00
Tiago Koji Castro Shibata	c3cea486d0	Port ConcurrencyTests from TAEF (#3086 ) * Add ConcurrencyTests * Make ConcurrencyTests compatible with TAEF * Use test PCH in concurrency tests * Fix include header * Ignore unused code warnings on WINML_SKIP_TEST * Remove BOM * Remove conflicting namespace in older SDK * Refactor duplicate code * Fix unused DELAYLOAD * Fix unused DELAYLOAD * Remove link to internal bug * Address code style fixes * Add new concurrency tests	2020-03-27 17:39:22 -07:00
Sheil Kumar	b72fe13941	Update WinML Projection to accept sequence of tensors (#3287 ) * Enable sequence of tensor * add tests * small updates * There should only be 2 elements returned * CR feedback, and another 6->2 check update in the test. * missing semicolon... * Add explicit to constructor taking pointer paramter Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-03-23 15:55:20 -07:00
Tracy Sharpe	57468c651c	QLinearMatMul speed up (#3283 ) The equivalent of PR#3196 but done for QLinearMatMul. Use MLAS to do a u8u8=s32 GEMM and then requantize this intermediate buffer.	2020-03-21 15:37:25 -07:00
Pranav Sharma	84015d9491	Fix post merge test. This doesn't get triggered as part of gated PR checks. (#3277 )	2020-03-20 13:23:09 -07:00
Ye Wang	c5149e89d9	Wangye/shortgraindropper (#3273 ) (#3274 ) * Featurizer Library update * update Featurizer Library * add short_grain_dropper_transformer * resolve comments * resolve comments * resolve comments	2020-03-20 11:48:31 -07:00
Tiago Koji Castro Shibata	3bdb0b620a	Fix WCOS/Win32 linking bugs (#3126 ) * Fix WCOS/Win32 linking bugs * Remove unused NODEFAULTLIB flags * Avoid plain target_link_libraries signature * Avoid plain target_link_libraries signature * Fix library list escaping * Use library list instead of string * Remove duplicate link to windowsapp.lib * Remove Win32 build workarounds * Specify CMake policies before initializing language * Expose Win32 header definitions during build * Force set API family * Enable Win32 APIs in featurizer * Use MT dynamic CRT * Expose Win32 specific functions * Disable app container globally * Disable default wide functions in featurizers * Add featurizers to test include path * Workaround https://gitlab.kitware.com/cmake/cmake/issues/19428 * Revert pipeline debugging hacks * Skip /FI in CUDA sources * Default to Win32 builds * Enable WCOS when using WinML * Use generator expression to apply CMAKE_MSVC_RUNTIME_LIBRARY to C++ only	2020-03-19 08:52:40 -07:00
Pranav Sharma	435f014d71	Add support for sessions to share a global threadpool. (#3177 ) * Add support for sessions to share a global threadpool. * Fix build issues * Add tests, fix build issues. * Added some documentation * Fix centos issue when threadpools become nullptr due to 1 core. * Fix mac and x86 build issues * Address some PR comments * Disabled test for android, added few more tests and addressed more PR comments. * const_cast	2020-03-18 15:42:46 -07:00
edgchen1	e03b8a1e2f	Move path_lib from onnxruntime/core/framework to onnxruntime/core/platform. (#3253 ) Moved path_lib.h/cc from onnxruntime/core/framework to onnxruntime/core/platform and from the onnxruntime_framework to the onnxruntime_common libraries.	2020-03-18 11:53:46 -07:00
Tracy Sharpe	88c20eaef1	MLAS: rename AVX512BW->AVX512Core (#3216 ) Cleanup change: remap functions and files with Avx512BW to Avx512Core.	2020-03-13 22:45:51 -07:00
Tracy Sharpe	fe0b2b2abd	QLinearConv speed up (#3196 ) For x86/x64 builds, change the QLinearConv op to use MLAS for the u8u8=s32 GEMM, then requantize the intermediate buffer to u8.	2020-03-13 16:54:55 -07:00
KeDengMS	ade4fa108f	Disable delayload for cuda dlls (#3147 ) This change fixes #3129. When running onnxruntime as dll on Windows, CUDA does some internal cleanups when process exits. After this, any call to CUDA would cause crash. Delayload makes thread_local destructor to happen after CUDA cleanup, thus the crash.	2020-03-05 14:40:22 -08:00
smk2007	6cdd2b4934	Enable DML Nuget Package for x64 or x86 architectures (#3120 ) * add dml gpu pipelines * add x86 to the gpu dml dev build pipeline * Enable DML x86 builds * Fix uint64_t -> size_t warning * fix warnings * enable dml on x86 ci builds * operatorHelper 773 error uint32_t vs uint64_t * operatorHelper 773 error uint32_t vs uint64_t * make x86 pipeline use the gpu pool * more warnings * fix x86 directml path * make dml nuget package * disable tf_pnasnet_large * disable zfnet512 * make validation use wildcards * disable x86 dml gpu tests * add args. * update gpu.yml * change nupkg wildcard * add debug statements * package x86 dml nupkg * dont drop managed nuget again from dml pipeline build * Add DML EULA * directml license should be renamed to not clobber the existing license * casing on dml package.... * {} to () * fix license name * disable dml from x86 ci * typo and cr feedback * remove featurizers * ship the dml pdb as well	2020-03-02 20:18:46 -08:00
edgchen1	37f5fd8fb8	Add support for loading TensorProtos with external data from optimizer Initializer (#3045 ) - Added support for loading TensorProtos with external data from the optimizer Initializer class. - Added some file path utilities.	2020-02-28 13:19:16 -08:00
Changming Sun	c6ed077441	Add d2FH4- flag to cuda (#3105 )	2020-02-27 20:22:07 -08:00
Dmitri Smirnov	5008fc5b00	Featurizers: Import fix for Linux build adjust linkage (#3089 ) Advance FeaturizersLibrary SetAbsError on Output	2020-02-27 15:49:18 -08:00
Changming Sun	d72639ef77	Fix CUDA 10.1 DLL names (#3102 )	2020-02-27 14:43:16 -08:00
daquexian	37a905f557	Make Java API available on Android (#3030 )	2020-02-27 08:23:50 -08:00
Ori Levari	5e0f7412cd	Properly handle downlevel and WCOS scenarios (#3075 )	2020-02-25 17:47:02 -08:00
stevenlix	f4a5d17294	Upgrade to CUDA10.2 for TensorRT (#3084 ) * Switch to CUDA10.2 * Update win-gpu-tensorrt-ci-pipeline.yml * Update win-gpu-tensorrt-ci-pipeline.yml * remove dynamic_shape * update onnx-tensorrt submodule * check if input shape is specified for TensorRT subgraph input and enable some TensorRT unit tests * fix format issue * add shape inference instruction for TensorRT * update according to the reviews * Update win-gpu-tensorrt-ci-pipeline.yml	2020-02-25 05:36:01 -08:00
Adam Pocock	b23b7f0fea	[java] Adds the provider compile-time flags where the JNI code expects them. (#3082 )	2020-02-24 15:47:26 -08:00
Dmitri Smirnov	b8628404f3	Replace hardcoded include path value with the advertised setting. (#3083 )	2020-02-24 13:55:00 -08:00
kile0	f367fd921c	Use a custom allocator for temporary buffers in reduction_ops.cc (#2775 ) * port the mimalloc allocator * hook mimalloc opt into common.h and reduction ops * repurpose USE_MIMALLOC to only denote subbing in of default allocator with mimalloc and some refactoring * fix unintended cherry pick diffs * polish alloctor_mimalloc * explicitly disable mimalloc where it already had been disabled * update mimalloc to pull in stl allocator * switch mimalloc stl allocator to use mimalloc library version * turn mimalloc on by default (only the stl changes are enabled, the python interacting ones are off already and shall remain so) * move FastAllocVector into cpu specific code * separate out defines into arena and stl changes * the rest of the define renames * bfc arena allocator * some typos and rename the bfc arena allocator to fit existing class naming conventions * adjustments in response to comments * different template instantiations are friends	2020-02-23 16:04:30 +10:00
Changming Sun	ae1f35fb9f	Ignore GCC no-deprecated-copy warnings (#3074 )	2020-02-22 11:48:27 -08:00
Changming Sun	45ba325fa6	Remove USE_NSYNC macro (#3052 )	2020-02-20 13:29:19 -08:00
Scott McKay	a1db87b382	Add SafeInt bounds checking to memory allocation size calculations. (#3022 ) * Add SafeInt bounds checking to memory allocation size calculations. * Fix TensorRT library includes	2020-02-20 11:41:03 -08:00
Changming Sun	cb24e2a214	Update nsync	2020-02-20 11:25:34 -08:00
Changming Sun	e3c27536d0	Python binding doesn't need to link to the python lib on Linux	2020-02-19 12:18:47 -08:00
James Yuzawa	411b3aa801	Java build system enhancements (#2866 )	2020-02-18 15:41:49 -08:00
daquexian	4ca50d9352	Update DNNLibrary to v0.9.0 and update NNAPI GetSupportedNodes	2020-02-17 13:24:10 -08:00
ytaous	2b77cb19bd	merge training kernels to master (#2999 ) * merge training kernels to master * merge training kernels to master * revert two files * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master * merge training kernels to master	2020-02-13 14:52:35 -08:00


352 commits