* Update to handle multiline kernel declarations, which are typical these days.
* Update to new path for the cpu contrib_op kernel registrations.
* Update tools/python/find_optimizer_opset_version_updates_required.py
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
* add description of build ORT+TVM EP on Windows
* fix cmake error related to symlink creation on Windows
* add llvm config path to build flags for correct build on Windows
* update TVM_EP.md for llvm_config build arg
* fix warnings skipping during build on Windows
* fix using string or wstring for model path to correct build on Windows (MSVC error)
* fix error in custom logger for correct build on Windows
* implement glob algorithm for Windows
* additional build fixes
* update TVM with export of VM symbols for dll
* description of nasm issue and workaround
* update TVM with export of Executable from VM symbols for dll
* description of installation of ipp-crypto dependencies on Windows
* cmake key for ipp-crypto build
* fix wstring for TVMso EP
* fix ipp-crypto build
* cmake key onnxruntime_TVM_USE_HASH now switches off the full hash functionality rather than specific methods
* fix absolute path to compiled lib
* update TVM_EP.md, fix lint warnings
* update TVM_EP.md
* small fixes after review
* switch on handshake functionality for Linux workflow
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
* implement infrastructure for the handshake mechanism; SHA-256 was selected as the first hash algorithm
* check hash during compile in TVMso EP
* add IPP-CRYPTO to external dependencies for TVM EP
* make checkHash method const
* removed the public implementation of the SHA-256 algorithm so as not to cause a license conflict
* implemented SHA-256 calculation using ipp-crypto library
* fix dependency for ipp-crypto
* add provider options for hash check
* update documentation for added provider options
* add hash check condition
* fix docs
* fix lint
* fix ORT_THROW
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
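The hash-check flow above can be sketched as follows. This is a minimal Python sketch assuming a provider option carries the expected digest; the actual EP computes SHA-256 in C++ via the ipp-crypto library, and the function names here are illustrative, not ORT's.

```python
import hashlib

def compute_sha256(path: str) -> str:
    """Compute the SHA-256 digest of a compiled model library on disk."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large libraries do not load fully into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def check_hash(path: str, expected_digest: str) -> bool:
    """Return True when the library matches the digest from provider options."""
    return compute_sha256(path) == expected_digest
```

A mismatch would indicate the compiled library does not correspond to the model the session was configured with.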
* Register signal ops for op set 17
Note code is mostly being moved, not added. These ops were previously
only registered as Microsoft contrib ops and only built if
`BUILD_MS_EXPERIMENTAL_OPS=1`. They've been added to the ai.onnx
standard op set in version 17.
Main components of this change:
* Move the kernels from the contrib_ops directory to the
core directory.
* Add function bodies for ms experimental ops. This will allow
old models that use the contrib ops to continue to function.
All the function bodies consist of a single op (the
new standard op), so performance overhead should be minimal.
Minor clean-up also in this change:
* De-duplicate get_scalar_value_from_tensor: put it in a new utils.h.
* Fix some bugs that caused compilation errors with the experimental
ops. Tested with `build.sh --ms_experimental`
* Fix some spelling errors and lint violations.
* Replace a couple of switch statements with `MLTypeCallDispatcher`.
* Use `InlinedVector` instead of `std::vector`.
Unblocks https://github.com/microsoft/onnxruntime/issues/11640
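The de-duplicated `get_scalar_value_from_tensor` helper lives in C++; a rough Python analogue of what it does, decoding a single scalar from a tensor's raw little-endian bytes (the dtype names and mapping here are illustrative, not ORT's actual type enum):

```python
import struct

# Illustrative dtype-name -> struct format mapping (not ORT's enum values).
_FORMATS = {"float32": "<f", "float64": "<d", "int32": "<i", "int64": "<q"}

def get_scalar_value_from_tensor(raw: bytes, dtype: str):
    """Decode one scalar from a tensor's raw little-endian byte buffer."""
    fmt = _FORMATS[dtype]
    (value,) = struct.unpack(fmt, raw[: struct.calcsize(fmt)])
    return value
```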
Prior to this change, every test shared the same tolerances. This meant
that if an ONNX test failed due to a small but acceptable difference in
output, the only alternative was to disable the test entirely.
In op set 17, the DFT operator is being added. Without this change, the
tests for that operator fail because the output is off by about 5e-5.
It's better to keep test coverage for this new op rather than disable
the test entirely.
Also prior to this change, the global tolerances were not shared between
C++, JavaScript, and Python tests. Now they are.
Also fix various minor issues raised by linters.
Unblocks https://github.com/microsoft/onnxruntime/issues/11640.
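The per-test tolerance idea can be sketched as a single override table consulted by all test harnesses. The table entries and default values below are hypothetical, not the actual numbers used by the ORT test suites:

```python
# Global defaults shared by all tests (hypothetical values).
DEFAULT_RTOL, DEFAULT_ATOL = 1e-3, 1e-5

# Per-test overrides, so an op like DFT (off by ~5e-5) can be relaxed
# instead of disabled. In the real change, one shared table serves the
# C++, JavaScript, and Python harnesses.
OVERRIDES = {
    "test_dft": (1e-3, 1e-4),
}

def get_tolerances(test_name: str):
    """Return (rtol, atol) for a test, falling back to the global defaults."""
    return OVERRIDES.get(test_name, (DEFAULT_RTOL, DEFAULT_ATOL))
```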
(1) Support T5 in BeamSearch operator, and add both CPU and CUDA implementation.
(2) Change BeamSearch op: rename encoder_decoder_init attribute to encoder, and add decoder_start_token_id attribute
(3) Update convert_to_onnx for T5 to use int32 instead of int64 inputs as default.
(4) Add more tests in best_beam_search.py
(5) fix ORT_ENFORCE of hypothesis_buffer_offset_
(6) Improve ONNX conversion:
(a) Change some encoder dynamic axes to fixed dimension values
(b) add --separate_encoder_and_decoder_init
(c) correct name t5-3B => t5-3b, t5-11B => t5-11b
(d) Add --use_int32_inputs when converting t5 to onnx
(e) Allow t5 beam search conversion in one step
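The int64-to-int32 input change in (3) and (d) rests on token ids comfortably fitting in 32 bits. A stdlib sketch of the narrowing with an overflow guard (the helper name is hypothetical; the real converter changes tensor dtypes in the ONNX graph):

```python
from array import array

def to_int32_ids(ids):
    """Narrow int64 token ids into an int32 buffer.

    array('i') stores C ints (32-bit on common platforms) and raises
    OverflowError if any id does not fit, failing loudly instead of
    silently truncating.
    """
    return array("i", ids)
```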
* update description of TVM EP options in docs
* update sample notebook
* update TVM EP documentation
* add link to description of options
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
* Initiate Ort SNPE EP
* fix SNPE EP Windows build failure caused by the utility method (ToUTF8String) name change on master
* correct the source path for libonnxruntime.so when building the Android package
* add AdditionalDependencies for arm64
* On MS-Windows, the patch file must be a text file, i.e. CR-LF must be used as line endings. A file with LF may give the error "Assertion failed, hunk, file patch.c, line 343," unless the option '--binary' is given.
* fix build failure if snpe is not enabled
* update doc for contrib op
* separate out snpe ep settings to onnxruntime_snpe_provider.cmake
* rename according to review comments
* update according to review comments
* Implement BitmaskDropout and associated unit tests.
* Implement BitmaskDropoutGrad and associated unit tests.
* Implement Dropout -> BitmaskDropout rewrite rule and associated unit tests.
* Implement (Dropout,DropoutGrad) -> (BitmaskDropout,BitmaskDropoutGrad) rewrite rule.
This commit does not yet include unit tests for this rewrite rule.
This commit also introduces improved documentation for all changes which will be grouped
into this PR.
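The point of BitmaskDropout is to replace the byte-per-element boolean mask with a packed bitmask, 1 bit per element. A stdlib sketch of the packing into 32-bit words (the real kernels do this on the GPU; these helper names are illustrative):

```python
def pack_mask(mask):
    """Pack a boolean keep-mask into 32-bit words, 1 bit per element."""
    words = [0] * ((len(mask) + 31) // 32)
    for i, keep in enumerate(mask):
        if keep:
            words[i // 32] |= 1 << (i % 32)
    return words

def unpack_mask(words, n):
    """Recover the boolean keep-mask for n elements from packed words."""
    return [bool((words[i // 32] >> (i % 32)) & 1) for i in range(n)]
```

Packing shrinks the dropout mask by 8x, which is the memory saving the rewrite rule is after.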
* bitmask dropout
* fix win build
* bugfix for rocm
* bugfix
* fix code format
* fix ut
* fix build break
* fix ut in win
* resolve comments
* fix ut in trt
* resolve comments
* fix rocm build error
* fix typo
Co-authored-by: Aidan Beggs <aidanbeggs@microsoft.com>
Description: Format all python files under onnxruntime with black and isort.
After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame.
#11315, #11316
* Implement TreeEnsemble for opset(ai.onnx.ml)==3
* use of InlinedVector
* refactoring
* improve attributes retrieval
* avoid creating a temporary buffer
* modifies onnx.ml.cpu.json
* use unordered_map
* update docs/OperatorKernels.md
* address PR comments (TH -> ThresholdType, ORT_RETURN...)
* add a python unit test to load a TreeEnsembleRegressor following ai.onnx.ml==3 specifications
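TreeEnsemble kernels store tree nodes as parallel flat arrays rather than pointer-linked nodes. A minimal single-tree regressor walk under that layout (field names are illustrative, not the ONNX `ai.onnx.ml` attribute names, and real trees support more comparison modes than `<=`):

```python
def eval_tree(x, feature_ids, thresholds, left, right, values):
    """Walk one decision tree stored as parallel flat arrays.

    Node i is a leaf when left[i] == -1, in which case values[i] is the
    regression output; otherwise compare the selected feature against
    the node threshold and descend.
    """
    i = 0
    while left[i] != -1:
        i = left[i] if x[feature_ids[i]] <= thresholds[i] else right[i]
    return values[i]
```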
* improve NonZero
* fix megatron_fp16 optimizer, fix the doc
* multi_tensor_applier
* resolve comment
* fix build warning
* fix build error when enabling training and using tensorrt
* change BeamSearch op to support encoder decoder model
* check model_type and decoder attribute
* fix
* update comments
* warn about shape inference issue with onnx v1.11 for T5
* skip parity test when temperature != 1.0
* fix build
Work on minimizing memory management calls by
reducing the number of allocations and copies.
Replace std::unordered_set with InlinedHashSet
and add usage of InlinedVector.
Employ std::move() to minimize copying and memory allocations.
Remove copying of the const shared data into each of the
PropagateCast transformer instances.
Move inlined_containers.h header to include/common.
Adjust AsSpan implementation for C++ < 17.
* add support for bool type
* add TVM EP support for tests
* include TVM EP in python test pool
* fix pylint
* moved technical imports to a separate file
* clean up post build actions & move _ld_preload.py extension to CMake level
* add files to include TVM EP in CI
* implement custom logger for TVM
* replace TVM logging with ONNX RT logging
* update link for TVM EP tutorial
* clean up TVM EP cmake
* add pybind auto enabling for TVM EP
* fix blank spaces
* code review fixes
* replace print with comment
* add list of EPs without TVM EP
* enable onnx tests
* disable contrib ops and ml ops
* reuse Dockerfile.ubuntu
* Move install_tvm_test_dependencies.sh out of Docker context dir, update build definition.
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* Fix incorrect type constraint registration for RoiAlign. This led to the input type not actually being checked when matching a kernel, as the invalid constraint name is treated as a missing optional input.
* fix missing dependency for the unit test exe. Whilst it doesn't link against the CUDA providers lib, without the dependency VS doesn't know it needs to rebuild the library if there are changes.
* Add check for invalid type constraints.
* Fix invalid registrations for other kernels.
* Add hash replacement logic to provide backwards compatibility in ORT format models when the registration is fixed.
* Add tests
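The "check for invalid type constraints" above can be sketched as validating that every constraint name a kernel declares actually exists in the op schema; names the schema does not define were previously ignored silently. The function name and data shapes here are hypothetical, not ORT's actual KernelDef API:

```python
def find_invalid_constraints(schema_constraints, kernel_constraints):
    """Return constraint names declared on a kernel registration that the
    op schema does not define. Before the fix, such names were silently
    treated as missing optional inputs instead of being validated."""
    return sorted(set(kernel_constraints) - set(schema_constraints))
```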
Add abseil cgmanifest declaration. Update coding standards for InlinedContainers
Adjust coding guidelines. Add default N calculation for InlinedVector<T, N> for general use.
Rename T from InlinedShapeVectorT. Fix Eager build
Add LLVM Copyright with modified derived code notice.