```
Component for aggressive decoding.
Find the bifurcation index between the source tokens (starting from the
previous suffix match index) and the predicted tokens.
Concatenate the predicted tokens, starting from the bifurcation index, to the
back of the current tokens. This forms the output tokens.
Detect the suffix match index in the source tokens, between the source tokens
and the output tokens. Detection is based on finding occurrences of the last
n-gram of the output tokens within the source tokens.
A match is found if the source tokens contain exactly one matching n-gram;
return the index of the start of that n-gram in the source tokens.
No match is found if the source tokens contain zero or multiple matching
n-grams; in that case return -1.
```
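The docstring above can be read as two small helpers; the sketch below is a minimal illustration under assumed names and an assumed n-gram length, not the component's actual API:

```python
# Minimal sketch of the bifurcation / suffix-match logic described above.
# Function names, the n-gram length default, and the indexing convention are
# illustrative assumptions.
from typing import List

def find_bifurcation_index(predicted: List[int], source: List[int], start: int) -> int:
    """Count how many predicted tokens agree with source[start:] before they diverge."""
    i = 0
    while i < len(predicted) and start + i < len(source) and predicted[i] == source[start + i]:
        i += 1
    return i

def find_suffix_match_index(source: List[int], output: List[int], n: int = 2) -> int:
    """Return the start index of the unique occurrence of the last n-gram of
    `output` inside `source`, or -1 if it occurs zero or multiple times."""
    if len(output) < n:
        return -1
    ngram = output[-n:]
    hits = [i for i in range(len(source) - n + 1) if source[i:i + n] == ngram]
    return hits[0] if len(hits) == 1 else -1
```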
* bias dropout improvement
* add transform case for same shape case
* combine kernel
* merge with vectorized kernel
* use "has_same_shape_bias"
* minor: a "N % 4 != 0" case
* add op UT for has_same_shape_bias
* address comments; add param case for 1d bias;
add param case tests for 1d and same-shape bias
* rewrite logic condition
Co-authored-by: Peng Wang <pengwa@microsoft.com>
* Enable selecting custom ops in onnxruntime-extensions.
* Move cmake_helper.py.
* Remove over-indented spaces.
* Add doc.
* Remove onnxruntime-extensions from git submodules; the user should pass the path of onnxruntime-extensions for the build.
* Modify doc.
* Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path.
* Fix build error.
* Fix build error.
* Use onnxruntime_extensions_path.
* support both submodule and external source folders
* refinement
* Update cgmanifest.json
* Support building onnxruntime-extensions from either git submodule or pre-pulled path.
* Update doc.
* more standard name
* update docs
* add the copyright header
Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
Co-authored-by: Wenbing Li <wenbingl@outlook.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* GridSample OP implementation for CPU and CUDA
**Description**: This change contains an implementation of the torch grid_sample OP.
The CUDA implementation contains contributions from Muscle Wu.
* Use interpolation for out-of-bound points in zero padding mode
Out-of-bound points in the zeros padding mode are changed from a constant 0 to
an interpolation of the surrounding pixels. This aligns with the PyTorch
implementation.
A bug in the CUDA batch offset calculation is fixed.
A custom op exporter type is added.
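For context, a minimal single-channel bilinear sketch of the behavior described above (names and layout are assumptions for illustration): with zeros padding, each neighboring pixel contributes zero only if it falls outside the image, rather than the whole sample being forced to zero.

```python
import math

def pixel(img, y, x):
    # Zeros padding: an out-of-bound neighbor contributes 0; in-bound pixels keep their value.
    h, w = len(img), len(img[0])
    return img[y][x] if 0 <= y < h and 0 <= x < w else 0.0

def bilinear_sample(img, y, x):
    # Interpolate from the four surrounding pixels even when (y, x) is out of bounds.
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * pixel(img, y0, x0)
            + (1 - dy) * dx * pixel(img, y0, x0 + 1)
            + dy * (1 - dx) * pixel(img, y0 + 1, x0)
            + dy * dx * pixel(img, y0 + 1, x0 + 1))
```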
* Fix nearest bug in CPU
* Update per CI build finding and review comments
* Force float to avoid potential integer T issue
* Style update
* PR update
* Remove c++17 feature from cuda code
* changes
* tile grad unsqueeze fix for opset 13
* clean up
* remove bool support for opset 2 to 12 for Pad as it is not supported.
* Copy OperatorKernels.md from artifacts of Windows CI build.
* updates for picking pnnx commit
* add tests filter to c# tests
* plus test fixes
* fix versioning for contrib ops
* fix tests
* test filter for optional ops
* more versioning related updates
* fix test
* fix layernorm spec
* more updates
* update docs
* add more test filters
* more filters
* update binary size threshold
* update docs
* plus more fixes
* updates per review
* update to release commit
* add filters for optional type tests
* plus updates
QGemm takes in quantized A, B, C, and the quantization parameters of the output Y; C and the quantization parameters of Y are optional. The output can be quantized or full precision, depending on whether the quantization parameters of Y exist: if they are provided, the output is requantized; otherwise it is full precision.
Compared with QLinearMatMul and MatMulInteger, QGemm supports transpose and the alpha and beta attributes.
The formula for the quantized GEMM is:
Y = alpha * scale_a * scale_b * ((A_int8 - zp_a) * (B_int8 - zp_b) + C_int32), where
C_int32 is quantized with the formula: C_int32 = (beta * C) / (alpha * scale_a * scale_b)
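A minimal numpy sketch of this formula for the full-precision output path, assuming per-tensor quantization parameters (variable names are illustrative, not the kernel's API):

```python
import numpy as np

def qgemm_reference(A_int8, zp_a, scale_a, B_int8, zp_b, scale_b,
                    C=None, alpha=1.0, beta=1.0):
    # Integer accumulation of the zero-point-adjusted inputs.
    acc = (A_int8.astype(np.int32) - zp_a) @ (B_int8.astype(np.int32) - zp_b)
    if C is not None:
        # Fold C into the int32 accumulator: C_int32 = (beta * C) / (alpha * scale_a * scale_b)
        acc = acc + np.rint((beta * C) / (alpha * scale_a * scale_b)).astype(np.int32)
    # Dequantize the accumulator: Y = alpha * scale_a * scale_b * acc
    return alpha * scale_a * scale_b * acc
```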
SparseTensor support
Implement Builder pattern
Fix support for 1-D and 2-D COO indices
Implement and test CSR support.
Handle shape inference for SparseTensors
Implement conversion for COO, CSR and tests.
Address the case where constant sparse initializer is the output.
Implement test infra for SparseTensors
Implement and test SparseDenseMatMul for CSR and COO.
Add hash for SparseToDenseMatMul
Finish shared provider refactor
Refactor GetOrCreate to Create
Working on py interface
Expose OrtDevice and use it in allocate_numpy
Adjust Sparse interfaces, add support for string SparseTensor. Add tests.
Add and test to_cuda()
Add accessors to format specific indices
Test values and indices views, read-only flag, after GC access
Add sparse related methods to OrtValue
Re-work SparseTensor wrapper, add OrtValue methods
Rework numpy_array_to_cuda/to_cpu
Add run_with_ort_values
Add models and test sparse_mat_mul with run_with_ort_values
Refactor sparse tensor to use a single buffer
Ifdef x86 Eigen CSR sparse matmul implementation
Exclude broken test, check for string type when copying cross device
Split pybind schema, regenerate docs, add exclusion
Conditionally exclude schema module
Update docs fix cuda build
Add test to a filter and regenerate JS docs
Add conversion and test string support for sparse tensors
Exclude conversion utils from minimal build
Add CUDA Memcpy and adjust provider interfaces
* add Gridsampler contrib op
* fix gridsampler_paddingmode_border test
* disable the tests until the kernel added
* fix CI failure
* change GridSampler to GridSample
* changes working to convert qkv nodes
* changes to replace nodes
* changes to accommodate qkv hidden sizes as attributes
* kernel to accept qkv_hidden_size attributes
* Working up to compute for varied dimensions; TODO: applyattention()
* changes to make all regression tests work
* inference running successfully without prepack
* successful inference with pre-packed weights
* add test for diff sizes
* bias shape need not be a multiple of 3
* get the output_hidden_size from input
* infer output shape from input
* merge with master
* cleaning up files that got merged wrong
* accuracy at accepted level
* added unit test case for different dimensions
* all unit tests passing
* packed weights working for attention
* prepacked weights working
* added test case for newly added extra qk input
* updated unit test to test only extra add qk
* fixing build error
* removing few debugs
* reverting test changes
* all python test passing
* cleaning up
* new unit test added, major clean up of code
* removed extra code
* minor
* minor fix to tests
* prepack weights code cleaned up
* compacted compute() in attention.cc
* reformat compute()
* making a parameter T
* adding 3 q,k,v buffers in all cases
* fixing build
* running tests only on cpu
* Updating docs
* trigger ci builds
* Addressing comments in PR
* addressing some more comments
* get add_qk_str from add_qk node directly
* updating docs, added extra check to verify attn inputs
* Optimized the extra add by parallelizing
* added attention_shape to symbolic_shape_infer.py
* minor refactoring to address comments
* Update submodule onnxruntime-extensions to latest.
* Add document for onnxruntime-extensions.
* Update cgmanifest.json for onnxruntime-extensions.
* Add example in JavaScript.
Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
**Description**:
Enforce no repetition of n-grams. Scores are set to `-inf` for tokens that would form a repeated n-gram if added to the back of the input_ids.
**Motivation and Context**
Needed by transformer models in sequence generation algorithms (greedy search and beam search). This module has a heavy impact on performance and can be highly parallelized.
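A minimal sketch of the masking described above for a single sequence, assuming a flat row of vocabulary scores (the function name and signature are illustrative):

```python
import numpy as np

def ban_repeated_ngrams(input_ids, scores, ngram_size):
    # Ban every token that would complete an n-gram already present in
    # input_ids when appended to its end.
    seq = list(input_ids)
    if len(seq) < ngram_size - 1:
        return scores
    prefix = tuple(seq[-(ngram_size - 1):]) if ngram_size > 1 else tuple()
    banned = set()
    for i in range(len(seq) - ngram_size + 1):
        if tuple(seq[i:i + ngram_size - 1]) == prefix:
            banned.add(seq[i + ngram_size - 1])
    for tok in banned:
        scores[tok] = -np.inf
    return scores
```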
* Update the operator documentation generation
- Make layout a little nicer
- Update to latest supported operators including training
- Fix some links that are broken when the docs content is copied to github-pages
- Fix incorrect usage of 'ai.onnx.ml' as the default domain
- ML ops are now separated from the real default domain of 'ai.onnx'
- Include CPU, CUDA and training kernels
- exclude DNNL as it's not an EP we own
* There are separate paths for CUDA and CUDNN as they are not guaranteed to be in the same location on a Windows machine. Use the CUDNN path when looking for the CUDNN library.
* Enable validation of both contrib ops and operator kernels in build
Filter generation so it's deterministic
Add ability for CI to publish the md files as build artifacts if they differ so a developer can download and add to their PR to resolve any diffs.
Remove workarounds for github-pages as that will now link to the github docs which display correctly