onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-14 20:48:00 +00:00

Author	SHA1	Message	Date
Dmitri Smirnov	950fe5e28b	Implement SparseTensor and infrastructure support and advance ONNX commit (#8038 ) SparseTensor support Implement Builder pattern Fix support for 1-D and 2-D COO indices Implement and test CSR support. Handle shape inference for SparseTensors Implement conversion for COO, CSR and tests. Address the case where constant sparse initializer is the output. Implement test infra for SparseTensors Implement SparseDenseMatMul for Csr and COO and tested it. Add hash for SparseToDenseMatMul Finish shared provider refactor Refactor GetOrCreate to Create Working on py interface Expose OrtDevice and use it in allocate_numpy Adjust Sparse interfaces, add support for string SparseTensor. Add tests. Add and test to_cuda() Add accessors to format specific indices Test values and indices views, read-only flag, after GC access Add sparse related methods to OrtValue Re-work SparseTensor wrapper, add OrtValue methods Rework numpy_array_to_cuda/to_cpu Add run_with_ort_values Add models and test sparse_mat_mul with run_with_ort_values Refactor sparse tensor to use a single buffer Ifdef x86 Eigen CSR sparse matmul implementation Exclude broken test, check for string type when copying cross device Split pybind schema, regenerate docs, add exclusion Conditionally exclude schema module Update docs fix cuda build Add test to a filter and regenerate JS docs Add conversion and test string support for sparse tensors Exclude conversion utils from minimal build Add CUDA Memcpy and adjust provider interfaces	2021-07-22 15:24:36 -07:00
DeyuHuang	4275055868	Add Gridsampler contrib op (#8372 ) * add Gridsampler contrib op * fix gridsampler_paddingmode_border test * disable the tests until the kernel added * fix CI failure * change GridSampler to GridSample	2021-07-22 15:39:28 +08:00
Viswanath Boga	afce0e2543	Attention kernel update to handle different Q,K,V hidden sizes (#8039 ) * changes working to convert akv nodes * changes to replace nodes * changes to accommodate qkv hidden sizes as attributes * kernel to accept qkv_hidden_size attributes * Working till compute for varied dimension, todo applyattention() * changes to make all regression tests work * inference running successfully without prepack * success inference with pre-pack weights * add test for diff sizes * bias shape need not be a mul of 3 * get the output_hidden_size from input * infer output shape from input * merge with master * cleaning up files that got merged wrong * accuracy at accepted level * added unit test case for different dimensions * all unit tests passing * packed weights working for attention * prepacked weights working * added test case for newly added extra qk input * updated unit test to test only extra add qk * fixing build error * removing few debugs * reverting test changes * all python test passing * cleaning up * new unit test added, major clean up of code * removed extra code * minor * minor fix to tests * prepack weights code cleaned up * compacted compute() in attention.cc * reformat compute() * making a parameter T * adding 3 q,k,v buffers in all cases * fixing build * running tests only on cpu * Updating docs * trigger ci builds * Addressing comments in PR * addressing some more comments * get add_qk_str from add_qk node directly * updating docs, added extra check to verify attn inputs * Optimized the extra add by parallelizing * added attention_shape to symbolic_shape_infer.py * minor refactoring to address comments	2021-07-19 12:21:33 -07:00
Nick Kreeger	800b62a139	Create a quantized EmbedLayerNorm for ORT. (#8124 ) Create a quantized EmbedLayerNorm Op for ORT	2021-06-25 17:51:43 -05:00
Negin Raoof	80b7b134bf	Adding optional ops in contrib ops (#7946 ) * Added optional const spec	2021-06-24 13:16:31 -07:00
Bowen Bao	51c12a715b	Add NGramRepeatBlock contrib op (#8078 ) Description: Enforce no repetition of n-grams. Scores are set to `-inf` for tokens that form a repeated n-gram if added to the back of the input_ids. Motivation and Context Needed by transformer models in sequence generation algorithms (greedy search and beam search). This module has heavy impact on performance, and can be highly parallelized.	2021-06-21 10:21:48 -07:00
Scott McKay	0fbec1b9c1	Update the operator documentation generation (#7787 ) * Update the operator documentation generation - Make layout a little nicer - Update to latest supported operators including training - Fix some links that are broken when the docs content is copied to github-pages - Fix incorrect usage of 'onnx.ai.ml' as the default domain - ML ops are now separated from the real default domain of 'onnx.ai' - Include CPU, CUDA and training kernels - exclude DNNL as it's not an EP we own * There are separate paths for CUDA and CUDNN as they are not guaranteed to be in the same location on a Windows machine. Use the CUDNN path when looking for the CUDNN library. * Enable validation of both contrib ops and operator kernels in build Filter generation so it's deterministic Add ability for CI to publish the md files as build artifacts if they differ so a developer can download and add to their PR to resolve any diffs. Remove workarounds for github-pages as that will now link to the github docs which display correctly	2021-06-02 17:47:40 +10:00
Yufeng Li	a74e41e47d	Add non-zero zp support for quant matmul and attention (#7570 ) * add non-zero zp support * support A and B scale with any dimensions	2021-05-14 16:50:31 -07:00
Zhang Lei	50c5edcf13	Add nhwc support for QLinearAveragePool operator (#7656 ) * Add nhwc support for QLinearAveragePool operator * Update ContribOperators.md * Update OperatorKernels.md with cpu,dnnl and cuda enabled.	2021-05-13 22:05:30 -07:00
Tracy Sharpe	16297a8e61	Implement NCHWc Upsample linear mode (#7623 ) Extend the existing NCHWc Upsample operator to support linear modes too.	2021-05-10 12:16:16 -07:00
Ye Wang	803837df63	Add 4dmask support for attention cuda kernel (#7591 ) * checkin * add 4dmask support in attention cuda op * trim * add comments * fix build/test error * review comments and add tests * sync doc * review comments * minor change	2021-05-07 20:17:29 -07:00
Tracy Sharpe	d13e5b2fd9	NCHWc: ReorderInput improvements (#7442 ) Implement various improvements related to reordering a tensor for use by NCHWc operations: Relax the requirement that the input channel count must be a multiple of the NCHWc block size (either 8 or 16 depending on ISA). The requirement now is that the channel count must be a multiple of 4. The implementation of MlasReorderInputNchw would need further work to support relaxing this further, but I don't have any models where I've observed this to be necessary yet. Support fusing a Transpose(NHWC->NCHW) into a following ReorderInput. ReorderInput now has a channels_last attribute as was done in the past for ReorderOutput. This helps with models converted from TF where the converter is unable to remove all Transpose operations. Add threading support to ReorderInput to accelerate performance (ReorderOutput will come later).	2021-04-26 19:16:39 -07:00
Zhang Lei	ada0fbbd2d	Implement qlinear concat and unit test. (#7341 ) * Implement qlinear concat and unit test. Add quantization tools for QLinearConcat and its quantization tests. * Add kernel def hash for QLinearConcat. * Change according to PR. Add qdq transformer support for QLinearConcat. * Add QDQ Transformer unittest. Fix typo on domain. * remove dup logic of no use. * fix x86 build error. * Update operator docs.	2021-04-26 13:38:40 -07:00
Changming Sun	afa7b23609	Update docs/ContribOperators.md and the script that generates it. (#7399 )	2021-04-21 16:20:56 -07:00
Changming Sun	5bd192c439	Update ContribOperators.md (#7246 )	2021-04-05 17:11:33 -07:00
Ashwini Khade	2a018cc235	revert contrib op version bump and deprecation of TransposeMatMul (#5424 ) * revert contrib op version bump and deprecation of TransposeMatMul * update documentation	2020-10-12 13:02:15 -07:00
Ashwini Khade	3f00b8db8f	move all experimental ops to version 1 of ms domain (#5287 ) * move all experimental ops to version 1 of ms domain * deprecate TransposeMatMul in favor of FusedMatMul * update documentation	2020-09-30 14:50:18 -07:00
Nat Kershaw (MSFT)	8a03b6e5c7	Render Operator documentation as compliant markdown (#3658 )	2020-09-02 15:07:50 -07:00
Hariharan Seshadri	1599562016	Fix BatchNorm CUDA kernel definition	2020-04-18 17:21:29 -07:00
Hariharan Seshadri	b4457ecb7a	Fix `gen_doc` build option and refresh documentation (#3545 ) * Support listing keys in custom metadata map via C/C++ API * nit * PR feedback * Nit * Initial commit * More changes * Support listing keys in custom metadata map via C/C++ API * nit * PR feedback * Nit * Initial commit * More changes * Add md files * Doc changes * Update * revert cmake changes * Update * Doc change * Update * Update	2020-04-17 14:41:04 -07:00
David Fan	c9d83a52a8	Implement contrib op CropAndResize (#1277 ) * Implement contrib op CropAndResize * Implement contrib op CropAndResize	2019-06-24 18:34:35 -07:00
Hariharan Seshadri	c69dff7928	Implement contrib kernels for Pad (changed interface) and Unique (new ONNX op) (#1006 ) * Initial commit * Rename DynamicPad to Pad * More changes * Add Unique operator * Revert accidental check-in * Fix CUDA Pad to align with changes * More changes * Fix more CUDA pad source files * More fixes * More changes * More changes * Avoid vector copy * Update vector validation logic * Fix build failures * Fix build * Fix build failure * Fix tensorrt build	2019-05-13 13:10:18 -07:00
shahasad	306453f9d6	fix the link to the script in the doc. fix some error messages (#960 )	2019-05-02 19:21:41 -07:00
shahasad	2c46fff69a	Enable gen-doc on windows CI (#716 ) * add --gen_doc to ci_build * make gen-doc conditional to build/test step * some fix in the git diff check * some more trick on doc diff * updated for input/output * updated the contrib operator doc * fix on missing input output descriptions * fixed the problem of missing doc string, due to protobuf optimization * fix * revert last change * moved gen_doc.py to /tools/python * fixed typo	2019-05-01 14:58:21 -07:00
shahasad	83ae641425	add documentation for custom ops (#708 ) * added tools for doc gen, added doc * doc updated * some fixes * hooked up with build.py * hooked up with build.py and fail on nonupdated doc * update	2019-03-26 21:58:01 -07:00

25 commits
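
The n-gram blocking rule described in commit 51c12a715b (NGramRepeatBlock) can be illustrated with a small pure-Python sketch. This is not the contrib op's kernel or its actual interface: the function name `ngram_repeat_block`, the nested-list inputs, and the parameter name `ngram_size` are assumptions made for illustration. It only shows the stated behaviour, namely that a token's score becomes `-inf` when appending that token to `input_ids` would repeat an n-gram already present in the sequence.

```python
import math
from typing import List


def ngram_repeat_block(
    input_ids: List[List[int]],
    scores: List[List[float]],
    ngram_size: int,
) -> List[List[float]]:
    """Return a copy of `scores` with repeat-forming tokens set to -inf.

    Hypothetical helper for illustration only; not the ONNX Runtime API.
    input_ids: one list of already generated token ids per batch entry.
    scores:    one list of next-token scores (indexed by token id) per entry.
    """
    blocked = [row[:] for row in scores]  # do not mutate the caller's data
    for b, ids in enumerate(input_ids):
        if ngram_size <= 0 or len(ids) < ngram_size:
            continue  # sequence too short to contain any full n-gram
        # The last (ngram_size - 1) tokens would start the new n-gram.
        prefix = ids[len(ids) - ngram_size + 1:]
        # Scan every n-gram already in the sequence.
        for start in range(len(ids) - ngram_size + 1):
            if ids[start:start + ngram_size - 1] == prefix:
                # Emitting this token next would repeat an existing n-gram.
                banned = ids[start + ngram_size - 1]
                blocked[b][banned] = -math.inf
    return blocked


result = ngram_repeat_block([[5, 6, 7, 5, 6]], [[0.0] * 8], ngram_size=3)
```

In this example the sequence ends with `5, 6`, and the trigram `5, 6, 7` already occurs, so only token 7 is blocked. The commit message notes that the operation can be highly parallelized; this sequential loop makes no attempt to do so.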