Make kernels non-template. Add input constraint for learnt data.
Fixup tests.
Add two more featurizers along with tests. Tests fail.
min_max_scalar_transformer
robust_scalar_transformer
Fix tests serialized stream by prepending version bytes.
Add inputation_marker_transfomer and the test.
Fix up float/double type designations.
Added label_encoder_transformer along with a test.
string_throw case is broken at the momement.
Fix labelencodertransfomer_test.cc string_throw case
Rename maxabsscalertransformer_test.cc
Add MissingDummiesTransformer along with the test.
Update manifest.
Add TimeSeriesImputerTransformer definition, implementation and tests
* don't run cuda tests if building with tensorrt
* remove unnecessary build options for win trt ci
* refactor win gpu tensorrt ci yml
* --numpy_version=1.17
* update
* update
* azcopy and cuda path
Added the optimized implementation for depthwise convolution for both ACL v19.02 and ACL 19.05.
Also the pointwise convolution seems to be more optimal in the CPU implementation so we opted for that instead.
When it is posible we use a fully connected layer instead of the gemm implementation.
This will let the library use the best implementation based on the input data.
* Implement a more stable SoftMax
e^x is represented as infinity if x is large enough, like 100.f. Infinity divided by Infinity is a NAN. Thus, softmax gets a NAN if one or more item are large enough.
A math transform as below is leveraged to get a stable softmax:
e^xi/(e^x1 + ...e^xn) = e^(xi - max) / (e^(x1 - max) + ... + e^(xn - max))
And for convenience, force max to 0.f if all xi are negative
1. Pipeline changes for python 3.8
2. Fix a regression in setup.py which was just introduced in the previous commit.
Please notice, we still haven't made python 3.8 + Windows + CUDA work.
* Add schema for new Qops
* adding shape inference + qlinearaveragepool
* plus review comments
* plus review comments
* updates per review comments
* plus review comments
1. Reflect int8 GEMV improvements for multi-threading from #2696
2. Add notes on multi-threading control using OpenMP
3. Add samples of running multi-isa AOT, and show int8 GEMM differences between AVX and AVX2
4. Add rnn_benchmark example to resolve#1993
* Additional DML operators
* Check unsupported attributes and inputs
* Address PR comments
* Add kernel capability function used for partitioning, and re-enable stride-based int64 support based on value range
* Fix test failures
* Build fix
* PR comments
* Fix C# handling of unicode strings
* more tests
* check for handle before freesing
* variable reuse efficiency
* refactor and cleanup utf8 o utf16 conversion block
Update featurizers. Fix up constraint issue.
Pass static VCRT library option down to Featurizers CMAKE.
Make build Featurizers OFF by default.
Rename registration call.