Fix a few obvious issues:
(1) bert_perf_test.py creates a session without a provider at line 65.
(2) compare_bert_results.py is missing a parameter in the create_session call at line 37.
(3) onnx_exporter.py has mismatched return values at lines 667 and 690.
(4) Remove unused imports from the scripts.
(5) fusion_utils need not print "Removed 0 cast nodes" or "Removed 0 Identity nodes"...
(6) Update the numpy version requirement, since the gpt2 parity tool uses equal_nan in numpy v1.19+.
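Items (1) and (2) both come down to passing the execution providers explicitly when a session is created. The following is a minimal sketch of that idea, not the code in the scripts: `select_providers` and this `create_session` are hypothetical helpers written for illustration, and onnxruntime is imported lazily so the provider-selection logic can be read and run on its own.

```python
from typing import List


def select_providers(use_gpu: bool, available: List[str]) -> List[str]:
    """Return an explicit provider list, using CUDA only when requested and present."""
    providers = []
    if use_gpu and "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    # CPU is always listed last so it acts as the fallback.
    providers.append("CPUExecutionProvider")
    return providers


def create_session(model_path: str, use_gpu: bool):
    """Hypothetical helper: build an InferenceSession with providers stated explicitly."""
    # Imported lazily so this sketch can be loaded without onnxruntime installed.
    import onnxruntime as ort

    providers = select_providers(use_gpu, ort.get_available_providers())
    return ort.InferenceSession(
        model_path, sess_options=ort.SessionOptions(), providers=providers
    )
```

Stating the provider list at the call site makes it clear which device a benchmark or parity check actually ran on.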
* Adding Split Fusion
* Make changes to comments
* Format files and fix typo
* Format file
* Format files
* Update stale.yml
Change the number of days of inactivity before an issue becomes stale from 60 to 5 and the number of days of inactivity before a stale issue is closed from 7 to 5. Update the exempt labels based on the redefined set of GH labels.
* Implement stale.yml feedback.
**Description**: Create codeql.yml to replace LGTM
**Motivation and Context**
LGTM.com is shutting down and moving to GitHub code scanning. This PR enables GitHub code scanning.
C++ and C# support will be added in a separate PR.
**Description**: Remove reference to the deprecated variable in `torch.onnx.symbolic_helper` pytorch/pytorch#81953
- Removed unused imports
- Changed BANNED_AUTOGRAD_FUNCTION_NAMES to a frozenset
**Motivation and Context**
The `cast_pytorch_to_onnx` variable has been deprecated and removed from `torch.onnx.symbolic_helper`. Since scalar types still need to be converted to ONNX types, I copied the mapping to `_CAST_PYTORCH_TO_ONNX` in the module.
* upgrade emsdk to 3.1.19
* fix build break
* ignore '-Wunused-but-set-variable' in eigen
* add malloc and free in exported functions
* EXPORTED_FUNCTIONS
Shape Inference and Model Optimization before Quantization
Model quantization in QDQ format, i.e. inserting QuantizeLinear/DeQuantizeLinear on
tensors, needs tensor shape information to give its best results. Currently, shape inference
works best on an optimized model. As a result, it is highly recommended to run quantization
on an optimized model that carries shape information.
This change adds code that prepares a model for quantization in the following three steps:
1. Symbolic shape inference
2. Model optimization
3. ONNX shape inference
At the same time, we should recommend turning model optimization off during quantization itself,
because the optimization might change the computation graph, making it harder for the QDQ debugger
to locate matching tensors between the original and the quantized models.
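The three pre-processing steps listed above could be chained roughly as follows. This is a sketch under stated assumptions, not the code added by this change: `preprocess_for_quantization` is a hypothetical helper, the use of `auto_merge=True` and the basic optimization level are choices made for the example, and onnx and onnxruntime are imported lazily inside the function.

```python
import tempfile
from pathlib import Path


def preprocess_for_quantization(input_path: str, output_path: str) -> None:
    """Hypothetical helper: symbolic shape inference, optimization, ONNX shape inference."""
    # Imported lazily so this sketch can be loaded without onnx/onnxruntime installed.
    import onnx
    import onnxruntime as ort
    from onnx import shape_inference
    from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference

    with tempfile.TemporaryDirectory() as tmp:
        symbolic_path = str(Path(tmp) / "symbolic.onnx")
        optimized_path = str(Path(tmp) / "optimized.onnx")

        # Step 1: symbolic shape inference (auto_merge is an assumption of this sketch).
        model = SymbolicShapeInference.infer_shapes(onnx.load(input_path), auto_merge=True)
        onnx.save(model, symbolic_path)

        # Step 2: let ONNX Runtime optimize the graph and write the result to disk.
        options = ort.SessionOptions()
        options.optimized_model_filepath = optimized_path
        options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_BASIC
        ort.InferenceSession(symbolic_path, options, providers=["CPUExecutionProvider"])

        # Step 3: ONNX shape inference on the optimized graph.
        model = shape_inference.infer_shapes(onnx.load(optimized_path))
        onnx.save(model, output_path)
```

Running optimization here, ahead of quantization, is what allows it to stay switched off during quantization, so the QDQ debugger compares graphs with matching tensor names.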
* fix shape mismatch in FuseConv
* remove zeroed bias
* offset Z dim
* add unit tests
* add testing model
* remove output
* remove commented-out code
* fix comments
* refactor output msg
* narrowly restrict the use of cudnn...ActFwd
* reset changes in cudnn_common
* add test cases covering all paths
* move cases to conv test
* remove extra space
* fix build err
Co-authored-by: Randy Shuai <rashuai@microsoft.com>