1. Reflect int8 GEMV improvements for multi-threading from #2696
2. Add notes on multi-threading control using OpenMP
3. Add samples of running multi-isa AOT, and show int8 GEMM differences between AVX and AVX2
4. Add rnn_benchmark example to resolve#1993
* [NupharEP] Add parallel schedule to JIT function name
Update Nuphar docker to use Python 3.6 and ubuntu 18.04
* Update notebook
* Avoid JIT cache file name conflict
* Fixed a bug of missing tvm in python wheel
* Put Nuphar Python scripts into wheel
* Add note book tutorial
* Some improvements in symbolic shape inference for quantized models