onnxruntime/orttraining
Pranav Prakash 3c5d02a9ce
Implement BatchNormGradient kernel for CPU EP (#7622)
**Description**: Register an implementation for BatchNormInternal and
add a CPU kernel for BatchNormGradient. This is the third in a series of
PRs to implement BN training on CPU (first was #6946, second was #7539).

**Motivation and Context**
Support training networks with BatchNorm (e.g. convnets). Note that a
CUDA kernel for BN (forward training and backward) already exists, but
it is currently disabled due to flaky failures; someone more familiar
with that code can register the implementation for BatchNormInternal on
CUDA (the gradient kernel doesn't have to change).

---------

Co-authored-by: Simon Zirui Guo <simonguozirui@berkeley.edu>
Co-authored-by: mindest <linminuser@gmail.com>
Co-authored-by: mindest <30493312+mindest@users.noreply.github.com>
2023-04-08 09:20:26 +08:00
| Name | Last commit | Date |
| --- | --- | --- |
| orttraining | Implement BatchNormGradient kernel for CPU EP (#7622) | 2023-04-08 09:20:26 +08:00 |
| pytorch_frontend_examples | Enable pylint and numpy rules (#15218) | 2023-03-27 20:37:53 -07:00 |
| tools | Enable pylint and numpy rules (#15218) | 2023-03-27 20:37:53 -07:00 |