Summary:
`2to3` has a `future` fixer that can be targeted specifically to remove these imports. The `caffe2` directory has the most redundant imports:
```2to3 -f future -w caffe2```
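To check which directory carries the most `from __future__` imports before running the fixer, a small counting script can help. This is a minimal sketch, not part of the PR: the helper name `count_future_imports` and the regular expression are assumptions made for illustration, and it only counts single-line import statements.

```python
import os
import re
from collections import Counter

# Matches a line that begins a `from __future__ import ...` statement.
FUTURE_IMPORT = re.compile(r"^\s*from\s+__future__\s+import\b")


def count_future_imports(root):
    """Count `from __future__` import lines per top-level directory under root.

    Hypothetical helper for illustration; it is not part of 2to3 or PyTorch.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = rel.split(os.sep)[0]  # "." for files directly under root
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as f:
                counts[top] += sum(1 for line in f if FUTURE_IMPORT.match(line))
    return counts


if __name__ == "__main__":
    # Print directories ordered by how many such imports they contain.
    for directory, n in count_future_imports(".").most_common():
        print(directory, n)
```

Run from the repository root, this lists top-level directories by import count, which is one way to confirm where the fixer will have the most effect.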
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033
Reviewed By: seemethere
Differential Revision: D23808648
Pulled By: bugra
fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
Added a caffe2 math sum operator that takes integers (int32 only).
Changed SumFloatIter to SumGenericIter so that it supports more than one type.
Added a sumElementInt operator.
Summary: The old version used one block with 128 threads. Throughput was too low for the NMT use case (calculating squared gradient norms for every parameter), so this change increases the throughput. Shaves 7% off CNN model training time per step.
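The commit does not say how the throughput was increased. One common approach, assumed here rather than taken from the change itself, is a two-stage reduction: many blocks each compute a partial sum, and a final step combines the partials. The sketch below simulates that structure on the CPU in plain Python. The names `block_partial_sums` and `two_stage_sum` and the `num_blocks` parameter are hypothetical.

```python
def block_partial_sums(values, num_blocks):
    """Split values into up to num_blocks contiguous chunks and sum each chunk.

    Each chunk stands in for the work one GPU block would do (assumption).
    """
    chunk = max(1, -(-len(values) // num_blocks))  # ceiling division
    return [sum(values[i:i + chunk]) for i in range(0, len(values), chunk)]


def two_stage_sum(values, num_blocks=4):
    """Stage 1: per-block partial sums. Stage 2: reduce the partials."""
    partials = block_partial_sums(values, num_blocks)
    return sum(partials)


if __name__ == "__main__":
    data = list(range(10))
    print(block_partial_sums(data, 4))  # [3, 12, 21, 9]
    print(two_stage_sum(data, 4))       # 45
```

With a single block, stage 1 degenerates to one sequential sum, which mirrors the bottleneck the old version describes.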
Reviewed By: wickedfoo
Differential Revision: D5263748
fbshipit-source-id: adc3bacd11e49ea00c60381d613d993050e899be
Summary:
Added SumSqrElements, which avoids the large temporary blob needed when doing Sqr + SumElements.
Also moved to reduction_ops, because utility_ops has grown too big.
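The saving described above can be illustrated in plain Python. This is a sketch of the idea only, not the caffe2 operator: the function names `sum_sqr_two_pass` and `sum_sqr_fused` are hypothetical, and lists stand in for blobs.

```python
def sum_sqr_two_pass(xs):
    """Sqr followed by SumElements: materializes a temporary as large as the input."""
    squared = [x * x for x in xs]  # temporary "blob" of len(xs) elements
    return sum(squared)


def sum_sqr_fused(xs):
    """Fused sum of squares: accumulates directly, with no temporary buffer."""
    total = 0.0
    for x in xs:
        total += x * x
    return total


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print(sum_sqr_two_pass(data))  # 14.0
    print(sum_sqr_fused(data))     # 14.0
```

Both functions return the same value. The fused version only needs a single accumulator, which is the memory saving the commit attributes to SumSqrElements.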
Reviewed By: jamesr66a
Differential Revision: D4844172
fbshipit-source-id: 032eec45e24d6724f0d5fb83f4ec1c771d1146e5