Summary: Enables serialization between MKLMemory and TensorProto.
Reviewed By: dzhulgakov
Differential Revision: D4218044
fbshipit-source-id: 934181493b482cb259c17ff4b17008eac52fd885
(1) nccl submodule, cnmem submodule
(2) mpi ops fallback test
(3) a bit more blob interface
(4) fixed tests
(5) renamed caffe2.python.io to caffe2.python.dataio to avoid name conflicts
(6) In the build system, autogenerate __init__.py instead of keeping manual
rules just to copy over an empty __init__.py.
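
For illustration only, item (6) could be approximated by a small helper like the one below. This is a minimal sketch and not the actual build rule from this change: the function name ensure_init_files and its package_root argument are hypothetical, and it assumes the goal is simply an empty __init__.py in every directory of the Python package tree.

```python
import os


def ensure_init_files(package_root):
    """Create an empty __init__.py in each directory under package_root
    (including package_root itself) that does not already have one.

    Hypothetical helper, not part of the Caffe2 build system.
    Returns the list of files that were created.
    """
    created = []
    for dirpath, _dirnames, _filenames in os.walk(package_root):
        init_path = os.path.join(dirpath, "__init__.py")
        if not os.path.exists(init_path):
            # Opening in write mode and closing immediately leaves an empty file.
            with open(init_path, "w"):
                pass
            created.append(init_path)
    return created
```

Running it a second time on the same tree creates nothing, so a build step like this can be repeated safely.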