pytorch/caffe2/python/examples
Pieter Noordhuis d43ab4bec5 Create Gloo common world through MPI rendezvous
Summary:
Before this change, there were two ways for machines to rendezvous for a
distributed run: a shared file system or Redis. If you're using an MPI
cluster, it is much more convenient to simply execute mpirun and expect
the "right thing (tm)" to happen. This change adds the "mpi_rendezvous"
option to the CreateCommonWorld operator. If this option is set, the
common world size and rank are pulled from the MPI context, and the Gloo
rendezvous takes place using MPI. Note that this does NOT mean the MPI
BTL is used; MPI is only used for rendezvous.
Closes https://github.com/caffe2/caffe2/pull/1190
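
As a rough illustration of the choice described in the summary, the sketch below builds the arguments a caller might pass when creating a common world. It is not the code from this commit: the helper `common_world_args` and the keys `store_handler`, `rank`, and `size` are hypothetical names chosen for the example. Only the `mpi_rendezvous` option name comes from the summary. The sketch deliberately avoids importing Caffe2 or MPI so that it runs on its own.

```python
def common_world_args(use_mpi, store_handler=None, rank=None, size=None):
    """Build illustrative arguments for creating a common world.

    Hypothetical helper: only the "mpi_rendezvous" key is named in the
    commit summary; every other key is an assumption for this sketch.
    """
    if use_mpi:
        # With MPI rendezvous, rank and size are not supplied by the caller;
        # per the summary, they are pulled from the MPI context.
        return {"mpi_rendezvous": True}

    # Without MPI, the caller rendezvous through a store (shared file
    # system or Redis) and must say who it is and how many peers exist.
    if store_handler is None or rank is None or size is None:
        raise ValueError("store_handler, rank and size are required without MPI")
    if not 0 <= rank < size:
        raise ValueError("rank must be in the range [0, size)")
    return {"store_handler": store_handler, "rank": rank, "size": size}


# Launched under mpirun: nothing else needs to be specified.
mpi_args = common_world_args(use_mpi=True)

# Store-based rendezvous: second of four machines.
store_args = common_world_args(
    use_mpi=False, store_handler="store_handler", rank=1, size=4
)
```

The point of the design is visible in the two calls: the MPI path needs no per-machine configuration, while the store-based path requires each machine to be told its rank and the world size.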

Reviewed By: akyrola

Differential Revision: D5796060

Pulled By: pietern

fbshipit-source-id: f8276908d3f3afef2ac88594ad377e38c17d0226
2017-09-08 17:18:47 -07:00
char_rnn.py               Fixed typo                                         2017-06-23 14:02:40 -07:00
lmdb_create_example.py    Deprecate CNNModelHelper in lmdb_create_example    2017-06-19 13:04:02 -07:00
resnet50_trainer.py       Create Gloo common world through MPI rendezvous    2017-09-08 17:18:47 -07:00