Summary:
Until now, DPer examples have been creating multiple copies of the transform config in the net
definition, which meant I hit the ProtoBuf limit (64MB) for certain Task requests (especially
visible because of the ValidationPipeline I was adding).
After this diff we store SigridTransforms in a single instance per machine for training
(or one instance per reader).
The difference in plan size for a simple SparseNN model is ~30 MB (even though the second model includes a validation plan as well).
TODO: Do similar logic for NNPreProc as well (it's also pretty large).
Reviewed By: dzhulgakov
Differential Revision: D4441441
fbshipit-source-id: 4452dd86a4dc49b2c7f5b7642f443aed5720b047
Summary:
Spatial Softmax allows specifying locations that are not counted toward the loss. If none of the locations are counted, this resulted in NaNs and headaches. This diff fixes that by explicitly handling these cases.
Also added an assertion on the label blob's dimension(0).
Created a new test as well.
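For illustration only, a hypothetical numpy sketch (not the operator code) of one way such NaNs can arise when no location is counted, and the kind of explicit handling the fix applies:
```
import numpy as np

def masked_mean_loss(per_location_loss, count_mask):
    counted = count_mask.sum()
    if counted == 0:
        # explicitly handle "nothing counted" instead of dividing by zero -> NaN
        return np.float32(0.0)
    return (per_location_loss * count_mask).sum() / counted
```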
Differential Revision: D4442939
fbshipit-source-id: 8641bfad2a994e517ca3eda39345380a6ca1ba50
Summary:
When testing the code, a couple of issues arose:
- we need a different name for the last layer than in the preprocessed model, otherwise a shape assertion is triggered
- preprocess_noaugmentation still needs to crop images larger than 227x227, otherwise things fail.
Reviewed By: viswanathgs
Differential Revision: D4442700
fbshipit-source-id: 05f54e7f17c266280f5ba5bb57af1721fe30df12
Summary:
It helps when developing scripts locally (outside of Flow): one doesn't have to rerun the script to catch an exception in the debugger or to add a print statement. (Flow does this kind of thing automatically.)
Usage example:
```
from caffe2.python import workspace

if __name__ == '__main__':
    workspace.GlobalInit(['caffe2', '--caffe2_log_level=2'])
    from caffe2.python.utils import DebugMode
    DebugMode.enable()
    DebugMode.run(main)
```
Reviewed By: Yangqing
Differential Revision: D4424096
fbshipit-source-id: 73f418c80f581820e70139df7e166981e4d8c55f
Summary:
Some tweaks, hopefully getting us to 0.98 MAP:
- no cropping for the test dataset (as per patrick)
- spatialBN momentum 0.1 (the default is 0.9)
Also added some additional logging, and reduced the frequency of running the test net and of logging.
Reviewed By: viswanathgs
Differential Revision: D4439790
fbshipit-source-id: 700705b811a5fc8c7139a265de96db646605ca5a
Summary:
In this diff:
[1] Change the output from generating all paths from the root to the labels to a TreeProto.
The TreeProto itself is required by inference, and we can use hsm_util to get the
paths from the TreeProto.
[2] Fix the hsm_util index assignment.
Differential Revision: D4416731
fbshipit-source-id: 657d8b9b4df6fa30c9f92d391cf7e07b5c5db1f8
Summary: Change label indices to be in the range [0, num_classes).
Differential Revision: D4416685
fbshipit-source-id: b16ca8539fd538ad62bf1298dbad3f1553956241
Summary:
Minor bug in D4426513 - the bias is always added as an input blob. Running it on xray throws "RuntimeError: [enforce fail at operator.cc:25] blob
!= nullptr. op Conv: Encountered a non-existing input blob:
caffe.SpatialConvolution_0_b"
Reviewed By: Yangqing
Differential Revision: D4429231
fbshipit-source-id: 0d3905ea6e87128ec1aa9d0f0a2f43126b1069b1
Summary:
Turns out xray models have some independent Scale layers (with bias) besides
the Conv-Scale pairs. We could still fuse these with the previous layers with some
work, but for simplicity, we emit a Mul op plus an Add op for the bias when needed.
We could revisit layer-fusion optimizations in the future once we have
something working for xray.
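A rough sketch of what such a translation could look like in Caffe2 (blob names, op order, and broadcast arguments here are illustrative assumptions, not the translator's actual output):
```
from caffe2.python import core

net = core.Net("scale_translation_sketch")
# y = x * scale, broadcasting the per-channel scale over an NCHW input
net.Mul(["x", "scale_w"], "y", broadcast=1, axis=1)
# y = y + bias, only emitted when the Scale layer actually has a bias term
net.Add(["y", "scale_b"], "y", broadcast=1, axis=1)
```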
Reviewed By: Yangqing
Differential Revision: D4427266
fbshipit-source-id: ef7d8677ccd7d10dbd20759eeed378d9bc4522d1
Summary: Now that we directly support group convolution, this will no longer be needed. I also took the chance to add dilated convolution and optional bias.
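For illustration, a Conv op using these arguments might look like the following minimal sketch (blob names and parameter values are made up):
```
from caffe2.python import core

net = core.Net("conv_sketch")
# grouped, dilated convolution; passing only two inputs (no bias blob) makes the bias optional
net.Conv(["data", "conv_w"], "conv_out",
         kernel=3, stride=1, pad=2, dilation=2, group=4)
```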
Reviewed By: prigoyal
Differential Revision: D4426513
fbshipit-source-id: eb2bb0aa619f8ff5f732512570f736bc59cd57dd
Summary:
This is a handy tool for amortizing expensive operators (e.g.
distributed communication, some heavier kernel launches, etc.) over a
lot of small blobs (e.g. all the biases in a network). We can coalesce
these small blobs in-place into a single blob, keep acting on them in
operators as if they were not coalesced (passing them as inputs to
operators, etc.), and then, for the heavier operators, just work on
the coalesced blob that contains all of these units.
I named it UnsafeCoalesce since it introduces blob aliasing, which
needs care for work like memory management and graph rewriting (as in
memonger, etc.).
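A minimal sketch of the intended usage (blob names, and the exact output arity of the op, are assumptions here):
```
from caffe2.python import core

net = core.Net("coalesce_sketch")
small_blobs = ["fc1_b", "fc2_b", "fc3_b"]   # e.g. all the biases in a network
# Re-emit the small blobs as aliases into one contiguous blob; lightweight ops keep
# using the individual blobs, while heavy ops (e.g. communication) use the big one.
net.UnsafeCoalesce(small_blobs, small_blobs + ["coalesced_biases"])
```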
Reviewed By: Yangqing
Differential Revision: D3557149
fbshipit-source-id: 09cff4459b84270fe9e1da3b4a168fd66d01f795
Summary: Failing fast instead of swallowing the bias term.
Differential Revision: D4419130
fbshipit-source-id: 98ce0af9a20adecfb027ffe8293ff69910873abc
Summary:
Simple tool, similar to caffe_translator_test.py, for converting models from Caffe to
Caffe2. There are a couple of issues that still need to be fixed, as mentioned in
https://our.intern.facebook.com/intern/tasks?t=15424761, especially related to
the 'legacy_pad' field in the conv op.
Differential Revision: D4407146
fbshipit-source-id: ec641f6d7e0cf6cdf2eca21f058b4451635d4a56
Summary: The data parallel model has a sanity check that ensures that operators' inputs/outputs do not cross device boundaries. This failed when the operator was a CPU-only operator (such as the new AccuracyOp version). This fixes that.
Reviewed By: prigoyal
Differential Revision: D4417841
fbshipit-source-id: 9bc4e7a2074a544ca4db69ecf24183bbd41f84ca
Summary: Github import didn't work and the manual import lost some files.
Reviewed By: Yangqing
Differential Revision: D4408509
fbshipit-source-id: ec8edb8c02876410f0ef212bde6847a7ba327fe4
Summary:
It looks like for types that are created directly through a type(...)
call, we don't store strong references anywhere. As a result,
a GC pass in Python might or might not clean up these classes depending on the
phase of the moon and other random things. This means that in some
cases simple layers such as a Relu might disappear.
cat_shame
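A minimal standalone sketch of the underlying Python behavior (nothing here is layers code; the class name is made up):
```
import gc
import weakref

# Classes created via type(...) are only kept alive by whoever references them.
ReluLike = type('ReluLike', (object,), {})
probe = weakref.ref(ReluLike)

registry = [ReluLike]   # the fix amounts to holding a strong reference somewhere
del ReluLike
gc.collect()

assert probe() is not None   # survives only because the registry still points at it
```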
Reviewed By: xianjiec
Differential Revision: D4396289
fbshipit-source-id: ba4e9b7ef54ee43349853b0acc3d3f40c74e4d73
Summary:
(Ignore the convolution-op related changes; they will be patched separately later.)
This diff includes work from the last few weeks:
- some refactoring of the flow ops
- no_bias setting
- MAP computation (instead of accuracy) for OC
- adaptive learning rate for Xray concepts
- various small bug fixes
Reviewed By: viswanathgs
Differential Revision: D4329500
fbshipit-source-id: 000d4fd22ec408af5290480c788eb86546bff52e
Summary: DivOp was missing a gradient for CUDA, so I implemented it. Also added an operator test.
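For reference, the elementwise quotient-rule math such a gradient kernel needs to compute (a plain numpy sketch, not the operator code):
```
import numpy as np

a = np.random.rand(4) + 1.0
b = np.random.rand(4) + 1.0
z = a / b
dz = np.ones_like(z)      # upstream gradient dL/dz
da = dz / b               # dL/da =  dL/dz * 1/b
db = -dz * z / b          # dL/db = -dL/dz * a / b**2
```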
Differential Revision: D4396638
fbshipit-source-id: 9949e47aa3735bb418a0db003e2b2f4896056a71
Summary:
This diff brings us roughly to par with Torch on ResNet memory usage. With batch size 32, ResNet-50 took 7497 MiB before and 5010 MiB after this diff. This will thus allow us to handle 64 images / GPU, or 256 images / 4 GPUs.
In addition, I added a special argument to DagNet that makes it run only one thread for the first iteration. This is needed because there are allocations during the first iteration's backward pass due to gradient sharing, and these would cause NCCL to deadlock.
Sharing gradient buffers requires inferring which gradients can share memory (i.e. that they are not used concurrently). The previous memonger code used a topological sort, but rbgirshick showed that it does not work with tree-like models. So I wrote a new optimization algorithm based on DFS. It takes about 0.25 secs / GPU on ResNet-50, so it is clearly fast enough.
Module data_parallel_model supports this feature natively.
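As a toy illustration of the general idea (a greedy lifetime-based sketch, not the DFS algorithm in this diff; the blob/op representation is made up): a blob may reuse a buffer once its previous owner has no remaining consumers.
```
def plan_sharing(ops):
    """ops: list of (inputs, outputs) pairs of blob names, in execution order."""
    last_use = {}
    for step, (inputs, _) in enumerate(ops):
        for blob in inputs:
            last_use[blob] = step
    storage, free = {}, []
    fresh = 0
    for step, (inputs, outputs) in enumerate(ops):
        for blob in outputs:
            if free:                          # reuse a buffer nobody reads anymore
                storage[blob] = free.pop()
            else:                             # otherwise allocate a fresh shared buffer
                storage[blob] = "shared_%d" % fresh
                fresh += 1
        for blob in inputs:
            if last_use[blob] == step and blob in storage:
                free.append(storage[blob])    # this blob has no remaining consumers
    return storage                            # blob name -> shared buffer name
```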
Reviewed By: prigoyal
Differential Revision: D4363209
fbshipit-source-id: 73b11e7610438098bb11bff0af8075ab0cf2c0f1
Summary:
Adds a thread pool for image decode, and optional GPU-based data conversion, mean subtraction and std division
Closes https://github.com/caffe2/caffe2/pull/56
Reviewed By: Yangqing
Differential Revision: D4341326
Pulled By: bwasti
fbshipit-source-id: 6485616ea7d212c7701274a40fae912db30dff4a
Summary:
This normalizes the sparse gradient so that the "effective learning rate" of each sparse parameter is NOT affected by the number of examples in a batch that "use" this sparse parameter.
Experiments show it helps convergence (about 0.1% better train NE): https://fburl.com/1230747813683956. It's not conclusive yet, and we still need to do more experiments, but this diff adds it as an option and does not change the default behavior, so we can get it in first.
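A hypothetical numpy sketch of the idea (not the actual operator): the gradient accumulated for each sparse row is divided by how many examples in the batch touched that row, so frequently hit rows don't receive larger effective updates.
```
import numpy as np

def normalize_sparse_grad(indices, grads):
    """indices: (N,) int array of row ids; grads: (N, D) per-example gradients."""
    counts = np.bincount(indices, minlength=indices.max() + 1).astype(grads.dtype)
    out = np.zeros((counts.shape[0], grads.shape[1]), dtype=grads.dtype)
    np.add.at(out, indices, grads)              # scatter-add per-example gradients
    nonzero = counts > 0
    out[nonzero] /= counts[nonzero, None]       # average instead of sum per row
    return out
```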
Differential Revision: D4367283
fbshipit-source-id: 49ea80dfa9ea776ff4160e220cf6c86593521607
Summary: This diff adds a gflag for specifying the path for htrace span log files. This flag is used by the net types `HTraceDAGNet` and `HTraceAsyncDAGNet`.
Differential Revision: D4366849
fbshipit-source-id: 56038d3d64a3fd5ab363feda86a19a6f2496971c
Summary:
Rewrite of D3993337 based on the new stack.
Compared to the old one, we need more readers to achieve the same speed. But so far the speed is the same, and the new bottleneck is the trainer's write bandwidth. Model quality is the same as the baseline.
Reviewed By: azzolini
Differential Revision: D4310803
fbshipit-source-id: 6d04ae8040c1ee7caa9aea5287f054e73fbe325a
Summary: As title. We want to have a request_only net which runs on user_only sparse features. Submitting to get early feedback.
Reviewed By: dzhulgakov
Differential Revision: D4282783
fbshipit-source-id: 71241bf5444550075884c788c2da4783659bc1e0
Summary: Recently a PR landed that removed the assertion raised when feeding float64 data to FeedBlob for GPUs and turned it into a warning. Thus the test that checked for that assertion started to fail. Removing it.
Reviewed By: Yangqing
Differential Revision: D4363780
fbshipit-source-id: d9e222c309302243138d4ff3c223c711a4d2052d
Summary:
I was testing the perf difference between naive group conv and cudnn group conv. I am doing no_bias conv, so I added support for that in the naive implementation.
Although it is deprecated, I thought it would be nice to keep things working in our code.
Differential Revision: D4363168
fbshipit-source-id: 29719013d79b449fd359884709c7a1195be51ae3
Summary: As per discussion in D4355529
Reviewed By: prigoyal
Differential Revision: D4362162
fbshipit-source-id: 795fcf1507235a7dc3c7a10b0453037936d057aa
Summary:
Essentially, when the number of pairs is around 1000, the positive samples in the list get a massive boost from all the negative examples. This diff normalizes the gradient and the loss by the number of pairs.
This diff also adds protection against NaN and more logging to help debugging.
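A rough numpy sketch of the normalization (the loss form and names are illustrative assumptions, not the actual ranking loss used here):
```
import numpy as np

def normalized_pairwise_loss(pos_scores, neg_scores):
    """pos_scores, neg_scores: 1-D numpy arrays of scores for one list."""
    diffs = pos_scores[:, None] - neg_scores[None, :]   # one entry per (positive, negative) pair
    per_pair = np.log1p(np.exp(-diffs))                 # e.g. a logistic pairwise loss
    return per_pair.sum() / max(per_pair.size, 1)       # divide by the number of pairs
```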
Reviewed By: kdub0
Differential Revision: D4359782
fbshipit-source-id: 7240344ddb1f2f670d1eec1b03e7f6e413f3dfcc
Summary:
It used to be that only the cudnn engine supported convolution without bias; now it should be
fully supported by any conv engine.
To ignore the bias, simply use a convolution op that has two inputs instead of
three. The gradient operator will automatically figure out that it does not
need to compute the bias gradient.
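A minimal sketch contrasting the two forms (blob names are made up):
```
from caffe2.python import core

net = core.Net("conv_bias_sketch")
net.Conv(["data", "w", "b"], "out_with_bias", kernel=3)  # three inputs: bias gradient is computed
net.Conv(["data", "w"], "out_no_bias", kernel=3)         # two inputs: bias is skipped entirely
```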
Reviewed By: prigoyal
Differential Revision: D4354183
fbshipit-source-id: cf71b6289a254d15a6a663a85df63fbbaec3702b