pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

jjsjann123 0dc3f829d9 Nvfuser code bump 11 5 (#67943)	2021-11-17 01:22:17 -08:00

Summary: nvfuser code update:
1. Tuning heuristics on schedulers for reduction/normalization kernels;
2. bfloat16 on IO tensor support;
3. Refactored memory format support, now we can support dimension collapsing with non-coherent input tensors with different memory format. e.g. channels last tensor input to batch normalization. Note that we are currently limiting memory format to only Contiguous and Channels last;
4. Refactored nvfuser graph partitioning in `graph_fuser.cpp`, separated node merge and profile node API. Updated `profiling_record.cpp`.

Things that are reverted from our local branch:
1. changes on some entries in autodiff
2. aten::gelu with approximation
3. native_dropout(_backward)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67943
Reviewed By: ngimel
Differential Revision: D32288709
Pulled By: dzhulgakov
fbshipit-source-id: fc9491182ea7e0158bc112c66f096823c588eaf1
nvfuser	Nvfuser code bump 11 5 (#67943)	2021-11-17 01:22:17 -08:00
tensorexpr	[PyTorch] Add int version of vectorized PrefixSum to Benchmark (#67865)	2021-11-04 14:00:19 -07:00
CMakeLists.txt	CPU Convolution benchmark harness for some popular models (#56455)	2021-04-22 22:14:36 -07:00
convolution.cpp	Disable `avoid-non-const-global-variables` lint check (#62008)	2021-07-22 18:04:40 -07:00