* Added Trilu CUDA kernel.
* Added TriluGrad.
* Added a training testcase for Trilu.
* Added Trilu gradient checker test.
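
For reviewers, a minimal reference sketch of what the gradient is expected to compute, assuming TriluGrad follows the ONNX Trilu semantics (`upper` selects the triangle, `k` shifts the diagonal). Trilu is a pure mask, so the gradient with respect to the input should be the upstream gradient under the same mask. The function names `trilu` and `trilu_grad` are hypothetical illustrations in plain Python for 2-D inputs, not the kernel or API added in this change.

```python
def trilu(x, k=0, upper=True):
    """Keep the upper (or lower) triangle of a 2-D list and zero the rest.

    upper=True keeps elements with col - row >= k,
    upper=False keeps elements with col - row <= k.
    """
    return [
        [v if ((j - i >= k) if upper else (j - i <= k)) else 0
         for j, v in enumerate(row)]
        for i, row in enumerate(x)
    ]


def trilu_grad(dy, k=0, upper=True):
    # Trilu only masks elements, so the input gradient is the
    # upstream gradient passed through the same mask.
    return trilu(dy, k, upper)


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
dy = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

print(trilu(x))                    # upper triangle, main diagonal included
print(trilu_grad(dy, upper=False))  # gradient mask for the lower triangle
```

A gradient checker test can compare a numerically estimated Jacobian against a reference of this kind.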