Revision 40af7bb68fa3ca245676b81938568aedefcb5ece authored by Bowen Bao on 18 September 2018, 01:10:34 UTC, committed by Bowen Bao on 18 September 2018, 01:10:34 UTC
* CPU.
  - Short-circuit the call to ComputeConvolveGeometryExplicit() when
    MKL is enabled.
  - Replace div/mod inside ComputeConvolveGeometryExplicit with
    fast_divmod.
* GPU.
  - Instead of creating a new CuDnnConvolutionEngine for every batch,
    just update the geometry-related info, and try to reuse the
    workspace memory from the previous run.
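
The CPU item above replaces div/mod with fast_divmod. As a rough
illustration of the general idea (not the code in this commit), division
by a divisor that stays fixed across many calls can be precomputed into a
multiply, an add and a shift. The sketch below uses the well-known
multiply-and-shift scheme for 32-bit unsigned values; the names
FastDivMod and Unflatten are hypothetical and chosen only for this
example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a fast_divmod helper (illustration only).
// Precomputes a multiplier and a shift so that n / d and n % d for a
// fixed divisor d need no hardware divide.
// Assumes 1 <= d < 2^31; n may be any 32-bit unsigned value.
struct FastDivMod
{
    uint32_t d;     // the fixed divisor
    uint32_t mul;   // precomputed "magic" multiplier
    uint32_t shift; // smallest s with 2^s >= d

    explicit FastDivMod(uint32_t divisor) : d(divisor), mul(0), shift(0)
    {
        assert(d >= 1 && d < (uint32_t(1) << 31));
        while ((uint32_t(1) << shift) < d)
            ++shift;
        const uint64_t one = 1;
        // mul = floor(2^32 * (2^shift - d) / d) + 1, which fits in 32 bits.
        mul = static_cast<uint32_t>(((one << 32) * ((one << shift) - d)) / d + 1);
    }

    // Quotient: the high 32 bits of mul * n, plus n, shifted right.
    uint32_t div(uint32_t n) const
    {
        const uint64_t t = (static_cast<uint64_t>(mul) * n) >> 32;
        return static_cast<uint32_t>((t + n) >> shift);
    }

    // Remainder, derived from the quotient.
    uint32_t mod(uint32_t n) const
    {
        return n - div(n) * d;
    }
};

// Example use: split a flat index into (row, col) for a fixed row width,
// the kind of index arithmetic that geometry computations repeat often.
inline void Unflatten(const FastDivMod& width, uint32_t index,
                      uint32_t& row, uint32_t& col)
{
    row = width.div(index);
    col = index - row * width.d;
}
```

The constructor still performs one real division, so this only pays off
when the same divisor is reused for many indices, which is the situation
the commit message describes.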
1 parent 1f28c7d