https://github.com/amirgholami/accfft

Revision Author Date Message Commit Date
2038542 removed unsupported compute capabilities and added new ones 05 November 2020, 11:48:01 UTC
133a585 error in GPU_float fixed 17 August 2018, 13:01:44 UTC
5cb08a4 Merge pull request #17 from pkestene/mpi_deprecated avoid datatype MPI::BOOL (deprecated) 04 May 2018, 23:08:13 UTC
8552f6e replace MPI::BOOL (MPI c++ bindings deprecated) by a custom MPI_Datatype (from dtypes.h) 04 May 2018, 10:19:04 UTC
0d734b8 Merge pull request #14 from alalazo/fixes/odr_violation_gpu Fix odr violation on GPUs + don't override user nvcc flags 23 February 2018, 21:56:48 UTC
409e830 Fixed build failures due to unsupported architectures When working on a machine that doesn't support the architectures that are appended unconditionally, the current makefile incurs a build failure. This should be fixed by making the defaults kick in only if the user didn't specify anything else on the command line. 21 February 2018, 15:28:19 UTC
416518e Fixed odr violation on explicit template instantiation A few templates are instantiated in a header file that is included multiple times in different compilation units. This causes a build failure due to an ODR violation. Moving the explicit instantiations to the single cpp file where they are needed solves the issue. 21 February 2018, 15:25:50 UTC
632d249 Merge pull request #13 from alalazo/fixes/linking_shared_libraries Fixed linking of step* executables when producing shared libraries 15 February 2018, 19:38:41 UTC
4134eab Fixed linking of step* executables when producing shared libraries Before this commit the library was not able to link the 'step*' executables when giving '-DBUILD_SHARED=true' (undefined references to a few symbols). 11 February 2018, 09:43:05 UTC
2199051 removed unnecessary operation in transpose 20 May 2017, 23:16:49 UTC
babb821 added support for outplace transpose 20 May 2017, 20:34:15 UTC
4475f3d added restrict keyword to transpose functions 20 May 2017, 02:21:41 UTC
113b3db removed debug info 10 April 2017, 15:20:01 UTC
22cf750 merged 10 April 2017, 03:10:57 UTC
f00f3f9 changed the layout for fast spectral operators so the FFT is performed on contiguous direction 09 April 2017, 23:49:20 UTC
c4b6ccf added self exec timer to gradient and divergence operators 05 April 2017, 20:27:07 UTC
2140179 removed unused variables 03 April 2017, 05:29:13 UTC
1eb9f32 added verbose2 definition to transpose 03 April 2017, 05:16:47 UTC
f6e0c4a guard against int overflow 02 April 2017, 22:54:10 UTC
97d17e6 removed unnecessary barriers 29 March 2017, 00:12:44 UTC
f70f053 removed barriers from operators.txx 28 March 2017, 22:15:35 UTC
d80b5d2 moved operator allocations to the memory manager 24 March 2017, 21:34:01 UTC
2a40c78 moved timings[5] to timings[6] for operators. t[5] is now empty 23 March 2017, 16:12:20 UTC
b005a97 added timers for operators, note that now you should pass a field of size 6 for the timings 23 March 2017, 04:06:47 UTC
f2eb48f added memcpy time to total transpose time for spectral operators 23 March 2017, 02:42:36 UTC
028f084 fix for a corner case in spectral operators 22 March 2017, 20:19:58 UTC
154b031 temporary switch to slow operators 22 March 2017, 01:32:28 UTC
7c0cb84 added plan templates for operators 17 March 2017, 17:22:51 UTC
02fefd2 added templated version of single and double precision 16 March 2017, 21:11:49 UTC
11c1d1b pushed the previous fix for single precision and gpu 11 March 2017, 23:22:21 UTC
0b0c207 fix for operators using irregular sizes 11 March 2017, 01:41:10 UTC
aa9870c fixed compile issue with gcc 28 February 2017, 11:57:02 UTC
9c37a08 debugged fast spectral solvers 27 February 2017, 23:36:42 UTC
8094869 fixed gcc compilation issue 21 February 2017, 17:28:07 UTC
dd02c1b code cleanup 21 February 2017, 01:40:01 UTC
fe62538 added fast versions of spectral operators 20 February 2017, 22:01:16 UTC
4cd49f4 fixed build issue for GPU 15 February 2017, 16:47:19 UTC
bcac4ee cleaned the code + filter nyquist in inverse laplace operator 15 February 2017, 16:23:43 UTC
ae56b40 corrected order of fftw library linking 12 January 2017, 22:38:51 UTC
5d73849 added header to soperators.txx 25 November 2016, 17:33:08 UTC
1c1bcf5 code cleanup 08 October 2016, 21:42:45 UTC
93cf04e minor change to step1 CMakeList 07 October 2016, 19:17:04 UTC
7d60a0c cleanup 18 September 2016, 17:30:12 UTC
e72664f changed create_comm message 30 June 2016, 03:10:29 UTC
eb337d4 fixed compiler warning for malloc 26 June 2016, 01:51:13 UTC
5c6064e fixed compiler warnings 26 June 2016, 01:38:00 UTC
4430898 Cmake find pnetcdf quietly, it is an optional package 26 June 2016, 00:39:57 UTC
10df740 out-of-order send for inverse 14 June 2016, 04:27:16 UTC
004249a mpi_waitall for v and v_h, also removed omp_parallel_for 14 June 2016, 04:12:35 UTC
9cfc60e MPI_STATUSES_NULL > MPI_STATUSES_IGNORE 14 June 2016, 03:52:50 UTC
00bd8d8 Merge pull request #8 from jeffhammond/fix-ulong fix ulong -> unsigned long 14 June 2016, 03:39:05 UTC
c45f68d Merge pull request #7 from jeffhammond/use-waitall replace loop over wait with waitall 14 June 2016, 03:38:29 UTC
ec90098 Merge pull request #6 from jeffhammond/remove-nilpotent-expressions Remove nilpotent expressions 14 June 2016, 03:34:34 UTC
0c85ca9 Merge pull request #5 from jeffhammond/use-calloc Use calloc 14 June 2016, 03:32:26 UTC
f0687ff fix ulong -> unsigned long 10 May 2016, 19:28:17 UTC
df99207 replace loop over wait with waitall 1) merge send and recv request vectors 2) set odd (even) values of request vector to send (recv) requests 3) call MPI_Waitall once instead of looping over two calls to MPI_Wait This should have a nontrivial impact when nproc is large and messages are completed by the network out-of-order w.r.t. the wait loop. (comment #2 may have send and recv backwards) 10 May 2016, 13:35:56 UTC
cb75449 remove loop with trip count of 1 10 May 2016, 13:12:40 UTC
49460fa remove 0*nprocs 10 May 2016, 13:11:47 UTC
ae075be replace 0*MPI_Wtime with 0 #2 10 May 2016, 13:11:12 UTC
179be93 replace 0*MPI_Wtime with 0 10 May 2016, 13:09:37 UTC
b196a29 eliminate unnecessary repeat log2() 10 May 2016, 09:38:34 UTC
0256c4a use calloc=malloc+memset(0) #2 10 May 2016, 09:36:03 UTC
b488850 BUGFIX: ptrdiff_t often wider than int, so this only zeros the lower half of the array 10 May 2016, 09:35:34 UTC
c032f80 use calloc=malloc+memset(0) 10 May 2016, 09:34:44 UTC
abb66aa made pnetcdf optional 04 May 2016, 01:59:35 UTC
77c3124 minor fix in debug mode print 13 April 2016, 20:17:44 UTC
4f320ad add c_comm to write_pnetcdf, and isize,istart to accfft_plan 22 March 2016, 17:16:22 UTC
d78abc1 bug fix for inverse laplace and biharmonic operators 07 March 2016, 23:04:11 UTC
18a869b gpu cleanup part1 07 March 2016, 19:03:42 UTC
4b5d09c Merge branch 'master' of github.com:amirgholami/accfft 02 March 2016, 03:12:25 UTC
4ab0ba5 added double precision inverse laplace and inverse biharmonic operators. 02 March 2016, 03:11:33 UTC
5325f67 . 02 March 2016, 03:10:41 UTC
ad42521 no need to use issend 01 March 2016, 18:44:34 UTC
12b6ca1 bug fix 01 March 2016, 18:43:39 UTC
7143762 added biharmonic spectral operator 23 February 2016, 21:21:53 UTC
7d2b50f cleanup 22 February 2016, 01:35:16 UTC
316cf7d code cleanup 2 21 February 2016, 22:15:37 UTC
ddb163c code cleanup 21 February 2016, 22:09:43 UTC
1f41020 removed c++11 features for compatibility with older compilers 17 February 2016, 03:29:10 UTC
094e279 save details during setup phase 17 February 2016, 03:12:17 UTC
228be48 compatibility with older compilers due to bitlist list initialization 16 February 2016, 17:01:48 UTC
a76e35e more inclusive setup phase, fixed a compile error on cray machines 16 February 2016, 04:01:19 UTC
d33318b clean up setup function, better communication selector 30 January 2016, 21:49:02 UTC
cde1002 step1 cleanup 02 January 2016, 00:41:58 UTC
1dd1b4d compatibility with new cmake versions 02 January 2016, 00:39:04 UTC
c454aca corrected timings for step5 and step6 14 December 2015, 02:45:08 UTC
35e816f changed operator timings to the default style (see doxygen) 14 December 2015, 01:32:48 UTC
5ef46db clean up, added pnetcdf_io for floats 06 December 2015, 22:51:26 UTC
6aa2172 working on step6 (unfinished) 03 December 2015, 20:52:36 UTC
107ec29 timing for operators 03 December 2015, 20:52:00 UTC
b95b5d9 added step5: spectral operators for CPU and GPU 02 December 2015, 03:08:13 UTC
2f8ba07 doxygen for spectral operators + code cleanup 02 December 2015, 03:07:24 UTC
40860ec gpu spectral operators completed 01 December 2015, 22:32:42 UTC
39d28f8 initial commit for gpu spectral operators (unfinished) 30 November 2015, 23:18:05 UTC
d0c1bf4 cpu spectral operators finished 29 November 2015, 23:15:27 UTC
11102d6 adding spectral operators (uncompleted) 26 November 2015, 23:43:51 UTC
6bfd26f single precision support added 25 November 2015, 22:00:10 UTC
65eb680 added single precision versions of steps 1,2,3 25 November 2015, 03:15:30 UTC
93229b6 single precision support, part iv added float fft srcs 24 November 2015, 23:00:00 UTC
0551a19 single precision support part iii, modified accfft.cpp 23 November 2015, 21:21:59 UTC
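
Commits c032f80, b488850 and 0256c4a in the log above replace malloc followed by memset(0) with calloc, and b488850 notes that ptrdiff_t is often wider than int, so the earlier code zeroed only the lower half of an array. The sketch below illustrates that class of bug and the calloc fix. It is not the repository's code: the function names are hypothetical, and the buggy byte count (n * sizeof(int)) is an assumed reconstruction based only on the commit message.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Buggy pattern (assumed reconstruction, hypothetical name): the byte count
// passed to memset uses sizeof(int) although the elements are ptrdiff_t, so
// only part of the buffer is cleared when ptrdiff_t is wider than int.
std::ptrdiff_t* alloc_partially_zeroed(std::size_t n) {
    void* raw = std::malloc(n * sizeof(std::ptrdiff_t));
    if (raw == nullptr) return nullptr;
    std::ptrdiff_t* p = static_cast<std::ptrdiff_t*>(raw);
    // Fill with a nonzero byte pattern to stand in for stale heap contents.
    std::memset(p, 0xFF, n * sizeof(std::ptrdiff_t));
    // Wrong element size: clears n * sizeof(int) bytes, not the whole array.
    std::memset(p, 0, n * sizeof(int));
    return p;
}

// Fixed pattern (hypothetical name): calloc takes the element count and the
// element size separately and returns memory that is already zeroed.
std::ptrdiff_t* alloc_zeroed(std::size_t n) {
    return static_cast<std::ptrdiff_t*>(std::calloc(n, sizeof(std::ptrdiff_t)));
}

// Helper for checking the result: number of elements equal to zero.
std::size_t count_zero(const std::ptrdiff_t* p, std::size_t n) {
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] == 0) ++zeros;
    }
    return zeros;
}
```

Calling count_zero on the result of alloc_zeroed(n) should report n zero elements, whereas on a platform where ptrdiff_t is wider than int the buffer from alloc_partially_zeroed(n) has fewer. Both buffers are released with std::free.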