Revision history - HEAD - origin: https://github.com/amirgholami/accfft

05 April 2024, 20:01:52 UTC

Revision	Author	Date	Message	Commit Date
2038542	Malte Brunn	05 November 2020, 11:48:01 UTC	removed unsupported compute capabilities and added new ones	05 November 2020, 11:48:01 UTC
133a585	brunnme	17 August 2018, 13:01:44 UTC	error in GPU_float fixed	17 August 2018, 13:01:44 UTC
5cb08a4	Amir Gholami	04 May 2018, 23:08:13 UTC	Merge pull request #17 from pkestene/mpi_deprecated avoid datatype MPI::BOOL (deprecated)	04 May 2018, 23:08:13 UTC
8552f6e	Pierre Kestener	04 May 2018, 10:19:04 UTC	replace MPI::BOOL (MPI c++ bindings deprecated) by a custom MPI_Datatype (from dtypes.h)	04 May 2018, 10:19:04 UTC
0d734b8	Amir Gholami	23 February 2018, 21:56:48 UTC	Merge pull request #14 from alalazo/fixes/odr_violation_gpu Fix odr violation on GPUs + don't override user nvcc flags	23 February 2018, 21:56:48 UTC
409e830	Massimiliano Culpo	21 February 2018, 15:28:19 UTC	Fixed build failures due to unsupported architectures When working on a machine that doesn't support the architectures that are appended unconditionnaly, the current make file incurs in a build failure. This should be fixed by making the defaults kick-in only if the user didn't specify anything else on the command line.	21 February 2018, 15:28:19 UTC
416518e	Massimiliano Culpo	21 February 2018, 15:25:50 UTC	Fixed odr violation on explicit template instantiation A few temaplates are instantiated in a header file, that is included multiple times in different compilation units. This causes a build failure due to odr violation. Moving the explicit instantiation to the single cpp file where they are needed solves the issue.	21 February 2018, 15:25:50 UTC
632d249	Amir Gholami	15 February 2018, 19:38:41 UTC	Merge pull request #13 from alalazo/fixes/linking_shared_libraries Fixed linking of step* executables when producing shared libraries	15 February 2018, 19:38:41 UTC
4134eab	Massimiliano Culpo	11 February 2018, 09:43:05 UTC	Fixed linking of step* executables when producing shared libraries Before this commit the library was not able to link the 'step*' executables when giving '-DBUILD_SHARED=true' (undefined references to a few symbols).	11 February 2018, 09:43:05 UTC
2199051	Amir Gholami	20 May 2017, 23:16:49 UTC	removed unnecessary operation in transpose	20 May 2017, 23:16:49 UTC
babb821	Amir Gholami	20 May 2017, 20:34:15 UTC	added support for outplace transpose	20 May 2017, 20:34:15 UTC
4475f3d	Amir Gholami	20 May 2017, 02:21:23 UTC	added restrict keyword to transpose functions	20 May 2017, 02:21:41 UTC
113b3db	Amir Gholami	10 April 2017, 15:20:01 UTC	removed debug info	10 April 2017, 15:20:01 UTC
22cf750	Amir Gholami	10 April 2017, 03:10:57 UTC	merged	10 April 2017, 03:10:57 UTC
f00f3f9	Amir Gholami	09 April 2017, 23:49:20 UTC	changed the layout for fast spectral operators so the FFT is performed on contiguous direction	09 April 2017, 23:49:20 UTC
c4b6ccf	Amir Gholami	05 April 2017, 20:27:07 UTC	added self exec timer to gradient and divergence operators	05 April 2017, 20:27:07 UTC
2140179	Amir Gholami	03 April 2017, 05:29:13 UTC	removed unused variables	03 April 2017, 05:29:13 UTC
1eb9f32	Amir Gholami	03 April 2017, 05:16:47 UTC	added verbose2 definition to transpose	03 April 2017, 05:16:47 UTC
f6e0c4a	Amir Gholami	02 April 2017, 22:54:10 UTC	guard against int overflow	02 April 2017, 22:54:10 UTC
97d17e6	Amir Gholami	29 March 2017, 00:12:44 UTC	removed unnecessary barriers	29 March 2017, 00:12:44 UTC
f70f053	Amir Gholami	28 March 2017, 22:15:35 UTC	removed barriers from operators.txx	28 March 2017, 22:15:35 UTC
d80b5d2	Amir Gholami	24 March 2017, 21:34:01 UTC	moved operator allocations to the memory manager	24 March 2017, 21:34:01 UTC
2a40c78	Amir Gholami	23 March 2017, 16:12:20 UTC	moved timings[5] to timings[6] for operators. t[5] is now empty	23 March 2017, 16:12:20 UTC
b005a97	Amir Gholami	23 March 2017, 04:06:47 UTC	added timers for operators, note that now you should pass a field of size 6 for the timings	23 March 2017, 04:06:47 UTC
f2eb48f	Amir Gholami	23 March 2017, 02:42:36 UTC	added memcpy time to total transpose time for spectral operators	23 March 2017, 02:42:36 UTC
028f084	Amir Gholami	22 March 2017, 20:19:58 UTC	fix for a corner case in spectral operators	22 March 2017, 20:19:58 UTC
154b031	Amir Gholami	22 March 2017, 01:32:28 UTC	temporary switch to slow operators	22 March 2017, 01:32:28 UTC
7c0cb84	Amir Gholami	17 March 2017, 17:22:51 UTC	added plan templates for operators	17 March 2017, 17:22:51 UTC
02fefd2	Amir Gholami	16 March 2017, 21:11:49 UTC	added templated version of single and double precision	16 March 2017, 21:11:49 UTC
11c1d1b	Amir Gholami	11 March 2017, 23:22:21 UTC	pushed the previous fix for single precision and gpu	11 March 2017, 23:22:21 UTC
0b0c207	Amir Gholami	11 March 2017, 01:41:10 UTC	fix for operators using irregular sizes	11 March 2017, 01:41:10 UTC
aa9870c	Amir Gholami	28 February 2017, 11:57:02 UTC	fixed compile issue with gcc	28 February 2017, 11:57:02 UTC
9c37a08	Amir Gholami	27 February 2017, 23:36:42 UTC	debugged fast spectral solvers	27 February 2017, 23:36:42 UTC
8094869	Amir Gholami	21 February 2017, 17:28:07 UTC	fixed gcc compilation issue	21 February 2017, 17:28:07 UTC
dd02c1b	Amir Gholami	21 February 2017, 01:40:01 UTC	code cleanup	21 February 2017, 01:40:01 UTC
fe62538	Amir Gholami	20 February 2017, 22:01:16 UTC	added fast versions of spectral operators	20 February 2017, 22:01:16 UTC
4cd49f4	Amir Gholami	15 February 2017, 16:47:19 UTC	fixed build issue for GPU	15 February 2017, 16:47:19 UTC
bcac4ee	Amir Gholami	15 February 2017, 16:23:43 UTC	cleaned the code + filter nyquist in inverse laplace operator	15 February 2017, 16:23:43 UTC
ae56b40	Amir Gholami	12 January 2017, 22:38:51 UTC	corrected order of fftw library linking	12 January 2017, 22:38:51 UTC
5d73849	Amir Gholami	25 November 2016, 17:33:08 UTC	added header to soperators.txx	25 November 2016, 17:33:08 UTC
1c1bcf5	Amir Gholami	08 October 2016, 21:42:45 UTC	code cleanup	08 October 2016, 21:42:45 UTC
93cf04e	Amir Gholami	07 October 2016, 19:17:04 UTC	minor change to step1 CMakeList	07 October 2016, 19:17:04 UTC
7d60a0c	Amir Gholami	18 September 2016, 17:30:12 UTC	cleanup	18 September 2016, 17:30:12 UTC
e72664f	Amir Gholami	30 June 2016, 03:10:29 UTC	changed create_comm message	30 June 2016, 03:10:29 UTC
eb337d4	Amir Gholami	26 June 2016, 01:51:13 UTC	fixed compiler warning for malloc	26 June 2016, 01:51:13 UTC
5c6064e	Amir Gholami	26 June 2016, 01:38:00 UTC	fixed compiler warnings	26 June 2016, 01:38:00 UTC
4430898	Amir Gholami	26 June 2016, 00:39:57 UTC	Cmake find pnetcdf quietly, it is an optional package	26 June 2016, 00:39:57 UTC
10df740	Amir Gholami	14 June 2016, 04:27:16 UTC	outof order send for inverse	14 June 2016, 04:27:16 UTC
004249a	Amir Gholami	14 June 2016, 04:12:35 UTC	mpi_waitall for v and v_h, also removed omp_parallel_for	14 June 2016, 04:12:35 UTC
9cfc60e	Amir Gholami	14 June 2016, 03:52:50 UTC	MPI_STATUSES_NULL > MPI_STATUSES_IGNORE	14 June 2016, 03:52:50 UTC
00bd8d8	Amir Gholami	14 June 2016, 03:39:05 UTC	Merge pull request #8 from jeffhammond/fix-ulong fix ulong -> unsigned long	14 June 2016, 03:39:05 UTC
c45f68d	Amir Gholami	14 June 2016, 03:38:29 UTC	Merge pull request #7 from jeffhammond/use-waitall replace loop over wait with waitall	14 June 2016, 03:38:29 UTC
ec90098	Amir Gholami	14 June 2016, 03:34:34 UTC	Merge pull request #6 from jeffhammond/remove-nilpotent-expressions Remove nilpotent expressions	14 June 2016, 03:34:34 UTC
0c85ca9	Amir Gholami	14 June 2016, 03:32:26 UTC	Merge pull request #5 from jeffhammond/use-calloc Use calloc	14 June 2016, 03:32:26 UTC
f0687ff	Jeff Hammond	10 May 2016, 19:28:17 UTC	fix ulong -> unsigned long	10 May 2016, 19:28:17 UTC
df99207	Jeff Hammond	10 May 2016, 13:35:56 UTC	replace loop over wait with waitall 1) merge send and recv request vectors 2) set odd (even) values of request vector to send (recv) requests 3) call MPI_Waitall once instead of looping over two calls to MPI_Wait This should have a nontrivial impact when nproc is large and messages are completed by the network out-of-order w.r.t. the wait loop. (comment #2 may have send and recv backwards)	10 May 2016, 13:35:56 UTC
cb75449	Jeff Hammond	10 May 2016, 13:12:40 UTC	remove loop with trip count of 1	10 May 2016, 13:12:40 UTC
49460fa	Jeff Hammond	10 May 2016, 13:11:47 UTC	remove 0*nprocs	10 May 2016, 13:11:47 UTC
ae075be	Jeff Hammond	10 May 2016, 13:11:12 UTC	replace 0*MPI_Wtime with 0 #2	10 May 2016, 13:11:12 UTC
179be93	Jeff Hammond	10 May 2016, 13:09:37 UTC	replace 0*MPI_Wtime with 0	10 May 2016, 13:09:37 UTC
b196a29	Jeff Hammond	10 May 2016, 09:38:34 UTC	eliminate unnecessary repeat log2()	10 May 2016, 09:38:34 UTC
0256c4a	Jeff Hammond	10 May 2016, 09:36:03 UTC	use calloc=malloc+memset(0) #2	10 May 2016, 09:36:03 UTC
b488850	Jeff Hammond	10 May 2016, 09:35:34 UTC	BUGFIX: ptrdiff_t often wider than int, so this only zeros the lower half of the array	10 May 2016, 09:35:34 UTC
c032f80	Jeff Hammond	10 May 2016, 09:34:44 UTC	use calloc=malloc+memset(0)	10 May 2016, 09:34:44 UTC
abb66aa	Amir Gholami	04 May 2016, 01:59:35 UTC	made pnetcdf optional	04 May 2016, 01:59:35 UTC
77c3124	Amir Gholami	13 April 2016, 20:17:44 UTC	minor fix in debug mode print	13 April 2016, 20:17:44 UTC
4f320ad	Amir Gholami	22 March 2016, 17:16:22 UTC	add c_comm to write_pnetcdf, and isize,istart to accfft_plan	22 March 2016, 17:16:22 UTC
d78abc1	Amir Gholami	07 March 2016, 23:04:11 UTC	bug fix for inverse laplace and biharmonic operators	07 March 2016, 23:04:11 UTC
18a869b	Amir Gholami	07 March 2016, 19:03:42 UTC	gpu cleanup part1	07 March 2016, 19:03:42 UTC
4b5d09c	Amir Gholami	02 March 2016, 03:12:25 UTC	Merge branch 'master' of github.com:amirgholami/accfft	02 March 2016, 03:12:25 UTC
4ab0ba5	Amir Gholami	02 March 2016, 03:10:41 UTC	added double precision inverse laplace and inverse biharmonic operators.	02 March 2016, 03:11:33 UTC
5325f67	Amir Gholami	02 March 2016, 03:10:41 UTC	.	02 March 2016, 03:10:41 UTC
ad42521	Amir Gholami	01 March 2016, 18:44:34 UTC	no need to use issend	01 March 2016, 18:44:34 UTC
12b6ca1	Amir Gholami	01 March 2016, 18:43:39 UTC	bug fix	01 March 2016, 18:43:39 UTC
7143762	Amir Gholami	23 February 2016, 21:21:53 UTC	added biharmonic spectral operator	23 February 2016, 21:21:53 UTC
7d2b50f	Amir Gholami	22 February 2016, 01:35:16 UTC	cleanup	22 February 2016, 01:35:16 UTC
316cf7d	Amir Gholami	21 February 2016, 22:15:37 UTC	code cleanup 2	21 February 2016, 22:15:37 UTC
ddb163c	Amir Gholami	21 February 2016, 22:09:43 UTC	code cleanup	21 February 2016, 22:09:43 UTC
1f41020	Amir Gholami	17 February 2016, 03:29:10 UTC	removed c++11 features for compatibility with older compilers	17 February 2016, 03:29:10 UTC
094e279	Amir Gholami	17 February 2016, 03:12:17 UTC	save details during setup phase	17 February 2016, 03:12:17 UTC
228be48	Amir Gholami	16 February 2016, 17:01:48 UTC	compatibility with older compilers due to bitlist list initialization	16 February 2016, 17:01:48 UTC
a76e35e	Amir Gholami	16 February 2016, 04:01:19 UTC	more inclusive setup phase, fixed a compile error on cray machines	16 February 2016, 04:01:19 UTC
d33318b	Amir Gholami	30 January 2016, 21:49:02 UTC	clean up setup function, better communication selector	30 January 2016, 21:49:02 UTC
cde1002	Amir Gholami	02 January 2016, 00:41:58 UTC	step1 cleanup	02 January 2016, 00:41:58 UTC
1dd1b4d	Amir Gholami	02 January 2016, 00:39:04 UTC	compatibility with new cmake versions	02 January 2016, 00:39:04 UTC
c454aca	Amir Gholami	14 December 2015, 02:45:08 UTC	corrected timings for step5 and step6	14 December 2015, 02:45:08 UTC
35e816f	Amir Gholami	14 December 2015, 01:32:48 UTC	changed operator timings to the default style (see doxygen)	14 December 2015, 01:32:48 UTC
5ef46db	Amir Gholami	06 December 2015, 22:51:26 UTC	clean up, added pnetcdf_io for floats	06 December 2015, 22:51:26 UTC
6aa2172	Amir Gholami	03 December 2015, 20:52:36 UTC	working on step6 (unfinished)	03 December 2015, 20:52:36 UTC
107ec29	Amir Gholami	03 December 2015, 20:52:00 UTC	timing for operators	03 December 2015, 20:52:00 UTC
b95b5d9	Amir Gholami	02 December 2015, 03:08:13 UTC	added step5: spectral operators for CPU and GPU	02 December 2015, 03:08:13 UTC
2f8ba07	Amir Gholami	02 December 2015, 03:07:24 UTC	doxygen for spectral operators + code cleanup	02 December 2015, 03:07:24 UTC
40860ec	Amir Gholami	01 December 2015, 22:32:42 UTC	gpu spectral operators completed	01 December 2015, 22:32:42 UTC
39d28f8	Amir Gholami	30 November 2015, 23:18:05 UTC	initial commit for gpu spectral operators (unfinished)	30 November 2015, 23:18:05 UTC
d0c1bf4	Amir Gholami	29 November 2015, 23:15:27 UTC	cpu spectral operators finished	29 November 2015, 23:15:27 UTC
11102d6	Amir Gholami	26 November 2015, 23:43:51 UTC	adding spectral operators (uncompleted)	26 November 2015, 23:43:51 UTC
6bfd26f	Amir Gholami	25 November 2015, 22:00:10 UTC	single precision support added	25 November 2015, 22:00:10 UTC
65eb680	Amir Gholami	25 November 2015, 03:15:30 UTC	added single precision versions of steps 1,2,3	25 November 2015, 03:15:30 UTC
93229b6	Amir Gholami	24 November 2015, 23:00:00 UTC	single precision support, part iv added float fft srcs	24 November 2015, 23:00:00 UTC
0551a19	Amir Gholami	23 November 2015, 21:21:59 UTC	single precision support part iii, modified accfft.cpp	23 November 2015, 21:21:59 UTC
