https://github.com/paboyle/Grid

Revision Message Commit Date
e57eafe Fix to multinode code 26 April 2017, 18:46:52 UTC
738c1a1 longer nloop 26 April 2017, 07:43:20 UTC
f8797e1 bug fix. works now and great face performance 26 April 2017, 07:14:02 UTC
fd1eb7d Clean implementation of the exterior faces listing only those points on the boundary 26 April 2017, 06:34:52 UTC
2ce898e Pretty code 26 April 2017, 06:34:25 UTC
ab66bac Think I'm getting on top of the reduced cost exterior precomputed list of links 25 April 2017, 07:50:26 UTC
56277a1 Build a list of what's on the surface 24 April 2017, 16:06:15 UTC
916e9e1 Merge branch 'feature/half-prec-comms' of https://github.com/paboyle/Grid into feature/half-prec-comms 24 April 2017, 09:39:19 UTC
5b55867 Slightly cheaper Ext assembly 24 April 2017, 09:36:11 UTC
3accb1e Debugged assembly split phase with interior suppression 23 April 2017, 23:30:19 UTC
e3d0e31 Debugged assembly split phase with interior suppression 23 April 2017, 23:29:27 UTC
5812eb8 Partially fixed. But the comms-overlap does not work yet. 22 April 2017, 22:50:25 UTC
4dd3763 Use OMP as much as possible 22 April 2017, 19:35:20 UTC
c429ace Cleaner OpenMP use 22 April 2017, 19:28:42 UTC
ac58565 Dangerous rewrite of the assembly. If I make a mistake the debug will be painful. 22 April 2017, 18:31:04 UTC
3703b71 Mark up a table if a given site only receives from itself; including MPI3 splitting info. 22 April 2017, 18:28:37 UTC
b722889 Try a better load balancing loop 22 April 2017, 18:27:41 UTC
abba44a Hand unrolled for overlapped comms 22 April 2017, 16:45:17 UTC
f301be9 Fixed 22 April 2017, 16:42:31 UTC
1d1b225 Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node). 22 April 2017, 13:05:28 UTC
53a785a Fixing the KNL compile 22 April 2017, 12:11:51 UTC
736bf3c Major rework of stencil. Half precision and MPI3 now working. 22 April 2017, 10:33:50 UTC
b9bbe5d L1p config bg/q 22 April 2017, 10:33:09 UTC
3844bcf If no f16c instructions supported must use software half precision conversion. This will also become useful on BG/Q, so will move out from SSE4 into a general area. Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed against the intrinsics implementation yet. 20 April 2017, 14:30:52 UTC
e1a2319 Simple compressor moved out of cshift into stencil 20 April 2017, 12:18:15 UTC
180c732 Move compressors out of Cshift. Slice iterators would help 20 April 2017, 12:17:55 UTC
957a706 Useful script 20 April 2017, 12:17:44 UTC
d2312e9 Drop compressor entirely from Cshift to only Stencil. 20 April 2017, 12:16:55 UTC
fc4ab9c Working half precision comms 20 April 2017, 10:20:26 UTC
4a340aa Massive compressor rework to support reduced precision comms 20 April 2017, 08:28:27 UTC
3b7de79 Type comparison in the traits work 18 April 2017, 12:28:04 UTC
557c3fa Pretty change 18 April 2017, 12:27:38 UTC
ec18e9f Merge branch 'develop' into feature/half-prec-comms 18 April 2017, 10:39:39 UTC
a839d5b Updated todo list 18 April 2017, 10:22:17 UTC
de41b84 Merge branch 'feature/normHP' into develop 18 April 2017, 09:57:21 UTC
8e16115 MultiRHS solver improvements with slice operations moved into lattice and sped up. Block solver requires a lot of performance work. 18 April 2017, 09:51:55 UTC
3141eba MultiRHS working, starting to optimise. Block doesn't and I thought it already was; puzzled. 17 April 2017, 09:50:19 UTC
7ede696 Non compile of tests fixed 16 April 2017, 22:40:00 UTC
bf516c3 higher precision reduction variables in norm and inner product 15 April 2017, 11:27:28 UTC
441a52e First cut at higher precision reduction 15 April 2017, 09:57:21 UTC
a8db024 Cleaning up the dense matrix and lanczos sector 15 April 2017, 07:54:11 UTC
a9c22d5 Verbose removal 14 April 2017, 13:38:49 UTC
3ca4145 Fix to no USE_FP16 case 14 April 2017, 13:20:54 UTC
9e2d29c USE_FP16 macro 14 April 2017, 13:17:14 UTC
951be75 Half precision conversion working on AVX512 now too 13 April 2017, 16:35:11 UTC
b9113ed Patches for knl 13 April 2017, 16:02:12 UTC
42fb49d Merge branch 'develop' of https://github.com/paboyle/Grid into develop 13 April 2017, 13:12:47 UTC
2a54c9a Merge branch 'feature/block-cg' into develop 13 April 2017, 13:12:24 UTC
0957378 Fixing conditional ugly way 13 April 2017, 12:47:56 UTC
2ed6c76 Getting multiline if then fi working 13 April 2017, 12:43:13 UTC
d3b9a7f F16c apparently requires AVX, even if the 128 bit are used. Seems odd. 13 April 2017, 12:19:11 UTC
75ea306 Another try at travis 13 April 2017, 12:05:32 UTC
4226c63 Default to FP16 off again 13 April 2017, 11:51:39 UTC
5a4eafb .travis 13 April 2017, 11:50:43 UTC
eb8e260 Travis update for macos 13 April 2017, 11:35:11 UTC
db5ea00 Update to use Xcode 8.3 since -mfp16 causes SIGILL 13 April 2017, 11:22:40 UTC
2846f07 Predicate tests on fp16 being enabled 13 April 2017, 11:08:05 UTC
1d502e4 FP16 optional compile time 13 April 2017, 10:55:24 UTC
73cdf0f Drop f16c from SSE because of a macos compile error on travis 13 April 2017, 10:23:41 UTC
1c25773 Trap illegal instructions 13 April 2017, 09:51:40 UTC
c38400b Trap signals 13 April 2017, 09:35:20 UTC
9c3065b Debug flags off again 13 April 2017, 09:01:32 UTC
94eb829 Align cast fixed for __m128i; gcc complained 13 April 2017, 07:40:44 UTC
68392dd Exchange in generic Precision change in AVX, SSE, AVX512, Generic. QPX still to do. 13 April 2017, 07:38:12 UTC
cb6b81a Half precision conversion 12 April 2017, 18:32:37 UTC
8ef4300 spurious .dirstamp files removed 10 April 2017, 16:00:22 UTC
98a24eb The macro “magics” is very intensive for the preprocessor in the measurement code which has numerous serialisable classes. Reducing the number of serialisable fields to 64 (instead of 1024) helps a lot, this is enough for now and can be extended trivially if needed in the future. 10 April 2017, 15:58:54 UTC
b12dc89 Commenting and clean up 10 April 2017, 11:38:20 UTC
d80d802 MultiRHS solver test 09 April 2017, 15:12:12 UTC
3d99b09 Start of blockCG 09 April 2017, 14:42:10 UTC
db5f6d3 Verbose fix 09 April 2017, 14:41:30 UTC
683550f Const args improvement 09 April 2017, 14:41:04 UTC
55d0329 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 07 April 2017, 02:08:14 UTC
86aaa35 Christoph needs SchurDiagTwoKappa which is mobius specific. 07 April 2017, 02:07:40 UTC
172d3dc Correcting names in tests 05 April 2017, 15:24:04 UTC
5592f7b Creation mode better implementation 04 April 2017, 17:35:34 UTC
35da4ec UID fix 04 April 2017, 17:18:15 UTC
061b15b Merge branch 'feature/sitmo-skipahead' into develop 04 April 2017, 16:24:49 UTC
561426f Clean up 02 April 2017, 14:13:48 UTC
83f6fab Big/Small crush test, and fast SITMO rng init, faster but not ideal MT and Ranlux init. 02 April 2017, 03:10:51 UTC
0fade84 No random device 01 April 2017, 15:29:40 UTC
9dc7ca4 Sitmo fast init 01 April 2017, 15:28:22 UTC
935d82f sanity checks 01 April 2017, 15:27:28 UTC
9cbcdd6 No random device seed 01 April 2017, 15:26:57 UTC
f18f5ed Drop random device 01 April 2017, 15:26:26 UTC
d1d63a4 sitmo default 01 April 2017, 15:26:05 UTC
7e5faa0 Multiple RNGs 01 April 2017, 15:25:44 UTC
6af459c Christoph's coefficients. 31 March 2017, 08:07:43 UTC
1c4bc7e Debugged staggered conventions 31 March 2017, 05:41:48 UTC
93ea5d9 Pretty code 30 March 2017, 06:00:03 UTC
1ec5d32 Chulwoo's test to zmobius helped me shake out 30 March 2017, 04:45:13 UTC
9fd23fa Pretty layout 30 March 2017, 04:44:45 UTC
10e4fa0 Template instantiation improvements 30 March 2017, 04:44:25 UTC
c4aca1d Conjugate coefficients on adjoint 30 March 2017, 04:44:05 UTC
b9e8ea3 conjugate coefficient on the dagger 30 March 2017, 04:43:13 UTC
077aa72 Fix the ZMobius (I think) 30 March 2017, 04:42:09 UTC
a8d83d8 Macro controls 30 March 2017, 04:31:34 UTC
7fd46ee Trailing whitespace removal 30 March 2017, 04:31:10 UTC
e0c4eeb Compiles again 30 March 2017, 04:30:45 UTC
cb9a297 Chulwoo's Zmobius test 30 March 2017, 04:30:25 UTC