e57eafe | Peter Boyle | 26 April 2017, 18:46:52 UTC | Fix to multinode code | 26 April 2017, 18:46:52 UTC |
738c1a1 | paboyle | 26 April 2017, 07:43:20 UTC | longer nloop | 26 April 2017, 07:43:20 UTC |
f8797e1 | Peter Boyle | 26 April 2017, 07:14:02 UTC | bug fix. works now and great face performance | 26 April 2017, 07:14:02 UTC |
fd1eb7d | Peter Boyle | 26 April 2017, 06:34:52 UTC | Clean implementation of the exterior faces listing only those points on the boundary | 26 April 2017, 06:34:52 UTC |
2ce898e | Peter Boyle | 26 April 2017, 06:34:25 UTC | Pretty code | 26 April 2017, 06:34:25 UTC |
ab66bac | paboyle | 25 April 2017, 07:50:26 UTC | Think I'm getting on top of the reduced cost exterior precomputed list of links | 25 April 2017, 07:50:26 UTC |
56277a1 | paboyle | 24 April 2017, 16:06:15 UTC | Build a list of what's on the surface | 24 April 2017, 16:06:15 UTC |
916e9e1 | paboyle | 24 April 2017, 09:39:19 UTC | Merge branch 'feature/half-prec-comms' of https://github.com/paboyle/Grid into feature/half-prec-comms | 24 April 2017, 09:39:19 UTC |
5b55867 | Peter Boyle | 24 April 2017, 09:36:11 UTC | Slightly cheaper Ext assembly | 24 April 2017, 09:36:11 UTC |
3accb1e | Peter Boyle | 23 April 2017, 23:30:19 UTC | Debugged assembly split phase with interior suppression | 23 April 2017, 23:30:19 UTC |
e3d0e31 | Peter Boyle | 23 April 2017, 23:29:27 UTC | Debugged assembly split phase with interior suppression | 23 April 2017, 23:29:27 UTC |
5812eb8 | Peter Boyle | 22 April 2017, 22:50:25 UTC | Partially fixed. But the comms-overlap does not work yet. | 22 April 2017, 22:50:25 UTC |
4dd3763 | paboyle | 22 April 2017, 19:35:20 UTC | Use OMP as much as possible | 22 April 2017, 19:35:20 UTC |
c429ace | paboyle | 22 April 2017, 19:28:42 UTC | Cleaner OpenMP use | 22 April 2017, 19:28:42 UTC |
ac58565 | paboyle | 22 April 2017, 18:31:04 UTC | Dangerous rewrite of the assembly. If I make a mistake the debug will be painful. | 22 April 2017, 18:31:04 UTC |
3703b71 | paboyle | 22 April 2017, 18:28:37 UTC | Mark up a table if a given site only receives from itself; including MPI3 splitting info. | 22 April 2017, 18:28:37 UTC |
b722889 | paboyle | 22 April 2017, 18:27:41 UTC | Try a better load balancing loop | 22 April 2017, 18:27:41 UTC |
abba44a | paboyle | 22 April 2017, 16:45:17 UTC | Hand unrolled for overlapped comms | 22 April 2017, 16:45:17 UTC |
f301be9 | paboyle | 22 April 2017, 16:42:31 UTC | Fixed | 22 April 2017, 16:42:31 UTC |
1d1b225 | Peter Boyle | 22 April 2017, 13:05:28 UTC | Hand unrolled Nc=3 kernels support split phase compute (on-node, off-node). | 22 April 2017, 13:05:28 UTC |
53a785a | Peter Boyle | 22 April 2017, 12:11:51 UTC | Fixing the KNL compile | 22 April 2017, 12:11:51 UTC |
736bf3c | paboyle | 22 April 2017, 10:33:50 UTC | Major rework of stencil. Half precision and MPI3 now working. | 22 April 2017, 10:33:50 UTC |
b9bbe5d | paboyle | 22 April 2017, 10:33:09 UTC | L1p config bg/q | 22 April 2017, 10:33:09 UTC |
3844bcf | paboyle | 20 April 2017, 14:30:52 UTC | If no f16c instructions supported must use software half precision conversion. This will also become useful on BG/Q, so will move out from SSE4 into a general area. Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed against the intrinsics implementation yet. | 20 April 2017, 14:30:52 UTC |
e1a2319 | paboyle | 20 April 2017, 12:18:15 UTC | Simple compressor moved out of cshift into stencil | 20 April 2017, 12:18:15 UTC |
180c732 | paboyle | 20 April 2017, 12:17:55 UTC | Move compressors out of Cshift. Slice iterators would help | 20 April 2017, 12:17:55 UTC |
957a706 | paboyle | 20 April 2017, 12:17:44 UTC | Useful script | 20 April 2017, 12:17:44 UTC |
d2312e9 | paboyle | 20 April 2017, 12:16:55 UTC | Drop compressor entirely from Cshift to only Stencil. | 20 April 2017, 12:16:55 UTC |
fc4ab9c | paboyle | 20 April 2017, 10:20:26 UTC | Working half precision comms | 20 April 2017, 10:20:26 UTC |
4a340aa | paboyle | 20 April 2017, 08:28:27 UTC | Massive compressor rework to support reduced precision comms | 20 April 2017, 08:28:27 UTC |
3b7de79 | paboyle | 18 April 2017, 12:28:04 UTC | Type comparison in the traits work | 18 April 2017, 12:28:04 UTC |
557c3fa | paboyle | 18 April 2017, 12:27:38 UTC | Pretty change | 18 April 2017, 12:27:38 UTC |
ec18e9f | paboyle | 18 April 2017, 10:39:39 UTC | Merge branch 'develop' into feature/half-prec-comms | 18 April 2017, 10:39:39 UTC |
a839d5b | paboyle | 18 April 2017, 10:22:17 UTC | Updated todo list | 18 April 2017, 10:22:17 UTC |
de41b84 | paboyle | 18 April 2017, 09:57:21 UTC | Merge branch 'feature/normHP' into develop | 18 April 2017, 09:57:21 UTC |
8e16115 | paboyle | 18 April 2017, 09:51:55 UTC | MultiRHS solver improvements with slice operations moved into lattice and sped up. Block solver requires a lot of performance work. | 18 April 2017, 09:51:55 UTC |
3141eba | paboyle | 17 April 2017, 09:50:19 UTC | MultiRHS working, starting to optimise. Block doesn't and I thought it already was; puzzled. | 17 April 2017, 09:50:19 UTC |
7ede696 | paboyle | 16 April 2017, 22:40:00 UTC | Non compile of tests fixed | 16 April 2017, 22:40:00 UTC |
bf516c3 | paboyle | 15 April 2017, 11:27:28 UTC | higher precision reduction variables in norm and inner product | 15 April 2017, 11:27:28 UTC |
441a52e | paboyle | 15 April 2017, 09:57:21 UTC | First cut at higher precision reduction | 15 April 2017, 09:57:21 UTC |
a8db024 | paboyle | 15 April 2017, 07:54:11 UTC | Cleaning up the dense matrix and lanczos sector | 15 April 2017, 07:54:11 UTC |
a9c22d5 | paboyle | 14 April 2017, 13:38:49 UTC | Verbose removal | 14 April 2017, 13:38:49 UTC |
3ca4145 | paboyle | 14 April 2017, 13:20:54 UTC | Fix to no USE_FP16 case | 14 April 2017, 13:20:54 UTC |
9e2d29c | paboyle | 14 April 2017, 13:17:14 UTC | USE_FP16 macro | 14 April 2017, 13:17:14 UTC |
951be75 | Peter Boyle | 13 April 2017, 16:35:11 UTC | Half precision conversion working on AVX512 now too | 13 April 2017, 16:35:11 UTC |
b9113ed | Peter Boyle | 13 April 2017, 16:02:12 UTC | Patches for knl | 13 April 2017, 16:02:12 UTC |
42fb49d | paboyle | 13 April 2017, 13:12:47 UTC | Merge branch 'develop' of https://github.com/paboyle/Grid into develop | 13 April 2017, 13:12:47 UTC |
2a54c9a | paboyle | 13 April 2017, 13:12:24 UTC | Merge branch 'feature/block-cg' into develop | 13 April 2017, 13:12:24 UTC |
0957378 | paboyle | 13 April 2017, 12:47:56 UTC | Fixing conditional ugly way | 13 April 2017, 12:47:56 UTC |
2ed6c76 | paboyle | 13 April 2017, 12:43:13 UTC | Getting multiline if then fi working | 13 April 2017, 12:43:13 UTC |
d3b9a7f | paboyle | 13 April 2017, 12:19:11 UTC | F16c apparently requires AVX, even if the 128 bit are used. Seems odd. | 13 April 2017, 12:19:11 UTC |
75ea306 | paboyle | 13 April 2017, 12:05:32 UTC | Another try at travis | 13 April 2017, 12:05:32 UTC |
4226c63 | paboyle | 13 April 2017, 11:51:39 UTC | Default to FP16 off again | 13 April 2017, 11:51:39 UTC |
5a4eafb | paboyle | 13 April 2017, 11:50:43 UTC | .travis | 13 April 2017, 11:50:43 UTC |
eb8e260 | paboyle | 13 April 2017, 11:35:11 UTC | Travis update for macos | 13 April 2017, 11:35:11 UTC |
db5ea00 | paboyle | 13 April 2017, 11:22:40 UTC | Update to use Xcode 8.3 since -mfp16 causes SIGILL | 13 April 2017, 11:22:40 UTC |
2846f07 | paboyle | 13 April 2017, 11:08:05 UTC | Predicate tests on fp16 being enabled | 13 April 2017, 11:08:05 UTC |
1d502e4 | paboyle | 13 April 2017, 10:55:24 UTC | FP16 optional compile time | 13 April 2017, 10:55:24 UTC |
73cdf0f | paboyle | 13 April 2017, 10:23:41 UTC | Drop f16c from SSE because of a macos compile error on travis | 13 April 2017, 10:23:41 UTC |
1c25773 | paboyle | 13 April 2017, 09:51:40 UTC | Trap illegal instructions | 13 April 2017, 09:51:40 UTC |
c38400b | paboyle | 13 April 2017, 09:35:20 UTC | Trap signals | 13 April 2017, 09:35:20 UTC |
9c3065b | paboyle | 13 April 2017, 09:01:32 UTC | Debug flags off again | 13 April 2017, 09:01:32 UTC |
94eb829 | paboyle | 13 April 2017, 07:40:44 UTC | Align cast fixed for __m128i gcc complained | 13 April 2017, 07:40:44 UTC |
68392dd | paboyle | 13 April 2017, 07:38:12 UTC | Exchange in generic Precision change in AVX, SSE, AVX512, Generic. QPX still to do. | 13 April 2017, 07:38:12 UTC |
cb6b81a | paboyle | 12 April 2017, 18:32:37 UTC | Half precision conversion | 12 April 2017, 18:32:37 UTC |
8ef4300 | Antonin Portelli | 10 April 2017, 16:00:22 UTC | spurious .dirstamp files removed | 10 April 2017, 16:00:22 UTC |
98a24eb | Antonin Portelli | 10 April 2017, 15:58:54 UTC | The macro “magics” is very intensive for the preprocessor in the measurement code which has numerous serialisable classes. Reducing the number of serialisable fields to 64 (instead of 1024) helps a lot, this is enough for now and can be extended trivially if needed in the future. | 10 April 2017, 15:58:54 UTC |
b12dc89 | paboyle | 10 April 2017, 11:38:20 UTC | Commenting and clean up | 10 April 2017, 11:38:20 UTC |
d80d802 | paboyle | 09 April 2017, 15:12:12 UTC | MultiRHS solver test | 09 April 2017, 15:12:12 UTC |
3d99b09 | paboyle | 09 April 2017, 14:42:10 UTC | Start of blockCG | 09 April 2017, 14:42:10 UTC |
db5f6d3 | paboyle | 09 April 2017, 14:41:30 UTC | Verbose fix | 09 April 2017, 14:41:30 UTC |
683550f | paboyle | 09 April 2017, 14:41:04 UTC | Const args improvement | 09 April 2017, 14:41:04 UTC |
55d0329 | paboyle | 07 April 2017, 02:08:14 UTC | Merge branch 'develop' of https://github.com/paboyle/Grid into develop | 07 April 2017, 02:08:14 UTC |
86aaa35 | paboyle | 07 April 2017, 02:07:40 UTC | Christoph needs SchurDiagTwoKappa which is mobius specific. | 07 April 2017, 02:07:40 UTC |
172d3dc | Guido Cossu | 05 April 2017, 15:24:04 UTC | Correcting names in tests | 05 April 2017, 15:24:04 UTC |
5592f7b | paboyle | 04 April 2017, 17:35:34 UTC | Creation mode better implementation | 04 April 2017, 17:35:34 UTC |
35da4ec | paboyle | 04 April 2017, 17:18:15 UTC | UID fix | 04 April 2017, 17:18:15 UTC |
061b15b | paboyle | 04 April 2017, 16:24:49 UTC | Merge branch 'feature/sitmo-skipahead' into develop | 04 April 2017, 16:24:49 UTC |
561426f | paboyle | 02 April 2017, 14:13:48 UTC | Clean up | 02 April 2017, 14:13:48 UTC |
83f6fab | paboyle | 02 April 2017, 03:10:51 UTC | Big/Small crush test, and fast SITMO rng init, faster but not ideal MT and Ranlux init. | 02 April 2017, 03:10:51 UTC |
0fade84 | paboyle | 01 April 2017, 15:29:40 UTC | No random device | 01 April 2017, 15:29:40 UTC |
9dc7ca4 | paboyle | 01 April 2017, 15:28:22 UTC | Sitmo fast init | 01 April 2017, 15:28:22 UTC |
935d82f | paboyle | 01 April 2017, 15:27:28 UTC | sanity checks | 01 April 2017, 15:27:28 UTC |
9cbcdd6 | paboyle | 01 April 2017, 15:26:57 UTC | No random device seed | 01 April 2017, 15:26:57 UTC |
f18f5ed | paboyle | 01 April 2017, 15:26:26 UTC | Drop random device | 01 April 2017, 15:26:26 UTC |
d1d63a4 | paboyle | 01 April 2017, 15:26:05 UTC | sitmo default | 01 April 2017, 15:26:05 UTC |
7e5faa0 | paboyle | 01 April 2017, 15:25:44 UTC | Multiple RNGs | 01 April 2017, 15:25:44 UTC |
6af459c | paboyle | 31 March 2017, 08:07:43 UTC | Christoph's coefficients. | 31 March 2017, 08:07:43 UTC |
1c4bc7e | paboyle | 31 March 2017, 05:41:48 UTC | Debugged staggered conventions | 31 March 2017, 05:41:48 UTC |
93ea5d9 | paboyle | 30 March 2017, 06:00:03 UTC | Pretty code | 30 March 2017, 06:00:03 UTC |
1ec5d32 | paboyle | 30 March 2017, 04:45:13 UTC | Chulwoo's test to zmobius helped me shake out | 30 March 2017, 04:45:13 UTC |
9fd23fa | paboyle | 30 March 2017, 04:44:45 UTC | Pretty layout | 30 March 2017, 04:44:45 UTC |
10e4fa0 | paboyle | 30 March 2017, 04:44:25 UTC | Template instantiation improvements | 30 March 2017, 04:44:25 UTC |
c4aca1d | paboyle | 30 March 2017, 04:44:05 UTC | Conjugate coefficients on adjoint | 30 March 2017, 04:44:05 UTC |
b9e8ea3 | paboyle | 30 March 2017, 04:43:13 UTC | conjugate coefficient on the dagger | 30 March 2017, 04:43:13 UTC |
077aa72 | paboyle | 30 March 2017, 04:42:09 UTC | Fix the ZMobius (I think) | 30 March 2017, 04:42:09 UTC |
a8d83d8 | paboyle | 30 March 2017, 04:31:34 UTC | Macro controls | 30 March 2017, 04:31:34 UTC |
7fd46ee | paboyle | 30 March 2017, 04:31:10 UTC | Trailing whitespace removal | 30 March 2017, 04:31:10 UTC |
e0c4eeb | paboyle | 30 March 2017, 04:30:45 UTC | Compiles again | 30 March 2017, 04:30:45 UTC |
cb9a297 | paboyle | 30 March 2017, 04:30:25 UTC | Chulwoo's Zmobius test | 30 March 2017, 04:30:25 UTC |