Revision Message Commit Date
726b099 Auto-merge of 6bf64a69a3530c3b7fb5413b8e0a0f2b1403ad6b 02 March 2017, 05:00:06 UTC
8333cb2 Fix non-hexadecapole in nodeGravityComputation Change-Id: I2c4dcda098cd4d0f9fe1798b92b22b8a92989bfe 27 February 2017, 05:42:47 UTC
4277144 Correctly handle --with-cuda arguments in cuda.ac Change-Id: I19dfd0812de2586632e2916cdfdc8e729c7633b1 27 February 2017, 04:18:09 UTC
d24689e Use new "runKernel" function pointer, instead of old ID and "kernelSelect" switch. Also: Use new hapi_hostFree macros, instead of duplicated hapi_poolFree / delayedFree / free nest of #ifdefs. And replace CUDA_TRACE #ifdef mess with macros. Change-Id: I20d6e2d8d951fc643253beb8bc4984d5eb0bd011 01 February 2017, 02:41:53 UTC
2df7c0d GPU: change partMap (map of remote particles on GPU) to std::unordered_map. Change-Id: I15c9115377b297072d31fa6881e4d7e20bb6804c 29 January 2017, 02:10:26 UTC
03c27de particleGravityComputation optimizations Change-Id: I5f96c11e168d453c3abb6a4a459fd3080bf1730d 18 January 2017, 23:13:27 UTC
583924e optimized nodeGravity kernel add comment on launch_bound value Change-Id: Ia46146a98776de3df694ef12d032df0e6d752c52 17 January 2017, 17:42:41 UTC
f4cf985 TreePiece::nextBucket(): be sure sGravity is initialized before send*InteractionsToGPU(). Change-Id: I041e2a95d46dd0493767358f400bbc5f53be0099 24 December 2016, 23:31:17 UTC
daaded5 Dumpframe: test for limit on color table entries. Also some documentation. Change-Id: Iee7b3f016a7990783298cdef38f79516e61119b3 23 December 2016, 20:27:37 UTC
68ad81e Even more documentation. Change-Id: I48533ee840d637991c519a561ea0523a4b19b9eb 23 December 2016, 20:20:16 UTC
70d95b0 Removed unused "val" from dummyMsg. Added some documentation. Change-Id: I0184f4bf6948017bc0edaa10de5c1bcddaa621cd 23 December 2016, 19:47:50 UTC
eac4cbf More documentation Change-Id: I80a6683d2c01e16d3ecb7d37cd5c4e42fdb59e73 23 December 2016, 19:43:54 UTC
caa32dd cooling_grackle: comment out gasoline-only data structures. Change-Id: Iaea86cca9972f62ef120cfd9adc8062da66c0120 23 December 2016, 19:42:43 UTC
2df83a9 GenericTreeNode: documentation and dead code removal. Change-Id: I73ba153bd9414399206a723f3387ed58dcb84f8c 23 December 2016, 19:35:58 UTC
01c491d Move unused code out of the way. Change-Id: I38fe3f35b484aa458cc4205ce141315b84d52963 23 December 2016, 19:31:11 UTC
cc73679 Add INTERLIST_VER ifdefs around ckloop code so that it compiles/runs with B-H gravity walk. Change-Id: I38aeb240f449c22ee25fd789e0e14c3256fa4d0d 23 December 2016, 19:21:30 UTC
ba54aea Upgrade Doxyfile. Change-Id: I686570a1850dd6a8a38f589591674568ddaef2ff 23 December 2016, 04:39:56 UTC
e573a9d Cuda: increased size of node interaction lists sent to the GPU. Change-Id: I263f3ad69085aa75a1473d688242ffed8e1dacdd 27 November 2016, 02:18:34 UTC
70a6abf Comment out memory clean up statements at the finish. These are only for debugging and tend to tickle charm QD bugs. Change-Id: I5b48e9be47506025796151e1d20f35d3934a93aa 20 September 2016, 17:25:02 UTC
f3b82e5 Ewald on GPU: change Taylor series limit for single precision. Change-Id: Ifdccbb555c4fdd0302554002ced5893f576b5d64 20 September 2016, 02:12:18 UTC
73165b5 Scaled moments: handle case with all particles at the same position in a bucket. Change-Id: I0f10e8fdfc2583f0661173aa7b22d666aae8d344 17 September 2016, 23:14:05 UTC
beefd18 Make single-precision for gravity calculations configurable Change-Id: I0ea72b679729d29579a487db82f5508d3288e6aa 06 September 2016, 16:50:33 UTC
6a3f98f Deactivate GPU small phase Change-Id: Id9c1b8297396cd765b90d7213fc0dae764e666ae 06 September 2016, 16:49:27 UTC
07431c1 Add dist-clean target Change-Id: I7c02d21a17a1f8728b5d4610e4ee179eeaa81990 06 September 2016, 16:48:40 UTC
a4f56b2 Remove all CUDA-specific functionality during non-CUDA build Change-Id: I413f80200cb2f8d685d2c3929cfc5c06ee58426e 06 September 2016, 16:46:32 UTC
ae75676 Fix rebase errors in Makefile.in Change-Id: I81b3a91ff347df4b97351fd5c2607628a17f65b0 06 September 2016, 16:43:04 UTC
f7a839d Set precision of scaled moments to cosmoType Change-Id: I2ef2fa7c6d0026cf741cba18a791cc9aed5e46ff 03 September 2016, 05:19:54 UTC
3a5e0ad Document meaning of momFloat. Change-Id: Id88c24ed3f13943029a945fd6774c005a856e451 03 September 2016, 00:54:20 UTC
d055a30 Add error for (unsupported) single-precision AVX Change-Id: Ie27ea1f230247c016e010854ce4c51c50cd7c847 03 September 2016, 00:52:53 UTC
ae04415 Include cosmoType.h Change-Id: I7ff07822ea20aa7d515e97ac702b9ec7b66e304b 03 September 2016, 00:51:50 UTC
3af621a Rebuild configure after rebase Change-Id: Id9e31dd669d175e0b3b89c96b6213533b40b0bc1 03 September 2016, 00:46:25 UTC
f9eaf82 Force rebuild of cuda.mk and Makefile.dep on each run of configure Change-Id: Ie42051a763eeef5cae0bc220c6ba0aa15287e456 03 September 2016, 00:43:55 UTC
175440b Ignore Eclipse and autoconf cache folders Change-Id: I3b80e77bbace77cf6646f50bf79d6f8ecef8d72d 03 September 2016, 00:43:25 UTC
a377927 DataManager::combineLocalTrees(): remove verbose message. Change-Id: I4e56debb163ad4b58b1e811c42a1cf78f23a64a6 03 September 2016, 00:42:44 UTC
a030a3c TreePiece::calculateGravityRemote(): fix bad ifdef. Change-Id: I2a4e008e2119f8897f70e8466cfd6e071650ce0a 03 September 2016, 00:42:00 UTC
9661639 Make 35 the default CUDA level. (K20s or better). Change-Id: I787b1164b54d5eac2afc9a19c3779c51169e0459 03 September 2016, 00:41:37 UTC
662715d Fixed multiple chare problem with Ewald CUDA memory buffers. Change-Id: I0107e8cf7bba8dadcd9c733a2db1ceccb4261709 03 September 2016, 00:40:50 UTC
0aa2c3b Fix for smp changa, comment DataManagerHelper code Change-Id: Iabf9691ea668267606f5e38b57c3462be10b9641 03 September 2016, 00:38:55 UTC
20d4b42 A little GPU code documentation and a little cleanup. Change-Id: I39107effc3aae622f3398cdb0feaea54b864e4b5 03 September 2016, 00:34:32 UTC
2dc9c2a A couple of GPU comments. Change-Id: Idd18eea64bc03feb7284bb4f9aad1b65d8a391ea 03 September 2016, 00:33:30 UTC
4611271 Comment kernels, and correct call to Ewald*Kernel(). Change-Id: I4a58ab197339960f15730e83095538dced6d5305 03 September 2016, 00:33:05 UTC
dd76f84 Enable multistepping for Ewald. Change-Id: Ib88b34da222cc9dece8e0fbe7546f56c040bdeba 03 September 2016, 00:32:01 UTC
6407cc7 Implemented bGravStep on the GPU. Change-Id: If8869f396537fc5382501a00844b1e1b5f1b7ab7 03 September 2016, 00:31:22 UTC
d2ca8c6 cuda_typedef.h: eliminate dead classes. Change-Id: I74a448f82f40ddb2ce39a6c6e2b9c7f4b8e25aa4 03 September 2016, 00:30:30 UTC
3ce19dc Make GPU routines less verbose. Change-Id: I2867792171af36bf5d7bca23d0d82da83bcc7576 03 September 2016, 00:29:55 UTC
c11f6c4 Temporary fix for Ewald to avoid bad buffers. Change-Id: I89bc1cb6bfd58cecb4ff9a77d0f8d81f82c3a0d2 03 September 2016, 00:29:21 UTC
777d518 DataManager::resumeRemoteChunk(): commenceCalculateGravityLocal() can only be called once per iteration. Change-Id: I0343e0e56be8317598bbbbc64c1a7b350a9f6ba3 03 September 2016, 00:28:47 UTC
4c38db7 ckloop and CUDA defines. Change-Id: Ie7796c85492c4b817f0b453007e94d1240246cff 03 September 2016, 00:28:06 UTC
ad57347 Bug fixes Change-Id: Iee3f06bc9d00990485eb38b3fb565be67ef891a4 03 September 2016, 00:26:00 UTC
cdd9b98 Implementing fix for synchronizing GPU Manager device buffer table after transfers of remote chunk data Change-Id: I1c38aff37536a00f857625772405664a3f646212 03 September 2016, 00:16:51 UTC
9b165b9 Change type of thread indices on the GPU from char to int to support large bucket sizes Change-Id: I32bc357437eb81259802a6bd7db08bc228b6cafc 03 September 2016, 00:16:26 UTC
9eebb4b Ewald CUDA kernels: pass particleTable length instead of getting it from cachedData. On SMP machines the number of particles could come from another core. This would also break multiple treepieces per core. Change-Id: Ie4816b3cd166435ee1f981e86dbaafd7b24563f7 03 September 2016, 00:15:53 UTC
a5a160a Further bug fixes Change-Id: Ie6fd5b9c0afd4b6bc92a7ebde21e0698bdbb4f1b 03 September 2016, 00:15:12 UTC
3eea622 Purge shared buffers from GPU Manager table before each step Change-Id: I3daaa161146f2d7f9c674869787b6331e4225d4a 03 September 2016, 00:14:32 UTC
fcf8980 Further fixes. Will need cleanup later. Change-Id: Iae9bf8e8a8add9dbe1f35fca481c5bdd6b4aac8b 03 September 2016, 00:13:50 UTC
6070482 Further bug fixes Change-Id: Ibae27051546cafbede41e1709b461b55eb30bed2 03 September 2016, 00:13:12 UTC
d66d84e Bug fixes to previous commit Change-Id: Iebb3e33a96a173d4f3e7104ff20484f909d6cefb 03 September 2016, 00:12:38 UTC
10ee962 Fix for the bug when running CUDA version of ChaNGa in smp mode. Work in progress. Change-Id: If575cae1920f62c42689e68062861262d91d2036 03 September 2016, 00:10:02 UTC
06f668f changed buffer size to 4 to prevent flaky seg faults Change-Id: Id4504b89a4358baeb6e5e383ce1216a482eb067e 03 September 2016, 00:06:08 UTC
8f76818 Update HostCUDA.cu as per new GPU Manager Change-Id: Iaf0d0d849482549c5796ddd10b559cefae9808d4 03 September 2016, 00:05:32 UTC
a785953 Add softening support when running on the GPU Change-Id: I9e40389fcc1ff9b530b7f765739f89ad572f0625 03 September 2016, 00:05:01 UTC
a006a46 Add missing source files in Makefile Change-Id: Ibe26bd17197cff9ec24e6fd88f3dcbfb6f2619c3 03 September 2016, 00:03:44 UTC
9e41244 Adding HEXADECAPOLE functionality to Ewald on the GPU Change-Id: I9aa4a1e3b41ad085535e1d47e2b86128a7b22a54 03 September 2016, 00:03:03 UTC
07f4b23 Replace floats with cudatype where appropriate and add missing variable assignment in momEval for CUDA Change-Id: Ifc3976edf7fdf0c217f7718336c778fd83fc9e2d 03 September 2016, 00:02:00 UTC
ea7dc08 Allow disabling HEXADECAPOLE at configure time. Although normally this is a bad idea, the purpose is to allow reverting to the old CUDA code (which is incompatible with the hexadecapole code). Change-Id: I13ec2545793e0da52e542aff150669016107eaf6 02 September 2016, 23:59:41 UTC
39575b4 Makefile.in: fix build when CUDA is not enabled Since we don't generate cuda.mk unless CUDA is enabled, we need a conditional around the include statement in Makefile.in. Change-Id: I8961b656cabe50d3ad97902604f1cb20c5755414 02 September 2016, 23:58:54 UTC
02b620c split off CUDA version of `momEvalFmomrcm' into its own file This allows for easier unit testing. Change-Id: Ifb567fb4e99685f6a66ed3ea6f96d17c2a7b3b76 02 September 2016, 23:57:38 UTC
d7dc596 calculateRadiusFarthestParticle: constness fixes These changes allow use of `const'-qualified particle instances with this function. Conflicts: MultipoleMoments.h Change-Id: I98a6ee9180408259897b98f41fd0e6777b0f3613 02 September 2016, 23:57:02 UTC
608df6f Some header changes for Charm-less unit tests, missing #includes Backported from cuda-scaledmoments branch. Change-Id: I764d06392d6312c6ddc810953b27b6ce570d499e 02 September 2016, 23:56:06 UTC
64f3cd3 Revert "HostCUDA.cu: use sqrt(1.0/rsq) instead of the approximate rsqrt(rsq)" This reverts commit 75fbd14788fd30406dd090805d922b3ab9777be8. See <https://github.com/insaneinside/changa/commit/75fbd14788fd30406dd090805d922b3ab9777be8#commitcomment-5837323> for the discussion. Change-Id: Ia89bb2a2cf9fe8dd6f4f0ee0c79a97ad7da7cf34 02 September 2016, 23:55:12 UTC
4201e32 HostCUDA.cu: use sqrt(1.0/rsq) instead of the approximate rsqrt(rsq) N.B. some sources claim that sqrt(1/x) is more accurate than 1/sqrt(x). See, for example, the first paragraph of the message found at <http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00523.html>, which references Peter Markstein's "IA-64 and Elementary Functions". Change-Id: I28a5d7b89234aeb94492f6f97c6421438c826b7f 02 September 2016, 23:54:38 UTC
e50c3ad HostCUDA.cu: nodeGravityComputation: add missing factor to `u` Change-Id: I56f211239314425c1b8ef4a21283f973f9fda88c 02 September 2016, 23:54:03 UTC
2202348 HostCUDA.cu: typo fixes to hexadecapole-enabled nodeGravityComputation Change-Id: I97d3de593c1d4b52651ec1c35876b1aa5e049f00 02 September 2016, 23:53:37 UTC
18f2f8e EwaldGPU(): correct source of 1st-order multipole moments w/ HEXADECAPOLE Change-Id: I9228243f16da075826929af1e68475121eb88eca 02 September 2016, 23:53:09 UTC
a7288ed cuda_typedef.h: use correct fields when HEXADECAPOLE is enabled Change-Id: I5a39d29d9a7fa9e6dc85d1a769c94733135414f2 02 September 2016, 23:52:43 UTC
404cff9 Makefile.in: remove lines disabling HEXADECAPOLE when CUDA is enabled That was the whole point of the scaledmoments CUDA port, anyway. Change-Id: I5317b0b3b72bb7567d8b683f762748c16d103ea1 02 September 2016, 23:51:54 UTC
21c6a62 removed Makefile.dep from version control; build it in build directory Change-Id: I3bc7e838e4e56e6dc15cc1a082e69acf651906ad 02 September 2016, 23:50:05 UTC
443bb48 Revert "Makefile.in: corrected references to Makefile.dep (lives in SOURCE dir)" This reverts commit a1000b301dffe5091d3719fc7fc834b2e3bca63a. Change-Id: I915ec5df61b01e9e0e023bfd87f9145f10afc04d 02 September 2016, 23:49:32 UTC
aba4ed8 HostCUDA.cu: typo fix re: recent changes to nodeGravityComputation Change-Id: I1f9f8f594d2b748013d050c3e93f8e9e9d4ba0ca 02 September 2016, 23:48:50 UTC
ffc120a HostCUDA.cu: changed `u` to match parameter passed to `momEvalFmomrcm` Change-Id: I40dbddef2d3a6c914ecc04df67017adcf1f971e6 02 September 2016, 23:48:18 UTC
360a16a cuda.ac: use $withval, not $enableval, in processing --with-cuda-level Change-Id: I219ca2c7899cb0ef1752767de20377795d5d4550 02 September 2016, 23:47:52 UTC
d70d39f Makefile.in: separate include flags into a separate variable Easier to avoid passing the configure-chosen `-fpic' flag to `nvcc` this way (nvcc doesn't like it). This should fix the problem I'm having with compiling the CUDA branch on the Intel compiler suite. Change-Id: Id37654f7ed29a195a950ae4c86b3e881e499eaf7 02 September 2016, 23:44:41 UTC
1549da6 Makefile.in: corrected references to Makefile.dep (lives in SOURCE dir) Change-Id: Id471c2c2fac2abbb4d0fa7a76094d97f9a5329b5 02 September 2016, 23:37:30 UTC
cac8909 Makefile.in: keep C/C++ PreProcessor FLAGS (-D, -I, etc.) in CPPFLAGS Change-Id: I595563f268b5c98cc8e8c2ab5bf96231d51c266e 02 September 2016, 23:28:08 UTC
19cd0ba cuda.ac, configure: only generate cuda.mk if CUDA is enabled Change-Id: If47a979fe2554e5220292c849b41aeec928fa95d 02 September 2016, 22:15:21 UTC
4eed9b7 HostCUDA.cu: try one at porting `momEvalFmomrcm` into `nodeGravityComputation` Change-Id: I8ae6c734fb486967ad633fc4675a5e84a679de93 02 September 2016, 22:14:40 UTC
787a9f0 cuda_typedef.h: added hexadecapole members to CudaMultipoleMoments Change-Id: I5e822b85c09dd478d06fb64253576c0230fb70bd 02 September 2016, 22:13:39 UTC
7d0fc8e cuda_typedef.h: needs to #include "cosmoType.h" Change-Id: Id026ff332692d682123c4f51953d5578600a5ccb 02 September 2016, 22:13:15 UTC
a00246c CacheInterface.h: make CkCache.h an angled #include Needed for similar reason as described in commit @c9a512e: ChaNGa's dependency generation assumes that CkCache.h is a local include, and here the files _it_ includes were being listed without path in Makefile.dep. Change-Id: Ie9086cfd66e0174d7f9027eed5112a1add6e3851 02 September 2016, 22:12:38 UTC
e0d7d68 Makefile.in: add naive fixed CUDA-level `nvcc` flags This is better than nvcc's default, which uses compute capability 1.0, but may not be suitable in all circumstances. Also added a comment pointing to more information on less-naive ways to specify target CUDA capability. Change-Id: I97e73aad6d36141c1434e17fafd856104a83acf8 02 September 2016, 22:11:30 UTC
2a05d5e cuda.{ac,mk.in}: added a --with-cuda-level option For specifying the CUDA compute capability to use when compiling. Change-Id: I587fa2f3bdfe83a72eb333e106d508dc582c7cef 02 September 2016, 22:11:00 UTC
d7051c0 cuda.ac: update docs to reflect > 1 `--with` arg Change-Id: I2ed8333a534f299f0d4b310a5f9c35351418df54 02 September 2016, 22:10:13 UTC
2c0691a cuda.{mk.in,ac},configure,Makefile.in: more CUDA `configure` options, docs * Moved `configure`-time CUDA configuration values to `cuda.mk.in`. * Added --with-cuda-sdk=PATH option => CUDA_SDK_DIR in cuda.mk * Added additional documentation to cuda.ac * Added HAVE_CUDA and HAVE_CUDA_SDK vars to cuda.mk Change-Id: I893b4fbcfb8e6808c3bf3f429765dfa718d72088 02 September 2016, 22:09:01 UTC
b9d8ab3 cuda.ac, configure: added/extended a couple of CUDA-related comments Change-Id: Ie602ba419cb2d01bdbc99d549c0bd74c9db6f15f 02 September 2016, 22:08:19 UTC
d512b00 cuda.ac, configure: disabled CUDA auto-enable Change-Id: I93d9e591cf14b784a7c84d4083aa908f938d9e28 02 September 2016, 22:07:34 UTC
74299e3 HostCUDA.h: replaced `boolean` enum with C++ `bool` All C++ standards to date provide for promotion of a `bool` to an integer, and all specify that `false` becomes 0 and `true` becomes 1, so this should have no effect whatsoever on the compiled program. Change-Id: I1eeeed695b6f598bca631238508908eb39fec411 02 September 2016, 22:06:34 UTC
3bcc453 HostCUDA.h: follow convention for "local" vs. <non-local> #includes Change-Id: Ie1d9522ca1da80448c5c12574749a3f1ef1ccb8e 02 September 2016, 22:05:54 UTC
ad8fbdd HostCUDA.cu: use a macro to simplify host-memory allocation code. Change-Id: If74f785012c00884eb93d6d82956ca80dca008f6 02 September 2016, 22:05:01 UTC
c4d6b5e configure.ac: remove mostly-useless CUDA-related comment Change-Id: Ie287ff047128344795fb9b8c9ef92eb8122d7e29 02 September 2016, 22:04:23 UTC
6b57307 cuda.ac: tweaks to a comment paragraph; don't deal with autoconf bugs Change-Id: Ia673523a8d3dc9466d949b2e23b0b429e6f645a1 02 September 2016, 22:03:28 UTC