Revision Author Date Message Commit Date
9b165b9 Change type of thread indices on the GPU from char to int to support large bucket sizes Change-Id: I32bc357437eb81259802a6bd7db08bc228b6cafc 03 September 2016, 00:16:26 UTC
9eebb4b Ewald CUDA kernels: pass particleTable length instead of getting it from cachedData. On SMP machines the number of particles could come from another core. This would also break multiple treepieces per core. Change-Id: Ie4816b3cd166435ee1f981e86dbaafd7b24563f7 03 September 2016, 00:15:53 UTC
a5a160a Further bug fixes Change-Id: Ie6fd5b9c0afd4b6bc92a7ebde21e0698bdbb4f1b 03 September 2016, 00:15:12 UTC
3eea622 Purge shared buffers from GPU Manager table before each step Change-Id: I3daaa161146f2d7f9c674869787b6331e4225d4a 03 September 2016, 00:14:32 UTC
fcf8980 Further fixes. Will need cleanup later. Change-Id: Iae9bf8e8a8add9dbe1f35fca481c5bdd6b4aac8b 03 September 2016, 00:13:50 UTC
6070482 Further bug fixes Change-Id: Ibae27051546cafbede41e1709b461b55eb30bed2 03 September 2016, 00:13:12 UTC
d66d84e Bug fixes to previous commit Change-Id: Iebb3e33a96a173d4f3e7104ff20484f909d6cefb 03 September 2016, 00:12:38 UTC
10ee962 Fix for the bug when running CUDA version of ChaNGa in smp mode. Work in progress. Change-Id: If575cae1920f62c42689e68062861262d91d2036 03 September 2016, 00:10:02 UTC
06f668f changed buffer size to 4 to prevent flaky seg faults Change-Id: Id4504b89a4358baeb6e5e383ce1216a482eb067e 03 September 2016, 00:06:08 UTC
8f76818 Update HostCUDA.cu as per new GPU Manager Change-Id: Iaf0d0d849482549c5796ddd10b559cefae9808d4 03 September 2016, 00:05:32 UTC
a785953 Add softening support when running on the GPU Change-Id: I9e40389fcc1ff9b530b7f765739f89ad572f0625 03 September 2016, 00:05:01 UTC
a006a46 Add missing source files in Makefile Change-Id: Ibe26bd17197cff9ec24e6fd88f3dcbfb6f2619c3 03 September 2016, 00:03:44 UTC
9e41244 Adding HEXADECAPOLE functionality to Ewald on the GPU Change-Id: I9aa4a1e3b41ad085535e1d47e2b86128a7b22a54 03 September 2016, 00:03:03 UTC
07f4b23 Replace floats with cudatype where appropriate and add missing variable assignment in momEval for CUDA Change-Id: Ifc3976edf7fdf0c217f7718336c778fd83fc9e2d 03 September 2016, 00:02:00 UTC
ea7dc08 Allow disabling HEXADECAPOLE at configure time. Although normally this is a bad idea, the purpose is to allow reverting to the old CUDA code (which is incompatible with the hexadecapole code). Change-Id: I13ec2545793e0da52e542aff150669016107eaf6 02 September 2016, 23:59:41 UTC
39575b4 Makefile.in: fix build when CUDA is not enabled Since we don't generate cuda.mk unless CUDA is enabled, we need a conditional around the include statement in Makefile.in. Change-Id: I8961b656cabe50d3ad97902604f1cb20c5755414 02 September 2016, 23:58:54 UTC
02b620c split off CUDA version of `momEvalFmomrcm' into its own file This allows for easier unit testing. Change-Id: Ifb567fb4e99685f6a66ed3ea6f96d17c2a7b3b76 02 September 2016, 23:57:38 UTC
d7dc596 calculateRadiusFarthestParticle: constness fixes These changes allow use of `const'-qualified particle instances with this function. Conflicts: MultipoleMoments.h Change-Id: I98a6ee9180408259897b98f41fd0e6777b0f3613 02 September 2016, 23:57:02 UTC
608df6f Some header changes for Charm-less unit tests, missing #includes Backported from cuda-scaledmoments branch. Change-Id: I764d06392d6312c6ddc810953b27b6ce570d499e 02 September 2016, 23:56:06 UTC
64f3cd3 Revert "HostCUDA.cu: use sqrt(1.0/rsq) instead of the approximate rsqrt(rsq)" This reverts commit 75fbd14788fd30406dd090805d922b3ab9777be8. See <https://github.com/insaneinside/changa/commit/75fbd14788fd30406dd090805d922b3ab9777be8#commitcomment-5837323> for the discussion. Change-Id: Ia89bb2a2cf9fe8dd6f4f0ee0c79a97ad7da7cf34 02 September 2016, 23:55:12 UTC
4201e32 HostCUDA.cu: use sqrt(1.0/rsq) instead of the approximate rsqrt(rsq) N.B. some sources claim that sqrt(1/x) is more accurate than 1/sqrt(x). See, for example, the first paragraph of the message found at <http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00523.html>, which references Peter Markstein's "IA-64 and Elementary Functions". Change-Id: I28a5d7b89234aeb94492f6f97c6421438c826b7f 02 September 2016, 23:54:38 UTC
e50c3ad HostCUDA.cu: nodeGravityComputation: add missing factor to `u` Change-Id: I56f211239314425c1b8ef4a21283f973f9fda88c 02 September 2016, 23:54:03 UTC
2202348 HostCUDA.cu: typo fixes to hexadecapole-enabled nodeGravityComputation Change-Id: I97d3de593c1d4b52651ec1c35876b1aa5e049f00 02 September 2016, 23:53:37 UTC
18f2f8e EwaldGPU(): correct source of 1st-order multipole moments w/ HEXADECAPOLE Change-Id: I9228243f16da075826929af1e68475121eb88eca 02 September 2016, 23:53:09 UTC
a7288ed cuda_typedef.h: use correct fields when HEXADECAPOLE is enabled Change-Id: I5a39d29d9a7fa9e6dc85d1a769c94733135414f2 02 September 2016, 23:52:43 UTC
404cff9 Makefile.in: remove lines disabling HEXADECAPOLE when CUDA is enabled That was the whole point of the scaledmoments CUDA port, anyway. Change-Id: I5317b0b3b72bb7567d8b683f762748c16d103ea1 02 September 2016, 23:51:54 UTC
21c6a62 removed Makefile.dep from version control; build it in build directory Change-Id: I3bc7e838e4e56e6dc15cc1a082e69acf651906ad 02 September 2016, 23:50:05 UTC
443bb48 Revert "Makefile.in: corrected references to Makefile.dep (lives in SOURCE dir)" This reverts commit a1000b301dffe5091d3719fc7fc834b2e3bca63a. Change-Id: I915ec5df61b01e9e0e023bfd87f9145f10afc04d 02 September 2016, 23:49:32 UTC
aba4ed8 HostCUDA.cu: typo fix re: recent changes to nodeGravityComputation Change-Id: I1f9f8f594d2b748013d050c3e93f8e9e9d4ba0ca 02 September 2016, 23:48:50 UTC
ffc120a HostCUDA.cu: changed `u` to match parameter passed to `momEvalFmomrcm` Change-Id: I40dbddef2d3a6c914ecc04df67017adcf1f971e6 02 September 2016, 23:48:18 UTC
360a16a cuda.ac: use $withval, not $enableval, in processing --with-cuda-level Change-Id: I219ca2c7899cb0ef1752767de20377795d5d4550 02 September 2016, 23:47:52 UTC
d70d39f Makefile.in: separate include flags into a separate variable Easier to avoid passing the configure-chosen `-fpic' flag to `nvcc` this way (nvcc doesn't like it). This should fix the problem I'm having with compiling the CUDA branch on the Intel compiler suite. Change-Id: Id37654f7ed29a195a950ae4c86b3e881e499eaf7 02 September 2016, 23:44:41 UTC
1549da6 Makefile.in: corrected references to Makefile.dep (lives in SOURCE dir) Change-Id: Id471c2c2fac2abbb4d0fa7a76094d97f9a5329b5 02 September 2016, 23:37:30 UTC
cac8909 Makefile.in: keep C/C++ PreProcessor FLAGS (-D, -I, etc.) in CPPFLAGS Change-Id: I595563f268b5c98cc8e8c2ab5bf96231d51c266e 02 September 2016, 23:28:08 UTC
19cd0ba cuda.ac, configure: only generate cuda.mk if CUDA is enabled Change-Id: If47a979fe2554e5220292c849b41aeec928fa95d 02 September 2016, 22:15:21 UTC
4eed9b7 HostCUDA.cu: try one at porting `momEvalFmomrcm` into `nodeGravityComputation` Change-Id: I8ae6c734fb486967ad633fc4675a5e84a679de93 02 September 2016, 22:14:40 UTC
787a9f0 cuda_typedef.h: added hexadecapole members to CudaMultipoleMoments Change-Id: I5e822b85c09dd478d06fb64253576c0230fb70bd 02 September 2016, 22:13:39 UTC
7d0fc8e cuda_typedef.h: needs to #include "cosmoType.h" Change-Id: Id026ff332692d682123c4f51953d5578600a5ccb 02 September 2016, 22:13:15 UTC
a00246c CacheInterface.h: make CkCache.h an angled #include Needed for similar reason as described in commit @c9a512e: ChaNGa's dependency generation assumes that CkCache.h is a local include, and here the files _it_ includes were being listed without path in Makefile.dep. Change-Id: Ie9086cfd66e0174d7f9027eed5112a1add6e3851 02 September 2016, 22:12:38 UTC
e0d7d68 Makefile.in: add naive fixed CUDA-level `nvcc` flags This is better than nvcc's default, which uses compute capability 1.0, but may not be suitable in all circumstances. Also added a comment pointing to more information on less-naive ways to specify target CUDA capability. Change-Id: I97e73aad6d36141c1434e17fafd856104a83acf8 02 September 2016, 22:11:30 UTC
2a05d5e cuda.{ac,mk.in}: added a --with-cuda-level option For specifying the CUDA compute capability to use when compiling. Change-Id: I587fa2f3bdfe83a72eb333e106d508dc582c7cef 02 September 2016, 22:11:00 UTC
d7051c0 cuda.ac: update docs to reflect > 1 `--with` arg Change-Id: I2ed8333a534f299f0d4b310a5f9c35351418df54 02 September 2016, 22:10:13 UTC
2c0691a cuda.{mk.in,ac},configure,Makefile.in: more CUDA `configure` options, docs * Moved `configure`-time CUDA configuration values to `cuda.mk.in`. * Added --with-cuda-sdk=PATH option => CUDA_SDK_DIR in cuda.mk * Added additional documentation to cuda.ac * Added HAVE_CUDA and HAVE_CUDA_SDK vars to cuda.mk Change-Id: I893b4fbcfb8e6808c3bf3f429765dfa718d72088 02 September 2016, 22:09:01 UTC
b9d8ab3 cuda.ac, configure: added/extended a couple of CUDA-related comments Change-Id: Ie602ba419cb2d01bdbc99d549c0bd74c9db6f15f 02 September 2016, 22:08:19 UTC
d512b00 cuda.ac, configure: disabled CUDA auto-enable Change-Id: I93d9e591cf14b784a7c84d4083aa908f938d9e28 02 September 2016, 22:07:34 UTC
74299e3 HostCUDA.h: replaced `boolean` enum with C++ `bool` All C++ standards to date provide for promotion of a `bool` to an integer, and all specify that `false` becomes 0 and `true` becomes 1, so this should have no effect whatsoever on the compiled program. Change-Id: I1eeeed695b6f598bca631238508908eb39fec411 02 September 2016, 22:06:34 UTC
3bcc453 HostCUDA.h: follow convention for "local" vs. <non-local> #includes Change-Id: Ie1d9522ca1da80448c5c12574749a3f1ef1ccb8e 02 September 2016, 22:05:54 UTC
ad8fbdd HostCUDA.cu: use a macro to simplify host-memory allocation code. Change-Id: If74f785012c00884eb93d6d82956ca80dca008f6 02 September 2016, 22:05:01 UTC
c4d6b5e configure.ac: remove mostly-useless CUDA-related comment Change-Id: Ie287ff047128344795fb9b8c9ef92eb8122d7e29 02 September 2016, 22:04:23 UTC
6b57307 cuda.ac: tweaks to a comment paragraph; don't deal with autoconf bugs Change-Id: Ia673523a8d3dc9466d949b2e23b0b429e6f645a1 02 September 2016, 22:03:28 UTC
4df6503 HostCUDA.{cu,h}: removed extraneous executable bit from files. Change-Id: I7d919af81a63d590b38775d5c159ed19288ced88 02 September 2016, 22:01:27 UTC
124beb1 configure.ac: moved CUDA detection script to a separate file `cuda.ac` Also added some cleanups to CUDA-related variable names Change-Id: I5e45a16c0d8c0d87046ffaf01373b1906c676e2a 02 September 2016, 22:00:22 UTC
607cc8f configure{,.ac}: Added CUDA toolkit detection This adds the option `--with-cuda[=PATH]` to `configure`. When the option is omitted, CUDA is enabled automatically if a toolkit directory is found (`--without-cuda` can be used to disable this behavior). If the automatic-enable is not wanted, it can be disabled by changing the line ENABLE_CUDA="auto" to ENABLE_CUDA="no" in configure.ac. Change-Id: I22a262d60d124a98c1f9653911539db6e393a0f0 02 September 2016, 21:50:45 UTC
e26d611 ParallelGravity.h: made liveViz.h an angled #include Without this, Charm's dependency-generation was listing "liveViz.h" -- without path -- as a dependency for any object built from a source that includes ParallelGravity.h. Since liveViz.h is found via path search (and is not kept in the local directory), this is the right include style to use for it anyway. Change-Id: I3d302364268a9500a36d02ea530358ef670b6a4e 02 September 2016, 21:49:24 UTC
ab7714e Makefile.in: add source and build directories to C include path, too Change-Id: I582804e2b938bd9687777342464614ddc97ff0a8 02 September 2016, 21:47:49 UTC
fe728eb configure{,.ac}, Makefile.in: added support for out-of-source builds Change-Id: If4952122bbb34eff879b34a0e5584146bc46c1b6 02 September 2016, 21:46:35 UTC
8fca345 Makefile.in: Makefile depends on Makefile.in To regenerate Makefile, we use `config.status` (which re-runs `configure` with all user-provided options). Change-Id: Ia624505cf0f8dda5cdcfc6e28dc1813835d29993 02 September 2016, 21:44:55 UTC
5362a7d Don't make COSMO_FLOAT the default (for now). Change-Id: I2ddfdcf35486094e0bb2cf44a6d19d54e0339292 02 September 2016, 21:43:22 UTC
76a59c6 Use cosmoType as hosttype Change-Id: I36a01dde1826d4e8a87ff06b04a2a79bb99845e3 02 September 2016, 21:39:17 UTC
15e22a0 Fixed scaled moments to be ignored when compiled without HEXADECAPOLE Change-Id: I210c55558a9f88585915a3785918a87b904ecdb6 02 September 2016, 21:30:10 UTC
8111a50 Distinguish between "COSMO_FLOAT", the type used for particle positions, and "SSE_COSMO_FLOAT", the type used in the SSE force calculations. Change-Id: I29e1d33c653db8295d21bde3f8ea58504b318615 02 September 2016, 05:10:57 UTC
bc2ee32 Delete isinf() debugging checks. Change-Id: Id5d37b599d355415506193c28cbaa1d09055cb47 02 September 2016, 04:53:41 UTC
681d06a Move cosmoType into separate header for inclusion into C files. Change-Id: I015dbaa04d68b44e8c9bbc04c99ea0fd7fe72ccb 02 September 2016, 04:47:24 UTC
8be9a92 Changed the opening criteria to cosmoType. Change-Id: Ie70c17a09b70be7ab7f769b6122910b821ef1f38 02 September 2016, 04:35:19 UTC
648ae51 Changed periodic offsets to cosmoType. Change-Id: Ibef599fd5f270df24b34ca3affa1b24810ab2585 30 August 2016, 16:49:11 UTC
8929347 Implemented scaled multipoles in single precision. This includes a fix to SSEdefs.h: cosmoMask was wrong. Change-Id: If2a7cb6e04e14a767a5e78ff88a9ffbc189dfb93 12 August 2016, 05:26:50 UTC
0cd087f First crack at scaled moments allowing single precision. Does not compile. Change-Id: Ic94f1dfbfc1c39c2d174a2ee974d3a1bd09c7f85 12 August 2016, 05:26:41 UTC
4b03974 Make SSE2 and AVX vectorization configurable. Also change FMA4 macro name to AVX as this is more accurate. Change-Id: I5c1474b5852a48d83ee796b37daf4fa9f549c0b6 18 June 2016, 23:45:28 UTC
000bfac Update web page referral. Change-Id: If9c8da7ed3ad9b631aaca2f3816576d625cb570d 30 May 2016, 22:21:53 UTC
f273127 GenericTreeNode cleanup: usedBy is never used. Change-Id: I35b38e1332e9e5e08d8c0458a9d9579af211d6ca 30 May 2016, 04:41:54 UTC
6aa9a2b Version uptick to v3.2. Change-Id: Id4961d47cff369f1de40c62acb429bde624f0ba2 03 May 2016, 19:20:34 UTC
4d763c1 Merge branch 'public' into ppl_master Conflicts: TreePiece.cpp 23 April 2016, 17:26:07 UTC
96b75cf Merge branch 'master' of charmgit:cosmo/changa into ppl_master 23 April 2016, 17:21:59 UTC
6d74ca5 Merge branch 'trq/gamma_warn' into public 15 April 2016, 05:35:23 UTC
d92cb2d assignKeys(): increase bounding box slop. This time to be sure there is never a "0" key, which tickles an edge bug in domain decomposition. 13 April 2016, 03:50:54 UTC
f7ecf11 Abort on bad value of dConstGamma. 09 April 2016, 17:16:55 UTC
b79547b Better documentation for cache functions. Change-Id: I18ef9546b6a6a6e2610692374f5f2146070fc7fa 26 February 2016, 23:53:01 UTC
84a70cd Merge branch 'public' into ppl_master 17 February 2016, 18:25:18 UTC
243ec6a Main::writeOutput(): move mkdir()s out of CkAsserts() for N-Chilada files. Also time N-Chilada outputs. Change-Id: I272d141f33e80fa63aa3cb0045db2a5267cfae86 30 January 2016, 19:51:01 UTC
c42220c Merge pull request #106 from N-BodyShop/trq/feedbackstepfix Force timesteps to be at least as small as the star formation time. 30 January 2016, 05:16:12 UTC
d334fbb assignKeys(): another slight increase of roundoff slop for boxsize. Change-Id: Iabe4025b68f4b9355e20d71ac6bce1e72af28b65 30 January 2016, 05:07:15 UTC
d0eae68 Force timesteps to be at least as small as the star formation time. This corrects a feedback problem. 28 January 2016, 05:27:07 UTC
e572b13 Updated README and CHANGES in anticipation of new release. Change-Id: If3b1986a78a27db3fb145a05988db159752d2290 18 January 2016, 19:11:13 UTC
6abe351 Initialize random number generator. Change-Id: I22fef3ddc5f9e75232b08a4cda2d31c0cf59a5e2 17 January 2016, 05:16:13 UTC
6f3fa14 Add parameter "bFracLoadBalance" to specify how often to do load balancing. Change-Id: I867b4f31112c03987a3fb4adc71d2f00483cc56f 16 January 2016, 23:08:26 UTC
019b4da Merge branch 'public' into ppl_master 13 January 2016, 05:37:14 UTC
8ff66f7 Main(): warn about nReplicas = 0 and bPeriodic = 1. Change-Id: I1e46681abc44232d09a763eb85c6af4bc383aeca 13 January 2016, 05:34:37 UTC
c4784c8 Enable "killAt" for restarts. Change-Id: I736f2def814de4465f0a6a9fce8cdbd18db22bf6 13 January 2016, 05:28:37 UTC
d92c473 More documentation for imf and cooling_cosmo. Change-Id: Ie51d4a83c8171937f2547aae29966d1a0c0af45a 13 January 2016, 04:50:29 UTC
51cc7e2 Main::ReadASCII(): always print warning about file open fail. This will make missing UV file errors more obvious. Change-Id: Ieaabcc3e14032733bdd146adee85b684b64be604 13 January 2016, 04:48:40 UTC
801f5a0 configure: give precedence to CHARM_DIR Change-Id: I7cc96fc7b61bc40e9d6c5ce8a122cd2e41c72824 12 January 2016, 21:58:37 UTC
84ab0fe Test for checkpoint failure. Also call Main::restart() directly. Change-Id: Ib688f8b91bc36d72c3abfa62f4f24b6d3d5390c3 09 January 2016, 04:45:19 UTC
40365aa Document self-shielding code in cooling_cosmo.c Change-Id: I16ac8a3a4136f5e2d77ac3fd95b65f8fc7abf455 04 January 2016, 21:07:36 UTC
c68d71a Main::addDelParticles(): use %ld when printing out New particle numbers. Change-Id: I81318c3a414129aee0df10475c76d63784ccc870 13 December 2015, 01:08:22 UTC
93d4c50 Merge branch 'ppl_master' into public 11 December 2015, 05:57:10 UTC
5a980e7 Sorter::collectEvaluationsOct() small change to handle small particle counts. Change-Id: Iad748f5d173156524194feef6dfac5054ca9dbe4 11 December 2015, 05:52:20 UTC
267f813 Handle Oct decomp in new Tree Build. Change-Id: I9fe0f7d024a8fbb537f1e71c34af59b858512de1 03 December 2015, 05:03:28 UTC
9d3aca4 Allow for large iOrders in star formation events. This changes the size of the starlog file. Change-Id: If0dd5ae3c29903b41a848e1147cac157fde09da3 26 November 2015, 23:09:28 UTC
75ac81b Allow large (> 2 billion) iOrders in photogenic file. Change-Id: I72bb7dfaad444ba4002a75b34a356d69a6f0f804 26 November 2015, 22:43:57 UTC
500dac4 TreePiece::buildTree(): handle one TreePiece case. Also delete obsolete comment. Change-Id: I25a0d8ac8577278ba904d40aef61c86c106482e6 21 November 2015, 05:51:59 UTC
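
Note on 74299e3 (`boolean` enum replaced with C++ `bool`): the commit message relies on `false` converting to 0 and `true` converting to 1. A minimal sketch of that guarantee follows. The names `LegacyBoolean`, `countSet`, and `checkBoolPromotion` are hypothetical and do not come from HostCUDA.h; the enum only stands in for whatever the removed `boolean` enum looked like.

```cpp
#include <cassert>

// Hypothetical stand-in for the removed `boolean` enum (names are assumed,
// not taken from HostCUDA.h).
enum LegacyBoolean { LEGACY_FALSE = 0, LEGACY_TRUE = 1 };

// Counts how many flags are set, relying on bool -> int promotion:
// each operand is promoted to int 0 or 1 before the addition.
inline int countSet(bool a, bool b, bool c) {
    return a + b + c;
}

// Checks that bool values line up with the old enum's integer values.
inline void checkBoolPromotion() {
    static_assert(static_cast<int>(false) == LEGACY_FALSE, "false converts to 0");
    static_assert(static_cast<int>(true) == LEGACY_TRUE, "true converts to 1");
    assert(countSet(true, false, true) == 2);
}
```

Code that compared or summed the old enum values as integers should therefore behave the same once the fields are declared `bool`, assuming the enum used 0 and 1 as sketched here.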