0bbf787 | Cedric Nugteren | 21 November 2022, 16:12:32 UTC | Merge pull request #59 from trixirt/fix_numeric_limits include limits header | 21 November 2022, 16:12:32 UTC |
9d8b2d0 | Tom Rix | 19 November 2022, 14:36:28 UTC | include limits header. On Ubuntu 22.04 / g++ 11.3 there is this error: CLTune/src/ml_model.cc:58:21: error: ‘numeric_limits’ is not a member of ‘std’ (line 58: auto min = std::numeric_limits<T>::max();). numeric_limits is defined in the limits header, so include it. Signed-off-by: Tom Rix <trix@redhat.com> | 19 November 2022, 14:36:28 UTC |
6167b30 | Cedric Nugteren | 04 February 2022, 08:05:32 UTC | Merge pull request #58 from afshinarefi/master Add explicit header for std::stringstream | 04 February 2022, 08:05:32 UTC |
5b38c67 | Afshin Arefi | 03 February 2022, 15:47:03 UTC | Add explicit header for std::stringstream | 03 February 2022, 15:47:03 UTC |
6df6ac6 | Cedric Nugteren | 17 June 2020, 18:20:04 UTC | Merge pull request #56 from Knutakir/missing-cuda-functions Add missing functions for CUDA | 17 June 2020, 18:20:04 UTC |
022f4c4 | Knut Kirkhorn | 16 June 2020, 16:03:31 UTC | Add missing functions for CUDA | 16 June 2020, 16:03:31 UTC |
be5f37a | Cedric Nugteren | 19 November 2018, 19:53:19 UTC | Fixed AppVeyor issues | 19 November 2018, 19:53:19 UTC |
050ab01 | Cedric Nugteren | 19 November 2018, 19:50:14 UTC | Merge pull request #53 from rtrembecky/master 2D convolution sample - fix correctness and redundant loads | 19 November 2018, 19:50:14 UTC |
63ba4db | rtrembecky | 18 November 2018, 14:15:15 UTC | fix redundant loads | 18 November 2018, 14:20:26 UTC |
c5246f3 | rtrembecky | 18 November 2018, 13:30:01 UTC | fix correctness - fix global offset for input data access | 18 November 2018, 14:20:38 UTC |
45078d9 | rtrembecky | 18 November 2018, 13:07:44 UTC | fix correctness - fix input padded data initialization | 18 November 2018, 14:20:26 UTC |
7ca2c87 | Cedric Nugteren | 30 August 2018, 19:42:20 UTC | Added reference to KTT | 30 August 2018, 19:42:20 UTC |
06721eb | Cedric Nugteren | 16 September 2017, 16:29:48 UTC | Fixed an issue with the NVIDIA compute capability not being retrieved properly | 16 September 2017, 16:29:48 UTC |
1ba3595 | Cedric Nugteren | 14 September 2017, 20:02:35 UTC | Removed development builds from README | 14 September 2017, 20:02:35 UTC |
7c47f44 | Cedric Nugteren | 14 September 2017, 20:01:25 UTC | Removed development builds from README | 14 September 2017, 20:01:25 UTC |
e0484dc | Cedric Nugteren | 14 September 2017, 20:00:48 UTC | Added a guard against missing AMD and NVIDIA extensions | 14 September 2017, 20:00:48 UTC |
6d4b506 | Cedric Nugteren | 10 September 2017, 13:39:18 UTC | Added additional OpenCL information printing to screen and to JSON | 10 September 2017, 13:39:18 UTC |
7c5a580 | Cedric Nugteren | 05 September 2017, 18:21:47 UTC | Added printing of the parameter configuration in verbose mode; prints parameters to stdout before compiling and running a kernel (with -DVERBOSE=ON only) | 05 September 2017, 18:21:47 UTC |
fc28b36 | Cedric Nugteren | 25 July 2017, 18:48:49 UTC | Global and local thread size dividers now perform a ceiled division by default | 25 July 2017, 18:48:49 UTC |
6b7c50b | Cedric Nugteren | 26 June 2017, 19:11:53 UTC | Merge branch 'development' | 26 June 2017, 19:11:53 UTC |
3c577cc | Cedric Nugteren | 26 June 2017, 19:11:34 UTC | Updated to version 2.7.0 | 26 June 2017, 19:11:34 UTC |
3225b69 | Cedric Nugteren | 26 June 2017, 18:51:20 UTC | Fixed AppVeyor settings related to a recent change in the Khronos repos | 26 June 2017, 18:51:20 UTC |
ba07c79 | Cedric Nugteren | 26 June 2017, 18:44:04 UTC | The tuner now automatically ensures global size is a multiple of the local size | 26 June 2017, 18:44:04 UTC |
2b49667 | Cedric Nugteren | 20 February 2017, 21:20:11 UTC | Fixed a bug in the annealing method and added some extra sanity checks | 20 February 2017, 21:20:11 UTC |
8eb3df1 | Cedric Nugteren | 20 February 2017, 21:04:05 UTC | Added a function to return the best tuning parameters to the user; factored out the logic to get the best tuning results | 20 February 2017, 21:04:05 UTC |
115e14b | Cedric Nugteren | 17 February 2017, 20:22:24 UTC | Changed std::initializer_list in the AddParameters API to std::vector | 17 February 2017, 20:22:24 UTC |
35de111 | Cedric Nugteren | 23 October 2016, 13:41:14 UTC | Merge pull request #46 from CNugteren/development Update to version 2.6.0 | 23 October 2016, 13:41:14 UTC |
dc1cb0b | Cedric Nugteren | 23 October 2016, 13:29:58 UTC | Updated to version 2.6.0 | 23 October 2016, 13:29:58 UTC |
a8c6871 | Cedric Nugteren | 23 October 2016, 13:29:27 UTC | Added support for pkg-config installation on Linux | 23 October 2016, 13:29:27 UTC |
73ed6c3 | Cedric Nugteren | 22 October 2016, 14:42:10 UTC | Added an option to compile a static library | 22 October 2016, 14:42:10 UTC |
bdbf353 | Cedric Nugteren | 12 October 2016, 19:43:01 UTC | Fixed a const/constexpr issue caused by the previous commit | 12 October 2016, 19:43:01 UTC |
083a5e2 | Cedric Nugteren | 12 October 2016, 19:36:20 UTC | Added support for compilation under Visual Studio 2013 (MSVC++ 12.0) | 12 October 2016, 19:36:20 UTC |
0ed56a1 | Cedric Nugteren | 02 October 2016, 11:44:46 UTC | It is now possible to set the OpenCL compiler options through an environment variable | 02 October 2016, 11:44:46 UTC |
5219183 | Cedric Nugteren | 02 October 2016, 11:38:45 UTC | Execution time measurement is no longer based on events but uses CPU timers instead, to also include the (varying) kernel launch time overhead and other overheads (if any) | 02 October 2016, 11:38:45 UTC |
d0ec5a1 | Cedric Nugteren | 27 September 2016, 19:04:58 UTC | Merge pull request #45 from CNugteren/development Update to version 2.5.0 | 27 September 2016, 19:04:58 UTC |
82dd234 | Cedric Nugteren | 27 September 2016, 18:53:32 UTC | Updated to version 2.5.0 | 27 September 2016, 18:53:32 UTC |
2a56722 | Cedric Nugteren | 27 September 2016, 18:49:47 UTC | Made the number of runs for averaging a setting configurable by the user | 27 September 2016, 18:49:47 UTC |
bb4ba83 | Cedric Nugteren | 27 September 2016, 18:48:10 UTC | Updated to version 8.0 of CLCudaAPI | 27 September 2016, 18:48:10 UTC |
492c362 | Cedric Nugteren | 03 August 2016, 18:21:59 UTC | Updated Travis CI to use the system OpenCL instead of compiling our own OpenCL library | 03 August 2016, 18:21:59 UTC |
68cb1d4 | Cedric Nugteren | 03 August 2016, 18:17:41 UTC | Updated to version 7.0 of the CLCudaAPI header | 03 August 2016, 18:17:41 UTC |
a6cb325 | Cedric Nugteren | 03 August 2016, 18:00:03 UTC | Merge pull request #44 from williamjshipman/development Fix bug in Kernel::LocalMemUsage on Intel CPU runtime | 03 August 2016, 18:00:03 UTC |
d8318a5 | williamjshipman | 30 July 2016, 23:31:22 UTC | Fix bug in Kernel::LocalMemUsage where the Intel CPU runtime returns a size of 0 in the first call to clGetKernelWorkGroupInfo. Cause seems to be an ambiguity in the OpenCL standard. | 30 July 2016, 23:31:22 UTC |
86dbb2e | Cedric Nugteren | 29 June 2016, 17:50:22 UTC | Merge pull request #42 from CNugteren/development Update to version 2.4.0 | 29 June 2016, 17:50:22 UTC |
a001605 | Cedric Nugteren | 29 June 2016, 16:21:25 UTC | Minor fix to the AppVeyor CI build | 29 June 2016, 16:21:25 UTC |
45b2c52 | Cedric Nugteren | 29 June 2016, 16:10:46 UTC | Updated to version 2.4.0 | 29 June 2016, 16:10:46 UTC |
0526f9d | Cedric Nugteren | 29 June 2016, 16:08:10 UTC | Made it possible to run some of the GEMM kernels using CUDA (those without shared memory) | 29 June 2016, 16:08:10 UTC |
6177c14 | Cedric Nugteren | 29 June 2016, 15:50:12 UTC | Updated to version 6.0 of the CLCudaAPI header | 29 June 2016, 15:50:12 UTC |
609ea4c | Cedric Nugteren | 29 June 2016, 15:49:52 UTC | Removed building of tests for AppVeyor CI | 29 June 2016, 15:49:52 UTC |
fca2ad1 | Cedric Nugteren | 29 June 2016, 15:03:42 UTC | Added Appveyor CI and added OS X compilation for Travis | 29 June 2016, 15:03:42 UTC |
48719a2 | Cedric Nugteren | 16 June 2016, 18:20:44 UTC | Fixed the RPATH settings for OSX | 16 June 2016, 18:20:44 UTC |
b516ef7 | Cedric Nugteren | 16 June 2016, 18:18:49 UTC | Added a VERBOSE option to CMake to get additional diagnostic messages | 16 June 2016, 18:18:49 UTC |
e95c158 | Cedric Nugteren | 31 May 2016, 18:37:17 UTC | Unit-tests are now based on string-kernels instead of external-file-kernels to make it possible to run the unit test executables anywhere | 31 May 2016, 18:37:17 UTC |
f1b0900 | Cedric Nugteren | 25 May 2016, 11:03:26 UTC | Merge pull request #39 from CNugteren/development Update to version 2.3.1 | 25 May 2016, 11:03:26 UTC |
ebb3085 | Cedric Nugteren | 25 May 2016, 10:14:35 UTC | Updated to version 2.3.1 (bug-fix release) | 25 May 2016, 10:14:35 UTC |
53a05ba | Cedric Nugteren | 24 May 2016, 09:58:26 UTC | Fixed computing the validation error for half-precision fp16 data-types | 24 May 2016, 09:58:26 UTC |
e9f43b5 | Cedric Nugteren | 24 May 2016, 09:56:15 UTC | Fixed a bug where an output buffer could not be used as input at the same time | 24 May 2016, 09:56:15 UTC |
b887e1e | Cedric Nugteren | 22 May 2016, 15:05:41 UTC | Merge pull request #38 from CNugteren/development Update to version 2.3.0 | 22 May 2016, 15:05:41 UTC |
ae12ebe | Cedric Nugteren | 22 May 2016, 15:01:18 UTC | Updated to version 2.3.0 | 22 May 2016, 15:01:18 UTC |
921271c | Cedric Nugteren | 22 May 2016, 15:00:41 UTC | Fixed CMake to compare strings properly; made MSVC link the runtime libraries statically | 22 May 2016, 15:00:41 UTC |
f923a17 | Cedric Nugteren | 22 May 2016, 14:41:10 UTC | Fixed a bug where failed results would still show up in the JSON files | 22 May 2016, 14:41:10 UTC |
86d701c | Cedric Nugteren | 16 May 2016, 10:14:18 UTC | Fixed a bug where failed results would still show up in the final results | 16 May 2016, 10:14:18 UTC |
ccf5ce2 | Cedric Nugteren | 14 May 2016, 15:59:17 UTC | Added support for short integers and cl_half fp16 as kernel arguments | 14 May 2016, 15:59:17 UTC |
cba89a4 | Cedric Nugteren | 27 April 2016, 09:03:34 UTC | Merge pull request #37 from CNugteren/development Update to version 2.2.0 | 27 April 2016, 09:03:34 UTC |
2da8be1 | Cedric Nugteren | 27 April 2016, 08:56:54 UTC | Updated to version 2.2.0 | 27 April 2016, 08:56:54 UTC |
acc110a | Cedric Nugteren | 27 April 2016, 08:55:13 UTC | Made the new samples work for CUDA as well | 27 April 2016, 08:55:13 UTC |
5f645e8 | Cedric Nugteren | 27 April 2016, 08:42:47 UTC | Fixed a typo in the API documentation | 27 April 2016, 08:42:47 UTC |
8b76ad1 | Cedric Nugteren | 27 April 2016, 08:39:18 UTC | Added API documentation to the repository | 27 April 2016, 08:39:18 UTC |
122cbb9 | Cedric Nugteren | 27 April 2016, 07:59:27 UTC | Minor fixes related to the newly added samples | 27 April 2016, 07:59:27 UTC |
9801a1e | cnugteren | 25 April 2016, 00:13:47 UTC | Added two much simpler examples to improve documentation | 25 April 2016, 00:13:47 UTC |
eac490c | cnugteren | 24 April 2016, 02:59:02 UTC | Updated the documentation | 24 April 2016, 02:59:02 UTC |
54df67a | cnugteren | 24 April 2016, 02:58:00 UTC | Updated headers to version 5.0 of the CLCudaAPI | 24 April 2016, 02:58:00 UTC |
8752c44 | cnugteren | 24 April 2016, 02:45:10 UTC | Updated Travis to reflect the latest Travis and Khronos changes | 24 April 2016, 02:45:10 UTC |
b306cf1 | Cedric Nugteren | 03 April 2016, 22:41:46 UTC | Merge pull request #36 from williamjshipman/development Only use OpenCL 2.x functions on OpenCL 2.x devices | 03 April 2016, 22:41:46 UTC |
bf1821b | williamjshipman | 03 April 2016, 01:20:33 UTC | - Add VersionNumber function for querying device OpenCL version number as an integer (e.g. 120 for OpenCL 1.2). - Clean up OpenCL 2.0 check in Queue constructor. | 03 April 2016, 01:20:33 UTC |
33ba3ef | William John Shipman | 02 April 2016, 19:37:23 UTC | Merge pull request #2 from CNugteren/development Development | 02 April 2016, 19:37:23 UTC |
da97040 | cnugteren | 31 March 2016, 04:11:58 UTC | Prepared the changelog for the next release | 31 March 2016, 04:11:58 UTC |
ad94a3d | cnugteren | 31 March 2016, 04:09:37 UTC | Merge branch 'development' | 31 March 2016, 04:09:37 UTC |
5802148 | cnugteren | 31 March 2016, 04:08:35 UTC | Updated to version 2.1.0 | 31 March 2016, 04:08:35 UTC |
0110efc | williamjshipman | 26 March 2016, 13:53:07 UTC | Merge branch 'development' of https://github.com/williamjshipman/CLTune into development | 26 March 2016, 13:53:07 UTC |
bccd8ac | williamjshipman | 26 March 2016, 13:21:37 UTC | Add runtime check for OpenCL 2 before using OpenCL 2 function. | 26 March 2016, 13:37:30 UTC |
4698d4a | williamjshipman | 26 March 2016, 13:21:37 UTC | Add runtime check for OpenCL 2 before using OpenCL 2 function. | 26 March 2016, 13:21:37 UTC |
0dc2a99 | Cedric Nugteren | 21 March 2016, 21:27:54 UTC | Updated the README | 21 March 2016, 21:27:54 UTC |
1ad3bb2 | Cedric Nugteren | 21 March 2016, 21:15:24 UTC | Merge branch 'development' of github.com:CNugteren/CLTune into development | 21 March 2016, 21:15:24 UTC |
0b90c0c | CNugteren | 21 March 2016, 19:57:35 UTC | Fixes for minor warnings under Visual Studio | 21 March 2016, 19:57:35 UTC |
1d3c159 | CNugteren | 21 March 2016, 19:56:35 UTC | Added dllexport to be able to build a DLL under Windows | 21 March 2016, 19:56:35 UTC |
b170354 | Cedric Nugteren | 31 January 2016, 17:38:09 UTC | Merge pull request #35 from williamjshipman/development Add command line parameter for platform index to conv and gemm samples in line with description in README. | 31 January 2016, 17:38:09 UTC |
59faefa | williamjshipman | 30 January 2016, 23:07:16 UTC | Updated the README to show that the platform ID is one of the command line parameters and updated the samples so that the order of the parameters matches all parts of the README. | 30 January 2016, 23:07:16 UTC |
b5a3a8b | williamjshipman | 30 January 2016, 22:48:10 UTC | Samples now support a platform parameter in their command lines, in addition to the device number. | 30 January 2016, 22:48:10 UTC |
dcddd80 | Cedric Nugteren | 23 January 2016, 15:08:14 UTC | Updated FindOpenCL for Intel Linux OpenCL paths | 23 January 2016, 15:08:14 UTC |
d643731 | Cedric Nugteren | 22 November 2015, 11:19:55 UTC | Prepared the changelog for the next release | 22 November 2015, 11:19:55 UTC |
9e401f4 | Cedric Nugteren | 22 November 2015, 11:18:09 UTC | Merge pull request #33 from CNugteren/development Added machine learning, new CLCudaAPI, CUDA, Catch, and MSVC support | 22 November 2015, 11:18:09 UTC |
8bc6684 | Cedric Nugteren | 22 November 2015, 11:16:46 UTC | Updated to version 2.0.0 | 22 November 2015, 11:16:46 UTC |
b22dce2 | Cedric Nugteren | 22 November 2015, 11:15:35 UTC | Updated the readme | 22 November 2015, 11:15:35 UTC |
a21d4a5 | Cedric Nugteren | 21 November 2015, 13:33:50 UTC | Merge pull request #32 from CNugteren/catch_tests Replaced GTest with Catch unit testing | 21 November 2015, 13:33:50 UTC |
400752b | Cedric Nugteren | 21 November 2015, 13:29:27 UTC | Updated changelog and readme | 21 November 2015, 13:29:27 UTC |
8757c9e | Cedric Nugteren | 21 November 2015, 13:27:45 UTC | Updated the 'KernelInfo' class to use Catch | 21 November 2015, 13:27:45 UTC |
b74dcef | Cedric Nugteren | 19 November 2015, 20:03:50 UTC | Updated the 'tuner' class tests to use Catch | 19 November 2015, 20:03:50 UTC |
3526cc8 | Cedric Nugteren | 15 November 2015, 15:20:41 UTC | Removed GTest, added Catch, added CLCudaAPI tests | 15 November 2015, 15:20:41 UTC |
e9984c3 | Cedric Nugteren | 15 November 2015, 15:08:07 UTC | Merge pull request #31 from CNugteren/msvc_support MSVC 2015 support | 15 November 2015, 15:08:07 UTC |
d3c961f | Cedric Nugteren | 15 November 2015, 15:06:23 UTC | Updated changelog | 15 November 2015, 15:06:23 UTC |