1f96534 | Paul Koch | 30 March 2024, 07:05:01 UTC | disable parallel multiclass since it seems to be slower in general | 30 March 2024, 07:08:27 UTC |
b2a06c7 | Paul Koch | 30 March 2024, 07:01:48 UTC | parallel binning for multiclass | 30 March 2024, 07:01:48 UTC |
d38fd62 | Paul Koch | 30 March 2024, 02:09:56 UTC | reduce the allocation size of the fast bins memory buffer when we don't need extra memory for the parallel binning | 30 March 2024, 02:09:56 UTC |
4791ebf | Paul Koch | 29 March 2024, 23:56:53 UTC | fix the shift size restrictions in AVX2 and AVX512 and simplify ConvertAddBin by using the new gradient and hessian constants provided in the Bin class | 30 March 2024, 00:13:39 UTC |
c8262cf | Paul Koch | 29 March 2024, 23:19:10 UTC | add speculative changes for future investigation in BinSumsBoosting where we treat bins with hessians differently by gather loading the gradients and hessians together | 29 March 2024, 23:19:10 UTC |
476508b | Paul Koch | 29 March 2024, 21:21:25 UTC | separate BinSumsBoosting into separate hessian and non-hessian specializations | 29 March 2024, 21:29:32 UTC |
1fe326b | Paul Koch | 29 March 2024, 09:59:17 UTC | add more missing template keywords | 29 March 2024, 09:59:17 UTC |
9152386 | Paul Koch | 29 March 2024, 08:55:54 UTC | optimize BinSumsBoosting by prefetching gathering load as early as possible | 29 March 2024, 09:17:07 UTC |
171aae3 | Paul Koch | 29 March 2024, 08:02:04 UTC | add missing template keywords for clang/g++ | 29 March 2024, 08:02:04 UTC |
341722e | Paul Koch | 29 March 2024, 07:52:43 UTC | add missing typename that was causing clang/g++ to fail to compile | 29 March 2024, 07:52:43 UTC |
cd1f010 | Paul Koch | 28 March 2024, 22:38:23 UTC | use parallel histograms in BinSumsBoosting when they do not take up excessive memory | 29 March 2024, 07:33:24 UTC |
4200a52 | Paul Koch | 28 March 2024, 21:12:49 UTC | add shift control for SIMD gather/scatter Load/Save | 28 March 2024, 21:38:19 UTC |
25561d2 | Paul Koch | 28 March 2024, 03:33:58 UTC | eliminate k_cItemsPerBitPackNone | 28 March 2024, 03:33:58 UTC |
09b00c4 | Paul Koch | 28 March 2024, 03:08:54 UTC | add new bCollapsed template argument to BinSumsBoosting | 28 March 2024, 03:12:49 UTC |
310998c | Paul Koch | 28 March 2024, 02:13:40 UTC | add new bCollapsed template argument to the Objectives | 28 March 2024, 02:54:08 UTC |
a58c176 | Paul Koch | 27 March 2024, 23:15:52 UTC | eliminate bit packing specializations for sizes that cannot exist due to the size of the integer types into which the bits are being packed and increase the range of bit packing allowed on BinSumsBoosting to the entire legal range | 27 March 2024, 23:18:51 UTC |
354d3d1 | Paul Koch | 27 March 2024, 23:08:59 UTC | simplify the RMSE objective code since clang and g++ seem to already be optimizing a loop away and the Microsoft compiler seems to be choosing to not do this although it appears that this is due to a heuristic in the compiler based on the complexity of the contents of the loop | 27 March 2024, 23:08:59 UTC |
d452ea6 | Paul Koch | 27 March 2024, 08:32:26 UTC | improvements to the RMSE objective loop unrolling | 27 March 2024, 08:32:26 UTC |
10b205b | Paul Koch | 27 March 2024, 08:14:26 UTC | modify the Objectives to in theory allow the compiler to unroll their bitpack loops, although most seem too complex for the compiler to want to do this, other than the RMSE objective which the compiler is currently unrolling | 27 March 2024, 08:14:26 UTC |
29504a4 | Paul Koch | 27 March 2024, 01:56:21 UTC | change ApplyUpdate to only call with complete bitpacks unless the call is for a dynamic sized bitpack | 27 March 2024, 02:17:43 UTC |
2ae152c | Paul Koch | 27 March 2024, 01:02:28 UTC | reformat with clang | 27 March 2024, 01:03:22 UTC |
4b9983e | Paul Koch | 27 March 2024, 00:52:21 UTC | template out multiple specializations of bit packing for the objectives | 27 March 2024, 00:56:04 UTC |
9d0f3b1 | Paul Koch | 27 March 2024, 00:25:46 UTC | separate out multiclass from single score calls in the Objective class | 27 March 2024, 00:43:18 UTC |
5b655c5 | Paul Koch | 27 March 2024, 00:21:59 UTC | simplify the Objective templating since we no longer have access to the templated bitpacking outside of the GPU | 27 March 2024, 00:21:59 UTC |
fa4fff0 | Paul Koch | 26 March 2024, 23:25:18 UTC | move the templating of the bit packs into the GPU for Objectives | 27 March 2024, 00:03:50 UTC |
913da98 | Paul Koch | 26 March 2024, 22:55:23 UTC | small improvements to BinSumsBoosting | 26 March 2024, 22:55:23 UTC |
a3650cf | Paul Koch | 26 March 2024, 21:43:53 UTC | move the computation of the remnants value into the bit packed templated function | 26 March 2024, 21:44:46 UTC |
adb7e87 | Paul Koch | 26 March 2024, 20:59:21 UTC | move the advancement of the weight, gradient, and packed pointers to outside of the call to BinSumsBoosting | 26 March 2024, 21:25:31 UTC |
d0ee146 | Paul Koch | 26 March 2024, 19:31:09 UTC | minimize the difference between the dynamic and fixed size bitpacking code | 26 March 2024, 19:37:35 UTC |
f063f7e | Paul Koch | 26 March 2024, 19:01:48 UTC | change indexing in BinSumsBoosting to end at zero for optimization purposes if the compiler does not optimize away the bit packing loop | 26 March 2024, 19:29:00 UTC |
e57d79b | Paul Koch | 26 March 2024, 08:56:30 UTC | unroll the bit packing loop as an optimization for BinSumsBoosting | 26 March 2024, 10:06:53 UTC |
62388b4 | Paul Koch | 26 March 2024, 08:16:55 UTC | call BinSumsBoosting twice to handle the bit packing remnants | 26 March 2024, 08:16:55 UTC |
574aefa | Paul Koch | 26 March 2024, 07:39:03 UTC | create specialized BinSumsBoosting functions for each bit pack | 26 March 2024, 07:39:03 UTC |
9ad6f0f | Paul Koch | 26 March 2024, 07:16:33 UTC | separate bit packing into version for multiple scores and version for single scores | 26 March 2024, 07:16:33 UTC |
dfd84eb | Paul Koch | 26 March 2024, 07:00:26 UTC | move the bit packing templating after the operator callback | 26 March 2024, 07:00:26 UTC |
2340a48 | Paul Koch | 26 March 2024, 01:28:59 UTC | small optimizations to BinSumsInteraction | 26 March 2024, 01:28:59 UTC |
2fa6cfd | Paul Koch | 25 March 2024, 23:40:55 UTC | remove the inline attribute from the NEVER_INLINE definition | 25 March 2024, 23:40:55 UTC |
0306cf8 | Paul Koch | 25 March 2024, 22:36:19 UTC | fix bug that was preventing the use of SIMD | 25 March 2024, 22:36:19 UTC |
be4e231 | Paul Koch | 25 March 2024, 22:26:54 UTC | fix bug where the cached counts and weights were being added multiple times if there were multiple subsets (bug introduced in commit 01a5c3f8 on 2024-03-24) | 25 March 2024, 22:31:27 UTC |
dc0fdc5 | Paul Koch | 25 March 2024, 09:15:25 UTC | eliminate the counts and weights from the histograms when possible | 25 March 2024, 10:25:52 UTC |
e34e7dc | Paul Koch | 25 March 2024, 07:44:48 UTC | separate the Bin class into specialized versions with and without the count and weight fields | 25 March 2024, 07:44:48 UTC |
607878d | Paul Koch | 25 March 2024, 07:12:39 UTC | add a base data class above the Bin class to hold the data that we can later specialize | 25 March 2024, 07:12:39 UTC |
1bca184 | Paul Koch | 25 March 2024, 05:47:38 UTC | remove storage of the occurrence counts, which are no longer needed | 25 March 2024, 05:47:38 UTC |
1dc4d5b | Paul Koch | 25 March 2024, 01:51:32 UTC | remove unneeded work during BinSumsBoosting when cached counts and weights are available | 25 March 2024, 01:51:32 UTC |
31b8f32 | Paul Koch | 25 March 2024, 01:09:16 UTC | eliminate unnecessary allocations for the cached count and weight tensors | 25 March 2024, 01:09:16 UTC |
01a5c3f | Paul Koch | 24 March 2024, 23:28:30 UTC | use the cached counts and weights instead of the computed ones | 24 March 2024, 23:50:35 UTC |
d86cdff | Paul Koch | 24 March 2024, 06:47:09 UTC | cache the results of the count/weight computations | 24 March 2024, 08:49:54 UTC |
754ee5c | Paul Koch | 23 March 2024, 22:54:43 UTC | simplify InitBags now that the outer bag weights are unpacked in a previous function | 23 March 2024, 23:57:15 UTC |
cf4b81d | Paul Koch | 23 March 2024, 22:31:18 UTC | change InitBags function to work with the cached bag weights instead of working directly from the shared dataset memory | 23 March 2024, 22:31:18 UTC |
eda8c18 | Paul Koch | 23 March 2024, 19:53:49 UTC | preserve sample weights for future re-inner bagging | 23 March 2024, 20:09:11 UTC |
9ba15e9 | Paul Koch | 23 March 2024, 00:06:31 UTC | update release notes | 23 March 2024, 00:06:31 UTC |
ee3b5fe | Paul Koch | 16 March 2024, 07:36:21 UTC | version 0.6.0 release | 16 March 2024, 08:56:38 UTC |
7c2284a | Paul Koch | 16 March 2024, 06:50:10 UTC | update release process and hyperparameters | 16 March 2024, 06:50:10 UTC |
139fddc | Paul Koch | 16 March 2024, 00:33:32 UTC | add warning regarding using monotonic constraints during fitting | 16 March 2024, 00:33:32 UTC |
86e60fe | Paul Koch | 15 March 2024, 20:55:18 UTC | handle monotonicity when using random boosting | 15 March 2024, 22:30:43 UTC |
e660298 | Paul Koch | 15 March 2024, 18:44:05 UTC | change smoothing rounds back to 200 and 50 for interaction smoothing rounds | 15 March 2024, 18:44:05 UTC |
274379c | Paul Koch | 14 March 2024, 23:07:22 UTC | restore tests for python 3.12 | 14 March 2024, 23:07:22 UTC |
fc0c81c | Paul Koch | 14 March 2024, 21:59:46 UTC | change cyclic_progress to accept bools and make default True | 14 March 2024, 21:59:46 UTC |
bf3189d | Paul Koch | 14 March 2024, 21:20:04 UTC | remove DecisionListClassifier from documentation since skope-rules is no longer maintained and the documentation no longer builds | 14 March 2024, 21:55:42 UTC |
7a30293 | Paul Koch | 14 March 2024, 20:59:43 UTC | rename breakpoint_iteration to best_iteration to align with XGBoost naming | 14 March 2024, 21:03:48 UTC |
019c713 | Paul Koch | 14 March 2024, 18:57:23 UTC | update documentation to use cyclic_progress variable instead of greediness (which has been deprecated) | 14 March 2024, 18:57:23 UTC |
a6f3120 | Paul Koch | 14 March 2024, 18:48:11 UTC | rename greediness to greedy_ratio to better match its new behavior | 14 March 2024, 18:49:45 UTC |
aeec521 | Paul Koch | 14 March 2024, 05:57:03 UTC | reformatting | 14 March 2024, 05:57:03 UTC |
008e63b | Paul Koch | 14 March 2024, 05:21:23 UTC | expose boosting based monotonicity | 14 March 2024, 05:52:59 UTC |
e1ead2e | Paul Koch | 14 March 2024, 00:31:20 UTC | add monotonicity to test api | 14 March 2024, 00:31:20 UTC |
3328880 | Paul Koch | 13 March 2024, 23:54:48 UTC | update to use python 3.9 for building | 13 March 2024, 23:54:48 UTC |
7a8c0c5 | Paul Koch | 13 March 2024, 02:44:36 UTC | fix the shap issue in version 0.45 where tree shap has changed the axis where it expresses the class index | 13 March 2024, 02:44:36 UTC |
66e04fa | Paul Koch | 13 March 2024, 02:06:10 UTC | C++ reformat | 13 March 2024, 02:06:10 UTC |
9498ba6 | Paul Koch | 12 March 2024, 23:15:52 UTC | ability to constrain by monotonicity | 12 March 2024, 23:15:52 UTC |
6e01242 | Paul Koch | 12 March 2024, 21:48:34 UTC | use same version of clang-format as VS and shaping for future monotonicity | 12 March 2024, 21:48:34 UTC |
d5ec3fa | Paul Koch | 09 March 2024, 06:09:53 UTC | update URLs to https | 09 March 2024, 06:09:53 UTC |
9631291 | Paul Koch | 05 March 2024, 15:47:08 UTC | update hyperparameter values | 05 March 2024, 15:47:49 UTC |
5813304 | DerWeh | 05 March 2024, 00:12:51 UTC | TST: loosen requirements for tests (#520) Signed-off-by: DerWeh <andreas.weh@web.de> | 05 March 2024, 00:12:51 UTC |
1877154 | Paul Koch | 04 March 2024, 19:48:27 UTC | modify boost loop to handle fractional cyclic_progress values | 04 March 2024, 19:48:27 UTC |
cc75f87 | Paul Koch | 04 March 2024, 19:25:45 UTC | change cyclic_progress to accept a float percentage | 04 March 2024, 19:28:26 UTC |
3b9012b | Paul Koch | 04 March 2024, 07:39:06 UTC | update greediness parameter to accept numbers greater than 1.0 in accordance with the newer interpretation of that parameter, remove the refresh_rate parameter, and expose the cyclic_progress parameter | 04 March 2024, 07:39:06 UTC |
620a496 | Paul Koch | 04 March 2024, 02:03:25 UTC | replace refresh_period with cyclic_progress bool which optionally turns off boosting during the cyclic rounds in order to update the gain values instead | 04 March 2024, 04:14:08 UTC |
44d3c69 | Paul Koch | 04 March 2024, 00:51:37 UTC | change the greediness algorithm to allow for better control of the number of greedy steps | 04 March 2024, 01:36:54 UTC |
ec870b2 | dependabot[bot] | 01 March 2024, 17:37:36 UTC | Bump es5-ext from 0.10.61 to 0.10.63 in /python/stitch (#516) Bumps [es5-ext](https://github.com/medikoo/es5-ext) from 0.10.61 to 0.10.63. - [Release notes](https://github.com/medikoo/es5-ext/releases) - [Changelog](https://github.com/medikoo/es5-ext/blob/main/CHANGELOG.md) - [Commits](https://github.com/medikoo/es5-ext/compare/v0.10.61...v0.10.63) --- updated-dependencies: - dependency-name: es5-ext dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> | 01 March 2024, 17:37:36 UTC |
0e8dad8 | Paul Koch | 29 February 2024, 09:05:29 UTC | remove experimental tag on ML2SQL, update readme with new links, and update hyperparameter tuning recommendations | 01 March 2024, 01:20:36 UTC |
bcf86e6 | DerWeh | 29 February 2024, 23:08:54 UTC | Increase compatibility of EBM with scikit-learn (#518) * MAINT: sort imports Signed-off-by: DerWeh <andreas.weh@web.de> * ENH: sklearn compatibility: warn about 2d y Signed-off-by: DerWeh <andreas.weh@web.de> * ENH: sklearn compatibility: use None instead of [] Signed-off-by: DerWeh <andreas.weh@web.de> * ENH: sklearn: add tag that EBM supports NaN inputs Signed-off-by: DerWeh <andreas.weh@web.de> * ENH: raise ValueError for complex data (sklearn) Raising a ValueError is compatible with sklearn and therefore the expected behavior. TypeError is inappropriate in the Python world, as `type(X)` has the correct type (an np.ndarray). Just the X.dtype is wrong. This is a NumPy concept and not a native Python concept. Signed-off-by: DerWeh <andreas.weh@web.de> * ENH: sklearn: check also for inf in labels Signed-off-by: DerWeh <andreas.weh@web.de> * BLD: bump sklearn dependency Signed-off-by: DerWeh <andreas.weh@web.de> * ENH: test sklearn compatibility Signed-off-by: DerWeh <andreas.weh@web.de> * Revert "ENH: sklearn compatibility: warn about 2d y" Support of 2d y we `y.shape[-1] = 1` is desired This reverts commit 230cddd2ec4c7b7f60c684b0c8ffa83a6e2b4d92. Signed-off-by: DerWeh <andreas.weh@web.de> * ENH: update tags Signed-off-by: DerWeh <andreas.weh@web.de> * TST: update selection of scikit-learn checks Signed-off-by: DerWeh <andreas.weh@web.de> * MAINT: drop old skipped scikit-learn checks Signed-off-by: DerWeh <andreas.weh@web.de> --------- Signed-off-by: DerWeh <andreas.weh@web.de> | 29 February 2024, 23:08:54 UTC |
0cf3f2f | Paul Koch | 29 February 2024, 07:00:22 UTC | fix bug in convert_categorical_to_continuous that did not handle scenarios where there is only one continuous section | 29 February 2024, 07:00:22 UTC |
a7e345a | DerWeh | 29 February 2024, 06:28:29 UTC | MAINT: Rebase EBM utils (#517) * MAINT: avoid code duplications Signed-off-by: DerWeh <andreas.weh@web.de> * MAINT: early error to reduce complexity Signed-off-by: DerWeh <andreas.weh@web.de> * MAINT: avoid unnecessary allocations Signed-off-by: DerWeh <andreas.weh@web.de> * MAINT: simplify function conversion function Signed-off-by: DerWeh <andreas.weh@web.de> * MAINT: refactor midpoint into own function Signed-off-by: DerWeh <andreas.weh@web.de> * MAINT: sort imports Signed-off-by: DerWeh <andreas.weh@web.de> * MAINT: check directly for finite numbers Signed-off-by: DerWeh <andreas.weh@web.de> * ENH: handle special case of empty cuts Signed-off-by: DerWeh <andreas.weh@web.de> * TST: minimal test for conversion between cuts and intervals Signed-off-by: DerWeh <andreas.weh@web.de> --------- Signed-off-by: DerWeh <andreas.weh@web.de> | 29 February 2024, 06:28:29 UTC |
9cbb353 | Paul Koch | 27 February 2024, 10:51:41 UTC | increase the default max_rounds value to 25000 | 27 February 2024, 10:51:41 UTC |
14f140c | Paul Koch | 25 February 2024, 22:12:08 UTC | add Hyperparameters to the documentation | 27 February 2024, 10:49:21 UTC |
abad985 | Paul Koch | 24 February 2024, 16:47:49 UTC | add RMSE to synthetic notebook output | 24 February 2024, 20:53:33 UTC |
24f4f11 | Paul Koch | 15 February 2024, 00:46:27 UTC | add refresh_rate option to public parameters | 24 February 2024, 20:53:32 UTC |
f2ffdd1 | alvanli | 20 February 2024, 19:53:23 UTC | Return self in sweep() (#511) Signed-off-by: alvanli <51011489+alvanli@users.noreply.github.com> | 20 February 2024, 19:53:23 UTC |
fe334e8 | Paul Koch | 08 February 2024, 21:09:06 UTC | update release process notes and readme | 13 February 2024, 21:07:25 UTC |
979d93f | Paul Koch | 08 February 2024, 05:43:02 UTC | update interpret to version 0.5.1 | 08 February 2024, 05:58:56 UTC |
5d56f07 | Paul Koch | 07 February 2024, 10:20:25 UTC | add internal option to disable smoothing for nominals | 07 February 2024, 22:08:44 UTC |
7b0c331 | Paul Koch | 07 February 2024, 06:04:29 UTC | add interface to access the nominal feature definitions from the dataset object | 07 February 2024, 06:04:29 UTC |
3a6abe8 | Paul Koch | 07 February 2024, 03:49:01 UTC | small changes to EBM defaults | 07 February 2024, 05:16:17 UTC |
f8cb826 | Paul Koch | 06 February 2024, 23:36:38 UTC | re-expose min_samples_leaf | 06 February 2024, 23:47:43 UTC |
ed3501b | Paul Koch | 06 February 2024, 20:36:22 UTC | change EBM parameters -> set tolerance to zero, set regression min_hessian to 1.01 | 06 February 2024, 20:38:36 UTC |
f1115eb | Paul Koch | 06 February 2024, 12:44:34 UTC | go back to hessian boosting during smoothing because it's faster | 06 February 2024, 12:44:34 UTC |
240a620 | Paul Koch | 06 February 2024, 09:08:12 UTC | update default EBM parameters | 06 February 2024, 09:08:12 UTC |
edab834 | Paul Koch | 06 February 2024, 08:29:42 UTC | update docs to remove references to min_samples_leaf | 06 February 2024, 08:29:42 UTC |
52b8966 | Paul Koch | 06 February 2024, 08:22:11 UTC | change early stopping tolerance to be a percentage | 06 February 2024, 08:22:11 UTC |
b65426f | Paul Koch | 06 February 2024, 01:03:35 UTC | early stop on no improvement (if there is a validation set) | 06 February 2024, 07:45:22 UTC |