9b40375 | Johnu George | 10 September 2019, 10:00:56 UTC | Adding suggestion reconcile in Experiment controller (#750) * Adding suggestion reconcile * Fix tests * Commenting e2e tests * comment nas rl build * Add logs | 10 September 2019, 10:00:55 UTC |
bbae0ea | Hougang Liu | 09 September 2019, 07:31:18 UTC | Experiment webhook for metricsCollector (#749) * Experiment webhook for metricsCollector * Fix test error | 09 September 2019, 07:31:18 UTC |
cfede05 | Hougang Liu | 05 September 2019, 15:33:12 UTC | Ignore injecting sidecar for none collector kind (#747) * Ignore injecting sidecar for none collector kind * Upper camel case | 05 September 2019, 15:33:12 UTC |
e9cc353 | Ce Gao | 05 September 2019, 08:12:52 UTC | feat: Add suggestion controller (#742) * feat: Add suggestion controller Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: gofmt Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix name Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add error output Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix RBAC Signed-off-by: Ce Gao <gaoce@caicloud.io> | 05 September 2019, 08:12:52 UTC |
0e8f189 | Hougang Liu | 05 September 2019, 06:32:52 UTC | Inject sidecar container based on metrics collector kind (#745) | 05 September 2019, 06:32:52 UTC |
349a5a4 | Ce Gao | 04 September 2019, 08:29:01 UTC | feat(apis): Add suggestion CRD API (#740) * feat: Add suggestion CRD Signed-off-by: Ce Gao <gaoce@caicloud.io> * chore: generate deepcopy Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix json tag Signed-off-by: Ce Gao <gaoce@caicloud.io> | 04 September 2019, 08:29:01 UTC |
e61c988 | Johnu George | 04 September 2019, 06:44:58 UTC | Adding extra timeouts to builds (#741) | 04 September 2019, 06:44:58 UTC |
daacf9f | Johnu George | 04 September 2019, 02:24:58 UTC | Katib v1alpha3 api implementation (#739) * v1alpha3 api implementation * fix jsonnet params * Adding v1alpha3 examples * Adding UI to builds | 04 September 2019, 02:24:58 UTC |
7d8743c | Hougang Liu | 03 September 2019, 08:42:27 UTC | MetricsCollector pod injection for katib job (#738) * MetricsCollector pod injection for katib job * Fix gofmt | 03 September 2019, 08:42:27 UTC |
ab0b4a7 | Johnu George | 03 September 2019, 05:58:27 UTC | Refactor directory structure (#737) * Refactor directory structure * Fix tests | 03 September 2019, 05:58:27 UTC |
6bfca31 | Johnu George | 03 September 2019, 03:22:27 UTC | Upgrade vendor packages (#736) | 03 September 2019, 03:22:27 UTC |
e5a27ad | ChungHsuan Wu | 02 September 2019, 02:57:41 UTC | Add metrics-collector.md (#732) | 02 September 2019, 02:57:41 UTC |
b068809 | Johnu George | 01 September 2019, 23:31:42 UTC | Delete v1alpha1 api (#734) * Delete v1alpha1 api * Removing modelstore | 01 September 2019, 23:31:42 UTC |
2bdfe61 | Hougang Liu | 29 August 2019, 06:37:55 UTC | Inject sidecar only for katib belongs (#733) | 29 August 2019, 06:37:55 UTC |
7b7c1c5 | Ce Gao | 29 August 2019, 02:41:54 UTC | fix: Add build for sidecar (#730) * fix: Add build for sidecar Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Update Signed-off-by: Ce Gao <gaoce@caicloud.io> | 29 August 2019, 02:41:54 UTC |
e7d2708 | Hougang Liu | 29 August 2019, 01:51:54 UTC | Fix bayesianoptimization build for ppc64le (#731) | 29 August 2019, 01:51:54 UTC |
b225423 | Hougang Liu | 28 August 2019, 23:57:54 UTC | Inject pod sidecar for specified namespace (#729) | 28 August 2019, 23:57:54 UTC |
1945560 | ChungHsuan Wu | 28 August 2019, 06:49:13 UTC | Add pod level inject webhook (#716) * Add pod level inject webhook. * Implement sidecar injection (Hard code container) * Inject metrics collector as a sidecar * Update metrics-collector to satisfy sidecar * Clean up test logs * Get experiment name and job kind * Update common labels * Separate the sidecar metrics collector | 28 August 2019, 06:49:13 UTC |
08f0429 | huimin | 26 August 2019, 23:22:38 UTC | update katib ui base image (#727) | 26 August 2019, 23:22:38 UTC |
7a4bee2 | Hougang Liu | 16 August 2019, 11:30:34 UTC | Update prow (#721) | 16 August 2019, 11:30:34 UTC |
c70f5e1 | Johnu George | 16 August 2019, 11:20:32 UTC | Minor test fix (#720) | 16 August 2019, 11:20:32 UTC |
45124c1 | Hougang Liu | 16 August 2019, 07:30:40 UTC | Enable prometheus metrics for katib-controller (#717) * Enable prometheus metrics for katib-controller * Disable UI build to avoid blocking * Use kubeflow repo image to run golang test | 16 August 2019, 07:30:40 UTC |
6d4dd47 | Hougang Liu | 05 August 2019, 12:13:48 UTC | Refactor webhook organization (#711) | 05 August 2019, 12:13:48 UTC |
de96a52 | Hougang Liu | 01 August 2019, 10:08:15 UTC | API for metricCollector (#697) * API for metricCollector * Drop trialmetrics crd | 01 August 2019, 10:08:15 UTC |
97e0b32 | Hougang Liu | 29 July 2019, 03:51:53 UTC | Update vendor (#700) | 29 July 2019, 03:51:53 UTC |
505e90f | Hougang Liu | 25 July 2019, 07:03:51 UTC | Fix wrong end state of trial (#695) | 25 July 2019, 07:03:51 UTC |
97a97fa | renyux | 19 July 2019, 03:17:13 UTC | update some dockerfiles to support power (#673) * modify some dockerfile to support power * update more dockerfiles for v2 * modify wrong manager-rest name | 19 July 2019, 03:17:13 UTC |
a30e8e1 | Hougang Liu | 16 July 2019, 09:31:24 UTC | Fix tf-event example cannot work (#689) | 16 July 2019, 09:31:24 UTC |
f80a7d2 | Hougang Liu | 16 July 2019, 03:13:23 UTC | Add example readme for v1alpha2 (#688) | 16 July 2019, 03:13:23 UTC |
08ea526 | Hougang Liu | 15 July 2019, 11:55:06 UTC | Fix error of undeploy.sh (#687) | 15 July 2019, 11:55:06 UTC |
742fbf6 | Richard Liu | 13 July 2019, 02:11:04 UTC | Add tfevent-volume to v1alpha2 example (#681) | 13 July 2019, 02:11:04 UTC |
d513bd6 | Andrey Velichkevich | 13 July 2019, 00:47:04 UTC | Change Version to latest in e2e tests (#686) | 13 July 2019, 00:47:04 UTC |
02c3528 | Hougang Liu | 12 July 2019, 03:57:03 UTC | Use latest katib images to deploy katib (#682) | 12 July 2019, 03:57:03 UTC |
f1af0be | Hougang Liu | 12 July 2019, 01:57:06 UTC | Fix prow error (#684) | 12 July 2019, 01:57:06 UTC |
912f329 | Andrey Velichkevich | 04 July 2019, 02:26:12 UTC | Update Build Script for v1alpha2 (#672) * Update build script * Change version in e2e test | 04 July 2019, 02:26:12 UTC |
47d4d8f | Andrey Velichkevich | 28 June 2019, 01:33:21 UTC | Add npm build to the UI Dockerfile (#665) * Remove build from the Repo * npm build in dockerfile * Remove no-cache * Change size | 28 June 2019, 01:33:21 UTC |
c81818d | Erik Parmann | 27 June 2019, 05:39:19 UTC | MetricController: Run only a single job per task (#660) This changes the `spec.concurrencyPolicy` of the metric collector cron-job from "Allow" (default) to "Forbid". The cronjob used to create a new job even if the previous job had not succeeded. On high-load clusters this could lead to a high number of jobs which never finished. This fixed #659 | 27 June 2019, 05:39:19 UTC |
702703b | Andrey Velichkevich | 21 June 2019, 03:30:36 UTC | Build images for nasrl training container (#669) * Add NASRL training container build image * Add build for v1alpha1 | 21 June 2019, 03:30:36 UTC |
a21c14f | Andrey Velichkevich | 19 June 2019, 03:00:32 UTC | Add delete experiment (#654) | 19 June 2019, 03:00:32 UTC |
1344dc2 | Andrey Velichkevich | 19 June 2019, 01:30:32 UTC | Change add template (#656) | 19 June 2019, 01:30:32 UTC |
855f75c | Andrey Velichkevich | 18 June 2019, 19:17:50 UTC | Select objectiveType from the list (#653) | 18 June 2019, 19:17:50 UTC |
c81692c | Johnu George | 18 June 2019, 06:44:16 UTC | Add e2e test to presubmit (#652) * Adding grid e2e test to presubmit * Adding extra checks | 18 June 2019, 06:44:16 UTC |
ae10864 | Ce Gao | 17 June 2019, 08:20:08 UTC | fix: Do not use webhook in UT (#657) Signed-off-by: Ce Gao <gaoce@caicloud.io> | 17 June 2019, 08:20:08 UTC |
0cb1597 | Johnu George | 14 June 2019, 10:46:22 UTC | Enhancing katib client apis (#650) * Enhancing katib client apis * Delete unnecessary file | 14 June 2019, 10:46:21 UTC |
8970cdf | Johnu George | 14 June 2019, 10:10:21 UTC | Wrong mock file name (#651) | 14 June 2019, 10:10:21 UTC |
69d097e | Andrey Velichkevich | 13 June 2019, 05:58:13 UTC | UI: Show only succeeded Trials (#646) * Show only succeeded trials * Create build | 13 June 2019, 05:58:13 UTC |
14dad8b | Hougang Liu | 12 June 2019, 18:20:17 UTC | v1alpha2 hyperband suggestion service validation (#648) | 12 June 2019, 18:20:17 UTC |
0d456ae | Ce Gao | 12 June 2019, 16:02:23 UTC | refactor: Remove requests check for most test cases (#626) * refactor: Remove requests check for most test cases Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Set timeout for apiserver Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix experiment test Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test cases Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test cases Signed-off-by: Ce Gao <gaoce@caicloud.io> | 12 June 2019, 16:02:23 UTC |
ff27b55 | Ce Gao | 12 June 2019, 09:07:29 UTC | feat(experiment): Delete dup trials (#647) * feat(experiment): Delete dup trials Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Sort before delete Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Use the result in the memory Signed-off-by: Ce Gao <gaoce@caicloud.io> | 12 June 2019, 09:07:29 UTC |
2625d43 | Andrey Velichkevich | 11 June 2019, 23:31:29 UTC | Add bayesianoptimization algorithm in selectlist (#645) | 11 June 2019, 23:31:29 UTC |
56192e1 | Hougang Liu | 08 June 2019, 05:31:07 UTC | Fix v1alpha1 hyperband algorithm mismatch (#634) * Fix v1alpha1 hyperband algorithm mismatch * Fix test error | 08 June 2019, 05:31:07 UTC |
da6dae1 | Hougang Liu | 08 June 2019, 05:07:06 UTC | hyperband suggestion service (#631) | 08 June 2019, 05:07:06 UTC |
2ef2bc8 | Johnu George | 07 June 2019, 00:48:00 UTC | Upgrade Job operators to v1 (#635) * Upgrade tfjob/pytorchjob apis to v1 * Remove unnecessary files | 07 June 2019, 00:48:00 UTC |
2d059a4 | Hougang Liu | 05 June 2019, 08:17:52 UTC | Fix sql syntax for UpdateAlgorithmExtraSettings (#633) | 05 June 2019, 08:17:52 UTC |
77ae12d | Johnu George | 05 June 2019, 07:39:54 UTC | Write entries to extra settings table during create (#630) | 05 June 2019, 07:39:54 UTC |
b0e0dd5 | Johnu George | 04 June 2019, 18:13:55 UTC | Adding cascading delete of pods when jobs are deleted (#632) | 04 June 2019, 18:13:55 UTC |
03fb85e | Johnu George | 04 June 2019, 08:51:09 UTC | Add tests for grid suggestion (#628) | 04 June 2019, 08:51:09 UTC |
e70d56b | Johnu George | 04 June 2019, 05:08:59 UTC | Fixing tag (#627) | 04 June 2019, 05:08:59 UTC |
32d3401 | Andrey Velichkevich | 04 June 2019, 03:22:59 UTC | Training Container for NAS RL Suggestion in v1alpha2 (#614) * Add training container in v1alpha2 * Modify runTrial | 04 June 2019, 03:22:59 UTC |
cb25807 | Johnu George | 03 June 2019, 17:43:33 UTC | Implementing v1alpha2 grid search suggestion algorithm (#622) * Implementing v1alpha2 grid search algorithm * Fix indentation * Build grid image | 03 June 2019, 17:43:33 UTC |
0f6fdeb | Ce Gao | 03 June 2019, 11:06:14 UTC | feat: Support bayesianoptimization in v1alpha2 (#595) * feat: Support bayesianoptimization Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Support bo Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Resolve conflicts Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix runtime error Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix format Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix comments Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix comments Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix component name Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix errors Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix a runtime error Signed-off-by: Ce Gao <gaoce@caicloud.io> | 03 June 2019, 11:06:14 UTC |
0c50fb4 | Andrey Velichkevich | 03 June 2019, 06:20:15 UTC | NAS RL Suggestion for v1alpha2 (#613) * Init commit * 50% Suggestion is done * Suggestion is done * Remove logs * Remove temp file * Fix errors * Fix ValidateAlgorithmReply * Move NASRL suggestion deployment * Build image for NASRL suggestion * Fix building image * Remove unused import | 03 June 2019, 06:20:15 UTC |
c2a1995 | Andrey Velichkevich | 03 June 2019, 02:40:13 UTC | Fix problems in the UI for v1alpha2 (#623) * Fix AlgorithmName in Params Submit * Fix TrialPath in Submit by Params * Fix name in InputSizes and OutputSizes Change initial NumLayers | 03 June 2019, 02:40:13 UTC |
2e07fe1 | Guang Ya Liu | 02 June 2019, 08:38:12 UTC | Updated help message for golint. (#621) This is related to https://github.com/kubeflow/kfserving/issues/135 | 02 June 2019, 08:38:12 UTC |
d9e727d | Andrey Velichkevich | 02 June 2019, 03:46:15 UTC | Add experiment to Scheme (#620) | 02 June 2019, 03:46:15 UTC |
48889cb | Richard Liu | 31 May 2019, 23:21:50 UTC | Merge pull request #616 from johnugeorge/metricfix Set trial completion status only after metric collection | 31 May 2019, 23:21:50 UTC |
8056907 | Johnu George | 31 May 2019, 10:46:25 UTC | Remove go unit tests from presubmits (#618) * Remove go unit tests from presubmits * Minor fix | 31 May 2019, 10:46:25 UTC |
df9f22c | Johnu George | 31 May 2019, 09:06:43 UTC | Adding owner for cronjob watch | 31 May 2019, 09:42:43 UTC |
4e342cc | Johnu George | 31 May 2019, 08:16:24 UTC | Set trial completion status only after metric collection | 31 May 2019, 09:42:43 UTC |
7a2ffe1 | Johnu George | 31 May 2019, 09:24:25 UTC | Skip creating trials if add count is zero (#617) | 31 May 2019, 09:24:25 UTC |
1fdca87 | Andrey Velichkevich | 30 May 2019, 03:42:18 UTC | Fix nasrl example in v1alpha2 (#609) | 30 May 2019, 03:42:18 UTC |
6a484f6 | Guang Ya Liu | 29 May 2019, 05:54:18 UTC | Enabled make check in travis. (#608) * Enabled make check in travis. * Upgrade to go 1.12.5 for travis. | 29 May 2019, 05:54:18 UTC |
8d77fa5 | Guang Ya Liu | 29 May 2019, 03:16:21 UTC | fix make check (#606) | 29 May 2019, 03:16:21 UTC |
b9b179a | Guang Ya Liu | 29 May 2019, 02:16:21 UTC | Fine-grained docker image build. (#605) | 29 May 2019, 02:16:21 UTC |
2bc89ed | Johnu George | 28 May 2019, 23:20:22 UTC | Moving folders (#602) | 28 May 2019, 23:20:22 UTC |
31104a2 | Johnu George | 28 May 2019, 22:57:48 UTC | Fixing latest tag (#603) | 28 May 2019, 22:57:48 UTC |
3d4712d | Johnu George | 28 May 2019, 11:43:52 UTC | Minor changes (#601) | 28 May 2019, 11:43:52 UTC |
99a4359 | Hougang Liu | 28 May 2019, 11:01:54 UTC | Mini fix for v1alpha1 metricsCollector (#600) | 28 May 2019, 11:01:54 UTC |
4f678e2 | Andrey Velichkevich | 28 May 2019, 10:23:53 UTC | Check error in OpenSQLConnection (#588) * Check error in openSQLconn * Add logic in v1alpha1 * Change to Errorf | 28 May 2019, 10:23:53 UTC |
c2b6f9e | Hougang Liu | 28 May 2019, 09:35:56 UTC | Fix issue of hyperband suggestion service cannot move on (#596) | 28 May 2019, 09:35:56 UTC |
eafd7f7 | Ce Gao | 28 May 2019, 05:57:51 UTC | doc: Update readme (#593) * doc: Update readme Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add title Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove vizier Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add a note in NASRL Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove vizier in Job Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Replace links Signed-off-by: Ce Gao <gaoce@caicloud.io> | 28 May 2019, 05:57:51 UTC |
86b0721 | Hougang Liu | 28 May 2019, 04:53:51 UTC | Reverse logic of Less in hyperband v1alpha1 (#592) | 28 May 2019, 04:53:51 UTC |
c3478af | Hougang Liu | 28 May 2019, 04:27:52 UTC | Mini fix for getExperimentConf (#594) | 28 May 2019, 04:27:52 UTC |
c3faf0c | Ce Gao | 28 May 2019, 03:51:51 UTC | feat: Add UI in manifests v1alpha2 (#591) * feat: Add UI Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add name Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Rename Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix ui Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Support UI Signed-off-by: Ce Gao <gaoce@caicloud.io> | 28 May 2019, 03:51:51 UTC |
daac957 | Ce Gao | 27 May 2019, 10:39:51 UTC | feat: Support flags in UI (#590) Signed-off-by: Ce Gao <gaoce@caicloud.io> | 27 May 2019, 10:39:51 UTC |
c2b20e5 | Guang Ya Liu | 27 May 2019, 03:11:49 UTC | Default make target to v1alpha2. (#585) | 27 May 2019, 03:11:49 UTC |
1e663f9 | Andrey Velichkevich | 25 May 2019, 07:21:27 UTC | Change undeploy script (#587) * Move delete for pv under db * Change script in v1alpha1 | 25 May 2019, 07:21:27 UTC |
0d8f13d | Guang Ya Liu | 24 May 2019, 06:50:20 UTC | Added undeploy for katib. (#579) | 24 May 2019, 06:50:20 UTC |
5c67c0d | Ce Gao | 23 May 2019, 23:20:18 UTC | feat(trial): Add more failure test cases (#570) * WIP Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test cases Signed-off-by: Ce Gao <gaoce@caicloud.io> | 23 May 2019, 23:20:18 UTC |
644fc99 | Hougang Liu | 23 May 2019, 08:14:26 UTC | Add categories for katib CRDs (#576) | 23 May 2019, 08:14:26 UTC |
d4d87ca | Andrey Velichkevich | 23 May 2019, 07:26:26 UTC | Add Validate Algorithm Settings in v1alpha2 (#574) * Add Validate Algorithm Settings * Integrate ValidateAlgorithmSettings in ManagerClient * Run dep ensure | 23 May 2019, 07:26:26 UTC |
200c59d | Guang Ya Liu | 23 May 2019, 06:46:25 UTC | Updated makefile by adding more targets for developer. (#575) * Updated Makefile for go tools. * Run make depend. * Run make update. * Fixed go vet. * Updated development guide. | 23 May 2019, 06:46:25 UTC |
ef2ac5b | Ce Gao | 23 May 2019, 05:30:24 UTC | feat(experiment): Add more test cases (#563) * feat: Add test cases Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove debug Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix deletion Signed-off-by: Ce Gao <gaoce@caicloud.io> | 23 May 2019, 05:30:24 UTC |
42dc4c8 | Ce Gao | 23 May 2019, 03:30:28 UTC | refactor: Use manager client to get log for test (#569) * refactor: Use manager client to get log for test Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add a log Signed-off-by: Ce Gao <gaoce@caicloud.io> | 23 May 2019, 03:30:28 UTC |
206108a | Guang Ya Liu | 23 May 2019, 02:54:27 UTC | Adding go tools scripts - part 1 (#573) * Added hack scripts for katib. * Run ./hack/update-gofmt.sh. | 23 May 2019, 02:54:26 UTC |
cc5f367 | Hougang Liu | 23 May 2019, 02:32:25 UTC | Retain for job and metricsCollector (#572) | 23 May 2019, 02:32:25 UTC |
9df08fd | Hougang Liu | 23 May 2019, 01:52:24 UTC | Fix finalizer cannot work (#571) | 23 May 2019, 01:52:24 UTC |
17dbca3 | Hougang Liu | 22 May 2019, 03:36:04 UTC | Implement GetExperimentInDB (#558) * Implement GetExperimentInDB * Parse ErrNoRows error * Fix pod ready condition in test script * Add PreCheckRegisterExperiment | 22 May 2019, 03:36:04 UTC |
73d940d | Ce Gao | 21 May 2019, 10:38:40 UTC | refactor: Unify the interface (#568) Signed-off-by: Ce Gao <gaoce@caicloud.io> | 21 May 2019, 10:38:40 UTC |
26231dd | Johnu George | 21 May 2019, 07:56:44 UTC | Implement trial observation metrics (#564) * Implement trial observation * Fix test * Remove unnecessary condition | 21 May 2019, 07:56:44 UTC |