Revision Message Commit Date
d92ec01 Add tfjob and pytorch examples to e2e (#820) * Add tfjob and pytorch examples to e2e * Fix tests * Fix tests * Fix tests * Fix tests * Install crds before katib * Fix tests * Adding timeout to 30 min 29 September 2019, 08:47:38 UTC
18c4a8b fix: Update liveness probe to avoid problems (#833) Signed-off-by: Ce Gao <gaoce@caicloud.io> 29 September 2019, 08:01:38 UTC
d03a551 Remove used katib-manager code (#836) 29 September 2019, 06:37:37 UTC
fbf0726 File metrics collector end to end test (#832) 29 September 2019, 05:31:38 UTC
afaf252 feat: support namespace for trial template (#827) * WIP Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: support namespace for trial template Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Use configmap function Signed-off-by: Ce Gao <gaoce@caicloud.io> 29 September 2019, 04:29:37 UTC
69904e9 Remove metrics in DB when delete trial (#830) 29 September 2019, 03:19:37 UTC
adaffc5 Update status conditions during reconcile error (#831) 29 September 2019, 01:43:37 UTC
92069c2 feat: Use env var for namespace (#829) Signed-off-by: Ce Gao <gaoce@caicloud.io> 27 September 2019, 18:41:36 UTC
cad5060 Make sure experiment namespace can inject metriccollector sidecar (#828) 27 September 2019, 08:09:36 UTC
1182029 Doc about katib workflow design (#824) 27 September 2019, 05:17:35 UTC
6a09c61 fix: Support multiple namespaces (#826) Signed-off-by: Ce Gao <gaoce@caicloud.io> 27 September 2019, 05:13:35 UTC
b3f005e feat: Support step when using grid in UI (#821) * fix: Use log Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove hard coded path Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add step for double Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Do not set if it is necessary Signed-off-by: Ce Gao <gaoce@caicloud.io> 26 September 2019, 10:23:06 UTC
3627933 fix: Build e2e-runner (#822) Signed-off-by: Ce Gao <gaoce@caicloud.io> 26 September 2019, 09:43:08 UTC
e26c442 Fix stdout of worker container show nothing (#819) 26 September 2019, 08:41:07 UTC
d39865b feat: Remove useless APIs (#818) * feat: Remove useless APIs Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Remove Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix Signed-off-by: Ce Gao <gaoce@caicloud.io> 26 September 2019, 07:31:09 UTC
e9c91ed feat: Add validation for grid (#812) * feat: Add validation for grid Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Import grpc Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove check in get suggestions Signed-off-by: Ce Gao <gaoce@caicloud.io> 26 September 2019, 06:29:06 UTC
8be1650 Adding additional printer columns for better debugging (#817) 26 September 2019, 05:29:06 UTC
39beda3 metrics-collector role is not useful any more (#816) 26 September 2019, 04:13:06 UTC
67a9cea Rename algorithm deployment and service (#814) 26 September 2019, 02:55:06 UTC
9c768ca fix: Fix the type (#813) * fix: Fix the type Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: trigger CI Signed-off-by: Ce Gao <gaoce@caicloud.io> 26 September 2019, 02:09:07 UTC
95d7d12 feat: Add tpe e2e test case (#809) Signed-off-by: Ce Gao <gaoce@caicloud.io> 25 September 2019, 11:11:59 UTC
d571094 Remove unused field from Experiment Spec (#806) * Remove unused field from Spec * Remove references 25 September 2019, 10:36:00 UTC
cc76656 feat: Add HyperBand (#787) * feat: Add HyperBand Signed-off-by: Ce Gao <gaoce@caicloud.io> * chore: Add test in CI Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix name Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix name Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix script Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix r_l Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add parallel trial count Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add output Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Append algorithm settings Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add output Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix useless variable Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Use resource_name instead of ResourceName Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Update Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Avoid nil pointer exception Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Move algorithm to status Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add max Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Use algorithm settings Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove updateSpec Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io> 25 September 2019, 09:43:59 UTC
e9e0768 Removing unnecessary lines (#803) 25 September 2019, 01:03:59 UTC
5601587 feat: Add NAS RL based algorithm (#793) * feat: Add NAS RL based algorithm Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add tensorflow Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add health check Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add E2E in CI Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Install packages Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix image Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add NAS in suggestion client Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Do not set nasconfig for hp jobs Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix script Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add output Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add for debug Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove -u Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove version Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Comment test Signed-off-by: Ce Gao <gaoce@caicloud.io> 24 September 2019, 11:39:27 UTC
6a5bc1d fix: Remove copy (#802) Signed-off-by: Ce Gao <gaoce@caicloud.io> 24 September 2019, 10:59:26 UTC
bd4480c Adding example trial as the default (#801) 24 September 2019, 10:25:27 UTC
cdd8e32 Removing metric collector templates from UI (#800) 24 September 2019, 09:53:27 UTC
50e7f00 fix: Use commitid (#799) Signed-off-by: Ce Gao <gaoce@caicloud.io> 24 September 2019, 08:35:27 UTC
e7e8e57 Use common metricsCollector struct (#798) * Use common metricsCollector struct * Fix test error 24 September 2019, 02:31:25 UTC
81856da build: Support arguments (#795) Signed-off-by: Ce Gao <gaoce@caicloud.io> 23 September 2019, 08:49:23 UTC
ebb48f8 feat: Rename algorithms (#794) * feat: Rename algorithms Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove prefix Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test cases Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix algorithms Signed-off-by: Ce Gao <gaoce@caicloud.io> 23 September 2019, 08:09:23 UTC
a52a473 feat: Add events in suggestion (#796) Signed-off-by: Ce Gao <gaoce@caicloud.io> 23 September 2019, 07:03:24 UTC
1a238d4 UI: Fix problems (#786) * fix: Add log Signed-off-by: Ce Gao <gaoce@caicloud.io> * WIP Signed-off-by: Ce Gao <gaoce@caicloud.io> 23 September 2019, 05:11:22 UTC
e98b182 Implement tfevent collector (#792) * Implement tfevent collector * Fix stdout collector error 23 September 2019, 04:37:23 UTC
e2ac10b Run e2e tests parallel (#790) * Run e2e tests parallel * Adding cluster resources to test * Adding cluster resources to test infra 23 September 2019, 03:33:22 UTC
f1a57e8 Mark trial as failed when job fails (#791) 23 September 2019, 03:01:22 UTC
67a5e02 Adding javascripts locally (#789) 23 September 2019, 01:25:22 UTC
f73caf7 feat: Add grid with the help of chocolate (#780) * feat: Add chocolate Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add build in CI Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Update Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Comment choco test now Signed-off-by: Ce Gao <gaoce@caicloud.io> 22 September 2019, 06:51:22 UTC
608c19c feat: Add bayesian (#777) * feat: Add bayesian Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix requirement Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Update Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add config in cm Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Use 120s as timeout Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Update Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix timeout Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Address comments Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix command Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix path Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Change the period Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add version in requirements Signed-off-by: Ce Gao <gaoce@caicloud.io> * pkg: Fix period Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add build Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix inital delay Signed-off-by: Ce Gao <gaoce@caicloud.io> 21 September 2019, 18:07:25 UTC
00cc9bb Implement file metrics collector (#783) 21 September 2019, 12:15:24 UTC
eb71586 feat: Remove useless algorithms (#782) * fix: Remove Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove in workflow Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove Signed-off-by: Ce Gao <gaoce@caicloud.io> 21 September 2019, 03:05:24 UTC
f26f7b3 Adding algorithm deployment status to Suggestion status (#784) * Adding deployment status to Suggestion status * Adding running status immediately when set 21 September 2019, 00:45:25 UTC
18ca285 Wait for server (#785) 20 September 2019, 23:09:25 UTC
94a401e feat: Add GRPC health check in suggestions (#779) * feat: Add health check in suggestions Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix ut python Signed-off-by: Ce Gao <gaoce@caicloud.io> 20 September 2019, 04:18:59 UTC
d1331ad feat: Add more output in e2e test for debug purpose and fix test cases (#775) * feat: Add more output Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Update Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add suggestion Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: generate Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Reorder Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: output error Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test cases Signed-off-by: Ce Gao <gaoce@caicloud.io> 19 September 2019, 08:04:59 UTC
f255c29 Removing suggestions from manager interface (#772) * Removing suggestions from manager interface * Removing long running services * Increasing timeout to 60 sec 19 September 2019, 07:29:00 UTC
7248af5 Marking experiment failed when suggestion fails (#773) 19 September 2019, 01:28:59 UTC
7df0955 Adding status conditions for Suggestion CRD (#770) * Adding status conditions for Suggestion CRD * Fix tests 18 September 2019, 05:05:29 UTC
17d36c0 feat(GRPC): Replace trial with assignment (#767) * feat: Update API Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Replace trial with assignment Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io> 17 September 2019, 11:46:23 UTC
8196c78 feat: Enable python test (#766) * feat: Enable python test Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io> 17 September 2019, 08:46:23 UTC
f54e821 Remove old metricscollector source code (#765) Signed-off-by: hougang liu <liuhgxa@gmail.com> 17 September 2019, 06:56:23 UTC
e2191fd Remove old version metrics collector (#761) * Remove old version metrics collector * fix test error * Fix hyperopt example 17 September 2019, 05:54:24 UTC
20f9b40 feat: Support HyperOpt (#753) * feat: Support HyperOpt Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add cmd Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Add hyperopt example Signed-off-by: Ce Gao <gaoce@caicloud.io> * feat: Update API Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add build Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: chmod Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove useless tag Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove SJTUG mirror Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add version in requirements Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Address comments Signed-off-by: Ce Gao <gaoce@caicloud.io> 17 September 2019, 02:24:23 UTC
27893d3 Refactor DB interface (#760) * Refactor DB interface * Fix tests * Address review comments * Address review comments * Address review comments * Address review comments 16 September 2019, 23:18:22 UTC
8447e08 Metrics collector exit when worker container done (#758) * Metrics collector exit when worker container done * update prow 16 September 2019, 09:44:37 UTC
6467615 feat(GRPC): Update API for Suggestion (#743) * feat: Fix the API Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Code generate Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove the implementation Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix CI Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix import Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test case Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix test Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Remove manager test Signed-off-by: Ce Gao <gaoce@caicloud.io> 16 September 2019, 02:28:36 UTC
8c0e0f7 Removing back db pass (#756) 15 September 2019, 23:46:37 UTC
2ed88bc Adding suggestion images to configmap (#754) * Adding suggestion images to configmap * Fix tests * Fix gofmt 12 September 2019, 06:28:27 UTC
b94ee9a Populate Parameter assignments in trials (#751) 10 September 2019, 12:48:54 UTC
c40b803 Fix v1alpha3 paths in prow (#752) 10 September 2019, 10:50:54 UTC
9b40375 Adding suggestion reconcile in Experiment controller (#750) * Adding suggestion reconcile * Fix tests * Commenting e2e tests * comment nas rl build * Add logs 10 September 2019, 10:00:55 UTC
bbae0ea Experiment webhook for metricsCollector (#749) * Experiment webhook for metricsCollector * Fix test error 09 September 2019, 07:31:18 UTC
cfede05 Ignore injecting sidecar for none collector kind (#747) * Ignore injecting sidecar for none collector kind * Upper camel case 05 September 2019, 15:33:12 UTC
e9cc353 feat: Add suggestion controller (#742) * feat: Add suggestio controller Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: gofmt Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix name Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Add error output Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix RBAC Signed-off-by: Ce Gao <gaoce@caicloud.io> 05 September 2019, 08:12:52 UTC
0e8f189 Inject sidecar container based on metrics collector kind (#745) 05 September 2019, 06:32:52 UTC
349a5a4 feat(apis): Add suggestion CRD API (#740) * feat: Add suggestion CRD Signed-off-by: Ce Gao <gaoce@caicloud.io> * chore: generate deepcopy Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Fix json tag Signed-off-by: Ce Gao <gaoce@caicloud.io> 04 September 2019, 08:29:01 UTC
e61c988 Adding extra timeouts to builds (#741) 04 September 2019, 06:44:58 UTC
daacf9f Katib v1alpha3 api implementation (#739) * v1alpha3 api implementation * fix jsonnet params * Adding v1alpha3 examples * Adding UI to builds 04 September 2019, 02:24:58 UTC
7d8743c MetricsCollector pod injection for katib job (#738) * MetricsCollector pod injection for katib job * Fix gofmt 03 September 2019, 08:42:27 UTC
ab0b4a7 Refactor directory structure (#737) * Refactor directory structure * Fix tests 03 September 2019, 05:58:27 UTC
6bfca31 Upgrade vendor packages (#736) 03 September 2019, 03:22:27 UTC
e5a27ad Add metrics-collector.md (#732) 02 September 2019, 02:57:41 UTC
b068809 Delete v1alpha1 api (#734) * Delete v1alpha1 api * Removing modelstore 01 September 2019, 23:31:42 UTC
2bdfe61 Inject sidecar only for katib belongs (#733) 29 August 2019, 06:37:55 UTC
7b7c1c5 fix: Add build for sidecar (#730) * fix: Add build for sidecar Signed-off-by: Ce Gao <gaoce@caicloud.io> * fix: Update Signed-off-by: Ce Gao <gaoce@caicloud.io> 29 August 2019, 02:41:54 UTC
e7d2708 Fix bayesianoptimization build for ppc64le (#731) 29 August 2019, 01:51:54 UTC
b225423 Inject pod sidecar for specified namespace (#729) 28 August 2019, 23:57:54 UTC
1945560 Add pod level inject webhook (#716) * Add pod level inject webhook. * Implement sidecar injection (Hard code container) * Inject metrics collector as a sidecar * Update metrics-collector to satisfy sidecar * Clean up test logs * Get experiment name and job kind * Update common labels * Separate the sidecar metrics collector 28 August 2019, 06:49:13 UTC
08f0429 update katib ui base image (#727) 26 August 2019, 23:22:38 UTC
7a4bee2 Update prow (#721) 16 August 2019, 11:30:34 UTC
c70f5e1 Minor test fix (#720) 16 August 2019, 11:20:32 UTC
45124c1 Enable prometheus metrics for katib-controller (#717) * Enable prometheus metrics for katib-controller * Disable UI build to avoid blocking * Use kubeflow repo image to run golang test 16 August 2019, 07:30:40 UTC
6d4dd47 Refactor webhook organization (#711) 05 August 2019, 12:13:48 UTC
de96a52 API for metricCollector (#697) * API for metricCollector * Drop trialmetrics crd 01 August 2019, 10:08:15 UTC
97e0b32 Update vendor (#700) 29 July 2019, 03:51:53 UTC
505e90f Fix wrong end state of trial (#695) 25 July 2019, 07:03:51 UTC
97a97fa update some dockerfiles to support power (#673) * modify some dockerfile to support power * update more dockerfiles for v2 * modify wrong manager-rest name 19 July 2019, 03:17:13 UTC
a30e8e1 Fix tf-event example cannot work (#689) 16 July 2019, 09:31:24 UTC
f80a7d2 Add example readme for v1alpha2 (#688) 16 July 2019, 03:13:23 UTC
08ea526 Fix error of undeploy.sh (#687) 15 July 2019, 11:55:06 UTC
742fbf6 Add tfevent-volume to v1alpha2 example (#681) 13 July 2019, 02:11:04 UTC
d513bd6 Change Version to latest in e2e tests (#686) 13 July 2019, 00:47:04 UTC
02c3528 Use latest katib images to deploy katib (#682) 12 July 2019, 03:57:03 UTC
f1af0be Fix prow error (#684) 12 July 2019, 01:57:06 UTC
912f329 Update Build Script for v1alpha2 (#672) * Update build script * Change version in e2e test 04 July 2019, 02:26:12 UTC
47d4d8f Add npm build to the UI Dockerfile (#665) * Remove build from the Repo * npm build in dockerfile * Remove no-cache * Change size 28 June 2019, 01:33:21 UTC
c81818d MetricController: Run only a single job per task (#660) This changes the `spec.concurrencyPolicy` of the metric collector cron-job from "Allow" (default) to "Forbid". The cronjob used to create a new job even if the previous job had not succeeded. On high-load clusters this could lead to a high number of jobs which never finished. This fixed #659 27 June 2019, 05:39:19 UTC
702703b Build images for nasrl training container (#669) * Add NASRL training container build image * Add build for v1alpha1 21 June 2019, 03:30:36 UTC
a21c14f Add delete experiment (#654) 19 June 2019, 03:00:32 UTC