Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Goptuna based suggestion service for CMA-ES. #1131

Merged
merged 21 commits into from
Apr 16, 2020

Conversation

c-bata
Copy link
Member

@c-bata c-bata commented Apr 9, 2020

What this PR does / why we need it:

Add CMA-ES suggestion service using Goptuna.

I tested this suggestion service with the following experiment:
https://github.com/c-bata/katib-goptuna-example

ScreenShot 2020-04-11 9 01 43

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1100

Special notes for your reviewer:

  1. Goptuna suggestion service also supports TPE and Random. So if you modify katib-config, you can optimize your objective function with Goptuna TPE and Goptuna Random.
  2. After merged this PR and the docker image is pushed, I'll create a pull request to update katib-config. See Add Goptuna based suggestion service for CMA-ES. #1131 (comment)

Release note:

Support [Goptuna](https://github.com/c-bata/goptuna) based suggestion service for CMA-ES.

@kubeflow-bot
Copy link

This change is Reviewable

@k8s-ci-robot
Copy link

Hi @c-bata. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@c-bata

This comment has been minimized.

@c-bata
Copy link
Member Author

c-bata commented Apr 9, 2020

/ok-to-test

@andreyvelich
Copy link
Member

Thank you, this is great!
I will take a look @c-bata.

@c-bata c-bata changed the title Add CMA-ES based suggestion service using Goptuna. Add Goptuna based suggestion service for CMA-ES. Apr 9, 2020
@c-bata c-bata force-pushed the cmaes-suggestion branch from 44ff60c to 0169033 Compare April 9, 2020 17:47
Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good and easy to read! I left few comments

@@ -0,0 +1,129 @@
package suggestion_goptuna_v1alpha3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to add service to /goptuna folder or follow current design, when service is located under /suggestion/v1alpha3 folder?
I am not sure what is the best way to store Algorithms which is written in GO.
What do you think @gaocegege @johnugeorge ?

Copy link
Member Author

@c-bata c-bata Apr 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Go, basically we can't put multiple packages in the same directory (except for *_test package). So I think we need to put this service into /goptuna folder.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@andreyvelich
Copy link
Member

Thank you for the changes, overall /lgtm.

I think, we can add yaml example, CI Tests, update katib-config in the future PRs.

/assign @johnugeorge @gaocegege

Thank you for implementing this!

@andreyvelich
Copy link
Member

/lgtm

Comment on lines 91 to 114
func isSameTrialParam(ktrial, gtrial goptuna.FrozenTrial) bool {
// Compare trial parameters by "internal representation".
// In the internal representation, all parameters are represented by `float64` to store the storage
// (because Goptuna supports not only in-memory but also RDB storage backend).
// To represent categorical parameters, Goptuna holds an index of the list in the database.
//
// SearchSpace: map[string]interface{}{"x1": Uniform{Min: -10, Max: 10}, "x2": Categorical{Choices: []string{"param-1", "param-2"}}}
// External representation: map[string]interface{}{"x1": 5.5, "x2": "param-2"}
// Internal representation: map[string]float64{"x1": 5.5, "x2": 1.0}
for name := range gtrial.InternalParams {
gtrialParamValue := gtrial.InternalParams[name]
ktrialParamValue, ok := ktrial.InternalParams[name]
if !ok {
// must not reach here
klog.Errorf("Detect inconsistent internal parameters: %v and %v",
ktrial.InternalParams, gtrial.InternalParams)
return false
}
if gtrialParamValue != ktrialParamValue {
return false
}
}
return true
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this logic seems to be broken.

In Goptuna, the internal representation is not necessarily reproducible from external representation because of the following reason:
optuna/optuna#925

So we need to compare external representations because ktrial is created from Katib parameter assignments which store external representation. I'll fix this soon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is enough to use reflect.DeepEqual(gtrial.Params, ktrial.Params.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed this at 8c21693 and tested on the following parameter settings:

# https://github.com/c-bata/katib-goptuna-example/blob/master/experiment-step.yaml
  parameters:
    - name: x1
      parameterType: int
      feasibleSpace:
        min: "-15"
        max: "15"
        step: "2"
    - name: x2
      parameterType: double
      feasibleSpace:
        min: "-10"
        max: "10"
        step: 0.5

It works fine!

>>> logs -f example-step3-cmaes-54b4886448-cgkdm
I0414 06:22:33.668875       1 main.go:36] Start Goptuna suggestion service: 0.0.0.0:6789
I0414 06:22:47.179106       1 service.go:67] Success to sample new trial: trialID=0, assignments=[name:"x1" value:"3"  name:"x2" value:"1.5" ]
I0414 06:22:47.182047       1 service.go:67] Success to sample new trial: trialID=1, assignments=[name:"x1" value:"5"  name:"x2" value:"-1" ]
I0414 06:22:53.962858       1 service.go:100] Update trial mapping : trialName=example-step3-lvmdgldl -> trialID=1
I0414 06:22:53.962882       1 service.go:100] Update trial mapping : trialName=example-step3-458v8s2c -> trialID=0
I0414 06:22:53.962888       1 service.go:130] Detect changes of Trial (trialName=example-step3-458v8s2c, trialID=0) : State Complete, Evaluation 46.250000
I0414 06:22:53.962988       1 service.go:67] Success to sample new trial: trialID=2, assignments=[name:"x2" value:"0.5"  name:"x1" value:"-3" ]
I0414 06:22:56.158324       1 service.go:100] Update trial mapping : trialName=example-step3-mrxdqpq9 -> trialID=2
I0414 06:22:56.158345       1 service.go:130] Detect changes of Trial (trialName=example-step3-lvmdgldl, trialID=1) : State Complete, Evaluation 16.000000
I0414 06:22:56.158473       1 service.go:67] Success to sample new trial: trialID=3, assignments=[name:"x1" value:"-7"  name:"x2" value:"1" ]
I0414 06:22:59.670946       1 service.go:130] Detect changes of Trial (trialName=example-step3-mrxdqpq9, trialID=2) : State Complete, Evaluation 94.250000
I0414 06:22:59.670984       1 service.go:100] Update trial mapping : trialName=example-step3-z9p7rklk -> trialID=3
I0414 06:22:59.671069       1 service.go:67] Success to sample new trial: trialID=4, assignments=[name:"x1" value:"-1"  name:"x2" value:"-1" ]

PTAL.

@k8s-ci-robot k8s-ci-robot removed the lgtm label Apr 14, 2020
}
}
return nil, cmaes.NewSampler(opts...), nil
} else if name == AlgorithmTPE {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will have two suggestion providers which support TPE, do you think we should keep a prefix for it? Maybe gotuna-tpe?

/cc @andreyvelich @johnugeorge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, we can name it goptuna-tpe and goptuna-random.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit concerned about the consistency of naming. It might be enough just updating katib-config if we want to use Goptuna suggestion service.

:
  suggestion: |-
    {
      "random": {
        "image": "gcr.io/kubeflow-images-public/katib/v1alpha3/suggestion-goptuna"
      },
      "tpe": {
        "imagePullPolicy": "Always",
        "image": "gcr.io/kubeflow-images-public/katib/v1alpha3/suggestion-goptuna"
      },
      "cmaes": {
        "imagePullPolicy": "Always",
        "image": "gcr.io/kubeflow-images-public/katib/v1alpha3/suggestion-goptuna"
      },
:

Copy link
Member

@johnugeorge johnugeorge Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @c-bata .If we add prefix, user will be confused. User is not concerned about the internal implementation. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we should update doc which implementation we currently support for which algorithm.
E.g random: hyperopt, chocolate, goptuna.

Copy link
Member

@gaocegege gaocegege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM

Thanks for your contribution! 🎉 👍

It is huge contribution.

@andreyvelich
Copy link
Member

I think it is ready to merge, right @c-bata ?

@c-bata
Copy link
Member Author

c-bata commented Apr 15, 2020

I think it is ready to merge, right @c-bata ?

Yes!

FYI, I created issue #1147.

@andreyvelich
Copy link
Member

I think it is ready to merge, right @c-bata ?

Yes!

FYI, I created issue #1147.

Thanks!
/lgtm
/cc @gaocegege @johnugeorge

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge it, I will create an issue for remaining work.
Thanks again @c-bata!
/approve

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 2238eee into kubeflow:master Apr 16, 2020
sperlingxx pushed a commit to sperlingxx/katib that referenced this pull request Jul 9, 2020
* Add goptuna dependencies

* Implement Goptuna based suggestion service

* Add test cases

* Validate duplicated parameters

* Add test cases

* Support step argument

* Update vendor files

* Support step argument for INT

* Use new suggest APIs

* Add refactor changes

* Apply review feedbacks

* Update vendor files

* Apply review feedbacks

* Fix tests

* Add some refactor changes and add code comments

* Fix tests

* Refactor blank lines of import statements

* Provide more algorithm settings

* Create goptuna study and search space at first get suggestoins request

* Compare trial parameters exactly

* Fix isSameTrialParam() and refactor logging
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add CMA-ES based suggestion service.
6 participants