-
Notifications
You must be signed in to change notification settings - Fork 460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add v1alpha2 grpc api #427
Conversation
Signed-off-by: YujiOshima <[email protected]>
Signed-off-by: YujiOshima <[email protected]>
Signed-off-by: YujiOshima <[email protected]>
I added a new version of DB IF |
pkg/api/v1alpha2/api.proto
Outdated
ObjectiveSpec objective = 2; | ||
AlgorithmSpec algorithm = 3; | ||
string trial_template = 4; | ||
int32 pallarel_trial_count = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo - parallel
string value = 2; | ||
} | ||
|
||
message MetricLog { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this used for?
repeated Metric metrics = 1; | ||
} | ||
|
||
message ObservationLog { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document what this is used for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added doc here 2e3444d#diff-c0f6760a929f1056527071edf1f57db8R117
Katib will store every log of metrics. Users can look at the history of training. And the logs are used by suggestion algorithm and early stopping.
pkg/api/v1alpha2/api.proto
Outdated
} | ||
|
||
message GetExperimentListReply { | ||
repeated ExperimentSummary experiment_summarys = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
summaries
pkg/db/v1alpha2/db_init.go
Outdated
objective TEXT, | ||
algorithm TEXT, | ||
trial_template TEXT, | ||
pallarel_trial_count INT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parallel
pkg/api/v1alpha2/api.proto
Outdated
*/ | ||
message GraphConfig { | ||
int32 num_layers = 1; /// Number of layers | ||
repeated int32 input_size = 2; /// Dimenstions of input size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dimenstions -> Dimensions
pkg/api/v1alpha2/api.proto
Outdated
|
||
message ExperimentSummary { | ||
string experiment_name = 1; | ||
ExperimentStatus.ExperimentConditionType condition = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to change 4 to 2 ?
start_time DATETIME(6), | ||
metrics_collector_type TEXT, | ||
completion_time DATETIME(6), | ||
last_reconcile_time DATETIME(6))`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we add NasConfig in experiments
table in the future?
@richardsliu @andreyvelich Thanks! I applied your comments. |
Signed-off-by: YujiOshima <[email protected]>
d939e6d
to
2e3444d
Compare
pkg/api/v1alpha2/api.proto
Outdated
ObjectiveType type = 1; | ||
float goal = 2; | ||
string objective_metric_name = 3; | ||
repeated string additional_metrics_name = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
additional_metrics_names?
pkg/api/v1alpha2/api.proto
Outdated
string value = 2; | ||
} | ||
string name = 1; | ||
repeated ValueLog valuelog = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would it be a list of [timestamp-value] pair for a particular key? Isn't it, a [key-value] pair for a particular timestamp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the metric name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/valuelog/value_log
pkg/api/v1alpha2/api.proto
Outdated
Experiment experiment = 1; | ||
} | ||
|
||
message GetStudyListRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this used for?
|
||
message GetTrialListRequest { | ||
string experiment_name = 1; | ||
string start_time = 2; ///The start of the time range. RFC3339 format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these filter options?
pkg/api/v1alpha2/api.proto
Outdated
message GetSuggestionsRequest { | ||
string experiment_name = 1; | ||
string algorithm_name = 2; | ||
int32 request_number = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this field used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you set 3 to request_number
, you can get three Suggestions at one time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this same as the parallel_trial_count
which is already in the experiment Spec ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a little bit different.
When parallel_trial_count is 4 and 2 trials are completed at the same time, the operator needs to set 2 to request_number in GetSuggesitonRequest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Thanks
pkg/api/v1alpha2/api.proto
Outdated
|
||
message ExperimentSummary { | ||
string experiment_name = 1; | ||
ExperimentStatus.ExperimentConditionType condition = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add more fields as in ExperimentStatus https://github.com/kubeflow/katib/blob/master/pkg/api/operators/apis/experiment/v1alpha2/types.go#L75
OptimalTrial ,TrialsCompleted,TrialsFailed ,TrialsKilled ,TrialsPending
pkg/api/v1alpha2/api.proto
Outdated
} | ||
string start_time = 1; /// RFC3339 format | ||
string completion_time = 2; /// RFC3339 format | ||
string last_reconcile_time = 3; /// RFC3339 format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need last_reconcile_time in DB?
pkg/api/v1alpha2/api.proto
Outdated
} | ||
string start_time = 1; /// RFC3339 format | ||
string completion_time = 2; /// RFC3339 format | ||
string last_reconcile_time = 3; /// RFC3339 format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need last_reconcile_time in DB?
pkg/api/v1alpha2/api.proto
Outdated
} | ||
|
||
message Trial { | ||
string id = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this ID or name(like in experiment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the trials are generated by the controller, users won't name them.
I think ID
is more suitable here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is true. But, I feel that it might be better to keep the same terminology as in the Kubernetes config(matadata.name)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense. "name" would be more user friendly.
pkg/api/v1alpha2/api.proto
Outdated
} | ||
|
||
message RegisterTrialReply { | ||
string trial_id = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DB interface can have multiple storage backend(Mysql, ModelDb etc). Is the controller creating Id? What is the return trial_id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My assumption, it is created by DB backend.
But it is OK the controller create the ID. It will probably be more consistent behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if every storage backend supports id creation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. let's make the name of trial is generated by controller.
/** | ||
* Update Status of a experiment. | ||
*/ | ||
rpc UpdateExperimentStatus(UpdateExperimentStatusRequest) returns (UpdateExperimentStatusReply){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, Suggestion algorithms can modify the algorithm parameters. https://github.com/kubeflow/katib/blob/master/pkg/suggestion/grid_service.go#L195
How is this handled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will create UpdateAlgorithmParameter API.
pkg/api/v1alpha2/api.proto
Outdated
* Config for operations in DAG | ||
*/ | ||
message Operation { | ||
string operationType = 1; /// Type of operation in DAG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/operationType/operation_type to keep same name style as others
pkg/api/v1alpha2/api.proto
Outdated
string value = 2; | ||
} | ||
string name = 1; | ||
repeated ValueLog valuelog = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/valuelog/value_log
pkg/api/v1alpha2/api.proto
Outdated
} | ||
|
||
message ObservationLog { | ||
repeated MetricLog metriclogs = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/metriclogs/metric_logs
Signed-off-by: YujiOshima <[email protected]>
@johnugeorge @hougangliu Thank you for your comments. I updated. |
pkg/db/v1alpha2/db_init.go
Outdated
id INT AUTO_INCREMENT PRIMARY KEY, | ||
time DATETIME(6), | ||
name VARCHAR(255), | ||
metric_name VARCHAR(255), | ||
value TEXT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this metric_value field?
@YujiOshima I see that new API UpdateAlgorithmSpec is added. How is experiment Spec update handled in general? Say If user updates fields like MaxTrials, ParallelTrials etc in K8s config , how is DB updated? |
@johnugeorge You have a point. |
@YujiOshima How would |
@johnugeorge Yes. Since the parameter that each algorithm need is different, the AlgorithmStatus should have a general struct like key-value pair.
Yes. |
@YujiOshima
3 . Do we need UpdateAlgorithmSpec call in such a case? |
@johnugeorge Sorry for confusing you.
Right. make sense? |
Yes. Should we rename the api |
pkg/api/v1alpha2/api.proto
Outdated
/** | ||
* List of ParameterAssignment | ||
*/ | ||
message ParameterAssignments { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@YujiOshima @richardsliu I think we need ParameterSpec
instead of ParameterAssignments
, that I told you. We need Feasible space inside ParameterConfigs in Operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
@johnugeorge How about
|
Signed-off-by: YujiOshima <[email protected]>
Signed-off-by: YujiOshima <[email protected]>
@johnugeorge @andreyvelich @richardsliu @hougangliu I think I applied all comments. PTAL. |
/lgtm |
/retest |
4 similar comments
/retest |
/retest |
/retest |
/retest |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnugeorge The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
#423
I added v1alpha2 gRPC API complies with the v1alpha2 CRD.
https://github.com/kubeflow/katib/pull/427/files#diff-c0f6760a929f1056527071edf1f57db8
I will add a new DB schema soon.
@richardsliu @johnugeorge @hougangliu @gaocegege
Signed-off-by: YujiOshima [email protected]
This change is