-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP fetch abort_recipe_on_fail attribute #295
base: master
Are you sure you want to change the base?
Conversation
LGTM |
Thanks for review. I've removed the draft status from the pull request. Can you aprove it? Thanks! |
src/task.c
Outdated
@@ -116,6 +116,7 @@ fetch_finish_callback (GError *error, guint32 match_cnt, | |||
} else { | |||
g_propagate_error (&task->error, error); | |||
task->state = TASK_COMPLETE; | |||
task->fetch_succeeded = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal 1 out of 2:
What if you did,
if (task->abort_recipe_on_fail) {
task->fetch_succeeded = false; # You do need this var to differentiate from rstrnt-abort
app_data->aborted = ABORTED_TASK;
g_cancellable_cancel(app_data->cancellable);
}
THEN
Further down the code in the state case TASK_COMPLETE:
// Set task finished
if (g_cancellable_is_cancelled(app_data->cancellable) &&
app_data->aborted != ABORTED_NONE) {
g_clear_error(&task->error);
if {task->fetch_succeeded) {
g_set_error(&task->error, RESTRAINT_ERROR,
RESTRAINT_TASK_RUNNER_ABORTED,
"Aborted by rstrnt-abort");
}else {
g_set_error(&task->error, RESTRAINT_ERROR,
RESTRAINT_TASK_RUNNER_ABORTED,
"Aborted due to fetch failure");
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cbouchar And thanks a lot for comment.
Your are right this solution would be better and in fact it was the first solution I have tried. But it turned out not to be working. In this solution the task is corectly aborted with the correct abort message.
I've reimplemented this solution in this commit (3258337) and tried to run it beaker. And even though the task is aborted correctly the whole recipe just continue to run and the following tasks are executed as normal.
I haven't been able to solve this issue in the past so I've come up with the solution you've been reviewing.
So the actual problem with this solution is that the recipe is not aborted and conitues to run even though the RESTRAINT_ERROR is properly set.
I'll look at this aborting issue once again and try to solve it, but If you have any idea why the recipe is not aborted I'd like to hear your insights.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure you saw my 2nd proposal. This sounds like an extreme corner case for a specific area in your recipe. It seems more reasonable to create a task to check if your requested test got installed. if not just call rstrnt-abort? Messing with restraint will affect all users and you'd be surprised the bugs that can surface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I saw your second proposal. If there won't be any possibility to push this update upstream, we would need to implement some kind of wrapper. However, even this wrapper could fail to fetch data, which would result in the same outcome as if there were no wrapper at all.
Please take a quick look at my latest changes. I've refactored the code to minimize disruption and make the new feature as optional as possible for the end user.
Regarding this feature, you may see it as an extreme corner case. But is it? All we want to do is abort the recipe if there is a fetch error in some task. As it stands, the user currently has no way to do that in restraint. This provides the user with an option—not a necessity—to abort the recipe, which could result in saving computation time and stopping tasks that would produce irrelevant data.
And yes, I know there's always the possibility that some bugs could surface, but as I've said, this feature is designed to be as optional as possible.
Thanks a lot for your time!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion wasn't to retry a fetch but to check if the directory is present. if not, abort. I'm concerned about disrupting other engineers. BTW I'll be on PTO next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussion, let postpone this pull request for Q1. Thanks!
src/task.c
Outdated
@@ -1281,6 +1294,10 @@ task_handler (gpointer user_data) | |||
break; | |||
} | |||
case TASK_NEXT: | |||
if (!task->fetch_succeeded && task->abort_recipe_on_fail) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change, it will proceed to get the next task but I believe your intent is to quit the current task after failing to perform the fetch operation for it. Putting this change here does not make sense. You should quit on the TASK that's failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I'd like to end on the aborted task. But I thought it should be handled by this:
g_set_error(&task->error, RESTRAINT_ERROR, RESTRAINT_TASK_RUNNER_ABORTED, "Aborted due to fetch failure");
But it just aborts curent task not the whole recipe. So this was the mechanism to abort all the following tasks manualy with coresponding log message.
Is there a mechanizm that could abort whole recipe? I tried to skip all the following tasks in a list in the recipe. But it did not work with Beaker properly. Beaker was stuck wating for results from those skiped tasks.
I'm concerned about your changes. You're injected another path for error handling which may circumvent existing path. FYI the documentation https://restraint.readthedocs.io/en/latest/results.html will be affected by your changes. Please take the time understand this doc since my suggestion is trying to work with how result handling is currently performed. |
src/task.c
Outdated
g_string_printf(message, "** Fetching task: %s [%s]\n", task->task_id, task->path); | ||
app_data->fetch_retries = 0; | ||
task->state = TASK_FETCH; | ||
if(app_data->state == RECIPE_ABORT){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need these changes with what I proposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes if the whole recipe abort would work than it wouldnt be necesary to have this.
src/recipe.h
Outdated
@@ -37,6 +37,7 @@ typedef enum { | |||
RECIPE_RUNNING, | |||
RECIPE_FAIL, | |||
RECIPE_COMPLETE, | |||
RECIPE_ABORT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you shouldn't need this
Proposal 2: I don't have your full picture of the problem you are trying to solve but I suspect this problem is with |
Hello
I've added feature to http fetch tag. Now attribute abort_recipe_on_fail="on/off" can be placed in there.
If fetch of this task fail then the taks and all subsequent tasks in recipe are aborted.