-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue where new task was not returned in sticky-disabled scenarios #589
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense ofc since I wrote it, but let's get someone else to take a peek.
@yiminc Does this make sense to you?
internal/internal_task_pollers.go
Outdated
// If we are forcing creation of a new workflow task because we have a long running local activity, | ||
// it's desirable to also ensure we get that new task returned to ourselves immediately, otherwise | ||
// if we aren't caching the workflow we can end up forgetting the number of attempts on said local | ||
// activity. | ||
request.ReturnNewWorkflowTask = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I think even when we disable sticky cache, we should still be able to handle new workflow task, meaning this value should always be true. Could you try to see if that works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have confirmed that if we just remove this entire else
block (thereby always leaving it true
if it was there), it does pass all tests in all scenarios and does properly return the new task. @Sushisource - thoughts on removing this else
block entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll still work. My thought here is that if we have sticky cached disabled it is better to allow load to be distributed to other workers which may have it enabled, or not, but at least then this worker isn't only running it's own tasks, but still incurring full replay cost every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The ReturnNewWorkflowTask only apply if there is a task for this current workflow. This not very common.
- It is not likely to have some worker disable sticky, but others enable sticky.
The perf difference would be very minimal with or without this flag.
The benefit of removing this else is to simplify code logic. So reader don't even need to know all these tedious details that does not really matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed this conditional altogether.
What was changed
Allow
ReturnNewWorkflowTask
to betrue
inRespondWorkflowTaskCompletedRequest
even in sticky-disabled scenarios (e.g. integration tests withWORKFLOW_CACHE_SIZE=0
) when the force flag is set.Why?
Currently this single test w/ no-cache fails on master:
However with cache or when run as a full suite (i.e. make integration-test-zero-cache) it succeeds presumably due to other tests' side effects. In #579 there was a considerable speedup to integration tests (speeding up the stopping of a worker) which then showed this single test failure. The activity attempt count was getting reset because we were always setting
ReturnNewWorkflowTask
asfalse
in no-cache situations.EDIT: Forgot to mention, thanks to @Sushisource for helping me find the issue and suggest a solution.