Fix cancel handling in pipedv1 scheduler #5597
because the piped handles the case as cancelled by the user without using the plugin's result. Signed-off-by: Shinnosuke Sawada-Dazai <[email protected]>
…rror Signed-off-by: Shinnosuke Sawada-Dazai <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5597      +/-   ##
==========================================
- Coverage   26.28%   26.24%   -0.04%
==========================================
  Files         470      473       +3
  Lines       50353    50450      +97
==========================================
+ Hits        13234    13242       +8
- Misses      36059    36146      +87
- Partials     1060     1062       +2
pkg/plugin/sdk/deployment.go
Outdated
	StageStatusSuccess StageStatus = 2
	StageStatusFailure StageStatus = 3
[Q] I don't remember why we made this enum start from 2; could you remind me? 👀
Because these lines were copied from the definition below. It's a bit confusing, so I want to make them start from 1.
pipecd/pkg/model/deployment.pb.go
Lines 109 to 120 in ad00a56
// StageStatus represents the current status of a stage of a deployment.
type StageStatus int32

const (
	StageStatus_STAGE_NOT_STARTED_YET StageStatus = 0
	StageStatus_STAGE_RUNNING         StageStatus = 1
	StageStatus_STAGE_SUCCESS         StageStatus = 2
	StageStatus_STAGE_FAILURE         StageStatus = 3
	StageStatus_STAGE_CANCELLED       StageStatus = 4
	StageStatus_STAGE_SKIPPED         StageStatus = 5
	StageStatus_STAGE_EXITED          StageStatus = 6
)
I refactored it in this commit: a755ab8
Signed-off-by: Shinnosuke Sawada-Dazai <[email protected]>
@@ -78,7 +78,10 @@ func wait(ctx context.Context, duration time.Duration, initialStart time.Time, s

	case <-ctx.Done(): // on cancelled
		slp.Info("Wait cancelled")
		return sdk.StageStatusCancelled
[IMO] I think StageStatusCancelled should remain, although it's not used in piped.
That's because plugin developers will be confused about which status to return.
If we want to remove StageStatusCancelled, we should remove the case <-ctx.Done(): section too. (If possible, that's ideal.)
I see what you are concerned about.
On the other hand, my concern is that plugin developers may think they have to handle context cancellation as StageStatusCancelled. That would be incorrect: when the context is cancelled, the plugin should simply stop its operation, without worrying about what it returns.
The WAIT plugin is a special case because it must watch for context cancellation explicitly in order to stop. Almost all other plugins can achieve the same thing just by passing the context to their internal functions, because deployment operations can treat context cancellation as a failure.
"plugin developers may think they have to handle context cancellation as StageStatusCancelled."
I agree!
What about WaitApproval and ScriptRun stages?
pipecd/pkg/app/piped/executor/waitapproval/waitapproval.go
Lines 67 to 88 in ad00a56
	e.LogPersister.Infof("Waiting for approval from at least %d user(s)...", num)
	for {
		select {
		case <-ticker.C:
			if e.checkApproval(ctx, num) {
				return model.StageStatus_STAGE_SUCCESS
			}
		case s := <-sig.Ch():
			switch s {
			case executor.StopSignalCancel:
				return model.StageStatus_STAGE_CANCELLED
			case executor.StopSignalTerminate:
				return originalStatus
			default:
				return model.StageStatus_STAGE_FAILURE
			}
		case <-timer.C:
			e.LogPersister.Errorf("Timed out %v", timeout)
			return model.StageStatus_STAGE_FAILURE
		}
	}
pipecd/pkg/app/piped/executor/scriptrun/scriptrun.go
Lines 71 to 92 in ad00a56
	for {
		select {
		case result := <-c:
			return result
		case <-timer.C:
			e.LogPersister.Errorf("Canceled because of timeout")
			return model.StageStatus_STAGE_FAILURE
		case s := <-sig.Ch():
			switch s {
			case executor.StopSignalCancel:
				e.LogPersister.Info("Canceled by user")
				return model.StageStatus_STAGE_CANCELLED
			case executor.StopSignalTerminate:
				e.LogPersister.Info("Terminated by system")
				return originalStatus
			default:
				e.LogPersister.Error("Unexpected")
				return model.StageStatus_STAGE_FAILURE
			}
		}
	}
First, the timeout should be handled on the piped side so that the plugin doesn't have to handle it.
The SCRIPT_RUN stage should use os/exec.CommandContext: it interrupts the executed command when the context is cancelled, so we can implement the stage without watching ctx.Done().
The WAIT_APPROVAL stage is difficult to implement without watching ctx.Done() because, apart from polling the approval state, it performs no operation that takes a context.
OK, I got it.
Let's add a note to the plugin dev guide about when to handle ctx.Done().
Even if StageStatusCancelled is removed, plugin developers should be aware of cancellation so that the stage reliably exits.
Co-authored-by: Tetsuya KIKUCHI <[email protected]> Signed-off-by: Shinnosuke Sawada-Dazai <[email protected]>
Thank you!
LGTM 👍
What this PR does:
Remove StageStatusCancelled from the SDK because cancellation is handled in the piped. We don't have to handle cancellation in the plugin implementations.
Why we need it:
Which issue(s) this PR fixes:
Part of #4980 #5530
Does this PR introduce a user-facing change?: No