fix: graceperiodseconds option in e2e test #1622

Draft
wants to merge 1 commit into master
1 change: 1 addition & 0 deletions test/e2e/e2e_test.go
@@ -1822,6 +1822,7 @@ func getCurrentPodNames(ctx context.Context, clientSet clientset.Interface, name

names := []string{}
for _, item := range podList.Items {
fmt.Printf("pod: %v, status: %v, deletionGracePeriodSeconds: %v\n", item.Name, item.Status.Phase, item.DeletionGracePeriodSeconds)
names = append(names, item.Name)
}
return names
45 changes: 41 additions & 4 deletions test/e2e/e2e_toomanyrestarts_test.go
@@ -141,6 +141,12 @@ func TestTooManyRestarts(t *testing.T) {
policy: tooManyRestartsPolicy(testNamespace.Name, 3, true, 0),
expectedEvictedPodCount: 4,
},
{
name: "test-one-evictions-use-gracePeriodSeconds",
policy: tooManyRestartsPolicy(testNamespace.Name, 3, true, 15),
enableGracePeriod: true,
expectedEvictedPodCount: 4,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
@@ -198,12 +204,43 @@ func TestTooManyRestarts(t *testing.T) {
if len(deschedulerPods) != 0 {
deschedulerPodName = deschedulerPods[0].Name
}

// Check if grace period is enabled and wait accordingly
if tc.enableGracePeriod {
Member Author:
After looking into the problem, the issue is here: we cannot use wait.PollUntilContextTimeout directly. We need to keep polling for a period of time to ensure that no pod is evicted during that period, and wait.PollUntilContextTimeout cannot express this.
FYI:
https://github.com/kubernetes-sigs/descheduler/pull/1606/files#diff-40509b8123ffe5c09c142ea31f2175330f82ddc10a17b8ec098b2200bb41678fL210-L230
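A minimal sketch (not part of the PR's code) of the pattern the author is describing, assuming a hypothetical helper name and a caller-supplied `violated` check: unlike wait.PollUntilContextTimeout, which treats the first successful check as the end of polling, this keeps polling for the full window and only succeeds if the condition never fires.

```go
package e2esketch

import (
	"context"
	"fmt"
	"time"
)

// assertHoldsForDuration polls violated every interval and fails as soon as
// it reports true; it returns nil only once the full duration d has elapsed
// without a violation. wait.PollUntilContextTimeout cannot express this,
// because it stops polling as soon as its condition function returns true.
func assertHoldsForDuration(ctx context.Context, d, interval time.Duration, violated func(context.Context) bool) error {
	deadline := time.After(d)
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-deadline:
			return nil // window elapsed with no violation
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			if violated(ctx) {
				return fmt.Errorf("condition violated before %v elapsed", d)
			}
		}
	}
}
```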

Contributor:

Can you elaborate more on why you are convinced replacing wait.PollUntilContextTimeout with the current approach works?

The duration is known, so you could create a loop that runs for at most 55 iterations and sleeps for one second in each iteration. The extra context is then not needed.
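A minimal sketch of the reviewer's suggested alternative, assuming a hypothetical evictedSoFar helper in place of the test's preRunNames/getCurrentPodNames comparison: a plain bounded loop with time.Sleep and no extra context.

```go
package e2esketch

import (
	"testing"
	"time"
)

// evictedSoFar is a hypothetical stand-in for the test's comparison of
// preRunNames against getCurrentPodNames; it returns how many pods have
// disappeared since the test started.
func evictedSoFar(t *testing.T) int {
	return 0 // placeholder; the real test diffs the two pod-name sets
}

// assertNoEvictionDuringGracePeriod follows the reviewer's suggested shape:
// at most 55 one-second iterations using time.Sleep, with no extra
// context.WithTimeout involved.
func assertNoEvictionDuringGracePeriod(t *testing.T) {
	for i := 0; i < 55; i++ {
		if diff := evictedSoFar(t); diff > 0 {
			t.Fatalf("expected no pods to be evicted, but found %d evicted", diff)
		}
		time.Sleep(1 * time.Second)
	}
}
```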

// Ensure no pods are evicted during the grace period.
// Wait 12 seconds to ensure that the pods are not evicted during the grace period.
// We do not want to use an extreme waiting time, such as 15 seconds,
// because the grace period is set to 30 seconds.
// To avoid unnecessary flaky failures, we only need to make sure
// that no pod is evicted within a certain range.
duration := time.Duration(12) * time.Second
t.Logf("Waiting %v to ensure no pod is evicted during the grace period", duration)
ctx, cancel := context.WithTimeout(context.Background(), duration)
defer cancel()
var gracePeriodCheck bool
for !gracePeriodCheck {
select {
case <-ctx.Done():
t.Logf("Grace period timeout reached.")
gracePeriodCheck = true
default:
currentRunNames := sets.NewString(getCurrentPodNames(ctx, clientSet, testNamespace.Name, t)...)
t.Logf("preRunNames: %v, currentRunNames: %v\n", preRunNames.List(), currentRunNames.List())

// Check if any pods were evicted
if diff := preRunNames.Len() - currentRunNames.Len(); diff > 0 {
t.Fatalf("Expected no pods to be evicted, but found %d pods evicted.", diff)
}
<-time.After(2 * time.Second)
}
}
}
// Run RemovePodsHavingTooManyRestarts strategy
if err := wait.PollUntilContextTimeout(ctx, 1*time.Second, 50*time.Second, true, func(ctx context.Context) (bool, error) {
currentRunNames := sets.NewString(getCurrentPodNames(ctx, clientSet, testNamespace.Name, t)...)
actualEvictedPod := preRunNames.Difference(currentRunNames)
if err := wait.PollUntilContextTimeout(ctx, 5*time.Second, 120*time.Second, true, func(ctx context.Context) (bool, error) {
currentRunNames1 := sets.NewString(getCurrentPodNames(ctx, clientSet, testNamespace.Name, t)...)
actualEvictedPod := preRunNames.Difference(currentRunNames1)
actualEvictedPodCount := uint(actualEvictedPod.Len())
t.Logf("preRunNames: %v, currentRunNames: %v, actualEvictedPodCount: %v\n", preRunNames.List(), currentRunNames.List(), actualEvictedPodCount)
t.Logf("preRunNames: %v, currentRunNames: %v, actualEvictedPodCount: %v\n", preRunNames.List(), currentRunNames1.List(), actualEvictedPodCount)
if actualEvictedPodCount < tc.expectedEvictedPodCount {
t.Logf("Expecting %v number of pods evicted, got %v instead", tc.expectedEvictedPodCount, actualEvictedPodCount)
return false, nil
Expand Down