cpu utilization does not include st percentage #730

Closed
owenchenxy opened this issue Feb 17, 2022 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


owenchenxy commented Feb 17, 2022

What version of descheduler are you using?

descheduler version: v0.23.0

Does this issue reproduce with the latest release?
yes

Which descheduler CLI options are you using?

          command:
            - "/bin/descheduler"
          args:
            - "--policy-config-file"
            - "/policy-dir/policy.yaml"
            - "--descheduling-interval"
            - "5m"
            - "--v"
            - "3"

Please provide a copy of your descheduler policy config file

  policy.yaml: |
    apiVersion: "descheduler/v1alpha1"
    kind: "DeschedulerPolicy"
    strategies:
      "RemoveDuplicates":
         enabled: true
      "RemovePodsViolatingInterPodAntiAffinity":
         enabled: true
      "LowNodeUtilization":
         enabled: true
         params:
           nodeResourceUtilizationThresholds:
             thresholds:
               "cpu" : 20
               "memory": 20
               "pods": 20
             targetThresholds:
               "cpu" : 50
               "memory": 50
               "pods": 50

What k8s version are you using (kubectl version)?

kubectl version Output
$ kubectl version
v1.20.4

What did you do?
I applied the descheduler in my Kubernetes cluster, but no pods were evicted as I expected.

I ran kubectl top nodes to inspect the nodes' utilization:

root@ubtunt:~# kubectl top nodes
NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
testkubear01n01      3890m        33%    6964Mi          36%       
testkubear01n02      6500m        56%    5912Mi          30%       
testkubear01n03      11555m       99%    11638Mi         60%       
testkubear01n04      11651m       100%   9221Mi          47%

It showed high CPU utilization for testkubear01n03 and testkubear01n04.

However, when I inspected the descheduler's log, I found the following:

I0217 07:55:20.793827       1 nodeutilization.go:167] "Node is overutilized" node="testkubear01n01" usage=map[cpu:5350m memory:14888Mi pods:22] usagePercentage=map[cpu:46.12068965517241 memory:77.39591560029146 pods:20]
I0217 07:55:20.793901       1 nodeutilization.go:167] "Node is overutilized" node="testkubear01n02" usage=map[cpu:6350m memory:16680Mi pods:21] usagePercentage=map[cpu:54.741379310344826 memory:86.71170554895632 pods:19.09090909090909]
I0217 07:55:20.793963       1 nodeutilization.go:170] "Node is appropriately utilized" node="testkubear01n03" usage=map[cpu:2950m memory:4348Mi pods:49] usagePercentage=map[cpu:25.43103448275862 memory:22.603267129907795 pods:44.54545454545455]
I0217 07:55:20.794031       1 nodeutilization.go:170] "Node is appropriately utilized" node="testkubear01n04" usage=map[cpu:3300m memory:4540Mi pods:47] usagePercentage=map[cpu:28.448275862068964 memory:23.60138748155046 pods:42.72727272727273]

which shows CPU utilization quite different from what the kubectl top nodes command reports.
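
For instance, for testkubear01n03 kubectl top nodes reports 11555m (99%) of CPU actually in use, while the descheduler log reports only 2950m, or about 25%, and that percentage appears to be taken against the node's allocatable CPU (2950m / 25.43% ≈ 11600m), so the two tools are clearly measuring different quantities.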

Further, I logged into testkubear01n03, which had shown 99% CPU utilization, and ran top to see the details:

top - 16:37:53 up 10 days, 4 min,  1 user,  load average: 43.70, 43.61, 43.64
Tasks: 977 total,  17 running, 874 sleeping,   0 stopped,   1 zombie
%Cpu(s): 26.7 us, 12.9 sy,  0.0 ni,  4.3 id,  0.1 wa,  0.0 hi,  0.2 si, 55.9 st
KiB Mem : 21273500 total,  3292272 free,  7739800 used, 10241428 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  6348684 avail Mem 

It showed a high st (steal time) percentage. Indeed, my cluster runs on a batch of VMs hosted on a private cloud platform that is short on resources.

I suspect the descheduler does not count the st percentage toward CPU utilization. However, Kubernetes, as a cloud-native tool, is commonly deployed on VM nodes, and on VM nodes the st percentage should be taken into account when computing the CPU that is actually available.

What did you expect to see?

What did you see instead?

@owenchenxy added the kind/bug label Feb 17, 2022
Contributor

damemi commented Feb 17, 2022

Hi @owenchenxy
This is because the descheduler determines usage based only on pod requests/limits, not on actual real-time resource consumption. It does this in order to be consistent with the scheduler, which takes the same approach.

For this reason, the actual usage from commands like kubectl top (which uses kubelet metrics) may differ from the scheduler and descheduler's view. There has been some discussion on adding metrics-based real time usage descheduling, but this effort has not made much progress beyond initial proposals.
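
To make the distinction concrete, here is a minimal sketch of the requests-based calculation (not the descheduler's actual code), using numbers inferred from the log above for testkubear01n01:

    package main

    import "fmt"

    // Requests-based view: node "usage" is the sum of the resource requests of
    // the pods scheduled there, divided by the node's allocatable capacity.
    // The numbers below are inferred from the log, not queried from a cluster.
    func main() {
        podCPURequestsMilli := 5350.0  // total CPU requested by pods on the node
        allocatableCPUMilli := 11600.0 // allocatable CPU implied by 5350m == 46.12%

        pct := podCPURequestsMilli / allocatableCPUMilli * 100
        fmt.Printf("requests-based cpu usage: %.2f%%\n", pct) // ~46.12%, matching the log

        // kubectl top, by contrast, reports actual consumption from kubelet
        // metrics, which includes load the requests-based view never sees.
    }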

For more information, please see the pinned issue at the top of this repo: #225

(We also have a PR open to make this clearer in the documentation #708)

I'm going to close this issue as a duplicate; please feel free to continue discussion on the topic in that main issue if you would like. Thanks!
/close

@k8s-ci-robot
Contributor

@damemi: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
