cpu utilization does not include st percentage #730

Closed
owenchenxy opened this issue Feb 17, 2022 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


owenchenxy commented Feb 17, 2022

What version of descheduler are you using?

descheduler version: v0.23.0

Does this issue reproduce with the latest release?
yes

Which descheduler CLI options are you using?

          command:
            - "/bin/descheduler"
          args:
            - "--policy-config-file"
            - "/policy-dir/policy.yaml"
            - "--descheduling-interval"
            - "5m"
            - "--v"
            - "3"

Please provide a copy of your descheduler policy config file

  policy.yaml: |
    apiVersion: "descheduler/v1alpha1"
    kind: "DeschedulerPolicy"
    strategies:
      "RemoveDuplicates":
         enabled: true
      "RemovePodsViolatingInterPodAntiAffinity":
         enabled: true
      "LowNodeUtilization":
         enabled: true
         params:
           nodeResourceUtilizationThresholds:
             thresholds:
               "cpu" : 20
               "memory": 20
               "pods": 20
             targetThresholds:
               "cpu" : 50
               "memory": 50
               "pods": 50

What k8s version are you using (kubectl version)?

kubectl version Output
$ kubectl version
v1.20.4

What did you do?
I applied the descheduler in my Kubernetes cluster, but no pods were evicted as I expected.

I ran kubectl top nodes to inspect the nodes' utilization:

root@ubtunt:~# kubectl top nodes
NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
testkubear01n01      3890m        33%    6964Mi          36%       
testkubear01n02      6500m        56%    5912Mi          30%       
testkubear01n03      11555m       99%    11638Mi         60%       
testkubear01n04      11651m       100%   9221Mi          47%

It showed high CPU utilization for testkubear01n03 and testkubear01n04.

However, when I inspected the descheduler's log, I found the following:

I0217 07:55:20.793827       1 nodeutilization.go:167] "Node is overutilized" node="testkubear01n01" usage=map[cpu:5350m memory:14888Mi pods:22] usagePercentage=map[cpu:46.12068965517241 memory:77.39591560029146 pods:20]
I0217 07:55:20.793901       1 nodeutilization.go:167] "Node is overutilized" node="testkubear01n02" usage=map[cpu:6350m memory:16680Mi pods:21] usagePercentage=map[cpu:54.741379310344826 memory:86.71170554895632 pods:19.09090909090909]
I0217 07:55:20.793963       1 nodeutilization.go:170] "Node is appropriately utilized" node="testkubear01n03" usage=map[cpu:2950m memory:4348Mi pods:49] usagePercentage=map[cpu:25.43103448275862 memory:22.603267129907795 pods:44.54545454545455]
I0217 07:55:20.794031       1 nodeutilization.go:170] "Node is appropriately utilized" node="testkubear01n04" usage=map[cpu:3300m memory:4540Mi pods:47] usagePercentage=map[cpu:28.448275862068964 memory:23.60138748155046 pods:42.72727272727273]

which shows CPU utilization quite different from what the kubectl top nodes command reports.
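
For instance, for testkubear01n03 kubectl top nodes reports 11555m (99%) of CPU actually in use, while the descheduler log reports only 2950m, or about 25%, and that percentage appears to be taken against the node's allocatable CPU (2950m / 25.43% ≈ 11600m), so the two tools are clearly measuring different quantities.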

Further, I logged into testkubear01n03, which had shown 99% CPU utilization, and ran top to see the details:

top - 16:37:53 up 10 days, 4 min,  1 user,  load average: 43.70, 43.61, 43.64
Tasks: 977 total,  17 running, 874 sleeping,   0 stopped,   1 zombie
%Cpu(s): 26.7 us, 12.9 sy,  0.0 ni,  4.3 id,  0.1 wa,  0.0 hi,  0.2 si, 55.9 st
KiB Mem : 21273500 total,  3292272 free,  7739800 used, 10241428 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  6348684 avail Mem 

It showed a high st (steal time) percentage. Indeed, my cluster runs on a batch of VMs hosted on a private cloud platform that is short on resources.

I suspect the descheduler does not count the st percentage toward CPU utilization. However, Kubernetes, as a cloud-native tool, is commonly deployed on VM nodes, and on VM nodes the st percentage should be taken into account when computing the CPU that is actually available.

What did you expect to see?

What did you see instead?

@owenchenxy added the kind/bug label Feb 17, 2022
Contributor

damemi commented Feb 17, 2022

Hi @owenchenxy
This is because the descheduler determines usage based only on pod requests/limits, not on actual real-time resource consumption. It does this in order to be consistent with the scheduler, which takes the same approach.

For this reason, the actual usage from commands like kubectl top (which uses kubelet metrics) may differ from the scheduler and descheduler's view. There has been some discussion on adding metrics-based real time usage descheduling, but this effort has not made much progress beyond initial proposals.
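
To make the distinction concrete, here is a minimal sketch of the requests-based calculation (not the descheduler's actual code), using numbers inferred from the log above for testkubear01n01:

    package main

    import "fmt"

    // Requests-based view: node "usage" is the sum of the resource requests of
    // the pods scheduled there, divided by the node's allocatable capacity.
    // The numbers below are inferred from the log, not queried from a cluster.
    func main() {
        podCPURequestsMilli := 5350.0  // total CPU requested by pods on the node
        allocatableCPUMilli := 11600.0 // allocatable CPU implied by 5350m == 46.12%

        pct := podCPURequestsMilli / allocatableCPUMilli * 100
        fmt.Printf("requests-based cpu usage: %.2f%%\n", pct) // ~46.12%, matching the log

        // kubectl top, by contrast, reports actual consumption from kubelet
        // metrics, which includes load the requests-based view never sees.
    }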

For more information, please see the pinned issue at the top of this repo: #225

(We also have a PR open to make this clearer in the documentation #708)

I'm going to close this issue as a duplicate; please feel free to continue discussion on the topic in that main issue if you would like. Thanks!
/close

@k8s-ci-robot
Contributor

@damemi: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
