Low fps on external monitor connected to nvidia hdmi port #650

Open
2 tasks done
tm4ig opened this issue Jun 2, 2024 · 31 comments
Labels
bug Something isn't working NV-Triaged An NVBug has been created for dev to investigate

Comments

@tm4ig

tm4ig commented Jun 2, 2024

NVIDIA Open GPU Kernel Modules Version

555.42.02

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Arch Linux

Kernel Release

6.9.3

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

AMD Radeon 780M IGPU + NVIDIA GeForce RTX 4060 Laptop DGPU (UUID: GPU-36f796c8-ee38-5be5-08a5-c7d8635be2d6)

Describe the bug

I have an ASUS laptop with an AMD Radeon 780M iGPU and an NVIDIA GeForce RTX 4060 Mobile Max-Q dGPU, running KDE 6 in a Wayland session.
The laptop display is connected to the AMD GPU; the external monitor is connected to the NVIDIA GPU (HDMI port).
When I run the glxgears benchmark in a KDE 6 Wayland session on the external monitor connected to the NVIDIA HDMI port, using either the 555.42.02 nvidia-open driver or the 555.42.02 NVIDIA proprietary driver without the nvidia.NVreg_EnableGpuFirmware=0 kernel option, the frame rate is half the screen refresh rate (in my case only ~37-38 fps with the external screen refreshing at 75 Hz).
This looks like the bug https://bugs.kde.org/show_bug.cgi?id=452219, but it is an NVIDIA driver regression, because with the nvidia-open 550.xx driver, or with the NVIDIA proprietary 555.42 driver and the nvidia.NVreg_EnableGpuFirmware=0 kernel option, I get a normal frame rate on the external monitor.
I cannot use the NVIDIA proprietary driver 550.xx or 555.42 because it causes kernel panics (https://forums.developer.nvidia.com/t/series-550-freezes-laptop/284772/135), and NVIDIA has not fixed that problem in more than 3 months.
I do not want to use the nvidia-open 550.xx driver because with that driver and an external monitor, the kwin_wayland process has very high CPU utilization.

To Reproduce

  1. Connect an external monitor to the NVIDIA HDMI port, using Wayland, KWin 6, and the nvidia-open driver 555.42.02
  2. Run glxgears on the external monitor

Bug Incidence

Always

nvidia-bug-report.log.gz

nvidia-bug-report.log.gz

More Info

No response

@tm4ig tm4ig added the bug Something isn't working label Jun 2, 2024
@tm4ig
Author

tm4ig commented Jun 6, 2024

Similar problem: https://forums.developer.nvidia.com/t/wayland-external-monitor-refresh-rate-issue/290752
But in my case the problem occurs with the nvidia-open 555 driver (or the NVIDIA 555 closed driver with GSP firmware).
With the nvidia-open 550 driver (or the NVIDIA proprietary driver 555 with NVreg_EnableGpuFirmware=0) I get a normal frame rate on the external monitor.
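
When comparing the configurations above, it can help to confirm which value of the kernel option was actually passed at boot. The following is a minimal Python sketch, not part of the original report: the function name `gsp_firmware_setting` is hypothetical, it only inspects a kernel command line string (such as the contents of /proc/cmdline), and it assumes that the last occurrence of the option wins. It does not see options set through modprobe configuration files.

```python
from typing import Optional

# Kernel command-line form of the option discussed in this thread.
PARAM = "nvidia.NVreg_EnableGpuFirmware"


def gsp_firmware_setting(cmdline: str) -> Optional[str]:
    """Return the value given for nvidia.NVreg_EnableGpuFirmware, or None if absent.

    Assumption: if the option appears more than once, the last one is used.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == PARAM:
            value = val
    return value


# Example with a made-up command line; on a real system the string could be
# read from /proc/cmdline instead.
sample = "root=UUID=abc rw nvidia.NVreg_EnableGpuFirmware=0 quiet"
print(gsp_firmware_setting(sample))
```

A result of None only means the option was not set on the command line; the driver then uses its own default.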

@tm4ig
Author

tm4ig commented Jun 7, 2024

OGL_DEDICATED_HW_STATE_PER_CONTEXT=ENABLE_ROBUST does not help me

@tm4ig
Author

tm4ig commented Jul 10, 2024

With the NVIDIA proprietary Linux driver 555, the nvidia.NVreg_EnableGpuFirmware=0 kernel option, Wayland, and KDE Plasma 6.1, I get low CPU usage (around 5-20% of one core for kwin_wayland) and a high frame rate (around 70-75 fps at a 75 Hz monitor refresh rate) on the external monitor connected to the NVIDIA HDMI port, but I get kernel panics as in https://forums.developer.nvidia.com/t/series-550-freezes-laptop/284772/210

With the NVIDIA open Linux driver 555, Wayland, and KDE Plasma 6.1, I get high CPU usage (around 20-80% of one core for kwin_wayland) and a lower frame rate (around 65-70 fps at a 75 Hz monitor refresh rate) on the external monitor connected to the NVIDIA HDMI port, but I do not get the kernel panics described in https://forums.developer.nvidia.com/t/series-550-freezes-laptop/284772/210

With the NVIDIA proprietary Linux driver 555, the nvidia.NVreg_EnableGpuFirmware=1 kernel option, Wayland, and KDE Plasma 6.1, I get high CPU usage (around 20-80% of one core for kwin_wayland) and a lower frame rate (around 65-70 fps at a 75 Hz monitor refresh rate) on the external monitor connected to the NVIDIA HDMI port, and I also get kernel panics as in https://forums.developer.nvidia.com/t/series-550-freezes-laptop/284772/210

So with the NVIDIA proprietary driver (fully closed mode) I get the best performance, but I also get kernel panics.
With the NVIDIA open driver I get lower performance, but no kernel panics.
With the NVIDIA proprietary driver with GPU firmware enabled (the default for 555) I get lower performance and kernel panics.

NVIDIA has not been able to fix the kernel panics with hybrid graphics for five months.

@Kimiblock

Reverse PRIME does get tricky on my machine. It has been a known issue for a long time.

@Enverbalalic
Copy link

Are there any updates on this issue? I'm having the same problem with an external monitor connected via DisplayPort (USB-C). The specific configuration is a Ryzen 7000 laptop with an RTX 4080; the external display is basically unusable.

@Kimiblock

Had to use only the dedicated GPU for now. Power consumption is insane.

@lucasslima

Hi all, we also have a thread in the nvidia forums related to the same issue: https://forums.developer.nvidia.com/t/nvidia-please-get-it-together-with-external-monitors-on-wayland/301684/30

@mtijanic
Collaborator

mtijanic commented Oct 9, 2024

This is being tracked as NV bug 4830125

@mtijanic mtijanic added the NV-Triaged An NVBug has been created for dev to investigate label Oct 9, 2024
@kasvtv

kasvtv commented Oct 17, 2024

I have the same issue without using HDMI, using a USB->DP cable

@moiSentineL

same issue.
75Hz external monitor -> 37FPS

Using KDE Plasma 6.2.1.1 Wayland, NVIDIA prop. 560.35.03-17, NVIDIA GTX 1050Ti

Using nvidia.NVreg_EnableGpuFirmware=0 or OGL_DEDICATED_HW_STATE_PER_CONTEXT=ENABLE_ROBUST doesn't help

@NGStaph

NGStaph commented Oct 28, 2024

I use USB4 DisplayPort/Thunderbolt and, funnily enough, hit the advertised target refresh rates on both monitors in KDE Plasma, but not on GNOME.

@virusapex

> This is being tracked as NV bug 4830125

Good day! Is this an internal bug tracker or can we get some updates on this publicly?

@mtijanic
Collaborator

Hey there! Sorry, the NV bug is private, but we can provide public updates here. We have a machine with local repro and are actively working on it, but we don't have a root cause yet. The issue seems to be related to the power savings feature of the GSP or one of the display-specific components. Going to a lower (more power) pstate makes the issue go away.

That's not a solution though, and we're working on understanding the exact cause and how to fix it without consuming excess power. Will update here when we have more to share. And if the fix involves only kernel-side changes, we can post the patches as well.

Thanks for the patience!

@lucasslima

> Hey there! Sorry, the NV bug is private, but we can provide public updates here. We have a machine with local repro and are actively working on it, but we don't have a root cause yet. The issue seems to be related to the power savings feature of the GSP or one of the display-specific components. Going to a lower (more power) pstate makes the issue go away.
>
> That's not a solution though, and we're working on understanding the exact cause and how to fix it without consuming excess power. Will update here when we have more to share. And if the fix involves only kernel-side changes, we can post the patches as well.
>
> Thanks for the patience!

I find this very unlikely. I'm using the closed-source drivers with the GSP disabled, so I don't think it's related to that, although using the open module does make the frame rate on the external monitor more jittery. Setting the clocks to max also makes no difference, and I can't find how to set the power limit with nvidia-settings (in Wayland) or nvidia-smi.

@mtijanic
Collaborator

> I find this very unlikely. I'm using the closed source drivers with the GSP disabled so I don't think it's related to that, while using the open module does make the framerate on the external monitor more jittery.

Hi there. Am I understanding correctly that you are also seeing the "half-FPS on external monitor" issue with GSP disabled too? That doesn't match our experiments.

While debugging this we did find some causes of jitter where individual frames would take longer, but that wasn't the core issue. Eventually we got to the point where, on GSP only, running with <=P4 pstate a monitor runs at 60.0fps, and with >=P5 at 30.000fps. This is the issue we are debugging and that is tracked here and in NV bug 4830125.

> Setting the clocks to max also makes no difference, and I can't find how to set the power limit using nvidia-settings in Wayland/nvidia-smi.

You can poll your pstate with

nvidia-smi --query-gpu="pstate" --format=csv --loop-ms=1000

and any of the following should cause it to change:

  • Running a graphics intensive app, such as __GL_SYNC_TO_VBLANK=0 glxgears (disabling vsync makes it render at max fps and warms up the GPU) should set it to P0
  • Running any CUDA app, such as mpv --hwdec=nvdec-copy video.mp4 will set it to P2
  • This little app should set it to P0: https://gist.github.com/mtijanic/9c129900bfba774b39914ad11b0041f6
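
For anyone who wants to log the polled values rather than watch them, here is a minimal Python sketch of parsing the text printed by the nvidia-smi command above. It is an illustration, not an NVIDIA tool: the function names `parse_pstates` and `in_affected_range` are hypothetical, the parser simply keeps every line of the form `P<number>` and skips anything else (such as a CSV header), and the default threshold of P5 is taken from the observation earlier in this comment, which may differ by hardware.

```python
import re
from typing import List

# A value line looks like "P0" or "P8"; header lines such as "pstate" are skipped.
_PSTATE_LINE = re.compile(r"^P(\d+)$")


def parse_pstates(csv_text: str) -> List[int]:
    """Extract the numeric pstates from captured nvidia-smi CSV text."""
    states = []
    for line in csv_text.splitlines():
        match = _PSTATE_LINE.match(line.strip())
        if match:
            states.append(int(match.group(1)))
    return states


def in_affected_range(pstate: int, threshold: int = 5) -> bool:
    """True if the pstate is at or beyond the threshold where halved fps was seen.

    Assumption: threshold=5 reflects the ">=P5" observation in this thread only.
    """
    return pstate >= threshold


# Example with made-up captured output.
sample = "pstate\nP8\nP8\nP0\n"
print(parse_pstates(sample))                                  # [8, 8, 0]
print([in_affected_range(p) for p in parse_pstates(sample)])  # [True, True, False]
```

The captured text could come from redirecting the polling command to a file and reading it back.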

@lucasslima

Thank you for the information; it helped me get more details on this.

I tried setting the GPU power state as you described and got some interesting results. When running

__GL_SYNC_TO_VBLANK=0 prime-run glxgears

The power state indeed goes to P0 and the frame rate on the desktop improves. However, if I switch to a Firefox window open at https://testufo.com/, the frame rate goes back to half the refresh rate regardless of the GPU power state.

I've made a small recording showing what happens: https://youtu.be/FY-LxShijdk

@ngoquang2708

@lucasslima What happens when you move the Firefox window to the same monitor as glxgears?

@mtijanic
Collaborator

mtijanic commented Nov 27, 2024

@lucasslima thanks for that video. I don't think this is the same issue that we're talking about here. From first glance, it looks like a weird interaction between firefox, kwin and the NVIDIA usermode drivers (related to explicit sync?).

Could you please send this video and the output of your nvidia-bug-report.sh to [email protected]? It will then get routed internally to where it needs to be, since this repo doesn't seem to be the right place for it. Oh, and please mention the make and model of the external monitor; I don't know if it's captured in the bug report log.

EDIT: I see on the forums that NV bug 4824813 was filed for this already and it has the needed info. This bug will be revisited once 4830125 is root caused so we know if it is the same issue or not.

Thanks!

@lucasslima

Glad to be helpful.

> @lucasslima What happens when you move the Firefox window to the same monitor as glxgears?

Both are running on the same screen, it just happens to be ultra-wide.

@Simpuis

Simpuis commented Nov 28, 2024

Using the program above to force P0 fixes the problem for me, although just like @lucasslima on a KDE wayland session with firefox I also get said UFO problem, but without firefox both monitors run smoothly to my eyes.

@busybox11

I have also experienced similar issues on both proprietary and open NVIDIA modules, but only on high refresh rate monitors - I do experience subtle performance issues on my 1080p60 monitor very occasionally, but ALWAYS on high refresh rate ones - 1440p ultrawide 180hz and 1440p 16/9 144hz.

Both GNOME and KDE have this issue, couldn't test on Hyprland as it straight up doesn't work with NVIDIA.

@clone-888

Just want to mention for the devs that version 550.76 of the driver does not have this issue. I found it to be the only driver version that doesn't.

@virusapex

@mtijanic Good day! Sorry to bother again, but was there any progress on this?

@mtijanic
Collaborator

Hi @virusapex , sorry, I was out of office over the holidays, for an unexpectedly long period of time. Other people have since taken over this issue; I see progress but it still hasn't been fully closed down. I'll follow up and get back here a bit later.

@virusapex

> Hi @virusapex , sorry, I was out of office over the holidays, for an unexpectedly long period of time. Other people have since taken over this issue; I see progress but it still hasn't been fully closed down. I'll follow up and get back here a bit later.

Thank you for notifying us!

@NGStaph

NGStaph commented Feb 3, 2025

570.86.16 seems to mostly fix the issue for me, which is to say that performance (with the beta drivers) is almost as good as 565 (closed) with nvidia.NVreg_EnableGpuFirmware=0.

wishful thinking.

@virusapex

> 570.86.16 seems to mostly fix the issue for me, which is to say that performance (with the beta drivers) is almost as good as 565 with nvidia.NVreg_EnableGpuFirmware=0.
>
> others still seem to be facing difficulties (https://forums.developer.nvidia.com/t/570-release-feedback/321956/13) but this has not been my case

As far as I know, you can't disable the GSP firmware on the open kernel module, so nvidia.NVreg_EnableGpuFirmware=0 can't be used in the current situation. As for 570.86.16, I tried it but still got significant lag when moving windows around the screen.

@uentity

uentity commented Feb 9, 2025

Agree with @virusapex, low rate lag is not fixed in 570.86

@x0wllaar

> Using the program above to force P0 fixes the problem for me, although just like @lucasslima on a KDE wayland session with firefox I also get said UFO problem, but without firefox both monitors run smoothly to my eyes.

It's probably because #743 also plays a role here (and it's more of a userspace issue maybe?)

@mtijanic
Collaborator

Quick update here: We now think we fully understand the issue. The short version is that when the GPU is in a low power state (e.g. idle desktop) then there's an extra bit of latency between the various components when a per-frame notification is delivered from the display hardware all the way up to the compositor. This causes us to just miss the vblank interval and drop a frame, which is why the framerate gets cut exactly in half. These delays are bigger with GSP in the picture than when it is disabled, so the issue does not reproduce on the legacy driver mode*.

We're still looking into what a proper fix will be, but the two short term workarounds we found are to either run the GPU at a slightly higher pstate (P4-P5 seems sufficient, depending on the HW) or to deliver the events to kwin on a fixed timer instead.

Will keep you posted.


* It could with a high refresh rate monitor, unless driving such a high refresh rate didn't bump up the power draw anyway
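
The halving described in this update can be illustrated with a toy timing model. This is only a sketch under simplifying assumptions, not the driver's or KWin's actual logic: it assumes each frame starts when the per-frame notification arrives, that notification latency and frame work simply add up, and that a frame which misses a vblank is shown at the next one. The function name `effective_fps` and the millisecond values used below are hypothetical.

```python
import math


def effective_fps(refresh_hz: float, frame_work_ms: float, notify_latency_ms: float) -> float:
    """Frame rate when every frame occupies a whole number of refresh intervals."""
    interval_ms = 1000.0 / refresh_hz
    total_ms = notify_latency_ms + frame_work_ms
    # A frame that is ready just after a vblank has to wait for the next one,
    # so the time per frame is rounded up to whole refresh intervals.
    intervals_per_frame = max(1, math.ceil(total_ms / interval_ms))
    return refresh_hz / intervals_per_frame


# 75 Hz display, 12 ms of (hypothetical) work per frame.
print(effective_fps(75, 12.0, 0.5))  # 75.0: fits inside one 13.33 ms interval
print(effective_fps(75, 12.0, 2.0))  # 37.5: just misses the vblank, rate halves
```

In this model a small amount of extra latency is enough to move from 75 fps to 37.5 fps with nothing in between, which is consistent with the ~37-38 fps reported at 75 Hz at the top of this issue.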
