Getting what you really want from your Nvidia on Linux
I had to fight with my Nvidia card for a while to get my new media center, based on a Fujitsu/Siemens Scaleo E, to work as it should. One thing that struck me during that fight is how much random information about Nvidia performance on Linux is floating around the net. As a result you have to grab bits from here and bits from there, and at the end of the day the card still does not perform quite the way it could.
Bus Specific Notes
- AGP - nothing to optimise in general. It just works. I have five or six cards from Inno3D, PNY, XFX and Asus, from the 5xxx up to the 7xxx series, and all of them have worked fine performance-wise. Only the Asus can do suspend-to-RAM (S3); PNY and the rest cannot. Adjusting the frequency by hand using nvidia-settings or nvclock has no effect.
- PCI-Express - Ughh... The road to hell is paved with good intentions. It is supposed to be a better bus than AGP, and it can be, but on a lot of PCs it actually is not. The reason is that legacy and even APIC interrupts from PCIe devices end up being shared with stuff like USB. As a result, if reliable performance is desired, it is essential that key hardware is configured to use the newer interrupt mechanisms like Message Signalled Interrupts (MSI) instead of legacy interrupts.
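One way to see whether the card shares an interrupt line, and whether it is using MSI, is to look at /proc/interrupts. The following is a minimal Python sketch and not part of the original notes: the function name nvidia_irq_report and the sample text are made up for illustration, and it assumes that the driver registers its handler under the name "nvidia" and that the kernel labels MSI lines with the string "MSI".

```python
def nvidia_irq_report(interrupts_text):
    """Return one dict per IRQ line that mentions the nvidia handler.

    Assumption: handler names are the comma-separated list at the end of
    each /proc/interrupts line, and MSI lines contain the string "MSI".
    """
    report = []
    for line in interrupts_text.splitlines():
        irq, sep, rest = line.partition(":")
        if not sep or "nvidia" not in rest:
            continue  # header line, or an IRQ the card does not use
        chunks = [chunk.strip() for chunk in rest.split(",")]
        # The first chunk still carries the counters and controller type,
        # so only its last whitespace-separated token is a handler name.
        handlers = [chunks[0].split()[-1]] + chunks[1:]
        report.append({
            "irq": irq.strip(),
            "msi": "MSI" in rest,
            "shared_with": [h for h in handlers if h != "nvidia"],
        })
    return report


# Hypothetical sample; on a real machine read open("/proc/interrupts").read()
SAMPLE = """\
           CPU0       CPU1
 16:     812345          0   IO-APIC-fasteoi   uhci_hcd:usb1, nvidia
 19:       4321          0   IO-APIC-fasteoi   ehci_hcd:usb2
"""

if __name__ == "__main__":
    for entry in nvidia_irq_report(SAMPLE):
        print(entry)
```

A non-empty shared_with list together with msi set to False would correspond to the shared legacy interrupt situation described above.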
Nvidia Performance and Power Consumption Related Settings
| Setting | Means of Changing | Performance | Power/Heat/Comments |
| MSI | NVreg_EnableMSI=1 as an argument to modprobe nvidia | Improves performance | Switches nvidia from legacy to MSI interrupts, improving the speed and predictability of interrupt handling. It improves performance, especially for video, when using adaptive clocking or powersave. The drivers, however, are buggy: as of the 190+ driver series, the first interrupt after the card exits ACPI S3 is legacy, not MSI. The driver does not catch it, so the card never wakes up. BUG |
| Clocks | nvclock -n, nvclock -m, nvidia-settings | Can decrease/increase performance | Can draw more power and cook your card if overclocked. I have yet to see an entry-level to mid-range card where underclocking nvidia manually has a positive effect on power consumption, so this setting is useless for decreasing power consumption |
| Coolbits | xorg.conf: Option "Coolbits" "1" | N/A | By itself it makes no difference. It does, however, allow most performance/power related settings to be manipulated from inside X |
| PowerMizer | xorg.conf: Option "RegistryDwords" "PowerMizerEnable=0x1" | Enables card power management | It is essential to also set a strategy using the other options as needed |
| PerfLevelSrc | xorg.conf: Option "RegistryDwords" "PerfLevelSrc=0xNNNN" | Sets the card clocking strategy | Determines whether the card is clocked adaptively or at a fixed clock. See the PerfLevelSrc table for more details. 22 in a byte signifies fixed clocking, 33 signifies adaptive. If you are using adaptive clocking on a desktop, the clocking transitions are often noticeable in legacy interrupt mode (video using xv is a good example). MSI definitely helps here, but MSI means the newest (non-packaged) driver and no ACPI S3 sleep |
| PowerMizerDefault | xorg.conf: Option "RegistryDwords" "PowerMizerDefault=0xN" | Sets fixed perf level on battery | In conjunction with PerfLevelSrc set to 0x22NN (fixed battery clocking), determines performance on battery. See the PerfLevelSrc table for more details. Values are 1-3: 3 for max powersave, 1 for max performance |
| PowerMizerDefaultAC | xorg.conf: Option "RegistryDwords" "PowerMizerDefaultAC=0xN" | Sets fixed perf level on AC | In conjunction with PerfLevelSrc set to 0xNN22 (fixed AC clocking), determines performance on AC. See the PerfLevelSrc table for more details. Values are 1-3: 3 for max powersave, 1 for max performance |
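The table lists each registry key as its own Option line, but in practice several keys are usually combined into a single RegistryDwords string separated by semicolons. The sketch below shows one way to assemble such a line in Python. The helper name registry_dwords_option and the example values are illustrative assumptions, not a recommended configuration; check the result against your driver's README before using it.

```python
def registry_dwords_option(settings):
    """Build one xorg.conf Option line from a dict of registry keys.

    Assumption: the driver accepts several key=value pairs in a single
    RegistryDwords string, separated by semicolons.
    """
    pairs = "; ".join(f"{key}={value:#x}" for key, value in settings.items())
    return f'Option "RegistryDwords" "{pairs}"'


# Example values only: fixed clocking on both AC and battery,
# max powersave on battery (3), max performance on AC (1).
line = registry_dwords_option({
    "PowerMizerEnable": 0x1,
    "PerfLevelSrc": 0x2222,
    "PowerMizerDefault": 0x3,
    "PowerMizerDefaultAC": 0x1,
})
print(line)
```

The printed line would go inside the Device section of xorg.conf, next to the Coolbits option if that is used.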
| PerfLevelSrc | Power/Performance |
| 0x2222 | Static clocking on both AC and battery; PowerMizerDefault and PowerMizerDefaultAC determine the actual mode |
| 0x3322 | Fixed AC clocking, adaptive battery; AC clocking determined by PowerMizerDefaultAC |
| 0x2233 | Fixed battery clocking, adaptive AC; battery clocking determined by PowerMizerDefault |
| 0x3333 | Adaptive throughout |
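Reading the four values in the table, the high byte appears to describe the battery strategy and the low byte the AC strategy. A small Python sketch of that reading follows. The function name decode_perf_level_src is hypothetical, and the byte layout is inferred from the table above rather than taken from any Nvidia documentation.

```python
# Byte values as described in the table: 0x22 = fixed, 0x33 = adaptive.
STRATEGIES = {0x22: "fixed", 0x33: "adaptive"}


def decode_perf_level_src(value):
    """Split a PerfLevelSrc value into battery and AC clocking strategies.

    Assumption (inferred from the table): high byte = battery, low byte = AC.
    """
    battery = STRATEGIES.get((value >> 8) & 0xFF, "unknown")
    ac = STRATEGIES.get(value & 0xFF, "unknown")
    return {"battery": battery, "ac": ac}


for src in (0x2222, 0x3322, 0x2233, 0x3333):
    print(f"{src:#06x}", decode_perf_level_src(src))
```

Under this reading, PowerMizerDefault only matters when the battery byte is fixed, and PowerMizerDefaultAC only when the AC byte is fixed.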
Overall, nvidia's clocking adjustments are still fairly rudimentary. There is no way to set a maximum or a minimum clock in an adaptive strategy, and there is a very limited choice of power levels.
Acknowledgements
- AntiAcknowledgement to Nvidia for their "excellent" documentation. People actually care about performance, power and thermals nowadays, and this has to be documented better.
- The most helpful bit of info on the net on the subject I have found so far is this page in Ukraine.
Topic revision: r3 - 13 Feb 2010 - 13:48:24 - AntonIvanov