Introduction#
Installing graphics card drivers is actually very simple.
This article records a frustrating graphics card driver installation that dragged on for more than a month (in the end, the card itself turned out to be broken), along with some lessons I learned along the way.
Background#
Continuing from the previous post: after building a computer for 1500 yuan, I urgently needed a graphics card so that I could use my server for my introductory deep learning course. So I ordered a P104-100 from the seafood market (slang for a second-hand marketplace), and thus began my nightmarish journey of installing graphics card drivers.
Driver Installation Process#
First Attempt#
As soon as I received the graphics card, I tried to install the driver from a PPA. First, add the apt repository.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
Then use the ubuntu-drivers command to find the recommended driver.
❯ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.1/0000:10:00.0 ==
modalias : pci:v000010DEd00001B87sv000010DEsd00001237bc03sc02i00
vendor : NVIDIA Corporation
model : GP104 [P104-100]
driver : nvidia-driver-470-server - distro non-free
driver : nvidia-driver-390 - distro non-free
driver : nvidia-driver-545 - third-party non-free
driver : nvidia-driver-525 - distro non-free
driver : nvidia-driver-535-server - distro non-free
driver : nvidia-driver-470 - distro non-free
driver : nvidia-driver-525-server - distro non-free
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-535 - distro non-free recommended
driver : nvidia-driver-418-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
The output shows that nvidia-driver-535 is recommended, so install it directly.
sudo apt install nvidia-driver-535
sudo reboot
Or use
sudo ubuntu-drivers autoinstall
to achieve the same effect.
At this point the driver installation should be complete and the card ready to use (as it turned out, with a working card it really is that simple). However, when I ran nvidia-smi, I got an error.
❯ nvidia-smi
No devices were found
After some searching, I found that it might be a motherboard problem, so I checked the motherboard settings: I set the CPU's integrated graphics as the default graphics card and confirmed that the Above 4G Decoding option was enabled, but this did not solve the problem. I then suspected that the driver installed this way was incompatible with the mining card.
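Before blaming the driver package, it can help to confirm that the card is at least visible on the PCI bus. The sketch below filters lspci -nn output for NVIDIA's PCI vendor ID 10de, the same ID that appears in the modalias line above. The helper name list_nvidia_pci is made up for illustration.

```shell
# Sketch: keep only the lspci lines that belong to NVIDIA devices.
# Reads `lspci -nn` output on stdin; 10de is NVIDIA's PCI vendor ID.
# list_nvidia_pci is a hypothetical helper name, not a standard tool.
list_nvidia_pci() {
    grep -i '\[10de:'
}
```

Typical usage would be lspci -nn | list_nvidia_pci. If the card is listed there while nvidia-smi still reports No devices were found, the device is being enumerated but the driver could not initialise it, which may point to firmware settings or the hardware itself rather than the choice of driver package.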
Second Attempt#
I uninstalled the previously installed driver and downloaded the .run driver for the GTX 1080 from the official NVIDIA website. Installing this version is more complicated: you have to disable the open-source Nouveau driver and shut down the graphical interface first, otherwise you get a black screen on startup. However, this driver did not work either, and nvidia-smi still showed No devices were found.
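For reference, disabling Nouveau is usually done with a small modprobe configuration file. The following is a minimal sketch of that step, not the exact commands I ran. The function name write_nouveau_blacklist and the file name blacklist-nouveau.conf are assumptions that follow a common convention.

```shell
# Sketch: write a modprobe config that stops the Nouveau module from loading.
# write_nouveau_blacklist is a hypothetical helper; the default path follows
# the common /etc/modprobe.d convention and is an assumption here.
write_nouveau_blacklist() {
    target="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}
```

Writing to the default path requires root. On Ubuntu, the usual follow-up is sudo update-initramfs -u and a reboot, then sudo systemctl isolate multi-user.target to leave the graphical session before running the .run installer from a text console.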
Third Attempt#
I chose to install CUDA directly and selected the option to install the graphics card driver automatically as part of the CUDA installation. The advantage is that CUDA and the driver are installed together, which saves time and effort, and there is no need to disable the open-source driver or shut down the graphical interface, so the process is simpler. However, this method did not work either.
Fourth Attempt#
I tried various methods to install the driver on Manjaro and Windows systems, but there were still various errors.
Analyzing the Problem#
I didn't immediately suspect that the graphics card was the problem because the unscrupulous sellers in the seafood market promised that the card had undergone stress testing before shipping.
I asked for help on the NVIDIA community forum, but no one responded.
I searched through the answers to similar questions there and tried the suggested fixes, but none of them solved my problem. So I carefully studied the driver installation log and found a line that said RmInitAdapter failed!. After comparing it with similar reports online, I realized that the graphics card itself might be the problem. So I ordered a new graphics card from an online platform, used the bundled CUDA and driver installation method, and it worked!
Finally, the wonderful output of nvidia-smi appeared.
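In hindsight, searching the logs for the RmInitAdapter failed! line early would have saved a lot of time. A minimal sketch of such a check follows. The helper name has_rm_init_failure is hypothetical, and feeding it the kernel log via dmesg is an assumption about where the message shows up on a given system.

```shell
# Sketch: succeed (exit 0) if the log text on stdin contains the
# "RmInitAdapter failed" message found in the installation log.
# has_rm_init_failure is a hypothetical helper name.
has_rm_init_failure() {
    grep -q 'RmInitAdapter failed'
}
```

For example: sudo dmesg | has_rm_init_failure && echo "driver could not initialise the adapter". A match does not prove the card is dead, but if it persists across drivers and operating systems, as it did for me, the hardware becomes the main suspect.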
Conclusion#
With that, the new graphics card works normally. The lesson: sellers on the seafood market cannot be trusted blindly when it comes to hardware, and when something goes wrong, suspect the hardware early. Don't spend precious time troubleshooting everything else only to discover in the end that the hardware is faulty. Also, don't take a graphics card apart before you have confirmed that it works. I opened mine to check the thermal paste as soon as I received it, which damaged the warranty sticker. When I found out that the card was faulty, I could not return it and could only resell it as a faulty card.