Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
I am experiencing a similar problem trying to get the NVIDIA driver running on my SurfaceBook-2 with Ubuntu 18.04.3 LTS, dual-booting with Windows, with dual monitors (via a Surface Dock) and Secure Boot disabled.
Thanks for spotting that. Not sure how that happened, but I think it is corrected now. I still cannot get the NVIDIA GPU working though. (I would like to run the GPU enabled TensorFlow and GPU enabled OpenCV.) I have rolled back the driver to 390 and back to 430 (rebooting in the process). I cycled between Intel and Nvidia using prime-select with no success. New bug-report attached.
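For reference, the prime-select cycling I mean looks roughly like this on Ubuntu (the standard profile names, with a log-out or reboot after each switch):

  prime-select query          # show which GPU profile is currently active
  sudo prime-select nvidia    # switch to the NVIDIA GPU
  sudo prime-select intel     # switch back to the integrated Intel GPU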
After I shut down the computer following that failed update, on restart I got stuck in an infinite loop at the login screen (I entered my password, was not logged into the GNOME desktop, was asked for the password again, entered it again, was not logged in, and so on). I researched online and found out that the nvidia-340 drivers are not compatible with my current kernel version.
All my graphics programs are broken. In the past, when my machine was functioning properly, I had installed Qt5 and linked it against a framework I use for work. Now that framework doesn't function anymore, and I cannot compile programs against the visualization drivers anymore.
I tried to reinstall the visualization drivers again, as I did 3-4 months ago on the same machine (when I successfully set up that framework I use for work), but I fail at the very first step:
sudo apt install qt5-default
I have tried to switch to Nouveau from Applications -> Software & Updates -> Additional Drivers, but it doesn't work. When I click to switch to it, I get: Error while applying changes: pk-client-error-quark: Error while installing package: installed nvidia-340 package post-removal script subprocess returned error exit status 127 (313)
I have tried to run nvidia-smi -> it returns 'nvidia-smi': command not found, but can be installed with: followed by a list of nvidia driver packages it recommends installing via sudo apt install nvidia-XYZ or sudo apt install nvidia-utils-XYZ.
I have tried to run sudo ubuntu-drivers autoinstall -> this returns the same thing as before:
The following packages will be removed: nvidia-340
Removing nvidia-340 ...
dpkg: error processing package nvidia-340 (--remove):
 installed nvidia-340 package post-removal script subprocess returned error exit status 127
dpkg: too many errors, stopping
Errors were encountered while processing:
 nvidia-340
...
I have tried to run sudo apt-get install nvidia-340 -> this returns that nvidia-340 is already the newest version (340.108-0ubuntu5.20.04.2).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
Need to get 52,0 MB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 ...
Fetched 52,0 MB in 2s
Processing triggers for libc-bin (2.31-0ubuntu9.2) ...
and here I don't get any more output; the terminal just gives me a new prompt to enter the next command (so the process exits fine and I can continue doing stuff).
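For what it's worth, the "post-removal script subprocess returned error exit status 127" above usually means dpkg ran the package's maintainer script (/var/lib/dpkg/info/nvidia-340.postrm) and a command inside it was not found. A workaround that is often suggested for this kind of postrm failure (not something confirmed here) is to neutralize that script and retry the removal:

  sudo cat /var/lib/dpkg/info/nvidia-340.postrm          # inspect the failing script first
  printf '#!/bin/sh\nexit 0\n' | sudo tee /var/lib/dpkg/info/nvidia-340.postrm
  sudo dpkg --remove --force-remove-reinstreq nvidia-340
  sudo apt-get -f install                                # let apt repair remaining dependencies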
Sticking with the 5.4 kernel will not be an option forever. I had the same issue with an early 2009 iMac, where the last proprietary NVIDIA driver available for my C79 [GeForce 9400] card is also nvidia-340.
I have noticed on a couple of new installs that nvidia-drm.modeset=1 no longer seems to be added to /etc/default/grub. If an older install already has it, the option is not removed, but a new clean install seems to no longer receive that option on the kernel command line.
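If anyone needs to add it back by hand, it is just a kernel parameter in /etc/default/grub followed by regenerating the grub configuration (the rest of the GRUB_CMDLINE_LINUX_DEFAULT line will vary per system):

  GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvidia-drm.modeset=1"   # in /etc/default/grub
  sudo update-grub
  sudo reboot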
A system reboot should work, but you may want to forbid automatic updates of this package by modifying the /etc/apt/sources.list.d/ files. In my experience, the best way is to simply hold the packages to prevent automatic updates by executing the commands below.
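A minimal sketch of what holding the packages looks like with apt-mark (the exact package names depend on which driver version is installed; the 440 series here is only an example):

  sudo apt-mark hold nvidia-driver-440 nvidia-dkms-440 nvidia-utils-440
  apt-mark showhold            # verify the holds
  sudo apt-mark unhold nvidia-driver-440 nvidia-dkms-440 nvidia-utils-440   # undo later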
However, I had completely removed the 384 version, and removed any remaining kernel drivers nvidia-384*. But even after reboot, I was still getting this. Seeing this meant that the kernel was still compiled to reference 384, but it was only finding 410. So I recompiled my kernel:
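Assuming the NVIDIA module is managed by DKMS, rebuilding it against the running kernel (rather than recompiling the whole kernel) looks something like this sketch:

  dkms status              # see which module/driver versions DKMS knows about
  sudo dkms autoinstall    # rebuild and install modules for the running kernel
  sudo reboot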
I experienced this problem after a normal kernel update on a CentOS machine. Since all CUDA and Nvidia drivers and libraries have been installed via YUM repositories, I managed to solve the issues using the following steps:
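Roughly, the steps amount to making sure headers for the new kernel are present and letting DKMS rebuild the module (a sketch of the idea, not the poster's exact commands):

  sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
  sudo dkms autoinstall
  sudo reboot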
I had to restart my kernels and remove all the packages that I had installed previously (during the first installation). Please make sure that all of those packages are actually deleted, even after removing them with the command below:
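A typical full removal on Ubuntu, followed by a check that nothing is left behind, looks like this (assuming the apt-installed driver packages rather than a .run installer):

  sudo apt-get purge 'nvidia-*' 'libnvidia-*'
  sudo apt-get autoremove
  dpkg -l | grep -i nvidia    # should list no driver packages afterwards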
For completeness, I ran into this issue as well. In my case it turned out that because I had set Clang as my default compiler (using update-alternatives), nvidia-driver-440 failed to compile (check /var/crash/) even though apt didn't post any warnings. For me, the solution was to apt purge nvidia-*, set cc back to use gcc, reboot, and reinstall nvidia-driver-440.
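In case it helps anyone, switching cc back can be done with update-alternatives (assuming gcc lives at /usr/bin/gcc):

  sudo update-alternatives --config cc             # pick gcc interactively
  sudo update-alternatives --set cc /usr/bin/gcc   # or set it directly
  cc --version                                     # confirm gcc is the default again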
In my case, the NVRM version was 440.100 and the driver version was 460.32.03. My driver had been updated by sudo apt install caffe-cuda and I didn't notice it at the time, but I confirmed it later from /var/log/apt/history.log.
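Checking the apt history for an unintended driver upgrade is quick; zgrep also covers the older, rotated logs:

  grep -i -B2 -A4 nvidia /var/log/apt/history.log
  zgrep -i nvidia /var/log/apt/history.log.*.gz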
I installed the drivers according to the instructions (not sure if they were exactly the same steps as yours, but they look somewhat familiar); no errors were reported, but nvidia-smi gives similar output after a reboot.
I am generally hesitant to test out different things with Secure Boot, as I have sometimes ended up with unbootable OS installations when playing with the Secure Boot settings on my own, without fully understanding what I am doing.
I want to connect the Quest 2 via Air Link, so I installed the Oculus app. Now my PC restarts or hangs, and I have to use a restore point to get it working normally again. At some point I couldn't restore it anymore, so I had to reinstall the whole PC. Now I have installed everything again, including the Oculus app, but not the NVIDIA drivers, and the PC runs fine. But as soon as I install the drivers, which I need for games, it hangs again and that's it; I have to use a restore point to get the PC running again.
@TheBlackGoddess I suggest you use DDU in Safe Mode to properly uninstall your NVIDIA driver (google how to do this), then reinstall the latest NVIDIA driver (527.56 is working very well with my 3090 GPU). Also, like support mentioned, downloading the Oculus desktop PC app and running a Repair may help. Also, check the USB entries in your Win10 Device Manager to make sure you've disabled any power-saving options. I've also found it best with Win10 to disable hardware-accelerated GPU scheduling (HAGS) and turn Game Mode off. Hope this helps.
Hey there, @TheBlackGoddess! We're glad to hear that you've fixed it! If there's ever anything else we can help out with or if you have any questions that we could answer, please feel free to contact us again!
NVIDIA Driver Error: Found no NVIDIA driver on your system
However, my Python script that loads models to CUDA still errored out with RuntimeError: Found no NVIDIA driver on your system.
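A quick sanity check from the VM's shell, assuming the models are loaded with PyTorch, shows whether the driver is visible at all:

  nvidia-smi
  python3 -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"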
Question
Can you please advise how to solve this NVIDIA driver issue? Our deployment spins up a GPU VM on-demand as inference requests arrive, thus ideally the A100 VM on GCP already has an Nvidia Driver pre-installed to avoid latency. Thank you!
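For reference, one possibility (the instance name, zone, machine type, and image family below are example values) is to create the VM from a Deep Learning VM image, which can install the driver automatically at first boot via instance metadata:

  gcloud compute instances create my-a100-vm \
    --zone=us-central1-a \
    --machine-type=a2-highgpu-1g \
    --image-family=common-cu113 \
    --image-project=deeplearning-platform-release \
    --maintenance-policy=TERMINATE \
    --boot-disk-size=200GB \
    --metadata="install-nvidia-driver=True"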
Due to the nature of the issue you're experiencing, it would be impossible to reproduce the issue without inspecting your project. Please follow one of these 2 options in order for GCP Support to assist you:
I have the exact same problem. The GPU worked fine a few days ago. I started the instance again today and `nvidia-smi` displays that same error. It's like the driver disappeared. Did you have any luck figuring out what happened?
I haven't figured it out yet, unfortunately. Were you starting up an A100 GPU VM instance with a custom VM storage (OS) image with the NVIDIA driver and CUDA toolkit pre-installed, one that worked before but now fails? I'd appreciate it if you could keep me posted if you figure it out!
Upon a more careful re-read, I see that you're spinning up instances instead of starting an existing instance. And that you already reinstalled the driver manually. For what it's worth, my python code is able to utilise the GPU via pytorch. Apologies for the misinformed answer!