28 April 2019

28th of April

Farm status
Intel GPUs
Were running Seti, now all on Einstein.

Nvidia GPUs
Two running Seti

Raspberry Pis
All running Einstein BRP4 work


Other news
The weather cooled off over the weekend, so I had all the Intel GPU machines and two of the Nvidia GPU machines running Seti. They reached the goal of 60 million, so I have now put the Intel GPUs back to running Einstein work.

Einstein have released (or should that be re-released?) their O2OS20 Continuous Gravity Wave search. We started this late last year, then some issues were raised and it was stopped. Only one of the Intel GPU machines has received any; the others got O1OD1 work. Initial estimates to complete an O2OS20 task are 10 to 11 hours (on an i7-8700 at its stock speed of 3.2GHz).
 

GTX 1660 Ti
As you would have gathered from my previous post, Debian seem to have issues with their Nvidia drivers and I can't get the card going. Just as well I only installed one of them. Hopefully they'll fix their mess soon.

Nvidia also released the GTX 1650 last week, so there is now a 430.09 driver to support it under Linux. Nvidia refer to it as a "beta" driver, so I suspect Debian will ignore it until there is a release version available.

19 April 2019

A journey I would rather not go on

The two ASUS GTX 1660 Ti cards arrived. Being eager to get the new toy going, I went to install one of them. Hardware installation was fine. I already had a PCIe power cable with a 6+2 pin power connector running the older graphics card. I swapped the cards with no issues, powered up the machine and got an ASUS logo followed by the Debian desktop. All looking good so far.

I went to check the BOINC logs and it couldn’t work out what model GPU it was, so time to reinstall the current (410) driver. It got part of the way through before complaining about unmet dependencies. But it’s from the repo, so why does it have unmet dependencies? I decided to try removing it and rebooting. Oh great, no desktop now. At least I can log into the box remotely.

Next I try installing the driver from Debian Buster (the next release). No, that has unmet dependencies as well. Let’s try the version from Debian Experimental (418.56) as it’s more up to date. It wants to install 800 updates. Okay, last resort before I give up and put the old card back into the machine: let’s do a dist-upgrade to get to Debian Buster. Two hours later it’s finished. Reboot and we have an ASUS logo and the new dark-themed (more like a grey camouflage look) desktop. It still doesn’t recognise the GPU though. Debian Buster still has the 410 driver.

Okay, now try installing the driver from Experimental. It installed okay. Let’s hold our breath, cross our fingers and reboot. I get an ASUS logo and the camo desktop. Well, that bit is still working at least. I check the BOINC log and now it recognises the GPU. Hooray. Let’s see if it can be used for compute. I set BOINC to no cache and allow it to fetch work; it downloads 16 CPU tasks and one GPU task. I disable work fetch and watch. The GPU task isn’t moving. Uh oh. Let’s give it a bit of time. After about 30 seconds it jumps to 23% done and slowly starts counting up. Looking good. It gets to about 50% and, oh crap, it’s gone back to 0% and started counting up again. I keep watching as it gets past 50%, makes its way up to 100% and then uploads. I’m not too sure what happened there, but it looks like it worked. I know we’ve gone from CUDA 10.0 to 10.1 with the driver update.

I try to shut it down the following morning once the CPU tasks have finished. I log in as root and run “shutdown now”: command not found. Oh wonderful. A bit of googling and I find out we have to use “systemctl poweroff” and “systemctl reboot” now. The service command is also gone; we use “systemctl stop xxx” or “systemctl start xxx” to stop or start services.
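
For anyone scripting this, here is a minimal sketch of the systemctl calls mentioned above. The helper name systemctl_command is made up for illustration, and boinc-client is only an assumed service name, so substitute whatever unit you actually want to stop. It only builds the command line and does not run anything.

```python
import shlex

# Hypothetical helper: builds the systemctl command for the actions
# mentioned in this post. Nothing is executed here.
MACHINE_ACTIONS = {"poweroff", "reboot"}
SERVICE_ACTIONS = {"start", "stop"}


def systemctl_command(action, unit=None):
    """Return the argument list for a systemctl call."""
    if action in MACHINE_ACTIONS:
        if unit is not None:
            raise ValueError(f"{action} does not take a unit name")
        return ["systemctl", action]
    if action in SERVICE_ACTIONS:
        if not unit:
            raise ValueError(f"{action} needs a unit name")
        return ["systemctl", action, unit]
    raise ValueError(f"unsupported action: {action}")


if __name__ == "__main__":
    # "boinc-client" is an assumed unit name, used only as an example.
    for args in (("poweroff",), ("reboot",), ("stop", "boinc-client")):
        print(shlex.join(systemctl_command(*args)))
```

To actually run one of these you could pass the returned list to subprocess.run as root, but I have left that out on purpose so the sketch can't power anything off by accident.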

Where to now? Next I will update the Seti Multi-beam app. The one I have is CUDA 9 and there is a CUDA 10.1 version. Hopefully that will work, but don’t hold your breath...


Update 25 April 
I raised a bug for Debian. They seem to have fixed the driver dependencies for Experimental and moved it up to Sid. The drivers at Stretch and Stretch-backports are still broken.

I tried re-installing Stretch, upgrading to Buster and then installing the driver from Sid. The machine hangs at boot time and won't display the desktop at all.

I have also tried downloading the driver directly from Nvidia. However, to install it you need to get gcc and various other dependencies sorted out by hand.


Update 11 May
Debian have pushed the 418.56-2 driver through to stretch-backports. This works and I have finally got the GTX 1660 Ti running. I even upgraded the driver on the GTX 1060 machines and they are running fine as well.
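
If you want to check a machine before swapping cards, a rough sketch like the one below compares a driver version string against 418.56. That threshold is an assumption based only on what worked for me here (410 did not recognise the card, 418.56 did); it is not an official minimum from Nvidia. The function names are made up for illustration.

```python
import re

# Assumption: 418.56 is the first version this post found working with the
# GTX 1660 Ti. It is not an official minimum from Nvidia.
KNOWN_GOOD = (418, 56)


def parse_driver_version(text):
    """Turn '418.56' or a Debian-style '418.56-2' into a tuple such as (418, 56)."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if not match:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_known_good(text, minimum=KNOWN_GOOD):
    """True if the version is at least the known-good one."""
    return parse_driver_version(text) >= minimum


if __name__ == "__main__":
    for version in ("410", "418.56-2", "430.09"):
        print(version, is_known_good(version))
```

The version string itself can come from the package manager or from something like nvidia-smi --query-gpu=driver_version --format=csv,noheader once a driver is loaded.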

14 April 2019

14th of April

Farm status
Intel GPUs
All running Seti overnight

Nvidia GPUs
Two running Seti overnight

Raspberry Pis
All running Einstein


Other news
Seti is on 59,418,000 credits, so I have the farm concentrating on it to reach 60 million.
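
For the record, the gap to the goal is simple arithmetic. A tiny sketch, with a made-up function name, using the figures above:

```python
GOAL = 60_000_000  # the credit target mentioned in this post


def credits_remaining(current, goal=GOAL):
    """Credits still needed to reach the goal (never negative)."""
    return max(goal - current, 0)


if __name__ == "__main__":
    print(f"{credits_remaining(59_418_000):,} credits to go")  # 582,000 credits to go
```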

Seti have 20th anniversary T shirts being organised so I have ordered one.


GPU ordering
The EVGA GTX 1660 Ti that I wanted (the XC Ultra) still isn't available in Australia, so I might have to buy another brand. ASUS have a dual-slot card, or maybe I will just get the EVGA 2.75-slot one. I will check what is available tomorrow and put an order in this week.

The bad news is that the Turing-based cards don't work with GPUgrid or Asteroids@home, which will restrict which projects I can use them on.

I am thinking I will go back to my original idea of having a dedicated GPU machine, possibly with two graphics cards. Probably an i5-8500T (6 cores/6 threads) with a TDP of 35 watts and two GTX 1660 Tis. For the moment I will just get a pair of 1660 Tis and swap out two of my GTX 1060s. The dedicated machine can come later.