22 June 2019

22nd of June

Farm status
Intel GPUs
Four running Asteroids work.

Nvidia GPUs
Two running Seti work
One running Asteroids work

Raspberry Pis
All running Einstein work


Memory upgrades
It seems CPDN have finally decided it's time to start offering 64 bit apps. I think this was driven in part by most Linux distributions dropping support for 32 bit, and they have some new apps that want large amounts of memory. Now that they have done so I added memory to the Intel GPU machines to bring them up to 32GB.

I had the memory for quite a while, the same brand/model as I originally put into them, but waited until now to install it as none of the projects needed large memory. Run times on tasks seem to have improved slightly as a result of the memory upgrades. I'm not sure why, as the CPUs have dual channel memory controllers and they were already using two memory slots. I am not complaining, I would just like to know why it made a difference.


Guinea Pig
The i3/AMD cruncher that I mentioned in my previous post has been used as a bit of a guinea pig for things. It had a 1TB SSHD in there that I swapped out for a Samsung 850 Pro SSD. I swapped out the HD 7770 graphics card for a GTX 1060 so it's now an Nvidia GPU machine. I put Debian Buster on it to see how it behaves.

It doesn't seem to display correctly, and I am not sure if it's Debian Buster or the way I have the screen hooked up. It's plugged into the on-board VGA that the i3 provides rather than the ports on the back of the GTX 1060, as I don't have any spare DisplayPort or DVI-I adapters. NTP doesn't work properly, they've replaced iptables (nftables is now the default) and the service command is replaced by systemctl commands. Oh and the default GNOME desktop now uses Wayland rather than Xorg. Debian have announced they'll be releasing Buster on the 6th of July.

I think I might hold off on upgrading all the machines to Buster until they have had their first point release, which is usually a bunch of fixes that didn't make it in time for the official release.


Asteroids wasn't working
While using the i3 I found out that my app_info for Asteroids doesn't work, so I trashed a bunch of work over the last couple of days until I managed to fix it. When I checked, none of the GPU machines had done any GPU work for Asteroids; they had only been doing CPU work.

The only reason I use an app_info is because their server insists on sending an app that uses the SSE3 CPU instructions instead of the AVX instructions. The AVX app is quite a bit faster. The apps are the same ones supplied by the project, but this way I can specify which one it uses.
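Before forcing the AVX app through an app_info it is worth confirming the CPU actually advertises AVX. The following is only a minimal sketch, assuming an x86 Linux machine with /proc/cpuinfo. The has_cpu_flag helper is a name made up for illustration, and note that Linux lists SSE3 as "pni" rather than "sse3".

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line of FILE.
# FILE defaults to /proc/cpuinfo. The helper name is hypothetical.
has_cpu_flag() {
    flag="$1"
    file="${2:-/proc/cpuinfo}"
    grep -m1 '^flags' "$file" 2>/dev/null | grep -qw -- "$flag"
}

# pni is how the kernel reports SSE3; avx is reported under its own name.
for f in pni avx; do
    if has_cpu_flag "$f"; then
        echo "$f: supported"
    else
        echo "$f: not supported"
    fi
done
```

The whole-word match matters here, otherwise a CPU that only listed avx2 or avx512f would be counted as having plain avx.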

10 June 2019

10th of June

Farm status
Intel GPUs
Running Asteroids overnight

Nvidia GPUs
Two running Seti

Raspberry Pis
All running Einstein BRP4 work


Other news
The two GTX 1660 Ti equipped Ryzens are running Seti 24/7. I had one of them lock up and had to power it off. When it came back up it decided to trash a bunch of CPU work units that had been in progress. I have seen this behaviour from them before, where they just lock up for no apparent reason. Given all four Ryzen machines have done this it seems to be something specific to them. They've had BIOS upgrades and two of them have also had GPU upgrades. I've even seen it when they are idle. I haven't been able to pinpoint the cause. Given these four machines are slated for replacement soon I am not going to waste any more time trying to debug the issue.


AMD Linux driver experiment
I have an old HD 7770 graphics card that has been sitting in its box for a few years now. They were released in February 2012 so they are ancient in GPU terms. I decided to fire up one of the i3s as an AMD cruncher in order to see just how bad it is getting AMD's drivers to work under Linux. AMD GPUs are good at running OpenCL apps, much better than Nvidia.

I'll point out I am running Debian and AMD only release their Linux drivers for Red Hat or Ubuntu. Ubuntu is based on Debian so how hard could it be? The hardware part is simple: just fit the card into the PCIe slot and plug a 6 pin power cable in. The machine is happy to display through its DVI port without any issue. I install a clean copy of Debian and it complains about missing AMD firmware. Debian have a package called firmware-amd-graphics which fixes that. I install it and reboot and I now have a high-res desktop working.

The next part is to get OpenCL going which is when it all falls apart. First you need to install a few packages from the Debian repo:

sudo apt install build-essential dkms

Now we need to download the latest amdgpu-pro drivers, which I did on a Windows machine and then stuck them on a USB thumb drive to copy them across. At the time I write this they are called amdgpu-pro-17.40-492261.tar.xz so they need to be unpacked using the command:

tar -xJpf amdgpu-pro-*.tar.xz

At this point you'll have a bunch of .deb files and an install script. You'll notice they have their driver version number (17.40-492261) in all the file names. When they bring out a new version expect these numbers to change. After this we then need to install them one by one, but we don't need all of them just to get OpenCL. We would do the following:

sudo dpkg -i amdgpu-pro-core_17.40-492261_all.deb
sudo dpkg -i libopencl1-amdgpu-pro_17.40-492261_amd64.deb
sudo dpkg -i clinfo-amdgpu-pro_17.40-492261_amd64.deb
sudo dpkg -i opencl-amdgpu-pro-icd_17.40-492261_amd64.deb
sudo dpkg -i amdgpu-pro-dkms_17.40-492261_all.deb
sudo dpkg -i libdrm2-amdgpu-pro_2.4.82-492261_amd64.deb
sudo dpkg -i ids-amdgpu-pro_1.0.0-492261_all.deb
sudo dpkg -i libdrm-amdgpu-pro-amdgpu1_2.4.82-492261_amd64.deb
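
Since the version numbers change with every driver release, the list can be generated rather than typed out. This is only a sketch: the DRV and BUILD variables and the print_install_cmds helper are names made up for illustration, and the package list is simply the one above. It prints the commands rather than running them, so nothing is installed until you copy them or pipe the output to sh. dpkg -i takes a file name, so the .deb extension is included.

```shell
#!/bin/sh
# Driver release and build number, taken from the tarball name.
DRV="17.40"
BUILD="492261"

# Packages in the same order as the manual steps. The libdrm and ids
# packages carry their own versions but share the build number.
PKGS="
amdgpu-pro-core_${DRV}-${BUILD}_all
libopencl1-amdgpu-pro_${DRV}-${BUILD}_amd64
clinfo-amdgpu-pro_${DRV}-${BUILD}_amd64
opencl-amdgpu-pro-icd_${DRV}-${BUILD}_amd64
amdgpu-pro-dkms_${DRV}-${BUILD}_all
libdrm2-amdgpu-pro_2.4.82-${BUILD}_amd64
ids-amdgpu-pro_1.0.0-${BUILD}_all
libdrm-amdgpu-pro-amdgpu1_2.4.82-${BUILD}_amd64
"

# print_install_cmds: hypothetical helper, dry run only.
# Prints one dpkg command per package and installs nothing.
print_install_cmds() {
    for p in $PKGS; do
        echo "sudo dpkg -i ${p}.deb"
    done
}

print_install_cmds
```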

I got as far as the dkms when it failed to build. AMD only support the current long term release kernel and so it fails under the 4.19 kernel. I think Ubuntu are on the 4.18 kernel at the moment so there isn't much I can do about this.
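
A quick way to see this coming before running the dkms step is to compare the running kernel against the newest one the driver is expected to build on. This is a sketch only: MAX_KERNEL is a placeholder, and 4.18 is just my guess based on the Ubuntu kernel mentioned above, not a figure from AMD. The kernel_ok helper is a made-up name and it relies on the -V (version sort) option of GNU sort.

```shell
#!/bin/sh
# Placeholder for the newest kernel (major.minor) the driver builds against.
# 4.18 is an assumption, not a documented limit.
MAX_KERNEL="${MAX_KERNEL:-4.18}"

# kernel_ok RUNNING MAX
# Succeeds if RUNNING is less than or equal to MAX, using version sort so
# that 4.9 sorts before 4.18. Both arguments should be major.minor.
kernel_ok() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$1" ]
}

# Keep only major.minor, e.g. 4.19.0-5-amd64 becomes 4.19
running=$(uname -r | cut -d. -f1,2)

if kernel_ok "$running" "$MAX_KERNEL"; then
    echo "kernel $running: dkms build has a chance"
else
    echo "kernel $running is newer than $MAX_KERNEL: expect the dkms build to fail"
fi
```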

Looking at the install script the ids-amdgpu-pro* package isn't referenced so I suspect it's not needed, but seeing as it failed before that point I can't tell.

I will be sticking with Nvidia because at least their drivers are simple enough to install (yes they have a dkms component as well) and work on current release kernels. AMD really need to get their act together with their drivers. They could be moving so much more hardware if they fixed their software.

01 June 2019

1st of June

Farm status
Intel GPUs
Running Einstein O2AS20 work overnight

Nvidia GPUs
Two running Seti work

Raspberry Pis
Running Einstein BRP4 work


BIOS updates
The motherboard manufacturers will update the CPU firmware via a BIOS update. There are other ways of patching them as well such as using the intel-microcode package under Debian.

The Intel CPUs have another security issue referred to as MDS, more commonly known as Zombieload. The AMD machines don't have this particular issue, but their motherboard manufacturers are releasing updates to support the 3rd generation Ryzen CPUs even on older motherboards. I took the opportunity to update all the machines.
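
After updating, the kernel can be asked what it thinks of the MDS situation and which microcode revision is loaded. A minimal sketch, assuming a Linux kernel recent enough to expose the mds file under /sys (older kernels don't have it, in which case this prints "unknown"). The mds_status helper is a made-up name.

```shell
#!/bin/sh
# mds_status [FILE]
# Prints the kernel's MDS report, or "unknown" if the file cannot be read.
# FILE defaults to the sysfs entry. The helper name is hypothetical.
mds_status() {
    file="${1:-/sys/devices/system/cpu/vulnerabilities/mds}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

echo "MDS: $(mds_status)"

# Microcode revision currently loaded, as reported by the kernel.
grep -m1 microcode /proc/cpuinfo 2>/dev/null || echo "microcode: not reported"
```

On the AMD machines the MDS line should simply read "Not affected".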


Other news
We got a bit of a cold snap in the weather so the two machines with GTX 1660 Ti cards have been running 24/7. This has greatly improved their output.

I resurrected my Milkyway@home account, which I haven't used since 2012 and did a burst of GPU work for them. Their GPU app is written in OpenCL and so is slower on Nvidia cards. The Milkyway simulations took slightly under 4 minutes to complete on the GTX 1660 Ti.


Ryzen upgrades
At Computex 2019 (last week) AMD announced 5 of their 3rd generation Ryzen CPUs. The official specs of these CPUs were somewhat different to the leaks on the internet. We're expecting more official announcements on the 7th of July as that is the release date. Only another 5 weeks to go...

At the moment I am looking at ASUS X570-Pro motherboards with DDR4 3200MHz memory, but I am not sure how much memory because that depends on how many cores they have. Which CPU they'll get is undecided until the rest of the Ryzen line-up is officially announced. I will swap all 4 machines out with the exception of reusing the GPUs, so that will mean new cases, power supplies, CPU coolers and NVMe SSDs to replace the hard disks.

The Ryzen 1700s that I currently use are 65 watts and have 8 cores/16 threads. For a GPU cruncher I was hoping for a lower wattage CPU, probably with a lower core count. The X570 chipset uses 15 watts so that will eat any power saved.