29 March 2020

29th of March

Farm status
Intel GPUs
Four running Rosetta work

Nvidia GPUs
Three running Einstein gravity wave work

Raspberry Pis
Eight running Einstein BRP4 work

For news on the Raspberry Pis see Marks Rpi Cluster

It's all about the coronavirus these days. Some tech commentators have suggested running Folding@home; however, it has so many new users that the project can't generate enough work to keep up with demand.

I gave their Linux app a try. It's a stand-alone app. It gave errors on install because it uses Python 2, which is deprecated, but CPU crunching did work. It's a multi-threaded app and used 11 of the 12 threads on the machine I tried it on. It didn't find the OpenCL run-time library so it wouldn't use the GPU. It's probably looking in the wrong place for it. I removed it. They need to update it for current Linux distributions, which use Python 3 and put the OpenCL libraries in different locations.
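For anyone wanting to check where their own distribution keeps the OpenCL run-time library, here is a minimal Python 3 sketch. The directory list and both function names are my own assumptions for illustration; I don't know which paths the Folding@home client actually searches.

```python
import ctypes.util
import os

# Directories where Linux distributions commonly install libOpenCL.so.*.
# This list is an assumption for illustration, not what any client checks.
CANDIDATE_DIRS = [
    "/usr/lib/x86_64-linux-gnu",
    "/usr/lib64",
    "/usr/lib",
    "/usr/local/lib",
]


def find_opencl_candidates(dirs=CANDIDATE_DIRS):
    """Return full paths of libOpenCL shared objects found in the given directories."""
    found = []
    for d in dirs:
        if not os.path.isdir(d):
            continue  # skip directories that don't exist on this distribution
        for name in sorted(os.listdir(d)):
            if name.startswith("libOpenCL.so"):
                found.append(os.path.join(d, name))
    return found


def loader_sees_opencl():
    """Ask the standard library lookup for OpenCL; returns a library name or None."""
    return ctypes.util.find_library("OpenCL")


if __name__ == "__main__":
    print("Library lookup:", loader_sees_opencl())
    for path in find_opencl_candidates():
        print("Found:", path)
```

If the lookup finds the library but an app still can't, the app is most likely searching a hard-coded path rather than asking the system.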

Rosetta@home is another project studying the coronavirus proteins, but this one is BOINC-based. They only have a CPU app and it uses a single thread. You can run multiple instances. They have two apps: Rosetta Mini, which uses around 400-500MB of memory, and the standard Rosetta app, which can use up to 2GB of memory per thread. I've been running it on the Intel GPU machines as they have 32GB. Tasks run for a set amount of time, which defaults to 8 hours, but you can adjust it up or down via the project preferences on the web site.

One of the Intel GPU machines failed half of its work units, complaining about memory errors, so once it finishes its current work I will have to take a look at it. I suspect one or two of the memory sticks aren't seated properly, because the other half of the work units seem to run fine.

They've said they will also be doing some coronavirus protein studies, but don't currently have any. They have an Nvidia GPU app. I have been running some of their current work recently.

28 March 2020

X10SRi Storage Server rebuild

I felt inspired by the Linus Tech Tips storage-server-on-the-cheap build; see my previous post for a link. So, in a true send-up of Linus, here is my version. Unlike Linus, I don't have any sponsors, so I have to pay for my own hardware and reuse parts where possible.

The case is a 2015 vintage Fractal Design Define R2, which has excellent build quality and lots of room to stuff things inside. We're going to need it. It also weighs a hefty 12 kilograms.

You can see the sound-dampening foam on the front panel. It also has thicker padding on the side panels (not shown) as well as the top of the case.

It has louvered doors. There's nylon mesh in front of the fans to keep out the dust. As Linus pointed out with his build, this arrangement usually means the airflow isn't the best. To counter that I have replaced the original fans with Noctua ones all round. They give better airflow and reliability, and they are fairly quiet.

This is what we're starting with. It had 4 x 4TB drives. It started off life running Windows Server 2008 R2 with the RAID controller in hardware mode. Most recently it's been switched to Linux, with the controller in JBOD mode and ZFS on Linux providing the RAID functionality.

So it's out with the old. The bits on the side are the drive caddies.

And in with the new.

But wait, there's more. We're doubling the number of drives in order to fill all 8 drive bays. I'm not counting the 5.25" drive bays at the top of the case. Besides, these cost me a small fortune.

Here is the finished product. Yes, I know it needs cable management, Linus.

And what does all this look like in Linux, I hear you ask? Like this:

# zpool status
  pool: pool1
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:02:09 with 0 errors on Sat Mar 28 03:59:10 2020

        NAME                        STATE     READ WRITE CKSUM
        pool1                       ONLINE       0     0     0
          raidz2-0                  ONLINE       0     0     0
            wwn-0x5000cca298c13b97  ONLINE       0     0     0
            wwn-0x5000cca298c143ed  ONLINE       0     0     0
            wwn-0x5000cca298c13f04  ONLINE       0     0     0
            wwn-0x5000cca298c16623  ONLINE       0     0     0
            wwn-0x5000cca298c14c7f  ONLINE       0     0     0
            wwn-0x5000cca298c16afa  ONLINE       0     0     0
            wwn-0x5000cca298c152cd  ONLINE       0     0     0
            wwn-0x5000cca298c1431f  ONLINE       0     0     0

errors: No known data errors

# df /pool1
Filesystem       1K-blocks      Used   Available Use% Mounted on
pool1          75325838848 162769536 75163069312   1% /pool1

That's 71TB of usable space. Probably more because I turned lz4 compression on for the pool. I went with two-drive redundancy so I can lose any two drives and still not lose my data.
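For the curious, here is a rough Python sketch of the arithmetic behind that figure. It assumes the 8 drives are the 14TB ones I ordered and that the pool is an 8-wide raidz2 as shown above; the function names are just for illustration.

```python
KIB = 1024        # df reports 1K-blocks, i.e. units of 1024 bytes
TIB = 1024 ** 4   # binary terabyte (tebibyte)
TB = 1000 ** 4    # decimal terabyte, as used on drive labels


def blocks_to_bytes(blocks_1k):
    """Convert the 1K-blocks column from df into bytes."""
    return blocks_1k * KIB


def raidz_data_bytes(n_drives, parity, drive_bytes):
    """Raw data capacity of a raidz vdev before any ZFS overhead."""
    return (n_drives - parity) * drive_bytes


if __name__ == "__main__":
    total = blocks_to_bytes(75325838848)          # size column from df /pool1
    print("df size in TiB:", round(total / TIB, 2))
    print("df size in TB: ", round(total / TB, 2))

    raw = raidz_data_bytes(8, 2, 14 * TB)         # assumed: 8 x 14TB, raidz2
    print("raw data capacity in TiB:", round(raw / TIB, 2))
```

The df figure works out to a little over 70 TiB (about 77 TB in drive-label units), so the 71TB above is really a rounded-up binary figure. The gap between that and the raw six-drives-worth of data capacity is presumably ZFS overhead such as padding and reserved space.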

Well, that's it for this post. If you like this post then give me the thumbs up, and if you don't, well, it doesn't matter because this isn't YouTube.

15 March 2020

15th of March

Farm status
Intel GPUs
One running Einstein gravity wave work

Nvidia GPUs
Two running GPUgrid work

Raspberry Pis
Eight running Einstein BRP4 work

For news on the Raspberry Pis see Marks Rpi Cluster

Project news - GPUgrid
They've got an experiment going with lots of work units. My GTX 1660 Tis are taking about an hour and a half each. I'm only running two machines so I don't trip the circuit breaker. There is no shortage of work at the moment.

Einstein gravity wave work
I have been running the gravity wave work on CPU for a while and some of the frequencies are now using quite a bit of memory. So much so that if I try to run 12 on the Nvidia GPU machines, only 7 get running and the others end up waiting for memory. The machines have 16GB of memory but some of the work units are using 2GB each.

I even got one of the Intel GPU machines running them because they have 32GB of memory, but they're a lot slower.
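The memory limit is easy to sketch in Python. This is only a back-of-the-envelope estimate: the 1GB held back for the operating system is my assumption, and the function name is made up for illustration.

```python
def max_concurrent_tasks(total_gb, per_task_gb, reserved_gb=1.0):
    """Estimate how many tasks fit in memory once some is held back for the OS.

    reserved_gb is an assumed allowance for the operating system and other
    processes; it is not a figure taken from BOINC.
    """
    usable = total_gb - reserved_gb
    if usable <= 0 or per_task_gb <= 0:
        return 0
    return int(usable // per_task_gb)


if __name__ == "__main__":
    # 16GB Nvidia GPU machines with 2GB work units
    print(max_concurrent_tasks(16, 2))
    # 32GB Intel GPU machines with the same work units
    print(max_concurrent_tasks(32, 2))
```

With those assumptions the 16GB machines come out at 7 tasks, which matches what I see, and the 32GB machines have room for all 12 threads.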

Storage server
I ordered some larger hard disks for one of the storage servers. It only has 32GB of memory so I swapped in the memory from the other one. I found that the X10SRi-F motherboard has a fault in DIMM socket C1 so I can't populate the memory as recommended. Supermicro stopped making the X10 motherboards so I might have to try getting a second-hand one to replace it. I will also see if it's possible to get the motherboard repaired. In the meantime I have put the 32GB back in the sockets to the left of the CPU, which work (the C1 socket is to the right of the CPU).

I was inspired by Linus (of Linus Tech Tips), who made a relatively cheap storage server using a Fractal Design "Define" case and managed to stuff 20 hard disks in it. See here: https://www.youtube.com/watch?v=FAy9N1vX76o

I wouldn't recommend going above 16 drives as the case only has 16 drive bays. He attached drives to the top and back of the case, which doesn't do much for reliability. My file server with 32GB of memory is in a Fractal Design Define R2 case, an older version, and it has 8 + 2 drive bays. I think 8 drives is enough for my purposes. It's currently got a SATA SSD as the boot drive and 4 x 4TB drives.

Update 17 Mar 2020
I need to make a couple of corrections to the Storage server details above.
1. The X10SRi-F motherboards are still available.
2. The case Linus used was a Fractal Design Define 7 XL, which is larger than the Define.

I still wouldn't recommend bolting drives onto the top or back of the case though, just use the drive bays that it comes with.

Meanwhile I've ordered 4 x 14TB HDDs. Today I ordered 3 more. The price went up $51 between my first and second orders (6 days apart). Oh, and they have been delayed.

07 March 2020

7th of March

Farm status
Intel GPUs
One running Einstein gravity wave work

Nvidia GPUs
All running Einstein gravity wave work

Raspberry Pis
Eight running Einstein BRP4 work

For news on the Raspberry Pis see Marks Rpi Cluster

Project news - Seti
Sad news this week. They announced they are going into "hibernation" from the 31st of March. That generally means the end of the project.

Other news
I've been doing some GPUgrid work on a couple of the Nvidia GPU machines after we got updated drivers via the Debian buster-backports repo. I also did some Milkyway work. Since then I have updated all of the Nvidia GPU machines to the 440.59 drivers.

The Intel "neo" drivers have finally made it into a Debian repo. I have installed them on one of the Intel GPU machines but haven't tried any GPU crunching on it yet. Generally GPU crunching slows the whole CPU down, so it's not a good idea to use both at the same time.

I was looking at building a Ryzen 3950x machine; however, with the news that Seti is closing down I have decided not to proceed with it. Einstein CPU work is very demanding on the memory system, so the machine wouldn't suit their app.

And in more news this week, there is yet another security bug in Intel's ME module and they don't think they can correct it. Supposedly the 10th generation CPUs aren't affected, but I can't see why anyone would buy them given all the security flaws.