28 July 2009
Since the nvidia 190.38 driver came out last week, all GPUgrid work units have been failing on the GTX260s in Maul. The GTS250s in 4 of the quaddies work fine, which points to an incompatibility with the GT200-based cards, the high-performance ones.
While it's possible to go back to an older driver, Seti positively screams along now, with GPU work units taking as little as 2 minutes when using the cuda 2.3 DLLs. If I downgrade I can't use cuda 2.3 and I lose all that speed. I am hoping that the GPUgrid cuda 2.2 app, due later this week, will work properly with the current drivers. In the meantime Maul is concentrating on Einstein and Seti work.
As mentioned in my last post, Yoda is up for sale. You can see it on eBay here: http://cgi.ebay.com.au/ws/eBayISAPI.dll?ViewItem&ssPageName=STRK:MESELX:IT&item=300333014651
It's my intention to decommission all the other quaddies and sell them off too. This is mainly because I need the space for the new i7s, and because of the amount of power needed to drive them. My energy bills have gone up 20% since the 1st of July due to price increases.
25 July 2009
Quaddies for sale
Yoda is for sale on eBay. I need the room (and money).
I have ordered 4 x i7 machines to replace all the quaddies. They use ASUS P6T motherboards. In the meantime I have dropped off the 700w power supply and the GTX275 to be installed in the first one. If it's reasonably quick I'll go with the same graphics card in all of them.
Cuda 2.3 released
Nvidia released cuda 2.3, along with an updated driver, during the week. Initial indications are that it's about 30% faster than cuda 2.2.
I have upgraded 3 of the machines so far. The other two have to finish off their cuda work before I can upgrade them. Meanwhile GPUgrid are updating their app to use cuda 2.2 next week. Hopefully it won't take them too long to get to 2.3.
20 July 2009
Trashing cuda work
I upgraded the "stock" Seti cuda app to the optimised one and stuffed up the app_info. That resulted in all my Seti cuda work being deleted. To top things off I couldn't get any new cuda work units until much later, due to the Seti servers being overloaded. It took me a short while to fix the app_info problem, but much longer to get a work unit to prove it worked. Once I had it working I upgraded the remaining machines with the new app and app_info files.
After the cuda app upgrade I decided to relocate the router and the 8 port switch. So I unplugged the switch, moved it to its new location and plugged it all in. No problem, I thought: the machines only use the network during my ISP's off-peak times (1am to 7am).
When I checked the following day, all the machines were complaining about having no network connection. After checking a few things it turned out the network cable had come out of the back of the file server. I swapped the cable for another one and the network came back to life. After that I had to manually allow the machines to get work, as they had all run out.
18 July 2009
Updated for Astropulse 5.05 and cuda 2.3
If you are not comfortable with editing an app_info then this is not for you. You would be better off using the Unified Installer for Windows, available from the Lunatics web site: http://lunatics.kwsn.net/optimized-applications-release-news/lunatics-unified-installer-for-windows-v0-2.msg19610.html;topicseen#msg19610
BOINC is very unforgiving of an incorrect app_info and usually will delete all tasks if you get it wrong.
Do NOT use Internet Explorer to edit the xml files; it will stuff up your app_info. Use Notepad or another plain text editor.
Upgrade your BOINC client first and get it working before changing anything else. At the time of writing I am running the 6.6.37 client, although any client from 6.6.15 should be sufficient.
The app_info.xml below is based on a Windows XP platform (32 bit) and the cuda-capable card is a GTS250. If you are running on another platform you may need to add/amend the <platform_name> tags.
My computers support the SSSE3 instruction set. SSE2, SSE3 or SSE4.1 may be more appropriate for you. You will need to amend the program names in the app_info as appropriate.
I've assumed that you have your cuda-capable card up and running and have the necessary nvidia drivers (minimum version is 180.48). Use the 190.38 drivers (or later) if you want to use cuda 2.3.
You will need:
a) The optimised multibeam and optimised Astropulse apps, available from the Lunatics web site
b) The cuda multibeam app and support libraries
1. Download and install BOINC. Get this working before changing anything else.
2. Empty your cache of Seti@home work. This is best achieved by setting the project to No new work and letting it finish off its tasks. Make sure they are all uploaded and reported, there should be none on your tasks list. If you are feeling brave/confident then skip this step, but if all your tasks get deleted don’t say you weren’t warned.
3. Download the optimised multibeam and astropulse apps if you don't already have them.
4. Download the cuda multibeam app (from the Seti web site) if you don't already have it. If you run the stock cuda multibeam app then you should already have the files in your projects\Setiathome.berkeley.edu folder.
You can also use the optimised cuda app available instead of the “stock” app. You will need to change the name in the app_info.
5. Disable network communications in BOINC.
6. Shutdown BOINC. Make sure it and the science apps are shutdown.
7. Browse your client_state.xml file (it's in the BOINC data directory) and look for the entry <p_fpops>. We need to use this number. Do NOT change this file.
8. Get the estimated speed of your GPU from the BOINC log file (or, before you shut BOINC down, from the messages tab). It is usually given at the top of the log in Gflops. 9800GTs are estimated at 60 Gflops, GTS250s at 84 Gflops and the GTX260 (216 shaders) at 96 Gflops.
9. For each of the apps, multiply the p_fpops value by the factor below and put the result into the appropriate flops entry in the app_info given below. For multibeam 608 you use the estimated Gflops instead. The app_info given below has the values for a GTS250.
Astropulse 503: p_fpops x 2.6
Astropulse 505: p_fpops x 2.6
Multibeam 603: p_fpops x 1.75
Multibeam 608: Est. Gflops x 0.2
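The arithmetic above can be sketched as follows. The p_fpops and Gflops figures here are examples only (the GTS250 estimate from the log); substitute your own values from client_state.xml and the BOINC log:

```python
# Sketch of the flops calculation for the app_info entries.
# Example values only - read your own from client_state.xml / the BOINC log.
p_fpops = 2.8e9      # example CPU benchmark from <p_fpops> in client_state.xml
gpu_gflops = 84      # GTS250 estimate from the BOINC log, in Gflops

flops = {
    "astropulse_503": p_fpops * 2.6,
    "astropulse_505": p_fpops * 2.6,
    "multibeam_603":  p_fpops * 1.75,
    "multibeam_608":  gpu_gflops * 1e9 * 0.2,  # 608 uses the GPU estimate
}

for app, value in flops.items():
    print(f"{app}: {value:.4g}")
```

Each resulting number goes into the flops tag of the matching app_version in the app_info.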
10. Make sure you have all the programs listed above in the projects\Setiathome.berkeley.edu folder. If not, copy them there.
11. Save your app_info in the projects\Setiathome.berkeley.edu folder.
12. Start up BOINC. Check the messages tab to see if it lists any [file error] messages. If there are, shut BOINC down, check that you have referenced the correct program names, and go back to step 10.
13. If okay then enable new work for the Seti@home project.
14. Enable network communications again.
15. BOINC should now download work of all types. If not, check in your Seti@home preferences on the Seti web site that Astropulse_v5, Astropulse_505 and Allow graphics processor are all ticked. If you have a slower computer you may not get Astropulse work units anyway.
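As a rough guide to the shape of an app_info.xml, here is a minimal sketch of a single entry for the cuda multibeam 608 app. The executable file name is a placeholder (use the name of the app you actually downloaded), and the flops value is the GTS250 figure from step 9 (84 Gflops x 0.2):

```xml
<app_info>
    <app>
        <name>setiathome_enhanced</name>
    </app>
    <file_info>
        <name>MB_6.08_CUDA.exe</name>  <!-- placeholder: your cuda app's file name -->
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_enhanced</app_name>
        <version_num>608</version_num>
        <plan_class>cuda</plan_class>
        <flops>16800000000</flops>  <!-- 84 Gflops x 0.2 for a GTS250 -->
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>MB_6.08_CUDA.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>
```

A full app_info repeats the file_info/app_version pattern for each of the apps (multibeam 603, Astropulse 503 and 505), with the cuda-specific plan_class and coproc tags only on the 608 entry.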
Seti upload woes
Seti continues to have upload problems, but a couple of suggestions have been made about resolving it without them having to run a 1Gbit fibre optic cable up the hill. They have a 1Gbit link, but it stops at the bottom of the hill; the Space Sciences Lab is at the top, served only by a 100Mbit link from the bottom. There are also some software tweaks, to do with retries, being applied on the BOINC client end.
Einstein download throttling?
On the subject of file transfers, I have noticed that Einstein@home seems to be downloading at 10Kbps (on average), when it used to be in the order of 50-100Kbps. Uploads, however, are quite quick, at around 30Kbps. I did post a message in one of their forums but have received no answer. I suspect they are throttling connection speed so their server doesn't fail again.
Astropulse 5.05 released
The optimised AP 5.05 app has been released for Seti@home. I installed that during the week on all the machines and updated my app_info files. Since then I have only seen a couple of Astropulse work units. Seti is mainly producing Multi-Beam work units at the moment, which is one of the reasons why they are having upload issues. Multi-beam work units don't take as long to process as Astropulse and are smaller. I'll post my app_info and instructions soon.
ReSchedule 1.9 released
The Reschedule program was updated to 1.9 during the week. It now leaves the VHAR (very high angle range) work units on the gpu instead of shifting them. The VLAR (very low angle range) work units are the ones that cause the Seti cuda app to take much longer to process. I ran around and installed it on all the machines with gpus.
11 July 2009
Seti bandwidth issues
Seti has been having continual bandwidth issues for the last 2 weeks. It's made worse by the fact that they have a scheduled outage every Tuesday to perform database maintenance. After the scheduled outage the whole world tries to connect at once, so traffic hits the proverbial brick wall of the 100Mbit link. It has been taking most of the week to clear the backlog and get back to normal, just in time for the next outage.
I did try and install the GTX275 into Qui-Gon during the week but had a problem. The card just fits (there is about 1cm from the end of the card to the drive bays at the front of the computer). Unfortunately there are a whole bunch of connectors on the edge of the motherboard that it seems to sit on top of. It looks like it might have to wait until another machine comes along before I can use it.
KVM and case fans
The 25 foot KVM cable arrived during the week. I have now plugged Maul into the new KVM, even though it's around the other side of the room. Surprisingly, I can now use a PS/2 mouse, which the old KVM seemed unable to handle.
I also picked up a bunch of Noctua 92mm case fans. I have 5 quaddies at the moment and only 3 of them have case fans. I'll be installing one this weekend for the machine that has a GTS250.
While I was in the computer shop I was told that there are some price cuts coming from Intel, so they will try and reprice my quote for 4 x i7 machines.
04 July 2009
ReSchedule is really good because the VLAR and VHAR work units take forever to run on cuda, but run as normal on the CPU. I've installed it on all my cuda machines, after first testing it on one machine.
I have Qui-Gon running down its cache of cuda work. I will put the 700w power supply and the GTX275 in there, which will free up a 500w power supply and a GTS250 for Yoda. I haven't upgraded Yoda as it's still having issues with cpu cooling, and I need to get Acer out to look at it. By not upgrading Yoda I avoid voiding its 3 year warranty.
I am still waiting on the 25 foot KVM cable for the Belkin to arrive, but it's not urgent as I have an old screen and keyboard/mouse on Maul at the moment.