Message boards : Number crunching : Task ran fine, but timed out on Threadripper 9000 after approx. 2,5 days: rb_04_04_702801_692873_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_09_10_3023291_43_0
| Author | Message |
|---|---|
Michael H.W. Weber Joined: 18 Sep 05 Posts: 18 Credit: 6,850,436 RAC: 5,685 |
So, I got a task which ran unusually long (I did not set ANY preference regarding desired run time) and was timed out after the return deadline was reached. Note that this machine does not bunker tasks, so it started right after download and did not finish within time. This is the work packet: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1447140176 This is the task (no log): https://boinc.bakerlab.org/rosetta/result.php?resultid=1626758790 WU name is: rb_04_04_702801_692873_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_09_10_3023291_43_0 I checked the screensaver after around 2 days, 14 hrs and it looked just fine. Apparently, this WU type does not "converge" (if this applies here) and some scientist might want to take a closer look at this packet. Michael. President of Rechenkraft.net e.V. http://www.rechenkraft.net - The world's first and largest distributed computing association. We make those things possible that supercomputers don't. |
|
Sid Celery Joined: 11 Feb 08 Posts: 2574 Credit: 47,207,888 RAC: 2,523 |
"So, I got a task which ran unusually long (I did not set ANY preference regarding desired run time) and was timed out after the return deadline was reached. Note that this machine does not bunker tasks, so it started right after download and did not finish within time." I have no idea, but checking around, it does seem like this task was a one-off among a lot of others run by that User that ran just fine. The other thing I notice is that User is using Boinc Manager 7.24.1 when the current version is 8.2.9. It also seems strange that other tasks returned by that user ran for 8 hrs with very little difference between CPU time and wall-clock time, and the user doesn't appear to run other projects, yet this one task (and no others) missed the deadline. As a one-off I wouldn't worry about it, although I am seeing some of my own tasks crashing out before completion too. It seems we're running a slightly faulty batch at the moment, with errors not uncommon.
|
Michael H.W. Weber Joined: 18 Sep 05 Posts: 18 Credit: 6,850,436 RAC: 5,685 |
I run that older BOINC client (the latest v7 version) on purpose because I figured that the entire v8 BOINC client series has severe scheduling issues, especially in conjunction with using BAM! and communicating with PrimeGrid (over BAM!). Just check out other BOINC forums where people complain about not receiving tasks: it always seems connected to BOINC v8 clients. Michael. President of Rechenkraft.net e.V. http://www.rechenkraft.net - The world's first and largest distributed computing association. We make those things possible that supercomputers don't. |
|
Sid Celery Joined: 11 Feb 08 Posts: 2574 Credit: 47,207,888 RAC: 2,523 |
"The other thing I notice is that User is using Boinc Manager 7.24.1 when the current version is 8.2.9" I haven't heard or seen that with any version of v8. Scheduling has certainly changed, but I personally find it better, if I notice any difference at all. That said, I signed up to BAM a very long time ago and never found out how to make it work, so I abandoned it many years ago. I'm much more inclined to blame BAM than Boinc, but my requirements are very limited, so your needs may differ from mine.
|
Grant (SSSF) Joined: 28 Mar 20 Posts: 1931 Credit: 18,534,891 RAC: 0 |
"I'm much more inclined to blame BAM than Boinc," Yep, there are issues with BAM that have never been addressed. Grant Darwin NT |
©2026 University of Washington
https://www.bakerlab.org