Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Sid Celery Joined: 11 Feb 08 Posts: 2380 Credit: 45,200,983 RAC: 24,025 |
First tasks came down and everything crashed out within seconds - WCG tasks running fine though. Yup, that made no difference whatsoever... <sigh> |
Tom M Joined: 20 Jun 17 Posts: 158 Credit: 31,910,146 RAC: 104,988 |
First tasks came down and everything crashed out within seconds - WCG tasks running fine though. If I understand correctly, you have Windows on a separate HDD, and you just replaced the data HDD where BOINC lives? Have you checked the permissions on your Rosetta exe? Or better yet just "reset" the project and it should download everything clean and start running again. I hope. Proud member of the O.F.A. (Old Farts Association) |
Tom M Joined: 20 Jun 17 Posts: 158 Credit: 31,910,146 RAC: 104,988 |
There are several systems that normally have higher RACs than I do, yet many of them don't have a full enough cache to run all of the available threads. Since we have published both Linux and Windows polling scripts, they should be able to pull down enough work to keep up? Does anyone reading here run those systems? Tell me/us what is going on? Respectfully, Proud member of the O.F.A. (Old Farts Association) |
Joined: 28 Mar 20 Posts: 1854 Credit: 18,534,891 RAC: 0 |
Or better yet just "reset" the project and it should download everything clean and start running again. Yep. Reset the project, and let it re-download all new files (given your disk issues, the existing files could very well be corrupted). Grant Darwin NT |
Stevie G Joined: 15 Dec 18 Posts: 129 Credit: 1,028,210 RAC: 477 |
But there are now six Rosetta tasks stuck in Ready to Report. Two of them have been there since Sunday and four all day Monday. I reset the project, rebooted, hit update several times per day and still get no tasks. You remember the six completed tasks I had that were ready to send for several days? My account now reports them all as "Timed out-- no response." So I don't understand what is wrong. S. Gaber |
Sid Celery Joined: 11 Feb 08 Posts: 2380 Credit: 45,200,983 RAC: 24,025 |
First tasks came down and everything crashed out within seconds - WCG tasks running fine though. With Grant confirming your suggestion I've done this straight away. Before sending the PC away I'd run all tasks down to nothing and set No New Tasks on all projects. I didn't want anything to start working again before I was ready. The old data HDD was drive E and came back as drive D, so I had to change that back to E in Disk Management. It also came back with User directories set to C, so I re-set the symbolic links for docs, downloads, music, pictures, videos etc. back to E as well. I was very lucky my repairer sent me pictures of my directories E:/ E:/Users & E:/Users/<Name> to help ensure I wasn't forgetting anything. I didn't ask for those, but he was double-checking what to file-copy after he discovered cloning was failing due to repeated crashes (the reason I needed the new drive), and it turned out to be a Godsend - very fortunate. Yes, I have had further problems with permissions, which I <think> I've now resolved, but I may've just resolved the most obvious ones and there are others still lurking in the background. I've no obvious way of knowing without a dialog box popping up - I'm not technical enough to know how to find out. Moving on, resetting Rosetta was producing no reaction in my Event log for several minutes, so I took the opportunity to review the detailed Security history in Norton. There are lots of blocked transactions, the source of which is almost completely opaque. Specifically: "Rule IGMP Public Blocked IGMP(2) traffic with (192.168.0.1)". Over the years, Norton has hidden more and more under the bonnet, to the point where finding out what it's doing and why is increasingly difficult. I discovered this Rule IGMP is one of its default Traffic rules (and isn't 192.168.0.1 my own router?). I took the view it wasn't wise to change the rule in any way. Trawling through other detailed settings I discovered Boinc blocked in its Sandbox section. Why or how, I don't know. I changed that to allow it. I also ensured I'd properly whitelisted the whole C Boinc directory and E Boinc data directory. Going back to the Event log, after 30 mins of nothing, a new Master file download succeeded, I suspect following my removal of Boinc from Norton's Sandbox block list. After 90 more minutes of attempts to download Rosetta tasks, I finally got some. And they're running without crashing out. Success! I should point out that throughout this period I've been receiving and successfully running WCG tasks to completion, so my PC hasn't been idle. Why, I don't know. Whatever problems I've since discovered and resolved in Norton should've affected WCG tasks just as much as Rosetta. But clearly they didn't. It's a mystery I'm not going to get bogged down in. Rosetta and WCG are both running successfully and I can depart for work for 3 days without having to worry about it. I've now set WCG back to NNT to get Rosetta back in full flow. Thanks for letting me bounce ideas off you guys. It genuinely did help. I'd got myself bogged down without the suggestion of a new route around the problem. |
Tom M Joined: 20 Jun 17 Posts: 158 Credit: 31,910,146 RAC: 104,988 |
Me neither. Does your hosts file look something like this? `127.0.0.1 localhost` `127.0.1.1 Lynnes-Monolith` `128.95.160.156 boinc-files.bakerlab.org` `128.95.160.156 bwsrv1.bakerlab.org` `# The following lines are desirable for IPv6 capable hosts` `::1 ip6-localhost ip6-loopback` `fe00::0 ip6-localnet` `ff00::0 ip6-mcastprefix` `ff02::1 ip6-allnodes` `ff02::2 ip6-allrouters` Proud member of the O.F.A. (Old Farts Association) |
Stevie G Joined: 15 Dec 18 Posts: 129 Credit: 1,028,210 RAC: 477 |
I copied the hosts file you suggested and reset the project again. Here's my event log: 6/12/2025 12:55:10 PM | Rosetta@home | Resetting project 6/12/2025 12:55:15 PM | Rosetta@home | Master file download succeeded 6/12/2025 12:55:20 PM | Rosetta@home | Sending scheduler request: To fetch work. 6/12/2025 12:55:20 PM | Rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU 6/12/2025 12:55:21 PM | Rosetta@home | Scheduler request completed: got 0 new tasks 6/12/2025 12:55:21 PM | Rosetta@home | Server error: feeder not running 6/12/2025 12:55:21 PM | Rosetta@home | Project requested delay of 3600 seconds |
Joined: 16 Jun 08 Posts: 1242 Credit: 14,421,737 RAC: 1 |
This line usually indicates that the server you are trying to download work from is not running, so all you can do is wait for it to start running again: 6/12/2025 12:55:21 PM | Rosetta@home | Server error: feeder not running |
Tom M Joined: 20 Jun 17 Posts: 158 Credit: 31,910,146 RAC: 104,988 |
Now start this script from a command line window. It is a Windows script that keeps running updates on Rosetta@home, from https://boinc.bakerlab.org/rosetta/show_user.php?userid=412375 (aka kotenok2000): `cd /d c:\Program Files\BOINC` `:loop` `boinccmd.exe --project https://boinc.bakerlab.org/rosetta/ update` `TIMEOUT /T 600` `goto loop` I have had trouble with missing backslashes when trying to post this. There is a backslash between the c: and the "Program Files", and another between "Program Files" and BOINC. If your BOINC lives someplace else, you need to change the drive letter and path to suit. The reason you run this script is to get downloads from Rosetta more reliably. Proud member of the O.F.A. (Old Farts Association) |
Sid Celery Joined: 11 Feb 08 Posts: 2380 Credit: 45,200,983 RAC: 24,025 |
But there are now six Rosetta tasks stuck in Ready to Report. Two of them have been there since Sunday and four all day Monday. I hear everything you're saying, but C:/Windows/System32/drivers/etc/hosts is not being read. That's 'hosts' with no extension - not .bak .old .txt .doc or anything else, just hosts. And not in any other directory - specifically the folder written above. For whatever reason that none of us seems to understand, Rosetta won't magically come back just by waiting. Something changed somewhere. You <must> have had it right to get tasks to come down, then it <must> have changed to stop connecting. And whatever file you're editing now simply cannot be the one in that very specific folder. I know I'm writing this from a distance like I know better than you, but if you'd repeatedly put the lines you've been given in the right file in the right place, you simply wouldn't keep on receiving the message "Server error: feeder not running". You might get other messages saying all sorts of things, but not that. That's a line that says "I'm not looking at the hosts file you keep editing". |
Adam Gajdacs (Mr. Fusion) Joined: 26 Nov 05 Posts: 14 Credit: 3,090,462 RAC: 2 |
May I ask why adding these IP/host assignments to the hosts file is necessary? Has it been communicated somewhere as a workaround for some issue with the project? Was there a change/issue in the BOINC client itself that requires this in some cases? (I have to admit, mine isn't up to date; I update it very infrequently, and I don't follow the news related to it in terms of major changes or critical fixes.) Or where is this information coming from? Seeing as I have not received Rosetta tasks for at least half a year, I finally remembered to come to their site to take a look but have not seen anything obvious. Servers appeared to be up, with tasks available, so I checked the boards and found this one conversation about needing to edit the hosts file if the event log for Rosetta just keeps saying "feeder not running". Oddly enough, there appeared to be only a single person with this issue, and yet someone else was giving a solution as if it was public knowledge. Before I made the edit, I checked the two hosts involved. boinc-files.bakerlab.org resolved to 128.95.160.135 and bwsrlv.bakerlab.org resolved to 2607:4000:406::160:156. Both were up and replying. After making the edit, both resolved to the address specified in the hosts file and were also up and replying on that IP address. The log entries for Rosetta changed from "feeder not running" to 2025. 06. 13. 11:12:02 | Rosetta@home | Sending scheduler request: To fetch work. 2025. 06. 13. 11:12:02 | Rosetta@home | Requesting new tasks for CPU 2025. 06. 13. 11:12:03 | Rosetta@home | Scheduler request completed: got 0 new tasks 2025. 06. 13. 11:12:03 | Rosetta@home | No tasks sent 2025. 06. 13. 11:12:03 | Rosetta@home | Project requested delay of 31 seconds which appears to be what to expect when communication with the servers is OK but no work is being sent by them for whatever normal reason. The question remains: why is this address resolve override needed, and how is it that apparently it is only needed for an extremely low number of users? What am I missing here? |
Joined: 28 Mar 20 Posts: 1854 Credit: 18,534,891 RAC: 0 |
The question remains: why is this address resolve override needed, and how is it that apparently it is only needed for an extremely low number of users? What am I missing here? Very few people use IPv6, and that is what is causing the issues, as it's broken here at Rosetta. There are also several other issues, such as millions of Tasks queued up but frequently 0 Ready to send, issues with the download servers (requiring their own hosts file fix if you have that issue), and issues with the Assimilators. And the project is not taking the slightest bit of interest in resolving any of these problems. Grant Darwin NT |
Tom M Joined: 20 Jun 17 Posts: 158 Credit: 31,910,146 RAC: 104,988 |
Is it possible that he has been running Notepad without Administrator permissions, and so has not actually made those changes successfully? Or are there "invisible" characters in the current file name, so it is not being recognized? I have had to literally delete and recreate from scratch some Windows text files that had garbage I could not see, in order to get something to work again. Proud member of the O.F.A. (Old Farts Association) |
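
Several posts above turn on whether the hosts file being edited is really the one at C:/Windows/System32/drivers/etc/hosts, and whether it contains stray invisible characters. The following is only a rough sketch of how that could be checked, assuming Python 3 is installed. The function names (`parse_hosts`, `find_hidden_chars`, `check_hosts`) are made up for illustration, and the expected addresses are simply the ones from the hosts file posted earlier in the thread, which may not stay valid.

```python
from pathlib import Path

# Path and entries as quoted in the thread; both are assumptions to adjust.
HOSTS_PATH = Path(r"C:\Windows\System32\drivers\etc\hosts")
EXPECTED = {
    "boinc-files.bakerlab.org": "128.95.160.156",
    "bwsrv1.bakerlab.org": "128.95.160.156",
}


def parse_hosts(text):
    """Return {hostname: ip} for every active (non-comment) line."""
    mapping = {}
    for raw in text.split("\n"):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        parts = line.split()
        if len(parts) < 2:
            continue
        ip, names = parts[0], parts[1:]
        for name in names:
            # Keep the first entry seen for each name.
            mapping.setdefault(name.lower(), ip)
    return mapping


def find_hidden_chars(text):
    """Return (line_no, repr(char)) for anything that is not printable ASCII or a tab."""
    hits = []
    for no, raw in enumerate(text.split("\n"), start=1):
        for ch in raw.rstrip("\r"):
            if ch != "\t" and not (" " <= ch <= "~"):
                hits.append((no, repr(ch)))
    return hits


def check_hosts(text, expected=EXPECTED):
    """Return a list of readable problems; an empty list means nothing was found."""
    problems = []
    mapping = parse_hosts(text)
    for name, ip in expected.items():
        found = mapping.get(name)
        if found is None:
            problems.append(f"{name}: no entry")
        elif found != ip:
            problems.append(f"{name}: maps to {found}, expected {ip}")
    for no, ch in find_hidden_chars(text):
        problems.append(f"line {no}: unexpected character {ch}")
    return problems


if __name__ == "__main__":
    if not HOSTS_PATH.is_file():
        print(f"{HOSTS_PATH} not found (look for a hidden extension such as hosts.txt)")
    else:
        text = HOSTS_PATH.read_text(encoding="utf-8", errors="replace")
        for problem in check_hosts(text) or ["hosts entries look as expected"]:
            print(problem)
```

Run from a command line window, this prints one line per problem it finds. A clean result only shows that the file at that exact path holds the expected lines; it does not prove the server side is healthy.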
©2025 University of Washington
https://www.bakerlab.org