Problem with completing downloads
Posted: Thu Nov 01, 2007 12:59 pm
by BigJens
Hi!
I am experiencing a problem with the completion of downloads recently. I can see in the task manager that there are bodies left, but no errors on the others. See attached picture. Any help here?
TIA
Jens
Posted: Thu Nov 01, 2007 4:02 pm
by Josef K
I've seen this happen on occasion, but I've been trying to determine whether it's a particular server just hanging on me. It does seem like UE's timeout isn't really doing its job. UE tends to pick up the missing pieces after a restart. I've only noticed this recently, which is why I want to rule out server issues first.
I'm leaning towards thinking that UE isn't timing out the read operation (currently set at 30 seconds - I can't remember if that's default or not).
Posted: Fri Nov 02, 2007 2:58 am
by alex
it indicates there are running tasks. what is the progress in the running pane (it is the pane to the right of the articles pane with the green computer icon)? 0% or 100%?
if it is 100% it means the server doesn't return the end-of-article character (dot) or returns it with a delay (i saw that with UNS). 0% might be e.g. a dns resolution problem (you can rule that out by replacing the server name with the server ip) or local (firewall) or external (server related) connection trouble. you can try to reduce "read timeout" if it is some intermittent rare connection issue (but the timeout is not effective for dns resolution).
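To separate a dns problem from a connection problem, the lookup can be timed on its own. This is only a rough Python sketch, not anything UE does; `resolve_timed` is a made-up helper name. It relies on the standard `socket.getaddrinfo` call, which blocks until the resolver answers and takes no timeout argument.

```python
import socket
import time

def resolve_timed(host, port=None):
    """Resolve host and report how long the blocking lookup took.

    Hypothetical helper for illustration: getaddrinfo() has no timeout
    parameter, so a slow or dead resolver shows up as a large elapsed time.
    """
    start = time.monotonic()
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    elapsed = time.monotonic() - start
    # Each entry is (family, type, proto, canonname, sockaddr); keep the IPs.
    addresses = sorted({info[4][0] for info in infos})
    return addresses, elapsed
```

If the lookup for the server name is slow while a numeric IP returns immediately, the stall is probably in name resolution rather than in the connection itself.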
try to check whether it happens with e.g. news.microsoft.com. if it does not, it is most likely a news server problem; if it does, make sure the problem is not caused by your firewall or antivirus.
in any case if you have antivirus or firewall set them not to interfere with UE traffic. they may consume CPU, and bugs in them might cause hung downloads or even system instability.
read timeout is just a socket setting. if you define a timeout UE sets it for every created socket for sure, the rest though is up to the socket layer. dns resolution is not governed by the timeout since for the dns resolution call there is no select operation where the timeout is set. when the timeout is set, third party firewalls sometimes might cause sockets to behave differently.
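For readers unfamiliar with the select operation mentioned here, a minimal Python sketch of a read guarded by a select timeout follows. It is an illustration only and not UE's code; `recv_with_timeout` and `demo` are assumed names, and the error text is borrowed from the message quoted later in this thread.

```python
import select
import socket

def recv_with_timeout(sock, nbytes, timeout_s):
    """Wait up to timeout_s for data, then read it; raise TimeoutError if none arrives."""
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        # Nothing arrived in time: the caller decides whether to retry.
        raise TimeoutError("Select timeout error")
    return sock.recv(nbytes)

def demo():
    """One read that succeeds, then one that times out on a silent peer."""
    a, b = socket.socketpair()
    try:
        b.sendall(b"hello")
        first = recv_with_timeout(a, 1024, 1.0)
        try:
            recv_with_timeout(a, 1024, 0.2)   # peer sends nothing more
            timed_out = False
        except TimeoutError:
            timed_out = True
        return first, timed_out
    finally:
        a.close()
        b.close()
```

The timeout only covers the wait inside `select`; anything that blocks before the socket exists, such as name resolution, is outside its reach.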
Posted: Fri Nov 02, 2007 4:48 am
by Josef K
I noticed this again a few hours ago. Since this isn't easily reproduced (it happens with maybe one or two segments within a DVD sized post) I'll check again next time I download something to see what the percentage is.
Antivirus and firewall both have UE excluded completely (AV = database directory, firewall = UE traffic). The same config has been in effect for some time now but it is only recently I've seen this behaviour.
I'm thinking it's a server issue (Ngroups.net) since they seem to be the only hanging connections. Having said that, most of what I've been downloading has been outside of my ISP retention so they just return as not available anyway. The articles have been around 65-69 days old so they are well within the retention of Ngroups.
Posted: Fri Nov 02, 2007 10:44 am
by alex
NGroups resells UNS so it is the same server. with UNS I think what I saw was a delay in returning the final dot, or somewhere near the end. it was stuck at 100%, and at least in some instances it completed without retry after a delay.
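To show why a late or missing final dot leaves a task sitting at 100%: in NNTP a multi-line response ends with a line containing only a dot, so the client keeps reading until it sees `\r\n.\r\n`. The Python sketch below is an assumption-laden illustration, not UE's code. `read_article_body` is a made-up name, dot-unstuffing is skipped, and it assumes nothing else follows the terminator in the same read.

```python
import socket

TERMINATOR = b"\r\n.\r\n"  # NNTP end of a multi-line block

def read_article_body(sock, read_timeout_s):
    """Read until the terminating dot line; each recv() is limited by the read timeout."""
    sock.settimeout(read_timeout_s)
    data = b""
    while not data.endswith(TERMINATOR):
        chunk = sock.recv(4096)  # raises socket.timeout if the server goes quiet
        if not chunk:
            raise ConnectionError("server closed the connection before the final dot")
        data += chunk
    return data[:-len(TERMINATOR)]
```

If all of the article text has arrived but the dot has not, the loop cannot finish, which matches a task that shows 100% yet never completes until the timeout fires.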
Posted: Sun Nov 04, 2007 1:52 am
by Josef K
I've finally seen this happen again. Twelve segments are hung at 0% on the Ngroups SSL server. I can't be sure that this is the only server that's been problematic but since they're all on that particular one right now, maybe it's having issues at the moment.
If it is server related, can UE be adapted to take account of this sort of situation? For example, if an article stays running beyond the read timeout set in Properties, could UE cut off the connection and retry? In effect a hard limit.
The several times I've seen this happen I've restarted UE, it's begun the download again and then it completes. What I'm looking for is some sort of automatic operation UE can perform without user intervention. It will be especially useful when PAR/RAR support arrives since otherwise nothing can be done in a case like this. Even now if I have QuickPar running, checking for completion and repairing if necessary, it will never have the opportunity to do so.
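To make the request concrete, here is a rough Python sketch of the "hard limit" idea: any attempt that times out is dropped and retried automatically. It is purely illustrative. `download_with_hard_limit` and `make_flaky_fetch` are hypothetical names, and the stalling download is simulated rather than real.

```python
def download_with_hard_limit(fetch_once, max_attempts):
    """Call fetch_once() until it succeeds; a TimeoutError abandons that attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch_once(), attempt
        except TimeoutError:
            continue  # a real client would close the socket and reconnect here
    raise TimeoutError(f"gave up after {max_attempts} attempts")

def make_flaky_fetch(stalls):
    """Hypothetical stand-in for a segment download that stalls `stalls` times."""
    state = {"left": stalls}

    def fetch_once():
        if state["left"] > 0:
            state["left"] -= 1
            raise TimeoutError("read timed out")
        return b"segment data"

    return fetch_once
```

With two simulated stalls and a limit of five attempts, the third attempt returns the data, which is the same effect as restarting the program by hand.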
Posted: Sun Nov 04, 2007 3:07 am
by alex
but it is what it does.
if it is read/write operation which stalls - read/write timeout settings will be used (also if it happens during the initial ssl negotiation).
if it is socket connect there is no direct timeout setting in winsock, and as to resolving dns there is no timeout setting at all, but both will eventually time out according to a system-wide timeout value somewhere.
i can add timeout for connect to make it user defined (as a separate setting? - i'm not sure all firewalls will digest it well), need to check first it works in Windows.
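One common way to get a user-defined connect timeout is a non-blocking connect followed by a select on writability. The Python sketch below shows that general technique under stated assumptions; it is not how UE implements it, and `connect_with_timeout` is a made-up name. Note that the `getaddrinfo` step still blocks without any timeout.

```python
import errno
import select
import socket

def connect_with_timeout(host, port, timeout_s):
    """Connect with an explicit time limit using non-blocking connect + select."""
    # Name resolution happens here and is NOT covered by timeout_s.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    sock = socket.socket(family, socktype, proto)
    sock.setblocking(False)
    try:
        err = sock.connect_ex(addr)
        if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            raise OSError(err, "connect failed")
        # Writable means the connect finished; some platforms report
        # failures in the exceptional set instead, so watch both.
        _, writable, failed = select.select([], [sock], [sock], timeout_s)
        if not writable and not failed:
            raise TimeoutError("connect timed out")
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            raise OSError(err, "connect failed")
        sock.setblocking(True)
        return sock
    except BaseException:
        sock.close()
        raise
```

A timeout here only says that the handshake did not finish in time; a firewall that silently drops packets and a server that is down look the same from the client side.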
yes i remember i saw the same problem with SSL/UNS.
Posted: Sun Nov 04, 2007 6:08 am
by Josef K
All I can tell you is what I see and what I see is a connection that waits forever. UE does its job but there are times like these where it really isn't taking command. If a server is at fault then it needs to be shown who's boss.
There just needs to be something in UE to say 'something's wrong' and then retry.
If there is a firewall issue later on then it can be dealt with in its config. Mostly it should just pass by without a significant problem.
The system wide timeout duration I don't know about and I'm too tired to research right now. The 'timeout' is fairly inaccurately labelled if it just sits pending forever, though. I've left UE downloading overnight and found it still running hung tasks the next day.
It would be helpful if the original poster would describe the issue in more detail, in particular which server(s) were hanging. That could narrow down the type of server on which this occurs.
Posted: Sun Nov 04, 2007 7:17 am
by alex
i'll add connect timeout option, we'll see then.
Posted: Sun Nov 04, 2007 2:25 pm
by Josef K
Again I wake up to another download which has hung on the SSL server. This time the two segments left were in between 0% and 100% and weren't moving until I restarted. They were at something like 40-something% and 60-something%. What does that suggest? I had QuickPar running to monitor and auto repair (it did need repairing) and it had enough blocks, so it would have repaired if that last file had been able to complete.
I think the SSL server is stalling occasionally, and UE needs to restart the downloads periodically in order to catch a time when the server isn't stalled so things can complete.
UE with a timeout would go a long way to avoiding this - I look forward to seeing it.
Posted: Mon Nov 05, 2007 2:33 pm
by alex
try to check this version:
http://www.netwu.com/ue/ue05nov07.rar
in properties->tasks i added connect timeout.
if sys.default is checked it will engage normal connect.
the version number is intentionally wrong (1.9.9.1), it is just to check whether it works with this server condition. let me know.
Posted: Mon Nov 05, 2007 4:09 pm
by Josef K
I'll check it as soon as I can. I'm running low on space (as always
) so I can't run it right now but I should be able to have it running overnight.
The unfortunate thing is that even if it works it might just be that the server is working fine so there's no real way of telling other than to run it for an extended period of time. I hadn't had this problem until recently so if this was a temporary issue with the SSL server then it could be that they fixed it. Unless you've built in some sort of debug log that would list the times it forced a reconnect.
We'll see what happens...
Posted: Mon Nov 05, 2007 8:22 pm
by Greg_G
I've got ue05nov07 running now and am switched back to SSL. I'll let you know how things go.
Edit: I've set the connect timeout to 60 and I'm sorry to say that I currently have all 3 allowed tasks stuck at 0% for about an hour. The TCP connections still show as active in netstat:
secure.usenetserver.com:https ESTABLISHED
secure.usenetserver.com:https ESTABLISHED
secure.usenetserver.com:https ESTABLISHED
Posted: Mon Nov 05, 2007 11:47 pm
by alex
try to do the following with the current release (not the test version):
make sure there are no articles in the queue, or that they are marked for download later.
uncheck properties->general->keep alive
in properties->servers for UNS set number of retries to zero and uncheck SSL (but leave the SSL port).
then restart the program and try to download a single part article.
does it give "Select timeout error" after the time set in properties->tasks, read timeout?
the idea is when you use SSL port for non-SSL connection it should time out since both sides are waiting, so we'll see whether you have timeout working, if not it is 100% your local configuration issue.
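The same "both sides waiting" situation can be reproduced locally without any news server. In this Python sketch a server accepts the connection and stays silent (roughly like an SSL port waiting for the client to start the handshake), while a plain client waits for a greeting under a read timeout. All names are hypothetical and this is not UE's test procedure, just an analogy to it.

```python
import socket
import threading

def start_silent_server():
    """Listen on a free local port, accept one client, and never send a byte."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    done = threading.Event()

    def serve():
        conn, _ = listener.accept()
        done.wait(5)        # stay silent while the client is waiting
        conn.close()
        listener.close()

    threading.Thread(target=serve, daemon=True).start()
    return listener.getsockname()[1], done

def greeting_or_timeout(port, read_timeout_s):
    """Return the server greeting, or None if the read timeout fires first."""
    with socket.create_connection(("127.0.0.1", port),
                                  timeout=read_timeout_s) as sock:
        try:
            return sock.recv(1024)
        except socket.timeout:
            return None
```

If the client comes back with `None` after roughly the configured time, the read timeout is working on that machine; if it hangs far longer, something local is interfering with the socket.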
Posted: Tue Nov 06, 2007 1:14 am
by Greg_G
alex wrote: does it give "Select timeout error" after the time set in properties->tasks, read timeout?
Yes it does, alex:
Last error/mishap: server: secure.usenetserver.com time: 20:10 error: Select timeout error