Problem with using Multiple Servers

Codone
Posts: 9
Joined: Sun Apr 24, 2005 8:36 am

Post by Codone »

I am a Software Engineer, and have a lot of experience tracking down bugs. And I've spent many hours trying to track down exactly what is going wrong with Newspro. Now unless I just do not have a certain option enabled that I need to, I believe I've found a serious bug. I used two servers in my test -- news.west.cox.net and news.central.cox.net. I am confident that when I look at these servers alone, Newspro is reporting the correct information. I used "Xnews" to verify both servers (singly).

First, I deleted the registry for Newspro, deleted the Newspro directory and database, and re-installed (the registered version). Re-entered keys, made a new database directory. In short, I started from scratch each time I tested a new case. I even tried taking my router and all firewalls out of the system. I eventually figured out that my problem stems from using more than one server. When I just use the WEST server, all the headers appear. When I add WEST and CENTRAL, many headers are missing. In fact, many are missing from Central, and Newspro appears not to be combining WEST and CENTRAL; it appears to just use Central's results. It even reports the same missing headers for both WEST and CENTRAL when I use "Properties" on an incomplete file.

--------------------
I have put up screenshots showing all of this. I went one step at a time.
PLEASE GO TO http://members.cox.net/codone/
to see all screenshots of the steps below!!
---------------------

STEP 1: Shows just the WEST server (All is good!)

------------
STEP 2 - 1: Shows just the CENTRAL server (has incompletes which is correct!)

STEP 2 - 2: Enabled the Partial -> Show incomplete partial messages
NOTE: Also shows the Properties of incomplete "Part 2" (Correct, I assume)

------------
STEP 3 - 1: Show both WEST and CENTRAL together (INCORRECT)
STEP 3 - 2: Enabled the Partial -> Show incomplete partial messages
NOTE: Also shows the Properties of incomplete "Part 2"


The following is a detailed list of what I did on each step:
--------------------------------------
(after deleting Registry/newspro directory and reinstalling)
Step 1
--------------------------------------
Added server: news.west.cox.net
Added Newsgroup: alt.binaries.drwho

Get New Headers

Notice that ALL the files for "Creature From the Pit" are complete.
--------------------------------------
(after deleting Registry/newspro directory and reinstalling)
Step 2
--------------------------------------
Added server: news.central.cox.net
Added Newsgroup: alt.binaries.drwho

Get New Headers

Notice that many of the files for "Creature From the Pit" are INCOMPLETE!
--------------------------------------
(after deleting Registry/newspro directory and reinstalling)
Step 3
--------------------------------------
Added server: news.west.cox.net
Added server: news.central.cox.net
Added Newsgroup: alt.binaries.drwho

Get New Headers

Notice that many of the files for "Creature From the Pit" are INCOMPLETE, even though I have used the WEST server (along with CENTRAL). WEST had all the files complete! This appears to be the bug.
alex
Posts: 4538
Joined: Thu Feb 27, 2003 5:57 pm

Post by alex »

newspro doesn't combine headers, it rather has them combined from the outset.

did you check dns/reverse dns that the servers don't have different ip addresses at different times?

can you replace server names with ip addresses (there is rename option for servers, just restart the program after rename for some case).

68.12.19.6 [news.central.cox.net]

68.6.19.6 [news.west.cox.net]

looks like single ip address at the moment.

but even then you may get different results at different times, if they use server farms and some kind of switch at the front end.

if you saw it incomplete with one server and then complete with the same server (with same ip address) try to reset the newsgroup repeatedly and get headers again, you should see the same picture as when you restarted the program.

resetting newsgroup is enough, if you didn't purge/delete any headers nothing special to show all headers is needed.

check with other incompletes whether the picture is the same (it looks like the same part - 17- is missing in your example - maybe it went to the same physical server behind different gateways).

with xnews you should see the same alternating picture then.

i mean idealizing the server side is some kind of exaggeration :)
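
A minimal sketch of the dns check suggested above, in Python. The helper names `resolve_all` and `report` are hypothetical and have nothing to do with Newspro itself; only the standard `socket` module is used. Running `report` a few times over a day would show whether a name ever maps to more than one address.

```python
import socket

def resolve_all(hostname, port=119):
    """Return the sorted list of IPv4 addresses a hostname resolves to right now."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})

def reverse_name(ip):
    """Reverse lookup; returns None when no name is available for the address."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None

def report(hostnames):
    """Print every address (and its reverse name) for each hostname."""
    for host in hostnames:
        try:
            addresses = resolve_all(host)
        except socket.gaierror as exc:
            print(host, "did not resolve:", exc)
            continue
        for ip in addresses:
            print(host, ip, reverse_name(ip))
```

For example, `report(["news.west.cox.net", "news.central.cox.net"])` would list the current addresses for both names.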
Codone
Posts: 9
Joined: Sun Apr 24, 2005 8:36 am

Post by Codone »

alex wrote: "newspro doesn't combine headers, it rather has them combined from the outset."

On step 3 (for which I deleted everything -- database/registry/etc), I started with nothing, added both servers (central & west) to the list, and did a Get New Headers. It was getting headers from both servers at the same time (2 threads going). Yet it seems to only be showing the results from Central. You can see all this from my step-by-step listing of what I did, and from looking at the screenshots, where on step 3, you can see that both servers were in the list. Remember, on all 3 steps I always deleted everything (reg/database/even the EXE -- even had to re-enter the REG code and name).

And after Step 3, I re-did Step 1..... And again WEST showed all headers intact. In fact I rechecked all this for hours while I was writing this message. I had to go back and re-do it all to get the screenshots. Central ALWAYS showed incompletes... WEST never did and will not for days since the retention is good on that server. When I put both servers in and Get New Headers, you see that it only shows Central's results.

It's as if, when I put both servers in, it uses Central's address in place of WEST's address. This would explain why it shows the same missing parts (step 3-2) for central and west, even though West is not missing any parts.

As you showed, west and central do have different IP addresses.
I'm not sure what you meant by "looks like single ip address at the moment". You listed them as:

68.12.19.6 [news.central.cox.net]

68.6.19.6 [news.west.cox.net]

which are different.
Codone
Posts: 9
Joined: Sun Apr 24, 2005 8:36 am

Post by Codone »

I tried using the actual IP addresses of the two servers instead of "news.west.cox.net" and "news.central.cox.net"

68.12.19.6
68.6.19.6

I got the same results exactly as Step 3. Showed "central's" results. Seemed to ignore "west's" completed parts.

I reset the servers before this test.
alex
Posts: 4538
Joined: Thu Feb 27, 2003 5:57 pm

Post by alex »

no need to delete everything, just reset the newsgroup.

try to disable properties->general->keep alive and then get headers 1 task at a time (set header tasks to 1 in properties->tasks).

you can even try it in a different order: expand the newsgroup, select one server at a time and invoke 'get new headers'

if it works ok, reset the newsgroup and try now 2 tasks at a time with keep-alive disabled.

you can also try e.g. to download headers in a group for news.microsoft.com and news....cox.net for a microsoft group, following your logic you'll see the same headers on both servers - as newspro works now it would be the case if you were right.
Codone
Posts: 9
Joined: Sun Apr 24, 2005 8:36 am

Post by Codone »

First, I want to clarify -- I have been using "Open and get new headers", not just "Get new headers". In case you try to reproduce this.
------------------------------------

Now to make things harder, lol, the Central server has caught up with West on the test files we were using.

They now both are complete on "Creature from the Pit". I ran Xnews and found another case in the same group that had missing parts on Central and complete on West (used "The Celestial Toymaker").
--------------------------------------------

Okay, I tried your suggestions:

1) Disabled Keep-alive
2) Set New Header tasks to 1

Reset Newsgroups (still have both servers active, in IP address form)

With the task limit on, it did only get one server's headers at a time.
It first got West's (which are COMPLETE). Then it got Central's headers which are incomplete. It WORKED in this case. It shows all parts complete on the main window, and a properties shows Central has missing, and West has complete.
---------------------------

Now I am trying two tasks at a time with Keep Alive off:

Still worked! It started Central and West's threads at the same time. Central finished very quickly (I get 500 KB/sec there, and only 50 KB/sec at WEST). I saw many missing files after Central completed. Minutes later, WEST completed. And all the missing files filled in!

It appears that the Keep Alive being enabled was the problem. I will continue to test it over the next few days using all three Cox servers (East,West, and Central) and see if this has fixed the problem.

Thanks for the support! And if it all goes smoothly this next week, I'll probably register another copy for my brother..!
alex
Posts: 4538
Joined: Thu Feb 27, 2003 5:57 pm

Post by alex »

do you have distinctively different speeds with both servers?

the 'get new headers' is the only command you've issued after the program start? then keep-alive wouldn't matter, since both connections would be the only connections opened anew.

can you just start the program in the conditions when you got the problem, reset the newsgroup, then invoke 'get new headers' only for the newsgroup and write down the bandwidth counters for both servers (servers pane in the workspace)? then after one minute calculate the difference ('exclude header bandwidth' in the context menu for the server should be unchecked) to be sure there is the several-times difference you expect between the servers.

or just examine packets to be sure it gets data from both servers, there are network monitoring programs where you can see all network packets coming in and out.
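
The counter arithmetic described above can be sketched in Python. This is only an illustration: the function name `throughput_ratio` and the byte values in the usage note are made up, and the readings themselves would have to be copied by hand from the servers pane.

```python
def throughput_ratio(before, after, seconds=60):
    """Compare per-server throughput from two counter readings.

    before / after: dicts mapping a server name to its byte counter,
    read `seconds` apart. Returns (rates in bytes/sec, fastest/slowest ratio).
    """
    rates = {name: (after[name] - before[name]) / seconds for name in before}
    slowest = min(rates.values())
    fastest = max(rates.values())
    # A server that transferred nothing makes the ratio unbounded.
    ratio = fastest / slowest if slowest > 0 else float("inf")
    return rates, ratio
```

With hypothetical readings of 3,000,000 bytes for one server and 30,000,000 bytes for the other after one minute (both starting at 0), the rates come out at 50,000 and 500,000 bytes/sec, a ratio of 10. A ratio near 1 would suggest that both connections are being served from the same place.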
Codone
Posts: 9
Joined: Sun Apr 24, 2005 8:36 am

Post by Codone »

There is no need -- Central always finishes about 10 times faster than west. This occurred in all the tests that I did including when I was getting the failures. The case where I explained that I get 500 KB/sec vs 50 KB/sec was to let you know that Central finished first (which I never had mentioned) and that I saw the incomplete files... then about 5 minutes later, West completed and I saw the missing files fill in. In all the failure cases, this is exactly what occurred before timeline-wise, but when West completed, the missing files didn't fill in.

I might have to try the packet monitoring when I get a chance. In the next week or so, if the problem happens again I will do that. And if I get a free day soon, I will run these tests again, with Keep Alive off, and on, to verify that it indeed is the problem and not some strange coincidence. So far, since I turned it off, it still seems to be working.
alex
Posts: 4538
Joined: Thu Feb 27, 2003 5:57 pm

Post by alex »

so you are sure it was indeed downloading from the right servers?
Codone
Posts: 9
Joined: Sun Apr 24, 2005 8:36 am

Post by Codone »

Yes, I'm sure that it's downloading from the right servers. I can use Xnews to get just Central's headers and see missing parts, and then use Xnews to get West's and see complete parts. The same holds for Newspro if I use only one server. The problem only seems to occur when I use both at the same time.

And I think that the problem has come back. I still had Keep Alive off, but last night, when getting a group, many files were incomplete. I deleted East and Central (leaving West) and tried again, and all were complete. I have noticed in all this testing that on rare occasions (maybe 5-10%) it would still work correctly with more than one server.

Now the thing to remember is that when this problem does occur, it only shows itself if parts are missing on one server and present on another. If parts are present on both servers, they will still appear in the list even though (apparently in my case) Newspro only shows the information from one of the servers. I think I have been seeing this a lot recently because Central's retention has dropped a lot relative to the other servers.

Unfortunately, I think I'm about to give up. When I delete East and Central and just use West, everything works fine, except for my slower download speed. Maybe you can find a server that has missing parts and one that doesn't, and see if you can reproduce this?
alex
Posts: 4538
Joined: Thu Feb 27, 2003 5:57 pm

Post by alex »

how can i reproduce it if no one has had such a problem (i mean related to the newsreader part) in the past since version 1.0 :)

especially in newspro, since it considers every part separately, partials are just a metalevel above the database (excluding trivial optimisations to reduce the database size which cannot play any role here since records for every part are separate).

i mean given newspro current internals e.g. exactly the same picture one would observe with text groups.
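
A toy model of the per-part bookkeeping described above, in Python. This is not Newspro's actual code or database layout; the names `merge_parts` and `missing_parts` and the record shape are assumptions made for illustration. It shows why, with one record per part per server, a file should count as complete as soon as any server has each part.

```python
from collections import defaultdict

def merge_parts(records):
    """Index part records by file and part number.

    records: iterable of (server, file_name, part_number, total_parts).
    Returns (have, totals) where have[file][part] is the set of servers
    that reported the part and totals[file] is the expected part count.
    """
    have = defaultdict(lambda: defaultdict(set))
    totals = {}
    for server, name, part, total in records:
        have[name][part].add(server)
        totals[name] = total
    return have, totals

def missing_parts(have, totals, name, server=None):
    """Parts missing everywhere (server=None) or missing on one server."""
    missing = []
    for part in range(1, totals[name] + 1):
        servers = have[name].get(part, set())
        if server is None:
            if not servers:
                missing.append(part)
        elif server not in servers:
            missing.append(part)
    return missing
```

In this model a file whose part 2 exists only on west is missing part 2 on central but has no parts missing overall, which matches what the per-server Properties view and the main window showed in the run that worked.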

if you like i'll prepare for you a version which will write all xover output to the disk with error log enabled so you could compare raw server output.
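
A rough Python sketch of how two such raw dumps could be compared. It assumes the dump holds plain XOVER response lines in the standard tab-separated overview layout (article number, subject, from, date, message-id, references, bytes, lines); the actual format such a debug build would write is not known, and the function names are hypothetical.

```python
def parse_xover(raw):
    """Map message-id -> subject for every overview line in a raw dump."""
    result = {}
    for line in raw.splitlines():
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # status lines, the "." terminator, or damaged lines
        subject, message_id = fields[1], fields[4]
        result[message_id] = subject
    return result

def only_on_first(raw_a, raw_b):
    """Subjects of articles present in dump A but absent from dump B.

    Articles are matched by message-id, because article numbers differ
    from server to server.
    """
    a = parse_xover(raw_a)
    b = parse_xover(raw_b)
    return sorted(a[mid] for mid in a.keys() - b.keys())
```

Calling `only_on_first(west_dump, central_dump)` would then list the parts that west returned and central did not.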

so you are sure newspro is dealing with different servers. the only unknown is we don't know how their servers are organized physically, e.g. they may buy service from a large usenet provider and provide access to their users through two nntp switches at different ip "locations" but you may get at different times from the switches to the same physical servers.

you shouldn't take for granted you are dealing with the same physical server every time, it is most likely not the case.
Codone
Posts: 9
Joined: Sun Apr 24, 2005 8:36 am

Post by Codone »

Just to update what I've found with this issue...

I have dropped the number of header tasks to one at a time, as you suggested before, and that works 100% of the time. It gets WEST first, then after it's done, it gets one of the others, and then the last one. It never shows the problem of missing headers. The problem only occurs when I get the 3 tasks going at the same time. This makes it a lot slower to get the headers, but it will use all three servers when it actually downloads (if I wait until all three header tasks complete). I still think it's a problem with Newspro somehow, but since no one else seems to be having the issue, maybe it's something strange with Cox's servers. But since I'm still able to get it to work, I'm still happy.