Don't get existing headers with XPAT

plopje99 · Post by **plopje99** » Mon May 26, 2003 5:32 pm

Alex,

First of all let me say that NewsPro is an outstanding product. My problem isn't really a problem but I will ask anyway.

It seems to me that NewsPro always gets all the headers it can in an XPAT search group when getting new headers. Is it possible for NewsPro to check (by message-id??) whether or not it already has the header for the server it is getting headers from? It could skip those headers, which would save time and bandwidth.

Another request:

Is it possible to add a column for some servers (a server option) in the article window which states the number of parts that server has for each multipart? This way I can quickly see which bodies I have to retrieve from my pay-server in order to make the multipart complete (minimise the cost). Right now I have to scan all incomplete multiparts with SHIFT-S and write down how many parts each multipart is missing. When I have a couple of PARs I can skip the multiparts with the fewest available parts.

Thanx!

Tha*Lunat!k · Post by **Tha*Lunat!k** » Mon May 26, 2003 10:18 pm

XPAT is a limited-functionality server-side command, so there isn't really much flexibility. When the command is issued, the server processes it the same way regardless of what you already have. NewsPro can apply date limits to stop it from showing older headers, but it still goes through all the same processes initially.

As for completion: you could push F2 on a multipart and it will show the % complete for each server. It should be easier to just check the percentages and know which articles not to download. There isn't really a way to sum up the completion of a server overall though.

plopje99 · Post by **plopje99** » Tue May 27, 2003 7:53 am

The XPAT search consists of two stages. The client sends the search string request and gets back all the message-ids that match the string in the newsgroup. This part is always the same and costs little time and bandwidth. The client is then able to get the headers from the server one by one, using the message-ids from the list. NewsPro can check whether a certain header already exists in its database. If it exists, there is no need to fetch it from the server. Fetching all the headers from the id list is a rather time-consuming process when you often scan for new headers on a server in a large XPAT search group, and it's a waste of bandwidth. The test of whether a certain header already exists, on the other hand, can be done at the speed of light.
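To make the idea concrete, here is a minimal sketch of the check I have in mind. It assumes the search hands back message-ids and that the local database can be treated as a set of known ids; `ids_to_fetch` and the sample ids are made-up names for illustration, not anything from NewsPro.

```python
def ids_to_fetch(search_hits, known_ids):
    """Return the search hits that are not stored locally yet.

    The order the server returned them in is preserved.
    """
    seen = set(known_ids)
    missing = []
    for message_id in search_hits:
        if message_id not in seen:
            missing.append(message_id)
            seen.add(message_id)  # also drops duplicates within the hit list
    return missing


# Hypothetical data: one header is already in the local database.
hits = ["<a@example>", "<b@example>", "<c@example>", "<b@example>"]
local = {"<a@example>"}
print(ids_to_fetch(hits, local))  # ['<b@example>', '<c@example>']
```

Only the ids that come out of this filter would need a header request to the server.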

As for completion: with your solution I have to push F2 for every incomplete multipart and write down the results. The next step is to sort them on paper, and only then can I make some decisions. This is a very time-consuming operation. With an article-count column per server I would have a good overview.

alex · Post by **alex** » Wed May 28, 2003 5:28 am

The NewsPro database is indexed by message-id, not by article number (it seems to me that this is the only optimal way to go). I only use article numbers to work around some server malfunctions, and those cases don't require a find function. Given a message-id it is possible to locate the newsgroup and article number, but not the other way around.

XPAT returns article numbers, so the only thing NewsPro does to optimize header retrieval is apply some heuristics to ranges of article numbers, based on the principle that downloading scattered but densely packed headers one by one is slower than downloading all the headers in the range.
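As a rough illustration of that principle (not the actual NewsPro heuristic), the sketch below merges article numbers that lie close together into one range. The function name `coalesce` and the `max_gap` threshold are assumptions picked for the example.

```python
def coalesce(article_numbers, max_gap=8):
    """Group article numbers into (first, last) ranges.

    Numbers at most max_gap apart are merged into one range, accepting a
    few unwanted headers in exchange for fewer round trips to the server.
    """
    ranges = []
    for number in sorted(set(article_numbers)):
        if ranges and number - ranges[-1][1] <= max_gap:
            ranges[-1][1] = number  # extend the current range
        else:
            ranges.append([number, number])  # start a new range
    return [(first, last) for first, last in ranges]


# Hypothetical XPAT hits: two dense clusters and one isolated article.
hits = [100, 101, 103, 110, 500, 502, 9000]
print(coalesce(hits))  # [(100, 110), (500, 502), (9000, 9000)]
```

Seven single requests collapse into three range requests here; a larger `max_gap` means fewer requests but more headers downloaded that did not match the search.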

The slowest part is not downloading the headers but XPAT itself. The server-side operation is usually slow, so its response takes time; NewsPro even has a separate delay setting for the XPAT read timeout.

I'm not sure it is feasible at all to implement what you are talking about. All we have is the server, the article number and the newsgroup. To find a message by article number we would need an additional index per server per newsgroup (say one index per newsgroup, with a key consisting of article number and server). That is not so good, since we don't need the index anywhere else, and it is a very heavy load and a waste of space since it changes with every incoming/outgoing destination. (For expiring headers you don't need the find function; you need all numbers less than a certain value, which is a different requirement from the point of view of organizing data structures.) Without such an index, what we need is a full newsgroup scan, and then even a modest XPAT dialog that includes only subscribed newsgroups will require a full database scan. All congrats to the XPAT command designers, who returned the almost useless header you are searching on instead of the message-id. It is an issue of formalizing the protocol without taking into account the considerations of every side involved in implementing it.
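For illustration only, this is roughly what such an extra index would look like as an in-memory sketch. The class `ArticleNumberIndex`, its methods and the sample server name are hypothetical and say nothing about how the NewsPro database is really organized.

```python
class ArticleNumberIndex:
    """Hypothetical index: (server, newsgroup, article number) -> message-id."""

    def __init__(self):
        self._by_number = {}

    def add(self, server, newsgroup, number, message_id):
        # Has to be updated for every header stored from every server.
        self._by_number[(server, newsgroup, number)] = message_id

    def remove(self, server, newsgroup, number):
        # Has to be updated again whenever a header goes away.
        self._by_number.pop((server, newsgroup, number), None)

    def find(self, server, newsgroup, number):
        """Return the message-id for an XPAT hit, or None if unknown."""
        return self._by_number.get((server, newsgroup, number))


index = ArticleNumberIndex()
index.add("news.example.net", "alt.test", 1001, "<a@example>")
print(index.find("news.example.net", "alt.test", 1001))  # <a@example>
print(index.find("news.example.net", "alt.test", 1002))  # None
```

The lookup itself is cheap; the cost is that every stored header needs one entry per server carrying it, kept up to date on every change, for an index that nothing except XPAT would use.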