Has anyone experimented with using a dedicated partition for the NewsPro databases?
I have been trying out NTFS mount points, and on a whim I decided to create a logical partition for them with a larger cluster size than the default 4K.
I settled on 16K, since the database files seem to be sized in multiples of 4K and top out around 400K max, so 32K clusters would often be left half full.
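As a rough sanity check on that choice (just my own back-of-the-envelope math, nothing measured from NewsPro itself; the file sizes are assumptions from what I see on disk), here is a quick sketch of the expected slack per file for a few cluster sizes:

import math

# Hypothetical file sizes: multiples of 4K up to 400K, as observed above.
file_sizes_kb = range(4, 401, 4)

for cluster_kb in (4, 16, 32, 64):
    # Slack = space allocated beyond the file's actual size.
    slack = [cluster_kb * math.ceil(s / cluster_kb) - s for s in file_sizes_kb]
    print(f"{cluster_kb:>2}K clusters: average slack {sum(slack) / len(slack):4.1f} KB/file, "
          f"worst case {max(slack)} KB")

On those assumptions 16K averages about 6KB of slack per file while 32K averages about 14KB, which is roughly the "half clusters" I was worried about.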
So I went ahead and tried it out, and it has been working surprisingly well. The fragmentation that was occurring on my primary drive is virtually gone; it is now confined to the NewsPro partition. And since that partition is both small (10GB) and has a reasonably large cluster size, it defragments quickly. My primary drive also has about 10,000 fewer files to keep track of.
Anyone else tried doing something like this?
Another technique I use is to keep one instance of NewsPro with no retention or subscriptions at all. I use it for NZB files, and it is the default. A second profile loads the full databases.
Both of these things have sped things up for me. Anyone else have similar tips they'd like to share?
Speed Up Tips
The database files grow by several MB at a time. Most of the fragmentation probably happens because of downloading multiple bodies, and parts average 350-500KB, which may be why a larger cluster size doesn't cause storage losses; on the other side, with a larger cluster size fragmentation should be less (if a part fits in one big cluster there is no fragmentation).
As to the future, the handling of bodies won't change in principle, so if article bodies are indeed what causes the fragmentation, this will still help then.
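To put rough numbers on that (my estimate only, not taken from the actual database code; the 350-500KB figure from above is simply averaged below), the cluster count per body, and with it the worst-case number of extents, drops quickly as the cluster size grows, while the relative slack stays small:

import math

body_kb = (350 + 500) / 2                  # assumed average body size from the post

for cluster_kb in (4, 16, 32, 64):
    clusters = math.ceil(body_kb / cluster_kb)      # worst case, one extent per cluster
    slack_kb = clusters * cluster_kb - body_kb
    print(f"{cluster_kb:>2}K clusters: {clusters:4d} clusters per body, "
          f"~{100 * slack_kb / body_kb:.1f}% slack")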
You are right. Actually some of the fragmentation is transient and is resolved when attachments are saved to another disk. Isolating this activity from the rest of the disk tends to help, though, because you can be writing thousands of files; even on a large volume the fragmentation still occurs, and then the larger volume takes longer to defragment.
I don't really know, though; I am just experimenting to see if I can find some good optimizations.
Maybe fragmentation should be taken into consideration in the database design, but locality of reference there is questionable (how close together the data the program needs on subsequent requests is likely to be), especially after a long run of deleting headers or downloading headers from several newsgroups simultaneously. Still, if the disk space is less fragmented there is at least less randomness, so the data will be faster to pull back into memory in the case of disk thrashing.
I do all the memory management for headers myself. At some point I will compare the performance of mapping memory to files, mapping memory to the paging file, and allocating memory directly (which uses the paging file indirectly), but in the end I suspect there is not much difference.
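For anyone curious what those three options look like in practice, here is a minimal sketch (Python only for illustration, it is not the implementation language, and the file name is made up) on a Windows box, where anonymous mappings are backed by the paging file:

import mmap

SIZE = 16 * 1024 * 1024                    # say, 16 MB of header data

# 1) mapping backed by a real file on disk
with open("headers.bin", "w+b") as f:      # hypothetical file name
    f.truncate(SIZE)
    file_backed = mmap.mmap(f.fileno(), SIZE)
    file_backed[:6] = b"header"
    file_backed.close()

# 2) anonymous mapping; on Windows this is backed by the paging file
anon = mmap.mmap(-1, SIZE)
anon[:6] = b"header"
anon.close()

# 3) direct allocation: ordinary process memory, which the OS pages out
#    to the paging file under pressure anyway
plain = bytearray(SIZE)
plain[:6] = b"header"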
I have read a few other posts, and looked again.
So the headers are in the files named after the newsgroups, with the $ files as indexes(?).
Then the npr files are actually article bodies. Is that right?
It seems to me (and I think there's another post about this) that if you split them into different directories, we could do things like run each on separate physical disks or even a RAM disk. Seems like a simple enough enhancement.
Since I have your ear, where do unprocessed headers go? It seems to me it would be good to be able to say "set header/hard task chance to run = 0 when unprocessed headers > nn".
This would prevent a header task from tying up one of your server connections slowly reading headers in a throttled state; it could finish getting headers for that server, and no more header tasks would start until we are caught back up. Just a thought.
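Something like this, say (purely a hypothetical sketch of the rule I'm suggesting, not how NewsPro actually schedules tasks; the names and the threshold are invented):

# Hypothetical sketch of the suggested gate; names and threshold are made up.
UNPROCESSED_HEADER_LIMIT = 50_000          # the "nn" from the suggestion

def may_start_header_task(unprocessed_headers: int) -> bool:
    # Running tasks are left alone so they can finish their server,
    # but no new header task starts until processing catches back up.
    return unprocessed_headers < UNPROCESSED_HEADER_LIMIT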
I didn't want to disable the prevent-overload feature, because it seems like a great concept; maybe I just don't know what problem it was put in to solve.
Anyway, thanks a lot. NewsPro is in a league of its own.
The specific files are implementation details that will change and aren't worth discussing. In a usable newsreader everything revolves around the database structure, so there are limits on how frequently it can be deeply revised.
The main problem I see now is people using exclusively binary downloaders taking over Usenet, since it lowers the IQ level of the Usenet community. It would eventually mean slower support for versatile Usenet clients (although for now my work is not affected).