Request feature: PAR2 support (tricky, read inside)

JimKi
Posts: 7
Joined: Tue Apr 22, 2003 3:12 pm

Post by JimKi »

Ok, i just hope that what i'm trying to do isn't already included with the current version of newspro.... let me explain

We are all acquainted with PAR files, this is really an interesting way to fill missing archives posted in the binary newsgroups

Now some smart people go further with PAR2 !

For the ones not knowing how it works, let me explain quickly:

Imagine that you have an anime episode, packed with winrar, 15 archives of 20'000'000 bytes each

With the normal PAR system, if a single part of archive #9 is missing or bad you can regenerate the archive with a single P01 file of 20'000'000 bytes; if another part is missing in archive #3, you need P01 & P02, 40'000'000 bytes in total, etc....

Now with the new PAR2, archives are chunked :-)

This means that if you have the 15 archives with only some bytes missing inside one of them (let's say you miss a part in the archive #9) you can analyse how many bytes are bad or missing and download only the required PAR2 chunks

So if only 2 chunks are bad in the #9, you just download a small file of say only 0.5 Mbyte and you can repair the WHOLE archive set

If the damage is spread over several archives, you still only need as many recovery chunks as there are bad chunks, meaning that you can theoretically repair 3 damaged archives with only 8 chunks, 1 or 2 MBytes in total, as long as only 8 chunks are bad !!!!

To compare with actual PAR system, it's like if you have 10'000 small archives with only 3 or 4 missing and you can just download only 4 small PAR files to repair them. Of course you don't see them, everything is handled by the new PAR2 system !!! great !!!
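
The size comparison above can be sketched in a few lines. The archive size comes from the example; the 250'000 byte block size is an assumption picked so that 2 chunks make 0.5 MByte, and PAR2 packet overhead is ignored.

```python
ARCHIVE_SIZE = 20_000_000  # bytes per RAR volume, from the example above
BLOCK_SIZE = 250_000       # assumed PAR2 block size (2 blocks = 0.5 MByte)


def par1_repair_bytes(damaged_archives: int) -> int:
    """PAR1: one full-size recovery file per damaged archive."""
    return damaged_archives * ARCHIVE_SIZE


def par2_repair_bytes(damaged_blocks: int) -> int:
    """PAR2: one recovery block per damaged block (overhead ignored)."""
    return damaged_blocks * BLOCK_SIZE


# Two bad blocks inside archive #9:
print(par1_repair_bytes(1))  # 20000000
print(par2_repair_bytes(2))  # 500000
```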

Now how to make it work with newspro, that's the tricky question

Currently the only way i see is to deactivate the multi-part messages support, save EACH part separately, RECONSTRUCT the whole archive by putting fake parts where they are missing, and finally use the PAR2 system to repair the set.... pffff, not really easy

Is there a way to save quickly a whole file with filling zeroes to replace missing/bad parts ?

Help me, i don't know how to do this without becoming crazy... especially with people splitting in 105 parts with stupid posting-software
alex
Posts: 4514
Joined: Thu Feb 27, 2003 5:57 pm

Post by alex »

are you sure you need to fill missing parts with zeros?

i had some email communication recently about the matter and since i'm busy with something i asked the person to check whether it works when bytes are missing (so the offset changes in the middle of the file, not just some bytes are changed but offsets preserved).

the reply i got is that it works with missing bytes (probably par2 remembers a few starting bytes and a hash, so it can find the right chunk even if it is misplaced).
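
A rough sketch of how such a search could work, assuming the recovery data carries one hash per block. This is a brute-force illustration, not how any real par2 client is implemented: it hashes a window at every byte offset, where a real tool would presumably use a cheap rolling checksum first. The function name and the toy data are made up.

```python
import hashlib


def locate_blocks(damaged: bytes, block_hashes: list[bytes], block_size: int) -> dict[int, int]:
    """Map block index -> offset in `damaged` where that block was found."""
    wanted = {h: i for i, h in enumerate(block_hashes)}
    found: dict[int, int] = {}
    offset = 0
    while offset + block_size <= len(damaged):
        digest = hashlib.md5(damaged[offset:offset + block_size]).digest()
        index = wanted.get(digest)
        if index is not None and index not in found:
            found[index] = offset
            offset += block_size  # jump past the block we just matched
        else:
            offset += 1           # slide the window by one byte
    return found


# Toy file: 4 blocks of 16 bytes, then 10 bytes go missing inside block 1.
BLOCK = 16
original = b"".join(bytes([k]) * BLOCK for k in range(4))
hashes = [hashlib.md5(original[i:i + BLOCK]).digest() for i in range(0, len(original), BLOCK)]
damaged = original[:20] + original[30:]

print(locate_blocks(damaged, hashes, BLOCK))  # {0: 0, 2: 22, 3: 38}
```

Blocks 2 and 3 are still found although they moved 10 bytes towards the start of the file; only block 1 has to be rebuilt.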

so all you need is to download all available bodies, save the attachment (with some parts missing) and apply the par2.

the method is problematic since usually there is excessive number of incompletes and it may be not trivial to find what you need, the former par method has the advantage of not going through a pile of garbage.

if the user is mistaken about the offsets, let me know and i'll check it myself. the problem here is that only yEnc supplies an offset, and the real question (with yEnc, since otherwise nothing is available) is whether one can rely on it. as to newspro i'm sure; as to other posters i'm not, since the parameter may not be used, so its value might be incorrect. also, for some types of attachments zero fill may create problems when playing such a file directly.
JimKi
Posts: 7
Joined: Tue Apr 22, 2003 3:12 pm

RIGHT :-)

Post by JimKi »

You're right, incredible, why didn't i try this before lol

I tested it and it works well, i wonder how they did that. Anyway it means the guys doing PAR2 thought really hard.

Ok so it's easy now, i can just save what i've got without bothering WOWOWOW

Thanks for the tip !
buzzy
Posts: 4
Joined: Sat May 31, 2003 1:44 am

Post by buzzy »

for any newsreader that can download incomplete files, par2 completely kicks par's ass. get it, use it, get your friends to use it.

par2 does not require that missing parts be filled with dummy data, just save it either as separate chunks or as one (incomplete) file. the key is that the file must be named properly, same base naming scheme as the original file set.

when posting, best by far to use a block size that is a multiple of (unencoded) article size (lines x 128 bytes). so 3500 yenc lines x 128 = 448,000 or a multiple thereof.
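
The arithmetic, plus a quick look at why the alignment matters, as a small sketch. The 128 bytes per line figure is the one quoted above; the 400,000 byte block size is just a made-up example of a size that does not line up with the articles.

```python
BYTES_PER_YENC_LINE = 128  # unencoded bytes per yEnc line, as stated above


def article_payload(lines_per_article: int) -> int:
    """Unencoded bytes carried by one article."""
    return lines_per_article * BYTES_PER_YENC_LINE


def blocks_hit(article_index: int, article_size: int, block_size: int) -> int:
    """How many par2 blocks are damaged when one article goes missing."""
    start = article_index * article_size
    end = start + article_size - 1
    return end // block_size - start // block_size + 1


size = article_payload(3500)
print(size)                          # 448000
print(blocks_hit(1, size, size))     # 1 (block size = article size)
print(blocks_hit(1, size, 400_000))  # 2 (misaligned block size)
```

With an aligned block size a lost article costs exactly one recovery block; with a misaligned one the same loss can spill over into two.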

it's worth someone doing a short faq on downloading incomplete files, maybe I'll start a thread ...
RimBlock

Post by RimBlock »

The one other major advantage of Par2 is that raring is no longer required.

If a poster must rar the file then they can make only a single rar rather than having to split the rars for Par1 recovery.

There is quite a good discussion going on in alt.binaries.vcd.xxx on this subject at the moment for anyone who is interested (subject is "Re:Par2 Posts are doing my head in!!!!!!!!").

Alex, the only thing I can think of for NewsPro to do in regard to Par2 is to check the downloaded file with the .par2 when the save option is invoked and then autoqueue needed .par2 recovery files (after a user prompt which can be switched off). After the par2 recovery files are downloaded then saving the archive would also perform a recovery.
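
A hypothetical sketch of the "autoqueue" step: given how many blocks the check says are needed, pick which recovery files to queue. Nothing here is NewsPro or par2 code; the function name, the file names and the block counts are invented for illustration, and the selection is a simple greedy choice rather than an optimal one.

```python
def pick_recovery_files(needed: int, volumes: dict[str, int]) -> list[str]:
    """Choose recovery files whose block counts add up to at least `needed`."""
    chosen: list[str] = []
    remaining = needed
    # Largest volumes first, so that few files get queued.
    for name, blocks in sorted(volumes.items(), key=lambda kv: kv[1], reverse=True):
        if remaining <= 0:
            break
        if blocks <= remaining:
            chosen.append(name)
            remaining -= blocks
    if remaining > 0:
        # Nothing fits exactly: take the smallest unused volume that covers the rest.
        unused = sorted((b, n) for n, b in volumes.items() if n not in chosen)
        for blocks, name in unused:
            if blocks >= remaining:
                chosen.append(name)
                remaining = 0
                break
    if remaining > 0:
        raise ValueError("not enough recovery blocks were posted")
    return chosen


# Made-up recovery files holding 1, 2, 4 and 8 blocks each.
posted = {"a.par2": 1, "b.par2": 2, "c.par2": 4, "d.par2": 8}
print(pick_recovery_files(5, posted))  # ['c.par2', 'a.par2']
```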

I am not sure how much of the par2 development is open source and easy to integrate into News Pro but it would be a great feature to have if possible.

Cheers
RB
buzzy
Posts: 4
Joined: Sat May 31, 2003 1:44 am

Post by buzzy »

The core code for the command line par2 is open source, but I don't think that allows it to be incorporated into a commercial client. If you're talking about adding a feature to NewsPro that calls par2cmdline, that may work.

Though of course it's usable today without that.

See the parchive page ...
http://parchive.sourceforge.net/

Also see this link for a general (user-oriented) discussion of par2
http://www.netwu.com/newspro/phpBB2/vie ... php?p=1137

RimBlock wrote:
If a poster must rar the file then they can make only a single rar rather than having to split the rars for Par1 recovery.

I wouldn't recommend that, if you're going to rar you might as well break it up into smaller files. Small files are much more likely to be complete than large files, and complete files will always be a little more convenient to handle.
Guest

Post by Guest »

buzzy wrote:
I wouldn't recommend that, if you're going to rar you might as well break it up into smaller files. Small files are much more likely to be complete than large files, and complete files will always be a little more convenient to handle.

While I can understand your views, I am only interested in video and as such there is not much mileage in compressing with Winrar. Compressing (or storing) with rar also adds to the complete file size so you therefore end up having to post more which in turn reduces completion.

Complete files are always more easy to handle than incomplete ones :) .

Most of the video posts I do allow for 20% recovery although Par2 may well allow for me to reduce this percentage as you are recovering much smaller pieces rather than large chunks due to messages failing to propagate completely across Usenet.

As it is, Par2 is fairly new and needs far more testing 'in the wild' to find its best application but suffice it to say that with the last vcd posted as a single mpg file (no rar involved), from the feedback received, people only had to download a fraction of what would be required for a Par1 recovery.

I will have a browse of the other thread and see if there is anything else to add to my experiences. Thanks for the link.

Cheers
RB
buzzy
Posts: 4
Joined: Sat May 31, 2003 1:44 am

Post by buzzy »

Not sure you got the right quote to go with your comment, either that or you've missed the point. Take another look at things. That bit you quoted was a specific response to the earlier poster's suggestion that when using rar, one huge rar file be used. Not a good idea. Whether one uses rar or not, is a separate question.

And yes, just about everything people post (video, images, audio) is already compressed using the optimal compression technique. So of course, one doesn't generally use rar for compression of the files being posted to a newsgroup, but rather simply to achieve the advantages of:
- small files, for better completion
- even size files, for use with par

Same effect as using a splitter for an mpg. Whatever.

Keep in mind that not every newsreader supports the DL of incomplete files yet (without relying on an obscure bug/feature) - some newsreaders will, by default, prevent such downloads, which is a positive for many users. So that's a factor in deciding how to post. In addition, lots of users are reluctant to use par2. So posting in a way that enhances the odds of complete files still has some value (both today and in that future world where every newsreader supports downloading incomplete files).

par2 has been tested quite a bit. And me and a couple thousand of my closest friends are already using par2 successfully for newsgroup posts.
RimBlock
Posts: 5
Joined: Sat May 31, 2003 1:04 pm

Post by RimBlock »

Ok, guest was me, my account had expired here as I rarely use it here.

buzzy wrote:
Not sure you got the right quote to go with your comment, either that or you've missed the point. Take another look at things. That bit you quoted was a specific response to the earlier poster's suggestion that when using rar, one huge rar file be used. Not a good idea. Whether one uses rar or not, is a separate question.

Yeah, sort of went off at a tangent there. :lol:

buzzy wrote:
And yes, just about everything people post (video, images, audio) is already compressed using the optimal compression technique. So of course, one doesn't generally use rar for compression of the files being posted to a newsgroup, but rather simply to achieve the advantages of:
- small files, for better completion
- even size files, for use with par (or with par2, for those who cannot DL incomplete files).

Yep most just store rather than compress.

Not sure the smaller file = better completion of the whole post works out.

The way I see it;
A 729Mb Mpg1 file split into 15Mb rar's and yenc encoded with Yenc Post 2002 creates 25 messages * 49 rar files and 19 messages * 1 rar file = 1244 messages posted for the full file.

The same file posted as a 1 part mpg = 1167 messages.

After splitting and raring into smaller parts you would have to post an extra 77 messages (1244 - 1167) to be able to rebuild the entire file. That makes another 77 messages that could potentially go missing compared with posting as an unrared mpg file.

Maybe I am missing something here and if so enlightenment would be most welcome.

buzzy wrote:
Keep in mind that not every newsreader supports the DL of incomplete files yet (without relying on an obscure bug/feature) - some newsreaders will, by default, prevent such downloads, which is a positive for many users. So that's a factor in deciding how to post. In addition, lots of users are reluctant to use par2. So posting in a way that enhances the odds of complete files still has some value (both today and in that future world where every newsreader supports downloading incomplete files).

Ok this is something that I was not aware of as most I have used do support this feature and this could therefore put a completely new spin on it.

buzzy wrote:
par2 has been tested quite a bit. And me and a couple thousand of my closest friends are already using par2 successfully for newsgroup posts.

Ok maybe so but not in the groups I post to and read. Of the people posting there it would seem that only myself and one other are using Par2, and the volume of posts is one of the highest for a Usenet newsgroup. How do you find recovery speed with Par2? I have heard a number of complaints of Quickpar being quite slow compared to Par1 for rebuilding missing parts (more often exhibited by needing to rebuild full rar files).

Nice description of Par2 on the other link. The pics make it easier to understand for people just trying to get to grips with the system. If you have no objection I will add the link to my .nfo files for people new to Par2.

Cheers
RB
buzzy
Posts: 4
Joined: Sat May 31, 2003 1:44 am

Post by buzzy »

In a world where complete files have some value and incomplete files are difficult to manage (because of the software one uses or just unwillingness to use par2) - here's the general idea of why small files will generate more complete files than large files. Suppose a particular newserver has a 1 in 250 chance of not having a given part. In the example above, of a 1200 message post, odds are that 5 parts would go missing.

- If posted as one large file, there is one big incomplete file with five missing parts, which may not be useful to some users. (In fact, the default on many binaries newsreaders is not to even show incompletes. People always complain the files are "missing," not "incomplete." Very easy for par2 to fix, though - only need about 3-4MB of par2 data!)
- If posted as 49 files, there are about 5 incomplete files and 44 complete files. If 11% or more (roughly) par or par2 data has been posted, even someone who doesn't DL incompletes can use par/par2 to complete the file set.
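
A back-of-the-envelope check of the numbers above, assuming every message goes missing independently with the same 1 in 250 chance, and that each of the 49 files is about 25 messages (the figure from the earlier rar example). Real servers tend to lose articles in clumps, so treat this as a rough sketch only.

```python
P_MISSING = 1 / 250  # chance that a given message is missing on the server


def p_file_complete(messages: int, p_missing: float = P_MISSING) -> float:
    """Probability that every message of a file made it to the server."""
    return (1 - p_missing) ** messages


expected_missing = 1200 * P_MISSING       # about 5 parts of the whole post
p_big = p_file_complete(1200)             # one huge file: under 1%
p_small = p_file_complete(25)             # one 25-message file: roughly 90%
expected_incomplete = 49 * (1 - p_small)  # about 5 of the 49 files

print(f"{expected_missing:.1f} parts missing on average")
print(f"single big file complete: {p_big:.2%}")
print(f"one small file complete: {p_small:.2%}")
print(f"{expected_incomplete:.1f} of 49 small files incomplete on average")
```

So the single large file is almost never complete, while around 44 of the 49 small files usually are.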

But again, the value of this depends on whether complete files have value. In the real world, they seem to, because people are lazy, or set in their ways. That may be debatable to some, and should improve over time.

You have found the only real drawback of par2, because it uses blocks vs. files to compute, it slows down the calculations. Used properly, though, on net it should still be far better than par for some uses. In the example above, replacing just the five missing parts with par2 might be about as fast as replacing five missing files with par.

It may be that for certain filetypes - like huge mpgs or disc images - rar/par or splitter/par work well enough and there's some inertia.

But - par2 really should prove far more effective for all filetypes, as (if you get the block size right) you only need to post enough par2 data to replace what's missing on the news servers, there's no duplication of data that's there but unretrievable (in incomplete files) as with par. So that should
- cut upload size/time
- help offset the ever-expanding size of newsgroups and the retention effects

Most of the groups I use are lossless audio, where the file sizes are smaller than video but uneven sizes (10MB-100MB). So par2 is a big step forward, it eliminates the need to use rar solely to get even file sizes for use with par.

Any help you can give others will surely only make this better for all of us. Plus it will make the hardworking developers of the open-source software happy!
RimBlock
Posts: 5
Joined: Sat May 31, 2003 1:04 pm

Post by RimBlock »

Ok so posting split files is a big advantage when people's news readers do not support partial downloads.

So, the news readers just have to allow for partials and we should all be set :lol: .

buzzy wrote:
- cut upload size/time
- help offset the ever-expanding size of newsgroups and the retention effects

And that is why I am pushing the acceptance of Par2 within the groups I post to. For me it is really a no brainer.

Unfortunately I still get people who want a full vcd repost because they are unwilling to rename a few files with a Par1 utility, which would enable recovery with only one Par1 file download. I can just see the complaints with Par2 but as I have said, it is early days in the groups I post to for Par2.

Slow and steady wins the race.
and
Education rather than pampering.

Cheers
RB