Parity drive takes up more space than any other drive?

(4 posts)
  1. rust0r
    Member

    I don't think this has been covered before (at least I wasn't able to find it). It has occurred under both 0.16 and 0.21, so it isn't limited to the newest version.

    I'm running 5 data drives and 1 parity drive (all identical 1 TB WD Blacks).

    Having recreated my parity, the parity drive sits at 134 GB free, while the fullest data drive shows 137 GB free (so there is more data on my parity drive than on the fullest of my data drives).

    My understanding is that this shouldn't be able to happen at all. If things were reversed, it could be attributed to, say, hidden files, but I am having the opposite problem.

    Any ideas?

    Posted 5 months ago #
  2. Phatty2x4
    Member

    Actually, this has been covered several times, though it is buried in different posts.

    disParity does not protect at the file system level (the way unRAID does). disParity protects data sets. Even though you are setting up a drive to be protected, it is still a data set. A small amount of additional metadata is captured along with the protection.

    As a very simplified example, think of disParity's protection as follows:
    {Desired data + metadata} = {parity set}, where the metadata size depends on the amount of data being protected. So if you want to protect your 1 TB drive, you are actually protecting 1 TB of data plus the metadata disParity creates with the protection. This is especially true if you have a lot of small files being protected.

    In one of the previous posts, Roland has a good formula for estimating usage expectations. If I find it, I'll add it to my post.

    Posted 5 months ago #
  3. Roland
    Key Master

    Parity will be slightly larger for two reasons. One is the meta data, as Phatty says. Meta data is the small amount of additional information disParity needs to store (file names, sizes, dates, etc.) to keep track of what is stored where in the parity. Unless you have a very large number of files (millions) this meta data will typically be very small, usually just a few MB. Meta data is stored in the files*.dat files on the parity drive.

    The larger contributor is the overhead from rounding each file up to a whole number of 64K blocks when storing it in parity. Since the end of a file will land at a random place somewhere in its last 64K block, you can estimate that each file will "waste" on average an additional 32K of parity space. So the overhead formula Phatty mentioned is (total # of files on all drives) x 32K.
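
    To illustrate the rounding, here is a minimal Python sketch. Only the 64K block size comes from the description above; the function names and sample file sizes are made up for illustration, and this is not disParity's actual code.

```python
BLOCK_SIZE = 64 * 1024  # 64K block size, as described in the post


def padded_size(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Round a file size (in bytes) up to a whole number of blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def padding(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes of parity space 'wasted' after the end of the file."""
    return padded_size(file_size, block_size) - file_size


# Hypothetical file sizes in bytes, chosen only to show the effect.
for size in (1, 65_536, 70_000, 1_000_000):
    print(f"{size:>9} bytes -> {padded_size(size):>9} in parity, "
          f"{padding(size):>6} wasted")
```

    A file that ends exactly on a block boundary wastes nothing and a 1-byte file wastes almost a full block, which is where the average of about 32K per file comes from.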

    -Roland

    Posted 5 months ago #
  4. Roland
    Key Master

    Oops, I've since realized that the above formula isn't exactly right. It's true that each file has an average of 32K of overhead in parity, but it's also the case that the overhead from each data drive overlaps with that of all the others in parity.

    So what you really have to do is determine which of your data drives has the most files. The parity overhead will then be 32K times the number of files on that one drive, not the total number of files on all drives.
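
    As a rough sketch of that estimate in Python: the 32K average comes from the posts above, while the function name, drive labels, and file counts are hypothetical. It assumes you have already counted the files on each data drive.

```python
from typing import Mapping

AVG_OVERHEAD_PER_FILE = 32 * 1024  # average 32K wasted per file, per the thread


def estimate_parity_overhead(files_per_drive: Mapping[str, int]) -> int:
    """Estimate parity overhead in bytes.

    Uses only the drive with the most files, because the per-drive
    overheads overlap in parity rather than adding up.
    """
    if not files_per_drive:
        return 0
    return max(files_per_drive.values()) * AVG_OVERHEAD_PER_FILE


# Hypothetical file counts per data drive.
counts = {"drive1": 40_000, "drive2": 95_000, "drive3": 12_000}
overhead = estimate_parity_overhead(counts)
print(f"Estimated parity overhead: {overhead / 1024**3:.2f} GiB")
```

    This is only an average-case estimate; the real figure depends on where each file happens to end within its last block.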

    Posted 5 months ago #
