Discussion:
BTRFS with more than two parities
Ronny Egner
2014-10-21 06:28:34 UTC
Dear All,

I was wondering what happened to the patch posted by Andrea Mazzoleni
back in February 2014 (this thread:
http://thread.gmane.org/gmane.linux.kernel/1654735).

Why wasn't it added to the code? Is something missing or wrong?

In my opinion the posted patch is awesome and would give btrfs a unique
feature that no other file system / volume manager on Linux or other
UNIX operating systems currently has.


Cheers
Ronny

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Duncan
2014-10-21 11:01:50 UTC
Post by Ronny Egner
Dear All,
I was wondering what happened to the patch posted by Andrea Mazzoleni
back in February 2014 (this thread:
http://thread.gmane.org/gmane.linux.kernel/1654735).
Why wasn't it added to the code? Something missing/wrong?
In my opinion the posted patch is awesome and would enable a unique
feature to btrfs that no file system / volume manager on Linux and other
UNIX-operating system currently has.
That, along with a bunch of other features, is on the longer term
roadmap and will likely eventually be implemented. Actually, the
discussion involves a quite flexible plan where number of
redundancies/parities/strips-per-strip are all configurable along
different/independent axes.

I wouldn't recommend expecting it any time soon, however, as btrfs
features have repeatedly taken far longer to implement and become stable
than originally predicted.

In fact, the big feature I've been waiting for, N-way-mirroring (current
btrfs raid1 mode is 2-way-mirroring regardless of the number of devices;
more devices simply add more capacity), has been on the roadmap for
implementation "right after raid56 mode" for something like two years
now, with raid56 mode originally due to drop in kernel 3.5.
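
To make the 2-way versus N-way distinction concrete, here is a minimal Python sketch, not btrfs code. The function names are made up for illustration, and it assumes equal-sized devices so that usable space is simply raw space divided by the number of copies.

```python
def mirrored_capacity(device_sizes_gb, copies=2):
    """Approximate usable space when every block is stored `copies` times.

    Simplifying assumption (not taken from btrfs itself): all devices are
    the same size, so usable space is raw space divided by the copy count.
    """
    if copies < 1 or copies > len(device_sizes_gb):
        raise ValueError("need at least as many devices as copies")
    if len(set(device_sizes_gb)) != 1:
        raise ValueError("this sketch only handles equal-sized devices")
    return sum(device_sizes_gb) / copies


def tolerated_failures(copies):
    """With `copies` copies of each block, `copies - 1` devices may fail."""
    return copies - 1


four_disks = [1000, 1000, 1000, 1000]

# Current raid1: always two copies, so extra devices only add capacity.
print(mirrored_capacity(four_disks, copies=2), tolerated_failures(2))

# Hypothetical 4-way mirroring: less space, but three failures survivable.
print(mirrored_capacity(four_disks, copies=4), tolerated_failures(4))
```

With four 1000 GB devices, the 2-copy case gives about 2000 GB but still survives only one device failure, which is why N-way-mirroring is a separate feature rather than something extra devices provide today.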

Needless to say, 3.5 came and went with no raid56. So did 3.6 and 3.7
and 3.8. An incomplete implementation finally dropped as of 3.9,
complete in normal operation but lacking a working scrub and reliable
recovery.

That was 3.9, and 3.17 was recently released. Two plus years since the
originally planned drop and 12 kernel series later, a year and a half plus
and 8 kernel series since the original partial implementation drop, raid56
mode is still incomplete, tho some progress has been made.
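
The series counts above can be checked with a few lines of Python. This is only a sketch: the helper name is made up for illustration, and it assumes both versions sit in the same major (3.x) series.

```python
def series_between(start, end):
    """Count kernel series released after `start`, up to and including `end`.

    Assumes both versions share the same major number (e.g. 3.x).
    """
    start_major, start_minor = (int(part) for part in start.split("."))
    end_major, end_minor = (int(part) for part in end.split("."))
    if start_major != end_major:
        raise ValueError("sketch only handles versions within one major series")
    return end_minor - start_minor


# From the originally planned raid56 drop (3.5) to the current release.
print(series_between("3.5", "3.17"))  # 12

# From the partial raid56 implementation (3.9) to the current release.
print(series_between("3.9", "3.17"))  # 8
```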

Altho certainly the devs haven't been idle. /Tremendous/ progress has
been made in general btrfs stability in that time. It's just that, well,
stability /did/ become the overriding focus, and in that time, btrfs has
gone from a definitely experimental filesystem that could and did often
eat data, to one that's still not entirely stable and where backups for
data of any value are still strongly recommended, but that works pretty
well, most of the time for most users, including not only the traditional
filesystem features, but even most of the existing non-traditional
features that are (nearly) btrfs specific.

But the roadmap remains: after completion of raid56 mode, n-way-mirroring
should be next.

And with a bit of luck, with n-way-mirroring will come a redo of the way
btrfs handles mirrors/parity/stripes to fit into the larger framework
with each one on its own axis so they can be even more flexibly
combined. After all, some of that will be needed in order to
accommodate n-way-mirroring anyway, and they might as well redo the
framework for how it's all specified at the same time, since it has
already been discussed and there's a vision there.

Of course with the experience I've had waiting for raid56 and knowing it
wasn't the first feature to take much longer than anticipated, I don't
really expect n-way-mirroring, at least complete and stable enough to be
less risk rather than more compared to the current 2-way-mirroring raid1,
to take much under a year, particularly if it introduces the
raid-framework redo along with it.

But once that is done, plugging in further raid expansions including
3+-way-parity should be a comparatively minor detail.

But, I'd guess we're looking at at /least/ two years out, another couple
kernel cycles anyway to complete raid56, say another year to complete
n-way-mirroring and the raid-framework redo, and another several kernel
cycles after that for 3+-way-parity. So realistically, 2-2.5 years, and
that's assuming no 2+-year delays on n-way-mirroring as happened with
raid56, and integration of the raid-framework redo into the same year's
time I'm allowing for n-way-mirroring. And given project history, that
could /easily/ stretch to 5+ years.

So bottom line, as I said up top, it's on the roadmap, but don't expect
it any time soon. Realistically at least two years out, and it could be
five...

Unless of course you have a spare kernel and filesystems developer or
two, along with their sponsorship, to dedicate to the task. Tho even
then, given the time it'd take them to come up to speed and the testing
the new raid-framework would take, a year and a half to two years out
wouldn't be unreasonable.

FWIW, the other, more mature but not fully GPLv2 kernel license
compatible alternative is Sun/Oracle's ZFS. It's a mature filesystem
with many promised btrfs features already implemented and long mature,
but choosing it does mean either choosing a non-mainline kernel module
with questionable legal issues (or the slower userspace code), or
choosing a kernel other than Linux -- one of the implementing BSDs or
Solaris. That's the most viable current option for some would-be btrfs
users, tho it's not so viable for others, for various reasons.

The other more general raid solution would be to get the n-way-parity
code into the kernel's md- or dmraid implementations. I've no idea what
the status is there, but presumably they're considering it, and given
btrfs implementation timetables, they may well have it implemented and
stable long before btrfs does.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

Ronny Egner
2014-10-21 13:41:03 UTC
Post by Duncan
Post by Ronny Egner
Dear All,
I was wondering what happened to the patch posted by Andrea Mazzoleni
back in February 2014 (this thread:
http://thread.gmane.org/gmane.linux.kernel/1654735).
Why wasn't it added to the code? Something missing/wrong?
In my opinion the posted patch is awesome and would enable a unique
feature to btrfs that no file system / volume manager on Linux and other
UNIX-operating system currently has.
Post by Duncan
That, along with a bunch of other features, is on the longer term
roadmap and will likely eventually be implemented. Actually, the
discussion involves a quite flexible plan where number of
redundancies/parities/strips-per-strip are all configurable along
different/independent axes.
I wouldn't recommend expecting it any time soon, however, as btrfs
features have repeatedly taken far longer to implement and become stable
than originally predicted.
Understood. But what is the point of having a file system / volume manager
combination if it doesn't offer the required redundancy levels? For me that
includes parity with more than one or two disks. Or parity-based
redundancy *at all*, given the current status of the raid56 implementation.

Mirroring is not the solution to everything. Many use cases need many
disks and a lot of space. And parity-based redundancy is definitely needed
right from the start, imho.
Post by Duncan
Altho certainly the devs haven't been idle. /Tremendous/ progress has
been made in general btrfs stability in that time. It's just that, well,
stability /did/ become the overriding focus, and in that time, btrfs has
gone from a definitely experimental filesystem that could and did often
eat data, to one that's still not entirely stable and where backups for
data of any value are still strongly recommended, but that works pretty
well, most of the time for most users, including not only the traditional
filesystem features, but even most of the existing non-traditional
features that are (nearly) btrfs specific.
Yes, btrfs was (not sure what the status in later kernels 3.17 and 3.18 is)
not very stable. Stability certainly has priority.
Post by Duncan
FWIW, the other, more mature but not fully GPLv2 kernel license
compatible alternative is Sun/Oracle's ZFS. It's a mature filesystem
with many promised btrfs features already implemented and long mature,
but choosing it does mean either choosing a non-mainline kernel module
with questionable legal issues (or the slower userspace code), or
choosing a kernel other than Linux -- one of the implementing BSDs or
Solaris. That's the most viable current option for some would-be btrfs
users, tho it's not so viable for others, for various reasons.
I played with it. It has a rock solid parity implementation but some other
downsides, like the requirement for ECC memory, a large memory footprint,
sub-optimal interaction with the Linux memory management, and the I/O
characteristics that come from the nature of RAIDZ (the total number of
I/Os a raid group (called a 'vdev') can do is exactly the number of I/Os
one disk is able to do, regardless of the number of disks in the vdev),
and so on.
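
As a rough illustration of that RAIDZ behaviour, here is a minimal Python sketch of the rule of thumb. The function name and the 100 IOPS per-disk figure are hypothetical, and the model additionally assumes that a pool's random I/O scales with its number of vdevs, which is not stated above.

```python
def pool_random_iops(vdevs, disks_per_vdev, iops_per_disk):
    """Rule-of-thumb random IOPS for a pool built from RAIDZ vdevs.

    Each vdev is modelled as delivering the random IOPS of one member
    disk, so `disks_per_vdev` deliberately does not enter the result.
    Assumption: the pool's IOPS add up across its vdevs.
    """
    if vdevs < 1 or disks_per_vdev < 1:
        raise ValueError("need at least one vdev with at least one disk")
    return vdevs * iops_per_disk


# Twelve hypothetical disks at 100 random IOPS each, arranged two ways.
print(pool_random_iops(vdevs=1, disks_per_vdev=12, iops_per_disk=100))
print(pool_random_iops(vdevs=2, disks_per_vdev=6, iops_per_disk=100))
```

Under this model, putting all twelve disks into one wide vdev gives about the random I/O of a single disk, while splitting them into two vdevs doubles it at the cost of more parity disks.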
Post by Duncan
The other more general raid solution would be to get the n-way-parity
code into the kernel's md- or dmraid implementations. I've no idea what
the status is there, but presumably they're considering it, and given
btrfs implementation timetables, they may well have it implemented and
stable long before btrfs does.
I've asked on the kernel mailing list as well, since it would be a waste
to throw this not entirely simple work away.

(When replying, please CC me. Thanks)


Ronny
