btrfs differential receive has become excruciatingly slow on one machine
Marc MERLIN
2014-09-08 01:51:24 UTC
Hi Filipe and others,

TL;DR:
- a full btrfs send of a 1.5GB filesystem takes 4 min to replicate that
filesystem on the target machine.
- btrfs send -p between 2 snapshots happens in 1 second
- copying that diff to the other machine takes 8 seconds
- applying that diff with btrfs receive takes 82 min (while resyncing
the entire filesystem from scratch only takes 4 min)

Details:
Kernel 3.16.2 on both sides, although the problem started earlier, maybe
after 3.14.

btrfs send/receive works great on my laptop from one drive to another,
but it's unbearable when I run it against my disk server (which was on 3.14
up to today due to the bugs in 3.15 and 3.16.x, x<2).
legolas:/mnt/btrfs_pool1# time btrfs send root_ggm_daily_ro.20140907_13* | ssh gargamel "time btrfs receive /mnt/btrfs_pool2"
At subvol root_ggm_daily_ro.20140907_13:16:58
At subvol root_ggm_daily_ro.20140907_13:16:58
real 4m41.392s
user 0m1.380s
sys 0m9.504s
real 4m41.685s
user 0m4.580s
sys 0m16.844s
By contrast, an incremental 154MB diff takes almost 82 min for btrfs receive to process.


The differential btrfs send is fast:
legolas:/mnt/btrfs_pool1# time btrfs send -p /mnt/btrfs_pool1/root_ggm_daily_ro.20140906_10:25:25 root_ggm_daily_ro.20140907_14:55:40 > /mnt/btrfs_pool2/diff
At subvol root_ggm_daily_ro.20140907_14:55:40

real 0m1.464s
user 0m0.004s
sys 0m1.172s

However, btrfs receive of that diff is slow as molasses.

The diff is smallish:
legolas:/mnt/btrfs_pool1# du -sh /mnt/btrfs_pool2/diff
154M /mnt/btrfs_pool2/diff

The subvolume is small too:
legolas:/mnt/btrfs_pool1# du -sh root
1.5G root

Copying that diff file over ssh takes 8 seconds, and doing the
receive manually is where the wait happens:

gargamel:/mnt/btrfs_pool2/backup/debian64/legolas# time btrfs receive -f /tmp/diff .
At snapshot root_ggm_daily_ro.20140907_14:55:40


strace shows that it hangs, from fractions of a second to multiple seconds, on reads from fd 3:
btrfs 4467 root 3r REG 0,27 160962420 13136056 /tmp/diff

/tmp/diff is on a different filesystem and different drives than the
btrfs FS the restore is happening on.

Here is an strace excerpt showing how it hangs for multiple seconds at the
read that pulls from the diff file:
0.000114 read(3, "h\0\0\0\21\0\305\341\244\266", 10) = 10
0.000059 read(3, "\17\0X\0lib/modules/3.16.2-amd64-i91"..., 104) = 104
0.000061 truncate("/mnt/btrfs_pool2/backup/debian64/legolas/root_ggm_daily_ro.20140907_14:55:40/lib/modules/3.16.2-amd64-i915-preempt-20140714/kernel/arch/x86/crypto/blowfish-x86_64.ko", 20304) = 0
91.915793 read(3, "t\0\0\0\23\0\v\370\222\244", 10) = 10
0.000066 read(3, "\17\0X\0lib/modules/3.16.2-amd64-i91"..., 116) = 116
0.000061 lchown("/mnt/btrfs_pool2/backup/debian64/legolas/root_ggm_daily_ro.20140907_14:55:40/lib/modules/3.16.2-amd64-i915-preempt-20140714/kernel/arch/x86/crypto/blowfish-x86_64.ko", 0, 0) = 0
0.000125 read(3, "h\0\0\0\22\0Xa\r\271", 10) = 10
0.000050 read(3, "\17\0X\0lib/modules/3.16.2-amd64-i91"..., 104) = 104
0.000055 chmod("/mnt/btrfs_pool2/backup/debian64/legolas/root_ggm_daily_ro.20140907_14:55:40/lib/modules/3.16.2-amd64-i915-preempt-20140714/kernel/arch/x86/crypto/blowfish-x86_64.ko", 0644) = 0

Note the 91-second gap above; thankfully it's usually less.
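
(For reference, a trace with these relative timestamps can be captured with
something along these lines; strace's -r prints the delta since the previous
syscall, -f follows children, and -o writes the log to a file:)

strace -r -f -o /tmp/receive.strace btrfs receive -f /tmp/diff .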

While strace shows the wait on the read, I'm going to assume the real hang
happens when btrfs receive hands the bytes to the kernel to write to disk.
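
(One crude way to test that assumption would be to watch the dirty/writeback
page counters while the receive runs; if they stay pegged during the stalls,
the reader is presumably being throttled by writeback:)

watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'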

The target filesystem looks like this:
Label: 'dshelf2' uuid: d4a51178-c1e6-4219-95ab-5c5864695bfd
Total devices 1 FS bytes used 3.53TiB
devid 1 size 7.28TiB used 3.66TiB path /dev/mapper/dshelf2

Data, single: total=3.50TiB, used=3.47TiB
System, DUP: total=8.00MiB, used=416.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=77.50GiB, used=62.31GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=512.00MiB, used=8.64MiB

So it's not critically unbalanced, since it's only 50% used, correct?

For good measure, I ran:
btrfs balance start -dusage=10 /mnt/btrfs_pool2
which exited with:
Done, had to relocate 5 out of 3746 chunks
but that did not help the receive speed at all.


btrfs receive -v -v -v shows it's doing work, but slowly enough that I can easily read the text going by:
truncate sbin/swaplabel size=14568
chown sbin/swaplabel - uid=0, gid=0
chmod sbin/swaplabel - mode=0755
utimes sbin/swaplabel
mkfile o85871-279575-0
rename o85871-279575-0 -> sbin/raw
utimes sbin
truncate sbin/raw size=10336
chown sbin/raw - uid=0, gid=0
chmod sbin/raw - mode=0755
utimes sbin/raw
mkfile o85872-279575-0
rename o85872-279575-0 -> sbin/fsck.cramfs
utimes sbin
truncate sbin/fsck.cramfs size=14544
chown sbin/fsck.cramfs - uid=0, gid=0
chmod sbin/fsck.cramfs - mode=0755
utimes sbin/fsck.cramfs
mkfile o85873-279575-0
rename o85873-279575-0 -> sbin/mkfs.cramfs
utimes sbin
truncate sbin/mkfs.cramfs size=31120
chown sbin/mkfs.cramfs - uid=0, gid=0
chmod sbin/mkfs.cramfs - mode=0755
utimes sbin/mkfs.cramfs
mkfile o85874-279575-0
rename o85874-279575-0 -> sbin/findfs
utimes sbin
truncate sbin/findfs size=6208
chown sbin/findfs - uid=0, gid=0
chmod sbin/findfs - mode=0755
utimes sbin/findfs
mkfile o85875-279575-0
rename o85875-279575-0 -> sbin/wipefs
utimes sbin

Here is a full run:
gargamel:/mnt/btrfs_pool2/backup/debian64/legolas# time btrfs receive -v -v -v -f /tmp/diff .
At snapshot root_ggm_daily_ro.20140907_14:55:40
receiving snapshot root_ggm_daily_ro.20140907_14:55:40 uuid=8bfc0ac8-6436-614d-8a56-7f7894f743b8, ctransid=281821 parent_uuid=075e6ff5-9b87-874f-991c-b453a0f5d4c0, parent_ctransid=278891
utimes tmp
utimes bin
unlink bin/dmesg
utimes bin
unlink bin/lsblk
utimes bin
unlink bin/more
utimes bin
unlink bin/tailf
(...)
BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=8bfc0ac8-6436-614d-8a56-7f7894f743b8, stransid=281821

real 82m55.222s
user 0m0.236s
sys 0m1.816s


Any idea what's going on?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Filipe David Manana
2014-09-08 21:49:01 UTC
Post by Marc MERLIN
Hi Filipe and others,
- a full btrfs send of a 1.5GB filesystem takes 4 min to replicate that
filesystem on the target machine.
- btrfs send -p between 2 snapshots happens in 1 second
- copying that diff to the other machine takes 8 seconds
- applying that diff with btrfs receive takes 82 min (while resyncing
the entire filesystem from scratch only takes 4 min)
Kernel 3.16.2 on both sides, although the problem started earlier, maybe
after 3.14.
btrfs send/receive works great on my laptop from one drive to another,
but it's unbearable when I run it against my disk server (which was on 3.14
up to today due to the bugs in 3.15 and 3.16.x, x<2).
legolas:/mnt/btrfs_pool1# time btrfs send root_ggm_daily_ro.20140907_13* | ssh gargamel "time btrfs receive /mnt/btrfs_pool2"
At subvol root_ggm_daily_ro.20140907_13:16:58
At subvol root_ggm_daily_ro.20140907_13:16:58
real 4m41.392s
user 0m1.380s
sys 0m9.504s
real 4m41.685s
user 0m4.580s
sys 0m16.844s
By contrast, an incremental 154MB diff takes almost 82 min for btrfs receive to process.
legolas:/mnt/btrfs_pool1# time btrfs send -p /mnt/btrfs_pool1/root_ggm_daily_ro.20140906_10:25:25 root_ggm_daily_ro.20140907_14:55:40 > /mnt/btrfs_pool2/diff
At subvol root_ggm_daily_ro.20140907_14:55:40
real 0m1.464s
user 0m0.004s
sys 0m1.172s
However, btrfs receive of that diff is slow as molasses.
legolas:/mnt/btrfs_pool1# du -sh /mnt/btrfs_pool2/diff
154M /mnt/btrfs_pool2/diff
legolas:/mnt/btrfs_pool1# du -sh root
1.5G root
Copying that diff file over ssh takes 8 seconds, and doing the
receive manually is where the wait happens:
gargamel:/mnt/btrfs_pool2/backup/debian64/legolas# time btrfs receive -f /tmp/diff .
At snapshot root_ggm_daily_ro.20140907_14:55:40
strace shows that it hangs, from fractions of a second to multiple seconds, on reads from fd 3:
btrfs 4467 root 3r REG 0,27 160962420 13136056 /tmp/diff
/tmp/diff is on a different filesystem and different drives than the
btrfs FS the restore is happening on.
Here is an strace excerpt showing how it hangs for multiple seconds at the read:
0.000114 read(3, "h\0\0\0\21\0\305\341\244\266", 10) = 10
0.000059 read(3, "\17\0X\0lib/modules/3.16.2-amd64-i91"..., 104) = 104
0.000061 truncate("/mnt/btrfs_pool2/backup/debian64/legolas/root_ggm_daily_ro.20140907_14:55:40/lib/modules/3.16.2-amd64-i915-preempt-20140714/kernel/arch/x86/crypto/blowfish-x86_64.ko", 20304) = 0
91.915793 read(3, "t\0\0\0\23\0\v\370\222\244", 10) = 10
0.000066 read(3, "\17\0X\0lib/modules/3.16.2-amd64-i91"..., 116) = 116
0.000061 lchown("/mnt/btrfs_pool2/backup/debian64/legolas/root_ggm_daily_ro.20140907_14:55:40/lib/modules/3.16.2-amd64-i915-preempt-20140714/kernel/arch/x86/crypto/blowfish-x86_64.ko", 0, 0) = 0
0.000125 read(3, "h\0\0\0\22\0Xa\r\271", 10) = 10
0.000050 read(3, "\17\0X\0lib/modules/3.16.2-amd64-i91"..., 104) = 104
0.000055 chmod("/mnt/btrfs_pool2/backup/debian64/legolas/root_ggm_daily_ro.20140907_14:55:40/lib/modules/3.16.2-amd64-i915-preempt-20140714/kernel/arch/x86/crypto/blowfish-x86_64.ko", 0644) = 0
Note the 91-second gap above; thankfully it's usually less.
Hi Marc,

Does the sum of all reads from the stream file (fd 3) get anywhere
close to the total btrfs receive time (or even more than 50%)?
Can you paste the full output of strace (with the -T option) somewhere?

thanks
--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men."
Marc MERLIN
2014-09-15 00:18:36 UTC
Post by Filipe David Manana
Hi Marc,
Does the sum of all reads from the stream file (fd 3) get anywhere
close to the total btrfs receive time (or even more than 50%)?
Can you paste the full output of strace (with the -T option) somewhere?
Sorry for the delayed answer; I lost the snapshot I used for that mail,
so it was not possible to redo the test easily.
Because my backups were so hopelessly behind, I did a full resync of
/var, i.e. not a differential send (300GB or so). The copy went at about
25GB/h, which wasn't bad at all since it was over wifi (it took about 14 hours).

I'm still waiting for the next incremental backup to run.
Once it's done, I'll try to run a smaller one under strace so I don't
have a ridiculously long log to give you.

Is there a reasonable way to know if btrfs receive is indeed hanging in
kernel code, and whether it's in the receive code or in the code that
writes the files to the filesystem in the new snapshot?
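
(I suppose I could sample the kernel stack of the receive process while it's
stuck; /proc/<pid>/stack shows where in the kernel a task is blocked, and
sysrq-w dumps all blocked tasks to dmesg. The pidof below assumes a single
btrfs process is running:)

cat /proc/$(pidof btrfs)/stack
echo w > /proc/sysrq-trigger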

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Marc MERLIN
2014-09-15 17:57:47 UTC
Post by Marc MERLIN
I'm still waiting for the next incremental backup to run.
Once it's done, I'll try to run a smaller one under strace so I don't
have a ridiculously long log to give you.
Is there a reasonable way to know if btrfs receive is indeed hanging in
kernel code, and whether it's in the receive code or in the code that
writes the files to the filesystem in the new snapshot?
Ok, so my incremental sync of /var, which I had _just_ fully copied over,
took over 12 hours and then died with:

ABORT: btrfs send /mnt/btrfs_pool1/var_ggm_daily_ro.20140913_14:50:24 var_ggm_daily_ro.20140914_19:16:48 | ssh gargamel.merlins.org btrfs receive /mnt/btrfs_pool2/backup/debian64/legolas/ failed
Write failed: Broken pipe

I don't think my ssh connection went down, since I was at home on
reliable wireless and my other connections were fine during that time,
but you never know.

Unfortunately, due to my own mistake, the strace logging failed, but it looks
like I got about 12 hours' worth of the output below. Interestingly, that
directory is NOCOW (VM images).
Does btrfs send get unhappy with NOCOW?
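
(For reference, the NOCOW flag can be checked and set like this; note that
chattr +C only takes effect for files created after the flag is set on the
directory, existing files stay COW:)

lsattr -d "local/nobck/VirtualBox VMs"    # a 'C' attribute means NOCOW
chattr +C "local/nobck/VirtualBox VMs"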

Thanks,
Marc

read(0, "\17\0(\0local/nobck/VirtualBox VMs/W"..., 49212) = 15511 <0.000023>
read(0, "\336\350A\340\2@\377p\20\377u\360\3\36T\245\241\25\307\2033\314E`\20,f\20\25C$4"..., 33701) = 32768 <0.117252>
read(0, "\210:\214:\220:\224:\0\230:\234:\240:\244:\0\250:\254:\260:\264:\0\270:\304:\310"..., 933) = 933 <0.369642>
pwrite(5, "`\273\0}\300\271\f\0\0\0003\0\300\363\253\213\3163\300\211\0E\350\211U\274\213\331\213\0E\274"..., 49152, 32698048512) = 49152 <0.000086>
read(0, "<\260\0\0\17\0\224\22\261\v", 10) = 10 <0.000013>
read(0, "\17\0(\0local/nobck/VirtualBox VMs/W"..., 45116) = 31825 <0.000024>
read(0, "\325\260\3 \270\3T\270\3\340y\7\201\1\\\350\16\262\3\241\0\261\3\207\260\0038\271\0\354Z\347"..., 13291) = 13291 <0.228846>
pwrite(5, "\363\273\0\0\0\0\350\30(\353\377\0\213F\4\211E\340\213N@\0209\t\350\10\227\0x\330\0\213"..., 45056, 32698114048) = 45056 <0.000060>
read(0, "<\260\0\0\17\0(-\275\365", 10) = 10 <0.000044>
read(0, "\17\0(\0local/nobck/VirtualBox VMs/W"..., 45116) = 19467 <0.000021>
read(0, "\3\341\23P\25\241\3\360\240\3\211b\10\205\311u\0>\213G\10\203x(\0\4t5\0\1\213@"..., 25649) = 25649 <0.176543>
pwrite(5, "\226\272\10\377j\1\1\20\213\323\213\316\0\350\310N\352\377\211GH\0\351\37\1\0\0\213N(\0009"..., 45056, 32698179584) = 45056 <0.000071>
read(0, "<`\0\0\17\0\tmn,", 10) = 10 <0.000043>
read(0, "\17\0(\0local/nobck/VirtualBox VMs/W"..., 24636) = 7109 <0.000019>
read(0, "\377\307A\10 \243\377\240\7\300m\352^\200\2\2140\3P0\2\204\1\245\3\1\320\31\314\314D\30"..., 17527) = 17527 <0.113324>
pwrite(5, "P\272\0\n\351\377\213G\4\213@\20 \211E\360\2\200\10\211E\0\3543\366\203}\360\0~H\\"..., 24576, 32698245120) = 24576 <0.000055>
read(0, "<\200\0\0\17\0\346\vV\270", 10) = 10 <0.000030>
read(0, "\17\0(\0local/nobck/VirtualBox VMs/W"..., 32828) = 15231 <0.000017>
read(0, "\350\24\202b\243(\206@\n\213E\350H\377p\10\301\26\213\310\240\26'\10\261\377\377\5f\213~T"..., 17597) = 17597 <0.165656>
pwrite(5, "\200\320u\327\205\366~u\251%D\v\372\240+\310\213\201#\36\350\320\213E\350\213!%\360`\234\300"..., 32768, 32699318272) = 32768 <0.000057>
read(0, "<@\0\0\17\0\221\256ob", 10) = 10 <0.000015>
read(0, "\17\0(\0local/nobck/VirtualBox VMs/W"..., 16444) = 15161 <0.000015>
read(0, "\210V<4a\273\10\23\372\240\227o\20\332\267\30S\375\2628oy\271\370yj\211ryr\335\177"..., 1283) = 1283 <0.190593>
pwrite(5, "\342\377\303\320vr\21\364zs|N\302\36\262\20N\203\224\212R\315\222S)\231E\255\344\265X\211"..., 16384, 32699351040) = 16384 <0.000050>
read(0, "<\300\0\0\17\0:\205\337\336", 10) = 10 <0.000015>
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Marc MERLIN
2014-09-16 23:57:42 UTC
Post by Marc MERLIN
Unfortunately, due to my own mistake, the strace logging failed, but it looks
like I got about 12 hours' worth of the output below. Interestingly, that
directory is NOCOW (VM images).
Does btrfs send get unhappy with NOCOW?
So I ran another differential send/receive last night, and it again took
10 or 12 hours.
strace shows most of the time was spent on my VirtualBox images, which
did not change at all.

The diff itself was quick to generate, only 2 min:
legolas:/mnt/btrfs_pool1# time btrfs send -p var_ggm_daily_ro.20140913_14:50:24 var_ggm_daily_ro.20140915_22:57:33 > /tmp/diff
At subvol var_ggm_daily_ro.20140915_22:57:33
real 2m14.365s
user 0m0.400s
sys 1m10.192s
legolas:/mnt/btrfs_pool1# du -sh /tmp/diff
11G /tmp/diff

I have a filtered log showing every system call that took more than 1 sec;
that list is small:
http://marc.merlins.org/tmp/btrfs_receive.log
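
(The filter is essentially a grep for the trailing elapsed-time field that
strace -T appends to each line; something like this, file names hypothetical:)

grep -E '<[1-9][0-9]*\.[0-9]+>$' btrfs_receive.log.full > btrfs_receive.log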

Most of the time is apparently death by a thousand cuts: many, many
system calls spent receiving my virtual machine images, which didn't
change.

Here's the full strace log if you wish
http://marc.merlins.org/tmp/btrfs_receive.log.xz

What do you think?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Marc MERLIN
2014-09-17 15:00:02 UTC
kernel: 3.16.2
Post by Marc MERLIN
I have a filtered log showing every system call that took more than 1 sec:
http://marc.merlins.org/tmp/btrfs_receive.log
Most of the time is apparently death by a thousand cuts: many, many
system calls spent receiving my virtual machine images, which didn't
change.
Here's the full strace log if you wish
http://marc.merlins.org/tmp/btrfs_receive.log.xz
Ok, so while debugging this further, I found out that my VM images were
not NOCOW anymore (they used to be, but this must have been lost during a
restore).

Problems:
filefrag on my vbox image ate up all of my RAM and swap (32GB) and killed my
machine before it could finish.

Setting the directory to +C (NOCOW) and copying the vbox image back from
backup (after deleting the fragmented one) took much longer to start than it
should have (the destination had a file size of 0 for a long time), but it
finished overnight.

The next morning (now), I see multiple tasks blocked and a kworker at the
top of the list:
INFO: task kworker/u16:6:21880 blocked for more than 120 seconds.
Tainted: G O 3.16.2-amd64-i915-preempt-20140714 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u16:6 D 0000000000000000 0 21880 2 0x00000080
Workqueue: writeback bdi_writeback_workfn (flush-btrfs-2)
ffff88012f87b9d0 0000000000000046 ffff88012f87b9a0 ffff88012f87bfd8
ffff8800139c0490 00000000000140c0 ffff88041d2140c0 ffff8800139c0490
ffff88012f87ba70 0000000000000002 ffffffff81106441 ffff88012f87b9e0
Call Trace:
[<ffffffff81106441>] ? wait_on_page_read+0x3c/0x3c
[<ffffffff8163a889>] schedule+0x6e/0x70
[<ffffffff8163aa2b>] io_schedule+0x60/0x7a
[<ffffffff8110644f>] sleep_on_page+0xe/0x12
[<ffffffff8163adbb>] __wait_on_bit_lock+0x46/0x8a
[<ffffffff8110650a>] __lock_page+0x69/0x6b
[<ffffffff81087ba4>] ? autoremove_wake_function+0x34/0x34
[<ffffffff8124aead>] lock_page+0x1e/0x21
[<ffffffff8124ecb9>] extent_write_cache_pages.isra.16.constprop.31+0x10e/0x2c3
[<ffffffff8124f2a2>] extent_writepages+0x4b/0x5c
[<ffffffff81238e7c>] ? btrfs_submit_direct+0x3f9/0x3f9
[<ffffffff81079658>] ? preempt_count_add+0x78/0x8d
[<ffffffff81237568>] btrfs_writepages+0x28/0x2a
[<ffffffff81110efe>] do_writepages+0x1e/0x2c
[<ffffffff811814db>] __writeback_single_inode+0x7d/0x238
[<ffffffff81182213>] writeback_sb_inodes+0x1eb/0x339
[<ffffffff811823d5>] __writeback_inodes_wb+0x74/0xb7
[<ffffffff81182550>] wb_writeback+0x138/0x293
[<ffffffff81182b88>] bdi_writeback_workfn+0x19a/0x329
[<ffffffff81068bf7>] process_one_work+0x195/0x2d2
[<ffffffff81068fd8>] worker_thread+0x275/0x352
[<ffffffff81068d63>] ? process_scheduled_works+0x2f/0x2f
[<ffffffff8106e3a9>] kthread+0xae/0xb6
[<ffffffff8106e2fb>] ? __kthread_parkme+0x61/0x61
[<ffffffff8163d8fc>] ret_from_fork+0x7c/0xb0
[<ffffffff8106e2fb>] ? __kthread_parkme+0x61/0x61

Hung tasks (sysrq-w) are here:
http://marc.merlins.org/tmp/btrfs_hang-3.16.2.txt

I'm going to purge that fragmented vbox image from all my snapshots and reboot,
but clearly things are going wrong here.
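
(Since the snapshots are read-only, purging the file from them means making
each one writable first; roughly, using btrfs-progs' property command, with
an illustrative snapshot path and image name:)

btrfs property set -ts var_ggm_daily_ro.20140913_14:50:24 ro false
rm "var_ggm_daily_ro.20140913_14:50:24/local/nobck/VirtualBox VMs/disk.vdi"
btrfs property set -ts var_ggm_daily_ro.20140913_14:50:24 ro true

Note that modifying a snapshot this way desynchronizes it from the copy on
the receive side, so it can no longer safely serve as a -p parent.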

Filipe, sorry for the initial bad problem report. While I can't see exactly
how it's related, it looks like btrfs receive of heavily fragmented files
can take 12 hours or more.
That may not be as important to fix as the main problems heavy fragmentation
still causes btrfs.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Marc MERLIN
2014-09-17 17:13:22 UTC
Post by Marc MERLIN
kernel: 3.16.2
Grumble, I messed up the subject line.
I meant that COW on virtual disk images causes the problems described in the
previous email.
Obviously I already knew that COW on disk images isn't ideal, but the
effects in 3.16 are still fairly severe.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/