Discussion:
3.17.1 blocked task
Paul Jones
2014-10-18 12:00:09 UTC
Chris Murphy
2014-10-18 15:17:52 UTC
Hi All,

Just found this stack trace in dmesg while running a scrub on one of
my file systems. I haven’t seen this reported yet so I thought
I should report it ☺

Suggest reproducing and issuing sysrq+w, which will show the blocked
tasks (in dmesg).
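
A minimal sketch of one way to do this without a keyboard, assuming a Linux system and root privileges: writing "w" to /proc/sysrq-trigger has the same effect as pressing SysRq+W. The function name and the `trigger` parameter are illustrative and not part of the original suggestion.

```python
from pathlib import Path

# Standard procfs entry for triggering SysRq actions (root only).
SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")


def dump_blocked_tasks(trigger=SYSRQ_TRIGGER):
    """Ask the kernel to log all blocked (uninterruptible) tasks."""
    # "w" is the SysRq command that dumps tasks in the blocked state;
    # the result appears in the kernel log, readable with dmesg.
    Path(trigger).write_text("w")
```

After calling `dump_blocked_tasks()` as root, the blocked-task list should show up at the end of the dmesg output.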

Chris Murphy

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Robert White
2014-10-18 16:02:12 UTC
Just found this stack trace in dmesg while running a scrub on one of
my file systems. I haven’t seen this reported yet so I thought
I should report it ☺
All filesystems are raid1.
...
[ 5396.970316] INFO: task kworker/u16:8:7540 blocked for more than 120 seconds.
[ 5396.970318] Not tainted 3.17.1-gentoo-r1 #1
[ 5396.970319] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 5396.970319] kworker/u16:8 D ffff880302e4a2a0 0 7540 2 0x00000000
[ 5396.970325] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-3)
...
(1) This backtrace is harmless in terms of system integrity. It is
purely advisory. As the message itself says, "echo 0 > /proc/sys/..."
disables it.

(2) I reported a similarly long transaction on the first mount of a
BTRFS image that I'd just converted from EXT4, thinking it was an error.

(3) I'd destroyed my system by thinking this panic-looking backtrace
was actually a panic and resetting my box because _I_ panicked. /doh!
[The system was building the initial csum tree or something, and turning
it off during that made an unrecoverable mess. Having only part of your
csum tree, it turns out, is "bad". I saved my data by doing a "btrfs
restore".]

(4) Someone else had to point out to me that the message was purely
informative and that its emission doesn't affect process outcome at all.
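
For anyone who wants to tell these advisory warnings apart from real trouble when skimming a log, here is a rough sketch that picks the "blocked for more than N seconds" lines out of dmesg text. The regular expression is based only on the message format quoted above; the `HungTask` type and `find_hung_tasks` name are made up for illustration, and other kernel versions may word the message differently.

```python
import re
from typing import NamedTuple


class HungTask(NamedTuple):
    comm: str      # task name, e.g. "kworker/u16:8"
    pid: int
    seconds: int   # timeout that was exceeded


# The task name may itself contain ':' (as in kworker/u16:8), so the
# greedy match on the name lets the last ':' before the digits mark the PID.
_HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(dmesg_text):
    """Return one HungTask per hung-task warning found in dmesg_text."""
    return [
        HungTask(m["comm"], int(m["pid"]), int(m["secs"]))
        for m in _HUNG_RE.finditer(dmesg_text)
    ]
```

Feeding it the first line of the trace above should yield a single entry for kworker/u16:8 with PID 7540 and a 120 second timeout.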

The lessons I learned:

BTRFS _can_ do some _very_ time-consuming things in "one transaction",
and the kernel's "you might want to take a look at this task" timer is
tuned for things like "one over-long write to device" counting as "one
action", not for, say, creating an entire csum tree.

Don't panic, and _don't_ turn off your box. (That used to be a good
thing to do if fsck was eating a partition, back before ext3, but old-man
reflexes can be wrong nowadays 8-).

The "info" stack traces need to look a lot less like the "panic" stack
traces... 8-)

TL;DR :: the above is almost certainly the result of the scrub doing
something particularly arduous but correct, or another transaction
correctly waiting for the scrub to finish with a particular resource.

-- Rob.

