<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs, branch v3.0.33</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>vfs: make AIO use the proper rw_verify_area() area helpers</title>
<updated>2012-06-01T07:12:53+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-05-21T23:06:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2ec196c975ffb8076df77f6fa929448717e5141b'/>
<id>2ec196c975ffb8076df77f6fa929448717e5141b</id>
<content type='text'>
commit a70b52ec1aaeaf60f4739edb1b422827cb6f3893 upstream.

We had for some reason overlooked the AIO interface, and it didn't use
the proper rw_verify_area() helper function that checks (for example)
mandatory locking on the file, and that the size of the access doesn't
cause us to overflow the provided offset limits etc.

Instead, AIO did just the security_file_permission() thing (that
rw_verify_area() also does) directly.

This fixes it to do all the proper helper functions, which not only
means that now mandatory file locking works with AIO too, we can
actually remove lines of code.

Reported-by: Manish Honap &lt;manish_honap_vit@yahoo.co.in&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit a70b52ec1aaeaf60f4739edb1b422827cb6f3893 upstream.

We had for some reason overlooked the AIO interface, and it didn't use
the proper rw_verify_area() helper function that checks (for example)
mandatory locking on the file, and that the size of the access doesn't
cause us to overflow the provided offset limits etc.

Instead, AIO did just the security_file_permission() thing (that
rw_verify_area() also does) directly.

This fixes it to do all the proper helper functions, which not only
means that now mandatory file locking works with AIO too, we can
actually remove lines of code.

Reported-by: Manish Honap &lt;manish_honap_vit@yahoo.co.in&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>block: don't mark buffers beyond end of disk as mapped</title>
<updated>2012-06-01T07:12:52+00:00</updated>
<author>
<name>Jeff Moyer</name>
<email>jmoyer@redhat.com</email>
</author>
<published>2012-05-11T14:34:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=37a8457773c266cdb77fcddec008cd73e81786be'/>
<id>37a8457773c266cdb77fcddec008cd73e81786be</id>
<content type='text'>
commit 080399aaaf3531f5b8761ec0ac30ff98891e8686 upstream.

Hi,

We have a bug report open where a squashfs image mounted on ppc64 would
exhibit errors due to trying to read beyond the end of the disk.  It can
easily be reproduced by doing the following:

[root@ibm-p750e-02-lp3 ~]# ls -l install.img
-rw-r--r-- 1 root root 142032896 Apr 30 16:46 install.img
[root@ibm-p750e-02-lp3 ~]# mount -o loop ./install.img /mnt/test
[root@ibm-p750e-02-lp3 ~]# dd if=/dev/loop0 of=/dev/null
dd: reading `/dev/loop0': Input/output error
277376+0 records in
277376+0 records out
142016512 bytes (142 MB) copied, 0.9465 s, 150 MB/s

In dmesg, you'll find the following:

squashfs: version 4.0 (2009/01/31) Phillip Lougher
[   43.106012] attempt to access beyond end of device
[   43.106029] loop0: rw=0, want=277410, limit=277408
[   43.106039] Buffer I/O error on device loop0, logical block 138704
[   43.106053] attempt to access beyond end of device
[   43.106057] loop0: rw=0, want=277412, limit=277408
[   43.106061] Buffer I/O error on device loop0, logical block 138705
[   43.106066] attempt to access beyond end of device
[   43.106070] loop0: rw=0, want=277414, limit=277408
[   43.106073] Buffer I/O error on device loop0, logical block 138706
[   43.106078] attempt to access beyond end of device
[   43.106081] loop0: rw=0, want=277416, limit=277408
[   43.106085] Buffer I/O error on device loop0, logical block 138707
[   43.106089] attempt to access beyond end of device
[   43.106093] loop0: rw=0, want=277418, limit=277408
[   43.106096] Buffer I/O error on device loop0, logical block 138708
[   43.106101] attempt to access beyond end of device
[   43.106104] loop0: rw=0, want=277420, limit=277408
[   43.106108] Buffer I/O error on device loop0, logical block 138709
[   43.106112] attempt to access beyond end of device
[   43.106116] loop0: rw=0, want=277422, limit=277408
[   43.106120] Buffer I/O error on device loop0, logical block 138710
[   43.106124] attempt to access beyond end of device
[   43.106128] loop0: rw=0, want=277424, limit=277408
[   43.106131] Buffer I/O error on device loop0, logical block 138711
[   43.106135] attempt to access beyond end of device
[   43.106139] loop0: rw=0, want=277426, limit=277408
[   43.106143] Buffer I/O error on device loop0, logical block 138712
[   43.106147] attempt to access beyond end of device
[   43.106151] loop0: rw=0, want=277428, limit=277408
[   43.106154] Buffer I/O error on device loop0, logical block 138713
[   43.106158] attempt to access beyond end of device
[   43.106162] loop0: rw=0, want=277430, limit=277408
[   43.106166] attempt to access beyond end of device
[   43.106169] loop0: rw=0, want=277432, limit=277408
...
[   43.106307] attempt to access beyond end of device
[   43.106311] loop0: rw=0, want=277470, limit=2774

Squashfs manages to read in the end block(s) of the disk during the
mount operation.  Then, when dd reads the block device, it leads to
block_read_full_page being called with buffers that are beyond end of
disk, but are marked as mapped.  Thus, it would end up submitting read
I/O against them, resulting in the errors mentioned above.  I fixed the
problem by modifying init_page_buffers to only set the buffer mapped if
it fell inside of i_size.

Cheers,
Jeff

Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Acked-by: Nick Piggin &lt;npiggin@kernel.dk&gt;

--

Changes from v1-&gt;v2: re-used max_block, as suggested by Nick Piggin.
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 080399aaaf3531f5b8761ec0ac30ff98891e8686 upstream.

Hi,

We have a bug report open where a squashfs image mounted on ppc64 would
exhibit errors due to trying to read beyond the end of the disk.  It can
easily be reproduced by doing the following:

[root@ibm-p750e-02-lp3 ~]# ls -l install.img
-rw-r--r-- 1 root root 142032896 Apr 30 16:46 install.img
[root@ibm-p750e-02-lp3 ~]# mount -o loop ./install.img /mnt/test
[root@ibm-p750e-02-lp3 ~]# dd if=/dev/loop0 of=/dev/null
dd: reading `/dev/loop0': Input/output error
277376+0 records in
277376+0 records out
142016512 bytes (142 MB) copied, 0.9465 s, 150 MB/s

In dmesg, you'll find the following:

squashfs: version 4.0 (2009/01/31) Phillip Lougher
[   43.106012] attempt to access beyond end of device
[   43.106029] loop0: rw=0, want=277410, limit=277408
[   43.106039] Buffer I/O error on device loop0, logical block 138704
[   43.106053] attempt to access beyond end of device
[   43.106057] loop0: rw=0, want=277412, limit=277408
[   43.106061] Buffer I/O error on device loop0, logical block 138705
[   43.106066] attempt to access beyond end of device
[   43.106070] loop0: rw=0, want=277414, limit=277408
[   43.106073] Buffer I/O error on device loop0, logical block 138706
[   43.106078] attempt to access beyond end of device
[   43.106081] loop0: rw=0, want=277416, limit=277408
[   43.106085] Buffer I/O error on device loop0, logical block 138707
[   43.106089] attempt to access beyond end of device
[   43.106093] loop0: rw=0, want=277418, limit=277408
[   43.106096] Buffer I/O error on device loop0, logical block 138708
[   43.106101] attempt to access beyond end of device
[   43.106104] loop0: rw=0, want=277420, limit=277408
[   43.106108] Buffer I/O error on device loop0, logical block 138709
[   43.106112] attempt to access beyond end of device
[   43.106116] loop0: rw=0, want=277422, limit=277408
[   43.106120] Buffer I/O error on device loop0, logical block 138710
[   43.106124] attempt to access beyond end of device
[   43.106128] loop0: rw=0, want=277424, limit=277408
[   43.106131] Buffer I/O error on device loop0, logical block 138711
[   43.106135] attempt to access beyond end of device
[   43.106139] loop0: rw=0, want=277426, limit=277408
[   43.106143] Buffer I/O error on device loop0, logical block 138712
[   43.106147] attempt to access beyond end of device
[   43.106151] loop0: rw=0, want=277428, limit=277408
[   43.106154] Buffer I/O error on device loop0, logical block 138713
[   43.106158] attempt to access beyond end of device
[   43.106162] loop0: rw=0, want=277430, limit=277408
[   43.106166] attempt to access beyond end of device
[   43.106169] loop0: rw=0, want=277432, limit=277408
...
[   43.106307] attempt to access beyond end of device
[   43.106311] loop0: rw=0, want=277470, limit=2774

Squashfs manages to read in the end block(s) of the disk during the
mount operation.  Then, when dd reads the block device, it leads to
block_read_full_page being called with buffers that are beyond end of
disk, but are marked as mapped.  Thus, it would end up submitting read
I/O against them, resulting in the errors mentioned above.  I fixed the
problem by modifying init_page_buffers to only set the buffer mapped if
it fell inside of i_size.

Cheers,
Jeff

Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Acked-by: Nick Piggin &lt;npiggin@kernel.dk&gt;

--

Changes from v1-&gt;v2: re-used max_block, as suggested by Nick Piggin.
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>wake up s_wait_unfrozen when -&gt;freeze_fs fails</title>
<updated>2012-05-21T16:40:05+00:00</updated>
<author>
<name>Kazuya Mio</name>
<email>k-mio@sx.jp.nec.com</email>
</author>
<published>2011-12-01T07:51:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fcb2c2e95085336d1d6ac9cc1a07a783f80e60c0'/>
<id>fcb2c2e95085336d1d6ac9cc1a07a783f80e60c0</id>
<content type='text'>
commit e1616300a20c80396109c1cf013ba9a36055a3da upstream.

dd slept infinitely when fsfeeze failed because of EIO.
To fix this problem, if -&gt;freeze_fs fails, freeze_super() wakes up
the tasks waiting for the filesystem to become unfrozen.

When s_frozen isn't SB_UNFROZEN in __generic_file_aio_write(),
the function sleeps until FITHAW ioctl wakes up s_wait_unfrozen.

However, if -&gt;freeze_fs fails, s_frozen is set to SB_UNFROZEN and then
freeze_super() returns an error number. In this case, FITHAW ioctl returns
EINVAL because s_frozen is already SB_UNFROZEN. There is no way to wake up
s_wait_unfrozen, so __generic_file_aio_write() sleeps infinitely.

Signed-off-by: Kazuya Mio &lt;k-mio@sx.jp.nec.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e1616300a20c80396109c1cf013ba9a36055a3da upstream.

dd slept infinitely when fsfeeze failed because of EIO.
To fix this problem, if -&gt;freeze_fs fails, freeze_super() wakes up
the tasks waiting for the filesystem to become unfrozen.

When s_frozen isn't SB_UNFROZEN in __generic_file_aio_write(),
the function sleeps until FITHAW ioctl wakes up s_wait_unfrozen.

However, if -&gt;freeze_fs fails, s_frozen is set to SB_UNFROZEN and then
freeze_super() returns an error number. In this case, FITHAW ioctl returns
EINVAL because s_frozen is already SB_UNFROZEN. There is no way to wake up
s_wait_unfrozen, so __generic_file_aio_write() sleeps infinitely.

Signed-off-by: Kazuya Mio &lt;k-mio@sx.jp.nec.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix error handling on inode bitmap corruption</title>
<updated>2012-05-21T16:40:04+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2011-12-18T22:37:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1a28fbbebe6bd7e3f0338663302b3b3ce500e088'/>
<id>1a28fbbebe6bd7e3f0338663302b3b3ce500e088</id>
<content type='text'>
commit acd6ad83517639e8f09a8c5525b1dccd81cd2a10 upstream.

When insert_inode_locked() fails in ext4_new_inode() it most likely means inode
bitmap got corrupted and we allocated again inode which is already in use. Also
doing unlock_new_inode() during error recovery is wrong since the inode does
not have I_NEW set. Fix the problem by jumping to fail: (instead of fail_drop:)
which declares filesystem error and does not call unlock_new_inode().

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit acd6ad83517639e8f09a8c5525b1dccd81cd2a10 upstream.

When insert_inode_locked() fails in ext4_new_inode() it most likely means inode
bitmap got corrupted and we allocated again inode which is already in use. Also
doing unlock_new_inode() during error recovery is wrong since the inode does
not have I_NEW set. Fix the problem by jumping to fail: (instead of fail_drop:)
which declares filesystem error and does not call unlock_new_inode().

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ext3: Fix error handling on inode bitmap corruption</title>
<updated>2012-05-21T16:40:04+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2011-12-08T20:13:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8e8a21270cbfc41bf3bf2e014b99dc113ba554ec'/>
<id>8e8a21270cbfc41bf3bf2e014b99dc113ba554ec</id>
<content type='text'>
commit 1415dd8705394399d59a3df1ab48d149e1e41e77 upstream.

When insert_inode_locked() fails in ext3_new_inode() it most likely
means inode bitmap got corrupted and we allocated again inode which
is already in use. Also doing unlock_new_inode() during error recovery
is wrong since inode does not have I_NEW set. Fix the problem by jumping
to fail: (instead of fail_drop:) which declares filesystem error and
does not call unlock_new_inode().

Reviewed-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 1415dd8705394399d59a3df1ab48d149e1e41e77 upstream.

When insert_inode_locked() fails in ext3_new_inode() it most likely
means inode bitmap got corrupted and we allocated again inode which
is already in use. Also doing unlock_new_inode() during error recovery
is wrong since inode does not have I_NEW set. Fix the problem by jumping
to fail: (instead of fail_drop:) which declares filesystem error and
does not call unlock_new_inode().

Reviewed-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>NFSv4: Revalidate uid/gid after open</title>
<updated>2012-05-21T16:40:04+00:00</updated>
<author>
<name>Jonathan Nieder</name>
<email>jrnieder@gmail.com</email>
</author>
<published>2012-05-11T09:20:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=19165bdbb3622cfca0ff66e8b30248d469b849d6'/>
<id>19165bdbb3622cfca0ff66e8b30248d469b849d6</id>
<content type='text'>
This is a shorter (and more appropriate for stable kernels) analog to
the following upstream commit:

commit 6926afd1925a54a13684ebe05987868890665e2b
Author: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Date:   Sat Jan 7 13:22:46 2012 -0500

    NFSv4: Save the owner/group name string when doing open

    ...so that we can do the uid/gid mapping outside the asynchronous RPC
    context.
    This fixes a bug in the current NFSv4 atomic open code where the client
    isn't able to determine what the true uid/gid fields of the file are,
    (because the asynchronous nature of the OPEN call denies it the ability
    to do an upcall) and so fills them with default values, marking the
    inode as needing revalidation.
    Unfortunately, in some cases, the VFS will do some additional sanity
    checks on the file, and may override the server's decision to allow
    the open because it sees the wrong owner/group fields.

    Signed-off-by: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;

Without this patch, logging into two different machines with home
directories mounted over NFS4 and then running "vim" and typing ":q"
in each reliably produces the following error on the second machine:

	E137: Viminfo file is not writable: /users/system/rtheys/.viminfo

This regression was introduced by 80e52aced138 ("NFSv4: Don't do
idmapper upcalls for asynchronous RPC calls", merged during the 2.6.32
cycle) --- after the OPEN call, .viminfo has the default values for
st_uid and st_gid (0xfffffffe) cached because we do not want to let
rpciod wait for an idmapper upcall to fill them in.

The fix used in mainline is to save the owner and group as strings and
perform the upcall in _nfs4_proc_open outside the rpciod context,
which takes about 600 lines.  For stable, we can do something similar
with a one-liner: make open check for the stale fields and make a
(synchronous) GETATTR call to fill them when needed.

Trond dictated the patch, I typed it in, and Rik tested it.

Addresses http://bugs.debian.org/659111 and
          https://bugzilla.redhat.com/789298

Reported-by: Rik Theys &lt;Rik.Theys@esat.kuleuven.be&gt;
Explained-by: David Flyn &lt;davidf@rd.bbc.co.uk&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Tested-by: Rik Theys &lt;Rik.Theys@esat.kuleuven.be&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a shorter (and more appropriate for stable kernels) analog to
the following upstream commit:

commit 6926afd1925a54a13684ebe05987868890665e2b
Author: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Date:   Sat Jan 7 13:22:46 2012 -0500

    NFSv4: Save the owner/group name string when doing open

    ...so that we can do the uid/gid mapping outside the asynchronous RPC
    context.
    This fixes a bug in the current NFSv4 atomic open code where the client
    isn't able to determine what the true uid/gid fields of the file are,
    (because the asynchronous nature of the OPEN call denies it the ability
    to do an upcall) and so fills them with default values, marking the
    inode as needing revalidation.
    Unfortunately, in some cases, the VFS will do some additional sanity
    checks on the file, and may override the server's decision to allow
    the open because it sees the wrong owner/group fields.

    Signed-off-by: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;

Without this patch, logging into two different machines with home
directories mounted over NFS4 and then running "vim" and typing ":q"
in each reliably produces the following error on the second machine:

	E137: Viminfo file is not writable: /users/system/rtheys/.viminfo

This regression was introduced by 80e52aced138 ("NFSv4: Don't do
idmapper upcalls for asynchronous RPC calls", merged during the 2.6.32
cycle) --- after the OPEN call, .viminfo has the default values for
st_uid and st_gid (0xfffffffe) cached because we do not want to let
rpciod wait for an idmapper upcall to fill them in.

The fix used in mainline is to save the owner and group as strings and
perform the upcall in _nfs4_proc_open outside the rpciod context,
which takes about 600 lines.  For stable, we can do something similar
with a one-liner: make open check for the stale fields and make a
(synchronous) GETATTR call to fill them when needed.

Trond dictated the patch, I typed it in, and Rik tested it.

Addresses http://bugs.debian.org/659111 and
          https://bugzilla.redhat.com/789298

Reported-by: Rik Theys &lt;Rik.Theys@esat.kuleuven.be&gt;
Explained-by: David Flyn &lt;davidf@rd.bbc.co.uk&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Tested-by: Rik Theys &lt;Rik.Theys@esat.kuleuven.be&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: avoid deadlock on sync-mounted FS w/o journal</title>
<updated>2012-05-21T16:40:04+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@redhat.com</email>
</author>
<published>2012-02-21T04:06:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=797c09ed348c2e22715cbfaeb3ab658436753570'/>
<id>797c09ed348c2e22715cbfaeb3ab658436753570</id>
<content type='text'>
commit c1bb05a657fb3d8c6179a4ef7980261fae4521d7 upstream.

Processes hang forever on a sync-mounted ext2 file system that
is mounted with the ext4 module (default in Fedora 16).

I can reproduce this reliably by mounting an ext2 partition with
"-o sync" and opening a new file an that partition with vim. vim
will hang in "D" state forever.  The same happens on ext4 without
a journal.

I am attaching a small patch here that solves this issue for me.
In the sync mounted case without a journal,
ext4_handle_dirty_metadata() may call sync_dirty_buffer(), which
can't be called with buffer lock held.

Also move mb_cache_entry_release inside lock to avoid race
fixed previously by 8a2bfdcb ext[34]: EA block reference count racing fix
Note too that ext2 fixed this same problem in 2006 with
b2f49033 [PATCH] fix deadlock in ext2

Signed-off-by: Martin.Wilck@ts.fujitsu.com
[sandeen@redhat.com: move mb_cache_entry_release before unlock, edit commit msg]
Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c1bb05a657fb3d8c6179a4ef7980261fae4521d7 upstream.

Processes hang forever on a sync-mounted ext2 file system that
is mounted with the ext4 module (default in Fedora 16).

I can reproduce this reliably by mounting an ext2 partition with
"-o sync" and opening a new file an that partition with vim. vim
will hang in "D" state forever.  The same happens on ext4 without
a journal.

I am attaching a small patch here that solves this issue for me.
In the sync mounted case without a journal,
ext4_handle_dirty_metadata() may call sync_dirty_buffer(), which
can't be called with buffer lock held.

Also move mb_cache_entry_release inside lock to avoid race
fixed previously by 8a2bfdcb ext[34]: EA block reference count racing fix
Note too that ext2 fixed this same problem in 2006 with
b2f49033 [PATCH] fix deadlock in ext2

Signed-off-by: Martin.Wilck@ts.fujitsu.com
[sandeen@redhat.com: move mb_cache_entry_release before unlock, edit commit msg]
Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>jffs2: Fix lock acquisition order bug in gc path</title>
<updated>2012-05-21T16:40:03+00:00</updated>
<author>
<name>Josh Cartwright</name>
<email>joshc@linux.com</email>
</author>
<published>2012-03-29T23:34:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1541d27bdbcf4637162f4cedf7e3954bee3dace2'/>
<id>1541d27bdbcf4637162f4cedf7e3954bee3dace2</id>
<content type='text'>
commit 226bb7df3d22bcf4a1c0fe8206c80cc427498eae upstream.

The locking policy is such that the erase_complete_block spinlock is
nested within the alloc_sem mutex.  This fixes a case in which the
acquisition order was erroneously reversed.  This issue was caught by
the following lockdep splat:

   =======================================================
   [ INFO: possible circular locking dependency detected ]
   3.0.5 #1
   -------------------------------------------------------
   jffs2_gcd_mtd6/299 is trying to acquire lock:
    (&amp;c-&gt;alloc_sem){+.+.+.}, at: [&lt;c01f7714&gt;] jffs2_garbage_collect_pass+0x314/0x890

   but task is already holding lock:
    (&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock){+.+...}, at: [&lt;c01f7708&gt;] jffs2_garbage_collect_pass+0x308/0x890

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -&gt; #1 (&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock){+.+...}:
          [&lt;c008bec4&gt;] validate_chain+0xe6c/0x10bc
          [&lt;c008c660&gt;] __lock_acquire+0x54c/0xba4
          [&lt;c008d240&gt;] lock_acquire+0xa4/0x114
          [&lt;c046780c&gt;] _raw_spin_lock+0x3c/0x4c
          [&lt;c01f744c&gt;] jffs2_garbage_collect_pass+0x4c/0x890
          [&lt;c01f937c&gt;] jffs2_garbage_collect_thread+0x1b4/0x1cc
          [&lt;c0071a68&gt;] kthread+0x98/0xa0
          [&lt;c000f264&gt;] kernel_thread_exit+0x0/0x8

   -&gt; #0 (&amp;c-&gt;alloc_sem){+.+.+.}:
          [&lt;c008ad2c&gt;] print_circular_bug+0x70/0x2c4
          [&lt;c008c08c&gt;] validate_chain+0x1034/0x10bc
          [&lt;c008c660&gt;] __lock_acquire+0x54c/0xba4
          [&lt;c008d240&gt;] lock_acquire+0xa4/0x114
          [&lt;c0466628&gt;] mutex_lock_nested+0x74/0x33c
          [&lt;c01f7714&gt;] jffs2_garbage_collect_pass+0x314/0x890
          [&lt;c01f937c&gt;] jffs2_garbage_collect_thread+0x1b4/0x1cc
          [&lt;c0071a68&gt;] kthread+0x98/0xa0
          [&lt;c000f264&gt;] kernel_thread_exit+0x0/0x8

   other info that might help us debug this:

    Possible unsafe locking scenario:

          CPU0                    CPU1
          ----                    ----
     lock(&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock);
                                  lock(&amp;c-&gt;alloc_sem);
                                  lock(&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock);
     lock(&amp;c-&gt;alloc_sem);

    *** DEADLOCK ***

   1 lock held by jffs2_gcd_mtd6/299:
    #0:  (&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock){+.+...}, at: [&lt;c01f7708&gt;] jffs2_garbage_collect_pass+0x308/0x890

   stack backtrace:
   [&lt;c00155dc&gt;] (unwind_backtrace+0x0/0x100) from [&lt;c0463dc0&gt;] (dump_stack+0x20/0x24)
   [&lt;c0463dc0&gt;] (dump_stack+0x20/0x24) from [&lt;c008ae84&gt;] (print_circular_bug+0x1c8/0x2c4)
   [&lt;c008ae84&gt;] (print_circular_bug+0x1c8/0x2c4) from [&lt;c008c08c&gt;] (validate_chain+0x1034/0x10bc)
   [&lt;c008c08c&gt;] (validate_chain+0x1034/0x10bc) from [&lt;c008c660&gt;] (__lock_acquire+0x54c/0xba4)
   [&lt;c008c660&gt;] (__lock_acquire+0x54c/0xba4) from [&lt;c008d240&gt;] (lock_acquire+0xa4/0x114)
   [&lt;c008d240&gt;] (lock_acquire+0xa4/0x114) from [&lt;c0466628&gt;] (mutex_lock_nested+0x74/0x33c)
   [&lt;c0466628&gt;] (mutex_lock_nested+0x74/0x33c) from [&lt;c01f7714&gt;] (jffs2_garbage_collect_pass+0x314/0x890)
   [&lt;c01f7714&gt;] (jffs2_garbage_collect_pass+0x314/0x890) from [&lt;c01f937c&gt;] (jffs2_garbage_collect_thread+0x1b4/0x1cc)
   [&lt;c01f937c&gt;] (jffs2_garbage_collect_thread+0x1b4/0x1cc) from [&lt;c0071a68&gt;] (kthread+0x98/0xa0)
   [&lt;c0071a68&gt;] (kthread+0x98/0xa0) from [&lt;c000f264&gt;] (kernel_thread_exit+0x0/0x8)

This was introduce in '81cfc9f jffs2: Fix serious write stall due to erase'.

Signed-off-by: Josh Cartwright &lt;joshc@linux.com&gt;
Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 226bb7df3d22bcf4a1c0fe8206c80cc427498eae upstream.

The locking policy is such that the erase_complete_block spinlock is
nested within the alloc_sem mutex.  This fixes a case in which the
acquisition order was erroneously reversed.  This issue was caught by
the following lockdep splat:

   =======================================================
   [ INFO: possible circular locking dependency detected ]
   3.0.5 #1
   -------------------------------------------------------
   jffs2_gcd_mtd6/299 is trying to acquire lock:
    (&amp;c-&gt;alloc_sem){+.+.+.}, at: [&lt;c01f7714&gt;] jffs2_garbage_collect_pass+0x314/0x890

   but task is already holding lock:
    (&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock){+.+...}, at: [&lt;c01f7708&gt;] jffs2_garbage_collect_pass+0x308/0x890

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -&gt; #1 (&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock){+.+...}:
          [&lt;c008bec4&gt;] validate_chain+0xe6c/0x10bc
          [&lt;c008c660&gt;] __lock_acquire+0x54c/0xba4
          [&lt;c008d240&gt;] lock_acquire+0xa4/0x114
          [&lt;c046780c&gt;] _raw_spin_lock+0x3c/0x4c
          [&lt;c01f744c&gt;] jffs2_garbage_collect_pass+0x4c/0x890
          [&lt;c01f937c&gt;] jffs2_garbage_collect_thread+0x1b4/0x1cc
          [&lt;c0071a68&gt;] kthread+0x98/0xa0
          [&lt;c000f264&gt;] kernel_thread_exit+0x0/0x8

   -&gt; #0 (&amp;c-&gt;alloc_sem){+.+.+.}:
          [&lt;c008ad2c&gt;] print_circular_bug+0x70/0x2c4
          [&lt;c008c08c&gt;] validate_chain+0x1034/0x10bc
          [&lt;c008c660&gt;] __lock_acquire+0x54c/0xba4
          [&lt;c008d240&gt;] lock_acquire+0xa4/0x114
          [&lt;c0466628&gt;] mutex_lock_nested+0x74/0x33c
          [&lt;c01f7714&gt;] jffs2_garbage_collect_pass+0x314/0x890
          [&lt;c01f937c&gt;] jffs2_garbage_collect_thread+0x1b4/0x1cc
          [&lt;c0071a68&gt;] kthread+0x98/0xa0
          [&lt;c000f264&gt;] kernel_thread_exit+0x0/0x8

   other info that might help us debug this:

    Possible unsafe locking scenario:

          CPU0                    CPU1
          ----                    ----
     lock(&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock);
                                  lock(&amp;c-&gt;alloc_sem);
                                  lock(&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock);
     lock(&amp;c-&gt;alloc_sem);

    *** DEADLOCK ***

   1 lock held by jffs2_gcd_mtd6/299:
    #0:  (&amp;(&amp;c-&gt;erase_completion_lock)-&gt;rlock){+.+...}, at: [&lt;c01f7708&gt;] jffs2_garbage_collect_pass+0x308/0x890

   stack backtrace:
   [&lt;c00155dc&gt;] (unwind_backtrace+0x0/0x100) from [&lt;c0463dc0&gt;] (dump_stack+0x20/0x24)
   [&lt;c0463dc0&gt;] (dump_stack+0x20/0x24) from [&lt;c008ae84&gt;] (print_circular_bug+0x1c8/0x2c4)
   [&lt;c008ae84&gt;] (print_circular_bug+0x1c8/0x2c4) from [&lt;c008c08c&gt;] (validate_chain+0x1034/0x10bc)
   [&lt;c008c08c&gt;] (validate_chain+0x1034/0x10bc) from [&lt;c008c660&gt;] (__lock_acquire+0x54c/0xba4)
   [&lt;c008c660&gt;] (__lock_acquire+0x54c/0xba4) from [&lt;c008d240&gt;] (lock_acquire+0xa4/0x114)
   [&lt;c008d240&gt;] (lock_acquire+0xa4/0x114) from [&lt;c0466628&gt;] (mutex_lock_nested+0x74/0x33c)
   [&lt;c0466628&gt;] (mutex_lock_nested+0x74/0x33c) from [&lt;c01f7714&gt;] (jffs2_garbage_collect_pass+0x314/0x890)
   [&lt;c01f7714&gt;] (jffs2_garbage_collect_pass+0x314/0x890) from [&lt;c01f937c&gt;] (jffs2_garbage_collect_thread+0x1b4/0x1cc)
   [&lt;c01f937c&gt;] (jffs2_garbage_collect_thread+0x1b4/0x1cc) from [&lt;c0071a68&gt;] (kthread+0x98/0xa0)
   [&lt;c0071a68&gt;] (kthread+0x98/0xa0) from [&lt;c000f264&gt;] (kernel_thread_exit+0x0/0x8)

This was introduce in '81cfc9f jffs2: Fix serious write stall due to erase'.

Signed-off-by: Josh Cartwright &lt;joshc@linux.com&gt;
Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>hfsplus: Fix potential buffer overflows</title>
<updated>2012-05-07T15:56:50+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2012-05-04T19:09:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8792953929b01e82e57bce07cc717a0bf24384bc'/>
<id>8792953929b01e82e57bce07cc717a0bf24384bc</id>
<content type='text'>
commit 6f24f892871acc47b40dd594c63606a17c714f77 upstream.

Commit ec81aecb2966 ("hfs: fix a potential buffer overflow") fixed a few
potential buffer overflows in the hfs filesystem.  But as Timo Warns
pointed out, these changes also need to be made on the hfsplus
filesystem as well.

Reported-by: Timo Warns &lt;warns@pre-sense.de&gt;
Acked-by: WANG Cong &lt;amwang@redhat.com&gt;
Cc: Alexey Khoroshilov &lt;khoroshilov@ispras.ru&gt;
Cc: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Sage Weil &lt;sage@newdream.net&gt;
Cc: Eugene Teo &lt;eteo@redhat.com&gt;
Cc: Roman Zippel &lt;zippel@linux-m68k.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Dave Anderson &lt;anderson@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6f24f892871acc47b40dd594c63606a17c714f77 upstream.

Commit ec81aecb2966 ("hfs: fix a potential buffer overflow") fixed a few
potential buffer overflows in the hfs filesystem.  But as Timo Warns
pointed out, these changes also need to be made on the hfsplus
filesystem as well.

Reported-by: Timo Warns &lt;warns@pre-sense.de&gt;
Acked-by: WANG Cong &lt;amwang@redhat.com&gt;
Cc: Alexey Khoroshilov &lt;khoroshilov@ispras.ru&gt;
Cc: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Sage Weil &lt;sage@newdream.net&gt;
Cc: Eugene Teo &lt;eteo@redhat.com&gt;
Cc: Roman Zippel &lt;zippel@linux-m68k.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Dave Anderson &lt;anderson@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>autofs: make the autofsv5 packet file descriptor use a packetized pipe</title>
<updated>2012-05-07T15:56:37+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-04-29T20:30:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=70403b35a5e2d08c9e2727b2e8dd78cb0b1391b3'/>
<id>70403b35a5e2d08c9e2727b2e8dd78cb0b1391b3</id>
<content type='text'>
commit 64f371bc3107e69efce563a3d0f0e6880de0d537 upstream.

The autofs packet size has had a very unfortunate size problem on x86:
because the alignment of 'u64' differs in 32-bit and 64-bit modes, and
because the packet data was not 8-byte aligned, the size of the autofsv5
packet structure differed between 32-bit and 64-bit modes despite
looking otherwise identical (300 vs 304 bytes respectively).

We first fixed that up by making the 64-bit compat mode know about this
problem in commit a32744d4abae ("autofs: work around unhappy compat
problem on x86-64"), and that made a 32-bit 'systemd' work happily on a
64-bit kernel because everything then worked the same way as on a 32-bit
kernel.

But it turned out that 'automount' had actually known and worked around
this problem in user space, so fixing the kernel to do the proper 32-bit
compatibility handling actually *broke* 32-bit automount on a 64-bit
kernel, because it knew that the packet sizes were wrong and expected
those incorrect sizes.

As a result, we ended up reverting that compatibility mode fix, and
thus breaking systemd again, in commit fcbf94b9dedd.

With both automount and systemd doing a single read() system call, and
verifying that they get *exactly* the size they expect but using
different sizes, it seemed that fixing one of them inevitably seemed to
break the other.  At one point, a patch I seriously considered applying
from Michael Tokarev did a "strcmp()" to see if it was automount that
was doing the operation.  Ugly, ugly.

However, a prettier solution exists now thanks to the packetized pipe
mode.  By marking the communication pipe as being packetized (by simply
setting the O_DIRECT flag), we can always just write the bigger packet
size, and if user-space does a smaller read, it will just get that
partial end result and the extra alignment padding will simply be thrown
away.

This makes both automount and systemd happy, since they now get the size
they asked for, and the kernel side of autofs simply no longer needs to
care - it could pad out the packet arbitrarily.

Of course, if there is some *other* user of autofs (please, please,
please tell me it ain't so - and we haven't heard of any) that tries to
read the packets with multiple writes, that other user will now be
broken - the whole point of the packetized mode is that one system call
gets exactly one packet, and you cannot read a packet in pieces.

Tested-by: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Ian Kent &lt;raven@themaw.net&gt;
Cc: Thomas Meyer &lt;thomas@m3y3r.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 64f371bc3107e69efce563a3d0f0e6880de0d537 upstream.

The autofs packet size has had a very unfortunate size problem on x86:
because the alignment of 'u64' differs in 32-bit and 64-bit modes, and
because the packet data was not 8-byte aligned, the size of the autofsv5
packet structure differed between 32-bit and 64-bit modes despite
looking otherwise identical (300 vs 304 bytes respectively).

We first fixed that up by making the 64-bit compat mode know about this
problem in commit a32744d4abae ("autofs: work around unhappy compat
problem on x86-64"), and that made a 32-bit 'systemd' work happily on a
64-bit kernel because everything then worked the same way as on a 32-bit
kernel.

But it turned out that 'automount' had actually known and worked around
this problem in user space, so fixing the kernel to do the proper 32-bit
compatibility handling actually *broke* 32-bit automount on a 64-bit
kernel, because it knew that the packet sizes were wrong and expected
those incorrect sizes.

As a result, we ended up reverting that compatibility mode fix, and
thus breaking systemd again, in commit fcbf94b9dedd.

With both automount and systemd doing a single read() system call, and
verifying that they get *exactly* the size they expect but using
different sizes, it seemed that fixing one of them inevitably seemed to
break the other.  At one point, a patch I seriously considered applying
from Michael Tokarev did a "strcmp()" to see if it was automount that
was doing the operation.  Ugly, ugly.

However, a prettier solution exists now thanks to the packetized pipe
mode.  By marking the communication pipe as being packetized (by simply
setting the O_DIRECT flag), we can always just write the bigger packet
size, and if user-space does a smaller read, it will just get that
partial end result and the extra alignment padding will simply be thrown
away.

This makes both automount and systemd happy, since they now get the size
they asked for, and the kernel side of autofs simply no longer needs to
care - it could pad out the packet arbitrarily.

Of course, if there is some *other* user of autofs (please, please,
please tell me it ain't so - and we haven't heard of any) that tries to
read the packets with multiple writes, that other user will now be
broken - the whole point of the packetized mode is that one system call
gets exactly one packet, and you cannot read a packet in pieces.

Tested-by: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Ian Kent &lt;raven@themaw.net&gt;
Cc: Thomas Meyer &lt;thomas@m3y3r.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
