<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/ceph, branch v4.2.1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>fs: create and use seq_show_option for escaping</title>
<updated>2015-09-21T17:10:59+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2015-09-04T22:44:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c5291202ffc78688bc48ffbd7a7cf7807093e25e'/>
<id>c5291202ffc78688bc48ffbd7a7cf7807093e25e</id>
<content type='text'>
commit a068acf2ee77693e0bf39d6e07139ba704f461c3 upstream.

Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g.  new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else.  This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.

Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
of "sudo" is something more sneaky:

  $ BASE="ovl"
  $ MNT="$BASE/mnt"
  $ LOW="$BASE/lower"
  $ UP="$BASE/upper"
  $ WORK="$BASE/work/ 0 0
  none /proc fuse.pwn user_id=1000"
  $ mkdir -p "$LOW" "$UP" "$WORK"
  $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
  $ cat /proc/mounts
  none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
  none /proc fuse.pwn user_id=1000 0 0
  $ fusermount -u /proc
  $ cat /proc/mounts
  cat: /proc/mounts: No such file or directory

This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed.  Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.

[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Acked-by: Jan Kara &lt;jack@suse.com&gt;
Acked-by: Paul Moore &lt;paul@paul-moore.com&gt;
Cc: J. R. Okajima &lt;hooanon05g@gmail.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit a068acf2ee77693e0bf39d6e07139ba704f461c3 upstream.

Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g.  new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else.  This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.

Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
of "sudo" is something more sneaky:

  $ BASE="ovl"
  $ MNT="$BASE/mnt"
  $ LOW="$BASE/lower"
  $ UP="$BASE/upper"
  $ WORK="$BASE/work/ 0 0
  none /proc fuse.pwn user_id=1000"
  $ mkdir -p "$LOW" "$UP" "$WORK"
  $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
  $ cat /proc/mounts
  none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
  none /proc fuse.pwn user_id=1000 0 0
  $ fusermount -u /proc
  $ cat /proc/mounts
  cat: /proc/mounts: No such file or directory

This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed.  Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.

[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Acked-by: Jan Kara &lt;jack@suse.com&gt;
Acked-by: Paul Moore &lt;paul@paul-moore.com&gt;
Cc: J. R. Okajima &lt;hooanon05g@gmail.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: always re-send cap flushes when MDS recovers</title>
<updated>2015-07-31T08:38:53+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-07-20T01:50:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fc927cd32feca2acefd90a4ac317fa4f0a2e5955'/>
<id>fc927cd32feca2acefd90a4ac317fa4f0a2e5955</id>
<content type='text'>
commit e548e9b93d3e565e42b938a99804114565be1f81 makes the kclient
only re-send cap flush once during MDS failover. If the kclient sends
a cap flush after MDS enters reconnect stage but before MDS recovers.
The kclient will skip re-sending the same cap flush when MDS recovers.

This causes problem for newly created inode. The MDS handles cap
flushes before replaying unsafe requests, so it's possible that MDS
find corresponding inode is missing when handling cap flush. The fix
is reverting to old behaviour: always re-send when MDS recovers

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e548e9b93d3e565e42b938a99804114565be1f81 makes the kclient
only re-send cap flush once during MDS failover. If the kclient sends
a cap flush after MDS enters reconnect stage but before MDS recovers.
The kclient will skip re-sending the same cap flush when MDS recovers.

This causes problem for newly created inode. The MDS handles cap
flushes before replaying unsafe requests, so it's possible that MDS
find corresponding inode is missing when handling cap flush. The fix
is reverting to old behaviour: always re-send when MDS recovers

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: fix ceph_encode_locks_to_buffer()</title>
<updated>2015-07-31T08:38:47+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-07-07T08:18:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f6762cb2ca48e9052b5233c338fa254fa58d8981'/>
<id>f6762cb2ca48e9052b5233c338fa254fa58d8981</id>
<content type='text'>
posix locks should be in ctx-&gt;flc_posix list

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
posix locks should be in ctx-&gt;flc_posix list

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2015-07-05T02:36:06+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-07-05T02:36:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1dc51b8288007753ad7cd7d08bb8fa930fc8bb10'/>
<id>1dc51b8288007753ad7cd7d08bb8fa930fc8bb10</id>
<content type='text'>
Pull more vfs updates from Al Viro:
 "Assorted VFS fixes and related cleanups (IMO the most interesting in
  that part are f_path-related things and Eric's descriptor-related
  stuff).  UFS regression fixes (it got broken last cycle).  9P fixes.
  fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups".  The
  file_table locking rule change by Eric Dumazet is a rather big and
  fundamental update even if the patch isn't huge.   - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
  9p: cope with bogus responses from server in p9_client_{read,write}
  p9_client_write(): avoid double p9_free_req()
  9p: forgetting to cancel request on interrupted zero-copy RPC
  dax: bdev_direct_access() may sleep
  block: Add support for DAX reads/writes to block devices
  dax: Use copy_from_iter_nocache
  dax: Add block size note to documentation
  fs/file.c: __fget() and dup2() atomicity rules
  fs/file.c: don't acquire files-&gt;file_lock in fd_install()
  fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
  vfs: avoid creation of inode number 0 in get_next_ino
  namei: make set_root_rcu() return void
  make simple_positive() public
  ufs: use dir_pages instead of ufs_dir_pages()
  pagemap.h: move dir_pages() over there
  remove the pointless include of lglock.h
  fs: cleanup slight list_entry abuse
  xfs: Correctly lock inode when removing suid and file capabilities
  fs: Call security_ops-&gt;inode_killpriv on truncate
  fs: Provide function telling whether file_remove_privs() will do anything
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull more vfs updates from Al Viro:
 "Assorted VFS fixes and related cleanups (IMO the most interesting in
  that part are f_path-related things and Eric's descriptor-related
  stuff).  UFS regression fixes (it got broken last cycle).  9P fixes.
  fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups".  The
  file_table locking rule change by Eric Dumazet is a rather big and
  fundamental update even if the patch isn't huge.   - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
  9p: cope with bogus responses from server in p9_client_{read,write}
  p9_client_write(): avoid double p9_free_req()
  9p: forgetting to cancel request on interrupted zero-copy RPC
  dax: bdev_direct_access() may sleep
  block: Add support for DAX reads/writes to block devices
  dax: Use copy_from_iter_nocache
  dax: Add block size note to documentation
  fs/file.c: __fget() and dup2() atomicity rules
  fs/file.c: don't acquire files-&gt;file_lock in fd_install()
  fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
  vfs: avoid creation of inode number 0 in get_next_ino
  namei: make set_root_rcu() return void
  make simple_positive() public
  ufs: use dir_pages instead of ufs_dir_pages()
  pagemap.h: move dir_pages() over there
  remove the pointless include of lglock.h
  fs: cleanup slight list_entry abuse
  xfs: Correctly lock inode when removing suid and file capabilities
  fs: Call security_ops-&gt;inode_killpriv on truncate
  fs: Provide function telling whether file_remove_privs() will do anything
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client</title>
<updated>2015-07-02T18:35:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-07-02T18:35:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0c76c6ba246043bbc5c0f9620a0645ae78217421'/>
<id>0c76c6ba246043bbc5c0f9620a0645ae78217421</id>
<content type='text'>
Pull Ceph updates from Sage Weil:
 "We have a pile of bug fixes from Ilya, including a few patches that
  sync up the CRUSH code with the latest from userspace.

  There is also a long series from Zheng that fixes various issues with
  snapshots, inline data, and directory fsync, some simplification and
  improvement in the cap release code, and a rework of the caching of
  directory contents.

  To top it off there are a few small fixes and cleanups from Benoit and
  Hong"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (40 commits)
  rbd: use GFP_NOIO in rbd_obj_request_create()
  crush: fix a bug in tree bucket decode
  libceph: Fix ceph_tcp_sendpage()'s more boolean usage
  libceph: Remove spurious kunmap() of the zero page
  rbd: queue_depth map option
  rbd: store rbd_options in rbd_device
  rbd: terminate rbd_opts_tokens with Opt_err
  ceph: fix ceph_writepages_start()
  rbd: bump queue_max_segments
  ceph: rework dcache readdir
  crush: sync up with userspace
  crush: fix crash from invalid 'take' argument
  ceph: switch some GFP_NOFS memory allocation to GFP_KERNEL
  ceph: pre-allocate data structure that tracks caps flushing
  ceph: re-send flushing caps (which are revoked) in reconnect stage
  ceph: send TID of the oldest pending caps flush to MDS
  ceph: track pending caps flushing globally
  ceph: track pending caps flushing accurately
  libceph: fix wrong name "Ceph filesystem for Linux"
  ceph: fix directory fsync
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull Ceph updates from Sage Weil:
 "We have a pile of bug fixes from Ilya, including a few patches that
  sync up the CRUSH code with the latest from userspace.

  There is also a long series from Zheng that fixes various issues with
  snapshots, inline data, and directory fsync, some simplification and
  improvement in the cap release code, and a rework of the caching of
  directory contents.

  To top it off there are a few small fixes and cleanups from Benoit and
  Hong"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (40 commits)
  rbd: use GFP_NOIO in rbd_obj_request_create()
  crush: fix a bug in tree bucket decode
  libceph: Fix ceph_tcp_sendpage()'s more boolean usage
  libceph: Remove spurious kunmap() of the zero page
  rbd: queue_depth map option
  rbd: store rbd_options in rbd_device
  rbd: terminate rbd_opts_tokens with Opt_err
  ceph: fix ceph_writepages_start()
  rbd: bump queue_max_segments
  ceph: rework dcache readdir
  crush: sync up with userspace
  crush: fix crash from invalid 'take' argument
  ceph: switch some GFP_NOFS memory allocation to GFP_KERNEL
  ceph: pre-allocate data structure that tracks caps flushing
  ceph: re-send flushing caps (which are revoked) in reconnect stage
  ceph: send TID of the oldest pending caps flush to MDS
  ceph: track pending caps flushing globally
  ceph: track pending caps flushing accurately
  libceph: fix wrong name "Ceph filesystem for Linux"
  ceph: fix directory fsync
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: fix ceph_writepages_start()</title>
<updated>2015-06-25T15:30:53+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-06-17T19:10:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e1966b49446a43994c3f25a07d0eb4d05660b429'/>
<id>e1966b49446a43994c3f25a07d0eb4d05660b429</id>
<content type='text'>
Before a page get locked, someone else can write data to the page
and increase the i_size. So we should re-check the i_size after
pages are locked.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before a page get locked, someone else can write data to the page
and increase the i_size. So we should re-check the i_size after
pages are locked.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: rework dcache readdir</title>
<updated>2015-06-25T08:49:32+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-06-16T12:48:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fdd4e15838e59c394a1ec4963b57c22c12608685'/>
<id>fdd4e15838e59c394a1ec4963b57c22c12608685</id>
<content type='text'>
Previously our dcache readdir code relies on that child dentries in
directory dentry's d_subdir list are sorted by dentry's offset in
descending order. When adding dentries to the dcache, if a dentry
already exists, our readdir code moves it to head of directory
dentry's d_subdir list. This design relies on dcache internals.
Al Viro suggests using ncpfs's approach: keeping array of pointers
to dentries in page cache of directory inode. the validity of those
pointers are presented by directory inode's complete and ordered
flags. When a dentry gets pruned, we clear directory inode's complete
flag in the d_prune() callback. Before moving a dentry to other
directory, we clear the ordered flag for both old and new directory.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously our dcache readdir code relies on that child dentries in
directory dentry's d_subdir list are sorted by dentry's offset in
descending order. When adding dentries to the dcache, if a dentry
already exists, our readdir code moves it to head of directory
dentry's d_subdir list. This design relies on dcache internals.
Al Viro suggests using ncpfs's approach: keeping array of pointers
to dentries in page cache of directory inode. the validity of those
pointers are presented by directory inode's complete and ordered
flags. When a dentry gets pruned, we clear directory inode's complete
flag in the d_prune() callback. Before moving a dentry to other
directory, we clear the ordered flag for both old and new directory.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: switch some GFP_NOFS memory allocation to GFP_KERNEL</title>
<updated>2015-06-25T08:49:31+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-06-13T09:27:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=687265e5a885d6308f5d73e738efe3c2674fa218'/>
<id>687265e5a885d6308f5d73e738efe3c2674fa218</id>
<content type='text'>
GFP_NOFS memory allocation is required for page writeback path.
But there is no need to use GFP_NOFS in syscall path and readpage
path

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
GFP_NOFS memory allocation is required for page writeback path.
But there is no need to use GFP_NOFS in syscall path and readpage
path

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: pre-allocate data structure that tracks caps flushing</title>
<updated>2015-06-25T08:49:31+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-06-10T09:26:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f66fd9f0952187d274c13c136b74548f792c1925'/>
<id>f66fd9f0952187d274c13c136b74548f792c1925</id>
<content type='text'>
Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: re-send flushing caps (which are revoked) in reconnect stage</title>
<updated>2015-06-25T08:49:31+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-06-10T07:17:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e548e9b93d3e565e42b938a99804114565be1f81'/>
<id>e548e9b93d3e565e42b938a99804114565be1f81</id>
<content type='text'>
if flushing caps were revoked, we should re-send the cap flush in
client reconnect stage. This guarantees that MDS processes the cap
flush message before issuing the flushing caps to other client.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
if flushing caps were revoked, we should re-send the cap flush in
client reconnect stage. This guarantees that MDS processes the cap
flush message before issuing the flushing caps to other client.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
