<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/fs/proc, branch v4.9.121</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>proc: Fix proc_sys_prune_dcache to hold a sb reference</title>
<updated>2018-08-15T16:14:43+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2017-07-06T13:41:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a3a7b992b240ba621a47ff2d3465fa4f0534e297'/>
<id>a3a7b992b240ba621a47ff2d3465fa4f0534e297</id>
<content type='text'>
commit 2fd1d2c4ceb2248a727696962cf3370dc9f5a0a4 upstream.

Andrei Vagin writes:
FYI: This bug has been reproduced on 4.11.7
&gt; BUG: Dentry ffff895a3dd01240{i=4e7c09a,n=lo}  still in use (1) [unmount of proc proc]
&gt; ------------[ cut here ]------------
&gt; WARNING: CPU: 1 PID: 13588 at fs/dcache.c:1445 umount_check+0x6e/0x80
&gt; CPU: 1 PID: 13588 Comm: kworker/1:1 Not tainted 4.11.7-200.fc25.x86_64 #1
&gt; Hardware name: CompuLab sbc-flt1/fitlet, BIOS SBCFLT_0.08.04 06/27/2015
&gt; Workqueue: events proc_cleanup_work
&gt; Call Trace:
&gt;  dump_stack+0x63/0x86
&gt;  __warn+0xcb/0xf0
&gt;  warn_slowpath_null+0x1d/0x20
&gt;  umount_check+0x6e/0x80
&gt;  d_walk+0xc6/0x270
&gt;  ? dentry_free+0x80/0x80
&gt;  do_one_tree+0x26/0x40
&gt;  shrink_dcache_for_umount+0x2d/0x90
&gt;  generic_shutdown_super+0x1f/0xf0
&gt;  kill_anon_super+0x12/0x20
&gt;  proc_kill_sb+0x40/0x50
&gt;  deactivate_locked_super+0x43/0x70
&gt;  deactivate_super+0x5a/0x60
&gt;  cleanup_mnt+0x3f/0x90
&gt;  mntput_no_expire+0x13b/0x190
&gt;  kern_unmount+0x3e/0x50
&gt;  pid_ns_release_proc+0x15/0x20
&gt;  proc_cleanup_work+0x15/0x20
&gt;  process_one_work+0x197/0x450
&gt;  worker_thread+0x4e/0x4a0
&gt;  kthread+0x109/0x140
&gt;  ? process_one_work+0x450/0x450
&gt;  ? kthread_park+0x90/0x90
&gt;  ret_from_fork+0x2c/0x40
&gt; ---[ end trace e1c109611e5d0b41 ]---
&gt; VFS: Busy inodes after unmount of proc. Self-destruct in 5 seconds.  Have a nice day...
&gt; BUG: unable to handle kernel NULL pointer dereference at           (null)
&gt; IP: _raw_spin_lock+0xc/0x30
&gt; PGD 0

Fix this by taking a reference to the super block in proc_sys_prune_dcache.

The superblock reference is the core of the fix however the sysctl_inodes
list is converted to a hlist so that hlist_del_init_rcu may be used.  This
allows proc_sys_prune_dache to remove inodes the sysctl_inodes list, while
not causing problems for proc_sys_evict_inode when if it later choses to
remove the inode from the sysctl_inodes list.  Removing inodes from the
sysctl_inodes list allows proc_sys_prune_dcache to have a progress
guarantee, while still being able to drop all locks.  The fact that
head-&gt;unregistering is set in start_unregistering ensures that no more
inodes will be added to the the sysctl_inodes list.

Previously the code did a dance where it delayed calling iput until the
next entry in the list was being considered to ensure the inode remained on
the sysctl_inodes list until the next entry was walked to.  The structure
of the loop in this patch does not need that so is much easier to
understand and maintain.

Cc: stable@vger.kernel.org
Reported-by: Andrei Vagin &lt;avagin@gmail.com&gt;
Tested-by: Andrei Vagin &lt;avagin@openvz.org&gt;
Fixes: ace0c791e6c3 ("proc/sysctl: Don't grab i_lock under sysctl_lock.")
Fixes: d6cffbbe9a7e ("proc/sysctl: prune stale dentries during unregistering")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2fd1d2c4ceb2248a727696962cf3370dc9f5a0a4 upstream.

Andrei Vagin writes:
FYI: This bug has been reproduced on 4.11.7
&gt; BUG: Dentry ffff895a3dd01240{i=4e7c09a,n=lo}  still in use (1) [unmount of proc proc]
&gt; ------------[ cut here ]------------
&gt; WARNING: CPU: 1 PID: 13588 at fs/dcache.c:1445 umount_check+0x6e/0x80
&gt; CPU: 1 PID: 13588 Comm: kworker/1:1 Not tainted 4.11.7-200.fc25.x86_64 #1
&gt; Hardware name: CompuLab sbc-flt1/fitlet, BIOS SBCFLT_0.08.04 06/27/2015
&gt; Workqueue: events proc_cleanup_work
&gt; Call Trace:
&gt;  dump_stack+0x63/0x86
&gt;  __warn+0xcb/0xf0
&gt;  warn_slowpath_null+0x1d/0x20
&gt;  umount_check+0x6e/0x80
&gt;  d_walk+0xc6/0x270
&gt;  ? dentry_free+0x80/0x80
&gt;  do_one_tree+0x26/0x40
&gt;  shrink_dcache_for_umount+0x2d/0x90
&gt;  generic_shutdown_super+0x1f/0xf0
&gt;  kill_anon_super+0x12/0x20
&gt;  proc_kill_sb+0x40/0x50
&gt;  deactivate_locked_super+0x43/0x70
&gt;  deactivate_super+0x5a/0x60
&gt;  cleanup_mnt+0x3f/0x90
&gt;  mntput_no_expire+0x13b/0x190
&gt;  kern_unmount+0x3e/0x50
&gt;  pid_ns_release_proc+0x15/0x20
&gt;  proc_cleanup_work+0x15/0x20
&gt;  process_one_work+0x197/0x450
&gt;  worker_thread+0x4e/0x4a0
&gt;  kthread+0x109/0x140
&gt;  ? process_one_work+0x450/0x450
&gt;  ? kthread_park+0x90/0x90
&gt;  ret_from_fork+0x2c/0x40
&gt; ---[ end trace e1c109611e5d0b41 ]---
&gt; VFS: Busy inodes after unmount of proc. Self-destruct in 5 seconds.  Have a nice day...
&gt; BUG: unable to handle kernel NULL pointer dereference at           (null)
&gt; IP: _raw_spin_lock+0xc/0x30
&gt; PGD 0

Fix this by taking a reference to the super block in proc_sys_prune_dcache.

The superblock reference is the core of the fix however the sysctl_inodes
list is converted to a hlist so that hlist_del_init_rcu may be used.  This
allows proc_sys_prune_dache to remove inodes the sysctl_inodes list, while
not causing problems for proc_sys_evict_inode when if it later choses to
remove the inode from the sysctl_inodes list.  Removing inodes from the
sysctl_inodes list allows proc_sys_prune_dcache to have a progress
guarantee, while still being able to drop all locks.  The fact that
head-&gt;unregistering is set in start_unregistering ensures that no more
inodes will be added to the the sysctl_inodes list.

Previously the code did a dance where it delayed calling iput until the
next entry in the list was being considered to ensure the inode remained on
the sysctl_inodes list until the next entry was walked to.  The structure
of the loop in this patch does not need that so is much easier to
understand and maintain.

Cc: stable@vger.kernel.org
Reported-by: Andrei Vagin &lt;avagin@gmail.com&gt;
Tested-by: Andrei Vagin &lt;avagin@openvz.org&gt;
Fixes: ace0c791e6c3 ("proc/sysctl: Don't grab i_lock under sysctl_lock.")
Fixes: d6cffbbe9a7e ("proc/sysctl: prune stale dentries during unregistering")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>proc/sysctl: Don't grab i_lock under sysctl_lock.</title>
<updated>2018-08-15T16:14:43+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2017-02-20T05:17:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=631f93a6fe847d2d317010d5bbd7cb3bcc284336'/>
<id>631f93a6fe847d2d317010d5bbd7cb3bcc284336</id>
<content type='text'>
commit ace0c791e6c3cf5ef37cad2df69f0d90ccc40ffb upstream.

Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt; writes:
&gt; This patch has locking problem. I've got lockdep splat under LTP.
&gt;
&gt; [ 6633.115456] ======================================================
&gt; [ 6633.115502] [ INFO: possible circular locking dependency detected ]
&gt; [ 6633.115553] 4.9.10-debug+ #9 Tainted: G             L
&gt; [ 6633.115584] -------------------------------------------------------
&gt; [ 6633.115627] ksm02/284980 is trying to acquire lock:
&gt; [ 6633.115659]  (&amp;sb-&gt;s_type-&gt;i_lock_key#4){+.+...}, at: [&lt;ffffffff816bc1ce&gt;] igrab+0x1e/0x80
&gt; [ 6633.115834] but task is already holding lock:
&gt; [ 6633.115882]  (sysctl_lock){+.+...}, at: [&lt;ffffffff817e379b&gt;] unregister_sysctl_table+0x6b/0x110
&gt; [ 6633.116026] which lock already depends on the new lock.
&gt; [ 6633.116026]
&gt; [ 6633.116080]
&gt; [ 6633.116080] the existing dependency chain (in reverse order) is:
&gt; [ 6633.116117]
&gt; -&gt; #2 (sysctl_lock){+.+...}:
&gt; -&gt; #1 (&amp;(&amp;dentry-&gt;d_lockref.lock)-&gt;rlock){+.+...}:
&gt; -&gt; #0 (&amp;sb-&gt;s_type-&gt;i_lock_key#4){+.+...}:
&gt;
&gt; d_lock nests inside i_lock
&gt; sysctl_lock nests inside d_lock in d_compare
&gt;
&gt; This patch adds i_lock nesting inside sysctl_lock.

Al Viro &lt;viro@ZenIV.linux.org.uk&gt; replied:
&gt; Once -&gt;unregistering is set, you can drop sysctl_lock just fine.  So I'd
&gt; try something like this - use rcu_read_lock() in proc_sys_prune_dcache(),
&gt; drop sysctl_lock() before it and regain after.  Make sure that no inodes
&gt; are added to the list ones -&gt;unregistering has been set and use RCU list
&gt; primitives for modifying the inode list, with sysctl_lock still used to
&gt; serialize its modifications.
&gt;
&gt; Freeing struct inode is RCU-delayed (see proc_destroy_inode()), so doing
&gt; igrab() is safe there.  Since we don't drop inode reference until after we'd
&gt; passed beyond it in the list, list_for_each_entry_rcu() should be fine.

I agree with Al Viro's analsysis of the situtation.

Fixes: d6cffbbe9a7e ("proc/sysctl: prune stale dentries during unregistering")
Reported-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Tested-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Suggested-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ace0c791e6c3cf5ef37cad2df69f0d90ccc40ffb upstream.

Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt; writes:
&gt; This patch has locking problem. I've got lockdep splat under LTP.
&gt;
&gt; [ 6633.115456] ======================================================
&gt; [ 6633.115502] [ INFO: possible circular locking dependency detected ]
&gt; [ 6633.115553] 4.9.10-debug+ #9 Tainted: G             L
&gt; [ 6633.115584] -------------------------------------------------------
&gt; [ 6633.115627] ksm02/284980 is trying to acquire lock:
&gt; [ 6633.115659]  (&amp;sb-&gt;s_type-&gt;i_lock_key#4){+.+...}, at: [&lt;ffffffff816bc1ce&gt;] igrab+0x1e/0x80
&gt; [ 6633.115834] but task is already holding lock:
&gt; [ 6633.115882]  (sysctl_lock){+.+...}, at: [&lt;ffffffff817e379b&gt;] unregister_sysctl_table+0x6b/0x110
&gt; [ 6633.116026] which lock already depends on the new lock.
&gt; [ 6633.116026]
&gt; [ 6633.116080]
&gt; [ 6633.116080] the existing dependency chain (in reverse order) is:
&gt; [ 6633.116117]
&gt; -&gt; #2 (sysctl_lock){+.+...}:
&gt; -&gt; #1 (&amp;(&amp;dentry-&gt;d_lockref.lock)-&gt;rlock){+.+...}:
&gt; -&gt; #0 (&amp;sb-&gt;s_type-&gt;i_lock_key#4){+.+...}:
&gt;
&gt; d_lock nests inside i_lock
&gt; sysctl_lock nests inside d_lock in d_compare
&gt;
&gt; This patch adds i_lock nesting inside sysctl_lock.

Al Viro &lt;viro@ZenIV.linux.org.uk&gt; replied:
&gt; Once -&gt;unregistering is set, you can drop sysctl_lock just fine.  So I'd
&gt; try something like this - use rcu_read_lock() in proc_sys_prune_dcache(),
&gt; drop sysctl_lock() before it and regain after.  Make sure that no inodes
&gt; are added to the list ones -&gt;unregistering has been set and use RCU list
&gt; primitives for modifying the inode list, with sysctl_lock still used to
&gt; serialize its modifications.
&gt;
&gt; Freeing struct inode is RCU-delayed (see proc_destroy_inode()), so doing
&gt; igrab() is safe there.  Since we don't drop inode reference until after we'd
&gt; passed beyond it in the list, list_for_each_entry_rcu() should be fine.

I agree with Al Viro's analsysis of the situtation.

Fixes: d6cffbbe9a7e ("proc/sysctl: prune stale dentries during unregistering")
Reported-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Tested-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Suggested-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>proc/sysctl: prune stale dentries during unregistering</title>
<updated>2018-08-15T16:14:43+00:00</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2017-02-10T07:35:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b96e215e539509cae8bfe468689b70661cf511b4'/>
<id>b96e215e539509cae8bfe468689b70661cf511b4</id>
<content type='text'>
commit d6cffbbe9a7e51eb705182965a189457c17ba8a3 upstream.

Currently unregistering sysctl table does not prune its dentries.
Stale dentries could slowdown sysctl operations significantly.

For example, command:

 # for i in {1..100000} ; do unshare -n -- sysctl -a &amp;&gt; /dev/null ; done
 creates a millions of stale denties around sysctls of loopback interface:

 # sysctl fs.dentry-state
 fs.dentry-state = 25812579  24724135        45      0       0       0

 All of them have matching names thus lookup have to scan though whole
 hash chain and call d_compare (proc_sys_compare) which checks them
 under system-wide spinlock (sysctl_lock).

 # time sysctl -a &gt; /dev/null
 real    1m12.806s
 user    0m0.016s
 sys     1m12.400s

Currently only memory reclaimer could remove this garbage.
But without significant memory pressure this never happens.

This patch collects sysctl inodes into list on sysctl table header and
prunes all their dentries once that table unregisters.

Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt; writes:
&gt; On 10.02.2017 10:47, Al Viro wrote:
&gt;&gt; how about &gt;&gt; the matching stats *after* that patch?
&gt;
&gt; dcache size doesn't grow endlessly, so stats are fine
&gt;
&gt; # sysctl fs.dentry-state
&gt; fs.dentry-state = 92712	58376	45	0	0	0
&gt;
&gt; # time sysctl -a &amp;&gt;/dev/null
&gt;
&gt; real	0m0.013s
&gt; user	0m0.004s
&gt; sys	0m0.008s

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Suggested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d6cffbbe9a7e51eb705182965a189457c17ba8a3 upstream.

Currently unregistering sysctl table does not prune its dentries.
Stale dentries could slowdown sysctl operations significantly.

For example, command:

 # for i in {1..100000} ; do unshare -n -- sysctl -a &amp;&gt; /dev/null ; done
 creates a millions of stale denties around sysctls of loopback interface:

 # sysctl fs.dentry-state
 fs.dentry-state = 25812579  24724135        45      0       0       0

 All of them have matching names thus lookup have to scan though whole
 hash chain and call d_compare (proc_sys_compare) which checks them
 under system-wide spinlock (sysctl_lock).

 # time sysctl -a &gt; /dev/null
 real    1m12.806s
 user    0m0.016s
 sys     1m12.400s

Currently only memory reclaimer could remove this garbage.
But without significant memory pressure this never happens.

This patch collects sysctl inodes into list on sysctl table header and
prunes all their dentries once that table unregisters.

Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt; writes:
&gt; On 10.02.2017 10:47, Al Viro wrote:
&gt;&gt; how about &gt;&gt; the matching stats *after* that patch?
&gt;
&gt; dcache size doesn't grow endlessly, so stats are fine
&gt;
&gt; # sysctl fs.dentry-state
&gt; fs.dentry-state = 92712	58376	45	0	0	0
&gt;
&gt; # time sysctl -a &amp;&gt;/dev/null
&gt;
&gt; real	0m0.013s
&gt; user	0m0.004s
&gt; sys	0m0.008s

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Suggested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table</title>
<updated>2018-05-30T05:50:40+00:00</updated>
<author>
<name>Danilo Krummrich</name>
<email>danilokrummrich@dk-develop.de</email>
</author>
<published>2018-04-10T23:31:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b62143830170e14ccd94c2e340a2ce8f2f4c777b'/>
<id>b62143830170e14ccd94c2e340a2ce8f2f4c777b</id>
<content type='text'>
[ Upstream commit a0b0d1c345d0317efe594df268feb5ccc99f651e ]

proc_sys_link_fill_cache() does not take currently unregistering sysctl
tables into account, which might result into a page fault in
sysctl_follow_link() - add a check to fix it.

This bug has been present since v3.4.

Link: http://lkml.kernel.org/r/20180228013506.4915-1-danilokrummrich@dk-develop.de
Fixes: 0e47c99d7fe25 ("sysctl: Replace root_list with links between sysctl_table_sets")
Signed-off-by: Danilo Krummrich &lt;danilokrummrich@dk-develop.de&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: "Luis R . Rodriguez" &lt;mcgrof@kernel.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit a0b0d1c345d0317efe594df268feb5ccc99f651e ]

proc_sys_link_fill_cache() does not take currently unregistering sysctl
tables into account, which might result into a page fault in
sysctl_follow_link() - add a check to fix it.

This bug has been present since v3.4.

Link: http://lkml.kernel.org/r/20180228013506.4915-1-danilokrummrich@dk-develop.de
Fixes: 0e47c99d7fe25 ("sysctl: Replace root_list with links between sysctl_table_sets")
Signed-off-by: Danilo Krummrich &lt;danilokrummrich@dk-develop.de&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: "Luis R . Rodriguez" &lt;mcgrof@kernel.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page</title>
<updated>2018-05-30T05:50:26+00:00</updated>
<author>
<name>Jia Zhang</name>
<email>zhang.jia@linux.alibaba.com</email>
</author>
<published>2018-02-12T14:44:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=059befd4e0ae7ad7c54d5d292a3cb75b51ff4bf9'/>
<id>059befd4e0ae7ad7c54d5d292a3cb75b51ff4bf9</id>
<content type='text'>
[ Upstream commit 595dd46ebfc10be041a365d0a3fa99df50b6ba73 ]

Commit:

  df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")

... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y.
However, accessing the vsyscall user page will cause an SMAP fault.

Replace memcpy() with copy_from_user() to fix this bug works, but adding
a common way to handle this sort of user page may be useful for future.

Currently, only vsyscall page requires KCORE_USER.

Signed-off-by: Jia Zhang &lt;zhang.jia@linux.alibaba.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: jolsa@redhat.com
Link: http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang.jia@linux.alibaba.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 595dd46ebfc10be041a365d0a3fa99df50b6ba73 ]

Commit:

  df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")

... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y.
However, accessing the vsyscall user page will cause an SMAP fault.

Replace memcpy() with copy_from_user() to fix this bug works, but adding
a common way to handle this sort of user page may be useful for future.

Currently, only vsyscall page requires KCORE_USER.

Signed-off-by: Jia Zhang &lt;zhang.jia@linux.alibaba.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: jolsa@redhat.com
Link: http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang.jia@linux.alibaba.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: fix /proc/*/map_files lookup</title>
<updated>2018-05-30T05:50:25+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2018-02-06T23:36:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e0a1a0173aceffb45c8ab008d3d37c096224af56'/>
<id>e0a1a0173aceffb45c8ab008d3d37c096224af56</id>
<content type='text'>
[ Upstream commit ac7f1061c2c11bb8936b1b6a94cdb48de732f7a4 ]

Current code does:

	if (sscanf(dentry-&gt;d_name.name, "%lx-%lx", start, end) != 2)

However sscanf() is broken garbage.

It silently accepts whitespace between format specifiers
(did you know that?).

It silently accepts valid strings which result in integer overflow.

Do not use sscanf() for any even remotely reliable parsing code.

	OK
	# readlink '/proc/1/map_files/55a23af39000-55a23b05b000'
	/lib/systemd/systemd

	broken
	# readlink '/proc/1/map_files/               55a23af39000-55a23b05b000'
	/lib/systemd/systemd

	broken
	# readlink '/proc/1/map_files/55a23af39000-55a23b05b000    '
	/lib/systemd/systemd

	very broken
	# readlink '/proc/1/map_files/1000000000000000055a23af39000-55a23b05b000'
	/lib/systemd/systemd

Andrei said:

: This patch breaks criu.  It was a bug in criu.  And this bug is on a minor
: path, which works when memfd_create() isn't available.  It is a reason why
: I ask to not backport this patch to stable kernels.
:
: In CRIU this bug can be triggered, only if this patch will be backported
: to a kernel which version is lower than v3.16.

Link: http://lkml.kernel.org/r/20171120212706.GA14325@avx2
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Cc: Andrei Vagin &lt;avagin@virtuozzo.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit ac7f1061c2c11bb8936b1b6a94cdb48de732f7a4 ]

Current code does:

	if (sscanf(dentry-&gt;d_name.name, "%lx-%lx", start, end) != 2)

However sscanf() is broken garbage.

It silently accepts whitespace between format specifiers
(did you know that?).

It silently accepts valid strings which result in integer overflow.

Do not use sscanf() for any even remotely reliable parsing code.

	OK
	# readlink '/proc/1/map_files/55a23af39000-55a23b05b000'
	/lib/systemd/systemd

	broken
	# readlink '/proc/1/map_files/               55a23af39000-55a23b05b000'
	/lib/systemd/systemd

	broken
	# readlink '/proc/1/map_files/55a23af39000-55a23b05b000    '
	/lib/systemd/systemd

	very broken
	# readlink '/proc/1/map_files/1000000000000000055a23af39000-55a23b05b000'
	/lib/systemd/systemd

Andrei said:

: This patch breaks criu.  It was a bug in criu.  And this bug is on a minor
: path, which works when memfd_create() isn't available.  It is a reason why
: I ask to not backport this patch to stable kernels.
:
: In CRIU this bug can be triggered, only if this patch will be backported
: to a kernel which version is lower than v3.16.

Link: http://lkml.kernel.org/r/20171120212706.GA14325@avx2
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Cc: Andrei Vagin &lt;avagin@virtuozzo.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: Use underscores for SSBD in 'status'</title>
<updated>2018-05-22T14:58:02+00:00</updated>
<author>
<name>Konrad Rzeszutek Wilk</name>
<email>konrad.wilk@oracle.com</email>
</author>
<published>2018-05-09T19:41:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f8cd89f5e05d49422315e60ec2db9fcb66d25aca'/>
<id>f8cd89f5e05d49422315e60ec2db9fcb66d25aca</id>
<content type='text'>
commit e96f46ee8587607a828f783daa6eb5b44d25004d upstream

The style for the 'status' file is CamelCase or this. _.

Fixes: fae1fa0fc ("proc: Provide details on speculation flaw mitigations")
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e96f46ee8587607a828f783daa6eb5b44d25004d upstream

The style for the 'status' file is CamelCase or this. _.

Fixes: fae1fa0fc ("proc: Provide details on speculation flaw mitigations")
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>prctl: Add force disable speculation</title>
<updated>2018-05-22T14:58:01+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2018-05-03T20:09:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=036608d62a838aeb63cae0adaf8ac773cb53148c'/>
<id>036608d62a838aeb63cae0adaf8ac773cb53148c</id>
<content type='text'>
commit 356e4bfff2c5489e016fdb925adbf12a1e3950ee upstream

For certain use cases it is desired to enforce mitigations so they cannot
be undone afterwards. That's important for loader stubs which want to
prevent a child from disabling the mitigation again. Will also be used for
seccomp(). The extra state preserving of the prctl state for SSB is a
preparatory step for EBPF dymanic speculation control.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 356e4bfff2c5489e016fdb925adbf12a1e3950ee upstream

For certain use cases it is desired to enforce mitigations so they cannot
be undone afterwards. That's important for loader stubs which want to
prevent a child from disabling the mitigation again. Will also be used for
seccomp(). The extra state preserving of the prctl state for SSB is a
preparatory step for EBPF dymanic speculation control.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: Provide details on speculation flaw mitigations</title>
<updated>2018-05-22T14:58:01+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2018-05-01T22:31:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=51ef9af2a35bbc21334c801fd15cbfe01210760f'/>
<id>51ef9af2a35bbc21334c801fd15cbfe01210760f</id>
<content type='text'>
commit fae1fa0fc6cca8beee3ab8ed71d54f9a78fa3f64 upstream

As done with seccomp and no_new_privs, also show speculation flaw
mitigation state in /proc/$pid/status.

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit fae1fa0fc6cca8beee3ab8ed71d54f9a78fa3f64 upstream

As done with seccomp and no_new_privs, also show speculation flaw
mitigation state in /proc/$pid/status.

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: do not access cmdline nor environ from file-backed areas</title>
<updated>2018-05-19T08:27:00+00:00</updated>
<author>
<name>Willy Tarreau</name>
<email>w@1wt.eu</email>
</author>
<published>2018-05-11T06:11:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6f1abf8628b750905606996fd5ff5ea22d149238'/>
<id>6f1abf8628b750905606996fd5ff5ea22d149238</id>
<content type='text'>
commit 7f7ccc2ccc2e70c6054685f5e3522efa81556830 upstream.

proc_pid_cmdline_read() and environ_read() directly access the target
process' VM to retrieve the command line and environment. If this
process remaps these areas onto a file via mmap(), the requesting
process may experience various issues such as extra delays if the
underlying device is slow to respond.

Let's simply refuse to access file-backed areas in these functions.
For this we add a new FOLL_ANON gup flag that is passed to all calls
to access_remote_vm(). The code already takes care of such failures
(including unmapped areas). Accesses via /proc/pid/mem were not
changed though.

This was assigned CVE-2018-1120.

Note for stable backports: the patch may apply to kernels prior to 4.11
but silently miss one location; it must be checked that no call to
access_remote_vm() keeps zero as the last argument.

Reported-by: Qualys Security Advisory &lt;qsa@qualys.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7f7ccc2ccc2e70c6054685f5e3522efa81556830 upstream.

proc_pid_cmdline_read() and environ_read() directly access the target
process' VM to retrieve the command line and environment. If this
process remaps these areas onto a file via mmap(), the requesting
process may experience various issues such as extra delays if the
underlying device is slow to respond.

Let's simply refuse to access file-backed areas in these functions.
For this we add a new FOLL_ANON gup flag that is passed to all calls
to access_remote_vm(). The code already takes care of such failures
(including unmapped areas). Accesses via /proc/pid/mem were not
changed though.

This was assigned CVE-2018-1120.

Note for stable backports: the patch may apply to kernels prior to 4.11
but silently miss one location; it must be checked that no call to
access_remote_vm() keeps zero as the last argument.

Reported-by: Qualys Security Advisory &lt;qsa@qualys.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
