<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include, branch v3.10.2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>nbd: correct disconnect behavior</title>
<updated>2013-07-22T01:21:29+00:00</updated>
<author>
<name>Paul Clements</name>
<email>paul.clements@steeleye.com</email>
</author>
<published>2013-07-03T22:09:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=323af551c09ddc7cac1c22486b1419aeb1cccdd5'/>
<id>323af551c09ddc7cac1c22486b1419aeb1cccdd5</id>
<content type='text'>
commit c378f70adbc1bbecd9e6db145019f14b2f688c7c upstream.

Currently, when a disconnect is requested by the user (via NBD_DISCONNECT
ioctl) the return from NBD_DO_IT is undefined (it is usually one of
several error codes).  This means that nbd-client does not know if a
manual disconnect was performed or whether a network error occurred.
Because of this, nbd-client's persist mode (which tries to reconnect after
error, but not after manual disconnect) does not always work correctly.

This change fixes this by causing NBD_DO_IT to always return 0 if a user
requests a disconnect.  This means that nbd-client can correctly either
persist the connection (if an error occurred) or disconnect (if the user
requested it).

Signed-off-by: Paul Clements &lt;paul.clements@steeleye.com&gt;
Acked-by: Rob Landley &lt;rob@landley.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c378f70adbc1bbecd9e6db145019f14b2f688c7c upstream.

Currently, when a disconnect is requested by the user (via NBD_DISCONNECT
ioctl) the return from NBD_DO_IT is undefined (it is usually one of
several error codes).  This means that nbd-client does not know if a
manual disconnect was performed or whether a network error occurred.
Because of this, nbd-client's persist mode (which tries to reconnect after
error, but not after manual disconnect) does not always work correctly.

This change fixes this by causing NBD_DO_IT to always return 0 if a user
requests a disconnect.  This means that nbd-client can correctly either
persist the connection (if an error occurred) or disconnect (if the user
requested it).

Signed-off-by: Paul Clements &lt;paul.clements@steeleye.com&gt;
Acked-by: Rob Landley &lt;rob@landley.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: fix RCU accesses to task-&gt;cgroups</title>
<updated>2013-07-22T01:21:25+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-06-25T18:48:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b6891ed4e66b65e5d6bb36964af0d65a08590018'/>
<id>b6891ed4e66b65e5d6bb36964af0d65a08590018</id>
<content type='text'>
commit 14611e51a57df10240817d8ada510842faf0ec51 upstream.

task-&gt;cgroups is a RCU pointer pointing to struct css_set.  A task
switches to a different css_set on cgroup migration but a css_set
doesn't change once created and its pointers to cgroup_subsys_states
aren't RCU protected.

task_subsys_state[_check]() is the macro to acquire css given a task
and subsys_id pair.  It RCU-dereferences task-&gt;cgroups-&gt;subsys[] not
task-&gt;cgroups, so the RCU pointer task-&gt;cgroups ends up being
dereferenced without read_barrier_depends() after it.  It's broken.

Fix it by introducing task_css_set[_check]() which does
RCU-dereference on task-&gt;cgroups.  task_subsys_state[_check]() is
reimplemented to directly dereference -&gt;subsys[] of the css_set
returned from task_css_set[_check]().

This removes some of sparse RCU warnings in cgroup.

v2: Fixed unbalanced parenthsis and there's no need to use
    rcu_dereference_raw() when !CONFIG_PROVE_RCU.  Both spotted by Li.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 14611e51a57df10240817d8ada510842faf0ec51 upstream.

task-&gt;cgroups is a RCU pointer pointing to struct css_set.  A task
switches to a different css_set on cgroup migration but a css_set
doesn't change once created and its pointers to cgroup_subsys_states
aren't RCU protected.

task_subsys_state[_check]() is the macro to acquire css given a task
and subsys_id pair.  It RCU-dereferences task-&gt;cgroups-&gt;subsys[] not
task-&gt;cgroups, so the RCU pointer task-&gt;cgroups ends up being
dereferenced without read_barrier_depends() after it.  It's broken.

Fix it by introducing task_css_set[_check]() which does
RCU-dereference on task-&gt;cgroups.  task_subsys_state[_check]() is
reimplemented to directly dereference -&gt;subsys[] of the css_set
returned from task_css_set[_check]().

This removes some of sparse RCU warnings in cgroup.

v2: Fixed unbalanced parenthsis and there's no need to use
    rcu_dereference_raw() when !CONFIG_PROVE_RCU.  Both spotted by Li.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>futex: Take hugepages into account when generating futex_key</title>
<updated>2013-07-13T18:42:26+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>wetpzy@gmail.com</email>
</author>
<published>2013-06-25T13:19:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ab1842f10b18f847893dad5bfe513f1392cbe797'/>
<id>ab1842f10b18f847893dad5bfe513f1392cbe797</id>
<content type='text'>
commit 13d60f4b6ab5b702dc8d2ee20999f98a93728aec upstream.

The futex_keys of process shared futexes are generated from the page
offset, the mapping host and the mapping index of the futex user space
address. This should result in an unique identifier for each futex.

Though this is not true when futexes are located in different subpages
of an hugepage. The reason is, that the mapping index for all those
futexes evaluates to the index of the base page of the hugetlbfs
mapping. So a futex at offset 0 of the hugepage mapping and another
one at offset PAGE_SIZE of the same hugepage mapping have identical
futex_keys. This happens because the futex code blindly uses
page-&gt;index.

Steps to reproduce the bug:

1. Map a file from hugetlbfs. Initialize pthread_mutex1 at offset 0
   and pthread_mutex2 at offset PAGE_SIZE of the hugetlbfs
   mapping.

   The mutexes must be initialized as PTHREAD_PROCESS_SHARED because
   PTHREAD_PROCESS_PRIVATE mutexes are not affected by this issue as
   their keys solely depend on the user space address.

2. Lock mutex1 and mutex2

3. Create thread1 and in the thread function lock mutex1, which
   results in thread1 blocking on the locked mutex1.

4. Create thread2 and in the thread function lock mutex2, which
   results in thread2 blocking on the locked mutex2.

5. Unlock mutex2. Despite the fact that mutex2 got unlocked, thread2
   still blocks on mutex2 because the futex_key points to mutex1.

To solve this issue we need to take the normal page index of the page
which contains the futex into account, if the futex is in an hugetlbfs
mapping. In other words, we calculate the normal page mapping index of
the subpage in the hugetlbfs mapping.

Mappings which are not based on hugetlbfs are not affected and still
use page-&gt;index.

Thanks to Mel Gorman who provided a patch for adding proper evaluation
functions to the hugetlbfs code to avoid exposing hugetlbfs specific
details to the futex code.

[ tglx: Massaged changelog ]

Signed-off-by: Zhang Yi &lt;zhang.yi20@zte.com.cn&gt;
Reviewed-by: Jiang Biao &lt;jiang.biao2@zte.com.cn&gt;
Tested-by: Ma Chenggong &lt;ma.chenggong@zte.com.cn&gt;
Reviewed-by: 'Mel Gorman' &lt;mgorman@suse.de&gt;
Acked-by: 'Darren Hart' &lt;dvhart@linux.intel.com&gt;
Cc: 'Peter Zijlstra' &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/000101ce71a6%24a83c5880%24f8b50980%24@com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 13d60f4b6ab5b702dc8d2ee20999f98a93728aec upstream.

The futex_keys of process shared futexes are generated from the page
offset, the mapping host and the mapping index of the futex user space
address. This should result in an unique identifier for each futex.

Though this is not true when futexes are located in different subpages
of an hugepage. The reason is, that the mapping index for all those
futexes evaluates to the index of the base page of the hugetlbfs
mapping. So a futex at offset 0 of the hugepage mapping and another
one at offset PAGE_SIZE of the same hugepage mapping have identical
futex_keys. This happens because the futex code blindly uses
page-&gt;index.

Steps to reproduce the bug:

1. Map a file from hugetlbfs. Initialize pthread_mutex1 at offset 0
   and pthread_mutex2 at offset PAGE_SIZE of the hugetlbfs
   mapping.

   The mutexes must be initialized as PTHREAD_PROCESS_SHARED because
   PTHREAD_PROCESS_PRIVATE mutexes are not affected by this issue as
   their keys solely depend on the user space address.

2. Lock mutex1 and mutex2

3. Create thread1 and in the thread function lock mutex1, which
   results in thread1 blocking on the locked mutex1.

4. Create thread2 and in the thread function lock mutex2, which
   results in thread2 blocking on the locked mutex2.

5. Unlock mutex2. Despite the fact that mutex2 got unlocked, thread2
   still blocks on mutex2 because the futex_key points to mutex1.

To solve this issue we need to take the normal page index of the page
which contains the futex into account, if the futex is in an hugetlbfs
mapping. In other words, we calculate the normal page mapping index of
the subpage in the hugetlbfs mapping.

Mappings which are not based on hugetlbfs are not affected and still
use page-&gt;index.

Thanks to Mel Gorman who provided a patch for adding proper evaluation
functions to the hugetlbfs code to avoid exposing hugetlbfs specific
details to the futex code.

[ tglx: Massaged changelog ]

Signed-off-by: Zhang Yi &lt;zhang.yi20@zte.com.cn&gt;
Reviewed-by: Jiang Biao &lt;jiang.biao2@zte.com.cn&gt;
Tested-by: Ma Chenggong &lt;ma.chenggong@zte.com.cn&gt;
Reviewed-by: 'Mel Gorman' &lt;mgorman@suse.de&gt;
Acked-by: 'Darren Hart' &lt;dvhart@linux.intel.com&gt;
Cc: 'Peter Zijlstra' &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/000101ce71a6%24a83c5880%24f8b50980%24@com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: fix invalid unsigned-&gt;signed conversion for timespec encoding</title>
<updated>2013-07-13T18:42:26+00:00</updated>
<author>
<name>Josh Durgin</name>
<email>josh.durgin@inktank.com</email>
</author>
<published>2013-06-28T20:13:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1b7927f047e3d71b46e36c186903fafbc9a47bf3'/>
<id>1b7927f047e3d71b46e36c186903fafbc9a47bf3</id>
<content type='text'>
commit 8b8cf8917f9b5d74e04f281272d8719ce335a497 upstream.

__kernel_time_t is a long, which cannot hold a U32_MAX on 32-bit
architectures.  Just drop this check as it has limited value.

This fixes a crash like:

[  957.905812] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/include/linux/ceph/decode.h:164!
[  957.914849] Internal error: Oops - BUG: 0 [#1] SMP ARM
[  957.919978] Modules linked in: rbd libceph libcrc32c ipmi_devintf ipmi_si ipmi_msghandler nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc
[  957.932547] CPU: 1    Tainted: G        W     (3.9.0-ceph-19bb6a83-highbank #1)
[  957.939881] PC is at ceph_osdc_build_request+0x8c/0x4f8 [libceph]
[  957.945967] LR is at 0xec520904
[  957.949103] pc : [&lt;bf13e76c&gt;]    lr : [&lt;ec520904&gt;]    psr: 20000153
[  957.949103] sp : ec753df8  ip : 00000001  fp : ec53e100
[  957.960571] r10: ebef25c0  r9 : ec5fa400  r8 : ecbcc000
[  957.965788] r7 : 00000000  r6 : 00000000  r5 : ffffffff  r4 : 00000020
[  957.972307] r3 : 51cc8143  r2 : ec520900  r1 : ec753e58  r0 : ec520908
[  957.978827] Flags: nzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment user
[  957.986039] Control: 10c5387d  Table: 2c59c04a  DAC: 00000015
[  957.991777] Process rbd (pid: 2138, stack limit = 0xec752238)
[  957.997514] Stack: (0xec753df8 to 0xec754000)
[  958.001864] 3de0:                                                       00000001 00000001
[  958.010032] 3e00: 00000001 bf139744 ecbcc000 ec55a0a0 00000024 00000000 ebef25c0 fffffffe
[  958.018204] 3e20: ffffffff 00000000 00000000 00000001 ec5fa400 ebef25c0 ec53e100 bf166b68
[  958.026377] 3e40: 00000000 0000220f fffffffe ffffffff ec753e58 bf13ff24 51cc8143 05b25ed2
[  958.034548] 3e60: 00000001 00000000 00000000 bf1688d4 00000001 00000000 00000000 00000000
[  958.042720] 3e80: 00000001 00000060 ec5fa400 ed53d200 ed439600 ed439300 00000001 00000060
[  958.050888] 3ea0: ec5fa400 ed53d200 00000000 bf16a320 00000000 ec53e100 00000040 ec753eb8
[  958.059059] 3ec0: ec51df00 ed53d7c0 ed53d200 ed53d7c0 00000000 ed53d7c0 ec5fa400 bf16ed70
[  958.067230] 3ee0: 00000000 00000060 00000002 ed53d200 00000000 bf16acf4 ed53d7c0 ec752000
[  958.075402] 3f00: ed980e50 e954f5d8 00000000 00000060 ed53d240 ed53d258 ec753f80 c04f44a8
[  958.083574] 3f20: edb7910c ec664700 01ade920 c02e4c44 00000060 c016b3dc ec51de40 01adfb84
[  958.091745] 3f40: 00000060 ec752000 ec753f80 ec752000 00000060 c0108444 00000007 ec51de48
[  958.099914] 3f60: ed0eb8c0 00000000 00000000 ec51de40 01adfb84 00000001 00000060 c0108858
[  958.108085] 3f80: 00000000 00000000 51cc8143 00000060 01adfb84 00000007 00000004 c000dd68
[  958.116257] 3fa0: 00000000 c000dbc0 00000060 01adfb84 00000007 01adfb84 00000060 01adfb80
[  958.124429] 3fc0: 00000060 01adfb84 00000007 00000004 beded1a8 00000000 01adf2f0 01ade920
[  958.132599] 3fe0: 00000000 beded180 b6811324 b6811334 800f0010 00000007 2e7f5821 2e7f5c21
[  958.140815] [&lt;bf13e76c&gt;] (ceph_osdc_build_request+0x8c/0x4f8 [libceph]) from [&lt;bf166b68&gt;] (rbd_osd_req_format_write+0x50/0x7c [rbd])
[  958.152739] [&lt;bf166b68&gt;] (rbd_osd_req_format_write+0x50/0x7c [rbd]) from [&lt;bf1688d4&gt;] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd])
[  958.164486] [&lt;bf1688d4&gt;] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) from [&lt;bf16a320&gt;] (rbd_dev_image_probe+0x23c/0x850 [rbd])
[  958.175967] [&lt;bf16a320&gt;] (rbd_dev_image_probe+0x23c/0x850 [rbd]) from [&lt;bf16acf4&gt;] (rbd_add+0x3c0/0x918 [rbd])
[  958.185975] [&lt;bf16acf4&gt;] (rbd_add+0x3c0/0x918 [rbd]) from [&lt;c02e4c44&gt;] (bus_attr_store+0x20/0x2c)
[  958.194850] [&lt;c02e4c44&gt;] (bus_attr_store+0x20/0x2c) from [&lt;c016b3dc&gt;] (sysfs_write_file+0x168/0x198)
[  958.203984] [&lt;c016b3dc&gt;] (sysfs_write_file+0x168/0x198) from [&lt;c0108444&gt;] (vfs_write+0x9c/0x170)
[  958.212768] [&lt;c0108444&gt;] (vfs_write+0x9c/0x170) from [&lt;c0108858&gt;] (sys_write+0x3c/0x70)
[  958.220768] [&lt;c0108858&gt;] (sys_write+0x3c/0x70) from [&lt;c000dbc0&gt;] (ret_fast_syscall+0x0/0x30)
[  958.229199] Code: e59d1058 e5913000 e3530000 ba000114 (e7f001f2)

Signed-off-by: Josh Durgin &lt;josh.durgin@inktank.com&gt;
Reviewed-by: Sage Weil &lt;sage@inktank.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8b8cf8917f9b5d74e04f281272d8719ce335a497 upstream.

__kernel_time_t is a long, which cannot hold a U32_MAX on 32-bit
architectures.  Just drop this check as it has limited value.

This fixes a crash like:

[  957.905812] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/include/linux/ceph/decode.h:164!
[  957.914849] Internal error: Oops - BUG: 0 [#1] SMP ARM
[  957.919978] Modules linked in: rbd libceph libcrc32c ipmi_devintf ipmi_si ipmi_msghandler nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc
[  957.932547] CPU: 1    Tainted: G        W     (3.9.0-ceph-19bb6a83-highbank #1)
[  957.939881] PC is at ceph_osdc_build_request+0x8c/0x4f8 [libceph]
[  957.945967] LR is at 0xec520904
[  957.949103] pc : [&lt;bf13e76c&gt;]    lr : [&lt;ec520904&gt;]    psr: 20000153
[  957.949103] sp : ec753df8  ip : 00000001  fp : ec53e100
[  957.960571] r10: ebef25c0  r9 : ec5fa400  r8 : ecbcc000
[  957.965788] r7 : 00000000  r6 : 00000000  r5 : ffffffff  r4 : 00000020
[  957.972307] r3 : 51cc8143  r2 : ec520900  r1 : ec753e58  r0 : ec520908
[  957.978827] Flags: nzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment user
[  957.986039] Control: 10c5387d  Table: 2c59c04a  DAC: 00000015
[  957.991777] Process rbd (pid: 2138, stack limit = 0xec752238)
[  957.997514] Stack: (0xec753df8 to 0xec754000)
[  958.001864] 3de0:                                                       00000001 00000001
[  958.010032] 3e00: 00000001 bf139744 ecbcc000 ec55a0a0 00000024 00000000 ebef25c0 fffffffe
[  958.018204] 3e20: ffffffff 00000000 00000000 00000001 ec5fa400 ebef25c0 ec53e100 bf166b68
[  958.026377] 3e40: 00000000 0000220f fffffffe ffffffff ec753e58 bf13ff24 51cc8143 05b25ed2
[  958.034548] 3e60: 00000001 00000000 00000000 bf1688d4 00000001 00000000 00000000 00000000
[  958.042720] 3e80: 00000001 00000060 ec5fa400 ed53d200 ed439600 ed439300 00000001 00000060
[  958.050888] 3ea0: ec5fa400 ed53d200 00000000 bf16a320 00000000 ec53e100 00000040 ec753eb8
[  958.059059] 3ec0: ec51df00 ed53d7c0 ed53d200 ed53d7c0 00000000 ed53d7c0 ec5fa400 bf16ed70
[  958.067230] 3ee0: 00000000 00000060 00000002 ed53d200 00000000 bf16acf4 ed53d7c0 ec752000
[  958.075402] 3f00: ed980e50 e954f5d8 00000000 00000060 ed53d240 ed53d258 ec753f80 c04f44a8
[  958.083574] 3f20: edb7910c ec664700 01ade920 c02e4c44 00000060 c016b3dc ec51de40 01adfb84
[  958.091745] 3f40: 00000060 ec752000 ec753f80 ec752000 00000060 c0108444 00000007 ec51de48
[  958.099914] 3f60: ed0eb8c0 00000000 00000000 ec51de40 01adfb84 00000001 00000060 c0108858
[  958.108085] 3f80: 00000000 00000000 51cc8143 00000060 01adfb84 00000007 00000004 c000dd68
[  958.116257] 3fa0: 00000000 c000dbc0 00000060 01adfb84 00000007 01adfb84 00000060 01adfb80
[  958.124429] 3fc0: 00000060 01adfb84 00000007 00000004 beded1a8 00000000 01adf2f0 01ade920
[  958.132599] 3fe0: 00000000 beded180 b6811324 b6811334 800f0010 00000007 2e7f5821 2e7f5c21
[  958.140815] [&lt;bf13e76c&gt;] (ceph_osdc_build_request+0x8c/0x4f8 [libceph]) from [&lt;bf166b68&gt;] (rbd_osd_req_format_write+0x50/0x7c [rbd])
[  958.152739] [&lt;bf166b68&gt;] (rbd_osd_req_format_write+0x50/0x7c [rbd]) from [&lt;bf1688d4&gt;] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd])
[  958.164486] [&lt;bf1688d4&gt;] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) from [&lt;bf16a320&gt;] (rbd_dev_image_probe+0x23c/0x850 [rbd])
[  958.175967] [&lt;bf16a320&gt;] (rbd_dev_image_probe+0x23c/0x850 [rbd]) from [&lt;bf16acf4&gt;] (rbd_add+0x3c0/0x918 [rbd])
[  958.185975] [&lt;bf16acf4&gt;] (rbd_add+0x3c0/0x918 [rbd]) from [&lt;c02e4c44&gt;] (bus_attr_store+0x20/0x2c)
[  958.194850] [&lt;c02e4c44&gt;] (bus_attr_store+0x20/0x2c) from [&lt;c016b3dc&gt;] (sysfs_write_file+0x168/0x198)
[  958.203984] [&lt;c016b3dc&gt;] (sysfs_write_file+0x168/0x198) from [&lt;c0108444&gt;] (vfs_write+0x9c/0x170)
[  958.212768] [&lt;c0108444&gt;] (vfs_write+0x9c/0x170) from [&lt;c0108858&gt;] (sys_write+0x3c/0x70)
[  958.220768] [&lt;c0108858&gt;] (sys_write+0x3c/0x70) from [&lt;c000dbc0&gt;] (ret_fast_syscall+0x0/0x30)
[  958.229199] Code: e59d1058 e5913000 e3530000 ba000114 (e7f001f2)

Signed-off-by: Josh Durgin &lt;josh.durgin@inktank.com&gt;
Reviewed-by: Sage Weil &lt;sage@inktank.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2013-06-27T05:24:37+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-06-27T05:24:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=98b6ed0f2bf4abfb759206aa70690636372bdee7'/>
<id>98b6ed0f2bf4abfb759206aa70690636372bdee7</id>
<content type='text'>
Pull networking fixes from David Miller:

 1) Found via trinity:

    If you connect up an ipv6 socket to an ipv4 mapped address then an
    ipv6 one, sendmsg() can croak because ip6_sk_dst_check() assumes the
    route cached in the socket is an ipv6 one.  In this case there is an
    ipv4 route attached, so it gets stomped on.

    Reported by Dave Jones and Hannes Frederic Sowa, fixed by Eric
    Dumazet.

 2) AF_KEY notifications leak some kernel memory to userspace, fix from
    Mathias Krause.

 3) DLCI calls __dev_get_by_name() without proper locking, and dlci_del
    doesn't validate that the device being deleted is actually a DLCI
    one.  Fixes from Li Zefan.

 4) Length check on bluetooth l2cap information responses is wrong, each
    response type has a different lenth, so we should make sure it's in
    a given range rather than enforce one single valid length.  From
    Jaganath Kanakkassery.

 5) Receive FIFO overflow is really easy to trigger in stress scenerios
    in the sh_eth driver, but the event isn't being handled properly at
    all.  Specifically, the mask of error interrupts doesn't include the
    event so we never clear it, resulting in the driver becomming wedged
    processing an interrupt that never gets cleared.

    Fix from Sergei Shtylyov.

 6) qlcnic sleeps while holding a spinlock, use mdelay() instead of
    msleep().  From Shahed Shaikh.

 7) Missing curly braces causes SIP netfilter NAT module to always drop
    packets.  Fix from Balazs Peter Odor.

 8) ipt_ULOG in netfilter passes the wrong value to timer setup, causing
    the timer to dereference crap when it fires.  Fix from Gao Feng.

 9) Missing RCU protection around txq-&gt;axq_acq traversal in
    ath_txq_schedule().  Fix from Felix Fietkau.

10) Idle state transition test in ath9k_htc_config() is reversed, fix
    from Sujith Manoharan.

11) IPV6 forwarding handles unicast Router Alert packets incorrectly.
    It tests the wrong option state.  Previously opt-&gt;ra being non-zero
    indicated a router alert marking in the SKB, but now it's indicated
    by a bit in opt-&gt;flags.  Fix from YOSHIFUJI Hideaki.

12) SKB leak in GRE tunnel GSO handling, from Eric Dumazet.

13) get_user_pages_fast() error handling in TUN and MACVTAP use the same
    local variable for the base index and the loop iterator for page
    traversal, oops! Fix from Michael S Tsirkin.

14) ipv6_get_lladdr() can fail, and we must therefore check it's return
    value in inet6_set_iftoken().  For from Hannes Frederic Sowa.

15) If you change an interface name and meanwhile can sneak in something
    that looks up the name (like SO_BINDTODEVICE or SIOCGIFNAME) we can
    deadlock with CONFIG_PREEMPT=n.  Fix this by providing a helper
    function that properly uses raw_seqcount_begin().  From Nicolas
    Schichan.

16) Chain noise calibration test is inverted in iwlwifi, fix from
    Nikolay Martynov.

17) Properly set TX iwlwifi descriptor flags for back requests.  Fix
    from Emmanuel Grumbach.

18) We can't assume skb_transport_header() is set in xt_TCPOPTSTRAP
    module, fix from Pablo Neira Ayuso.

19) Some crummy APs don't provide the proper High Throughput info in
    association response frames.  Add a workaround by assume we'll use
    whatever is in the beacon/probe.  Fix from Johannes Berg.

20) mac80211 call to rate_idx_match_mask() swaps two arguments (mask and
    channel width).  Fix from Simon Wunderlich.

21) xt_TCPMSS (like xt_TCPOPTSTRAP) must not try to handle fragmented
    frames.  Fix from Phil Oester.

22) Fix rate control regression causing iwlwifi/iwlegacy chips to use
    1Mbit/s on pre-11n networks.  From Moshe Benji and Stanslaw Gruszka.

23) Disable brcmsmac power-save functions, they cause regressions.  From
    Arend van Spriel.

24) Enforce a sane minimum MTU in l2cap_build_cmd() otherwise we can
    easily crash.  Fix from Anderson Lizardo.

25) If a learning packet arrives during vxlan_stop() we crash, easily
    fixed by checking netif_running().  From Stephen Hemminger.

26) Static vxlan FDB entries should not be migrated, also from Stephen.

27) skb_clone() failures not handled in vxlan_xmit(), oops.  Also from
    Stephen.

28) Add minimal driver for AR816x/AR817x ethernet chips, from Johannes
    Berg.

29) Fix regression in userspace VLAN acceleration control, added by the
    802.1ad support changes.  Fix from Fernando Luis Vazquez Cao.

30) Interval selection for MLD queries in the bridging code was
    reversed.  Fix from Linus Lüssing.

31) ipv6's ndisc_send_redirect() erroneously writes to the packet we
    received not the packet we are building to send out.  Fix from
    Matthias Schiffer.

32) Don't free netdev before unregistering it, in usb_8dev can driver.
    From Marc Kleine-Budde.

33) Fix nl80211 attribute buffer races, from Johannes Berg.

34) Although netlink_diag.h is under uapi/ it isn't present in Kbuild.
    From Stephen Hemminger.

35) Wrong address and family passed to MD5 key lookups in TCP, from
    Aydin Arik.

36) phy_type attribute created by SFC driver should not be writable.
    From Ben Hutchings.

37) Receive/Transmit queue allocations in pxa168_eth and mv643xx_eth
    should use kzalloc().  Otherwise if setup fails half-way, we'll
    dereference garbage when trying to teardown the rings.  From Lubomir
    Rintel.

38) Fix double-allocation of dst (resulting in unfreeable net device) in
    ipv6's init_loopback().  From Gao Feng.

39) Fix fragmentation handling SKB leak in netfilter conntrack, we were
    freeing the wrong skb pointer.  From Phil Oester.

40) Don't report "-1" (SPEED_UNKNOWN) in bond_miimon_commit(), from
    Nikolay Aleksandrov.

41) davinci_cpdma doesn't check for DMA mapping errors, letting the
    device scribble to random addresses.  From Sebastian Siewior.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
  dlci: validate the net device in dlci_del()
  dlci: acquire rtnl_lock before calling __dev_get_by_name()
  af_key: fix info leaks in notify messages
  ipv6: ip6_sk_dst_check() must not assume ipv6 dst
  net: fix kernel deadlock with interface rename and netdev name retrieval.
  net/tg3: Avoid delay during MMIO access
  ipv6: check return value of ipv6_get_lladdr
  macvtap: fix recovery from gup errors
  tun: fix recovery from gup errors
  gre: fix a possible skb leak
  ipv6: Process unicast packet with Router Alert by checking flag in skb.
  ath9k_htc: Handle IDLE state transition properly
  ath9k: fix an RCU issue in calling ieee80211_get_tx_rates
  netfilter: ipt_ULOG: fix incorrect setting of ulog timer
  netfilter: ctnetlink: send event when conntrack label was modified
  netfilter: nf_nat_sip: fix mangling
  qlcnic: Do not sleep while holding spinlock
  drivers: net: cpsw: fix compilation error with cpsw driver
  tcp: doc : fix the syncookies default value
  sh_eth: fix misreporting of transmit abort
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking fixes from David Miller:

 1) Found via trinity:

    If you connect up an ipv6 socket to an ipv4 mapped address then an
    ipv6 one, sendmsg() can croak because ip6_sk_dst_check() assumes the
    route cached in the socket is an ipv6 one.  In this case there is an
    ipv4 route attached, so it gets stomped on.

    Reported by Dave Jones and Hannes Frederic Sowa, fixed by Eric
    Dumazet.

 2) AF_KEY notifications leak some kernel memory to userspace, fix from
    Mathias Krause.

 3) DLCI calls __dev_get_by_name() without proper locking, and dlci_del
    doesn't validate that the device being deleted is actually a DLCI
    one.  Fixes from Li Zefan.

 4) Length check on bluetooth l2cap information responses is wrong, each
    response type has a different lenth, so we should make sure it's in
    a given range rather than enforce one single valid length.  From
    Jaganath Kanakkassery.

 5) Receive FIFO overflow is really easy to trigger in stress scenerios
    in the sh_eth driver, but the event isn't being handled properly at
    all.  Specifically, the mask of error interrupts doesn't include the
    event so we never clear it, resulting in the driver becomming wedged
    processing an interrupt that never gets cleared.

    Fix from Sergei Shtylyov.

 6) qlcnic sleeps while holding a spinlock, use mdelay() instead of
    msleep().  From Shahed Shaikh.

 7) Missing curly braces causes SIP netfilter NAT module to always drop
    packets.  Fix from Balazs Peter Odor.

 8) ipt_ULOG in netfilter passes the wrong value to timer setup, causing
    the timer to dereference crap when it fires.  Fix from Gao Feng.

 9) Missing RCU protection around txq-&gt;axq_acq traversal in
    ath_txq_schedule().  Fix from Felix Fietkau.

10) Idle state transition test in ath9k_htc_config() is reversed, fix
    from Sujith Manoharan.

11) IPV6 forwarding handles unicast Router Alert packets incorrectly.
    It tests the wrong option state.  Previously opt-&gt;ra being non-zero
    indicated a router alert marking in the SKB, but now it's indicated
    by a bit in opt-&gt;flags.  Fix from YOSHIFUJI Hideaki.

12) SKB leak in GRE tunnel GSO handling, from Eric Dumazet.

13) get_user_pages_fast() error handling in TUN and MACVTAP use the same
    local variable for the base index and the loop iterator for page
    traversal, oops! Fix from Michael S Tsirkin.

14) ipv6_get_lladdr() can fail, and we must therefore check it's return
    value in inet6_set_iftoken().  For from Hannes Frederic Sowa.

15) If you change an interface name and meanwhile can sneak in something
    that looks up the name (like SO_BINDTODEVICE or SIOCGIFNAME) we can
    deadlock with CONFIG_PREEMPT=n.  Fix this by providing a helper
    function that properly uses raw_seqcount_begin().  From Nicolas
    Schichan.

16) Chain noise calibration test is inverted in iwlwifi, fix from
    Nikolay Martynov.

17) Properly set TX iwlwifi descriptor flags for back requests.  Fix
    from Emmanuel Grumbach.

18) We can't assume skb_transport_header() is set in xt_TCPOPTSTRAP
    module, fix from Pablo Neira Ayuso.

19) Some crummy APs don't provide the proper High Throughput info in
    association response frames.  Add a workaround by assume we'll use
    whatever is in the beacon/probe.  Fix from Johannes Berg.

20) mac80211 call to rate_idx_match_mask() swaps two arguments (mask and
    channel width).  Fix from Simon Wunderlich.

21) xt_TCPMSS (like xt_TCPOPTSTRAP) must not try to handle fragmented
    frames.  Fix from Phil Oester.

22) Fix rate control regression causing iwlwifi/iwlegacy chips to use
    1Mbit/s on pre-11n networks.  From Moshe Benji and Stanslaw Gruszka.

23) Disable brcmsmac power-save functions, they cause regressions.  From
    Arend van Spriel.

24) Enforce a sane minimum MTU in l2cap_build_cmd() otherwise we can
    easily crash.  Fix from Anderson Lizardo.

25) If a learning packet arrives during vxlan_stop() we crash, easily
    fixed by checking netif_running().  From Stephen Hemminger.

26) Static vxlan FDB entries should not be migrated, also from Stephen.

27) skb_clone() failures not handled in vxlan_xmit(), oops.  Also from
    Stephen.

28) Add minimal driver for AR816x/AR817x ethernet chips, from Johannes
    Berg.

29) Fix regression in userspace VLAN acceleration control, added by the
    802.1ad support changes.  Fix from Fernando Luis Vazquez Cao.

30) Interval selection for MLD queries in the bridging code was
    reversed.  Fix from Linus Lüssing.

31) ipv6's ndisc_send_redirect() erroneously writes to the packet we
    received not the packet we are building to send out.  Fix from
    Matthias Schiffer.

32) Don't free netdev before unregistering it, in usb_8dev can driver.
    From Marc Kleine-Budde.

33) Fix nl80211 attribute buffer races, from Johannes Berg.

34) Although netlink_diag.h is under uapi/ it isn't present in Kbuild.
    From Stephen Hemminger.

35) Wrong address and family passed to MD5 key lookups in TCP, from
    Aydin Arik.

36) phy_type attribute created by SFC driver should not be writable.
    From Ben Hutchings.

37) Receive/Transmit queue allocations in pxa168_eth and mv643xx_eth
    should use kzalloc().  Otherwise if setup fails half-way, we'll
    dereference garbage when trying to teardown the rings.  From Lubomir
    Rintel.

38) Fix double-allocation of dst (resulting in unfreeable net device) in
    ipv6's init_loopback().  From Gao Feng.

39) Fix fragmentation handling SKB leak in netfilter conntrack, we were
    freeing the wrong skb pointer.  From Phil Oester.

40) Don't report "-1" (SPEED_UNKNOWN) in bond_miimon_commit(), from
    Nikolay Aleksandrov.

41) davinci_cpdma doesn't check for DMA mapping errors, letting the
    device scribble to random addresses.  From Sebastian Siewior.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
  dlci: validate the net device in dlci_del()
  dlci: acquire rtnl_lock before calling __dev_get_by_name()
  af_key: fix info leaks in notify messages
  ipv6: ip6_sk_dst_check() must not assume ipv6 dst
  net: fix kernel deadlock with interface rename and netdev name retrieval.
  net/tg3: Avoid delay during MMIO access
  ipv6: check return value of ipv6_get_lladdr
  macvtap: fix recovery from gup errors
  tun: fix recovery from gup errors
  gre: fix a possible skb leak
  ipv6: Process unicast packet with Router Alert by checking flag in skb.
  ath9k_htc: Handle IDLE state transition properly
  ath9k: fix an RCU issue in calling ieee80211_get_tx_rates
  netfilter: ipt_ULOG: fix incorrect setting of ulog timer
  netfilter: ctnetlink: send event when conntrack label was modified
  netfilter: nf_nat_sip: fix mangling
  qlcnic: Do not sleep while holding spinlock
  drivers: net: cpsw: fix compilation error with cpsw driver
  tcp: doc : fix the syncookies default value
  sh_eth: fix misreporting of transmit abort
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>net: fix kernel deadlock with interface rename and netdev name retrieval.</title>
<updated>2013-06-26T20:42:54+00:00</updated>
<author>
<name>Nicolas Schichan</name>
<email>nschichan@freebox.fr</email>
</author>
<published>2013-06-26T15:23:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5dbe7c178d3f0a4634f088d9e729f1909b9ddcd1'/>
<id>5dbe7c178d3f0a4634f088d9e729f1909b9ddcd1</id>
<content type='text'>
When the kernel (compiled with CONFIG_PREEMPT=n) is performing the
rename of a network interface, it can end up waiting for a workqueue
to complete. If userland is able to invoke a SIOCGIFNAME ioctl or a
SO_BINDTODEVICE getsockopt in between, the kernel will deadlock due to
the fact that read_secklock_begin() will spin forever waiting for the
writer process (the one doing the interface rename) to update the
devnet_rename_seq sequence.

This patch fixes the problem by adding a helper (netdev_get_name())
and using it in the code handling the SIOCGIFNAME ioctl and
SO_BINDTODEVICE setsockopt.

The netdev_get_name() helper uses raw_seqcount_begin() to avoid
spinning forever, waiting for devnet_rename_seq-&gt;sequence to become
even. cond_resched() is used in the contended case, before retrying
the access to give the writer process a chance to finish.

The use of raw_seqcount_begin() will incur some unneeded work in the
reader process in the contended case, but this is better than
deadlocking the system.

Signed-off-by: Nicolas Schichan &lt;nschichan@freebox.fr&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the kernel (compiled with CONFIG_PREEMPT=n) is performing the
rename of a network interface, it can end up waiting for a workqueue
to complete. If userland is able to invoke a SIOCGIFNAME ioctl or a
SO_BINDTODEVICE getsockopt in between, the kernel will deadlock due to
the fact that read_secklock_begin() will spin forever waiting for the
writer process (the one doing the interface rename) to update the
devnet_rename_seq sequence.

This patch fixes the problem by adding a helper (netdev_get_name())
and using it in the code handling the SIOCGIFNAME ioctl and
SO_BINDTODEVICE setsockopt.

The netdev_get_name() helper uses raw_seqcount_begin() to avoid
spinning forever, waiting for devnet_rename_seq-&gt;sequence to become
even. cond_resched() is used in the contended case, before retrying
the access to give the writer process a chance to finish.

The use of raw_seqcount_begin() will incur some unneeded work in the
reader process in the contended case, but this is better than
deadlocking the system.

Signed-off-by: Nicolas Schichan &lt;nschichan@freebox.fr&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gre: fix a possible skb leak</title>
<updated>2013-06-25T23:07:44+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-06-24T13:26:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bd8a7036c06cf15779b31a5397d4afcb12be81ea'/>
<id>bd8a7036c06cf15779b31a5397d4afcb12be81ea</id>
<content type='text'>
commit 68c331631143 ("v4 GRE: Add TCP segmentation offload for GRE")
added a possible skb leak, because it frees only the head of segment
list, in case a skb_linearize() call fails.

This patch adds a kfree_skb_list() helper to fix the bug.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Cc: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 68c331631143 ("v4 GRE: Add TCP segmentation offload for GRE")
added a possible skb leak, because it frees only the head of segment
list, in case a skb_linearize() call fails.

This patch adds a kfree_skb_list() helper to fix the bug.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Cc: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ACPI / dock / PCI: Synchronous handling of dock events for PCI devices</title>
<updated>2013-06-24T09:22:53+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2013-06-24T09:22:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=21a31013f774c726bd199526cd673acc6432b21d'/>
<id>21a31013f774c726bd199526cd673acc6432b21d</id>
<content type='text'>
The interactions between the ACPI dock driver and the ACPI-based PCI
hotplug (acpiphp) are currently problematic because of ordering
issues during hot-remove operations.

First of all, the current ACPI glue code expects that physical
devices will always be deleted before deleting the companion ACPI
device objects.  Otherwise, acpi_unbind_one() will fail with a
warning message printed to the kernel log, for example:

[  185.026073] usb usb5: Oops, 'acpi_handle' corrupt
[  185.035150] pci 0000:1b:00.0: Oops, 'acpi_handle' corrupt
[  185.035515] pci 0000:18:02.0: Oops, 'acpi_handle' corrupt
[  180.013656]  port1: Oops, 'acpi_handle' corrupt

This means, in particular, that struct pci_dev objects have to
be deleted before the struct acpi_device objects they are "glued"
with.

Now, the following happens the during the undocking of an ACPI-based
dock station:
 1) hotplug_dock_devices() invokes registered hotplug callbacks to
    destroy physical devices associated with the ACPI device objects
    depending on the dock station.  It calls dd-&gt;ops-&gt;handler() for
    each of those device objects.
 2) For PCI devices dd-&gt;ops-&gt;handler() points to
    handle_hotplug_event_func() that queues up a separate work item
    to execute _handle_hotplug_event_func() for the given device and
    returns immediately.  That work item will be executed later.
 3) hotplug_dock_devices() calls dock_remove_acpi_device() for each
    device depending on the dock station.  This runs acpi_bus_trim()
    for each of them, which causes the underlying ACPI device object
    to be destroyed, but the work items queued up by
    handle_hotplug_event_func() haven't been started yet.
 4) _handle_hotplug_event_func() queued up in step 2) are executed
    and cause the above failure to happen, because the PCI devices
    they handle do not have the companion ACPI device objects any
    more (those objects have been deleted in step 3).

The possible breakage doesn't end here, though, because
hotplug_dock_devices() may return before at least some of the
_handle_hotplug_event_func() work items spawned by it have a
chance to complete and then undock() will cause _DCK to be
evaluated and that will cause the devices handled by the
_handle_hotplug_event_func() to go away possibly while they are
being accessed.

This means that dd-&gt;ops-&gt;handler() for PCI devices should not point
to handle_hotplug_event_func().  Instead, it should point to a
function that will do the work of _handle_hotplug_event_func()
synchronously.  For this reason, introduce such a function,
hotplug_event_func(), and modity acpiphp_dock_ops to point to
it as the handler.

Unfortunately, however, this is not sufficient, because if the dock
code were not changed further, hotplug_event_func() would now
deadlock with hotplug_dock_devices() that called it, since it would
run unregister_hotplug_dock_device() which in turn would attempt to
acquire the dock station's hp_lock mutex already acquired by
hotplug_dock_devices().

To resolve that deadlock use the observation that
unregister_hotplug_dock_device() won't need to acquire hp_lock
if PCI bridges the devices on the dock station depend on are
prevented from being removed prematurely while the first loop in
hotplug_dock_devices() is in progress.

To make that possible, introduce a mechanism by which the callers of
register_hotplug_dock_device() can provide "init" and "release"
routines that will be executed, respectively, during the addition
and removal of the physical device object associated with the
given ACPI device handle.  Make acpiphp use two new functions,
acpiphp_dock_init() and acpiphp_dock_release(), that call
get_bridge() and put_bridge(), respectively, on the acpiphp bridge
holding the given device, for this purpose.

In addition to that, remove the dock station's list of
"hotplug devices" and make the dock code always walk the whole list
of "dependent devices" instead in such a way that the loops in
hotplug_dock_devices() and dock_event() (replacing the loops over
"hotplug devices") will take references to the list entries that
register_hotplug_dock_device() has been called for.  That prevents
the "release" routines associated with those entries from being
called while the given entry is being processed and for PCI
devices this means that their bridges won't be removed (by a
concurrent thread) while hotplug_event_func() handling them is
being executed.

This change is based on two earlier patches from Jiang Liu.

References: https://bugzilla.kernel.org/show_bug.cgi?id=59501
Reported-and-tested-by: Alexander E. Patrakov &lt;patrakov@gmail.com&gt;
Tracked-down-by: Jiang Liu &lt;jiang.liu@huawei.com&gt;
Tested-by: Illya Klymov &lt;xanf@xanf.me&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: 3.9+ &lt;stable@vger.kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The interactions between the ACPI dock driver and the ACPI-based PCI
hotplug (acpiphp) are currently problematic because of ordering
issues during hot-remove operations.

First of all, the current ACPI glue code expects that physical
devices will always be deleted before deleting the companion ACPI
device objects.  Otherwise, acpi_unbind_one() will fail with a
warning message printed to the kernel log, for example:

[  185.026073] usb usb5: Oops, 'acpi_handle' corrupt
[  185.035150] pci 0000:1b:00.0: Oops, 'acpi_handle' corrupt
[  185.035515] pci 0000:18:02.0: Oops, 'acpi_handle' corrupt
[  180.013656]  port1: Oops, 'acpi_handle' corrupt

This means, in particular, that struct pci_dev objects have to
be deleted before the struct acpi_device objects they are "glued"
with.

Now, the following happens the during the undocking of an ACPI-based
dock station:
 1) hotplug_dock_devices() invokes registered hotplug callbacks to
    destroy physical devices associated with the ACPI device objects
    depending on the dock station.  It calls dd-&gt;ops-&gt;handler() for
    each of those device objects.
 2) For PCI devices dd-&gt;ops-&gt;handler() points to
    handle_hotplug_event_func() that queues up a separate work item
    to execute _handle_hotplug_event_func() for the given device and
    returns immediately.  That work item will be executed later.
 3) hotplug_dock_devices() calls dock_remove_acpi_device() for each
    device depending on the dock station.  This runs acpi_bus_trim()
    for each of them, which causes the underlying ACPI device object
    to be destroyed, but the work items queued up by
    handle_hotplug_event_func() haven't been started yet.
 4) _handle_hotplug_event_func() queued up in step 2) are executed
    and cause the above failure to happen, because the PCI devices
    they handle do not have the companion ACPI device objects any
    more (those objects have been deleted in step 3).

The possible breakage doesn't end here, though, because
hotplug_dock_devices() may return before at least some of the
_handle_hotplug_event_func() work items spawned by it have a
chance to complete and then undock() will cause _DCK to be
evaluated and that will cause the devices handled by the
_handle_hotplug_event_func() to go away possibly while they are
being accessed.

This means that dd-&gt;ops-&gt;handler() for PCI devices should not point
to handle_hotplug_event_func().  Instead, it should point to a
function that will do the work of _handle_hotplug_event_func()
synchronously.  For this reason, introduce such a function,
hotplug_event_func(), and modity acpiphp_dock_ops to point to
it as the handler.

Unfortunately, however, this is not sufficient, because if the dock
code were not changed further, hotplug_event_func() would now
deadlock with hotplug_dock_devices() that called it, since it would
run unregister_hotplug_dock_device() which in turn would attempt to
acquire the dock station's hp_lock mutex already acquired by
hotplug_dock_devices().

To resolve that deadlock use the observation that
unregister_hotplug_dock_device() won't need to acquire hp_lock
if PCI bridges the devices on the dock station depend on are
prevented from being removed prematurely while the first loop in
hotplug_dock_devices() is in progress.

To make that possible, introduce a mechanism by which the callers of
register_hotplug_dock_device() can provide "init" and "release"
routines that will be executed, respectively, during the addition
and removal of the physical device object associated with the
given ACPI device handle.  Make acpiphp use two new functions,
acpiphp_dock_init() and acpiphp_dock_release(), that call
get_bridge() and put_bridge(), respectively, on the acpiphp bridge
holding the given device, for this purpose.

In addition to that, remove the dock station's list of
"hotplug devices" and make the dock code always walk the whole list
of "dependent devices" instead in such a way that the loops in
hotplug_dock_devices() and dock_event() (replacing the loops over
"hotplug devices") will take references to the list entries that
register_hotplug_dock_device() has been called for.  That prevents
the "release" routines associated with those entries from being
called while the given entry is being processed and for PCI
devices this means that their bridges won't be removed (by a
concurrent thread) while hotplug_event_func() handling them is
being executed.

This change is based on two earlier patches from Jiang Liu.

References: https://bugzilla.kernel.org/show_bug.cgi?id=59501
Reported-and-tested-by: Alexander E. Patrakov &lt;patrakov@gmail.com&gt;
Tracked-down-by: Jiang Liu &lt;jiang.liu@huawei.com&gt;
Tested-by: Illya Klymov &lt;xanf@xanf.me&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: 3.9+ &lt;stable@vger.kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2013-06-22T18:42:20+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-06-22T18:42:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b8ff768b5ab9d17204c1a8b3647ee6db608f63ab'/>
<id>b8ff768b5ab9d17204c1a8b3647ee6db608f63ab</id>
<content type='text'>
Pull vfs fixes from Al Viro:
 "Several fixes for bugs caught while looking through f_pos (ab)users"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  aout32 coredump compat fix
  splice: don't pass the address of -&gt;f_pos to methods
  mconsole: we'd better initialize pos before passing it to vfs_read()...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull vfs fixes from Al Viro:
 "Several fixes for bugs caught while looking through f_pos (ab)users"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  aout32 coredump compat fix
  splice: don't pass the address of -&gt;f_pos to methods
  mconsole: we'd better initialize pos before passing it to vfs_read()...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'acpi-3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm</title>
<updated>2013-06-21T16:31:10+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-06-21T16:31:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=64a2f30a89a3c26a5152b09f4d390b9d91cab0cc'/>
<id>64a2f30a89a3c26a5152b09f4d390b9d91cab0cc</id>
<content type='text'>
Pull ACPI fixes from Rafael Wysocki:

 - Fix for a regression causing a failure to turn on some devices on
   some systems during initialization introduced by a recent revert of
   an ACPI PM change that broke something else.  Fortunately, we know
   exactly what devices are affected, so we can add a fix just for them
   leaving everyone else alone.

 - ACPI power resources initialization fix preventing a NULL pointer
   from being dereferenced in the acpi_add_power_resource() error code
   path.

 - ACPI dock station driver fix that adds missing locking to
   write_undock().

 - ACPI resources allocation fix changing the scope of an old workaround
   so that it doesn't affect systems that aren't actually buggy.  This
   was reported a couple of days ago to fix DMA problems on some new
   platforms so we need it in -stable.  From Mika Westerberg.

* tag 'acpi-3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / LPSS: Power up LPSS devices during enumeration
  ACPI / PM: Fix error code path for power resources initialization
  ACPI / dock: Take ACPI scan lock in write_undock()
  ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull ACPI fixes from Rafael Wysocki:

 - Fix for a regression causing a failure to turn on some devices on
   some systems during initialization introduced by a recent revert of
   an ACPI PM change that broke something else.  Fortunately, we know
   exactly what devices are affected, so we can add a fix just for them
   leaving everyone else alone.

 - ACPI power resources initialization fix preventing a NULL pointer
   from being dereferenced in the acpi_add_power_resource() error code
   path.

 - ACPI dock station driver fix that adds missing locking to
   write_undock().

 - ACPI resources allocation fix changing the scope of an old workaround
   so that it doesn't affect systems that aren't actually buggy.  This
   was reported a couple of days ago to fix DMA problems on some new
   platforms so we need it in -stable.  From Mika Westerberg.

* tag 'acpi-3.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / LPSS: Power up LPSS devices during enumeration
  ACPI / PM: Fix error code path for power resources initialization
  ACPI / dock: Take ACPI scan lock in write_undock()
  ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources
</pre>
</div>
</content>
</entry>
</feed>
