Age | Commit message (Collapse) | Author |
|
[ Upstream commit 9d186cac7ffb1831e9f34cb4a3a8b22abb9dd9d4 ]
We don't know right timestamp for repaired skb-s. Wrong RTT estimations
isn't good, because some congestion modules heavily depends on it.
This patch adds the TCPCB_REPAIRED flag, which is included in
TCPCB_RETRANS.
Thanks to Eric for the advice how to fix this issue.
This patch fixes the warning:
[ 879.562947] WARNING: CPU: 0 PID: 2825 at net/ipv4/tcp_input.c:3078 tcp_ack+0x11f5/0x1380()
[ 879.567253] CPU: 0 PID: 2825 Comm: socket-tcpbuf-l Not tainted 3.16.0-next-20140811 #1
[ 879.567829] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 879.568177] 0000000000000000 00000000c532680c ffff880039643d00 ffffffff817aa2d2
[ 879.568776] 0000000000000000 ffff880039643d38 ffffffff8109afbd ffff880039d6ba80
[ 879.569386] ffff88003a449800 000000002983d6bd 0000000000000000 000000002983d6bc
[ 879.569982] Call Trace:
[ 879.570264] [<ffffffff817aa2d2>] dump_stack+0x4d/0x66
[ 879.570599] [<ffffffff8109afbd>] warn_slowpath_common+0x7d/0xa0
[ 879.570935] [<ffffffff8109b0ea>] warn_slowpath_null+0x1a/0x20
[ 879.571292] [<ffffffff816d0a05>] tcp_ack+0x11f5/0x1380
[ 879.571614] [<ffffffff816d10bd>] tcp_rcv_established+0x1ed/0x710
[ 879.571958] [<ffffffff816dc9da>] tcp_v4_do_rcv+0x10a/0x370
[ 879.572315] [<ffffffff81657459>] release_sock+0x89/0x1d0
[ 879.572642] [<ffffffff816c81a0>] do_tcp_setsockopt.isra.36+0x120/0x860
[ 879.573000] [<ffffffff8110a52e>] ? rcu_read_lock_held+0x6e/0x80
[ 879.573352] [<ffffffff816c8912>] tcp_setsockopt+0x32/0x40
[ 879.573678] [<ffffffff81654ac4>] sock_common_setsockopt+0x14/0x20
[ 879.574031] [<ffffffff816537b0>] SyS_setsockopt+0x80/0xf0
[ 879.574393] [<ffffffff817b40a9>] system_call_fastpath+0x16/0x1b
[ 879.574730] ---[ end trace a17cbc38eb8c5c00 ]---
v2: moving setting of skb->when for repaired skb-s in tcp_write_xmit,
where it's set for other skb-s.
Fixes: 431a91242d8d ("tcp: timestamp SYN+DATA messages")
Fixes: 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0d5501c1c828fb97d02af50aa9d2b1a5498b94e4 ]
Currently the functionality to untag traffic on input resides
as part of the vlan module and is build only when VLAN support
is enabled in the kernel. When VLAN is disabled, the function
vlan_untag() turns into a stub and doesn't really untag the
packets. This seems to create an interesting interaction
between VMs supporting checksum offloading and some network drivers.
There are some drivers that do not allow the user to change
tx-vlan-offload feature of the driver. These drivers also seem
to assume that any VLAN-tagged traffic they transmit will
have the vlan information in the vlan_tci and not in the vlan
header already in the skb. When transmitting skbs that already
have tagged data with partial checksum set, the checksum doesn't
appear to be updated correctly by the card thus resulting in a
failure to establish TCP connections.
The following is a packet trace taken on the receiver where a
sender is a VM with a VLAN configued. The host VM is running on
doest not have VLAN support and the outging interface on the
host is tg3:
10:12:43.503055 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27243,
offset 0, flags [DF], proto TCP (6), length 60)
10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-> 0x48d9), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294837885 ecr 0,nop,wscale 7], length 0
10:12:44.505556 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27244,
offset 0, flags [DF], proto TCP (6), length 60)
10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-> 0x44ee), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294838888 ecr 0,nop,wscale 7], length 0
This connection finally times out.
I've only access to the TG3 hardware in this configuration thus have
only tested this with TG3 driver. There are a lot of other drivers
that do not permit user changes to vlan acceleration features, and
I don't know if they all suffere from a similar issue.
The patch attempt to fix this another way. It moves the vlan header
stipping code out of the vlan module and always builds it into the
kernel network core. This way, even if vlan is not supported on
a virtualizatoin host, the virtual machines running on top of such
host will still work with VLANs enabled.
CC: Patrick McHardy <kaber@trash.net>
CC: Nithin Nayak Sujir <nsujir@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 615d6e8756c87149f2d4c1b93d471bca002bd849 upstream.
This patch is a continuation of efforts trying to optimize find_vma(),
avoiding potentially expensive rbtree walks to locate a vma upon faults.
The original approach (https://lkml.org/lkml/2013/11/1/410), where the
largest vma was also cached, ended up being too specific and random,
thus further comparison with other approaches were needed. There are
two things to consider when dealing with this, the cache hit rate and
the latency of find_vma(). Improving the hit-rate does not necessarily
translate in finding the vma any faster, as the overhead of any fancy
caching schemes can be too high to consider.
We currently cache the last used vma for the whole address space, which
provides a nice optimization, reducing the total cycles in find_vma() by
up to 250%, for workloads with good locality. On the other hand, this
simple scheme is pretty much useless for workloads with poor locality.
Analyzing ebizzy runs shows that, no matter how many threads are
running, the mmap_cache hit rate is less than 2%, and in many situations
below 1%.
The proposed approach is to replace this scheme with a small per-thread
cache, maximizing hit rates at a very low maintenance cost.
Invalidations are performed by simply bumping up a 32-bit sequence
number. The only expensive operation is in the rare case of a seq
number overflow, where all caches that share the same address space are
flushed. Upon a miss, the proposed replacement policy is based on the
page number that contains the virtual address in question. Concretely,
the following results are seen on an 80 core, 8 socket x86-64 box:
1) System bootup: Most programs are single threaded, so the per-thread
scheme does improve ~50% hit rate by just adding a few more slots to
the cache.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 50.61% | 19.90 |
| patched | 73.45% | 13.58 |
+----------------+----------+------------------+
2) Kernel build: This one is already pretty good with the current
approach as we're dealing with good locality.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 75.28% | 11.03 |
| patched | 88.09% | 9.31 |
+----------------+----------+------------------+
3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 70.66% | 17.14 |
| patched | 91.15% | 12.57 |
+----------------+----------+------------------+
4) Ebizzy: There's a fair amount of variation from run to run, but this
approach always shows nearly perfect hit rates, while baseline is just
about non-existent. The amounts of cycles can fluctuate between
anywhere from ~60 to ~116 for the baseline scheme, but this approach
reduces it considerably. For instance, with 80 threads:
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 1.06% | 91.54 |
| patched | 99.97% | 14.18 |
+----------------+----------+------------------+
[akpm@linux-foundation.org: fix nommu build, per Davidlohr]
[akpm@linux-foundation.org: document vmacache_valid() logic]
[akpm@linux-foundation.org: attempt to untangle header files]
[akpm@linux-foundation.org: add vmacache_find() BUG_ON]
[hughd@google.com: add vmacache_valid_mm() (from Oleg)]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: adjust and enhance comments]
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Tested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d26914d11751b23ca2e8747725f2cae10c2f2c1b upstream.
Since put_mems_allowed() is strictly optional, its a seqcount retry, we
don't need to evaluate the function if the allocation was in fact
successful, saving a smp_rmb some loads and comparisons on some relative
fast-paths.
Since the naming, get/put_mems_allowed() does suggest a mandatory
pairing, rename the interface, as suggested by Mel, to resemble the
seqcount interface.
This gives us: read_mems_allowed_begin() and read_mems_allowed_retry(),
where it is important to note that the return value of the latter call
is inverted from its previous incarnation.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 18ab4d4ced0817421e6db6940374cc39d28d65da upstream.
Originally get_swap_page() started iterating through the singly-linked
list of swap_info_structs using swap_list.next or highest_priority_index,
which both were intended to point to the highest priority active swap
target that was not full. The first patch in this series changed the
singly-linked list to a doubly-linked list, and removed the logic to start
at the highest priority non-full entry; it starts scanning at the highest
priority entry each time, even if the entry is full.
Replace the manually ordered swap_list_head with a plist, swap_active_head.
Add a new plist, swap_avail_head. The original swap_active_head plist
contains all active swap_info_structs, as before, while the new
swap_avail_head plist contains only swap_info_structs that are active and
available, i.e. not full. Add a new spinlock, swap_avail_lock, to protect
the swap_avail_head list.
Mel Gorman suggested using plists since they internally handle ordering
the list entries based on priority, which is exactly what swap was doing
manually. All the ordering code is now removed, and swap_info_struct
entries and simply added to their corresponding plist and automatically
ordered correctly.
Using a new plist for available swap_info_structs simplifies and
optimizes get_swap_page(), which no longer has to iterate over full
swap_info_structs. Using a new spinlock for swap_avail_head plist
allows each swap_info_struct to add or remove themselves from the
plist when they become full or not-full; previously they could not
do so because the swap_info_struct->lock is held when they change
from full<->not-full, and the swap_lock protecting the main
swap_active_head must be ordered before any swap_info_struct->lock.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Weijie Yang <weijieut@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a75f232ce0fe38bd01301899ecd97ffd0254316a upstream.
Add plist_requeue(), which moves the specified plist_node after all other
same-priority plist_nodes in the list. This is essentially an optimized
plist_del() followed by plist_add().
This is needed by swap, which (with the next patch in this set) uses a
plist of available swap devices. When a swap device (either a swap
partition or swap file) are added to the system with swapon(), the device
is added to a plist, ordered by the swap device's priority. When swap
needs to allocate a page from one of the swap devices, it takes the page
from the first swap device on the plist, which is the highest priority
swap device. The swap device is left in the plist until all its pages are
used, and then removed from the plist when it becomes full.
However, as described in man 2 swapon, swap must allocate pages from swap
devices with the same priority in round-robin order; to do this, on each
swap page allocation, swap uses a page from the first swap device in the
plist, and then calls plist_requeue() to move that swap device entry to
after any other same-priority swap devices. The next swap page allocation
will again use a page from the first swap device in the plist and requeue
it, and so on, resulting in round-robin usage of equal-priority swap
devices.
Also add plist_test_requeue() test function, for use by plist_test() to
test plist_requeue() function.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Weijie Yang <weijieut@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit fd16618e12a05df79a3439d72d5ffdac5d34f3da upstream.
Add PLIST_HEAD() to plist.h, equivalent to LIST_HEAD() from list.h, to
define and initialize a struct plist_head.
Add plist_for_each_continue() and plist_for_each_entry_continue(),
equivalent to list_for_each_continue() and list_for_each_entry_continue(),
to iterate over a plist continuing after the current position.
Add plist_prev() and plist_next(), equivalent to (struct list_head*)->prev
and ->next, implemented by list_prev_entry() and list_next_entry(), to
access the prev/next struct plist_node entry. These are needed because
unlike struct list_head, direct access of the prev/next struct plist_node
isn't possible; the list must be navigated via the contained struct
list_head. e.g. instead of accessing the prev by list_prev_entry(node,
node_list) it can be accessed by plist_prev(node).
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Weijie Yang <weijieut@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit adfab836f4908deb049a5128082719e689eed964 upstream.
The logic controlling the singly-linked list of swap_info_struct entries
for all active, i.e. swapon'ed, swap targets is rather complex, because:
- it stores the entries in priority order
- there is a pointer to the highest priority entry
- there is a pointer to the highest priority not-full entry
- there is a highest_priority_index variable set outside the swap_lock
- swap entries of equal priority should be used equally
this complexity leads to bugs such as: https://lkml.org/lkml/2014/2/13/181
where different priority swap targets are incorrectly used equally.
That bug probably could be solved with the existing singly-linked lists,
but I think it would only add more complexity to the already difficult to
understand get_swap_page() swap_list iteration logic.
The first patch changes from a singly-linked list to a doubly-linked list
using list_heads; the highest_priority_index and related code are removed
and get_swap_page() starts each iteration at the highest priority
swap_info entry, even if it's full. While this does introduce unnecessary
list iteration (i.e. Schlemiel the painter's algorithm) in the case where
one or more of the highest priority entries are full, the iteration and
manipulation code is much simpler and behaves correctly re: the above bug;
and the fourth patch removes the unnecessary iteration.
The second patch adds some minor plist helper functions; nothing new
really, just functions to match existing regular list functions. These
are used by the next two patches.
The third patch adds plist_requeue(), which is used by get_swap_page() in
the next patch - it performs the requeueing of same-priority entries
(which moves the entry to the end of its priority in the plist), so that
all equal-priority swap_info_structs get used equally.
The fourth patch converts the main list into a plist, and adds a new plist
that contains only swap_info entries that are both active and not full.
As Mel suggested using plists allows removing all the ordering code from
swap - plists handle ordering automatically. The list naming is also
clarified now that there are two lists, with the original list changed
from swap_list_head to swap_active_head and the new list named
swap_avail_head. A new spinlock is also added for the new list, so
swap_info entries can be added or removed from the new list immediately as
they become full or not full.
This patch (of 4):
Replace the singly-linked list tracking active, i.e. swapon'ed,
swap_info_struct entries with a doubly-linked list using struct
list_heads. Simplify the logic iterating and manipulating the list of
entries, especially get_swap_page(), by using standard list_head
functions, and removing the highest priority iteration logic.
The change fixes the bug:
https://lkml.org/lkml/2014/2/13/181
in which different priority swap entries after the highest priority entry
are incorrectly used equally in pairs. The swap behavior is now as
advertised, i.e. different priority swap entries are used in order, and
equal priority swap targets are used concurrently.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Weijie Yang <weijieut@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d78c9300c51d6ceed9f6d078d4e9366f259de28c upstream.
timeval_to_jiffies tried to round a timeval up to an integral number
of jiffies, but the logic for doing so was incorrect: intervals
corresponding to exactly N jiffies would become N+1. This manifested
itself particularly repeatedly stopping/starting an itimer:
setitimer(ITIMER_PROF, &val, NULL);
setitimer(ITIMER_PROF, NULL, &val);
would add a full tick to val, _even if it was exactly representable in
terms of jiffies_ (say, the result of a previous rounding.) Doing
this repeatedly would cause unbounded growth in val. So fix the math.
Here's what was wrong with the conversion: we essentially computed
(eliding seconds)
jiffies = usec * (NSEC_PER_USEC/TICK_NSEC)
by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:
jiffies = (usec * x) >> USEC_JIFFIE_SC
and rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value; that is, instead of dividing by (1 usec/ 1 jiffie), we
effectively divided by (1 usec/1 jiffie)-epsilon (rounding
down). This would normally be fine, but we want to round timeouts up,
and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
would be fine if our division was exact, but dividing this by the
slightly smaller factor was equivalent to adding just _over_ 1 to the
final result (instead of just _under_ 1, as desired.)
In particular, with HZ=1000, we consistently computed that 10000 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC.
We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec->nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies. This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.
Tested: the following program:
int main() {
struct itimerval zero = {{0, 0}, {0, 0}};
/* Initially set to 10 ms. */
struct itimerval initial = zero;
initial.it_interval.tv_usec = 10000;
setitimer(ITIMER_PROF, &initial, NULL);
/* Save and restore several times. */
for (size_t i = 0; i < 10; ++i) {
struct itimerval prev;
setitimer(ITIMER_PROF, &zero, &prev);
/* on old kernels, this goes up by TICK_USEC every iteration */
printf("previous value: %ld %ld %ld %ld\n",
prev.it_interval.tv_sec, prev.it_interval.tv_usec,
prev.it_value.tv_sec, prev.it_value.tv_usec);
setitimer(ITIMER_PROF, &prev, NULL);
}
return 0;
}
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Paul Turner <pjt@google.com>
Reported-by: Aaron Jacobs <jacobsa@google.com>
Signed-off-by: Andrew Hunter <ahh@google.com>
[jstultz: Tweaked to apply to 3.17-rc]
Signed-off-by: John Stultz <john.stultz@linaro.org>
[bwh: Backported to 3.16: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 58d75f4b1ce26324b4d809b18f94819843a98731 upstream.
The recent conversion of saa7134 to vb2 unconvered a poll() bug that
broke the teletext applications alevt and mtt. These applications
expect that calling poll() without having called VIDIOC_STREAMON will
cause poll() to return POLLERR. That did not happen in vb2.
This patch fixes that behavior. It also fixes what should happen when
poll() is called when STREAMON is called but no buffers have been
queued. In that case poll() will also return POLLERR, but only for
capture queues since output queues will always return POLLOUT
anyway in that situation.
This brings the vb2 behavior in line with the old videobuf behavior.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 457c1b27ed56ec472d202731b12417bff023594a upstream.
Currently, I am seeing the following when I `mount -t hugetlbfs /none
/dev/hugetlbfs`, and then simply do a `ls /dev/hugetlbfs`. I think it's
related to the fact that hugetlbfs is properly not correctly setting
itself up in this state?:
Unable to handle kernel paging request for data at address 0x00000031
Faulting instruction address: 0xc000000000245710
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048 NUMA pSeries
....
In KVM guests on Power, in a guest not backed by hugepages, we see the
following:
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 64 kB
HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
are not supported at boot-time, but this is only checked in
hugetlb_init(). Extract the check to a helper function, and use it in a
few relevant places.
This does make hugetlbfs not supported (not registered at all) in this
environment. I believe this is fine, as there are no valid hugepages
and that won't change at runtime.
[akpm@linux-foundation.org: use pr_info(), per Mel]
[akpm@linux-foundation.org: fix build when HPAGE_SHIFT is undefined]
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 766a53d059d1500c9755c8af017bd411bd8f1b20 upstream.
Drivers should call this on unload to unregister pmops.
Bug:
https://bugzilla.kernel.org/show_bug.cgi?id=84431
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 22fdcf02f6e80d64a927f702dd9d631a927d87d4 upstream.
This commit reverts the addition of lockdep checking to raw_seqcount_begin
for the following reasons:
1) It violates the naming convention that raw_* functions should not
do lockdep checks (a convention that is also followed by the other
raw_*_seqcount_begin functions).
2) raw_seqcount_begin does not spin, so it can only be part of an ABBA
deadlock in very special circumstances (for instance if a lock
is held across the entire raw_seqcount_begin()+read_seqcount_retry()
loop while also being taken inside the write_seqcount protected area).
3) It is causing false positives with some existing callers, and there
is no non-lockdep alternative for those callers to use.
None of the three existing callers (__d_lookup_rcu, netdev_get_name, and
the NFS state code) appear to use the function in a manner that is ABBA
deadlock prone.
Fixes: 1ca7d67cf5d5: seqcount: Add lockdep functionality to seqcount/seqlock
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/CAHQdGtRR6SvEhXiqWo24hoUh9AU9cL82Z8Z-d8-7u951F_d+5g@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a5fe8e7695dc3f547e955ad2b662e3e72969e506 upstream.
alpha2 is defined as 2-chars array, but is used in multiple
places as string (e.g. with nla_put_string calls), which
might leak kernel data.
Solve it by simply adding an extra char for the NULL
terminator, making such operations safe.
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e09c2c295468476a239d13324ce9042ec4de05eb upstream.
create_singlethread_workqueue() is a compat interface for single
threaded workqueue which maps to ordered workqueue w/ rescuer in the
current implementation. create_singlethread_workqueue() currently
implemented by invoking alloc_workqueue() w/ appropriate parameters.
8719dceae2f9 ("workqueue: reject adjusting max_active or applying
attrs to ordered workqueues") introduced __WQ_ORDERED to protect
ordered workqueues against dynamic attribute changes which can break
ordering guarantees but forgot to apply it to
create_singlethread_workqueue(). This in itself is okay as nobody
currently uses dynamic attribute change on workqueues created with
create_singlethread_workqueue().
However, 4c16bd327c ("workqueue: implement NUMA affinity for unbound
workqueues") broke singlethreaded guarantee for ordered workqueues
through allocating a separate pool_workqueue on each NUMA node by
default. A later change 8a2b75384444 ("workqueue: fix ordered
workqueues in NUMA setups") fixed it by allocating only one global
pool_workqueue if __WQ_ORDERED is set.
Combined, the __WQ_ORDERED omission in create_singlethread_workqueue()
became critical breaking its single threadedness and ordering
guarantee.
Let's make create_singlethread_workqueue() wrap
alloc_ordered_workqueue() instead so that it inherits __WQ_ORDERED and
can implicitly track future ordered_workqueue changes.
v2: I missed that __WQ_ORDERED now protects against pwq splitting
across NUMA nodes and incorrectly described the patch as a
nice-to-have fix to protect against future dynamic attribute
usages. Oleg pointed out that this is actually a critical
breakage due to 8a2b75384444 ("workqueue: fix ordered workqueues
in NUMA setups").
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Anderson <mike.anderson@us.ibm.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Gustavo Luiz Duarte <gduarte@redhat.com>
Cc: Tomas Henzl <thenzl@redhat.com>
Fixes: 4c16bd327c ("workqueue: implement NUMA affinity for unbound workqueues")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f153566570fb9e32c2f59182883f4f66048788fb upstream.
Instead of a void function, return the trigger pointer.
Whilst not in of itself a fix, this makes the following set of
7 fixes cleaner than they would otherwise be.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8ab17fc92e49bc2b8fff9d220c19bf50ec9c1158 upstream.
Commit 46394fd01 (ACPI / hotplug: Move container-specific code out of
the core) removed the generation of "online" uevents for containers,
because "add" uevents are now generated for them automatically when
container system devices are registered. However, there are user
space tools that need to be notified when the container and all of
its children have been enumerated, which doesn't happen any more.
For this reason, add a mechanism allowing "online" uevents to be
generated for ACPI containers after enumerating the container along
with all of its children.
Fixes: 46394fd01 (ACPI / hotplug: Move container-specific code out of the core)
Reported-and-tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Applicable for 3.14-stable only, as this revertes a previous commit that was incorrect.
This commit drops duplicate declaration of struct usb_functionfs_descs_head
erronousely added in commit 28c5980b54 ("usb: gadget: f_fs: resurect
usb_functionfs_descs_head structure").
Fix from 28c5980b54 is applicable only for v3.15-rc1 and newer kernels.
This fixes error in uapi:
/src/linux$ make -C tools usb
make: Entering directory '/src/linux/tools'
DESCEND usb
make[1]: Entering directory '/src/linux/tools/usb'
gcc -Wall -Wextra -g -I../include -o testusb testusb.c -lpthread
gcc -Wall -Wextra -g -I../include -o ffs-test ffs-test.c -lpthread
In file included from ffs-test.c:41:0:
../../include/uapi/linux/usb/functionfs.h:42:8: error: redefinition of ‘struct usb_functionfs_descs_head’
struct usb_functionfs_descs_head {
^
../../include/uapi/linux/usb/functionfs.h:31:8: note: originally defined here
struct usb_functionfs_descs_head {
^
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Karol Lewandowski <k.lewandowsk@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bfcfd44cce2774f19daeb59fb4e43fc9aa80e7b8 upstream.
The guard was introduced in commit ea1a8217b06b ("xattr: guard against
simultaneous glibc header inclusion") but it is using #ifdef to check
for a define that is either set to 1 or 0. Fix it to use #if instead.
* Without this patch:
$ { echo "#include <sys/xattr.h>"; echo "#include <linux/xattr.h>"; } | gcc -E -Iinclude/uapi - >/dev/null
include/uapi/linux/xattr.h:19:0: warning: "XATTR_CREATE" redefined [enabled by default]
#define XATTR_CREATE 0x1 /* set value, fail if attr already exists */
^
/usr/include/x86_64-linux-gnu/sys/xattr.h:32:0: note: this is the location of the previous definition
#define XATTR_CREATE XATTR_CREATE
^
* With this patch:
$ { echo "#include <sys/xattr.h>"; echo "#include <linux/xattr.h>"; } | gcc -E -Iinclude/uapi - >/dev/null
(no warnings)
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: Allan McRae <allan@archlinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 44b1d53043c482225196e8a9cd9f35163a1b3336 upstream.
Add d_is_dir(dentry) helper which is analogous to S_ISDIR().
To avoid confusion, rename d_is_directory() to d_can_lookup().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit db1044d458a287c18c4d413adc4ad12e92e253b5 upstream.
added struct sockaddr_storage to rdma_user_cm.h without also adding an
include for linux/socket.h to make sure it is defined. Systemtap
needs the header files to build standalone and cannot rely on other
files to pre-include other headers, so add linux/socket.h to the list
of includes in this file.
Fixes: ee7aed4528f ("RDMA/ucma: Support querying for AF_IB addresses")
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9566d6742852c527bf5af38af5cbb878dad75705 upstream.
While invesgiating the issue where in "mount --bind -oremount,ro ..."
would result in later "mount --bind -oremount,rw" succeeding even if
the mount started off locked I realized that there are several
additional mount flags that should be locked and are not.
In particular MNT_NOSUID, MNT_NODEV, MNT_NOEXEC, and the atime
flags in addition to MNT_READONLY should all be locked. These
flags are all per superblock, can all be changed with MS_BIND,
and should not be changable if set by a more privileged user.
The following additions to the current logic are added in this patch.
- nosuid may not be clearable by a less privileged user.
- nodev may not be clearable by a less privielged user.
- noexec may not be clearable by a less privileged user.
- atime flags may not be changeable by a less privileged user.
The logic with atime is that always setting atime on access is a
global policy and backup software and auditing software could break if
atime bits are not updated (when they are configured to be updated),
and serious performance degradation could result (DOS attack) if atime
updates happen when they have been explicitly disabled. Therefore an
unprivileged user should not be able to mess with the atime bits set
by a more privileged user.
The additional restrictions are implemented with the addition of
MNT_LOCK_NOSUID, MNT_LOCK_NODEV, MNT_LOCK_NOEXEC, and MNT_LOCK_ATIME
mnt flags.
Taken together these changes and the fixes for MNT_LOCK_READONLY
should make it safe for an unprivileged user to create a user
namespace and to call "mount --bind -o remount,... ..." without
the danger of mount flags being changed maliciously.
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a6138db815df5ee542d848318e5dae681590fccd upstream.
Kenton Varda <kenton@sandstorm.io> discovered that by remounting a
read-only bind mount read-only in a user namespace the
MNT_LOCK_READONLY bit would be cleared, allowing an unprivileged user
to the remount a read-only mount read-write.
Correct this by replacing the mask of mount flags to preserve
with a mask of mount flags that may be changed, and preserve
all others. This ensures that any future bugs with this mask and
remount will fail in an easy to detect way where new mount flags
simply won't change.
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a383b68d9fe9864c4d3b86f67ad6488f58136435 upstream.
The _SUN device indentification object is not guaranteed to return
the same value every time it is executed, so we should not cache its
return value, but rather execute it every time as needed. If it is
cached, an incorrect stale value may be used in some situations.
This issue was exposed by commit 202317a573b2 (ACPI / scan: Add
acpi_device objects for all device nodes in the namespace). Fix it
by avoiding to cache the return value of _SUN.
Fixes: 202317a573b2 (ACPI / scan: Add acpi_device objects for all device nodes in the namespace)
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0213436a2cc5e4a5ca2fabfaa4d3877097f3b13f upstream.
Some devices don't like REPORT SUPPORTED OPERATION CODES and will
simply timeout causing sd_mod init to take a very very long time.
Introduce BLIST_NO_RSOC scsi scan flag, that stops RSOC from being
issued. Add it to Promise Vtrak E610f entry in scsi scan
blacklist. Fixes bug #79901 reported at
https://bugzilla.kernel.org/show_bug.cgi?id=79901
Fixes: 98dcc2946adb ("SCSI: sd: Update WRITE SAME heuristics")
Signed-off-by: Janusz Dziemidowicz <rraptorr@nails.eu.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c1d40a527e885a40bb9ea6c46a1b1145d42b66a0 upstream.
Despite supporting modern SCSI features some storage devices continue to
claim conformance to an older version of the SPC spec. This is done for
compatibility with legacy operating systems.
Linux by default will not attempt to read VPD pages on devices that
claim SPC-2 or older. Introduce a blacklist flag that can be used to
trigger VPD page inquiries on devices that are known to support them.
Reported-by: KY Srinivasan <kys@microsoft.com>
Tested-by: KY Srinivasan <kys@microsoft.com>
Reviewed-by: KY Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 22ffeb48b7584d6cd50f2a595ed6065d86a87459 upstream.
Sequential scan for more than 256 LUNs is very fragile as
LUNs might not be numbered sequentially after that point.
SAM revisions later than SCSI-3 impose a structure on
LUNs larger than 256, making LUN numbers between 256
and 16384 illegal.
SCSI-3, however allows for plain 64-bit numbers with
no internal structure.
So restrict sequential LUN scan to 256 LUNs and add a
new blacklist flag 'BLIST_SCSI3LUN' to scan up to
max_lun devices.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7d8b6c63751cfbbe5eef81a48c22978b3407a3ad upstream.
This is effectively a revert of 7b9a7ec565505699f503b4fcf61500dceb36e744
plus fixing it a different way...
We found, when trying to run an application from an application which
had dropped privs that the kernel does security checks on undefined
capability bits. This was ESPECIALLY difficult to debug as those
undefined bits are hidden from /proc/$PID/status.
Consider a root application which drops all capabilities from ALL 4
capability sets. We assume, since the application is going to set
eff/perm/inh from an array that it will clear not only the defined caps
less than CAP_LAST_CAP, but also the higher 28ish bits which are
undefined future capabilities.
The BSET gets cleared differently. Instead it is cleared one bit at a
time. The problem here is that in security/commoncap.c::cap_task_prctl()
we actually check the validity of a capability being read. So any task
which attempts to 'read all things set in bset' followed by 'unset all
things set in bset' will not even attempt to unset the undefined bits
higher than CAP_LAST_CAP.
So the 'parent' will look something like:
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffc000000000
All of this 'should' be fine. Given that these are undefined bits that
aren't supposed to have anything to do with permissions. But they do...
So lets now consider a task which cleared the eff/perm/inh completely
and cleared all of the valid caps in the bset (but not the invalid caps
it couldn't read out of the kernel). We know that this is exactly what
the libcap-ng library does and what the go capabilities library does.
They both leave you in that above situation if you try to clear all of
you capapabilities from all 4 sets. If that root task calls execve()
the child task will pick up all caps not blocked by the bset. The bset
however does not block bits higher than CAP_LAST_CAP. So now the child
task has bits in eff which are not in the parent. These are
'meaningless' undefined bits, but still bits which the parent doesn't
have.
The problem is now in cred_cap_issubset() (or any operation which does a
subset test) as the child, while a subset for valid cap bits, is not a
subset for invalid cap bits! So now we set durring commit creds that
the child is not dumpable. Given it is 'more priv' than its parent. It
also means the parent cannot ptrace the child and other stupidity.
The solution here:
1) stop hiding capability bits in status
This makes debugging easier!
2) stop giving any task undefined capability bits. it's simple, it you
don't put those invalid bits in CAP_FULL_SET you won't get them in init
and you won't get them in any other task either.
This fixes the cap_issubset() tests and resulting fallout (which
made the init task in a docker container untraceable among other
things)
3) mask out undefined bits when sys_capset() is called as it might use
~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.
4) mask out undefined bit when we read a file capability off of disk as
again likely all bits are set in the xattr for forward/backward
compatibility.
This lets 'setcap all+pe /bin/bash; /bin/bash' run
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8e54caf407b98efa05409e1fee0e5381abd2b088 upstream.
Some Atmel TPMs provide completely wrong timeouts from their
TPM_CAP_PROP_TIS_TIMEOUT query. This patch detects that and returns
new correct values via a DID/VID table in the TIS driver.
Tested on ARM using an AT97SC3204T FW version 37.16
[PHuewe: without this fix these 'broken' Atmel TPMs won't function on
older kernels]
Signed-off-by: "Berg, Christopher" <Christopher.Berg@atmel.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
|
|
commit 3c45ddf823d679a820adddd53b52c6699c9a05ac upstream.
The current code always selects XPRT_TRANSPORT_BC_TCP for the back
channel, even when the forward channel was not TCP (eg, RDMA). When
a 4.1 mount is attempted with RDMA, the server panics in the TCP BC
code when trying to send CB_NULL.
Instead, construct the transport protocol number from the forward
channel transport or'd with XPRT_TRANSPORT_BC. Transports that do
not support bi-directional RPC will not have registered a "BC"
transport, causing create_backchannel_client() to fail immediately.
Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit db9ee220361de03ee86388f9ea5e529eaad5323c upstream.
It turns out that there are some serious problems with the on-disk
format of journal checksum v2. The foremost is that the function to
calculate descriptor tag size returns sizes that are too big. This
causes alignment issues on some architectures and is compounded by the
fact that some parts of jbd2 use the structure size (incorrectly) to
determine the presence of a 64bit journal instead of checking the
feature flags.
Therefore, introduce journal checksum v3, which enlarges the
descriptor block tag format to allow for full 32-bit checksums of
journal blocks, fix the journal tag function to return the correct
sizes, and fix the jbd2 recovery code to use feature flags to
determine 64bitness.
Add a few function helpers so we don't have to open-code quite so
many pieces.
Switching to a 16-byte block size was found to increase journal size
overhead by a maximum of 0.1%, to convert a 32-bit journal with no
checksumming to a 32-bit journal with checksum v3 enabled.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 37dbeab788a8f23fd946c0be083e5484d6f929a1 upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5fc540edc8ea1297c76685f74bc82a2107fe6731 upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6dc14baf4ced769017c7a7045019c7a19f373865 upstream.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=82912
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 95cb5745983c222867cc9ac593aebb2ad67d72c0 ]
Ipv4 tunnels created with "local any remote $ip" didn't work properly since
7d442fab0 (ipv4: Cache dst in tunnels). 99% of packets sent via those tunnels
had src addr = 0.0.0.0. That was because only dst_entry was cached, although
fl4.saddr has to be cached too. Every time ip_tunnel_xmit used cached dst_entry
(tunnel_rtable_get returned non-NULL), fl4.saddr was initialized with
tnl_params->saddr (= 0 in our case), and wasn't changed until iptunnel_xmit().
This patch adds saddr to ip_tunnel->dst_cache, fixing this issue.
Reported-by: Sergey Popov <pinkbyte@gentoo.org>
Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 04ca6973f7c1a0d8537f2d9906a0cf8e69886d75 ]
In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways exploiting linux IP identifier generation to
infer whether two machines are exchanging packets.
With commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.
This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.
Note that prandom_u32_max(1) returns 0, so if generator is used at most
once per jiffy, this patch inserts no hole in the ID suite and do not
increase collision probability.
This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.
We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.
For IPv6, adds saddr into the hash as well, but not nexthdr.
If I ping the patched target, we can see ID are now hard to predict.
21:57:11.008086 IP (...)
A > target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
target > A: ICMP echo reply, seq 1, length 64
21:57:12.013133 IP (...)
A > target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
target > A: ICMP echo reply, seq 2, length 64
21:57:13.016580 IP (...)
A > target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
target > A: ICMP echo reply, seq 3, length 64
[1] TCP sessions uses a per flow ID generator not changed by this patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jeffrey Knockel <jeffk@cs.unm.edu>
Reported-by: Jedidiah R. Crandall <crandall@cs.unm.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]
Ideally, we would need to generate IP ID using a per destination IP
generator.
linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.
1) each inet_peer struct consumes 192 bytes
2) inetpeer cache uses a binary tree of inet_peer structs,
with a nominal size of ~66000 elements under load.
3) lookups in this tree are hitting a lot of cache lines, as tree depth
is about 20.
4) If server deals with many tcp flows, we have a high probability of
not finding the inet_peer, allocating a fresh one, inserting it in
the tree with same initial ip_id_count, (cf secure_ip_id())
5) We garbage collect inet_peer aggressively.
IP ID generation do not have to be 'perfect'
Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.
We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.
ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)
secure_ip_id() and secure_ipv6_id() no longer are needed.
Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 23d9cec07c589276561c13b180577c0b87930140 upstream.
The DRA74/72 control module pins have a weak pull up and pull down.
This is configured by bit offset 17. if BIT(17) is 1, a pull up is
selected, else a pull down is selected.
However, this pull resisstor is applied based on BIT(16) -
PULLUDENABLE - if BIT(18) is *0*, then pull as defined in BIT(17) is
applied, else no weak pulls are applied. We defined this in reverse.
Reference: Table 18-5 (Description of the pad configuration register
bits) in Technical Reference Manual Revision (DRA74x revision Q:
SPRUHI2Q Revised June 2014 and DRA72x revision F: SPRUHP2F - Revised
June 2014)
Fixes: 6e58b8f1daaf1a ("ARM: dts: DRA7: Add the dts files for dra7 SoC and dra7-evm board")
Signed-off-by: Nishanth Menon <nm@ti.com>
Tested-by: Felipe Balbi <balbi@ti.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit aac74dc495456412c4130a1167ce4beb6c1f0b38 upstream.
After learning we'll need some sort of deferred printk functionality in
the timekeeping core, Peter suggested we rename the printk_sched function
so it can be reused by needed subsystems.
This only changes the function name. No logic changes.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1a112d10f03e83fb3a2fdc4c9165865dec8a3ca6 upstream.
1871ee134b73 ("libata: support the ata host which implements a queue
depth less than 32") directly used ata_port->scsi_host->can_queue from
ata_qc_new() to determine the number of tags supported by the host;
unfortunately, SAS controllers doing SATA don't initialize ->scsi_host
leading to the following oops.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: [<ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0
PGD 0
Oops: 0002 [#1] SMP
Modules linked in: isci libsas scsi_transport_sas mgag200 drm_kms_helper ttm
CPU: 1 PID: 518 Comm: udevd Not tainted 3.16.0-rc6+ #62
Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
task: ffff880c1a00b280 ti: ffff88061a000000 task.ti: ffff88061a000000
RIP: 0010:[<ffffffff814e0618>] [<ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0
RSP: 0018:ffff88061a003ae8 EFLAGS: 00010012
RAX: 0000000000000001 RBX: ffff88000241ca80 RCX: 00000000000000fa
RDX: 0000000000000020 RSI: 0000000000000020 RDI: ffff8806194aa298
RBP: ffff88061a003ae8 R08: ffff8806194a8000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88000241ca80 R12: ffff88061ad58200
R13: ffff8806194aa298 R14: ffffffff814e67a0 R15: ffff8806194a8000
FS: 00007f3ad7fe3840(0000) GS:ffff880627620000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000058 CR3: 000000061a118000 CR4: 00000000001407e0
Stack:
ffff88061a003b20 ffffffff814e96e1 ffff88000241ca80 ffff88061ad58200
ffff8800b6bf6000 ffff880c1c988000 ffff880619903850 ffff88061a003b68
ffffffffa0056ce1 ffff88061a003b48 0000000013d6e6f8 ffff88000241ca80
Call Trace:
[<ffffffff814e96e1>] ata_sas_queuecmd+0xa1/0x430
[<ffffffffa0056ce1>] sas_queuecommand+0x191/0x220 [libsas]
[<ffffffff8149afee>] scsi_dispatch_cmd+0x10e/0x300 [<ffffffff814a3bc5>] scsi_request_fn+0x2f5/0x550
[<ffffffff81317613>] __blk_run_queue+0x33/0x40
[<ffffffff8131781a>] queue_unplugged+0x2a/0x90
[<ffffffff8131ceb4>] blk_flush_plug_list+0x1b4/0x210
[<ffffffff8131d274>] blk_finish_plug+0x14/0x50
[<ffffffff8117eaa8>] __do_page_cache_readahead+0x198/0x1f0
[<ffffffff8117ee21>] force_page_cache_readahead+0x31/0x50
[<ffffffff8117ee7e>] page_cache_sync_readahead+0x3e/0x50
[<ffffffff81172ac6>] generic_file_read_iter+0x496/0x5a0
[<ffffffff81219897>] blkdev_read_iter+0x37/0x40
[<ffffffff811e307e>] new_sync_read+0x7e/0xb0
[<ffffffff811e3734>] vfs_read+0x94/0x170
[<ffffffff811e43c6>] SyS_read+0x46/0xb0
[<ffffffff811e33d1>] ? SyS_lseek+0x91/0xb0
[<ffffffff8171ee29>] system_call_fastpath+0x16/0x1b
Code: 00 00 00 88 50 29 83 7f 08 01 19 d2 83 e2 f0 83 ea 50 88 50 34 c6 81 1d 02 00 00 40 c6 81 17 02 00 00 00 5d c3 66 0f 1f 44 00 00 <89> 14 25 58 00 00 00
Fix it by introducing ata_host->n_tags which is initialized to
ATA_MAX_QUEUE - 1 in ata_host_init() for SAS controllers and set to
scsi_host_template->can_queue in ata_host_register() for !SAS ones.
As SAS hosts are never registered, this will give them the same
ATA_MAX_QUEUE - 1 as before. Note that we can't use
scsi_host->can_queue directly for SAS hosts anyway as they can go
higher than the libata maximum.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Reported-by: Jesse Brandeburg <jesse.brandeburg@gmail.com>
Reported-by: Peter Hurley <peter@hurleysoftware.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 1871ee134b73 ("libata: support the ata host which implements a queue depth less than 32")
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5925a0555bdaf0b396a84318cbc21ba085f6c0d3 ]
sk_dst_cache has __rcu annotation, so we need a cast to avoid
following sparse error :
include/net/sock.h:1774:19: warning: incorrect type in initializer (different address spaces)
include/net/sock.h:1774:19: expected struct dst_entry [noderef] <asn:4>*__ret
include/net/sock.h:1774:19: got struct dst_entry *dst
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 7f502361531e ("ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7f502361531e9eecb396cf99bdc9e9a59f7ebd7f ]
We have two different ways to handle changes to sk->sk_dst
First way (used by TCP) assumes socket lock is owned by caller, and use
no extra lock : __sk_dst_set() & __sk_dst_reset()
Another way (used by UDP) uses sk_dst_lock because socket lock is not
always taken. Note that sk_dst_lock is not softirq safe.
These ways are not inter changeable for a given socket type.
ipv4_sk_update_pmtu(), added in linux-3.8, added a race, as it used
the socket lock as synchronization, but users might be UDP sockets.
Instead of converting sk_dst_lock to a softirq safe version, use xchg()
as we did for sk_rx_dst in commit e47eb5dfb296b ("udp: ipv4: do not use
sk_dst_lock from softirq context")
In a follow up patch, we probably can remove sk_dst_lock, as it is
only used in IPv6.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Fixes: 9cb3a50c5f63e ("ipv4: Invalidate the socket cached route on pmtu events if possible")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f88649721268999bdff09777847080a52004f691 ]
When IP route cache had been removed in linux-3.6, we broke assumption
that dst entries were all freed after rcu grace period. DST_NOCACHE
dst were supposed to be freed from dst_release(). But it appears
we want to keep such dst around, either in UDP sockets or tunnels.
In sk_dst_get() we need to make sure dst refcount is not 0
before incrementing it, or else we might end up freeing a dst
twice.
DST_NOCACHE set on a dst does not mean this dst can not be attached
to a socket or a tunnel.
Then, before actual freeing, we need to observe a rcu grace period
to make sure all other cpus can catch the fact the dst is no longer
usable.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dormando <dormando@rydia.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8b8b36834d0fff67fc8668093f4312dd04dcf21d upstream.
The per_cpu buffers are created one per possible CPU. But these do
not mean that those CPUs are online, nor do they even exist.
With the addition of the ring buffer polling, it assumes that the
caller polls on an existing buffer. But this is not the case if
the user reads trace_pipe from a CPU that does not exist, and this
causes the kernel to crash.
Simple fix is to check the cpu against buffer bitmask against to see
if the buffer was allocated or not and return -ENODEV if it is
not.
More updates were done to pass the -ENODEV back up to userspace.
Link: http://lkml.kernel.org/r/5393DB61.6060707@oracle.com
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 09122141785348bf9539762a5f5dbbae3761c783 upstream.
Even though usb_functionfs_descs_head structure is now deprecated,
it has been used by some user space tools. Its removel in commit
[ac8dde1: “Add flags to descriptors block”] was an oversight
leading to build breakage for such tools.
Bring it back so that old user space tools can still be build
without problems on newer kernel versions.
Reported-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Reported-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4af4206be2bd1933cae20c2b6fb2058dbc887f7c upstream.
syscall_regfunc() and syscall_unregfunc() should set/clear
TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
with copy_process() and miss the new child which was not added to
the process/thread lists yet.
Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
under tasklist.
Link: http://lkml.kernel.org/p/20140413185854.GB20668@redhat.com
Fixes: a871bd33a6c0 "tracing: Add syscall tracepoints"
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b9cd18de4db3c9ffa7e17b0dc0ca99ed5aa4d43a upstream.
The 'sysret' fastpath does not correctly restore even all regular
registers, much less any segment registers or reflags values. That is
very much part of why it's faster than 'iret'.
Normally that isn't a problem, because the normal ptrace() interface
catches the process using the signal handler infrastructure, which
always returns with an iret.
However, some paths can get caught using ptrace_event() instead of the
signal path, and for those we need to make sure that we aren't going to
return to user space using 'sysret'. Otherwise the modifications that
may have been done to the register set by the tracer wouldn't
necessarily take effect.
Fix it by forcing IRET path by setting TIF_NOTIFY_RESUME from
arch_ptrace_stop_needed() which is invoked from ptrace_stop().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1e77d0a1ed7417d2a5a52a7b8d32aea1833faa6c upstream.
Till reported that the spurious interrupt detection of threaded
interrupts is broken in two ways:
- note_interrupt() is called for each action thread of a shared
interrupt line. That's wrong as we are only interested whether none
of the device drivers felt responsible for the interrupt, but by
calling multiple times for a single interrupt line we account
IRQ_NONE even if one of the drivers felt responsible.
- note_interrupt() when called from the thread handler is not
serialized. That leaves the members of irq_desc which are used for
the spurious detection unprotected.
To solve this we need to defer the spurious detection of a threaded
interrupt to the next hardware interrupt context where we have
implicit serialization.
If note_interrupt is called with action_ret == IRQ_WAKE_THREAD, we
check whether the previous interrupt requested a deferred check. If
not, we request a deferred check for the next hardware interrupt and
return.
If set, we check whether one of the interrupt threads signaled
success. Depending on this information we feed the result into the
spurious detector.
If one primary handler of a shared interrupt returns IRQ_HANDLED we
disable the deferred check of irq threads on the same line, as we have
found at least one device driver who cared.
Reported-by: Till Straumann <strauman@slac.stanford.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Austin Schuh <austin@peloton-tech.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: linux-can@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1303071450130.22263@ionos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2426bd456a61407388b6e61fc5f98dbcbebc50e2 upstream.
When an initiator sends an allocation length bigger than what its
command consumes, the target should only return the actual response data
and set the residual length to the unused part of the allocation length.
Add a helper function that command handlers (INQUIRY, READ CAPACITY,
etc) can use to do this correctly, and use this code to get the correct
residual for commands that don't use the full initiator allocation in the
handlers for READ CAPACITY, READ CAPACITY(16), INQUIRY, MODE SENSE and
REPORT LUNS.
This addresses a handful of failures as reported by Christophe with
the Windows Certification Kit:
http://permalink.gmane.org/gmane.linux.scsi.target.devel/6515
Signed-off-by: Roland Dreier <roland@purestorage.com>
Tested-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 22c7aaa57e80853b4904a46c18f97db0036a3b97 upstream.
In case the transport is iser we should not include the
iscsi target info in the sendtargets text response pdu.
This causes sendtargets response to include the target
info twice.
Modify iscsit_build_sendtargets_response to filter
transport types that don't match.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reported-by: Slava Shwartsman <valyushash@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|