summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-11-28random: run random_int_secret_init() run after all late_initcallsTheodore Ts'o
commit 47d06e532e95b71c0db3839ebdef3fe8812fca2c upstream. The some platforms (e.g., ARM) initializes their clocks as late_initcalls for some unknown reason. So make sure random_int_secret_init() is run after all of the late_initcalls are run. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28jfs: fix error path in iallocDave Kleikamp
commit 8660998608cfa1077e560034db81885af8e1e885 upstream. If insert_inode_locked() fails, we shouldn't be calling unlock_new_inode(). Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Tested-by: Michael L. Semon <mlsemon35@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28include/linux/fs.h: disable preempt when acquire i_size_seqcount write lockFan Du
commit 74e3d1e17b2e11d175970b85acd44f5927000ba2 upstream. Two rt tasks bind to one CPU core. The higher priority rt task A preempts a lower priority rt task B which has already taken the write seq lock, and then the higher priority rt task A try to acquire read seq lock, it's doomed to lockup. rt task A with lower priority: call write i_size_write rt task B with higher priority: call sync, and preempt task A write_seqcount_begin(&inode->i_size_seqcount); i_size_read inode->i_size = i_size; read_seqcount_begin <-- lockup here... So disable preempt when acquiring every i_size_seqcount *write* lock will cure the problem. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28tracing: Fix potential out-of-bounds in trace_get_user()Steven Rostedt
commit 057db8488b53d5e4faa0cedb2f39d4ae75dfbdbb upstream. Andrey reported the following report: ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3 ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3) Accessed by thread T13003: #0 ffffffff810dd2da (asan_report_error+0x32a/0x440) #1 ffffffff810dc6b0 (asan_check_region+0x30/0x40) #2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20) #3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260) #4 ffffffff812a1065 (__fput+0x155/0x360) #5 ffffffff812a12de (____fput+0x1e/0x30) #6 ffffffff8111708d (task_work_run+0x10d/0x140) #7 ffffffff810ea043 (do_exit+0x433/0x11f0) #8 ffffffff810eaee4 (do_group_exit+0x84/0x130) #9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30) #10 ffffffff81928782 (system_call_fastpath+0x16/0x1b) Allocated by thread T5167: #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0) #1 ffffffff8128337c (__kmalloc+0xbc/0x500) #2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90) #3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0) #4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40) #5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430) #6 ffffffff8129b668 (finish_open+0x68/0xa0) #7 ffffffff812b66ac (do_last+0xb8c/0x1710) #8 ffffffff812b7350 (path_openat+0x120/0xb50) #9 ffffffff812b8884 (do_filp_open+0x54/0xb0) #10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0) #11 ffffffff8129d4b7 (SyS_open+0x37/0x50) #12 ffffffff81928782 (system_call_fastpath+0x16/0x1b) Shadow bytes around the buggy address: ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap redzone: fa Heap kmalloc redzone: fb Freed heap region: fd Shadow gap: fe The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;' Although the crash happened in ftrace_regex_open() the real bug occurred in trace_get_user() where there's an incrementation to parser->idx without a check against the size. The way it is triggered is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop that reads the last character stores it and then breaks out because there is no more characters. Then the last character is read to determine what to do next, and the index is incremented without checking size. Then the caller of trace_get_user() usually nulls out the last character with a zero, but since the index is equal to the size, it writes a nul character after the allocated space, which can corrupt memory. Luckily, only root user has write access to this file. Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the ↵Patrick McHardy
packet commit 3a7b21eaf4fb3c971bdb47a98f570550ddfe4471 upstream. Some Cisco phones create huge messages that are spread over multiple packets. After calculating the offset of the SIP body, it is validated to be within the packet and the packet is dropped otherwise. This breaks operation of these phones. Since connection tracking is supposed to be passive, just let those packets pass unmodified and untracked. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [bwh: Backported to 3.2: there is no log message to delete] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-288139cp: re-enable interrupts after tx timeoutDavid Woodhouse
commit 01ffc0a7f1c1801a2354719dedbc32aff45b987d upstream. Recovery doesn't work too well if we leave interrupts disabled... Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28inet: fix possible memory corruption with UDP_CORK and UFOHannes Frederic Sowa
[ This is a simplified -stable version of a set of upstream commits. ] This is a replacement patch only for stable which does fix the problems handled by the following two commits in -net: "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9) "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b) Three frames are written on a corked udp socket for which the output netdevice has UFO enabled. If the first and third frame are smaller than the mtu and the second one is bigger, we enqueue the second frame with skb_append_datato_frags without initializing the gso fields. This leads to the third frame appended regulary and thus constructing an invalid skb. This fixes the problem by always using skb_append_datato_frags as soon as the first frag got enqueued to the skb without marking the packet as SKB_GSO_UDP. The problem with only two frames for ipv6 was fixed by "ipv6: udp packets following an UFO enqueued packet need also be handled by UFO" (2811ebac2521ceac84f2bdae402455baa6a7fb47). Cc: Jiri Pirko <jiri@resnulli.us> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28perf tools: Fix getrusage() related build failure on glibc trunkMarkus Trippelsdorf
commit 7b78f13603c6fcb64e020a0bbe31a651ea2b657b upstream. On a system running glibc trunk perf doesn't build: CC builtin-sched.o builtin-sched.c: In function ‘get_cpu_usage_nsec_parent’: builtin-sched.c:399:16: error: storage size of ‘ru’ isn’t known builtin-sched.c:403:2: error: implicit declaration of function ‘getrusage’ [-Werror=implicit-function-declaration] [...] Fix it by including sys/resource.h. Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120404084527.GA294@x4 Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28xen-netback: use jiffies_64 value to calculate credit timeoutWei Liu
[ Upstream commit 059dfa6a93b779516321e5112db9d7621b1367ba ] time_after_eq() only works if the delta is < MAX_ULONG/2. For a 32bit Dom0, if netfront sends packets at a very low rate, the time between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for timer_after_eq() will be incorrect. Credit will not be replenished and the guest may become unable to send packets (e.g., if prior to the long gap, all credit was exhausted). Use jiffies_64 variant to mitigate this problem for 32bit Dom0. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jason Luan <jianhai.luan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28perf: Fix perf ring buffer memory orderingPeter Zijlstra
commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream. The PPC64 people noticed a missing memory barrier and crufty old comments in the perf ring buffer code. So update all the comments and add the missing barrier. When the architecture implements local_t using atomic_long_t there will be double barriers issued; but short of introducing more conditional barrier primitives this is the best we can do. Reported-by: Victor Kaplansky <victork@il.ibm.com> Tested-by: Victor Kaplansky <victork@il.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: michael@ellerman.id.au Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: anton@samba.org Cc: benh@kernel.crashing.org Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28zram: allow request end to coincide with disksizeSergey Senozhatsky
commit 75c7caf5a052ffd8db3312fa7864ee2d142890c4 upstream. Pass valid_io_request() checks if request end coincides with disksize (end equals bound), only fail if we attempt to read beyond the bound. mkfs.ext2 produces numerous errors: [ 2164.632747] quiet_error: 1 callbacks suppressed [ 2164.633260] Buffer I/O error on device zram0, logical block 153599 [ 2164.633265] lost page write due to I/O error on zram0 Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28ext3: return 32/64-bit dir name hash according to usage typeEric Sandeen
commit d7dab39b6e16d5eea78ed3c705d2a2d0772b4f06 upstream. This is based on commit d1f5273e9adb40724a85272f248f210dc4ce919a ext4: return 32/64-bit dir name hash according to usage type by Fan Yong <yong.fan@whamcloud.com> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek() to appease NFSv2, which can only handle a 32-bit cookie for seekdir() and telldir(). However, this causes problems if there are 32-bit hash collisions, since the NFSv2 server can get stuck resending the same entries from the directory repeatedly. Allow ext3 to return a full 64-bit hash (both major and minor) for telldir to decrease the chance of hash collisions. This patch does implement a new ext3_dir_llseek op, because with 64-bit hashes, nfs will attempt to seek to a hash "offset" which is much larger than ext3's s_maxbytes. So for dx dirs, we call generic_file_llseek_size() with the appropriate max hash value as the maximum seekable size. Otherwise we just pass through to generic_file_llseek(). Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Patch-updated-by: Eric Sandeen <sandeen@redhat.com> (blame us if something is not correct) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)Bernd Schubert
commit 06effdbb49af5f6c7d20affaec74603914acc768 upstream. Use 32-bit or 64-bit llseek() hashes for directory offsets depending on the NFS version. NFSv2 gets 32-bit hashes only. NOTE: This patch got rather complex as Christoph asked to set the filp->f_mode flag in the open call or immediatly after dentry_open() in nfsd_open() to avoid races. Personally I still do not see a reason for that and in my opinion FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it follows directly after nfsd_open() without a chance of races. Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28nfsd: rename 'int access' to 'int may_flags' in nfsd_open()Bernd Schubert
commit 999448a8c0202d8c41711c92385323520644527b upstream. Just rename this variable, as the next patch will add a flag and 'access' as variable name would not be correct any more. Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28ext4: return 32/64-bit dir name hash according to usage typeFan Yong
commit d1f5273e9adb40724a85272f248f210dc4ce919a upstream. Traditionally ext2/3/4 has returned a 32-bit hash value from llseek() to appease NFSv2, which can only handle a 32-bit cookie for seekdir() and telldir(). However, this causes problems if there are 32-bit hash collisions, since the NFSv2 server can get stuck resending the same entries from the directory repeatedly. Allow ext4 to return a full 64-bit hash (both major and minor) for telldir to decrease the chance of hash collisions. This still needs integration on the NFS side. Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> (blame me if something is not correct) Signed-off-by: Fan Yong <yong.fan@whamcloud.com> Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28fs: add new FMODE flags: FMODE_32bithash and FMODE_64bithashBernd Schubert
commit 6a8a13e03861c0ab83ab07d573ca793cff0e5d00 upstream. Those flags are supposed to be set by NFS readdir() to tell ext3/ext4 to 32bit (NFSv2) or 64bit hash values (offsets) in seekdir(). Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28PCI: fix truncation of resource size to 32 bitsNikhil P Rao
commit d6776e6d5c2f8db0252f447b09736075e1bbe387 upstream. _pci_assign_resource() took an int "size" argument, which meant that sizes larger than 4GB were truncated. Change type to resource_size_t. [bhelgaas: changelog] Signed-off-by: Nikhil P Rao <nikhil.rao@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28davinci_emac.c: Fix IFF_ALLMULTI setupMariusz Ceier
[ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ] When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't, emac_dev_mcast_set should only enable RX of multicasts and reset MACHASH registers. It does this, but afterwards it either sets up multicast MACs filtering or disables RX of multicasts and resets MACHASH registers again, rendering IFF_ALLMULTI flag useless. This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set. Tested with kernel 2.6.37. Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28net: fix cipso packet validation when !NETLABELSeif Mazareeb
[ Upstream commit f2e5ddcc0d12f9c4c7b254358ad245c9dddce13b ] When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop forever in the main loop if opt[opt_iter +1] == 0, this will causing a kernel crash in an SMP system, since the CPU executing this function will stall /not respond to IPIs. This problem can be reproduced by running the IP Stack Integrity Checker (http://isic.sourceforge.net) using the following command on a Linux machine connected to DUT: "icmpsic -s rand -d <DUT IP address> -r 123456" wait (1-2 min) Signed-off-by: Seif Mazareeb <seif@marvell.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix raceDaniel Borkmann
[ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ] In the case of credentials passing in unix stream sockets (dgram sockets seem not affected), we get a rather sparse race after commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default"). We have a stream server on receiver side that requests credential passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED on each spawned/accepted socket on server side to 1 first (as it's not inherited), it can happen that in the time between accept() and setsockopt() we get interrupted, the sender is being scheduled and continues with passing data to our receiver. At that time SO_PASSCRED is neither set on sender nor receiver side, hence in cmsg's SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534 (== overflow{u,g}id) instead of what we actually would like to see. On the sender side, here nc -U, the tests in maybe_add_creds() invoked through unix_stream_sendmsg() would fail, as at that exact time, as mentioned, the sender has neither SO_PASSCRED on his side nor sees it on the server side, and we have a valid 'other' socket in place. Thus, sender believes it would just look like a normal connection, not needing/requesting SO_PASSCRED at that time. As reverting 16e5726 would not be an option due to the significant performance regression reported when having creds always passed, one way/trade-off to prevent that would be to set SO_PASSCRED on the listener socket and allow inheriting these flags to the spawned socket on server side in accept(). It seems also logical to do so if we'd tell the listener socket to pass those flags onwards, and would fix the race. Before, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}}, msg_flags=0}, 0) = 5 After, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}}, msg_flags=0}, 0) = 5 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28wanxl: fix info leak in ioctlSalva Peiró
[ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ] The wanxl_ioctl() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Salva Peiró <speiro@ai2.upv.es> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28sctp: Perform software checksum if packet has to be fragmented.Vlad Yasevich
[ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ] IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum. This causes problems if SCTP packets has to be fragmented and ipsummed has been set to PARTIAL due to checksum offload support. This condition can happen when retransmitting after MTU discover, or when INIT or other control chunks are larger then MTU. Check for the rare fragmentation condition in SCTP and use software checksum calculation in this case. CC: Fan Du <fan.du@windriver.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28sctp: Use software crc32 checksum when xfrm transform will happen.Fan Du
[ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ] igb/ixgbe have hardware sctp checksum support, when this feature is enabled and also IPsec is armed to protect sctp traffic, ugly things happened as xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing up and pack the 16bits result in the checksum field). The result is fail establishment of sctp communication. Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28net: dst: provide accessor function to dst->xfrmVlad Yasevich
[ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ] dst->xfrm is conditionally defined. Provide accessor funtion that is always available. Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28bnx2x: record rx queue for LRO packetsEric Dumazet
[ Upstream commit 60e66fee56b2256dcb1dc2ea1b2ddcb6e273857d ] RPS support is kind of broken on bnx2x, because only non LRO packets get proper rx queue information. This triggers reorders, as it seems bnx2x like to generate a non LRO packet for segment including TCP PUSH flag : (this might be pure coincidence, but all the reorders I've seen involve segments with a PUSH) 11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797> 11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> We must call skb_record_rx_queue() to properly give to RPS (and more generally for TX queue selection on forward path) the receive queue information. Similar fix is needed for skb_mark_napi_id(), but will be handled in a separate patch to ease stable backports. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Eilon Greenstein <eilong@broadcom.com> Acked-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28connector: use nlmsg_len() to check message lengthMathias Krause
[ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ] The current code tests the length of the whole netlink message to be at least as long to fit a cn_msg. This is wrong as nlmsg_len includes the length of the netlink message header. Use nlmsg_len() instead to fix this "off-by-NLMSG_HDRLEN" size check. Cc: stable@vger.kernel.org # v2.6.14+ Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28farsync: fix info leak in ioctlSalva Peiró
[ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ] The fst_get_iface() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28l2tp: must disable bh before calling l2tp_xmit_skb()Eric Dumazet
[ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ] François Cachereul made a very nice bug report and suspected the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from process context was not good. This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165 ("l2tp: Fix locking in l2tp_core.c"). l2tp_eth_dev_xmit() runs from BH context, so we must disable BH from other l2tp_xmit_skb() users. [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662] [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] [ 452.064012] CPU 1 [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643] [ 452.080015] CPU 2 [ 452.080015] [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293 [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000 [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110 [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000 [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286 [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000 [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000 [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0) [ 452.080015] Stack: [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1 [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297 [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002 [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110 [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18 [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0 [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000 [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410) [ 452.064012] Stack: [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62 [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b Reported-by: François Cachereul <f.cachereul@alphalink.fr> Tested-by: François Cachereul <f.cachereul@alphalink.fr> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28net: vlan: fix nlmsg size calculation in vlan_get_size()Marc Kleine-Budde
[ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28ipv6: restrict neighbor entry creation to output flowMarcelo Ricardo Leitner
This patch is based on 3.2.y branch, the one used by reporter. Please let me know if it should be different. Thanks. The patch which introduced the regression was applied on stables: 3.0.64 3.4.31 3.7.8 3.2.39 The patch which introduced the regression was for stable trees only. ---8<--- Commit 0d6a77079c475033cb622c07c5a880b392ef664e "ipv6: do not create neighbor entries for local delivery" introduced a regression on which routes to local delivery would not work anymore. Like this: $ ip -6 route add local 2001::/64 dev lo $ ping6 -c1 2001::9 PING 2001::9(2001::9) 56 data bytes ping: sendmsg: Invalid argument As this is a local delivery, that commit would not allow the creation of a neighbor entry and thus the packet cannot be sent. But as TPROXY scenario actually needs to avoid the neighbor entry creation only for input flow, this patch now limits previous patch to input flow, keeping output as before that patch. Reported-by: Debabrata Banerjee <dbavatar@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> CC: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28can: dev: fix nlmsg size calculation in can_get_size()Marc Kleine-Budde
[ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28ipv4: fix ineffective source address selectionJiri Benc
[ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ] When sending out multicast messages, the source address in inet->mc_addr is ignored and rewritten by an autoselected one. This is caused by a typo in commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output route lookups"). Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28proc connector: fix info leaksMathias Krause
[ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ] Initialize event_data for all possible message types to prevent leaking kernel stack contents to userland (up to 20 bytes). Also set the flags member of the connector message to 0 to prevent leaking two more stack bytes this way. Cc: stable@vger.kernel.org # v2.6.15+ Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28net: heap overflow in __audit_sockaddr()Dan Carpenter
[ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ] We need to cap ->msg_namelen or it leads to a buffer overflow when we to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla <juri.aedla@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28net: do not call sock_put() on TIMEWAIT socketsEric Dumazet
[ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ] commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28tcp: do not forget FIN in tcp_shifted_skb()Eric Dumazet
[ Upstream commit 5e8a402f831dbe7ee831340a91439e46f0d38acd ] Yuchung found following problem : There are bugs in the SACK processing code, merging part in tcp_shift_skb_data(), that incorrectly resets or ignores the sacked skbs FIN flag. When a receiver first SACK the FIN sequence, and later throw away ofo queue (e.g., sack-reneging), the sender will stop retransmitting the FIN flag, and hangs forever. Following packetdrill test can be used to reproduce the bug. $ cat sack-merge-bug.pkt `sysctl -q net.ipv4.tcp_fack=0` // Establish a connection and send 10 MSS. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +.000 bind(3, ..., ...) = 0 +.000 listen(3, 1) = 0 +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.001 < . 1:1(0) ack 1 win 1024 +.000 accept(3, ..., ...) = 4 +.100 write(4, ..., 12000) = 12000 +.000 shutdown(4, SHUT_WR) = 0 +.000 > . 1:10001(10000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 +.000 > FP. 10001:12001(2000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop> +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop> // SACK reneg +.050 < . 1:1(0) ack 12001 win 257 +0 %{ print "unacked: ",tcpi_unacked }% +5 %{ print "" }% First, a typo inverted left/right of one OR operation, then code forgot to advance end_seq if the merged skb carried FIN. Bug was added in 2.6.29 by commit 832d11c5cd076ab ("tcp: Try to restore large SKBs while SACK processing") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-11-28tcp: must unclone packets before mangling themEric Dumazet
[ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ] TCP stack should make sure it owns skbs before mangling them. We had various crashes using bnx2x, and it turned out gso_size was cleared right before bnx2x driver was populating TC descriptor of the _previous_ packet send. TCP stack can sometime retransmit packets that are still in Qdisc. Of course we could make bnx2x driver more robust (using ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack. We have identified two points where skb_unclone() was needed. This patch adds a WARN_ON_ONCE() to warn us if we missed another fix of this kind. Kudos to Neal for finding the root cause of this bug. Its visible using small MSS. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26Linux 3.2.52v3.2.52Ben Hutchings
2013-10-26can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and ↵Marc Kleine-Budde
abort pending TX commit d5a7b406c529e4595ce03dc8f6dcf7fa36f106fa upstream. In patch 0d1862e can: flexcan: fix flexcan_chip_start() on imx6 the loop in flexcan_chip_start() that iterates over all mailboxes after the soft reset of the CAN core was removed. This loop put all mailboxes (even the ones marked as reserved 1...7) into EMPTY/INACTIVE mode. On mailboxes 8...63, this aborts any pending TX messages. After a cold boot there is random garbage in the mailboxes, which leads to spontaneous transmit of CAN frames during first activation. Further if the interface was disabled with a pending message (usually due to an error condition on the CAN bus), this message is retransmitted after enabling the interface again. This patch fixes the regression by: 1) Limiting the maximum number of used mailboxes to 8, 0...7 are used by the RX FIFO, 8 is used by TX. 2) Marking the TX mailbox as EMPTY/INACTIVE, so that any pending TX of that mailbox is aborted. Cc: Lothar Waßmann <LW@KARO-electronics.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> [bwh: Backported to 3.2: - Adjust context - Hardware local echo is still enabled] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26gianfar: Change default HW Tx queue scheduling modeClaudiu Manoil
commit b98b8babd6e3370fadb7c6eaacb00eb2f6344a6c upstream. This is primarily to address transmission timeout occurrences, when multiple H/W Tx queues are being used concurrently. Because in the priority scheduling mode the controller does not service the Tx queues equally (but in ascending index order), Tx timeouts are being triggered rightaway for a basic test with multiple simultaneous connections like: iperf -c <server_ip> -n 100M -P 8 resulting in kernel trace: NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue <X> timed out ------------[ cut here ]------------ WARNING: at net/sched/sch_generic.c:255 ... and controller reset during intense traffic, and possibly further complications. This patch changes the default H/W Tx scheduling setting (TXSCHED) for multi-queue devices, from priority scheduling mode to a weighted round robin mode with equal weights for all H/W Tx queues, and addresses the issue above. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26mm, show_mem: suppress page counts in non-blockable contextsDavid Rientjes
commit 4b59e6c4730978679b414a8da61514a2518da512 upstream. On large systems with a lot of memory, walking all RAM to determine page types may take a half second or even more. In non-blockable contexts, the page allocator will emit a page allocation failure warning unless __GFP_NOWARN is specified. In such contexts, irqs are typically disabled and such a lengthy delay may even result in NMI watchdog timeouts. To fix this, suppress the page walk in such contexts when printing the page allocation failure warning. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler()Lv Zheng
commit 06a8566bcf5cf7db9843a82cde7a33c7bf3947d9 upstream. This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26staging: comedi: ni_65xx: (bug fix) confine insn_bits to one subdeviceIan Abbott
commit 677a31565692d596ef42ea589b53ba289abf4713 upstream. The `insn_bits` handler `ni_65xx_dio_insn_bits()` has a `for` loop that currently writes (optionally) and reads back up to 5 "ports" consisting of 8 channels each. It reads up to 32 1-bit channels but can only read and write a whole port at once - it needs to handle up to 5 ports as the first channel it reads might not be aligned on a port boundary. It breaks out of the loop early if the next port it handles is beyond the final port on the card. It also breaks out early on the 5th port in the loop if the first channel was aligned. Unfortunately, it doesn't check that the current port it is dealing with belongs to the comedi subdevice the `insn_bits` handler is acting on. That's a bug. Redo the `for` loop to terminate after the final port belonging to the subdevice, changing the loop variable in the process to simplify things a bit. The `for` loop could now try and handle more than 5 ports if the subdevice has more than 40 channels, but the test `if (bitshift >= 32)` ensures it will break out early after 4 or 5 ports (depending on whether the first channel is aligned on a port boundary). (`bitshift` will be between -7 and 7 inclusive on the first iteration, increasing by 8 for each subsequent operation.) Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [Ian Abbott: This patch applies to kernels 2.6.34.y through to 3.5.y inclusive.] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26ext4: avoid hang when mounting non-journal filesystems with orphan listTheodore Ts'o
commit 0e9a9a1ad619e7e987815d20262d36a2f95717ca upstream. When trying to mount a file system which does not contain a journal, but which does have a orphan list containing an inode which needs to be truncated, the mount call with hang forever in ext4_orphan_cleanup() because ext4_orphan_del() will return immediately without removing the inode from the orphan list, leading to an uninterruptible loop in kernel code which will busy out one of the CPU's on the system. This can be trivially reproduced by trying to mount the file system found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs source tree. If a malicious user were to put this on a USB stick, and mount it on a Linux desktop which has automatic mounts enabled, this could be considered a potential denial of service attack. (Not a big deal in practice, but professional paranoids worry about such things, and have even been known to allocate CVE numbers for such problems.) -js: This is a fix for CVE-2013-2015. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26hwmon: (applesmc) Silence uninitialized warningsHenrik Rydberg
commit 0fc86eca1b338d06ec500b34ef7def79c32b602b upstream. Some error paths do not set a result, leading to the (false) assumption that the value may be used uninitialized. Set results for those paths as well. Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26xhci: Fix race between ep halt and URB cancellationFlorian Wolter
commit 526867c3ca0caa2e3e846cb993b0f961c33c2abb upstream. The halted state of a endpoint cannot be cleared over CLEAR_HALT from a user process, because the stopped_td variable was overwritten in the handle_stopped_endpoint() function. So the xhci_endpoint_reset() function will refuse the reset and communication with device can not run over this endpoint. https://bugzilla.kernel.org/show_bug.cgi?id=60699 Signed-off-by: Florian Wolter <wolly84@web.de> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26cciss: fix info leak in cciss_ioctl32_passthru()Dan Carpenter
commit 58f09e00ae095e46ef9edfcf3a5fd9ccdfad065e upstream. The arg64 struct has a hole after ->buf_size which isn't cleared. Or if any of the calls to copy_from_user() fail then that would cause an information leak as well. This was assigned CVE-2013-2147. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26cpqarray: fix info leak in ida_locked_ioctl()Dan Carpenter
commit 627aad1c01da6f881e7f98d71fd928ca0c316b1a upstream. The pciinfo struct has a two byte hole after ->dev_fn so stack information could be leaked to the user. This was assigned CVE-2013-2147. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26iscsi: don't hang in endless loop if no targets presentSasha Levin
commit 46a7c17d26967922092f3a8291815ffb20f6cabe upstream. iscsi_if_send_reply() may return -ESRCH if there were no targets to send data to. Currently we're ignoring this value and looping in attempt to do it over and over, which will usually lead in a hung task like this one: [ 4920.817298] INFO: task trinity:9074 blocked for more than 120 seconds. [ 4920.818527] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4920.819982] trinity D 0000000000000000 5504 9074 2756 0x00000004 [ 4920.825374] ffff880003961a98 0000000000000086 ffff8800001aa000 ffff8800001aa000 [ 4920.826791] 00000000001d4340 ffff880003961fd8 ffff880003960000 00000000001d4340 [ 4920.828241] 00000000001d4340 00000000001d4340 ffff880003961fd8 00000000001d4340 [ 4920.833231] [ 4920.833519] Call Trace: [ 4920.834010] [<ffffffff826363fa>] schedule+0x3a/0x50 [ 4920.834953] [<ffffffff82634ac9>] __mutex_lock_common+0x209/0x5b0 [ 4920.836226] [<ffffffff81af805d>] ? iscsi_if_rx+0x2d/0x990 [ 4920.837281] [<ffffffff81053943>] ? sched_clock+0x13/0x20 [ 4920.838305] [<ffffffff81af805d>] ? iscsi_if_rx+0x2d/0x990 [ 4920.839336] [<ffffffff82634eb0>] mutex_lock_nested+0x40/0x50 [ 4920.840423] [<ffffffff81af805d>] iscsi_if_rx+0x2d/0x990 [ 4920.841434] [<ffffffff810dffed>] ? sub_preempt_count+0x9d/0xd0 [ 4920.842548] [<ffffffff82637bb0>] ? _raw_read_unlock+0x30/0x60 [ 4920.843666] [<ffffffff821f71de>] netlink_unicast+0x1ae/0x1f0 [ 4920.844751] [<ffffffff821f7997>] netlink_sendmsg+0x227/0x350 [ 4920.845850] [<ffffffff821857bd>] ? sock_update_netprioidx+0xdd/0x1b0 [ 4920.847060] [<ffffffff82185732>] ? sock_update_netprioidx+0x52/0x1b0 [ 4920.848276] [<ffffffff8217f226>] sock_aio_write+0x166/0x180 [ 4920.849348] [<ffffffff810dfe41>] ? get_parent_ip+0x11/0x50 [ 4920.850428] [<ffffffff811d0d9a>] do_sync_write+0xda/0x120 [ 4920.851465] [<ffffffff810dffed>] ? sub_preempt_count+0x9d/0xd0 [ 4920.852579] [<ffffffff810dfe41>] ? get_parent_ip+0x11/0x50 [ 4920.853608] [<ffffffff81791887>] ? security_file_permission+0x27/0xb0 [ 4920.854821] [<ffffffff811d0f4c>] vfs_write+0x16c/0x180 [ 4920.855781] [<ffffffff811d104f>] sys_write+0x4f/0xa0 [ 4920.856798] [<ffffffff82638e79>] system_call_fastpath+0x16/0x1b [ 4920.877487] 1 lock held by trinity/9074: [ 4920.878239] #0: (rx_queue_mutex){+.+...}, at: [<ffffffff81af805d>] iscsi_if_rx+0x2d/0x990 [ 4920.880005] Kernel panic - not syncing: hung_task: blocked tasks Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-10-26Revert "sctp: fix call to SCTP_CMD_PROCESS_SACK in sctp_cmd_interpreter()"Ben Hutchings
This reverts commit de77b7955c3985ca95f64af3cb10557eb17eacee, which was commit f6e80abeab928b7c47cc1fbf53df13b4398a2bec upstream. This fix was only appropriate for Linux 3.7 onward, and introduced a regression when applied to earlier versions. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>