linux-toradex.git, branch v2.6.27.48

Linux 2.6.27.48

2010-07-05T18:09:05+00:00

sctp: fix append error cause to ERROR chunk correctly

2010-07-05T18:08:48+00:00

commit 2e3219b5c8a2e44e0b83ae6e04f52f20a82ac0f2 upstream.

commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809
  sctp: Fix skb_over_panic resulting from multiple invalid \
    parameter errors (CVE-2010-1173) (v4)

cause 'error cause' never be add the the ERROR chunk due to
some typo when check valid length in sctp_init_cause_fixed().

Signed-off-by: Wei Yongjun 
Reviewed-by: Neil Horman 
Acked-by: Vlad Yasevich 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

KEYS: find_keyring_by_name() can gain access to a freed keyring

2010-07-05T18:08:48+00:00

commit cea7daa3589d6b550546a8c8963599f7c1a3ae5c upstream.

find_keyring_by_name() can gain access to a keyring that has had its reference
count reduced to zero, and is thus ready to be freed.  This then allows the
dead keyring to be brought back into use whilst it is being destroyed.

The following timeline illustrates the process:

|(cleaner)                           (user)
|
| free_user(user)                    sys_keyctl()
|  |                                  |
|  key_put(user->session_keyring)     keyctl_get_keyring_ID()
|  ||	//=> keyring->usage = 0        |
|  |schedule_work(&key_cleanup_task)   lookup_user_key()
|  ||                                   |
|  kmem_cache_free(,user)               |
|  .                                    |[KEY_SPEC_USER_KEYRING]
|  .                                    install_user_keyrings()
|  .                                    ||
| key_cleanup() [<= worker_thread()]    ||
|  |                                    ||
|  [spin_lock(&key_serial_lock)]        |[mutex_lock(&key_user_keyr..mutex)]
|  |                                    ||
|  atomic_read() == 0                   ||
|  |{ rb_ease(&key->serial_node,) }     ||
|  |                                    ||
|  [spin_unlock(&key_serial_lock)]      |find_keyring_by_name()
|  |                                    |||
|  keyring_destroy(keyring)             ||[read_lock(&keyring_name_lock)]
|  ||                                   |||
|  |[write_lock(&keyring_name_lock)]    ||atomic_inc(&keyring->usage)
|  |.                                   ||| *** GET freeing keyring ***
|  |.                                   ||[read_unlock(&keyring_name_lock)]
|  ||                                   ||
|  |list_del()                          |[mutex_unlock(&key_user_k..mutex)]
|  ||                                   |
|  |[write_unlock(&keyring_name_lock)]  ** INVALID keyring is returned **
|  |                                    .
|  kmem_cache_free(,keyring)            .
|                                       .
|                                       atomic_dec(&keyring->usage)
v                                         *** DESTROYED ***
TIME

If CONFIG_SLUB_DEBUG=y then we may see the following message generated:

	=============================================================================
	BUG key_jar: Poison overwritten
	-----------------------------------------------------------------------------

	INFO: 0xffff880197a7e200-0xffff880197a7e200. First byte 0x6a instead of 0x6b
	INFO: Allocated in key_alloc+0x10b/0x35f age=25 cpu=1 pid=5086
	INFO: Freed in key_cleanup+0xd0/0xd5 age=12 cpu=1 pid=10
	INFO: Slab 0xffffea000592cb90 objects=16 used=2 fp=0xffff880197a7e200 flags=0x200000000000c3
	INFO: Object 0xffff880197a7e200 @offset=512 fp=0xffff880197a7e300

	Bytes b4 0xffff880197a7e1f0:  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
	  Object 0xffff880197a7e200:  6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk

Alternatively, we may see a system panic happen, such as:

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
	IP: [] kmem_cache_alloc+0x5b/0xe9
	PGD 6b2b4067 PUD 6a80d067 PMD 0
	Oops: 0000 [#1] SMP
	last sysfs file: /sys/kernel/kexec_crash_loaded
	CPU 1
	...
	Pid: 31245, comm: su Not tainted 2.6.34-rc5-nofixed-nodebug #2 D2089/PRIMERGY
	RIP: 0010:[]  [] kmem_cache_alloc+0x5b/0xe9
	RSP: 0018:ffff88006af3bd98  EFLAGS: 00010002
	RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88007d19900b
	RDX: 0000000100000000 RSI: 00000000000080d0 RDI: ffffffff81828430
	RBP: ffffffff81828430 R08: ffff88000a293750 R09: 0000000000000000
	R10: 0000000000000001 R11: 0000000000100000 R12: 00000000000080d0
	R13: 00000000000080d0 R14: 0000000000000296 R15: ffffffff810f20ce
	FS:  00007f97116bc700(0000) GS:ffff88000a280000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 0000000000000001 CR3: 000000006a91c000 CR4: 00000000000006e0
	DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
	Process su (pid: 31245, threadinfo ffff88006af3a000, task ffff8800374414c0)
	Stack:
	 0000000512e0958e 0000000000008000 ffff880037f8d180 0000000000000001
	 0000000000000000 0000000000008001 ffff88007d199000 ffffffff810f20ce
	 0000000000008000 ffff88006af3be48 0000000000000024 ffffffff810face3
	Call Trace:
	 [] ? get_empty_filp+0x70/0x12f
	 [] ? do_filp_open+0x145/0x590
	 [] ? tlb_finish_mmu+0x2a/0x33
	 [] ? unmap_region+0xd3/0xe2
	 [] ? virt_to_head_page+0x9/0x2d
	 [] ? alloc_fd+0x69/0x10e
	 [] ? do_sys_open+0x56/0xfc
	 [] ? system_call_fastpath+0x16/0x1b
	Code: 0f 1f 44 00 00 49 89 c6 fa 66 0f 1f 44 00 00 65 4c 8b 04 25 60 e8 00 00 48 8b 45 00 49 01 c0 49 8b 18 48 85 db 74 0d 48 63 45 18 <48> 8b 04 03 49 89 00 eb 14 4c 89 f9 83 ca ff 44 89 e6 48 89 ef
	RIP  [] kmem_cache_alloc+0x5b/0xe9

This problem is that find_keyring_by_name does not confirm that the keyring is
valid before accepting it.

Skipping keyrings that have been reduced to a zero count seems the way to go.
To this end, use atomic_inc_not_zero() to increment the usage count and skip
the candidate keyring if that returns false.

The following script _may_ cause the bug to happen, but there's no guarantee
as the window of opportunity is small:

	#!/bin/sh
	LOOP=100000
	USER=dummy_user
	/bin/su -c "exit;" $USER || { /usr/sbin/adduser -m $USER; add=1; }
	for ((i=0; i /dev/null" $USER
	done
	(( add == 1 )) && /usr/sbin/userdel -r $USER
	exit

Note that the nominated user must not be in use.

An alternative way of testing this may be:

	for ((i=0; i<100000; i++))
	do
		keyctl session foo /bin/true || break
	done >&/dev/null

as that uses a keyring named "foo" rather than relying on the user and
user-session named keyrings.

Reported-by: Toshiyuki Okajima 
Signed-off-by: David Howells 
Tested-by: Toshiyuki Okajima 
Acked-by: Serge Hallyn 
Signed-off-by: James Morris 
Cc: Ben Hutchings 
Cc: Chuck Ebbert 
Signed-off-by: Greg Kroah-Hartman

KEYS: Return more accurate error codes

2010-07-05T18:08:47+00:00

commit 4d09ec0f705cf88a12add029c058b53f288cfaa2 upstream.

We were using the wrong variable here so the error codes weren't being returned
properly.  The original code returns -ENOKEY.

Signed-off-by: Dan Carpenter 
Signed-off-by: David Howells 
Signed-off-by: James Morris 
Signed-off-by: Greg Kroah-Hartman

parisc: clear floating point exception flag on SIGFPE signal

2010-07-05T18:08:47+00:00

commit 550f0d922286556c7ea43974bb7921effb5a5278 upstream.

Clear the floating point exception flag before returning to
user space. This is needed, else the libc trampoline handler
may hit the same SIGFPE again while building up a trampoline
to a signal handler.

Fixes debian bug #559406.

Signed-off-by: Helge Deller 
Signed-off-by: Kyle McMartin 
Signed-off-by: Greg Kroah-Hartman

tipc: Fix oops on send prior to entering networked mode (v3)

2010-07-05T18:08:47+00:00

commit d0021b252eaf65ca07ed14f0d66425dd9ccab9a6 upstream.

Fix TIPC to disallow sending to remote addresses prior to entering NET_MODE

user programs can oops the kernel by sending datagrams via AF_TIPC prior to
entering networked mode.  The following backtrace has been observed:

ID: 13459  TASK: ffff810014640040  CPU: 0   COMMAND: "tipc-client"
[exception RIP: tipc_node_select_next_hop+90]
RIP: ffffffff8869d3c3  RSP: ffff81002d9a5ab8  RFLAGS: 00010202
RAX: 0000000000000001  RBX: 0000000000000001  RCX: 0000000000000001
RDX: 0000000000000000  RSI: 0000000000000001  RDI: 0000000001001001
RBP: 0000000001001001   R8: 0074736575716552   R9: 0000000000000000
R10: ffff81003fbd0680  R11: 00000000000000c8  R12: 0000000000000008
R13: 0000000000000001  R14: 0000000000000001  R15: ffff810015c6ca00
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
RIP: 0000003cbd8d49a3  RSP: 00007fffc84e0be8  RFLAGS: 00010206
RAX: 000000000000002c  RBX: ffffffff8005d116  RCX: 0000000000000000
RDX: 0000000000000008  RSI: 00007fffc84e0c00  RDI: 0000000000000003
RBP: 0000000000000000   R8: 00007fffc84e0c10   R9: 0000000000000010
R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000000
R13: 00007fffc84e0d10  R14: 0000000000000000  R15: 00007fffc84e0c30
ORIG_RAX: 000000000000002c  CS: 0033  SS: 002b

What happens is that, when the tipc module in inserted it enters a standalone
node mode in which communication to its own address is allowed <0.0.0> but not
to other addresses, since the appropriate data structures have not been
allocated yet (specifically the tipc_net pointer).  There is nothing stopping a
client from trying to send such a message however, and if that happens, we
attempt to dereference tipc_net.zones while the pointer is still NULL, and
explode.  The fix is pretty straightforward.  Since these oopses all arise from
the dereference of global pointers prior to their assignment to allocated
values, and since these allocations are small (about 2k total), lets convert
these pointers to static arrays of the appropriate size.  All the accesses to
these bits consider 0/NULL to be a non match when searching, so all the lookups
still work properly, and there is no longer a chance of a bad dererence
anywhere.  As a bonus, this lets us eliminate the setup/teardown routines for
those pointers, and elimnates the need to preform any locking around them to
prevent access while their being allocated/freed.

I've updated the tipc_net structure to behave this way to fix the exact reported
problem, and also fixed up the tipc_bearers and media_list arrays to fix an
obvious simmilar problem that arises from issuing tipc-config commands to
manipulate bearers/links prior to entering networked mode

I've tested this for a few hours by running the sanity tests and stress test
with the tipcutils suite, and nothing has fallen over.  There have been a few
lockdep warnings, but those were there before, and can be addressed later, as
they didn't actually result in any deadlock.

Signed-off-by: Neil Horman 
CC: Allan Stephens 
CC: David S. Miller 
CC: tipc-discussion@lists.sourceforge.net
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

vfs: add NOFOLLOW flag to umount(2)

2010-07-05T18:08:46+00:00

commit db1f05bb85d7966b9176e293f3ceead1cb8b5d79 upstream.

Add a new UMOUNT_NOFOLLOW flag to umount(2).  This is needed to prevent
symlink attacks in unprivileged unmounts (fuse, samba, ncpfs).

Additionally, return -EINVAL if an unknown flag is used (and specify
an explicitly unused flag: UMOUNT_UNUSED).  This makes it possible for
the caller to determine if a flag is supported or not.

CC: Eugene Teo 
CC: Michael Kerrisk 
Signed-off-by: Miklos Szeredi 
Signed-off-by: Al Viro 
Signed-off-by: Greg Kroah-Hartman

sctp: Fix skb_over_panic resulting from multiple invalid parameter errors (CVE-2010-1173) (v4)

2010-07-05T18:08:46+00:00

commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809 upstream.

Ok, version 4

Change Notes:
1) Minor cleanups, from Vlads notes

Summary:

Hey-
	Recently, it was reported to me that the kernel could oops in the
following way:

<5> kernel BUG at net/core/skbuff.c:91!
<5> invalid operand: 0000 [#1]
<5> Modules linked in: sctp netconsole nls_utf8 autofs4 sunrpc iptable_filter
ip_tables cpufreq_powersave parport_pc lp parport vmblock(U) vsock(U) vmci(U)
vmxnet(U) vmmemctl(U) vmhgfs(U) acpiphp dm_mirror dm_mod button battery ac md5
ipv6 uhci_hcd ehci_hcd snd_ens1371 snd_rawmidi snd_seq_device snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_ac97_codec snd soundcore
pcnet32 mii floppy ext3 jbd ata_piix libata mptscsih mptsas mptspi mptscsi
mptbase sd_mod scsi_mod
<5> CPU:    0
<5> EIP:    0060:[]    Not tainted VLI
<5> EFLAGS: 00010216   (2.6.9-89.0.25.EL)
<5> EIP is at skb_over_panic+0x1f/0x2d
<5> eax: 0000002c   ebx: c033f461   ecx: c0357d96   edx: c040fd44
<5> esi: c033f461   edi: df653280   ebp: 00000000   esp: c040fd40
<5> ds: 007b   es: 007b   ss: 0068
<5> Process swapper (pid: 0, threadinfo=c040f000 task=c0370be0)
<5> Stack: c0357d96 e0c29478 00000084 00000004 c033f461 df653280 d7883180
e0c2947d
<5>        00000000 00000080 df653490 00000004 de4f1ac0 de4f1ac0 00000004
df653490
<5>        00000001 e0c2877a 08000800 de4f1ac0 df653490 00000000 e0c29d2e
00000004
<5> Call Trace:
<5>  [] sctp_addto_chunk+0xb0/0x128 [sctp]
<5>  [] sctp_addto_chunk+0xb5/0x128 [sctp]
<5>  [] sctp_init_cause+0x3f/0x47 [sctp]
<5>  [] sctp_process_unk_param+0xac/0xb8 [sctp]
<5>  [] sctp_verify_init+0xcc/0x134 [sctp]
<5>  [] sctp_sf_do_5_1B_init+0x83/0x28e [sctp]
<5>  [] sctp_do_sm+0x41/0x77 [sctp]
<5>  [] cache_grow+0x140/0x233
<5>  [] sctp_endpoint_bh_rcv+0xc5/0x108 [sctp]
<5>  [] sctp_inq_push+0xe/0x10 [sctp]
<5>  [] sctp_rcv+0x454/0x509 [sctp]
<5>  [] ipt_hook+0x17/0x1c [iptable_filter]
<5>  [] nf_iterate+0x40/0x81
<5>  [] ip_local_deliver_finish+0x0/0x151
<5>  [] ip_local_deliver_finish+0xc6/0x151
<5>  [] nf_hook_slow+0x83/0xb5
<5>  [] ip_local_deliver+0x1a2/0x1a9
<5>  [] ip_local_deliver_finish+0x0/0x151
<5>  [] ip_rcv+0x334/0x3b4
<5>  [] netif_receive_skb+0x320/0x35b
<5>  [] init_stall_timer+0x67/0x6a [uhci_hcd]
<5>  [] process_backlog+0x6c/0xd9
<5>  [] net_rx_action+0xfe/0x1f8
<5>  [] __do_softirq+0x35/0x79
<5>  [] handle_IRQ_event+0x0/0x4f
<5>  [] do_softirq+0x46/0x4d

Its an skb_over_panic BUG halt that results from processing an init chunk in
which too many of its variable length parameters are in some way malformed.

The problem is in sctp_process_unk_param:
if (NULL == *errp)
	*errp = sctp_make_op_error_space(asoc, chunk,
					 ntohs(chunk->chunk_hdr->length));

	if (*errp) {
		sctp_init_cause(*errp, SCTP_ERROR_UNKNOWN_PARAM,
				 WORD_ROUND(ntohs(param.p->length)));
		sctp_addto_chunk(*errp,
			WORD_ROUND(ntohs(param.p->length)),
				  param.v);

When we allocate an error chunk, we assume that the worst case scenario requires
that we have chunk_hdr->length data allocated, which would be correct nominally,
given that we call sctp_addto_chunk for the violating parameter.  Unfortunately,
we also, in sctp_init_cause insert a sctp_errhdr_t structure into the error
chunk, so the worst case situation in which all parameters are in violation
requires chunk_hdr->length+(sizeof(sctp_errhdr_t)*param_count) bytes of data.

The result of this error is that a deliberately malformed packet sent to a
listening host can cause a remote DOS, described in CVE-2010-1173:
http://cve.mitre.org/cgi-bin/cvename.cgi?name=2010-1173

I've tested the below fix and confirmed that it fixes the issue.  We move to a
strategy whereby we allocate a fixed size error chunk and ignore errors we don't
have space to report.  Tested by me successfully

Signed-off-by: Neil Horman 
Acked-by: Vlad Yasevich 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages

2010-07-05T18:08:46+00:00

commit 2acf2c261b823d9d9ed954f348b97620297a36b5 upstream.

With delayed allocation we lock the page in write_cache_pages() and
try to build an in memory extent of contiguous blocks.  This is needed
so that we can get large contiguous blocks request.  If range_cyclic
mode is enabled, write_cache_pages() will loop back to the 0 index if
no I/O has been done yet, and try to start writing from the beginning
of the range.  That causes an attempt to take the page lock of lower
index page while holding the page lock of higher index page, which can
cause a dead lock with another writeback thread.

The solution is to implement the range_cyclic behavior in
ext4_da_writepages() instead.

http://bugzilla.kernel.org/show_bug.cgi?id=12579

Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: "Theodore Ts'o" 
Signed-off-by: Jayson R. King 
Signed-off-by: Greg Kroah-Hartman

ext4: Fix file fragmentation during large file write.

2010-07-05T18:08:45+00:00

commit 22208dedbd7626e5fc4339c417f8d24cc21f79d7 upstream.

The range_cyclic writeback mode uses the address_space writeback_index
as the start index for writeback.  With delayed allocation we were
updating writeback_index wrongly resulting in highly fragmented file.
This patch reduces the number of extents reduced from 4000 to 27 for a
3GB file.

Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: Theodore Ts'o 
[dev@jaysonking.com: Some changed lines from the original version of this patch were dropped, since they were rolled up with another cherry-picked patch applied to 2.6.27.y earlier.]
[dev@jaysonking.com: Use of wbc->no_nrwrite_index_update was dropped, since write_cache_pages_da() implies it.]
Signed-off-by: Jayson R. King 
Signed-off-by: Greg Kroah-Hartman