linux-toradex.git/fs, branch v2.6.35.4

Fix init ordering of /dev/console vs callers of modprobe

2010-08-26T23:46:04+00:00

commit 31d1d48e199e99077fb30f6fb9a793be7bec756f upstream.

Make /dev/console get initialised before any initialisation routine that
invokes modprobe because if modprobe fails, it's going to want to open
/dev/console, presumably to write an error message to.

The problem with that is that if the /dev/console driver is not yet
initialised, the chardev handler will call request_module() to invoke
modprobe, which will fail, because we never compile /dev/console as a
module.

This will lead to a modprobe loop, showing the following in the kernel
log:

	request_module: runaway loop modprobe char-major-5-1
	request_module: runaway loop modprobe char-major-5-1
	request_module: runaway loop modprobe char-major-5-1
	request_module: runaway loop modprobe char-major-5-1
	request_module: runaway loop modprobe char-major-5-1

This can happen, for example, when the built-in md5 module can't find
the built-in cryptomgr module (because the latter fails to initialise).
The md5 module comes before the call to tty_init(), presumably because
'crypto' comes before 'drivers' alphabetically.

Fix this by calling tty_init() from chrdev_init().
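
The ordering problem can be modelled with a toy userspace sketch. This is
not the kernel code: toy_tty_init(), toy_crypto_init() and toy_boot() are
hypothetical stand-ins, and the "failure" is just a counter of attempts to
open a console that is not yet initialised.

```c
#include <stdbool.h>

/* Toy model only: true once the console driver has been set up. */
static bool console_ready;

/* Hypothetical stand-in for tty_init(). */
void toy_tty_init(void)
{
	console_ready = true;
}

/*
 * Hypothetical stand-in for an initcall whose modprobe invocation fails
 * and then tries to open /dev/console. Returns the number of failed
 * console opens (0 or 1 in this model).
 */
int toy_crypto_init(void)
{
	return console_ready ? 0 : 1;
}

/* Run the two init steps in the chosen order and report failures. */
int toy_boot(bool console_first)
{
	int failures;

	console_ready = false;
	if (console_first) {
		toy_tty_init();
		failures = toy_crypto_init();
	} else {
		failures = toy_crypto_init();
		toy_tty_init();
	}
	return failures;
}
```

In this model, toy_boot(false) corresponds to the old ordering and
toy_boot(true) to the ordering after the fix.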

Signed-off-by: David Howells 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

NFS: Fix an Oops in the NFSv4 atomic open code

2010-08-26T23:45:53+00:00

commit 0a377cff9428af2da2b293d11e07bc4dbf064ee5 upstream.

Adam Lackorzynski reports:

with 2.6.35.2 I'm getting this reproducible Oops:

[  110.825396] BUG: unable to handle kernel NULL pointer dereference at
(null)
[  110.828638] IP: [] encode_attrs+0x1a/0x2a4
[  110.828638] PGD be89f067 PUD bf18f067 PMD 0
[  110.828638] Oops: 0000 [#1] SMP
[  110.828638] last sysfs file: /sys/class/net/lo/operstate
[  110.828638] CPU 2
[  110.828638] Modules linked in: rtc_cmos rtc_core rtc_lib amd64_edac_mod
i2c_amd756 edac_core i2c_core dm_mirror dm_region_hash dm_log dm_snapshot
sg sr_mod usb_storage ohci_hcd mptspi tg3 mptscsih mptbase usbcore nls_base
[last unloaded: scsi_wait_scan]
[  110.828638]
[  110.828638] Pid: 11264, comm: setchecksum Not tainted 2.6.35.2 #1
[  110.828638] RIP: 0010:[]  []
encode_attrs+0x1a/0x2a4
[  110.828638] RSP: 0000:ffff88003bf5b878  EFLAGS: 00010296
[  110.828638] RAX: ffff8800bddb48a8 RBX: ffff88003bf5bb18 RCX:
0000000000000000
[  110.828638] RDX: ffff8800be258800 RSI: 0000000000000000 RDI:
ffff88003bf5b9f8
[  110.828638] RBP: 0000000000000000 R08: ffff8800bddb48a8 R09:
0000000000000004
[  110.828638] R10: 0000000000000003 R11: ffff8800be779000 R12:
ffff8800be258800
[  110.828638] R13: ffff88003bf5b9f8 R14: ffff88003bf5bb20 R15:
ffff8800be258800
[  110.828638] FS:  0000000000000000(0000) GS:ffff880041e00000(0063)
knlGS:00000000556bd6b0
[  110.828638] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  110.828638] CR2: 0000000000000000 CR3: 00000000be8ef000 CR4:
00000000000006e0
[  110.828638] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  110.828638] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[  110.828638] Process setchecksum (pid: 11264, threadinfo
ffff88003bf5a000, task ffff88003f232210)
[  110.828638] Stack:
[  110.828638]  0000000000000000 ffff8800bfbcf920 0000000000000000
0000000000000ffe
[  110.828638] <0> 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[  110.828638] <0> 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[  110.828638] Call Trace:
[  110.828638]  [] ? nfs4_xdr_enc_setattr+0x90/0xb4
[  110.828638]  [] ? call_transmit+0x1c3/0x24a
[  110.828638]  [] ? __rpc_execute+0x78/0x22a
[  110.828638]  [] ? rpc_run_task+0x21/0x2b
[  110.828638]  [] ? rpc_call_sync+0x3d/0x5d
[  110.828638]  [] ? _nfs4_do_setattr+0x11b/0x147
[  110.828638]  [] ? nfs_init_locked+0x0/0x32
[  110.828638]  [] ? ifind+0x4e/0x90
[  110.828638]  [] ? nfs4_do_setattr+0x4b/0x6e
[  110.828638]  [] ? nfs4_do_open+0x291/0x3a6
[  110.828638]  [] ? nfs4_open_revalidate+0x63/0x14a
[  110.828638]  [] ? nfs_open_revalidate+0xd7/0x161
[  110.828638]  [] ? do_lookup+0x1a4/0x201
[  110.828638]  [] ? link_path_walk+0x6a/0x9d5
[  110.828638]  [] ? do_last+0x17b/0x58e
[  110.828638]  [] ? do_filp_open+0x1bd/0x56e
[  110.828638]  [] ? _atomic_dec_and_lock+0x30/0x48
[  110.828638]  [] ? dput+0x37/0x152
[  110.828638]  [] ? alloc_fd+0x69/0x10a
[  110.828638]  [] ? do_sys_open+0x56/0x100
[  110.828638]  [] ? ia32_sysret+0x0/0x5
[  110.828638] Code: 83 f1 01 e8 f5 ca ff ff 48 83 c4 50 5b 5d 41 5c c3 41
57 41 56 41 55 49 89 fd 41 54 49 89 d4 55 48 89 f5 53 48 81 ec 18 01 00 00
<8b> 06 89 c2 83 e2 08 83 fa 01 19 db 83 e3 f8 83 c3 18 a8 01 8d
[  110.828638] RIP  [] encode_attrs+0x1a/0x2a4
[  110.828638]  RSP 
[  110.828638] CR2: 0000000000000000
[  112.840396] ---[ end trace 95282e83fd77358f ]---

We need to ensure that the O_EXCL flag is turned off if the user doesn't
set O_CREAT.
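
A minimal sketch of that flag handling, assuming a hypothetical helper
named sanitize_open_flags() (the real change lives in the NFSv4 open path
and is not reproduced here):

```c
#include <fcntl.h>

/*
 * Hypothetical helper, not the actual NFS patch: O_EXCL is only
 * meaningful together with O_CREAT, so drop it when O_CREAT is absent.
 */
int sanitize_open_flags(int flags)
{
	if (!(flags & O_CREAT))
		flags &= ~O_EXCL;
	return flags;
}
```

With this, a plain open that happens to carry O_EXCL is treated like any
other non-creating open, while O_CREAT | O_EXCL is passed through
unchanged.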

Signed-off-by: Trond Myklebust 
Signed-off-by: Greg Kroah-Hartman

nfs: Add "lookupcache" to displayed mount options

2010-08-26T23:45:53+00:00

commit 9b00c64318cc337846a7a08a5678f5f19aeff188 upstream.

Running "cat /proc/mounts" fails to display the "lookupcache" option.
This oversight cost me a bunch of wasted time recently.

The following simple patch fixes it.

Signed-off-by: Patrick LoPresti 
Signed-off-by: Trond Myklebust 
Signed-off-by: Greg Kroah-Hartman

Fix the nested PR lock calling issue in ACL

2010-08-26T23:45:50+00:00

commit 845b6cf34150100deb5f58c8a37a372b111f2918 upstream.

Hi,

Thanks a lot for all the review and comments so far;) I'd like to send
the improved (V4) version of this patch.

This patch fixes a deadlock in OCFS2 ACL. We found this bug in an OCFS2
and Samba integration scenario; the symptom is that several smbd
processes hang under heavy workload. We eventually found that it
is the nested PR lock calling that leads to this deadlock:

 node1        node2
              gr PR
                |
                V
 PR(EX)---> BAST:OCFS2_LOCK_BLOCKED
                |
                V
              rq PR
                |
                V
              wait=1

After requesting the 2nd PR lock, the process "smbd" went into D
state. It can only be woken up when the 1st PR lock's RO holder count
drops to zero. There should be an ocfs2_inode_unlock in the calling path
later on, which can decrement the RO holder count. But since the process
is in uninterruptible sleep, the unlock function never gets called.

The related stack trace is:
smbd          D ffff8800013d0600     0  9522   5608 0x00000000
 ffff88002ca7fb18 0000000000000282 ffff88002f964500 ffff88002ca7fa98
 ffff8800013d0600 ffff88002ca7fae0 ffff88002f964340 ffff88002f964340
 ffff88002ca7ffd8 ffff88002ca7ffd8 ffff88002f964340 ffff88002f964340
Call Trace:
[] schedule_timeout+0x175/0x210
[] wait_for_common+0xf0/0x210
[] __ocfs2_cluster_lock+0x3b9/0xa90 [ocfs2]
[] ocfs2_inode_lock_full_nested+0x255/0xdb0 [ocfs2]
[] ocfs2_get_acl+0x69/0x120 [ocfs2]
[] ocfs2_check_acl+0x28/0x80 [ocfs2]
[] acl_permission_check+0x57/0xb0
[] generic_permission+0x1d/0xc0
[] ocfs2_permission+0x10a/0x1d0 [ocfs2]
[] inode_permission+0x45/0x100
[] sys_chdir+0x53/0x90
[] system_call_fastpath+0x16/0x1b
[<00007f34a4ef6927>] 0x7f34a4ef6927

For details, please see:
https://bugzilla.novell.com/show_bug.cgi?id=614332 and
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1278

Signed-off-by: Jiaju Zhang 
Acked-by: Mark Fasheh 
Signed-off-by: Joel Becker 
Signed-off-by: Greg Kroah-Hartman

nilfs2: fix list corruption after ifile creation failure

2010-08-26T23:45:47+00:00

commit af4e36318edb848fcc0a8d5f75000ca00cdc7595 upstream.

If nilfs_attach_checkpoint() gets a memory allocation failure during
creation of ifile, it will return without removing nilfs_sb_info
struct from ns_supers list.  When a concurrently mounted snapshot is
unmounted or another new snapshot is mounted after that, this causes
kernel oops as below:

> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [] nilfs_find_sbinfo+0x74/0xa4 [nilfs2]
> *pde = 00000000
> Oops: 0000 [#1] SMP

> Call Trace:
>  [] ? nilfs_get_sb+0x165/0x532 [nilfs2]
>  [] ? ida_get_new_above+0x16d/0x187
>  [] ? alloc_vfsmnt+0x7e/0x10a
>  [] ? kstrdup+0x2c/0x40
>  [] ? vfs_kern_mount+0x96/0x14e
>  [] ? do_kern_mount+0x32/0xbd
>  [] ? do_mount+0x642/0x6a1
>  [] ? do_page_fault+0x0/0x2d1
>  [] ? copy_mount_options+0x80/0xe2
>  [] ? strndup_user+0x48/0x67
>  [] ? sys_mount+0x61/0x90
>  [] ? sysenter_do_call+0x12/0x22

This fixes the problem.
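
The shape of the fix can be illustrated with a toy userspace list. This is
only a sketch: toy_sbinfo, toy_supers and toy_attach_checkpoint() are
hypothetical stand-ins for nilfs_sb_info, the ns_supers list and
nilfs_attach_checkpoint(), and the allocation failure is simulated by a
flag.

```c
#include <stddef.h>

struct toy_sbinfo {
	struct toy_sbinfo *prev, *next;
};

/* Circular list head standing in for the ns_supers list. */
struct toy_sbinfo toy_supers = { &toy_supers, &toy_supers };

static void toy_list_add(struct toy_sbinfo *sbi)
{
	sbi->next = toy_supers.next;
	sbi->prev = &toy_supers;
	toy_supers.next->prev = sbi;
	toy_supers.next = sbi;
}

static void toy_list_del(struct toy_sbinfo *sbi)
{
	sbi->prev->next = sbi->next;
	sbi->next->prev = sbi->prev;
	sbi->prev = sbi->next = NULL;
}

/* Count the entries currently on the list. */
int toy_list_len(void)
{
	int n = 0;
	struct toy_sbinfo *p;

	for (p = toy_supers.next; p != &toy_supers; p = p->next)
		n++;
	return n;
}

/* Returns 0 on success, -1 on a simulated allocation failure. */
int toy_attach_checkpoint(struct toy_sbinfo *sbi, int fail_alloc)
{
	toy_list_add(sbi);
	if (fail_alloc) {
		/* Error path: undo the list insertion before returning. */
		toy_list_del(sbi);
		return -1;
	}
	return 0;
}
```

Without the toy_list_del() call on the error path, a stale entry would
stay on the list and be walked by later lookups.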

Signed-off-by: Ryusuke Konishi 
Tested-by: Ryusuke Konishi 
Signed-off-by: Greg Kroah-Hartman

ocfs2/dlm: remove potential deadlock -V3

2010-08-26T23:45:47+00:00

commit b11f1f1ab73fd358b1b734a9427744802202ba68 upstream.

When we need to take both dlm_domain_lock and dlm->spinlock, we should take
them in order of: dlm_domain_lock then dlm->spinlock.

There are paths that disobey this order. One is calling dlm_lockres_put() with
dlm->spinlock held in dlm_run_purge_list: dlm_lockres_put() calls dlm_put() when
the last ref is dropped, and dlm_put() takes dlm_domain_lock.

Fix:
Don't grab/put the dlm when initialising/releasing a lockres.
That grab is not required because we don't call dlm_unregister_domain()
based on refcount.
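
The ordering rule can be expressed as a small rank check, loosely in the
spirit of a lock-order validator. This is a toy sketch, not o2dlm code:
the enum values and the toy_lock_acquire()/toy_lock_release() helpers are
hypothetical, and no real locking takes place.

```c
#include <stdbool.h>

/* Lower value = must be taken first. */
enum toy_lock {
	TOY_DOMAIN_LOCK = 0,	/* stands in for dlm_domain_lock */
	TOY_DLM_SPINLOCK = 1,	/* stands in for dlm->spinlock */
	TOY_NLOCKS
};

static bool toy_held[TOY_NLOCKS];

/*
 * Record that 'lock' is being taken. Returns false, without taking it,
 * if a lock that must come later in the order is already held.
 */
bool toy_lock_acquire(enum toy_lock lock)
{
	int i;

	for (i = (int)lock + 1; i < TOY_NLOCKS; i++)
		if (toy_held[i])
			return false;
	toy_held[lock] = true;
	return true;
}

void toy_lock_release(enum toy_lock lock)
{
	toy_held[lock] = false;
}
```

Taking TOY_DOMAIN_LOCK and then TOY_DLM_SPINLOCK passes the check; trying
to take TOY_DOMAIN_LOCK while TOY_DLM_SPINLOCK is held is rejected, which
is the pattern the patch removes.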

Signed-off-by: Wengang Wang 
Signed-off-by: Joel Becker 
Signed-off-by: Greg Kroah-Hartman

ocfs2/dlm: avoid incorrect bit set in refmap on recovery master

2010-08-26T23:45:47+00:00

commit a524812b7eaa7783d7811198921100f079034e61 upstream.

In the following situation, there remains an incorrect bit in refmap on the
recovery master. Finally the recovery master will fail at purging the lockres
due to the incorrect bit in refmap.

1) node A has no interest in lockres A any longer, so it is purging it.
2) the owner of lockres A is node B, so node A is sending a de-ref message
to node B.
3) at this time, node B crashes. Node C becomes the recovery master. It recovers
lockres A (because the master is the dead node B).
4) node A migrates lockres A to node C with a refbit there.
5) node A fails to send the de-ref message to node B because it crashed. The failure
is ignored. No other action is taken for lockres A any more.

Normally, re-sending the deref message to the recovery master would fix it. However,
ignoring the failure of the deref to the original master and not recovering the lockres
to the recovery master has the same effect, and the latter is simpler.

Signed-off-by: Wengang Wang 
Acked-by: Srinivas Eeda 
Signed-off-by: Joel Becker 
Signed-off-by: Greg Kroah-Hartman

ocfs2: Count more refcount records in file system fragmentation.

2010-08-26T23:45:46+00:00

commit 8a2e70c40ff58f82dde67770e6623ca45f0cb0c8 upstream.

The refcount record calculation in ocfs2_calc_refcount_meta_credits
is too optimistic: it assumes that we can always allocate contiguous
clusters and handle an already existing refcount rec as a whole. In fact,
because of file system fragmentation, we may have to split
a refcount record into 3 parts during the transaction. So consider
the worst case in the record calculation.
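
To see where the factor of 3 comes from, here is a toy sketch that splits
one record around a sub-range. It is not the ocfs2 code: toy_rec and
toy_split_rec() are hypothetical, and the sketch assumes the sub-range is
non-empty and lies entirely inside the record.

```c
struct toy_rec {
	unsigned long start;
	unsigned long len;
};

/*
 * Split 'rec' around the sub-range [start, start + len) and store the
 * resulting pieces in out[]. Returns the number of pieces (1 to 3).
 * Assumes len > 0 and that the sub-range lies within 'rec'.
 */
int toy_split_rec(struct toy_rec rec, unsigned long start,
		  unsigned long len, struct toy_rec out[3])
{
	unsigned long rec_end = rec.start + rec.len;
	unsigned long end = start + len;
	int n = 0;

	if (start > rec.start) {	/* piece before the sub-range */
		out[n].start = rec.start;
		out[n].len = start - rec.start;
		n++;
	}
	out[n].start = start;		/* the sub-range itself */
	out[n].len = len;
	n++;
	if (end < rec_end) {		/* piece after the sub-range */
		out[n].start = end;
		out[n].len = rec_end - end;
		n++;
	}
	return n;
}
```

A sub-range in the middle of a record yields three pieces, which is the
worst case a credit calculation would have to budget for.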

Signed-off-by: Tao Ma 
Signed-off-by: Joel Becker 
Signed-off-by: Greg Kroah-Hartman

ocfs2 fix o2dlm dlm run purgelist (rev 3)

2010-08-26T23:45:46+00:00

commit 7beaf243787f85a2ef9213ccf13ab4a243283fde upstream.

This patch fixes two problems in dlm_run_purgelist:

1. If a lockres is found to be in use, dlm_run_purgelist keeps trying to purge
the same lockres instead of trying the next lockres.

2. When a lockres is found unused, dlm_run_purgelist releases the lockres spinlock
before setting DLM_LOCK_RES_DROPPING_REF and calls dlm_purge_lockres.
The spinlock is reacquired, but in this window the lockres can get reused. This
leads to a BUG.

This patch modifies dlm_run_purgelist to skip a lockres if it's in use and purge
the next lockres. It also sets DLM_LOCK_RES_DROPPING_REF before releasing the
lockres spinlock, protecting it from getting reused.
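
The resulting loop shape can be sketched in userspace as follows. This is
a toy model, not the o2dlm code: toy_lockres, TOY_DROPPING_REF and
toy_run_purge_list() are hypothetical, and the spinlock is only indicated
by comments.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_DROPPING_REF 0x1	/* stands in for DLM_LOCK_RES_DROPPING_REF */

struct toy_lockres {
	bool in_use;
	unsigned int state;
	bool purged;
};

/* Walk the list once and return how many entries were purged. */
int toy_run_purge_list(struct toy_lockres *list, size_t n)
{
	int purged = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct toy_lockres *res = &list[i];

		/* (the lockres spinlock would be taken here) */
		if (res->in_use)
			continue;	/* skip it and try the next one */

		/* Mark the entry before the lock would be dropped. */
		res->state |= TOY_DROPPING_REF;
		/* (the lockres spinlock would be released here) */

		res->purged = true;
		purged++;
	}
	return purged;
}
```

The two points from the patch description map to the continue statement
and to setting the flag before the point where the lock is released.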

Signed-off-by: Srinivas Eeda 
Acked-by: Sunil Mushran 
Signed-off-by: Joel Becker 
Signed-off-by: Greg Kroah-Hartman

ocfs2/dlm: fix a dead lock

2010-08-26T23:45:46+00:00

commit 6d98c3ccb52f692f1a60339dde7c700686a5568b upstream.

When we have to take both dlm->master_lock and lockres->spinlock,
take them in order

lockres->spinlock and then dlm->master_lock.

The patch fixes a violation of the rule.
We can simply move taking dlm->master_lock to the point where we have dropped
res->spinlock, since we don't need master_lock's protection when we access
res->state and free the mle memory.

Signed-off-by: Wengang Wang 
Signed-off-by: Joel Becker 
Signed-off-by: Greg Kroah-Hartman