linux-toradex.git/include, branch v3.2.5

PCI: Rework ASPM disable code

2012-02-06T17:41:06+00:00

commit 3c076351c4027a56d5005a39a0b518a4ba393ce2 upstream.

Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.

This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.

It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.

Measured to save around 5W on an idle Thinkpad X220.

Signed-off-by: Matthew Garrett 
Signed-off-by: Jesse Barnes 
Signed-off-by: Greg Kroah-Hartman

netns: Fail conspicously if someone uses net_generic at an inappropriate time.

2012-02-03T17:22:18+00:00

[ Upstream commit 5ee4433efe99b9f39f6eff5052a177bbcfe72cea ]

By definition net_generic should never be called when it can return
NULL.  Fail conspicously with a BUG_ON to make it clear when people mess
up that a NULL return should never happen.

Recently there was a bug in the CAIF subsystem where it was registered
with register_pernet_device instead of register_pernet_subsys.  It was
erroneously concluded that net_generic could validly return NULL and
that net_assign_generic was buggy (when it was just inefficient).
Hopefully this BUG_ON will prevent people to coming to similar erroneous
conclusions in the futrue.

Signed-off-by: Eric W. Biederman 
Tested-by: Sasha Levin 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

drm: Fix authentication kernel crash

2012-02-03T17:21:26+00:00

commit 598781d71119827b454fd75d46f84755bca6f0c6 upstream.

If the master tries to authenticate a client using drm_authmagic and
that client has already closed its drm file descriptor,
either wilfully or because it was terminated, the
call to drm_authmagic will dereference a stale pointer into kmalloc'ed memory
and corrupt it.

Typically this results in a hard system hang.

This patch fixes that problem by removing any authentication tokens
(struct drm_magic_entry) open for a file descriptor when that file
descriptor is closed.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Daniel Vetter 
Signed-off-by: Dave Airlie 
Signed-off-by: Greg Kroah-Hartman

SHM_UNLOCK: fix Unevictable pages stranded after swap

2012-01-26T00:13:59+00:00

commit 245132643e1cfcd145bbc86a716c1818371fcb93 upstream.

Commit cc39c6a9bbde ("mm: account skipped entries to avoid looping in
find_get_pages") correctly fixed an infinite loop; but left a problem
that find_get_pages() on shmem would return 0 (appearing to callers to
mean end of tree) when it meets a run of nr_pages swap entries.

The only uses of find_get_pages() on shmem are via pagevec_lookup(),
called from invalidate_mapping_pages(), and from shmctl SHM_UNLOCK's
scan_mapping_unevictable_pages().  The first is already commented, and
not worth worrying about; but the second can leave pages on the
Unevictable list after an unusual sequence of swapping and locking.

Fix that by using shmem_find_get_pages_and_swap() (then ignoring the
swap) instead of pagevec_lookup().

But I don't want to contaminate vmscan.c with shmem internals, nor
shmem.c with LRU locking.  So move scan_mapping_unevictable_pages() into
shmem.c, renaming it shmem_unlock_mapping(); and rename
check_move_unevictable_page() to check_move_unevictable_pages(), looping
down an array of pages, oftentimes under the same lock.

Leave out the "rotate unevictable list" block: that's a leftover from
when this was used for /proc/sys/vm/scan_unevictable_pages, whose flawed
handling involved looking at pages at tail of LRU.

Was there significance to the sequence first ClearPageUnevictable, then
test page_evictable, then SetPageUnevictable here? I think not, we're
under LRU lock, and have no barriers between those.

Signed-off-by: Hugh Dickins 
Reviewed-by: KOSAKI Motohiro 
Cc: Minchan Kim 
Cc: Rik van Riel 
Cc: Shaohua Li 
Cc: Eric Dumazet 
Cc: Johannes Weiner 
Cc: Michel Lespinasse 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

tuner: Fix numberspace conflict between xc4000 and pti 5nf05 tuners

2012-01-26T00:13:51+00:00

commit cd4ca7afc61d3b18fcd635002459fb6b1d701099 upstream.

Update xc4000 tuner definition, number 81 is already in use by
TUNER_PARTSNIC_PTI_5NF05.

Signed-off-by: Miroslav Slugen 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Greg Kroah-Hartman

target: Set additional sense length field in sense data

2012-01-26T00:13:50+00:00

commit 895f3022523361e9b383cf48f51feb1f7d5e7e53 upstream.

The target code was not setting the additional sense length field in the
sense data it returned, which meant that at least the Linux stack
ignored the ASC/ASCQ fields.  For example, without this patch, on a
tcm_loop device:

    # sg_raw -v /dev/sda 2 0 0 0 0 0

gives

        cdb to send: 02 00 00 00 00 00
    SCSI Status: Check Condition

    Sense Information:
     Fixed format, current;  Sense key: Illegal Request
      Raw sense data (in hex):
            70 00 05 00 00 00 00 00

while after the patch we correctly get the following (which matches what
a regular disk returns):

        cdb to send: 02 00 00 00 00 00
    SCSI Status: Check Condition

    Sense Information:
     Fixed format, current;  Sense key: Illegal Request
     Additional sense: Invalid command operation code
     Raw sense data (in hex):
            70 00 05 00 00 00 00 0a  00 00 00 00 20 00 00 00
            00 00

Signed-off-by: Roland Dreier 
Signed-off-by: Nicholas Bellinger 
Signed-off-by: Greg Kroah-Hartman

ACPI: Store SRAT table revision

2012-01-26T00:13:46+00:00

commit 8df0eb7c9d96f9e82f233ee8b74e0f0c8471f868 upstream.

In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides
32bits for these. The new fields were reserved before.
According to the ACPI spec, the OS must disregrard reserved fields.
In order to know whether or not, we must know what version the SRAT
table has.

This patch stores the SRAT table revision for later consumption
by arch specific __init functions.

Signed-off-by: Kurt Garloff 
Signed-off-by: Len Brown 
Signed-off-by: Greg Kroah-Hartman

block: fail SCSI passthrough ioctls on partition devices

2012-01-26T00:13:42+00:00

commit 0bfc96cb77224736dfa35c3c555d37b3646ef35e upstream.

[ Changes with respect to 3.3: return -ENOTTY from scsi_verify_blk_ioctl
  and -ENOIOCTLCMD from sd_compat_ioctl. ]

Linux allows executing the SG_IO ioctl on a partition or LVM volume, and
will pass the command to the underlying block device.  This is
well-known, but it is also a large security problem when (via Unix
permissions, ACLs, SELinux or a combination thereof) a program or user
needs to be granted access only to part of the disk.

This patch lets partitions forward a small set of harmless ioctls;
others are logged with printk so that we can see which ioctls are
actually sent.  In my tests only CDROM_GET_CAPABILITY actually occurred.
Of course it was being sent to a (partition on a) hard disk, so it would
have failed with ENOTTY and the patch isn't changing anything in
practice.  Still, I'm treating it specially to avoid spamming the logs.

In principle, this restriction should include programs running with
CAP_SYS_RAWIO.  If for example I let a program access /dev/sda2 and
/dev/sdb, it still should not be able to read/write outside the
boundaries of /dev/sda2 independent of the capabilities.  However, for
now programs with CAP_SYS_RAWIO will still be allowed to send the
ioctls.  Their actions will still be logged.

This patch does not affect the non-libata IDE driver.  That driver
however already tests for bd != bd->bd_contains before issuing some
ioctl; it could be restricted further to forbid these ioctls even for
programs running with CAP_SYS_ADMIN/CAP_SYS_RAWIO.

Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe 
Cc: James Bottomley 
Signed-off-by: Paolo Bonzini 
[ Make it also print the command name when warning - Linus ]
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

block: add and use scsi_blk_cmd_ioctl

2012-01-26T00:13:42+00:00

commit 577ebb374c78314ac4617242f509e2f5e7156649 upstream.

Introduce a wrapper around scsi_cmd_ioctl that takes a block device.

The function will then be enhanced to detect partition block devices
and, in that case, subject the ioctls to whitelisting.

Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe 
Cc: James Bottomley 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

fix shrink_dcache_parent() livelock

2012-01-26T00:13:35+00:00

commit eaf5f9073533cde21c7121c136f1c3f072d9cf59 upstream.

Two (or more) concurrent calls of shrink_dcache_parent() on the same dentry may
cause shrink_dcache_parent() to loop forever.

Here's what appears to happen:

1 - CPU0: select_parent(P) finds C and puts it on dispose list, returns 1

2 - CPU1: select_parent(P) locks P->d_lock

3 - CPU0: shrink_dentry_list() locks C->d_lock
   dentry_kill(C) tries to lock P->d_lock but fails, unlocks C->d_lock

4 - CPU1: select_parent(P) locks C->d_lock,
         moves C from dispose list being processed on CPU0 to the new
dispose list, returns 1

5 - CPU0: shrink_dentry_list() finds dispose list empty, returns

6 - Goto 2 with CPU0 and CPU1 switched

Basically select_parent() steals the dentry from shrink_dentry_list() and thinks
it found a new one, causing shrink_dentry_list() to think it's making progress
and loop over and over.

One way to trigger this is to make udev calls stat() on the sysfs file while it
is going away.

Having a file in /lib/udev/rules.d/ with only this one rule seems to the trick:

ATTR{vendor}=="0x8086", ATTR{device}=="0x10ca", ENV{PCI_SLOT_NAME}="%k", ENV{MATCHADDR}="$attr{address}", RUN+="/bin/true"

Then execute the following loop:

while true; do
        echo -bond0 > /sys/class/net/bonding_masters
        echo +bond0 > /sys/class/net/bonding_masters
        echo -bond1 > /sys/class/net/bonding_masters
        echo +bond1 > /sys/class/net/bonding_masters
done

One fix would be to check all callers and prevent concurrent calls to
shrink_dcache_parent().  But I think a better solution is to stop the
stealing behavior.

This patch adds a new dentry flag that is set when the dentry is added to the
dispose list.  The flag is cleared in dentry_lru_del() in case the dentry gets a
new reference just before being pruned.

If the dentry has this flag, select_parent() will skip it and let
shrink_dentry_list() retry pruning it.  With select_parent() skipping those
dentries there will not be the appearance of progress (new dentries found) when
there is none, hence shrink_dcache_parent() will not loop forever.

Set the flag is also set in prune_dcache_sb() for consistency as suggested by
Linus.

Signed-off-by: Miklos Szeredi 
Signed-off-by: Al Viro 
Signed-off-by: Greg Kroah-Hartman