| Age | Commit message (Collapse) | Author |
|
|
|
The cmpxchg_futex_value_locked API was funny in that it returned either
the original, user-exposed futex value OR an error code such as -EFAULT.
This was confusing at best, and could be a source of livelocks in places
that retry the cmpxchg_futex_value_locked after trying to fix the issue
by running fault_in_user_writeable().
This change makes the cmpxchg_futex_value_locked API more similar to the
get_futex_value_locked one, returning an error code and updating the
original value through a reference argument.
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile]
Acked-by: Tony Luck <tony.luck@intel.com> [ia64]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu> [microblaze]
Acked-by: David Howells <dhowells@redhat.com> [frv]
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311024851.GC26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
|
|
Signed-off-by: Zhang Jiejing <jiejing.zhang@freescale.com>
|
|
Conflicts:
drivers/misc/Kconfig
drivers/misc/Makefile
drivers/net/wireless/Makefile
include/linux/mmc/mmc.h
kernel/power/main.c
|
|
There was some driver is slow on suspend/resume,
but some embeded system like eReader,Cellphone
are time sensitive,this commit will report the slow
driver on suspend/resume, the default value is 500us(0.5ms)
Also, the threshold can be change by modify
'/sys/power/device_suspend_time_threshold' to change the threshold,
it is in microsecond.
The output is like:
PM: device platform:soc-audio.2 suspend too slow, takes 606.696 msecs
PM: device platform:mxc_sdc_fb.1 suspend too slow, takes 7.708 msecs
the default state of suspend driver is default off,
if you want to debug the suspend time, echo time in
microsecond(u Second) to /sys/powe/device_suspend_time_threshold
eg: I want to know which driver suspend & resume takes
more that 0.5 ms (500 us), you can just :
ehco 500 > /sys/power/device_suspend_time_threshold
Signed-off-by: Zhang Jiejing <jiejing.zhang@freescale.com>
|
|
Change-Id: I36f90735c75fb7c7ab1084775ec0d0ab02336e6e
Signed-off-by: Todd Poynor <toddpoynor@google.com>
|
|
A thread/process in cgroup_attach_task() could have called
list_del(&tsk->cg_list) after cgroup_exit() had already called
list_del() on the same list. Since it only checked for
!list_empty(&tsk->cg_list) before doing this, the list_del()
call would thus be made twice.
The solution is to leave tsk->cg_list in a valid state in
cgroup_exit() with list_del_init(&tsk->cg_list), which leaves
an empty list.
Change-Id: I4e7c1d0665fced629f5ca033c18dd98afe080e0c
Signed-off-by: Simon Wilson <simonwilson@google.com>
|
|
synchronize_rcu can be very expensive, averaging 100 ms in
some cases. In cgroup_attach_task, it is used to prevent
a task->cgroups pointer dereferenced in an RCU read side
critical section from being invalidated, by delaying the
call to put_css_set until after an RCU grace period.
To avoid the call to synchronize_rcu, make the put_css_set
call rcu-safe by moving the deletion of the css_set links
into free_css_set_work, scheduled by the rcu callback
free_css_set_rcu.
The decrement of the cgroup refcount is no longer
synchronous with the call to put_css_set, which can result
in the cgroup refcount staying positive after the last call
to cgroup_attach_task returns. To allow the cgroup to be
deleted with cgroup_rmdir synchronously after
cgroup_attach_task, have rmdir check the refcount of all
associated css_sets. If cgroup_rmdir is called on a cgroup
for which the css_sets all have refcount zero but the
cgroup refcount is nonzero, reuse the rmdir waitqueue to
block the rmdir until free_css_set_work is called.
Signed-off-by: Colin Cross <ccross@android.com>
|
|
Changes the meaning of CGRP_RELEASABLE to be set on any cgroup
that has ever had a task or cgroup in it, or had css_get called
on it. The bit is set in cgroup_attach_task, cgroup_create,
and __css_get. It is not necessary to set the bit in
cgroup_fork, as the task is either in the root cgroup, in
which can never be released, or the task it was forked from
already set the bit in croup_attach_task.
Signed-off-by: Colin Cross <ccross@android.com>
|
|
After pulling the thread off the run-queue during a cgroup change,
the cfs_rq.min_vruntime gets recalculated. The dequeued thread's vruntime
then gets normalized to this new value. This can then lead to the thread
getting an unfair boost in the new group if the vruntime of the next
task in the old run-queue was way further ahead.
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Dima Zavin <dima@android.com>
|
|
wall_to_monotonic
Change-Id: I9e9c3b923bf9a22ffd48f80a72050289496e57d8
|
|
Change-Id: I21366ace371d1b8f4684ddbe4ea8d555a926ac21
Signed-off-by: Colin Cross <ccross@google.com>
|
|
Platform must register cpu power function that return power in
milliWatt seconds.
Change-Id: I1caa0335e316c352eee3b1ddf326fcd4942bcbe8
Signed-off-by: Mike Chan <mike@android.com>
|
|
Introduce new platform callback hooks for cpuacct for tracking CPU frequencies
Not all platforms / architectures have a set CPU_FREQ_TABLE defined
for CPU transition speeds. In order to track time spent in at various
CPU frequencies, we enable platform callbacks from cpuacct for this accounting.
Architectures that support overclock boosting, or don't have pre-defined
frequency tables can implement their own bucketing system that makes sense
given their cpufreq scaling abilities.
New file:
cpuacct.cpufreq reports the CPU time (in nanoseconds) spent at each CPU
frequency.
Change-Id: I10a80b3162e6fff3a8a2f74dd6bb37e88b12ba96
Signed-off-by: Mike Chan <mike@android.com>
|
|
This patch adds a notifier which can be used by subsystems that may
be interested in when a task has completely died and is about to
have it's last resource freed.
The Android lowmemory killer uses this to determine when a task
it has killed has finally given up its goods.
Signed-off-by: San Mehat <san@google.com>
|
|
When DEBUG_SUSPEND is enabled print active wakelocks when we check
if there are any active wakelocks.
In print_active_locks(), print expired wakelocks if DEBUG_EXPIRE is enabled
Change-Id: Ib1cb795555e71ff23143a2bac7c8a58cbce16547
Signed-off-by: Mike Chan <mike@android.com>
|
|
Signed-off-by: San Mehat <san@google.com>
|
|
If idx was non-zero and the log had wrapped, len did not get truncated
to stop at the last byte written to the log.
|
|
This reverts commit acff181d3574244e651913df77332e897b88bff4.
|
|
Change-Id: Ib82f6a716686a3ebb4592112400fc5e4a2ce066c
Signed-off-by: Colin Cross <ccross@android.com>
|
|
Rather than using explicit euid == 0 checks when trying to move
tasks into a cgroup via CFS, move permission checks into each
specific cgroup subsystem. If a subsystem does not specify a
'can_attach' handler, then we fall back to doing our checks the old way.
This way non-root processes can add arbitrary processes to
a cgroup if all the registered subsystems on that cgroup agree.
Also change explicit euid == 0 check to CAP_SYS_ADMIN
Signed-off-by: San Mehat <san@google.com>
|
|
Rather than signaling a full update of the display from userspace via a
console switch, this patch introduces 2 files int /sys/power,
wait_for_fb_sleep and wait_for_fb_wake. Reading these files will block
until the requested state has been entered. When a read from
wait_for_fb_sleep returns userspace should stop drawing. When
wait_for_fb_wake returns, it should do a full update. If either are called
when the fb driver is already in the requested state, they will return
immediately.
Signed-off-by: Rebecca Schultz <rschultz@google.com>
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
vt_waitactive now needs a 1 based console number
Change-Id: I07ab9a3773c93d67c09d928c8d5494ce823ffa2e
|
|
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
try_to_freeze_tasks after less than one second
Change-Id: Ib2976e5b97a5ee4ec9abd4d4443584d9257d0941
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Avoids a problem where the device sometimes hangs for 20 seconds
before the screen is turned on.
|
|
This adds /sys/power/wake_lock and /sys/power/wake_unlock.
Writing a string to wake_lock creates a wake lock the
first time is sees a string and locks it. Optionally, the
string can be followed by a timeout.
To unlock the wake lock, write the same string to wake_unlock.
Change-Id: I66c6e3fe6487d17f9c2fafde1174042e57d15cd7
|
|
If EARLYSUSPEND is enabled then writes to /sys/power/state no longer
blocks, and the kernel will try to enter the requested state every
time no wakelocks are held. Write "on" to resume normal operation.
|
|
|
|
of stats.
Change-Id: I42ed8bea639684f7a8a95b2057516764075c6b01
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Change-Id: Ic944e3b3d3bc53eddc6fd0963565fd072cac373c
Signed-off-by: Erik Gilling <konkers@android.com>
|
|
Signed-off-by: Mike Chan <mike@android.com>
|
|
PM: wakelock: Replace expire work with a timer
The expire work function did not work in the normal case.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
This allows detection of init bugs in built-in drivers.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
|
|
By default the kernel tries to keep half as much memory free at each
order as it does for one order below. This can be too agressive when
running without swap.
Change-Id: I5efc1a0b50f41ff3ac71e92d2efd175dedd54ead
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
Makes low-level printk work.
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: NFSROOT should default to "proto=udp"
nfs4: remove duplicated #include
NFSv4: nfs4_state_mark_reclaim_nograce() should be static
NFSv4: Fix the setlk error handler
NFSv4.1: Fix the handling of the SEQUENCE status bits
NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
NFSv4.1 reclaim complete must wait for completion
NFSv4: remove duplicate clientid in struct nfs_client
NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
sunrpc: Propagate errors from xs_bind() through xs_create_sock()
(try3-resend) Fix nfs_compat_user_ino64 so it doesn't cause problems if bit 31 or 63 are set in fileid
nfs: fix compilation warning
nfs: add kmalloc return value check in decode_and_add_ds
SUNRPC: Remove resource leak in svc_rdma_send_error()
nfs: close NFSv4 COMMIT vs. CLOSE race
SUNRPC: Close a race in __rpc_wait_for_completion_task()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix sched rt group scheduling when hierachy is enabled
|
|
Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.
For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.
This patch fixes a race which exists due to the fact that
rpciod calls rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then subsequently calls
rpc_put_task() without ensuring that these two steps are done atomically.
In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we drop trying to grab the spin lock,
and immediately free up the rpc_task.
Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
nd->inode is not set on the second attempt in path_walk()
unfuck proc_sysctl ->d_compare()
minimal fix for do_filp_open() race
|
|
a) struct inode is not going to be freed under ->d_compare();
however, the thing PROC_I(inode)->sysctl points to just might.
Fortunately, it's enough to make freeing that sucker delayed,
provided that we don't step on its ->unregistering, clear
the pointer to it in PROC_I(inode) before dropping the reference
and check if it's NULL in ->d_compare().
b) I'm not sure that we *can* walk into NULL inode here (we recheck
dentry->seq between verifying that it's still hashed / fetching
dentry->d_inode and passing it to ->d_compare() and there's no
negative hashed dentries in /proc/sys/*), but if we can walk into
that, we really should not have ->d_compare() return 0 on it!
Said that, I really suspect that this check can be simply killed.
Nick?
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Don't forget to release cgroup_mutex if alloc_trial_cpuset() fails.
[akpm@linux-foundation.org: avoid multiple return points]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
They are only used inside kernel/ptrace.c, and have been for a long
time. We don't want to go back to the bad-old-days when architectures
did things on their own, so make them static and private.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The current sched rt code is broken when it comes to hierarchical
scheduling, this patch fixes two problems
1. It adds redundant enqueuing (harmless) when it finds a queue
has tasks enqueued, but it has no run time and it is not
throttled.
2. The most important change is in sched_rt_rq_enqueue/dequeue.
The code just picks the rt_rq belonging to the current cpu
on which the period timer runs, the patch fixes it, so that
the correct rt_se is enqueued/dequeued.
Tested with a simple hierarchy
/c/d, c and d assigned similar runtimes of 50,000 and a while
1 loop runs within "d". Both c and d get throttled, without
the patch, the task just stops running and never runs (depends
on where the sched_rt b/w timer runs). With the patch, the
task is throttled and runs as expected.
[ bharata, suggestions on how to pick the rt_se belong to the
rt_rq and correct cpu ]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
LKML-Reference: <20110303113435.GA2868@balbir.in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
If we enable trace events to trace block actions, We use
blk_fill_rwbs_rq to analyze the corresponding actions
in request's cmd_flags, but we only choose the minor 2 bits
from it, so most of other flags(e.g, REQ_SYNC) are missing.
For example, with a sync write we get:
write_test-2409 [001] 160.013869: block_rq_insert: 3,64 W 0 () 258135 + =
8 [write_test]
Since now we have integrated the flags of both bio and request,
it is safe to pass rq->cmd_flags directly to blk_fill_rwbs and
blk_fill_rwbs_rq isn't needed any more.
With this patch, after a sync write we get:
write_test-2417 [000] 226.603878: block_rq_insert: 3,64 WS 0 () 258135 +=
8 [write_test]
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we only
can switch into oneshot mode, when the backup broadcast device
supports oneshot mode as well. Otherwise we would try to switch the
broadcast device into an unsupported mode unconditionally. This went
unnoticed so far as the current available broadcast devices support
oneshot mode. Seth unearthed this problem while debugging and working
around an hpet related BIOS wreckage.
Add the necessary check to tick_is_oneshot_available().
Reported-and-tested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <alpine.LFD.2.00.1102252231200.2701@localhost6.localdomain6>
Cc: stable@kernel.org # .21 ->
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Disable the SHIRQ_DEBUG call in request_threaded_irq for now
genirq: Prevent access beyond allocated_irqs bitmap
|