summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2010-02-04sched: Change usage of rt_rq->rt_se to rt_rq->tg->rt_se[cpu]Yong Zhang
This is the first step to remove rt_rq member rt_se because it have the same meaning with tg->rt_se[cpu]. And the latter style is also used by the fair scheduling class. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <2674af741001282257r28c97a92o9f90cf16fe8d3d84@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-02sched: Remove unused update_shares_locked()Peter Zijlstra
Commit f492e12ef050e02bf0185b6b57874992591b9be1 ("sched: Remove load_balance_newidle()") removed the only user of this function, so remove it too. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1265019219.24455.128.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-02sched: Use for_each_bitAkinobu Mita
No change in functionality. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1264938810-4173-1-git-send-email-akinobu.mita@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-22sched: Queue a deboosted task to the head of the RT prio queueThomas Gleixner
rtmutex_set_prio() is used to implement priority inheritance for futexes. When a task is deboosted it gets enqueued at the tail of its RT priority list. This is violating the POSIX scheduling semantics: rt priority list X contains two runnable tasks A and B task A runs with priority X and holds mutex M task C preempts A and is blocked on mutex M -> task A is boosted to priority of task C (Y) task A unlocks the mutex M and deboosts itself -> A is dequeued from rt priority list Y -> A is enqueued to the tail of rt priority list X task C schedules away task B runs This is wrong as task A did not schedule away and therefor violates the POSIX scheduling semantics. Enqueue the task to the head of the priority list instead. Reported-by: Mathias Weber <mathias.weber.mw1@roche.com> Reported-by: Carsten Emde <cbe@osadl.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Carsten Emde <cbe@osadl.org> Tested-by: Mathias Weber <mathias.weber.mw1@roche.com> LKML-Reference: <20100120171629.809074113@linutronix.de>
2010-01-22sched: Implement head queueing for sched_rtThomas Gleixner
The ability of enqueueing a task to the head of a SCHED_FIFO priority list is required to fix some violations of POSIX scheduling policy. Implement the functionality in sched_rt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Carsten Emde <cbe@osadl.org> Tested-by: Mathias Weber <mathias.weber.mw1@roche.com> LKML-Reference: <20100120171629.772169931@linutronix.de>
2010-01-22sched: Extend enqueue_task to allow head queueingThomas Gleixner
The ability of enqueueing a task to the head of a SCHED_FIFO priority list is required to fix some violations of POSIX scheduling policy. Extend the related functions with a "head" argument. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Carsten Emde <cbe@osadl.org> Tested-by: Mathias Weber <mathias.weber.mw1@roche.com> LKML-Reference: <20100120171629.734886007@linutronix.de>
2010-01-21sched: Remove USER_SCHEDDhaval Giani
Remove the USER_SCHED feature. It has been scheduled to be removed in 2.6.34 as per http://marc.info/?l=linux-kernel&m=125728479022976&w=2 Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1263990378.24844.3.camel@localhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Fix the place where group powers are updatedGautham R Shenoy
We want to update the sched_group_powers when balance_cpu == this_cpu. Currently the group powers are updated only if the balance_cpu is the first CPU in the local group. But balance_cpu = this_cpu could also be the first idle cpu in the group. Hence fix the place where the group powers are updated. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1264017764.5717.127.camel@jschopp-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Assume *balance is validPeter Zijlstra
Since all load_balance() callers will have !NULL balance parameters we can now assume so and remove a few checks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Remove load_balance_newidle()Peter Zijlstra
The two functions: load_balance{,_newidle}() are very similar, with the following differences: - rq->lock usage - sb->balance_interval updates - *balance check So remove the load_balance_newidle() call with load_balance(.idle = CPU_NEWLY_IDLE), explicitly unlock the rq->lock before calling (would be done by double_lock_balance() anyway), and ignore the other differences for now. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Unify load_balance{,_newidle}()Peter Zijlstra
load_balance() and load_balance_newidle() look remarkably similar, one key point they differ in is the condition on when to active balance. So split out that logic into a separate function. One side effect is that previously load_balance_newidle() used to fail and return -1 under these conditions, whereas now it doesn't. I've not yet fully figured out the whole -1 return case for either load_balance{,_newidle}(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Add a lock break for PREEMPT=yPeter Zijlstra
Since load-balancing can hold rq->locks for quite a long while, allow breaking out early when there is lock contention. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Remove from fwd declsPeter Zijlstra
Move code around to get rid of fwd declarations. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Remove rq_iterator from move_one_taskPeter Zijlstra
Again, since we only iterate the fair class, remove the abstraction. Since this is the last user of the rq_iterator, remove all that too. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Remove rq_iterator usage from load_balance_fairPeter Zijlstra
Since we only ever iterate the fair class, do away with this abstraction. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Remove the sched_class load_balance methodsPeter Zijlstra
Take out the sched_class methods for load-balancing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-21sched: Move load balance code into sched_fair.cPeter Zijlstra
Straight fwd code movement. Since non of the load-balance abstractions are used anymore, do away with them and simplify the code some. In preparation move the code around. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-17sched: Don't expose local functionsH Hartley Sweeten
kernel/sched: don't expose local functions The get_rr_interval_* functions are all class methods of struct sched_class. They are not exported so make them static. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <201001132021.53253.hartleys@visionengravers.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-28sched: might_sleep(): Make file parameter const char *Simon Kagstrom
Fixes a warning when building with g++: warning: deprecated conversion from string constant to 'char*' And the file parameter use is constant, so mark it as such. Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net> Cc: peterz@infradead.org LKML-Reference: <20091223110818.442d848e@marrow.netinsight.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-24Merge branch 'sysctl' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc-2.6 * 'sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc-2.6: SYSCTL: Add a mutex to the page_alloc zone order sysctl SYSCTL: Print binary sysctl warnings (nearly) only once
2009-12-23SYSCTL: Print binary sysctl warnings (nearly) only onceAndi Kleen
When printing legacy sysctls print the warning message for each of them only once. This way there is a guarantee the syslog won't be flooded for any sane program. The original attempt at this made the tables non const and stored the flag inline. Linus suggested using a separate hash table for this, this is based on a code snippet from him. The hash implies this is not exact and can sometimes not print a new sysctl due to a hash collision, but in practice this should not be a problem I used a FNV32 hash over the binary string with a 32byte bitmap. This gives relatively little collisions when all the predefined binary sysctls are hashed: size 256 bucket length number 0: [25] 1: [67] 2: [88] 3: [47] 4: [22] 5: [6] 6: [1] The worst case is a single collision of 6 hash values. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-23Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Revert 738d2be, simplify set_task_cpu()
2009-12-23sched: Revert 738d2be, simplify set_task_cpu()Peter Zijlstra
Effectively reverts 738d2be4301007f054541c5c4bf7fb6a361c9b3a. As demonstrated by Eric, we really need to call __set_task_cpu() early in the fork() path to properly initialize the various task state -- specifically the cgroup state through set_task_rq(). [ we could probably fix this by explicitly calling __set_task_cpu() from sched_fork(), but lets try that for the next cycle and simply revert to the old behaviour for now. ] Reported-by: Eric Paris <eparis@redhat.com> Tested-by: Eric Paris <eparis@redhat.com>, Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: efault@gmx.de LKML-Reference: <1261492999.4937.36.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: jfs: Fix 32bit build warning Remove obsolete comment in fs.h Sanitize f_flags helpers Fix f_flags/f_mode in case of lookup_instantiate_filp() from open(pathname, 3) anonfd: Allow making anon files read-only fs/compat_ioctl.c: fix build error when !BLOCK pohmelfs needs I_LOCK alloc_file(): simplify handling of mnt_clone_write() errors
2009-12-22kfifo: add record handling functionsStefani Seibold
Add kfifo_in_rec() - puts some record data into the FIFO Add kfifo_out_rec() - gets some record data from the FIFO Add kfifo_from_user_rec() - puts some data from user space into the FIFO Add kfifo_to_user_rec() - gets data from the FIFO and write it to user space Add kfifo_peek_rec() - gets the size of the next FIFO record field Add kfifo_skip_rec() - skip the next fifo out record Add kfifo_avail_rec() - determinate the number of bytes available in a record FIFO Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22kfifo: add kfifo_skip, kfifo_from_user and kfifo_to_userStefani Seibold
Add kfifo_reset_out() for save lockless discard the fifo output Add kfifo_skip() to skip a number of output bytes Add kfifo_from_user() to copy user space data into the fifo Add kfifo_to_user() to copy fifo data to user space Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22kfifo: rename kfifo_put... into kfifo_in... and kfifo_get... into kfifo_out...Stefani Seibold
rename kfifo_put... into kfifo_in... to prevent miss use of old non in kernel-tree drivers ditto for kfifo_get... -> kfifo_out... Improve the prototypes of kfifo_in and kfifo_out to make the kerneldoc annotations more readable. Add mini "howto porting to the new API" in kfifo.h Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22kfifo: cleanup namespaceStefani Seibold
change name of __kfifo_* functions to kfifo_*, because the prefix __kfifo should be reserved for internal functions only. Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22kfifo: move out spinlockStefani Seibold
Move the pointer to the spinlock out of struct kfifo. Most users in tree do not actually use a spinlock, so the few exceptions now have to call kfifo_{get,put}_locked, which takes an extra argument to a spinlock. Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22kfifo: move struct kfifo in placeStefani Seibold
This is a new generic kernel FIFO implementation. The current kernel fifo API is not very widely used, because it has to many constrains. Only 17 files in the current 2.6.31-rc5 used it. FIFO's are like list's a very basic thing and a kfifo API which handles the most use case would save a lot of development time and memory resources. I think this are the reasons why kfifo is not in use: - The API is to simple, important functions are missing - A fifo can be only allocated dynamically - There is a requirement of a spinlock whether you need it or not - There is no support for data records inside a fifo So I decided to extend the kfifo in a more generic way without blowing up the API to much. The new API has the following benefits: - Generic usage: For kernel internal use and/or device driver. - Provide an API for the most use case. - Slim API: The whole API provides 25 functions. - Linux style habit. - DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros - Direct copy_to_user from the fifo and copy_from_user into the fifo. - The kfifo itself is an in place member of the using data structure, this save an indirection access and does not waste the kernel allocator. - Lockless access: if only one reader and one writer is active on the fifo, which is the common use case, no additional locking is necessary. - Remove spinlock - give the user the freedom of choice what kind of locking to use if one is required. - Ability to handle records. Three type of records are supported: - Variable length records between 0-255 bytes, with a record size field of 1 bytes. - Variable length records between 0-65535 bytes, with a record size field of 2 bytes. - Fixed size records, which no record size field. - Preserve memory resource. - Performance! - Easy to use! This patch: Since most users want to have the kfifo as part of another object, reorganize the code to allow including struct kfifo in another data structure. This requires changing the kfifo_alloc and kfifo_init prototypes so that we pass an existing kfifo pointer into them. This patch changes the implementation and all existing users. [akpm@linux-foundation.org: fix warning] Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22Revert "time: Remove xtime_cache"Linus Torvalds
This reverts commit 7bc7d637452383d56ba4368d4336b0dde1bb476d, as requested by John Stultz. Quoting John: "Petr Titěra reported an issue where he saw odd atime regressions with 2.6.33 where there were a full second worth of nanoseconds in the nanoseconds field. He also reviewed the time code and narrowed down the problem: unhandled overflow of the nanosecond field caused by rounding up the sub-nanosecond accumulated time. Details: * At the end of update_wall_time(), we currently round up the sub-nanosecond portion of accumulated time when storing it into xtime. This was added to avoid time inconsistencies caused when the sub-nanosecond portion was truncated when storing into xtime. Unfortunately we don't handle the possible second overflow caused by that rounding. * Previously the xtime_cache code hid this overflow by normalizing the xtime value when storing into the xtime_cache. * We could try to handle the second overflow after the rounding up, but since this affects the timekeeping's internal state, this would further complicate the next accumulation cycle, causing small errors in ntp steering. As much as I'd like to get rid of it, the xtime_cache code is known to work. * The correct fix is really to include the sub-nanosecond portion in the timekeeping accessor function, so we don't need to round up at during accumulation. This would greatly simplify the accumulation code. Unfortunately, we can't do this safely until the last three non-GENERIC_TIME arches (sparc32, arm, cris) are converted (those patches are in -mm) and we kill off the spots where arches set xtime directly. This is all 2.6.34 material, so I think reverting the xtime_cache change is the best approach for now. Many thanks to Petr for both reporting and finding the issue!" Reported-by: Petr Titěra <P.Titera@century.cz> Requested-by: john stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22Sanitize f_flags helpersAl Viro
* pull ACC_MODE to fs.h; we have several copies all over the place * nightmarish expression calculating f_mode by f_flags deserves a helper too (OPEN_FMODE(flags)) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-22anonfd: Allow making anon files read-onlyRoland Dreier
It seems a couple places such as arch/ia64/kernel/perfmon.c and drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile() instead of a private pseudo-fs + alloc_file(), if only there were a way to get a read-only file. So provide this by having anon_inode_getfile() create a read-only file if we pass O_RDONLY in flags. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-21resources: fix call to alignf() in allocate_resource()Dominik Brodowski
The second parameter to alignf() in allocate_resource() must reflect what new resource is attempted to be allocated, else functions like pcibios_align_resource() (at least on x86) or pcmcia_align() can't work correctly. Commit 1e5ad9679016275d422e36b12a98b0927d76f556 broke this by setting the "new" resource until we're about to return success. To keep the resource untouched when allocate_resource() fails, a "tmp" resource is introduced. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-20sched: Fix hotplug hangPeter Zijlstra
The hot-unplug kstopmachine usage does a wakeup after deactivating the cpu, hence we cannot use cpu_active() here but must rely on the good olde online. Reported-by: Sachin Sant <sachinp@in.ibm.com> Reported-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Jens Axboe <jens.axboe@oracle.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> LKML-Reference: <1261326987.4314.24.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-20sched: Restore printk sanityPeter Zijlstra
Revert the braindead pr_* crap. (Commit 663997d "sched: Use pr_fmt() and pr_<level>()") It's dumb and causes stupid "sched: " strings all over the place. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Mike Galbraith <efault@gmx.de> Cc: Joe Perches <joe@perches.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1261315437.4314.6.camel@laptop> [ i dont mind the pr_*() patterns that much - but Peter dislikes them with a vengence. ] [ - v2: remove spurious diffstat from changelog :-/ ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-19Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf session: Make events_stats u64 to avoid overflow on 32-bit arches hw-breakpoints: Fix hardware breakpoints -> perf events dependency perf events: Dont report side-band events on each cpu for per-task-per-cpu events perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker perf events, x86/stacktrace: Make stack walking optional perf events: Remove unused perf_counter.h header file perf probe: Check new event name kprobe-tracer: Check new event/group name perf probe: Check whether debugfs path is correct perf probe: Fix libdwarf include path for Debian
2009-12-19Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) sched: Fix broken assertion sched: Assert task state bits at build time sched: Update task_state_arraypwith new states sched: Add missing state chars to TASK_STATE_TO_CHAR_STR sched: Move TASK_STATE_TO_CHAR_STR near the TASK_state bits sched: Teach might_sleep() about preemptible RCU sched: Make warning less noisy sched: Simplify set_task_cpu() sched: Remove the cfs_rq dependency from set_task_cpu() sched: Add pre and post wakeup hooks sched: Move kthread_bind() back to kthread.c sched: Fix select_task_rq() vs hotplug issues sched: Fix sched_exec() balancing sched: Ensure set_task_cpu() is never called on blocked tasks sched: Use TASK_WAKING for fork wakups sched: Select_task_rq_fair() must honour SD_LOAD_BALANCE sched: Fix task_hot() test order sched: Fix set_cpu_active() in cpu_down() sched: Mark boot-cpu active before smp_init() sched: Fix cpu_clock() in NMIs, on !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK ...
2009-12-19Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sys: Fix missing rcu protection for __task_cred() access signals: Fix more rcu assumptions signal: Fix racy access to __task_cred in kill_pid_info_as_uid()
2009-12-19Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers: Remove duplicate setting of new_base in __mod_timer() clockevents: Prevent clockevent_devices list corruption on cpu hotplug
2009-12-19fix more leaks in audit_tree.c tag_chunk()Al Viro
Several leaks in audit_tree didn't get caught by commit 318b6d3d7ddbcad3d6867e630711b8a705d873d7, including the leak on normal exit in case of multiple rules refering to the same chunk. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-19fix braindamage in audit_tree.c untag_chunk()Al Viro
... aka "Al had badly fscked up when writing that thing and nobody noticed until Eric had fixed leaks that used to mask the breakage". The function essentially creates a copy of old array sans one element and replaces the references to elements of original (they are on cyclic lists) with those to corresponding elements of new one. After that the old one is fair game for freeing. First of all, there's a dumb braino: when we get to list_replace_init we use indices for wrong arrays - position in new one with the old array and vice versa. Another bug is more subtle - termination condition is wrong if the element to be excluded happens to be the last one. We shouldn't go until we fill the new array, we should go until we'd finished the old one. Otherwise the element we are trying to kill will remain on the cyclic lists... That crap used to be masked by several leaks, so it was not quite trivial to hit. Eric had fixed some of those leaks a while ago and the shit had hit the fan... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-17Merge branch 'cpumask-cleanups' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * 'cpumask-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: cpumask: rename tsk_cpumask to tsk_cpus_allowed cpumask: don't recommend set_cpus_allowed hack in Documentation/cpu-hotplug.txt cpumask: avoid dereferencing struct cpumask cpumask: convert drivers/idle/i7300_idle.c to cpumask_var_t cpumask: use modern cpumask style in drivers/scsi/fcoe/fcoe.c cpumask: avoid deprecated function in mm/slab.c cpumask: use cpu_online in kernel/perf_event.c
2009-12-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: Keys: KEYCTL_SESSION_TO_PARENT needs TIF_NOTIFY_RESUME architecture support NOMMU: Optimise away the {dac_,}mmap_min_addr tests security/min_addr.c: make init_mmap_min_addr() static keys: PTR_ERR return of wrong pointer in keyctl_get_security()
2009-12-17Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6Linus Torvalds
* 'kmemleak' of git://linux-arm.org/linux-2.6: kmemleak: fix kconfig for crc32 build error kmemleak: Reduce the false positives by checking for modified objects kmemleak: Show the age of an unreferenced object kmemleak: Release the object lock before calling put_object() kmemleak: Scan the _ftrace_events section in modules kmemleak: Simplify the kmemleak_scan_area() function prototype kmemleak: Do not use off-slab management with SLAB_NOLEAKTRACE
2009-12-17printk: fix new kernel-doc warningsRandy Dunlap
Fix kernel-doc warnings in printk.c: Warning(kernel/printk.c:1422): No description found for parameter 'dumper' Warning(kernel/printk.c:1422): Excess function parameter 'dump' description in 'kmsg_dump_register' Warning(kernel/printk.c:1451): No description found for parameter 'dumper' Warning(kernel/printk.c:1451): Excess function parameter 'dump' description in 'kmsg_dump_unregister' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-17do_wait() optimization: do not place sub-threads on task_struct->children listOleg Nesterov
Thanks to Roland who pointed out de_thread() issues. Currently we add sub-threads to ->real_parent->children list. This buys nothing but slows down do_wait(). With this patch ->children contains only main threads (group leaders). The only complication is that forget_original_parent() should iterate over sub-threads by hand, and de_thread() needs another list_replace() when it changes ->group_leader. Henceforth do_wait_thread() can never see task_detached() && !EXIT_DEAD tasks, we can remove this check (and we can unify do_wait_thread() and ptrace_do_wait()). This change can confuse the optimistic search in mm_update_next_owner(), but this is fixable and minor. Perhaps badness() and oom_kill_process() should be updated, but they should be fixed in any case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ratan Nalumasu <rnalumasu@gmail.com> Cc: Vitaly Mayatskikh <vmayatsk@redhat.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-17kernel/sysctl.c: fix the incomplete part of ↵WANG Cong
sysctl_max_map_count-should-be-non-negative.patch It is a mistake that we used 'proc_dointvec', it should be 'proc_dointvec_minmax', as in the original patch. Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-17Merge branch 'for-33' of git://repo.or.cz/linux-kbuildLinus Torvalds
* 'for-33' of git://repo.or.cz/linux-kbuild: (29 commits) net: fix for utsrelease.h moving to generated gen_init_cpio: fixed fwrite warning kbuild: fix make clean after mismerge kbuild: generate modules.builtin genksyms: properly consider EXPORT_UNUSED_SYMBOL{,_GPL}() score: add asm/asm-offsets.h wrapper unifdef: update to upstream revision 1.190 kbuild: specify absolute paths for cscope kbuild: create include/generated in silentoldconfig scripts/package: deb-pkg: use fakeroot if available scripts/package: add KBUILD_PKG_ROOTCMD variable scripts/package: tar-pkg: use tar --owner=root Kbuild: clean up marker net: add net_tstamp.h to headers_install kbuild: move utsrelease.h to include/generated kbuild: move autoconf.h to include/generated drop explicit include of autoconf.h kbuild: move compile.h to include/generated kbuild: drop include/asm kbuild: do not check for include/asm-$ARCH ... Fixed non-conflicting clean merge of modpost.c as per comments from Stephen Rothwell (modpost.c had grown an include of linux/autoconf.h that needed to be changed to generated/autoconf.h)
2009-12-17sched: Fix broken assertionPeter Zijlstra
There's a preemption race in the set_task_cpu() debug check in that when we get preempted after setting task->state we'd still be on the rq proper, but fail the test. Check for preempted tasks, since those are always on the RQ. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20091217121830.137155561@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>