summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2007-11-12[NET]: Add the helper kernel_sock_shutdown()Trond Myklebust
...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers. Looking at the sock->op->shutdown() handlers, it looks as if all of them take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the RCV_SHUTDOWN/SEND_SHUTDOWN arguments. Add a helper, and then define the SHUT_* enum to ensure that kernel users of shutdown() don't get confused. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12[IPV6]: Add ifindex field to ND user option messages.Pierre Ynard
Userland neighbor discovery options are typically heavily involved with the interface on which thay are received: add a missing ifindex field to the original struct. Thanks to Rémi Denis-Courmont. Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-virtio * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-virtio: virtio: Force use of power-of-two for descriptor ring sizes lguest: Fix lguest virtio-blk backend size computation virtio: Fix used_idx wrap-around virtio: more fallout from scatterlist changes. virtio: fix vring_init for 64 bits
2007-11-12Merge branch 'master' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (39 commits) [INET]: Small possible memory leak in FIB rules [NETNS]: init dev_base_lock only once [UNIX]: The unix_nr_socks limit can be exceeded [AF_UNIX]: Convert socks to unix_socks in scan_inflight, not in callbacks [AF_UNIX]: Make unix_tot_inflight counter non-atomic [AF_PACKET]: Allow multicast traffic to be caught by ORIGDEV when bonded ssb: Fix PCMCIA-host lowlevel bus access mac80211: fix MAC80211_RCSIMPLE Kconfig mac80211: make "decrypt failed" messages conditional upon MAC80211_DEBUG mac80211: use IW_AUTH_PRIVACY_INVOKED rather than IW_AUTH_KEY_MGMT mac80211: remove unused driver ops mac80211: remove ieee80211_common.h softmac: MAINTAINERS update rfkill: Fix sparse warning rfkill: Use mutex_lock() at register and add sanity check iwlwifi: select proper rate control algorithm mac80211: allow driver to ask for a rate control algorithm mac80211: don't allow registering the same rate control twice rfkill: Use subsys_initcall mac80211: make simple rate control algorithm built-in ...
2007-11-12x86: fix taking DNA during 64bit sigreturnSiddha, Suresh B
restore sigcontext is taking a DNA exception while restoring FP context from the user stack, during the sigreturn. Appended patch fixes it by doing clts() if the app doesn't touch FP during the signal handler execution. This will stop generating a DNA, during the fxrstor in the sigreturn. This improves 64-bit lat_sig numbers by ~30% on my core2 platform. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-12virtio: Force use of power-of-two for descriptor ring sizesRusty Russell
The virtio descriptor rings of size N-1 were nicely set up to be aligned to an N-byte boundary. But as Anthony Liguori points out, the free-running indices used by virtio require that the sizes be a power of 2, otherwise we get problems on wrap (demonstrated with lguest). So we replace the clever "2^n-1" scheme with a simple "align to page boundary" scheme: this means that all virtio rings take at least two pages, but it's safer than guessing cache alignment. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-11-12virtio: fix vring_init for 64 bitsAnthony Liguori
This patch fixes a typo in vring_init(). This happens to work today in lguest because the sizeof(struct vring_desc) is 16 and struct vring contains 3 pointers and an unsigned int so on 32-bit sizeof(struct vring_desc) == sizeof(struct vring). However, this is no longer true on 64-bit where the bug is exposed. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-11-10[INET]: Small possible memory leak in FIB rulesDenis V. Lunev
This patch fixes a small memory leak. Default fib rules can be deleted by the user if the rule does not carry FIB_RULE_PERMANENT flag, f.e. by ip rule flush Such a rule will not be freed as the ref-counter has 2 on start and becomes clearly unreachable after removal. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[AF_UNIX]: Make unix_tot_inflight counter non-atomicPavel Emelyanov
This counter is _always_ modified under the unix_gc_lock spinlock, so its atomicity can be provided w/o additional efforts. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10mac80211: remove unused driver opsJohannes Berg
The driver operations set_ieee8021x(), set_port_auth() and set_privacy_invoked() are not used by any drivers, except set_privacy_invoked() they aren't even used by mac80211. Remove them at least until we need to support drivers with mac80211 that require getting this information. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10mac80211: allow driver to ask for a rate control algorithmJohannes Berg
This allows a driver to ask for a specific rate control algorithm. The rate control algorithm asked for must be registered and be available as a module or built-in. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10[NET]: Fix skb_truesize_check() assertionChuck Lever
The intent of the assertion in skb_truesize_check() is to check for skb->truesize being decremented too much by other code, resulting in a wraparound below zero. The type of the right side of the comparison causes the compiler to promote the left side to an unsigned type, despite the presence of an explicit type cast. This defeats the check for negativity. Ensure both sides of the comparison are a signed type to prevent the implicit type conversion. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[NET]: Make helper to get dst entry and "use" itPavel Emelyanov
There are many places that get the dst entry, increase the __use counter and set the "lastuse" time stamp. Make a helper for this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[INET]: Add a missing include <linux/vmalloc.h> to inet_hashtables.hEric Dumazet
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10voyager: use struct instead of PARAMRandy Dunlap
Use struct boot_params instead of PARAM + 0xoffsets. Fixes one of many Voyager build problems. arch/x86/kernel/setup_32.c:543: error: 'PARAM' undeclared (first use in this function) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-09Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] IOSAPIC bogus error cleanup [IA64] Update printing of feature set bits [IA64] Fix IOSAPIC delivery mode setting [IA64] XPC heartbeat timer function must run on CPU 0 [IA64] Clean up /proc/interrupts output [IA64] Disable/re-enable CPE interrupts on Altix [IA64] Clean-up McKinley Errata message [IA64] Add gate.lds to list of files ignored by Git [IA64] Fix section mismatch in contig.c version of per_cpu_init() [IA64] Wrong args to memset in efi_gettimeofday() [IA64] Remove duplicate includes from ia32priv.h [IA64] fix number of bytes zeroed by sys_fw_init() in arch/ia64/hp/sim/boot/fw-emu.c [IA64] Fix perfmon sysctl directory modes
2007-11-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-schedLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: proper prototype for kernel/sched.c:migration_init() sched: avoid large irq-latencies in smp-balancing sched: fix copy_namespace() <-> sched_fork() dependency in do_fork sched: clean up the wakeup preempt check, #2 sched: clean up the wakeup preempt check sched: wakeup preemption fix sched: remove PREEMPT_RESTRICT sched: turn off PREEMPT_RESTRICT KVM: fix !SMP build error x86: make nmi_cpu_busy() always defined x86: make ipi_handler() always defined sched: cleanup, use NSEC_PER_MSEC and NSEC_PER_SEC sched: reintroduce SMP tunings again sched: restore deterministic CPU accounting on powerpc sched: fix delay accounting regression sched: reintroduce the sched_min_granularity tunable sched: documentation: place_entity() comments sched: fix vslice
2007-11-09Merge master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (26 commits) sh: remove dead config symbols from SH code sh: Kill off broken snapgear ds1302 code. sh: Add a dummy vga.h. rtc: rtc-sh: Zero out tm value for invalid rtc states. rtc: sh-rtc: Handle rtc_device_register() failure properly. sh: Fix heartbeart on Solution Engine series sh: Remove SCI_NPORTS from sh-sci.h sh: Fix up PAGE_KERNEL_PCC() for nommu. sh: hs7751rvoip: Kill off dead IPR IRQ mappings. sh: hs7751rvoip: irq.c needs linux/interrupt.h. sh: Kill off __{copy,clear}_user_page(). sh: Optimized copy_{to,from}_user_page() for SH-4. sh: Wire up clear_user_highpage(). sh: Kill off the remaining ST40 cruft. superhyway: Handle device_register() retval properly. sh: kgdb sysrq depends on magic sysrq. sh: Add -Werror for clean directories. sh: Fix up kgdb build with modular sh-sci. sh: Export __{s,u}divsi3_i4i on all CPUs. sh: Fix up kgdb-on-NMI branch target. ...
2007-11-09Merge master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] pxa: fix one-shot timer mode [ARM] 4645/1: Cyberpro: Trivial fix to restore 16bpp mode. [ARM] 4644/2: fix flush_kern_tlb_range() in module space [ARM] Allow watchdog drivers to be selected again [ARM] 4633/1: omap build fix when FB enabled [ARM] 4642/2: netX: default config for netx based boards [ARM] 4641/2: netX: fix kobject_name type [ARM] Fix iop3xx macro
2007-11-09Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: Add UNPLUG traces to all appropriate places block: fix requeue handling in blk_queue_invalidate_tags() mmc: Fix sg helper copy-and-paste error pktcdvd: fix BUG caused by sysfs module reference semantics change ioprio: allow sys_ioprio_set() value of 0 to reset ioprio setting cfq_idle_class_timer: add paranoid checks for jiffies overflow cfq: fix IOPRIO_CLASS_IDLE delays cfq: fix IOPRIO_CLASS_IDLE accounting
2007-11-09Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (37 commits) [POWERPC] EEH: Make sure warning message is printed [POWERPC] Make altivec code in swsusp_32.S depend on CONFIG_ALTIVEC [POWERPC] windfarm: Fix windfarm thread freezer interaction [POWERPC] Fix si_addr value on low level hash failures [POWERPC] Refresh ppc64_defconfig and enable pasemi-related options [POWERPC] pasemi: Update defconfig [POWERPC] iSeries: Fix ref counting in vio setup [POWERPC] ] Fix memset size error [POWERPC] Fix link errors for allyesconfig [POWERPC] iSeries_init_IRQ non-PCI tidy [POWERPC] Change fallocate to match unistd.h on powerpc [POWERPC] EEH: Avoid crash on null device [POWERPC] EEH: Drivers that need reset trump others [POWERPC] EEH: Clean up comments [POWERPC] Fix off-by-one error in setting decrementer on Book E/4xx (v2) [POWERPC] Fix switch_slb handling of 1T ESID values [POWERPC] Fix build failure when CONFIG_VIRT_CPU_ACCOUNTING is not defined [POWERPC] Include udbg.h when using udbg_printf [POWERPC] Fix cache line vs. block size confusion [POWERPC] Fix sysctl table check failure on PowerMac ...
2007-11-09frv: Remove bogus NO_IRQ = -1 defineAlan Cox
The old NO_IRQ define some platforms had was long ago declared obsolete and wrong. FRV should therefore not be re-introducing this, especially as IRQs are usually unsigned in the kernel. The "no IRQ" case is defined to be zero and Linus made this rather clear at the time. arch/frv shows no dependancy on this but it might show up driver fixes needing doing I guess Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-09Merge branch 'master' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Use "is_power_of_2" macro for simplicity. [SPARC]: Remove duplicate includes.
2007-11-09sched: proper prototype for kernel/sched.c:migration_init()Adrian Bunk
This patch adds a proper prototype for migration_init() in include/linux/sched.h Since there's no point in always returning 0 to a caller that doesn't check the return value it also changes the function to return void. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09sched: avoid large irq-latencies in smp-balancingPeter Zijlstra
SMP balancing is done with IRQs disabled and can iterate the full rq. When rqs are large this can cause large irq-latencies. Limit the nr of iterations on each run. This fixes a scheduling latency regression reported by the -rt folks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09sched: remove PREEMPT_RESTRICTIngo Molnar
remove PREEMPT_RESTRICT. (this is a separate commit so that any regression related to the removal itself is bisectable) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09KVM: fix !SMP build errorIngo Molnar
fix a !SMP build error: drivers/kvm/kvm_main.c: In function 'kvm_flush_remote_tlbs': drivers/kvm/kvm_main.c:220: error: implicit declaration of function 'smp_call_function_mask' (and also avoid unused function warning related to up_smp_call_function() not making use of the 'func' parameter.) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09sched: restore deterministic CPU accounting on powerpcPaul Mackerras
Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been broken on powerpc, because we end up counting user time twice: once in timer_interrupt() and once in update_process_times(). This fixes the problem by pulling the code in update_process_times that updates utime and stime into a separate function called account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined, there is a version of account_process_tick in kernel/timer.c that simply accounts a whole tick to either utime or stime as before. If CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to implement account_process_tick. This also lets us simplify the s390 code a bit; it means that the s390 timer interrupt can now call update_process_times even when CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a suitable account_process_tick(). account_process_tick() now takes the task_struct * as an argument. Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09sched: reintroduce the sched_min_granularity tunablePeter Zijlstra
we lost the sched_min_granularity tunable to a clever optimization that uses the sched_latency/min_granularity ratio - but the ratio is quite unintuitive to users and can also crash the kernel if the ratio is set to 0. So reintroduce the min_granularity tunable, while keeping the ratio maintained internally. no functionality changed. [ mingo@elte.hu: some fixlets. ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09[IA64] Update printing of feature set bitsRuss Anderson
Newer Itanium versions have added additional processor feature set bits. This patch prints all the implemented feature set bits. Some bit descriptions have not been made public. For those bits, a generic "Feature set X bit Y" message is printed. Bits that are not implemented will no longer be printed. Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-11-09Add UNPLUG traces to all appropriate placesAlan D. Brunelle
Added blk_unplug interface, allowing all invocations of unplugs to result in a generated blktrace UNPLUG. Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-09sh: remove dead config symbols from SH codeJiri Olsa
Signed-off-by: Jiri Olsa <olsajiri@gmail.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-11-08[ARM] 4644/2: fix flush_kern_tlb_range() in module spaceKevin Hilman
For kernel addresses between TASK_SIZE and PAGE_OFFSET, flush_tlb_kern_range() does not work as would be expected. The TLB invalidate works with a matching ASID, or on entries marked as global. The set_pte_at() macro marks addresses >= PAGE_OFFSET as global, but not addresses from TASK_SIZE to PAGE_OFFSET, which are also kernel addresses. The result is that the entries in this range are not actually invalidated by flush_tlb_kern_range(). This patch instead marks addresses >= TASK_SIZE as global. Signed-off-by: Satoru Fujii <s-fujii@ct.jp.nec.com> Signed-off-by: Kevin Hilman <khilman@mvista.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-11-08Merge branch 'page_colouring_despair'Paul Mundt
2007-11-08Merge branch 'for-2.6.24' of ↵Paul Mackerras
master.kernel.org:/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx into merge
2007-11-08[POWERPC] Change fallocate to match unistd.h on powerpcPatrick Mansfield
Fix the fallocate system call on powerpc to match its unistd.h. This implies none of these system calls are currently working with the unistd.h sys call values: fallocate signalfd timerfd eventfd sync_file_range2 Signed-off-by: Patrick Mansfield <patmans@us.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-11-08[POWERPC] Fix off-by-one error in setting decrementer on Book E/4xx (v2)Paul Mackerras
The decrementer in Book E and 4xx processors interrupts on the transition from 1 to 0, rather than on the 0 to -1 transition as on 64-bit server and 32-bit "classic" (6xx/7xx/7xxx) processors. At the moment we subtract 1 from the count of how many decrementer ticks are required before the next interrupt before putting it into the decrementer, which is correct for server/classic processors, but could possibly cause the interrupt to happen too early on Book E and 4xx if the timebase/decrementer frequency is low. This fixes the problem by making set_dec subtract 1 from the count for server and classic processors, instead of having the callers subtract 1. Since set_dec already had a bunch of ifdefs to handle different processor types, there is no net increase in ugliness. :) Note that calling set_dec(0) may not generate an interrupt on some processors. To make sure that decrementer_set_next_event always calls set_dec with an interval of at least 1 tick, we set min_delta_ns of the decrementer_clockevent to correspond to 2 ticks (2 rather than 1 to compensate for truncations in the conversions between ticks and ns). This also removes a redundant call to set the decrementer to 0x7fffffff - it was already set to that earlier in timer_interrupt. Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-11-07[NETLINK]: Fix unicast timeoutsPatrick McHardy
Commit ed6dcf4a in the history.git tree broke netlink_unicast timeouts by moving the schedule_timeout() call to a new function that doesn't propagate the remaining timeout back to the caller. This means on each retry we start with the full timeout again. ipc/mqueue.c seems to actually want to wait indefinitely so this behaviour is retained. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[INET]: Remove per bucket rwlock in tcp/dccp ehash table.Eric Dumazet
As done two years ago on IP route cache table (commit 22c047ccbc68fa8f3fa57f0e8f906479a062c426) , we can avoid using one lock per hash bucket for the huge TCP/DCCP hash tables. On a typical x86_64 platform, this saves about 2MB or 4MB of ram, for litle performance differences. (we hit a different cache line for the rwlock, but then the bucket cache line have a better sharing factor among cpus, since we dirty it less often). For netstat or ss commands that want a full scan of hash table, we perform fewer memory accesses. Using a 'small' table of hashed rwlocks should be more than enough to provide correct SMP concurrency between different buckets, without using too much memory. Sizing of this table depends on num_possible_cpus() and various CONFIG settings. This patch provides some locking abstraction that may ease a future work using a different model for TCP/DCCP table. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPVS]: Synchronize closing of ConnectionsRumen G. Bogdanovski
This patch makes the master daemon to sync the connection when it is about to close. This makes the connections on the backup to close or timeout according their state. Before the sync was performed only if the connection is in ESTABLISHED state which always made the connections to timeout in the hard coded 3 minutes. However the Andy Gospodarek's patch ([IPVS]: use proper timeout instead of fixed value) effectively did nothing more than increasing this to 15 minutes (Established state timeout). So this patch makes use of proper timeout since it syncs the connections on status changes to FIN_WAIT (2min timeout) and CLOSE (10sec timeout). However if the backup misses CLOSE hopefully it did not miss FIN_WAIT. Otherwise we will just have to wait for the ESTABLISHED state timeout. As it is without this patch. This way the number of the hanging connections on the backup is kept to minimum. And very few of them will be left to timeout with a long timeout. This is important if we want to make use of the fix for the real server overcommit on master/backup fail-over. Signed-off-by: Rumen G. Bogdanovski <rumen@voicecho.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPVS]: Bind connections on stanby if the destination existsRumen G. Bogdanovski
This patch fixes the problem with node overload on director fail-over. Given the scenario: 2 nodes each accepting 3 connections at a time and 2 directors, director failover occurs when the nodes are fully loaded (6 connections to the cluster) in this case the new director will assign another 6 connections to the cluster, If the same real servers exist there. The problem turned to be in not binding the inherited connections to the real servers (destinations) on the backup director. Therefore: "ipvsadm -l" reports 0 connections: root@test2:~# ipvsadm -l IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP test2.local:5999 wlc -> node473.local:5999 Route 1000 0 0 -> node484.local:5999 Route 1000 0 0 while "ipvs -lnc" is right root@test2:~# ipvsadm -lnc IPVS connection entries pro expire state source virtual destination TCP 14:56 ESTABLISHED 192.168.0.10:39164 192.168.0.222:5999 192.168.0.51:5999 TCP 14:59 ESTABLISHED 192.168.0.10:39165 192.168.0.222:5999 192.168.0.52:5999 So the patch I am sending fixes the problem by binding the received connections to the appropriate service on the backup director, if it exists, else the connection will be handled the old way. So if the master and the backup directors are synchronized in terms of real services there will be no problem with server over-committing since new connections will not be created on the nonexistent real services on the backup. However if the service is created later on the backup, the binding will be performed when the next connection update is received. With this patch the inherited connections will show as inactive on the backup: root@test2:~# ipvsadm -l IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP test2.local:5999 wlc -> node473.local:5999 Route 1000 0 1 -> node484.local:5999 Route 1000 0 1 rumen@test2:~$ cat /proc/net/ip_vs IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP C0A800DE:176F wlc -> C0A80033:176F Route 1000 0 1 -> C0A80032:176F Route 1000 0 1 Regards, Rumen Bogdanovski Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Rumen G. Bogdanovski <rumen@voicecho.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2007-11-07[TTY]: Fix network driver interactions with TCGET/SET calls.Alan Cox
Dave Miller noted various cases where line disciplines for things like ppp go poking around in termios themselves in ways that broke with the new termios code. Rather than have them all learning about termios internals provide proper methods for this - tty_mode_ioctl() This handles all the terminal mode handling for speed/carrier etc and none of the methods are ldisc dependant so they can be called by any user - tty_perform_flush() This extracts the flush functionality and enables pppd the ppp layer to share it cleanly. The existing n_tty_ioctl code is refactored in this patch to provide the new functions and to call them itself appropriately. This patch has no (intended) behaviour changes and simply prepares for the other fixes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPV4]: Compact some ifdefs in the fib code.Pavel Emelyanov
There are places that check for CONFIG_IP_MULTIPLE_TABLES twice in the same file, but the internals of these #ifdefs can be merged. As a side effect - remove one ifdef from inside a function. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[NET]: Kill proc_net_create()David S. Miller
There are no more users. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA ↵Eric Dumazet
way. "struct proto" currently uses an array stats[NR_CPUS] to track change on 'inuse' sockets per protocol. If NR_CPUS is big, this means we use a big memory area for this. Moreover, all this memory area is located on a single node on NUMA machines, increasing memory pressure on the boot node. In this patch, I tried to : - Keep a fast !CONFIG_SMP implementation - Keep a fast CONFIG_SMP implementation for often used protocols (tcp,udp,raw,...) - Introduce a NUMA efficient implementation Some helper macros are defined in include/net/sock.h These macros take into account CONFIG_SMP If a "struct proto" is declared without using DEFINE_PROTO_INUSE / REF_PROTO_INUSE macros, it will automatically use a default implementation, using a dynamically allocated percpu zone. This default implementation will be NUMA efficient, but might use 32/64 bytes per possible cpu because of current alloc_percpu() implementation. However it still should be better than previous implementation based on stats[NR_CPUS] field. When a "struct proto" is changed to use the new macros, we use a single static "int" percpu variable, lowering the memory and cpu costs, still preserving NUMA efficiency. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPV4]: Clean the ip_sockglue.c from some ugly ifdefsPavel Emelyanov
The #idfed CONFIG_IP_MROUTE is sometimes places inside the if-s, which looks completely bad. Similar ifdefs inside the functions looks a bit better, but they are also not recommended to be used. Provide an ifdef-ed ip_mroute_opt() helper to cleanup the code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[NETFILTER]: Sort matches/targets in Kbuild fileJan Engelhardt
Sort matches and targets in the Kbuild file. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07sh: Add a dummy vga.h.Paul Mundt
We have nothing to do here, but there are continually drivers that fail to build without it. Stub it in. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-11-07[SPARC64]: Use "is_power_of_2" macro for simplicity.Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07sh: Fix up PAGE_KERNEL_PCC() for nommu.Paul Mundt
PAGE_KERNEL_PCC() takes two arguments, which weren't reflected in the nommu case. Fix it up. Signed-off-by: Paul Mundt <lethal@linux-sh.org>