summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2006-10-05IRQ: Maintain regs pointer globally rather than passing to IRQ handlersDavid Howells
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead of passing regs around manually through all ~1800 interrupt handlers in the Linux kernel. The regs pointer is used in few places, but it potentially costs both stack space and code to pass it around. On the FRV arch, removing the regs parameter from all the genirq function results in a 20% speed up of the IRQ exit path (ie: from leaving timer_interrupt() to leaving do_IRQ()). Where appropriate, an arch may override the generic storage facility and do something different with the variable. On FRV, for instance, the address is maintained in GR28 at all times inside the kernel as part of general exception handling. Having looked over the code, it appears that the parameter may be handed down through up to twenty or so layers of functions. Consider a USB character device attached to a USB hub, attached to a USB controller that posts its interrupts through a cascaded auxiliary interrupt controller. A character device driver may want to pass regs to the sysrq handler through the input layer which adds another few layers of parameter passing. I've build this code with allyesconfig for x86_64 and i386. I've runtested the main part of the code on FRV and i386, though I can't test most of the drivers. I've also done partial conversion for powerpc and MIPS - these at least compile with minimal configurations. This will affect all archs. Mostly the changes should be relatively easy. Take do_IRQ(), store the regs pointer at the beginning, saving the old one: struct pt_regs *old_regs = set_irq_regs(regs); And put the old one back at the end: set_irq_regs(old_regs); Don't pass regs through to generic_handle_irq() or __do_IRQ(). In timer_interrupt(), this sort of change will be necessary: - update_process_times(user_mode(regs)); - profile_tick(CPU_PROFILING, regs); + update_process_times(user_mode(get_irq_regs())); + profile_tick(CPU_PROFILING); I'd like to move update_process_times()'s use of get_irq_regs() into itself, except that i386, alone of the archs, uses something other than user_mode(). Some notes on the interrupt handling in the drivers: (*) input_dev() is now gone entirely. The regs pointer is no longer stored in the input_dev struct. (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does something different depending on whether it's been supplied with a regs pointer or not. (*) Various IRQ handler function pointers have been moved to type irq_handler_t. Signed-Off-By: David Howells <dhowells@redhat.com> (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
2006-10-05IRQ: Typedef the IRQ handler function typeDavid Howells
Typedef the IRQ handler function type. Signed-Off-By: David Howells <dhowells@redhat.com> (cherry picked from 1356d1e5fd256997e3d3dce0777ab787d0515c7a commit)
2006-10-05IRQ: Typedef the IRQ flow handler function typeDavid Howells
Typedef the IRQ flow handler function type. Signed-Off-By: David Howells <dhowells@redhat.com> (cherry picked from 8e973fbdf5716b93a0a8c0365be33a31ca0fa351 commit)
2006-10-04Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (54 commits) [SCSI] Initial Commit of qla4xxx [SCSI] raid class: handle component-add errors [SCSI] SCSI megaraid_sas: handle thrown errors [SCSI] SCSI aic94xx: handle sysfs errors [SCSI] SCSI st: fix error handling in module init, sysfs [SCSI] SCSI sd: fix module init/exit error handling [SCSI] SCSI osst: add error handling to module init, sysfs [SCSI] scsi: remove hosts.h [SCSI] scsi: Scsi_Cmnd convertion in aic7xxx_old.c [SCSI] megaraid_sas: sets ioctl timeout and updates version,changelog [SCSI] megaraid_sas: adds tasklet for cmd completion [SCSI] megaraid_sas: prints pending cmds before setting hw_crit_error [SCSI] megaraid_sas: function pointer for disable interrupt [SCSI] megaraid_sas: frame count optimization [SCSI] megaraid_sas: FW transition and q size changes [SCSI] qla2xxx: Update version number to 8.01.07-k2. [SCSI] qla2xxx: Stall mid-layer error handlers while rport is blocked. [SCSI] qla2xxx: Add MODULE_FIRMWARE tags. [SCSI] qla2xxx: Add support for host port state FC transport attribute. [SCSI] qla2xxx: Add support for fabric name FC transport attribute. ...
2006-10-04[SCSI] raid class: handle component-add errorsJeff Garzik
Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-04Merge branch 'for-2.6.19' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
* 'for-2.6.19' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] Document bi_sector and sector_t [PATCH] helper function for retrieving scsi_cmd given host based block layer tag
2006-10-04Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Remove remaining reference to ite_gpio.h from Kbuild [MIPS] PNX8550 fixups
2006-10-04[PATCH] Document bi_sector and sector_tRoger Gammans
Signed-Off-By: Roger Gammans <rgammans@computer-surgery.co.uk> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-04[PATCH] helper function for retrieving scsi_cmd given host based block layer tagDavid C Somayajulu
This was necessitated by the need for a function to get back to a scsi_cmnd, when an hba the posts its (corresponding) completion interrupt with a block layer tag as its reference. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-04[PATCH] serial: Rename PORT_AT91 -> PORT_ATMELHaavard Skinnemoen
The at91_serial driver can be used with both AT32 and AT91 devices from Atmel and has therefore been renamed atmel_serial. The only thing left is to rename PORT_AT91 PORT_ATMEL. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Andrew Victor <andrew@sanpeople.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[MIPS] Remove remaining reference to ite_gpio.h from KbuildDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-10-04Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/confighLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/configh: Remove all inclusions of <linux/config.h> Manually resolved trivial path conflicts due to removed files in the sound/oss/ subdirectory.
2006-10-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6: (292 commits) [GFS2] Fix endian bug for de_type [GFS2] Initialize SELinux extended attributes at inode creation time. [GFS2] Move logging code into log.c (mostly) [GFS2] Mark nlink cleared so VFS sees it happen [GFS2] Two redundant casts removed [GFS2] Remove uneeded endian conversion [GFS2] Remove duplicate sb reading code [GFS2] Mark metadata reads for blktrace [GFS2] Remove iflags.h, use FS_ [GFS2] Fix code style/indent in ops_file.c [GFS2] streamline-generic_file_-interfaces-and-filemap gfs fix [GFS2] Remove readv/writev methods and use aio_read/aio_write instead (gfs bits) [GFS2] inode-diet: Eliminate i_blksize from the inode structure [GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs) [GFS2] Fix typo in last patch [GFS2] Fix direct i/o logic in filemap.c [GFS2] Fix bug in Makefiles for lock modules [GFS2] Remove (extra) fs_subsys declaration [GFS2/DLM] Fix trailing whitespace [GFS2] Tidy up meta_io code ...
2006-10-04Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [XFRM]: BEET mode [TCP]: Kill warning in tcp_clean_rtx_queue(). [NET_SCHED]: Remove old estimator implementation [ATM]: [zatm] always *pcr in alloc_shaper() [ATM]: [ambassador] Change the return type to reflect reality [ATM]: kmalloc to kzalloc patches for drivers/atm [TIPC]: fix printk warning [XFRM]: Clearing xfrm_policy_count[] to zero during flush is incorrect. [XFRM] STATE: Use destination address for src hash. [NEIGH]: always use hash_mask under tbl lock [UDP]: Fix MSG_PROBE crash [UDP6]: Fix flowi clobbering [NET_SCHED]: Revert "HTB: fix incorrect use of RB_EMPTY_NODE" [NETFILTER]: ebt_mark: add or/and/xor action support to mark target [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function [NETFILTER]: Honour source routing for LVS-NAT [NETFILTER]: add type parameter to ip_route_me_harder [NETFILTER]: Kconfig: fix xt_physdev dependencies
2006-10-04Merge master.kernel.org:/pub/scm/linux/kernel/git/willy/parisc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/willy/parisc-2.6: (41 commits) [PARISC] Kill wall_jiffies use [PARISC] Honour "panic_on_oops" sysctl [PARISC] Fix fs/binfmt_som.c [PARISC] Export clear_user_page to modules [PARISC] Make DMA routines more stubby [PARISC] Define pci_get_legacy_ide_irq [PARISC] Fix CONFIG_DEBUG_SPINLOCK [PARISC] Fix HPUX compat compile with current GCC [PARISC] Fix iounmap compile warning [PARISC] Add support for Quicksilver AGPGART [PARISC] Move LBA and SBA register defines to the common ropes.h [PARISC] Create shared <asm/ropes.h> header [PARISC] Stash the lba_device in its struct device drvdata [PARISC] Generalize IS_ASTRO et al to take a parisc_device like [PARISC] Pretty print the name of the lba type on kernel boot [PARISC] Remove some obsolete comments and I checked that Reo is similar to Ike [PARISC] Add hardware found in the rp8400 [PARISC] Allow nested interrupts [PARISC] Further updates to timer_interrupt() [PARISC] remove halftick and copy clocktick to local var (gcc can optimize usage) ...
2006-10-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpcLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (25 commits) [POWERPC] Add support for the mpc832x mds board [POWERPC] Add initial support for the e300c2 core [POWERPC] Add MPC8360EMDS default dts file [POWERPC] Add MPC8360EMDS board support [POWERPC] Add QUICC Engine (QE) infrastructure [POWERPC] Add QE device tree node definition [POWERPC] Don't try to just continue if xmon has no input device [POWERPC] Fix a printk in pseries_mpic_init_IRQ [POWERPC] Get default baud rate in udbg_scc [POWERPC] Fix zImage.coff on oldworld PowerMac [POWERPC] Fix xmon=off and cleanup xmon initialisation [POWERPC] Cleanup include/asm-powerpc/xmon.h [POWERPC] Update swim3 printk after blkdev.h change [POWERPC] Cell interrupt rework POWERPC: mpc82xx merge: board-specific/platform stuff(resend) POWERPC: 8272ads merge to powerpc: common stuff POWERPC: Added devicetree for mpc8272ads board [POWERPC] iSeries has no legacy I/O [POWERPC] implement BEGIN/END_FW_FTR_SECTION [POWERPC] iSeries does not need pcibios_fixup_resources ...
2006-10-04Merge branch 'audit.b32' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b32' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] message types updated [PATCH] name_count array overrun [PATCH] PPID filtering fix [PATCH] arch filter lists with < or > should not be accepted
2006-10-04Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata] pata_artop: kill gcc warning [PATCH] libata: turn off NCQ if queue depth is adjusted to 1 [PATCH] libata: cosmetic changes to constants [libata] DocBook minor updates, fixes [libata] PCI ID table cleanup in various drivers [libata] Print out Status register, if a BSY-sleep takes too long [libata] init probe_ent->private_data in a common location [libata] minor PCI IDE probe fixes and cleanups [libata] Use new PCI_VDEVICE() macro to dramatically shorten ID lists [PATCH] Fix reference of uninitialised memory in ata_device_add()
2006-10-04[PATCH] The scheduled removal of some OSS driversAdrian Bunk
This patch contains the scheduled removal of OSS drivers that: - have ALSA drivers for the same hardware without known regressions and - whose Kconfig options have been removed in 2.6.17. [michal.k.k.piotrowski@gmail.com: build fix] Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] RCU: CREDITS and MAINTAINERSJosh Triplett
Add MAINTAINERS entry for Read-Copy Update (RCU), listing Dipankar Sarma as maintainer, and giving the URL for Paul McKenney's RCU site. Add MAINTAINERS entry for rcutorture, listing myself as maintainer. Add CREDITS entries for developers of RCU, RCU variants, and rcutorture. Use Paul McKenney's preferred email address in include/linux/rcupdate.h . Signed-off-by: Josh Triplett <josh@freedesktop.org> Cc: Paul McKenney <paulmck@us.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] rcu: simplify/improve batch tuningOleg Nesterov
Kill a hard-to-calculate 'rsinterval' boot parameter and per-cpu rcu_data.last_rs_qlen. Instead, it adds adds a flag rcu_ctrlblk.signaled, which records the fact that one of CPUs has sent a resched IPI since the last rcu_start_batch(). Roughly speaking, we need two rcu_start_batch()s in order to move callbacks from ->nxtlist to ->donelist. This means that when ->qlen exceeds qhimark and continues to grow, we should send a resched IPI, and then do it again after we gone through a quiescent state. On the other hand, if it was already sent, we don't need to do it again when another CPU detects overflow of the queue. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] SRCU: report out-of-memory errorsAlan Stern
Currently the init_srcu_struct() routine has no way to report out-of-memory errors. This patch (as761) makes it return -ENOMEM when the per-cpu data allocation fails. The patch also makes srcu_init_notifier_head() report a BUG if a notifier head can't be initialized. Perhaps it should return -ENOMEM instead, but in the most likely cases where this might occur I don't think any recovery is possible. Notifier chains generally are not created dynamically. [akpm@osdl.org: avoid statement-with-side-effect in macro] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] Add SRCU-based notifier chainsAlan Stern
This patch (as751) adds a new type of notifier chain, based on the SRCU (Sleepable Read-Copy Update) primitives recently added to the kernel. An SRCU notifier chain is much like a blocking notifier chain, in that it must be called in process context and its callout routines are allowed to sleep. The difference is that the chain's links are protected by the SRCU mechanism rather than by an rw-semaphore, so calling the chain has extremely low overhead: no memory barriers and no cache-line bouncing. On the other hand, unregistering from the chain is expensive and the chain head requires special runtime initialization (plus cleanup if it is to be deallocated). SRCU notifiers are appropriate for notifiers that will be called very frequently and for which unregistration occurs very seldom. The proposed "task notifier" scheme qualifies, as may some of the network notifiers. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Acked-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] srcu-3: RCU variant permitting read-side blockingPaul E. McKenney
Updated patch adding a variant of RCU that permits sleeping in read-side critical sections. SRCU is as follows: o Each use of SRCU creates its own srcu_struct, and each srcu_struct has its own set of grace periods. This is critical, as it prevents one subsystem with a blocking reader from holding up SRCU grace periods for other subsystems. o The SRCU primitives (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) all take a pointer to a srcu_struct. o The SRCU primitives must be called from process context. o srcu_read_lock() returns an int that must be passed to the matching srcu_read_unlock(). Realtime RCU avoids the need for this by storing the state in the task struct, but SRCU needs to allow a given code path to pass through multiple SRCU domains -- storing state in the task struct would therefore require either arbitrary space in the task struct or arbitrary limits on SRCU nesting. So I kicked the state-storage problem up to the caller. Of course, it is not permitted to call synchronize_srcu() while in an SRCU read-side critical section. o There is no call_srcu(). It would not be hard to implement one, but it seems like too easy a way to OOM the system. (Hey, we have enough trouble with call_rcu(), which does -not- permit readers to sleep!!!) So, if you want it, please tell me why... [josht@us.ibm.com: sparse notation] Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] htirq: tidy up the htirq codeEric W. Biederman
This moves the declarations for the architecture helpers into include/linux/htirq.h from the generic include/linux/pci.h. Hopefully this will make this distinction clearer. htirq.h is included where it is needed. The dependency on the msi code is fixed and removed. The Makefile is tidied up. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] msi: refactor and move the msi irq_chip into the arch codeEric W. Biederman
It turns out msi_ops was simply not enough to abstract the architecture specific details of msi. So I have moved the resposibility of constructing the struct irq_chip to the architectures, and have two architecture specific functions arch_setup_msi_irq, and arch_teardown_msi_irq. For simple architectures those functions can do all of the work. For architectures with platform dependencies they can call into the appropriate platform code. With this msi.c is finally free of assuming you have an apic, and this actually takes less code. The helpers for the architecture specific code are declared in the linux/msi.h to keep them separate from the msi functions used by drivers in linux/pci.h Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] msi: simplify msi sanity checks by adding with generic irq codeEric W. Biederman
Currently msi.c is doing sanity checks that make certain before an irq is destroyed it has no more users. By adding irq_has_action I can perform the test is a generic way, instead of relying on a msi specific data structure. By performing the core check in dynamic_irq_cleanup I ensure every user of dynamic irqs has a test present and we don't free resources that are in use. In msi.c this allows me to kill the attrib.state member of msi_desc and all of the assciated code to maintain it. To keep from freeing data structures when irq cleanup code is called to soon changing dyanamic_irq_cleanup is insufficient because there are msi specific data structures that are also not safe to free. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] Initial generic hypertransport interrupt supportEric W. Biederman
This patch implements two functions ht_create_irq and ht_destroy_irq for use by drivers. Several other functions are implemented as helpers for arch specific irq_chip handlers. The driver for the card I tested this on isn't yet ready to be merged. However this code is and hypertransport irqs are in use in a few other places in the kernel. Not that any of this will get merged before 2.6.19 Because the ipath-ht400 is slightly out of spec this code will need to be generalized to work there. I think all of the powerpc uses are for a plain interrupt controller in a chipset so support for native hypertransport devices is a little less interesting. However I think this is a half way decent model on how to separate arch specific and generic helper code, and I think this is a functional model of how to get the architecture dependencies out of the msi code. [akpm@osdl.org: Kconfig fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg KH <greg@kroah.com> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] Add Hypertransport capability definesEric W. Biederman
This adds defines for the hypertransport capability subtypes and starts using them a little. [akpm@osdl.org: fix typo] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: irq: generalize the check for HARDIRQ_BITSEric W. Biederman
This patch adds support for systems that cannot receive every interrupt on a single cpu simultaneously, in the check to see if we have enough HARDIRQ_BITS. MAX_HARDIRQS_PER_CPU becomes the count of the maximum number of hardare generated interrupts per cpu. On architectures that support per cpu interrupt delivery this can be a significant space savings and scalability bonus. This patch adds support for systems that cannot receive every interrupt on Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: irq: remove msi hacksEric W. Biederman
Because of the nasty way that CONFIG_PCI_MSI was implemented we wound up with set_irq_info and set_native_irq_info, with move_irq and move_native_irq. Both functions did the same thing but they were built and called under different circumstances. Now that the msi hacks are gone we can kill move_irq and set_irq_info. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: irq: add a dynamic irq creation APIEric W. Biederman
With the msi support comes a new concept in irq handling, irqs that are created dynamically at run time. Currently the msi code allocates irqs backwards. First it allocates a platform dependent routing value for an interrupt the ``vector'' and then it figures out from the vector which irq you are on. This msi backwards allocator suffers from two basic problems. The allocator suffers because it is trying to do something that is architecture specific in a generic way making it brittle, inflexible, and tied to tightly to the architecture implementation. The alloctor also suffers from it's very backwards nature as it has tied things together that should have no dependencies. To solve the basic dynamic irq allocation problem two new architecture specific functions are added: create_irq and destroy_irq. create_irq takes no input and returns an unused irq number, that won't be reused until it is returned to the free poll with destroy_irq. The irq then can be used for any purpose although the only initial consumer is the msi code. destroy_irq takes an irq number allocated with create_irq and returns it to the free pool. Making this functionality per architecture increases the simplicity of the irq allocation code and increases it's flexibility. dynamic_irq_init() and dynamic_irq_cleanup() are added to automate the irq_desc initializtion that should happen for dynamic irqs. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: msi: refactor the msi_opsEric W. Biederman
The current msi_ops are short sighted in a number of ways, this patch attempts to fix the glaring deficiences. - Report in msi_ops if a 64bit address is needed in the msi message, so we can fail 32bit only msi structures. - Send and receive a full struct msi_msg in both setup and target. This is a little cleaner and allows for architectures that need to modify the data to retarget the msi interrupt to a different cpu. - In target pass in the full cpu mask instead of just the first cpu in case we can make use of the full cpu mask. - Operate in terms of irqs and not vectors, currently there is still a 1-1 relationship but on architectures other than ia64 I expect this will change. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: msi: implement helper functions read_msi_msg and write_msi_msgEric W. Biederman
In support of this I also add a struct msi_msg that captures the the two address and one data field ina typical msi message, and I remember the pos and if the address is 64bit in struct msi_desc. This makes the code a little more readable and easier to maintain, and paves the way to further simplfications. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: irq: add moved_masked_irqEric W. Biederman
Currently move_native_irq disables and renables the irq we are migrating to ensure we don't take that irq when we are actually doing the migration operation. Disabling the irq needs to happen but sometimes doing the work is move_native_irq is too late. On x86 with ioapics the irq move sequences needs to be: edge_triggered: mask irq. move irq. unmask irq. ack irq. level_triggered: mask irq. ack irq. move irq. unmask irq. We can easily perform the edge triggered sequence, with the current defintion of move_native_irq. However the level triggered case does not map well. For that I have added move_masked_irq, to allow me to disable the irqs around both the ack and the move. Q: Why have we not seen this problem earlier? A: The only symptom I have been able to reproduce is that if we change the vector before acknowleding an irq the wrong irq is acknowledged. Since we currently are not reprogramming the irq vector during migration no problems show up. We have to mask the irq before we acknowledge the irq or else we could hit a window where an irq is asserted just before we acknowledge it. Edge triggered irqs do not have this problem because acknowledgements do not propogate in the same way. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] genirq: irq: convert the move_irq flag from a 32bit word to a single bitEric W. Biederman
The primary aim of this patchset is to remove maintenances problems caused by the irq infrastructure. The two big issues I address are an artificially small cap on the number of irqs, and that MSI assumes vector == irq. My primary focus is on x86_64 but I have touched other architectures where necessary to keep them from breaking. - To increase the number of irqs I modify the code to look at the (cpu, vector) pair instead of just looking at the vector. With a large number of irqs available systems with a large irq count no longer need to compress their irq numbers to fit. Removing a lot of brittle special cases. For acpi guys the result is that irq == gsi. - Addressing the fact that MSI assumes irq == vector takes a few more patches. But suffice it to say when I am done none of the generic irq code even knows what a vector is. In quick testing on a large Unisys x86_64 machine we stumbled over at least one driver that assumed that NR_IRQS could always fit into an 8 bit number. This driver is clearly buggy today. But this has become a class of bugs that it is now much easier to hit. This patch: This is a minor space optimization. In practice I don't think this has any affect because of our alignment constraints and the other fields but there is not point in chewing up an uncessary word and since we already read the flag field this should improve the cache hit ratio of the irq handler. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] Fix linux/nfsd/const.h for make headers_checkCedric Le Goater
make headers_check fails on linux/nfsd/const.h. Since linux/sunrpc/msg_prot.h does not seem to export anything interesting for userspace, this patch moves it in the __KERNEL__ protected section. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: nfsd4: actually use all the pieces to implement referralsJ.Bruce Fields
Use all the pieces set up so far to implement referral support, allowing return of NFS4ERR_MOVED and fs_locations attribute. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: nfsd4: xdr encoding for fs_locationsJ.Bruce Fields
Encode fs_locations attribute. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: nfsd4: fslocations data structuresManoj Naik
Define FS locations structures, some functions to manipulate them, and add code to parse FS locations in downcall and add to the exports structure. [bfields@fieldses.org: bunch of fixes and cleanups] Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: nfsd: store export path in exportJ.Bruce Fields
Store the export path in the svc_export structure instead of storing only the dentry. This will prevent the need for additional d_path calls to provide NFSv4 fs_locations support. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] Convert lockd to use the newer mutex instead of the older semaphoreNeil Brown
Both the (recently introduces) nsm_sema and the older f_sema are converted over. Cc: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: register all RPC programs with portmapper by defaultOlaf Kirch
The NFSACL patches introduced support for multiple RPC services listening on the same transport. However, only the first of these services was registered with portmapper. This was perfectly fine for nfsacl, as you traditionally do not want these to show up in a portmapper listing. The patch below changes the default behavior to always register all services listening on a given transport, but retains the old behavior for nfsacl services. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: export nsm_local_state to user space via sysctlOlaf Kirch
Every NLM call includes the client's NSM state. Currently, the Linux client always reports 0 - which seems not to cause any problems, but is not what the protocol says. This patch exposes the kernel's internal variable to user space via a sysctl, which can be set at system boot time by statd. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: match GRANTED_RES replies using cookiesOlaf Kirch
When we send a GRANTED_MSG call, we current copy the NLM cookie provided in the original LOCK call - because in 1996, some broken clients seemed to rely on this bug. However, this means the cookies are not unique, so that when the client's GRANTED_RES message comes back, we cannot simply match it based on the cookie, but have to use the client's IP address in addition. Which breaks when you have a multi-homed NFS client. The X/Open spec explicitly mentions that clients should not expect the same cookie; so one may hope that any clients that were broken in 1996 have either been fixed or rendered obsolete. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: make nlmclnt_next_cookie SMP safeOlaf Kirch
The way we incremented the NLM cookie in nlmclnt_next_cookie was not thread safe. This patch changes the counter to an atomic_t Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: lockd: optionally use hostnames for identifying peersOlaf Kirch
This patch adds the nsm_use_hostnames sysctl and module param. If set, lockd will use the client's name (as given in the NLM arguments) to find the NSM handle. This makes recovery work when the NFS peer is multi-homed, and the reboot notification arrives from a different IP than the original lock calls. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: simplify nlmsvc_invalidate_allNeilBrown
As a result of previous patches, the loop in nlmsvc_invalidate_all just sets h_expires for all client/hosts to 0 (though does it in a very complicated way). This was possibly meant to trigger early garbage collection but half the time '0' is in the future and so it infact delays garbage collection. Pre-aging the 'hosts' is not really needed at this point anyway so we throw out the loop and nlm_find_client which is no longer needed. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: lockd: make nlm_traverse_* more flexibleOlaf Kirch
This patch makes nlm_traverse{locks,blocks,shares} and friends use a function pointer rather than a "action" enum. This function pointer is given two nlm_hosts (one given by the caller, the other taken from the lock/block/share currently visited), and is free to do with them as it wants. If it returns a non-zero value, the lockd/block/share is released. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04[PATCH] knfsd: change nlm_file to use a hlistOlaf Kirch
This changes struct nlm_file and the nlm_files hash table to use a hlist instead of the home-grown lists. This allows us to remove f_hash which was only used to find the right hash chain to delete an entry from. It also increases the size of the nlm_files hash table from 32 to 128. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>