diff options
author | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 15:20:36 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 15:20:36 -0700 |
commit | 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (patch) | |
tree | 0bba044c4ce775e45a88a51686b5d9f90697ea9d /Documentation |
Linux-2.6.12-rc2v2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
Diffstat (limited to 'Documentation')
722 files changed, 177485 insertions, 0 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX new file mode 100644 index 000000000000..72dc90f8f4a7 --- /dev/null +++ b/Documentation/00-INDEX @@ -0,0 +1,294 @@ + +This is a brief list of all the files in ./linux/Documentation and what +they contain. If you add a documentation file, please list it here in +alphabetical order as well, or risk being hunted down like a rabid dog. +Please try and keep the descriptions small enough to fit on one line. + Thanks -- Paul G. + +Following translations are available on the WWW: + + - Japanese, maintained by the JF Project (JF@linux.or.jp), at + http://www.linux.or.jp/JF/ + +00-INDEX + - this file. +BK-usage/ + - directory with info on BitKeeper. +BUG-HUNTING + - brute force method of doing binary search of patches to find bug. +Changes + - list of changes that break older software packages. +CodingStyle + - how the boss likes the C code in the kernel to look. +DMA-API.txt + - DMA API, pci_ API & extensions for non-consistent memory machines. +DMA-mapping.txt + - info for PCI drivers using DMA portably across all platforms. +DocBook/ + - directory with DocBook templates etc. for kernel documentation. +IO-mapping.txt + - how to access I/O mapped memory from within device drivers. +IPMI.txt + - info on Linux Intelligent Platform Management Interface (IPMI) Driver. +IRQ-affinity.txt + - how to select which CPU(s) handle which interrupt events on SMP. +ManagementStyle + - how to (attempt to) manage kernel hackers. +MSI-HOWTO.txt + - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ. +RCU/ + - directory with info on RCU (read-copy update). +README.DAC960 + - info on Mylex DAC960/DAC1100 PCI RAID Controller Driver for Linux. +SAK.txt + - info on Secure Attention Keys. +SubmittingDrivers + - procedure to get a new driver source included into the kernel tree. +SubmittingPatches + - procedure to get a source patch included into the kernel tree. +VGA-softcursor.txt + - how to change your VGA cursor from a blinking underscore. +arm/ + - directory with info about Linux on the ARM architecture. +basic_profiling.txt + - basic instructions for those who wants to profile Linux kernel. +binfmt_misc.txt + - info on the kernel support for extra binary formats. +block/ + - info on the Block I/O (BIO) layer. +cachetlb.txt + - describes the cache/TLB flushing interfaces Linux uses. +cciss.txt + - info, major/minor #'s for Compaq's SMART Array Controllers. +cdrom/ + - directory with information on the CD-ROM drivers that Linux has. +cli-sti-removal.txt + - cli()/sti() removal guide. +computone.txt + - info on Computone Intelliport II/Plus Multiport Serial Driver. +cpqarray.txt + - info on using Compaq's SMART2 Intelligent Disk Array Controllers. +cpu-freq/ + - info on CPU frequency and voltage scaling. +cris/ + - directory with info about Linux on CRIS architecture. +crypto/ + - directory with info on the Crypto API. +debugging-modules.txt + - some notes on debugging modules after Linux 2.6.3. +device-mapper/ + - directory with info on Device Mapper. +devices.txt + - plain ASCII listing of all the nodes in /dev/ with major minor #'s. +digiepca.txt + - info on Digi Intl. {PC,PCI,EISA}Xx and Xem series cards. +dnotify.txt + - info about directory notification in Linux. +driver-model/ + - directory with info about Linux driver model. +dvb/ + - info on Linux Digital Video Broadcast (DVB) subsystem. +early-userspace/ + - info about initramfs, klibc, and userspace early during boot. +eisa.txt + - info on EISA bus support. +exception.txt + - how Linux v2.2 handles exceptions without verify_area etc. +fb/ + - directory with info on the frame buffer graphics abstraction layer. +filesystems/ + - directory with info on the various filesystems that Linux supports. +firmware_class/ + - request_firmware() hotplug interface info. +floppy.txt + - notes and driver options for the floppy disk driver. +ftape.txt + - notes about the floppy tape device driver. +hayes-esp.txt + - info on using the Hayes ESP serial driver. +highuid.txt + - notes on the change from 16 bit to 32 bit user/group IDs. +hpet.txt + - High Precision Event Timer Driver for Linux. +hw_random.txt + - info on Linux support for random number generator in i8xx chipsets. +i2c/ + - directory with info about the I2C bus/protocol (2 wire, kHz speed). +i2o/ + - directory with info about the Linux I2O subsystem. +i386/ + - directory with info about Linux on Intel 32 bit architecture. +ia64/ + - directory with info about Linux on Intel 64 bit architecture. +ide.txt + - important info for users of ATA devices (IDE/EIDE disks and CD-ROMS). +initrd.txt + - how to use the RAM disk as an initial/temporary root filesystem. +input/ + - info on Linux input device support. +io_ordering.txt + - info on ordering I/O writes to memory-mapped addresses. +ioctl-number.txt + - how to implement and register device/driver ioctl calls. +iostats.txt + - info on I/O statistics Linux kernel provides. +isapnp.txt + - info on Linux ISA Plug & Play support. +isdn/ + - directory with info on the Linux ISDN support, and supported cards. +java.txt + - info on the in-kernel binary support for Java(tm). +kbuild/ + - directory with info about the kernel build process. +kernel-doc-nano-HOWTO.txt + - mini HowTo on generation and location of kernel documentation files. +kernel-docs.txt + - listing of various WWW + books that document kernel internals. +kernel-parameters.txt + - summary listing of command line / boot prompt args for the kernel. +kobject.txt + - info of the kobject infrastructure of the Linux kernel. +laptop-mode.txt + - How to conserve battery power using laptop-mode. +ldm.txt + - a brief description of LDM (Windows Dynamic Disks). +locks.txt + - info on file locking implementations, flock() vs. fcntl(), etc. +logo.gif + - Full colour GIF image of Linux logo (penguin). +logo.txt + - Info on creator of above logo & site to get additional images from. +m68k/ + - directory with info about Linux on Motorola 68k architecture. +magic-number.txt + - list of magic numbers used to mark/protect kernel data structures. +mandatory.txt + - info on the Linux implementation of Sys V mandatory file locking. +mca.txt + - info on supporting Micro Channel Architecture (e.g. PS/2) systems. +md.txt + - info on boot arguments for the multiple devices driver. +memory.txt + - info on typical Linux memory problems. +mips/ + - directory with info about Linux on MIPS architecture. +mono.txt + - how to execute Mono-based .NET binaries with the help of BINFMT_MISC. +moxa-smartio + - info on installing/using Moxa multiport serial driver. +mtrr.txt + - how to use PPro Memory Type Range Registers to increase performance. +nbd.txt + - info on a TCP implementation of a network block device. +networking/ + - directory with info on various aspects of networking with Linux. +nfsroot.txt + - short guide on setting up a diskless box with NFS root filesystem. +nmi_watchdog.txt + - info on NMI watchdog for SMP systems. +numastat.txt + - info on how to read Numa policy hit/miss statistics in sysfs. +oops-tracing.txt + - how to decode those nasty internal kernel error dump messages. +paride.txt + - information about the parallel port IDE subsystem. +parisc/ + - directory with info on using Linux on PA-RISC architecture. +parport.txt + - how to use the parallel-port driver. +parport-lowlevel.txt + - description and usage of the low level parallel port functions. +pci.txt + - info on the PCI subsystem for device driver authors. +pm.txt + - info on Linux power management support. +pnp.txt + - Linux Plug and Play documentation. +power/ + - directory with info on Linux PCI power management. +powerpc/ + - directory with info on using Linux with the PowerPC. +preempt-locking.txt + - info on locking under a preemptive kernel. +ramdisk.txt + - short guide on how to set up and use the RAM disk. +riscom8.txt + - notes on using the RISCom/8 multi-port serial driver. +rocket.txt + - info on the Comtrol RocketPort multiport serial driver. +rpc-cache.txt + - introduction to the caching mechanisms in the sunrpc layer. +rtc.txt + - notes on how to use the Real Time Clock (aka CMOS clock) driver. +s390/ + - directory with info on using Linux on the IBM S390. +sched-coding.txt + - reference for various scheduler-related methods in the O(1) scheduler. +sched-design.txt + - goals, design and implementation of the Linux O(1) scheduler. +sched-domains.txt + - information on scheduling domains. +sched-stats.txt + - information on schedstats (Linux Scheduler Statistics). +scsi/ + - directory with info on Linux scsi support. +serial/ + - directory with info on the low level serial API. +serial-console.txt + - how to set up Linux with a serial line console as the default. +sgi-visws.txt + - short blurb on the SGI Visual Workstations. +sh/ + - directory with info on porting Linux to a new architecture. +smart-config.txt + - description of the Smart Config makefile feature. +smp.txt + - a few notes on symmetric multi-processing. +sonypi.txt + - info on Linux Sony Programmable I/O Device support. +sound/ + - directory with info on sound card support. +sparc/ + - directory with info on using Linux on Sparc architecture. +specialix.txt + - info on hardware/driver for specialix IO8+ multiport serial card. +spinlocks.txt + - info on using spinlocks to provide exclusive access in kernel. +stallion.txt + - info on using the Stallion multiport serial driver. +svga.txt + - short guide on selecting video modes at boot via VGA BIOS. +sx.txt + - info on the Specialix SX/SI multiport serial driver. +sysctl/ + - directory with info on the /proc/sys/* files. +sysrq.txt + - info on the magic SysRq key. +telephony/ + - directory with info on telephony (e.g. voice over IP) support. +time_interpolators.txt + - info on time interpolators. +tipar.txt + - information about Parallel link cable for Texas Instruments handhelds. +tty.txt + - guide to the locking policies of the tty layer. +unicode.txt + - info on the Unicode character/font mapping used in Linux. +uml/ + - directory with infomation about User Mode Linux. +usb/ + - directory with info regarding the Universal Serial Bus. +video4linux/ + - directory with info regarding video/TV/radio cards and linux. +vm/ + - directory with info on the Linux vm code. +voyager.txt + - guide to running Linux on the Voyager architecture. +watchdog/ + - how to auto-reboot Linux if it has "fallen and can't get up". ;-) +x86_64/ + - directory with info on Linux support for AMD x86-64 (Hammer) machines. +xterm-linux.xpm + - XPM image of penguin logo (see logo.txt) sitting on an xterm. +zorro.txt + - info on writing drivers for Zorro bus devices found on Amigas. diff --git a/Documentation/BK-usage/00-INDEX b/Documentation/BK-usage/00-INDEX new file mode 100644 index 000000000000..82768784ea52 --- /dev/null +++ b/Documentation/BK-usage/00-INDEX @@ -0,0 +1,51 @@ +bk-kernel-howto.txt: Description of kernel workflow under BitKeeper + +bk-make-sum: Create summary of changesets in one repository and not +another, typically in preparation to be sent to an upstream maintainer. +Typical usage: + cd my-updated-repo + bk-make-sum ~/repo/original-repo + mv /tmp/linus.txt ../original-repo.txt + +bksend: Create readable text output containing summary of changes, GNU +patch of the changes, and BK metadata of changes (as needed for proper +importing into BitKeeper by an upstream maintainer). This output is +suitable for emailing BitKeeper changes. The recipient of this output +may pipe it directly to 'bk receive'. + +bz64wrap: helper script. Uncompressed input is piped to this script, +which compresses its input, and then outputs the uu-/base64-encoded +version of the compressed input. + +cpcset: Copy changeset between unrelated repositories. +Attempts to preserve changeset user, user address, description, in +addition to the changeset (the patch) itself. +Typical usage: + cd my-updated-repo + bk changes # looking for a changeset... + cpcset 1.1511 . ../another-repo + +csets-to-patches: Produces a delta of two BK repositories, in the form +of individual files, each containing a single cset as a GNU patch. +Output is several files, each with the filename "/tmp/rev-$REV.patch" +Typical usage: + cd my-updated-repo + bk changes -L ~/repo/original-repo 2>&1 | \ + perl csets-to-patches + +cset-to-linus: Produces a delta of two BK repositories, in the form of +changeset descriptions, with 'diffstat' output created for each +individual changset. +Typical usage: + cd my-updated-repo + bk changes -L ~/repo/original-repo 2>&1 | \ + perl cset-to-linus > summary.txt + +gcapatch: Generates patch containing changes in local repository. +Typical usage: + cd my-updated-repo + gcapatch > foo.patch + +unbz64wrap: Reverse an encoded, compressed data stream created by +bz64wrap into an uncompressed, typically text/plain output. + diff --git a/Documentation/BK-usage/bk-kernel-howto.txt b/Documentation/BK-usage/bk-kernel-howto.txt new file mode 100644 index 000000000000..b7b9075d2910 --- /dev/null +++ b/Documentation/BK-usage/bk-kernel-howto.txt @@ -0,0 +1,283 @@ + + Doing the BK Thing, Penguin-Style + + + + +This set of notes is intended mainly for kernel developers, occasional +or full-time, but sysadmins and power users may find parts of it useful +as well. It assumes at least a basic familiarity with CVS, both at a +user level (use on the cmd line) and at a higher level (client-server model). +Due to the author's background, an operation may be described in terms +of CVS, or in terms of how that operation differs from CVS. + +This is -not- intended to be BitKeeper documentation. Always run +"bk help <command>" or in X "bk helptool <command>" for reference +documentation. + + +BitKeeper Concepts +------------------ + +In the true nature of the Internet itself, BitKeeper is a distributed +system. When applied to revision control, this means doing away with +client-server, and changing to a parent-child model... essentially +peer-to-peer. On the developer's end, this also represents a +fundamental disruption in the standard workflow of changes, commits, +and merges. You will need to take a few minutes to think about +how to best work under BitKeeper, and re-optimize things a bit. +In some sense it is a bit radical, because it might described as +tossing changes out into a maelstrom and having them magically +land at the right destination... but I'm getting ahead of myself. + +Let's start with this progression: +Each BitKeeper source tree on disk is a repository unto itself. +Each repository has a parent (except the root/original, of course). +Each repository contains a set of a changesets ("csets"). +Each cset is one or more changed files, bundled together. + +Each tree is a repository, so all changes are checked into the local +tree. When a change is checked in, all modified files are grouped +into a logical unit, the changeset. Internally, BK links these +changesets in a tree, representing various converging and diverging +lines of development. These changesets are the bread and butter of +the BK system. + +After the concept of changesets, the next thing you need to get used +to is having multiple copies of source trees lying around. This -really- +takes some getting used to, for some people. Separate source trees +are the means in BitKeeper by which you delineate parallel lines +of development, both minor and major. What would be branches in +CVS become separate source trees, or "clones" in BitKeeper [heh, +or Star Wars] terminology. + +Clones and changesets are the tools from which most of the power of +BitKeeper is derived. As mentioned earlier, each clone has a parent, +the tree used as the source when the new clone was created. In a +CVS-like setup, the parent would be a remote server on the Internet, +and the child is your local clone of that tree. + +Once you have established a common baseline between two source trees -- +a common parent -- then you can merge changesets between those two +trees with ease. Merging changes into a tree is called a "pull", and +is analagous to 'cvs update'. A pull downloads all the changesets in +the remote tree you do not have, and merges them. Sending changes in +one tree to another tree is called a "push". Push sends all changes +in the local tree the remote does not yet have, and merges them. + +From these concepts come some initial command examples: + +1) bk clone -q http://linux.bkbits.net/linux-2.5 linus-2.5 +Download a 2.5 stock kernel tree, naming it "linus-2.5" in the local dir. +The "-q" disables listing every single file as it is downloaded. + +2) bk clone -ql linus-2.5 alpha-2.5 +Create a separate source tree for the Alpha AXP architecture. +The "-l" uses hard links instead of copying data, since both trees are +on the local disk. You can also replace the above with "bk lclone -q ..." + +You only clone a tree -once-. After cloning the tree lives a long time +on disk, being updating by pushes and pulls. + +3) cd alpha-2.5 ; bk pull http://gkernel.bkbits.net/alpha-2.5 +Download changes in "alpha-2.5" repository which are not present +in the local repository, and merge them into the source tree. + +4) bk -r co -q +Because every tree is a repository, files must be checked out before +they will be in their standard places in the source tree. + +5) bk vi fs/inode.c # example change... + bk citool # checkin, using X tool + bk push bk://gkernel@bkbits.net/alpha-2.5 # upload change +Typical example of a BK sequence that would replace the analagous CVS +situation, + vi fs/inode.c + cvs commit + +As this is just supposed to be a quick BK intro, for more in-depth +tutorials, live working demos, and docs, see http://www.bitkeeper.com/ + + + +BK and Kernel Development Workflow +---------------------------------- +Currently the latest 2.5 tree is available via "bk clone $URL" +and "bk pull $URL" at http://linux.bkbits.net/linux-2.5 +This should change in a few weeks to a kernel.org URL. + + +A big part of using BitKeeper is organizing the various trees you have +on your local disk, and organizing the flow of changes among those +trees, and remote trees. If one were to graph the relationships between +a desired BK setup, you are likely to see a few-many-few graph, like +this: + + linux-2.5 + | + merge-to-linus-2.5 + / | | + / | | + vm-hacks bugfixes filesys personal-hacks + \ | | / + \ | | / + \ | | / + testing-and-validation + +Since a "bk push" sends all changes not in the target tree, and +since a "bk pull" receives all changes not in the source tree, you want +to make sure you are only pushing specific changes to the desired tree, +not all changes from "peer parent" trees. For example, pushing a change +from the testing-and-validation tree would probably be a bad idea, +because it will push all changes from vm-hacks, bugfixes, filesys, and +personal-hacks trees into the target tree. + +One would typically work on only one "theme" at a time, either +vm-hacks or bugfixes or filesys, keeping those changes isolated in +their own tree during development, and only merge the isolated with +other changes when going upstream (to Linus or other maintainers) or +downstream (to your "union" trees, like testing-and-validation above). + +It should be noted that some of this separation is not just recommended +practice, it's actually [for now] -enforced- by BitKeeper. BitKeeper +requires that changesets maintain a certain order, which is the reason +that "bk push" sends all local changesets the remote doesn't have. This +separation may look like a lot of wasted disk space at first, but it +helps when two unrelated changes may "pollute" the same area of code, or +don't follow the same pace of development, or any other of the standard +reasons why one creates a development branch. + +Small development branches (clones) will appear and disappear: + + -------- A --------- B --------- C --------- D ------- + \ / + -----short-term devel branch----- + +While long-term branches will parallel a tree (or trees), with period +merge points. In this first example, we pull from a tree (pulls, +"\") periodically, such as what occurs when tracking changes in a +vendor tree, never pushing changes back up the line: + + -------- A --------- B --------- C --------- D ------- + \ \ \ + ----long-term devel branch----------------- + +And then a more common case in Linux kernel development, a long term +branch with periodic merges back into the tree (pushes, "/"): + + -------- A --------- B --------- C --------- D ------- + \ \ / \ + ----long-term devel branch----------------- + + + + + +Submitting Changes to Linus +--------------------------- +There's a bit of an art, or style, of submitting changes to Linus. +Since Linus's tree is now (you might say) fully integrated into the +distributed BitKeeper system, there are several prerequisites to +properly submitting a BitKeeper change. All these prereq's are just +general cleanliness of BK usage, so as people become experts at BK, feel +free to optimize this process further (assuming Linus agrees, of +course). + + + +0) Make sure your tree was originally cloned from the linux-2.5 tree +created by Linus. If your tree does not have this as its ancestor, it +is impossible to reliably exchange changesets. + + + +1) Pay attention to your commit text. The commit message that +accompanies each changeset you submit will live on forever in history, +and is used by Linus to accurately summarize the changes in each +pre-patch. Remember that there is no context, so + "fix for new scheduler changes" +would be too vague, but + "fix mips64 arch for new scheduler switch_to(), TIF_xxx semantics" +would be much better. + +You can and should use the command "bk comment -C<rev>" to update the +commit text, and improve it after the fact. This is very useful for +development: poor, quick descriptions during development, which get +cleaned up using "bk comment" before issuing the "bk push" to submit the +changes. + + + +2) Include an Internet-available URL for Linus to pull from, such as + + Pull from: http://gkernel.bkbits.net/net-drivers-2.5 + + + +3) Include a summary and "diffstat -p1" of each changeset that will be +downloaded, when Linus issues a "bk pull". The author auto-generates +these summaries using "bk changes -L <parent>", to obtain a listing +of all the pending-to-send changesets, and their commit messages. + +It is important to show Linus what he will be downloading when he issues +a "bk pull", to reduce the time required to sift the changes once they +are downloaded to Linus's local machine. + +IMPORTANT NOTE: One of the features of BK is that your repository does +not have to be up to date, in order for Linus to receive your changes. +It is considered a courtesy to keep your repository fairly recent, to +lessen any potential merge work Linus may need to do. + + +4) Split up your changes. Each maintainer<->Linus situation is likely +to be slightly different here, so take this just as general advice. The +author splits up changes according to "themes" when merging with Linus. +Simultaneous pushes from local development go to special trees which +exist solely to house changes "queued" for Linus. Example of the trees: + + net-drivers-2.5 -- on-going net driver maintenance + vm-2.5 -- VM-related changes + fs-2.5 -- filesystem-related changes + +Linus then has much more freedom for pulling changes. He could (for +example) issue a "bk pull" on vm-2.5 and fs-2.5 trees, to merge their +changes, but hold off net-drivers-2.5 because of a change that needs +more discussion. + +Other maintainers may find that a single linus-pull-from tree is +adequate for passing BK changesets to him. + + + +Frequently Answered Questions +----------------------------- +1) How do I change the e-mail address shown in the changelog? +A. When you run "bk citool" or "bk commit", set environment + variables BK_USER and BK_HOST to the desired username + and host/domain name. + + +2) How do I use tags / get a diff between two kernel versions? +A. Pass the tags Linus uses to 'bk export'. + +ChangeSets are in a forward-progressing order, so it's pretty easy +to get a snapshot starting and ending at any two points in time. +Linus puts tags on each release and pre-release, so you could use +these two examples: + + bk export -tpatch -hdu -rv2.5.4,v2.5.5 | less + # creates patch-2.5.5 essentially + bk export -tpatch -du -rv2.5.5-pre1,v2.5.5 | less + # changes from pre1 to final + +A tag is just an alias for a specific changeset... and since changesets +are ordered, a tag is thus a marker for a specific point in time (or +specific state of the tree). + + +3) Is there an easy way to generate One Big Patch versus mainline, + for my long-lived kernel branch? +A. Yes. This requires BK 3.x, though. + + bk export -tpatch -r`bk repogca bk://linux.bkbits.net/linux-2.5`,+ + diff --git a/Documentation/BK-usage/bk-make-sum b/Documentation/BK-usage/bk-make-sum new file mode 100755 index 000000000000..58ca46a0fcc6 --- /dev/null +++ b/Documentation/BK-usage/bk-make-sum @@ -0,0 +1,34 @@ +#!/bin/sh -e +# DIR=$HOME/BK/axp-2.5 +# cd $DIR + +LINUS_REPO=$1 +DIRBASE=`basename $PWD` + +{ +cat <<EOT +Please do a + + bk pull bk://gkernel.bkbits.net/$DIRBASE + +This will update the following files: + +EOT + +bk export -tpatch -hdu -r`bk repogca $LINUS_REPO`,+ | diffstat -p1 2>/dev/null + +cat <<EOT + +through these ChangeSets: + +EOT + +bk changes -L -d'$unless(:MERGE:){ChangeSet|:CSETREV:\n}' $LINUS_REPO | +bk -R prs -h -d'$unless(:MERGE:){<:P:@:HOST:> (:D: :I:)\n$each(:C:){ (:C:)\n}\n}' - + +} > /tmp/linus.txt + +cat <<EOT +Mail text in /tmp/linus.txt; please check and send using your favourite +mailer. +EOT diff --git a/Documentation/BK-usage/bksend b/Documentation/BK-usage/bksend new file mode 100755 index 000000000000..836ca943694f --- /dev/null +++ b/Documentation/BK-usage/bksend @@ -0,0 +1,36 @@ +#!/bin/sh +# A script to format BK changeset output in a manner that is easy to read. +# Andreas Dilger <adilger@turbolabs.com> 13/02/2002 +# +# Add diffstat output after Changelog <adilger@turbolabs.com> 21/02/2002 + +PROG=bksend + +usage() { + echo "usage: $PROG -r<rev>" + echo -e "\twhere <rev> is of the form '1.23', '1.23..', '1.23..1.27'," + echo -e "\tor '+' to indicate the most recent revision" + + exit 1 +} + +case $1 in +-r) REV=$2; shift ;; +-r*) REV=`echo $1 | sed 's/^-r//'` ;; +*) echo "$PROG: no revision given, you probably don't want that";; +esac + +[ -z "$REV" ] && usage + +echo "You can import this changeset into BK by piping this whole message to:" +echo "'| bk receive [path to repository]' or apply the patch as usual." + +SEP="\n===================================================================\n\n" +echo -e $SEP +env PAGER=/bin/cat bk changes -r$REV +echo +bk export -tpatch -du -h -r$REV | diffstat +echo; echo +bk export -tpatch -du -h -r$REV +echo -e $SEP +bk send -wgzip_uu -r$REV - diff --git a/Documentation/BK-usage/bz64wrap b/Documentation/BK-usage/bz64wrap new file mode 100755 index 000000000000..be780876849f --- /dev/null +++ b/Documentation/BK-usage/bz64wrap @@ -0,0 +1,41 @@ +#!/bin/sh + +# bz64wrap - the sending side of a bzip2 | base64 stream +# Andreas Dilger <adilger@clusterfs.com> Jan 2002 + + +PATH=$PATH:/usr/bin:/usr/local/bin:/usr/freeware/bin + +# A program to generate base64 encoding on stdout +BASE64_ENCODE="uuencode -m /dev/stdout" +BASE64_BEGIN= +BASE64_END= + +BZIP=NO +BASE64=NO + +# Test if we have the bzip program installed +bzip2 -c /dev/null > /dev/null 2>&1 && BZIP=YES + +# Test if uuencode can handle the -m (MIME) encoding option +$BASE64_ENCODE < /dev/null > /dev/null 2>&1 && BASE64=YES + +if [ $BASE64 = NO ]; then + BASE64_ENCODE=mimencode + BASE64_BEGIN="begin-base64 644 -" + BASE64_END="====" + + $BASE64_ENCODE < /dev/null > /dev/null 2>&1 && BASE64=YES +fi + +if [ $BZIP = NO -o $BASE64 = NO ]; then + echo "$0: can't use bz64 encoding: bzip2=$BZIP, $BASE64_ENCODE=$BASE64" + exit 1 +fi + +# Sadly, mimencode does not appear to have good "begin" and "end" markers +# like uuencode does, and it is picky about getting the right start/end of +# the base64 stream, so we handle this internally. +echo "$BASE64_BEGIN" +bzip2 -9 | $BASE64_ENCODE +echo "$BASE64_END" diff --git a/Documentation/BK-usage/cpcset b/Documentation/BK-usage/cpcset new file mode 100755 index 000000000000..b8faca97dab9 --- /dev/null +++ b/Documentation/BK-usage/cpcset @@ -0,0 +1,36 @@ +#!/bin/sh +# +# Purpose: Copy changeset patch and description from one +# repository to another, unrelated one. +# +# usage: cpcset [revision] [from-repository] [to-repository] +# + +REV=$1 +FROM=$2 +TO=$3 +TMPF=/tmp/cpcset.$$ + +rm -f $TMPF* + +CWD_SAVE=`pwd` +cd $FROM +bk changes -r$REV | \ + grep -v '^ChangeSet' | \ + sed -e 's/^ //g' > $TMPF.log + +USERHOST=`bk changes -r$REV | grep '^ChangeSet' | awk '{print $4}'` +export BK_USER=`echo $USERHOST | awk '-F@' '{print $1}'` +export BK_HOST=`echo $USERHOST | awk '-F@' '{print $2}'` + +bk export -tpatch -hdu -r$REV > $TMPF.patch && \ +cd $CWD_SAVE && \ +cd $TO && \ +bk import -tpatch -CFR -y"`cat $TMPF.log`" $TMPF.patch . && \ +bk commit -y"`cat $TMPF.log`" + +rm -f $TMPF* + +echo changeset $REV copied. +echo "" + diff --git a/Documentation/BK-usage/cset-to-linus b/Documentation/BK-usage/cset-to-linus new file mode 100755 index 000000000000..d28a96f8c618 --- /dev/null +++ b/Documentation/BK-usage/cset-to-linus @@ -0,0 +1,49 @@ +#!/usr/bin/perl -w + +use strict; + +my ($lhs, $rev, $tmp, $rhs, $s); +my @cset_text = (); +my @pipe_text = (); +my $have_cset = 0; + +while (<>) { + next if /^---/; + + if (($lhs, $tmp, $rhs) = (/^(ChangeSet\@)([^,]+)(, .*)$/)) { + &cset_rev if ($have_cset); + + $rev = $tmp; + $have_cset = 1; + + push(@cset_text, $_); + } + + elsif ($have_cset) { + push(@cset_text, $_); + } +} +&cset_rev if ($have_cset); +exit(0); + + +sub cset_rev { + my $empty_cset = 0; + + open PIPE, "bk export -tpatch -hdu -r $rev | diffstat -p1 2>/dev/null |" or die; + while ($s = <PIPE>) { + $empty_cset = 1 if ($s =~ /0 files changed/); + push(@pipe_text, $s); + } + close(PIPE); + + if (! $empty_cset) { + print @cset_text; + print @pipe_text; + print "\n\n"; + } + + @pipe_text = (); + @cset_text = (); +} + diff --git a/Documentation/BK-usage/csets-to-patches b/Documentation/BK-usage/csets-to-patches new file mode 100755 index 000000000000..e2b81c35883f --- /dev/null +++ b/Documentation/BK-usage/csets-to-patches @@ -0,0 +1,44 @@ +#!/usr/bin/perl -w + +use strict; + +my ($lhs, $rev, $tmp, $rhs, $s); +my @cset_text = (); +my @pipe_text = (); +my $have_cset = 0; + +while (<>) { + next if /^---/; + + if (($lhs, $tmp, $rhs) = (/^(ChangeSet\@)([^,]+)(, .*)$/)) { + &cset_rev if ($have_cset); + + $rev = $tmp; + $have_cset = 1; + + push(@cset_text, $_); + } + + elsif ($have_cset) { + push(@cset_text, $_); + } +} +&cset_rev if ($have_cset); +exit(0); + + +sub cset_rev { + my $empty_cset = 0; + + system("bk export -tpatch -du -r $rev > /tmp/rev-$rev.patch"); + + if (! $empty_cset) { + print @cset_text; + print @pipe_text; + print "\n\n"; + } + + @pipe_text = (); + @cset_text = (); +} + diff --git a/Documentation/BK-usage/gcapatch b/Documentation/BK-usage/gcapatch new file mode 100755 index 000000000000..aaeb17dc7c7f --- /dev/null +++ b/Documentation/BK-usage/gcapatch @@ -0,0 +1,8 @@ +#!/bin/sh +# +# Purpose: Generate GNU diff of local changes versus canonical top-of-tree +# +# Usage: gcapatch > foo.patch +# + +bk export -tpatch -hdu -r`bk repogca bk://linux.bkbits.net/linux-2.5`,+ diff --git a/Documentation/BK-usage/unbz64wrap b/Documentation/BK-usage/unbz64wrap new file mode 100755 index 000000000000..4fc3e73e9a81 --- /dev/null +++ b/Documentation/BK-usage/unbz64wrap @@ -0,0 +1,25 @@ +#!/bin/sh + +# unbz64wrap - the receiving side of a bzip2 | base64 stream +# Andreas Dilger <adilger@clusterfs.com> Jan 2002 + +# Sadly, mimencode does not appear to have good "begin" and "end" markers +# like uuencode does, and it is picky about getting the right start/end of +# the base64 stream, so we handle this explicitly here. + +PATH=$PATH:/usr/bin:/usr/local/bin:/usr/freeware/bin + +if mimencode -u < /dev/null > /dev/null 2>&1 ; then + SHOW= + while read LINE; do + case $LINE in + begin-base64*) SHOW=YES ;; + ====) SHOW= ;; + *) [ "$SHOW" ] && echo "$LINE" ;; + esac + done | mimencode -u | bunzip2 + exit $? +else + cat - | uudecode -o /dev/stdout | bunzip2 + exit $? +fi diff --git a/Documentation/BUG-HUNTING b/Documentation/BUG-HUNTING new file mode 100644 index 000000000000..ca29242dbc38 --- /dev/null +++ b/Documentation/BUG-HUNTING @@ -0,0 +1,92 @@ +[Sat Mar 2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@sgi.com (Larry McVoy)] + +This is how to track down a bug if you know nothing about kernel hacking. +It's a brute force approach but it works pretty well. + +You need: + + . A reproducible bug - it has to happen predictably (sorry) + . All the kernel tar files from a revision that worked to the + revision that doesn't + +You will then do: + + . Rebuild a revision that you believe works, install, and verify that. + . Do a binary search over the kernels to figure out which one + introduced the bug. I.e., suppose 1.3.28 didn't have the bug, but + you know that 1.3.69 does. Pick a kernel in the middle and build + that, like 1.3.50. Build & test; if it works, pick the mid point + between .50 and .69, else the mid point between .28 and .50. + . You'll narrow it down to the kernel that introduced the bug. You + can probably do better than this but it gets tricky. + + . Narrow it down to a subdirectory + + - Copy kernel that works into "test". Let's say that 3.62 works, + but 3.63 doesn't. So you diff -r those two kernels and come + up with a list of directories that changed. For each of those + directories: + + Copy the non-working directory next to the working directory + as "dir.63". + One directory at time, try moving the working directory to + "dir.62" and mv dir.63 dir"time, try + + mv dir dir.62 + mv dir.63 dir + find dir -name '*.[oa]' -print | xargs rm -f + + And then rebuild and retest. Assuming that all related + changes were contained in the sub directory, this should + isolate the change to a directory. + + Problems: changes in header files may have occurred; I've + found in my case that they were self explanatory - you may + or may not want to give up when that happens. + + . Narrow it down to a file + + - You can apply the same technique to each file in the directory, + hoping that the changes in that file are self contained. + + . Narrow it down to a routine + + - You can take the old file and the new file and manually create + a merged file that has + + #ifdef VER62 + routine() + { + ... + } + #else + routine() + { + ... + } + #endif + + And then walk through that file, one routine at a time and + prefix it with + + #define VER62 + /* both routines here */ + #undef VER62 + + Then recompile, retest, move the ifdefs until you find the one + that makes the difference. + +Finally, you take all the info that you have, kernel revisions, bug +description, the extent to which you have narrowed it down, and pass +that off to whomever you believe is the maintainer of that section. +A post to linux.dev.kernel isn't such a bad idea if you've done some +work to narrow it down. + +If you get it down to a routine, you'll probably get a fix in 24 hours. + +My apologies to Linus and the other kernel hackers for describing this +brute force approach, it's hardly what a kernel hacker would do. However, +it does work and it lets non-hackers help fix bugs. And it is cool +because Linux snapshots will let you do this - something that you can't +do with vendor supplied releases. + diff --git a/Documentation/Changes b/Documentation/Changes new file mode 100644 index 000000000000..caa6a5529b6b --- /dev/null +++ b/Documentation/Changes @@ -0,0 +1,410 @@ +Intro +===== + +This document is designed to provide a list of the minimum levels of +software necessary to run the 2.6 kernels, as well as provide brief +instructions regarding any other "Gotchas" users may encounter when +trying life on the Bleeding Edge. If upgrading from a pre-2.4.x +kernel, please consult the Changes file included with 2.4.x kernels for +additional information; most of that information will not be repeated +here. Basically, this document assumes that your system is already +functional and running at least 2.4.x kernels. + +This document is originally based on my "Changes" file for 2.0.x kernels +and therefore owes credit to the same people as that file (Jared Mauch, +Axel Boldt, Alessandro Sigala, and countless other users all over the +'net). + +The latest revision of this document, in various formats, can always +be found at <http://cyberbuzz.gatech.edu/kaboom/linux/Changes-2.4/>. + +Feel free to translate this document. If you do so, please send me a +URL to your translation for inclusion in future revisions of this +document. + +Smotrite file <http://oblom.rnc.ru/linux/kernel/Changes.ru>, yavlyaushisya +russkim perevodom dannogo documenta. + +Visite <http://www2.adi.uam.es/~ender/tecnico/> para obtener la traducción +al español de este documento en varios formatos. + +Eine deutsche Version dieser Datei finden Sie unter +<http://www.stefan-winter.de/Changes-2.4.0.txt>. + +Last updated: October 29th, 2002 + +Chris Ricker (kaboom@gatech.edu or chris.ricker@genetics.utah.edu). + +Current Minimal Requirements +============================ + +Upgrade to at *least* these software revisions before thinking you've +encountered a bug! If you're unsure what version you're currently +running, the suggested command should tell you. + +Again, keep in mind that this list assumes you are already +functionally running a Linux 2.4 kernel. Also, not all tools are +necessary on all systems; obviously, if you don't have any PCMCIA (PC +Card) hardware, for example, you probably needn't concern yourself +with pcmcia-cs. + +o Gnu C 2.95.3 # gcc --version +o Gnu make 3.79.1 # make --version +o binutils 2.12 # ld -v +o util-linux 2.10o # fdformat --version +o module-init-tools 0.9.10 # depmod -V +o e2fsprogs 1.29 # tune2fs +o jfsutils 1.1.3 # fsck.jfs -V +o reiserfsprogs 3.6.3 # reiserfsck -V 2>&1|grep reiserfsprogs +o xfsprogs 2.6.0 # xfs_db -V +o pcmcia-cs 3.1.21 # cardmgr -V +o quota-tools 3.09 # quota -V +o PPP 2.4.0 # pppd --version +o isdn4k-utils 3.1pre1 # isdnctrl 2>&1|grep version +o nfs-utils 1.0.5 # showmount --version +o procps 3.2.0 # ps --version +o oprofile 0.5.3 # oprofiled --version + +Kernel compilation +================== + +GCC +--- + +The gcc version requirements may vary depending on the type of CPU in your +computer. The next paragraph applies to users of x86 CPUs, but not +necessarily to users of other CPUs. Users of other CPUs should obtain +information about their gcc version requirements from another source. + +The recommended compiler for the kernel is gcc 2.95.x (x >= 3), and it +should be used when you need absolute stability. You may use gcc 3.0.x +instead if you wish, although it may cause problems. Later versions of gcc +have not received much testing for Linux kernel compilation, and there are +almost certainly bugs (mainly, but not exclusively, in the kernel) that +will need to be fixed in order to use these compilers. In any case, using +pgcc instead of plain gcc is just asking for trouble. + +The Red Hat gcc 2.96 compiler subtree can also be used to build this tree. +You should ensure you use gcc-2.96-74 or later. gcc-2.96-54 will not build +the kernel correctly. + +In addition, please pay attention to compiler optimization. Anything +greater than -O2 may not be wise. Similarly, if you choose to use gcc-2.95.x +or derivatives, be sure not to use -fstrict-aliasing (which, depending on +your version of gcc 2.95.x, may necessitate using -fno-strict-aliasing). + +Make +---- + +You will need Gnu make 3.79.1 or later to build the kernel. + +Binutils +-------- + +Linux on IA-32 has recently switched from using as86 to using gas for +assembling the 16-bit boot code, removing the need for as86 to compile +your kernel. This change does, however, mean that you need a recent +release of binutils. + +System utilities +================ + +Architectural changes +--------------------- + +DevFS has been obsoleted in favour of udev +(http://www.kernel.org/pub/linux/utils/kernel/hotplug/) + +32-bit UID support is now in place. Have fun! + +Linux documentation for functions is transitioning to inline +documentation via specially-formatted comments near their +definitions in the source. These comments can be combined with the +SGML templates in the Documentation/DocBook directory to make DocBook +files, which can then be converted by DocBook stylesheets to PostScript, +HTML, PDF files, and several other formats. In order to convert from +DocBook format to a format of your choice, you'll need to install Jade as +well as the desired DocBook stylesheets. + +Util-linux +---------- + +New versions of util-linux provide *fdisk support for larger disks, +support new options to mount, recognize more supported partition +types, have a fdformat which works with 2.4 kernels, and similar goodies. +You'll probably want to upgrade. + +Ksymoops +-------- + +If the unthinkable happens and your kernel oopses, you'll need a 2.4 +version of ksymoops to decode the report; see REPORTING-BUGS in the +root of the Linux source for more information. + +Module-Init-Tools +----------------- + +A new module loader is now in the kernel that requires module-init-tools +to use. It is backward compatible with the 2.4.x series kernels. + +Mkinitrd +-------- + +These changes to the /lib/modules file tree layout also require that +mkinitrd be upgraded. + +E2fsprogs +--------- + +The latest version of e2fsprogs fixes several bugs in fsck and +debugfs. Obviously, it's a good idea to upgrade. + +JFSutils +-------- + +The jfsutils package contains the utilities for the file system. +The following utilities are available: +o fsck.jfs - initiate replay of the transaction log, and check + and repair a JFS formatted partition. +o mkfs.jfs - create a JFS formatted partition. +o other file system utilities are also available in this package. + +Reiserfsprogs +------------- + +The reiserfsprogs package should be used for reiserfs-3.6.x +(Linux kernels 2.4.x). It is a combined package and contains working +versions of mkreiserfs, resize_reiserfs, debugreiserfs and +reiserfsck. These utils work on both i386 and alpha platforms. + +Xfsprogs +-------- + +The latest version of xfsprogs contains mkfs.xfs, xfs_db, and the +xfs_repair utilities, among others, for the XFS filesystem. It is +architecture independent and any version from 2.0.0 onward should +work correctly with this version of the XFS kernel code (2.6.0 or +later is recommended, due to some significant improvements). + + +Pcmcia-cs +--------- + +PCMCIA (PC Card) support is now partially implemented in the main +kernel source. Pay attention when you recompile your kernel ;-). +Also, be sure to upgrade to the latest pcmcia-cs release. + +Quota-tools +----------- + +Support for 32 bit uid's and gid's is required if you want to use +the newer version 2 quota format. Quota-tools version 3.07 and +newer has this support. Use the recommended version or newer +from the table above. + +Intel IA32 microcode +-------------------- + +A driver has been added to allow updating of Intel IA32 microcode, +accessible as both a devfs regular file and as a normal (misc) +character device. If you are not using devfs you may need to: + +mkdir /dev/cpu +mknod /dev/cpu/microcode c 10 184 +chmod 0644 /dev/cpu/microcode + +as root before you can use this. You'll probably also want to +get the user-space microcode_ctl utility to use with this. + +Powertweak +---------- + +If you are running v0.1.17 or earlier, you should upgrade to +version v0.99.0 or higher. Running old versions may cause problems +with programs using shared memory. + +udev +---- +udev is a userspace application for populating /dev dynamically with +only entries for devices actually present. udev replaces devfs. + +Networking +========== + +General changes +--------------- + +If you have advanced network configuration needs, you should probably +consider using the network tools from ip-route2. + +Packet Filter / NAT +------------------- +The packet filtering and NAT code uses the same tools like the previous 2.4.x +kernel series (iptables). It still includes backwards-compatibility modules +for 2.2.x-style ipchains and 2.0.x-style ipfwadm. + +PPP +--- + +The PPP driver has been restructured to support multilink and to +enable it to operate over diverse media layers. If you use PPP, +upgrade pppd to at least 2.4.0. + +If you are not using devfs, you must have the device file /dev/ppp +which can be made by: + +mknod /dev/ppp c 108 0 + +as root. + +If you use devfsd and build ppp support as modules, you will need +the following in your /etc/devfsd.conf file: + +LOOKUP PPP MODLOAD + +Isdn4k-utils +------------ + +Due to changes in the length of the phone number field, isdn4k-utils +needs to be recompiled or (preferably) upgraded. + +NFS-utils +--------- + +In 2.4 and earlier kernels, the nfs server needed to know about any +client that expected to be able to access files via NFS. This +information would be given to the kernel by "mountd" when the client +mounted the filesystem, or by "exportfs" at system startup. exportfs +would take information about active clients from /var/lib/nfs/rmtab. + +This approach is quite fragile as it depends on rmtab being correct +which is not always easy, particularly when trying to implement +fail-over. Even when the system is working well, rmtab suffers from +getting lots of old entries that never get removed. + +With 2.6 we have the option of having the kernel tell mountd when it +gets a request from an unknown host, and mountd can give appropriate +export information to the kernel. This removes the dependency on +rmtab and means that the kernel only needs to know about currently +active clients. + +To enable this new functionality, you need to: + + mount -t nfsd nfsd /proc/fs/nfs + +before running exportfs or mountd. It is recommended that all NFS +services be protected from the internet-at-large by a firewall where +that is possible. + +Getting updated software +======================== + +Kernel compilation +****************** + +gcc 2.95.3 +---------- +o <ftp://ftp.gnu.org/gnu/gcc/gcc-2.95.3.tar.gz> + +Make +---- +o <ftp://ftp.gnu.org/gnu/make/> + +Binutils +-------- +o <ftp://ftp.kernel.org/pub/linux/devel/binutils/> + +System utilities +**************** + +Util-linux +---------- +o <ftp://ftp.kernel.org/pub/linux/utils/util-linux/> + +Ksymoops +-------- +o <ftp://ftp.kernel.org/pub/linux/utils/kernel/ksymoops/v2.4/> + +Module-Init-Tools +----------------- +o <ftp://ftp.kernel.org/pub/linux/kernel/people/rusty/modules/> + +Mkinitrd +-------- +o <ftp://rawhide.redhat.com/pub/rawhide/SRPMS/SRPMS/> + +E2fsprogs +--------- +o <http://prdownloads.sourceforge.net/e2fsprogs/e2fsprogs-1.29.tar.gz> + +JFSutils +-------- +o <http://jfs.sourceforge.net/> + +Reiserfsprogs +------------- +o <http://www.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.6.3.tar.gz> + +Xfsprogs +-------- +o <ftp://oss.sgi.com/projects/xfs/download/> + +Pcmcia-cs +--------- +o <ftp://pcmcia-cs.sourceforge.net/pub/pcmcia-cs/pcmcia-cs-3.1.21.tar.gz> + +Quota-tools +---------- +o <http://sourceforge.net/projects/linuxquota/> + +Jade +---- +o <ftp://ftp.jclark.com/pub/jade/jade-1.2.1.tar.gz> + +DocBook Stylesheets +------------------- +o <http://nwalsh.com/docbook/dsssl/> + +Intel P6 microcode +------------------ +o <http://www.urbanmyth.org/microcode/> + +Powertweak +---------- +o <http://powertweak.sourceforge.net/> + +udev +---- +o <http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev.html> + +Networking +********** + +PPP +--- +o <ftp://ftp.samba.org/pub/ppp/ppp-2.4.0.tar.gz> + +Isdn4k-utils +------------ +o <ftp://ftp.isdn4linux.de/pub/isdn4linux/utils/isdn4k-utils.v3.1pre1.tar.gz> + +NFS-utils +--------- +o <http://sourceforge.net/project/showfiles.php?group_id=14> + +Iptables +-------- +o <http://www.iptables.org/downloads.html> + +Ip-route2 +--------- +o <ftp://ftp.tux.org/pub/net/ip-routing/iproute2-2.2.4-now-ss991023.tar.gz> + +OProfile +-------- +o <http://oprofile.sf.net/download/> + +NFS-Utils +--------- +o <http://nfs.sourceforge.net/> + diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle new file mode 100644 index 000000000000..f25b3953f513 --- /dev/null +++ b/Documentation/CodingStyle @@ -0,0 +1,431 @@ + + Linux kernel coding style + +This is a short document describing the preferred coding style for the +linux kernel. Coding style is very personal, and I won't _force_ my +views on anybody, but this is what goes for anything that I have to be +able to maintain, and I'd prefer it for most other things too. Please +at least consider the points made here. + +First off, I'd suggest printing out a copy of the GNU coding standards, +and NOT read it. Burn them, it's a great symbolic gesture. + +Anyway, here goes: + + + Chapter 1: Indentation + +Tabs are 8 characters, and thus indentations are also 8 characters. +There are heretic movements that try to make indentations 4 (or even 2!) +characters deep, and that is akin to trying to define the value of PI to +be 3. + +Rationale: The whole idea behind indentation is to clearly define where +a block of control starts and ends. Especially when you've been looking +at your screen for 20 straight hours, you'll find it a lot easier to see +how the indentation works if you have large indentations. + +Now, some people will claim that having 8-character indentations makes +the code move too far to the right, and makes it hard to read on a +80-character terminal screen. The answer to that is that if you need +more than 3 levels of indentation, you're screwed anyway, and should fix +your program. + +In short, 8-char indents make things easier to read, and have the added +benefit of warning you when you're nesting your functions too deep. +Heed that warning. + +Don't put multiple statements on a single line unless you have +something to hide: + + if (condition) do_this; + do_something_everytime; + +Outside of comments, documentation and except in Kconfig, spaces are never +used for indentation, and the above example is deliberately broken. + +Get a decent editor and don't leave whitespace at the end of lines. + + + Chapter 2: Breaking long lines and strings + +Coding style is all about readability and maintainability using commonly +available tools. + +The limit on the length of lines is 80 columns and this is a hard limit. + +Statements longer than 80 columns will be broken into sensible chunks. +Descendants are always substantially shorter than the parent and are placed +substantially to the right. The same applies to function headers with a long +argument list. Long strings are as well broken into shorter strings. + +void fun(int a, int b, int c) +{ + if (condition) + printk(KERN_WARNING "Warning this is a long printk with " + "3 parameters a: %u b: %u " + "c: %u \n", a, b, c); + else + next_statement; +} + + Chapter 3: Placing Braces + +The other issue that always comes up in C styling is the placement of +braces. Unlike the indent size, there are few technical reasons to +choose one placement strategy over the other, but the preferred way, as +shown to us by the prophets Kernighan and Ritchie, is to put the opening +brace last on the line, and put the closing brace first, thusly: + + if (x is true) { + we do y + } + +However, there is one special case, namely functions: they have the +opening brace at the beginning of the next line, thus: + + int function(int x) + { + body of function + } + +Heretic people all over the world have claimed that this inconsistency +is ... well ... inconsistent, but all right-thinking people know that +(a) K&R are _right_ and (b) K&R are right. Besides, functions are +special anyway (you can't nest them in C). + +Note that the closing brace is empty on a line of its own, _except_ in +the cases where it is followed by a continuation of the same statement, +ie a "while" in a do-statement or an "else" in an if-statement, like +this: + + do { + body of do-loop + } while (condition); + +and + + if (x == y) { + .. + } else if (x > y) { + ... + } else { + .... + } + +Rationale: K&R. + +Also, note that this brace-placement also minimizes the number of empty +(or almost empty) lines, without any loss of readability. Thus, as the +supply of new-lines on your screen is not a renewable resource (think +25-line terminal screens here), you have more empty lines to put +comments on. + + + Chapter 4: Naming + +C is a Spartan language, and so should your naming be. Unlike Modula-2 +and Pascal programmers, C programmers do not use cute names like +ThisVariableIsATemporaryCounter. A C programmer would call that +variable "tmp", which is much easier to write, and not the least more +difficult to understand. + +HOWEVER, while mixed-case names are frowned upon, descriptive names for +global variables are a must. To call a global function "foo" is a +shooting offense. + +GLOBAL variables (to be used only if you _really_ need them) need to +have descriptive names, as do global functions. If you have a function +that counts the number of active users, you should call that +"count_active_users()" or similar, you should _not_ call it "cntusr()". + +Encoding the type of a function into the name (so-called Hungarian +notation) is brain damaged - the compiler knows the types anyway and can +check those, and it only confuses the programmer. No wonder MicroSoft +makes buggy programs. + +LOCAL variable names should be short, and to the point. If you have +some random integer loop counter, it should probably be called "i". +Calling it "loop_counter" is non-productive, if there is no chance of it +being mis-understood. Similarly, "tmp" can be just about any type of +variable that is used to hold a temporary value. + +If you are afraid to mix up your local variable names, you have another +problem, which is called the function-growth-hormone-imbalance syndrome. +See next chapter. + + + Chapter 5: Functions + +Functions should be short and sweet, and do just one thing. They should +fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, +as we all know), and do one thing and do that well. + +The maximum length of a function is inversely proportional to the +complexity and indentation level of that function. So, if you have a +conceptually simple function that is just one long (but simple) +case-statement, where you have to do lots of small things for a lot of +different cases, it's OK to have a longer function. + +However, if you have a complex function, and you suspect that a +less-than-gifted first-year high-school student might not even +understand what the function is all about, you should adhere to the +maximum limits all the more closely. Use helper functions with +descriptive names (you can ask the compiler to in-line them if you think +it's performance-critical, and it will probably do a better job of it +than you would have done). + +Another measure of the function is the number of local variables. They +shouldn't exceed 5-10, or you're doing something wrong. Re-think the +function, and split it into smaller pieces. A human brain can +generally easily keep track of about 7 different things, anything more +and it gets confused. You know you're brilliant, but maybe you'd like +to understand what you did 2 weeks from now. + + + Chapter 6: Centralized exiting of functions + +Albeit deprecated by some people, the equivalent of the goto statement is +used frequently by compilers in form of the unconditional jump instruction. + +The goto statement comes in handy when a function exits from multiple +locations and some common work such as cleanup has to be done. + +The rationale is: + +- unconditional statements are easier to understand and follow +- nesting is reduced +- errors by not updating individual exit points when making + modifications are prevented +- saves the compiler work to optimize redundant code away ;) + +int fun(int ) +{ + int result = 0; + char *buffer = kmalloc(SIZE); + + if (buffer == NULL) + return -ENOMEM; + + if (condition1) { + while (loop1) { + ... + } + result = 1; + goto out; + } + ... +out: + kfree(buffer); + return result; +} + + Chapter 7: Commenting + +Comments are good, but there is also a danger of over-commenting. NEVER +try to explain HOW your code works in a comment: it's much better to +write the code so that the _working_ is obvious, and it's a waste of +time to explain badly written code. + +Generally, you want your comments to tell WHAT your code does, not HOW. +Also, try to avoid putting comments inside a function body: if the +function is so complex that you need to separately comment parts of it, +you should probably go back to chapter 5 for a while. You can make +small comments to note or warn about something particularly clever (or +ugly), but try to avoid excess. Instead, put the comments at the head +of the function, telling people what it does, and possibly WHY it does +it. + + + Chapter 8: You've made a mess of it + +That's OK, we all do. You've probably been told by your long-time Unix +user helper that "GNU emacs" automatically formats the C sources for +you, and you've noticed that yes, it does do that, but the defaults it +uses are less than desirable (in fact, they are worse than random +typing - an infinite number of monkeys typing into GNU emacs would never +make a good program). + +So, you can either get rid of GNU emacs, or change it to use saner +values. To do the latter, you can stick the following in your .emacs file: + +(defun linux-c-mode () + "C mode with adjusted defaults for use with the Linux kernel." + (interactive) + (c-mode) + (c-set-style "K&R") + (setq tab-width 8) + (setq indent-tabs-mode t) + (setq c-basic-offset 8)) + +This will define the M-x linux-c-mode command. When hacking on a +module, if you put the string -*- linux-c -*- somewhere on the first +two lines, this mode will be automatically invoked. Also, you may want +to add + +(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode) + auto-mode-alist)) + +to your .emacs file if you want to have linux-c-mode switched on +automagically when you edit source files under /usr/src/linux. + +But even if you fail in getting emacs to do sane formatting, not +everything is lost: use "indent". + +Now, again, GNU indent has the same brain-dead settings that GNU emacs +has, which is why you need to give it a few command line options. +However, that's not too bad, because even the makers of GNU indent +recognize the authority of K&R (the GNU people aren't evil, they are +just severely misguided in this matter), so you just give indent the +options "-kr -i8" (stands for "K&R, 8 character indents"), or use +"scripts/Lindent", which indents in the latest style. + +"indent" has a lot of options, and especially when it comes to comment +re-formatting you may want to take a look at the man page. But +remember: "indent" is not a fix for bad programming. + + + Chapter 9: Configuration-files + +For configuration options (arch/xxx/Kconfig, and all the Kconfig files), +somewhat different indentation is used. + +Help text is indented with 2 spaces. + +if CONFIG_EXPERIMENTAL + tristate CONFIG_BOOM + default n + help + Apply nitroglycerine inside the keyboard (DANGEROUS) + bool CONFIG_CHEER + depends on CONFIG_BOOM + default y + help + Output nice messages when you explode +endif + +Generally, CONFIG_EXPERIMENTAL should surround all options not considered +stable. All options that are known to trash data (experimental write- +support for file-systems, for instance) should be denoted (DANGEROUS), other +experimental options should be denoted (EXPERIMENTAL). + + + Chapter 10: Data structures + +Data structures that have visibility outside the single-threaded +environment they are created and destroyed in should always have +reference counts. In the kernel, garbage collection doesn't exist (and +outside the kernel garbage collection is slow and inefficient), which +means that you absolutely _have_ to reference count all your uses. + +Reference counting means that you can avoid locking, and allows multiple +users to have access to the data structure in parallel - and not having +to worry about the structure suddenly going away from under them just +because they slept or did something else for a while. + +Note that locking is _not_ a replacement for reference counting. +Locking is used to keep data structures coherent, while reference +counting is a memory management technique. Usually both are needed, and +they are not to be confused with each other. + +Many data structures can indeed have two levels of reference counting, +when there are users of different "classes". The subclass count counts +the number of subclass users, and decrements the global count just once +when the subclass count goes to zero. + +Examples of this kind of "multi-level-reference-counting" can be found in +memory management ("struct mm_struct": mm_users and mm_count), and in +filesystem code ("struct super_block": s_count and s_active). + +Remember: if another thread can find your data structure, and you don't +have a reference count on it, you almost certainly have a bug. + + + Chapter 11: Macros, Enums, Inline functions and RTL + +Names of macros defining constants and labels in enums are capitalized. + +#define CONSTANT 0x12345 + +Enums are preferred when defining several related constants. + +CAPITALIZED macro names are appreciated but macros resembling functions +may be named in lower case. + +Generally, inline functions are preferable to macros resembling functions. + +Macros with multiple statements should be enclosed in a do - while block: + +#define macrofun(a, b, c) \ + do { \ + if (a == 5) \ + do_this(b, c); \ + } while (0) + +Things to avoid when using macros: + +1) macros that affect control flow: + +#define FOO(x) \ + do { \ + if (blah(x) < 0) \ + return -EBUGGERED; \ + } while(0) + +is a _very_ bad idea. It looks like a function call but exits the "calling" +function; don't break the internal parsers of those who will read the code. + +2) macros that depend on having a local variable with a magic name: + +#define FOO(val) bar(index, val) + +might look like a good thing, but it's confusing as hell when one reads the +code and it's prone to breakage from seemingly innocent changes. + +3) macros with arguments that are used as l-values: FOO(x) = y; will +bite you if somebody e.g. turns FOO into an inline function. + +4) forgetting about precedence: macros defining constants using expressions +must enclose the expression in parentheses. Beware of similar issues with +macros using parameters. + +#define CONSTANT 0x4000 +#define CONSTEXP (CONSTANT | 3) + +The cpp manual deals with macros exhaustively. The gcc internals manual also +covers RTL which is used frequently with assembly language in the kernel. + + + Chapter 12: Printing kernel messages + +Kernel developers like to be seen as literate. Do mind the spelling +of kernel messages to make a good impression. Do not use crippled +words like "dont" and use "do not" or "don't" instead. + +Kernel messages do not have to be terminated with a period. + +Printing numbers in parentheses (%d) adds no value and should be avoided. + + + Chapter 13: References + +The C Programming Language, Second Edition +by Brian W. Kernighan and Dennis M. Ritchie. +Prentice Hall, Inc., 1988. +ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback). +URL: http://cm.bell-labs.com/cm/cs/cbook/ + +The Practice of Programming +by Brian W. Kernighan and Rob Pike. +Addison-Wesley, Inc., 1999. +ISBN 0-201-61586-X. +URL: http://cm.bell-labs.com/cm/cs/tpop/ + +GNU manuals - where in compliance with K&R and this text - for cpp, gcc, +gcc internals and indent, all available from http://www.gnu.org + +WG14 is the international standardization working group for the programming +language C, URL: http://std.dkuug.dk/JTC1/SC22/WG14/ + +-- +Last updated on 16 February 2004 by a community effort on LKML. diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt new file mode 100644 index 000000000000..6ee3cd6134df --- /dev/null +++ b/Documentation/DMA-API.txt @@ -0,0 +1,526 @@ + Dynamic DMA mapping using the generic device + ============================================ + + James E.J. Bottomley <James.Bottomley@HansenPartnership.com> + +This document describes the DMA API. For a more gentle introduction +phrased in terms of the pci_ equivalents (and actual examples) see +DMA-mapping.txt + +This API is split into two pieces. Part I describes the API and the +corresponding pci_ API. Part II describes the extensions to the API +for supporting non-consistent memory machines. Unless you know that +your driver absolutely has to support non-consistent platforms (this +is usually only legacy platforms) you should only use the API +described in part I. + +Part I - pci_ and dma_ Equivalent API +------------------------------------- + +To get the pci_ API, you must #include <linux/pci.h> +To get the dma_ API, you must #include <linux/dma-mapping.h> + + +Part Ia - Using large dma-coherent buffers +------------------------------------------ + +void * +dma_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_handle, int flag) +void * +pci_alloc_consistent(struct pci_dev *dev, size_t size, + dma_addr_t *dma_handle) + +Consistent memory is memory for which a write by either the device or +the processor can immediately be read by the processor or device +without having to worry about caching effects. + +This routine allocates a region of <size> bytes of consistent memory. +it also returns a <dma_handle> which may be cast to an unsigned +integer the same width as the bus and used as the physical address +base of the region. + +Returns: a pointer to the allocated region (in the processor's virtual +address space) or NULL if the allocation failed. + +Note: consistent memory can be expensive on some platforms, and the +minimum allocation length may be as big as a page, so you should +consolidate your requests for consistent memory as much as possible. +The simplest way to do that is to use the dma_pool calls (see below). + +The flag parameter (dma_alloc_coherent only) allows the caller to +specify the GFP_ flags (see kmalloc) for the allocation (the +implementation may chose to ignore flags that affect the location of +the returned memory, like GFP_DMA). For pci_alloc_consistent, you +must assume GFP_ATOMIC behaviour. + +void +dma_free_coherent(struct device *dev, size_t size, void *cpu_addr + dma_addr_t dma_handle) +void +pci_free_consistent(struct pci_dev *dev, size_t size, void *cpu_addr + dma_addr_t dma_handle) + +Free the region of consistent memory you previously allocated. dev, +size and dma_handle must all be the same as those passed into the +consistent allocate. cpu_addr must be the virtual address returned by +the consistent allocate + + +Part Ib - Using small dma-coherent buffers +------------------------------------------ + +To get this part of the dma_ API, you must #include <linux/dmapool.h> + +Many drivers need lots of small dma-coherent memory regions for DMA +descriptors or I/O buffers. Rather than allocating in units of a page +or more using dma_alloc_coherent(), you can use DMA pools. These work +much like a kmem_cache_t, except that they use the dma-coherent allocator +not __get_free_pages(). Also, they understand common hardware constraints +for alignment, like queue heads needing to be aligned on N byte boundaries. + + + struct dma_pool * + dma_pool_create(const char *name, struct device *dev, + size_t size, size_t align, size_t alloc); + + struct pci_pool * + pci_pool_create(const char *name, struct pci_device *dev, + size_t size, size_t align, size_t alloc); + +The pool create() routines initialize a pool of dma-coherent buffers +for use with a given device. It must be called in a context which +can sleep. + +The "name" is for diagnostics (like a kmem_cache_t name); dev and size +are like what you'd pass to dma_alloc_coherent(). The device's hardware +alignment requirement for this type of data is "align" (which is expressed +in bytes, and must be a power of two). If your device has no boundary +crossing restrictions, pass 0 for alloc; passing 4096 says memory allocated +from this pool must not cross 4KByte boundaries. + + + void *dma_pool_alloc(struct dma_pool *pool, int gfp_flags, + dma_addr_t *dma_handle); + + void *pci_pool_alloc(struct pci_pool *pool, int gfp_flags, + dma_addr_t *dma_handle); + +This allocates memory from the pool; the returned memory will meet the size +and alignment requirements specified at creation time. Pass GFP_ATOMIC to +prevent blocking, or if it's permitted (not in_interrupt, not holding SMP locks) +pass GFP_KERNEL to allow blocking. Like dma_alloc_coherent(), this returns +two values: an address usable by the cpu, and the dma address usable by the +pool's device. + + + void dma_pool_free(struct dma_pool *pool, void *vaddr, + dma_addr_t addr); + + void pci_pool_free(struct pci_pool *pool, void *vaddr, + dma_addr_t addr); + +This puts memory back into the pool. The pool is what was passed to +the the pool allocation routine; the cpu and dma addresses are what +were returned when that routine allocated the memory being freed. + + + void dma_pool_destroy(struct dma_pool *pool); + + void pci_pool_destroy(struct pci_pool *pool); + +The pool destroy() routines free the resources of the pool. They must be +called in a context which can sleep. Make sure you've freed all allocated +memory back to the pool before you destroy it. + + +Part Ic - DMA addressing limitations +------------------------------------ + +int +dma_supported(struct device *dev, u64 mask) +int +pci_dma_supported(struct device *dev, u64 mask) + +Checks to see if the device can support DMA to the memory described by +mask. + +Returns: 1 if it can and 0 if it can't. + +Notes: This routine merely tests to see if the mask is possible. It +won't change the current mask settings. It is more intended as an +internal API for use by the platform than an external API for use by +driver writers. + +int +dma_set_mask(struct device *dev, u64 mask) +int +pci_set_dma_mask(struct pci_device *dev, u64 mask) + +Checks to see if the mask is possible and updates the device +parameters if it is. + +Returns: 0 if successful and a negative error if not. + +u64 +dma_get_required_mask(struct device *dev) + +After setting the mask with dma_set_mask(), this API returns the +actual mask (within that already set) that the platform actually +requires to operate efficiently. Usually this means the returned mask +is the minimum required to cover all of memory. Examining the +required mask gives drivers with variable descriptor sizes the +opportunity to use smaller descriptors as necessary. + +Requesting the required mask does not alter the current mask. If you +wish to take advantage of it, you should issue another dma_set_mask() +call to lower the mask again. + + +Part Id - Streaming DMA mappings +-------------------------------- + +dma_addr_t +dma_map_single(struct device *dev, void *cpu_addr, size_t size, + enum dma_data_direction direction) +dma_addr_t +pci_map_single(struct device *dev, void *cpu_addr, size_t size, + int direction) + +Maps a piece of processor virtual memory so it can be accessed by the +device and returns the physical handle of the memory. + +The direction for both api's may be converted freely by casting. +However the dma_ API uses a strongly typed enumerator for its +direction: + +DMA_NONE = PCI_DMA_NONE no direction (used for + debugging) +DMA_TO_DEVICE = PCI_DMA_TODEVICE data is going from the + memory to the device +DMA_FROM_DEVICE = PCI_DMA_FROMDEVICE data is coming from + the device to the + memory +DMA_BIDIRECTIONAL = PCI_DMA_BIDIRECTIONAL direction isn't known + +Notes: Not all memory regions in a machine can be mapped by this +API. Further, regions that appear to be physically contiguous in +kernel virtual space may not be contiguous as physical memory. Since +this API does not provide any scatter/gather capability, it will fail +if the user tries to map a non physically contiguous piece of memory. +For this reason, it is recommended that memory mapped by this API be +obtained only from sources which guarantee to be physically contiguous +(like kmalloc). + +Further, the physical address of the memory must be within the +dma_mask of the device (the dma_mask represents a bit mask of the +addressable region for the device. i.e. if the physical address of +the memory anded with the dma_mask is still equal to the physical +address, then the device can perform DMA to the memory). In order to +ensure that the memory allocated by kmalloc is within the dma_mask, +the driver may specify various platform dependent flags to restrict +the physical memory range of the allocation (e.g. on x86, GFP_DMA +guarantees to be within the first 16Mb of available physical memory, +as required by ISA devices). + +Note also that the above constraints on physical contiguity and +dma_mask may not apply if the platform has an IOMMU (a device which +supplies a physical to virtual mapping between the I/O memory bus and +the device). However, to be portable, device driver writers may *not* +assume that such an IOMMU exists. + +Warnings: Memory coherency operates at a granularity called the cache +line width. In order for memory mapped by this API to operate +correctly, the mapped region must begin exactly on a cache line +boundary and end exactly on one (to prevent two separately mapped +regions from sharing a single cache line). Since the cache line size +may not be known at compile time, the API will not enforce this +requirement. Therefore, it is recommended that driver writers who +don't take special care to determine the cache line size at run time +only map virtual regions that begin and end on page boundaries (which +are guaranteed also to be cache line boundaries). + +DMA_TO_DEVICE synchronisation must be done after the last modification +of the memory region by the software and before it is handed off to +the driver. Once this primitive is used. Memory covered by this +primitive should be treated as read only by the device. If the device +may write to it at any point, it should be DMA_BIDIRECTIONAL (see +below). + +DMA_FROM_DEVICE synchronisation must be done before the driver +accesses data that may be changed by the device. This memory should +be treated as read only by the driver. If the driver needs to write +to it at any point, it should be DMA_BIDIRECTIONAL (see below). + +DMA_BIDIRECTIONAL requires special handling: it means that the driver +isn't sure if the memory was modified before being handed off to the +device and also isn't sure if the device will also modify it. Thus, +you must always sync bidirectional memory twice: once before the +memory is handed off to the device (to make sure all memory changes +are flushed from the processor) and once before the data may be +accessed after being used by the device (to make sure any processor +cache lines are updated with data that the device may have changed. + +void +dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size, + enum dma_data_direction direction) +void +pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr, + size_t size, int direction) + +Unmaps the region previously mapped. All the parameters passed in +must be identical to those passed in (and returned) by the mapping +API. + +dma_addr_t +dma_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction direction) +dma_addr_t +pci_map_page(struct pci_dev *hwdev, struct page *page, + unsigned long offset, size_t size, int direction) +void +dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size, + enum dma_data_direction direction) +void +pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address, + size_t size, int direction) + +API for mapping and unmapping for pages. All the notes and warnings +for the other mapping APIs apply here. Also, although the <offset> +and <size> parameters are provided to do partial page mapping, it is +recommended that you never use these unless you really know what the +cache width is. + +int +dma_mapping_error(dma_addr_t dma_addr) + +int +pci_dma_mapping_error(dma_addr_t dma_addr) + +In some circumstances dma_map_single and dma_map_page will fail to create +a mapping. A driver can check for these errors by testing the returned +dma address with dma_mapping_error(). A non zero return value means the mapping +could not be created and the driver should take appropriate action (eg +reduce current DMA mapping usage or delay and try again later). + +int +dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, + enum dma_data_direction direction) +int +pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, + int nents, int direction) + +Maps a scatter gather list from the block layer. + +Returns: the number of physical segments mapped (this may be shorted +than <nents> passed in if the block layer determines that some +elements of the scatter/gather list are physically adjacent and thus +may be mapped with a single entry). + +Please note that the sg cannot be mapped again if it has been mapped once. +The mapping process is allowed to destroy information in the sg. + +As with the other mapping interfaces, dma_map_sg can fail. When it +does, 0 is returned and a driver must take appropriate action. It is +critical that the driver do something, in the case of a block driver +aborting the request or even oopsing is better than doing nothing and +corrupting the filesystem. + +void +dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries, + enum dma_data_direction direction) +void +pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, + int nents, int direction) + +unmap the previously mapped scatter/gather list. All the parameters +must be the same as those and passed in to the scatter/gather mapping +API. + +Note: <nents> must be the number you passed in, *not* the number of +physical entries returned. + +void +dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size, + enum dma_data_direction direction) +void +pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle, + size_t size, int direction) +void +dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems, + enum dma_data_direction direction) +void +pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg, + int nelems, int direction) + +synchronise a single contiguous or scatter/gather mapping. All the +parameters must be the same as those passed into the single mapping +API. + +Notes: You must do this: + +- Before reading values that have been written by DMA from the device + (use the DMA_FROM_DEVICE direction) +- After writing values that will be written to the device using DMA + (use the DMA_TO_DEVICE) direction +- before *and* after handing memory to the device if the memory is + DMA_BIDIRECTIONAL + +See also dma_map_single(). + + +Part II - Advanced dma_ usage +----------------------------- + +Warning: These pieces of the DMA API have no PCI equivalent. They +should also not be used in the majority of cases, since they cater for +unlikely corner cases that don't belong in usual drivers. + +If you don't understand how cache line coherency works between a +processor and an I/O device, you should not be using this part of the +API at all. + +void * +dma_alloc_noncoherent(struct device *dev, size_t size, + dma_addr_t *dma_handle, int flag) + +Identical to dma_alloc_coherent() except that the platform will +choose to return either consistent or non-consistent memory as it sees +fit. By using this API, you are guaranteeing to the platform that you +have all the correct and necessary sync points for this memory in the +driver should it choose to return non-consistent memory. + +Note: where the platform can return consistent memory, it will +guarantee that the sync points become nops. + +Warning: Handling non-consistent memory is a real pain. You should +only ever use this API if you positively know your driver will be +required to work on one of the rare (usually non-PCI) architectures +that simply cannot make consistent memory. + +void +dma_free_noncoherent(struct device *dev, size_t size, void *cpu_addr, + dma_addr_t dma_handle) + +free memory allocated by the nonconsistent API. All parameters must +be identical to those passed in (and returned by +dma_alloc_noncoherent()). + +int +dma_is_consistent(dma_addr_t dma_handle) + +returns true if the memory pointed to by the dma_handle is actually +consistent. + +int +dma_get_cache_alignment(void) + +returns the processor cache alignment. This is the absolute minimum +alignment *and* width that you must observe when either mapping +memory or doing partial flushes. + +Notes: This API may return a number *larger* than the actual cache +line, but it will guarantee that one or more cache lines fit exactly +into the width returned by this call. It will also always be a power +of two for easy alignment + +void +dma_sync_single_range(struct device *dev, dma_addr_t dma_handle, + unsigned long offset, size_t size, + enum dma_data_direction direction) + +does a partial sync. starting at offset and continuing for size. You +must be careful to observe the cache alignment and width when doing +anything like this. You must also be extra careful about accessing +memory you intend to sync partially. + +void +dma_cache_sync(void *vaddr, size_t size, + enum dma_data_direction direction) + +Do a partial sync of memory that was allocated by +dma_alloc_noncoherent(), starting at virtual address vaddr and +continuing on for size. Again, you *must* observe the cache line +boundaries when doing this. + +int +dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr, + dma_addr_t device_addr, size_t size, int + flags) + + +Declare region of memory to be handed out by dma_alloc_coherent when +it's asked for coherent memory for this device. + +bus_addr is the physical address to which the memory is currently +assigned in the bus responding region (this will be used by the +platform to perform the mapping) + +device_addr is the physical address the device needs to be programmed +with actually to address this memory (this will be handed out as the +dma_addr_t in dma_alloc_coherent()) + +size is the size of the area (must be multiples of PAGE_SIZE). + +flags can be or'd together and are + +DMA_MEMORY_MAP - request that the memory returned from +dma_alloc_coherent() be directly writeable. + +DMA_MEMORY_IO - request that the memory returned from +dma_alloc_coherent() be addressable using read/write/memcpy_toio etc. + +One or both of these flags must be present + +DMA_MEMORY_INCLUDES_CHILDREN - make the declared memory be allocated by +dma_alloc_coherent of any child devices of this one (for memory residing +on a bridge). + +DMA_MEMORY_EXCLUSIVE - only allocate memory from the declared regions. +Do not allow dma_alloc_coherent() to fall back to system memory when +it's out of memory in the declared region. + +The return value will be either DMA_MEMORY_MAP or DMA_MEMORY_IO and +must correspond to a passed in flag (i.e. no returning DMA_MEMORY_IO +if only DMA_MEMORY_MAP were passed in) for success or zero for +failure. + +Note, for DMA_MEMORY_IO returns, all subsequent memory returned by +dma_alloc_coherent() may no longer be accessed directly, but instead +must be accessed using the correct bus functions. If your driver +isn't prepared to handle this contingency, it should not specify +DMA_MEMORY_IO in the input flags. + +As a simplification for the platforms, only *one* such region of +memory may be declared per device. + +For reasons of efficiency, most platforms choose to track the declared +region only at the granularity of a page. For smaller allocations, +you should use the dma_pool() API. + +void +dma_release_declared_memory(struct device *dev) + +Remove the memory region previously declared from the system. This +API performs *no* in-use checking for this region and will return +unconditionally having removed all the required structures. It is the +drivers job to ensure that no parts of this memory region are +currently in use. + +void * +dma_mark_declared_memory_occupied(struct device *dev, + dma_addr_t device_addr, size_t size) + +This is used to occupy specific regions of the declared space +(dma_alloc_coherent() will hand out the first free region it finds). + +device_addr is the *device* address of the region requested + +size is the size (and should be a page sized multiple). + +The return value will be either a pointer to the processor virtual +address of the memory, or an error (via PTR_ERR()) if any part of the +region is occupied. + + diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt new file mode 100644 index 000000000000..f4ac37f157ea --- /dev/null +++ b/Documentation/DMA-mapping.txt @@ -0,0 +1,881 @@ + Dynamic DMA mapping + =================== + + David S. Miller <davem@redhat.com> + Richard Henderson <rth@cygnus.com> + Jakub Jelinek <jakub@redhat.com> + +This document describes the DMA mapping system in terms of the pci_ +API. For a similar API that works for generic devices, see +DMA-API.txt. + +Most of the 64bit platforms have special hardware that translates bus +addresses (DMA addresses) into physical addresses. This is similar to +how page tables and/or a TLB translates virtual addresses to physical +addresses on a CPU. This is needed so that e.g. PCI devices can +access with a Single Address Cycle (32bit DMA address) any page in the +64bit physical address space. Previously in Linux those 64bit +platforms had to set artificial limits on the maximum RAM size in the +system, so that the virt_to_bus() static scheme works (the DMA address +translation tables were simply filled on bootup to map each bus +address to the physical page __pa(bus_to_virt())). + +So that Linux can use the dynamic DMA mapping, it needs some help from the +drivers, namely it has to take into account that DMA addresses should be +mapped only for the time they are actually used and unmapped after the DMA +transfer. + +The following API will work of course even on platforms where no such +hardware exists, see e.g. include/asm-i386/pci.h for how it is implemented on +top of the virt_to_bus interface. + +First of all, you should make sure + +#include <linux/pci.h> + +is in your driver. This file will obtain for you the definition of the +dma_addr_t (which can hold any valid DMA address for the platform) +type which should be used everywhere you hold a DMA (bus) address +returned from the DMA mapping functions. + + What memory is DMA'able? + +The first piece of information you must know is what kernel memory can +be used with the DMA mapping facilities. There has been an unwritten +set of rules regarding this, and this text is an attempt to finally +write them down. + +If you acquired your memory via the page allocator +(i.e. __get_free_page*()) or the generic memory allocators +(i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from +that memory using the addresses returned from those routines. + +This means specifically that you may _not_ use the memory/addresses +returned from vmalloc() for DMA. It is possible to DMA to the +_underlying_ memory mapped into a vmalloc() area, but this requires +walking page tables to get the physical addresses, and then +translating each of those pages back to a kernel address using +something like __va(). [ EDIT: Update this when we integrate +Gerd Knorr's generic code which does this. ] + +This rule also means that you may not use kernel image addresses +(ie. items in the kernel's data/text/bss segment, or your driver's) +nor may you use kernel stack addresses for DMA. Both of these items +might be mapped somewhere entirely different than the rest of physical +memory. + +Also, this means that you cannot take the return of a kmap() +call and DMA to/from that. This is similar to vmalloc(). + +What about block I/O and networking buffers? The block I/O and +networking subsystems make sure that the buffers they use are valid +for you to DMA from/to. + + DMA addressing limitations + +Does your device have any DMA addressing limitations? For example, is +your device only capable of driving the low order 24-bits of address +on the PCI bus for SAC DMA transfers? If so, you need to inform the +PCI layer of this fact. + +By default, the kernel assumes that your device can address the full +32-bits in a SAC cycle. For a 64-bit DAC capable device, this needs +to be increased. And for a device with limitations, as discussed in +the previous paragraph, it needs to be decreased. + +pci_alloc_consistent() by default will return 32-bit DMA addresses. +PCI-X specification requires PCI-X devices to support 64-bit +addressing (DAC) for all transactions. And at least one platform (SGI +SN2) requires 64-bit consistent allocations to operate correctly when +the IO bus is in PCI-X mode. Therefore, like with pci_set_dma_mask(), +it's good practice to call pci_set_consistent_dma_mask() to set the +appropriate mask even if your device only supports 32-bit DMA +(default) and especially if it's a PCI-X device. + +For correct operation, you must interrogate the PCI layer in your +device probe routine to see if the PCI controller on the machine can +properly support the DMA addressing limitation your device has. It is +good style to do this even if your device holds the default setting, +because this shows that you did think about these issues wrt. your +device. + +The query is performed via a call to pci_set_dma_mask(): + + int pci_set_dma_mask(struct pci_dev *pdev, u64 device_mask); + +The query for consistent allocations is performed via a a call to +pci_set_consistent_dma_mask(): + + int pci_set_consistent_dma_mask(struct pci_dev *pdev, u64 device_mask); + +Here, pdev is a pointer to the PCI device struct of your device, and +device_mask is a bit mask describing which bits of a PCI address your +device supports. It returns zero if your card can perform DMA +properly on the machine given the address mask you provided. + +If it returns non-zero, your device can not perform DMA properly on +this platform, and attempting to do so will result in undefined +behavior. You must either use a different mask, or not use DMA. + +This means that in the failure case, you have three options: + +1) Use another DMA mask, if possible (see below). +2) Use some non-DMA mode for data transfer, if possible. +3) Ignore this device and do not initialize it. + +It is recommended that your driver print a kernel KERN_WARNING message +when you end up performing either #2 or #3. In this manner, if a user +of your driver reports that performance is bad or that the device is not +even detected, you can ask them for the kernel messages to find out +exactly why. + +The standard 32-bit addressing PCI device would do something like +this: + + if (pci_set_dma_mask(pdev, DMA_32BIT_MASK)) { + printk(KERN_WARNING + "mydev: No suitable DMA available.\n"); + goto ignore_this_device; + } + +Another common scenario is a 64-bit capable device. The approach +here is to try for 64-bit DAC addressing, but back down to a +32-bit mask should that fail. The PCI platform code may fail the +64-bit mask not because the platform is not capable of 64-bit +addressing. Rather, it may fail in this case simply because +32-bit SAC addressing is done more efficiently than DAC addressing. +Sparc64 is one platform which behaves in this way. + +Here is how you would handle a 64-bit capable device which can drive +all 64-bits when accessing streaming DMA: + + int using_dac; + + if (!pci_set_dma_mask(pdev, DMA_64BIT_MASK)) { + using_dac = 1; + } else if (!pci_set_dma_mask(pdev, DMA_32BIT_MASK)) { + using_dac = 0; + } else { + printk(KERN_WARNING + "mydev: No suitable DMA available.\n"); + goto ignore_this_device; + } + +If a card is capable of using 64-bit consistent allocations as well, +the case would look like this: + + int using_dac, consistent_using_dac; + + if (!pci_set_dma_mask(pdev, DMA_64BIT_MASK)) { + using_dac = 1; + consistent_using_dac = 1; + pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK); + } else if (!pci_set_dma_mask(pdev, DMA_32BIT_MASK)) { + using_dac = 0; + consistent_using_dac = 0; + pci_set_consistent_dma_mask(pdev, DMA_32BIT_MASK); + } else { + printk(KERN_WARNING + "mydev: No suitable DMA available.\n"); + goto ignore_this_device; + } + +pci_set_consistent_dma_mask() will always be able to set the same or a +smaller mask as pci_set_dma_mask(). However for the rare case that a +device driver only uses consistent allocations, one would have to +check the return value from pci_set_consistent_dma_mask(). + +If your 64-bit device is going to be an enormous consumer of DMA +mappings, this can be problematic since the DMA mappings are a +finite resource on many platforms. Please see the "DAC Addressing +for Address Space Hungry Devices" section near the end of this +document for how to handle this case. + +Finally, if your device can only drive the low 24-bits of +address during PCI bus mastering you might do something like: + + if (pci_set_dma_mask(pdev, 0x00ffffff)) { + printk(KERN_WARNING + "mydev: 24-bit DMA addressing not available.\n"); + goto ignore_this_device; + } + +When pci_set_dma_mask() is successful, and returns zero, the PCI layer +saves away this mask you have provided. The PCI layer will use this +information later when you make DMA mappings. + +There is a case which we are aware of at this time, which is worth +mentioning in this documentation. If your device supports multiple +functions (for example a sound card provides playback and record +functions) and the various different functions have _different_ +DMA addressing limitations, you may wish to probe each mask and +only provide the functionality which the machine can handle. It +is important that the last call to pci_set_dma_mask() be for the +most specific mask. + +Here is pseudo-code showing how this might be done: + + #define PLAYBACK_ADDRESS_BITS DMA_32BIT_MASK + #define RECORD_ADDRESS_BITS 0x00ffffff + + struct my_sound_card *card; + struct pci_dev *pdev; + + ... + if (!pci_set_dma_mask(pdev, PLAYBACK_ADDRESS_BITS)) { + card->playback_enabled = 1; + } else { + card->playback_enabled = 0; + printk(KERN_WARN "%s: Playback disabled due to DMA limitations.\n", + card->name); + } + if (!pci_set_dma_mask(pdev, RECORD_ADDRESS_BITS)) { + card->record_enabled = 1; + } else { + card->record_enabled = 0; + printk(KERN_WARN "%s: Record disabled due to DMA limitations.\n", + card->name); + } + +A sound card was used as an example here because this genre of PCI +devices seems to be littered with ISA chips given a PCI front end, +and thus retaining the 16MB DMA addressing limitations of ISA. + + Types of DMA mappings + +There are two types of DMA mappings: + +- Consistent DMA mappings which are usually mapped at driver + initialization, unmapped at the end and for which the hardware should + guarantee that the device and the CPU can access the data + in parallel and will see updates made by each other without any + explicit software flushing. + + Think of "consistent" as "synchronous" or "coherent". + + The current default is to return consistent memory in the low 32 + bits of the PCI bus space. However, for future compatibility you + should set the consistent mask even if this default is fine for your + driver. + + Good examples of what to use consistent mappings for are: + + - Network card DMA ring descriptors. + - SCSI adapter mailbox command data structures. + - Device firmware microcode executed out of + main memory. + + The invariant these examples all require is that any CPU store + to memory is immediately visible to the device, and vice + versa. Consistent mappings guarantee this. + + IMPORTANT: Consistent DMA memory does not preclude the usage of + proper memory barriers. The CPU may reorder stores to + consistent memory just as it may normal memory. Example: + if it is important for the device to see the first word + of a descriptor updated before the second, you must do + something like: + + desc->word0 = address; + wmb(); + desc->word1 = DESC_VALID; + + in order to get correct behavior on all platforms. + +- Streaming DMA mappings which are usually mapped for one DMA transfer, + unmapped right after it (unless you use pci_dma_sync_* below) and for which + hardware can optimize for sequential accesses. + + This of "streaming" as "asynchronous" or "outside the coherency + domain". + + Good examples of what to use streaming mappings for are: + + - Networking buffers transmitted/received by a device. + - Filesystem buffers written/read by a SCSI device. + + The interfaces for using this type of mapping were designed in + such a way that an implementation can make whatever performance + optimizations the hardware allows. To this end, when using + such mappings you must be explicit about what you want to happen. + +Neither type of DMA mapping has alignment restrictions that come +from PCI, although some devices may have such restrictions. + + Using Consistent DMA mappings. + +To allocate and map large (PAGE_SIZE or so) consistent DMA regions, +you should do: + + dma_addr_t dma_handle; + + cpu_addr = pci_alloc_consistent(dev, size, &dma_handle); + +where dev is a struct pci_dev *. You should pass NULL for PCI like buses +where devices don't have struct pci_dev (like ISA, EISA). This may be +called in interrupt context. + +This argument is needed because the DMA translations may be bus +specific (and often is private to the bus which the device is attached +to). + +Size is the length of the region you want to allocate, in bytes. + +This routine will allocate RAM for that region, so it acts similarly to +__get_free_pages (but takes size instead of a page order). If your +driver needs regions sized smaller than a page, you may prefer using +the pci_pool interface, described below. + +The consistent DMA mapping interfaces, for non-NULL dev, will by +default return a DMA address which is SAC (Single Address Cycle) +addressable. Even if the device indicates (via PCI dma mask) that it +may address the upper 32-bits and thus perform DAC cycles, consistent +allocation will only return > 32-bit PCI addresses for DMA if the +consistent dma mask has been explicitly changed via +pci_set_consistent_dma_mask(). This is true of the pci_pool interface +as well. + +pci_alloc_consistent returns two values: the virtual address which you +can use to access it from the CPU and dma_handle which you pass to the +card. + +The cpu return address and the DMA bus master address are both +guaranteed to be aligned to the smallest PAGE_SIZE order which +is greater than or equal to the requested size. This invariant +exists (for example) to guarantee that if you allocate a chunk +which is smaller than or equal to 64 kilobytes, the extent of the +buffer you receive will not cross a 64K boundary. + +To unmap and free such a DMA region, you call: + + pci_free_consistent(dev, size, cpu_addr, dma_handle); + +where dev, size are the same as in the above call and cpu_addr and +dma_handle are the values pci_alloc_consistent returned to you. +This function may not be called in interrupt context. + +If your driver needs lots of smaller memory regions, you can write +custom code to subdivide pages returned by pci_alloc_consistent, +or you can use the pci_pool API to do that. A pci_pool is like +a kmem_cache, but it uses pci_alloc_consistent not __get_free_pages. +Also, it understands common hardware constraints for alignment, +like queue heads needing to be aligned on N byte boundaries. + +Create a pci_pool like this: + + struct pci_pool *pool; + + pool = pci_pool_create(name, dev, size, align, alloc); + +The "name" is for diagnostics (like a kmem_cache name); dev and size +are as above. The device's hardware alignment requirement for this +type of data is "align" (which is expressed in bytes, and must be a +power of two). If your device has no boundary crossing restrictions, +pass 0 for alloc; passing 4096 says memory allocated from this pool +must not cross 4KByte boundaries (but at that time it may be better to +go for pci_alloc_consistent directly instead). + +Allocate memory from a pci pool like this: + + cpu_addr = pci_pool_alloc(pool, flags, &dma_handle); + +flags are SLAB_KERNEL if blocking is permitted (not in_interrupt nor +holding SMP locks), SLAB_ATOMIC otherwise. Like pci_alloc_consistent, +this returns two values, cpu_addr and dma_handle. + +Free memory that was allocated from a pci_pool like this: + + pci_pool_free(pool, cpu_addr, dma_handle); + +where pool is what you passed to pci_pool_alloc, and cpu_addr and +dma_handle are the values pci_pool_alloc returned. This function +may be called in interrupt context. + +Destroy a pci_pool by calling: + + pci_pool_destroy(pool); + +Make sure you've called pci_pool_free for all memory allocated +from a pool before you destroy the pool. This function may not +be called in interrupt context. + + DMA Direction + +The interfaces described in subsequent portions of this document +take a DMA direction argument, which is an integer and takes on +one of the following values: + + PCI_DMA_BIDIRECTIONAL + PCI_DMA_TODEVICE + PCI_DMA_FROMDEVICE + PCI_DMA_NONE + +One should provide the exact DMA direction if you know it. + +PCI_DMA_TODEVICE means "from main memory to the PCI device" +PCI_DMA_FROMDEVICE means "from the PCI device to main memory" +It is the direction in which the data moves during the DMA +transfer. + +You are _strongly_ encouraged to specify this as precisely +as you possibly can. + +If you absolutely cannot know the direction of the DMA transfer, +specify PCI_DMA_BIDIRECTIONAL. It means that the DMA can go in +either direction. The platform guarantees that you may legally +specify this, and that it will work, but this may be at the +cost of performance for example. + +The value PCI_DMA_NONE is to be used for debugging. One can +hold this in a data structure before you come to know the +precise direction, and this will help catch cases where your +direction tracking logic has failed to set things up properly. + +Another advantage of specifying this value precisely (outside of +potential platform-specific optimizations of such) is for debugging. +Some platforms actually have a write permission boolean which DMA +mappings can be marked with, much like page protections in the user +program address space. Such platforms can and do report errors in the +kernel logs when the PCI controller hardware detects violation of the +permission setting. + +Only streaming mappings specify a direction, consistent mappings +implicitly have a direction attribute setting of +PCI_DMA_BIDIRECTIONAL. + +The SCSI subsystem provides mechanisms for you to easily obtain +the direction to use, in the SCSI command: + + scsi_to_pci_dma_dir(SCSI_DIRECTION) + +Where SCSI_DIRECTION is obtained from the 'sc_data_direction' +member of the SCSI command your driver is working on. The +mentioned interface above returns a value suitable for passing +into the streaming DMA mapping interfaces below. + +For Networking drivers, it's a rather simple affair. For transmit +packets, map/unmap them with the PCI_DMA_TODEVICE direction +specifier. For receive packets, just the opposite, map/unmap them +with the PCI_DMA_FROMDEVICE direction specifier. + + Using Streaming DMA mappings + +The streaming DMA mapping routines can be called from interrupt +context. There are two versions of each map/unmap, one which will +map/unmap a single memory region, and one which will map/unmap a +scatterlist. + +To map a single region, you do: + + struct pci_dev *pdev = mydev->pdev; + dma_addr_t dma_handle; + void *addr = buffer->ptr; + size_t size = buffer->len; + + dma_handle = pci_map_single(dev, addr, size, direction); + +and to unmap it: + + pci_unmap_single(dev, dma_handle, size, direction); + +You should call pci_unmap_single when the DMA activity is finished, e.g. +from the interrupt which told you that the DMA transfer is done. + +Using cpu pointers like this for single mappings has a disadvantage, +you cannot reference HIGHMEM memory in this way. Thus, there is a +map/unmap interface pair akin to pci_{map,unmap}_single. These +interfaces deal with page/offset pairs instead of cpu pointers. +Specifically: + + struct pci_dev *pdev = mydev->pdev; + dma_addr_t dma_handle; + struct page *page = buffer->page; + unsigned long offset = buffer->offset; + size_t size = buffer->len; + + dma_handle = pci_map_page(dev, page, offset, size, direction); + + ... + + pci_unmap_page(dev, dma_handle, size, direction); + +Here, "offset" means byte offset within the given page. + +With scatterlists, you map a region gathered from several regions by: + + int i, count = pci_map_sg(dev, sglist, nents, direction); + struct scatterlist *sg; + + for (i = 0, sg = sglist; i < count; i++, sg++) { + hw_address[i] = sg_dma_address(sg); + hw_len[i] = sg_dma_len(sg); + } + +where nents is the number of entries in the sglist. + +The implementation is free to merge several consecutive sglist entries +into one (e.g. if DMA mapping is done with PAGE_SIZE granularity, any +consecutive sglist entries can be merged into one provided the first one +ends and the second one starts on a page boundary - in fact this is a huge +advantage for cards which either cannot do scatter-gather or have very +limited number of scatter-gather entries) and returns the actual number +of sg entries it mapped them to. On failure 0 is returned. + +Then you should loop count times (note: this can be less than nents times) +and use sg_dma_address() and sg_dma_len() macros where you previously +accessed sg->address and sg->length as shown above. + +To unmap a scatterlist, just call: + + pci_unmap_sg(dev, sglist, nents, direction); + +Again, make sure DMA activity has already finished. + +PLEASE NOTE: The 'nents' argument to the pci_unmap_sg call must be + the _same_ one you passed into the pci_map_sg call, + it should _NOT_ be the 'count' value _returned_ from the + pci_map_sg call. + +Every pci_map_{single,sg} call should have its pci_unmap_{single,sg} +counterpart, because the bus address space is a shared resource (although +in some ports the mapping is per each BUS so less devices contend for the +same bus address space) and you could render the machine unusable by eating +all bus addresses. + +If you need to use the same streaming DMA region multiple times and touch +the data in between the DMA transfers, the buffer needs to be synced +properly in order for the cpu and device to see the most uptodate and +correct copy of the DMA buffer. + +So, firstly, just map it with pci_map_{single,sg}, and after each DMA +transfer call either: + + pci_dma_sync_single_for_cpu(dev, dma_handle, size, direction); + +or: + + pci_dma_sync_sg_for_cpu(dev, sglist, nents, direction); + +as appropriate. + +Then, if you wish to let the device get at the DMA area again, +finish accessing the data with the cpu, and then before actually +giving the buffer to the hardware call either: + + pci_dma_sync_single_for_device(dev, dma_handle, size, direction); + +or: + + pci_dma_sync_sg_for_device(dev, sglist, nents, direction); + +as appropriate. + +After the last DMA transfer call one of the DMA unmap routines +pci_unmap_{single,sg}. If you don't touch the data from the first pci_map_* +call till pci_unmap_*, then you don't have to call the pci_dma_sync_* +routines at all. + +Here is pseudo code which shows a situation in which you would need +to use the pci_dma_sync_*() interfaces. + + my_card_setup_receive_buffer(struct my_card *cp, char *buffer, int len) + { + dma_addr_t mapping; + + mapping = pci_map_single(cp->pdev, buffer, len, PCI_DMA_FROMDEVICE); + + cp->rx_buf = buffer; + cp->rx_len = len; + cp->rx_dma = mapping; + + give_rx_buf_to_card(cp); + } + + ... + + my_card_interrupt_handler(int irq, void *devid, struct pt_regs *regs) + { + struct my_card *cp = devid; + + ... + if (read_card_status(cp) == RX_BUF_TRANSFERRED) { + struct my_card_header *hp; + + /* Examine the header to see if we wish + * to accept the data. But synchronize + * the DMA transfer with the CPU first + * so that we see updated contents. + */ + pci_dma_sync_single_for_cpu(cp->pdev, cp->rx_dma, + cp->rx_len, + PCI_DMA_FROMDEVICE); + + /* Now it is safe to examine the buffer. */ + hp = (struct my_card_header *) cp->rx_buf; + if (header_is_ok(hp)) { + pci_unmap_single(cp->pdev, cp->rx_dma, cp->rx_len, + PCI_DMA_FROMDEVICE); + pass_to_upper_layers(cp->rx_buf); + make_and_setup_new_rx_buf(cp); + } else { + /* Just sync the buffer and give it back + * to the card. + */ + pci_dma_sync_single_for_device(cp->pdev, + cp->rx_dma, + cp->rx_len, + PCI_DMA_FROMDEVICE); + give_rx_buf_to_card(cp); + } + } + } + +Drivers converted fully to this interface should not use virt_to_bus any +longer, nor should they use bus_to_virt. Some drivers have to be changed a +little bit, because there is no longer an equivalent to bus_to_virt in the +dynamic DMA mapping scheme - you have to always store the DMA addresses +returned by the pci_alloc_consistent, pci_pool_alloc, and pci_map_single +calls (pci_map_sg stores them in the scatterlist itself if the platform +supports dynamic DMA mapping in hardware) in your driver structures and/or +in the card registers. + +All PCI drivers should be using these interfaces with no exceptions. +It is planned to completely remove virt_to_bus() and bus_to_virt() as +they are entirely deprecated. Some ports already do not provide these +as it is impossible to correctly support them. + + 64-bit DMA and DAC cycle support + +Do you understand all of the text above? Great, then you already +know how to use 64-bit DMA addressing under Linux. Simply make +the appropriate pci_set_dma_mask() calls based upon your cards +capabilities, then use the mapping APIs above. + +It is that simple. + +Well, not for some odd devices. See the next section for information +about that. + + DAC Addressing for Address Space Hungry Devices + +There exists a class of devices which do not mesh well with the PCI +DMA mapping API. By definition these "mappings" are a finite +resource. The number of total available mappings per bus is platform +specific, but there will always be a reasonable amount. + +What is "reasonable"? Reasonable means that networking and block I/O +devices need not worry about using too many mappings. + +As an example of a problematic device, consider compute cluster cards. +They can potentially need to access gigabytes of memory at once via +DMA. Dynamic mappings are unsuitable for this kind of access pattern. + +To this end we've provided a small API by which a device driver +may use DAC cycles to directly address all of physical memory. +Not all platforms support this, but most do. It is easy to determine +whether the platform will work properly at probe time. + +First, understand that there may be a SEVERE performance penalty for +using these interfaces on some platforms. Therefore, you MUST only +use these interfaces if it is absolutely required. %99 of devices can +use the normal APIs without any problems. + +Note that for streaming type mappings you must either use these +interfaces, or the dynamic mapping interfaces above. You may not mix +usage of both for the same device. Such an act is illegal and is +guaranteed to put a banana in your tailpipe. + +However, consistent mappings may in fact be used in conjunction with +these interfaces. Remember that, as defined, consistent mappings are +always going to be SAC addressable. + +The first thing your driver needs to do is query the PCI platform +layer with your devices DAC addressing capabilities: + + int pci_dac_set_dma_mask(struct pci_dev *pdev, u64 mask); + +This routine behaves identically to pci_set_dma_mask. You may not +use the following interfaces if this routine fails. + +Next, DMA addresses using this API are kept track of using the +dma64_addr_t type. It is guaranteed to be big enough to hold any +DAC address the platform layer will give to you from the following +routines. If you have consistent mappings as well, you still +use plain dma_addr_t to keep track of those. + +All mappings obtained here will be direct. The mappings are not +translated, and this is the purpose of this dialect of the DMA API. + +All routines work with page/offset pairs. This is the _ONLY_ way to +portably refer to any piece of memory. If you have a cpu pointer +(which may be validly DMA'd too) you may easily obtain the page +and offset using something like this: + + struct page *page = virt_to_page(ptr); + unsigned long offset = offset_in_page(ptr); + +Here are the interfaces: + + dma64_addr_t pci_dac_page_to_dma(struct pci_dev *pdev, + struct page *page, + unsigned long offset, + int direction); + +The DAC address for the tuple PAGE/OFFSET are returned. The direction +argument is the same as for pci_{map,unmap}_single(). The same rules +for cpu/device access apply here as for the streaming mapping +interfaces. To reiterate: + + The cpu may touch the buffer before pci_dac_page_to_dma. + The device may touch the buffer after pci_dac_page_to_dma + is made, but the cpu may NOT. + +When the DMA transfer is complete, invoke: + + void pci_dac_dma_sync_single_for_cpu(struct pci_dev *pdev, + dma64_addr_t dma_addr, + size_t len, int direction); + +This must be done before the CPU looks at the buffer again. +This interface behaves identically to pci_dma_sync_{single,sg}_for_cpu(). + +And likewise, if you wish to let the device get back at the buffer after +the cpu has read/written it, invoke: + + void pci_dac_dma_sync_single_for_device(struct pci_dev *pdev, + dma64_addr_t dma_addr, + size_t len, int direction); + +before letting the device access the DMA area again. + +If you need to get back to the PAGE/OFFSET tuple from a dma64_addr_t +the following interfaces are provided: + + struct page *pci_dac_dma_to_page(struct pci_dev *pdev, + dma64_addr_t dma_addr); + unsigned long pci_dac_dma_to_offset(struct pci_dev *pdev, + dma64_addr_t dma_addr); + +This is possible with the DAC interfaces purely because they are +not translated in any way. + + Optimizing Unmap State Space Consumption + +On many platforms, pci_unmap_{single,page}() is simply a nop. +Therefore, keeping track of the mapping address and length is a waste +of space. Instead of filling your drivers up with ifdefs and the like +to "work around" this (which would defeat the whole purpose of a +portable API) the following facilities are provided. + +Actually, instead of describing the macros one by one, we'll +transform some example code. + +1) Use DECLARE_PCI_UNMAP_{ADDR,LEN} in state saving structures. + Example, before: + + struct ring_state { + struct sk_buff *skb; + dma_addr_t mapping; + __u32 len; + }; + + after: + + struct ring_state { + struct sk_buff *skb; + DECLARE_PCI_UNMAP_ADDR(mapping) + DECLARE_PCI_UNMAP_LEN(len) + }; + + NOTE: DO NOT put a semicolon at the end of the DECLARE_*() + macro. + +2) Use pci_unmap_{addr,len}_set to set these values. + Example, before: + + ringp->mapping = FOO; + ringp->len = BAR; + + after: + + pci_unmap_addr_set(ringp, mapping, FOO); + pci_unmap_len_set(ringp, len, BAR); + +3) Use pci_unmap_{addr,len} to access these values. + Example, before: + + pci_unmap_single(pdev, ringp->mapping, ringp->len, + PCI_DMA_FROMDEVICE); + + after: + + pci_unmap_single(pdev, + pci_unmap_addr(ringp, mapping), + pci_unmap_len(ringp, len), + PCI_DMA_FROMDEVICE); + +It really should be self-explanatory. We treat the ADDR and LEN +separately, because it is possible for an implementation to only +need the address in order to perform the unmap operation. + + Platform Issues + +If you are just writing drivers for Linux and do not maintain +an architecture port for the kernel, you can safely skip down +to "Closing". + +1) Struct scatterlist requirements. + + Struct scatterlist must contain, at a minimum, the following + members: + + struct page *page; + unsigned int offset; + unsigned int length; + + The base address is specified by a "page+offset" pair. + + Previous versions of struct scatterlist contained a "void *address" + field that was sometimes used instead of page+offset. As of Linux + 2.5., page+offset is always used, and the "address" field has been + deleted. + +2) More to come... + + Handling Errors + +DMA address space is limited on some architectures and an allocation +failure can be determined by: + +- checking if pci_alloc_consistent returns NULL or pci_map_sg returns 0 + +- checking the returned dma_addr_t of pci_map_single and pci_map_page + by using pci_dma_mapping_error(): + + dma_addr_t dma_handle; + + dma_handle = pci_map_single(dev, addr, size, direction); + if (pci_dma_mapping_error(dma_handle)) { + /* + * reduce current DMA mapping usage, + * delay and try again later or + * reset driver. + */ + } + + Closing + +This document, and the API itself, would not be in it's current +form without the feedback and suggestions from numerous individuals. +We would like to specifically mention, in no particular order, the +following people: + + Russell King <rmk@arm.linux.org.uk> + Leo Dagum <dagum@barrel.engr.sgi.com> + Ralf Baechle <ralf@oss.sgi.com> + Grant Grundler <grundler@cup.hp.com> + Jay Estabrook <Jay.Estabrook@compaq.com> + Thomas Sailer <sailer@ife.ee.ethz.ch> + Andrea Arcangeli <andrea@suse.de> + Jens Axboe <axboe@suse.de> + David Mosberger-Tang <davidm@hpl.hp.com> diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile new file mode 100644 index 000000000000..a221039ee4c9 --- /dev/null +++ b/Documentation/DocBook/Makefile @@ -0,0 +1,195 @@ +### +# This makefile is used to generate the kernel documentation, +# primarily based on in-line comments in various source files. +# See Documentation/kernel-doc-nano-HOWTO.txt for instruction in how +# to ducument the SRC - and how to read it. +# To add a new book the only step required is to add the book to the +# list of DOCBOOKS. + +DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \ + kernel-hacking.xml kernel-locking.xml via-audio.xml \ + deviceiobook.xml procfs-guide.xml tulip-user.xml \ + writing_usb_driver.xml scsidrivers.xml sis900.xml \ + kernel-api.xml journal-api.xml lsm.xml usb.xml \ + gadget.xml libata.xml mtdnand.xml librs.xml + +### +# The build process is as follows (targets): +# (xmldocs) +# file.tmpl --> file.xml +--> file.ps (psdocs) +# +--> file.pdf (pdfdocs) +# +--> DIR=file (htmldocs) +# +--> man/ (mandocs) + +### +# The targets that may be used. +.PHONY: xmldocs sgmldocs psdocs pdfdocs htmldocs mandocs installmandocs + +BOOKS := $(addprefix $(obj)/,$(DOCBOOKS)) +xmldocs: $(BOOKS) +sgmldocs: xmldocs + +PS := $(patsubst %.xml, %.ps, $(BOOKS)) +psdocs: $(PS) + +PDF := $(patsubst %.xml, %.pdf, $(BOOKS)) +pdfdocs: $(PDF) + +HTML := $(patsubst %.xml, %.html, $(BOOKS)) +htmldocs: $(HTML) + +MAN := $(patsubst %.xml, %.9, $(BOOKS)) +mandocs: $(MAN) + +installmandocs: mandocs + $(MAKEMAN) install Documentation/DocBook/man + +### +#External programs used +KERNELDOC = scripts/kernel-doc +DOCPROC = scripts/basic/docproc +SPLITMAN = $(PERL) $(srctree)/scripts/split-man +MAKEMAN = $(PERL) $(srctree)/scripts/makeman + +### +# DOCPROC is used for two purposes: +# 1) To generate a dependency list for a .tmpl file +# 2) To preprocess a .tmpl file and call kernel-doc with +# appropriate parameters. +# The following rules are used to generate the .xml documentation +# required to generate the final targets. (ps, pdf, html). +quiet_cmd_docproc = DOCPROC $@ + cmd_docproc = SRCTREE=$(srctree)/ $(DOCPROC) doc $< >$@ +define rule_docproc + set -e; \ + $(if $($(quiet)cmd_$(1)),echo ' $($(quiet)cmd_$(1))';) \ + $(cmd_$(1)); \ + ( \ + echo 'cmd_$@ := $(cmd_$(1))'; \ + echo $@: `SRCTREE=$(srctree) $(DOCPROC) depend $<`; \ + ) > $(dir $@).$(notdir $@).cmd +endef + +%.xml: %.tmpl FORCE + $(call if_changed_rule,docproc) + +### +#Read in all saved dependency files +cmd_files := $(wildcard $(foreach f,$(BOOKS),$(dir $(f)).$(notdir $(f)).cmd)) + +ifneq ($(cmd_files),) + include $(cmd_files) +endif + +### +# Changes in kernel-doc force a rebuild of all documentation +$(BOOKS): $(KERNELDOC) + +### +# procfs guide uses a .c file as example code. +# This requires an explicit dependency +C-procfs-example = procfs_example.xml +C-procfs-example2 = $(addprefix $(obj)/,$(C-procfs-example)) +$(obj)/procfs-guide.xml: $(C-procfs-example2) + +### +# Rules to generate postscript, PDF and HTML +# db2html creates a directory. Generate a html file used for timestamp + +quiet_cmd_db2ps = DB2PS $@ + cmd_db2ps = db2ps -o $(dir $@) $< +%.ps : %.xml + @(which db2ps > /dev/null 2>&1) || \ + (echo "*** You need to install DocBook stylesheets ***"; \ + exit 1) + $(call cmd,db2ps) + +quiet_cmd_db2pdf = DB2PDF $@ + cmd_db2pdf = db2pdf -o $(dir $@) $< +%.pdf : %.xml + @(which db2pdf > /dev/null 2>&1) || \ + (echo "*** You need to install DocBook stylesheets ***"; \ + exit 1) + $(call cmd,db2pdf) + +quiet_cmd_db2html = DB2HTML $@ + cmd_db2html = db2html -o $(patsubst %.html,%,$@) $< && \ + echo '<a HREF="$(patsubst %.html,%,$(notdir $@))/book1.html"> \ + Goto $(patsubst %.html,%,$(notdir $@))</a><p>' > $@ + +%.html: %.xml + @(which db2html > /dev/null 2>&1) || \ + (echo "*** You need to install DocBook stylesheets ***"; \ + exit 1) + @rm -rf $@ $(patsubst %.html,%,$@) + $(call cmd,db2html) + @if [ ! -z "$(PNG-$(basename $(notdir $@)))" ]; then \ + cp $(PNG-$(basename $(notdir $@))) $(patsubst %.html,%,$@); fi + +### +# Rule to generate man files - output is placed in the man subdirectory + +%.9: %.xml +ifneq ($(KBUILD_SRC),) + $(Q)mkdir -p $(objtree)/Documentation/DocBook/man +endif + $(SPLITMAN) $< $(objtree)/Documentation/DocBook/man "$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)" + $(MAKEMAN) convert $(objtree)/Documentation/DocBook/man $< + +### +# Rules to generate postscripts and PNG imgages from .fig format files +quiet_cmd_fig2eps = FIG2EPS $@ + cmd_fig2eps = fig2dev -Leps $< $@ + +%.eps: %.fig + @(which fig2dev > /dev/null 2>&1) || \ + (echo "*** You need to install transfig ***"; \ + exit 1) + $(call cmd,fig2eps) + +quiet_cmd_fig2png = FIG2PNG $@ + cmd_fig2png = fig2dev -Lpng $< $@ + +%.png: %.fig + @(which fig2dev > /dev/null 2>&1) || \ + (echo "*** You need to install transfig ***"; \ + exit 1) + $(call cmd,fig2png) + +### +# Rule to convert a .c file to inline XML documentation +%.xml: %.c + @echo ' GEN $@' + @( \ + echo "<programlisting>"; \ + expand --tabs=8 < $< | \ + sed -e "s/&/\\&/g" \ + -e "s/</\\</g" \ + -e "s/>/\\>/g"; \ + echo "</programlisting>") > $@ + +### +# Help targets as used by the top-level makefile +dochelp: + @echo ' Linux kernel internal documentation in different formats:' + @echo ' xmldocs (XML DocBook), psdocs (Postscript), pdfdocs (PDF)' + @echo ' htmldocs (HTML), mandocs (man pages, use installmandocs to install)' + +### +# Temporary files left by various tools +clean-files := $(DOCBOOKS) \ + $(patsubst %.xml, %.dvi, $(DOCBOOKS)) \ + $(patsubst %.xml, %.aux, $(DOCBOOKS)) \ + $(patsubst %.xml, %.tex, $(DOCBOOKS)) \ + $(patsubst %.xml, %.log, $(DOCBOOKS)) \ + $(patsubst %.xml, %.out, $(DOCBOOKS)) \ + $(patsubst %.xml, %.ps, $(DOCBOOKS)) \ + $(patsubst %.xml, %.pdf, $(DOCBOOKS)) \ + $(patsubst %.xml, %.html, $(DOCBOOKS)) \ + $(patsubst %.xml, %.9, $(DOCBOOKS)) \ + $(C-procfs-example) + +clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) + +#man put files in man subdir - traverse down +subdir- := man/ diff --git a/Documentation/DocBook/deviceiobook.tmpl b/Documentation/DocBook/deviceiobook.tmpl new file mode 100644 index 000000000000..6f41f2f5c6f6 --- /dev/null +++ b/Documentation/DocBook/deviceiobook.tmpl @@ -0,0 +1,341 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="DoingIO"> + <bookinfo> + <title>Bus-Independent Device Accesses</title> + + <authorgroup> + <author> + <firstname>Matthew</firstname> + <surname>Wilcox</surname> + <affiliation> + <address> + <email>matthew@wil.cx</email> + </address> + </affiliation> + </author> + </authorgroup> + + <authorgroup> + <author> + <firstname>Alan</firstname> + <surname>Cox</surname> + <affiliation> + <address> + <email>alan@redhat.com</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2001</year> + <holder>Matthew Wilcox</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + Linux provides an API which abstracts performing IO across all busses + and devices, allowing device drivers to be written independently of + bus type. + </para> + </chapter> + + <chapter id="bugs"> + <title>Known Bugs And Assumptions</title> + <para> + None. + </para> + </chapter> + + <chapter id="mmio"> + <title>Memory Mapped IO</title> + <sect1> + <title>Getting Access to the Device</title> + <para> + The most widely supported form of IO is memory mapped IO. + That is, a part of the CPU's address space is interpreted + not as accesses to memory, but as accesses to a device. Some + architectures define devices to be at a fixed address, but most + have some method of discovering devices. The PCI bus walk is a + good example of such a scheme. This document does not cover how + to receive such an address, but assumes you are starting with one. + Physical addresses are of type unsigned long. + </para> + + <para> + This address should not be used directly. Instead, to get an + address suitable for passing to the accessor functions described + below, you should call <function>ioremap</function>. + An address suitable for accessing the device will be returned to you. + </para> + + <para> + After you've finished using the device (say, in your module's + exit routine), call <function>iounmap</function> in order to return + the address space to the kernel. Most architectures allocate new + address space each time you call <function>ioremap</function>, and + they can run out unless you call <function>iounmap</function>. + </para> + </sect1> + + <sect1> + <title>Accessing the device</title> + <para> + The part of the interface most used by drivers is reading and + writing memory-mapped registers on the device. Linux provides + interfaces to read and write 8-bit, 16-bit, 32-bit and 64-bit + quantities. Due to a historical accident, these are named byte, + word, long and quad accesses. Both read and write accesses are + supported; there is no prefetch support at this time. + </para> + + <para> + The functions are named <function>readb</function>, + <function>readw</function>, <function>readl</function>, + <function>readq</function>, <function>readb_relaxed</function>, + <function>readw_relaxed</function>, <function>readl_relaxed</function>, + <function>readq_relaxed</function>, <function>writeb</function>, + <function>writew</function>, <function>writel</function> and + <function>writeq</function>. + </para> + + <para> + Some devices (such as framebuffers) would like to use larger + transfers than 8 bytes at a time. For these devices, the + <function>memcpy_toio</function>, <function>memcpy_fromio</function> + and <function>memset_io</function> functions are provided. + Do not use memset or memcpy on IO addresses; they + are not guaranteed to copy data in order. + </para> + + <para> + The read and write functions are defined to be ordered. That is the + compiler is not permitted to reorder the I/O sequence. When the + ordering can be compiler optimised, you can use <function> + __readb</function> and friends to indicate the relaxed ordering. Use + this with care. + </para> + + <para> + While the basic functions are defined to be synchronous with respect + to each other and ordered with respect to each other the busses the + devices sit on may themselves have asynchronicity. In particular many + authors are burned by the fact that PCI bus writes are posted + asynchronously. A driver author must issue a read from the same + device to ensure that writes have occurred in the specific cases the + author cares. This kind of property cannot be hidden from driver + writers in the API. In some cases, the read used to flush the device + may be expected to fail (if the card is resetting, for example). In + that case, the read should be done from config space, which is + guaranteed to soft-fail if the card doesn't respond. + </para> + + <para> + The following is an example of flushing a write to a device when + the driver would like to ensure the write's effects are visible prior + to continuing execution. + </para> + +<programlisting> +static inline void +qla1280_disable_intrs(struct scsi_qla_host *ha) +{ + struct device_reg *reg; + + reg = ha->iobase; + /* disable risc and host interrupts */ + WRT_REG_WORD(&reg->ictrl, 0); + /* + * The following read will ensure that the above write + * has been received by the device before we return from this + * function. + */ + RD_REG_WORD(&reg->ictrl); + ha->flags.ints_enabled = 0; +} +</programlisting> + + <para> + In addition to write posting, on some large multiprocessing systems + (e.g. SGI Challenge, Origin and Altix machines) posted writes won't + be strongly ordered coming from different CPUs. Thus it's important + to properly protect parts of your driver that do memory-mapped writes + with locks and use the <function>mmiowb</function> to make sure they + arrive in the order intended. Issuing a regular <function>readX + </function> will also ensure write ordering, but should only be used + when the driver has to be sure that the write has actually arrived + at the device (not that it's simply ordered with respect to other + writes), since a full <function>readX</function> is a relatively + expensive operation. + </para> + + <para> + Generally, one should use <function>mmiowb</function> prior to + releasing a spinlock that protects regions using <function>writeb + </function> or similar functions that aren't surrounded by <function> + readb</function> calls, which will ensure ordering and flushing. The + following pseudocode illustrates what might occur if write ordering + isn't guaranteed via <function>mmiowb</function> or one of the + <function>readX</function> functions. + </para> + +<programlisting> +CPU A: spin_lock_irqsave(&dev_lock, flags) +CPU A: ... +CPU A: writel(newval, ring_ptr); +CPU A: spin_unlock_irqrestore(&dev_lock, flags) + ... +CPU B: spin_lock_irqsave(&dev_lock, flags) +CPU B: writel(newval2, ring_ptr); +CPU B: ... +CPU B: spin_unlock_irqrestore(&dev_lock, flags) +</programlisting> + + <para> + In the case above, newval2 could be written to ring_ptr before + newval. Fixing it is easy though: + </para> + +<programlisting> +CPU A: spin_lock_irqsave(&dev_lock, flags) +CPU A: ... +CPU A: writel(newval, ring_ptr); +CPU A: mmiowb(); /* ensure no other writes beat us to the device */ +CPU A: spin_unlock_irqrestore(&dev_lock, flags) + ... +CPU B: spin_lock_irqsave(&dev_lock, flags) +CPU B: writel(newval2, ring_ptr); +CPU B: ... +CPU B: mmiowb(); +CPU B: spin_unlock_irqrestore(&dev_lock, flags) +</programlisting> + + <para> + See tg3.c for a real world example of how to use <function>mmiowb + </function> + </para> + + <para> + PCI ordering rules also guarantee that PIO read responses arrive + after any outstanding DMA writes from that bus, since for some devices + the result of a <function>readb</function> call may signal to the + driver that a DMA transaction is complete. In many cases, however, + the driver may want to indicate that the next + <function>readb</function> call has no relation to any previous DMA + writes performed by the device. The driver can use + <function>readb_relaxed</function> for these cases, although only + some platforms will honor the relaxed semantics. Using the relaxed + read functions will provide significant performance benefits on + platforms that support it. The qla2xxx driver provides examples + of how to use <function>readX_relaxed</function>. In many cases, + a majority of the driver's <function>readX</function> calls can + safely be converted to <function>readX_relaxed</function> calls, since + only a few will indicate or depend on DMA completion. + </para> + </sect1> + + <sect1> + <title>ISA legacy functions</title> + <para> + On older kernels (2.2 and earlier) the ISA bus could be read or + written with these functions and without ioremap being used. This is + no longer true in Linux 2.4. A set of equivalent functions exist for + easy legacy driver porting. The functions available are prefixed + with 'isa_' and are <function>isa_readb</function>, + <function>isa_writeb</function>, <function>isa_readw</function>, + <function>isa_writew</function>, <function>isa_readl</function>, + <function>isa_writel</function>, <function>isa_memcpy_fromio</function> + and <function>isa_memcpy_toio</function> + </para> + <para> + These functions should not be used in new drivers, and will + eventually be going away. + </para> + </sect1> + + </chapter> + + <chapter> + <title>Port Space Accesses</title> + <sect1> + <title>Port Space Explained</title> + + <para> + Another form of IO commonly supported is Port Space. This is a + range of addresses separate to the normal memory address space. + Access to these addresses is generally not as fast as accesses + to the memory mapped addresses, and it also has a potentially + smaller address space. + </para> + + <para> + Unlike memory mapped IO, no preparation is required + to access port space. + </para> + + </sect1> + <sect1> + <title>Accessing Port Space</title> + <para> + Accesses to this space are provided through a set of functions + which allow 8-bit, 16-bit and 32-bit accesses; also + known as byte, word and long. These functions are + <function>inb</function>, <function>inw</function>, + <function>inl</function>, <function>outb</function>, + <function>outw</function> and <function>outl</function>. + </para> + + <para> + Some variants are provided for these functions. Some devices + require that accesses to their ports are slowed down. This + functionality is provided by appending a <function>_p</function> + to the end of the function. There are also equivalents to memcpy. + The <function>ins</function> and <function>outs</function> + functions copy bytes, words or longs to the given port. + </para> + </sect1> + + </chapter> + + <chapter id="pubfunctions"> + <title>Public Functions Provided</title> +!Einclude/asm-i386/io.h + </chapter> + +</book> diff --git a/Documentation/DocBook/gadget.tmpl b/Documentation/DocBook/gadget.tmpl new file mode 100644 index 000000000000..a34442436128 --- /dev/null +++ b/Documentation/DocBook/gadget.tmpl @@ -0,0 +1,752 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="USB-Gadget-API"> + <bookinfo> + <title>USB Gadget API for Linux</title> + <date>20 August 2004</date> + <edition>20 August 2004</edition> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + <copyright> + <year>2003-2004</year> + <holder>David Brownell</holder> + </copyright> + + <author> + <firstname>David</firstname> + <surname>Brownell</surname> + <affiliation> + <address><email>dbrownell@users.sourceforge.net</email></address> + </affiliation> + </author> + </bookinfo> + +<toc></toc> + +<chapter><title>Introduction</title> + +<para>This document presents a Linux-USB "Gadget" +kernel mode +API, for use within peripherals and other USB devices +that embed Linux. +It provides an overview of the API structure, +and shows how that fits into a system development project. +This is the first such API released on Linux to address +a number of important problems, including: </para> + +<itemizedlist> + <listitem><para>Supports USB 2.0, for high speed devices which + can stream data at several dozen megabytes per second. + </para></listitem> + <listitem><para>Handles devices with dozens of endpoints just as + well as ones with just two fixed-function ones. Gadget drivers + can be written so they're easy to port to new hardware. + </para></listitem> + <listitem><para>Flexible enough to expose more complex USB device + capabilities such as multiple configurations, multiple interfaces, + composite devices, + and alternate interface settings. + </para></listitem> + <listitem><para>USB "On-The-Go" (OTG) support, in conjunction + with updates to the Linux-USB host side. + </para></listitem> + <listitem><para>Sharing data structures and API models with the + Linux-USB host side API. This helps the OTG support, and + looks forward to more-symmetric frameworks (where the same + I/O model is used by both host and device side drivers). + </para></listitem> + <listitem><para>Minimalist, so it's easier to support new device + controller hardware. I/O processing doesn't imply large + demands for memory or CPU resources. + </para></listitem> +</itemizedlist> + + +<para>Most Linux developers will not be able to use this API, since they +have USB "host" hardware in a PC, workstation, or server. +Linux users with embedded systems are more likely to +have USB peripheral hardware. +To distinguish drivers running inside such hardware from the +more familiar Linux "USB device drivers", +which are host side proxies for the real USB devices, +a different term is used: +the drivers inside the peripherals are "USB gadget drivers". +In USB protocol interactions, the device driver is the master +(or "client driver") +and the gadget driver is the slave (or "function driver"). +</para> + +<para>The gadget API resembles the host side Linux-USB API in that both +use queues of request objects to package I/O buffers, and those requests +may be submitted or canceled. +They share common definitions for the standard USB +<emphasis>Chapter 9</emphasis> messages, structures, and constants. +Also, both APIs bind and unbind drivers to devices. +The APIs differ in detail, since the host side's current +URB framework exposes a number of implementation details +and assumptions that are inappropriate for a gadget API. +While the model for control transfers and configuration +management is necessarily different (one side is a hardware-neutral master, +the other is a hardware-aware slave), the endpoint I/0 API used here +should also be usable for an overhead-reduced host side API. +</para> + +</chapter> + +<chapter id="structure"><title>Structure of Gadget Drivers</title> + +<para>A system running inside a USB peripheral +normally has at least three layers inside the kernel to handle +USB protocol processing, and may have additional layers in +user space code. +The "gadget" API is used by the middle layer to interact +with the lowest level (which directly handles hardware). +</para> + +<para>In Linux, from the bottom up, these layers are: +</para> + +<variablelist> + + <varlistentry> + <term><emphasis>USB Controller Driver</emphasis></term> + + <listitem> + <para>This is the lowest software level. + It is the only layer that talks to hardware, + through registers, fifos, dma, irqs, and the like. + The <filename><linux/usb_gadget.h></filename> API abstracts + the peripheral controller endpoint hardware. + That hardware is exposed through endpoint objects, which accept + streams of IN/OUT buffers, and through callbacks that interact + with gadget drivers. + Since normal USB devices only have one upstream + port, they only have one of these drivers. + The controller driver can support any number of different + gadget drivers, but only one of them can be used at a time. + </para> + + <para>Examples of such controller hardware include + the PCI-based NetChip 2280 USB 2.0 high speed controller, + the SA-11x0 or PXA-25x UDC (found within many PDAs), + and a variety of other products. + </para> + + </listitem></varlistentry> + + <varlistentry> + <term><emphasis>Gadget Driver</emphasis></term> + + <listitem> + <para>The lower boundary of this driver implements hardware-neutral + USB functions, using calls to the controller driver. + Because such hardware varies widely in capabilities and restrictions, + and is used in embedded environments where space is at a premium, + the gadget driver is often configured at compile time + to work with endpoints supported by one particular controller. + Gadget drivers may be portable to several different controllers, + using conditional compilation. + (Recent kernels substantially simplify the work involved in + supporting new hardware, by <emphasis>autoconfiguring</emphasis> + endpoints automatically for many bulk-oriented drivers.) + Gadget driver responsibilities include: + </para> + <itemizedlist> + <listitem><para>handling setup requests (ep0 protocol responses) + possibly including class-specific functionality + </para></listitem> + <listitem><para>returning configuration and string descriptors + </para></listitem> + <listitem><para>(re)setting configurations and interface + altsettings, including enabling and configuring endpoints + </para></listitem> + <listitem><para>handling life cycle events, such as managing + bindings to hardware, + USB suspend/resume, remote wakeup, + and disconnection from the USB host. + </para></listitem> + <listitem><para>managing IN and OUT transfers on all currently + enabled endpoints + </para></listitem> + </itemizedlist> + + <para> + Such drivers may be modules of proprietary code, although + that approach is discouraged in the Linux community. + </para> + </listitem></varlistentry> + + <varlistentry> + <term><emphasis>Upper Level</emphasis></term> + + <listitem> + <para>Most gadget drivers have an upper boundary that connects + to some Linux driver or framework in Linux. + Through that boundary flows the data which the gadget driver + produces and/or consumes through protocol transfers over USB. + Examples include: + </para> + <itemizedlist> + <listitem><para>user mode code, using generic (gadgetfs) + or application specific files in + <filename>/dev</filename> + </para></listitem> + <listitem><para>networking subsystem (for network gadgets, + like the CDC Ethernet Model gadget driver) + </para></listitem> + <listitem><para>data capture drivers, perhaps video4Linux or + a scanner driver; or test and measurement hardware. + </para></listitem> + <listitem><para>input subsystem (for HID gadgets) + </para></listitem> + <listitem><para>sound subsystem (for audio gadgets) + </para></listitem> + <listitem><para>file system (for PTP gadgets) + </para></listitem> + <listitem><para>block i/o subsystem (for usb-storage gadgets) + </para></listitem> + <listitem><para>... and more </para></listitem> + </itemizedlist> + </listitem></varlistentry> + + <varlistentry> + <term><emphasis>Additional Layers</emphasis></term> + + <listitem> + <para>Other layers may exist. + These could include kernel layers, such as network protocol stacks, + as well as user mode applications building on standard POSIX + system call APIs such as + <emphasis>open()</emphasis>, <emphasis>close()</emphasis>, + <emphasis>read()</emphasis> and <emphasis>write()</emphasis>. + On newer systems, POSIX Async I/O calls may be an option. + Such user mode code will not necessarily be subject to + the GNU General Public License (GPL). + </para> + </listitem></varlistentry> + + +</variablelist> + +<para>OTG-capable systems will also need to include a standard Linux-USB +host side stack, +with <emphasis>usbcore</emphasis>, +one or more <emphasis>Host Controller Drivers</emphasis> (HCDs), +<emphasis>USB Device Drivers</emphasis> to support +the OTG "Targeted Peripheral List", +and so forth. +There will also be an <emphasis>OTG Controller Driver</emphasis>, +which is visible to gadget and device driver developers only indirectly. +That helps the host and device side USB controllers implement the +two new OTG protocols (HNP and SRP). +Roles switch (host to peripheral, or vice versa) using HNP +during USB suspend processing, and SRP can be viewed as a +more battery-friendly kind of device wakeup protocol. +</para> + +<para>Over time, reusable utilities are evolving to help make some +gadget driver tasks simpler. +For example, building configuration descriptors from vectors of +descriptors for the configurations interfaces and endpoints is +now automated, and many drivers now use autoconfiguration to +choose hardware endpoints and initialize their descriptors. + +A potential example of particular interest +is code implementing standard USB-IF protocols for +HID, networking, storage, or audio classes. +Some developers are interested in KDB or KGDB hooks, to let +target hardware be remotely debugged. +Most such USB protocol code doesn't need to be hardware-specific, +any more than network protocols like X11, HTTP, or NFS are. +Such gadget-side interface drivers should eventually be combined, +to implement composite devices. +</para> + +</chapter> + + +<chapter id="api"><title>Kernel Mode Gadget API</title> + +<para>Gadget drivers declare themselves through a +<emphasis>struct usb_gadget_driver</emphasis>, which is responsible for +most parts of enumeration for a <emphasis>struct usb_gadget</emphasis>. +The response to a set_configuration usually involves +enabling one or more of the <emphasis>struct usb_ep</emphasis> objects +exposed by the gadget, and submitting one or more +<emphasis>struct usb_request</emphasis> buffers to transfer data. +Understand those four data types, and their operations, and +you will understand how this API works. +</para> + +<note><title>Incomplete Data Type Descriptions</title> + +<para>This documentation was prepared using the standard Linux +kernel <filename>docproc</filename> tool, which turns text +and in-code comments into SGML DocBook and then into usable +formats such as HTML or PDF. +Other than the "Chapter 9" data types, most of the significant +data types and functions are described here. +</para> + +<para>However, docproc does not understand all the C constructs +that are used, so some relevant information is likely omitted from +what you are reading. +One example of such information is endpoint autoconfiguration. +You'll have to read the header file, and use example source +code (such as that for "Gadget Zero"), to fully understand the API. +</para> + +<para>The part of the API implementing some basic +driver capabilities is specific to the version of the +Linux kernel that's in use. +The 2.6 kernel includes a <emphasis>driver model</emphasis> +framework that has no analogue on earlier kernels; +so those parts of the gadget API are not fully portable. +(They are implemented on 2.4 kernels, but in a different way.) +The driver model state is another part of this API that is +ignored by the kerneldoc tools. +</para> +</note> + +<para>The core API does not expose +every possible hardware feature, only the most widely available ones. +There are significant hardware features, such as device-to-device DMA +(without temporary storage in a memory buffer) +that would be added using hardware-specific APIs. +</para> + +<para>This API allows drivers to use conditional compilation to handle +endpoint capabilities of different hardware, but doesn't require that. +Hardware tends to have arbitrary restrictions, relating to +transfer types, addressing, packet sizes, buffering, and availability. +As a rule, such differences only matter for "endpoint zero" logic +that handles device configuration and management. +The API supports limited run-time +detection of capabilities, through naming conventions for endpoints. +Many drivers will be able to at least partially autoconfigure +themselves. +In particular, driver init sections will often have endpoint +autoconfiguration logic that scans the hardware's list of endpoints +to find ones matching the driver requirements +(relying on those conventions), to eliminate some of the most +common reasons for conditional compilation. +</para> + +<para>Like the Linux-USB host side API, this API exposes +the "chunky" nature of USB messages: I/O requests are in terms +of one or more "packets", and packet boundaries are visible to drivers. +Compared to RS-232 serial protocols, USB resembles +synchronous protocols like HDLC +(N bytes per frame, multipoint addressing, host as the primary +station and devices as secondary stations) +more than asynchronous ones +(tty style: 8 data bits per frame, no parity, one stop bit). +So for example the controller drivers won't buffer +two single byte writes into a single two-byte USB IN packet, +although gadget drivers may do so when they implement +protocols where packet boundaries (and "short packets") +are not significant. +</para> + +<sect1 id="lifecycle"><title>Driver Life Cycle</title> + +<para>Gadget drivers make endpoint I/O requests to hardware without +needing to know many details of the hardware, but driver +setup/configuration code needs to handle some differences. +Use the API like this: +</para> + +<orderedlist numeration='arabic'> + +<listitem><para>Register a driver for the particular device side +usb controller hardware, +such as the net2280 on PCI (USB 2.0), +sa11x0 or pxa25x as found in Linux PDAs, +and so on. +At this point the device is logically in the USB ch9 initial state +("attached"), drawing no power and not usable +(since it does not yet support enumeration). +Any host should not see the device, since it's not +activated the data line pullup used by the host to +detect a device, even if VBUS power is available. +</para></listitem> + +<listitem><para>Register a gadget driver that implements some higher level +device function. That will then bind() to a usb_gadget, which +activates the data line pullup sometime after detecting VBUS. +</para></listitem> + +<listitem><para>The hardware driver can now start enumerating. +The steps it handles are to accept USB power and set_address requests. +Other steps are handled by the gadget driver. +If the gadget driver module is unloaded before the host starts to +enumerate, steps before step 7 are skipped. +</para></listitem> + +<listitem><para>The gadget driver's setup() call returns usb descriptors, +based both on what the bus interface hardware provides and on the +functionality being implemented. +That can involve alternate settings or configurations, +unless the hardware prevents such operation. +For OTG devices, each configuration descriptor includes +an OTG descriptor. +</para></listitem> + +<listitem><para>The gadget driver handles the last step of enumeration, +when the USB host issues a set_configuration call. +It enables all endpoints used in that configuration, +with all interfaces in their default settings. +That involves using a list of the hardware's endpoints, enabling each +endpoint according to its descriptor. +It may also involve using <function>usb_gadget_vbus_draw</function> +to let more power be drawn from VBUS, as allowed by that configuration. +For OTG devices, setting a configuration may also involve reporting +HNP capabilities through a user interface. +</para></listitem> + +<listitem><para>Do real work and perform data transfers, possibly involving +changes to interface settings or switching to new configurations, until the +device is disconnect()ed from the host. +Queue any number of transfer requests to each endpoint. +It may be suspended and resumed several times before being disconnected. +On disconnect, the drivers go back to step 3 (above). +</para></listitem> + +<listitem><para>When the gadget driver module is being unloaded, +the driver unbind() callback is issued. That lets the controller +driver be unloaded. +</para></listitem> + +</orderedlist> + +<para>Drivers will normally be arranged so that just loading the +gadget driver module (or statically linking it into a Linux kernel) +allows the peripheral device to be enumerated, but some drivers +will defer enumeration until some higher level component (like +a user mode daemon) enables it. +Note that at this lowest level there are no policies about how +ep0 configuration logic is implemented, +except that it should obey USB specifications. +Such issues are in the domain of gadget drivers, +including knowing about implementation constraints +imposed by some USB controllers +or understanding that composite devices might happen to +be built by integrating reusable components. +</para> + +<para>Note that the lifecycle above can be slightly different +for OTG devices. +Other than providing an additional OTG descriptor in each +configuration, only the HNP-related differences are particularly +visible to driver code. +They involve reporting requirements during the SET_CONFIGURATION +request, and the option to invoke HNP during some suspend callbacks. +Also, SRP changes the semantics of +<function>usb_gadget_wakeup</function> +slightly. +</para> + +</sect1> + +<sect1 id="ch9"><title>USB 2.0 Chapter 9 Types and Constants</title> + +<para>Gadget drivers +rely on common USB structures and constants +defined in the +<filename><linux/usb_ch9.h></filename> +header file, which is standard in Linux 2.6 kernels. +These are the same types and constants used by host +side drivers (and usbcore). +</para> + +!Iinclude/linux/usb_ch9.h +</sect1> + +<sect1 id="core"><title>Core Objects and Methods</title> + +<para>These are declared in +<filename><linux/usb_gadget.h></filename>, +and are used by gadget drivers to interact with +USB peripheral controller drivers. +</para> + + <!-- yeech, this is ugly in nsgmls PDF output. + + the PDF bookmark and refentry output nesting is wrong, + and the member/argument documentation indents ugly. + + plus something (docproc?) adds whitespace before the + descriptive paragraph text, so it can't line up right + unless the explanations are trivial. + --> + +!Iinclude/linux/usb_gadget.h +</sect1> + +<sect1 id="utils"><title>Optional Utilities</title> + +<para>The core API is sufficient for writing a USB Gadget Driver, +but some optional utilities are provided to simplify common tasks. +These utilities include endpoint autoconfiguration. +</para> + +!Edrivers/usb/gadget/usbstring.c +!Edrivers/usb/gadget/config.c +<!-- !Edrivers/usb/gadget/epautoconf.c --> +</sect1> + +</chapter> + +<chapter id="controllers"><title>Peripheral Controller Drivers</title> + +<para>The first hardware supporting this API was the NetChip 2280 +controller, which supports USB 2.0 high speed and is based on PCI. +This is the <filename>net2280</filename> driver module. +The driver supports Linux kernel versions 2.4 and 2.6; +contact NetChip Technologies for development boards and product +information. +</para> + +<para>Other hardware working in the "gadget" framework includes: +Intel's PXA 25x and IXP42x series processors +(<filename>pxa2xx_udc</filename>), +Toshiba TC86c001 "Goku-S" (<filename>goku_udc</filename>), +Renesas SH7705/7727 (<filename>sh_udc</filename>), +MediaQ 11xx (<filename>mq11xx_udc</filename>), +Hynix HMS30C7202 (<filename>h7202_udc</filename>), +National 9303/4 (<filename>n9604_udc</filename>), +Texas Instruments OMAP (<filename>omap_udc</filename>), +Sharp LH7A40x (<filename>lh7a40x_udc</filename>), +and more. +Most of those are full speed controllers. +</para> + +<para>At this writing, there are people at work on drivers in +this framework for several other USB device controllers, +with plans to make many of them be widely available. +</para> + +<!-- !Edrivers/usb/gadget/net2280.c --> + +<para>A partial USB simulator, +the <filename>dummy_hcd</filename> driver, is available. +It can act like a net2280, a pxa25x, or an sa11x0 in terms +of available endpoints and device speeds; and it simulates +control, bulk, and to some extent interrupt transfers. +That lets you develop some parts of a gadget driver on a normal PC, +without any special hardware, and perhaps with the assistance +of tools such as GDB running with User Mode Linux. +At least one person has expressed interest in adapting that +approach, hooking it up to a simulator for a microcontroller. +Such simulators can help debug subsystems where the runtime hardware +is unfriendly to software development, or is not yet available. +</para> + +<para>Support for other controllers is expected to be developed +and contributed +over time, as this driver framework evolves. +</para> + +</chapter> + +<chapter id="gadget"><title>Gadget Drivers</title> + +<para>In addition to <emphasis>Gadget Zero</emphasis> +(used primarily for testing and development with drivers +for usb controller hardware), other gadget drivers exist. +</para> + +<para>There's an <emphasis>ethernet</emphasis> gadget +driver, which implements one of the most useful +<emphasis>Communications Device Class</emphasis> (CDC) models. +One of the standards for cable modem interoperability even +specifies the use of this ethernet model as one of two +mandatory options. +Gadgets using this code look to a USB host as if they're +an Ethernet adapter. +It provides access to a network where the gadget's CPU is one host, +which could easily be bridging, routing, or firewalling +access to other networks. +Since some hardware can't fully implement the CDC Ethernet +requirements, this driver also implements a "good parts only" +subset of CDC Ethernet. +(That subset doesn't advertise itself as CDC Ethernet, +to avoid creating problems.) +</para> + +<para>Support for Microsoft's <emphasis>RNDIS</emphasis> +protocol has been contributed by Pengutronix and Auerswald GmbH. +This is like CDC Ethernet, but it runs on more slightly USB hardware +(but less than the CDC subset). +However, its main claim to fame is being able to connect directly to +recent versions of Windows, using drivers that Microsoft bundles +and supports, making it much simpler to network with Windows. +</para> + +<para>There is also support for user mode gadget drivers, +using <emphasis>gadgetfs</emphasis>. +This provides a <emphasis>User Mode API</emphasis> that presents +each endpoint as a single file descriptor. I/O is done using +normal <emphasis>read()</emphasis> and <emphasis>read()</emphasis> calls. +Familiar tools like GDB and pthreads can be used to +develop and debug user mode drivers, so that once a robust +controller driver is available many applications for it +won't require new kernel mode software. +Linux 2.6 <emphasis>Async I/O (AIO)</emphasis> +support is available, so that user mode software +can stream data with only slightly more overhead +than a kernel driver. +</para> + +<para>There's a USB Mass Storage class driver, which provides +a different solution for interoperability with systems such +as MS-Windows and MacOS. +That <emphasis>File-backed Storage</emphasis> driver uses a +file or block device as backing store for a drive, +like the <filename>loop</filename> driver. +The USB host uses the BBB, CB, or CBI versions of the mass +storage class specification, using transparent SCSI commands +to access the data from the backing store. +</para> + +<para>There's a "serial line" driver, useful for TTY style +operation over USB. +The latest version of that driver supports CDC ACM style +operation, like a USB modem, and so on most hardware it can +interoperate easily with MS-Windows. +One interesting use of that driver is in boot firmware (like a BIOS), +which can sometimes use that model with very small systems without +real serial lines. +</para> + +<para>Support for other kinds of gadget is expected to +be developed and contributed +over time, as this driver framework evolves. +</para> + +</chapter> + +<chapter id="otg"><title>USB On-The-GO (OTG)</title> + +<para>USB OTG support on Linux 2.6 was initially developed +by Texas Instruments for +<ulink url="http://www.omap.com">OMAP</ulink> 16xx and 17xx +series processors. +Other OTG systems should work in similar ways, but the +hardware level details could be very different. +</para> + +<para>Systems need specialized hardware support to implement OTG, +notably including a special <emphasis>Mini-AB</emphasis> jack +and associated transciever to support <emphasis>Dual-Role</emphasis> +operation: +they can act either as a host, using the standard +Linux-USB host side driver stack, +or as a peripheral, using this "gadget" framework. +To do that, the system software relies on small additions +to those programming interfaces, +and on a new internal component (here called an "OTG Controller") +affecting which driver stack connects to the OTG port. +In each role, the system can re-use the existing pool of +hardware-neutral drivers, layered on top of the controller +driver interfaces (<emphasis>usb_bus</emphasis> or +<emphasis>usb_gadget</emphasis>). +Such drivers need at most minor changes, and most of the calls +added to support OTG can also benefit non-OTG products. +</para> + +<itemizedlist> + <listitem><para>Gadget drivers test the <emphasis>is_otg</emphasis> + flag, and use it to determine whether or not to include + an OTG descriptor in each of their configurations. + </para></listitem> + <listitem><para>Gadget drivers may need changes to support the + two new OTG protocols, exposed in new gadget attributes + such as <emphasis>b_hnp_enable</emphasis> flag. + HNP support should be reported through a user interface + (two LEDs could suffice), and is triggered in some cases + when the host suspends the peripheral. + SRP support can be user-initiated just like remote wakeup, + probably by pressing the same button. + </para></listitem> + <listitem><para>On the host side, USB device drivers need + to be taught to trigger HNP at appropriate moments, using + <function>usb_suspend_device()</function>. + That also conserves battery power, which is useful even + for non-OTG configurations. + </para></listitem> + <listitem><para>Also on the host side, a driver must support the + OTG "Targeted Peripheral List". That's just a whitelist, + used to reject peripherals not supported with a given + Linux OTG host. + <emphasis>This whitelist is product-specific; + each product must modify <filename>otg_whitelist.h</filename> + to match its interoperability specification. + </emphasis> + </para> + <para>Non-OTG Linux hosts, like PCs and workstations, + normally have some solution for adding drivers, so that + peripherals that aren't recognized can eventually be supported. + That approach is unreasonable for consumer products that may + never have their firmware upgraded, and where it's usually + unrealistic to expect traditional PC/workstation/server kinds + of support model to work. + For example, it's often impractical to change device firmware + once the product has been distributed, so driver bugs can't + normally be fixed if they're found after shipment. + </para></listitem> +</itemizedlist> + +<para> +Additional changes are needed below those hardware-neutral +<emphasis>usb_bus</emphasis> and <emphasis>usb_gadget</emphasis> +driver interfaces; those aren't discussed here in any detail. +Those affect the hardware-specific code for each USB Host or Peripheral +controller, and how the HCD initializes (since OTG can be active only +on a single port). +They also involve what may be called an <emphasis>OTG Controller +Driver</emphasis>, managing the OTG transceiver and the OTG state +machine logic as well as much of the root hub behavior for the +OTG port. +The OTG controller driver needs to activate and deactivate USB +controllers depending on the relevant device role. +Some related changes were needed inside usbcore, so that it +can identify OTG-capable devices and respond appropriately +to HNP or SRP protocols. +</para> + +</chapter> + +</book> +<!-- + vim:syntax=sgml:sw=4 +--> diff --git a/Documentation/DocBook/journal-api.tmpl b/Documentation/DocBook/journal-api.tmpl new file mode 100644 index 000000000000..1ef6f43c6d8f --- /dev/null +++ b/Documentation/DocBook/journal-api.tmpl @@ -0,0 +1,333 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="LinuxJBDAPI"> + <bookinfo> + <title>The Linux Journalling API</title> + <authorgroup> + <author> + <firstname>Roger</firstname> + <surname>Gammans</surname> + <affiliation> + <address> + <email>rgammans@computer-surgery.co.uk</email> + </address> + </affiliation> + </author> + </authorgroup> + + <authorgroup> + <author> + <firstname>Stephen</firstname> + <surname>Tweedie</surname> + <affiliation> + <address> + <email>sct@redhat.com</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2002</year> + <holder>Roger Gammans</holder> + </copyright> + +<legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="Overview"> + <title>Overview</title> + <sect1> + <title>Details</title> +<para> +The journalling layer is easy to use. You need to +first of all create a journal_t data structure. There are +two calls to do this dependent on how you decide to allocate the physical +media on which the journal resides. The journal_init_inode() call +is for journals stored in filesystem inodes, or the journal_init_dev() +call can be use for journal stored on a raw device (in a continuous range +of blocks). A journal_t is a typedef for a struct pointer, so when +you are finally finished make sure you call journal_destroy() on it +to free up any used kernel memory. +</para> + +<para> +Once you have got your journal_t object you need to 'mount' or load the journal +file, unless of course you haven't initialised it yet - in which case you +need to call journal_create(). +</para> + +<para> +Most of the time however your journal file will already have been created, but +before you load it you must call journal_wipe() to empty the journal file. +Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the +job of the client file system to detect this and skip the call to journal_wipe(). +</para> + +<para> +In either case the next call should be to journal_load() which prepares the +journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery() +for you if it detects any outstanding transactions in the journal and similarly +journal_load() will call journal_recover() if necessary. +I would advise reading fs/ext3/super.c for examples on this stage. +[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly +complicate the API. Or isn't a good idea for the journal layer to hide +dirty mounts from the client fs] +</para> + +<para> +Now you can go ahead and start modifying the underlying +filesystem. Almost. +</para> + + +<para> + +You still need to actually journal your filesystem changes, this +is done by wrapping them into transactions. Additionally you +also need to wrap the modification of each of the the buffers +with calls to the journal layer, so it knows what the modifications +you are actually making are. To do this use journal_start() which +returns a transaction handle. +</para> + +<para> +journal_start() +and its counterpart journal_stop(), which indicates the end of a transaction +are nestable calls, so you can reenter a transaction if necessary, +but remember you must call journal_stop() the same number of times as +journal_start() before the transaction is completed (or more accurately +leaves the the update phase). Ext3/VFS makes use of this feature to simplify +quota support. +</para> + +<para> +Inside each transaction you need to wrap the modifications to the +individual buffers (blocks). Before you start to modify a buffer you +need to call journal_get_{create,write,undo}_access() as appropriate, +this allows the journalling layer to copy the unmodified data if it +needs to. After all the buffer may be part of a previously uncommitted +transaction. +At this point you are at last ready to modify a buffer, and once +you are have done so you need to call journal_dirty_{meta,}data(). +Or if you've asked for access to a buffer you now know is now longer +required to be pushed back on the device you can call journal_forget() +in much the same way as you might have used bforget() in the past. +</para> + +<para> +A journal_flush() may be called at any time to commit and checkpoint +all your transactions. +</para> + +<para> +Then at umount time , in your put_super() (2.4) or write_super() (2.5) +you can then call journal_destroy() to clean up your in-core journal object. +</para> + + +<para> +Unfortunately there a couple of ways the journal layer can cause a deadlock. +The first thing to note is that each task can only have +a single outstanding transaction at any one time, remember nothing +commits until the outermost journal_stop(). This means +you must complete the transaction at the end of each file/inode/address +etc. operation you perform, so that the journalling system isn't re-entered +on another journal. Since transactions can't be nested/batched +across differing journals, and another filesystem other than +yours (say ext3) may be modified in a later syscall. +</para> + +<para> +The second case to bear in mind is that journal_start() can +block if there isn't enough space in the journal for your transaction +(based on the passed nblocks param) - when it blocks it merely(!) needs to +wait for transactions to complete and be committed from other tasks, +so essentially we are waiting for journal_stop(). So to avoid +deadlocks you must treat journal_start/stop() as if they +were semaphores and include them in your semaphore ordering rules to prevent +deadlocks. Note that journal_extend() has similar blocking behaviour to +journal_start() so you can deadlock here just as easily as on journal_start(). +</para> + +<para> +Try to reserve the right number of blocks the first time. ;-). This will +be the maximum number of blocks you are going to touch in this transaction. +I advise having a look at at least ext3_jbd.h to see the basis on which +ext3 uses to make these decisions. +</para> + +<para> +Another wriggle to watch out for is your on-disk block allocation strategy. +why? Because, if you undo a delete, you need to ensure you haven't reused any +of the freed blocks in a later transaction. One simple way of doing this +is make sure any blocks you allocate only have checkpointed transactions +listed against them. Ext3 does this in ext3_test_allocatable(). +</para> + +<para> +Lock is also providing through journal_{un,}lock_updates(), +ext3 uses this when it wants a window with a clean and stable fs for a moment. +eg. +</para> + +<programlisting> + + journal_lock_updates() //stop new stuff happening.. + journal_flush() // checkpoint everything. + ..do stuff on stable fs + journal_unlock_updates() // carry on with filesystem use. +</programlisting> + +<para> +The opportunities for abuse and DOS attacks with this should be obvious, +if you allow unprivileged userspace to trigger codepaths containing these +calls. +</para> + +<para> +A new feature of jbd since 2.5.25 is commit callbacks with the new +journal_callback_set() function you can now ask the journalling layer +to call you back when the transaction is finally committed to disk, so that +you can do some of your own management. The key to this is the journal_callback +struct, this maintains the internal callback information but you can +extend it like this:- +</para> +<programlisting> + struct myfs_callback_s { + //Data structure element required by jbd.. + struct journal_callback for_jbd; + // Stuff for myfs allocated together. + myfs_inode* i_commited; + + } +</programlisting> + +<para> +this would be useful if you needed to know when data was committed to a +particular inode. +</para> + +</sect1> + +<sect1> +<title>Summary</title> +<para> +Using the journal is a matter of wrapping the different context changes, +being each mount, each modification (transaction) and each changed buffer +to tell the journalling layer about them. +</para> + +<para> +Here is a some pseudo code to give you an idea of how it works, as +an example. +</para> + +<programlisting> + journal_t* my_jnrl = journal_create(); + journal_init_{dev,inode}(jnrl,...) + if (clean) journal_wipe(); + journal_load(); + + foreach(transaction) { /*transactions must be + completed before + a syscall returns to + userspace*/ + + handle_t * xct=journal_start(my_jnrl); + foreach(bh) { + journal_get_{create,write,undo}_access(xact,bh); + if ( myfs_modify(bh) ) { /* returns true + if makes changes */ + journal_dirty_{meta,}data(xact,bh); + } else { + journal_forget(bh); + } + } + journal_stop(xct); + } + journal_destroy(my_jrnl); +</programlisting> +</sect1> + +</chapter> + + <chapter id="adt"> + <title>Data Types</title> + <para> + The journalling layer uses typedefs to 'hide' the concrete definitions + of the structures used. As a client of the JBD layer you can + just rely on the using the pointer as a magic cookie of some sort. + + Obviously the hiding is not enforced as this is 'C'. + </para> + <sect1><title>Structures</title> +!Iinclude/linux/jbd.h + </sect1> +</chapter> + + <chapter id="calls"> + <title>Functions</title> + <para> + The functions here are split into two groups those that + affect a journal as a whole, and those which are used to + manage transactions +</para> + <sect1><title>Journal Level</title> +!Efs/jbd/journal.c +!Efs/jbd/recovery.c + </sect1> + <sect1><title>Transasction Level</title> +!Efs/jbd/transaction.c + </sect1> +</chapter> +<chapter> + <title>See also</title> + <para> + <citation> + <ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/journal-design.ps.gz"> + Journaling the Linux ext2fs Filesystem,LinuxExpo 98, Stephen Tweedie + </ulink> + </citation> + </para> + <para> + <citation> + <ulink url="http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html"> + Ext3 Journalling FileSystem , OLS 2000, Dr. Stephen Tweedie + </ulink> + </citation> + </para> +</chapter> + +</book> diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl new file mode 100644 index 000000000000..1bd20c860285 --- /dev/null +++ b/Documentation/DocBook/kernel-api.tmpl @@ -0,0 +1,342 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="LinuxKernelAPI"> + <bookinfo> + <title>The Linux Kernel API</title> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="Basics"> + <title>Driver Basics</title> + <sect1><title>Driver Entry and Exit points</title> +!Iinclude/linux/init.h + </sect1> + + <sect1><title>Atomic and pointer manipulation</title> +!Iinclude/asm-i386/atomic.h +!Iinclude/asm-i386/unaligned.h + </sect1> + +<!-- FIXME: + kernel/sched.c has no docs, which stuffs up the sgml. Comment + out until somebody adds docs. KAO + <sect1><title>Delaying, scheduling, and timer routines</title> +X!Ekernel/sched.c + </sect1> +KAO --> + </chapter> + + <chapter id="adt"> + <title>Data Types</title> + <sect1><title>Doubly Linked Lists</title> +!Iinclude/linux/list.h + </sect1> + </chapter> + + <chapter id="libc"> + <title>Basic C Library Functions</title> + + <para> + When writing drivers, you cannot in general use routines which are + from the C Library. Some of the functions have been found generally + useful and they are listed below. The behaviour of these functions + may vary slightly from those defined by ANSI, and these deviations + are noted in the text. + </para> + + <sect1><title>String Conversions</title> +!Ilib/vsprintf.c +!Elib/vsprintf.c + </sect1> + <sect1><title>String Manipulation</title> +!Ilib/string.c +!Elib/string.c + </sect1> + <sect1><title>Bit Operations</title> +!Iinclude/asm-i386/bitops.h + </sect1> + </chapter> + + <chapter id="mm"> + <title>Memory Management in Linux</title> + <sect1><title>The Slab Cache</title> +!Emm/slab.c + </sect1> + <sect1><title>User Space Memory Access</title> +!Iinclude/asm-i386/uaccess.h +!Iarch/i386/lib/usercopy.c + </sect1> + </chapter> + + <chapter id="kfifo"> + <title>FIFO Buffer</title> + <sect1><title>kfifo interface</title> +!Iinclude/linux/kfifo.h +!Ekernel/kfifo.c + </sect1> + </chapter> + + <chapter id="proc"> + <title>The proc filesystem</title> + + <sect1><title>sysctl interface</title> +!Ekernel/sysctl.c + </sect1> + </chapter> + + <chapter id="debugfs"> + <title>The debugfs filesystem</title> + + <sect1><title>debugfs interface</title> +!Efs/debugfs/inode.c +!Efs/debugfs/file.c + </sect1> + </chapter> + + <chapter id="vfs"> + <title>The Linux VFS</title> + <sect1><title>The Directory Cache</title> +!Efs/dcache.c +!Iinclude/linux/dcache.h + </sect1> + <sect1><title>Inode Handling</title> +!Efs/inode.c +!Efs/bad_inode.c + </sect1> + <sect1><title>Registration and Superblocks</title> +!Efs/super.c + </sect1> + <sect1><title>File Locks</title> +!Efs/locks.c +!Ifs/locks.c + </sect1> + </chapter> + + <chapter id="netcore"> + <title>Linux Networking</title> + <sect1><title>Socket Buffer Functions</title> +!Iinclude/linux/skbuff.h +!Enet/core/skbuff.c + </sect1> + <sect1><title>Socket Filter</title> +!Enet/core/filter.c + </sect1> + <sect1><title>Generic Network Statistics</title> +!Iinclude/linux/gen_stats.h +!Enet/core/gen_stats.c +!Enet/core/gen_estimator.c + </sect1> + </chapter> + + <chapter id="netdev"> + <title>Network device support</title> + <sect1><title>Driver Support</title> +!Enet/core/dev.c + </sect1> + <sect1><title>8390 Based Network Cards</title> +!Edrivers/net/8390.c + </sect1> + <sect1><title>Synchronous PPP</title> +!Edrivers/net/wan/syncppp.c + </sect1> + </chapter> + + <chapter id="modload"> + <title>Module Support</title> + <sect1><title>Module Loading</title> +!Ekernel/kmod.c + </sect1> + <sect1><title>Inter Module support</title> + <para> + Refer to the file kernel/module.c for more information. + </para> +<!-- FIXME: Removed for now since no structured comments in source +X!Ekernel/module.c +--> + </sect1> + </chapter> + + <chapter id="hardware"> + <title>Hardware Interfaces</title> + <sect1><title>Interrupt Handling</title> +!Iarch/i386/kernel/irq.c + </sect1> + + <sect1><title>MTRR Handling</title> +!Earch/i386/kernel/cpu/mtrr/main.c + </sect1> + <sect1><title>PCI Support Library</title> +!Edrivers/pci/pci.c + </sect1> + <sect1><title>PCI Hotplug Support Library</title> +!Edrivers/pci/hotplug/pci_hotplug_core.c + </sect1> + <sect1><title>MCA Architecture</title> + <sect2><title>MCA Device Functions</title> + <para> + Refer to the file arch/i386/kernel/mca.c for more information. + </para> +<!-- FIXME: Removed for now since no structured comments in source +X!Earch/i386/kernel/mca.c +--> + </sect2> + <sect2><title>MCA Bus DMA</title> +!Iinclude/asm-i386/mca_dma.h + </sect2> + </sect1> + </chapter> + + <chapter id="devfs"> + <title>The Device File System</title> +!Efs/devfs/base.c + </chapter> + + <chapter id="security"> + <title>Security Framework</title> +!Esecurity/security.c + </chapter> + + <chapter id="pmfuncs"> + <title>Power Management</title> +!Ekernel/power/pm.c + </chapter> + + <chapter id="blkdev"> + <title>Block Devices</title> +!Edrivers/block/ll_rw_blk.c + </chapter> + + <chapter id="miscdev"> + <title>Miscellaneous Devices</title> +!Edrivers/char/misc.c + </chapter> + + <chapter id="viddev"> + <title>Video4Linux</title> +!Edrivers/media/video/videodev.c + </chapter> + + <chapter id="snddev"> + <title>Sound Devices</title> +!Esound/sound_core.c +<!-- FIXME: Removed for now since no structured comments in source +X!Isound/sound_firmware.c +--> + </chapter> + + <chapter id="uart16x50"> + <title>16x50 UART Driver</title> +!Edrivers/serial/serial_core.c +!Edrivers/serial/8250.c + </chapter> + + <chapter id="z85230"> + <title>Z85230 Support Library</title> +!Edrivers/net/wan/z85230.c + </chapter> + + <chapter id="fbdev"> + <title>Frame Buffer Library</title> + + <para> + The frame buffer drivers depend heavily on four data structures. + These structures are declared in include/linux/fb.h. They are + fb_info, fb_var_screeninfo, fb_fix_screeninfo and fb_monospecs. + The last three can be made available to and from userland. + </para> + + <para> + fb_info defines the current state of a particular video card. + Inside fb_info, there exists a fb_ops structure which is a + collection of needed functions to make fbdev and fbcon work. + fb_info is only visible to the kernel. + </para> + + <para> + fb_var_screeninfo is used to describe the features of a video card + that are user defined. With fb_var_screeninfo, things such as + depth and the resolution may be defined. + </para> + + <para> + The next structure is fb_fix_screeninfo. This defines the + properties of a card that are created when a mode is set and can't + be changed otherwise. A good example of this is the start of the + frame buffer memory. This "locks" the address of the frame buffer + memory, so that it cannot be changed or moved. + </para> + + <para> + The last structure is fb_monospecs. In the old API, there was + little importance for fb_monospecs. This allowed for forbidden things + such as setting a mode of 800x600 on a fix frequency monitor. With + the new API, fb_monospecs prevents such things, and if used + correctly, can prevent a monitor from being cooked. fb_monospecs + will not be useful until kernels 2.5.x. + </para> + + <sect1><title>Frame Buffer Memory</title> +!Edrivers/video/fbmem.c + </sect1> + <sect1><title>Frame Buffer Console</title> +!Edrivers/video/console/fbcon.c + </sect1> + <sect1><title>Frame Buffer Colormap</title> +!Edrivers/video/fbcmap.c + </sect1> +<!-- FIXME: + drivers/video/fbgen.c has no docs, which stuffs up the sgml. Comment + out until somebody adds docs. KAO + <sect1><title>Frame Buffer Generic Functions</title> +X!Idrivers/video/fbgen.c + </sect1> +KAO --> + <sect1><title>Frame Buffer Video Mode Database</title> +!Idrivers/video/modedb.c +!Edrivers/video/modedb.c + </sect1> + <sect1><title>Frame Buffer Macintosh Video Mode Database</title> +!Idrivers/video/macmodes.c + </sect1> + <sect1><title>Frame Buffer Fonts</title> + <para> + Refer to the file drivers/video/console/fonts.c for more information. + </para> +<!-- FIXME: Removed for now since no structured comments in source +X!Idrivers/video/console/fonts.c +--> + </sect1> + </chapter> +</book> diff --git a/Documentation/DocBook/kernel-hacking.tmpl b/Documentation/DocBook/kernel-hacking.tmpl new file mode 100644 index 000000000000..49a9ef82d575 --- /dev/null +++ b/Documentation/DocBook/kernel-hacking.tmpl @@ -0,0 +1,1349 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="lk-hacking-guide"> + <bookinfo> + <title>Unreliable Guide To Hacking The Linux Kernel</title> + + <authorgroup> + <author> + <firstname>Paul</firstname> + <othername>Rusty</othername> + <surname>Russell</surname> + <affiliation> + <address> + <email>rusty@rustcorp.com.au</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2001</year> + <holder>Rusty Russell</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + + <releaseinfo> + This is the first release of this document as part of the kernel tarball. + </releaseinfo> + + </bookinfo> + + <toc></toc> + + <chapter id="introduction"> + <title>Introduction</title> + <para> + Welcome, gentle reader, to Rusty's Unreliable Guide to Linux + Kernel Hacking. This document describes the common routines and + general requirements for kernel code: its goal is to serve as a + primer for Linux kernel development for experienced C + programmers. I avoid implementation details: that's what the + code is for, and I ignore whole tracts of useful routines. + </para> + <para> + Before you read this, please understand that I never wanted to + write this document, being grossly under-qualified, but I always + wanted to read it, and this was the only way. I hope it will + grow into a compendium of best practice, common starting points + and random information. + </para> + </chapter> + + <chapter id="basic-players"> + <title>The Players</title> + + <para> + At any time each of the CPUs in a system can be: + </para> + + <itemizedlist> + <listitem> + <para> + not associated with any process, serving a hardware interrupt; + </para> + </listitem> + + <listitem> + <para> + not associated with any process, serving a softirq, tasklet or bh; + </para> + </listitem> + + <listitem> + <para> + running in kernel space, associated with a process; + </para> + </listitem> + + <listitem> + <para> + running a process in user space. + </para> + </listitem> + </itemizedlist> + + <para> + There is a strict ordering between these: other than the last + category (userspace) each can only be pre-empted by those above. + For example, while a softirq is running on a CPU, no other + softirq will pre-empt it, but a hardware interrupt can. However, + any other CPUs in the system execute independently. + </para> + + <para> + We'll see a number of ways that the user context can block + interrupts, to become truly non-preemptable. + </para> + + <sect1 id="basics-usercontext"> + <title>User Context</title> + + <para> + User context is when you are coming in from a system call or + other trap: you can sleep, and you own the CPU (except for + interrupts) until you call <function>schedule()</function>. + In other words, user context (unlike userspace) is not pre-emptable. + </para> + + <note> + <para> + You are always in user context on module load and unload, + and on operations on the block device layer. + </para> + </note> + + <para> + In user context, the <varname>current</varname> pointer (indicating + the task we are currently executing) is valid, and + <function>in_interrupt()</function> + (<filename>include/linux/interrupt.h</filename>) is <returnvalue>false + </returnvalue>. + </para> + + <caution> + <para> + Beware that if you have interrupts or bottom halves disabled + (see below), <function>in_interrupt()</function> will return a + false positive. + </para> + </caution> + </sect1> + + <sect1 id="basics-hardirqs"> + <title>Hardware Interrupts (Hard IRQs)</title> + + <para> + Timer ticks, <hardware>network cards</hardware> and + <hardware>keyboard</hardware> are examples of real + hardware which produce interrupts at any time. The kernel runs + interrupt handlers, which services the hardware. The kernel + guarantees that this handler is never re-entered: if another + interrupt arrives, it is queued (or dropped). Because it + disables interrupts, this handler has to be fast: frequently it + simply acknowledges the interrupt, marks a `software interrupt' + for execution and exits. + </para> + + <para> + You can tell you are in a hardware interrupt, because + <function>in_irq()</function> returns <returnvalue>true</returnvalue>. + </para> + <caution> + <para> + Beware that this will return a false positive if interrupts are disabled + (see below). + </para> + </caution> + </sect1> + + <sect1 id="basics-softirqs"> + <title>Software Interrupt Context: Bottom Halves, Tasklets, softirqs</title> + + <para> + Whenever a system call is about to return to userspace, or a + hardware interrupt handler exits, any `software interrupts' + which are marked pending (usually by hardware interrupts) are + run (<filename>kernel/softirq.c</filename>). + </para> + + <para> + Much of the real interrupt handling work is done here. Early in + the transition to <acronym>SMP</acronym>, there were only `bottom + halves' (BHs), which didn't take advantage of multiple CPUs. Shortly + after we switched from wind-up computers made of match-sticks and snot, + we abandoned this limitation. + </para> + + <para> + <filename class="headerfile">include/linux/interrupt.h</filename> lists the + different BH's. No matter how many CPUs you have, no two BHs will run at + the same time. This made the transition to SMP simpler, but sucks hard for + scalable performance. A very important bottom half is the timer + BH (<filename class="headerfile">include/linux/timer.h</filename>): you + can register to have it call functions for you in a given length of time. + </para> + + <para> + 2.3.43 introduced softirqs, and re-implemented the (now + deprecated) BHs underneath them. Softirqs are fully-SMP + versions of BHs: they can run on as many CPUs at once as + required. This means they need to deal with any races in shared + data using their own locks. A bitmask is used to keep track of + which are enabled, so the 32 available softirqs should not be + used up lightly. (<emphasis>Yes</emphasis>, people will + notice). + </para> + + <para> + tasklets (<filename class="headerfile">include/linux/interrupt.h</filename>) + are like softirqs, except they are dynamically-registrable (meaning you + can have as many as you want), and they also guarantee that any tasklet + will only run on one CPU at any time, although different tasklets can + run simultaneously (unlike different BHs). + </para> + <caution> + <para> + The name `tasklet' is misleading: they have nothing to do with `tasks', + and probably more to do with some bad vodka Alexey Kuznetsov had at the + time. + </para> + </caution> + + <para> + You can tell you are in a softirq (or bottom half, or tasklet) + using the <function>in_softirq()</function> macro + (<filename class="headerfile">include/linux/interrupt.h</filename>). + </para> + <caution> + <para> + Beware that this will return a false positive if a bh lock (see below) + is held. + </para> + </caution> + </sect1> + </chapter> + + <chapter id="basic-rules"> + <title>Some Basic Rules</title> + + <variablelist> + <varlistentry> + <term>No memory protection</term> + <listitem> + <para> + If you corrupt memory, whether in user context or + interrupt context, the whole machine will crash. Are you + sure you can't do what you want in userspace? + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>No floating point or <acronym>MMX</acronym></term> + <listitem> + <para> + The <acronym>FPU</acronym> context is not saved; even in user + context the <acronym>FPU</acronym> state probably won't + correspond with the current process: you would mess with some + user process' <acronym>FPU</acronym> state. If you really want + to do this, you would have to explicitly save/restore the full + <acronym>FPU</acronym> state (and avoid context switches). It + is generally a bad idea; use fixed point arithmetic first. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>A rigid stack limit</term> + <listitem> + <para> + The kernel stack is about 6K in 2.2 (for most + architectures: it's about 14K on the Alpha), and shared + with interrupts so you can't use it all. Avoid deep + recursion and huge local arrays on the stack (allocate + them dynamically instead). + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>The Linux kernel is portable</term> + <listitem> + <para> + Let's keep it that way. Your code should be 64-bit clean, + and endian-independent. You should also minimize CPU + specific stuff, e.g. inline assembly should be cleanly + encapsulated and minimized to ease porting. Generally it + should be restricted to the architecture-dependent part of + the kernel tree. + </para> + </listitem> + </varlistentry> + </variablelist> + </chapter> + + <chapter id="ioctls"> + <title>ioctls: Not writing a new system call</title> + + <para> + A system call generally looks like this + </para> + + <programlisting> +asmlinkage long sys_mycall(int arg) +{ + return 0; +} + </programlisting> + + <para> + First, in most cases you don't want to create a new system call. + You create a character device and implement an appropriate ioctl + for it. This is much more flexible than system calls, doesn't have + to be entered in every architecture's + <filename class="headerfile">include/asm/unistd.h</filename> and + <filename>arch/kernel/entry.S</filename> file, and is much more + likely to be accepted by Linus. + </para> + + <para> + If all your routine does is read or write some parameter, consider + implementing a <function>sysctl</function> interface instead. + </para> + + <para> + Inside the ioctl you're in user context to a process. When a + error occurs you return a negated errno (see + <filename class="headerfile">include/linux/errno.h</filename>), + otherwise you return <returnvalue>0</returnvalue>. + </para> + + <para> + After you slept you should check if a signal occurred: the + Unix/Linux way of handling signals is to temporarily exit the + system call with the <constant>-ERESTARTSYS</constant> error. The + system call entry code will switch back to user context, process + the signal handler and then your system call will be restarted + (unless the user disabled that). So you should be prepared to + process the restart, e.g. if you're in the middle of manipulating + some data structure. + </para> + + <programlisting> +if (signal_pending()) + return -ERESTARTSYS; + </programlisting> + + <para> + If you're doing longer computations: first think userspace. If you + <emphasis>really</emphasis> want to do it in kernel you should + regularly check if you need to give up the CPU (remember there is + cooperative multitasking per CPU). Idiom: + </para> + + <programlisting> +cond_resched(); /* Will sleep */ + </programlisting> + + <para> + A short note on interface design: the UNIX system call motto is + "Provide mechanism not policy". + </para> + </chapter> + + <chapter id="deadlock-recipes"> + <title>Recipes for Deadlock</title> + + <para> + You cannot call any routines which may sleep, unless: + </para> + <itemizedlist> + <listitem> + <para> + You are in user context. + </para> + </listitem> + + <listitem> + <para> + You do not own any spinlocks. + </para> + </listitem> + + <listitem> + <para> + You have interrupts enabled (actually, Andi Kleen says + that the scheduling code will enable them for you, but + that's probably not what you wanted). + </para> + </listitem> + </itemizedlist> + + <para> + Note that some functions may sleep implicitly: common ones are + the user space access functions (*_user) and memory allocation + functions without <symbol>GFP_ATOMIC</symbol>. + </para> + + <para> + You will eventually lock up your box if you break these rules. + </para> + + <para> + Really. + </para> + </chapter> + + <chapter id="common-routines"> + <title>Common Routines</title> + + <sect1 id="routines-printk"> + <title> + <function>printk()</function> + <filename class="headerfile">include/linux/kernel.h</filename> + </title> + + <para> + <function>printk()</function> feeds kernel messages to the + console, dmesg, and the syslog daemon. It is useful for debugging + and reporting errors, and can be used inside interrupt context, + but use with caution: a machine which has its console flooded with + printk messages is unusable. It uses a format string mostly + compatible with ANSI C printf, and C string concatenation to give + it a first "priority" argument: + </para> + + <programlisting> +printk(KERN_INFO "i = %u\n", i); + </programlisting> + + <para> + See <filename class="headerfile">include/linux/kernel.h</filename>; + for other KERN_ values; these are interpreted by syslog as the + level. Special case: for printing an IP address use + </para> + + <programlisting> +__u32 ipaddress; +printk(KERN_INFO "my ip: %d.%d.%d.%d\n", NIPQUAD(ipaddress)); + </programlisting> + + <para> + <function>printk()</function> internally uses a 1K buffer and does + not catch overruns. Make sure that will be enough. + </para> + + <note> + <para> + You will know when you are a real kernel hacker + when you start typoing printf as printk in your user programs :) + </para> + </note> + + <!--- From the Lions book reader department --> + + <note> + <para> + Another sidenote: the original Unix Version 6 sources had a + comment on top of its printf function: "Printf should not be + used for chit-chat". You should follow that advice. + </para> + </note> + </sect1> + + <sect1 id="routines-copy"> + <title> + <function>copy_[to/from]_user()</function> + / + <function>get_user()</function> + / + <function>put_user()</function> + <filename class="headerfile">include/asm/uaccess.h</filename> + </title> + + <para> + <emphasis>[SLEEPS]</emphasis> + </para> + + <para> + <function>put_user()</function> and <function>get_user()</function> + are used to get and put single values (such as an int, char, or + long) from and to userspace. A pointer into userspace should + never be simply dereferenced: data should be copied using these + routines. Both return <constant>-EFAULT</constant> or 0. + </para> + <para> + <function>copy_to_user()</function> and + <function>copy_from_user()</function> are more general: they copy + an arbitrary amount of data to and from userspace. + <caution> + <para> + Unlike <function>put_user()</function> and + <function>get_user()</function>, they return the amount of + uncopied data (ie. <returnvalue>0</returnvalue> still means + success). + </para> + </caution> + [Yes, this moronic interface makes me cringe. Please submit a + patch and become my hero --RR.] + </para> + <para> + The functions may sleep implicitly. This should never be called + outside user context (it makes no sense), with interrupts + disabled, or a spinlock held. + </para> + </sect1> + + <sect1 id="routines-kmalloc"> + <title><function>kmalloc()</function>/<function>kfree()</function> + <filename class="headerfile">include/linux/slab.h</filename></title> + + <para> + <emphasis>[MAY SLEEP: SEE BELOW]</emphasis> + </para> + + <para> + These routines are used to dynamically request pointer-aligned + chunks of memory, like malloc and free do in userspace, but + <function>kmalloc()</function> takes an extra flag word. + Important values: + </para> + + <variablelist> + <varlistentry> + <term> + <constant> + GFP_KERNEL + </constant> + </term> + <listitem> + <para> + May sleep and swap to free memory. Only allowed in user + context, but is the most reliable way to allocate memory. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <constant> + GFP_ATOMIC + </constant> + </term> + <listitem> + <para> + Don't sleep. Less reliable than <constant>GFP_KERNEL</constant>, + but may be called from interrupt context. You should + <emphasis>really</emphasis> have a good out-of-memory + error-handling strategy. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <constant> + GFP_DMA + </constant> + </term> + <listitem> + <para> + Allocate ISA DMA lower than 16MB. If you don't know what that + is you don't need it. Very unreliable. + </para> + </listitem> + </varlistentry> + </variablelist> + + <para> + If you see a <errorname>kmem_grow: Called nonatomically from int + </errorname> warning message you called a memory allocation function + from interrupt context without <constant>GFP_ATOMIC</constant>. + You should really fix that. Run, don't walk. + </para> + + <para> + If you are allocating at least <constant>PAGE_SIZE</constant> + (<filename class="headerfile">include/asm/page.h</filename>) bytes, + consider using <function>__get_free_pages()</function> + + (<filename class="headerfile">include/linux/mm.h</filename>). It + takes an order argument (0 for page sized, 1 for double page, 2 + for four pages etc.) and the same memory priority flag word as + above. + </para> + + <para> + If you are allocating more than a page worth of bytes you can use + <function>vmalloc()</function>. It'll allocate virtual memory in + the kernel map. This block is not contiguous in physical memory, + but the <acronym>MMU</acronym> makes it look like it is for you + (so it'll only look contiguous to the CPUs, not to external device + drivers). If you really need large physically contiguous memory + for some weird device, you have a problem: it is poorly supported + in Linux because after some time memory fragmentation in a running + kernel makes it hard. The best way is to allocate the block early + in the boot process via the <function>alloc_bootmem()</function> + routine. + </para> + + <para> + Before inventing your own cache of often-used objects consider + using a slab cache in + <filename class="headerfile">include/linux/slab.h</filename> + </para> + </sect1> + + <sect1 id="routines-current"> + <title><function>current</function> + <filename class="headerfile">include/asm/current.h</filename></title> + + <para> + This global variable (really a macro) contains a pointer to + the current task structure, so is only valid in user context. + For example, when a process makes a system call, this will + point to the task structure of the calling process. It is + <emphasis>not NULL</emphasis> in interrupt context. + </para> + </sect1> + + <sect1 id="routines-udelay"> + <title><function>udelay()</function>/<function>mdelay()</function> + <filename class="headerfile">include/asm/delay.h</filename> + <filename class="headerfile">include/linux/delay.h</filename> + </title> + + <para> + The <function>udelay()</function> function can be used for small pauses. + Do not use large values with <function>udelay()</function> as you risk + overflow - the helper function <function>mdelay()</function> is useful + here, or even consider <function>schedule_timeout()</function>. + </para> + </sect1> + + <sect1 id="routines-endian"> + <title><function>cpu_to_be32()</function>/<function>be32_to_cpu()</function>/<function>cpu_to_le32()</function>/<function>le32_to_cpu()</function> + <filename class="headerfile">include/asm/byteorder.h</filename> + </title> + + <para> + The <function>cpu_to_be32()</function> family (where the "32" can + be replaced by 64 or 16, and the "be" can be replaced by "le") are + the general way to do endian conversions in the kernel: they + return the converted value. All variations supply the reverse as + well: <function>be32_to_cpu()</function>, etc. + </para> + + <para> + There are two major variations of these functions: the pointer + variation, such as <function>cpu_to_be32p()</function>, which take + a pointer to the given type, and return the converted value. The + other variation is the "in-situ" family, such as + <function>cpu_to_be32s()</function>, which convert value referred + to by the pointer, and return void. + </para> + </sect1> + + <sect1 id="routines-local-irqs"> + <title><function>local_irq_save()</function>/<function>local_irq_restore()</function> + <filename class="headerfile">include/asm/system.h</filename> + </title> + + <para> + These routines disable hard interrupts on the local CPU, and + restore them. They are reentrant; saving the previous state in + their one <varname>unsigned long flags</varname> argument. If you + know that interrupts are enabled, you can simply use + <function>local_irq_disable()</function> and + <function>local_irq_enable()</function>. + </para> + </sect1> + + <sect1 id="routines-softirqs"> + <title><function>local_bh_disable()</function>/<function>local_bh_enable()</function> + <filename class="headerfile">include/linux/interrupt.h</filename></title> + + <para> + These routines disable soft interrupts on the local CPU, and + restore them. They are reentrant; if soft interrupts were + disabled before, they will still be disabled after this pair + of functions has been called. They prevent softirqs, tasklets + and bottom halves from running on the current CPU. + </para> + </sect1> + + <sect1 id="routines-processorids"> + <title><function>smp_processor_id</function>() + <filename class="headerfile">include/asm/smp.h</filename></title> + + <para> + <function>smp_processor_id()</function> returns the current + processor number, between 0 and <symbol>NR_CPUS</symbol> (the + maximum number of CPUs supported by Linux, currently 32). These + values are not necessarily continuous. + </para> + </sect1> + + <sect1 id="routines-init"> + <title><type>__init</type>/<type>__exit</type>/<type>__initdata</type> + <filename class="headerfile">include/linux/init.h</filename></title> + + <para> + After boot, the kernel frees up a special section; functions + marked with <type>__init</type> and data structures marked with + <type>__initdata</type> are dropped after boot is complete (within + modules this directive is currently ignored). <type>__exit</type> + is used to declare a function which is only required on exit: the + function will be dropped if this file is not compiled as a module. + See the header file for use. Note that it makes no sense for a function + marked with <type>__init</type> to be exported to modules with + <function>EXPORT_SYMBOL()</function> - this will break. + </para> + <para> + Static data structures marked as <type>__initdata</type> must be initialised + (as opposed to ordinary static data which is zeroed BSS) and cannot be + <type>const</type>. + </para> + + </sect1> + + <sect1 id="routines-init-again"> + <title><function>__initcall()</function>/<function>module_init()</function> + <filename class="headerfile">include/linux/init.h</filename></title> + <para> + Many parts of the kernel are well served as a module + (dynamically-loadable parts of the kernel). Using the + <function>module_init()</function> and + <function>module_exit()</function> macros it is easy to write code + without #ifdefs which can operate both as a module or built into + the kernel. + </para> + + <para> + The <function>module_init()</function> macro defines which + function is to be called at module insertion time (if the file is + compiled as a module), or at boot time: if the file is not + compiled as a module the <function>module_init()</function> macro + becomes equivalent to <function>__initcall()</function>, which + through linker magic ensures that the function is called on boot. + </para> + + <para> + The function can return a negative error number to cause + module loading to fail (unfortunately, this has no effect if + the module is compiled into the kernel). For modules, this is + called in user context, with interrupts enabled, and the + kernel lock held, so it can sleep. + </para> + </sect1> + + <sect1 id="routines-moduleexit"> + <title> <function>module_exit()</function> + <filename class="headerfile">include/linux/init.h</filename> </title> + + <para> + This macro defines the function to be called at module removal + time (or never, in the case of the file compiled into the + kernel). It will only be called if the module usage count has + reached zero. This function can also sleep, but cannot fail: + everything must be cleaned up by the time it returns. + </para> + </sect1> + + <!-- add info on new-style module refcounting here --> + </chapter> + + <chapter id="queues"> + <title>Wait Queues + <filename class="headerfile">include/linux/wait.h</filename> + </title> + <para> + <emphasis>[SLEEPS]</emphasis> + </para> + + <para> + A wait queue is used to wait for someone to wake you up when a + certain condition is true. They must be used carefully to ensure + there is no race condition. You declare a + <type>wait_queue_head_t</type>, and then processes which want to + wait for that condition declare a <type>wait_queue_t</type> + referring to themselves, and place that in the queue. + </para> + + <sect1 id="queue-declaring"> + <title>Declaring</title> + + <para> + You declare a <type>wait_queue_head_t</type> using the + <function>DECLARE_WAIT_QUEUE_HEAD()</function> macro, or using the + <function>init_waitqueue_head()</function> routine in your + initialization code. + </para> + </sect1> + + <sect1 id="queue-waitqueue"> + <title>Queuing</title> + + <para> + Placing yourself in the waitqueue is fairly complex, because you + must put yourself in the queue before checking the condition. + There is a macro to do this: + <function>wait_event_interruptible()</function> + + <filename class="headerfile">include/linux/sched.h</filename> The + first argument is the wait queue head, and the second is an + expression which is evaluated; the macro returns + <returnvalue>0</returnvalue> when this expression is true, or + <returnvalue>-ERESTARTSYS</returnvalue> if a signal is received. + The <function>wait_event()</function> version ignores signals. + </para> + <para> + Do not use the <function>sleep_on()</function> function family - + it is very easy to accidentally introduce races; almost certainly + one of the <function>wait_event()</function> family will do, or a + loop around <function>schedule_timeout()</function>. If you choose + to loop around <function>schedule_timeout()</function> remember + you must set the task state (with + <function>set_current_state()</function>) on each iteration to avoid + busy-looping. + </para> + + </sect1> + + <sect1 id="queue-waking"> + <title>Waking Up Queued Tasks</title> + + <para> + Call <function>wake_up()</function> + + <filename class="headerfile">include/linux/sched.h</filename>;, + which will wake up every process in the queue. The exception is + if one has <constant>TASK_EXCLUSIVE</constant> set, in which case + the remainder of the queue will not be woken. + </para> + </sect1> + </chapter> + + <chapter id="atomic-ops"> + <title>Atomic Operations</title> + + <para> + Certain operations are guaranteed atomic on all platforms. The + first class of operations work on <type>atomic_t</type> + + <filename class="headerfile">include/asm/atomic.h</filename>; this + contains a signed integer (at least 24 bits long), and you must use + these functions to manipulate or read atomic_t variables. + <function>atomic_read()</function> and + <function>atomic_set()</function> get and set the counter, + <function>atomic_add()</function>, + <function>atomic_sub()</function>, + <function>atomic_inc()</function>, + <function>atomic_dec()</function>, and + <function>atomic_dec_and_test()</function> (returns + <returnvalue>true</returnvalue> if it was decremented to zero). + </para> + + <para> + Yes. It returns <returnvalue>true</returnvalue> (i.e. != 0) if the + atomic variable is zero. + </para> + + <para> + Note that these functions are slower than normal arithmetic, and + so should not be used unnecessarily. On some platforms they + are much slower, like 32-bit Sparc where they use a spinlock. + </para> + + <para> + The second class of atomic operations is atomic bit operations on a + <type>long</type>, defined in + + <filename class="headerfile">include/linux/bitops.h</filename>. These + operations generally take a pointer to the bit pattern, and a bit + number: 0 is the least significant bit. + <function>set_bit()</function>, <function>clear_bit()</function> + and <function>change_bit()</function> set, clear, and flip the + given bit. <function>test_and_set_bit()</function>, + <function>test_and_clear_bit()</function> and + <function>test_and_change_bit()</function> do the same thing, + except return true if the bit was previously set; these are + particularly useful for very simple locking. + </para> + + <para> + It is possible to call these operations with bit indices greater + than BITS_PER_LONG. The resulting behavior is strange on big-endian + platforms though so it is a good idea not to do this. + </para> + + <para> + Note that the order of bits depends on the architecture, and in + particular, the bitfield passed to these operations must be at + least as large as a <type>long</type>. + </para> + </chapter> + + <chapter id="symbols"> + <title>Symbols</title> + + <para> + Within the kernel proper, the normal linking rules apply + (ie. unless a symbol is declared to be file scope with the + <type>static</type> keyword, it can be used anywhere in the + kernel). However, for modules, a special exported symbol table is + kept which limits the entry points to the kernel proper. Modules + can also export symbols. + </para> + + <sect1 id="sym-exportsymbols"> + <title><function>EXPORT_SYMBOL()</function> + <filename class="headerfile">include/linux/module.h</filename></title> + + <para> + This is the classic method of exporting a symbol, and it works + for both modules and non-modules. In the kernel all these + declarations are often bundled into a single file to help + genksyms (which searches source files for these declarations). + See the comment on genksyms and Makefiles below. + </para> + </sect1> + + <sect1 id="sym-exportsymbols-gpl"> + <title><function>EXPORT_SYMBOL_GPL()</function> + <filename class="headerfile">include/linux/module.h</filename></title> + + <para> + Similar to <function>EXPORT_SYMBOL()</function> except that the + symbols exported by <function>EXPORT_SYMBOL_GPL()</function> can + only be seen by modules with a + <function>MODULE_LICENSE()</function> that specifies a GPL + compatible license. + </para> + </sect1> + </chapter> + + <chapter id="conventions"> + <title>Routines and Conventions</title> + + <sect1 id="conventions-doublelinkedlist"> + <title>Double-linked lists + <filename class="headerfile">include/linux/list.h</filename></title> + + <para> + There are three sets of linked-list routines in the kernel + headers, but this one seems to be winning out (and Linus has + used it). If you don't have some particular pressing need for + a single list, it's a good choice. In fact, I don't care + whether it's a good choice or not, just use it so we can get + rid of the others. + </para> + </sect1> + + <sect1 id="convention-returns"> + <title>Return Conventions</title> + + <para> + For code called in user context, it's very common to defy C + convention, and return <returnvalue>0</returnvalue> for success, + and a negative error number + (eg. <returnvalue>-EFAULT</returnvalue>) for failure. This can be + unintuitive at first, but it's fairly widespread in the networking + code, for example. + </para> + + <para> + The filesystem code uses <function>ERR_PTR()</function> + + <filename class="headerfile">include/linux/fs.h</filename>; to + encode a negative error number into a pointer, and + <function>IS_ERR()</function> and <function>PTR_ERR()</function> + to get it back out again: avoids a separate pointer parameter for + the error number. Icky, but in a good way. + </para> + </sect1> + + <sect1 id="conventions-borkedcompile"> + <title>Breaking Compilation</title> + + <para> + Linus and the other developers sometimes change function or + structure names in development kernels; this is not done just to + keep everyone on their toes: it reflects a fundamental change + (eg. can no longer be called with interrupts on, or does extra + checks, or doesn't do checks which were caught before). Usually + this is accompanied by a fairly complete note to the linux-kernel + mailing list; search the archive. Simply doing a global replace + on the file usually makes things <emphasis>worse</emphasis>. + </para> + </sect1> + + <sect1 id="conventions-initialising"> + <title>Initializing structure members</title> + + <para> + The preferred method of initializing structures is to use + designated initialisers, as defined by ISO C99, eg: + </para> + <programlisting> +static struct block_device_operations opt_fops = { + .open = opt_open, + .release = opt_release, + .ioctl = opt_ioctl, + .check_media_change = opt_media_change, +}; + </programlisting> + <para> + This makes it easy to grep for, and makes it clear which + structure fields are set. You should do this because it looks + cool. + </para> + </sect1> + + <sect1 id="conventions-gnu-extns"> + <title>GNU Extensions</title> + + <para> + GNU Extensions are explicitly allowed in the Linux kernel. + Note that some of the more complex ones are not very well + supported, due to lack of general use, but the following are + considered standard (see the GCC info page section "C + Extensions" for more details - Yes, really the info page, the + man page is only a short summary of the stuff in info): + </para> + <itemizedlist> + <listitem> + <para> + Inline functions + </para> + </listitem> + <listitem> + <para> + Statement expressions (ie. the ({ and }) constructs). + </para> + </listitem> + <listitem> + <para> + Declaring attributes of a function / variable / type + (__attribute__) + </para> + </listitem> + <listitem> + <para> + typeof + </para> + </listitem> + <listitem> + <para> + Zero length arrays + </para> + </listitem> + <listitem> + <para> + Macro varargs + </para> + </listitem> + <listitem> + <para> + Arithmetic on void pointers + </para> + </listitem> + <listitem> + <para> + Non-Constant initializers + </para> + </listitem> + <listitem> + <para> + Assembler Instructions (not outside arch/ and include/asm/) + </para> + </listitem> + <listitem> + <para> + Function names as strings (__FUNCTION__) + </para> + </listitem> + <listitem> + <para> + __builtin_constant_p() + </para> + </listitem> + </itemizedlist> + + <para> + Be wary when using long long in the kernel, the code gcc generates for + it is horrible and worse: division and multiplication does not work + on i386 because the GCC runtime functions for it are missing from + the kernel environment. + </para> + + <!-- FIXME: add a note about ANSI aliasing cleanness --> + </sect1> + + <sect1 id="conventions-cplusplus"> + <title>C++</title> + + <para> + Using C++ in the kernel is usually a bad idea, because the + kernel does not provide the necessary runtime environment + and the include files are not tested for it. It is still + possible, but not recommended. If you really want to do + this, forget about exceptions at least. + </para> + </sect1> + + <sect1 id="conventions-ifdef"> + <title>#if</title> + + <para> + It is generally considered cleaner to use macros in header files + (or at the top of .c files) to abstract away functions rather than + using `#if' pre-processor statements throughout the source code. + </para> + </sect1> + </chapter> + + <chapter id="submitting"> + <title>Putting Your Stuff in the Kernel</title> + + <para> + In order to get your stuff into shape for official inclusion, or + even to make a neat patch, there's administrative work to be + done: + </para> + <itemizedlist> + <listitem> + <para> + Figure out whose pond you've been pissing in. Look at the top of + the source files, inside the <filename>MAINTAINERS</filename> + file, and last of all in the <filename>CREDITS</filename> file. + You should coordinate with this person to make sure you're not + duplicating effort, or trying something that's already been + rejected. + </para> + + <para> + Make sure you put your name and EMail address at the top of + any files you create or mangle significantly. This is the + first place people will look when they find a bug, or when + <emphasis>they</emphasis> want to make a change. + </para> + </listitem> + + <listitem> + <para> + Usually you want a configuration option for your kernel hack. + Edit <filename>Config.in</filename> in the appropriate directory + (but under <filename>arch/</filename> it's called + <filename>config.in</filename>). The Config Language used is not + bash, even though it looks like bash; the safe way is to use only + the constructs that you already see in + <filename>Config.in</filename> files (see + <filename>Documentation/kbuild/kconfig-language.txt</filename>). + It's good to run "make xconfig" at least once to test (because + it's the only one with a static parser). + </para> + + <para> + Variables which can be Y or N use <type>bool</type> followed by a + tagline and the config define name (which must start with + CONFIG_). The <type>tristate</type> function is the same, but + allows the answer M (which defines + <symbol>CONFIG_foo_MODULE</symbol> in your source, instead of + <symbol>CONFIG_FOO</symbol>) if <symbol>CONFIG_MODULES</symbol> + is enabled. + </para> + + <para> + You may well want to make your CONFIG option only visible if + <symbol>CONFIG_EXPERIMENTAL</symbol> is enabled: this serves as a + warning to users. There many other fancy things you can do: see + the various <filename>Config.in</filename> files for ideas. + </para> + </listitem> + + <listitem> + <para> + Edit the <filename>Makefile</filename>: the CONFIG variables are + exported here so you can conditionalize compilation with `ifeq'. + If your file exports symbols then add the names to + <varname>export-objs</varname> so that genksyms will find them. + <caution> + <para> + There is a restriction on the kernel build system that objects + which export symbols must have globally unique names. + If your object does not have a globally unique name then the + standard fix is to move the + <function>EXPORT_SYMBOL()</function> statements to their own + object with a unique name. + This is why several systems have separate exporting objects, + usually suffixed with ksyms. + </para> + </caution> + </para> + </listitem> + + <listitem> + <para> + Document your option in Documentation/Configure.help. Mention + incompatibilities and issues here. <emphasis> Definitely + </emphasis> end your description with <quote> if in doubt, say N + </quote> (or, occasionally, `Y'); this is for people who have no + idea what you are talking about. + </para> + </listitem> + + <listitem> + <para> + Put yourself in <filename>CREDITS</filename> if you've done + something noteworthy, usually beyond a single file (your name + should be at the top of the source files anyway). + <filename>MAINTAINERS</filename> means you want to be consulted + when changes are made to a subsystem, and hear about bugs; it + implies a more-than-passing commitment to some part of the code. + </para> + </listitem> + + <listitem> + <para> + Finally, don't forget to read <filename>Documentation/SubmittingPatches</filename> + and possibly <filename>Documentation/SubmittingDrivers</filename>. + </para> + </listitem> + </itemizedlist> + </chapter> + + <chapter id="cantrips"> + <title>Kernel Cantrips</title> + + <para> + Some favorites from browsing the source. Feel free to add to this + list. + </para> + + <para> + <filename>include/linux/brlock.h:</filename> + </para> + <programlisting> +extern inline void br_read_lock (enum brlock_indices idx) +{ + /* + * This causes a link-time bug message if an + * invalid index is used: + */ + if (idx >= __BR_END) + __br_lock_usage_bug(); + + read_lock(&__brlock_array[smp_processor_id()][idx]); +} + </programlisting> + + <para> + <filename>include/linux/fs.h</filename>: + </para> + <programlisting> +/* + * Kernel pointers have redundant information, so we can use a + * scheme where we can return either an error code or a dentry + * pointer with the same return value. + * + * This should be a per-architecture thing, to allow different + * error and pointer decisions. + */ + #define ERR_PTR(err) ((void *)((long)(err))) + #define PTR_ERR(ptr) ((long)(ptr)) + #define IS_ERR(ptr) ((unsigned long)(ptr) > (unsigned long)(-1000)) +</programlisting> + + <para> + <filename>include/asm-i386/uaccess.h:</filename> + </para> + + <programlisting> +#define copy_to_user(to,from,n) \ + (__builtin_constant_p(n) ? \ + __constant_copy_to_user((to),(from),(n)) : \ + __generic_copy_to_user((to),(from),(n))) + </programlisting> + + <para> + <filename>arch/sparc/kernel/head.S:</filename> + </para> + + <programlisting> +/* + * Sun people can't spell worth damn. "compatability" indeed. + * At least we *know* we can't spell, and use a spell-checker. + */ + +/* Uh, actually Linus it is I who cannot spell. Too much murky + * Sparc assembly will do this to ya. + */ +C_LABEL(cputypvar): + .asciz "compatability" + +/* Tested on SS-5, SS-10. Probably someone at Sun applied a spell-checker. */ + .align 4 +C_LABEL(cputypvar_sun4m): + .asciz "compatible" + </programlisting> + + <para> + <filename>arch/sparc/lib/checksum.S:</filename> + </para> + + <programlisting> + /* Sun, you just can't beat me, you just can't. Stop trying, + * give up. I'm serious, I am going to kick the living shit + * out of you, game over, lights out. + */ + </programlisting> + </chapter> + + <chapter id="credits"> + <title>Thanks</title> + + <para> + Thanks to Andi Kleen for the idea, answering my questions, fixing + my mistakes, filling content, etc. Philipp Rumpf for more spelling + and clarity fixes, and some excellent non-obvious points. Werner + Almesberger for giving me a great summary of + <function>disable_irq()</function>, and Jes Sorensen and Andrea + Arcangeli added caveats. Michael Elizabeth Chastain for checking + and adding to the Configure section. <!-- Rusty insisted on this + bit; I didn't do it! --> Telsa Gwynne for teaching me DocBook. + </para> + </chapter> +</book> + diff --git a/Documentation/DocBook/kernel-locking.tmpl b/Documentation/DocBook/kernel-locking.tmpl new file mode 100644 index 000000000000..90dc2de8e0af --- /dev/null +++ b/Documentation/DocBook/kernel-locking.tmpl @@ -0,0 +1,2088 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="LKLockingGuide"> + <bookinfo> + <title>Unreliable Guide To Locking</title> + + <authorgroup> + <author> + <firstname>Rusty</firstname> + <surname>Russell</surname> + <affiliation> + <address> + <email>rusty@rustcorp.com.au</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2003</year> + <holder>Rusty Russell</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + + <toc></toc> + <chapter id="intro"> + <title>Introduction</title> + <para> + Welcome, to Rusty's Remarkably Unreliable Guide to Kernel + Locking issues. This document describes the locking systems in + the Linux Kernel in 2.6. + </para> + <para> + With the wide availability of HyperThreading, and <firstterm + linkend="gloss-preemption">preemption </firstterm> in the Linux + Kernel, everyone hacking on the kernel needs to know the + fundamentals of concurrency and locking for + <firstterm linkend="gloss-smp"><acronym>SMP</acronym></firstterm>. + </para> + </chapter> + + <chapter id="races"> + <title>The Problem With Concurrency</title> + <para> + (Skip this if you know what a Race Condition is). + </para> + <para> + In a normal program, you can increment a counter like so: + </para> + <programlisting> + very_important_count++; + </programlisting> + + <para> + This is what they would expect to happen: + </para> + + <table> + <title>Expected Results</title> + + <tgroup cols="2" align="left"> + + <thead> + <row> + <entry>Instance 1</entry> + <entry>Instance 2</entry> + </row> + </thead> + + <tbody> + <row> + <entry>read very_important_count (5)</entry> + <entry></entry> + </row> + <row> + <entry>add 1 (6)</entry> + <entry></entry> + </row> + <row> + <entry>write very_important_count (6)</entry> + <entry></entry> + </row> + <row> + <entry></entry> + <entry>read very_important_count (6)</entry> + </row> + <row> + <entry></entry> + <entry>add 1 (7)</entry> + </row> + <row> + <entry></entry> + <entry>write very_important_count (7)</entry> + </row> + </tbody> + + </tgroup> + </table> + + <para> + This is what might happen: + </para> + + <table> + <title>Possible Results</title> + + <tgroup cols="2" align="left"> + <thead> + <row> + <entry>Instance 1</entry> + <entry>Instance 2</entry> + </row> + </thead> + + <tbody> + <row> + <entry>read very_important_count (5)</entry> + <entry></entry> + </row> + <row> + <entry></entry> + <entry>read very_important_count (5)</entry> + </row> + <row> + <entry>add 1 (6)</entry> + <entry></entry> + </row> + <row> + <entry></entry> + <entry>add 1 (6)</entry> + </row> + <row> + <entry>write very_important_count (6)</entry> + <entry></entry> + </row> + <row> + <entry></entry> + <entry>write very_important_count (6)</entry> + </row> + </tbody> + </tgroup> + </table> + + <sect1 id="race-condition"> + <title>Race Conditions and Critical Regions</title> + <para> + This overlap, where the result depends on the + relative timing of multiple tasks, is called a <firstterm>race condition</firstterm>. + The piece of code containing the concurrency issue is called a + <firstterm>critical region</firstterm>. And especially since Linux starting running + on SMP machines, they became one of the major issues in kernel + design and implementation. + </para> + <para> + Preemption can have the same effect, even if there is only one + CPU: by preempting one task during the critical region, we have + exactly the same race condition. In this case the thread which + preempts might run the critical region itself. + </para> + <para> + The solution is to recognize when these simultaneous accesses + occur, and use locks to make sure that only one instance can + enter the critical region at any time. There are many + friendly primitives in the Linux kernel to help you do this. + And then there are the unfriendly primitives, but I'll pretend + they don't exist. + </para> + </sect1> + </chapter> + + <chapter id="locks"> + <title>Locking in the Linux Kernel</title> + + <para> + If I could give you one piece of advice: never sleep with anyone + crazier than yourself. But if I had to give you advice on + locking: <emphasis>keep it simple</emphasis>. + </para> + + <para> + Be reluctant to introduce new locks. + </para> + + <para> + Strangely enough, this last one is the exact reverse of my advice when + you <emphasis>have</emphasis> slept with someone crazier than yourself. + And you should think about getting a big dog. + </para> + + <sect1 id="lock-intro"> + <title>Two Main Types of Kernel Locks: Spinlocks and Semaphores</title> + + <para> + There are two main types of kernel locks. The fundamental type + is the spinlock + (<filename class="headerfile">include/asm/spinlock.h</filename>), + which is a very simple single-holder lock: if you can't get the + spinlock, you keep trying (spinning) until you can. Spinlocks are + very small and fast, and can be used anywhere. + </para> + <para> + The second type is a semaphore + (<filename class="headerfile">include/asm/semaphore.h</filename>): it + can have more than one holder at any time (the number decided at + initialization time), although it is most commonly used as a + single-holder lock (a mutex). If you can't get a semaphore, + your task will put itself on the queue, and be woken up when the + semaphore is released. This means the CPU will do something + else while you are waiting, but there are many cases when you + simply can't sleep (see <xref linkend="sleeping-things"/>), and so + have to use a spinlock instead. + </para> + <para> + Neither type of lock is recursive: see + <xref linkend="deadlock"/>. + </para> + </sect1> + + <sect1 id="uniprocessor"> + <title>Locks and Uniprocessor Kernels</title> + + <para> + For kernels compiled without <symbol>CONFIG_SMP</symbol>, and + without <symbol>CONFIG_PREEMPT</symbol> spinlocks do not exist at + all. This is an excellent design decision: when no-one else can + run at the same time, there is no reason to have a lock. + </para> + + <para> + If the kernel is compiled without <symbol>CONFIG_SMP</symbol>, + but <symbol>CONFIG_PREEMPT</symbol> is set, then spinlocks + simply disable preemption, which is sufficient to prevent any + races. For most purposes, we can think of preemption as + equivalent to SMP, and not worry about it separately. + </para> + + <para> + You should always test your locking code with <symbol>CONFIG_SMP</symbol> + and <symbol>CONFIG_PREEMPT</symbol> enabled, even if you don't have an SMP test box, because it + will still catch some kinds of locking bugs. + </para> + + <para> + Semaphores still exist, because they are required for + synchronization between <firstterm linkend="gloss-usercontext">user + contexts</firstterm>, as we will see below. + </para> + </sect1> + + <sect1 id="usercontextlocking"> + <title>Locking Only In User Context</title> + + <para> + If you have a data structure which is only ever accessed from + user context, then you can use a simple semaphore + (<filename>linux/asm/semaphore.h</filename>) to protect it. This + is the most trivial case: you initialize the semaphore to the number + of resources available (usually 1), and call + <function>down_interruptible()</function> to grab the semaphore, and + <function>up()</function> to release it. There is also a + <function>down()</function>, which should be avoided, because it + will not return if a signal is received. + </para> + + <para> + Example: <filename>linux/net/core/netfilter.c</filename> allows + registration of new <function>setsockopt()</function> and + <function>getsockopt()</function> calls, with + <function>nf_register_sockopt()</function>. Registration and + de-registration are only done on module load and unload (and boot + time, where there is no concurrency), and the list of registrations + is only consulted for an unknown <function>setsockopt()</function> + or <function>getsockopt()</function> system call. The + <varname>nf_sockopt_mutex</varname> is perfect to protect this, + especially since the setsockopt and getsockopt calls may well + sleep. + </para> + </sect1> + + <sect1 id="lock-user-bh"> + <title>Locking Between User Context and Softirqs</title> + + <para> + If a <firstterm linkend="gloss-softirq">softirq</firstterm> shares + data with user context, you have two problems. Firstly, the current + user context can be interrupted by a softirq, and secondly, the + critical region could be entered from another CPU. This is where + <function>spin_lock_bh()</function> + (<filename class="headerfile">include/linux/spinlock.h</filename>) is + used. It disables softirqs on that CPU, then grabs the lock. + <function>spin_unlock_bh()</function> does the reverse. (The + '_bh' suffix is a historical reference to "Bottom Halves", the + old name for software interrupts. It should really be + called spin_lock_softirq()' in a perfect world). + </para> + + <para> + Note that you can also use <function>spin_lock_irq()</function> + or <function>spin_lock_irqsave()</function> here, which stop + hardware interrupts as well: see <xref linkend="hardirq-context"/>. + </para> + + <para> + This works perfectly for <firstterm linkend="gloss-up"><acronym>UP + </acronym></firstterm> as well: the spin lock vanishes, and this macro + simply becomes <function>local_bh_disable()</function> + (<filename class="headerfile">include/linux/interrupt.h</filename>), which + protects you from the softirq being run. + </para> + </sect1> + + <sect1 id="lock-user-tasklet"> + <title>Locking Between User Context and Tasklets</title> + + <para> + This is exactly the same as above, because <firstterm + linkend="gloss-tasklet">tasklets</firstterm> are actually run + from a softirq. + </para> + </sect1> + + <sect1 id="lock-user-timers"> + <title>Locking Between User Context and Timers</title> + + <para> + This, too, is exactly the same as above, because <firstterm + linkend="gloss-timers">timers</firstterm> are actually run from + a softirq. From a locking point of view, tasklets and timers + are identical. + </para> + </sect1> + + <sect1 id="lock-tasklets"> + <title>Locking Between Tasklets/Timers</title> + + <para> + Sometimes a tasklet or timer might want to share data with + another tasklet or timer. + </para> + + <sect2 id="lock-tasklets-same"> + <title>The Same Tasklet/Timer</title> + <para> + Since a tasklet is never run on two CPUs at once, you don't + need to worry about your tasklet being reentrant (running + twice at once), even on SMP. + </para> + </sect2> + + <sect2 id="lock-tasklets-different"> + <title>Different Tasklets/Timers</title> + <para> + If another tasklet/timer wants + to share data with your tasklet or timer , you will both need to use + <function>spin_lock()</function> and + <function>spin_unlock()</function> calls. + <function>spin_lock_bh()</function> is + unnecessary here, as you are already in a tasklet, and + none will be run on the same CPU. + </para> + </sect2> + </sect1> + + <sect1 id="lock-softirqs"> + <title>Locking Between Softirqs</title> + + <para> + Often a softirq might + want to share data with itself or a tasklet/timer. + </para> + + <sect2 id="lock-softirqs-same"> + <title>The Same Softirq</title> + + <para> + The same softirq can run on the other CPUs: you can use a + per-CPU array (see <xref linkend="per-cpu"/>) for better + performance. If you're going so far as to use a softirq, + you probably care about scalable performance enough + to justify the extra complexity. + </para> + + <para> + You'll need to use <function>spin_lock()</function> and + <function>spin_unlock()</function> for shared data. + </para> + </sect2> + + <sect2 id="lock-softirqs-different"> + <title>Different Softirqs</title> + + <para> + You'll need to use <function>spin_lock()</function> and + <function>spin_unlock()</function> for shared data, whether it + be a timer, tasklet, different softirq or the same or another + softirq: any of them could be running on a different CPU. + </para> + </sect2> + </sect1> + </chapter> + + <chapter id="hardirq-context"> + <title>Hard IRQ Context</title> + + <para> + Hardware interrupts usually communicate with a + tasklet or softirq. Frequently this involves putting work in a + queue, which the softirq will take out. + </para> + + <sect1 id="hardirq-softirq"> + <title>Locking Between Hard IRQ and Softirqs/Tasklets</title> + + <para> + If a hardware irq handler shares data with a softirq, you have + two concerns. Firstly, the softirq processing can be + interrupted by a hardware interrupt, and secondly, the + critical region could be entered by a hardware interrupt on + another CPU. This is where <function>spin_lock_irq()</function> is + used. It is defined to disable interrupts on that cpu, then grab + the lock. <function>spin_unlock_irq()</function> does the reverse. + </para> + + <para> + The irq handler does not to use + <function>spin_lock_irq()</function>, because the softirq cannot + run while the irq handler is running: it can use + <function>spin_lock()</function>, which is slightly faster. The + only exception would be if a different hardware irq handler uses + the same lock: <function>spin_lock_irq()</function> will stop + that from interrupting us. + </para> + + <para> + This works perfectly for UP as well: the spin lock vanishes, + and this macro simply becomes <function>local_irq_disable()</function> + (<filename class="headerfile">include/asm/smp.h</filename>), which + protects you from the softirq/tasklet/BH being run. + </para> + + <para> + <function>spin_lock_irqsave()</function> + (<filename>include/linux/spinlock.h</filename>) is a variant + which saves whether interrupts were on or off in a flags word, + which is passed to <function>spin_unlock_irqrestore()</function>. This + means that the same code can be used inside an hard irq handler (where + interrupts are already off) and in softirqs (where the irq + disabling is required). + </para> + + <para> + Note that softirqs (and hence tasklets and timers) are run on + return from hardware interrupts, so + <function>spin_lock_irq()</function> also stops these. In that + sense, <function>spin_lock_irqsave()</function> is the most + general and powerful locking function. + </para> + + </sect1> + <sect1 id="hardirq-hardirq"> + <title>Locking Between Two Hard IRQ Handlers</title> + <para> + It is rare to have to share data between two IRQ handlers, but + if you do, <function>spin_lock_irqsave()</function> should be + used: it is architecture-specific whether all interrupts are + disabled inside irq handlers themselves. + </para> + </sect1> + + </chapter> + + <chapter id="cheatsheet"> + <title>Cheat Sheet For Locking</title> + <para> + Pete Zaitcev gives the following summary: + </para> + <itemizedlist> + <listitem> + <para> + If you are in a process context (any syscall) and want to + lock other process out, use a semaphore. You can take a semaphore + and sleep (<function>copy_from_user*(</function> or + <function>kmalloc(x,GFP_KERNEL)</function>). + </para> + </listitem> + <listitem> + <para> + Otherwise (== data can be touched in an interrupt), use + <function>spin_lock_irqsave()</function> and + <function>spin_unlock_irqrestore()</function>. + </para> + </listitem> + <listitem> + <para> + Avoid holding spinlock for more than 5 lines of code and + across any function call (except accessors like + <function>readb</function>). + </para> + </listitem> + </itemizedlist> + + <sect1 id="minimum-lock-reqirements"> + <title>Table of Minimum Requirements</title> + + <para> The following table lists the <emphasis>minimum</emphasis> + locking requirements between various contexts. In some cases, + the same context can only be running on one CPU at a time, so + no locking is required for that context (eg. a particular + thread can only run on one CPU at a time, but if it needs + shares data with another thread, locking is required). + </para> + <para> + Remember the advice above: you can always use + <function>spin_lock_irqsave()</function>, which is a superset + of all other spinlock primitives. + </para> + <table> +<title>Table of Locking Requirements</title> +<tgroup cols="11"> +<tbody> +<row> +<entry></entry> +<entry>IRQ Handler A</entry> +<entry>IRQ Handler B</entry> +<entry>Softirq A</entry> +<entry>Softirq B</entry> +<entry>Tasklet A</entry> +<entry>Tasklet B</entry> +<entry>Timer A</entry> +<entry>Timer B</entry> +<entry>User Context A</entry> +<entry>User Context B</entry> +</row> + +<row> +<entry>IRQ Handler A</entry> +<entry>None</entry> +</row> + +<row> +<entry>IRQ Handler B</entry> +<entry>spin_lock_irqsave</entry> +<entry>None</entry> +</row> + +<row> +<entry>Softirq A</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock</entry> +</row> + +<row> +<entry>Softirq B</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +</row> + +<row> +<entry>Tasklet A</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>None</entry> +</row> + +<row> +<entry>Tasklet B</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>None</entry> +</row> + +<row> +<entry>Timer A</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>None</entry> +</row> + +<row> +<entry>Timer B</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>spin_lock</entry> +<entry>None</entry> +</row> + +<row> +<entry>User Context A</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>None</entry> +</row> + +<row> +<entry>User Context B</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_irq</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>spin_lock_bh</entry> +<entry>down_interruptible</entry> +<entry>None</entry> +</row> + +</tbody> +</tgroup> +</table> +</sect1> +</chapter> + + <chapter id="Examples"> + <title>Common Examples</title> + <para> +Let's step through a simple example: a cache of number to name +mappings. The cache keeps a count of how often each of the objects is +used, and when it gets full, throws out the least used one. + + </para> + + <sect1 id="examples-usercontext"> + <title>All In User Context</title> + <para> +For our first example, we assume that all operations are in user +context (ie. from system calls), so we can sleep. This means we can +use a semaphore to protect the cache and all the objects within +it. Here's the code: + </para> + + <programlisting> +#include <linux/list.h> +#include <linux/slab.h> +#include <linux/string.h> +#include <asm/semaphore.h> +#include <asm/errno.h> + +struct object +{ + struct list_head list; + int id; + char name[32]; + int popularity; +}; + +/* Protects the cache, cache_num, and the objects within it */ +static DECLARE_MUTEX(cache_lock); +static LIST_HEAD(cache); +static unsigned int cache_num = 0; +#define MAX_CACHE_SIZE 10 + +/* Must be holding cache_lock */ +static struct object *__cache_find(int id) +{ + struct object *i; + + list_for_each_entry(i, &cache, list) + if (i->id == id) { + i->popularity++; + return i; + } + return NULL; +} + +/* Must be holding cache_lock */ +static void __cache_delete(struct object *obj) +{ + BUG_ON(!obj); + list_del(&obj->list); + kfree(obj); + cache_num--; +} + +/* Must be holding cache_lock */ +static void __cache_add(struct object *obj) +{ + list_add(&obj->list, &cache); + if (++cache_num > MAX_CACHE_SIZE) { + struct object *i, *outcast = NULL; + list_for_each_entry(i, &cache, list) { + if (!outcast || i->popularity < outcast->popularity) + outcast = i; + } + __cache_delete(outcast); + } +} + +int cache_add(int id, const char *name) +{ + struct object *obj; + + if ((obj = kmalloc(sizeof(*obj), GFP_KERNEL)) == NULL) + return -ENOMEM; + + strlcpy(obj->name, name, sizeof(obj->name)); + obj->id = id; + obj->popularity = 0; + + down(&cache_lock); + __cache_add(obj); + up(&cache_lock); + return 0; +} + +void cache_delete(int id) +{ + down(&cache_lock); + __cache_delete(__cache_find(id)); + up(&cache_lock); +} + +int cache_find(int id, char *name) +{ + struct object *obj; + int ret = -ENOENT; + + down(&cache_lock); + obj = __cache_find(id); + if (obj) { + ret = 0; + strcpy(name, obj->name); + } + up(&cache_lock); + return ret; +} +</programlisting> + + <para> +Note that we always make sure we have the cache_lock when we add, +delete, or look up the cache: both the cache infrastructure itself and +the contents of the objects are protected by the lock. In this case +it's easy, since we copy the data for the user, and never let them +access the objects directly. + </para> + <para> +There is a slight (and common) optimization here: in +<function>cache_add</function> we set up the fields of the object +before grabbing the lock. This is safe, as no-one else can access it +until we put it in cache. + </para> + </sect1> + + <sect1 id="examples-interrupt"> + <title>Accessing From Interrupt Context</title> + <para> +Now consider the case where <function>cache_find</function> can be +called from interrupt context: either a hardware interrupt or a +softirq. An example would be a timer which deletes object from the +cache. + </para> + <para> +The change is shown below, in standard patch format: the +<symbol>-</symbol> are lines which are taken away, and the +<symbol>+</symbol> are lines which are added. + </para> +<programlisting> +--- cache.c.usercontext 2003-12-09 13:58:54.000000000 +1100 ++++ cache.c.interrupt 2003-12-09 14:07:49.000000000 +1100 +@@ -12,7 +12,7 @@ + int popularity; + }; + +-static DECLARE_MUTEX(cache_lock); ++static spinlock_t cache_lock = SPIN_LOCK_UNLOCKED; + static LIST_HEAD(cache); + static unsigned int cache_num = 0; + #define MAX_CACHE_SIZE 10 +@@ -55,6 +55,7 @@ + int cache_add(int id, const char *name) + { + struct object *obj; ++ unsigned long flags; + + if ((obj = kmalloc(sizeof(*obj), GFP_KERNEL)) == NULL) + return -ENOMEM; +@@ -63,30 +64,33 @@ + obj->id = id; + obj->popularity = 0; + +- down(&cache_lock); ++ spin_lock_irqsave(&cache_lock, flags); + __cache_add(obj); +- up(&cache_lock); ++ spin_unlock_irqrestore(&cache_lock, flags); + return 0; + } + + void cache_delete(int id) + { +- down(&cache_lock); ++ unsigned long flags; ++ ++ spin_lock_irqsave(&cache_lock, flags); + __cache_delete(__cache_find(id)); +- up(&cache_lock); ++ spin_unlock_irqrestore(&cache_lock, flags); + } + + int cache_find(int id, char *name) + { + struct object *obj; + int ret = -ENOENT; ++ unsigned long flags; + +- down(&cache_lock); ++ spin_lock_irqsave(&cache_lock, flags); + obj = __cache_find(id); + if (obj) { + ret = 0; + strcpy(name, obj->name); + } +- up(&cache_lock); ++ spin_unlock_irqrestore(&cache_lock, flags); + return ret; + } +</programlisting> + + <para> +Note that the <function>spin_lock_irqsave</function> will turn off +interrupts if they are on, otherwise does nothing (if we are already +in an interrupt handler), hence these functions are safe to call from +any context. + </para> + <para> +Unfortunately, <function>cache_add</function> calls +<function>kmalloc</function> with the <symbol>GFP_KERNEL</symbol> +flag, which is only legal in user context. I have assumed that +<function>cache_add</function> is still only called in user context, +otherwise this should become a parameter to +<function>cache_add</function>. + </para> + </sect1> + <sect1 id="examples-refcnt"> + <title>Exposing Objects Outside This File</title> + <para> +If our objects contained more information, it might not be sufficient +to copy the information in and out: other parts of the code might want +to keep pointers to these objects, for example, rather than looking up +the id every time. This produces two problems. + </para> + <para> +The first problem is that we use the <symbol>cache_lock</symbol> to +protect objects: we'd need to make this non-static so the rest of the +code can use it. This makes locking trickier, as it is no longer all +in one place. + </para> + <para> +The second problem is the lifetime problem: if another structure keeps +a pointer to an object, it presumably expects that pointer to remain +valid. Unfortunately, this is only guaranteed while you hold the +lock, otherwise someone might call <function>cache_delete</function> +and even worse, add another object, re-using the same address. + </para> + <para> +As there is only one lock, you can't hold it forever: no-one else would +get any work done. + </para> + <para> +The solution to this problem is to use a reference count: everyone who +has a pointer to the object increases it when they first get the +object, and drops the reference count when they're finished with it. +Whoever drops it to zero knows it is unused, and can actually delete it. + </para> + <para> +Here is the code: + </para> + +<programlisting> +--- cache.c.interrupt 2003-12-09 14:25:43.000000000 +1100 ++++ cache.c.refcnt 2003-12-09 14:33:05.000000000 +1100 +@@ -7,6 +7,7 @@ + struct object + { + struct list_head list; ++ unsigned int refcnt; + int id; + char name[32]; + int popularity; +@@ -17,6 +18,35 @@ + static unsigned int cache_num = 0; + #define MAX_CACHE_SIZE 10 + ++static void __object_put(struct object *obj) ++{ ++ if (--obj->refcnt == 0) ++ kfree(obj); ++} ++ ++static void __object_get(struct object *obj) ++{ ++ obj->refcnt++; ++} ++ ++void object_put(struct object *obj) ++{ ++ unsigned long flags; ++ ++ spin_lock_irqsave(&cache_lock, flags); ++ __object_put(obj); ++ spin_unlock_irqrestore(&cache_lock, flags); ++} ++ ++void object_get(struct object *obj) ++{ ++ unsigned long flags; ++ ++ spin_lock_irqsave(&cache_lock, flags); ++ __object_get(obj); ++ spin_unlock_irqrestore(&cache_lock, flags); ++} ++ + /* Must be holding cache_lock */ + static struct object *__cache_find(int id) + { +@@ -35,6 +65,7 @@ + { + BUG_ON(!obj); + list_del(&obj->list); ++ __object_put(obj); + cache_num--; + } + +@@ -63,6 +94,7 @@ + strlcpy(obj->name, name, sizeof(obj->name)); + obj->id = id; + obj->popularity = 0; ++ obj->refcnt = 1; /* The cache holds a reference */ + + spin_lock_irqsave(&cache_lock, flags); + __cache_add(obj); +@@ -79,18 +111,15 @@ + spin_unlock_irqrestore(&cache_lock, flags); + } + +-int cache_find(int id, char *name) ++struct object *cache_find(int id) + { + struct object *obj; +- int ret = -ENOENT; + unsigned long flags; + + spin_lock_irqsave(&cache_lock, flags); + obj = __cache_find(id); +- if (obj) { +- ret = 0; +- strcpy(name, obj->name); +- } ++ if (obj) ++ __object_get(obj); + spin_unlock_irqrestore(&cache_lock, flags); +- return ret; ++ return obj; + } +</programlisting> + +<para> +We encapsulate the reference counting in the standard 'get' and 'put' +functions. Now we can return the object itself from +<function>cache_find</function> which has the advantage that the user +can now sleep holding the object (eg. to +<function>copy_to_user</function> to name to userspace). +</para> +<para> +The other point to note is that I said a reference should be held for +every pointer to the object: thus the reference count is 1 when first +inserted into the cache. In some versions the framework does not hold +a reference count, but they are more complicated. +</para> + + <sect2 id="examples-refcnt-atomic"> + <title>Using Atomic Operations For The Reference Count</title> +<para> +In practice, <type>atomic_t</type> would usually be used for +<structfield>refcnt</structfield>. There are a number of atomic +operations defined in + +<filename class="headerfile">include/asm/atomic.h</filename>: these are +guaranteed to be seen atomically from all CPUs in the system, so no +lock is required. In this case, it is simpler than using spinlocks, +although for anything non-trivial using spinlocks is clearer. The +<function>atomic_inc</function> and +<function>atomic_dec_and_test</function> are used instead of the +standard increment and decrement operators, and the lock is no longer +used to protect the reference count itself. +</para> + +<programlisting> +--- cache.c.refcnt 2003-12-09 15:00:35.000000000 +1100 ++++ cache.c.refcnt-atomic 2003-12-11 15:49:42.000000000 +1100 +@@ -7,7 +7,7 @@ + struct object + { + struct list_head list; +- unsigned int refcnt; ++ atomic_t refcnt; + int id; + char name[32]; + int popularity; +@@ -18,33 +18,15 @@ + static unsigned int cache_num = 0; + #define MAX_CACHE_SIZE 10 + +-static void __object_put(struct object *obj) +-{ +- if (--obj->refcnt == 0) +- kfree(obj); +-} +- +-static void __object_get(struct object *obj) +-{ +- obj->refcnt++; +-} +- + void object_put(struct object *obj) + { +- unsigned long flags; +- +- spin_lock_irqsave(&cache_lock, flags); +- __object_put(obj); +- spin_unlock_irqrestore(&cache_lock, flags); ++ if (atomic_dec_and_test(&obj->refcnt)) ++ kfree(obj); + } + + void object_get(struct object *obj) + { +- unsigned long flags; +- +- spin_lock_irqsave(&cache_lock, flags); +- __object_get(obj); +- spin_unlock_irqrestore(&cache_lock, flags); ++ atomic_inc(&obj->refcnt); + } + + /* Must be holding cache_lock */ +@@ -65,7 +47,7 @@ + { + BUG_ON(!obj); + list_del(&obj->list); +- __object_put(obj); ++ object_put(obj); + cache_num--; + } + +@@ -94,7 +76,7 @@ + strlcpy(obj->name, name, sizeof(obj->name)); + obj->id = id; + obj->popularity = 0; +- obj->refcnt = 1; /* The cache holds a reference */ ++ atomic_set(&obj->refcnt, 1); /* The cache holds a reference */ + + spin_lock_irqsave(&cache_lock, flags); + __cache_add(obj); +@@ -119,7 +101,7 @@ + spin_lock_irqsave(&cache_lock, flags); + obj = __cache_find(id); + if (obj) +- __object_get(obj); ++ object_get(obj); + spin_unlock_irqrestore(&cache_lock, flags); + return obj; + } +</programlisting> +</sect2> +</sect1> + + <sect1 id="examples-lock-per-obj"> + <title>Protecting The Objects Themselves</title> + <para> +In these examples, we assumed that the objects (except the reference +counts) never changed once they are created. If we wanted to allow +the name to change, there are three possibilities: + </para> + <itemizedlist> + <listitem> + <para> +You can make <symbol>cache_lock</symbol> non-static, and tell people +to grab that lock before changing the name in any object. + </para> + </listitem> + <listitem> + <para> +You can provide a <function>cache_obj_rename</function> which grabs +this lock and changes the name for the caller, and tell everyone to +use that function. + </para> + </listitem> + <listitem> + <para> +You can make the <symbol>cache_lock</symbol> protect only the cache +itself, and use another lock to protect the name. + </para> + </listitem> + </itemizedlist> + + <para> +Theoretically, you can make the locks as fine-grained as one lock for +every field, for every object. In practice, the most common variants +are: +</para> + <itemizedlist> + <listitem> + <para> +One lock which protects the infrastructure (the <symbol>cache</symbol> +list in this example) and all the objects. This is what we have done +so far. + </para> + </listitem> + <listitem> + <para> +One lock which protects the infrastructure (including the list +pointers inside the objects), and one lock inside the object which +protects the rest of that object. + </para> + </listitem> + <listitem> + <para> +Multiple locks to protect the infrastructure (eg. one lock per hash +chain), possibly with a separate per-object lock. + </para> + </listitem> + </itemizedlist> + +<para> +Here is the "lock-per-object" implementation: +</para> +<programlisting> +--- cache.c.refcnt-atomic 2003-12-11 15:50:54.000000000 +1100 ++++ cache.c.perobjectlock 2003-12-11 17:15:03.000000000 +1100 +@@ -6,11 +6,17 @@ + + struct object + { ++ /* These two protected by cache_lock. */ + struct list_head list; ++ int popularity; ++ + atomic_t refcnt; ++ ++ /* Doesn't change once created. */ + int id; ++ ++ spinlock_t lock; /* Protects the name */ + char name[32]; +- int popularity; + }; + + static spinlock_t cache_lock = SPIN_LOCK_UNLOCKED; +@@ -77,6 +84,7 @@ + obj->id = id; + obj->popularity = 0; + atomic_set(&obj->refcnt, 1); /* The cache holds a reference */ ++ spin_lock_init(&obj->lock); + + spin_lock_irqsave(&cache_lock, flags); + __cache_add(obj); +</programlisting> + +<para> +Note that I decide that the <structfield>popularity</structfield> +count should be protected by the <symbol>cache_lock</symbol> rather +than the per-object lock: this is because it (like the +<structname>struct list_head</structname> inside the object) is +logically part of the infrastructure. This way, I don't need to grab +the lock of every object in <function>__cache_add</function> when +seeking the least popular. +</para> + +<para> +I also decided that the <structfield>id</structfield> member is +unchangeable, so I don't need to grab each object lock in +<function>__cache_find()</function> to examine the +<structfield>id</structfield>: the object lock is only used by a +caller who wants to read or write the <structfield>name</structfield> +field. +</para> + +<para> +Note also that I added a comment describing what data was protected by +which locks. This is extremely important, as it describes the runtime +behavior of the code, and can be hard to gain from just reading. And +as Alan Cox says, <quote>Lock data, not code</quote>. +</para> +</sect1> +</chapter> + + <chapter id="common-problems"> + <title>Common Problems</title> + <sect1 id="deadlock"> + <title>Deadlock: Simple and Advanced</title> + + <para> + There is a coding bug where a piece of code tries to grab a + spinlock twice: it will spin forever, waiting for the lock to + be released (spinlocks, rwlocks and semaphores are not + recursive in Linux). This is trivial to diagnose: not a + stay-up-five-nights-talk-to-fluffy-code-bunnies kind of + problem. + </para> + + <para> + For a slightly more complex case, imagine you have a region + shared by a softirq and user context. If you use a + <function>spin_lock()</function> call to protect it, it is + possible that the user context will be interrupted by the softirq + while it holds the lock, and the softirq will then spin + forever trying to get the same lock. + </para> + + <para> + Both of these are called deadlock, and as shown above, it can + occur even with a single CPU (although not on UP compiles, + since spinlocks vanish on kernel compiles with + <symbol>CONFIG_SMP</symbol>=n. You'll still get data corruption + in the second example). + </para> + + <para> + This complete lockup is easy to diagnose: on SMP boxes the + watchdog timer or compiling with <symbol>DEBUG_SPINLOCKS</symbol> set + (<filename>include/linux/spinlock.h</filename>) will show this up + immediately when it happens. + </para> + + <para> + A more complex problem is the so-called 'deadly embrace', + involving two or more locks. Say you have a hash table: each + entry in the table is a spinlock, and a chain of hashed + objects. Inside a softirq handler, you sometimes want to + alter an object from one place in the hash to another: you + grab the spinlock of the old hash chain and the spinlock of + the new hash chain, and delete the object from the old one, + and insert it in the new one. + </para> + + <para> + There are two problems here. First, if your code ever + tries to move the object to the same chain, it will deadlock + with itself as it tries to lock it twice. Secondly, if the + same softirq on another CPU is trying to move another object + in the reverse direction, the following could happen: + </para> + + <table> + <title>Consequences</title> + + <tgroup cols="2" align="left"> + + <thead> + <row> + <entry>CPU 1</entry> + <entry>CPU 2</entry> + </row> + </thead> + + <tbody> + <row> + <entry>Grab lock A -> OK</entry> + <entry>Grab lock B -> OK</entry> + </row> + <row> + <entry>Grab lock B -> spin</entry> + <entry>Grab lock A -> spin</entry> + </row> + </tbody> + </tgroup> + </table> + + <para> + The two CPUs will spin forever, waiting for the other to give up + their lock. It will look, smell, and feel like a crash. + </para> + </sect1> + + <sect1 id="techs-deadlock-prevent"> + <title>Preventing Deadlock</title> + + <para> + Textbooks will tell you that if you always lock in the same + order, you will never get this kind of deadlock. Practice + will tell you that this approach doesn't scale: when I + create a new lock, I don't understand enough of the kernel + to figure out where in the 5000 lock hierarchy it will fit. + </para> + + <para> + The best locks are encapsulated: they never get exposed in + headers, and are never held around calls to non-trivial + functions outside the same file. You can read through this + code and see that it will never deadlock, because it never + tries to grab another lock while it has that one. People + using your code don't even need to know you are using a + lock. + </para> + + <para> + A classic problem here is when you provide callbacks or + hooks: if you call these with the lock held, you risk simple + deadlock, or a deadly embrace (who knows what the callback + will do?). Remember, the other programmers are out to get + you, so don't do this. + </para> + + <sect2 id="techs-deadlock-overprevent"> + <title>Overzealous Prevention Of Deadlocks</title> + + <para> + Deadlocks are problematic, but not as bad as data + corruption. Code which grabs a read lock, searches a list, + fails to find what it wants, drops the read lock, grabs a + write lock and inserts the object has a race condition. + </para> + + <para> + If you don't see why, please stay the fuck away from my code. + </para> + </sect2> + </sect1> + + <sect1 id="racing-timers"> + <title>Racing Timers: A Kernel Pastime</title> + + <para> + Timers can produce their own special problems with races. + Consider a collection of objects (list, hash, etc) where each + object has a timer which is due to destroy it. + </para> + + <para> + If you want to destroy the entire collection (say on module + removal), you might do the following: + </para> + + <programlisting> + /* THIS CODE BAD BAD BAD BAD: IF IT WAS ANY WORSE IT WOULD USE + HUNGARIAN NOTATION */ + spin_lock_bh(&list_lock); + + while (list) { + struct foo *next = list->next; + del_timer(&list->timer); + kfree(list); + list = next; + } + + spin_unlock_bh(&list_lock); + </programlisting> + + <para> + Sooner or later, this will crash on SMP, because a timer can + have just gone off before the <function>spin_lock_bh()</function>, + and it will only get the lock after we + <function>spin_unlock_bh()</function>, and then try to free + the element (which has already been freed!). + </para> + + <para> + This can be avoided by checking the result of + <function>del_timer()</function>: if it returns + <returnvalue>1</returnvalue>, the timer has been deleted. + If <returnvalue>0</returnvalue>, it means (in this + case) that it is currently running, so we can do: + </para> + + <programlisting> + retry: + spin_lock_bh(&list_lock); + + while (list) { + struct foo *next = list->next; + if (!del_timer(&list->timer)) { + /* Give timer a chance to delete this */ + spin_unlock_bh(&list_lock); + goto retry; + } + kfree(list); + list = next; + } + + spin_unlock_bh(&list_lock); + </programlisting> + + <para> + Another common problem is deleting timers which restart + themselves (by calling <function>add_timer()</function> at the end + of their timer function). Because this is a fairly common case + which is prone to races, you should use <function>del_timer_sync()</function> + (<filename class="headerfile">include/linux/timer.h</filename>) + to handle this case. It returns the number of times the timer + had to be deleted before we finally stopped it from adding itself back + in. + </para> + </sect1> + + </chapter> + + <chapter id="Efficiency"> + <title>Locking Speed</title> + + <para> +There are three main things to worry about when considering speed of +some code which does locking. First is concurrency: how many things +are going to be waiting while someone else is holding a lock. Second +is the time taken to actually acquire and release an uncontended lock. +Third is using fewer, or smarter locks. I'm assuming that the lock is +used fairly often: otherwise, you wouldn't be concerned about +efficiency. +</para> + <para> +Concurrency depends on how long the lock is usually held: you should +hold the lock for as long as needed, but no longer. In the cache +example, we always create the object without the lock held, and then +grab the lock only when we are ready to insert it in the list. +</para> + <para> +Acquisition times depend on how much damage the lock operations do to +the pipeline (pipeline stalls) and how likely it is that this CPU was +the last one to grab the lock (ie. is the lock cache-hot for this +CPU): on a machine with more CPUs, this likelihood drops fast. +Consider a 700MHz Intel Pentium III: an instruction takes about 0.7ns, +an atomic increment takes about 58ns, a lock which is cache-hot on +this CPU takes 160ns, and a cacheline transfer from another CPU takes +an additional 170 to 360ns. (These figures from Paul McKenney's +<ulink url="http://www.linuxjournal.com/article.php?sid=6993"> Linux +Journal RCU article</ulink>). +</para> + <para> +These two aims conflict: holding a lock for a short time might be done +by splitting locks into parts (such as in our final per-object-lock +example), but this increases the number of lock acquisitions, and the +results are often slower than having a single lock. This is another +reason to advocate locking simplicity. +</para> + <para> +The third concern is addressed below: there are some methods to reduce +the amount of locking which needs to be done. +</para> + + <sect1 id="efficiency-rwlocks"> + <title>Read/Write Lock Variants</title> + + <para> + Both spinlocks and semaphores have read/write variants: + <type>rwlock_t</type> and <structname>struct rw_semaphore</structname>. + These divide users into two classes: the readers and the writers. If + you are only reading the data, you can get a read lock, but to write to + the data you need the write lock. Many people can hold a read lock, + but a writer must be sole holder. + </para> + + <para> + If your code divides neatly along reader/writer lines (as our + cache code does), and the lock is held by readers for + significant lengths of time, using these locks can help. They + are slightly slower than the normal locks though, so in practice + <type>rwlock_t</type> is not usually worthwhile. + </para> + </sect1> + + <sect1 id="efficiency-read-copy-update"> + <title>Avoiding Locks: Read Copy Update</title> + + <para> + There is a special method of read/write locking called Read Copy + Update. Using RCU, the readers can avoid taking a lock + altogether: as we expect our cache to be read more often than + updated (otherwise the cache is a waste of time), it is a + candidate for this optimization. + </para> + + <para> + How do we get rid of read locks? Getting rid of read locks + means that writers may be changing the list underneath the + readers. That is actually quite simple: we can read a linked + list while an element is being added if the writer adds the + element very carefully. For example, adding + <symbol>new</symbol> to a single linked list called + <symbol>list</symbol>: + </para> + + <programlisting> + new->next = list->next; + wmb(); + list->next = new; + </programlisting> + + <para> + The <function>wmb()</function> is a write memory barrier. It + ensures that the first operation (setting the new element's + <symbol>next</symbol> pointer) is complete and will be seen by + all CPUs, before the second operation is (putting the new + element into the list). This is important, since modern + compilers and modern CPUs can both reorder instructions unless + told otherwise: we want a reader to either not see the new + element at all, or see the new element with the + <symbol>next</symbol> pointer correctly pointing at the rest of + the list. + </para> + <para> + Fortunately, there is a function to do this for standard + <structname>struct list_head</structname> lists: + <function>list_add_rcu()</function> + (<filename>include/linux/list.h</filename>). + </para> + <para> + Removing an element from the list is even simpler: we replace + the pointer to the old element with a pointer to its successor, + and readers will either see it, or skip over it. + </para> + <programlisting> + list->next = old->next; + </programlisting> + <para> + There is <function>list_del_rcu()</function> + (<filename>include/linux/list.h</filename>) which does this (the + normal version poisons the old object, which we don't want). + </para> + <para> + The reader must also be careful: some CPUs can look through the + <symbol>next</symbol> pointer to start reading the contents of + the next element early, but don't realize that the pre-fetched + contents is wrong when the <symbol>next</symbol> pointer changes + underneath them. Once again, there is a + <function>list_for_each_entry_rcu()</function> + (<filename>include/linux/list.h</filename>) to help you. Of + course, writers can just use + <function>list_for_each_entry()</function>, since there cannot + be two simultaneous writers. + </para> + <para> + Our final dilemma is this: when can we actually destroy the + removed element? Remember, a reader might be stepping through + this element in the list right now: it we free this element and + the <symbol>next</symbol> pointer changes, the reader will jump + off into garbage and crash. We need to wait until we know that + all the readers who were traversing the list when we deleted the + element are finished. We use <function>call_rcu()</function> to + register a callback which will actually destroy the object once + the readers are finished. + </para> + <para> + But how does Read Copy Update know when the readers are + finished? The method is this: firstly, the readers always + traverse the list inside + <function>rcu_read_lock()</function>/<function>rcu_read_unlock()</function> + pairs: these simply disable preemption so the reader won't go to + sleep while reading the list. + </para> + <para> + RCU then waits until every other CPU has slept at least once: + since readers cannot sleep, we know that any readers which were + traversing the list during the deletion are finished, and the + callback is triggered. The real Read Copy Update code is a + little more optimized than this, but this is the fundamental + idea. + </para> + +<programlisting> +--- cache.c.perobjectlock 2003-12-11 17:15:03.000000000 +1100 ++++ cache.c.rcupdate 2003-12-11 17:55:14.000000000 +1100 +@@ -1,15 +1,18 @@ + #include <linux/list.h> + #include <linux/slab.h> + #include <linux/string.h> ++#include <linux/rcupdate.h> + #include <asm/semaphore.h> + #include <asm/errno.h> + + struct object + { +- /* These two protected by cache_lock. */ ++ /* This is protected by RCU */ + struct list_head list; + int popularity; + ++ struct rcu_head rcu; ++ + atomic_t refcnt; + + /* Doesn't change once created. */ +@@ -40,7 +43,7 @@ + { + struct object *i; + +- list_for_each_entry(i, &cache, list) { ++ list_for_each_entry_rcu(i, &cache, list) { + if (i->id == id) { + i->popularity++; + return i; +@@ -49,19 +52,25 @@ + return NULL; + } + ++/* Final discard done once we know no readers are looking. */ ++static void cache_delete_rcu(void *arg) ++{ ++ object_put(arg); ++} ++ + /* Must be holding cache_lock */ + static void __cache_delete(struct object *obj) + { + BUG_ON(!obj); +- list_del(&obj->list); +- object_put(obj); ++ list_del_rcu(&obj->list); + cache_num--; ++ call_rcu(&obj->rcu, cache_delete_rcu, obj); + } + + /* Must be holding cache_lock */ + static void __cache_add(struct object *obj) + { +- list_add(&obj->list, &cache); ++ list_add_rcu(&obj->list, &cache); + if (++cache_num > MAX_CACHE_SIZE) { + struct object *i, *outcast = NULL; + list_for_each_entry(i, &cache, list) { +@@ -85,6 +94,7 @@ + obj->popularity = 0; + atomic_set(&obj->refcnt, 1); /* The cache holds a reference */ + spin_lock_init(&obj->lock); ++ INIT_RCU_HEAD(&obj->rcu); + + spin_lock_irqsave(&cache_lock, flags); + __cache_add(obj); +@@ -104,12 +114,11 @@ + struct object *cache_find(int id) + { + struct object *obj; +- unsigned long flags; + +- spin_lock_irqsave(&cache_lock, flags); ++ rcu_read_lock(); + obj = __cache_find(id); + if (obj) + object_get(obj); +- spin_unlock_irqrestore(&cache_lock, flags); ++ rcu_read_unlock(); + return obj; + } +</programlisting> + +<para> +Note that the reader will alter the +<structfield>popularity</structfield> member in +<function>__cache_find()</function>, and now it doesn't hold a lock. +One solution would be to make it an <type>atomic_t</type>, but for +this usage, we don't really care about races: an approximate result is +good enough, so I didn't change it. +</para> + +<para> +The result is that <function>cache_find()</function> requires no +synchronization with any other functions, so is almost as fast on SMP +as it would be on UP. +</para> + +<para> +There is a furthur optimization possible here: remember our original +cache code, where there were no reference counts and the caller simply +held the lock whenever using the object? This is still possible: if +you hold the lock, noone can delete the object, so you don't need to +get and put the reference count. +</para> + +<para> +Now, because the 'read lock' in RCU is simply disabling preemption, a +caller which always has preemption disabled between calling +<function>cache_find()</function> and +<function>object_put()</function> does not need to actually get and +put the reference count: we could expose +<function>__cache_find()</function> by making it non-static, and +such callers could simply call that. +</para> +<para> +The benefit here is that the reference count is not written to: the +object is not altered in any way, which is much faster on SMP +machines due to caching. +</para> + </sect1> + + <sect1 id="per-cpu"> + <title>Per-CPU Data</title> + + <para> + Another technique for avoiding locking which is used fairly + widely is to duplicate information for each CPU. For example, + if you wanted to keep a count of a common condition, you could + use a spin lock and a single counter. Nice and simple. + </para> + + <para> + If that was too slow (it's usually not, but if you've got a + really big machine to test on and can show that it is), you + could instead use a counter for each CPU, then none of them need + an exclusive lock. See <function>DEFINE_PER_CPU()</function>, + <function>get_cpu_var()</function> and + <function>put_cpu_var()</function> + (<filename class="headerfile">include/linux/percpu.h</filename>). + </para> + + <para> + Of particular use for simple per-cpu counters is the + <type>local_t</type> type, and the + <function>cpu_local_inc()</function> and related functions, + which are more efficient than simple code on some architectures + (<filename class="headerfile">include/asm/local.h</filename>). + </para> + + <para> + Note that there is no simple, reliable way of getting an exact + value of such a counter, without introducing more locks. This + is not a problem for some uses. + </para> + </sect1> + + <sect1 id="mostly-hardirq"> + <title>Data Which Mostly Used By An IRQ Handler</title> + + <para> + If data is always accessed from within the same IRQ handler, you + don't need a lock at all: the kernel already guarantees that the + irq handler will not run simultaneously on multiple CPUs. + </para> + <para> + Manfred Spraul points out that you can still do this, even if + the data is very occasionally accessed in user context or + softirqs/tasklets. The irq handler doesn't use a lock, and + all other accesses are done as so: + </para> + +<programlisting> + spin_lock(&lock); + disable_irq(irq); + ... + enable_irq(irq); + spin_unlock(&lock); +</programlisting> + <para> + The <function>disable_irq()</function> prevents the irq handler + from running (and waits for it to finish if it's currently + running on other CPUs). The spinlock prevents any other + accesses happening at the same time. Naturally, this is slower + than just a <function>spin_lock_irq()</function> call, so it + only makes sense if this type of access happens extremely + rarely. + </para> + </sect1> + </chapter> + + <chapter id="sleeping-things"> + <title>What Functions Are Safe To Call From Interrupts?</title> + + <para> + Many functions in the kernel sleep (ie. call schedule()) + directly or indirectly: you can never call them while holding a + spinlock, or with preemption disabled. This also means you need + to be in user context: calling them from an interrupt is illegal. + </para> + + <sect1 id="sleeping"> + <title>Some Functions Which Sleep</title> + + <para> + The most common ones are listed below, but you usually have to + read the code to find out if other calls are safe. If everyone + else who calls it can sleep, you probably need to be able to + sleep, too. In particular, registration and deregistration + functions usually expect to be called from user context, and can + sleep. + </para> + + <itemizedlist> + <listitem> + <para> + Accesses to + <firstterm linkend="gloss-userspace">userspace</firstterm>: + </para> + <itemizedlist> + <listitem> + <para> + <function>copy_from_user()</function> + </para> + </listitem> + <listitem> + <para> + <function>copy_to_user()</function> + </para> + </listitem> + <listitem> + <para> + <function>get_user()</function> + </para> + </listitem> + <listitem> + <para> + <function> put_user()</function> + </para> + </listitem> + </itemizedlist> + </listitem> + + <listitem> + <para> + <function>kmalloc(GFP_KERNEL)</function> + </para> + </listitem> + + <listitem> + <para> + <function>down_interruptible()</function> and + <function>down()</function> + </para> + <para> + There is a <function>down_trylock()</function> which can be + used inside interrupt context, as it will not sleep. + <function>up()</function> will also never sleep. + </para> + </listitem> + </itemizedlist> + </sect1> + + <sect1 id="dont-sleep"> + <title>Some Functions Which Don't Sleep</title> + + <para> + Some functions are safe to call from any context, or holding + almost any lock. + </para> + + <itemizedlist> + <listitem> + <para> + <function>printk()</function> + </para> + </listitem> + <listitem> + <para> + <function>kfree()</function> + </para> + </listitem> + <listitem> + <para> + <function>add_timer()</function> and <function>del_timer()</function> + </para> + </listitem> + </itemizedlist> + </sect1> + </chapter> + + <chapter id="references"> + <title>Further reading</title> + + <itemizedlist> + <listitem> + <para> + <filename>Documentation/spinlocks.txt</filename>: + Linus Torvalds' spinlocking tutorial in the kernel sources. + </para> + </listitem> + + <listitem> + <para> + Unix Systems for Modern Architectures: Symmetric + Multiprocessing and Caching for Kernel Programmers: + </para> + + <para> + Curt Schimmel's very good introduction to kernel level + locking (not written for Linux, but nearly everything + applies). The book is expensive, but really worth every + penny to understand SMP locking. [ISBN: 0201633388] + </para> + </listitem> + </itemizedlist> + </chapter> + + <chapter id="thanks"> + <title>Thanks</title> + + <para> + Thanks to Telsa Gwynne for DocBooking, neatening and adding + style. + </para> + + <para> + Thanks to Martin Pool, Philipp Rumpf, Stephen Rothwell, Paul + Mackerras, Ruedi Aschwanden, Alan Cox, Manfred Spraul, Tim + Waugh, Pete Zaitcev, James Morris, Robert Love, Paul McKenney, + John Ashby for proofreading, correcting, flaming, commenting. + </para> + + <para> + Thanks to the cabal for having no influence on this document. + </para> + </chapter> + + <glossary id="glossary"> + <title>Glossary</title> + + <glossentry id="gloss-preemption"> + <glossterm>preemption</glossterm> + <glossdef> + <para> + Prior to 2.5, or when <symbol>CONFIG_PREEMPT</symbol> is + unset, processes in user context inside the kernel would not + preempt each other (ie. you had that CPU until you have it up, + except for interrupts). With the addition of + <symbol>CONFIG_PREEMPT</symbol> in 2.5.4, this changed: when + in user context, higher priority tasks can "cut in": spinlocks + were changed to disable preemption, even on UP. + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-bh"> + <glossterm>bh</glossterm> + <glossdef> + <para> + Bottom Half: for historical reasons, functions with + '_bh' in them often now refer to any software interrupt, e.g. + <function>spin_lock_bh()</function> blocks any software interrupt + on the current CPU. Bottom halves are deprecated, and will + eventually be replaced by tasklets. Only one bottom half will be + running at any time. + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-hwinterrupt"> + <glossterm>Hardware Interrupt / Hardware IRQ</glossterm> + <glossdef> + <para> + Hardware interrupt request. <function>in_irq()</function> returns + <returnvalue>true</returnvalue> in a hardware interrupt handler. + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-interruptcontext"> + <glossterm>Interrupt Context</glossterm> + <glossdef> + <para> + Not user context: processing a hardware irq or software irq. + Indicated by the <function>in_interrupt()</function> macro + returning <returnvalue>true</returnvalue>. + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-smp"> + <glossterm><acronym>SMP</acronym></glossterm> + <glossdef> + <para> + Symmetric Multi-Processor: kernels compiled for multiple-CPU + machines. (CONFIG_SMP=y). + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-softirq"> + <glossterm>Software Interrupt / softirq</glossterm> + <glossdef> + <para> + Software interrupt handler. <function>in_irq()</function> returns + <returnvalue>false</returnvalue>; <function>in_softirq()</function> + returns <returnvalue>true</returnvalue>. Tasklets and softirqs + both fall into the category of 'software interrupts'. + </para> + <para> + Strictly speaking a softirq is one of up to 32 enumerated software + interrupts which can run on multiple CPUs at once. + Sometimes used to refer to tasklets as + well (ie. all software interrupts). + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-tasklet"> + <glossterm>tasklet</glossterm> + <glossdef> + <para> + A dynamically-registrable software interrupt, + which is guaranteed to only run on one CPU at a time. + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-timers"> + <glossterm>timer</glossterm> + <glossdef> + <para> + A dynamically-registrable software interrupt, which is run at + (or close to) a given time. When running, it is just like a + tasklet (in fact, they are called from the TIMER_SOFTIRQ). + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-up"> + <glossterm><acronym>UP</acronym></glossterm> + <glossdef> + <para> + Uni-Processor: Non-SMP. (CONFIG_SMP=n). + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-usercontext"> + <glossterm>User Context</glossterm> + <glossdef> + <para> + The kernel executing on behalf of a particular process (ie. a + system call or trap) or kernel thread. You can tell which + process with the <symbol>current</symbol> macro.) Not to + be confused with userspace. Can be interrupted by software or + hardware interrupts. + </para> + </glossdef> + </glossentry> + + <glossentry id="gloss-userspace"> + <glossterm>Userspace</glossterm> + <glossdef> + <para> + A process executing its own code outside the kernel. + </para> + </glossdef> + </glossentry> + + </glossary> +</book> + diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl new file mode 100644 index 000000000000..cf2fce7707da --- /dev/null +++ b/Documentation/DocBook/libata.tmpl @@ -0,0 +1,282 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="libataDevGuide"> + <bookinfo> + <title>libATA Developer's Guide</title> + + <authorgroup> + <author> + <firstname>Jeff</firstname> + <surname>Garzik</surname> + </author> + </authorgroup> + + <copyright> + <year>2003</year> + <holder>Jeff Garzik</holder> + </copyright> + + <legalnotice> + <para> + The contents of this file are subject to the Open + Software License version 1.1 that can be found at + <ulink url="http://www.opensource.org/licenses/osl-1.1.txt">http://www.opensource.org/licenses/osl-1.1.txt</ulink> and is included herein + by reference. + </para> + + <para> + Alternatively, the contents of this file may be used under the terms + of the GNU General Public License version 2 (the "GPL") as distributed + in the kernel source COPYING file, in which case the provisions of + the GPL are applicable instead of the above. If you wish to allow + the use of your version of this file only under the terms of the + GPL and not to allow others to use your version of this file under + the OSL, indicate your decision by deleting the provisions above and + replace them with the notice and other provisions required by the GPL. + If you do not delete the provisions above, a recipient may use your + version of this file under either the OSL or the GPL. + </para> + + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="libataThanks"> + <title>Thanks</title> + <para> + The bulk of the ATA knowledge comes thanks to long conversations with + Andre Hedrick (www.linux-ide.org). + </para> + <para> + Thanks to Alan Cox for pointing out similarities + between SATA and SCSI, and in general for motivation to hack on + libata. + </para> + <para> + libata's device detection + method, ata_pio_devchk, and in general all the early probing was + based on extensive study of Hale Landis's probe/reset code in his + ATADRVR driver (www.ata-atapi.com). + </para> + </chapter> + + <chapter id="libataDriverApi"> + <title>libata Driver API</title> + <sect1> + <title>struct ata_port_operations</title> + + <programlisting> +void (*port_disable) (struct ata_port *); + </programlisting> + + <para> + Called from ata_bus_probe() and ata_bus_reset() error paths, + as well as when unregistering from the SCSI module (rmmod, hot + unplug). + </para> + + <programlisting> +void (*dev_config) (struct ata_port *, struct ata_device *); + </programlisting> + + <para> + Called after IDENTIFY [PACKET] DEVICE is issued to each device + found. Typically used to apply device-specific fixups prior to + issue of SET FEATURES - XFER MODE, and prior to operation. + </para> + + <programlisting> +void (*set_piomode) (struct ata_port *, struct ata_device *); +void (*set_dmamode) (struct ata_port *, struct ata_device *); +void (*post_set_mode) (struct ata_port *ap); + </programlisting> + + <para> + Hooks called prior to the issue of SET FEATURES - XFER MODE + command. dev->pio_mode is guaranteed to be valid when + ->set_piomode() is called, and dev->dma_mode is guaranteed to be + valid when ->set_dmamode() is called. ->post_set_mode() is + called unconditionally, after the SET FEATURES - XFER MODE + command completes successfully. + </para> + + <para> + ->set_piomode() is always called (if present), but + ->set_dma_mode() is only called if DMA is possible. + </para> + + <programlisting> +void (*tf_load) (struct ata_port *ap, struct ata_taskfile *tf); +void (*tf_read) (struct ata_port *ap, struct ata_taskfile *tf); + </programlisting> + + <para> + ->tf_load() is called to load the given taskfile into hardware + registers / DMA buffers. ->tf_read() is called to read the + hardware registers / DMA buffers, to obtain the current set of + taskfile register values. + </para> + + <programlisting> +void (*exec_command)(struct ata_port *ap, struct ata_taskfile *tf); + </programlisting> + + <para> + causes an ATA command, previously loaded with + ->tf_load(), to be initiated in hardware. + </para> + + <programlisting> +u8 (*check_status)(struct ata_port *ap); +void (*dev_select)(struct ata_port *ap, unsigned int device); + </programlisting> + + <para> + Reads the Status ATA shadow register from hardware. On some + hardware, this has the side effect of clearing the interrupt + condition. + </para> + + <programlisting> +void (*dev_select)(struct ata_port *ap, unsigned int device); + </programlisting> + + <para> + Issues the low-level hardware command(s) that causes one of N + hardware devices to be considered 'selected' (active and + available for use) on the ATA bus. + </para> + + <programlisting> +void (*phy_reset) (struct ata_port *ap); + </programlisting> + + <para> + The very first step in the probe phase. Actions vary depending + on the bus type, typically. After waking up the device and probing + for device presence (PATA and SATA), typically a soft reset + (SRST) will be performed. Drivers typically use the helper + functions ata_bus_reset() or sata_phy_reset() for this hook. + </para> + + <programlisting> +void (*bmdma_setup) (struct ata_queued_cmd *qc); +void (*bmdma_start) (struct ata_queued_cmd *qc); + </programlisting> + + <para> + When setting up an IDE BMDMA transaction, these hooks arm + (->bmdma_setup) and fire (->bmdma_start) the hardware's DMA + engine. + </para> + + <programlisting> +void (*qc_prep) (struct ata_queued_cmd *qc); +int (*qc_issue) (struct ata_queued_cmd *qc); + </programlisting> + + <para> + Higher-level hooks, these two hooks can potentially supercede + several of the above taskfile/DMA engine hooks. ->qc_prep is + called after the buffers have been DMA-mapped, and is typically + used to populate the hardware's DMA scatter-gather table. + Most drivers use the standard ata_qc_prep() helper function, but + more advanced drivers roll their own. + </para> + <para> + ->qc_issue is used to make a command active, once the hardware + and S/G tables have been prepared. IDE BMDMA drivers use the + helper function ata_qc_issue_prot() for taskfile protocol-based + dispatch. More advanced drivers roll their own ->qc_issue + implementation, using this as the "issue new ATA command to + hardware" hook. + </para> + + <programlisting> +void (*eng_timeout) (struct ata_port *ap); + </programlisting> + + <para> + This is a high level error handling function, called from the + error handling thread, when a command times out. + </para> + + <programlisting> +irqreturn_t (*irq_handler)(int, void *, struct pt_regs *); +void (*irq_clear) (struct ata_port *); + </programlisting> + + <para> + ->irq_handler is the interrupt handling routine registered with + the system, by libata. ->irq_clear is called during probe just + before the interrupt handler is registered, to be sure hardware + is quiet. + </para> + + <programlisting> +u32 (*scr_read) (struct ata_port *ap, unsigned int sc_reg); +void (*scr_write) (struct ata_port *ap, unsigned int sc_reg, + u32 val); + </programlisting> + + <para> + Read and write standard SATA phy registers. Currently only used + if ->phy_reset hook called the sata_phy_reset() helper function. + </para> + + <programlisting> +int (*port_start) (struct ata_port *ap); +void (*port_stop) (struct ata_port *ap); +void (*host_stop) (struct ata_host_set *host_set); + </programlisting> + + <para> + ->port_start() is called just after the data structures for each + port are initialized. Typically this is used to alloc per-port + DMA buffers / tables / rings, enable DMA engines, and similar + tasks. + </para> + <para> + ->host_stop() is called when the rmmod or hot unplug process + begins. The hook must stop all hardware interrupts, DMA + engines, etc. + </para> + <para> + ->port_stop() is called after ->host_stop(). It's sole function + is to release DMA/memory resources, now that they are no longer + actively being used. + </para> + + </sect1> + </chapter> + + <chapter id="libataExt"> + <title>libata Library</title> +!Edrivers/scsi/libata-core.c + </chapter> + + <chapter id="libataInt"> + <title>libata Core Internals</title> +!Idrivers/scsi/libata-core.c + </chapter> + + <chapter id="libataScsiInt"> + <title>libata SCSI translation/emulation</title> +!Edrivers/scsi/libata-scsi.c +!Idrivers/scsi/libata-scsi.c + </chapter> + + <chapter id="PiixInt"> + <title>ata_piix Internals</title> +!Idrivers/scsi/ata_piix.c + </chapter> + + <chapter id="SILInt"> + <title>sata_sil Internals</title> +!Idrivers/scsi/sata_sil.c + </chapter> + +</book> diff --git a/Documentation/DocBook/librs.tmpl b/Documentation/DocBook/librs.tmpl new file mode 100644 index 000000000000..3ff39bafc00e --- /dev/null +++ b/Documentation/DocBook/librs.tmpl @@ -0,0 +1,289 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="Reed-Solomon-Library-Guide"> + <bookinfo> + <title>Reed-Solomon Library Programming Interface</title> + + <authorgroup> + <author> + <firstname>Thomas</firstname> + <surname>Gleixner</surname> + <affiliation> + <address> + <email>tglx@linutronix.de</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2004</year> + <holder>Thomas Gleixner</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License version 2 as published by the Free Software Foundation. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + The generic Reed-Solomon Library provides encoding, decoding + and error correction functions. + </para> + <para> + Reed-Solomon codes are used in communication and storage + applications to ensure data integrity. + </para> + <para> + This documentation is provided for developers who want to utilize + the functions provided by the library. + </para> + </chapter> + + <chapter id="bugs"> + <title>Known Bugs And Assumptions</title> + <para> + None. + </para> + </chapter> + + <chapter id="usage"> + <title>Usage</title> + <para> + This chapter provides examples how to use the library. + </para> + <sect1> + <title>Initializing</title> + <para> + The init function init_rs returns a pointer to a + rs decoder structure, which holds the necessary + information for encoding, decoding and error correction + with the given polynomial. It either uses an existing + matching decoder or creates a new one. On creation all + the lookup tables for fast en/decoding are created. + The function may take a while, so make sure not to + call it in critical code paths. + </para> + <programlisting> +/* the Reed Solomon control structure */ +static struct rs_control *rs_decoder; + +/* Symbolsize is 10 (bits) + * Primitve polynomial is x^10+x^3+1 + * first consecutive root is 0 + * primitve element to generate roots = 1 + * generator polinomial degree (number of roots) = 6 + */ +rs_decoder = init_rs (10, 0x409, 0, 1, 6); + </programlisting> + </sect1> + <sect1> + <title>Encoding</title> + <para> + The encoder calculates the Reed-Solomon code over + the given data length and stores the result in + the parity buffer. Note that the parity buffer must + be initialized before calling the encoder. + </para> + <para> + The expanded data can be inverted on the fly by + providing a non zero inversion mask. The expanded data is + XOR'ed with the mask. This is used e.g. for FLASH + ECC, where the all 0xFF is inverted to an all 0x00. + The Reed-Solomon code for all 0x00 is all 0x00. The + code is inverted before storing to FLASH so it is 0xFF + too. This prevent's that reading from an erased FLASH + results in ECC errors. + </para> + <para> + The databytes are expanded to the given symbol size + on the fly. There is no support for encoding continuous + bitstreams with a symbol size != 8 at the moment. If + it is necessary it should be not a big deal to implement + such functionality. + </para> + <programlisting> +/* Parity buffer. Size = number of roots */ +uint16_t par[6]; +/* Initialize the parity buffer */ +memset(par, 0, sizeof(par)); +/* Encode 512 byte in data8. Store parity in buffer par */ +encode_rs8 (rs_decoder, data8, 512, par, 0); + </programlisting> + </sect1> + <sect1> + <title>Decoding</title> + <para> + The decoder calculates the syndrome over + the given data length and the received parity symbols + and corrects errors in the data. + </para> + <para> + If a syndrome is available from a hardware decoder + then the syndrome calculation is skipped. + </para> + <para> + The correction of the data buffer can be suppressed + by providing a correction pattern buffer and an error + location buffer to the decoder. The decoder stores the + calculated error location and the correction bitmask + in the given buffers. This is useful for hardware + decoders which use a weird bit ordering scheme. + </para> + <para> + The databytes are expanded to the given symbol size + on the fly. There is no support for decoding continuous + bitstreams with a symbolsize != 8 at the moment. If + it is necessary it should be not a big deal to implement + such functionality. + </para> + + <sect2> + <title> + Decoding with syndrome calculation, direct data correction + </title> + <programlisting> +/* Parity buffer. Size = number of roots */ +uint16_t par[6]; +uint8_t data[512]; +int numerr; +/* Receive data */ +..... +/* Receive parity */ +..... +/* Decode 512 byte in data8.*/ +numerr = decode_rs8 (rs_decoder, data8, par, 512, NULL, 0, NULL, 0, NULL); + </programlisting> + </sect2> + + <sect2> + <title> + Decoding with syndrome given by hardware decoder, direct data correction + </title> + <programlisting> +/* Parity buffer. Size = number of roots */ +uint16_t par[6], syn[6]; +uint8_t data[512]; +int numerr; +/* Receive data */ +..... +/* Receive parity */ +..... +/* Get syndrome from hardware decoder */ +..... +/* Decode 512 byte in data8.*/ +numerr = decode_rs8 (rs_decoder, data8, par, 512, syn, 0, NULL, 0, NULL); + </programlisting> + </sect2> + + <sect2> + <title> + Decoding with syndrome given by hardware decoder, no direct data correction. + </title> + <para> + Note: It's not necessary to give data and received parity to the decoder. + </para> + <programlisting> +/* Parity buffer. Size = number of roots */ +uint16_t par[6], syn[6], corr[8]; +uint8_t data[512]; +int numerr, errpos[8]; +/* Receive data */ +..... +/* Receive parity */ +..... +/* Get syndrome from hardware decoder */ +..... +/* Decode 512 byte in data8.*/ +numerr = decode_rs8 (rs_decoder, NULL, NULL, 512, syn, 0, errpos, 0, corr); +for (i = 0; i < numerr; i++) { + do_error_correction_in_your_buffer(errpos[i], corr[i]); +} + </programlisting> + </sect2> + </sect1> + <sect1> + <title>Cleanup</title> + <para> + The function free_rs frees the allocated resources, + if the caller is the last user of the decoder. + </para> + <programlisting> +/* Release resources */ +free_rs(rs_decoder); + </programlisting> + </sect1> + + </chapter> + + <chapter id="structs"> + <title>Structures</title> + <para> + This chapter contains the autogenerated documentation of the structures which are + used in the Reed-Solomon Library and are relevant for a developer. + </para> +!Iinclude/linux/rslib.h + </chapter> + + <chapter id="pubfunctions"> + <title>Public Functions Provided</title> + <para> + This chapter contains the autogenerated documentation of the Reed-Solomon functions + which are exported. + </para> +!Elib/reed_solomon/reed_solomon.c + </chapter> + + <chapter id="credits"> + <title>Credits</title> + <para> + The library code for encoding and decoding was written by Phil Karn. + </para> + <programlisting> + Copyright 2002, Phil Karn, KA9Q + May be used under the terms of the GNU General Public License (GPL) + </programlisting> + <para> + The wrapper functions and interfaces are written by Thomas Gleixner + </para> + <para> + Many users have provided bugfixes, improvements and helping hands for testing. + Thanks a lot. + </para> + <para> + The following people have contributed to this document: + </para> + <para> + Thomas Gleixner<email>tglx@linutronix.de</email> + </para> + </chapter> +</book> diff --git a/Documentation/DocBook/lsm.tmpl b/Documentation/DocBook/lsm.tmpl new file mode 100644 index 000000000000..f63822195871 --- /dev/null +++ b/Documentation/DocBook/lsm.tmpl @@ -0,0 +1,265 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<article class="whitepaper" id="LinuxSecurityModule" lang="en"> + <articleinfo> + <title>Linux Security Modules: General Security Hooks for Linux</title> + <authorgroup> + <author> + <firstname>Stephen</firstname> + <surname>Smalley</surname> + <affiliation> + <orgname>NAI Labs</orgname> + <address><email>ssmalley@nai.com</email></address> + </affiliation> + </author> + <author> + <firstname>Timothy</firstname> + <surname>Fraser</surname> + <affiliation> + <orgname>NAI Labs</orgname> + <address><email>tfraser@nai.com</email></address> + </affiliation> + </author> + <author> + <firstname>Chris</firstname> + <surname>Vance</surname> + <affiliation> + <orgname>NAI Labs</orgname> + <address><email>cvance@nai.com</email></address> + </affiliation> + </author> + </authorgroup> + </articleinfo> + +<sect1><title>Introduction</title> + +<para> +In March 2001, the National Security Agency (NSA) gave a presentation +about Security-Enhanced Linux (SELinux) at the 2.5 Linux Kernel +Summit. SELinux is an implementation of flexible and fine-grained +nondiscretionary access controls in the Linux kernel, originally +implemented as its own particular kernel patch. Several other +security projects (e.g. RSBAC, Medusa) have also developed flexible +access control architectures for the Linux kernel, and various +projects have developed particular access control models for Linux +(e.g. LIDS, DTE, SubDomain). Each project has developed and +maintained its own kernel patch to support its security needs. +</para> + +<para> +In response to the NSA presentation, Linus Torvalds made a set of +remarks that described a security framework he would be willing to +consider for inclusion in the mainstream Linux kernel. He described a +general framework that would provide a set of security hooks to +control operations on kernel objects and a set of opaque security +fields in kernel data structures for maintaining security attributes. +This framework could then be used by loadable kernel modules to +implement any desired model of security. Linus also suggested the +possibility of migrating the Linux capabilities code into such a +module. +</para> + +<para> +The Linux Security Modules (LSM) project was started by WireX to +develop such a framework. LSM is a joint development effort by +several security projects, including Immunix, SELinux, SGI and Janus, +and several individuals, including Greg Kroah-Hartman and James +Morris, to develop a Linux kernel patch that implements this +framework. The patch is currently tracking the 2.4 series and is +targeted for integration into the 2.5 development series. This +technical report provides an overview of the framework and the example +capabilities security module provided by the LSM kernel patch. +</para> + +</sect1> + +<sect1 id="framework"><title>LSM Framework</title> + +<para> +The LSM kernel patch provides a general kernel framework to support +security modules. In particular, the LSM framework is primarily +focused on supporting access control modules, although future +development is likely to address other security needs such as +auditing. By itself, the framework does not provide any additional +security; it merely provides the infrastructure to support security +modules. The LSM kernel patch also moves most of the capabilities +logic into an optional security module, with the system defaulting +to the traditional superuser logic. This capabilities module +is discussed further in <xref linkend="cap"/>. +</para> + +<para> +The LSM kernel patch adds security fields to kernel data structures +and inserts calls to hook functions at critical points in the kernel +code to manage the security fields and to perform access control. It +also adds functions for registering and unregistering security +modules, and adds a general <function>security</function> system call +to support new system calls for security-aware applications. +</para> + +<para> +The LSM security fields are simply <type>void*</type> pointers. For +process and program execution security information, security fields +were added to <structname>struct task_struct</structname> and +<structname>struct linux_binprm</structname>. For filesystem security +information, a security field was added to +<structname>struct super_block</structname>. For pipe, file, and socket +security information, security fields were added to +<structname>struct inode</structname> and +<structname>struct file</structname>. For packet and network device security +information, security fields were added to +<structname>struct sk_buff</structname> and +<structname>struct net_device</structname>. For System V IPC security +information, security fields were added to +<structname>struct kern_ipc_perm</structname> and +<structname>struct msg_msg</structname>; additionally, the definitions +for <structname>struct msg_msg</structname>, <structname>struct +msg_queue</structname>, and <structname>struct +shmid_kernel</structname> were moved to header files +(<filename>include/linux/msg.h</filename> and +<filename>include/linux/shm.h</filename> as appropriate) to allow +the security modules to use these definitions. +</para> + +<para> +Each LSM hook is a function pointer in a global table, +security_ops. This table is a +<structname>security_operations</structname> structure as defined by +<filename>include/linux/security.h</filename>. Detailed documentation +for each hook is included in this header file. At present, this +structure consists of a collection of substructures that group related +hooks based on the kernel object (e.g. task, inode, file, sk_buff, +etc) as well as some top-level hook function pointers for system +operations. This structure is likely to be flattened in the future +for performance. The placement of the hook calls in the kernel code +is described by the "called:" lines in the per-hook documentation in +the header file. The hook calls can also be easily found in the +kernel code by looking for the string "security_ops->". + +</para> + +<para> +Linus mentioned per-process security hooks in his original remarks as a +possible alternative to global security hooks. However, if LSM were +to start from the perspective of per-process hooks, then the base +framework would have to deal with how to handle operations that +involve multiple processes (e.g. kill), since each process might have +its own hook for controlling the operation. This would require a +general mechanism for composing hooks in the base framework. +Additionally, LSM would still need global hooks for operations that +have no process context (e.g. network input operations). +Consequently, LSM provides global security hooks, but a security +module is free to implement per-process hooks (where that makes sense) +by storing a security_ops table in each process' security field and +then invoking these per-process hooks from the global hooks. +The problem of composition is thus deferred to the module. +</para> + +<para> +The global security_ops table is initialized to a set of hook +functions provided by a dummy security module that provides +traditional superuser logic. A <function>register_security</function> +function (in <filename>security/security.c</filename>) is provided to +allow a security module to set security_ops to refer to its own hook +functions, and an <function>unregister_security</function> function is +provided to revert security_ops to the dummy module hooks. This +mechanism is used to set the primary security module, which is +responsible for making the final decision for each hook. +</para> + +<para> +LSM also provides a simple mechanism for stacking additional security +modules with the primary security module. It defines +<function>register_security</function> and +<function>unregister_security</function> hooks in the +<structname>security_operations</structname> structure and provides +<function>mod_reg_security</function> and +<function>mod_unreg_security</function> functions that invoke these +hooks after performing some sanity checking. A security module can +call these functions in order to stack with other modules. However, +the actual details of how this stacking is handled are deferred to the +module, which can implement these hooks in any way it wishes +(including always returning an error if it does not wish to support +stacking). In this manner, LSM again defers the problem of +composition to the module. +</para> + +<para> +Although the LSM hooks are organized into substructures based on +kernel object, all of the hooks can be viewed as falling into two +major categories: hooks that are used to manage the security fields +and hooks that are used to perform access control. Examples of the +first category of hooks include the +<function>alloc_security</function> and +<function>free_security</function> hooks defined for each kernel data +structure that has a security field. These hooks are used to allocate +and free security structures for kernel objects. The first category +of hooks also includes hooks that set information in the security +field after allocation, such as the <function>post_lookup</function> +hook in <structname>struct inode_security_ops</structname>. This hook +is used to set security information for inodes after successful lookup +operations. An example of the second category of hooks is the +<function>permission</function> hook in +<structname>struct inode_security_ops</structname>. This hook checks +permission when accessing an inode. +</para> + +</sect1> + +<sect1 id="cap"><title>LSM Capabilities Module</title> + +<para> +The LSM kernel patch moves most of the existing POSIX.1e capabilities +logic into an optional security module stored in the file +<filename>security/capability.c</filename>. This change allows +users who do not want to use capabilities to omit this code entirely +from their kernel, instead using the dummy module for traditional +superuser logic or any other module that they desire. This change +also allows the developers of the capabilities logic to maintain and +enhance their code more freely, without needing to integrate patches +back into the base kernel. +</para> + +<para> +In addition to moving the capabilities logic, the LSM kernel patch +could move the capability-related fields from the kernel data +structures into the new security fields managed by the security +modules. However, at present, the LSM kernel patch leaves the +capability fields in the kernel data structures. In his original +remarks, Linus suggested that this might be preferable so that other +security modules can be easily stacked with the capabilities module +without needing to chain multiple security structures on the security field. +It also avoids imposing extra overhead on the capabilities module +to manage the security fields. However, the LSM framework could +certainly support such a move if it is determined to be desirable, +with only a few additional changes described below. +</para> + +<para> +At present, the capabilities logic for computing process capabilities +on <function>execve</function> and <function>set*uid</function>, +checking capabilities for a particular process, saving and checking +capabilities for netlink messages, and handling the +<function>capget</function> and <function>capset</function> system +calls have been moved into the capabilities module. There are still a +few locations in the base kernel where capability-related fields are +directly examined or modified, but the current version of the LSM +patch does allow a security module to completely replace the +assignment and testing of capabilities. These few locations would +need to be changed if the capability-related fields were moved into +the security field. The following is a list of known locations that +still perform such direct examination or modification of +capability-related fields: +<itemizedlist> +<listitem><para><filename>fs/open.c</filename>:<function>sys_access</function></para></listitem> +<listitem><para><filename>fs/lockd/host.c</filename>:<function>nlm_bind_host</function></para></listitem> +<listitem><para><filename>fs/nfsd/auth.c</filename>:<function>nfsd_setuser</function></para></listitem> +<listitem><para><filename>fs/proc/array.c</filename>:<function>task_cap</function></para></listitem> +</itemizedlist> +</para> + +</sect1> + +</article> diff --git a/Documentation/DocBook/man/Makefile b/Documentation/DocBook/man/Makefile new file mode 100644 index 000000000000..4fb7ea0f7ac8 --- /dev/null +++ b/Documentation/DocBook/man/Makefile @@ -0,0 +1,3 @@ +# Rules are put in Documentation/DocBook + +clean-files := *.9.gz *.sgml manpage.links manpage.refs diff --git a/Documentation/DocBook/mcabook.tmpl b/Documentation/DocBook/mcabook.tmpl new file mode 100644 index 000000000000..4367f4642f3d --- /dev/null +++ b/Documentation/DocBook/mcabook.tmpl @@ -0,0 +1,107 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="MCAGuide"> + <bookinfo> + <title>MCA Driver Programming Interface</title> + + <authorgroup> + <author> + <firstname>Alan</firstname> + <surname>Cox</surname> + <affiliation> + <address> + <email>alan@redhat.com</email> + </address> + </affiliation> + </author> + <author> + <firstname>David</firstname> + <surname>Weinehall</surname> + </author> + <author> + <firstname>Chris</firstname> + <surname>Beauregard</surname> + </author> + </authorgroup> + + <copyright> + <year>2000</year> + <holder>Alan Cox</holder> + <holder>David Weinehall</holder> + <holder>Chris Beauregard</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + The MCA bus functions provide a generalised interface to find MCA + bus cards, to claim them for a driver, and to read and manipulate POS + registers without being aware of the motherboard internals or + certain deep magic specific to onboard devices. + </para> + <para> + The basic interface to the MCA bus devices is the slot. Each slot + is numbered and virtual slot numbers are assigned to the internal + devices. Using a pci_dev as other busses do does not really make + sense in the MCA context as the MCA bus resources require card + specific interpretation. + </para> + <para> + Finally the MCA bus functions provide a parallel set of DMA + functions mimicing the ISA bus DMA functions as closely as possible, + although also supporting the additional DMA functionality on the + MCA bus controllers. + </para> + </chapter> + <chapter id="bugs"> + <title>Known Bugs And Assumptions</title> + <para> + None. + </para> + </chapter> + + <chapter id="pubfunctions"> + <title>Public Functions Provided</title> +!Earch/i386/kernel/mca.c + </chapter> + + <chapter id="dmafunctions"> + <title>DMA Functions Provided</title> +!Iinclude/asm-i386/mca_dma.h + </chapter> + +</book> diff --git a/Documentation/DocBook/mtdnand.tmpl b/Documentation/DocBook/mtdnand.tmpl new file mode 100644 index 000000000000..6e463d0db266 --- /dev/null +++ b/Documentation/DocBook/mtdnand.tmpl @@ -0,0 +1,1320 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="MTD-NAND-Guide"> + <bookinfo> + <title>MTD NAND Driver Programming Interface</title> + + <authorgroup> + <author> + <firstname>Thomas</firstname> + <surname>Gleixner</surname> + <affiliation> + <address> + <email>tglx@linutronix.de</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2004</year> + <holder>Thomas Gleixner</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License version 2 as published by the Free Software Foundation. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + The generic NAND driver supports almost all NAND and AG-AND based + chips and connects them to the Memory Technology Devices (MTD) + subsystem of the Linux Kernel. + </para> + <para> + This documentation is provided for developers who want to implement + board drivers or filesystem drivers suitable for NAND devices. + </para> + </chapter> + + <chapter id="bugs"> + <title>Known Bugs And Assumptions</title> + <para> + None. + </para> + </chapter> + + <chapter id="dochints"> + <title>Documentation hints</title> + <para> + The function and structure docs are autogenerated. Each function and + struct member has a short description which is marked with an [XXX] identifier. + The following chapters explain the meaning of those identifiers. + </para> + <sect1> + <title>Function identifiers [XXX]</title> + <para> + The functions are marked with [XXX] identifiers in the short + comment. The identifiers explain the usage and scope of the + functions. Following identifiers are used: + </para> + <itemizedlist> + <listitem><para> + [MTD Interface]</para><para> + These functions provide the interface to the MTD kernel API. + They are not replacable and provide functionality + which is complete hardware independent. + </para></listitem> + <listitem><para> + [NAND Interface]</para><para> + These functions are exported and provide the interface to the NAND kernel API. + </para></listitem> + <listitem><para> + [GENERIC]</para><para> + Generic functions are not replacable and provide functionality + which is complete hardware independent. + </para></listitem> + <listitem><para> + [DEFAULT]</para><para> + Default functions provide hardware related functionality which is suitable + for most of the implementations. These functions can be replaced by the + board driver if neccecary. Those functions are called via pointers in the + NAND chip description structure. The board driver can set the functions which + should be replaced by board dependend functions before calling nand_scan(). + If the function pointer is NULL on entry to nand_scan() then the pointer + is set to the default function which is suitable for the detected chip type. + </para></listitem> + </itemizedlist> + </sect1> + <sect1> + <title>Struct member identifiers [XXX]</title> + <para> + The struct members are marked with [XXX] identifiers in the + comment. The identifiers explain the usage and scope of the + members. Following identifiers are used: + </para> + <itemizedlist> + <listitem><para> + [INTERN]</para><para> + These members are for NAND driver internal use only and must not be + modified. Most of these values are calculated from the chip geometry + information which is evaluated during nand_scan(). + </para></listitem> + <listitem><para> + [REPLACEABLE]</para><para> + Replaceable members hold hardware related functions which can be + provided by the board driver. The board driver can set the functions which + should be replaced by board dependend functions before calling nand_scan(). + If the function pointer is NULL on entry to nand_scan() then the pointer + is set to the default function which is suitable for the detected chip type. + </para></listitem> + <listitem><para> + [BOARDSPECIFIC]</para><para> + Board specific members hold hardware related information which must + be provided by the board driver. The board driver must set the function + pointers and datafields before calling nand_scan(). + </para></listitem> + <listitem><para> + [OPTIONAL]</para><para> + Optional members can hold information relevant for the board driver. The + generic NAND driver code does not use this information. + </para></listitem> + </itemizedlist> + </sect1> + </chapter> + + <chapter id="basicboarddriver"> + <title>Basic board driver</title> + <para> + For most boards it will be sufficient to provide just the + basic functions and fill out some really board dependend + members in the nand chip description structure. + See drivers/mtd/nand/skeleton for reference. + </para> + <sect1> + <title>Basic defines</title> + <para> + At least you have to provide a mtd structure and + a storage for the ioremap'ed chip address. + You can allocate the mtd structure using kmalloc + or you can allocate it statically. + In case of static allocation you have to allocate + a nand_chip structure too. + </para> + <para> + Kmalloc based example + </para> + <programlisting> +static struct mtd_info *board_mtd; +static unsigned long baseaddr; + </programlisting> + <para> + Static example + </para> + <programlisting> +static struct mtd_info board_mtd; +static struct nand_chip board_chip; +static unsigned long baseaddr; + </programlisting> + </sect1> + <sect1> + <title>Partition defines</title> + <para> + If you want to divide your device into parititions, then + enable the configuration switch CONFIG_MTD_PARITIONS and define + a paritioning scheme suitable to your board. + </para> + <programlisting> +#define NUM_PARTITIONS 2 +static struct mtd_partition partition_info[] = { + { .name = "Flash partition 1", + .offset = 0, + .size = 8 * 1024 * 1024 }, + { .name = "Flash partition 2", + .offset = MTDPART_OFS_NEXT, + .size = MTDPART_SIZ_FULL }, +}; + </programlisting> + </sect1> + <sect1> + <title>Hardware control function</title> + <para> + The hardware control function provides access to the + control pins of the NAND chip(s). + The access can be done by GPIO pins or by address lines. + If you use address lines, make sure that the timing + requirements are met. + </para> + <para> + <emphasis>GPIO based example</emphasis> + </para> + <programlisting> +static void board_hwcontrol(struct mtd_info *mtd, int cmd) +{ + switch(cmd){ + case NAND_CTL_SETCLE: /* Set CLE pin high */ break; + case NAND_CTL_CLRCLE: /* Set CLE pin low */ break; + case NAND_CTL_SETALE: /* Set ALE pin high */ break; + case NAND_CTL_CLRALE: /* Set ALE pin low */ break; + case NAND_CTL_SETNCE: /* Set nCE pin low */ break; + case NAND_CTL_CLRNCE: /* Set nCE pin high */ break; + } +} + </programlisting> + <para> + <emphasis>Address lines based example.</emphasis> It's assumed that the + nCE pin is driven by a chip select decoder. + </para> + <programlisting> +static void board_hwcontrol(struct mtd_info *mtd, int cmd) +{ + struct nand_chip *this = (struct nand_chip *) mtd->priv; + switch(cmd){ + case NAND_CTL_SETCLE: this->IO_ADDR_W |= CLE_ADRR_BIT; break; + case NAND_CTL_CLRCLE: this->IO_ADDR_W &= ~CLE_ADRR_BIT; break; + case NAND_CTL_SETALE: this->IO_ADDR_W |= ALE_ADRR_BIT; break; + case NAND_CTL_CLRALE: this->IO_ADDR_W &= ~ALE_ADRR_BIT; break; + } +} + </programlisting> + </sect1> + <sect1> + <title>Device ready function</title> + <para> + If the hardware interface has the ready busy pin of the NAND chip connected to a + GPIO or other accesible I/O pin, this function is used to read back the state of the + pin. The function has no arguments and should return 0, if the device is busy (R/B pin + is low) and 1, if the device is ready (R/B pin is high). + If the hardware interface does not give access to the ready busy pin, then + the function must not be defined and the function pointer this->dev_ready is set to NULL. + </para> + </sect1> + <sect1> + <title>Init function</title> + <para> + The init function allocates memory and sets up all the board + specific parameters and function pointers. When everything + is set up nand_scan() is called. This function tries to + detect and identify then chip. If a chip is found all the + internal data fields are initialized accordingly. + The structure(s) have to be zeroed out first and then filled with the neccecary + information about the device. + </para> + <programlisting> +int __init board_init (void) +{ + struct nand_chip *this; + int err = 0; + + /* Allocate memory for MTD device structure and private data */ + board_mtd = kmalloc (sizeof(struct mtd_info) + sizeof (struct nand_chip), GFP_KERNEL); + if (!board_mtd) { + printk ("Unable to allocate NAND MTD device structure.\n"); + err = -ENOMEM; + goto out; + } + + /* Initialize structures */ + memset ((char *) board_mtd, 0, sizeof(struct mtd_info) + sizeof(struct nand_chip)); + + /* map physical adress */ + baseaddr = (unsigned long)ioremap(CHIP_PHYSICAL_ADDRESS, 1024); + if(!baseaddr){ + printk("Ioremap to access NAND chip failed\n"); + err = -EIO; + goto out_mtd; + } + + /* Get pointer to private data */ + this = (struct nand_chip *) (); + /* Link the private data with the MTD structure */ + board_mtd->priv = this; + + /* Set address of NAND IO lines */ + this->IO_ADDR_R = baseaddr; + this->IO_ADDR_W = baseaddr; + /* Reference hardware control function */ + this->hwcontrol = board_hwcontrol; + /* Set command delay time, see datasheet for correct value */ + this->chip_delay = CHIP_DEPENDEND_COMMAND_DELAY; + /* Assign the device ready function, if available */ + this->dev_ready = board_dev_ready; + this->eccmode = NAND_ECC_SOFT; + + /* Scan to find existance of the device */ + if (nand_scan (board_mtd, 1)) { + err = -ENXIO; + goto out_ior; + } + + add_mtd_partitions(board_mtd, partition_info, NUM_PARTITIONS); + goto out; + +out_ior: + iounmap((void *)baseaddr); +out_mtd: + kfree (board_mtd); +out: + return err; +} +module_init(board_init); + </programlisting> + </sect1> + <sect1> + <title>Exit function</title> + <para> + The exit function is only neccecary if the driver is + compiled as a module. It releases all resources which + are held by the chip driver and unregisters the partitions + in the MTD layer. + </para> + <programlisting> +#ifdef MODULE +static void __exit board_cleanup (void) +{ + /* Release resources, unregister device */ + nand_release (board_mtd); + + /* unmap physical adress */ + iounmap((void *)baseaddr); + + /* Free the MTD device structure */ + kfree (board_mtd); +} +module_exit(board_cleanup); +#endif + </programlisting> + </sect1> + </chapter> + + <chapter id="boarddriversadvanced"> + <title>Advanced board driver functions</title> + <para> + This chapter describes the advanced functionality of the NAND + driver. For a list of functions which can be overridden by the board + driver see the documentation of the nand_chip structure. + </para> + <sect1> + <title>Multiple chip control</title> + <para> + The nand driver can control chip arrays. Therefor the + board driver must provide an own select_chip function. This + function must (de)select the requested chip. + The function pointer in the nand_chip structure must + be set before calling nand_scan(). The maxchip parameter + of nand_scan() defines the maximum number of chips to + scan for. Make sure that the select_chip function can + handle the requested number of chips. + </para> + <para> + The nand driver concatenates the chips to one virtual + chip and provides this virtual chip to the MTD layer. + </para> + <para> + <emphasis>Note: The driver can only handle linear chip arrays + of equally sized chips. There is no support for + parallel arrays which extend the buswidth.</emphasis> + </para> + <para> + <emphasis>GPIO based example</emphasis> + </para> + <programlisting> +static void board_select_chip (struct mtd_info *mtd, int chip) +{ + /* Deselect all chips, set all nCE pins high */ + GPIO(BOARD_NAND_NCE) |= 0xff; + if (chip >= 0) + GPIO(BOARD_NAND_NCE) &= ~ (1 << chip); +} + </programlisting> + <para> + <emphasis>Address lines based example.</emphasis> + Its assumed that the nCE pins are connected to an + address decoder. + </para> + <programlisting> +static void board_select_chip (struct mtd_info *mtd, int chip) +{ + struct nand_chip *this = (struct nand_chip *) mtd->priv; + + /* Deselect all chips */ + this->IO_ADDR_R &= ~BOARD_NAND_ADDR_MASK; + this->IO_ADDR_W &= ~BOARD_NAND_ADDR_MASK; + switch (chip) { + case 0: + this->IO_ADDR_R |= BOARD_NAND_ADDR_CHIP0; + this->IO_ADDR_W |= BOARD_NAND_ADDR_CHIP0; + break; + .... + case n: + this->IO_ADDR_R |= BOARD_NAND_ADDR_CHIPn; + this->IO_ADDR_W |= BOARD_NAND_ADDR_CHIPn; + break; + } +} + </programlisting> + </sect1> + <sect1> + <title>Hardware ECC support</title> + <sect2> + <title>Functions and constants</title> + <para> + The nand driver supports three different types of + hardware ECC. + <itemizedlist> + <listitem><para>NAND_ECC_HW3_256</para><para> + Hardware ECC generator providing 3 bytes ECC per + 256 byte. + </para> </listitem> + <listitem><para>NAND_ECC_HW3_512</para><para> + Hardware ECC generator providing 3 bytes ECC per + 512 byte. + </para> </listitem> + <listitem><para>NAND_ECC_HW6_512</para><para> + Hardware ECC generator providing 6 bytes ECC per + 512 byte. + </para> </listitem> + <listitem><para>NAND_ECC_HW8_512</para><para> + Hardware ECC generator providing 6 bytes ECC per + 512 byte. + </para> </listitem> + </itemizedlist> + If your hardware generator has a different functionality + add it at the appropriate place in nand_base.c + </para> + <para> + The board driver must provide following functions: + <itemizedlist> + <listitem><para>enable_hwecc</para><para> + This function is called before reading / writing to + the chip. Reset or initialize the hardware generator + in this function. The function is called with an + argument which let you distinguish between read + and write operations. + </para> </listitem> + <listitem><para>calculate_ecc</para><para> + This function is called after read / write from / to + the chip. Transfer the ECC from the hardware to + the buffer. If the option NAND_HWECC_SYNDROME is set + then the function is only called on write. See below. + </para> </listitem> + <listitem><para>correct_data</para><para> + In case of an ECC error this function is called for + error detection and correction. Return 1 respectively 2 + in case the error can be corrected. If the error is + not correctable return -1. If your hardware generator + matches the default algorithm of the nand_ecc software + generator then use the correction function provided + by nand_ecc instead of implementing duplicated code. + </para> </listitem> + </itemizedlist> + </para> + </sect2> + <sect2> + <title>Hardware ECC with syndrome calculation</title> + <para> + Many hardware ECC implementations provide Reed-Solomon + codes and calculate an error syndrome on read. The syndrome + must be converted to a standard Reed-Solomon syndrome + before calling the error correction code in the generic + Reed-Solomon library. + </para> + <para> + The ECC bytes must be placed immidiately after the data + bytes in order to make the syndrome generator work. This + is contrary to the usual layout used by software ECC. The + seperation of data and out of band area is not longer + possible. The nand driver code handles this layout and + the remaining free bytes in the oob area are managed by + the autoplacement code. Provide a matching oob-layout + in this case. See rts_from4.c and diskonchip.c for + implementation reference. In those cases we must also + use bad block tables on FLASH, because the ECC layout is + interferring with the bad block marker positions. + See bad block table support for details. + </para> + </sect2> + </sect1> + <sect1> + <title>Bad block table support</title> + <para> + Most NAND chips mark the bad blocks at a defined + position in the spare area. Those blocks must + not be erased under any circumstances as the bad + block information would be lost. + It is possible to check the bad block mark each + time when the blocks are accessed by reading the + spare area of the first page in the block. This + is time consuming so a bad block table is used. + </para> + <para> + The nand driver supports various types of bad block + tables. + <itemizedlist> + <listitem><para>Per device</para><para> + The bad block table contains all bad block information + of the device which can consist of multiple chips. + </para> </listitem> + <listitem><para>Per chip</para><para> + A bad block table is used per chip and contains the + bad block information for this particular chip. + </para> </listitem> + <listitem><para>Fixed offset</para><para> + The bad block table is located at a fixed offset + in the chip (device). This applies to various + DiskOnChip devices. + </para> </listitem> + <listitem><para>Automatic placed</para><para> + The bad block table is automatically placed and + detected either at the end or at the beginning + of a chip (device) + </para> </listitem> + <listitem><para>Mirrored tables</para><para> + The bad block table is mirrored on the chip (device) to + allow updates of the bad block table without data loss. + </para> </listitem> + </itemizedlist> + </para> + <para> + nand_scan() calls the function nand_default_bbt(). + nand_default_bbt() selects appropriate default + bad block table desriptors depending on the chip information + which was retrieved by nand_scan(). + </para> + <para> + The standard policy is scanning the device for bad + blocks and build a ram based bad block table which + allows faster access than always checking the + bad block information on the flash chip itself. + </para> + <sect2> + <title>Flash based tables</title> + <para> + It may be desired or neccecary to keep a bad block table in FLASH. + For AG-AND chips this is mandatory, as they have no factory marked + bad blocks. They have factory marked good blocks. The marker pattern + is erased when the block is erased to be reused. So in case of + powerloss before writing the pattern back to the chip this block + would be lost and added to the bad blocks. Therefor we scan the + chip(s) when we detect them the first time for good blocks and + store this information in a bad block table before erasing any + of the blocks. + </para> + <para> + The blocks in which the tables are stored are procteted against + accidental access by marking them bad in the memory bad block + table. The bad block table managment functions are allowed + to circumvernt this protection. + </para> + <para> + The simplest way to activate the FLASH based bad block table support + is to set the option NAND_USE_FLASH_BBT in the option field of + the nand chip structure before calling nand_scan(). For AG-AND + chips is this done by default. + This activates the default FLASH based bad block table functionality + of the NAND driver. The default bad block table options are + <itemizedlist> + <listitem><para>Store bad block table per chip</para></listitem> + <listitem><para>Use 2 bits per block</para></listitem> + <listitem><para>Automatic placement at the end of the chip</para></listitem> + <listitem><para>Use mirrored tables with version numbers</para></listitem> + <listitem><para>Reserve 4 blocks at the end of the chip</para></listitem> + </itemizedlist> + </para> + </sect2> + <sect2> + <title>User defined tables</title> + <para> + User defined tables are created by filling out a + nand_bbt_descr structure and storing the pointer in the + nand_chip structure member bbt_td before calling nand_scan(). + If a mirror table is neccecary a second structure must be + created and a pointer to this structure must be stored + in bbt_md inside the nand_chip structure. If the bbt_md + member is set to NULL then only the main table is used + and no scan for the mirrored table is performed. + </para> + <para> + The most important field in the nand_bbt_descr structure + is the options field. The options define most of the + table properties. Use the predefined constants from + nand.h to define the options. + <itemizedlist> + <listitem><para>Number of bits per block</para> + <para>The supported number of bits is 1, 2, 4, 8.</para></listitem> + <listitem><para>Table per chip</para> + <para>Setting the constant NAND_BBT_PERCHIP selects that + a bad block table is managed for each chip in a chip array. + If this option is not set then a per device bad block table + is used.</para></listitem> + <listitem><para>Table location is absolute</para> + <para>Use the option constant NAND_BBT_ABSPAGE and + define the absolute page number where the bad block + table starts in the field pages. If you have selected bad block + tables per chip and you have a multi chip array then the start page + must be given for each chip in the chip array. Note: there is no scan + for a table ident pattern performed, so the fields + pattern, veroffs, offs, len can be left uninitialized</para></listitem> + <listitem><para>Table location is automatically detected</para> + <para>The table can either be located in the first or the last good + blocks of the chip (device). Set NAND_BBT_LASTBLOCK to place + the bad block table at the end of the chip (device). The + bad block tables are marked and identified by a pattern which + is stored in the spare area of the first page in the block which + holds the bad block table. Store a pointer to the pattern + in the pattern field. Further the length of the pattern has to be + stored in len and the offset in the spare area must be given + in the offs member of the nand_bbt_descr stucture. For mirrored + bad block tables different patterns are mandatory.</para></listitem> + <listitem><para>Table creation</para> + <para>Set the option NAND_BBT_CREATE to enable the table creation + if no table can be found during the scan. Usually this is done only + once if a new chip is found. </para></listitem> + <listitem><para>Table write support</para> + <para>Set the option NAND_BBT_WRITE to enable the table write support. + This allows the update of the bad block table(s) in case a block has + to be marked bad due to wear. The MTD interface function block_markbad + is calling the update function of the bad block table. If the write + support is enabled then the table is updated on FLASH.</para> + <para> + Note: Write support should only be enabled for mirrored tables with + version control. + </para></listitem> + <listitem><para>Table version control</para> + <para>Set the option NAND_BBT_VERSION to enable the table version control. + It's highly recommended to enable this for mirrored tables with write + support. It makes sure that the risk of loosing the bad block + table information is reduced to the loss of the information about the + one worn out block which should be marked bad. The version is stored in + 4 consecutive bytes in the spare area of the device. The position of + the version number is defined by the member veroffs in the bad block table + descriptor.</para></listitem> + <listitem><para>Save block contents on write</para> + <para> + In case that the block which holds the bad block table does contain + other useful information, set the option NAND_BBT_SAVECONTENT. When + the bad block table is written then the whole block is read the bad + block table is updated and the block is erased and everything is + written back. If this option is not set only the bad block table + is written and everything else in the block is ignored and erased. + </para></listitem> + <listitem><para>Number of reserved blocks</para> + <para> + For automatic placement some blocks must be reserved for + bad block table storage. The number of reserved blocks is defined + in the maxblocks member of the babd block table description structure. + Reserving 4 blocks for mirrored tables should be a reasonable number. + This also limits the number of blocks which are scanned for the bad + block table ident pattern. + </para></listitem> + </itemizedlist> + </para> + </sect2> + </sect1> + <sect1> + <title>Spare area (auto)placement</title> + <para> + The nand driver implements different possibilities for + placement of filesystem data in the spare area, + <itemizedlist> + <listitem><para>Placement defined by fs driver</para></listitem> + <listitem><para>Automatic placement</para></listitem> + </itemizedlist> + The default placement function is automatic placement. The + nand driver has built in default placement schemes for the + various chiptypes. If due to hardware ECC functionality the + default placement does not fit then the board driver can + provide a own placement scheme. + </para> + <para> + File system drivers can provide a own placement scheme which + is used instead of the default placement scheme. + </para> + <para> + Placement schemes are defined by a nand_oobinfo structure + <programlisting> +struct nand_oobinfo { + int useecc; + int eccbytes; + int eccpos[24]; + int oobfree[8][2]; +}; + </programlisting> + <itemizedlist> + <listitem><para>useecc</para><para> + The useecc member controls the ecc and placement function. The header + file include/mtd/mtd-abi.h contains constants to select ecc and + placement. MTD_NANDECC_OFF switches off the ecc complete. This is + not recommended and available for testing and diagnosis only. + MTD_NANDECC_PLACE selects caller defined placement, MTD_NANDECC_AUTOPLACE + selects automatic placement. + </para></listitem> + <listitem><para>eccbytes</para><para> + The eccbytes member defines the number of ecc bytes per page. + </para></listitem> + <listitem><para>eccpos</para><para> + The eccpos array holds the byte offsets in the spare area where + the ecc codes are placed. + </para></listitem> + <listitem><para>oobfree</para><para> + The oobfree array defines the areas in the spare area which can be + used for automatic placement. The information is given in the format + {offset, size}. offset defines the start of the usable area, size the + length in bytes. More than one area can be defined. The list is terminated + by an {0, 0} entry. + </para></listitem> + </itemizedlist> + </para> + <sect2> + <title>Placement defined by fs driver</title> + <para> + The calling function provides a pointer to a nand_oobinfo + structure which defines the ecc placement. For writes the + caller must provide a spare area buffer along with the + data buffer. The spare area buffer size is (number of pages) * + (size of spare area). For reads the buffer size is + (number of pages) * ((size of spare area) + (number of ecc + steps per page) * sizeof (int)). The driver stores the + result of the ecc check for each tuple in the spare buffer. + The storage sequence is + </para> + <para> + <spare data page 0><ecc result 0>...<ecc result n> + </para> + <para> + ... + </para> + <para> + <spare data page n><ecc result 0>...<ecc result n> + </para> + <para> + This is a legacy mode used by YAFFS1. + </para> + <para> + If the spare area buffer is NULL then only the ECC placement is + done according to the given scheme in the nand_oobinfo structure. + </para> + </sect2> + <sect2> + <title>Automatic placement</title> + <para> + Automatic placement uses the built in defaults to place the + ecc bytes in the spare area. If filesystem data have to be stored / + read into the spare area then the calling function must provide a + buffer. The buffer size per page is determined by the oobfree array in + the nand_oobinfo structure. + </para> + <para> + If the spare area buffer is NULL then only the ECC placement is + done according to the default builtin scheme. + </para> + </sect2> + <sect2> + <title>User space placement selection</title> + <para> + All non ecc functions like mtd->read and mtd->write use an internal + structure, which can be set by an ioctl. This structure is preset + to the autoplacement default. + <programlisting> + ioctl (fd, MEMSETOOBSEL, oobsel); + </programlisting> + oobsel is a pointer to a user supplied structure of type + nand_oobconfig. The contents of this structure must match the + criteria of the filesystem, which will be used. See an example in utils/nandwrite.c. + </para> + </sect2> + </sect1> + <sect1> + <title>Spare area autoplacement default schemes</title> + <sect2> + <title>256 byte pagesize</title> +<informaltable><tgroup cols="3"><tbody> +<row> +<entry>Offset</entry> +<entry>Content</entry> +<entry>Comment</entry> +</row> +<row> +<entry>0x00</entry> +<entry>ECC byte 0</entry> +<entry>Error correction code byte 0</entry> +</row> +<row> +<entry>0x01</entry> +<entry>ECC byte 1</entry> +<entry>Error correction code byte 1</entry> +</row> +<row> +<entry>0x02</entry> +<entry>ECC byte 2</entry> +<entry>Error correction code byte 2</entry> +</row> +<row> +<entry>0x03</entry> +<entry>Autoplace 0</entry> +<entry></entry> +</row> +<row> +<entry>0x04</entry> +<entry>Autoplace 1</entry> +<entry></entry> +</row> +<row> +<entry>0x05</entry> +<entry>Bad block marker</entry> +<entry>If any bit in this byte is zero, then this block is bad. +This applies only to the first page in a block. In the remaining +pages this byte is reserved</entry> +</row> +<row> +<entry>0x06</entry> +<entry>Autoplace 2</entry> +<entry></entry> +</row> +<row> +<entry>0x07</entry> +<entry>Autoplace 3</entry> +<entry></entry> +</row> +</tbody></tgroup></informaltable> + </sect2> + <sect2> + <title>512 byte pagesize</title> +<informaltable><tgroup cols="3"><tbody> +<row> +<entry>Offset</entry> +<entry>Content</entry> +<entry>Comment</entry> +</row> +<row> +<entry>0x00</entry> +<entry>ECC byte 0</entry> +<entry>Error correction code byte 0 of the lower 256 Byte data in +this page</entry> +</row> +<row> +<entry>0x01</entry> +<entry>ECC byte 1</entry> +<entry>Error correction code byte 1 of the lower 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x02</entry> +<entry>ECC byte 2</entry> +<entry>Error correction code byte 2 of the lower 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x03</entry> +<entry>ECC byte 3</entry> +<entry>Error correction code byte 0 of the upper 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x04</entry> +<entry>reserved</entry> +<entry>reserved</entry> +</row> +<row> +<entry>0x05</entry> +<entry>Bad block marker</entry> +<entry>If any bit in this byte is zero, then this block is bad. +This applies only to the first page in a block. In the remaining +pages this byte is reserved</entry> +</row> +<row> +<entry>0x06</entry> +<entry>ECC byte 4</entry> +<entry>Error correction code byte 1 of the upper 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x07</entry> +<entry>ECC byte 5</entry> +<entry>Error correction code byte 2 of the upper 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x08 - 0x0F</entry> +<entry>Autoplace 0 - 7</entry> +<entry></entry> +</row> +</tbody></tgroup></informaltable> + </sect2> + <sect2> + <title>2048 byte pagesize</title> +<informaltable><tgroup cols="3"><tbody> +<row> +<entry>Offset</entry> +<entry>Content</entry> +<entry>Comment</entry> +</row> +<row> +<entry>0x00</entry> +<entry>Bad block marker</entry> +<entry>If any bit in this byte is zero, then this block is bad. +This applies only to the first page in a block. In the remaining +pages this byte is reserved</entry> +</row> +<row> +<entry>0x01</entry> +<entry>Reserved</entry> +<entry>Reserved</entry> +</row> +<row> +<entry>0x02-0x27</entry> +<entry>Autoplace 0 - 37</entry> +<entry></entry> +</row> +<row> +<entry>0x28</entry> +<entry>ECC byte 0</entry> +<entry>Error correction code byte 0 of the first 256 Byte data in +this page</entry> +</row> +<row> +<entry>0x29</entry> +<entry>ECC byte 1</entry> +<entry>Error correction code byte 1 of the first 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x2A</entry> +<entry>ECC byte 2</entry> +<entry>Error correction code byte 2 of the first 256 Bytes data in +this page</entry> +</row> +<row> +<entry>0x2B</entry> +<entry>ECC byte 3</entry> +<entry>Error correction code byte 0 of the second 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x2C</entry> +<entry>ECC byte 4</entry> +<entry>Error correction code byte 1 of the second 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x2D</entry> +<entry>ECC byte 5</entry> +<entry>Error correction code byte 2 of the second 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x2E</entry> +<entry>ECC byte 6</entry> +<entry>Error correction code byte 0 of the third 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x2F</entry> +<entry>ECC byte 7</entry> +<entry>Error correction code byte 1 of the third 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x30</entry> +<entry>ECC byte 8</entry> +<entry>Error correction code byte 2 of the third 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x31</entry> +<entry>ECC byte 9</entry> +<entry>Error correction code byte 0 of the fourth 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x32</entry> +<entry>ECC byte 10</entry> +<entry>Error correction code byte 1 of the fourth 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x33</entry> +<entry>ECC byte 11</entry> +<entry>Error correction code byte 2 of the fourth 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x34</entry> +<entry>ECC byte 12</entry> +<entry>Error correction code byte 0 of the fifth 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x35</entry> +<entry>ECC byte 13</entry> +<entry>Error correction code byte 1 of the fifth 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x36</entry> +<entry>ECC byte 14</entry> +<entry>Error correction code byte 2 of the fifth 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x37</entry> +<entry>ECC byte 15</entry> +<entry>Error correction code byte 0 of the sixt 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x38</entry> +<entry>ECC byte 16</entry> +<entry>Error correction code byte 1 of the sixt 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x39</entry> +<entry>ECC byte 17</entry> +<entry>Error correction code byte 2 of the sixt 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x3A</entry> +<entry>ECC byte 18</entry> +<entry>Error correction code byte 0 of the seventh 256 Bytes of +data in this page</entry> +</row> +<row> +<entry>0x3B</entry> +<entry>ECC byte 19</entry> +<entry>Error correction code byte 1 of the seventh 256 Bytes of +data in this page</entry> +</row> +<row> +<entry>0x3C</entry> +<entry>ECC byte 20</entry> +<entry>Error correction code byte 2 of the seventh 256 Bytes of +data in this page</entry> +</row> +<row> +<entry>0x3D</entry> +<entry>ECC byte 21</entry> +<entry>Error correction code byte 0 of the eigth 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x3E</entry> +<entry>ECC byte 22</entry> +<entry>Error correction code byte 1 of the eigth 256 Bytes of data +in this page</entry> +</row> +<row> +<entry>0x3F</entry> +<entry>ECC byte 23</entry> +<entry>Error correction code byte 2 of the eigth 256 Bytes of data +in this page</entry> +</row> +</tbody></tgroup></informaltable> + </sect2> + </sect1> + </chapter> + + <chapter id="filesystems"> + <title>Filesystem support</title> + <para> + The NAND driver provides all neccecary functions for a + filesystem via the MTD interface. + </para> + <para> + Filesystems must be aware of the NAND pecularities and + restrictions. One major restrictions of NAND Flash is, that you cannot + write as often as you want to a page. The consecutive writes to a page, + before erasing it again, are restricted to 1-3 writes, depending on the + manufacturers specifications. This applies similar to the spare area. + </para> + <para> + Therefor NAND aware filesystems must either write in page size chunks + or hold a writebuffer to collect smaller writes until they sum up to + pagesize. Available NAND aware filesystems: JFFS2, YAFFS. + </para> + <para> + The spare area usage to store filesystem data is controlled by + the spare area placement functionality which is described in one + of the earlier chapters. + </para> + </chapter> + <chapter id="tools"> + <title>Tools</title> + <para> + The MTD project provides a couple of helpful tools to handle NAND Flash. + <itemizedlist> + <listitem><para>flasherase, flasheraseall: Erase and format FLASH partitions</para></listitem> + <listitem><para>nandwrite: write filesystem images to NAND FLASH</para></listitem> + <listitem><para>nanddump: dump the contents of a NAND FLASH partitions</para></listitem> + </itemizedlist> + </para> + <para> + These tools are aware of the NAND restrictions. Please use those tools + instead of complaining about errors which are caused by non NAND aware + access methods. + </para> + </chapter> + + <chapter id="defines"> + <title>Constants</title> + <para> + This chapter describes the constants which might be relevant for a driver developer. + </para> + <sect1> + <title>Chip option constants</title> + <sect2> + <title>Constants for chip id table</title> + <para> + These constants are defined in nand.h. They are ored together to describe + the chip functionality. + <programlisting> +/* Chip can not auto increment pages */ +#define NAND_NO_AUTOINCR 0x00000001 +/* Buswitdh is 16 bit */ +#define NAND_BUSWIDTH_16 0x00000002 +/* Device supports partial programming without padding */ +#define NAND_NO_PADDING 0x00000004 +/* Chip has cache program function */ +#define NAND_CACHEPRG 0x00000008 +/* Chip has copy back function */ +#define NAND_COPYBACK 0x00000010 +/* AND Chip which has 4 banks and a confusing page / block + * assignment. See Renesas datasheet for further information */ +#define NAND_IS_AND 0x00000020 +/* Chip has a array of 4 pages which can be read without + * additional ready /busy waits */ +#define NAND_4PAGE_ARRAY 0x00000040 + </programlisting> + </para> + </sect2> + <sect2> + <title>Constants for runtime options</title> + <para> + These constants are defined in nand.h. They are ored together to describe + the functionality. + <programlisting> +/* Use a flash based bad block table. This option is parsed by the + * default bad block table function (nand_default_bbt). */ +#define NAND_USE_FLASH_BBT 0x00010000 +/* The hw ecc generator provides a syndrome instead a ecc value on read + * This can only work if we have the ecc bytes directly behind the + * data bytes. Applies for DOC and AG-AND Renesas HW Reed Solomon generators */ +#define NAND_HWECC_SYNDROME 0x00020000 + </programlisting> + </para> + </sect2> + </sect1> + + <sect1> + <title>ECC selection constants</title> + <para> + Use these constants to select the ECC algorithm. + <programlisting> +/* No ECC. Usage is not recommended ! */ +#define NAND_ECC_NONE 0 +/* Software ECC 3 byte ECC per 256 Byte data */ +#define NAND_ECC_SOFT 1 +/* Hardware ECC 3 byte ECC per 256 Byte data */ +#define NAND_ECC_HW3_256 2 +/* Hardware ECC 3 byte ECC per 512 Byte data */ +#define NAND_ECC_HW3_512 3 +/* Hardware ECC 6 byte ECC per 512 Byte data */ +#define NAND_ECC_HW6_512 4 +/* Hardware ECC 6 byte ECC per 512 Byte data */ +#define NAND_ECC_HW8_512 6 + </programlisting> + </para> + </sect1> + + <sect1> + <title>Hardware control related constants</title> + <para> + These constants describe the requested hardware access function when + the boardspecific hardware control function is called + <programlisting> +/* Select the chip by setting nCE to low */ +#define NAND_CTL_SETNCE 1 +/* Deselect the chip by setting nCE to high */ +#define NAND_CTL_CLRNCE 2 +/* Select the command latch by setting CLE to high */ +#define NAND_CTL_SETCLE 3 +/* Deselect the command latch by setting CLE to low */ +#define NAND_CTL_CLRCLE 4 +/* Select the address latch by setting ALE to high */ +#define NAND_CTL_SETALE 5 +/* Deselect the address latch by setting ALE to low */ +#define NAND_CTL_CLRALE 6 +/* Set write protection by setting WP to high. Not used! */ +#define NAND_CTL_SETWP 7 +/* Clear write protection by setting WP to low. Not used! */ +#define NAND_CTL_CLRWP 8 + </programlisting> + </para> + </sect1> + + <sect1> + <title>Bad block table related constants</title> + <para> + These constants describe the options used for bad block + table descriptors. + <programlisting> +/* Options for the bad block table descriptors */ + +/* The number of bits used per block in the bbt on the device */ +#define NAND_BBT_NRBITS_MSK 0x0000000F +#define NAND_BBT_1BIT 0x00000001 +#define NAND_BBT_2BIT 0x00000002 +#define NAND_BBT_4BIT 0x00000004 +#define NAND_BBT_8BIT 0x00000008 +/* The bad block table is in the last good block of the device */ +#define NAND_BBT_LASTBLOCK 0x00000010 +/* The bbt is at the given page, else we must scan for the bbt */ +#define NAND_BBT_ABSPAGE 0x00000020 +/* The bbt is at the given page, else we must scan for the bbt */ +#define NAND_BBT_SEARCH 0x00000040 +/* bbt is stored per chip on multichip devices */ +#define NAND_BBT_PERCHIP 0x00000080 +/* bbt has a version counter at offset veroffs */ +#define NAND_BBT_VERSION 0x00000100 +/* Create a bbt if none axists */ +#define NAND_BBT_CREATE 0x00000200 +/* Search good / bad pattern through all pages of a block */ +#define NAND_BBT_SCANALLPAGES 0x00000400 +/* Scan block empty during good / bad block scan */ +#define NAND_BBT_SCANEMPTY 0x00000800 +/* Write bbt if neccecary */ +#define NAND_BBT_WRITE 0x00001000 +/* Read and write back block contents when writing bbt */ +#define NAND_BBT_SAVECONTENT 0x00002000 + </programlisting> + </para> + </sect1> + + </chapter> + + <chapter id="structs"> + <title>Structures</title> + <para> + This chapter contains the autogenerated documentation of the structures which are + used in the NAND driver and might be relevant for a driver developer. Each + struct member has a short description which is marked with an [XXX] identifier. + See the chapter "Documentation hints" for an explanation. + </para> +!Iinclude/linux/mtd/nand.h + </chapter> + + <chapter id="pubfunctions"> + <title>Public Functions Provided</title> + <para> + This chapter contains the autogenerated documentation of the NAND kernel API functions + which are exported. Each function has a short description which is marked with an [XXX] identifier. + See the chapter "Documentation hints" for an explanation. + </para> +!Edrivers/mtd/nand/nand_base.c +!Edrivers/mtd/nand/nand_bbt.c +!Edrivers/mtd/nand/nand_ecc.c + </chapter> + + <chapter id="intfunctions"> + <title>Internal Functions Provided</title> + <para> + This chapter contains the autogenerated documentation of the NAND driver internal functions. + Each function has a short description which is marked with an [XXX] identifier. + See the chapter "Documentation hints" for an explanation. + The functions marked with [DEFAULT] might be relevant for a board driver developer. + </para> +!Idrivers/mtd/nand/nand_base.c +!Idrivers/mtd/nand/nand_bbt.c +!Idrivers/mtd/nand/nand_ecc.c + </chapter> + + <chapter id="credits"> + <title>Credits</title> + <para> + The following people have contributed to the NAND driver: + <orderedlist> + <listitem><para>Steven J. Hill<email>sjhill@realitydiluted.com</email></para></listitem> + <listitem><para>David Woodhouse<email>dwmw2@infradead.org</email></para></listitem> + <listitem><para>Thomas Gleixner<email>tglx@linutronix.de</email></para></listitem> + </orderedlist> + A lot of users have provided bugfixes, improvements and helping hands for testing. + Thanks a lot. + </para> + <para> + The following people have contributed to this document: + <orderedlist> + <listitem><para>Thomas Gleixner<email>tglx@linutronix.de</email></para></listitem> + </orderedlist> + </para> + </chapter> +</book> diff --git a/Documentation/DocBook/procfs-guide.tmpl b/Documentation/DocBook/procfs-guide.tmpl new file mode 100644 index 000000000000..45cad23efefa --- /dev/null +++ b/Documentation/DocBook/procfs-guide.tmpl @@ -0,0 +1,591 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [ +<!ENTITY procfsexample SYSTEM "procfs_example.xml"> +]> + +<book id="LKProcfsGuide"> + <bookinfo> + <title>Linux Kernel Procfs Guide</title> + + <authorgroup> + <author> + <firstname>Erik</firstname> + <othername>(J.A.K.)</othername> + <surname>Mouw</surname> + <affiliation> + <orgname>Delft University of Technology</orgname> + <orgdiv>Faculty of Information Technology and Systems</orgdiv> + <address> + <email>J.A.K.Mouw@its.tudelft.nl</email> + <pob>PO BOX 5031</pob> + <postcode>2600 GA</postcode> + <city>Delft</city> + <country>The Netherlands</country> + </address> + </affiliation> + </author> + </authorgroup> + + <revhistory> + <revision> + <revnumber>1.0 </revnumber> + <date>May 30, 2001</date> + <revremark>Initial revision posted to linux-kernel</revremark> + </revision> + <revision> + <revnumber>1.1 </revnumber> + <date>June 3, 2001</date> + <revremark>Revised after comments from linux-kernel</revremark> + </revision> + </revhistory> + + <copyright> + <year>2001</year> + <holder>Erik Mouw</holder> + </copyright> + + + <legalnotice> + <para> + This documentation is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This documentation is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR + PURPOSE. See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + + + + + <toc> + </toc> + + + + + <preface> + <title>Preface</title> + + <para> + This guide describes the use of the procfs file system from + within the Linux kernel. The idea to write this guide came up on + the #kernelnewbies IRC channel (see <ulink + url="http://www.kernelnewbies.org/">http://www.kernelnewbies.org/</ulink>), + when Jeff Garzik explained the use of procfs and forwarded me a + message Alexander Viro wrote to the linux-kernel mailing list. I + agreed to write it up nicely, so here it is. + </para> + + <para> + I'd like to thank Jeff Garzik + <email>jgarzik@pobox.com</email> and Alexander Viro + <email>viro@parcelfarce.linux.theplanet.co.uk</email> for their input, + Tim Waugh <email>twaugh@redhat.com</email> for his <ulink + url="http://people.redhat.com/twaugh/docbook/selfdocbook/">Selfdocbook</ulink>, + and Marc Joosen <email>marcj@historia.et.tudelft.nl</email> for + proofreading. + </para> + + <para> + This documentation was written while working on the LART + computing board (<ulink + url="http://www.lart.tudelft.nl/">http://www.lart.tudelft.nl/</ulink>), + which is sponsored by the Mobile Multi-media Communications + (<ulink + url="http://www.mmc.tudelft.nl/">http://www.mmc.tudelft.nl/</ulink>) + and Ubiquitous Communications (<ulink + url="http://www.ubicom.tudelft.nl/">http://www.ubicom.tudelft.nl/</ulink>) + projects. + </para> + + <para> + Erik + </para> + </preface> + + + + + <chapter id="intro"> + <title>Introduction</title> + + <para> + The <filename class="directory">/proc</filename> file system + (procfs) is a special file system in the linux kernel. It's a + virtual file system: it is not associated with a block device + but exists only in memory. The files in the procfs are there to + allow userland programs access to certain information from the + kernel (like process information in <filename + class="directory">/proc/[0-9]+/</filename>), but also for debug + purposes (like <filename>/proc/ksyms</filename>). + </para> + + <para> + This guide describes the use of the procfs file system from + within the Linux kernel. It starts by introducing all relevant + functions to manage the files within the file system. After that + it shows how to communicate with userland, and some tips and + tricks will be pointed out. Finally a complete example will be + shown. + </para> + + <para> + Note that the files in <filename + class="directory">/proc/sys</filename> are sysctl files: they + don't belong to procfs and are governed by a completely + different API described in the Kernel API book. + </para> + </chapter> + + + + + <chapter id="managing"> + <title>Managing procfs entries</title> + + <para> + This chapter describes the functions that various kernel + components use to populate the procfs with files, symlinks, + device nodes, and directories. + </para> + + <para> + A minor note before we start: if you want to use any of the + procfs functions, be sure to include the correct header file! + This should be one of the first lines in your code: + </para> + + <programlisting> +#include <linux/proc_fs.h> + </programlisting> + + + + + <sect1 id="regularfile"> + <title>Creating a regular file</title> + + <funcsynopsis> + <funcprototype> + <funcdef>struct proc_dir_entry* <function>create_proc_entry</function></funcdef> + <paramdef>const char* <parameter>name</parameter></paramdef> + <paramdef>mode_t <parameter>mode</parameter></paramdef> + <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef> + </funcprototype> + </funcsynopsis> + + <para> + This function creates a regular file with the name + <parameter>name</parameter>, file mode + <parameter>mode</parameter> in the directory + <parameter>parent</parameter>. To create a file in the root of + the procfs, use <constant>NULL</constant> as + <parameter>parent</parameter> parameter. When successful, the + function will return a pointer to the freshly created + <structname>struct proc_dir_entry</structname>; otherwise it + will return <constant>NULL</constant>. <xref + linkend="userland"/> describes how to do something useful with + regular files. + </para> + + <para> + Note that it is specifically supported that you can pass a + path that spans multiple directories. For example + <function>create_proc_entry</function>(<parameter>"drivers/via0/info"</parameter>) + will create the <filename class="directory">via0</filename> + directory if necessary, with standard + <constant>0755</constant> permissions. + </para> + + <para> + If you only want to be able to read the file, the function + <function>create_proc_read_entry</function> described in <xref + linkend="convenience"/> may be used to create and initialise + the procfs entry in one single call. + </para> + </sect1> + + + + + <sect1> + <title>Creating a symlink</title> + + <funcsynopsis> + <funcprototype> + <funcdef>struct proc_dir_entry* + <function>proc_symlink</function></funcdef> <paramdef>const + char* <parameter>name</parameter></paramdef> + <paramdef>struct proc_dir_entry* + <parameter>parent</parameter></paramdef> <paramdef>const + char* <parameter>dest</parameter></paramdef> + </funcprototype> + </funcsynopsis> + + <para> + This creates a symlink in the procfs directory + <parameter>parent</parameter> that points from + <parameter>name</parameter> to + <parameter>dest</parameter>. This translates in userland to + <literal>ln -s</literal> <parameter>dest</parameter> + <parameter>name</parameter>. + </para> + </sect1> + + <sect1> + <title>Creating a directory</title> + + <funcsynopsis> + <funcprototype> + <funcdef>struct proc_dir_entry* <function>proc_mkdir</function></funcdef> + <paramdef>const char* <parameter>name</parameter></paramdef> + <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef> + </funcprototype> + </funcsynopsis> + + <para> + Create a directory <parameter>name</parameter> in the procfs + directory <parameter>parent</parameter>. + </para> + </sect1> + + + + + <sect1> + <title>Removing an entry</title> + + <funcsynopsis> + <funcprototype> + <funcdef>void <function>remove_proc_entry</function></funcdef> + <paramdef>const char* <parameter>name</parameter></paramdef> + <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef> + </funcprototype> + </funcsynopsis> + + <para> + Removes the entry <parameter>name</parameter> in the directory + <parameter>parent</parameter> from the procfs. Entries are + removed by their <emphasis>name</emphasis>, not by the + <structname>struct proc_dir_entry</structname> returned by the + various create functions. Note that this function doesn't + recursively remove entries. + </para> + + <para> + Be sure to free the <structfield>data</structfield> entry from + the <structname>struct proc_dir_entry</structname> before + <function>remove_proc_entry</function> is called (that is: if + there was some <structfield>data</structfield> allocated, of + course). See <xref linkend="usingdata"/> for more information + on using the <structfield>data</structfield> entry. + </para> + </sect1> + </chapter> + + + + + <chapter id="userland"> + <title>Communicating with userland</title> + + <para> + Instead of reading (or writing) information directly from + kernel memory, procfs works with <emphasis>call back + functions</emphasis> for files: functions that are called when + a specific file is being read or written. Such functions have + to be initialised after the procfs file is created by setting + the <structfield>read_proc</structfield> and/or + <structfield>write_proc</structfield> fields in the + <structname>struct proc_dir_entry*</structname> that the + function <function>create_proc_entry</function> returned: + </para> + + <programlisting> +struct proc_dir_entry* entry; + +entry->read_proc = read_proc_foo; +entry->write_proc = write_proc_foo; + </programlisting> + + <para> + If you only want to use a the + <structfield>read_proc</structfield>, the function + <function>create_proc_read_entry</function> described in <xref + linkend="convenience"/> may be used to create and initialise the + procfs entry in one single call. + </para> + + + + <sect1> + <title>Reading data</title> + + <para> + The read function is a call back function that allows userland + processes to read data from the kernel. The read function + should have the following format: + </para> + + <funcsynopsis> + <funcprototype> + <funcdef>int <function>read_func</function></funcdef> + <paramdef>char* <parameter>page</parameter></paramdef> + <paramdef>char** <parameter>start</parameter></paramdef> + <paramdef>off_t <parameter>off</parameter></paramdef> + <paramdef>int <parameter>count</parameter></paramdef> + <paramdef>int* <parameter>eof</parameter></paramdef> + <paramdef>void* <parameter>data</parameter></paramdef> + </funcprototype> + </funcsynopsis> + + <para> + The read function should write its information into the + <parameter>page</parameter>. For proper use, the function + should start writing at an offset of + <parameter>off</parameter> in <parameter>page</parameter> and + write at most <parameter>count</parameter> bytes, but because + most read functions are quite simple and only return a small + amount of information, these two parameters are usually + ignored (it breaks pagers like <literal>more</literal> and + <literal>less</literal>, but <literal>cat</literal> still + works). + </para> + + <para> + If the <parameter>off</parameter> and + <parameter>count</parameter> parameters are properly used, + <parameter>eof</parameter> should be used to signal that the + end of the file has been reached by writing + <literal>1</literal> to the memory location + <parameter>eof</parameter> points to. + </para> + + <para> + The parameter <parameter>start</parameter> doesn't seem to be + used anywhere in the kernel. The <parameter>data</parameter> + parameter can be used to create a single call back function for + several files, see <xref linkend="usingdata"/>. + </para> + + <para> + The <function>read_func</function> function must return the + number of bytes written into the <parameter>page</parameter>. + </para> + + <para> + <xref linkend="example"/> shows how to use a read call back + function. + </para> + </sect1> + + + + + <sect1> + <title>Writing data</title> + + <para> + The write call back function allows a userland process to write + data to the kernel, so it has some kind of control over the + kernel. The write function should have the following format: + </para> + + <funcsynopsis> + <funcprototype> + <funcdef>int <function>write_func</function></funcdef> + <paramdef>struct file* <parameter>file</parameter></paramdef> + <paramdef>const char* <parameter>buffer</parameter></paramdef> + <paramdef>unsigned long <parameter>count</parameter></paramdef> + <paramdef>void* <parameter>data</parameter></paramdef> + </funcprototype> + </funcsynopsis> + + <para> + The write function should read <parameter>count</parameter> + bytes at maximum from the <parameter>buffer</parameter>. Note + that the <parameter>buffer</parameter> doesn't live in the + kernel's memory space, so it should first be copied to kernel + space with <function>copy_from_user</function>. The + <parameter>file</parameter> parameter is usually + ignored. <xref linkend="usingdata"/> shows how to use the + <parameter>data</parameter> parameter. + </para> + + <para> + Again, <xref linkend="example"/> shows how to use this call back + function. + </para> + </sect1> + + + + + <sect1 id="usingdata"> + <title>A single call back for many files</title> + + <para> + When a large number of almost identical files is used, it's + quite inconvenient to use a separate call back function for + each file. A better approach is to have a single call back + function that distinguishes between the files by using the + <structfield>data</structfield> field in <structname>struct + proc_dir_entry</structname>. First of all, the + <structfield>data</structfield> field has to be initialised: + </para> + + <programlisting> +struct proc_dir_entry* entry; +struct my_file_data *file_data; + +file_data = kmalloc(sizeof(struct my_file_data), GFP_KERNEL); +entry->data = file_data; + </programlisting> + + <para> + The <structfield>data</structfield> field is a <type>void + *</type>, so it can be initialised with anything. + </para> + + <para> + Now that the <structfield>data</structfield> field is set, the + <function>read_proc</function> and + <function>write_proc</function> can use it to distinguish + between files because they get it passed into their + <parameter>data</parameter> parameter: + </para> + + <programlisting> +int foo_read_func(char *page, char **start, off_t off, + int count, int *eof, void *data) +{ + int len; + + if(data == file_data) { + /* special case for this file */ + } else { + /* normal processing */ + } + + return len; +} + </programlisting> + + <para> + Be sure to free the <structfield>data</structfield> data field + when removing the procfs entry. + </para> + </sect1> + </chapter> + + + + + <chapter id="tips"> + <title>Tips and tricks</title> + + + + + <sect1 id="convenience"> + <title>Convenience functions</title> + + <funcsynopsis> + <funcprototype> + <funcdef>struct proc_dir_entry* <function>create_proc_read_entry</function></funcdef> + <paramdef>const char* <parameter>name</parameter></paramdef> + <paramdef>mode_t <parameter>mode</parameter></paramdef> + <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef> + <paramdef>read_proc_t* <parameter>read_proc</parameter></paramdef> + <paramdef>void* <parameter>data</parameter></paramdef> + </funcprototype> + </funcsynopsis> + + <para> + This function creates a regular file in exactly the same way + as <function>create_proc_entry</function> from <xref + linkend="regularfile"/> does, but also allows to set the read + function <parameter>read_proc</parameter> in one call. This + function can set the <parameter>data</parameter> as well, like + explained in <xref linkend="usingdata"/>. + </para> + </sect1> + + + + <sect1> + <title>Modules</title> + + <para> + If procfs is being used from within a module, be sure to set + the <structfield>owner</structfield> field in the + <structname>struct proc_dir_entry</structname> to + <constant>THIS_MODULE</constant>. + </para> + + <programlisting> +struct proc_dir_entry* entry; + +entry->owner = THIS_MODULE; + </programlisting> + </sect1> + + + + + <sect1> + <title>Mode and ownership</title> + + <para> + Sometimes it is useful to change the mode and/or ownership of + a procfs entry. Here is an example that shows how to achieve + that: + </para> + + <programlisting> +struct proc_dir_entry* entry; + +entry->mode = S_IWUSR |S_IRUSR | S_IRGRP | S_IROTH; +entry->uid = 0; +entry->gid = 100; + </programlisting> + + </sect1> + </chapter> + + + + + <chapter id="example"> + <title>Example</title> + + <!-- be careful with the example code: it shouldn't be wider than + approx. 60 columns, or otherwise it won't fit properly on a page + --> + +&procfsexample; + + </chapter> +</book> diff --git a/Documentation/DocBook/procfs_example.c b/Documentation/DocBook/procfs_example.c new file mode 100644 index 000000000000..7064084c1c5e --- /dev/null +++ b/Documentation/DocBook/procfs_example.c @@ -0,0 +1,224 @@ +/* + * procfs_example.c: an example proc interface + * + * Copyright (C) 2001, Erik Mouw (J.A.K.Mouw@its.tudelft.nl) + * + * This file accompanies the procfs-guide in the Linux kernel + * source. Its main use is to demonstrate the concepts and + * functions described in the guide. + * + * This software has been developed while working on the LART + * computing board (http://www.lart.tudelft.nl/), which is + * sponsored by the Mobile Multi-media Communications + * (http://www.mmc.tudelft.nl/) and Ubiquitous Communications + * (http://www.ubicom.tudelft.nl/) projects. + * + * The author can be reached at: + * + * Erik Mouw + * Information and Communication Theory Group + * Faculty of Information Technology and Systems + * Delft University of Technology + * P.O. Box 5031 + * 2600 GA Delft + * The Netherlands + * + * + * This program is free software; you can redistribute + * it and/or modify it under the terms of the GNU General + * Public License as published by the Free Software + * Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * This program is distributed in the hope that it will be + * useful, but WITHOUT ANY WARRANTY; without even the implied + * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR + * PURPOSE. See the GNU General Public License for more + * details. + * + * You should have received a copy of the GNU General Public + * License along with this program; if not, write to the + * Free Software Foundation, Inc., 59 Temple Place, + * Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <linux/module.h> +#include <linux/kernel.h> +#include <linux/init.h> +#include <linux/proc_fs.h> +#include <linux/jiffies.h> +#include <asm/uaccess.h> + + +#define MODULE_VERS "1.0" +#define MODULE_NAME "procfs_example" + +#define FOOBAR_LEN 8 + +struct fb_data_t { + char name[FOOBAR_LEN + 1]; + char value[FOOBAR_LEN + 1]; +}; + + +static struct proc_dir_entry *example_dir, *foo_file, + *bar_file, *jiffies_file, *symlink; + + +struct fb_data_t foo_data, bar_data; + + +static int proc_read_jiffies(char *page, char **start, + off_t off, int count, + int *eof, void *data) +{ + int len; + + len = sprintf(page, "jiffies = %ld\n", + jiffies); + + return len; +} + + +static int proc_read_foobar(char *page, char **start, + off_t off, int count, + int *eof, void *data) +{ + int len; + struct fb_data_t *fb_data = (struct fb_data_t *)data; + + /* DON'T DO THAT - buffer overruns are bad */ + len = sprintf(page, "%s = '%s'\n", + fb_data->name, fb_data->value); + + return len; +} + + +static int proc_write_foobar(struct file *file, + const char *buffer, + unsigned long count, + void *data) +{ + int len; + struct fb_data_t *fb_data = (struct fb_data_t *)data; + + if(count > FOOBAR_LEN) + len = FOOBAR_LEN; + else + len = count; + + if(copy_from_user(fb_data->value, buffer, len)) + return -EFAULT; + + fb_data->value[len] = '\0'; + + return len; +} + + +static int __init init_procfs_example(void) +{ + int rv = 0; + + /* create directory */ + example_dir = proc_mkdir(MODULE_NAME, NULL); + if(example_dir == NULL) { + rv = -ENOMEM; + goto out; + } + + example_dir->owner = THIS_MODULE; + + /* create jiffies using convenience function */ + jiffies_file = create_proc_read_entry("jiffies", + 0444, example_dir, + proc_read_jiffies, + NULL); + if(jiffies_file == NULL) { + rv = -ENOMEM; + goto no_jiffies; + } + + jiffies_file->owner = THIS_MODULE; + + /* create foo and bar files using same callback + * functions + */ + foo_file = create_proc_entry("foo", 0644, example_dir); + if(foo_file == NULL) { + rv = -ENOMEM; + goto no_foo; + } + + strcpy(foo_data.name, "foo"); + strcpy(foo_data.value, "foo"); + foo_file->data = &foo_data; + foo_file->read_proc = proc_read_foobar; + foo_file->write_proc = proc_write_foobar; + foo_file->owner = THIS_MODULE; + + bar_file = create_proc_entry("bar", 0644, example_dir); + if(bar_file == NULL) { + rv = -ENOMEM; + goto no_bar; + } + + strcpy(bar_data.name, "bar"); + strcpy(bar_data.value, "bar"); + bar_file->data = &bar_data; + bar_file->read_proc = proc_read_foobar; + bar_file->write_proc = proc_write_foobar; + bar_file->owner = THIS_MODULE; + + /* create symlink */ + symlink = proc_symlink("jiffies_too", example_dir, + "jiffies"); + if(symlink == NULL) { + rv = -ENOMEM; + goto no_symlink; + } + + symlink->owner = THIS_MODULE; + + /* everything OK */ + printk(KERN_INFO "%s %s initialised\n", + MODULE_NAME, MODULE_VERS); + return 0; + +no_symlink: + remove_proc_entry("tty", example_dir); +no_tty: + remove_proc_entry("bar", example_dir); +no_bar: + remove_proc_entry("foo", example_dir); +no_foo: + remove_proc_entry("jiffies", example_dir); +no_jiffies: + remove_proc_entry(MODULE_NAME, NULL); +out: + return rv; +} + + +static void __exit cleanup_procfs_example(void) +{ + remove_proc_entry("jiffies_too", example_dir); + remove_proc_entry("tty", example_dir); + remove_proc_entry("bar", example_dir); + remove_proc_entry("foo", example_dir); + remove_proc_entry("jiffies", example_dir); + remove_proc_entry(MODULE_NAME, NULL); + + printk(KERN_INFO "%s %s removed\n", + MODULE_NAME, MODULE_VERS); +} + + +module_init(init_procfs_example); +module_exit(cleanup_procfs_example); + +MODULE_AUTHOR("Erik Mouw"); +MODULE_DESCRIPTION("procfs examples"); diff --git a/Documentation/DocBook/scsidrivers.tmpl b/Documentation/DocBook/scsidrivers.tmpl new file mode 100644 index 000000000000..d058e65daf19 --- /dev/null +++ b/Documentation/DocBook/scsidrivers.tmpl @@ -0,0 +1,193 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="scsidrivers"> + <bookinfo> + <title>SCSI Subsystem Interfaces</title> + + <authorgroup> + <author> + <firstname>Douglas</firstname> + <surname>Gilbert</surname> + <affiliation> + <address> + <email>dgilbert@interlog.com</email> + </address> + </affiliation> + </author> + </authorgroup> + <pubdate>2003-08-11</pubdate> + + <copyright> + <year>2002</year> + <year>2003</year> + <holder>Douglas Gilbert</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> +This document outlines the interface between the Linux scsi mid level +and lower level drivers. Lower level drivers are variously called HBA +(host bus adapter) drivers, host drivers (HD) or pseudo adapter drivers. +The latter alludes to the fact that a lower level driver may be a +bridge to another IO subsystem (and the "ide-scsi" driver is an example +of this). There can be many lower level drivers active in a running +system, but only one per hardware type. For example, the aic7xxx driver +controls adaptec controllers based on the 7xxx chip series. Most lower +level drivers can control one or more scsi hosts (a.k.a. scsi initiators). + </para> +<para> +This document can been found in an ASCII text file in the linux kernel +source: <filename>Documentation/scsi/scsi_mid_low_api.txt</filename> . +It currently hold a little more information than this document. The +<filename>drivers/scsi/hosts.h</filename> and <filename> +drivers/scsi/scsi.h</filename> headers contain descriptions of members +of important structures for the scsi subsystem. +</para> + </chapter> + + <chapter id="driver-struct"> + <title>Driver structure</title> + <para> +Traditionally a lower level driver for the scsi subsystem has been +at least two files in the drivers/scsi directory. For example, a +driver called "xyz" has a header file "xyz.h" and a source file +"xyz.c". [Actually there is no good reason why this couldn't all +be in one file.] Some drivers that have been ported to several operating +systems (e.g. aic7xxx which has separate files for generic and +OS-specific code) have more than two files. Such drivers tend to have +their own directory under the drivers/scsi directory. + </para> + <para> +scsi_module.c is normally included at the end of a lower +level driver. For it to work a declaration like this is needed before +it is included: +<programlisting> + static Scsi_Host_Template driver_template = DRIVER_TEMPLATE; + /* DRIVER_TEMPLATE should contain pointers to supported interface + functions. Scsi_Host_Template is defined hosts.h */ + #include "scsi_module.c" +</programlisting> + </para> + <para> +The scsi_module.c assumes the name "driver_template" is appropriately +defined. It contains 2 functions: +<orderedlist> +<listitem><para> + init_this_scsi_driver() called during builtin and module driver + initialization: invokes mid level's scsi_register_host() +</para></listitem> +<listitem><para> + exit_this_scsi_driver() called during closedown: invokes + mid level's scsi_unregister_host() +</para></listitem> +</orderedlist> + </para> +<para> +When a new, lower level driver is being added to Linux, the following +files (all found in the drivers/scsi directory) will need some attention: +Makefile, Config.help and Config.in . It is probably best to look at what +an existing lower level driver does in this regard. +</para> + </chapter> + + <chapter id="intfunctions"> + <title>Interface Functions</title> +!EDocumentation/scsi/scsi_mid_low_api.txt + </chapter> + + <chapter id="locks"> + <title>Locks</title> +<para> +Each Scsi_Host instance has a spin_lock called Scsi_Host::default_lock +which is initialized in scsi_register() [found in hosts.c]. Within the +same function the Scsi_Host::host_lock pointer is initialized to point +at default_lock with the scsi_assign_lock() function. Thereafter +lock and unlock operations performed by the mid level use the +Scsi_Host::host_lock pointer. +</para> +<para> +Lower level drivers can override the use of Scsi_Host::default_lock by +using scsi_assign_lock(). The earliest opportunity to do this would +be in the detect() function after it has invoked scsi_register(). It +could be replaced by a coarser grain lock (e.g. per driver) or a +lock of equal granularity (i.e. per host). Using finer grain locks +(e.g. per scsi device) may be possible by juggling locks in +queuecommand(). +</para> + </chapter> + + <chapter id="changes"> + <title>Changes since lk 2.4 series</title> +<para> +io_request_lock has been replaced by several finer grained locks. The lock +relevant to lower level drivers is Scsi_Host::host_lock and there is one +per scsi host. +</para> +<para> +The older error handling mechanism has been removed. This means the +lower level interface functions abort() and reset() have been removed. +</para> +<para> +In the 2.4 series the scsi subsystem configuration descriptions were +aggregated with the configuration descriptions from all other Linux +subsystems in the Documentation/Configure.help file. In the 2.5 series, +the scsi subsystem now has its own (much smaller) drivers/scsi/Config.help +file. +</para> + </chapter> + + <chapter id="credits"> + <title>Credits</title> +<para> +The following people have contributed to this document: +<orderedlist> +<listitem><para> +Mike Anderson <email>andmike@us.ibm.com</email> +</para></listitem> +<listitem><para> +James Bottomley <email>James.Bottomley@steeleye.com</email> +</para></listitem> +<listitem><para> +Patrick Mansfield <email>patmans@us.ibm.com</email> +</para></listitem> +</orderedlist> +</para> + </chapter> + +</book> diff --git a/Documentation/DocBook/sis900.tmpl b/Documentation/DocBook/sis900.tmpl new file mode 100644 index 000000000000..6c2cbac93c3f --- /dev/null +++ b/Documentation/DocBook/sis900.tmpl @@ -0,0 +1,585 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="SiS900Guide"> + +<bookinfo> + +<title>SiS 900/7016 Fast Ethernet Device Driver</title> + +<authorgroup> +<author> +<firstname>Ollie</firstname> +<surname>Lho</surname> +</author> + +<author> +<firstname>Lei Chun</firstname> +<surname>Chang</surname> +</author> +</authorgroup> + +<edition>Document Revision: 0.3 for SiS900 driver v1.06 & v1.07</edition> +<pubdate>November 16, 2000</pubdate> + +<copyright> + <year>1999</year> + <holder>Silicon Integrated System Corp.</holder> +</copyright> + +<legalnotice> + <para> + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + </para> + + <para> + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + </para> +</legalnotice> + +<abstract> +<para> +This document gives some information on installation and usage of SiS 900/7016 +device driver under Linux. +</para> +</abstract> + +</bookinfo> + +<toc></toc> + +<chapter id="intro"> + <title>Introduction</title> + +<para> +This document describes the revision 1.06 and 1.07 of SiS 900/7016 Fast Ethernet +device driver under Linux. The driver is developed by Silicon Integrated +System Corp. and distributed freely under the GNU General Public License (GPL). +The driver can be compiled as a loadable module and used under Linux kernel +version 2.2.x. (rev. 1.06) +With minimal changes, the driver can also be used under 2.3.x and 2.4.x kernel +(rev. 1.07), please see +<xref linkend="install"/>. If you are intended to +use the driver for earlier kernels, you are on your own. +</para> + +<para> +The driver is tested with usual TCP/IP applications including +FTP, Telnet, Netscape etc. and is used constantly by the developers. +</para> + +<para> +Please send all comments/fixes/questions to +<ulink url="mailto:lcchang@sis.com.tw">Lei-Chun Chang</ulink>. +</para> +</chapter> + +<chapter id="changes"> + <title>Changes</title> + +<para> +Changes made in Revision 1.07 + +<orderedlist> +<listitem> +<para> +Separation of sis900.c and sis900.h in order to move most +constant definition to sis900.h (many of those constants were +corrected) +</para> +</listitem> + +<listitem> +<para> +Clean up PCI detection, the pci-scan from Donald Becker were not used, +just simple pci_find_*. +</para> +</listitem> + +<listitem> +<para> +MII detection is modified to support multiple mii transceiver. +</para> +</listitem> + +<listitem> +<para> +Bugs in read_eeprom, mdio_* were removed. +</para> +</listitem> + +<listitem> +<para> +Lot of sis900 irrelevant comments were removed/changed and +more comments were added to reflect the real situation. +</para> +</listitem> + +<listitem> +<para> +Clean up of physical/virtual address space mess in buffer +descriptors. +</para> +</listitem> + +<listitem> +<para> +Better transmit/receive error handling. +</para> +</listitem> + +<listitem> +<para> +The driver now uses zero-copy single buffer management +scheme to improve performance. +</para> +</listitem> + +<listitem> +<para> +Names of variables were changed to be more consistent. +</para> +</listitem> + +<listitem> +<para> +Clean up of auo-negotiation and timer code. +</para> +</listitem> + +<listitem> +<para> +Automatic detection and change of PHY on the fly. +</para> +</listitem> + +<listitem> +<para> +Bug in mac probing fixed. +</para> +</listitem> + +<listitem> +<para> +Fix 630E equalier problem by modifying the equalizer workaround rule. +</para> +</listitem> + +<listitem> +<para> +Support for ICS1893 10/100 Interated PHYceiver. +</para> +</listitem> + +<listitem> +<para> +Support for media select by ifconfig. +</para> +</listitem> + +<listitem> +<para> +Added kernel-doc extratable documentation. +</para> +</listitem> + +</orderedlist> +</para> +</chapter> + +<chapter id="tested"> + <title>Tested Environment</title> + +<para> +This driver is developed on the following hardware + +<itemizedlist> +<listitem> + +<para> +Intel Celeron 500 with SiS 630 (rev 02) chipset +</para> +</listitem> +<listitem> + +<para> +SiS 900 (rev 01) and SiS 7016/7014 Fast Ethernet Card +</para> +</listitem> + +</itemizedlist> + +and tested with these software environments + +<itemizedlist> +<listitem> + +<para> +Red Hat Linux version 6.2 +</para> +</listitem> +<listitem> + +<para> +Linux kernel version 2.4.0 +</para> +</listitem> +<listitem> + +<para> +Netscape version 4.6 +</para> +</listitem> +<listitem> + +<para> +NcFTP 3.0.0 beta 18 +</para> +</listitem> +<listitem> + +<para> +Samba version 2.0.3 +</para> +</listitem> + +</itemizedlist> + +</para> + +</chapter> + +<chapter id="files"> +<title>Files in This Package</title> + +<para> +In the package you can find these files: +</para> + +<para> +<variablelist> + +<varlistentry> +<term>sis900.c</term> +<listitem> +<para> +Driver source file in C +</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>sis900.h</term> +<listitem> +<para> +Header file for sis900.c +</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>sis900.sgml</term> +<listitem> +<para> +DocBook SGML source of the document +</para> +</listitem> +</varlistentry> + +<varlistentry> +<term>sis900.txt</term> +<listitem> +<para> +Driver document in plain text +</para> +</listitem> +</varlistentry> + +</variablelist> +</para> +</chapter> + +<chapter id="install"> + <title>Installation</title> + +<para> +Silicon Integrated System Corp. is cooperating closely with core Linux Kernel +developers. The revisions of SiS 900 driver are distributed by the usuall channels +for kernel tar files and patches. Those kernel tar files for official kernel and +patches for kernel pre-release can be download at +<ulink url="http://ftp.kernel.org/pub/linux/kernel/">official kernel ftp site</ulink> +and its mirrors. +The 1.06 revision can be found in kernel version later than 2.3.15 and pre-2.2.14, +and 1.07 revision can be found in kernel version 2.4.0. +If you have no prior experience in networking under Linux, please read +<ulink url="http://www.tldp.org/">Ethernet HOWTO</ulink> and +<ulink url="http://www.tldp.org/">Networking HOWTO</ulink> available from +Linux Documentation Project (LDP). +</para> + +<para> +The driver is bundled in release later than 2.2.11 and 2.3.15 so this +is the most easy case. +Be sure you have the appropriate packages for compiling kernel source. +Those packages are listed in Document/Changes in kernel source +distribution. If you have to install the driver other than those bundled +in kernel release, you should have your driver file +<filename>sis900.c</filename> and <filename>sis900.h</filename> +copied into <filename class="directory">/usr/src/linux/drivers/net/</filename> first. +There are two alternative ways to install the driver +</para> + +<sect1> +<title>Building the driver as loadable module</title> + +<para> +To build the driver as a loadable kernel module you have to reconfigure +the kernel to activate network support by +</para> + +<para><screen> +make menuconfig +</screen></para> + +<para> +Choose <quote>Loadable module support ---></quote>, +then select <quote>Enable loadable module support</quote>. +</para> + +<para> +Choose <quote>Network Device Support ---></quote>, select +<quote>Ethernet (10 or 100Mbit)</quote>. +Then select <quote>EISA, VLB, PCI and on board controllers</quote>, +and choose <quote>SiS 900/7016 PCI Fast Ethernet Adapter support</quote> +to <quote>M</quote>. +</para> + +<para> +After reconfiguring the kernel, you can make the driver module by +</para> + +<para><screen> +make modules +</screen></para> + +<para> +The driver should be compiled with no errors. After compiling the driver, +the driver can be installed to proper place by +</para> + +<para><screen> +make modules_install +</screen></para> + +<para> +Load the driver into kernel by +</para> + +<para><screen> +insmod sis900 +</screen></para> + +<para> +When loading the driver into memory, some information message can be view by +</para> + +<para> +<screen> +dmesg +</screen> + +or + +<screen> +cat /var/log/message +</screen> +</para> + +<para> +If the driver is loaded properly you will have messages similar to this: +</para> + +<para><screen> +sis900.c: v1.07.06 11/07/2000 +eth0: SiS 900 PCI Fast Ethernet at 0xd000, IRQ 10, 00:00:e8:83:7f:a4. +eth0: SiS 900 Internal MII PHY transceiver found at address 1. +eth0: Using SiS 900 Internal MII PHY as default +</screen></para> + +<para> +showing the version of the driver and the results of probing routine. +</para> + +<para> +Once the driver is loaded, network can be brought up by +</para> + +<para><screen> +/sbin/ifconfig eth0 IPADDR broadcast BROADCAST netmask NETMASK media TYPE +</screen></para> + +<para> +where IPADDR, BROADCAST, NETMASK are your IP address, broadcast address and +netmask respectively. TYPE is used to set medium type used by the device. +Typical values are "10baseT"(twisted-pair 10Mbps Ethernet) or "100baseT" +(twisted-pair 100Mbps Ethernet). For more information on how to configure +network interface, please refer to +<ulink url="http://www.tldp.org/">Networking HOWTO</ulink>. +</para> + +<para> +The link status is also shown by kernel messages. For example, after the +network interface is activated, you may have the message: +</para> + +<para><screen> +eth0: Media Link On 100mbps full-duplex +</screen></para> + +<para> +If you try to unplug the twist pair (TP) cable you will get +</para> + +<para><screen> +eth0: Media Link Off +</screen></para> + +<para> +indicating that the link is failed. +</para> +</sect1> + +<sect1> +<title>Building the driver into kernel</title> + +<para> +If you want to make the driver into kernel, choose <quote>Y</quote> +rather than <quote>M</quote> on +<quote>SiS 900/7016 PCI Fast Ethernet Adapter support</quote> +when configuring the kernel. Build the kernel image in the usual way +</para> + +<para><screen> +make clean + +make bzlilo +</screen></para> + +<para> +Next time the system reboot, you have the driver in memory. +</para> + +</sect1> +</chapter> + +<chapter id="problems"> + <title>Known Problems and Bugs</title> + +<para> +There are some known problems and bugs. If you find any other bugs please +mail to <ulink url="mailto:lcchang@sis.com.tw">lcchang@sis.com.tw</ulink> + +<orderedlist> + +<listitem> +<para> +AM79C901 HomePNA PHY is not thoroughly tested, there may be some +bugs in the <quote>on the fly</quote> change of transceiver. +</para> +</listitem> + +<listitem> +<para> +A bug is hidden somewhere in the receive buffer management code, +the bug causes NULL pointer reference in the kernel. This fault is +caught before bad things happen and reported with the message: + +<computeroutput> +eth0: NULL pointer encountered in Rx ring, skipping +</computeroutput> + +which can be viewed with <literal remap="tt">dmesg</literal> or +<literal remap="tt">cat /var/log/message</literal>. +</para> +</listitem> + +<listitem> +<para> +The media type change from 10Mbps to 100Mbps twisted-pair ethernet +by ifconfig causes the media link down. +</para> +</listitem> + +</orderedlist> +</para> +</chapter> + +<chapter id="RHistory"> + <title>Revision History</title> + +<para> +<itemizedlist> + +<listitem> +<para> +November 13, 2000, Revision 1.07, seventh release, 630E problem fixed +and further clean up. +</para> +</listitem> + +<listitem> +<para> +November 4, 1999, Revision 1.06, Second release, lots of clean up +and optimization. +</para> +</listitem> + +<listitem> +<para> +August 8, 1999, Revision 1.05, Initial Public Release +</para> +</listitem> + +</itemizedlist> +</para> +</chapter> + +<chapter id="acknowledgements"> + <title>Acknowledgements</title> + +<para> +This driver was originally derived form +<ulink url="mailto:becker@cesdis1.gsfc.nasa.gov">Donald Becker</ulink>'s +<ulink url="ftp://cesdis.gsfc.nasa.gov/pub/linux/drivers/kern-2.3/pci-skeleton.c" +>pci-skeleton</ulink> and +<ulink url="ftp://cesdis.gsfc.nasa.gov/pub/linux/drivers/kern-2.3/rtl8139.c" +>rtl8139</ulink> drivers. Donald also provided various suggestion +regarded with improvements made in revision 1.06. +</para> + +<para> +The 1.05 revision was created by +<ulink url="mailto:cmhuang@sis.com.tw">Jim Huang</ulink>, AMD 79c901 +support was added by <ulink url="mailto:lcs@sis.com.tw">Chin-Shan Li</ulink>. +</para> +</chapter> + +<chapter id="functions"> +<title>List of Functions</title> +!Idrivers/net/sis900.c +</chapter> + +</book> diff --git a/Documentation/DocBook/tulip-user.tmpl b/Documentation/DocBook/tulip-user.tmpl new file mode 100644 index 000000000000..6520d7a1b132 --- /dev/null +++ b/Documentation/DocBook/tulip-user.tmpl @@ -0,0 +1,327 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="TulipUserGuide"> + <bookinfo> + <title>Tulip Driver User's Guide</title> + + <authorgroup> + <author> + <firstname>Jeff</firstname> + <surname>Garzik</surname> + <affiliation> + <address> + <email>jgarzik@pobox.com</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2001</year> + <holder>Jeff Garzik</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + + <toc></toc> + + <chapter id="intro"> + <title>Introduction</title> +<para> +The Tulip Ethernet Card Driver +is maintained by Jeff Garzik (<email>jgarzik@pobox.com</email>). +</para> + +<para> +The Tulip driver was developed by Donald Becker and changed by +Jeff Garzik, Takashi Manabe and a cast of thousands. +</para> + +<para> +For 2.4.x and later kernels, the Linux Tulip driver is available at +<ulink url="http://sourceforge.net/projects/tulip/">http://sourceforge.net/projects/tulip/</ulink> +</para> + +<para> + This driver is for the Digital "Tulip" Ethernet adapter interface. + It should work with most DEC 21*4*-based chips/ethercards, as well as + with work-alike chips from Lite-On (PNIC) and Macronix (MXIC) and ASIX. +</para> + +<para> + The original author may be reached as becker@scyld.com, or C/O + Scyld Computing Corporation, + 410 Severn Ave., Suite 210, + Annapolis MD 21403 +</para> + +<para> + Additional information on Donald Becker's tulip.c + is available at <ulink url="http://www.scyld.com/network/tulip.html">http://www.scyld.com/network/tulip.html</ulink> +</para> + + </chapter> + + <chapter id="drvr-compat"> + <title>Driver Compatibility</title> + +<para> +This device driver is designed for the DECchip "Tulip", Digital's +single-chip ethernet controllers for PCI (now owned by Intel). +Supported members of the family +are the 21040, 21041, 21140, 21140A, 21142, and 21143. Similar work-alike +chips from Lite-On, Macronics, ASIX, Compex and other listed below are also +supported. +</para> + +<para> +These chips are used on at least 140 unique PCI board designs. The great +number of chips and board designs supported is the reason for the +driver size and complexity. Almost of the increasing complexity is in the +board configuration and media selection code. There is very little +increasing in the operational critical path length. +</para> + </chapter> + + <chapter id="board-settings"> + <title>Board-specific Settings</title> + +<para> +PCI bus devices are configured by the system at boot time, so no jumpers +need to be set on the board. The system BIOS preferably should assign the +PCI INTA signal to an otherwise unused system IRQ line. +</para> + +<para> +Some boards have EEPROMs tables with default media entry. The factory default +is usually "autoselect". This should only be overridden when using +transceiver connections without link beat e.g. 10base2 or AUI, or (rarely!) +for forcing full-duplex when used with old link partners that do not do +autonegotiation. +</para> + </chapter> + + <chapter id="driver-operation"> + <title>Driver Operation</title> + +<sect1><title>Ring buffers</title> + +<para> +The Tulip can use either ring buffers or lists of Tx and Rx descriptors. +This driver uses statically allocated rings of Rx and Tx descriptors, set at +compile time by RX/TX_RING_SIZE. This version of the driver allocates skbuffs +for the Rx ring buffers at open() time and passes the skb->data field to the +Tulip as receive data buffers. When an incoming frame is less than +RX_COPYBREAK bytes long, a fresh skbuff is allocated and the frame is +copied to the new skbuff. When the incoming frame is larger, the skbuff is +passed directly up the protocol stack and replaced by a newly allocated +skbuff. +</para> + +<para> +The RX_COPYBREAK value is chosen to trade-off the memory wasted by +using a full-sized skbuff for small frames vs. the copying costs of larger +frames. For small frames the copying cost is negligible (esp. considering +that we are pre-loading the cache with immediately useful header +information). For large frames the copying cost is non-trivial, and the +larger copy might flush the cache of useful data. A subtle aspect of this +choice is that the Tulip only receives into longword aligned buffers, thus +the IP header at offset 14 isn't longword aligned for further processing. +Copied frames are put into the new skbuff at an offset of "+2", thus copying +has the beneficial effect of aligning the IP header and preloading the +cache. +</para> + +</sect1> + +<sect1><title>Synchronization</title> +<para> +The driver runs as two independent, single-threaded flows of control. One +is the send-packet routine, which enforces single-threaded use by the +dev->tbusy flag. The other thread is the interrupt handler, which is single +threaded by the hardware and other software. +</para> + +<para> +The send packet thread has partial control over the Tx ring and 'dev->tbusy' +flag. It sets the tbusy flag whenever it's queuing a Tx packet. If the next +queue slot is empty, it clears the tbusy flag when finished otherwise it sets +the 'tp->tx_full' flag. +</para> + +<para> +The interrupt handler has exclusive control over the Rx ring and records stats +from the Tx ring. (The Tx-done interrupt can't be selectively turned off, so +we can't avoid the interrupt overhead by having the Tx routine reap the Tx +stats.) After reaping the stats, it marks the queue entry as empty by setting +the 'base' to zero. Iff the 'tp->tx_full' flag is set, it clears both the +tx_full and tbusy flags. +</para> + +</sect1> + + </chapter> + + <chapter id="errata"> + <title>Errata</title> + +<para> +The old DEC databooks were light on details. +The 21040 databook claims that CSR13, CSR14, and CSR15 should each be the last +register of the set CSR12-15 written. Hmmm, now how is that possible? +</para> + +<para> +The DEC SROM format is very badly designed not precisely defined, leading to +part of the media selection junkheap below. Some boards do not have EEPROM +media tables and need to be patched up. Worse, other boards use the DEC +design kit media table when it isn't correct for their board. +</para> + +<para> +We cannot use MII interrupts because there is no defined GPIO pin to attach +them. The MII transceiver status is polled using an kernel timer. +</para> + </chapter> + + <chapter id="changelog"> + <title>Driver Change History</title> + + <sect1><title>Version 0.9.14 (February 20, 2001)</title> + <itemizedlist> + <listitem><para>Fix PNIC problems (Manfred Spraul)</para></listitem> + <listitem><para>Add new PCI id for Accton comet</para></listitem> + <listitem><para>Support Davicom tulips</para></listitem> + <listitem><para>Fix oops in eeprom parsing</para></listitem> + <listitem><para>Enable workarounds for early PCI chipsets</para></listitem> + <listitem><para>IA64, hppa csr0 support</para></listitem> + <listitem><para>Support media types 5, 6</para></listitem> + <listitem><para>Interpret a bit more of the 21142 SROM extended media type 3</para></listitem> + <listitem><para>Add missing delay in eeprom reading</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.11 (November 3, 2000)</title> + <itemizedlist> + <listitem><para>Eliminate extra bus accesses when sharing interrupts (prumpf)</para></listitem> + <listitem><para>Barrier following ownership descriptor bit flip (prumpf)</para></listitem> + <listitem><para>Endianness fixes for >14 addresses in setup frames (prumpf)</para></listitem> + <listitem><para>Report link beat to kernel/userspace via netif_carrier_*. (kuznet)</para></listitem> + <listitem><para>Better spinlocking in set_rx_mode.</para></listitem> + <listitem><para>Fix I/O resource request failure error messages (DaveM catch)</para></listitem> + <listitem><para>Handle DMA allocation failure.</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.10 (September 6, 2000)</title> + <itemizedlist> + <listitem><para>Simple interrupt mitigation (via jamal)</para></listitem> + <listitem><para>More PCI ids</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.9 (August 11, 2000)</title> + <itemizedlist> + <listitem><para>More PCI ids</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.8 (July 13, 2000)</title> + <itemizedlist> + <listitem><para>Correct signed/unsigned comparison for dummy frame index</para></listitem> + <listitem><para>Remove outdated references to struct enet_statistics</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.7 (June 17, 2000)</title> + <itemizedlist> + <listitem><para>Timer cleanups (Andrew Morton)</para></listitem> + <listitem><para>Alpha compile fix (somebody?)</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.6 (May 31, 2000)</title> + <itemizedlist> + <listitem><para>Revert 21143-related support flag patch</para></listitem> + <listitem><para>Add HPPA/media-table debugging printk</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.5 (May 30, 2000)</title> + <itemizedlist> + <listitem><para>HPPA support (willy@puffingroup)</para></listitem> + <listitem><para>CSR6 bits and tulip.h cleanup (Chris Smith)</para></listitem> + <listitem><para>Improve debugging messages a bit</para></listitem> + <listitem><para>Add delay after CSR13 write in t21142_start_nway</para></listitem> + <listitem><para>Remove unused ETHER_STATS code</para></listitem> + <listitem><para>Convert 'extern inline' to 'static inline' in tulip.h (Chris Smith)</para></listitem> + <listitem><para>Update DS21143 support flags in tulip_chip_info[]</para></listitem> + <listitem><para>Use spin_lock_irq, not _irqsave/restore, in tulip_start_xmit()</para></listitem> + <listitem><para>Add locking to set_rx_mode()</para></listitem> + <listitem><para>Fix race with chip setting DescOwned bit (Hal Murray)</para></listitem> + <listitem><para>Request 100% of PIO and MMIO resource space assigned to card</para></listitem> + <listitem><para>Remove error message from pci_enable_device failure</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.4.3 (April 14, 2000)</title> + <itemizedlist> + <listitem><para>mod_timer fix (Hal Murray)</para></listitem> + <listitem><para>PNIC2 resuscitation (Chris Smith)</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.4.2 (March 21, 2000)</title> + <itemizedlist> + <listitem><para>Fix 21041 CSR7, CSR13/14/15 handling</para></listitem> + <listitem><para>Merge some PCI ids from tulip 0.91x</para></listitem> + <listitem><para>Merge some HAS_xxx flags and flag settings from tulip 0.91x</para></listitem> + <listitem><para>asm/io.h fix (submitted by many) and cleanup</para></listitem> + <listitem><para>s/HAS_NWAY143/HAS_NWAY/</para></listitem> + <listitem><para>Cleanup 21041 mode reporting</para></listitem> + <listitem><para>Small code cleanups</para></listitem> + </itemizedlist> + </sect1> + + <sect1><title>Version 0.9.4.1 (March 18, 2000)</title> + <itemizedlist> + <listitem><para>Finish PCI DMA conversion (davem)</para></listitem> + <listitem><para>Do not netif_start_queue() at end of tulip_tx_timeout() (kuznet)</para></listitem> + <listitem><para>PCI DMA fix (kuznet)</para></listitem> + <listitem><para>eeprom.c code cleanup</para></listitem> + <listitem><para>Remove Xircom Tulip crud</para></listitem> + </itemizedlist> + </sect1> + </chapter> + +</book> diff --git a/Documentation/DocBook/usb.tmpl b/Documentation/DocBook/usb.tmpl new file mode 100644 index 000000000000..f3ef0bf435e9 --- /dev/null +++ b/Documentation/DocBook/usb.tmpl @@ -0,0 +1,979 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="Linux-USB-API"> + <bookinfo> + <title>The Linux-USB Host Side API</title> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + +<chapter id="intro"> + <title>Introduction to USB on Linux</title> + + <para>A Universal Serial Bus (USB) is used to connect a host, + such as a PC or workstation, to a number of peripheral + devices. USB uses a tree structure, with the host at the + root (the system's master), hubs as interior nodes, and + peripheral devices as leaves (and slaves). + Modern PCs support several such trees of USB devices, usually + one USB 2.0 tree (480 Mbit/sec each) with + a few USB 1.1 trees (12 Mbit/sec each) that are used when you + connect a USB 1.1 device directly to the machine's "root hub". + </para> + + <para>That master/slave asymmetry was designed in part for + ease of use. It is not physically possible to assemble + (legal) USB cables incorrectly: all upstream "to-the-host" + connectors are the rectangular type, matching the sockets on + root hubs, and the downstream type are the squarish type + (or they are built in to the peripheral). + Software doesn't need to deal with distributed autoconfiguration + since the pre-designated master node manages all that. + At the electrical level, bus protocol overhead is reduced by + eliminating arbitration and moving scheduling into host software. + </para> + + <para>USB 1.0 was announced in January 1996, and was revised + as USB 1.1 (with improvements in hub specification and + support for interrupt-out transfers) in September 1998. + USB 2.0 was released in April 2000, including high speed + transfers and transaction translating hubs (used for USB 1.1 + and 1.0 backward compatibility). + </para> + + <para>USB support was added to Linux early in the 2.2 kernel series + shortly before the 2.3 development forked off. Updates + from 2.3 were regularly folded back into 2.2 releases, bringing + new features such as <filename>/sbin/hotplug</filename> support, + more drivers, and more robustness. + The 2.5 kernel series continued such improvements, and also + worked on USB 2.0 support, + higher performance, + better consistency between host controller drivers, + API simplification (to make bugs less likely), + and providing internal "kerneldoc" documentation. + </para> + + <para>Linux can run inside USB devices as well as on + the hosts that control the devices. + Because the Linux 2.x USB support evolved to support mass market + platforms such as Apple Macintosh or PC-compatible systems, + it didn't address design concerns for those types of USB systems. + So it can't be used inside mass-market PDAs, or other peripherals. + USB device drivers running inside those Linux peripherals + don't do the same things as the ones running inside hosts, + and so they've been given a different name: + they're called <emphasis>gadget drivers</emphasis>. + This document does not present gadget drivers. + </para> + + </chapter> + +<chapter id="host"> + <title>USB Host-Side API Model</title> + + <para>Within the kernel, + host-side drivers for USB devices talk to the "usbcore" APIs. + There are two types of public "usbcore" APIs, targetted at two different + layers of USB driver. Those are + <emphasis>general purpose</emphasis> drivers, exposed through + driver frameworks such as block, character, or network devices; + and drivers that are <emphasis>part of the core</emphasis>, + which are involved in managing a USB bus. + Such core drivers include the <emphasis>hub</emphasis> driver, + which manages trees of USB devices, and several different kinds + of <emphasis>host controller driver (HCD)</emphasis>, + which control individual busses. + </para> + + <para>The device model seen by USB drivers is relatively complex. + </para> + + <itemizedlist> + + <listitem><para>USB supports four kinds of data transfer + (control, bulk, interrupt, and isochronous). Two transfer + types use bandwidth as it's available (control and bulk), + while the other two types of transfer (interrupt and isochronous) + are scheduled to provide guaranteed bandwidth. + </para></listitem> + + <listitem><para>The device description model includes one or more + "configurations" per device, only one of which is active at a time. + Devices that are capable of high speed operation must also support + full speed configurations, along with a way to ask about the + "other speed" configurations that might be used. + </para></listitem> + + <listitem><para>Configurations have one or more "interface", each + of which may have "alternate settings". Interfaces may be + standardized by USB "Class" specifications, or may be specific to + a vendor or device.</para> + + <para>USB device drivers actually bind to interfaces, not devices. + Think of them as "interface drivers", though you + may not see many devices where the distinction is important. + <emphasis>Most USB devices are simple, with only one configuration, + one interface, and one alternate setting.</emphasis> + </para></listitem> + + <listitem><para>Interfaces have one or more "endpoints", each of + which supports one type and direction of data transfer such as + "bulk out" or "interrupt in". The entire configuration may have + up to sixteen endpoints in each direction, allocated as needed + among all the interfaces. + </para></listitem> + + <listitem><para>Data transfer on USB is packetized; each endpoint + has a maximum packet size. + Drivers must often be aware of conventions such as flagging the end + of bulk transfers using "short" (including zero length) packets. + </para></listitem> + + <listitem><para>The Linux USB API supports synchronous calls for + control and bulk messaging. + It also supports asynchnous calls for all kinds of data transfer, + using request structures called "URBs" (USB Request Blocks). + </para></listitem> + + </itemizedlist> + + <para>Accordingly, the USB Core API exposed to device drivers + covers quite a lot of territory. You'll probably need to consult + the USB 2.0 specification, available online from www.usb.org at + no cost, as well as class or device specifications. + </para> + + <para>The only host-side drivers that actually touch hardware + (reading/writing registers, handling IRQs, and so on) are the HCDs. + In theory, all HCDs provide the same functionality through the same + API. In practice, that's becoming more true on the 2.5 kernels, + but there are still differences that crop up especially with + fault handling. Different controllers don't necessarily report + the same aspects of failures, and recovery from faults (including + software-induced ones like unlinking an URB) isn't yet fully + consistent. + Device driver authors should make a point of doing disconnect + testing (while the device is active) with each different host + controller driver, to make sure drivers don't have bugs of + their own as well as to make sure they aren't relying on some + HCD-specific behavior. + (You will need external USB 1.1 and/or + USB 2.0 hubs to perform all those tests.) + </para> + + </chapter> + +<chapter><title>USB-Standard Types</title> + + <para>In <filename><linux/usb_ch9.h></filename> you will find + the USB data types defined in chapter 9 of the USB specification. + These data types are used throughout USB, and in APIs including + this host side API, gadget APIs, and usbfs. + </para> + +!Iinclude/linux/usb_ch9.h + + </chapter> + +<chapter><title>Host-Side Data Types and Macros</title> + + <para>The host side API exposes several layers to drivers, some of + which are more necessary than others. + These support lifecycle models for host side drivers + and devices, and support passing buffers through usbcore to + some HCD that performs the I/O for the device driver. + </para> + + +!Iinclude/linux/usb.h + + </chapter> + + <chapter><title>USB Core APIs</title> + + <para>There are two basic I/O models in the USB API. + The most elemental one is asynchronous: drivers submit requests + in the form of an URB, and the URB's completion callback + handle the next step. + All USB transfer types support that model, although there + are special cases for control URBs (which always have setup + and status stages, but may not have a data stage) and + isochronous URBs (which allow large packets and include + per-packet fault reports). + Built on top of that is synchronous API support, where a + driver calls a routine that allocates one or more URBs, + submits them, and waits until they complete. + There are synchronous wrappers for single-buffer control + and bulk transfers (which are awkward to use in some + driver disconnect scenarios), and for scatterlist based + streaming i/o (bulk or interrupt). + </para> + + <para>USB drivers need to provide buffers that can be + used for DMA, although they don't necessarily need to + provide the DMA mapping themselves. + There are APIs to use used when allocating DMA buffers, + which can prevent use of bounce buffers on some systems. + In some cases, drivers may be able to rely on 64bit DMA + to eliminate another kind of bounce buffer. + </para> + +!Edrivers/usb/core/urb.c +!Edrivers/usb/core/message.c +!Edrivers/usb/core/file.c +!Edrivers/usb/core/usb.c +!Edrivers/usb/core/hub.c + </chapter> + + <chapter><title>Host Controller APIs</title> + + <para>These APIs are only for use by host controller drivers, + most of which implement standard register interfaces such as + EHCI, OHCI, or UHCI. + UHCI was one of the first interfaces, designed by Intel and + also used by VIA; it doesn't do much in hardware. + OHCI was designed later, to have the hardware do more work + (bigger transfers, tracking protocol state, and so on). + EHCI was designed with USB 2.0; its design has features that + resemble OHCI (hardware does much more work) as well as + UHCI (some parts of ISO support, TD list processing). + </para> + + <para>There are host controllers other than the "big three", + although most PCI based controllers (and a few non-PCI based + ones) use one of those interfaces. + Not all host controllers use DMA; some use PIO, and there + is also a simulator. + </para> + + <para>The same basic APIs are available to drivers for all + those controllers. + For historical reasons they are in two layers: + <structname>struct usb_bus</structname> is a rather thin + layer that became available in the 2.2 kernels, while + <structname>struct usb_hcd</structname> is a more featureful + layer (available in later 2.4 kernels and in 2.5) that + lets HCDs share common code, to shrink driver size + and significantly reduce hcd-specific behaviors. + </para> + +!Edrivers/usb/core/hcd.c +!Edrivers/usb/core/hcd-pci.c +!Edrivers/usb/core/buffer.c + </chapter> + + <chapter> + <title>The USB Filesystem (usbfs)</title> + + <para>This chapter presents the Linux <emphasis>usbfs</emphasis>. + You may prefer to avoid writing new kernel code for your + USB driver; that's the problem that usbfs set out to solve. + User mode device drivers are usually packaged as applications + or libraries, and may use usbfs through some programming library + that wraps it. Such libraries include + <ulink url="http://libusb.sourceforge.net">libusb</ulink> + for C/C++, and + <ulink url="http://jUSB.sourceforge.net">jUSB</ulink> for Java. + </para> + + <note><title>Unfinished</title> + <para>This particular documentation is incomplete, + especially with respect to the asynchronous mode. + As of kernel 2.5.66 the code and this (new) documentation + need to be cross-reviewed. + </para> + </note> + + <para>Configure usbfs into Linux kernels by enabling the + <emphasis>USB filesystem</emphasis> option (CONFIG_USB_DEVICEFS), + and you get basic support for user mode USB device drivers. + Until relatively recently it was often (confusingly) called + <emphasis>usbdevfs</emphasis> although it wasn't solving what + <emphasis>devfs</emphasis> was. + Every USB device will appear in usbfs, regardless of whether or + not it has a kernel driver; but only devices with kernel drivers + show up in devfs. + </para> + + <sect1> + <title>What files are in "usbfs"?</title> + + <para>Conventionally mounted at + <filename>/proc/bus/usb</filename>, usbfs + features include: + <itemizedlist> + <listitem><para><filename>/proc/bus/usb/devices</filename> + ... a text file + showing each of the USB devices on known to the kernel, + and their configuration descriptors. + You can also poll() this to learn about new devices. + </para></listitem> + <listitem><para><filename>/proc/bus/usb/BBB/DDD</filename> + ... magic files + exposing the each device's configuration descriptors, and + supporting a series of ioctls for making device requests, + including I/O to devices. (Purely for access by programs.) + </para></listitem> + </itemizedlist> + </para> + + <para> Each bus is given a number (BBB) based on when it was + enumerated; within each bus, each device is given a similar + number (DDD). + Those BBB/DDD paths are not "stable" identifiers; + expect them to change even if you always leave the devices + plugged in to the same hub port. + <emphasis>Don't even think of saving these in application + configuration files.</emphasis> + Stable identifiers are available, for user mode applications + that want to use them. HID and networking devices expose + these stable IDs, so that for example you can be sure that + you told the right UPS to power down its second server. + "usbfs" doesn't (yet) expose those IDs. + </para> + + </sect1> + + <sect1> + <title>Mounting and Access Control</title> + + <para>There are a number of mount options for usbfs, which will + be of most interest to you if you need to override the default + access control policy. + That policy is that only root may read or write device files + (<filename>/proc/bus/BBB/DDD</filename>) although anyone may read + the <filename>devices</filename> + or <filename>drivers</filename> files. + I/O requests to the device also need the CAP_SYS_RAWIO capability, + </para> + + <para>The significance of that is that by default, all user mode + device drivers need super-user privileges. + You can change modes or ownership in a driver setup + when the device hotplugs, or maye just start the + driver right then, as a privileged server (or some activity + within one). + That's the most secure approach for multi-user systems, + but for single user systems ("trusted" by that user) + it's more convenient just to grant everyone all access + (using the <emphasis>devmode=0666</emphasis> option) + so the driver can start whenever it's needed. + </para> + + <para>The mount options for usbfs, usable in /etc/fstab or + in command line invocations of <emphasis>mount</emphasis>, are: + + <variablelist> + <varlistentry> + <term><emphasis>busgid</emphasis>=NNNNN</term> + <listitem><para>Controls the GID used for the + /proc/bus/usb/BBB + directories. (Default: 0)</para></listitem></varlistentry> + <varlistentry><term><emphasis>busmode</emphasis>=MMM</term> + <listitem><para>Controls the file mode used for the + /proc/bus/usb/BBB + directories. (Default: 0555) + </para></listitem></varlistentry> + <varlistentry><term><emphasis>busuid</emphasis>=NNNNN</term> + <listitem><para>Controls the UID used for the + /proc/bus/usb/BBB + directories. (Default: 0)</para></listitem></varlistentry> + + <varlistentry><term><emphasis>devgid</emphasis>=NNNNN</term> + <listitem><para>Controls the GID used for the + /proc/bus/usb/BBB/DDD + files. (Default: 0)</para></listitem></varlistentry> + <varlistentry><term><emphasis>devmode</emphasis>=MMM</term> + <listitem><para>Controls the file mode used for the + /proc/bus/usb/BBB/DDD + files. (Default: 0644)</para></listitem></varlistentry> + <varlistentry><term><emphasis>devuid</emphasis>=NNNNN</term> + <listitem><para>Controls the UID used for the + /proc/bus/usb/BBB/DDD + files. (Default: 0)</para></listitem></varlistentry> + + <varlistentry><term><emphasis>listgid</emphasis>=NNNNN</term> + <listitem><para>Controls the GID used for the + /proc/bus/usb/devices and drivers files. + (Default: 0)</para></listitem></varlistentry> + <varlistentry><term><emphasis>listmode</emphasis>=MMM</term> + <listitem><para>Controls the file mode used for the + /proc/bus/usb/devices and drivers files. + (Default: 0444)</para></listitem></varlistentry> + <varlistentry><term><emphasis>listuid</emphasis>=NNNNN</term> + <listitem><para>Controls the UID used for the + /proc/bus/usb/devices and drivers files. + (Default: 0)</para></listitem></varlistentry> + </variablelist> + + </para> + + <para>Note that many Linux distributions hard-wire the mount options + for usbfs in their init scripts, such as + <filename>/etc/rc.d/rc.sysinit</filename>, + rather than making it easy to set this per-system + policy in <filename>/etc/fstab</filename>. + </para> + + </sect1> + + <sect1> + <title>/proc/bus/usb/devices</title> + + <para>This file is handy for status viewing tools in user + mode, which can scan the text format and ignore most of it. + More detailed device status (including class and vendor + status) is available from device-specific files. + For information about the current format of this file, + see the + <filename>Documentation/usb/proc_usb_info.txt</filename> + file in your Linux kernel sources. + </para> + + <para>Otherwise the main use for this file from programs + is to poll() it to get notifications of usb devices + as they're plugged or unplugged. + To see what changed, you'd need to read the file and + compare "before" and "after" contents, scan the filesystem, + or see its hotplug event. + </para> + + </sect1> + + <sect1> + <title>/proc/bus/usb/BBB/DDD</title> + + <para>Use these files in one of these basic ways: + </para> + + <para><emphasis>They can be read,</emphasis> + producing first the device descriptor + (18 bytes) and then the descriptors for the current configuration. + See the USB 2.0 spec for details about those binary data formats. + You'll need to convert most multibyte values from little endian + format to your native host byte order, although a few of the + fields in the device descriptor (both of the BCD-encoded fields, + and the vendor and product IDs) will be byteswapped for you. + Note that configuration descriptors include descriptors for + interfaces, altsettings, endpoints, and maybe additional + class descriptors. + </para> + + <para><emphasis>Perform USB operations</emphasis> using + <emphasis>ioctl()</emphasis> requests to make endpoint I/O + requests (synchronously or asynchronously) or manage + the device. + These requests need the CAP_SYS_RAWIO capability, + as well as filesystem access permissions. + Only one ioctl request can be made on one of these + device files at a time. + This means that if you are synchronously reading an endpoint + from one thread, you won't be able to write to a different + endpoint from another thread until the read completes. + This works for <emphasis>half duplex</emphasis> protocols, + but otherwise you'd use asynchronous i/o requests. + </para> + + </sect1> + + + <sect1> + <title>Life Cycle of User Mode Drivers</title> + + <para>Such a driver first needs to find a device file + for a device it knows how to handle. + Maybe it was told about it because a + <filename>/sbin/hotplug</filename> event handling agent + chose that driver to handle the new device. + Or maybe it's an application that scans all the + /proc/bus/usb device files, and ignores most devices. + In either case, it should <function>read()</function> all + the descriptors from the device file, + and check them against what it knows how to handle. + It might just reject everything except a particular + vendor and product ID, or need a more complex policy. + </para> + + <para>Never assume there will only be one such device + on the system at a time! + If your code can't handle more than one device at + a time, at least detect when there's more than one, and + have your users choose which device to use. + </para> + + <para>Once your user mode driver knows what device to use, + it interacts with it in either of two styles. + The simple style is to make only control requests; some + devices don't need more complex interactions than those. + (An example might be software using vendor-specific control + requests for some initialization or configuration tasks, + with a kernel driver for the rest.) + </para> + + <para>More likely, you need a more complex style driver: + one using non-control endpoints, reading or writing data + and claiming exclusive use of an interface. + <emphasis>Bulk</emphasis> transfers are easiest to use, + but only their sibling <emphasis>interrupt</emphasis> transfers + work with low speed devices. + Both interrupt and <emphasis>isochronous</emphasis> transfers + offer service guarantees because their bandwidth is reserved. + Such "periodic" transfers are awkward to use through usbfs, + unless you're using the asynchronous calls. However, interrupt + transfers can also be used in a synchronous "one shot" style. + </para> + + <para>Your user-mode driver should never need to worry + about cleaning up request state when the device is + disconnected, although it should close its open file + descriptors as soon as it starts seeing the ENODEV + errors. + </para> + + </sect1> + + <sect1><title>The ioctl() Requests</title> + + <para>To use these ioctls, you need to include the following + headers in your userspace program: +<programlisting>#include <linux/usb.h> +#include <linux/usbdevice_fs.h> +#include <asm/byteorder.h></programlisting> + The standard USB device model requests, from "Chapter 9" of + the USB 2.0 specification, are automatically included from + the <filename><linux/usb_ch9.h></filename> header. + </para> + + <para>Unless noted otherwise, the ioctl requests + described here will + update the modification time on the usbfs file to which + they are applied (unless they fail). + A return of zero indicates success; otherwise, a + standard USB error code is returned. (These are + documented in + <filename>Documentation/usb/error-codes.txt</filename> + in your kernel sources.) + </para> + + <para>Each of these files multiplexes access to several + I/O streams, one per endpoint. + Each device has one control endpoint (endpoint zero) + which supports a limited RPC style RPC access. + Devices are configured + by khubd (in the kernel) setting a device-wide + <emphasis>configuration</emphasis> that affects things + like power consumption and basic functionality. + The endpoints are part of USB <emphasis>interfaces</emphasis>, + which may have <emphasis>altsettings</emphasis> + affecting things like which endpoints are available. + Many devices only have a single configuration and interface, + so drivers for them will ignore configurations and altsettings. + </para> + + + <sect2> + <title>Management/Status Requests</title> + + <para>A number of usbfs requests don't deal very directly + with device I/O. + They mostly relate to device management and status. + These are all synchronous requests. + </para> + + <variablelist> + + <varlistentry><term>USBDEVFS_CLAIMINTERFACE</term> + <listitem><para>This is used to force usbfs to + claim a specific interface, + which has not previously been claimed by usbfs or any other + kernel driver. + The ioctl parameter is an integer holding the number of + the interface (bInterfaceNumber from descriptor). + </para><para> + Note that if your driver doesn't claim an interface + before trying to use one of its endpoints, and no + other driver has bound to it, then the interface is + automatically claimed by usbfs. + </para><para> + This claim will be released by a RELEASEINTERFACE ioctl, + or by closing the file descriptor. + File modification time is not updated by this request. + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_CONNECTINFO</term> + <listitem><para>Says whether the device is lowspeed. + The ioctl parameter points to a structure like this: +<programlisting>struct usbdevfs_connectinfo { + unsigned int devnum; + unsigned char slow; +}; </programlisting> + File modification time is not updated by this request. + </para><para> + <emphasis>You can't tell whether a "not slow" + device is connected at high speed (480 MBit/sec) + or just full speed (12 MBit/sec).</emphasis> + You should know the devnum value already, + it's the DDD value of the device file name. + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_GETDRIVER</term> + <listitem><para>Returns the name of the kernel driver + bound to a given interface (a string). Parameter + is a pointer to this structure, which is modified: +<programlisting>struct usbdevfs_getdriver { + unsigned int interface; + char driver[USBDEVFS_MAXDRIVERNAME + 1]; +};</programlisting> + File modification time is not updated by this request. + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_IOCTL</term> + <listitem><para>Passes a request from userspace through + to a kernel driver that has an ioctl entry in the + <emphasis>struct usb_driver</emphasis> it registered. +<programlisting>struct usbdevfs_ioctl { + int ifno; + int ioctl_code; + void *data; +}; + +/* user mode call looks like this. + * 'request' becomes the driver->ioctl() 'code' parameter. + * the size of 'param' is encoded in 'request', and that data + * is copied to or from the driver->ioctl() 'buf' parameter. + */ +static int +usbdev_ioctl (int fd, int ifno, unsigned request, void *param) +{ + struct usbdevfs_ioctl wrapper; + + wrapper.ifno = ifno; + wrapper.ioctl_code = request; + wrapper.data = param; + + return ioctl (fd, USBDEVFS_IOCTL, &wrapper); +} </programlisting> + File modification time is not updated by this request. + </para><para> + This request lets kernel drivers talk to user mode code + through filesystem operations even when they don't create + a charactor or block special device. + It's also been used to do things like ask devices what + device special file should be used. + Two pre-defined ioctls are used + to disconnect and reconnect kernel drivers, so + that user mode code can completely manage binding + and configuration of devices. + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_RELEASEINTERFACE</term> + <listitem><para>This is used to release the claim usbfs + made on interface, either implicitly or because of a + USBDEVFS_CLAIMINTERFACE call, before the file + descriptor is closed. + The ioctl parameter is an integer holding the number of + the interface (bInterfaceNumber from descriptor); + File modification time is not updated by this request. + </para><warning><para> + <emphasis>No security check is made to ensure + that the task which made the claim is the one + which is releasing it. + This means that user mode driver may interfere + other ones. </emphasis> + </para></warning></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_RESETEP</term> + <listitem><para>Resets the data toggle value for an endpoint + (bulk or interrupt) to DATA0. + The ioctl parameter is an integer endpoint number + (1 to 15, as identified in the endpoint descriptor), + with USB_DIR_IN added if the device's endpoint sends + data to the host. + </para><warning><para> + <emphasis>Avoid using this request. + It should probably be removed.</emphasis> + Using it typically means the device and driver will lose + toggle synchronization. If you really lost synchronization, + you likely need to completely handshake with the device, + using a request like CLEAR_HALT + or SET_INTERFACE. + </para></warning></listitem></varlistentry> + + </variablelist> + + </sect2> + + <sect2> + <title>Synchronous I/O Support</title> + + <para>Synchronous requests involve the kernel blocking + until until the user mode request completes, either by + finishing successfully or by reporting an error. + In most cases this is the simplest way to use usbfs, + although as noted above it does prevent performing I/O + to more than one endpoint at a time. + </para> + + <variablelist> + + <varlistentry><term>USBDEVFS_BULK</term> + <listitem><para>Issues a bulk read or write request to the + device. + The ioctl parameter is a pointer to this structure: +<programlisting>struct usbdevfs_bulktransfer { + unsigned int ep; + unsigned int len; + unsigned int timeout; /* in milliseconds */ + void *data; +};</programlisting> + </para><para>The "ep" value identifies a + bulk endpoint number (1 to 15, as identified in an endpoint + descriptor), + masked with USB_DIR_IN when referring to an endpoint which + sends data to the host from the device. + The length of the data buffer is identified by "len"; + Recent kernels support requests up to about 128KBytes. + <emphasis>FIXME say how read length is returned, + and how short reads are handled.</emphasis>. + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_CLEAR_HALT</term> + <listitem><para>Clears endpoint halt (stall) and + resets the endpoint toggle. This is only + meaningful for bulk or interrupt endpoints. + The ioctl parameter is an integer endpoint number + (1 to 15, as identified in an endpoint descriptor), + masked with USB_DIR_IN when referring to an endpoint which + sends data to the host from the device. + </para><para> + Use this on bulk or interrupt endpoints which have + stalled, returning <emphasis>-EPIPE</emphasis> status + to a data transfer request. + Do not issue the control request directly, since + that could invalidate the host's record of the + data toggle. + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_CONTROL</term> + <listitem><para>Issues a control request to the device. + The ioctl parameter points to a structure like this: +<programlisting>struct usbdevfs_ctrltransfer { + __u8 bRequestType; + __u8 bRequest; + __u16 wValue; + __u16 wIndex; + __u16 wLength; + __u32 timeout; /* in milliseconds */ + void *data; +};</programlisting> + </para><para> + The first eight bytes of this structure are the contents + of the SETUP packet to be sent to the device; see the + USB 2.0 specification for details. + The bRequestType value is composed by combining a + USB_TYPE_* value, a USB_DIR_* value, and a + USB_RECIP_* value (from + <emphasis><linux/usb.h></emphasis>). + If wLength is nonzero, it describes the length of the data + buffer, which is either written to the device + (USB_DIR_OUT) or read from the device (USB_DIR_IN). + </para><para> + At this writing, you can't transfer more than 4 KBytes + of data to or from a device; usbfs has a limit, and + some host controller drivers have a limit. + (That's not usually a problem.) + <emphasis>Also</emphasis> there's no way to say it's + not OK to get a short read back from the device. + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_RESET</term> + <listitem><para>Does a USB level device reset. + The ioctl parameter is ignored. + After the reset, this rebinds all device interfaces. + File modification time is not updated by this request. + </para><warning><para> + <emphasis>Avoid using this call</emphasis> + until some usbcore bugs get fixed, + since it does not fully synchronize device, interface, + and driver (not just usbfs) state. + </para></warning></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_SETINTERFACE</term> + <listitem><para>Sets the alternate setting for an + interface. The ioctl parameter is a pointer to a + structure like this: +<programlisting>struct usbdevfs_setinterface { + unsigned int interface; + unsigned int altsetting; +}; </programlisting> + File modification time is not updated by this request. + </para><para> + Those struct members are from some interface descriptor + applying to the the current configuration. + The interface number is the bInterfaceNumber value, and + the altsetting number is the bAlternateSetting value. + (This resets each endpoint in the interface.) + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_SETCONFIGURATION</term> + <listitem><para>Issues the + <function>usb_set_configuration</function> call + for the device. + The parameter is an integer holding the number of + a configuration (bConfigurationValue from descriptor). + File modification time is not updated by this request. + </para><warning><para> + <emphasis>Avoid using this call</emphasis> + until some usbcore bugs get fixed, + since it does not fully synchronize device, interface, + and driver (not just usbfs) state. + </para></warning></listitem></varlistentry> + + </variablelist> + </sect2> + + <sect2> + <title>Asynchronous I/O Support</title> + + <para>As mentioned above, there are situations where it may be + important to initiate concurrent operations from user mode code. + This is particularly important for periodic transfers + (interrupt and isochronous), but it can be used for other + kinds of USB requests too. + In such cases, the asynchronous requests described here + are essential. Rather than submitting one request and having + the kernel block until it completes, the blocking is separate. + </para> + + <para>These requests are packaged into a structure that + resembles the URB used by kernel device drivers. + (No POSIX Async I/O support here, sorry.) + It identifies the endpoint type (USBDEVFS_URB_TYPE_*), + endpoint (number, masked with USB_DIR_IN as appropriate), + buffer and length, and a user "context" value serving to + uniquely identify each request. + (It's usually a pointer to per-request data.) + Flags can modify requests (not as many as supported for + kernel drivers). + </para> + + <para>Each request can specify a realtime signal number + (between SIGRTMIN and SIGRTMAX, inclusive) to request a + signal be sent when the request completes. + </para> + + <para>When usbfs returns these urbs, the status value + is updated, and the buffer may have been modified. + Except for isochronous transfers, the actual_length is + updated to say how many bytes were transferred; if the + USBDEVFS_URB_DISABLE_SPD flag is set + ("short packets are not OK"), if fewer bytes were read + than were requested then you get an error report. + </para> + +<programlisting>struct usbdevfs_iso_packet_desc { + unsigned int length; + unsigned int actual_length; + unsigned int status; +}; + +struct usbdevfs_urb { + unsigned char type; + unsigned char endpoint; + int status; + unsigned int flags; + void *buffer; + int buffer_length; + int actual_length; + int start_frame; + int number_of_packets; + int error_count; + unsigned int signr; + void *usercontext; + struct usbdevfs_iso_packet_desc iso_frame_desc[]; +};</programlisting> + + <para> For these asynchronous requests, the file modification + time reflects when the request was initiated. + This contrasts with their use with the synchronous requests, + where it reflects when requests complete. + </para> + + <variablelist> + + <varlistentry><term>USBDEVFS_DISCARDURB</term> + <listitem><para> + <emphasis>TBS</emphasis> + File modification time is not updated by this request. + </para><para> + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_DISCSIGNAL</term> + <listitem><para> + <emphasis>TBS</emphasis> + File modification time is not updated by this request. + </para><para> + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_REAPURB</term> + <listitem><para> + <emphasis>TBS</emphasis> + File modification time is not updated by this request. + </para><para> + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_REAPURBNDELAY</term> + <listitem><para> + <emphasis>TBS</emphasis> + File modification time is not updated by this request. + </para><para> + </para></listitem></varlistentry> + + <varlistentry><term>USBDEVFS_SUBMITURB</term> + <listitem><para> + <emphasis>TBS</emphasis> + </para><para> + </para></listitem></varlistentry> + + </variablelist> + </sect2> + + </sect1> + + </chapter> + +</book> +<!-- vim:syntax=sgml:sw=4 +--> diff --git a/Documentation/DocBook/via-audio.tmpl b/Documentation/DocBook/via-audio.tmpl new file mode 100644 index 000000000000..36e642147d6b --- /dev/null +++ b/Documentation/DocBook/via-audio.tmpl @@ -0,0 +1,597 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="ViaAudioGuide"> + <bookinfo> + <title>Via 686 Audio Driver for Linux</title> + + <authorgroup> + <author> + <firstname>Jeff</firstname> + <surname>Garzik</surname> + </author> + </authorgroup> + + <copyright> + <year>1999-2001</year> + <holder>Jeff Garzik</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + The Via VT82C686A "super southbridge" chips contain + AC97-compatible audio logic which features dual 16-bit stereo + PCM sound channels (full duplex), plus a third PCM channel intended for use + in hardware-assisted FM synthesis. + </para> + <para> + The current Linux kernel audio driver for this family of chips + supports audio playback and recording, but hardware-assisted + FM features, and hardware buffer direct-access (mmap) + support are not yet available. + </para> + <para> + This driver supports any Linux kernel version after 2.4.10. + </para> + <para> + Please send bug reports to the mailing list <email>linux-via@gtf.org</email>. + To subscribe, e-mail <email>majordomo@gtf.org</email> with + </para> + <programlisting> + subscribe linux-via + </programlisting> + <para> + in the body of the message. + </para> + </chapter> + + <chapter id="install"> + <title>Driver Installation</title> + <para> + To use this audio driver, select the + CONFIG_SOUND_VIA82CXXX option in the section Sound during kernel configuration. + Follow the usual kernel procedures for rebuilding the kernel, + or building and installing driver modules. + </para> + <para> + To make this driver the default audio driver, you can add the + following to your /etc/conf.modules file: + </para> + <programlisting> + alias sound via82cxxx_audio + </programlisting> + <para> + Note that soundcore and ac97_codec support modules + are also required for working audio, in addition to + the via82cxxx_audio module itself. + </para> + </chapter> + + <chapter id="reportbug"> + <title>Submitting a bug report</title> + <sect1 id="bugrepdesc"><title>Description of problem</title> + <para> + Describe the application you were using to play/record sound, and how + to reproduce the problem. + </para> + </sect1> + <sect1 id="bugrepdiag"><title>Diagnostic output</title> + <para> + Obtain the via-audio-diag diagnostics program from + http://sf.net/projects/gkernel/ and provide a dump of the + audio chip's registers while the problem is occurring. Sample command line: + </para> + <programlisting> + ./via-audio-diag -aps > diag-output.txt + </programlisting> + </sect1> + <sect1 id="bugrepdebug"><title>Driver debug output</title> + <para> + Define <constant>VIA_DEBUG</constant> at the beginning of the driver, then capture and email + the kernel log output. This can be viewed in the system kernel log (if + enabled), or via the dmesg program. Sample command line: + </para> + <programlisting> + dmesg > /tmp/dmesg-output.txt + </programlisting> + </sect1> + <sect1 id="bugrepprintk"><title>Bigger kernel message buffer</title> + <para> + If you wish to increase the size of the buffer displayed by dmesg, then + change the <constant>LOG_BUF_LEN</constant> macro at the top of linux/kernel/printk.c, recompile + your kernel, and pass the <constant>LOG_BUF_LEN</constant> value to dmesg. Sample command line with + <constant>LOG_BUF_LEN</constant> == 32768: + </para> + <programlisting> + dmesg -s 32768 > /tmp/dmesg-output.txt + </programlisting> + </sect1> + </chapter> + + <chapter id="bugs"> + <title>Known Bugs And Assumptions</title> + <para> + <variablelist> + <varlistentry><term>Low volume</term> + <listitem> + <para> + Volume too low on many systems. Workaround: use mixer program + such as xmixer to increase volume. + </para> + </listitem></varlistentry> + + </variablelist> + + </para> + </chapter> + + <chapter id="thanks"> + <title>Thanks</title> + <para> + Via for providing e-mail support, specs, and NDA'd source code. + </para> + <para> + MandrakeSoft for providing hacking time. + </para> + <para> + AC97 mixer interface fixes and debugging by Ron Cemer <email>roncemer@gte.net</email>. + </para> + <para> + Rui Sousa <email>rui.sousa@conexant.com</email>, for bugfixing + MMAP support, and several other notable fixes that resulted from + his hard work and testing. + </para> + <para> + Adrian Cox <email>adrian@humboldt.co.uk</email>, for bugfixing + MMAP support, and several other notable fixes that resulted from + his hard work and testing. + </para> + <para> + Thomas Sailer for further bugfixes. + </para> + </chapter> + + <chapter id="notes"> + <title>Random Notes</title> + <para> + Two /proc pseudo-files provide diagnostic information. This is generally + not useful to most users. Power users can disable CONFIG_SOUND_VIA82CXXX_PROCFS, + and remove the /proc support code. Once + version 2.0.0 is released, the /proc support code will be disabled by + default. Available /proc pseudo-files: + </para> + <programlisting> + /proc/driver/via/0/info + /proc/driver/via/0/ac97 + </programlisting> + <para> + This driver by default supports all PCI audio devices which report + a vendor id of 0x1106, and a device id of 0x3058. Subsystem vendor + and device ids are not examined. + </para> + <para> + GNU indent formatting options: + <programlisting> +-kr -i8 -ts8 -br -ce -bap -sob -l80 -pcs -cs -ss -bs -di1 -nbc -lp -psl + </programlisting> + </para> + <para> + Via has graciously donated e-mail support and source code to help further + the development of this driver. Their assistance has been invaluable + in the design and coding of the next major version of this driver. + </para> + <para> + The Via audio chip apparently provides a second PCM scatter-gather + DMA channel just for FM data, but does not have a full hardware MIDI + processor. I haven't put much thought towards a solution here, but it + might involve using SoftOSS midi wave table, or simply disabling MIDI + support altogether and using the FM PCM channel as a second (input? output?) + </para> + </chapter> + + <chapter id="changelog"> + <title>Driver ChangeLog</title> + +<sect1 id="version191"><title> +Version 1.9.1 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + DSP read/write bugfixes from Thomas Sailer. + </para> + </listitem> + + <listitem> + <para> + Add new PCI id for single-channel use of Via 8233. + </para> + </listitem> + + <listitem> + <para> + Other bug fixes, tweaks, new ioctls. + </para> + </listitem> + + </itemizedlist> +</sect1> + +<sect1 id="version1115"><title> +Version 1.1.15 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Support for variable fragment size and variable fragment number (Rui + Sousa) + </para> + </listitem> + + <listitem> + <para> + Fixes for the SPEED, STEREO, CHANNELS, FMT ioctls when in read & + write mode (Rui Sousa) + </para> + </listitem> + + <listitem> + <para> + Mmaped sound is now fully functional. (Rui Sousa) + </para> + </listitem> + + <listitem> + <para> + Make sure to enable PCI device before reading any of its PCI + config information. (fixes potential hotplug problems) + </para> + </listitem> + + <listitem> + <para> + Clean up code a bit and add more internal function documentation. + </para> + </listitem> + + <listitem> + <para> + AC97 codec access fixes (Adrian Cox) + </para> + </listitem> + + <listitem> + <para> + Big endian fixes (Adrian Cox) + </para> + </listitem> + + <listitem> + <para> + MIDI support (Adrian Cox) + </para> + </listitem> + + <listitem> + <para> + Detect and report locked-rate AC97 codecs. If your hardware only + supports 48Khz (locked rate), then your recording/playback software + must upsample or downsample accordingly. The hardware cannot do it. + </para> + </listitem> + + <listitem> + <para> + Use new pci_request_regions and pci_disable_device functions in + kernel 2.4.6. + </para> + </listitem> + + </itemizedlist> +</sect1> + +<sect1 id="version1114"><title> +Version 1.1.14 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Use VM_RESERVE when available, to eliminate unnecessary page faults. + </para> + </listitem> + </itemizedlist> +</sect1> + +<sect1 id="version1112"><title> +Version 1.1.12 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + mmap bug fixes from Linus. + </para> + </listitem> + </itemizedlist> +</sect1> + +<sect1 id="version1111"><title> +Version 1.1.11 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Many more bug fixes. mmap enabled by default, but may still be buggy. + </para> + </listitem> + + <listitem> + <para> + Uses new and spiffy method of mmap'ing the DMA buffer, based + on a suggestion from Linus. + </para> + </listitem> + </itemizedlist> +</sect1> + +<sect1 id="version1110"><title> +Version 1.1.10 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Many bug fixes. mmap enabled by default, but may still be buggy. + </para> + </listitem> + </itemizedlist> +</sect1> + +<sect1 id="version119"><title> +Version 1.1.9 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Redesign and rewrite audio playback implementation. (faster and smaller, hopefully) + </para> + </listitem> + + <listitem> + <para> + Implement recording and full duplex (DSP_CAP_DUPLEX) support. + </para> + </listitem> + + <listitem> + <para> + Make procfs support optional. + </para> + </listitem> + + <listitem> + <para> + Quick interrupt status check, to lessen overhead in interrupt + sharing situations. + </para> + </listitem> + + <listitem> + <para> + Add mmap(2) support. Disabled for now, it is still buggy and experimental. + </para> + </listitem> + + <listitem> + <para> + Surround all syscalls with a semaphore for cheap and easy SMP protection. + </para> + </listitem> + + <listitem> + <para> + Fix bug in channel shutdown (hardware channel reset) code. + </para> + </listitem> + + <listitem> + <para> + Remove unnecessary spinlocks (better performance). + </para> + </listitem> + + <listitem> + <para> + Eliminate "unknown AFMT" message by using a different method + of selecting the best AFMT_xxx sound sample format for use. + </para> + </listitem> + + <listitem> + <para> + Support for realtime hardware pointer position reporting + (DSP_CAP_REALTIME, SNDCTL_DSP_GETxPTR ioctls) + </para> + </listitem> + + <listitem> + <para> + Support for capture/playback triggering + (DSP_CAP_TRIGGER, SNDCTL_DSP_SETTRIGGER ioctls) + </para> + </listitem> + + <listitem> + <para> + SNDCTL_DSP_SETDUPLEX and SNDCTL_DSP_POST ioctls now handled. + </para> + </listitem> + + <listitem> + <para> + Rewrite open(2) and close(2) logic to allow only one user at + a time. All other open(2) attempts will sleep until they succeed. + FIXME: open(O_RDONLY) and open(O_WRONLY) should be allowed to succeed. + </para> + </listitem> + + <listitem> + <para> + Reviewed code to ensure that SMP and multiple audio devices + are fully supported. + </para> + </listitem> + + </itemizedlist> +</sect1> + +<sect1 id="version118"><title> +Version 1.1.8 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Clean up interrupt handler output. Fixes the following kernel error message: + </para> + <programlisting> + unhandled interrupt ... + </programlisting> + </listitem> + + <listitem> + <para> + Convert documentation to DocBook, so that PDF, HTML and PostScript (.ps) output is readily + available. + </para> + </listitem> + + </itemizedlist> +</sect1> + +<sect1 id="version117"><title> +Version 1.1.7 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Fix module unload bug where mixer device left registered + after driver exit + </para> + </listitem> + </itemizedlist> +</sect1> + +<sect1 id="version116"><title> +Version 1.1.6 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Rewrite via_set_rate to mimic ALSA basic AC97 rate setting + </para> + </listitem> + <listitem> + <para> + Remove much dead code + </para> + </listitem> + <listitem> + <para> + Complete spin_lock_irqsave -> spin_lock_irq conversion in via_dsp_ioctl + </para> + </listitem> + <listitem> + <para> + Fix build problem in via_dsp_ioctl + </para> + </listitem> + <listitem> + <para> + Optimize included headers to eliminate headers found in linux/sound + </para> + </listitem> + </itemizedlist> +</sect1> + +<sect1 id="version115"><title> +Version 1.1.5 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Disable some overly-verbose debugging code + </para> + </listitem> + <listitem> + <para> + Remove unnecessary sound locks + </para> + </listitem> + <listitem> + <para> + Fix some ioctls for better time resolution + </para> + </listitem> + <listitem> + <para> + Begin spin_lock_irqsave -> spin_lock_irq conversion in via_dsp_ioctl + </para> + </listitem> + </itemizedlist> +</sect1> + +<sect1 id="version114"><title> +Version 1.1.4 +</title> + <itemizedlist spacing="compact"> + <listitem> + <para> + Completed rewrite of driver. Eliminated SoundBlaster compatibility + completely, and now uses the much-faster scatter-gather DMA engine. + </para> + </listitem> + </itemizedlist> +</sect1> + + </chapter> + + <chapter id="intfunctions"> + <title>Internal Functions</title> +!Isound/oss/via82cxxx_audio.c + </chapter> + +</book> + + diff --git a/Documentation/DocBook/videobook.tmpl b/Documentation/DocBook/videobook.tmpl new file mode 100644 index 000000000000..3ec6c875588a --- /dev/null +++ b/Documentation/DocBook/videobook.tmpl @@ -0,0 +1,1663 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="V4LGuide"> + <bookinfo> + <title>Video4Linux Programming</title> + + <authorgroup> + <author> + <firstname>Alan</firstname> + <surname>Cox</surname> + <affiliation> + <address> + <email>alan@redhat.com</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2000</year> + <holder>Alan Cox</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + Parts of this document first appeared in Linux Magazine under a + ninety day exclusivity. + </para> + <para> + Video4Linux is intended to provide a common programming interface + for the many TV and capture cards now on the market, as well as + parallel port and USB video cameras. Radio, teletext decoders and + vertical blanking data interfaces are also provided. + </para> + </chapter> + <chapter id="radio"> + <title>Radio Devices</title> + <para> + There are a wide variety of radio interfaces available for PC's, and these + are generally very simple to program. The biggest problem with supporting + such devices is normally extracting documentation from the vendor. + </para> + <para> + The radio interface supports a simple set of control ioctls standardised + across all radio and tv interfaces. It does not support read or write, which + are used for video streams. The reason radio cards do not allow you to read + the audio stream into an application is that without exception they provide + a connection on to a soundcard. Soundcards can be used to read the radio + data just fine. + </para> + <sect1 id="registerradio"> + <title>Registering Radio Devices</title> + <para> + The Video4linux core provides an interface for registering devices. The + first step in writing our radio card driver is to register it. + </para> + <programlisting> + + +static struct video_device my_radio +{ + "My radio", + VID_TYPE_TUNER, + VID_HARDWARE_MYRADIO, + radio_open. + radio_close, + NULL, /* no read */ + NULL, /* no write */ + NULL, /* no poll */ + radio_ioctl, + NULL, /* no special init function */ + NULL /* no private data */ +}; + + + </programlisting> + <para> + This declares our video4linux device driver interface. The VID_TYPE_ value + defines what kind of an interface we are, and defines basic capabilities. + </para> + <para> + The only defined value relevant for a radio card is VID_TYPE_TUNER which + indicates that the device can be tuned. Clearly our radio is going to have some + way to change channel so it is tuneable. + </para> + <para> + The VID_HARDWARE_ types are unique to each device. Numbers are assigned by + <email>alan@redhat.com</email> when device drivers are going to be released. Until then you + can pull a suitably large number out of your hat and use it. 10000 should be + safe for a very long time even allowing for the huge number of vendors + making new and different radio cards at the moment. + </para> + <para> + We declare an open and close routine, but we do not need read or write, + which are used to read and write video data to or from the card itself. As + we have no read or write there is no poll function. + </para> + <para> + The private initialise function is run when the device is registered. In + this driver we've already done all the work needed. The final pointer is a + private data pointer that can be used by the device driver to attach and + retrieve private data structures. We set this field "priv" to NULL for + the moment. + </para> + <para> + Having the structure defined is all very well but we now need to register it + with the kernel. + </para> + <programlisting> + + +static int io = 0x320; + +int __init myradio_init(struct video_init *v) +{ + if(!request_region(io, MY_IO_SIZE, "myradio")) + { + printk(KERN_ERR + "myradio: port 0x%03X is in use.\n", io); + return -EBUSY; + } + + if(video_device_register(&my_radio, VFL_TYPE_RADIO)==-1) { + release_region(io, MY_IO_SIZE); + return -EINVAL; + } + return 0; +} + + </programlisting> + <para> + The first stage of the initialisation, as is normally the case, is to check + that the I/O space we are about to fiddle with doesn't belong to some other + driver. If it is we leave well alone. If the user gives the address of the + wrong device then we will spot this. These policies will generally avoid + crashing the machine. + </para> + <para> + Now we ask the Video4Linux layer to register the device for us. We hand it + our carefully designed video_device structure and also tell it which group + of devices we want it registered with. In this case VFL_TYPE_RADIO. + </para> + <para> + The types available are + </para> + <table frame="all"><title>Device Types</title> + <tgroup cols="3" align="left"> + <tbody> + <row> + <entry>VFL_TYPE_RADIO</entry><entry>/dev/radio{n}</entry><entry> + + Radio devices are assigned in this block. As with all of these + selections the actual number assignment is done by the video layer + accordijng to what is free.</entry> + </row><row> + <entry>VFL_TYPE_GRABBER</entry><entry>/dev/video{n}</entry><entry> + Video capture devices and also -- counter-intuitively for the name -- + hardware video playback devices such as MPEG2 cards.</entry> + </row><row> + <entry>VFL_TYPE_VBI</entry><entry>/dev/vbi{n}</entry><entry> + The VBI devices capture the hidden lines on a television picture + that carry further information like closed caption data, teletext + (primarily in Europe) and now Intercast and the ATVEC internet + television encodings.</entry> + </row><row> + <entry>VFL_TYPE_VTX</entry><entry>/dev/vtx[n}</entry><entry> + VTX is 'Videotext' also known as 'Teletext'. This is a system for + sending numbered, 40x25, mostly textual page images over the hidden + lines. Unlike the /dev/vbi interfaces, this is for 'smart' decoder + chips. (The use of the word smart here has to be taken in context, + the smartest teletext chips are fairly dumb pieces of technology). + </entry> + </row> + </tbody> + </tgroup> + </table> + <para> + We are most definitely a radio. + </para> + <para> + Finally we allocate our I/O space so that nobody treads on us and return 0 + to signify general happiness with the state of the universe. + </para> + </sect1> + <sect1 id="openradio"> + <title>Opening And Closing The Radio</title> + + <para> + The functions we declared in our video_device are mostly very simple. + Firstly we can drop in what is basically standard code for open and close. + </para> + <programlisting> + + +static int users = 0; + +static int radio_open(stuct video_device *dev, int flags) +{ + if(users) + return -EBUSY; + users++; + return 0; +} + + </programlisting> + <para> + At open time we need to do nothing but check if someone else is also using + the radio card. If nobody is using it we make a note that we are using it, + then we ensure that nobody unloads our driver on us. + </para> + <programlisting> + + +static int radio_close(struct video_device *dev) +{ + users--; +} + + </programlisting> + <para> + At close time we simply need to reduce the user count and allow the module + to become unloadable. + </para> + <para> + If you are sharp you will have noticed neither the open nor the close + routines attempt to reset or change the radio settings. This is intentional. + It allows an application to set up the radio and exit. It avoids a user + having to leave an application running all the time just to listen to the + radio. + </para> + </sect1> + <sect1 id="ioctlradio"> + <title>The Ioctl Interface</title> + <para> + This leaves the ioctl routine, without which the driver will not be + terribly useful to anyone. + </para> + <programlisting> + + +static int radio_ioctl(struct video_device *dev, unsigned int cmd, void *arg) +{ + switch(cmd) + { + case VIDIOCGCAP: + { + struct video_capability v; + v.type = VID_TYPE_TUNER; + v.channels = 1; + v.audios = 1; + v.maxwidth = 0; + v.minwidth = 0; + v.maxheight = 0; + v.minheight = 0; + strcpy(v.name, "My Radio"); + if(copy_to_user(arg, &v, sizeof(v))) + return -EFAULT; + return 0; + } + + </programlisting> + <para> + VIDIOCGCAP is the first ioctl all video4linux devices must support. It + allows the applications to find out what sort of a card they have found and + to figure out what they want to do about it. The fields in the structure are + </para> + <table frame="all"><title>struct video_capability fields</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>name</entry><entry>The device text name. This is intended for the user.</entry> + </row><row> + <entry>channels</entry><entry>The number of different channels you can tune on + this card. It could even by zero for a card that has + no tuning capability. For our simple FM radio it is 1. + An AM/FM radio would report 2.</entry> + </row><row> + <entry>audios</entry><entry>The number of audio inputs on this device. For our + radio there is only one audio input.</entry> + </row><row> + <entry>minwidth,minheight</entry><entry>The smallest size the card is capable of capturing + images in. We set these to zero. Radios do not + capture pictures</entry> + </row><row> + <entry>maxwidth,maxheight</entry><entry>The largest image size the card is capable of + capturing. For our radio we report 0. + </entry> + </row><row> + <entry>type</entry><entry>This reports the capabilities of the device, and + matches the field we filled in in the struct + video_device when registering.</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + Having filled in the fields, we use copy_to_user to copy the structure into + the users buffer. If the copy fails we return an EFAULT to the application + so that it knows it tried to feed us garbage. + </para> + <para> + The next pair of ioctl operations select which tuner is to be used and let + the application find the tuner properties. We have only a single FM band + tuner in our example device. + </para> + <programlisting> + + + case VIDIOCGTUNER: + { + struct video_tuner v; + if(copy_from_user(&v, arg, sizeof(v))!=0) + return -EFAULT; + if(v.tuner) + return -EINVAL; + v.rangelow=(87*16000); + v.rangehigh=(108*16000); + v.flags = VIDEO_TUNER_LOW; + v.mode = VIDEO_MODE_AUTO; + v.signal = 0xFFFF; + strcpy(v.name, "FM"); + if(copy_to_user(&v, arg, sizeof(v))!=0) + return -EFAULT; + return 0; + } + + </programlisting> + <para> + The VIDIOCGTUNER ioctl allows applications to query a tuner. The application + sets the tuner field to the tuner number it wishes to query. The query does + not change the tuner that is being used, it merely enquires about the tuner + in question. + </para> + <para> + We have exactly one tuner so after copying the user buffer to our temporary + structure we complain if they asked for a tuner other than tuner 0. + </para> + <para> + The video_tuner structure has the following fields + </para> + <table frame="all"><title>struct video_tuner fields</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>int tuner</entry><entry>The number of the tuner in question</entry> + </row><row> + <entry>char name[32]</entry><entry>A text description of this tuner. "FM" will do fine. + This is intended for the application.</entry> + </row><row> + <entry>u32 flags</entry> + <entry>Tuner capability flags</entry> + </row> + <row> + <entry>u16 mode</entry><entry>The current reception mode</entry> + + </row><row> + <entry>u16 signal</entry><entry>The signal strength scaled between 0 and 65535. If + a device cannot tell the signal strength it should + report 65535. Many simple cards contain only a + signal/no signal bit. Such cards will report either + 0 or 65535.</entry> + + </row><row> + <entry>u32 rangelow, rangehigh</entry><entry> + The range of frequencies supported by the radio + or TV. It is scaled according to the VIDEO_TUNER_LOW + flag.</entry> + + </row> + </tbody> + </tgroup> + </table> + + <table frame="all"><title>struct video_tuner flags</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>VIDEO_TUNER_PAL</entry><entry>A PAL TV tuner</entry> + </row><row> + <entry>VIDEO_TUNER_NTSC</entry><entry>An NTSC (US) TV tuner</entry> + </row><row> + <entry>VIDEO_TUNER_SECAM</entry><entry>A SECAM (French) TV tuner</entry> + </row><row> + <entry>VIDEO_TUNER_LOW</entry><entry> + The tuner frequency is scaled in 1/16th of a KHz + steps. If not it is in 1/16th of a MHz steps + </entry> + </row><row> + <entry>VIDEO_TUNER_NORM</entry><entry>The tuner can set its format</entry> + </row><row> + <entry>VIDEO_TUNER_STEREO_ON</entry><entry>The tuner is currently receiving a stereo signal</entry> + </row> + </tbody> + </tgroup> + </table> + + <table frame="all"><title>struct video_tuner modes</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>VIDEO_MODE_PAL</entry><entry>PAL Format</entry> + </row><row> + <entry>VIDEO_MODE_NTSC</entry><entry>NTSC Format (USA)</entry> + </row><row> + <entry>VIDEO_MODE_SECAM</entry><entry>French Format</entry> + </row><row> + <entry>VIDEO_MODE_AUTO</entry><entry>A device that does not need to do + TV format switching</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + The settings for the radio card are thus fairly simple. We report that we + are a tuner called "FM" for FM radio. In order to get the best tuning + resolution we report VIDEO_TUNER_LOW and select tuning to 1/16th of KHz. Its + unlikely our card can do that resolution but it is a fair bet the card can + do better than 1/16th of a MHz. VIDEO_TUNER_LOW is appropriate to almost all + radio usage. + </para> + <para> + We report that the tuner automatically handles deciding what format it is + receiving - true enough as it only handles FM radio. Our example card is + also incapable of detecting stereo or signal strengths so it reports a + strength of 0xFFFF (maximum) and no stereo detected. + </para> + <para> + To finish off we set the range that can be tuned to be 87-108Mhz, the normal + FM broadcast radio range. It is important to find out what the card is + actually capable of tuning. It is easy enough to simply use the FM broadcast + range. Unfortunately if you do this you will discover the FM broadcast + ranges in the USA, Europe and Japan are all subtly different and some users + cannot receive all the stations they wish. + </para> + <para> + The application also needs to be able to set the tuner it wishes to use. In + our case, with a single tuner this is rather simple to arrange. + </para> + <programlisting> + + case VIDIOCSTUNER: + { + struct video_tuner v; + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.tuner != 0) + return -EINVAL; + return 0; + } + + </programlisting> + <para> + We copy the user supplied structure into kernel memory so we can examine it. + If the user has selected a tuner other than zero we reject the request. If + they wanted tuner 0 then, surprisingly enough, that is the current tuner already. + </para> + <para> + The next two ioctls we need to provide are to get and set the frequency of + the radio. These both use an unsigned long argument which is the frequency. + The scale of the frequency depends on the VIDEO_TUNER_LOW flag as I + mentioned earlier on. Since we have VIDEO_TUNER_LOW set this will be in + 1/16ths of a KHz. + </para> + <programlisting> + +static unsigned long current_freq; + + + + case VIDIOCGFREQ: + if(copy_to_user(arg, &current_freq, + sizeof(unsigned long)) + return -EFAULT; + return 0; + + </programlisting> + <para> + Querying the frequency in our case is relatively simple. Our radio card is + too dumb to let us query the signal strength so we remember our setting if + we know it. All we have to do is copy it to the user. + </para> + <programlisting> + + + case VIDIOCSFREQ: + { + u32 freq; + if(copy_from_user(arg, &freq, + sizeof(unsigned long))!=0) + return -EFAULT; + if(hardware_set_freq(freq)<0) + return -EINVAL; + current_freq = freq; + return 0; + } + + </programlisting> + <para> + Setting the frequency is a little more complex. We begin by copying the + desired frequency into kernel space. Next we call a hardware specific routine + to set the radio up. This might be as simple as some scaling and a few + writes to an I/O port. For most radio cards it turns out a good deal more + complicated and may involve programming things like a phase locked loop on + the card. This is what documentation is for. + </para> + <para> + The final set of operations we need to provide for our radio are the + volume controls. Not all radio cards can even do volume control. After all + there is a perfectly good volume control on the sound card. We will assume + our radio card has a simple 4 step volume control. + </para> + <para> + There are two ioctls with audio we need to support + </para> + <programlisting> + +static int current_volume=0; + + case VIDIOCGAUDIO: + { + struct video_audio v; + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.audio != 0) + return -EINVAL; + v.volume = 16384*current_volume; + v.step = 16384; + strcpy(v.name, "Radio"); + v.mode = VIDEO_SOUND_MONO; + v.balance = 0; + v.base = 0; + v.treble = 0; + + if(copy_to_user(arg. &v, sizeof(v))) + return -EFAULT; + return 0; + } + + </programlisting> + <para> + Much like the tuner we start by copying the user structure into kernel + space. Again we check if the user has asked for a valid audio input. We have + only input 0 and we punt if they ask for another input. + </para> + <para> + Then we fill in the video_audio structure. This has the following format + </para> + <table frame="all"><title>struct video_audio fields</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>audio</entry><entry>The input the user wishes to query</entry> + </row><row> + <entry>volume</entry><entry>The volume setting on a scale of 0-65535</entry> + </row><row> + <entry>base</entry><entry>The base level on a scale of 0-65535</entry> + </row><row> + <entry>treble</entry><entry>The treble level on a scale of 0-65535</entry> + </row><row> + <entry>flags</entry><entry>The features this audio device supports + </entry> + </row><row> + <entry>name</entry><entry>A text name to display to the user. We picked + "Radio" as it explains things quite nicely.</entry> + </row><row> + <entry>mode</entry><entry>The current reception mode for the audio + + We report MONO because our card is too stupid to know if it is in + mono or stereo. + </entry> + </row><row> + <entry>balance</entry><entry>The stereo balance on a scale of 0-65535, 32768 is + middle.</entry> + </row><row> + <entry>step</entry><entry>The step by which the volume control jumps. This is + used to help make it easy for applications to set + slider behaviour.</entry> + </row> + </tbody> + </tgroup> + </table> + + <table frame="all"><title>struct video_audio flags</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>VIDEO_AUDIO_MUTE</entry><entry>The audio is currently muted. We + could fake this in our driver but we + choose not to bother.</entry> + </row><row> + <entry>VIDEO_AUDIO_MUTABLE</entry><entry>The input has a mute option</entry> + </row><row> + <entry>VIDEO_AUDIO_TREBLE</entry><entry>The input has a treble control</entry> + </row><row> + <entry>VIDEO_AUDIO_BASS</entry><entry>The input has a base control</entry> + </row> + </tbody> + </tgroup> + </table> + + <table frame="all"><title>struct video_audio modes</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>VIDEO_SOUND_MONO</entry><entry>Mono sound</entry> + </row><row> + <entry>VIDEO_SOUND_STEREO</entry><entry>Stereo sound</entry> + </row><row> + <entry>VIDEO_SOUND_LANG1</entry><entry>Alternative language 1 (TV specific)</entry> + </row><row> + <entry>VIDEO_SOUND_LANG2</entry><entry>Alternative language 2 (TV specific)</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + Having filled in the structure we copy it back to user space. + </para> + <para> + The VIDIOCSAUDIO ioctl allows the user to set the audio parameters in the + video_audio structure. The driver does its best to honour the request. + </para> + <programlisting> + + case VIDIOCSAUDIO: + { + struct video_audio v; + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.audio) + return -EINVAL; + current_volume = v/16384; + hardware_set_volume(current_volume); + return 0; + } + + </programlisting> + <para> + In our case there is very little that the user can set. The volume is + basically the limit. Note that we could pretend to have a mute feature + by rewriting this to + </para> + <programlisting> + + case VIDIOCSAUDIO: + { + struct video_audio v; + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.audio) + return -EINVAL; + current_volume = v/16384; + if(v.flags&VIDEO_AUDIO_MUTE) + hardware_set_volume(0); + else + hardware_set_volume(current_volume); + current_muted = v.flags & + VIDEO_AUDIO_MUTE; + return 0; + } + + </programlisting> + <para> + This with the corresponding changes to the VIDIOCGAUDIO code to report the + state of the mute flag we save and to report the card has a mute function, + will allow applications to use a mute facility with this card. It is + questionable whether this is a good idea however. User applications can already + fake this themselves and kernel space is precious. + </para> + <para> + We now have a working radio ioctl handler. So we just wrap up the function + </para> + <programlisting> + + + } + return -ENOIOCTLCMD; +} + + </programlisting> + <para> + and pass the Video4Linux layer back an error so that it knows we did not + understand the request we got passed. + </para> + </sect1> + <sect1 id="modradio"> + <title>Module Wrapper</title> + <para> + Finally we add in the usual module wrapping and the driver is done. + </para> + <programlisting> + +#ifndef MODULE + +static int io = 0x300; + +#else + +static int io = -1; + +#endif + +MODULE_AUTHOR("Alan Cox"); +MODULE_DESCRIPTION("A driver for an imaginary radio card."); +module_param(io, int, 0444); +MODULE_PARM_DESC(io, "I/O address of the card."); + +static int __init init(void) +{ + if(io==-1) + { + printk(KERN_ERR + "You must set an I/O address with io=0x???\n"); + return -EINVAL; + } + return myradio_init(NULL); +} + +static void __exit cleanup(void) +{ + video_unregister_device(&my_radio); + release_region(io, MY_IO_SIZE); +} + +module_init(init); +module_exit(cleanup); + + </programlisting> + <para> + In this example we set the IO base by default if the driver is compiled into + the kernel: you can still set it using "my_radio.irq" if this file is called <filename>my_radio.c</filename>. For the module we require the + user sets the parameter. We set io to a nonsense port (-1) so that we can + tell if the user supplied an io parameter or not. + </para> + <para> + We use MODULE_ defines to give an author for the card driver and a + description. We also use them to declare that io is an integer and it is the + address of the card, and can be read by anyone from sysfs. + </para> + <para> + The clean-up routine unregisters the video_device we registered, and frees + up the I/O space. Note that the unregister takes the actual video_device + structure as its argument. Unlike the file operations structure which can be + shared by all instances of a device a video_device structure as an actual + instance of the device. If you are registering multiple radio devices you + need to fill in one structure per device (most likely by setting up a + template and copying it to each of the actual device structures). + </para> + </sect1> + </chapter> + <chapter> + <title>Video Capture Devices</title> + <sect1 id="introvid"> + <title>Video Capture Device Types</title> + <para> + The video capture devices share the same interfaces as radio devices. In + order to explain the video capture interface I will use the example of a + camera that has no tuners or audio input. This keeps the example relatively + clean. To get both combine the two driver examples. + </para> + <para> + Video capture devices divide into four categories. A little technology + backgrounder. Full motion video even at television resolution (which is + actually fairly low) is pretty resource-intensive. You are continually + passing megabytes of data every second from the capture card to the display. + several alternative approaches have emerged because copying this through the + processor and the user program is a particularly bad idea . + </para> + <para> + The first is to add the television image onto the video output directly. + This is also how some 3D cards work. These basic cards can generally drop the + video into any chosen rectangle of the display. Cards like this, which + include most mpeg1 cards that used the feature connector, aren't very + friendly in a windowing environment. They don't understand windows or + clipping. The video window is always on the top of the display. + </para> + <para> + Chroma keying is a technique used by cards to get around this. It is an old + television mixing trick where you mark all the areas you wish to replace + with a single clear colour that isn't used in the image - TV people use an + incredibly bright blue while computing people often use a particularly + virulent purple. Bright blue occurs on the desktop. Anyone with virulent + purple windows has another problem besides their TV overlay. + </para> + <para> + The third approach is to copy the data from the capture card to the video + card, but to do it directly across the PCI bus. This relieves the processor + from doing the work but does require some smartness on the part of the video + capture chip, as well as a suitable video card. Programming this kind of + card and more so debugging it can be extremely tricky. There are some quite + complicated interactions with the display and you may also have to cope with + various chipset bugs that show up when PCI cards start talking to each + other. + </para> + <para> + To keep our example fairly simple we will assume a card that supports + overlaying a flat rectangular image onto the frame buffer output, and which + can also capture stuff into processor memory. + </para> + </sect1> + <sect1 id="regvid"> + <title>Registering Video Capture Devices</title> + <para> + This time we need to add more functions for our camera device. + </para> + <programlisting> +static struct video_device my_camera +{ + "My Camera", + VID_TYPE_OVERLAY|VID_TYPE_SCALES|\ + VID_TYPE_CAPTURE|VID_TYPE_CHROMAKEY, + VID_HARDWARE_MYCAMERA, + camera_open. + camera_close, + camera_read, /* no read */ + NULL, /* no write */ + camera_poll, /* no poll */ + camera_ioctl, + NULL, /* no special init function */ + NULL /* no private data */ +}; + </programlisting> + <para> + We need a read() function which is used for capturing data from + the card, and we need a poll function so that a driver can wait for the next + frame to be captured. + </para> + <para> + We use the extra video capability flags that did not apply to the + radio interface. The video related flags are + </para> + <table frame="all"><title>Capture Capabilities</title> + <tgroup cols="2" align="left"> + <tbody> + <row> +<entry>VID_TYPE_CAPTURE</entry><entry>We support image capture</entry> +</row><row> +<entry>VID_TYPE_TELETEXT</entry><entry>A teletext capture device (vbi{n])</entry> +</row><row> +<entry>VID_TYPE_OVERLAY</entry><entry>The image can be directly overlaid onto the + frame buffer</entry> +</row><row> +<entry>VID_TYPE_CHROMAKEY</entry><entry>Chromakey can be used to select which parts + of the image to display</entry> +</row><row> +<entry>VID_TYPE_CLIPPING</entry><entry>It is possible to give the board a list of + rectangles to draw around. </entry> +</row><row> +<entry>VID_TYPE_FRAMERAM</entry><entry>The video capture goes into the video memory + and actually changes it. Applications need + to know this so they can clean up after the + card</entry> +</row><row> +<entry>VID_TYPE_SCALES</entry><entry>The image can be scaled to various sizes, + rather than being a single fixed size.</entry> +</row><row> +<entry>VID_TYPE_MONOCHROME</entry><entry>The capture will be monochrome. This isn't a + complete answer to the question since a mono + camera on a colour capture card will still + produce mono output.</entry> +</row><row> +<entry>VID_TYPE_SUBCAPTURE</entry><entry>The card allows only part of its field of + view to be captured. This enables + applications to avoid copying all of a large + image into memory when only some section is + relevant.</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + We set VID_TYPE_CAPTURE so that we are seen as a capture card, + VID_TYPE_CHROMAKEY so the application knows it is time to draw in virulent + purple, and VID_TYPE_SCALES because we can be resized. + </para> + <para> + Our setup is fairly similar. This time we also want an interrupt line + for the 'frame captured' signal. Not all cards have this so some of them + cannot handle poll(). + </para> + <programlisting> + + +static int io = 0x320; +static int irq = 11; + +int __init mycamera_init(struct video_init *v) +{ + if(!request_region(io, MY_IO_SIZE, "mycamera")) + { + printk(KERN_ERR + "mycamera: port 0x%03X is in use.\n", io); + return -EBUSY; + } + + if(video_device_register(&my_camera, + VFL_TYPE_GRABBER)==-1) { + release_region(io, MY_IO_SIZE); + return -EINVAL; + } + return 0; +} + + </programlisting> + <para> + This is little changed from the needs of the radio card. We specify + VFL_TYPE_GRABBER this time as we want to be allocated a /dev/video name. + </para> + </sect1> + <sect1 id="opvid"> + <title>Opening And Closing The Capture Device</title> + <programlisting> + + +static int users = 0; + +static int camera_open(stuct video_device *dev, int flags) +{ + if(users) + return -EBUSY; + if(request_irq(irq, camera_irq, 0, "camera", dev)<0) + return -EBUSY; + users++; + return 0; +} + + +static int camera_close(struct video_device *dev) +{ + users--; + free_irq(irq, dev); +} + </programlisting> + <para> + The open and close routines are also quite similar. The only real change is + that we now request an interrupt for the camera device interrupt line. If we + cannot get the interrupt we report EBUSY to the application and give up. + </para> + </sect1> + <sect1 id="irqvid"> + <title>Interrupt Handling</title> + <para> + Our example handler is for an ISA bus device. If it was PCI you would be + able to share the interrupt and would have set SA_SHIRQ to indicate a + shared IRQ. We pass the device pointer as the interrupt routine argument. We + don't need to since we only support one card but doing this will make it + easier to upgrade the driver for multiple devices in the future. + </para> + <para> + Our interrupt routine needs to do little if we assume the card can simply + queue one frame to be read after it captures it. + </para> + <programlisting> + + +static struct wait_queue *capture_wait; +static int capture_ready = 0; + +static void camera_irq(int irq, void *dev_id, + struct pt_regs *regs) +{ + capture_ready=1; + wake_up_interruptible(&capture_wait); +} + </programlisting> + <para> + The interrupt handler is nice and simple for this card as we are assuming + the card is buffering the frame for us. This means we have little to do but + wake up anybody interested. We also set a capture_ready flag, as we may + capture a frame before an application needs it. In this case we need to know + that a frame is ready. If we had to collect the frame on the interrupt life + would be more complex. + </para> + <para> + The two new routines we need to supply are camera_read which returns a + frame, and camera_poll which waits for a frame to become ready. + </para> + <programlisting> + + +static int camera_poll(struct video_device *dev, + struct file *file, struct poll_table *wait) +{ + poll_wait(file, &capture_wait, wait); + if(capture_read) + return POLLIN|POLLRDNORM; + return 0; +} + + </programlisting> + <para> + Our wait queue for polling is the capture_wait queue. This will cause the + task to be woken up by our camera_irq routine. We check capture_read to see + if there is an image present and if so report that it is readable. + </para> + </sect1> + <sect1 id="rdvid"> + <title>Reading The Video Image</title> + <programlisting> + + +static long camera_read(struct video_device *dev, char *buf, + unsigned long count) +{ + struct wait_queue wait = { current, NULL }; + u8 *ptr; + int len; + int i; + + add_wait_queue(&capture_wait, &wait); + + while(!capture_ready) + { + if(file->flags&O_NDELAY) + { + remove_wait_queue(&capture_wait, &wait); + current->state = TASK_RUNNING; + return -EWOULDBLOCK; + } + if(signal_pending(current)) + { + remove_wait_queue(&capture_wait, &wait); + current->state = TASK_RUNNING; + return -ERESTARTSYS; + } + schedule(); + current->state = TASK_INTERRUPTIBLE; + } + remove_wait_queue(&capture_wait, &wait); + current->state = TASK_RUNNING; + + </programlisting> + <para> + The first thing we have to do is to ensure that the application waits until + the next frame is ready. The code here is almost identical to the mouse code + we used earlier in this chapter. It is one of the common building blocks of + Linux device driver code and probably one which you will find occurs in any + drivers you write. + </para> + <para> + We wait for a frame to be ready, or for a signal to interrupt our waiting. If a + signal occurs we need to return from the system call so that the signal can + be sent to the application itself. We also check to see if the user actually + wanted to avoid waiting - ie if they are using non-blocking I/O and have other things + to get on with. + </para> + <para> + Next we copy the data from the card to the user application. This is rarely + as easy as our example makes out. We will add capture_w, and capture_h here + to hold the width and height of the captured image. We assume the card only + supports 24bit RGB for now. + </para> + <programlisting> + + + + capture_ready = 0; + + ptr=(u8 *)buf; + len = capture_w * 3 * capture_h; /* 24bit RGB */ + + if(len>count) + len=count; /* Doesn't all fit */ + + for(i=0; i<len; i++) + { + put_user(inb(io+IMAGE_DATA), ptr); + ptr++; + } + + hardware_restart_capture(); + + return i; +} + + </programlisting> + <para> + For a real hardware device you would try to avoid the loop with put_user(). + Each call to put_user() has a time overhead checking whether the accesses to user + space are allowed. It would be better to read a line into a temporary buffer + then copy this to user space in one go. + </para> + <para> + Having captured the image and put it into user space we can kick the card to + get the next frame acquired. + </para> + </sect1> + <sect1 id="iocvid"> + <title>Video Ioctl Handling</title> + <para> + As with the radio driver the major control interface is via the ioctl() + function. Video capture devices support the same tuner calls as a radio + device and also support additional calls to control how the video functions + are handled. In this simple example the card has no tuners to avoid making + the code complex. + </para> + <programlisting> + + + +static int camera_ioctl(struct video_device *dev, unsigned int cmd, void *arg) +{ + switch(cmd) + { + case VIDIOCGCAP: + { + struct video_capability v; + v.type = VID_TYPE_CAPTURE|\ + VID_TYPE_CHROMAKEY|\ + VID_TYPE_SCALES|\ + VID_TYPE_OVERLAY; + v.channels = 1; + v.audios = 0; + v.maxwidth = 640; + v.minwidth = 16; + v.maxheight = 480; + v.minheight = 16; + strcpy(v.name, "My Camera"); + if(copy_to_user(arg, &v, sizeof(v))) + return -EFAULT; + return 0; + } + + + </programlisting> + <para> + The first ioctl we must support and which all video capture and radio + devices are required to support is VIDIOCGCAP. This behaves exactly the same + as with a radio device. This time, however, we report the extra capabilities + we outlined earlier on when defining our video_dev structure. + </para> + <para> + We now set the video flags saying that we support overlay, capture, + scaling and chromakey. We also report size limits - our smallest image is + 16x16 pixels, our largest is 640x480. + </para> + <para> + To keep things simple we report no audio and no tuning capabilities at all. + </para> + <programlisting> + + case VIDIOCGCHAN: + { + struct video_channel v; + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.channel != 0) + return -EINVAL; + v.flags = 0; + v.tuners = 0; + v.type = VIDEO_TYPE_CAMERA; + v.norm = VIDEO_MODE_AUTO; + strcpy(v.name, "Camera Input");break; + if(copy_to_user(&v, arg, sizeof(v))) + return -EFAULT; + return 0; + } + + + </programlisting> + <para> + This follows what is very much the standard way an ioctl handler looks + in Linux. We copy the data into a kernel space variable and we check that the + request is valid (in this case that the input is 0). Finally we copy the + camera info back to the user. + </para> + <para> + The VIDIOCGCHAN ioctl allows a user to ask about video channels (that is + inputs to the video card). Our example card has a single camera input. The + fields in the structure are + </para> + <table frame="all"><title>struct video_channel fields</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + + <entry>channel</entry><entry>The channel number we are selecting</entry> + </row><row> + <entry>name</entry><entry>The name for this channel. This is intended + to describe the port to the user. + Appropriate names are therefore things like + "Camera" "SCART input"</entry> + </row><row> + <entry>flags</entry><entry>Channel properties</entry> + </row><row> + <entry>type</entry><entry>Input type</entry> + </row><row> + <entry>norm</entry><entry>The current television encoding being used + if relevant for this channel. + </entry> + </row> + </tbody> + </tgroup> + </table> + <table frame="all"><title>struct video_channel flags</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>VIDEO_VC_TUNER</entry><entry>Channel has a tuner.</entry> + </row><row> + <entry>VIDEO_VC_AUDIO</entry><entry>Channel has audio.</entry> + </row> + </tbody> + </tgroup> + </table> + <table frame="all"><title>struct video_channel types</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>VIDEO_TYPE_TV</entry><entry>Television input.</entry> + </row><row> + <entry>VIDEO_TYPE_CAMERA</entry><entry>Fixed camera input.</entry> + </row><row> + <entry>0</entry><entry>Type is unknown.</entry> + </row> + </tbody> + </tgroup> + </table> + <table frame="all"><title>struct video_channel norms</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>VIDEO_MODE_PAL</entry><entry>PAL encoded Television</entry> + </row><row> + <entry>VIDEO_MODE_NTSC</entry><entry>NTSC (US) encoded Television</entry> + </row><row> + <entry>VIDEO_MODE_SECAM</entry><entry>SECAM (French) Television </entry> + </row><row> + <entry>VIDEO_MODE_AUTO</entry><entry>Automatic switching, or format does not + matter</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + The corresponding VIDIOCSCHAN ioctl allows a user to change channel and to + request the norm is changed - for example to switch between a PAL or an NTSC + format camera. + </para> + <programlisting> + + + case VIDIOCSCHAN: + { + struct video_channel v; + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.channel != 0) + return -EINVAL; + if(v.norm != VIDEO_MODE_AUTO) + return -EINVAL; + return 0; + } + + + </programlisting> + <para> + The implementation of this call in our driver is remarkably easy. Because we + are assuming fixed format hardware we need only check that the user has not + tried to change anything. + </para> + <para> + The user also needs to be able to configure and adjust the picture they are + seeing. This is much like adjusting a television set. A user application + also needs to know the palette being used so that it knows how to display + the image that has been captured. The VIDIOCGPICT and VIDIOCSPICT ioctl + calls provide this information. + </para> + <programlisting> + + + case VIDIOCGPICT + { + struct video_picture v; + v.brightness = hardware_brightness(); + v.hue = hardware_hue(); + v.colour = hardware_saturation(); + v.contrast = hardware_brightness(); + /* Not settable */ + v.whiteness = 32768; + v.depth = 24; /* 24bit */ + v.palette = VIDEO_PALETTE_RGB24; + if(copy_to_user(&v, arg, + sizeof(v))) + return -EFAULT; + return 0; + } + + + </programlisting> + <para> + The brightness, hue, color, and contrast provide the picture controls that + are akin to a conventional television. Whiteness provides additional + control for greyscale images. All of these values are scaled between 0-65535 + and have 32768 as the mid point setting. The scaling means that applications + do not have to worry about the capability range of the hardware but can let + it make a best effort attempt. + </para> + <para> + Our depth is 24, as this is in bits. We will be returning RGB24 format. This + has one byte of red, then one of green, then one of blue. This then repeats + for every other pixel in the image. The other common formats the interface + defines are + </para> + <table frame="all"><title>Framebuffer Encodings</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>GREY</entry><entry>Linear greyscale. This is for simple cameras and the + like</entry> + </row><row> + <entry>RGB565</entry><entry>The top 5 bits hold 32 red levels, the next six bits + hold green and the low 5 bits hold blue. </entry> + </row><row> + <entry>RGB555</entry><entry>The top bit is clear. The red green and blue levels + each occupy five bits.</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + Additional modes are support for YUV capture formats. These are common for + TV and video conferencing applications. + </para> + <para> + The VIDIOCSPICT ioctl allows a user to set some of the picture parameters. + Exactly which ones are supported depends heavily on the card itself. It is + possible to support many modes and effects in software. In general doing + this in the kernel is a bad idea. Video capture is a performance-sensitive + application and the programs can often do better if they aren't being + 'helped' by an overkeen driver writer. Thus for our device we will report + RGB24 only and refuse to allow a change. + </para> + <programlisting> + + + case VIDIOCSPICT: + { + struct video_picture v; + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.depth!=24 || + v.palette != VIDEO_PALETTE_RGB24) + return -EINVAL; + set_hardware_brightness(v.brightness); + set_hardware_hue(v.hue); + set_hardware_saturation(v.colour); + set_hardware_brightness(v.contrast); + return 0; + } + + + </programlisting> + <para> + We check the user has not tried to change the palette or the depth. We do + not want to carry out some of the changes and then return an error. This may + confuse the application which will be assuming no change occurred. + </para> + <para> + In much the same way as you need to be able to set the picture controls to + get the right capture images, many cards need to know what they are + displaying onto when generating overlay output. In some cases getting this + wrong even makes a nasty mess or may crash the computer. For that reason + the VIDIOCSBUF ioctl used to set up the frame buffer information may well + only be usable by root. + </para> + <para> + We will assume our card is one of the old ISA devices with feature connector + and only supports a couple of standard video modes. Very common for older + cards although the PCI devices are way smarter than this. + </para> + <programlisting> + + +static struct video_buffer capture_fb; + + case VIDIOCGFBUF: + { + if(copy_to_user(arg, &capture_fb, + sizeof(capture_fb))) + return -EFAULT; + return 0; + + } + + + </programlisting> + <para> + We keep the frame buffer information in the format the ioctl uses. This + makes it nice and easy to work with in the ioctl calls. + </para> + <programlisting> + + case VIDIOCSFBUF: + { + struct video_buffer v; + + if(!capable(CAP_SYS_ADMIN)) + return -EPERM; + + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.width!=320 && v.width!=640) + return -EINVAL; + if(v.height!=200 && v.height!=240 + && v.height!=400 + && v.height !=480) + return -EINVAL; + memcpy(&capture_fb, &v, sizeof(v)); + hardware_set_fb(&v); + return 0; + } + + + + </programlisting> + <para> + The capable() function checks a user has the required capability. The Linux + operating system has a set of about 30 capabilities indicating privileged + access to services. The default set up gives the superuser (uid 0) all of + them and nobody else has any. + </para> + <para> + We check that the user has the SYS_ADMIN capability, that is they are + allowed to operate as the machine administrator. We don't want anyone but + the administrator making a mess of the display. + </para> + <para> + Next we check for standard PC video modes (320 or 640 wide with either + EGA or VGA depths). If the mode is not a standard video mode we reject it as + not supported by our card. If the mode is acceptable we save it so that + VIDIOCFBUF will give the right answer next time it is called. The + hardware_set_fb() function is some undescribed card specific function to + program the card for the desired mode. + </para> + <para> + Before the driver can display an overlay window it needs to know where the + window should be placed, and also how large it should be. If the card + supports clipping it needs to know which rectangles to omit from the + display. The video_window structure is used to describe the way the image + should be displayed. + </para> + <table frame="all"><title>struct video_window fields</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>width</entry><entry>The width in pixels of the desired image. The card + may use a smaller size if this size is not available</entry> + </row><row> + <entry>height</entry><entry>The height of the image. The card may use a smaller + size if this size is not available.</entry> + </row><row> + <entry>x</entry><entry> The X position of the top left of the window. This + is in pixels relative to the left hand edge of the + picture. Not all cards can display images aligned on + any pixel boundary. If the position is unsuitable + the card adjusts the image right and reduces the + width.</entry> + </row><row> + <entry>y</entry><entry> The Y position of the top left of the window. This + is counted in pixels relative to the top edge of the + picture. As with the width if the card cannot + display starting on this line it will adjust the + values.</entry> + </row><row> + <entry>chromakey</entry><entry>The colour (expressed in RGB32 format) for the + chromakey colour if chroma keying is being used. </entry> + </row><row> + <entry>clips</entry><entry>An array of rectangles that must not be drawn + over.</entry> + </row><row> + <entry>clipcount</entry><entry>The number of clips in this array.</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + Each clip is a struct video_clip which has the following fields + </para> + <table frame="all"><title>video_clip fields</title> + <tgroup cols="2" align="left"> + <tbody> + <row> + <entry>x, y</entry><entry>Co-ordinates relative to the display</entry> + </row><row> + <entry>width, height</entry><entry>Width and height in pixels</entry> + </row><row> + <entry>next</entry><entry>A spare field for the application to use</entry> + </row> + </tbody> + </tgroup> + </table> + <para> + The driver is required to ensure it always draws in the area requested or a smaller area, and that it never draws in any of the areas that are clipped. + This may well mean it has to leave alone. small areas the application wished to be + drawn. + </para> + <para> + Our example card uses chromakey so does not have to address most of the + clipping. We will add a video_window structure to our global variables to + remember our parameters, as we did with the frame buffer. + </para> + <programlisting> + + + case VIDIOCGWIN: + { + if(copy_to_user(arg, &capture_win, + sizeof(capture_win))) + return -EFAULT; + return 0; + } + + + case VIDIOCSWIN: + { + struct video_window v; + if(copy_from_user(&v, arg, sizeof(v))) + return -EFAULT; + if(v.width > 640 || v.height > 480) + return -EINVAL; + if(v.width < 16 || v.height < 16) + return -EINVAL; + hardware_set_key(v.chromakey); + hardware_set_window(v); + memcpy(&capture_win, &v, sizeof(v)); + capture_w = v.width; + capture_h = v.height; + return 0; + } + + + </programlisting> + <para> + Because we are using Chromakey our setup is fairly simple. Mostly we have to + check the values are sane and load them into the capture card. + </para> + <para> + With all the setup done we can now turn on the actual capture/overlay. This + is done with the VIDIOCCAPTURE ioctl. This takes a single integer argument + where 0 is on and 1 is off. + </para> + <programlisting> + + + case VIDIOCCAPTURE: + { + int v; + if(get_user(v, (int *)arg)) + return -EFAULT; + if(v==0) + hardware_capture_off(); + else + { + if(capture_fb.width == 0 + || capture_w == 0) + return -EINVAL; + hardware_capture_on(); + } + return 0; + } + + + </programlisting> + <para> + We grab the flag from user space and either enable or disable according to + its value. There is one small corner case we have to consider here. Suppose + that the capture was requested before the video window or the frame buffer + had been set up. In those cases there will be unconfigured fields in our + card data, as well as unconfigured hardware settings. We check for this case and + return an error if the frame buffer or the capture window width is zero. + </para> + <programlisting> + + + default: + return -ENOIOCTLCMD; + } +} + </programlisting> + <para> + + We don't need to support any other ioctls, so if we get this far, it is time + to tell the video layer that we don't now what the user is talking about. + </para> + </sect1> + <sect1 id="endvid"> + <title>Other Functionality</title> + <para> + The Video4Linux layer supports additional features, including a high + performance mmap() based capture mode and capturing part of the image. + These features are out of the scope of the book. You should however have enough + example code to implement most simple video4linux devices for radio and TV + cards. + </para> + </sect1> + </chapter> + <chapter id="bugs"> + <title>Known Bugs And Assumptions</title> + <para> + <variablelist> + <varlistentry><term>Multiple Opens</term> + <listitem> + <para> + The driver assumes multiple opens should not be allowed. A driver + can work around this but not cleanly. + </para> + </listitem></varlistentry> + + <varlistentry><term>API Deficiencies</term> + <listitem> + <para> + The existing API poorly reflects compression capable devices. There + are plans afoot to merge V4L, V4L2 and some other ideas into a + better interface. + </para> + </listitem></varlistentry> + </variablelist> + + </para> + </chapter> + + <chapter id="pubfunctions"> + <title>Public Functions Provided</title> +!Edrivers/media/video/videodev.c + </chapter> + +</book> diff --git a/Documentation/DocBook/wanbook.tmpl b/Documentation/DocBook/wanbook.tmpl new file mode 100644 index 000000000000..9eebcc304de4 --- /dev/null +++ b/Documentation/DocBook/wanbook.tmpl @@ -0,0 +1,99 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="WANGuide"> + <bookinfo> + <title>Synchronous PPP and Cisco HDLC Programming Guide</title> + + <authorgroup> + <author> + <firstname>Alan</firstname> + <surname>Cox</surname> + <affiliation> + <address> + <email>alan@redhat.com</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2000</year> + <holder>Alan Cox</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + The syncppp drivers in Linux provide a fairly complete + implementation of Cisco HDLC and a minimal implementation of + PPP. The longer term goal is to switch the PPP layer to the + generic PPP interface that is new in Linux 2.3.x. The API should + remain unchanged when this is done, but support will then be + available for IPX, compression and other PPP features + </para> + </chapter> + <chapter id="bugs"> + <title>Known Bugs And Assumptions</title> + <para> + <variablelist> + <varlistentry><term>PPP is minimal</term> + <listitem> + <para> + The current PPP implementation is very basic, although sufficient + for most wan usages. + </para> + </listitem></varlistentry> + + <varlistentry><term>Cisco HDLC Quirks</term> + <listitem> + <para> + Currently we do not end all packets with the correct Cisco multicast + or unicast flags. Nothing appears to mind too much but this should + be corrected. + </para> + </listitem></varlistentry> + </variablelist> + + </para> + </chapter> + + <chapter id="pubfunctions"> + <title>Public Functions Provided</title> +!Edrivers/net/wan/syncppp.c + </chapter> + +</book> diff --git a/Documentation/DocBook/writing_usb_driver.tmpl b/Documentation/DocBook/writing_usb_driver.tmpl new file mode 100644 index 000000000000..51f3bfb6fb6e --- /dev/null +++ b/Documentation/DocBook/writing_usb_driver.tmpl @@ -0,0 +1,419 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="USBDeviceDriver"> + <bookinfo> + <title>Writing USB Device Drivers</title> + + <authorgroup> + <author> + <firstname>Greg</firstname> + <surname>Kroah-Hartman</surname> + <affiliation> + <address> + <email>greg@kroah.com</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2001-2002</year> + <holder>Greg Kroah-Hartman</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + + <para> + This documentation is based on an article published in + Linux Journal Magazine, October 2001, Issue 90. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + The Linux USB subsystem has grown from supporting only two different + types of devices in the 2.2.7 kernel (mice and keyboards), to over 20 + different types of devices in the 2.4 kernel. Linux currently supports + almost all USB class devices (standard types of devices like keyboards, + mice, modems, printers and speakers) and an ever-growing number of + vendor-specific devices (such as USB to serial converters, digital + cameras, Ethernet devices and MP3 players). For a full list of the + different USB devices currently supported, see Resources. + </para> + <para> + The remaining kinds of USB devices that do not have support on Linux are + almost all vendor-specific devices. Each vendor decides to implement a + custom protocol to talk to their device, so a custom driver usually needs + to be created. Some vendors are open with their USB protocols and help + with the creation of Linux drivers, while others do not publish them, and + developers are forced to reverse-engineer. See Resources for some links + to handy reverse-engineering tools. + </para> + <para> + Because each different protocol causes a new driver to be created, I have + written a generic USB driver skeleton, modeled after the pci-skeleton.c + file in the kernel source tree upon which many PCI network drivers have + been based. This USB skeleton can be found at drivers/usb/usb-skeleton.c + in the kernel source tree. In this article I will walk through the basics + of the skeleton driver, explaining the different pieces and what needs to + be done to customize it to your specific device. + </para> + </chapter> + + <chapter id="basics"> + <title>Linux USB Basics</title> + <para> + If you are going to write a Linux USB driver, please become familiar with + the USB protocol specification. It can be found, along with many other + useful documents, at the USB home page (see Resources). An excellent + introduction to the Linux USB subsystem can be found at the USB Working + Devices List (see Resources). It explains how the Linux USB subsystem is + structured and introduces the reader to the concept of USB urbs, which + are essential to USB drivers. + </para> + <para> + The first thing a Linux USB driver needs to do is register itself with + the Linux USB subsystem, giving it some information about which devices + the driver supports and which functions to call when a device supported + by the driver is inserted or removed from the system. All of this + information is passed to the USB subsystem in the usb_driver structure. + The skeleton driver declares a usb_driver as: + </para> + <programlisting> +static struct usb_driver skel_driver = { + .name = "skeleton", + .probe = skel_probe, + .disconnect = skel_disconnect, + .fops = &skel_fops, + .minor = USB_SKEL_MINOR_BASE, + .id_table = skel_table, +}; + </programlisting> + <para> + The variable name is a string that describes the driver. It is used in + informational messages printed to the system log. The probe and + disconnect function pointers are called when a device that matches the + information provided in the id_table variable is either seen or removed. + </para> + <para> + The fops and minor variables are optional. Most USB drivers hook into + another kernel subsystem, such as the SCSI, network or TTY subsystem. + These types of drivers register themselves with the other kernel + subsystem, and any user-space interactions are provided through that + interface. But for drivers that do not have a matching kernel subsystem, + such as MP3 players or scanners, a method of interacting with user space + is needed. The USB subsystem provides a way to register a minor device + number and a set of file_operations function pointers that enable this + user-space interaction. The skeleton driver needs this kind of interface, + so it provides a minor starting number and a pointer to its + file_operations functions. + </para> + <para> + The USB driver is then registered with a call to usb_register, usually in + the driver's init function, as shown here: + </para> + <programlisting> +static int __init usb_skel_init(void) +{ + int result; + + /* register this driver with the USB subsystem */ + result = usb_register(&skel_driver); + if (result < 0) { + err("usb_register failed for the "__FILE__ "driver." + "Error number %d", result); + return -1; + } + + return 0; +} +module_init(usb_skel_init); + </programlisting> + <para> + When the driver is unloaded from the system, it needs to unregister + itself with the USB subsystem. This is done with the usb_unregister + function: + </para> + <programlisting> +static void __exit usb_skel_exit(void) +{ + /* deregister this driver with the USB subsystem */ + usb_deregister(&skel_driver); +} +module_exit(usb_skel_exit); + </programlisting> + <para> + To enable the linux-hotplug system to load the driver automatically when + the device is plugged in, you need to create a MODULE_DEVICE_TABLE. The + following code tells the hotplug scripts that this module supports a + single device with a specific vendor and product ID: + </para> + <programlisting> +/* table of devices that work with this driver */ +static struct usb_device_id skel_table [] = { + { USB_DEVICE(USB_SKEL_VENDOR_ID, USB_SKEL_PRODUCT_ID) }, + { } /* Terminating entry */ +}; +MODULE_DEVICE_TABLE (usb, skel_table); + </programlisting> + <para> + There are other macros that can be used in describing a usb_device_id for + drivers that support a whole class of USB drivers. See usb.h for more + information on this. + </para> + </chapter> + + <chapter id="device"> + <title>Device operation</title> + <para> + When a device is plugged into the USB bus that matches the device ID + pattern that your driver registered with the USB core, the probe function + is called. The usb_device structure, interface number and the interface ID + are passed to the function: + </para> + <programlisting> +static int skel_probe(struct usb_interface *interface, + const struct usb_device_id *id) + </programlisting> + <para> + The driver now needs to verify that this device is actually one that it + can accept. If so, it returns 0. + If not, or if any error occurs during initialization, an errorcode + (such as <literal>-ENOMEM</literal> or <literal>-ENODEV</literal>) + is returned from the probe function. + </para> + <para> + In the skeleton driver, we determine what end points are marked as bulk-in + and bulk-out. We create buffers to hold the data that will be sent and + received from the device, and a USB urb to write data to the device is + initialized. + </para> + <para> + Conversely, when the device is removed from the USB bus, the disconnect + function is called with the device pointer. The driver needs to clean any + private data that has been allocated at this time and to shut down any + pending urbs that are in the USB system. The driver also unregisters + itself from the devfs subsystem with the call: + </para> + <programlisting> +/* remove our devfs node */ +devfs_unregister(skel->devfs); + </programlisting> + <para> + Now that the device is plugged into the system and the driver is bound to + the device, any of the functions in the file_operations structure that + were passed to the USB subsystem will be called from a user program trying + to talk to the device. The first function called will be open, as the + program tries to open the device for I/O. We increment our private usage + count and save off a pointer to our internal structure in the file + structure. This is done so that future calls to file operations will + enable the driver to determine which device the user is addressing. All + of this is done with the following code: + </para> + <programlisting> +/* increment our usage count for the module */ +++skel->open_count; + +/* save our object in the file's private structure */ +file->private_data = dev; + </programlisting> + <para> + After the open function is called, the read and write functions are called + to receive and send data to the device. In the skel_write function, we + receive a pointer to some data that the user wants to send to the device + and the size of the data. The function determines how much data it can + send to the device based on the size of the write urb it has created (this + size depends on the size of the bulk out end point that the device has). + Then it copies the data from user space to kernel space, points the urb to + the data and submits the urb to the USB subsystem. This can be shown in + he following code: + </para> + <programlisting> +/* we can only write as much as 1 urb will hold */ +bytes_written = (count > skel->bulk_out_size) ? skel->bulk_out_size : count; + +/* copy the data from user space into our urb */ +copy_from_user(skel->write_urb->transfer_buffer, buffer, bytes_written); + +/* set up our urb */ +usb_fill_bulk_urb(skel->write_urb, + skel->dev, + usb_sndbulkpipe(skel->dev, skel->bulk_out_endpointAddr), + skel->write_urb->transfer_buffer, + bytes_written, + skel_write_bulk_callback, + skel); + +/* send the data out the bulk port */ +result = usb_submit_urb(skel->write_urb); +if (result) { + err("Failed submitting write urb, error %d", result); +} + </programlisting> + <para> + When the write urb is filled up with the proper information using the + usb_fill_bulk_urb function, we point the urb's completion callback to call our + own skel_write_bulk_callback function. This function is called when the + urb is finished by the USB subsystem. The callback function is called in + interrupt context, so caution must be taken not to do very much processing + at that time. Our implementation of skel_write_bulk_callback merely + reports if the urb was completed successfully or not and then returns. + </para> + <para> + The read function works a bit differently from the write function in that + we do not use an urb to transfer data from the device to the driver. + Instead we call the usb_bulk_msg function, which can be used to send or + receive data from a device without having to create urbs and handle + urb completion callback functions. We call the usb_bulk_msg function, + giving it a buffer into which to place any data received from the device + and a timeout value. If the timeout period expires without receiving any + data from the device, the function will fail and return an error message. + This can be shown with the following code: + </para> + <programlisting> +/* do an immediate bulk read to get data from the device */ +retval = usb_bulk_msg (skel->dev, + usb_rcvbulkpipe (skel->dev, + skel->bulk_in_endpointAddr), + skel->bulk_in_buffer, + skel->bulk_in_size, + &count, HZ*10); +/* if the read was successful, copy the data to user space */ +if (!retval) { + if (copy_to_user (buffer, skel->bulk_in_buffer, count)) + retval = -EFAULT; + else + retval = count; +} + </programlisting> + <para> + The usb_bulk_msg function can be very useful for doing single reads or + writes to a device; however, if you need to read or write constantly to a + device, it is recommended to set up your own urbs and submit them to the + USB subsystem. + </para> + <para> + When the user program releases the file handle that it has been using to + talk to the device, the release function in the driver is called. In this + function we decrement our private usage count and wait for possible + pending writes: + </para> + <programlisting> +/* decrement our usage count for the device */ +--skel->open_count; + </programlisting> + <para> + One of the more difficult problems that USB drivers must be able to handle + smoothly is the fact that the USB device may be removed from the system at + any point in time, even if a program is currently talking to it. It needs + to be able to shut down any current reads and writes and notify the + user-space programs that the device is no longer there. The following + code (function <function>skel_delete</function>) + is an example of how to do this: </para> + <programlisting> +static inline void skel_delete (struct usb_skel *dev) +{ + if (dev->bulk_in_buffer != NULL) + kfree (dev->bulk_in_buffer); + if (dev->bulk_out_buffer != NULL) + usb_buffer_free (dev->udev, dev->bulk_out_size, + dev->bulk_out_buffer, + dev->write_urb->transfer_dma); + if (dev->write_urb != NULL) + usb_free_urb (dev->write_urb); + kfree (dev); +} + </programlisting> + <para> + If a program currently has an open handle to the device, we reset the flag + <literal>device_present</literal>. For + every read, write, release and other functions that expect a device to be + present, the driver first checks this flag to see if the device is + still present. If not, it releases that the device has disappeared, and a + -ENODEV error is returned to the user-space program. When the release + function is eventually called, it determines if there is no device + and if not, it does the cleanup that the skel_disconnect + function normally does if there are no open files on the device (see + Listing 5). + </para> + </chapter> + + <chapter id="iso"> + <title>Isochronous Data</title> + <para> + This usb-skeleton driver does not have any examples of interrupt or + isochronous data being sent to or from the device. Interrupt data is sent + almost exactly as bulk data is, with a few minor exceptions. Isochronous + data works differently with continuous streams of data being sent to or + from the device. The audio and video camera drivers are very good examples + of drivers that handle isochronous data and will be useful if you also + need to do this. + </para> + </chapter> + + <chapter id="Conclusion"> + <title>Conclusion</title> + <para> + Writing Linux USB device drivers is not a difficult task as the + usb-skeleton driver shows. This driver, combined with the other current + USB drivers, should provide enough examples to help a beginning author + create a working driver in a minimal amount of time. The linux-usb-devel + mailing list archives also contain a lot of helpful information. + </para> + </chapter> + + <chapter id="resources"> + <title>Resources</title> + <para> + The Linux USB Project: <ulink url="http://www.linux-usb.org">http://www.linux-usb.org/</ulink> + </para> + <para> + Linux Hotplug Project: <ulink url="http://linux-hotplug.sourceforge.net">http://linux-hotplug.sourceforge.net/</ulink> + </para> + <para> + Linux USB Working Devices List: <ulink url="http://www.qbik.ch/usb/devices">http://www.qbik.ch/usb/devices/</ulink> + </para> + <para> + linux-usb-devel Mailing List Archives: <ulink url="http://marc.theaimsgroup.com/?l=linux-usb-devel">http://marc.theaimsgroup.com/?l=linux-usb-devel</ulink> + </para> + <para> + Programming Guide for Linux USB Device Drivers: <ulink url="http://usb.cs.tum.edu/usbdoc">http://usb.cs.tum.edu/usbdoc</ulink> + </para> + <para> + USB Home Page: <ulink url="http://www.usb.org">http://www.usb.org</ulink> + </para> + </chapter> + +</book> diff --git a/Documentation/DocBook/z8530book.tmpl b/Documentation/DocBook/z8530book.tmpl new file mode 100644 index 000000000000..a507876447aa --- /dev/null +++ b/Documentation/DocBook/z8530book.tmpl @@ -0,0 +1,385 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []> + +<book id="Z85230Guide"> + <bookinfo> + <title>Z8530 Programming Guide</title> + + <authorgroup> + <author> + <firstname>Alan</firstname> + <surname>Cox</surname> + <affiliation> + <address> + <email>alan@redhat.com</email> + </address> + </affiliation> + </author> + </authorgroup> + + <copyright> + <year>2000</year> + <holder>Alan Cox</holder> + </copyright> + + <legalnotice> + <para> + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + </para> + + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + + <para> + For more details see the file COPYING in the source + distribution of Linux. + </para> + </legalnotice> + </bookinfo> + +<toc></toc> + + <chapter id="intro"> + <title>Introduction</title> + <para> + The Z85x30 family synchronous/asynchronous controller chips are + used on a large number of cheap network interface cards. The + kernel provides a core interface layer that is designed to make + it easy to provide WAN services using this chip. + </para> + <para> + The current driver only support synchronous operation. Merging the + asynchronous driver support into this code to allow any Z85x30 + device to be used as both a tty interface and as a synchronous + controller is a project for Linux post the 2.4 release + </para> + <para> + The support code handles most common card configurations and + supports running both Cisco HDLC and Synchronous PPP. With extra + glue the frame relay and X.25 protocols can also be used with this + driver. + </para> + </chapter> + + <chapter> + <title>Driver Modes</title> + <para> + The Z85230 driver layer can drive Z8530, Z85C30 and Z85230 devices + in three different modes. Each mode can be applied to an individual + channel on the chip (each chip has two channels). + </para> + <para> + The PIO synchronous mode supports the most common Z8530 wiring. Here + the chip is interface to the I/O and interrupt facilities of the + host machine but not to the DMA subsystem. When running PIO the + Z8530 has extremely tight timing requirements. Doing high speeds, + even with a Z85230 will be tricky. Typically you should expect to + achieve at best 9600 baud with a Z8C530 and 64Kbits with a Z85230. + </para> + <para> + The DMA mode supports the chip when it is configured to use dual DMA + channels on an ISA bus. The better cards tend to support this mode + of operation for a single channel. With DMA running the Z85230 tops + out when it starts to hit ISA DMA constraints at about 512Kbits. It + is worth noting here that many PC machines hang or crash when the + chip is driven fast enough to hold the ISA bus solid. + </para> + <para> + Transmit DMA mode uses a single DMA channel. The DMA channel is used + for transmission as the transmit FIFO is smaller than the receive + FIFO. it gives better performance than pure PIO mode but is nowhere + near as ideal as pure DMA mode. + </para> + </chapter> + + <chapter> + <title>Using the Z85230 driver</title> + <para> + The Z85230 driver provides the back end interface to your board. To + configure a Z8530 interface you need to detect the board and to + identify its ports and interrupt resources. It is also your problem + to verify the resources are available. + </para> + <para> + Having identified the chip you need to fill in a struct z8530_dev, + which describes each chip. This object must exist until you finally + shutdown the board. Firstly zero the active field. This ensures + nothing goes off without you intending it. The irq field should + be set to the interrupt number of the chip. (Each chip has a single + interrupt source rather than each channel). You are responsible + for allocating the interrupt line. The interrupt handler should be + set to <function>z8530_interrupt</function>. The device id should + be set to the z8530_dev structure pointer. Whether the interrupt can + be shared or not is board dependent, and up to you to initialise. + </para> + <para> + The structure holds two channel structures. + Initialise chanA.ctrlio and chanA.dataio with the address of the + control and data ports. You can or this with Z8530_PORT_SLEEP to + indicate your interface needs the 5uS delay for chip settling done + in software. The PORT_SLEEP option is architecture specific. Other + flags may become available on future platforms, eg for MMIO. + Initialise the chanA.irqs to &z8530_nop to start the chip up + as disabled and discarding interrupt events. This ensures that + stray interrupts will be mopped up and not hang the bus. Set + chanA.dev to point to the device structure itself. The + private and name field you may use as you wish. The private field + is unused by the Z85230 layer. The name is used for error reporting + and it may thus make sense to make it match the network name. + </para> + <para> + Repeat the same operation with the B channel if your chip has + both channels wired to something useful. This isn't always the + case. If it is not wired then the I/O values do not matter, but + you must initialise chanB.dev. + </para> + <para> + If your board has DMA facilities then initialise the txdma and + rxdma fields for the relevant channels. You must also allocate the + ISA DMA channels and do any necessary board level initialisation + to configure them. The low level driver will do the Z8530 and + DMA controller programming but not board specific magic. + </para> + <para> + Having initialised the device you can then call + <function>z8530_init</function>. This will probe the chip and + reset it into a known state. An identification sequence is then + run to identify the chip type. If the checks fail to pass the + function returns a non zero error code. Typically this indicates + that the port given is not valid. After this call the + type field of the z8530_dev structure is initialised to either + Z8530, Z85C30 or Z85230 according to the chip found. + </para> + <para> + Once you have called z8530_init you can also make use of the utility + function <function>z8530_describe</function>. This provides a + consistent reporting format for the Z8530 devices, and allows all + the drivers to provide consistent reporting. + </para> + </chapter> + + <chapter> + <title>Attaching Network Interfaces</title> + <para> + If you wish to use the network interface facilities of the driver, + then you need to attach a network device to each channel that is + present and in use. In addition to use the SyncPPP and Cisco HDLC + you need to follow some additional plumbing rules. They may seem + complex but a look at the example hostess_sv11 driver should + reassure you. + </para> + <para> + The network device used for each channel should be pointed to by + the netdevice field of each channel. The dev-> priv field of the + network device points to your private data - you will need to be + able to find your ppp device from this. In addition to use the + sync ppp layer the private data must start with a void * pointer + to the syncppp structures. + </para> + <para> + The way most drivers approach this particular problem is to + create a structure holding the Z8530 device definition and + put that and the syncppp pointer into the private field of + the network device. The network device fields of the channels + then point back to the network devices. The ppp_device can also + be put in the private structure conveniently. + </para> + <para> + If you wish to use the synchronous ppp then you need to attach + the syncppp layer to the network device. You should do this before + you register the network device. The + <function>sppp_attach</function> requires that the first void * + pointer in your private data is pointing to an empty struct + ppp_device. The function fills in the initial data for the + ppp/hdlc layer. + </para> + <para> + Before you register your network device you will also need to + provide suitable handlers for most of the network device callbacks. + See the network device documentation for more details on this. + </para> + </chapter> + + <chapter> + <title>Configuring And Activating The Port</title> + <para> + The Z85230 driver provides helper functions and tables to load the + port registers on the Z8530 chips. When programming the register + settings for a channel be aware that the documentation recommends + initialisation orders. Strange things happen when these are not + followed. + </para> + <para> + <function>z8530_channel_load</function> takes an array of + pairs of initialisation values in an array of u8 type. The first + value is the Z8530 register number. Add 16 to indicate the alternate + register bank on the later chips. The array is terminated by a 255. + </para> + <para> + The driver provides a pair of public tables. The + z8530_hdlc_kilostream table is for the UK 'Kilostream' service and + also happens to cover most other end host configurations. The + z8530_hdlc_kilostream_85230 table is the same configuration using + the enhancements of the 85230 chip. The configuration loaded is + standard NRZ encoded synchronous data with HDLC bitstuffing. All + of the timing is taken from the other end of the link. + </para> + <para> + When writing your own tables be aware that the driver internally + tracks register values. It may need to reload values. You should + therefore be sure to set registers 1-7, 9-11, 14 and 15 in all + configurations. Where the register settings depend on DMA selection + the driver will update the bits itself when you open or close. + Loading a new table with the interface open is not recommended. + </para> + <para> + There are three standard configurations supported by the core + code. In PIO mode the interface is programmed up to use + interrupt driven PIO. This places high demands on the host processor + to avoid latency. The driver is written to take account of latency + issues but it cannot avoid latencies caused by other drivers, + notably IDE in PIO mode. Because the drivers allocate buffers you + must also prevent MTU changes while the port is open. + </para> + <para> + Once the port is open it will call the rx_function of each channel + whenever a completed packet arrived. This is invoked from + interrupt context and passes you the channel and a network + buffer (struct sk_buff) holding the data. The data includes + the CRC bytes so most users will want to trim the last two + bytes before processing the data. This function is very timing + critical. When you wish to simply discard data the support + code provides the function <function>z8530_null_rx</function> + to discard the data. + </para> + <para> + To active PIO mode sending and receiving the <function> + z8530_sync_open</function> is called. This expects to be passed + the network device and the channel. Typically this is called from + your network device open callback. On a failure a non zero error + status is returned. The <function>z8530_sync_close</function> + function shuts down a PIO channel. This must be done before the + channel is opened again and before the driver shuts down + and unloads. + </para> + <para> + The ideal mode of operation is dual channel DMA mode. Here the + kernel driver will configure the board for DMA in both directions. + The driver also handles ISA DMA issues such as controller + programming and the memory range limit for you. This mode is + activated by calling the <function>z8530_sync_dma_open</function> + function. On failure a non zero error value is returned. + Once this mode is activated it can be shut down by calling the + <function>z8530_sync_dma_close</function>. You must call the close + function matching the open mode you used. + </para> + <para> + The final supported mode uses a single DMA channel to drive the + transmit side. As the Z85C30 has a larger FIFO on the receive + channel this tends to increase the maximum speed a little. + This is activated by calling the <function>z8530_sync_txdma_open + </function>. This returns a non zero error code on failure. The + <function>z8530_sync_txdma_close</function> function closes down + the Z8530 interface from this mode. + </para> + </chapter> + + <chapter> + <title>Network Layer Functions</title> + <para> + The Z8530 layer provides functions to queue packets for + transmission. The driver internally buffers the frame currently + being transmitted and one further frame (in order to keep back + to back transmission running). Any further buffering is up to + the caller. + </para> + <para> + The function <function>z8530_queue_xmit</function> takes a network + buffer in sk_buff format and queues it for transmission. The + caller must provide the entire packet with the exception of the + bitstuffing and CRC. This is normally done by the caller via + the syncppp interface layer. It returns 0 if the buffer has been + queued and non zero values for queue full. If the function accepts + the buffer it becomes property of the Z8530 layer and the caller + should not free it. + </para> + <para> + The function <function>z8530_get_stats</function> returns a pointer + to an internally maintained per interface statistics block. This + provides most of the interface code needed to implement the network + layer get_stats callback. + </para> + </chapter> + + <chapter> + <title>Porting The Z8530 Driver</title> + <para> + The Z8530 driver is written to be portable. In DMA mode it makes + assumptions about the use of ISA DMA. These are probably warranted + in most cases as the Z85230 in particular was designed to glue to PC + type machines. The PIO mode makes no real assumptions. + </para> + <para> + Should you need to retarget the Z8530 driver to another architecture + the only code that should need changing are the port I/O functions. + At the moment these assume PC I/O port accesses. This may not be + appropriate for all platforms. Replacing + <function>z8530_read_port</function> and <function>z8530_write_port + </function> is intended to be all that is required to port this + driver layer. + </para> + </chapter> + + <chapter id="bugs"> + <title>Known Bugs And Assumptions</title> + <para> + <variablelist> + <varlistentry><term>Interrupt Locking</term> + <listitem> + <para> + The locking in the driver is done via the global cli/sti lock. This + makes for relatively poor SMP performance. Switching this to use a + per device spin lock would probably materially improve performance. + </para> + </listitem></varlistentry> + + <varlistentry><term>Occasional Failures</term> + <listitem> + <para> + We have reports of occasional failures when run for very long + periods of time and the driver starts to receive junk frames. At + the moment the cause of this is not clear. + </para> + </listitem></varlistentry> + </variablelist> + + </para> + </chapter> + + <chapter id="pubfunctions"> + <title>Public Functions Provided</title> +!Edrivers/net/wan/z85230.c + </chapter> + + <chapter id="intfunctions"> + <title>Internal Functions</title> +!Idrivers/net/wan/z85230.c + </chapter> + +</book> diff --git a/Documentation/IO-mapping.txt b/Documentation/IO-mapping.txt new file mode 100644 index 000000000000..86edb61bdee6 --- /dev/null +++ b/Documentation/IO-mapping.txt @@ -0,0 +1,208 @@ +[ NOTE: The virt_to_bus() and bus_to_virt() functions have been + superseded by the functionality provided by the PCI DMA + interface (see Documentation/DMA-mapping.txt). They continue + to be documented below for historical purposes, but new code + must not use them. --davidm 00/12/12 ] + +[ This is a mail message in response to a query on IO mapping, thus the + strange format for a "document" ] + +The AHA-1542 is a bus-master device, and your patch makes the driver give the +controller the physical address of the buffers, which is correct on x86 +(because all bus master devices see the physical memory mappings directly). + +However, on many setups, there are actually _three_ different ways of looking +at memory addresses, and in this case we actually want the third, the +so-called "bus address". + +Essentially, the three ways of addressing memory are (this is "real memory", +that is, normal RAM--see later about other details): + + - CPU untranslated. This is the "physical" address. Physical address + 0 is what the CPU sees when it drives zeroes on the memory bus. + + - CPU translated address. This is the "virtual" address, and is + completely internal to the CPU itself with the CPU doing the appropriate + translations into "CPU untranslated". + + - bus address. This is the address of memory as seen by OTHER devices, + not the CPU. Now, in theory there could be many different bus + addresses, with each device seeing memory in some device-specific way, but + happily most hardware designers aren't actually actively trying to make + things any more complex than necessary, so you can assume that all + external hardware sees the memory the same way. + +Now, on normal PCs the bus address is exactly the same as the physical +address, and things are very simple indeed. However, they are that simple +because the memory and the devices share the same address space, and that is +not generally necessarily true on other PCI/ISA setups. + +Now, just as an example, on the PReP (PowerPC Reference Platform), the +CPU sees a memory map something like this (this is from memory): + + 0-2 GB "real memory" + 2 GB-3 GB "system IO" (inb/out and similar accesses on x86) + 3 GB-4 GB "IO memory" (shared memory over the IO bus) + +Now, that looks simple enough. However, when you look at the same thing from +the viewpoint of the devices, you have the reverse, and the physical memory +address 0 actually shows up as address 2 GB for any IO master. + +So when the CPU wants any bus master to write to physical memory 0, it +has to give the master address 0x80000000 as the memory address. + +So, for example, depending on how the kernel is actually mapped on the +PPC, you can end up with a setup like this: + + physical address: 0 + virtual address: 0xC0000000 + bus address: 0x80000000 + +where all the addresses actually point to the same thing. It's just seen +through different translations.. + +Similarly, on the Alpha, the normal translation is + + physical address: 0 + virtual address: 0xfffffc0000000000 + bus address: 0x40000000 + +(but there are also Alphas where the physical address and the bus address +are the same). + +Anyway, the way to look up all these translations, you do + + #include <asm/io.h> + + phys_addr = virt_to_phys(virt_addr); + virt_addr = phys_to_virt(phys_addr); + bus_addr = virt_to_bus(virt_addr); + virt_addr = bus_to_virt(bus_addr); + +Now, when do you need these? + +You want the _virtual_ address when you are actually going to access that +pointer from the kernel. So you can have something like this: + + /* + * this is the hardware "mailbox" we use to communicate with + * the controller. The controller sees this directly. + */ + struct mailbox { + __u32 status; + __u32 bufstart; + __u32 buflen; + .. + } mbox; + + unsigned char * retbuffer; + + /* get the address from the controller */ + retbuffer = bus_to_virt(mbox.bufstart); + switch (retbuffer[0]) { + case STATUS_OK: + ... + +on the other hand, you want the bus address when you have a buffer that +you want to give to the controller: + + /* ask the controller to read the sense status into "sense_buffer" */ + mbox.bufstart = virt_to_bus(&sense_buffer); + mbox.buflen = sizeof(sense_buffer); + mbox.status = 0; + notify_controller(&mbox); + +And you generally _never_ want to use the physical address, because you can't +use that from the CPU (the CPU only uses translated virtual addresses), and +you can't use it from the bus master. + +So why do we care about the physical address at all? We do need the physical +address in some cases, it's just not very often in normal code. The physical +address is needed if you use memory mappings, for example, because the +"remap_pfn_range()" mm function wants the physical address of the memory to +be remapped as measured in units of pages, a.k.a. the pfn (the memory +management layer doesn't know about devices outside the CPU, so it +shouldn't need to know about "bus addresses" etc). + +NOTE NOTE NOTE! The above is only one part of the whole equation. The above +only talks about "real memory", that is, CPU memory (RAM). + +There is a completely different type of memory too, and that's the "shared +memory" on the PCI or ISA bus. That's generally not RAM (although in the case +of a video graphics card it can be normal DRAM that is just used for a frame +buffer), but can be things like a packet buffer in a network card etc. + +This memory is called "PCI memory" or "shared memory" or "IO memory" or +whatever, and there is only one way to access it: the readb/writeb and +related functions. You should never take the address of such memory, because +there is really nothing you can do with such an address: it's not +conceptually in the same memory space as "real memory" at all, so you cannot +just dereference a pointer. (Sadly, on x86 it _is_ in the same memory space, +so on x86 it actually works to just deference a pointer, but it's not +portable). + +For such memory, you can do things like + + - reading: + /* + * read first 32 bits from ISA memory at 0xC0000, aka + * C000:0000 in DOS terms + */ + unsigned int signature = isa_readl(0xC0000); + + - remapping and writing: + /* + * remap framebuffer PCI memory area at 0xFC000000, + * size 1MB, so that we can access it: We can directly + * access only the 640k-1MB area, so anything else + * has to be remapped. + */ + char * baseptr = ioremap(0xFC000000, 1024*1024); + + /* write a 'A' to the offset 10 of the area */ + writeb('A',baseptr+10); + + /* unmap when we unload the driver */ + iounmap(baseptr); + + - copying and clearing: + /* get the 6-byte Ethernet address at ISA address E000:0040 */ + memcpy_fromio(kernel_buffer, 0xE0040, 6); + /* write a packet to the driver */ + memcpy_toio(0xE1000, skb->data, skb->len); + /* clear the frame buffer */ + memset_io(0xA0000, 0, 0x10000); + +OK, that just about covers the basics of accessing IO portably. Questions? +Comments? You may think that all the above is overly complex, but one day you +might find yourself with a 500 MHz Alpha in front of you, and then you'll be +happy that your driver works ;) + +Note that kernel versions 2.0.x (and earlier) mistakenly called the +ioremap() function "vremap()". ioremap() is the proper name, but I +didn't think straight when I wrote it originally. People who have to +support both can do something like: + + /* support old naming silliness */ + #if LINUX_VERSION_CODE < 0x020100 + #define ioremap vremap + #define iounmap vfree + #endif + +at the top of their source files, and then they can use the right names +even on 2.0.x systems. + +And the above sounds worse than it really is. Most real drivers really +don't do all that complex things (or rather: the complexity is not so +much in the actual IO accesses as in error handling and timeouts etc). +It's generally not hard to fix drivers, and in many cases the code +actually looks better afterwards: + + unsigned long signature = *(unsigned int *) 0xC0000; + vs + unsigned long signature = readl(0xC0000); + +I think the second version actually is more readable, no? + + Linus + diff --git a/Documentation/IPMI.txt b/Documentation/IPMI.txt new file mode 100644 index 000000000000..90d10e708ca3 --- /dev/null +++ b/Documentation/IPMI.txt @@ -0,0 +1,534 @@ + + The Linux IPMI Driver + --------------------- + Corey Minyard + <minyard@mvista.com> + <minyard@acm.org> + +The Intelligent Platform Management Interface, or IPMI, is a +standard for controlling intelligent devices that monitor a system. +It provides for dynamic discovery of sensors in the system and the +ability to monitor the sensors and be informed when the sensor's +values change or go outside certain boundaries. It also has a +standardized database for field-replacable units (FRUs) and a watchdog +timer. + +To use this, you need an interface to an IPMI controller in your +system (called a Baseboard Management Controller, or BMC) and +management software that can use the IPMI system. + +This document describes how to use the IPMI driver for Linux. If you +are not familiar with IPMI itself, see the web site at +http://www.intel.com/design/servers/ipmi/index.htm. IPMI is a big +subject and I can't cover it all here! + +Configuration +------------- + +The LinuxIPMI driver is modular, which means you have to pick several +things to have it work right depending on your hardware. Most of +these are available in the 'Character Devices' menu. + +No matter what, you must pick 'IPMI top-level message handler' to use +IPMI. What you do beyond that depends on your needs and hardware. + +The message handler does not provide any user-level interfaces. +Kernel code (like the watchdog) can still use it. If you need access +from userland, you need to select 'Device interface for IPMI' if you +want access through a device driver. Another interface is also +available, you may select 'IPMI sockets' in the 'Networking Support' +main menu. This provides a socket interface to IPMI. You may select +both of these at the same time, they will both work together. + +The driver interface depends on your hardware. If you have a board +with a standard interface (These will generally be either "KCS", +"SMIC", or "BT", consult your hardware manual), choose the 'IPMI SI +handler' option. A driver also exists for direct I2C access to the +IPMI management controller. Some boards support this, but it is +unknown if it will work on every board. For this, choose 'IPMI SMBus +handler', but be ready to try to do some figuring to see if it will +work. + +There is also a KCS-only driver interface supplied, but it is +depracated in favor of the SI interface. + +You should generally enable ACPI on your system, as systems with IPMI +should have ACPI tables describing them. + +If you have a standard interface and the board manufacturer has done +their job correctly, the IPMI controller should be automatically +detect (via ACPI or SMBIOS tables) and should just work. Sadly, many +boards do not have this information. The driver attempts standard +defaults, but they may not work. If you fall into this situation, you +need to read the section below named 'The SI Driver' on how to +hand-configure your system. + +IPMI defines a standard watchdog timer. You can enable this with the +'IPMI Watchdog Timer' config option. If you compile the driver into +the kernel, then via a kernel command-line option you can have the +watchdog timer start as soon as it intitializes. It also have a lot +of other options, see the 'Watchdog' section below for more details. +Note that you can also have the watchdog continue to run if it is +closed (by default it is disabled on close). Go into the 'Watchdog +Cards' menu, enable 'Watchdog Timer Support', and enable the option +'Disable watchdog shutdown on close'. + + +Basic Design +------------ + +The Linux IPMI driver is designed to be very modular and flexible, you +only need to take the pieces you need and you can use it in many +different ways. Because of that, it's broken into many chunks of +code. These chunks are: + +ipmi_msghandler - This is the central piece of software for the IPMI +system. It handles all messages, message timing, and responses. The +IPMI users tie into this, and the IPMI physical interfaces (called +System Management Interfaces, or SMIs) also tie in here. This +provides the kernelland interface for IPMI, but does not provide an +interface for use by application processes. + +ipmi_devintf - This provides a userland IOCTL interface for the IPMI +driver, each open file for this device ties in to the message handler +as an IPMI user. + +ipmi_si - A driver for various system interfaces. This supports +KCS, SMIC, and may support BT in the future. Unless you have your own +custom interface, you probably need to use this. + +ipmi_smb - A driver for accessing BMCs on the SMBus. It uses the +I2C kernel driver's SMBus interfaces to send and receive IPMI messages +over the SMBus. + +af_ipmi - A network socket interface to IPMI. This doesn't take up +a character device in your system. + +Note that the KCS-only interface ahs been removed. + +Much documentation for the interface is in the include files. The +IPMI include files are: + +net/af_ipmi.h - Contains the socket interface. + +linux/ipmi.h - Contains the user interface and IOCTL interface for IPMI. + +linux/ipmi_smi.h - Contains the interface for system management interfaces +(things that interface to IPMI controllers) to use. + +linux/ipmi_msgdefs.h - General definitions for base IPMI messaging. + + +Addressing +---------- + +The IPMI addressing works much like IP addresses, you have an overlay +to handle the different address types. The overlay is: + + struct ipmi_addr + { + int addr_type; + short channel; + char data[IPMI_MAX_ADDR_SIZE]; + }; + +The addr_type determines what the address really is. The driver +currently understands two different types of addresses. + +"System Interface" addresses are defined as: + + struct ipmi_system_interface_addr + { + int addr_type; + short channel; + }; + +and the type is IPMI_SYSTEM_INTERFACE_ADDR_TYPE. This is used for talking +straight to the BMC on the current card. The channel must be +IPMI_BMC_CHANNEL. + +Messages that are destined to go out on the IPMB bus use the +IPMI_IPMB_ADDR_TYPE address type. The format is + + struct ipmi_ipmb_addr + { + int addr_type; + short channel; + unsigned char slave_addr; + unsigned char lun; + }; + +The "channel" here is generally zero, but some devices support more +than one channel, it corresponds to the channel as defined in the IPMI +spec. + + +Messages +-------- + +Messages are defined as: + +struct ipmi_msg +{ + unsigned char netfn; + unsigned char lun; + unsigned char cmd; + unsigned char *data; + int data_len; +}; + +The driver takes care of adding/stripping the header information. The +data portion is just the data to be send (do NOT put addressing info +here) or the response. Note that the completion code of a response is +the first item in "data", it is not stripped out because that is how +all the messages are defined in the spec (and thus makes counting the +offsets a little easier :-). + +When using the IOCTL interface from userland, you must provide a block +of data for "data", fill it, and set data_len to the length of the +block of data, even when receiving messages. Otherwise the driver +will have no place to put the message. + +Messages coming up from the message handler in kernelland will come in +as: + + struct ipmi_recv_msg + { + struct list_head link; + + /* The type of message as defined in the "Receive Types" + defines above. */ + int recv_type; + + ipmi_user_t *user; + struct ipmi_addr addr; + long msgid; + struct ipmi_msg msg; + + /* Call this when done with the message. It will presumably free + the message and do any other necessary cleanup. */ + void (*done)(struct ipmi_recv_msg *msg); + + /* Place-holder for the data, don't make any assumptions about + the size or existence of this, since it may change. */ + unsigned char msg_data[IPMI_MAX_MSG_LENGTH]; + }; + +You should look at the receive type and handle the message +appropriately. + + +The Upper Layer Interface (Message Handler) +------------------------------------------- + +The upper layer of the interface provides the users with a consistent +view of the IPMI interfaces. It allows multiple SMI interfaces to be +addressed (because some boards actually have multiple BMCs on them) +and the user should not have to care what type of SMI is below them. + + +Creating the User + +To user the message handler, you must first create a user using +ipmi_create_user. The interface number specifies which SMI you want +to connect to, and you must supply callback functions to be called +when data comes in. The callback function can run at interrupt level, +so be careful using the callbacks. This also allows to you pass in a +piece of data, the handler_data, that will be passed back to you on +all calls. + +Once you are done, call ipmi_destroy_user() to get rid of the user. + +From userland, opening the device automatically creates a user, and +closing the device automatically destroys the user. + + +Messaging + +To send a message from kernel-land, the ipmi_request() call does +pretty much all message handling. Most of the parameter are +self-explanatory. However, it takes a "msgid" parameter. This is NOT +the sequence number of messages. It is simply a long value that is +passed back when the response for the message is returned. You may +use it for anything you like. + +Responses come back in the function pointed to by the ipmi_recv_hndl +field of the "handler" that you passed in to ipmi_create_user(). +Remember again, these may be running at interrupt level. Remember to +look at the receive type, too. + +From userland, you fill out an ipmi_req_t structure and use the +IPMICTL_SEND_COMMAND ioctl. For incoming stuff, you can use select() +or poll() to wait for messages to come in. However, you cannot use +read() to get them, you must call the IPMICTL_RECEIVE_MSG with the +ipmi_recv_t structure to actually get the message. Remember that you +must supply a pointer to a block of data in the msg.data field, and +you must fill in the msg.data_len field with the size of the data. +This gives the receiver a place to actually put the message. + +If the message cannot fit into the data you provide, you will get an +EMSGSIZE error and the driver will leave the data in the receive +queue. If you want to get it and have it truncate the message, us +the IPMICTL_RECEIVE_MSG_TRUNC ioctl. + +When you send a command (which is defined by the lowest-order bit of +the netfn per the IPMI spec) on the IPMB bus, the driver will +automatically assign the sequence number to the command and save the +command. If the response is not receive in the IPMI-specified 5 +seconds, it will generate a response automatically saying the command +timed out. If an unsolicited response comes in (if it was after 5 +seconds, for instance), that response will be ignored. + +In kernelland, after you receive a message and are done with it, you +MUST call ipmi_free_recv_msg() on it, or you will leak messages. Note +that you should NEVER mess with the "done" field of a message, that is +required to properly clean up the message. + +Note that when sending, there is an ipmi_request_supply_msgs() call +that lets you supply the smi and receive message. This is useful for +pieces of code that need to work even if the system is out of buffers +(the watchdog timer uses this, for instance). You supply your own +buffer and own free routines. This is not recommended for normal use, +though, since it is tricky to manage your own buffers. + + +Events and Incoming Commands + +The driver takes care of polling for IPMI events and receiving +commands (commands are messages that are not responses, they are +commands that other things on the IPMB bus have sent you). To receive +these, you must register for them, they will not automatically be sent +to you. + +To receive events, you must call ipmi_set_gets_events() and set the +"val" to non-zero. Any events that have been received by the driver +since startup will immediately be delivered to the first user that +registers for events. After that, if multiple users are registered +for events, they will all receive all events that come in. + +For receiving commands, you have to individually register commands you +want to receive. Call ipmi_register_for_cmd() and supply the netfn +and command name for each command you want to receive. Only one user +may be registered for each netfn/cmd, but different users may register +for different commands. + +From userland, equivalent IOCTLs are provided to do these functions. + + +The Lower Layer (SMI) Interface +------------------------------- + +As mentioned before, multiple SMI interfaces may be registered to the +message handler, each of these is assigned an interface number when +they register with the message handler. They are generally assigned +in the order they register, although if an SMI unregisters and then +another one registers, all bets are off. + +The ipmi_smi.h defines the interface for management interfaces, see +that for more details. + + +The SI Driver +------------- + +The SI driver allows up to 4 KCS or SMIC interfaces to be configured +in the system. By default, scan the ACPI tables for interfaces, and +if it doesn't find any the driver will attempt to register one KCS +interface at the spec-specified I/O port 0xca2 without interrupts. +You can change this at module load time (for a module) with: + + modprobe ipmi_si.o type=<type1>,<type2>.... + ports=<port1>,<port2>... addrs=<addr1>,<addr2>... + irqs=<irq1>,<irq2>... trydefaults=[0|1] + regspacings=<sp1>,<sp2>,... regsizes=<size1>,<size2>,... + regshifts=<shift1>,<shift2>,... + slave_addrs=<addr1>,<addr2>,... + +Each of these except si_trydefaults is a list, the first item for the +first interface, second item for the second interface, etc. + +The si_type may be either "kcs", "smic", or "bt". If you leave it blank, it +defaults to "kcs". + +If you specify si_addrs as non-zero for an interface, the driver will +use the memory address given as the address of the device. This +overrides si_ports. + +If you specify si_ports as non-zero for an interface, the driver will +use the I/O port given as the device address. + +If you specify si_irqs as non-zero for an interface, the driver will +attempt to use the given interrupt for the device. + +si_trydefaults sets whether the standard IPMI interface at 0xca2 and +any interfaces specified by ACPE are tried. By default, the driver +tries it, set this value to zero to turn this off. + +The next three parameters have to do with register layout. The +registers used by the interfaces may not appear at successive +locations and they may not be in 8-bit registers. These parameters +allow the layout of the data in the registers to be more precisely +specified. + +The regspacings parameter give the number of bytes between successive +register start addresses. For instance, if the regspacing is set to 4 +and the start address is 0xca2, then the address for the second +register would be 0xca6. This defaults to 1. + +The regsizes parameter gives the size of a register, in bytes. The +data used by IPMI is 8-bits wide, but it may be inside a larger +register. This parameter allows the read and write type to specified. +It may be 1, 2, 4, or 8. The default is 1. + +Since the register size may be larger than 32 bits, the IPMI data may not +be in the lower 8 bits. The regshifts parameter give the amount to shift +the data to get to the actual IPMI data. + +The slave_addrs specifies the IPMI address of the local BMC. This is +usually 0x20 and the driver defaults to that, but in case it's not, it +can be specified when the driver starts up. + +When compiled into the kernel, the addresses can be specified on the +kernel command line as: + + ipmi_si.type=<type1>,<type2>... + ipmi_si.ports=<port1>,<port2>... ipmi_si.addrs=<addr1>,<addr2>... + ipmi_si.irqs=<irq1>,<irq2>... ipmi_si.trydefaults=[0|1] + ipmi_si.regspacings=<sp1>,<sp2>,... + ipmi_si.regsizes=<size1>,<size2>,... + ipmi_si.regshifts=<shift1>,<shift2>,... + ipmi_si.slave_addrs=<addr1>,<addr2>,... + +It works the same as the module parameters of the same names. + +By default, the driver will attempt to detect any device specified by +ACPI, and if none of those then a KCS device at the spec-specified +0xca2. If you want to turn this off, set the "trydefaults" option to +false. + +If you have high-res timers compiled into the kernel, the driver will +use them to provide much better performance. Note that if you do not +have high-res timers enabled in the kernel and you don't have +interrupts enabled, the driver will run VERY slowly. Don't blame me, +these interfaces suck. + + +The SMBus Driver +---------------- + +The SMBus driver allows up to 4 SMBus devices to be configured in the +system. By default, the driver will register any SMBus interfaces it finds +in the I2C address range of 0x20 to 0x4f on any adapter. You can change this +at module load time (for a module) with: + + modprobe ipmi_smb.o + addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]] + dbg=<flags1>,<flags2>... + [defaultprobe=0] [dbg_probe=1] + +The addresses are specified in pairs, the first is the adapter ID and the +second is the I2C address on that adapter. + +The debug flags are bit flags for each BMC found, they are: +IPMI messages: 1, driver state: 2, timing: 4, I2C probe: 8 + +Setting smb_defaultprobe to zero disabled the default probing of SMBus +interfaces at address range 0x20 to 0x4f. This means that only the +BMCs specified on the smb_addr line will be detected. + +Setting smb_dbg_probe to 1 will enable debugging of the probing and +detection process for BMCs on the SMBusses. + +Discovering the IPMI compilant BMC on the SMBus can cause devices +on the I2C bus to fail. The SMBus driver writes a "Get Device ID" IPMI +message as a block write to the I2C bus and waits for a response. +This action can be detrimental to some I2C devices. It is highly recommended +that the known I2c address be given to the SMBus driver in the smb_addr +parameter. The default adrress range will not be used when a smb_addr +parameter is provided. + +When compiled into the kernel, the addresses can be specified on the +kernel command line as: + + ipmb_smb.addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]] + ipmi_smb.dbg=<flags1>,<flags2>... + ipmi_smb.defaultprobe=0 ipmi_smb.dbg_probe=1 + +These are the same options as on the module command line. + +Note that you might need some I2C changes if CONFIG_IPMI_PANIC_EVENT +is enabled along with this, so the I2C driver knows to run to +completion during sending a panic event. + + +Other Pieces +------------ + +Watchdog +-------- + +A watchdog timer is provided that implements the Linux-standard +watchdog timer interface. It has three module parameters that can be +used to control it: + + modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type> + preaction=<preaction type> preop=<preop type> start_now=x + nowayout=x + +The timeout is the number of seconds to the action, and the pretimeout +is the amount of seconds before the reset that the pre-timeout panic will +occur (if pretimeout is zero, then pretimeout will not be enabled). Note +that the pretimeout is the time before the final timeout. So if the +timeout is 50 seconds and the pretimeout is 10 seconds, then the pretimeout +will occur in 40 second (10 seconds before the timeout). + +The action may be "reset", "power_cycle", or "power_off", and +specifies what to do when the timer times out, and defaults to +"reset". + +The preaction may be "pre_smi" for an indication through the SMI +interface, "pre_int" for an indication through the SMI with an +interrupts, and "pre_nmi" for a NMI on a preaction. This is how +the driver is informed of the pretimeout. + +The preop may be set to "preop_none" for no operation on a pretimeout, +"preop_panic" to set the preoperation to panic, or "preop_give_data" +to provide data to read from the watchdog device when the pretimeout +occurs. A "pre_nmi" setting CANNOT be used with "preop_give_data" +because you can't do data operations from an NMI. + +When preop is set to "preop_give_data", one byte comes ready to read +on the device when the pretimeout occurs. Select and fasync work on +the device, as well. + +If start_now is set to 1, the watchdog timer will start running as +soon as the driver is loaded. + +If nowayout is set to 1, the watchdog timer will not stop when the +watchdog device is closed. The default value of nowayout is true +if the CONFIG_WATCHDOG_NOWAYOUT option is enabled, or false if not. + +When compiled into the kernel, the kernel command line is available +for configuring the watchdog: + + ipmi_watchdog.timeout=<t> ipmi_watchdog.pretimeout=<t> + ipmi_watchdog.action=<action type> + ipmi_watchdog.preaction=<preaction type> + ipmi_watchdog.preop=<preop type> + ipmi_watchdog.start_now=x + ipmi_watchdog.nowayout=x + +The options are the same as the module parameter options. + +The watchdog will panic and start a 120 second reset timeout if it +gets a pre-action. During a panic or a reboot, the watchdog will +start a 120 timer if it is running to make sure the reboot occurs. + +Note that if you use the NMI preaction for the watchdog, you MUST +NOT use nmi watchdog mode 1. If you use the NMI watchdog, you +must use mode 2. + +Once you open the watchdog timer, you must write a 'V' character to the +device to close it, or the timer will not stop. This is a new semantic +for the driver, but makes it consistent with the rest of the watchdog +drivers in Linux. diff --git a/Documentation/IRQ-affinity.txt b/Documentation/IRQ-affinity.txt new file mode 100644 index 000000000000..938d7dd05490 --- /dev/null +++ b/Documentation/IRQ-affinity.txt @@ -0,0 +1,37 @@ + +SMP IRQ affinity, started by Ingo Molnar <mingo@redhat.com> + + +/proc/irq/IRQ#/smp_affinity specifies which target CPUs are permitted +for a given IRQ source. It's a bitmask of allowed CPUs. It's not allowed +to turn off all CPUs, and if an IRQ controller does not support IRQ +affinity then the value will not change from the default 0xffffffff. + +Here is an example of restricting IRQ44 (eth1) to CPU0-3 then restricting +the IRQ to CPU4-7 (this is an 8-CPU SMP box): + +[root@moon 44]# cat smp_affinity +ffffffff +[root@moon 44]# echo 0f > smp_affinity +[root@moon 44]# cat smp_affinity +0000000f +[root@moon 44]# ping -f h +PING hell (195.4.7.3): 56 data bytes +... +--- hell ping statistics --- +6029 packets transmitted, 6027 packets received, 0% packet loss +round-trip min/avg/max = 0.1/0.1/0.4 ms +[root@moon 44]# cat /proc/interrupts | grep 44: + 44: 0 1785 1785 1783 1783 1 +1 0 IO-APIC-level eth1 +[root@moon 44]# echo f0 > smp_affinity +[root@moon 44]# ping -f h +PING hell (195.4.7.3): 56 data bytes +.. +--- hell ping statistics --- +2779 packets transmitted, 2777 packets received, 0% packet loss +round-trip min/avg/max = 0.1/0.5/585.4 ms +[root@moon 44]# cat /proc/interrupts | grep 44: + 44: 1068 1785 1785 1784 1784 1069 1070 1069 IO-APIC-level eth1 +[root@moon 44]# + diff --git a/Documentation/MSI-HOWTO.txt b/Documentation/MSI-HOWTO.txt new file mode 100644 index 000000000000..d5032eb480aa --- /dev/null +++ b/Documentation/MSI-HOWTO.txt @@ -0,0 +1,503 @@ + The MSI Driver Guide HOWTO + Tom L Nguyen tom.l.nguyen@intel.com + 10/03/2003 + Revised Feb 12, 2004 by Martine Silbermann + email: Martine.Silbermann@hp.com + Revised Jun 25, 2004 by Tom L Nguyen + +1. About this guide + +This guide describes the basics of Message Signaled Interrupts (MSI), +the advantages of using MSI over traditional interrupt mechanisms, +and how to enable your driver to use MSI or MSI-X. Also included is +a Frequently Asked Questions. + +2. Copyright 2003 Intel Corporation + +3. What is MSI/MSI-X? + +Message Signaled Interrupt (MSI), as described in the PCI Local Bus +Specification Revision 2.3 or latest, is an optional feature, and a +required feature for PCI Express devices. MSI enables a device function +to request service by sending an Inbound Memory Write on its PCI bus to +the FSB as a Message Signal Interrupt transaction. Because MSI is +generated in the form of a Memory Write, all transaction conditions, +such as a Retry, Master-Abort, Target-Abort or normal completion, are +supported. + +A PCI device that supports MSI must also support pin IRQ assertion +interrupt mechanism to provide backward compatibility for systems that +do not support MSI. In Systems, which support MSI, the bus driver is +responsible for initializing the message address and message data of +the device function's MSI/MSI-X capability structure during device +initial configuration. + +An MSI capable device function indicates MSI support by implementing +the MSI/MSI-X capability structure in its PCI capability list. The +device function may implement both the MSI capability structure and +the MSI-X capability structure; however, the bus driver should not +enable both. + +The MSI capability structure contains Message Control register, +Message Address register and Message Data register. These registers +provide the bus driver control over MSI. The Message Control register +indicates the MSI capability supported by the device. The Message +Address register specifies the target address and the Message Data +register specifies the characteristics of the message. To request +service, the device function writes the content of the Message Data +register to the target address. The device and its software driver +are prohibited from writing to these registers. + +The MSI-X capability structure is an optional extension to MSI. It +uses an independent and separate capability structure. There are +some key advantages to implementing the MSI-X capability structure +over the MSI capability structure as described below. + + - Support a larger maximum number of vectors per function. + + - Provide the ability for system software to configure + each vector with an independent message address and message + data, specified by a table that resides in Memory Space. + + - MSI and MSI-X both support per-vector masking. Per-vector + masking is an optional extension of MSI but a required + feature for MSI-X. Per-vector masking provides the kernel + the ability to mask/unmask MSI when servicing its software + interrupt service routing handler. If per-vector masking is + not supported, then the device driver should provide the + hardware/software synchronization to ensure that the device + generates MSI when the driver wants it to do so. + +4. Why use MSI? + +As a benefit the simplification of board design, MSI allows board +designers to remove out of band interrupt routing. MSI is another +step towards a legacy-free environment. + +Due to increasing pressure on chipset and processor packages to +reduce pin count, the need for interrupt pins is expected to +diminish over time. Devices, due to pin constraints, may implement +messages to increase performance. + +PCI Express endpoints uses INTx emulation (in-band messages) instead +of IRQ pin assertion. Using INTx emulation requires interrupt +sharing among devices connected to the same node (PCI bridge) while +MSI is unique (non-shared) and does not require BIOS configuration +support. As a result, the PCI Express technology requires MSI +support for better interrupt performance. + +Using MSI enables the device functions to support two or more +vectors, which can be configured to target different CPU's to +increase scalability. + +5. Configuring a driver to use MSI/MSI-X + +By default, the kernel will not enable MSI/MSI-X on all devices that +support this capability. The CONFIG_PCI_MSI kernel option +must be selected to enable MSI/MSI-X support. + +5.1 Including MSI/MSI-X support into the kernel + +To allow MSI/MSI-X capable device drivers to selectively enable +MSI/MSI-X (using pci_enable_msi()/pci_enable_msix() as described +below), the VECTOR based scheme needs to be enabled by setting +CONFIG_PCI_MSI during kernel config. + +Since the target of the inbound message is the local APIC, providing +CONFIG_X86_LOCAL_APIC must be enabled as well as CONFIG_PCI_MSI. + +5.2 Configuring for MSI support + +Due to the non-contiguous fashion in vector assignment of the +existing Linux kernel, this version does not support multiple +messages regardless of a device function is capable of supporting +more than one vector. To enable MSI on a device function's MSI +capability structure requires a device driver to call the function +pci_enable_msi() explicitly. + +5.2.1 API pci_enable_msi + +int pci_enable_msi(struct pci_dev *dev) + +With this new API, any existing device driver, which like to have +MSI enabled on its device function, must call this API to enable MSI +A successful call will initialize the MSI capability structure +with ONE vector, regardless of whether a device function is +capable of supporting multiple messages. This vector replaces the +pre-assigned dev->irq with a new MSI vector. To avoid the conflict +of new assigned vector with existing pre-assigned vector requires +a device driver to call this API before calling request_irq(). + +5.2.2 API pci_disable_msi + +void pci_disable_msi(struct pci_dev *dev) + +This API should always be used to undo the effect of pci_enable_msi() +when a device driver is unloading. This API restores dev->irq with +the pre-assigned IOAPIC vector and switches a device's interrupt +mode to PCI pin-irq assertion/INTx emulation mode. + +Note that a device driver should always call free_irq() on MSI vector +it has done request_irq() on before calling this API. Failure to do +so results a BUG_ON() and a device will be left with MSI enabled and +leaks its vector. + +5.2.3 MSI mode vs. legacy mode diagram + +The below diagram shows the events, which switches the interrupt +mode on the MSI-capable device function between MSI mode and +PIN-IRQ assertion mode. + + ------------ pci_enable_msi ------------------------ + | | <=============== | | + | MSI MODE | | PIN-IRQ ASSERTION MODE | + | | ===============> | | + ------------ pci_disable_msi ------------------------ + + +Figure 1.0 MSI Mode vs. Legacy Mode + +In Figure 1.0, a device operates by default in legacy mode. Legacy +in this context means PCI pin-irq assertion or PCI-Express INTx +emulation. A successful MSI request (using pci_enable_msi()) switches +a device's interrupt mode to MSI mode. A pre-assigned IOAPIC vector +stored in dev->irq will be saved by the PCI subsystem and a new +assigned MSI vector will replace dev->irq. + +To return back to its default mode, a device driver should always call +pci_disable_msi() to undo the effect of pci_enable_msi(). Note that a +device driver should always call free_irq() on MSI vector it has done +request_irq() on before calling pci_disable_msi(). Failure to do so +results a BUG_ON() and a device will be left with MSI enabled and +leaks its vector. Otherwise, the PCI subsystem restores a device's +dev->irq with a pre-assigned IOAPIC vector and marks released +MSI vector as unused. + +Once being marked as unused, there is no guarantee that the PCI +subsystem will reserve this MSI vector for a device. Depending on +the availability of current PCI vector resources and the number of +MSI/MSI-X requests from other drivers, this MSI may be re-assigned. + +For the case where the PCI subsystem re-assigned this MSI vector +another driver, a request to switching back to MSI mode may result +in being assigned a different MSI vector or a failure if no more +vectors are available. + +5.3 Configuring for MSI-X support + +Due to the ability of the system software to configure each vector of +the MSI-X capability structure with an independent message address +and message data, the non-contiguous fashion in vector assignment of +the existing Linux kernel has no impact on supporting multiple +messages on an MSI-X capable device functions. To enable MSI-X on +a device function's MSI-X capability structure requires its device +driver to call the function pci_enable_msix() explicitly. + +The function pci_enable_msix(), once invoked, enables either +all or nothing, depending on the current availability of PCI vector +resources. If the PCI vector resources are available for the number +of vectors requested by a device driver, this function will configure +the MSI-X table of the MSI-X capability structure of a device with +requested messages. To emphasize this reason, for example, a device +may be capable for supporting the maximum of 32 vectors while its +software driver usually may request 4 vectors. It is recommended +that the device driver should call this function once during the +initialization phase of the device driver. + +Unlike the function pci_enable_msi(), the function pci_enable_msix() +does not replace the pre-assigned IOAPIC dev->irq with a new MSI +vector because the PCI subsystem writes the 1:1 vector-to-entry mapping +into the field vector of each element contained in a second argument. +Note that the pre-assigned IO-APIC dev->irq is valid only if the device +operates in PIN-IRQ assertion mode. In MSI-X mode, any attempt of +using dev->irq by the device driver to request for interrupt service +may result unpredictabe behavior. + +For each MSI-X vector granted, a device driver is responsible to call +other functions like request_irq(), enable_irq(), etc. to enable +this vector with its corresponding interrupt service handler. It is +a device driver's choice to assign all vectors with the same +interrupt service handler or each vector with a unique interrupt +service handler. + +5.3.1 Handling MMIO address space of MSI-X Table + +The PCI 3.0 specification has implementation notes that MMIO address +space for a device's MSI-X structure should be isolated so that the +software system can set different page for controlling accesses to +the MSI-X structure. The implementation of MSI patch requires the PCI +subsystem, not a device driver, to maintain full control of the MSI-X +table/MSI-X PBA and MMIO address space of the MSI-X table/MSI-X PBA. +A device driver is prohibited from requesting the MMIO address space +of the MSI-X table/MSI-X PBA. Otherwise, the PCI subsystem will fail +enabling MSI-X on its hardware device when it calls the function +pci_enable_msix(). + +5.3.2 Handling MSI-X allocation + +Determining the number of MSI-X vectors allocated to a function is +dependent on the number of MSI capable devices and MSI-X capable +devices populated in the system. The policy of allocating MSI-X +vectors to a function is defined as the following: + +#of MSI-X vectors allocated to a function = (x - y)/z where + +x = The number of available PCI vector resources by the time + the device driver calls pci_enable_msix(). The PCI vector + resources is the sum of the number of unassigned vectors + (new) and the number of released vectors when any MSI/MSI-X + device driver switches its hardware device back to a legacy + mode or is hot-removed. The number of unassigned vectors + may exclude some vectors reserved, as defined in parameter + NR_HP_RESERVED_VECTORS, for the case where the system is + capable of supporting hot-add/hot-remove operations. Users + may change the value defined in NR_HR_RESERVED_VECTORS to + meet their specific needs. + +y = The number of MSI capable devices populated in the system. + This policy ensures that each MSI capable device has its + vector reserved to avoid the case where some MSI-X capable + drivers may attempt to claim all available vector resources. + +z = The number of MSI-X capable devices pupulated in the system. + This policy ensures that maximum (x - y) is distributed + evenly among MSI-X capable devices. + +Note that the PCI subsystem scans y and z during a bus enumeration. +When the PCI subsystem completes configuring MSI/MSI-X capability +structure of a device as requested by its device driver, y/z is +decremented accordingly. + +5.3.3 Handling MSI-X shortages + +For the case where fewer MSI-X vectors are allocated to a function +than requested, the function pci_enable_msix() will return the +maximum number of MSI-X vectors available to the caller. A device +driver may re-send its request with fewer or equal vectors indicated +in a return. For example, if a device driver requests 5 vectors, but +the number of available vectors is 3 vectors, a value of 3 will be a +return as a result of pci_enable_msix() call. A function could be +designed for its driver to use only 3 MSI-X table entries as +different combinations as ABC--, A-B-C, A--CB, etc. Note that this +patch does not support multiple entries with the same vector. Such +attempt by a device driver to use 5 MSI-X table entries with 3 vectors +as ABBCC, AABCC, BCCBA, etc will result as a failure by the function +pci_enable_msix(). Below are the reasons why supporting multiple +entries with the same vector is an undesirable solution. + + - The PCI subsystem can not determine which entry, which + generated the message, to mask/unmask MSI while handling + software driver ISR. Attempting to walk through all MSI-X + table entries (2048 max) to mask/unmask any match vector + is an undesirable solution. + + - Walk through all MSI-X table entries (2048 max) to handle + SMP affinity of any match vector is an undesirable solution. + +5.3.4 API pci_enable_msix + +int pci_enable_msix(struct pci_dev *dev, u32 *entries, int nvec) + +This API enables a device driver to request the PCI subsystem +for enabling MSI-X messages on its hardware device. Depending on +the availability of PCI vectors resources, the PCI subsystem enables +either all or nothing. + +Argument dev points to the device (pci_dev) structure. + +Argument entries is a pointer of unsigned integer type. The number of +elements is indicated in argument nvec. The content of each element +will be mapped to the following struct defined in /driver/pci/msi.h. + +struct msix_entry { + u16 vector; /* kernel uses to write alloc vector */ + u16 entry; /* driver uses to specify entry */ +}; + +A device driver is responsible for initializing the field entry of +each element with unique entry supported by MSI-X table. Otherwise, +-EINVAL will be returned as a result. A successful return of zero +indicates the PCI subsystem completes initializing each of requested +entries of the MSI-X table with message address and message data. +Last but not least, the PCI subsystem will write the 1:1 +vector-to-entry mapping into the field vector of each element. A +device driver is responsible of keeping track of allocated MSI-X +vectors in its internal data structure. + +Argument nvec is an integer indicating the number of messages +requested. + +A return of zero indicates that the number of MSI-X vectors is +successfully allocated. A return of greater than zero indicates +MSI-X vector shortage. Or a return of less than zero indicates +a failure. This failure may be a result of duplicate entries +specified in second argument, or a result of no available vector, +or a result of failing to initialize MSI-X table entries. + +5.3.5 API pci_disable_msix + +void pci_disable_msix(struct pci_dev *dev) + +This API should always be used to undo the effect of pci_enable_msix() +when a device driver is unloading. Note that a device driver should +always call free_irq() on all MSI-X vectors it has done request_irq() +on before calling this API. Failure to do so results a BUG_ON() and +a device will be left with MSI-X enabled and leaks its vectors. + +5.3.6 MSI-X mode vs. legacy mode diagram + +The below diagram shows the events, which switches the interrupt +mode on the MSI-X capable device function between MSI-X mode and +PIN-IRQ assertion mode (legacy). + + ------------ pci_enable_msix(,,n) ------------------------ + | | <=============== | | + | MSI-X MODE | | PIN-IRQ ASSERTION MODE | + | | ===============> | | + ------------ pci_disable_msix ------------------------ + +Figure 2.0 MSI-X Mode vs. Legacy Mode + +In Figure 2.0, a device operates by default in legacy mode. A +successful MSI-X request (using pci_enable_msix()) switches a +device's interrupt mode to MSI-X mode. A pre-assigned IOAPIC vector +stored in dev->irq will be saved by the PCI subsystem; however, +unlike MSI mode, the PCI subsystem will not replace dev->irq with +assigned MSI-X vector because the PCI subsystem already writes the 1:1 +vector-to-entry mapping into the field vector of each element +specified in second argument. + +To return back to its default mode, a device driver should always call +pci_disable_msix() to undo the effect of pci_enable_msix(). Note that +a device driver should always call free_irq() on all MSI-X vectors it +has done request_irq() on before calling pci_disable_msix(). Failure +to do so results a BUG_ON() and a device will be left with MSI-X +enabled and leaks its vectors. Otherwise, the PCI subsystem switches a +device function's interrupt mode from MSI-X mode to legacy mode and +marks all allocated MSI-X vectors as unused. + +Once being marked as unused, there is no guarantee that the PCI +subsystem will reserve these MSI-X vectors for a device. Depending on +the availability of current PCI vector resources and the number of +MSI/MSI-X requests from other drivers, these MSI-X vectors may be +re-assigned. + +For the case where the PCI subsystem re-assigned these MSI-X vectors +to other driver, a request to switching back to MSI-X mode may result +being assigned with another set of MSI-X vectors or a failure if no +more vectors are available. + +5.4 Handling function implementng both MSI and MSI-X capabilities + +For the case where a function implements both MSI and MSI-X +capabilities, the PCI subsystem enables a device to run either in MSI +mode or MSI-X mode but not both. A device driver determines whether it +wants MSI or MSI-X enabled on its hardware device. Once a device +driver requests for MSI, for example, it is prohibited to request for +MSI-X; in other words, a device driver is not permitted to ping-pong +between MSI mod MSI-X mode during a run-time. + +5.5 Hardware requirements for MSI/MSI-X support +MSI/MSI-X support requires support from both system hardware and +individual hardware device functions. + +5.5.1 System hardware support +Since the target of MSI address is the local APIC CPU, enabling +MSI/MSI-X support in Linux kernel is dependent on whether existing +system hardware supports local APIC. Users should verify their +system whether it runs when CONFIG_X86_LOCAL_APIC=y. + +In SMP environment, CONFIG_X86_LOCAL_APIC is automatically set; +however, in UP environment, users must manually set +CONFIG_X86_LOCAL_APIC. Once CONFIG_X86_LOCAL_APIC=y, setting +CONFIG_PCI_MSI enables the VECTOR based scheme and +the option for MSI-capable device drivers to selectively enable +MSI/MSI-X. + +Note that CONFIG_X86_IO_APIC setting is irrelevant because MSI/MSI-X +vector is allocated new during runtime and MSI/MSI-X support does not +depend on BIOS support. This key independency enables MSI/MSI-X +support on future IOxAPIC free platform. + +5.5.2 Device hardware support +The hardware device function supports MSI by indicating the +MSI/MSI-X capability structure on its PCI capability list. By +default, this capability structure will not be initialized by +the kernel to enable MSI during the system boot. In other words, +the device function is running on its default pin assertion mode. +Note that in many cases the hardware supporting MSI have bugs, +which may result in system hang. The software driver of specific +MSI-capable hardware is responsible for whether calling +pci_enable_msi or not. A return of zero indicates the kernel +successfully initializes the MSI/MSI-X capability structure of the +device funtion. The device function is now running on MSI/MSI-X mode. + +5.6 How to tell whether MSI/MSI-X is enabled on device function + +At the driver level, a return of zero from the function call of +pci_enable_msi()/pci_enable_msix() indicates to a device driver that +its device function is initialized successfully and ready to run in +MSI/MSI-X mode. + +At the user level, users can use command 'cat /proc/interrupts' +to display the vector allocated for a device and its interrupt +MSI/MSI-X mode ("PCI MSI"/"PCI MSIX"). Below shows below MSI mode is +enabled on a SCSI Adaptec 39320D Ultra320. + + CPU0 CPU1 + 0: 324639 0 IO-APIC-edge timer + 1: 1186 0 IO-APIC-edge i8042 + 2: 0 0 XT-PIC cascade + 12: 2797 0 IO-APIC-edge i8042 + 14: 6543 0 IO-APIC-edge ide0 + 15: 1 0 IO-APIC-edge ide1 +169: 0 0 IO-APIC-level uhci-hcd +185: 0 0 IO-APIC-level uhci-hcd +193: 138 10 PCI MSI aic79xx +201: 30 0 PCI MSI aic79xx +225: 30 0 IO-APIC-level aic7xxx +233: 30 0 IO-APIC-level aic7xxx +NMI: 0 0 +LOC: 324553 325068 +ERR: 0 +MIS: 0 + +6. FAQ + +Q1. Are there any limitations on using the MSI? + +A1. If the PCI device supports MSI and conforms to the +specification and the platform supports the APIC local bus, +then using MSI should work. + +Q2. Will it work on all the Pentium processors (P3, P4, Xeon, +AMD processors)? In P3 IPI's are transmitted on the APIC local +bus and in P4 and Xeon they are transmitted on the system +bus. Are there any implications with this? + +A2. MSI support enables a PCI device sending an inbound +memory write (0xfeexxxxx as target address) on its PCI bus +directly to the FSB. Since the message address has a +redirection hint bit cleared, it should work. + +Q3. The target address 0xfeexxxxx will be translated by the +Host Bridge into an interrupt message. Are there any +limitations on the chipsets such as Intel 8xx, Intel e7xxx, +or VIA? + +A3. If these chipsets support an inbound memory write with +target address set as 0xfeexxxxx, as conformed to PCI +specification 2.3 or latest, then it should work. + +Q4. From the driver point of view, if the MSI is lost because +of the errors occur during inbound memory write, then it may +wait for ever. Is there a mechanism for it to recover? + +A4. Since the target of the transaction is an inbound memory +write, all transaction termination conditions (Retry, +Master-Abort, Target-Abort, or normal completion) are +supported. A device sending an MSI must abide by all the PCI +rules and conditions regarding that inbound memory write. So, +if a retry is signaled it must retry, etc... We believe that +the recommendation for Abort is also a retry (refer to PCI +specification 2.3 or latest). diff --git a/Documentation/ManagementStyle b/Documentation/ManagementStyle new file mode 100644 index 000000000000..cbbebfb51ffe --- /dev/null +++ b/Documentation/ManagementStyle @@ -0,0 +1,276 @@ + + Linux kernel management style + +This is a short document describing the preferred (or made up, depending +on who you ask) management style for the linux kernel. It's meant to +mirror the CodingStyle document to some degree, and mainly written to +avoid answering (*) the same (or similar) questions over and over again. + +Management style is very personal and much harder to quantify than +simple coding style rules, so this document may or may not have anything +to do with reality. It started as a lark, but that doesn't mean that it +might not actually be true. You'll have to decide for yourself. + +Btw, when talking about "kernel manager", it's all about the technical +lead persons, not the people who do traditional management inside +companies. If you sign purchase orders or you have any clue about the +budget of your group, you're almost certainly not a kernel manager. +These suggestions may or may not apply to you. + +First off, I'd suggest buying "Seven Habits of Highly Successful +People", and NOT read it. Burn it, it's a great symbolic gesture. + +(*) This document does so not so much by answering the question, but by +making it painfully obvious to the questioner that we don't have a clue +to what the answer is. + +Anyway, here goes: + + + Chapter 1: Decisions + +Everybody thinks managers make decisions, and that decision-making is +important. The bigger and more painful the decision, the bigger the +manager must be to make it. That's very deep and obvious, but it's not +actually true. + +The name of the game is to _avoid_ having to make a decision. In +particular, if somebody tells you "choose (a) or (b), we really need you +to decide on this", you're in trouble as a manager. The people you +manage had better know the details better than you, so if they come to +you for a technical decision, you're screwed. You're clearly not +competent to make that decision for them. + +(Corollary:if the people you manage don't know the details better than +you, you're also screwed, although for a totally different reason. +Namely that you are in the wrong job, and that _they_ should be managing +your brilliance instead). + +So the name of the game is to _avoid_ decisions, at least the big and +painful ones. Making small and non-consequential decisions is fine, and +makes you look like you know what you're doing, so what a kernel manager +needs to do is to turn the big and painful ones into small things where +nobody really cares. + +It helps to realize that the key difference between a big decision and a +small one is whether you can fix your decision afterwards. Any decision +can be made small by just always making sure that if you were wrong (and +you _will_ be wrong), you can always undo the damage later by +backtracking. Suddenly, you get to be doubly managerial for making +_two_ inconsequential decisions - the wrong one _and_ the right one. + +And people will even see that as true leadership (*cough* bullshit +*cough*). + +Thus the key to avoiding big decisions becomes to just avoiding to do +things that can't be undone. Don't get ushered into a corner from which +you cannot escape. A cornered rat may be dangerous - a cornered manager +is just pitiful. + +It turns out that since nobody would be stupid enough to ever really let +a kernel manager have huge fiscal responsibility _anyway_, it's usually +fairly easy to backtrack. Since you're not going to be able to waste +huge amounts of money that you might not be able to repay, the only +thing you can backtrack on is a technical decision, and there +back-tracking is very easy: just tell everybody that you were an +incompetent nincompoop, say you're sorry, and undo all the worthless +work you had people work on for the last year. Suddenly the decision +you made a year ago wasn't a big decision after all, since it could be +easily undone. + +It turns out that some people have trouble with this approach, for two +reasons: + - admitting you were an idiot is harder than it looks. We all like to + maintain appearances, and coming out in public to say that you were + wrong is sometimes very hard indeed. + - having somebody tell you that what you worked on for the last year + wasn't worthwhile after all can be hard on the poor lowly engineers + too, and while the actual _work_ was easy enough to undo by just + deleting it, you may have irrevocably lost the trust of that + engineer. And remember: "irrevocable" was what we tried to avoid in + the first place, and your decision ended up being a big one after + all. + +Happily, both of these reasons can be mitigated effectively by just +admitting up-front that you don't have a friggin' clue, and telling +people ahead of the fact that your decision is purely preliminary, and +might be the wrong thing. You should always reserve the right to change +your mind, and make people very _aware_ of that. And it's much easier +to admit that you are stupid when you haven't _yet_ done the really +stupid thing. + +Then, when it really does turn out to be stupid, people just roll their +eyes and say "Oops, he did it again". + +This preemptive admission of incompetence might also make the people who +actually do the work also think twice about whether it's worth doing or +not. After all, if _they_ aren't certain whether it's a good idea, you +sure as hell shouldn't encourage them by promising them that what they +work on will be included. Make them at least think twice before they +embark on a big endeavor. + +Remember: they'd better know more about the details than you do, and +they usually already think they have the answer to everything. The best +thing you can do as a manager is not to instill confidence, but rather a +healthy dose of critical thinking on what they do. + +Btw, another way to avoid a decision is to plaintively just whine "can't +we just do both?" and look pitiful. Trust me, it works. If it's not +clear which approach is better, they'll eventually figure it out. The +answer may end up being that both teams get so frustrated by the +situation that they just give up. + +That may sound like a failure, but it's usually a sign that there was +something wrong with both projects, and the reason the people involved +couldn't decide was that they were both wrong. You end up coming up +smelling like roses, and you avoided yet another decision that you could +have screwed up on. + + + Chapter 2: People + +Most people are idiots, and being a manager means you'll have to deal +with it, and perhaps more importantly, that _they_ have to deal with +_you_. + +It turns out that while it's easy to undo technical mistakes, it's not +as easy to undo personality disorders. You just have to live with +theirs - and yours. + +However, in order to prepare yourself as a kernel manager, it's best to +remember not to burn any bridges, bomb any innocent villagers, or +alienate too many kernel developers. It turns out that alienating people +is fairly easy, and un-alienating them is hard. Thus "alienating" +immediately falls under the heading of "not reversible", and becomes a +no-no according to Chapter 1. + +There's just a few simple rules here: + (1) don't call people d*ckheads (at least not in public) + (2) learn how to apologize when you forgot rule (1) + +The problem with #1 is that it's very easy to do, since you can say +"you're a d*ckhead" in millions of different ways (*), sometimes without +even realizing it, and almost always with a white-hot conviction that +you are right. + +And the more convinced you are that you are right (and let's face it, +you can call just about _anybody_ a d*ckhead, and you often _will_ be +right), the harder it ends up being to apologize afterwards. + +To solve this problem, you really only have two options: + - get really good at apologies + - spread the "love" out so evenly that nobody really ends up feeling + like they get unfairly targeted. Make it inventive enough, and they + might even be amused. + +The option of being unfailingly polite really doesn't exist. Nobody will +trust somebody who is so clearly hiding his true character. + +(*) Paul Simon sang "Fifty Ways to Lose Your Lover", because quite +frankly, "A Million Ways to Tell a Developer He Is a D*ckhead" doesn't +scan nearly as well. But I'm sure he thought about it. + + + Chapter 3: People II - the Good Kind + +While it turns out that most people are idiots, the corollary to that is +sadly that you are one too, and that while we can all bask in the secure +knowledge that we're better than the average person (let's face it, +nobody ever believes that they're average or below-average), we should +also admit that we're not the sharpest knife around, and there will be +other people that are less of an idiot that you are. + +Some people react badly to smart people. Others take advantage of them. + +Make sure that you, as a kernel maintainer, are in the second group. +Suck up to them, because they are the people who will make your job +easier. In particular, they'll be able to make your decisions for you, +which is what the game is all about. + +So when you find somebody smarter than you are, just coast along. Your +management responsibilities largely become ones of saying "Sounds like a +good idea - go wild", or "That sounds good, but what about xxx?". The +second version in particular is a great way to either learn something +new about "xxx" or seem _extra_ managerial by pointing out something the +smarter person hadn't thought about. In either case, you win. + +One thing to look out for is to realize that greatness in one area does +not necessarily translate to other areas. So you might prod people in +specific directions, but let's face it, they might be good at what they +do, and suck at everything else. The good news is that people tend to +naturally gravitate back to what they are good at, so it's not like you +are doing something irreversible when you _do_ prod them in some +direction, just don't push too hard. + + + Chapter 4: Placing blame + +Things will go wrong, and people want somebody to blame. Tag, you're it. + +It's not actually that hard to accept the blame, especially if people +kind of realize that it wasn't _all_ your fault. Which brings us to the +best way of taking the blame: do it for another guy. You'll feel good +for taking the fall, he'll feel good about not getting blamed, and the +guy who lost his whole 36GB porn-collection because of your incompetence +will grudgingly admit that you at least didn't try to weasel out of it. + +Then make the developer who really screwed up (if you can find him) know +_in_private_ that he screwed up. Not just so he can avoid it in the +future, but so that he knows he owes you one. And, perhaps even more +importantly, he's also likely the person who can fix it. Because, let's +face it, it sure ain't you. + +Taking the blame is also why you get to be manager in the first place. +It's part of what makes people trust you, and allow you the potential +glory, because you're the one who gets to say "I screwed up". And if +you've followed the previous rules, you'll be pretty good at saying that +by now. + + + Chapter 5: Things to avoid + +There's one thing people hate even more than being called "d*ckhead", +and that is being called a "d*ckhead" in a sanctimonious voice. The +first you can apologize for, the second one you won't really get the +chance. They likely will no longer be listening even if you otherwise +do a good job. + +We all think we're better than anybody else, which means that when +somebody else puts on airs, it _really_ rubs us the wrong way. You may +be morally and intellectually superior to everybody around you, but +don't try to make it too obvious unless you really _intend_ to irritate +somebody (*). + +Similarly, don't be too polite or subtle about things. Politeness easily +ends up going overboard and hiding the problem, and as they say, "On the +internet, nobody can hear you being subtle". Use a big blunt object to +hammer the point in, because you can't really depend on people getting +your point otherwise. + +Some humor can help pad both the bluntness and the moralizing. Going +overboard to the point of being ridiculous can drive a point home +without making it painful to the recipient, who just thinks you're being +silly. It can thus help get through the personal mental block we all +have about criticism. + +(*) Hint: internet newsgroups that are not directly related to your work +are great ways to take out your frustrations at other people. Write +insulting posts with a sneer just to get into a good flame every once in +a while, and you'll feel cleansed. Just don't crap too close to home. + + + Chapter 6: Why me? + +Since your main responsibility seems to be to take the blame for other +peoples mistakes, and make it painfully obvious to everybody else that +you're incompetent, the obvious question becomes one of why do it in the +first place? + +First off, while you may or may not get screaming teenage girls (or +boys, let's not be judgmental or sexist here) knocking on your dressing +room door, you _will_ get an immense feeling of personal accomplishment +for being "in charge". Never mind the fact that you're really leading +by trying to keep up with everybody else and running after them as fast +as you can. Everybody will still think you're the person in charge. + +It's a great job if you can hack it. diff --git a/Documentation/PCIEBUS-HOWTO.txt b/Documentation/PCIEBUS-HOWTO.txt new file mode 100644 index 000000000000..c93f42a74d7e --- /dev/null +++ b/Documentation/PCIEBUS-HOWTO.txt @@ -0,0 +1,217 @@ + The PCI Express Port Bus Driver Guide HOWTO + Tom L Nguyen tom.l.nguyen@intel.com + 11/03/2004 + +1. About this guide + +This guide describes the basics of the PCI Express Port Bus driver +and provides information on how to enable the service drivers to +register/unregister with the PCI Express Port Bus Driver. + +2. Copyright 2004 Intel Corporation + +3. What is the PCI Express Port Bus Driver + +A PCI Express Port is a logical PCI-PCI Bridge structure. There +are two types of PCI Express Port: the Root Port and the Switch +Port. The Root Port originates a PCI Express link from a PCI Express +Root Complex and the Switch Port connects PCI Express links to +internal logical PCI buses. The Switch Port, which has its secondary +bus representing the switch's internal routing logic, is called the +switch's Upstream Port. The switch's Downstream Port is bridging from +switch's internal routing bus to a bus representing the downstream +PCI Express link from the PCI Express Switch. + +A PCI Express Port can provide up to four distinct functions, +referred to in this document as services, depending on its port type. +PCI Express Port's services include native hotplug support (HP), +power management event support (PME), advanced error reporting +support (AER), and virtual channel support (VC). These services may +be handled by a single complex driver or be individually distributed +and handled by corresponding service drivers. + +4. Why use the PCI Express Port Bus Driver? + +In existing Linux kernels, the Linux Device Driver Model allows a +physical device to be handled by only a single driver. The PCI +Express Port is a PCI-PCI Bridge device with multiple distinct +services. To maintain a clean and simple solution each service +may have its own software service driver. In this case several +service drivers will compete for a single PCI-PCI Bridge device. +For example, if the PCI Express Root Port native hotplug service +driver is loaded first, it claims a PCI-PCI Bridge Root Port. The +kernel therefore does not load other service drivers for that Root +Port. In other words, it is impossible to have multiple service +drivers load and run on a PCI-PCI Bridge device simultaneously +using the current driver model. + +To enable multiple service drivers running simultaneously requires +having a PCI Express Port Bus driver, which manages all populated +PCI Express Ports and distributes all provided service requests +to the corresponding service drivers as required. Some key +advantages of using the PCI Express Port Bus driver are listed below: + + - Allow multiple service drivers to run simultaneously on + a PCI-PCI Bridge Port device. + + - Allow service drivers implemented in an independent + staged approach. + + - Allow one service driver to run on multiple PCI-PCI Bridge + Port devices. + + - Manage and distribute resources of a PCI-PCI Bridge Port + device to requested service drivers. + +5. Configuring the PCI Express Port Bus Driver vs. Service Drivers + +5.1 Including the PCI Express Port Bus Driver Support into the Kernel + +Including the PCI Express Port Bus driver depends on whether the PCI +Express support is included in the kernel config. The kernel will +automatically include the PCI Express Port Bus driver as a kernel +driver when the PCI Express support is enabled in the kernel. + +5.2 Enabling Service Driver Support + +PCI device drivers are implemented based on Linux Device Driver Model. +All service drivers are PCI device drivers. As discussed above, it is +impossible to load any service driver once the kernel has loaded the +PCI Express Port Bus Driver. To meet the PCI Express Port Bus Driver +Model requires some minimal changes on existing service drivers that +imposes no impact on the functionality of existing service drivers. + +A service driver is required to use the two APIs shown below to +register its service with the PCI Express Port Bus driver (see +section 5.2.1 & 5.2.2). It is important that a service driver +initializes the pcie_port_service_driver data structure, included in +header file /include/linux/pcieport_if.h, before calling these APIs. +Failure to do so will result an identity mismatch, which prevents +the PCI Express Port Bus driver from loading a service driver. + +5.2.1 pcie_port_service_register + +int pcie_port_service_register(struct pcie_port_service_driver *new) + +This API replaces the Linux Driver Model's pci_module_init API. A +service driver should always calls pcie_port_service_register at +module init. Note that after service driver being loaded, calls +such as pci_enable_device(dev) and pci_set_master(dev) are no longer +necessary since these calls are executed by the PCI Port Bus driver. + +5.2.2 pcie_port_service_unregister + +void pcie_port_service_unregister(struct pcie_port_service_driver *new) + +pcie_port_service_unregister replaces the Linux Driver Model's +pci_unregister_driver. It's always called by service driver when a +module exits. + +5.2.3 Sample Code + +Below is sample service driver code to initialize the port service +driver data structure. + +static struct pcie_port_service_id service_id[] = { { + .vendor = PCI_ANY_ID, + .device = PCI_ANY_ID, + .port_type = PCIE_RC_PORT, + .service_type = PCIE_PORT_SERVICE_AER, + }, { /* end: all zeroes */ } +}; + +static struct pcie_port_service_driver root_aerdrv = { + .name = (char *)device_name, + .id_table = &service_id[0], + + .probe = aerdrv_load, + .remove = aerdrv_unload, + + .suspend = aerdrv_suspend, + .resume = aerdrv_resume, +}; + +Below is a sample code for registering/unregistering a service +driver. + +static int __init aerdrv_service_init(void) +{ + int retval = 0; + + retval = pcie_port_service_register(&root_aerdrv); + if (!retval) { + /* + * FIX ME + */ + } + return retval; +} + +static void __exit aerdrv_service_exit(void) +{ + pcie_port_service_unregister(&root_aerdrv); +} + +module_init(aerdrv_service_init); +module_exit(aerdrv_service_exit); + +6. Possible Resource Conflicts + +Since all service drivers of a PCI-PCI Bridge Port device are +allowed to run simultaneously, below lists a few of possible resource +conflicts with proposed solutions. + +6.1 MSI Vector Resource + +The MSI capability structure enables a device software driver to call +pci_enable_msi to request MSI based interrupts. Once MSI interrupts +are enabled on a device, it stays in this mode until a device driver +calls pci_disable_msi to disable MSI interrupts and revert back to +INTx emulation mode. Since service drivers of the same PCI-PCI Bridge +port share the same physical device, if an individual service driver +calls pci_enable_msi/pci_disable_msi it may result unpredictable +behavior. For example, two service drivers run simultaneously on the +same physical Root Port. Both service drivers call pci_enable_msi to +request MSI based interrupts. A service driver may not know whether +any other service drivers have run on this Root Port. If either one +of them calls pci_disable_msi, it puts the other service driver +in a wrong interrupt mode. + +To avoid this situation all service drivers are not permitted to +switch interrupt mode on its device. The PCI Express Port Bus driver +is responsible for determining the interrupt mode and this should be +transparent to service drivers. Service drivers need to know only +the vector IRQ assigned to the field irq of struct pcie_device, which +is passed in when the PCI Express Port Bus driver probes each service +driver. Service drivers should use (struct pcie_device*)dev->irq to +call request_irq/free_irq. In addition, the interrupt mode is stored +in the field interrupt_mode of struct pcie_device. + +6.2 MSI-X Vector Resources + +Similar to the MSI a device driver for an MSI-X capable device can +call pci_enable_msix to request MSI-X interrupts. All service drivers +are not permitted to switch interrupt mode on its device. The PCI +Express Port Bus driver is responsible for determining the interrupt +mode and this should be transparent to service drivers. Any attempt +by service driver to call pci_enable_msix/pci_disable_msix may +result unpredictable behavior. Service drivers should use +(struct pcie_device*)dev->irq and call request_irq/free_irq. + +6.3 PCI Memory/IO Mapped Regions + +Service drivers for PCI Express Power Management (PME), Advanced +Error Reporting (AER), Hot-Plug (HP) and Virtual Channel (VC) access +PCI configuration space on the PCI Express port. In all cases the +registers accessed are independent of each other. This patch assumes +that all service drivers will be well behaved and not overwrite +other service driver's configuration settings. + +6.4 PCI Config Registers + +Each service driver runs its PCI config operations on its own +capability structure except the PCI Express capability structure, in +which Root Control register and Device Control register are shared +between PME and AER. This patch assumes that all service drivers +will be well behaved and not overwrite other service driver's +configuration settings. diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt new file mode 100644 index 000000000000..12250b342e1f --- /dev/null +++ b/Documentation/RCU/RTFP.txt @@ -0,0 +1,387 @@ +Read the F-ing Papers! + + +This document describes RCU-related publications, and is followed by +the corresponding bibtex entries. + +The first thing resembling RCU was published in 1980, when Kung and Lehman +[Kung80] recommended use of a garbage collector to defer destruction +of nodes in a parallel binary search tree in order to simplify its +implementation. This works well in environments that have garbage +collectors, but current production garbage collectors incur significant +read-side overhead. + +In 1982, Manber and Ladner [Manber82,Manber84] recommended deferring +destruction until all threads running at that time have terminated, again +for a parallel binary search tree. This approach works well in systems +with short-lived threads, such as the K42 research operating system. +However, Linux has long-lived tasks, so more is needed. + +In 1986, Hennessy, Osisek, and Seigh [Hennessy89] introduced passive +serialization, which is an RCU-like mechanism that relies on the presence +of "quiescent states" in the VM/XA hypervisor that are guaranteed not +to be referencing the data structure. However, this mechanism was not +optimized for modern computer systems, which is not surprising given +that these overheads were not so expensive in the mid-80s. Nonetheless, +passive serialization appears to be the first deferred-destruction +mechanism to be used in production. Furthermore, the relevant patent has +lapsed, so this approach may be used in non-GPL software, if desired. +(In contrast, use of RCU is permitted only in software licensed under +GPL. Sorry!!!) + +In 1990, Pugh [Pugh90] noted that explicitly tracking which threads +were reading a given data structure permitted deferred free to operate +in the presence of non-terminating threads. However, this explicit +tracking imposes significant read-side overhead, which is undesirable +in read-mostly situations. This algorithm does take pains to avoid +write-side contention and parallelize the other write-side overheads by +providing a fine-grained locking design, however, it would be interesting +to see how much of the performance advantage reported in 1990 remains +in 2004. + +At about this same time, Adams [Adams91] described ``chaotic relaxation'', +where the normal barriers between successive iterations of convergent +numerical algorithms are relaxed, so that iteration $n$ might use +data from iteration $n-1$ or even $n-2$. This introduces error, +which typically slows convergence and thus increases the number of +iterations required. However, this increase is sometimes more than made +up for by a reduction in the number of expensive barrier operations, +which are otherwise required to synchronize the threads at the end +of each iteration. Unfortunately, chaotic relaxation requires highly +structured data, such as the matrices used in scientific programs, and +is thus inapplicable to most data structures in operating-system kernels. + +In 1993, Jacobson [Jacobson93] verbally described what is perhaps the +simplest deferred-free technique: simply waiting a fixed amount of time +before freeing blocks awaiting deferred free. Jacobson did not describe +any write-side changes he might have made in this work using SGI's Irix +kernel. Aju John published a similar technique in 1995 [AjuJohn95]. +This works well if there is a well-defined upper bound on the length of +time that reading threads can hold references, as there might well be in +hard real-time systems. However, if this time is exceeded, perhaps due +to preemption, excessive interrupts, or larger-than-anticipated load, +memory corruption can ensue, with no reasonable means of diagnosis. +Jacobson's technique is therefore inappropriate for use in production +operating-system kernels, except when such kernels can provide hard +real-time response guarantees for all operations. + +Also in 1995, Pu et al. [Pu95a] applied a technique similar to that of Pugh's +read-side-tracking to permit replugging of algorithms within a commercial +Unix operating system. However, this replugging permitted only a single +reader at a time. The following year, this same group of researchers +extended their technique to allow for multiple readers [Cowan96a]. +Their approach requires memory barriers (and thus pipeline stalls), +but reduces memory latency, contention, and locking overheads. + +1995 also saw the first publication of DYNIX/ptx's RCU mechanism +[Slingwine95], which was optimized for modern CPU architectures, +and was successfully applied to a number of situations within the +DYNIX/ptx kernel. The corresponding conference paper appeared in 1998 +[McKenney98]. + +In 1999, the Tornado and K42 groups described their "generations" +mechanism, which quite similar to RCU [Gamsa99]. These operating systems +made pervasive use of RCU in place of "existence locks", which greatly +simplifies locking hierarchies. + +2001 saw the first RCU presentation involving Linux [McKenney01a] +at OLS. The resulting abundance of RCU patches was presented the +following year [McKenney02a], and use of RCU in dcache was first +described that same year [Linder02a]. + +Also in 2002, Michael [Michael02b,Michael02a] presented techniques +that defer the destruction of data structures to simplify non-blocking +synchronization (wait-free synchronization, lock-free synchronization, +and obstruction-free synchronization are all examples of non-blocking +synchronization). In particular, this technique eliminates locking, +reduces contention, reduces memory latency for readers, and parallelizes +pipeline stalls and memory latency for writers. However, these +techniques still impose significant read-side overhead in the form of +memory barriers. Researchers at Sun worked along similar lines in the +same timeframe [HerlihyLM02,HerlihyLMS03]. + +In 2003, the K42 group described how RCU could be used to create +hot-pluggable implementations of operating-system functions. Later that +year saw a paper describing an RCU implementation of System V IPC +[Arcangeli03], and an introduction to RCU in Linux Journal [McKenney03a]. + +2004 has seen a Linux-Journal article on use of RCU in dcache +[McKenney04a], a performance comparison of locking to RCU on several +different CPUs [McKenney04b], a dissertation describing use of RCU in a +number of operating-system kernels [PaulEdwardMcKenneyPhD], and a paper +describing how to make RCU safe for soft-realtime applications [Sarma04c]. + + +Bibtex Entries + +@article{Kung80 +,author="H. T. Kung and Q. Lehman" +,title="Concurrent Maintenance of Binary Search Trees" +,Year="1980" +,Month="September" +,journal="ACM Transactions on Database Systems" +,volume="5" +,number="3" +,pages="354-382" +} + +@techreport{Manber82 +,author="Udi Manber and Richard E. Ladner" +,title="Concurrency Control in a Dynamic Search Structure" +,institution="Department of Computer Science, University of Washington" +,address="Seattle, Washington" +,year="1982" +,number="82-01-01" +,month="January" +,pages="28" +} + +@article{Manber84 +,author="Udi Manber and Richard E. Ladner" +,title="Concurrency Control in a Dynamic Search Structure" +,Year="1984" +,Month="September" +,journal="ACM Transactions on Database Systems" +,volume="9" +,number="3" +,pages="439-455" +} + +@techreport{Hennessy89 +,author="James P. Hennessy and Damian L. Osisek and Joseph W. {Seigh II}" +,title="Passive Serialization in a Multitasking Environment" +,institution="US Patent and Trademark Office" +,address="Washington, DC" +,year="1989" +,number="US Patent 4,809,168 (lapsed)" +,month="February" +,pages="11" +} + +@techreport{Pugh90 +,author="William Pugh" +,title="Concurrent Maintenance of Skip Lists" +,institution="Institute of Advanced Computer Science Studies, Department of Computer Science, University of Maryland" +,address="College Park, Maryland" +,year="1990" +,number="CS-TR-2222.1" +,month="June" +} + +@Book{Adams91 +,Author="Gregory R. Adams" +,title="Concurrent Programming, Principles, and Practices" +,Publisher="Benjamin Cummins" +,Year="1991" +} + +@unpublished{Jacobson93 +,author="Van Jacobson" +,title="Avoid Read-Side Locking Via Delayed Free" +,year="1993" +,month="September" +,note="Verbal discussion" +} + +@Conference{AjuJohn95 +,Author="Aju John" +,Title="Dynamic vnodes -- Design and Implementation" +,Booktitle="{USENIX Winter 1995}" +,Publisher="USENIX Association" +,Month="January" +,Year="1995" +,pages="11-23" +,Address="New Orleans, LA" +} + +@techreport{Slingwine95 +,author="John D. Slingwine and Paul E. McKenney" +,title="Apparatus and Method for Achieving Reduced Overhead Mutual +Exclusion and Maintaining Coherency in a Multiprocessor System +Utilizing Execution History and Thread Monitoring" +,institution="US Patent and Trademark Office" +,address="Washington, DC" +,year="1995" +,number="US Patent 5,442,758 (contributed under GPL)" +,month="August" +} + +@techreport{Slingwine97 +,author="John D. Slingwine and Paul E. McKenney" +,title="Method for maintaining data coherency using thread +activity summaries in a multicomputer system" +,institution="US Patent and Trademark Office" +,address="Washington, DC" +,year="1997" +,number="US Patent 5,608,893 (contributed under GPL)" +,month="March" +} + +@techreport{Slingwine98 +,author="John D. Slingwine and Paul E. McKenney" +,title="Apparatus and method for achieving reduced overhead +mutual exclusion and maintaining coherency in a multiprocessor +system utilizing execution history and thread monitoring" +,institution="US Patent and Trademark Office" +,address="Washington, DC" +,year="1998" +,number="US Patent 5,727,209 (contributed under GPL)" +,month="March" +} + +@Conference{McKenney98 +,Author="Paul E. McKenney and John D. Slingwine" +,Title="Read-Copy Update: Using Execution History to Solve Concurrency +Problems" +,Booktitle="{Parallel and Distributed Computing and Systems}" +,Month="October" +,Year="1998" +,pages="509-518" +,Address="Las Vegas, NV" +} + +@Conference{Gamsa99 +,Author="Ben Gamsa and Orran Krieger and Jonathan Appavoo and Michael Stumm" +,Title="Tornado: Maximizing Locality and Concurrency in a Shared Memory +Multiprocessor Operating System" +,Booktitle="{Proceedings of the 3\textsuperscript{rd} Symposium on +Operating System Design and Implementation}" +,Month="February" +,Year="1999" +,pages="87-100" +,Address="New Orleans, LA" +} + +@techreport{Slingwine01 +,author="John D. Slingwine and Paul E. McKenney" +,title="Apparatus and method for achieving reduced overhead +mutual exclusion and maintaining coherency in a multiprocessor +system utilizing execution history and thread monitoring" +,institution="US Patent and Trademark Office" +,address="Washington, DC" +,year="2001" +,number="US Patent 5,219,690 (contributed under GPL)" +,month="April" +} + +@Conference{McKenney01a +,Author="Paul E. McKenney and Jonathan Appavoo and Andi Kleen and +Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni" +,Title="Read-Copy Update" +,Booktitle="{Ottawa Linux Symposium}" +,Month="July" +,Year="2001" +,note="Available: +\url{http://www.linuxsymposium.org/2001/abstracts/readcopy.php} +\url{http://www.rdrop.com/users/paulmck/rclock/rclock_OLS.2001.05.01c.pdf} +[Viewed June 23, 2004]" +annotation=" +Described RCU, and presented some patches implementing and using it in +the Linux kernel. +" +} + +@Conference{Linder02a +,Author="Hanna Linder and Dipankar Sarma and Maneesh Soni" +,Title="Scalability of the Directory Entry Cache" +,Booktitle="{Ottawa Linux Symposium}" +,Month="June" +,Year="2002" +,pages="289-300" +} + +@Conference{McKenney02a +,Author="Paul E. McKenney and Dipankar Sarma and +Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell" +,Title="Read-Copy Update" +,Booktitle="{Ottawa Linux Symposium}" +,Month="June" +,Year="2002" +,pages="338-367" +,note="Available: +\url{http://www.linux.org.uk/~ajh/ols2002_proceedings.pdf.gz} +[Viewed June 23, 2004]" +} + +@article{Appavoo03a +,author="J. Appavoo and K. Hui and C. A. N. Soules and R. W. Wisniewski and +D. M. {Da Silva} and O. Krieger and M. A. Auslander and D. J. Edelsohn and +B. Gamsa and G. R. Ganger and P. McKenney and M. Ostrowski and +B. Rosenburg and M. Stumm and J. Xenidis" +,title="Enabling Autonomic Behavior in Systems Software With Hot Swapping" +,Year="2003" +,Month="January" +,journal="IBM Systems Journal" +,volume="42" +,number="1" +,pages="60-76" +} + +@Conference{Arcangeli03 +,Author="Andrea Arcangeli and Mingming Cao and Paul E. McKenney and +Dipankar Sarma" +,Title="Using Read-Copy Update Techniques for {System V IPC} in the +{Linux} 2.5 Kernel" +,Booktitle="Proceedings of the 2003 USENIX Annual Technical Conference +(FREENIX Track)" +,Publisher="USENIX Association" +,year="2003" +,month="June" +,pages="297-310" +} + +@article{McKenney03a +,author="Paul E. McKenney" +,title="Using {RCU} in the {Linux} 2.5 Kernel" +,Year="2003" +,Month="October" +,journal="Linux Journal" +,volume="1" +,number="114" +,pages="18-26" +} + +@article{McKenney04a +,author="Paul E. McKenney and Dipankar Sarma and Maneesh Soni" +,title="Scaling dcache with {RCU}" +,Year="2004" +,Month="January" +,journal="Linux Journal" +,volume="1" +,number="118" +,pages="38-46" +} + +@Conference{McKenney04b +,Author="Paul E. McKenney" +,Title="{RCU} vs. Locking Performance on Different {CPUs}" +,Booktitle="{linux.conf.au}" +,Month="January" +,Year="2004" +,Address="Adelaide, Australia" +,note="Available: +\url{http://www.linux.org.au/conf/2004/abstracts.html#90} +\url{http://www.rdrop.com/users/paulmck/rclock/lockperf.2004.01.17a.pdf} +[Viewed June 23, 2004]" +} + +@phdthesis{PaulEdwardMcKenneyPhD +,author="Paul E. McKenney" +,title="Exploiting Deferred Destruction: +An Analysis of Read-Copy-Update Techniques +in Operating System Kernels" +,school="OGI School of Science and Engineering at +Oregon Health and Sciences University" +,year="2004" +} + +@Conference{Sarma04c +,Author="Dipankar Sarma and Paul E. McKenney" +,Title="Making RCU Safe for Deep Sub-Millisecond Response Realtime Applications" +,Booktitle="Proceedings of the 2004 USENIX Annual Technical Conference +(FREENIX Track)" +,Publisher="USENIX Association" +,year="2004" +,month="June" +,pages="182-191" +} diff --git a/Documentation/RCU/UP.txt b/Documentation/RCU/UP.txt new file mode 100644 index 000000000000..551a803d82a8 --- /dev/null +++ b/Documentation/RCU/UP.txt @@ -0,0 +1,64 @@ +RCU on Uniprocessor Systems + + +A common misconception is that, on UP systems, the call_rcu() primitive +may immediately invoke its function, and that the synchronize_kernel +primitive may return immediately. The basis of this misconception +is that since there is only one CPU, it should not be necessary to +wait for anything else to get done, since there are no other CPUs for +anything else to be happening on. Although this approach will sort of +work a surprising amount of the time, it is a very bad idea in general. +This document presents two examples that demonstrate exactly how bad an +idea this is. + + +Example 1: softirq Suicide + +Suppose that an RCU-based algorithm scans a linked list containing +elements A, B, and C in process context, and can delete elements from +this same list in softirq context. Suppose that the process-context scan +is referencing element B when it is interrupted by softirq processing, +which deletes element B, and then invokes call_rcu() to free element B +after a grace period. + +Now, if call_rcu() were to directly invoke its arguments, then upon return +from softirq, the list scan would find itself referencing a newly freed +element B. This situation can greatly decrease the life expectancy of +your kernel. + + +Example 2: Function-Call Fatality + +Of course, one could avert the suicide described in the preceding example +by having call_rcu() directly invoke its arguments only if it was called +from process context. However, this can fail in a similar manner. + +Suppose that an RCU-based algorithm again scans a linked list containing +elements A, B, and C in process contexts, but that it invokes a function +on each element as it is scanned. Suppose further that this function +deletes element B from the list, then passes it to call_rcu() for deferred +freeing. This may be a bit unconventional, but it is perfectly legal +RCU usage, since call_rcu() must wait for a grace period to elapse. +Therefore, in this case, allowing call_rcu() to immediately invoke +its arguments would cause it to fail to make the fundamental guarantee +underlying RCU, namely that call_rcu() defers invoking its arguments until +all RCU read-side critical sections currently executing have completed. + +Quick Quiz: why is it -not- legal to invoke synchronize_kernel() in +this case? + + +Summary + +Permitting call_rcu() to immediately invoke its arguments or permitting +synchronize_kernel() to immediately return breaks RCU, even on a UP system. +So do not do it! Even on a UP system, the RCU infrastructure -must- +respect grace periods. + + +Answer to Quick Quiz + +The calling function is scanning an RCU-protected linked list, and +is therefore within an RCU read-side critical section. Therefore, +the called function has been invoked within an RCU read-side critical +section, and is not permitted to block. diff --git a/Documentation/RCU/arrayRCU.txt b/Documentation/RCU/arrayRCU.txt new file mode 100644 index 000000000000..453ebe6953ee --- /dev/null +++ b/Documentation/RCU/arrayRCU.txt @@ -0,0 +1,141 @@ +Using RCU to Protect Read-Mostly Arrays + + +Although RCU is more commonly used to protect linked lists, it can +also be used to protect arrays. Three situations are as follows: + +1. Hash Tables + +2. Static Arrays + +3. Resizeable Arrays + +Each of these situations are discussed below. + + +Situation 1: Hash Tables + +Hash tables are often implemented as an array, where each array entry +has a linked-list hash chain. Each hash chain can be protected by RCU +as described in the listRCU.txt document. This approach also applies +to other array-of-list situations, such as radix trees. + + +Situation 2: Static Arrays + +Static arrays, where the data (rather than a pointer to the data) is +located in each array element, and where the array is never resized, +have not been used with RCU. Rik van Riel recommends using seqlock in +this situation, which would also have minimal read-side overhead as long +as updates are rare. + +Quick Quiz: Why is it so important that updates be rare when + using seqlock? + + +Situation 3: Resizeable Arrays + +Use of RCU for resizeable arrays is demonstrated by the grow_ary() +function used by the System V IPC code. The array is used to map from +semaphore, message-queue, and shared-memory IDs to the data structure +that represents the corresponding IPC construct. The grow_ary() +function does not acquire any locks; instead its caller must hold the +ids->sem semaphore. + +The grow_ary() function, shown below, does some limit checks, allocates a +new ipc_id_ary, copies the old to the new portion of the new, initializes +the remainder of the new, updates the ids->entries pointer to point to +the new array, and invokes ipc_rcu_putref() to free up the old array. +Note that rcu_assign_pointer() is used to update the ids->entries pointer, +which includes any memory barriers required on whatever architecture +you are running on. + + static int grow_ary(struct ipc_ids* ids, int newsize) + { + struct ipc_id_ary* new; + struct ipc_id_ary* old; + int i; + int size = ids->entries->size; + + if(newsize > IPCMNI) + newsize = IPCMNI; + if(newsize <= size) + return newsize; + + new = ipc_rcu_alloc(sizeof(struct kern_ipc_perm *)*newsize + + sizeof(struct ipc_id_ary)); + if(new == NULL) + return size; + new->size = newsize; + memcpy(new->p, ids->entries->p, + sizeof(struct kern_ipc_perm *)*size + + sizeof(struct ipc_id_ary)); + for(i=size;i<newsize;i++) { + new->p[i] = NULL; + } + old = ids->entries; + + /* + * Use rcu_assign_pointer() to make sure the memcpyed + * contents of the new array are visible before the new + * array becomes visible. + */ + rcu_assign_pointer(ids->entries, new); + + ipc_rcu_putref(old); + return newsize; + } + +The ipc_rcu_putref() function decrements the array's reference count +and then, if the reference count has dropped to zero, uses call_rcu() +to free the array after a grace period has elapsed. + +The array is traversed by the ipc_lock() function. This function +indexes into the array under the protection of rcu_read_lock(), +using rcu_dereference() to pick up the pointer to the array so +that it may later safely be dereferenced -- memory barriers are +required on the Alpha CPU. Since the size of the array is stored +with the array itself, there can be no array-size mismatches, so +a simple check suffices. The pointer to the structure corresponding +to the desired IPC object is placed in "out", with NULL indicating +a non-existent entry. After acquiring "out->lock", the "out->deleted" +flag indicates whether the IPC object is in the process of being +deleted, and, if not, the pointer is returned. + + struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id) + { + struct kern_ipc_perm* out; + int lid = id % SEQ_MULTIPLIER; + struct ipc_id_ary* entries; + + rcu_read_lock(); + entries = rcu_dereference(ids->entries); + if(lid >= entries->size) { + rcu_read_unlock(); + return NULL; + } + out = entries->p[lid]; + if(out == NULL) { + rcu_read_unlock(); + return NULL; + } + spin_lock(&out->lock); + + /* ipc_rmid() may have already freed the ID while ipc_lock + * was spinning: here verify that the structure is still valid + */ + if (out->deleted) { + spin_unlock(&out->lock); + rcu_read_unlock(); + return NULL; + } + return out; + } + + +Answer to Quick Quiz: + + The reason that it is important that updates be rare when + using seqlock is that frequent updates can livelock readers. + One way to avoid this problem is to assign a seqlock for + each array entry rather than to the entire array. diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt new file mode 100644 index 000000000000..b3a568abe6b1 --- /dev/null +++ b/Documentation/RCU/checklist.txt @@ -0,0 +1,157 @@ +Review Checklist for RCU Patches + + +This document contains a checklist for producing and reviewing patches +that make use of RCU. Violating any of the rules listed below will +result in the same sorts of problems that leaving out a locking primitive +would cause. This list is based on experiences reviewing such patches +over a rather long period of time, but improvements are always welcome! + +0. Is RCU being applied to a read-mostly situation? If the data + structure is updated more than about 10% of the time, then + you should strongly consider some other approach, unless + detailed performance measurements show that RCU is nonetheless + the right tool for the job. + + The other exception would be where performance is not an issue, + and RCU provides a simpler implementation. An example of this + situation is the dynamic NMI code in the Linux 2.6 kernel, + at least on architectures where NMIs are rare. + +1. Does the update code have proper mutual exclusion? + + RCU does allow -readers- to run (almost) naked, but -writers- must + still use some sort of mutual exclusion, such as: + + a. locking, + b. atomic operations, or + c. restricting updates to a single task. + + If you choose #b, be prepared to describe how you have handled + memory barriers on weakly ordered machines (pretty much all of + them -- even x86 allows reads to be reordered), and be prepared + to explain why this added complexity is worthwhile. If you + choose #c, be prepared to explain how this single task does not + become a major bottleneck on big multiprocessor machines. + +2. Do the RCU read-side critical sections make proper use of + rcu_read_lock() and friends? These primitives are needed + to suppress preemption (or bottom halves, in the case of + rcu_read_lock_bh()) in the read-side critical sections, + and are also an excellent aid to readability. + +3. Does the update code tolerate concurrent accesses? + + The whole point of RCU is to permit readers to run without + any locks or atomic operations. This means that readers will + be running while updates are in progress. There are a number + of ways to handle this concurrency, depending on the situation: + + a. Make updates appear atomic to readers. For example, + pointer updates to properly aligned fields will appear + atomic, as will individual atomic primitives. Operations + performed under a lock and sequences of multiple atomic + primitives will -not- appear to be atomic. + + This is almost always the best approach. + + b. Carefully order the updates and the reads so that + readers see valid data at all phases of the update. + This is often more difficult than it sounds, especially + given modern CPUs' tendency to reorder memory references. + One must usually liberally sprinkle memory barriers + (smp_wmb(), smp_rmb(), smp_mb()) through the code, + making it difficult to understand and to test. + + It is usually better to group the changing data into + a separate structure, so that the change may be made + to appear atomic by updating a pointer to reference + a new structure containing updated values. + +4. Weakly ordered CPUs pose special challenges. Almost all CPUs + are weakly ordered -- even i386 CPUs allow reads to be reordered. + RCU code must take all of the following measures to prevent + memory-corruption problems: + + a. Readers must maintain proper ordering of their memory + accesses. The rcu_dereference() primitive ensures that + the CPU picks up the pointer before it picks up the data + that the pointer points to. This really is necessary + on Alpha CPUs. If you don't believe me, see: + + http://www.openvms.compaq.com/wizard/wiz_2637.html + + The rcu_dereference() primitive is also an excellent + documentation aid, letting the person reading the code + know exactly which pointers are protected by RCU. + + The rcu_dereference() primitive is used by the various + "_rcu()" list-traversal primitives, such as the + list_for_each_entry_rcu(). + + b. If the list macros are being used, the list_del_rcu(), + list_add_tail_rcu(), and list_del_rcu() primitives must + be used in order to prevent weakly ordered machines from + misordering structure initialization and pointer planting. + Similarly, if the hlist macros are being used, the + hlist_del_rcu() and hlist_add_head_rcu() primitives + are required. + + c. Updates must ensure that initialization of a given + structure happens before pointers to that structure are + publicized. Use the rcu_assign_pointer() primitive + when publicizing a pointer to a structure that can + be traversed by an RCU read-side critical section. + + [The rcu_assign_pointer() primitive is in process.] + +5. If call_rcu(), or a related primitive such as call_rcu_bh(), + is used, the callback function must be written to be called + from softirq context. In particular, it cannot block. + +6. Since synchronize_kernel() blocks, it cannot be called from + any sort of irq context. + +7. If the updater uses call_rcu(), then the corresponding readers + must use rcu_read_lock() and rcu_read_unlock(). If the updater + uses call_rcu_bh(), then the corresponding readers must use + rcu_read_lock_bh() and rcu_read_unlock_bh(). Mixing things up + will result in confusion and broken kernels. + + One exception to this rule: rcu_read_lock() and rcu_read_unlock() + may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh() + in cases where local bottom halves are already known to be + disabled, for example, in irq or softirq context. Commenting + such cases is a must, of course! And the jury is still out on + whether the increased speed is worth it. + +8. Although synchronize_kernel() is a bit slower than is call_rcu(), + it usually results in simpler code. So, unless update performance + is important or the updaters cannot block, synchronize_kernel() + should be used in preference to call_rcu(). + +9. All RCU list-traversal primitives, which include + list_for_each_rcu(), list_for_each_entry_rcu(), + list_for_each_continue_rcu(), and list_for_each_safe_rcu(), + must be within an RCU read-side critical section. RCU + read-side critical sections are delimited by rcu_read_lock() + and rcu_read_unlock(), or by similar primitives such as + rcu_read_lock_bh() and rcu_read_unlock_bh(). + + Use of the _rcu() list-traversal primitives outside of an + RCU read-side critical section causes no harm other than + a slight performance degradation on Alpha CPUs and some + confusion on the part of people trying to read the code. + + Another way of thinking of this is "If you are holding the + lock that prevents the data structure from changing, why do + you also need RCU-based protection?" That said, there may + well be situations where use of the _rcu() list-traversal + primitives while the update-side lock is held results in + simpler and more maintainable code. The jury is still out + on this question. + +10. Conversely, if you are in an RCU read-side critical section, + you -must- use the "_rcu()" variants of the list macros. + Failing to do so will break Alpha and confuse people reading + your code. diff --git a/Documentation/RCU/listRCU.txt b/Documentation/RCU/listRCU.txt new file mode 100644 index 000000000000..bda6ead69bd0 --- /dev/null +++ b/Documentation/RCU/listRCU.txt @@ -0,0 +1,307 @@ +Using RCU to Protect Read-Mostly Linked Lists + + +One of the best applications of RCU is to protect read-mostly linked lists +("struct list_head" in list.h). One big advantage of this approach +is that all of the required memory barriers are included for you in +the list macros. This document describes several applications of RCU, +with the best fits first. + + +Example 1: Read-Side Action Taken Outside of Lock, No In-Place Updates + +The best applications are cases where, if reader-writer locking were +used, the read-side lock would be dropped before taking any action +based on the results of the search. The most celebrated example is +the routing table. Because the routing table is tracking the state of +equipment outside of the computer, it will at times contain stale data. +Therefore, once the route has been computed, there is no need to hold +the routing table static during transmission of the packet. After all, +you can hold the routing table static all you want, but that won't keep +the external Internet from changing, and it is the state of the external +Internet that really matters. In addition, routing entries are typically +added or deleted, rather than being modified in place. + +A straightforward example of this use of RCU may be found in the +system-call auditing support. For example, a reader-writer locked +implementation of audit_filter_task() might be as follows: + + static enum audit_state audit_filter_task(struct task_struct *tsk) + { + struct audit_entry *e; + enum audit_state state; + + read_lock(&auditsc_lock); + list_for_each_entry(e, &audit_tsklist, list) { + if (audit_filter_rules(tsk, &e->rule, NULL, &state)) { + read_unlock(&auditsc_lock); + return state; + } + } + read_unlock(&auditsc_lock); + return AUDIT_BUILD_CONTEXT; + } + +Here the list is searched under the lock, but the lock is dropped before +the corresponding value is returned. By the time that this value is acted +on, the list may well have been modified. This makes sense, since if +you are turning auditing off, it is OK to audit a few extra system calls. + +This means that RCU can be easily applied to the read side, as follows: + + static enum audit_state audit_filter_task(struct task_struct *tsk) + { + struct audit_entry *e; + enum audit_state state; + + rcu_read_lock(); + list_for_each_entry_rcu(e, &audit_tsklist, list) { + if (audit_filter_rules(tsk, &e->rule, NULL, &state)) { + rcu_read_unlock(); + return state; + } + } + rcu_read_unlock(); + return AUDIT_BUILD_CONTEXT; + } + +The read_lock() and read_unlock() calls have become rcu_read_lock() +and rcu_read_unlock(), respectively, and the list_for_each_entry() has +become list_for_each_entry_rcu(). The _rcu() list-traversal primitives +insert the read-side memory barriers that are required on DEC Alpha CPUs. + +The changes to the update side are also straightforward. A reader-writer +lock might be used as follows for deletion and insertion: + + static inline int audit_del_rule(struct audit_rule *rule, + struct list_head *list) + { + struct audit_entry *e; + + write_lock(&auditsc_lock); + list_for_each_entry(e, list, list) { + if (!audit_compare_rule(rule, &e->rule)) { + list_del(&e->list); + write_unlock(&auditsc_lock); + return 0; + } + } + write_unlock(&auditsc_lock); + return -EFAULT; /* No matching rule */ + } + + static inline int audit_add_rule(struct audit_entry *entry, + struct list_head *list) + { + write_lock(&auditsc_lock); + if (entry->rule.flags & AUDIT_PREPEND) { + entry->rule.flags &= ~AUDIT_PREPEND; + list_add(&entry->list, list); + } else { + list_add_tail(&entry->list, list); + } + write_unlock(&auditsc_lock); + return 0; + } + +Following are the RCU equivalents for these two functions: + + static inline int audit_del_rule(struct audit_rule *rule, + struct list_head *list) + { + struct audit_entry *e; + + /* Do not use the _rcu iterator here, since this is the only + * deletion routine. */ + list_for_each_entry(e, list, list) { + if (!audit_compare_rule(rule, &e->rule)) { + list_del_rcu(&e->list); + call_rcu(&e->rcu, audit_free_rule, e); + return 0; + } + } + return -EFAULT; /* No matching rule */ + } + + static inline int audit_add_rule(struct audit_entry *entry, + struct list_head *list) + { + if (entry->rule.flags & AUDIT_PREPEND) { + entry->rule.flags &= ~AUDIT_PREPEND; + list_add_rcu(&entry->list, list); + } else { + list_add_tail_rcu(&entry->list, list); + } + return 0; + } + +Normally, the write_lock() and write_unlock() would be replaced by +a spin_lock() and a spin_unlock(), but in this case, all callers hold +audit_netlink_sem, so no additional locking is required. The auditsc_lock +can therefore be eliminated, since use of RCU eliminates the need for +writers to exclude readers. + +The list_del(), list_add(), and list_add_tail() primitives have been +replaced by list_del_rcu(), list_add_rcu(), and list_add_tail_rcu(). +The _rcu() list-manipulation primitives add memory barriers that are +needed on weakly ordered CPUs (most of them!). + +So, when readers can tolerate stale data and when entries are either added +or deleted, without in-place modification, it is very easy to use RCU! + + +Example 2: Handling In-Place Updates + +The system-call auditing code does not update auditing rules in place. +However, if it did, reader-writer-locked code to do so might look as +follows (presumably, the field_count is only permitted to decrease, +otherwise, the added fields would need to be filled in): + + static inline int audit_upd_rule(struct audit_rule *rule, + struct list_head *list, + __u32 newaction, + __u32 newfield_count) + { + struct audit_entry *e; + struct audit_newentry *ne; + + write_lock(&auditsc_lock); + list_for_each_entry(e, list, list) { + if (!audit_compare_rule(rule, &e->rule)) { + e->rule.action = newaction; + e->rule.file_count = newfield_count; + write_unlock(&auditsc_lock); + return 0; + } + } + write_unlock(&auditsc_lock); + return -EFAULT; /* No matching rule */ + } + +The RCU version creates a copy, updates the copy, then replaces the old +entry with the newly updated entry. This sequence of actions, allowing +concurrent reads while doing a copy to perform an update, is what gives +RCU ("read-copy update") its name. The RCU code is as follows: + + static inline int audit_upd_rule(struct audit_rule *rule, + struct list_head *list, + __u32 newaction, + __u32 newfield_count) + { + struct audit_entry *e; + struct audit_newentry *ne; + + list_for_each_entry(e, list, list) { + if (!audit_compare_rule(rule, &e->rule)) { + ne = kmalloc(sizeof(*entry), GFP_ATOMIC); + if (ne == NULL) + return -ENOMEM; + audit_copy_rule(&ne->rule, &e->rule); + ne->rule.action = newaction; + ne->rule.file_count = newfield_count; + list_add_rcu(ne, e); + list_del(e); + call_rcu(&e->rcu, audit_free_rule, e); + return 0; + } + } + return -EFAULT; /* No matching rule */ + } + +Again, this assumes that the caller holds audit_netlink_sem. Normally, +the reader-writer lock would become a spinlock in this sort of code. + + +Example 3: Eliminating Stale Data + +The auditing examples above tolerate stale data, as do most algorithms +that are tracking external state. Because there is a delay from the +time the external state changes before Linux becomes aware of the change, +additional RCU-induced staleness is normally not a problem. + +However, there are many examples where stale data cannot be tolerated. +One example in the Linux kernel is the System V IPC (see the ipc_lock() +function in ipc/util.c). This code checks a "deleted" flag under a +per-entry spinlock, and, if the "deleted" flag is set, pretends that the +entry does not exist. For this to be helpful, the search function must +return holding the per-entry spinlock, as ipc_lock() does in fact do. + +Quick Quiz: Why does the search function need to return holding the +per-entry lock for this deleted-flag technique to be helpful? + +If the system-call audit module were to ever need to reject stale data, +one way to accomplish this would be to add a "deleted" flag and a "lock" +spinlock to the audit_entry structure, and modify audit_filter_task() +as follows: + + static enum audit_state audit_filter_task(struct task_struct *tsk) + { + struct audit_entry *e; + enum audit_state state; + + rcu_read_lock(); + list_for_each_entry_rcu(e, &audit_tsklist, list) { + if (audit_filter_rules(tsk, &e->rule, NULL, &state)) { + spin_lock(&e->lock); + if (e->deleted) { + spin_unlock(&e->lock); + rcu_read_unlock(); + return AUDIT_BUILD_CONTEXT; + } + rcu_read_unlock(); + return state; + } + } + rcu_read_unlock(); + return AUDIT_BUILD_CONTEXT; + } + +Note that this example assumes that entries are only added and deleted. +Additional mechanism is required to deal correctly with the +update-in-place performed by audit_upd_rule(). For one thing, +audit_upd_rule() would need additional memory barriers to ensure +that the list_add_rcu() was really executed before the list_del_rcu(). + +The audit_del_rule() function would need to set the "deleted" +flag under the spinlock as follows: + + static inline int audit_del_rule(struct audit_rule *rule, + struct list_head *list) + { + struct audit_entry *e; + + /* Do not use the _rcu iterator here, since this is the only + * deletion routine. */ + list_for_each_entry(e, list, list) { + if (!audit_compare_rule(rule, &e->rule)) { + spin_lock(&e->lock); + list_del_rcu(&e->list); + e->deleted = 1; + spin_unlock(&e->lock); + call_rcu(&e->rcu, audit_free_rule, e); + return 0; + } + } + return -EFAULT; /* No matching rule */ + } + + +Summary + +Read-mostly list-based data structures that can tolerate stale data are +the most amenable to use of RCU. The simplest case is where entries are +either added or deleted from the data structure (or atomically modified +in place), but non-atomic in-place modifications can be handled by making +a copy, updating the copy, then replacing the original with the copy. +If stale data cannot be tolerated, then a "deleted" flag may be used +in conjunction with a per-entry spinlock in order to allow the search +function to reject newly deleted data. + + +Answer to Quick Quiz + +If the search function drops the per-entry lock before returning, then +the caller will be processing stale data in any case. If it is really +OK to be processing stale data, then you don't need a "deleted" flag. +If processing stale data really is a problem, then you need to hold the +per-entry lock across all of the code that uses the value looked up. diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt new file mode 100644 index 000000000000..7e0c2ab6f2bd --- /dev/null +++ b/Documentation/RCU/rcu.txt @@ -0,0 +1,67 @@ +RCU Concepts + + +The basic idea behind RCU (read-copy update) is to split destructive +operations into two parts, one that prevents anyone from seeing the data +item being destroyed, and one that actually carries out the destruction. +A "grace period" must elapse between the two parts, and this grace period +must be long enough that any readers accessing the item being deleted have +since dropped their references. For example, an RCU-protected deletion +from a linked list would first remove the item from the list, wait for +a grace period to elapse, then free the element. See the listRCU.txt +file for more information on using RCU with linked lists. + + +Frequently Asked Questions + +o Why would anyone want to use RCU? + + The advantage of RCU's two-part approach is that RCU readers need + not acquire any locks, perform any atomic instructions, write to + shared memory, or (on CPUs other than Alpha) execute any memory + barriers. The fact that these operations are quite expensive + on modern CPUs is what gives RCU its performance advantages + in read-mostly situations. The fact that RCU readers need not + acquire locks can also greatly simplify deadlock-avoidance code. + +o How can the updater tell when a grace period has completed + if the RCU readers give no indication when they are done? + + Just as with spinlocks, RCU readers are not permitted to + block, switch to user-mode execution, or enter the idle loop. + Therefore, as soon as a CPU is seen passing through any of these + three states, we know that that CPU has exited any previous RCU + read-side critical sections. So, if we remove an item from a + linked list, and then wait until all CPUs have switched context, + executed in user mode, or executed in the idle loop, we can + safely free up that item. + +o If I am running on a uniprocessor kernel, which can only do one + thing at a time, why should I wait for a grace period? + + See the UP.txt file in this directory. + +o How can I see where RCU is currently used in the Linux kernel? + + Search for "rcu_read_lock", "call_rcu", and "synchronize_kernel". + +o What guidelines should I follow when writing code that uses RCU? + + See the checklist.txt file in this directory. + +o Why the name "RCU"? + + "RCU" stands for "read-copy update". The file listRCU.txt has + more information on where this name came from, search for + "read-copy update" to find it. + +o I hear that RCU is patented? What is with that? + + Yes, it is. There are several known patents related to RCU, + search for the string "Patent" in RTFP.txt to find them. + Of these, one was allowed to lapse by the assignee, and the + others have been contributed to the Linux kernel under GPL. + +o Where can I find more information on RCU? + + See the RTFP.txt file in this directory. diff --git a/Documentation/README.DAC960 b/Documentation/README.DAC960 new file mode 100644 index 000000000000..98ea617a0dd6 --- /dev/null +++ b/Documentation/README.DAC960 @@ -0,0 +1,756 @@ + Linux Driver for Mylex DAC960/AcceleRAID/eXtremeRAID PCI RAID Controllers + + Version 2.2.11 for Linux 2.2.19 + Version 2.4.11 for Linux 2.4.12 + + PRODUCTION RELEASE + + 11 October 2001 + + Leonard N. Zubkoff + Dandelion Digital + lnz@dandelion.com + + Copyright 1998-2001 by Leonard N. Zubkoff <lnz@dandelion.com> + + + INTRODUCTION + +Mylex, Inc. designs and manufactures a variety of high performance PCI RAID +controllers. Mylex Corporation is located at 34551 Ardenwood Blvd., Fremont, +California 94555, USA and can be reached at 510.796.6100 or on the World Wide +Web at http://www.mylex.com. Mylex Technical Support can be reached by +electronic mail at mylexsup@us.ibm.com, by voice at 510.608.2400, or by FAX at +510.745.7715. Contact information for offices in Europe and Japan is available +on their Web site. + +The latest information on Linux support for DAC960 PCI RAID Controllers, as +well as the most recent release of this driver, will always be available from +my Linux Home Page at URL "http://www.dandelion.com/Linux/". The Linux DAC960 +driver supports all current Mylex PCI RAID controllers including the new +eXtremeRAID 2000/3000 and AcceleRAID 352/170/160 models which have an entirely +new firmware interface from the older eXtremeRAID 1100, AcceleRAID 150/200/250, +and DAC960PJ/PG/PU/PD/PL. See below for a complete controller list as well as +minimum firmware version requirements. For simplicity, in most places this +documentation refers to DAC960 generically rather than explicitly listing all +the supported models. + +Driver bug reports should be sent via electronic mail to "lnz@dandelion.com". +Please include with the bug report the complete configuration messages reported +by the driver at startup, along with any subsequent system messages relevant to +the controller's operation, and a detailed description of your system's +hardware configuration. Driver bugs are actually quite rare; if you encounter +problems with disks being marked offline, for example, please contact Mylex +Technical Support as the problem is related to the hardware configuration +rather than the Linux driver. + +Please consult the RAID controller documentation for detailed information +regarding installation and configuration of the controllers. This document +primarily provides information specific to the Linux support. + + + DRIVER FEATURES + +The DAC960 RAID controllers are supported solely as high performance RAID +controllers, not as interfaces to arbitrary SCSI devices. The Linux DAC960 +driver operates at the block device level, the same level as the SCSI and IDE +drivers. Unlike other RAID controllers currently supported on Linux, the +DAC960 driver is not dependent on the SCSI subsystem, and hence avoids all the +complexity and unnecessary code that would be associated with an implementation +as a SCSI driver. The DAC960 driver is designed for as high a performance as +possible with no compromises or extra code for compatibility with lower +performance devices. The DAC960 driver includes extensive error logging and +online configuration management capabilities. Except for initial configuration +of the controller and adding new disk drives, most everything can be handled +from Linux while the system is operational. + +The DAC960 driver is architected to support up to 8 controllers per system. +Each DAC960 parallel SCSI controller can support up to 15 disk drives per +channel, for a maximum of 60 drives on a four channel controller; the fibre +channel eXtremeRAID 3000 controller supports up to 125 disk drives per loop for +a total of 250 drives. The drives installed on a controller are divided into +one or more "Drive Groups", and then each Drive Group is subdivided further +into 1 to 32 "Logical Drives". Each Logical Drive has a specific RAID Level +and caching policy associated with it, and it appears to Linux as a single +block device. Logical Drives are further subdivided into up to 7 partitions +through the normal Linux and PC disk partitioning schemes. Logical Drives are +also known as "System Drives", and Drive Groups are also called "Packs". Both +terms are in use in the Mylex documentation; I have chosen to standardize on +the more generic "Logical Drive" and "Drive Group". + +DAC960 RAID disk devices are named in the style of the Device File System +(DEVFS). The device corresponding to Logical Drive D on Controller C is +referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1 +through /dev/rd/cCdDp7. For example, partition 3 of Logical Drive 5 on +Controller 2 is referred to as /dev/rd/c2d5p3. Note that unlike with SCSI +disks the device names will not change in the event of a disk drive failure. +The DAC960 driver is assigned major numbers 48 - 55 with one major number per +controller. The 8 bits of minor number are divided into 5 bits for the Logical +Drive and 3 bits for the partition. + + + SUPPORTED DAC960/AcceleRAID/eXtremeRAID PCI RAID CONTROLLERS + +The following list comprises the supported DAC960, AcceleRAID, and eXtremeRAID +PCI RAID Controllers as of the date of this document. It is recommended that +anyone purchasing a Mylex PCI RAID Controller not in the following table +contact the author beforehand to verify that it is or will be supported. + +eXtremeRAID 3000 + 1 Wide Ultra-2/LVD SCSI channel + 2 External Fibre FC-AL channels + 233MHz StrongARM SA 110 Processor + 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) + 32MB/64MB ECC SDRAM Memory + +eXtremeRAID 2000 + 4 Wide Ultra-160 LVD SCSI channels + 233MHz StrongARM SA 110 Processor + 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) + 32MB/64MB ECC SDRAM Memory + +AcceleRAID 352 + 2 Wide Ultra-160 LVD SCSI channels + 100MHz Intel i960RN RISC Processor + 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) + 32MB/64MB ECC SDRAM Memory + +AcceleRAID 170 + 1 Wide Ultra-160 LVD SCSI channel + 100MHz Intel i960RM RISC Processor + 16MB/32MB/64MB ECC SDRAM Memory + +AcceleRAID 160 (AcceleRAID 170LP) + 1 Wide Ultra-160 LVD SCSI channel + 100MHz Intel i960RS RISC Processor + Built in 16M ECC SDRAM Memory + PCI Low Profile Form Factor - fit for 2U height + +eXtremeRAID 1100 (DAC1164P) + 3 Wide Ultra-2/LVD SCSI channels + 233MHz StrongARM SA 110 Processor + 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) + 16MB/32MB/64MB Parity SDRAM Memory with Battery Backup + +AcceleRAID 250 (DAC960PTL1) + Uses onboard Symbios SCSI chips on certain motherboards + Also includes one onboard Wide Ultra-2/LVD SCSI Channel + 66MHz Intel i960RD RISC Processor + 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory + +AcceleRAID 200 (DAC960PTL0) + Uses onboard Symbios SCSI chips on certain motherboards + Includes no onboard SCSI Channels + 66MHz Intel i960RD RISC Processor + 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory + +AcceleRAID 150 (DAC960PRL) + Uses onboard Symbios SCSI chips on certain motherboards + Also includes one onboard Wide Ultra-2/LVD SCSI Channel + 33MHz Intel i960RP RISC Processor + 4MB Parity EDO Memory + +DAC960PJ 1/2/3 Wide Ultra SCSI-3 Channels + 66MHz Intel i960RD RISC Processor + 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory + +DAC960PG 1/2/3 Wide Ultra SCSI-3 Channels + 33MHz Intel i960RP RISC Processor + 4MB/8MB ECC EDO Memory + +DAC960PU 1/2/3 Wide Ultra SCSI-3 Channels + Intel i960CF RISC Processor + 4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory + +DAC960PD 1/2/3 Wide Fast SCSI-2 Channels + Intel i960CF RISC Processor + 4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory + +DAC960PL 1/2/3 Wide Fast SCSI-2 Channels + Intel i960 RISC Processor + 2MB/4MB/8MB/16MB/32MB DRAM Memory + +DAC960P 1/2/3 Wide Fast SCSI-2 Channels + Intel i960 RISC Processor + 2MB/4MB/8MB/16MB/32MB DRAM Memory + +For the eXtremeRAID 2000/3000 and AcceleRAID 352/170/160, firmware version +6.00-01 or above is required. + +For the eXtremeRAID 1100, firmware version 5.06-0-52 or above is required. + +For the AcceleRAID 250, 200, and 150, firmware version 4.06-0-57 or above is +required. + +For the DAC960PJ and DAC960PG, firmware version 4.06-0-00 or above is required. + +For the DAC960PU, DAC960PD, DAC960PL, and DAC960P, either firmware version +3.51-0-04 or above is required (for dual Flash ROM controllers), or firmware +version 2.73-0-00 or above is required (for single Flash ROM controllers) + +Please note that not all SCSI disk drives are suitable for use with DAC960 +controllers, and only particular firmware versions of any given model may +actually function correctly. Similarly, not all motherboards have a BIOS that +properly initializes the AcceleRAID 250, AcceleRAID 200, AcceleRAID 150, +DAC960PJ, and DAC960PG because the Intel i960RD/RP is a multi-function device. +If in doubt, contact Mylex RAID Technical Support (mylexsup@us.ibm.com) to +verify compatibility. Mylex makes available a hard disk compatibility list at +http://www.mylex.com/support/hdcomp/hd-lists.html. + + + DRIVER INSTALLATION + +This distribution was prepared for Linux kernel version 2.2.19 or 2.4.12. + +To install the DAC960 RAID driver, you may use the following commands, +replacing "/usr/src" with wherever you keep your Linux kernel source tree: + + cd /usr/src + tar -xvzf DAC960-2.2.11.tar.gz (or DAC960-2.4.11.tar.gz) + mv README.DAC960 linux/Documentation + mv DAC960.[ch] linux/drivers/block + patch -p0 < DAC960.patch (if DAC960.patch is included) + cd linux + make config + make bzImage (or zImage) + +Then install "arch/i386/boot/bzImage" or "arch/i386/boot/zImage" as your +standard kernel, run lilo if appropriate, and reboot. + +To create the necessary devices in /dev, the "make_rd" script included in +"DAC960-Utilities.tar.gz" from http://www.dandelion.com/Linux/ may be used. +LILO 21 and FDISK v2.9 include DAC960 support; also included in this archive +are patches to LILO 20 and FDISK v2.8 that add DAC960 support, along with +statically linked executables of LILO and FDISK. This modified version of LILO +will allow booting from a DAC960 controller and/or mounting the root file +system from a DAC960. + +Red Hat Linux 6.0 and SuSE Linux 6.1 include support for Mylex PCI RAID +controllers. Installing directly onto a DAC960 may be problematic from other +Linux distributions until their installation utilities are updated. + + + INSTALLATION NOTES + +Before installing Linux or adding DAC960 logical drives to an existing Linux +system, the controller must first be configured to provide one or more logical +drives using the BIOS Configuration Utility or DACCF. Please note that since +there are only at most 6 usable partitions on each logical drive, systems +requiring more partitions should subdivide a drive group into multiple logical +drives, each of which can have up to 6 usable partitions. Also, note that with +large disk arrays it is advisable to enable the 8GB BIOS Geometry (255/63) +rather than accepting the default 2GB BIOS Geometry (128/32); failing to so do +will cause the logical drive geometry to have more than 65535 cylinders which +will make it impossible for FDISK to be used properly. The 8GB BIOS Geometry +can be enabled by configuring the DAC960 BIOS, which is accessible via Alt-M +during the BIOS initialization sequence. + +For maximum performance and the most efficient E2FSCK performance, it is +recommended that EXT2 file systems be built with a 4KB block size and 16 block +stride to match the DAC960 controller's 64KB default stripe size. The command +"mke2fs -b 4096 -R stride=16 <device>" is appropriate. Unless there will be a +large number of small files on the file systems, it is also beneficial to add +the "-i 16384" option to increase the bytes per inode parameter thereby +reducing the file system metadata. Finally, on systems that will only be run +with Linux 2.2 or later kernels it is beneficial to enable sparse superblocks +with the "-s 1" option. + + + DAC960 ANNOUNCEMENTS MAILING LIST + +The DAC960 Announcements Mailing List provides a forum for informing Linux +users of new driver releases and other announcements regarding Linux support +for DAC960 PCI RAID Controllers. To join the mailing list, send a message to +"dac960-announce-request@dandelion.com" with the line "subscribe" in the +message body. + + + CONTROLLER CONFIGURATION AND STATUS MONITORING + +The DAC960 RAID controllers running firmware 4.06 or above include a Background +Initialization facility so that system downtime is minimized both for initial +installation and subsequent configuration of additional storage. The BIOS +Configuration Utility (accessible via Alt-R during the BIOS initialization +sequence) is used to quickly configure the controller, and then the logical +drives that have been created are available for immediate use even while they +are still being initialized by the controller. The primary need for online +configuration and status monitoring is then to avoid system downtime when disk +drives fail and must be replaced. Mylex's online monitoring and configuration +utilities are being ported to Linux and will become available at some point in +the future. Note that with a SAF-TE (SCSI Accessed Fault-Tolerant Enclosure) +enclosure, the controller is able to rebuild failed drives automatically as +soon as a drive replacement is made available. + +The primary interfaces for controller configuration and status monitoring are +special files created in the /proc/rd/... hierarchy along with the normal +system console logging mechanism. Whenever the system is operating, the DAC960 +driver queries each controller for status information every 10 seconds, and +checks for additional conditions every 60 seconds. The initial status of each +controller is always available for controller N in /proc/rd/cN/initial_status, +and the current status as of the last status monitoring query is available in +/proc/rd/cN/current_status. In addition, status changes are also logged by the +driver to the system console and will appear in the log files maintained by +syslog. The progress of asynchronous rebuild or consistency check operations +is also available in /proc/rd/cN/current_status, and progress messages are +logged to the system console at most every 60 seconds. + +Starting with the 2.2.3/2.0.3 versions of the driver, the status information +available in /proc/rd/cN/initial_status and /proc/rd/cN/current_status has been +augmented to include the vendor, model, revision, and serial number (if +available) for each physical device found connected to the controller: + +***** DAC960 RAID Driver Version 2.2.3 of 19 August 1999 ***** +Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> +Configuring Mylex DAC960PRL PCI RAID Controller + Firmware Version: 4.07-0-07, Channels: 1, Memory Size: 16MB + PCI Bus: 1, Device: 4, Function: 1, I/O Address: Unassigned + PCI Address: 0xFE300000 mapped at 0xA0800000, IRQ Channel: 21 + Controller Queue Depth: 128, Maximum Blocks per Command: 128 + Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 + Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 + SAF-TE Enclosure Management Enabled + Physical Devices: + 0:0 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 68016775HA + Disk Status: Online, 17928192 blocks + 0:1 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 68004E53HA + Disk Status: Online, 17928192 blocks + 0:2 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 13013935HA + Disk Status: Online, 17928192 blocks + 0:3 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 13016897HA + Disk Status: Online, 17928192 blocks + 0:4 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 68019905HA + Disk Status: Online, 17928192 blocks + 0:5 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 68012753HA + Disk Status: Online, 17928192 blocks + 0:6 Vendor: ESG-SHV Model: SCA HSBP M6 Revision: 0.61 + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 89640960 blocks, Write Thru + No Rebuild or Consistency Check in Progress + +To simplify the monitoring process for custom software, the special file +/proc/rd/status returns "OK" when all DAC960 controllers in the system are +operating normally and no failures have occurred, or "ALERT" if any logical +drives are offline or critical or any non-standby physical drives are dead. + +Configuration commands for controller N are available via the special file +/proc/rd/cN/user_command. A human readable command can be written to this +special file to initiate a configuration operation, and the results of the +operation can then be read back from the special file in addition to being +logged to the system console. The shell command sequence + + echo "<configuration-command>" > /proc/rd/c0/user_command + cat /proc/rd/c0/user_command + +is typically used to execute configuration commands. The configuration +commands are: + + flush-cache + + The "flush-cache" command flushes the controller's cache. The system + automatically flushes the cache at shutdown or if the driver module is + unloaded, so this command is only needed to be certain a write back cache + is flushed to disk before the system is powered off by a command to a UPS. + Note that the flush-cache command also stops an asynchronous rebuild or + consistency check, so it should not be used except when the system is being + halted. + + kill <channel>:<target-id> + + The "kill" command marks the physical drive <channel>:<target-id> as DEAD. + This command is provided primarily for testing, and should not be used + during normal system operation. + + make-online <channel>:<target-id> + + The "make-online" command changes the physical drive <channel>:<target-id> + from status DEAD to status ONLINE. In cases where multiple physical drives + have been killed simultaneously, this command may be used to bring all but + one of them back online, after which a rebuild to the final drive is + necessary. + + Warning: make-online should only be used on a dead physical drive that is + an active part of a drive group, never on a standby drive. The command + should never be used on a dead drive that is part of a critical logical + drive; rebuild should be used if only a single drive is dead. + + make-standby <channel>:<target-id> + + The "make-standby" command changes physical drive <channel>:<target-id> + from status DEAD to status STANDBY. It should only be used in cases where + a dead drive was replaced after an automatic rebuild was performed onto a + standby drive. It cannot be used to add a standby drive to the controller + configuration if one was not created initially; the BIOS Configuration + Utility must be used for that currently. + + rebuild <channel>:<target-id> + + The "rebuild" command initiates an asynchronous rebuild onto physical drive + <channel>:<target-id>. It should only be used when a dead drive has been + replaced. + + check-consistency <logical-drive-number> + + The "check-consistency" command initiates an asynchronous consistency check + of <logical-drive-number> with automatic restoration. It can be used + whenever it is desired to verify the consistency of the redundancy + information. + + cancel-rebuild + cancel-consistency-check + + The "cancel-rebuild" and "cancel-consistency-check" commands cancel any + rebuild or consistency check operations previously initiated. + + + EXAMPLE I - DRIVE FAILURE WITHOUT A STANDBY DRIVE + +The following annotated logs demonstrate the controller configuration and and +online status monitoring capabilities of the Linux DAC960 Driver. The test +configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a +DAC960PJ controller. The physical drives are configured into a single drive +group without a standby drive, and the drive group has been configured into two +logical drives, one RAID-5 and one RAID-6. Note that these logs are from an +earlier version of the driver and the messages have changed somewhat with newer +releases, but the functionality remains similar. First, here is the current +status of the RAID configuration: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status +***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** +Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> +Configuring Mylex DAC960PJ PCI RAID Controller + Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB + PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned + PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 + Controller Queue Depth: 128, Maximum Blocks per Command: 128 + Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 + Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru + No Rebuild or Consistency Check in Progress + +gwynedd:/u/lnz# cat /proc/rd/status +OK + +The above messages indicate that everything is healthy, and /proc/rd/status +returns "OK" indicating that there are no problems with any DAC960 controller +in the system. For demonstration purposes, while I/O is active Physical Drive +1:1 is now disconnected, simulating a drive failure. The failure is noted by +the driver within 10 seconds of the controller's having detected it, and the +driver logs the following console status messages indicating that Logical +Drives 0 and 1 are now CRITICAL as a result of Physical Drive 1:1 being DEAD: + +DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 +DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 +DAC960#0: Physical Drive 1:1 killed because of timeout on SCSI command +DAC960#0: Physical Drive 1:1 is now DEAD +DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL +DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL + +The Sense Keys logged here are just Check Condition / Unit Attention conditions +arising from a SCSI bus reset that is forced by the controller during its error +recovery procedures. Concurrently with the above, the driver status available +from /proc/rd also reflects the drive failure. The status message in +/proc/rd/status has changed from "OK" to "ALERT": + +gwynedd:/u/lnz# cat /proc/rd/status +ALERT + +and /proc/rd/c0/current_status has been updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Dead, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru + No Rebuild or Consistency Check in Progress + +Since there are no standby drives configured, the system can continue to access +the logical drives in a performance degraded mode until the failed drive is +replaced and a rebuild operation completed to restore the redundancy of the +logical drives. Once Physical Drive 1:1 is replaced with a properly +functioning drive, or if the physical drive was killed without having failed +(e.g., due to electrical problems on the SCSI bus), the user can instruct the +controller to initiate a rebuild operation onto the newly replaced drive: + +gwynedd:/u/lnz# echo "rebuild 1:1" > /proc/rd/c0/user_command +gwynedd:/u/lnz# cat /proc/rd/c0/user_command +Rebuild of Physical Drive 1:1 Initiated + +The echo command instructs the controller to initiate an asynchronous rebuild +operation onto Physical Drive 1:1, and the status message that results from the +operation is then available for reading from /proc/rd/c0/user_command, as well +as being logged to the console by the driver. + +Within 10 seconds of this command the driver logs the initiation of the +asynchronous rebuild operation: + +DAC960#0: Rebuild of Physical Drive 1:1 Initiated +DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01 +DAC960#0: Physical Drive 1:1 is now WRITE-ONLY +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 1% completed + +and /proc/rd/c0/current_status is updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Write-Only, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru + Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 6% completed + +As the rebuild progresses, the current status in /proc/rd/c0/current_status is +updated every 10 seconds: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Write-Only, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru + Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 15% completed + +and every minute a progress message is logged to the console by the driver: + +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 32% completed +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 63% completed +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 94% completed +DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 94% completed + +Finally, the rebuild completes successfully. The driver logs the status of the +logical and physical drives and the rebuild completion: + +DAC960#0: Rebuild Completed Successfully +DAC960#0: Physical Drive 1:1 is now ONLINE +DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE +DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE + +/proc/rd/c0/current_status is updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru + Rebuild Completed Successfully + +and /proc/rd/status indicates that everything is healthy once again: + +gwynedd:/u/lnz# cat /proc/rd/status +OK + + + EXAMPLE II - DRIVE FAILURE WITH A STANDBY DRIVE + +The following annotated logs demonstrate the controller configuration and and +online status monitoring capabilities of the Linux DAC960 Driver. The test +configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a +DAC960PJ controller. The physical drives are configured into a single drive +group with a standby drive, and the drive group has been configured into two +logical drives, one RAID-5 and one RAID-6. Note that these logs are from an +earlier version of the driver and the messages have changed somewhat with newer +releases, but the functionality remains similar. First, here is the current +status of the RAID configuration: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status +***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** +Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> +Configuring Mylex DAC960PJ PCI RAID Controller + Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB + PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned + PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 + Controller Queue Depth: 128, Maximum Blocks per Command: 128 + Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 + Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Standby, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru + No Rebuild or Consistency Check in Progress + +gwynedd:/u/lnz# cat /proc/rd/status +OK + +The above messages indicate that everything is healthy, and /proc/rd/status +returns "OK" indicating that there are no problems with any DAC960 controller +in the system. For demonstration purposes, while I/O is active Physical Drive +1:2 is now disconnected, simulating a drive failure. The failure is noted by +the driver within 10 seconds of the controller's having detected it, and the +driver logs the following console status messages: + +DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 +DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 +DAC960#0: Physical Drive 1:2 killed because of timeout on SCSI command +DAC960#0: Physical Drive 1:2 is now DEAD +DAC960#0: Physical Drive 1:2 killed because it was removed +DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL +DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL + +Since a standby drive is configured, the controller automatically begins +rebuilding onto the standby drive: + +DAC960#0: Physical Drive 1:3 is now WRITE-ONLY +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed + +Concurrently with the above, the driver status available from /proc/rd also +reflects the drive failure and automatic rebuild. The status message in +/proc/rd/status has changed from "OK" to "ALERT": + +gwynedd:/u/lnz# cat /proc/rd/status +ALERT + +and /proc/rd/c0/current_status has been updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Dead, 2201600 blocks + 1:3 - Disk: Write-Only, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru + Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed + +As the rebuild progresses, the current status in /proc/rd/c0/current_status is +updated every 10 seconds: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Dead, 2201600 blocks + 1:3 - Disk: Write-Only, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru + Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed + +and every minute a progress message is logged on the console by the driver: + +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 76% completed +DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 66% completed +DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 84% completed + +Finally, the rebuild completes successfully. The driver logs the status of the +logical and physical drives and the rebuild completion: + +DAC960#0: Rebuild Completed Successfully +DAC960#0: Physical Drive 1:3 is now ONLINE +DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE +DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE + +/proc/rd/c0/current_status is updated: + +***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** +Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> +Configuring Mylex DAC960PJ PCI RAID Controller + Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB + PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned + PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 + Controller Queue Depth: 128, Maximum Blocks per Command: 128 + Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 + Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Dead, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru + Rebuild Completed Successfully + +and /proc/rd/status indicates that everything is healthy once again: + +gwynedd:/u/lnz# cat /proc/rd/status +OK + +Note that the absence of a viable standby drive does not create an "ALERT" +status. Once dead Physical Drive 1:2 has been replaced, the controller must be +told that this has occurred and that the newly replaced drive should become the +new standby drive: + +gwynedd:/u/lnz# echo "make-standby 1:2" > /proc/rd/c0/user_command +gwynedd:/u/lnz# cat /proc/rd/c0/user_command +Make Standby of Physical Drive 1:2 Succeeded + +The echo command instructs the controller to make Physical Drive 1:2 into a +standby drive, and the status message that results from the operation is then +available for reading from /proc/rd/c0/user_command, as well as being logged to +the console by the driver. Within 60 seconds of this command the driver logs: + +DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01 +DAC960#0: Physical Drive 1:2 is now STANDBY +DAC960#0: Make Standby of Physical Drive 1:2 Succeeded + +and /proc/rd/c0/current_status is updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Standby, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru + Rebuild Completed Successfully diff --git a/Documentation/README.cycladesZ b/Documentation/README.cycladesZ new file mode 100644 index 000000000000..024a69443cc2 --- /dev/null +++ b/Documentation/README.cycladesZ @@ -0,0 +1,8 @@ + +The Cyclades-Z must have firmware loaded onto the card before it will +operate. This operation should be performed during system startup, + +The firmware, loader program and the latest device driver code are +available from Cyclades at + ftp://ftp.cyclades.com/pub/cyclades/cyclades-z/linux/ + diff --git a/Documentation/SAK.txt b/Documentation/SAK.txt new file mode 100644 index 000000000000..b9019ca872ea --- /dev/null +++ b/Documentation/SAK.txt @@ -0,0 +1,88 @@ +Linux 2.4.2 Secure Attention Key (SAK) handling +18 March 2001, Andrew Morton <akpm@osdl.org> + +An operating system's Secure Attention Key is a security tool which is +provided as protection against trojan password capturing programs. It +is an undefeatable way of killing all programs which could be +masquerading as login applications. Users need to be taught to enter +this key sequence before they log in to the system. + +From the PC keyboard, Linux has two similar but different ways of +providing SAK. One is the ALT-SYSRQ-K sequence. You shouldn't use +this sequence. It is only available if the kernel was compiled with +sysrq support. + +The proper way of generating a SAK is to define the key sequence using +`loadkeys'. This will work whether or not sysrq support is compiled +into the kernel. + +SAK works correctly when the keyboard is in raw mode. This means that +once defined, SAK will kill a running X server. If the system is in +run level 5, the X server will restart. This is what you want to +happen. + +What key sequence should you use? Well, CTRL-ALT-DEL is used to reboot +the machine. CTRL-ALT-BACKSPACE is magical to the X server. We'll +choose CTRL-ALT-PAUSE. + +In your rc.sysinit (or rc.local) file, add the command + + echo "control alt keycode 101 = SAK" | /bin/loadkeys + +And that's it! Only the superuser may reprogram the SAK key. + + +NOTES +===== + +1: Linux SAK is said to be not a "true SAK" as is required by + systems which implement C2 level security. This author does not + know why. + + +2: On the PC keyboard, SAK kills all applications which have + /dev/console opened. + + Unfortunately this includes a number of things which you don't + actually want killed. This is because these applications are + incorrectly holding /dev/console open. Be sure to complain to your + Linux distributor about this! + + You can identify processes which will be killed by SAK with the + command + + # ls -l /proc/[0-9]*/fd/* | grep console + l-wx------ 1 root root 64 Mar 18 00:46 /proc/579/fd/0 -> /dev/console + + Then: + + # ps aux|grep 579 + root 579 0.0 0.1 1088 436 ? S 00:43 0:00 gpm -t ps/2 + + So `gpm' will be killed by SAK. This is a bug in gpm. It should + be closing standard input. You can work around this by finding the + initscript which launches gpm and changing it thusly: + + Old: + + daemon gpm + + New: + + daemon gpm < /dev/null + + Vixie cron also seems to have this problem, and needs the same treatment. + + Also, one prominent Linux distribution has the following three + lines in its rc.sysinit and rc scripts: + + exec 3<&0 + exec 4>&1 + exec 5>&2 + + These commands cause *all* daemons which are launched by the + initscripts to have file descriptors 3, 4 and 5 attached to + /dev/console. So SAK kills them all. A workaround is to simply + delete these lines, but this may cause system management + applications to malfunction - test everything well. + diff --git a/Documentation/SecurityBugs b/Documentation/SecurityBugs new file mode 100644 index 000000000000..26c3b3635d9f --- /dev/null +++ b/Documentation/SecurityBugs @@ -0,0 +1,38 @@ +Linux kernel developers take security very seriously. As such, we'd +like to know when a security bug is found so that it can be fixed and +disclosed as quickly as possible. Please report security bugs to the +Linux kernel security team. + +1) Contact + +The Linux kernel security team can be contacted by email at +<security@kernel.org>. This is a private list of security officers +who will help verify the bug report and develop and release a fix. +It is possible that the security team will bring in extra help from +area maintainers to understand and fix the security vulnerability. + +As it is with any bug, the more information provided the easier it +will be to diagnose and fix. Please review the procedure outlined in +REPORTING-BUGS if you are unclear about what information is helpful. +Any exploit code is very helpful and will not be released without +consent from the reporter unless it has already been made public. + +2) Disclosure + +The goal of the Linux kernel security team is to work with the +bug submitter to bug resolution as well as disclosure. We prefer +to fully disclose the bug as soon as possible. It is reasonable to +delay disclosure when the bug or the fix is not yet fully understood, +the solution is not well-tested or for vendor coordination. However, we +expect these delays to be short, measurable in days, not weeks or months. +A disclosure date is negotiated by the security team working with the +bug submitter as well as vendors. However, the kernel security team +holds the final say when setting a disclosure date. The timeframe for +disclosure is from immediate (esp. if it's already publically known) +to a few weeks. As a basic default policy, we expect report date to +disclosure date to be on the order of 7 days. + +3) Non-disclosure agreements + +The Linux kernel security team is not a formal body and therefore unable +to enter any non-disclosure agreements. diff --git a/Documentation/SubmittingDrivers b/Documentation/SubmittingDrivers new file mode 100644 index 000000000000..de3b252e717d --- /dev/null +++ b/Documentation/SubmittingDrivers @@ -0,0 +1,145 @@ +Submitting Drivers For The Linux Kernel +--------------------------------------- + +This document is intended to explain how to submit device drivers to the +various kernel trees. Note that if you are interested in video card drivers +you should probably talk to XFree86 (http://www.xfree86.org/) and/or X.Org +(http://x.org/) instead. + +Also read the Documentation/SubmittingPatches document. + + +Allocating Device Numbers +------------------------- + +Major and minor numbers for block and character devices are allocated +by the Linux assigned name and number authority (currently better +known as H Peter Anvin). The site is http://www.lanana.org/. This +also deals with allocating numbers for devices that are not going to +be submitted to the mainstream kernel. + +If you don't use assigned numbers then when you device is submitted it will +get given an assigned number even if that is different from values you may +have shipped to customers before. + +Who To Submit Drivers To +------------------------ + +Linux 2.0: + No new drivers are accepted for this kernel tree + +Linux 2.2: + If the code area has a general maintainer then please submit it to + the maintainer listed in MAINTAINERS in the kernel file. If the + maintainer does not respond or you cannot find the appropriate + maintainer then please contact Alan Cox <alan@lxorguk.ukuu.org.uk> + +Linux 2.4: + The same rules apply as 2.2. The final contact point for Linux 2.4 + submissions is Marcelo Tosatti <marcelo.tosatti@cyclades.com>. + +Linux 2.6: + The same rules apply as 2.4 except that you should follow linux-kernel + to track changes in API's. The final contact point for Linux 2.6 + submissions is Andrew Morton <akpm@osdl.org>. + +What Criteria Determine Acceptance +---------------------------------- + +Licensing: The code must be released to us under the + GNU General Public License. We don't insist on any kind + of exclusively GPL licensing, and if you wish the driver + to be useful to other communities such as BSD you may well + wish to release under multiple licenses. + +Copyright: The copyright owner must agree to use of GPL. + It's best if the submitter and copyright owner + are the same person/entity. If not, the name of + the person/entity authorizing use of GPL should be + listed in case it's necessary to verify the will of + the copright owner. + +Interfaces: If your driver uses existing interfaces and behaves like + other drivers in the same class it will be much more likely + to be accepted than if it invents gratuitous new ones. + If you need to implement a common API over Linux and NT + drivers do it in userspace. + +Code: Please use the Linux style of code formatting as documented + in Documentation/CodingStyle. If you have sections of code + that need to be in other formats, for example because they + are shared with a windows driver kit and you want to + maintain them just once separate them out nicely and note + this fact. + +Portability: Pointers are not always 32bits, not all computers are little + endian, people do not all have floating point and you + shouldn't use inline x86 assembler in your driver without + careful thought. Pure x86 drivers generally are not popular. + If you only have x86 hardware it is hard to test portability + but it is easy to make sure the code can easily be made + portable. + +Clarity: It helps if anyone can see how to fix the driver. It helps + you because you get patches not bug reports. If you submit a + driver that intentionally obfuscates how the hardware works + it will go in the bitbucket. + +Control: In general if there is active maintainance of a driver by + the author then patches will be redirected to them unless + they are totally obvious and without need of checking. + If you want to be the contact and update point for the + driver it is a good idea to state this in the comments, + and include an entry in MAINTAINERS for your driver. + +What Criteria Do Not Determine Acceptance +----------------------------------------- + +Vendor: Being the hardware vendor and maintaining the driver is + often a good thing. If there is a stable working driver from + other people already in the tree don't expect 'we are the + vendor' to get your driver chosen. Ideally work with the + existing driver author to build a single perfect driver. + +Author: It doesn't matter if a large Linux company wrote the driver, + or you did. Nobody has any special access to the kernel + tree. Anyone who tells you otherwise isn't telling the + whole story. + + +Resources +--------- + +Linux kernel master tree: + ftp.??.kernel.org:/pub/linux/kernel/... + ?? == your country code, such as "us", "uk", "fr", etc. + +Linux kernel mailing list: + linux-kernel@vger.kernel.org + [mail majordomo@vger.kernel.org to subscribe] + +Linux Device Drivers, Third Edition (covers 2.6.10): + http://lwn.net/Kernel/LDD3/ (free version) + +Kernel traffic: + Weekly summary of kernel list activity (much easier to read) + http://www.kerneltraffic.org/kernel-traffic/ + +LWN.net: + Weekly summary of kernel development activity - http://lwn.net/ + 2.6 API changes: + http://lwn.net/Articles/2.6-kernel-api/ + Porting drivers from prior kernels to 2.6: + http://lwn.net/Articles/driver-porting/ + +KernelTrap: + Occasional Linux kernel articles and developer interviews + http://kerneltrap.org/ + +KernelNewbies: + Documentation and assistance for new kernel programmers + http://kernelnewbies.org/ + +Linux USB project: + http://sourceforge.net/projects/linux-usb/ + diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches new file mode 100644 index 000000000000..9838d32b2fe7 --- /dev/null +++ b/Documentation/SubmittingPatches @@ -0,0 +1,374 @@ + + How to Get Your Change Into the Linux Kernel + or + Care And Operation Of Your Linus Torvalds + + + +For a person or company who wishes to submit a change to the Linux +kernel, the process can sometimes be daunting if you're not familiar +with "the system." This text is a collection of suggestions which +can greatly increase the chances of your change being accepted. + +If you are submitting a driver, also read Documentation/SubmittingDrivers. + + + +-------------------------------------------- +SECTION 1 - CREATING AND SENDING YOUR CHANGE +-------------------------------------------- + + + +1) "diff -up" +------------ + +Use "diff -up" or "diff -uprN" to create patches. + +All changes to the Linux kernel occur in the form of patches, as +generated by diff(1). When creating your patch, make sure to create it +in "unified diff" format, as supplied by the '-u' argument to diff(1). +Also, please use the '-p' argument which shows which C function each +change is in - that makes the resultant diff a lot easier to read. +Patches should be based in the root kernel source directory, +not in any lower subdirectory. + +To create a patch for a single file, it is often sufficient to do: + + SRCTREE= linux-2.4 + MYFILE= drivers/net/mydriver.c + + cd $SRCTREE + cp $MYFILE $MYFILE.orig + vi $MYFILE # make your change + cd .. + diff -up $SRCTREE/$MYFILE{.orig,} > /tmp/patch + +To create a patch for multiple files, you should unpack a "vanilla", +or unmodified kernel source tree, and generate a diff against your +own source tree. For example: + + MYSRC= /devel/linux-2.4 + + tar xvfz linux-2.4.0-test11.tar.gz + mv linux linux-vanilla + wget http://www.moses.uklinux.net/patches/dontdiff + diff -uprN -X dontdiff linux-vanilla $MYSRC > /tmp/patch + rm -f dontdiff + +"dontdiff" is a list of files which are generated by the kernel during +the build process, and should be ignored in any diff(1)-generated +patch. dontdiff is maintained by Tigran Aivazian <tigran@veritas.com> + +Make sure your patch does not include any extra files which do not +belong in a patch submission. Make sure to review your patch -after- +generated it with diff(1), to ensure accuracy. + +If your changes produce a lot of deltas, you may want to look into +splitting them into individual patches which modify things in +logical stages, this will facilitate easier reviewing by other +kernel developers, very important if you want your patch accepted. +There are a number of scripts which can aid in this; + +Quilt: +http://savannah.nongnu.org/projects/quilt + +Randy Dunlap's patch scripts: +http://developer.osdl.org/rddunlap/scripts/patching-scripts.tgz + +Andrew Morton's patch scripts: +http://www.zip.com.au/~akpm/linux/patches/patch-scripts-0.16 + +2) Describe your changes. + +Describe the technical detail of the change(s) your patch includes. + +Be as specific as possible. The WORST descriptions possible include +things like "update driver X", "bug fix for driver X", or "this patch +includes updates for subsystem X. Please apply." + +If your description starts to get long, that's a sign that you probably +need to split up your patch. See #3, next. + + + +3) Separate your changes. + +Separate each logical change into its own patch. + +For example, if your changes include both bug fixes and performance +enhancements for a single driver, separate those changes into two +or more patches. If your changes include an API update, and a new +driver which uses that new API, separate those into two patches. + +On the other hand, if you make a single change to numerous files, +group those changes into a single patch. Thus a single logical change +is contained within a single patch. + +If one patch depends on another patch in order for a change to be +complete, that is OK. Simply note "this patch depends on patch X" +in your patch description. + + +4) Select e-mail destination. + +Look through the MAINTAINERS file and the source code, and determine +if your change applies to a specific subsystem of the kernel, with +an assigned maintainer. If so, e-mail that person. + +If no maintainer is listed, or the maintainer does not respond, send +your patch to the primary Linux kernel developer's mailing list, +linux-kernel@vger.kernel.org. Most kernel developers monitor this +e-mail list, and can comment on your changes. + +Linus Torvalds is the final arbiter of all changes accepted into the +Linux kernel. His e-mail address is <torvalds@osdl.org>. He gets +a lot of e-mail, so typically you should do your best to -avoid- sending +him e-mail. + +Patches which are bug fixes, are "obvious" changes, or similarly +require little discussion should be sent or CC'd to Linus. Patches +which require discussion or do not have a clear advantage should +usually be sent first to linux-kernel. Only after the patch is +discussed should the patch then be submitted to Linus. + +For small patches you may want to CC the Trivial Patch Monkey +trivial@rustcorp.com.au set up by Rusty Russell; which collects "trivial" +patches. Trivial patches must qualify for one of the following rules: + Spelling fixes in documentation + Spelling fixes which could break grep(1). + Warning fixes (cluttering with useless warnings is bad) + Compilation fixes (only if they are actually correct) + Runtime fixes (only if they actually fix things) + Removing use of deprecated functions/macros (eg. check_region). + Contact detail and documentation fixes + Non-portable code replaced by portable code (even in arch-specific, + since people copy, as long as it's trivial) + Any fix by the author/maintainer of the file. (ie. patch monkey + in re-transmission mode) + + + +5) Select your CC (e-mail carbon copy) list. + +Unless you have a reason NOT to do so, CC linux-kernel@vger.kernel.org. + +Other kernel developers besides Linus need to be aware of your change, +so that they may comment on it and offer code review and suggestions. +linux-kernel is the primary Linux kernel developer mailing list. +Other mailing lists are available for specific subsystems, such as +USB, framebuffer devices, the VFS, the SCSI subsystem, etc. See the +MAINTAINERS file for a mailing list that relates specifically to +your change. + +Even if the maintainer did not respond in step #4, make sure to ALWAYS +copy the maintainer when you change their code. + +For small patches you may want to CC the Trivial Patch Monkey +trivial@rustcorp.com.au set up by Rusty Russell; which collects "trivial" +patches. Trivial patches must qualify for one of the following rules: + Spelling fixes in documentation + Spelling fixes which could break grep(1). + Warning fixes (cluttering with useless warnings is bad) + Compilation fixes (only if they are actually correct) + Runtime fixes (only if they actually fix things) + Removing use of deprecated functions/macros (eg. check_region). + Contact detail and documentation fixes + Non-portable code replaced by portable code (even in arch-specific, + since people copy, as long as it's trivial) + Any fix by the author/maintainer of the file. (ie. patch monkey + in re-transmission mode) + + + +6) No MIME, no links, no compression, no attachments. Just plain text. + +Linus and other kernel developers need to be able to read and comment +on the changes you are submitting. It is important for a kernel +developer to be able to "quote" your changes, using standard e-mail +tools, so that they may comment on specific portions of your code. + +For this reason, all patches should be submitting e-mail "inline". +WARNING: Be wary of your editor's word-wrap corrupting your patch, +if you choose to cut-n-paste your patch. + +Do not attach the patch as a MIME attachment, compressed or not. +Many popular e-mail applications will not always transmit a MIME +attachment as plain text, making it impossible to comment on your +code. A MIME attachment also takes Linus a bit more time to process, +decreasing the likelihood of your MIME-attached change being accepted. + +Exception: If your mailer is mangling patches then someone may ask +you to re-send them using MIME. + + + +7) E-mail size. + +When sending patches to Linus, always follow step #6. + +Large changes are not appropriate for mailing lists, and some +maintainers. If your patch, uncompressed, exceeds 40 kB in size, +it is preferred that you store your patch on an Internet-accessible +server, and provide instead a URL (link) pointing to your patch. + + + +8) Name your kernel version. + +It is important to note, either in the subject line or in the patch +description, the kernel version to which this patch applies. + +If the patch does not apply cleanly to the latest kernel version, +Linus will not apply it. + + + +9) Don't get discouraged. Re-submit. + +After you have submitted your change, be patient and wait. If Linus +likes your change and applies it, it will appear in the next version +of the kernel that he releases. + +However, if your change doesn't appear in the next version of the +kernel, there could be any number of reasons. It's YOUR job to +narrow down those reasons, correct what was wrong, and submit your +updated change. + +It is quite common for Linus to "drop" your patch without comment. +That's the nature of the system. If he drops your patch, it could be +due to +* Your patch did not apply cleanly to the latest kernel version +* Your patch was not sufficiently discussed on linux-kernel. +* A style issue (see section 2), +* An e-mail formatting issue (re-read this section) +* A technical problem with your change +* He gets tons of e-mail, and yours got lost in the shuffle +* You are being annoying (See Figure 1) + +When in doubt, solicit comments on linux-kernel mailing list. + + + +10) Include PATCH in the subject + +Due to high e-mail traffic to Linus, and to linux-kernel, it is common +convention to prefix your subject line with [PATCH]. This lets Linus +and other kernel developers more easily distinguish patches from other +e-mail discussions. + + + +11) Sign your work + +To improve tracking of who did what, especially with patches that can +percolate to their final resting place in the kernel through several +layers of maintainers, we've introduced a "sign-off" procedure on +patches that are being emailed around. + +The sign-off is a simple line at the end of the explanation for the +patch, which certifies that you wrote it or otherwise have the right to +pass it on as a open-source patch. The rules are pretty simple: if you +can certify the below: + + Developer's Certificate of Origin 1.0 + + By making a contribution to this project, I certify that: + + (a) The contribution was created in whole or in part by me and I + have the right to submit it under the open source license + indicated in the file; or + + (b) The contribution is based upon previous work that, to the best + of my knowledge, is covered under an appropriate open source + license and I have the right under that license to submit that + work with modifications, whether created in whole or in part + by me, under the same open source license (unless I am + permitted to submit under a different license), as indicated + in the file; or + + (c) The contribution was provided directly to me by some other + person who certified (a), (b) or (c) and I have not modified + it. + +then you just add a line saying + + Signed-off-by: Random J Developer <random@developer.org> + +Some people also put extra tags at the end. They'll just be ignored for +now, but you can do this to mark internal company procedures or just +point out some special detail about the sign-off. + + +----------------------------------- +SECTION 2 - HINTS, TIPS, AND TRICKS +----------------------------------- + +This section lists many of the common "rules" associated with code +submitted to the kernel. There are always exceptions... but you must +have a really good reason for doing so. You could probably call this +section Linus Computer Science 101. + + + +1) Read Documentation/CodingStyle + +Nuff said. If your code deviates too much from this, it is likely +to be rejected without further review, and without comment. + + + +2) #ifdefs are ugly + +Code cluttered with ifdefs is difficult to read and maintain. Don't do +it. Instead, put your ifdefs in a header, and conditionally define +'static inline' functions, or macros, which are used in the code. +Let the compiler optimize away the "no-op" case. + +Simple example, of poor code: + + dev = alloc_etherdev (sizeof(struct funky_private)); + if (!dev) + return -ENODEV; + #ifdef CONFIG_NET_FUNKINESS + init_funky_net(dev); + #endif + +Cleaned-up example: + +(in header) + #ifndef CONFIG_NET_FUNKINESS + static inline void init_funky_net (struct net_device *d) {} + #endif + +(in the code itself) + dev = alloc_etherdev (sizeof(struct funky_private)); + if (!dev) + return -ENODEV; + init_funky_net(dev); + + + +3) 'static inline' is better than a macro + +Static inline functions are greatly preferred over macros. +They provide type safety, have no length limitations, no formatting +limitations, and under gcc they are as cheap as macros. + +Macros should only be used for cases where a static inline is clearly +suboptimal [there a few, isolated cases of this in fast paths], +or where it is impossible to use a static inline function [such as +string-izing]. + +'static inline' is preferred over 'static __inline__', 'extern inline', +and 'extern __inline__'. + + + +4) Don't over-design. + +Don't try to anticipate nebulous future cases which may or may not +be useful: "Make it as simple as you can, and no simpler" + + + diff --git a/Documentation/VGA-softcursor.txt b/Documentation/VGA-softcursor.txt new file mode 100644 index 000000000000..70acfbf399eb --- /dev/null +++ b/Documentation/VGA-softcursor.txt @@ -0,0 +1,39 @@ +Software cursor for VGA by Pavel Machek <pavel@atrey.karlin.mff.cuni.cz> +======================= and Martin Mares <mj@atrey.karlin.mff.cuni.cz> + + Linux now has some ability to manipulate cursor appearance. Normally, you +can set the size of hardware cursor (and also work around some ugly bugs in +those miserable Trident cards--see #define TRIDENT_GLITCH in drivers/video/ +vgacon.c). You can now play a few new tricks: you can make your cursor look +like a non-blinking red block, make it inverse background of the character it's +over or to highlight that character and still choose whether the original +hardware cursor should remain visible or not. There may be other things I have +never thought of. + + The cursor appearance is controlled by a "<ESC>[?1;2;3c" escape sequence +where 1, 2 and 3 are parameters described below. If you omit any of them, +they will default to zeroes. + + Parameter 1 specifies cursor size (0=default, 1=invisible, 2=underline, ..., +8=full block) + 16 if you want the software cursor to be applied + 32 if you +want to always change the background color + 64 if you dislike having the +background the same as the foreground. Highlights are ignored for the last two +flags. + + The second parameter selects character attribute bits you want to change +(by simply XORing them with the value of this parameter). On standard VGA, +the high four bits specify background and the low four the foreground. In both +groups, low three bits set color (as in normal color codes used by the console) +and the most significant one turns on highlight (or sometimes blinking--it +depends on the configuration of your VGA). + + The third parameter consists of character attribute bits you want to set. +Bit setting takes place before bit toggling, so you can simply clear a bit by +including it in both the set mask and the toggle mask. + +Examples: +========= + +To get normal blinking underline, use: echo -e '\033[?2c' +To get blinking block, use: echo -e '\033[?6c' +To get red non-blinking block, use: echo -e '\033[?17;0;64c' diff --git a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.txt new file mode 100644 index 000000000000..43e50108d0e2 --- /dev/null +++ b/Documentation/aoe/aoe.txt @@ -0,0 +1,91 @@ +The EtherDrive (R) HOWTO for users of 2.6 kernels is found at ... + + http://www.coraid.com/support/linux/EtherDrive-2.6-HOWTO.html + + It has many tips and hints! + +CREATING DEVICE NODES + + Users of udev should find the block device nodes created + automatically, but to create all the necessary device nodes, use the + udev configuration rules provided in udev.txt (in this directory). + + There is a udev-install.sh script that shows how to install these + rules on your system. + + If you are not using udev, two scripts are provided in + Documentation/aoe as examples of static device node creation for + using the aoe driver. + + rm -rf /dev/etherd + sh Documentation/aoe/mkdevs.sh /dev/etherd + + ... or to make just one shelf's worth of block device nodes ... + + sh Documentation/aoe/mkshelf.sh /dev/etherd 0 + + There is also an autoload script that shows how to edit + /etc/modprobe.conf to ensure that the aoe module is loaded when + necessary. + +USING DEVICE NODES + + "cat /dev/etherd/err" blocks, waiting for error diagnostic output, + like any retransmitted packets. + + "echo eth2 eth4 > /dev/etherd/interfaces" tells the aoe driver to + limit ATA over Ethernet traffic to eth2 and eth4. AoE traffic from + untrusted networks should be ignored as a matter of security. + + "echo > /dev/etherd/discover" tells the driver to find out what AoE + devices are available. + + These character devices may disappear and be replaced by sysfs + counterparts, so distribution maintainers are encouraged to create + scripts that use these devices. + + The block devices are named like this: + + e{shelf}.{slot} + e{shelf}.{slot}p{part} + + ... so that "e0.2" is the third blade from the left (slot 2) in the + first shelf (shelf address zero). That's the whole disk. The first + partition on that disk would be "e0.2p1". + +USING SYSFS + + Each aoe block device in /sys/block has the extra attributes of + state, mac, and netif. The state attribute is "up" when the device + is ready for I/O and "down" if detected but unusable. The + "down,closewait" state shows that the device is still open and + cannot come up again until it has been closed. + + The mac attribute is the ethernet address of the remote AoE device. + The netif attribute is the network interface on the localhost + through which we are communicating with the remote AoE device. + + There is a script in this directory that formats this information + in a convenient way. + + root@makki root# sh Documentation/aoe/status.sh + e10.0 eth3 up + e10.1 eth3 up + e10.2 eth3 up + e10.3 eth3 up + e10.4 eth3 up + e10.5 eth3 up + e10.6 eth3 up + e10.7 eth3 up + e10.8 eth3 up + e10.9 eth3 up + e4.0 eth1 up + e4.1 eth1 up + e4.2 eth1 up + e4.3 eth1 up + e4.4 eth1 up + e4.5 eth1 up + e4.6 eth1 up + e4.7 eth1 up + e4.8 eth1 up + e4.9 eth1 up diff --git a/Documentation/aoe/autoload.sh b/Documentation/aoe/autoload.sh new file mode 100644 index 000000000000..78dad1334c6f --- /dev/null +++ b/Documentation/aoe/autoload.sh @@ -0,0 +1,17 @@ +#!/bin/sh +# set aoe to autoload by installing the +# aliases in /etc/modprobe.conf + +f=/etc/modprobe.conf + +if test ! -r $f || test ! -w $f; then + echo "cannot configure $f for module autoloading" 1>&2 + exit 1 +fi + +grep major-152 $f >/dev/null +if [ $? = 1 ]; then + echo alias block-major-152 aoe >> $f + echo alias char-major-152 aoe >> $f +fi + diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh new file mode 100644 index 000000000000..6ce70703eb47 --- /dev/null +++ b/Documentation/aoe/mkdevs.sh @@ -0,0 +1,36 @@ +#!/bin/sh + +n_shelves=${n_shelves:-10} +n_partitions=${n_partitions:-16} + +if test "$#" != "1"; then + echo "Usage: sh `basename $0` {dir}" 1>&2 + exit 1 +fi +dir=$1 + +MAJOR=152 + +echo "Creating AoE devnode files in $dir ..." + +set -e + +mkdir -p $dir + +# (Status info is in sysfs. See status.sh.) +# rm -f $dir/stat +# mknod -m 0400 $dir/stat c $MAJOR 1 +rm -f $dir/err +mknod -m 0400 $dir/err c $MAJOR 2 +rm -f $dir/discover +mknod -m 0200 $dir/discover c $MAJOR 3 +rm -f $dir/interfaces +mknod -m 0200 $dir/interfaces c $MAJOR 4 + +export n_partitions +mkshelf=`echo $0 | sed 's!mkdevs!mkshelf!'` +i=0 +while test $i -lt $n_shelves; do + sh -xc "sh $mkshelf $dir $i" + i=`expr $i + 1` +done diff --git a/Documentation/aoe/mkshelf.sh b/Documentation/aoe/mkshelf.sh new file mode 100644 index 000000000000..40932836bb80 --- /dev/null +++ b/Documentation/aoe/mkshelf.sh @@ -0,0 +1,25 @@ +#! /bin/sh + +if test "$#" != "2"; then + echo "Usage: sh `basename $0` {dir} {shelfaddress}" 1>&2 + exit 1 +fi +n_partitions=${n_partitions:-16} +dir=$1 +shelf=$2 +MAJOR=152 + +set -e + +minor=`echo 10 \* $shelf \* $n_partitions | bc` +endp=`echo $n_partitions - 1 | bc` +for slot in `seq 0 9`; do + for part in `seq 0 $endp`; do + name=e$shelf.$slot + test "$part" != "0" && name=${name}p$part + rm -f $dir/$name + mknod -m 0660 $dir/$name b $MAJOR $minor + + minor=`expr $minor + 1` + done +done diff --git a/Documentation/aoe/status.sh b/Documentation/aoe/status.sh new file mode 100644 index 000000000000..6628116d4a9f --- /dev/null +++ b/Documentation/aoe/status.sh @@ -0,0 +1,31 @@ +#! /bin/sh +# collate and present sysfs information about AoE storage + +set -e +format="%8s\t%8s\t%8s\n" +me=`basename $0` +sysd=${sysfs_dir:-/sys} + +# printf "$format" device mac netif state + +# Suse 9.1 Pro doesn't put /sys in /etc/mtab +#test -z "`mount | grep sysfs`" && { +test ! -d "$sysd/block" && { + echo "$me Error: sysfs is not mounted" 1>&2 + exit 1 +} +test -z "`lsmod | grep '^aoe'`" && { + echo "$me Error: aoe module is not loaded" 1>&2 + exit 1 +} + +for d in `ls -d $sysd/block/etherd* 2>/dev/null | grep -v p` end; do + # maybe ls comes up empty, so we use "end" + test $d = end && continue + + dev=`echo "$d" | sed 's/.*!//'` + printf "$format" \ + "$dev" \ + "`cat \"$d/netif\"`" \ + "`cat \"$d/state\"`" +done | sort diff --git a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh new file mode 100644 index 000000000000..861a27f98771 --- /dev/null +++ b/Documentation/aoe/udev-install.sh @@ -0,0 +1,26 @@ +# install the aoe-specific udev rules from udev.txt into +# the system's udev configuration +# + +me="`basename $0`" + +# find udev.conf, often /etc/udev/udev.conf +# (or environment can specify where to find udev.conf) +# +if test -z "$conf"; then + if test -r /etc/udev/udev.conf; then + conf=/etc/udev/udev.conf + else + conf="`find /etc -type f -name udev.conf 2> /dev/null`" + if test -z "$conf" || test ! -r "$conf"; then + echo "$me Error: no udev.conf found" 1>&2 + exit 1 + fi + fi +fi + +# find the directory where udev rules are stored, often +# /etc/udev/rules.d +# +rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`" +test "$rules_d" && sh -xc "cp `dirname $0`/udev.txt $rules_d/60-aoe.rules" diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt new file mode 100644 index 000000000000..ab39d8bb634c --- /dev/null +++ b/Documentation/aoe/udev.txt @@ -0,0 +1,23 @@ +# These rules tell udev what device nodes to create for aoe support. +# They may be installed along the following lines (adjusted to what +# you see on your system). +# +# ecashin@makki ~$ su +# Password: +# bash# find /etc -type f -name udev.conf +# /etc/udev/udev.conf +# bash# grep udev_rules= /etc/udev/udev.conf +# udev_rules="/etc/udev/rules.d/" +# bash# ls /etc/udev/rules.d/ +# 10-wacom.rules 50-udev.rules +# bash# cp /path/to/linux-2.6.xx/Documentation/aoe/udev.txt \ +# /etc/udev/rules.d/60-aoe.rules +# + +# aoe char devices +SUBSYSTEM="aoe", KERNEL="discover", NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM="aoe", KERNEL="err", NAME="etherd/%k", GROUP="disk", MODE="0440" +SUBSYSTEM="aoe", KERNEL="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220" + +# aoe block devices +KERNEL="etherd*", NAME="%k", GROUP="disk" diff --git a/Documentation/arm/00-INDEX b/Documentation/arm/00-INDEX new file mode 100644 index 000000000000..d753fe59a248 --- /dev/null +++ b/Documentation/arm/00-INDEX @@ -0,0 +1,20 @@ +00-INDEX + - this file +Booting + - requirements for booting +Interrupts + - ARM Interrupt subsystem documentation +Netwinder + - Netwinder specific documentation +README + - General ARM documentation +SA1100 + - SA1100 documentation +XScale + - XScale documentation +empeg + - Empeg documentation +mem_alignment + - alignment abort handler documentation +nwfpe + - NWFPE floating point emulator documentation diff --git a/Documentation/arm/Booting b/Documentation/arm/Booting new file mode 100644 index 000000000000..fad566bb02fc --- /dev/null +++ b/Documentation/arm/Booting @@ -0,0 +1,141 @@ + Booting ARM Linux + ================= + +Author: Russell King +Date : 18 May 2002 + +The following documentation is relevant to 2.4.18-rmk6 and beyond. + +In order to boot ARM Linux, you require a boot loader, which is a small +program that runs before the main kernel. The boot loader is expected +to initialise various devices, and eventually call the Linux kernel, +passing information to the kernel. + +Essentially, the boot loader should provide (as a minimum) the +following: + +1. Setup and initialise the RAM. +2. Initialise one serial port. +3. Detect the machine type. +4. Setup the kernel tagged list. +5. Call the kernel image. + + +1. Setup and initialise RAM +--------------------------- + +Existing boot loaders: MANDATORY +New boot loaders: MANDATORY + +The boot loader is expected to find and initialise all RAM that the +kernel will use for volatile data storage in the system. It performs +this in a machine dependent manner. (It may use internal algorithms +to automatically locate and size all RAM, or it may use knowledge of +the RAM in the machine, or any other method the boot loader designer +sees fit.) + + +2. Initialise one serial port +----------------------------- + +Existing boot loaders: OPTIONAL, RECOMMENDED +New boot loaders: OPTIONAL, RECOMMENDED + +The boot loader should initialise and enable one serial port on the +target. This allows the kernel serial driver to automatically detect +which serial port it should use for the kernel console (generally +used for debugging purposes, or communication with the target.) + +As an alternative, the boot loader can pass the relevant 'console=' +option to the kernel via the tagged lists specifying the port, and +serial format options as described in + + Documentation/kernel-parameters.txt. + + +3. Detect the machine type +-------------------------- + +Existing boot loaders: OPTIONAL +New boot loaders: MANDATORY + +The boot loader should detect the machine type its running on by some +method. Whether this is a hard coded value or some algorithm that +looks at the connected hardware is beyond the scope of this document. +The boot loader must ultimately be able to provide a MACH_TYPE_xxx +value to the kernel. (see linux/arch/arm/tools/mach-types). + + +4. Setup the kernel tagged list +------------------------------- + +Existing boot loaders: OPTIONAL, HIGHLY RECOMMENDED +New boot loaders: MANDATORY + +The boot loader must create and initialise the kernel tagged list. +A valid tagged list starts with ATAG_CORE and ends with ATAG_NONE. +The ATAG_CORE tag may or may not be empty. An empty ATAG_CORE tag +has the size field set to '2' (0x00000002). The ATAG_NONE must set +the size field to zero. + +Any number of tags can be placed in the list. It is undefined +whether a repeated tag appends to the information carried by the +previous tag, or whether it replaces the information in its +entirety; some tags behave as the former, others the latter. + +The boot loader must pass at a minimum the size and location of +the system memory, and root filesystem location. Therefore, the +minimum tagged list should look: + + +-----------+ +base -> | ATAG_CORE | | + +-----------+ | + | ATAG_MEM | | increasing address + +-----------+ | + | ATAG_NONE | | + +-----------+ v + +The tagged list should be stored in system RAM. + +The tagged list must be placed in a region of memory where neither +the kernel decompressor nor initrd 'bootp' program will overwrite +it. The recommended placement is in the first 16KiB of RAM. + +5. Calling the kernel image +--------------------------- + +Existing boot loaders: MANDATORY +New boot loaders: MANDATORY + +There are two options for calling the kernel zImage. If the zImage +is stored in flash, and is linked correctly to be run from flash, +then it is legal for the boot loader to call the zImage in flash +directly. + +The zImage may also be placed in system RAM (at any location) and +called there. Note that the kernel uses 16K of RAM below the image +to store page tables. The recommended placement is 32KiB into RAM. + +In either case, the following conditions must be met: + +- Quiesce all DMA capable devicess so that memory does not get + corrupted by bogus network packets or disk data. This will save + you many hours of debug. + +- CPU register settings + r0 = 0, + r1 = machine type number discovered in (3) above. + r2 = physical address of tagged list in system RAM. + +- CPU mode + All forms of interrupts must be disabled (IRQs and FIQs) + The CPU must be in SVC mode. (A special exception exists for Angel) + +- Caches, MMUs + The MMU must be off. + Instruction cache may be on or off. + Data cache must be off. + +- The boot loader is expected to call the kernel image by jumping + directly to the first instruction of the kernel image. + diff --git a/Documentation/arm/IXP2000 b/Documentation/arm/IXP2000 new file mode 100644 index 000000000000..e0148b6b2c40 --- /dev/null +++ b/Documentation/arm/IXP2000 @@ -0,0 +1,69 @@ + +------------------------------------------------------------------------- +Release Notes for Linux on Intel's IXP2000 Network Processor + +Maintained by Deepak Saxena <dsaxena@plexity.net> +------------------------------------------------------------------------- + +1. Overview + +Intel's IXP2000 family of NPUs (IXP2400, IXP2800, IXP2850) is designed +for high-performance network applications such high-availability +telecom systems. In addition to an XScale core, it contains up to 8 +"MicroEngines" that run special code, several high-end networking +interfaces (UTOPIA, SPI, etc), a PCI host bridge, one serial port, +flash interface, and some other odds and ends. For more information, see: + +http://developer.intel.com/design/network/products/npfamily/ixp2xxx.htm + +2. Linux Support + +Linux currently supports the following features on the IXP2000 NPUs: + +- On-chip serial +- PCI +- Flash (MTD/JFFS2) +- I2C through GPIO +- Timers (watchdog, OS) + +That is about all we can support under Linux ATM b/c the core networking +components of the chip are accessed via Intel's closed source SDK. +Please contact Intel directly on issues with using those. There is +also a mailing list run by some folks at Princeton University that might +be of help: https://lists.cs.princeton.edu/mailman/listinfo/ixp2xxx + +WHATEVER YOU DO, DO NOT POST EMAIL TO THE LINUX-ARM OR LINUX-ARM-KERNEL +MAILING LISTS REGARDING THE INTEL SDK. + +3. Supported Platforms + +- Intel IXDP2400 Reference Platform +- Intel IXDP2800 Reference Platform +- Intel IXDP2401 Reference Platform +- Intel IXDP2801 Reference Platform +- RadiSys ENP-2611 + +4. Usage Notes + +- The IXP2000 platforms usually have rather complex PCI bus topologies + with large memory space requirements. In addition, b/c of the way the + Intel SDK is designed, devices are enumerated in a very specific + way. B/c of this this, we use "pci=firmware" option in the kernel + command line so that we do not re-enumerate the bus. + +- IXDP2x01 systems have variable clock tick rates that we cannot determine + via HW registers. The "ixdp2x01_clk=XXX" cmd line options allow you + to pass the clock rate to the board port. + +5. Thanks + +The IXP2000 work has been funded by Intel Corp. and MontaVista Software, Inc. + +The following people have contributed patches/comments/etc: + +Naeem F. Afzal +Lennert Buytenhek +Jeffrey Daly + +------------------------------------------------------------------------- +Last Update: 8/09/2004 diff --git a/Documentation/arm/IXP4xx b/Documentation/arm/IXP4xx new file mode 100644 index 000000000000..d4c6d3aa0c25 --- /dev/null +++ b/Documentation/arm/IXP4xx @@ -0,0 +1,174 @@ + +------------------------------------------------------------------------- +Release Notes for Linux on Intel's IXP4xx Network Processor + +Maintained by Deepak Saxena <dsaxena@plexity.net> +------------------------------------------------------------------------- + +1. Overview + +Intel's IXP4xx network processor is a highly integrated SOC that +is targeted for network applications, though it has become popular +in industrial control and other areas due to low cost and power +consumption. The IXP4xx family currently consists of several processors +that support different network offload functions such as encryption, +routing, firewalling, etc. The IXP46x family is an updated version which +supports faster speeds, new memory and flash configurations, and more +integration such as an on-chip I2C controller. + +For more information on the various versions of the CPU, see: + + http://developer.intel.com/design/network/products/npfamily/ixp4xx.htm + +Intel also made the IXCP1100 CPU for sometime which is an IXP4xx +stripped of much of the network intelligence. + +2. Linux Support + +Linux currently supports the following features on the IXP4xx chips: + +- Dual serial ports +- PCI interface +- Flash access (MTD/JFFS) +- I2C through GPIO on IXP42x +- GPIO for input/output/interrupts + See include/asm-arm/arch-ixp4xx/platform.h for access functions. +- Timers (watchdog, OS) + +The following components of the chips are not supported by Linux and +require the use of Intel's propietary CSR softare: + +- USB device interface +- Network interfaces (HSS, Utopia, NPEs, etc) +- Network offload functionality + +If you need to use any of the above, you need to download Intel's +software from: + + http://developer.intel.com/design/network/products/npfamily/ixp425swr1.htm + +DO NOT POST QUESTIONS TO THE LINUX MAILING LISTS REGARDING THE PROPIETARY +SOFTWARE. + +There are several websites that provide directions/pointers on using +Intel's software: + +http://ixp4xx-osdg.sourceforge.net/ + Open Source Developer's Guide for using uClinux and the Intel libraries + +http://gatewaymaker.sourceforge.net/ + Simple one page summary of building a gateway using an IXP425 and Linux + +http://ixp425.sourceforge.net/ + ATM device driver for IXP425 that relies on Intel's libraries + +3. Known Issues/Limitations + +3a. Limited inbound PCI window + +The IXP4xx family allows for up to 256MB of memory but the PCI interface +can only expose 64MB of that memory to the PCI bus. This means that if +you are running with > 64MB, all PCI buffers outside of the accessible +range will be bounced using the routines in arch/arm/common/dmabounce.c. + +3b. Limited outbound PCI window + +IXP4xx provides two methods of accessing PCI memory space: + +1) A direct mapped window from 0x48000000 to 0x4bffffff (64MB). + To access PCI via this space, we simply ioremap() the BAR + into the kernel and we can use the standard read[bwl]/write[bwl] + macros. This is the preffered method due to speed but it + limits the system to just 64MB of PCI memory. This can be + problamatic if using video cards and other memory-heavy devices. + +2) If > 64MB of memory space is required, the IXP4xx can be + configured to use indirect registers to access PCI This allows + for up to 128MB (0x48000000 to 0x4fffffff) of memory on the bus. + The disadvantadge of this is that every PCI access requires + three local register accesses plus a spinlock, but in some + cases the performance hit is acceptable. In addition, you cannot + mmap() PCI devices in this case due to the indirect nature + of the PCI window. + +By default, the direct method is used for performance reasons. If +you need more PCI memory, enable the IXP4XX_INDIRECT_PCI config option. + +3c. GPIO as Interrupts + +Currently the code only handles level-sensitive GPIO interrupts + +4. Supported platforms + +ADI Engineering Coyote Gateway Reference Platform +http://www.adiengineering.com/productsCoyote.html + + The ADI Coyote platform is reference design for those building + small residential/office gateways. One NPE is connected to a 10/100 + interface, one to 4-port 10/100 switch, and the third to and ADSL + interface. In addition, it also supports to POTs interfaces connected + via SLICs. Note that those are not supported by Linux ATM. Finally, + the platform has two mini-PCI slots used for 802.11[bga] cards. + Finally, there is an IDE port hanging off the expansion bus. + +Gateworks Avila Network Platform +http://www.gateworks.com/avila_sbc.htm + + The Avila platform is basically and IXDP425 with the 4 PCI slots + replaced with mini-PCI slots and a CF IDE interface hanging off + the expansion bus. + +Intel IXDP425 Development Platform +http://developer.intel.com/design/network/products/npfamily/ixdp425.htm + + This is Intel's standard reference platform for the IXDP425 and is + also known as the Richfield board. It contains 4 PCI slots, 16MB + of flash, two 10/100 ports and one ADSL port. + +Intel IXDP465 Development Platform +http://developer.intel.com/design/network/products/npfamily/ixdp465.htm + + This is basically an IXDP425 with an IXP465 and 32M of flash instead + of just 16. + +Intel IXDPG425 Development Platform + + This is basically and ADI Coyote board with a NEC EHCI controller + added. One issue with this board is that the mini-PCI slots only + have the 3.3v line connected, so you can't use a PCI to mini-PCI + adapter with an E100 card. So to NFS root you need to use either + the CSR or a WiFi card and a ramdisk that BOOTPs and then does + a pivot_root to NFS. + +Motorola PrPMC1100 Processor Mezanine Card +http://www.fountainsys.com/datasheet/PrPMC1100.pdf + + The PrPMC1100 is based on the IXCP1100 and is meant to plug into + and IXP2400/2800 system to act as the system controller. It simply + contains a CPU and 16MB of flash on the board and needs to be + plugged into a carrier board to function. Currently Linux only + supports the Motorola PrPMC carrier board for this platform. + See https://mcg.motorola.com/us/ds/pdf/ds0144.pdf for info + on the carrier board. + +5. TODO LIST + +- Add support for Coyote IDE +- Add support for edge-based GPIO interrupts +- Add support for CF IDE on expansion bus + +6. Thanks + +The IXP4xx work has been funded by Intel Corp. and MontaVista Software, Inc. + +The following people have contributed patches/comments/etc: + +Lennerty Buytenhek +Lutz Jaenicke +Justin Mayfield +Robert E. Ranslam +[I know I've forgotten others, please email me to be added] + +------------------------------------------------------------------------- + +Last Update: 01/04/2005 diff --git a/Documentation/arm/Interrupts b/Documentation/arm/Interrupts new file mode 100644 index 000000000000..72c93de8cd4e --- /dev/null +++ b/Documentation/arm/Interrupts @@ -0,0 +1,173 @@ +2.5.2-rmk5 +---------- + +This is the first kernel that contains a major shake up of some of the +major architecture-specific subsystems. + +Firstly, it contains some pretty major changes to the way we handle the +MMU TLB. Each MMU TLB variant is now handled completely separately - +we have TLB v3, TLB v4 (without write buffer), TLB v4 (with write buffer), +and finally TLB v4 (with write buffer, with I TLB invalidate entry). +There is more assembly code inside each of these functions, mainly to +allow more flexible TLB handling for the future. + +Secondly, the IRQ subsystem. + +The 2.5 kernels will be having major changes to the way IRQs are handled. +Unfortunately, this means that machine types that touch the irq_desc[] +array (basically all machine types) will break, and this means every +machine type that we currently have. + +Lets take an example. On the Assabet with Neponset, we have: + + GPIO25 IRR:2 + SA1100 ------------> Neponset -----------> SA1111 + IIR:1 + -----------> USAR + IIR:0 + -----------> SMC9196 + +The way stuff currently works, all SA1111 interrupts are mutually +exclusive of each other - if you're processing one interrupt from the +SA1111 and another comes in, you have to wait for that interrupt to +finish processing before you can service the new interrupt. Eg, an +IDE PIO-based interrupt on the SA1111 excludes all other SA1111 and +SMC9196 interrupts until it has finished transferring its multi-sector +data, which can be a long time. Note also that since we loop in the +SA1111 IRQ handler, SA1111 IRQs can hold off SMC9196 IRQs indefinitely. + + +The new approach brings several new ideas... + +We introduce the concept of a "parent" and a "child". For example, +to the Neponset handler, the "parent" is GPIO25, and the "children"d +are SA1111, SMC9196 and USAR. + +We also bring the idea of an IRQ "chip" (mainly to reduce the size of +the irqdesc array). This doesn't have to be a real "IC"; indeed the +SA11x0 IRQs are handled by two separate "chip" structures, one for +GPIO0-10, and another for all the rest. It is just a container for +the various operations (maybe this'll change to a better name). +This structure has the following operations: + +struct irqchip { + /* + * Acknowledge the IRQ. + * If this is a level-based IRQ, then it is expected to mask the IRQ + * as well. + */ + void (*ack)(unsigned int irq); + /* + * Mask the IRQ in hardware. + */ + void (*mask)(unsigned int irq); + /* + * Unmask the IRQ in hardware. + */ + void (*unmask)(unsigned int irq); + /* + * Re-run the IRQ + */ + void (*rerun)(unsigned int irq); + /* + * Set the type of the IRQ. + */ + int (*type)(unsigned int irq, unsigned int, type); +}; + +ack - required. May be the same function as mask for IRQs + handled by do_level_IRQ. +mask - required. +unmask - required. +rerun - optional. Not required if you're using do_level_IRQ for all + IRQs that use this 'irqchip'. Generally expected to re-trigger + the hardware IRQ if possible. If not, may call the handler + directly. +type - optional. If you don't support changing the type of an IRQ, + it should be null so people can detect if they are unable to + set the IRQ type. + +For each IRQ, we keep the following information: + + - "disable" depth (number of disable_irq()s without enable_irq()s) + - flags indicating what we can do with this IRQ (valid, probe, + noautounmask) as before + - status of the IRQ (probing, enable, etc) + - chip + - per-IRQ handler + - irqaction structure list + +The handler can be one of the 3 standard handlers - "level", "edge" and +"simple", or your own specific handler if you need to do something special. + +The "level" handler is what we currently have - its pretty simple. +"edge" knows about the brokenness of such IRQ implementations - that you +need to leave the hardware IRQ enabled while processing it, and queueing +further IRQ events should the IRQ happen again while processing. The +"simple" handler is very basic, and does not perform any hardware +manipulation, nor state tracking. This is useful for things like the +SMC9196 and USAR above. + +So, what's changed? + +1. Machine implementations must not write to the irqdesc array. + +2. New functions to manipulate the irqdesc array. The first 4 are expected + to be useful only to machine specific code. The last is recommended to + only be used by machine specific code, but may be used in drivers if + absolutely necessary. + + set_irq_chip(irq,chip) + + Set the mask/unmask methods for handling this IRQ + + set_irq_handler(irq,handler) + + Set the handler for this IRQ (level, edge, simple) + + set_irq_chained_handler(irq,handler) + + Set a "chained" handler for this IRQ - automatically + enables this IRQ (eg, Neponset and SA1111 handlers). + + set_irq_flags(irq,flags) + + Set the valid/probe/noautoenable flags. + + set_irq_type(irq,type) + + Set active the IRQ edge(s)/level. This replaces the + SA1111 INTPOL manipulation, and the set_GPIO_IRQ_edge() + function. Type should be one of the following: + + #define IRQT_NOEDGE (0) + #define IRQT_RISING (__IRQT_RISEDGE) + #define IRQT_FALLING (__IRQT_FALEDGE) + #define IRQT_BOTHEDGE (__IRQT_RISEDGE|__IRQT_FALEDGE) + #define IRQT_LOW (__IRQT_LOWLVL) + #define IRQT_HIGH (__IRQT_HIGHLVL) + +3. set_GPIO_IRQ_edge() is obsolete, and should be replaced by set_irq_type. + +4. Direct access to SA1111 INTPOL is depreciated. Use set_irq_type instead. + +5. A handler is expected to perform any necessary acknowledgement of the + parent IRQ via the correct chip specific function. For instance, if + the SA1111 is directly connected to a SA1110 GPIO, then you should + acknowledge the SA1110 IRQ each time you re-read the SA1111 IRQ status. + +6. For any child which doesn't have its own IRQ enable/disable controls + (eg, SMC9196), the handler must mask or acknowledge the parent IRQ + while the child handler is called, and the child handler should be the + "simple" handler (not "edge" nor "level"). After the handler completes, + the parent IRQ should be unmasked, and the status of all children must + be re-checked for pending events. (see the Neponset IRQ handler for + details). + +7. fixup_irq() is gone, as is include/asm-arm/arch-*/irq.h + +Please note that this will not solve all problems - some of them are +hardware based. Mixing level-based and edge-based IRQs on the same +parent signal (eg neponset) is one such area where a software based +solution can't provide the full answer to low IRQ latency. + diff --git a/Documentation/arm/Netwinder b/Documentation/arm/Netwinder new file mode 100644 index 000000000000..f1b457fbd3de --- /dev/null +++ b/Documentation/arm/Netwinder @@ -0,0 +1,78 @@ +NetWinder specific documentation +================================ + +The NetWinder is a small low-power computer, primarily designed +to run Linux. It is based around the StrongARM RISC processor, +DC21285 PCI bridge, with PC-type hardware glued around it. + +Port usage +========== + +Min - Max Description +--------------------------- +0x0000 - 0x000f DMA1 +0x0020 - 0x0021 PIC1 +0x0060 - 0x006f Keyboard +0x0070 - 0x007f RTC +0x0080 - 0x0087 DMA1 +0x0088 - 0x008f DMA2 +0x00a0 - 0x00a3 PIC2 +0x00c0 - 0x00df DMA2 +0x0180 - 0x0187 IRDA +0x01f0 - 0x01f6 ide0 +0x0201 Game port +0x0203 RWA010 configuration read +0x0220 - ? SoundBlaster +0x0250 - ? WaveArtist +0x0279 RWA010 configuration index +0x02f8 - 0x02ff Serial ttyS1 +0x0300 - 0x031f Ether10 +0x0338 GPIO1 +0x033a GPIO2 +0x0370 - 0x0371 W83977F configuration registers +0x0388 - ? AdLib +0x03c0 - 0x03df VGA +0x03f6 ide0 +0x03f8 - 0x03ff Serial ttyS0 +0x0400 - 0x0408 DC21143 +0x0480 - 0x0487 DMA1 +0x0488 - 0x048f DMA2 +0x0a79 RWA010 configuration write +0xe800 - 0xe80f ide0/ide1 BM DMA + + +Interrupt usage +=============== + +IRQ type Description +--------------------------- + 0 ISA 100Hz timer + 1 ISA Keyboard + 2 ISA cascade + 3 ISA Serial ttyS1 + 4 ISA Serial ttyS0 + 5 ISA PS/2 mouse + 6 ISA IRDA + 7 ISA Printer + 8 ISA RTC alarm + 9 ISA +10 ISA GP10 (Orange reset button) +11 ISA +12 ISA WaveArtist +13 ISA +14 ISA hda1 +15 ISA + +DMA usage +========= + +DMA type Description +--------------------------- + 0 ISA IRDA + 1 ISA + 2 ISA cascade + 3 ISA WaveArtist + 4 ISA + 5 ISA + 6 ISA + 7 ISA WaveArtist diff --git a/Documentation/arm/Porting b/Documentation/arm/Porting new file mode 100644 index 000000000000..a492233931b9 --- /dev/null +++ b/Documentation/arm/Porting @@ -0,0 +1,135 @@ +Taken from list archive at http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2001-July/004064.html + +Initial definitions +------------------- + +The following symbol definitions rely on you knowing the translation that +__virt_to_phys() does for your machine. This macro converts the passed +virtual address to a physical address. Normally, it is simply: + + phys = virt - PAGE_OFFSET + PHYS_OFFSET + + +Decompressor Symbols +-------------------- + +ZTEXTADDR + Start address of decompressor. There's no point in talking about + virtual or physical addresses here, since the MMU will be off at + the time when you call the decompressor code. You normally call + the kernel at this address to start it booting. This doesn't have + to be located in RAM, it can be in flash or other read-only or + read-write addressable medium. + +ZBSSADDR + Start address of zero-initialised work area for the decompressor. + This must be pointing at RAM. The decompressor will zero initialise + this for you. Again, the MMU will be off. + +ZRELADDR + This is the address where the decompressed kernel will be written, + and eventually executed. The following constraint must be valid: + + __virt_to_phys(TEXTADDR) == ZRELADDR + + The initial part of the kernel is carefully coded to be position + independent. + +INITRD_PHYS + Physical address to place the initial RAM disk. Only relevant if + you are using the bootpImage stuff (which only works on the old + struct param_struct). + +INITRD_VIRT + Virtual address of the initial RAM disk. The following constraint + must be valid: + + __virt_to_phys(INITRD_VIRT) == INITRD_PHYS + +PARAMS_PHYS + Physical address of the struct param_struct or tag list, giving the + kernel various parameters about its execution environment. + + +Kernel Symbols +-------------- + +PHYS_OFFSET + Physical start address of the first bank of RAM. + +PAGE_OFFSET + Virtual start address of the first bank of RAM. During the kernel + boot phase, virtual address PAGE_OFFSET will be mapped to physical + address PHYS_OFFSET, along with any other mappings you supply. + This should be the same value as TASK_SIZE. + +TASK_SIZE + The maximum size of a user process in bytes. Since user space + always starts at zero, this is the maximum address that a user + process can access+1. The user space stack grows down from this + address. + + Any virtual address below TASK_SIZE is deemed to be user process + area, and therefore managed dynamically on a process by process + basis by the kernel. I'll call this the user segment. + + Anything above TASK_SIZE is common to all processes. I'll call + this the kernel segment. + + (In other words, you can't put IO mappings below TASK_SIZE, and + hence PAGE_OFFSET). + +TEXTADDR + Virtual start address of kernel, normally PAGE_OFFSET + 0x8000. + This is where the kernel image ends up. With the latest kernels, + it must be located at 32768 bytes into a 128MB region. Previous + kernels placed a restriction of 256MB here. + +DATAADDR + Virtual address for the kernel data segment. Must not be defined + when using the decompressor. + +VMALLOC_START +VMALLOC_END + Virtual addresses bounding the vmalloc() area. There must not be + any static mappings in this area; vmalloc will overwrite them. + The addresses must also be in the kernel segment (see above). + Normally, the vmalloc() area starts VMALLOC_OFFSET bytes above the + last virtual RAM address (found using variable high_memory). + +VMALLOC_OFFSET + Offset normally incorporated into VMALLOC_START to provide a hole + between virtual RAM and the vmalloc area. We do this to allow + out of bounds memory accesses (eg, something writing off the end + of the mapped memory map) to be caught. Normally set to 8MB. + +Architecture Specific Macros +---------------------------- + +BOOT_MEM(pram,pio,vio) + `pram' specifies the physical start address of RAM. Must always + be present, and should be the same as PHYS_OFFSET. + + `pio' is the physical address of an 8MB region containing IO for + use with the debugging macros in arch/arm/kernel/debug-armv.S. + + `vio' is the virtual address of the 8MB debugging region. + + It is expected that the debugging region will be re-initialised + by the architecture specific code later in the code (via the + MAPIO function). + +BOOT_PARAMS + Same as, and see PARAMS_PHYS. + +FIXUP(func) + Machine specific fixups, run before memory subsystems have been + initialised. + +MAPIO(func) + Machine specific function to map IO areas (including the debug + region above). + +INITIRQ(func) + Machine specific function to initialise interrupts. + diff --git a/Documentation/arm/README b/Documentation/arm/README new file mode 100644 index 000000000000..a6f718e90a86 --- /dev/null +++ b/Documentation/arm/README @@ -0,0 +1,198 @@ + ARM Linux 2.6 + ============= + + Please check <ftp://ftp.arm.linux.org.uk/pub/armlinux> for + updates. + +Compilation of kernel +--------------------- + + In order to compile ARM Linux, you will need a compiler capable of + generating ARM ELF code with GNU extensions. GCC 2.95.1, EGCS + 1.1.2, and GCC 3.3 are known to be good compilers. Fortunately, you + needn't guess. The kernel will report an error if your compiler is + a recognized offender. + + To build ARM Linux natively, you shouldn't have to alter the ARCH = line + in the top level Makefile. However, if you don't have the ARM Linux ELF + tools installed as default, then you should change the CROSS_COMPILE + line as detailed below. + + If you wish to cross-compile, then alter the following lines in the top + level make file: + + ARCH = <whatever> + with + ARCH = arm + + and + + CROSS_COMPILE= + to + CROSS_COMPILE=<your-path-to-your-compiler-without-gcc> + eg. + CROSS_COMPILE=arm-linux- + + Do a 'make config', followed by 'make Image' to build the kernel + (arch/arm/boot/Image). A compressed image can be built by doing a + 'make zImage' instead of 'make Image'. + + +Bug reports etc +--------------- + + Please send patches to the patch system. For more information, see + http://www.arm.linux.org.uk/patches/info.html Always include some + explanation as to what the patch does and why it is needed. + + Bug reports should be sent to linux-arm-kernel@lists.arm.linux.org.uk, + or submitted through the web form at + http://www.arm.linux.org.uk/forms/solution.shtml + + When sending bug reports, please ensure that they contain all relevant + information, eg. the kernel messages that were printed before/during + the problem, what you were doing, etc. + + +Include files +------------- + + Several new include directories have been created under include/asm-arm, + which are there to reduce the clutter in the top-level directory. These + directories, and their purpose is listed below: + + arch-* machine/platform specific header files + hardware driver-internal ARM specific data structures/definitions + mach descriptions of generic ARM to specific machine interfaces + proc-* processor dependent header files (currently only two + categories) + + +Machine/Platform support +------------------------ + + The ARM tree contains support for a lot of different machine types. To + continue supporting these differences, it has become necessary to split + machine-specific parts by directory. For this, the machine category is + used to select which directories and files get included (we will use + $(MACHINE) to refer to the category) + + To this end, we now have arch/arm/mach-$(MACHINE) directories which are + designed to house the non-driver files for a particular machine (eg, PCI, + memory management, architecture definitions etc). For all future + machines, there should be a corresponding include/asm-arm/arch-$(MACHINE) + directory. + + +Modules +------- + + Although modularisation is supported (and required for the FP emulator), + each module on an ARM2/ARM250/ARM3 machine when is loaded will take + memory up to the next 32k boundary due to the size of the pages. + Therefore, modularisation on these machines really worth it? + + However, ARM6 and up machines allow modules to take multiples of 4k, and + as such Acorn RiscPCs and other architectures using these processors can + make good use of modularisation. + + +ADFS Image files +---------------- + + You can access image files on your ADFS partitions by mounting the ADFS + partition, and then using the loopback device driver. You must have + losetup installed. + + Please note that the PCEmulator DOS partitions have a partition table at + the start, and as such, you will have to give '-o offset' to losetup. + + +Request to developers +--------------------- + + When writing device drivers which include a separate assembler file, please + include it in with the C file, and not the arch/arm/lib directory. This + allows the driver to be compiled as a loadable module without requiring + half the code to be compiled into the kernel image. + + In general, try to avoid using assembler unless it is really necessary. It + makes drivers far less easy to port to other hardware. + + +ST506 hard drives +----------------- + + The ST506 hard drive controllers seem to be working fine (if a little + slowly). At the moment they will only work off the controllers on an + A4x0's motherboard, but for it to work off a Podule just requires + someone with a podule to add the addresses for the IRQ mask and the + HDC base to the source. + + As of 31/3/96 it works with two drives (you should get the ADFS + *configure harddrive set to 2). I've got an internal 20MB and a great + big external 5.25" FH 64MB drive (who could ever want more :-) ). + + I've just got 240K/s off it (a dd with bs=128k); thats about half of what + RiscOS gets; but it's a heck of a lot better than the 50K/s I was getting + last week :-) + + Known bug: Drive data errors can cause a hang; including cases where + the controller has fixed the error using ECC. (Possibly ONLY + in that case...hmm). + + +1772 Floppy +----------- + This also seems to work OK, but hasn't been stressed much lately. It + hasn't got any code for disc change detection in there at the moment which + could be a bit of a problem! Suggestions on the correct way to do this + are welcome. + + +CONFIG_MACH_ and CONFIG_ARCH_ +----------------------------- + A change was made in 2003 to the macro names for new machines. + Historically, CONFIG_ARCH_ was used for the bonafide architecture, + e.g. SA1100, as well as implementations of the architecture, + e.g. Assabet. It was decided to change the implementation macros + to read CONFIG_MACH_ for clarity. Moreover, a retroactive fixup has + not been made because it would complicate patching. + + Previous registrations may be found online. + + <http://www.arm.linux.org.uk/developer/machines/> + +Kernel entry (head.S) +-------------------------- + The initial entry into the kernel is via head.S, which uses machine + independent code. The machine is selected by the value of 'r1' on + entry, which must be kept unique. + + Due to the large number of machines which the ARM port of Linux provides + for, we have a method to manage this which ensures that we don't end up + duplicating large amounts of code. + + We group machine (or platform) support code into machine classes. A + class typically based around one or more system on a chip devices, and + acts as a natural container around the actual implementations. These + classes are given directories - arch/arm/mach-<class> and + include/asm-arm/arch-<class> - which contain the source files to + support the machine class. This directories also contain any machine + specific supporting code. + + For example, the SA1100 class is based upon the SA1100 and SA1110 SoC + devices, and contains the code to support the way the on-board and off- + board devices are used, or the device is setup, and provides that + machine specific "personality." + + This fine-grained machine specific selection is controlled by the machine + type ID, which acts both as a run-time and a compile-time code selection + method. + + You can register a new machine via the web site at: + + <http://www.arm.linux.org.uk/developer/machines/> + +--- +Russell King (15/03/2004) diff --git a/Documentation/arm/SA1100/ADSBitsy b/Documentation/arm/SA1100/ADSBitsy new file mode 100644 index 000000000000..ab47c3833908 --- /dev/null +++ b/Documentation/arm/SA1100/ADSBitsy @@ -0,0 +1,43 @@ +ADS Bitsy Single Board Computer +(It is different from Bitsy(iPAQ) of Compaq) + +For more details, contact Applied Data Systems or see +http://www.applieddata.net/products.html + +The Linux support for this product has been provided by +Woojung Huh <whuh@applieddata.net> + +Use 'make adsbitsy_config' before any 'make config'. +This will set up defaults for ADS Bitsy support. + +The kernel zImage is linked to be loaded and executed at 0xc0400000. + +Linux can be used with the ADS BootLoader that ships with the +newer rev boards. See their documentation on how to load Linux. + +Supported peripherals: +- SA1100 LCD frame buffer (8/16bpp...sort of) +- SA1111 USB Master +- SA1100 serial port +- pcmcia, compact flash +- touchscreen(ucb1200) +- console on LCD screen +- serial ports (ttyS[0-2]) + - ttyS0 is default for serial console + +To do: +- everything else! :-) + +Notes: + +- The flash on board is divided into 3 partitions. + You should be careful to use flash on board. + It's partition is different from GraphicsClient Plus and GraphicsMaster + +- 16bpp mode requires a different cable than what ships with the board. + Contact ADS or look through the manual to wire your own. Currently, + if you compile with 16bit mode support and switch into a lower bpp + mode, the timing is off so the image is corrupted. This will be + fixed soon. + +Any contribution can be sent to nico@cam.org and will be greatly welcome! diff --git a/Documentation/arm/SA1100/Assabet b/Documentation/arm/SA1100/Assabet new file mode 100644 index 000000000000..cbbe5587c78d --- /dev/null +++ b/Documentation/arm/SA1100/Assabet @@ -0,0 +1,301 @@ +The Intel Assabet (SA-1110 evaluation) board +============================================ + +Please see: +http://developer.intel.com/design/strong/quicklist/eval-plat/sa-1110.htm +http://developer.intel.com/design/strong/guides/278278.htm + +Also some notes from John G Dorsey <jd5q@andrew.cmu.edu>: +http://www.cs.cmu.edu/~wearable/software/assabet.html + + +Building the kernel +------------------- + +To build the kernel with current defaults: + + make assabet_config + make oldconfig + make zImage + +The resulting kernel image should be available in linux/arch/arm/boot/zImage. + + +Installing a bootloader +----------------------- + +A couple of bootloaders able to boot Linux on Assabet are available: + +BLOB (http://www.lart.tudelft.nl/lartware/blob/) + + BLOB is a bootloader used within the LART project. Some contributed + patches were merged into BLOB to add support for Assabet. + +Compaq's Bootldr + John Dorsey's patch for Assabet support +(http://www.handhelds.org/Compaq/bootldr.html) +(http://www.wearablegroup.org/software/bootldr/) + + Bootldr is the bootloader developed by Compaq for the iPAQ Pocket PC. + John Dorsey has produced add-on patches to add support for Assabet and + the JFFS filesystem. + +RedBoot (http://sources.redhat.com/redboot/) + + RedBoot is a bootloader developed by Red Hat based on the eCos RTOS + hardware abstraction layer. It supports Assabet amongst many other + hardware platforms. + +RedBoot is currently the recommended choice since it's the only one to have +networking support, and is the most actively maintained. + +Brief examples on how to boot Linux with RedBoot are shown below. But first +you need to have RedBoot installed in your flash memory. A known to work +precompiled RedBoot binary is available from the following location: + +ftp://ftp.netwinder.org/users/n/nico/ +ftp://ftp.arm.linux.org.uk/pub/linux/arm/people/nico/ +ftp://ftp.handhelds.org/pub/linux/arm/sa-1100-patches/ + +Look for redboot-assabet*.tgz. Some installation infos are provided in +redboot-assabet*.txt. + + +Initial RedBoot configuration +----------------------------- + +The commands used here are explained in The RedBoot User's Guide available +on-line at http://sources.redhat.com/ecos/docs-latest/redboot/redboot.html. +Please refer to it for explanations. + +If you have a CF network card (my Assabet kit contained a CF+ LP-E from +Socket Communications Inc.), you should strongly consider using it for TFTP +file transfers. You must insert it before RedBoot runs since it can't detect +it dynamically. + +To initialize the flash directory: + + fis init -f + +To initialize the non-volatile settings, like whether you want to use BOOTP or +a static IP address, etc, use this command: + + fconfig -i + + +Writing a kernel image into flash +--------------------------------- + +First, the kernel image must be loaded into RAM. If you have the zImage file +available on a TFTP server: + + load zImage -r -b 0x100000 + +If you rather want to use Y-Modem upload over the serial port: + + load -m ymodem -r -b 0x100000 + +To write it to flash: + + fis create "Linux kernel" -b 0x100000 -l 0xc0000 + + +Booting the kernel +------------------ + +The kernel still requires a filesystem to boot. A ramdisk image can be loaded +as follows: + + load ramdisk_image.gz -r -b 0x800000 + +Again, Y-Modem upload can be used instead of TFTP by replacing the file name +by '-y ymodem'. + +Now the kernel can be retrieved from flash like this: + + fis load "Linux kernel" + +or loaded as described previously. To boot the kernel: + + exec -b 0x100000 -l 0xc0000 + +The ramdisk image could be stored into flash as well, but there are better +solutions for on-flash filesystems as mentioned below. + + +Using JFFS2 +----------- + +Using JFFS2 (the Second Journalling Flash File System) is probably the most +convenient way to store a writable filesystem into flash. JFFS2 is used in +conjunction with the MTD layer which is responsible for low-level flash +management. More information on the Linux MTD can be found on-line at: +http://www.linux-mtd.infradead.org/. A JFFS howto with some infos about +creating JFFS/JFFS2 images is available from the same site. + +For instance, a sample JFFS2 image can be retrieved from the same FTP sites +mentioned below for the precompiled RedBoot image. + +To load this file: + + load sample_img.jffs2 -r -b 0x100000 + +The result should look like: + +RedBoot> load sample_img.jffs2 -r -b 0x100000 +Raw file loaded 0x00100000-0x00377424 + +Now we must know the size of the unallocated flash: + + fis free + +Result: + +RedBoot> fis free + 0x500E0000 .. 0x503C0000 + +The values above may be different depending on the size of the filesystem and +the type of flash. See their usage below as an example and take care of +substituting yours appropriately. + +We must determine some values: + +size of unallocated flash: 0x503c0000 - 0x500e0000 = 0x2e0000 +size of the filesystem image: 0x00377424 - 0x00100000 = 0x277424 + +We want to fit the filesystem image of course, but we also want to give it all +the remaining flash space as well. To write it: + + fis unlock -f 0x500E0000 -l 0x2e0000 + fis erase -f 0x500E0000 -l 0x2e0000 + fis write -b 0x100000 -l 0x277424 -f 0x500E0000 + fis create "JFFS2" -n -f 0x500E0000 -l 0x2e0000 + +Now the filesystem is associated to a MTD "partition" once Linux has discovered +what they are in the boot process. From Redboot, the 'fis list' command +displays them: + +RedBoot> fis list +Name FLASH addr Mem addr Length Entry point +RedBoot 0x50000000 0x50000000 0x00020000 0x00000000 +RedBoot config 0x503C0000 0x503C0000 0x00020000 0x00000000 +FIS directory 0x503E0000 0x503E0000 0x00020000 0x00000000 +Linux kernel 0x50020000 0x00100000 0x000C0000 0x00000000 +JFFS2 0x500E0000 0x500E0000 0x002E0000 0x00000000 + +However Linux should display something like: + +SA1100 flash: probing 32-bit flash bus +SA1100 flash: Found 2 x16 devices at 0x0 in 32-bit mode +Using RedBoot partition definition +Creating 5 MTD partitions on "SA1100 flash": +0x00000000-0x00020000 : "RedBoot" +0x00020000-0x000e0000 : "Linux kernel" +0x000e0000-0x003c0000 : "JFFS2" +0x003c0000-0x003e0000 : "RedBoot config" +0x003e0000-0x00400000 : "FIS directory" + +What's important here is the position of the partition we are interested in, +which is the third one. Within Linux, this correspond to /dev/mtdblock2. +Therefore to boot Linux with the kernel and its root filesystem in flash, we +need this RedBoot command: + + fis load "Linux kernel" + exec -b 0x100000 -l 0xc0000 -c "root=/dev/mtdblock2" + +Of course other filesystems than JFFS might be used, like cramfs for example. +You might want to boot with a root filesystem over NFS, etc. It is also +possible, and sometimes more convenient, to flash a filesystem directly from +within Linux while booted from a ramdisk or NFS. The Linux MTD repository has +many tools to deal with flash memory as well, to erase it for example. JFFS2 +can then be mounted directly on a freshly erased partition and files can be +copied over directly. Etc... + + +RedBoot scripting +----------------- + +All the commands above aren't so useful if they have to be typed in every +time the Assabet is rebooted. Therefore it's possible to automatize the boot +process using RedBoot's scripting capability. + +For example, I use this to boot Linux with both the kernel and the ramdisk +images retrieved from a TFTP server on the network: + +RedBoot> fconfig +Run script at boot: false true +Boot script: +Enter script, terminate with empty line +>> load zImage -r -b 0x100000 +>> load ramdisk_ks.gz -r -b 0x800000 +>> exec -b 0x100000 -l 0xc0000 +>> +Boot script timeout (1000ms resolution): 3 +Use BOOTP for network configuration: true +GDB connection port: 9000 +Network debug at boot time: false +Update RedBoot non-volatile configuration - are you sure (y/n)? y + +Then, rebooting the Assabet is just a matter of waiting for the login prompt. + + + +Nicolas Pitre +nico@cam.org +June 12, 2001 + + +Status of peripherals in -rmk tree (updated 14/10/2001) +------------------------------------------------------- + +Assabet: + Serial ports: + Radio: TX, RX, CTS, DSR, DCD, RI + PM: Not tested. + COM: TX, RX, CTS, DSR, DCD, RTS, DTR, PM + PM: Not tested. + I2C: Implemented, not fully tested. + L3: Fully tested, pass. + PM: Not tested. + + Video: + LCD: Fully tested. PM + (LCD doesn't like being blanked with + neponset connected) + Video out: Not fully + + Audio: + UDA1341: + Playback: Fully tested, pass. + Record: Implemented, not tested. + PM: Not tested. + + UCB1200: + Audio play: Implemented, not heavily tested. + Audio rec: Implemented, not heavily tested. + Telco audio play: Implemented, not heavily tested. + Telco audio rec: Implemented, not heavily tested. + POTS control: No + Touchscreen: Yes + PM: Not tested. + + Other: + PCMCIA: + LPE: Fully tested, pass. + USB: No + IRDA: + SIR: Fully tested, pass. + FIR: Fully tested, pass. + PM: Not tested. + +Neponset: + Serial ports: + COM1,2: TX, RX, CTS, DSR, DCD, RTS, DTR + PM: Not tested. + USB: Implemented, not heavily tested. + PCMCIA: Implemented, not heavily tested. + PM: Not tested. + CF: Implemented, not heavily tested. + PM: Not tested. + +More stuff can be found in the -np (Nicolas Pitre's) tree. + diff --git a/Documentation/arm/SA1100/Brutus b/Documentation/arm/SA1100/Brutus new file mode 100644 index 000000000000..2254c8f0b326 --- /dev/null +++ b/Documentation/arm/SA1100/Brutus @@ -0,0 +1,66 @@ +Brutus is an evaluation platform for the SA1100 manufactured by Intel. +For more details, see: + +http://developer.intel.com/design/strong/applnots/sa1100lx/getstart.htm + +To compile for Brutus, you must issue the following commands: + + make brutus_config + make config + [accept all the defaults] + make zImage + +The resulting kernel will end up in linux/arch/arm/boot/zImage. This file +must be loaded at 0xc0008000 in Brutus's memory and execution started at +0xc0008000 as well with the value of registers r0 = 0 and r1 = 16 upon +entry. + +But prior to execute the kernel, a ramdisk image must also be loaded in +memory. Use memory address 0xd8000000 for this. Note that the file +containing the (compressed) ramdisk image must not exceed 4 MB. + +Typically, you'll need angelboot to load the kernel. +The following angelboot.opt file should be used: + +----- begin angelboot.opt ----- +base 0xc0008000 +entry 0xc0008000 +r0 0x00000000 +r1 0x00000010 +device /dev/ttyS0 +options "9600 8N1" +baud 115200 +otherfile ramdisk_img.gz +otherbase 0xd8000000 +----- end angelboot.opt ----- + +Then load the kernel and ramdisk with: + + angelboot -f angelboot.opt zImage + +The first Brutus serial port (assumed to be linked to /dev/ttyS0 on your +host PC) is used by angel to load the kernel and ramdisk image. The serial +console is provided through the second Brutus serial port. To access it, +you may use minicom configured with /dev/ttyS1, 9600 baud, 8N1, no flow +control. + +Currently supported: + - RS232 serial ports + - audio output + - LCD screen + - keyboard + +The actual Brutus support may not be complete without extra patches. +If such patches exist, they should be found from +ftp.netwinder.org/users/n/nico. + +A full PCMCIA support is still missing, although it's possible to hack +some drivers in order to drive already inserted cards at boot time with +little modifications. + +Any contribution is welcome. + +Please send patches to nico@cam.org + +Have Fun ! + diff --git a/Documentation/arm/SA1100/CERF b/Documentation/arm/SA1100/CERF new file mode 100644 index 000000000000..b3d845301ef1 --- /dev/null +++ b/Documentation/arm/SA1100/CERF @@ -0,0 +1,29 @@ +*** The StrongARM version of the CerfBoard/Cube has been discontinued *** + +The Intrinsyc CerfBoard is a StrongARM 1110-based computer on a board +that measures approximately 2" square. It includes an Ethernet +controller, an RS232-compatible serial port, a USB function port, and +one CompactFlash+ slot on the back. Pictures can be found at the +Intrinsyc website, http://www.intrinsyc.com. + +This document describes the support in the Linux kernel for the +Intrinsyc CerfBoard. + +Supported in this version: + - CompactFlash+ slot (select PCMCIA in General Setup and any options + that may be required) + - Onboard Crystal CS8900 Ethernet controller (Cerf CS8900A support in + Network Devices) + - Serial ports with a serial console (hardcoded to 38400 8N1) + +In order to get this kernel onto your Cerf, you need a server that runs +both BOOTP and TFTP. Detailed instructions should have come with your +evaluation kit on how to use the bootloader. This series of commands +will suffice: + + make ARCH=arm CROSS_COMPILE=arm-linux- cerfcube_defconfig + make ARCH=arm CROSS_COMPILE=arm-linux- zImage + make ARCH=arm CROSS_COMPILE=arm-linux- modules + cp arch/arm/boot/zImage <TFTP directory> + +support@intrinsyc.com diff --git a/Documentation/arm/SA1100/FreeBird b/Documentation/arm/SA1100/FreeBird new file mode 100644 index 000000000000..eda28b3232e7 --- /dev/null +++ b/Documentation/arm/SA1100/FreeBird @@ -0,0 +1,21 @@ +Freebird-1.1 is produced by Legned(C) ,Inc. +(http://www.legend.com.cn) +and software/linux mainatined by Coventive(C),Inc. +(http://www.coventive.com) + +Based on the Nicolas's strongarm kernel tree. + +=============================================================== +Maintainer: + +Chester Kuo <chester@coventive.com> + <chester@linux.org.tw> + +Author : +Tim wu <timwu@coventive.com> +CIH <cih@coventive.com> +Eric Peng <ericpeng@coventive.com> +Jeff Lee <jeff_lee@coventive.com> +Allen Cheng +Tony Liu <tonyliu@coventive.com> + diff --git a/Documentation/arm/SA1100/GraphicsClient b/Documentation/arm/SA1100/GraphicsClient new file mode 100644 index 000000000000..8fa7e8027ff1 --- /dev/null +++ b/Documentation/arm/SA1100/GraphicsClient @@ -0,0 +1,98 @@ +ADS GraphicsClient Plus Single Board Computer + +For more details, contact Applied Data Systems or see +http://www.applieddata.net/products.html + +The original Linux support for this product has been provided by +Nicolas Pitre <nico@cam.org>. Continued development work by +Woojung Huh <whuh@applieddata.net> + +It's currently possible to mount a root filesystem via NFS providing a +complete Linux environment. Otherwise a ramdisk image may be used. The +board supports MTD/JFFS, so you could also mount something on there. + +Use 'make graphicsclient_config' before any 'make config'. This will set up +defaults for GraphicsClient Plus support. + +The kernel zImage is linked to be loaded and executed at 0xc0200000. +Also the following registers should have the specified values upon entry: + + r0 = 0 + r1 = 29 (this is the GraphicsClient architecture number) + +Linux can be used with the ADS BootLoader that ships with the +newer rev boards. See their documentation on how to load Linux. +Angel is not available for the GraphicsClient Plus AFAIK. + +There is a board known as just the GraphicsClient that ADS used to +produce but has end of lifed. This code will not work on the older +board with the ADS bootloader, but should still work with Angel, +as outlined below. In any case, if you're planning on deploying +something en masse, you should probably get the newer board. + +If using Angel on the older boards, here is a typical angel.opt option file +if the kernel is loaded through the Angel Debug Monitor: + +----- begin angelboot.opt ----- +base 0xc0200000 +entry 0xc0200000 +r0 0x00000000 +r1 0x0000001d +device /dev/ttyS1 +options "38400 8N1" +baud 115200 +#otherfile ramdisk.gz +#otherbase 0xc0800000 +exec minicom +----- end angelboot.opt ----- + +Then the kernel (and ramdisk if otherfile/otherbase lines above are +uncommented) would be loaded with: + + angelboot -f angelboot.opt zImage + +Here it is assumed that the board is connected to ttyS1 on your PC +and that minicom is preconfigured with /dev/ttyS1, 38400 baud, 8N1, no flow +control by default. + +If any other bootloader is used, ensure it accomplish the same, especially +for r0/r1 register values before jumping into the kernel. + + +Supported peripherals: +- SA1100 LCD frame buffer (8/16bpp...sort of) +- on-board SMC 92C96 ethernet NIC +- SA1100 serial port +- flash memory access (MTD/JFFS) +- pcmcia +- touchscreen(ucb1200) +- ps/2 keyboard +- console on LCD screen +- serial ports (ttyS[0-2]) + - ttyS0 is default for serial console +- Smart I/O (ADC, keypad, digital inputs, etc) + See http://www.applieddata.com/developers/linux for IOCTL documentation + and example user space code. ps/2 keybd is multiplexed through this driver + +To do: +- UCB1200 audio with new ucb_generic layer +- everything else! :-) + +Notes: + +- The flash on board is divided into 3 partitions. mtd0 is where + the ADS boot ROM and zImage is stored. It's been marked as + read-only to keep you from blasting over the bootloader. :) mtd1 is + for the ramdisk.gz image. mtd2 is user flash space and can be + utilized for either JFFS or if you're feeling crazy, running ext2 + on top of it. If you're not using the ADS bootloader, you're + welcome to blast over the mtd1 partition also. + +- 16bpp mode requires a different cable than what ships with the board. + Contact ADS or look through the manual to wire your own. Currently, + if you compile with 16bit mode support and switch into a lower bpp + mode, the timing is off so the image is corrupted. This will be + fixed soon. + +Any contribution can be sent to nico@cam.org and will be greatly welcome! + diff --git a/Documentation/arm/SA1100/GraphicsMaster b/Documentation/arm/SA1100/GraphicsMaster new file mode 100644 index 000000000000..dd28745ac521 --- /dev/null +++ b/Documentation/arm/SA1100/GraphicsMaster @@ -0,0 +1,53 @@ +ADS GraphicsMaster Single Board Computer + +For more details, contact Applied Data Systems or see +http://www.applieddata.net/products.html + +The original Linux support for this product has been provided by +Nicolas Pitre <nico@cam.org>. Continued development work by +Woojung Huh <whuh@applieddata.net> + +Use 'make graphicsmaster_config' before any 'make config'. +This will set up defaults for GraphicsMaster support. + +The kernel zImage is linked to be loaded and executed at 0xc0400000. + +Linux can be used with the ADS BootLoader that ships with the +newer rev boards. See their documentation on how to load Linux. + +Supported peripherals: +- SA1100 LCD frame buffer (8/16bpp...sort of) +- SA1111 USB Master +- on-board SMC 92C96 ethernet NIC +- SA1100 serial port +- flash memory access (MTD/JFFS) +- pcmcia, compact flash +- touchscreen(ucb1200) +- ps/2 keyboard +- console on LCD screen +- serial ports (ttyS[0-2]) + - ttyS0 is default for serial console +- Smart I/O (ADC, keypad, digital inputs, etc) + See http://www.applieddata.com/developers/linux for IOCTL documentation + and example user space code. ps/2 keybd is multiplexed through this driver + +To do: +- everything else! :-) + +Notes: + +- The flash on board is divided into 3 partitions. mtd0 is where + the zImage is stored. It's been marked as read-only to keep you + from blasting over the bootloader. :) mtd1 is + for the ramdisk.gz image. mtd2 is user flash space and can be + utilized for either JFFS or if you're feeling crazy, running ext2 + on top of it. If you're not using the ADS bootloader, you're + welcome to blast over the mtd1 partition also. + +- 16bpp mode requires a different cable than what ships with the board. + Contact ADS or look through the manual to wire your own. Currently, + if you compile with 16bit mode support and switch into a lower bpp + mode, the timing is off so the image is corrupted. This will be + fixed soon. + +Any contribution can be sent to nico@cam.org and will be greatly welcome! diff --git a/Documentation/arm/SA1100/HUW_WEBPANEL b/Documentation/arm/SA1100/HUW_WEBPANEL new file mode 100644 index 000000000000..fd56b48d4833 --- /dev/null +++ b/Documentation/arm/SA1100/HUW_WEBPANEL @@ -0,0 +1,17 @@ +The HUW_WEBPANEL is a product of the german company Hoeft & Wessel AG + +If you want more information, please visit +http://www.hoeft-wessel.de + +To build the kernel: + make huw_webpanel_config + make oldconfig + [accept all defaults] + make zImage + +Mostly of the work is done by: +Roman Jordan jor@hoeft-wessel.de +Christoph Schulz schu@hoeft-wessel.de + +2000/12/18/ + diff --git a/Documentation/arm/SA1100/Itsy b/Documentation/arm/SA1100/Itsy new file mode 100644 index 000000000000..3b594534323b --- /dev/null +++ b/Documentation/arm/SA1100/Itsy @@ -0,0 +1,39 @@ +Itsy is a research project done by the Western Research Lab, and Systems +Research Center in Palo Alto, CA. The Itsy project is one of several +research projects at Compaq that are related to pocket computing. + +For more information, see: + + http://www.research.digital.com/wrl/itsy/index.html + +Notes on initial 2.4 Itsy support (8/27/2000) : +The port was done on an Itsy version 1.5 machine with a daughtercard with +64 Meg of DRAM and 32 Meg of Flash. The initial work includes support for +serial console (to see what you're doing). No other devices have been +enabled. + +To build, do a "make menuconfig" (or xmenuconfig) and select Itsy support. +Disable Flash and LCD support. and then do a make zImage. +Finally, you will need to cd to arch/arm/boot/tools and execute a make there +to build the params-itsy program used to boot the kernel. + +In order to install the port of 2.4 to the itsy, You will need to set the +configuration parameters in the monitor as follows: +Arg 1:0x08340000, Arg2: 0xC0000000, Arg3:18 (0x12), Arg4:0 +Make sure the start-routine address is set to 0x00060000. + +Next, flash the params-itsy program to 0x00060000 ("p 1 0x00060000" in the +flash menu) Flash the kernel in arch/arm/boot/zImage into 0x08340000 +("p 1 0x00340000"). Finally flash an initial ramdisk into 0xC8000000 +("p 2 0x0") We used ramdisk-2-30.gz from the 0.11 version directory on +handhelds.org. + +The serial connection we established was at: + 8-bit data, no parity, 1 stop bit(s), 115200.00 b/s. in the monitor, in the +params-itsy program, and in the kernel itself. This can be changed, but +not easily. The monitor parameters are easily changed, the params program +setup is assembly outl's, and the kernel is a configuration item specific to +the itsy. (i.e. grep for CONFIG_SA1100_ITSY and you'll find where it is.) + + +This should get you a properly booting 2.4 kernel on the itsy. diff --git a/Documentation/arm/SA1100/LART b/Documentation/arm/SA1100/LART new file mode 100644 index 000000000000..2f73f513e16a --- /dev/null +++ b/Documentation/arm/SA1100/LART @@ -0,0 +1,14 @@ +Linux Advanced Radio Terminal (LART) +------------------------------------ + +The LART is a small (7.5 x 10cm) SA-1100 board, designed for embedded +applications. It has 32 MB DRAM, 4MB Flash ROM, double RS232 and all +other StrongARM-gadgets. Almost all SA signals are directly accessible +through a number of connectors. The powersupply accepts voltages +between 3.5V and 16V and is overdimensioned to support a range of +daughterboards. A quad Ethernet / IDE / PS2 / sound daughterboard +is under development, with plenty of others in different stages of +planning. + +The hardware designs for this board have been released under an open license; +see the LART page at http://www.lart.tudelft.nl/ for more information. diff --git a/Documentation/arm/SA1100/PLEB b/Documentation/arm/SA1100/PLEB new file mode 100644 index 000000000000..92cae066908d --- /dev/null +++ b/Documentation/arm/SA1100/PLEB @@ -0,0 +1,11 @@ +The PLEB project was started as a student initiative at the School of +Computer Science and Engineering, University of New South Wales to make a +pocket computer capable of running the Linux Kernel. + +PLEB support has yet to be fully integrated. + +For more information, see: + + http://www.cse.unsw.edu.au/~pleb/ + + diff --git a/Documentation/arm/SA1100/Pangolin b/Documentation/arm/SA1100/Pangolin new file mode 100644 index 000000000000..077a6120e129 --- /dev/null +++ b/Documentation/arm/SA1100/Pangolin @@ -0,0 +1,23 @@ +Pangolin is a StrongARM 1110-based evaluation platform produced +by Dialogue Technology (http://www.dialogue.com.tw/). +It has EISA slots for ease of configuration with SDRAM/Flash +memory card, USB/Serial/Audio card, Compact Flash card, +PCMCIA/IDE card and TFT-LCD card. + +To compile for Pangolin, you must issue the following commands: + + make pangolin_config + make oldconfig + make zImage + +Supported peripherals: +- SA1110 serial port (UART1/UART2/UART3) +- flash memory access +- compact flash driver +- UDA1341 sound driver +- SA1100 LCD controller for 800x600 16bpp TFT-LCD +- MQ-200 driver for 800x600 16bpp TFT-LCD +- Penmount(touch panel) driver +- PCMCIA driver +- SMC91C94 LAN driver +- IDE driver (experimental) diff --git a/Documentation/arm/SA1100/Tifon b/Documentation/arm/SA1100/Tifon new file mode 100644 index 000000000000..dd1934d9c851 --- /dev/null +++ b/Documentation/arm/SA1100/Tifon @@ -0,0 +1,7 @@ +Tifon +----- + +More info has to come... + +Contact: Peter Danielsson <peter.danielsson@era-t.ericsson.se> + diff --git a/Documentation/arm/SA1100/Victor b/Documentation/arm/SA1100/Victor new file mode 100644 index 000000000000..01e81fc49461 --- /dev/null +++ b/Documentation/arm/SA1100/Victor @@ -0,0 +1,16 @@ +Victor is known as a "digital talking book player" manufactured by +VisuAide, Inc. to be used by blind people. + +For more information related to Victor, see: + + http://www.visuaide.com/victor + +Of course Victor is using Linux as its main operating system. +The Victor implementation for Linux is maintained by Nicolas Pitre: + + nico@visuaide.com + nico@cam.org + +For any comments, please feel free to contact me through the above +addresses. + diff --git a/Documentation/arm/SA1100/Yopy b/Documentation/arm/SA1100/Yopy new file mode 100644 index 000000000000..e14f16d836ac --- /dev/null +++ b/Documentation/arm/SA1100/Yopy @@ -0,0 +1,2 @@ +See http://www.yopydeveloper.org for more. + diff --git a/Documentation/arm/SA1100/empeg b/Documentation/arm/SA1100/empeg new file mode 100644 index 000000000000..4ece4849a42c --- /dev/null +++ b/Documentation/arm/SA1100/empeg @@ -0,0 +1,2 @@ +See ../empeg/README + diff --git a/Documentation/arm/SA1100/nanoEngine b/Documentation/arm/SA1100/nanoEngine new file mode 100644 index 000000000000..fc431cbfefc2 --- /dev/null +++ b/Documentation/arm/SA1100/nanoEngine @@ -0,0 +1,11 @@ +nanoEngine +---------- + +"nanoEngine" is a SA1110 based single board computer from +Bright Star Engineering Inc. See www.brightstareng.com/arm +for more info. +(Ref: Stuart Adams <sja@brightstareng.com>) + +Also visit Larry Doolittle's "Linux for the nanoEngine" site: +http://recycle.lbl.gov/~ldoolitt/bse/ + diff --git a/Documentation/arm/SA1100/serial_UART b/Documentation/arm/SA1100/serial_UART new file mode 100644 index 000000000000..aea2e91ca0ef --- /dev/null +++ b/Documentation/arm/SA1100/serial_UART @@ -0,0 +1,47 @@ +The SA1100 serial port had its major/minor numbers officially assigned: + +> Date: Sun, 24 Sep 2000 21:40:27 -0700 +> From: H. Peter Anvin <hpa@transmeta.com> +> To: Nicolas Pitre <nico@CAM.ORG> +> Cc: Device List Maintainer <device@lanana.org> +> Subject: Re: device +> +> Okay. Note that device numbers 204 and 205 are used for "low density +> serial devices", so you will have a range of minors on those majors (the +> tty device layer handles this just fine, so you don't have to worry about +> doing anything special.) +> +> So your assignments are: +> +> 204 char Low-density serial ports +> 5 = /dev/ttySA0 SA1100 builtin serial port 0 +> 6 = /dev/ttySA1 SA1100 builtin serial port 1 +> 7 = /dev/ttySA2 SA1100 builtin serial port 2 +> +> 205 char Low-density serial ports (alternate device) +> 5 = /dev/cusa0 Callout device for ttySA0 +> 6 = /dev/cusa1 Callout device for ttySA1 +> 7 = /dev/cusa2 Callout device for ttySA2 +> + +If you're not using devfs, you must create those inodes in /dev +on the root filesystem used by your SA1100-based device: + + mknod ttySA0 c 204 5 + mknod ttySA1 c 204 6 + mknod ttySA2 c 204 7 + mknod cusa0 c 205 5 + mknod cusa1 c 205 6 + mknod cusa2 c 205 7 + +In addition to the creation of the appropriate device nodes above, you +must ensure your user space applications make use of the correct device +name. The classic example is the content of the /etc/inittab file where +you might have a getty process started on ttyS0. In this case: + +- replace occurrences of ttyS0 with ttySA0, ttyS1 with ttySA1, etc. + +- don't forget to add 'ttySA0', 'console', or the appropriate tty name + in /etc/securetty for root to be allowed to login as well. + + diff --git a/Documentation/arm/Samsung-S3C24XX/EB2410ITX.txt b/Documentation/arm/Samsung-S3C24XX/EB2410ITX.txt new file mode 100644 index 000000000000..000e3d7a78b2 --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/EB2410ITX.txt @@ -0,0 +1,58 @@ + Simtec Electronics EB2410ITX (BAST) + =================================== + + http://www.simtec.co.uk/products/EB2410ITX/ + +Introduction +------------ + + The EB2410ITX is a S3C2410 based development board with a variety of + peripherals and expansion connectors. This board is also known by + the shortened name of Bast. + + +Configuration +------------- + + To set the default configuration, use `make bast_defconfig` which + supports the commonly used features of this board. + + +Support +------- + + Official support information can be found on the Simtec Electronics + website, at the product page http://www.simtec.co.uk/products/EB2410ITX/ + + Useful links: + + - Resources Page http://www.simtec.co.uk/products/EB2410ITX/resources.html + + - Board FAQ at http://www.simtec.co.uk/products/EB2410ITX/faq.html + + - Bootloader info http://www.simtec.co.uk/products/SWABLE/resources.html + and FAQ http://www.simtec.co.uk/products/SWABLE/faq.html + + +MTD +--- + + The NAND and NOR support has been merged from the linux-mtd project. + Any prolbems, see http://www.linux-mtd.infradead.org/ for more + information or up-to-date versions of linux-mtd. + + +IDE +--- + + Both onboard IDE ports are supported, however there is no support for + changing speed of devices, PIO Mode 4 capable drives should be used. + + +Maintainers +----------- + + This board is maintained by Simtec Electronics. + + +(c) 2004 Ben Dooks, Simtec Electronics diff --git a/Documentation/arm/Samsung-S3C24XX/GPIO.txt b/Documentation/arm/Samsung-S3C24XX/GPIO.txt new file mode 100644 index 000000000000..0822764ec270 --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/GPIO.txt @@ -0,0 +1,122 @@ + S3C2410 GPIO Control + ==================== + +Introduction +------------ + + The s3c2410 kernel provides an interface to configure and + manipulate the state of the GPIO pins, and find out other + information about them. + + There are a number of conditions attached to the configuration + of the s3c2410 GPIO system, please read the Samsung provided + data-sheet/users manual to find out the complete list. + + +Headers +------- + + See include/asm-arm/arch-s3c2410/regs-gpio.h for the list + of GPIO pins, and the configuration values for them. This + is included by using #include <asm/arch/regs-gpio.h> + + The GPIO management functions are defined in the hardware + header include/asm-arm/arch-s3c2410/hardware.h which can be + included by #include <asm/arch/hardware.h> + + A useful ammount of documentation can be found in the hardware + header on how the GPIO functions (and others) work. + + Whilst a number of these functions do make some checks on what + is passed to them, for speed of use, they may not always ensure + that the user supplied data to them is correct. + + +PIN Numbers +----------- + + Each pin has an unique number associated with it in regs-gpio.h, + eg S3C2410_GPA0 or S3C2410_GPF1. These defines are used to tell + the GPIO functions which pin is to be used. + + +Configuring a pin +----------------- + + The following function allows the configuration of a given pin to + be changed. + + void s3c2410_gpio_cfgpin(unsigned int pin, unsigned int function); + + Eg: + + s3c2410_gpio_cfgpin(S3C2410_GPA0, S3C2410_GPA0_ADDR0); + s3c2410_gpio_cfgpin(S3C2410_GPE8, S3C2410_GPE8_SDDAT1); + + which would turn GPA0 into the lowest Address line A0, and set + GPE8 to be connected to the SDIO/MMC controller's SDDAT1 line. + + +Reading the current configuration +--------------------------------- + + The current configuration of a pin can be read by using: + + s3c2410_gpio_getcfg(unsigned int pin); + + The return value will be from the same set of values which can be + passed to s3c2410_gpio_cfgpin(). + + +Configuring a pull-up resistor +------------------------------ + + A large proportion of the GPIO pins on the S3C2410 can have weak + pull-up resistors enabled. This can be configured by the following + function: + + void s3c2410_gpio_pullup(unsigned int pin, unsigned int to); + + Where the to value is zero to set the pull-up off, and 1 to enable + the specified pull-up. Any other values are currently undefined. + + +Getting the state of a PIN +-------------------------- + + The state of a pin can be read by using the function: + + unsigned int s3c2410_gpio_getpin(unsigned int pin); + + This will return either zero or non-zero. Do not count on this + function returning 1 if the pin is set. + + +Setting the state of a PIN +-------------------------- + + The value an pin is outputing can be modified by using the following: + + void s3c2410_gpio_setpin(unsigned int pin, unsigned int to); + + Which sets the given pin to the value. Use 0 to write 0, and 1 to + set the output to 1. + + +Getting the IRQ number associated with a PIN +-------------------------------------------- + + The following function can map the given pin number to an IRQ + number to pass to the IRQ system. + + int s3c2410_gpio_getirq(unsigned int pin); + + Note, not all pins have an IRQ. + + +Authour +------- + + +Ben Dooks, 03 October 2004 +(c) 2004 Ben Dooks, Simtec Electronics diff --git a/Documentation/arm/Samsung-S3C24XX/H1940.txt b/Documentation/arm/Samsung-S3C24XX/H1940.txt new file mode 100644 index 000000000000..d6b1de92b111 --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/H1940.txt @@ -0,0 +1,40 @@ + HP IPAQ H1940 + ============= + +http://www.handhelds.org/projects/h1940.html + +Introduction +------------ + + The HP H1940 is a S3C2410 based handheld device, with + bluetooth connectivity. + + +Support +------- + + A variety of information is available + + handhelds.org project page: + + http://www.handhelds.org/projects/h1940.html + + handhelds.org wiki page: + + http://handhelds.org/moin/moin.cgi/HpIpaqH1940 + + Herbert Pötzl pages: + + http://vserver.13thfloor.at/H1940/ + + +Maintainers +----------- + + This project is being maintained and developed by a variety + of people, including Ben Dooks, Arnaud Patard, and Herbert Pötzl. + + Thanks to the many others who have also provided support. + + +(c) 2005 Ben Dooks
\ No newline at end of file diff --git a/Documentation/arm/Samsung-S3C24XX/Overview.txt b/Documentation/arm/Samsung-S3C24XX/Overview.txt new file mode 100644 index 000000000000..3af4d29a8938 --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/Overview.txt @@ -0,0 +1,156 @@ + S3C24XX ARM Linux Overview + ========================== + + + +Introduction +------------ + + The Samsung S3C24XX range of ARM9 System-on-Chip CPUs are supported + by the 's3c2410' architecture of ARM Linux. Currently the S3C2410 and + the S3C2440 are supported CPUs. + + +Configuration +------------- + + A generic S3C2410 configuration is provided, and can be used as the + default by `make s3c2410_defconfig`. This configuration has support + for all the machines, and the commonly used features on them. + + Certain machines may have their own default configurations as well, + please check the machine specific documentation. + + +Machines +-------- + + The currently supported machines are as follows: + + Simtec Electronics EB2410ITX (BAST) + + A general purpose development board, see EB2410ITX.txt for further + details + + Samsung SMDK2410 + + Samsung's own development board, geared for PDA work. + + Samsung/Meritech SMDK2440 + + The S3C2440 compatible version of the SMDK2440 + + Thorcom VR1000 + + Custom embedded board + + HP IPAQ 1940 + + Handheld (IPAQ), available in several varieties + + HP iPAQ rx3715 + + S3C2440 based IPAQ, with a number of variations depending on + features shipped. + + Acer N30 + + A S3C2410 based PDA from Acer. There is a Wiki page at + http://handhelds.org/moin/moin.cgi/AcerN30Documentation . + + +Adding New Machines +------------------- + + The archicture has been designed to support as many machines as can + be configured for it in one kernel build, and any future additions + should keep this in mind before altering items outside of their own + machine files. + + Machine definitions should be kept in linux/arch/arm/mach-s3c2410, + and there are a number of examples that can be looked at. + + Read the kernel patch submission policies as well as the + Documentation/arm directory before submitting patches. The + ARM kernel series is managed by Russell King, and has a patch system + located at http://www.arm.linux.org.uk/developer/patches/ + as well as mailing lists that can be found from the same site. + + As a courtesy, please notify <ben-linux@fluff.org> of any new + machines or other modifications. + + Any large scale modifications, or new drivers should be discussed + on the ARM kernel mailing list (linux-arm-kernel) before being + attempted. + + +NAND +---- + + The current kernels now have support for the s3c2410 NAND + controller. If there are any problems the latest linux-mtd + CVS can be found from http://www.linux-mtd.infradead.org/ + + +Serial +------ + + The s3c2410 serial driver provides support for the internal + serial ports. These devices appear as /dev/ttySAC0 through 3. + + To create device nodes for these, use the following commands + + mknod ttySAC0 c 204 64 + mknod ttySAC1 c 204 65 + mknod ttySAC2 c 204 66 + + +GPIO +---- + + The core contains support for manipulating the GPIO, see the + documentation in GPIO.txt in the same directory as this file. + + +Clock Management +---------------- + + The core provides the interface defined in the header file + include/asm-arm/hardware/clock.h, to allow control over the + various clock units + + +Port Contributors +----------------- + + Ben Dooks (BJD) + Vincent Sanders + Herbert Potzl + Arnaud Patard (RTP) + Roc Wu + Klaus Fetscher + Dimitry Andric + Shannon Holland + Guillaume Gourat (NexVision) + Christer Weinigel (wingel) (Acer N30) + Lucas Correia Villa Real (S3C2400 port) + + +Document Changes +---------------- + + 05 Sep 2004 - BJD - Added Document Changes section + 05 Sep 2004 - BJD - Added Klaus Fetscher to list of contributors + 25 Oct 2004 - BJD - Added Dimitry Andric to list of contributors + 25 Oct 2004 - BJD - Updated the MTD from the 2.6.9 merge + 21 Jan 2005 - BJD - Added rx3715, added Shannon to contributors + 10 Feb 2005 - BJD - Added Guillaume Gourat to contributors + 02 Mar 2005 - BJD - Added SMDK2440 to list of machines + 06 Mar 2005 - BJD - Added Christer Weinigel + 08 Mar 2005 - BJD - Added LCVR to list of people, updated introduction + 08 Mar 2005 - BJD - Added section on adding machines + +Document Author +--------------- + +Ben Dooks, (c) 2004-2005 Simtec Electronics diff --git a/Documentation/arm/Samsung-S3C24XX/SMDK2440.txt b/Documentation/arm/Samsung-S3C24XX/SMDK2440.txt new file mode 100644 index 000000000000..32e1eae6a25f --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/SMDK2440.txt @@ -0,0 +1,56 @@ + Samsung/Meritech SMDK2440 + ========================= + +Introduction +------------ + + The SMDK2440 is a two part evaluation board for the Samsung S3C2440 + processor. It includes support for LCD, SmartMedia, Audio, SD and + 10MBit Ethernet, and expansion headers for various signals, including + the camera and unused GPIO. + + +Configuration +------------- + + To set the default configuration, use `make smdk2440_defconfig` which + will configure the common features of this board, or use + `make s3c2410_config` to include support for all s3c2410/s3c2440 machines + + +Support +------- + + Ben Dooks' SMDK2440 site at http://www.fluff.org/ben/smdk2440/ which + includes linux based USB download tools. + + Some of the h1940 patches that can be found from the H1940 project + site at http://www.handhelds.org/projects/h1940.html can also be + applied to this board. + + +Peripherals +----------- + + There is no current support for any of the extra peripherals on the + base-board itself. + + +MTD +--- + + The NAND flash should be supported by the in kernel MTD NAND support, + NOR flash will be added later. + + +Maintainers +----------- + + This board is being maintained by Ben Dooks, for more info, see + http://www.fluff.org/ben/smdk2440/ + + Many thanks to Dimitry Andric of TomTom for the loan of the SMDK2440, + and to Simtec Electronics for allowing me time to work on this. + + +(c) 2004 Ben Dooks
\ No newline at end of file diff --git a/Documentation/arm/Samsung-S3C24XX/Suspend.txt b/Documentation/arm/Samsung-S3C24XX/Suspend.txt new file mode 100644 index 000000000000..e12bc3284a27 --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/Suspend.txt @@ -0,0 +1,106 @@ + S3C24XX Suspend Support + ======================= + + +Introduction +------------ + + The S3C2410 supports a low-power suspend mode, where the SDRAM is kept + in Self-Refresh mode, and all but the essential peripheral blocks are + powered down. For more information on how this works, please look + at the S3C2410 datasheets from Samsung. + + +Requirements +------------ + + 1) A bootloader that can support the necessary resume operation + + 2) Support for at least 1 source for resume + + 3) CONFIG_PM enabled in the kernel + + 4) Any peripherals that are going to be powered down at the same + time require suspend/resume support. + + +Resuming +-------- + + The S3C2410 user manual defines the process of sending the CPU to + sleep and how it resumes. The default behaviour of the Linux code + is to set the GSTATUS3 register to the physical address of the + code to resume Linux operation. + + GSTATUS4 is currently left alone by the sleep code, and is free to + use for any other purposes (for example, the EB2410ITX uses this to + save memory configuration in). + + +Machine Support +--------------- + + The machine specific functions must call the s3c2410_pm_init() function + to say that its bootloader is capable of resuming. This can be as + simple as adding the following to the machine's definition: + + INITMACHINE(s3c2410_pm_init) + + A board can do its own setup before calling s3c2410_pm_init, if it + needs to setup anything else for power management support. + + There is currently no support for over-riding the default method of + saving the resume address, if your board requires it, then contact + the maintainer and discuss what is required. + + Note, the original method of adding an late_initcall() is wrong, + and will end up initialising all compiled machines' pm init! + + +Debugging +--------- + + There are several important things to remember when using PM suspend: + + 1) The uart drivers will disable the clocks to the UART blocks when + suspending, which means that use of printascii() or similar direct + access to the UARTs will cause the debug to stop. + + 2) Whilst the pm code itself will attempt to re-enable the UART clocks, + care should be taken that any external clock sources that the UARTs + rely on are still enabled at that point. + + +Configuration +------------- + + The S3C2410 specific configuration in `System Type` defines various + aspects of how the S3C2410 suspend and resume support is configured + + `S3C2410 PM Suspend debug` + + This option prints messages to the serial console before and after + the actual suspend, giving detailed information on what is + happening + + + `S3C2410 PM Suspend Memory CRC` + + Allows the entire memory to be checksummed before and after the + suspend to see if there has been any corruption of the contents. + + This support requires the CRC32 function to be enabled. + + + `S3C2410 PM Suspend CRC Chunksize (KiB)` + + Defines the size of memory each CRC chunk covers. A smaller value + will mean that the CRC data block will take more memory, but will + identify any faults with better precision + + +Document Author +--------------- + +Ben Dooks, (c) 2004 Simtec Electronics + diff --git a/Documentation/arm/Setup b/Documentation/arm/Setup new file mode 100644 index 000000000000..0abd0720d7ed --- /dev/null +++ b/Documentation/arm/Setup @@ -0,0 +1,129 @@ +Kernel initialisation parameters on ARM Linux +--------------------------------------------- + +The following document describes the kernel initialisation parameter +structure, otherwise known as 'struct param_struct' which is used +for most ARM Linux architectures. + +This structure is used to pass initialisation parameters from the +kernel loader to the Linux kernel proper, and may be short lived +through the kernel initialisation process. As a general rule, it +should not be referenced outside of arch/arm/kernel/setup.c:setup_arch(). + +There are a lot of parameters listed in there, and they are described +below: + + page_size + + This parameter must be set to the page size of the machine, and + will be checked by the kernel. + + nr_pages + + This is the total number of pages of memory in the system. If + the memory is banked, then this should contain the total number + of pages in the system. + + If the system contains separate VRAM, this value should not + include this information. + + ramdisk_size + + This is now obsolete, and should not be used. + + flags + + Various kernel flags, including: + bit 0 - 1 = mount root read only + bit 1 - unused + bit 2 - 0 = load ramdisk + bit 3 - 0 = prompt for ramdisk + + rootdev + + major/minor number pair of device to mount as the root filesystem. + + video_num_cols + video_num_rows + + These two together describe the character size of the dummy console, + or VGA console character size. They should not be used for any other + purpose. + + It's generally a good idea to set these to be either standard VGA, or + the equivalent character size of your fbcon display. This then allows + all the bootup messages to be displayed correctly. + + video_x + video_y + + This describes the character position of cursor on VGA console, and + is otherwise unused. (should not used for other console types, and + should not be used for other purposes). + + memc_control_reg + + MEMC chip control register for Acorn Archimedes and Acorn A5000 + based machines. May be used differently by different architectures. + + sounddefault + + Default sound setting on Acorn machines. May be used differently by + different architectures. + + adfsdrives + + Number of ADFS/MFM disks. May be used differently by different + architectures. + + bytes_per_char_h + bytes_per_char_v + + These are now obsolete, and should not be used. + + pages_in_bank[4] + + Number of pages in each bank of the systems memory (used for RiscPC). + This is intended to be used on systems where the physical memory + is non-contiguous from the processors point of view. + + pages_in_vram + + Number of pages in VRAM (used on Acorn RiscPC). This value may also + be used by loaders if the size of the video RAM can't be obtained + from the hardware. + + initrd_start + initrd_size + + This describes the kernel virtual start address and size of the + initial ramdisk. + + rd_start + + Start address in sectors of the ramdisk image on a floppy disk. + + system_rev + + system revision number. + + system_serial_low + system_serial_high + + system 64-bit serial number + + mem_fclk_21285 + + The speed of the external oscillator to the 21285 (footbridge), + which control's the speed of the memory bus, timer & serial port. + Depending upon the speed of the cpu its value can be between + 0-66 MHz. If no params are passed or a value of zero is passed, + then a value of 50 Mhz is the default on 21285 architectures. + + paths[8][128] + + These are now obsolete, and should not be used. + + commandline + + Kernel command line parameters. Details can be found elsewhere. diff --git a/Documentation/arm/Sharp-LH/CompactFlash b/Documentation/arm/Sharp-LH/CompactFlash new file mode 100644 index 000000000000..8616d877df9e --- /dev/null +++ b/Documentation/arm/Sharp-LH/CompactFlash @@ -0,0 +1,32 @@ +README on the Compact Flash for Card Engines +============================================ + +There are three challenges in supporting the CF interface of the Card +Engines. First, every IO operation must be followed with IO to +another memory region. Second, the slot is wired for one-to-one +address mapping *and* it is wired for 16 bit access only. Second, the +interrupt request line from the CF device isn't wired. + +The IOBARRIER issue is covered in README.IOBARRIER. This isn't an +onerous problem. Enough said here. + +The addressing issue is solved in the +arch/arm/mach-lh7a40x/ide-lpd7a40x.c file with some awkward +work-arounds. We implement a special SELECT_DRIVE routine that is +called before the IDE driver performs its own SELECT_DRIVE. Our code +recognizes that the SELECT register cannot be modified without also +writing a command. It send an IDLE_IMMEDIATE command on selecting a +drive. The function also prevents drive select to the slave drive +since there can be only one. The awkward part is that the IDE driver, +even though we have a select procedure, also attempts to change the +drive by writing directly the SELECT register. This attempt is +explicitly blocked by the OUTB function--not pretty, but effective. + +The lack of interrupts is a more serious problem. Even though the CF +card is fast when compared to a normal IDE device, we don't know that +the CF is really flash. A user could use one of the very small hard +drives being shipped with a CF interface. The IDE code includes a +check for interfaces that lack an IRQ. In these cases, submitting a +command to the IDE controller is followed by a call to poll for +completion. If the device isn't immediately ready, it schedules a +timer to poll again later. diff --git a/Documentation/arm/Sharp-LH/IOBarrier b/Documentation/arm/Sharp-LH/IOBarrier new file mode 100644 index 000000000000..c0d8853672dc --- /dev/null +++ b/Documentation/arm/Sharp-LH/IOBarrier @@ -0,0 +1,45 @@ +README on the IOBARRIER for CardEngine IO +========================================= + +Due to an unfortunate oversight when the Card Engines were designed, +the signals that control access to some peripherals, most notably the +SMC91C9111 ethernet controller, are not properly handled. + +The symptom is that some back to back IO with the peripheral returns +unreliable data. With the SMC chip, you'll see errors about the bank +register being 'screwed'. + +The cause is that the AEN signal to the SMC chip does not transition +for every memory access. It is driven through the CPLD from the CS7 +line of the CPU's static memory controller which is optimized to +eliminate unnecessary transitions. Yet, the SMC requires a transition +for every write access. The Sharp website has more information about +the effect this power-conserving feature has on peripheral +interfacing. + +The solution is to follow every write access to the SMC chip with an +access to another memory region that will force the CPU to release the +chip select line. It is important to guarantee that this access +forces the CPU off-chip. We map a page of SDRAM as if it were an +uncacheable IO device and read from it after every SMC IO write +operation. + + SMC IO + BARRIER IO + +Only this sequence is important. It does not matter that there is no +BARRIER IO before the access to the SMC chip because the AEN latch +only needs occurs after the SMC IO write cycle. The routines that +implement this work-around make an additional concession which is to +disable interrupts during the IO sequence. Other hardware devices +(the LogicPD CPLD) have registers in the same the physical memory +region as the SMC chip. An interrupt might allow an access to one of +those registers while SMC IO is being performed. + +You might be tempted to think that we have to access another device +attached to the static memory controller, but the empirical evidence +indicates that this is not so. Mapping 0x00000000 (flash) and +0xc0000000 (SDRAM) appear to have the same effect. Using SDRAM seems +to be faster. Choosing to access an undecoded memory region is not +desirable as there is no way to know how that chip select will be used +in the future. diff --git a/Documentation/arm/Sharp-LH/KEV7A400 b/Documentation/arm/Sharp-LH/KEV7A400 new file mode 100644 index 000000000000..be32b14cd535 --- /dev/null +++ b/Documentation/arm/Sharp-LH/KEV7A400 @@ -0,0 +1,8 @@ +README on Implementing Linux for Sharp's KEV7a400 +================================================= + +This product has been discontinued by Sharp. For the time being, the +partially implemented code remains in the kernel. At some point in +the future, either the code will be finished or it will be removed +completely. This depends primarily on how many of the development +boards are in the field. diff --git a/Documentation/arm/Sharp-LH/LPD7A400 b/Documentation/arm/Sharp-LH/LPD7A400 new file mode 100644 index 000000000000..3275b453bfdf --- /dev/null +++ b/Documentation/arm/Sharp-LH/LPD7A400 @@ -0,0 +1,15 @@ +README on Implementing Linux for the Logic PD LPD7A400-10 +========================================================= + +- CPLD memory mapping + + The board designers chose to use high address lines for controlling + access to the CPLD registers. It turns out to be a big waste + because we're using an MMU and must map IO space into virtual + memory. The result is that we have to make a mapping for every + register. + +- Serial Console + + It may be OK not to use the serial console option if the user passes + the console device name to the kernel. This deserves some exploration. diff --git a/Documentation/arm/Sharp-LH/LPD7A40X b/Documentation/arm/Sharp-LH/LPD7A40X new file mode 100644 index 000000000000..8c29a27e208f --- /dev/null +++ b/Documentation/arm/Sharp-LH/LPD7A40X @@ -0,0 +1,16 @@ +README on Implementing Linux for the Logic PD LPD7A40X-10 +========================================================= + +- CPLD memory mapping + + The board designers chose to use high address lines for controlling + access to the CPLD registers. It turns out to be a big waste + because we're using an MMU and must map IO space into virtual + memory. The result is that we have to make a mapping for every + register. + +- Serial Console + + It may be OK not to use the serial console option if the user passes + the console device name to the kernel. This deserves some exploration. + diff --git a/Documentation/arm/Sharp-LH/SDRAM b/Documentation/arm/Sharp-LH/SDRAM new file mode 100644 index 000000000000..93ddc23c2faa --- /dev/null +++ b/Documentation/arm/Sharp-LH/SDRAM @@ -0,0 +1,51 @@ +README on the SDRAM Controller for the LH7a40X +============================================== + +The standard configuration for the SDRAM controller generates a sparse +memory array. The precise layout is determined by the SDRAM chips. A +default kernel configuration assembles the discontiguous memory +regions into separate memory nodes via the NUMA (Non-Uniform Memory +Architecture) facilities. In this default configuration, the kernel +is forgiving about the precise layout. As long as it is given an +accurate picture of available memory by the bootloader the kernel will +execute correctly. + +The SDRC supports a mode where some of the chip select lines are +swapped in order to make SDRAM look like a synchronous ROM. Setting +this bit means that the RAM will present as a contiguous array. Some +programmers prefer this to the discontiguous layout. Be aware that +may be a penalty for this feature where some some configurations of +memory are significantly reduced; i.e. 64MiB of RAM appears as only 32 +MiB. + +There are a couple of configuration options to override the default +behavior. When the SROMLL bit is set and memory appears as a +contiguous array, there is no reason to support NUMA. +CONFIG_LH7A40X_CONTIGMEM disables NUMA support. When physical memory +is discontiguous, the memory tables are organized such that there are +two banks per nodes with a small gap between them. This layout wastes +some kernel memory for page tables representing non-existent memory. +CONFIG_LH7A40X_ONE_BANK_PER_NODE optimizes the node tables such that +there are no gaps. These options control the low level organization +of the memory management tables in ways that may prevent the kernel +from booting or may cause the kernel to allocated excessively large +page tables. Be warned. Only change these options if you know what +you are doing. The default behavior is a reasonable compromise that +will suit all users. + +-- + +A typical 32MiB system with the default configuration options will +find physical memory managed as follows. + + node 0: 0xc0000000 4MiB + 0xc1000000 4MiB + node 1: 0xc4000000 4MiB + 0xc5000000 4MiB + node 2: 0xc8000000 4MiB + 0xc9000000 4MiB + node 3: 0xcc000000 4MiB + 0xcd000000 4MiB + +Setting CONFIG_LH7A40X_ONE_BANK_PER_NODE will put each bank into a +separate node. diff --git a/Documentation/arm/Sharp-LH/VectoredInterruptController b/Documentation/arm/Sharp-LH/VectoredInterruptController new file mode 100644 index 000000000000..23047e9861ee --- /dev/null +++ b/Documentation/arm/Sharp-LH/VectoredInterruptController @@ -0,0 +1,80 @@ +README on the Vectored Interrupt Controller of the LH7A404 +========================================================== + +The 404 revision of the LH7A40X series comes with two vectored +interrupts controllers. While the kernel does use some of the +features of these devices, it is far from the purpose for which they +were designed. + +When this README was written, the implementation of the VICs was in +flux. It is possible that some details, especially with priorities, +will change. + +The VIC support code is inspired by routines written by Sharp. + + +Priority Control +---------------- + +The significant reason for using the VIC's vectoring is to control +interrupt priorities. There are two tables in +arch/arm/mach-lh7a40x/irq-lh7a404.c that look something like this. + + static unsigned char irq_pri_vic1[] = { IRQ_GPIO3INTR, }; + static unsigned char irq_pri_vic2[] = { + IRQ_T3UI, IRQ_GPIO7INTR, + IRQ_UART1INTR, IRQ_UART2INTR, IRQ_UART3INTR, }; + +The initialization code reads these tables and inserts a vector +address and enable for each indicated IRQ. Vectored interrupts have +higher priority than non-vectored interrupts. So, on VIC1, +IRQ_GPIO3INTR will be served before any other non-FIQ interrupt. Due +to the way that the vectoring works, IRQ_T3UI is the next highest +priority followed by the other vectored interrupts on VIC2. After +that, the non-vectored interrupts are scanned in VIC1 then in VIC2. + + +ISR +--- + +The interrupt service routine macro get_irqnr() in +arch/arm/kernel/entry-armv.S scans the VICs for the next active +interrupt. The vectoring makes this code somewhat larger than it was +before using vectoring (refer to the LH7A400 implementation). In the +case where an interrupt is vectored, the implementation will tend to +be faster than the non-vectored version. However, the worst-case path +is longer. + +It is worth noting that at present, there is no need to read +VIC2_VECTADDR because the register appears to be shared between the +controllers. The code is written such that if this changes, it ought +to still work properly. + + +Vector Addresses +---------------- + +The proper use of the vectoring hardware would jump to the ISR +specified by the vectoring address. Linux isn't structured to take +advantage of this feature, though it might be possible to change +things to support it. + +In this implementation, the vectoring address is used to speed the +search for the active IRQ. The address is coded such that the lowest +6 bits store the IRQ number for vectored interrupts. These numbers +correspond to the bits in the interrupt status registers. IRQ zero is +the lowest interrupt bit in VIC1. IRQ 32 is the lowest interrupt bit +in VIC2. Because zero is a valid IRQ number and because we cannot +detect whether or not there is a valid vectoring address if that +address is zero, the eigth bit (0x100) is set for vectored interrupts. +The address for IRQ 0x18 (VIC2) is 0x118. Only the ninth bit is set +for the default handler on VIC1 and only the tenth bit is set for the +default handler on VIC2. + +In other words. + + 0x000 - no active interrupt + 0x1ii - vectored interrupt 0xii + 0x2xx - unvectored interrupt on VIC1 (xx is don't care) + 0x4xx - unvectored interrupt on VIC2 (xx is don't care) + diff --git a/Documentation/arm/VFP/release-notes.txt b/Documentation/arm/VFP/release-notes.txt new file mode 100644 index 000000000000..f28e0222f5e5 --- /dev/null +++ b/Documentation/arm/VFP/release-notes.txt @@ -0,0 +1,55 @@ +Release notes for Linux Kernel VFP support code +----------------------------------------------- + +Date: 20 May 2004 +Author: Russell King + +This is the first release of the Linux Kernel VFP support code. It +provides support for the exceptions bounced from VFP hardware found +on ARM926EJ-S. + +This release has been validated against the SoftFloat-2b library by +John R. Hauser using the TestFloat-2a test suite. Details of this +library and test suite can be found at: + + http://www.cs.berkeley.edu/~jhauser/arithmetic/SoftFloat.html + +The operations which have been tested with this package are: + + - fdiv + - fsub + - fadd + - fmul + - fcmp + - fcmpe + - fcvtd + - fcvts + - fsito + - ftosi + - fsqrt + +All the above pass softfloat tests with the following exceptions: + +- fadd/fsub shows some differences in the handling of +0 / -0 results + when input operands differ in signs. +- the handling of underflow exceptions is slightly different. If a + result underflows before rounding, but becomes a normalised number + after rounding, we do not signal an underflow exception. + +Other operations which have been tested by basic assembly-only tests +are: + + - fcpy + - fabs + - fneg + - ftoui + - ftosiz + - ftouiz + +The combination operations have not been tested: + + - fmac + - fnmac + - fmsc + - fnmsc + - fnmul diff --git a/Documentation/arm/empeg/README b/Documentation/arm/empeg/README new file mode 100644 index 000000000000..09cc8d03ae58 --- /dev/null +++ b/Documentation/arm/empeg/README @@ -0,0 +1,13 @@ +Empeg, Ltd's Empeg MP3 Car Audio Player + +The initial design is to go in your car, but you can use it at home, on a +boat... almost anywhere. The principle is to store CD-quality music using +MPEG technology onto a hard disk in the unit, and use the power of the +embedded computer to serve up the music you want. + +For more details, see: + + http://www.empeg.com + + + diff --git a/Documentation/arm/empeg/ir.txt b/Documentation/arm/empeg/ir.txt new file mode 100644 index 000000000000..10a297450164 --- /dev/null +++ b/Documentation/arm/empeg/ir.txt @@ -0,0 +1,49 @@ +Infra-red driver documentation. + +Mike Crowe <mac@empeg.com> +(C) Empeg Ltd 1999 + +Not a lot here yet :-) + +The Kenwood KCA-R6A remote control generates a sequence like the following: + +Go low for approx 16T (Around 9000us) +Go high for approx 8T (Around 4000us) +Go low for less than 2T (Around 750us) + +For each of the 32 bits + Go high for more than 2T (Around 1500us) == 1 + Go high for less than T (Around 400us) == 0 + Go low for less than 2T (Around 750us) + +Rather than repeat a signal when the button is held down certain buttons +generate the following code to indicate repetition. + +Go low for approx 16T +Go high for approx 4T +Go low for less than 2T + +(By removing the <2T from the start of the sequence and placing at the end + it can be considered a stop bit but I found it easier to deal with it at + the start). + +The 32 bits are encoded as XxYy where x and y are the actual data values +while X and Y are the logical inverses of the associated data values. Using +LSB first yields sensible codes for the numbers. + +All codes are of the form b9xx + +The numeric keys generate the code 0x where x is the number pressed. + +Tuner 1c +Tape 1d +CD 1e +CD-MD-CH 1f +Track- 0a +Track+ 0b +Rewind 0c +FF 0d +DNPP 5e +Play/Pause 0e +Vol+ 14 +Vol- 15 diff --git a/Documentation/arm/empeg/mkdevs b/Documentation/arm/empeg/mkdevs new file mode 100644 index 000000000000..7a85e28d14f3 --- /dev/null +++ b/Documentation/arm/empeg/mkdevs @@ -0,0 +1,11 @@ +#!/bin/sh +mknod /dev/display c 244 0 +mknod /dev/ir c 242 0 +mknod /dev/usb0 c 243 0 +mknod /dev/audio c 245 4 +mknod /dev/dsp c 245 3 +mknod /dev/mixer c 245 0 +mknod /dev/empeg_state c 246 0 +mknod /dev/radio0 c 81 64 +ln -sf radio0 radio +ln -sf usb0 usb diff --git a/Documentation/arm/mem_alignment b/Documentation/arm/mem_alignment new file mode 100644 index 000000000000..d145ccca169a --- /dev/null +++ b/Documentation/arm/mem_alignment @@ -0,0 +1,58 @@ +Too many problems poped up because of unnoticed misaligned memory access in +kernel code lately. Therefore the alignment fixup is now unconditionally +configured in for SA11x0 based targets. According to Alan Cox, this is a +bad idea to configure it out, but Russell King has some good reasons for +doing so on some f***ed up ARM architectures like the EBSA110. However +this is not the case on many design I'm aware of, like all SA11x0 based +ones. + +Of course this is a bad idea to rely on the alignment trap to perform +unaligned memory access in general. If those access are predictable, you +are better to use the macros provided by include/asm/unaligned.h. The +alignment trap can fixup misaligned access for the exception cases, but at +a high performance cost. It better be rare. + +Now for user space applications, it is possible to configure the alignment +trap to SIGBUS any code performing unaligned access (good for debugging bad +code), or even fixup the access by software like for kernel code. The later +mode isn't recommended for performance reasons (just think about the +floating point emulation that works about the same way). Fix your code +instead! + +Please note that randomly changing the behaviour without good thought is +real bad - it changes the behaviour of all unaligned instructions in user +space, and might cause programs to fail unexpectedly. + +To change the alignment trap behavior, simply echo a number into +/proc/sys/debug/alignment. The number is made up from various bits: + +bit behavior when set +--- ----------------- + +0 A user process performing an unaligned memory access + will cause the kernel to print a message indicating + process name, pid, pc, instruction, address, and the + fault code. + +1 The kernel will attempt to fix up the user process + performing the unaligned access. This is of course + slow (think about the floating point emulator) and + not recommended for production use. + +2 The kernel will send a SIGBUS signal to the user process + performing the unaligned access. + +Note that not all combinations are supported - only values 0 through 5. +(6 and 7 don't make sense). + +For example, the following will turn on the warnings, but without +fixing up or sending SIGBUS signals: + + echo 1 > /proc/sys/debug/alignment + +You can also read the content of the same file to get statistical +information on unaligned access occurrences plus the current mode of +operation for user space code. + + +Nicolas Pitre, Mar 13, 2001. Modified Russell King, Nov 30, 2001. diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt new file mode 100644 index 000000000000..4b1c93a8177b --- /dev/null +++ b/Documentation/arm/memory.txt @@ -0,0 +1,72 @@ + Kernel Memory Layout on ARM Linux + + Russell King <rmk@arm.linux.org.uk> + May 21, 2004 (2.6.6) + +This document describes the virtual memory layout which the Linux +kernel uses for ARM processors. It indicates which regions are +free for platforms to use, and which are used by generic code. + +The ARM CPU is capable of addressing a maximum of 4GB virtual memory +space, and this must be shared between user space processes, the +kernel, and hardware devices. + +As the ARM architecture matures, it becomes necessary to reserve +certain regions of VM space for use for new facilities; therefore +this document may reserve more VM space over time. + +Start End Use +-------------------------------------------------------------------------- +ffff8000 ffffffff copy_user_page / clear_user_page use. + For SA11xx and Xscale, this is used to + setup a minicache mapping. + +ffff1000 ffff7fff Reserved. + Platforms must not use this address range. + +ffff0000 ffff0fff CPU vector page. + The CPU vectors are mapped here if the + CPU supports vector relocation (control + register V bit.) + +ffc00000 fffeffff DMA memory mapping region. Memory returned + by the dma_alloc_xxx functions will be + dynamically mapped here. + +ff000000 ffbfffff Reserved for future expansion of DMA + mapping region. + +VMALLOC_END feffffff Free for platform use, recommended. + +VMALLOC_START VMALLOC_END-1 vmalloc() / ioremap() space. + Memory returned by vmalloc/ioremap will + be dynamically placed in this region. + VMALLOC_START may be based upon the value + of the high_memory variable. + +PAGE_OFFSET high_memory-1 Kernel direct-mapped RAM region. + This maps the platforms RAM, and typically + maps all platform RAM in a 1:1 relationship. + +TASK_SIZE PAGE_OFFSET-1 Kernel module space + Kernel modules inserted via insmod are + placed here using dynamic mappings. + +00001000 TASK_SIZE-1 User space mappings + Per-thread mappings are placed here via + the mmap() system call. + +00000000 00000fff CPU vector page / null pointer trap + CPUs which do not support vector remapping + place their vector page here. NULL pointer + dereferences by both the kernel and user + space are also caught via this mapping. + +Please note that mappings which collide with the above areas may result +in a non-bootable kernel, or may cause the kernel to (eventually) panic +at run time. + +Since future CPUs may impact the kernel mapping layout, user programs +must not access any memory which is not mapped inside their 0x0001000 +to TASK_SIZE address range. If they wish to access these areas, they +must set up their own mappings using open() and mmap(). diff --git a/Documentation/arm/nwfpe/NOTES b/Documentation/arm/nwfpe/NOTES new file mode 100644 index 000000000000..40577b5a49d3 --- /dev/null +++ b/Documentation/arm/nwfpe/NOTES @@ -0,0 +1,29 @@ +There seems to be a problem with exp(double) and our emulator. I haven't +been able to track it down yet. This does not occur with the emulator +supplied by Russell King. + +I also found one oddity in the emulator. I don't think it is serious but +will point it out. The ARM calling conventions require floating point +registers f4-f7 to be preserved over a function call. The compiler quite +often uses an stfe instruction to save f4 on the stack upon entry to a +function, and an ldfe instruction to restore it before returning. + +I was looking at some code, that calculated a double result, stored it in f4 +then made a function call. Upon return from the function call the number in +f4 had been converted to an extended value in the emulator. + +This is a side effect of the stfe instruction. The double in f4 had to be +converted to extended, then stored. If an lfm/sfm combination had been used, +then no conversion would occur. This has performance considerations. The +result from the function call and f4 were used in a multiplication. If the +emulator sees a multiply of a double and extended, it promotes the double to +extended, then does the multiply in extended precision. + +This code will cause this problem: + +double x, y, z; +z = log(x)/log(y); + +The result of log(x) (a double) will be calculated, returned in f0, then +moved to f4 to preserve it over the log(y) call. The division will be done +in extended precision, due to the stfe instruction used to save f4 in log(y). diff --git a/Documentation/arm/nwfpe/README b/Documentation/arm/nwfpe/README new file mode 100644 index 000000000000..771871de0c8b --- /dev/null +++ b/Documentation/arm/nwfpe/README @@ -0,0 +1,70 @@ +This directory contains the version 0.92 test release of the NetWinder +Floating Point Emulator. + +The majority of the code was written by me, Scott Bambrough It is +written in C, with a small number of routines in inline assembler +where required. It was written quickly, with a goal of implementing a +working version of all the floating point instructions the compiler +emits as the first target. I have attempted to be as optimal as +possible, but there remains much room for improvement. + +I have attempted to make the emulator as portable as possible. One of +the problems is with leading underscores on kernel symbols. Elf +kernels have no leading underscores, a.out compiled kernels do. I +have attempted to use the C_SYMBOL_NAME macro wherever this may be +important. + +Another choice I made was in the file structure. I have attempted to +contain all operating system specific code in one module (fpmodule.*). +All the other files contain emulator specific code. This should allow +others to port the emulator to NetBSD for instance relatively easily. + +The floating point operations are based on SoftFloat Release 2, by +John Hauser. SoftFloat is a software implementation of floating-point +that conforms to the IEC/IEEE Standard for Binary Floating-point +Arithmetic. As many as four formats are supported: single precision, +double precision, extended double precision, and quadruple precision. +All operations required by the standard are implemented, except for +conversions to and from decimal. We use only the single precision, +double precision and extended double precision formats. The port of +SoftFloat to the ARM was done by Phil Blundell, based on an earlier +port of SoftFloat version 1 by Neil Carson for NetBSD/arm32. + +The file README.FPE contains a description of what has been implemented +so far in the emulator. The file TODO contains a information on what +remains to be done, and other ideas for the emulator. + +Bug reports, comments, suggestions should be directed to me at +<scottb@netwinder.org>. General reports of "this program doesn't +work correctly when your emulator is installed" are useful for +determining that bugs still exist; but are virtually useless when +attempting to isolate the problem. Please report them, but don't +expect quick action. Bugs still exist. The problem remains in isolating +which instruction contains the bug. Small programs illustrating a specific +problem are a godsend. + +Legal Notices +------------- + +The NetWinder Floating Point Emulator is free software. Everything Rebel.com +has written is provided under the GNU GPL. See the file COPYING for copying +conditions. Excluded from the above is the SoftFloat code. John Hauser's +legal notice for SoftFloat is included below. + +------------------------------------------------------------------------------- +SoftFloat Legal Notice + +SoftFloat was written by John R. Hauser. This work was made possible in +part by the International Computer Science Institute, located at Suite 600, +1947 Center Street, Berkeley, California 94704. Funding was partially +provided by the National Science Foundation under grant MIP-9311980. The +original version of this code was written as part of a project to build +a fixed-point vector processor in collaboration with the University of +California at Berkeley, overseen by Profs. Nelson Morgan and John Wawrzynek. + +THIS SOFTWARE IS DISTRIBUTED AS IS, FOR FREE. Although reasonable effort +has been made to avoid it, THIS SOFTWARE MAY CONTAIN FAULTS THAT WILL AT +TIMES RESULT IN INCORRECT BEHAVIOR. USE OF THIS SOFTWARE IS RESTRICTED TO +PERSONS AND ORGANIZATIONS WHO CAN AND WILL TAKE FULL RESPONSIBILITY FOR ANY +AND ALL LOSSES, COSTS, OR OTHER PROBLEMS ARISING FROM ITS USE. +------------------------------------------------------------------------------- diff --git a/Documentation/arm/nwfpe/README.FPE b/Documentation/arm/nwfpe/README.FPE new file mode 100644 index 000000000000..26f5d7bb9a41 --- /dev/null +++ b/Documentation/arm/nwfpe/README.FPE @@ -0,0 +1,156 @@ +The following describes the current state of the NetWinder's floating point +emulator. + +In the following nomenclature is used to describe the floating point +instructions. It follows the conventions in the ARM manual. + +<S|D|E> = <single|double|extended>, no default +{P|M|Z} = {round to +infinity,round to -infinity,round to zero}, + default = round to nearest + +Note: items enclosed in {} are optional. + +Floating Point Coprocessor Data Transfer Instructions (CPDT) +------------------------------------------------------------ + +LDF/STF - load and store floating + +<LDF|STF>{cond}<S|D|E> Fd, Rn +<LDF|STF>{cond}<S|D|E> Fd, [Rn, #<expression>]{!} +<LDF|STF>{cond}<S|D|E> Fd, [Rn], #<expression> + +These instructions are fully implemented. + +LFM/SFM - load and store multiple floating + +Form 1 syntax: +<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn] +<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn, #<expression>]{!} +<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn], #<expression> + +Form 2 syntax: +<LFM|SFM>{cond}<FD,EA> Fd, <count>, [Rn]{!} + +These instructions are fully implemented. They store/load three words +for each floating point register into the memory location given in the +instruction. The format in memory is unlikely to be compatible with +other implementations, in particular the actual hardware. Specific +mention of this is made in the ARM manuals. + +Floating Point Coprocessor Register Transfer Instructions (CPRT) +---------------------------------------------------------------- + +Conversions, read/write status/control register instructions + +FLT{cond}<S,D,E>{P,M,Z} Fn, Rd Convert integer to floating point +FIX{cond}{P,M,Z} Rd, Fn Convert floating point to integer +WFS{cond} Rd Write floating point status register +RFS{cond} Rd Read floating point status register +WFC{cond} Rd Write floating point control register +RFC{cond} Rd Read floating point control register + +FLT/FIX are fully implemented. + +RFS/WFS are fully implemented. + +RFC/WFC are fully implemented. RFC/WFC are supervisor only instructions, and +presently check the CPU mode, and do an invalid instruction trap if not called +from supervisor mode. + +Compare instructions + +CMF{cond} Fn, Fm Compare floating +CMFE{cond} Fn, Fm Compare floating with exception +CNF{cond} Fn, Fm Compare negated floating +CNFE{cond} Fn, Fm Compare negated floating with exception + +These are fully implemented. + +Floating Point Coprocessor Data Instructions (CPDT) +--------------------------------------------------- + +Dyadic operations: + +ADF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - add +SUF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - subtract +RSF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse subtract +MUF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - multiply +DVF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - divide +RDV{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse divide + +These are fully implemented. + +FML{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast multiply +FDV{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast divide +FRD{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast reverse divide + +These are fully implemented as well. They use the same algorithm as the +non-fast versions. Hence, in this implementation their performance is +equivalent to the MUF/DVF/RDV instructions. This is acceptable according +to the ARM manual. The manual notes these are defined only for single +operands, on the actual FPA11 hardware they do not work for double or +extended precision operands. The emulator currently does not check +the requested permissions conditions, and performs the requested operation. + +RMF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - IEEE remainder + +This is fully implemented. + +Monadic operations: + +MVF{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - move +MNF{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - move negated + +These are fully implemented. + +ABS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - absolute value +SQT{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - square root +RND{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - round + +These are fully implemented. + +URD{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - unnormalized round +NRM{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - normalize + +These are implemented. URD is implemented using the same code as the RND +instruction. Since URD cannot return a unnormalized number, NRM becomes +a NOP. + +Library calls: + +POW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - power +RPW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse power +POL{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - polar angle (arctan2) + +LOG{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base 10 +LGN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base e +EXP{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - exponent +SIN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - sine +COS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - cosine +TAN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - tangent +ASN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arcsine +ACS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arccosine +ATN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arctangent + +These are not implemented. They are not currently issued by the compiler, +and are handled by routines in libc. These are not implemented by the FPA11 +hardware, but are handled by the floating point support code. They should +be implemented in future versions. + +Signalling: + +Signals are implemented. However current ELF kernels produced by Rebel.com +have a bug in them that prevents the module from generating a SIGFPE. This +is caused by a failure to alias fp_current to the kernel variable +current_set[0] correctly. + +The kernel provided with this distribution (vmlinux-nwfpe-0.93) contains +a fix for this problem and also incorporates the current version of the +emulator directly. It is possible to run with no floating point module +loaded with this kernel. It is provided as a demonstration of the +technology and for those who want to do floating point work that depends +on signals. It is not strictly necessary to use the module. + +A module (either the one provided by Russell King, or the one in this +distribution) can be loaded to replace the functionality of the emulator +built into the kernel. diff --git a/Documentation/arm/nwfpe/TODO b/Documentation/arm/nwfpe/TODO new file mode 100644 index 000000000000..8027061b60eb --- /dev/null +++ b/Documentation/arm/nwfpe/TODO @@ -0,0 +1,67 @@ +TODO LIST +--------- + +POW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - power +RPW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse power +POL{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - polar angle (arctan2) + +LOG{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base 10 +LGN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base e +EXP{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - exponent +SIN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - sine +COS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - cosine +TAN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - tangent +ASN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arcsine +ACS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arccosine +ATN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arctangent + +These are not implemented. They are not currently issued by the compiler, +and are handled by routines in libc. These are not implemented by the FPA11 +hardware, but are handled by the floating point support code. They should +be implemented in future versions. + +There are a couple of ways to approach the implementation of these. One +method would be to use accurate table methods for these routines. I have +a couple of papers by S. Gal from IBM's research labs in Haifa, Israel that +seem to promise extreme accuracy (in the order of 99.8%) and reasonable speed. +These methods are used in GLIBC for some of the transcendental functions. + +Another approach, which I know little about is CORDIC. This stands for +Coordinate Rotation Digital Computer, and is a method of computing +transcendental functions using mostly shifts and adds and a few +multiplications and divisions. The ARM excels at shifts and adds, +so such a method could be promising, but requires more research to +determine if it is feasible. + +Rounding Methods + +The IEEE standard defines 4 rounding modes. Round to nearest is the +default, but rounding to + or - infinity or round to zero are also allowed. +Many architectures allow the rounding mode to be specified by modifying bits +in a control register. Not so with the ARM FPA11 architecture. To change +the rounding mode one must specify it with each instruction. + +This has made porting some benchmarks difficult. It is possible to +introduce such a capability into the emulator. The FPCR contains +bits describing the rounding mode. The emulator could be altered to +examine a flag, which if set forced it to ignore the rounding mode in +the instruction, and use the mode specified in the bits in the FPCR. + +This would require a method of getting/setting the flag, and the bits +in the FPCR. This requires a kernel call in ArmLinux, as WFC/RFC are +supervisor only instructions. If anyone has any ideas or comments I +would like to hear them. + +[NOTE: pulled out from some docs on ARM floating point, specifically + for the Acorn FPE, but not limited to it: + + The floating point control register (FPCR) may only be present in some + implementations: it is there to control the hardware in an implementation- + specific manner, for example to disable the floating point system. The user + mode of the ARM is not permitted to use this register (since the right is + reserved to alter it between implementations) and the WFC and RFC + instructions will trap if tried in user mode. + + Hence, the answer is yes, you could do this, but then you will run a high + risk of becoming isolated if and when hardware FP emulation comes out + -- Russell]. diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt new file mode 100644 index 000000000000..8eedaa24f5e2 --- /dev/null +++ b/Documentation/atomic_ops.txt @@ -0,0 +1,456 @@ + Semantics and Behavior of Atomic and + Bitmask Operations + + David S. Miller + + This document is intended to serve as a guide to Linux port +maintainers on how to implement atomic counter, bitops, and spinlock +interfaces properly. + + The atomic_t type should be defined as a signed integer. +Also, it should be made opaque such that any kind of cast to a normal +C integer type will fail. Something like the following should +suffice: + + typedef struct { volatile int counter; } atomic_t; + + The first operations to implement for atomic_t's are the +initializers and plain reads. + + #define ATOMIC_INIT(i) { (i) } + #define atomic_set(v, i) ((v)->counter = (i)) + +The first macro is used in definitions, such as: + +static atomic_t my_counter = ATOMIC_INIT(1); + +The second interface can be used at runtime, as in: + + struct foo { atomic_t counter; }; + ... + + struct foo *k; + + k = kmalloc(sizeof(*k), GFP_KERNEL); + if (!k) + return -ENOMEM; + atomic_set(&k->counter, 0); + +Next, we have: + + #define atomic_read(v) ((v)->counter) + +which simply reads the current value of the counter. + +Now, we move onto the actual atomic operation interfaces. + + void atomic_add(int i, atomic_t *v); + void atomic_sub(int i, atomic_t *v); + void atomic_inc(atomic_t *v); + void atomic_dec(atomic_t *v); + +These four routines add and subtract integral values to/from the given +atomic_t value. The first two routines pass explicit integers by +which to make the adjustment, whereas the latter two use an implicit +adjustment value of "1". + +One very important aspect of these two routines is that they DO NOT +require any explicit memory barriers. They need only perform the +atomic_t counter update in an SMP safe manner. + +Next, we have: + + int atomic_inc_return(atomic_t *v); + int atomic_dec_return(atomic_t *v); + +These routines add 1 and subtract 1, respectively, from the given +atomic_t and return the new counter value after the operation is +performed. + +Unlike the above routines, it is required that explicit memory +barriers are performed before and after the operation. It must be +done such that all memory operations before and after the atomic +operation calls are strongly ordered with respect to the atomic +operation itself. + +For example, it should behave as if a smp_mb() call existed both +before and after the atomic operation. + +If the atomic instructions used in an implementation provide explicit +memory barrier semantics which satisfy the above requirements, that is +fine as well. + +Let's move on: + + int atomic_add_return(int i, atomic_t *v); + int atomic_sub_return(int i, atomic_t *v); + +These behave just like atomic_{inc,dec}_return() except that an +explicit counter adjustment is given instead of the implicit "1". +This means that like atomic_{inc,dec}_return(), the memory barrier +semantics are required. + +Next: + + int atomic_inc_and_test(atomic_t *v); + int atomic_dec_and_test(atomic_t *v); + +These two routines increment and decrement by 1, respectively, the +given atomic counter. They return a boolean indicating whether the +resulting counter value was zero or not. + +It requires explicit memory barrier semantics around the operation as +above. + + int atomic_sub_and_test(int i, atomic_t *v); + +This is identical to atomic_dec_and_test() except that an explicit +decrement is given instead of the implicit "1". It requires explicit +memory barrier semantics around the operation. + + int atomic_add_negative(int i, atomic_t *v); + +The given increment is added to the given atomic counter value. A +boolean is return which indicates whether the resulting counter value +is negative. It requires explicit memory barrier semantics around the +operation. + +If a caller requires memory barrier semantics around an atomic_t +operation which does not return a value, a set of interfaces are +defined which accomplish this: + + void smp_mb__before_atomic_dec(void); + void smp_mb__after_atomic_dec(void); + void smp_mb__before_atomic_inc(void); + void smp_mb__after_atomic_dec(void); + +For example, smp_mb__before_atomic_dec() can be used like so: + + obj->dead = 1; + smp_mb__before_atomic_dec(); + atomic_dec(&obj->ref_count); + +It makes sure that all memory operations preceeding the atomic_dec() +call are strongly ordered with respect to the atomic counter +operation. In the above example, it guarentees that the assignment of +"1" to obj->dead will be globally visible to other cpus before the +atomic counter decrement. + +Without the explicitl smp_mb__before_atomic_dec() call, the +implementation could legally allow the atomic counter update visible +to other cpus before the "obj->dead = 1;" assignment. + +The other three interfaces listed are used to provide explicit +ordering with respect to memory operations after an atomic_dec() call +(smp_mb__after_atomic_dec()) and around atomic_inc() calls +(smp_mb__{before,after}_atomic_inc()). + +A missing memory barrier in the cases where they are required by the +atomic_t implementation above can have disasterous results. Here is +an example, which follows a pattern occuring frequently in the Linux +kernel. It is the use of atomic counters to implement reference +counting, and it works such that once the counter falls to zero it can +be guarenteed that no other entity can be accessing the object: + +static void obj_list_add(struct obj *obj) +{ + obj->active = 1; + list_add(&obj->list); +} + +static void obj_list_del(struct obj *obj) +{ + list_del(&obj->list); + obj->active = 0; +} + +static void obj_destroy(struct obj *obj) +{ + BUG_ON(obj->active); + kfree(obj); +} + +struct obj *obj_list_peek(struct list_head *head) +{ + if (!list_empty(head)) { + struct obj *obj; + + obj = list_entry(head->next, struct obj, list); + atomic_inc(&obj->refcnt); + return obj; + } + return NULL; +} + +void obj_poke(void) +{ + struct obj *obj; + + spin_lock(&global_list_lock); + obj = obj_list_peek(&global_list); + spin_unlock(&global_list_lock); + + if (obj) { + obj->ops->poke(obj); + if (atomic_dec_and_test(&obj->refcnt)) + obj_destroy(obj); + } +} + +void obj_timeout(struct obj *obj) +{ + spin_lock(&global_list_lock); + obj_list_del(obj); + spin_unlock(&global_list_lock); + + if (atomic_dec_and_test(&obj->refcnt)) + obj_destroy(obj); +} + +(This is a simplification of the ARP queue management in the + generic neighbour discover code of the networking. Olaf Kirch + found a bug wrt. memory barriers in kfree_skb() that exposed + the atomic_t memory barrier requirements quite clearly.) + +Given the above scheme, it must be the case that the obj->active +update done by the obj list deletion be visible to other processors +before the atomic counter decrement is performed. + +Otherwise, the counter could fall to zero, yet obj->active would still +be set, thus triggering the assertion in obj_destroy(). The error +sequence looks like this: + + cpu 0 cpu 1 + obj_poke() obj_timeout() + obj = obj_list_peek(); + ... gains ref to obj, refcnt=2 + obj_list_del(obj); + obj->active = 0 ... + ... visibility delayed ... + atomic_dec_and_test() + ... refcnt drops to 1 ... + atomic_dec_and_test() + ... refcount drops to 0 ... + obj_destroy() + BUG() triggers since obj->active + still seen as one + obj->active update visibility occurs + +With the memory barrier semantics required of the atomic_t operations +which return values, the above sequence of memory visibility can never +happen. Specifically, in the above case the atomic_dec_and_test() +counter decrement would not become globally visible until the +obj->active update does. + +As a historical note, 32-bit Sparc used to only allow usage of +24-bits of it's atomic_t type. This was because it used 8 bits +as a spinlock for SMP safety. Sparc32 lacked a "compare and swap" +type instruction. However, 32-bit Sparc has since been moved over +to a "hash table of spinlocks" scheme, that allows the full 32-bit +counter to be realized. Essentially, an array of spinlocks are +indexed into based upon the address of the atomic_t being operated +on, and that lock protects the atomic operation. Parisc uses the +same scheme. + +Another note is that the atomic_t operations returning values are +extremely slow on an old 386. + +We will now cover the atomic bitmask operations. You will find that +their SMP and memory barrier semantics are similar in shape and scope +to the atomic_t ops above. + +Native atomic bit operations are defined to operate on objects aligned +to the size of an "unsigned long" C data type, and are least of that +size. The endianness of the bits within each "unsigned long" are the +native endianness of the cpu. + + void set_bit(unsigned long nr, volatils unsigned long *addr); + void clear_bit(unsigned long nr, volatils unsigned long *addr); + void change_bit(unsigned long nr, volatils unsigned long *addr); + +These routines set, clear, and change, respectively, the bit number +indicated by "nr" on the bit mask pointed to by "ADDR". + +They must execute atomically, yet there are no implicit memory barrier +semantics required of these interfaces. + + int test_and_set_bit(unsigned long nr, volatils unsigned long *addr); + int test_and_clear_bit(unsigned long nr, volatils unsigned long *addr); + int test_and_change_bit(unsigned long nr, volatils unsigned long *addr); + +Like the above, except that these routines return a boolean which +indicates whether the changed bit was set _BEFORE_ the atomic bit +operation. + +WARNING! It is incredibly important that the value be a boolean, +ie. "0" or "1". Do not try to be fancy and save a few instructions by +declaring the above to return "long" and just returning something like +"old_val & mask" because that will not work. + +For one thing, this return value gets truncated to int in many code +paths using these interfaces, so on 64-bit if the bit is set in the +upper 32-bits then testers will never see that. + +One great example of where this problem crops up are the thread_info +flag operations. Routines such as test_and_set_ti_thread_flag() chop +the return value into an int. There are other places where things +like this occur as well. + +These routines, like the atomic_t counter operations returning values, +require explicit memory barrier semantics around their execution. All +memory operations before the atomic bit operation call must be made +visible globally before the atomic bit operation is made visible. +Likewise, the atomic bit operation must be visible globally before any +subsequent memory operation is made visible. For example: + + obj->dead = 1; + if (test_and_set_bit(0, &obj->flags)) + /* ... */; + obj->killed = 1; + +The implementation of test_and_set_bit() must guarentee that +"obj->dead = 1;" is visible to cpus before the atomic memory operation +done by test_and_set_bit() becomes visible. Likewise, the atomic +memory operation done by test_and_set_bit() must become visible before +"obj->killed = 1;" is visible. + +Finally there is the basic operation: + + int test_bit(unsigned long nr, __const__ volatile unsigned long *addr); + +Which returns a boolean indicating if bit "nr" is set in the bitmask +pointed to by "addr". + +If explicit memory barriers are required around clear_bit() (which +does not return a value, and thus does not need to provide memory +barrier semantics), two interfaces are provided: + + void smp_mb__before_clear_bit(void); + void smp_mb__after_clear_bit(void); + +They are used as follows, and are akin to their atomic_t operation +brothers: + + /* All memory operations before this call will + * be globally visible before the clear_bit(). + */ + smp_mb__before_clear_bit(); + clear_bit( ... ); + + /* The clear_bit() will be visible before all + * subsequent memory operations. + */ + smp_mb__after_clear_bit(); + +Finally, there are non-atomic versions of the bitmask operations +provided. They are used in contexts where some other higher-level SMP +locking scheme is being used to protect the bitmask, and thus less +expensive non-atomic operations may be used in the implementation. +They have names similar to the above bitmask operation interfaces, +except that two underscores are prefixed to the interface name. + + void __set_bit(unsigned long nr, volatile unsigned long *addr); + void __clear_bit(unsigned long nr, volatile unsigned long *addr); + void __change_bit(unsigned long nr, volatile unsigned long *addr); + int __test_and_set_bit(unsigned long nr, volatile unsigned long *addr); + int __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr); + int __test_and_change_bit(unsigned long nr, volatile unsigned long *addr); + +These non-atomic variants also do not require any special memory +barrier semantics. + +The routines xchg() and cmpxchg() need the same exact memory barriers +as the atomic and bit operations returning values. + +Spinlocks and rwlocks have memory barrier expectations as well. +The rule to follow is simple: + +1) When acquiring a lock, the implementation must make it globally + visible before any subsequent memory operation. + +2) When releasing a lock, the implementation must make it such that + all previous memory operations are globally visible before the + lock release. + +Which finally brings us to _atomic_dec_and_lock(). There is an +architecture-neutral version implemented in lib/dec_and_lock.c, +but most platforms will wish to optimize this in assembler. + + int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock); + +Atomically decrement the given counter, and if will drop to zero +atomically acquire the given spinlock and perform the decrement +of the counter to zero. If it does not drop to zero, do nothing +with the spinlock. + +It is actually pretty simple to get the memory barrier correct. +Simply satisfy the spinlock grab requirements, which is make +sure the spinlock operation is globally visible before any +subsequent memory operation. + +We can demonstrate this operation more clearly if we define +an abstract atomic operation: + + long cas(long *mem, long old, long new); + +"cas" stands for "compare and swap". It atomically: + +1) Compares "old" with the value currently at "mem". +2) If they are equal, "new" is written to "mem". +3) Regardless, the current value at "mem" is returned. + +As an example usage, here is what an atomic counter update +might look like: + +void example_atomic_inc(long *counter) +{ + long old, new, ret; + + while (1) { + old = *counter; + new = old + 1; + + ret = cas(counter, old, new); + if (ret == old) + break; + } +} + +Let's use cas() in order to build a pseudo-C atomic_dec_and_lock(): + +int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock) +{ + long old, new, ret; + int went_to_zero; + + went_to_zero = 0; + while (1) { + old = atomic_read(atomic); + new = old - 1; + if (new == 0) { + went_to_zero = 1; + spin_lock(lock); + } + ret = cas(atomic, old, new); + if (ret == old) + break; + if (went_to_zero) { + spin_unlock(lock); + went_to_zero = 0; + } + } + + return went_to_zero; +} + +Now, as far as memory barriers go, as long as spin_lock() +strictly orders all subsequent memory operations (including +the cas()) with respect to itself, things will be fine. + +Said another way, _atomic_dec_and_lock() must guarentee that +a counter dropping to zero is never made visible before the +spinlock being acquired. + +Note that this also means that for the case where the counter +is not dropping to zero, there are no memory ordering +requirements. diff --git a/Documentation/basic_profiling.txt b/Documentation/basic_profiling.txt new file mode 100644 index 000000000000..65e3dc2d4437 --- /dev/null +++ b/Documentation/basic_profiling.txt @@ -0,0 +1,52 @@ +These instructions are deliberately very basic. If you want something clever, +go read the real docs ;-) Please don't add more stuff, but feel free to +correct my mistakes ;-) (mbligh@aracnet.com) +Thanks to John Levon, Dave Hansen, et al. for help writing this. + +<test> is the thing you're trying to measure. +Make sure you have the correct System.map / vmlinux referenced! + +It is probably easiest to use "make install" for linux and hack +/sbin/installkernel to copy vmlinux to /boot, in addition to vmlinuz, +config, System.map, which are usually installed by default. + +Readprofile +----------- +A recent readprofile command is needed for 2.6, such as found in util-linux +2.12a, which can be downloaded from: + +http://www.kernel.org/pub/linux/utils/util-linux/ + +Most distributions will ship it already. + +Add "profile=2" to the kernel command line. + +clear readprofile -r + <test> +dump output readprofile -m /boot/System.map > captured_profile + +Oprofile +-------- +Get the source (I use 0.8) from http://oprofile.sourceforge.net/ +and add "idle=poll" to the kernel command line +Configure with CONFIG_PROFILING=y and CONFIG_OPROFILE=y & reboot on new kernel +./configure --with-kernel-support +make install + +For superior results, be sure to enable the local APIC. If opreport sees +a 0Hz CPU, APIC was not on. Be aware that idle=poll may mean a performance +penalty. + +One time setup: + opcontrol --setup --vmlinux=/boot/vmlinux + +clear opcontrol --reset +start opcontrol --start + <test> +stop opcontrol --stop +dump output opreport > output_file + +To only report on the kernel, run opreport /boot/vmlinux > output_file + +A reset is needed to clear old statistics, which survive a reboot. + diff --git a/Documentation/binfmt_misc.txt b/Documentation/binfmt_misc.txt new file mode 100644 index 000000000000..d097f09ee15a --- /dev/null +++ b/Documentation/binfmt_misc.txt @@ -0,0 +1,116 @@ + Kernel Support for miscellaneous (your favourite) Binary Formats v1.1 + ===================================================================== + +This Kernel feature allows you to invoke almost (for restrictions see below) +every program by simply typing its name in the shell. +This includes for example compiled Java(TM), Python or Emacs programs. + +To achieve this you must tell binfmt_misc which interpreter has to be invoked +with which binary. Binfmt_misc recognises the binary-type by matching some bytes +at the beginning of the file with a magic byte sequence (masking out specified +bits) you have supplied. Binfmt_misc can also recognise a filename extension +aka '.com' or '.exe'. + +First you must mount binfmt_misc: + mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc + +To actually register a new binary type, you have to set up a string looking like +:name:type:offset:magic:mask:interpreter:flags (where you can choose the ':' upon +your needs) and echo it to /proc/sys/fs/binfmt_misc/register. +Here is what the fields mean: + - 'name' is an identifier string. A new /proc file will be created with this + name below /proc/sys/fs/binfmt_misc + - 'type' is the type of recognition. Give 'M' for magic and 'E' for extension. + - 'offset' is the offset of the magic/mask in the file, counted in bytes. This + defaults to 0 if you omit it (i.e. you write ':name:type::magic...') + - 'magic' is the byte sequence binfmt_misc is matching for. The magic string + may contain hex-encoded characters like \x0a or \xA4. In a shell environment + you will have to write \\x0a to prevent the shell from eating your \. + If you chose filename extension matching, this is the extension to be + recognised (without the '.', the \x0a specials are not allowed). Extension + matching is case sensitive! + - 'mask' is an (optional, defaults to all 0xff) mask. You can mask out some + bits from matching by supplying a string like magic and as long as magic. + The mask is anded with the byte sequence of the file. + - 'interpreter' is the program that should be invoked with the binary as first + argument (specify the full path) + - 'flags' is an optional field that controls several aspects of the invocation + of the interpreter. It is a string of capital letters, each controls a certain + aspect. The following flags are supported - + 'P' - preserve-argv[0]. Legacy behavior of binfmt_misc is to overwrite the + original argv[0] with the full path to the binary. When this flag is + included, binfmt_misc will add an argument to the argument vector for + this purpose, thus preserving the original argv[0]. + 'O' - open-binary. Legacy behavior of binfmt_misc is to pass the full path + of the binary to the interpreter as an argument. When this flag is + included, binfmt_misc will open the file for reading and pass its + descriptor as an argument, instead of the full path, thus allowing + the interpreter to execute non-readable binaries. This feature should + be used with care - the interpreter has to be trusted not to emit + the contents of the non-readable binary. + 'C' - credentials. Currently, the behavior of binfmt_misc is to calculate + the credentials and security token of the new process according to + the interpreter. When this flag is included, these attributes are + calculated according to the binary. It also implies the 'O' flag. + This feature should be used with care as the interpreter + will run with root permissions when a setuid binary owned by root + is run with binfmt_misc. + + +There are some restrictions: + - the whole register string may not exceed 255 characters + - the magic must reside in the first 128 bytes of the file, i.e. + offset+size(magic) has to be less than 128 + - the interpreter string may not exceed 127 characters + +To use binfmt_misc you have to mount it first. You can mount it with +"mount -t binfmt_misc none /proc/sys/fs/binfmt_misc" command, or you can add +a line "none /proc/sys/fs/binfmt_misc binfmt_misc defaults 0 0" to your +/etc/fstab so it auto mounts on boot. + +You may want to add the binary formats in one of your /etc/rc scripts during +boot-up. Read the manual of your init program to figure out how to do this +right. + +Think about the order of adding entries! Later added entries are matched first! + + +A few examples (assumed you are in /proc/sys/fs/binfmt_misc): + +- enable support for em86 (like binfmt_em86, for Alpha AXP only): + echo ':i386:M::\x7fELF\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff:/bin/em86:' > register + echo ':i486:M::\x7fELF\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x06:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff:/bin/em86:' > register + +- enable support for packed DOS applications (pre-configured dosemu hdimages): + echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register + +- enable support for Windows executables using wine: + echo ':DOSWin:M::MZ::/usr/local/bin/wine:' > register + +For java support see Documentation/java.txt + + +You can enable/disable binfmt_misc or one binary type by echoing 0 (to disable) +or 1 (to enable) to /proc/sys/fs/binfmt_misc/status or /proc/.../the_name. +Catting the file tells you the current status of binfmt_misc/the entry. + +You can remove one entry or all entries by echoing -1 to /proc/.../the_name +or /proc/sys/fs/binfmt_misc/status. + + +HINTS: +====== + +If you want to pass special arguments to your interpreter, you can +write a wrapper script for it. See Documentation/java.txt for an +example. + +Your interpreter should NOT look in the PATH for the filename; the kernel +passes it the full filename (or the file descriptor) to use. Using $PATH can +cause unexpected behaviour and can be a security hazard. + + +There is a web page about binfmt_misc at +http://www.tat.physik.uni-tuebingen.de/~rguenth/linux/binfmt_misc.html + +Richard Günther <rguenth@tat.physik.uni-tuebingen.de> diff --git a/Documentation/block/as-iosched.txt b/Documentation/block/as-iosched.txt new file mode 100644 index 000000000000..6f47332c883d --- /dev/null +++ b/Documentation/block/as-iosched.txt @@ -0,0 +1,165 @@ +Anticipatory IO scheduler +------------------------- +Nick Piggin <piggin@cyberone.com.au> 13 Sep 2003 + +Attention! Database servers, especially those using "TCQ" disks should +investigate performance with the 'deadline' IO scheduler. Any system with high +disk performance requirements should do so, in fact. + +If you see unusual performance characteristics of your disk systems, or you +see big performance regressions versus the deadline scheduler, please email +me. Database users don't bother unless you're willing to test a lot of patches +from me ;) its a known issue. + +Also, users with hardware RAID controllers, doing striping, may find +highly variable performance results with using the as-iosched. The +as-iosched anticipatory implementation is based on the notion that a disk +device has only one physical seeking head. A striped RAID controller +actually has a head for each physical device in the logical RAID device. + +However, setting the antic_expire (see tunable parameters below) produces +very similar behavior to the deadline IO scheduler. + + +Selecting IO schedulers +----------------------- +To choose IO schedulers at boot time, use the argument 'elevator=deadline'. +'noop' and 'as' (the default) are also available. IO schedulers are assigned +globally at boot time only presently. + + +Anticipatory IO scheduler Policies +---------------------------------- +The as-iosched implementation implements several layers of policies +to determine when an IO request is dispatched to the disk controller. +Here are the policies outlined, in order of application. + +1. one-way Elevator algorithm. + +The elevator algorithm is similar to that used in deadline scheduler, with +the addition that it allows limited backward movement of the elevator +(i.e. seeks backwards). A seek backwards can occur when choosing between +two IO requests where one is behind the elevator's current position, and +the other is in front of the elevator's position. If the seek distance to +the request in back of the elevator is less than half the seek distance to +the request in front of the elevator, then the request in back can be chosen. +Backward seeks are also limited to a maximum of MAXBACK (1024*1024) sectors. +This favors forward movement of the elevator, while allowing opportunistic +"short" backward seeks. + +2. FIFO expiration times for reads and for writes. + +This is again very similar to the deadline IO scheduler. The expiration +times for requests on these lists is tunable using the parameters read_expire +and write_expire discussed below. When a read or a write expires in this way, +the IO scheduler will interrupt its current elevator sweep or read anticipation +to service the expired request. + +3. Read and write request batching + +A batch is a collection of read requests or a collection of write +requests. The as scheduler alternates dispatching read and write batches +to the driver. In the case a read batch, the scheduler submits read +requests to the driver as long as there are read requests to submit, and +the read batch time limit has not been exceeded (read_batch_expire). +The read batch time limit begins counting down only when there are +competing write requests pending. + +In the case of a write batch, the scheduler submits write requests to +the driver as long as there are write requests available, and the +write batch time limit has not been exceeded (write_batch_expire). +However, the length of write batches will be gradually shortened +when read batches frequently exceed their time limit. + +When changing between batch types, the scheduler waits for all requests +from the previous batch to complete before scheduling requests for the +next batch. + +The read and write fifo expiration times described in policy 2 above +are checked only when in scheduling IO of a batch for the corresponding +(read/write) type. So for example, the read FIFO timeout values are +tested only during read batches. Likewise, the write FIFO timeout +values are tested only during write batches. For this reason, +it is generally not recommended for the read batch time +to be longer than the write expiration time, nor for the write batch +time to exceed the read expiration time (see tunable parameters below). + +When the IO scheduler changes from a read to a write batch, +it begins the elevator from the request that is on the head of the +write expiration FIFO. Likewise, when changing from a write batch to +a read batch, scheduler begins the elevator from the first entry +on the read expiration FIFO. + +4. Read anticipation. + +Read anticipation occurs only when scheduling a read batch. +This implementation of read anticipation allows only one read request +to be dispatched to the disk controller at a time. In +contrast, many write requests may be dispatched to the disk controller +at a time during a write batch. It is this characteristic that can make +the anticipatory scheduler perform anomalously with controllers supporting +TCQ, or with hardware striped RAID devices. Setting the antic_expire +queue paramter (see below) to zero disables this behavior, and the anticipatory +scheduler behaves essentially like the deadline scheduler. + +When read anticipation is enabled (antic_expire is not zero), reads +are dispatched to the disk controller one at a time. +At the end of each read request, the IO scheduler examines its next +candidate read request from its sorted read list. If that next request +is from the same process as the request that just completed, +or if the next request in the queue is "very close" to the +just completed request, it is dispatched immediately. Otherwise, +statistics (average think time, average seek distance) on the process +that submitted the just completed request are examined. If it seems +likely that that process will submit another request soon, and that +request is likely to be near the just completed request, then the IO +scheduler will stop dispatching more read requests for up time (antic_expire) +milliseconds, hoping that process will submit a new request near the one +that just completed. If such a request is made, then it is dispatched +immediately. If the antic_expire wait time expires, then the IO scheduler +will dispatch the next read request from the sorted read queue. + +To decide whether an anticipatory wait is worthwhile, the scheduler +maintains statistics for each process that can be used to compute +mean "think time" (the time between read requests), and mean seek +distance for that process. One observation is that these statistics +are associated with each process, but those statistics are not associated +with a specific IO device. So for example, if a process is doing IO +on several file systems on separate devices, the statistics will be +a combination of IO behavior from all those devices. + + +Tuning the anticipatory IO scheduler +------------------------------------ +When using 'as', the anticipatory IO scheduler there are 5 parameters under +/sys/block/*/queue/iosched/. All are units of milliseconds. + +The parameters are: +* read_expire + Controls how long until a read request becomes "expired". It also controls the + interval between which expired requests are served, so set to 50, a request + might take anywhere < 100ms to be serviced _if_ it is the next on the + expired list. Obviously request expiration strategies won't make the disk + go faster. The result basically equates to the timeslice a single reader + gets in the presence of other IO. 100*((seek time / read_expire) + 1) is + very roughly the % streaming read efficiency your disk should get with + multiple readers. + +* read_batch_expire + Controls how much time a batch of reads is given before pending writes are + served. A higher value is more efficient. This might be set below read_expire + if writes are to be given higher priority than reads, but reads are to be + as efficient as possible when there are no writes. Generally though, it + should be some multiple of read_expire. + +* write_expire, and +* write_batch_expire are equivalent to the above, for writes. + +* antic_expire + Controls the maximum amount of time we can anticipate a good read (one + with a short seek distance from the most recently completed request) before + giving up. Many other factors may cause anticipation to be stopped early, + or some processes will not be "anticipated" at all. Should be a bit higher + for big seek time devices though not a linear correspondence - most + processes have only a few ms thinktime. + diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt new file mode 100644 index 000000000000..6dd274d7e1cf --- /dev/null +++ b/Documentation/block/biodoc.txt @@ -0,0 +1,1213 @@ + Notes on the Generic Block Layer Rewrite in Linux 2.5 + ===================================================== + +Notes Written on Jan 15, 2002: + Jens Axboe <axboe@suse.de> + Suparna Bhattacharya <suparna@in.ibm.com> + +Last Updated May 2, 2002 +September 2003: Updated I/O Scheduler portions + Nick Piggin <piggin@cyberone.com.au> + +Introduction: + +These are some notes describing some aspects of the 2.5 block layer in the +context of the bio rewrite. The idea is to bring out some of the key +changes and a glimpse of the rationale behind those changes. + +Please mail corrections & suggestions to suparna@in.ibm.com. + +Credits: +--------- + +2.5 bio rewrite: + Jens Axboe <axboe@suse.de> + +Many aspects of the generic block layer redesign were driven by and evolved +over discussions, prior patches and the collective experience of several +people. See sections 8 and 9 for a list of some related references. + +The following people helped with review comments and inputs for this +document: + Christoph Hellwig <hch@infradead.org> + Arjan van de Ven <arjanv@redhat.com> + Randy Dunlap <rddunlap@osdl.org> + Andre Hedrick <andre@linux-ide.org> + +The following people helped with fixes/contributions to the bio patches +while it was still work-in-progress: + David S. Miller <davem@redhat.com> + + +Description of Contents: +------------------------ + +1. Scope for tuning of logic to various needs + 1.1 Tuning based on device or low level driver capabilities + - Per-queue parameters + - Highmem I/O support + - I/O scheduler modularization + 1.2 Tuning based on high level requirements/capabilities + 1.2.1 I/O Barriers + 1.2.2 Request Priority/Latency + 1.3 Direct access/bypass to lower layers for diagnostics and special + device operations + 1.3.1 Pre-built commands +2. New flexible and generic but minimalist i/o structure or descriptor + (instead of using buffer heads at the i/o layer) + 2.1 Requirements/Goals addressed + 2.2 The bio struct in detail (multi-page io unit) + 2.3 Changes in the request structure +3. Using bios + 3.1 Setup/teardown (allocation, splitting) + 3.2 Generic bio helper routines + 3.2.1 Traversing segments and completion units in a request + 3.2.2 Setting up DMA scatterlists + 3.2.3 I/O completion + 3.2.4 Implications for drivers that do not interpret bios (don't handle + multiple segments) + 3.2.5 Request command tagging + 3.3 I/O submission +4. The I/O scheduler +5. Scalability related changes + 5.1 Granular locking: Removal of io_request_lock + 5.2 Prepare for transition to 64 bit sector_t +6. Other Changes/Implications + 6.1 Partition re-mapping handled by the generic block layer +7. A few tips on migration of older drivers +8. A list of prior/related/impacted patches/ideas +9. Other References/Discussion Threads + +--------------------------------------------------------------------------- + +Bio Notes +-------- + +Let us discuss the changes in the context of how some overall goals for the +block layer are addressed. + +1. Scope for tuning the generic logic to satisfy various requirements + +The block layer design supports adaptable abstractions to handle common +processing with the ability to tune the logic to an appropriate extent +depending on the nature of the device and the requirements of the caller. +One of the objectives of the rewrite was to increase the degree of tunability +and to enable higher level code to utilize underlying device/driver +capabilities to the maximum extent for better i/o performance. This is +important especially in the light of ever improving hardware capabilities +and application/middleware software designed to take advantage of these +capabilities. + +1.1 Tuning based on low level device / driver capabilities + +Sophisticated devices with large built-in caches, intelligent i/o scheduling +optimizations, high memory DMA support, etc may find some of the +generic processing an overhead, while for less capable devices the +generic functionality is essential for performance or correctness reasons. +Knowledge of some of the capabilities or parameters of the device should be +used at the generic block layer to take the right decisions on +behalf of the driver. + +How is this achieved ? + +Tuning at a per-queue level: + +i. Per-queue limits/values exported to the generic layer by the driver + +Various parameters that the generic i/o scheduler logic uses are set at +a per-queue level (e.g maximum request size, maximum number of segments in +a scatter-gather list, hardsect size) + +Some parameters that were earlier available as global arrays indexed by +major/minor are now directly associated with the queue. Some of these may +move into the block device structure in the future. Some characteristics +have been incorporated into a queue flags field rather than separate fields +in themselves. There are blk_queue_xxx functions to set the parameters, +rather than update the fields directly + +Some new queue property settings: + + blk_queue_bounce_limit(q, u64 dma_address) + Enable I/O to highmem pages, dma_address being the + limit. No highmem default. + + blk_queue_max_sectors(q, max_sectors) + Maximum size request you can handle in units of 512 byte + sectors. 255 default. + + blk_queue_max_phys_segments(q, max_segments) + Maximum physical segments you can handle in a request. 128 + default (driver limit). (See 3.2.2) + + blk_queue_max_hw_segments(q, max_segments) + Maximum dma segments the hardware can handle in a request. 128 + default (host adapter limit, after dma remapping). + (See 3.2.2) + + blk_queue_max_segment_size(q, max_seg_size) + Maximum size of a clustered segment, 64kB default. + + blk_queue_hardsect_size(q, hardsect_size) + Lowest possible sector size that the hardware can operate + on, 512 bytes default. + +New queue flags: + + QUEUE_FLAG_CLUSTER (see 3.2.2) + QUEUE_FLAG_QUEUED (see 3.2.4) + + +ii. High-mem i/o capabilities are now considered the default + +The generic bounce buffer logic, present in 2.4, where the block layer would +by default copyin/out i/o requests on high-memory buffers to low-memory buffers +assuming that the driver wouldn't be able to handle it directly, has been +changed in 2.5. The bounce logic is now applied only for memory ranges +for which the device cannot handle i/o. A driver can specify this by +setting the queue bounce limit for the request queue for the device +(blk_queue_bounce_limit()). This avoids the inefficiencies of the copyin/out +where a device is capable of handling high memory i/o. + +In order to enable high-memory i/o where the device is capable of supporting +it, the pci dma mapping routines and associated data structures have now been +modified to accomplish a direct page -> bus translation, without requiring +a virtual address mapping (unlike the earlier scheme of virtual address +-> bus translation). So this works uniformly for high-memory pages (which +do not have a correponding kernel virtual address space mapping) and +low-memory pages. + +Note: Please refer to DMA-mapping.txt for a discussion on PCI high mem DMA +aspects and mapping of scatter gather lists, and support for 64 bit PCI. + +Special handling is required only for cases where i/o needs to happen on +pages at physical memory addresses beyond what the device can support. In these +cases, a bounce bio representing a buffer from the supported memory range +is used for performing the i/o with copyin/copyout as needed depending on +the type of the operation. For example, in case of a read operation, the +data read has to be copied to the original buffer on i/o completion, so a +callback routine is set up to do this, while for write, the data is copied +from the original buffer to the bounce buffer prior to issuing the +operation. Since an original buffer may be in a high memory area that's not +mapped in kernel virtual addr, a kmap operation may be required for +performing the copy, and special care may be needed in the completion path +as it may not be in irq context. Special care is also required (by way of +GFP flags) when allocating bounce buffers, to avoid certain highmem +deadlock possibilities. + +It is also possible that a bounce buffer may be allocated from high-memory +area that's not mapped in kernel virtual addr, but within the range that the +device can use directly; so the bounce page may need to be kmapped during +copy operations. [Note: This does not hold in the current implementation, +though] + +There are some situations when pages from high memory may need to +be kmapped, even if bounce buffers are not necessary. For example a device +may need to abort DMA operations and revert to PIO for the transfer, in +which case a virtual mapping of the page is required. For SCSI it is also +done in some scenarios where the low level driver cannot be trusted to +handle a single sg entry correctly. The driver is expected to perform the +kmaps as needed on such occasions using the __bio_kmap_atomic and bio_kmap_irq +routines as appropriate. A driver could also use the blk_queue_bounce() +routine on its own to bounce highmem i/o to low memory for specific requests +if so desired. + +iii. The i/o scheduler algorithm itself can be replaced/set as appropriate + +As in 2.4, it is possible to plugin a brand new i/o scheduler for a particular +queue or pick from (copy) existing generic schedulers and replace/override +certain portions of it. The 2.5 rewrite provides improved modularization +of the i/o scheduler. There are more pluggable callbacks, e.g for init, +add request, extract request, which makes it possible to abstract specific +i/o scheduling algorithm aspects and details outside of the generic loop. +It also makes it possible to completely hide the implementation details of +the i/o scheduler from block drivers. + +I/O scheduler wrappers are to be used instead of accessing the queue directly. +See section 4. The I/O scheduler for details. + +1.2 Tuning Based on High level code capabilities + +i. Application capabilities for raw i/o + +This comes from some of the high-performance database/middleware +requirements where an application prefers to make its own i/o scheduling +decisions based on an understanding of the access patterns and i/o +characteristics + +ii. High performance filesystems or other higher level kernel code's +capabilities + +Kernel components like filesystems could also take their own i/o scheduling +decisions for optimizing performance. Journalling filesystems may need +some control over i/o ordering. + +What kind of support exists at the generic block layer for this ? + +The flags and rw fields in the bio structure can be used for some tuning +from above e.g indicating that an i/o is just a readahead request, or for +marking barrier requests (discussed next), or priority settings (currently +unused). As far as user applications are concerned they would need an +additional mechanism either via open flags or ioctls, or some other upper +level mechanism to communicate such settings to block. + +1.2.1 I/O Barriers + +There is a way to enforce strict ordering for i/os through barriers. +All requests before a barrier point must be serviced before the barrier +request and any other requests arriving after the barrier will not be +serviced until after the barrier has completed. This is useful for higher +level control on write ordering, e.g flushing a log of committed updates +to disk before the corresponding updates themselves. + +A flag in the bio structure, BIO_BARRIER is used to identify a barrier i/o. +The generic i/o scheduler would make sure that it places the barrier request and +all other requests coming after it after all the previous requests in the +queue. Barriers may be implemented in different ways depending on the +driver. A SCSI driver for example could make use of ordered tags to +preserve the necessary ordering with a lower impact on throughput. For IDE +this might be two sync cache flush: a pre and post flush when encountering +a barrier write. + +There is a provision for queues to indicate what kind of barriers they +can provide. This is as of yet unmerged, details will be added here once it +is in the kernel. + +1.2.2 Request Priority/Latency + +Todo/Under discussion: +Arjan's proposed request priority scheme allows higher levels some broad + control (high/med/low) over the priority of an i/o request vs other pending + requests in the queue. For example it allows reads for bringing in an + executable page on demand to be given a higher priority over pending write + requests which haven't aged too much on the queue. Potentially this priority + could even be exposed to applications in some manner, providing higher level + tunability. Time based aging avoids starvation of lower priority + requests. Some bits in the bi_rw flags field in the bio structure are + intended to be used for this priority information. + + +1.3 Direct Access to Low level Device/Driver Capabilities (Bypass mode) + (e.g Diagnostics, Systems Management) + +There are situations where high-level code needs to have direct access to +the low level device capabilities or requires the ability to issue commands +to the device bypassing some of the intermediate i/o layers. +These could, for example, be special control commands issued through ioctl +interfaces, or could be raw read/write commands that stress the drive's +capabilities for certain kinds of fitness tests. Having direct interfaces at +multiple levels without having to pass through upper layers makes +it possible to perform bottom up validation of the i/o path, layer by +layer, starting from the media. + +The normal i/o submission interfaces, e.g submit_bio, could be bypassed +for specially crafted requests which such ioctl or diagnostics +interfaces would typically use, and the elevator add_request routine +can instead be used to directly insert such requests in the queue or preferably +the blk_do_rq routine can be used to place the request on the queue and +wait for completion. Alternatively, sometimes the caller might just +invoke a lower level driver specific interface with the request as a +parameter. + +If the request is a means for passing on special information associated with +the command, then such information is associated with the request->special +field (rather than misuse the request->buffer field which is meant for the +request data buffer's virtual mapping). + +For passing request data, the caller must build up a bio descriptor +representing the concerned memory buffer if the underlying driver interprets +bio segments or uses the block layer end*request* functions for i/o +completion. Alternatively one could directly use the request->buffer field to +specify the virtual address of the buffer, if the driver expects buffer +addresses passed in this way and ignores bio entries for the request type +involved. In the latter case, the driver would modify and manage the +request->buffer, request->sector and request->nr_sectors or +request->current_nr_sectors fields itself rather than using the block layer +end_request or end_that_request_first completion interfaces. +(See 2.3 or Documentation/block/request.txt for a brief explanation of +the request structure fields) + +[TBD: end_that_request_last should be usable even in this case; +Perhaps an end_that_direct_request_first routine could be implemented to make +handling direct requests easier for such drivers; Also for drivers that +expect bios, a helper function could be provided for setting up a bio +corresponding to a data buffer] + +<JENS: I dont understand the above, why is end_that_request_first() not +usable? Or _last for that matter. I must be missing something> +<SUP: What I meant here was that if the request doesn't have a bio, then + end_that_request_first doesn't modify nr_sectors or current_nr_sectors, + and hence can't be used for advancing request state settings on the + completion of partial transfers. The driver has to modify these fields + directly by hand. + This is because end_that_request_first only iterates over the bio list, + and always returns 0 if there are none associated with the request. + _last works OK in this case, and is not a problem, as I mentioned earlier +> + +1.3.1 Pre-built Commands + +A request can be created with a pre-built custom command to be sent directly +to the device. The cmd block in the request structure has room for filling +in the command bytes. (i.e rq->cmd is now 16 bytes in size, and meant for +command pre-building, and the type of the request is now indicated +through rq->flags instead of via rq->cmd) + +The request structure flags can be set up to indicate the type of request +in such cases (REQ_PC: direct packet command passed to driver, REQ_BLOCK_PC: +packet command issued via blk_do_rq, REQ_SPECIAL: special request). + +It can help to pre-build device commands for requests in advance. +Drivers can now specify a request prepare function (q->prep_rq_fn) that the +block layer would invoke to pre-build device commands for a given request, +or perform other preparatory processing for the request. This is routine is +called by elv_next_request(), i.e. typically just before servicing a request. +(The prepare function would not be called for requests that have REQ_DONTPREP +enabled) + +Aside: + Pre-building could possibly even be done early, i.e before placing the + request on the queue, rather than construct the command on the fly in the + driver while servicing the request queue when it may affect latencies in + interrupt context or responsiveness in general. One way to add early + pre-building would be to do it whenever we fail to merge on a request. + Now REQ_NOMERGE is set in the request flags to skip this one in the future, + which means that it will not change before we feed it to the device. So + the pre-builder hook can be invoked there. + + +2. Flexible and generic but minimalist i/o structure/descriptor. + +2.1 Reason for a new structure and requirements addressed + +Prior to 2.5, buffer heads were used as the unit of i/o at the generic block +layer, and the low level request structure was associated with a chain of +buffer heads for a contiguous i/o request. This led to certain inefficiencies +when it came to large i/o requests and readv/writev style operations, as it +forced such requests to be broken up into small chunks before being passed +on to the generic block layer, only to be merged by the i/o scheduler +when the underlying device was capable of handling the i/o in one shot. +Also, using the buffer head as an i/o structure for i/os that didn't originate +from the buffer cache unecessarily added to the weight of the descriptors +which were generated for each such chunk. + +The following were some of the goals and expectations considered in the +redesign of the block i/o data structure in 2.5. + +i. Should be appropriate as a descriptor for both raw and buffered i/o - + avoid cache related fields which are irrelevant in the direct/page i/o path, + or filesystem block size alignment restrictions which may not be relevant + for raw i/o. +ii. Ability to represent high-memory buffers (which do not have a virtual + address mapping in kernel address space). +iii.Ability to represent large i/os w/o unecessarily breaking them up (i.e + greater than PAGE_SIZE chunks in one shot) +iv. At the same time, ability to retain independent identity of i/os from + different sources or i/o units requiring individual completion (e.g. for + latency reasons) +v. Ability to represent an i/o involving multiple physical memory segments + (including non-page aligned page fragments, as specified via readv/writev) + without unecessarily breaking it up, if the underlying device is capable of + handling it. +vi. Preferably should be based on a memory descriptor structure that can be + passed around different types of subsystems or layers, maybe even + networking, without duplication or extra copies of data/descriptor fields + themselves in the process +vii.Ability to handle the possibility of splits/merges as the structure passes + through layered drivers (lvm, md, evms), with minimal overhead. + +The solution was to define a new structure (bio) for the block layer, +instead of using the buffer head structure (bh) directly, the idea being +avoidance of some associated baggage and limitations. The bio structure +is uniformly used for all i/o at the block layer ; it forms a part of the +bh structure for buffered i/o, and in the case of raw/direct i/o kiobufs are +mapped to bio structures. + +2.2 The bio struct + +The bio structure uses a vector representation pointing to an array of tuples +of <page, offset, len> to describe the i/o buffer, and has various other +fields describing i/o parameters and state that needs to be maintained for +performing the i/o. + +Notice that this representation means that a bio has no virtual address +mapping at all (unlike buffer heads). + +struct bio_vec { + struct page *bv_page; + unsigned short bv_len; + unsigned short bv_offset; +}; + +/* + * main unit of I/O for the block layer and lower layers (ie drivers) + */ +struct bio { + sector_t bi_sector; + struct bio *bi_next; /* request queue link */ + struct block_device *bi_bdev; /* target device */ + unsigned long bi_flags; /* status, command, etc */ + unsigned long bi_rw; /* low bits: r/w, high: priority */ + + unsigned int bi_vcnt; /* how may bio_vec's */ + unsigned int bi_idx; /* current index into bio_vec array */ + + unsigned int bi_size; /* total size in bytes */ + unsigned short bi_phys_segments; /* segments after physaddr coalesce*/ + unsigned short bi_hw_segments; /* segments after DMA remapping */ + unsigned int bi_max; /* max bio_vecs we can hold + used as index into pool */ + struct bio_vec *bi_io_vec; /* the actual vec list */ + bio_end_io_t *bi_end_io; /* bi_end_io (bio) */ + atomic_t bi_cnt; /* pin count: free when it hits zero */ + void *bi_private; + bio_destructor_t *bi_destructor; /* bi_destructor (bio) */ +}; + +With this multipage bio design: + +- Large i/os can be sent down in one go using a bio_vec list consisting + of an array of <page, offset, len> fragments (similar to the way fragments + are represented in the zero-copy network code) +- Splitting of an i/o request across multiple devices (as in the case of + lvm or raid) is achieved by cloning the bio (where the clone points to + the same bi_io_vec array, but with the index and size accordingly modified) +- A linked list of bios is used as before for unrelated merges (*) - this + avoids reallocs and makes independent completions easier to handle. +- Code that traverses the req list needs to make a distinction between + segments of a request (bio_for_each_segment) and the distinct completion + units/bios (rq_for_each_bio). +- Drivers which can't process a large bio in one shot can use the bi_idx + field to keep track of the next bio_vec entry to process. + (e.g a 1MB bio_vec needs to be handled in max 128kB chunks for IDE) + [TBD: Should preferably also have a bi_voffset and bi_vlen to avoid modifying + bi_offset an len fields] + +(*) unrelated merges -- a request ends up containing two or more bios that + didn't originate from the same place. + +bi_end_io() i/o callback gets called on i/o completion of the entire bio. + +At a lower level, drivers build a scatter gather list from the merged bios. +The scatter gather list is in the form of an array of <page, offset, len> +entries with their corresponding dma address mappings filled in at the +appropriate time. As an optimization, contiguous physical pages can be +covered by a single entry where <page> refers to the first page and <len> +covers the range of pages (upto 16 contiguous pages could be covered this +way). There is a helper routine (blk_rq_map_sg) which drivers can use to build +the sg list. + +Note: Right now the only user of bios with more than one page is ll_rw_kio, +which in turn means that only raw I/O uses it (direct i/o may not work +right now). The intent however is to enable clustering of pages etc to +become possible. The pagebuf abstraction layer from SGI also uses multi-page +bios, but that is currently not included in the stock development kernels. +The same is true of Andrew Morton's work-in-progress multipage bio writeout +and readahead patches. + +2.3 Changes in the Request Structure + +The request structure is the structure that gets passed down to low level +drivers. The block layer make_request function builds up a request structure, +places it on the queue and invokes the drivers request_fn. The driver makes +use of block layer helper routine elv_next_request to pull the next request +off the queue. Control or diagnostic functions might bypass block and directly +invoke underlying driver entry points passing in a specially constructed +request structure. + +Only some relevant fields (mainly those which changed or may be referred +to in some of the discussion here) are listed below, not necessarily in +the order in which they occur in the structure (see include/linux/blkdev.h) +Refer to Documentation/block/request.txt for details about all the request +structure fields and a quick reference about the layers which are +supposed to use or modify those fields. + +struct request { + struct list_head queuelist; /* Not meant to be directly accessed by + the driver. + Used by q->elv_next_request_fn + rq->queue is gone + */ + . + . + unsigned char cmd[16]; /* prebuilt command data block */ + unsigned long flags; /* also includes earlier rq->cmd settings */ + . + . + sector_t sector; /* this field is now of type sector_t instead of int + preparation for 64 bit sectors */ + . + . + + /* Number of scatter-gather DMA addr+len pairs after + * physical address coalescing is performed. + */ + unsigned short nr_phys_segments; + + /* Number of scatter-gather addr+len pairs after + * physical and DMA remapping hardware coalescing is performed. + * This is the number of scatter-gather entries the driver + * will actually have to deal with after DMA mapping is done. + */ + unsigned short nr_hw_segments; + + /* Various sector counts */ + unsigned long nr_sectors; /* no. of sectors left: driver modifiable */ + unsigned long hard_nr_sectors; /* block internal copy of above */ + unsigned int current_nr_sectors; /* no. of sectors left in the + current segment:driver modifiable */ + unsigned long hard_cur_sectors; /* block internal copy of the above */ + . + . + int tag; /* command tag associated with request */ + void *special; /* same as before */ + char *buffer; /* valid only for low memory buffers upto + current_nr_sectors */ + . + . + struct bio *bio, *biotail; /* bio list instead of bh */ + struct request_list *rl; +} + +See the rq_flag_bits definitions for an explanation of the various flags +available. Some bits are used by the block layer or i/o scheduler. + +The behaviour of the various sector counts are almost the same as before, +except that since we have multi-segment bios, current_nr_sectors refers +to the numbers of sectors in the current segment being processed which could +be one of the many segments in the current bio (i.e i/o completion unit). +The nr_sectors value refers to the total number of sectors in the whole +request that remain to be transferred (no change). The purpose of the +hard_xxx values is for block to remember these counts every time it hands +over the request to the driver. These values are updated by block on +end_that_request_first, i.e. every time the driver completes a part of the +transfer and invokes block end*request helpers to mark this. The +driver should not modify these values. The block layer sets up the +nr_sectors and current_nr_sectors fields (based on the corresponding +hard_xxx values and the number of bytes transferred) and updates it on +every transfer that invokes end_that_request_first. It does the same for the +buffer, bio, bio->bi_idx fields too. + +The buffer field is just a virtual address mapping of the current segment +of the i/o buffer in cases where the buffer resides in low-memory. For high +memory i/o, this field is not valid and must not be used by drivers. + +Code that sets up its own request structures and passes them down to +a driver needs to be careful about interoperation with the block layer helper +functions which the driver uses. (Section 1.3) + +3. Using bios + +3.1 Setup/Teardown + +There are routines for managing the allocation, and reference counting, and +freeing of bios (bio_alloc, bio_get, bio_put). + +This makes use of Ingo Molnar's mempool implementation, which enables +subsystems like bio to maintain their own reserve memory pools for guaranteed +deadlock-free allocations during extreme VM load. For example, the VM +subsystem makes use of the block layer to writeout dirty pages in order to be +able to free up memory space, a case which needs careful handling. The +allocation logic draws from the preallocated emergency reserve in situations +where it cannot allocate through normal means. If the pool is empty and it +can wait, then it would trigger action that would help free up memory or +replenish the pool (without deadlocking) and wait for availability in the pool. +If it is in IRQ context, and hence not in a position to do this, allocation +could fail if the pool is empty. In general mempool always first tries to +perform allocation without having to wait, even if it means digging into the +pool as long it is not less that 50% full. + +On a free, memory is released to the pool or directly freed depending on +the current availability in the pool. The mempool interface lets the +subsystem specify the routines to be used for normal alloc and free. In the +case of bio, these routines make use of the standard slab allocator. + +The caller of bio_alloc is expected to taken certain steps to avoid +deadlocks, e.g. avoid trying to allocate more memory from the pool while +already holding memory obtained from the pool. +[TBD: This is a potential issue, though a rare possibility + in the bounce bio allocation that happens in the current code, since + it ends up allocating a second bio from the same pool while + holding the original bio ] + +Memory allocated from the pool should be released back within a limited +amount of time (in the case of bio, that would be after the i/o is completed). +This ensures that if part of the pool has been used up, some work (in this +case i/o) must already be in progress and memory would be available when it +is over. If allocating from multiple pools in the same code path, the order +or hierarchy of allocation needs to be consistent, just the way one deals +with multiple locks. + +The bio_alloc routine also needs to allocate the bio_vec_list (bvec_alloc()) +for a non-clone bio. There are the 6 pools setup for different size biovecs, +so bio_alloc(gfp_mask, nr_iovecs) will allocate a vec_list of the +given size from these slabs. + +The bi_destructor() routine takes into account the possibility of the bio +having originated from a different source (see later discussions on +n/w to block transfers and kvec_cb) + +The bio_get() routine may be used to hold an extra reference on a bio prior +to i/o submission, if the bio fields are likely to be accessed after the +i/o is issued (since the bio may otherwise get freed in case i/o completion +happens in the meantime). + +The bio_clone() routine may be used to duplicate a bio, where the clone +shares the bio_vec_list with the original bio (i.e. both point to the +same bio_vec_list). This would typically be used for splitting i/o requests +in lvm or md. + +3.2 Generic bio helper Routines + +3.2.1 Traversing segments and completion units in a request + +The macros bio_for_each_segment() and rq_for_each_bio() should be used for +traversing the bios in the request list (drivers should avoid directly +trying to do it themselves). Using these helpers should also make it easier +to cope with block changes in the future. + + rq_for_each_bio(bio, rq) + bio_for_each_segment(bio_vec, bio, i) + /* bio_vec is now current segment */ + +I/O completion callbacks are per-bio rather than per-segment, so drivers +that traverse bio chains on completion need to keep that in mind. Drivers +which don't make a distinction between segments and completion units would +need to be reorganized to support multi-segment bios. + +3.2.2 Setting up DMA scatterlists + +The blk_rq_map_sg() helper routine would be used for setting up scatter +gather lists from a request, so a driver need not do it on its own. + + nr_segments = blk_rq_map_sg(q, rq, scatterlist); + +The helper routine provides a level of abstraction which makes it easier +to modify the internals of request to scatterlist conversion down the line +without breaking drivers. The blk_rq_map_sg routine takes care of several +things like collapsing physically contiguous segments (if QUEUE_FLAG_CLUSTER +is set) and correct segment accounting to avoid exceeding the limits which +the i/o hardware can handle, based on various queue properties. + +- Prevents a clustered segment from crossing a 4GB mem boundary +- Avoids building segments that would exceed the number of physical + memory segments that the driver can handle (phys_segments) and the + number that the underlying hardware can handle at once, accounting for + DMA remapping (hw_segments) (i.e. IOMMU aware limits). + +Routines which the low level driver can use to set up the segment limits: + +blk_queue_max_hw_segments() : Sets an upper limit of the maximum number of +hw data segments in a request (i.e. the maximum number of address/length +pairs the host adapter can actually hand to the device at once) + +blk_queue_max_phys_segments() : Sets an upper limit on the maximum number +of physical data segments in a request (i.e. the largest sized scatter list +a driver could handle) + +3.2.3 I/O completion + +The existing generic block layer helper routines end_request, +end_that_request_first and end_that_request_last can be used for i/o +completion (and setting things up so the rest of the i/o or the next +request can be kicked of) as before. With the introduction of multi-page +bio support, end_that_request_first requires an additional argument indicating +the number of sectors completed. + +3.2.4 Implications for drivers that do not interpret bios (don't handle + multiple segments) + +Drivers that do not interpret bios e.g those which do not handle multiple +segments and do not support i/o into high memory addresses (require bounce +buffers) and expect only virtually mapped buffers, can access the rq->buffer +field. As before the driver should use current_nr_sectors to determine the +size of remaining data in the current segment (that is the maximum it can +transfer in one go unless it interprets segments), and rely on the block layer +end_request, or end_that_request_first/last to take care of all accounting +and transparent mapping of the next bio segment when a segment boundary +is crossed on completion of a transfer. (The end*request* functions should +be used if only if the request has come down from block/bio path, not for +direct access requests which only specify rq->buffer without a valid rq->bio) + +3.2.5 Generic request command tagging + +3.2.5.1 Tag helpers + +Block now offers some simple generic functionality to help support command +queueing (typically known as tagged command queueing), ie manage more than +one outstanding command on a queue at any given time. + + blk_queue_init_tags(request_queue_t *q, int depth) + + Initialize internal command tagging structures for a maximum + depth of 'depth'. + + blk_queue_free_tags((request_queue_t *q) + + Teardown tag info associated with the queue. This will be done + automatically by block if blk_queue_cleanup() is called on a queue + that is using tagging. + +The above are initialization and exit management, the main helpers during +normal operations are: + + blk_queue_start_tag(request_queue_t *q, struct request *rq) + + Start tagged operation for this request. A free tag number between + 0 and 'depth' is assigned to the request (rq->tag holds this number), + and 'rq' is added to the internal tag management. If the maximum depth + for this queue is already achieved (or if the tag wasn't started for + some other reason), 1 is returned. Otherwise 0 is returned. + + blk_queue_end_tag(request_queue_t *q, struct request *rq) + + End tagged operation on this request. 'rq' is removed from the internal + book keeping structures. + +To minimize struct request and queue overhead, the tag helpers utilize some +of the same request members that are used for normal request queue management. +This means that a request cannot both be an active tag and be on the queue +list at the same time. blk_queue_start_tag() will remove the request, but +the driver must remember to call blk_queue_end_tag() before signalling +completion of the request to the block layer. This means ending tag +operations before calling end_that_request_last()! For an example of a user +of these helpers, see the IDE tagged command queueing support. + +Certain hardware conditions may dictate a need to invalidate the block tag +queue. For instance, on IDE any tagged request error needs to clear both +the hardware and software block queue and enable the driver to sanely restart +all the outstanding requests. There's a third helper to do that: + + blk_queue_invalidate_tags(request_queue_t *q) + + Clear the internal block tag queue and readd all the pending requests + to the request queue. The driver will receive them again on the + next request_fn run, just like it did the first time it encountered + them. + +3.2.5.2 Tag info + +Some block functions exist to query current tag status or to go from a +tag number to the associated request. These are, in no particular order: + + blk_queue_tagged(q) + + Returns 1 if the queue 'q' is using tagging, 0 if not. + + blk_queue_tag_request(q, tag) + + Returns a pointer to the request associated with tag 'tag'. + + blk_queue_tag_depth(q) + + Return current queue depth. + + blk_queue_tag_queue(q) + + Returns 1 if the queue can accept a new queued command, 0 if we are + at the maximum depth already. + + blk_queue_rq_tagged(rq) + + Returns 1 if the request 'rq' is tagged. + +3.2.5.2 Internal structure + +Internally, block manages tags in the blk_queue_tag structure: + + struct blk_queue_tag { + struct request **tag_index; /* array or pointers to rq */ + unsigned long *tag_map; /* bitmap of free tags */ + struct list_head busy_list; /* fifo list of busy tags */ + int busy; /* queue depth */ + int max_depth; /* max queue depth */ + }; + +Most of the above is simple and straight forward, however busy_list may need +a bit of explaining. Normally we don't care too much about request ordering, +but in the event of any barrier requests in the tag queue we need to ensure +that requests are restarted in the order they were queue. This may happen +if the driver needs to use blk_queue_invalidate_tags(). + +Tagging also defines a new request flag, REQ_QUEUED. This is set whenever +a request is currently tagged. You should not use this flag directly, +blk_rq_tagged(rq) is the portable way to do so. + +3.3 I/O Submission + +The routine submit_bio() is used to submit a single io. Higher level i/o +routines make use of this: + +(a) Buffered i/o: +The routine submit_bh() invokes submit_bio() on a bio corresponding to the +bh, allocating the bio if required. ll_rw_block() uses submit_bh() as before. + +(b) Kiobuf i/o (for raw/direct i/o): +The ll_rw_kio() routine breaks up the kiobuf into page sized chunks and +maps the array to one or more multi-page bios, issuing submit_bio() to +perform the i/o on each of these. + +The embedded bh array in the kiobuf structure has been removed and no +preallocation of bios is done for kiobufs. [The intent is to remove the +blocks array as well, but it's currently in there to kludge around direct i/o.] +Thus kiobuf allocation has switched back to using kmalloc rather than vmalloc. + +Todo/Observation: + + A single kiobuf structure is assumed to correspond to a contiguous range + of data, so brw_kiovec() invokes ll_rw_kio for each kiobuf in a kiovec. + So right now it wouldn't work for direct i/o on non-contiguous blocks. + This is to be resolved. The eventual direction is to replace kiobuf + by kvec's. + + Badari Pulavarty has a patch to implement direct i/o correctly using + bio and kvec. + + +(c) Page i/o: +Todo/Under discussion: + + Andrew Morton's multi-page bio patches attempt to issue multi-page + writeouts (and reads) from the page cache, by directly building up + large bios for submission completely bypassing the usage of buffer + heads. This work is still in progress. + + Christoph Hellwig had some code that uses bios for page-io (rather than + bh). This isn't included in bio as yet. Christoph was also working on a + design for representing virtual/real extents as an entity and modifying + some of the address space ops interfaces to utilize this abstraction rather + than buffer_heads. (This is somewhat along the lines of the SGI XFS pagebuf + abstraction, but intended to be as lightweight as possible). + +(d) Direct access i/o: +Direct access requests that do not contain bios would be submitted differently +as discussed earlier in section 1.3. + +Aside: + + Kvec i/o: + + Ben LaHaise's aio code uses a slighly different structure instead + of kiobufs, called a kvec_cb. This contains an array of <page, offset, len> + tuples (very much like the networking code), together with a callback function + and data pointer. This is embedded into a brw_cb structure when passed + to brw_kvec_async(). + + Now it should be possible to directly map these kvecs to a bio. Just as while + cloning, in this case rather than PRE_BUILT bio_vecs, we set the bi_io_vec + array pointer to point to the veclet array in kvecs. + + TBD: In order for this to work, some changes are needed in the way multi-page + bios are handled today. The values of the tuples in such a vector passed in + from higher level code should not be modified by the block layer in the course + of its request processing, since that would make it hard for the higher layer + to continue to use the vector descriptor (kvec) after i/o completes. Instead, + all such transient state should either be maintained in the request structure, + and passed on in some way to the endio completion routine. + + +4. The I/O scheduler +I/O schedulers are now per queue. They should be runtime switchable and modular +but aren't yet. Jens has most bits to do this, but the sysfs implementation is +missing. + +A block layer call to the i/o scheduler follows the convention elv_xxx(). This +calls elevator_xxx_fn in the elevator switch (drivers/block/elevator.c). Oh, +xxx and xxx might not match exactly, but use your imagination. If an elevator +doesn't implement a function, the switch does nothing or some minimal house +keeping work. + +4.1. I/O scheduler API + +The functions an elevator may implement are: (* are mandatory) +elevator_merge_fn called to query requests for merge with a bio + +elevator_merge_req_fn " " " with another request + +elevator_merged_fn called when a request in the scheduler has been + involved in a merge. It is used in the deadline + scheduler for example, to reposition the request + if its sorting order has changed. + +*elevator_next_req_fn returns the next scheduled request, or NULL + if there are none (or none are ready). + +*elevator_add_req_fn called to add a new request into the scheduler + +elevator_queue_empty_fn returns true if the merge queue is empty. + Drivers shouldn't use this, but rather check + if elv_next_request is NULL (without losing the + request if one exists!) + +elevator_remove_req_fn This is called when a driver claims ownership of + the target request - it now belongs to the + driver. It must not be modified or merged. + Drivers must not lose the request! A subsequent + call of elevator_next_req_fn must return the + _next_ request. + +elevator_requeue_req_fn called to add a request to the scheduler. This + is used when the request has alrnadebeen + returned by elv_next_request, but hasn't + completed. If this is not implemented then + elevator_add_req_fn is called instead. + +elevator_former_req_fn +elevator_latter_req_fn These return the request before or after the + one specified in disk sort order. Used by the + block layer to find merge possibilities. + +elevator_completed_req_fn called when a request is completed. This might + come about due to being merged with another or + when the device completes the request. + +elevator_may_queue_fn returns true if the scheduler wants to allow the + current context to queue a new request even if + it is over the queue limit. This must be used + very carefully!! + +elevator_set_req_fn +elevator_put_req_fn Must be used to allocate and free any elevator + specific storate for a request. + +elevator_init_fn +elevator_exit_fn Allocate and free any elevator specific storage + for a queue. + +4.2 I/O scheduler implementation +The generic i/o scheduler algorithm attempts to sort/merge/batch requests for +optimal disk scan and request servicing performance (based on generic +principles and device capabilities), optimized for: +i. improved throughput +ii. improved latency +iii. better utilization of h/w & CPU time + +Characteristics: + +i. Binary tree +AS and deadline i/o schedulers use red black binary trees for disk position +sorting and searching, and a fifo linked list for time-based searching. This +gives good scalability and good availablility of information. Requests are +almost always dispatched in disk sort order, so a cache is kept of the next +request in sort order to prevent binary tree lookups. + +This arrangement is not a generic block layer characteristic however, so +elevators may implement queues as they please. + +ii. Last merge hint +The last merge hint is part of the generic queue layer. I/O schedulers must do +some management on it. For the most part, the most important thing is to make +sure q->last_merge is cleared (set to NULL) when the request on it is no longer +a candidate for merging (for example if it has been sent to the driver). + +The last merge performed is cached as a hint for the subsequent request. If +sequential data is being submitted, the hint is used to perform merges without +any scanning. This is not sufficient when there are multiple processes doing +I/O though, so a "merge hash" is used by some schedulers. + +iii. Merge hash +AS and deadline use a hash table indexed by the last sector of a request. This +enables merging code to quickly look up "back merge" candidates, even when +multiple I/O streams are being performed at once on one disk. + +"Front merges", a new request being merged at the front of an existing request, +are far less common than "back merges" due to the nature of most I/O patterns. +Front merges are handled by the binary trees in AS and deadline schedulers. + +iv. Handling barrier cases +A request with flags REQ_HARDBARRIER or REQ_SOFTBARRIER must not be ordered +around. That is, they must be processed after all older requests, and before +any newer ones. This includes merges! + +In AS and deadline schedulers, barriers have the effect of flushing the reorder +queue. The performance cost of this will vary from nothing to a lot depending +on i/o patterns and device characteristics. Obviously they won't improve +performance, so their use should be kept to a minimum. + +v. Handling insertion position directives +A request may be inserted with a position directive. The directives are one of +ELEVATOR_INSERT_BACK, ELEVATOR_INSERT_FRONT, ELEVATOR_INSERT_SORT. + +ELEVATOR_INSERT_SORT is a general directive for non-barrier requests. +ELEVATOR_INSERT_BACK is used to insert a barrier to the back of the queue. +ELEVATOR_INSERT_FRONT is used to insert a barrier to the front of the queue, and +overrides the ordering requested by any previous barriers. In practice this is +harmless and required, because it is used for SCSI requeueing. This does not +require flushing the reorder queue, so does not impose a performance penalty. + +vi. Plugging the queue to batch requests in anticipation of opportunities for + merge/sort optimizations + +This is just the same as in 2.4 so far, though per-device unplugging +support is anticipated for 2.5. Also with a priority-based i/o scheduler, +such decisions could be based on request priorities. + +Plugging is an approach that the current i/o scheduling algorithm resorts to so +that it collects up enough requests in the queue to be able to take +advantage of the sorting/merging logic in the elevator. If the +queue is empty when a request comes in, then it plugs the request queue +(sort of like plugging the bottom of a vessel to get fluid to build up) +till it fills up with a few more requests, before starting to service +the requests. This provides an opportunity to merge/sort the requests before +passing them down to the device. There are various conditions when the queue is +unplugged (to open up the flow again), either through a scheduled task or +could be on demand. For example wait_on_buffer sets the unplugging going +(by running tq_disk) so the read gets satisfied soon. So in the read case, +the queue gets explicitly unplugged as part of waiting for completion, +in fact all queues get unplugged as a side-effect. + +Aside: + This is kind of controversial territory, as it's not clear if plugging is + always the right thing to do. Devices typically have their own queues, + and allowing a big queue to build up in software, while letting the device be + idle for a while may not always make sense. The trick is to handle the fine + balance between when to plug and when to open up. Also now that we have + multi-page bios being queued in one shot, we may not need to wait to merge + a big request from the broken up pieces coming by. + + Per-queue granularity unplugging (still a Todo) may help reduce some of the + concerns with just a single tq_disk flush approach. Something like + blk_kick_queue() to unplug a specific queue (right away ?) + or optionally, all queues, is in the plan. + +4.3 I/O contexts +I/O contexts provide a dynamically allocated per process data area. They may +be used in I/O schedulers, and in the block layer (could be used for IO statis, +priorities for example). See *io_context in drivers/block/ll_rw_blk.c, and +as-iosched.c for an example of usage in an i/o scheduler. + + +5. Scalability related changes + +5.1 Granular Locking: io_request_lock replaced by a per-queue lock + +The global io_request_lock has been removed as of 2.5, to avoid +the scalability bottleneck it was causing, and has been replaced by more +granular locking. The request queue structure has a pointer to the +lock to be used for that queue. As a result, locking can now be +per-queue, with a provision for sharing a lock across queues if +necessary (e.g the scsi layer sets the queue lock pointers to the +corresponding adapter lock, which results in a per host locking +granularity). The locking semantics are the same, i.e. locking is +still imposed by the block layer, grabbing the lock before +request_fn execution which it means that lots of older drivers +should still be SMP safe. Drivers are free to drop the queue +lock themselves, if required. Drivers that explicitly used the +io_request_lock for serialization need to be modified accordingly. +Usually it's as easy as adding a global lock: + + static spinlock_t my_driver_lock = SPIN_LOCK_UNLOCKED; + +and passing the address to that lock to blk_init_queue(). + +5.2 64 bit sector numbers (sector_t prepares for 64 bit support) + +The sector number used in the bio structure has been changed to sector_t, +which could be defined as 64 bit in preparation for 64 bit sector support. + +6. Other Changes/Implications + +6.1 Partition re-mapping handled by the generic block layer + +In 2.5 some of the gendisk/partition related code has been reorganized. +Now the generic block layer performs partition-remapping early and thus +provides drivers with a sector number relative to whole device, rather than +having to take partition number into account in order to arrive at the true +sector number. The routine blk_partition_remap() is invoked by +generic_make_request even before invoking the queue specific make_request_fn, +so the i/o scheduler also gets to operate on whole disk sector numbers. This +should typically not require changes to block drivers, it just never gets +to invoke its own partition sector offset calculations since all bios +sent are offset from the beginning of the device. + + +7. A Few Tips on Migration of older drivers + +Old-style drivers that just use CURRENT and ignores clustered requests, +may not need much change. The generic layer will automatically handle +clustered requests, multi-page bios, etc for the driver. + +For a low performance driver or hardware that is PIO driven or just doesn't +support scatter-gather changes should be minimal too. + +The following are some points to keep in mind when converting old drivers +to bio. + +Drivers should use elv_next_request to pick up requests and are no longer +supposed to handle looping directly over the request list. +(struct request->queue has been removed) + +Now end_that_request_first takes an additional number_of_sectors argument. +It used to handle always just the first buffer_head in a request, now +it will loop and handle as many sectors (on a bio-segment granularity) +as specified. + +Now bh->b_end_io is replaced by bio->bi_end_io, but most of the time the +right thing to use is bio_endio(bio, uptodate) instead. + +If the driver is dropping the io_request_lock from its request_fn strategy, +then it just needs to replace that with q->queue_lock instead. + +As described in Sec 1.1, drivers can set max sector size, max segment size +etc per queue now. Drivers that used to define their own merge functions i +to handle things like this can now just use the blk_queue_* functions at +blk_init_queue time. + +Drivers no longer have to map a {partition, sector offset} into the +correct absolute location anymore, this is done by the block layer, so +where a driver received a request ala this before: + + rq->rq_dev = mk_kdev(3, 5); /* /dev/hda5 */ + rq->sector = 0; /* first sector on hda5 */ + + it will now see + + rq->rq_dev = mk_kdev(3, 0); /* /dev/hda */ + rq->sector = 123128; /* offset from start of disk */ + +As mentioned, there is no virtual mapping of a bio. For DMA, this is +not a problem as the driver probably never will need a virtual mapping. +Instead it needs a bus mapping (pci_map_page for a single segment or +use blk_rq_map_sg for scatter gather) to be able to ship it to the driver. For +PIO drivers (or drivers that need to revert to PIO transfer once in a +while (IDE for example)), where the CPU is doing the actual data +transfer a virtual mapping is needed. If the driver supports highmem I/O, +(Sec 1.1, (ii) ) it needs to use __bio_kmap_atomic and bio_kmap_irq to +temporarily map a bio into the virtual address space. + + +8. Prior/Related/Impacted patches + +8.1. Earlier kiobuf patches (sct/axboe/chait/hch/mkp) +- orig kiobuf & raw i/o patches (now in 2.4 tree) +- direct kiobuf based i/o to devices (no intermediate bh's) +- page i/o using kiobuf +- kiobuf splitting for lvm (mkp) +- elevator support for kiobuf request merging (axboe) +8.2. Zero-copy networking (Dave Miller) +8.3. SGI XFS - pagebuf patches - use of kiobufs +8.4. Multi-page pioent patch for bio (Christoph Hellwig) +8.5. Direct i/o implementation (Andrea Arcangeli) since 2.4.10-pre11 +8.6. Async i/o implementation patch (Ben LaHaise) +8.7. EVMS layering design (IBM EVMS team) +8.8. Larger page cache size patch (Ben LaHaise) and + Large page size (Daniel Phillips) + => larger contiguous physical memory buffers +8.9. VM reservations patch (Ben LaHaise) +8.10. Write clustering patches ? (Marcelo/Quintela/Riel ?) +8.11. Block device in page cache patch (Andrea Archangeli) - now in 2.4.10+ +8.12. Multiple block-size transfers for faster raw i/o (Shailabh Nagar, + Badari) +8.13 Priority based i/o scheduler - prepatches (Arjan van de Ven) +8.14 IDE Taskfile i/o patch (Andre Hedrick) +8.15 Multi-page writeout and readahead patches (Andrew Morton) +8.16 Direct i/o patches for 2.5 using kvec and bio (Badari Pulavarthy) + +9. Other References: + +9.1 The Splice I/O Model - Larry McVoy (and subsequent discussions on lkml, +and Linus' comments - Jan 2001) +9.2 Discussions about kiobuf and bh design on lkml between sct, linus, alan +et al - Feb-March 2001 (many of the initial thoughts that led to bio were +brought up in this discusion thread) +9.3 Discussions on mempool on lkml - Dec 2001. + diff --git a/Documentation/block/deadline-iosched.txt b/Documentation/block/deadline-iosched.txt new file mode 100644 index 000000000000..c918b3a6022d --- /dev/null +++ b/Documentation/block/deadline-iosched.txt @@ -0,0 +1,78 @@ +Deadline IO scheduler tunables +============================== + +This little file attempts to document how the deadline io scheduler works. +In particular, it will clarify the meaning of the exposed tunables that may be +of interest to power users. + +Each io queue has a set of io scheduler tunables associated with it. These +tunables control how the io scheduler works. You can find these entries +in: + +/sys/block/<device>/queue/iosched + +assuming that you have sysfs mounted on /sys. If you don't have sysfs mounted, +you can do so by typing: + +# mount none /sys -t sysfs + + +******************************************************************************** + + +read_expire (in ms) +----------- + +The goal of the deadline io scheduler is to attempt to guarentee a start +service time for a request. As we focus mainly on read latencies, this is +tunable. When a read request first enters the io scheduler, it is assigned +a deadline that is the current time + the read_expire value in units of +miliseconds. + + +write_expire (in ms) +----------- + +Similar to read_expire mentioned above, but for writes. + + +fifo_batch +---------- + +When a read request expires its deadline, we must move some requests from +the sorted io scheduler list to the block device dispatch queue. fifo_batch +controls how many requests we move, based on the cost of each request. A +request is either qualified as a seek or a stream. The io scheduler knows +the last request that was serviced by the drive (or will be serviced right +before this one). See seek_cost and stream_unit. + + +write_starved (number of dispatches) +------------- + +When we have to move requests from the io scheduler queue to the block +device dispatch queue, we always give a preference to reads. However, we +don't want to starve writes indefinitely either. So writes_starved controls +how many times we give preference to reads over writes. When that has been +done writes_starved number of times, we dispatch some writes based on the +same criteria as reads. + + +front_merges (bool) +------------ + +Sometimes it happens that a request enters the io scheduler that is contigious +with a request that is already on the queue. Either it fits in the back of that +request, or it fits at the front. That is called either a back merge candidate +or a front merge candidate. Due to the way files are typically laid out, +back merges are much more common than front merges. For some work loads, you +may even know that it is a waste of time to spend any time attempting to +front merge requests. Setting front_merges to 0 disables this functionality. +Front merges may still occur due to the cached last_merge hint, but since +that comes at basically 0 cost we leave that on. We simply disable the +rbtree front sector lookup when the io scheduler merge function is called. + + +Nov 11 2002, Jens Axboe <axboe@suse.de> + + diff --git a/Documentation/block/request.txt b/Documentation/block/request.txt new file mode 100644 index 000000000000..75924e2a6975 --- /dev/null +++ b/Documentation/block/request.txt @@ -0,0 +1,88 @@ + +struct request documentation + +Jens Axboe <axboe@suse.de> 27/05/02 + +1.0 +Index + +2.0 Struct request members classification + + 2.1 struct request members explanation + +3.0 + + +2.0 +Short explanation of request members + +Classification flags: + + D driver member + B block layer member + I I/O scheduler member + +Unless an entry contains a D classification, a device driver must not access +this member. Some members may contain D classifications, but should only be +access through certain macros or functions (eg ->flags). + +<linux/blkdev.h> + +2.1 +Member Flag Comment +------ ---- ------- + +struct list_head queuelist BI Organization on various internal + queues + +void *elevator_private I I/O scheduler private data + +unsigned char cmd[16] D Driver can use this for setting up + a cdb before execution, see + blk_queue_prep_rq + +unsigned long flags DBI Contains info about data direction, + request type, etc. + +int rq_status D Request status bits + +kdev_t rq_dev DBI Target device + +int errors DB Error counts + +sector_t sector DBI Target location + +unsigned long hard_nr_sectors B Used to keep sector sane + +unsigned long nr_sectors DBI Total number of sectors in request + +unsigned long hard_nr_sectors B Used to keep nr_sectors sane + +unsigned short nr_phys_segments DB Number of physical scatter gather + segments in a request + +unsigned short nr_hw_segments DB Number of hardware scatter gather + segments in a request + +unsigned int current_nr_sectors DB Number of sectors in first segment + of request + +unsigned int hard_cur_sectors B Used to keep current_nr_sectors sane + +int tag DB TCQ tag, if assigned + +void *special D Free to be used by driver + +char *buffer D Map of first segment, also see + section on bouncing SECTION + +struct completion *waiting D Can be used by driver to get signalled + on request completion + +struct bio *bio DBI First bio in request + +struct bio *biotail DBI Last bio in request + +request_queue_t *q DB Request queue this request belongs to + +struct request_list *rl B Request list this request came from diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt new file mode 100644 index 000000000000..e132fb1163b0 --- /dev/null +++ b/Documentation/cachetlb.txt @@ -0,0 +1,384 @@ + Cache and TLB Flushing + Under Linux + + David S. Miller <davem@redhat.com> + +This document describes the cache/tlb flushing interfaces called +by the Linux VM subsystem. It enumerates over each interface, +describes it's intended purpose, and what side effect is expected +after the interface is invoked. + +The side effects described below are stated for a uniprocessor +implementation, and what is to happen on that single processor. The +SMP cases are a simple extension, in that you just extend the +definition such that the side effect for a particular interface occurs +on all processors in the system. Don't let this scare you into +thinking SMP cache/tlb flushing must be so inefficient, this is in +fact an area where many optimizations are possible. For example, +if it can be proven that a user address space has never executed +on a cpu (see vma->cpu_vm_mask), one need not perform a flush +for this address space on that cpu. + +First, the TLB flushing interfaces, since they are the simplest. The +"TLB" is abstracted under Linux as something the cpu uses to cache +virtual-->physical address translations obtained from the software +page tables. Meaning that if the software page tables change, it is +possible for stale translations to exist in this "TLB" cache. +Therefore when software page table changes occur, the kernel will +invoke one of the following flush methods _after_ the page table +changes occur: + +1) void flush_tlb_all(void) + + The most severe flush of all. After this interface runs, + any previous page table modification whatsoever will be + visible to the cpu. + + This is usually invoked when the kernel page tables are + changed, since such translations are "global" in nature. + +2) void flush_tlb_mm(struct mm_struct *mm) + + This interface flushes an entire user address space from + the TLB. After running, this interface must make sure that + any previous page table modifications for the address space + 'mm' will be visible to the cpu. That is, after running, + there will be no entries in the TLB for 'mm'. + + This interface is used to handle whole address space + page table operations such as what happens during + fork, and exec. + + Platform developers note that generic code will always + invoke this interface without mm->page_table_lock held. + +3) void flush_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end) + + Here we are flushing a specific range of (user) virtual + address translations from the TLB. After running, this + interface must make sure that any previous page table + modifications for the address space 'vma->vm_mm' in the range + 'start' to 'end-1' will be visible to the cpu. That is, after + running, here will be no entries in the TLB for 'mm' for + virtual addresses in the range 'start' to 'end-1'. + + The "vma" is the backing store being used for the region. + Primarily, this is used for munmap() type operations. + + The interface is provided in hopes that the port can find + a suitably efficient method for removing multiple page + sized translations from the TLB, instead of having the kernel + call flush_tlb_page (see below) for each entry which may be + modified. + + Platform developers note that generic code will always + invoke this interface with mm->page_table_lock held. + +4) void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr) + + This time we need to remove the PAGE_SIZE sized translation + from the TLB. The 'vma' is the backing structure used by + Linux to keep track of mmap'd regions for a process, the + address space is available via vma->vm_mm. Also, one may + test (vma->vm_flags & VM_EXEC) to see if this region is + executable (and thus could be in the 'instruction TLB' in + split-tlb type setups). + + After running, this interface must make sure that any previous + page table modification for address space 'vma->vm_mm' for + user virtual address 'addr' will be visible to the cpu. That + is, after running, there will be no entries in the TLB for + 'vma->vm_mm' for virtual address 'addr'. + + This is used primarily during fault processing. + + Platform developers note that generic code will always + invoke this interface with mm->page_table_lock held. + +5) void flush_tlb_pgtables(struct mm_struct *mm, + unsigned long start, unsigned long end) + + The software page tables for address space 'mm' for virtual + addresses in the range 'start' to 'end-1' are being torn down. + + Some platforms cache the lowest level of the software page tables + in a linear virtually mapped array, to make TLB miss processing + more efficient. On such platforms, since the TLB is caching the + software page table structure, it needs to be flushed when parts + of the software page table tree are unlinked/freed. + + Sparc64 is one example of a platform which does this. + + Usually, when munmap()'ing an area of user virtual address + space, the kernel leaves the page table parts around and just + marks the individual pte's as invalid. However, if very large + portions of the address space are unmapped, the kernel frees up + those portions of the software page tables to prevent potential + excessive kernel memory usage caused by erratic mmap/mmunmap + sequences. It is at these times that flush_tlb_pgtables will + be invoked. + +6) void update_mmu_cache(struct vm_area_struct *vma, + unsigned long address, pte_t pte) + + At the end of every page fault, this routine is invoked to + tell the architecture specific code that a translation + described by "pte" now exists at virtual address "address" + for address space "vma->vm_mm", in the software page tables. + + A port may use this information in any way it so chooses. + For example, it could use this event to pre-load TLB + translations for software managed TLB configurations. + The sparc64 port currently does this. + +7) void tlb_migrate_finish(struct mm_struct *mm) + + This interface is called at the end of an explicit + process migration. This interface provides a hook + to allow a platform to update TLB or context-specific + information for the address space. + + The ia64 sn2 platform is one example of a platform + that uses this interface. + +8) void lazy_mmu_prot_update(pte_t pte) + This interface is called whenever the protection on + any user PTEs change. This interface provides a notification + to architecture specific code to take appropiate action. + + +Next, we have the cache flushing interfaces. In general, when Linux +is changing an existing virtual-->physical mapping to a new value, +the sequence will be in one of the following forms: + + 1) flush_cache_mm(mm); + change_all_page_tables_of(mm); + flush_tlb_mm(mm); + + 2) flush_cache_range(vma, start, end); + change_range_of_page_tables(mm, start, end); + flush_tlb_range(vma, start, end); + + 3) flush_cache_page(vma, addr, pfn); + set_pte(pte_pointer, new_pte_val); + flush_tlb_page(vma, addr); + +The cache level flush will always be first, because this allows +us to properly handle systems whose caches are strict and require +a virtual-->physical translation to exist for a virtual address +when that virtual address is flushed from the cache. The HyperSparc +cpu is one such cpu with this attribute. + +The cache flushing routines below need only deal with cache flushing +to the extent that it is necessary for a particular cpu. Mostly, +these routines must be implemented for cpus which have virtually +indexed caches which must be flushed when virtual-->physical +translations are changed or removed. So, for example, the physically +indexed physically tagged caches of IA32 processors have no need to +implement these interfaces since the caches are fully synchronized +and have no dependency on translation information. + +Here are the routines, one by one: + +1) void flush_cache_mm(struct mm_struct *mm) + + This interface flushes an entire user address space from + the caches. That is, after running, there will be no cache + lines associated with 'mm'. + + This interface is used to handle whole address space + page table operations such as what happens during + fork, exit, and exec. + +2) void flush_cache_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end) + + Here we are flushing a specific range of (user) virtual + addresses from the cache. After running, there will be no + entries in the cache for 'vma->vm_mm' for virtual addresses in + the range 'start' to 'end-1'. + + The "vma" is the backing store being used for the region. + Primarily, this is used for munmap() type operations. + + The interface is provided in hopes that the port can find + a suitably efficient method for removing multiple page + sized regions from the cache, instead of having the kernel + call flush_cache_page (see below) for each entry which may be + modified. + +3) void flush_cache_page(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn) + + This time we need to remove a PAGE_SIZE sized range + from the cache. The 'vma' is the backing structure used by + Linux to keep track of mmap'd regions for a process, the + address space is available via vma->vm_mm. Also, one may + test (vma->vm_flags & VM_EXEC) to see if this region is + executable (and thus could be in the 'instruction cache' in + "Harvard" type cache layouts). + + The 'pfn' indicates the physical page frame (shift this value + left by PAGE_SHIFT to get the physical address) that 'addr' + translates to. It is this mapping which should be removed from + the cache. + + After running, there will be no entries in the cache for + 'vma->vm_mm' for virtual address 'addr' which translates + to 'pfn'. + + This is used primarily during fault processing. + +4) void flush_cache_kmaps(void) + + This routine need only be implemented if the platform utilizes + highmem. It will be called right before all of the kmaps + are invalidated. + + After running, there will be no entries in the cache for + the kernel virtual address range PKMAP_ADDR(0) to + PKMAP_ADDR(LAST_PKMAP). + + This routing should be implemented in asm/highmem.h + +5) void flush_cache_vmap(unsigned long start, unsigned long end) + void flush_cache_vunmap(unsigned long start, unsigned long end) + + Here in these two interfaces we are flushing a specific range + of (kernel) virtual addresses from the cache. After running, + there will be no entries in the cache for the kernel address + space for virtual addresses in the range 'start' to 'end-1'. + + The first of these two routines is invoked after map_vm_area() + has installed the page table entries. The second is invoked + before unmap_vm_area() deletes the page table entries. + +There exists another whole class of cpu cache issues which currently +require a whole different set of interfaces to handle properly. +The biggest problem is that of virtual aliasing in the data cache +of a processor. + +Is your port susceptible to virtual aliasing in it's D-cache? +Well, if your D-cache is virtually indexed, is larger in size than +PAGE_SIZE, and does not prevent multiple cache lines for the same +physical address from existing at once, you have this problem. + +If your D-cache has this problem, first define asm/shmparam.h SHMLBA +properly, it should essentially be the size of your virtually +addressed D-cache (or if the size is variable, the largest possible +size). This setting will force the SYSv IPC layer to only allow user +processes to mmap shared memory at address which are a multiple of +this value. + +NOTE: This does not fix shared mmaps, check out the sparc64 port for +one way to solve this (in particular SPARC_FLAG_MMAPSHARED). + +Next, you have to solve the D-cache aliasing issue for all +other cases. Please keep in mind that fact that, for a given page +mapped into some user address space, there is always at least one more +mapping, that of the kernel in it's linear mapping starting at +PAGE_OFFSET. So immediately, once the first user maps a given +physical page into its address space, by implication the D-cache +aliasing problem has the potential to exist since the kernel already +maps this page at its virtual address. + + void copy_user_page(void *to, void *from, unsigned long addr, struct page *page) + void clear_user_page(void *to, unsigned long addr, struct page *page) + + These two routines store data in user anonymous or COW + pages. It allows a port to efficiently avoid D-cache alias + issues between userspace and the kernel. + + For example, a port may temporarily map 'from' and 'to' to + kernel virtual addresses during the copy. The virtual address + for these two pages is chosen in such a way that the kernel + load/store instructions happen to virtual addresses which are + of the same "color" as the user mapping of the page. Sparc64 + for example, uses this technique. + + The 'addr' parameter tells the virtual address where the + user will ultimately have this page mapped, and the 'page' + parameter gives a pointer to the struct page of the target. + + If D-cache aliasing is not an issue, these two routines may + simply call memcpy/memset directly and do nothing more. + + void flush_dcache_page(struct page *page) + + Any time the kernel writes to a page cache page, _OR_ + the kernel is about to read from a page cache page and + user space shared/writable mappings of this page potentially + exist, this routine is called. + + NOTE: This routine need only be called for page cache pages + which can potentially ever be mapped into the address + space of a user process. So for example, VFS layer code + handling vfs symlinks in the page cache need not call + this interface at all. + + The phrase "kernel writes to a page cache page" means, + specifically, that the kernel executes store instructions + that dirty data in that page at the page->virtual mapping + of that page. It is important to flush here to handle + D-cache aliasing, to make sure these kernel stores are + visible to user space mappings of that page. + + The corollary case is just as important, if there are users + which have shared+writable mappings of this file, we must make + sure that kernel reads of these pages will see the most recent + stores done by the user. + + If D-cache aliasing is not an issue, this routine may + simply be defined as a nop on that architecture. + + There is a bit set aside in page->flags (PG_arch_1) as + "architecture private". The kernel guarantees that, + for pagecache pages, it will clear this bit when such + a page first enters the pagecache. + + This allows these interfaces to be implemented much more + efficiently. It allows one to "defer" (perhaps indefinitely) + the actual flush if there are currently no user processes + mapping this page. See sparc64's flush_dcache_page and + update_mmu_cache implementations for an example of how to go + about doing this. + + The idea is, first at flush_dcache_page() time, if + page->mapping->i_mmap is an empty tree and ->i_mmap_nonlinear + an empty list, just mark the architecture private page flag bit. + Later, in update_mmu_cache(), a check is made of this flag bit, + and if set the flush is done and the flag bit is cleared. + + IMPORTANT NOTE: It is often important, if you defer the flush, + that the actual flush occurs on the same CPU + as did the cpu stores into the page to make it + dirty. Again, see sparc64 for examples of how + to deal with this. + + void copy_to_user_page(struct vm_area_struct *vma, struct page *page, + unsigned long user_vaddr, + void *dst, void *src, int len) + void copy_from_user_page(struct vm_area_struct *vma, struct page *page, + unsigned long user_vaddr, + void *dst, void *src, int len) + When the kernel needs to copy arbitrary data in and out + of arbitrary user pages (f.e. for ptrace()) it will use + these two routines. + + Any necessary cache flushing or other coherency operations + that need to occur should happen here. If the processor's + instruction cache does not snoop cpu stores, it is very + likely that you will need to flush the instruction cache + for copy_to_user_page(). + + void flush_icache_range(unsigned long start, unsigned long end) + When the kernel stores into addresses that it will execute + out of (eg when loading modules), this function is called. + + If the icache does not snoop stores then this routine will need + to flush it. + + void flush_icache_page(struct vm_area_struct *vma, struct page *page) + All the functionality of flush_icache_page can be implemented in + flush_dcache_page and update_mmu_cache. In 2.7 the hope is to + remove this interface completely. diff --git a/Documentation/cciss.txt b/Documentation/cciss.txt new file mode 100644 index 000000000000..d599beb9df8a --- /dev/null +++ b/Documentation/cciss.txt @@ -0,0 +1,132 @@ +This driver is for Compaq's SMART Array Controllers. + +Supported Cards: +---------------- + +This driver is known to work with the following cards: + + * SA 5300 + * SA 5i + * SA 532 + * SA 5312 + * SA 641 + * SA 642 + * SA 6400 + * SA 6400 U320 Expansion Module + * SA 6i + * SA P600 + * SA P800 + * SA E400 + +If nodes are not already created in the /dev/cciss directory, run as root: + +# cd /dev +# ./MAKEDEV cciss + +Device Naming: +-------------- + +You need some entries in /dev for the cciss device. The MAKEDEV script +can make device nodes for you automatically. Currently the device setup +is as follows: + +Major numbers: + 104 cciss0 + 105 cciss1 + 106 cciss2 + 105 cciss3 + 108 cciss4 + 109 cciss5 + 110 cciss6 + 111 cciss7 + +Minor numbers: + b7 b6 b5 b4 b3 b2 b1 b0 + |----+----| |----+----| + | | + | +-------- Partition ID (0=wholedev, 1-15 partition) + | + +-------------------- Logical Volume number + +The device naming scheme is: +/dev/cciss/c0d0 Controller 0, disk 0, whole device +/dev/cciss/c0d0p1 Controller 0, disk 0, partition 1 +/dev/cciss/c0d0p2 Controller 0, disk 0, partition 2 +/dev/cciss/c0d0p3 Controller 0, disk 0, partition 3 + +/dev/cciss/c1d1 Controller 1, disk 1, whole device +/dev/cciss/c1d1p1 Controller 1, disk 1, partition 1 +/dev/cciss/c1d1p2 Controller 1, disk 1, partition 2 +/dev/cciss/c1d1p3 Controller 1, disk 1, partition 3 + +SCSI tape drive and medium changer support +------------------------------------------ + +SCSI sequential access devices and medium changer devices are supported and +appropriate device nodes are automatically created. (e.g. +/dev/st0, /dev/st1, etc. See the "st" man page for more details.) +You must enable "SCSI tape drive support for Smart Array 5xxx" and +"SCSI support" in your kernel configuration to be able to use SCSI +tape drives with your Smart Array 5xxx controller. + +Additionally, note that the driver will not engage the SCSI core at init +time. The driver must be directed to dynamically engage the SCSI core via +the /proc filesystem entry which the "block" side of the driver creates as +/proc/driver/cciss/cciss* at runtime. This is because at driver init time, +the SCSI core may not yet be initialized (because the driver is a block +driver) and attempting to register it with the SCSI core in such a case +would cause a hang. This is best done via an initialization script +(typically in /etc/init.d, but could vary depending on distibution). +For example: + + for x in /proc/driver/cciss/cciss[0-9]* + do + echo "engage scsi" > $x + done + +Once the SCSI core is engaged by the driver, it cannot be disengaged +(except by unloading the driver, if it happens to be linked as a module.) + +Note also that if no sequential access devices or medium changers are +detected, the SCSI core will not be engaged by the action of the above +script. + +Hot plug support for SCSI tape drives +------------------------------------- + +Hot plugging of SCSI tape drives is supported, with some caveats. +The cciss driver must be informed that changes to the SCSI bus +have been made, in addition to and prior to informing the SCSI +mid layer. This may be done via the /proc filesystem. For example: + + echo "rescan" > /proc/scsi/cciss0/1 + +This causes the adapter to query the adapter about changes to the +physical SCSI buses and/or fibre channel arbitrated loop and the +driver to make note of any new or removed sequential access devices +or medium changers. The driver will output messages indicating what +devices have been added or removed and the controller, bus, target and +lun used to address the device. Once this is done, the SCSI mid layer +can be informed of changes to the virtual SCSI bus which the driver +presents to it in the usual way. For example: + + echo scsi add-single-device 3 2 1 0 > /proc/scsi/scsi + +to add a device on controller 3, bus 2, target 1, lun 0. Note that +the driver makes an effort to preserve the devices positions +in the virtual SCSI bus, so if you are only moving tape drives +around on the same adapter and not adding or removing tape drives +from the adapter, informing the SCSI mid layer may not be necessary. + +Note that the naming convention of the /proc filesystem entries +contains a number in addition to the driver name. (E.g. "cciss0" +instead of just "cciss" which you might expect.) + +Note: ONLY sequential access devices and medium changers are presented +as SCSI devices to the SCSI mid layer by the cciss driver. Specifically, +physical SCSI disk drives are NOT presented to the SCSI mid layer. The +physical SCSI disk drives are controlled directly by the array controller +hardware and it is important to prevent the kernel from attempting to directly +access these devices too, as if the array controller were merely a SCSI +controller in the same way that we are allowing it to access SCSI tape drives. + diff --git a/Documentation/cdrom/00-INDEX b/Documentation/cdrom/00-INDEX new file mode 100644 index 000000000000..916dafe29d3f --- /dev/null +++ b/Documentation/cdrom/00-INDEX @@ -0,0 +1,33 @@ +00-INDEX + - this file (info on CD-ROMs and Linux) +Makefile + - only used to generate TeX output from the documentation. +aztcd + - info on Aztech/Orchid/Okano/Wearnes/Conrad/CyCDROM driver. +cdrom-standard.tex + - LaTeX document on standardizing the CD-ROM programming interface. +cdu31a + - info on the Sony CDU31A/CDU33A CD-ROM driver. +cm206 + - info on the Philips/LMS cm206/cm260 CD-ROM driver. +gscd + - info on the Goldstar R420 CD-ROM driver. +ide-cd + - info on setting up and using ATAPI (aka IDE) CD-ROMs. +isp16 + - info on the CD-ROM interface on ISP16, MAD16 or Mozart sound card. +mcd + - info on limitations of standard Mitsumi CD-ROM driver. +mcdx + - info on improved Mitsumi CD-ROM driver. +optcd + - info on the Optics Storage 8000 AT CD-ROM driver +packet-writing.txt + - Info on the CDRW packet writing module +sbpcd + - info on the SoundBlaster/Panasonic CD-ROM interface driver. +sjcd + - info on the SANYO CDR-H94A CD-ROM interface driver. +sonycd535 + - info on the Sony CDU-535 (and 531) CD-ROM driver. + diff --git a/Documentation/cdrom/Makefile b/Documentation/cdrom/Makefile new file mode 100644 index 000000000000..a19e321928e1 --- /dev/null +++ b/Documentation/cdrom/Makefile @@ -0,0 +1,21 @@ +LATEXFILE = cdrom-standard + +all: + make clean + latex $(LATEXFILE) + latex $(LATEXFILE) + @if [ -x `which gv` ]; then \ + `dvips -q -t letter -o $(LATEXFILE).ps $(LATEXFILE).dvi` ;\ + `gv -antialias -media letter -nocenter $(LATEXFILE).ps` ;\ + else \ + `xdvi $(LATEXFILE).dvi &` ;\ + fi + make sortofclean + +clean: + rm -f $(LATEXFILE).ps $(LATEXFILE).dvi $(LATEXFILE).aux $(LATEXFILE).log + +sortofclean: + rm -f $(LATEXFILE).aux $(LATEXFILE).log + + diff --git a/Documentation/cdrom/aztcd b/Documentation/cdrom/aztcd new file mode 100644 index 000000000000..6bf0290ef7ce --- /dev/null +++ b/Documentation/cdrom/aztcd @@ -0,0 +1,822 @@ +$Id: README.aztcd,v 2.60 1997/11/29 09:51:25 root Exp root $ + Readme-File Documentation/cdrom/aztcd + for + AZTECH CD-ROM CDA268-01A, ORCHID CD-3110, + OKANO/WEARNES CDD110, CONRAD TXC, CyCDROM CR520, CR540 + CD-ROM Drives + Version 2.6 and newer + (for other drives see 6.-8.) + +NOTE: THIS DRIVER WILL WORK WITH THE CD-ROM DRIVES LISTED, WHICH HAVE + A PROPRIETARY INTERFACE (implemented on a sound card or on an + ISA-AT-bus card). + IT WILL DEFINITELY NOT WORK WITH CD-ROM DRIVES WITH *IDE*-INTERFACE, + such as the Aztech CDA269-031SE !!! (The only known exceptions are + 'faked' IDE drives like the CyCDROM CR520ie which work with aztcd + under certain conditions, see 7.). IF YOU'RE USING A CD-ROM DRIVE + WITH IDE-INTERFACE, SOMETIMES ALSO CALLED ATAPI-COMPATIBLE, PLEASE + USE THE ide-cd.c DRIVER, WRITTEN BY MARK LORD AND SCOTT SNYDER ! + THE STANDARD-KERNEL 1.2.x NOW ALSO SUPPORTS IDE-CDROM-DRIVES, SEE THE + HARDDISK (!) SECTION OF make config, WHEN COMPILING A NEW KERNEL!!! +---------------------------------------------------------------------------- + +Contents of this file: + 1. NOTE + 2. INSTALLATION + 3. CONFIGURING YOUR KERNEL + 4. RECOMPILING YOUR KERNEL + 4.1 AZTCD AS A RUN-TIME LOADABLE MODULE + 4.2 CDROM CONNECTED TO A SOUNDCARD + 5. KNOWN PROBLEMS, FUTURE DEVELOPMENTS + 5.1 MULTISESSION SUPPORT + 5.2 STATUS RECOGNITION + 5.3 DOSEMU's CDROM SUPPORT + 6. BUG REPORTS + 7. OTHER DRIVES + 8. IF YOU DON'T SUCCEED ... DEBUGGING + 9. TECHNICAL HISTORY OF THE DRIVER + 10. ACKNOWLEDGMENTS + 11. PROGRAMMING ADD ONS: CDPLAY.C + APPENDIX: Source code of cdplay.c +---------------------------------------------------------------------------- + +1. NOTE +This software has been successfully in alpha and beta test and is part of +the standard kernel since kernel 1.1.8x since December 1994. It works with +AZTECH CDA268-01A, ORCHID CDS-3110, ORCHID/WEARNES CDD110 and CONRAD TXC +(Nr.99 31 23 -series 04) and has proven to be stable with kernel +versions 1.0.9 and newer. But with any software there still may be bugs in it. +So if you encounter problems, you are invited to help us improve this software. +Please send me a detailed bug report (see chapter BUG REPORTS). You are also +invited in helping us to increase the number of drives, which are supported. + +Please read the README-files carefully and always keep a backup copy of your +old kernel, in order to reboot if something goes wrong! + +2. INSTALLATION +The driver consists of a header file 'aztcd.h', which normally should reside +in /usr/src/linux/drivers/cdrom and the source code 'aztcd.c', which normally +resides in the same place. It uses /dev/aztcd (/dev/aztcd0 in some distri- +butions), which must be a valid block device with major number 29 and reside +in directory /dev. To mount a CD-ROM, your kernel needs to have the ISO9660- +filesystem support included. + +PLEASE NOTE: aztcd.c has been developed in parallel to the linux kernel, +which had and is having many major and minor changes which are not backward +compatible. Quite definitely aztcd.c version 1.80 and newer will NOT work +in kernels older than 1.3.33. So please always use the most recent version +of aztcd.c with the appropriate linux-kernel. + +3. CONFIGURING YOUR KERNEL +If your kernel is already configured for using the AZTECH driver you will +see the following message while Linux boots: + Aztech CD-ROM Init: DriverVersion=<version number> BaseAddress=<baseaddress> + Aztech CD-ROM Init: FirmwareVersion=<firmware version id of your I/O-card>>> + Aztech CD-ROM Init: <drive type> detected + Aztech CD-ROM Init: End +If the message looks different and you are sure to have a supported drive, +it may have a different base address. The Aztech driver does look for the +CD-ROM drive at the base address specified in aztcd.h at compile time. This +address can be overwritten by boot parameter aztcd=....You should reboot and +start Linux with boot parameter aztcd=<base address>, e.g. aztcd=0x320. If +you do not know the base address, start your PC with DOS and look at the boot +message of your CD-ROM's DOS driver. If that still does not help, use boot +parameter aztcd=<base address>,0x79 , this tells aztcd to try a little harder. +aztcd may be configured to use autoprobing the base address by recompiling +it (see chapter 4.). + +If the message looks correct, as user 'root' you should be able to mount the +drive by + mount -t iso9660 -r /dev/aztcd0 /mnt +and use it as any other filesystem. (If this does not work, check if +/dev/aztcd0 and /mnt do exist and create them, if necessary by doing + mknod /dev/aztcd0 b 29 0 + mkdir /mnt + +If you still get a different message while Linux boots or when you get the +message, that the ISO9660-filesystem is not supported by your kernel, when +you try to mount the CD-ROM drive, you have to recompile your kernel. + +If you do *not* have an Aztech/Orchid/Okano/Wearnes/TXC drive and want to +bypass drive detection during Linux boot up, start with boot parameter aztcd=0. + +Most distributions nowadays do contain a boot disk image containing aztcd. +Please note, that this driver will not work with IDE/ATAPI drives! With these +you must use ide-cd.c instead. + +4. RECOMPILING YOUR KERNEL +If your kernel is not yet configured for the AZTECH driver and the ISO9660- +filesystem, you have to recompile your kernel: + +- Edit aztcd.h to set the I/O-address to your I/O-Base address (AZT_BASE_ADDR), + the driver does not use interrupts or DMA, so if you are using an AZTECH + CD268, an ORCHID CD-3110 or ORCHID/WEARNES CDD110 that's the only item you + have to set up. If you have a soundcard, read chapter 4.2. + Users of other drives should read chapter OTHER DRIVES of this file. + You also can configure that address by kernel boot parameter aztcd=... +- aztcd may be configured to use autoprobing the base address by setting + AZT_BASE_ADDR to '-1'. In that case aztcd probes the addresses listed + under AZT_BASE_AUTO. But please remember, that autoprobing always may + incorrectly influence other hardware components too! +- There are some other points, which may be configured, e.g. auto-eject the + CD when unmounting a drive, tray locking etc., see aztcd.h for details. +- If you're using a linux kernel version prior to 2.1.0, in aztcd.h + uncomment the line '#define AZT_KERNEL_PRIOR_2_1' +- Build a new kernel, configure it for 'Aztech/Orchid/Okano/Wearnes support' + (if you want aztcd to be part of the kernel). Do not configure it for + 'Aztech... support', if you want to use aztcd as a run time loadable module. + But in any case you must have the ISO9660-filesystem included in your + kernel. +- Activate the new kernel, normally this is done by running LILO (don't for- + get to configure it before and to keep a copy of your old kernel in case + something goes wrong!). +- Reboot +- If you've included aztcd in your kernel, you now should see during boot + some messages like + Aztech CD-ROM Init: DriverVersion=<version number> BaseAddress=<baseaddress> + Aztech CD-ROM Init: FirmwareVersion=<firmware version id of your I/O-card> + Aztech CD-ROM Init: <drive type> detected + Aztech CD-ROM Init: End +- If you have not included aztcd in your kernel, but want to load aztcd as a + run time loadable module see 4.1. +- If the message looks correct, as user 'root' you should be able to mount + the drive by + mount -t iso9660 -r /dev/aztcd0 /mnt + and use it as any other filesystem. (If this does not work, check if + /dev/aztcd0 and /mnt do exist and create them, if necessary by doing + mknod /dev/aztcd0 b 29 0 + mkdir /mnt +- If this still does not help, see chapters OTHER DRIVES and DEBUGGING. + +4.1 AZTCD AS A RUN-TIME LOADABLE MODULE +If you do not need aztcd permanently, you can also load and remove the driver +during runtime via insmod and rmmod. To build aztcd as a loadable module you +must configure your kernel for AZTECH module support (answer 'm' when con- +figuring the kernel). Anyhow, you may run into problems, if the version of +your boot kernel is not the same than the source kernel version, from which +you create the modules. So rebuild your kernel, if necessary. + +Now edit the base address of your AZTECH interface card in +/usr/src/linux/drivers/cdrom/aztcd.h to the appropriate value. +aztcd may be configured to use autoprobing the base address by setting +AZT_BASE_ADDR to '-1'. In that case aztcd probes the addresses listed +under AZT_BASE_AUTO. But please remember, that autoprobing always may +incorrectly influence other hardware components too! +There are also some special features which may be configured, e.g. +auto-eject a CD when unmounting the drive etc; see aztcd.h for details. +Then change to /usr/src/linux and do a + make modules + make modules_install +After that you can run-time load the driver via + insmod /lib/modules/X.X.X/misc/aztcd.o +and remove it via rmmod aztcd. +If you did not set the correct base address in aztcd.h, you can also supply the +base address when loading the driver via + insmod /lib/modules/X.X.X/misc/aztcd.o aztcd=<base address> +Again specifying aztcd=-1 will cause autoprobing. +If you do not have the iso9660-filesystem in your boot kernel, you also have +to load it before you can mount the CDROM: + insmod /lib/modules/X.X.X/fs/isofs.o +The mount procedure works as described in 4. above. +(In all commands 'X.X.X' is the current linux kernel version number) + +4.2 CDROM CONNECTED TO A SOUNDCARD +Most soundcards do have a bus interface to the CDROM-drive. In many cases +this soundcard needs to be configured, before the CDROM can be used. This +configuration procedure consists of writing some kind of initialization +data to the soundcard registers. The AZTECH-CDROM driver in the moment does +only support one type of soundcard (SoundWave32). Users of other soundcards +should try to boot DOS first and let their DOS drivers initialize the +soundcard and CDROM, then warm boot (or use loadlin) their PC to start +Linux. +Support for the CDROM-interface of SoundWave32-soundcards is directly +implemented in the AZTECH driver. Please edit linux/drivers/cdrom/aztdc.h, +uncomment line '#define AZT_SW32' and set the appropriate value for +AZT_BASE_ADDR and AZT_SW32_BASE_ADDR. This support was tested with an Orchid +CDS-3110 connected to a SoundWave32. +If you want your soundcard to be supported, find out, how it needs to be +configured and mail me (see 6.) the appropriate information. + +5. KNOWN PROBLEMS, FUTURE DEVELOPMENTS +5.1 MULTISESSION SUPPORT +Multisession support for CD's still is a myth. I implemented and tested a basic +support for multisession and XA CDs, but I still have not enough CDs and appli- +cations to test it rigorously. So if you'd like to help me, please contact me +(Email address see below). As of version 1.4 and newer you can enable the +multisession support in aztcd.h by setting AZT_MULTISESSION to 1. Doing so +will cause the ISO9660-filesystem to deal with multisession CDs, ie. redirect +requests to the Table of Contents (TOC) information from the last session, +which contains the info of all previous sessions etc.. If you do set +AZT_MULTISESSION to 0, you can use multisession CDs anyway. In that case the +drive's firmware will do automatic redirection. For the ISO9660-filesystem any +multisession CD will then look like a 'normal' single session CD. But never- +theless the data of all sessions are viewable and accessible. So with practical- +ly all real world applications you won't notice the difference. But as future +applications may make use of advanced multisession features, I've started to +implement the interface for the ISO9660 multisession interface via ioctl +CDROMMULTISESSION. + +5.2 STATUS RECOGNITION +The drive status recognition does not work correctly in all cases. Changing +a disk or having the door open, when a drive is already mounted, is detected +by the Aztech driver itself, but nevertheless causes multiple read attempts +by the different layers of the ISO9660-filesystem driver, which finally timeout, +so you have to wait quite a little... But isn't it bad style to change a disk +in a mounted drive, anyhow ?! + +The driver uses busy wait in most cases for the drive handshake (macros +STEN_LOW and DTEN_LOW). I tested with a 486/DX2 at 66MHz and a Pentium at +60MHz and 90MHz. Whenever you use a much faster machine you are likely to get +timeout messages. In that case edit aztcd.h and increase the timeout value +AZT_TIMEOUT. + +For some 'slow' drive commands I implemented waiting with a timer waitqueue +(macro STEN_LOW_WAIT). If you get this timeout message, you may also edit +aztcd.h and increase the timeout value AZT_STATUS_DELAY. The waitqueue has +shown to be a little critical. If you get kernel panic messages, edit aztcd.c +and substitute STEN_LOW_WAIT by STEN_LOW. Busy waiting with STEN_LOW is more +stable, but also causes CPU overhead. + +5.3 DOSEMU's CD-ROM SUPPORT +With release 1.20 aztcd was modified to allow access to CD-ROMS when running +under dosemu-0.60.0 aztcd-versions before 1.20 are most likely to crash +Linux, when a CD-ROM is accessed under dosemu. This problem has partly been +fixed, but still when accessing a directory for the first time the system +might hang for some 30sec. So be patient, when using dosemu's CD-ROM support +in combination with aztcd :-) ! +This problem has now (July 1995) been fixed by a modification to dosemu's +CD-ROM driver. The new version came with dosemu-0.60.2, see dosemu's +README.CDROM. + +6. BUG REPORTS +Please send detailed bug reports and bug fixes via EMail to + + Werner.Zimmermann@fht-esslingen.de + +Please include a description of your CD-ROM drive type and interface card, +the exact firmware message during Linux bootup, the version number of the +AZTECH-CDROM-driver and the Linux kernel version. Also a description of your +system's other hardware could be of interest, especially microprocessor type, +clock frequency, other interface cards such as soundcards, ethernet adapter, +game cards etc.. + +I will try to collect the reports and make the necessary modifications from +time to time. I may also come back to you directly with some bug fixes and +ask you to do further testing and debugging. + +Editors of CD-ROMs are invited to send a 'cooperation' copy of their +CD-ROMs to the volunteers, who provided the CD-ROM support for Linux. My +snail mail address for such 'stuff' is + Prof. Dr. W. Zimmermann + Fachhochschule fuer Technik Esslingen + Fachbereich IT + Flandernstrasse 101 + D-73732 Esslingen + Germany + + +7. OTHER DRIVES +The following drives ORCHID CDS3110, OKANO CDD110, WEARNES CDD110 and Conrad +TXC Nr. 993123-series 04 nearly look the same as AZTECH CDA268-01A, especially +they seem to use the same command codes. So it was quite simple to make the +AZTECH driver work with these drives. + +Unfortunately I do not have any of these drives available, so I couldn't test +it myself. In some installations, it seems necessary to initialize the drive +with the DOS driver before (especially if combined with a sound card) and then +do a warm boot (CTRL-ALT-RESET) or start Linux from DOS, e.g. with 'loadlin'. + +If you do not succeed, read chapter DEBUGGING. Thanks in advance! + +Sorry for the inconvenience, but it is difficult to develop for hardware, +which you don't have available for testing. So if you like, please help us. + +If you do have a CyCDROM CR520ie thanks to Hilmar Berger's help your chances +are good, that it will work with aztcd. The CR520ie is sold as an IDE-drive +and really is connected to the IDE interface (primary at 0x1F0 or secondary +at 0x170, configured as slave, not as master). Nevertheless it is not ATAPI +compatible but still uses Aztech's command codes. + + +8. DEBUGGING : IF YOU DON'T SUCCEED, TRY THE FOLLOWING +-reread the complete README file +-make sure, that your drive is hardware configured for + transfer mode: polled + IRQ: not used + DMA: not used + Base Address: something like 300, 320 ... + You can check this, when you start the DOS driver, which came with your + drive. By appropriately configuring the drive and the DOS driver you can + check, whether your drive does operate in this mode correctly under DOS. If + it does not operate under DOS, it won't under Linux. + If your drive's base address is something like 0x170 or 0x1F0 (and it is + not a CyCDROM CR520ie or CR 940ie) you most likely are having an IDE/ATAPI- + compatible drive, which is not supported by aztcd.c, use ide-cd.c instead. + Make sure the Base Address is configured correctly in aztcd.h, also make + sure, that /dev/aztcd0 exists with the correct major number (compare it with + the entry in file /usr/include/linux/major.h for the Aztech drive). +-insert a CD-ROM and close the tray +-cold boot your PC (i.e. via the power on switch or the reset button) +-if you start Linux via DOS, e.g. using loadlin, make sure, that the DOS + driver for the CD-ROM drive is not loaded (comment out the calling lines + in DOS' config.sys!) +-look for the aztcd: init message during Linux init and note them exactly +-log in as root and do a mount -t iso9660 /dev/aztcd0 /mnt +-if you don't succeed in the first time, try several times. Try also to open + and close the tray, then mount again. Please note carefully all commands + you typed in and the aztcd-messages, which you get. +-if you get an 'Aztech CD-ROM init: aborted' message, read the remarks about + the version string below. + +If this does not help, do the same with the following differences +-start DOS before; make now sure, that the DOS driver for the CD-ROM is + loaded under DOS (i.e. uncomment it again in config.sys) +-warm boot your PC (i.e. via CTRL-ALT-DEL) + if you have it, you can also start via loadlin (try both). + ... + Again note all commands and the aztcd-messages. + +If you see STEN_LOW or STEN_LOW_WAIT error messages, increase the timeout +values. + +If this still does not help, +-look in aztcd.c for the lines #if 0 + #define AZT_TEST1 + ... + #endif + and substitute '#if 0' by '#if 1'. +-recompile your kernel and repeat the above two procedures. You will now get + a bundle of debugging messages from the driver. Again note your commands + and the appropriate messages. If you have syslogd running, these messages + may also be found in syslogd's kernel log file. Nevertheless in some + installations syslogd does not yet run, when init() is called, thus look for + the aztcd-messages during init, before the login-prompt appears. + Then look in aztcd.c, to find out, what happened. The normal calling sequence + is: aztcd_init() during Linux bootup procedure init() + after doing a 'mount -t iso9660 /dev/aztcd0 /mnt' the normal calling sequence is + aztcd_open() -> Status 2c after cold reboot with CDROM or audio CD inserted + -> Status 8 after warm reboot with CDROM inserted + -> Status 2e after cold reboot with no disk, closed tray + -> Status 6e after cold reboot, mount with door open + aztUpdateToc() + aztGetDiskInfo() + aztGetQChannelInfo() repeated several times + aztGetToc() + aztGetQChannelInfo() repeated several times + a list of track information + do_aztcd_request() } + azt_transfer() } repeated several times + azt_poll } + Check, if there is a difference in the calling sequence or the status flags! + + There are a lot of other messages, eg. the ACMD-command code (defined in + aztcd.h), status info from the getAztStatus-command and the state sequence of + the finite state machine in azt_poll(). The most important are the status + messages, look how they are defined and try to understand, if they make + sense in the context where they appear. With a CD-ROM inserted the status + should always be 8, except in aztcd_open(). Try to open the tray, insert an + audio disk, insert no disk or reinsert the CD-ROM and check, if the status + bits change accordingly. The status bits are the most likely point, where + the drive manufacturers may implement changes. + +If you still don't succeed, a good point to start is to look in aztcd.c in +function aztcd_init, where the drive should be detected during init. Do the +following: +-reboot the system with boot parameter 'aztcd=<your base address>,0x79'. With + parameter 0x79 most of the drive version detection is bypassed. After that + you should see the complete version string including leading and trailing + blanks during init. + Now adapt the statement + if ((result[1]=='A')&&(result[2]=='Z' ...) + in aztcd_init() to exactly match the first 3 or 4 letters you have seen. +-Another point is the 'smart' card detection feature in aztcd_init(). Normally + the CD-ROM drive is ready, when aztcd_init is trying to read the version + string and a time consuming ACMD_SOFT_RESET command can be avoided. This is + detected by looking, if AFL_OP_OK can be read correctly. If the CD-ROM drive + hangs in some unknown state, e.g. because of an error before a warm start or + because you first operated under DOS, even the version string may be correct, + but the following commands will not. Then change the code in such a way, + that the ACMD_SOFT_RESET is issued in any case, by substituting the + if-statement 'if ( ...=AFL_OP_OK)' by 'if (1)'. + +If you succeed, please mail me the exact version string of your drive and +the code modifications, you have made together with a short explanation. +If you don't succeed, you may mail me the output of the debugging messages. +But remember, they are only useful, if they are exact and complete and you +describe in detail your hardware setup and what you did (cold/warm reboot, +with/without DOS, DOS-driver started/not started, which Linux-commands etc.) + + +9. TECHNICAL HISTORY OF THE DRIVER +The AZTECH-Driver is a rework of the Mitsumi-Driver. Four major items had to +be reworked: + +a) The Mitsumi drive does issue complete status information acknowledging +each command, the Aztech drive does only signal that the command was +processed. So whenever the complete status information is needed, an extra +ACMD_GET_STATUS command is issued. The handshake procedure for the drive +can be found in the functions aztSendCmd(), sendAztCmd() and getAztStatus(). + +b) The Aztech Drive does not have a ACMD_GET_DISK_INFO command, so the +necessary info about the number of tracks (firstTrack, lastTrack), disk +length etc. has to be read from the TOC in the lead in track (see function +aztGetDiskInfo()). + +c) Whenever data is read from the drive, the Mitsumi drive is started with a +command to read an indefinite (0xffffff) number of sectors. When the appropriate +number of sectors is read, the drive is stopped by a ACDM_STOP command. This +does not work with the Aztech drive. I did not find a way to stop it. The +stop and pause commands do only work in AUDIO mode but not in DATA mode. +Therefore I had to modify the 'finite state machine' in function azt_poll to +only read a certain number of sectors and then start a new read on demand. As I +have not completely understood, how the buffer/caching scheme of the Mitsumi +driver was implemented, I am not sure, if I have covered all cases correctly, +whenever you get timeout messages, the bug is most likely to be in that +function azt_poll() around switch(cmd) .... case ACD_S_DATA. + +d) I did not get information about changing drive mode. So I doubt, that the +code around function azt_poll() case AZT_S_MODE does work. In my test I have +not been able to switch to reading in raw mode. For reading raw mode, Aztech +uses a different command than for cooked mode, which I only have implemen- +ted in the ioctl-section but not in the section which is used by the ISO9660. + +The driver was developed on an AST PC with Intel 486/DX2, 8MB RAM, 340MB IDE +hard disk and on an AST PC with Intel Pentium 60MHz, 16MB RAM, 520MB IDE +running Linux kernel version 1.0.9 from the LST 1.8 Distribution. The kernel +was compiled with gcc.2.5.8. My CD-ROM drive is an Aztech CDA268-01A. My +drive says, that it has Firmware Version AZT26801A1.3. It came with an ISA-bus +interface card and works with polled I/O without DMA and without interrupts. +The code for all other drives was 'remote' tested and debugged by a number of +volunteers on the Internet. + +Points, where I feel that possible problems might be and all points where I +did not completely understand the drive's behaviour or trust my own code are +marked with /*???*/ in the source code. There are also some parts in the +Mitsumi driver, where I did not completely understand their code. + + +10. ACKNOWLEDGMENTS +Without the help of P.Bush, Aztech, who delivered technical information +about the Aztech Drive and without the help of E.Moenkeberg, GWDG, who did a +great job in analyzing the command structure of various CD-ROM drives, this +work would not have been possible. E.Moenkeberg was also a great help in +making the software 'kernel ready' and in answering many of the CDROM-related +questions in the newsgroups. He really is *the* Linux CD-ROM guru. Thanks +also to all the guys on the Internet, who collected valuable technical +information about CDROMs. + +Joe Nardone (joe@access.digex.net) was a patient tester even for my first +trial, which was more than slow, and made suggestions for code improvement. +Especially the 'finite state machine' azt_poll() was rewritten by Joe to get +clean C code and avoid the ugly 'gotos', which I copied from mcd.c. + +Robby Schirmer (schirmer@fmi.uni-passau.de) tested the audio stuff (ioctls) +and suggested a lot of patches for them. + +Joseph Piskor and Peter Nugent were the first users with the ORCHID CD3110 +and also were very patient with the problems which occurred. + +Reinhard Max delivered the information for the CDROM-interface of the +SoundWave32 soundcards. + +Jochen Kunz and Olaf Kaluza delivered the information for supporting Conrad's +TXC drive. + +Hilmar Berger delivered the patches for supporting CyCDROM CR520ie. + +Anybody, who is interested in these items should have a look at 'ftp.gwdg.de', +directory 'pub/linux/cdrom' and at 'ftp.cdrom.com', directory 'pub/cdrom'. + +11. PROGRAMMING ADD ONs: cdplay.c +You can use the ioctl-functions included in aztcd.c in your own programs. As +an example on how to do this, you will find a tiny CD Player for audio CDs +named 'cdplay.c'. It allows you to play audio CDs. You can play a specified +track, pause and resume or skip tracks forward and backwards. If you quit the +program without stopping the drive, playing is continued. You can also +(mis)use cdplay to read and hexdump data disks. You can find the code in the +APPENDIX of this file, which you should cut out with an editor and store in a +separate file 'cdplay.c'. To compile it and make it executable, do + gcc -s -Wall -O2 -L/usr/lib cdplay.c -o /usr/local/bin/cdplay # compiles it + chmod +755 /usr/local/bin/cdplay # makes it executable + ln -s /dev/aztcd0 /dev/cdrom # creates a link + (for /usr/lib substitute the top level directory, where your include files + reside, and for /usr/local/bin the directory, where you want the executable + binary to reside ) + +You have to set the correct permissions for cdplay *and* for /dev/mcd0 or +/dev/aztcd0 in order to use it. Remember, that you should not have /dev/cdrom +mounted, when you're playing audio CDs. + +This program is just a hack for testing the ioctl-functions in aztcd.c. I will +not maintain it, so if you run into problems, discard it or have a look into +the source code 'cdplay.c'. The program does only contain a minimum of user +protection and input error detection. If you use the commands in the wrong +order or if you try to read a CD at wrong addresses, you may get error messages +or even hang your machine. If you get STEN_LOW, STEN_LOW_WAIT or segment violation +error messages when using cdplay, after that, the system might not be stable +any more, so you'd better reboot. As the ioctl-functions run in kernel mode, +most normal Linux-multitasking protection features do not work. By using +uninitialized 'wild' pointers etc., it is easy to write to other users' data +and program areas, destroy kernel tables etc.. So if you experiment with ioctls +as always when you are doing systems programming and kernel hacking, you +should have a backup copy of your system in a safe place (and you also +should try restoring from a backup copy first)! + +A reworked and improved version called 'cdtester.c', which has yet more +features for testing CDROM-drives can be found in +Documentation/cdrom/sbpcd, written by E.Moenkeberg. + +Werner Zimmermann +Fachhochschule fuer Technik Esslingen +(EMail: Werner.Zimmermann@fht-esslingen.de) +October, 1997 + +--------------------------------------------------------------------------- +APPENDIX: Source code of cdplay.c + +/* Tiny Audio CD Player + + Copyright 1994, 1995, 1996 Werner Zimmermann (Werner.Zimmermann@fht-esslingen.de) + +This program originally was written to test the audio functions of the +AZTECH.CDROM-driver, but it should work with every CD-ROM drive. Before +using it, you should set a symlink from /dev/cdrom to your real CDROM +device. + +The GNU General Public License applies to this program. + +History: V0.1 W.Zimmermann: First release. Nov. 8, 1994 + V0.2 W.Zimmermann: Enhanced functionality. Nov. 9, 1994 + V0.3 W.Zimmermann: Additional functions. Nov. 28, 1994 + V0.4 W.Zimmermann: fixed some bugs. Dec. 17, 1994 + V0.5 W.Zimmermann: clean 'scanf' commands without compiler warnings + Jan. 6, 1995 + V0.6 W.Zimmermann: volume control (still experimental). Jan. 24, 1995 + V0.7 W.Zimmermann: read raw modified. July 26, 95 +*/ + +#include <stdio.h> +#include <ctype.h> +#include <sys/ioctl.h> +#include <sys/types.h> +#include <fcntl.h> +#include <unistd.h> +#include <linux/cdrom.h> +#include <linux/../../drivers/cdrom/aztcd.h> + +void help(void) +{ printf("Available Commands: STOP s EJECT/CLOSE e QUIT q\n"); + printf(" PLAY TRACK t PAUSE p RESUME r\n"); + printf(" NEXT TRACK n REPEAT LAST l HELP h\n"); + printf(" SUB CHANNEL c TRACK INFO i PLAY AT a\n"); + printf(" READ d READ RAW w VOLUME v\n"); +} + +int main(void) +{ int handle; + unsigned char command=' ', ini=0, first=1, last=1; + unsigned int cmd, i,j,k, arg1,arg2,arg3; + struct cdrom_ti ti; + struct cdrom_tochdr tocHdr; + struct cdrom_subchnl subchnl; + struct cdrom_tocentry entry; + struct cdrom_msf msf; + union { struct cdrom_msf msf; + unsigned char buf[CD_FRAMESIZE_RAW]; + } azt; + struct cdrom_volctrl volctrl; + + printf("\nMini-Audio CD-Player V0.72 (C) 1994,1995,1996 W.Zimmermann\n"); + handle=open("/dev/cdrom",O_RDWR); + ioctl(handle,CDROMRESUME); + + if (handle<=0) + { printf("Drive Error: already playing, no audio disk, door open\n"); + printf(" or no permission (you must be ROOT in order to use this program)\n"); + } + else + { help(); + while (1) + { printf("Type command (h = help): "); + scanf("%s",&command); + switch (command) + { case 'e': cmd=CDROMEJECT; + ioctl(handle,cmd); + break; + case 'p': if (!ini) + { printf("Command not allowed - play track first\n"); + } + else + { cmd=CDROMPAUSE; + if (ioctl(handle,cmd)) printf("Drive Error\n"); + } + break; + case 'r': if (!ini) + { printf("Command not allowed - play track first\n"); + } + else + { cmd=CDROMRESUME; + if (ioctl(handle,cmd)) printf("Drive Error\n"); + } + break; + case 's': cmd=CDROMPAUSE; + if (ioctl(handle,cmd)) printf("Drive error or already stopped\n"); + cmd=CDROMSTOP; + if (ioctl(handle,cmd)) printf("Drive error\n"); + break; + case 't': cmd=CDROMREADTOCHDR; + if (ioctl(handle,cmd,&tocHdr)) printf("Drive Error\n"); + first=tocHdr.cdth_trk0; + last= tocHdr.cdth_trk1; + if ((first==0)||(first>last)) + { printf ("--could not read TOC\n"); + } + else + { printf("--first track: %d --last track: %d --enter track number: ",first,last); + cmd=CDROMPLAYTRKIND; + scanf("%i",&arg1); + ti.cdti_trk0=arg1; + if (ti.cdti_trk0<first) ti.cdti_trk0=first; + if (ti.cdti_trk0>last) ti.cdti_trk0=last; + ti.cdti_ind0=0; + ti.cdti_trk1=last; + ti.cdti_ind1=0; + if (ioctl(handle,cmd,&ti)) printf("Drive Error\n"); + ini=1; + } + break; + case 'n': if (!ini++) + { if (ioctl(handle,CDROMREADTOCHDR,&tocHdr)) printf("Drive Error\n"); + first=tocHdr.cdth_trk0; + last= tocHdr.cdth_trk1; + ti.cdti_trk0=first-1; + } + if ((first==0)||(first>last)) + { printf ("--could not read TOC\n"); + } + else + { cmd=CDROMPLAYTRKIND; + if (++ti.cdti_trk0 > last) ti.cdti_trk0=last; + ti.cdti_ind0=0; + ti.cdti_trk1=last; + ti.cdti_ind1=0; + if (ioctl(handle,cmd,&ti)) printf("Drive Error\n"); + ini=1; + } + break; + case 'l': if (!ini++) + { if (ioctl(handle,CDROMREADTOCHDR,&tocHdr)) printf("Drive Error\n"); + first=tocHdr.cdth_trk0; + last= tocHdr.cdth_trk1; + ti.cdti_trk0=first+1; + } + if ((first==0)||(first>last)) + { printf ("--could not read TOC\n"); + } + else + { cmd=CDROMPLAYTRKIND; + if (--ti.cdti_trk0 < first) ti.cdti_trk0=first; + ti.cdti_ind0=0; + ti.cdti_trk1=last; + ti.cdti_ind1=0; + if (ioctl(handle,cmd,&ti)) printf("Drive Error\n"); + ini=1; + } + break; + case 'c': subchnl.cdsc_format=CDROM_MSF; + if (ioctl(handle,CDROMSUBCHNL,&subchnl)) + printf("Drive Error\n"); + else + { printf("AudioStatus:%s Track:%d Mode:%d MSF=%d:%d:%d\n", \ + subchnl.cdsc_audiostatus==CDROM_AUDIO_PLAY ? "PLAYING":"NOT PLAYING",\ + subchnl.cdsc_trk,subchnl.cdsc_adr, \ + subchnl.cdsc_absaddr.msf.minute, subchnl.cdsc_absaddr.msf.second, \ + subchnl.cdsc_absaddr.msf.frame); + } + break; + case 'i': if (!ini) + { printf("Command not allowed - play track first\n"); + } + else + { cmd=CDROMREADTOCENTRY; + printf("Track No.: "); + scanf("%d",&arg1); + entry.cdte_track=arg1; + if (entry.cdte_track<first) entry.cdte_track=first; + if (entry.cdte_track>last) entry.cdte_track=last; + entry.cdte_format=CDROM_MSF; + if (ioctl(handle,cmd,&entry)) + { printf("Drive error or invalid track no.\n"); + } + else + { printf("Mode %d Track, starts at %d:%d:%d\n", \ + entry.cdte_adr,entry.cdte_addr.msf.minute, \ + entry.cdte_addr.msf.second,entry.cdte_addr.msf.frame); + } + } + break; + case 'a': cmd=CDROMPLAYMSF; + printf("Address (min:sec:frame) "); + scanf("%d:%d:%d",&arg1,&arg2,&arg3); + msf.cdmsf_min0 =arg1; + msf.cdmsf_sec0 =arg2; + msf.cdmsf_frame0=arg3; + if (msf.cdmsf_sec0 > 59) msf.cdmsf_sec0 =59; + if (msf.cdmsf_frame0> 74) msf.cdmsf_frame0=74; + msf.cdmsf_min1=60; + msf.cdmsf_sec1=00; + msf.cdmsf_frame1=00; + if (ioctl(handle,cmd,&msf)) + { printf("Drive error or invalid address\n"); + } + break; +#ifdef AZT_PRIVATE_IOCTLS /*not supported by every CDROM driver*/ + case 'd': cmd=CDROMREADCOOKED; + printf("Address (min:sec:frame) "); + scanf("%d:%d:%d",&arg1,&arg2,&arg3); + azt.msf.cdmsf_min0 =arg1; + azt.msf.cdmsf_sec0 =arg2; + azt.msf.cdmsf_frame0=arg3; + if (azt.msf.cdmsf_sec0 > 59) azt.msf.cdmsf_sec0 =59; + if (azt.msf.cdmsf_frame0> 74) azt.msf.cdmsf_frame0=74; + if (ioctl(handle,cmd,&azt.msf)) + { printf("Drive error, invalid address or unsupported command\n"); + } + k=0; + getchar(); + for (i=0;i<128;i++) + { printf("%4d:",i*16); + for (j=0;j<16;j++) + { printf("%2x ",azt.buf[i*16+j]); + } + for (j=0;j<16;j++) + { if (isalnum(azt.buf[i*16+j])) + printf("%c",azt.buf[i*16+j]); + else + printf("."); + } + printf("\n"); + k++; + if (k>=20) + { printf("press ENTER to continue\n"); + getchar(); + k=0; + } + } + break; + case 'w': cmd=CDROMREADRAW; + printf("Address (min:sec:frame) "); + scanf("%d:%d:%d",&arg1,&arg2,&arg3); + azt.msf.cdmsf_min0 =arg1; + azt.msf.cdmsf_sec0 =arg2; + azt.msf.cdmsf_frame0=arg3; + if (azt.msf.cdmsf_sec0 > 59) azt.msf.cdmsf_sec0 =59; + if (azt.msf.cdmsf_frame0> 74) azt.msf.cdmsf_frame0=74; + if (ioctl(handle,cmd,&azt)) + { printf("Drive error, invalid address or unsupported command\n"); + } + k=0; + for (i=0;i<147;i++) + { printf("%4d:",i*16); + for (j=0;j<16;j++) + { printf("%2x ",azt.buf[i*16+j]); + } + for (j=0;j<16;j++) + { if (isalnum(azt.buf[i*16+j])) + printf("%c",azt.buf[i*16+j]); + else + printf("."); + } + printf("\n"); + k++; + if (k>=20) + { getchar(); + k=0; + } + } + break; +#endif + case 'v': cmd=CDROMVOLCTRL; + printf("--Channel 0 Left (0-255): "); + scanf("%d",&arg1); + printf("--Channel 1 Right (0-255): "); + scanf("%d",&arg2); + volctrl.channel0=arg1; + volctrl.channel1=arg2; + volctrl.channel2=0; + volctrl.channel3=0; + if (ioctl(handle,cmd,&volctrl)) + { printf("Drive error or unsupported command\n"); + } + break; + case 'q': if (close(handle)) printf("Drive Error: CLOSE\n"); + exit(0); + case 'h': help(); + break; + default: printf("unknown command\n"); + break; + } + } + } + return 0; +} diff --git a/Documentation/cdrom/cdrom-standard.tex b/Documentation/cdrom/cdrom-standard.tex new file mode 100644 index 000000000000..92f94e597582 --- /dev/null +++ b/Documentation/cdrom/cdrom-standard.tex @@ -0,0 +1,1022 @@ +\documentclass{article} +\def\version{$Id: cdrom-standard.tex,v 1.9 1997/12/28 15:42:49 david Exp $} +\newcommand{\newsection}[1]{\newpage\section{#1}} + +\evensidemargin=0pt +\oddsidemargin=0pt +\topmargin=-\headheight \advance\topmargin by -\headsep +\textwidth=15.99cm \textheight=24.62cm % normal A4, 1'' margin + +\def\linux{{\sc Linux}} +\def\cdrom{{\sc cd-rom}} +\def\UCD{{\sc Uniform cd-rom Driver}} +\def\cdromc{{\tt {cdrom.c}}} +\def\cdromh{{\tt {cdrom.h}}} +\def\fo{\sl} % foreign words +\def\ie{{\fo i.e.}} +\def\eg{{\fo e.g.}} + +\everymath{\it} \everydisplay{\it} +\catcode `\_=\active \def_{\_\penalty100 } +\catcode`\<=\active \def<#1>{{\langle\hbox{\rm#1}\rangle}} + +\begin{document} +\title{A \linux\ \cdrom\ standard} +\author{David van Leeuwen\\{\normalsize\tt david@ElseWare.cistron.nl} +\\{\footnotesize updated by Erik Andersen {\tt(andersee@debian.org)}} +\\{\footnotesize updated by Jens Axboe {\tt(axboe@image.dk)}}} +\date{12 March 1999} + +\maketitle + +\newsection{Introduction} + +\linux\ is probably the Unix-like operating system that supports +the widest variety of hardware devices. The reasons for this are +presumably +\begin{itemize} +\item + The large list of hardware devices available for the many platforms + that \linux\ now supports (\ie, i386-PCs, Sparc Suns, etc.) +\item + The open design of the operating system, such that anybody can write a + driver for \linux. +\item + There is plenty of source code around as examples of how to write a driver. +\end{itemize} +The openness of \linux, and the many different types of available +hardware has allowed \linux\ to support many different hardware devices. +Unfortunately, the very openness that has allowed \linux\ to support +all these different devices has also allowed the behavior of each +device driver to differ significantly from one device to another. +This divergence of behavior has been very significant for \cdrom\ +devices; the way a particular drive reacts to a `standard' $ioctl()$ +call varies greatly from one device driver to another. To avoid making +their drivers totally inconsistent, the writers of \linux\ \cdrom\ +drivers generally created new device drivers by understanding, copying, +and then changing an existing one. Unfortunately, this practice did not +maintain uniform behavior across all the \linux\ \cdrom\ drivers. + +This document describes an effort to establish Uniform behavior across +all the different \cdrom\ device drivers for \linux. This document also +defines the various $ioctl$s, and how the low-level \cdrom\ device +drivers should implement them. Currently (as of the \linux\ 2.1.$x$ +development kernels) several low-level \cdrom\ device drivers, including +both IDE/ATAPI and SCSI, now use this Uniform interface. + +When the \cdrom\ was developed, the interface between the \cdrom\ drive +and the computer was not specified in the standards. As a result, many +different \cdrom\ interfaces were developed. Some of them had their +own proprietary design (Sony, Mitsumi, Panasonic, Philips), other +manufacturers adopted an existing electrical interface and changed +the functionality (CreativeLabs/SoundBlaster, Teac, Funai) or simply +adapted their drives to one or more of the already existing electrical +interfaces (Aztech, Sanyo, Funai, Vertos, Longshine, Optics Storage and +most of the `NoName' manufacturers). In cases where a new drive really +brought its own interface or used its own command set and flow control +scheme, either a separate driver had to be written, or an existing +driver had to be enhanced. History has delivered us \cdrom\ support for +many of these different interfaces. Nowadays, almost all new \cdrom\ +drives are either IDE/ATAPI or SCSI, and it is very unlikely that any +manufacturer will create a new interface. Even finding drives for the +old proprietary interfaces is getting difficult. + +When (in the 1.3.70's) I looked at the existing software interface, +which was expressed through \cdromh, it appeared to be a rather wild +set of commands and data formats.\footnote{I cannot recollect what +kernel version I looked at, then, presumably 1.2.13 and 1.3.34---the +latest kernel that I was indirectly involved in.} It seemed that many +features of the software interface had been added to accommodate the +capabilities of a particular drive, in an {\fo ad hoc\/} manner. More +importantly, it appeared that the behavior of the `standard' commands +was different for most of the different drivers: \eg, some drivers +close the tray if an $open()$ call occurs when the tray is open, while +others do not. Some drivers lock the door upon opening the device, to +prevent an incoherent file system, but others don't, to allow software +ejection. Undoubtedly, the capabilities of the different drives vary, +but even when two drives have the same capability their drivers' +behavior was usually different. + +I decided to start a discussion on how to make all the \linux\ \cdrom\ +drivers behave more uniformly. I began by contacting the developers of +the many \cdrom\ drivers found in the \linux\ kernel. Their reactions +encouraged me to write the \UCD\ which this document is intended to +describe. The implementation of the \UCD\ is in the file \cdromc. This +driver is intended to be an additional software layer that sits on top +of the low-level device drivers for each \cdrom\ drive. By adding this +additional layer, it is possible to have all the different \cdrom\ +devices behave {\em exactly\/} the same (insofar as the underlying +hardware will allow). + +The goal of the \UCD\ is {\em not\/} to alienate driver developers who +have not yet taken steps to support this effort. The goal of \UCD\ is +simply to give people writing application programs for \cdrom\ drives +{\em one\/} \linux\ \cdrom\ interface with consistent behavior for all +\cdrom\ devices. In addition, this also provides a consistent interface +between the low-level device driver code and the \linux\ kernel. Care +is taken that 100\,\% compatibility exists with the data structures and +programmer's interface defined in \cdromh. This guide was written to +help \cdrom\ driver developers adapt their code to use the \UCD\ code +defined in \cdromc. + +Personally, I think that the most important hardware interfaces are +the IDE/ATAPI drives and, of course, the SCSI drives, but as prices +of hardware drop continuously, it is also likely that people may have +more than one \cdrom\ drive, possibly of mixed types. It is important +that these drives behave in the same way. In December 1994, one of the +cheapest \cdrom\ drives was a Philips cm206, a double-speed proprietary +drive. In the months that I was busy writing a \linux\ driver for it, +proprietary drives became obsolete and IDE/ATAPI drives became the +standard. At the time of the last update to this document (November +1997) it is becoming difficult to even {\em find} anything less than a +16 speed \cdrom\ drive, and 24 speed drives are common. + +\newsection{Standardizing through another software level} +\label{cdrom.c} + +At the time this document was conceived, all drivers directly +implemented the \cdrom\ $ioctl()$ calls through their own routines. This +led to the danger of different drivers forgetting to do important things +like checking that the user was giving the driver valid data. More +importantly, this led to the divergence of behavior, which has already +been discussed. + +For this reason, the \UCD\ was created to enforce consistent \cdrom\ +drive behavior, and to provide a common set of services to the various +low-level \cdrom\ device drivers. The \UCD\ now provides another +software-level, that separates the $ioctl()$ and $open()$ implementation +from the actual hardware implementation. Note that this effort has +made few changes which will affect a user's application programs. The +greatest change involved moving the contents of the various low-level +\cdrom\ drivers' header files to the kernel's cdrom directory. This was +done to help ensure that the user is only presented with only one cdrom +interface, the interface defined in \cdromh. + +\cdrom\ drives are specific enough (\ie, different from other +block-devices such as floppy or hard disc drives), to define a set +of common {\em \cdrom\ device operations}, $<cdrom-device>_dops$. +These operations are different from the classical block-device file +operations, $<block-device>_fops$. + +The routines for the \UCD\ interface level are implemented in the file +\cdromc. In this file, the \UCD\ interfaces with the kernel as a block +device by registering the following general $struct\ file_operations$: +$$ +\halign{$#$\ \hfil&$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr +struct& file_operations\ cdrom_fops = \{\hidewidth\cr + &NULL, & lseek \cr + &block_read, & read---general block-dev read \cr + &block_write, & write---general block-dev write \cr + &NULL, & readdir \cr + &NULL, & select \cr + &cdrom_ioctl, & ioctl \cr + &NULL, & mmap \cr + &cdrom_open, & open \cr + &cdrom_release, & release \cr + &NULL, & fsync \cr + &NULL, & fasync \cr + &cdrom_media_changed, & media change \cr + &NULL & revalidate \cr +\};\cr +} +$$ + +Every active \cdrom\ device shares this $struct$. The routines +declared above are all implemented in \cdromc, since this file is the +place where the behavior of all \cdrom-devices is defined and +standardized. The actual interface to the various types of \cdrom\ +hardware is still performed by various low-level \cdrom-device +drivers. These routines simply implement certain {\em capabilities\/} +that are common to all \cdrom\ (and really, all removable-media +devices). + +Registration of a low-level \cdrom\ device driver is now done through +the general routines in \cdromc, not through the Virtual File System +(VFS) any more. The interface implemented in \cdromc\ is carried out +through two general structures that contain information about the +capabilities of the driver, and the specific drives on which the +driver operates. The structures are: +\begin{description} +\item[$cdrom_device_ops$] + This structure contains information about the low-level driver for a + \cdrom\ device. This structure is conceptually connected to the major + number of the device (although some drivers may have different + major numbers, as is the case for the IDE driver). +\item[$cdrom_device_info$] + This structure contains information about a particular \cdrom\ drive, + such as its device name, speed, etc. This structure is conceptually + connected to the minor number of the device. +\end{description} + +Registering a particular \cdrom\ drive with the \UCD\ is done by the +low-level device driver though a call to: +$$register_cdrom(struct\ cdrom_device_info * <device>_info) +$$ +The device information structure, $<device>_info$, contains all the +information needed for the kernel to interface with the low-level +\cdrom\ device driver. One of the most important entries in this +structure is a pointer to the $cdrom_device_ops$ structure of the +low-level driver. + +The device operations structure, $cdrom_device_ops$, contains a list +of pointers to the functions which are implemented in the low-level +device driver. When \cdromc\ accesses a \cdrom\ device, it does it +through the functions in this structure. It is impossible to know all +the capabilities of future \cdrom\ drives, so it is expected that this +list may need to be expanded from time to time as new technologies are +developed. For example, CD-R and CD-R/W drives are beginning to become +popular, and support will soon need to be added for them. For now, the +current $struct$ is: +$$ +\halign{$#$\ \hfil&$#$\ \hfil&\hbox to 10em{$#$\hss}& + $/*$ \rm# $*/$\hfil\cr +struct& cdrom_device_ops\ \{ \hidewidth\cr + &int& (* open)(struct\ cdrom_device_info *, int)\cr + &void& (* release)(struct\ cdrom_device_info *);\cr + &int& (* drive_status)(struct\ cdrom_device_info *, int);\cr + &int& (* media_changed)(struct\ cdrom_device_info *, int);\cr + &int& (* tray_move)(struct\ cdrom_device_info *, int);\cr + &int& (* lock_door)(struct\ cdrom_device_info *, int);\cr + &int& (* select_speed)(struct\ cdrom_device_info *, int);\cr + &int& (* select_disc)(struct\ cdrom_device_info *, int);\cr + &int& (* get_last_session) (struct\ cdrom_device_info *, + struct\ cdrom_multisession *{});\cr + &int& (* get_mcn)(struct\ cdrom_device_info *, struct\ cdrom_mcn *{});\cr + &int& (* reset)(struct\ cdrom_device_info *);\cr + &int& (* audio_ioctl)(struct\ cdrom_device_info *, unsigned\ int, + void *{});\cr + &int& (* dev_ioctl)(struct\ cdrom_device_info *, unsigned\ int, + unsigned\ long);\cr +\noalign{\medskip} + &const\ int& capability;& capability flags \cr + &int& n_minors;& number of active minor devices \cr +\};\cr +} +$$ +When a low-level device driver implements one of these capabilities, +it should add a function pointer to this $struct$. When a particular +function is not implemented, however, this $struct$ should contain a +NULL instead. The $capability$ flags specify the capabilities of the +\cdrom\ hardware and/or low-level \cdrom\ driver when a \cdrom\ drive +is registered with the \UCD. The value $n_minors$ should be a positive +value indicating the number of minor devices that are supported by +the low-level device driver, normally~1. Although these two variables +are `informative' rather than `operational,' they are included in +$cdrom_device_ops$ because they describe the capability of the {\em +driver\/} rather than the {\em drive}. Nomenclature has always been +difficult in computer programming. + +Note that most functions have fewer parameters than their +$blkdev_fops$ counterparts. This is because very little of the +information in the structures $inode$ and $file$ is used. For most +drivers, the main parameter is the $struct$ $cdrom_device_info$, from +which the major and minor number can be extracted. (Most low-level +\cdrom\ drivers don't even look at the major and minor number though, +since many of them only support one device.) This will be available +through $dev$ in $cdrom_device_info$ described below. + +The drive-specific, minor-like information that is registered with +\cdromc, currently contains the following fields: +$$ +\halign{$#$\ \hfil&$#$\ \hfil&\hbox to 10em{$#$\hss}& + $/*$ \rm# $*/$\hfil\cr +struct& cdrom_device_info\ \{ \hidewidth\cr + & struct\ cdrom_device_ops *& ops;& device operations for this major\cr + & struct\ cdrom_device_info *& next;& next device_info for this major\cr + & void *& handle;& driver-dependent data\cr +\noalign{\medskip} + & kdev_t& dev;& device number (incorporates minor)\cr + & int& mask;& mask of capability: disables them \cr + & int& speed;& maximum speed for reading data \cr + & int& capacity;& number of discs in a jukebox \cr +\noalign{\medskip} + &int& options : 30;& options flags \cr + &unsigned& mc_flags : 2;& media-change buffer flags \cr + & int& use_count;& number of times device is opened\cr + & char& name[20];& name of the device type\cr +\}\cr +}$$ +Using this $struct$, a linked list of the registered minor devices is +built, using the $next$ field. The device number, the device operations +struct and specifications of properties of the drive are stored in this +structure. + +The $mask$ flags can be used to mask out some of the capabilities listed +in $ops\to capability$, if a specific drive doesn't support a feature +of the driver. The value $speed$ specifies the maximum head-rate of the +drive, measured in units of normal audio speed (176\,kB/sec raw data or +150\,kB/sec file system data). The value $n_discs$ should reflect the +number of discs the drive can hold simultaneously, if it is designed +as a juke-box, or otherwise~1. The parameters are declared $const$ +because they describe properties of the drive, which don't change after +registration. + +A few registers contain variables local to the \cdrom\ drive. The +flags $options$ are used to specify how the general \cdrom\ routines +should behave. These various flags registers should provide enough +flexibility to adapt to the different users' wishes (and {\em not\/} the +`arbitrary' wishes of the author of the low-level device driver, as is +the case in the old scheme). The register $mc_flags$ is used to buffer +the information from $media_changed()$ to two separate queues. Other +data that is specific to a minor drive, can be accessed through $handle$, +which can point to a data structure specific to the low-level driver. +The fields $use_count$, $next$, $options$ and $mc_flags$ need not be +initialized. + +The intermediate software layer that \cdromc\ forms will perform some +additional bookkeeping. The use count of the device (the number of +processes that have the device opened) is registered in $use_count$. The +function $cdrom_ioctl()$ will verify the appropriate user-memory regions +for read and write, and in case a location on the CD is transferred, +it will `sanitize' the format by making requests to the low-level +drivers in a standard format, and translating all formats between the +user-software and low level drivers. This relieves much of the drivers' +memory checking and format checking and translation. Also, the necessary +structures will be declared on the program stack. + +The implementation of the functions should be as defined in the +following sections. Two functions {\em must\/} be implemented, namely +$open()$ and $release()$. Other functions may be omitted, their +corresponding capability flags will be cleared upon registration. +Generally, a function returns zero on success and negative on error. A +function call should return only after the command has completed, but of +course waiting for the device should not use processor time. + +\subsection{$Int\ open(struct\ cdrom_device_info * cdi, int\ purpose)$} + +$Open()$ should try to open the device for a specific $purpose$, which +can be either: +\begin{itemize} +\item[0] Open for reading data, as done by {\tt {mount()}} (2), or the +user commands {\tt {dd}} or {\tt {cat}}. +\item[1] Open for $ioctl$ commands, as done by audio-CD playing +programs. +\end{itemize} +Notice that any strategic code (closing tray upon $open()$, etc.)\ is +done by the calling routine in \cdromc, so the low-level routine +should only be concerned with proper initialization, such as spinning +up the disc, etc. % and device-use count + + +\subsection{$Void\ release(struct\ cdrom_device_info * cdi)$} + + +Device-specific actions should be taken such as spinning down the device. +However, strategic actions such as ejection of the tray, or unlocking +the door, should be left over to the general routine $cdrom_release()$. +This is the only function returning type $void$. + +\subsection{$Int\ drive_status(struct\ cdrom_device_info * cdi, int\ slot_nr)$} +\label{drive status} + +The function $drive_status$, if implemented, should provide +information on the status of the drive (not the status of the disc, +which may or may not be in the drive). If the drive is not a changer, +$slot_nr$ should be ignored. In \cdromh\ the possibilities are listed: +$$ +\halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr +CDS_NO_INFO& no information available\cr +CDS_NO_DISC& no disc is inserted, tray is closed\cr +CDS_TRAY_OPEN& tray is opened\cr +CDS_DRIVE_NOT_READY& something is wrong, tray is moving?\cr +CDS_DISC_OK& a disc is loaded and everything is fine\cr +} +$$ + +\subsection{$Int\ media_changed(struct\ cdrom_device_info * cdi, int\ disc_nr)$} + +This function is very similar to the original function in $struct\ +file_operations$. It returns 1 if the medium of the device $cdi\to +dev$ has changed since the last call, and 0 otherwise. The parameter +$disc_nr$ identifies a specific slot in a juke-box, it should be +ignored for single-disc drives. Note that by `re-routing' this +function through $cdrom_media_changed()$, we can implement separate +queues for the VFS and a new $ioctl()$ function that can report device +changes to software (\eg, an auto-mounting daemon). + +\subsection{$Int\ tray_move(struct\ cdrom_device_info * cdi, int\ position)$} + +This function, if implemented, should control the tray movement. (No +other function should control this.) The parameter $position$ controls +the desired direction of movement: +\begin{itemize} +\item[0] Close tray +\item[1] Open tray +\end{itemize} +This function returns 0 upon success, and a non-zero value upon +error. Note that if the tray is already in the desired position, no +action need be taken, and the return value should be 0. + +\subsection{$Int\ lock_door(struct\ cdrom_device_info * cdi, int\ lock)$} + +This function (and no other code) controls locking of the door, if the +drive allows this. The value of $lock$ controls the desired locking +state: +\begin{itemize} +\item[0] Unlock door, manual opening is allowed +\item[1] Lock door, tray cannot be ejected manually +\end{itemize} +This function returns 0 upon success, and a non-zero value upon +error. Note that if the door is already in the requested state, no +action need be taken, and the return value should be 0. + +\subsection{$Int\ select_speed(struct\ cdrom_device_info * cdi, int\ speed)$} + +Some \cdrom\ drives are capable of changing their head-speed. There +are several reasons for changing the speed of a \cdrom\ drive. Badly +pressed \cdrom s may benefit from less-than-maximum head rate. Modern +\cdrom\ drives can obtain very high head rates (up to $24\times$ is +common). It has been reported that these drives can make reading +errors at these high speeds, reducing the speed can prevent data loss +in these circumstances. Finally, some of these drives can +make an annoyingly loud noise, which a lower speed may reduce. %Finally, +%although the audio-low-pass filters probably aren't designed for it, +%more than real-time playback of audio might be used for high-speed +%copying of audio tracks. + +This function specifies the speed at which data is read or audio is +played back. The value of $speed$ specifies the head-speed of the +drive, measured in units of standard cdrom speed (176\,kB/sec raw data +or 150\,kB/sec file system data). So to request that a \cdrom\ drive +operate at 300\,kB/sec you would call the CDROM_SELECT_SPEED $ioctl$ +with $speed=2$. The special value `0' means `auto-selection', \ie, +maximum data-rate or real-time audio rate. If the drive doesn't have +this `auto-selection' capability, the decision should be made on the +current disc loaded and the return value should be positive. A negative +return value indicates an error. + +\subsection{$Int\ select_disc(struct\ cdrom_device_info * cdi, int\ number)$} + +If the drive can store multiple discs (a juke-box) this function +will perform disc selection. It should return the number of the +selected disc on success, a negative value on error. Currently, only +the ide-cd driver supports this functionality. + +\subsection{$Int\ get_last_session(struct\ cdrom_device_info * cdi, struct\ + cdrom_multisession * ms_info)$} + +This function should implement the old corresponding $ioctl()$. For +device $cdi\to dev$, the start of the last session of the current disc +should be returned in the pointer argument $ms_info$. Note that +routines in \cdromc\ have sanitized this argument: its requested +format will {\em always\/} be of the type $CDROM_LBA$ (linear block +addressing mode), whatever the calling software requested. But +sanitization goes even further: the low-level implementation may +return the requested information in $CDROM_MSF$ format if it wishes so +(setting the $ms_info\rightarrow addr_format$ field appropriately, of +course) and the routines in \cdromc\ will make the transformation if +necessary. The return value is 0 upon success. + +\subsection{$Int\ get_mcn(struct\ cdrom_device_info * cdi, struct\ + cdrom_mcn * mcn)$} + +Some discs carry a `Media Catalog Number' (MCN), also called +`Universal Product Code' (UPC). This number should reflect the number +that is generally found in the bar-code on the product. Unfortunately, +the few discs that carry such a number on the disc don't even use the +same format. The return argument to this function is a pointer to a +pre-declared memory region of type $struct\ cdrom_mcn$. The MCN is +expected as a 13-character string, terminated by a null-character. + +\subsection{$Int\ reset(struct\ cdrom_device_info * cdi)$} + +This call should perform a hard-reset on the drive (although in +circumstances that a hard-reset is necessary, a drive may very well not +listen to commands anymore). Preferably, control is returned to the +caller only after the drive has finished resetting. If the drive is no +longer listening, it may be wise for the underlying low-level cdrom +driver to time out. + +\subsection{$Int\ audio_ioctl(struct\ cdrom_device_info * cdi, unsigned\ + int\ cmd, void * arg)$} + +Some of the \cdrom-$ioctl$s defined in \cdromh\ can be +implemented by the routines described above, and hence the function +$cdrom_ioctl$ will use those. However, most $ioctl$s deal with +audio-control. We have decided to leave these to be accessed through a +single function, repeating the arguments $cmd$ and $arg$. Note that +the latter is of type $void*{}$, rather than $unsigned\ long\ +int$. The routine $cdrom_ioctl()$ does do some useful things, +though. It sanitizes the address format type to $CDROM_MSF$ (Minutes, +Seconds, Frames) for all audio calls. It also verifies the memory +location of $arg$, and reserves stack-memory for the argument. This +makes implementation of the $audio_ioctl()$ much simpler than in the +old driver scheme. For example, you may look up the function +$cm206_audio_ioctl()$ in {\tt {cm206.c}} that should be updated with +this documentation. + +An unimplemented ioctl should return $-ENOSYS$, but a harmless request +(\eg, $CDROMSTART$) may be ignored by returning 0 (success). Other +errors should be according to the standards, whatever they are. When +an error is returned by the low-level driver, the \UCD\ tries whenever +possible to return the error code to the calling program. (We may decide +to sanitize the return value in $cdrom_ioctl()$ though, in order to +guarantee a uniform interface to the audio-player software.) + +\subsection{$Int\ dev_ioctl(struct\ cdrom_device_info * cdi, unsigned\ int\ + cmd, unsigned\ long\ arg)$} + +Some $ioctl$s seem to be specific to certain \cdrom\ drives. That is, +they are introduced to service some capabilities of certain drives. In +fact, there are 6 different $ioctl$s for reading data, either in some +particular kind of format, or audio data. Not many drives support +reading audio tracks as data, I believe this is because of protection +of copyrights of artists. Moreover, I think that if audio-tracks are +supported, it should be done through the VFS and not via $ioctl$s. A +problem here could be the fact that audio-frames are 2352 bytes long, +so either the audio-file-system should ask for 75264 bytes at once +(the least common multiple of 512 and 2352), or the drivers should +bend their backs to cope with this incoherence (to which I would be +opposed). Furthermore, it is very difficult for the hardware to find +the exact frame boundaries, since there are no synchronization headers +in audio frames. Once these issues are resolved, this code should be +standardized in \cdromc. + +Because there are so many $ioctl$s that seem to be introduced to +satisfy certain drivers,\footnote{Is there software around that + actually uses these? I'd be interested!} any `non-standard' $ioctl$s +are routed through the call $dev_ioctl()$. In principle, `private' +$ioctl$s should be numbered after the device's major number, and not +the general \cdrom\ $ioctl$ number, {\tt {0x53}}. Currently the +non-supported $ioctl$s are: {\it CDROMREADMODE1, CDROMREADMODE2, + CDROMREADAUDIO, CDROMREADRAW, CDROMREADCOOKED, CDROMSEEK, + CDROMPLAY\-BLK and CDROM\-READALL}. + + +\subsection{\cdrom\ capabilities} +\label{capability} + +Instead of just implementing some $ioctl$ calls, the interface in +\cdromc\ supplies the possibility to indicate the {\em capabilities\/} +of a \cdrom\ drive. This can be done by ORing any number of +capability-constants that are defined in \cdromh\ at the registration +phase. Currently, the capabilities are any of: +$$ +\halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr +CDC_CLOSE_TRAY& can close tray by software control\cr +CDC_OPEN_TRAY& can open tray\cr +CDC_LOCK& can lock and unlock the door\cr +CDC_SELECT_SPEED& can select speed, in units of $\sim$150\,kB/s\cr +CDC_SELECT_DISC& drive is juke-box\cr +CDC_MULTI_SESSION& can read sessions $>\rm1$\cr +CDC_MCN& can read Media Catalog Number\cr +CDC_MEDIA_CHANGED& can report if disc has changed\cr +CDC_PLAY_AUDIO& can perform audio-functions (play, pause, etc)\cr +CDC_RESET& hard reset device\cr +CDC_IOCTLS& driver has non-standard ioctls\cr +CDC_DRIVE_STATUS& driver implements drive status\cr +} +$$ +The capability flag is declared $const$, to prevent drivers from +accidentally tampering with the contents. The capability fags actually +inform \cdromc\ of what the driver can do. If the drive found +by the driver does not have the capability, is can be masked out by +the $cdrom_device_info$ variable $mask$. For instance, the SCSI \cdrom\ +driver has implemented the code for loading and ejecting \cdrom's, and +hence its corresponding flags in $capability$ will be set. But a SCSI +\cdrom\ drive might be a caddy system, which can't load the tray, and +hence for this drive the $cdrom_device_info$ struct will have set +the $CDC_CLOSE_TRAY$ bit in $mask$. + +In the file \cdromc\ you will encounter many constructions of the type +$$\it +if\ (cdo\rightarrow capability \mathrel\& \mathord{\sim} cdi\rightarrow mask + \mathrel{\&} CDC_<capability>) \ldots +$$ +There is no $ioctl$ to set the mask\dots The reason is that +I think it is better to control the {\em behavior\/} rather than the +{\em capabilities}. + +\subsection{Options} + +A final flag register controls the {\em behavior\/} of the \cdrom\ +drives, in order to satisfy different users' wishes, hopefully +independently of the ideas of the respective author who happened to +have made the drive's support available to the \linux\ community. The +current behavior options are: +$$ +\halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr +CDO_AUTO_CLOSE& try to close tray upon device $open()$\cr +CDO_AUTO_EJECT& try to open tray on last device $close()$\cr +CDO_USE_FFLAGS& use $file_pointer\rightarrow f_flags$ to indicate + purpose for $open()$\cr +CDO_LOCK& try to lock door if device is opened\cr +CDO_CHECK_TYPE& ensure disc type is data if opened for data\cr +} +$$ + +The initial value of this register is $CDO_AUTO_CLOSE \mathrel| +CDO_USE_FFLAGS \mathrel| CDO_LOCK$, reflecting my own view on user +interface and software standards. Before you protest, there are two +new $ioctl$s implemented in \cdromc, that allow you to control the +behavior by software. These are: +$$ +\halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr +CDROM_SET_OPTIONS& set options specified in $(int)\ arg$\cr +CDROM_CLEAR_OPTIONS& clear options specified in $(int)\ arg$\cr +} +$$ +One option needs some more explanation: $CDO_USE_FFLAGS$. In the next +newsection we explain what the need for this option is. + +A software package {\tt setcd}, available from the Debian distribution +and {\tt sunsite.unc.edu}, allows user level control of these flags. + +\newsection{The need to know the purpose of opening the \cdrom\ device} + +Traditionally, Unix devices can be used in two different `modes', +either by reading/writing to the device file, or by issuing +controlling commands to the device, by the device's $ioctl()$ +call. The problem with \cdrom\ drives, is that they can be used for +two entirely different purposes. One is to mount removable +file systems, \cdrom s, the other is to play audio CD's. Audio commands +are implemented entirely through $ioctl$s, presumably because the +first implementation (SUN?) has been such. In principle there is +nothing wrong with this, but a good control of the `CD player' demands +that the device can {\em always\/} be opened in order to give the +$ioctl$ commands, regardless of the state the drive is in. + +On the other hand, when used as a removable-media disc drive (what the +original purpose of \cdrom s is) we would like to make sure that the +disc drive is ready for operation upon opening the device. In the old +scheme, some \cdrom\ drivers don't do any integrity checking, resulting +in a number of i/o errors reported by the VFS to the kernel when an +attempt for mounting a \cdrom\ on an empty drive occurs. This is not a +particularly elegant way to find out that there is no \cdrom\ inserted; +it more-or-less looks like the old IBM-PC trying to read an empty floppy +drive for a couple of seconds, after which the system complains it +can't read from it. Nowadays we can {\em sense\/} the existence of a +removable medium in a drive, and we believe we should exploit that +fact. An integrity check on opening of the device, that verifies the +availability of a \cdrom\ and its correct type (data), would be +desirable. + +These two ways of using a \cdrom\ drive, principally for data and +secondarily for playing audio discs, have different demands for the +behavior of the $open()$ call. Audio use simply wants to open the +device in order to get a file handle which is needed for issuing +$ioctl$ commands, while data use wants to open for correct and +reliable data transfer. The only way user programs can indicate what +their {\em purpose\/} of opening the device is, is through the $flags$ +parameter (see {\tt {open(2)}}). For \cdrom\ devices, these flags aren't +implemented (some drivers implement checking for write-related flags, +but this is not strictly necessary if the device file has correct +permission flags). Most option flags simply don't make sense to +\cdrom\ devices: $O_CREAT$, $O_NOCTTY$, $O_TRUNC$, $O_APPEND$, and +$O_SYNC$ have no meaning to a \cdrom. + +We therefore propose to use the flag $O_NONBLOCK$ to indicate +that the device is opened just for issuing $ioctl$ +commands. Strictly, the meaning of $O_NONBLOCK$ is that opening and +subsequent calls to the device don't cause the calling process to +wait. We could interpret this as ``don't wait until someone has +inserted some valid data-\cdrom.'' Thus, our proposal of the +implementation for the $open()$ call for \cdrom s is: +\begin{itemize} +\item If no other flags are set than $O_RDONLY$, the device is opened +for data transfer, and the return value will be 0 only upon successful +initialization of the transfer. The call may even induce some actions +on the \cdrom, such as closing the tray. +\item If the option flag $O_NONBLOCK$ is set, opening will always be +successful, unless the whole device doesn't exist. The drive will take +no actions whatsoever. +\end{itemize} + +\subsection{And what about standards?} + +You might hesitate to accept this proposal as it comes from the +\linux\ community, and not from some standardizing institute. What +about SUN, SGI, HP and all those other Unix and hardware vendors? +Well, these companies are in the lucky position that they generally +control both the hardware and software of their supported products, +and are large enough to set their own standard. They do not have to +deal with a dozen or more different, competing hardware +configurations.\footnote{Incidentally, I think that SUN's approach to +mounting \cdrom s is very good in origin: under Solaris a +volume-daemon automatically mounts a newly inserted \cdrom\ under {\tt +{/cdrom/$<volume-name>$/}}. In my opinion they should have pushed this +further and have {\em every\/} \cdrom\ on the local area network be +mounted at the similar location, \ie, no matter in which particular +machine you insert a \cdrom, it will always appear at the same +position in the directory tree, on every system. When I wanted to +implement such a user-program for \linux, I came across the +differences in behavior of the various drivers, and the need for an +$ioctl$ informing about media changes.} + +We believe that using $O_NONBLOCK$ to indicate that a device is being opened +for $ioctl$ commands only can be easily introduced in the \linux\ +community. All the CD-player authors will have to be informed, we can +even send in our own patches to the programs. The use of $O_NONBLOCK$ +has most likely no influence on the behavior of the CD-players on +other operating systems than \linux. Finally, a user can always revert +to old behavior by a call to $ioctl(file_descriptor, CDROM_CLEAR_OPTIONS, +CDO_USE_FFLAGS)$. + +\subsection{The preferred strategy of $open()$} + +The routines in \cdromc\ are designed in such a way that run-time +configuration of the behavior of \cdrom\ devices (of {\em any\/} type) +can be carried out, by the $CDROM_SET/CLEAR_OPTIONS$ $ioctls$. Thus, various +modes of operation can be set: +\begin{description} +\item[$CDO_AUTO_CLOSE \mathrel| CDO_USE_FFLAGS \mathrel| CDO_LOCK$] This +is the default setting. (With $CDO_CHECK_TYPE$ it will be better, in the +future.) If the device is not yet opened by any other process, and if +the device is being opened for data ($O_NONBLOCK$ is not set) and the +tray is found to be open, an attempt to close the tray is made. Then, +it is verified that a disc is in the drive and, if $CDO_CHECK_TYPE$ is +set, that it contains tracks of type `data mode 1.' Only if all tests +are passed is the return value zero. The door is locked to prevent file +system corruption. If the drive is opened for audio ($O_NONBLOCK$ is +set), no actions are taken and a value of 0 will be returned. +\item[$CDO_AUTO_CLOSE \mathrel| CDO_AUTO_EJECT \mathrel| CDO_LOCK$] This +mimics the behavior of the current sbpcd-driver. The option flags are +ignored, the tray is closed on the first open, if necessary. Similarly, +the tray is opened on the last release, \ie, if a \cdrom\ is unmounted, +it is automatically ejected, such that the user can replace it. +\end{description} +We hope that these option can convince everybody (both driver +maintainers and user program developers) to adopt the new \cdrom\ +driver scheme and option flag interpretation. + +\newsection{Description of routines in \cdromc} + +Only a few routines in \cdromc\ are exported to the drivers. In this +new section we will discuss these, as well as the functions that `take +over' the \cdrom\ interface to the kernel. The header file belonging +to \cdromc\ is called \cdromh. Formerly, some of the contents of this +file were placed in the file {\tt {ucdrom.h}}, but this file has now been +merged back into \cdromh. + +\subsection{$Struct\ file_operations\ cdrom_fops$} + +The contents of this structure were described in section~\ref{cdrom.c}. +A pointer to this structure is assigned to the $fops$ field +of the $struct gendisk$. + +\subsection{$Int\ register_cdrom( struct\ cdrom_device_info\ * cdi)$} + +This function is used in about the same way one registers $cdrom_fops$ +with the kernel, the device operations and information structures, +as described in section~\ref{cdrom.c}, should be registered with the +\UCD: +$$ +register_cdrom(\&<device>_info)); +$$ +This function returns zero upon success, and non-zero upon +failure. The structure $<device>_info$ should have a pointer to the +driver's $<device>_dops$, as in +$$ +\vbox{\halign{&$#$\hfil\cr +struct\ &cdrom_device_info\ <device>_info = \{\cr +& <device>_dops;\cr +&\ldots\cr +\}\cr +}}$$ +Note that a driver must have one static structure, $<device>_dops$, while +it may have as many structures $<device>_info$ as there are minor devices +active. $Register_cdrom()$ builds a linked list from these. + +\subsection{$Int\ unregister_cdrom(struct\ cdrom_device_info * cdi)$} + +Unregistering device $cdi$ with minor number $MINOR(cdi\to dev)$ removes +the minor device from the list. If it was the last registered minor for +the low-level driver, this disconnects the registered device-operation +routines from the \cdrom\ interface. This function returns zero upon +success, and non-zero upon failure. + +\subsection{$Int\ cdrom_open(struct\ inode * ip, struct\ file * fp)$} + +This function is not called directly by the low-level drivers, it is +listed in the standard $cdrom_fops$. If the VFS opens a file, this +function becomes active. A strategy is implemented in this routine, +taking care of all capabilities and options that are set in the +$cdrom_device_ops$ connected to the device. Then, the program flow is +transferred to the device_dependent $open()$ call. + +\subsection{$Void\ cdrom_release(struct\ inode *ip, struct\ file +*fp)$} + +This function implements the reverse-logic of $cdrom_open()$, and then +calls the device-dependent $release()$ routine. When the use-count has +reached 0, the allocated buffers are flushed by calls to $sync_dev(dev)$ +and $invalidate_buffers(dev)$. + + +\subsection{$Int\ cdrom_ioctl(struct\ inode *ip, struct\ file *fp, +unsigned\ int\ cmd, unsigned\ long\ arg)$} +\label{cdrom-ioctl} + +This function handles all the standard $ioctl$ requests for \cdrom\ +devices in a uniform way. The different calls fall into three +categories: $ioctl$s that can be directly implemented by device +operations, ones that are routed through the call $audio_ioctl()$, and +the remaining ones, that are presumable device-dependent. Generally, a +negative return value indicates an error. + +\subsubsection{Directly implemented $ioctl$s} +\label{ioctl-direct} + +The following `old' \cdrom-$ioctl$s are implemented by directly +calling device-operations in $cdrom_device_ops$, if implemented and +not masked: +\begin{description} +\item[CDROMMULTISESSION] Requests the last session on a \cdrom. +\item[CDROMEJECT] Open tray. +\item[CDROMCLOSETRAY] Close tray. +\item[CDROMEJECT_SW] If $arg\not=0$, set behavior to auto-close (close +tray on first open) and auto-eject (eject on last release), otherwise +set behavior to non-moving on $open()$ and $release()$ calls. +\item[CDROM_GET_MCN] Get the Media Catalog Number from a CD. +\end{description} + +\subsubsection{$Ioctl$s routed through $audio_ioctl()$} +\label{ioctl-audio} + +The following set of $ioctl$s are all implemented through a call to +the $cdrom_fops$ function $audio_ioctl()$. Memory checks and +allocation are performed in $cdrom_ioctl()$, and also sanitization of +address format ($CDROM_LBA$/$CDROM_MSF$) is done. +\begin{description} +\item[CDROMSUBCHNL] Get sub-channel data in argument $arg$ of type $struct\ +cdrom_subchnl *{}$. +\item[CDROMREADTOCHDR] Read Table of Contents header, in $arg$ of type +$struct\ cdrom_tochdr *{}$. +\item[CDROMREADTOCENTRY] Read a Table of Contents entry in $arg$ and +specified by $arg$ of type $struct\ cdrom_tocentry *{}$. +\item[CDROMPLAYMSF] Play audio fragment specified in Minute, Second, +Frame format, delimited by $arg$ of type $struct\ cdrom_msf *{}$. +\item[CDROMPLAYTRKIND] Play audio fragment in track-index format +delimited by $arg$ of type $struct\ \penalty-1000 cdrom_ti *{}$. +\item[CDROMVOLCTRL] Set volume specified by $arg$ of type $struct\ +cdrom_volctrl *{}$. +\item[CDROMVOLREAD] Read volume into by $arg$ of type $struct\ +cdrom_volctrl *{}$. +\item[CDROMSTART] Spin up disc. +\item[CDROMSTOP] Stop playback of audio fragment. +\item[CDROMPAUSE] Pause playback of audio fragment. +\item[CDROMRESUME] Resume playing. +\end{description} + +\subsubsection{New $ioctl$s in \cdromc} + +The following $ioctl$s have been introduced to allow user programs to +control the behavior of individual \cdrom\ devices. New $ioctl$ +commands can be identified by the underscores in their names. +\begin{description} +\item[CDROM_SET_OPTIONS] Set options specified by $arg$. Returns the +option flag register after modification. Use $arg = \rm0$ for reading +the current flags. +\item[CDROM_CLEAR_OPTIONS] Clear options specified by $arg$. Returns + the option flag register after modification. +\item[CDROM_SELECT_SPEED] Select head-rate speed of disc specified as + by $arg$ in units of standard cdrom speed (176\,kB/sec raw data or + 150\,kB/sec file system data). The value 0 means `auto-select', \ie, + play audio discs at real time and data discs at maximum speed. The value + $arg$ is checked against the maximum head rate of the drive found in the + $cdrom_dops$. +\item[CDROM_SELECT_DISC] Select disc numbered $arg$ from a juke-box. + First disc is numbered 0. The number $arg$ is checked against the + maximum number of discs in the juke-box found in the $cdrom_dops$. +\item[CDROM_MEDIA_CHANGED] Returns 1 if a disc has been changed since + the last call. Note that calls to $cdrom_media_changed$ by the VFS + are treated by an independent queue, so both mechanisms will detect + a media change once. For juke-boxes, an extra argument $arg$ + specifies the slot for which the information is given. The special + value $CDSL_CURRENT$ requests that information about the currently + selected slot be returned. +\item[CDROM_DRIVE_STATUS] Returns the status of the drive by a call to + $drive_status()$. Return values are defined in section~\ref{drive + status}. Note that this call doesn't return information on the + current playing activity of the drive; this can be polled through an + $ioctl$ call to $CDROMSUBCHNL$. For juke-boxes, an extra argument + $arg$ specifies the slot for which (possibly limited) information is + given. The special value $CDSL_CURRENT$ requests that information + about the currently selected slot be returned. +\item[CDROM_DISC_STATUS] Returns the type of the disc currently in the + drive. It should be viewed as a complement to $CDROM_DRIVE_STATUS$. + This $ioctl$ can provide \emph {some} information about the current + disc that is inserted in the drive. This functionality used to be + implemented in the low level drivers, but is now carried out + entirely in \UCD. + + The history of development of the CD's use as a carrier medium for + various digital information has lead to many different disc types. + This $ioctl$ is useful only in the case that CDs have \emph {only + one} type of data on them. While this is often the case, it is + also very common for CDs to have some tracks with data, and some + tracks with audio. Because this is an existing interface, rather + than fixing this interface by changing the assumptions it was made + under, thereby breaking all user applications that use this + function, the \UCD\ implements this $ioctl$ as follows: If the CD in + question has audio tracks on it, and it has absolutely no CD-I, XA, + or data tracks on it, it will be reported as $CDS_AUDIO$. If it has + both audio and data tracks, it will return $CDS_MIXED$. If there + are no audio tracks on the disc, and if the CD in question has any + CD-I tracks on it, it will be reported as $CDS_XA_2_2$. Failing + that, if the CD in question has any XA tracks on it, it will be + reported as $CDS_XA_2_1$. Finally, if the CD in question has any + data tracks on it, it will be reported as a data CD ($CDS_DATA_1$). + + This $ioctl$ can return: + $$ + \halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr + CDS_NO_INFO& no information available\cr + CDS_NO_DISC& no disc is inserted, or tray is opened\cr + CDS_AUDIO& Audio disc (2352 audio bytes/frame)\cr + CDS_DATA_1& data disc, mode 1 (2048 user bytes/frame)\cr + CDS_XA_2_1& mixed data (XA), mode 2, form 1 (2048 user bytes)\cr + CDS_XA_2_2& mixed data (XA), mode 2, form 1 (2324 user bytes)\cr + CDS_MIXED& mixed audio/data disc\cr + } + $$ + For some information concerning frame layout of the various disc + types, see a recent version of \cdromh. + +\item[CDROM_CHANGER_NSLOTS] Returns the number of slots in a + juke-box. +\item[CDROMRESET] Reset the drive. +\item[CDROM_GET_CAPABILITY] Returns the $capability$ flags for the + drive. Refer to section \ref{capability} for more information on + these flags. +\item[CDROM_LOCKDOOR] Locks the door of the drive. $arg == \rm0$ + unlocks the door, any other value locks it. +\item[CDROM_DEBUG] Turns on debugging info. Only root is allowed + to do this. Same semantics as CDROM_LOCKDOOR. +\end{description} + +\subsubsection{Device dependent $ioctl$s} + +Finally, all other $ioctl$s are passed to the function $dev_ioctl()$, +if implemented. No memory allocation or verification is carried out. + +\newsection{How to update your driver} + +\begin{enumerate} +\item Make a backup of your current driver. +\item Get hold of the files \cdromc\ and \cdromh, they should be in + the directory tree that came with this documentation. +\item Make sure you include \cdromh. +\item Change the 3rd argument of $register_blkdev$ from +$\&<your-drive>_fops$ to $\&cdrom_fops$. +\item Just after that line, add the following to register with the \UCD: + $$register_cdrom(\&<your-drive>_info);$$ + Similarly, add a call to $unregister_cdrom()$ at the appropriate place. +\item Copy an example of the device-operations $struct$ to your + source, \eg, from {\tt {cm206.c}} $cm206_dops$, and change all + entries to names corresponding to your driver, or names you just + happen to like. If your driver doesn't support a certain function, + make the entry $NULL$. At the entry $capability$ you should list all + capabilities your driver currently supports. If your driver + has a capability that is not listed, please send me a message. +\item Copy the $cdrom_device_info$ declaration from the same example + driver, and modify the entries according to your needs. If your + driver dynamically determines the capabilities of the hardware, this + structure should also be declared dynamically. +\item Implement all functions in your $<device>_dops$ structure, + according to prototypes listed in \cdromh, and specifications given + in section~\ref{cdrom.c}. Most likely you have already implemented + the code in a large part, and you will almost certainly need to adapt the + prototype and return values. +\item Rename your $<device>_ioctl()$ function to $audio_ioctl$ and + change the prototype a little. Remove entries listed in the first + part in section~\ref{cdrom-ioctl}, if your code was OK, these are + just calls to the routines you adapted in the previous step. +\item You may remove all remaining memory checking code in the + $audio_ioctl()$ function that deals with audio commands (these are + listed in the second part of section~\ref{cdrom-ioctl}). There is no + need for memory allocation either, so most $case$s in the $switch$ + statement look similar to: + $$ + case\ CDROMREADTOCENTRY\colon get_toc_entry\bigl((struct\ + cdrom_tocentry *{})\ arg\bigr); + $$ +\item All remaining $ioctl$ cases must be moved to a separate + function, $<device>_ioctl$, the device-dependent $ioctl$s. Note that + memory checking and allocation must be kept in this code! +\item Change the prototypes of $<device>_open()$ and + $<device>_release()$, and remove any strategic code (\ie, tray + movement, door locking, etc.). +\item Try to recompile the drivers. We advise you to use modules, both + for {\tt {cdrom.o}} and your driver, as debugging is much easier this + way. +\end{enumerate} + +\newsection{Thanks} + +Thanks to all the people involved. First, Erik Andersen, who has +taken over the torch in maintaining \cdromc\ and integrating much +\cdrom-related code in the 2.1-kernel. Thanks to Scott Snyder and +Gerd Knorr, who were the first to implement this interface for SCSI +and IDE-CD drivers and added many ideas for extension of the data +structures relative to kernel~2.0. Further thanks to Heiko Eissfeldt, +Thomas Quinot, Jon Tombs, Ken Pizzini, Eberhard M\"onkeberg and Andrew +Kroll, the \linux\ \cdrom\ device driver developers who were kind +enough to give suggestions and criticisms during the writing. Finally +of course, I want to thank Linus Torvalds for making this possible in +the first place. + +\vfill +$ \version\ $ +\eject +\end{document} diff --git a/Documentation/cdrom/cdu31a b/Documentation/cdrom/cdu31a new file mode 100644 index 000000000000..c0667da09c00 --- /dev/null +++ b/Documentation/cdrom/cdu31a @@ -0,0 +1,196 @@ + + CDU31A/CDU33A Driver Info + ------------------------- + +Information on the Sony CDU31A/CDU33A CDROM driver for the Linux +kernel. + + Corey Minyard (minyard@metronet.com) + + Colossians 3:17 + +Crude Table of Contents +----------------------- + + Setting Up the Hardware + Configuring the Kernel + Configuring as a Module + Driver Special Features + + +This device driver handles Sony CDU31A/CDU33A CDROM drives and +provides a complete block-level interface as well as an ioctl() +interface as specified in include/linux/cdrom.h). With this +interface, CDROMs can be accessed, standard audio CDs can be played +back normally, and CD audio information can be read off the drive. + +Note that this will only work for CDU31A/CDU33A drives. Some vendors +market their drives as CDU31A compatible. They lie. Their drives are +really CDU31A hardware interface compatible (they can plug into the +same card). They are not software compatible. + +Setting Up the Hardware +----------------------- + +The CDU31A driver is unable to safely tell if an interface card is +present that it can use because the interface card does not announce +its presence in any way besides placing 4 I/O locations in memory. It +used to just probe memory and attempt commands, but Linus wisely asked +me to remove that because it could really screw up other hardware in +the system. + +Because of this, you must tell the kernel where the drive interface +is, what interrupts are used, and possibly if you are on a PAS-16 +soundcard. + +If you have the Sony CDU31A/CDU33A drive interface card, the following +diagram will help you set it up. If you have another card, you are on +your own. You need to make sure that the I/O address and interrupt is +not used by another card in the system. You will need to know the I/O +address and interrupt you have set. Note that use of interrupts is +highly recommended, if possible, it really cuts down on CPU used. +Unfortunately, most soundcards do not support interrupts for their +CDROM interfaces. By default, the Sony interface card comes with +interrupts disabled. + + +----------+-----------------+----------------------+ + | JP1 | 34 Pin Conn | | + | JP2 +-----------------+ | + | JP3 | + | JP4 | + | +--+ + | | +-+ + | | | | External + | | | | Connector + | | | | + | | +-+ + | +--+ + | | + | +--------+ + | | + +------------------------------------------+ + + JP1 sets the Base Address, using the following settings: + + Address Pin 1 Pin 2 + ------- ----- ----- + 0x320 Short Short + 0x330 Short Open + 0x340 Open Short + 0x360 Open Open + + JP2 and JP3 configure the DMA channel; they must be set the same. + + DMA Pin 1 Pin 2 Pin 3 + --- ----- ----- ----- + 1 On Off On + 2 Off On Off + 3 Off Off On + + JP4 Configures the IRQ: + + IRQ Pin 1 Pin 2 Pin 3 Pin 4 + --- ----- ----- ----- ----- + 3 Off Off On Off + 4 Off Off* Off On + 5 On Off Off Off + 6 Off On Off Off + + The documentation states to set this for interrupt + 4, but I think that is a mistake. + +Note that if you have another interface card, you will need to look at +the documentation to find the I/O base address. This is specified to +the SLCD.SYS driver for DOS with the /B: parameter, so you can look at +you DOS driver setup to find the address, if necessary. + +Configuring the Kernel +---------------------- + +You must tell the kernel where the drive is at boot time. This can be +done at the Linux boot prompt, by using LILO, or by using Bootlin. +Note that this is no substitute for HOWTOs and LILO documentation, if +you are confused please read those for info on bootline configuration +and LILO. + +At the linux boot prompt, press the ALT key and add the following line +after the boot name (you can let the kernel boot, it will tell you the +default boot name while booting): + + cdu31a=<base address>,<interrupt>[,PAS] + +The base address needs to have "0x" in front of it, since it is in +hex. For instance, to configure a drive at address 320 on interrupt 5, +use the following: + + cdu31a=0x320,5 + +I use the following boot line: + + cdu31a=0x1f88,0,PAS + +because I have a PAS-16 which does not support interrupt for the +CDU31A interface. + +Adding this as an append line at the beginning of the /etc/lilo.conf +file will set it for lilo configurations. I have the following as the +first line in my lilo.conf file: + + append="cdu31a=0x1f88,0" + +I'm not sure how to set up Bootlin (I have never used it), if someone +would like to fill in this section please do. + + +Configuring as a Module +----------------------- + +The driver supports loading as a module. However, you must specify +the boot address and interrupt on the boot line to insmod. You can't +use modprobe to load it, since modprobe doesn't support setting +variables. + +Anyway, I use the following line to load my driver as a module + + /sbin/insmod /lib/modules/`uname -r`/misc/cdu31a.o cdu31a_port=0x1f88 + +You can set the following variables in the driver: + + cdu31a_port=<I/O address> - sets the base I/O. If hex, put 0x in + front of it. This must be specified. + + cdu31a_irq=<interrupt> - Sets the interrupt number. Leaving this + off will turn interrupts off. + + +Driver Special Features +----------------------- + +This section describes features beyond the normal audio and CD-ROM +functions of the drive. + +2048 byte buffer mode + +If a disk is mounted with -o block=2048, data is copied straight from +the drive data port to the buffer. Otherwise, the readahead buffer +must be involved to hold the other 1K of data when a 1K block +operation is done. Note that with 2048 byte blocks you cannot execute +files from the CD. + +XA compatibility + +The driver should support XA disks for both the CDU31A and CDU33A. It +does this transparently, the using program doesn't need to set it. + +Multi-Session + +A multi-session disk looks just like a normal disk to the user. Just +mount one normally, and all the data should be there. A special +thanks to Koen for help with this! + +Raw sector I/O + +Using the CDROMREADAUDIO it is possible to read raw audio and data +tracks. Both operations return 2352 bytes per sector. On the data +tracks, the first 12 bytes is not returned by the drive and the value +of that data is indeterminate. diff --git a/Documentation/cdrom/cm206 b/Documentation/cdrom/cm206 new file mode 100644 index 000000000000..810368f4f7c4 --- /dev/null +++ b/Documentation/cdrom/cm206 @@ -0,0 +1,185 @@ +This is the readme file for the driver for the Philips/LMS cdrom drive +cm206 in combination with the cm260 host adapter card. + + (c) 1995 David A. van Leeuwen + +Changes since version 0.99 +-------------------------- +- Interfacing to the kernel is routed though an extra interface layer, + cdrom.c. This allows runtime-configurable `behavior' of the cdrom-drive, + independent of the driver. + +Features since version 0.33 +--------------------------- +- Full audio support, that is, both workman, workbone and cdp work + now reasonably. Reading TOC still takes some time. xmcd has been + reported to run successfully. +- Made auto-probe code a little better, I hope + +Features since version 0.28 +--------------------------- +- Full speed transfer rate (300 kB/s). +- Minimum kernel memory usage for buffering (less than 3 kB). +- Multisession support. +- Tray locking. +- Statistics of driver accessible to the user. +- Module support. +- Auto-probing of adapter card's base port and irq line, + also configurable at boot time or module load time. + + +Decide how you are going to use the driver. There are two +options: + + (a) installing the driver as a resident part of the kernel + (b) compiling the driver as a loadable module + + Further, you must decide if you are going to specify the base port + address and the interrupt request line of the adapter card cm260 as + boot options for (a), module parameters for (b), use automatic + probing of these values, or hard-wire your adaptor card's settings + into the source code. If you don't care, you can choose + autoprobing, which is the default. In that case you can move on to + the next step. + +Compiling the kernel +-------------------- +1) move to /usr/src/linux and do a + + make config + + If you have chosen option (a), answer yes to CONFIG_CM206 and + CONFIG_ISO9660_FS. + + If you have chosen option (b), answer yes to CONFIG_MODVERSIONS + and no (!) to CONFIG_CM206 and CONFIG_ISO9660_FS. + +2) then do a + + make clean; make zImage; make modules + +3) do the usual things to install a new image (backup the old one, run + `rdev -R zImage 1', copy the new image in place, run lilo). Might + be `make zlilo'. + +Using the driver as a module +---------------------------- +If you will only occasionally use the cd-rom driver, you can choose +option (b), install as a loadable module. You may have to re-compile +the module when you upgrade the kernel to a new version. + +Since version 0.96, much of the functionality has been transferred to +a generic cdrom interface in the file cdrom.c. The module cm206.o +depends on cdrom.o. If the latter is not compiled into the kernel, +you must explicitly load it before cm206.o: + + insmod /usr/src/linux/modules/cdrom.o + +To install the module, you use the command, as root + + insmod /usr/src/linux/modules/cm206.o + +You can specify the base address on the command line as well as the irq +line to be used, e.g. + + insmod /usr/src/linux/modules/cm206.o cm206=0x300,11 + +The order of base port and irq line doesn't matter; if you specify only +one, the other will have the value of the compiled-in default. You +may also have to install the file-system module `iso9660.o', if you +didn't compile that into the kernel. + + +Using the driver as part of the kernel +-------------------------------------- +If you have chosen option (a), you can specify the base-port +address and irq on the lilo boot command line, e.g.: + + LILO: linux cm206=0x340,11 + +This assumes that your linux kernel image keyword is `linux'. +If you specify either IRQ (3--11) or base port (0x300--0x370), +auto probing is turned off for both settings, thus setting the +other value to the compiled-in default. + +Note that you can also put these parameters in the lilo configuration file: + +# linux config +image = /vmlinuz + root = /dev/hda1 + label = Linux + append = "cm206=0x340,11" + read-only + + +If module parameters and LILO config options don't work +------------------------------------------------------- +If autoprobing does not work, you can hard-wire the default values +of the base port address (CM206_BASE) and interrupt request line +(CM206_IRQ) into the file /usr/src/linux/drivers/cdrom/cm206.h. Change +the defines of CM206_IRQ and CM206_BASE. + + +Mounting the cdrom +------------------ +1) Make sure that the right device is installed in /dev. + + mknod /dev/cm206cd b 32 0 + +2) Make sure there is a mount point, e.g., /cdrom + + mkdir /cdrom + +3) mount using a command like this (run as root): + + mount -rt iso9660 /dev/cm206cd /cdrom + +4) For user-mounts, add a line in /etc/fstab + + /dev/cm206cd /cdrom iso9660 ro,noauto,user + + This will allow users to give the commands + + mount /cdrom + umount /cdrom + +If things don't work +-------------------- + +- Try to do a `dmesg' to find out if the driver said anything about + what is going wrong during the initialization. + +- Try to do a `dd if=/dev/cm206cd | od -tc | less' to read from the + CD. + +- Look in the /proc directory to see if `cm206' shows up under one of + `interrupts', `ioports', `devices' or `modules' (if applicable). + + +DISCLAIMER +---------- +I cannot guarantee that this driver works, or that the hardware will +not be harmed, although I consider it most unlikely. + +I hope that you'll find this driver in some way useful. + + David van Leeuwen + david@tm.tno.nl + +Note for Linux CDROM vendors +----------------------------- +You are encouraged to include this driver on your Linux CDROM. If +you do, you might consider sending me a free copy of that cd-rom. +You can contact me through my e-mail address, david@tm.tno.nl. +If this driver is compiled into a kernel to boot off a cdrom, +you should actually send me a free copy of that cd-rom. + +Copyright +--------- +The copyright of the cm206 driver for Linux is + + (c) 1995 David A. van Leeuwen + +The driver is released under the conditions of the GNU general public +license, which can be found in the file COPYING in the root of this +source tree. diff --git a/Documentation/cdrom/gscd b/Documentation/cdrom/gscd new file mode 100644 index 000000000000..d01ca36b5c43 --- /dev/null +++ b/Documentation/cdrom/gscd @@ -0,0 +1,60 @@ + Goldstar R420 CD-Rom device driver README + +For all kind of other information about the GoldStar R420 CDROM +and this Linux device driver see the WWW page: + + http://linux.rz.fh-hannover.de/~raupach + + + If you are the editor of a Linux CD, you should + enable gscd.c within your boot floppy kernel. Please, + send me one of your CDs for free. + + +This current driver version 0.4a only supports reading data from the disk. +Currently we have no audio and no multisession or XA support. +The polling interface is used, no DMA. + + +Sometimes the GoldStar R420 is sold in a 'Reveal Multimedia Kit'. This kit's +drive interface is compatible, too. + + +Installation +------------ + +Change to '/usr/src/linux/drivers/cdrom' and edit the file 'gscd.h'. Insert +the i/o address of your interface card. + +The default base address is 0x340. This will work for most applications. +Address selection is accomplished by jumpers PN801-1 to PN801-4 on the +GoldStar Interface Card. +Appropriate settings are: 0x300, 0x310, 0x320, 0x330, 0x340, 0x350, 0x360 +0x370, 0x380, 0x390, 0x3A0, 0x3B0, 0x3C0, 0x3D0, 0x3E0, 0x3F0 + +Then go back to '/usr/src/linux/' and 'make config' to build the new +configuration for your kernel. If you want to use the GoldStar driver +like a module, don't select 'GoldStar CDROM support'. By the way, you +have to include the iso9660 filesystem. + +Now start compiling the kernel with 'make zImage'. +If you want to use the driver as a module, you have to do 'make modules' +and 'make modules_install', additionally. +Install your new kernel as usual - maybe you do it with 'make zlilo'. + +Before you can use the driver, you have to + mknod /dev/gscd0 b 16 0 +to create the appropriate device file (you only need to do this once). + +If you use modules, you can try to insert the driver. +Say: 'insmod /usr/src/linux/modules/gscd.o' +or: 'insmod /usr/src/linux/modules/gscd.o gscd=<address>' +The driver should report its results. + +That's it! Mount a disk, i.e. 'mount -rt iso9660 /dev/gscd0 /cdrom' + +Feel free to report errors and suggestions to the following address. +Be sure, I'm very happy to receive your comments! + + Oliver Raupach Hannover, Juni 1995 +(raupach@nwfs1.rz.fh-hannover.de) diff --git a/Documentation/cdrom/ide-cd b/Documentation/cdrom/ide-cd new file mode 100644 index 000000000000..29721bfcde12 --- /dev/null +++ b/Documentation/cdrom/ide-cd @@ -0,0 +1,574 @@ +IDE-CD driver documentation +Originally by scott snyder <snyder@fnald0.fnal.gov> (19 May 1996) +Carrying on the torch is: Erik Andersen <andersee@debian.org> +New maintainers (19 Oct 1998): Jens Axboe <axboe@image.dk> + +1. Introduction +--------------- + +The ide-cd driver should work with all ATAPI ver 1.2 to ATAPI 2.6 compliant +CDROM drives which attach to an IDE interface. Note that some CDROM vendors +(including Mitsumi, Sony, Creative, Aztech, and Goldstar) have made +both ATAPI-compliant drives and drives which use a proprietary +interface. If your drive uses one of those proprietary interfaces, +this driver will not work with it (but one of the other CDROM drivers +probably will). This driver will not work with `ATAPI' drives which +attach to the parallel port. In addition, there is at least one drive +(CyCDROM CR520ie) which attaches to the IDE port but is not ATAPI; +this driver will not work with drives like that either (but see the +aztcd driver). + +This driver provides the following features: + + - Reading from data tracks, and mounting ISO 9660 filesystems. + + - Playing audio tracks. Most of the CDROM player programs floating + around should work; I usually use Workman. + + - Multisession support. + + - On drives which support it, reading digital audio data directly + from audio tracks. The program cdda2wav can be used for this. + Note, however, that only some drives actually support this. + + - There is now support for CDROM changers which comply with the + ATAPI 2.6 draft standard (such as the NEC CDR-251). This additional + functionality includes a function call to query which slot is the + currently selected slot, a function call to query which slots contain + CDs, etc. A sample program which demonstrates this functionality is + appended to the end of this file. The Sanyo 3-disc changer + (which does not conform to the standard) is also now supported. + Please note the driver refers to the first CD as slot # 0. + + +2. Installation +--------------- + +0. The ide-cd relies on the ide disk driver. See + Documentation/ide.txt for up-to-date information on the ide + driver. + +1. Make sure that the ide and ide-cd drivers are compiled into the + kernel you're using. When configuring the kernel, in the section + entitled "Floppy, IDE, and other block devices", say either `Y' + (which will compile the support directly into the kernel) or `M' + (to compile support as a module which can be loaded and unloaded) + to the options: + + Enhanced IDE/MFM/RLL disk/cdrom/tape/floppy support + Include IDE/ATAPI CDROM support + + and `no' to + + Use old disk-only driver on primary interface + + Depending on what type of IDE interface you have, you may need to + specify additional configuration options. See + Documentation/ide.txt. + +2. You should also ensure that the iso9660 filesystem is either + compiled into the kernel or available as a loadable module. You + can see if a filesystem is known to the kernel by catting + /proc/filesystems. + +3. The CDROM drive should be connected to the host on an IDE + interface. Each interface on a system is defined by an I/O port + address and an IRQ number, the standard assignments being + 0x1f0 and 14 for the primary interface and 0x170 and 15 for the + secondary interface. Each interface can control up to two devices, + where each device can be a hard drive, a CDROM drive, a floppy drive, + or a tape drive. The two devices on an interface are called `master' + and `slave'; this is usually selectable via a jumper on the drive. + + Linux names these devices as follows. The master and slave devices + on the primary IDE interface are called `hda' and `hdb', + respectively. The drives on the secondary interface are called + `hdc' and `hdd'. (Interfaces at other locations get other letters + in the third position; see Documentation/ide.txt.) + + If you want your CDROM drive to be found automatically by the + driver, you should make sure your IDE interface uses either the + primary or secondary addresses mentioned above. In addition, if + the CDROM drive is the only device on the IDE interface, it should + be jumpered as `master'. (If for some reason you cannot configure + your system in this manner, you can probably still use the driver. + You may have to pass extra configuration information to the kernel + when you boot, however. See Documentation/ide.txt for more + information.) + +4. Boot the system. If the drive is recognized, you should see a + message which looks like + + hdb: NEC CD-ROM DRIVE:260, ATAPI CDROM drive + + If you do not see this, see section 5 below. + +5. You may want to create a symbolic link /dev/cdrom pointing to the + actual device. You can do this with the command + + ln -s /dev/hdX /dev/cdrom + + where X should be replaced by the letter indicating where your + drive is installed. + +6. You should be able to see any error messages from the driver with + the `dmesg' command. + + +3. Basic usage +-------------- + +An ISO 9660 CDROM can be mounted by putting the disc in the drive and +typing (as root) + + mount -t iso9660 /dev/cdrom /mnt/cdrom + +where it is assumed that /dev/cdrom is a link pointing to the actual +device (as described in step 5 of the last section) and /mnt/cdrom is +an empty directory. You should now be able to see the contents of the +CDROM under the /mnt/cdrom directory. If you want to eject the CDROM, +you must first dismount it with a command like + + umount /mnt/cdrom + +Note that audio CDs cannot be mounted. + +Some distributions set up /etc/fstab to always try to mount a CDROM +filesystem on bootup. It is not required to mount the CDROM in this +manner, though, and it may be a nuisance if you change CDROMs often. +You should feel free to remove the cdrom line from /etc/fstab and +mount CDROMs manually if that suits you better. + +Multisession and photocd discs should work with no special handling. +The hpcdtoppm package (ftp.gwdg.de:/pub/linux/hpcdtoppm/) may be +useful for reading photocds. + +To play an audio CD, you should first unmount and remove any data +CDROM. Any of the CDROM player programs should then work (workman, +workbone, cdplayer, etc.). Lacking anything else, you could use the +cdtester program in Documentation/cdrom/sbpcd. + +On a few drives, you can read digital audio directly using a program +such as cdda2wav. The only types of drive which I've heard support +this are Sony and Toshiba drives. You will get errors if you try to +use this function on a drive which does not support it. + +For supported changers, you can use the `cdchange' program (appended to +the end of this file) to switch between changer slots. Note that the +drive should be unmounted before attempting this. The program takes +two arguments: the CDROM device, and the slot number to which you wish +to change. If the slot number is -1, the drive is unloaded. + + +4. Compilation options +---------------------- + +There are a few additional options which can be set when compiling the +driver. Most people should not need to mess with any of these; they +are listed here simply for completeness. A compilation option can be +enabled by adding a line of the form `#define <option> 1' to the top +of ide-cd.c. All these options are disabled by default. + +VERBOSE_IDE_CD_ERRORS + If this is set, ATAPI error codes will be translated into textual + descriptions. In addition, a dump is made of the command which + provoked the error. This is off by default to save the memory used + by the (somewhat long) table of error descriptions. + +STANDARD_ATAPI + If this is set, the code needed to deal with certain drives which do + not properly implement the ATAPI spec will be disabled. If you know + your drive implements ATAPI properly, you can turn this on to get a + slightly smaller kernel. + +NO_DOOR_LOCKING + If this is set, the driver will never attempt to lock the door of + the drive. + +CDROM_NBLOCKS_BUFFER + This sets the size of the buffer to be used for a CDROMREADAUDIO + ioctl. The default is 8. + +TEST + This currently enables an additional ioctl which enables a user-mode + program to execute an arbitrary packet command. See the source for + details. This should be left off unless you know what you're doing. + + +5. Common problems +------------------ + +This section discusses some common problems encountered when trying to +use the driver, and some possible solutions. Note that if you are +experiencing problems, you should probably also review +Documentation/ide.txt for current information about the underlying +IDE support code. Some of these items apply only to earlier versions +of the driver, but are mentioned here for completeness. + +In most cases, you should probably check with `dmesg' for any errors +from the driver. + +a. Drive is not detected during booting. + + - Review the configuration instructions above and in + Documentation/ide.txt, and check how your hardware is + configured. + + - If your drive is the only device on an IDE interface, it should + be jumpered as master, if at all possible. + + - If your IDE interface is not at the standard addresses of 0x170 + or 0x1f0, you'll need to explicitly inform the driver using a + lilo option. See Documentation/ide.txt. (This feature was + added around kernel version 1.3.30.) + + - If the autoprobing is not finding your drive, you can tell the + driver to assume that one exists by using a lilo option of the + form `hdX=cdrom', where X is the drive letter corresponding to + where your drive is installed. Note that if you do this and you + see a boot message like + + hdX: ATAPI cdrom (?) + + this does _not_ mean that the driver has successfully detected + the drive; rather, it means that the driver has not detected a + drive, but is assuming there's one there anyway because you told + it so. If you actually try to do I/O to a drive defined at a + nonexistent or nonresponding I/O address, you'll probably get + errors with a status value of 0xff. + + - Some IDE adapters require a nonstandard initialization sequence + before they'll function properly. (If this is the case, there + will often be a separate MS-DOS driver just for the controller.) + IDE interfaces on sound cards often fall into this category. + + Support for some interfaces needing extra initialization is + provided in later 1.3.x kernels. You may need to turn on + additional kernel configuration options to get them to work; + see Documentation/ide.txt. + + Even if support is not available for your interface, you may be + able to get it to work with the following procedure. First boot + MS-DOS and load the appropriate drivers. Then warm-boot linux + (i.e., without powering off). If this works, it can be automated + by running loadlin from the MS-DOS autoexec. + + +b. Timeout/IRQ errors. + + - If you always get timeout errors, interrupts from the drive are + probably not making it to the host. + + - IRQ problems may also be indicated by the message + `IRQ probe failed (<n>)' while booting. If <n> is zero, that + means that the system did not see an interrupt from the drive when + it was expecting one (on any feasible IRQ). If <n> is negative, + that means the system saw interrupts on multiple IRQ lines, when + it was expecting to receive just one from the CDROM drive. + + - Double-check your hardware configuration to make sure that the IRQ + number of your IDE interface matches what the driver expects. + (The usual assignments are 14 for the primary (0x1f0) interface + and 15 for the secondary (0x170) interface.) Also be sure that + you don't have some other hardware which might be conflicting with + the IRQ you're using. Also check the BIOS setup for your system; + some have the ability to disable individual IRQ levels, and I've + had one report of a system which was shipped with IRQ 15 disabled + by default. + + - Note that many MS-DOS CDROM drivers will still function even if + there are hardware problems with the interrupt setup; they + apparently don't use interrupts. + + - If you own a Pioneer DR-A24X, you _will_ get nasty error messages + on boot such as "irq timeout: status=0x50 { DriveReady SeekComplete }" + The Pioneer DR-A24X CDROM drives are fairly popular these days. + Unfortunately, these drives seem to become very confused when we perform + the standard Linux ATA disk drive probe. If you own one of these drives, + you can bypass the ATA probing which confuses these CDROM drives, by + adding `append="hdX=noprobe hdX=cdrom"' to your lilo.conf file and running + lilo (again where X is the drive letter corresponding to where your drive + is installed.) + +c. System hangups. + + - If the system locks up when you try to access the CDROM, the most + likely cause is that you have a buggy IDE adapter which doesn't + properly handle simultaneous transactions on multiple interfaces. + The most notorious of these is the CMD640B chip. This problem can + be worked around by specifying the `serialize' option when + booting. Recent kernels should be able to detect the need for + this automatically in most cases, but the detection is not + foolproof. See Documentation/ide.txt for more information + about the `serialize' option and the CMD640B. + + - Note that many MS-DOS CDROM drivers will work with such buggy + hardware, apparently because they never attempt to overlap CDROM + operations with other disk activity. + + +d. Can't mount a CDROM. + + - If you get errors from mount, it may help to check `dmesg' to see + if there are any more specific errors from the driver or from the + filesystem. + + - Make sure there's a CDROM loaded in the drive, and that's it's an + ISO 9660 disc. You can't mount an audio CD. + + - With the CDROM in the drive and unmounted, try something like + + cat /dev/cdrom | od | more + + If you see a dump, then the drive and driver are probably working + OK, and the problem is at the filesystem level (i.e., the CDROM is + not ISO 9660 or has errors in the filesystem structure). + + - If you see `not a block device' errors, check that the definitions + of the device special files are correct. They should be as + follows: + + brw-rw---- 1 root disk 3, 0 Nov 11 18:48 /dev/hda + brw-rw---- 1 root disk 3, 64 Nov 11 18:48 /dev/hdb + brw-rw---- 1 root disk 22, 0 Nov 11 18:48 /dev/hdc + brw-rw---- 1 root disk 22, 64 Nov 11 18:48 /dev/hdd + + Some early Slackware releases had these defined incorrectly. If + these are wrong, you can remake them by running the script + scripts/MAKEDEV.ide. (You may have to make it executable + with chmod first.) + + If you have a /dev/cdrom symbolic link, check that it is pointing + to the correct device file. + + If you hear people talking of the devices `hd1a' and `hd1b', these + were old names for what are now called hdc and hdd. Those names + should be considered obsolete. + + - If mount is complaining that the iso9660 filesystem is not + available, but you know it is (check /proc/filesystems), you + probably need a newer version of mount. Early versions would not + always give meaningful error messages. + + +e. Directory listings are unpredictably truncated, and `dmesg' shows + `buffer botch' error messages from the driver. + + - There was a bug in the version of the driver in 1.2.x kernels + which could cause this. It was fixed in 1.3.0. If you can't + upgrade, you can probably work around the problem by specifying a + blocksize of 2048 when mounting. (Note that you won't be able to + directly execute binaries off the CDROM in that case.) + + If you see this in kernels later than 1.3.0, please report it as a + bug. + + +f. Data corruption. + + - Random data corruption was occasionally observed with the Hitachi + CDR-7730 CDROM. If you experience data corruption, using "hdx=slow" + as a command line parameter may work around the problem, at the + expense of low system performance. + + +6. cdchange.c +------------- + +/* + * cdchange.c [-v] <device> [<slot>] + * + * This loads a CDROM from a specified slot in a changer, and displays + * information about the changer status. The drive should be unmounted before + * using this program. + * + * Changer information is displayed if either the -v flag is specified + * or no slot was specified. + * + * Based on code originally from Gerhard Zuber <zuber@berlin.snafu.de>. + * Changer status information, and rewrite for the new Uniform CDROM driver + * interface by Erik Andersen <andersee@debian.org>. + */ + +#include <stdio.h> +#include <stdlib.h> +#include <errno.h> +#include <string.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <linux/cdrom.h> + + +int +main (int argc, char **argv) +{ + char *program; + char *device; + int fd; /* file descriptor for CD-ROM device */ + int status; /* return status for system calls */ + int verbose = 0; + int slot=-1, x_slot; + int total_slots_available; + + program = argv[0]; + + ++argv; + --argc; + + if (argc < 1 || argc > 3) { + fprintf (stderr, "usage: %s [-v] <device> [<slot>]\n", + program); + fprintf (stderr, " Slots are numbered 1 -- n.\n"); + exit (1); + } + + if (strcmp (argv[0], "-v") == 0) { + verbose = 1; + ++argv; + --argc; + } + + device = argv[0]; + + if (argc == 2) + slot = atoi (argv[1]) - 1; + + /* open device */ + fd = open(device, O_RDONLY | O_NONBLOCK); + if (fd < 0) { + fprintf (stderr, "%s: open failed for `%s': %s\n", + program, device, strerror (errno)); + exit (1); + } + + /* Check CD player status */ + total_slots_available = ioctl (fd, CDROM_CHANGER_NSLOTS); + if (total_slots_available <= 1 ) { + fprintf (stderr, "%s: Device `%s' is not an ATAPI " + "compliant CD changer.\n", program, device); + exit (1); + } + + if (slot >= 0) { + if (slot >= total_slots_available) { + fprintf (stderr, "Bad slot number. " + "Should be 1 -- %d.\n", + total_slots_available); + exit (1); + } + + /* load */ + slot=ioctl (fd, CDROM_SELECT_DISC, slot); + if (slot<0) { + fflush(stdout); + perror ("CDROM_SELECT_DISC "); + exit(1); + } + } + + if (slot < 0 || verbose) { + + status=ioctl (fd, CDROM_SELECT_DISC, CDSL_CURRENT); + if (status<0) { + fflush(stdout); + perror (" CDROM_SELECT_DISC"); + exit(1); + } + slot=status; + + printf ("Current slot: %d\n", slot+1); + printf ("Total slots available: %d\n", + total_slots_available); + + printf ("Drive status: "); + status = ioctl (fd, CDROM_DRIVE_STATUS, CDSL_CURRENT); + if (status<0) { + perror(" CDROM_DRIVE_STATUS"); + } else switch(status) { + case CDS_DISC_OK: + printf ("Ready.\n"); + break; + case CDS_TRAY_OPEN: + printf ("Tray Open.\n"); + break; + case CDS_DRIVE_NOT_READY: + printf ("Drive Not Ready.\n"); + break; + default: + printf ("This Should not happen!\n"); + break; + } + + for (x_slot=0; x_slot<total_slots_available; x_slot++) { + printf ("Slot %2d: ", x_slot+1); + status = ioctl (fd, CDROM_DRIVE_STATUS, x_slot); + if (status<0) { + perror(" CDROM_DRIVE_STATUS"); + } else switch(status) { + case CDS_DISC_OK: + printf ("Disc present."); + break; + case CDS_NO_DISC: + printf ("Empty slot."); + break; + case CDS_TRAY_OPEN: + printf ("CD-ROM tray open.\n"); + break; + case CDS_DRIVE_NOT_READY: + printf ("CD-ROM drive not ready.\n"); + break; + case CDS_NO_INFO: + printf ("No Information available."); + break; + default: + printf ("This Should not happen!\n"); + break; + } + if (slot == x_slot) { + status = ioctl (fd, CDROM_DISC_STATUS); + if (status<0) { + perror(" CDROM_DISC_STATUS"); + } + switch (status) { + case CDS_AUDIO: + printf ("\tAudio disc.\t"); + break; + case CDS_DATA_1: + case CDS_DATA_2: + printf ("\tData disc type %d.\t", status-CDS_DATA_1+1); + break; + case CDS_XA_2_1: + case CDS_XA_2_2: + printf ("\tXA data disc type %d.\t", status-CDS_XA_2_1+1); + break; + default: + printf ("\tUnknown disc type 0x%x!\t", status); + break; + } + } + status = ioctl (fd, CDROM_MEDIA_CHANGED, x_slot); + if (status<0) { + perror(" CDROM_MEDIA_CHANGED"); + } + switch (status) { + case 1: + printf ("Changed.\n"); + break; + default: + printf ("\n"); + break; + } + } + } + + /* close device */ + status = close (fd); + if (status != 0) { + fprintf (stderr, "%s: close failed for `%s': %s\n", + program, device, strerror (errno)); + exit (1); + } + + exit (0); +} diff --git a/Documentation/cdrom/isp16 b/Documentation/cdrom/isp16 new file mode 100644 index 000000000000..cc86533ac9f3 --- /dev/null +++ b/Documentation/cdrom/isp16 @@ -0,0 +1,100 @@ + -- Documentation/cdrom/isp16 + +Docs by Eric van der Maarel <H.T.M.v.d.Maarel@marin.nl> + +This is the README for version 0.6 of the cdrom interface on an +ISP16, MAD16 or Mozart sound card. + +The detection and configuration of this interface used to be included +in both the sjcd and optcd cdrom driver. Drives supported by these +drivers came packed with Media Magic's multi media kit, which also +included the ISP16 card. The idea (thanks Leo Spiekman) +to move it from these drivers into a separate module and moreover, not to +rely on the MAD16 sound driver, are as follows: +-duplication of code in the kernel is a waste of resources and should + be avoided; +-however, kernels and notably those included with Linux distributions + (cf Slackware 3.0 included version 0.5 of the isp16 configuration + code included in the drivers) don't always come with sound support + included. Especially when they already include a bunch of cdrom drivers. + Hence, the cdrom interface should be configurable _independently_ of + sound support. + +The ISP16, MAD16 and Mozart sound cards have an OPTi 82C928 or an +OPTi 82C929 chip. The interface on these cards should work with +any cdrom attached to the card, which is 'electrically' compatible +with Sanyo/Panasonic, Sony or Mitsumi non-ide drives. However, the +command sets for any proprietary drives may differ +(and hence may not be supported in the kernel) from these four types. +For a fact I know the interface works and the way of configuration +as described in this documentation works in combination with the +sjcd (in Sanyo/Panasonic compatibility mode) cdrom drivers +(probably with the optcd (in Sony compatibility mode) as well). +If you have such an OPTi based sound card and you want to use the +cdrom interface with a cdrom drive supported by any of the other cdrom +drivers, it will probably work. Please let me know any experience you +might have). +I understand that cards based on the OPTi 82C929 chips may be configured +(hardware jumpers that is) as an IDE interface. Initialisation of such a +card in this mode is not supported (yet?). + +The suggestion to configure the ISP16 etc. sound card by booting DOS and +do a warm reboot to boot Linux somehow doesn't work, at least not +on my machine (IPC P90), with the OPTi 82C928 based card. + +Booting the kernel through the boot manager LILO allows the use +of some command line options on the 'LILO boot:' prompt. At boot time +press Alt or Shift while the LILO prompt is written on the screen and enter +any kernel options. Alternatively these options may be used in +the appropriate section in /etc/lilo.conf. Adding 'append="<cmd_line_options>"' +will do the trick as well. +The syntax of 'cmd_line_options' is + + isp16=[<port>[,<irq>[,<dma>]]][[,]<drive_type>] + +If there is no ISP16 or compatibles detected, there's probably no harm done. +These options indicate the values that your cdrom drive has been (or will be) +configured to use. +Valid values for the base i/o address are: + port=0x340,0x320,0x330,0x360 +for the interrupt request number + irq=0,3,5,7,9,10,11 +for the direct memory access line + dma=0,3,5,6,7 +and for the type of drive + drive_type=noisp16,Sanyo,Panasonic,Sony,Mitsumi. +Note that these options are case sensitive. +The values 0 for irq and dma indicate that they are not used, and +the drive will be used in 'polling' mode. The values 5 and 7 for irq +should be avoided in order to avoid any conflicts with optional +sound card configuration. +The syntax of the command line does not allow the specification of +irq when there's nothing specified for the base address and no +specification of dma when there is no specification of irq. +The value 'noisp16' for drive_type, which may be used as the first +non-integer option value (e.g. 'isp16=noisp16'), makes sure that probing +for and subsequent configuration of an ISP16-compatible card is skipped +all together. This can be useful to overcome possible conflicts which +may arise while the kernel is probing your hardware. +The default values are + port=0x340 + irq=0 + dma=0 + drive_type=Sanyo +reflecting my own configuration. The defaults can be changed in +the file linux/drivers/cdrom/ips16.h. + +The cdrom interface can be configured at run time by loading the +initialisation driver as a module. In that case, the interface +parameters can be set by giving appropriate values on the command +line. Configuring the driver can then be done by the following +command (assuming you have iso16.o installed in a proper place): + + insmod isp16.o isp16_cdrom_base=<port> isp16_cdrom_irq=<irq> \ + isp16_cdrom_dma=<dma> isp16_cdrom_type=<drive_type> + +where port, irq, dma and drive_type can have any of the values mentioned +above. + + +Have fun! diff --git a/Documentation/cdrom/mcdx b/Documentation/cdrom/mcdx new file mode 100644 index 000000000000..2bac4b7ff6da --- /dev/null +++ b/Documentation/cdrom/mcdx @@ -0,0 +1,29 @@ +If you are using the driver as a module, you can specify your ports and IRQs +like + + # insmod mcdx.o mcdx=0x300,11,0x304,5 + +and so on ("address,IRQ" pairs). +This will override the configuration in mcdx.h. + +This driver: + + o handles XA and (hopefully) multi session CDs as well as + ordinary CDs; + o supports up to 5 drives (of course, you'll need free + IRQs, i/o ports and slots); + o plays audio + +This version doesn't support yet: + + o shared IRQs (but it seems to be possible - I've successfully + connected two drives to the same irq. So it's `only' a + problem of the driver.) + +This driver never will: + + o Read digital audio (i.e. copy directly), due to missing + hardware features. + + +heiko@lotte.sax.de diff --git a/Documentation/cdrom/optcd b/Documentation/cdrom/optcd new file mode 100644 index 000000000000..6f46c7adb243 --- /dev/null +++ b/Documentation/cdrom/optcd @@ -0,0 +1,57 @@ +This is the README file for the Optics Storage 8000 AT CDROM device driver. + +This is the driver for the so-called 'DOLPHIN' drive, with the 34-pin +Sony-compatible interface. For the IDE-compatible Optics Storage 8001 +drive, you will want the ATAPI CDROM driver. The driver also seems to +work with the Lasermate CR328A. If you have a drive that works with +this driver, and that doesn't report itself as DOLPHIN, please drop me +a mail. + +The support for multisession CDs is in ALPHA stage. If you use it, +please mail me your experiences. Multisession support can be disabled +at compile time. + +You can find some older versions of the driver at + dutette.et.tudelft.nl:/pub/linux/ +and at Eberhard's mirror + ftp.gwdg.de:/pub/linux/cdrom/drivers/optics/ + +Before you can use the driver, you have to create the device file once: + # mknod /dev/optcd0 b 17 0 + +To specify the base address if the driver is "compiled-in" to your kernel, +you can use the kernel command line item (LILO option) + optcd=0x340 +with the right address. + +If you have compiled optcd as a module, you can load it with + # insmod /usr/src/linux/modules/optcd.o +or + # insmod /usr/src/linux/modules/optcd.o optcd=0x340 +with the matching address value of your interface card. + +The driver employs a number of buffers to do read-ahead and block size +conversion. The number of buffers is configurable in optcd.h, and has +influence on the driver performance. For my machine (a P75), 6 buffers +seems optimal, as can be seen from this table: + +#bufs kb/s %cpu +1 97 0.1 +2 191 0.3 +3 188 0.2 +4 246 0.3 +5 189 19 +6 280 0.4 +7 281 7.0 +8 246 2.8 +16 281 3.4 + +If you get a throughput significantly below 300 kb/s, try tweaking +N_BUFS, and don't forget to mail me your results! + +I'd appreciate success/failure reports. If you find a bug, try +recompiling the driver with some strategically chosen debug options +(these can be found in optcd.h) and include the messages generated in +your bug report. Good luck. + +Leo Spiekman (spiekman@dutette.et.tudelft.nl) diff --git a/Documentation/cdrom/packet-writing.txt b/Documentation/cdrom/packet-writing.txt new file mode 100644 index 000000000000..3d44c561fe6d --- /dev/null +++ b/Documentation/cdrom/packet-writing.txt @@ -0,0 +1,97 @@ +Getting started quick +--------------------- + +- Select packet support in the block device section and UDF support in + the file system section. + +- Compile and install kernel and modules, reboot. + +- You need the udftools package (pktsetup, mkudffs, cdrwtool). + Download from http://sourceforge.net/projects/linux-udf/ + +- Grab a new CD-RW disc and format it (assuming CD-RW is hdc, substitute + as appropriate): + # cdrwtool -d /dev/hdc -q + +- Setup your writer + # pktsetup dev_name /dev/hdc + +- Now you can mount /dev/pktcdvd/dev_name and copy files to it. Enjoy! + # mount /dev/pktcdvd/dev_name /cdrom -t udf -o rw,noatime + + +Packet writing for DVD-RW media +------------------------------- + +DVD-RW discs can be written to much like CD-RW discs if they are in +the so called "restricted overwrite" mode. To put a disc in restricted +overwrite mode, run: + + # dvd+rw-format /dev/hdc + +You can then use the disc the same way you would use a CD-RW disc: + + # pktsetup dev_name /dev/hdc + # mount /dev/pktcdvd/dev_name /cdrom -t udf -o rw,noatime + + +Packet writing for DVD+RW media +------------------------------- + +According to the DVD+RW specification, a drive supporting DVD+RW discs +shall implement "true random writes with 2KB granularity", which means +that it should be possible to put any filesystem with a block size >= +2KB on such a disc. For example, it should be possible to do: + + # dvd+rw-format /dev/hdc (only needed if the disc has never + been formatted) + # mkudffs /dev/hdc + # mount /dev/hdc /cdrom -t udf -o rw,noatime + +However, some drives don't follow the specification and expect the +host to perform aligned writes at 32KB boundaries. Other drives do +follow the specification, but suffer bad performance problems if the +writes are not 32KB aligned. + +Both problems can be solved by using the pktcdvd driver, which always +generates aligned writes. + + # dvd+rw-format /dev/hdc + # pktsetup dev_name /dev/hdc + # mkudffs /dev/pktcdvd/dev_name + # mount /dev/pktcdvd/dev_name /cdrom -t udf -o rw,noatime + + +Packet writing for DVD-RAM media +-------------------------------- + +DVD-RAM discs are random writable, so using the pktcdvd driver is not +necessary. However, using the pktcdvd driver can improve performance +in the same way it does for DVD+RW media. + + +Notes +----- + +- CD-RW media can usually not be overwritten more than about 1000 + times, so to avoid unnecessary wear on the media, you should always + use the noatime mount option. + +- Defect management (ie automatic remapping of bad sectors) has not + been implemented yet, so you are likely to get at least some + filesystem corruption if the disc wears out. + +- Since the pktcdvd driver makes the disc appear as a regular block + device with a 2KB block size, you can put any filesystem you like on + the disc. For example, run: + + # /sbin/mke2fs /dev/pktcdvd/dev_name + + to create an ext2 filesystem on the disc. + + +Links +----- + +See http://fy.chalmers.se/~appro/linux/DVD+RW/ for more information +about DVD writing. diff --git a/Documentation/cdrom/sbpcd b/Documentation/cdrom/sbpcd new file mode 100644 index 000000000000..d1825dffca34 --- /dev/null +++ b/Documentation/cdrom/sbpcd @@ -0,0 +1,1057 @@ +This README belongs to release 4.2 or newer of the SoundBlaster Pro +(Matsushita, Kotobuki, Panasonic, CreativeLabs, Longshine and Teac) +CD-ROM driver for Linux. + +sbpcd really, really is NOT for ANY IDE/ATAPI drive! +Not even if you have an "original" SoundBlaster card with an IDE interface! +So, you'd better have a look into README.ide if your port address is 0x1F0, +0x170, 0x1E8, 0x168 or similar. +I get tons of mails from IDE/ATAPI drive users - I really can't continue +any more to answer them all. So, if your drive/interface information sheets +mention "IDE" (primary, secondary, tertiary, quaternary) and the DOS driver +invoking line within your CONFIG.SYS is using an address below 0x230: +DON'T ROB MY LAST NERVE - jumper your interface to address 0x170 and IRQ 15 +(that is the "secondary IDE" configuration), set your drive to "master" and +use ide-cd as your driver. If you do not have a second IDE hard disk, use the +LILO commands + hdb=noprobe hdc=cdrom +and get lucky. +To make it fully clear to you: if you mail me about IDE/ATAPI drive problems, +my answer is above, and I simply will discard your mail, hoping to stop the +flood and to find time to lead my 12-year old son towards happy computing. + +The driver is able to drive the whole family of "traditional" AT-style (that +is NOT the new "Enhanced IDE" or "ATAPI" drive standard) Matsushita, +Kotobuki, Panasonic drives, sometimes labelled as "CreativeLabs". The +well-known drives are CR-521, CR-522, CR-523, CR-562, CR-563. +CR-574 is an IDE/ATAPI drive. + +The Longshine LCS-7260 is a double-speed drive which uses the "old" +Matsushita command set. It is supported - with help by Serge Robyns. +Vertos ("Elitegroup Computer Systems", ECS) has a similar drive - support +has started; get in contact if you have such a "Vertos 100" or "ECS-AT" +drive. + +There exists an "IBM External ISA CD-ROM Drive" which in fact is a CR-563 +with a special controller board. This drive is supported (the interface is +of the "LaserMate" type), and it is possibly the best buy today (cheaper than +an internal drive, and you can use it as an internal, too - e.g. plug it into +a soundcard). + +CreativeLabs has a new drive "CD200" and a similar drive "CD200F". The latter +is made by Funai and sometimes named "E2550UA", newer models may be named +"MK4015". The CD200F drives should fully work. +CD200 drives without "F" are still giving problems: drive detection and +playing audio should work, data access will result in errors. I need qualified +feedback about the bugs within the data functions or a drive (I never saw a +CD200). + +The quad-speed Teac CD-55A drive is supported, but still does not reach "full +speed". The data rate already reaches 500 kB/sec if you set SBP_BUFFER_FRAMES +to 64 (it is not recommended to do that for normal "file access" usage, but it +can speed up things a lot if you use something like "dd" to read from the +drive; I use it for verifying self-written CDs this way). +The drive itself is able to deliver 600 kB/sec, so this needs +work; with the normal setup, the performance currently is not even as good as +double-speed. + +This driver is NOT for Mitsumi or Sony or Aztech or Philips or XXX drives, +and again: this driver is in no way usable for any IDE/ATAPI drive. If you +think your drive should work and it doesn't: send me the DOS driver for your +beast (gzipped + uuencoded) and your CONFIG.SYS if you want to ask me for help, +and include an original log message excerpt, and try to give all information +a complete idiot needs to understand your hassle already with your first +mail. And if you want to say "as I have mailed you before", be sure that I +don't remember your "case" by such remarks; at the moment, I have some +hundreds of open correspondences about Linux CDROM questions (hope to reduce if +the IDE/ATAPI user questions disappear). + + +This driver will work with the soundcard interfaces (SB Pro, SB 16, Galaxy, +SoundFX, Mozart, MAD16 ...) and with the "no-sound" cards (Panasonic CI-101P, +LaserMate, WDH-7001C, Longshine LCS-6853, Teac ...). + +It works with the "configurable" interface "Sequoia S-1000", too, which is +used on the Spea Media FX and Ensonic Soundscape sound cards. You have to +specify the type "SBPRO 2" and the true CDROM port address with it, not the +"configuration port" address. + +If you have a sound card which needs a "configuration driver" instead of +jumpers for interface types and addresses (like Mozart cards) - those +drivers get invoked before the DOS CDROM driver in your CONFIG.SYS, typical +names are "cdsetup.sys" and "mztinit.sys" - let the sound driver do the +CDROM port configuration (the leading comments in linux/drivers/sound/mad16.c +are just for you!). Hannu Savolainen's mad16.c code is able to set up my +Mozart card - I simply had to add + #define MAD16_CONF 0x06 + #define MAD16_CDSEL 0x03 +to configure the CDROM interface for type "Panasonic" (LaserMate) and address +0x340. + +The interface type has to get configured in linux/drivers/cdrom/sbpcd.h, +because the register layout is different between the "SoundBlaster" and the +"LaserMate" type. + +I got a report that the Teac interface card "I/F E117098" is of type +"SoundBlaster" (i.e. you have to set SBPRO to 1) even with the addresses +0x300 and above. This is unusual, and it can't get covered by the auto +probing scheme. +The Teac 16-bit interface cards (like P/N E950228-00A, default address 0x2C0) +need the SBPRO 3 setup. + +If auto-probing found the drive, the address is correct. The reported type +may be wrong. A "mount" will give success only if the interface type is set +right. Playing audio should work with a wrong set interface type, too. + +With some Teac and some CD200 drives I have seen interface cards which seem +to lack the "drive select" lines; always drive 0 gets addressed. To avoid +"mirror drives" (four drives detected where you only have one) with such +interface cards, set MAX_DRIVES to 1 and jumper your drive to ID 0 (if +possible). + + +Up to 4 drives per interface card, and up to 4 interface cards are supported. +All supported drive families can be mixed, but the CR-521 drives are +hard-wired to drive ID 0. The drives have to use different drive IDs, and each +drive has to get a unique minor number (0...3), corresponding indirectly to +its drive ID. +The drive IDs may be selected freely from 0 to 3 - they do not have to be in +consecutive order. + +As Don Carroll, don@ds9.us.dell.com or FIDO 1:382/14, told me, it is possible +to change old drives to any ID, too. He writes in this sense: + "In order to be able to use more than one single speed drive + (they do not have the ID jumpers) you must add a DIP switch + and two resistors. The pads are already on the board next to + the power connector. You will see the silkscreen for the + switch if you remove the top cover. + 1 2 3 4 + ID 0 = x F F x O = "on" + ID 1 = x O F x F = "off" + ID 2 = x F O x x = "don't care" + ID 3 = x O O x + Next to the switch are the positions for R76 (7k) and R78 + (12k). I had to play around with the resistor values - ID 3 + did not work with other values. If the values are not good, + ID 3 behaves like ID 0." + +To use more than 4 drives, you simply need a second controller card at a +different address and a second cable. + +The driver supports reading of data from the CD and playing of audio tracks. +The audio part should run with WorkMan, xcdplayer, with the "non-X11" products +CDplayer and WorkBone - tell me if it is not compatible with other software. +The only accepted measure for correctness with the audio functions is the +"cdtester" utility (appended) - most audio player programmers seem to be +better musicians than programmers. ;-) + +With the CR-56x and the CD200 drives, the reading of audio frames is possible. +This is implemented by an IOCTL function which reads READ_AUDIO frames of +2352 bytes at once (configurable with the "READ_AUDIO" define, default is 0). +Reading the same frame a second time gives different data; the frame data +start at a different position, but all read bytes are valid, and we always +read 98 consecutive chunks (of 24 Bytes) as a frame. Reading more than 1 frame +at once possibly misses some chunks at each frame boundary. This lack has to +get corrected by external, "higher level" software which reads the same frame +again and tries to find and eliminate overlapping chunks (24-byte-pieces). + +The transfer rate with reading audio (1-frame-pieces) currently is very slow. +This can be better reading bigger chunks, but the "missing" chunks possibly +occur at the beginning of each single frame. +The software interface possibly may change a bit the day the SCSI driver +supports it too. + +With all but the CR-52x drives, MultiSession is supported. +Photo CDs work (the "old" drives like CR-521 can access only the first +session of a photoCD). +At ftp.gwdg.de:/pub/linux/hpcdtoppm/ you will find Hadmut Danisch's package to +convert photo CD image files and Gerd Knorr's viewing utility. + +The transfer rate will reach 150 kB/sec with CR-52x drives, 300 kB/sec with +CR-56x drives, and currently not more than 500 kB/sec (usually less than +250 kB/sec) with the Teac quad speed drives. +XA (PhotoCD) disks with "old" drives give only 50 kB/sec. + +This release consists of +- this README file +- the driver file linux/drivers/cdrom/sbpcd.c +- the stub files linux/drivers/cdrom/sbpcd[234].c +- the header file linux/drivers/cdrom/sbpcd.h. + + +To install: +----------- + +1. Setup your hardware parameters. Though the driver does "auto-probing" at a + lot of (not all possible!) addresses, this step is recommended for + everyday use. You should let sbpcd auto-probe once and use the reported + address if a drive got found. The reported type may be incorrect; it is + correct if you can mount a data CD. There is no choice for you with the + type; only one is right, the others are deadly wrong. + + a. Go into /usr/src/linux/drivers/cdrom/sbpcd.h and configure it for your + hardware (near the beginning): + a1. Set it up for the appropriate type of interface board. + "Original" CreativeLabs sound cards need "SBPRO 1". + Most "compatible" sound cards (almost all "non-CreativeLabs" cards) + need "SBPRO 0". + The "no-sound" board from OmniCd needs the "SBPRO 1" setup. + The Teac 8-bit "no-sound" boards need the "SBPRO 1" setup. + The Teac 16-bit "no-sound" boards need the "SBPRO 3" setup. + All other "no-sound" boards need the "SBPRO 0" setup. + The Spea Media FX and Ensoniq SoundScape cards need "SBPRO 2". + sbpcd.c holds some examples in its auto-probe list. + If you configure "SBPRO" wrong, the playing of audio CDs will work, + but you will not be able to mount a data CD. + a2. Tell the address of your CDROM_PORT (not of the sound port). + a3. If 4 drives get found, but you have only one, set MAX_DRIVES to 1. + a4. Set DISTRIBUTION to 0. + b. Additionally for 2.a1 and 2.a2, the setup may be done during + boot time (via the "kernel command line" or "LILO option"): + sbpcd=0x320,LaserMate + or + sbpcd=0x230,SoundBlaster + or + sbpcd=0x338,SoundScape + or + sbpcd=0x2C0,Teac16bit + This is especially useful if you install a fresh distribution. + If the second parameter is a number, it gets taken as the type + setting; 0 is "LaserMate", 1 is "SoundBlaster", 2 is "SoundScape", + 3 is "Teac16bit". + So, for example + sbpcd=0x230,1 + is equivalent to + sbpcd=0x230,SoundBlaster + +2. "cd /usr/src/linux" and do a "make config" and select "y" for Matsushita + CD-ROM support and for ISO9660 FileSystem support. If you do not have a + second, third, or fourth controller installed, do not say "y" to the + secondary Matsushita CD-ROM questions. + +3. Then make the kernel image ("make zlilo" or similar). + +4. Make the device file(s). This step usually already has been done by the + MAKEDEV script. + The driver uses MAJOR 25, so, if necessary, do + mknod /dev/sbpcd b 25 0 (if you have only one drive) + and/or + mknod /dev/sbpcd0 b 25 0 + mknod /dev/sbpcd1 b 25 1 + mknod /dev/sbpcd2 b 25 2 + mknod /dev/sbpcd3 b 25 3 + to make the node(s). + + The "first found" drive gets MINOR 0 (regardless of its jumpered ID), the + "next found" (at the same cable) gets MINOR 1, ... + + For a second interface board, you have to make nodes like + mknod /dev/sbpcd4 b 26 0 + mknod /dev/sbpcd5 b 26 1 + and so on. Use the MAJORs 26, 27, 28. + + If you further make a link like + ln -s sbpcd /dev/cdrom + you can use the name /dev/cdrom, too. + +5. Reboot with the new kernel. + +You should now be able to do + mkdir /CD +and + mount -rt iso9660 /dev/sbpcd /CD +or + mount -rt iso9660 -o block=2048 /dev/sbpcd /CD +and see the contents of your CD in the /CD directory. +To use audio CDs, a mounting is not recommended (and it would fail if the +first track is not a data track). + + +Using sbpcd as a "loadable module": +----------------------------------- + +If you do NOT select "Matsushita/Panasonic CDROM driver support" during the +"make config" of your kernel, you can build the "loadable module" sbpcd.o. + +If sbpcd gets used as a module, the support of more than one interface +card (i.e. drives 4...15) is disabled. + +You can specify interface address and type with the "insmod" command like: + # insmod /usr/src/linux/modules/sbpcd.o sbpcd=0x340,0 +or + # insmod /usr/src/linux/modules/sbpcd.o sbpcd=0x230,1 +or + # insmod /usr/src/linux/modules/sbpcd.o sbpcd=0x338,2 +where the last number represents the SBPRO setting (no strings allowed here). + + +Things of interest: +------------------- + +The driver is configured to try the LaserMate type of interface at I/O port +0x0340 first. If this is not appropriate, sbpcd.h should get changed +(you will find the right place - just at the beginning). + +No DMA and no IRQ is used. + +To reduce or increase the amount of kernel messages, edit sbpcd.c and play +with the "DBG_xxx" switches (initialization of the variable "sbpcd_debug"). +Don't forget to reflect on what you do; enabling all DBG_xxx switches at once +may crash your system, and each message line is accompanied by a delay. + +The driver uses the "variable BLOCK_SIZE" feature. To use it, you have to +specify "block=2048" as a mount option. Doing this will disable the direct +execution of a binary from the CD; you have to copy it to a device with the +standard BLOCK_SIZE (1024) first. So, do not use this if your system is +directly "running from the CDROM" (like some of Yggdrasil's installation +variants). There are CDs on the market (like the German "unifix" Linux +distribution) which MUST get handled with a block_size of 1024. Generally, +one can say all the CDs which hold files of the name YMTRANS.TBL are defective; +do not use block=2048 with those. + +Within sbpcd.h, you will find some "#define"s (e.g. EJECT and JUKEBOX). With +these, you can configure the driver for some special things. +You can use the appended program "cdtester" to set the auto-eject feature +during runtime. Jeff Tranter's "eject" utility can do this, too (and more) +for you. + +There is an ioctl CDROMMULTISESSION to obtain with a user program if +the CD is an XA disk and - if it is - where the last session starts. The +"cdtester" program illustrates how to call it. + + +Auto-probing at boot time: +-------------------------- + +The driver does auto-probing at many well-known interface card addresses, +but not all: +Some probings can cause a hang if an NE2000 ethernet card gets touched, because +SBPCD's auto-probing happens before the initialization of the net drivers. +Those "hazardous" addresses are excluded from auto-probing; the "kernel +command line" feature has to be used during installation if you have your +drive at those addresses. The "module" version is allowed to probe at those +addresses, too. + +The auto-probing looks first at the configured address resp. the address +submitted by the kernel command line. With this, it is possible to use this +driver within installation boot floppies, and for any non-standard address, +too. + +Auto-probing will make an assumption about the interface type ("SBPRO" or not), +based upon the address. That assumption may be wrong (initialization will be +o.k., but you will get I/O errors during mount). In that case, use the "kernel +command line" feature and specify address & type at boot time to find out the +right setup. + +For everyday use, address and type should get configured within sbpcd.h. That +will stop the auto-probing due to success with the first try. + +The kernel command "sbpcd=0" suppresses each auto-probing and causes +the driver not to find any drive; it is meant for people who love sbpcd +so much that they do not want to miss it, even if they miss the drives. ;-) + +If you configure "#define CDROM_PORT 0" in sbpcd.h, the auto-probing is +initially disabled and needs an explicit kernel command to get activated. +Once activated, it does not stop before success or end-of-list. This may be +useful within "universal" CDROM installation boot floppies (but using the +loadable module would be better because it allows an "extended" auto-probing +without fearing NE2000 cards). + +To shorten the auto-probing list to a single entry, set DISTRIBUTION 0 within +sbpcd.h. + + +Setting up address and interface type: +-------------------------------------- + +If your I/O port address is not 0x340, you have to look for the #defines near +the beginning of sbpcd.h and configure them: set SBPRO to 0 or 1 or 2, and +change CDROM_PORT to the address of your CDROM I/O port. + +Almost all of the "SoundBlaster compatible" cards behave like the no-sound +interfaces, i.e. need SBPRO 0! + +With "original" SB Pro cards, an initial setting of CD_volume through the +sound card's MIXER register gets done. +If you are using a "compatible" sound card of types "LaserMate" or "SPEA", +you can set SOUND_BASE (in sbpcd.h) to get it done with your card, too... + + +Using audio CDs: +---------------- + +Workman, WorkBone, xcdplayer, cdplayer and the nice little tool "cdplay" (see +README.aztcd from the Aztech driver package) should work. + +The program CDplayer likes to talk to "/dev/mcd" only, xcdplayer wants +"/dev/rsr0", workman loves "/dev/sr0" or "/dev/cdrom" - so, make the +appropriate links to use them without the need to supply parameters. + + +Copying audio tracks: +--------------------- + +The following program will copy track 1 (or a piece of it) from an audio CD +into the file "track01": + +/*=================== begin program ========================================*/ +/* + * read an audio track from a CD + * + * (c) 1994 Eberhard Moenkeberg <emoenke@gwdg.de> + * may be used & enhanced freely + * + * Due to non-existent sync bytes at the beginning of each audio frame (or due + * to a firmware bug within all known drives?), it is currently a kind of + * fortune if two consecutive frames fit together. + * Usually, they overlap, or a little piece is missing. This happens in units + * of 24-byte chunks. It has to get fixed by higher-level software (reading + * until an overlap occurs, and then eliminate the overlapping chunks). + * ftp.gwdg.de:/pub/linux/misc/cdda2wav-sbpcd.*.tar.gz holds an example of + * such an algorithm. + * This example program further is missing to obtain the SubChannel data + * which belong to each frame. + * + * This is only an example of the low-level access routine. The read data are + * pure 16-bit CDDA values; they have to get converted to make sound out of + * them. + * It is no fun to listen to it without prior overlap/underlap correction! + */ +#include <stdio.h> +#include <sys/ioctl.h> +#include <linux/cdrom.h> + +static struct cdrom_tochdr hdr; +static struct cdrom_tocentry entry[101]; +static struct cdrom_read_audio arg; +static u_char buffer[CD_FRAMESIZE_RAW]; +static int datafile, drive; +static int i, j, limit, track, err; +static char filename[32]; + +main(int argc, char *argv[]) +{ +/* + * open /dev/cdrom + */ + drive=open("/dev/cdrom", 0); + if (drive<0) + { + fprintf(stderr, "can't open drive.\n"); + exit (-1); + } +/* + * get TocHeader + */ + fprintf(stdout, "getting TocHeader...\n"); + err=ioctl(drive, CDROMREADTOCHDR, &hdr); + if (err!=0) + { + fprintf(stderr, "can't get TocHeader (error %d).\n", err); + exit (-1); + } + else + fprintf(stdout, "TocHeader: %d %d\n", hdr.cdth_trk0, hdr.cdth_trk1); +/* + * get and display all TocEntries + */ + fprintf(stdout, "getting TocEntries...\n"); + for (i=1;i<=hdr.cdth_trk1+1;i++) + { + if (i!=hdr.cdth_trk1+1) entry[i].cdte_track = i; + else entry[i].cdte_track = CDROM_LEADOUT; + entry[i].cdte_format = CDROM_LBA; + err=ioctl(drive, CDROMREADTOCENTRY, &entry[i]); + if (err!=0) + { + fprintf(stderr, "can't get TocEntry #%d (error %d).\n", i, err); + exit (-1); + } + else + { + fprintf(stdout, "TocEntry #%d: %1X %1X %06X %02X\n", + entry[i].cdte_track, + entry[i].cdte_adr, + entry[i].cdte_ctrl, + entry[i].cdte_addr.lba, + entry[i].cdte_datamode); + } + } + fprintf(stdout, "got all TocEntries.\n"); +/* + * ask for track number (not implemented here) + */ +track=1; +#if 0 /* just read a little piece (4 seconds) */ +entry[track+1].cdte_addr.lba=entry[track].cdte_addr.lba+300; +#endif +/* + * read track into file + */ + sprintf(filename, "track%02d\0", track); + datafile=creat(filename, 0755); + if (datafile<0) + { + fprintf(stderr, "can't open datafile %s.\n", filename); + exit (-1); + } + arg.addr.lba=entry[track].cdte_addr.lba; + arg.addr_format=CDROM_LBA; /* CDROM_MSF would be possible here, too. */ + arg.nframes=1; + arg.buf=&buffer[0]; + limit=entry[track+1].cdte_addr.lba; + for (;arg.addr.lba<limit;arg.addr.lba++) + { + err=ioctl(drive, CDROMREADAUDIO, &arg); + if (err!=0) + { + fprintf(stderr, "can't read abs. frame #%d (error %d).\n", + arg.addr.lba, err); + } + j=write(datafile, &buffer[0], CD_FRAMESIZE_RAW); + if (j!=CD_FRAMESIZE_RAW) + { + fprintf(stderr,"I/O error (datafile) at rel. frame %d\n", + arg.addr.lba-entry[track].cdte_addr.lba); + } + arg.addr.lba++; + } +} +/*===================== end program ========================================*/ + +At ftp.gwdg.de:/pub/linux/misc/cdda2wav-sbpcd.*.tar.gz is an adapted version of +Heiko Eissfeldt's digital-audio to .WAV converter (the original is there, too). +This is preliminary, as Heiko himself will care about it. + + +Known problems: +--------------- + +Currently, the detection of disk change or removal is actively disabled. + +Most attempts to read the UPC/EAN code result in a stream of zeroes. All my +drives are mostly telling there is no UPC/EAN code on disk or there is, but it +is an all-zero number. I guess now almost no CD holds such a number. + +Bug reports, comments, wishes, donations (technical information is a donation, +too :-) etc. to emoenke@gwdg.de. + +SnailMail address, preferable for CD editors if they want to submit a free +"cooperation" copy: + Eberhard Moenkeberg + Reinholdstr. 14 + D-37083 Goettingen + Germany +--- + + +Appendix -- the "cdtester" utility: + +/* + * cdtester.c -- test the audio functions of a CD driver + * + * (c) 1995 Eberhard Moenkeberg <emoenke@gwdg.de> + * published under the GPL + * + * made under heavy use of the "Tiny Audio CD Player" + * from Werner Zimmermann <zimmerma@rz.fht-esslingen.de> + * (see linux/drivers/block/README.aztcd) + */ +#undef AZT_PRIVATE_IOCTLS /* not supported by every CDROM driver */ +#define SBP_PRIVATE_IOCTLS /* not supported by every CDROM driver */ + +#include <stdio.h> +#include <stdio.h> +#include <malloc.h> +#include <sys/ioctl.h> +#include <linux/cdrom.h> + +#ifdef AZT_PRIVATE_IOCTLS +#include <linux/../../drivers/cdrom/aztcd.h> +#endif AZT_PRIVATE_IOCTLS +#ifdef SBP_PRIVATE_IOCTLS +#include <linux/../../drivers/cdrom/sbpcd.h> +#include <linux/fs.h> +#endif SBP_PRIVATE_IOCTLS + +struct cdrom_tochdr hdr; +struct cdrom_tochdr tocHdr; +struct cdrom_tocentry TocEntry[101]; +struct cdrom_tocentry entry; +struct cdrom_multisession ms_info; +struct cdrom_read_audio read_audio; +struct cdrom_ti ti; +struct cdrom_subchnl subchnl; +struct cdrom_msf msf; +struct cdrom_volctrl volctrl; +#ifdef AZT_PRIVATE_IOCTLS +union +{ + struct cdrom_msf msf; + unsigned char buf[CD_FRAMESIZE_RAW]; +} azt; +#endif AZT_PRIVATE_IOCTLS +int i, i1, i2, i3, j, k; +unsigned char sequence=0; +unsigned char command[80]; +unsigned char first=1, last=1; +char *default_device="/dev/cdrom"; +char dev[20]; +char filename[20]; +int drive; +int datafile; +int rc; + +void help(void) +{ + printf("Available Commands:\n"); + printf("STOP s EJECT e QUIT q\n"); + printf("PLAY TRACK t PAUSE p RESUME r\n"); + printf("NEXT TRACK n REPEAT LAST l HELP h\n"); + printf("SUBCHANNEL_Q c TRACK INFO i PLAY AT a\n"); + printf("READ d READ RAW w READ AUDIO A\n"); + printf("MS-INFO M TOC T START S\n"); + printf("SET EJECTSW X DEVICE D DEBUG Y\n"); + printf("AUDIO_BUFSIZ Z RESET R SET VOLUME v\n"); + printf("GET VOLUME V\n"); +} + +/* + * convert MSF number (3 bytes only) to Logical_Block_Address + */ +int msf2lba(u_char *msf) +{ + int i; + + i=(msf[0] * CD_SECS + msf[1]) * CD_FRAMES + msf[2] - CD_BLOCK_OFFSET; + if (i<0) return (0); + return (i); +} +/* + * convert logical_block_address to m-s-f_number (3 bytes only) + */ +void lba2msf(int lba, unsigned char *msf) +{ + lba += CD_BLOCK_OFFSET; + msf[0] = lba / (CD_SECS*CD_FRAMES); + lba %= CD_SECS*CD_FRAMES; + msf[1] = lba / CD_FRAMES; + msf[2] = lba % CD_FRAMES; +} + +int init_drive(char *dev) +{ + unsigned char msf_ent[3]; + + /* + * open the device + */ + drive=open(dev,0); + if (drive<0) return (-1); + /* + * get TocHeader + */ + printf("getting TocHeader...\n"); + rc=ioctl(drive,CDROMREADTOCHDR,&hdr); + if (rc!=0) + { + printf("can't get TocHeader (error %d).\n",rc); + return (-2); + } + else + first=hdr.cdth_trk0; + last=hdr.cdth_trk1; + printf("TocHeader: %d %d\n",hdr.cdth_trk0,hdr.cdth_trk1); + /* + * get and display all TocEntries + */ + printf("getting TocEntries...\n"); + for (i=1;i<=hdr.cdth_trk1+1;i++) + { + if (i!=hdr.cdth_trk1+1) TocEntry[i].cdte_track = i; + else TocEntry[i].cdte_track = CDROM_LEADOUT; + TocEntry[i].cdte_format = CDROM_LBA; + rc=ioctl(drive,CDROMREADTOCENTRY,&TocEntry[i]); + if (rc!=0) + { + printf("can't get TocEntry #%d (error %d).\n",i,rc); + } + else + { + lba2msf(TocEntry[i].cdte_addr.lba,&msf_ent[0]); + if (TocEntry[i].cdte_track==CDROM_LEADOUT) + { + printf("TocEntry #%02X: %1X %1X %02d:%02d:%02d (lba: 0x%06X) %02X\n", + TocEntry[i].cdte_track, + TocEntry[i].cdte_adr, + TocEntry[i].cdte_ctrl, + msf_ent[0], + msf_ent[1], + msf_ent[2], + TocEntry[i].cdte_addr.lba, + TocEntry[i].cdte_datamode); + } + else + { + printf("TocEntry #%02d: %1X %1X %02d:%02d:%02d (lba: 0x%06X) %02X\n", + TocEntry[i].cdte_track, + TocEntry[i].cdte_adr, + TocEntry[i].cdte_ctrl, + msf_ent[0], + msf_ent[1], + msf_ent[2], + TocEntry[i].cdte_addr.lba, + TocEntry[i].cdte_datamode); + } + } + } + return (hdr.cdth_trk1); /* number of tracks */ +} + +void display(int size,unsigned char *buffer) +{ + k=0; + getchar(); + for (i=0;i<(size+1)/16;i++) + { + printf("%4d:",i*16); + for (j=0;j<16;j++) + { + printf(" %02X",buffer[i*16+j]); + } + printf(" "); + for (j=0;j<16;j++) + { + if (isalnum(buffer[i*16+j])) + printf("%c",buffer[i*16+j]); + else + printf("."); + } + printf("\n"); + k++; + if (k>=20) + { + printf("press ENTER to continue\n"); + getchar(); + k=0; + } + } +} + +main(int argc, char *argv[]) +{ + printf("\nTesting tool for a CDROM driver's audio functions V0.1\n"); + printf("(C) 1995 Eberhard Moenkeberg <emoenke@gwdg.de>\n"); + printf("initializing...\n"); + + rc=init_drive(default_device); + if (rc<0) printf("could not open %s (rc=%d).\n",default_device,rc); + help(); + while (1) + { + printf("Give a one-letter command (h = help): "); + scanf("%s",command); + command[1]=0; + switch (command[0]) + { + case 'D': + printf("device name (f.e. /dev/sbpcd3): ? "); + scanf("%s",&dev); + close(drive); + rc=init_drive(dev); + if (rc<0) printf("could not open %s (rc %d).\n",dev,rc); + break; + case 'e': + rc=ioctl(drive,CDROMEJECT); + if (rc<0) printf("CDROMEJECT: rc=%d.\n",rc); + break; + case 'p': + rc=ioctl(drive,CDROMPAUSE); + if (rc<0) printf("CDROMPAUSE: rc=%d.\n",rc); + break; + case 'r': + rc=ioctl(drive,CDROMRESUME); + if (rc<0) printf("CDROMRESUME: rc=%d.\n",rc); + break; + case 's': + rc=ioctl(drive,CDROMSTOP); + if (rc<0) printf("CDROMSTOP: rc=%d.\n",rc); + break; + case 'S': + rc=ioctl(drive,CDROMSTART); + if (rc<0) printf("CDROMSTART: rc=%d.\n",rc); + break; + case 't': + rc=ioctl(drive,CDROMREADTOCHDR,&tocHdr); + if (rc<0) + { + printf("CDROMREADTOCHDR: rc=%d.\n",rc); + break; + } + first=tocHdr.cdth_trk0; + last= tocHdr.cdth_trk1; + if ((first==0)||(first>last)) + { + printf ("--got invalid TOC data.\n"); + } + else + { + printf("--enter track number(first=%d, last=%d): ",first,last); + scanf("%d",&i1); + ti.cdti_trk0=i1; + if (ti.cdti_trk0<first) ti.cdti_trk0=first; + if (ti.cdti_trk0>last) ti.cdti_trk0=last; + ti.cdti_ind0=0; + ti.cdti_trk1=last; + ti.cdti_ind1=0; + rc=ioctl(drive,CDROMSTOP); + rc=ioctl(drive,CDROMPLAYTRKIND,&ti); + if (rc<0) printf("CDROMPLAYTRKIND: rc=%d.\n",rc); + } + break; + case 'n': + rc=ioctl(drive,CDROMSTOP); + if (++ti.cdti_trk0>last) ti.cdti_trk0=last; + ti.cdti_ind0=0; + ti.cdti_trk1=last; + ti.cdti_ind1=0; + rc=ioctl(drive,CDROMPLAYTRKIND,&ti); + if (rc<0) printf("CDROMPLAYTRKIND: rc=%d.\n",rc); + break; + case 'l': + rc=ioctl(drive,CDROMSTOP); + if (--ti.cdti_trk0<first) ti.cdti_trk0=first; + ti.cdti_ind0=0; + ti.cdti_trk1=last; + ti.cdti_ind1=0; + rc=ioctl(drive,CDROMPLAYTRKIND,&ti); + if (rc<0) printf("CDROMPLAYTRKIND: rc=%d.\n",rc); + break; + case 'c': + subchnl.cdsc_format=CDROM_MSF; + rc=ioctl(drive,CDROMSUBCHNL,&subchnl); + if (rc<0) printf("CDROMSUBCHNL: rc=%d.\n",rc); + else + { + printf("AudioStatus:%s Track:%d Mode:%d MSF=%02d:%02d:%02d\n", + subchnl.cdsc_audiostatus==CDROM_AUDIO_PLAY ? "PLAYING":"NOT PLAYING", + subchnl.cdsc_trk,subchnl.cdsc_adr, + subchnl.cdsc_absaddr.msf.minute, + subchnl.cdsc_absaddr.msf.second, + subchnl.cdsc_absaddr.msf.frame); + } + break; + case 'i': + printf("Track No.: "); + scanf("%d",&i1); + entry.cdte_track=i1; + if (entry.cdte_track<first) entry.cdte_track=first; + if (entry.cdte_track>last) entry.cdte_track=last; + entry.cdte_format=CDROM_MSF; + rc=ioctl(drive,CDROMREADTOCENTRY,&entry); + if (rc<0) printf("CDROMREADTOCENTRY: rc=%d.\n",rc); + else + { + printf("Mode %d Track, starts at %02d:%02d:%02d\n", + entry.cdte_adr, + entry.cdte_addr.msf.minute, + entry.cdte_addr.msf.second, + entry.cdte_addr.msf.frame); + } + break; + case 'a': + printf("Address (min:sec:frm) "); + scanf("%d:%d:%d",&i1,&i2,&i3); + msf.cdmsf_min0=i1; + msf.cdmsf_sec0=i2; + msf.cdmsf_frame0=i3; + if (msf.cdmsf_sec0>59) msf.cdmsf_sec0=59; + if (msf.cdmsf_frame0>74) msf.cdmsf_frame0=74; + lba2msf(TocEntry[last+1].cdte_addr.lba-1,&msf.cdmsf_min1); + rc=ioctl(drive,CDROMSTOP); + rc=ioctl(drive,CDROMPLAYMSF,&msf); + if (rc<0) printf("CDROMPLAYMSF: rc=%d.\n",rc); + break; + case 'V': + rc=ioctl(drive,CDROMVOLREAD,&volctrl); + if (rc<0) printf("CDROMVOLCTRL: rc=%d.\n",rc); + printf("Volume: channel 0 (left) %d, channel 1 (right) %d\n",volctrl.channel0,volctrl.channel1); + break; + case 'R': + rc=ioctl(drive,CDROMRESET); + if (rc<0) printf("CDROMRESET: rc=%d.\n",rc); + break; +#ifdef AZT_PRIVATE_IOCTLS /*not supported by every CDROM driver*/ + case 'd': + printf("Address (min:sec:frm) "); + scanf("%d:%d:%d",&i1,&i2,&i3); + azt.msf.cdmsf_min0=i1; + azt.msf.cdmsf_sec0=i2; + azt.msf.cdmsf_frame0=i3; + if (azt.msf.cdmsf_sec0>59) azt.msf.cdmsf_sec0=59; + if (azt.msf.cdmsf_frame0>74) azt.msf.cdmsf_frame0=74; + rc=ioctl(drive,CDROMREADMODE1,&azt.msf); + if (rc<0) printf("CDROMREADMODE1: rc=%d.\n",rc); + else display(CD_FRAMESIZE,azt.buf); + break; + case 'w': + printf("Address (min:sec:frame) "); + scanf("%d:%d:%d",&i1,&i2,&i3); + azt.msf.cdmsf_min0=i1; + azt.msf.cdmsf_sec0=i2; + azt.msf.cdmsf_frame0=i3; + if (azt.msf.cdmsf_sec0>59) azt.msf.cdmsf_sec0=59; + if (azt.msf.cdmsf_frame0>74) azt.msf.cdmsf_frame0=74; + rc=ioctl(drive,CDROMREADMODE2,&azt.msf); + if (rc<0) printf("CDROMREADMODE2: rc=%d.\n",rc); + else display(CD_FRAMESIZE_RAW,azt.buf); /* currently only 2336 */ + break; +#endif + case 'v': + printf("--Channel 0 (Left) (0-255): "); + scanf("%d",&i1); + volctrl.channel0=i1; + printf("--Channel 1 (Right) (0-255): "); + scanf("%d",&i1); + volctrl.channel1=i1; + volctrl.channel2=0; + volctrl.channel3=0; + rc=ioctl(drive,CDROMVOLCTRL,&volctrl); + if (rc<0) printf("CDROMVOLCTRL: rc=%d.\n",rc); + break; + case 'q': + close(drive); + exit(0); + case 'h': + help(); + break; + case 'T': /* display TOC entry - without involving the driver */ + scanf("%d",&i); + if ((i<hdr.cdth_trk0)||(i>hdr.cdth_trk1)) + printf("invalid track number.\n"); + else + printf("TocEntry %02d: adr=%01X ctrl=%01X msf=%02d:%02d:%02d mode=%02X\n", + TocEntry[i].cdte_track, + TocEntry[i].cdte_adr, + TocEntry[i].cdte_ctrl, + TocEntry[i].cdte_addr.msf.minute, + TocEntry[i].cdte_addr.msf.second, + TocEntry[i].cdte_addr.msf.frame, + TocEntry[i].cdte_datamode); + break; + case 'A': /* read audio data into file */ + printf("Address (min:sec:frm) ? "); + scanf("%d:%d:%d",&i1,&i2,&i3); + read_audio.addr.msf.minute=i1; + read_audio.addr.msf.second=i2; + read_audio.addr.msf.frame=i3; + read_audio.addr_format=CDROM_MSF; + printf("# of frames ? "); + scanf("%d",&i1); + read_audio.nframes=i1; + k=read_audio.nframes*CD_FRAMESIZE_RAW; + read_audio.buf=malloc(k); + if (read_audio.buf==NULL) + { + printf("can't malloc %d bytes.\n",k); + break; + } + sprintf(filename,"audio_%02d%02d%02d_%02d.%02d\0", + read_audio.addr.msf.minute, + read_audio.addr.msf.second, + read_audio.addr.msf.frame, + read_audio.nframes, + ++sequence); + datafile=creat(filename, 0755); + if (datafile<0) + { + printf("can't open datafile %s.\n",filename); + break; + } + rc=ioctl(drive,CDROMREADAUDIO,&read_audio); + if (rc!=0) + { + printf("CDROMREADAUDIO: rc=%d.\n",rc); + } + else + { + rc=write(datafile,&read_audio.buf,k); + if (rc!=k) printf("datafile I/O error (%d).\n",rc); + } + close(datafile); + break; + case 'X': /* set EJECT_SW (0: disable, 1: enable auto-ejecting) */ + scanf("%d",&i); + rc=ioctl(drive,CDROMEJECT_SW,i); + if (rc!=0) + printf("CDROMEJECT_SW: rc=%d.\n",rc); + else + printf("EJECT_SW set to %d\n",i); + break; + case 'M': /* get the multisession redirection info */ + ms_info.addr_format=CDROM_LBA; + rc=ioctl(drive,CDROMMULTISESSION,&ms_info); + if (rc!=0) + { + printf("CDROMMULTISESSION(lba): rc=%d.\n",rc); + } + else + { + if (ms_info.xa_flag) printf("MultiSession offset (lba): %d (0x%06X)\n",ms_info.addr.lba,ms_info.addr.lba); + else + { + printf("this CD is not an XA disk.\n"); + break; + } + } + ms_info.addr_format=CDROM_MSF; + rc=ioctl(drive,CDROMMULTISESSION,&ms_info); + if (rc!=0) + { + printf("CDROMMULTISESSION(msf): rc=%d.\n",rc); + } + else + { + if (ms_info.xa_flag) + printf("MultiSession offset (msf): %02d:%02d:%02d (0x%02X%02X%02X)\n", + ms_info.addr.msf.minute, + ms_info.addr.msf.second, + ms_info.addr.msf.frame, + ms_info.addr.msf.minute, + ms_info.addr.msf.second, + ms_info.addr.msf.frame); + else printf("this CD is not an XA disk.\n"); + } + break; +#ifdef SBP_PRIVATE_IOCTLS + case 'Y': /* set the driver's message level */ +#if 0 /* not implemented yet */ + printf("enter switch name (f.e. DBG_CMD): "); + scanf("%s",&dbg_switch); + j=get_dbg_num(dbg_switch); +#else + printf("enter DDIOCSDBG switch number: "); + scanf("%d",&j); +#endif + printf("enter 0 for \"off\", 1 for \"on\": "); + scanf("%d",&i); + if (i==0) j|=0x80; + printf("calling \"ioctl(drive,DDIOCSDBG,%d)\"\n",j); + rc=ioctl(drive,DDIOCSDBG,j); + printf("DDIOCSDBG: rc=%d.\n",rc); + break; + case 'Z': /* set the audio buffer size */ + printf("# frames wanted: ? "); + scanf("%d",&j); + rc=ioctl(drive,CDROMAUDIOBUFSIZ,j); + printf("%d frames granted.\n",rc); + break; +#endif SBP_PRIVATE_IOCTLS + default: + printf("unknown command: \"%s\".\n",command); + break; + } + } +} +/*==========================================================================*/ + diff --git a/Documentation/cdrom/sjcd b/Documentation/cdrom/sjcd new file mode 100644 index 000000000000..74a14847b93a --- /dev/null +++ b/Documentation/cdrom/sjcd @@ -0,0 +1,60 @@ + -- Documentation/cdrom/sjcd + 80% of the work takes 20% of the time, + 20% of the work takes 80% of the time... + (Murphy's law) + + Once started, training can not be stopped... + (Star Wars) + +This is the README for the sjcd cdrom driver, version 1.6. + +This file is meant as a tips & tricks edge for the usage of the SANYO CDR-H94A +cdrom drive. It will grow as the questions arise. ;-) +For info on configuring the ISP16 sound card look at Documentation/cdrom/isp16. + +The driver should work with any of the Panasonic, Sony or Mitsumi style +CDROM interfaces. +The cdrom interface on Media Magic's soft configurable sound card ISP16, +which used to be included in the driver, is now supported in a separate module. +This initialisation module will probably also work with other interfaces +based on an OPTi 82C928 or 82C929 chip (like MAD16 and Mozart): see the +documentation Documentation/cdrom/isp16. + +The device major for sjcd is 18, and minor is 0. Create a block special +file in your /dev directory (e.g., /dev/sjcd) with these numbers. +(For those who don't know, being root and doing the following should do +the trick: + mknod -m 644 /dev/sjcd b 18 0 +and mount the cdrom by /dev/sjcd). + +The default configuration parameters are: + base address 0x340 + no irq + no dma +(Actually the CDR-H94A doesn't know how to use irq and dma.) +As of version 1.2, setting base address at boot time is supported +through the use of command line options: type at the "boot:" prompt: + linux sjcd=<base_address> +(where you would use the kernel labeled "linux" in lilo's configuration +file /etc/lilo.conf). You could also use 'append="sjcd=<configuration_info>"' +in the appropriate section of /etc/lilo.conf +If you're building a kernel yourself you can set your default base +i/o address with SJCD_BASE_ADDR in /usr/src/linux/drivers/cdrom/sjcd.h. + +The sjcd driver supports being loaded as a module. The following +command will set the base i/o address on the fly (assuming you +have installed the module in an appropriate place). + insmod sjcd.o sjcd_base=<base_address> + + +Have fun! + +If something is wrong, please email to vadim@rbrf.ru + or vadim@ipsun.ras.ru + or model@cecmow.enet.dec.com + or H.T.M.v.d.Maarel@marin.nl + +It happens sometimes that Vadim is not reachable by mail. For these +instances, Eric van der Maarel will help too. + + Vadim V. Model, Eric van der Maarel, Eberhard Moenkeberg diff --git a/Documentation/cdrom/sonycd535 b/Documentation/cdrom/sonycd535 new file mode 100644 index 000000000000..59581a4b302a --- /dev/null +++ b/Documentation/cdrom/sonycd535 @@ -0,0 +1,121 @@ + README FOR LINUX SONY CDU-535/531 DRIVER + ======================================== + +This is the Sony CDU-535 (and 531) driver version 0.7 for Linux. +I do not think I have the documentation to add features like DMA support +so if anyone else wants to pursue it or help me with it, please do. +(I need to see what was done for the CDU-31A driver -- perhaps I can +steal some of that code.) + +This is a Linux device driver for the Sony CDU-535 CDROM drive. This is +one of the older Sony drives with its own interface card (Sony bus). +The DOS driver for this drive is named SONY_CDU.SYS - when you boot DOS +your drive should be identified as a SONY CDU-535. The driver works +with a CDU-531 also. One user reported that the driver worked on drives +OEM'ed by Procomm, drive and interface board were labelled Procomm. + +The Linux driver is based on Corey Minyard's sonycd 0.3 driver for +the CDU-31A. Ron Jeppesen just changed the commands that were sent +to the drive to correspond to the CDU-535 commands and registers. +There were enough changes to let bugs creep in but it seems to be stable. +Ron was able to tar an entire CDROM (should read all blocks) and built +ghostview and xfig off Walnut Creek's X11R5/GNU CDROM. xcdplayer and +workman work with the driver. Others have used the driver without +problems except those dealing with wait loops (fixed in third release). +Like Minyard's original driver this one uses a polled interface (this +is also the default setup for the DOS driver). It has not been tried +with interrupts or DMA enabled on the board. + +REQUIREMENTS +============ + + - Sony CDU-535 drive, preferably without interrupts and DMA + enabled on the card. + + - Drive must be set up as unit 1. Only the first unit will be + recognized + + - You must enter your interface address into + /usr/src/linux/drivers/cdrom/sonycd535.h and build the + appropriate kernel or use the "kernel command line" parameter + sonycd535=0x320 + with the correct interface address. + +NOTES: +====== + +1) The drive MUST be turned on when booting or it will not be recognized! + (but see comments on modularized version below) + +2) when the cdrom device is opened the eject button is disabled to keep the + user from ejecting a mounted disk and replacing it with another. + Unfortunately xcdplayer and workman also open the cdrom device so you + have to use the eject button in the software. Keep this in mind if your + cdrom player refuses to give up its disk -- exit workman or xcdplayer, or + umount the drive if it has been mounted. + +THANKS +====== + +Many thanks to Ron Jeppesen (ronj.an@site007.saic.com) for getting +this project off the ground. He wrote the initial release +and the first two patches to this driver (0.1, 0.2, and 0.3). +Thanks also to Eberhard Moenkeberg (emoenke@gwdg.de) for prodding +me to place this code into the mainstream Linux source tree +(as of Linux version 1.1.91), as well as some patches to make +it a better device citizen. Further thanks to Joel Katz +<joelkatz@webchat.org> for his MODULE patches (see details below), +Porfiri Claudio <C.Porfiri@nisms.tei.ericsson.se> for patches +to make the driver work with the older CDU-510/515 series, and +Heiko Eissfeldt <heiko@colossus.escape.de> for pointing out that +the verify_area() checks were ignoring the results of said checks. + +(Acknowledgments from Ron Jeppesen in the 0.3 release:) +Thanks to Corey Minyard who wrote the original CDU-31A driver on which +this driver is based. Thanks to Ken Pizzini and Bob Blair who provided +patches and feedback on the first release of this driver. + +Ken Pizzini +ken@halcyon.com + +------------------------------------------------------------------------------ +(The following is from Joel Katz <joelkatz@webchat.org>.) + + To build a version of sony535.o that can be installed as a module, +use the following command: + +gcc -c -D__KERNEL__ -DMODULE -O2 sonycd535.c -o sonycd535.o + + To install the module, simply type: + +insmod sony535.o + or +insmod sony535.o sonycd535=<address> + + And to remove it: + +rmmod sony535 + + The code checks to see if MODULE is defined and behaves as it used +to if MODULE is not defined. That means your patched file should behave +exactly as it used to if compiled into the kernel. + + I have an external drive, and I usually leave it powered off. I used +to have to reboot if I needed to use the CDROM drive. Now I don't. + + Even if you have an internal drive, why waste the 96K of memory +(unswappable) that the driver uses if you use your CD-ROM drive infrequently? + + This driver will not install (whether compiled in or loaded as a +module) if the CDROM drive is not available during its initialization. This +means that you can have the driver compiled into the kernel and still load +the module later (assuming the driver doesn't install itself during +power-on). This only wastes 12K when you boot with the CDROM drive off. + + This is what I usually do; I leave the driver compiled into the +kernel, but load it as a module if I powered the system up with the drive +off and then later decided to use the CDROM drive. + + Since the driver only uses a single page to point to the chunks, +attempting to set the buffer cache to more than 2 Megabytes would be very +bad; don't do that. diff --git a/Documentation/cli-sti-removal.txt b/Documentation/cli-sti-removal.txt new file mode 100644 index 000000000000..0223c9d20331 --- /dev/null +++ b/Documentation/cli-sti-removal.txt @@ -0,0 +1,133 @@ + +#### cli()/sti() removal guide, started by Ingo Molnar <mingo@redhat.com> + + +as of 2.5.28, five popular macros have been removed on SMP, and +are being phased out on UP: + + cli(), sti(), save_flags(flags), save_flags_cli(flags), restore_flags(flags) + +until now it was possible to protect driver code against interrupt +handlers via a cli(), but from now on other, more lightweight methods +have to be used for synchronization, such as spinlocks or semaphores. + +for example, driver code that used to do something like: + + struct driver_data; + + irq_handler (...) + { + .... + driver_data.finish = 1; + driver_data.new_work = 0; + .... + } + + ... + + ioctl_func (...) + { + ... + cli(); + ... + driver_data.finish = 0; + driver_data.new_work = 2; + ... + sti(); + ... + } + +was SMP-correct because the cli() function ensured that no +interrupt handler (amongst them the above irq_handler()) function +would execute while the cli()-ed section is executing. + +but from now on a more direct method of locking has to be used: + + spinlock_t driver_lock = SPIN_LOCK_UNLOCKED; + struct driver_data; + + irq_handler (...) + { + unsigned long flags; + .... + spin_lock_irqsave(&driver_lock, flags); + .... + driver_data.finish = 1; + driver_data.new_work = 0; + .... + spin_unlock_irqrestore(&driver_lock, flags); + .... + } + + ... + + ioctl_func (...) + { + ... + spin_lock_irq(&driver_lock); + ... + driver_data.finish = 0; + driver_data.new_work = 2; + ... + spin_unlock_irq(&driver_lock); + ... + } + +the above code has a number of advantages: + +- the locking relation is easier to understand - actual lock usage + pinpoints the critical sections. cli() usage is too opaque. + Easier to understand means it's easier to debug. + +- it's faster, because spinlocks are faster to acquire than the + potentially heavily-used IRQ lock. Furthermore, your driver does + not have to wait eg. for a big heavy SCSI interrupt to finish, + because the driver_lock spinlock is only used by your driver. + cli() on the other hand was used by many drivers, and extended + the critical section to the whole IRQ handler function - creating + serious lock contention. + + +to make the transition easier, we've still kept the cli(), sti(), +save_flags(), save_flags_cli() and restore_flags() macros defined +on UP systems - but their usage will be phased out until 2.6 is +released. + +drivers that want to disable local interrupts (interrupts on the +current CPU), can use the following five macros: + + local_irq_disable(), local_irq_enable(), local_save_flags(flags), + local_irq_save(flags), local_irq_restore(flags) + +but beware, their meaning and semantics are much simpler, far from +that of the old cli(), sti(), save_flags(flags) and restore_flags(flags) +SMP meaning: + + local_irq_disable() => turn local IRQs off + + local_irq_enable() => turn local IRQs on + + local_save_flags(flags) => save the current IRQ state into flags. The + state can be on or off. (on some + architectures there's even more bits in it.) + + local_irq_save(flags) => save the current IRQ state into flags and + disable interrupts. + + local_irq_restore(flags) => restore the IRQ state from flags. + +(local_irq_save can save both irqs on and irqs off state, and +local_irq_restore can restore into both irqs on and irqs off state.) + +another related change is that synchronize_irq() now takes a parameter: +synchronize_irq(irq). This change too has the purpose of making SMP +synchronization more lightweight - this way you can wait for your own +interrupt handler to finish, no need to wait for other IRQ sources. + + +why were these changes done? The main reason was the architectural burden +of maintaining the cli()/sti() interface - it became a real problem. The +new interrupt system is much more streamlined, easier to understand, debug, +and it's also a bit faster - the same happened to it that will happen to +cli()/sti() using drivers once they convert to spinlocks :-) + diff --git a/Documentation/computone.txt b/Documentation/computone.txt new file mode 100644 index 000000000000..b1cf59b84d97 --- /dev/null +++ b/Documentation/computone.txt @@ -0,0 +1,588 @@ +NOTE: This is an unmaintained driver. It is not guaranteed to work due to +changes made in the tty layer in 2.6. If you wish to take over maintenance of +this driver, contact Michael Warfield <mhw@wittsend.com>. + +Changelog: +---------- +11-01-2001: Original Document + +10-29-2004: Minor misspelling & format fix, update status of driver. + James Nelson <james4765@gmail.com> + +Computone Intelliport II/Plus Multiport Serial Driver +----------------------------------------------------- + +Release Notes For Linux Kernel 2.2 and higher. +These notes are for the drivers which have already been integrated into the +kernel and have been tested on Linux kernels 2.0, 2.2, 2.3, and 2.4. + +Version: 1.2.14 +Date: 11/01/2001 +Historical Author: Andrew Manison <amanison@america.net> +Primary Author: Doug McNash +Support: support@computone.com +Fixes and Updates: Mike Warfield <mhw@wittsend.com> + +This file assumes that you are using the Computone drivers which are +integrated into the kernel sources. For updating the drivers or installing +drivers into kernels which do not already have Computone drivers, please +refer to the instructions in the README.computone file in the driver patch. + + +1. INTRODUCTION + +This driver supports the entire family of Intelliport II/Plus controllers +with the exception of the MicroChannel controllers. It does not support +products previous to the Intelliport II. + +This driver was developed on the v2.0.x Linux tree and has been tested up +to v2.4.14; it will probably not work with earlier v1.X kernels,. + + +2. QUICK INSTALLATION + +Hardware - If you have an ISA card, find a free interrupt and io port. + List those in use with `cat /proc/interrupts` and + `cat /proc/ioports`. Set the card dip switches to a free + address. You may need to configure your BIOS to reserve an + irq for an ISA card. PCI and EISA parameters are set + automagically. Insert card into computer with the power off + before or after drivers installation. + + Note the hardware address from the Computone ISA cards installed into + the system. These are required for editing ip2.c or editing + /etc/modprobe.conf, or for specification on the modprobe + command line. + + Note that the /etc/modules.conf should be used for older (pre-2.6) + kernels. + +Software - + +Module installation: + +a) Determine free irq/address to use if any (configure BIOS if need be) +b) Run "make config" or "make menuconfig" or "make xconfig" + Select (m) module for CONFIG_COMPUTONE under character + devices. CONFIG_PCI and CONFIG_MODULES also may need to be set. +c) Set address on ISA cards then: + edit /usr/src/linux/drivers/char/ip2.c if needed + or + edit /etc/modprobe.conf if needed (module). + or both to match this setting. +d) Run "make modules" +e) Run "make modules_install" +f) Run "/sbin/depmod -a" +g) install driver using `modprobe ip2 <options>` (options listed below) +h) run ip2mkdev (either the script below or the binary version) + + +Kernel installation: + +a) Determine free irq/address to use if any (configure BIOS if need be) +b) Run "make config" or "make menuconfig" or "make xconfig" + Select (y) kernel for CONFIG_COMPUTONE under character + devices. CONFIG_PCI may need to be set if you have PCI bus. +c) Set address on ISA cards then: + edit /usr/src/linux/drivers/char/ip2.c + (Optional - may be specified on kernel command line now) +d) Run "make zImage" or whatever target you prefer. +e) mv /usr/src/linux/arch/i386/boot/zImage to /boot. +f) Add new config for this kernel into /etc/lilo.conf, run "lilo" + or copy to a floppy disk and boot from that floppy disk. +g) Reboot using this kernel +h) run ip2mkdev (either the script below or the binary version) + +Kernel command line options: + +When compiling the driver into the kernel, io and irq may be +compiled into the driver by editing ip2.c and setting the values for +io and irq in the appropriate array. An alternative is to specify +a command line parameter to the kernel at boot up. + + ip2=io0,irq0,io1,irq1,io2,irq2,io3,irq3 + +Note that this order is very different from the specifications for the +modload parameters which have separate IRQ and IO specifiers. + +The io port also selects PCI (1) and EISA (2) boards. + + io=0 No board + io=1 PCI board + io=2 EISA board + else ISA board io address + +You only need to specify the boards which are present. + + Examples: + + 2 PCI boards: + + ip2=1,0,1,0 + + 1 ISA board at 0x310 irq 5: + + ip2=0x310,5 + +This can be added to and "append" option in lilo.conf similar to this: + + append="ip2=1,0,1,0" + + +3. INSTALLATION + +Previously, the driver sources were packaged with a set of patch files +to update the character drivers' makefile and configuration file, and other +kernel source files. A build script (ip2build) was included which applies +the patches if needed, and build any utilities needed. +What you receive may be a single patch file in conventional kernel +patch format build script. That form can also be applied by +running patch -p1 < ThePatchFile. Otherwise run ip2build. + +The driver can be installed as a module (recommended) or built into the +kernel. This is selected as for other drivers through the `make config` +command from the root of the Linux source tree. If the driver is built +into the kernel you will need to edit the file ip2.c to match the boards +you are installing. See that file for instructions. If the driver is +installed as a module the configuration can also be specified on the +modprobe command line as follows: + + modprobe ip2 irq=irq1,irq2,irq3,irq4 io=addr1,addr2,addr3,addr4 + +where irqnum is one of the valid Intelliport II interrupts (3,4,5,7,10,11, +12,15) and addr1-4 are the base addresses for up to four controllers. If +the irqs are not specified the driver uses the default in ip2.c (which +selects polled mode). If no base addresses are specified the defaults in +ip2.c are used. If you are autoloading the driver module with kerneld or +kmod the base addresses and interrupt number must also be set in ip2.c +and recompile or just insert and options line in /etc/modprobe.conf or both. +The options line is equivalent to the command line and takes precedence over +what is in ip2.c. + +/etc/modprobe.conf sample: + options ip2 io=1,0x328 irq=1,10 + alias char-major-71 ip2 + alias char-major-72 ip2 + alias char-major-73 ip2 + +The equivalent in ip2.c: + +static int io[IP2_MAX_BOARDS]= { 1, 0x328, 0, 0 }; +static int irq[IP2_MAX_BOARDS] = { 1, 10, -1, -1 }; + +The equivalent for the kernel command line (in lilo.conf): + + append="ip2=1,1,0x328,10" + + +Note: Both io and irq should be updated to reflect YOUR system. An "io" + address of 1 or 2 indicates a PCI or EISA card in the board table. + The PCI or EISA irq will be assigned automatically. + +Specifying an invalid or in-use irq will default the driver into +running in polled mode for that card. If all irq entries are 0 then +all cards will operate in polled mode. + +If you select the driver as part of the kernel run : + + make zlilo (or whatever you do to create a bootable kernel) + +If you selected a module run : + + make modules && make modules_install + +The utility ip2mkdev (see 5 and 7 below) creates all the device nodes +required by the driver. For a device to be created it must be configured +in the driver and the board must be installed. Only devices corresponding +to real IntelliPort II ports are created. With multiple boards and expansion +boxes this will leave gaps in the sequence of device names. ip2mkdev uses +Linux tty naming conventions: ttyF0 - ttyF255 for normal devices, and +cuf0 - cuf255 for callout devices. + +If you are using devfs, existing devices are automatically created within +the devfs name space. Normal devices will be tts/F0 - tts/F255 and callout +devices will be cua/F0 - cua/F255. With devfs installed, ip2mkdev will +create symbolic links in /dev from the old conventional names to the newer +devfs names as follows: + + /dev/ip2ipl[n] -> /dev/ip2/ipl[n] n = 0 - 3 + /dev/ip2stat[n] -> /dev/ip2/stat[n] n = 0 - 3 + /dev/ttyF[n] -> /dev/tts/F[n] n = 0 - 255 + /dev/cuf[n] -> /dev/cua/F[n] n = 0 - 255 + +Only devices for existing ports and boards will be created. + +IMPORTANT NOTE: The naming convention used for devfs by this driver +was changed from 1.2.12 to 1.2.13. The old naming convention was to +use ttf/%d for the tty device and cuf/%d for the cua device. That +has been changed to conform to an agreed-upon standard of placing +all the tty devices under tts. The device names are now tts/F%d for +the tty device and cua/F%d for the cua devices. If you were using +the older devfs names, you must update for the newer convention. + +You do not need to run ip2mkdev if you are using devfs and only want to +use the devfs native device names. + + +4. USING THE DRIVERS + +As noted above, the driver implements the ports in accordance with Linux +conventions, and the devices should be interchangeable with the standard +serial devices. (This is a key point for problem reporting: please make +sure that what you are trying do works on the ttySx/cuax ports first; then +tell us what went wrong with the ip2 ports!) + +Higher speeds can be obtained using the setserial utility which remaps +38,400 bps (extb) to 57,600 bps, 115,200 bps, or a custom speed. +Intelliport II installations using the PowerPort expansion module can +use the custom speed setting to select the highest speeds: 153,600 bps, +230,400 bps, 307,200 bps, 460,800bps and 921,600 bps. The base for +custom baud rate configuration is fixed at 921,600 for cards/expansion +modules with ST654's and 115200 for those with Cirrus CD1400's. This +corresponds to the maximum bit rates those chips are capable. +For example if the baud base is 921600 and the baud divisor is 18 then +the custom rate is 921600/18 = 51200 bps. See the setserial man page for +complete details. Of course if stty accepts the higher rates now you can +use that as well as the standard ioctls(). + + +5. ip2mkdev and assorted utilities... + +Several utilities, including the source for a binary ip2mkdev utility are +available under .../drivers/char/ip2. These can be build by changing to +that directory and typing "make" after the kernel has be built. If you do +not wish to compile the binary utilities, the shell script below can be +cut out and run as "ip2mkdev" to create the necessary device files. To +use the ip2mkdev script, you must have procfs enabled and the proc file +system mounted on /proc. + +You do not need to run ip2mkdev if you are using devfs and only want to +use the devfs native device names. + + +6. DEVFS + +DEVFS is the DEVice File System available as an add on package for the +2.2.x kernels and available as a configuration option in 2.3.46 and higher. +Devfs allows for the automatic creation and management of device names +under control of the device drivers themselves. The Devfs namespace is +hierarchical and reduces the clutter present in the normal flat /dev +namespace. Devfs names and conventional device names may be intermixed. +A userspace daemon, devfsd, exists to allow for automatic creation and +management of symbolic links from the devfs name space to the conventional +names. More details on devfs can be found on the DEVFS home site at +<http://www.atnf.csiro.au/~rgooch/linux/> or in the file kernel +documentation files, .../linux/Documentation/filesystems/devfs/README. + +If you are using devfs, existing devices are automatically created within +the devfs name space. Normal devices will be tts/F0 - tts/F255 and callout +devices will be cua/F0 - cua/F255. With devfs installed, ip2mkdev will +create symbolic links in /dev from the old conventional names to the newer +devfs names as follows: + + /dev/ip2ipl[n] -> /dev/ip2/ipl[n] n = 0 - 3 + /dev/ip2stat[n] -> /dev/ip2/stat[n] n = 0 - 3 + /dev/ttyF[n] -> /dev/tts/F[n] n = 0 - 255 + /dev/cuf[n] -> /dev/cua/F[n] n = 0 - 255 + +Only devices for existing ports and boards will be created. + +IMPORTANT NOTE: The naming convention used for devfs by this driver +was changed from 1.2.12 to 1.2.13. The old naming convention was to +use ttf/%d for the tty device and cuf/%d for the cua device. That +has been changed to conform to an agreed-upon standard of placing +all the tty devices under tts. The device names are now tts/F%d for +the tty device and cua/F%d for the cua devices. If you were using +the older devfs names, you must update for the newer convention. + +You do not need to run ip2mkdev if you are using devfs and only want to +use the devfs native device names. + + +7. NOTES + +This is a release version of the driver, but it is impossible to test it +in all configurations of Linux. If there is any anomalous behaviour that +does not match the standard serial port's behaviour please let us know. + + +8. ip2mkdev shell script + +Previously, this script was simply attached here. It is now attached as a +shar archive to make it easier to extract the script from the documentation. +To create the ip2mkdev shell script change to a convenient directory (/tmp +works just fine) and run the following command: + + unshar Documentation/computone.txt + (This file) + +You should now have a file ip2mkdev in your current working directory with +permissions set to execute. Running that script with then create the +necessary devices for the Computone boards, interfaces, and ports which +are present on you system at the time it is run. + + +#!/bin/sh +# This is a shell archive (produced by GNU sharutils 4.2.1). +# To extract the files from this archive, save it to some FILE, remove +# everything before the `!/bin/sh' line above, then type `sh FILE'. +# +# Made on 2001-10-29 10:32 EST by <mhw@alcove.wittsend.com>. +# Source directory was `/home2/src/tmp'. +# +# Existing files will *not* be overwritten unless `-c' is specified. +# +# This shar contains: +# length mode name +# ------ ---------- ------------------------------------------ +# 4251 -rwxr-xr-x ip2mkdev +# +save_IFS="${IFS}" +IFS="${IFS}:" +gettext_dir=FAILED +locale_dir=FAILED +first_param="$1" +for dir in $PATH +do + if test "$gettext_dir" = FAILED && test -f $dir/gettext \ + && ($dir/gettext --version >/dev/null 2>&1) + then + set `$dir/gettext --version 2>&1` + if test "$3" = GNU + then + gettext_dir=$dir + fi + fi + if test "$locale_dir" = FAILED && test -f $dir/shar \ + && ($dir/shar --print-text-domain-dir >/dev/null 2>&1) + then + locale_dir=`$dir/shar --print-text-domain-dir` + fi +done +IFS="$save_IFS" +if test "$locale_dir" = FAILED || test "$gettext_dir" = FAILED +then + echo=echo +else + TEXTDOMAINDIR=$locale_dir + export TEXTDOMAINDIR + TEXTDOMAIN=sharutils + export TEXTDOMAIN + echo="$gettext_dir/gettext -s" +fi +if touch -am -t 200112312359.59 $$.touch >/dev/null 2>&1 && test ! -f 200112312359.59 -a -f $$.touch; then + shar_touch='touch -am -t $1$2$3$4$5$6.$7 "$8"' +elif touch -am 123123592001.59 $$.touch >/dev/null 2>&1 && test ! -f 123123592001.59 -a ! -f 123123592001.5 -a -f $$.touch; then + shar_touch='touch -am $3$4$5$6$1$2.$7 "$8"' +elif touch -am 1231235901 $$.touch >/dev/null 2>&1 && test ! -f 1231235901 -a -f $$.touch; then + shar_touch='touch -am $3$4$5$6$2 "$8"' +else + shar_touch=: + echo + $echo 'WARNING: not restoring timestamps. Consider getting and' + $echo "installing GNU \`touch', distributed in GNU File Utilities..." + echo +fi +rm -f 200112312359.59 123123592001.59 123123592001.5 1231235901 $$.touch +# +if mkdir _sh17581; then + $echo 'x -' 'creating lock directory' +else + $echo 'failed to create lock directory' + exit 1 +fi +# ============= ip2mkdev ============== +if test -f 'ip2mkdev' && test "$first_param" != -c; then + $echo 'x -' SKIPPING 'ip2mkdev' '(file already exists)' +else + $echo 'x -' extracting 'ip2mkdev' '(text)' + sed 's/^X//' << 'SHAR_EOF' > 'ip2mkdev' && +#!/bin/sh - +# +# ip2mkdev +# +# Make or remove devices as needed for Computone Intelliport drivers +# +# First rule! If the dev file exists and you need it, don't mess +# with it. That prevents us from screwing up open ttys, ownership +# and permissions on a running system! +# +# This script will NOT remove devices that no longer exist if their +# board or interface box has been removed. If you want to get rid +# of them, you can manually do an "rm -f /dev/ttyF* /dev/cuaf*" +# before running this script. Running this script will then recreate +# all the valid devices. +# +# Michael H. Warfield +# /\/\|=mhw=|\/\/ +# mhw@wittsend.com +# +# Updated 10/29/2000 for version 1.2.13 naming convention +# under devfs. /\/\|=mhw=|\/\/ +# +# Updated 03/09/2000 for devfs support in ip2 drivers. /\/\|=mhw=|\/\/ +# +X +if test -d /dev/ip2 ; then +# This is devfs mode... We don't do anything except create symlinks +# from the real devices to the old names! +X cd /dev +X echo "Creating symbolic links to devfs devices" +X for i in `ls ip2` ; do +X if test ! -L ip2$i ; then +X # Remove it incase it wasn't a symlink (old device) +X rm -f ip2$i +X ln -s ip2/$i ip2$i +X fi +X done +X for i in `( cd tts ; ls F* )` ; do +X if test ! -L tty$i ; then +X # Remove it incase it wasn't a symlink (old device) +X rm -f tty$i +X ln -s tts/$i tty$i +X fi +X done +X for i in `( cd cua ; ls F* )` ; do +X DEVNUMBER=`expr $i : 'F\(.*\)'` +X if test ! -L cuf$DEVNUMBER ; then +X # Remove it incase it wasn't a symlink (old device) +X rm -f cuf$DEVNUMBER +X ln -s cua/$i cuf$DEVNUMBER +X fi +X done +X exit 0 +fi +X +if test ! -f /proc/tty/drivers +then +X echo "\ +Unable to check driver status. +Make sure proc file system is mounted." +X +X exit 255 +fi +X +if test ! -f /proc/tty/driver/ip2 +then +X echo "\ +Unable to locate ip2 proc file. +Attempting to load driver" +X +X if /sbin/insmod ip2 +X then +X if test ! -f /proc/tty/driver/ip2 +X then +X echo "\ +Unable to locate ip2 proc file after loading driver. +Driver initialization failure or driver version error. +" +X exit 255 +X fi +X else +X echo "Unable to load ip2 driver." +X exit 255 +X fi +fi +X +# Ok... So we got the driver loaded and we can locate the procfs files. +# Next we need our major numbers. +X +TTYMAJOR=`sed -e '/^ip2/!d' -e '/\/dev\/tt/!d' -e 's/.*tt[^ ]*[ ]*\([0-9]*\)[ ]*.*/\1/' < /proc/tty/drivers` +CUAMAJOR=`sed -e '/^ip2/!d' -e '/\/dev\/cu/!d' -e 's/.*cu[^ ]*[ ]*\([0-9]*\)[ ]*.*/\1/' < /proc/tty/drivers` +BRDMAJOR=`sed -e '/^Driver: /!d' -e 's/.*IMajor=\([0-9]*\)[ ]*.*/\1/' < /proc/tty/driver/ip2` +X +echo "\ +TTYMAJOR = $TTYMAJOR +CUAMAJOR = $CUAMAJOR +BRDMAJOR = $BRDMAJOR +" +X +# Ok... Now we should know our major numbers, if appropriate... +# Now we need our boards and start the device loops. +X +grep '^Board [0-9]:' /proc/tty/driver/ip2 | while read token number type alltherest +do +X # The test for blank "type" will catch the stats lead-in lines +X # if they exist in the file +X if test "$type" = "vacant" -o "$type" = "Vacant" -o "$type" = "" +X then +X continue +X fi +X +X BOARDNO=`expr "$number" : '\([0-9]\):'` +X PORTS=`expr "$alltherest" : '.*ports=\([0-9]*\)' | tr ',' ' '` +X MINORS=`expr "$alltherest" : '.*minors=\([0-9,]*\)' | tr ',' ' '` +X +X if test "$BOARDNO" = "" -o "$PORTS" = "" +X then +# This may be a bug. We should at least get this much information +X echo "Unable to process board line" +X continue +X fi +X +X if test "$MINORS" = "" +X then +# Silently skip this one. This board seems to have no boxes +X continue +X fi +X +X echo "board $BOARDNO: $type ports = $PORTS; port numbers = $MINORS" +X +X if test "$BRDMAJOR" != "" +X then +X BRDMINOR=`expr $BOARDNO \* 4` +X STSMINOR=`expr $BRDMINOR + 1` +X if test ! -c /dev/ip2ipl$BOARDNO ; then +X mknod /dev/ip2ipl$BOARDNO c $BRDMAJOR $BRDMINOR +X fi +X if test ! -c /dev/ip2stat$BOARDNO ; then +X mknod /dev/ip2stat$BOARDNO c $BRDMAJOR $STSMINOR +X fi +X fi +X +X if test "$TTYMAJOR" != "" +X then +X PORTNO=$BOARDBASE +X +X for PORTNO in $MINORS +X do +X if test ! -c /dev/ttyF$PORTNO ; then +X # We got the hardware but no device - make it +X mknod /dev/ttyF$PORTNO c $TTYMAJOR $PORTNO +X fi +X done +X fi +X +X if test "$CUAMAJOR" != "" +X then +X PORTNO=$BOARDBASE +X +X for PORTNO in $MINORS +X do +X if test ! -c /dev/cuf$PORTNO ; then +X # We got the hardware but no device - make it +X mknod /dev/cuf$PORTNO c $CUAMAJOR $PORTNO +X fi +X done +X fi +done +X +Xexit 0 +SHAR_EOF + (set 20 01 10 29 10 32 01 'ip2mkdev'; eval "$shar_touch") && + chmod 0755 'ip2mkdev' || + $echo 'restore of' 'ip2mkdev' 'failed' + if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \ + && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then + md5sum -c << SHAR_EOF >/dev/null 2>&1 \ + || $echo 'ip2mkdev:' 'MD5 check failed' +cb5717134509f38bad9fde6b1f79b4a4 ip2mkdev +SHAR_EOF + else + shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'ip2mkdev'`" + test 4251 -eq "$shar_count" || + $echo 'ip2mkdev:' 'original size' '4251,' 'current size' "$shar_count!" + fi +fi +rm -fr _sh17581 +exit 0 diff --git a/Documentation/cpqarray.txt b/Documentation/cpqarray.txt new file mode 100644 index 000000000000..c7154e20ef5e --- /dev/null +++ b/Documentation/cpqarray.txt @@ -0,0 +1,93 @@ +This driver is for Compaq's SMART2 Intelligent Disk Array Controllers. + +Supported Cards: +---------------- + +This driver is known to work with the following cards: + + * SMART (EISA) + * SMART-2/E (EISA) + * SMART-2/P + * SMART-2DH + * SMART-2SL + * SMART-221 + * SMART-3100ES + * SMART-3200 + * Integrated Smart Array Controller + * SA 4200 + * SA 4250ES + * SA 431 + * RAID LC2 Controller + +It should also work with some really old Disk array adapters, but I am +unable to test against these cards: + + * IDA + * IDA-2 + * IAES + + +EISA Controllers: +----------------- + +If you want to use an EISA controller you'll have to supply some +modprobe/lilo parameters. If the driver is compiled into the kernel, must +give it the controller's IO port address at boot time (it is not +necessary to specify the IRQ). For example, if you had two SMART-2/E +controllers, in EISA slots 1 and 2 you'd give it a boot argument like +this: + + smart2=0x1000,0x2000 + +If you were loading the driver as a module, you'd give load it like this: + + modprobe cpqarray eisa=0x1000,0x2000 + +You can use EISA and PCI adapters at the same time. + + +Device Naming: +-------------- + +You need some entries in /dev for the ida device. MAKEDEV in the /dev +directory can make device nodes for you automatically. The device setup is +as follows: + +Major numbers: + 72 ida0 + 73 ida1 + 74 ida2 + 75 ida3 + 76 ida4 + 77 ida5 + 78 ida6 + 79 ida7 + +Minor numbers: + b7 b6 b5 b4 b3 b2 b1 b0 + |----+----| |----+----| + | | + | +-------- Partition ID (0=wholedev, 1-15 partition) + | + +-------------------- Logical Volume number + +The device naming scheme is: +/dev/ida/c0d0 Controller 0, disk 0, whole device +/dev/ida/c0d0p1 Controller 0, disk 0, partition 1 +/dev/ida/c0d0p2 Controller 0, disk 0, partition 2 +/dev/ida/c0d0p3 Controller 0, disk 0, partition 3 + +/dev/ida/c1d1 Controller 1, disk 1, whole device +/dev/ida/c1d1p1 Controller 1, disk 1, partition 1 +/dev/ida/c1d1p2 Controller 1, disk 1, partition 2 +/dev/ida/c1d1p3 Controller 1, disk 1, partition 3 + + +Changelog: +========== + +10-28-2004 : General cleanup, syntax fixes for in-kernel driver version. + James Nelson <james4765@gmail.com> + + +1999 : Original Document diff --git a/Documentation/cpu-freq/amd-powernow.txt b/Documentation/cpu-freq/amd-powernow.txt new file mode 100644 index 000000000000..254da155fa47 --- /dev/null +++ b/Documentation/cpu-freq/amd-powernow.txt @@ -0,0 +1,38 @@ + +PowerNow! and Cool'n'Quiet are AMD names for frequency +management capabilities in AMD processors. As the hardware +implementation changes in new generations of the processors, +there is a different cpu-freq driver for each generation. + +Note that the driver's will not load on the "wrong" hardware, +so it is safe to try each driver in turn when in doubt as to +which is the correct driver. + +Note that the functionality to change frequency (and voltage) +is not available in all processors. The drivers will refuse +to load on processors without this capability. The capability +is detected with the cpuid instruction. + +The drivers use BIOS supplied tables to obtain frequency and +voltage information appropriate for a particular platform. +Frequency transitions will be unavailable if the BIOS does +not supply these tables. + +6th Generation: powernow-k6 + +7th Generation: powernow-k7: Athlon, Duron, Geode. + +8th Generation: powernow-k8: Athlon, Athlon 64, Opteron, Sempron. +Documentation on this functionality in 8th generation processors +is available in the "BIOS and Kernel Developer's Guide", publication +26094, in chapter 9, available for download from www.amd.com. + +BIOS supplied data, for powernow-k7 and for powernow-k8, may be +from either the PSB table or from ACPI objects. The ACPI support +is only available if the kernel config sets CONFIG_ACPI_PROCESSOR. +The powernow-k8 driver will attempt to use ACPI if so configured, +and fall back to PST if that fails. +The powernow-k7 driver will try to use the PSB support first, and +fall back to ACPI if the PSB support fails. A module parameter, +acpi_force, is provided to force ACPI support to be used instead +of PSB support. diff --git a/Documentation/cpu-freq/core.txt b/Documentation/cpu-freq/core.txt new file mode 100644 index 000000000000..29b3f9ffc66c --- /dev/null +++ b/Documentation/cpu-freq/core.txt @@ -0,0 +1,98 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + C P U F r e q C o r e + + + Dominik Brodowski <linux@brodo.de> + David Kimdon <dwhedon@debian.org> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + +Contents: +--------- +1. CPUFreq core and interfaces +2. CPUFreq notifiers + +1. General Information +======================= + +The CPUFreq core code is located in linux/kernel/cpufreq.c. This +cpufreq code offers a standardized interface for the CPUFreq +architecture drivers (those pieces of code that do actual +frequency transitions), as well as to "notifiers". These are device +drivers or other part of the kernel that need to be informed of +policy changes (ex. thermal modules like ACPI) or of all +frequency changes (ex. timing code) or even need to force certain +speed limits (like LCD drivers on ARM architecture). Additionally, the +kernel "constant" loops_per_jiffy is updated on frequency changes +here. + +Reference counting is done by cpufreq_get_cpu and cpufreq_put_cpu, +which make sure that the cpufreq processor driver is correctly +registered with the core, and will not be unloaded until +cpufreq_put_cpu is called. + +2. CPUFreq notifiers +==================== + +CPUFreq notifiers conform to the standard kernel notifier interface. +See linux/include/linux/notifier.h for details on notifiers. + +There are two different CPUFreq notifiers - policy notifiers and +transition notifiers. + + +2.1 CPUFreq policy notifiers +---------------------------- + +These are notified when a new policy is intended to be set. Each +CPUFreq policy notifier is called three times for a policy transition: + +1.) During CPUFREQ_ADJUST all CPUFreq notifiers may change the limit if + they see a need for this - may it be thermal considerations or + hardware limitations. + +2.) During CPUFREQ_INCOMPATIBLE only changes may be done in order to avoid + hardware failure. + +3.) And during CPUFREQ_NOTIFY all notifiers are informed of the new policy + - if two hardware drivers failed to agree on a new policy before this + stage, the incompatible hardware shall be shut down, and the user + informed of this. + +The phase is specified in the second argument to the notifier. + +The third argument, a void *pointer, points to a struct cpufreq_policy +consisting of five values: cpu, min, max, policy and max_cpu_freq. min +and max are the lower and upper frequencies (in kHz) of the new +policy, policy the new policy, cpu the number of the affected CPU; and +max_cpu_freq the maximum supported CPU frequency. This value is given +for informational purposes only. + + +2.2 CPUFreq transition notifiers +-------------------------------- + +These are notified twice when the CPUfreq driver switches the CPU core +frequency and this change has any external implications. + +The second argument specifies the phase - CPUFREQ_PRECHANGE or +CPUFREQ_POSTCHANGE. + +The third argument is a struct cpufreq_freqs with the following +values: +cpu - number of the affected CPU +old - old frequency +new - new frequency + +If the cpufreq core detects the frequency has changed while the system +was suspended, these notifiers are called with CPUFREQ_RESUMECHANGE as +second argument. diff --git a/Documentation/cpu-freq/cpu-drivers.txt b/Documentation/cpu-freq/cpu-drivers.txt new file mode 100644 index 000000000000..43c743903dd7 --- /dev/null +++ b/Documentation/cpu-freq/cpu-drivers.txt @@ -0,0 +1,216 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + C P U D r i v e r s + + - information for developers - + + + Dominik Brodowski <linux@brodo.de> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + +Contents: +--------- +1. What To Do? +1.1 Initialization +1.2 Per-CPU Initialization +1.3 verify +1.4 target or setpolicy? +1.5 target +1.6 setpolicy +2. Frequency Table Helpers + + + +1. What To Do? +============== + +So, you just got a brand-new CPU / chipset with datasheets and want to +add cpufreq support for this CPU / chipset? Great. Here are some hints +on what is necessary: + + +1.1 Initialization +------------------ + +First of all, in an __initcall level 7 (module_init()) or later +function check whether this kernel runs on the right CPU and the right +chipset. If so, register a struct cpufreq_driver with the CPUfreq core +using cpufreq_register_driver() + +What shall this struct cpufreq_driver contain? + +cpufreq_driver.name - The name of this driver. + +cpufreq_driver.owner - THIS_MODULE; + +cpufreq_driver.init - A pointer to the per-CPU initialization + function. + +cpufreq_driver.verify - A pointer to a "verification" function. + +cpufreq_driver.setpolicy _or_ +cpufreq_driver.target - See below on the differences. + +And optionally + +cpufreq_driver.exit - A pointer to a per-CPU cleanup function. + +cpufreq_driver.resume - A pointer to a per-CPU resume function + which is called with interrupts disabled + and _before_ the pre-suspend frequency + and/or policy is restored by a call to + ->target or ->setpolicy. + +cpufreq_driver.attr - A pointer to a NULL-terminated list of + "struct freq_attr" which allow to + export values to sysfs. + + +1.2 Per-CPU Initialization +-------------------------- + +Whenever a new CPU is registered with the device model, or after the +cpufreq driver registers itself, the per-CPU initialization function +cpufreq_driver.init is called. It takes a struct cpufreq_policy +*policy as argument. What to do now? + +If necessary, activate the CPUfreq support on your CPU. + +Then, the driver must fill in the following values: + +policy->cpuinfo.min_freq _and_ +policy->cpuinfo.max_freq - the minimum and maximum frequency + (in kHz) which is supported by + this CPU +policy->cpuinfo.transition_latency the time it takes on this CPU to + switch between two frequencies (if + appropriate, else specify + CPUFREQ_ETERNAL) + +policy->cur The current operating frequency of + this CPU (if appropriate) +policy->min, +policy->max, +policy->policy and, if necessary, +policy->governor must contain the "default policy" for + this CPU. A few moments later, + cpufreq_driver.verify and either + cpufreq_driver.setpolicy or + cpufreq_driver.target is called with + these values. + +For setting some of these values, the frequency table helpers might be +helpful. See the section 2 for more information on them. + + +1.3 verify +------------ + +When the user decides a new policy (consisting of +"policy,governor,min,max") shall be set, this policy must be validated +so that incompatible values can be corrected. For verifying these +values, a frequency table helper and/or the +cpufreq_verify_within_limits(struct cpufreq_policy *policy, unsigned +int min_freq, unsigned int max_freq) function might be helpful. See +section 2 for details on frequency table helpers. + +You need to make sure that at least one valid frequency (or operating +range) is within policy->min and policy->max. If necessary, increase +policy->max first, and only if this is no solution, decrease policy->min. + + +1.4 target or setpolicy? +---------------------------- + +Most cpufreq drivers or even most cpu frequency scaling algorithms +only allow the CPU to be set to one frequency. For these, you use the +->target call. + +Some cpufreq-capable processors switch the frequency between certain +limits on their own. These shall use the ->setpolicy call + + +1.4. target +------------- + +The target call has three arguments: struct cpufreq_policy *policy, +unsigned int target_frequency, unsigned int relation. + +The CPUfreq driver must set the new frequency when called here. The +actual frequency must be determined using the following rules: + +- keep close to "target_freq" +- policy->min <= new_freq <= policy->max (THIS MUST BE VALID!!!) +- if relation==CPUFREQ_REL_L, try to select a new_freq higher than or equal + target_freq. ("L for lowest, but no lower than") +- if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal + target_freq. ("H for highest, but no higher than") + +Here again the frequency table helper might assist you - see section 3 +for details. + + +1.5 setpolicy +--------------- + +The setpolicy call only takes a struct cpufreq_policy *policy as +argument. You need to set the lower limit of the in-processor or +in-chipset dynamic frequency switching to policy->min, the upper limit +to policy->max, and -if supported- select a performance-oriented +setting when policy->policy is CPUFREQ_POLICY_PERFORMANCE, and a +powersaving-oriented setting when CPUFREQ_POLICY_POWERSAVE. Also check +the reference implementation in arch/i386/kernel/cpu/cpufreq/longrun.c + + + +2. Frequency Table Helpers +========================== + +As most cpufreq processors only allow for being set to a few specific +frequencies, a "frequency table" with some functions might assist in +some work of the processor driver. Such a "frequency table" consists +of an array of struct cpufreq_freq_table entries, with any value in +"index" you want to use, and the corresponding frequency in +"frequency". At the end of the table, you need to add a +cpufreq_freq_table entry with frequency set to CPUFREQ_TABLE_END. And +if you want to skip one entry in the table, set the frequency to +CPUFREQ_ENTRY_INVALID. The entries don't need to be in ascending +order. + +By calling cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy, + struct cpufreq_frequency_table *table); +the cpuinfo.min_freq and cpuinfo.max_freq values are detected, and +policy->min and policy->max are set to the same values. This is +helpful for the per-CPU initialization stage. + +int cpufreq_frequency_table_verify(struct cpufreq_policy *policy, + struct cpufreq_frequency_table *table); +assures that at least one valid frequency is within policy->min and +policy->max, and all other criteria are met. This is helpful for the +->verify call. + +int cpufreq_frequency_table_target(struct cpufreq_policy *policy, + struct cpufreq_frequency_table *table, + unsigned int target_freq, + unsigned int relation, + unsigned int *index); + +is the corresponding frequency table helper for the ->target +stage. Just pass the values to this function, and the unsigned int +index returns the number of the frequency table entry which contains +the frequency the CPU shall be set to. PLEASE NOTE: This is not the +"index" which is in this cpufreq_table_entry.index, but instead +cpufreq_table[index]. So, the new frequency is +cpufreq_table[index].frequency, and the value you stored into the +frequency table "index" field is +cpufreq_table[index].index. + diff --git a/Documentation/cpu-freq/cpufreq-nforce2.txt b/Documentation/cpu-freq/cpufreq-nforce2.txt new file mode 100644 index 000000000000..9188337d8f6b --- /dev/null +++ b/Documentation/cpu-freq/cpufreq-nforce2.txt @@ -0,0 +1,19 @@ + +The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 plattforms. + +This works better than on other plattforms, because the FSB of the CPU +can be controlled independently from the PCI/AGP clock. + +The module has two options: + + fid: multiplier * 10 (for example 8.5 = 85) + min_fsb: minimum FSB + +If not set, fid is calculated from the current CPU speed and the FSB. +min_fsb defaults to FSB at boot time - 50 MHz. + +IMPORTANT: The available range is limited downwards! + Also the minimum available FSB can differ, for systems + booting with 200 MHz, 150 should always work. + + diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt new file mode 100644 index 000000000000..b85481acd0ca --- /dev/null +++ b/Documentation/cpu-freq/governors.txt @@ -0,0 +1,155 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + C P U F r e q G o v e r n o r s + + - information for users and developers - + + + Dominik Brodowski <linux@brodo.de> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + +Contents: +--------- +1. What is a CPUFreq Governor? + +2. Governors In the Linux Kernel +2.1 Performance +2.2 Powersave +2.3 Userspace + +3. The Governor Interface in the CPUfreq Core + + + +1. What Is A CPUFreq Governor? +============================== + +Most cpufreq drivers (in fact, all except one, longrun) or even most +cpu frequency scaling algorithms only offer the CPU to be set to one +frequency. In order to offer dynamic frequency scaling, the cpufreq +core must be able to tell these drivers of a "target frequency". So +these specific drivers will be transformed to offer a "->target" +call instead of the existing "->setpolicy" call. For "longrun", all +stays the same, though. + +How to decide what frequency within the CPUfreq policy should be used? +That's done using "cpufreq governors". Two are already in this patch +-- they're the already existing "powersave" and "performance" which +set the frequency statically to the lowest or highest frequency, +respectively. At least two more such governors will be ready for +addition in the near future, but likely many more as there are various +different theories and models about dynamic frequency scaling +around. Using such a generic interface as cpufreq offers to scaling +governors, these can be tested extensively, and the best one can be +selected for each specific use. + +Basically, it's the following flow graph: + +CPU can be set to switch independetly | CPU can only be set + within specific "limits" | to specific frequencies + + "CPUfreq policy" + consists of frequency limits (policy->{min,max}) + and CPUfreq governor to be used + / \ + / \ + / the cpufreq governor decides + / (dynamically or statically) + / what target_freq to set within + / the limits of policy->{min,max} + / \ + / \ + Using the ->setpolicy call, Using the ->target call, + the limits and the the frequency closest + "policy" is set. to target_freq is set. + It is assured that it + is within policy->{min,max} + + +2. Governors In the Linux Kernel +================================ + +2.1 Performance +--------------- + +The CPUfreq governor "performance" sets the CPU statically to the +highest frequency within the borders of scaling_min_freq and +scaling_max_freq. + + +2.1 Powersave +------------- + +The CPUfreq governor "powersave" sets the CPU statically to the +lowest frequency within the borders of scaling_min_freq and +scaling_max_freq. + + +2.2 Userspace +------------- + +The CPUfreq governor "userspace" allows the user, or any userspace +program running with UID "root", to set the CPU to a specific frequency +by making a sysfs file "scaling_setspeed" available in the CPU-device +directory. + + + +3. The Governor Interface in the CPUfreq Core +============================================= + +A new governor must register itself with the CPUfreq core using +"cpufreq_register_governor". The struct cpufreq_governor, which has to +be passed to that function, must contain the following values: + +governor->name - A unique name for this governor +governor->governor - The governor callback function +governor->owner - .THIS_MODULE for the governor module (if + appropriate) + +The governor->governor callback is called with the current (or to-be-set) +cpufreq_policy struct for that CPU, and an unsigned int event. The +following events are currently defined: + +CPUFREQ_GOV_START: This governor shall start its duty for the CPU + policy->cpu +CPUFREQ_GOV_STOP: This governor shall end its duty for the CPU + policy->cpu +CPUFREQ_GOV_LIMITS: The limits for CPU policy->cpu have changed to + policy->min and policy->max. + +If you need other "events" externally of your driver, _only_ use the +cpufreq_governor_l(unsigned int cpu, unsigned int event) call to the +CPUfreq core to ensure proper locking. + + +The CPUfreq governor may call the CPU processor driver using one of +these two functions: + +int cpufreq_driver_target(struct cpufreq_policy *policy, + unsigned int target_freq, + unsigned int relation); + +int __cpufreq_driver_target(struct cpufreq_policy *policy, + unsigned int target_freq, + unsigned int relation); + +target_freq must be within policy->min and policy->max, of course. +What's the difference between these two functions? When your governor +still is in a direct code path of a call to governor->governor, the +per-CPU cpufreq lock is still held in the cpufreq core, and there's +no need to lock it again (in fact, this would cause a deadlock). So +use __cpufreq_driver_target only in these cases. In all other cases +(for example, when there's a "daemonized" function that wakes up +every second), use cpufreq_driver_target to lock the cpufreq per-CPU +lock before the command is passed to the cpufreq processor driver. + diff --git a/Documentation/cpu-freq/index.txt b/Documentation/cpu-freq/index.txt new file mode 100644 index 000000000000..5009805f9378 --- /dev/null +++ b/Documentation/cpu-freq/index.txt @@ -0,0 +1,56 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + + + + Dominik Brodowski <linux@brodo.de> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + + +Documents in this directory: +---------------------------- +core.txt - General description of the CPUFreq core and + of CPUFreq notifiers + +cpu-drivers.txt - How to implement a new cpufreq processor driver + +governors.txt - What are cpufreq governors and how to + implement them? + +index.txt - File index, Mailing list and Links (this document) + +user-guide.txt - User Guide to CPUFreq + + +Mailing List +------------ +There is a CPU frequency changing CVS commit and general list where +you can report bugs, problems or submit patches. To post a message, +send an email to cpufreq@lists.linux.org.uk, to subscribe go to +http://lists.linux.org.uk/mailman/listinfo/cpufreq. Previous post to the +mailing list are available to subscribers at +http://lists.linux.org.uk/mailman/private/cpufreq/. + + +Links +----- +the FTP archives: +* ftp://ftp.linux.org.uk/pub/linux/cpufreq/ + +how to access the CVS repository: +* http://cvs.arm.linux.org.uk/ + +the CPUFreq Mailing list: +* http://lists.linux.org.uk/mailman/listinfo/cpufreq + +Clock and voltage scaling for the SA-1100: +* http://www.lart.tudelft.nl/projects/scaling diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt new file mode 100644 index 000000000000..7fedc00c3d30 --- /dev/null +++ b/Documentation/cpu-freq/user-guide.txt @@ -0,0 +1,185 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + U S E R G U I D E + + + Dominik Brodowski <linux@brodo.de> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + +Contents: +--------- +1. Supported Architectures and Processors +1.1 ARM +1.2 x86 +1.3 sparc64 +1.4 ppc +1.5 SuperH + +2. "Policy" / "Governor"? +2.1 Policy +2.2 Governor + +3. How to change the CPU cpufreq policy and/or speed +3.1 Preferred interface: sysfs +3.2 Deprecated interfaces + + + +1. Supported Architectures and Processors +========================================= + +1.1 ARM +------- + +The following ARM processors are supported by cpufreq: + +ARM Integrator +ARM-SA1100 +ARM-SA1110 + + +1.2 x86 +------- + +The following processors for the x86 architecture are supported by cpufreq: + +AMD Elan - SC400, SC410 +AMD mobile K6-2+ +AMD mobile K6-3+ +AMD mobile Duron +AMD mobile Athlon +AMD Opteron +AMD Athlon 64 +Cyrix Media GXm +Intel mobile PIII and Intel mobile PIII-M on certain chipsets +Intel Pentium 4, Intel Xeon +Intel Pentium M (Centrino) +National Semiconductors Geode GX +Transmeta Crusoe +Transmeta Efficeon +VIA Cyrix 3 / C3 +various processors on some ACPI 2.0-compatible systems [*] + +[*] Only if "ACPI Processor Performance States" are available +to the ACPI<->BIOS interface. + + +1.3 sparc64 +----------- + +The following processors for the sparc64 architecture are supported by +cpufreq: + +UltraSPARC-III + + +1.4 ppc +------- + +Several "PowerBook" and "iBook2" notebooks are supported. + + +1.5 SuperH +---------- + +The following SuperH processors are supported by cpufreq: + +SH-3 +SH-4 + + +2. "Policy" / "Governor" ? +========================== + +Some CPU frequency scaling-capable processor switch between various +frequencies and operating voltages "on the fly" without any kernel or +user involvement. This guarantees very fast switching to a frequency +which is high enough to serve the user's needs, but low enough to save +power. + + +2.1 Policy +---------- + +On these systems, all you can do is select the lower and upper +frequency limit as well as whether you want more aggressive +power-saving or more instantly available processing power. + + +2.2 Governor +------------ + +On all other cpufreq implementations, these boundaries still need to +be set. Then, a "governor" must be selected. Such a "governor" decides +what speed the processor shall run within the boundaries. One such +"governor" is the "userspace" governor. This one allows the user - or +a yet-to-implement userspace program - to decide what specific speed +the processor shall run at. + + +3. How to change the CPU cpufreq policy and/or speed +==================================================== + +3.1 Preferred Interface: sysfs +------------------------------ + +The preferred interface is located in the sysfs filesystem. If you +mounted it at /sys, the cpufreq interface is located in a subdirectory +"cpufreq" within the cpu-device directory +(e.g. /sys/devices/system/cpu/cpu0/cpufreq/ for the first CPU). + +cpuinfo_min_freq : this file shows the minimum operating + frequency the processor can run at(in kHz) +cpuinfo_max_freq : this file shows the maximum operating + frequency the processor can run at(in kHz) +scaling_driver : this file shows what cpufreq driver is + used to set the frequency on this CPU + +scaling_available_governors : this file shows the CPUfreq governors + available in this kernel. You can see the + currently activated governor in + +scaling_governor, and by "echoing" the name of another + governor you can change it. Please note + that some governors won't load - they only + work on some specific architectures or + processors. +scaling_min_freq and +scaling_max_freq show the current "policy limits" (in + kHz). By echoing new values into these + files, you can change these limits. + + +If you have selected the "userspace" governor which allows you to +set the CPU operating frequency to a specific value, you can read out +the current frequency in + +scaling_setspeed. By "echoing" a new frequency into this + you can change the speed of the CPU, + but only within the limits of + scaling_min_freq and scaling_max_freq. + + +3.2 Deprecated Interfaces +------------------------- + +Depending on your kernel configuration, you might find the following +cpufreq-related files: +/proc/cpufreq +/proc/sys/cpu/*/speed +/proc/sys/cpu/*/speed-min +/proc/sys/cpu/*/speed-max + +These are files for deprecated interfaces to cpufreq, which offer far +less functionality. Because of this, these interfaces aren't described +here. + diff --git a/Documentation/cpusets.txt b/Documentation/cpusets.txt new file mode 100644 index 000000000000..1ad26d2c20ae --- /dev/null +++ b/Documentation/cpusets.txt @@ -0,0 +1,415 @@ + CPUSETS + ------- + +Copyright (C) 2004 BULL SA. +Written by Simon.Derr@bull.net + +Portions Copyright (c) 2004 Silicon Graphics, Inc. +Modified by Paul Jackson <pj@sgi.com> + +CONTENTS: +========= + +1. Cpusets + 1.1 What are cpusets ? + 1.2 Why are cpusets needed ? + 1.3 How are cpusets implemented ? + 1.4 How do I use cpusets ? +2. Usage Examples and Syntax + 2.1 Basic Usage + 2.2 Adding/removing cpus + 2.3 Setting flags + 2.4 Attaching processes +3. Questions +4. Contact + +1. Cpusets +========== + +1.1 What are cpusets ? +---------------------- + +Cpusets provide a mechanism for assigning a set of CPUs and Memory +Nodes to a set of tasks. + +Cpusets constrain the CPU and Memory placement of tasks to only +the resources within a tasks current cpuset. They form a nested +hierarchy visible in a virtual file system. These are the essential +hooks, beyond what is already present, required to manage dynamic +job placement on large systems. + +Each task has a pointer to a cpuset. Multiple tasks may reference +the same cpuset. Requests by a task, using the sched_setaffinity(2) +system call to include CPUs in its CPU affinity mask, and using the +mbind(2) and set_mempolicy(2) system calls to include Memory Nodes +in its memory policy, are both filtered through that tasks cpuset, +filtering out any CPUs or Memory Nodes not in that cpuset. The +scheduler will not schedule a task on a CPU that is not allowed in +its cpus_allowed vector, and the kernel page allocator will not +allocate a page on a node that is not allowed in the requesting tasks +mems_allowed vector. + +If a cpuset is cpu or mem exclusive, no other cpuset, other than a direct +ancestor or descendent, may share any of the same CPUs or Memory Nodes. + +User level code may create and destroy cpusets by name in the cpuset +virtual file system, manage the attributes and permissions of these +cpusets and which CPUs and Memory Nodes are assigned to each cpuset, +specify and query to which cpuset a task is assigned, and list the +task pids assigned to a cpuset. + + +1.2 Why are cpusets needed ? +---------------------------- + +The management of large computer systems, with many processors (CPUs), +complex memory cache hierarchies and multiple Memory Nodes having +non-uniform access times (NUMA) presents additional challenges for +the efficient scheduling and memory placement of processes. + +Frequently more modest sized systems can be operated with adequate +efficiency just by letting the operating system automatically share +the available CPU and Memory resources amongst the requesting tasks. + +But larger systems, which benefit more from careful processor and +memory placement to reduce memory access times and contention, +and which typically represent a larger investment for the customer, +can benefit from explictly placing jobs on properly sized subsets of +the system. + +This can be especially valuable on: + + * Web Servers running multiple instances of the same web application, + * Servers running different applications (for instance, a web server + and a database), or + * NUMA systems running large HPC applications with demanding + performance characteristics. + +These subsets, or "soft partitions" must be able to be dynamically +adjusted, as the job mix changes, without impacting other concurrently +executing jobs. + +The kernel cpuset patch provides the minimum essential kernel +mechanisms required to efficiently implement such subsets. It +leverages existing CPU and Memory Placement facilities in the Linux +kernel to avoid any additional impact on the critical scheduler or +memory allocator code. + + +1.3 How are cpusets implemented ? +--------------------------------- + +Cpusets provide a Linux kernel (2.6.7 and above) mechanism to constrain +which CPUs and Memory Nodes are used by a process or set of processes. + +The Linux kernel already has a pair of mechanisms to specify on which +CPUs a task may be scheduled (sched_setaffinity) and on which Memory +Nodes it may obtain memory (mbind, set_mempolicy). + +Cpusets extends these two mechanisms as follows: + + - Cpusets are sets of allowed CPUs and Memory Nodes, known to the + kernel. + - Each task in the system is attached to a cpuset, via a pointer + in the task structure to a reference counted cpuset structure. + - Calls to sched_setaffinity are filtered to just those CPUs + allowed in that tasks cpuset. + - Calls to mbind and set_mempolicy are filtered to just + those Memory Nodes allowed in that tasks cpuset. + - The root cpuset contains all the systems CPUs and Memory + Nodes. + - For any cpuset, one can define child cpusets containing a subset + of the parents CPU and Memory Node resources. + - The hierarchy of cpusets can be mounted at /dev/cpuset, for + browsing and manipulation from user space. + - A cpuset may be marked exclusive, which ensures that no other + cpuset (except direct ancestors and descendents) may contain + any overlapping CPUs or Memory Nodes. + - You can list all the tasks (by pid) attached to any cpuset. + +The implementation of cpusets requires a few, simple hooks +into the rest of the kernel, none in performance critical paths: + + - in main/init.c, to initialize the root cpuset at system boot. + - in fork and exit, to attach and detach a task from its cpuset. + - in sched_setaffinity, to mask the requested CPUs by what's + allowed in that tasks cpuset. + - in sched.c migrate_all_tasks(), to keep migrating tasks within + the CPUs allowed by their cpuset, if possible. + - in the mbind and set_mempolicy system calls, to mask the requested + Memory Nodes by what's allowed in that tasks cpuset. + - in page_alloc, to restrict memory to allowed nodes. + - in vmscan.c, to restrict page recovery to the current cpuset. + +In addition a new file system, of type "cpuset" may be mounted, +typically at /dev/cpuset, to enable browsing and modifying the cpusets +presently known to the kernel. No new system calls are added for +cpusets - all support for querying and modifying cpusets is via +this cpuset file system. + +Each task under /proc has an added file named 'cpuset', displaying +the cpuset name, as the path relative to the root of the cpuset file +system. + +The /proc/<pid>/status file for each task has two added lines, +displaying the tasks cpus_allowed (on which CPUs it may be scheduled) +and mems_allowed (on which Memory Nodes it may obtain memory), +in the format seen in the following example: + + Cpus_allowed: ffffffff,ffffffff,ffffffff,ffffffff + Mems_allowed: ffffffff,ffffffff + +Each cpuset is represented by a directory in the cpuset file system +containing the following files describing that cpuset: + + - cpus: list of CPUs in that cpuset + - mems: list of Memory Nodes in that cpuset + - cpu_exclusive flag: is cpu placement exclusive? + - mem_exclusive flag: is memory placement exclusive? + - tasks: list of tasks (by pid) attached to that cpuset + +New cpusets are created using the mkdir system call or shell +command. The properties of a cpuset, such as its flags, allowed +CPUs and Memory Nodes, and attached tasks, are modified by writing +to the appropriate file in that cpusets directory, as listed above. + +The named hierarchical structure of nested cpusets allows partitioning +a large system into nested, dynamically changeable, "soft-partitions". + +The attachment of each task, automatically inherited at fork by any +children of that task, to a cpuset allows organizing the work load +on a system into related sets of tasks such that each set is constrained +to using the CPUs and Memory Nodes of a particular cpuset. A task +may be re-attached to any other cpuset, if allowed by the permissions +on the necessary cpuset file system directories. + +Such management of a system "in the large" integrates smoothly with +the detailed placement done on individual tasks and memory regions +using the sched_setaffinity, mbind and set_mempolicy system calls. + +The following rules apply to each cpuset: + + - Its CPUs and Memory Nodes must be a subset of its parents. + - It can only be marked exclusive if its parent is. + - If its cpu or memory is exclusive, they may not overlap any sibling. + +These rules, and the natural hierarchy of cpusets, enable efficient +enforcement of the exclusive guarantee, without having to scan all +cpusets every time any of them change to ensure nothing overlaps a +exclusive cpuset. Also, the use of a Linux virtual file system (vfs) +to represent the cpuset hierarchy provides for a familiar permission +and name space for cpusets, with a minimum of additional kernel code. + +1.4 How do I use cpusets ? +-------------------------- + +In order to minimize the impact of cpusets on critical kernel +code, such as the scheduler, and due to the fact that the kernel +does not support one task updating the memory placement of another +task directly, the impact on a task of changing its cpuset CPU +or Memory Node placement, or of changing to which cpuset a task +is attached, is subtle. + +If a cpuset has its Memory Nodes modified, then for each task attached +to that cpuset, the next time that the kernel attempts to allocate +a page of memory for that task, the kernel will notice the change +in the tasks cpuset, and update its per-task memory placement to +remain within the new cpusets memory placement. If the task was using +mempolicy MPOL_BIND, and the nodes to which it was bound overlap with +its new cpuset, then the task will continue to use whatever subset +of MPOL_BIND nodes are still allowed in the new cpuset. If the task +was using MPOL_BIND and now none of its MPOL_BIND nodes are allowed +in the new cpuset, then the task will be essentially treated as if it +was MPOL_BIND bound to the new cpuset (even though its numa placement, +as queried by get_mempolicy(), doesn't change). If a task is moved +from one cpuset to another, then the kernel will adjust the tasks +memory placement, as above, the next time that the kernel attempts +to allocate a page of memory for that task. + +If a cpuset has its CPUs modified, then each task using that +cpuset does _not_ change its behavior automatically. In order to +minimize the impact on the critical scheduling code in the kernel, +tasks will continue to use their prior CPU placement until they +are rebound to their cpuset, by rewriting their pid to the 'tasks' +file of their cpuset. If a task had been bound to some subset of its +cpuset using the sched_setaffinity() call, and if any of that subset +is still allowed in its new cpuset settings, then the task will be +restricted to the intersection of the CPUs it was allowed on before, +and its new cpuset CPU placement. If, on the other hand, there is +no overlap between a tasks prior placement and its new cpuset CPU +placement, then the task will be allowed to run on any CPU allowed +in its new cpuset. If a task is moved from one cpuset to another, +its CPU placement is updated in the same way as if the tasks pid is +rewritten to the 'tasks' file of its current cpuset. + +In summary, the memory placement of a task whose cpuset is changed is +updated by the kernel, on the next allocation of a page for that task, +but the processor placement is not updated, until that tasks pid is +rewritten to the 'tasks' file of its cpuset. This is done to avoid +impacting the scheduler code in the kernel with a check for changes +in a tasks processor placement. + +There is an exception to the above. If hotplug funtionality is used +to remove all the CPUs that are currently assigned to a cpuset, +then the kernel will automatically update the cpus_allowed of all +tasks attached to CPUs in that cpuset with the online CPUs of the +nearest parent cpuset that still has some CPUs online. When memory +hotplug functionality for removing Memory Nodes is available, a +similar exception is expected to apply there as well. In general, +the kernel prefers to violate cpuset placement, over starving a task +that has had all its allowed CPUs or Memory Nodes taken offline. User +code should reconfigure cpusets to only refer to online CPUs and Memory +Nodes when using hotplug to add or remove such resources. + +There is a second exception to the above. GFP_ATOMIC requests are +kernel internal allocations that must be satisfied, immediately. +The kernel may drop some request, in rare cases even panic, if a +GFP_ATOMIC alloc fails. If the request cannot be satisfied within +the current tasks cpuset, then we relax the cpuset, and look for +memory anywhere we can find it. It's better to violate the cpuset +than stress the kernel. + +To start a new job that is to be contained within a cpuset, the steps are: + + 1) mkdir /dev/cpuset + 2) mount -t cpuset none /dev/cpuset + 3) Create the new cpuset by doing mkdir's and write's (or echo's) in + the /dev/cpuset virtual file system. + 4) Start a task that will be the "founding father" of the new job. + 5) Attach that task to the new cpuset by writing its pid to the + /dev/cpuset tasks file for that cpuset. + 6) fork, exec or clone the job tasks from this founding father task. + +For example, the following sequence of commands will setup a cpuset +named "Charlie", containing just CPUs 2 and 3, and Memory Node 1, +and then start a subshell 'sh' in that cpuset: + + mount -t cpuset none /dev/cpuset + cd /dev/cpuset + mkdir Charlie + cd Charlie + /bin/echo 2-3 > cpus + /bin/echo 1 > mems + /bin/echo $$ > tasks + sh + # The subshell 'sh' is now running in cpuset Charlie + # The next line should display '/Charlie' + cat /proc/self/cpuset + +In the case that a change of cpuset includes wanting to move already +allocated memory pages, consider further the work of IWAMOTO +Toshihiro <iwamoto@valinux.co.jp> for page remapping and memory +hotremoval, which can be found at: + + http://people.valinux.co.jp/~iwamoto/mh.html + +The integration of cpusets with such memory migration is not yet +available. + +In the future, a C library interface to cpusets will likely be +available. For now, the only way to query or modify cpusets is +via the cpuset file system, using the various cd, mkdir, echo, cat, +rmdir commands from the shell, or their equivalent from C. + +The sched_setaffinity calls can also be done at the shell prompt using +SGI's runon or Robert Love's taskset. The mbind and set_mempolicy +calls can be done at the shell prompt using the numactl command +(part of Andi Kleen's numa package). + +2. Usage Examples and Syntax +============================ + +2.1 Basic Usage +--------------- + +Creating, modifying, using the cpusets can be done through the cpuset +virtual filesystem. + +To mount it, type: +# mount -t cpuset none /dev/cpuset + +Then under /dev/cpuset you can find a tree that corresponds to the +tree of the cpusets in the system. For instance, /dev/cpuset +is the cpuset that holds the whole system. + +If you want to create a new cpuset under /dev/cpuset: +# cd /dev/cpuset +# mkdir my_cpuset + +Now you want to do something with this cpuset. +# cd my_cpuset + +In this directory you can find several files: +# ls +cpus cpu_exclusive mems mem_exclusive tasks + +Reading them will give you information about the state of this cpuset: +the CPUs and Memory Nodes it can use, the processes that are using +it, its properties. By writing to these files you can manipulate +the cpuset. + +Set some flags: +# /bin/echo 1 > cpu_exclusive + +Add some cpus: +# /bin/echo 0-7 > cpus + +Now attach your shell to this cpuset: +# /bin/echo $$ > tasks + +You can also create cpusets inside your cpuset by using mkdir in this +directory. +# mkdir my_sub_cs + +To remove a cpuset, just use rmdir: +# rmdir my_sub_cs +This will fail if the cpuset is in use (has cpusets inside, or has +processes attached). + +2.2 Adding/removing cpus +------------------------ + +This is the syntax to use when writing in the cpus or mems files +in cpuset directories: + +# /bin/echo 1-4 > cpus -> set cpus list to cpus 1,2,3,4 +# /bin/echo 1,2,3,4 > cpus -> set cpus list to cpus 1,2,3,4 + +2.3 Setting flags +----------------- + +The syntax is very simple: + +# /bin/echo 1 > cpu_exclusive -> set flag 'cpu_exclusive' +# /bin/echo 0 > cpu_exclusive -> unset flag 'cpu_exclusive' + +2.4 Attaching processes +----------------------- + +# /bin/echo PID > tasks + +Note that it is PID, not PIDs. You can only attach ONE task at a time. +If you have several tasks to attach, you have to do it one after another: + +# /bin/echo PID1 > tasks +# /bin/echo PID2 > tasks + ... +# /bin/echo PIDn > tasks + + +3. Questions +============ + +Q: what's up with this '/bin/echo' ? +A: bash's builtin 'echo' command does not check calls to write() against + errors. If you use it in the cpuset file system, you won't be + able to tell whether a command succeeded or failed. + +Q: When I attach processes, only the first of the line gets really attached ! +A: We can only return one error code per call to write(). So you should also + put only ONE pid. + +4. Contact +========== + +Web: http://www.bullopensource.org/cpuset diff --git a/Documentation/cris/README b/Documentation/cris/README new file mode 100644 index 000000000000..795a1dabe6c7 --- /dev/null +++ b/Documentation/cris/README @@ -0,0 +1,195 @@ +Linux 2.4 on the CRIS architecture +================================== +$Id: README,v 1.7 2001/04/19 12:38:32 bjornw Exp $ + +This is a port of Linux 2.4 to Axis Communications ETRAX 100LX embedded +network CPU. For more information about CRIS and ETRAX please see further +below. + +In order to compile this you need a version of gcc with support for the +ETRAX chip family. Please see this link for more information on how to +download the compiler and other tools useful when building and booting +software for the ETRAX platform: + +http://developer.axis.com/doc/software/devboard_lx/install-howto.html + +<more specific information should come in this document later> + +What is CRIS ? +-------------- + +CRIS is an acronym for 'Code Reduced Instruction Set'. It is the CPU +architecture in Axis Communication AB's range of embedded network CPU's, +called ETRAX. The latest CPU is called ETRAX 100LX, where LX stands for +'Linux' because the chip was designed to be a good host for the Linux +operating system. + +The ETRAX 100LX chip +-------------------- + +For reference, plase see the press-release: + +http://www.axis.com/news/us/001101_etrax.htm + +The ETRAX 100LX is a 100 MIPS processor with 8kB cache, MMU, and a very broad +range of built-in interfaces, all with modern scatter/gather DMA. + +Memory interfaces: + + * SRAM + * NOR-flash/ROM + * EDO or page-mode DRAM + * SDRAM + +I/O interfaces: + + * one 10/100 Mbit/s ethernet controller + * four serial-ports (up to 6 Mbit/s) + * two synchronous serial-ports for multimedia codec's etc. + * USB host controller and USB slave + * ATA + * SCSI + * two parallel-ports + * two generic 8-bit ports + + (not all interfaces are available at the same time due to chip pin + multiplexing) + +The previous version of the ETRAX, the ETRAX 100, sits in almost all of +Axis shipping thin-servers like the Axis 2100 web camera or the ETRAX 100 +developer-board. It lacks an MMU so the Linux we run on that is a version +of uClinux (Linux 2.0 without MM-support) ported to the CRIS architecture. +The new Linux 2.4 port has full MM and needs a CPU with an MMU, so it will +not run on the ETRAX 100. + +A version of the Axis developer-board with ETRAX 100LX (running Linux +2.4) is now available. For more information please see developer.axis.com. + + +Bootlog +------- + +Just as an example, this is the debug-output from a boot of Linux 2.4 on +a board with ETRAX 100LX. The displayed BogoMIPS value is 5 times too small :) +At the end you see some user-mode programs booting like telnet and ftp daemons. + +Linux version 2.4.1 (bjornw@godzilla.axis.se) (gcc version 2.96 20000427 (experimental)) #207 Wed Feb 21 15:48:15 CET 2001 +ROM fs in RAM, size 1376256 bytes +Setting up paging and the MMU. +On node 0 totalpages: 2048 +zone(0): 2048 pages. +zone(1): 0 pages. +zone(2): 0 pages. +Linux/CRIS port on ETRAX 100LX (c) 2001 Axis Communications AB +Kernel command line: +Calibrating delay loop... 19.91 BogoMIPS +Memory: 13872k/16384k available (587k kernel code, 2512k reserved, 44k data, 24k init) +kmem_create: Forcing size word alignment - vm_area_struct +kmem_create: Forcing size word alignment - filp +Dentry-cache hash table entries: 2048 (order: 1, 16384 bytes) +Buffer-cache hash table entries: 2048 (order: 0, 8192 bytes) +Page-cache hash table entries: 2048 (order: 0, 8192 bytes) +kmem_create: Forcing size word alignment - kiobuf +kmem_create: Forcing size word alignment - bdev_cache +Inode-cache hash table entries: 1024 (order: 0, 8192 bytes) +kmem_create: Forcing size word alignment - inode_cache +POSIX conformance testing by UNIFIX +Linux NET4.0 for Linux 2.4 +Based upon Swansea University Computer Society NET3.039 +Starting kswapd v1.8 +kmem_create: Forcing size word alignment - file lock cache +kmem_create: Forcing size word alignment - blkdev_requests +block: queued sectors max/low 9109kB/3036kB, 64 slots per queue +ETRAX 100LX 10/100MBit ethernet v2.0 (c) 2000 Axis Communications AB +eth0 initialized +eth0: changed MAC to 00:40:8C:CD:00:00 +ETRAX 100LX serial-driver $Revision: 1.7 $, (c) 2000 Axis Communications AB +ttyS0 at 0xb0000060 is a builtin UART with DMA +ttyS1 at 0xb0000068 is a builtin UART with DMA +ttyS2 at 0xb0000070 is a builtin UART with DMA +ttyS3 at 0xb0000078 is a builtin UART with DMA +Axis flash mapping: 200000 at 50000000 +Axis flash: Found 1 x16 CFI device at 0x0 in 16 bit mode + Amd/Fujitsu Extended Query Table v1.0 at 0x0040 +Axis flash: JEDEC Device ID is 0xC4. Assuming broken CFI table. +Axis flash: Swapping erase regions for broken CFI table. +number of CFI chips: 1 + Using default partition table +I2C driver v2.2, (c) 1999-2001 Axis Communications AB +ETRAX 100LX GPIO driver v2.1, (c) 2001 Axis Communications AB +NET4: Linux TCP/IP 1.0 for NET4.0 +IP Protocols: ICMP, UDP, TCP +kmem_create: Forcing size word alignment - ip_dst_cache +IP: routing cache hash table of 1024 buckets, 8Kbytes +TCP: Hash tables configured (established 2048 bind 2048) +NET4: Unix domain sockets 1.0/SMP for Linux NET4.0. +VFS: Mounted root (cramfs filesystem) readonly. +Init starts up... +Mounted none on /proc ok. +Setting up eth0 with ip 10.13.9.116 and mac 00:40:8c:18:04:60 +eth0: changed MAC to 00:40:8C:18:04:60 +Setting up lo with ip 127.0.0.1 +Default gateway is 10.13.9.1 +Hostname is bbox1 +Telnetd starting, using port 23. + using /bin/sash as shell. +sftpd[15]: sftpd $Revision: 1.7 $ starting up + + + +And here is how some /proc entries look: + +17# cd /proc +17# cat cpuinfo +cpu : CRIS +cpu revision : 10 +cpu model : ETRAX 100LX +cache size : 8 kB +fpu : no +mmu : yes +ethernet : 10/100 Mbps +token ring : no +scsi : yes +ata : yes +usb : yes +bogomips : 99.84 + +17# cat meminfo + total: used: free: shared: buffers: cached: +Mem: 7028736 925696 6103040 114688 0 229376 +Swap: 0 0 0 +MemTotal: 6864 kB +MemFree: 5960 kB +MemShared: 112 kB +Buffers: 0 kB +Cached: 224 kB +Active: 224 kB +Inact_dirty: 0 kB +Inact_clean: 0 kB +Inact_target: 0 kB +HighTotal: 0 kB +HighFree: 0 kB +LowTotal: 6864 kB +LowFree: 5960 kB +SwapTotal: 0 kB +SwapFree: 0 kB +17# ls -l /bin +-rwxr-xr-x 1 342 100 10356 Jan 01 00:00 ifconfig +-rwxr-xr-x 1 342 100 17548 Jan 01 00:00 init +-rwxr-xr-x 1 342 100 9488 Jan 01 00:00 route +-rwxr-xr-x 1 342 100 46036 Jan 01 00:00 sftpd +-rwxr-xr-x 1 342 100 48104 Jan 01 00:00 sh +-rwxr-xr-x 1 342 100 16252 Jan 01 00:00 telnetd + + +(All programs are statically linked to the libc at this point - we have not ported the + shared libraries yet) + + + + + + + + + diff --git a/Documentation/crypto/api-intro.txt b/Documentation/crypto/api-intro.txt new file mode 100644 index 000000000000..a2d5b4900772 --- /dev/null +++ b/Documentation/crypto/api-intro.txt @@ -0,0 +1,244 @@ + + Scatterlist Cryptographic API + +INTRODUCTION + +The Scatterlist Crypto API takes page vectors (scatterlists) as +arguments, and works directly on pages. In some cases (e.g. ECB +mode ciphers), this will allow for pages to be encrypted in-place +with no copying. + +One of the initial goals of this design was to readily support IPsec, +so that processing can be applied to paged skb's without the need +for linearization. + + +DETAILS + +At the lowest level are algorithms, which register dynamically with the +API. + +'Transforms' are user-instantiated objects, which maintain state, handle all +of the implementation logic (e.g. manipulating page vectors), provide an +abstraction to the underlying algorithms, and handle common logical +operations (e.g. cipher modes, HMAC for digests). However, at the user +level they are very simple. + +Conceptually, the API layering looks like this: + + [transform api] (user interface) + [transform ops] (per-type logic glue e.g. cipher.c, digest.c) + [algorithm api] (for registering algorithms) + +The idea is to make the user interface and algorithm registration API +very simple, while hiding the core logic from both. Many good ideas +from existing APIs such as Cryptoapi and Nettle have been adapted for this. + +The API currently supports three types of transforms: Ciphers, Digests and +Compressors. The compression algorithms especially seem to be performing +very well so far. + +Support for hardware crypto devices via an asynchronous interface is +under development. + +Here's an example of how to use the API: + + #include <linux/crypto.h> + + struct scatterlist sg[2]; + char result[128]; + struct crypto_tfm *tfm; + + tfm = crypto_alloc_tfm("md5", 0); + if (tfm == NULL) + fail(); + + /* ... set up the scatterlists ... */ + + crypto_digest_init(tfm); + crypto_digest_update(tfm, &sg, 2); + crypto_digest_final(tfm, result); + + crypto_free_tfm(tfm); + + +Many real examples are available in the regression test module (tcrypt.c). + + +CONFIGURATION NOTES + +As Triple DES is part of the DES module, for those using modular builds, +add the following line to /etc/modprobe.conf: + + alias des3_ede des + +The Null algorithms reside in the crypto_null module, so these lines +should also be added: + + alias cipher_null crypto_null + alias digest_null crypto_null + alias compress_null crypto_null + +The SHA384 algorithm shares code within the SHA512 module, so you'll +also need: + alias sha384 sha512 + + +DEVELOPER NOTES + +Transforms may only be allocated in user context, and cryptographic +methods may only be called from softirq and user contexts. + +When using the API for ciphers, performance will be optimal if each +scatterlist contains data which is a multiple of the cipher's block +size (typically 8 bytes). This prevents having to do any copying +across non-aligned page fragment boundaries. + + +ADDING NEW ALGORITHMS + +When submitting a new algorithm for inclusion, a mandatory requirement +is that at least a few test vectors from known sources (preferably +standards) be included. + +Converting existing well known code is preferred, as it is more likely +to have been reviewed and widely tested. If submitting code from LGPL +sources, please consider changing the license to GPL (see section 3 of +the LGPL). + +Algorithms submitted must also be generally patent-free (e.g. IDEA +will not be included in the mainline until around 2011), and be based +on a recognized standard and/or have been subjected to appropriate +peer review. + +Also check for any RFCs which may relate to the use of specific algorithms, +as well as general application notes such as RFC2451 ("The ESP CBC-Mode +Cipher Algorithms"). + +It's a good idea to avoid using lots of macros and use inlined functions +instead, as gcc does a good job with inlining, while excessive use of +macros can cause compilation problems on some platforms. + +Also check the TODO list at the web site listed below to see what people +might already be working on. + + +BUGS + +Send bug reports to: +James Morris <jmorris@redhat.com> +Cc: David S. Miller <davem@redhat.com> + + +FURTHER INFORMATION + +For further patches and various updates, including the current TODO +list, see: +http://samba.org/~jamesm/crypto/ + + +AUTHORS + +James Morris +David S. Miller + + +CREDITS + +The following people provided invaluable feedback during the development +of the API: + + Alexey Kuznetzov + Rusty Russell + Herbert Valerio Riedel + Jeff Garzik + Michael Richardson + Andrew Morton + Ingo Oeser + Christoph Hellwig + +Portions of this API were derived from the following projects: + + Kerneli Cryptoapi (http://www.kerneli.org/) + Alexander Kjeldaas + Herbert Valerio Riedel + Kyle McMartin + Jean-Luc Cooke + David Bryson + Clemens Fruhwirth + Tobias Ringstrom + Harald Welte + +and; + + Nettle (http://www.lysator.liu.se/~nisse/nettle/) + Niels Möller + +Original developers of the crypto algorithms: + + Dana L. How (DES) + Andrew Tridgell and Steve French (MD4) + Colin Plumb (MD5) + Steve Reid (SHA1) + Jean-Luc Cooke (SHA256, SHA384, SHA512) + Kazunori Miyazawa / USAGI (HMAC) + Matthew Skala (Twofish) + Dag Arne Osvik (Serpent) + Brian Gladman (AES) + Kartikey Mahendra Bhatt (CAST6) + Jon Oberheide (ARC4) + Jouni Malinen (Michael MIC) + +SHA1 algorithm contributors: + Jean-Francois Dive + +DES algorithm contributors: + Raimar Falke + Gisle Sælensminde + Niels Möller + +Blowfish algorithm contributors: + Herbert Valerio Riedel + Kyle McMartin + +Twofish algorithm contributors: + Werner Koch + Marc Mutz + +SHA256/384/512 algorithm contributors: + Andrew McDonald + Kyle McMartin + Herbert Valerio Riedel + +AES algorithm contributors: + Alexander Kjeldaas + Herbert Valerio Riedel + Kyle McMartin + Adam J. Richter + Fruhwirth Clemens (i586) + Linus Torvalds (i586) + +CAST5 algorithm contributors: + Kartikey Mahendra Bhatt (original developers unknown, FSF copyright). + +TEA/XTEA algorithm contributors: + Aaron Grothe + +Khazad algorithm contributors: + Aaron Grothe + +Whirlpool algorithm contributors: + Aaron Grothe + Jean-Luc Cooke + +Anubis algorithm contributors: + Aaron Grothe + +Tiger algorithm contributors: + Aaron Grothe + +Generic scatterwalk code by Adam J. Richter <adam@yggdrasil.com> + +Please send any credits updates or corrections to: +James Morris <jmorris@redhat.com> + diff --git a/Documentation/crypto/descore-readme.txt b/Documentation/crypto/descore-readme.txt new file mode 100644 index 000000000000..166474c2ee0b --- /dev/null +++ b/Documentation/crypto/descore-readme.txt @@ -0,0 +1,352 @@ +Below is the orginal README file from the descore.shar package. +------------------------------------------------------------------------------ + +des - fast & portable DES encryption & decryption. +Copyright (C) 1992 Dana L. How + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU Library General Public License as published by +the Free Software Foundation; either version 2 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU Library General Public License for more details. + +You should have received a copy of the GNU Library General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + +Author's address: how@isl.stanford.edu + +$Id: README,v 1.15 1992/05/20 00:25:32 how E $ + + +==>> To compile after untarring/unsharring, just `make' <<== + + +This package was designed with the following goals: +1. Highest possible encryption/decryption PERFORMANCE. +2. PORTABILITY to any byte-addressable host with a 32bit unsigned C type +3. Plug-compatible replacement for KERBEROS's low-level routines. + +This second release includes a number of performance enhancements for +register-starved machines. My discussions with Richard Outerbridge, +71755.204@compuserve.com, sparked a number of these enhancements. + +To more rapidly understand the code in this package, inspect desSmallFips.i +(created by typing `make') BEFORE you tackle desCode.h. The latter is set +up in a parameterized fashion so it can easily be modified by speed-daemon +hackers in pursuit of that last microsecond. You will find it more +illuminating to inspect one specific implementation, +and then move on to the common abstract skeleton with this one in mind. + + +performance comparison to other available des code which i could +compile on a SPARCStation 1 (cc -O4, gcc -O2): + +this code (byte-order independent): + 30us per encryption (options: 64k tables, no IP/FP) + 33us per encryption (options: 64k tables, FIPS standard bit ordering) + 45us per encryption (options: 2k tables, no IP/FP) + 48us per encryption (options: 2k tables, FIPS standard bit ordering) + 275us to set a new key (uses 1k of key tables) + this has the quickest encryption/decryption routines i've seen. + since i was interested in fast des filters rather than crypt(3) + and password cracking, i haven't really bothered yet to speed up + the key setting routine. also, i have no interest in re-implementing + all the other junk in the mit kerberos des library, so i've just + provided my routines with little stub interfaces so they can be + used as drop-in replacements with mit's code or any of the mit- + compatible packages below. (note that the first two timings above + are highly variable because of cache effects). + +kerberos des replacement from australia (version 1.95): + 53us per encryption (uses 2k of tables) + 96us to set a new key (uses 2.25k of key tables) + so despite the author's inclusion of some of the performance + improvements i had suggested to him, this package's + encryption/decryption is still slower on the sparc and 68000. + more specifically, 19-40% slower on the 68020 and 11-35% slower + on the sparc, depending on the compiler; + in full gory detail (ALT_ECB is a libdes variant): + compiler machine desCore libdes ALT_ECB slower by + gcc 2.1 -O2 Sun 3/110 304 uS 369.5uS 461.8uS 22% + cc -O1 Sun 3/110 336 uS 436.6uS 399.3uS 19% + cc -O2 Sun 3/110 360 uS 532.4uS 505.1uS 40% + cc -O4 Sun 3/110 365 uS 532.3uS 505.3uS 38% + gcc 2.1 -O2 Sun 4/50 48 uS 53.4uS 57.5uS 11% + cc -O2 Sun 4/50 48 uS 64.6uS 64.7uS 35% + cc -O4 Sun 4/50 48 uS 64.7uS 64.9uS 35% + (my time measurements are not as accurate as his). + the comments in my first release of desCore on version 1.92: + 68us per encryption (uses 2k of tables) + 96us to set a new key (uses 2.25k of key tables) + this is a very nice package which implements the most important + of the optimizations which i did in my encryption routines. + it's a bit weak on common low-level optimizations which is why + it's 39%-106% slower. because he was interested in fast crypt(3) and + password-cracking applications, he also used the same ideas to + speed up the key-setting routines with impressive results. + (at some point i may do the same in my package). he also implements + the rest of the mit des library. + (code from eay@psych.psy.uq.oz.au via comp.sources.misc) + +fast crypt(3) package from denmark: + the des routine here is buried inside a loop to do the + crypt function and i didn't feel like ripping it out and measuring + performance. his code takes 26 sparc instructions to compute one + des iteration; above, Quick (64k) takes 21 and Small (2k) takes 37. + he claims to use 280k of tables but the iteration calculation seems + to use only 128k. his tables and code are machine independent. + (code from glad@daimi.aau.dk via alt.sources or comp.sources.misc) + +swedish reimplementation of Kerberos des library + 108us per encryption (uses 34k worth of tables) + 134us to set a new key (uses 32k of key tables to get this speed!) + the tables used seem to be machine-independent; + he seems to have included a lot of special case code + so that, e.g., `long' loads can be used instead of 4 `char' loads + when the machine's architecture allows it. + (code obtained from chalmers.se:pub/des) + +crack 3.3c package from england: + as in crypt above, the des routine is buried in a loop. it's + also very modified for crypt. his iteration code uses 16k + of tables and appears to be slow. + (code obtained from aem@aber.ac.uk via alt.sources or comp.sources.misc) + +``highly optimized'' and tweaked Kerberos/Athena code (byte-order dependent): + 165us per encryption (uses 6k worth of tables) + 478us to set a new key (uses <1k of key tables) + so despite the comments in this code, it was possible to get + faster code AND smaller tables, as well as making the tables + machine-independent. + (code obtained from prep.ai.mit.edu) + +UC Berkeley code (depends on machine-endedness): + 226us per encryption +10848us to set a new key + table sizes are unclear, but they don't look very small + (code obtained from wuarchive.wustl.edu) + + +motivation and history + +a while ago i wanted some des routines and the routines documented on sun's +man pages either didn't exist or dumped core. i had heard of kerberos, +and knew that it used des, so i figured i'd use its routines. but once +i got it and looked at the code, it really set off a lot of pet peeves - +it was too convoluted, the code had been written without taking +advantage of the regular structure of operations such as IP, E, and FP +(i.e. the author didn't sit down and think before coding), +it was excessively slow, the author had attempted to clarify the code +by adding MORE statements to make the data movement more `consistent' +instead of simplifying his implementation and cutting down on all data +movement (in particular, his use of L1, R1, L2, R2), and it was full of +idiotic `tweaks' for particular machines which failed to deliver significant +speedups but which did obfuscate everything. so i took the test data +from his verification program and rewrote everything else. + +a while later i ran across the great crypt(3) package mentioned above. +the fact that this guy was computing 2 sboxes per table lookup rather +than one (and using a MUCH larger table in the process) emboldened me to +do the same - it was a trivial change from which i had been scared away +by the larger table size. in his case he didn't realize you don't need to keep +the working data in TWO forms, one for easy use of half the sboxes in +indexing, the other for easy use of the other half; instead you can keep +it in the form for the first half and use a simple rotate to get the other +half. this means i have (almost) half the data manipulation and half +the table size. in fairness though he might be encoding something particular +to crypt(3) in his tables - i didn't check. + +i'm glad that i implemented it the way i did, because this C version is +portable (the ifdef's are performance enhancements) and it is faster +than versions hand-written in assembly for the sparc! + + +porting notes + +one thing i did not want to do was write an enormous mess +which depended on endedness and other machine quirks, +and which necessarily produced different code and different lookup tables +for different machines. see the kerberos code for an example +of what i didn't want to do; all their endedness-specific `optimizations' +obfuscate the code and in the end were slower than a simpler machine +independent approach. however, there are always some portability +considerations of some kind, and i have included some options +for varying numbers of register variables. +perhaps some will still regard the result as a mess! + +1) i assume everything is byte addressable, although i don't actually + depend on the byte order, and that bytes are 8 bits. + i assume word pointers can be freely cast to and from char pointers. + note that 99% of C programs make these assumptions. + i always use unsigned char's if the high bit could be set. +2) the typedef `word' means a 32 bit unsigned integral type. + if `unsigned long' is not 32 bits, change the typedef in desCore.h. + i assume sizeof(word) == 4 EVERYWHERE. + +the (worst-case) cost of my NOT doing endedness-specific optimizations +in the data loading and storing code surrounding the key iterations +is less than 12%. also, there is the added benefit that +the input and output work areas do not need to be word-aligned. + + +OPTIONAL performance optimizations + +1) you should define one of `i386,' `vax,' `mc68000,' or `sparc,' + whichever one is closest to the capabilities of your machine. + see the start of desCode.h to see exactly what this selection implies. + note that if you select the wrong one, the des code will still work; + these are just performance tweaks. +2) for those with functional `asm' keywords: you should change the + ROR and ROL macros to use machine rotate instructions if you have them. + this will save 2 instructions and a temporary per use, + or about 32 to 40 instructions per en/decryption. + note that gcc is smart enough to translate the ROL/R macros into + machine rotates! + +these optimizations are all rather persnickety, yet with them you should +be able to get performance equal to assembly-coding, except that: +1) with the lack of a bit rotate operator in C, rotates have to be synthesized + from shifts. so access to `asm' will speed things up if your machine + has rotates, as explained above in (3) (not necessary if you use gcc). +2) if your machine has less than 12 32-bit registers i doubt your compiler will + generate good code. + `i386' tries to configure the code for a 386 by only declaring 3 registers + (it appears that gcc can use ebx, esi and edi to hold register variables). + however, if you like assembly coding, the 386 does have 7 32-bit registers, + and if you use ALL of them, use `scaled by 8' address modes with displacement + and other tricks, you can get reasonable routines for DesQuickCore... with + about 250 instructions apiece. For DesSmall... it will help to rearrange + des_keymap, i.e., now the sbox # is the high part of the index and + the 6 bits of data is the low part; it helps to exchange these. + since i have no way to conveniently test it i have not provided my + shoehorned 386 version. note that with this release of desCore, gcc is able + to put everything in registers(!), and generate about 370 instructions apiece + for the DesQuickCore... routines! + +coding notes + +the en/decryption routines each use 6 necessary register variables, +with 4 being actively used at once during the inner iterations. +if you don't have 4 register variables get a new machine. +up to 8 more registers are used to hold constants in some configurations. + +i assume that the use of a constant is more expensive than using a register: +a) additionally, i have tried to put the larger constants in registers. + registering priority was by the following: + anything more than 12 bits (bad for RISC and CISC) + greater than 127 in value (can't use movq or byte immediate on CISC) + 9-127 (may not be able to use CISC shift immediate or add/sub quick), + 1-8 were never registered, being the cheapest constants. +b) the compiler may be too stupid to realize table and table+256 should + be assigned to different constant registers and instead repetitively + do the arithmetic, so i assign these to explicit `m' register variables + when possible and helpful. + +i assume that indexing is cheaper or equivalent to auto increment/decrement, +where the index is 7 bits unsigned or smaller. +this assumption is reversed for 68k and vax. + +i assume that addresses can be cheaply formed from two registers, +or from a register and a small constant. +for the 68000, the `two registers and small offset' form is used sparingly. +all index scaling is done explicitly - no hidden shifts by log2(sizeof). + +the code is written so that even a dumb compiler +should never need more than one hidden temporary, +increasing the chance that everything will fit in the registers. +KEEP THIS MORE SUBTLE POINT IN MIND IF YOU REWRITE ANYTHING. +(actually, there are some code fragments now which do require two temps, +but fixing it would either break the structure of the macros or +require declaring another temporary). + + +special efficient data format + +bits are manipulated in this arrangement most of the time (S7 S5 S3 S1): + 003130292827xxxx242322212019xxxx161514131211xxxx080706050403xxxx +(the x bits are still there, i'm just emphasizing where the S boxes are). +bits are rotated left 4 when computing S6 S4 S2 S0: + 282726252423xxxx201918171615xxxx121110090807xxxx040302010031xxxx +the rightmost two bits are usually cleared so the lower byte can be used +as an index into an sbox mapping table. the next two x'd bits are set +to various values to access different parts of the tables. + + +how to use the routines + +datatypes: + pointer to 8 byte area of type DesData + used to hold keys and input/output blocks to des. + + pointer to 128 byte area of type DesKeys + used to hold full 768-bit key. + must be long-aligned. + +DesQuickInit() + call this before using any other routine with `Quick' in its name. + it generates the special 64k table these routines need. +DesQuickDone() + frees this table + +DesMethod(m, k) + m points to a 128byte block, k points to an 8 byte des key + which must have odd parity (or -1 is returned) and which must + not be a (semi-)weak key (or -2 is returned). + normally DesMethod() returns 0. + m is filled in from k so that when one of the routines below + is called with m, the routine will act like standard des + en/decryption with the key k. if you use DesMethod, + you supply a standard 56bit key; however, if you fill in + m yourself, you will get a 768bit key - but then it won't + be standard. it's 768bits not 1024 because the least significant + two bits of each byte are not used. note that these two bits + will be set to magic constants which speed up the encryption/decryption + on some machines. and yes, each byte controls + a specific sbox during a specific iteration. + you really shouldn't use the 768bit format directly; i should + provide a routine that converts 128 6-bit bytes (specified in + S-box mapping order or something) into the right format for you. + this would entail some byte concatenation and rotation. + +Des{Small|Quick}{Fips|Core}{Encrypt|Decrypt}(d, m, s) + performs des on the 8 bytes at s into the 8 bytes at d. (d,s: char *). + uses m as a 768bit key as explained above. + the Encrypt|Decrypt choice is obvious. + Fips|Core determines whether a completely standard FIPS initial + and final permutation is done; if not, then the data is loaded + and stored in a nonstandard bit order (FIPS w/o IP/FP). + Fips slows down Quick by 10%, Small by 9%. + Small|Quick determines whether you use the normal routine + or the crazy quick one which gobbles up 64k more of memory. + Small is 50% slower then Quick, but Quick needs 32 times as much + memory. Quick is included for programs that do nothing but DES, + e.g., encryption filters, etc. + + +Getting it to compile on your machine + +there are no machine-dependencies in the code (see porting), +except perhaps the `now()' macro in desTest.c. +ALL generated tables are machine independent. +you should edit the Makefile with the appropriate optimization flags +for your compiler (MAX optimization). + + +Speeding up kerberos (and/or its des library) + +note that i have included a kerberos-compatible interface in desUtil.c +through the functions des_key_sched() and des_ecb_encrypt(). +to use these with kerberos or kerberos-compatible code put desCore.a +ahead of the kerberos-compatible library on your linker's command line. +you should not need to #include desCore.h; just include the header +file provided with the kerberos library. + +Other uses + +the macros in desCode.h would be very useful for putting inline des +functions in more complicated encryption routines. diff --git a/Documentation/debugging-modules.txt b/Documentation/debugging-modules.txt new file mode 100644 index 000000000000..24029f65fc94 --- /dev/null +++ b/Documentation/debugging-modules.txt @@ -0,0 +1,18 @@ +Debugging Modules after 2.6.3 +----------------------------- + +In almost all distributions, the kernel asks for modules which don't +exist, such as "net-pf-10" or whatever. Changing "modprobe -q" to +"succeed" in this case is hacky and breaks some setups, and also we +want to know if it failed for the fallback code for old aliases in +fs/char_dev.c, for example. + +In the past a debugging message which would fill people's logs was +emitted. This debugging message has been removed. The correct way +of debugging module problems is something like this: + +echo '#! /bin/sh' > /tmp/modprobe +echo 'echo "$@" >> /tmp/modprobe.log' >> /tmp/modprobe +echo 'exec /sbin/modprobe "$@"' >> /tmp/modprobe +chmod a+x /tmp/modprobe +echo /tmp/modprobe > /proc/sys/kernel/modprobe diff --git a/Documentation/device-mapper/dm-io.txt b/Documentation/device-mapper/dm-io.txt new file mode 100644 index 000000000000..3b5d9a52cdcf --- /dev/null +++ b/Documentation/device-mapper/dm-io.txt @@ -0,0 +1,75 @@ +dm-io +===== + +Dm-io provides synchronous and asynchronous I/O services. There are three +types of I/O services available, and each type has a sync and an async +version. + +The user must set up an io_region structure to describe the desired location +of the I/O. Each io_region indicates a block-device along with the starting +sector and size of the region. + + struct io_region { + struct block_device *bdev; + sector_t sector; + sector_t count; + }; + +Dm-io can read from one io_region or write to one or more io_regions. Writes +to multiple regions are specified by an array of io_region structures. + +The first I/O service type takes a list of memory pages as the data buffer for +the I/O, along with an offset into the first page. + + struct page_list { + struct page_list *next; + struct page *page; + }; + + int dm_io_sync(unsigned int num_regions, struct io_region *where, int rw, + struct page_list *pl, unsigned int offset, + unsigned long *error_bits); + int dm_io_async(unsigned int num_regions, struct io_region *where, int rw, + struct page_list *pl, unsigned int offset, + io_notify_fn fn, void *context); + +The second I/O service type takes an array of bio vectors as the data buffer +for the I/O. This service can be handy if the caller has a pre-assembled bio, +but wants to direct different portions of the bio to different devices. + + int dm_io_sync_bvec(unsigned int num_regions, struct io_region *where, + int rw, struct bio_vec *bvec, + unsigned long *error_bits); + int dm_io_async_bvec(unsigned int num_regions, struct io_region *where, + int rw, struct bio_vec *bvec, + io_notify_fn fn, void *context); + +The third I/O service type takes a pointer to a vmalloc'd memory buffer as the +data buffer for the I/O. This service can be handy if the caller needs to do +I/O to a large region but doesn't want to allocate a large number of individual +memory pages. + + int dm_io_sync_vm(unsigned int num_regions, struct io_region *where, int rw, + void *data, unsigned long *error_bits); + int dm_io_async_vm(unsigned int num_regions, struct io_region *where, int rw, + void *data, io_notify_fn fn, void *context); + +Callers of the asynchronous I/O services must include the name of a completion +callback routine and a pointer to some context data for the I/O. + + typedef void (*io_notify_fn)(unsigned long error, void *context); + +The "error" parameter in this callback, as well as the "*error" parameter in +all of the synchronous versions, is a bitset (instead of a simple error value). +In the case of an write-I/O to multiple regions, this bitset allows dm-io to +indicate success or failure on each individual region. + +Before using any of the dm-io services, the user should call dm_io_get() +and specify the number of pages they expect to perform I/O on concurrently. +Dm-io will attempt to resize its mempool to make sure enough pages are +always available in order to avoid unnecessary waiting while performing I/O. + +When the user is finished using the dm-io services, they should call +dm_io_put() and specify the same number of pages that were given on the +dm_io_get() call. + diff --git a/Documentation/device-mapper/kcopyd.txt b/Documentation/device-mapper/kcopyd.txt new file mode 100644 index 000000000000..820382c4cecf --- /dev/null +++ b/Documentation/device-mapper/kcopyd.txt @@ -0,0 +1,47 @@ +kcopyd +====== + +Kcopyd provides the ability to copy a range of sectors from one block-device +to one or more other block-devices, with an asynchronous completion +notification. It is used by dm-snapshot and dm-mirror. + +Users of kcopyd must first create a client and indicate how many memory pages +to set aside for their copy jobs. This is done with a call to +kcopyd_client_create(). + + int kcopyd_client_create(unsigned int num_pages, + struct kcopyd_client **result); + +To start a copy job, the user must set up io_region structures to describe +the source and destinations of the copy. Each io_region indicates a +block-device along with the starting sector and size of the region. The source +of the copy is given as one io_region structure, and the destinations of the +copy are given as an array of io_region structures. + + struct io_region { + struct block_device *bdev; + sector_t sector; + sector_t count; + }; + +To start the copy, the user calls kcopyd_copy(), passing in the client +pointer, pointers to the source and destination io_regions, the name of a +completion callback routine, and a pointer to some context data for the copy. + + int kcopyd_copy(struct kcopyd_client *kc, struct io_region *from, + unsigned int num_dests, struct io_region *dests, + unsigned int flags, kcopyd_notify_fn fn, void *context); + + typedef void (*kcopyd_notify_fn)(int read_err, unsigned int write_err, + void *context); + +When the copy completes, kcopyd will call the user's completion routine, +passing back the user's context pointer. It will also indicate if a read or +write error occurred during the copy. + +When a user is done with all their copy jobs, they should call +kcopyd_client_destroy() to delete the kcopyd client, which will release the +associated memory pages. + + void kcopyd_client_destroy(struct kcopyd_client *kc); + diff --git a/Documentation/device-mapper/linear.txt b/Documentation/device-mapper/linear.txt new file mode 100644 index 000000000000..d5307d380a45 --- /dev/null +++ b/Documentation/device-mapper/linear.txt @@ -0,0 +1,61 @@ +dm-linear +========= + +Device-Mapper's "linear" target maps a linear range of the Device-Mapper +device onto a linear range of another device. This is the basic building +block of logical volume managers. + +Parameters: <dev path> <offset> + <dev path>: Full pathname to the underlying block-device, or a + "major:minor" device-number. + <offset>: Starting sector within the device. + + +Example scripts +=============== +[[ +#!/bin/sh +# Create an identity mapping for a device +echo "0 `blockdev --getsize $1` linear $1 0" | dmsetup create identity +]] + + +[[ +#!/bin/sh +# Join 2 devices together +size1=`blockdev --getsize $1` +size2=`blockdev --getsize $2` +echo "0 $size1 linear $1 0 +$size1 $size2 linear $2 0" | dmsetup create joined +]] + + +[[ +#!/usr/bin/perl -w +# Split a device into 4M chunks and then join them together in reverse order. + +my $name = "reverse"; +my $extent_size = 4 * 1024 * 2; +my $dev = $ARGV[0]; +my $table = ""; +my $count = 0; + +if (!defined($dev)) { + die("Please specify a device.\n"); +} + +my $dev_size = `blockdev --getsize $dev`; +my $extents = int($dev_size / $extent_size) - + (($dev_size % $extent_size) ? 1 : 0); + +while ($extents > 0) { + my $this_start = $count * $extent_size; + $extents--; + $count++; + my $this_offset = $extents * $extent_size; + + $table .= "$this_start $extent_size linear $dev $this_offset\n"; +} + +`echo \"$table\" | dmsetup create $name`; +]] diff --git a/Documentation/device-mapper/striped.txt b/Documentation/device-mapper/striped.txt new file mode 100644 index 000000000000..f34d3236b9da --- /dev/null +++ b/Documentation/device-mapper/striped.txt @@ -0,0 +1,58 @@ +dm-stripe +========= + +Device-Mapper's "striped" target is used to create a striped (i.e. RAID-0) +device across one or more underlying devices. Data is written in "chunks", +with consecutive chunks rotating among the underlying devices. This can +potentially provide improved I/O throughput by utilizing several physical +devices in parallel. + +Parameters: <num devs> <chunk size> [<dev path> <offset>]+ + <num devs>: Number of underlying devices. + <chunk size>: Size of each chunk of data. Must be a power-of-2 and at + least as large as the system's PAGE_SIZE. + <dev path>: Full pathname to the underlying block-device, or a + "major:minor" device-number. + <offset>: Starting sector within the device. + +One or more underlying devices can be specified. The striped device size must +be a multiple of the chunk size and a multiple of the number of underlying +devices. + + +Example scripts +=============== + +[[ +#!/usr/bin/perl -w +# Create a striped device across any number of underlying devices. The device +# will be called "stripe_dev" and have a chunk-size of 128k. + +my $chunk_size = 128 * 2; +my $dev_name = "stripe_dev"; +my $num_devs = @ARGV; +my @devs = @ARGV; +my ($min_dev_size, $stripe_dev_size, $i); + +if (!$num_devs) { + die("Specify at least one device\n"); +} + +$min_dev_size = `blockdev --getsize $devs[0]`; +for ($i = 1; $i < $num_devs; $i++) { + my $this_size = `blockdev --getsize $devs[$i]`; + $min_dev_size = ($min_dev_size < $this_size) ? + $min_dev_size : $this_size; +} + +$stripe_dev_size = $min_dev_size * $num_devs; +$stripe_dev_size -= $stripe_dev_size % ($chunk_size * $num_devs); + +$table = "0 $stripe_dev_size striped $num_devs $chunk_size"; +for ($i = 0; $i < $num_devs; $i++) { + $table .= " $devs[$i] 0"; +} + +`echo $table | dmsetup create $dev_name`; +]] + diff --git a/Documentation/device-mapper/zero.txt b/Documentation/device-mapper/zero.txt new file mode 100644 index 000000000000..20fb38e7fa7e --- /dev/null +++ b/Documentation/device-mapper/zero.txt @@ -0,0 +1,37 @@ +dm-zero +======= + +Device-Mapper's "zero" target provides a block-device that always returns +zero'd data on reads and silently drops writes. This is similar behavior to +/dev/zero, but as a block-device instead of a character-device. + +Dm-zero has no target-specific parameters. + +One very interesting use of dm-zero is for creating "sparse" devices in +conjunction with dm-snapshot. A sparse device reports a device-size larger +than the amount of actual storage space available for that device. A user can +write data anywhere within the sparse device and read it back like a normal +device. Reads to previously unwritten areas will return a zero'd buffer. When +enough data has been written to fill up the actual storage space, the sparse +device is deactivated. This can be very useful for testing device and +filesystem limitations. + +To create a sparse device, start by creating a dm-zero device that's the +desired size of the sparse device. For this example, we'll assume a 10TB +sparse device. + +TEN_TERABYTES=`expr 10 \* 1024 \* 1024 \* 1024 \* 2` # 10 TB in sectors +echo "0 $TEN_TERABYTES zero" | dmsetup create zero1 + +Then create a snapshot of the zero device, using any available block-device as +the COW device. The size of the COW device will determine the amount of real +space available to the sparse device. For this example, we'll assume /dev/sdb1 +is an available 10GB partition. + +echo "0 $TEN_TERABYTES snapshot /dev/mapper/zero1 /dev/sdb1 p 128" | \ + dmsetup create sparse1 + +This will create a 10TB sparse device called /dev/mapper/sparse1 that has +10GB of actual storage space available. If more than 10GB of data is written +to this device, it will start returning I/O errors. + diff --git a/Documentation/devices.txt b/Documentation/devices.txt new file mode 100644 index 000000000000..bb67cf25010e --- /dev/null +++ b/Documentation/devices.txt @@ -0,0 +1,3216 @@ + + LINUX ALLOCATED DEVICES (2.6+ version) + + Maintained by Torben Mathiasen <device@lanana.org> + + Last revised: 25 January 2005 + +This list is the Linux Device List, the official registry of allocated +device numbers and /dev directory nodes for the Linux operating +system. + +The latest version of this list is available from +http://www.lanana.org/docs/device-list/ or +ftp://ftp.kernel.org/pub/linux/docs/device-list/. This version may be +newer than the one distributed with the Linux kernel. + +The LaTeX version of this document is no longer maintained. + +This document is included by reference into the Filesystem Hierarchy +Standard (FHS). The FHS is available from http://www.pathname.com/fhs/. + +Allocations marked (68k/Amiga) apply to Linux/68k on the Amiga +platform only. Allocations marked (68k/Atari) apply to Linux/68k on +the Atari platform only. + +The symbol {2.6} means the allocation is obsolete and scheduled for +removal once kernel version 2.6 (or equivalent) is released. Some of these +allocations have already been removed. + +This document is in the public domain. The author requests, however, +that semantically altered versions are not distributed without +permission of the author, assuming the author can be contacted without +an unreasonable effort. + +In particular, please don't sent patches for this list to Linus, at +least not without contacting me first. + +I do not have any information about these devices beyond what appears +on this list. Any such information requests will be deleted without +reply. + + + **** DEVICE DRIVERS AUTHORS PLEASE READ THIS **** + +To have a major number allocated, or a minor number in situations +where that applies (e.g. busmice), please contact me with the +appropriate device information. Also, if you have additional +information regarding any of the devices listed below, or if I have +made a mistake, I would greatly appreciate a note. + +I do, however, make a few requests about the nature of your report. +This is necessary for me to be able to keep this list up to date and +correct in a timely manner. First of all, *please* send it to the +correct address... <device@lanana.org>. I receive hundreds of email +messages a day, so mail sent to other addresses may very well get lost +in the avalanche. Please put in a descriptive subject, so I can find +your mail again should I need to. Too many people send me email +saying just "device number request" in the subject. + +Second, please include a description of the device *in the same format +as this list*. The reason for this is that it is the only way I have +found to ensure I have all the requisite information to publish your +device and avoid conflicts. + +Third, please don't assume that the distributed version of the list is +up to date. Due to the number of registrations I have to maintain it +in "batch mode", so there is likely additional registrations that +haven't been listed yet. + +Finally, sometimes I have to play "namespace police." Please don't be +offended. I often get submissions for /dev names that would be bound +to cause conflicts down the road. I am trying to avoid getting in a +situation where we would have to suffer an incompatible forward +change. Therefore, please consult with me *before* you make your +device names and numbers in any way public, at least to the point +where it would be at all difficult to get them changed. + +Your cooperation is appreciated. + + + 0 Unnamed devices (e.g. non-device mounts) + 0 = reserved as null device number + See block major 144, 145, 146 for expansion areas. + + 1 char Memory devices + 1 = /dev/mem Physical memory access + 2 = /dev/kmem Kernel virtual memory access + 3 = /dev/null Null device + 4 = /dev/port I/O port access + 5 = /dev/zero Null byte source + 6 = /dev/core OBSOLETE - replaced by /proc/kcore + 7 = /dev/full Returns ENOSPC on write + 8 = /dev/random Nondeterministic random number gen. + 9 = /dev/urandom Faster, less secure random number gen. + 10 = /dev/aio Asyncronous I/O notification interface + 11 = /dev/kmsg Writes to this come out as printk's + 1 block RAM disk + 0 = /dev/ram0 First RAM disk + 1 = /dev/ram1 Second RAM disk + ... + 250 = /dev/initrd Initial RAM disk {2.6} + + Older kernels had /dev/ramdisk (1, 1) here. + /dev/initrd refers to a RAM disk which was preloaded + by the boot loader; newer kernels use /dev/ram0 for + the initrd. + + 2 char Pseudo-TTY masters + 0 = /dev/ptyp0 First PTY master + 1 = /dev/ptyp1 Second PTY master + ... + 255 = /dev/ptyef 256th PTY master + + Pseudo-tty's are named as follows: + * Masters are "pty", slaves are "tty"; + * the fourth letter is one of pqrstuvwxyzabcde indicating + the 1st through 16th series of 16 pseudo-ttys each, and + * the fifth letter is one of 0123456789abcdef indicating + the position within the series. + + These are the old-style (BSD) PTY devices; Unix98 + devices are on major 128 and above and use the PTY + master multiplex (/dev/ptmx) to acquire a PTY on + demand. + + 2 block Floppy disks + 0 = /dev/fd0 Controller 0, drive 0, autodetect + 1 = /dev/fd1 Controller 0, drive 1, autodetect + 2 = /dev/fd2 Controller 0, drive 2, autodetect + 3 = /dev/fd3 Controller 0, drive 3, autodetect + 128 = /dev/fd4 Controller 1, drive 0, autodetect + 129 = /dev/fd5 Controller 1, drive 1, autodetect + 130 = /dev/fd6 Controller 1, drive 2, autodetect + 131 = /dev/fd7 Controller 1, drive 3, autodetect + + To specify format, add to the autodetect device number: + 0 = /dev/fd? Autodetect format + 4 = /dev/fd?d360 5.25" 360K in a 360K drive(1) + 20 = /dev/fd?h360 5.25" 360K in a 1200K drive(1) + 48 = /dev/fd?h410 5.25" 410K in a 1200K drive + 64 = /dev/fd?h420 5.25" 420K in a 1200K drive + 24 = /dev/fd?h720 5.25" 720K in a 1200K drive + 80 = /dev/fd?h880 5.25" 880K in a 1200K drive(1) + 8 = /dev/fd?h1200 5.25" 1200K in a 1200K drive(1) + 40 = /dev/fd?h1440 5.25" 1440K in a 1200K drive(1) + 56 = /dev/fd?h1476 5.25" 1476K in a 1200K drive + 72 = /dev/fd?h1494 5.25" 1494K in a 1200K drive + 92 = /dev/fd?h1600 5.25" 1600K in a 1200K drive(1) + + 12 = /dev/fd?u360 3.5" 360K Double Density(2) + 16 = /dev/fd?u720 3.5" 720K Double Density(1) + 120 = /dev/fd?u800 3.5" 800K Double Density(2) + 52 = /dev/fd?u820 3.5" 820K Double Density + 68 = /dev/fd?u830 3.5" 830K Double Density + 84 = /dev/fd?u1040 3.5" 1040K Double Density(1) + 88 = /dev/fd?u1120 3.5" 1120K Double Density(1) + 28 = /dev/fd?u1440 3.5" 1440K High Density(1) + 124 = /dev/fd?u1600 3.5" 1600K High Density(1) + 44 = /dev/fd?u1680 3.5" 1680K High Density(3) + 60 = /dev/fd?u1722 3.5" 1722K High Density + 76 = /dev/fd?u1743 3.5" 1743K High Density + 96 = /dev/fd?u1760 3.5" 1760K High Density + 116 = /dev/fd?u1840 3.5" 1840K High Density(3) + 100 = /dev/fd?u1920 3.5" 1920K High Density(1) + 32 = /dev/fd?u2880 3.5" 2880K Extra Density(1) + 104 = /dev/fd?u3200 3.5" 3200K Extra Density + 108 = /dev/fd?u3520 3.5" 3520K Extra Density + 112 = /dev/fd?u3840 3.5" 3840K Extra Density(1) + + 36 = /dev/fd?CompaQ Compaq 2880K drive; obsolete? + + (1) Autodetectable format + (2) Autodetectable format in a Double Density (720K) drive only + (3) Autodetectable format in a High Density (1440K) drive only + + NOTE: The letter in the device name (d, q, h or u) + signifies the type of drive: 5.25" Double Density (d), + 5.25" Quad Density (q), 5.25" High Density (h) or 3.5" + (any model, u). The use of the capital letters D, H + and E for the 3.5" models have been deprecated, since + the drive type is insignificant for these devices. + + 3 char Pseudo-TTY slaves + 0 = /dev/ttyp0 First PTY slave + 1 = /dev/ttyp1 Second PTY slave + ... + 255 = /dev/ttyef 256th PTY slave + + These are the old-style (BSD) PTY devices; Unix98 + devices are on major 136 and above. + + 3 block First MFM, RLL and IDE hard disk/CD-ROM interface + 0 = /dev/hda Master: whole disk (or CD-ROM) + 64 = /dev/hdb Slave: whole disk (or CD-ROM) + + For partitions, add to the whole disk device number: + 0 = /dev/hd? Whole disk + 1 = /dev/hd?1 First partition + 2 = /dev/hd?2 Second partition + ... + 63 = /dev/hd?63 63rd partition + + For Linux/i386, partitions 1-4 are the primary + partitions, and 5 and above are logical partitions. + Other versions of Linux use partitioning schemes + appropriate to their respective architectures. + + 4 char TTY devices + 0 = /dev/tty0 Current virtual console + + 1 = /dev/tty1 First virtual console + ... + 63 = /dev/tty63 63rd virtual console + 64 = /dev/ttyS0 First UART serial port + ... + 255 = /dev/ttyS191 192nd UART serial port + + UART serial ports refer to 8250/16450/16550 series devices. + + Older versions of the Linux kernel used this major + number for BSD PTY devices. As of Linux 2.1.115, this + is no longer supported. Use major numbers 2 and 3. + + 4 block Aliases for dynamically allocated major devices to be used + when its not possible to create the real device nodes + because the root filesystem is mounted read-only. + + 0 = /dev/root + + 5 char Alternate TTY devices + 0 = /dev/tty Current TTY device + 1 = /dev/console System console + 2 = /dev/ptmx PTY master multiplex + 64 = /dev/cua0 Callout device for ttyS0 + ... + 255 = /dev/cua191 Callout device for ttyS191 + + (5,1) is /dev/console starting with Linux 2.1.71. See + the section on terminal devices for more information + on /dev/console. + + 6 char Parallel printer devices + 0 = /dev/lp0 Parallel printer on parport0 + 1 = /dev/lp1 Parallel printer on parport1 + ... + + Current Linux kernels no longer have a fixed mapping + between parallel ports and I/O addresses. Instead, + they are redirected through the parport multiplex layer. + + 7 char Virtual console capture devices + 0 = /dev/vcs Current vc text contents + 1 = /dev/vcs1 tty1 text contents + ... + 63 = /dev/vcs63 tty63 text contents + 128 = /dev/vcsa Current vc text/attribute contents + 129 = /dev/vcsa1 tty1 text/attribute contents + ... + 191 = /dev/vcsa63 tty63 text/attribute contents + + NOTE: These devices permit both read and write access. + + 7 block Loopback devices + 0 = /dev/loop0 First loopback device + 1 = /dev/loop1 Second loopback device + ... + + The loopback devices are used to mount filesystems not + associated with block devices. The binding to the + loopback devices is handled by mount(8) or losetup(8). + + 8 block SCSI disk devices (0-15) + 0 = /dev/sda First SCSI disk whole disk + 16 = /dev/sdb Second SCSI disk whole disk + 32 = /dev/sdc Third SCSI disk whole disk + ... + 240 = /dev/sdp Sixteenth SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 9 char SCSI tape devices + 0 = /dev/st0 First SCSI tape, mode 0 + 1 = /dev/st1 Second SCSI tape, mode 0 + ... + 32 = /dev/st0l First SCSI tape, mode 1 + 33 = /dev/st1l Second SCSI tape, mode 1 + ... + 64 = /dev/st0m First SCSI tape, mode 2 + 65 = /dev/st1m Second SCSI tape, mode 2 + ... + 96 = /dev/st0a First SCSI tape, mode 3 + 97 = /dev/st1a Second SCSI tape, mode 3 + ... + 128 = /dev/nst0 First SCSI tape, mode 0, no rewind + 129 = /dev/nst1 Second SCSI tape, mode 0, no rewind + ... + 160 = /dev/nst0l First SCSI tape, mode 1, no rewind + 161 = /dev/nst1l Second SCSI tape, mode 1, no rewind + ... + 192 = /dev/nst0m First SCSI tape, mode 2, no rewind + 193 = /dev/nst1m Second SCSI tape, mode 2, no rewind + ... + 224 = /dev/nst0a First SCSI tape, mode 3, no rewind + 225 = /dev/nst1a Second SCSI tape, mode 3, no rewind + ... + + "No rewind" refers to the omission of the default + automatic rewind on device close. The MTREW or MTOFFL + ioctl()'s can be used to rewind the tape regardless of + the device used to access it. + + 9 block Metadisk (RAID) devices + 0 = /dev/md0 First metadisk group + 1 = /dev/md1 Second metadisk group + ... + + The metadisk driver is used to span a + filesystem across multiple physical disks. + + 10 char Non-serial mice, misc features + 0 = /dev/logibm Logitech bus mouse + 1 = /dev/psaux PS/2-style mouse port + 2 = /dev/inportbm Microsoft Inport bus mouse + 3 = /dev/atibm ATI XL bus mouse + 4 = /dev/jbm J-mouse + 4 = /dev/amigamouse Amiga mouse (68k/Amiga) + 5 = /dev/atarimouse Atari mouse + 6 = /dev/sunmouse Sun mouse + 7 = /dev/amigamouse1 Second Amiga mouse + 8 = /dev/smouse Simple serial mouse driver + 9 = /dev/pc110pad IBM PC-110 digitizer pad + 10 = /dev/adbmouse Apple Desktop Bus mouse + 11 = /dev/vrtpanel Vr41xx embedded touch panel + 13 = /dev/vpcmouse Connectix Virtual PC Mouse + 14 = /dev/touchscreen/ucb1x00 UCB 1x00 touchscreen + 15 = /dev/touchscreen/mk712 MK712 touchscreen + 128 = /dev/beep Fancy beep device + 129 = /dev/modreq Kernel module load request {2.6} + 130 = /dev/watchdog Watchdog timer port + 131 = /dev/temperature Machine internal temperature + 132 = /dev/hwtrap Hardware fault trap + 133 = /dev/exttrp External device trap + 134 = /dev/apm_bios Advanced Power Management BIOS + 135 = /dev/rtc Real Time Clock + 139 = /dev/openprom SPARC OpenBoot PROM + 140 = /dev/relay8 Berkshire Products Octal relay card + 141 = /dev/relay16 Berkshire Products ISO-16 relay card + 142 = /dev/msr x86 model-specific registers {2.6} + 143 = /dev/pciconf PCI configuration space + 144 = /dev/nvram Non-volatile configuration RAM + 145 = /dev/hfmodem Soundcard shortwave modem control {2.6} + 146 = /dev/graphics Linux/SGI graphics device + 147 = /dev/opengl Linux/SGI OpenGL pipe + 148 = /dev/gfx Linux/SGI graphics effects device + 149 = /dev/input/mouse Linux/SGI Irix emulation mouse + 150 = /dev/input/keyboard Linux/SGI Irix emulation keyboard + 151 = /dev/led Front panel LEDs + 152 = /dev/kpoll Kernel Poll Driver + 153 = /dev/mergemem Memory merge device + 154 = /dev/pmu Macintosh PowerBook power manager + 155 = /dev/isictl MultiTech ISICom serial control + 156 = /dev/lcd Front panel LCD display + 157 = /dev/ac Applicom Intl Profibus card + 158 = /dev/nwbutton Netwinder external button + 159 = /dev/nwdebug Netwinder debug interface + 160 = /dev/nwflash Netwinder flash memory + 161 = /dev/userdma User-space DMA access + 162 = /dev/smbus System Management Bus + 163 = /dev/lik Logitech Internet Keyboard + 164 = /dev/ipmo Intel Intelligent Platform Management + 165 = /dev/vmmon VMWare virtual machine monitor + 166 = /dev/i2o/ctl I2O configuration manager + 167 = /dev/specialix_sxctl Specialix serial control + 168 = /dev/tcldrv Technology Concepts serial control + 169 = /dev/specialix_rioctl Specialix RIO serial control + 170 = /dev/thinkpad/thinkpad IBM Thinkpad devices + 171 = /dev/srripc QNX4 API IPC manager + 172 = /dev/usemaclone Semaphore clone device + 173 = /dev/ipmikcs Intelligent Platform Management + 174 = /dev/uctrl SPARCbook 3 microcontroller + 175 = /dev/agpgart AGP Graphics Address Remapping Table + 176 = /dev/gtrsc Gorgy Timing radio clock + 177 = /dev/cbm Serial CBM bus + 178 = /dev/jsflash JavaStation OS flash SIMM + 179 = /dev/xsvc High-speed shared-mem/semaphore service + 180 = /dev/vrbuttons Vr41xx button input device + 181 = /dev/toshiba Toshiba laptop SMM support + 182 = /dev/perfctr Performance-monitoring counters + 183 = /dev/hwrng Generic random number generator + 184 = /dev/cpu/microcode CPU microcode update interface + 186 = /dev/atomicps Atomic shapshot of process state data + 187 = /dev/irnet IrNET device + 188 = /dev/smbusbios SMBus BIOS + 189 = /dev/ussp_ctl User space serial port control + 190 = /dev/crash Mission Critical Linux crash dump facility + 191 = /dev/pcl181 <information missing> + 192 = /dev/nas_xbus NAS xbus LCD/buttons access + 193 = /dev/d7s SPARC 7-segment display + 194 = /dev/zkshim Zero-Knowledge network shim control + 195 = /dev/elographics/e2201 Elographics touchscreen E271-2201 + 198 = /dev/sexec Signed executable interface + 199 = /dev/scanners/cuecat :CueCat barcode scanner + 200 = /dev/net/tun TAP/TUN network device + 201 = /dev/button/gulpb Transmeta GULP-B buttons + 202 = /dev/emd/ctl Enhanced Metadisk RAID (EMD) control + 204 = /dev/video/em8300 EM8300 DVD decoder control + 205 = /dev/video/em8300_mv EM8300 DVD decoder video + 206 = /dev/video/em8300_ma EM8300 DVD decoder audio + 207 = /dev/video/em8300_sp EM8300 DVD decoder subpicture + 208 = /dev/compaq/cpqphpc Compaq PCI Hot Plug Controller + 209 = /dev/compaq/cpqrid Compaq Remote Insight Driver + 210 = /dev/impi/bt IMPI coprocessor block transfer + 211 = /dev/impi/smic IMPI coprocessor stream interface + 212 = /dev/watchdogs/0 First watchdog device + 213 = /dev/watchdogs/1 Second watchdog device + 214 = /dev/watchdogs/2 Third watchdog device + 215 = /dev/watchdogs/3 Fourth watchdog device + 216 = /dev/fujitsu/apanel Fujitsu/Siemens application panel + 217 = /dev/ni/natmotn National Instruments Motion + 218 = /dev/kchuid Inter-process chuid control + 219 = /dev/modems/mwave MWave modem firmware upload + 220 = /dev/mptctl Message passing technology (MPT) control + 221 = /dev/mvista/hssdsi Montavista PICMG hot swap system driver + 222 = /dev/mvista/hasi Montavista PICMG high availability + 223 = /dev/input/uinput User level driver support for input + 224 = /dev/tpm TCPA TPM driver + 225 = /dev/pps Pulse Per Second driver + 226 = /dev/systrace Systrace device + 227 = /dev/mcelog X86_64 Machine Check Exception driver + 228 = /dev/hpet HPET driver + 229 = /dev/fuse Fuse (virtual filesystem in user-space) + 230 = /dev/midishare MidiShare driver + 240-254 Reserved for local use + 255 Reserved for MISC_DYNAMIC_MINOR + + 11 char Raw keyboard device (Linux/SPARC only) + 0 = /dev/kbd Raw keyboard device + + 11 char Serial Mux device (Linux/PA-RISC only) + 0 = /dev/ttyB0 First mux port + 1 = /dev/ttyB1 Second mux port + ... + + 11 block SCSI CD-ROM devices + 0 = /dev/scd0 First SCSI CD-ROM + 1 = /dev/scd1 Second SCSI CD-ROM + ... + + The prefix /dev/sr (instead of /dev/scd) has been deprecated. + + 12 char QIC-02 tape + 2 = /dev/ntpqic11 QIC-11, no rewind-on-close + 3 = /dev/tpqic11 QIC-11, rewind-on-close + 4 = /dev/ntpqic24 QIC-24, no rewind-on-close + 5 = /dev/tpqic24 QIC-24, rewind-on-close + 6 = /dev/ntpqic120 QIC-120, no rewind-on-close + 7 = /dev/tpqic120 QIC-120, rewind-on-close + 8 = /dev/ntpqic150 QIC-150, no rewind-on-close + 9 = /dev/tpqic150 QIC-150, rewind-on-close + + The device names specified are proposed -- if there + are "standard" names for these devices, please let me know. + + 12 block MSCDEX CD-ROM callback support {2.6} + 0 = /dev/dos_cd0 First MSCDEX CD-ROM + 1 = /dev/dos_cd1 Second MSCDEX CD-ROM + ... + + 13 char Input core + 0 = /dev/input/js0 First joystick + 1 = /dev/input/js1 Second joystick + ... + 32 = /dev/input/mouse0 First mouse + 33 = /dev/input/mouse1 Second mouse + ... + 63 = /dev/input/mice Unified mouse + 64 = /dev/input/event0 First event queue + 65 = /dev/input/event1 Second event queue + ... + + Each device type has 5 bits (32 minors). + + 13 block 8-bit MFM/RLL/IDE controller + 0 = /dev/xda First XT disk whole disk + 64 = /dev/xdb Second XT disk whole disk + + Partitions are handled in the same way as IDE disks + (see major number 3). + + 14 char Open Sound System (OSS) + 0 = /dev/mixer Mixer control + 1 = /dev/sequencer Audio sequencer + 2 = /dev/midi00 First MIDI port + 3 = /dev/dsp Digital audio + 4 = /dev/audio Sun-compatible digital audio + 6 = /dev/sndstat Sound card status information {2.6} + 7 = /dev/audioctl SPARC audio control device + 8 = /dev/sequencer2 Sequencer -- alternate device + 16 = /dev/mixer1 Second soundcard mixer control + 17 = /dev/patmgr0 Sequencer patch manager + 18 = /dev/midi01 Second MIDI port + 19 = /dev/dsp1 Second soundcard digital audio + 20 = /dev/audio1 Second soundcard Sun digital audio + 33 = /dev/patmgr1 Sequencer patch manager + 34 = /dev/midi02 Third MIDI port + 50 = /dev/midi03 Fourth MIDI port + 14 block BIOS harddrive callback support {2.6} + 0 = /dev/dos_hda First BIOS harddrive whole disk + 64 = /dev/dos_hdb Second BIOS harddrive whole disk + 128 = /dev/dos_hdc Third BIOS harddrive whole disk + 192 = /dev/dos_hdd Fourth BIOS harddrive whole disk + + Partitions are handled in the same way as IDE disks + (see major number 3). + + 15 char Joystick + 0 = /dev/js0 First analog joystick + 1 = /dev/js1 Second analog joystick + ... + 128 = /dev/djs0 First digital joystick + 129 = /dev/djs1 Second digital joystick + ... + 15 block Sony CDU-31A/CDU-33A CD-ROM + 0 = /dev/sonycd Sony CDU-31a CD-ROM + + 16 char Non-SCSI scanners + 0 = /dev/gs4500 Genius 4500 handheld scanner + 16 block GoldStar CD-ROM + 0 = /dev/gscd GoldStar CD-ROM + + 17 char Chase serial card + 0 = /dev/ttyH0 First Chase port + 1 = /dev/ttyH1 Second Chase port + ... + 17 block Optics Storage CD-ROM + 0 = /dev/optcd Optics Storage CD-ROM + + 18 char Chase serial card - alternate devices + 0 = /dev/cuh0 Callout device for ttyH0 + 1 = /dev/cuh1 Callout device for ttyH1 + ... + 18 block Sanyo CD-ROM + 0 = /dev/sjcd Sanyo CD-ROM + + 19 char Cyclades serial card + 0 = /dev/ttyC0 First Cyclades port + ... + 31 = /dev/ttyC31 32nd Cyclades port + 19 block "Double" compressed disk + 0 = /dev/double0 First compressed disk + ... + 7 = /dev/double7 Eighth compressed disk + 128 = /dev/cdouble0 Mirror of first compressed disk + ... + 135 = /dev/cdouble7 Mirror of eighth compressed disk + + See the Double documentation for the meaning of the + mirror devices. + + 20 char Cyclades serial card - alternate devices + 0 = /dev/cub0 Callout device for ttyC0 + ... + 31 = /dev/cub31 Callout device for ttyC31 + 20 block Hitachi CD-ROM (under development) + 0 = /dev/hitcd Hitachi CD-ROM + + 21 char Generic SCSI access + 0 = /dev/sg0 First generic SCSI device + 1 = /dev/sg1 Second generic SCSI device + ... + + Most distributions name these /dev/sga, /dev/sgb...; + this sets an unnecessary limit of 26 SCSI devices in + the system and is counter to standard Linux + device-naming practice. + + 21 block Acorn MFM hard drive interface + 0 = /dev/mfma First MFM drive whole disk + 64 = /dev/mfmb Second MFM drive whole disk + + This device is used on the ARM-based Acorn RiscPC. + Partitions are handled the same way as for IDE disks + (see major number 3). + + 22 char Digiboard serial card + 0 = /dev/ttyD0 First Digiboard port + 1 = /dev/ttyD1 Second Digiboard port + ... + 22 block Second IDE hard disk/CD-ROM interface + 0 = /dev/hdc Master: whole disk (or CD-ROM) + 64 = /dev/hdd Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 23 char Digiboard serial card - alternate devices + 0 = /dev/cud0 Callout device for ttyD0 + 1 = /dev/cud1 Callout device for ttyD1 + ... + 23 block Mitsumi proprietary CD-ROM + 0 = /dev/mcd Mitsumi CD-ROM + + 24 char Stallion serial card + 0 = /dev/ttyE0 Stallion port 0 card 0 + 1 = /dev/ttyE1 Stallion port 1 card 0 + ... + 64 = /dev/ttyE64 Stallion port 0 card 1 + 65 = /dev/ttyE65 Stallion port 1 card 1 + ... + 128 = /dev/ttyE128 Stallion port 0 card 2 + 129 = /dev/ttyE129 Stallion port 1 card 2 + ... + 192 = /dev/ttyE192 Stallion port 0 card 3 + 193 = /dev/ttyE193 Stallion port 1 card 3 + ... + 24 block Sony CDU-535 CD-ROM + 0 = /dev/cdu535 Sony CDU-535 CD-ROM + + 25 char Stallion serial card - alternate devices + 0 = /dev/cue0 Callout device for ttyE0 + 1 = /dev/cue1 Callout device for ttyE1 + ... + 64 = /dev/cue64 Callout device for ttyE64 + 65 = /dev/cue65 Callout device for ttyE65 + ... + 128 = /dev/cue128 Callout device for ttyE128 + 129 = /dev/cue129 Callout device for ttyE129 + ... + 192 = /dev/cue192 Callout device for ttyE192 + 193 = /dev/cue193 Callout device for ttyE193 + ... + 25 block First Matsushita (Panasonic/SoundBlaster) CD-ROM + 0 = /dev/sbpcd0 Panasonic CD-ROM controller 0 unit 0 + 1 = /dev/sbpcd1 Panasonic CD-ROM controller 0 unit 1 + 2 = /dev/sbpcd2 Panasonic CD-ROM controller 0 unit 2 + 3 = /dev/sbpcd3 Panasonic CD-ROM controller 0 unit 3 + + 26 char Quanta WinVision frame grabber {2.6} + 0 = /dev/wvisfgrab Quanta WinVision frame grabber + 26 block Second Matsushita (Panasonic/SoundBlaster) CD-ROM + 0 = /dev/sbpcd4 Panasonic CD-ROM controller 1 unit 0 + 1 = /dev/sbpcd5 Panasonic CD-ROM controller 1 unit 1 + 2 = /dev/sbpcd6 Panasonic CD-ROM controller 1 unit 2 + 3 = /dev/sbpcd7 Panasonic CD-ROM controller 1 unit 3 + + 27 char QIC-117 tape + 0 = /dev/qft0 Unit 0, rewind-on-close + 1 = /dev/qft1 Unit 1, rewind-on-close + 2 = /dev/qft2 Unit 2, rewind-on-close + 3 = /dev/qft3 Unit 3, rewind-on-close + 4 = /dev/nqft0 Unit 0, no rewind-on-close + 5 = /dev/nqft1 Unit 1, no rewind-on-close + 6 = /dev/nqft2 Unit 2, no rewind-on-close + 7 = /dev/nqft3 Unit 3, no rewind-on-close + 16 = /dev/zqft0 Unit 0, rewind-on-close, compression + 17 = /dev/zqft1 Unit 1, rewind-on-close, compression + 18 = /dev/zqft2 Unit 2, rewind-on-close, compression + 19 = /dev/zqft3 Unit 3, rewind-on-close, compression + 20 = /dev/nzqft0 Unit 0, no rewind-on-close, compression + 21 = /dev/nzqft1 Unit 1, no rewind-on-close, compression + 22 = /dev/nzqft2 Unit 2, no rewind-on-close, compression + 23 = /dev/nzqft3 Unit 3, no rewind-on-close, compression + 32 = /dev/rawqft0 Unit 0, rewind-on-close, no file marks + 33 = /dev/rawqft1 Unit 1, rewind-on-close, no file marks + 34 = /dev/rawqft2 Unit 2, rewind-on-close, no file marks + 35 = /dev/rawqft3 Unit 3, rewind-on-close, no file marks + 36 = /dev/nrawqft0 Unit 0, no rewind-on-close, no file marks + 37 = /dev/nrawqft1 Unit 1, no rewind-on-close, no file marks + 38 = /dev/nrawqft2 Unit 2, no rewind-on-close, no file marks + 39 = /dev/nrawqft3 Unit 3, no rewind-on-close, no file marks + 27 block Third Matsushita (Panasonic/SoundBlaster) CD-ROM + 0 = /dev/sbpcd8 Panasonic CD-ROM controller 2 unit 0 + 1 = /dev/sbpcd9 Panasonic CD-ROM controller 2 unit 1 + 2 = /dev/sbpcd10 Panasonic CD-ROM controller 2 unit 2 + 3 = /dev/sbpcd11 Panasonic CD-ROM controller 2 unit 3 + + 28 char Stallion serial card - card programming + 0 = /dev/staliomem0 First Stallion card I/O memory + 1 = /dev/staliomem1 Second Stallion card I/O memory + 2 = /dev/staliomem2 Third Stallion card I/O memory + 3 = /dev/staliomem3 Fourth Stallion card I/O memory + 28 char Atari SLM ACSI laser printer (68k/Atari) + 0 = /dev/slm0 First SLM laser printer + 1 = /dev/slm1 Second SLM laser printer + ... + 28 block Fourth Matsushita (Panasonic/SoundBlaster) CD-ROM + 0 = /dev/sbpcd12 Panasonic CD-ROM controller 3 unit 0 + 1 = /dev/sbpcd13 Panasonic CD-ROM controller 3 unit 1 + 2 = /dev/sbpcd14 Panasonic CD-ROM controller 3 unit 2 + 3 = /dev/sbpcd15 Panasonic CD-ROM controller 3 unit 3 + 28 block ACSI disk (68k/Atari) + 0 = /dev/ada First ACSI disk whole disk + 16 = /dev/adb Second ACSI disk whole disk + 32 = /dev/adc Third ACSI disk whole disk + ... + 240 = /dev/adp 16th ACSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15, like SCSI. + + 29 char Universal frame buffer + 0 = /dev/fb0 First frame buffer + 1 = /dev/fb1 Second frame buffer + ... + 31 = /dev/fb31 32nd frame buffer + + 29 block Aztech/Orchid/Okano/Wearnes CD-ROM + 0 = /dev/aztcd Aztech CD-ROM + + 30 char iBCS-2 compatibility devices + 0 = /dev/socksys Socket access + 1 = /dev/spx SVR3 local X interface + 32 = /dev/inet/ip Network access + 33 = /dev/inet/icmp + 34 = /dev/inet/ggp + 35 = /dev/inet/ipip + 36 = /dev/inet/tcp + 37 = /dev/inet/egp + 38 = /dev/inet/pup + 39 = /dev/inet/udp + 40 = /dev/inet/idp + 41 = /dev/inet/rawip + + Additionally, iBCS-2 requires the following links: + + /dev/ip -> /dev/inet/ip + /dev/icmp -> /dev/inet/icmp + /dev/ggp -> /dev/inet/ggp + /dev/ipip -> /dev/inet/ipip + /dev/tcp -> /dev/inet/tcp + /dev/egp -> /dev/inet/egp + /dev/pup -> /dev/inet/pup + /dev/udp -> /dev/inet/udp + /dev/idp -> /dev/inet/idp + /dev/rawip -> /dev/inet/rawip + /dev/inet/arp -> /dev/inet/udp + /dev/inet/rip -> /dev/inet/udp + /dev/nfsd -> /dev/socksys + /dev/X0R -> /dev/null (? apparently not required ?) + + 30 block Philips LMS CM-205 CD-ROM + 0 = /dev/cm205cd Philips LMS CM-205 CD-ROM + + /dev/lmscd is an older name for this device. This + driver does not work with the CM-205MS CD-ROM. + + 31 char MPU-401 MIDI + 0 = /dev/mpu401data MPU-401 data port + 1 = /dev/mpu401stat MPU-401 status port + 31 block ROM/flash memory card + 0 = /dev/rom0 First ROM card (rw) + ... + 7 = /dev/rom7 Eighth ROM card (rw) + 8 = /dev/rrom0 First ROM card (ro) + ... + 15 = /dev/rrom7 Eighth ROM card (ro) + 16 = /dev/flash0 First flash memory card (rw) + ... + 23 = /dev/flash7 Eighth flash memory card (rw) + 24 = /dev/rflash0 First flash memory card (ro) + ... + 31 = /dev/rflash7 Eighth flash memory card (ro) + + The read-write (rw) devices support back-caching + written data in RAM, as well as writing to flash RAM + devices. The read-only devices (ro) support reading + only. + + 32 char Specialix serial card + 0 = /dev/ttyX0 First Specialix port + 1 = /dev/ttyX1 Second Specialix port + ... + 32 block Philips LMS CM-206 CD-ROM + 0 = /dev/cm206cd Philips LMS CM-206 CD-ROM + + 33 char Specialix serial card - alternate devices + 0 = /dev/cux0 Callout device for ttyX0 + 1 = /dev/cux1 Callout device for ttyX1 + ... + 33 block Third IDE hard disk/CD-ROM interface + 0 = /dev/hde Master: whole disk (or CD-ROM) + 64 = /dev/hdf Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 34 char Z8530 HDLC driver + 0 = /dev/scc0 First Z8530, first port + 1 = /dev/scc1 First Z8530, second port + 2 = /dev/scc2 Second Z8530, first port + 3 = /dev/scc3 Second Z8530, second port + ... + + In a previous version these devices were named + /dev/sc1 for /dev/scc0, /dev/sc2 for /dev/scc1, and so + on. + + 34 block Fourth IDE hard disk/CD-ROM interface + 0 = /dev/hdg Master: whole disk (or CD-ROM) + 64 = /dev/hdh Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 35 char tclmidi MIDI driver + 0 = /dev/midi0 First MIDI port, kernel timed + 1 = /dev/midi1 Second MIDI port, kernel timed + 2 = /dev/midi2 Third MIDI port, kernel timed + 3 = /dev/midi3 Fourth MIDI port, kernel timed + 64 = /dev/rmidi0 First MIDI port, untimed + 65 = /dev/rmidi1 Second MIDI port, untimed + 66 = /dev/rmidi2 Third MIDI port, untimed + 67 = /dev/rmidi3 Fourth MIDI port, untimed + 128 = /dev/smpte0 First MIDI port, SMPTE timed + 129 = /dev/smpte1 Second MIDI port, SMPTE timed + 130 = /dev/smpte2 Third MIDI port, SMPTE timed + 131 = /dev/smpte3 Fourth MIDI port, SMPTE timed + 35 block Slow memory ramdisk + 0 = /dev/slram Slow memory ramdisk + + 36 char Netlink support + 0 = /dev/route Routing, device updates, kernel to user + 1 = /dev/skip enSKIP security cache control + 3 = /dev/fwmonitor Firewall packet copies + 16 = /dev/tap0 First Ethertap device + ... + 31 = /dev/tap15 16th Ethertap device + 36 block MCA ESDI hard disk + 0 = /dev/eda First ESDI disk whole disk + 64 = /dev/edb Second ESDI disk whole disk + ... + + Partitions are handled in the same way as IDE disks + (see major number 3). + + 37 char IDE tape + 0 = /dev/ht0 First IDE tape + 1 = /dev/ht1 Second IDE tape + ... + 128 = /dev/nht0 First IDE tape, no rewind-on-close + 129 = /dev/nht1 Second IDE tape, no rewind-on-close + ... + + Currently, only one IDE tape drive is supported. + + 37 block Zorro II ramdisk + 0 = /dev/z2ram Zorro II ramdisk + + 38 char Myricom PCI Myrinet board + 0 = /dev/mlanai0 First Myrinet board + 1 = /dev/mlanai1 Second Myrinet board + ... + + This device is used for status query, board control + and "user level packet I/O." This board is also + accessible as a standard networking "eth" device. + + 38 block Reserved for Linux/AP+ + + 39 char ML-16P experimental I/O board + 0 = /dev/ml16pa-a0 First card, first analog channel + 1 = /dev/ml16pa-a1 First card, second analog channel + ... + 15 = /dev/ml16pa-a15 First card, 16th analog channel + 16 = /dev/ml16pa-d First card, digital lines + 17 = /dev/ml16pa-c0 First card, first counter/timer + 18 = /dev/ml16pa-c1 First card, second counter/timer + 19 = /dev/ml16pa-c2 First card, third counter/timer + 32 = /dev/ml16pb-a0 Second card, first analog channel + 33 = /dev/ml16pb-a1 Second card, second analog channel + ... + 47 = /dev/ml16pb-a15 Second card, 16th analog channel + 48 = /dev/ml16pb-d Second card, digital lines + 49 = /dev/ml16pb-c0 Second card, first counter/timer + 50 = /dev/ml16pb-c1 Second card, second counter/timer + 51 = /dev/ml16pb-c2 Second card, third counter/timer + ... + 39 block Reserved for Linux/AP+ + + 40 char Matrox Meteor frame grabber {2.6} + 0 = /dev/mmetfgrab Matrox Meteor frame grabber + 40 block Syquest EZ135 parallel port removable drive + 0 = /dev/eza Parallel EZ135 drive, whole disk + + This device is obsolete and will be removed in a + future version of Linux. It has been replaced with + the parallel port IDE disk driver at major number 45. + Partitions are handled in the same way as IDE disks + (see major number 3). + + 41 char Yet Another Micro Monitor + 0 = /dev/yamm Yet Another Micro Monitor + 41 block MicroSolutions BackPack parallel port CD-ROM + 0 = /dev/bpcd BackPack CD-ROM + + This device is obsolete and will be removed in a + future version of Linux. It has been replaced with + the parallel port ATAPI CD-ROM driver at major number 46. + + 42 char Demo/sample use + 42 block Demo/sample use + + This number is intended for use in sample code, as + well as a general "example" device number. It + should never be used for a device driver that is being + distributed; either obtain an official number or use + the local/experimental range. The sudden addition or + removal of a driver with this number should not cause + ill effects to the system (bugs excepted.) + + IN PARTICULAR, ANY DISTRIBUTION WHICH CONTAINS A + DEVICE DRIVER USING MAJOR NUMBER 42 IS NONCOMPLIANT. + + 43 char isdn4linux virtual modem + 0 = /dev/ttyI0 First virtual modem + ... + 63 = /dev/ttyI63 64th virtual modem + 43 block Network block devices + 0 = /dev/nb0 First network block device + 1 = /dev/nb1 Second network block device + ... + + Network Block Device is somehow similar to loopback + devices: If you read from it, it sends packet across + network asking server for data. If you write to it, it + sends packet telling server to write. It could be used + to mounting filesystems over the net, swapping over + the net, implementing block device in userland etc. + + 44 char isdn4linux virtual modem - alternate devices + 0 = /dev/cui0 Callout device for ttyI0 + ... + 63 = /dev/cui63 Callout device for ttyI63 + 44 block Flash Translation Layer (FTL) filesystems + 0 = /dev/ftla FTL on first Memory Technology Device + 16 = /dev/ftlb FTL on second Memory Technology Device + 32 = /dev/ftlc FTL on third Memory Technology Device + ... + 240 = /dev/ftlp FTL on 16th Memory Technology Device + + Partitions are handled in the same way as for IDE + disks (see major number 3) expect that the partition + limit is 15 rather than 63 per disk (same as SCSI.) + + 45 char isdn4linux ISDN BRI driver + 0 = /dev/isdn0 First virtual B channel raw data + ... + 63 = /dev/isdn63 64th virtual B channel raw data + 64 = /dev/isdnctrl0 First channel control/debug + ... + 127 = /dev/isdnctrl63 64th channel control/debug + + 128 = /dev/ippp0 First SyncPPP device + ... + 191 = /dev/ippp63 64th SyncPPP device + + 255 = /dev/isdninfo ISDN monitor interface + 45 block Parallel port IDE disk devices + 0 = /dev/pda First parallel port IDE disk + 16 = /dev/pdb Second parallel port IDE disk + 32 = /dev/pdc Third parallel port IDE disk + 48 = /dev/pdd Fourth parallel port IDE disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the partition + limit is 15 rather than 63 per disk. + + 46 char Comtrol Rocketport serial card + 0 = /dev/ttyR0 First Rocketport port + 1 = /dev/ttyR1 Second Rocketport port + ... + 46 block Parallel port ATAPI CD-ROM devices + 0 = /dev/pcd0 First parallel port ATAPI CD-ROM + 1 = /dev/pcd1 Second parallel port ATAPI CD-ROM + 2 = /dev/pcd2 Third parallel port ATAPI CD-ROM + 3 = /dev/pcd3 Fourth parallel port ATAPI CD-ROM + + 47 char Comtrol Rocketport serial card - alternate devices + 0 = /dev/cur0 Callout device for ttyR0 + 1 = /dev/cur1 Callout device for ttyR1 + ... + 47 block Parallel port ATAPI disk devices + 0 = /dev/pf0 First parallel port ATAPI disk + 1 = /dev/pf1 Second parallel port ATAPI disk + 2 = /dev/pf2 Third parallel port ATAPI disk + 3 = /dev/pf3 Fourth parallel port ATAPI disk + + This driver is intended for floppy disks and similar + devices and hence does not support partitioning. + + 48 char SDL RISCom serial card + 0 = /dev/ttyL0 First RISCom port + 1 = /dev/ttyL1 Second RISCom port + ... + 48 block Mylex DAC960 PCI RAID controller; first controller + 0 = /dev/rd/c0d0 First disk, whole disk + 8 = /dev/rd/c0d1 Second disk, whole disk + ... + 248 = /dev/rd/c0d31 32nd disk, whole disk + + For partitions add: + 0 = /dev/rd/c?d? Whole disk + 1 = /dev/rd/c?d?p1 First partition + ... + 7 = /dev/rd/c?d?p7 Seventh partition + + 49 char SDL RISCom serial card - alternate devices + 0 = /dev/cul0 Callout device for ttyL0 + 1 = /dev/cul1 Callout device for ttyL1 + ... + 49 block Mylex DAC960 PCI RAID controller; second controller + 0 = /dev/rd/c1d0 First disk, whole disk + 8 = /dev/rd/c1d1 Second disk, whole disk + ... + 248 = /dev/rd/c1d31 32nd disk, whole disk + + Partitions are handled as for major 48. + + 50 char Reserved for GLINT + + 50 block Mylex DAC960 PCI RAID controller; third controller + 0 = /dev/rd/c2d0 First disk, whole disk + 8 = /dev/rd/c2d1 Second disk, whole disk + ... + 248 = /dev/rd/c2d31 32nd disk, whole disk + + 51 char Baycom radio modem OR Radio Tech BIM-XXX-RS232 radio modem + 0 = /dev/bc0 First Baycom radio modem + 1 = /dev/bc1 Second Baycom radio modem + ... + 51 block Mylex DAC960 PCI RAID controller; fourth controller + 0 = /dev/rd/c3d0 First disk, whole disk + 8 = /dev/rd/c3d1 Second disk, whole disk + ... + 248 = /dev/rd/c3d31 32nd disk, whole disk + + Partitions are handled as for major 48. + + 52 char Spellcaster DataComm/BRI ISDN card + 0 = /dev/dcbri0 First DataComm card + 1 = /dev/dcbri1 Second DataComm card + 2 = /dev/dcbri2 Third DataComm card + 3 = /dev/dcbri3 Fourth DataComm card + 52 block Mylex DAC960 PCI RAID controller; fifth controller + 0 = /dev/rd/c4d0 First disk, whole disk + 8 = /dev/rd/c4d1 Second disk, whole disk + ... + 248 = /dev/rd/c4d31 32nd disk, whole disk + + Partitions are handled as for major 48. + + 53 char BDM interface for remote debugging MC683xx microcontrollers + 0 = /dev/pd_bdm0 PD BDM interface on lp0 + 1 = /dev/pd_bdm1 PD BDM interface on lp1 + 2 = /dev/pd_bdm2 PD BDM interface on lp2 + 4 = /dev/icd_bdm0 ICD BDM interface on lp0 + 5 = /dev/icd_bdm1 ICD BDM interface on lp1 + 6 = /dev/icd_bdm2 ICD BDM interface on lp2 + + This device is used for the interfacing to the MC683xx + microcontrollers via Background Debug Mode by use of a + Parallel Port interface. PD is the Motorola Public + Domain Interface and ICD is the commercial interface + by P&E. + + 53 block Mylex DAC960 PCI RAID controller; sixth controller + 0 = /dev/rd/c5d0 First disk, whole disk + 8 = /dev/rd/c5d1 Second disk, whole disk + ... + 248 = /dev/rd/c5d31 32nd disk, whole disk + + Partitions are handled as for major 48. + + 54 char Electrocardiognosis Holter serial card + 0 = /dev/holter0 First Holter port + 1 = /dev/holter1 Second Holter port + 2 = /dev/holter2 Third Holter port + + A custom serial card used by Electrocardiognosis SRL + <mseritan@ottonel.pub.ro> to transfer data from Holter + 24-hour heart monitoring equipment. + + 54 block Mylex DAC960 PCI RAID controller; seventh controller + 0 = /dev/rd/c6d0 First disk, whole disk + 8 = /dev/rd/c6d1 Second disk, whole disk + ... + 248 = /dev/rd/c6d31 32nd disk, whole disk + + Partitions are handled as for major 48. + + 55 char DSP56001 digital signal processor + 0 = /dev/dsp56k First DSP56001 + 55 block Mylex DAC960 PCI RAID controller; eigth controller + 0 = /dev/rd/c7d0 First disk, whole disk + 8 = /dev/rd/c7d1 Second disk, whole disk + ... + 248 = /dev/rd/c7d31 32nd disk, whole disk + + Partitions are handled as for major 48. + + 56 char Apple Desktop Bus + 0 = /dev/adb ADB bus control + + Additional devices will be added to this number, all + starting with /dev/adb. + + 56 block Fifth IDE hard disk/CD-ROM interface + 0 = /dev/hdi Master: whole disk (or CD-ROM) + 64 = /dev/hdj Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 57 char Hayes ESP serial card + 0 = /dev/ttyP0 First ESP port + 1 = /dev/ttyP1 Second ESP port + ... + + 57 block Sixth IDE hard disk/CD-ROM interface + 0 = /dev/hdk Master: whole disk (or CD-ROM) + 64 = /dev/hdl Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 58 char Hayes ESP serial card - alternate devices + 0 = /dev/cup0 Callout device for ttyP0 + 1 = /dev/cup1 Callout device for ttyP1 + ... + 58 block Reserved for logical volume manager + + 59 char sf firewall package + 0 = /dev/firewall Communication with sf kernel module + + 59 block Generic PDA filesystem device + 0 = /dev/pda0 First PDA device + 1 = /dev/pda1 Second PDA device + ... + + The pda devices are used to mount filesystems on + remote pda's (basically slow handheld machines with + proprietary OS's and limited memory and storage + running small fs translation drivers) through serial / + IRDA / parallel links. + + NAMING CONFLICT -- PROPOSED REVISED NAME /dev/rpda0 etc + + 60-63 char LOCAL/EXPERIMENTAL USE + 60-63 block LOCAL/EXPERIMENTAL USE + Allocated for local/experimental use. For devices not + assigned official numbers, these ranges should be + used in order to avoid conflicting with future assignments. + + 64 char ENskip kernel encryption package + 0 = /dev/enskip Communication with ENskip kernel module + + 64 block Scramdisk/DriveCrypt encrypted devices + 0 = /dev/scramdisk/master Master node for ioctls + 1 = /dev/scramdisk/1 First encrypted device + 2 = /dev/scramdisk/2 Second encrypted device + ... + 255 = /dev/scramdisk/255 255th encrypted device + + The filename of the encrypted container and the passwords + are sent via ioctls (using the sdmount tool) to the master + node which then activates them via one of the + /dev/scramdisk/x nodes for loopback mounting (all handled + through the sdmount tool). + + Requested by: andy@scramdisklinux.org + + 65 char Sundance "plink" Transputer boards (obsolete, unused) + 0 = /dev/plink0 First plink device + 1 = /dev/plink1 Second plink device + 2 = /dev/plink2 Third plink device + 3 = /dev/plink3 Fourth plink device + 64 = /dev/rplink0 First plink device, raw + 65 = /dev/rplink1 Second plink device, raw + 66 = /dev/rplink2 Third plink device, raw + 67 = /dev/rplink3 Fourth plink device, raw + 128 = /dev/plink0d First plink device, debug + 129 = /dev/plink1d Second plink device, debug + 130 = /dev/plink2d Third plink device, debug + 131 = /dev/plink3d Fourth plink device, debug + 192 = /dev/rplink0d First plink device, raw, debug + 193 = /dev/rplink1d Second plink device, raw, debug + 194 = /dev/rplink2d Third plink device, raw, debug + 195 = /dev/rplink3d Fourth plink device, raw, debug + + This is a commercial driver; contact James Howes + <jth@prosig.demon.co.uk> for information. + + 65 block SCSI disk devices (16-31) + 0 = /dev/sdq 17th SCSI disk whole disk + 16 = /dev/sdr 18th SCSI disk whole disk + 32 = /dev/sds 19th SCSI disk whole disk + ... + 240 = /dev/sdaf 32nd SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 66 char YARC PowerPC PCI coprocessor card + 0 = /dev/yppcpci0 First YARC card + 1 = /dev/yppcpci1 Second YARC card + ... + + 66 block SCSI disk devices (32-47) + 0 = /dev/sdag 33th SCSI disk whole disk + 16 = /dev/sdah 34th SCSI disk whole disk + 32 = /dev/sdai 35th SCSI disk whole disk + ... + 240 = /dev/sdav 48nd SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 67 char Coda network file system + 0 = /dev/cfs0 Coda cache manager + + See http://www.coda.cs.cmu.edu for information about Coda. + + 67 block SCSI disk devices (48-63) + 0 = /dev/sdaw 49th SCSI disk whole disk + 16 = /dev/sdax 50th SCSI disk whole disk + 32 = /dev/sday 51st SCSI disk whole disk + ... + 240 = /dev/sdbl 64th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 68 char CAPI 2.0 interface + 0 = /dev/capi20 Control device + 1 = /dev/capi20.00 First CAPI 2.0 application + 2 = /dev/capi20.01 Second CAPI 2.0 application + ... + 20 = /dev/capi20.19 19th CAPI 2.0 application + + ISDN CAPI 2.0 driver for use with CAPI 2.0 + applications; currently supports the AVM B1 card. + + 68 block SCSI disk devices (64-79) + 0 = /dev/sdbm 65th SCSI disk whole disk + 16 = /dev/sdbn 66th SCSI disk whole disk + 32 = /dev/sdbo 67th SCSI disk whole disk + ... + 240 = /dev/sdcb 80th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 69 char MA16 numeric accelerator card + 0 = /dev/ma16 Board memory access + + 69 block SCSI disk devices (80-95) + 0 = /dev/sdcc 81st SCSI disk whole disk + 16 = /dev/sdcd 82nd SCSI disk whole disk + 32 = /dev/sdce 83th SCSI disk whole disk + ... + 240 = /dev/sdcr 96th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 70 char SpellCaster Protocol Services Interface + 0 = /dev/apscfg Configuration interface + 1 = /dev/apsauth Authentication interface + 2 = /dev/apslog Logging interface + 3 = /dev/apsdbg Debugging interface + 64 = /dev/apsisdn ISDN command interface + 65 = /dev/apsasync Async command interface + 128 = /dev/apsmon Monitor interface + + 70 block SCSI disk devices (96-111) + 0 = /dev/sdcs 97th SCSI disk whole disk + 16 = /dev/sdct 98th SCSI disk whole disk + 32 = /dev/sdcu 99th SCSI disk whole disk + ... + 240 = /dev/sddh 112nd SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 71 char Computone IntelliPort II serial card + 0 = /dev/ttyF0 IntelliPort II board 0, port 0 + 1 = /dev/ttyF1 IntelliPort II board 0, port 1 + ... + 63 = /dev/ttyF63 IntelliPort II board 0, port 63 + 64 = /dev/ttyF64 IntelliPort II board 1, port 0 + 65 = /dev/ttyF65 IntelliPort II board 1, port 1 + ... + 127 = /dev/ttyF127 IntelliPort II board 1, port 63 + 128 = /dev/ttyF128 IntelliPort II board 2, port 0 + 129 = /dev/ttyF129 IntelliPort II board 2, port 1 + ... + 191 = /dev/ttyF191 IntelliPort II board 2, port 63 + 192 = /dev/ttyF192 IntelliPort II board 3, port 0 + 193 = /dev/ttyF193 IntelliPort II board 3, port 1 + ... + 255 = /dev/ttyF255 IntelliPort II board 3, port 63 + + 71 block SCSI disk devices (112-127) + 0 = /dev/sddi 113th SCSI disk whole disk + 16 = /dev/sddj 114th SCSI disk whole disk + 32 = /dev/sddk 115th SCSI disk whole disk + ... + 240 = /dev/sddx 128th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 72 char Computone IntelliPort II serial card - alternate devices + 0 = /dev/cuf0 Callout device for ttyF0 + 1 = /dev/cuf1 Callout device for ttyF1 + ... + 63 = /dev/cuf63 Callout device for ttyF63 + 64 = /dev/cuf64 Callout device for ttyF64 + 65 = /dev/cuf65 Callout device for ttyF65 + ... + 127 = /dev/cuf127 Callout device for ttyF127 + 128 = /dev/cuf128 Callout device for ttyF128 + 129 = /dev/cuf129 Callout device for ttyF129 + ... + 191 = /dev/cuf191 Callout device for ttyF191 + 192 = /dev/cuf192 Callout device for ttyF192 + 193 = /dev/cuf193 Callout device for ttyF193 + ... + 255 = /dev/cuf255 Callout device for ttyF255 + + 72 block Compaq Intelligent Drive Array, first controller + 0 = /dev/ida/c0d0 First logical drive whole disk + 16 = /dev/ida/c0d1 Second logical drive whole disk + ... + 240 = /dev/ida/c0d15 16th logical drive whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + + 73 char Computone IntelliPort II serial card - control devices + 0 = /dev/ip2ipl0 Loadware device for board 0 + 1 = /dev/ip2stat0 Status device for board 0 + 4 = /dev/ip2ipl1 Loadware device for board 1 + 5 = /dev/ip2stat1 Status device for board 1 + 8 = /dev/ip2ipl2 Loadware device for board 2 + 9 = /dev/ip2stat2 Status device for board 2 + 12 = /dev/ip2ipl3 Loadware device for board 3 + 13 = /dev/ip2stat3 Status device for board 3 + + 73 block Compaq Intelligent Drive Array, second controller + 0 = /dev/ida/c1d0 First logical drive whole disk + 16 = /dev/ida/c1d1 Second logical drive whole disk + ... + 240 = /dev/ida/c1d15 16th logical drive whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + + 74 char SCI bridge + 0 = /dev/SCI/0 SCI device 0 + 1 = /dev/SCI/1 SCI device 1 + ... + + Currently for Dolphin Interconnect Solutions' PCI-SCI + bridge. + + 74 block Compaq Intelligent Drive Array, third controller + 0 = /dev/ida/c2d0 First logical drive whole disk + 16 = /dev/ida/c2d1 Second logical drive whole disk + ... + 240 = /dev/ida/c2d15 16th logical drive whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + + 75 char Specialix IO8+ serial card + 0 = /dev/ttyW0 First IO8+ port, first card + 1 = /dev/ttyW1 Second IO8+ port, first card + ... + 8 = /dev/ttyW8 First IO8+ port, second card + ... + + 75 block Compaq Intelligent Drive Array, fourth controller + 0 = /dev/ida/c3d0 First logical drive whole disk + 16 = /dev/ida/c3d1 Second logical drive whole disk + ... + 240 = /dev/ida/c3d15 16th logical drive whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + + 76 char Specialix IO8+ serial card - alternate devices + 0 = /dev/cuw0 Callout device for ttyW0 + 1 = /dev/cuw1 Callout device for ttyW1 + ... + 8 = /dev/cuw8 Callout device for ttyW8 + ... + + 76 block Compaq Intelligent Drive Array, fifth controller + 0 = /dev/ida/c4d0 First logical drive whole disk + 16 = /dev/ida/c4d1 Second logical drive whole disk + ... + 240 = /dev/ida/c4d15 16th logical drive whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + + + 77 char ComScire Quantum Noise Generator + 0 = /dev/qng ComScire Quantum Noise Generator + + 77 block Compaq Intelligent Drive Array, sixth controller + 0 = /dev/ida/c5d0 First logical drive whole disk + 16 = /dev/ida/c5d1 Second logical drive whole disk + ... + 240 = /dev/ida/c5d15 16th logical drive whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + + + 78 char PAM Software's multimodem boards + 0 = /dev/ttyM0 First PAM modem + 1 = /dev/ttyM1 Second PAM modem + ... + + 78 block Compaq Intelligent Drive Array, seventh controller + 0 = /dev/ida/c6d0 First logical drive whole disk + 16 = /dev/ida/c6d1 Second logical drive whole disk + ... + 240 = /dev/ida/c6d15 16th logical drive whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + + + 79 char PAM Software's multimodem boards - alternate devices + 0 = /dev/cum0 Callout device for ttyM0 + 1 = /dev/cum1 Callout device for ttyM1 + ... + + 79 block Compaq Intelligent Drive Array, eigth controller + 0 = /dev/ida/c7d0 First logical drive whole disk + 16 = /dev/ida/c7d1 Second logical drive whole disk + ... + 240 = /dev/ida/c715 16th logical drive whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + + + 80 char Photometrics AT200 CCD camera + 0 = /dev/at200 Photometrics AT200 CCD camera + + 80 block I2O hard disk + 0 = /dev/i2o/hda First I2O hard disk, whole disk + 16 = /dev/i2o/hdb Second I2O hard disk, whole disk + ... + 240 = /dev/i2o/hdp 16th I2O hard disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 81 char video4linux + 0 = /dev/video0 Video capture/overlay device + ... + 63 = /dev/video63 Video capture/overlay device + 64 = /dev/radio0 Radio device + ... + 127 = /dev/radio63 Radio device + 192 = /dev/vtx0 Teletext device + ... + 223 = /dev/vtx31 Teletext device + 224 = /dev/vbi0 Vertical blank interrupt + ... + 255 = /dev/vbi31 Vertical blank interrupt + + 81 block I2O hard disk + 0 = /dev/i2o/hdq 17th I2O hard disk, whole disk + 16 = /dev/i2o/hdr 18th I2O hard disk, whole disk + ... + 240 = /dev/i2o/hdaf 32nd I2O hard disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 82 char WiNRADiO communications receiver card + 0 = /dev/winradio0 First WiNRADiO card + 1 = /dev/winradio1 Second WiNRADiO card + ... + + The driver and documentation may be obtained from + http://www.proximity.com.au/~brian/winradio/ + + 82 block I2O hard disk + 0 = /dev/i2o/hdag 33rd I2O hard disk, whole disk + 16 = /dev/i2o/hdah 34th I2O hard disk, whole disk + ... + 240 = /dev/i2o/hdav 48th I2O hard disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 83 char Matrox mga_vid video driver + 0 = /dev/mga_vid0 1st video card + 1 = /dev/mga_vid1 2nd video card + 2 = /dev/mga_vid2 3rd video card + ... + 15 = /dev/mga_vid15 16th video card + + 83 block I2O hard disk + 0 = /dev/i2o/hdaw 49th I2O hard disk, whole disk + 16 = /dev/i2o/hdax 50th I2O hard disk, whole disk + ... + 240 = /dev/i2o/hdbl 64th I2O hard disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 84 char Ikon 1011[57] Versatec Greensheet Interface + 0 = /dev/ihcp0 First Greensheet port + 1 = /dev/ihcp1 Second Greensheet port + + 84 block I2O hard disk + 0 = /dev/i2o/hdbm 65th I2O hard disk, whole disk + 16 = /dev/i2o/hdbn 66th I2O hard disk, whole disk + ... + 240 = /dev/i2o/hdcb 80th I2O hard disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 85 char Linux/SGI shared memory input queue + 0 = /dev/shmiq Master shared input queue + 1 = /dev/qcntl0 First device pushed + 2 = /dev/qcntl1 Second device pushed + ... + + 85 block I2O hard disk + 0 = /dev/i2o/hdcc 81st I2O hard disk, whole disk + 16 = /dev/i2o/hdcd 82nd I2O hard disk, whole disk + ... + 240 = /dev/i2o/hdcr 96th I2O hard disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 86 char SCSI media changer + 0 = /dev/sch0 First SCSI media changer + 1 = /dev/sch1 Second SCSI media changer + ... + + 86 block I2O hard disk + 0 = /dev/i2o/hdcs 97th I2O hard disk, whole disk + 16 = /dev/i2o/hdct 98th I2O hard disk, whole disk + ... + 240 = /dev/i2o/hddh 112th I2O hard disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 87 char Sony Control-A1 stereo control bus + 0 = /dev/controla0 First device on chain + 1 = /dev/controla1 Second device on chain + ... + + 87 block I2O hard disk + 0 = /dev/i2o/hddi 113rd I2O hard disk, whole disk + 16 = /dev/i2o/hddj 114th I2O hard disk, whole disk + ... + 240 = /dev/i2o/hddx 128th I2O hard disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 88 char COMX synchronous serial card + 0 = /dev/comx0 COMX channel 0 + 1 = /dev/comx1 COMX channel 1 + ... + + 88 block Seventh IDE hard disk/CD-ROM interface + 0 = /dev/hdm Master: whole disk (or CD-ROM) + 64 = /dev/hdn Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 89 char I2C bus interface + 0 = /dev/i2c-0 First I2C adapter + 1 = /dev/i2c-1 Second I2C adapter + ... + + 89 block Eighth IDE hard disk/CD-ROM interface + 0 = /dev/hdo Master: whole disk (or CD-ROM) + 64 = /dev/hdp Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 90 char Memory Technology Device (RAM, ROM, Flash) + 0 = /dev/mtd0 First MTD (rw) + 1 = /dev/mtdr0 First MTD (ro) + ... + 30 = /dev/mtd15 16th MTD (rw) + 31 = /dev/mtdr15 16th MTD (ro) + + 90 block Ninth IDE hard disk/CD-ROM interface + 0 = /dev/hdq Master: whole disk (or CD-ROM) + 64 = /dev/hdr Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 91 char CAN-Bus devices + 0 = /dev/can0 First CAN-Bus controller + 1 = /dev/can1 Second CAN-Bus controller + ... + + 91 block Tenth IDE hard disk/CD-ROM interface + 0 = /dev/hds Master: whole disk (or CD-ROM) + 64 = /dev/hdt Slave: whole disk (or CD-ROM) + + Partitions are handled the same way as for the first + interface (see major number 3). + + 92 char Reserved for ith Kommunikationstechnik MIC ISDN card + + 92 block PPDD encrypted disk driver + 0 = /dev/ppdd0 First encrypted disk + 1 = /dev/ppdd1 Second encrypted disk + ... + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + 93 char IBM Smart Capture Card frame grabber {2.6} + 0 = /dev/iscc0 First Smart Capture Card + 1 = /dev/iscc1 Second Smart Capture Card + ... + 128 = /dev/isccctl0 First Smart Capture Card control + 129 = /dev/isccctl1 Second Smart Capture Card control + ... + + 93 block NAND Flash Translation Layer filesystem + 0 = /dev/nftla First NFTL layer + 16 = /dev/nftlb Second NFTL layer + ... + 240 = /dev/nftlp 16th NTFL layer + + 94 char miroVIDEO DC10/30 capture/playback device {2.6} + 0 = /dev/dcxx0 First capture card + 1 = /dev/dcxx1 Second capture card + ... + + 94 block IBM S/390 DASD block storage + 0 = /dev/dasda First DASD device, major + 1 = /dev/dasda1 First DASD device, block 1 + 2 = /dev/dasda2 First DASD device, block 2 + 3 = /dev/dasda3 First DASD device, block 3 + 4 = /dev/dasdb Second DASD device, major + 5 = /dev/dasdb1 Second DASD device, block 1 + 6 = /dev/dasdb2 Second DASD device, block 2 + 7 = /dev/dasdb3 Second DASD device, block 3 + ... + + 95 char IP filter + 0 = /dev/ipl Filter control device/log file + 1 = /dev/ipnat NAT control device/log file + 2 = /dev/ipstate State information log file + 3 = /dev/ipauth Authentication control device/log file + ... + + 96 char Parallel port ATAPI tape devices + 0 = /dev/pt0 First parallel port ATAPI tape + 1 = /dev/pt1 Second parallel port ATAPI tape + ... + 128 = /dev/npt0 First p.p. ATAPI tape, no rewind + 129 = /dev/npt1 Second p.p. ATAPI tape, no rewind + ... + + 96 block Inverse NAND Flash Translation Layer + 0 = /dev/inftla First INFTL layer + 16 = /dev/inftlb Second INFTL layer + ... + 240 = /dev/inftlp 16th INTFL layer + + 97 char Parallel port generic ATAPI interface + 0 = /dev/pg0 First parallel port ATAPI device + 1 = /dev/pg1 Second parallel port ATAPI device + 2 = /dev/pg2 Third parallel port ATAPI device + 3 = /dev/pg3 Fourth parallel port ATAPI device + + These devices support the same API as the generic SCSI + devices. + + 97 block Packet writing for CD/DVD devices + 0 = /dev/pktcdvd0 First packet-writing module + 1 = /dev/pktcdvd1 Second packet-writing module + ... + + 98 char Control and Measurement Device (comedi) + 0 = /dev/comedi0 First comedi device + 1 = /dev/comedi1 Second comedi device + ... + + See http://stm.lbl.gov/comedi or http://www.llp.fu-berlin.de/. + + 98 block User-mode virtual block device + 0 = /dev/ubda First user-mode block device + 16 = /dev/udbb Second user-mode block device + ... + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + This device is used by the user-mode virtual kernel port. + + 99 char Raw parallel ports + 0 = /dev/parport0 First parallel port + 1 = /dev/parport1 Second parallel port + ... + + 99 block JavaStation flash disk + 0 = /dev/jsfd JavaStation flash disk + +100 char Telephony for Linux + 0 = /dev/phone0 First telephony device + 1 = /dev/phone1 Second telephony device + ... + +101 char Motorola DSP 56xxx board + 0 = /dev/mdspstat Status information + 1 = /dev/mdsp1 First DSP board I/O controls + ... + 16 = /dev/mdsp16 16th DSP board I/O controls + +101 block AMI HyperDisk RAID controller + 0 = /dev/amiraid/ar0 First array whole disk + 16 = /dev/amiraid/ar1 Second array whole disk + ... + 240 = /dev/amiraid/ar15 16th array whole disk + + For each device, partitions are added as: + 0 = /dev/amiraid/ar? Whole disk + 1 = /dev/amiraid/ar?p1 First partition + 2 = /dev/amiraid/ar?p2 Second partition + ... + 15 = /dev/amiraid/ar?p15 15th partition + +102 char Philips SAA5249 Teletext signal decoder {2.6} + 0 = /dev/tlk0 First Teletext decoder + 1 = /dev/tlk1 Second Teletext decoder + 2 = /dev/tlk2 Third Teletext decoder + 3 = /dev/tlk3 Fourth Teletext decoder + +102 block Compressed block device + 0 = /dev/cbd/a First compressed block device, whole device + 16 = /dev/cbd/b Second compressed block device, whole device + ... + 240 = /dev/cbd/p 16th compressed block device, whole device + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + +103 char Arla network file system + 0 = /dev/nnpfs0 First NNPFS device + 1 = /dev/nnpfs1 Second NNPFS device + + Arla is a free clone of the Andrew File System, AFS. + The NNPFS device gives user mode filesystem + implementations a kernel presence for caching and easy + mounting. For more information about the project, + write to <arla-drinkers@stacken.kth.se> or see + http://www.stacken.kth.se/project/arla/ + +103 block Audit device + 0 = /dev/audit Audit device + +104 char Flash BIOS support + +104 block Compaq Next Generation Drive Array, first controller + 0 = /dev/cciss/c0d0 First logical drive, whole disk + 16 = /dev/cciss/c0d1 Second logical drive, whole disk + ... + 240 = /dev/cciss/c0d15 16th logical drive, whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + +105 char Comtrol VS-1000 serial controller + 0 = /dev/ttyV0 First VS-1000 port + 1 = /dev/ttyV1 Second VS-1000 port + ... + +105 block Compaq Next Generation Drive Array, second controller + 0 = /dev/cciss/c1d0 First logical drive, whole disk + 16 = /dev/cciss/c1d1 Second logical drive, whole disk + ... + 240 = /dev/cciss/c1d15 16th logical drive, whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + +106 char Comtrol VS-1000 serial controller - alternate devices + 0 = /dev/cuv0 First VS-1000 port + 1 = /dev/cuv1 Second VS-1000 port + ... + +106 block Compaq Next Generation Drive Array, third controller + 0 = /dev/cciss/c2d0 First logical drive, whole disk + 16 = /dev/cciss/c2d1 Second logical drive, whole disk + ... + 240 = /dev/cciss/c2d15 16th logical drive, whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + +107 char 3Dfx Voodoo Graphics device + 0 = /dev/3dfx Primary 3Dfx graphics device + +107 block Compaq Next Generation Drive Array, fourth controller + 0 = /dev/cciss/c3d0 First logical drive, whole disk + 16 = /dev/cciss/c3d1 Second logical drive, whole disk + ... + 240 = /dev/cciss/c3d15 16th logical drive, whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + +108 char Device independent PPP interface + 0 = /dev/ppp Device independent PPP interface + +108 block Compaq Next Generation Drive Array, fifth controller + 0 = /dev/cciss/c4d0 First logical drive, whole disk + 16 = /dev/cciss/c4d1 Second logical drive, whole disk + ... + 240 = /dev/cciss/c4d15 16th logical drive, whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + +109 char Reserved for logical volume manager + +109 block Compaq Next Generation Drive Array, sixth controller + 0 = /dev/cciss/c5d0 First logical drive, whole disk + 16 = /dev/cciss/c5d1 Second logical drive, whole disk + ... + 240 = /dev/cciss/c5d15 16th logical drive, whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + +110 char miroMEDIA Surround board + 0 = /dev/srnd0 First miroMEDIA Surround board + 1 = /dev/srnd1 Second miroMEDIA Surround board + ... + +110 block Compaq Next Generation Drive Array, seventh controller + 0 = /dev/cciss/c6d0 First logical drive, whole disk + 16 = /dev/cciss/c6d1 Second logical drive, whole disk + ... + 240 = /dev/cciss/c6d15 16th logical drive, whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + +111 char Philips SAA7146-based audio/video card {2.6} + 0 = /dev/av0 First A/V card + 1 = /dev/av1 Second A/V card + ... + +111 block Compaq Next Generation Drive Array, eigth controller + 0 = /dev/cciss/c7d0 First logical drive, whole disk + 16 = /dev/cciss/c7d1 Second logical drive, whole disk + ... + 240 = /dev/cciss/c7d15 16th logical drive, whole disk + + Partitions are handled the same way as for Mylex + DAC960 (see major number 48) except that the limit on + partitions is 15. + +112 char ISI serial card + 0 = /dev/ttyM0 First ISI port + 1 = /dev/ttyM1 Second ISI port + ... + + There is currently a device-naming conflict between + these and PAM multimodems (major 78). + +112 block IBM iSeries virtual disk + 0 = /dev/iseries/vda First virtual disk, whole disk + 8 = /dev/iseries/vdb Second virtual disk, whole disk + ... + 200 = /dev/iseries/vdz 26th virtual disk, whole disk + 208 = /dev/iseries/vdaa 27th virtual disk, whole disk + ... + 248 = /dev/iseries/vdaf 32nd virtual disk, whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 7. + +113 char ISI serial card - alternate devices + 0 = /dev/cum0 Callout device for ttyM0 + 1 = /dev/cum1 Callout device for ttyM1 + ... + +113 block IBM iSeries virtual CD-ROM + + 0 = /dev/iseries/vcda First virtual CD-ROM + 1 = /dev/iseries/vcdb Second virtual CD-ROM + ... + +114 char Picture Elements ISE board + 0 = /dev/ise0 First ISE board + 1 = /dev/ise1 Second ISE board + ... + 128 = /dev/isex0 Control node for first ISE board + 129 = /dev/isex1 Control node for second ISE board + ... + + The ISE board is an embedded computer, optimized for + image processing. The /dev/iseN nodes are the general + I/O access to the board, the /dev/isex0 nodes command + nodes used to control the board. + +114 block IDE BIOS powered software RAID interfaces such as the + Promise Fastrak + + 0 = /dev/ataraid/d0 + 1 = /dev/ataraid/d0p1 + 2 = /dev/ataraid/d0p2 + ... + 16 = /dev/ataraid/d1 + 17 = /dev/ataraid/d1p1 + 18 = /dev/ataraid/d1p2 + ... + 255 = /dev/ataraid/d15p15 + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + +115 char TI link cable devices (115 was formerly the console driver speaker) + 0 = /dev/tipar0 Parallel cable on first parallel port + ... + 7 = /dev/tipar7 Parallel cable on seventh parallel port + + 8 = /dev/tiser0 Serial cable on first serial port + ... + 15 = /dev/tiser7 Serial cable on seventh serial port + + 16 = /dev/tiusb0 First USB cable + ... + 47 = /dev/tiusb31 32nd USB cable + +115 block NetWare (NWFS) Devices (0-255) + + The NWFS (NetWare) devices are used to present a + collection of NetWare Mirror Groups or NetWare + Partitions as a logical storage segment for + use in mounting NetWare volumes. A maximum of + 256 NetWare volumes can be supported in a single + machine. + + http://www.kernel.org/pub/linux/kernel/people/jmerkey/nwfs + + 0 = /dev/nwfs/v0 First NetWare (NWFS) Logical Volume + 1 = /dev/nwfs/v1 Second NetWare (NWFS) Logical Volume + 2 = /dev/nwfs/v2 Third NetWare (NWFS) Logical Volume + ... + 255 = /dev/nwfs/v255 Last NetWare (NWFS) Logical Volume + +116 char Advanced Linux Sound Driver (ALSA) + +116 block MicroMemory battery backed RAM adapter (NVRAM) + Supports 16 boards, 15 paritions each. + Requested by neilb at cse.unsw.edu.au. + + 0 = /dev/umem/d0 Whole of first board + 1 = /dev/umem/d0p1 First partition of first board + 2 = /dev/umem/d0p2 Second partition of first board + 15 = /dev/umem/d0p15 15th partition of first board + + 16 = /dev/umem/d1 Whole of second board + 17 = /dev/umem/d1p1 First partition of second board + ... + 255= /dev/umem/d15p15 15th partition of 16th board. + +117 char COSA/SRP synchronous serial card + 0 = /dev/cosa0c0 1st board, 1st channel + 1 = /dev/cosa0c1 1st board, 2nd channel + ... + 16 = /dev/cosa1c0 2nd board, 1st channel + 17 = /dev/cosa1c1 2nd board, 2nd channel + ... + +117 block Enterprise Volume Management System (EVMS) + + The EVMS driver uses a layered, plug-in model to provide + unparalleled flexibility and extensibility in managing + storage. This allows for easy expansion or customization + of various levels of volume management. Requested by + Mark Peloquin (peloquin at us.ibm.com). + + Note: EVMS populates and manages all the devnodes in + /dev/evms. + + http://sf.net/projects/evms + + 0 = /dev/evms/block_device EVMS block device + 1 = /dev/evms/legacyname1 First EVMS legacy device + 2 = /dev/evms/legacyname2 Second EVMS legacy device + ... + Both ranges can grow (down or up) until they meet. + ... + 254 = /dev/evms/EVMSname2 Second EVMS native device + 255 = /dev/evms/EVMSname1 First EVMS native device + + Note: legacyname(s) are derived from the normal legacy + device names. For example, /dev/hda5 would become + /dev/evms/hda5. + +118 char IBM Cryptographic Accelerator + 0 = /dev/ica Virtual interface to all IBM Crypto Accelerators + 1 = /dev/ica0 IBMCA Device 0 + 2 = /dev/ica1 IBMCA Device 1 + ... + +119 char VMware virtual network control + 0 = /dev/vnet0 1st virtual network + 1 = /dev/vnet1 2nd virtual network + ... + +120-127 char LOCAL/EXPERIMENTAL USE +120-127 block LOCAL/EXPERIMENTAL USE + Allocated for local/experimental use. For devices not + assigned official numbers, these ranges should be + used in order to avoid conflicting with future assignments. + +128-135 char Unix98 PTY masters + + These devices should not have corresponding device + nodes; instead they should be accessed through the + /dev/ptmx cloning interface. + + +128 block SCSI disk devices (128-143) + 0 = /dev/sddy 129th SCSI disk whole disk + 16 = /dev/sddz 130th SCSI disk whole disk + 32 = /dev/sdea 131th SCSI disk whole disk + ... + 240 = /dev/sden 144th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + +129 block SCSI disk devices (144-159) + 0 = /dev/sdeo 145th SCSI disk whole disk + 16 = /dev/sdep 146th SCSI disk whole disk + 32 = /dev/sdeq 147th SCSI disk whole disk + ... + 240 = /dev/sdfd 160th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + +130 char (Misc devices) + +130 block SCSI disk devices (160-175) + 0 = /dev/sdfe 161st SCSI disk whole disk + 16 = /dev/sdff 162nd SCSI disk whole disk + 32 = /dev/sdfg 163rd SCSI disk whole disk + ... + 240 = /dev/sdft 176th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + +131 block SCSI disk devices (176-191) + 0 = /dev/sdfu 177th SCSI disk whole disk + 16 = /dev/sdfv 178th SCSI disk whole disk + 32 = /dev/sdfw 179th SCSI disk whole disk + ... + 240 = /dev/sdgj 192nd SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + +132 block SCSI disk devices (192-207) + 0 = /dev/sdgk 193rd SCSI disk whole disk + 16 = /dev/sdgl 194th SCSI disk whole disk + 32 = /dev/sdgm 195th SCSI disk whole disk + ... + 240 = /dev/sdgz 208th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + +133 block SCSI disk devices (208-223) + 0 = /dev/sdha 209th SCSI disk whole disk + 16 = /dev/sdhb 210th SCSI disk whole disk + 32 = /dev/sdhc 211th SCSI disk whole disk + ... + 240 = /dev/sdhp 224th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + +134 block SCSI disk devices (224-239) + 0 = /dev/sdhq 225th SCSI disk whole disk + 16 = /dev/sdhr 226th SCSI disk whole disk + 32 = /dev/sdhs 227th SCSI disk whole disk + ... + 240 = /dev/sdif 240th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + +135 block SCSI disk devices (240-255) + 0 = /dev/sdig 241st SCSI disk whole disk + 16 = /dev/sdih 242nd SCSI disk whole disk + 32 = /dev/sdih 243rd SCSI disk whole disk + ... + 240 = /dev/sdiv 256th SCSI disk whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + + +136-143 char Unix98 PTY slaves + 0 = /dev/pts/0 First Unix98 pseudo-TTY + 1 = /dev/pts/1 Second Unix98 pesudo-TTY + ... + + These device nodes are automatically generated with + the proper permissions and modes by mounting the + devpts filesystem onto /dev/pts with the appropriate + mount options (distribution dependent, however, on + *most* distributions the appropriate options are + "mode=0620,gid=<gid of the "tty" group>".) + +136 block Mylex DAC960 PCI RAID controller; ninth controller + 0 = /dev/rd/c8d0 First disk, whole disk + 8 = /dev/rd/c8d1 Second disk, whole disk + ... + 248 = /dev/rd/c8d31 32nd disk, whole disk + + Partitions are handled as for major 48. + +137 block Mylex DAC960 PCI RAID controller; tenth controller + 0 = /dev/rd/c9d0 First disk, whole disk + 8 = /dev/rd/c9d1 Second disk, whole disk + ... + 248 = /dev/rd/c9d31 32nd disk, whole disk + + Partitions are handled as for major 48. + +138 block Mylex DAC960 PCI RAID controller; eleventh controller + 0 = /dev/rd/c10d0 First disk, whole disk + 8 = /dev/rd/c10d1 Second disk, whole disk + ... + 248 = /dev/rd/c10d31 32nd disk, whole disk + + Partitions are handled as for major 48. + +139 block Mylex DAC960 PCI RAID controller; twelfth controller + 0 = /dev/rd/c11d0 First disk, whole disk + 8 = /dev/rd/c11d1 Second disk, whole disk + ... + 248 = /dev/rd/c11d31 32nd disk, whole disk + + Partitions are handled as for major 48. + +140 block Mylex DAC960 PCI RAID controller; thirteenth controller + 0 = /dev/rd/c12d0 First disk, whole disk + 8 = /dev/rd/c12d1 Second disk, whole disk + ... + 248 = /dev/rd/c12d31 32nd disk, whole disk + + Partitions are handled as for major 48. + +141 block Mylex DAC960 PCI RAID controller; fourteenth controller + 0 = /dev/rd/c13d0 First disk, whole disk + 8 = /dev/rd/c13d1 Second disk, whole disk + ... + 248 = /dev/rd/c13d31 32nd disk, whole disk + + Partitions are handled as for major 48. + +142 block Mylex DAC960 PCI RAID controller; fifteenth controller + 0 = /dev/rd/c14d0 First disk, whole disk + 8 = /dev/rd/c14d1 Second disk, whole disk + ... + 248 = /dev/rd/c14d31 32nd disk, whole disk + + Partitions are handled as for major 48. + +143 block Mylex DAC960 PCI RAID controller; sixteenth controller + 0 = /dev/rd/c15d0 First disk, whole disk + 8 = /dev/rd/c15d1 Second disk, whole disk + ... + 248 = /dev/rd/c15d31 32nd disk, whole disk + + Partitions are handled as for major 48. + +144 char Encapsulated PPP + 0 = /dev/pppox0 First PPP over Ethernet + ... + 63 = /dev/pppox63 64th PPP over Ethernet + + This is primarily used for ADSL. + + The SST 5136-DN DeviceNet interface driver has been + relocated to major 183 due to an unfortunate conflict. + +144 block Expansion Area #1 for more non-device (e.g. NFS) mounts + 0 = mounted device 256 + 255 = mounted device 511 + +145 char SAM9407-based soundcard + 0 = /dev/sam0_mixer + 1 = /dev/sam0_sequencer + 2 = /dev/sam0_midi00 + 3 = /dev/sam0_dsp + 4 = /dev/sam0_audio + 6 = /dev/sam0_sndstat + 18 = /dev/sam0_midi01 + 34 = /dev/sam0_midi02 + 50 = /dev/sam0_midi03 + 64 = /dev/sam1_mixer + ... + 128 = /dev/sam2_mixer + ... + 192 = /dev/sam3_mixer + ... + + Device functions match OSS, but offer a number of + addons, which are sam9407 specific. OSS can be + operated simultaneously, taking care of the codec. + +145 block Expansion Area #2 for more non-device (e.g. NFS) mounts + 0 = mounted device 512 + 255 = mounted device 767 + +146 char SYSTRAM SCRAMNet mirrored-memory network + 0 = /dev/scramnet0 First SCRAMNet device + 1 = /dev/scramnet1 Second SCRAMNet device + ... + +146 block Expansion Area #3 for more non-device (e.g. NFS) mounts + 0 = mounted device 768 + 255 = mounted device 1023 + +147 char Aureal Semiconductor Vortex Audio device + 0 = /dev/aureal0 First Aureal Vortex + 1 = /dev/aureal1 Second Aureal Vortex + ... + +147 block Distributed Replicated Block Device (DRBD) + 0 = /dev/drbd0 First DRBD device + 1 = /dev/drbd1 Second DRBD device + ... + +148 char Technology Concepts serial card + 0 = /dev/ttyT0 First TCL port + 1 = /dev/ttyT1 Second TCL port + ... + +149 char Technology Concepts serial card - alternate devices + 0 = /dev/cut0 Callout device for ttyT0 + 1 = /dev/cut0 Callout device for ttyT1 + ... + +150 char Real-Time Linux FIFOs + 0 = /dev/rtf0 First RTLinux FIFO + 1 = /dev/rtf1 Second RTLinux FIFO + ... + +151 char DPT I2O SmartRaid V controller + 0 = /dev/dpti0 First DPT I2O adapter + 1 = /dev/dpti1 Second DPT I2O adapter + ... + +152 char EtherDrive Control Device + 0 = /dev/etherd/ctl Connect/Disconnect an EtherDrive + 1 = /dev/etherd/err Monitor errors + 2 = /dev/etherd/raw Raw AoE packet monitor + +152 block EtherDrive Block Devices + 0 = /dev/etherd/0 EtherDrive 0 + ... + 255 = /dev/etherd/255 EtherDrive 255 + +153 char SPI Bus Interface (sometimes referred to as MicroWire) + 0 = /dev/spi0 First SPI device on the bus + 1 = /dev/spi1 Second SPI device on the bus + ... + 15 = /dev/spi15 Sixteenth SPI device on the bus + +153 block Enhanced Metadisk RAID (EMD) storage units + 0 = /dev/emd/0 First unit + 1 = /dev/emd/0p1 Partition 1 on First unit + 2 = /dev/emd/0p2 Partition 2 on First unit + ... + 15 = /dev/emd/0p15 Partition 15 on First unit + + 16 = /dev/emd/1 Second unit + 32 = /dev/emd/2 Third unit + ... + 240 = /dev/emd/15 Sixteenth unit + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 15. + +154 char Specialix RIO serial card + 0 = /dev/ttySR0 First RIO port + ... + 255 = /dev/ttySR255 256th RIO port + +155 char Specialix RIO serial card - alternate devices + 0 = /dev/cusr0 Callout device for ttySR0 + ... + 255 = /dev/cusr255 Callout device for ttySR255 + +156 char Specialix RIO serial card + 0 = /dev/ttySR256 257th RIO port + ... + 255 = /dev/ttySR511 512th RIO port + +157 char Specialix RIO serial card - alternate devices + 0 = /dev/cusr256 Callout device for ttySR256 + ... + 255 = /dev/cusr511 Callout device for ttySR511 + +158 char Dialogic GammaLink fax driver + 0 = /dev/gfax0 GammaLink channel 0 + 1 = /dev/gfax1 GammaLink channel 1 + ... + +159 char RESERVED +159 block RESERVED + +160 char General Purpose Instrument Bus (GPIB) + 0 = /dev/gpib0 First GPIB bus + 1 = /dev/gpib1 Second GPIB bus + ... + +160 block Carmel 8-port SATA Disks on First Controller + 0 = /dev/carmel/0 SATA disk 0 whole disk + 1 = /dev/carmel/0p1 SATA disk 0 partition 1 + ... + 31 = /dev/carmel/0p31 SATA disk 0 partition 31 + + 32 = /dev/carmel/1 SATA disk 1 whole disk + 64 = /dev/carmel/2 SATA disk 2 whole disk + ... + 224 = /dev/carmel/7 SATA disk 7 whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 31. + +161 char IrCOMM devices (IrDA serial/parallel emulation) + 0 = /dev/ircomm0 First IrCOMM device + 1 = /dev/ircomm1 Second IrCOMM device + ... + 16 = /dev/irlpt0 First IrLPT device + 17 = /dev/irlpt1 Second IrLPT device + ... + +161 block Carmel 8-port SATA Disks on Second Controller + 0 = /dev/carmel/8 SATA disk 8 whole disk + 1 = /dev/carmel/8p1 SATA disk 8 partition 1 + ... + 31 = /dev/carmel/8p31 SATA disk 8 partition 31 + + 32 = /dev/carmel/9 SATA disk 9 whole disk + 64 = /dev/carmel/10 SATA disk 10 whole disk + ... + 224 = /dev/carmel/15 SATA disk 15 whole disk + + Partitions are handled in the same way as for IDE + disks (see major number 3) except that the limit on + partitions is 31. + +162 char Raw block device interface + 0 = /dev/rawctl Raw I/O control device + 1 = /dev/raw/raw1 First raw I/O device + 2 = /dev/raw/raw2 Second raw I/O device + ... + +163 char UNASSIGNED (was Radio Tech BIM-XXX-RS232 radio modem - see 51) + +164 char Chase Research AT/PCI-Fast serial card + 0 = /dev/ttyCH0 AT/PCI-Fast board 0, port 0 + ... + 15 = /dev/ttyCH15 AT/PCI-Fast board 0, port 15 + 16 = /dev/ttyCH16 AT/PCI-Fast board 1, port 0 + ... + 31 = /dev/ttyCH31 AT/PCI-Fast board 1, port 15 + 32 = /dev/ttyCH32 AT/PCI-Fast board 2, port 0 + ... + 47 = /dev/ttyCH47 AT/PCI-Fast board 2, port 15 + 48 = /dev/ttyCH48 AT/PCI-Fast board 3, port 0 + ... + 63 = /dev/ttyCH63 AT/PCI-Fast board 3, port 15 + +165 char Chase Research AT/PCI-Fast serial card - alternate devices + 0 = /dev/cuch0 Callout device for ttyCH0 + ... + 63 = /dev/cuch63 Callout device for ttyCH63 + +166 char ACM USB modems + 0 = /dev/ttyACM0 First ACM modem + 1 = /dev/ttyACM1 Second ACM modem + ... + +167 char ACM USB modems - alternate devices + 0 = /dev/cuacm0 Callout device for ttyACM0 + 1 = /dev/cuacm1 Callout device for ttyACM1 + ... + +168 char Eracom CSA7000 PCI encryption adaptor + 0 = /dev/ecsa0 First CSA7000 + 1 = /dev/ecsa1 Second CSA7000 + ... + +169 char Eracom CSA8000 PCI encryption adaptor + 0 = /dev/ecsa8-0 First CSA8000 + 1 = /dev/ecsa8-1 Second CSA8000 + ... + +170 char AMI MegaRAC remote access controller + 0 = /dev/megarac0 First MegaRAC card + 1 = /dev/megarac1 Second MegaRAC card + ... + +171 char Reserved for IEEE 1394 (Firewire) + + +172 char Moxa Intellio serial card + 0 = /dev/ttyMX0 First Moxa port + 1 = /dev/ttyMX1 Second Moxa port + ... + 127 = /dev/ttyMX127 128th Moxa port + 128 = /dev/moxactl Moxa control port + +173 char Moxa Intellio serial card - alternate devices + 0 = /dev/cumx0 Callout device for ttyMX0 + 1 = /dev/cumx1 Callout device for ttyMX1 + ... + 127 = /dev/cumx127 Callout device for ttyMX127 + +174 char SmartIO serial card + 0 = /dev/ttySI0 First SmartIO port + 1 = /dev/ttySI1 Second SmartIO port + ... + +175 char SmartIO serial card - alternate devices + 0 = /dev/cusi0 Callout device for ttySI0 + 1 = /dev/cusi1 Callout device for ttySI1 + ... + +176 char nCipher nFast PCI crypto accelerator + 0 = /dev/nfastpci0 First nFast PCI device + 1 = /dev/nfastpci1 First nFast PCI device + ... + +177 char TI PCILynx memory spaces + 0 = /dev/pcilynx/aux0 AUX space of first PCILynx card + ... + 15 = /dev/pcilynx/aux15 AUX space of 16th PCILynx card + 16 = /dev/pcilynx/rom0 ROM space of first PCILynx card + ... + 31 = /dev/pcilynx/rom15 ROM space of 16th PCILynx card + 32 = /dev/pcilynx/ram0 RAM space of first PCILynx card + ... + 47 = /dev/pcilynx/ram15 RAM space of 16th PCILynx card + +178 char Giganet cLAN1xxx virtual interface adapter + 0 = /dev/clanvi0 First cLAN adapter + 1 = /dev/clanvi1 Second cLAN adapter + ... + +179 char CCube DVXChip-based PCI products + 0 = /dev/dvxirq0 First DVX device + 1 = /dev/dvxirq1 Second DVX device + ... + +180 char USB devices + 0 = /dev/usb/lp0 First USB printer + ... + 15 = /dev/usb/lp15 16th USB printer + 16 = /dev/usb/mouse0 First USB mouse + ... + 31 = /dev/usb/mouse15 16th USB mouse + 32 = /dev/usb/ez0 First USB firmware loader + ... + 47 = /dev/usb/ez15 16th USB firmware loader + 48 = /dev/usb/scanner0 First USB scanner + ... + 63 = /dev/usb/scanner15 16th USB scanner + 64 = /dev/usb/rio500 Diamond Rio 500 + 65 = /dev/usb/usblcd USBLCD Interface (info@usblcd.de) + 66 = /dev/usb/cpad0 Synaptics cPad (mouse/LCD) + +180 block USB block devices + 0 = /dev/uba First USB block device + 8 = /dev/ubb Second USB block device + 16 = /dev/ubc Thrid USB block device + ... + +181 char Conrad Electronic parallel port radio clocks + 0 = /dev/pcfclock0 First Conrad radio clock + 1 = /dev/pcfclock1 Second Conrad radio clock + ... + +182 char Picture Elements THR2 binarizer + 0 = /dev/pethr0 First THR2 board + 1 = /dev/pethr1 Second THR2 board + ... + +183 char SST 5136-DN DeviceNet interface + 0 = /dev/ss5136dn0 First DeviceNet interface + 1 = /dev/ss5136dn1 Second DeviceNet interface + ... + + This device used to be assigned to major number 144. + It had to be moved due to an unfortunate conflict. + +184 char Picture Elements' video simulator/sender + 0 = /dev/pevss0 First sender board + 1 = /dev/pevss1 Second sender board + ... + +185 char InterMezzo high availability file system + 0 = /dev/intermezzo0 First cache manager + 1 = /dev/intermezzo1 Second cache manager + ... + + See http://www.inter-mezzo.org/ for more information. + +186 char Object-based storage control device + 0 = /dev/obd0 First obd control device + 1 = /dev/obd1 Second obd control device + ... + + See ftp://ftp.lustre.org/pub/obd for code and information. + +187 char DESkey hardware encryption device + 0 = /dev/deskey0 First DES key + 1 = /dev/deskey1 Second DES key + ... + +188 char USB serial converters + 0 = /dev/ttyUSB0 First USB serial converter + 1 = /dev/ttyUSB1 Second USB serial converter + ... + +189 char USB serial converters - alternate devices + 0 = /dev/cuusb0 Callout device for ttyUSB0 + 1 = /dev/cuusb1 Callout device for ttyUSB1 + ... + +190 char Kansas City tracker/tuner card + 0 = /dev/kctt0 First KCT/T card + 1 = /dev/kctt1 Second KCT/T card + ... + +191 char Reserved for PCMCIA + +192 char Kernel profiling interface + 0 = /dev/profile Profiling control device + 1 = /dev/profile0 Profiling device for CPU 0 + 2 = /dev/profile1 Profiling device for CPU 1 + ... + +193 char Kernel event-tracing interface + 0 = /dev/trace Tracing control device + 1 = /dev/trace0 Tracing device for CPU 0 + 2 = /dev/trace1 Tracing device for CPU 1 + ... + +194 char linVideoStreams (LINVS) + 0 = /dev/mvideo/status0 Video compression status + 1 = /dev/mvideo/stream0 Video stream + 2 = /dev/mvideo/frame0 Single compressed frame + 3 = /dev/mvideo/rawframe0 Raw uncompressed frame + 4 = /dev/mvideo/codec0 Direct codec access + 5 = /dev/mvideo/video4linux0 Video4Linux compatibility + + 16 = /dev/mvideo/status1 Second device + ... + 32 = /dev/mvideo/status2 Third device + ... + ... + 240 = /dev/mvideo/status15 16th device + ... + +195 char Nvidia graphics devices + 0 = /dev/nvidia0 First Nvidia card + 1 = /dev/nvidia1 Second Nvidia card + ... + 255 = /dev/nvidiactl Nvidia card control device + +196 char Tormenta T1 card + 0 = /dev/tor/0 Master control channel for all cards + 1 = /dev/tor/1 First DS0 + 2 = /dev/tor/2 Second DS0 + ... + 48 = /dev/tor/48 48th DS0 + 49 = /dev/tor/49 First pseudo-channel + 50 = /dev/tor/50 Second pseudo-channel + ... + +197 char OpenTNF tracing facility + 0 = /dev/tnf/t0 Trace 0 data extraction + 1 = /dev/tnf/t1 Trace 1 data extraction + ... + 128 = /dev/tnf/status Tracing facility status + 130 = /dev/tnf/trace Tracing device + +198 char Total Impact TPMP2 quad coprocessor PCI card + 0 = /dev/tpmp2/0 First card + 1 = /dev/tpmp2/1 Second card + ... + +199 char Veritas volume manager (VxVM) volumes + 0 = /dev/vx/rdsk/*/* First volume + 1 = /dev/vx/rdsk/*/* Second volume + ... + +199 block Veritas volume manager (VxVM) volumes + 0 = /dev/vx/dsk/*/* First volume + 1 = /dev/vx/dsk/*/* Second volume + ... + + The namespace in these directories is maintained by + the user space VxVM software. + +200 char Veritas VxVM configuration interface + 0 = /dev/vx/config Configuration access node + 1 = /dev/vx/trace Volume i/o trace access node + 2 = /dev/vx/iod Volume i/o daemon access node + 3 = /dev/vx/info Volume information access node + 4 = /dev/vx/task Volume tasks access node + 5 = /dev/vx/taskmon Volume tasks monitor daemon + +201 char Veritas VxVM dynamic multipathing driver + 0 = /dev/vx/rdmp/* First multipath device + 1 = /dev/vx/rdmp/* Second multipath device + ... +201 block Veritas VxVM dynamic multipathing driver + 0 = /dev/vx/dmp/* First multipath device + 1 = /dev/vx/dmp/* Second multipath device + ... + + The namespace in these directories is maintained by + the user space VxVM software. + +202 char CPU model-specific registers + 0 = /dev/cpu/0/msr MSRs on CPU 0 + 1 = /dev/cpu/1/msr MSRs on CPU 1 + ... + +203 char CPU CPUID information + 0 = /dev/cpu/0/cpuid CPUID on CPU 0 + 1 = /dev/cpu/1/cpuid CPUID on CPU 1 + ... + +204 char Low-density serial ports + 0 = /dev/ttyLU0 LinkUp Systems L72xx UART - port 0 + 1 = /dev/ttyLU1 LinkUp Systems L72xx UART - port 1 + 2 = /dev/ttyLU2 LinkUp Systems L72xx UART - port 2 + 3 = /dev/ttyLU3 LinkUp Systems L72xx UART - port 3 + 4 = /dev/ttyFB0 Intel Footbridge (ARM) + 5 = /dev/ttySA0 StrongARM builtin serial port 0 + 6 = /dev/ttySA1 StrongARM builtin serial port 1 + 7 = /dev/ttySA2 StrongARM builtin serial port 2 + 8 = /dev/ttySC0 SCI serial port (SuperH) - port 0 + 9 = /dev/ttySC1 SCI serial port (SuperH) - port 1 + 10 = /dev/ttySC2 SCI serial port (SuperH) - port 2 + 11 = /dev/ttySC3 SCI serial port (SuperH) - port 3 + 12 = /dev/ttyFW0 Firmware console - port 0 + 13 = /dev/ttyFW1 Firmware console - port 1 + 14 = /dev/ttyFW2 Firmware console - port 2 + 15 = /dev/ttyFW3 Firmware console - port 3 + 16 = /dev/ttyAM0 ARM "AMBA" serial port 0 + ... + 31 = /dev/ttyAM15 ARM "AMBA" serial port 15 + 32 = /dev/ttyDB0 DataBooster serial port 0 + ... + 39 = /dev/ttyDB7 DataBooster serial port 7 + 40 = /dev/ttySG0 SGI Altix console port + 41 = /dev/ttySMX0 Motorola i.MX - port 0 + 42 = /dev/ttySMX1 Motorola i.MX - port 1 + 43 = /dev/ttySMX2 Motorola i.MX - port 2 + 44 = /dev/ttyMM0 Marvell MPSC - port 0 + 45 = /dev/ttyMM1 Marvell MPSC - port 1 + 46 = /dev/ttyCPM0 PPC CPM (SCC or SMC) - port 0 + ... + 47 = /dev/ttyCPM5 PPC CPM (SCC or SMC) - port 5 + 50 = /dev/ttyIOC40 Altix serial card + ... + 81 = /dev/ttyIOC431 Altix serial card + 82 = /dev/ttyVR0 NEC VR4100 series SIU + 83 = /dev/ttyVR1 NEC VR4100 series DSIU + +205 char Low-density serial ports (alternate device) + 0 = /dev/culu0 Callout device for ttyLU0 + 1 = /dev/culu1 Callout device for ttyLU1 + 2 = /dev/culu2 Callout device for ttyLU2 + 3 = /dev/culu3 Callout device for ttyLU3 + 4 = /dev/cufb0 Callout device for ttyFB0 + 5 = /dev/cusa0 Callout device for ttySA0 + 6 = /dev/cusa1 Callout device for ttySA1 + 7 = /dev/cusa2 Callout device for ttySA2 + 8 = /dev/cusc0 Callout device for ttySC0 + 9 = /dev/cusc1 Callout device for ttySC1 + 10 = /dev/cusc2 Callout device for ttySC2 + 11 = /dev/cusc3 Callout device for ttySC3 + 12 = /dev/cufw0 Callout device for ttyFW0 + 13 = /dev/cufw1 Callout device for ttyFW1 + 14 = /dev/cufw2 Callout device for ttyFW2 + 15 = /dev/cufw3 Callout device for ttyFW3 + 16 = /dev/cuam0 Callout device for ttyAM0 + ... + 31 = /dev/cuam15 Callout device for ttyAM15 + 32 = /dev/cudb0 Callout device for ttyDB0 + ... + 39 = /dev/cudb7 Callout device for ttyDB7 + 40 = /dev/cusg0 Callout device for ttySG0 + 41 = /dev/ttycusmx0 Callout device for ttySMX0 + 42 = /dev/ttycusmx1 Callout device for ttySMX1 + 43 = /dev/ttycusmx2 Callout device for ttySMX2 + 46 = /dev/cucpm0 Callout device for ttyCPM0 + ... + 49 = /dev/cucpm5 Callout device for ttyCPM5 + 50 = /dev/cuioc40 Callout device for ttyIOC40 + ... + 81 = /dev/cuioc431 Callout device for ttyIOC431 + 82 = /dev/cuvr0 Callout device for ttyVR0 + 83 = /dev/cuvr1 Callout device for ttyVR1 + + +206 char OnStream SC-x0 tape devices + 0 = /dev/osst0 First OnStream SCSI tape, mode 0 + 1 = /dev/osst1 Second OnStream SCSI tape, mode 0 + ... + 32 = /dev/osst0l First OnStream SCSI tape, mode 1 + 33 = /dev/osst1l Second OnStream SCSI tape, mode 1 + ... + 64 = /dev/osst0m First OnStream SCSI tape, mode 2 + 65 = /dev/osst1m Second OnStream SCSI tape, mode 2 + ... + 96 = /dev/osst0a First OnStream SCSI tape, mode 3 + 97 = /dev/osst1a Second OnStream SCSI tape, mode 3 + ... + 128 = /dev/nosst0 No rewind version of /dev/osst0 + 129 = /dev/nosst1 No rewind version of /dev/osst1 + ... + 160 = /dev/nosst0l No rewind version of /dev/osst0l + 161 = /dev/nosst1l No rewind version of /dev/osst1l + ... + 192 = /dev/nosst0m No rewind version of /dev/osst0m + 193 = /dev/nosst1m No rewind version of /dev/osst1m + ... + 224 = /dev/nosst0a No rewind version of /dev/osst0a + 225 = /dev/nosst1a No rewind version of /dev/osst1a + ... + + The OnStream SC-x0 SCSI tapes do not support the + standard SCSI SASD command set and therefore need + their own driver "osst". Note that the IDE, USB (and + maybe ParPort) versions may be driven via ide-scsi or + usb-storage SCSI emulation and this osst device and + driver as well. The ADR-x0 drives are QIC-157 + compliant and don't need osst. + +207 char Compaq ProLiant health feature indicate + 0 = /dev/cpqhealth/cpqw Redirector interface + 1 = /dev/cpqhealth/crom EISA CROM + 2 = /dev/cpqhealth/cdt Data Table + 3 = /dev/cpqhealth/cevt Event Log + 4 = /dev/cpqhealth/casr Automatic Server Recovery + 5 = /dev/cpqhealth/cecc ECC Memory + 6 = /dev/cpqhealth/cmca Machine Check Architecture + 7 = /dev/cpqhealth/ccsm Deprecated CDT + 8 = /dev/cpqhealth/cnmi NMI Handling + 9 = /dev/cpqhealth/css Sideshow Management + 10 = /dev/cpqhealth/cram CMOS interface + 11 = /dev/cpqhealth/cpci PCI IRQ interface + +208 char User space serial ports + 0 = /dev/ttyU0 First user space serial port + 1 = /dev/ttyU1 Second user space serial port + ... + +209 char User space serial ports (alternate devices) + 0 = /dev/cuu0 Callout device for ttyU0 + 1 = /dev/cuu1 Callout device for ttyU1 + ... + +210 char SBE, Inc. sync/async serial card + 0 = /dev/sbei/wxcfg0 Configuration device for board 0 + 1 = /dev/sbei/dld0 Download device for board 0 + 2 = /dev/sbei/wan00 WAN device, port 0, board 0 + 3 = /dev/sbei/wan01 WAN device, port 1, board 0 + 4 = /dev/sbei/wan02 WAN device, port 2, board 0 + 5 = /dev/sbei/wan03 WAN device, port 3, board 0 + 6 = /dev/sbei/wanc00 WAN clone device, port 0, board 0 + 7 = /dev/sbei/wanc01 WAN clone device, port 1, board 0 + 8 = /dev/sbei/wanc02 WAN clone device, port 2, board 0 + 9 = /dev/sbei/wanc03 WAN clone device, port 3, board 0 + 10 = /dev/sbei/wxcfg1 Configuration device for board 1 + 11 = /dev/sbei/dld1 Download device for board 1 + 12 = /dev/sbei/wan10 WAN device, port 0, board 1 + 13 = /dev/sbei/wan11 WAN device, port 1, board 1 + 14 = /dev/sbei/wan12 WAN device, port 2, board 1 + 15 = /dev/sbei/wan13 WAN device, port 3, board 1 + 16 = /dev/sbei/wanc10 WAN clone device, port 0, board 1 + 17 = /dev/sbei/wanc11 WAN clone device, port 1, board 1 + 18 = /dev/sbei/wanc12 WAN clone device, port 2, board 1 + 19 = /dev/sbei/wanc13 WAN clone device, port 3, board 1 + ... + + Yes, each board is really spaced 10 (decimal) apart. + +211 char Addinum CPCI1500 digital I/O card + 0 = /dev/addinum/cpci1500/0 First CPCI1500 card + 1 = /dev/addinum/cpci1500/1 Second CPCI1500 card + ... + +212 char LinuxTV.org DVB driver subsystem + + 0 = /dev/dvb/adapter0/video0 first video decoder of first card + 1 = /dev/dvb/adapter0/audio0 first audio decoder of first card + 2 = /dev/dvb/adapter0/sec0 (obsolete/unused) + 3 = /dev/dvb/adapter0/frontend0 first frontend device of first card + 4 = /dev/dvb/adapter0/demux0 first demux device of first card + 5 = /dev/dvb/adapter0/dvr0 first digital video recoder device of first card + 6 = /dev/dvb/adapter0/ca0 first common access port of first card + 7 = /dev/dvb/adapter0/net0 first network device of first card + 8 = /dev/dvb/adapter0/osd0 first on-screen-display device of first card + 9 = /dev/dvb/adapter0/video1 second video decoder of first card + ... + 64 = /dev/dvb/adapter1/video0 first video decoder of second card + ... + 128 = /dev/dvb/adapter2/video0 first video decoder of third card + ... + 196 = /dev/dvb/adapter3/video0 first video decoder of fourth card + + +216 char USB BlueTooth devices + 0 = /dev/ttyUB0 First USB BlueTooth device + 1 = /dev/ttyUB1 Second USB BlueTooth device + ... + +217 char USB BlueTooth devices (alternate devices) + 0 = /dev/cuub0 Callout device for ttyUB0 + 1 = /dev/cuub1 Callout device for ttyUB1 + ... + +218 char The Logical Company bus Unibus/Qbus adapters + 0 = /dev/logicalco/bci/0 First bus adapter + 1 = /dev/logicalco/bci/1 First bus adapter + ... + +219 char The Logical Company DCI-1300 digital I/O card + 0 = /dev/logicalco/dci1300/0 First DCI-1300 card + 1 = /dev/logicalco/dci1300/1 Second DCI-1300 card + ... + +220 char Myricom Myrinet "GM" board + 0 = /dev/myricom/gm0 First Myrinet GM board + 1 = /dev/myricom/gmp0 First board "root access" + 2 = /dev/myricom/gm1 Second Myrinet GM board + 3 = /dev/myricom/gmp1 Second board "root access" + ... + +221 char VME bus + 0 = /dev/bus/vme/m0 First master image + 1 = /dev/bus/vme/m1 Second master image + 2 = /dev/bus/vme/m2 Third master image + 3 = /dev/bus/vme/m3 Fourth master image + 4 = /dev/bus/vme/s0 First slave image + 5 = /dev/bus/vme/s1 Second slave image + 6 = /dev/bus/vme/s2 Third slave image + 7 = /dev/bus/vme/s3 Fourth slave image + 8 = /dev/bus/vme/ctl Control + + It is expected that all VME bus drivers will use the + same interface. For interface documentation see + http://www.vmelinux.org/. + +224 char A2232 serial card + 0 = /dev/ttyY0 First A2232 port + 1 = /dev/ttyY1 Second A2232 port + ... + +225 char A2232 serial card (alternate devices) + 0 = /dev/cuy0 Callout device for ttyY0 + 1 = /dev/cuy1 Callout device for ttyY1 + ... + +226 char Direct Rendering Infrastructure (DRI) + 0 = /dev/dri/card0 First graphics card + 1 = /dev/dri/card1 Second graphics card + ... + +227 char IBM 3270 terminal Unix tty access + 1 = /dev/3270/tty1 First 3270 terminal + 2 = /dev/3270/tty2 Seconds 3270 terminal + ... + +228 char IBM 3270 terminal block-mode access + 0 = /dev/3270/tub Controlling interface + 1 = /dev/3270/tub1 First 3270 terminal + 2 = /dev/3270/tub2 Second 3270 terminal + ... + +229 char IBM iSeries virtual console + 0 = /dev/iseries/vtty0 First console port + 1 = /dev/iseries/vtty1 Second console port + ... + +230 char IBM iSeries virtual tape + 0 = /dev/iseries/vt0 First virtual tape, mode 0 + 1 = /dev/iseries/vt1 Second virtual tape, mode 0 + ... + 32 = /dev/iseries/vt0l First virtual tape, mode 1 + 33 = /dev/iseries/vt1l Second virtual tape, mode 1 + ... + 64 = /dev/iseries/vt0m First virtual tape, mode 2 + 65 = /dev/iseries/vt1m Second virtual tape, mode 2 + ... + 96 = /dev/iseries/vt0a First virtual tape, mode 3 + 97 = /dev/iseries/vt1a Second virtual tape, mode 3 + ... + 128 = /dev/iseries/nvt0 First virtual tape, mode 0, no rewind + 129 = /dev/iseries/nvt1 Second virtual tape, mode 0, no rewind + ... + 160 = /dev/iseries/nvt0l First virtual tape, mode 1, no rewind + 161 = /dev/iseries/nvt1l Second virtual tape, mode 1, no rewind + ... + 192 = /dev/iseries/nvt0m First virtual tape, mode 2, no rewind + 193 = /dev/iseries/nvt1m Second virtual tape, mode 2, no rewind + ... + 224 = /dev/iseries/nvt0a First virtual tape, mode 3, no rewind + 225 = /dev/iseries/nvt1a Second virtual tape, mode 3, no rewind + ... + + "No rewind" refers to the omission of the default + automatic rewind on device close. The MTREW or MTOFFL + ioctl()'s can be used to rewind the tape regardless of + the device used to access it. + +231 char InfiniBand MAD + 0 = /dev/infiniband/umad0 + 1 = /dev/infiniband/umad1 + ... + +232-239 UNASSIGNED + +240-254 char LOCAL/EXPERIMENTAL USE +240-254 block LOCAL/EXPERIMENTAL USE + Allocated for local/experimental use. For devices not + assigned official numbers, these ranges should be + used in order to avoid conflicting with future assignments. + +255 char RESERVED +255 block RESERVED + + This major is reserved to assist the expansion to a + larger number space. No device nodes with this major + should ever be created on the filesystem. + + **** ADDITIONAL /dev DIRECTORY ENTRIES + +This section details additional entries that should or may exist in +the /dev directory. It is preferred that symbolic links use the same +form (absolute or relative) as is indicated here. Links are +classified as "hard" or "symbolic" depending on the preferred type of +link; if possible, the indicated type of link should be used. + + + Compulsory links + +These links should exist on all systems: + +/dev/fd /proc/self/fd symbolic File descriptors +/dev/stdin fd/0 symbolic stdin file descriptor +/dev/stdout fd/1 symbolic stdout file descriptor +/dev/stderr fd/2 symbolic stderr file descriptor +/dev/nfsd socksys symbolic Required by iBCS-2 +/dev/X0R null symbolic Required by iBCS-2 + +Note: /dev/X0R is <letter X>-<digit 0>-<letter R>. + + Recommended links + +It is recommended that these links exist on all systems: + +/dev/core /proc/kcore symbolic Backward compatibility +/dev/ramdisk ram0 symbolic Backward compatibility +/dev/ftape qft0 symbolic Backward compatibility +/dev/bttv0 video0 symbolic Backward compatibility +/dev/radio radio0 symbolic Backward compatibility +/dev/i2o* /dev/i2o/* symbolic Backward compatibility +/dev/scd? sr? hard Alternate SCSI CD-ROM name + + Locally defined links + +The following links may be established locally to conform to the +configuration of the system. This is merely a tabulation of existing +practice, and does not constitute a recommendation. However, if they +exist, they should have the following uses. + +/dev/mouse mouse port symbolic Current mouse device +/dev/tape tape device symbolic Current tape device +/dev/cdrom CD-ROM device symbolic Current CD-ROM device +/dev/cdwriter CD-writer symbolic Current CD-writer device +/dev/scanner scanner symbolic Current scanner device +/dev/modem modem port symbolic Current dialout device +/dev/root root device symbolic Current root filesystem +/dev/swap swap device symbolic Current swap device + +/dev/modem should not be used for a modem which supports dialin as +well as dialout, as it tends to cause lock file problems. If it +exists, /dev/modem should point to the appropriate primary TTY device +(the use of the alternate callout devices is deprecated). + +For SCSI devices, /dev/tape and /dev/cdrom should point to the +``cooked'' devices (/dev/st* and /dev/sr*, respectively), whereas +/dev/cdwriter and /dev/scanner should point to the appropriate generic +SCSI devices (/dev/sg*). + +/dev/mouse may point to a primary serial TTY device, a hardware mouse +device, or a socket for a mouse driver program (e.g. /dev/gpmdata). + + Sockets and pipes + +Non-transient sockets and named pipes may exist in /dev. Common entries are: + +/dev/printer socket lpd local socket +/dev/log socket syslog local socket +/dev/gpmdata socket gpm mouse multiplexer + + Mount points + +The following names are reserved for mounting special filesystems +under /dev. These special filesystems provide kernel interfaces that +cannot be provided with standard device nodes. + +/dev/pts devpts PTY slave filesystem +/dev/shm tmpfs POSIX shared memory maintenance access + + **** TERMINAL DEVICES + +Terminal, or TTY devices are a special class of character devices. A +terminal device is any device that could act as a controlling terminal +for a session; this includes virtual consoles, serial ports, and +pseudoterminals (PTYs). + +All terminal devices share a common set of capabilities known as line +diciplines; these include the common terminal line dicipline as well +as SLIP and PPP modes. + +All terminal devices are named similarly; this section explains the +naming and use of the various types of TTYs. Note that the naming +conventions include several historical warts; some of these are +Linux-specific, some were inherited from other systems, and some +reflect Linux outgrowing a borrowed convention. + +A hash mark (#) in a device name is used here to indicate a decimal +number without leading zeroes. + + Virtual consoles and the console device + +Virtual consoles are full-screen terminal displays on the system video +monitor. Virtual consoles are named /dev/tty#, with numbering +starting at /dev/tty1; /dev/tty0 is the current virtual console. +/dev/tty0 is the device that should be used to access the system video +card on those architectures for which the frame buffer devices +(/dev/fb*) are not applicable. Do not use /dev/console +for this purpose. + +The console device, /dev/console, is the device to which system +messages should be sent, and on which logins should be permitted in +single-user mode. Starting with Linux 2.1.71, /dev/console is managed +by the kernel; for previous versions it should be a symbolic link to +either /dev/tty0, a specific virtual console such as /dev/tty1, or to +a serial port primary (tty*, not cu*) device, depending on the +configuration of the system. + + Serial ports + +Serial ports are RS-232 serial ports and any device which simulates +one, either in hardware (such as internal modems) or in software (such +as the ISDN driver.) Under Linux, each serial ports has two device +names, the primary or callin device and the alternate or callout one. +Each kind of device is indicated by a different letter. For any +letter X, the names of the devices are /dev/ttyX# and /dev/cux#, +respectively; for historical reasons, /dev/ttyS# and /dev/ttyC# +correspond to /dev/cua# and /dev/cub#. In the future, it should be +expected that multiple letters will be used; all letters will be upper +case for the "tty" device (e.g. /dev/ttyDP#) and lower case for the +"cu" device (e.g. /dev/cudp#). + +The names /dev/ttyQ# and /dev/cuq# are reserved for local use. + +The alternate devices provide for kernel-based exclusion and somewhat +different defaults than the primary devices. Their main purpose is to +allow the use of serial ports with programs with no inherent or broken +support for serial ports. Their use is deprecated, and they may be +removed from a future version of Linux. + +Arbitration of serial ports is provided by the use of lock files with +the names /var/lock/LCK..ttyX#. The contents of the lock file should +be the PID of the locking process as an ASCII number. + +It is common practice to install links such as /dev/modem +which point to serial ports. In order to ensure proper locking in the +presence of these links, it is recommended that software chase +symlinks and lock all possible names; additionally, it is recommended +that a lock file be installed with the corresponding alternate +device. In order to avoid deadlocks, it is recommended that the locks +are acquired in the following order, and released in the reverse: + + 1. The symbolic link name, if any (/var/lock/LCK..modem) + 2. The "tty" name (/var/lock/LCK..ttyS2) + 3. The alternate device name (/var/lock/LCK..cua2) + +In the case of nested symbolic links, the lock files should be +installed in the order the symlinks are resolved. + +Under no circumstances should an application hold a lock while waiting +for another to be released. In addition, applications which attempt +to create lock files for the corresponding alternate device names +should take into account the possibility of being used on a non-serial +port TTY, for which no alternate device would exist. + + Pseudoterminals (PTYs) + +Pseudoterminals, or PTYs, are used to create login sessions or provide +other capabilities requiring a TTY line dicipline (including SLIP or +PPP capability) to arbitrary data-generation processes. Each PTY has +a master side, named /dev/pty[p-za-e][0-9a-f], and a slave side, named +/dev/tty[p-za-e][0-9a-f]. The kernel arbitrates the use of PTYs by +allowing each master side to be opened only once. + +Once the master side has been opened, the corresponding slave device +can be used in the same manner as any TTY device. The master and +slave devices are connected by the kernel, generating the equivalent +of a bidirectional pipe with TTY capabilities. + +Recent versions of the Linux kernels and GNU libc contain support for +the System V/Unix98 naming scheme for PTYs, which assigns a common +device, /dev/ptmx, to all the masters (opening it will automatically +give you a previously unassigned PTY) and a subdirectory, /dev/pts, +for the slaves; the slaves are named with decimal integers (/dev/pts/# +in our notation). This removes the problem of exhausting the +namespace and enables the kernel to automatically create the device +nodes for the slaves on demand using the "devpts" filesystem. + diff --git a/Documentation/digiepca.txt b/Documentation/digiepca.txt new file mode 100644 index 000000000000..88820fe38dad --- /dev/null +++ b/Documentation/digiepca.txt @@ -0,0 +1,98 @@ +NOTE: This driver is obsolete. Digi provides a 2.6 driver (dgdm) at +http://www.digi.com for PCI cards. They no longer maintain this driver, +and have no 2.6 driver for ISA cards. + +This driver requires a number of user-space tools. They can be aquired from +http://www.digi.com, but only works with 2.4 kernels. + + +The Digi Intl. epca driver. +---------------------------- +The Digi Intl. epca driver for Linux supports the following boards: + +Digi PC/Xem, PC/Xr, PC/Xe, PC/Xi, PC/Xeve +Digi EISA/Xem, PCI/Xem, PCI/Xr + +Limitations: +------------ +Currently the driver only autoprobes for supported PCI boards. + +The Linux MAKEDEV command does not support generating the Digiboard +Devices. Users executing digiConfig to setup EISA and PC series cards +will have their device nodes automatically constructed (cud?? for ~CLOCAL, +and ttyD?? for CLOCAL). Users wishing to boot their board from the LILO +prompt, or those users booting PCI cards may use buildDIGI to construct +the necessary nodes. + +Notes: +------ +This driver may be configured via LILO. For users who have already configured +their driver using digiConfig, configuring from LILO will override previous +settings. Multiple boards may be configured by issuing multiple LILO command +lines. For examples see the bottom of this document. + +Device names start at 0 and continue up. Beware of this as previous Digi +drivers started device names with 1. + +PCI boards are auto-detected and configured by the driver. PCI boards will +be allocated device numbers (internally) beginning with the lowest PCI slot +first. In other words a PCI card in slot 3 will always have higher device +nodes than a PCI card in slot 1. + +LILO config examples: +--------------------- +Using LILO's APPEND command, a string of comma separated identifiers or +integers can be used to configure supported boards. The six values in order +are: + + Enable/Disable this card or Override, + Type of card: PC/Xe (AccelePort) (0), PC/Xeve (1), PC/Xem or PC/Xr (2), + EISA/Xem (3), PC/64Xe (4), PC/Xi (5), + Enable/Disable alternate pin arrangement, + Number of ports on this card, + I/O Port where card is configured (in HEX if using string identifiers), + Base of memory window (in HEX if using string identifiers), + +NOTE : PCI boards are auto-detected and configured. Do not attempt to +configure PCI boards with the LILO append command. If you wish to override +previous configuration data (As set by digiConfig), but you do not wish to +configure any specific card (Example if there are PCI cards in the system) +the following override command will accomplish this: +-> append="digi=2" + +Samples: + append="digiepca=E,PC/Xe,D,16,200,D0000" + or + append="digi=1,0,0,16,512,851968" + +Supporting Tools: +----------------- +Supporting tools include digiDload, digiConfig, buildPCI, and ditty. See +drivers/char/README.epca for more details. Note, +this driver REQUIRES that digiDload be executed prior to it being used. +Failure to do this will result in an ENODEV error. + +Documentation: +-------------- +Complete documentation for this product may be found in the tool package. + +Sources of information and support: +----------------------------------- +Digi Intl. support site for this product: + +-> http://www.digi.com + +Acknowledgments: +---------------- +Much of this work (And even text) was derived from a similar document +supporting the original public domain DigiBoard driver Copyright (C) +1994,1995 Troy De Jongh. Many thanks to Christoph Lameter +(christoph@lameter.com) and Mike McLagan (mike.mclagan@linux.org) who authored +and contributed to the original document. + +Changelog: +---------- +10-29-04: Update status of driver, remove dead links in document + James Nelson <james4765@gmail.com> + +2000 (?) Original Document diff --git a/Documentation/dnotify.txt b/Documentation/dnotify.txt new file mode 100644 index 000000000000..6984fca6002a --- /dev/null +++ b/Documentation/dnotify.txt @@ -0,0 +1,99 @@ + Linux Directory Notification + ============================ + + Stephen Rothwell <sfr@canb.auug.org.au> + +The intention of directory notification is to allow user applications +to be notified when a directory, or any of the files in it, are changed. +The basic mechanism involves the application registering for notification +on a directory using a fcntl(2) call and the notifications themselves +being delivered using signals. + +The application decides which "events" it wants to be notified about. +The currently defined events are: + + DN_ACCESS A file in the directory was accessed (read) + DN_MODIFY A file in the directory was modified (write,truncate) + DN_CREATE A file was created in the directory + DN_DELETE A file was unlinked from directory + DN_RENAME A file in the directory was renamed + DN_ATTRIB A file in the directory had its attributes + changed (chmod,chown) + +Usually, the application must reregister after each notification, but +if DN_MULTISHOT is or'ed with the event mask, then the registration will +remain until explicitly removed (by registering for no events). + +By default, SIGIO will be delivered to the process and no other useful +information. However, if the F_SETSIG fcntl(2) call is used to let the +kernel know which signal to deliver, a siginfo structure will be passed to +the signal handler and the si_fd member of that structure will contain the +file descriptor associated with the directory in which the event occurred. + +Preferably the application will choose one of the real time signals +(SIGRTMIN + <n>) so that the notifications may be queued. This is +especially important if DN_MULTISHOT is specified. Note that SIGRTMIN +is often blocked, so it is better to use (at least) SIGRTMIN + 1. + +Implementation expectations (features and bugs :-)) +--------------------------- + +The notification should work for any local access to files even if the +actual file system is on a remote server. This implies that remote +access to files served by local user mode servers should be notified. +Also, remote accesses to files served by a local kernel NFS server should +be notified. + +In order to make the impact on the file system code as small as possible, +the problem of hard links to files has been ignored. So if a file (x) +exists in two directories (a and b) then a change to the file using the +name "a/x" should be notified to a program expecting notifications on +directory "a", but will not be notified to one expecting notifications on +directory "b". + +Also, files that are unlinked, will still cause notifications in the +last directory that they were linked to. + +Configuration +------------- + +Dnotify is controlled via the CONFIG_DNOTIFY configuration option. When +disabled, fcntl(fd, F_NOTIFY, ...) will return -EINVAL. + +Example +------- + + #define _GNU_SOURCE /* needed to get the defines */ + #include <fcntl.h> /* in glibc 2.2 this has the needed + values defined */ + #include <signal.h> + #include <stdio.h> + #include <unistd.h> + + static volatile int event_fd; + + static void handler(int sig, siginfo_t *si, void *data) + { + event_fd = si->si_fd; + } + + int main(void) + { + struct sigaction act; + int fd; + + act.sa_sigaction = handler; + sigemptyset(&act.sa_mask); + act.sa_flags = SA_SIGINFO; + sigaction(SIGRTMIN + 1, &act, NULL); + + fd = open(".", O_RDONLY); + fcntl(fd, F_SETSIG, SIGRTMIN + 1); + fcntl(fd, F_NOTIFY, DN_MODIFY|DN_CREATE|DN_MULTISHOT); + /* we will now be notified if any of the files + in "." is modified or new files are created */ + while (1) { + pause(); + printf("Got event on fd=%d\n", event_fd); + } + } diff --git a/Documentation/driver-model/binding.txt b/Documentation/driver-model/binding.txt new file mode 100644 index 000000000000..f7ec9d625bfc --- /dev/null +++ b/Documentation/driver-model/binding.txt @@ -0,0 +1,102 @@ + +Driver Binding + +Driver binding is the process of associating a device with a device +driver that can control it. Bus drivers have typically handled this +because there have been bus-specific structures to represent the +devices and the drivers. With generic device and device driver +structures, most of the binding can take place using common code. + + +Bus +~~~ + +The bus type structure contains a list of all devices that are on that bus +type in the system. When device_register is called for a device, it is +inserted into the end of this list. The bus object also contains a +list of all drivers of that bus type. When driver_register is called +for a driver, it is inserted at the end of this list. These are the +two events which trigger driver binding. + + +device_register +~~~~~~~~~~~~~~~ + +When a new device is added, the bus's list of drivers is iterated over +to find one that supports it. In order to determine that, the device +ID of the device must match one of the device IDs that the driver +supports. The format and semantics for comparing IDs is bus-specific. +Instead of trying to derive a complex state machine and matching +algorithm, it is up to the bus driver to provide a callback to compare +a device against the IDs of a driver. The bus returns 1 if a match was +found; 0 otherwise. + +int match(struct device * dev, struct device_driver * drv); + +If a match is found, the device's driver field is set to the driver +and the driver's probe callback is called. This gives the driver a +chance to verify that it really does support the hardware, and that +it's in a working state. + +Device Class +~~~~~~~~~~~~ + +Upon the successful completion of probe, the device is registered with +the class to which it belongs. Device drivers belong to one and only one +class, and that is set in the driver's devclass field. +devclass_add_device is called to enumerate the device within the class +and actually register it with the class, which happens with the +class's register_dev callback. + +NOTE: The device class structures and core routines to manipulate them +are not in the mainline kernel, so the discussion is still a bit +speculative. + + +Driver +~~~~~~ + +When a driver is attached to a device, the device is inserted into the +driver's list of devices. + + +sysfs +~~~~~ + +A symlink is created in the bus's 'devices' directory that points to +the device's directory in the physical hierarchy. + +A symlink is created in the driver's 'devices' directory that points +to the device's directory in the physical hierarchy. + +A directory for the device is created in the class's directory. A +symlink is created in that directory that points to the device's +physical location in the sysfs tree. + +A symlink can be created (though this isn't done yet) in the device's +physical directory to either its class directory, or the class's +top-level directory. One can also be created to point to its driver's +directory also. + + +driver_register +~~~~~~~~~~~~~~~ + +The process is almost identical for when a new driver is added. +The bus's list of devices is iterated over to find a match. Devices +that already have a driver are skipped. All the devices are iterated +over, to bind as many devices as possible to the driver. + + +Removal +~~~~~~~ + +When a device is removed, the reference count for it will eventually +go to 0. When it does, the remove callback of the driver is called. It +is removed from the driver's list of devices and the reference count +of the driver is decremented. All symlinks between the two are removed. + +When a driver is removed, the list of devices that it supports is +iterated over, and the driver's remove callback is called for each +one. The device is removed from that list and the symlinks removed. + diff --git a/Documentation/driver-model/bus.txt b/Documentation/driver-model/bus.txt new file mode 100644 index 000000000000..dd62c7b80b3f --- /dev/null +++ b/Documentation/driver-model/bus.txt @@ -0,0 +1,160 @@ + +Bus Types + +Definition +~~~~~~~~~~ + +struct bus_type { + char * name; + + struct subsystem subsys; + struct kset drivers; + struct kset devices; + + struct bus_attribute * bus_attrs; + struct device_attribute * dev_attrs; + struct driver_attribute * drv_attrs; + + int (*match)(struct device * dev, struct device_driver * drv); + int (*hotplug) (struct device *dev, char **envp, + int num_envp, char *buffer, int buffer_size); + int (*suspend)(struct device * dev, u32 state); + int (*resume)(struct device * dev); +}; + +int bus_register(struct bus_type * bus); + + +Declaration +~~~~~~~~~~~ + +Each bus type in the kernel (PCI, USB, etc) should declare one static +object of this type. They must initialize the name field, and may +optionally initialize the match callback. + +struct bus_type pci_bus_type = { + .name = "pci", + .match = pci_bus_match, +}; + +The structure should be exported to drivers in a header file: + +extern struct bus_type pci_bus_type; + + +Registration +~~~~~~~~~~~~ + +When a bus driver is initialized, it calls bus_register. This +initializes the rest of the fields in the bus object and inserts it +into a global list of bus types. Once the bus object is registered, +the fields in it are usable by the bus driver. + + +Callbacks +~~~~~~~~~ + +match(): Attaching Drivers to Devices +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The format of device ID structures and the semantics for comparing +them are inherently bus-specific. Drivers typically declare an array +of device IDs of devices they support that reside in a bus-specific +driver structure. + +The purpose of the match callback is provide the bus an opportunity to +determine if a particular driver supports a particular device by +comparing the device IDs the driver supports with the device ID of a +particular device, without sacrificing bus-specific functionality or +type-safety. + +When a driver is registered with the bus, the bus's list of devices is +iterated over, and the match callback is called for each device that +does not have a driver associated with it. + + + +Device and Driver Lists +~~~~~~~~~~~~~~~~~~~~~~~ + +The lists of devices and drivers are intended to replace the local +lists that many buses keep. They are lists of struct devices and +struct device_drivers, respectively. Bus drivers are free to use the +lists as they please, but conversion to the bus-specific type may be +necessary. + +The LDM core provides helper functions for iterating over each list. + +int bus_for_each_dev(struct bus_type * bus, struct device * start, void * data, + int (*fn)(struct device *, void *)); + +int bus_for_each_drv(struct bus_type * bus, struct device_driver * start, + void * data, int (*fn)(struct device_driver *, void *)); + +These helpers iterate over the respective list, and call the callback +for each device or driver in the list. All list accesses are +synchronized by taking the bus's lock (read currently). The reference +count on each object in the list is incremented before the callback is +called; it is decremented after the next object has been obtained. The +lock is not held when calling the callback. + + +sysfs +~~~~~~~~ +There is a top-level directory named 'bus'. + +Each bus gets a directory in the bus directory, along with two default +directories: + + /sys/bus/pci/ + |-- devices + `-- drivers + +Drivers registered with the bus get a directory in the bus's drivers +directory: + + /sys/bus/pci/ + |-- devices + `-- drivers + |-- Intel ICH + |-- Intel ICH Joystick + |-- agpgart + `-- e100 + +Each device that is discovered on a bus of that type gets a symlink in +the bus's devices directory to the device's directory in the physical +hierarchy: + + /sys/bus/pci/ + |-- devices + | |-- 00:00.0 -> ../../../root/pci0/00:00.0 + | |-- 00:01.0 -> ../../../root/pci0/00:01.0 + | `-- 00:02.0 -> ../../../root/pci0/00:02.0 + `-- drivers + + +Exporting Attributes +~~~~~~~~~~~~~~~~~~~~ +struct bus_attribute { + struct attribute attr; + ssize_t (*show)(struct bus_type *, char * buf); + ssize_t (*store)(struct bus_type *, const char * buf, size_t count); +}; + +Bus drivers can export attributes using the BUS_ATTR macro that works +similarly to the DEVICE_ATTR macro for devices. For example, a definition +like this: + +static BUS_ATTR(debug,0644,show_debug,store_debug); + +is equivalent to declaring: + +static bus_attribute bus_attr_debug; + +This can then be used to add and remove the attribute from the bus's +sysfs directory using: + +int bus_create_file(struct bus_type *, struct bus_attribute *); +void bus_remove_file(struct bus_type *, struct bus_attribute *); + + diff --git a/Documentation/driver-model/class.txt b/Documentation/driver-model/class.txt new file mode 100644 index 000000000000..2d1d893a5e5d --- /dev/null +++ b/Documentation/driver-model/class.txt @@ -0,0 +1,162 @@ + +Device Classes + + +Introduction +~~~~~~~~~~~~ +A device class describes a type of device, like an audio or network +device. The following device classes have been identified: + +<Insert List of Device Classes Here> + + +Each device class defines a set of semantics and a programming interface +that devices of that class adhere to. Device drivers are the +implemention of that programming interface for a particular device on +a particular bus. + +Device classes are agnostic with respect to what bus a device resides +on. + + +Programming Interface +~~~~~~~~~~~~~~~~~~~~~ +The device class structure looks like: + + +typedef int (*devclass_add)(struct device *); +typedef void (*devclass_remove)(struct device *); + +struct device_class { + char * name; + rwlock_t lock; + u32 devnum; + struct list_head node; + + struct list_head drivers; + struct list_head intf_list; + + struct driver_dir_entry dir; + struct driver_dir_entry device_dir; + struct driver_dir_entry driver_dir; + + devclass_add add_device; + devclass_remove remove_device; +}; + +A typical device class definition would look like: + +struct device_class input_devclass = { + .name = "input", + .add_device = input_add_device, + .remove_device = input_remove_device, +}; + +Each device class structure should be exported in a header file so it +can be used by drivers, extensions and interfaces. + +Device classes are registered and unregistered with the core using: + +int devclass_register(struct device_class * cls); +void devclass_unregister(struct device_class * cls); + + +Devices +~~~~~~~ +As devices are bound to drivers, they are added to the device class +that the driver belongs to. Before the driver model core, this would +typically happen during the driver's probe() callback, once the device +has been initialized. It now happens after the probe() callback +finishes from the core. + +The device is enumerated in the class. Each time a device is added to +the class, the class's devnum field is incremented and assigned to the +device. The field is never decremented, so if the device is removed +from the class and re-added, it will receive a different enumerated +value. + +The class is allowed to create a class-specific structure for the +device and store it in the device's class_data pointer. + +There is no list of devices in the device class. Each driver has a +list of devices that it supports. The device class has a list of +drivers of that particular class. To access all of the devices in the +class, iterate over the device lists of each driver in the class. + + +Device Drivers +~~~~~~~~~~~~~~ +Device drivers are added to device classes when they are registered +with the core. A driver specifies the class it belongs to by setting +the struct device_driver::devclass field. + + +sysfs directory structure +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +There is a top-level sysfs directory named 'class'. + +Each class gets a directory in the class directory, along with two +default subdirectories: + + class/ + `-- input + |-- devices + `-- drivers + + +Drivers registered with the class get a symlink in the drivers/ directory +that points to the driver's directory (under its bus directory): + + class/ + `-- input + |-- devices + `-- drivers + `-- usb:usb_mouse -> ../../../bus/drivers/usb_mouse/ + + +Each device gets a symlink in the devices/ directory that points to the +device's directory in the physical hierarchy: + + class/ + `-- input + |-- devices + | `-- 1 -> ../../../root/pci0/00:1f.0/usb_bus/00:1f.2-1:0/ + `-- drivers + + +Exporting Attributes +~~~~~~~~~~~~~~~~~~~~ +struct devclass_attribute { + struct attribute attr; + ssize_t (*show)(struct device_class *, char * buf, size_t count, loff_t off); + ssize_t (*store)(struct device_class *, const char * buf, size_t count, loff_t off); +}; + +Class drivers can export attributes using the DEVCLASS_ATTR macro that works +similarly to the DEVICE_ATTR macro for devices. For example, a definition +like this: + +static DEVCLASS_ATTR(debug,0644,show_debug,store_debug); + +is equivalent to declaring: + +static devclass_attribute devclass_attr_debug; + +The bus driver can add and remove the attribute from the class's +sysfs directory using: + +int devclass_create_file(struct device_class *, struct devclass_attribute *); +void devclass_remove_file(struct device_class *, struct devclass_attribute *); + +In the example above, the file will be named 'debug' in placed in the +class's directory in sysfs. + + +Interfaces +~~~~~~~~~~ +There may exist multiple mechanisms for accessing the same device of a +particular class type. Device interfaces describe these mechanisms. + +When a device is added to a device class, the core attempts to add it +to every interface that is registered with the device class. + diff --git a/Documentation/driver-model/device.txt b/Documentation/driver-model/device.txt new file mode 100644 index 000000000000..58cc5dc8fd3e --- /dev/null +++ b/Documentation/driver-model/device.txt @@ -0,0 +1,154 @@ + +The Basic Device Structure +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +struct device { + struct list_head g_list; + struct list_head node; + struct list_head bus_list; + struct list_head driver_list; + struct list_head intf_list; + struct list_head children; + struct device * parent; + + char name[DEVICE_NAME_SIZE]; + char bus_id[BUS_ID_SIZE]; + + spinlock_t lock; + atomic_t refcount; + + struct bus_type * bus; + struct driver_dir_entry dir; + + u32 class_num; + + struct device_driver *driver; + void *driver_data; + void *platform_data; + + u32 current_state; + unsigned char *saved_state; + + void (*release)(struct device * dev); +}; + +Fields +~~~~~~ +g_list: Node in the global device list. + +node: Node in device's parent's children list. + +bus_list: Node in device's bus's devices list. + +driver_list: Node in device's driver's devices list. + +intf_list: List of intf_data. There is one structure allocated for + each interface that the device supports. + +children: List of child devices. + +parent: *** FIXME *** + +name: ASCII description of device. + Example: " 3Com Corporation 3c905 100BaseTX [Boomerang]" + +bus_id: ASCII representation of device's bus position. This + field should be a name unique across all devices on the + bus type the device belongs to. + + Example: PCI bus_ids are in the form of + <bus number>:<slot number>.<function number> + This name is unique across all PCI devices in the system. + +lock: Spinlock for the device. + +refcount: Reference count on the device. + +bus: Pointer to struct bus_type that device belongs to. + +dir: Device's sysfs directory. + +class_num: Class-enumerated value of the device. + +driver: Pointer to struct device_driver that controls the device. + +driver_data: Driver-specific data. + +platform_data: Platform data specific to the device. + +current_state: Current power state of the device. + +saved_state: Pointer to saved state of the device. This is usable by + the device driver controlling the device. + +release: Callback to free the device after all references have + gone away. This should be set by the allocator of the + device (i.e. the bus driver that discovered the device). + + +Programming Interface +~~~~~~~~~~~~~~~~~~~~~ +The bus driver that discovers the device uses this to register the +device with the core: + +int device_register(struct device * dev); + +The bus should initialize the following fields: + + - parent + - name + - bus_id + - bus + +A device is removed from the core when its reference count goes to +0. The reference count can be adjusted using: + +struct device * get_device(struct device * dev); +void put_device(struct device * dev); + +get_device() will return a pointer to the struct device passed to it +if the reference is not already 0 (if it's in the process of being +removed already). + +A driver can access the lock in the device structure using: + +void lock_device(struct device * dev); +void unlock_device(struct device * dev); + + +Attributes +~~~~~~~~~~ +struct device_attribute { + struct attribute attr; + ssize_t (*show)(struct device * dev, char * buf, size_t count, loff_t off); + ssize_t (*store)(struct device * dev, const char * buf, size_t count, loff_t off); +}; + +Attributes of devices can be exported via drivers using a simple +procfs-like interface. + +Please see Documentation/filesystems/sysfs.txt for more information +on how sysfs works. + +Attributes are declared using a macro called DEVICE_ATTR: + +#define DEVICE_ATTR(name,mode,show,store) + +Example: + +DEVICE_ATTR(power,0644,show_power,store_power); + +This declares a structure of type struct device_attribute named +'dev_attr_power'. This can then be added and removed to the device's +directory using: + +int device_create_file(struct device *device, struct device_attribute * entry); +void device_remove_file(struct device * dev, struct device_attribute * attr); + +Example: + +device_create_file(dev,&dev_attr_power); +device_remove_file(dev,&dev_attr_power); + +The file name will be 'power' with a mode of 0644 (-rw-r--r--). + diff --git a/Documentation/driver-model/driver.txt b/Documentation/driver-model/driver.txt new file mode 100644 index 000000000000..12447787d329 --- /dev/null +++ b/Documentation/driver-model/driver.txt @@ -0,0 +1,287 @@ + +Device Drivers + +struct device_driver { + char * name; + struct bus_type * bus; + + rwlock_t lock; + atomic_t refcount; + + list_t bus_list; + list_t devices; + + struct driver_dir_entry dir; + + int (*probe) (struct device * dev); + int (*remove) (struct device * dev); + + int (*suspend) (struct device * dev, u32 state, u32 level); + int (*resume) (struct device * dev, u32 level); + + void (*release) (struct device_driver * drv); +}; + + + +Allocation +~~~~~~~~~~ + +Device drivers are statically allocated structures. Though there may +be multiple devices in a system that a driver supports, struct +device_driver represents the driver as a whole (not a particular +device instance). + +Initialization +~~~~~~~~~~~~~~ + +The driver must initialize at least the name and bus fields. It should +also initialize the devclass field (when it arrives), so it may obtain +the proper linkage internally. It should also initialize as many of +the callbacks as possible, though each is optional. + +Declaration +~~~~~~~~~~~ + +As stated above, struct device_driver objects are statically +allocated. Below is an example declaration of the eepro100 +driver. This declaration is hypothetical only; it relies on the driver +being converted completely to the new model. + +static struct device_driver eepro100_driver = { + .name = "eepro100", + .bus = &pci_bus_type, + .devclass = ðernet_devclass, /* when it's implemented */ + + .probe = eepro100_probe, + .remove = eepro100_remove, + .suspend = eepro100_suspend, + .resume = eepro100_resume, +}; + +Most drivers will not be able to be converted completely to the new +model because the bus they belong to has a bus-specific structure with +bus-specific fields that cannot be generalized. + +The most common example of this are device ID structures. A driver +typically defines an array of device IDs that it supports. The format +of these structures and the semantics for comparing device IDs are +completely bus-specific. Defining them as bus-specific entities would +sacrifice type-safety, so we keep bus-specific structures around. + +Bus-specific drivers should include a generic struct device_driver in +the definition of the bus-specific driver. Like this: + +struct pci_driver { + const struct pci_device_id *id_table; + struct device_driver driver; +}; + +A definition that included bus-specific fields would look like +(using the eepro100 driver again): + +static struct pci_driver eepro100_driver = { + .id_table = eepro100_pci_tbl, + .driver = { + .name = "eepro100", + .bus = &pci_bus_type, + .devclass = ðernet_devclass, /* when it's implemented */ + .probe = eepro100_probe, + .remove = eepro100_remove, + .suspend = eepro100_suspend, + .resume = eepro100_resume, + }, +}; + +Some may find the syntax of embedded struct initialization awkward or +even a bit ugly. So far, it's the best way we've found to do what we want... + +Registration +~~~~~~~~~~~~ + +int driver_register(struct device_driver * drv); + +The driver registers the structure on startup. For drivers that have +no bus-specific fields (i.e. don't have a bus-specific driver +structure), they would use driver_register and pass a pointer to their +struct device_driver object. + +Most drivers, however, will have a bus-specific structure and will +need to register with the bus using something like pci_driver_register. + +It is important that drivers register their driver structure as early as +possible. Registration with the core initializes several fields in the +struct device_driver object, including the reference count and the +lock. These fields are assumed to be valid at all times and may be +used by the device model core or the bus driver. + + +Transition Bus Drivers +~~~~~~~~~~~~~~~~~~~~~~ + +By defining wrapper functions, the transition to the new model can be +made easier. Drivers can ignore the generic structure altogether and +let the bus wrapper fill in the fields. For the callbacks, the bus can +define generic callbacks that forward the call to the bus-specific +callbacks of the drivers. + +This solution is intended to be only temporary. In order to get class +information in the driver, the drivers must be modified anyway. Since +converting drivers to the new model should reduce some infrastructural +complexity and code size, it is recommended that they are converted as +class information is added. + +Access +~~~~~~ + +Once the object has been registered, it may access the common fields of +the object, like the lock and the list of devices. + +int driver_for_each_dev(struct device_driver * drv, void * data, + int (*callback)(struct device * dev, void * data)); + +The devices field is a list of all the devices that have been bound to +the driver. The LDM core provides a helper function to operate on all +the devices a driver controls. This helper locks the driver on each +node access, and does proper reference counting on each device as it +accesses it. + + +sysfs +~~~~~ + +When a driver is registered, a sysfs directory is created in its +bus's directory. In this directory, the driver can export an interface +to userspace to control operation of the driver on a global basis; +e.g. toggling debugging output in the driver. + +A future feature of this directory will be a 'devices' directory. This +directory will contain symlinks to the directories of devices it +supports. + + + +Callbacks +~~~~~~~~~ + + int (*probe) (struct device * dev); + +probe is called to verify the existence of a certain type of +hardware. This is called during the driver binding process, after the +bus has verified that the device ID of a device matches one of the +device IDs supported by the driver. + +This callback only verifies that there actually is supported hardware +present. It may allocate a driver-specific structure, but it should +not do any initialization of the hardware itself. The device-specific +structure may be stored in the device's driver_data field. + + int (*init) (struct device * dev); + +init is called during the binding stage. It is called after probe has +successfully returned and the device has been registered with its +class. It is responsible for initializing the hardware. + + int (*remove) (struct device * dev); + +remove is called to dissociate a driver with a device. This may be +called if a device is physically removed from the system, if the +driver module is being unloaded, or during a reboot sequence. + +It is up to the driver to determine if the device is present or +not. It should free any resources allocated specifically for the +device; i.e. anything in the device's driver_data field. + +If the device is still present, it should quiesce the device and place +it into a supported low-power state. + + int (*suspend) (struct device * dev, u32 state, u32 level); + +suspend is called to put the device in a low power state. There are +several stages to successfully suspending a device, which is denoted in +the @level parameter. Breaking the suspend transition into several +stages affords the platform flexibility in performing device power +management based on the requirements of the system and the +user-defined policy. + +SUSPEND_NOTIFY notifies the device that a suspend transition is about +to happen. This happens on system power state transitions to verify +that all devices can successfully suspend. + +A driver may choose to fail on this call, which should cause the +entire suspend transition to fail. A driver should fail only if it +knows that the device will not be able to be resumed properly when the +system wakes up again. It could also fail if it somehow determines it +is in the middle of an operation too important to stop. + +SUSPEND_DISABLE tells the device to stop I/O transactions. When it +stops transactions, or what it should do with unfinished transactions +is a policy of the driver. After this call, the driver should not +accept any other I/O requests. + +SUSPEND_SAVE_STATE tells the device to save the context of the +hardware. This includes any bus-specific hardware state and +device-specific hardware state. A pointer to this saved state can be +stored in the device's saved_state field. + +SUSPEND_POWER_DOWN tells the driver to place the device in the low +power state requested. + +Whether suspend is called with a given level is a policy of the +platform. Some levels may be omitted; drivers must not assume the +reception of any level. However, all levels must be called in the +order above; i.e. notification will always come before disabling; +disabling the device will come before suspending the device. + +All calls are made with interrupts enabled, except for the +SUSPEND_POWER_DOWN level. + + int (*resume) (struct device * dev, u32 level); + +Resume is used to bring a device back from a low power state. Like the +suspend transition, it happens in several stages. + +RESUME_POWER_ON tells the driver to set the power state to the state +before the suspend call (The device could have already been in a low +power state before the suspend call to put in a lower power state). + +RESUME_RESTORE_STATE tells the driver to restore the state saved by +the SUSPEND_SAVE_STATE suspend call. + +RESUME_ENABLE tells the driver to start accepting I/O transactions +again. Depending on driver policy, the device may already have pending +I/O requests. + +RESUME_POWER_ON is called with interrupts disabled. The other resume +levels are called with interrupts enabled. + +As with the various suspend stages, the driver must not assume that +any other resume calls have been or will be made. Each call should be +self-contained and not dependent on any external state. + + +Attributes +~~~~~~~~~~ +struct driver_attribute { + struct attribute attr; + ssize_t (*show)(struct device_driver *, char * buf, size_t count, loff_t off); + ssize_t (*store)(struct device_driver *, const char * buf, size_t count, loff_t off); +}; + +Device drivers can export attributes via their sysfs directories. +Drivers can declare attributes using a DRIVER_ATTR macro that works +identically to the DEVICE_ATTR macro. + +Example: + +DRIVER_ATTR(debug,0644,show_debug,store_debug); + +This is equivalent to declaring: + +struct driver_attribute driver_attr_debug; + +This can then be used to add and remove the attribute from the +driver's directory using: + +int driver_create_file(struct device_driver *, struct driver_attribute *); +void driver_remove_file(struct device_driver *, struct driver_attribute *); diff --git a/Documentation/driver-model/interface.txt b/Documentation/driver-model/interface.txt new file mode 100644 index 000000000000..c66912bfe866 --- /dev/null +++ b/Documentation/driver-model/interface.txt @@ -0,0 +1,129 @@ + +Device Interfaces + +Introduction +~~~~~~~~~~~~ + +Device interfaces are the logical interfaces of device classes that correlate +directly to userspace interfaces, like device nodes. + +Each device class may have multiple interfaces through which you can +access the same device. An input device may support the mouse interface, +the 'evdev' interface, and the touchscreen interface. A SCSI disk would +support the disk interface, the SCSI generic interface, and possibly a raw +device interface. + +Device interfaces are registered with the class they belong to. As devices +are added to the class, they are added to each interface registered with +the class. The interface is responsible for determining whether the device +supports the interface or not. + + +Programming Interface +~~~~~~~~~~~~~~~~~~~~~ + +struct device_interface { + char * name; + rwlock_t lock; + u32 devnum; + struct device_class * devclass; + + struct list_head node; + struct driver_dir_entry dir; + + int (*add_device)(struct device *); + int (*add_device)(struct intf_data *); +}; + +int interface_register(struct device_interface *); +void interface_unregister(struct device_interface *); + + +An interface must specify the device class it belongs to. It is added +to that class's list of interfaces on registration. + + +Interfaces can be added to a device class at any time. Whenever it is +added, each device in the class is passed to the interface's +add_device callback. When an interface is removed, each device is +removed from the interface. + + +Devices +~~~~~~~ +Once a device is added to a device class, it is added to each +interface that is registered with the device class. The class +is expected to place a class-specific data structure in +struct device::class_data. The interface can use that (along with +other fields of struct device) to determine whether or not the driver +and/or device support that particular interface. + + +Data +~~~~ + +struct intf_data { + struct list_head node; + struct device_interface * intf; + struct device * dev; + u32 intf_num; +}; + +int interface_add_data(struct interface_data *); + +The interface is responsible for allocating and initializing a struct +intf_data and calling interface_add_data() to add it to the device's list +of interfaces it belongs to. This list will be iterated over when the device +is removed from the class (instead of all possible interfaces for a class). +This structure should probably be embedded in whatever per-device data +structure the interface is allocating anyway. + +Devices are enumerated within the interface. This happens in interface_add_data() +and the enumerated value is stored in the struct intf_data for that device. + +sysfs +~~~~~ +Each interface is given a directory in the directory of the device +class it belongs to: + +Interfaces get a directory in the class's directory as well: + + class/ + `-- input + |-- devices + |-- drivers + |-- mouse + `-- evdev + +When a device is added to the interface, a symlink is created that points +to the device's directory in the physical hierarchy: + + class/ + `-- input + |-- devices + | `-- 1 -> ../../../root/pci0/00:1f.0/usb_bus/00:1f.2-1:0/ + |-- drivers + | `-- usb:usb_mouse -> ../../../bus/drivers/usb_mouse/ + |-- mouse + | `-- 1 -> ../../../root/pci0/00:1f.0/usb_bus/00:1f.2-1:0/ + `-- evdev + `-- 1 -> ../../../root/pci0/00:1f.0/usb_bus/00:1f.2-1:0/ + + +Future Plans +~~~~~~~~~~~~ +A device interface is correlated directly with a userspace interface +for a device, specifically a device node. For instance, a SCSI disk +exposes at least two interfaces to userspace: the standard SCSI disk +interface and the SCSI generic interface. It might also export a raw +device interface. + +Many interfaces have a major number associated with them and each +device gets a minor number. Or, multiple interfaces might share one +major number, and each will receive a range of minor numbers (like in +the case of input devices). + +These major and minor numbers could be stored in the interface +structure. Major and minor allocations could happen when the interface +is registered with the class, or via a helper function. + diff --git a/Documentation/driver-model/overview.txt b/Documentation/driver-model/overview.txt new file mode 100644 index 000000000000..44662735cf81 --- /dev/null +++ b/Documentation/driver-model/overview.txt @@ -0,0 +1,114 @@ +The Linux Kernel Device Model + +Patrick Mochel <mochel@osdl.org> + +26 August 2002 + + +Overview +~~~~~~~~ + +This driver model is a unification of all the current, disparate driver models +that are currently in the kernel. It is intended to augment the +bus-specific drivers for bridges and devices by consolidating a set of data +and operations into globally accessible data structures. + +Current driver models implement some sort of tree-like structure (sometimes +just a list) for the devices they control. But, there is no linkage between +the different bus types. + +A common data structure can provide this linkage with little overhead: when a +bus driver discovers a particular device, it can insert it into the global +tree as well as its local tree. In fact, the local tree becomes just a subset +of the global tree. + +Common data fields can also be moved out of the local bus models into the +global model. Some of the manipulations of these fields can also be +consolidated. Most likely, manipulation functions will become a set +of helper functions, which the bus drivers wrap around to include any +bus-specific items. + +The common device and bridge interface currently reflects the goals of the +modern PC: namely the ability to do seamless Plug and Play, power management, +and hot plug. (The model dictated by Intel and Microsoft (read: ACPI) ensures +us that any device in the system may fit any of these criteria.) + +In reality, not every bus will be able to support such operations. But, most +buses will support a majority of those operations, and all future buses will. +In other words, a bus that doesn't support an operation is the exception, +instead of the other way around. + + + +Downstream Access +~~~~~~~~~~~~~~~~~ + +Common data fields have been moved out of individual bus layers into a common +data structure. But, these fields must still be accessed by the bus layers, +and sometimes by the device-specific drivers. + +Other bus layers are encouraged to do what has been done for the PCI layer. +struct pci_dev now looks like this: + +struct pci_dev { + ... + + struct device device; +}; + +Note first that it is statically allocated. This means only one allocation on +device discovery. Note also that it is at the _end_ of struct pci_dev. This is +to make people think about what they're doing when switching between the bus +driver and the global driver; and to prevent against mindless casts between +the two. + +The PCI bus layer freely accesses the fields of struct device. It knows about +the structure of struct pci_dev, and it should know the structure of struct +device. PCI devices that have been converted generally do not touch the fields +of struct device. More precisely, device-specific drivers should not touch +fields of struct device unless there is a strong compelling reason to do so. + +This abstraction is prevention of unnecessary pain during transitional phases. +If the name of the field changes or is removed, then every downstream driver +will break. On the other hand, if only the bus layer (and not the device +layer) accesses struct device, it is only that layer that needs to change. + + +User Interface +~~~~~~~~~~~~~~ + +By virtue of having a complete hierarchical view of all the devices in the +system, exporting a complete hierarchical view to userspace becomes relatively +easy. This has been accomplished by implementing a special purpose virtual +file system named sysfs. It is hence possible for the user to mount the +whole sysfs filesystem anywhere in userspace. + +This can be done permanently by providing the following entry into the +/etc/fstab (under the provision that the mount point does exist, of course): + +none /sys sysfs defaults 0 0 + +Or by hand on the command line: + +# mount -t sysfs sysfs /sys + +Whenever a device is inserted into the tree, a directory is created for it. +This directory may be populated at each layer of discovery - the global layer, +the bus layer, or the device layer. + +The global layer currently creates two files - 'name' and 'power'. The +former only reports the name of the device. The latter reports the +current power state of the device. It will also be used to set the current +power state. + +The bus layer may also create files for the devices it finds while probing the +bus. For example, the PCI layer currently creates 'irq' and 'resource' files +for each PCI device. + +A device-specific driver may also export files in its directory to expose +device-specific data or tunable interfaces. + +More information about the sysfs directory layout can be found in +the other documents in this directory and in the file +Documentation/filesystems/sysfs.txt. + diff --git a/Documentation/driver-model/platform.txt b/Documentation/driver-model/platform.txt new file mode 100644 index 000000000000..5eee3e0bfc4c --- /dev/null +++ b/Documentation/driver-model/platform.txt @@ -0,0 +1,99 @@ +Platform Devices and Drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Platform devices +~~~~~~~~~~~~~~~~ +Platform devices are devices that typically appear as autonomous +entities in the system. This includes legacy port-based devices and +host bridges to peripheral buses. + + +Platform drivers +~~~~~~~~~~~~~~~~ +Drivers for platform devices are typically very simple and +unstructured. Either the device was present at a particular I/O port +and the driver was loaded, or it was not. There was no possibility +of hotplugging or alternative discovery besides probing at a specific +I/O address and expecting a specific response. + + +Other Architectures, Modern Firmware, and new Platforms +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +These devices are not always at the legacy I/O ports. This is true on +other architectures and on some modern architectures. In most cases, +the drivers are modified to discover the devices at other well-known +ports for the given platform. However, the firmware in these systems +does usually know where exactly these devices reside, and in some +cases, it's the only way of discovering them. + + +The Platform Bus +~~~~~~~~~~~~~~~~ +A platform bus has been created to deal with these issues. First and +foremost, it groups all the legacy devices under a common bus, and +gives them a common parent if they don't already have one. + +But, besides the organizational benefits, the platform bus can also +accommodate firmware-based enumeration. + + +Device Discovery +~~~~~~~~~~~~~~~~ +The platform bus has no concept of probing for devices. Devices +discovery is left up to either the legacy drivers or the +firmware. These entities are expected to notify the platform of +devices that it discovers via the bus's add() callback: + + platform_bus.add(parent,bus_id). + + +Bus IDs +~~~~~~~ +Bus IDs are the canonical names for the devices. There is no globally +standard addressing mechanism for legacy devices. In the IA-32 world, +we have Pnp IDs to use, as well as the legacy I/O ports. However, +neither tell what the device really is or have any meaning on other +platforms. + +Since both PnP IDs and the legacy I/O ports (and other standard I/O +ports for specific devices) have a 1:1 mapping, we map the +platform-specific name or identifier to a generic name (at least +within the scope of the kernel). + +For example, a serial driver might find a device at I/O 0x3f8. The +ACPI firmware might also discover a device with PnP ID (_HID) +PNP0501. Both correspond to the same device and should be mapped to the +canonical name 'serial'. + +The bus_id field should be a concatenation of the canonical name and +the instance of that type of device. For example, the device at I/O +port 0x3f8 should have a bus_id of "serial0". This places the +responsibility of enumerating devices of a particular type up to the +discovery mechanism. But, they are the entity that should know best +(as opposed to the platform bus driver). + + +Drivers +~~~~~~~ +Drivers for platform devices should have a name that is the same as +the canonical name of the devices they support. This allows the +platform bus driver to do simple matching with the basic data +structures to determine if a driver supports a certain device. + +For example, a legacy serial driver should have a name of 'serial' and +register itself with the platform bus. + + +Driver Binding +~~~~~~~~~~~~~~ +Legacy drivers assume they are bound to the device once they start up +and probe an I/O port. Divorcing them from this will be a difficult +process. However, that shouldn't prevent us from implementing +firmware-based enumeration. + +The firmware should notify the platform bus about devices before the +legacy drivers have had a chance to load. Once the drivers are loaded, +they driver model core will attempt to bind the driver to any +previously-discovered devices. Once that has happened, it will be free +to discover any other devices it pleases. + diff --git a/Documentation/driver-model/porting.txt b/Documentation/driver-model/porting.txt new file mode 100644 index 000000000000..ff2fef2107f0 --- /dev/null +++ b/Documentation/driver-model/porting.txt @@ -0,0 +1,445 @@ + +Porting Drivers to the New Driver Model + +Patrick Mochel + +7 January 2003 + + +Overview + +Please refer to Documentation/driver-model/*.txt for definitions of +various driver types and concepts. + +Most of the work of porting devices drivers to the new model happens +at the bus driver layer. This was intentional, to minimize the +negative effect on kernel drivers, and to allow a gradual transition +of bus drivers. + +In a nutshell, the driver model consists of a set of objects that can +be embedded in larger, bus-specific objects. Fields in these generic +objects can replace fields in the bus-specific objects. + +The generic objects must be registered with the driver model core. By +doing so, they will exported via the sysfs filesystem. sysfs can be +mounted by doing + + # mount -t sysfs sysfs /sys + + + +The Process + +Step 0: Read include/linux/device.h for object and function definitions. + +Step 1: Registering the bus driver. + + +- Define a struct bus_type for the bus driver. + +struct bus_type pci_bus_type = { + .name = "pci", +}; + + +- Register the bus type. + This should be done in the initialization function for the bus type, + which is usually the module_init(), or equivalent, function. + +static int __init pci_driver_init(void) +{ + return bus_register(&pci_bus_type); +} + +subsys_initcall(pci_driver_init); + + + The bus type may be unregistered (if the bus driver may be compiled + as a module) by doing: + + bus_unregister(&pci_bus_type); + + +- Export the bus type for others to use. + + Other code may wish to reference the bus type, so declare it in a + shared header file and export the symbol. + +From include/linux/pci.h: + +extern struct bus_type pci_bus_type; + + +From file the above code appears in: + +EXPORT_SYMBOL(pci_bus_type); + + + +- This will cause the bus to show up in /sys/bus/pci/ with two + subdirectories: 'devices' and 'drivers'. + +# tree -d /sys/bus/pci/ +/sys/bus/pci/ +|-- devices +`-- drivers + + + +Step 2: Registering Devices. + +struct device represents a single device. It mainly contains metadata +describing the relationship the device has to other entities. + + +- Embedd a struct device in the bus-specific device type. + + +struct pci_dev { + ... + struct device dev; /* Generic device interface */ + ... +}; + + It is recommended that the generic device not be the first item in + the struct to discourage programmers from doing mindless casts + between the object types. Instead macros, or inline functions, + should be created to convert from the generic object type. + + +#define to_pci_dev(n) container_of(n, struct pci_dev, dev) + +or + +static inline struct pci_dev * to_pci_dev(struct kobject * kobj) +{ + return container_of(n, struct pci_dev, dev); +} + + This allows the compiler to verify type-safety of the operations + that are performed (which is Good). + + +- Initialize the device on registration. + + When devices are discovered or registered with the bus type, the + bus driver should initialize the generic device. The most important + things to initialize are the bus_id, parent, and bus fields. + + The bus_id is an ASCII string that contains the device's address on + the bus. The format of this string is bus-specific. This is + necessary for representing devices in sysfs. + + parent is the physical parent of the device. It is important that + the bus driver sets this field correctly. + + The driver model maintains an ordered list of devices that it uses + for power management. This list must be in order to guarantee that + devices are shutdown before their physical parents, and vice versa. + The order of this list is determined by the parent of registered + devices. + + Also, the location of the device's sysfs directory depends on a + device's parent. sysfs exports a directory structure that mirrors + the device hierarchy. Accurately setting the parent guarantees that + sysfs will accurately represent the hierarchy. + + The device's bus field is a pointer to the bus type the device + belongs to. This should be set to the bus_type that was declared + and initialized before. + + Optionally, the bus driver may set the device's name and release + fields. + + The name field is an ASCII string describing the device, like + + "ATI Technologies Inc Radeon QD" + + The release field is a callback that the driver model core calls + when the device has been removed, and all references to it have + been released. More on this in a moment. + + +- Register the device. + + Once the generic device has been initialized, it can be registered + with the driver model core by doing: + + device_register(&dev->dev); + + It can later be unregistered by doing: + + device_unregister(&dev->dev); + + This should happen on buses that support hotpluggable devices. + If a bus driver unregisters a device, it should not immediately free + it. It should instead wait for the driver model core to call the + device's release method, then free the bus-specific object. + (There may be other code that is currently referencing the device + structure, and it would be rude to free the device while that is + happening). + + + When the device is registered, a directory in sysfs is created. + The PCI tree in sysfs looks like: + +/sys/devices/pci0/ +|-- 00:00.0 +|-- 00:01.0 +| `-- 01:00.0 +|-- 00:02.0 +| `-- 02:1f.0 +| `-- 03:00.0 +|-- 00:1e.0 +| `-- 04:04.0 +|-- 00:1f.0 +|-- 00:1f.1 +| |-- ide0 +| | |-- 0.0 +| | `-- 0.1 +| `-- ide1 +| `-- 1.0 +|-- 00:1f.2 +|-- 00:1f.3 +`-- 00:1f.5 + + Also, symlinks are created in the bus's 'devices' directory + that point to the device's directory in the physical hierarchy. + +/sys/bus/pci/devices/ +|-- 00:00.0 -> ../../../devices/pci0/00:00.0 +|-- 00:01.0 -> ../../../devices/pci0/00:01.0 +|-- 00:02.0 -> ../../../devices/pci0/00:02.0 +|-- 00:1e.0 -> ../../../devices/pci0/00:1e.0 +|-- 00:1f.0 -> ../../../devices/pci0/00:1f.0 +|-- 00:1f.1 -> ../../../devices/pci0/00:1f.1 +|-- 00:1f.2 -> ../../../devices/pci0/00:1f.2 +|-- 00:1f.3 -> ../../../devices/pci0/00:1f.3 +|-- 00:1f.5 -> ../../../devices/pci0/00:1f.5 +|-- 01:00.0 -> ../../../devices/pci0/00:01.0/01:00.0 +|-- 02:1f.0 -> ../../../devices/pci0/00:02.0/02:1f.0 +|-- 03:00.0 -> ../../../devices/pci0/00:02.0/02:1f.0/03:00.0 +`-- 04:04.0 -> ../../../devices/pci0/00:1e.0/04:04.0 + + + +Step 3: Registering Drivers. + +struct device_driver is a simple driver structure that contains a set +of operations that the driver model core may call. + + +- Embed a struct device_driver in the bus-specific driver. + + Just like with devices, do something like: + +struct pci_driver { + ... + struct device_driver driver; +}; + + +- Initialize the generic driver structure. + + When the driver registers with the bus (e.g. doing pci_register_driver()), + initialize the necessary fields of the driver: the name and bus + fields. + + +- Register the driver. + + After the generic driver has been initialized, call + + driver_register(&drv->driver); + + to register the driver with the core. + + When the driver is unregistered from the bus, unregister it from the + core by doing: + + driver_unregister(&drv->driver); + + Note that this will block until all references to the driver have + gone away. Normally, there will not be any. + + +- Sysfs representation. + + Drivers are exported via sysfs in their bus's 'driver's directory. + For example: + +/sys/bus/pci/drivers/ +|-- 3c59x +|-- Ensoniq AudioPCI +|-- agpgart-amdk7 +|-- e100 +`-- serial + + +Step 4: Define Generic Methods for Drivers. + +struct device_driver defines a set of operations that the driver model +core calls. Most of these operations are probably similar to +operations the bus already defines for drivers, but taking different +parameters. + +It would be difficult and tedious to force every driver on a bus to +simultaneously convert their drivers to generic format. Instead, the +bus driver should define single instances of the generic methods that +forward call to the bus-specific drivers. For instance: + + +static int pci_device_remove(struct device * dev) +{ + struct pci_dev * pci_dev = to_pci_dev(dev); + struct pci_driver * drv = pci_dev->driver; + + if (drv) { + if (drv->remove) + drv->remove(pci_dev); + pci_dev->driver = NULL; + } + return 0; +} + + +The generic driver should be initialized with these methods before it +is registered. + + /* initialize common driver fields */ + drv->driver.name = drv->name; + drv->driver.bus = &pci_bus_type; + drv->driver.probe = pci_device_probe; + drv->driver.resume = pci_device_resume; + drv->driver.suspend = pci_device_suspend; + drv->driver.remove = pci_device_remove; + + /* register with core */ + driver_register(&drv->driver); + + +Ideally, the bus should only initialize the fields if they are not +already set. This allows the drivers to implement their own generic +methods. + + +Step 5: Support generic driver binding. + +The model assumes that a device or driver can be dynamically +registered with the bus at any time. When registration happens, +devices must be bound to a driver, or drivers must be bound to all +devices that it supports. + +A driver typically contains a list of device IDs that it supports. The +bus driver compares these IDs to the IDs of devices registered with it. +The format of the device IDs, and the semantics for comparing them are +bus-specific, so the generic model does attempt to generalize them. + +Instead, a bus may supply a method in struct bus_type that does the +comparison: + + int (*match)(struct device * dev, struct device_driver * drv); + +match should return '1' if the driver supports the device, and '0' +otherwise. + +When a device is registered, the bus's list of drivers is iterated +over. bus->match() is called for each one until a match is found. + +When a driver is registered, the bus's list of devices is iterated +over. bus->match() is called for each device that is not already +claimed by a driver. + +When a device is successfully bound to a device, device->driver is +set, the device is added to a per-driver list of devices, and a +symlink is created in the driver's sysfs directory that points to the +device's physical directory: + +/sys/bus/pci/drivers/ +|-- 3c59x +| `-- 00:0b.0 -> ../../../../devices/pci0/00:0b.0 +|-- Ensoniq AudioPCI +|-- agpgart-amdk7 +| `-- 00:00.0 -> ../../../../devices/pci0/00:00.0 +|-- e100 +| `-- 00:0c.0 -> ../../../../devices/pci0/00:0c.0 +`-- serial + + +This driver binding should replace the existing driver binding +mechanism the bus currently uses. + + +Step 6: Supply a hotplug callback. + +Whenever a device is registered with the driver model core, the +userspace program /sbin/hotplug is called to notify userspace. +Users can define actions to perform when a device is inserted or +removed. + +The driver model core passes several arguments to userspace via +environment variables, including + +- ACTION: set to 'add' or 'remove' +- DEVPATH: set to the device's physical path in sysfs. + +A bus driver may also supply additional parameters for userspace to +consume. To do this, a bus must implement the 'hotplug' method in +struct bus_type: + + int (*hotplug) (struct device *dev, char **envp, + int num_envp, char *buffer, int buffer_size); + +This is called immediately before /sbin/hotplug is executed. + + +Step 7: Cleaning up the bus driver. + +The generic bus, device, and driver structures provide several fields +that can replace those defined privately to the bus driver. + +- Device list. + +struct bus_type contains a list of all devices registered with the bus +type. This includes all devices on all instances of that bus type. +An internal list that the bus uses may be removed, in favor of using +this one. + +The core provides an iterator to access these devices. + +int bus_for_each_dev(struct bus_type * bus, struct device * start, + void * data, int (*fn)(struct device *, void *)); + + +- Driver list. + +struct bus_type also contains a list of all drivers registered with +it. An internal list of drivers that the bus driver maintains may +be removed in favor of using the generic one. + +The drivers may be iterated over, like devices: + +int bus_for_each_drv(struct bus_type * bus, struct device_driver * start, + void * data, int (*fn)(struct device_driver *, void *)); + + +Please see drivers/base/bus.c for more information. + + +- rwsem + +struct bus_type contains an rwsem that protects all core accesses to +the device and driver lists. This can be used by the bus driver +internally, and should be used when accessing the device or driver +lists the bus maintains. + + +- Device and driver fields. + +Some of the fields in struct device and struct device_driver duplicate +fields in the bus-specific representations of these objects. Feel free +to remove the bus-specific ones and favor the generic ones. Note +though, that this will likely mean fixing up all the drivers that +reference the bus-specific fields (though those should all be 1-line +changes). + diff --git a/Documentation/dvb/README.dibusb b/Documentation/dvb/README.dibusb new file mode 100644 index 000000000000..7a9e958513f3 --- /dev/null +++ b/Documentation/dvb/README.dibusb @@ -0,0 +1,285 @@ +Documentation for dib3000* frontend drivers and dibusb device driver +==================================================================== + +Copyright (C) 2004-5 Patrick Boettcher (patrick.boettcher@desy.de), + +dibusb and dib3000mb/mc drivers based on GPL code, which has + +Copyright (C) 2004 Amaury Demol for DiBcom (ademol@dibcom.fr) + +This program is free software; you can redistribute it and/or +modify it under the terms of the GNU General Public License as +published by the Free Software Foundation, version 2. + + +Supported devices USB1.1 +======================== + +Produced and reselled by Twinhan: +--------------------------------- +- TwinhanDTV USB-Ter DVB-T Device (VP7041) + http://www.twinhan.com/product_terrestrial_3.asp + +- TwinhanDTV Magic Box (VP7041e) + http://www.twinhan.com/product_terrestrial_4.asp + +- HAMA DVB-T USB device + http://www.hama.de/portal/articleId*110620/action*2598 + +- CTS Portable (Chinese Television System) (2) + http://www.2cts.tv/ctsportable/ + +- Unknown USB DVB-T device with vendor ID Hyper-Paltek + + +Produced and reselled by KWorld: +-------------------------------- +- KWorld V-Stream XPERT DTV DVB-T USB + http://www.kworld.com.tw/en/product/DVBT-USB/DVBT-USB.html + +- JetWay DTV DVB-T USB + http://www.jetway.com.tw/evisn/product/lcd-tv/DVT-USB/dtv-usb.htm + +- ADSTech Instant TV DVB-T USB + http://www.adstech.com/products/PTV-333/intro/PTV-333_intro.asp?pid=PTV-333 + + +Others: +------- +- Ultima Electronic/Artec T1 USB TVBOX (AN2135, AN2235, AN2235 with Panasonic Tuner) + http://82.161.246.249/products-tvbox.html + +- Compro Videomate DVB-U2000 - DVB-T USB (2) + http://www.comprousa.com/products/vmu2000.htm + +- Grandtec USB DVB-T + http://www.grand.com.tw/ + +- Avermedia AverTV DVBT USB (2) + http://www.avermedia.com/ + +- DiBcom USB DVB-T reference device (non-public) + + +Supported devices USB2.0 +======================== +- Twinhan MagicBox II (2) + http://www.twinhan.com/product_terrestrial_7.asp + +- Hanftek UMT-010 (1) + http://www.globalsources.com/si/6008819757082/ProductDetail/Digital-TV/product_id-100046529 + +- Typhoon/Yakumo/HAMA DVB-T mobile USB2.0 (1) + http://www.yakumo.de/produkte/index.php?pid=1&ag=DVB-T + +- Artec T1 USB TVBOX (FX2) (2) + +- Hauppauge WinTV NOVA-T USB2 + http://www.hauppauge.com/ + +- KWorld/ADSTech Instant DVB-T USB2.0 (DiB3000M-B) + +- DiBcom USB2.0 DVB-T reference device (non-public) + +1) It is working almost. +2) No test reports received yet. + + +0. NEWS: + 2005-02-11 - added support for the KWorld/ADSTech Instant DVB-T USB2.0. Thanks a lot to Joachim von Caron + 2005-02-02 - added support for the Hauppauge Win-TV Nova-T USB2 + 2005-01-31 - distorted streaming is finally gone for USB1.1 devices + 2005-01-13 - moved the mirrored pid_filter_table back to dvb-dibusb + - first almost working version for HanfTek UMT-010 + - found out, that Yakumo/HAMA/Typhoon are predessors of the HanfTek UMT-010 + 2005-01-10 - refactoring completed, now everything is very delightful + - tuner quirks for some weird devices (Artec T1 AN2235 device has sometimes a + Panasonic Tuner assembled). Tunerprobing implemented. Thanks a lot to Gunnar Wittich. + 2004-12-29 - after several days of struggling around bug of no returning URBs fixed. + 2004-12-26 - refactored the dibusb-driver, splitted into separate files + - i2c-probing enabled + 2004-12-06 - possibility for demod i2c-address probing + - new usb IDs (Compro,Artec) + 2004-11-23 - merged changes from DiB3000MC_ver2.1 + - revised the debugging + - possibility to deliver the complete TS for USB2.0 + 2004-11-21 - first working version of the dib3000mc/p frontend driver. + 2004-11-12 - added additional remote control keys. Thanks to Uwe Hanke. + 2004-11-07 - added remote control support. Thanks to David Matthews. + 2004-11-05 - added support for a new devices (Grandtec/Avermedia/Artec) + - merged my changes (for dib3000mb/dibusb) to the FE_REFACTORING, because it became HEAD + - moved transfer control (pid filter, fifo control) from usb driver to frontend, it seems + better settled there (added xfer_ops-struct) + - created a common files for frontends (mc/p/mb) + 2004-09-28 - added support for a new device (Unkown, vendor ID is Hyper-Paltek) + 2004-09-20 - added support for a new device (Compro DVB-U2000), thanks + to Amaury Demol for reporting + - changed usb TS transfer method (several urbs, stopping transfer + before setting a new pid) + 2004-09-13 - added support for a new device (Artec T1 USB TVBOX), thanks + to Christian Motschke for reporting + 2004-09-05 - released the dibusb device and dib3000mb-frontend driver + + (old news for vp7041.c) + 2004-07-15 - found out, by accident, that the device has a TUA6010XS for + PLL + 2004-07-12 - figured out, that the driver should also work with the + CTS Portable (Chinese Television System) + 2004-07-08 - firmware-extraction-2.422-problem solved, driver is now working + properly with firmware extracted from 2.422 + - #if for 2.6.4 (dvb), compile issue + - changed firmware handling, see vp7041.txt sec 1.1 + 2004-07-02 - some tuner modifications, v0.1, cleanups, first public + 2004-06-28 - now using the dvb_dmx_swfilter_packets, everything + runs fine now + 2004-06-27 - able to watch and switching channels (pre-alpha) + - no section filtering yet + 2004-06-06 - first TS received, but kernel oops :/ + 2004-05-14 - firmware loader is working + 2004-05-11 - start writing the driver + +1. How to use? +NOTE: This driver was developed using Linux 2.6.6., +it is working with 2.6.7 and above. + +Linux 2.4.x support is not planned, but patches are very welcome. + +NOTE: I'm using Debian testing, so the following explaination (especially +the hotplug-path) needn't match your system, but probably it will :). + +The driver is included in the kernel since Linux 2.6.10. + +1.1. Firmware + +The USB driver needs to download a firmware to start working. + +You can either use "get_dvb_firmware dibusb" to download the firmware or you +can get it directly via + +for USB1.1 (AN2135) +http://www.linuxtv.org/downloads/firmware/dvb-dibusb-5.0.0.11.fw + +for USB1.1 (AN2235) (a few Artec T1 devices) +http://www.linuxtv.org/downloads/firmware/dvb-dibusb-an2235-1.fw + +for USB2.0 (FX2) Hauppauge, DiBcom +http://www.linuxtv.org/downloads/firmware/dvb-dibusb-6.0.0.5.fw + +for USB2.0 ADSTech/Kworld USB2.0 +http://www.linuxtv.org/downloads/firmware/dvb-dibusb-adstech-usb2-1.fw + +for USB2.0 HanfTek +http://www.linuxtv.org/downloads/firmware/dvb-dibusb-an2235-1.fw + + +1.2. Compiling + +Since the driver is in the linux kernel, activating the driver in +your favorite config-environment should sufficient. I recommend +to compile the driver as module. Hotplug does the rest. + +1.3. Loading the drivers + +Hotplug is able to load the driver, when it is needed (because you plugged +in the device). + +If you want to enable debug output, you have to load the driver manually and +from withing the dvb-kernel cvs repository. + +first have a look, which debug level are available: + +modinfo dib3000mb +modinfo dib3000-common +modinfo dib3000mc +modinfo dvb-dibusb + +modprobe dib3000-common debug=<level> +modprobe dib3000mb debug=<level> +modprobe dib3000mc debug=<level> +modprobe dvb-dibusb debug=<level> + +should do the trick. + +When the driver is loaded successfully, the firmware file was in +the right place and the device is connected, the "Power"-LED should be +turned on. + +At this point you should be able to start a dvb-capable application. For myself +I used mplayer, dvbscan, tzap and kaxtv, they are working. Using the device +in vdr is working now also. + +2. Known problems and bugs + +- Don't remove the USB device while running an DVB application, your system will die. + +2.1. Adding support for devices + +It is not possible to determine the range of devices based on the DiBcom +reference designs. This is because the reference design of DiBcom can be sold +to thirds, without telling DiBcom (so done with the Twinhan VP7041 and +the HAMA device). + +When you think you have a device like this and the driver does not recognizes it, +please send the ****load*.inf and the ****cap*.inf of the Windows driver to me. + +Sometimes the Vendor or Product ID is identical to the ones of Twinhan, even +though it is not a Twinhan device (e.g. HAMA), then please send me the name +of the device. I will add it to this list in order to make this clear to +others. + +If you are familar with C you can also add the VID and PID of the device to +the dvb-dibusb-core.c-file and create a patch and send it over to me or to +the linux-dvb mailing list, _after_ you have tried compiling and modprobing +it. + +2.2. USB1.1 Bandwidth limitation + +Most of the currently supported devices are USB1.1 and thus they have a +maximum bandwidth of about 5-6 MBit/s when connected to a USB2.0 hub. +This is not enough for receiving the complete transport stream of a +DVB-T channel (which can be about 16 MBit/s). Normally this is not a +problem, if you only want to watch TV (this does not apply for HDTV), +but watching a channel while recording another channel on the same +frequency simply does not work very well. This applies to all USB1.1 +DVB-T devices, not just dibusb) + +Update: For the USB1.1 and VDR some work has been done (patches and comments +are still very welcome). Maybe the problem is solved in the meantime because I +now use the dmx_sw_filter function instead of dmx_sw_filter_packet. I hope the +linux-dvb software filter is able to get the best of the garbled TS. + +The bug, where the TS is distorted by a heavy usage of the device is gone +definitely. All dibusb-devices I was using (Twinhan, Kworld, DiBcom) are +working like charm now with VDR. Sometimes I even was able to record a channel +and watch another one. + +2.3. Comments + +Patches, comments and suggestions are very very welcome. + +3. Acknowledgements + Amaury Demol (ademol@dibcom.fr) and Francois Kanounnikoff from DiBcom for + providing specs, code and help, on which the dvb-dibusb, dib3000mb and + dib3000mc are based. + + David Matthews for identifying a new device type (Artec T1 with AN2235) + and for extending dibusb with remote control event handling. Thank you. + + Alex Woods for frequently answering question about usb and dvb + stuff, a big thank you. + + Bernd Wagner for helping with huge bug reports and discussions. + + Gunnar Wittich and Joachim von Caron for their trust for giving me + root-shells on their machines to implement support for new devices. + + Some guys on the linux-dvb mailing list for encouraging me + + Peter Schildmann >peter.schildmann-nospam-at-web.de< for his + user-level firmware loader, which saves a lot of time + (when writing the vp7041 driver) + + Ulf Hermenau for helping me out with traditional chinese. + + André Smoktun and Christian Frömmel for supporting me with + hardware and listening to my problems very patient diff --git a/Documentation/dvb/avermedia.txt b/Documentation/dvb/avermedia.txt new file mode 100644 index 000000000000..09020ebd202b --- /dev/null +++ b/Documentation/dvb/avermedia.txt @@ -0,0 +1,304 @@ + +HOWTO: Get An Avermedia DVB-T working under Linux + ______________________________________________ + + Table of Contents + Assumptions and Introduction + The Avermedia DVB-T + Getting the card going + Receiving DVB-T in Australia + Known Limitations + Further Update + +Assumptions and Introduction + + It is assumed that the reader understands the basic structure + of the Linux Kernel DVB drivers and the general principles of + Digital TV. + + One significant difference between Digital TV and Analogue TV + that the unwary (like myself) should consider is that, + although the component structure of budget DVB-T cards are + substantially similar to Analogue TV cards, they function in + substantially different ways. + + The purpose of an Analogue TV is to receive and display an + Analogue Television signal. An Analogue TV signal (otherwise + known as composite video) is an analogue encoding of a + sequence of image frames (25 per second) rasterised using an + interlacing technique. Interlacing takes two fields to + represent one frame. Computers today are at their best when + dealing with digital signals, not analogue signals and a + composite video signal is about as far removed from a digital + data stream as you can get. Therefore, an Analogue TV card for + a PC has the following purpose: + + * Tune the receiver to receive a broadcast signal + * demodulate the broadcast signal + * demultiplex the analogue video signal and analogue audio + signal (note some countries employ a digital audio signal + embedded within the modulated composite analogue signal - + NICAM.) + * digitize the analogue video signal and make the resulting + datastream available to the data bus. + + The digital datastream from an Analogue TV card is generated + by circuitry on the card and is often presented uncompressed. + For a PAL TV signal encoded at a resolution of 768x576 24-bit + color pixels over 25 frames per second - a fair amount of data + is generated and must be proceesed by the PC before it can be + displayed on the video monitor screen. Some Analogue TV cards + for PC's have onboard MPEG2 encoders which permit the raw + digital data stream to be presented to the PC in an encoded + and compressed form - similar to the form that is used in + Digital TV. + + The purpose of a simple budget digital TV card (DVB-T,C or S) + is to simply: + + * Tune the received to receive a broadcast signal. + * Extract the encoded digital datastream from the broadcast + signal. + * Make the encoded digital datastream (MPEG2) available to + the data bus. + + The significant difference between the two is that the tuner + on the analogue TV card spits out an Analogue signal, whereas + the tuner on the digital TV card spits out a compressed + encoded digital datastream. As the signal is already + digitised, it is trivial to pass this datastream to the PC + databus with minimal additional processing and then extract + the digital video and audio datastreams passing them to the + appropriate software or hardware for decoding and viewing. + _________________________________________________________ + +The Avermedia DVB-T + + The Avermedia DVB-T is a budget PCI DVB card. It has 3 inputs: + + * RF Tuner Input + * Composite Video Input (RCA Jack) + * SVIDEO Input (Mini-DIN) + + The RF Tuner Input is the input to the tuner module of the + card. The Tuner is otherwise known as the "Frontend" . The + Frontend of the Avermedia DVB-T is a Microtune 7202D. A timely + post to the linux-dvb mailing list ascertained that the + Microtune 7202D is supported by the sp887x driver which is + found in the dvb-hw CVS module. + + The DVB-T card is based around the BT878 chip which is a very + common multimedia bridge and often found on Analogue TV cards. + There is no on-board MPEG2 decoder, which means that all MPEG2 + decoding must be done in software, or if you have one, on an + MPEG2 hardware decoding card or chipset. + _________________________________________________________ + +Getting the card going + + In order to fire up the card, it is necessary to load a number + of modules from the DVB driver set. Prior to this it will have + been necessary to download these drivers from the linuxtv CVS + server and compile them successfully. + + Depending on the card's feature set, the Device Driver API for + DVB under Linux will expose some of the following device files + in the /dev tree: + + * /dev/dvb/adapter0/audio0 + * /dev/dvb/adapter0/ca0 + * /dev/dvb/adapter0/demux0 + * /dev/dvb/adapter0/dvr0 + * /dev/dvb/adapter0/frontend0 + * /dev/dvb/adapter0/net0 + * /dev/dvb/adapter0/osd0 + * /dev/dvb/adapter0/video0 + + The primary device nodes that we are interested in (at this + stage) for the Avermedia DVB-T are: + + * /dev/dvb/adapter0/dvr0 + * /dev/dvb/adapter0/frontend0 + + The dvr0 device node is used to read the MPEG2 Data Stream and + the frontend0 node is used to tune the frontend tuner module. + + At this stage, it has not been able to ascertain the + functionality of the remaining device nodes in respect of the + Avermedia DVBT. However, full functionality in respect of + tuning, receiving and supplying the MPEG2 data stream is + possible with the currently available versions of the driver. + It may be possible that additional functionality is available + from the card (i.e. viewing the additional analogue inputs + that the card presents), but this has not been tested yet. If + I get around to this, I'll update the document with whatever I + find. + + To power up the card, load the following modules in the + following order: + + * insmod dvb-core.o + * modprobe bttv.o + * insmod bt878.o + * insmod dvb-bt8xx.o + * insmod sp887x.o + + Insertion of these modules into the running kernel will + activate the appropriate DVB device nodes. It is then possible + to start accessing the card with utilities such as scan, tzap, + dvbstream etc. + + The frontend module sp887x.o, requires an external firmware. + Please use the command "get_dvb_firmware sp887x" to download + it. Then copy it to /usr/lib/hotplug/firmware. + +Receiving DVB-T in Australia + + I have no experience of DVB-T in other countries other than + Australia, so I will attempt to explain how it works here in + Melbourne and how this affects the configuration of the DVB-T + card. + + The Digital Broadcasting Australia website has a Reception + locatortool which provides information on transponder channels + and frequencies. My local transmitter happens to be Mount + Dandenong. + + The frequencies broadcast by Mount Dandenong are: + + Table 1. Transponder Frequencies Mount Dandenong, Vic, Aus. + Broadcaster Channel Frequency + ABC VHF 12 226.5 MHz + TEN VHF 11 219.5 MHz + NINE VHF 8 191.625 MHz + SEVEN VHF 6 177.5 MHz + SBS UHF 29 536.5 MHz + + The Scan utility has a set of compiled-in defaults for various + countries and regions, but if they do not suit, or if you have + a pre-compiled scan binary, you can specify a data file on the + command line which contains the transponder frequencies. Here + is a sample file for the above channel transponders: +# Data file for DVB scan program +# +# C Frequency SymbolRate FEC QAM +# S Frequency Polarisation SymbolRate FEC +# T Frequency Bandwidth FEC FEC2 QAM Mode Guard Hier +T 226500000 7MHz 2/3 NONE QAM64 8k 1/8 NONE +T 191625000 7MHz 2/3 NONE QAM64 8k 1/8 NONE +T 219500000 7MHz 2/3 NONE QAM64 8k 1/8 NONE +T 177500000 7MHz 2/3 NONE QAM64 8k 1/8 NONE +T 536500000 7MHz 2/3 NONE QAM64 8k 1/8 NONE + + The defaults for the transponder frequency and other + modulation parameters were obtained from www.dba.org.au. + + When Scan runs, it will output channels.conf information for + any channel's transponders which the card's frontend can lock + onto. (i.e. any whose signal is strong enough at your + antenna). + + Here's my channels.conf file for anyone who's interested: +ABC HDTV:226500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_3_4:QAM_64 +:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:2307:0:560 +ABC TV Melbourne:226500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_3_ +4:QAM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:65 +0:561 +ABC TV 2:226500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_3_4:QAM_64 +:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:562 +ABC TV 3:226500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_3_4:QAM_64 +:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:563 +ABC TV 4:226500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_3_4:QAM_64 +:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:564 +ABC DiG Radio:226500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_3_4:Q +AM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:0:2311:56 +6 +TEN Digital:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:158 +5 +TEN Digital 1:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:Q +AM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:1 +586 +TEN Digital 2:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:Q +AM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:1 +587 +TEN Digital 3:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:Q +AM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:1 +588 +TEN Digital:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:158 +9 +TEN Digital 4:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:Q +AM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:1 +590 +TEN Digital:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:159 +1 +TEN HD:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:QAM_64:T +RANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:514:0:1592 +TEN Digital:219500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:650:159 +3 +Nine Digital:191625000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:QA +M_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:513:660:10 +72 +Nine Digital HD:191625000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2 +:QAM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:512:0:1 +073 +Nine Guide:191625000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_3_4:FEC_1_2:QAM_ +64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_16:HIERARCHY_NONE:514:670:1074 +7 Digital:177500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QAM_6 +4:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:769:770:1328 +7 Digital 1:177500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:769:770:1329 +7 Digital 2:177500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:769:770:1330 +7 Digital 3:177500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:769:770:1331 +7 HD Digital:177500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QA +M_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:833:834:133 +2 +7 Program Guide:177500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3 +:QAM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:865:866: +1334 +SBS HD:536500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QAM_64:T +RANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:102:103:784 +SBS DIGITAL 1:536500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:Q +AM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:161:81:785 +SBS DIGITAL 2:536500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:Q +AM_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:162:83:786 +SBS EPG:536500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QAM_64: +TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:163:85:787 +SBS RADIO 1:536500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:0:201:798 +SBS RADIO 2:536500000:INVERSION_OFF:BANDWIDTH_7_MHZ:FEC_2_3:FEC_2_3:QAM +_64:TRANSMISSION_MODE_8K:GUARD_INTERVAL_1_8:HIERARCHY_NONE:0:202:799 + _________________________________________________________ + +Known Limitations + + At present I can say with confidence that the frontend tunes + via /dev/dvb/adapter{x}/frontend0 and supplies an MPEG2 stream + via /dev/dvb/adapter{x}/dvr0. I have not tested the + functionality of any other part of the card yet. I will do so + over time and update this document. + + There are some limitations in the i2c layer due to a returned + error message inconsistency. Although this generates errors in + dmesg and the system logs, it does not appear to affect the + ability of the frontend to function correctly. + _________________________________________________________ + +Further Update + + dvbstream and VideoLAN Client on windows works a treat with + DVB, in fact this is currently serving as my main way of + viewing DVB-T at the moment. Additionally, VLC is happily + decoding HDTV signals, although the PC is dropping the odd + frame here and there - I assume due to processing capability - + as all the decoding is being done under windows in software. + + Many thanks to Nigel Pearson for the updates to this document + since the recent revision of the driver. + + January 29th 2004 diff --git a/Documentation/dvb/bt8xx.txt b/Documentation/dvb/bt8xx.txt new file mode 100644 index 000000000000..e3cacf4f2345 --- /dev/null +++ b/Documentation/dvb/bt8xx.txt @@ -0,0 +1,90 @@ +How to get the Nebula, PCTV and Twinhan DST cards working +========================================================= + +This class of cards has a bt878a as the PCI interface, and +require the bttv driver. + +Please pay close attention to the warning about the bttv module +options below for the DST card. + +1) General informations +======================= + +These drivers require the bttv driver to provide the means to access +the i2c bus and the gpio pins of the bt8xx chipset. + +Because of this, you need to enable +"Device drivers" => "Multimedia devices" + => "Video For Linux" => "BT848 Video For Linux" + +2) Loading Modules +================== + +In general you need to load the bttv driver, which will handle the gpio and +i2c communication for us. Next you need the common dvb-bt8xx device driver +and one frontend driver. + +The bttv driver will HANG YOUR SYSTEM IF YOU DO NOT SPECIFY THE CORRECT +CARD ID! + +(If you don't get your card running and you suspect that the card id you're +using is wrong, have a look at "bttv-cards.c" for a list of possible card +ids.) + +Pay attention to failures when you load the frontend drivers +(e.g. dmesg, /var/log/messages). + +3a) Nebula / Pinnacle PCTV +-------------------------- + + $ modprobe bttv i2c_hw=1 card=0x68 + $ modprobe dvb-bt8xx + +For Nebula cards use the "nxt6000" frontend driver: + $ modprobe nxt6000 + +For Pinnacle PCTV cards use the "cx24110" frontend driver: + $ modprobe cx24110 + +3b) TwinHan +----------- + + $ modprobe bttv i2c_hw=1 card=0x71 + $ modprobe dvb-bt8xx + $ modprobe dst + +The value 0x71 will override the PCI type detection for dvb-bt8xx, which +is necessary for TwinHan cards.# + +If you're having an older card (blue color circuit) and card=0x71 locks your +machine, try using 0x68, too. If that does not work, ask on the DVB mailing list. + +The DST module takes a couple of useful parameters, in case the +dst drivers fails to detect your type of card correctly. + +dst_type takes values 0 (satellite), 1 (terrestial TV), 2 (cable). + +dst_type_flags takes bit combined values: +1 = new tuner type packets. You can use this if your card is detected + and you have debug and you continually see the tuner packets not + working (make sure not a basic problem like dish alignment etc.) + +2 = TS 204. If your card tunes OK, but the picture is terrible, seemingly + breaking up in one half continually, and crc fails a lot, then + this is worth a try (or trying to turn off) + +4 = has symdiv. Some cards, mostly without new tuner packets, require + a symbol division algorithm. Doesn't apply to terrestial TV. + +You can also specify a value to have the autodetected values turned off +(e.g. 0). The autodected values are determined bythe cards 'response +string' which you can see in your logs e.g. + +dst_check_ci: recognize DST-MOT + +or + +dst_check_ci: unable to recognize DSTXCI or STXCI + +-- +Authors: Richard Walker, Jamie Honan, Michael Hunold diff --git a/Documentation/dvb/cards.txt b/Documentation/dvb/cards.txt new file mode 100644 index 000000000000..efdc4ee9d40c --- /dev/null +++ b/Documentation/dvb/cards.txt @@ -0,0 +1,85 @@ +Hardware supported by the linuxtv.org DVB drivers +================================================= + + Generally, the DVB hardware manufacturers frequently change the + frontends (i.e. tuner / demodulator units) used, usually without + changing the product name, revision number or specs. Some cards + are also available in versions with different frontends for + DVB-S/DVB-C/DVB-T. Thus the frontend drivers are listed seperately. + + Note 1: There is no guarantee that every frontend driver works + out of the box with every card, because of different wiring. + + Note 2: The demodulator chips can be used with a variety of + tuner/PLL chips, and not all combinations are supported. Often + the demodulator and tuner/PLL chip are inside a metal box for + shielding, and the whole metal box has its own part number. + + +o Frontends drivers: + - dvb_dummy_fe: for testing... + DVB-S: + - ves1x93 : Alps BSRV2 (ves1893 demodulator) and dbox2 (ves1993) + - cx24110 : Conexant HM1221/HM1811 (cx24110 or cx24106 demod, cx24108 PLL) + - grundig_29504-491 : Grundig 29504-491 (Philips TDA8083 demodulator), tsa5522 PLL + - mt312 : Zarlink mt312 or Mitel vp310 demodulator, sl1935 or tsa5059 PLL + - stv0299 : Alps BSRU6 (tsa5059 PLL), LG TDQB-S00x (tsa5059 PLL), + LG TDQF-S001F (sl1935 PLL), Philips SU1278 (tua6100 PLL), + Philips SU1278SH (tsa5059 PLL), Samsung TBMU24112IMB + DVB-C: + - ves1820 : various (ves1820 demodulator, sp5659c or spXXXX PLL) + - at76c651 : Atmel AT76c651(B) with DAT7021 PLL + DVB-T: + - alps_tdlb7 : Alps TDLB7 (sp8870 demodulator, sp5659 PLL) + - alps_tdmb7 : Alps TDMB7 (cx22700 demodulator) + - grundig_29504-401 : Grundig 29504-401 (LSI L64781 demodulator), tsa5060 PLL + - tda1004x : Philips tda10045h (td1344 or tdm1316l PLL) + - nxt6000 : Alps TDME7 (MITEL SP5659 PLL), Alps TDED4 (TI ALP510 PLL), + Comtech DVBT-6k07 (SP5730 PLL) + (NxtWave Communications NXT6000 demodulator) + - sp887x : Microtune 7202D + - dib3000mb : DiBcom 3000-MB demodulator + DVB-S/C/T: + - dst : TwinHan DST Frontend + + +o Cards based on the Phillips saa7146 multimedia PCI bridge chip: + - TI AV7110 based cards (i.e. with hardware MPEG decoder): + - Siemens/Technotrend/Hauppauge PCI DVB card revision 1.1, 1.3, 1.5, 1.6, 2.1 + (aka Hauppauge Nexus) + - "budget" cards (i.e. without hardware MPEG decoder): + - Technotrend Budget / Hauppauge WinTV-Nova PCI Cards + - SATELCO Multimedia PCI + - KNC1 DVB-S, Typhoon DVB-S, Terratec Cinergy 1200 DVB-S (no CI support) + - Typhoon DVB-S budget + - Fujitsu-Siemens Activy DVB-S budget card + +o Cards based on the B2C2 Inc. FlexCopII/IIb/III: + - Technisat SkyStar2 PCI DVB card revision 2.3, 2.6B, 2.6C + +o Cards based on the Conexant Bt8xx PCI bridge: + - Pinnacle PCTV Sat DVB + - Nebula Electronics DigiTV + - TwinHan DST + - Avermedia DVB-T + +o Technotrend / Hauppauge DVB USB devices: + - Nova USB + - DEC 2000-T, 3000-S, 2540-T + +o DiBcom DVB-T USB based devices: + - Twinhan VisionPlus VisionDTV USB-Ter DVB-T Device + - HAMA DVB-T USB device + - CTS Portable (Chinese Television System) + - KWorld V-Stream XPERT DTV DVB-T USB + - JetWay DTV DVB-T USB + - ADSTech Instant TV DVB-T USB + - Ultima Electronic/Artec T1 USB TVBOX (AN2135 and AN2235) + - Compro Videomate DVB-U2000 - DVB-T USB + - Grandtec USB DVB-T + - Avermedia AverTV DVBT USB + - DiBcom USB DVB-T reference device (non-public) + - Yakumo DVB-T mobile USB2.0 + - DiBcom USB2.0 DVB-T reference device (non-public) + +o Experimental support for the analog module of the Siemens DVB-C PCI card diff --git a/Documentation/dvb/contributors.txt b/Documentation/dvb/contributors.txt new file mode 100644 index 000000000000..c9d5ce370701 --- /dev/null +++ b/Documentation/dvb/contributors.txt @@ -0,0 +1,79 @@ +Thanks go to the following people for patches and contributions: + +Michael Hunold <m.hunold@gmx.de> + for the initial saa7146 driver and it's recent overhaul + +Christian Theiss + for his work on the initial Linux DVB driver + +Marcus Metzler <mocm@metzlerbros.de> +Ralph Metzler <rjkm@metzlerbros.de> + for their continuing work on the DVB driver + +Michael Holzt <kju@debian.org> + for his contributions to the dvb-net driver + +Diego Picciani <d.picciani@novacomp.it> + for CyberLogin for Linux which allows logging onto EON + (in case you are wondering where CyberLogin is, EON changed its login + procedure and CyberLogin is no longer used.) + +Martin Schaller <martin@smurf.franken.de> + for patching the cable card decoder driver + +Klaus Schmidinger <Klaus.Schmidinger@cadsoft.de> + for various fixes regarding tuning, OSD and CI stuff and his work on VDR + +Steve Brown <sbrown@cortland.com> + for his AFC kernel thread + +Christoph Martin <martin@uni-mainz.de> + for his LIRC infrared handler + +Andreas Oberritter <obi@linuxtv.org> +Dennis Noermann <dennis.noermann@noernet.de> +Felix Domke <tmbinc@elitedvb.net> +Florian Schirmer <jolt@tuxbox.org> +Ronny Strutz <3des@elitedvb.de> +Wolfram Joost <dbox2@frokaschwei.de> +...and all the other dbox2 people + for many bugfixes in the generic DVB Core, frontend drivers and + their work on the dbox2 port of the DVB driver + +Oliver Endriss <o.endriss@gmx.de> + for many bugfixes + +Andrew de Quincey <adq_dvb@lidskialf.net> + for the tda1004x frontend driver, and various bugfixes + +Peter Schildmann <peter.schildmann@web.de> + for the driver for the Technisat SkyStar2 PCI DVB card + +Vadim Catana <skystar@moldova.cc> +Roberto Ragusa <r.ragusa@libero.it> +Augusto Cardoso <augusto@carhil.net> + for all the work for the FlexCopII chipset by B2C2,Inc. + +Davor Emard <emard@softhome.net> + for his work on the budget drivers, the demux code, + the module unloading problems, ... + +Hans-Frieder Vogt <hfvogt@arcor.de> + for his work on calculating and checking the crc's for the + TechnoTrend/Hauppauge DEC driver firmware + +Michael Dreher <michael@5dot1.de> +Andreas 'randy' Weinberger + for the support of the Fujitsu-Siemens Activy budget DVB-S + +Kenneth Aafløy <ke-aa@frisurf.no> + for adding support for Typhoon DVB-S budget card + +Ernst Peinlich <e.peinlich@inode.at> + for tuning/DiSEqC support for the DEC 3000-s + +Peter Beutner <p.beutner@gmx.net> + for the IR code for the ttusb-dec driver + +(If you think you should be in this list, but you are not, drop a + line to the DVB mailing list) diff --git a/Documentation/dvb/faq.txt b/Documentation/dvb/faq.txt new file mode 100644 index 000000000000..3bf51e45c972 --- /dev/null +++ b/Documentation/dvb/faq.txt @@ -0,0 +1,160 @@ +Some very frequently asked questions about linuxtv-dvb + +1. The signal seems to die a few seconds after tuning. + + It's not a bug, it's a feature. Because the frontends have + significant power requirements (and hence get very hot), they + are powered down if they are unused (i.e. if the frontend device + is closed). The dvb-core.o module paramter "dvb_shutdown_timeout" + allow you to change the timeout (default 5 seconds). Setting the + timeout to 0 disables the timeout feature. + +2. How can I watch TV? + + The driver distribution includes some simple utilities which + are mainly intended for testing and to demonstrate how the + DVB API works. + + Depending on whether you have a DVB-S, DVB-C or DVB-T card, use + apps/szap/szap, czap or tzap. You must supply a channel list + in ~/.[sct]zap/channels.conf. If you are lucky you can just copy + one of the supplied channel lists, or you can create a new one + by running apps/scan/scan. If you run scan on an unknown network + you might have to supply some start data in apps/scan/initial.h. + + If you have a card with a built-in hardware MPEG-decoder the + drivers create a video4linux device (/dev/v4l/video0) which + you can use to watch TV with any v4l application. xawtv is known + to work. Note that you cannot change channels with xawtv, you + have to zap using [sct]zap. If you want a nice application for + TV watching and record/playback, have a look at VDR. + + If your card does not have a hardware MPEG decoder you need + a software MPEG decoder. Mplayer or xine are known to work. + Newsflash: MythTV also has DVB support now. + Note: Only very recent versions of Mplayer and xine can decode. + MPEG2 transport streams (TS) directly. Then, run + '[sct]zap channelname -r' in one xterm, and keep it running, + and start 'mplayer - < /dev/dvb/adapter0/dvr0' or + 'xine stdin://mpeg2 < /dev/dvb/adapter0/dvr0' in a second xterm. + That's all far from perfect, but it seems no one has written + a nice DVB application which includes a builtin software MPEG + decoder yet. + + Newsflash: Newest xine directly supports DVB. Just copy your + channels.conf to ~/.xine and start 'xine dvb://', or select + the DVB button in the xine GUI. Channel switching works using the + numpad pgup/pgdown (NP9 / NP3) keys to scroll through the channel osd + menu and pressing numpad-enter to switch to the selected channel. + + Note: Older versions of xine and mplayer understand MPEG program + streams (PS) only, and can be used in conjunction with the + ts2ps tool from the Metzler Brother's dvb-mpegtools package. + +3. Which other DVB applications exist? + + http://www.cadsoft.de/people/kls/vdr/ + Klaus Schmidinger's Video Disk Recorder + + http://www.metzlerbros.org/dvb/ + Metzler Bros. DVB development; alternate drivers and + DVB utilities, include dvb-mpegtools and tuxzap. + + http://www.linuxstb.org/ + http://sourceforge.net/projects/dvbtools/ + Dave Chapman's dvbtools package, including + dvbstream and dvbtune + + http://www.linuxdvb.tv/ + Henning Holtschneider's site with many interesting + links and docs + + http://www.dbox2.info/ + LinuxDVB on the dBox2 + + http://www.tuxbox.org/ + http://cvs.tuxbox.org/ + the TuxBox CVS many interesting DVB applications and the dBox2 + DVB source + + http://sourceforge.net/projects/dvbsak/ + DVB Swiss Army Knife library and utilities + + http://www.nenie.org/misc/mpsys/ + MPSYS: a MPEG2 system library and tools + + http://mplayerhq.hu/ + mplayer + + http://xine.sourceforge.net/ + http://xinehq.de/ + xine + + http://www.mythtv.org/ + MythTV - analog TV PVR, but now with DVB support, too + (with software MPEG decode) + + http://dvbsnoop.sourceforge.net/ + DVB sniffer program to monitor, analyze, debug, dump + or view dvb/mpeg/dsm-cc/mhp stream information (TS, + PES, SECTION) + +4. Can't get a signal tuned correctly + + If you are using a Technotrend/Hauppauge DVB-C card *without* analog + module, you might have to use module parameter adac=-1 (dvb-ttpci.o). + +5. The dvb_net device doesn't give me any packets at all + + Run tcpdump on the dvb0_0 interface. This sets the interface + into promiscous mode so it accepts any packets from the PID + you have configured with the dvbnet utility. Check if there + are any packets with the IP addr and MAC addr you have + configured with ifconfig. + + If tcpdump doesn't give you any output, check the statistics + which ifconfig outputs. (Note: If the MAC address is wrong, + dvb_net won't get any input; thus you have to run tcpdump + before checking the statistics.) If there are no packets at + all then maybe the PID is wrong. If there are error packets, + then either the PID is wrong or the stream does not conform to + the MPE standard (EN 301 192, http://www.etsi.org/). You can + use e.g. dvbsnoop for debugging. + +6. The dvb_net device doesn't give me any multicast packets + + Check your routes if they include the multicast address range. + Additionally make sure that "source validation by reversed path + lookup" is disabled: + $ "echo 0 > /proc/sys/net/ipv4/conf/dvb0/rp_filter" + +7. What the hell are all those modules that need to be loaded? + + For a dvb-ttpci av7110 based full-featured card the following + modules are loaded: + + - videodev: Video4Linux core module. This is the base module that + gives you access to the "analog" tv picture of the av7110 mpeg2 + decoder. + + - v4l2-common: common functions for Video4Linux-2 drivers + + - v4l1-compat: backward compatiblity layer for Video4Linux-1 legacy + applications + + - dvb-core: DVB core module. This provides you with the + /dev/dvb/adapter entries + + - saa7146: SAA7146 core driver. This is need to access any SAA7146 + based card in your system. + + - saa7146_vv: SAA7146 video and vbi functions. These are only needed + for full-featured cards. + + - video-buf: capture helper module for the saa7146_vv driver. This + one is responsible to handle capture buffers. + + - dvb-ttpci: The main driver for AV7110 based, full-featued + DVB-S/C/T cards + +eof diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware new file mode 100644 index 000000000000..3ffdcb394299 --- /dev/null +++ b/Documentation/dvb/get_dvb_firmware @@ -0,0 +1,397 @@ +#!/usr/bin/perl +# DVB firmware extractor +# +# (c) 2004 Andrew de Quincey +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + +use File::Temp qw/ tempdir /; +use IO::Handle; + +@components = ( "sp8870", "sp887x", "tda10045", "tda10046", "av7110", "dec2000t", + "dec2540t", "dec3000s", "vp7041", "dibusb", "nxt2002", + "or51211", "or51132_qam", "or51132_vsb"); + +# Check args +syntax() if (scalar(@ARGV) != 1); +$cid = $ARGV[0]; + +# Do it! +for ($i=0; $i < scalar(@components); $i++) { + if ($cid eq $components[$i]) { + $outfile = eval($cid); + die $@ if $@; + print STDERR "Firmware $outfile extracted successfully. Now copy it to either /lib/firmware or /usr/lib/hotplug/firmware/ (depending on your hotplug version).\n"; + exit(0); + } +} + +# If we get here, it wasn't found +print STDERR "Unknown component \"$cid\"\n"; +syntax(); + + + + +# --------------------------------------------------------------- +# Firmware-specific extraction subroutines + +sub sp8870 { + my $sourcefile = "tt_Premium_217g.zip"; + my $url = "http://www.technotrend.de/new/217g/$sourcefile"; + my $hash = "53970ec17a538945a6d8cb608a7b3899"; + my $outfile = "dvb-fe-sp8870.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + verify("$tmpdir/software/OEM/HE/App/boot/SC_MAIN.MC", $hash); + copy("$tmpdir/software/OEM/HE/App/boot/SC_MAIN.MC", $outfile); + + $outfile; +} + +sub sp887x { + my $sourcefile = "Dvbt1.3.57.6.zip"; + my $url = "http://www.avermedia.com/software/$sourcefile"; + my $cabfile = "DVBT Net Ver1.3.57.6/disk1/data1.cab"; + my $hash = "237938d53a7f834c05c42b894ca68ac3"; + my $outfile = "dvb-fe-sp887x.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + checkunshield(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + unshield("$tmpdir/$cabfile", $tmpdir); + verify("$tmpdir/ZEnglish/sc_main.mc", $hash); + copy("$tmpdir/ZEnglish/sc_main.mc", $outfile); + + $outfile; +} + +sub tda10045 { + my $sourcefile = "tt_budget_217g.zip"; + my $url = "http://www.technotrend.de/new/217g/$sourcefile"; + my $hash = "2105fd5bf37842fbcdfa4bfd58f3594a"; + my $outfile = "dvb-fe-tda10045.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + extract("$tmpdir/software/OEM/PCI/App/ttlcdacc.dll", 0x37ef9, 30555, "$tmpdir/fwtmp"); + verify("$tmpdir/fwtmp", $hash); + copy("$tmpdir/fwtmp", $outfile); + + $outfile; +} + +sub tda10046 { + my $sourcefile = "tt_budget_217g.zip"; + my $url = "http://www.technotrend.de/new/217g/$sourcefile"; + my $hash = "a25b579e37109af60f4a36c37893957c"; + my $outfile = "dvb-fe-tda10046.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + extract("$tmpdir/software/OEM/PCI/App/ttlcdacc.dll", 0x3f731, 24479, "$tmpdir/fwtmp"); + verify("$tmpdir/fwtmp", $hash); + copy("$tmpdir/fwtmp", $outfile); + + $outfile; +} + +sub av7110 { + my $sourcefile = "dvb-ttpci-01.fw-261d"; + my $url = "http://www.linuxtv.org/downloads/firmware/$sourcefile"; + my $hash = "603431b6259715a8e88f376a53b64e2f"; + my $outfile = "dvb-ttpci-01.fw"; + + checkstandard(); + + wgetfile($sourcefile, $url); + verify($sourcefile, $hash); + copy($sourcefile, $outfile); + + $outfile; +} + +sub dec2000t { + my $sourcefile = "dec217g.exe"; + my $url = "http://hauppauge.lightpath.net/de/$sourcefile"; + my $hash = "bd86f458cee4a8f0a8ce2d20c66215a9"; + my $outfile = "dvb-ttusb-dec-2000t.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + verify("$tmpdir/software/OEM/STB/App/Boot/STB_PC_T.bin", $hash); + copy("$tmpdir/software/OEM/STB/App/Boot/STB_PC_T.bin", $outfile); + + $outfile; +} + +sub dec2540t { + my $sourcefile = "dec217g.exe"; + my $url = "http://hauppauge.lightpath.net/de/$sourcefile"; + my $hash = "53e58f4f5b5c2930beee74a7681fed92"; + my $outfile = "dvb-ttusb-dec-2540t.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + verify("$tmpdir/software/OEM/STB/App/Boot/STB_PC_X.bin", $hash); + copy("$tmpdir/software/OEM/STB/App/Boot/STB_PC_X.bin", $outfile); + + $outfile; +} + +sub dec3000s { + my $sourcefile = "dec217g.exe"; + my $url = "http://hauppauge.lightpath.net/de/$sourcefile"; + my $hash = "b013ececea83f4d6d8d2a29ac7c1b448"; + my $outfile = "dvb-ttusb-dec-3000s.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + verify("$tmpdir/software/OEM/STB/App/Boot/STB_PC_S.bin", $hash); + copy("$tmpdir/software/OEM/STB/App/Boot/STB_PC_S.bin", $outfile); + + $outfile; +} + +sub vp7041 { + my $sourcefile = "2.422.zip"; + my $url = "http://www.twinhan.com/files/driver/USB-Ter/$sourcefile"; + my $hash = "e88c9372d1f66609a3e7b072c53fbcfe"; + my $outfile = "dvb-vp7041-2.422.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + extract("$tmpdir/VisionDTV/Drivers/Win2K&XP/UDTTload.sys", 12503, 3036, "$tmpdir/fwtmp1"); + extract("$tmpdir/VisionDTV/Drivers/Win2K&XP/UDTTload.sys", 2207, 10274, "$tmpdir/fwtmp2"); + + my $CMD = "\000\001\000\222\177\000"; + my $PAD = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"; + my ($FW); + open $FW, ">$tmpdir/fwtmp3"; + print $FW "$CMD\001$PAD"; + print $FW "$CMD\001$PAD"; + appendfile($FW, "$tmpdir/fwtmp1"); + print $FW "$CMD\000$PAD"; + print $FW "$CMD\001$PAD"; + appendfile($FW, "$tmpdir/fwtmp2"); + print $FW "$CMD\001$PAD"; + print $FW "$CMD\000$PAD"; + close($FW); + + verify("$tmpdir/fwtmp3", $hash); + copy("$tmpdir/fwtmp3", $outfile); + + $outfile; +} + +sub dibusb { + my $url = "http://www.linuxtv.org/downloads/firmware/dvb-dibusb-5.0.0.11.fw"; + my $outfile = "dvb-dibusb-5.0.0.11.fw"; + my $hash = "fa490295a527360ca16dcdf3224ca243"; + + checkstandard(); + + wgetfile($outfile, $url); + verify($outfile,$hash); + + $outfile; +} + +sub nxt2002 { + my $sourcefile = "Broadband4PC_4_2_11.zip"; + my $url = "http://www.bbti.us/download/windows/$sourcefile"; + my $hash = "c6d2ea47a8f456d887ada0cfb718ff2a"; + my $outfile = "dvb-fe-nxt2002.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + + wgetfile($sourcefile, $url); + unzip($sourcefile, $tmpdir); + verify("$tmpdir/SkyNETU.sys", $hash); + extract("$tmpdir/SkyNETU.sys", 375832, 5908, $outfile); + + $outfile; +} + +sub or51211 { + my $fwfile = "dvb-fe-or51211.fw"; + my $url = "http://linuxtv.org/downloads/firmware/$fwfile"; + my $hash = "d830949c771a289505bf9eafc225d491"; + + checkstandard(); + + wgetfile($fwfile, $url); + verify($fwfile, $hash); + + $fwfile; +} + +sub or51132_qam { + my $fwfile = "dvb-fe-or51132-qam.fw"; + my $url = "http://linuxtv.org/downloads/firmware/$fwfile"; + my $hash = "7702e8938612de46ccadfe9b413cb3b5"; + + checkstandard(); + + wgetfile($fwfile, $url); + verify($fwfile, $hash); + + $fwfile; +} + +sub or51132_vsb { + my $fwfile = "dvb-fe-or51132-vsb.fw"; + my $url = "http://linuxtv.org/downloads/firmware/$fwfile"; + my $hash = "c16208e02f36fc439a557ad4c613364a"; + + checkstandard(); + + wgetfile($fwfile, $url); + verify($fwfile, $hash); + + $fwfile; +} + +# --------------------------------------------------------------- +# Utilities + +sub checkstandard { + if (system("which unzip > /dev/null 2>&1")) { + die "This firmware requires the unzip command - see ftp://ftp.info-zip.org/pub/infozip/UnZip.html\n"; + } + if (system("which md5sum > /dev/null 2>&1")) { + die "This firmware requires the md5sum command - see http://www.gnu.org/software/coreutils/\n"; + } + if (system("which wget > /dev/null 2>&1")) { + die "This firmware requires the wget command - see http://wget.sunsite.dk/\n"; + } +} + +sub checkunshield { + if (system("which unshield > /dev/null 2>&1")) { + die "This firmware requires the unshield command - see http://sourceforge.net/projects/synce/\n"; + } +} + +sub wgetfile { + my ($sourcefile, $url) = @_; + + if (! -f $sourcefile) { + system("wget -O \"$sourcefile\" \"$url\"") and die "wget failed - unable to download firmware"; + } +} + +sub unzip { + my ($sourcefile, $todir) = @_; + + $status = system("unzip -q -o -d \"$todir\" \"$sourcefile\" 2>/dev/null" ); + if ((($status >> 8) > 2) || (($status & 0xff) != 0)) { + die ("unzip failed - unable to extract firmware"); + } +} + +sub unshield { + my ($sourcefile, $todir) = @_; + + system("unshield x -d \"$todir\" \"$sourcefile\" > /dev/null" ) and die ("unshield failed - unable to extract firmware"); +} + +sub verify { + my ($filename, $hash) = @_; + my ($testhash); + + open(CMD, "md5sum \"$filename\"|"); + $testhash = <CMD>; + $testhash =~ /([a-zA-Z0-9]*)/; + $testhash = $1; + close CMD; + die "Hash of extracted file does not match!\n" if ($testhash ne $hash); +} + +sub copy { + my ($from, $to) = @_; + + system("cp -f \"$from\" \"$to\"") and die ("cp failed"); +} + +sub extract { + my ($infile, $offset, $length, $outfile) = @_; + my ($chunklength, $buf, $rcount); + + open INFILE, "<$infile"; + open OUTFILE, ">$outfile"; + sysseek(INFILE, $offset, SEEK_SET); + while($length > 0) { + # Calc chunk size + $chunklength = 2048; + $chunklength = $length if ($chunklength > $length); + + $rcount = sysread(INFILE, $buf, $chunklength); + die "Ran out of data\n" if ($rcount != $chunklength); + syswrite(OUTFILE, $buf); + $length -= $rcount; + } + close INFILE; + close OUTFILE; +} + +sub appendfile { + my ($FH, $infile) = @_; + my ($buf); + + open INFILE, "<$infile"; + while(1) { + $rcount = sysread(INFILE, $buf, 2048); + last if ($rcount == 0); + print $FH $buf; + } + close(INFILE); +} + +sub syntax() { + print STDERR "syntax: get_dvb_firmware <component>\n"; + print STDERR "Supported components:\n"; + for($i=0; $i < scalar(@components); $i++) { + print STDERR "\t" . $components[$i] . "\n"; + } + exit(1); +} diff --git a/Documentation/dvb/readme.txt b/Documentation/dvb/readme.txt new file mode 100644 index 000000000000..754c98c6ad94 --- /dev/null +++ b/Documentation/dvb/readme.txt @@ -0,0 +1,52 @@ +Linux Digital Video Broadcast (DVB) subsystem +============================================= + +The main development site and CVS repository for these +drivers is http://linuxtv.org/. + +The developer mailing list linux-dvb is also hosted there, +see http://linuxtv.org/lists.php. Please check +the archive http://linuxtv.org/pipermail/linux-dvb/ +and the Wiki http://linuxtv.org/wiki/ +before asking newbie questions on the list. + +API documentation, utilities and test/example programs +are available as part of the old driver package for Linux 2.4 +(linuxtv-dvb-1.0.x.tar.gz), or from CVS (module DVB). +We plan to split this into separate packages, but it's not +been done yet. + +http://linuxtv.org/downloads/ + +What's inside this directory: + +"cards.txt" +contains a list of supported hardware. + +"contributors.txt" +is the who-is-who of DVB development + +"faq.txt" +contains frequently asked questions and their answers. + +"get_dvb_firmware" +script to download and extract firmware for those devices +that require it. + +"ttusb-dec.txt" +contains detailed informations about the +TT DEC2000/DEC3000 USB DVB hardware. + +"bt8xx.txt" +contains detailed installation instructions for the +various bt8xx based "budget" DVB cards +(Nebula, Pinnacle PCTV, Twinhan DST) + +"README.dibusb" +contains detailed information about adapters +based on DiBcom reference design. + +"udev.txt" +how to get DVB and udev up and running. + +Good luck and have fun! diff --git a/Documentation/dvb/ttusb-dec.txt b/Documentation/dvb/ttusb-dec.txt new file mode 100644 index 000000000000..5c1e984c26a7 --- /dev/null +++ b/Documentation/dvb/ttusb-dec.txt @@ -0,0 +1,44 @@ +TechnoTrend/Hauppauge DEC USB Driver +==================================== + +Driver Status +------------- + +Supported: + DEC2000-t + DEC2450-t + DEC3000-s + Linux Kernels 2.4 and 2.6 + Video Streaming + Audio Streaming + Section Filters + Channel Zapping + Hotplug firmware loader under 2.6 kernels + +To Do: + Tuner status information + DVB network interface + Streaming video PC->DEC + Conax support for 2450-t + +Getting the Firmware +-------------------- +To download the firmware, use the following commands: +"get_dvb_firmware dec2000t" +"get_dvb_firmware dec2540t" +"get_dvb_firmware dec3000s" + + +Compilation Notes for 2.4 kernels +--------------------------------- +For 2.4 kernels the firmware for the DECs is compiled into the driver itself. + +Copy the three files downloaded above into the build-2.4 directory. + + +Hotplug Firmware Loading for 2.6 kernels +---------------------------------------- +For 2.6 kernels the firmware is loaded at the point that the driver module is +loaded. See linux/Documentation/dvb/firmware.txt for more information. + +Copy the three files downloaded above into the /usr/lib/hotplug/firmware directory. diff --git a/Documentation/dvb/udev.txt b/Documentation/dvb/udev.txt new file mode 100644 index 000000000000..68ee224b6aae --- /dev/null +++ b/Documentation/dvb/udev.txt @@ -0,0 +1,46 @@ +The DVB subsystem currently registers to the sysfs subsystem using the +"class_simple" interface. + +This means that only the basic informations like module loading parameters +are presented through sysfs. Other things that might be interesting are +currently *not* available. + +Nevertheless it's now possible to add proper udev rules so that the +DVB device nodes are created automatically. + +We assume that you have udev already up and running and that have been +creating the DVB device nodes manually up to now due to the missing sysfs +support. + +0. Don't forget to disable your current method of creating the +device nodes manually. + +1. Unfortunately, you'll need a helper script to transform the kernel +sysfs device name into the well known dvb adapter / device naming scheme. +The script should be called "dvb.sh" and should be placed into a script +dir where udev can execute it, most likely /etc/udev/scripts/ + +So, create a new file /etc/udev/scripts/dvb.sh and add the following: +------------------------------schnipp------------------------------------------------ +#!/bin/sh +/bin/echo $1 | /bin/sed -e 's,dvb\([0-9]\)\.\([^0-9]*\)\([0-9]\),dvb/adapter\1/\2\3,' +------------------------------schnipp------------------------------------------------ + +Don't forget to make the script executable with "chmod". + +1. You need to create a proper udev rule that will create the device nodes +like you know them. All real distributions out there scan the /etc/udev/rules.d +directory for rule files. The main udev configuration file /etc/udev/udev.conf +will tell you the directory where the rules are, most likely it's /etc/udev/rules.d/ + +Create a new rule file in that directory called "dvb.rule" and add the following line: +------------------------------schnipp------------------------------------------------ +KERNEL="dvb*", PROGRAM="/etc/udev/scripts/dvb.sh %k", NAME="%c" +------------------------------schnipp------------------------------------------------ + +If you want more control over the device nodes (for example a special group membership) +have a look at "man udev". + +For every device that registers to the sysfs subsystem with a "dvb" prefix, +the helper script /etc/udev/scripts/dvb.sh is invoked, which will then +create the proper device node in your /dev/ directory. diff --git a/Documentation/early-userspace/README b/Documentation/early-userspace/README new file mode 100644 index 000000000000..270a88e22fb9 --- /dev/null +++ b/Documentation/early-userspace/README @@ -0,0 +1,152 @@ +Early userspace support +======================= + +Last update: 2004-12-20 tlh + + +"Early userspace" is a set of libraries and programs that provide +various pieces of functionality that are important enough to be +available while a Linux kernel is coming up, but that don't need to be +run inside the kernel itself. + +It consists of several major infrastructure components: + +- gen_init_cpio, a program that builds a cpio-format archive + containing a root filesystem image. This archive is compressed, and + the compressed image is linked into the kernel image. +- initramfs, a chunk of code that unpacks the compressed cpio image + midway through the kernel boot process. +- klibc, a userspace C library, currently packaged separately, that is + optimized for correctness and small size. + +The cpio file format used by initramfs is the "newc" (aka "cpio -c") +format, and is documented in the file "buffer-format.txt". There are +two ways to add an early userspace image: specify an existing cpio +archive to be used as the image or have the kernel build process build +the image from specifications. + +CPIO ARCHIVE method + +You can create a cpio archive that contains the early userspace image. +Youre cpio archive should be specified in CONFIG_INITRAMFS_SOURCE and it +will be used directly. Only a single cpio file may be specified in +CONFIG_INITRAMFS_SOURCE and directory and file names are not allowed in +combination with a cpio archive. + +IMAGE BUILDING method + +The kernel build process can also build an early userspace image from +source parts rather than supplying a cpio archive. This method provides +a way to create images with root-owned files even though the image was +built by an unprivileged user. + +The image is specified as one or more sources in +CONFIG_INITRAMFS_SOURCE. Sources can be either directories or files - +cpio archives are *not* allowed when building from sources. + +A source directory will have it and all of it's contents packaged. The +specified directory name will be mapped to '/'. When packaging a +directory, limited user and group ID translation can be performed. +INITRAMFS_ROOT_UID can be set to a user ID that needs to be mapped to +user root (0). INITRAMFS_ROOT_GID can be set to a group ID that needs +to be mapped to group root (0). + +A source file must be directives in the format required by the +usr/gen_init_cpio utility (run 'usr/gen_init_cpio --help' to get the +file format). The directives in the file will be passed directly to +usr/gen_init_cpio. + +When a combination of directories and files are specified then the +initramfs image will be an aggregate of all of them. In this way a user +can create a 'root-image' directory and install all files into it. +Because device-special files cannot be created by a unprivileged user, +special files can be listed in a 'root-files' file. Both 'root-image' +and 'root-files' can be listed in CONFIG_INITRAMFS_SOURCE and a complete +early userspace image can be built by an unprivileged user. + +As a technical note, when directories and files are specified, the +entire CONFIG_INITRAMFS_SOURCE is passed to +scripts/gen_initramfs_list.sh. This means that CONFIG_INITRAMFS_SOURCE +can really be interpreted as any legal argument to +gen_initramfs_list.sh. If a directory is specified as an argument then +the contents are scanned, uid/gid translation is performed, and +usr/gen_init_cpio file directives are output. If a directory is +specified as an arugemnt to scripts/gen_initramfs_list.sh then the +contents of the file are simply copied to the output. All of the output +directives from directory scanning and file contents copying are +processed by usr/gen_init_cpio. + +See also 'scripts/gen_initramfs_list.sh -h'. + +Where's this all leading? +========================= + +The klibc distribution contains some of the necessary software to make +early userspace useful. The klibc distribution is currently +maintained separately from the kernel, but this may change early in +the 2.7 era (it missed the boat for 2.5). + +You can obtain somewhat infrequent snapshots of klibc from +ftp://ftp.kernel.org/pub/linux/libs/klibc/ + +For active users, you are better off using the klibc BitKeeper +repositories, at http://klibc.bkbits.net/ + +The standalone klibc distribution currently provides three components, +in addition to the klibc library: + +- ipconfig, a program that configures network interfaces. It can + configure them statically, or use DHCP to obtain information + dynamically (aka "IP autoconfiguration"). +- nfsmount, a program that can mount an NFS filesystem. +- kinit, the "glue" that uses ipconfig and nfsmount to replace the old + support for IP autoconfig, mount a filesystem over NFS, and continue + system boot using that filesystem as root. + +kinit is built as a single statically linked binary to save space. + +Eventually, several more chunks of kernel functionality will hopefully +move to early userspace: + +- Almost all of init/do_mounts* (the beginning of this is already in + place) +- ACPI table parsing +- Insert unwieldy subsystem that doesn't really need to be in kernel + space here + +If kinit doesn't meet your current needs and you've got bytes to burn, +the klibc distribution includes a small Bourne-compatible shell (ash) +and a number of other utilities, so you can replace kinit and build +custom initramfs images that meet your needs exactly. + +For questions and help, you can sign up for the early userspace +mailing list at http://www.zytor.com/mailman/listinfo/klibc + +How does it work? +================= + +The kernel has currently 3 ways to mount the root filesystem: + +a) all required device and filesystem drivers compiled into the kernel, no + initrd. init/main.c:init() will call prepare_namespace() to mount the + final root filesystem, based on the root= option and optional init= to run + some other init binary than listed at the end of init/main.c:init(). + +b) some device and filesystem drivers built as modules and stored in an + initrd. The initrd must contain a binary '/linuxrc' which is supposed to + load these driver modules. It is also possible to mount the final root + filesystem via linuxrc and use the pivot_root syscall. The initrd is + mounted and executed via prepare_namespace(). + +c) using initramfs. The call to prepare_namespace() must be skipped. + This means that a binary must do all the work. Said binary can be stored + into initramfs either via modifying usr/gen_init_cpio.c or via the new + initrd format, an cpio archive. It must be called "/init". This binary + is responsible to do all the things prepare_namespace() would do. + + To remain backwards compatibility, the /init binary will only run if it + comes via an initramfs cpio archive. If this is not the case, + init/main.c:init() will run prepare_namespace() to mount the final root + and exec one of the predefined init binaries. + +Bryan O'Sullivan <bos@serpentine.com> diff --git a/Documentation/early-userspace/buffer-format.txt b/Documentation/early-userspace/buffer-format.txt new file mode 100644 index 000000000000..e1fd7f9dad16 --- /dev/null +++ b/Documentation/early-userspace/buffer-format.txt @@ -0,0 +1,112 @@ + initramfs buffer format + ----------------------- + + Al Viro, H. Peter Anvin + Last revision: 2002-01-13 + +Starting with kernel 2.5.x, the old "initial ramdisk" protocol is +getting {replaced/complemented} with the new "initial ramfs" +(initramfs) protocol. The initramfs contents is passed using the same +memory buffer protocol used by the initrd protocol, but the contents +is different. The initramfs buffer contains an archive which is +expanded into a ramfs filesystem; this document details the format of +the initramfs buffer format. + +The initramfs buffer format is based around the "newc" or "crc" CPIO +formats, and can be created with the cpio(1) utility. The cpio +archive can be compressed using gzip(1). One valid version of an +initramfs buffer is thus a single .cpio.gz file. + +The full format of the initramfs buffer is defined by the following +grammar, where: + * is used to indicate "0 or more occurrences of" + (|) indicates alternatives + + indicates concatenation + GZIP() indicates the gzip(1) of the operand + ALGN(n) means padding with null bytes to an n-byte boundary + + initramfs := ("\0" | cpio_archive | cpio_gzip_archive)* + + cpio_gzip_archive := GZIP(cpio_archive) + + cpio_archive := cpio_file* + (<nothing> | cpio_trailer) + + cpio_file := ALGN(4) + cpio_header + filename + "\0" + ALGN(4) + data + + cpio_trailer := ALGN(4) + cpio_header + "TRAILER!!!\0" + ALGN(4) + + +In human terms, the initramfs buffer contains a collection of +compressed and/or uncompressed cpio archives (in the "newc" or "crc" +formats); arbitrary amounts zero bytes (for padding) can be added +between members. + +The cpio "TRAILER!!!" entry (cpio end-of-archive) is optional, but is +not ignored; see "handling of hard links" below. + +The structure of the cpio_header is as follows (all fields contain +hexadecimal ASCII numbers fully padded with '0' on the left to the +full width of the field, for example, the integer 4780 is represented +by the ASCII string "000012ac"): + +Field name Field size Meaning +c_magic 6 bytes The string "070701" or "070702" +c_ino 8 bytes File inode number +c_mode 8 bytes File mode and permissions +c_uid 8 bytes File uid +c_gid 8 bytes File gid +c_nlink 8 bytes Number of links +c_mtime 8 bytes Modification time +c_filesize 8 bytes Size of data field +c_maj 8 bytes Major part of file device number +c_min 8 bytes Minor part of file device number +c_rmaj 8 bytes Major part of device node reference +c_rmin 8 bytes Minor part of device node reference +c_namesize 8 bytes Length of filename, including final \0 +c_chksum 8 bytes Checksum of data field if c_magic is 070702; + otherwise zero + +The c_mode field matches the contents of st_mode returned by stat(2) +on Linux, and encodes the file type and file permissions. + +The c_filesize should be zero for any file which is not a regular file +or symlink. + +The c_chksum field contains a simple 32-bit unsigned sum of all the +bytes in the data field. cpio(1) refers to this as "crc", which is +clearly incorrect (a cyclic redundancy check is a different and +significantly stronger integrity check), however, this is the +algorithm used. + +If the filename is "TRAILER!!!" this is actually an end-of-archive +marker; the c_filesize for an end-of-archive marker must be zero. + + +*** Handling of hard links + +When a nondirectory with c_nlink > 1 is seen, the (c_maj,c_min,c_ino) +tuple is looked up in a tuple buffer. If not found, it is entered in +the tuple buffer and the entry is created as usual; if found, a hard +link rather than a second copy of the file is created. It is not +necessary (but permitted) to include a second copy of the file +contents; if the file contents is not included, the c_filesize field +should be set to zero to indicate no data section follows. If data is +present, the previous instance of the file is overwritten; this allows +the data-carrying instance of a file to occur anywhere in the sequence +(GNU cpio is reported to attach the data to the last instance of a +file only.) + +c_filesize must not be zero for a symlink. + +When a "TRAILER!!!" end-of-archive marker is seen, the tuple buffer is +reset. This permits archives which are generated independently to be +concatenated. + +To combine file data from different sources (without having to +regenerate the (c_maj,c_min,c_ino) fields), therefore, either one of +the following techniques can be used: + +a) Separate the different file data sources with a "TRAILER!!!" + end-of-archive marker, or + +b) Make sure c_nlink == 1 for all nondirectory entries. diff --git a/Documentation/eisa.txt b/Documentation/eisa.txt new file mode 100644 index 000000000000..8c8388da868a --- /dev/null +++ b/Documentation/eisa.txt @@ -0,0 +1,203 @@ +EISA bus support (Marc Zyngier <maz@wild-wind.fr.eu.org>) + +This document groups random notes about porting EISA drivers to the +new EISA/sysfs API. + +Starting from version 2.5.59, the EISA bus is almost given the same +status as other much more mainstream busses such as PCI or USB. This +has been possible through sysfs, which defines a nice enough set of +abstractions to manage busses, devices and drivers. + +Although the new API is quite simple to use, converting existing +drivers to the new infrastructure is not an easy task (mostly because +detection code is generally also used to probe ISA cards). Moreover, +most EISA drivers are among the oldest Linux drivers so, as you can +imagine, some dust has settled here over the years. + +The EISA infrastructure is made up of three parts : + + - The bus code implements most of the generic code. It is shared + among all the architectures that the EISA code runs on. It + implements bus probing (detecting EISA cards avaible on the bus), + allocates I/O resources, allows fancy naming through sysfs, and + offers interfaces for driver to register. + + - The bus root driver implements the glue between the bus hardware + and the generic bus code. It is responsible for discovering the + device implementing the bus, and setting it up to be latter probed + by the bus code. This can go from something as simple as reserving + an I/O region on x86, to the rather more complex, like the hppa + EISA code. This is the part to implement in order to have EISA + running on an "new" platform. + + - The driver offers the bus a list of devices that it manages, and + implements the necessary callbacks to probe and release devices + whenever told to. + +Every function/structure below lives in <linux/eisa.h>, which depends +heavily on <linux/device.h>. + +** Bus root driver : + +int eisa_root_register (struct eisa_root_device *root); + +The eisa_root_register function is used to declare a device as the +root of an EISA bus. The eisa_root_device structure holds a reference +to this device, as well as some parameters for probing purposes. + +struct eisa_root_device { + struct device *dev; /* Pointer to bridge device */ + struct resource *res; + unsigned long bus_base_addr; + int slots; /* Max slot number */ + int force_probe; /* Probe even when no slot 0 */ + u64 dma_mask; /* from bridge device */ + int bus_nr; /* Set by eisa_root_register */ + struct resource eisa_root_res; /* ditto */ +}; + +node : used for eisa_root_register internal purpose +dev : pointer to the root device +res : root device I/O resource +bus_base_addr : slot 0 address on this bus +slots : max slot number to probe +force_probe : Probe even when slot 0 is empty (no EISA mainboard) +dma_mask : Default DMA mask. Usualy the bridge device dma_mask. +bus_nr : unique bus id, set by eisa_root_register + +** Driver : + +int eisa_driver_register (struct eisa_driver *edrv); +void eisa_driver_unregister (struct eisa_driver *edrv); + +Clear enough ? + +struct eisa_device_id { + char sig[EISA_SIG_LEN]; + unsigned long driver_data; +}; + +struct eisa_driver { + const struct eisa_device_id *id_table; + struct device_driver driver; +}; + +id_table : an array of NULL terminated EISA id strings, + followed by an empty string. Each string can + optionnaly be paired with a driver-dependant value + (driver_data). + +driver : a generic driver, such as described in + Documentation/driver-model/driver.txt. Only .name, + .probe and .remove members are mandatory. + +An example is the 3c59x driver : + +static struct eisa_device_id vortex_eisa_ids[] = { + { "TCM5920", EISA_3C592_OFFSET }, + { "TCM5970", EISA_3C597_OFFSET }, + { "" } +}; + +static struct eisa_driver vortex_eisa_driver = { + .id_table = vortex_eisa_ids, + .driver = { + .name = "3c59x", + .probe = vortex_eisa_probe, + .remove = vortex_eisa_remove + } +}; + +** Device : + +The sysfs framework calls .probe and .remove functions upon device +discovery and removal (note that the .remove function is only called +when driver is built as a module). + +Both functions are passed a pointer to a 'struct device', which is +encapsulated in a 'struct eisa_device' described as follows : + +struct eisa_device { + struct eisa_device_id id; + int slot; + int state; + unsigned long base_addr; + struct resource res[EISA_MAX_RESOURCES]; + u64 dma_mask; + struct device dev; /* generic device */ +}; + +id : EISA id, as read from device. id.driver_data is set from the + matching driver EISA id. +slot : slot number which the device was detected on +state : set of flags indicating the state of the device. Current + flags are EISA_CONFIG_ENABLED and EISA_CONFIG_FORCED. +res : set of four 256 bytes I/O regions allocated to this device +dma_mask: DMA mask set from the parent device. +dev : generic device (see Documentation/driver-model/device.txt) + +You can get the 'struct eisa_device' from 'struct device' using the +'to_eisa_device' macro. + +** Misc stuff : + +void eisa_set_drvdata (struct eisa_device *edev, void *data); + +Stores data into the device's driver_data area. + +void *eisa_get_drvdata (struct eisa_device *edev): + +Gets the pointer previously stored into the device's driver_data area. + +int eisa_get_region_index (void *addr); + +Returns the region number (0 <= x < EISA_MAX_RESOURCES) of a given +address. + +** Kernel parameters : + +eisa_bus.enable_dev : + +A comma-separated list of slots to be enabled, even if the firmware +set the card as disabled. The driver must be able to properly +initialize the device in such conditions. + +eisa_bus.disable_dev : + +A comma-separated list of slots to be enabled, even if the firmware +set the card as enabled. The driver won't be called to handle this +device. + +virtual_root.force_probe : + +Force the probing code to probe EISA slots even when it cannot find an +EISA compliant mainboard (nothing appears on slot 0). Defaultd to 0 +(don't force), and set to 1 (force probing) when either +CONFIG_ALPHA_JENSEN or CONFIG_EISA_VLB_PRIMING are set. + +** Random notes : + +Converting an EISA driver to the new API mostly involves *deleting* +code (since probing is now in the core EISA code). Unfortunately, most +drivers share their probing routine between ISA, MCA and EISA. Special +care must be taken when ripping out the EISA code, so other busses +won't suffer from these surgical strikes... + +You *must not* expect any EISA device to be detected when returning +from eisa_driver_register, since the chances are that the bus has not +yet been probed. In fact, that's what happens most of the time (the +bus root driver usually kicks in rather late in the boot process). +Unfortunately, most drivers are doing the probing by themselves, and +expect to have explored the whole machine when they exit their probe +routine. + +For example, switching your favorite EISA SCSI card to the "hotplug" +model is "the right thing"(tm). + +** Thanks : + +I'd like to thank the following people for their help : +- Xavier Benigni for lending me a wonderful Alpha Jensen, +- James Bottomley, Jeff Garzik for getting this stuff into the kernel, +- Andries Brouwer for contributing numerous EISA ids, +- Catrin Jones for coping with far too many machines at home. diff --git a/Documentation/exception.txt b/Documentation/exception.txt new file mode 100644 index 000000000000..f1d436993eb1 --- /dev/null +++ b/Documentation/exception.txt @@ -0,0 +1,292 @@ + Kernel level exception handling in Linux 2.1.8 + Commentary by Joerg Pommnitz <joerg@raleigh.ibm.com> + +When a process runs in kernel mode, it often has to access user +mode memory whose address has been passed by an untrusted program. +To protect itself the kernel has to verify this address. + +In older versions of Linux this was done with the +int verify_area(int type, const void * addr, unsigned long size) +function. + +This function verified that the memory area starting at address +addr and of size size was accessible for the operation specified +in type (read or write). To do this, verify_read had to look up the +virtual memory area (vma) that contained the address addr. In the +normal case (correctly working program), this test was successful. +It only failed for a few buggy programs. In some kernel profiling +tests, this normally unneeded verification used up a considerable +amount of time. + +To overcome this situation, Linus decided to let the virtual memory +hardware present in every Linux-capable CPU handle this test. + +How does this work? + +Whenever the kernel tries to access an address that is currently not +accessible, the CPU generates a page fault exception and calls the +page fault handler + +void do_page_fault(struct pt_regs *regs, unsigned long error_code) + +in arch/i386/mm/fault.c. The parameters on the stack are set up by +the low level assembly glue in arch/i386/kernel/entry.S. The parameter +regs is a pointer to the saved registers on the stack, error_code +contains a reason code for the exception. + +do_page_fault first obtains the unaccessible address from the CPU +control register CR2. If the address is within the virtual address +space of the process, the fault probably occurred, because the page +was not swapped in, write protected or something similar. However, +we are interested in the other case: the address is not valid, there +is no vma that contains this address. In this case, the kernel jumps +to the bad_area label. + +There it uses the address of the instruction that caused the exception +(i.e. regs->eip) to find an address where the execution can continue +(fixup). If this search is successful, the fault handler modifies the +return address (again regs->eip) and returns. The execution will +continue at the address in fixup. + +Where does fixup point to? + +Since we jump to the contents of fixup, fixup obviously points +to executable code. This code is hidden inside the user access macros. +I have picked the get_user macro defined in include/asm/uaccess.h as an +example. The definition is somewhat hard to follow, so let's peek at +the code generated by the preprocessor and the compiler. I selected +the get_user call in drivers/char/console.c for a detailed examination. + +The original code in console.c line 1405: + get_user(c, buf); + +The preprocessor output (edited to become somewhat readable): + +( + { + long __gu_err = - 14 , __gu_val = 0; + const __typeof__(*( ( buf ) )) *__gu_addr = ((buf)); + if (((((0 + current_set[0])->tss.segment) == 0x18 ) || + (((sizeof(*(buf))) <= 0xC0000000UL) && + ((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf))))))) + do { + __gu_err = 0; + switch ((sizeof(*(buf)))) { + case 1: + __asm__ __volatile__( + "1: mov" "b" " %2,%" "b" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "b" " %" "b" "1,%" "b" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" + " .long 1b,3b\n" + ".text" : "=r"(__gu_err), "=q" (__gu_val): "m"((*(struct __large_struct *) + ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ; + break; + case 2: + __asm__ __volatile__( + "1: mov" "w" " %2,%" "w" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "w" " %" "w" "1,%" "w" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" + " .long 1b,3b\n" + ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *) + ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )); + break; + case 4: + __asm__ __volatile__( + "1: mov" "l" " %2,%" "" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "l" " %" "" "1,%" "" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" " .long 1b,3b\n" + ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *) + ( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err)); + break; + default: + (__gu_val) = __get_user_bad(); + } + } while (0) ; + ((c)) = (__typeof__(*((buf))))__gu_val; + __gu_err; + } +); + +WOW! Black GCC/assembly magic. This is impossible to follow, so let's +see what code gcc generates: + + > xorl %edx,%edx + > movl current_set,%eax + > cmpl $24,788(%eax) + > je .L1424 + > cmpl $-1073741825,64(%esp) + > ja .L1423 + > .L1424: + > movl %edx,%eax + > movl 64(%esp),%ebx + > #APP + > 1: movb (%ebx),%dl /* this is the actual user access */ + > 2: + > .section .fixup,"ax" + > 3: movl $-14,%eax + > xorb %dl,%dl + > jmp 2b + > .section __ex_table,"a" + > .align 4 + > .long 1b,3b + > .text + > #NO_APP + > .L1423: + > movzbl %dl,%esi + +The optimizer does a good job and gives us something we can actually +understand. Can we? The actual user access is quite obvious. Thanks +to the unified address space we can just access the address in user +memory. But what does the .section stuff do????? + +To understand this we have to look at the final kernel: + + > objdump --section-headers vmlinux + > + > vmlinux: file format elf32-i386 + > + > Sections: + > Idx Name Size VMA LMA File off Algn + > 0 .text 00098f40 c0100000 c0100000 00001000 2**4 + > CONTENTS, ALLOC, LOAD, READONLY, CODE + > 1 .fixup 000016bc c0198f40 c0198f40 00099f40 2**0 + > CONTENTS, ALLOC, LOAD, READONLY, CODE + > 2 .rodata 0000f127 c019a5fc c019a5fc 0009b5fc 2**2 + > CONTENTS, ALLOC, LOAD, READONLY, DATA + > 3 __ex_table 000015c0 c01a9724 c01a9724 000aa724 2**2 + > CONTENTS, ALLOC, LOAD, READONLY, DATA + > 4 .data 0000ea58 c01abcf0 c01abcf0 000abcf0 2**4 + > CONTENTS, ALLOC, LOAD, DATA + > 5 .bss 00018e21 c01ba748 c01ba748 000ba748 2**2 + > ALLOC + > 6 .comment 00000ec4 00000000 00000000 000ba748 2**0 + > CONTENTS, READONLY + > 7 .note 00001068 00000ec4 00000ec4 000bb60c 2**0 + > CONTENTS, READONLY + +There are obviously 2 non standard ELF sections in the generated object +file. But first we want to find out what happened to our code in the +final kernel executable: + + > objdump --disassemble --section=.text vmlinux + > + > c017e785 <do_con_write+c1> xorl %edx,%edx + > c017e787 <do_con_write+c3> movl 0xc01c7bec,%eax + > c017e78c <do_con_write+c8> cmpl $0x18,0x314(%eax) + > c017e793 <do_con_write+cf> je c017e79f <do_con_write+db> + > c017e795 <do_con_write+d1> cmpl $0xbfffffff,0x40(%esp,1) + > c017e79d <do_con_write+d9> ja c017e7a7 <do_con_write+e3> + > c017e79f <do_con_write+db> movl %edx,%eax + > c017e7a1 <do_con_write+dd> movl 0x40(%esp,1),%ebx + > c017e7a5 <do_con_write+e1> movb (%ebx),%dl + > c017e7a7 <do_con_write+e3> movzbl %dl,%esi + +The whole user memory access is reduced to 10 x86 machine instructions. +The instructions bracketed in the .section directives are no longer +in the normal execution path. They are located in a different section +of the executable file: + + > objdump --disassemble --section=.fixup vmlinux + > + > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax + > c0199ffa <.fixup+10ba> xorb %dl,%dl + > c0199ffc <.fixup+10bc> jmp c017e7a7 <do_con_write+e3> + +And finally: + > objdump --full-contents --section=__ex_table vmlinux + > + > c01aa7c4 93c017c0 e09f19c0 97c017c0 99c017c0 ................ + > c01aa7d4 f6c217c0 e99f19c0 a5e717c0 f59f19c0 ................ + > c01aa7e4 080a18c0 01a019c0 0a0a18c0 04a019c0 ................ + +or in human readable byte order: + + > c01aa7c4 c017c093 c0199fe0 c017c097 c017c099 ................ + > c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................ + ^^^^^^^^^^^^^^^^^ + this is the interesting part! + > c01aa7e4 c0180a08 c019a001 c0180a0a c019a004 ................ + +What happened? The assembly directives + +.section .fixup,"ax" +.section __ex_table,"a" + +told the assembler to move the following code to the specified +sections in the ELF object file. So the instructions +3: movl $-14,%eax + xorb %dl,%dl + jmp 2b +ended up in the .fixup section of the object file and the addresses + .long 1b,3b +ended up in the __ex_table section of the object file. 1b and 3b +are local labels. The local label 1b (1b stands for next label 1 +backward) is the address of the instruction that might fault, i.e. +in our case the address of the label 1 is c017e7a5: +the original assembly code: > 1: movb (%ebx),%dl +and linked in vmlinux : > c017e7a5 <do_con_write+e1> movb (%ebx),%dl + +The local label 3 (backwards again) is the address of the code to handle +the fault, in our case the actual value is c0199ff5: +the original assembly code: > 3: movl $-14,%eax +and linked in vmlinux : > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax + +The assembly code + > .section __ex_table,"a" + > .align 4 + > .long 1b,3b + +becomes the value pair + > c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................ + ^this is ^this is + 1b 3b +c017e7a5,c0199ff5 in the exception table of the kernel. + +So, what actually happens if a fault from kernel mode with no suitable +vma occurs? + +1.) access to invalid address: + > c017e7a5 <do_con_write+e1> movb (%ebx),%dl +2.) MMU generates exception +3.) CPU calls do_page_fault +4.) do page fault calls search_exception_table (regs->eip == c017e7a5); +5.) search_exception_table looks up the address c017e7a5 in the + exception table (i.e. the contents of the ELF section __ex_table) + and returns the address of the associated fault handle code c0199ff5. +6.) do_page_fault modifies its own return address to point to the fault + handle code and returns. +7.) execution continues in the fault handling code. +8.) 8a) EAX becomes -EFAULT (== -14) + 8b) DL becomes zero (the value we "read" from user space) + 8c) execution continues at local label 2 (address of the + instruction immediately after the faulting user access). + +The steps 8a to 8c in a certain way emulate the faulting instruction. + +That's it, mostly. If you look at our example, you might ask why +we set EAX to -EFAULT in the exception handler code. Well, the +get_user macro actually returns a value: 0, if the user access was +successful, -EFAULT on failure. Our original code did not test this +return value, however the inline assembly code in get_user tries to +return -EFAULT. GCC selected EAX to return this value. + +NOTE: +Due to the way that the exception table is built and needs to be ordered, +only use exceptions for code in the .text section. Any other section +will cause the exception table to not be sorted correctly, and the +exceptions will fail. diff --git a/Documentation/fb/00-INDEX b/Documentation/fb/00-INDEX new file mode 100644 index 000000000000..92e89aeef52e --- /dev/null +++ b/Documentation/fb/00-INDEX @@ -0,0 +1,25 @@ +Index of files in Documentation/fb. If you think something about frame +buffer devices needs an entry here, needs correction or you've written one +please mail me. + Geert Uytterhoeven <geert@linux-m68k.org> + +00-INDEX + - this file +framebuffer.txt + - introduction to frame buffer devices +internals.txt + - quick overview of frame buffer device internals +modedb.txt + - info on the video mode database +aty128fb.txt + - info on the ATI Rage128 frame buffer driver +clgenfb.txt + - info on the Cirrus Logic frame buffer driver +matroxfb.txt + - info on the Matrox frame buffer driver +pvr2fb.txt + - info on the PowerVR 2 frame buffer driver +tgafb.txt + - info on the TGA (DECChip 21030) frame buffer driver +vesafb.txt + - info on the VESA frame buffer device diff --git a/Documentation/fb/aty128fb.txt b/Documentation/fb/aty128fb.txt new file mode 100644 index 000000000000..069262fb619d --- /dev/null +++ b/Documentation/fb/aty128fb.txt @@ -0,0 +1,72 @@ +[This file is cloned from VesaFB/matroxfb] + +What is aty128fb? +================= + +This is a driver for a graphic framebuffer for ATI Rage128 based devices +on Intel and PPC boxes. + +Advantages: + + * It provides a nice large console (128 cols + 48 lines with 1024x768) + without using tiny, unreadable fonts. + * You can run XF68_FBDev on top of /dev/fb0 + * Most important: boot logo :-) + +Disadvantages: + + * graphic mode is slower than text mode... but you should not notice + if you use same resolution as you used in textmode. + * still experimental. + + +How to use it? +============== + +Switching modes is done using the video=aty128fb:<resolution>... modedb +boot parameter or using `fbset' program. + +See Documentation/fb/modedb.txt for more information on modedb +resolutions. + +You should compile in both vgacon (to boot if you remove your Rage128 from +box) and aty128fb (for graphics mode). You should not compile-in vesafb +unless you have primary display on non-Rage128 VBE2.0 device (see +Documentation/fb/vesafb.txt for details). + + +X11 +=== + +XF68_FBDev should generally work fine, but it is non-accelerated. As of +this document, 8 and 32bpp works fine. There have been palette issues +when switching from X to console and back to X. You will have to restart +X to fix this. + + +Configuration +============= + +You can pass kernel command line options to vesafb with +`video=aty128fb:option1,option2:value2,option3' (multiple options should +be separated by comma, values are separated from options by `:'). +Accepted options: + +noaccel - do not use acceleration engine. It is default. +accel - use acceleration engine. Not finished. +vmode:x - chooses PowerMacintosh video mode <x>. Depreciated. +cmode:x - chooses PowerMacintosh colour mode <x>. Depreciated. +<XxX@X> - selects startup videomode. See modedb.txt for detailed + explanation. Default is 640x480x8bpp. + + +Limitations +=========== + +There are known and unknown bugs, features and misfeatures. +Currently there are following known bugs: + + This driver is still experimental and is not finished. Too many + bugs/errata to list here. + +-- +Brad Douglas <brad@neruo.com> diff --git a/Documentation/fb/cirrusfb.txt b/Documentation/fb/cirrusfb.txt new file mode 100644 index 000000000000..f9436843e998 --- /dev/null +++ b/Documentation/fb/cirrusfb.txt @@ -0,0 +1,97 @@ + + Framebuffer driver for Cirrus Logic chipsets + Copyright 1999 Jeff Garzik <jgarzik@pobox.com> + + + +{ just a little something to get people going; contributors welcome! } + + + +Chip families supported: + SD64 + Piccolo + Picasso + Spectrum + Alpine (GD-543x/4x) + Picasso4 (GD-5446) + GD-5480 + Laguna (GD-546x) + +Bus's supported: + PCI + Zorro + +Architectures supported: + i386 + Alpha + PPC (Motorola Powerstack) + m68k (Amiga) + + + +Default video modes +------------------- +At the moment, there are two kernel command line arguments supported: + +mode:640x480 +mode:800x600 + or +mode:1024x768 + +Full support for startup video modes (modedb) will be integrated soon. + +Version 1.9.9.1 +--------------- +* Fix memory detection for 512kB case +* 800x600 mode +* Fixed timings +* Hint for AXP: Use -accel false -vyres -1 when changing resolution + + +Version 1.9.4.4 +--------------- +* Preliminary Laguna support +* Overhaul color register routines. +* Associated with the above, console colors are now obtained from a LUT + called 'palette' instead of from the VGA registers. This code was + modeled after that in atyfb and matroxfb. +* Code cleanup, add comments. +* Overhaul SR07 handling. +* Bug fixes. + + +Version 1.9.4.3 +--------------- +* Correctly set default startup video mode. +* Do not override ram size setting. Define + CLGEN_USE_HARDCODED_RAM_SETTINGS if you _do_ want to override the RAM + setting. +* Compile fixes related to new 2.3.x IORESOURCE_IO[PORT] symbol changes. +* Use new 2.3.x resource allocation. +* Some code cleanup. + + +Version 1.9.4.2 +--------------- +* Casting fixes. +* Assertions no longer cause an oops on purpose. +* Bug fixes. + + +Version 1.9.4.1 +--------------- +* Add compatibility support. Now requires a 2.1.x, 2.2.x or 2.3.x kernel. + + +Version 1.9.4 +------------- +* Several enhancements, smaller memory footprint, a few bugfixes. +* Requires kernel 2.3.14-pre1 or later. + + +Version 1.9.3 +------------- +* Bundled with kernel 2.3.14-pre1 or later. + + diff --git a/Documentation/fb/framebuffer.txt b/Documentation/fb/framebuffer.txt new file mode 100644 index 000000000000..610e7801207b --- /dev/null +++ b/Documentation/fb/framebuffer.txt @@ -0,0 +1,345 @@ + The Frame Buffer Device + ----------------------- + +Maintained by Geert Uytterhoeven <geert@linux-m68k.org> +Last revised: May 10, 2001 + + +0. Introduction +--------------- + +The frame buffer device provides an abstraction for the graphics hardware. It +represents the frame buffer of some video hardware and allows application +software to access the graphics hardware through a well-defined interface, so +the software doesn't need to know anything about the low-level (hardware +register) stuff. + +The device is accessed through special device nodes, usually located in the +/dev directory, i.e. /dev/fb*. + + +1. User's View of /dev/fb* +-------------------------- + +From the user's point of view, the frame buffer device looks just like any +other device in /dev. It's a character device using major 29; the minor +specifies the frame buffer number. + +By convention, the following device nodes are used (numbers indicate the device +minor numbers): + + 0 = /dev/fb0 First frame buffer + 1 = /dev/fb1 Second frame buffer + ... + 31 = /dev/fb31 32nd frame buffer + +For backwards compatibility, you may want to create the following symbolic +links: + + /dev/fb0current -> fb0 + /dev/fb1current -> fb1 + +and so on... + +The frame buffer devices are also `normal' memory devices, this means, you can +read and write their contents. You can, for example, make a screen snapshot by + + cp /dev/fb0 myfile + +There also can be more than one frame buffer at a time, e.g. if you have a +graphics card in addition to the built-in hardware. The corresponding frame +buffer devices (/dev/fb0 and /dev/fb1 etc.) work independently. + +Application software that uses the frame buffer device (e.g. the X server) will +use /dev/fb0 by default (older software uses /dev/fb0current). You can specify +an alternative frame buffer device by setting the environment variable +$FRAMEBUFFER to the path name of a frame buffer device, e.g. (for sh/bash +users): + + export FRAMEBUFFER=/dev/fb1 + +or (for csh users): + + setenv FRAMEBUFFER /dev/fb1 + +After this the X server will use the second frame buffer. + + +2. Programmer's View of /dev/fb* +-------------------------------- + +As you already know, a frame buffer device is a memory device like /dev/mem and +it has the same features. You can read it, write it, seek to some location in +it and mmap() it (the main usage). The difference is just that the memory that +appears in the special file is not the whole memory, but the frame buffer of +some video hardware. + +/dev/fb* also allows several ioctls on it, by which lots of information about +the hardware can be queried and set. The color map handling works via ioctls, +too. Look into <linux/fb.h> for more information on what ioctls exist and on +which data structures they work. Here's just a brief overview: + + - You can request unchangeable information about the hardware, like name, + organization of the screen memory (planes, packed pixels, ...) and address + and length of the screen memory. + + - You can request and change variable information about the hardware, like + visible and virtual geometry, depth, color map format, timing, and so on. + If you try to change that information, the driver maybe will round up some + values to meet the hardware's capabilities (or return EINVAL if that isn't + possible). + + - You can get and set parts of the color map. Communication is done with 16 + bits per color part (red, green, blue, transparency) to support all + existing hardware. The driver does all the computations needed to apply + it to the hardware (round it down to less bits, maybe throw away + transparency). + +All this hardware abstraction makes the implementation of application programs +easier and more portable. E.g. the X server works completely on /dev/fb* and +thus doesn't need to know, for example, how the color registers of the concrete +hardware are organized. XF68_FBDev is a general X server for bitmapped, +unaccelerated video hardware. The only thing that has to be built into +application programs is the screen organization (bitplanes or chunky pixels +etc.), because it works on the frame buffer image data directly. + +For the future it is planned that frame buffer drivers for graphics cards and +the like can be implemented as kernel modules that are loaded at runtime. Such +a driver just has to call register_framebuffer() and supply some functions. +Writing and distributing such drivers independently from the kernel will save +much trouble... + + +3. Frame Buffer Resolution Maintenance +-------------------------------------- + +Frame buffer resolutions are maintained using the utility `fbset'. It can +change the video mode properties of a frame buffer device. Its main usage is +to change the current video mode, e.g. during boot up in one of your /etc/rc.* +or /etc/init.d/* files. + +Fbset uses a video mode database stored in a configuration file, so you can +easily add your own modes and refer to them with a simple identifier. + + +4. The X Server +--------------- + +The X server (XF68_FBDev) is the most notable application program for the frame +buffer device. Starting with XFree86 release 3.2, the X server is part of +XFree86 and has 2 modes: + + - If the `Display' subsection for the `fbdev' driver in the /etc/XF86Config + file contains a + + Modes "default" + + line, the X server will use the scheme discussed above, i.e. it will start + up in the resolution determined by /dev/fb0 (or $FRAMEBUFFER, if set). You + still have to specify the color depth (using the Depth keyword) and virtual + resolution (using the Virtual keyword) though. This is the default for the + configuration file supplied with XFree86. It's the most simple + configuration, but it has some limitations. + + - Therefore it's also possible to specify resolutions in the /etc/XF86Config + file. This allows for on-the-fly resolution switching while retaining the + same virtual desktop size. The frame buffer device that's used is still + /dev/fb0current (or $FRAMEBUFFER), but the available resolutions are + defined by /etc/XF86Config now. The disadvantage is that you have to + specify the timings in a different format (but `fbset -x' may help). + +To tune a video mode, you can use fbset or xvidtune. Note that xvidtune doesn't +work 100% with XF68_FBDev: the reported clock values are always incorrect. + + +5. Video Mode Timings +--------------------- + +A monitor draws an image on the screen by using an electron beam (3 electron +beams for color models, 1 electron beam for monochrome monitors). The front of +the screen is covered by a pattern of colored phosphors (pixels). If a phosphor +is hit by an electron, it emits a photon and thus becomes visible. + +The electron beam draws horizontal lines (scanlines) from left to right, and +from the top to the bottom of the screen. By modifying the intensity of the +electron beam, pixels with various colors and intensities can be shown. + +After each scanline the electron beam has to move back to the left side of the +screen and to the next line: this is called the horizontal retrace. After the +whole screen (frame) was painted, the beam moves back to the upper left corner: +this is called the vertical retrace. During both the horizontal and vertical +retrace, the electron beam is turned off (blanked). + +The speed at which the electron beam paints the pixels is determined by the +dotclock in the graphics board. For a dotclock of e.g. 28.37516 MHz (millions +of cycles per second), each pixel is 35242 ps (picoseconds) long: + + 1/(28.37516E6 Hz) = 35.242E-9 s + +If the screen resolution is 640x480, it will take + + 640*35.242E-9 s = 22.555E-6 s + +to paint the 640 (xres) pixels on one scanline. But the horizontal retrace +also takes time (e.g. 272 `pixels'), so a full scanline takes + + (640+272)*35.242E-9 s = 32.141E-6 s + +We'll say that the horizontal scanrate is about 31 kHz: + + 1/(32.141E-6 s) = 31.113E3 Hz + +A full screen counts 480 (yres) lines, but we have to consider the vertical +retrace too (e.g. 49 `lines'). So a full screen will take + + (480+49)*32.141E-6 s = 17.002E-3 s + +The vertical scanrate is about 59 Hz: + + 1/(17.002E-3 s) = 58.815 Hz + +This means the screen data is refreshed about 59 times per second. To have a +stable picture without visible flicker, VESA recommends a vertical scanrate of +at least 72 Hz. But the perceived flicker is very human dependent: some people +can use 50 Hz without any trouble, while I'll notice if it's less than 80 Hz. + +Since the monitor doesn't know when a new scanline starts, the graphics board +will supply a synchronization pulse (horizontal sync or hsync) for each +scanline. Similarly it supplies a synchronization pulse (vertical sync or +vsync) for each new frame. The position of the image on the screen is +influenced by the moments at which the synchronization pulses occur. + +The following picture summarizes all timings. The horizontal retrace time is +the sum of the left margin, the right margin and the hsync length, while the +vertical retrace time is the sum of the upper margin, the lower margin and the +vsync length. + + +----------+---------------------------------------------+----------+-------+ + | | ^ | | | + | | |upper_margin | | | + | | ¥ | | | + +----------###############################################----------+-------+ + | # ^ # | | + | # | # | | + | # | # | | + | # | # | | + | left # | # right | hsync | + | margin # | xres # margin | len | + |<-------->#<---------------+--------------------------->#<-------->|<----->| + | # | # | | + | # | # | | + | # | # | | + | # |yres # | | + | # | # | | + | # | # | | + | # | # | | + | # | # | | + | # | # | | + | # | # | | + | # | # | | + | # | # | | + | # ¥ # | | + +----------###############################################----------+-------+ + | | ^ | | | + | | |lower_margin | | | + | | ¥ | | | + +----------+---------------------------------------------+----------+-------+ + | | ^ | | | + | | |vsync_len | | | + | | ¥ | | | + +----------+---------------------------------------------+----------+-------+ + +The frame buffer device expects all horizontal timings in number of dotclocks +(in picoseconds, 1E-12 s), and vertical timings in number of scanlines. + + +6. Converting XFree86 timing values info frame buffer device timings +-------------------------------------------------------------------- + +An XFree86 mode line consists of the following fields: + "800x600" 50 800 856 976 1040 600 637 643 666 + < name > DCF HR SH1 SH2 HFL VR SV1 SV2 VFL + +The frame buffer device uses the following fields: + + - pixclock: pixel clock in ps (pico seconds) + - left_margin: time from sync to picture + - right_margin: time from picture to sync + - upper_margin: time from sync to picture + - lower_margin: time from picture to sync + - hsync_len: length of horizontal sync + - vsync_len: length of vertical sync + +1) Pixelclock: + xfree: in MHz + fb: in picoseconds (ps) + + pixclock = 1000000 / DCF + +2) horizontal timings: + left_margin = HFL - SH2 + right_margin = SH1 - HR + hsync_len = SH2 - SH1 + +3) vertical timings: + upper_margin = VFL - SV2 + lower_margin = SV1 - VR + vsync_len = SV2 - SV1 + +Good examples for VESA timings can be found in the XFree86 source tree, +under "xc/programs/Xserver/hw/xfree86/doc/modeDB.txt". + + +7. References +------------- + +For more specific information about the frame buffer device and its +applications, please refer to the Linux-fbdev website: + + http://linux-fbdev.sourceforge.net/ + +and to the following documentation: + + - The manual pages for fbset: fbset(8), fb.modes(5) + - The manual pages for XFree86: XF68_FBDev(1), XF86Config(4/5) + - The mighty kernel sources: + o linux/drivers/video/ + o linux/include/linux/fb.h + o linux/include/video/ + + + +8. Mailing list +--------------- + +There are several frame buffer device related mailing lists at SourceForge: + - linux-fbdev-announce@lists.sourceforge.net, for announcements, + - linux-fbdev-user@lists.sourceforge.net, for generic user support, + - linux-fbdev-devel@lists.sourceforge.net, for project developers. + +Point your web browser to http://sourceforge.net/projects/linux-fbdev/ for +subscription information and archive browsing. + + +9. Downloading +-------------- + +All necessary files can be found at + + ftp://ftp.uni-erlangen.de/pub/Linux/LOCAL/680x0/ + +and on its mirrors. + +The latest version of fbset can be found at + + http://home.tvd.be/cr26864/Linux/fbdev/ + + +10. Credits +---------- + +This readme was written by Geert Uytterhoeven, partly based on the original +`X-framebuffer.README' by Roman Hodek and Martin Schaller. Section 6 was +provided by Frank Neumann. + +The frame buffer device abstraction was designed by Martin Schaller. diff --git a/Documentation/fb/intel810.txt b/Documentation/fb/intel810.txt new file mode 100644 index 000000000000..fd68b162e4a1 --- /dev/null +++ b/Documentation/fb/intel810.txt @@ -0,0 +1,272 @@ +Intel 810/815 Framebuffer driver + Tony Daplas <adaplas@pol.net> + http://i810fb.sourceforge.net + + March 17, 2002 + + First Released: July 2001 +================================================================ + +A. Introduction + This is a framebuffer driver for various Intel 810/815 compatible +graphics devices. These would include: + + Intel 810 + Intel 810E + Intel 810-DC100 + Intel 815 Internal graphics only, 100Mhz FSB + Intel 815 Internal graphics only + Intel 815 Internal graphics and AGP + +B. Features + + - Choice of using Discrete Video Timings, VESA Generalized Timing + Formula, or a framebuffer specific database to set the video mode + + - Supports a variable range of horizontal and vertical resolution, and + vertical refresh rates if the VESA Generalized Timing Formula is + enabled. + + - Supports color depths of 8, 16, 24 and 32 bits per pixel + + - Supports pseudocolor, directcolor, or truecolor visuals + + - Full and optimized hardware acceleration at 8, 16 and 24 bpp + + - Robust video state save and restore + + - MTRR support + + - Utilizes user-entered monitor specifications to automatically + calculate required video mode parameters. + + - Can concurrently run with xfree86 running with native i810 drivers + + - Hardware Cursor Support + +C. List of available options + + a. "video=i810fb" + enables the i810 driver + + Recommendation: required + + b. "xres:<value>" + select horizontal resolution in pixels + + Recommendation: user preference + (default = 640) + + c. "yres:<value>" + select vertical resolution in scanlines. If Discrete Video Timings + is enabled, this will be ignored and computed as 3*xres/4. + + Recommendation: user preference + (default = 480) + + d. "vyres:<value>" + select virtual vertical resolution in scanlines. If (0) or none + is specified, this will be computed against maximum available memory. + + Recommendation: do not set + (default = 480) + + e. "vram:<value>" + select amount of system RAM in MB to allocate for the video memory + + Recommendation: 1 - 4 MB. + (default = 4) + + f. "bpp:<value>" + select desired pixel depth + + Recommendation: 8 + (default = 8) + + g. "hsync1/hsync2:<value>" + select the minimum and maximum Horizontal Sync Frequency of the + monitor in KHz. If a using a fixed frequency monitor, hsync1 must + be equal to hsync2. + + Recommendation: check monitor manual for correct values + default (29/30) + + h. "vsync1/vsync2:<value>" + select the minimum and maximum Vertical Sync Frequency of the monitor + in Hz. You can also use this option to lock your monitor's refresh + rate. + + Recommendation: check monitor manual for correct values + (default = 60/60) + + IMPORTANT: If you need to clamp your timings, try to give some + leeway for computational errors (over/underflows). Example: if + using vsync1/vsync2 = 60/60, make sure hsync1/hsync2 has at least + a 1 unit difference, and vice versa. + + i. "voffset:<value>" + select at what offset in MB of the logical memory to allocate the + framebuffer memory. The intent is to avoid the memory blocks + used by standard graphics applications (XFree86). The default + offset (16 MB for a 64MB aperture, 8 MB for a 32MB aperture) will + avoid XFree86's usage and allows up to 7MB/15MB of framebuffer + memory. Depending on your usage, adjust the value up or down, + (0 for maximum usage, 31/63 MB for the least amount). Note, an + arbitrary setting may conflict with XFree86. + + Recommendation: do not set + (default = 8 or 16 MB) + + j. "accel" + enable text acceleration. This can be enabled/reenabled anytime + by using 'fbset -accel true/false'. + + Recommendation: enable + (default = not set) + + k. "mtrr" + enable MTRR. This allows data transfers to the framebuffer memory + to occur in bursts which can significantly increase performance. + Not very helpful with the i810/i815 because of 'shared memory'. + + Recommendation: do not set + (default = not set) + + l. "extvga" + if specified, secondary/external VGA output will always be enabled. + Useful if the BIOS turns off the VGA port when no monitor is attached. + The external VGA monitor can then be attached without rebooting. + + Recommendation: do not set + (default = not set) + + m. "sync" + Forces the hardware engine to do a "sync" or wait for the hardware + to finish before starting another instruction. This will produce a + more stable setup, but will be slower. + + Recommendation: do not set + (default = not set) + + n. "dcolor" + Use directcolor visual instead of truecolor for pixel depths greater + than 8 bpp. Useful for color tuning, such as gamma control. + + Recommendation: do not set + (default = not set) + +D. Kernel booting + +Separate each option/option-pair by commas (,) and the option from its value +with a colon (:) as in the following: + +video=i810fb:option1,option2:value2 + +Sample Usage +------------ + +In /etc/lilo.conf, add the line: + +append="video=i810fb:vram:2,xres:1024,yres:768,bpp:8,hsync1:30,hsync2:55, \ + vsync1:50,vsync2:85,accel,mtrr" + +This will initialize the framebuffer to 1024x768 at 8bpp. The framebuffer +will use 2 MB of System RAM. MTRR support will be enabled. The refresh rate +will be computed based on the hsync1/hsync2 and vsync1/vsync2 values. + +IMPORTANT: +You must include hsync1, hsync2, vsync1 and vsync2 to enable video modes +better than 640x480 at 60Hz. + +E. Module options + + The module parameters are essentially similar to the kernel +parameters. The main difference is that you need to include a Boolean value +(1 for TRUE, and 0 for FALSE) for those options which don't need a value. + +Example, to enable MTRR, include "mtrr=1". + +Sample Usage +------------ + +Using the same setup as described above, load the module like this: + + modprobe i810fb vram=2 xres=1024 bpp=8 hsync1=30 hsync2=55 vsync1=50 \ + vsync2=85 accel=1 mtrr=1 + +Or just add the following to /etc/modprobe.conf + + options i810fb vram=2 xres=1024 bpp=16 hsync1=30 hsync2=55 vsync1=50 \ + vsync2=85 accel=1 mtrr=1 + +and just do a + + modprobe i810fb + + +F. Setup + + a. Do your usual method of configuring the kernel. + + make menuconfig/xconfig/config + + b. Under "Code Maturity Options", enable "Prompt for experimental/ + incomplete code/drivers". + + c. Enable agpgart support for the Intel 810/815 on-board graphics. + This is required. The option is under "Character Devices" + + d. Under "Graphics Support", select "Intel 810/815" either statically + or as a module. Choose "use VESA GTF for video timings" if you + need to maximize the capability of your display. To be on the + safe side, you can leave this unselected. + + e. If you want a framebuffer console, enable it under "Console + Drivers" + + f. Compile your kernel. + + g. Load the driver as described in section D and E. + + Optional: + h. If you are going to run XFree86 with its native drivers, the + standard XFree86 4.1.0 and 4.2.0 drivers should work as is. + However, there's a bug in the XFree86 i810 drivers. It attempts + to use XAA even when switched to the console. This will crash + your server. I have a fix at this site: + + http://i810fb.sourceforge.net. + + You can either use the patch, or just replace + + /usr/X11R6/lib/modules/drivers/i810_drv.o + + with the one provided at the website. + + i. Try the DirectFB (http://www.directfb.org) + the i810 gfxdriver + patch to see the chipset in action (or inaction :-). + +G. Acknowledgment: + + 1. Geert Uytterhoeven - his excellent howto and the virtual + framebuffer driver code made this possible. + + 2. Jeff Hartmann for his agpgart code. + + 3. The X developers. Insights were provided just by reading the + XFree86 source code. + + 4. Intel(c). For this value-oriented chipset driver and for + providing documentation. + + 5. Matt Sottek. His inputs and ideas helped in making some + optimizations possible. + +H. Home Page: + + A more complete, and probably updated information is provided at +http://i810fb.sourceforge.net. + +########################### +Tony + diff --git a/Documentation/fb/internals.txt b/Documentation/fb/internals.txt new file mode 100644 index 000000000000..9b2a2b2f3e57 --- /dev/null +++ b/Documentation/fb/internals.txt @@ -0,0 +1,82 @@ + +This is a first start for some documentation about frame buffer device +internals. + +Geert Uytterhoeven <geert@linux-m68k.org>, 21 July 1998 +James Simmons <jsimmons@user.sf.net>, Nov 26 2002 + +-------------------------------------------------------------------------------- + + *** STRUCTURES USED BY THE FRAME BUFFER DEVICE API *** + +The following structures play a role in the game of frame buffer devices. They +are defined in <linux/fb.h>. + +1. Outside the kernel (user space) + + - struct fb_fix_screeninfo + + Device independent unchangeable information about a frame buffer device and + a specific video mode. This can be obtained using the FBIOGET_FSCREENINFO + ioctl. + + - struct fb_var_screeninfo + + Device independent changeable information about a frame buffer device and a + specific video mode. This can be obtained using the FBIOGET_VSCREENINFO + ioctl, and updated with the FBIOPUT_VSCREENINFO ioctl. If you want to pan + the screen only, you can use the FBIOPAN_DISPLAY ioctl. + + - struct fb_cmap + + Device independent colormap information. You can get and set the colormap + using the FBIOGETCMAP and FBIOPUTCMAP ioctls. + + +2. Inside the kernel + + - struct fb_info + + Generic information, API and low level information about a specific frame + buffer device instance (slot number, board address, ...). + + - struct `par' + + Device dependent information that uniquely defines the video mode for this + particular piece of hardware. + + +-------------------------------------------------------------------------------- + + *** VISUALS USED BY THE FRAME BUFFER DEVICE API *** + + +Monochrome (FB_VISUAL_MONO01 and FB_VISUAL_MONO10) +------------------------------------------------- +Each pixel is either black or white. + + +Pseudo color (FB_VISUAL_PSEUDOCOLOR and FB_VISUAL_STATIC_PSEUDOCOLOR) +--------------------------------------------------------------------- +The whole pixel value is fed through a programmable lookup table that has one +color (including red, green, and blue intensities) for each possible pixel +value, and that color is displayed. + + +True color (FB_VISUAL_TRUECOLOR) +-------------------------------- +The pixel value is broken up into red, green, and blue fields. + + +Direct color (FB_VISUAL_DIRECTCOLOR) +------------------------------------ +The pixel value is broken up into red, green, and blue fields, each of which +are looked up in separate red, green, and blue lookup tables. + + +Grayscale displays +------------------ +Grayscale and static grayscale are special variants of pseudo color and static +pseudo color, where the red, green and blue components are always equal to +each other. + diff --git a/Documentation/fb/matroxfb.txt b/Documentation/fb/matroxfb.txt new file mode 100644 index 000000000000..ad7a67707d62 --- /dev/null +++ b/Documentation/fb/matroxfb.txt @@ -0,0 +1,415 @@ +[This file is cloned from VesaFB. Thanks go to Gerd Knorr] + +What is matroxfb? +================= + +This is a driver for a graphic framebuffer for Matrox devices on +Alpha, Intel and PPC boxes. + +Advantages: + + * It provides a nice large console (128 cols + 48 lines with 1024x768) + without using tiny, unreadable fonts. + * You can run XF{68,86}_FBDev or XFree86 fbdev driver on top of /dev/fb0 + * Most important: boot logo :-) + +Disadvantages: + + * graphic mode is slower than text mode... but you should not notice + if you use same resolution as you used in textmode. + + +How to use it? +============== + +Switching modes is done using the video=matroxfb:vesa:... boot parameter +or using `fbset' program. + +If you want, for example, enable a resolution of 1280x1024x24bpp you should +pass to the kernel this command line: "video=matroxfb:vesa:0x1BB". + +You should compile in both vgacon (to boot if you remove you Matrox from +box) and matroxfb (for graphics mode). You should not compile-in vesafb +unless you have primary display on non-Matrox VBE2.0 device (see +Documentation/fb/vesafb.txt for details). + +Currently supported video modes are (through vesa:... interface, PowerMac +has [as addon] compatibility code): + + +[Graphic modes] + +bpp | 640x400 640x480 768x576 800x600 960x720 +----+-------------------------------------------- + 4 | 0x12 0x102 + 8 | 0x100 0x101 0x180 0x103 0x188 + 15 | 0x110 0x181 0x113 0x189 + 16 | 0x111 0x182 0x114 0x18A + 24 | 0x1B2 0x184 0x1B5 0x18C + 32 | 0x112 0x183 0x115 0x18B + + +[Graphic modes (continued)] + +bpp | 1024x768 1152x864 1280x1024 1408x1056 1600x1200 +----+------------------------------------------------ + 4 | 0x104 0x106 + 8 | 0x105 0x190 0x107 0x198 0x11C + 15 | 0x116 0x191 0x119 0x199 0x11D + 16 | 0x117 0x192 0x11A 0x19A 0x11E + 24 | 0x1B8 0x194 0x1BB 0x19C 0x1BF + 32 | 0x118 0x193 0x11B 0x19B + + +[Text modes] + +text | 640x400 640x480 1056x344 1056x400 1056x480 +-----+------------------------------------------------ + 8x8 | 0x1C0 0x108 0x10A 0x10B 0x10C +8x16 | 2, 3, 7 0x109 + +You can enter these number either hexadecimal (leading `0x') or decimal +(0x100 = 256). You can also use value + 512 to achieve compatibility +with your old number passed to vesafb. + +Non-listed number can be achieved by more complicated command-line, for +example 1600x1200x32bpp can be specified by `video=matroxfb:vesa:0x11C,depth:32'. + + +X11 +=== + +XF{68,86}_FBDev should work just fine, but it is non-accelerated. On non-intel +architectures there are some glitches for 24bpp videomodes. 8, 16 and 32bpp +works fine. + +Running another (accelerated) X-Server like XF86_SVGA works too. But (at least) +XFree servers have big troubles in multihead configurations (even on first +head, not even talking about second). Running XFree86 4.x accelerated mga +driver is possible, but you must not enable DRI - if you do, resolution and +color depth of your X desktop must match resolution and color depths of your +virtual consoles, otherwise X will corrupt accelerator settings. + + +SVGALib +======= + +Driver contains SVGALib compatibility code. It is turned on by choosing textual +mode for console. You can do it at boot time by using videomode +2,3,7,0x108-0x10C or 0x1C0. At runtime, `fbset -depth 0' does this work. +Unfortunately, after SVGALib application exits, screen contents is corrupted. +Switching to another console and back fixes it. I hope that it is SVGALib's +problem and not mine, but I'm not sure. + + +Configuration +============= + +You can pass kernel command line options to matroxfb with +`video=matroxfb:option1,option2:value2,option3' (multiple options should be +separated by comma, values are separated from options by `:'). +Accepted options: + +mem:X - size of memory (X can be in megabytes, kilobytes or bytes) + You can only decrease value determined by driver because of + it always probe for memory. Default is to use whole detected + memory usable for on-screen display (i.e. max. 8 MB). +disabled - do not load driver; you can use also `off', but `disabled' + is here too. +enabled - load driver, if you have `video=matroxfb:disabled' in LILO + configuration, you can override it by this (you cannot override + `off'). It is default. +noaccel - do not use acceleration engine. It does not work on Alphas. +accel - use acceleration engine. It is default. +nopan - create initial consoles with vyres = yres, thus disabling virtual + scrolling. +pan - create initial consoles as tall as possible (vyres = memory/vxres). + It is default. +nopciretry - disable PCI retries. It is needed for some broken chipsets, + it is autodetected for intel's 82437. In this case device does + not comply to PCI 2.1 specs (it will not guarantee that every + transaction terminate with success or retry in 32 PCLK). +pciretry - enable PCI retries. It is default, except for intel's 82437. +novga - disables VGA I/O ports. It is default if BIOS did not enable device. + You should not use this option, some boards then do not restart + without power off. +vga - preserve state of VGA I/O ports. It is default. Driver does not + enable VGA I/O if BIOS did not it (it is not safe to enable it in + most cases). +nobios - disables BIOS ROM. It is default if BIOS did not enable BIOS itself. + You should not use this option, some boards then do not restart + without power off. +bios - preserve state of BIOS ROM. It is default. Driver does not enable + BIOS if BIOS was not enabled before. +noinit - tells driver, that devices were already initialized. You should use + it if you have G100 and/or if driver cannot detect memory, you see + strange pattern on screen and so on. Devices not enabled by BIOS + are still initialized. It is default. +init - driver initializes every device it knows about. +memtype - specifies memory type, implies 'init'. This is valid only for G200 + and G400 and has following meaning: + G200: 0 -> 2x128Kx32 chips, 2MB onboard, probably sgram + 1 -> 2x128Kx32 chips, 4MB onboard, probably sgram + 2 -> 2x256Kx32 chips, 4MB onboard, probably sgram + 3 -> 2x256Kx32 chips, 8MB onboard, probably sgram + 4 -> 2x512Kx16 chips, 8/16MB onboard, probably sdram only + 5 -> same as above + 6 -> 4x128Kx32 chips, 4MB onboard, probably sgram + 7 -> 4x128Kx32 chips, 8MB onboard, probably sgram + G400: 0 -> 2x512Kx16 SDRAM, 16/32MB + 2x512Kx32 SGRAM, 16/32MB + 1 -> 2x256Kx32 SGRAM, 8/16MB + 2 -> 4x128Kx32 SGRAM, 8/16MB + 3 -> 4x512Kx32 SDRAM, 32MB + 4 -> 4x256Kx32 SGRAM, 16/32MB + 5 -> 2x1Mx32 SDRAM, 32MB + 6 -> reserved + 7 -> reserved + You should use sdram or sgram parameter in addition to memtype + parameter. +nomtrr - disables write combining on frame buffer. This slows down driver but + there is reported minor incompatibility between GUS DMA and XFree + under high loads if write combining is enabled (sound dropouts). +mtrr - enables write combining on frame buffer. It speeds up video accesses + much. It is default. You must have MTRR support enabled in kernel + and your CPU must have MTRR (f.e. Pentium II have them). +sgram - tells to driver that you have Gxx0 with SGRAM memory. It has no + effect without `init'. +sdram - tells to driver that you have Gxx0 with SDRAM memory. + It is a default. +inv24 - change timings parameters for 24bpp modes on Millenium and + Millenium II. Specify this if you see strange color shadows around + characters. +noinv24 - use standard timings. It is the default. +inverse - invert colors on screen (for LCD displays) +noinverse - show true colors on screen. It is default. +dev:X - bind driver to device X. Driver numbers device from 0 up to N, + where device 0 is first `known' device found, 1 second and so on. + lspci lists devices in this order. + Default is `every' known device for driver with multihead support + and first working device (usually dev:0) for driver without + multihead support. +nohwcursor - disables hardware cursor (use software cursor instead). +hwcursor - enables hardware cursor. It is default. If you are using + non-accelerated mode (`noaccel' or `fbset -accel false'), software + cursor is used (except for text mode). +noblink - disables cursor blinking. Cursor in text mode always blinks (hw + limitation). +blink - enables cursor blinking. It is default. +nofastfont - disables fastfont feature. It is default. +fastfont:X - enables fastfont feature. X specifies size of memory reserved for + font data, it must be >= (fontwidth*fontheight*chars_in_font)/8. + It is faster on Gx00 series, but slower on older cards. +grayscale - enable grayscale summing. It works in PSEUDOCOLOR modes (text, + 4bpp, 8bpp). In DIRECTCOLOR modes it is limited to characters + displayed through putc/putcs. Direct accesses to framebuffer + can paint colors. +nograyscale - disable grayscale summing. It is default. +cross4MB - enables that pixel line can cross 4MB boundary. It is default for + non-Millenium. +nocross4MB - pixel line must not cross 4MB boundary. It is default for + Millenium I or II, because of these devices have hardware + limitations which do not allow this. But this option is + incompatible with some (if not all yet released) versions of + XF86_FBDev. +dfp - enables digital flat panel interface. This option is incompatible with + secondary (TV) output - if DFP is active, TV output must be + inactive and vice versa. DFP always uses same timing as primary + (monitor) output. +dfp:X - use settings X for digital flat panel interface. X is number from + 0 to 0xFF, and meaning of each individual bit is described in + G400 manual, in description of DAC register 0x1F. For normal operation + you should set all bits to zero, except lowest bit. This lowest bit + selects who is source of display clocks, whether G400, or panel. + Default value is now read back from hardware - so you should specify + this value only if you are also using `init' parameter. +outputs:XYZ - set mapping between CRTC and outputs. Each letter can have value + of 0 (for no CRTC), 1 (CRTC1) or 2 (CRTC2), and first letter corresponds + to primary analog output, second letter to the secondary analog output + and third letter to the DVI output. Default setting is 100 for + cards below G400 or G400 without DFP, 101 for G400 with DFP, and + 111 for G450 and G550. You can set mapping only on first card, + use matroxset for setting up other devices. +vesa:X - selects startup videomode. X is number from 0 to 0x1FF, see table + above for detailed explanation. Default is 640x480x8bpp if driver + has 8bpp support. Otherwise first available of 640x350x4bpp, + 640x480x15bpp, 640x480x24bpp, 640x480x32bpp or 80x25 text + (80x25 text is always available). + +If you are not satisfied with videomode selected by `vesa' option, you +can modify it with these options: + +xres:X - horizontal resolution, in pixels. Default is derived from `vesa' + option. +yres:X - vertical resolution, in pixel lines. Default is derived from `vesa' + option. +upper:X - top boundary: lines between end of VSYNC pulse and start of first + pixel line of picture. Default is derived from `vesa' option. +lower:X - bottom boundary: lines between end of picture and start of VSYNC + pulse. Default is derived from `vesa' option. +vslen:X - length of VSYNC pulse, in lines. Default is derived from `vesa' + option. +left:X - left boundary: pixels between end of HSYNC pulse and first pixel. + Default is derived from `vesa' option. +right:X - right boundary: pixels between end of picture and start of HSYNC + pulse. Default is derived from `vesa' option. +hslen:X - length of HSYNC pulse, in pixels. Default is derived from `vesa' + option. +pixclock:X - dotclocks, in ps (picoseconds). Default is derived from `vesa' + option and from `fh' and `fv' options. +sync:X - sync. pulse - bit 0 inverts HSYNC polarity, bit 1 VSYNC polarity. + If bit 3 (value 0x08) is set, composite sync instead of HSYNC is + generated. If bit 5 (value 0x20) is set, sync on green is turned on. + Do not forget that if you want sync on green, you also probably + want composite sync. + Default depends on `vesa'. +depth:X - Bits per pixel: 0=text, 4,8,15,16,24 or 32. Default depends on + `vesa'. + +If you know capabilities of your monitor, you can specify some (or all) of +`maxclk', `fh' and `fv'. In this case, `pixclock' is computed so that +pixclock <= maxclk, real_fh <= fh and real_fv <= fv. + +maxclk:X - maximum dotclock. X can be specified in MHz, kHz or Hz. Default is + `don't care'. +fh:X - maximum horizontal synchronization frequency. X can be specified + in kHz or Hz. Default is `don't care'. +fv:X - maximum vertical frequency. X must be specified in Hz. Default is + 70 for modes derived from `vesa' with yres <= 400, 60Hz for + yres > 400. + + +Limitations +=========== + +There are known and unknown bugs, features and misfeatures. +Currently there are following known bugs: + + SVGALib does not restore screen on exit + + generic fbcon-cfbX procedures do not work on Alphas. Due to this, + `noaccel' (and cfb4 accel) driver does not work on Alpha. So everyone + with access to /dev/fb* on Alpha can hang machine (you should restrict + access to /dev/fb* - everyone with access to this device can destroy + your monitor, believe me...). + + 24bpp does not support correctly XF-FBDev on big-endian architectures. + + interlaced text mode is not supported; it looks like hardware limitation, + but I'm not sure. + + Gxx0 SGRAM/SDRAM is not autodetected. + + If you are using more than one framebuffer device, you must boot kernel + with 'video=scrollback:0'. + + maybe more... +And following misfeatures: + + SVGALib does not restore screen on exit. + + pixclock for text modes is limited by hardware to + 83 MHz on G200 + 66 MHz on Millennium I + 60 MHz on Millennium II + Because I have no access to other devices, I do not know specific + frequencies for them. So driver does not check this and allows you to + set frequency higher that this. It causes sparks, black holes and other + pretty effects on screen. Device was not destroyed during tests. :-) + + my Millennium G200 oscillator has frequency range from 35 MHz to 380 MHz + (and it works with 8bpp on about 320 MHz dotclocks (and changed mclk)). + But Matrox says on product sheet that VCO limit is 50-250 MHz, so I believe + them (maybe that chip overheats, but it has a very big cooler (G100 has + none), so it should work). + + special mixed video/graphics videomodes of Mystique and Gx00 - 2G8V16 and + G16V16 are not supported + + color keying is not supported + + feature connector of Mystique and Gx00 is set to VGA mode (it is disabled + by BIOS) + + DDC (monitor detection) is supported through dualhead driver + + some check for input values are not so strict how it should be (you can + specify vslen=4000 and so on). + + maybe more... +And following features: + + 4bpp is available only on Millennium I and Millennium II. It is hardware + limitation. + + selection between 1:5:5:5 and 5:6:5 16bpp videomode is done by -rgba + option of fbset: "fbset -depth 16 -rgba 5,5,5" selects 1:5:5:5, anything + else selects 5:6:5 mode. + + text mode uses 6 bit VGA palette instead of 8 bit (one of 262144 colors + instead of one of 16M colors). It is due to hardware limitation of + Millennium I/II and SVGALib compatibility. + + +Benchmarks +========== +It is time to redraw whole screen 1000 times in 1024x768, 60Hz. It is +time for draw 6144000 characters on screen through /dev/vcsa +(for 32bpp it is about 3GB of data (exactly 3000 MB); for 8x16 font in +16 seconds, i.e. 187 MBps). +Times were obtained from one older version of driver, now they are about 3% +faster, it is kernel-space only time on P-II/350 MHz, Millennium I in 33 MHz +PCI slot, G200 in AGP 2x slot. I did not test vgacon. + +NOACCEL + 8x16 12x22 + Millennium I G200 Millennium I G200 +8bpp 16.42 9.54 12.33 9.13 +16bpp 21.00 15.70 19.11 15.02 +24bpp 36.66 36.66 35.00 35.00 +32bpp 35.00 30.00 33.85 28.66 + +ACCEL, nofastfont + 8x16 12x22 6x11 + Millennium I G200 Millennium I G200 Millennium I G200 +8bpp 7.79 7.24 13.55 7.78 30.00 21.01 +16bpp 9.13 7.78 16.16 7.78 30.00 21.01 +24bpp 14.17 10.72 18.69 10.24 34.99 21.01 +32bpp 16.15 16.16 18.73 13.09 34.99 21.01 + +ACCEL, fastfont + 8x16 12x22 6x11 + Millennium I G200 Millennium I G200 Millennium I G200 +8bpp 8.41 6.01 6.54 4.37 16.00 10.51 +16bpp 9.54 9.12 8.76 6.17 17.52 14.01 +24bpp 15.00 12.36 11.67 10.00 22.01 18.32 +32bpp 16.18 18.29* 12.71 12.74 24.44 21.00 + +TEXT + 8x16 + Millennium I G200 +TEXT 3.29 1.50 + +* Yes, it is slower than Millennium I. + + +Dualhead G400 +============= +Driver supports dualhead G400 with some limitations: + + secondary head shares videomemory with primary head. It is not problem + if you have 32MB of videoram, but if you have only 16MB, you may have + to think twice before choosing videomode (for example twice 1880x1440x32bpp + is not possible). + + due to hardware limitation, secondary head can use only 16 and 32bpp + videomodes. + + secondary head is not accelerated. There were bad problems with accelerated + XFree when secondary head used to use acceleration. + + secondary head always powerups in 640x480@60-32 videomode. You have to use + fbset to change this mode. + + secondary head always powerups in monitor mode. You have to use fbmatroxset + to change it to TV mode. Also, you must select at least 525 lines for + NTSC output and 625 lines for PAL output. + + kernel is not fully multihead ready. So some things are impossible to do. + + if you compiled it as module, you must insert i2c-matroxfb, matroxfb_maven + and matroxfb_crtc2 into kernel. + + +Dualhead G450 +============= +Driver supports dualhead G450 with some limitations: + + secondary head shares videomemory with primary head. It is not problem + if you have 32MB of videoram, but if you have only 16MB, you may have + to think twice before choosing videomode. + + due to hardware limitation, secondary head can use only 16 and 32bpp + videomodes. + + secondary head is not accelerated. + + secondary head always powerups in 640x480@60-32 videomode. You have to use + fbset to change this mode. + + TV output is not supported + + kernel is not fully multihead ready, so some things are impossible to do. + + if you compiled it as module, you must insert matroxfb_g450 and matroxfb_crtc2 + into kernel. + +-- +Petr Vandrovec <vandrove@vc.cvut.cz> diff --git a/Documentation/fb/modedb.txt b/Documentation/fb/modedb.txt new file mode 100644 index 000000000000..e04458b319d5 --- /dev/null +++ b/Documentation/fb/modedb.txt @@ -0,0 +1,61 @@ + + + modedb default video mode support + + +Currently all frame buffer device drivers have their own video mode databases, +which is a mess and a waste of resources. The main idea of modedb is to have + + - one routine to probe for video modes, which can be used by all frame buffer + devices + - one generic video mode database with a fair amount of standard videomodes + (taken from XFree86) + - the possibility to supply your own mode database for graphics hardware that + needs non-standard modes, like amifb and Mac frame buffer drivers (which + use macmodes.c) + +When a frame buffer device receives a video= option it doesn't know, it should +consider that to be a video mode option. If no frame buffer device is specified +in a video= option, fbmem considers that to be a global video mode option. + +Valid mode specifiers (mode_option argument): + + <xres>x<yres>[-<bpp>][@<refresh>] + <name>[-<bpp>][@<refresh>] + +with <xres>, <yres>, <bpp> and <refresh> decimal numbers and <name> a string. +Things between square brackets are optional. + +To find a suitable video mode, you just call + +int __init fb_find_mode(struct fb_var_screeninfo *var, + struct fb_info *info, const char *mode_option, + const struct fb_videomode *db, unsigned int dbsize, + const struct fb_videomode *default_mode, + unsigned int default_bpp) + +with db/dbsize your non-standard video mode database, or NULL to use the +standard video mode database. + +fb_find_mode() first tries the specified video mode (or any mode that matches, +e.g. there can be multiple 640x480 modes, each of them is tried). If that +fails, the default mode is tried. If that fails, it walks over all modes. + +To specify a video mode at bootup, use the following boot options: + video=<driver>:<xres>x<yres>[-<bpp>][@refresh] + +where <driver> is a name from the table below. Valid default modes can be +found in linux/drivers/video/modedb.c. Check your driver's documentation. +There may be more modes. + + Drivers that support modedb boot options + Boot Name Cards Supported + + amifb - Amiga chipset frame buffer + aty128fb - ATI Rage128 / Pro frame buffer + atyfb - ATI Mach64 frame buffer + tdfxfb - 3D Fx frame buffer + tridentfb - Trident (Cyber)blade chipset frame buffer + +BTW, only a few drivers use this at the moment. Others are to follow +(feel free to send patches). diff --git a/Documentation/fb/pvr2fb.txt b/Documentation/fb/pvr2fb.txt new file mode 100644 index 000000000000..2bf6c2321c2d --- /dev/null +++ b/Documentation/fb/pvr2fb.txt @@ -0,0 +1,61 @@ +$Id: pvr2fb.txt,v 1.1 2001/05/24 05:09:16 mrbrown Exp $ + +What is pvr2fb? +=============== + +This is a driver for PowerVR 2 based graphics frame buffers, such as the +one found in the Dreamcast. + +Advantages: + + * It provides a nice large console (128 cols + 48 lines with 1024x768) + without using tiny, unreadable fonts. + * You can run XF86_FBDev on top of /dev/fb0 + * Most important: boot logo :-) + +Disadvantages: + + * Driver is currently limited to the Dreamcast PowerVR 2 implementation + at the time of this writing. + +Configuration +============= + +You can pass kernel command line options to pvr2fb with +`video=pvr2fb:option1,option2:value2,option3' (multiple options should be +separated by comma, values are separated from options by `:'). +Accepted options: + +font:X - default font to use. All fonts are supported, including the + SUN12x22 font which is very nice at high resolutions. + +mode:X - default video mode. The following video modes are supported: + 640x240-60, 640x480-60. + + Note: the 640x240 mode is currently broken, and should not be + used for any reason. It is only mentioned as a reference. + +inverse - invert colors on screen (for LCD displays) + +nomtrr - disables write combining on frame buffer. This slows down driver + but there is reported minor incompatibility between GUS DMA and + XFree under high loads if write combining is enabled (sound + dropouts). MTRR is enabled by default on systems that have it + configured and that support it. + +cable:X - cable type. This can be any of the following: vga, rgb, and + composite. If none is specified, we guess. + +output:X - output type. This can be any of the following: pal, ntsc, and + vga. If none is specified, we guess. + +X11 +=== + +XF86_FBDev should work, in theory. At the time of this writing it is +totally untested and may or may not even portray the beginnings of +working. If you end up testing this, please let me know! + +-- +Paul Mundt <lethal@linuxdc.org> + diff --git a/Documentation/fb/pxafb.txt b/Documentation/fb/pxafb.txt new file mode 100644 index 000000000000..db9b8500b43b --- /dev/null +++ b/Documentation/fb/pxafb.txt @@ -0,0 +1,54 @@ +Driver for PXA25x LCD controller +================================ + +The driver supports the following options, either via +options=<OPTIONS> when modular or video=pxafb:<OPTIONS> when built in. + +For example: + modprobe pxafb options=mode:640x480-8,passive +or on the kernel command line + video=pxafb:mode:640x480-8,passive + +mode:XRESxYRES[-BPP] + XRES == LCCR1_PPL + 1 + YRES == LLCR2_LPP + 1 + The resolution of the display in pixels + BPP == The bit depth. Valid values are 1, 2, 4, 8 and 16. + +pixclock:PIXCLOCK + Pixel clock in picoseconds + +left:LEFT == LCCR1_BLW + 1 +right:RIGHT == LCCR1_ELW + 1 +hsynclen:HSYNC == LCCR1_HSW + 1 +upper:UPPER == LCCR2_BFW +lower:LOWER == LCCR2_EFR +vsynclen:VSYNC == LCCR2_VSW + 1 + Display margins and sync times + +color | mono => LCCR0_CMS + umm... + +active | passive => LCCR0_PAS + Active (TFT) or Passive (STN) display + +single | dual => LCCR0_SDS + Single or dual panel passive display + +4pix | 8pix => LCCR0_DPD + 4 or 8 pixel monochrome single panel data + +hsync:HSYNC +vsync:VSYNC + Horizontal and vertical sync. 0 => active low, 1 => active + high. + +dpc:DPC + Double pixel clock. 1=>true, 0=>false + +outputen:POLARITY + Output Enable Polarity. 0 => active low, 1 => active high + +pixclockpol:POLARITY + pixel clock polarity + 0 => falling edge, 1 => rising edge diff --git a/Documentation/fb/sa1100fb.txt b/Documentation/fb/sa1100fb.txt new file mode 100644 index 000000000000..f1b4220464df --- /dev/null +++ b/Documentation/fb/sa1100fb.txt @@ -0,0 +1,39 @@ +[This file is cloned from VesaFB/matroxfb] + +What is sa1100fb? +================= + +This is a driver for a graphic framebuffer for the SA-1100 LCD +controller. + +Configuration +============== + +For most common passive displays, giving the option + +video=sa1100fb:bpp:<value>,lccr0:<value>,lccr1:<value>,lccr2:<value>,lccr3:<value> + +on the kernel command line should be enough to configure the +controller. The bits per pixel (bpp) value should be 4, 8, 12, or +16. LCCR values are display-specific and should be computed as +documented in the SA-1100 Developer's Manual, Section 11.7. Dual-panel +displays are supported as long as the SDS bit is set in LCCR0; GPIO<9:2> +are used for the lower panel. + +For active displays or displays requiring additional configuration +(controlling backlights, powering on the LCD, etc.), the command line +options may not be enough to configure the display. Adding sections to +sa1100fb_init_fbinfo(), sa1100fb_activate_var(), +sa1100fb_disable_lcd_controller(), and sa1100fb_enable_lcd_controller() +will probably be necessary. + +Accepted options: + +bpp:<value> Configure for <value> bits per pixel +lccr0:<value> Configure LCD control register 0 (11.7.3) +lccr1:<value> Configure LCD control register 1 (11.7.4) +lccr2:<value> Configure LCD control register 2 (11.7.5) +lccr3:<value> Configure LCD control register 3 (11.7.6) + +-- +Mark Huang <mhuang@livetoy.com> diff --git a/Documentation/fb/sisfb.txt b/Documentation/fb/sisfb.txt new file mode 100644 index 000000000000..3b50c517a08d --- /dev/null +++ b/Documentation/fb/sisfb.txt @@ -0,0 +1,158 @@ + +What is sisfb? +============== + +sisfb is a framebuffer device driver for SiS (Silicon Integrated Systems) +graphics chips. Supported are: + +- SiS 300 series: SiS 300/305, 540, 630(S), 730(S) +- SiS 315 series: SiS 315/H/PRO, 55x, (M)65x, 740, (M)661(F/M)X, (M)741(GX) +- SiS 330 series: SiS 330 ("Xabre"), (M)760 + + +Why do I need a framebuffer driver? +=================================== + +sisfb is eg. useful if you want a high-resolution text console. Besides that, +sisfb is required to run DirectFB (which comes with an additional, dedicated +driver for the 315 series). + +On the 300 series, sisfb on kernels older than 2.6.3 furthermore plays an +important role in connection with DRM/DRI: Sisfb manages the memory heap +used by DRM/DRI for 3D texture and other data. This memory management is +required for using DRI/DRM. + +Kernels >= around 2.6.3 do not need sisfb any longer for DRI/DRM memory +management. The SiS DRM driver has been updated and features a memory manager +of its own (which will be used if sisfb is not compiled). So unless you want +a graphical console, you don't need sisfb on kernels >=2.6.3. + +Sidenote: Since this seems to be a commonly made mistake: sisfb and vesafb +cannot be active at the same time! Do only select one of them in your kernel +configuration. + + +How are parameters passed to sisfb? +=================================== + +Well, it depends: If compiled statically into the kernel, use lilo's append +statement to add the parameters to the kernel command line. Please see lilo's +(or GRUB's) documentation for more information. If sisfb is a kernel module, +parameters are given with the modprobe (or insmod) command. + +Example for sisfb as part of the static kernel: Add the following line to your +lilo.conf: + + append="video=sisfb:mode:1024x768x16,mem:12288,rate:75" + +Example for sisfb as a module: Start sisfb by typing + + modprobe sisfb mode=1024x768x16 rate=75 mem=12288 + +A common mistake is that folks use a wrong parameter format when using the +driver compiled into the kernel. Please note: If compiled into the kernel, +the parameter format is video=sisfb:mode:none or video=sisfb:mode:1024x768x16 +(or whatever mode you want to use, alternatively using any other format +described above or the vesa keyword instead of mode). If compiled as a module, +the parameter format reads mode=none or mode=1024x768x16 (or whatever mode you +want to use). Using a "=" for a ":" (and vice versa) is a huge difference! +Additionally: If you give more than one argument to the in-kernel sisfb, the +arguments are separated with ",". For example: + + video=sisfb:mode:1024x768x16,rate:75,mem:12288 + + +How do I use it? +================ + +Preface statement: This file only covers very little of the driver's +capabilities and features. Please refer to the author's and maintainer's +website at http://www.winischhofer.net/linuxsisvga.shtml for more +information. Additionally, "modinfo sisfb" gives an overview over all +supported options including some explanation. + +The desired display mode can be specified using the keyword "mode" with +a parameter in one of the follwing formats: + - XxYxDepth or + - XxY-Depth or + - XxY-Depth@Rate or + - XxY + - or simply use the VESA mode number in hexadecimal or decimal. + +For example: 1024x768x16, 1024x768-16@75, 1280x1024-16. If no depth is +specified, it defaults to 8. If no rate is given, it defaults to 60Hz. Depth 32 +means 24bit color depth (but 32 bit framebuffer depth, which is not relevant +to the user). + +Additionally, sisfb understands the keyword "vesa" followed by a VESA mode +number in decimal or hexadecimal. For example: vesa=791 or vesa=0x117. Please +use either "mode" or "vesa" but not both. + +Linux 2.4 only: If no mode is given, sisfb defaults to "no mode" (mode=none) if +compiled as a module; if sisfb is statically compiled into the kernel, it +defaults to 800x600x8 unless CRT2 type is LCD, in which case the LCD's native +resolution is used. If you want to switch to a different mode, use the fbset +shell command. + +Linux 2.6 only: If no mode is given, sisfb defaults to 800x600x8 unless CRT2 +type is LCD, in which case it defaults to the LCD's native resolution. If +you want to switch to another mode, use the stty shell command. + +You should compile in both vgacon (to boot if you remove you SiS card from +your system) and sisfb (for graphics mode). Under Linux 2.6, also "Framebuffer +console support" (fbcon) is needed for a graphical console. + +You should *not* compile-in vesafb. And please do not use the "vga=" keyword +in lilo's or grub's configuration file; mode selection is done using the +"mode" or "vesa" keywords as a parameter. See above and below. + + +X11 +=== + +If using XFree86 or X.org, it is recommended that you don't use the "fbdev" +driver but the dedicated "sis" X driver. The "sis" X driver and sisfb are +developed by the same person (Thomas Winischhofer) and cooperate well with +each other. + + +SVGALib +======= + +SVGALib, if directly accessing the hardware, never restores the screen +correctly, especially on laptops or if the output devices are LCD or TV. +Therefore, use the chipset "FBDEV" in SVGALib configuration. This will make +SVGALib use the framebuffer device for mode switches and restoration. + + +Configuration +============= + +(Some) accepted options: + +off - Disable sisfb. This option is only understood if sisfb is + in-kernel, not a module. +mem:X - size of memory for the console, rest will be used for DRI/DRM. X + is in kilobytes. On 300 series, the default is 4096, 8192 or + 16384 (each in kilobyte) depending on how much video ram the card + has. On 315/330 series, the default is the maximum available ram + (since DRI/DRM is not supported for these chipsets). +noaccel - do not use 2D acceleration engine. (Default: use acceleration) +noypan - disable y-panning and scroll by redrawing the entire screen. + This is much slower than y-panning. (Default: use y-panning) +vesa:X - selects startup videomode. X is number from 0 to 0x1FF and + represents the VESA mode number (can be given in decimal or + hexadecimal form, the latter prefixed with "0x"). +mode:X - selects startup videomode. Please see above for the format of + "X". + +Boolean options such as "noaccel" or "noypan" are to be given without a +parameter if sisfb is in-kernel (for example "video=sisfb:noypan). If +sisfb is a module, these are to be set to 1 (for example "modprobe sisfb +noypan=1"). + +-- +Thomas Winischhofer <thomas@winischhofer.net> +May 27, 2004 + + diff --git a/Documentation/fb/sstfb.txt b/Documentation/fb/sstfb.txt new file mode 100644 index 000000000000..628d7ffa8769 --- /dev/null +++ b/Documentation/fb/sstfb.txt @@ -0,0 +1,174 @@ + +Introduction + + This is a frame buffer device driver for 3dfx' Voodoo Graphics + (aka voodoo 1, aka sst1) and Voodoo² (aka Voodoo 2, aka CVG) based + video boards. It's highly experimental code, but is guaranteed to work + on my computer, with my "Maxi Gamer 3D" and "Maxi Gamer 3d²" boards, + and with me "between chair and keyboard". Some people tested other + combinations and it seems that it works. + The main page is located at <http://sstfb.sourceforge.net>, and if + you want the latest version, check out the CVS, as the driver is a work + in progress, I feel uncomfortable with releasing tarballs of something + not completely working...Don't worry, it's still more than useable + (I eat my own dog food) + + Please read the Bug section, and report any success or failure to me + (Ghozlane Toumi <gtoumi@laposte.net>). + BTW, If you have only one monitor , and you don't feel like playing + with the vga passthrou cable, I can only suggest borrowing a screen + somewhere... + + +Installation + + This driver (should) work on ix86, with "late" 2.2.x kernel (tested + with x = 19) and "recent" 2.4.x kernel, as a module or compiled in. + It has been included in mainstream kernel since the infamous 2.4.10. + You can apply the patches found in sstfb/kernel/*-2.{2|4}.x.patch, + and copy sstfb.c to linux/drivers/video/, or apply a single patch, + sstfb/patch-2.{2|4}.x-sstfb-yymmdd to your linux source tree. + + Then configure your kernel as usual: choose "m" or "y" to 3Dfx Voodoo + Graphics in section "console". Compile, install, have fun... and please + drop me a report :) + + +Module Usage + + Warnings. + # You should read completely this section before issuing any command. + # If you have only one monitor to play with, once you insmod the + module, the 3dfx takes control of the output, so you'll have to + plug the monitor to the "normal" video board in order to issue + the commands, or you can blindly use sst_dbg_vgapass + in the tools directory (See Tools). The latest solution is pass the + parameter vgapass=1 when insmodding the driver. (See Kernel/Modules + Options) + + Module insertion: + # insmod sstfb.o + you should see some strange output frome the board: + a big blue square, a green and a red small squares and a vertical + white rectangle. why ? the function's name is self explanatory : + "sstfb_test()"... + (if you don't have a second monitor, you'll have to plug your monitor + directely to the 2D videocard to see what you're typing) + # con2fb /dev/fbx /dev/ttyx + bind a tty to the new frame buffer. if you already have a frame + buffer driver, the voodoo fb will likely be /dev/fb1. if not, + the device will be /dev/fb0. You can check this by doing a + cat /proc/fb. You can find a copy of con2fb in tools/ directory. + if you don't have another fb device, this step is superfluous, + as the console subsystem automagicaly binds ttys to the fb. + # switch to the virtual console you just mapped. "tadaaa" ... + + Module removal: + # con2fb /dev/fbx /dev/ttyx + bind the tty to the old frame buffer so the module can be removed. + (how does it work with vgacon ? short answer : it doesn't work) + # rmmod sstfb + + +Kernel/Modules Options + + You can pass some otions to sstfb module, and via the kernel command + line when the driver is compiled in : + for module : insmod sstfb.o option1=value1 option2=value2 ... + in kernel : video=sstfb:option1,option2:value2,option3 ... + + sstfb supports the folowing options : + +Module Kernel Description + +vgapass=0 vganopass Enable or disable VGA passthrou cable. +vgapass=1 vgapass When enabled, the monitor will get the signal + from the VGA board and not from the voodoo. + Default: nopass + +mem=x mem:x Force frame buffer memory in MiB + allowed values: 0, 1, 2, 4. + Default: 0 (= autodetect) + +inverse=1 inverse Supposed to enable inverse console. + doesn't work yet... + +clipping=1 clipping Enable or disable clipping. +clipping=0 noclipping With clipping enabled, all offscreen + reads and writes are disgarded. + Default: enable clipping. + +gfxclk=x gfxclk:x Force graphic clock frequency (in MHz). + Be carefull with this option, it may be + DANGEROUS. + Default: auto + 50Mhz for Voodoo 1, + 75MHz for Voodoo 2. + +slowpci=1 fastpci Enable or disable fast PCI read/writes. +slowpci=1 slowpci Default : fastpci + +dev=x dev:x Attach the driver to device number x. + 0 is the first compatible board (in + lspci order) + +Tools + + These tools are mostly for debugging purposes, but you can + find some of these interesting : + - con2fb , maps a tty to a fbramebuffer . + con2fb /dev/fb1 /dev/tty5 + - sst_dbg_vgapass , changes vga passthrou. You have to recompile the + driver with SST_DEBUG and SST_DEBUG_IOCTL set to 1 + sst_dbg_vgapass /dev/fb1 1 (enables vga cable) + sst_dbg_vgapass /dev/fb1 0 (disables vga cable) + - glide_reset , resets the voodoo using glide + use this after rmmoding sstfb, if the module refuses to + reinsert . + +Bugs + + - DO NOT use glide while the sstfb module is in, you'll most likely + hang your computer. + - If you see some artefacts (pixels not cleaning and stuff like that), + try turning off clipping (clipping=0), and/or using slowpci + - the driver don't detect the 4Mb frame buffer voodoos, it seems that + the 2 last Mbs wrap around. looking into that . + - The driver is 16 bpp only, 24/32 won't work. + - The driver is not your_favorite_toy-safe. this includes SMP... + [Actually from inspection it seems to be safe - Alan] + - when using XFree86 FBdev (X over fbdev) you may see strange color + patterns at the border of your windows (the pixels lose the lowest + byte -> basicaly the blue component nd some of the green) . I'm unable + to reproduce this with XFree86-3.3, but one of the testers has this + problem with XFree86-4. apparently recent Xfree86-4.x solve this + problem. + - I didn't really test changing the palette, so you may find some weird + things when playing with that. + - Sometimes the driver will not recognise the DAC , and the + initialisation will fail. this is specificaly true for + voodoo 2 boards , but it should be solved in recent versions. please + contact me . + - the 24/32 is not likely to work anytime soon , knowing that the + hardware does ... unusual thigs in 24/32 bpp + - When used with anther video board, current limitations of linux + console subsystem can cause some troubles, specificaly, you should + disable software scrollback , as it can oops badly ... + +Todo + + - Get rid of the previous paragraph. + - Buy more coffee. + - test/port to other arch. + - try to add panning using tweeks with front and back buffer . + - try to implement accel on voodoo2 , this board can actualy do a + lot in 2D even if it was sold as a 3D only board ... + +ghoz. + +-- +Ghozlane Toumi <gtoumi@laposte.net> + + +$Date: 2002/05/09 20:11:45 $ +http://sstfb.sourceforge.net/README diff --git a/Documentation/fb/tgafb.txt b/Documentation/fb/tgafb.txt new file mode 100644 index 000000000000..250083ada8fb --- /dev/null +++ b/Documentation/fb/tgafb.txt @@ -0,0 +1,69 @@ +$Id: tgafb.txt,v 1.1.2.2 2000/04/04 06:50:18 mato Exp $ + +What is tgafb? +=============== + +This is a driver for DECChip 21030 based graphics framebuffers, a.k.a. TGA +cards, which are usually found in older Digital Alpha systems. The +following models are supported: + +ZLxP-E1 (8bpp, 2 MB VRAM) +ZLxP-E2 (32bpp, 8 MB VRAM) +ZLxP-E3 (32bpp, 16 MB VRAM, Zbuffer) + +This version is an almost complete rewrite of the code written by Geert +Uytterhoeven, which was based on the original TGA console code written by +Jay Estabrook. + +Major new features since Linux 2.0.x: + + * Support for multiple resolutions + * Support for fixed-frequency and other oddball monitors + (by allowing the video mode to be set at boot time) + +User-visible changes since Linux 2.2.x: + + * Sync-on-green is now handled properly + * More useful information is printed on bootup + (this helps if people run into problems) + +This driver does not (yet) support the TGA2 family of framebuffers, so the +PowerStorm 3D30/4D20 (also known as PBXGB) cards are not supported. These +can however be used with the standard VGA Text Console driver. + + +Configuration +============= + +You can pass kernel command line options to tgafb with +`video=tgafb:option1,option2:value2,option3' (multiple options should be +separated by comma, values are separated from options by `:'). +Accepted options: + +font:X - default font to use. All fonts are supported, including the + SUN12x22 font which is very nice at high resolutions. + +mode:X - default video mode. The following video modes are supported: + 640x480-60, 800x600-56, 640x480-72, 800x600-60, 800x600-72, + 1024x768-60, 1152x864-60, 1024x768-70, 1024x768-76, + 1152x864-70, 1280x1024-61, 1024x768-85, 1280x1024-70, + 1152x864-84, 1280x1024-76, 1280x1024-85 + + +Known Issues +============ + +The XFree86 FBDev server has been reported not to work, since tgafb doesn't do +mmap(). Running the standard XF86_TGA server from XFree86 3.3.x works fine for +me, however this server does not do acceleration, which make certain operations +quite slow. Support for acceleration is being progressively integrated in +XFree86 4.x. + +When running tgafb in resolutions higher than 640x480, on switching VCs from +tgafb to XF86_TGA 3.3.x, the entire screen is not re-drawn and must be manually +refreshed. This is an X server problem, not a tgafb problem, and is fixed in +XFree86 4.0. + +Enjoy! + +Martin Lucina <mato@kotelna.sk> diff --git a/Documentation/fb/tridentfb.txt b/Documentation/fb/tridentfb.txt new file mode 100644 index 000000000000..8a6c8a43e6a3 --- /dev/null +++ b/Documentation/fb/tridentfb.txt @@ -0,0 +1,54 @@ +Tridentfb is a framebuffer driver for some Trident chip based cards. + +The following list of chips is thought to be supported although not all are +tested: + +those from the Image series with Cyber in their names - accelerated +those with Blade in their names (Blade3D,CyberBlade...) - accelerated +the newer CyberBladeXP family - nonaccelerated + +Only PCI/AGP based cards are supported, none of the older Tridents. + +How to use it? +============== + +When booting you can pass the video parameter. +video=tridentfb + +The parameters for tridentfb are concatenated with a ':' as in this example. + +video=tridentfb:800x600,bpp=16,noaccel + +The second level parameters that tridentfb understands are: + +noaccel - turns off acceleration (when it doesn't work for your card) +accel - force text acceleration (for boards which by default are noacceled) + +fp - use flat panel related stuff +crt - assume monitor is present instead of fp + +center - for flat panels and resolutions smaller than native size center the + image, otherwise use +stretch + +memsize - integer value in Kb, use if your card's memory size is misdetected. + look at the driver output to see what it says when initializing. +memdiff - integer value in Kb,should be nonzero if your card reports + more memory than it actually has.For instance mine is 192K less than + detection says in all three BIOS selectable situations 2M, 4M, 8M. + Only use if your video memory is taken from main memory hence of + configurable size.Otherwise use memsize. + If in some modes which barely fit the memory you see garbage at the bottom + this might help by not letting change to that mode anymore. + +nativex - the width in pixels of the flat panel.If you know it (usually 1024 + 800 or 1280) and it is not what the driver seems to detect use it. + +bpp - bits per pixel (8,16 or 32) +mode - a mode name like 800x600 (as described in Documentation/fb/modedb.txt) + +Using insane values for the above parameters will probably result in driver +misbehaviour so take care(for instance memsize=12345678 or memdiff=23784 or +nativex=93) + +Contact: jani@astechnix.ro diff --git a/Documentation/fb/vesafb.txt b/Documentation/fb/vesafb.txt new file mode 100644 index 000000000000..814e2f56a6ad --- /dev/null +++ b/Documentation/fb/vesafb.txt @@ -0,0 +1,167 @@ + +What is vesafb? +=============== + +This is a generic driver for a graphic framebuffer on intel boxes. + +The idea is simple: Turn on graphics mode at boot time with the help +of the BIOS, and use this as framebuffer device /dev/fb0, like the m68k +(and other) ports do. + +This means we decide at boot time whenever we want to run in text or +graphics mode. Switching mode later on (in protected mode) is +impossible; BIOS calls work in real mode only. VESA BIOS Extensions +Version 2.0 are required, because we need a linear frame buffer. + +Advantages: + + * It provides a nice large console (128 cols + 48 lines with 1024x768) + without using tiny, unreadable fonts. + * You can run XF68_FBDev on top of /dev/fb0 (=> non-accelerated X11 + support for every VBE 2.0 compliant graphics board). + * Most important: boot logo :-) + +Disadvantages: + + * graphic mode is slower than text mode... + + +How to use it? +============== + +Switching modes is done using the vga=... boot parameter. Read +Documentation/svga.txt for details. + +You should compile in both vgacon (for text mode) and vesafb (for +graphics mode). Which of them takes over the console depends on +whenever the specified mode is text or graphics. + +The graphic modes are NOT in the list which you get if you boot with +vga=ask and hit return. The mode you wish to use is derived from the +VESA mode number. Here are those VESA mode numbers: + + | 640x480 800x600 1024x768 1280x1024 +----+------------------------------------- +256 | 0x101 0x103 0x105 0x107 +32k | 0x110 0x113 0x116 0x119 +64k | 0x111 0x114 0x117 0x11A +16M | 0x112 0x115 0x118 0x11B + +The video mode number of the Linux kernel is the VESA mode number plus +0x200. + + Linux_kernel_mode_number = VESA_mode_number + 0x200 + +So the table for the Kernel mode numbers are: + + | 640x480 800x600 1024x768 1280x1024 +----+------------------------------------- +256 | 0x301 0x303 0x305 0x307 +32k | 0x310 0x313 0x316 0x319 +64k | 0x311 0x314 0x317 0x31A +16M | 0x312 0x315 0x318 0x31B + +To enable one of those modes you have to specify "vga=ask" in the +lilo.conf file and rerun LILO. Then you can type in the desired +mode at the "vga=ask" prompt. For example if you like to use +1024x768x256 colors you have to say "305" at this prompt. + +If this does not work, this might be because your BIOS does not support +linear framebuffers or because it does not support this mode at all. +Even if your board does, it might be the BIOS which does not. VESA BIOS +Extensions v2.0 are required, 1.2 is NOT sufficient. You will get a +"bad mode number" message if something goes wrong. + +1. Note: LILO cannot handle hex, for booting directly with + "vga=mode-number" you have to transform the numbers to decimal. +2. Note: Some newer versions of LILO appear to work with those hex values, + if you set the 0x in front of the numbers. + +X11 +=== + +XF68_FBDev should work just fine, but it is non-accelerated. Running +another (accelerated) X-Server like XF86_SVGA might or might not work. +It depends on X-Server and graphics board. + +The X-Server must restore the video mode correctly, else you end up +with a broken console (and vesafb cannot do anything about this). + + +Refresh rates +============= + +There is no way to change the vesafb video mode and/or timings after +booting linux. If you are not happy with the 60 Hz refresh rate, you +have these options: + + * configure and load the DOS-Tools for your the graphics board (if + available) and boot linux with loadlin. + * use a native driver (matroxfb/atyfb) instead if vesafb. If none + is available, write a new one! + * VBE 3.0 might work too. I have neither a gfx board with VBE 3.0 + support nor the specs, so I have not checked this yet. + + +Configuration +============= + +The VESA BIOS provides protected mode interface for changing +some parameters. vesafb can use it for palette changes and +to pan the display. It is turned off by default because it +seems not to work with some BIOS versions, but there are options +to turn it on. + +You can pass options to vesafb using "video=vesafb:option" on +the kernel command line. Multiple options should be separated +by comma, like this: "video=vesafb:ypan,invers" + +Accepted options: + +invers no comment... + +ypan enable display panning using the VESA protected mode + interface. The visible screen is just a window of the + video memory, console scrolling is done by changing the + start of the window. + pro: * scrolling (fullscreen) is fast, because there is + no need to copy around data. + * You'll get scrollback (the Shift-PgUp thing), + the video memory can be used as scrollback buffer + kontra: * scrolling only parts of the screen causes some + ugly flicker effects (boot logo flickers for + example). + +ywrap Same as ypan, but assumes your gfx board can wrap-around + the video memory (i.e. starts reading from top if it + reaches the end of video memory). Faster than ypan. + +redraw scroll by redrawing the affected part of the screen, this + is the safe (and slow) default. + + +vgapal Use the standard vga registers for palette changes. + This is the default. +pmipal Use the protected mode interface for palette changes. + +mtrr setup memory type range registers for the vesafb framebuffer. + +vremap:n + remap 'n' MiB of video RAM. If 0 or not specified, remap memory + according to video mode. (2.5.66 patch/idea by Antonino Daplas + reversed to give override possibility (allocate more fb memory + than the kernel would) to 2.4 by tmb@iki.fi) + +vtotal:n + if the video BIOS of your card incorrectly determines the total + amount of video RAM, use this option to override the BIOS (in MiB). + +Have fun! + + Gerd + +-- +Gerd Knorr <kraxel@goldbach.in-berlin.de> + +Minor (mostly typo) changes +by Nico Schmoigl <schmoigl@rumms.uni-mannheim.de> diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt new file mode 100644 index 000000000000..b6b1f5ab5354 --- /dev/null +++ b/Documentation/feature-removal-schedule.txt @@ -0,0 +1,42 @@ +The following is a list of files and features that are going to be +removed in the kernel source tree. Every entry should contain what +exactly is going away, why it is happening, and who is going to be doing +the work. When the feature is removed from the kernel, it should also +be removed from this file. + +--------------------------- + +What: devfs +When: July 2005 +Files: fs/devfs/*, include/linux/devfs_fs*.h and assorted devfs + function calls throughout the kernel tree +Why: It has been unmaintained for a number of years, has unfixable + races, contains a naming policy within the kernel that is + against the LSB, and can be replaced by using udev. +Who: Greg Kroah-Hartman <greg@kroah.com> + +--------------------------- + +What: ACPI S4bios support +When: May 2005 +Why: Noone uses it, and it probably does not work, anyway. swsusp is + faster, more reliable, and people are actually using it. +Who: Pavel Machek <pavel@suse.cz> + +--------------------------- + +What: PCI Name Database (CONFIG_PCI_NAMES) +When: July 2005 +Why: It bloats the kernel unnecessarily, and is handled by userspace better + (pciutils supports it.) Will eliminate the need to try to keep the + pci.ids file in sync with the sf.net database all of the time. +Who: Greg Kroah-Hartman <gregkh@suse.de> + +--------------------------- + +What: io_remap_page_range() (macro or function) +When: September 2005 +Why: Replaced by io_remap_pfn_range() which allows more memory space + addressabilty (by using a pfn) and supports sparc & sparc64 + iospace as part of the pfn. +Who: Randy Dunlap <rddunlap@osdl.org> diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX new file mode 100644 index 000000000000..bcfbab899b37 --- /dev/null +++ b/Documentation/filesystems/00-INDEX @@ -0,0 +1,50 @@ +00-INDEX + - this file (info on some of the filesystems supported by linux). +Locking + - info on locking rules as they pertain to Linux VFS. +adfs.txt + - info and mount options for the Acorn Advanced Disc Filing System. +affs.txt + - info and mount options for the Amiga Fast File System. +bfs.txt + - info for the SCO UnixWare Boot Filesystem (BFS). +cifs.txt + - description of the CIFS filesystem +coda.txt + - description of the CODA filesystem. +cramfs.txt + - info on the cram filesystem for small storage (ROMs etc) +devfs/ + - directory containing devfs documentation. +ext2.txt + - info, mount options and specifications for the Ext2 filesystem. +fat_cvf.txt + - info on the Compressed Volume Files extension to the FAT filesystem +hpfs.txt + - info and mount options for the OS/2 HPFS. +isofs.txt + - info and mount options for the ISO 9660 (CDROM) filesystem. +jfs.txt + - info and mount options for the JFS filesystem. +ncpfs.txt + - info on Novell Netware(tm) filesystem using NCP protocol. +ntfs.txt + - info and mount options for the NTFS filesystem (Windows NT). +proc.txt + - info on Linux's /proc filesystem. +romfs.txt + - Description of the ROMFS filesystem. +smbfs.txt + - info on using filesystems with the SMB protocol (Windows 3.11 and NT) +sysv-fs.txt + - info on the SystemV/V7/Xenix/Coherent filesystem. +udf.txt + - info and mount options for the UDF filesystem. +ufs.txt + - info on the ufs filesystem. +vfat.txt + - info on using the VFAT filesystem used in Windows NT and Windows 95 +vfs.txt + - Overview of the Virtual File System +xfs.txt + - info and mount options for the XFS filesystem. diff --git a/Documentation/filesystems/Exporting b/Documentation/filesystems/Exporting new file mode 100644 index 000000000000..31047e0fe14b --- /dev/null +++ b/Documentation/filesystems/Exporting @@ -0,0 +1,176 @@ + +Making Filesystems Exportable +============================= + +Most filesystem operations require a dentry (or two) as a starting +point. Local applications have a reference-counted hold on suitable +dentrys via open file descriptors or cwd/root. However remote +applications that access a filesystem via a remote filesystem protocol +such as NFS may not be able to hold such a reference, and so need a +different way to refer to a particular dentry. As the alternative +form of reference needs to be stable across renames, truncates, and +server-reboot (among other things, though these tend to be the most +problematic), there is no simple answer like 'filename'. + +The mechanism discussed here allows each filesystem implementation to +specify how to generate an opaque (out side of the filesystem) byte +string for any dentry, and how to find an appropriate dentry for any +given opaque byte string. +This byte string will be called a "filehandle fragment" as it +corresponds to part of an NFS filehandle. + +A filesystem which supports the mapping between filehandle fragments +and dentrys will be termed "exportable". + + + +Dcache Issues +------------- + +The dcache normally contains a proper prefix of any given filesystem +tree. This means that if any filesystem object is in the dcache, then +all of the ancestors of that filesystem object are also in the dcache. +As normal access is by filename this prefix is created naturally and +maintained easily (by each object maintaining a reference count on +its parent). + +However when objects are included into the dcache by interpreting a +filehandle fragment, there is no automatic creation of a path prefix +for the object. This leads to two related but distinct features of +the dcache that are not needed for normal filesystem access. + +1/ The dcache must sometimes contain objects that are not part of the + proper prefix. i.e that are not connected to the root. +2/ The dcache must be prepared for a newly found (via ->lookup) directory + to already have a (non-connected) dentry, and must be able to move + that dentry into place (based on the parent and name in the + ->lookup). This is particularly needed for directories as + it is a dcache invariant that directories only have one dentry. + +To implement these features, the dcache has: + +a/ A dentry flag DCACHE_DISCONNECTED which is set on + any dentry that might not be part of the proper prefix. + This is set when anonymous dentries are created, and cleared when a + dentry is noticed to be a child of a dentry which is in the proper + prefix. + +b/ A per-superblock list "s_anon" of dentries which are the roots of + subtrees that are not in the proper prefix. These dentries, as + well as the proper prefix, need to be released at unmount time. As + these dentries will not be hashed, they are linked together on the + d_hash list_head. + +c/ Helper routines to allocate anonymous dentries, and to help attach + loose directory dentries at lookup time. They are: + d_alloc_anon(inode) will return a dentry for the given inode. + If the inode already has a dentry, one of those is returned. + If it doesn't, a new anonymous (IS_ROOT and + DCACHE_DISCONNECTED) dentry is allocated and attached. + In the case of a directory, care is taken that only one dentry + can ever be attached. + d_splice_alias(inode, dentry) will make sure that there is a + dentry with the same name and parent as the given dentry, and + which refers to the given inode. + If the inode is a directory and already has a dentry, then that + dentry is d_moved over the given dentry. + If the passed dentry gets attached, care is taken that this is + mutually exclusive to a d_alloc_anon operation. + If the passed dentry is used, NULL is returned, else the used + dentry is returned. This corresponds to the calling pattern of + ->lookup. + + +Filesystem Issues +----------------- + +For a filesystem to be exportable it must: + + 1/ provide the filehandle fragment routines described below. + 2/ make sure that d_splice_alias is used rather than d_add + when ->lookup finds an inode for a given parent and name. + Typically the ->lookup routine will end: + if (inode) + return d_splice(inode, dentry); + d_add(dentry, inode); + return NULL; + } + + + + A file system implementation declares that instances of the filesystem +are exportable by setting the s_export_op field in the struct +super_block. This field must point to a "struct export_operations" +struct which could potentially be full of NULLs, though normally at +least get_parent will be set. + + The primary operations are decode_fh and encode_fh. +decode_fh takes a filehandle fragment and tries to find or create a +dentry for the object referred to by the filehandle. +encode_fh takes a dentry and creates a filehandle fragment which can +later be used to find/create a dentry for the same object. + +decode_fh will probably make use of "find_exported_dentry". +This function lives in the "exportfs" module which a filesystem does +not need unless it is being exported. So rather that calling +find_exported_dentry directly, each filesystem should call it through +the find_exported_dentry pointer in it's export_operations table. +This field is set correctly by the exporting agent (e.g. nfsd) when a +filesystem is exported, and before any export operations are called. + +find_exported_dentry needs three support functions from the +filesystem: + get_name. When given a parent dentry and a child dentry, this + should find a name in the directory identified by the parent + dentry, which leads to the object identified by the child dentry. + If no get_name function is supplied, a default implementation is + provided which uses vfs_readdir to find potential names, and + matches inode numbers to find the correct match. + + get_parent. When given a dentry for a directory, this should return + a dentry for the parent. Quite possibly the parent dentry will + have been allocated by d_alloc_anon. + The default get_parent function just returns an error so any + filehandle lookup that requires finding a parent will fail. + ->lookup("..") is *not* used as a default as it can leave ".." + entries in the dcache which are too messy to work with. + + get_dentry. When given an opaque datum, this should find the + implied object and create a dentry for it (possibly with + d_alloc_anon). + The opaque datum is whatever is passed down by the decode_fh + function, and is often simply a fragment of the filehandle + fragment. + decode_fh passes two datums through find_exported_dentry. One that + should be used to identify the target object, and one that can be + used to identify the object's parent, should that be necessary. + The default get_dentry function assumes that the datum contains an + inode number and a generation number, and it attempts to get the + inode using "iget" and check it's validity by matching the + generation number. A filesystem should only depend on the default + if iget can safely be used this way. + +If decode_fh and/or encode_fh are left as NULL, then default +implementations are used. These defaults are suitable for ext2 and +extremely similar filesystems (like ext3). + +The default encode_fh creates a filehandle fragment from the inode +number and generation number of the target together with the inode +number and generation number of the parent (if the parent is +required). + +The default decode_fh extract the target and parent datums from the +filehandle assuming the format used by the default encode_fh and +passed them to find_exported_dentry. + + +A filehandle fragment consists of an array of 1 or more 4byte words, +together with a one byte "type". +The decode_fh routine should not depend on the stated size that is +passed to it. This size may be larger than the original filehandle +generated by encode_fh, in which case it will have been padded with +nuls. Rather, the encode_fh routine should choose a "type" which +indicates the decode_fh how much of the filehandle is valid, and how +it should be interpreted. + + diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking new file mode 100644 index 000000000000..a934baeeb33a --- /dev/null +++ b/Documentation/filesystems/Locking @@ -0,0 +1,515 @@ + The text below describes the locking rules for VFS-related methods. +It is (believed to be) up-to-date. *Please*, if you change anything in +prototypes or locking protocols - update this file. And update the relevant +instances in the tree, don't leave that to maintainers of filesystems/devices/ +etc. At the very least, put the list of dubious cases in the end of this file. +Don't turn it into log - maintainers of out-of-the-tree code are supposed to +be able to use diff(1). + Thing currently missing here: socket operations. Alexey? + +--------------------------- dentry_operations -------------------------- +prototypes: + int (*d_revalidate)(struct dentry *, int); + int (*d_hash) (struct dentry *, struct qstr *); + int (*d_compare) (struct dentry *, struct qstr *, struct qstr *); + int (*d_delete)(struct dentry *); + void (*d_release)(struct dentry *); + void (*d_iput)(struct dentry *, struct inode *); + +locking rules: + none have BKL + dcache_lock rename_lock ->d_lock may block +d_revalidate: no no no yes +d_hash no no no yes +d_compare: no yes no no +d_delete: yes no yes no +d_release: no no no yes +d_iput: no no no yes + +--------------------------- inode_operations --------------------------- +prototypes: + int (*create) (struct inode *,struct dentry *,int, struct nameidata *); + struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameid +ata *); + int (*link) (struct dentry *,struct inode *,struct dentry *); + int (*unlink) (struct inode *,struct dentry *); + int (*symlink) (struct inode *,struct dentry *,const char *); + int (*mkdir) (struct inode *,struct dentry *,int); + int (*rmdir) (struct inode *,struct dentry *); + int (*mknod) (struct inode *,struct dentry *,int,dev_t); + int (*rename) (struct inode *, struct dentry *, + struct inode *, struct dentry *); + int (*readlink) (struct dentry *, char __user *,int); + int (*follow_link) (struct dentry *, struct nameidata *); + void (*truncate) (struct inode *); + int (*permission) (struct inode *, int, struct nameidata *); + int (*setattr) (struct dentry *, struct iattr *); + int (*getattr) (struct vfsmount *, struct dentry *, struct kstat *); + int (*setxattr) (struct dentry *, const char *,const void *,size_t,int); + ssize_t (*getxattr) (struct dentry *, const char *, void *, size_t); + ssize_t (*listxattr) (struct dentry *, char *, size_t); + int (*removexattr) (struct dentry *, const char *); + +locking rules: + all may block, none have BKL + i_sem(inode) +lookup: yes +create: yes +link: yes (both) +mknod: yes +symlink: yes +mkdir: yes +unlink: yes (both) +rmdir: yes (both) (see below) +rename: yes (all) (see below) +readlink: no +follow_link: no +truncate: yes (see below) +setattr: yes +permission: no +getattr: no +setxattr: yes +getxattr: no +listxattr: no +removexattr: yes + Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_sem on +victim. + cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem. + ->truncate() is never called directly - it's a callback, not a +method. It's called by vmtruncate() - library function normally used by +->setattr(). Locking information above applies to that call (i.e. is +inherited from ->setattr() - vmtruncate() is used when ATTR_SIZE had been +passed). + +See Documentation/filesystems/directory-locking for more detailed discussion +of the locking scheme for directory operations. + +--------------------------- super_operations --------------------------- +prototypes: + struct inode *(*alloc_inode)(struct super_block *sb); + void (*destroy_inode)(struct inode *); + void (*read_inode) (struct inode *); + void (*dirty_inode) (struct inode *); + int (*write_inode) (struct inode *, int); + void (*put_inode) (struct inode *); + void (*drop_inode) (struct inode *); + void (*delete_inode) (struct inode *); + void (*put_super) (struct super_block *); + void (*write_super) (struct super_block *); + int (*sync_fs)(struct super_block *sb, int wait); + void (*write_super_lockfs) (struct super_block *); + void (*unlockfs) (struct super_block *); + int (*statfs) (struct super_block *, struct kstatfs *); + int (*remount_fs) (struct super_block *, int *, char *); + void (*clear_inode) (struct inode *); + void (*umount_begin) (struct super_block *); + int (*show_options)(struct seq_file *, struct vfsmount *); + ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); + ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t); + +locking rules: + All may block. + BKL s_lock s_umount +alloc_inode: no no no +destroy_inode: no +read_inode: no (see below) +dirty_inode: no (must not sleep) +write_inode: no +put_inode: no +drop_inode: no !!!inode_lock!!! +delete_inode: no +put_super: yes yes no +write_super: no yes read +sync_fs: no no read +write_super_lockfs: ? +unlockfs: ? +statfs: no no no +remount_fs: no yes maybe (see below) +clear_inode: no +umount_begin: yes no no +show_options: no (vfsmount->sem) +quota_read: no no no (see below) +quota_write: no no no (see below) + +->read_inode() is not a method - it's a callback used in iget(). +->remount_fs() will have the s_umount lock if it's already mounted. +When called from get_sb_single, it does NOT have the s_umount lock. +->quota_read() and ->quota_write() functions are both guaranteed to +be the only ones operating on the quota file by the quota code (via +dqio_sem) (unless an admin really wants to screw up something and +writes to quota files with quotas on). For other details about locking +see also dquot_operations section. + +--------------------------- file_system_type --------------------------- +prototypes: + struct super_block *(*get_sb) (struct file_system_type *, int, + const char *, void *); + void (*kill_sb) (struct super_block *); +locking rules: + may block BKL +get_sb yes yes +kill_sb yes yes + +->get_sb() returns error or a locked superblock (exclusive on ->s_umount). +->kill_sb() takes a write-locked superblock, does all shutdown work on it, +unlocks and drops the reference. + +--------------------------- address_space_operations -------------------------- +prototypes: + int (*writepage)(struct page *page, struct writeback_control *wbc); + int (*readpage)(struct file *, struct page *); + int (*sync_page)(struct page *); + int (*writepages)(struct address_space *, struct writeback_control *); + int (*set_page_dirty)(struct page *page); + int (*readpages)(struct file *filp, struct address_space *mapping, + struct list_head *pages, unsigned nr_pages); + int (*prepare_write)(struct file *, struct page *, unsigned, unsigned); + int (*commit_write)(struct file *, struct page *, unsigned, unsigned); + sector_t (*bmap)(struct address_space *, sector_t); + int (*invalidatepage) (struct page *, unsigned long); + int (*releasepage) (struct page *, int); + int (*direct_IO)(int, struct kiocb *, const struct iovec *iov, + loff_t offset, unsigned long nr_segs); + +locking rules: + All except set_page_dirty may block + + BKL PageLocked(page) +writepage: no yes, unlocks (see below) +readpage: no yes, unlocks +sync_page: no maybe +writepages: no +set_page_dirty no no +readpages: no +prepare_write: no yes +commit_write: no yes +bmap: yes +invalidatepage: no yes +releasepage: no yes +direct_IO: no + + ->prepare_write(), ->commit_write(), ->sync_page() and ->readpage() +may be called from the request handler (/dev/loop). + + ->readpage() unlocks the page, either synchronously or via I/O +completion. + + ->readpages() populates the pagecache with the passed pages and starts +I/O against them. They come unlocked upon I/O completion. + + ->writepage() is used for two purposes: for "memory cleansing" and for +"sync". These are quite different operations and the behaviour may differ +depending upon the mode. + +If writepage is called for sync (wbc->sync_mode != WBC_SYNC_NONE) then +it *must* start I/O against the page, even if that would involve +blocking on in-progress I/O. + +If writepage is called for memory cleansing (sync_mode == +WBC_SYNC_NONE) then its role is to get as much writeout underway as +possible. So writepage should try to avoid blocking against +currently-in-progress I/O. + +If the filesystem is not called for "sync" and it determines that it +would need to block against in-progress I/O to be able to start new I/O +against the page the filesystem should redirty the page with +redirty_page_for_writepage(), then unlock the page and return zero. +This may also be done to avoid internal deadlocks, but rarely. + +If the filesytem is called for sync then it must wait on any +in-progress I/O and then start new I/O. + +The filesystem should unlock the page synchronously, before returning +to the caller. + +Unless the filesystem is going to redirty_page_for_writepage(), unlock the page +and return zero, writepage *must* run set_page_writeback() against the page, +followed by unlocking it. Once set_page_writeback() has been run against the +page, write I/O can be submitted and the write I/O completion handler must run +end_page_writeback() once the I/O is complete. If no I/O is submitted, the +filesystem must run end_page_writeback() against the page before returning from +writepage. + +That is: after 2.5.12, pages which are under writeout are *not* locked. Note, +if the filesystem needs the page to be locked during writeout, that is ok, too, +the page is allowed to be unlocked at any point in time between the calls to +set_page_writeback() and end_page_writeback(). + +Note, failure to run either redirty_page_for_writepage() or the combination of +set_page_writeback()/end_page_writeback() on a page submitted to writepage +will leave the page itself marked clean but it will be tagged as dirty in the +radix tree. This incoherency can lead to all sorts of hard-to-debug problems +in the filesystem like having dirty inodes at umount and losing written data. + + ->sync_page() locking rules are not well-defined - usually it is called +with lock on page, but that is not guaranteed. Considering the currently +existing instances of this method ->sync_page() itself doesn't look +well-defined... + + ->writepages() is used for periodic writeback and for syscall-initiated +sync operations. The address_space should start I/O against at least +*nr_to_write pages. *nr_to_write must be decremented for each page which is +written. The address_space implementation may write more (or less) pages +than *nr_to_write asks for, but it should try to be reasonably close. If +nr_to_write is NULL, all dirty pages must be written. + +writepages should _only_ write pages which are present on +mapping->io_pages. + + ->set_page_dirty() is called from various places in the kernel +when the target page is marked as needing writeback. It may be called +under spinlock (it cannot block) and is sometimes called with the page +not locked. + + ->bmap() is currently used by legacy ioctl() (FIBMAP) provided by some +filesystems and by the swapper. The latter will eventually go away. All +instances do not actually need the BKL. Please, keep it that way and don't +breed new callers. + + ->invalidatepage() is called when the filesystem must attempt to drop +some or all of the buffers from the page when it is being truncated. It +returns zero on success. If ->invalidatepage is zero, the kernel uses +block_invalidatepage() instead. + + ->releasepage() is called when the kernel is about to try to drop the +buffers from the page in preparation for freeing it. It returns zero to +indicate that the buffers are (or may be) freeable. If ->releasepage is zero, +the kernel assumes that the fs has no private interest in the buffers. + + Note: currently almost all instances of address_space methods are +using BKL for internal serialization and that's one of the worst sources +of contention. Normally they are calling library functions (in fs/buffer.c) +and pass foo_get_block() as a callback (on local block-based filesystems, +indeed). BKL is not needed for library stuff and is usually taken by +foo_get_block(). It's an overkill, since block bitmaps can be protected by +internal fs locking and real critical areas are much smaller than the areas +filesystems protect now. + +----------------------- file_lock_operations ------------------------------ +prototypes: + void (*fl_insert)(struct file_lock *); /* lock insertion callback */ + void (*fl_remove)(struct file_lock *); /* lock removal callback */ + void (*fl_copy_lock)(struct file_lock *, struct file_lock *); + void (*fl_release_private)(struct file_lock *); + + +locking rules: + BKL may block +fl_insert: yes no +fl_remove: yes no +fl_copy_lock: yes no +fl_release_private: yes yes + +----------------------- lock_manager_operations --------------------------- +prototypes: + int (*fl_compare_owner)(struct file_lock *, struct file_lock *); + void (*fl_notify)(struct file_lock *); /* unblock callback */ + void (*fl_copy_lock)(struct file_lock *, struct file_lock *); + void (*fl_release_private)(struct file_lock *); + void (*fl_break)(struct file_lock *); /* break_lease callback */ + +locking rules: + BKL may block +fl_compare_owner: yes no +fl_notify: yes no +fl_copy_lock: yes no +fl_release_private: yes yes +fl_break: yes no + + Currently only NFSD and NLM provide instances of this class. None of the +them block. If you have out-of-tree instances - please, show up. Locking +in that area will change. +--------------------------- buffer_head ----------------------------------- +prototypes: + void (*b_end_io)(struct buffer_head *bh, int uptodate); + +locking rules: + called from interrupts. In other words, extreme care is needed here. +bh is locked, but that's all warranties we have here. Currently only RAID1, +highmem, fs/buffer.c, and fs/ntfs/aops.c are providing these. Block devices +call this method upon the IO completion. + +--------------------------- block_device_operations ----------------------- +prototypes: + int (*open) (struct inode *, struct file *); + int (*release) (struct inode *, struct file *); + int (*ioctl) (struct inode *, struct file *, unsigned, unsigned long); + int (*media_changed) (struct gendisk *); + int (*revalidate_disk) (struct gendisk *); + +locking rules: + BKL bd_sem +open: yes yes +release: yes yes +ioctl: yes no +media_changed: no no +revalidate_disk: no no + +The last two are called only from check_disk_change(). + +--------------------------- file_operations ------------------------------- +prototypes: + loff_t (*llseek) (struct file *, loff_t, int); + ssize_t (*read) (struct file *, char __user *, size_t, loff_t *); + ssize_t (*aio_read) (struct kiocb *, char __user *, size_t, loff_t); + ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *); + ssize_t (*aio_write) (struct kiocb *, const char __user *, size_t, + loff_t); + int (*readdir) (struct file *, void *, filldir_t); + unsigned int (*poll) (struct file *, struct poll_table_struct *); + int (*ioctl) (struct inode *, struct file *, unsigned int, + unsigned long); + long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long); + long (*compat_ioctl) (struct file *, unsigned int, unsigned long); + int (*mmap) (struct file *, struct vm_area_struct *); + int (*open) (struct inode *, struct file *); + int (*flush) (struct file *); + int (*release) (struct inode *, struct file *); + int (*fsync) (struct file *, struct dentry *, int datasync); + int (*aio_fsync) (struct kiocb *, int datasync); + int (*fasync) (int, struct file *, int); + int (*lock) (struct file *, int, struct file_lock *); + ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, + loff_t *); + ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, + loff_t *); + ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t, + void __user *); + ssize_t (*sendpage) (struct file *, struct page *, int, size_t, + loff_t *, int); + unsigned long (*get_unmapped_area)(struct file *, unsigned long, + unsigned long, unsigned long, unsigned long); + int (*check_flags)(int); + int (*dir_notify)(struct file *, unsigned long); +}; + +locking rules: + All except ->poll() may block. + BKL +llseek: no (see below) +read: no +aio_read: no +write: no +aio_write: no +readdir: no +poll: no +ioctl: yes (see below) +unlocked_ioctl: no (see below) +compat_ioctl: no +mmap: no +open: maybe (see below) +flush: no +release: no +fsync: no (see below) +aio_fsync: no +fasync: yes (see below) +lock: yes +readv: no +writev: no +sendfile: no +sendpage: no +get_unmapped_area: no +check_flags: no +dir_notify: no + +->llseek() locking has moved from llseek to the individual llseek +implementations. If your fs is not using generic_file_llseek, you +need to acquire and release the appropriate locks in your ->llseek(). +For many filesystems, it is probably safe to acquire the inode +semaphore. Note some filesystems (i.e. remote ones) provide no +protection for i_size so you will need to use the BKL. + +->open() locking is in-transit: big lock partially moved into the methods. +The only exception is ->open() in the instances of file_operations that never +end up in ->i_fop/->proc_fops, i.e. ones that belong to character devices +(chrdev_open() takes lock before replacing ->f_op and calling the secondary +method. As soon as we fix the handling of module reference counters all +instances of ->open() will be called without the BKL. + +Note: ext2_release() was *the* source of contention on fs-intensive +loads and dropping BKL on ->release() helps to get rid of that (we still +grab BKL for cases when we close a file that had been opened r/w, but that +can and should be done using the internal locking with smaller critical areas). +Current worst offender is ext2_get_block()... + +->fasync() is a mess. This area needs a big cleanup and that will probably +affect locking. + +->readdir() and ->ioctl() on directories must be changed. Ideally we would +move ->readdir() to inode_operations and use a separate method for directory +->ioctl() or kill the latter completely. One of the problems is that for +anything that resembles union-mount we won't have a struct file for all +components. And there are other reasons why the current interface is a mess... + +->ioctl() on regular files is superceded by the ->unlocked_ioctl() that +doesn't take the BKL. + +->read on directories probably must go away - we should just enforce -EISDIR +in sys_read() and friends. + +->fsync() has i_sem on inode. + +--------------------------- dquot_operations ------------------------------- +prototypes: + int (*initialize) (struct inode *, int); + int (*drop) (struct inode *); + int (*alloc_space) (struct inode *, qsize_t, int); + int (*alloc_inode) (const struct inode *, unsigned long); + int (*free_space) (struct inode *, qsize_t); + int (*free_inode) (const struct inode *, unsigned long); + int (*transfer) (struct inode *, struct iattr *); + int (*write_dquot) (struct dquot *); + int (*acquire_dquot) (struct dquot *); + int (*release_dquot) (struct dquot *); + int (*mark_dirty) (struct dquot *); + int (*write_info) (struct super_block *, int); + +These operations are intended to be more or less wrapping functions that ensure +a proper locking wrt the filesystem and call the generic quota operations. + +What filesystem should expect from the generic quota functions: + + FS recursion Held locks when called +initialize: yes maybe dqonoff_sem +drop: yes - +alloc_space: ->mark_dirty() - +alloc_inode: ->mark_dirty() - +free_space: ->mark_dirty() - +free_inode: ->mark_dirty() - +transfer: yes - +write_dquot: yes dqonoff_sem or dqptr_sem +acquire_dquot: yes dqonoff_sem or dqptr_sem +release_dquot: yes dqonoff_sem or dqptr_sem +mark_dirty: no - +write_info: yes dqonoff_sem + +FS recursion means calling ->quota_read() and ->quota_write() from superblock +operations. + +->alloc_space(), ->alloc_inode(), ->free_space(), ->free_inode() are called +only directly by the filesystem and do not call any fs functions only +the ->mark_dirty() operation. + +More details about quota locking can be found in fs/dquot.c. + +--------------------------- vm_operations_struct ----------------------------- +prototypes: + void (*open)(struct vm_area_struct*); + void (*close)(struct vm_area_struct*); + struct page *(*nopage)(struct vm_area_struct*, unsigned long, int *); + +locking rules: + BKL mmap_sem +open: no yes +close: no yes +nopage: no yes + +================================================================================ + Dubious stuff + +(if you break something or notice that it is broken and do not fix it yourself +- at least put it here) + +ipc/shm.c::shm_delete() - may need BKL. +->read() and ->write() in many drivers are (probably) missing BKL. +drivers/sgi/char/graphics.c::sgi_graphics_nopage() - may need BKL. diff --git a/Documentation/filesystems/adfs.txt b/Documentation/filesystems/adfs.txt new file mode 100644 index 000000000000..060abb0c7004 --- /dev/null +++ b/Documentation/filesystems/adfs.txt @@ -0,0 +1,57 @@ +Mount options for ADFS +---------------------- + + uid=nnn All files in the partition will be owned by + user id nnn. Default 0 (root). + gid=nnn All files in the partition willbe in group + nnn. Default 0 (root). + ownmask=nnn The permission mask for ADFS 'owner' permissions + will be nnn. Default 0700. + othmask=nnn The permission mask for ADFS 'other' permissions + will be nnn. Default 0077. + +Mapping of ADFS permissions to Linux permissions +------------------------------------------------ + + ADFS permissions consist of the following: + + Owner read + Owner write + Other read + Other write + + (In older versions, an 'execute' permission did exist, but this + does not hold the same meaning as the Linux 'execute' permission + and is now obsolete). + + The mapping is performed as follows: + + Owner read -> -r--r--r-- + Owner write -> --w--w---w + Owner read and filetype UnixExec -> ---x--x--x + These are then masked by ownmask, eg 700 -> -rwx------ + Possible owner mode permissions -> -rwx------ + + Other read -> -r--r--r-- + Other write -> --w--w--w- + Other read and filetype UnixExec -> ---x--x--x + These are then masked by othmask, eg 077 -> ----rwxrwx + Possible other mode permissions -> ----rwxrwx + + Hence, with the default masks, if a file is owner read/write, and + not a UnixExec filetype, then the permissions will be: + + -rw------- + + However, if the masks were ownmask=0770,othmask=0007, then this would + be modified to: + -rw-rw---- + + There is no restriction on what you can do with these masks. You may + wish that either read bits give read access to the file for all, but + keep the default write protection (ownmask=0755,othmask=0577): + + -rw-r--r-- + + You can therefore tailor the permission translation to whatever you + desire the permissions should be under Linux. diff --git a/Documentation/filesystems/affs.txt b/Documentation/filesystems/affs.txt new file mode 100644 index 000000000000..30c9738590f4 --- /dev/null +++ b/Documentation/filesystems/affs.txt @@ -0,0 +1,219 @@ +Overview of Amiga Filesystems +============================= + +Not all varieties of the Amiga filesystems are supported for reading and +writing. The Amiga currently knows six different filesystems: + +DOS\0 The old or original filesystem, not really suited for + hard disks and normally not used on them, either. + Supported read/write. + +DOS\1 The original Fast File System. Supported read/write. + +DOS\2 The old "international" filesystem. International means that + a bug has been fixed so that accented ("international") letters + in file names are case-insensitive, as they ought to be. + Supported read/write. + +DOS\3 The "international" Fast File System. Supported read/write. + +DOS\4 The original filesystem with directory cache. The directory + cache speeds up directory accesses on floppies considerably, + but slows down file creation/deletion. Doesn't make much + sense on hard disks. Supported read only. + +DOS\5 The Fast File System with directory cache. Supported read only. + +All of the above filesystems allow block sizes from 512 to 32K bytes. +Supported block sizes are: 512, 1024, 2048 and 4096 bytes. Larger blocks +speed up almost everything at the expense of wasted disk space. The speed +gain above 4K seems not really worth the price, so you don't lose too +much here, either. + +The muFS (multi user File System) equivalents of the above file systems +are supported, too. + +Mount options for the AFFS +========================== + +protect If this option is set, the protection bits cannot be altered. + +setuid[=uid] This sets the owner of all files and directories in the file + system to uid or the uid of the current user, respectively. + +setgid[=gid] Same as above, but for gid. + +mode=mode Sets the mode flags to the given (octal) value, regardless + of the original permissions. Directories will get an x + permission if the corresponding r bit is set. + This is useful since most of the plain AmigaOS files + will map to 600. + +reserved=num Sets the number of reserved blocks at the start of the + partition to num. You should never need this option. + Default is 2. + +root=block Sets the block number of the root block. This should never + be necessary. + +bs=blksize Sets the blocksize to blksize. Valid block sizes are 512, + 1024, 2048 and 4096. Like the root option, this should + never be necessary, as the affs can figure it out itself. + +quiet The file system will not return an error for disallowed + mode changes. + +verbose The volume name, file system type and block size will + be written to the syslog when the filesystem is mounted. + +mufs The filesystem is really a muFS, also it doesn't + identify itself as one. This option is necessary if + the filesystem wasn't formatted as muFS, but is used + as one. + +prefix=path Path will be prefixed to every absolute path name of + symbolic links on an AFFS partition. Default = "/". + (See below.) + +volume=name When symbolic links with an absolute path are created + on an AFFS partition, name will be prepended as the + volume name. Default = "" (empty string). + (See below.) + +Handling of the Users/Groups and protection flags +================================================= + +Amiga -> Linux: + +The Amiga protection flags RWEDRWEDHSPARWED are handled as follows: + + - R maps to r for user, group and others. On directories, R implies x. + + - If both W and D are allowed, w will be set. + + - E maps to x. + + - H and P are always retained and ignored under Linux. + + - A is always reset when a file is written to. + +User id and group id will be used unless set[gu]id are given as mount +options. Since most of the Amiga file systems are single user systems +they will be owned by root. The root directory (the mount point) of the +Amiga filesystem will be owned by the user who actually mounts the +filesystem (the root directory doesn't have uid/gid fields). + +Linux -> Amiga: + +The Linux rwxrwxrwx file mode is handled as follows: + + - r permission will set R for user, group and others. + + - w permission will set W and D for user, group and others. + + - x permission of the user will set E for plain files. + + - All other flags (suid, sgid, ...) are ignored and will + not be retained. + +Newly created files and directories will get the user and group ID +of the current user and a mode according to the umask. + +Symbolic links +============== + +Although the Amiga and Linux file systems resemble each other, there +are some, not always subtle, differences. One of them becomes apparent +with symbolic links. While Linux has a file system with exactly one +root directory, the Amiga has a separate root directory for each +file system (for example, partition, floppy disk, ...). With the Amiga, +these entities are called "volumes". They have symbolic names which +can be used to access them. Thus, symbolic links can point to a +different volume. AFFS turns the volume name into a directory name +and prepends the prefix path (see prefix option) to it. + +Example: +You mount all your Amiga partitions under /amiga/<volume> (where +<volume> is the name of the volume), and you give the option +"prefix=/amiga/" when mounting all your AFFS partitions. (They +might be "User", "WB" and "Graphics", the mount points /amiga/User, +/amiga/WB and /amiga/Graphics). A symbolic link referring to +"User:sc/include/dos/dos.h" will be followed to +"/amiga/User/sc/include/dos/dos.h". + +Examples +======== + +Command line: + mount Archive/Amiga/Workbench3.1.adf /mnt -t affs -o loop,verbose + mount /dev/sda3 /Amiga -t affs + +/etc/fstab entry: + /dev/sdb5 /amiga/Workbench affs noauto,user,exec,verbose 0 0 + +IMPORTANT NOTE +============== + +If you boot Windows 95 (don't know about 3.x, 98 and NT) while you +have an Amiga harddisk connected to your PC, it will overwrite +the bytes 0x00dc..0x00df of block 0 with garbage, thus invalidating +the Rigid Disk Block. Sheer luck has it that this is an unused +area of the RDB, so only the checksum doesn't match anymore. +Linux will ignore this garbage and recognize the RDB anyway, but +before you connect that drive to your Amiga again, you must +restore or repair your RDB. So please do make a backup copy of it +before booting Windows! + +If the damage is already done, the following should fix the RDB +(where <disk> is the device name). +DO AT YOUR OWN RISK: + + dd if=/dev/<disk> of=rdb.tmp count=1 + cp rdb.tmp rdb.fixed + dd if=/dev/zero of=rdb.fixed bs=1 seek=220 count=4 + dd if=rdb.fixed of=/dev/<disk> + +Bugs, Restrictions, Caveats +=========================== + +Quite a few things may not work as advertised. Not everything is +tested, though several hundred MB have been read and written using +this fs. For a most up-to-date list of bugs please consult +fs/affs/Changes. + +Filenames are truncated to 30 characters without warning (this +can be changed by setting the compile-time option AFFS_NO_TRUNCATE +in include/linux/amigaffs.h). + +Case is ignored by the affs in filename matching, but Linux shells +do care about the case. Example (with /wb being an affs mounted fs): + rm /wb/WRONGCASE +will remove /mnt/wrongcase, but + rm /wb/WR* +will not since the names are matched by the shell. + +The block allocation is designed for hard disk partitions. If more +than 1 process writes to a (small) diskette, the blocks are allocated +in an ugly way (but the real AFFS doesn't do much better). This +is also true when space gets tight. + +You cannot execute programs on an OFS (Old File System), since the +program files cannot be memory mapped due to the 488 byte blocks. +For the same reason you cannot mount an image on such a filesystem +via the loopback device. + +The bitmap valid flag in the root block may not be accurate when the +system crashes while an affs partition is mounted. There's currently +no way to fix a garbled filesystem without an Amiga (disk validator) +or manually (who would do this?). Maybe later. + +If you mount affs partitions on system startup, you may want to tell +fsck that the fs should not be checked (place a '0' in the sixth field +of /etc/fstab). + +It's not possible to read floppy disks with a normal PC or workstation +due to an incompatibility with the Amiga floppy controller. + +If you are interested in an Amiga Emulator for Linux, look at + +http://www-users.informatik.rwth-aachen.de/~crux/uae.html diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt new file mode 100644 index 000000000000..2f4237dfb8c7 --- /dev/null +++ b/Documentation/filesystems/afs.txt @@ -0,0 +1,155 @@ + kAFS: AFS FILESYSTEM + ==================== + +ABOUT +===== + +This filesystem provides a fairly simple AFS filesystem driver. It is under +development and only provides very basic facilities. It does not yet support +the following AFS features: + + (*) Write support. + (*) Communications security. + (*) Local caching. + (*) pioctl() system call. + (*) Automatic mounting of embedded mountpoints. + + +USAGE +===== + +When inserting the driver modules the root cell must be specified along with a +list of volume location server IP addresses: + + insmod rxrpc.o + insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91 + +The first module is a driver for the RxRPC remote operation protocol, and the +second is the actual filesystem driver for the AFS filesystem. + +Once the module has been loaded, more modules can be added by the following +procedure: + + echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells + +Where the parameters to the "add" command are the name of a cell and a list of +volume location servers within that cell. + +Filesystems can be mounted anywhere by commands similar to the following: + + mount -t afs "%cambridge.redhat.com:root.afs." /afs + mount -t afs "#cambridge.redhat.com:root.cell." /afs/cambridge + mount -t afs "#root.afs." /afs + mount -t afs "#root.cell." /afs/cambridge + + NB: When using this on Linux 2.4, the mount command has to be different, + since the filesystem doesn't have access to the device name argument: + + mount -t afs none /afs -ovol="#root.afs." + +Where the initial character is either a hash or a percent symbol depending on +whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O +volume, but are willing to use a R/W volume instead (percent). + +The name of the volume can be suffixes with ".backup" or ".readonly" to +specify connection to only volumes of those types. + +The name of the cell is optional, and if not given during a mount, then the +named volume will be looked up in the cell specified during insmod. + +Additional cells can be added through /proc (see later section). + + +MOUNTPOINTS +=========== + +AFS has a concept of mountpoints. These are specially formatted symbolic links +(of the same form as the "device name" passed to mount). kAFS presents these +to the user as directories that have special properties: + + (*) They cannot be listed. Running a program like "ls" on them will incur an + EREMOTE error (Object is remote). + + (*) Other objects can't be looked up inside of them. This also incurs an + EREMOTE error. + + (*) They can be queried with the readlink() system call, which will return + the name of the mountpoint to which they point. The "readlink" program + will also work. + + (*) They can be mounted on (which symbolic links can't). + + +PROC FILESYSTEM +=============== + +The rxrpc module creates a number of files in various places in the /proc +filesystem: + + (*) Firstly, some information files are made available in a directory called + "/proc/net/rxrpc/". These list the extant transport endpoint, peer, + connection and call records. + + (*) Secondly, some control files are made available in a directory called + "/proc/sys/rxrpc/". Currently, all these files can be used for is to + turn on various levels of tracing. + +The AFS modules creates a "/proc/fs/afs/" directory and populates it: + + (*) A "cells" file that lists cells currently known to the afs module. + + (*) A directory per cell that contains files that list volume location + servers, volumes, and active servers known within that cell. + + +THE CELL DATABASE +================= + +The filesystem maintains an internal database of all the cells it knows and +the IP addresses of the volume location servers for those cells. The cell to +which the computer belongs is added to the database when insmod is performed +by the "rootcell=" argument. + +Further cells can be added by commands similar to the following: + + echo add CELLNAME VLADDR[:VLADDR][:VLADDR]... >/proc/fs/afs/cells + echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells + +No other cell database operations are available at this time. + + +EXAMPLES +======== + +Here's what I use to test this. Some of the names and IP addresses are local +to my internal DNS. My "root.afs" partition has a mount point within it for +some public volumes volumes. + +insmod -S /tmp/rxrpc.o +insmod -S /tmp/kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91 + +mount -t afs \%root.afs. /afs +mount -t afs \%cambridge.redhat.com:root.cell. /afs/cambridge.redhat.com/ + +echo add grand.central.org 18.7.14.88:128.2.191.224 > /proc/fs/afs/cells +mount -t afs "#grand.central.org:root.cell." /afs/grand.central.org/ +mount -t afs "#grand.central.org:root.archive." /afs/grand.central.org/archive +mount -t afs "#grand.central.org:root.contrib." /afs/grand.central.org/contrib +mount -t afs "#grand.central.org:root.doc." /afs/grand.central.org/doc +mount -t afs "#grand.central.org:root.project." /afs/grand.central.org/project +mount -t afs "#grand.central.org:root.service." /afs/grand.central.org/service +mount -t afs "#grand.central.org:root.software." /afs/grand.central.org/software +mount -t afs "#grand.central.org:root.user." /afs/grand.central.org/user + +umount /afs/grand.central.org/user +umount /afs/grand.central.org/software +umount /afs/grand.central.org/service +umount /afs/grand.central.org/project +umount /afs/grand.central.org/doc +umount /afs/grand.central.org/contrib +umount /afs/grand.central.org/archive +umount /afs/grand.central.org +umount /afs/cambridge.redhat.com +umount /afs +rmmod kafs +rmmod rxrpc diff --git a/Documentation/filesystems/automount-support.txt b/Documentation/filesystems/automount-support.txt new file mode 100644 index 000000000000..58c65a1713e5 --- /dev/null +++ b/Documentation/filesystems/automount-support.txt @@ -0,0 +1,118 @@ +Support is available for filesystems that wish to do automounting support (such +as kAFS which can be found in fs/afs/). This facility includes allowing +in-kernel mounts to be performed and mountpoint degradation to be +requested. The latter can also be requested by userspace. + + +====================== +IN-KERNEL AUTOMOUNTING +====================== + +A filesystem can now mount another filesystem on one of its directories by the +following procedure: + + (1) Give the directory a follow_link() operation. + + When the directory is accessed, the follow_link op will be called, and + it will be provided with the location of the mountpoint in the nameidata + structure (vfsmount and dentry). + + (2) Have the follow_link() op do the following steps: + + (a) Call do_kern_mount() to call the appropriate filesystem to set up a + superblock and gain a vfsmount structure representing it. + + (b) Copy the nameidata provided as an argument and substitute the dentry + argument into it the copy. + + (c) Call do_add_mount() to install the new vfsmount into the namespace's + mountpoint tree, thus making it accessible to userspace. Use the + nameidata set up in (b) as the destination. + + If the mountpoint will be automatically expired, then do_add_mount() + should also be given the location of an expiration list (see further + down). + + (d) Release the path in the nameidata argument and substitute in the new + vfsmount and its root dentry. The ref counts on these will need + incrementing. + +Then from userspace, you can just do something like: + + [root@andromeda root]# mount -t afs \#root.afs. /afs + [root@andromeda root]# ls /afs + asd cambridge cambridge.redhat.com grand.central.org + [root@andromeda root]# ls /afs/cambridge + afsdoc + [root@andromeda root]# ls /afs/cambridge/afsdoc/ + ChangeLog html LICENSE pdf RELNOTES-1.2.2 + +And then if you look in the mountpoint catalogue, you'll see something like: + + [root@andromeda root]# cat /proc/mounts + ... + #root.afs. /afs afs rw 0 0 + #root.cell. /afs/cambridge.redhat.com afs rw 0 0 + #afsdoc. /afs/cambridge.redhat.com/afsdoc afs rw 0 0 + + +=========================== +AUTOMATIC MOUNTPOINT EXPIRY +=========================== + +Automatic expiration of mountpoints is easy, provided you've mounted the +mountpoint to be expired in the automounting procedure outlined above. + +To do expiration, you need to follow these steps: + + (3) Create at least one list off which the vfsmounts to be expired can be + hung. Access to this list will be governed by the vfsmount_lock. + + (4) In step (2c) above, the call to do_add_mount() should be provided with a + pointer to this list. It will hang the vfsmount off of it if it succeeds. + + (5) When you want mountpoints to be expired, call mark_mounts_for_expiry() + with a pointer to this list. This will process the list, marking every + vfsmount thereon for potential expiry on the next call. + + If a vfsmount was already flagged for expiry, and if its usage count is 1 + (it's only referenced by its parent vfsmount), then it will be deleted + from the namespace and thrown away (effectively unmounted). + + It may prove simplest to simply call this at regular intervals, using + some sort of timed event to drive it. + +The expiration flag is cleared by calls to mntput. This means that expiration +will only happen on the second expiration request after the last time the +mountpoint was accessed. + +If a mountpoint is moved, it gets removed from the expiration list. If a bind +mount is made on an expirable mount, the new vfsmount will not be on the +expiration list and will not expire. + +If a namespace is copied, all mountpoints contained therein will be copied, +and the copies of those that are on an expiration list will be added to the +same expiration list. + + +======================= +USERSPACE DRIVEN EXPIRY +======================= + +As an alternative, it is possible for userspace to request expiry of any +mountpoint (though some will be rejected - the current process's idea of the +rootfs for example). It does this by passing the MNT_EXPIRE flag to +umount(). This flag is considered incompatible with MNT_FORCE and MNT_DETACH. + +If the mountpoint in question is in referenced by something other than +umount() or its parent mountpoint, an EBUSY error will be returned and the +mountpoint will not be marked for expiration or unmounted. + +If the mountpoint was not already marked for expiry at that time, an EAGAIN +error will be given and it won't be unmounted. + +Otherwise if it was already marked and it wasn't referenced, unmounting will +take place as usual. + +Again, the expiration flag is cleared every time anything other than umount() +looks at a mountpoint. diff --git a/Documentation/filesystems/befs.txt b/Documentation/filesystems/befs.txt new file mode 100644 index 000000000000..877a7b1d46ec --- /dev/null +++ b/Documentation/filesystems/befs.txt @@ -0,0 +1,117 @@ +BeOS filesystem for Linux + +Document last updated: Dec 6, 2001 + +WARNING +======= +Make sure you understand that this is alpha software. This means that the +implementation is neither complete nor well-tested. + +I DISCLAIM ALL RESPONSIBILTY FOR ANY POSSIBLE BAD EFFECTS OF THIS CODE! + +LICENSE +===== +This software is covered by the GNU General Public License. +See the file COPYING for the complete text of the license. +Or the GNU website: <http://www.gnu.org/licenses/licenses.html> + +AUTHOR +===== +The largest part of the code written by Will Dyson <will_dyson@pobox.com> +He has been working on the code since Aug 13, 2001. See the changelog for +details. + +Original Author: Makoto Kato <m_kato@ga2.so-net.ne.jp> +His orriginal code can still be found at: +<http://hp.vector.co.jp/authors/VA008030/bfs/> +Does anyone know of a more current email address for Makoto? He doesn't +respond to the address given above... + +Current maintainer: Sergey S. Kostyliov <rathamahata@php4.ru> + +WHAT IS THIS DRIVER? +================== +This module implements the native filesystem of BeOS <http://www.be.com/> +for the linux 2.4.1 and later kernels. Currently it is a read-only +implementation. + +Which is it, BFS or BEFS? +================ +Be, Inc said, "BeOS Filesystem is officially called BFS, not BeFS". +But Unixware Boot Filesystem is called bfs, too. And they are already in +the kernel. Because of this nameing conflict, on Linux the BeOS +filesystem is called befs. + +HOW TO INSTALL +============== +step 1. Install the BeFS patch into the source code tree of linux. + +Apply the patchfile to your kernel source tree. +Assuming that your kernel source is in /foo/bar/linux and the patchfile +is called patch-befs-xxx, you would do the following: + + cd /foo/bar/linux + patch -p1 < /path/to/patch-befs-xxx + +if the patching step fails (i.e. there are rejected hunks), you can try to +figure it out yourself (it shouldn't be hard), or mail the maintainer +(Will Dyson <will_dyson@pobox.com>) for help. + +step 2. Configuretion & make kernel + +The linux kernel has many compile-time options. Most of them are beyond the +scope of this document. I suggest the Kernel-HOWTO document as a good general +reference on this topic. <http://www.linux.com/howto/Kernel-HOWTO.html> + +However, to use the BeFS module, you must enable it at configure time. + + cd /foo/bar/linux + make menuconfig (or xconfig) + +The BeFS module is not a standard part of the linux kernel, so you must first +enable support for experimental code under the "Code maturity level" menu. + +Then, under the "Filesystems" menu will be an option called "BeFS +filesystem (experimental)", or something like that. Enable that option +(it is fine to make it a module). + +Save your kernel configuration and then build your kernel. + +step 3. Install + +See the kernel howto <http://www.linux.com/howto/Kernel-HOWTO.html> for +instructions on this critical step. + +USING BFS +========= +To use the BeOS filesystem, use filesystem type 'befs'. + +ex) + mount -t befs /dev/fd0 /beos + +MOUNT OPTIONS +============= +uid=nnn All files in the partition will be owned by user id nnn. +gid=nnn All files in the partition will be in group nnn. +iocharset=xxx Use xxx as the name of the NLS translation table. +debug The driver will output debugging information to the syslog. + +HOW TO GET LASTEST VERSION +========================== + +The latest version is currently available at: +<http://befs-driver.sourceforge.net/> + +ANY KNOWN BUGS? +=========== +As of Jan 20, 2002: + + None + +SPECIAL THANKS +============== +Dominic Giampalo ... Writing "Practical file system design with Be filesystem" +Hiroyuki Yamada ... Testing LinuxPPC. + + + diff --git a/Documentation/filesystems/bfs.txt b/Documentation/filesystems/bfs.txt new file mode 100644 index 000000000000..d2841e0bcf02 --- /dev/null +++ b/Documentation/filesystems/bfs.txt @@ -0,0 +1,57 @@ +BFS FILESYSTEM FOR LINUX +======================== + +The BFS filesystem is used by SCO UnixWare OS for the /stand slice, which +usually contains the kernel image and a few other files required for the +boot process. + +In order to access /stand partition under Linux you obviously need to +know the partition number and the kernel must support UnixWare disk slices +(CONFIG_UNIXWARE_DISKLABEL config option). However BFS support does not +depend on having UnixWare disklabel support because one can also mount +BFS filesystem via loopback: + +# losetup /dev/loop0 stand.img +# mount -t bfs /dev/loop0 /mnt/stand + +where stand.img is a file containing the image of BFS filesystem. +When you have finished using it and umounted you need to also deallocate +/dev/loop0 device by: + +# losetup -d /dev/loop0 + +You can simplify mounting by just typing: + +# mount -t bfs -o loop stand.img /mnt/stand + +this will allocate the first available loopback device (and load loop.o +kernel module if necessary) automatically. If the loopback driver is not +loaded automatically, make sure that your kernel is compiled with kmod +support (CONFIG_KMOD) enabled. Beware that umount will not +deallocate /dev/loopN device if /etc/mtab file on your system is a +symbolic link to /proc/mounts. You will need to do it manually using +"-d" switch of losetup(8). Read losetup(8) manpage for more info. + +To create the BFS image under UnixWare you need to find out first which +slice contains it. The command prtvtoc(1M) is your friend: + +# prtvtoc /dev/rdsk/c0b0t0d0s0 + +(assuming your root disk is on target=0, lun=0, bus=0, controller=0). Then you +look for the slice with tag "STAND", which is usually slice 10. With this +information you can use dd(1) to create the BFS image: + +# umount /stand +# dd if=/dev/rdsk/c0b0t0d0sa of=stand.img bs=512 + +Just in case, you can verify that you have done the right thing by checking +the magic number: + +# od -Ad -tx4 stand.img | more + +The first 4 bytes should be 0x1badface. + +If you have any patches, questions or suggestions regarding this BFS +implementation please contact the author: + +Tigran A. Aivazian <tigran@veritas.com> diff --git a/Documentation/filesystems/cifs.txt b/Documentation/filesystems/cifs.txt new file mode 100644 index 000000000000..49cc923a93e3 --- /dev/null +++ b/Documentation/filesystems/cifs.txt @@ -0,0 +1,51 @@ + This is the client VFS module for the Common Internet File System + (CIFS) protocol which is the successor to the Server Message Block + (SMB) protocol, the native file sharing mechanism for most early + PC operating systems. CIFS is fully supported by current network + file servers such as Windows 2000, Windows 2003 (including + Windows XP) as well by Samba (which provides excellent CIFS + server support for Linux and many other operating systems), so + this network filesystem client can mount to a wide variety of + servers. The smbfs module should be used instead of this cifs module + for mounting to older SMB servers such as OS/2. The smbfs and cifs + modules can coexist and do not conflict. The CIFS VFS filesystem + module is designed to work well with servers that implement the + newer versions (dialects) of the SMB/CIFS protocol such as Samba, + the program written by Andrew Tridgell that turns any Unix host + into a SMB/CIFS file server. + + The intent of this module is to provide the most advanced network + file system function for CIFS compliant servers, including better + POSIX compliance, secure per-user session establishment, high + performance safe distributed caching (oplock), optional packet + signing, large files, Unicode support and other internationalization + improvements. Since both Samba server and this filesystem client support + the CIFS Unix extensions, the combination can provide a reasonable + alternative to NFSv4 for fileserving in some Linux to Linux environments, + not just in Linux to Windows environments. + + This filesystem has an optional mount utility (mount.cifs) that can + be obtained from the project page and installed in the path in the same + directory with the other mount helpers (such as mount.smbfs). + Mounting using the cifs filesystem without installing the mount helper + requires specifying the server's ip address. + + For Linux 2.4: + mount //anything/here /mnt_target -o + user=username,pass=password,unc=//ip_address_of_server/sharename + + For Linux 2.5: + mount //ip_address_of_server/sharename /mnt_target -o user=username, pass=password + + + For more information on the module see the project page at + + http://us1.samba.org/samba/Linux_CIFS_client.html + + For more information on CIFS see: + + http://www.snia.org/tech_activities/CIFS + + or the Samba site: + + http://www.samba.org diff --git a/Documentation/filesystems/coda.txt b/Documentation/filesystems/coda.txt new file mode 100644 index 000000000000..61311356025d --- /dev/null +++ b/Documentation/filesystems/coda.txt @@ -0,0 +1,1673 @@ +NOTE: +This is one of the technical documents describing a component of +Coda -- this document describes the client kernel-Venus interface. + +For more information: + http://www.coda.cs.cmu.edu +For user level software needed to run Coda: + ftp://ftp.coda.cs.cmu.edu + +To run Coda you need to get a user level cache manager for the client, +named Venus, as well as tools to manipulate ACLs, to log in, etc. The +client needs to have the Coda filesystem selected in the kernel +configuration. + +The server needs a user level server and at present does not depend on +kernel support. + + + + + + + + The Venus kernel interface + Peter J. Braam + v1.0, Nov 9, 1997 + + This document describes the communication between Venus and kernel + level filesystem code needed for the operation of the Coda file sys- + tem. This document version is meant to describe the current interface + (version 1.0) as well as improvements we envisage. + ______________________________________________________________________ + + Table of Contents + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 1. Introduction + + 2. Servicing Coda filesystem calls + + 3. The message layer + + 3.1 Implementation details + + 4. The interface at the call level + + 4.1 Data structures shared by the kernel and Venus + 4.2 The pioctl interface + 4.3 root + 4.4 lookup + 4.5 getattr + 4.6 setattr + 4.7 access + 4.8 create + 4.9 mkdir + 4.10 link + 4.11 symlink + 4.12 remove + 4.13 rmdir + 4.14 readlink + 4.15 open + 4.16 close + 4.17 ioctl + 4.18 rename + 4.19 readdir + 4.20 vget + 4.21 fsync + 4.22 inactive + 4.23 rdwr + 4.24 odymount + 4.25 ody_lookup + 4.26 ody_expand + 4.27 prefetch + 4.28 signal + + 5. The minicache and downcalls + + 5.1 INVALIDATE + 5.2 FLUSH + 5.3 PURGEUSER + 5.4 ZAPFILE + 5.5 ZAPDIR + 5.6 ZAPVNODE + 5.7 PURGEFID + 5.8 REPLACE + + 6. Initialization and cleanup + + 6.1 Requirements + + + ______________________________________________________________________ + 0wpage + + 11.. IInnttrroodduuccttiioonn + + + + A key component in the Coda Distributed File System is the cache + manager, _V_e_n_u_s. + + + When processes on a Coda enabled system access files in the Coda + filesystem, requests are directed at the filesystem layer in the + operating system. The operating system will communicate with Venus to + service the request for the process. Venus manages a persistent + client cache and makes remote procedure calls to Coda file servers and + related servers (such as authentication servers) to service these + requests it receives from the operating system. When Venus has + serviced a request it replies to the operating system with appropriate + return codes, and other data related to the request. Optionally the + kernel support for Coda may maintain a minicache of recently processed + requests to limit the number of interactions with Venus. Venus + possesses the facility to inform the kernel when elements from its + minicache are no longer valid. + + This document describes precisely this communication between the + kernel and Venus. The definitions of so called upcalls and downcalls + will be given with the format of the data they handle. We shall also + describe the semantic invariants resulting from the calls. + + Historically Coda was implemented in a BSD file system in Mach 2.6. + The interface between the kernel and Venus is very similar to the BSD + VFS interface. Similar functionality is provided, and the format of + the parameters and returned data is very similar to the BSD VFS. This + leads to an almost natural environment for implementing a kernel-level + filesystem driver for Coda in a BSD system. However, other operating + systems such as Linux and Windows 95 and NT have virtual filesystem + with different interfaces. + + To implement Coda on these systems some reverse engineering of the + Venus/Kernel protocol is necessary. Also it came to light that other + systems could profit significantly from certain small optimizations + and modifications to the protocol. To facilitate this work as well as + to make future ports easier, communication between Venus and the + kernel should be documented in great detail. This is the aim of this + document. + + 0wpage + + 22.. SSeerrvviicciinngg CCooddaa ffiilleessyysstteemm ccaallllss + + The service of a request for a Coda file system service originates in + a process PP which accessing a Coda file. It makes a system call which + traps to the OS kernel. Examples of such calls trapping to the kernel + are _r_e_a_d_, _w_r_i_t_e_, _o_p_e_n_, _c_l_o_s_e_, _c_r_e_a_t_e_, _m_k_d_i_r_, _r_m_d_i_r_, _c_h_m_o_d in a Unix + context. Similar calls exist in the Win32 environment, and are named + _C_r_e_a_t_e_F_i_l_e_, . + + Generally the operating system handles the request in a virtual + filesystem (VFS) layer, which is named I/O Manager in NT and IFS + manager in Windows 95. The VFS is responsible for partial processing + of the request and for locating the specific filesystem(s) which will + service parts of the request. Usually the information in the path + assists in locating the correct FS drivers. Sometimes after extensive + pre-processing, the VFS starts invoking exported routines in the FS + driver. This is the point where the FS specific processing of the + request starts, and here the Coda specific kernel code comes into + play. + + The FS layer for Coda must expose and implement several interfaces. + First and foremost the VFS must be able to make all necessary calls to + the Coda FS layer, so the Coda FS driver must expose the VFS interface + as applicable in the operating system. These differ very significantly + among operating systems, but share features such as facilities to + read/write and create and remove objects. The Coda FS layer services + such VFS requests by invoking one or more well defined services + offered by the cache manager Venus. When the replies from Venus have + come back to the FS driver, servicing of the VFS call continues and + finishes with a reply to the kernel's VFS. Finally the VFS layer + returns to the process. + + As a result of this design a basic interface exposed by the FS driver + must allow Venus to manage message traffic. In particular Venus must + be able to retrieve and place messages and to be notified of the + arrival of a new message. The notification must be through a mechanism + which does not block Venus since Venus must attend to other tasks even + when no messages are waiting or being processed. + + + + + + + Interfaces of the Coda FS Driver + + Furthermore the FS layer provides for a special path of communication + between a user process and Venus, called the pioctl interface. The + pioctl interface is used for Coda specific services, such as + requesting detailed information about the persistent cache managed by + Venus. Here the involvement of the kernel is minimal. It identifies + the calling process and passes the information on to Venus. When + Venus replies the response is passed back to the caller in unmodified + form. + + Finally Venus allows the kernel FS driver to cache the results from + certain services. This is done to avoid excessive context switches + and results in an efficient system. However, Venus may acquire + information, for example from the network which implies that cached + information must be flushed or replaced. Venus then makes a downcall + to the Coda FS layer to request flushes or updates in the cache. The + kernel FS driver handles such requests synchronously. + + Among these interfaces the VFS interface and the facility to place, + receive and be notified of messages are platform specific. We will + not go into the calls exported to the VFS layer but we will state the + requirements of the message exchange mechanism. + + 0wpage + + 33.. TThhee mmeessssaaggee llaayyeerr + + + + At the lowest level the communication between Venus and the FS driver + proceeds through messages. The synchronization between processes + requesting Coda file service and Venus relies on blocking and waking + up processes. The Coda FS driver processes VFS- and pioctl-requests + on behalf of a process P, creates messages for Venus, awaits replies + and finally returns to the caller. The implementation of the exchange + of messages is platform specific, but the semantics have (so far) + appeared to be generally applicable. Data buffers are created by the + FS Driver in kernel memory on behalf of P and copied to user memory in + Venus. + + The FS Driver while servicing P makes upcalls to Venus. Such an + upcall is dispatched to Venus by creating a message structure. The + structure contains the identification of P, the message sequence + number, the size of the request and a pointer to the data in kernel + memory for the request. Since the data buffer is re-used to hold the + reply from Venus, there is a field for the size of the reply. A flags + field is used in the message to precisely record the status of the + message. Additional platform dependent structures involve pointers to + determine the position of the message on queues and pointers to + synchronization objects. In the upcall routine the message structure + is filled in, flags are set to 0, and it is placed on the _p_e_n_d_i_n_g + queue. The routine calling upcall is responsible for allocating the + data buffer; its structure will be described in the next section. + + A facility must exist to notify Venus that the message has been + created, and implemented using available synchronization objects in + the OS. This notification is done in the upcall context of the process + P. When the message is on the pending queue, process P cannot proceed + in upcall. The (kernel mode) processing of P in the filesystem + request routine must be suspended until Venus has replied. Therefore + the calling thread in P is blocked in upcall. A pointer in the + message structure will locate the synchronization object on which P is + sleeping. + + Venus detects the notification that a message has arrived, and the FS + driver allow Venus to retrieve the message with a getmsg_from_kernel + call. This action finishes in the kernel by putting the message on the + queue of processing messages and setting flags to READ. Venus is + passed the contents of the data buffer. The getmsg_from_kernel call + now returns and Venus processes the request. + + At some later point the FS driver receives a message from Venus, + namely when Venus calls sendmsg_to_kernel. At this moment the Coda FS + driver looks at the contents of the message and decides if: + + + +o the message is a reply for a suspended thread P. If so it removes + the message from the processing queue and marks the message as + WRITTEN. Finally, the FS driver unblocks P (still in the kernel + mode context of Venus) and the sendmsg_to_kernel call returns to + Venus. The process P will be scheduled at some point and continues + processing its upcall with the data buffer replaced with the reply + from Venus. + + +o The message is a _d_o_w_n_c_a_l_l. A downcall is a request from Venus to + the FS Driver. The FS driver processes the request immediately + (usually a cache eviction or replacement) and when it finishes + sendmsg_to_kernel returns. + + Now P awakes and continues processing upcall. There are some + subtleties to take account of. First P will determine if it was woken + up in upcall by a signal from some other source (for example an + attempt to terminate P) or as is normally the case by Venus in its + sendmsg_to_kernel call. In the normal case, the upcall routine will + deallocate the message structure and return. The FS routine can proceed + with its processing. + + + + + + + + Sleeping and IPC arrangements + + In case P is woken up by a signal and not by Venus, it will first look + at the flags field. If the message is not yet READ, the process P can + handle its signal without notifying Venus. If Venus has READ, and + the request should not be processed, P can send Venus a signal message + to indicate that it should disregard the previous message. Such + signals are put in the queue at the head, and read first by Venus. If + the message is already marked as WRITTEN it is too late to stop the + processing. The VFS routine will now continue. (-- If a VFS request + involves more than one upcall, this can lead to complicated state, an + extra field "handle_signals" could be added in the message structure + to indicate points of no return have been passed.--) + + + + 33..11.. IImmpplleemmeennttaattiioonn ddeettaaiillss + + The Unix implementation of this mechanism has been through the + implementation of a character device associated with Coda. Venus + retrieves messages by doing a read on the device, replies are sent + with a write and notification is through the select system call on the + file descriptor for the device. The process P is kept waiting on an + interruptible wait queue object. + + In Windows NT and the DPMI Windows 95 implementation a DeviceIoControl + call is used. The DeviceIoControl call is designed to copy buffers + from user memory to kernel memory with OPCODES. The sendmsg_to_kernel + is issued as a synchronous call, while the getmsg_from_kernel call is + asynchronous. Windows EventObjects are used for notification of + message arrival. The process P is kept waiting on a KernelEvent + object in NT and a semaphore in Windows 95. + + 0wpage + + 44.. TThhee iinntteerrffaaccee aatt tthhee ccaallll lleevveell + + + This section describes the upcalls a Coda FS driver can make to Venus. + Each of these upcalls make use of two structures: inputArgs and + outputArgs. In pseudo BNF form the structures take the following + form: + + + struct inputArgs { + u_long opcode; + u_long unique; /* Keep multiple outstanding msgs distinct */ + u_short pid; /* Common to all */ + u_short pgid; /* Common to all */ + struct CodaCred cred; /* Common to all */ + + <union "in" of call dependent parts of inputArgs> + }; + + struct outputArgs { + u_long opcode; + u_long unique; /* Keep multiple outstanding msgs distinct */ + u_long result; + + <union "out" of call dependent parts of inputArgs> + }; + + + + Before going on let us elucidate the role of the various fields. The + inputArgs start with the opcode which defines the type of service + requested from Venus. There are approximately 30 upcalls at present + which we will discuss. The unique field labels the inputArg with a + unique number which will identify the message uniquely. A process and + process group id are passed. Finally the credentials of the caller + are included. + + Before delving into the specific calls we need to discuss a variety of + data structures shared by the kernel and Venus. + + + + + 44..11.. DDaattaa ssttrruuccttuurreess sshhaarreedd bbyy tthhee kkeerrnneell aanndd VVeennuuss + + + The CodaCred structure defines a variety of user and group ids as + they are set for the calling process. The vuid_t and guid_t are 32 bit + unsigned integers. It also defines group membership in an array. On + Unix the CodaCred has proven sufficient to implement good security + semantics for Coda but the structure may have to undergo modification + for the Windows environment when these mature. + + struct CodaCred { + vuid_t cr_uid, cr_euid, cr_suid, cr_fsuid; /* Real, effective, set, fs uid*/ + vgid_t cr_gid, cr_egid, cr_sgid, cr_fsgid; /* same for groups */ + vgid_t cr_groups[NGROUPS]; /* Group membership for caller */ + }; + + + + NNOOTTEE It is questionable if we need CodaCreds in Venus. Finally Venus + doesn't know about groups, although it does create files with the + default uid/gid. Perhaps the list of group membership is superfluous. + + + The next item is the fundamental identifier used to identify Coda + files, the ViceFid. A fid of a file uniquely defines a file or + directory in the Coda filesystem within a _c_e_l_l. (-- A _c_e_l_l is a + group of Coda servers acting under the aegis of a single system + control machine or SCM. See the Coda Administration manual for a + detailed description of the role of the SCM.--) + + + typedef struct ViceFid { + VolumeId Volume; + VnodeId Vnode; + Unique_t Unique; + } ViceFid; + + + + Each of the constituent fields: VolumeId, VnodeId and Unique_t are + unsigned 32 bit integers. We envisage that a further field will need + to be prefixed to identify the Coda cell; this will probably take the + form of a Ipv6 size IP address naming the Coda cell through DNS. + + The next important structure shared between Venus and the kernel is + the attributes of the file. The following structure is used to + exchange information. It has room for future extensions such as + support for device files (currently not present in Coda). + + + + + + + + + + + + + + + + + + + struct coda_vattr { + enum coda_vtype va_type; /* vnode type (for create) */ + u_short va_mode; /* files access mode and type */ + short va_nlink; /* number of references to file */ + vuid_t va_uid; /* owner user id */ + vgid_t va_gid; /* owner group id */ + long va_fsid; /* file system id (dev for now) */ + long va_fileid; /* file id */ + u_quad_t va_size; /* file size in bytes */ + long va_blocksize; /* blocksize preferred for i/o */ + struct timespec va_atime; /* time of last access */ + struct timespec va_mtime; /* time of last modification */ + struct timespec va_ctime; /* time file changed */ + u_long va_gen; /* generation number of file */ + u_long va_flags; /* flags defined for file */ + dev_t va_rdev; /* device special file represents */ + u_quad_t va_bytes; /* bytes of disk space held by file */ + u_quad_t va_filerev; /* file modification number */ + u_int va_vaflags; /* operations flags, see below */ + long va_spare; /* remain quad aligned */ + }; + + + + + 44..22.. TThhee ppiiooccttll iinntteerrffaaccee + + + Coda specific requests can be made by application through the pioctl + interface. The pioctl is implemented as an ordinary ioctl on a + fictitious file /coda/.CONTROL. The pioctl call opens this file, gets + a file handle and makes the ioctl call. Finally it closes the file. + + The kernel involvement in this is limited to providing the facility to + open and close and pass the ioctl message _a_n_d to verify that a path in + the pioctl data buffers is a file in a Coda filesystem. + + The kernel is handed a data packet of the form: + + struct { + const char *path; + struct ViceIoctl vidata; + int follow; + } data; + + + + where + + + struct ViceIoctl { + caddr_t in, out; /* Data to be transferred in, or out */ + short in_size; /* Size of input buffer <= 2K */ + short out_size; /* Maximum size of output buffer, <= 2K */ + }; + + + + The path must be a Coda file, otherwise the ioctl upcall will not be + made. + + NNOOTTEE The data structures and code are a mess. We need to clean this + up. + + We now proceed to document the individual calls: + + 0wpage + + 44..33.. rroooott + + + AArrgguummeennttss + + iinn empty + + oouutt + + struct cfs_root_out { + ViceFid VFid; + } cfs_root; + + + + DDeessccrriippttiioonn This call is made to Venus during the initialization of + the Coda filesystem. If the result is zero, the cfs_root structure + contains the ViceFid of the root of the Coda filesystem. If a non-zero + result is generated, its value is a platform dependent error code + indicating the difficulty Venus encountered in locating the root of + the Coda filesystem. + + 0wpage + + 44..44.. llooookkuupp + + + SSuummmmaarryy Find the ViceFid and type of an object in a directory if it + exists. + + AArrgguummeennttss + + iinn + + struct cfs_lookup_in { + ViceFid VFid; + char *name; /* Place holder for data. */ + } cfs_lookup; + + + + oouutt + + struct cfs_lookup_out { + ViceFid VFid; + int vtype; + } cfs_lookup; + + + + DDeessccrriippttiioonn This call is made to determine the ViceFid and filetype of + a directory entry. The directory entry requested carries name name + and Venus will search the directory identified by cfs_lookup_in.VFid. + The result may indicate that the name does not exist, or that + difficulty was encountered in finding it (e.g. due to disconnection). + If the result is zero, the field cfs_lookup_out.VFid contains the + targets ViceFid and cfs_lookup_out.vtype the coda_vtype giving the + type of object the name designates. + + The name of the object is an 8 bit character string of maximum length + CFS_MAXNAMLEN, currently set to 256 (including a 0 terminator.) + + It is extremely important to realize that Venus bitwise ors the field + cfs_lookup.vtype with CFS_NOCACHE to indicate that the object should + not be put in the kernel name cache. + + NNOOTTEE The type of the vtype is currently wrong. It should be + coda_vtype. Linux does not take note of CFS_NOCACHE. It should. + + 0wpage + + 44..55.. ggeettaattttrr + + + SSuummmmaarryy Get the attributes of a file. + + AArrgguummeennttss + + iinn + + struct cfs_getattr_in { + ViceFid VFid; + struct coda_vattr attr; /* XXXXX */ + } cfs_getattr; + + + + oouutt + + struct cfs_getattr_out { + struct coda_vattr attr; + } cfs_getattr; + + + + DDeessccrriippttiioonn This call returns the attributes of the file identified by + fid. + + EErrrroorrss Errors can occur if the object with fid does not exist, is + unaccessible or if the caller does not have permission to fetch + attributes. + + NNoottee Many kernel FS drivers (Linux, NT and Windows 95) need to acquire + the attributes as well as the Fid for the instantiation of an internal + "inode" or "FileHandle". A significant improvement in performance on + such systems could be made by combining the _l_o_o_k_u_p and _g_e_t_a_t_t_r calls + both at the Venus/kernel interaction level and at the RPC level. + + The vattr structure included in the input arguments is superfluous and + should be removed. + + 0wpage + + 44..66.. sseettaattttrr + + + SSuummmmaarryy Set the attributes of a file. + + AArrgguummeennttss + + iinn + + struct cfs_setattr_in { + ViceFid VFid; + struct coda_vattr attr; + } cfs_setattr; + + + + + oouutt + empty + + DDeessccrriippttiioonn The structure attr is filled with attributes to be changed + in BSD style. Attributes not to be changed are set to -1, apart from + vtype which is set to VNON. Other are set to the value to be assigned. + The only attributes which the FS driver may request to change are the + mode, owner, groupid, atime, mtime and ctime. The return value + indicates success or failure. + + EErrrroorrss A variety of errors can occur. The object may not exist, may + be inaccessible, or permission may not be granted by Venus. + + 0wpage + + 44..77.. aacccceessss + + + SSuummmmaarryy + + AArrgguummeennttss + + iinn + + struct cfs_access_in { + ViceFid VFid; + int flags; + } cfs_access; + + + + oouutt + empty + + DDeessccrriippttiioonn Verify if access to the object identified by VFid for + operations described by flags is permitted. The result indicates if + access will be granted. It is important to remember that Coda uses + ACLs to enforce protection and that ultimately the servers, not the + clients enforce the security of the system. The result of this call + will depend on whether a _t_o_k_e_n is held by the user. + + EErrrroorrss The object may not exist, or the ACL describing the protection + may not be accessible. + + 0wpage + + 44..88.. ccrreeaattee + + + SSuummmmaarryy Invoked to create a file + + AArrgguummeennttss + + iinn + + struct cfs_create_in { + ViceFid VFid; + struct coda_vattr attr; + int excl; + int mode; + char *name; /* Place holder for data. */ + } cfs_create; + + + + + oouutt + + struct cfs_create_out { + ViceFid VFid; + struct coda_vattr attr; + } cfs_create; + + + + DDeessccrriippttiioonn This upcall is invoked to request creation of a file. + The file will be created in the directory identified by VFid, its name + will be name, and the mode will be mode. If excl is set an error will + be returned if the file already exists. If the size field in attr is + set to zero the file will be truncated. The uid and gid of the file + are set by converting the CodaCred to a uid using a macro CRTOUID + (this macro is platform dependent). Upon success the VFid and + attributes of the file are returned. The Coda FS Driver will normally + instantiate a vnode, inode or file handle at kernel level for the new + object. + + + EErrrroorrss A variety of errors can occur. Permissions may be insufficient. + If the object exists and is not a file the error EISDIR is returned + under Unix. + + NNOOTTEE The packing of parameters is very inefficient and appears to + indicate confusion between the system call creat and the VFS operation + create. The VFS operation create is only called to create new objects. + This create call differs from the Unix one in that it is not invoked + to return a file descriptor. The truncate and exclusive options, + together with the mode, could simply be part of the mode as it is + under Unix. There should be no flags argument; this is used in open + (2) to return a file descriptor for READ or WRITE mode. + + The attributes of the directory should be returned too, since the size + and mtime changed. + + 0wpage + + 44..99.. mmkkddiirr + + + SSuummmmaarryy Create a new directory. + + AArrgguummeennttss + + iinn + + struct cfs_mkdir_in { + ViceFid VFid; + struct coda_vattr attr; + char *name; /* Place holder for data. */ + } cfs_mkdir; + + + + oouutt + + struct cfs_mkdir_out { + ViceFid VFid; + struct coda_vattr attr; + } cfs_mkdir; + + + + + DDeessccrriippttiioonn This call is similar to create but creates a directory. + Only the mode field in the input parameters is used for creation. + Upon successful creation, the attr returned contains the attributes of + the new directory. + + EErrrroorrss As for create. + + NNOOTTEE The input parameter should be changed to mode instead of + attributes. + + The attributes of the parent should be returned since the size and + mtime changes. + + 0wpage + + 44..1100.. lliinnkk + + + SSuummmmaarryy Create a link to an existing file. + + AArrgguummeennttss + + iinn + + struct cfs_link_in { + ViceFid sourceFid; /* cnode to link *to* */ + ViceFid destFid; /* Directory in which to place link */ + char *tname; /* Place holder for data. */ + } cfs_link; + + + + oouutt + empty + + DDeessccrriippttiioonn This call creates a link to the sourceFid in the directory + identified by destFid with name tname. The source must reside in the + target's parent, i.e. the source must be have parent destFid, i.e. Coda + does not support cross directory hard links. Only the return value is + relevant. It indicates success or the type of failure. + + EErrrroorrss The usual errors can occur.0wpage + + 44..1111.. ssyymmlliinnkk + + + SSuummmmaarryy create a symbolic link + + AArrgguummeennttss + + iinn + + struct cfs_symlink_in { + ViceFid VFid; /* Directory to put symlink in */ + char *srcname; + struct coda_vattr attr; + char *tname; + } cfs_symlink; + + + + oouutt + none + + DDeessccrriippttiioonn Create a symbolic link. The link is to be placed in the + directory identified by VFid and named tname. It should point to the + pathname srcname. The attributes of the newly created object are to + be set to attr. + + EErrrroorrss + + NNOOTTEE The attributes of the target directory should be returned since + its size changed. + + 0wpage + + 44..1122.. rreemmoovvee + + + SSuummmmaarryy Remove a file + + AArrgguummeennttss + + iinn + + struct cfs_remove_in { + ViceFid VFid; + char *name; /* Place holder for data. */ + } cfs_remove; + + + + oouutt + none + + DDeessccrriippttiioonn Remove file named cfs_remove_in.name in directory + identified by VFid. + + EErrrroorrss + + NNOOTTEE The attributes of the directory should be returned since its + mtime and size may change. + + 0wpage + + 44..1133.. rrmmddiirr + + + SSuummmmaarryy Remove a directory + + AArrgguummeennttss + + iinn + + struct cfs_rmdir_in { + ViceFid VFid; + char *name; /* Place holder for data. */ + } cfs_rmdir; + + + + oouutt + none + + DDeessccrriippttiioonn Remove the directory with name name from the directory + identified by VFid. + + EErrrroorrss + + NNOOTTEE The attributes of the parent directory should be returned since + its mtime and size may change. + + 0wpage + + 44..1144.. rreeaaddlliinnkk + + + SSuummmmaarryy Read the value of a symbolic link. + + AArrgguummeennttss + + iinn + + struct cfs_readlink_in { + ViceFid VFid; + } cfs_readlink; + + + + oouutt + + struct cfs_readlink_out { + int count; + caddr_t data; /* Place holder for data. */ + } cfs_readlink; + + + + DDeessccrriippttiioonn This routine reads the contents of symbolic link + identified by VFid into the buffer data. The buffer data must be able + to hold any name up to CFS_MAXNAMLEN (PATH or NAM??). + + EErrrroorrss No unusual errors. + + 0wpage + + 44..1155.. ooppeenn + + + SSuummmmaarryy Open a file. + + AArrgguummeennttss + + iinn + + struct cfs_open_in { + ViceFid VFid; + int flags; + } cfs_open; + + + + oouutt + + struct cfs_open_out { + dev_t dev; + ino_t inode; + } cfs_open; + + + + DDeessccrriippttiioonn This request asks Venus to place the file identified by + VFid in its cache and to note that the calling process wishes to open + it with flags as in open(2). The return value to the kernel differs + for Unix and Windows systems. For Unix systems the Coda FS Driver is + informed of the device and inode number of the container file in the + fields dev and inode. For Windows the path of the container file is + returned to the kernel. + EErrrroorrss + + NNOOTTEE Currently the cfs_open_out structure is not properly adapted to + deal with the Windows case. It might be best to implement two + upcalls, one to open aiming at a container file name, the other at a + container file inode. + + 0wpage + + 44..1166.. cclloossee + + + SSuummmmaarryy Close a file, update it on the servers. + + AArrgguummeennttss + + iinn + + struct cfs_close_in { + ViceFid VFid; + int flags; + } cfs_close; + + + + oouutt + none + + DDeessccrriippttiioonn Close the file identified by VFid. + + EErrrroorrss + + NNOOTTEE The flags argument is bogus and not used. However, Venus' code + has room to deal with an execp input field, probably this field should + be used to inform Venus that the file was closed but is still memory + mapped for execution. There are comments about fetching versus not + fetching the data in Venus vproc_vfscalls. This seems silly. If a + file is being closed, the data in the container file is to be the new + data. Here again the execp flag might be in play to create confusion: + currently Venus might think a file can be flushed from the cache when + it is still memory mapped. This needs to be understood. + + 0wpage + + 44..1177.. iiooccttll + + + SSuummmmaarryy Do an ioctl on a file. This includes the pioctl interface. + + AArrgguummeennttss + + iinn + + struct cfs_ioctl_in { + ViceFid VFid; + int cmd; + int len; + int rwflag; + char *data; /* Place holder for data. */ + } cfs_ioctl; + + + + oouutt + + + struct cfs_ioctl_out { + int len; + caddr_t data; /* Place holder for data. */ + } cfs_ioctl; + + + + DDeessccrriippttiioonn Do an ioctl operation on a file. The command, len and + data arguments are filled as usual. flags is not used by Venus. + + EErrrroorrss + + NNOOTTEE Another bogus parameter. flags is not used. What is the + business about PREFETCHING in the Venus code? + + + 0wpage + + 44..1188.. rreennaammee + + + SSuummmmaarryy Rename a fid. + + AArrgguummeennttss + + iinn + + struct cfs_rename_in { + ViceFid sourceFid; + char *srcname; + ViceFid destFid; + char *destname; + } cfs_rename; + + + + oouutt + none + + DDeessccrriippttiioonn Rename the object with name srcname in directory + sourceFid to destname in destFid. It is important that the names + srcname and destname are 0 terminated strings. Strings in Unix + kernels are not always null terminated. + + EErrrroorrss + + 0wpage + + 44..1199.. rreeaaddddiirr + + + SSuummmmaarryy Read directory entries. + + AArrgguummeennttss + + iinn + + struct cfs_readdir_in { + ViceFid VFid; + int count; + int offset; + } cfs_readdir; + + + + + oouutt + + struct cfs_readdir_out { + int size; + caddr_t data; /* Place holder for data. */ + } cfs_readdir; + + + + DDeessccrriippttiioonn Read directory entries from VFid starting at offset and + read at most count bytes. Returns the data in data and returns + the size in size. + + EErrrroorrss + + NNOOTTEE This call is not used. Readdir operations exploit container + files. We will re-evaluate this during the directory revamp which is + about to take place. + + 0wpage + + 44..2200.. vvggeett + + + SSuummmmaarryy instructs Venus to do an FSDB->Get. + + AArrgguummeennttss + + iinn + + struct cfs_vget_in { + ViceFid VFid; + } cfs_vget; + + + + oouutt + + struct cfs_vget_out { + ViceFid VFid; + int vtype; + } cfs_vget; + + + + DDeessccrriippttiioonn This upcall asks Venus to do a get operation on an fsobj + labelled by VFid. + + EErrrroorrss + + NNOOTTEE This operation is not used. However, it is extremely useful + since it can be used to deal with read/write memory mapped files. + These can be "pinned" in the Venus cache using vget and released with + inactive. + + 0wpage + + 44..2211.. ffssyynncc + + + SSuummmmaarryy Tell Venus to update the RVM attributes of a file. + + AArrgguummeennttss + + iinn + + struct cfs_fsync_in { + ViceFid VFid; + } cfs_fsync; + + + + oouutt + none + + DDeessccrriippttiioonn Ask Venus to update RVM attributes of object VFid. This + should be called as part of kernel level fsync type calls. The + result indicates if the syncing was successful. + + EErrrroorrss + + NNOOTTEE Linux does not implement this call. It should. + + 0wpage + + 44..2222.. iinnaaccttiivvee + + + SSuummmmaarryy Tell Venus a vnode is no longer in use. + + AArrgguummeennttss + + iinn + + struct cfs_inactive_in { + ViceFid VFid; + } cfs_inactive; + + + + oouutt + none + + DDeessccrriippttiioonn This operation returns EOPNOTSUPP. + + EErrrroorrss + + NNOOTTEE This should perhaps be removed. + + 0wpage + + 44..2233.. rrddwwrr + + + SSuummmmaarryy Read or write from a file + + AArrgguummeennttss + + iinn + + struct cfs_rdwr_in { + ViceFid VFid; + int rwflag; + int count; + int offset; + int ioflag; + caddr_t data; /* Place holder for data. */ + } cfs_rdwr; + + + + + oouutt + + struct cfs_rdwr_out { + int rwflag; + int count; + caddr_t data; /* Place holder for data. */ + } cfs_rdwr; + + + + DDeessccrriippttiioonn This upcall asks Venus to read or write from a file. + + EErrrroorrss + + NNOOTTEE It should be removed since it is against the Coda philosophy that + read/write operations never reach Venus. I have been told the + operation does not work. It is not currently used. + + + 0wpage + + 44..2244.. ooddyymmoouunntt + + + SSuummmmaarryy Allows mounting multiple Coda "filesystems" on one Unix mount + point. + + AArrgguummeennttss + + iinn + + struct ody_mount_in { + char *name; /* Place holder for data. */ + } ody_mount; + + + + oouutt + + struct ody_mount_out { + ViceFid VFid; + } ody_mount; + + + + DDeessccrriippttiioonn Asks Venus to return the rootfid of a Coda system named + name. The fid is returned in VFid. + + EErrrroorrss + + NNOOTTEE This call was used by David for dynamic sets. It should be + removed since it causes a jungle of pointers in the VFS mounting area. + It is not used by Coda proper. Call is not implemented by Venus. + + 0wpage + + 44..2255.. ooddyy__llooookkuupp + + + SSuummmmaarryy Looks up something. + + AArrgguummeennttss + + iinn irrelevant + + + oouutt + irrelevant + + DDeessccrriippttiioonn + + EErrrroorrss + + NNOOTTEE Gut it. Call is not implemented by Venus. + + 0wpage + + 44..2266.. ooddyy__eexxppaanndd + + + SSuummmmaarryy expands something in a dynamic set. + + AArrgguummeennttss + + iinn irrelevant + + oouutt + irrelevant + + DDeessccrriippttiioonn + + EErrrroorrss + + NNOOTTEE Gut it. Call is not implemented by Venus. + + 0wpage + + 44..2277.. pprreeffeettcchh + + + SSuummmmaarryy Prefetch a dynamic set. + + AArrgguummeennttss + + iinn Not documented. + + oouutt + Not documented. + + DDeessccrriippttiioonn Venus worker.cc has support for this call, although it is + noted that it doesn't work. Not surprising, since the kernel does not + have support for it. (ODY_PREFETCH is not a defined operation). + + EErrrroorrss + + NNOOTTEE Gut it. It isn't working and isn't used by Coda. + + + 0wpage + + 44..2288.. ssiiggnnaall + + + SSuummmmaarryy Send Venus a signal about an upcall. + + AArrgguummeennttss + + iinn none + + oouutt + not applicable. + + DDeessccrriippttiioonn This is an out-of-band upcall to Venus to inform Venus + that the calling process received a signal after Venus read the + message from the input queue. Venus is supposed to clean up the + operation. + + EErrrroorrss No reply is given. + + NNOOTTEE We need to better understand what Venus needs to clean up and if + it is doing this correctly. Also we need to handle multiple upcall + per system call situations correctly. It would be important to know + what state changes in Venus take place after an upcall for which the + kernel is responsible for notifying Venus to clean up (e.g. open + definitely is such a state change, but many others are maybe not). + + 0wpage + + 55.. TThhee mmiinniiccaacchhee aanndd ddoowwnnccaallllss + + + The Coda FS Driver can cache results of lookup and access upcalls, to + limit the frequency of upcalls. Upcalls carry a price since a process + context switch needs to take place. The counterpart of caching the + information is that Venus will notify the FS Driver that cached + entries must be flushed or renamed. + + The kernel code generally has to maintain a structure which links the + internal file handles (called vnodes in BSD, inodes in Linux and + FileHandles in Windows) with the ViceFid's which Venus maintains. The + reason is that frequent translations back and forth are needed in + order to make upcalls and use the results of upcalls. Such linking + objects are called ccnnooddeess. + + The current minicache implementations have cache entries which record + the following: + + 1. the name of the file + + 2. the cnode of the directory containing the object + + 3. a list of CodaCred's for which the lookup is permitted. + + 4. the cnode of the object + + The lookup call in the Coda FS Driver may request the cnode of the + desired object from the cache, by passing its name, directory and the + CodaCred's of the caller. The cache will return the cnode or indicate + that it cannot be found. The Coda FS Driver must be careful to + invalidate cache entries when it modifies or removes objects. + + When Venus obtains information that indicates that cache entries are + no longer valid, it will make a downcall to the kernel. Downcalls are + intercepted by the Coda FS Driver and lead to cache invalidations of + the kind described below. The Coda FS Driver does not return an error + unless the downcall data could not be read into kernel memory. + + + 55..11.. IINNVVAALLIIDDAATTEE + + + No information is available on this call. + + + 55..22.. FFLLUUSSHH + + + + AArrgguummeennttss None + + SSuummmmaarryy Flush the name cache entirely. + + DDeessccrriippttiioonn Venus issues this call upon startup and when it dies. This + is to prevent stale cache information being held. Some operating + systems allow the kernel name cache to be switched off dynamically. + When this is done, this downcall is made. + + + 55..33.. PPUURRGGEEUUSSEERR + + + AArrgguummeennttss + + struct cfs_purgeuser_out {/* CFS_PURGEUSER is a venus->kernel call */ + struct CodaCred cred; + } cfs_purgeuser; + + + + DDeessccrriippttiioonn Remove all entries in the cache carrying the Cred. This + call is issued when tokens for a user expire or are flushed. + + + 55..44.. ZZAAPPFFIILLEE + + + AArrgguummeennttss + + struct cfs_zapfile_out { /* CFS_ZAPFILE is a venus->kernel call */ + ViceFid CodaFid; + } cfs_zapfile; + + + + DDeessccrriippttiioonn Remove all entries which have the (dir vnode, name) pair. + This is issued as a result of an invalidation of cached attributes of + a vnode. + + NNOOTTEE Call is not named correctly in NetBSD and Mach. The minicache + zapfile routine takes different arguments. Linux does not implement + the invalidation of attributes correctly. + + + + 55..55.. ZZAAPPDDIIRR + + + AArrgguummeennttss + + struct cfs_zapdir_out { /* CFS_ZAPDIR is a venus->kernel call */ + ViceFid CodaFid; + } cfs_zapdir; + + + + DDeessccrriippttiioonn Remove all entries in the cache lying in a directory + CodaFid, and all children of this directory. This call is issued when + Venus receives a callback on the directory. + + + 55..66.. ZZAAPPVVNNOODDEE + + + + AArrgguummeennttss + + struct cfs_zapvnode_out { /* CFS_ZAPVNODE is a venus->kernel call */ + struct CodaCred cred; + ViceFid VFid; + } cfs_zapvnode; + + + + DDeessccrriippttiioonn Remove all entries in the cache carrying the cred and VFid + as in the arguments. This downcall is probably never issued. + + + 55..77.. PPUURRGGEEFFIIDD + + + SSuummmmaarryy + + AArrgguummeennttss + + struct cfs_purgefid_out { /* CFS_PURGEFID is a venus->kernel call */ + ViceFid CodaFid; + } cfs_purgefid; + + + + DDeessccrriippttiioonn Flush the attribute for the file. If it is a dir (odd + vnode), purge its children from the namecache and remove the file from the + namecache. + + + + 55..88.. RREEPPLLAACCEE + + + SSuummmmaarryy Replace the Fid's for a collection of names. + + AArrgguummeennttss + + struct cfs_replace_out { /* cfs_replace is a venus->kernel call */ + ViceFid NewFid; + ViceFid OldFid; + } cfs_replace; + + + + DDeessccrriippttiioonn This routine replaces a ViceFid in the name cache with + another. It is added to allow Venus during reintegration to replace + locally allocated temp fids while disconnected with global fids even + when the reference counts on those fids are not zero. + + 0wpage + + 66.. IInniittiiaalliizzaattiioonn aanndd cclleeaannuupp + + + This section gives brief hints as to desirable features for the Coda + FS Driver at startup and upon shutdown or Venus failures. Before + entering the discussion it is useful to repeat that the Coda FS Driver + maintains the following data: + + + 1. message queues + + 2. cnodes + + 3. name cache entries + + The name cache entries are entirely private to the driver, so they + can easily be manipulated. The message queues will generally have + clear points of initialization and destruction. The cnodes are + much more delicate. User processes hold reference counts in Coda + filesystems and it can be difficult to clean up the cnodes. + + It can expect requests through: + + 1. the message subsystem + + 2. the VFS layer + + 3. pioctl interface + + Currently the _p_i_o_c_t_l passes through the VFS for Coda so we can + treat these similarly. + + + 66..11.. RReeqquuiirreemmeennttss + + + The following requirements should be accommodated: + + 1. The message queues should have open and close routines. On Unix + the opening of the character devices are such routines. + + +o Before opening, no messages can be placed. + + +o Opening will remove any old messages still pending. + + +o Close will notify any sleeping processes that their upcall cannot + be completed. + + +o Close will free all memory allocated by the message queues. + + + 2. At open the namecache shall be initialized to empty state. + + 3. Before the message queues are open, all VFS operations will fail. + Fortunately this can be achieved by making sure than mounting the + Coda filesystem cannot succeed before opening. + + 4. After closing of the queues, no VFS operations can succeed. Here + one needs to be careful, since a few operations (lookup, + read/write, readdir) can proceed without upcalls. These must be + explicitly blocked. + + 5. Upon closing the namecache shall be flushed and disabled. + + 6. All memory held by cnodes can be freed without relying on upcalls. + + 7. Unmounting the file system can be done without relying on upcalls. + + 8. Mounting the Coda filesystem should fail gracefully if Venus cannot + get the rootfid or the attributes of the rootfid. The latter is + best implemented by Venus fetching these objects before attempting + to mount. + + NNOOTTEE NetBSD in particular but also Linux have not implemented the + above requirements fully. For smooth operation this needs to be + corrected. + + + diff --git a/Documentation/filesystems/cramfs.txt b/Documentation/filesystems/cramfs.txt new file mode 100644 index 000000000000..31f53f0ab957 --- /dev/null +++ b/Documentation/filesystems/cramfs.txt @@ -0,0 +1,76 @@ + + Cramfs - cram a filesystem onto a small ROM + +cramfs is designed to be simple and small, and to compress things well. + +It uses the zlib routines to compress a file one page at a time, and +allows random page access. The meta-data is not compressed, but is +expressed in a very terse representation to make it use much less +diskspace than traditional filesystems. + +You can't write to a cramfs filesystem (making it compressible and +compact also makes it _very_ hard to update on-the-fly), so you have to +create the disk image with the "mkcramfs" utility. + + +Usage Notes +----------- + +File sizes are limited to less than 16MB. + +Maximum filesystem size is a little over 256MB. (The last file on the +filesystem is allowed to extend past 256MB.) + +Only the low 8 bits of gid are stored. The current version of +mkcramfs simply truncates to 8 bits, which is a potential security +issue. + +Hard links are supported, but hard linked files +will still have a link count of 1 in the cramfs image. + +Cramfs directories have no `.' or `..' entries. Directories (like +every other file on cramfs) always have a link count of 1. (There's +no need to use -noleaf in `find', btw.) + +No timestamps are stored in a cramfs, so these default to the epoch +(1970 GMT). Recently-accessed files may have updated timestamps, but +the update lasts only as long as the inode is cached in memory, after +which the timestamp reverts to 1970, i.e. moves backwards in time. + +Currently, cramfs must be written and read with architectures of the +same endianness, and can be read only by kernels with PAGE_CACHE_SIZE +== 4096. At least the latter of these is a bug, but it hasn't been +decided what the best fix is. For the moment if you have larger pages +you can just change the #define in mkcramfs.c, so long as you don't +mind the filesystem becoming unreadable to future kernels. + + +For /usr/share/magic +-------------------- + +0 ulelong 0x28cd3d45 Linux cramfs offset 0 +>4 ulelong x size %d +>8 ulelong x flags 0x%x +>12 ulelong x future 0x%x +>16 string >\0 signature "%.16s" +>32 ulelong x fsid.crc 0x%x +>36 ulelong x fsid.edition %d +>40 ulelong x fsid.blocks %d +>44 ulelong x fsid.files %d +>48 string >\0 name "%.16s" +512 ulelong 0x28cd3d45 Linux cramfs offset 512 +>516 ulelong x size %d +>520 ulelong x flags 0x%x +>524 ulelong x future 0x%x +>528 string >\0 signature "%.16s" +>544 ulelong x fsid.crc 0x%x +>548 ulelong x fsid.edition %d +>552 ulelong x fsid.blocks %d +>556 ulelong x fsid.files %d +>560 string >\0 name "%.16s" + + +Hacker Notes +------------ + +See fs/cramfs/README for filesystem layout and implementation notes. diff --git a/Documentation/filesystems/devfs/ChangeLog b/Documentation/filesystems/devfs/ChangeLog new file mode 100644 index 000000000000..e5aba5246d7c --- /dev/null +++ b/Documentation/filesystems/devfs/ChangeLog @@ -0,0 +1,1977 @@ +/* -*- auto-fill -*- */ +=============================================================================== +Changes for patch v1 + +- creation of devfs + +- modified miscellaneous character devices to support devfs +=============================================================================== +Changes for patch v2 + +- bug fix with manual inode creation +=============================================================================== +Changes for patch v3 + +- bugfixes + +- documentation improvements + +- created a couple of scripts (one to save&restore a devfs and the + other to set up compatibility symlinks) + +- devfs support for SCSI discs. New name format is: sd_hHcCiIlL +=============================================================================== +Changes for patch v4 + +- bugfix for the directory reading code + +- bugfix for compilation with kerneld + +- devfs support for generic hard discs + +- rationalisation of the various watchdog drivers +=============================================================================== +Changes for patch v5 + +- support for mounting directly from entries in the devfs (it doesn't + need to be mounted to do this), including the root filesystem. + Mounting of swap partitions also works. Hence, now if you set + CONFIG_DEVFS_ONLY to 'Y' then you won't be able to access your discs + via ordinary device nodes. Naturally, the default is 'N' so that you + can still use your old device nodes. If you want to mount from devfs + entries, make sure you use: append = "root=/dev/sd_..." in your + lilo.conf. It seems LILO looks for the device number (major&minor) + and writes that into the kernel image :-( + +- support for character memory devices (/dev/null, /dev/zero, /dev/full + and so on). Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> +=============================================================================== +Changes for patch v6 + +- support for subdirectories + +- support for symbolic links (created by devfs_mk_symlink(), no + support yet for creation via symlink(2)) + +- SCSI disc naming now cast in stone, with the format: + /dev/sd/c0b1t2u3 controller=0, bus=1, ID=2, LUN=3, whole disc + /dev/sd/c0b1t2u3p4 controller=0, bus=1, ID=2, LUN=3, 4th partition + +- loop devices now appear in devfs + +- tty devices, console, serial ports, etc. now appear in devfs + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- bugs with mounting devfs-only devices now fixed +=============================================================================== +Changes for patch v7 + +- SCSI CD-ROMS, tapes and generic devices now appear in devfs +=============================================================================== +Changes for patch v8 + +- bugfix with no-rewind SCSI tapes + +- RAMDISCs now appear in devfs + +- better cleaning up of devfs entries created by various modules + +- interface change to <devfs_register> +=============================================================================== +Changes for patch v9 + +- the v8 patch was corrupted somehow, which would affect the patch for + linux/fs/filesystems.c + I've also fixed the v8 patch file on the WWW + +- MetaDevices (/dev/md*) should now appear in devfs +=============================================================================== +Changes for patch v10 + +- bugfix in meta device support for devfs + +- created this ChangeLog file + +- added devfs support to the floppy driver + +- added support for creating sockets in a devfs +=============================================================================== +Changes for patch v11 + +- added DEVFS_FL_HIDE_UNREG flag + +- incorporated better patch for ttyname() in libc 5.4.43 from H.J. Lu. + +- interface change to <devfs_mk_symlink> + +- support for creating symlinks with symlink(2) + +- parallel port printer (/dev/lp*) now appears in devfs +=============================================================================== +Changes for patch v12 + +- added inode check to <devfs_fill_file> function + +- improved devfs support when mounting from devfs + +- added call to <<release>> operation when removing swap areas on + devfs devices + +- increased NR_SUPER to 128 to support large numbers of devfs mounts + (for chroot(2) gaols) + +- fixed bug in SCSI disc support: was generating incorrect minors if + SCSI ID's did not start at 0 and increase by 1 + +- support symlink traversal when mounting root +=============================================================================== +Changes for patch v13 + +- added devfs support to soundcard driver + Thanks to Eric Dumas <dumas@linux.eu.org> and + C. Scott Ananian <cananian@alumni.princeton.edu> + +- added devfs support to the joystick driver + +- loop driver now has it's own subdirectory "/dev/loop/" + +- created <devfs_get_flags> and <devfs_set_flags> functions + +- fix problem with SCSI disc compatibility names (sd{a,b,c,d,e,f}) + which assumes ID's start at 0 and increase by 1. Also only create + devfs entries for SCSI disc partitions which actually exist + Show new names in partition check + Thanks to Jakub Jelinek <jj@sunsite.ms.mff.cuni.cz> +=============================================================================== +Changes for patch v14 + +- bug fix in floppy driver: would not compile without + CONFIG_DEVFS_FS='Y' + Thanks to Jurgen Botz <jbotz@nova.botz.org> + +- bug fix in loop driver + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- do not create devfs entries for printers not configured + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- do not create devfs entries for serial ports not present + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- ensure <tty_register_devfs> is exported from tty_io.c + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- allow unregistering of devfs symlink entries + +- fixed bug in SCSI disc naming introduced in last patch version +=============================================================================== +Changes for patch v15 + +- ported to kernel 2.1.81 +=============================================================================== +Changes for patch v16 + +- created <devfs_set_symlink_destination> function + +- moved DEVFS_SUPER_MAGIC into header file + +- added DEVFS_FL_HIDE flag + +- created <devfs_get_maj_min> + +- created <devfs_get_handle_from_inode> + +- fixed bugs in searching by major&minor + +- changed interface to <devfs_unregister>, <devfs_fill_file> and + <devfs_find_handle> + +- fixed inode times when symlink created with symlink(2) + +- change tty driver to do auto-creation of devfs entries + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- fixed bug in genhd.c: whole disc (non-SCSI) was not registered to + devfs + +- updated libc 5.4.43 patch for ttyname() +=============================================================================== +Changes for patch v17 + +- added CONFIG_DEVFS_TTY_COMPAT + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- bugfix in devfs support for drivers/char/lp.c + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- clean up serial driver so that PCMCIA devices unregister correctly + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- fixed bug in genhd.c: whole disc (non-SCSI) was not registered to + devfs [was missing in patch v16] + +- updated libc 5.4.43 patch for ttyname() [was missing in patch v16] + +- all SCSI devices now registered in /dev/sg + +- support removal of devfs entries via unlink(2) +=============================================================================== +Changes for patch v18 + +- added floppy/?u720 floppy entry + +- fixed kerneld support for entries in devfs subdirectories + +- incorporated latest patch for ttyname() in libc 5.4.43 from H.J. Lu. +=============================================================================== +Changes for patch v19 + +- bug fix when looking up unregistered entries: kerneld was not called + +- fixes for kernel 2.1.86 (now requires 2.1.86) +=============================================================================== +Changes for patch v20 + +- only create available floppy entries + Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl> + +- new IDE naming scheme following SCSI format (i.e. /dev/id/c0b0t0u0p1 + instead of /dev/hda1) + Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl> + +- new XT disc naming scheme following SCSI format (i.e. /dev/xd/c0t0p1 + instead of /dev/xda1) + Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl> + +- new non-standard CD-ROM names (i.e. /dev/sbp/c#t#) + Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl> + +- allow symlink traversal when mounting the root filesystem + +- Create entries for MD devices at MD init + Thanks to Christophe Leroy <christophe.leroy5@capway.com> +=============================================================================== +Changes for patch v21 + +- ported to kernel 2.1.91 +=============================================================================== +Changes for patch v22 + +- SCSI host number patch ("scsihosts=" kernel option) + Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl> +=============================================================================== +Changes for patch v23 + +- Fixed persistence bug with device numbers for manually created + device files + +- Fixed problem with recreating symlinks with different content + +- Added CONFIG_DEVFS_MOUNT (mount devfs on /dev at boot time) +=============================================================================== +Changes for patch v24 + +- Switched from CONFIG_KERNELD to CONFIG_KMOD: module autoloading + should now work again + +- Hide entries which are manually unlinked + +- Always invalidate devfs dentry cache when registering entries + +- Support removal of devfs directories via rmdir(2) + +- Ensure directories created by <devfs_mk_dir> are visible + +- Default no access for "other" for floppy device +=============================================================================== +Changes for patch v25 + +- Updates to CREDITS file and minor IDE numbering change + Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl> + +- Invalidate devfs dentry cache when making directories + +- Invalidate devfs dentry cache when removing entries + +- More informative message if root FS mount fails when devfs + configured + +- Fixed persistence bug with fifos +=============================================================================== +Changes for patch v26 + +- ported to kernel 2.1.97 + +- Changed serial directory from "/dev/serial" to "/dev/tts" and + "/dev/consoles" to "/dev/vc" to be more friendly to new procps +=============================================================================== +Changes for patch v27 + +- Added support for IDE4 and IDE5 + Thanks to Andrzej Krzysztofowicz <ankry@green.mif.pg.gda.pl> + +- Documented "scsihosts=" boot parameter + +- Print process command when debugging kerneld/kmod + +- Added debugging for register/unregister/change operations + +- Added "devfs=" boot options + +- Hide unregistered entries by default +=============================================================================== +Changes for patch v28 + +- No longer lock/unlock superblock in <devfs_put_super> (cope with + recent VFS interface change) + +- Do not automatically change ownership/protection of /dev/tty + +- Drop negative dentries when they are released + +- Manage dcache more efficiently +=============================================================================== +Changes for patch v29 + +- Added DEVFS_FL_AUTO_DEVNUM flag +=============================================================================== +Changes for patch v30 + +- No longer set unnecessary methods + +- Ported to kernel 2.1.99-pre3 +=============================================================================== +Changes for patch v31 + +- Added PID display to <call_kerneld> debugging message + +- Added "diread" and "diwrite" options + +- Ported to kernel 2.1.102 + +- Fixed persistence problem with permissions +=============================================================================== +Changes for patch v32 + +- Fixed devfs support in drivers/block/md.c +=============================================================================== +Changes for patch v33 + +- Support legacy device nodes + +- Fixed bug where recreated inodes were hidden + +- New IDE naming scheme: everything is under /dev/ide +=============================================================================== +Changes for patch v34 + +- Improved debugging in <get_vfs_inode> + +- Prevent duplicate calls to <devfs_mk_dir> in SCSI layer + +- No longer free old dentries in <devfs_mk_dir> + +- Free all dentries for a given entry when deleting inodes +=============================================================================== +Changes for patch v35 + +- Ported to kernel 2.1.105 (sound driver changes) +=============================================================================== +Changes for patch v36 + +- Fixed sound driver port +=============================================================================== +Changes for patch v37 + +- Minor documentation tweaks +=============================================================================== +Changes for patch v38 + +- More documentation tweaks + +- Fix for sound driver port + +- Removed ttyname-patch (grab libc 5.4.44 instead) + +- Ported to kernel 2.1.107-pre2 (loop driver fix) +=============================================================================== +Changes for patch v39 + +- Ported to kernel 2.1.107 (hd.c hunk broke due to spelling "fixes"). Sigh + +- Removed many #ifdef's, replaced with trickery in include/devfs_fs.h +=============================================================================== +Changes for patch v40 + +- Fix for sound driver port + +- Limit auto-device numbering to majors 128 to 239 +=============================================================================== +Changes for patch v41 + +- Fixed inode times persistence problem +=============================================================================== +Changes for patch v42 + +- Ported to kernel 2.1.108 (drivers/scsi/hosts.c hunk broke) +=============================================================================== +Changes for patch v43 + +- Fixed spelling in <devfs_readlink> debug + +- Fixed bug in <devfs_setup> parsing "dilookup" + +- More #ifdef's removed + +- Supported Sparc keyboard (/dev/kbd) + +- Supported DSP56001 digital signal processor (/dev/dsp56k) + +- Supported Apple Desktop Bus (/dev/adb) + +- Supported Coda network file system (/dev/cfs*) +=============================================================================== +Changes for patch v44 + +- Fixed devfs inode leak when manually recreating inodes + +- Fixed permission persistence problem when recreating inodes +=============================================================================== +Changes for patch v45 + +- Ported to kernel 2.1.110 +=============================================================================== +Changes for patch v46 + +- Ported to kernel 2.1.112-pre1 + +- Removed harmless "unused variable" compiler warning + +- Fixed modes for manually recreated device nodes +=============================================================================== +Changes for patch v47 + +- Added NULL devfs inode warning in <devfs_read_inode> + +- Force all inode nlink values to 1 +=============================================================================== +Changes for patch v48 + +- Added "dimknod" option + +- Set inode nlink to 0 when freeing dentries + +- Added support for virtual console capture devices (/dev/vcs*) + Thanks to Dennis Hou <smilax@mindmeld.yi.org> + +- Fixed modes for manually recreated symlinks +=============================================================================== +Changes for patch v49 + +- Ported to kernel 2.1.113 +=============================================================================== +Changes for patch v50 + +- Fixed bugs in recreated directories and symlinks +=============================================================================== +Changes for patch v51 + +- Improved robustness of rc.devfs script + Thanks to Roderich Schupp <rsch@experteam.de> + +- Fixed bugs in recreated device nodes + +- Fixed bug in currently unused <devfs_get_handle_from_inode> + +- Defined new <devfs_handle_t> type + +- Improved debugging when getting entries + +- Fixed bug where directories could be emptied + +- Ported to kernel 2.1.115 +=============================================================================== +Changes for patch v52 + +- Replaced dummy .epoch inode with .devfsd character device + +- Modified rc.devfs to take account of above change + +- Removed spurious driver warning messages when CONFIG_DEVFS_FS=n + +- Implemented devfsd protocol revision 0 +=============================================================================== +Changes for patch v53 + +- Ported to kernel 2.1.116 (kmod change broke hunk) + +- Updated Documentation/Configure.help + +- Test and tty pattern patch for rc.devfs script + Thanks to Roderich Schupp <rsch@experteam.de> + +- Added soothing message to warning in <devfs_d_iput> +=============================================================================== +Changes for patch v54 + +- Ported to kernel 2.1.117 + +- Fixed default permissions in sound driver + +- Added support for frame buffer devices (/dev/fb*) +=============================================================================== +Changes for patch v55 + +- Ported to kernel 2.1.119 + +- Use GCC extensions for structure initialisations + +- Implemented async open notification + +- Incremented devfsd protocol revision to 1 +=============================================================================== +Changes for patch v56 + +- Ported to kernel 2.1.120-pre3 + +- Moved async open notification to end of <devfs_open> +=============================================================================== +Changes for patch v57 + +- Ported to kernel 2.1.121 + +- Prepended "/dev/" to module load request + +- Renamed <call_kerneld> to <call_kmod> + +- Created sample modules.conf file +=============================================================================== +Changes for patch v58 + +- Fixed typo "AYSNC" -> "ASYNC" +=============================================================================== +Changes for patch v59 + +- Added open flag for files +=============================================================================== +Changes for patch v60 + +- Ported to kernel 2.1.123-pre2 +=============================================================================== +Changes for patch v61 + +- Set i_blocks=0 and i_blksize=1024 in <devfs_read_inode> +=============================================================================== +Changes for patch v62 + +- Ported to kernel 2.1.123 +=============================================================================== +Changes for patch v63 + +- Ported to kernel 2.1.124-pre2 +=============================================================================== +Changes for patch v64 + +- Fixed Unix98 pty support + +- Increased buffer size in <get_partition_list> to avoid crash and + burn +=============================================================================== +Changes for patch v65 + +- More Unix98 pty support fixes + +- Added test for empty <<name>> in <devfs_find_handle> + +- Renamed <generate_path> to <devfs_generate_path> and published + +- Created /dev/root symlink + Thanks to Roderich Schupp <rsch@ExperTeam.de> + with further modifications by me +=============================================================================== +Changes for patch v66 + +- Yet more Unix98 pty support fixes (now tested) + +- Created <devfs_get_fops> + +- Support media change checks when CONFIG_DEVFS_ONLY=y + +- Abolished Unix98-style PTY names for old PTY devices +=============================================================================== +Changes for patch v67 + +- Added inline declaration for dummy <devfs_generate_path> + +- Removed spurious "unable to register... in devfs" messages when + CONFIG_DEVFS_FS=n + +- Fixed misc. devices when CONFIG_DEVFS_FS=n + +- Limit auto-device numbering to majors 144 to 239 +=============================================================================== +Changes for patch v68 + +- Hide unopened virtual consoles from directory listings + +- Added support for video capture devices + +- Ported to kernel 2.1.125 +=============================================================================== +Changes for patch v69 + +- Fix for CONFIG_VT=n +=============================================================================== +Changes for patch v70 + +- Added support for non-OSS/Free sound cards +=============================================================================== +Changes for patch v71 + +- Ported to kernel 2.1.126-pre2 +=============================================================================== +Changes for patch v72 + +- #ifdef's for CONFIG_DEVFS_DISABLE_OLD_NAMES removed +=============================================================================== +Changes for patch v73 + +- CONFIG_DEVFS_DISABLE_OLD_NAMES replaced with "nocompat" boot option + +- CONFIG_DEVFS_BOOT_OPTIONS removed: boot options always available +=============================================================================== +Changes for patch v74 + +- Removed CONFIG_DEVFS_MOUNT and "mount" boot option and replaced with + "nomount" boot option + +- Documentation updates + +- Updated sample modules.conf +=============================================================================== +Changes for patch v75 + +- Updated sample modules.conf + +- Remount devfs after initrd finishes + +- Ported to kernel 2.1.127 + +- Added support for ISDN + Thanks to Christophe Leroy <christophe.leroy5@capway.com> +=============================================================================== +Changes for patch v76 + +- Updated an email address in ChangeLog + +- CONFIG_DEVFS_ONLY replaced with "only" boot option +=============================================================================== +Changes for patch v77 + +- Added DEVFS_FL_REMOVABLE flag + +- Check for disc change when listing directories with removable media + devices + +- Use DEVFS_FL_REMOVABLE in sd.c + +- Ported to kernel 2.1.128 +=============================================================================== +Changes for patch v78 + +- Only call <scan_dir_for_removable> on first call to <devfs_readdir> + +- Ported to kernel 2.1.129-pre5 + +- ISDN support improvements + Thanks to Christophe Leroy <christophe.leroy5@capway.com> +=============================================================================== +Changes for patch v79 + +- Ported to kernel 2.1.130 + +- Renamed miscdevice "apm" to "apm_bios" to be consistent with + devices.txt +=============================================================================== +Changes for patch v80 + +- Ported to kernel 2.1.131 + +- Updated <devfs_rmdir> for VFS change in 2.1.131 +=============================================================================== +Changes for patch v81 + +- Fixed permissions on /dev/ptmx +=============================================================================== +Changes for patch v82 + +- Ported to kernel 2.1.132-pre4 + +- Changed initial permissions on /dev/pts/* + +- Created <devfs_mk_compat> + +- Added "symlinks" boot option + +- Changed devfs_register_blkdev() back to register_blkdev() for IDE + +- Check for partitions on removable media in <devfs_lookup> +=============================================================================== +Changes for patch v83 + +- Fixed support for ramdisc when using string-based root FS name + +- Ported to kernel 2.2.0-pre1 +=============================================================================== +Changes for patch v84 + +- Ported to kernel 2.2.0-pre7 +=============================================================================== +Changes for patch v85 + +- Compile fixes for driver/sound/sound_common.c (non-module) and + drivers/isdn/isdn_common.c + Thanks to Christophe Leroy <christophe.leroy5@capway.com> + +- Added support for registering regular files + +- Created <devfs_set_file_size> + +- Added /dev/cpu/mtrr as an alternative interface to /proc/mtrr + +- Update devfs inodes from entries if not changed through FS +=============================================================================== +Changes for patch v86 + +- Ported to kernel 2.2.0-pre9 +=============================================================================== +Changes for patch v87 + +- Fixed bug when mounting non-devfs devices in a devfs +=============================================================================== +Changes for patch v88 + +- Fixed <devfs_fill_file> to only initialise temporary inodes + +- Trap for NULL fops in <devfs_register> + +- Return -ENODEV in <devfs_fill_file> for non-driver inodes + +- Fixed bug when unswapping non-devfs devices in a devfs +=============================================================================== +Changes for patch v89 + +- Switched to C data types in include/linux/devfs_fs.h + +- Switched from PATH_MAX to DEVFS_PATHLEN + +- Updated Documentation/filesystems/devfs/modules.conf to take account + of reverse scanning (!) by modprobe + +- Ported to kernel 2.2.0 +=============================================================================== +Changes for patch v90 + +- CONFIG_DEVFS_DISABLE_OLD_TTY_NAMES replaced with "nottycompat" boot + option + +- CONFIG_DEVFS_TTY_COMPAT removed: existing "symlinks" boot option now + controls this. This means you must have libc 5.4.44 or later, or a + recent version of libc 6 if you use the "symlinks" option +=============================================================================== +Changes for patch v91 + +- Switch from <devfs_mk_symlink> to <devfs_mk_compat> in + drivers/char/vc_screen.c to fix problems with Midnight Commander +=============================================================================== +Changes for patch v92 + +- Ported to kernel 2.2.2-pre5 +=============================================================================== +Changes for patch v93 + +- Modified <sd_name> in drivers/scsi/sd.c to cope with devices that + don't exist (which happens with new RAID autostart code printk()s) +=============================================================================== +Changes for patch v94 + +- Fixed bug in joystick driver: only first joystick was registered +=============================================================================== +Changes for patch v95 + +- Fixed another bug in joystick driver + +- Fixed <devfsd_read> to not overrun event buffer +=============================================================================== +Changes for patch v96 + +- Ported to kernel 2.2.5-2 + +- Created <devfs_auto_unregister> + +- Fixed bugs: compatibility entries were not unregistered for: + loop driver + floppy driver + RAMDISC driver + IDE tape driver + SCSI CD-ROM driver + SCSI HDD driver +=============================================================================== +Changes for patch v97 + +- Fixed bugs: compatibility entries were not unregistered for: + ALSA sound driver + partitions in generic disc driver + +- Don't return unregistred entries in <devfs_find_handle> + +- Panic in <devfs_unregister> if entry unregistered + +- Don't panic in <devfs_auto_unregister> for duplicates +=============================================================================== +Changes for patch v98 + +- Don't unregister already unregistered entries in <unregister> + +- Register entry in <sd_detect> + +- Unregister entry in <sd_detach> + +- Changed to <devfs_*register_chrdev> in drivers/char/tty_io.c + +- Ported to kernel 2.2.7 +=============================================================================== +Changes for patch v99 + +- Ported to kernel 2.2.8 + +- Fixed bug in drivers/scsi/sd.c when >16 SCSI discs + +- Disable warning messages when unable to read partition table for + removable media +=============================================================================== +Changes for patch v100 + +- Ported to kernel 2.3.1-pre5 + +- Added "oops-on-panic" boot option + +- Improved debugging in <devfs_register> and <devfs_unregister> + +- Register entry in <sr_detect> + +- Unregister entry in <sr_detach> + +- Register entry in <sg_detect> + +- Unregister entry in <sg_detach> + +- Added support for ALSA drivers +=============================================================================== +Changes for patch v101 + +- Ported to kernel 2.3.2 +=============================================================================== +Changes for patch v102 + +- Update serial driver to register PCMCIA entries + Thanks to Roch-Alexandre Nomine-Beguin <roch@samarkand.infini.fr> + +- Updated an email address in ChangeLog + +- Hide virtual console capture entries from directory listings when + corresponding console device is not open +=============================================================================== +Changes for patch v103 + +- Ported to kernel 2.3.3 +=============================================================================== +Changes for patch v104 + +- Added documentation for some functions + +- Added "doc" target to fs/devfs/Makefile + +- Added "v4l" directory for video4linux devices + +- Replaced call to <devfs_unregister> in <sd_detach> with call to + <devfs_register_partitions> + +- Moved registration for sr and sg drivers from detect() to attach() + methods + +- Register entries in <st_attach> and unregister in <st_detach> + +- Work around IDE driver treating CD-ROM as gendisk + +- Use <sed> instead of <tr> in rc.devfs + +- Updated ToDo list + +- Removed "oops-on-panic" boot option: now always Oops +=============================================================================== +Changes for patch v105 + +- Unregister SCSI host from <scsi_host_no_list> in <scsi_unregister> + Thanks to Zoltán Böszörményi <zboszor@mail.externet.hu> + +- Don't save /dev/log in rc.devfs + +- Ported to kernel 2.3.4-pre1 +=============================================================================== +Changes for patch v106 + +- Fixed silly typo in drivers/scsi/st.c + +- Improved debugging in <devfs_register> +=============================================================================== +Changes for patch v107 + +- Added "diunlink" and "nokmod" boot options + +- Removed superfluous warning message in <devfs_d_iput> +=============================================================================== +Changes for patch v108 + +- Remove entries when unloading sound module +=============================================================================== +Changes for patch v109 + +- Ported to kernel 2.3.6-pre2 +=============================================================================== +Changes for patch v110 + +- Took account of change to <d_alloc_root> +=============================================================================== +Changes for patch v111 + +- Created separate event queue for each mounted devfs + +- Removed <devfs_invalidate_dcache> + +- Created new ioctl()s for devfsd + +- Incremented devfsd protocol revision to 3 + +- Fixed bug when re-creating directories: contents were lost + +- Block access to inodes until devfsd updates permissions +=============================================================================== +Changes for patch v112 + +- Modified patch so it applies against 2.3.5 and 2.3.6 + +- Updated an email address in ChangeLog + +- Do not automatically change ownership/protection of /dev/tty<n> + +- Updated sample modules.conf + +- Switched to sending process uid/gid to devfsd + +- Renamed <call_kmod> to <try_modload> + +- Added DEVFSD_NOTIFY_LOOKUP event + +- Added DEVFSD_NOTIFY_CHANGE event + +- Added DEVFSD_NOTIFY_CREATE event + +- Incremented devfsd protocol revision to 4 + +- Moved kernel-specific stuff to include/linux/devfs_fs_kernel.h +=============================================================================== +Changes for patch v113 + +- Ported to kernel 2.3.9 + +- Restricted permissions on some block devices +=============================================================================== +Changes for patch v114 + +- Added support for /dev/netlink + Thanks to Dennis Hou <smilax@mindmeld.yi.org> + +- Return EISDIR rather than EINVAL for read(2) on directories + +- Ported to kernel 2.3.10 +=============================================================================== +Changes for patch v115 + +- Added support for all remaining character devices + Thanks to Dennis Hou <smilax@mindmeld.yi.org> + +- Cleaned up netlink support +=============================================================================== +Changes for patch v116 + +- Added support for /dev/parport%d + Thanks to Tim Waugh <tim@cyberelk.demon.co.uk> + +- Fixed parallel port ATAPI tape driver + +- Fixed Atari SLM laser printer driver +=============================================================================== +Changes for patch v117 + +- Added support for COSA card + Thanks to Dennis Hou <smilax@mindmeld.yi.org> + +- Fixed drivers/char/ppdev.c: missing #include <linux/init.h> + +- Fixed drivers/char/ftape/zftape/zftape-init.c + Thanks to Vladimir Popov <mashgrad@usa.net> +=============================================================================== +Changes for patch v118 + +- Ported to kernel 2.3.15-pre3 + +- Fixed bug in loop driver + +- Unregister /dev/lp%d entries in drivers/char/lp.c + Thanks to Maciej W. Rozycki <macro@ds2.pg.gda.pl> +=============================================================================== +Changes for patch v119 + +- Ported to kernel 2.3.16 +=============================================================================== +Changes for patch v120 + +- Fixed bug in drivers/scsi/scsi.c + +- Added /dev/ppp + Thanks to Dennis Hou <smilax@mindmeld.yi.org> + +- Ported to kernel 2.3.17 +=============================================================================== +Changes for patch v121 + +- Fixed bug in drivers/block/loop.c + +- Ported to kernel 2.3.18 +=============================================================================== +Changes for patch v122 + +- Ported to kernel 2.3.19 +=============================================================================== +Changes for patch v123 + +- Ported to kernel 2.3.20 +=============================================================================== +Changes for patch v124 + +- Ported to kernel 2.3.21 +=============================================================================== +Changes for patch v125 + +- Created <devfs_get_info>, <devfs_set_info>, + <devfs_get_first_child> and <devfs_get_next_sibling> + Added <<dir>> parameter to <devfs_register>, <devfs_mk_compat>, + <devfs_mk_dir> and <devfs_find_handle> + Work sponsored by SGI + +- Fixed apparent bug in COSA driver + +- Re-instated "scsihosts=" boot option +=============================================================================== +Changes for patch v126 + +- Always create /dev/pts if CONFIG_UNIX98_PTYS=y + +- Fixed call to <devfs_mk_dir> in drivers/block/ide-disk.c + Thanks to Dennis Hou <smilax@mindmeld.yi.org> + +- Allow multiple unregistrations + +- Created /dev/scsi hierarchy + Work sponsored by SGI +=============================================================================== +Changes for patch v127 + +Work sponsored by SGI + +- No longer disable devpts if devfs enabled (caveat emptor) + +- Added flags array to struct gendisk and removed code from + drivers/scsi/sd.c + +- Created /dev/discs hierarchy +=============================================================================== +Changes for patch v128 + +Work sponsored by SGI + +- Created /dev/cdroms hierarchy +=============================================================================== +Changes for patch v129 + +Work sponsored by SGI + +- Removed compatibility entries for sound devices + +- Removed compatibility entries for printer devices + +- Removed compatibility entries for video4linux devices + +- Removed compatibility entries for parallel port devices + +- Removed compatibility entries for frame buffer devices +=============================================================================== +Changes for patch v130 + +Work sponsored by SGI + +- Added major and minor number to devfsd protocol + +- Incremented devfsd protocol revision to 5 + +- Removed compatibility entries for SoundBlaster CD-ROMs + +- Removed compatibility entries for netlink devices + +- Removed compatibility entries for SCSI generic devices + +- Removed compatibility entries for SCSI tape devices +=============================================================================== +Changes for patch v131 + +Work sponsored by SGI + +- Support info pointer for all devfs entry types + +- Added <<info>> parameter to <devfs_mk_dir> and <devfs_mk_symlink> + +- Removed /dev/st hierarchy + +- Removed /dev/sg hierarchy + +- Removed compatibility entries for loop devices + +- Removed compatibility entries for IDE tape devices + +- Removed compatibility entries for SCSI CD-ROMs + +- Removed /dev/sr hierarchy +=============================================================================== +Changes for patch v132 + +Work sponsored by SGI + +- Removed compatibility entries for floppy devices + +- Removed compatibility entries for RAMDISCs + +- Removed compatibility entries for meta-devices + +- Removed compatibility entries for SCSI discs + +- Created <devfs_make_root> + +- Removed /dev/sd hierarchy + +- Support "../" when searching devfs namespace + +- Created /dev/ide/host* hierarchy + +- Supported IDE hard discs in /dev/ide/host* hierarchy + +- Removed compatibility entries for IDE discs + +- Removed /dev/ide/hd hierarchy + +- Supported IDE CD-ROMs in /dev/ide/host* hierarchy + +- Removed compatibility entries for IDE CD-ROMs + +- Removed /dev/ide/cd hierarchy +=============================================================================== +Changes for patch v133 + +Work sponsored by SGI + +- Created <devfs_get_unregister_slave> + +- Fixed bug in fs/partitions/check.c when rescanning +=============================================================================== +Changes for patch v134 + +Work sponsored by SGI + +- Removed /dev/sd, /dev/sr, /dev/st and /dev/sg directories + +- Removed /dev/ide/hd directory + +- Exported <devfs_get_parent> + +- Created <devfs_register_tape> and /dev/tapes hierarchy + +- Removed /dev/ide/mt hierarchy + +- Removed /dev/ide/fd hierarchy + +- Ported to kernel 2.3.25 +=============================================================================== +Changes for patch v135 + +Work sponsored by SGI + +- Removed compatibility entries for virtual console capture devices + +- Removed unused <devfs_set_symlink_destination> + +- Removed compatibility entries for serial devices + +- Removed compatibility entries for console devices + +- Do not hide entries from devfsd or children + +- Removed DEVFS_FL_TTY_COMPAT flag + +- Removed "nottycompat" boot option + +- Removed <devfs_mk_compat> +=============================================================================== +Changes for patch v136 + +Work sponsored by SGI + +- Moved BSD pty devices to /dev/pty + +- Added DEVFS_FL_WAIT flag +=============================================================================== +Changes for patch v137 + +Work sponsored by SGI + +- Really fixed bug in fs/partitions/check.c when rescanning + +- Support new "disc" naming scheme in <get_removable_partition> + +- Allow NULL fops in <devfs_register> + +- Removed redundant name functions in SCSI disc and IDE drivers +=============================================================================== +Changes for patch v138 + +Work sponsored by SGI + +- Fixed old bugs in drivers/block/paride/pt.c, drivers/char/tpqic02.c, + drivers/net/wan/cosa.c and drivers/scsi/scsi.c + Thanks to Sergey Kubushin <ksi@ksi-linux.com> + +- Fall back to major table if NULL fops given to <devfs_register> +=============================================================================== +Changes for patch v139 + +Work sponsored by SGI + +- Corrected and moved <get_blkfops> and <get_chrfops> declarations + from arch/alpha/kernel/osf_sys.c to include/linux/fs.h + +- Removed name function from struct gendisk + +- Updated devfs FAQ +=============================================================================== +Changes for patch v140 + +Work sponsored by SGI + +- Ported to kernel 2.3.27 +=============================================================================== +Changes for patch v141 + +Work sponsored by SGI + +- Bug fix in arch/m68k/atari/joystick.c + +- Moved ISDN and capi devices to /dev/isdn +=============================================================================== +Changes for patch v142 + +Work sponsored by SGI + +- Bug fix in drivers/block/ide-probe.c (patch confusion) +=============================================================================== +Changes for patch v143 + +Work sponsored by SGI + +- Bug fix in drivers/block/blkpg.c:partition_name() +=============================================================================== +Changes for patch v144 + +Work sponsored by SGI + +- Ported to kernel 2.3.29 + +- Removed calls to <devfs_register> from cdu31a, cm206, mcd and mcdx + CD-ROM drivers: generic driver handles this now + +- Moved joystick devices to /dev/joysticks +=============================================================================== +Changes for patch v145 + +Work sponsored by SGI + +- Ported to kernel 2.3.30-pre3 + +- Register whole-disc entry even for invalid partition tables + +- Fixed bug in mounting root FS when initrd enabled + +- Fixed device entry leak with IDE CD-ROMs + +- Fixed compile problem with drivers/isdn/isdn_common.c + +- Moved COSA devices to /dev/cosa + +- Support fifos when unregistering + +- Created <devfs_register_series> and used in many drivers + +- Moved Coda devices to /dev/coda + +- Moved parallel port IDE tapes to /dev/pt + +- Moved parallel port IDE generic devices to /dev/pg +=============================================================================== +Changes for patch v146 + +Work sponsored by SGI + +- Removed obsolete DEVFS_FL_COMPAT and DEVFS_FL_TOLERANT flags + +- Fixed compile problem with fs/coda/psdev.c + +- Reinstate change to <devfs_register_blkdev> in + drivers/block/ide-probe.c now that fs/isofs/inode.c is fixed + +- Switched to <devfs_register_blkdev> in drivers/block/floppy.c, + drivers/scsi/sr.c and drivers/block/md.c + +- Moved DAC960 devices to /dev/dac960 +=============================================================================== +Changes for patch v147 + +Work sponsored by SGI + +- Ported to kernel 2.3.32-pre4 +=============================================================================== +Changes for patch v148 + +Work sponsored by SGI + +- Removed kmod support: use devfsd instead + +- Moved miscellaneous character devices to /dev/misc +=============================================================================== +Changes for patch v149 + +Work sponsored by SGI + +- Ensure include/linux/joystick.h is OK for user-space + +- Improved debugging in <get_vfs_inode> + +- Ensure dentries created by devfsd will be cleaned up +=============================================================================== +Changes for patch v150 + +Work sponsored by SGI + +- Ported to kernel 2.3.34 +=============================================================================== +Changes for patch v151 + +Work sponsored by SGI + +- Ported to kernel 2.3.35-pre1 + +- Created <devfs_get_name> +=============================================================================== +Changes for patch v152 + +Work sponsored by SGI + +- Updated sample modules.conf + +- Ported to kernel 2.3.36-pre1 +=============================================================================== +Changes for patch v153 + +Work sponsored by SGI + +- Ported to kernel 2.3.42 + +- Removed <devfs_fill_file> +=============================================================================== +Changes for patch v154 + +Work sponsored by SGI + +- Took account of device number changes for /dev/fb* +=============================================================================== +Changes for patch v155 + +Work sponsored by SGI + +- Ported to kernel 2.3.43-pre8 + +- Moved /dev/tty0 to /dev/vc/0 + +- Moved sequence number formatting from <_tty_make_name> to drivers +=============================================================================== +Changes for patch v156 + +Work sponsored by SGI + +- Fixed breakage in drivers/scsi/sd.c due to recent SCSI changes +=============================================================================== +Changes for patch v157 + +Work sponsored by SGI + +- Ported to kernel 2.3.45 +=============================================================================== +Changes for patch v158 + +Work sponsored by SGI + +- Ported to kernel 2.3.46-pre2 +=============================================================================== +Changes for patch v159 + +Work sponsored by SGI + +- Fixed drivers/block/md.c + Thanks to Mike Galbraith <mikeg@weiden.de> + +- Documentation fixes + +- Moved device registration from <lp_init> to <lp_register> + Thanks to Tim Waugh <twaugh@redhat.com> +=============================================================================== +Changes for patch v160 + +Work sponsored by SGI + +- Fixed drivers/char/joystick/joystick.c + Thanks to Vojtech Pavlik <vojtech@suse.cz> + +- Documentation updates + +- Fixed arch/i386/kernel/mtrr.c if procfs and devfs not enabled + +- Fixed drivers/char/stallion.c +=============================================================================== +Changes for patch v161 + +Work sponsored by SGI + +- Remove /dev/ide when ide-mod is unloaded + +- Fixed bug in drivers/block/ide-probe.c when secondary but no primary + +- Added DEVFS_FL_NO_PERSISTENCE flag + +- Used new DEVFS_FL_NO_PERSISTENCE flag for Unix98 pty slaves + +- Removed unnecessary call to <update_devfs_inode_from_entry> in + <devfs_readdir> + +- Only set auto-ownership for /dev/pty/s* +=============================================================================== +Changes for patch v162 + +Work sponsored by SGI + +- Set inode->i_size to correct size for symlinks + Thanks to Jeremy Fitzhardinge <jeremy@goop.org> + +- Only give lookup() method to directories to comply with new VFS + assumptions + +- Remove unnecessary tests in symlink methods + +- Don't kill existing block ops in <devfs_read_inode> + +- Restore auto-ownership for /dev/pty/m* +=============================================================================== +Changes for patch v163 + +Work sponsored by SGI + +- Don't create missing directories in <devfs_find_handle> + +- Removed Documentation/filesystems/devfs/mk-devlinks + +- Updated Documentation/filesystems/devfs/README +=============================================================================== +Changes for patch v164 + +Work sponsored by SGI + +- Fixed CONFIG_DEVFS breakage in drivers/char/serial.c introduced in + linux-2.3.99-pre6-7 +=============================================================================== +Changes for patch v165 + +Work sponsored by SGI + +- Ported to kernel 2.3.99-pre6 +=============================================================================== +Changes for patch v166 + +Work sponsored by SGI + +- Added CONFIG_DEVFS_MOUNT +=============================================================================== +Changes for patch v167 + +Work sponsored by SGI + +- Updated Documentation/filesystems/devfs/README + +- Updated sample modules.conf +=============================================================================== +Changes for patch v168 + +Work sponsored by SGI + +- Disabled multi-mount capability (use VFS bindings instead) + +- Updated README from master HTML file +=============================================================================== +Changes for patch v169 + +Work sponsored by SGI + +- Removed multi-mount code + +- Removed compatibility macros: VFS has changed too much +=============================================================================== +Changes for patch v170 + +Work sponsored by SGI + +- Updated README from master HTML file + +- Merged devfs inode into devfs entry +=============================================================================== +Changes for patch v171 + +Work sponsored by SGI + +- Updated sample modules.conf + +- Removed dead code in <devfs_register> which used to call + <free_dentries> + +- Ported to kernel 2.4.0-test2-pre3 +=============================================================================== +Changes for patch v172 + +Work sponsored by SGI + +- Changed interface to <devfs_register> + +- Changed interface to <devfs_register_series> +=============================================================================== +Changes for patch v173 + +Work sponsored by SGI + +- Simplified interface to <devfs_mk_symlink> + +- Simplified interface to <devfs_mk_dir> + +- Simplified interface to <devfs_find_handle> +=============================================================================== +Changes for patch v174 + +Work sponsored by SGI + +- Updated README from master HTML file +=============================================================================== +Changes for patch v175 + +Work sponsored by SGI + +- DocBook update for fs/devfs/base.c + Thanks to Tim Waugh <twaugh@redhat.com> + +- Removed stale fs/tunnel.c (was never used or completed) +=============================================================================== +Changes for patch v176 + +Work sponsored by SGI + +- Updated ToDo list + +- Removed sample modules.conf: now distributed with devfsd + +- Updated README from master HTML file + +- Ported to kernel 2.4.0-test3-pre4 (which had devfs-patch-v174) +=============================================================================== +Changes for patch v177 + +- Updated README from master HTML file + +- Documentation cleanups + +- Ensure <devfs_generate_path> terminates string for root entry + Thanks to Tim Jansen <tim@tjansen.de> + +- Exported <devfs_get_name> to modules + +- Make <devfs_mk_symlink> send events to devfsd + +- Cleaned up option processing in <devfs_setup> + +- Fixed bugs in handling symlinks: could leak or cause Oops + +- Cleaned up directory handling by separating fops + Thanks to Alexander Viro <viro@parcelfarce.linux.theplanet.co.uk> +=============================================================================== +Changes for patch v178 + +- Fixed handling of inverted options in <devfs_setup> +=============================================================================== +Changes for patch v179 + +- Adjusted <try_modload> to account for <devfs_generate_path> fix +=============================================================================== +Changes for patch v180 + +- Fixed !CONFIG_DEVFS_FS stub declaration of <devfs_get_info> +=============================================================================== +Changes for patch v181 + +- Answered question posed by Al Viro and removed his comments from <devfs_open> + +- Moved setting of registered flag after other fields are changed + +- Fixed race between <devfsd_close> and <devfsd_notify_one> + +- Global VFS changes added bogus BKL to devfsd_close(): removed + +- Widened locking in <devfs_readlink> and <devfs_follow_link> + +- Replaced <devfsd_read> stack usage with <devfsd_ioctl> kmalloc + +- Simplified locking in <devfsd_ioctl> and fixed memory leak +=============================================================================== +Changes for patch v182 + +- Created <devfs_*alloc_major> and <devfs_*alloc_devnum> + +- Removed broken devnum allocation and use <devfs_alloc_devnum> + +- Fixed old devnum leak by calling new <devfs_dealloc_devnum> + +- Created <devfs_*alloc_unique_number> + +- Fixed number leak for /dev/cdroms/cdrom%d + +- Fixed number leak for /dev/discs/disc%d +=============================================================================== +Changes for patch v183 + +- Fixed bug in <devfs_setup> which could hang boot process +=============================================================================== +Changes for patch v184 + +- Documentation typo fix for fs/devfs/util.c + +- Fixed drivers/char/stallion.c for devfs + +- Added DEVFSD_NOTIFY_DELETE event + +- Updated README from master HTML file + +- Removed #include <asm/segment.h> from fs/devfs/base.c +=============================================================================== +Changes for patch v185 + +- Made <block_semaphore> and <char_semaphore> in fs/devfs/util.c + private + +- Fixed inode table races by removing it and using inode->u.generic_ip + instead + +- Moved <devfs_read_inode> into <get_vfs_inode> + +- Moved <devfs_write_inode> into <devfs_notify_change> +=============================================================================== +Changes for patch v186 + +- Fixed race in <devfs_do_symlink> for uni-processor + +- Updated README from master HTML file +=============================================================================== +Changes for patch v187 + +- Fixed drivers/char/stallion.c for devfs + +- Fixed drivers/char/rocket.c for devfs + +- Fixed bug in <devfs_alloc_unique_number>: limited to 128 numbers +=============================================================================== +Changes for patch v188 + +- Updated major masks in fs/devfs/util.c up to Linus' "no new majors" + proclamation. Block: were 126 now 122 free, char: were 26 now 19 free + +- Updated README from master HTML file + +- Removed remnant of multi-mount support in <devfs_mknod> + +- Removed unused DEVFS_FL_SHOW_UNREG flag +=============================================================================== +Changes for patch v189 + +- Removed nlink field from struct devfs_inode + +- Removed auto-ownership for /dev/pty/* (BSD ptys) and used + DEVFS_FL_CURRENT_OWNER|DEVFS_FL_NO_PERSISTENCE for /dev/pty/s* (just + like Unix98 pty slaves) and made /dev/pty/m* rw-rw-rw- access +=============================================================================== +Changes for patch v190 + +- Updated README from master HTML file + +- Replaced BKL with global rwsem to protect symlink data (quick and + dirty hack) +=============================================================================== +Changes for patch v191 + +- Replaced global rwsem for symlink with per-link refcount +=============================================================================== +Changes for patch v192 + +- Removed unnecessary #ifdef CONFIG_DEVFS_FS from arch/i386/kernel/mtrr.c + +- Ported to kernel 2.4.10-pre11 + +- Set inode->i_mapping->a_ops for block nodes in <get_vfs_inode> +=============================================================================== +Changes for patch v193 + +- Went back to global rwsem for symlinks (refcount scheme no good) +=============================================================================== +Changes for patch v194 + +- Fixed overrun in <devfs_link> by removing function (not needed) + +- Updated README from master HTML file +=============================================================================== +Changes for patch v195 + +- Fixed buffer underrun in <try_modload> + +- Moved down_read() from <search_for_entry_in_dir> to <find_entry> +=============================================================================== +Changes for patch v196 + +- Fixed race in <devfsd_ioctl> when setting event mask + Thanks to Kari Hurtta <hurtta@leija.mh.fmi.fi> + +- Avoid deadlock in <devfs_follow_link> by using temporary buffer +=============================================================================== +Changes for patch v197 + +- First release of new locking code for devfs core (v1.0) + +- Fixed bug in drivers/cdrom/cdrom.c +=============================================================================== +Changes for patch v198 + +- Discard temporary buffer, now use "%s" for dentry names + +- Don't generate path in <try_modload>: use fake entry instead + +- Use "existing" directory in <_devfs_make_parent_for_leaf> + +- Use slab cache rather than fixed buffer for devfsd events +=============================================================================== +Changes for patch v199 + +- Removed obsolete usage of DEVFS_FL_NO_PERSISTENCE + +- Send DEVFSD_NOTIFY_REGISTERED events in <devfs_mk_dir> + +- Fixed locking bug in <devfs_d_revalidate_wait> due to typo + +- Do not send CREATE, CHANGE, ASYNC_OPEN or DELETE events from devfsd + or children +=============================================================================== +Changes for patch v200 + +- Ported to kernel 2.5.1-pre2 +=============================================================================== +Changes for patch v201 + +- Fixed bug in <devfsd_read>: was dereferencing freed pointer +=============================================================================== +Changes for patch v202 + +- Fixed bug in <devfsd_close>: was dereferencing freed pointer + +- Added process group check for devfsd privileges +=============================================================================== +Changes for patch v203 + +- Use SLAB_ATOMIC in <devfsd_notify_de> from <devfs_d_delete> +=============================================================================== +Changes for patch v204 + +- Removed long obsolete rc.devfs + +- Return old entry in <devfs_mk_dir> for 2.4.x kernels + +- Updated README from master HTML file + +- Increment refcount on module in <check_disc_changed> + +- Created <devfs_get_handle> and exported <devfs_put> + +- Increment refcount on module in <devfs_get_ops> + +- Created <devfs_put_ops> and used where needed to fix races + +- Added clarifying comments in response to preliminary EMC code review + +- Added poisoning to <devfs_put> + +- Improved debugging messages + +- Fixed unregister bugs in drivers/md/lvm-fs.c +=============================================================================== +Changes for patch v205 + +- Corrected (made useful) debugging message in <unregister> + +- Moved <kmem_cache_create> in <mount_devfs_fs> to <init_devfs_fs> + +- Fixed drivers/md/lvm-fs.c to create "lvm" entry + +- Added magic number to guard against scribbling drivers + +- Only return old entry in <devfs_mk_dir> if a directory + +- Defined macros for error and debug messages + +- Updated README from master HTML file +=============================================================================== +Changes for patch v206 + +- Added support for multiple Compaq cpqarray controllers + +- Fixed (rare, old) race in <devfs_lookup> +=============================================================================== +Changes for patch v207 + +- Fixed deadlock bug in <devfs_d_revalidate_wait> + +- Tag VFS deletable in <devfs_mk_symlink> if handle ignored + +- Updated README from master HTML file +=============================================================================== +Changes for patch v208 + +- Added KERN_* to remaining messages + +- Cleaned up declaration of <stat_read> + +- Updated README from master HTML file +=============================================================================== +Changes for patch v209 + +- Updated README from master HTML file + +- Removed silently introduced calls to lock_kernel() and + unlock_kernel() due to recent VFS locking changes. BKL isn't + required in devfs + +- Changed <devfs_rmdir> to allow later additions if not yet empty + +- Added calls to <devfs_register_partitions> in drivers/block/blkpc.c + <add_partition> and <del_partition> + +- Fixed bug in <devfs_alloc_unique_number>: was clearing beyond + bitfield + +- Fixed bitfield data type for <devfs_*alloc_devnum> + +- Made major bitfield type and initialiser 64 bit safe +=============================================================================== +Changes for patch v210 + +- Updated fs/devfs/util.c to fix shift warning on 64 bit machines + Thanks to Anton Blanchard <anton@samba.org> + +- Updated README from master HTML file +=============================================================================== +Changes for patch v211 + +- Do not put miscellaneous character devices in /dev/misc if they + specify their own directory (i.e. contain a '/' character) + +- Copied macro for error messages from fs/devfs/base.c to + fs/devfs/util.c and made use of this macro + +- Removed 2.4.x compatibility code from fs/devfs/base.c +=============================================================================== +Changes for patch v212 + +- Added BKL to <devfs_open> because drivers still need it +=============================================================================== +Changes for patch v213 + +- Protected <scan_dir_for_removable> and <get_removable_partition> + from changing directory contents +=============================================================================== +Changes for patch v214 + +- Switched to ISO C structure field initialisers + +- Switch to set_current_state() and move before add_wait_queue() + +- Updated README from master HTML file + +- Fixed devfs entry leak in <devfs_readdir> when *readdir fails +=============================================================================== +Changes for patch v215 + +- Created <devfs_find_and_unregister> + +- Switched many functions from <devfs_find_handle> to + <devfs_find_and_unregister> + +- Switched many functions from <devfs_find_handle> to <devfs_get_handle> +=============================================================================== +Changes for patch v216 + +- Switched arch/ia64/sn/io/hcl.c from <devfs_find_handle> to + <devfs_get_handle> + +- Removed deprecated <devfs_find_handle> +=============================================================================== +Changes for patch v217 + +- Exported <devfs_find_and_unregister> and <devfs_only> to modules + +- Updated README from master HTML file + +- Fixed module unload race in <devfs_open> +=============================================================================== +Changes for patch v218 + +- Removed DEVFS_FL_AUTO_OWNER flag + +- Switched lingering structure field initialiser to ISO C + +- Added locking when setting/clearing flags + +- Documentation fix in fs/devfs/util.c diff --git a/Documentation/filesystems/devfs/README b/Documentation/filesystems/devfs/README new file mode 100644 index 000000000000..54366ecc241f --- /dev/null +++ b/Documentation/filesystems/devfs/README @@ -0,0 +1,1964 @@ +Devfs (Device File System) FAQ + + +Linux Devfs (Device File System) FAQ +Richard Gooch +20-AUG-2002 + + +Document languages: + + + + + + + +----------------------------------------------------------------------------- + +NOTE: the master copy of this document is available online at: + +http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html +and looks much better than the text version distributed with the +kernel sources. A mirror site is available at: + +http://www.ras.ucalgary.ca/~rgooch/linux/docs/devfs.html + +There is also an optional daemon that may be used with devfs. You can +find out more about it at: + +http://www.atnf.csiro.au/~rgooch/linux/ + +A mailing list is available which you may subscribe to. Send +email +to majordomo@oss.sgi.com with the following line in the +body of the message: +subscribe devfs +To unsubscribe, send the message body: +unsubscribe devfs +instead. The list is archived at + +http://oss.sgi.com/projects/devfs/archive/. + +----------------------------------------------------------------------------- + +Contents + + +What is it? + +Why do it? + +Who else does it? + +How it works + +Operational issues (essential reading) + +Instructions for the impatient +Permissions persistence across reboots +Dealing with drivers without devfs support +All the way with Devfs +Other Issues +Kernel Naming Scheme +Devfsd Naming Scheme +Old Compatibility Names +SCSI Host Probing Issues + + + +Device drivers currently ported + +Allocation of Device Numbers + +Questions and Answers + +Making things work +Alternatives to devfs +What I don't like about devfs +How to report bugs +Strange kernel messages +Compilation problems with devfsd + + +Other resources + +Translations of this document + + +----------------------------------------------------------------------------- + + +What is it? + +Devfs is an alternative to "real" character and block special devices +on your root filesystem. Kernel device drivers can register devices by +name rather than major and minor numbers. These devices will appear in +devfs automatically, with whatever default ownership and +protection the driver specified. A daemon (devfsd) can be used to +override these defaults. Devfs has been in the kernel since 2.3.46. + +NOTE that devfs is entirely optional. If you prefer the old +disc-based device nodes, then simply leave CONFIG_DEVFS_FS=n (the +default). In this case, nothing will change. ALSO NOTE that if you do +enable devfs, the defaults are such that full compatibility is +maintained with the old devices names. + +There are two aspects to devfs: one is the underlying device +namespace, which is a namespace just like any mounted filesystem. The +other aspect is the filesystem code which provides a view of the +device namespace. The reason I make a distinction is because devfs +can be mounted many times, with each mount showing the same device +namespace. Changes made are global to all mounted devfs filesystems. +Also, because the devfs namespace exists without any devfs mounts, you +can easily mount the root filesystem by referring to an entry in the +devfs namespace. + + +The cost of devfs is a small increase in kernel code size and memory +usage. About 7 pages of code (some of that in __init sections) and 72 +bytes for each entry in the namespace. A modest system has only a +couple of hundred device entries, so this costs a few more +pages. Compare this with the suggestion to put /dev on a <a +href="#why-faq-ramdisc">ramdisc. + +On a typical machine, the cost is under 0.2 percent. On a modest +system with 64 MBytes of RAM, the cost is under 0.1 percent. The +accusations of "bloatware" levelled at devfs are not justified. + +----------------------------------------------------------------------------- + + +Why do it? + +There are several problems that devfs addresses. Some of these +problems are more serious than others (depending on your point of +view), and some can be solved without devfs. However, the totality of +these problems really calls out for devfs. + +The choice is a patchwork of inefficient user space solutions, which +are complex and likely to be fragile, or to use a simple and efficient +devfs which is robust. + +There have been many counter-proposals to devfs, all seeking to +provide some of the benefits without actually implementing devfs. So +far there has been an absence of code and no proposed alternative has +been able to provide all the features that devfs does. Further, +alternative proposals require far more complexity in user-space (and +still deliver less functionality than devfs). Some people have the +mantra of reducing "kernel bloat", but don't consider the effects on +user-space. + +A good solution limits the total complexity of kernel-space and +user-space. + + +Major&minor allocation + +The existing scheme requires the allocation of major and minor device +numbers for each and every device. This means that a central +co-ordinating authority is required to issue these device numbers +(unless you're developing a "private" device driver), in order to +preserve uniqueness. Devfs shifts the burden to a namespace. This may +not seem like a huge benefit, but actually it is. Since driver authors +will naturally choose a device name which reflects the functionality +of the device, there is far less potential for namespace conflict. +Solving this requires a kernel change. + +/dev management + +Because you currently access devices through device nodes, these must +be created by the system administrator. For standard devices you can +usually find a MAKEDEV programme which creates all these (hundreds!) +of nodes. This means that changes in the kernel must be reflected by +changes in the MAKEDEV programme, or else the system administrator +creates device nodes by hand. + +The basic problem is that there are two separate databases of +major and minor numbers. One is in the kernel and one is in /dev (or +in a MAKEDEV programme, if you want to look at it that way). This is +duplication of information, which is not good practice. +Solving this requires a kernel change. + +/dev growth + +A typical /dev has over 1200 nodes! Most of these devices simply don't +exist because the hardware is not available. A huge /dev increases the +time to access devices (I'm just referring to the dentry lookup times +and the time taken to read inodes off disc: the next subsection shows +some more horrors). + +An example of how big /dev can grow is if we consider SCSI devices: + +host 6 bits (say up to 64 hosts on a really big machine) +channel 4 bits (say up to 16 SCSI buses per host) +id 4 bits +lun 3 bits +partition 6 bits +TOTAL 23 bits + + +This requires 8 Mega (1024*1024) inodes if we want to store all +possible device nodes. Even if we scrap everything but id,partition +and assume a single host adapter with a single SCSI bus and only one +logical unit per SCSI target (id), that's still 10 bits or 1024 +inodes. Each VFS inode takes around 256 bytes (kernel 2.1.78), so +that's 256 kBytes of inode storage on disc (assuming real inodes take +a similar amount of space as VFS inodes). This is actually not so bad, +because disc is cheap these days. Embedded systems would care about +256 kBytes of /dev inodes, but you could argue that embedded systems +would have hand-tuned /dev directories. I've had to do just that on my +embedded systems, but I would rather just leave it to devfs. + +Another issue is the time taken to lookup an inode when first +referenced. Not only does this take time in scanning through a list in +memory, but also the seek times to read the inodes off disc. +This could be solved in user-space using a clever programme which +scanned the kernel logs and deleted /dev entries which are not +available and created them when they were available. This programme +would need to be run every time a new module was loaded, which would +slow things down a lot. + +There is an existing programme called scsidev which will automatically +create device nodes for SCSI devices. It can do this by scanning files +in /proc/scsi. Unfortunately, to extend this idea to other device +nodes would require significant modifications to existing drivers (so +they too would provide information in /proc). This is a non-trivial +change (I should know: devfs has had to do something similar). Once +you go to this much effort, you may as well use devfs itself (which +also provides this information). Furthermore, such a system would +likely be implemented in an ad-hoc fashion, as different drivers will +provide their information in different ways. + +Devfs is much cleaner, because it (naturally) has a uniform mechanism +to provide this information: the device nodes themselves! + + +Node to driver file_operations translation + +There is an important difference between the way disc-based character +and block nodes and devfs entries make the connection between an entry +in /dev and the actual device driver. + +With the current 8 bit major and minor numbers the connection between +disc-based c&b nodes and per-major drivers is done through a +fixed-length table of 128 entries. The various filesystem types set +the inode operations for c&b nodes to {chr,blk}dev_inode_operations, +so when a device is opened a few quick levels of indirection bring us +to the driver file_operations. + +For miscellaneous character devices a second step is required: there +is a scan for the driver entry with the same minor number as the file +that was opened, and the appropriate minor open method is called. This +scanning is done *every time* you open a device node. Potentially, you +may be searching through dozens of misc. entries before you find your +open method. While not an enormous performance overhead, this does +seem pointless. + +Linux *must* move beyond the 8 bit major and minor barrier, +somehow. If we simply increase each to 16 bits, then the indexing +scheme used for major driver lookup becomes untenable, because the +major tables (one each for character and block devices) would need to +be 64 k entries long (512 kBytes on x86, 1 MByte for 64 bit +systems). So we would have to use a scheme like that used for +miscellaneous character devices, which means the search time goes up +linearly with the average number of major device drivers on your +system. Not all "devices" are hardware, some are higher-level drivers +like KGI, so you can get more "devices" without adding hardware +You can improve this by creating an ordered (balanced:-) +binary tree, in which case your search time becomes log(N). +Alternatively, you can use hashing to speed up the search. +But why do that search at all if you don't have to? Once again, it +seems pointless. + +Note that devfs doesn't use the major&minor system. For devfs +entries, the connection is done when you lookup the /dev entry. When +devfs_register() is called, an internal table is appended which has +the entry name and the file_operations. If the dentry cache doesn't +have the /dev entry already, this internal table is scanned to get the +file_operations, and an inode is created. If the dentry cache already +has the entry, there is *no lookup time* (other than the dentry scan +itself, but we can't avoid that anyway, and besides Linux dentries +cream other OS's which don't have them:-). Furthermore, the number of +node entries in a devfs is only the number of available device +entries, not the number of *conceivable* entries. Even if you remove +unnecessary entries in a disc-based /dev, the number of conceivable +entries remains the same: you just limit yourself in order to save +space. + +Devfs provides a fast connection between a VFS node and the device +driver, in a scalable way. + +/dev as a system administration tool + +Right now /dev contains a list of conceivable devices, most of which I +don't have. Devfs only shows those devices available on my +system. This means that listing /dev is a handy way of checking what +devices are available. + +Major&minor size + +Existing major and minor numbers are limited to 8 bits each. This is +now a limiting factor for some drivers, particularly the SCSI disc +driver, which consumes a single major number. Only 16 discs are +supported, and each disc may have only 15 partitions. Maybe this isn't +a problem for you, but some of us are building huge Linux systems with +disc arrays. With devfs an arbitrary pointer can be associated with +each device entry, which can be used to give an effective 32 bit +device identifier (i.e. that's like having a 32 bit minor +number). Since this is private to the kernel, there are no C library +compatibility issues which you would have with increasing major and +minor number sizes. See the section on "Allocation of Device Numbers" +for details on maintaining compatibility with userspace. + +Solving this requires a kernel change. + +Since writing this, the kernel has been modified so that the SCSI disc +driver has more major numbers allocated to it and now supports up to +128 discs. Since these major numbers are non-contiguous (a result of +unplanned expansion), the implementation is a little more cumbersome +than originally. + +Just like the changes to IPv4 to fix impending limitations in the +address space, people find ways around the limitations. In the long +run, however, solutions like IPv6 or devfs can't be put off forever. + +Read-only root filesystem + +Having your device nodes on the root filesystem means that you can't +operate properly with a read-only root filesystem. This is because you +want to change ownerships and protections of tty devices. Existing +practice prevents you using a CD-ROM as your root filesystem for a +*real* system. Sure, you can boot off a CD-ROM, but you can't change +tty ownerships, so it's only good for installing. + +Also, you can't use a shared NFS root filesystem for a cluster of +discless Linux machines (having tty ownerships changed on a common +/dev is not good). Nor can you embed your root filesystem in a +ROM-FS. + +You can get around this by creating a RAMDISC at boot time, making +an ext2 filesystem in it, mounting it somewhere and copying the +contents of /dev into it, then unmounting it and mounting it over +/dev. + +A devfs is a cleaner way of solving this. + +Non-Unix root filesystem + +Non-Unix filesystems (such as NTFS) can't be used for a root +filesystem because they variously don't support character and block +special files or symbolic links. You can't have a separate disc-based +or RAMDISC-based filesystem mounted on /dev because you need device +nodes before you can mount these. Devfs can be mounted without any +device nodes. Devlinks won't work because symlinks aren't supported. +An alternative solution is to use initrd to mount a RAMDISC initial +root filesystem (which is populated with a minimal set of device +nodes), and then construct a new /dev in another RAMDISC, and finally +switch to your non-Unix root filesystem. This requires clever boot +scripts and a fragile and conceptually complex boot procedure. + +Devfs solves this in a robust and conceptually simple way. + +PTY security + +Current pseudo-tty (pty) devices are owned by root and read-writable +by everyone. The user of a pty-pair cannot change +ownership/protections without being suid-root. + +This could be solved with a secure user-space daemon which runs as +root and does the actual creation of pty-pairs. Such a daemon would +require modification to *every* programme that wants to use this new +mechanism. It also slows down creation of pty-pairs. + +An alternative is to create a new open_pty() syscall which does much +the same thing as the user-space daemon. Once again, this requires +modifications to pty-handling programmes. + +The devfs solution allows a device driver to "tag" certain device +files so that when an unopened device is opened, the ownerships are +changed to the current euid and egid of the opening process, and the +protections are changed to the default registered by the driver. When +the device is closed ownership is set back to root and protections are +set back to read-write for everybody. No programme need be changed. +The devpts filesystem provides this auto-ownership feature for Unix98 +ptys. It doesn't support old-style pty devices, nor does it have all +the other features of devfs. + +Intelligent device management + +Devfs implements a simple yet powerful protocol for communication with +a device management daemon (devfsd) which runs in user space. It is +possible to send a message (either synchronously or asynchronously) to +devfsd on any event, such as registration/unregistration of device +entries, opening and closing devices, looking up inodes, scanning +directories and more. This has many possibilities. Some of these are +already implemented. See: + + +http://www.atnf.csiro.au/~rgooch/linux/ + +Device entry registration events can be used by devfsd to change +permissions of newly-created device nodes. This is one mechanism to +control device permissions. + +Device entry registration/unregistration events can be used to run +programmes or scripts. This can be used to provide automatic mounting +of filesystems when a new block device media is inserted into the +drive. + +Asynchronous device open and close events can be used to implement +clever permissions management. For example, the default permissions on +/dev/dsp do not allow everybody to read from the device. This is +sensible, as you don't want some remote user recording what you say at +your console. However, the console user is also prevented from +recording. This behaviour is not desirable. With asynchronous device +open and close events, you can have devfsd run a programme or script +when console devices are opened to change the ownerships for *other* +device nodes (such as /dev/dsp). On closure, you can run a different +script to restore permissions. An advantage of this scheme over +modifying the C library tty handling is that this works even if your +programme crashes (how many times have you seen the utmp database with +lingering entries for non-existent logins?). + +Synchronous device open events can be used to perform intelligent +device access protections. Before the device driver open() method is +called, the daemon must first validate the open attempt, by running an +external programme or script. This is far more flexible than access +control lists, as access can be determined on the basis of other +system conditions instead of just the UID and GID. + +Inode lookup events can be used to authenticate module autoload +requests. Instead of using kmod directly, the event is sent to +devfsd which can implement an arbitrary authentication before loading +the module itself. + +Inode lookup events can also be used to construct arbitrary +namespaces, without having to resort to populating devfs with symlinks +to devices that don't exist. + +Speculative Device Scanning + +Consider an application (like cdparanoia) that wants to find all +CD-ROM devices on the system (SCSI, IDE and other types), whether or +not their respective modules are loaded. The application must +speculatively open certain device nodes (such as /dev/sr0 for the SCSI +CD-ROMs) in order to make sure the module is loaded. This requires +that all Linux distributions follow the standard device naming scheme +(last time I looked RedHat did things differently). Devfs solves the +naming problem. + +The same application also wants to see which devices are actually +available on the system. With the existing system it needs to read the +/dev directory and speculatively open each /dev/sr* device to +determine if the device exists or not. With a large /dev this is an +inefficient operation, especially if there are many /dev/sr* nodes. A +solution like scsidev could reduce the number of /dev/sr* entries (but +of course that also requires all that inefficient directory scanning). + +With devfs, the application can open the /dev/sr directory +(which triggers the module autoloading if required), and proceed to +read /dev/sr. Since only the available devices will have +entries, there are no inefficencies in directory scanning or device +openings. + +----------------------------------------------------------------------------- + +Who else does it? + +FreeBSD has a devfs implementation. Solaris and AIX each have a +pseudo-devfs (something akin to scsidev but for all devices, with some +unspecified kernel support). BeOS, Plan9 and QNX also have it. SGI's +IRIX 6.4 and above also have a device filesystem. + +While we shouldn't just automatically do something because others do +it, we should not ignore the work of others either. FreeBSD has a lot +of competent people working on it, so their opinion should not be +blithely ignored. + +----------------------------------------------------------------------------- + + +How it works + +Registering device entries + +For every entry (device node) in a devfs-based /dev a driver must call +devfs_register(). This adds the name of the device entry, the +file_operations structure pointer and a few other things to an +internal table. Device entries may be added and removed at any +time. When a device entry is registered, it automagically appears in +any mounted devfs'. + +Inode lookup + +When a lookup operation on an entry is performed and if there is no +driver information for that entry devfs will attempt to call +devfsd. If still no driver information can be found then a negative +dentry is yielded and the next stage operation will be called by the +VFS (such as create() or mknod() inode methods). If driver information +can be found, an inode is created (if one does not exist already) and +all is well. + +Manually creating device nodes + +The mknod() method allows you to create an ordinary named pipe in the +devfs, or you can create a character or block special inode if one +does not already exist. You may wish to create a character or block +special inode so that you can set permissions and ownership. Later, if +a device driver registers an entry with the same name, the +permissions, ownership and times are retained. This is how you can set +the protections on a device even before the driver is loaded. Once you +create an inode it appears in the directory listing. + +Unregistering device entries + +A device driver calls devfs_unregister() to unregister an entry. + +Chroot() gaols + +2.2.x kernels + +The semantics of inode creation are different when devfs is mounted +with the "explicit" option. Now, when a device entry is registered, it +will not appear until you use mknod() to create the device. It doesn't +matter if you mknod() before or after the device is registered with +devfs_register(). The purpose of this behaviour is to support +chroot(2) gaols, where you want to mount a minimal devfs inside the +gaol. Only the devices you specifically want to be available (through +your mknod() setup) will be accessible. + +2.4.x kernels + +As of kernel 2.3.99, the VFS has had the ability to rebind parts of +the global filesystem namespace into another part of the namespace. +This now works even at the leaf-node level, which means that +individual files and device nodes may be bound into other parts of the +namespace. This is like making links, but better, because it works +across filesystems (unlike hard links) and works through chroot() +gaols (unlike symbolic links). + +Because of these improvements to the VFS, the multi-mount capability +in devfs is no longer needed. The administrator may create a minimal +device tree inside a chroot(2) gaol by using VFS bindings. As this +provides most of the features of the devfs multi-mount capability, I +removed the multi-mount support code (after issuing an RFC). This +yielded code size reductions and simplifications. + +If you want to construct a minimal chroot() gaol, the following +command should suffice: + +mount --bind /dev/null /gaol/dev/null + + +Repeat for other device nodes you want to expose. Simple! + +----------------------------------------------------------------------------- + + +Operational issues + + +Instructions for the impatient + +Nobody likes reading documentation. People just want to get in there +and play. So this section tells you quickly the steps you need to take +to run with devfs mounted over /dev. Skip these steps and you will end +up with a nearly unbootable system. Subsequent sections describe the +issues in more detail, and discuss non-essential configuration +options. + +Devfsd +OK, if you're reading this, I assume you want to play with +devfs. First you should ensure that /usr/src/linux contains a +recent kernel source tree. Then you need to compile devfsd, the device +management daemon, available at + +http://www.atnf.csiro.au/~rgooch/linux/. +Because the kernel has a naming scheme +which is quite different from the old naming scheme, you need to +install devfsd so that software and configuration files that use the +old naming scheme will not break. + +Compile and install devfsd. You will be provided with a default +configuration file /etc/devfsd.conf which will provide +compatibility symlinks for the old naming scheme. Don't change this +config file unless you know what you're doing. Even if you think you +do know what you're doing, don't change it until you've followed all +the steps below and booted a devfs-enabled system and verified that it +works. + +Now edit your main system boot script so that devfsd is started at the +very beginning (before any filesystem +checks). /etc/rc.d/rc.sysinit is often the main boot script +on systems with SysV-style boot scripts. On systems with BSD-style +boot scripts it is often /etc/rc. Also check +/sbin/rc. + +NOTE that the line you put into the boot +script should be exactly: + +/sbin/devfsd /dev + +DO NOT use some special daemon-launching +programme, otherwise the boot script may not wait for devfsd to finish +initialising. + +System Libraries +There may still be some problems because of broken software making +assumptions about device names. In particular, some software does not +handle devices which are symbolic links. If you are running a libc 5 +based system, install libc 5.4.44 (if you have libc 5.4.46, go back to +libc 5.4.44, which is actually correct). If you are running a glibc +based system, make sure you have glibc 2.1.3 or later. + +/etc/securetty +PAM (Pluggable Authentication Modules) is supposed to be a flexible +mechanism for providing better user authentication and access to +services. Unfortunately, it's also fragile, complex and undocumented +(check out RedHat 6.1, and probably other distributions as well). PAM +has problems with symbolic links. Append the following lines to your +/etc/securetty file: + +vc/1 +vc/2 +vc/3 +vc/4 +vc/5 +vc/6 +vc/7 +vc/8 + +This will not weaken security. If you have a version of util-linux +earlier than 2.10.h, please upgrade to 2.10.h or later. If you +absolutely cannot upgrade, then also append the following lines to +your /etc/securetty file: + +1 +2 +3 +4 +5 +6 +7 +8 + +This may potentially weaken security by allowing root logins over the +network (a password is still required, though). However, since there +are problems with dealing with symlinks, I'm suspicious of the level +of security offered in any case. + +XFree86 +While not essential, it's probably a good idea to upgrade to XFree86 +4.0, as patches went in to make it more devfs-friendly. If you don't, +you'll probably need to apply the following patch to +/etc/security/console.perms so that ordinary users can run +startx. Note that not all distributions have this file (e.g. Debian), +so if it's not present, don't worry about it. + +--- /etc/security/console.perms.orig Sat Apr 17 16:26:47 1999 ++++ /etc/security/console.perms Fri Feb 25 23:53:55 2000 +@@ -14,7 +14,7 @@ + # man 5 console.perms + + # file classes -- these are regular expressions +-<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9] ++<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9] + + # device classes -- these are shell-style globs + <floppy>=/dev/fd[0-1]* + +If the patch does not apply, then change the line: + +<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9] + +with: + +<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9] + + +Disable devpts +I've had a report of devpts mounted on /dev/pts not working +correctly. Since devfs will also manage /dev/pts, there is no +need to mount devpts as well. You should either edit your +/etc/fstab so devpts is not mounted, or disable devpts from +your kernel configuration. + +Unsupported drivers +Not all drivers have devfs support. If you depend on one of these +drivers, you will need to create a script or tarfile that you can use +at boot time to create device nodes as appropriate. There is a +section which describes this. Another +section lists the drivers which have +devfs support. + +/dev/mouse + +Many disributions configure /dev/mouse to be the mouse device +for XFree86 and GPM. I actually think this is a bad idea, because it +adds another level of indirection. When looking at a config file, if +you see /dev/mouse you're left wondering which mouse +is being referred to. Hence I recommend putting the actual mouse +device (for example /dev/psaux) into your +/etc/X11/XF86Config file (and similarly for the GPM +configuration file). + +Alternatively, use the same technique used for unsupported drivers +described above. + +The Kernel +Finally, you need to make sure devfs is compiled into your kernel. Set +CONFIG_EXPERIMENTAL=y, CONFIG_DEVFS_FS=y and CONFIG_DEVFS_MOUNT=y by +using favourite configuration tool (i.e. make config or +make xconfig) and then make clean and then recompile your kernel and +modules. At boot, devfs will be mounted onto /dev. + +If you encounter problems booting (for example if you forgot a +configuration step), you can pass devfs=nomount at the kernel +boot command line. This will prevent the kernel from mounting devfs at +boot time onto /dev. + +In general, a kernel built with CONFIG_DEVFS_FS=y but without mounting +devfs onto /dev is completely safe, and requires no +configuration changes. One exception to take note of is when +LABEL= directives are used in /etc/fstab. In this +case you will be unable to boot properly. This is because the +mount(8) programme uses /proc/partitions as part of +the volume label search process, and the device names it finds are not +available, because setting CONFIG_DEVFS_FS=y changes the names in +/proc/partitions, irrespective of whether devfs is mounted. + +Now you've finished all the steps required. You're now ready to boot +your shiny new kernel. Enjoy. + +Changing the configuration + +OK, you've now booted a devfs-enabled system, and everything works. +Now you may feel like changing the configuration (common targets are +/etc/fstab and /etc/devfsd.conf). Since you have a +system that works, if you make any changes and it doesn't work, you +now know that you only have to restore your configuration files to the +default and it will work again. + + +Permissions persistence across reboots + +If you don't use mknod(2) to create a device file, nor use chmod(2) or +chown(2) to change the ownerships/permissions, the inode ctime will +remain at 0 (the epoch, 12 am, 1-JAN-1970, GMT). Anything with a ctime +later than this has had it's ownership/permissions changed. Hence, a +simple script or programme may be used to tar up all changed inodes, +prior to shutdown. Although effective, many consider this approach a +kludge. + +A much better approach is to use devfsd to save and restore +permissions. It may be configured to record changes in permissions and +will save them in a database (in fact a directory tree), and restore +these upon boot. This is an efficient method and results in immediate +saving of current permissions (unlike the tar approach, which saves +permissions at some unspecified future time). + +The default configuration file supplied with devfsd has config entries +which you may uncomment to enable persistence management. + +If you decide to use the tar approach anyway, be aware that tar will +first unlink(2) an inode before creating a new device node. The +unlink(2) has the effect of breaking the connection between a devfs +entry and the device driver. If you use the "devfs=only" boot option, +you lose access to the device driver, requiring you to reload the +module. I consider this a bug in tar (there is no real need to +unlink(2) the inode first). + +Alternatively, you can use devfsd to provide more sophisticated +management of device permissions. You can use devfsd to store +permissions for whole groups of devices with a single configuration +entry, rather than the conventional single entry per device entry. + +Permissions database stored in mounted-over /dev + +If you wish to save and restore your device permissions into the +disc-based /dev while still mounting devfs onto /dev +you may do so. This requires a 2.4.x kernel (in fact, 2.3.99 or +later), which has the VFS binding facility. You need to do the +following to set this up: + + + +make sure the kernel does not mount devfs at boot time + + +make sure you have a correct /dev/console entry in your +root file-system (where your disc-based /dev lives) + +create the /dev-state directory + + +add the following lines near the very beginning of your boot +scripts: + +mount --bind /dev /dev-state +mount -t devfs none /dev +devfsd /dev + + + + +add the following lines to your /etc/devfsd.conf file: + +REGISTER ^pt[sy] IGNORE +CREATE ^pt[sy] IGNORE +CHANGE ^pt[sy] IGNORE +DELETE ^pt[sy] IGNORE +REGISTER .* COPY /dev-state/$devname $devpath +CREATE .* COPY $devpath /dev-state/$devname +CHANGE .* COPY $devpath /dev-state/$devname +DELETE .* CFUNCTION GLOBAL unlink /dev-state/$devname +RESTORE /dev-state + +Note that the sample devfsd.conf file contains these lines, +as well as other sample configurations you may find useful. See the +devfsd distribution + + +reboot. + + + + +Permissions database stored in normal directory + +If you are using an older kernel which doesn't support VFS binding, +then you won't be able to have the permissions database in a +mounted-over /dev. However, you can still use a regular +directory to store the database. The sample /etc/devfsd.conf +file above may still be used. You will need to create the +/dev-state directory prior to installing devfsd. If you have +old permissions in /dev, then just copy (or move) the device +nodes over to the new directory. + +Which method is better? + +The best method is to have the permissions database stored in the +mounted-over /dev. This is because you will not need to copy +device nodes over to /dev-state, and because it allows you to +switch between devfs and non-devfs kernels, without requiring you to +copy permissions between /dev-state (for devfs) and +/dev (for non-devfs). + + +Dealing with drivers without devfs support + +Currently, not all device drivers in the kernel have been modified to +use devfs. Device drivers which do not yet have devfs support will not +automagically appear in devfs. The simplest way to create device nodes +for these drivers is to unpack a tarfile containing the required +device nodes. You can do this in your boot scripts. All your drivers +will now work as before. + +Hopefully for most people devfs will have enough support so that they +can mount devfs directly over /dev without losing most functionality +(i.e. losing access to various devices). As of 22-JAN-1998 (devfs +patch version 10) I am now running this way. All the devices I have +are available in devfs, so I don't lose anything. + +WARNING: if your configuration requires the old-style device names +(i.e. /dev/hda1 or /dev/sda1), you must install devfsd and configure +it to maintain compatibility entries. It is almost certain that you +will require this. Note that the kernel creates a compatibility entry +for the root device, so you don't need initrd. + +Note that you no longer need to mount devpts if you use Unix98 PTYs, +as devfs can manage /dev/pts itself. This saves you some RAM, as you +don't need to compile and install devpts. Note that some versions of +glibc have a bug with Unix98 pty handling on devfs systems. Contact +the glibc maintainers for a fix. Glibc 2.1.3 has the fix. + +Note also that apart from editing /etc/fstab, other things will need +to be changed if you *don't* install devfsd. Some software (like the X +server) hard-wire device names in their source. It really is much +easier to install devfsd so that compatibility entries are created. +You can then slowly migrate your system to using the new device names +(for example, by starting with /etc/fstab), and then limiting the +compatibility entries that devfsd creates. + +IF YOU CONFIGURE TO MOUNT DEVFS AT BOOT, MAKE SURE YOU INSTALL DEVFSD +BEFORE YOU BOOT A DEVFS-ENABLED KERNEL! + +Now that devfs has gone into the 2.3.46 kernel, I'm getting a lot of +reports back. Many of these are because people are trying to run +without devfsd, and hence some things break. Please just run devfsd if +things break. I want to concentrate on real bugs rather than +misconfiguration problems at the moment. If people are willing to fix +bugs/false assumptions in other code (i.e. glibc, X server) and submit +that to the respective maintainers, that would be great. + + +All the way with Devfs + +The devfs kernel patch creates a rationalised device tree. As stated +above, if you want to keep using the old /dev naming scheme, +you just need to configure devfsd appopriately (see the man +page). People who prefer the old names can ignore this section. For +those of us who like the rationalised names and an uncluttered +/dev, read on. + +If you don't run devfsd, or don't enable compatibility entry +management, then you will have to configure your system to use the new +names. For example, you will then need to edit your +/etc/fstab to use the new disc naming scheme. If you want to +be able to boot non-devfs kernels, you will need compatibility +symlinks in the underlying disc-based /dev pointing back to +the old-style names for when you boot a kernel without devfs. + +You can selectively decide which devices you want compatibility +entries for. For example, you may only want compatibility entries for +BSD pseudo-terminal devices (otherwise you'll have to patch you C +library or use Unix98 ptys instead). It's just a matter of putting in +the correct regular expression into /dev/devfsd.conf. + +There are other choices of naming schemes that you may prefer. For +example, I don't use the kernel-supplied +names, because they are too verbose. A common misconception is +that the kernel-supplied names are meant to be used directly in +configuration files. This is not the case. They are designed to +reflect the layout of the devices attached and to provide easy +classification. + +If you like the kernel-supplied names, that's fine. If you don't then +you should be using devfsd to construct a namespace more to your +liking. Devfsd has built-in code to construct a +namespace that is both logical and easy to +manage. In essence, it creates a convenient abbreviation of the +kernel-supplied namespace. + +You are of course free to build your own namespace. Devfsd has all the +infrastructure required to make this easy for you. All you need do is +write a script. You can even write some C code and devfsd can load the +shared object as a callable extension. + + +Other Issues + +The init programme +Another thing to take note of is whether your init programme +creates a Unix socket /dev/telinit. Some versions of init +create /dev/telinit so that the telinit programme can +communicate with the init process. If you have such a system you need +to make sure that devfs is mounted over /dev *before* init +starts. In other words, you can't leave the mounting of devfs to +/etc/rc, since this is executed after init. Other +versions of init require a named pipe /dev/initctl +which must exist *before* init starts. Once again, you need to +mount devfs and then create the named pipe *before* init +starts. + +The default behaviour now is not to mount devfs onto /dev at +boot time for 2.3.x and later kernels. You can correct this with the +"devfs=mount" boot option. This solves any problems with init, +and also prevents the dreaded: + +Cannot open initial console + +message. For 2.2.x kernels where you need to apply the devfs patch, +the default is to mount. + +If you have automatic mounting of devfs onto /dev then you +may need to create /dev/initctl in your boot scripts. The +following lines should suffice: + +mknod /dev/initctl p +kill -SIGUSR1 1 # tell init that /dev/initctl now exists + +Alternatively, if you don't want the kernel to mount devfs onto +/dev then you could use the following procedure is a +guideline for how to get around /dev/initctl problems: + +# cd /sbin +# mv init init.real +# cat > init +#! /bin/sh +mount -n -t devfs none /dev +mknod /dev/initctl p +exec /sbin/init.real $* +[control-D] +# chmod a+x init + +Note that newer versions of init create /dev/initctl +automatically, so you don't have to worry about this. + +Module autoloading +You will need to configure devfsd to enable module +autoloading. The following lines should be placed in your +/etc/devfsd.conf file: + +LOOKUP .* MODLOAD + + +As of devfsd-v1.3.10, a generic /etc/modules.devfs +configuration file is installed, which is used by the MODLOAD +action. This should be sufficient for most configurations. If you +require further configuration, edit your /etc/modules.conf +file. The way module autoloading work with devfs is: + + +a process attempts to lookup a device node (e.g. /dev/fred) + + +if that device node does not exist, the full pathname is passed to +devfsd as a string + + +devfsd will pass the string to the modprobe programme (provided the +configuration line shown above is present), and specifies that +/etc/modules.devfs is the configuration file + + +/etc/modules.devfs includes /etc/modules.conf to +access local configurations + +modprobe will search it's configuration files, looking for an alias +that translates the pathname into a module name + + +the translated pathname is then used to load the module. + + +If you wanted a lookup of /dev/fred to load the +mymod module, you would require the following configuration +line in /etc/modules.conf: + +alias /dev/fred mymod + +The /etc/modules.devfs configuration file provides many such +aliases for standard device names. If you look closely at this file, +you will note that some modules require multiple alias configuration +lines. This is required to support module autoloading for old and new +device names. + +Mounting root off a devfs device +If you wish to mount root off a devfs device when you pass the +"devfs=only" boot option, then you need to pass in the +"root=<device>" option to the kernel when booting. If you use +LILO, then you must have this in lilo.conf: + +append = "root=<device>" + +Surprised? Yep, so was I. It turns out if you have (as most people +do): + +root = <device> + + +then LILO will determine the device number of <device> and will +write that device number into a special place in the kernel image +before starting the kernel, and the kernel will use that device number +to mount the root filesystem. So, using the "append" variety ensures +that LILO passes the root filesystem device as a string, which devfs +can then use. + +Note that this isn't an issue if you don't pass "devfs=only". + +TTY issues +The ttyname(3) function in some versions of the C library makes +false assumptions about device entries which are symbolic links. The +tty(1) programme is one that depends on this function. I've +written a patch to libc 5.4.43 which fixes this. This has been +included in libc 5.4.44 and a similar fix is in glibc 2.1.3. + + +Kernel Naming Scheme + +The kernel provides a default naming scheme. This scheme is designed +to make it easy to search for specific devices or device types, and to +view the available devices. Some device types (such as hard discs), +have a directory of entries, making it easy to see what devices of +that class are available. Often, the entries are symbolic links into a +directory tree that reflects the topology of available devices. The +topological tree is useful for finding how your devices are arranged. + +Below is a list of the naming schemes for the most common drivers. A +list of reserved device names is +available for reference. Please send email to +rgooch@atnf.csiro.au to obtain an allocation. Please be +patient (the maintainer is busy). An alternative name may be allocated +instead of the requested name, at the discretion of the maintainer. + +Disc Devices + +All discs, whether SCSI, IDE or whatever, are placed under the +/dev/discs hierarchy: + + /dev/discs/disc0 first disc + /dev/discs/disc1 second disc + + +Each of these entries is a symbolic link to the directory for that +device. The device directory contains: + + disc for the whole disc + part* for individual partitions + + +CD-ROM Devices + +All CD-ROMs, whether SCSI, IDE or whatever, are placed under the +/dev/cdroms hierarchy: + + /dev/cdroms/cdrom0 first CD-ROM + /dev/cdroms/cdrom1 second CD-ROM + + +Each of these entries is a symbolic link to the real device entry for +that device. + +Tape Devices + +All tapes, whether SCSI, IDE or whatever, are placed under the +/dev/tapes hierarchy: + + /dev/tapes/tape0 first tape + /dev/tapes/tape1 second tape + + +Each of these entries is a symbolic link to the directory for that +device. The device directory contains: + + mt for mode 0 + mtl for mode 1 + mtm for mode 2 + mta for mode 3 + mtn for mode 0, no rewind + mtln for mode 1, no rewind + mtmn for mode 2, no rewind + mtan for mode 3, no rewind + + +SCSI Devices + +To uniquely identify any SCSI device requires the following +information: + + controller (host adapter) + bus (SCSI channel) + target (SCSI ID) + unit (Logical Unit Number) + + +All SCSI devices are placed under /dev/scsi (assuming devfs +is mounted on /dev). Hence, a SCSI device with the following +parameters: c=1,b=2,t=3,u=4 would appear as: + + /dev/scsi/host1/bus2/target3/lun4 device directory + + +Inside this directory, a number of device entries may be created, +depending on which SCSI device-type drivers were installed. + +See the section on the disc naming scheme to see what entries the SCSI +disc driver creates. + +See the section on the tape naming scheme to see what entries the SCSI +tape driver creates. + +The SCSI CD-ROM driver creates: + + cd + + +The SCSI generic driver creates: + + generic + + +IDE Devices + +To uniquely identify any IDE device requires the following +information: + + controller + bus (aka. primary/secondary) + target (aka. master/slave) + unit + + +All IDE devices are placed under /dev/ide, and uses a similar +naming scheme to the SCSI subsystem. + +XT Hard Discs + +All XT discs are placed under /dev/xd. The first XT disc has +the directory /dev/xd/disc0. + +TTY devices + +The tty devices now appear as: + + New name Old-name Device Type + -------- -------- ----------- + /dev/tts/{0,1,...} /dev/ttyS{0,1,...} Serial ports + /dev/cua/{0,1,...} /dev/cua{0,1,...} Call out devices + /dev/vc/0 /dev/tty Current virtual console + /dev/vc/{1,2,...} /dev/tty{1...63} Virtual consoles + /dev/vcc/{0,1,...} /dev/vcs{1...63} Virtual consoles + /dev/pty/m{0,1,...} /dev/ptyp?? PTY masters + /dev/pty/s{0,1,...} /dev/ttyp?? PTY slaves + + +RAMDISCS + +The RAMDISCS are placed in their own directory, and are named thus: + + /dev/rd/{0,1,2,...} + + +Meta Devices + +The meta devices are placed in their own directory, and are named +thus: + + /dev/md/{0,1,2,...} + + +Floppy discs + +Floppy discs are placed in the /dev/floppy directory. + +Loop devices + +Loop devices are placed in the /dev/loop directory. + +Sound devices + +Sound devices are placed in the /dev/sound directory +(audio, sequencer, ...). + + +Devfsd Naming Scheme + +Devfsd provides a naming scheme which is a convenient abbreviation of +the kernel-supplied namespace. In some +cases, the kernel-supplied naming scheme is quite convenient, so +devfsd does not provide another naming scheme. The convenience names +that devfsd creates are in fact the same names as the original devfs +kernel patch created (before Linus mandated the Big Name +Change). These are referred to as "new compatibility entries". + +In order to configure devfsd to create these convenience names, the +following lines should be placed in your /etc/devfsd.conf: + +REGISTER .* MKNEWCOMPAT +UNREGISTER .* RMNEWCOMPAT + +This will cause devfsd to create (and destroy) symbolic links which +point to the kernel-supplied names. + +SCSI Hard Discs + +All SCSI discs are placed under /dev/sd (assuming devfs is +mounted on /dev). Hence, a SCSI disc with the following +parameters: c=1,b=2,t=3,u=4 would appear as: + + /dev/sd/c1b2t3u4 for the whole disc + /dev/sd/c1b2t3u4p5 for the 5th partition + /dev/sd/c1b2t3u4p5s6 for the 6th slice in the 5th partition + + +SCSI Tapes + +All SCSI tapes are placed under /dev/st. A similar naming +scheme is used as for SCSI discs. A SCSI tape with the +parameters:c=1,b=2,t=3,u=4 would appear as: + + /dev/st/c1b2t3u4m0 for mode 0 + /dev/st/c1b2t3u4m1 for mode 1 + /dev/st/c1b2t3u4m2 for mode 2 + /dev/st/c1b2t3u4m3 for mode 3 + /dev/st/c1b2t3u4m0n for mode 0, no rewind + /dev/st/c1b2t3u4m1n for mode 1, no rewind + /dev/st/c1b2t3u4m2n for mode 2, no rewind + /dev/st/c1b2t3u4m3n for mode 3, no rewind + + +SCSI CD-ROMs + +All SCSI CD-ROMs are placed under /dev/sr. A similar naming +scheme is used as for SCSI discs. A SCSI CD-ROM with the +parameters:c=1,b=2,t=3,u=4 would appear as: + + /dev/sr/c1b2t3u4 + + +SCSI Generic Devices + +The generic (aka. raw) interface for all SCSI devices are placed under +/dev/sg. A similar naming scheme is used as for SCSI discs. A +SCSI generic device with the parameters:c=1,b=2,t=3,u=4 would appear +as: + + /dev/sg/c1b2t3u4 + + +IDE Hard Discs + +All IDE discs are placed under /dev/ide/hd, using a similar +convention to SCSI discs. The following mappings exist between the new +and the old names: + + /dev/hda /dev/ide/hd/c0b0t0u0 + /dev/hdb /dev/ide/hd/c0b0t1u0 + /dev/hdc /dev/ide/hd/c0b1t0u0 + /dev/hdd /dev/ide/hd/c0b1t1u0 + + +IDE Tapes + +A similar naming scheme is used as for IDE discs. The entries will +appear in the /dev/ide/mt directory. + +IDE CD-ROM + +A similar naming scheme is used as for IDE discs. The entries will +appear in the /dev/ide/cd directory. + +IDE Floppies + +A similar naming scheme is used as for IDE discs. The entries will +appear in the /dev/ide/fd directory. + +XT Hard Discs + +All XT discs are placed under /dev/xd. The first XT disc +would appear as /dev/xd/c0t0. + + +Old Compatibility Names + +The old compatibility names are the legacy device names, such as +/dev/hda, /dev/sda, /dev/rtc and so on. +Devfsd can be configured to create compatibility symlinks so that you +may continue to use the old names in your configuration files and so +that old applications will continue to function correctly. + +In order to configure devfsd to create these legacy names, the +following lines should be placed in your /etc/devfsd.conf: + +REGISTER .* MKOLDCOMPAT +UNREGISTER .* RMOLDCOMPAT + +This will cause devfsd to create (and destroy) symbolic links which +point to the kernel-supplied names. + + +----------------------------------------------------------------------------- + + +Device drivers currently ported + +- All miscellaneous character devices support devfs (this is done + transparently through misc_register()) + +- SCSI discs and generic hard discs + +- Character memory devices (null, zero, full and so on) + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- Loop devices (/dev/loop?) + +- TTY devices (console, serial ports, terminals and pseudo-terminals) + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- SCSI tapes (/dev/scsi and /dev/tapes) + +- SCSI CD-ROMs (/dev/scsi and /dev/cdroms) + +- SCSI generic devices (/dev/scsi) + +- RAMDISCS (/dev/ram?) + +- Meta Devices (/dev/md*) + +- Floppy discs (/dev/floppy) + +- Parallel port printers (/dev/printers) + +- Sound devices (/dev/sound) + Thanks to Eric Dumas <dumas@linux.eu.org> and + C. Scott Ananian <cananian@alumni.princeton.edu> + +- Joysticks (/dev/joysticks) + +- Sparc keyboard (/dev/kbd) + +- DSP56001 digital signal processor (/dev/dsp56k) + +- Apple Desktop Bus (/dev/adb) + +- Coda network file system (/dev/cfs*) + +- Virtual console capture devices (/dev/vcc) + Thanks to Dennis Hou <smilax@mindmeld.yi.org> + +- Frame buffer devices (/dev/fb) + +- Video capture devices (/dev/v4l) + + +----------------------------------------------------------------------------- + + +Allocation of Device Numbers + +Devfs allows you to write a driver which doesn't need to allocate a +device number (major&minor numbers) for the internal operation of the +kernel. However, there are a number of userspace programmes that use +the device number as a unique handle for a device. An example is the +find programme, which uses device numbers to determine whether +an inode is on a different filesystem than another inode. The device +number used is the one for the block device which a filesystem is +using. To preserve compatibility with userspace programmes, block +devices using devfs need to have unique device numbers allocated to +them. Furthermore, POSIX specifies device numbers, so some kind of +device number needs to be presented to userspace. + +The simplest option (especially when porting drivers to devfs) is to +keep using the old major and minor numbers. Devfs will take whatever +values are given for major&minor and pass them onto userspace. + +This device number is a 16 bit number, so this leaves plenty of space +for large numbers of discs and partitions. This scheme can also be +used for character devices, in particular the tty devices, which are +currently limited to 256 pseudo-ttys (this limits the total number of +simultaneous xterms and remote logins). Note that the device number +is limited to the range 36864-61439 (majors 144-239), in order to +avoid any possible conflicts with existing official allocations. + +Please note that using dynamically allocated block device numbers may +break the NFS daemons (both user and kernel mode), which expect dev_t +for a given device to be constant over the lifetime of remote mounts. + +A final note on this scheme: since it doesn't increase the size of +device numbers, there are no compatibility issues with userspace. + +----------------------------------------------------------------------------- + + +Questions and Answers + + +Making things work +Alternatives to devfs +What I don't like about devfs +How to report bugs +Strange kernel messages +Compilation problems with devfsd + + + +Making things work + +Here are some common questions and answers. + + + +Devfsd doesn't start + +Make sure you have compiled and installed devfsd +Make sure devfsd is being started from your boot +scripts +Make sure you have configured your kernel to enable devfs (see +below) +Make sure devfs is mounted (see below) + + +Devfsd is not managing all my permissions + +Make sure you are capturing the appropriate events. For example, +device entries created by the kernel generate REGISTER events, +but those created by devfsd generate CREATE events. + + +Devfsd is not capturing all REGISTER events + +See the previous entry: you may need to capture CREATE events. + + +X will not start + +Make sure you followed the steps +outlined above. + + +Why don't my network devices appear in devfs? + +This is not a bug. Network devices have their own, completely separate +namespace. They are accessed via socket(2) and +setsockopt(2) calls, and thus require no device nodes. I have +raised the possibilty of moving network devices into the device +namespace, but have had no response. + + +How can I test if I have devfs compiled into my kernel? + +All filesystems built-in or currently loaded are listed in +/proc/filesystems. If you see a devfs entry, then +you know that devfs was compiled into your kernel. If you have +correctly configured and rebuilt your kernel, then devfs will be +built-in. If you think you've configured it in, but +/proc/filesystems doesn't show it, you've made a mistake. +Common mistakes include: + +Using a 2.2.x kernel without applying the devfs patch (if you +don't know how to patch your kernel, use 2.4.x instead, don't bother +asking me how to patch) +Forgetting to set CONFIG_EXPERIMENTAL=y +Forgetting to set CONFIG_DEVFS_FS=y +Forgetting to set CONFIG_DEVFS_MOUNT=y (if you want devfs +to be automatically mounted at boot) +Editing your .config manually, instead of using make +config or make xconfig +Forgetting to run make dep; make clean after changing the +configuration and before compiling +Forgetting to compile your kernel and modules +Forgetting to install your kernel +Forgetting to install your modules + +Please check twice that you've done all these steps before sending in +a bug report. + + + +How can I test if devfs is mounted on /dev? + +The device filesystem will always create an entry called +".devfsd", which is used to communicate with the daemon. Even +if the daemon is not running, this entry will exist. Testing for the +existence of this entry is the approved method of determining if devfs +is mounted or not. Note that the type of entry (i.e. regular file, +character device, named pipe, etc.) may change without notice. Only +the existence of the entry should be relied upon. + + +When I start devfsd, I see the error: +Error opening file: ".devfsd" No such file or directory? + +This means that devfs is not mounted. Make sure you have devfs mounted. + + +How do I mount devfs? + +First make sure you have devfs compiled into your kernel (see +above). Then you will either need to: + +set CONFIG_DEVFS_MOUNT=y in your kernel config +pass devfs=mount to your boot loader +mount devfs manually in your boot scripts with: +mount -t none devfs /dev + + + +Mount by volume LABEL=<label> doesn't work with +devfs + +Most probably you are not mounting devfs onto /dev. What +happens is that if your kernel config has CONFIG_DEVFS_FS=y +then the contents of /proc/partitions will have the devfs +names (such as scsi/host0/bus0/target0/lun0/part1). The +contents of /proc/partitions are used by mount(8) when +mounting by volume label. If devfs is not mounted on /dev, +then mount(8) will fail to find devices. The solution is to +make sure that devfs is mounted on /dev. See above for how to +do that. + + +I have extra or incorrect entries in /dev + +You may have stale entries in your dev-state area. Check for a +RESTORE configuration line in your devfsd configuration +(typically /etc/devfsd.conf). If you have this line, check +the contents of the specified directory for stale entries. Remove +any entries which are incorrect, then reboot. + + +I get "Unable to open initial console" messages at boot + +This usually happens when you don't have devfs automounted onto +/dev at boot time, and there is no valid +/dev/console entry on your root file-system. Create a valid +/dev/console device node. + + + + + +Alternatives to devfs + +I've attempted to collate all the anti-devfs proposals and explain +their limitations. Under construction. + + +Why not just pass device create/remove events to a daemon? + +Here the suggestion is to develop an API in the kernel so that devices +can register create and remove events, and a daemon listens for those +events. The daemon would then populate/depopulate /dev (which +resides on disc). + +This has several limitations: + + +it only works for modules loaded and unloaded (or devices inserted +and removed) after the kernel has finished booting. Without a database +of events, there is no way the daemon could fully populate +/dev + + +if you add a database to this scheme, the question is then how to +present that database to user-space. If you make it a list of strings +with embedded event codes which are passed through a pipe to the +daemon, then this is only of use to the daemon. I would argue that the +natural way to present this data is via a filesystem (since many of +the events will be of a hierarchical nature), such as devfs. +Presenting the data as a filesystem makes it easy for the user to see +what is available and also makes it easy to write scripts to scan the +"database" + + +the tight binding between device nodes and drivers is no longer +possible (requiring the otherwise perfectly avoidable +table lookups) + + +you cannot catch inode lookup events on /dev which means +that module autoloading requires device nodes to be created. This is a +problem, particularly for drivers where only a few inodes are created +from a potentially large set + + +this technique can't be used when the root FS is mounted +read-only + + + + +Just implement a better scsidev + +This suggestion involves taking the scsidev programme and +extending it to scan for all devices, not just SCSI devices. The +scsidev programme works by scanning /proc/scsi + +Problems: + + +the kernel does not currently provide a list of all devices +available. Not all drivers register entries in /proc or +generate kernel messages + + +there is no uniform mechanism to register devices other than the +devfs API + + +implementing such an API is then the same as the +proposal above + + + + +Put /dev on a ramdisc + +This suggestion involves creating a ramdisc and populating it with +device nodes and then mounting it over /dev. + +Problems: + + + +this doesn't help when mounting the root filesystem, since you +still need a device node to do that + + +if you want to use this technique for the root device node as +well, you need to use initrd. This complicates the booting sequence +and makes it significantly harder to administer and configure. The +initrd is essentially opaque, robbing the system administrator of easy +configuration + + +insufficient information is available to correctly populate the +ramdisc. So we come back to the +proposal above to "solve" this + + +a ramdisc-based solution would take more kernel memory, since the +backing store would be (at best) normal VFS inodes and dentries, which +take 284 bytes and 112 bytes, respectively, for each entry. Compare +that to 72 bytes for devfs + + + + +Do nothing: there's no problem + +Sometimes people can be heard to claim that the existing scheme is +fine. This is what they're ignoring: + + +device number size (8 bits each for major and minor) is a real +limitation, and must be fixed somehow. Systems with large numbers of +SCSI devices, for example, will continue to consume the remaining +unallocated major numbers. USB will also need to push beyond the 8 bit +minor limitation + + +simply increasing the device number size is insufficient. Apart +from causing a lot of pain, it doesn't solve the management issues +of a /dev with thousands or more device nodes + + +ignoring the problem of a huge /dev will not make it go +away, and dismisses the legitimacy of a large number of people who +want a dynamic /dev + + +the standard response then becomes: "write a device management +daemon", which brings us back to the +proposal above + + + + +What I don't like about devfs + +Here are some common complaints about devfs, and some suggestions and +solutions that may make it more palatable for you. I can't please +everybody, but I do try :-) + +I hate the naming scheme + +First, remember that no naming scheme will please everybody. You hate +the scheme, others love it. Who's to say who's right and who's wrong? +Ultimately, the person who writes the code gets to choose, and what +exists now is a combination of the choices made by the +devfs author and the +kernel maintainer (Linus). + +However, not all is lost. If you want to create your own naming +scheme, it is a simple matter to write a standalone script, hack +devfsd, or write a script called by devfsd. You can create whatever +naming scheme you like. + +Further, if you want to remove all traces of the devfs naming scheme +from /dev, you can mount devfs elsewhere (say +/devfs) and populate /dev with links into +/devfs. This population can be automated using devfsd if you +wish. + +You can even use the VFS binding facility to make the links, rather +than using symbolic links. This way, you don't even have to see the +"destination" of these symbolic links. + +Devfs puts policy into the kernel + +There's already policy in the kernel. Device numbers are in fact +policy (why should the kernel dictate what device numbers I use?). +Face it, some policy has to be in the kernel. The real difference +between device names as policy and device numbers as policy is that +no one will use device numbers directly, because device +numbers are devoid of meaning to humans and are ugly. At least with +the devfs device names, (even though you can add your own naming +scheme) some people will use the devfs-supplied names directly. This +offends some people :-) + +Devfs is bloatware + +This is not even remotely true. As shown above, +both code and data size are quite modest. + + +How to report bugs + +If you have (or think you have) a bug with devfs, please follow the +steps below: + + + +make sure you have enabled debugging output when configuring your +kernel. You will need to set (at least) the following config options: + +CONFIG_DEVFS_DEBUG=y +CONFIG_DEBUG_KERNEL=y +CONFIG_DEBUG_SLAB=y + + + +please make sure you have the latest devfs patches applied. The +latest kernel version might not have the latest devfs patches applied +yet (Linus is very busy) + + +save a copy of your complete kernel logs (preferably by +using the dmesg programme) for later inclusion in your bug +report. You may need to use the -s switch to increase the +internal buffer size so you can capture all the boot messages. +Don't edit or trim the dmesg output + + + + +try booting with devfs=dall passed to the kernel boot +command line (read the documentation on your bootloader on how to do +this), and save the result to a file. This may be quite verbose, and +it may overflow the messages buffer, but try to get as much of it as +you can + + +if you get an Oops, run ksymoops to decode it so that the +names of the offending functions are provided. A non-decoded Oops is +pretty useless + + +send a copy of your devfsd configuration file(s) + +send the bug report to me first. +Don't expect that I will see it if you post it to the linux-kernel +mailing list. Include all the information listed above, plus +anything else that you think might be relevant. Put the string +devfs somewhere in the subject line, so my mail filters mark +it as urgent + + + + +Here is a general guide on how to ask questions in a way that greatly +improves your chances of getting a reply: + +http://www.tuxedo.org/~esr/faqs/smart-questions.html. If you have +a bug to report, you should also read + +http://www.chiark.greenend.org.uk/~sgtatham/bugs.html. + + +Strange kernel messages + +You may see devfs-related messages in your kernel logs. Below are some +messages and what they mean (and what you should do about them, if +anything). + + + +devfs_register(fred): could not append to parent, err: -17 + +You need to check what the error code means, but usually 17 means +EEXIST. This means that a driver attempted to create an entry +fred in a directory, but there already was an entry with that +name. This is often caused by flawed boot scripts which untar a bunch +of inodes into /dev, as a way to restore permissions. This +message is harmless, as the device nodes will still +provide access to the driver (unless you use the devfs=only +boot option, which is only for dedicated souls:-). If you want to get +rid of these annoying messages, upgrade to devfsd-v1.3.20 and use the +recommended RESTORE directive to restore permissions. + + +devfs_mk_dir(bill): using old entry in dir: c1808724 "" + +This is similar to the message above, except that a driver attempted +to create a directory named bill, and the parent directory +has an entry with the same name. In this case, to ensure that drivers +continue to work properly, the old entry is re-used and given to the +driver. In 2.5 kernels, the driver is given a NULL entry, and thus, +under rare circumstances, may not create the require device nodes. +The solution is the same as above. + + + + + +Compilation problems with devfsd + +Usually, you can compile devfsd just by typing in +make in the source directory, followed by a make +install (as root). Sometimes, you may have problems, particularly +on broken configurations. + + + +error messages relating to DEVFSD_NOTIFY_DELETE + +This happened because you have an ancient set of kernel headers +installed in /usr/include/linux or /usr/src/linux. +Install kernel 2.4.10 or later. You may need to pass the +KERNEL_DIR variable to make (if you did not install +the new kernel sources as /usr/src/linux), or you may copy +the devfs_fs.h file in the kernel source tree into +/usr/include/linux. + + + + +----------------------------------------------------------------------------- + + +Other resources + + + +Douglas Gilbert has written a useful document at + +http://www.torque.net/sg/devfs_scsi.html which +explores the SCSI subsystem and how it interacts with devfs + + +Douglas Gilbert has written another useful document at + +http://www.torque.net/scsi/SCSI-2.4-HOWTO/ which +discusses the Linux SCSI subsystem in 2.4. + + +Johannes Erdfelt has started a discussion paper on Linux and +hot-swap devices, describing what the requirements are for a scalable +solution and how and why he's used devfs+devfsd. Note that this is an +early draft only, available in plain text form at: + +http://johannes.erdfelt.com/hotswap.txt. +Johannes has promised a HTML version will follow. + + +I presented an invited +paper +at the + +2nd Annual Storage Management Workshop held in Miamia, Florida, +U.S.A. in October 2000. + + + + +----------------------------------------------------------------------------- + + +Translations of this document + +This document has been translated into other languages. + + + + +The document master (in English) by rgooch@atnf.csiro.au is +available at + +http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html + + + +A Korean translation by viatoris@nownuri.net is available at + +http://your.destiny.pe.kr/devfs/devfs.html + + + + +----------------------------------------------------------------------------- +Most flags courtesy of ITA's +Flags of All Countries +used with permission. diff --git a/Documentation/filesystems/devfs/ToDo b/Documentation/filesystems/devfs/ToDo new file mode 100644 index 000000000000..afd5a8f2c19b --- /dev/null +++ b/Documentation/filesystems/devfs/ToDo @@ -0,0 +1,40 @@ + Device File System (devfs) ToDo List + + Richard Gooch <rgooch@atnf.csiro.au> + + 3-JUL-2000 + +This is a list of things to be done for better devfs support in the +Linux kernel. If you'd like to contribute to the devfs, please have a +look at this list for anything that is unallocated. Also, if there are +items missing (surely), please contact me so I can add them to the +list (preferably with your name attached to them:-). + + +- >256 ptys + Thanks to C. Scott Ananian <cananian@alumni.princeton.edu> + +- Amiga floppy driver (drivers/block/amiflop.c) + +- Atari floppy driver (drivers/block/ataflop.c) + +- SWIM3 (Super Woz Integrated Machine 3) floppy driver (drivers/block/swim3.c) + +- Amiga ZorroII ramdisc driver (drivers/block/z2ram.c) + +- Parallel port ATAPI CD-ROM (drivers/block/paride/pcd.c) + +- Parallel port ATAPI floppy (drivers/block/paride/pf.c) + +- AP1000 block driver (drivers/ap1000/ap.c, drivers/ap1000/ddv.c) + +- Archimedes floppy (drivers/acorn/block/fd1772.c) + +- MFM hard drive (drivers/acorn/block/mfmhd.c) + +- I2O block device (drivers/message/i2o/i2o_block.c) + +- ST-RAM device (arch/m68k/atari/stram.c) + +- Raw devices + diff --git a/Documentation/filesystems/devfs/boot-options b/Documentation/filesystems/devfs/boot-options new file mode 100644 index 000000000000..df3d33b03e0a --- /dev/null +++ b/Documentation/filesystems/devfs/boot-options @@ -0,0 +1,65 @@ +/* -*- auto-fill -*- */ + + Device File System (devfs) Boot Options + + Richard Gooch <rgooch@atnf.csiro.au> + + 18-AUG-2001 + + +When CONFIG_DEVFS_DEBUG is enabled, you can pass several boot options +to the kernel to debug devfs. The boot options are prefixed by +"devfs=", and are separated by commas. Spaces are not allowed. The +syntax looks like this: + +devfs=<option1>,<option2>,<option3> + +and so on. For example, if you wanted to turn on debugging for module +load requests and device registration, you would do: + +devfs=dmod,dreg + +You may prefix "no" to any option. This will invert the option. + + +Debugging Options +================= + +These requires CONFIG_DEVFS_DEBUG to be enabled. +Note that all debugging options have 'd' as the first character. By +default all options are off. All debugging output is sent to the +kernel logs. The debugging options do not take effect until the devfs +version message appears (just prior to the root filesystem being +mounted). + +These are the options: + +dmod print module load requests to <request_module> + +dreg print device register requests to <devfs_register> + +dunreg print device unregister requests to <devfs_unregister> + +dchange print device change requests to <devfs_set_flags> + +dilookup print inode lookup requests + +diget print VFS inode allocations + +diunlink print inode unlinks + +dichange print inode changes + +dimknod print calls to mknod(2) + +dall some debugging turned on + + +Other Options +============= + +These control the default behaviour of devfs. The options are: + +mount mount devfs onto /dev at boot time + +only disable non-devfs device nodes for devfs-capable drivers diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking new file mode 100644 index 000000000000..34380d4fbce3 --- /dev/null +++ b/Documentation/filesystems/directory-locking @@ -0,0 +1,113 @@ + Locking scheme used for directory operations is based on two +kinds of locks - per-inode (->i_sem) and per-filesystem (->s_vfs_rename_sem). + + For our purposes all operations fall in 5 classes: + +1) read access. Locking rules: caller locks directory we are accessing. + +2) object creation. Locking rules: same as above. + +3) object removal. Locking rules: caller locks parent, finds victim, +locks victim and calls the method. + +4) rename() that is _not_ cross-directory. Locking rules: caller locks +the parent, finds source and target, if target already exists - locks it +and then calls the method. + +5) link creation. Locking rules: + * lock parent + * check that source is not a directory + * lock source + * call the method. + +6) cross-directory rename. The trickiest in the whole bunch. Locking +rules: + * lock the filesystem + * lock parents in "ancestors first" order. + * find source and target. + * if old parent is equal to or is a descendent of target + fail with -ENOTEMPTY + * if new parent is equal to or is a descendent of source + fail with -ELOOP + * if target exists - lock it. + * call the method. + + +The rules above obviously guarantee that all directories that are going to be +read, modified or removed by method will be locked by caller. + + +If no directory is its own ancestor, the scheme above is deadlock-free. +Proof: + + First of all, at any moment we have a partial ordering of the +objects - A < B iff A is an ancestor of B. + + That ordering can change. However, the following is true: + +(1) if object removal or non-cross-directory rename holds lock on A and + attempts to acquire lock on B, A will remain the parent of B until we + acquire the lock on B. (Proof: only cross-directory rename can change + the parent of object and it would have to lock the parent). + +(2) if cross-directory rename holds the lock on filesystem, order will not + change until rename acquires all locks. (Proof: other cross-directory + renames will be blocked on filesystem lock and we don't start changing + the order until we had acquired all locks). + +(3) any operation holds at most one lock on non-directory object and + that lock is acquired after all other locks. (Proof: see descriptions + of operations). + + Now consider the minimal deadlock. Each process is blocked on +attempt to acquire some lock and already holds at least one lock. Let's +consider the set of contended locks. First of all, filesystem lock is +not contended, since any process blocked on it is not holding any locks. +Thus all processes are blocked on ->i_sem. + + Non-directory objects are not contended due to (3). Thus link +creation can't be a part of deadlock - it can't be blocked on source +and it means that it doesn't hold any locks. + + Any contended object is either held by cross-directory rename or +has a child that is also contended. Indeed, suppose that it is held by +operation other than cross-directory rename. Then the lock this operation +is blocked on belongs to child of that object due to (1). + + It means that one of the operations is cross-directory rename. +Otherwise the set of contended objects would be infinite - each of them +would have a contended child and we had assumed that no object is its +own descendent. Moreover, there is exactly one cross-directory rename +(see above). + + Consider the object blocking the cross-directory rename. One +of its descendents is locked by cross-directory rename (otherwise we +would again have an infinite set of of contended objects). But that +means that cross-directory rename is taking locks out of order. Due +to (2) the order hadn't changed since we had acquired filesystem lock. +But locking rules for cross-directory rename guarantee that we do not +try to acquire lock on descendent before the lock on ancestor. +Contradiction. I.e. deadlock is impossible. Q.E.D. + + + These operations are guaranteed to avoid loop creation. Indeed, +the only operation that could introduce loops is cross-directory rename. +Since the only new (parent, child) pair added by rename() is (new parent, +source), such loop would have to contain these objects and the rest of it +would have to exist before rename(). I.e. at the moment of loop creation +rename() responsible for that would be holding filesystem lock and new parent +would have to be equal to or a descendent of source. But that means that +new parent had been equal to or a descendent of source since the moment when +we had acquired filesystem lock and rename() would fail with -ELOOP in that +case. + + While this locking scheme works for arbitrary DAGs, it relies on +ability to check that directory is a descendent of another object. Current +implementation assumes that directory graph is a tree. This assumption is +also preserved by all operations (cross-directory rename on a tree that would +not introduce a cycle will leave it a tree and link() fails for directories). + + Notice that "directory" in the above == "anything that might have +children", so if we are going to introduce hybrid objects we will need +either to make sure that link(2) doesn't work for them or to make changes +in is_subdir() that would make it work even in presence of such beasts. diff --git a/Documentation/filesystems/ext2.txt b/Documentation/filesystems/ext2.txt new file mode 100644 index 000000000000..b5cb9110cc6b --- /dev/null +++ b/Documentation/filesystems/ext2.txt @@ -0,0 +1,383 @@ + +The Second Extended Filesystem +============================== + +ext2 was originally released in January 1993. Written by R\'emy Card, +Theodore Ts'o and Stephen Tweedie, it was a major rewrite of the +Extended Filesystem. It is currently still (April 2001) the predominant +filesystem in use by Linux. There are also implementations available +for NetBSD, FreeBSD, the GNU HURD, Windows 95/98/NT, OS/2 and RISC OS. + +Options +======= + +Most defaults are determined by the filesystem superblock, and can be +set using tune2fs(8). Kernel-determined defaults are indicated by (*). + +bsddf (*) Makes `df' act like BSD. +minixdf Makes `df' act like Minix. + +check Check block and inode bitmaps at mount time + (requires CONFIG_EXT2_CHECK). +check=none, nocheck (*) Don't do extra checking of bitmaps on mount + (check=normal and check=strict options removed) + +debug Extra debugging information is sent to the + kernel syslog. Useful for developers. + +errors=continue Keep going on a filesystem error. +errors=remount-ro Remount the filesystem read-only on an error. +errors=panic Panic and halt the machine if an error occurs. + +grpid, bsdgroups Give objects the same group ID as their parent. +nogrpid, sysvgroups New objects have the group ID of their creator. + +nouid32 Use 16-bit UIDs and GIDs. + +oldalloc Enable the old block allocator. Orlov should + have better performance, we'd like to get some + feedback if it's the contrary for you. +orlov (*) Use the Orlov block allocator. + (See http://lwn.net/Articles/14633/ and + http://lwn.net/Articles/14446/.) + +resuid=n The user ID which may use the reserved blocks. +resgid=n The group ID which may use the reserved blocks. + +sb=n Use alternate superblock at this location. + +user_xattr Enable "user." POSIX Extended Attributes + (requires CONFIG_EXT2_FS_XATTR). + See also http://acl.bestbits.at +nouser_xattr Don't support "user." extended attributes. + +acl Enable POSIX Access Control Lists support + (requires CONFIG_EXT2_FS_POSIX_ACL). + See also http://acl.bestbits.at +noacl Don't support POSIX ACLs. + +nobh Do not attach buffer_heads to file pagecache. + +grpquota,noquota,quota,usrquota Quota options are silently ignored by ext2. + + +Specification +============= + +ext2 shares many properties with traditional Unix filesystems. It has +the concepts of blocks, inodes and directories. It has space in the +specification for Access Control Lists (ACLs), fragments, undeletion and +compression though these are not yet implemented (some are available as +separate patches). There is also a versioning mechanism to allow new +features (such as journalling) to be added in a maximally compatible +manner. + +Blocks +------ + +The space in the device or file is split up into blocks. These are +a fixed size, of 1024, 2048 or 4096 bytes (8192 bytes on Alpha systems), +which is decided when the filesystem is created. Smaller blocks mean +less wasted space per file, but require slightly more accounting overhead, +and also impose other limits on the size of files and the filesystem. + +Block Groups +------------ + +Blocks are clustered into block groups in order to reduce fragmentation +and minimise the amount of head seeking when reading a large amount +of consecutive data. Information about each block group is kept in a +descriptor table stored in the block(s) immediately after the superblock. +Two blocks near the start of each group are reserved for the block usage +bitmap and the inode usage bitmap which show which blocks and inodes +are in use. Since each bitmap is limited to a single block, this means +that the maximum size of a block group is 8 times the size of a block. + +The block(s) following the bitmaps in each block group are designated +as the inode table for that block group and the remainder are the data +blocks. The block allocation algorithm attempts to allocate data blocks +in the same block group as the inode which contains them. + +The Superblock +-------------- + +The superblock contains all the information about the configuration of +the filing system. The primary copy of the superblock is stored at an +offset of 1024 bytes from the start of the device, and it is essential +to mounting the filesystem. Since it is so important, backup copies of +the superblock are stored in block groups throughout the filesystem. +The first version of ext2 (revision 0) stores a copy at the start of +every block group, along with backups of the group descriptor block(s). +Because this can consume a considerable amount of space for large +filesystems, later revisions can optionally reduce the number of backup +copies by only putting backups in specific groups (this is the sparse +superblock feature). The groups chosen are 0, 1 and powers of 3, 5 and 7. + +The information in the superblock contains fields such as the total +number of inodes and blocks in the filesystem and how many are free, +how many inodes and blocks are in each block group, when the filesystem +was mounted (and if it was cleanly unmounted), when it was modified, +what version of the filesystem it is (see the Revisions section below) +and which OS created it. + +If the filesystem is revision 1 or higher, then there are extra fields, +such as a volume name, a unique identification number, the inode size, +and space for optional filesystem features to store configuration info. + +All fields in the superblock (as in all other ext2 structures) are stored +on the disc in little endian format, so a filesystem is portable between +machines without having to know what machine it was created on. + +Inodes +------ + +The inode (index node) is a fundamental concept in the ext2 filesystem. +Each object in the filesystem is represented by an inode. The inode +structure contains pointers to the filesystem blocks which contain the +data held in the object and all of the metadata about an object except +its name. The metadata about an object includes the permissions, owner, +group, flags, size, number of blocks used, access time, change time, +modification time, deletion time, number of links, fragments, version +(for NFS) and extended attributes (EAs) and/or Access Control Lists (ACLs). + +There are some reserved fields which are currently unused in the inode +structure and several which are overloaded. One field is reserved for the +directory ACL if the inode is a directory and alternately for the top 32 +bits of the file size if the inode is a regular file (allowing file sizes +larger than 2GB). The translator field is unused under Linux, but is used +by the HURD to reference the inode of a program which will be used to +interpret this object. Most of the remaining reserved fields have been +used up for both Linux and the HURD for larger owner and group fields, +The HURD also has a larger mode field so it uses another of the remaining +fields to store the extra more bits. + +There are pointers to the first 12 blocks which contain the file's data +in the inode. There is a pointer to an indirect block (which contains +pointers to the next set of blocks), a pointer to a doubly-indirect +block (which contains pointers to indirect blocks) and a pointer to a +trebly-indirect block (which contains pointers to doubly-indirect blocks). + +The flags field contains some ext2-specific flags which aren't catered +for by the standard chmod flags. These flags can be listed with lsattr +and changed with the chattr command, and allow specific filesystem +behaviour on a per-file basis. There are flags for secure deletion, +undeletable, compression, synchronous updates, immutability, append-only, +dumpable, no-atime, indexed directories, and data-journaling. Not all +of these are supported yet. + +Directories +----------- + +A directory is a filesystem object and has an inode just like a file. +It is a specially formatted file containing records which associate +each name with an inode number. Later revisions of the filesystem also +encode the type of the object (file, directory, symlink, device, fifo, +socket) to avoid the need to check the inode itself for this information +(support for taking advantage of this feature does not yet exist in +Glibc 2.2). + +The inode allocation code tries to assign inodes which are in the same +block group as the directory in which they are first created. + +The current implementation of ext2 uses a singly-linked list to store +the filenames in the directory; a pending enhancement uses hashing of the +filenames to allow lookup without the need to scan the entire directory. + +The current implementation never removes empty directory blocks once they +have been allocated to hold more files. + +Special files +------------- + +Symbolic links are also filesystem objects with inodes. They deserve +special mention because the data for them is stored within the inode +itself if the symlink is less than 60 bytes long. It uses the fields +which would normally be used to store the pointers to data blocks. +This is a worthwhile optimisation as it we avoid allocating a full +block for the symlink, and most symlinks are less than 60 characters long. + +Character and block special devices never have data blocks assigned to +them. Instead, their device number is stored in the inode, again reusing +the fields which would be used to point to the data blocks. + +Reserved Space +-------------- + +In ext2, there is a mechanism for reserving a certain number of blocks +for a particular user (normally the super-user). This is intended to +allow for the system to continue functioning even if non-priveleged users +fill up all the space available to them (this is independent of filesystem +quotas). It also keeps the filesystem from filling up entirely which +helps combat fragmentation. + +Filesystem check +---------------- + +At boot time, most systems run a consistency check (e2fsck) on their +filesystems. The superblock of the ext2 filesystem contains several +fields which indicate whether fsck should actually run (since checking +the filesystem at boot can take a long time if it is large). fsck will +run if the filesystem was not cleanly unmounted, if the maximum mount +count has been exceeded or if the maximum time between checks has been +exceeded. + +Feature Compatibility +--------------------- + +The compatibility feature mechanism used in ext2 is sophisticated. +It safely allows features to be added to the filesystem, without +unnecessarily sacrificing compatibility with older versions of the +filesystem code. The feature compatibility mechanism is not supported by +the original revision 0 (EXT2_GOOD_OLD_REV) of ext2, but was introduced in +revision 1. There are three 32-bit fields, one for compatible features +(COMPAT), one for read-only compatible (RO_COMPAT) features and one for +incompatible (INCOMPAT) features. + +These feature flags have specific meanings for the kernel as follows: + +A COMPAT flag indicates that a feature is present in the filesystem, +but the on-disk format is 100% compatible with older on-disk formats, so +a kernel which didn't know anything about this feature could read/write +the filesystem without any chance of corrupting the filesystem (or even +making it inconsistent). This is essentially just a flag which says +"this filesystem has a (hidden) feature" that the kernel or e2fsck may +want to be aware of (more on e2fsck and feature flags later). The ext3 +HAS_JOURNAL feature is a COMPAT flag because the ext3 journal is simply +a regular file with data blocks in it so the kernel does not need to +take any special notice of it if it doesn't understand ext3 journaling. + +An RO_COMPAT flag indicates that the on-disk format is 100% compatible +with older on-disk formats for reading (i.e. the feature does not change +the visible on-disk format). However, an old kernel writing to such a +filesystem would/could corrupt the filesystem, so this is prevented. The +most common such feature, SPARSE_SUPER, is an RO_COMPAT feature because +sparse groups allow file data blocks where superblock/group descriptor +backups used to live, and ext2_free_blocks() refuses to free these blocks, +which would leading to inconsistent bitmaps. An old kernel would also +get an error if it tried to free a series of blocks which crossed a group +boundary, but this is a legitimate layout in a SPARSE_SUPER filesystem. + +An INCOMPAT flag indicates the on-disk format has changed in some +way that makes it unreadable by older kernels, or would otherwise +cause a problem if an old kernel tried to mount it. FILETYPE is an +INCOMPAT flag because older kernels would think a filename was longer +than 256 characters, which would lead to corrupt directory listings. +The COMPRESSION flag is an obvious INCOMPAT flag - if the kernel +doesn't understand compression, you would just get garbage back from +read() instead of it automatically decompressing your data. The ext3 +RECOVER flag is needed to prevent a kernel which does not understand the +ext3 journal from mounting the filesystem without replaying the journal. + +For e2fsck, it needs to be more strict with the handling of these +flags than the kernel. If it doesn't understand ANY of the COMPAT, +RO_COMPAT, or INCOMPAT flags it will refuse to check the filesystem, +because it has no way of verifying whether a given feature is valid +or not. Allowing e2fsck to succeed on a filesystem with an unknown +feature is a false sense of security for the user. Refusing to check +a filesystem with unknown features is a good incentive for the user to +update to the latest e2fsck. This also means that anyone adding feature +flags to ext2 also needs to update e2fsck to verify these features. + +Metadata +-------- + +It is frequently claimed that the ext2 implementation of writing +asynchronous metadata is faster than the ffs synchronous metadata +scheme but less reliable. Both methods are equally resolvable by their +respective fsck programs. + +If you're exceptionally paranoid, there are 3 ways of making metadata +writes synchronous on ext2: + +per-file if you have the program source: use the O_SYNC flag to open() +per-file if you don't have the source: use "chattr +S" on the file +per-filesystem: add the "sync" option to mount (or in /etc/fstab) + +the first and last are not ext2 specific but do force the metadata to +be written synchronously. See also Journaling below. + +Limitations +----------- + +There are various limits imposed by the on-disk layout of ext2. Other +limits are imposed by the current implementation of the kernel code. +Many of the limits are determined at the time the filesystem is first +created, and depend upon the block size chosen. The ratio of inodes to +data blocks is fixed at filesystem creation time, so the only way to +increase the number of inodes is to increase the size of the filesystem. +No tools currently exist which can change the ratio of inodes to blocks. + +Most of these limits could be overcome with slight changes in the on-disk +format and using a compatibility flag to signal the format change (at +the expense of some compatibility). + +Filesystem block size: 1kB 2kB 4kB 8kB + +File size limit: 16GB 256GB 2048GB 2048GB +Filesystem size limit: 2047GB 8192GB 16384GB 32768GB + +There is a 2.4 kernel limit of 2048GB for a single block device, so no +filesystem larger than that can be created at this time. There is also +an upper limit on the block size imposed by the page size of the kernel, +so 8kB blocks are only allowed on Alpha systems (and other architectures +which support larger pages). + +There is an upper limit of 32768 subdirectories in a single directory. + +There is a "soft" upper limit of about 10-15k files in a single directory +with the current linear linked-list directory implementation. This limit +stems from performance problems when creating and deleting (and also +finding) files in such large directories. Using a hashed directory index +(under development) allows 100k-1M+ files in a single directory without +performance problems (although RAM size becomes an issue at this point). + +The (meaningless) absolute upper limit of files in a single directory +(imposed by the file size, the realistic limit is obviously much less) +is over 130 trillion files. It would be higher except there are not +enough 4-character names to make up unique directory entries, so they +have to be 8 character filenames, even then we are fairly close to +running out of unique filenames. + +Journaling +---------- + +A journaling extension to the ext2 code has been developed by Stephen +Tweedie. It avoids the risks of metadata corruption and the need to +wait for e2fsck to complete after a crash, without requiring a change +to the on-disk ext2 layout. In a nutshell, the journal is a regular +file which stores whole metadata (and optionally data) blocks that have +been modified, prior to writing them into the filesystem. This means +it is possible to add a journal to an existing ext2 filesystem without +the need for data conversion. + +When changes to the filesystem (e.g. a file is renamed) they are stored in +a transaction in the journal and can either be complete or incomplete at +the time of a crash. If a transaction is complete at the time of a crash +(or in the normal case where the system does not crash), then any blocks +in that transaction are guaranteed to represent a valid filesystem state, +and are copied into the filesystem. If a transaction is incomplete at +the time of the crash, then there is no guarantee of consistency for +the blocks in that transaction so they are discarded (which means any +filesystem changes they represent are also lost). +Check Documentation/filesystems/ext3.txt if you want to read more about +ext3 and journaling. + +References +========== + +The kernel source file:/usr/src/linux/fs/ext2/ +e2fsprogs (e2fsck) http://e2fsprogs.sourceforge.net/ +Design & Implementation http://e2fsprogs.sourceforge.net/ext2intro.html +Journaling (ext3) ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/ +Hashed Directories http://kernelnewbies.org/~phillips/htree/ +Filesystem Resizing http://ext2resize.sourceforge.net/ +Compression (*) http://www.netspace.net.au/~reiter/e2compr/ + +Implementations for: +Windows 95/98/NT/2000 http://uranus.it.swin.edu.au/~jn/linux/Explore2fs.htm +Windows 95 (*) http://www.yipton.demon.co.uk/content.html#FSDEXT2 +DOS client (*) ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/ +OS/2 http://perso.wanadoo.fr/matthieu.willm/ext2-os2/ +RISC OS client ftp://ftp.barnet.ac.uk/pub/acorn/armlinux/iscafs/ + +(*) no longer actively developed/supported (as of Apr 2001) diff --git a/Documentation/filesystems/ext3.txt b/Documentation/filesystems/ext3.txt new file mode 100644 index 000000000000..9ab7f446f7ad --- /dev/null +++ b/Documentation/filesystems/ext3.txt @@ -0,0 +1,183 @@ + +Ext3 Filesystem +=============== + +ext3 was originally released in September 1999. Written by Stephen Tweedie +for 2.2 branch, and ported to 2.4 kernels by Peter Braam, Andreas Dilger, +Andrew Morton, Alexander Viro, Ted Ts'o and Stephen Tweedie. + +ext3 is ext2 filesystem enhanced with journalling capabilities. + +Options +======= + +When mounting an ext3 filesystem, the following option are accepted: +(*) == default + +jounal=update Update the ext3 file system's journal to the + current format. + +journal=inum When a journal already exists, this option is + ignored. Otherwise, it specifies the number of + the inode which will represent the ext3 file + system's journal file. + +noload Don't load the journal on mounting. + +data=journal All data are committed into the journal prior + to being written into the main file system. + +data=ordered (*) All data are forced directly out to the main file + system prior to its metadata being committed to + the journal. + +data=writeback Data ordering is not preserved, data may be + written into the main file system after its + metadata has been committed to the journal. + +commit=nrsec (*) Ext3 can be told to sync all its data and metadata + every 'nrsec' seconds. The default value is 5 seconds. + This means that if you lose your power, you will lose, + as much, the latest 5 seconds of work (your filesystem + will not be damaged though, thanks to journaling). This + default value (or any low value) will hurt performance, + but it's good for data-safety. Setting it to 0 will + have the same effect than leaving the default 5 sec. + Setting it to very large values will improve + performance. + +barrier=1 This enables/disables barriers. barrier=0 disables it, + barrier=1 enables it. + +orlov (*) This enables the new Orlov block allocator. It's enabled + by default. + +oldalloc This disables the Orlov block allocator and enables the + old block allocator. Orlov should have better performance, + we'd like to get some feedback if it's the contrary for + you. + +user_xattr (*) Enables POSIX Extended Attributes. It's enabled by + default, however you need to confifure its support + (CONFIG_EXT3_FS_XATTR). This is neccesary if you want + to use POSIX Acces Control Lists support. You can visit + http://acl.bestbits.at to know more about POSIX Extended + attributes. + +nouser_xattr Disables POSIX Extended Attributes. + +acl (*) Enables POSIX Access Control Lists support. This is + enabled by default, however you need to configure + its support (CONFIG_EXT3_FS_POSIX_ACL). If you want + to know more about ACLs visit http://acl.bestbits.at + +noacl This option disables POSIX Access Control List support. + +reservation + +noreservation + +resize= + +bsddf (*) Make 'df' act like BSD. +minixdf Make 'df' act like Minix. + +check=none Don't do extra checking of bitmaps on mount. +nocheck + +debug Extra debugging information is sent to syslog. + +errors=remount-ro(*) Remount the filesystem read-only on an error. +errors=continue Keep going on a filesystem error. +errors=panic Panic and halt the machine if an error occurs. + +grpid Give objects the same group ID as their creator. +bsdgroups + +nogrpid (*) New objects have the group ID of their creator. +sysvgroups + +resgid=n The group ID which may use the reserved blocks. + +resuid=n The user ID which may use the reserved blocks. + +sb=n Use alternate superblock at this location. + +quota Quota options are currently silently ignored. +noquota (see fs/ext3/super.c, line 594) +grpquota +usrquota + + +Specification +============= +ext3 shares all disk implementation with ext2 filesystem, and add +transactions capabilities to ext2. Journaling is done by the +Journaling block device layer. + +Journaling Block Device layer +----------------------------- +The Journaling Block Device layer (JBD) isn't ext3 specific. It was +design to add journaling capabilities on a block device. The ext3 +filesystem code will inform the JBD of modifications it is performing +(Call a transaction). the journal support the transactions start and +stop, and in case of crash, the journal can replayed the transactions +to put the partition on a consistent state fastly. + +handles represent a single atomic update to a filesystem. JBD can +handle external journal on a block device. + +Data Mode +--------- +There's 3 different data modes: + +* writeback mode +In data=writeback mode, ext3 does not journal data at all. This mode +provides a similar level of journaling as XFS, JFS, and ReiserFS in its +default mode - metadata journaling. A crash+recovery can cause +incorrect data to appear in files which were written shortly before the +crash. This mode will typically provide the best ext3 performance. + +* ordered mode +In data=ordered mode, ext3 only officially journals metadata, but it +logically groups metadata and data blocks into a single unit called a +transaction. When it's time to write the new metadata out to disk, the +associated data blocks are written first. In general, this mode +perform slightly slower than writeback but significantly faster than +journal mode. + +* journal mode +data=journal mode provides full data and metadata journaling. All new +data is written to the journal first, and then to its final location. +In the event of a crash, the journal can be replayed, bringing both +data and metadata into a consistent state. This mode is the slowest +except when data needs to be read from and written to disk at the same +time where it outperform all others mode. + +Compatibility +------------- + +Ext2 partitions can be easily convert to ext3, with `tune2fs -j <dev>`. +Ext3 is fully compatible with Ext2. Ext3 partitions can easily be +mounted as Ext2. + +External Tools +============== +see manual pages to know more. + +tune2fs: create a ext3 journal on a ext2 partition with the -j flags +mke2fs: create a ext3 partition with the -j flags +debugfs: ext2 and ext3 file system debugger + +References +========== + +kernel source: file:/usr/src/linux/fs/ext3 + file:/usr/src/linux/fs/jbd + +programs: http://e2fsprogs.sourceforge.net + +useful link: + http://www.zip.com.au/~akpm/linux/ext3/ext3-usage.html + http://www-106.ibm.com/developerworks/linux/library/l-fs7/ + http://www-106.ibm.com/developerworks/linux/library/l-fs8/ diff --git a/Documentation/filesystems/hfs.txt b/Documentation/filesystems/hfs.txt new file mode 100644 index 000000000000..bd0fa7704035 --- /dev/null +++ b/Documentation/filesystems/hfs.txt @@ -0,0 +1,83 @@ + +Macintosh HFS Filesystem for Linux +================================== + +HFS stands for ``Hierarchical File System'' and is the filesystem used +by the Mac Plus and all later Macintosh models. Earlier Macintosh +models used MFS (``Macintosh File System''), which is not supported, +MacOS 8.1 and newer support a filesystem called HFS+ that's similar to +HFS but is extended in various areas. Use the hfsplus filesystem driver +to access such filesystems from Linux. + + +Mount options +============= + +When mounting an HFS filesystem, the following options are accepted: + + creator=cccc, type=cccc + Specifies the creator/type values as shown by the MacOS finder + used for creating new files. Default values: '????'. + + uid=n, gid=n + Specifies the user/group that owns all files on the filesystems. + Default: user/group id of the mounting process. + + dir_umask=n, file_umask=n, umask=n + Specifies the umask used for all files , all directories or all + files and directories. Defaults to the umask of the mounting process. + + session=n + Select the CDROM session to mount as HFS filesystem. Defaults to + leaving that decision to the CDROM driver. This option will fail + with anything but a CDROM as underlying devices. + + part=n + Select partition number n from the devices. Does only makes + sense for CDROMS because they can't be partitioned under Linux. + For disk devices the generic partition parsing code does this + for us. Defaults to not parsing the partition table at all. + + quiet + Ignore invalid mount options instead of complaining. + + +Writing to HFS Filesystems +========================== + +HFS is not a UNIX filesystem, thus it does not have the usual features you'd +expect: + + o You can't modify the set-uid, set-gid, sticky or executable bits or the uid + and gid of files. + o You can't create hard- or symlinks, device files, sockets or FIFOs. + +HFS does on the other have the concepts of multiple forks per file. These +non-standard forks are represented as hidden additional files in the normal +filesystems namespace which is kind of a cludge and makes the semantics for +the a little strange: + + o You can't create, delete or rename resource forks of files or the + Finder's metadata. + o They are however created (with default values), deleted and renamed + along with the corresponding data fork or directory. + o Copying files to a different filesystem will loose those attributes + that are essential for MacOS to work. + + +Creating HFS filesystems +=================================== + +The hfsutils package from Robert Leslie contains a program called +hformat that can be used to create HFS filesystem. See +<http://www.mars.org/home/rob/proj/hfs/> for details. + + +Credits +======= + +The HFS drivers was written by Paul H. Hargrovea (hargrove@sccm.Stanford.EDU) +and is now maintained by Roman Zippel (roman@ardistech.com) at Ardis +Technologies. +Roman rewrote large parts of the code and brought in btree routines derived +from Brad Boyer's hfsplus driver (also maintained by Roman now). diff --git a/Documentation/filesystems/hpfs.txt b/Documentation/filesystems/hpfs.txt new file mode 100644 index 000000000000..33dc360c8e89 --- /dev/null +++ b/Documentation/filesystems/hpfs.txt @@ -0,0 +1,296 @@ +Read/Write HPFS 2.09 +1998-2004, Mikulas Patocka + +email: mikulas@artax.karlin.mff.cuni.cz +homepage: http://artax.karlin.mff.cuni.cz/~mikulas/vyplody/hpfs/index-e.cgi + +CREDITS: +Chris Smith, 1993, original read-only HPFS, some code and hpfs structures file + is taken from it +Jacques Gelinas, MSDos mmap, Inspired by fs/nfs/mmap.c (Jon Tombs 15 Aug 1993) +Werner Almesberger, 1992, 1993, MSDos option parser & CR/LF conversion + +Mount options + +uid=xxx,gid=xxx,umask=xxx (default uid=gid=0 umask=default_system_umask) + Set owner/group/mode for files that do not have it specified in extended + attributes. Mode is inverted umask - for example umask 027 gives owner + all permission, group read permission and anybody else no access. Note + that for files mode is anded with 0666. If you want files to have 'x' + rights, you must use extended attributes. +case=lower,asis (default asis) + File name lowercasing in readdir. +conv=binary,text,auto (default binary) + CR/LF -> LF conversion, if auto, decision is made according to extension + - there is a list of text extensions (I thing it's better to not convert + text file than to damage binary file). If you want to change that list, + change it in the source. Original readonly HPFS contained some strange + heuristic algorithm that I removed. I thing it's danger to let the + computer decide whether file is text or binary. For example, DJGPP + binaries contain small text message at the beginning and they could be + misidentified and damaged under some circumstances. +check=none,normal,strict (default normal) + Check level. Selecting none will cause only little speedup and big + danger. I tried to write it so that it won't crash if check=normal on + corrupted filesystems. check=strict means many superfluous checks - + used for debugging (for example it checks if file is allocated in + bitmaps when accessing it). +errors=continue,remount-ro,panic (default remount-ro) + Behaviour when filesystem errors found. +chkdsk=no,errors,always (default errors) + When to mark filesystem dirty so that OS/2 checks it. +eas=no,ro,rw (default rw) + What to do with extended attributes. 'no' - ignore them and use always + values specified in uid/gid/mode options. 'ro' - read extended + attributes but do not create them. 'rw' - create extended attributes + when you use chmod/chown/chgrp/mknod/ln -s on the filesystem. +timeshift=(-)nnn (default 0) + Shifts the time by nnn seconds. For example, if you see under linux + one hour more, than under os/2, use timeshift=-3600. + + +File names + +As in OS/2, filenames are case insensitive. However, shell thinks that names +are case sensitive, so for example when you create a file FOO, you can use +'cat FOO', 'cat Foo', 'cat foo' or 'cat F*' but not 'cat f*'. Note, that you +also won't be able to compile linux kernel (and maybe other things) on HPFS +because kernel creates different files with names like bootsect.S and +bootsect.s. When searching for file thats name has characters >= 128, codepages +are used - see below. +OS/2 ignores dots and spaces at the end of file name, so this driver does as +well. If you create 'a. ...', the file 'a' will be created, but you can still +access it under names 'a.', 'a..', 'a . . . ' etc. + + +Extended attributes + +On HPFS partitions, OS/2 can associate to each file a special information called +extended attributes. Extended attributes are pairs of (key,value) where key is +an ascii string identifying that attribute and value is any string of bytes of +variable length. OS/2 stores window and icon positions and file types there. So +why not use it for unix-specific info like file owner or access rights? This +driver can do it. If you chown/chgrp/chmod on a hpfs partition, extended +attributes with keys "UID", "GID" or "MODE" and 2-byte values are created. Only +that extended attributes those value differs from defaults specified in mount +options are created. Once created, the extended attributes are never deleted, +they're just changed. It means that when your default uid=0 and you type +something like 'chown luser file; chown root file' the file will contain +extended attribute UID=0. And when you umount the fs and mount it again with +uid=luser_uid, the file will be still owned by root! If you chmod file to 444, +extended attribute "MODE" will not be set, this special case is done by setting +read-only flag. When you mknod a block or char device, besides "MODE", the +special 4-byte extended attribute "DEV" will be created containing the device +number. Currently this driver cannot resize extended attributes - it means +that if somebody (I don't know who?) has set "UID", "GID", "MODE" or "DEV" +attributes with different sizes, they won't be rewritten and changing these +values doesn't work. + + +Symlinks + +You can do symlinks on HPFS partition, symlinks are achieved by setting extended +attribute named "SYMLINK" with symlink value. Like on ext2, you can chown and +chgrp symlinks but I don't know what is it good for. chmoding symlink results +in chmoding file where symlink points. These symlinks are just for Linux use and +incompatible with OS/2. OS/2 PmShell symlinks are not supported because they are +stored in very crazy way. They tried to do it so that link changes when file is +moved ... sometimes it works. But the link is partly stored in directory +extended attributes and partly in OS2SYS.INI. I don't want (and don't know how) +to analyze or change OS2SYS.INI. + + +Codepages + +HPFS can contain several uppercasing tables for several codepages and each +file has a pointer to codepage it's name is in. However OS/2 was created in +America where people don't care much about codepages and so multiple codepages +support is quite buggy. I have Czech OS/2 working in codepage 852 on my disk. +Once I booted English OS/2 working in cp 850 and I created a file on my 852 +partition. It marked file name codepage as 850 - good. But when I again booted +Czech OS/2, the file was completely inaccessible under any name. It seems that +OS/2 uppercases the search pattern with its system code page (852) and file +name it's comparing to with its code page (850). These could never match. Is it +really what IBM developers wanted? But problems continued. When I created in +Czech OS/2 another file in that directory, that file was inaccessible too. OS/2 +probably uses different uppercasing method when searching where to place a file +(note, that files in HPFS directory must be sorted) and when searching for +a file. Finally when I opened this directory in PmShell, PmShell crashed (the +funny thing was that, when rebooted, PmShell tried to reopen this directory +again :-). chkdsk happily ignores these errors and only low-level disk +modification saved me. Never mix different language versions of OS/2 on one +system although HPFS was designed to allow that. +OK, I could implement complex codepage support to this driver but I think it +would cause more problems than benefit with such buggy implementation in OS/2. +So this driver simply uses first codepage it finds for uppercasing and +lowercasing no matter what's file codepage index. Usually all file names are in +this codepage - if you don't try to do what I described above :-) + + +Known bugs + +HPFS386 on OS/2 server is not supported. HPFS386 installed on normal OS/2 client +should work. If you have OS/2 server, use only read-only mode. I don't know how +to handle some HPFS386 structures like access control list or extended perm +list, I don't know how to delete them when file is deleted and how to not +overwrite them with extended attributes. Send me some info on these structures +and I'll make it. However, this driver should detect presence of HPFS386 +structures, remount read-only and not destroy them (I hope). + +When there's not enough space for extended attributes, they will be truncated +and no error is returned. + +OS/2 can't access files if the path is longer than about 256 chars but this +driver allows you to do it. chkdsk ignores such errors. + +Sometimes you won't be able to delete some files on a very full filesystem +(returning error ENOSPC). That's because file in non-leaf node in directory tree +(one directory, if it's large, has dirents in tree on HPFS) must be replaced +with another node when deleted. And that new file might have larger name than +the old one so the new name doesn't fit in directory node (dnode). And that +would result in directory tree splitting, that takes disk space. Workaround is +to delete other files that are leaf (probability that the file is non-leaf is +about 1/50) or to truncate file first to make some space. +You encounter this problem only if you have many directories so that +preallocated directory band is full i.e. + number_of_directories / size_of_filesystem_in_mb > 4. + +You can't delete open directories. + +You can't rename over directories (what is it good for?). + +Renaming files so that only case changes doesn't work. This driver supports it +but vfs doesn't. Something like 'mv file FILE' won't work. + +All atimes and directory mtimes are not updated. That's because of performance +reasons. If you extremely wish to update them, let me know, I'll write it (but +it will be slow). + +When the system is out of memory and swap, it may slightly corrupt filesystem +(lost files, unbalanced directories). (I guess all filesystem may do it). + +When compiled, you get warning: function declaration isn't a prototype. Does +anybody know what does it mean? + + +What does "unbalanced tree" message mean? + +Old versions of this driver created sometimes unbalanced dnode trees. OS/2 +chkdsk doesn't scream if the tree is unbalanced (and sometimes creates +unbalanced trees too :-) but both HPFS and HPFS386 contain bug that it rarely +crashes when the tree is not balanced. This driver handles unbalanced trees +correctly and writes warning if it finds them. If you see this message, this is +probably because of directories created with old version of this driver. +Workaround is to move all files from that directory to another and then back +again. Do it in Linux, not OS/2! If you see this message in directory that is +whole created by this driver, it is BUG - let me know about it. + + +Bugs in OS/2 + +When you have two (or more) lost directories pointing each to other, chkdsk +locks up when repairing filesystem. + +Sometimes (I think it's random) when you create a file with one-char name under +OS/2, OS/2 marks it as 'long'. chkdsk then removes this flag saying "Minor fs +error corrected". + +File names like "a .b" are marked as 'long' by OS/2 but chkdsk "corrects" it and +marks them as short (and writes "minor fs error corrected"). This bug is not in +HPFS386. + +Codepage bugs described above. + +If you don't install fixpacks, there are many, many more... + + +History + +0.90 First public release +0.91 Fixed bug that caused shooting to memory when write_inode was called on + open inode (rarely happened) +0.92 Fixed a little memory leak in freeing directory inodes +0.93 Fixed bug that locked up the machine when there were too many filenames + with first 15 characters same + Fixed write_file to zero file when writing behind file end +0.94 Fixed a little memory leak when trying to delete busy file or directory +0.95 Fixed a bug that i_hpfs_parent_dir was not updated when moving files +1.90 First version for 2.1.1xx kernels +1.91 Fixed a bug that chk_sectors failed when sectors were at the end of disk + Fixed a race-condition when write_inode is called while deleting file + Fixed a bug that could possibly happen (with very low probability) when + using 0xff in filenames + Rewritten locking to avoid race-conditions + Mount option 'eas' now works + Fsync no longer returns error + Files beginning with '.' are marked hidden + Remount support added + Alloc is not so slow when filesystem becomes full + Atimes are no more updated because it slows down operation + Code cleanup (removed all commented debug prints) +1.92 Corrected a bug when sync was called just before closing file +1.93 Modified, so that it works with kernels >= 2.1.131, I don't know if it + works with previous versions + Fixed a possible problem with disks > 64G (but I don't have one, so I can't + test it) + Fixed a file overflow at 2G + Added new option 'timeshift' + Changed behaviour on HPFS386: It is now possible to operate on HPFS386 in + read-only mode + Fixed a bug that slowed down alloc and prevented allocating 100% space + (this bug was not destructive) +1.94 Added workaround for one bug in Linux + Fixed one buffer leak + Fixed some incompatibilities with large extended attributes (but it's still + not 100% ok, I have no info on it and OS/2 doesn't want to create them) + Rewritten allocation + Fixed a bug with i_blocks (du sometimes didn't display correct values) + Directories have no longer archive attribute set (some programs don't like + it) + Fixed a bug that it set badly one flag in large anode tree (it was not + destructive) +1.95 Fixed one buffer leak, that could happen on corrupted filesystem + Fixed one bug in allocation in 1.94 +1.96 Added workaround for one bug in OS/2 (HPFS locked up, HPFS386 reported + error sometimes when opening directories in PMSHELL) + Fixed a possible bitmap race + Fixed possible problem on large disks + You can now delete open files + Fixed a nondestructive race in rename +1.97 Support for HPFS v3 (on large partitions) + Fixed a bug that it didn't allow creation of files > 128M (it should be 2G) +1.97.1 Changed names of global symbols + Fixed a bug when chmoding or chowning root directory +1.98 Fixed a deadlock when using old_readdir + Better directory handling; workaround for "unbalanced tree" bug in OS/2 +1.99 Corrected a possible problem when there's not enough space while deleting + file + Now it tries to truncate the file if there's not enough space when deleting + Removed a lot of redundant code +2.00 Fixed a bug in rename (it was there since 1.96) + Better anti-fragmentation strategy +2.01 Fixed problem with directory listing over NFS + Directory lseek now checks for proper parameters + Fixed race-condition in buffer code - it is in all filesystems in Linux; + when reading device (cat /dev/hda) while creating files on it, files + could be damaged +2.02 Woraround for bug in breada in Linux. breada could cause accesses beyond + end of partition +2.03 Char, block devices and pipes are correctly created + Fixed non-crashing race in unlink (Alexander Viro) + Now it works with Japanese version of OS/2 +2.04 Fixed error when ftruncate used to extend file +2.05 Fixed crash when got mount parameters without = + Fixed crash when allocation of anode failed due to full disk + Fixed some crashes when block io or inode allocation failed +2.06 Fixed some crash on corrupted disk structures + Better allocation strategy + Reschedule points added so that it doesn't lock CPU long time + It should work in read-only mode on Warp Server +2.07 More fixes for Warp Server. Now it really works +2.08 Creating new files is not so slow on large disks + An attempt to sync deleted file does not generate filesystem error +2.09 Fixed error on extremly fragmented files + + + vim: set textwidth=80: diff --git a/Documentation/filesystems/isofs.txt b/Documentation/filesystems/isofs.txt new file mode 100644 index 000000000000..f64a10506689 --- /dev/null +++ b/Documentation/filesystems/isofs.txt @@ -0,0 +1,38 @@ +Mount options that are the same as for msdos and vfat partitions. + + gid=nnn All files in the partition will be in group nnn. + uid=nnn All files in the partition will be owned by user id nnn. + umask=nnn The permission mask (see umask(1)) for the partition. + +Mount options that are the same as vfat partitions. These are only useful +when using discs encoded using Microsoft's Joliet extensions. + iocharset=name Character set to use for converting from Unicode to + ASCII. Joliet filenames are stored in Unicode format, but + Unix for the most part doesn't know how to deal with Unicode. + There is also an option of doing UTF8 translations with the + utf8 option. + utf8 Encode Unicode names in UTF8 format. Default is no. + +Mount options unique to the isofs filesystem. + block=512 Set the block size for the disk to 512 bytes + block=1024 Set the block size for the disk to 1024 bytes + block=2048 Set the block size for the disk to 2048 bytes + check=relaxed Matches filenames with different cases + check=strict Matches only filenames with the exact same case + cruft Try to handle badly formatted CDs. + map=off Do not map non-Rock Ridge filenames to lower case + map=normal Map non-Rock Ridge filenames to lower case + map=acorn As map=normal but also apply Acorn extensions if present + mode=xxx Sets the permissions on files to xxx + nojoliet Ignore Joliet extensions if they are present. + norock Ignore Rock Ridge extensions if they are present. + unhide Show hidden files. + session=x Select number of session on multisession CD + sbsector=xxx Session begins from sector xxx + +Recommended documents about ISO 9660 standard are located at: +http://www.y-adagio.com/public/standards/iso_cdromr/tocont.htm +ftp://ftp.ecma.ch/ecma-st/Ecma-119.pdf +Quoting from the PDF "This 2nd Edition of Standard ECMA-119 is technically +identical with ISO 9660.", so it is a valid and gratis substitute of the +official ISO specification. diff --git a/Documentation/filesystems/jfs.txt b/Documentation/filesystems/jfs.txt new file mode 100644 index 000000000000..3e992daf99ad --- /dev/null +++ b/Documentation/filesystems/jfs.txt @@ -0,0 +1,35 @@ +IBM's Journaled File System (JFS) for Linux + +JFS Homepage: http://jfs.sourceforge.net/ + +The following mount options are supported: + +iocharset=name Character set to use for converting from Unicode to + ASCII. The default is to do no conversion. Use + iocharset=utf8 for UTF8 translations. This requires + CONFIG_NLS_UTF8 to be set in the kernel .config file. + iocharset=none specifies the default behavior explicitly. + +resize=value Resize the volume to <value> blocks. JFS only supports + growing a volume, not shrinking it. This option is only + valid during a remount, when the volume is mounted + read-write. The resize keyword with no value will grow + the volume to the full size of the partition. + +nointegrity Do not write to the journal. The primary use of this option + is to allow for higher performance when restoring a volume + from backup media. The integrity of the volume is not + guaranteed if the system abnormally abends. + +integrity Default. Commit metadata changes to the journal. Use this + option to remount a volume where the nointegrity option was + previously specified in order to restore normal behavior. + +errors=continue Keep going on a filesystem error. +errors=remount-ro Default. Remount the filesystem read-only on an error. +errors=panic Panic and halt the machine if an error occurs. + +Please send bugs, comments, cards and letters to shaggy@austin.ibm.com. + +The JFS mailing list can be subscribed to by using the link labeled +"Mail list Subscribe" at our web page http://jfs.sourceforge.net/ diff --git a/Documentation/filesystems/ncpfs.txt b/Documentation/filesystems/ncpfs.txt new file mode 100644 index 000000000000..f12c30c93f2f --- /dev/null +++ b/Documentation/filesystems/ncpfs.txt @@ -0,0 +1,12 @@ +The ncpfs filesystem understands the NCP protocol, designed by the +Novell Corporation for their NetWare(tm) product. NCP is functionally +similar to the NFS used in the TCP/IP community. +To mount a NetWare filesystem, you need a special mount program, which +can be found in the ncpfs package. The home site for ncpfs is +ftp.gwdg.de/pub/linux/misc/ncpfs, but sunsite and its many mirrors +will have it as well. + +Related products are linware and mars_nwe, which will give Linux partial +NetWare server functionality. Linware's home site is +klokan.sh.cvut.cz/pub/linux/linware; mars_nwe can be found on +ftp.gwdg.de/pub/linux/misc/ncpfs. diff --git a/Documentation/filesystems/ntfs.txt b/Documentation/filesystems/ntfs.txt new file mode 100644 index 000000000000..f89b440fad1d --- /dev/null +++ b/Documentation/filesystems/ntfs.txt @@ -0,0 +1,630 @@ +The Linux NTFS filesystem driver +================================ + + +Table of contents +================= + +- Overview +- Web site +- Features +- Supported mount options +- Known bugs and (mis-)features +- Using NTFS volume and stripe sets + - The Device-Mapper driver + - The Software RAID / MD driver + - Limitiations when using the MD driver +- ChangeLog + + +Overview +======== + +Linux-NTFS comes with a number of user-space programs known as ntfsprogs. +These include mkntfs, a full-featured ntfs file system format utility, +ntfsundelete used for recovering files that were unintentionally deleted +from an NTFS volume and ntfsresize which is used to resize an NTFS partition. +See the web site for more information. + +To mount an NTFS 1.2/3.x (Windows NT4/2000/XP/2003) volume, use the file +system type 'ntfs'. The driver currently supports read-only mode (with no +fault-tolerance, encryption or journalling) and very limited, but safe, write +support. + +For fault tolerance and raid support (i.e. volume and stripe sets), you can +use the kernel's Software RAID / MD driver. See section "Using Software RAID +with NTFS" for details. + + +Web site +======== + +There is plenty of additional information on the linux-ntfs web site +at http://linux-ntfs.sourceforge.net/ + +The web site has a lot of additional information, such as a comprehensive +FAQ, documentation on the NTFS on-disk format, informaiton on the Linux-NTFS +userspace utilities, etc. + + +Features +======== + +- This is a complete rewrite of the NTFS driver that used to be in the kernel. + This new driver implements NTFS read support and is functionally equivalent + to the old ntfs driver. +- The new driver has full support for sparse files on NTFS 3.x volumes which + the old driver isn't happy with. +- The new driver supports execution of binaries due to mmap() now being + supported. +- The new driver supports loopback mounting of files on NTFS which is used by + some Linux distributions to enable the user to run Linux from an NTFS + partition by creating a large file while in Windows and then loopback + mounting the file while in Linux and creating a Linux filesystem on it that + is used to install Linux on it. +- A comparison of the two drivers using: + time find . -type f -exec md5sum "{}" \; + run three times in sequence with each driver (after a reboot) on a 1.4GiB + NTFS partition, showed the new driver to be 20% faster in total time elapsed + (from 9:43 minutes on average down to 7:53). The time spent in user space + was unchanged but the time spent in the kernel was decreased by a factor of + 2.5 (from 85 CPU seconds down to 33). +- The driver does not support short file names in general. For backwards + compatibility, we implement access to files using their short file names if + they exist. The driver will not create short file names however, and a + rename will discard any existing short file name. +- The new driver supports exporting of mounted NTFS volumes via NFS. +- The new driver supports async io (aio). +- The new driver supports fsync(2), fdatasync(2), and msync(2). +- The new driver supports readv(2) and writev(2). +- The new driver supports access time updates (including mtime and ctime). + + +Supported mount options +======================= + +In addition to the generic mount options described by the manual page for the +mount command (man 8 mount, also see man 5 fstab), the NTFS driver supports the +following mount options: + +iocharset=name Deprecated option. Still supported but please use + nls=name in the future. See description for nls=name. + +nls=name Character set to use when returning file names. + Unlike VFAT, NTFS suppresses names that contain + unconvertible characters. Note that most character + sets contain insufficient characters to represent all + possible Unicode characters that can exist on NTFS. + To be sure you are not missing any files, you are + advised to use nls=utf8 which is capable of + representing all Unicode characters. + +utf8=<bool> Option no longer supported. Currently mapped to + nls=utf8 but please use nls=utf8 in the future and + make sure utf8 is compiled either as module or into + the kernel. See description for nls=name. + +uid= +gid= +umask= Provide default owner, group, and access mode mask. + These options work as documented in mount(8). By + default, the files/directories are owned by root and + he/she has read and write permissions, as well as + browse permission for directories. No one else has any + access permissions. I.e. the mode on all files is by + default rw------- and for directories rwx------, a + consequence of the default fmask=0177 and dmask=0077. + Using a umask of zero will grant all permissions to + everyone, i.e. all files and directories will have mode + rwxrwxrwx. + +fmask= +dmask= Instead of specifying umask which applies both to + files and directories, fmask applies only to files and + dmask only to directories. + +sloppy=<BOOL> If sloppy is specified, ignore unknown mount options. + Otherwise the default behaviour is to abort mount if + any unknown options are found. + +show_sys_files=<BOOL> If show_sys_files is specified, show the system files + in directory listings. Otherwise the default behaviour + is to hide the system files. + Note that even when show_sys_files is specified, "$MFT" + will not be visible due to bugs/mis-features in glibc. + Further, note that irrespective of show_sys_files, all + files are accessible by name, i.e. you can always do + "ls -l \$UpCase" for example to specifically show the + system file containing the Unicode upcase table. + +case_sensitive=<BOOL> If case_sensitive is specified, treat all file names as + case sensitive and create file names in the POSIX + namespace. Otherwise the default behaviour is to treat + file names as case insensitive and to create file names + in the WIN32/LONG name space. Note, the Linux NTFS + driver will never create short file names and will + remove them on rename/delete of the corresponding long + file name. + Note that files remain accessible via their short file + name, if it exists. If case_sensitive, you will need + to provide the correct case of the short file name. + +errors=opt What to do when critical file system errors are found. + Following values can be used for "opt": + continue: DEFAULT, try to clean-up as much as + possible, e.g. marking a corrupt inode as + bad so it is no longer accessed, and then + continue. + recover: At present only supported is recovery of + the boot sector from the backup copy. + If read-only mount, the recovery is done + in memory only and not written to disk. + Note that the options are additive, i.e. specifying: + errors=continue,errors=recover + means the driver will attempt to recover and if that + fails it will clean-up as much as possible and + continue. + +mft_zone_multiplier= Set the MFT zone multiplier for the volume (this + setting is not persistent across mounts and can be + changed from mount to mount but cannot be changed on + remount). Values of 1 to 4 are allowed, 1 being the + default. The MFT zone multiplier determines how much + space is reserved for the MFT on the volume. If all + other space is used up, then the MFT zone will be + shrunk dynamically, so this has no impact on the + amount of free space. However, it can have an impact + on performance by affecting fragmentation of the MFT. + In general use the default. If you have a lot of small + files then use a higher value. The values have the + following meaning: + Value MFT zone size (% of volume size) + 1 12.5% + 2 25% + 3 37.5% + 4 50% + Note this option is irrelevant for read-only mounts. + + +Known bugs and (mis-)features +============================= + +- The link count on each directory inode entry is set to 1, due to Linux not + supporting directory hard links. This may well confuse some user space + applications, since the directory names will have the same inode numbers. + This also speeds up ntfs_read_inode() immensely. And we haven't found any + problems with this approach so far. If you find a problem with this, please + let us know. + + +Please send bug reports/comments/feedback/abuse to the Linux-NTFS development +list at sourceforge: linux-ntfs-dev@lists.sourceforge.net + + +Using NTFS volume and stripe sets +================================= + +For support of volume and stripe sets, you can either use the kernel's +Device-Mapper driver or the kernel's Software RAID / MD driver. The former is +the recommended one to use for linear raid. But the latter is required for +raid level 5. For striping and mirroring, either driver should work fine. + + +The Device-Mapper driver +------------------------ + +You will need to create a table of the components of the volume/stripe set and +how they fit together and load this into the kernel using the dmsetup utility +(see man 8 dmsetup). + +Linear volume sets, i.e. linear raid, has been tested and works fine. Even +though untested, there is no reason why stripe sets, i.e. raid level 0, and +mirrors, i.e. raid level 1 should not work, too. Stripes with parity, i.e. +raid level 5, unfortunately cannot work yet because the current version of the +Device-Mapper driver does not support raid level 5. You may be able to use the +Software RAID / MD driver for raid level 5, see the next section for details. + +To create the table describing your volume you will need to know each of its +components and their sizes in sectors, i.e. multiples of 512-byte blocks. + +For NT4 fault tolerant volumes you can obtain the sizes using fdisk. So for +example if one of your partitions is /dev/hda2 you would do: + +$ fdisk -ul /dev/hda + +Disk /dev/hda: 81.9 GB, 81964302336 bytes +255 heads, 63 sectors/track, 9964 cylinders, total 160086528 sectors +Units = sectors of 1 * 512 = 512 bytes + + Device Boot Start End Blocks Id System + /dev/hda1 * 63 4209029 2104483+ 83 Linux + /dev/hda2 4209030 37768814 16779892+ 86 NTFS + /dev/hda3 37768815 46170809 4200997+ 83 Linux + +And you would know that /dev/hda2 has a size of 37768814 - 4209030 + 1 = +33559785 sectors. + +For Win2k and later dynamic disks, you can for example use the ldminfo utility +which is part of the Linux LDM tools (the latest version at the time of +writing is linux-ldm-0.0.8.tar.bz2). You can download it from: + http://linux-ntfs.sourceforge.net/downloads.html +Simply extract the downloaded archive (tar xvjf linux-ldm-0.0.8.tar.bz2), go +into it (cd linux-ldm-0.0.8) and change to the test directory (cd test). You +will find the precompiled (i386) ldminfo utility there. NOTE: You will not be +able to compile this yourself easily so use the binary version! + +Then you would use ldminfo in dump mode to obtain the necessary information: + +$ ./ldminfo --dump /dev/hda + +This would dump the LDM database found on /dev/hda which describes all of your +dynamic disks and all the volumes on them. At the bottom you will see the +VOLUME DEFINITIONS section which is all you really need. You may need to look +further above to determine which of the disks in the volume definitions is +which device in Linux. Hint: Run ldminfo on each of your dynamic disks and +look at the Disk Id close to the top of the output for each (the PRIVATE HEADER +section). You can then find these Disk Ids in the VBLK DATABASE section in the +<Disk> components where you will get the LDM Name for the disk that is found in +the VOLUME DEFINITIONS section. + +Note you will also need to enable the LDM driver in the Linux kernel. If your +distribution did not enable it, you will need to recompile the kernel with it +enabled. This will create the LDM partitions on each device at boot time. You +would then use those devices (for /dev/hda they would be /dev/hda1, 2, 3, etc) +in the Device-Mapper table. + +You can also bypass using the LDM driver by using the main device (e.g. +/dev/hda) and then using the offsets of the LDM partitions into this device as +the "Start sector of device" when creating the table. Once again ldminfo would +give you the correct information to do this. + +Assuming you know all your devices and their sizes things are easy. + +For a linear raid the table would look like this (note all values are in +512-byte sectors): + +--- cut here --- +# Offset into Size of this Raid type Device Start sector +# volume device of device +0 1028161 linear /dev/hda1 0 +1028161 3903762 linear /dev/hdb2 0 +4931923 2103211 linear /dev/hdc1 0 +--- cut here --- + +For a striped volume, i.e. raid level 0, you will need to know the chunk size +you used when creating the volume. Windows uses 64kiB as the default, so it +will probably be this unless you changes the defaults when creating the array. + +For a raid level 0 the table would look like this (note all values are in +512-byte sectors): + +--- cut here --- +# Offset Size Raid Number Chunk 1st Start 2nd Start +# into of the type of size Device in Device in +# volume volume stripes device device +0 2056320 striped 2 128 /dev/hda1 0 /dev/hdb1 0 +--- cut here --- + +If there are more than two devices, just add each of them to the end of the +line. + +Finally, for a mirrored volume, i.e. raid level 1, the table would look like +this (note all values are in 512-byte sectors): + +--- cut here --- +# Ofs Size Raid Log Number Region Should Number Source Start Taget Start +# in of the type type of log size sync? of Device in Device in +# vol volume params mirrors Device Device +0 2056320 mirror core 2 16 nosync 2 /dev/hda1 0 /dev/hdb1 0 +--- cut here --- + +If you are mirroring to multiple devices you can specify further targets at the +end of the line. + +Note the "Should sync?" parameter "nosync" means that the two mirrors are +already in sync which will be the case on a clean shutdown of Windows. If the +mirrors are not clean, you can specify the "sync" option instead of "nosync" +and the Device-Mapper driver will then copy the entirey of the "Source Device" +to the "Target Device" or if you specified multipled target devices to all of +them. + +Once you have your table, save it in a file somewhere (e.g. /etc/ntfsvolume1), +and hand it over to dmsetup to work with, like so: + +$ dmsetup create myvolume1 /etc/ntfsvolume1 + +You can obviously replace "myvolume1" with whatever name you like. + +If it all worked, you will now have the device /dev/device-mapper/myvolume1 +which you can then just use as an argument to the mount command as usual to +mount the ntfs volume. For example: + +$ mount -t ntfs -o ro /dev/device-mapper/myvolume1 /mnt/myvol1 + +(You need to create the directory /mnt/myvol1 first and of course you can use +anything you like instead of /mnt/myvol1 as long as it is an existing +directory.) + +It is advisable to do the mount read-only to see if the volume has been setup +correctly to avoid the possibility of causing damage to the data on the ntfs +volume. + + +The Software RAID / MD driver +----------------------------- + +An alternative to using the Device-Mapper driver is to use the kernel's +Software RAID / MD driver. For which you need to set up your /etc/raidtab +appropriately (see man 5 raidtab). + +Linear volume sets, i.e. linear raid, as well as stripe sets, i.e. raid level +0, have been tested and work fine (though see section "Limitiations when using +the MD driver with NTFS volumes" especially if you want to use linear raid). +Even though untested, there is no reason why mirrors, i.e. raid level 1, and +stripes with parity, i.e. raid level 5, should not work, too. + +You have to use the "persistent-superblock 0" option for each raid-disk in the +NTFS volume/stripe you are configuring in /etc/raidtab as the persistent +superblock used by the MD driver would damange the NTFS volume. + +Windows by default uses a stripe chunk size of 64k, so you probably want the +"chunk-size 64k" option for each raid-disk, too. + +For example, if you have a stripe set consisting of two partitions /dev/hda5 +and /dev/hdb1 your /etc/raidtab would look like this: + +raiddev /dev/md0 + raid-level 0 + nr-raid-disks 2 + nr-spare-disks 0 + persistent-superblock 0 + chunk-size 64k + device /dev/hda5 + raid-disk 0 + device /dev/hdb1 + raid-disl 1 + +For linear raid, just change the raid-level above to "raid-level linear", for +mirrors, change it to "raid-level 1", and for stripe sets with parity, change +it to "raid-level 5". + +Note for stripe sets with parity you will also need to tell the MD driver +which parity algorithm to use by specifying the option "parity-algorithm +which", where you need to replace "which" with the name of the algorithm to +use (see man 5 raidtab for available algorithms) and you will have to try the +different available algorithms until you find one that works. Make sure you +are working read-only when playing with this as you may damage your data +otherwise. If you find which algorithm works please let us know (email the +linux-ntfs developers list linux-ntfs-dev@lists.sourceforge.net or drop in on +IRC in channel #ntfs on the irc.freenode.net network) so we can update this +documentation. + +Once the raidtab is setup, run for example raid0run -a to start all devices or +raid0run /dev/md0 to start a particular md device, in this case /dev/md0. + +Then just use the mount command as usual to mount the ntfs volume using for +example: mount -t ntfs -o ro /dev/md0 /mnt/myntfsvolume + +It is advisable to do the mount read-only to see if the md volume has been +setup correctly to avoid the possibility of causing damage to the data on the +ntfs volume. + + +Limitiations when using the Software RAID / MD driver +----------------------------------------------------- + +Using the md driver will not work properly if any of your NTFS partitions have +an odd number of sectors. This is especially important for linear raid as all +data after the first partition with an odd number of sectors will be offset by +one or more sectors so if you mount such a partition with write support you +will cause massive damage to the data on the volume which will only become +apparent when you try to use the volume again under Windows. + +So when using linear raid, make sure that all your partitions have an even +number of sectors BEFORE attempting to use it. You have been warned! + +Even better is to simply use the Device-Mapper for linear raid and then you do +not have this problem with odd numbers of sectors. + + +ChangeLog +========= + +Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog. + +2.1.22: + - Improve handling of ntfs volumes with errors. + - Fix various bugs and race conditions. +2.1.21: + - Fix several race conditions and various other bugs. + - Many internal cleanups, code reorganization, optimizations, and mft + and index record writing code rewritten to fit in with the changes. + - Update Documentation/filesystems/ntfs.txt with instructions on how to + use the Device-Mapper driver with NTFS ftdisk/LDM raid. +2.1.20: + - Fix two stupid bugs introduced in 2.1.18 release. +2.1.19: + - Minor bugfix in handling of the default upcase table. + - Many internal cleanups and improvements. Many thanks to Linus + Torvalds and Al Viro for the help and advice with the sparse + annotations and cleanups. +2.1.18: + - Fix scheduling latencies at mount time. (Ingo Molnar) + - Fix endianness bug in a little traversed portion of the attribute + lookup code. +2.1.17: + - Fix bugs in mount time error code paths. +2.1.16: + - Implement access time updates (including mtime and ctime). + - Implement fsync(2), fdatasync(2), and msync(2) system calls. + - Enable the readv(2) and writev(2) system calls. + - Enable access via the asynchronous io (aio) API by adding support for + the aio_read(3) and aio_write(3) functions. +2.1.15: + - Invalidate quotas when (re)mounting read-write. + NOTE: This now only leave user space journalling on the side. (See + note for version 2.1.13, below.) +2.1.14: + - Fix an NFSd caused deadlock reported by several users. +2.1.13: + - Implement writing of inodes (access time updates are not implemented + yet so mounting with -o noatime,nodiratime is enforced). + - Enable writing out of resident files so you can now overwrite any + uncompressed, unencrypted, nonsparse file as long as you do not + change the file size. + - Add housekeeping of ntfs system files so that ntfsfix no longer needs + to be run after writing to an NTFS volume. + NOTE: This still leaves quota tracking and user space journalling on + the side but they should not cause data corruption. In the worst + case the charged quotas will be out of date ($Quota) and some + userspace applications might get confused due to the out of date + userspace journal ($UsnJrnl). +2.1.12: + - Fix the second fix to the decompression engine from the 2.1.9 release + and some further internals cleanups. +2.1.11: + - Driver internal cleanups. +2.1.10: + - Force read-only (re)mounting of volumes with unsupported volume + flags and various cleanups. +2.1.9: + - Fix two bugs in handling of corner cases in the decompression engine. +2.1.8: + - Read the $MFT mirror and compare it to the $MFT and if the two do not + match, force a read-only mount and do not allow read-write remounts. + - Read and parse the $LogFile journal and if it indicates that the + volume was not shutdown cleanly, force a read-only mount and do not + allow read-write remounts. If the $LogFile indicates a clean + shutdown and a read-write (re)mount is requested, empty $LogFile to + ensure that Windows cannot cause data corruption by replaying a stale + journal after Linux has written to the volume. + - Improve time handling so that the NTFS time is fully preserved when + converted to kernel time and only up to 99 nano-seconds are lost when + kernel time is converted to NTFS time. +2.1.7: + - Enable NFS exporting of mounted NTFS volumes. +2.1.6: + - Fix minor bug in handling of compressed directories that fixes the + erroneous "du" and "stat" output people reported. +2.1.5: + - Minor bug fix in attribute list attribute handling that fixes the + I/O errors on "ls" of certain fragmented files found by at least two + people running Windows XP. +2.1.4: + - Minor update allowing compilation with all gcc versions (well, the + ones the kernel can be compiled with anyway). +2.1.3: + - Major bug fixes for reading files and volumes in corner cases which + were being hit by Windows 2k/XP users. +2.1.2: + - Major bug fixes aleviating the hangs in statfs experienced by some + users. +2.1.1: + - Update handling of compressed files so people no longer get the + frequently reported warning messages about initialized_size != + data_size. +2.1.0: + - Add configuration option for developmental write support. + - Initial implementation of file overwriting. (Writes to resident files + are not written out to disk yet, so avoid writing to files smaller + than about 1kiB.) + - Intercept/abort changes in file size as they are not implemented yet. +2.0.25: + - Minor bugfixes in error code paths and small cleanups. +2.0.24: + - Small internal cleanups. + - Support for sendfile system call. (Christoph Hellwig) +2.0.23: + - Massive internal locking changes to mft record locking. Fixes + various race conditions and deadlocks. + - Fix ntfs over loopback for compressed files by adding an + optimization barrier. (gcc was screwing up otherwise ?) + Thanks go to Christoph Hellwig for pointing these two out: + - Remove now unused function fs/ntfs/malloc.h::vmalloc_nofs(). + - Fix ntfs_free() for ia64 and parisc. +2.0.22: + - Small internal cleanups. +2.0.21: + These only affect 32-bit architectures: + - Check for, and refuse to mount too large volumes (maximum is 2TiB). + - Check for, and refuse to open too large files and directories + (maximum is 16TiB). +2.0.20: + - Support non-resident directory index bitmaps. This means we now cope + with huge directories without problems. + - Fix a page leak that manifested itself in some cases when reading + directory contents. + - Internal cleanups. +2.0.19: + - Fix race condition and improvements in block i/o interface. + - Optimization when reading compressed files. +2.0.18: + - Fix race condition in reading of compressed files. +2.0.17: + - Cleanups and optimizations. +2.0.16: + - Fix stupid bug introduced in 2.0.15 in new attribute inode API. + - Big internal cleanup replacing the mftbmp access hacks by using the + new attribute inode API instead. +2.0.15: + - Bug fix in parsing of remount options. + - Internal changes implementing attribute (fake) inodes allowing all + attribute i/o to go via the page cache and to use all the normal + vfs/mm functionality. +2.0.14: + - Internal changes improving run list merging code and minor locking + change to not rely on BKL in ntfs_statfs(). +2.0.13: + - Internal changes towards using iget5_locked() in preparation for + fake inodes and small cleanups to ntfs_volume structure. +2.0.12: + - Internal cleanups in address space operations made possible by the + changes introduced in the previous release. +2.0.11: + - Internal updates and cleanups introducing the first step towards + fake inode based attribute i/o. +2.0.10: + - Microsoft says that the maximum number of inodes is 2^32 - 1. Update + the driver accordingly to only use 32-bits to store inode numbers on + 32-bit architectures. This improves the speed of the driver a little. +2.0.9: + - Change decompression engine to use a single buffer. This should not + affect performance except perhaps on the most heavy i/o on SMP + systems when accessing multiple compressed files from multiple + devices simultaneously. + - Minor updates and cleanups. +2.0.8: + - Remove now obsolete show_inodes and posix mount option(s). + - Restore show_sys_files mount option. + - Add new mount option case_sensitive, to determine if the driver + treats file names as case sensitive or not. + - Mostly drop support for short file names (for backwards compatibility + we only support accessing files via their short file name if one + exists). + - Fix dcache aliasing issues wrt short/long file names. + - Cleanups and minor fixes. +2.0.7: + - Just cleanups. +2.0.6: + - Major bugfix to make compatible with other kernel changes. This fixes + the hangs/oopses on umount. + - Locking cleanup in directory operations (remove BKL usage). +2.0.5: + - Major buffer overflow bug fix. + - Minor cleanups and updates for kernel 2.5.12. +2.0.4: + - Cleanups and updates for kernel 2.5.11. +2.0.3: + - Small bug fixes, cleanups, and performance improvements. +2.0.2: + - Use default fmask of 0177 so that files are no executable by default. + If you want owner executable files, just use fmask=0077. + - Update for kernel 2.5.9 but preserve backwards compatibility with + kernel 2.5.7. + - Minor bug fixes, cleanups, and updates. +2.0.1: + - Minor updates, primarily set the executable bit by default on files + so they can be executed. +2.0.0: + - Started ChangeLog. + diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting new file mode 100644 index 000000000000..2f388460cbe7 --- /dev/null +++ b/Documentation/filesystems/porting @@ -0,0 +1,266 @@ +Changes since 2.5.0: + +--- +[recommended] + +New helpers: sb_bread(), sb_getblk(), sb_find_get_block(), set_bh(), + sb_set_blocksize() and sb_min_blocksize(). + +Use them. + +(sb_find_get_block() replaces 2.4's get_hash_table()) + +--- +[recommended] + +New methods: ->alloc_inode() and ->destroy_inode(). + +Remove inode->u.foo_inode_i +Declare + struct foo_inode_info { + /* fs-private stuff */ + struct inode vfs_inode; + }; + static inline struct foo_inode_info *FOO_I(struct inode *inode) + { + return list_entry(inode, struct foo_inode_info, vfs_inode); + } + +Use FOO_I(inode) instead of &inode->u.foo_inode_i; + +Add foo_alloc_inode() and foo_destory_inode() - the former should allocate +foo_inode_info and return the address of ->vfs_inode, the latter should free +FOO_I(inode) (see in-tree filesystems for examples). + +Make them ->alloc_inode and ->destroy_inode in your super_operations. + +Keep in mind that now you need explicit initialization of private data - +typically in ->read_inode() and after getting an inode from new_inode(). + +At some point that will become mandatory. + +--- +[mandatory] + +Change of file_system_type method (->read_super to ->get_sb) + +->read_super() is no more. Ditto for DECLARE_FSTYPE and DECLARE_FSTYPE_DEV. + +Turn your foo_read_super() into a function that would return 0 in case of +success and negative number in case of error (-EINVAL unless you have more +informative error value to report). Call it foo_fill_super(). Now declare + +struct super_block foo_get_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *data) +{ + return get_sb_bdev(fs_type, flags, dev_name, data, ext2_fill_super); +} + +(or similar with s/bdev/nodev/ or s/bdev/single/, depending on the kind of +filesystem). + +Replace DECLARE_FSTYPE... with explicit initializer and have ->get_sb set as +foo_get_sb. + +--- +[mandatory] + +Locking change: ->s_vfs_rename_sem is taken only by cross-directory renames. +Most likely there is no need to change anything, but if you relied on +global exclusion between renames for some internal purpose - you need to +change your internal locking. Otherwise exclusion warranties remain the +same (i.e. parents and victim are locked, etc.). + +--- +[informational] + +Now we have the exclusion between ->lookup() and directory removal (by +->rmdir() and ->rename()). If you used to need that exclusion and do +it by internal locking (most of filesystems couldn't care less) - you +can relax your locking. + +--- +[mandatory] + +->lookup(), ->truncate(), ->create(), ->unlink(), ->mknod(), ->mkdir(), +->rmdir(), ->link(), ->lseek(), ->symlink(), ->rename() +and ->readdir() are called without BKL now. Grab it on entry, drop upon return +- that will guarantee the same locking you used to have. If your method or its +parts do not need BKL - better yet, now you can shift lock_kernel() and +unlock_kernel() so that they would protect exactly what needs to be +protected. + +--- +[mandatory] + +BKL is also moved from around sb operations. ->write_super() Is now called +without BKL held. BKL should have been shifted into individual fs sb_op +functions. If you don't need it, remove it. + +--- +[informational] + +check for ->link() target not being a directory is done by callers. Feel +free to drop it... + +--- +[informational] + +->link() callers hold ->i_sem on the object we are linking to. Some of your +problems might be over... + +--- +[mandatory] + +new file_system_type method - kill_sb(superblock). If you are converting +an existing filesystem, set it according to ->fs_flags: + FS_REQUIRES_DEV - kill_block_super + FS_LITTER - kill_litter_super + neither - kill_anon_super +FS_LITTER is gone - just remove it from fs_flags. + +--- +[mandatory] + + FS_SINGLE is gone (actually, that had happened back when ->get_sb() +went in - and hadn't been documented ;-/). Just remove it from fs_flags +(and see ->get_sb() entry for other actions). + +--- +[mandatory] + +->setattr() is called without BKL now. Caller _always_ holds ->i_sem, so +watch for ->i_sem-grabbing code that might be used by your ->setattr(). +Callers of notify_change() need ->i_sem now. + +--- +[recommended] + +New super_block field "struct export_operations *s_export_op" for +explicit support for exporting, e.g. via NFS. The structure is fully +documented at its declaration in include/linux/fs.h, and in +Documentation/filesystems/Exporting. + +Briefly it allows for the definition of decode_fh and encode_fh operations +to encode and decode filehandles, and allows the filesystem to use +a standard helper function for decode_fh, and provide file-system specific +support for this helper, particularly get_parent. + +It is planned that this will be required for exporting once the code +settles down a bit. + +[mandatory] + +s_export_op is now required for exporting a filesystem. +isofs, ext2, ext3, resierfs, fat +can be used as examples of very different filesystems. + +--- +[mandatory] + +iget4() and the read_inode2 callback have been superseded by iget5_locked() +which has the following prototype, + + struct inode *iget5_locked(struct super_block *sb, unsigned long ino, + int (*test)(struct inode *, void *), + int (*set)(struct inode *, void *), + void *data); + +'test' is an additional function that can be used when the inode +number is not sufficient to identify the actual file object. 'set' +should be a non-blocking function that initializes those parts of a +newly created inode to allow the test function to succeed. 'data' is +passed as an opaque value to both test and set functions. + +When the inode has been created by iget5_locked(), it will be returned with +the I_NEW flag set and will still be locked. read_inode has not been +called so the file system still has to finalize the initialization. Once +the inode is initialized it must be unlocked by calling unlock_new_inode(). + +The filesystem is responsible for setting (and possibly testing) i_ino +when appropriate. There is also a simpler iget_locked function that +just takes the superblock and inode number as arguments and does the +test and set for you. + +e.g. + inode = iget_locked(sb, ino); + if (inode->i_state & I_NEW) { + read_inode_from_disk(inode); + unlock_new_inode(inode); + } + +--- +[recommended] + +->getattr() finally getting used. See instances in nfs, minix, etc. + +--- +[mandatory] + +->revalidate() is gone. If your filesystem had it - provide ->getattr() +and let it call whatever you had as ->revlidate() + (for symlinks that +had ->revalidate()) add calls in ->follow_link()/->readlink(). + +--- +[mandatory] + +->d_parent changes are not protected by BKL anymore. Read access is safe +if at least one of the following is true: + * filesystem has no cross-directory rename() + * dcache_lock is held + * we know that parent had been locked (e.g. we are looking at +->d_parent of ->lookup() argument). + * we are called from ->rename(). + * the child's ->d_lock is held +Audit your code and add locking if needed. Notice that any place that is +not protected by the conditions above is risky even in the old tree - you +had been relying on BKL and that's prone to screwups. Old tree had quite +a few holes of that kind - unprotected access to ->d_parent leading to +anything from oops to silent memory corruption. + +--- +[mandatory] + + FS_NOMOUNT is gone. If you use it - just set MS_NOUSER in flags +(see rootfs for one kind of solution and bdev/socket/pipe for another). + +--- +[recommended] + + Use bdev_read_only(bdev) instead of is_read_only(kdev). The latter +is still alive, but only because of the mess in drivers/s390/block/dasd.c. +As soon as it gets fixed is_read_only() will die. + +--- +[mandatory] + +->permission() is called without BKL now. Grab it on entry, drop upon +return - that will guarantee the same locking you used to have. If +your method or its parts do not need BKL - better yet, now you can +shift lock_kernel() and unlock_kernel() so that they would protect +exactly what needs to be protected. + +--- +[mandatory] + +->statfs() is now called without BKL held. BKL should have been +shifted into individual fs sb_op functions where it's not clear that +it's safe to remove it. If you don't need it, remove it. + +--- +[mandatory] + + is_read_only() is gone; use bdev_read_only() instead. + +--- +[mandatory] + + destroy_buffers() is gone; use invalidate_bdev(). + +--- +[mandatory] + + fsync_dev() is gone; use fsync_bdev(). NOTE: lvm breakage is +deliberate; as soon as struct block_device * is propagated in a reasonable +way by that code fixing will become trivial; until then nothing can be +done. diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt new file mode 100644 index 000000000000..cbe85c17176b --- /dev/null +++ b/Documentation/filesystems/proc.txt @@ -0,0 +1,1940 @@ +------------------------------------------------------------------------------ + T H E /proc F I L E S Y S T E M +------------------------------------------------------------------------------ +/proc/sys Terrehon Bowden <terrehon@pacbell.net> October 7 1999 + Bodo Bauer <bb@ricochet.net> + +2.4.x update Jorge Nerin <comandante@zaralinux.com> November 14 2000 +------------------------------------------------------------------------------ +Version 1.3 Kernel version 2.2.12 + Kernel version 2.4.0-test11-pre4 +------------------------------------------------------------------------------ + +Table of Contents +----------------- + + 0 Preface + 0.1 Introduction/Credits + 0.2 Legal Stuff + + 1 Collecting System Information + 1.1 Process-Specific Subdirectories + 1.2 Kernel data + 1.3 IDE devices in /proc/ide + 1.4 Networking info in /proc/net + 1.5 SCSI info + 1.6 Parallel port info in /proc/parport + 1.7 TTY info in /proc/tty + 1.8 Miscellaneous kernel statistics in /proc/stat + + 2 Modifying System Parameters + 2.1 /proc/sys/fs - File system data + 2.2 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats + 2.3 /proc/sys/kernel - general kernel parameters + 2.4 /proc/sys/vm - The virtual memory subsystem + 2.5 /proc/sys/dev - Device specific parameters + 2.6 /proc/sys/sunrpc - Remote procedure calls + 2.7 /proc/sys/net - Networking stuff + 2.8 /proc/sys/net/ipv4 - IPV4 settings + 2.9 Appletalk + 2.10 IPX + 2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem + +------------------------------------------------------------------------------ +Preface +------------------------------------------------------------------------------ + +0.1 Introduction/Credits +------------------------ + +This documentation is part of a soon (or so we hope) to be released book on +the SuSE Linux distribution. As there is no complete documentation for the +/proc file system and we've used many freely available sources to write these +chapters, it seems only fair to give the work back to the Linux community. +This work is based on the 2.2.* kernel version and the upcoming 2.4.*. I'm +afraid it's still far from complete, but we hope it will be useful. As far as +we know, it is the first 'all-in-one' document about the /proc file system. It +is focused on the Intel x86 hardware, so if you are looking for PPC, ARM, +SPARC, AXP, etc., features, you probably won't find what you are looking for. +It also only covers IPv4 networking, not IPv6 nor other protocols - sorry. But +additions and patches are welcome and will be added to this document if you +mail them to Bodo. + +We'd like to thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of +other people for help compiling this documentation. We'd also like to extend a +special thank you to Andi Kleen for documentation, which we relied on heavily +to create this document, as well as the additional information he provided. +Thanks to everybody else who contributed source or docs to the Linux kernel +and helped create a great piece of software... :) + +If you have any comments, corrections or additions, please don't hesitate to +contact Bodo Bauer at bb@ricochet.net. We'll be happy to add them to this +document. + +The latest version of this document is available online at +http://skaro.nightcrawler.com/~bb/Docs/Proc as HTML version. + +If the above direction does not works for you, ypu could try the kernel +mailing list at linux-kernel@vger.kernel.org and/or try to reach me at +comandante@zaralinux.com. + +0.2 Legal Stuff +--------------- + +We don't guarantee the correctness of this document, and if you come to us +complaining about how you screwed up your system because of incorrect +documentation, we won't feel responsible... + +------------------------------------------------------------------------------ +CHAPTER 1: COLLECTING SYSTEM INFORMATION +------------------------------------------------------------------------------ + +------------------------------------------------------------------------------ +In This Chapter +------------------------------------------------------------------------------ +* Investigating the properties of the pseudo file system /proc and its + ability to provide information on the running Linux system +* Examining /proc's structure +* Uncovering various information about the kernel and the processes running + on the system +------------------------------------------------------------------------------ + + +The proc file system acts as an interface to internal data structures in the +kernel. It can be used to obtain information about the system and to change +certain kernel parameters at runtime (sysctl). + +First, we'll take a look at the read-only parts of /proc. In Chapter 2, we +show you how you can use /proc/sys to change settings. + +1.1 Process-Specific Subdirectories +----------------------------------- + +The directory /proc contains (among other things) one subdirectory for each +process running on the system, which is named after the process ID (PID). + +The link self points to the process reading the file system. Each process +subdirectory has the entries listed in Table 1-1. + + +Table 1-1: Process specific entries in /proc +.............................................................................. + File Content + cmdline Command line arguments + cpu Current and last cpu in wich it was executed (2.4)(smp) + cwd Link to the current working directory + environ Values of environment variables + exe Link to the executable of this process + fd Directory, which contains all file descriptors + maps Memory maps to executables and library files (2.4) + mem Memory held by this process + root Link to the root directory of this process + stat Process status + statm Process memory status information + status Process status in human readable form + wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan +.............................................................................. + +For example, to get the status information of a process, all you have to do is +read the file /proc/PID/status: + + >cat /proc/self/status + Name: cat + State: R (running) + Pid: 5452 + PPid: 743 + TracerPid: 0 (2.4) + Uid: 501 501 501 501 + Gid: 100 100 100 100 + Groups: 100 14 16 + VmSize: 1112 kB + VmLck: 0 kB + VmRSS: 348 kB + VmData: 24 kB + VmStk: 12 kB + VmExe: 8 kB + VmLib: 1044 kB + SigPnd: 0000000000000000 + SigBlk: 0000000000000000 + SigIgn: 0000000000000000 + SigCgt: 0000000000000000 + CapInh: 00000000fffffeff + CapPrm: 0000000000000000 + CapEff: 0000000000000000 + + +This shows you nearly the same information you would get if you viewed it with +the ps command. In fact, ps uses the proc file system to obtain its +information. The statm file contains more detailed information about the +process memory usage. Its seven fields are explained in Table 1-2. + + +Table 1-2: Contents of the statm files (as of 2.6.8-rc3) +.............................................................................. + Field Content + size total program size (pages) (same as VmSize in status) + resident size of memory portions (pages) (same as VmRSS in status) + shared number of pages that are shared (i.e. backed by a file) + trs number of pages that are 'code' (not including libs; broken, + includes data segment) + lrs number of pages of library (always 0 on 2.6) + drs number of pages of data/stack (including libs; broken, + includes library text) + dt number of dirty pages (always 0 on 2.6) +.............................................................................. + +1.2 Kernel data +--------------- + +Similar to the process entries, the kernel data files give information about +the running kernel. The files used to obtain this information are contained in +/proc and are listed in Table 1-3. Not all of these will be present in your +system. It depends on the kernel configuration and the loaded modules, which +files are there, and which are missing. + +Table 1-3: Kernel info in /proc +.............................................................................. + File Content + apm Advanced power management info + buddyinfo Kernel memory allocator information (see text) (2.5) + bus Directory containing bus specific information + cmdline Kernel command line + cpuinfo Info about the CPU + devices Available devices (block and character) + dma Used DMS channels + filesystems Supported filesystems + driver Various drivers grouped here, currently rtc (2.4) + execdomains Execdomains, related to security (2.4) + fb Frame Buffer devices (2.4) + fs File system parameters, currently nfs/exports (2.4) + ide Directory containing info about the IDE subsystem + interrupts Interrupt usage + iomem Memory map (2.4) + ioports I/O port usage + irq Masks for irq to cpu affinity (2.4)(smp?) + isapnp ISA PnP (Plug&Play) Info (2.4) + kcore Kernel core image (can be ELF or A.OUT(deprecated in 2.4)) + kmsg Kernel messages + ksyms Kernel symbol table + loadavg Load average of last 1, 5 & 15 minutes + locks Kernel locks + meminfo Memory info + misc Miscellaneous + modules List of loaded modules + mounts Mounted filesystems + net Networking info (see text) + partitions Table of partitions known to the system + pci Depreciated info of PCI bus (new way -> /proc/bus/pci/, + decoupled by lspci (2.4) + rtc Real time clock + scsi SCSI info (see text) + slabinfo Slab pool info + stat Overall statistics + swaps Swap space utilization + sys See chapter 2 + sysvipc Info of SysVIPC Resources (msg, sem, shm) (2.4) + tty Info of tty drivers + uptime System uptime + version Kernel version + video bttv info of video resources (2.4) +.............................................................................. + +You can, for example, check which interrupts are currently in use and what +they are used for by looking in the file /proc/interrupts: + + > cat /proc/interrupts + CPU0 + 0: 8728810 XT-PIC timer + 1: 895 XT-PIC keyboard + 2: 0 XT-PIC cascade + 3: 531695 XT-PIC aha152x + 4: 2014133 XT-PIC serial + 5: 44401 XT-PIC pcnet_cs + 8: 2 XT-PIC rtc + 11: 8 XT-PIC i82365 + 12: 182918 XT-PIC PS/2 Mouse + 13: 1 XT-PIC fpu + 14: 1232265 XT-PIC ide0 + 15: 7 XT-PIC ide1 + NMI: 0 + +In 2.4.* a couple of lines where added to this file LOC & ERR (this time is the +output of a SMP machine): + + > cat /proc/interrupts + + CPU0 CPU1 + 0: 1243498 1214548 IO-APIC-edge timer + 1: 8949 8958 IO-APIC-edge keyboard + 2: 0 0 XT-PIC cascade + 5: 11286 10161 IO-APIC-edge soundblaster + 8: 1 0 IO-APIC-edge rtc + 9: 27422 27407 IO-APIC-edge 3c503 + 12: 113645 113873 IO-APIC-edge PS/2 Mouse + 13: 0 0 XT-PIC fpu + 14: 22491 24012 IO-APIC-edge ide0 + 15: 2183 2415 IO-APIC-edge ide1 + 17: 30564 30414 IO-APIC-level eth0 + 18: 177 164 IO-APIC-level bttv + NMI: 2457961 2457959 + LOC: 2457882 2457881 + ERR: 2155 + +NMI is incremented in this case because every timer interrupt generates a NMI +(Non Maskable Interrupt) which is used by the NMI Watchdog to detect lockups. + +LOC is the local interrupt counter of the internal APIC of every CPU. + +ERR is incremented in the case of errors in the IO-APIC bus (the bus that +connects the CPUs in a SMP system. This means that an error has been detected, +the IO-APIC automatically retry the transmission, so it should not be a big +problem, but you should read the SMP-FAQ. + +In this context it could be interesting to note the new irq directory in 2.4. +It could be used to set IRQ to CPU affinity, this means that you can "hook" an +IRQ to only one CPU, or to exclude a CPU of handling IRQs. The contents of the +irq subdir is one subdir for each IRQ, and one file; prof_cpu_mask + +For example + > ls /proc/irq/ + 0 10 12 14 16 18 2 4 6 8 prof_cpu_mask + 1 11 13 15 17 19 3 5 7 9 + > ls /proc/irq/0/ + smp_affinity + +The contents of the prof_cpu_mask file and each smp_affinity file for each IRQ +is the same by default: + + > cat /proc/irq/0/smp_affinity + ffffffff + +It's a bitmask, in wich you can specify wich CPUs can handle the IRQ, you can +set it by doing: + + > echo 1 > /proc/irq/prof_cpu_mask + +This means that only the first CPU will handle the IRQ, but you can also echo 5 +wich means that only the first and fourth CPU can handle the IRQ. + +The way IRQs are routed is handled by the IO-APIC, and it's Round Robin +between all the CPUs which are allowed to handle it. As usual the kernel has +more info than you and does a better job than you, so the defaults are the +best choice for almost everyone. + +There are three more important subdirectories in /proc: net, scsi, and sys. +The general rule is that the contents, or even the existence of these +directories, depend on your kernel configuration. If SCSI is not enabled, the +directory scsi may not exist. The same is true with the net, which is there +only when networking support is present in the running kernel. + +The slabinfo file gives information about memory usage at the slab level. +Linux uses slab pools for memory management above page level in version 2.2. +Commonly used objects have their own slab pool (such as network buffers, +directory cache, and so on). + +.............................................................................. + +> cat /proc/buddyinfo + +Node 0, zone DMA 0 4 5 4 4 3 ... +Node 0, zone Normal 1 0 0 1 101 8 ... +Node 0, zone HighMem 2 0 0 1 1 0 ... + +Memory fragmentation is a problem under some workloads, and buddyinfo is a +useful tool for helping diagnose these problems. Buddyinfo will give you a +clue as to how big an area you can safely allocate, or why a previous +allocation failed. + +Each column represents the number of pages of a certain order which are +available. In this case, there are 0 chunks of 2^0*PAGE_SIZE available in +ZONE_DMA, 4 chunks of 2^1*PAGE_SIZE in ZONE_DMA, 101 chunks of 2^4*PAGE_SIZE +available in ZONE_NORMAL, etc... + +.............................................................................. + +meminfo: + +Provides information about distribution and utilization of memory. This +varies by architecture and compile options. The following is from a +16GB PIII, which has highmem enabled. You may not have all of these fields. + +> cat /proc/meminfo + + +MemTotal: 16344972 kB +MemFree: 13634064 kB +Buffers: 3656 kB +Cached: 1195708 kB +SwapCached: 0 kB +Active: 891636 kB +Inactive: 1077224 kB +HighTotal: 15597528 kB +HighFree: 13629632 kB +LowTotal: 747444 kB +LowFree: 4432 kB +SwapTotal: 0 kB +SwapFree: 0 kB +Dirty: 968 kB +Writeback: 0 kB +Mapped: 280372 kB +Slab: 684068 kB +CommitLimit: 7669796 kB +Committed_AS: 100056 kB +PageTables: 24448 kB +VmallocTotal: 112216 kB +VmallocUsed: 428 kB +VmallocChunk: 111088 kB + + MemTotal: Total usable ram (i.e. physical ram minus a few reserved + bits and the kernel binary code) + MemFree: The sum of LowFree+HighFree + Buffers: Relatively temporary storage for raw disk blocks + shouldn't get tremendously large (20MB or so) + Cached: in-memory cache for files read from the disk (the + pagecache). Doesn't include SwapCached + SwapCached: Memory that once was swapped out, is swapped back in but + still also is in the swapfile (if memory is needed it + doesn't need to be swapped out AGAIN because it is already + in the swapfile. This saves I/O) + Active: Memory that has been used more recently and usually not + reclaimed unless absolutely necessary. + Inactive: Memory which has been less recently used. It is more + eligible to be reclaimed for other purposes + HighTotal: + HighFree: Highmem is all memory above ~860MB of physical memory + Highmem areas are for use by userspace programs, or + for the pagecache. The kernel must use tricks to access + this memory, making it slower to access than lowmem. + LowTotal: + LowFree: Lowmem is memory which can be used for everything that + highmem can be used for, but it is also availble for the + kernel's use for its own data structures. Among many + other things, it is where everything from the Slab is + allocated. Bad things happen when you're out of lowmem. + SwapTotal: total amount of swap space available + SwapFree: Memory which has been evicted from RAM, and is temporarily + on the disk + Dirty: Memory which is waiting to get written back to the disk + Writeback: Memory which is actively being written back to the disk + Mapped: files which have been mmaped, such as libraries + Slab: in-kernel data structures cache + CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'), + this is the total amount of memory currently available to + be allocated on the system. This limit is only adhered to + if strict overcommit accounting is enabled (mode 2 in + 'vm.overcommit_memory'). + The CommitLimit is calculated with the following formula: + CommitLimit = ('vm.overcommit_ratio' * Physical RAM) + Swap + For example, on a system with 1G of physical RAM and 7G + of swap with a `vm.overcommit_ratio` of 30 it would + yield a CommitLimit of 7.3G. + For more details, see the memory overcommit documentation + in vm/overcommit-accounting. +Committed_AS: The amount of memory presently allocated on the system. + The committed memory is a sum of all of the memory which + has been allocated by processes, even if it has not been + "used" by them as of yet. A process which malloc()'s 1G + of memory, but only touches 300M of it will only show up + as using 300M of memory even if it has the address space + allocated for the entire 1G. This 1G is memory which has + been "committed" to by the VM and can be used at any time + by the allocating application. With strict overcommit + enabled on the system (mode 2 in 'vm.overcommit_memory'), + allocations which would exceed the CommitLimit (detailed + above) will not be permitted. This is useful if one needs + to guarantee that processes will not fail due to lack of + memory once that memory has been successfully allocated. + PageTables: amount of memory dedicated to the lowest level of page + tables. +VmallocTotal: total size of vmalloc memory area + VmallocUsed: amount of vmalloc area which is used +VmallocChunk: largest contigious block of vmalloc area which is free + + +1.3 IDE devices in /proc/ide +---------------------------- + +The subdirectory /proc/ide contains information about all IDE devices of which +the kernel is aware. There is one subdirectory for each IDE controller, the +file drivers and a link for each IDE device, pointing to the device directory +in the controller specific subtree. + +The file drivers contains general information about the drivers used for the +IDE devices: + + > cat /proc/ide/drivers + ide-cdrom version 4.53 + ide-disk version 1.08 + +More detailed information can be found in the controller specific +subdirectories. These are named ide0, ide1 and so on. Each of these +directories contains the files shown in table 1-4. + + +Table 1-4: IDE controller info in /proc/ide/ide? +.............................................................................. + File Content + channel IDE channel (0 or 1) + config Configuration (only for PCI/IDE bridge) + mate Mate name + model Type/Chipset of IDE controller +.............................................................................. + +Each device connected to a controller has a separate subdirectory in the +controllers directory. The files listed in table 1-5 are contained in these +directories. + + +Table 1-5: IDE device information +.............................................................................. + File Content + cache The cache + capacity Capacity of the medium (in 512Byte blocks) + driver driver and version + geometry physical and logical geometry + identify device identify block + media media type + model device identifier + settings device setup + smart_thresholds IDE disk management thresholds + smart_values IDE disk management values +.............................................................................. + +The most interesting file is settings. This file contains a nice overview of +the drive parameters: + + # cat /proc/ide/ide0/hda/settings + name value min max mode + ---- ----- --- --- ---- + bios_cyl 526 0 65535 rw + bios_head 255 0 255 rw + bios_sect 63 0 63 rw + breada_readahead 4 0 127 rw + bswap 0 0 1 r + file_readahead 72 0 2097151 rw + io_32bit 0 0 3 rw + keepsettings 0 0 1 rw + max_kb_per_request 122 1 127 rw + multcount 0 0 8 rw + nice1 1 0 1 rw + nowerr 0 0 1 rw + pio_mode write-only 0 255 w + slow 0 0 1 rw + unmaskirq 0 0 1 rw + using_dma 0 0 1 rw + + +1.4 Networking info in /proc/net +-------------------------------- + +The subdirectory /proc/net follows the usual pattern. Table 1-6 shows the +additional values you get for IP version 6 if you configure the kernel to +support this. Table 1-7 lists the files and their meaning. + + +Table 1-6: IPv6 info in /proc/net +.............................................................................. + File Content + udp6 UDP sockets (IPv6) + tcp6 TCP sockets (IPv6) + raw6 Raw device statistics (IPv6) + igmp6 IP multicast addresses, which this host joined (IPv6) + if_inet6 List of IPv6 interface addresses + ipv6_route Kernel routing table for IPv6 + rt6_stats Global IPv6 routing tables statistics + sockstat6 Socket statistics (IPv6) + snmp6 Snmp data (IPv6) +.............................................................................. + + +Table 1-7: Network info in /proc/net +.............................................................................. + File Content + arp Kernel ARP table + dev network devices with statistics + dev_mcast the Layer2 multicast groups a device is listening too + (interface index, label, number of references, number of bound + addresses). + dev_stat network device status + ip_fwchains Firewall chain linkage + ip_fwnames Firewall chain names + ip_masq Directory containing the masquerading tables + ip_masquerade Major masquerading table + netstat Network statistics + raw raw device statistics + route Kernel routing table + rpc Directory containing rpc info + rt_cache Routing cache + snmp SNMP data + sockstat Socket statistics + tcp TCP sockets + tr_rif Token ring RIF routing table + udp UDP sockets + unix UNIX domain sockets + wireless Wireless interface data (Wavelan etc) + igmp IP multicast addresses, which this host joined + psched Global packet scheduler parameters. + netlink List of PF_NETLINK sockets + ip_mr_vifs List of multicast virtual interfaces + ip_mr_cache List of multicast routing cache +.............................................................................. + +You can use this information to see which network devices are available in +your system and how much traffic was routed over those devices: + + > cat /proc/net/dev + Inter-|Receive |[... + face |bytes packets errs drop fifo frame compressed multicast|[... + lo: 908188 5596 0 0 0 0 0 0 [... + ppp0:15475140 20721 410 0 0 410 0 0 [... + eth0: 614530 7085 0 0 0 0 0 1 [... + + ...] Transmit + ...] bytes packets errs drop fifo colls carrier compressed + ...] 908188 5596 0 0 0 0 0 0 + ...] 1375103 17405 0 0 0 0 0 0 + ...] 1703981 5535 0 0 0 3 0 0 + +In addition, each Channel Bond interface has it's own directory. For +example, the bond0 device will have a directory called /proc/net/bond0/. +It will contain information that is specific to that bond, such as the +current slaves of the bond, the link status of the slaves, and how +many times the slaves link has failed. + +1.5 SCSI info +------------- + +If you have a SCSI host adapter in your system, you'll find a subdirectory +named after the driver for this adapter in /proc/scsi. You'll also see a list +of all recognized SCSI devices in /proc/scsi: + + >cat /proc/scsi/scsi + Attached devices: + Host: scsi0 Channel: 00 Id: 00 Lun: 00 + Vendor: IBM Model: DGHS09U Rev: 03E0 + Type: Direct-Access ANSI SCSI revision: 03 + Host: scsi0 Channel: 00 Id: 06 Lun: 00 + Vendor: PIONEER Model: CD-ROM DR-U06S Rev: 1.04 + Type: CD-ROM ANSI SCSI revision: 02 + + +The directory named after the driver has one file for each adapter found in +the system. These files contain information about the controller, including +the used IRQ and the IO address range. The amount of information shown is +dependent on the adapter you use. The example shows the output for an Adaptec +AHA-2940 SCSI adapter: + + > cat /proc/scsi/aic7xxx/0 + + Adaptec AIC7xxx driver version: 5.1.19/3.2.4 + Compile Options: + TCQ Enabled By Default : Disabled + AIC7XXX_PROC_STATS : Disabled + AIC7XXX_RESET_DELAY : 5 + Adapter Configuration: + SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter + Ultra Wide Controller + PCI MMAPed I/O Base: 0xeb001000 + Adapter SEEPROM Config: SEEPROM found and used. + Adaptec SCSI BIOS: Enabled + IRQ: 10 + SCBs: Active 0, Max Active 2, + Allocated 15, HW 16, Page 255 + Interrupts: 160328 + BIOS Control Word: 0x18b6 + Adapter Control Word: 0x005b + Extended Translation: Enabled + Disconnect Enable Flags: 0xffff + Ultra Enable Flags: 0x0001 + Tag Queue Enable Flags: 0x0000 + Ordered Queue Tag Flags: 0x0000 + Default Tag Queue Depth: 8 + Tagged Queue By Device array for aic7xxx host instance 0: + {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255} + Actual queue depth per device for aic7xxx host instance 0: + {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1} + Statistics: + (scsi0:0:0:0) + Device using Wide/Sync transfers at 40.0 MByte/sec, offset 8 + Transinfo settings: current(12/8/1/0), goal(12/8/1/0), user(12/15/1/0) + Total transfers 160151 (74577 reads and 85574 writes) + (scsi0:0:6:0) + Device using Narrow/Sync transfers at 5.0 MByte/sec, offset 15 + Transinfo settings: current(50/15/0/0), goal(50/15/0/0), user(50/15/0/0) + Total transfers 0 (0 reads and 0 writes) + + +1.6 Parallel port info in /proc/parport +--------------------------------------- + +The directory /proc/parport contains information about the parallel ports of +your system. It has one subdirectory for each port, named after the port +number (0,1,2,...). + +These directories contain the four files shown in Table 1-8. + + +Table 1-8: Files in /proc/parport +.............................................................................. + File Content + autoprobe Any IEEE-1284 device ID information that has been acquired. + devices list of the device drivers using that port. A + will appear by the + name of the device currently using the port (it might not appear + against any). + hardware Parallel port's base address, IRQ line and DMA channel. + irq IRQ that parport is using for that port. This is in a separate + file to allow you to alter it by writing a new value in (IRQ + number or none). +.............................................................................. + +1.7 TTY info in /proc/tty +------------------------- + +Information about the available and actually used tty's can be found in the +directory /proc/tty.You'll find entries for drivers and line disciplines in +this directory, as shown in Table 1-9. + + +Table 1-9: Files in /proc/tty +.............................................................................. + File Content + drivers list of drivers and their usage + ldiscs registered line disciplines + driver/serial usage statistic and status of single tty lines +.............................................................................. + +To see which tty's are currently in use, you can simply look into the file +/proc/tty/drivers: + + > cat /proc/tty/drivers + pty_slave /dev/pts 136 0-255 pty:slave + pty_master /dev/ptm 128 0-255 pty:master + pty_slave /dev/ttyp 3 0-255 pty:slave + pty_master /dev/pty 2 0-255 pty:master + serial /dev/cua 5 64-67 serial:callout + serial /dev/ttyS 4 64-67 serial + /dev/tty0 /dev/tty0 4 0 system:vtmaster + /dev/ptmx /dev/ptmx 5 2 system + /dev/console /dev/console 5 1 system:console + /dev/tty /dev/tty 5 0 system:/dev/tty + unknown /dev/tty 4 1-63 console + + +1.8 Miscellaneous kernel statistics in /proc/stat +------------------------------------------------- + +Various pieces of information about kernel activity are available in the +/proc/stat file. All of the numbers reported in this file are aggregates +since the system first booted. For a quick look, simply cat the file: + + > cat /proc/stat + cpu 2255 34 2290 22625563 6290 127 456 + cpu0 1132 34 1441 11311718 3675 127 438 + cpu1 1123 0 849 11313845 2614 0 18 + intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] + ctxt 1990473 + btime 1062191376 + processes 2915 + procs_running 1 + procs_blocked 0 + +The very first "cpu" line aggregates the numbers in all of the other "cpuN" +lines. These numbers identify the amount of time the CPU has spent performing +different kinds of work. Time units are in USER_HZ (typically hundredths of a +second). The meanings of the columns are as follows, from left to right: + +- user: normal processes executing in user mode +- nice: niced processes executing in user mode +- system: processes executing in kernel mode +- idle: twiddling thumbs +- iowait: waiting for I/O to complete +- irq: servicing interrupts +- softirq: servicing softirqs + +The "intr" line gives counts of interrupts serviced since boot time, for each +of the possible system interrupts. The first column is the total of all +interrupts serviced; each subsequent column is the total for that particular +interrupt. + +The "ctxt" line gives the total number of context switches across all CPUs. + +The "btime" line gives the time at which the system booted, in seconds since +the Unix epoch. + +The "processes" line gives the number of processes and threads created, which +includes (but is not limited to) those created by calls to the fork() and +clone() system calls. + +The "procs_running" line gives the number of processes currently running on +CPUs. + +The "procs_blocked" line gives the number of processes currently blocked, +waiting for I/O to complete. + + +------------------------------------------------------------------------------ +Summary +------------------------------------------------------------------------------ +The /proc file system serves information about the running system. It not only +allows access to process data but also allows you to request the kernel status +by reading files in the hierarchy. + +The directory structure of /proc reflects the types of information and makes +it easy, if not obvious, where to look for specific data. +------------------------------------------------------------------------------ + +------------------------------------------------------------------------------ +CHAPTER 2: MODIFYING SYSTEM PARAMETERS +------------------------------------------------------------------------------ + +------------------------------------------------------------------------------ +In This Chapter +------------------------------------------------------------------------------ +* Modifying kernel parameters by writing into files found in /proc/sys +* Exploring the files which modify certain parameters +* Review of the /proc/sys file tree +------------------------------------------------------------------------------ + + +A very interesting part of /proc is the directory /proc/sys. This is not only +a source of information, it also allows you to change parameters within the +kernel. Be very careful when attempting this. You can optimize your system, +but you can also cause it to crash. Never alter kernel parameters on a +production system. Set up a development machine and test to make sure that +everything works the way you want it to. You may have no alternative but to +reboot the machine once an error has been made. + +To change a value, simply echo the new value into the file. An example is +given below in the section on the file system data. You need to be root to do +this. You can create your own boot script to perform this every time your +system boots. + +The files in /proc/sys can be used to fine tune and monitor miscellaneous and +general things in the operation of the Linux kernel. Since some of the files +can inadvertently disrupt your system, it is advisable to read both +documentation and source before actually making adjustments. In any case, be +very careful when writing to any of these files. The entries in /proc may +change slightly between the 2.1.* and the 2.2 kernel, so if there is any doubt +review the kernel documentation in the directory /usr/src/linux/Documentation. +This chapter is heavily based on the documentation included in the pre 2.2 +kernels, and became part of it in version 2.2.1 of the Linux kernel. + +2.1 /proc/sys/fs - File system data +----------------------------------- + +This subdirectory contains specific file system, file handle, inode, dentry +and quota information. + +Currently, these files are in /proc/sys/fs: + +dentry-state +------------ + +Status of the directory cache. Since directory entries are dynamically +allocated and deallocated, this file indicates the current status. It holds +six values, in which the last two are not used and are always zero. The others +are listed in table 2-1. + + +Table 2-1: Status files of the directory cache +.............................................................................. + File Content + nr_dentry Almost always zero + nr_unused Number of unused cache entries + age_limit + in seconds after the entry may be reclaimed, when memory is short + want_pages internally +.............................................................................. + +dquot-nr and dquot-max +---------------------- + +The file dquot-max shows the maximum number of cached disk quota entries. + +The file dquot-nr shows the number of allocated disk quota entries and the +number of free disk quota entries. + +If the number of available cached disk quotas is very low and you have a large +number of simultaneous system users, you might want to raise the limit. + +file-nr and file-max +-------------------- + +The kernel allocates file handles dynamically, but doesn't free them again at +this time. + +The value in file-max denotes the maximum number of file handles that the +Linux kernel will allocate. When you get a lot of error messages about running +out of file handles, you might want to raise this limit. The default value is +10% of RAM in kilobytes. To change it, just write the new number into the +file: + + # cat /proc/sys/fs/file-max + 4096 + # echo 8192 > /proc/sys/fs/file-max + # cat /proc/sys/fs/file-max + 8192 + + +This method of revision is useful for all customizable parameters of the +kernel - simply echo the new value to the corresponding file. + +Historically, the three values in file-nr denoted the number of allocated file +handles, the number of allocated but unused file handles, and the maximum +number of file handles. Linux 2.6 always reports 0 as the number of free file +handles -- this is not an error, it just means that the number of allocated +file handles exactly matches the number of used file handles. + +Attempts to allocate more file descriptors than file-max are reported with +printk, look for "VFS: file-max limit <number> reached". + +inode-state and inode-nr +------------------------ + +The file inode-nr contains the first two items from inode-state, so we'll skip +to that file... + +inode-state contains two actual numbers and five dummy values. The numbers +are nr_inodes and nr_free_inodes (in order of appearance). + +nr_inodes +~~~~~~~~~ + +Denotes the number of inodes the system has allocated. This number will +grow and shrink dynamically. + +nr_free_inodes +-------------- + +Represents the number of free inodes. Ie. The number of inuse inodes is +(nr_inodes - nr_free_inodes). + +super-nr and super-max +---------------------- + +Again, super block structures are allocated by the kernel, but not freed. The +file super-max contains the maximum number of super block handlers, where +super-nr shows the number of currently allocated ones. + +Every mounted file system needs a super block, so if you plan to mount lots of +file systems, you may want to increase these numbers. + +aio-nr and aio-max-nr +--------------------- + +aio-nr is the running total of the number of events specified on the +io_setup system call for all currently active aio contexts. If aio-nr +reaches aio-max-nr then io_setup will fail with EAGAIN. Note that +raising aio-max-nr does not result in the pre-allocation or re-sizing +of any kernel data structures. + +2.2 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats +----------------------------------------------------------- + +Besides these files, there is the subdirectory /proc/sys/fs/binfmt_misc. This +handles the kernel support for miscellaneous binary formats. + +Binfmt_misc provides the ability to register additional binary formats to the +Kernel without compiling an additional module/kernel. Therefore, binfmt_misc +needs to know magic numbers at the beginning or the filename extension of the +binary. + +It works by maintaining a linked list of structs that contain a description of +a binary format, including a magic with size (or the filename extension), +offset and mask, and the interpreter name. On request it invokes the given +interpreter with the original program as argument, as binfmt_java and +binfmt_em86 and binfmt_mz do. Since binfmt_misc does not define any default +binary-formats, you have to register an additional binary-format. + +There are two general files in binfmt_misc and one file per registered format. +The two general files are register and status. + +Registering a new binary format +------------------------------- + +To register a new binary format you have to issue the command + + echo :name:type:offset:magic:mask:interpreter: > /proc/sys/fs/binfmt_misc/register + + + +with appropriate name (the name for the /proc-dir entry), offset (defaults to +0, if omitted), magic, mask (which can be omitted, defaults to all 0xff) and +last but not least, the interpreter that is to be invoked (for example and +testing /bin/echo). Type can be M for usual magic matching or E for filename +extension matching (give extension in place of magic). + +Check or reset the status of the binary format handler +------------------------------------------------------ + +If you do a cat on the file /proc/sys/fs/binfmt_misc/status, you will get the +current status (enabled/disabled) of binfmt_misc. Change the status by echoing +0 (disables) or 1 (enables) or -1 (caution: this clears all previously +registered binary formats) to status. For example echo 0 > status to disable +binfmt_misc (temporarily). + +Status of a single handler +-------------------------- + +Each registered handler has an entry in /proc/sys/fs/binfmt_misc. These files +perform the same function as status, but their scope is limited to the actual +binary format. By cating this file, you also receive all related information +about the interpreter/magic of the binfmt. + +Example usage of binfmt_misc (emulate binfmt_java) +-------------------------------------------------- + + cd /proc/sys/fs/binfmt_misc + echo ':Java:M::\xca\xfe\xba\xbe::/usr/local/java/bin/javawrapper:' > register + echo ':HTML:E::html::/usr/local/java/bin/appletviewer:' > register + echo ':Applet:M::<!--applet::/usr/local/java/bin/appletviewer:' > register + echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register + + +These four lines add support for Java executables and Java applets (like +binfmt_java, additionally recognizing the .html extension with no need to put +<!--applet> to every applet file). You have to install the JDK and the +shell-script /usr/local/java/bin/javawrapper too. It works around the +brokenness of the Java filename handling. To add a Java binary, just create a +link to the class-file somewhere in the path. + +2.3 /proc/sys/kernel - general kernel parameters +------------------------------------------------ + +This directory reflects general kernel behaviors. As I've said before, the +contents depend on your configuration. Here you'll find the most important +files, along with descriptions of what they mean and how to use them. + +acct +---- + +The file contains three values; highwater, lowwater, and frequency. + +It exists only when BSD-style process accounting is enabled. These values +control its behavior. If the free space on the file system where the log lives +goes below lowwater percentage, accounting suspends. If it goes above +highwater percentage, accounting resumes. Frequency determines how often you +check the amount of free space (value is in seconds). Default settings are: 4, +2, and 30. That is, suspend accounting if there is less than 2 percent free; +resume it if we have a value of 3 or more percent; consider information about +the amount of free space valid for 30 seconds + +ctrl-alt-del +------------ + +When the value in this file is 0, ctrl-alt-del is trapped and sent to the init +program to handle a graceful restart. However, when the value is greater that +zero, Linux's reaction to this key combination will be an immediate reboot, +without syncing its dirty buffers. + +[NOTE] + When a program (like dosemu) has the keyboard in raw mode, the + ctrl-alt-del is intercepted by the program before it ever reaches the + kernel tty layer, and it is up to the program to decide what to do with + it. + +domainname and hostname +----------------------- + +These files can be controlled to set the NIS domainname and hostname of your +box. For the classic darkstar.frop.org a simple: + + # echo "darkstar" > /proc/sys/kernel/hostname + # echo "frop.org" > /proc/sys/kernel/domainname + + +would suffice to set your hostname and NIS domainname. + +osrelease, ostype and version +----------------------------- + +The names make it pretty obvious what these fields contain: + + > cat /proc/sys/kernel/osrelease + 2.2.12 + + > cat /proc/sys/kernel/ostype + Linux + + > cat /proc/sys/kernel/version + #4 Fri Oct 1 12:41:14 PDT 1999 + + +The files osrelease and ostype should be clear enough. Version needs a little +more clarification. The #4 means that this is the 4th kernel built from this +source base and the date after it indicates the time the kernel was built. The +only way to tune these values is to rebuild the kernel. + +panic +----- + +The value in this file represents the number of seconds the kernel waits +before rebooting on a panic. When you use the software watchdog, the +recommended setting is 60. If set to 0, the auto reboot after a kernel panic +is disabled, which is the default setting. + +printk +------ + +The four values in printk denote +* console_loglevel, +* default_message_loglevel, +* minimum_console_loglevel and +* default_console_loglevel +respectively. + +These values influence printk() behavior when printing or logging error +messages, which come from inside the kernel. See syslog(2) for more +information on the different log levels. + +console_loglevel +---------------- + +Messages with a higher priority than this will be printed to the console. + +default_message_level +--------------------- + +Messages without an explicit priority will be printed with this priority. + +minimum_console_loglevel +------------------------ + +Minimum (highest) value to which the console_loglevel can be set. + +default_console_loglevel +------------------------ + +Default value for console_loglevel. + +sg-big-buff +----------- + +This file shows the size of the generic SCSI (sg) buffer. At this point, you +can't tune it yet, but you can change it at compile time by editing +include/scsi/sg.h and changing the value of SG_BIG_BUFF. + +If you use a scanner with SANE (Scanner Access Now Easy) you might want to set +this to a higher value. Refer to the SANE documentation on this issue. + +modprobe +-------- + +The location where the modprobe binary is located. The kernel uses this +program to load modules on demand. + +unknown_nmi_panic +----------------- + +The value in this file affects behavior of handling NMI. When the value is +non-zero, unknown NMI is trapped and then panic occurs. At that time, kernel +debugging information is displayed on console. + +NMI switch that most IA32 servers have fires unknown NMI up, for example. +If a system hangs up, try pressing the NMI switch. + +[NOTE] + This function and oprofile share a NMI callback. Therefore this function + cannot be enabled when oprofile is activated. + And NMI watchdog will be disabled when the value in this file is set to + non-zero. + + +2.4 /proc/sys/vm - The virtual memory subsystem +----------------------------------------------- + +The files in this directory can be used to tune the operation of the virtual +memory (VM) subsystem of the Linux kernel. + +vfs_cache_pressure +------------------ + +Controls the tendency of the kernel to reclaim the memory which is used for +caching of directory and inode objects. + +At the default value of vfs_cache_pressure=100 the kernel will attempt to +reclaim dentries and inodes at a "fair" rate with respect to pagecache and +swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer +to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100 +causes the kernel to prefer to reclaim dentries and inodes. + +dirty_background_ratio +---------------------- + +Contains, as a percentage of total system memory, the number of pages at which +the pdflush background writeback daemon will start writing out dirty data. + +dirty_ratio +----------------- + +Contains, as a percentage of total system memory, the number of pages at which +a process which is generating disk writes will itself start writing out dirty +data. + +dirty_writeback_centisecs +------------------------- + +The pdflush writeback daemons will periodically wake up and write `old' data +out to disk. This tunable expresses the interval between those wakeups, in +100'ths of a second. + +Setting this to zero disables periodic writeback altogether. + +dirty_expire_centisecs +---------------------- + +This tunable is used to define when dirty data is old enough to be eligible +for writeout by the pdflush daemons. It is expressed in 100'ths of a second. +Data which has been dirty in-memory for longer than this interval will be +written out next time a pdflush daemon wakes up. + +legacy_va_layout +---------------- + +If non-zero, this sysctl disables the new 32-bit mmap mmap layout - the kernel +will use the legacy (2.4) layout for all processes. + +lower_zone_protection +--------------------- + +For some specialised workloads on highmem machines it is dangerous for +the kernel to allow process memory to be allocated from the "lowmem" +zone. This is because that memory could then be pinned via the mlock() +system call, or by unavailability of swapspace. + +And on large highmem machines this lack of reclaimable lowmem memory +can be fatal. + +So the Linux page allocator has a mechanism which prevents allocations +which _could_ use highmem from using too much lowmem. This means that +a certain amount of lowmem is defended from the possibility of being +captured into pinned user memory. + +(The same argument applies to the old 16 megabyte ISA DMA region. This +mechanism will also defend that region from allocations which could use +highmem or lowmem). + +The `lower_zone_protection' tunable determines how aggressive the kernel is +in defending these lower zones. The default value is zero - no +protection at all. + +If you have a machine which uses highmem or ISA DMA and your +applications are using mlock(), or if you are running with no swap then +you probably should increase the lower_zone_protection setting. + +The units of this tunable are fairly vague. It is approximately equal +to "megabytes". So setting lower_zone_protection=100 will protect around 100 +megabytes of the lowmem zone from user allocations. It will also make +those 100 megabytes unavaliable for use by applications and by +pagecache, so there is a cost. + +The effects of this tunable may be observed by monitoring +/proc/meminfo:LowFree. Write a single huge file and observe the point +at which LowFree ceases to fall. + +A reasonable value for lower_zone_protection is 100. + +page-cluster +------------ + +page-cluster controls the number of pages which are written to swap in +a single attempt. The swap I/O size. + +It is a logarithmic value - setting it to zero means "1 page", setting +it to 1 means "2 pages", setting it to 2 means "4 pages", etc. + +The default value is three (eight pages at a time). There may be some +small benefits in tuning this to a different value if your workload is +swap-intensive. + +overcommit_memory +----------------- + +This file contains one value. The following algorithm is used to decide if +there's enough memory: if the value of overcommit_memory is positive, then +there's always enough memory. This is a useful feature, since programs often +malloc() huge amounts of memory 'just in case', while they only use a small +part of it. Leaving this value at 0 will lead to the failure of such a huge +malloc(), when in fact the system has enough memory for the program to run. + +On the other hand, enabling this feature can cause you to run out of memory +and thrash the system to death, so large and/or important servers will want to +set this value to 0. + +nr_hugepages and hugetlb_shm_group +---------------------------------- + +nr_hugepages configures number of hugetlb page reserved for the system. + +hugetlb_shm_group contains group id that is allowed to create SysV shared +memory segment using hugetlb page. + +laptop_mode +----------- + +laptop_mode is a knob that controls "laptop mode". All the things that are +controlled by this knob are discussed in Documentation/laptop-mode.txt. + +block_dump +---------- + +block_dump enables block I/O debugging when set to a nonzero value. More +information on block I/O debugging is in Documentation/laptop-mode.txt. + +swap_token_timeout +------------------ + +This file contains valid hold time of swap out protection token. The Linux +VM has token based thrashing control mechanism and uses the token to prevent +unnecessary page faults in thrashing situation. The unit of the value is +second. The value would be useful to tune thrashing behavior. + +2.5 /proc/sys/dev - Device specific parameters +---------------------------------------------- + +Currently there is only support for CDROM drives, and for those, there is only +one read-only file containing information about the CD-ROM drives attached to +the system: + + >cat /proc/sys/dev/cdrom/info + CD-ROM information, Id: cdrom.c 2.55 1999/04/25 + + drive name: sr0 hdb + drive speed: 32 40 + drive # of slots: 1 0 + Can close tray: 1 1 + Can open tray: 1 1 + Can lock tray: 1 1 + Can change speed: 1 1 + Can select disk: 0 1 + Can read multisession: 1 1 + Can read MCN: 1 1 + Reports media changed: 1 1 + Can play audio: 1 1 + + +You see two drives, sr0 and hdb, along with a list of their features. + +2.6 /proc/sys/sunrpc - Remote procedure calls +--------------------------------------------- + +This directory contains four files, which enable or disable debugging for the +RPC functions NFS, NFS-daemon, RPC and NLM. The default values are 0. They can +be set to one to turn debugging on. (The default value is 0 for each) + +2.7 /proc/sys/net - Networking stuff +------------------------------------ + +The interface to the networking parts of the kernel is located in +/proc/sys/net. Table 2-3 shows all possible subdirectories. You may see only +some of them, depending on your kernel's configuration. + + +Table 2-3: Subdirectories in /proc/sys/net +.............................................................................. + Directory Content Directory Content + core General parameter appletalk Appletalk protocol + unix Unix domain sockets netrom NET/ROM + 802 E802 protocol ax25 AX25 + ethernet Ethernet protocol rose X.25 PLP layer + ipv4 IP version 4 x25 X.25 protocol + ipx IPX token-ring IBM token ring + bridge Bridging decnet DEC net + ipv6 IP version 6 +.............................................................................. + +We will concentrate on IP networking here. Since AX15, X.25, and DEC Net are +only minor players in the Linux world, we'll skip them in this chapter. You'll +find some short info on Appletalk and IPX further on in this chapter. Review +the online documentation and the kernel source to get a detailed view of the +parameters for those protocols. In this section we'll discuss the +subdirectories printed in bold letters in the table above. As default values +are suitable for most needs, there is no need to change these values. + +/proc/sys/net/core - Network core options +----------------------------------------- + +rmem_default +------------ + +The default setting of the socket receive buffer in bytes. + +rmem_max +-------- + +The maximum receive socket buffer size in bytes. + +wmem_default +------------ + +The default setting (in bytes) of the socket send buffer. + +wmem_max +-------- + +The maximum send socket buffer size in bytes. + +message_burst and message_cost +------------------------------ + +These parameters are used to limit the warning messages written to the kernel +log from the networking code. They enforce a rate limit to make a +denial-of-service attack impossible. A higher message_cost factor, results in +fewer messages that will be written. Message_burst controls when messages will +be dropped. The default settings limit warning messages to one every five +seconds. + +netdev_max_backlog +------------------ + +Maximum number of packets, queued on the INPUT side, when the interface +receives packets faster than kernel can process them. + +optmem_max +---------- + +Maximum ancillary buffer size allowed per socket. Ancillary data is a sequence +of struct cmsghdr structures with appended data. + +/proc/sys/net/unix - Parameters for Unix domain sockets +------------------------------------------------------- + +There are only two files in this subdirectory. They control the delays for +deleting and destroying socket descriptors. + +2.8 /proc/sys/net/ipv4 - IPV4 settings +-------------------------------------- + +IP version 4 is still the most used protocol in Unix networking. It will be +replaced by IP version 6 in the next couple of years, but for the moment it's +the de facto standard for the internet and is used in most networking +environments around the world. Because of the importance of this protocol, +we'll have a deeper look into the subtree controlling the behavior of the IPv4 +subsystem of the Linux kernel. + +Let's start with the entries in /proc/sys/net/ipv4. + +ICMP settings +------------- + +icmp_echo_ignore_all and icmp_echo_ignore_broadcasts +---------------------------------------------------- + +Turn on (1) or off (0), if the kernel should ignore all ICMP ECHO requests, or +just those to broadcast and multicast addresses. + +Please note that if you accept ICMP echo requests with a broadcast/multi\-cast +destination address your network may be used as an exploder for denial of +service packet flooding attacks to other hosts. + +icmp_destunreach_rate, icmp_echoreply_rate, icmp_paramprob_rate and icmp_timeexeed_rate +--------------------------------------------------------------------------------------- + +Sets limits for sending ICMP packets to specific targets. A value of zero +disables all limiting. Any positive value sets the maximum package rate in +hundredth of a second (on Intel systems). + +IP settings +----------- + +ip_autoconfig +------------- + +This file contains the number one if the host received its IP configuration by +RARP, BOOTP, DHCP or a similar mechanism. Otherwise it is zero. + +ip_default_ttl +-------------- + +TTL (Time To Live) for IPv4 interfaces. This is simply the maximum number of +hops a packet may travel. + +ip_dynaddr +---------- + +Enable dynamic socket address rewriting on interface address change. This is +useful for dialup interface with changing IP addresses. + +ip_forward +---------- + +Enable or disable forwarding of IP packages between interfaces. Changing this +value resets all other parameters to their default values. They differ if the +kernel is configured as host or router. + +ip_local_port_range +------------------- + +Range of ports used by TCP and UDP to choose the local port. Contains two +numbers, the first number is the lowest port, the second number the highest +local port. Default is 1024-4999. Should be changed to 32768-61000 for +high-usage systems. + +ip_no_pmtu_disc +--------------- + +Global switch to turn path MTU discovery off. It can also be set on a per +socket basis by the applications or on a per route basis. + +ip_masq_debug +------------- + +Enable/disable debugging of IP masquerading. + +IP fragmentation settings +------------------------- + +ipfrag_high_trash and ipfrag_low_trash +-------------------------------------- + +Maximum memory used to reassemble IP fragments. When ipfrag_high_thresh bytes +of memory is allocated for this purpose, the fragment handler will toss +packets until ipfrag_low_thresh is reached. + +ipfrag_time +----------- + +Time in seconds to keep an IP fragment in memory. + +TCP settings +------------ + +tcp_ecn +------- + +This file controls the use of the ECN bit in the IPv4 headers, this is a new +feature about Explicit Congestion Notification, but some routers and firewalls +block trafic that has this bit set, so it could be necessary to echo 0 to +/proc/sys/net/ipv4/tcp_ecn, if you want to talk to this sites. For more info +you could read RFC2481. + +tcp_retrans_collapse +-------------------- + +Bug-to-bug compatibility with some broken printers. On retransmit, try to send +larger packets to work around bugs in certain TCP stacks. Can be turned off by +setting it to zero. + +tcp_keepalive_probes +-------------------- + +Number of keep alive probes TCP sends out, until it decides that the +connection is broken. + +tcp_keepalive_time +------------------ + +How often TCP sends out keep alive messages, when keep alive is enabled. The +default is 2 hours. + +tcp_syn_retries +--------------- + +Number of times initial SYNs for a TCP connection attempt will be +retransmitted. Should not be higher than 255. This is only the timeout for +outgoing connections, for incoming connections the number of retransmits is +defined by tcp_retries1. + +tcp_sack +-------- + +Enable select acknowledgments after RFC2018. + +tcp_timestamps +-------------- + +Enable timestamps as defined in RFC1323. + +tcp_stdurg +---------- + +Enable the strict RFC793 interpretation of the TCP urgent pointer field. The +default is to use the BSD compatible interpretation of the urgent pointer +pointing to the first byte after the urgent data. The RFC793 interpretation is +to have it point to the last byte of urgent data. Enabling this option may +lead to interoperatibility problems. Disabled by default. + +tcp_syncookies +-------------- + +Only valid when the kernel was compiled with CONFIG_SYNCOOKIES. Send out +syncookies when the syn backlog queue of a socket overflows. This is to ward +off the common 'syn flood attack'. Disabled by default. + +Note that the concept of a socket backlog is abandoned. This means the peer +may not receive reliable error messages from an over loaded server with +syncookies enabled. + +tcp_window_scaling +------------------ + +Enable window scaling as defined in RFC1323. + +tcp_fin_timeout +--------------- + +The length of time in seconds it takes to receive a final FIN before the +socket is always closed. This is strictly a violation of the TCP +specification, but required to prevent denial-of-service attacks. + +tcp_max_ka_probes +----------------- + +Indicates how many keep alive probes are sent per slow timer run. Should not +be set too high to prevent bursts. + +tcp_max_syn_backlog +------------------- + +Length of the per socket backlog queue. Since Linux 2.2 the backlog specified +in listen(2) only specifies the length of the backlog queue of already +established sockets. When more connection requests arrive Linux starts to drop +packets. When syncookies are enabled the packets are still answered and the +maximum queue is effectively ignored. + +tcp_retries1 +------------ + +Defines how often an answer to a TCP connection request is retransmitted +before giving up. + +tcp_retries2 +------------ + +Defines how often a TCP packet is retransmitted before giving up. + +Interface specific settings +--------------------------- + +In the directory /proc/sys/net/ipv4/conf you'll find one subdirectory for each +interface the system knows about and one directory calls all. Changes in the +all subdirectory affect all interfaces, whereas changes in the other +subdirectories affect only one interface. All directories have the same +entries: + +accept_redirects +---------------- + +This switch decides if the kernel accepts ICMP redirect messages or not. The +default is 'yes' if the kernel is configured for a regular host and 'no' for a +router configuration. + +accept_source_route +------------------- + +Should source routed packages be accepted or declined. The default is +dependent on the kernel configuration. It's 'yes' for routers and 'no' for +hosts. + +bootp_relay +~~~~~~~~~~~ + +Accept packets with source address 0.b.c.d with destinations not to this host +as local ones. It is supposed that a BOOTP relay daemon will catch and forward +such packets. + +The default is 0, since this feature is not implemented yet (kernel version +2.2.12). + +forwarding +---------- + +Enable or disable IP forwarding on this interface. + +log_martians +------------ + +Log packets with source addresses with no known route to kernel log. + +mc_forwarding +------------- + +Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE and a +multicast routing daemon is required. + +proxy_arp +--------- + +Does (1) or does not (0) perform proxy ARP. + +rp_filter +--------- + +Integer value determines if a source validation should be made. 1 means yes, 0 +means no. Disabled by default, but local/broadcast address spoofing is always +on. + +If you set this to 1 on a router that is the only connection for a network to +the net, it will prevent spoofing attacks against your internal networks +(external addresses can still be spoofed), without the need for additional +firewall rules. + +secure_redirects +---------------- + +Accept ICMP redirect messages only for gateways, listed in default gateway +list. Enabled by default. + +shared_media +------------ + +If it is not set the kernel does not assume that different subnets on this +device can communicate directly. Default setting is 'yes'. + +send_redirects +-------------- + +Determines whether to send ICMP redirects to other hosts. + +Routing settings +---------------- + +The directory /proc/sys/net/ipv4/route contains several file to control +routing issues. + +error_burst and error_cost +-------------------------- + +These parameters are used to limit how many ICMP destination unreachable to +send from the host in question. ICMP destination unreachable messages are +sent when we can not reach the next hop, while trying to transmit a packet. +It will also print some error messages to kernel logs if someone is ignoring +our ICMP redirects. The higher the error_cost factor is, the fewer +destination unreachable and error messages will be let through. Error_burst +controls when destination unreachable messages and error messages will be +dropped. The default settings limit warning messages to five every second. + +flush +----- + +Writing to this file results in a flush of the routing cache. + +gc_elasticity, gc_interval, gc_min_interval_ms, gc_timeout, gc_thresh +--------------------------------------------------------------------- + +Values to control the frequency and behavior of the garbage collection +algorithm for the routing cache. gc_min_interval is deprecated and replaced +by gc_min_interval_ms. + + +max_size +-------- + +Maximum size of the routing cache. Old entries will be purged once the cache +reached has this size. + +max_delay, min_delay +-------------------- + +Delays for flushing the routing cache. + +redirect_load, redirect_number +------------------------------ + +Factors which determine if more ICPM redirects should be sent to a specific +host. No redirects will be sent once the load limit or the maximum number of +redirects has been reached. + +redirect_silence +---------------- + +Timeout for redirects. After this period redirects will be sent again, even if +this has been stopped, because the load or number limit has been reached. + +Network Neighbor handling +------------------------- + +Settings about how to handle connections with direct neighbors (nodes attached +to the same link) can be found in the directory /proc/sys/net/ipv4/neigh. + +As we saw it in the conf directory, there is a default subdirectory which +holds the default values, and one directory for each interface. The contents +of the directories are identical, with the single exception that the default +settings contain additional options to set garbage collection parameters. + +In the interface directories you'll find the following entries: + +base_reachable_time, base_reachable_time_ms +------------------------------------------- + +A base value used for computing the random reachable time value as specified +in RFC2461. + +Expression of base_reachable_time, which is deprecated, is in seconds. +Expression of base_reachable_time_ms is in milliseconds. + +retrans_time, retrans_time_ms +----------------------------- + +The time between retransmitted Neighbor Solicitation messages. +Used for address resolution and to determine if a neighbor is +unreachable. + +Expression of retrans_time, which is deprecated, is in 1/100 seconds (for +IPv4) or in jiffies (for IPv6). +Expression of retrans_time_ms is in milliseconds. + +unres_qlen +---------- + +Maximum queue length for a pending arp request - the number of packets which +are accepted from other layers while the ARP address is still resolved. + +anycast_delay +------------- + +Maximum for random delay of answers to neighbor solicitation messages in +jiffies (1/100 sec). Not yet implemented (Linux does not have anycast support +yet). + +ucast_solicit +------------- + +Maximum number of retries for unicast solicitation. + +mcast_solicit +------------- + +Maximum number of retries for multicast solicitation. + +delay_first_probe_time +---------------------- + +Delay for the first time probe if the neighbor is reachable. (see +gc_stale_time) + +locktime +-------- + +An ARP/neighbor entry is only replaced with a new one if the old is at least +locktime old. This prevents ARP cache thrashing. + +proxy_delay +----------- + +Maximum time (real time is random [0..proxytime]) before answering to an ARP +request for which we have an proxy ARP entry. In some cases, this is used to +prevent network flooding. + +proxy_qlen +---------- + +Maximum queue length of the delayed proxy arp timer. (see proxy_delay). + +app_solcit +---------- + +Determines the number of requests to send to the user level ARP daemon. Use 0 +to turn off. + +gc_stale_time +------------- + +Determines how often to check for stale ARP entries. After an ARP entry is +stale it will be resolved again (which is useful when an IP address migrates +to another machine). When ucast_solicit is greater than 0 it first tries to +send an ARP packet directly to the known host When that fails and +mcast_solicit is greater than 0, an ARP request is broadcasted. + +2.9 Appletalk +------------- + +The /proc/sys/net/appletalk directory holds the Appletalk configuration data +when Appletalk is loaded. The configurable parameters are: + +aarp-expiry-time +---------------- + +The amount of time we keep an ARP entry before expiring it. Used to age out +old hosts. + +aarp-resolve-time +----------------- + +The amount of time we will spend trying to resolve an Appletalk address. + +aarp-retransmit-limit +--------------------- + +The number of times we will retransmit a query before giving up. + +aarp-tick-time +-------------- + +Controls the rate at which expires are checked. + +The directory /proc/net/appletalk holds the list of active Appletalk sockets +on a machine. + +The fields indicate the DDP type, the local address (in network:node format) +the remote address, the size of the transmit pending queue, the size of the +received queue (bytes waiting for applications to read) the state and the uid +owning the socket. + +/proc/net/atalk_iface lists all the interfaces configured for appletalk.It +shows the name of the interface, its Appletalk address, the network range on +that address (or network number for phase 1 networks), and the status of the +interface. + +/proc/net/atalk_route lists each known network route. It lists the target +(network) that the route leads to, the router (may be directly connected), the +route flags, and the device the route is using. + +2.10 IPX +-------- + +The IPX protocol has no tunable values in proc/sys/net. + +The IPX protocol does, however, provide proc/net/ipx. This lists each IPX +socket giving the local and remote addresses in Novell format (that is +network:node:port). In accordance with the strange Novell tradition, +everything but the port is in hex. Not_Connected is displayed for sockets that +are not tied to a specific remote address. The Tx and Rx queue sizes indicate +the number of bytes pending for transmission and reception. The state +indicates the state the socket is in and the uid is the owning uid of the +socket. + +The /proc/net/ipx_interface file lists all IPX interfaces. For each interface +it gives the network number, the node number, and indicates if the network is +the primary network. It also indicates which device it is bound to (or +Internal for internal networks) and the Frame Type if appropriate. Linux +supports 802.3, 802.2, 802.2 SNAP and DIX (Blue Book) ethernet framing for +IPX. + +The /proc/net/ipx_route table holds a list of IPX routes. For each route it +gives the destination network, the router node (or Directly) and the network +address of the router (or Connected) for internal networks. + +2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem +---------------------------------------------------------- + +The "mqueue" filesystem provides the necessary kernel features to enable the +creation of a user space library that implements the POSIX message queues +API (as noted by the MSG tag in the POSIX 1003.1-2001 version of the System +Interfaces specification.) + +The "mqueue" filesystem contains values for determining/setting the amount of +resources used by the file system. + +/proc/sys/fs/mqueue/queues_max is a read/write file for setting/getting the +maximum number of message queues allowed on the system. + +/proc/sys/fs/mqueue/msg_max is a read/write file for setting/getting the +maximum number of messages in a queue value. In fact it is the limiting value +for another (user) limit which is set in mq_open invocation. This attribute of +a queue must be less or equal then msg_max. + +/proc/sys/fs/mqueue/msgsize_max is a read/write file for setting/getting the +maximum message size value (it is every message queue's attribute set during +its creation). + + +------------------------------------------------------------------------------ +Summary +------------------------------------------------------------------------------ +Certain aspects of kernel behavior can be modified at runtime, without the +need to recompile the kernel, or even to reboot the system. The files in the +/proc/sys tree can not only be read, but also modified. You can use the echo +command to write value into these files, thereby changing the default settings +of the kernel. +------------------------------------------------------------------------------ diff --git a/Documentation/filesystems/romfs.txt b/Documentation/filesystems/romfs.txt new file mode 100644 index 000000000000..2d2a7b2a16b9 --- /dev/null +++ b/Documentation/filesystems/romfs.txt @@ -0,0 +1,187 @@ +ROMFS - ROM FILE SYSTEM + +This is a quite dumb, read only filesystem, mainly for initial RAM +disks of installation disks. It has grown up by the need of having +modules linked at boot time. Using this filesystem, you get a very +similar feature, and even the possibility of a small kernel, with a +file system which doesn't take up useful memory from the router +functions in the basement of your office. + +For comparison, both the older minix and xiafs (the latter is now +defunct) filesystems, compiled as module need more than 20000 bytes, +while romfs is less than a page, about 4000 bytes (assuming i586 +code). Under the same conditions, the msdos filesystem would need +about 30K (and does not support device nodes or symlinks), while the +nfs module with nfsroot is about 57K. Furthermore, as a bit unfair +comparison, an actual rescue disk used up 3202 blocks with ext2, while +with romfs, it needed 3079 blocks. + +To create such a file system, you'll need a user program named +genromfs. It is available via anonymous ftp on sunsite.unc.edu and +its mirrors, in the /pub/Linux/system/recovery/ directory. + +As the name suggests, romfs could be also used (space-efficiently) on +various read-only media, like (E)EPROM disks if someone will have the +motivation.. :) + +However, the main purpose of romfs is to have a very small kernel, +which has only this filesystem linked in, and then can load any module +later, with the current module utilities. It can also be used to run +some program to decide if you need SCSI devices, and even IDE or +floppy drives can be loaded later if you use the "initrd"--initial +RAM disk--feature of the kernel. This would not be really news +flash, but with romfs, you can even spare off your ext2 or minix or +maybe even affs filesystem until you really know that you need it. + +For example, a distribution boot disk can contain only the cd disk +drivers (and possibly the SCSI drivers), and the ISO 9660 filesystem +module. The kernel can be small enough, since it doesn't have other +filesystems, like the quite large ext2fs module, which can then be +loaded off the CD at a later stage of the installation. Another use +would be for a recovery disk, when you are reinstalling a workstation +from the network, and you will have all the tools/modules available +from a nearby server, so you don't want to carry two disks for this +purpose, just because it won't fit into ext2. + +romfs operates on block devices as you can expect, and the underlying +structure is very simple. Every accessible structure begins on 16 +byte boundaries for fast access. The minimum space a file will take +is 32 bytes (this is an empty file, with a less than 16 character +name). The maximum overhead for any non-empty file is the header, and +the 16 byte padding for the name and the contents, also 16+14+15 = 45 +bytes. This is quite rare however, since most file names are longer +than 3 bytes, and shorter than 15 bytes. + +The layout of the filesystem is the following: + +offset content + + +---+---+---+---+ + 0 | - | r | o | m | \ + +---+---+---+---+ The ASCII representation of those bytes + 4 | 1 | f | s | - | / (i.e. "-rom1fs-") + +---+---+---+---+ + 8 | full size | The number of accessible bytes in this fs. + +---+---+---+---+ + 12 | checksum | The checksum of the FIRST 512 BYTES. + +---+---+---+---+ + 16 | volume name | The zero terminated name of the volume, + : : padded to 16 byte boundary. + +---+---+---+---+ + xx | file | + : headers : + +Every multi byte value (32 bit words, I'll use the longwords term from +now on) must be in big endian order. + +The first eight bytes identify the filesystem, even for the casual +inspector. After that, in the 3rd longword, it contains the number of +bytes accessible from the start of this filesystem. The 4th longword +is the checksum of the first 512 bytes (or the number of bytes +accessible, whichever is smaller). The applied algorithm is the same +as in the AFFS filesystem, namely a simple sum of the longwords +(assuming bigendian quantities again). For details, please consult +the source. This algorithm was chosen because although it's not quite +reliable, it does not require any tables, and it is very simple. + +The following bytes are now part of the file system; each file header +must begin on a 16 byte boundary. + +offset content + + +---+---+---+---+ + 0 | next filehdr|X| The offset of the next file header + +---+---+---+---+ (zero if no more files) + 4 | spec.info | Info for directories/hard links/devices + +---+---+---+---+ + 8 | size | The size of this file in bytes + +---+---+---+---+ + 12 | checksum | Covering the meta data, including the file + +---+---+---+---+ name, and padding + 16 | file name | The zero terminated name of the file, + : : padded to 16 byte boundary + +---+---+---+---+ + xx | file data | + : : + +Since the file headers begin always at a 16 byte boundary, the lowest +4 bits would be always zero in the next filehdr pointer. These four +bits are used for the mode information. Bits 0..2 specify the type of +the file; while bit 4 shows if the file is executable or not. The +permissions are assumed to be world readable, if this bit is not set, +and world executable if it is; except the character and block devices, +they are never accessible for other than owner. The owner of every +file is user and group 0, this should never be a problem for the +intended use. The mapping of the 8 possible values to file types is +the following: + + mapping spec.info means + 0 hard link link destination [file header] + 1 directory first file's header + 2 regular file unused, must be zero [MBZ] + 3 symbolic link unused, MBZ (file data is the link content) + 4 block device 16/16 bits major/minor number + 5 char device - " - + 6 socket unused, MBZ + 7 fifo unused, MBZ + +Note that hard links are specifically marked in this filesystem, but +they will behave as you can expect (i.e. share the inode number). +Note also that it is your responsibility to not create hard link +loops, and creating all the . and .. links for directories. This is +normally done correctly by the genromfs program. Please refrain from +using the executable bits for special purposes on the socket and fifo +special files, they may have other uses in the future. Additionally, +please remember that only regular files, and symlinks are supposed to +have a nonzero size field; they contain the number of bytes available +directly after the (padded) file name. + +Another thing to note is that romfs works on file headers and data +aligned to 16 byte boundaries, but most hardware devices and the block +device drivers are unable to cope with smaller than block-sized data. +To overcome this limitation, the whole size of the file system must be +padded to an 1024 byte boundary. + +If you have any problems or suggestions concerning this file system, +please contact me. However, think twice before wanting me to add +features and code, because the primary and most important advantage of +this file system is the small code. On the other hand, don't be +alarmed, I'm not getting that much romfs related mail. Now I can +understand why Avery wrote poems in the ARCnet docs to get some more +feedback. :) + +romfs has also a mailing list, and to date, it hasn't received any +traffic, so you are welcome to join it to discuss your ideas. :) + +It's run by ezmlm, so you can subscribe to it by sending a message +to romfs-subscribe@shadow.banki.hu, the content is irrelevant. + +Pending issues: + +- Permissions and owner information are pretty essential features of a +Un*x like system, but romfs does not provide the full possibilities. +I have never found this limiting, but others might. + +- The file system is read only, so it can be very small, but in case +one would want to write _anything_ to a file system, he still needs +a writable file system, thus negating the size advantages. Possible +solutions: implement write access as a compile-time option, or a new, +similarly small writable filesystem for RAM disks. + +- Since the files are only required to have alignment on a 16 byte +boundary, it is currently possibly suboptimal to read or execute files +from the filesystem. It might be resolved by reordering file data to +have most of it (i.e. except the start and the end) laying at "natural" +boundaries, thus it would be possible to directly map a big portion of +the file contents to the mm subsystem. + +- Compression might be an useful feature, but memory is quite a +limiting factor in my eyes. + +- Where it is used? + +- Does it work on other architectures than intel and motorola? + + +Have fun, +Janos Farkas <chexum@shadow.banki.hu> diff --git a/Documentation/filesystems/smbfs.txt b/Documentation/filesystems/smbfs.txt new file mode 100644 index 000000000000..f673ef0de0f7 --- /dev/null +++ b/Documentation/filesystems/smbfs.txt @@ -0,0 +1,8 @@ +Smbfs is a filesystem that implements the SMB protocol, which is the +protocol used by Windows for Workgroups, Windows 95 and Windows NT. +Smbfs was inspired by Samba, the program written by Andrew Tridgell +that turns any Unix host into a file server for DOS or Windows clients. + +Smbfs is a SMB client, but uses parts of samba for it's operation. For +more info on samba, including documentation, please go to +http://www.samba.org/ and then on to your nearest mirror. diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt new file mode 100644 index 000000000000..e97d024eae77 --- /dev/null +++ b/Documentation/filesystems/sysfs-pci.txt @@ -0,0 +1,88 @@ +Accessing PCI device resources through sysfs + +sysfs, usually mounted at /sys, provides access to PCI resources on platforms +that support it. For example, a given bus might look like this: + + /sys/devices/pci0000:17 + |-- 0000:17:00.0 + | |-- class + | |-- config + | |-- detach_state + | |-- device + | |-- irq + | |-- local_cpus + | |-- resource + | |-- resource0 + | |-- resource1 + | |-- resource2 + | |-- rom + | |-- subsystem_device + | |-- subsystem_vendor + | `-- vendor + `-- detach_state + +The topmost element describes the PCI domain and bus number. In this case, +the domain number is 0000 and the bus number is 17 (both values are in hex). +This bus contains a single function device in slot 0. The domain and bus +numbers are reproduced for convenience. Under the device directory are several +files, each with their own function. + + file function + ---- -------- + class PCI class (ascii, ro) + config PCI config space (binary, rw) + detach_state connection status (bool, rw) + device PCI device (ascii, ro) + irq IRQ number (ascii, ro) + local_cpus nearby CPU mask (cpumask, ro) + resource PCI resource host addresses (ascii, ro) + resource0..N PCI resource N, if present (binary, mmap) + rom PCI ROM resource, if present (binary, ro) + subsystem_device PCI subsystem device (ascii, ro) + subsystem_vendor PCI subsystem vendor (ascii, ro) + vendor PCI vendor (ascii, ro) + + ro - read only file + rw - file is readable and writable + mmap - file is mmapable + ascii - file contains ascii text + binary - file contains binary data + cpumask - file contains a cpumask type + +The read only files are informational, writes to them will be ignored. +Writable files can be used to perform actions on the device (e.g. changing +config space, detaching a device). mmapable files are available via an +mmap of the file at offset 0 and can be used to do actual device programming +from userspace. Note that some platforms don't support mmapping of certain +resources, so be sure to check the return value from any attempted mmap. + +Accessing legacy resources through sysfs + +Legacy I/O port and ISA memory resources are also provided in sysfs if the +underlying platform supports them. They're located in the PCI class heirarchy, +e.g. + + /sys/class/pci_bus/0000:17/ + |-- bridge -> ../../../devices/pci0000:17 + |-- cpuaffinity + |-- legacy_io + `-- legacy_mem + +The legacy_io file is a read/write file that can be used by applications to +do legacy port I/O. The application should open the file, seek to the desired +port (e.g. 0x3e8) and do a read or a write of 1, 2 or 4 bytes. The legacy_mem +file should be mmapped with an offset corresponding to the memory offset +desired, e.g. 0xa0000 for the VGA frame buffer. The application can then +simply dereference the returned pointer (after checking for errors of course) +to access legacy memory space. + +Supporting PCI access on new platforms + +In order to support PCI resource mapping as described above, Linux platform +code must define HAVE_PCI_MMAP and provide a pci_mmap_page_range function. +Platforms are free to only support subsets of the mmap functionality, but +useful return codes should be provided. + +Legacy resources are protected by the HAVE_PCI_LEGACY define. Platforms +wishing to support legacy functionality should define it and provide +pci_legacy_read, pci_legacy_write and pci_mmap_legacy_page_range functions.
\ No newline at end of file diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt new file mode 100644 index 000000000000..60f6c2c4d477 --- /dev/null +++ b/Documentation/filesystems/sysfs.txt @@ -0,0 +1,341 @@ + +sysfs - _The_ filesystem for exporting kernel objects. + +Patrick Mochel <mochel@osdl.org> + +10 January 2003 + + +What it is: +~~~~~~~~~~~ + +sysfs is a ram-based filesystem initially based on ramfs. It provides +a means to export kernel data structures, their attributes, and the +linkages between them to userspace. + +sysfs is tied inherently to the kobject infrastructure. Please read +Documentation/kobject.txt for more information concerning the kobject +interface. + + +Using sysfs +~~~~~~~~~~~ + +sysfs is always compiled in. You can access it by doing: + + mount -t sysfs sysfs /sys + + +Directory Creation +~~~~~~~~~~~~~~~~~~ + +For every kobject that is registered with the system, a directory is +created for it in sysfs. That directory is created as a subdirectory +of the kobject's parent, expressing internal object hierarchies to +userspace. Top-level directories in sysfs represent the common +ancestors of object hierarchies; i.e. the subsystems the objects +belong to. + +Sysfs internally stores the kobject that owns the directory in the +->d_fsdata pointer of the directory's dentry. This allows sysfs to do +reference counting directly on the kobject when the file is opened and +closed. + + +Attributes +~~~~~~~~~~ + +Attributes can be exported for kobjects in the form of regular files in +the filesystem. Sysfs forwards file I/O operations to methods defined +for the attributes, providing a means to read and write kernel +attributes. + +Attributes should be ASCII text files, preferably with only one value +per file. It is noted that it may not be efficient to contain only +value per file, so it is socially acceptable to express an array of +values of the same type. + +Mixing types, expressing multiple lines of data, and doing fancy +formatting of data is heavily frowned upon. Doing these things may get +you publically humiliated and your code rewritten without notice. + + +An attribute definition is simply: + +struct attribute { + char * name; + mode_t mode; +}; + + +int sysfs_create_file(struct kobject * kobj, struct attribute * attr); +void sysfs_remove_file(struct kobject * kobj, struct attribute * attr); + + +A bare attribute contains no means to read or write the value of the +attribute. Subsystems are encouraged to define their own attribute +structure and wrapper functions for adding and removing attributes for +a specific object type. + +For example, the driver model defines struct device_attribute like: + +struct device_attribute { + struct attribute attr; + ssize_t (*show)(struct device * dev, char * buf); + ssize_t (*store)(struct device * dev, const char * buf); +}; + +int device_create_file(struct device *, struct device_attribute *); +void device_remove_file(struct device *, struct device_attribute *); + +It also defines this helper for defining device attributes: + +#define DEVICE_ATTR(_name,_mode,_show,_store) \ +struct device_attribute dev_attr_##_name = { \ + .attr = {.name = __stringify(_name) , .mode = _mode }, \ + .show = _show, \ + .store = _store, \ +}; + +For example, declaring + +static DEVICE_ATTR(foo,0644,show_foo,store_foo); + +is equivalent to doing: + +static struct device_attribute dev_attr_foo = { + .attr = { + .name = "foo", + .mode = 0644, + }, + .show = show_foo, + .store = store_foo, +}; + + +Subsystem-Specific Callbacks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When a subsystem defines a new attribute type, it must implement a +set of sysfs operations for forwarding read and write calls to the +show and store methods of the attribute owners. + +struct sysfs_ops { + ssize_t (*show)(struct kobject *, struct attribute *,char *); + ssize_t (*store)(struct kobject *,struct attribute *,const char *); +}; + +[ Subsystems should have already defined a struct kobj_type as a +descriptor for this type, which is where the sysfs_ops pointer is +stored. See the kobject documentation for more information. ] + +When a file is read or written, sysfs calls the appropriate method +for the type. The method then translates the generic struct kobject +and struct attribute pointers to the appropriate pointer types, and +calls the associated methods. + + +To illustrate: + +#define to_dev_attr(_attr) container_of(_attr,struct device_attribute,attr) +#define to_dev(d) container_of(d, struct device, kobj) + +static ssize_t +dev_attr_show(struct kobject * kobj, struct attribute * attr, char * buf) +{ + struct device_attribute * dev_attr = to_dev_attr(attr); + struct device * dev = to_dev(kobj); + ssize_t ret = 0; + + if (dev_attr->show) + ret = dev_attr->show(dev,buf); + return ret; +} + + + +Reading/Writing Attribute Data +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To read or write attributes, show() or store() methods must be +specified when declaring the attribute. The method types should be as +simple as those defined for device attributes: + + ssize_t (*show)(struct device * dev, char * buf); + ssize_t (*store)(struct device * dev, const char * buf); + +IOW, they should take only an object and a buffer as parameters. + + +sysfs allocates a buffer of size (PAGE_SIZE) and passes it to the +method. Sysfs will call the method exactly once for each read or +write. This forces the following behavior on the method +implementations: + +- On read(2), the show() method should fill the entire buffer. + Recall that an attribute should only be exporting one value, or an + array of similar values, so this shouldn't be that expensive. + + This allows userspace to do partial reads and seeks arbitrarily over + the entire file at will. + +- On write(2), sysfs expects the entire buffer to be passed during the + first write. Sysfs then passes the entire buffer to the store() + method. + + When writing sysfs files, userspace processes should first read the + entire file, modify the values it wishes to change, then write the + entire buffer back. + + Attribute method implementations should operate on an identical + buffer when reading and writing values. + +Other notes: + +- The buffer will always be PAGE_SIZE bytes in length. On i386, this + is 4096. + +- show() methods should return the number of bytes printed into the + buffer. This is the return value of snprintf(). + +- show() should always use snprintf(). + +- store() should return the number of bytes used from the buffer. This + can be done using strlen(). + +- show() or store() can always return errors. If a bad value comes + through, be sure to return an error. + +- The object passed to the methods will be pinned in memory via sysfs + referencing counting its embedded object. However, the physical + entity (e.g. device) the object represents may not be present. Be + sure to have a way to check this, if necessary. + + +A very simple (and naive) implementation of a device attribute is: + +static ssize_t show_name(struct device * dev, char * buf) +{ + return sprintf(buf,"%s\n",dev->name); +} + +static ssize_t store_name(struct device * dev, const char * buf) +{ + sscanf(buf,"%20s",dev->name); + return strlen(buf); +} + +static DEVICE_ATTR(name,S_IRUGO,show_name,store_name); + + +(Note that the real implementation doesn't allow userspace to set the +name for a device.) + + +Top Level Directory Layout +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The sysfs directory arrangement exposes the relationship of kernel +data structures. + +The top level sysfs diretory looks like: + +block/ +bus/ +class/ +devices/ +firmware/ +net/ + +devices/ contains a filesystem representation of the device tree. It maps +directly to the internal kernel device tree, which is a hierarchy of +struct device. + +bus/ contains flat directory layout of the various bus types in the +kernel. Each bus's directory contains two subdirectories: + + devices/ + drivers/ + +devices/ contains symlinks for each device discovered in the system +that point to the device's directory under root/. + +drivers/ contains a directory for each device driver that is loaded +for devices on that particular bus (this assumes that drivers do not +span multiple bus types). + + +More information can driver-model specific features can be found in +Documentation/driver-model/. + + +TODO: Finish this section. + + +Current Interfaces +~~~~~~~~~~~~~~~~~~ + +The following interface layers currently exist in sysfs: + + +- devices (include/linux/device.h) +---------------------------------- +Structure: + +struct device_attribute { + struct attribute attr; + ssize_t (*show)(struct device * dev, char * buf); + ssize_t (*store)(struct device * dev, const char * buf); +}; + +Declaring: + +DEVICE_ATTR(_name,_str,_mode,_show,_store); + +Creation/Removal: + +int device_create_file(struct device *device, struct device_attribute * attr); +void device_remove_file(struct device * dev, struct device_attribute * attr); + + +- bus drivers (include/linux/device.h) +-------------------------------------- +Structure: + +struct bus_attribute { + struct attribute attr; + ssize_t (*show)(struct bus_type *, char * buf); + ssize_t (*store)(struct bus_type *, const char * buf); +}; + +Declaring: + +BUS_ATTR(_name,_mode,_show,_store) + +Creation/Removal: + +int bus_create_file(struct bus_type *, struct bus_attribute *); +void bus_remove_file(struct bus_type *, struct bus_attribute *); + + +- device drivers (include/linux/device.h) +----------------------------------------- + +Structure: + +struct driver_attribute { + struct attribute attr; + ssize_t (*show)(struct device_driver *, char * buf); + ssize_t (*store)(struct device_driver *, const char * buf); +}; + +Declaring: + +DRIVER_ATTR(_name,_mode,_show,_store) + +Creation/Removal: + +int driver_create_file(struct device_driver *, struct driver_attribute *); +void driver_remove_file(struct device_driver *, struct driver_attribute *); + + diff --git a/Documentation/filesystems/sysv-fs.txt b/Documentation/filesystems/sysv-fs.txt new file mode 100644 index 000000000000..d81722418010 --- /dev/null +++ b/Documentation/filesystems/sysv-fs.txt @@ -0,0 +1,38 @@ +This is the implementation of the SystemV/Coherent filesystem for Linux. +It implements all of + - Xenix FS, + - SystemV/386 FS, + - Coherent FS. + +This is version beta 4. + +To install: +* Answer the 'System V and Coherent filesystem support' question with 'y' + when configuring the kernel. +* To mount a disk or a partition, use + mount [-r] -t sysv device mountpoint + The file system type names + -t sysv + -t xenix + -t coherent + may be used interchangeably, but the last two will eventually disappear. + +Bugs in the present implementation: +- Coherent FS: + - The "free list interleave" n:m is currently ignored. + - Only file systems with no filesystem name and no pack name are recognized. + (See Coherent "man mkfs" for a description of these features.) +- SystemV Release 2 FS: + The superblock is only searched in the blocks 9, 15, 18, which + corresponds to the beginning of track 1 on floppy disks. No support + for this FS on hard disk yet. + + +Please report any bugs and suggestions to + Bruno Haible <haible@ma2s2.mathematik.uni-karlsruhe.de> + Pascal Haible <haible@izfm.uni-stuttgart.de> + Krzysztof G. Baranowski <kgb@manjak.knm.org.pl> + +Bruno Haible +<haible@ma2s2.mathematik.uni-karlsruhe.de> + diff --git a/Documentation/filesystems/tmpfs.txt b/Documentation/filesystems/tmpfs.txt new file mode 100644 index 000000000000..417e3095fe39 --- /dev/null +++ b/Documentation/filesystems/tmpfs.txt @@ -0,0 +1,100 @@ +Tmpfs is a file system which keeps all files in virtual memory. + + +Everything in tmpfs is temporary in the sense that no files will be +created on your hard drive. If you unmount a tmpfs instance, +everything stored therein is lost. + +tmpfs puts everything into the kernel internal caches and grows and +shrinks to accommodate the files it contains and is able to swap +unneeded pages out to swap space. It has maximum size limits which can +be adjusted on the fly via 'mount -o remount ...' + +If you compare it to ramfs (which was the template to create tmpfs) +you gain swapping and limit checking. Another similar thing is the RAM +disk (/dev/ram*), which simulates a fixed size hard disk in physical +RAM, where you have to create an ordinary filesystem on top. Ramdisks +cannot swap and you do not have the possibility to resize them. + +Since tmpfs lives completely in the page cache and on swap, all tmpfs +pages currently in memory will show up as cached. It will not show up +as shared or something like that. Further on you can check the actual +RAM+swap use of a tmpfs instance with df(1) and du(1). + + +tmpfs has the following uses: + +1) There is always a kernel internal mount which you will not see at + all. This is used for shared anonymous mappings and SYSV shared + memory. + + This mount does not depend on CONFIG_TMPFS. If CONFIG_TMPFS is not + set, the user visible part of tmpfs is not build. But the internal + mechanisms are always present. + +2) glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for + POSIX shared memory (shm_open, shm_unlink). Adding the following + line to /etc/fstab should take care of this: + + tmpfs /dev/shm tmpfs defaults 0 0 + + Remember to create the directory that you intend to mount tmpfs on + if necessary (/dev/shm is automagically created if you use devfs). + + This mount is _not_ needed for SYSV shared memory. The internal + mount is used for that. (In the 2.3 kernel versions it was + necessary to mount the predecessor of tmpfs (shm fs) to use SYSV + shared memory) + +3) Some people (including me) find it very convenient to mount it + e.g. on /tmp and /var/tmp and have a big swap partition. And now + loop mounts of tmpfs files do work, so mkinitrd shipped by most + distributions should succeed with a tmpfs /tmp. + +4) And probably a lot more I do not know about :-) + + +tmpfs has three mount options for sizing: + +size: The limit of allocated bytes for this tmpfs instance. The + default is half of your physical RAM without swap. If you + oversize your tmpfs instances the machine will deadlock + since the OOM handler will not be able to free that memory. +nr_blocks: The same as size, but in blocks of PAGE_CACHE_SIZE. +nr_inodes: The maximum number of inodes for this instance. The default + is half of the number of your physical RAM pages, or (on a + a machine with highmem) the number of lowmem RAM pages, + whichever is the lower. + +These parameters accept a suffix k, m or g for kilo, mega and giga and +can be changed on remount. The size parameter also accepts a suffix % +to limit this tmpfs instance to that percentage of your physical RAM: +the default, when neither size nor nr_blocks is specified, is size=50% + +If both nr_blocks (or size) and nr_inodes are set to 0, neither blocks +nor inodes will be limited in that instance. It is generally unwise to +mount with such options, since it allows any user with write access to +use up all the memory on the machine; but enhances the scalability of +that instance in a system with many cpus making intensive use of it. + + +To specify the initial root directory you can use the following mount +options: + +mode: The permissions as an octal number +uid: The user id +gid: The group id + +These options do not have any effect on remount. You can change these +parameters with chmod(1), chown(1) and chgrp(1) on a mounted filesystem. + + +So 'mount -t tmpfs -o size=10G,nr_inodes=10k,mode=700 tmpfs /mytmpfs' +will give you tmpfs instance on /mytmpfs which can allocate 10GB +RAM/SWAP in 10240 inodes and it is only accessible by root. + + +Author: + Christoph Rohland <cr@sap.com>, 1.12.01 +Updated: + Hugh Dickins <hugh@veritas.com>, 01 September 2004 diff --git a/Documentation/filesystems/udf.txt b/Documentation/filesystems/udf.txt new file mode 100644 index 000000000000..e5213bc301f7 --- /dev/null +++ b/Documentation/filesystems/udf.txt @@ -0,0 +1,57 @@ +* +* Documentation/filesystems/udf.txt +* +UDF Filesystem version 0.9.8.1 + +If you encounter problems with reading UDF discs using this driver, +please report them to linux_udf@hpesjro.fc.hp.com, which is the +developer's list. + +Write support requires a block driver which supports writing. The current +scsi and ide cdrom drivers do not support writing. + +------------------------------------------------------------------------------- +The following mount options are supported: + + gid= Set the default group. + umask= Set the default umask. + uid= Set the default user. + bs= Set the block size. + unhide Show otherwise hidden files. + undelete Show deleted files in lists. + adinicb Embed data in the inode (default) + noadinicb Don't embed data in the inode + shortad Use short ad's + longad Use long ad's (default) + nostrict Unset strict conformance + iocharset= Set the NLS character set + +The remaining are for debugging and disaster recovery: + + novrs Skip volume sequence recognition + +The following expect a offset from 0. + + session= Set the CDROM session (default= last session) + anchor= Override standard anchor location. (default= 256) + volume= Override the VolumeDesc location. (unused) + partition= Override the PartitionDesc location. (unused) + lastblock= Set the last block of the filesystem/ + +The following expect a offset from the partition root. + + fileset= Override the fileset block location. (unused) + rootdir= Override the root directory location. (unused) + WARNING: overriding the rootdir to a non-directory may + yield highly unpredictable results. +------------------------------------------------------------------------------- + + +For the latest version and toolset see: + http://linux-udf.sourceforge.net/ + +Documentation on UDF and ECMA 167 is available FREE from: + http://www.osta.org/ + http://www.ecma-international.org/ + +Ben Fennema <bfennema@falcon.csc.calpoly.edu> diff --git a/Documentation/filesystems/ufs.txt b/Documentation/filesystems/ufs.txt new file mode 100644 index 000000000000..2b5a56a6a558 --- /dev/null +++ b/Documentation/filesystems/ufs.txt @@ -0,0 +1,61 @@ +USING UFS +========= + +mount -t ufs -o ufstype=type_of_ufs device dir + + +UFS OPTIONS +=========== + +ufstype=type_of_ufs + UFS is a file system widely used in different operating systems. + The problem are differences among implementations. Features of + some implementations are undocumented, so its hard to recognize + type of ufs automatically. That's why user must specify type of + ufs manually by mount option ufstype. Possible values are: + + old old format of ufs + default value, supported as read-only + + 44bsd used in FreeBSD, NetBSD, OpenBSD + supported as read-write + + ufs2 used in FreeBSD 5.x + supported as read-only + + 5xbsd synonym for ufs2 + + sun used in SunOS (Solaris) + supported as read-write + + sunx86 used in SunOS for Intel (Solarisx86) + supported as read-write + + hp used in HP-UX + supported as read-only + + nextstep + used in NextStep + supported as read-only + + nextstep-cd + used for NextStep CDROMs (block_size == 2048) + supported as read-only + + openstep + used in OpenStep + supported as read-only + + +POSSIBLE PROBLEMS +================= + +There is still bug in reallocation of fragment, in file fs/ufs/balloc.c, +line 364. But it seems working on current buffer cache configuration. + + +BUG REPORTS +=========== + +Any ufs bug report you can send to daniel.pirkl@email.cz (do not send +partition tables bug reports.) diff --git a/Documentation/filesystems/vfat.txt b/Documentation/filesystems/vfat.txt new file mode 100644 index 000000000000..5ead20c6c744 --- /dev/null +++ b/Documentation/filesystems/vfat.txt @@ -0,0 +1,231 @@ +USING VFAT +---------------------------------------------------------------------- +To use the vfat filesystem, use the filesystem type 'vfat'. i.e. + mount -t vfat /dev/fd0 /mnt + +No special partition formatter is required. mkdosfs will work fine +if you want to format from within Linux. + +VFAT MOUNT OPTIONS +---------------------------------------------------------------------- +umask=### -- The permission mask (for files and directories, see umask(1)). + The default is the umask of current process. + +dmask=### -- The permission mask for the directory. + The default is the umask of current process. + +fmask=### -- The permission mask for files. + The default is the umask of current process. + +codepage=### -- Sets the codepage number for converting to shortname + characters on FAT filesystem. + By default, FAT_DEFAULT_CODEPAGE setting is used. + +iocharset=name -- Character set to use for converting between the + encoding is used for user visible filename and 16 bit + Unicode characters. Long filenames are stored on disk + in Unicode format, but Unix for the most part doesn't + know how to deal with Unicode. + By default, FAT_DEFAULT_IOCHARSET setting is used. + + There is also an option of doing UTF8 translations + with the utf8 option. + + NOTE: "iocharset=utf8" is not recommended. If unsure, + you should consider the following option instead. + +utf8=<bool> -- UTF8 is the filesystem safe version of Unicode that + is used by the console. It can be be enabled for the + filesystem with this option. If 'uni_xlate' gets set, + UTF8 gets disabled. + +uni_xlate=<bool> -- Translate unhandled Unicode characters to special + escaped sequences. This would let you backup and + restore filenames that are created with any Unicode + characters. Until Linux supports Unicode for real, + this gives you an alternative. Without this option, + a '?' is used when no translation is possible. The + escape character is ':' because it is otherwise + illegal on the vfat filesystem. The escape sequence + that gets used is ':' and the four digits of hexadecimal + unicode. + +nonumtail=<bool> -- When creating 8.3 aliases, normally the alias will + end in '~1' or tilde followed by some number. If this + option is set, then if the filename is + "longfilename.txt" and "longfile.txt" does not + currently exist in the directory, 'longfile.txt' will + be the short alias instead of 'longfi~1.txt'. + +quiet -- Stops printing certain warning messages. + +check=s|r|n -- Case sensitivity checking setting. + s: strict, case sensitive + r: relaxed, case insensitive + n: normal, default setting, currently case insensitive + +shortname=lower|win95|winnt|mixed + -- Shortname display/create setting. + lower: convert to lowercase for display, + emulate the Windows 95 rule for create. + win95: emulate the Windows 95 rule for display/create. + winnt: emulate the Windows NT rule for display/create. + mixed: emulate the Windows NT rule for display, + emulate the Windows 95 rule for create. + Default setting is `lower'. + +<bool>: 0,1,yes,no,true,false + +TODO +---------------------------------------------------------------------- +* Need to get rid of the raw scanning stuff. Instead, always use + a get next directory entry approach. The only thing left that uses + raw scanning is the directory renaming code. + + +POSSIBLE PROBLEMS +---------------------------------------------------------------------- +* vfat_valid_longname does not properly checked reserved names. +* When a volume name is the same as a directory name in the root + directory of the filesystem, the directory name sometimes shows + up as an empty file. +* autoconv option does not work correctly. + +BUG REPORTS +---------------------------------------------------------------------- +If you have trouble with the VFAT filesystem, mail bug reports to +chaffee@bmrc.cs.berkeley.edu. Please specify the filename +and the operation that gave you trouble. + +TEST SUITE +---------------------------------------------------------------------- +If you plan to make any modifications to the vfat filesystem, please +get the test suite that comes with the vfat distribution at + + http://bmrc.berkeley.edu/people/chaffee/vfat.html + +This tests quite a few parts of the vfat filesystem and additional +tests for new features or untested features would be appreciated. + +NOTES ON THE STRUCTURE OF THE VFAT FILESYSTEM +---------------------------------------------------------------------- +(This documentation was provided by Galen C. Hunt <gchunt@cs.rochester.edu> + and lightly annotated by Gordon Chaffee). + +This document presents a very rough, technical overview of my +knowledge of the extended FAT file system used in Windows NT 3.5 and +Windows 95. I don't guarantee that any of the following is correct, +but it appears to be so. + +The extended FAT file system is almost identical to the FAT +file system used in DOS versions up to and including 6.223410239847 +:-). The significant change has been the addition of long file names. +These names support up to 255 characters including spaces and lower +case characters as opposed to the traditional 8.3 short names. + +Here is the description of the traditional FAT entry in the current +Windows 95 filesystem: + + struct directory { // Short 8.3 names + unsigned char name[8]; // file name + unsigned char ext[3]; // file extension + unsigned char attr; // attribute byte + unsigned char lcase; // Case for base and extension + unsigned char ctime_ms; // Creation time, milliseconds + unsigned char ctime[2]; // Creation time + unsigned char cdate[2]; // Creation date + unsigned char adate[2]; // Last access date + unsigned char reserved[2]; // reserved values (ignored) + unsigned char time[2]; // time stamp + unsigned char date[2]; // date stamp + unsigned char start[2]; // starting cluster number + unsigned char size[4]; // size of the file + }; + +The lcase field specifies if the base and/or the extension of an 8.3 +name should be capitalized. This field does not seem to be used by +Windows 95 but it is used by Windows NT. The case of filenames is not +completely compatible from Windows NT to Windows 95. It is not completely +compatible in the reverse direction, however. Filenames that fit in +the 8.3 namespace and are written on Windows NT to be lowercase will +show up as uppercase on Windows 95. + +Note that the "start" and "size" values are actually little +endian integer values. The descriptions of the fields in this +structure are public knowledge and can be found elsewhere. + +With the extended FAT system, Microsoft has inserted extra +directory entries for any files with extended names. (Any name which +legally fits within the old 8.3 encoding scheme does not have extra +entries.) I call these extra entries slots. Basically, a slot is a +specially formatted directory entry which holds up to 13 characters of +a file's extended name. Think of slots as additional labeling for the +directory entry of the file to which they correspond. Microsoft +prefers to refer to the 8.3 entry for a file as its alias and the +extended slot directory entries as the file name. + +The C structure for a slot directory entry follows: + + struct slot { // Up to 13 characters of a long name + unsigned char id; // sequence number for slot + unsigned char name0_4[10]; // first 5 characters in name + unsigned char attr; // attribute byte + unsigned char reserved; // always 0 + unsigned char alias_checksum; // checksum for 8.3 alias + unsigned char name5_10[12]; // 6 more characters in name + unsigned char start[2]; // starting cluster number + unsigned char name11_12[4]; // last 2 characters in name + }; + +If the layout of the slots looks a little odd, it's only +because of Microsoft's efforts to maintain compatibility with old +software. The slots must be disguised to prevent old software from +panicking. To this end, a number of measures are taken: + + 1) The attribute byte for a slot directory entry is always set + to 0x0f. This corresponds to an old directory entry with + attributes of "hidden", "system", "read-only", and "volume + label". Most old software will ignore any directory + entries with the "volume label" bit set. Real volume label + entries don't have the other three bits set. + + 2) The starting cluster is always set to 0, an impossible + value for a DOS file. + +Because the extended FAT system is backward compatible, it is +possible for old software to modify directory entries. Measures must +be taken to ensure the validity of slots. An extended FAT system can +verify that a slot does in fact belong to an 8.3 directory entry by +the following: + + 1) Positioning. Slots for a file always immediately proceed + their corresponding 8.3 directory entry. In addition, each + slot has an id which marks its order in the extended file + name. Here is a very abbreviated view of an 8.3 directory + entry and its corresponding long name slots for the file + "My Big File.Extension which is long": + + <proceeding files...> + <slot #3, id = 0x43, characters = "h is long"> + <slot #2, id = 0x02, characters = "xtension whic"> + <slot #1, id = 0x01, characters = "My Big File.E"> + <directory entry, name = "MYBIGFIL.EXT"> + + Note that the slots are stored from last to first. Slots + are numbered from 1 to N. The Nth slot is or'ed with 0x40 + to mark it as the last one. + + 2) Checksum. Each slot has an "alias_checksum" value. The + checksum is calculated from the 8.3 name using the + following algorithm: + + for (sum = i = 0; i < 11; i++) { + sum = (((sum&1)<<7)|((sum&0xfe)>>1)) + name[i] + } + + 3) If there is free space in the final slot, a Unicode NULL (0x0000) + is stored after the final character. After that, all unused + characters in the final slot are set to Unicode 0xFFFF. + +Finally, note that the extended name is stored in Unicode. Each Unicode +character takes two bytes. diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt new file mode 100644 index 000000000000..3f318dd44c77 --- /dev/null +++ b/Documentation/filesystems/vfs.txt @@ -0,0 +1,671 @@ +/* -*- auto-fill -*- */ + + Overview of the Virtual File System + + Richard Gooch <rgooch@atnf.csiro.au> + + 5-JUL-1999 + + +Conventions used in this document <section> +================================= + +Each section in this document will have the string "<section>" at the +right-hand side of the section title. Each subsection will have +"<subsection>" at the right-hand side. These strings are meant to make +it easier to search through the document. + +NOTE that the master copy of this document is available online at: +http://www.atnf.csiro.au/~rgooch/linux/docs/vfs.txt + + +What is it? <section> +=========== + +The Virtual File System (otherwise known as the Virtual Filesystem +Switch) is the software layer in the kernel that provides the +filesystem interface to userspace programs. It also provides an +abstraction within the kernel which allows different filesystem +implementations to co-exist. + + +A Quick Look At How It Works <section> +============================ + +In this section I'll briefly describe how things work, before +launching into the details. I'll start with describing what happens +when user programs open and manipulate files, and then look from the +other view which is how a filesystem is supported and subsequently +mounted. + +Opening a File <subsection> +-------------- + +The VFS implements the open(2), stat(2), chmod(2) and similar system +calls. The pathname argument is used by the VFS to search through the +directory entry cache (dentry cache or "dcache"). This provides a very +fast look-up mechanism to translate a pathname (filename) into a +specific dentry. + +An individual dentry usually has a pointer to an inode. Inodes are the +things that live on disc drives, and can be regular files (you know: +those things that you write data into), directories, FIFOs and other +beasts. Dentries live in RAM and are never saved to disc: they exist +only for performance. Inodes live on disc and are copied into memory +when required. Later any changes are written back to disc. The inode +that lives in RAM is a VFS inode, and it is this which the dentry +points to. A single inode can be pointed to by multiple dentries +(think about hardlinks). + +The dcache is meant to be a view into your entire filespace. Unlike +Linus, most of us losers can't fit enough dentries into RAM to cover +all of our filespace, so the dcache has bits missing. In order to +resolve your pathname into a dentry, the VFS may have to resort to +creating dentries along the way, and then loading the inode. This is +done by looking up the inode. + +To look up an inode (usually read from disc) requires that the VFS +calls the lookup() method of the parent directory inode. This method +is installed by the specific filesystem implementation that the inode +lives in. There will be more on this later. + +Once the VFS has the required dentry (and hence the inode), we can do +all those boring things like open(2) the file, or stat(2) it to peek +at the inode data. The stat(2) operation is fairly simple: once the +VFS has the dentry, it peeks at the inode data and passes some of it +back to userspace. + +Opening a file requires another operation: allocation of a file +structure (this is the kernel-side implementation of file +descriptors). The freshly allocated file structure is initialised with +a pointer to the dentry and a set of file operation member functions. +These are taken from the inode data. The open() file method is then +called so the specific filesystem implementation can do it's work. You +can see that this is another switch performed by the VFS. + +The file structure is placed into the file descriptor table for the +process. + +Reading, writing and closing files (and other assorted VFS operations) +is done by using the userspace file descriptor to grab the appropriate +file structure, and then calling the required file structure method +function to do whatever is required. + +For as long as the file is open, it keeps the dentry "open" (in use), +which in turn means that the VFS inode is still in use. + +All VFS system calls (i.e. open(2), stat(2), read(2), write(2), +chmod(2) and so on) are called from a process context. You should +assume that these calls are made without any kernel locks being +held. This means that the processes may be executing the same piece of +filesystem or driver code at the same time, on different +processors. You should ensure that access to shared resources is +protected by appropriate locks. + +Registering and Mounting a Filesystem <subsection> +------------------------------------- + +If you want to support a new kind of filesystem in the kernel, all you +need to do is call register_filesystem(). You pass a structure +describing the filesystem implementation (struct file_system_type) +which is then added to an internal table of supported filesystems. You +can do: + +% cat /proc/filesystems + +to see what filesystems are currently available on your system. + +When a request is made to mount a block device onto a directory in +your filespace the VFS will call the appropriate method for the +specific filesystem. The dentry for the mount point will then be +updated to point to the root inode for the new filesystem. + +It's now time to look at things in more detail. + + +struct file_system_type <section> +======================= + +This describes the filesystem. As of kernel 2.1.99, the following +members are defined: + +struct file_system_type { + const char *name; + int fs_flags; + struct super_block *(*read_super) (struct super_block *, void *, int); + struct file_system_type * next; +}; + + name: the name of the filesystem type, such as "ext2", "iso9660", + "msdos" and so on + + fs_flags: various flags (i.e. FS_REQUIRES_DEV, FS_NO_DCACHE, etc.) + + read_super: the method to call when a new instance of this + filesystem should be mounted + + next: for internal VFS use: you should initialise this to NULL + +The read_super() method has the following arguments: + + struct super_block *sb: the superblock structure. This is partially + initialised by the VFS and the rest must be initialised by the + read_super() method + + void *data: arbitrary mount options, usually comes as an ASCII + string + + int silent: whether or not to be silent on error + +The read_super() method must determine if the block device specified +in the superblock contains a filesystem of the type the method +supports. On success the method returns the superblock pointer, on +failure it returns NULL. + +The most interesting member of the superblock structure that the +read_super() method fills in is the "s_op" field. This is a pointer to +a "struct super_operations" which describes the next level of the +filesystem implementation. + + +struct super_operations <section> +======================= + +This describes how the VFS can manipulate the superblock of your +filesystem. As of kernel 2.1.99, the following members are defined: + +struct super_operations { + void (*read_inode) (struct inode *); + int (*write_inode) (struct inode *, int); + void (*put_inode) (struct inode *); + void (*drop_inode) (struct inode *); + void (*delete_inode) (struct inode *); + int (*notify_change) (struct dentry *, struct iattr *); + void (*put_super) (struct super_block *); + void (*write_super) (struct super_block *); + int (*statfs) (struct super_block *, struct statfs *, int); + int (*remount_fs) (struct super_block *, int *, char *); + void (*clear_inode) (struct inode *); +}; + +All methods are called without any locks being held, unless otherwise +noted. This means that most methods can block safely. All methods are +only called from a process context (i.e. not from an interrupt handler +or bottom half). + + read_inode: this method is called to read a specific inode from the + mounted filesystem. The "i_ino" member in the "struct inode" + will be initialised by the VFS to indicate which inode to + read. Other members are filled in by this method + + write_inode: this method is called when the VFS needs to write an + inode to disc. The second parameter indicates whether the write + should be synchronous or not, not all filesystems check this flag. + + put_inode: called when the VFS inode is removed from the inode + cache. This method is optional + + drop_inode: called when the last access to the inode is dropped, + with the inode_lock spinlock held. + + This method should be either NULL (normal unix filesystem + semantics) or "generic_delete_inode" (for filesystems that do not + want to cache inodes - causing "delete_inode" to always be + called regardless of the value of i_nlink) + + The "generic_delete_inode()" behaviour is equivalent to the + old practice of using "force_delete" in the put_inode() case, + but does not have the races that the "force_delete()" approach + had. + + delete_inode: called when the VFS wants to delete an inode + + notify_change: called when VFS inode attributes are changed. If this + is NULL the VFS falls back to the write_inode() method. This + is called with the kernel lock held + + put_super: called when the VFS wishes to free the superblock + (i.e. unmount). This is called with the superblock lock held + + write_super: called when the VFS superblock needs to be written to + disc. This method is optional + + statfs: called when the VFS needs to get filesystem statistics. This + is called with the kernel lock held + + remount_fs: called when the filesystem is remounted. This is called + with the kernel lock held + + clear_inode: called then the VFS clears the inode. Optional + +The read_inode() method is responsible for filling in the "i_op" +field. This is a pointer to a "struct inode_operations" which +describes the methods that can be performed on individual inodes. + + +struct inode_operations <section> +======================= + +This describes how the VFS can manipulate an inode in your +filesystem. As of kernel 2.1.99, the following members are defined: + +struct inode_operations { + struct file_operations * default_file_ops; + int (*create) (struct inode *,struct dentry *,int); + int (*lookup) (struct inode *,struct dentry *); + int (*link) (struct dentry *,struct inode *,struct dentry *); + int (*unlink) (struct inode *,struct dentry *); + int (*symlink) (struct inode *,struct dentry *,const char *); + int (*mkdir) (struct inode *,struct dentry *,int); + int (*rmdir) (struct inode *,struct dentry *); + int (*mknod) (struct inode *,struct dentry *,int,dev_t); + int (*rename) (struct inode *, struct dentry *, + struct inode *, struct dentry *); + int (*readlink) (struct dentry *, char *,int); + struct dentry * (*follow_link) (struct dentry *, struct dentry *); + int (*readpage) (struct file *, struct page *); + int (*writepage) (struct page *page, struct writeback_control *wbc); + int (*bmap) (struct inode *,int); + void (*truncate) (struct inode *); + int (*permission) (struct inode *, int); + int (*smap) (struct inode *,int); + int (*updatepage) (struct file *, struct page *, const char *, + unsigned long, unsigned int, int); + int (*revalidate) (struct dentry *); +}; + +Again, all methods are called without any locks being held, unless +otherwise noted. + + default_file_ops: this is a pointer to a "struct file_operations" + which describes how to open and then manipulate open files + + create: called by the open(2) and creat(2) system calls. Only + required if you want to support regular files. The dentry you + get should not have an inode (i.e. it should be a negative + dentry). Here you will probably call d_instantiate() with the + dentry and the newly created inode + + lookup: called when the VFS needs to look up an inode in a parent + directory. The name to look for is found in the dentry. This + method must call d_add() to insert the found inode into the + dentry. The "i_count" field in the inode structure should be + incremented. If the named inode does not exist a NULL inode + should be inserted into the dentry (this is called a negative + dentry). Returning an error code from this routine must only + be done on a real error, otherwise creating inodes with system + calls like create(2), mknod(2), mkdir(2) and so on will fail. + If you wish to overload the dentry methods then you should + initialise the "d_dop" field in the dentry; this is a pointer + to a struct "dentry_operations". + This method is called with the directory inode semaphore held + + link: called by the link(2) system call. Only required if you want + to support hard links. You will probably need to call + d_instantiate() just as you would in the create() method + + unlink: called by the unlink(2) system call. Only required if you + want to support deleting inodes + + symlink: called by the symlink(2) system call. Only required if you + want to support symlinks. You will probably need to call + d_instantiate() just as you would in the create() method + + mkdir: called by the mkdir(2) system call. Only required if you want + to support creating subdirectories. You will probably need to + call d_instantiate() just as you would in the create() method + + rmdir: called by the rmdir(2) system call. Only required if you want + to support deleting subdirectories + + mknod: called by the mknod(2) system call to create a device (char, + block) inode or a named pipe (FIFO) or socket. Only required + if you want to support creating these types of inodes. You + will probably need to call d_instantiate() just as you would + in the create() method + + readlink: called by the readlink(2) system call. Only required if + you want to support reading symbolic links + + follow_link: called by the VFS to follow a symbolic link to the + inode it points to. Only required if you want to support + symbolic links + + +struct file_operations <section> +====================== + +This describes how the VFS can manipulate an open file. As of kernel +2.1.99, the following members are defined: + +struct file_operations { + loff_t (*llseek) (struct file *, loff_t, int); + ssize_t (*read) (struct file *, char *, size_t, loff_t *); + ssize_t (*write) (struct file *, const char *, size_t, loff_t *); + int (*readdir) (struct file *, void *, filldir_t); + unsigned int (*poll) (struct file *, struct poll_table_struct *); + int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long); + int (*mmap) (struct file *, struct vm_area_struct *); + int (*open) (struct inode *, struct file *); + int (*release) (struct inode *, struct file *); + int (*fsync) (struct file *, struct dentry *); + int (*fasync) (struct file *, int); + int (*check_media_change) (kdev_t dev); + int (*revalidate) (kdev_t dev); + int (*lock) (struct file *, int, struct file_lock *); +}; + +Again, all methods are called without any locks being held, unless +otherwise noted. + + llseek: called when the VFS needs to move the file position index + + read: called by read(2) and related system calls + + write: called by write(2) and related system calls + + readdir: called when the VFS needs to read the directory contents + + poll: called by the VFS when a process wants to check if there is + activity on this file and (optionally) go to sleep until there + is activity. Called by the select(2) and poll(2) system calls + + ioctl: called by the ioctl(2) system call + + mmap: called by the mmap(2) system call + + open: called by the VFS when an inode should be opened. When the VFS + opens a file, it creates a new "struct file" and initialises + the "f_op" file operations member with the "default_file_ops" + field in the inode structure. It then calls the open method + for the newly allocated file structure. You might think that + the open method really belongs in "struct inode_operations", + and you may be right. I think it's done the way it is because + it makes filesystems simpler to implement. The open() method + is a good place to initialise the "private_data" member in the + file structure if you want to point to a device structure + + release: called when the last reference to an open file is closed + + fsync: called by the fsync(2) system call + + fasync: called by the fcntl(2) system call when asynchronous + (non-blocking) mode is enabled for a file + +Note that the file operations are implemented by the specific +filesystem in which the inode resides. When opening a device node +(character or block special) most filesystems will call special +support routines in the VFS which will locate the required device +driver information. These support routines replace the filesystem file +operations with those for the device driver, and then proceed to call +the new open() method for the file. This is how opening a device file +in the filesystem eventually ends up calling the device driver open() +method. Note the devfs (the Device FileSystem) has a more direct path +from device node to device driver (this is an unofficial kernel +patch). + + +Directory Entry Cache (dcache) <section> +------------------------------ + +struct dentry_operations +======================== + +This describes how a filesystem can overload the standard dentry +operations. Dentries and the dcache are the domain of the VFS and the +individual filesystem implementations. Device drivers have no business +here. These methods may be set to NULL, as they are either optional or +the VFS uses a default. As of kernel 2.1.99, the following members are +defined: + +struct dentry_operations { + int (*d_revalidate)(struct dentry *); + int (*d_hash) (struct dentry *, struct qstr *); + int (*d_compare) (struct dentry *, struct qstr *, struct qstr *); + void (*d_delete)(struct dentry *); + void (*d_release)(struct dentry *); + void (*d_iput)(struct dentry *, struct inode *); +}; + + d_revalidate: called when the VFS needs to revalidate a dentry. This + is called whenever a name look-up finds a dentry in the + dcache. Most filesystems leave this as NULL, because all their + dentries in the dcache are valid + + d_hash: called when the VFS adds a dentry to the hash table + + d_compare: called when a dentry should be compared with another + + d_delete: called when the last reference to a dentry is + deleted. This means no-one is using the dentry, however it is + still valid and in the dcache + + d_release: called when a dentry is really deallocated + + d_iput: called when a dentry loses its inode (just prior to its + being deallocated). The default when this is NULL is that the + VFS calls iput(). If you define this method, you must call + iput() yourself + +Each dentry has a pointer to its parent dentry, as well as a hash list +of child dentries. Child dentries are basically like files in a +directory. + +Directory Entry Cache APIs +-------------------------- + +There are a number of functions defined which permit a filesystem to +manipulate dentries: + + dget: open a new handle for an existing dentry (this just increments + the usage count) + + dput: close a handle for a dentry (decrements the usage count). If + the usage count drops to 0, the "d_delete" method is called + and the dentry is placed on the unused list if the dentry is + still in its parents hash list. Putting the dentry on the + unused list just means that if the system needs some RAM, it + goes through the unused list of dentries and deallocates them. + If the dentry has already been unhashed and the usage count + drops to 0, in this case the dentry is deallocated after the + "d_delete" method is called + + d_drop: this unhashes a dentry from its parents hash list. A + subsequent call to dput() will dellocate the dentry if its + usage count drops to 0 + + d_delete: delete a dentry. If there are no other open references to + the dentry then the dentry is turned into a negative dentry + (the d_iput() method is called). If there are other + references, then d_drop() is called instead + + d_add: add a dentry to its parents hash list and then calls + d_instantiate() + + d_instantiate: add a dentry to the alias hash list for the inode and + updates the "d_inode" member. The "i_count" member in the + inode structure should be set/incremented. If the inode + pointer is NULL, the dentry is called a "negative + dentry". This function is commonly called when an inode is + created for an existing negative dentry + + d_lookup: look up a dentry given its parent and path name component + It looks up the child of that given name from the dcache + hash table. If it is found, the reference count is incremented + and the dentry is returned. The caller must use d_put() + to free the dentry when it finishes using it. + + +RCU-based dcache locking model +------------------------------ + +On many workloads, the most common operation on dcache is +to look up a dentry, given a parent dentry and the name +of the child. Typically, for every open(), stat() etc., +the dentry corresponding to the pathname will be looked +up by walking the tree starting with the first component +of the pathname and using that dentry along with the next +component to look up the next level and so on. Since it +is a frequent operation for workloads like multiuser +environments and webservers, it is important to optimize +this path. + +Prior to 2.5.10, dcache_lock was acquired in d_lookup and thus +in every component during path look-up. Since 2.5.10 onwards, +fastwalk algorithm changed this by holding the dcache_lock +at the beginning and walking as many cached path component +dentries as possible. This signficantly decreases the number +of acquisition of dcache_lock. However it also increases the +lock hold time signficantly and affects performance in large +SMP machines. Since 2.5.62 kernel, dcache has been using +a new locking model that uses RCU to make dcache look-up +lock-free. + +The current dcache locking model is not very different from the existing +dcache locking model. Prior to 2.5.62 kernel, dcache_lock +protected the hash chain, d_child, d_alias, d_lru lists as well +as d_inode and several other things like mount look-up. RCU-based +changes affect only the way the hash chain is protected. For everything +else the dcache_lock must be taken for both traversing as well as +updating. The hash chain updations too take the dcache_lock. +The significant change is the way d_lookup traverses the hash chain, +it doesn't acquire the dcache_lock for this and rely on RCU to +ensure that the dentry has not been *freed*. + + +Dcache locking details +---------------------- +For many multi-user workloads, open() and stat() on files are +very frequently occurring operations. Both involve walking +of path names to find the dentry corresponding to the +concerned file. In 2.4 kernel, dcache_lock was held +during look-up of each path component. Contention and +cacheline bouncing of this global lock caused significant +scalability problems. With the introduction of RCU +in linux kernel, this was worked around by making +the look-up of path components during path walking lock-free. + + +Safe lock-free look-up of dcache hash table +=========================================== + +Dcache is a complex data structure with the hash table entries +also linked together in other lists. In 2.4 kernel, dcache_lock +protected all the lists. We applied RCU only on hash chain +walking. The rest of the lists are still protected by dcache_lock. +Some of the important changes are : + +1. The deletion from hash chain is done using hlist_del_rcu() macro which + doesn't initialize next pointer of the deleted dentry and this + allows us to walk safely lock-free while a deletion is happening. + +2. Insertion of a dentry into the hash table is done using + hlist_add_head_rcu() which take care of ordering the writes - + the writes to the dentry must be visible before the dentry + is inserted. This works in conjuction with hlist_for_each_rcu() + while walking the hash chain. The only requirement is that + all initialization to the dentry must be done before hlist_add_head_rcu() + since we don't have dcache_lock protection while traversing + the hash chain. This isn't different from the existing code. + +3. The dentry looked up without holding dcache_lock by cannot be + returned for walking if it is unhashed. It then may have a NULL + d_inode or other bogosity since RCU doesn't protect the other + fields in the dentry. We therefore use a flag DCACHE_UNHASHED to + indicate unhashed dentries and use this in conjunction with a + per-dentry lock (d_lock). Once looked up without the dcache_lock, + we acquire the per-dentry lock (d_lock) and check if the + dentry is unhashed. If so, the look-up is failed. If not, the + reference count of the dentry is increased and the dentry is returned. + +4. Once a dentry is looked up, it must be ensured during the path + walk for that component it doesn't go away. In pre-2.5.10 code, + this was done holding a reference to the dentry. dcache_rcu does + the same. In some sense, dcache_rcu path walking looks like + the pre-2.5.10 version. + +5. All dentry hash chain updations must take the dcache_lock as well as + the per-dentry lock in that order. dput() does this to ensure + that a dentry that has just been looked up in another CPU + doesn't get deleted before dget() can be done on it. + +6. There are several ways to do reference counting of RCU protected + objects. One such example is in ipv4 route cache where + deferred freeing (using call_rcu()) is done as soon as + the reference count goes to zero. This cannot be done in + the case of dentries because tearing down of dentries + require blocking (dentry_iput()) which isn't supported from + RCU callbacks. Instead, tearing down of dentries happen + synchronously in dput(), but actual freeing happens later + when RCU grace period is over. This allows safe lock-free + walking of the hash chains, but a matched dentry may have + been partially torn down. The checking of DCACHE_UNHASHED + flag with d_lock held detects such dentries and prevents + them from being returned from look-up. + + +Maintaining POSIX rename semantics +================================== + +Since look-up of dentries is lock-free, it can race against +a concurrent rename operation. For example, during rename +of file A to B, look-up of either A or B must succeed. +So, if look-up of B happens after A has been removed from the +hash chain but not added to the new hash chain, it may fail. +Also, a comparison while the name is being written concurrently +by a rename may result in false positive matches violating +rename semantics. Issues related to race with rename are +handled as described below : + +1. Look-up can be done in two ways - d_lookup() which is safe + from simultaneous renames and __d_lookup() which is not. + If __d_lookup() fails, it must be followed up by a d_lookup() + to correctly determine whether a dentry is in the hash table + or not. d_lookup() protects look-ups using a sequence + lock (rename_lock). + +2. The name associated with a dentry (d_name) may be changed if + a rename is allowed to happen simultaneously. To avoid memcmp() + in __d_lookup() go out of bounds due to a rename and false + positive comparison, the name comparison is done while holding the + per-dentry lock. This prevents concurrent renames during this + operation. + +3. Hash table walking during look-up may move to a different bucket as + the current dentry is moved to a different bucket due to rename. + But we use hlists in dcache hash table and they are null-terminated. + So, even if a dentry moves to a different bucket, hash chain + walk will terminate. [with a list_head list, it may not since + termination is when the list_head in the original bucket is reached]. + Since we redo the d_parent check and compare name while holding + d_lock, lock-free look-up will not race against d_move(). + +4. There can be a theoritical race when a dentry keeps coming back + to original bucket due to double moves. Due to this look-up may + consider that it has never moved and can end up in a infinite loop. + But this is not any worse that theoritical livelocks we already + have in the kernel. + + +Important guidelines for filesystem developers related to dcache_rcu +==================================================================== + +1. Existing dcache interfaces (pre-2.5.62) exported to filesystem + don't change. Only dcache internal implementation changes. However + filesystems *must not* delete from the dentry hash chains directly + using the list macros like allowed earlier. They must use dcache + APIs like d_drop() or __d_drop() depending on the situation. + +2. d_flags is now protected by a per-dentry lock (d_lock). All + access to d_flags must be protected by it. + +3. For a hashed dentry, checking of d_count needs to be protected + by d_lock. + + +Papers and other documentation on dcache locking +================================================ + +1. Scaling dcache with RCU (http://linuxjournal.com/article.php?sid=7124). + +2. http://lse.sourceforge.net/locking/dcache/dcache.html diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt new file mode 100644 index 000000000000..c7d5d0c7067d --- /dev/null +++ b/Documentation/filesystems/xfs.txt @@ -0,0 +1,188 @@ + +The SGI XFS Filesystem +====================== + +XFS is a high performance journaling filesystem which originated +on the SGI IRIX platform. It is completely multi-threaded, can +support large files and large filesystems, extended attributes, +variable block sizes, is extent based, and makes extensive use of +Btrees (directories, extents, free space) to aid both performance +and scalability. + +Refer to the documentation at http://oss.sgi.com/projects/xfs/ +for further details. This implementation is on-disk compatible +with the IRIX version of XFS. + + +Mount Options +============= + +When mounting an XFS filesystem, the following options are accepted. + + biosize=size + Sets the preferred buffered I/O size (default size is 64K). + "size" must be expressed as the logarithm (base2) of the + desired I/O size. + Valid values for this option are 14 through 16, inclusive + (i.e. 16K, 32K, and 64K bytes). On machines with a 4K + pagesize, 13 (8K bytes) is also a valid size. + The preferred buffered I/O size can also be altered on an + individual file basis using the ioctl(2) system call. + + ikeep/noikeep + When inode clusters are emptied of inodes, keep them around + on the disk (ikeep) - this is the traditional XFS behaviour + and is still the default for now. Using the noikeep option, + inode clusters are returned to the free space pool. + + logbufs=value + Set the number of in-memory log buffers. Valid numbers range + from 2-8 inclusive. + The default value is 8 buffers for filesystems with a + blocksize of 64K, 4 buffers for filesystems with a blocksize + of 32K, 3 buffers for filesystems with a blocksize of 16K + and 2 buffers for all other configurations. Increasing the + number of buffers may increase performance on some workloads + at the cost of the memory used for the additional log buffers + and their associated control structures. + + logbsize=value + Set the size of each in-memory log buffer. + Size may be specified in bytes, or in kilobytes with a "k" suffix. + Valid sizes for version 1 and version 2 logs are 16384 (16k) and + 32768 (32k). Valid sizes for version 2 logs also include + 65536 (64k), 131072 (128k) and 262144 (256k). + The default value for machines with more than 32MB of memory + is 32768, machines with less memory use 16384 by default. + + logdev=device and rtdev=device + Use an external log (metadata journal) and/or real-time device. + An XFS filesystem has up to three parts: a data section, a log + section, and a real-time section. The real-time section is + optional, and the log section can be separate from the data + section or contained within it. + + noalign + Data allocations will not be aligned at stripe unit boundaries. + + noatime + Access timestamps are not updated when a file is read. + + norecovery + The filesystem will be mounted without running log recovery. + If the filesystem was not cleanly unmounted, it is likely to + be inconsistent when mounted in "norecovery" mode. + Some files or directories may not be accessible because of this. + Filesystems mounted "norecovery" must be mounted read-only or + the mount will fail. + + nouuid + Don't check for double mounted file systems using the file system uuid. + This is useful to mount LVM snapshot volumes. + + osyncisosync + Make O_SYNC writes implement true O_SYNC. WITHOUT this option, + Linux XFS behaves as if an "osyncisdsync" option is used, + which will make writes to files opened with the O_SYNC flag set + behave as if the O_DSYNC flag had been used instead. + This can result in better performance without compromising + data safety. + However if this option is not in effect, timestamp updates from + O_SYNC writes can be lost if the system crashes. + If timestamp updates are critical, use the osyncisosync option. + + quota/usrquota/uqnoenforce + User disk quota accounting enabled, and limits (optionally) + enforced. + + grpquota/gqnoenforce + Group disk quota accounting enabled and limits (optionally) + enforced. + + sunit=value and swidth=value + Used to specify the stripe unit and width for a RAID device or + a stripe volume. "value" must be specified in 512-byte block + units. + If this option is not specified and the filesystem was made on + a stripe volume or the stripe width or unit were specified for + the RAID device at mkfs time, then the mount system call will + restore the value from the superblock. For filesystems that + are made directly on RAID devices, these options can be used + to override the information in the superblock if the underlying + disk layout changes after the filesystem has been created. + The "swidth" option is required if the "sunit" option has been + specified, and must be a multiple of the "sunit" value. + +sysctls +======= + +The following sysctls are available for the XFS filesystem: + + fs.xfs.stats_clear (Min: 0 Default: 0 Max: 1) + Setting this to "1" clears accumulated XFS statistics + in /proc/fs/xfs/stat. It then immediately resets to "0". + + fs.xfs.xfssyncd_centisecs (Min: 100 Default: 3000 Max: 720000) + The interval at which the xfssyncd thread flushes metadata + out to disk. This thread will flush log activity out, and + do some processing on unlinked inodes. + + fs.xfs.xfsbufd_centisecs (Min: 50 Default: 100 Max: 3000) + The interval at which xfsbufd scans the dirty metadata buffers list. + + fs.xfs.age_buffer_centisecs (Min: 100 Default: 1500 Max: 720000) + The age at which xfsbufd flushes dirty metadata buffers to disk. + + fs.xfs.error_level (Min: 0 Default: 3 Max: 11) + A volume knob for error reporting when internal errors occur. + This will generate detailed messages & backtraces for filesystem + shutdowns, for example. Current threshold values are: + + XFS_ERRLEVEL_OFF: 0 + XFS_ERRLEVEL_LOW: 1 + XFS_ERRLEVEL_HIGH: 5 + + fs.xfs.panic_mask (Min: 0 Default: 0 Max: 127) + Causes certain error conditions to call BUG(). Value is a bitmask; + AND together the tags which represent errors which should cause panics: + + XFS_NO_PTAG 0 + XFS_PTAG_IFLUSH 0x00000001 + XFS_PTAG_LOGRES 0x00000002 + XFS_PTAG_AILDELETE 0x00000004 + XFS_PTAG_ERROR_REPORT 0x00000008 + XFS_PTAG_SHUTDOWN_CORRUPT 0x00000010 + XFS_PTAG_SHUTDOWN_IOERROR 0x00000020 + XFS_PTAG_SHUTDOWN_LOGERROR 0x00000040 + + This option is intended for debugging only. + + fs.xfs.irix_symlink_mode (Min: 0 Default: 0 Max: 1) + Controls whether symlinks are created with mode 0777 (default) + or whether their mode is affected by the umask (irix mode). + + fs.xfs.irix_sgid_inherit (Min: 0 Default: 0 Max: 1) + Controls files created in SGID directories. + If the group ID of the new file does not match the effective group + ID or one of the supplementary group IDs of the parent dir, the + ISGID bit is cleared if the irix_sgid_inherit compatibility sysctl + is set. + + fs.xfs.restrict_chown (Min: 0 Default: 1 Max: 1) + Controls whether unprivileged users can use chown to "give away" + a file to another user. + + fs.xfs.inherit_sync (Min: 0 Default: 1 Max 1) + Setting this to "1" will cause the "sync" flag set + by the chattr(1) command on a directory to be + inherited by files in that directory. + + fs.xfs.inherit_nodump (Min: 0 Default: 1 Max 1) + Setting this to "1" will cause the "nodump" flag set + by the chattr(1) command on a directory to be + inherited by files in that directory. + + fs.xfs.inherit_noatime (Min: 0 Default: 1 Max 1) + Setting this to "1" will cause the "noatime" flag set + by the chattr(1) command on a directory to be + inherited by files in that directory. diff --git a/Documentation/firmware_class/README b/Documentation/firmware_class/README new file mode 100644 index 000000000000..43e836c07ae8 --- /dev/null +++ b/Documentation/firmware_class/README @@ -0,0 +1,124 @@ + + request_firmware() hotplug interface: + ------------------------------------ + Copyright (C) 2003 Manuel Estrada Sainz <ranty@debian.org> + + Why: + --- + + Today, the most extended way to use firmware in the Linux kernel is linking + it statically in a header file. Which has political and technical issues: + + 1) Some firmware is not legal to redistribute. + 2) The firmware occupies memory permanently, even though it often is just + used once. + 3) Some people, like the Debian crowd, don't consider some firmware free + enough and remove entire drivers (e.g.: keyspan). + + High level behavior (mixed): + ============================ + + kernel(driver): calls request_firmware(&fw_entry, $FIRMWARE, device) + + userspace: + - /sys/class/firmware/xxx/{loading,data} appear. + - hotplug gets called with a firmware identifier in $FIRMWARE + and the usual hotplug environment. + - hotplug: echo 1 > /sys/class/firmware/xxx/loading + + kernel: Discard any previous partial load. + + userspace: + - hotplug: cat appropriate_firmware_image > \ + /sys/class/firmware/xxx/data + + kernel: grows a buffer in PAGE_SIZE increments to hold the image as it + comes in. + + userspace: + - hotplug: echo 0 > /sys/class/firmware/xxx/loading + + kernel: request_firmware() returns and the driver has the firmware + image in fw_entry->{data,size}. If something went wrong + request_firmware() returns non-zero and fw_entry is set to + NULL. + + kernel(driver): Driver code calls release_firmware(fw_entry) releasing + the firmware image and any related resource. + + High level behavior (driver code): + ================================== + + if(request_firmware(&fw_entry, $FIRMWARE, device) == 0) + copy_fw_to_device(fw_entry->data, fw_entry->size); + release(fw_entry); + + Sample/simple hotplug script: + ============================ + + # Both $DEVPATH and $FIRMWARE are already provided in the environment. + + HOTPLUG_FW_DIR=/usr/lib/hotplug/firmware/ + + echo 1 > /sys/$DEVPATH/loading + cat $HOTPLUG_FW_DIR/$FIRMWARE > /sysfs/$DEVPATH/data + echo 0 > /sys/$DEVPATH/loading + + Random notes: + ============ + + - "echo -1 > /sys/class/firmware/xxx/loading" will cancel the load at + once and make request_firmware() return with error. + + - firmware_data_read() and firmware_loading_show() are just provided + for testing and completeness, they are not called in normal use. + + - There is also /sys/class/firmware/timeout which holds a timeout in + seconds for the whole load operation. + + - request_firmware_nowait() is also provided for convenience in + non-user contexts. + + + about in-kernel persistence: + --------------------------- + Under some circumstances, as explained below, it would be interesting to keep + firmware images in non-swappable kernel memory or even in the kernel image + (probably within initramfs). + + Note that this functionality has not been implemented. + + - Why OPTIONAL in-kernel persistence may be a good idea sometimes: + + - If the device that needs the firmware is needed to access the + filesystem. When upon some error the device has to be reset and the + firmware reloaded, it won't be possible to get it from userspace. + e.g.: + - A diskless client with a network card that needs firmware. + - The filesystem is stored in a disk behind an scsi device + that needs firmware. + - Replacing buggy DSDT/SSDT ACPI tables on boot. + Note: this would require the persistent objects to be included + within the kernel image, probably within initramfs. + + And the same device can be needed to access the filesystem or not depending + on the setup, so I think that the choice on what firmware to make + persistent should be left to userspace. + + - Why register_firmware()+__init can be useful: + - For boot devices needing firmware. + - To make the transition easier: + The firmware can be declared __init and register_firmware() + called on module_init. Then the firmware is warranted to be + there even if "firmware hotplug userspace" is not there yet or + it doesn't yet provide the needed firmware. + Once the firmware is widely available in userspace, it can be + removed from the kernel. Or made optional (CONFIG_.*_FIRMWARE). + + In either case, if firmware hotplug support is there, it can move the + firmware out of kernel memory into the real filesystem for later + usage. + + Note: If persistence is implemented on top of initramfs, + register_firmware() may not be appropriate. + diff --git a/Documentation/firmware_class/firmware_sample_driver.c b/Documentation/firmware_class/firmware_sample_driver.c new file mode 100644 index 000000000000..e1c56a7e6583 --- /dev/null +++ b/Documentation/firmware_class/firmware_sample_driver.c @@ -0,0 +1,126 @@ +/* + * firmware_sample_driver.c - + * + * Copyright (c) 2003 Manuel Estrada Sainz <ranty@debian.org> + * + * Sample code on how to use request_firmware() from drivers. + * + * Note that register_firmware() is currently useless. + * + */ + +#include <linux/module.h> +#include <linux/kernel.h> +#include <linux/init.h> +#include <linux/device.h> + +#include "linux/firmware.h" + +#define WE_CAN_NEED_FIRMWARE_BEFORE_USERSPACE_IS_AVAILABLE +#ifdef WE_CAN_NEED_FIRMWARE_BEFORE_USERSPACE_IS_AVAILABLE +char __init inkernel_firmware[] = "let's say that this is firmware\n"; +#endif + +static struct device ghost_device = { + .name = "Ghost Device", + .bus_id = "ghost0", +}; + + +static void sample_firmware_load(char *firmware, int size) +{ + u8 buf[size+1]; + memcpy(buf, firmware, size); + buf[size] = '\0'; + printk("firmware_sample_driver: firmware: %s\n", buf); +} + +static void sample_probe_default(void) +{ + /* uses the default method to get the firmware */ + const struct firmware *fw_entry; + printk("firmware_sample_driver: a ghost device got inserted :)\n"); + + if(request_firmware(&fw_entry, "sample_driver_fw", &ghost_device)!=0) + { + printk(KERN_ERR + "firmware_sample_driver: Firmware not available\n"); + return; + } + + sample_firmware_load(fw_entry->data, fw_entry->size); + + release_firmware(fw_entry); + + /* finish setting up the device */ +} +static void sample_probe_specific(void) +{ + /* Uses some specific hotplug support to get the firmware from + * userspace directly into the hardware, or via some sysfs file */ + + /* NOTE: This currently doesn't work */ + + printk("firmware_sample_driver: a ghost device got inserted :)\n"); + + if(request_firmware(NULL, "sample_driver_fw", &ghost_device)!=0) + { + printk(KERN_ERR + "firmware_sample_driver: Firmware load failed\n"); + return; + } + + /* request_firmware blocks until userspace finished, so at + * this point the firmware should be already in the device */ + + /* finish setting up the device */ +} +static void sample_probe_async_cont(const struct firmware *fw, void *context) +{ + if(!fw){ + printk(KERN_ERR + "firmware_sample_driver: firmware load failed\n"); + return; + } + + printk("firmware_sample_driver: device pointer \"%s\"\n", + (char *)context); + sample_firmware_load(fw->data, fw->size); +} +static void sample_probe_async(void) +{ + /* Let's say that I can't sleep */ + int error; + error = request_firmware_nowait (THIS_MODULE, + "sample_driver_fw", &ghost_device, + "my device pointer", + sample_probe_async_cont); + if(error){ + printk(KERN_ERR + "firmware_sample_driver:" + " request_firmware_nowait failed\n"); + } +} + +static int sample_init(void) +{ +#ifdef WE_CAN_NEED_FIRMWARE_BEFORE_USERSPACE_IS_AVAILABLE + register_firmware("sample_driver_fw", inkernel_firmware, + sizeof(inkernel_firmware)); +#endif + device_initialize(&ghost_device); + /* since there is no real hardware insertion I just call the + * sample probe functions here */ + sample_probe_specific(); + sample_probe_default(); + sample_probe_async(); + return 0; +} +static void __exit sample_exit(void) +{ +} + +module_init (sample_init); +module_exit (sample_exit); + +MODULE_LICENSE("GPL"); diff --git a/Documentation/firmware_class/firmware_sample_firmware_class.c b/Documentation/firmware_class/firmware_sample_firmware_class.c new file mode 100644 index 000000000000..09eab2f1b373 --- /dev/null +++ b/Documentation/firmware_class/firmware_sample_firmware_class.c @@ -0,0 +1,204 @@ +/* + * firmware_sample_firmware_class.c - + * + * Copyright (c) 2003 Manuel Estrada Sainz <ranty@debian.org> + * + * NOTE: This is just a probe of concept, if you think that your driver would + * be well served by this mechanism please contact me first. + * + * DON'T USE THIS CODE AS IS + * + */ + +#include <linux/device.h> +#include <linux/module.h> +#include <linux/init.h> +#include <linux/timer.h> +#include <linux/firmware.h> + + +MODULE_AUTHOR("Manuel Estrada Sainz <ranty@debian.org>"); +MODULE_DESCRIPTION("Hackish sample for using firmware class directly"); +MODULE_LICENSE("GPL"); + +static inline struct class_device *to_class_dev(struct kobject *obj) +{ + return container_of(obj,struct class_device,kobj); +} +static inline +struct class_device_attribute *to_class_dev_attr(struct attribute *_attr) +{ + return container_of(_attr,struct class_device_attribute,attr); +} + +int sysfs_create_bin_file(struct kobject * kobj, struct bin_attribute * attr); +int sysfs_remove_bin_file(struct kobject * kobj, struct bin_attribute * attr); + +struct firmware_priv { + char fw_id[FIRMWARE_NAME_MAX]; + s32 loading:2; + u32 abort:1; +}; + +extern struct class firmware_class; + +static ssize_t firmware_loading_show(struct class_device *class_dev, char *buf) +{ + struct firmware_priv *fw_priv = class_get_devdata(class_dev); + return sprintf(buf, "%d\n", fw_priv->loading); +} +static ssize_t firmware_loading_store(struct class_device *class_dev, + const char *buf, size_t count) +{ + struct firmware_priv *fw_priv = class_get_devdata(class_dev); + int prev_loading = fw_priv->loading; + + fw_priv->loading = simple_strtol(buf, NULL, 10); + + switch(fw_priv->loading){ + case -1: + /* abort load an panic */ + break; + case 1: + /* setup load */ + break; + case 0: + if(prev_loading==1){ + /* finish load and get the device back to working + * state */ + } + break; + } + + return count; +} +static CLASS_DEVICE_ATTR(loading, 0644, + firmware_loading_show, firmware_loading_store); + +static ssize_t firmware_data_read(struct kobject *kobj, + char *buffer, loff_t offset, size_t count) +{ + struct class_device *class_dev = to_class_dev(kobj); + struct firmware_priv *fw_priv = class_get_devdata(class_dev); + + /* read from the devices firmware memory */ + + return count; +} +static ssize_t firmware_data_write(struct kobject *kobj, + char *buffer, loff_t offset, size_t count) +{ + struct class_device *class_dev = to_class_dev(kobj); + struct firmware_priv *fw_priv = class_get_devdata(class_dev); + + /* write to the devices firmware memory */ + + return count; +} +static struct bin_attribute firmware_attr_data = { + .attr = {.name = "data", .mode = 0644}, + .size = 0, + .read = firmware_data_read, + .write = firmware_data_write, +}; +static int fw_setup_class_device(struct class_device *class_dev, + const char *fw_name, + struct device *device) +{ + int retval = 0; + struct firmware_priv *fw_priv = kmalloc(sizeof(struct firmware_priv), + GFP_KERNEL); + + if(!fw_priv){ + retval = -ENOMEM; + goto out; + } + memset(fw_priv, 0, sizeof(*fw_priv)); + memset(class_dev, 0, sizeof(*class_dev)); + + strncpy(fw_priv->fw_id, fw_name, FIRMWARE_NAME_MAX); + fw_priv->fw_id[FIRMWARE_NAME_MAX-1] = '\0'; + + strncpy(class_dev->class_id, device->bus_id, BUS_ID_SIZE); + class_dev->class_id[BUS_ID_SIZE-1] = '\0'; + class_dev->dev = device; + + class_dev->class = &firmware_class, + class_set_devdata(class_dev, fw_priv); + retval = class_device_register(class_dev); + if (retval){ + printk(KERN_ERR "%s: class_device_register failed\n", + __FUNCTION__); + goto error_free_fw_priv; + } + + retval = sysfs_create_bin_file(&class_dev->kobj, &firmware_attr_data); + if (retval){ + printk(KERN_ERR "%s: sysfs_create_bin_file failed\n", + __FUNCTION__); + goto error_unreg_class_dev; + } + + retval = class_device_create_file(class_dev, + &class_device_attr_loading); + if (retval){ + printk(KERN_ERR "%s: class_device_create_file failed\n", + __FUNCTION__); + goto error_remove_data; + } + + goto out; + +error_remove_data: + sysfs_remove_bin_file(&class_dev->kobj, &firmware_attr_data); +error_unreg_class_dev: + class_device_unregister(class_dev); +error_free_fw_priv: + kfree(fw_priv); +out: + return retval; +} +static void fw_remove_class_device(struct class_device *class_dev) +{ + struct firmware_priv *fw_priv = class_get_devdata(class_dev); + + class_device_remove_file(class_dev, &class_device_attr_loading); + sysfs_remove_bin_file(&class_dev->kobj, &firmware_attr_data); + class_device_unregister(class_dev); +} + +static struct class_device *class_dev; + +static struct device my_device = { + .name = "Sample Device", + .bus_id = "my_dev0", +}; + +static int __init firmware_sample_init(void) +{ + int error; + + device_initialize(&my_device); + class_dev = kmalloc(sizeof(struct class_device), GFP_KERNEL); + if(!class_dev) + return -ENOMEM; + + error = fw_setup_class_device(class_dev, "my_firmware_image", + &my_device); + if(error){ + kfree(class_dev); + return error; + } + return 0; + +} +static void __exit firmware_sample_exit(void) +{ + struct firmware_priv *fw_priv = class_get_devdata(class_dev); + fw_remove_class_device(class_dev); + kfree(fw_priv); + kfree(class_dev); +} +module_init(firmware_sample_init); +module_exit(firmware_sample_exit); + diff --git a/Documentation/firmware_class/hotplug-script b/Documentation/firmware_class/hotplug-script new file mode 100644 index 000000000000..1990130f2ab1 --- /dev/null +++ b/Documentation/firmware_class/hotplug-script @@ -0,0 +1,16 @@ +#!/bin/sh + +# Simple hotplug script sample: +# +# Both $DEVPATH and $FIRMWARE are already provided in the environment. + +HOTPLUG_FW_DIR=/usr/lib/hotplug/firmware/ + +echo 1 > /sys/$DEVPATH/loading +cat $HOTPLUG_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data +echo 0 > /sys/$DEVPATH/loading + +# To cancel the load in case of error: +# +# echo -1 > /sys/$DEVPATH/loading +# diff --git a/Documentation/floppy.txt b/Documentation/floppy.txt new file mode 100644 index 000000000000..6fb10fcd82fb --- /dev/null +++ b/Documentation/floppy.txt @@ -0,0 +1,245 @@ +This file describes the floppy driver. + +FAQ list: +========= + + A FAQ list may be found in the fdutils package (see below), and also +at http://fdutils.linux.lu/FAQ.html + + +LILO configuration options (Thinkpad users, read this) +====================================================== + + The floppy driver is configured using the 'floppy=' option in +lilo. This option can be typed at the boot prompt, or entered in the +lilo configuration file. + + Example: If your kernel is called linux-2.6.9, type the following line +at the lilo boot prompt (if you have a thinkpad): + + linux-2.6.9 floppy=thinkpad + +You may also enter the following line in /etc/lilo.conf, in the description +of linux-2.6.9: + + append = "floppy=thinkpad" + + Several floppy related options may be given, example: + + linux-2.6.9 floppy=daring floppy=two_fdc + append = "floppy=daring floppy=two_fdc" + + If you give options both in the lilo config file and on the boot +prompt, the option strings of both places are concatenated, the boot +prompt options coming last. That's why there are also options to +restore the default behavior. + + +Module configuration options +============================ + + If you use the floppy driver as a module, use the following syntax: +modprobe floppy <options> + +Example: + modprobe floppy omnibook messages + + If you need certain options enabled every time you load the floppy driver, +you can put: + + options floppy omnibook messages + +in /etc/modprobe.conf. + + + The floppy driver related options are: + + floppy=asus_pci + Sets the bit mask to allow only units 0 and 1. (default) + + floppy=daring + Tells the floppy driver that you have a well behaved floppy controller. + This allows more efficient and smoother operation, but may fail on + certain controllers. This may speed up certain operations. + + floppy=0,daring + Tells the floppy driver that your floppy controller should be used + with caution. + + floppy=one_fdc + Tells the floppy driver that you have only one floppy controller. + (default) + + floppy=two_fdc + floppy=<address>,two_fdc + Tells the floppy driver that you have two floppy controllers. + The second floppy controller is assumed to be at <address>. + This option is not needed if the second controller is at address + 0x370, and if you use the 'cmos' option. + + floppy=thinkpad + Tells the floppy driver that you have a Thinkpad. Thinkpads use an + inverted convention for the disk change line. + + floppy=0,thinkpad + Tells the floppy driver that you don't have a Thinkpad. + + floppy=omnibook + floppy=nodma + Tells the floppy driver not to use Dma for data transfers. + This is needed on HP Omnibooks, which don't have a workable + DMA channel for the floppy driver. This option is also useful + if you frequently get "Unable to allocate DMA memory" messages. + Indeed, dma memory needs to be continuous in physical memory, + and is thus harder to find, whereas non-dma buffers may be + allocated in virtual memory. However, I advise against this if + you have an FDC without a FIFO (8272A or 82072). 82072A and + later are OK. You also need at least a 486 to use nodma. + If you use nodma mode, I suggest you also set the FIFO + threshold to 10 or lower, in order to limit the number of data + transfer interrupts. + + If you have a FIFO-able FDC, the floppy driver automatically + falls back on non DMA mode if no DMA-able memory can be found. + If you want to avoid this, explicitly ask for 'yesdma'. + + floppy=yesdma + Tells the floppy driver that a workable DMA channel is available. + (default) + + floppy=nofifo + Disables the FIFO entirely. This is needed if you get "Bus + master arbitration error" messages from your Ethernet card (or + from other devices) while accessing the floppy. + + floppy=usefifo + Enables the FIFO. (default) + + floppy=<threshold>,fifo_depth + Sets the FIFO threshold. This is mostly relevant in DMA + mode. If this is higher, the floppy driver tolerates more + interrupt latency, but it triggers more interrupts (i.e. it + imposes more load on the rest of the system). If this is + lower, the interrupt latency should be lower too (faster + processor). The benefit of a lower threshold is less + interrupts. + + To tune the fifo threshold, switch on over/underrun messages + using 'floppycontrol --messages'. Then access a floppy + disk. If you get a huge amount of "Over/Underrun - retrying" + messages, then the fifo threshold is too low. Try with a + higher value, until you only get an occasional Over/Underrun. + It is a good idea to compile the floppy driver as a module + when doing this tuning. Indeed, it allows to try different + fifo values without rebooting the machine for each test. Note + that you need to do 'floppycontrol --messages' every time you + re-insert the module. + + Usually, tuning the fifo threshold should not be needed, as + the default (0xa) is reasonable. + + floppy=<drive>,<type>,cmos + Sets the CMOS type of <drive> to <type>. This is mandatory if + you have more than two floppy drives (only two can be + described in the physical CMOS), or if your BIOS uses + non-standard CMOS types. The CMOS types are: + + 0 - Use the value of the physical CMOS + 1 - 5 1/4 DD + 2 - 5 1/4 HD + 3 - 3 1/2 DD + 4 - 3 1/2 HD + 5 - 3 1/2 ED + 6 - 3 1/2 ED + 16 - unknown or not installed + + (Note: there are two valid types for ED drives. This is because 5 was + initially chosen to represent floppy *tapes*, and 6 for ED drives. + AMI ignored this, and used 5 for ED drives. That's why the floppy + driver handles both.) + + floppy=unexpected_interrupts + Print a warning message when an unexpected interrupt is received. + (default) + + floppy=no_unexpected_interrupts + floppy=L40SX + Don't print a message when an unexpected interrupt is received. This + is needed on IBM L40SX laptops in certain video modes. (There seems + to be an interaction between video and floppy. The unexpected + interrupts affect only performance, and can be safely ignored.) + + floppy=broken_dcl + Don't use the disk change line, but assume that the disk was + changed whenever the device node is reopened. Needed on some + boxes where the disk change line is broken or unsupported. + This should be regarded as a stopgap measure, indeed it makes + floppy operation less efficient due to unneeded cache + flushings, and slightly more unreliable. Please verify your + cable, connection and jumper settings if you have any DCL + problems. However, some older drives, and also some laptops + are known not to have a DCL. + + floppy=debug + Print debugging messages. + + floppy=messages + Print informational messages for some operations (disk change + notifications, warnings about over and underruns, and about + autodetection). + + floppy=silent_dcl_clear + Uses a less noisy way to clear the disk change line (which + doesn't involve seeks). Implied by 'daring' option. + + floppy=<nr>,irq + Sets the floppy IRQ to <nr> instead of 6. + + floppy=<nr>,dma + Sets the floppy DMA channel to <nr> instead of 2. + + floppy=slow + Use PS/2 stepping rate: + " PS/2 floppies have much slower step rates than regular floppies. + It's been recommended that take about 1/4 of the default speed + in some more extreme cases." + + +Supporting utilities and additional documentation: +================================================== + + Additional parameters of the floppy driver can be configured at +runtime. Utilities which do this can be found in the fdutils package. +This package also contains a new version of mtools which allows to +access high capacity disks (up to 1992K on a high density 3 1/2 disk!). +It also contains additional documentation about the floppy driver. + +The latest version can be found at fdutils homepage: + http://fdutils.linux.lu + +The fdutils-5.4 release can be found at: + http://fdutils.linux.lu/fdutils-5.4.src.tar.gz + http://www.tux.org/pub/knaff/fdutils/fdutils-5.4.src.tar.gz + ftp://metalab.unc.edu/pub/Linux/utils/disk-management/fdutils-5.4.src.tar.gz + +Reporting problems about the floppy driver +========================================== + + If you have a question or a bug report about the floppy driver, mail +me at Alain.Knaff@poboxes.com . If you post to Usenet, preferably use +comp.os.linux.hardware. As the volume in these groups is rather high, +be sure to include the word "floppy" (or "FLOPPY") in the subject +line. If the reported problem happens when mounting floppy disks, be +sure to mention also the type of the filesystem in the subject line. + + Be sure to read the FAQ before mailing/posting any bug reports! + + Alain + +Changelog +========= + +10-30-2004 : Cleanup, updating, add reference to module configuration. + James Nelson <james4765@gmail.com> + +6-3-2000 : Original Document diff --git a/Documentation/ftape.txt b/Documentation/ftape.txt new file mode 100644 index 000000000000..7d8bb3384031 --- /dev/null +++ b/Documentation/ftape.txt @@ -0,0 +1,307 @@ +Intro +===== + +This file describes some issues involved when using the "ftape" +floppy tape device driver that comes with the Linux kernel. + +ftape has a home page at + +http://ftape.dot-heine.de/ + +which contains further information about ftape. Please cross check +this WWW address against the address given (if any) in the MAINTAINERS +file located in the top level directory of the Linux kernel source +tree. + +NOTE: This is an unmaintained set of drivers, and it is not guaranteed to work. +If you are interested in taking over maintenance, contact Claus-Justus Heine +<ch@dot-heine.de>, the former maintainer. + +Contents +======== + +A minus 1: Ftape documentation + +A. Changes + 1. Goal + 2. I/O Block Size + 3. Write Access when not at EOD (End Of Data) or BOT (Begin Of Tape) + 4. Formatting + 5. Interchanging cartridges with other operating systems + +B. Debugging Output + 1. Introduction + 2. Tuning the debugging output + +C. Boot and load time configuration + 1. Setting boot time parameters + 2. Module load time parameters + 3. Ftape boot- and load time options + 4. Example kernel parameter setting + 5. Example module parameter setting + +D. Support and contacts + +******************************************************************************* + +A minus 1. Ftape documentation +============================== + +Unluckily, the ftape-HOWTO is out of date. This really needs to be +changed. Up to date documentation as well as recent development +versions of ftape and useful links to related topics can be found at +the ftape home page at + +http://ftape.dot-heine.de/ + +******************************************************************************* + +A. Changes +========== + +1. Goal + ~~~~ + The goal of all that incompatibilities was to give ftape an interface + that resembles the interface provided by SCSI tape drives as close + as possible. Thus any Unix backup program that is known to work + with SCSI tape drives should also work. + + The concept of a fixed block size for read/write transfers is + rather unrelated to this SCSI tape compatibility at the file system + interface level. It developed out of a feature of zftape, a + block wise user transparent on-the-fly compression. That compression + support will not be dropped in future releases for compatibility + reasons with previous releases of zftape. + +2. I/O Block Size + ~~~~~~~~~~~~~~ + The block size defaults to 10k which is the default block size of + GNU tar. + + The block size can be tuned either during kernel configuration or + at runtime with the MTIOCTOP ioctl using the MTSETBLK operation + (i.e. do "mt -f /dev/qft0" setblk #BLKSZ). A block size of 0 + switches to variable block size mode i.e. "mt setblk 0" switches + off the block size restriction. However, this disables zftape's + built in on-the-fly compression which doesn't work with variable + block size mode. + + The BLKSZ parameter must be given as a byte count and must be a + multiple of 32k or 0, i.e. use "mt setblk 32768" to switch to a + block size of 32k. + + The typical symptom of a block size mismatch is an "invalid + argument" error message. + +3. Write Access when not at EOD (End Of Data) or BOT (Begin Of Tape) + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + zftape (the file system interface of ftape-3.x) denies write access + to the tape cartridge when it isn't positioned either at BOT or + EOD. + +4. Formatting + ~~~~~~~~~~ + ftape DOES support formatting of floppy tape cartridges. You need the + `ftformat' program that is shipped with the modules version of ftape. + Please get the latest version of ftape from + + ftp://sunsite.unc.edu/pub/Linux/kernel/tapes + + or from the ftape home page at + + http://ftape.dot-heine.de/ + + `ftformat' is contained in the `./contrib/' subdirectory of that + separate ftape package. + +5. Interchanging cartridges with other operating systems + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + The internal emulation of Unix tape device file marks has changed + completely. ftape now uses the volume table segment as specified + by the QIC-40/80/3010/3020/113 standards to emulate file marks. As + a consequence there is limited support to interchange cartridges + with other operating systems. + + To be more precise: ftape will detect volumes written by other OS's + programs and other OS's programs will detect volumes written by + ftape. + + However, it isn't possible to extract the data dumped to the tape + by some MSDOS program with ftape. This exceeds the scope of a + kernel device driver. If you need such functionality, then go ahead + and write a user space utility that is able to do that. ftape already + provides all kernel level support necessary to do that. + +******************************************************************************* + +B. Debugging Output + ================ + +1. Introduction + ~~~~~~~~~~~~ + The ftape driver can be very noisy in that is can print lots of + debugging messages to the kernel log files and the system console. + While this is useful for debugging it might be annoying during + normal use and enlarges the size of the driver by several kilobytes. + + To reduce the size of the driver you can trim the maximal amount of + debugging information available during kernel configuration. Please + refer to the kernel configuration script and its on-line help + functionality. + + The amount of debugging output maps to the "tracing" boot time + option and the "ft_tracing" modules option as follows: + + 0 bugs + 1 + errors (with call-stack dump) + 2 + warnings + 3 + information + 4 + more information + 5 + program flow + 6 + fdc/dma info + 7 + data flow + 8 + everything else + +2. Tuning the debugging output + ~~~~~~~~~~~~~~~~~~~~~~~~~~~ + To reduce the amount of debugging output printed to the system + console you can + + i) trim the debugging output at run-time with + + mt -f /dev/nqft0 setdensity #DBGLVL + + where "#DBGLVL" is a number between 0 and 9 + + ii) trim the debugging output at module load time with + + modprobe ftape ft_tracing=#DBGLVL + + Of course, this applies only if you have configured ftape to be + compiled as a module. + + iii) trim the debugging output during system boot time. Add the + following to the kernel command line: + + ftape=#DBGLVL,tracing + + Please refer also to the next section if you don't know how to + set boot time parameters. + +******************************************************************************* + +C. Boot and load time configuration + ================================ + +1. Setting boot time parameters + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Assuming that you use lilo, the LI)nux LO)ader, boot time kernel + parameters can be set by adding a line + + append some_kernel_boot_time_parameter + + to `/etc/lilo.conf' or at real boot time by typing in the options + at the prompt provided by LILO. I can't give you advice on how to + specify those parameters with other loaders as I don't use them. + + For ftape, each "some_kernel_boot_time_parameter" looks like + "ftape=value,option". As an example, the debugging output can be + increased with + + ftape=4,tracing + + NOTE: the value precedes the option name. + +2. Module load time parameters + ~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Module parameters can be specified either directly when invoking + the program 'modprobe' at the shell prompt: + + modprobe ftape ft_tracing=4 + + or by editing the file `/etc/modprobe.conf' in which case they take + effect each time when the module is loaded with `modprobe' (please + refer to the respective manual pages). Thus, you should add a line + + options ftape ft_tracing=4 + + to `/etc/modprobe.conf` if you intend to increase the debugging + output of the driver. + + +3. Ftape boot- and load time options + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + i. Controlling the amount of debugging output + DBGLVL has to be replaced by a number between 0 and 8. + + module | kernel command line + -----------------------|---------------------- + ft_tracing=DBGLVL | ftape=DBGLVL,tracing + + ii. Hardware setup + BASE is the base address of your floppy disk controller, + IRQ and DMA give its interrupt and DMA channel, respectively. + BOOL is an integer, "0" means "no"; any other value means + "yes". You don't need to specify anything if connecting your tape + drive to the standard floppy disk controller. All of these + values have reasonable defaults. The defaults can be modified + during kernel configuration, i.e. while running "make config", + "make menuconfig" or "make xconfig" in the top level directory + of the Linux kernel source tree. Please refer also to the on + line documentation provided during that kernel configuration + process. + + ft_probe_fc10 is set to a non-zero value if you wish for ftape to + probe for a Colorado FC-10 or FC-20 controller. + + ft_mach2 is set to a non-zero value if you wish for ftape to probe + for a Mountain MACH-2 controller. + + module | kernel command line + -----------------------|---------------------- + ft_fdc_base=BASE | ftape=BASE,ioport + ft_fdc_irq=IRQ | ftape=IRQ,irq + ft_fdc_dma=DMA | ftape=DMA,dma + ft_probe_fc10=BOOL | ftape=BOOL,fc10 + ft_mach2=BOOL | ftape=BOOL,mach2 + ft_fdc_threshold=THR | ftape=THR,threshold + ft_fdc_rate_limit=RATE | ftape=RATE,datarate + +4. Example kernel parameter setting + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + To configure ftape to probe for a Colorado FC-10/FC-20 controller + and to increase the amount of debugging output a little bit, add + the following line to `/etc/lilo.conf': + + append ftape=1,fc10 ftape=4,tracing + +5. Example module parameter setting + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + To do the same, but with ftape compiled as a loadable kernel + module, add the following line to `/etc/modprobe.conf': + + options ftape ft_probe_fc10=1 ft_tracing=4 + +******************************************************************************* + +D. Support and contacts + ==================== + + Ftape is distributed under the GNU General Public License. There is + absolutely no warranty for this software. However, you can reach + the current maintainer of the ftape package under the email address + given in the MAINTAINERS file which is located in the top level + directory of the Linux kernel source tree. There you'll find also + the relevant mailing list to use as a discussion forum and the web + page to query for the most recent documentation, related work and + development versions of ftape. + + Changelog: + ========== + +~1996: Original Document + +10-24-2004: General cleanup and updating, noting additional module options. + James Nelson <james4765@gmail.com> diff --git a/Documentation/fujitsu/frv/README.txt b/Documentation/fujitsu/frv/README.txt new file mode 100644 index 000000000000..a984faa968e8 --- /dev/null +++ b/Documentation/fujitsu/frv/README.txt @@ -0,0 +1,51 @@ + ================================ + Fujitsu FR-V LINUX DOCUMENTATION + ================================ + +This directory contains documentation for the Fujitsu FR-V CPU architecture +port of Linux. + +The following documents are available: + + (*) features.txt + + A description of the basic features inherent in this architecture port. + + + (*) configuring.txt + + A summary of the configuration options particular to this architecture. + + + (*) booting.txt + + A description of how to boot the kernel image and a summary of the kernel + command line options. + + + (*) gdbstub.txt + + A description of how to debug the kernel using GDB attached by serial + port, and a summary of the services available. + + + (*) mmu-layout.txt + + A description of the virtual and physical memory layout used in the + MMU linux kernel, and the registers used to support it. + + + (*) gdbinit + + An example .gdbinit file for use with GDB. It includes macros for viewing + MMU state on the FR451. See mmu-layout.txt for more information. + + + (*) clock.txt + + A description of the CPU clock scaling interface. + + + (*) atomic-ops.txt + + A description of how the FR-V kernel's atomic operations work. diff --git a/Documentation/fujitsu/frv/atomic-ops.txt b/Documentation/fujitsu/frv/atomic-ops.txt new file mode 100644 index 000000000000..96638e9b9fe0 --- /dev/null +++ b/Documentation/fujitsu/frv/atomic-ops.txt @@ -0,0 +1,134 @@ + ===================================== + FUJITSU FR-V KERNEL ATOMIC OPERATIONS + ===================================== + +On the FR-V CPUs, there is only one atomic Read-Modify-Write operation: the SWAP/SWAPI +instruction. Unfortunately, this alone can't be used to implement the following operations: + + (*) Atomic add to memory + + (*) Atomic subtract from memory + + (*) Atomic bit modification (set, clear or invert) + + (*) Atomic compare and exchange + +On such CPUs, the standard way of emulating such operations in uniprocessor mode is to disable +interrupts, but on the FR-V CPUs, modifying the PSR takes a lot of clock cycles, and it has to be +done twice. This means the CPU runs for a relatively long time with interrupts disabled, +potentially having a great effect on interrupt latency. + + +============= +NEW ALGORITHM +============= + +To get around this, the following algorithm has been implemented. It operates in a way similar to +the LL/SC instruction pairs supported on a number of platforms. + + (*) The CCCR.CC3 register is reserved within the kernel to act as an atomic modify abort flag. + + (*) In the exception prologues run on kernel->kernel entry, CCCR.CC3 is set to 0 (Undefined + state). + + (*) All atomic operations can then be broken down into the following algorithm: + + (1) Set ICC3.Z to true and set CC3 to True (ORCC/CKEQ/ORCR). + + (2) Load the value currently in the memory to be modified into a register. + + (3) Make changes to the value. + + (4) If CC3 is still True, simultaneously and atomically (by VLIW packing): + + (a) Store the modified value back to memory. + + (b) Set ICC3.Z to false (CORCC on GR29 is sufficient for this - GR29 holds the current + task pointer in the kernel, and so is guaranteed to be non-zero). + + (5) If ICC3.Z is still true, go back to step (1). + +This works in a non-SMP environment because any interrupt or other exception that happens between +steps (1) and (4) will set CC3 to the Undefined, thus aborting the store in (4a), and causing the +condition in ICC3 to remain with the Z flag set, thus causing step (5) to loop back to step (1). + + +This algorithm suffers from two problems: + + (1) The condition CCCR.CC3 is cleared unconditionally by an exception, irrespective of whether or + not any changes were made to the target memory location during that exception. + + (2) The branch from step (5) back to step (1) may have to happen more than once until the store + manages to take place. In theory, this loop could cycle forever because there are too many + interrupts coming in, but it's unlikely. + + +======= +EXAMPLE +======= + +Taking an example from include/asm-frv/atomic.h: + + static inline int atomic_add_return(int i, atomic_t *v) + { + unsigned long val; + + asm("0: \n" + +It starts by setting ICC3.Z to true for later use, and also transforming that into CC3 being in the +True state. + + " orcc gr0,gr0,gr0,icc3 \n" <-- (1) + " ckeq icc3,cc7 \n" <-- (1) + +Then it does the load. Note that the final phase of step (1) is done at the same time as the +load. The VLIW packing ensures they are done simultaneously. The ".p" on the load must not be +removed without swapping the order of these two instructions. + + " ld.p %M0,%1 \n" <-- (2) + " orcr cc7,cc7,cc3 \n" <-- (1) + +Then the proposed modification is generated. Note that the old value can be retained if required +(such as in test_and_set_bit()). + + " add%I2 %1,%2,%1 \n" <-- (3) + +Then it attempts to store the value back, contingent on no exception having cleared CC3 since it +was set to True. + + " cst.p %1,%M0 ,cc3,#1 \n" <-- (4a) + +It simultaneously records the success or failure of the store in ICC3.Z. + + " corcc gr29,gr29,gr0 ,cc3,#1 \n" <-- (4b) + +Such that the branch can then be taken if the operation was aborted. + + " beq icc3,#0,0b \n" <-- (5) + : "+U"(v->counter), "=&r"(val) + : "NPr"(i) + : "memory", "cc7", "cc3", "icc3" + ); + + return val; + } + + +============= +CONFIGURATION +============= + +The atomic ops implementation can be made inline or out-of-line by changing the +CONFIG_FRV_OUTOFLINE_ATOMIC_OPS configuration variable. Making it out-of-line has a number of +advantages: + + - The resulting kernel image may be smaller + - Debugging is easier as atomic ops can just be stepped over and they can be breakpointed + +Keeping it inline also has a number of advantages: + + - The resulting kernel may be Faster + - no out-of-line function calls need to be made + - the compiler doesn't have half its registers clobbered by making a call + +The out-of-line implementations live in arch/frv/lib/atomic-ops.S. diff --git a/Documentation/fujitsu/frv/booting.txt b/Documentation/fujitsu/frv/booting.txt new file mode 100644 index 000000000000..4e229056ef22 --- /dev/null +++ b/Documentation/fujitsu/frv/booting.txt @@ -0,0 +1,181 @@ + ========================= + BOOTING FR-V LINUX KERNEL + ========================= + +====================== +PROVIDING A FILESYSTEM +====================== + +First of all, a root filesystem must be made available. This can be done in +one of two ways: + + (1) NFS Export + + A filesystem should be constructed in a directory on an NFS server that + the target board can reach. This directory should then be NFS exported + such that the target board can read and write into it as root. + + (2) Flash Filesystem (JFFS2 Recommended) + + In this case, the image must be stored or built up on flash before it + can be used. A complete image can be built using the mkfs.jffs2 or + similar program and then downloaded and stored into flash by RedBoot. + + +======================== +LOADING THE KERNEL IMAGE +======================== + +The kernel will need to be loaded into RAM by RedBoot (or by some alternative +boot loader) before it can be run. The kernel image (arch/frv/boot/Image) may +be loaded in one of three ways: + + (1) Load from Flash + + This is the simplest. RedBoot can store an image in the flash (see the + RedBoot documentation) and then load it back into RAM. RedBoot keeps + track of the load address, entry point and size, so the command to do + this is simply: + + fis load linux + + The image is then ready to be executed. + + (2) Load by TFTP + + The following command will download a raw binary kernel image from the + default server (as negotiated by BOOTP) and store it into RAM: + + load -b 0x00100000 -r /tftpboot/image.bin + + The image is then ready to be executed. + + (3) Load by Y-Modem + + The following command will download a raw binary kernel image across the + serial port that RedBoot is currently using: + + load -m ymodem -b 0x00100000 -r zImage + + The serial client (such as minicom) must then be told to transmit the + program by Y-Modem. + + When finished, the image will then be ready to be executed. + + +================== +BOOTING THE KERNEL +================== + +Boot the image with the following RedBoot command: + + exec -c "<CMDLINE>" 0x00100000 + +For example: + + exec -c "console=ttySM0,115200 ip=:::::dhcp root=/dev/mtdblock2 rw" + +This will start the kernel running. Note that if the GDB-stub is compiled in, +then the kernel will immediately wait for GDB to connect over serial before +doing anything else. See the section on kernel debugging with GDB. + +The kernel command line <CMDLINE> tells the kernel where its console is and +how to find its root filesystem. This is made up of the following components, +separated by spaces: + + (*) console=ttyS<x>[,<baud>[<parity>[<bits>[<flow>]]]] + + This specifies that the system console should output through on-chip + serial port <x> (which can be "0" or "1"). + + <baud> is a standard baud rate between 1200 and 115200 (default 9600). + + <parity> is a parity setting of "N", "O", "E", "M" or "S" for None, Odd, + Even, Mark or Space. "None" is the default. + + <stop> is "7" or "8" for the number of bits per character. "8" is the + default. + + <flow> is "r" to use flow control (XCTS on serial port 2 only). The + default is to not use flow control. + + For example: + + console=ttyS0,115200 + + To use the first on-chip serial port at baud rate 115200, no parity, 8 + bits, and no flow control. + + (*) root=/dev/<xxxx> + + This specifies the device upon which the root filesystem resides. For + example: + + /dev/nfs NFS root filesystem + /dev/mtdblock3 Fourth RedBoot partition on the System Flash + + (*) rw + + Start with the root filesystem mounted Read/Write. + + The remaining components are all optional: + + (*) ip=<ip>::::<host>:<iface>:<cfg> + + Configure the network interface. If <cfg> is "off" then <ip> should + specify the IP address for the network device <iface>. <host> provide + the hostname for the device. + + If <cfg> is "bootp" or "dhcp", then all of these parameters will be + discovered by consulting a BOOTP or DHCP server. + + For example, the following might be used: + + ip=192.168.73.12::::frv:eth0:off + + This sets the IP address on the VDK motherboard RTL8029 ethernet chipset + (eth0) to be 192.168.73.12, and sets the board's hostname to be "frv". + + (*) nfsroot=<server>:<dir>[,v<vers>] + + This is mandatory if "root=/dev/nfs" is given as an option. It tells the + kernel the IP address of the NFS server providing its root filesystem, + and the pathname on that server of the filesystem. + + The NFS version to use can also be specified. v2 and v3 are supported by + Linux. + + For example: + + nfsroot=192.168.73.1:/nfsroot-frv + + (*) profile=1 + + Turns on the kernel profiler (accessible through /proc/profile). + + (*) console=gdb0 + + This can be used as an alternative to the "console=ttyS..." listed + above. I tells the kernel to pass the console output to GDB if the + gdbstub is compiled in to the kernel. + + If this is used, then the gdbstub passes the text to GDB, which then + simply dumps it to its standard output. + + (*) mem=<xxx>M + + Normally the kernel will work out how much SDRAM it has by reading the + SDRAM controller registers. That can be overridden with this + option. This allows the kernel to be told that it has <xxx> megabytes of + memory available. + + (*) init=<prog> [<arg> [<arg> [<arg> ...]]] + + This tells the kernel what program to run initially. By default this is + /sbin/init, but /sbin/sash or /bin/sh are common alternatives. + + (*) vdc=... + + This option configures the MB93493 companion chip visual display + driver. Please see Documentation/fujitsu/mb93493/vdc.txt for more + information. diff --git a/Documentation/fujitsu/frv/clock.txt b/Documentation/fujitsu/frv/clock.txt new file mode 100644 index 000000000000..c72d350e177a --- /dev/null +++ b/Documentation/fujitsu/frv/clock.txt @@ -0,0 +1,65 @@ +Clock scaling +------------- + +The kernel supports scaling of CLCK.CMODE, CLCK.CM and CLKC.P0 clock +registers. If built with CONFIG_PM and CONFIG_SYSCTL options enabled, four +extra files will appear in the directory /proc/sys/pm/. Reading these files +will show: + + p0 -- current value of the P0 bit in CLKC register. + cm -- current value of the CM bits in CLKC register. + cmode -- current value of the CMODE bits in CLKC register. + +On all boards, the 'p0' file should also be writable, and either '1' or '0' +can be rewritten, to set or clear the CLKC_P0 bit respectively, hence +controlling whether the resource bus rate clock is halved. + +The 'cm' file should also be available on all boards. '0' can be written to it +to shift the board into High-Speed mode (normal), and '1' can be written to +shift the board into Medium-Speed mode. Selecting Low-Speed mode is not +supported by this interface, even though some CPUs do support it. + +On the boards with FR405 CPU (i.e. CB60 and CB70), the 'cmode' file is also +writable, allowing the CPU core speed (and other clock speeds) to be +controlled from userspace. + + +Determining current and possible settings +----------------------------------------- + +The current state and the available masks can be found in /proc/cpuinfo. For +example, on the CB70: + + # cat /proc/cpuinfo + CPU-Series: fr400 + CPU-Core: fr405, gr0-31, BE, CCCR + CPU: mb93405 + MMU: Prot + FP-Media: fr0-31, Media + System: mb93091-cb70, mb93090-mb00 + PM-Controls: cmode=0xd31f, cm=0x3, p0=0x3, suspend=0x9 + PM-Status: cmode=3, cm=0, p0=0 + Clock-In: 50.00 MHz + Clock-Core: 300.00 MHz + Clock-SDRAM: 100.00 MHz + Clock-CBus: 100.00 MHz + Clock-Res: 50.00 MHz + Clock-Ext: 50.00 MHz + Clock-DSU: 25.00 MHz + BogoMips: 300.00 + +And on the PDK, the PM lines look like the following: + + PM-Controls: cm=0x3, p0=0x3, suspend=0x9 + PM-Status: cmode=9, cm=0, p0=0 + +The PM-Controls line, if present, will indicate which /proc/sys/pm files can +be set to what values. The specification values are bitmasks; so, for example, +"suspend=0x9" indicates that 0 and 3 can be written validly to +/proc/sys/pm/suspend. + +The PM-Controls line will only be present if CONFIG_PM is configured to Y. + +The PM-Status line indicates which clock controls are set to which value. If +the file can be read, then the suspend value must be 0, and so that's not +included. diff --git a/Documentation/fujitsu/frv/configuring.txt b/Documentation/fujitsu/frv/configuring.txt new file mode 100644 index 000000000000..36e76a2336fa --- /dev/null +++ b/Documentation/fujitsu/frv/configuring.txt @@ -0,0 +1,125 @@ + ======================================= + FUJITSU FR-V LINUX KERNEL CONFIGURATION + ======================================= + +===================== +CONFIGURATION OPTIONS +===================== + +The most important setting is in the "MMU support options" tab (the first +presented in the configuration tools available): + + (*) "Kernel Type" + + This options allows selection of normal, MMU-requiring linux, and uClinux + (which doesn't require an MMU and doesn't have inter-process protection). + +There are a number of settings in the "Processor type and features" section of +the kernel configuration that need to be considered. + + (*) "CPU" + + The register and instruction sets at the core of the processor. This can + only be set to "FR40x/45x/55x" at the moment - but this permits usage of + the kernel with MB93091 CB10, CB11, CB30, CB41, CB60, CB70 and CB451 + CPU boards, and with the MB93093 PDK board. + + (*) "System" + + This option allows a choice of basic system. This governs the peripherals + that are expected to be available. + + (*) "Motherboard" + + This specifies the type of motherboard being used, and the peripherals + upon it. Currently only "MB93090-MB00" can be set here. + + (*) "Default cache-write mode" + + This controls the initial data cache write management mode. By default + Write-Through is selected, but Write-Back (Copy-Back) can also be + selected. This can be changed dynamically once the kernel is running (see + features.txt). + +There are some architecture specific configuration options in the "General +Setup" section of the kernel configuration too: + + (*) "Reserve memory uncached for (PCI) DMA" + + This requests that a uClinux kernel set aside some memory in an uncached + window for the use as consistent DMA memory (mainly for PCI). At least a + megabyte will be allocated in this way, possibly more. Any memory so + reserved will not be available for normal allocations. + + (*) "Kernel support for ELF-FDPIC binaries" + + This enables the binary-format driver for the new FDPIC ELF binaries that + this platform normally uses. These binaries are totally relocatable - + their separate sections can relocated independently, allowing them to be + shared on uClinux where possible. This should normally be enabled. + + (*) "Kernel image protection" + + This makes the protection register governing access to the core kernel + image prohibit access by userspace programs. This option is available on + uClinux only. + +There are also a number of settings in the "Kernel Hacking" section of the +kernel configuration especially for debugging a kernel on this +architecture. See the "gdbstub.txt" file for information about those. + + +====================== +DEFAULT CONFIGURATIONS +====================== + +The kernel sources include a number of example default configurations: + + (*) defconfig-mb93091 + + Default configuration for the MB93091-VDK with both CPU board and + MB93090-MB00 motherboard running uClinux. + + + (*) defconfig-mb93091-fb + + Default configuration for the MB93091-VDK with CPU board, + MB93090-MB00 motherboard, and DAV board running uClinux. + Includes framebuffer driver. + + + (*) defconfig-mb93093 + + Default configuration for the MB93093-PDK board running uClinux. + + + (*) defconfig-cb70-standalone + + Default configuration for the MB93091-VDK with only CB70 CPU board + running uClinux. This will use the CB70's DM9000 for network access. + + + (*) defconfig-mmu + + Default configuration for the MB93091-VDK with both CB451 CPU board and + MB93090-MB00 motherboard running MMU linux. + + (*) defconfig-mmu-audio + + Default configuration for the MB93091-VDK with CB451 CPU board, DAV + board, and MB93090-MB00 motherboard running MMU linux. Includes + audio driver. + + (*) defconfig-mmu-fb + + Default configuration for the MB93091-VDK with CB451 CPU board, DAV + board, and MB93090-MB00 motherboard running MMU linux. Includes + framebuffer driver. + + (*) defconfig-mmu-standalone + + Default configuration for the MB93091-VDK with only CB451 CPU board + running MMU linux. + + + diff --git a/Documentation/fujitsu/frv/features.txt b/Documentation/fujitsu/frv/features.txt new file mode 100644 index 000000000000..fa20c0e72833 --- /dev/null +++ b/Documentation/fujitsu/frv/features.txt @@ -0,0 +1,310 @@ + =========================== + FUJITSU FR-V LINUX FEATURES + =========================== + +This kernel port has a number of features of which the user should be aware: + + (*) Linux and uClinux + + The FR-V architecture port supports both normal MMU linux and uClinux out + of the same sources. + + + (*) CPU support + + Support for the FR401, FR403, FR405, FR451 and FR555 CPUs should work with + the same uClinux kernel configuration. + + In normal (MMU) Linux mode, only the FR451 CPU will work as that is the + only one with a suitably featured CPU. + + The kernel is written and compiled with the assumption that only the + bottom 32 GR registers and no FR registers will be used by the kernel + itself, however all extra userspace registers will be saved on context + switch. Note that since most CPUs can't support lazy switching, no attempt + is made to do lazy register saving where that would be possible (FR555 + only currently). + + + (*) Board support + + The board on which the kernel will run can be configured on the "Processor + type and features" configuration tab. + + Set the System to "MB93093-PDK" to boot from the MB93093 (FR403) PDK. + + Set the System to "MB93091-VDK" to boot from the CB11, CB30, CB41, CB60, + CB70 or CB451 VDK boards. Set the Motherboard setting to "MB93090-MB00" to + boot with the standard ATA90590B VDK motherboard, and set it to "None" to + boot without any motherboard. + + + (*) Binary Formats + + The only userspace binary format supported is FDPIC ELF. Normal ELF, FLAT + and AOUT binaries are not supported for this architecture. + + FDPIC ELF supports shared library and program interpreter facilities. + + + (*) Scheduler Speed + + The kernel scheduler runs at 100Hz irrespective of the clock speed on this + architecture. This value is set in asm/param.h (see the HZ macro defined + there). + + + (*) Normal (MMU) Linux Memory Layout. + + See mmu-layout.txt in this directory for a description of the normal linux + memory layout + + See include/asm-frv/mem-layout.h for constants pertaining to the memory + layout. + + See include/asm-frv/mb-regs.h for the constants pertaining to the I/O bus + controller configuration. + + + (*) uClinux Memory Layout + + The memory layout used by the uClinux kernel is as follows: + + 0x00000000 - 0x00000FFF Null pointer catch page + 0x20000000 - 0x200FFFFF CS2# [PDK] FPGA + 0xC0000000 - 0xCFFFFFFF SDRAM + 0xC0000000 Base of Linux kernel image + 0xE0000000 - 0xEFFFFFFF CS2# [VDK] SLBUS/PCI window + 0xF0000000 - 0xF0FFFFFF CS5# MB93493 CSC area (DAV daughter board) + 0xF1000000 - 0xF1FFFFFF CS7# [CB70/CB451] CPU-card PCMCIA port space + 0xFC000000 - 0xFC0FFFFF CS1# [VDK] MB86943 config space + 0xFC100000 - 0xFC1FFFFF CS6# [CB70/CB451] CPU-card DM9000 NIC space + 0xFC100000 - 0xFC1FFFFF CS6# [PDK] AX88796 NIC space + 0xFC200000 - 0xFC2FFFFF CS3# MB93493 CSR area (DAV daughter board) + 0xFD000000 - 0xFDFFFFFF CS4# [CB70/CB451] CPU-card extra flash space + 0xFE000000 - 0xFEFFFFFF Internal CPU peripherals + 0xFF000000 - 0xFF1FFFFF CS0# Flash 1 + 0xFF200000 - 0xFF3FFFFF CS0# Flash 2 + 0xFFC00000 - 0xFFC0001F CS0# [VDK] FPGA + + The kernel reads the size of the SDRAM from the memory bus controller + registers by default. + + The kernel initialisation code (1) adjusts the SDRAM base addresses to + move the SDRAM to desired address, (2) moves the kernel image down to the + bottom of SDRAM, (3) adjusts the bus controller registers to move I/O + windows, and (4) rearranges the protection registers to protect all of + this. + + The reasons for doing this are: (1) the page at address 0 should be + inaccessible so that NULL pointer errors can be caught; and (2) the bottom + three quarters are left unoccupied so that an FR-V CPU with an MMU can use + it for virtual userspace mappings. + + See include/asm-frv/mem-layout.h for constants pertaining to the memory + layout. + + See include/asm-frv/mb-regs.h for the constants pertaining to the I/O bus + controller configuration. + + + (*) uClinux Memory Protection + + A DAMPR register is used to cover the entire region used for I/O + (0xE0000000 - 0xFFFFFFFF). This permits the kernel to make uncached + accesses to this region. Userspace is not permitted to access it. + + The DAMPR/IAMPR protection registers not in use for any other purpose are + tiled over the top of the SDRAM such that: + + (1) The core kernel image is covered by as small a tile as possible + granting only the kernel access to the underlying data, whilst + making sure no SDRAM is actually made unavailable by this approach. + + (2) All other tiles are arranged to permit userspace access to the rest + of the SDRAM. + + Barring point (1), there is nothing to protect kernel data against + userspace damage - but this is uClinux. + + + (*) Exceptions and Fixups + + Since the FR40x and FR55x CPUs that do not have full MMUs generate + imprecise data error exceptions, there are currently no automatic fixup + services available in uClinux. This includes misaligned memory access + fixups. + + Userspace EFAULT errors can be trapped by issuing a MEMBAR instruction and + forcing the fault to happen there. + + On the FR451, however, data exceptions are mostly precise, and so + exception fixup handling is implemented as normal. + + + (*) Userspace Breakpoints + + The ptrace() system call supports the following userspace debugging + features: + + (1) Hardware assisted single step. + + (2) Breakpoint via the FR-V "BREAK" instruction. + + (3) Breakpoint via the FR-V "TIRA GR0, #1" instruction. + + (4) Syscall entry/exit trap. + + Each of the above generates a SIGTRAP. + + + (*) On-Chip Serial Ports + + The FR-V on-chip serial ports are made available as ttyS0 and ttyS1. Note + that if the GDB stub is compiled in, ttyS1 will not actually be available + as it will be being used for the GDB stub. + + These ports can be made by: + + mknod /dev/ttyS0 c 4 64 + mknod /dev/ttyS1 c 4 65 + + + (*) Maskable Interrupts + + Level 15 (Non-maskable) interrupts are dealt with by the GDB stub if + present, and cause a panic if not. If the GDB stub is present, ttyS1's + interrupts are rated at level 15. + + All other interrupts are distributed over the set of available priorities + so that no IRQs are shared where possible. The arch interrupt handling + routines attempt to disentangle the various sources available through the + CPU's own multiplexor, and those on off-CPU peripherals. + + + (*) Accessing PCI Devices + + Where PCI is available, care must be taken when dealing with drivers that + access PCI devices. PCI devices present their data in little-endian form, + but the CPU sees it in big-endian form. The macros in asm/io.h try to get + this right, but may not under all circumstances... + + + (*) Ax88796 Ethernet Driver + + The MB93093 PDK board has an Ax88796 ethernet chipset (an NE2000 clone). A + driver has been written to deal specifically with this. The driver + provides MII services for the card. + + The driver can be configured by running make xconfig, and going to: + + (*) Network device support + - turn on "Network device support" + (*) Ethernet (10 or 100Mbit) + - turn on "Ethernet (10 or 100Mbit)" + - turn on "AX88796 NE2000 compatible chipset" + + The driver can be found in: + + drivers/net/ax88796.c + include/asm/ax88796.h + + + (*) WorkRAM Driver + + This driver provides a character device that permits access to the WorkRAM + that can be found on the FR451 CPU. Each page is accessible through a + separate minor number, thereby permitting each page to have its own + filesystem permissions set on the device file. + + The device files should be: + + mknod /dev/frv/workram0 c 240 0 + mknod /dev/frv/workram1 c 240 1 + mknod /dev/frv/workram2 c 240 2 + ... + + The driver will not permit the opening of any device file that does not + correspond to at least a partial page of WorkRAM. So the first device file + is the only one available on the FR451. If any other CPU is detected, none + of the devices will be openable. + + The devices can be accessed with read, write and llseek, and can also be + mmapped. If they're mmapped, they will only map at the appropriate + 0x7e8nnnnn address on linux and at the 0xfe8nnnnn address on uClinux. If + MAP_FIXED is not specified, the appropriate address will be chosen anyway. + + The mappings must be MAP_SHARED not MAP_PRIVATE, and must not be + PROT_EXEC. They must also start at file offset 0, and must not be longer + than one page in size. + + This driver can be configured by running make xconfig, and going to: + + (*) Character devices + - turn on "Fujitsu FR-V CPU WorkRAM support" + + + (*) Dynamic data cache write mode changing + + It is possible to view and to change the data cache's write mode through + the /proc/sys/frv/cache-mode file while the kernel is running. There are + two modes available: + + NAME MEANING + ===== ========================================== + wthru Data cache is in Write-Through mode + wback Data cache is in Write-Back/Copy-Back mode + + To read the cache mode: + + # cat /proc/sys/frv/cache-mode + wthru + + To change the cache mode: + + # echo wback >/proc/sys/frv/cache-mode + # cat /proc/sys/frv/cache-mode + wback + + + (*) MMU Context IDs and Pinning + + On MMU Linux the CPU supports the concept of a context ID in its MMU to + make it more efficient (TLB entries are labelled with a context ID to link + them to specific tasks). + + Normally once a context ID is allocated, it will remain affixed to a task + or CLONE_VM'd group of tasks for as long as it exists. However, since the + kernel is capable of supporting more tasks than there are possible ID + numbers, the kernel will pass context IDs from one task to another if + there are insufficient available. + + The context ID currently in use by a task can be viewed in /proc: + + # grep CXNR /proc/1/status + CXNR: 1 + + Note that kernel threads do not have a userspace context, and so will not + show a CXNR entry in that file. + + Under some circumstances, however, it is desirable to pin a context ID on + a process such that the kernel won't pass it on. This can be done by + writing the process ID of the target process to a special file: + + # echo 17 >/proc/sys/frv/pin-cxnr + + Reading from the file will then show the context ID pinned. + + # cat /proc/sys/frv/pin-cxnr + 4 + + The context ID will remain pinned as long as any process is using that + context, i.e.: when the all the subscribing processes have exited or + exec'd; or when an unpinning request happens: + + # echo 0 >/proc/sys/frv/pin-cxnr + + When there isn't a pinned context, the file shows -1: + + # cat /proc/sys/frv/pin-cxnr + -1 diff --git a/Documentation/fujitsu/frv/gdbinit b/Documentation/fujitsu/frv/gdbinit new file mode 100644 index 000000000000..51517b6f307f --- /dev/null +++ b/Documentation/fujitsu/frv/gdbinit @@ -0,0 +1,102 @@ +set remotebreak 1 + +define _amr + +printf "AMRx DAMR IAMR \n" +printf "==== ===================== =====================\n" +printf "amr0 : L:%08lx P:%08lx : L:%08lx P:%08lx\n",__debug_mmu.damr[0x0].L,__debug_mmu.damr[0x0].P,__debug_mmu.iamr[0x0].L,__debug_mmu.iamr[0x0].P +printf "amr1 : L:%08lx P:%08lx : L:%08lx P:%08lx\n",__debug_mmu.damr[0x1].L,__debug_mmu.damr[0x1].P,__debug_mmu.iamr[0x1].L,__debug_mmu.iamr[0x1].P +printf "amr2 : L:%08lx P:%08lx : L:%08lx P:%08lx\n",__debug_mmu.damr[0x2].L,__debug_mmu.damr[0x2].P,__debug_mmu.iamr[0x2].L,__debug_mmu.iamr[0x2].P +printf "amr3 : L:%08lx P:%08lx : L:%08lx P:%08lx\n",__debug_mmu.damr[0x3].L,__debug_mmu.damr[0x3].P,__debug_mmu.iamr[0x3].L,__debug_mmu.iamr[0x3].P +printf "amr4 : L:%08lx P:%08lx : L:%08lx P:%08lx\n",__debug_mmu.damr[0x4].L,__debug_mmu.damr[0x4].P,__debug_mmu.iamr[0x4].L,__debug_mmu.iamr[0x4].P +printf "amr5 : L:%08lx P:%08lx : L:%08lx P:%08lx\n",__debug_mmu.damr[0x5].L,__debug_mmu.damr[0x5].P,__debug_mmu.iamr[0x5].L,__debug_mmu.iamr[0x5].P +printf "amr6 : L:%08lx P:%08lx : L:%08lx P:%08lx\n",__debug_mmu.damr[0x6].L,__debug_mmu.damr[0x6].P,__debug_mmu.iamr[0x6].L,__debug_mmu.iamr[0x6].P +printf "amr7 : L:%08lx P:%08lx : L:%08lx P:%08lx\n",__debug_mmu.damr[0x7].L,__debug_mmu.damr[0x7].P,__debug_mmu.iamr[0x7].L,__debug_mmu.iamr[0x7].P + +printf "amr8 : L:%08lx P:%08lx\n",__debug_mmu.damr[0x8].L,__debug_mmu.damr[0x8].P +printf "amr9 : L:%08lx P:%08lx\n",__debug_mmu.damr[0x9].L,__debug_mmu.damr[0x9].P +printf "amr10: L:%08lx P:%08lx\n",__debug_mmu.damr[0xa].L,__debug_mmu.damr[0xa].P +printf "amr11: L:%08lx P:%08lx\n",__debug_mmu.damr[0xb].L,__debug_mmu.damr[0xb].P + +end + + +define _tlb +printf "tlb[0x00]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x0].L,__debug_mmu.tlb[0x0].P,__debug_mmu.tlb[0x40+0x0].L,__debug_mmu.tlb[0x40+0x0].P +printf "tlb[0x01]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x1].L,__debug_mmu.tlb[0x1].P,__debug_mmu.tlb[0x40+0x1].L,__debug_mmu.tlb[0x40+0x1].P +printf "tlb[0x02]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x2].L,__debug_mmu.tlb[0x2].P,__debug_mmu.tlb[0x40+0x2].L,__debug_mmu.tlb[0x40+0x2].P +printf "tlb[0x03]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x3].L,__debug_mmu.tlb[0x3].P,__debug_mmu.tlb[0x40+0x3].L,__debug_mmu.tlb[0x40+0x3].P +printf "tlb[0x04]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x4].L,__debug_mmu.tlb[0x4].P,__debug_mmu.tlb[0x40+0x4].L,__debug_mmu.tlb[0x40+0x4].P +printf "tlb[0x05]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x5].L,__debug_mmu.tlb[0x5].P,__debug_mmu.tlb[0x40+0x5].L,__debug_mmu.tlb[0x40+0x5].P +printf "tlb[0x06]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x6].L,__debug_mmu.tlb[0x6].P,__debug_mmu.tlb[0x40+0x6].L,__debug_mmu.tlb[0x40+0x6].P +printf "tlb[0x07]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x7].L,__debug_mmu.tlb[0x7].P,__debug_mmu.tlb[0x40+0x7].L,__debug_mmu.tlb[0x40+0x7].P +printf "tlb[0x08]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x8].L,__debug_mmu.tlb[0x8].P,__debug_mmu.tlb[0x40+0x8].L,__debug_mmu.tlb[0x40+0x8].P +printf "tlb[0x09]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x9].L,__debug_mmu.tlb[0x9].P,__debug_mmu.tlb[0x40+0x9].L,__debug_mmu.tlb[0x40+0x9].P +printf "tlb[0x0a]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0xa].L,__debug_mmu.tlb[0xa].P,__debug_mmu.tlb[0x40+0xa].L,__debug_mmu.tlb[0x40+0xa].P +printf "tlb[0x0b]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0xb].L,__debug_mmu.tlb[0xb].P,__debug_mmu.tlb[0x40+0xb].L,__debug_mmu.tlb[0x40+0xb].P +printf "tlb[0x0c]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0xc].L,__debug_mmu.tlb[0xc].P,__debug_mmu.tlb[0x40+0xc].L,__debug_mmu.tlb[0x40+0xc].P +printf "tlb[0x0d]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0xd].L,__debug_mmu.tlb[0xd].P,__debug_mmu.tlb[0x40+0xd].L,__debug_mmu.tlb[0x40+0xd].P +printf "tlb[0x0e]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0xe].L,__debug_mmu.tlb[0xe].P,__debug_mmu.tlb[0x40+0xe].L,__debug_mmu.tlb[0x40+0xe].P +printf "tlb[0x0f]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0xf].L,__debug_mmu.tlb[0xf].P,__debug_mmu.tlb[0x40+0xf].L,__debug_mmu.tlb[0x40+0xf].P +printf "tlb[0x10]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x10].L,__debug_mmu.tlb[0x10].P,__debug_mmu.tlb[0x40+0x10].L,__debug_mmu.tlb[0x40+0x10].P +printf "tlb[0x11]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x11].L,__debug_mmu.tlb[0x11].P,__debug_mmu.tlb[0x40+0x11].L,__debug_mmu.tlb[0x40+0x11].P +printf "tlb[0x12]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x12].L,__debug_mmu.tlb[0x12].P,__debug_mmu.tlb[0x40+0x12].L,__debug_mmu.tlb[0x40+0x12].P +printf "tlb[0x13]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x13].L,__debug_mmu.tlb[0x13].P,__debug_mmu.tlb[0x40+0x13].L,__debug_mmu.tlb[0x40+0x13].P +printf "tlb[0x14]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x14].L,__debug_mmu.tlb[0x14].P,__debug_mmu.tlb[0x40+0x14].L,__debug_mmu.tlb[0x40+0x14].P +printf "tlb[0x15]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x15].L,__debug_mmu.tlb[0x15].P,__debug_mmu.tlb[0x40+0x15].L,__debug_mmu.tlb[0x40+0x15].P +printf "tlb[0x16]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x16].L,__debug_mmu.tlb[0x16].P,__debug_mmu.tlb[0x40+0x16].L,__debug_mmu.tlb[0x40+0x16].P +printf "tlb[0x17]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x17].L,__debug_mmu.tlb[0x17].P,__debug_mmu.tlb[0x40+0x17].L,__debug_mmu.tlb[0x40+0x17].P +printf "tlb[0x18]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x18].L,__debug_mmu.tlb[0x18].P,__debug_mmu.tlb[0x40+0x18].L,__debug_mmu.tlb[0x40+0x18].P +printf "tlb[0x19]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x19].L,__debug_mmu.tlb[0x19].P,__debug_mmu.tlb[0x40+0x19].L,__debug_mmu.tlb[0x40+0x19].P +printf "tlb[0x1a]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x1a].L,__debug_mmu.tlb[0x1a].P,__debug_mmu.tlb[0x40+0x1a].L,__debug_mmu.tlb[0x40+0x1a].P +printf "tlb[0x1b]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x1b].L,__debug_mmu.tlb[0x1b].P,__debug_mmu.tlb[0x40+0x1b].L,__debug_mmu.tlb[0x40+0x1b].P +printf "tlb[0x1c]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x1c].L,__debug_mmu.tlb[0x1c].P,__debug_mmu.tlb[0x40+0x1c].L,__debug_mmu.tlb[0x40+0x1c].P +printf "tlb[0x1d]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x1d].L,__debug_mmu.tlb[0x1d].P,__debug_mmu.tlb[0x40+0x1d].L,__debug_mmu.tlb[0x40+0x1d].P +printf "tlb[0x1e]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x1e].L,__debug_mmu.tlb[0x1e].P,__debug_mmu.tlb[0x40+0x1e].L,__debug_mmu.tlb[0x40+0x1e].P +printf "tlb[0x1f]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x1f].L,__debug_mmu.tlb[0x1f].P,__debug_mmu.tlb[0x40+0x1f].L,__debug_mmu.tlb[0x40+0x1f].P +printf "tlb[0x20]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x20].L,__debug_mmu.tlb[0x20].P,__debug_mmu.tlb[0x40+0x20].L,__debug_mmu.tlb[0x40+0x20].P +printf "tlb[0x21]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x21].L,__debug_mmu.tlb[0x21].P,__debug_mmu.tlb[0x40+0x21].L,__debug_mmu.tlb[0x40+0x21].P +printf "tlb[0x22]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x22].L,__debug_mmu.tlb[0x22].P,__debug_mmu.tlb[0x40+0x22].L,__debug_mmu.tlb[0x40+0x22].P +printf "tlb[0x23]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x23].L,__debug_mmu.tlb[0x23].P,__debug_mmu.tlb[0x40+0x23].L,__debug_mmu.tlb[0x40+0x23].P +printf "tlb[0x24]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x24].L,__debug_mmu.tlb[0x24].P,__debug_mmu.tlb[0x40+0x24].L,__debug_mmu.tlb[0x40+0x24].P +printf "tlb[0x25]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x25].L,__debug_mmu.tlb[0x25].P,__debug_mmu.tlb[0x40+0x25].L,__debug_mmu.tlb[0x40+0x25].P +printf "tlb[0x26]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x26].L,__debug_mmu.tlb[0x26].P,__debug_mmu.tlb[0x40+0x26].L,__debug_mmu.tlb[0x40+0x26].P +printf "tlb[0x27]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x27].L,__debug_mmu.tlb[0x27].P,__debug_mmu.tlb[0x40+0x27].L,__debug_mmu.tlb[0x40+0x27].P +printf "tlb[0x28]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x28].L,__debug_mmu.tlb[0x28].P,__debug_mmu.tlb[0x40+0x28].L,__debug_mmu.tlb[0x40+0x28].P +printf "tlb[0x29]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x29].L,__debug_mmu.tlb[0x29].P,__debug_mmu.tlb[0x40+0x29].L,__debug_mmu.tlb[0x40+0x29].P +printf "tlb[0x2a]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x2a].L,__debug_mmu.tlb[0x2a].P,__debug_mmu.tlb[0x40+0x2a].L,__debug_mmu.tlb[0x40+0x2a].P +printf "tlb[0x2b]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x2b].L,__debug_mmu.tlb[0x2b].P,__debug_mmu.tlb[0x40+0x2b].L,__debug_mmu.tlb[0x40+0x2b].P +printf "tlb[0x2c]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x2c].L,__debug_mmu.tlb[0x2c].P,__debug_mmu.tlb[0x40+0x2c].L,__debug_mmu.tlb[0x40+0x2c].P +printf "tlb[0x2d]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x2d].L,__debug_mmu.tlb[0x2d].P,__debug_mmu.tlb[0x40+0x2d].L,__debug_mmu.tlb[0x40+0x2d].P +printf "tlb[0x2e]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x2e].L,__debug_mmu.tlb[0x2e].P,__debug_mmu.tlb[0x40+0x2e].L,__debug_mmu.tlb[0x40+0x2e].P +printf "tlb[0x2f]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x2f].L,__debug_mmu.tlb[0x2f].P,__debug_mmu.tlb[0x40+0x2f].L,__debug_mmu.tlb[0x40+0x2f].P +printf "tlb[0x30]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x30].L,__debug_mmu.tlb[0x30].P,__debug_mmu.tlb[0x40+0x30].L,__debug_mmu.tlb[0x40+0x30].P +printf "tlb[0x31]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x31].L,__debug_mmu.tlb[0x31].P,__debug_mmu.tlb[0x40+0x31].L,__debug_mmu.tlb[0x40+0x31].P +printf "tlb[0x32]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x32].L,__debug_mmu.tlb[0x32].P,__debug_mmu.tlb[0x40+0x32].L,__debug_mmu.tlb[0x40+0x32].P +printf "tlb[0x33]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x33].L,__debug_mmu.tlb[0x33].P,__debug_mmu.tlb[0x40+0x33].L,__debug_mmu.tlb[0x40+0x33].P +printf "tlb[0x34]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x34].L,__debug_mmu.tlb[0x34].P,__debug_mmu.tlb[0x40+0x34].L,__debug_mmu.tlb[0x40+0x34].P +printf "tlb[0x35]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x35].L,__debug_mmu.tlb[0x35].P,__debug_mmu.tlb[0x40+0x35].L,__debug_mmu.tlb[0x40+0x35].P +printf "tlb[0x36]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x36].L,__debug_mmu.tlb[0x36].P,__debug_mmu.tlb[0x40+0x36].L,__debug_mmu.tlb[0x40+0x36].P +printf "tlb[0x37]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x37].L,__debug_mmu.tlb[0x37].P,__debug_mmu.tlb[0x40+0x37].L,__debug_mmu.tlb[0x40+0x37].P +printf "tlb[0x38]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x38].L,__debug_mmu.tlb[0x38].P,__debug_mmu.tlb[0x40+0x38].L,__debug_mmu.tlb[0x40+0x38].P +printf "tlb[0x39]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x39].L,__debug_mmu.tlb[0x39].P,__debug_mmu.tlb[0x40+0x39].L,__debug_mmu.tlb[0x40+0x39].P +printf "tlb[0x3a]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x3a].L,__debug_mmu.tlb[0x3a].P,__debug_mmu.tlb[0x40+0x3a].L,__debug_mmu.tlb[0x40+0x3a].P +printf "tlb[0x3b]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x3b].L,__debug_mmu.tlb[0x3b].P,__debug_mmu.tlb[0x40+0x3b].L,__debug_mmu.tlb[0x40+0x3b].P +printf "tlb[0x3c]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x3c].L,__debug_mmu.tlb[0x3c].P,__debug_mmu.tlb[0x40+0x3c].L,__debug_mmu.tlb[0x40+0x3c].P +printf "tlb[0x3d]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x3d].L,__debug_mmu.tlb[0x3d].P,__debug_mmu.tlb[0x40+0x3d].L,__debug_mmu.tlb[0x40+0x3d].P +printf "tlb[0x3e]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x3e].L,__debug_mmu.tlb[0x3e].P,__debug_mmu.tlb[0x40+0x3e].L,__debug_mmu.tlb[0x40+0x3e].P +printf "tlb[0x3f]: %08lx %08lx %08lx %08lx\n",__debug_mmu.tlb[0x3f].L,__debug_mmu.tlb[0x3f].P,__debug_mmu.tlb[0x40+0x3f].L,__debug_mmu.tlb[0x40+0x3f].P +end + + +define _pgd +p (pgd_t[0x40])*(pgd_t*)(__debug_mmu.damr[0x3].L) +end + +define _ptd_i +p (pte_t[0x1000])*(pte_t*)(__debug_mmu.damr[0x4].L) +end + +define _ptd_d +p (pte_t[0x1000])*(pte_t*)(__debug_mmu.damr[0x5].L) +end diff --git a/Documentation/fujitsu/frv/gdbstub.txt b/Documentation/fujitsu/frv/gdbstub.txt new file mode 100644 index 000000000000..6ce5aa9abbc5 --- /dev/null +++ b/Documentation/fujitsu/frv/gdbstub.txt @@ -0,0 +1,130 @@ + ==================== + DEBUGGING FR-V LINUX + ==================== + + +The kernel contains a GDB stub that talks GDB remote protocol across a serial +port. This permits GDB to single step through the kernel, set breakpoints and +trap exceptions that happen in kernel space and interrupt execution. It also +permits the NMI interrupt button or serial port events to jump the kernel into +the debugger. + +On the CPUs that have on-chip UARTs (FR400, FR403, FR405, FR555), the +GDB stub hijacks a serial port for its own purposes, and makes it +generate level 15 interrupts (NMI). The kernel proper cannot see the serial +port in question under these conditions. + +On the MB93091-VDK CPU boards, the GDB stub uses UART1, which would otherwise +be /dev/ttyS1. On the MB93093-PDK, the GDB stub uses UART0. Therefore, on the +PDK there is no externally accessible serial port and the serial port to +which the touch screen is attached becomes /dev/ttyS0. + +Note that the GDB stub runs entirely within CPU debug mode, and so should not +incur any exceptions or interrupts whilst it is active. In particular, note +that the clock will lose time since it is implemented in software. + + +================== +KERNEL PREPARATION +================== + +Firstly, a debuggable kernel must be built. To do this, unpack the kernel tree +and copy the configuration that you wish to use to .config. Then reconfigure +the following things on the "Kernel Hacking" tab: + + (*) "Include debugging information" + + Set this to "Y". This causes all C and Assembly files to be compiled + to include debugging information. + + (*) "In-kernel GDB stub" + + Set this to "Y". This causes the GDB stub to be compiled into the + kernel. + + (*) "Immediate activation" + + Set this to "Y" if you want the GDB stub to activate as soon as possible + and wait for GDB to connect. This allows you to start tracing right from + the beginning of start_kernel() in init/main.c. + + (*) "Console through GDB stub" + + Set this to "Y" if you wish to be able to use "console=gdb0" on the + command line. That tells the kernel to pass system console messages to + GDB (which then prints them on its standard output). This is useful when + debugging the serial drivers that'd otherwise be used to pass console + messages to the outside world. + +Then build as usual, download to the board and execute. Note that if +"Immediate activation" was selected, then the kernel will wait for GDB to +attach. If not, then the kernel will boot immediately and GDB will have to +interupt it or wait for an exception to occur if before doing anything with +the kernel. + + +========================= +KERNEL DEBUGGING WITH GDB +========================= + +Set the serial port on the computer that's going to run GDB to the appropriate +baud rate. Assuming the board's debug port is connected to ttyS0/COM1 on the +computer doing the debugging: + + stty -F /dev/ttyS0 115200 + +Then start GDB in the base of the kernel tree: + + frv-uclinux-gdb linux [uClinux] + +Or: + + frv-uclinux-gdb vmlinux [MMU linux] + +When the prompt appears: + + GNU gdb frv-031024 + Copyright 2003 Free Software Foundation, Inc. + GDB is free software, covered by the GNU General Public License, and you are + welcome to change it and/or distribute copies of it under certain conditions. + Type "show copying" to see the conditions. + There is absolutely no warranty for GDB. Type "show warranty" for details. + This GDB was configured as "--host=i686-pc-linux-gnu --target=frv-uclinux"... + (gdb) + +Attach to the board like this: + + (gdb) target remote /dev/ttyS0 + Remote debugging using /dev/ttyS0 + start_kernel () at init/main.c:395 + (gdb) + +This should show the appropriate lines from the source too. The kernel can +then be debugged almost as if it's any other program. + + +=============================== +INTERRUPTING THE RUNNING KERNEL +=============================== + +The kernel can be interrupted whilst it is running, causing a jump back to the +GDB stub and the debugger: + + (*) Pressing Ctrl-C in GDB. This will cause GDB to try and interrupt the + kernel by sending an RS232 BREAK over the serial line to the GDB + stub. This will (mostly) immediately interrupt the kernel and return it + to the debugger. + + (*) Pressing the NMI button on the board will also cause a jump into the + debugger. + + (*) Setting a software breakpoint. This sets a break instruction at the + desired location which the GDB stub then traps the exception for. + + (*) Setting a hardware breakpoint. The GDB stub is capable of using the IBAR + and DBAR registers to assist debugging. + +Furthermore, the GDB stub will intercept a number of exceptions automatically +if they are caused by kernel execution. It will also intercept BUG() macro +invokation. + diff --git a/Documentation/fujitsu/frv/mmu-layout.txt b/Documentation/fujitsu/frv/mmu-layout.txt new file mode 100644 index 000000000000..11dcc5679887 --- /dev/null +++ b/Documentation/fujitsu/frv/mmu-layout.txt @@ -0,0 +1,306 @@ + ================================= + FR451 MMU LINUX MEMORY MANAGEMENT + ================================= + +============ +MMU HARDWARE +============ + +FR451 MMU Linux puts the MMU into EDAT mode whilst running. This means that it uses both the SAT +registers and the DAT TLB to perform address translation. + +There are 8 IAMLR/IAMPR register pairs and 16 DAMLR/DAMPR register pairs for SAT mode. + +In DAT mode, there is also a TLB organised in cache format as 64 lines x 2 ways. Each line spans a +16KB range of addresses, but can match a larger region. + + +=========================== +MEMORY MANAGEMENT REGISTERS +=========================== + +Certain control registers are used by the kernel memory management routines: + + REGISTERS USAGE + ====================== ================================================== + IAMR0, DAMR0 Kernel image and data mappings + IAMR1, DAMR1 First-chance TLB lookup mapping + DAMR2 Page attachment for cache flush by page + DAMR3 Current PGD mapping + SCR0, DAMR4 Instruction TLB PGE/PTD cache + SCR1, DAMR5 Data TLB PGE/PTD cache + DAMR6-10 kmap_atomic() mappings + DAMR11 I/O mapping + CXNR mm_struct context ID + TTBR Page directory (PGD) pointer (physical address) + + +===================== +GENERAL MEMORY LAYOUT +===================== + +The physical memory layout is as follows: + + PHYSICAL ADDRESS CONTROLLER DEVICE + =================== ============== ======================================= + 00000000 - BFFFFFFF SDRAM SDRAM area + E0000000 - EFFFFFFF L-BUS CS2# VDK SLBUS/PCI window + F0000000 - F0FFFFFF L-BUS CS5# MB93493 CSC area (DAV daughter board) + F1000000 - F1FFFFFF L-BUS CS7# (CB70 CPU-card PCMCIA port I/O space) + FC000000 - FC0FFFFF L-BUS CS1# VDK MB86943 config space + FC100000 - FC1FFFFF L-BUS CS6# DM9000 NIC I/O space + FC200000 - FC2FFFFF L-BUS CS3# MB93493 CSR area (DAV daughter board) + FD000000 - FDFFFFFF L-BUS CS4# (CB70 CPU-card extra flash space) + FE000000 - FEFFFFFF Internal CPU peripherals + FF000000 - FF1FFFFF L-BUS CS0# Flash 1 + FF200000 - FF3FFFFF L-BUS CS0# Flash 2 + FFC00000 - FFC0001F L-BUS CS0# FPGA + +The virtual memory layout is: + + VIRTUAL ADDRESS PHYSICAL TRANSLATOR FLAGS SIZE OCCUPATION + ================= ======== ============== ======= ======= =================================== + 00004000-BFFFFFFF various TLB,xAMR1 D-N-??V 3GB Userspace + C0000000-CFFFFFFF 00000000 xAMPR0 -L-S--V 256MB Kernel image and data + D0000000-D7FFFFFF various TLB,xAMR1 D-NS??V 128MB vmalloc area + D8000000-DBFFFFFF various TLB,xAMR1 D-NS??V 64MB kmap() area + DC000000-DCFFFFFF various TLB 1MB Secondary kmap_atomic() frame + DD000000-DD27FFFF various DAMR 160KB Primary kmap_atomic() frame + DD040000 DAMR2/IAMR2 -L-S--V page Page cache flush attachment point + DD080000 DAMR3 -L-SC-V page Page Directory (PGD) + DD0C0000 DAMR4 -L-SC-V page Cached insn TLB Page Table lookup + DD100000 DAMR5 -L-SC-V page Cached data TLB Page Table lookup + DD140000 DAMR6 -L-S--V page kmap_atomic(KM_BOUNCE_READ) + DD180000 DAMR7 -L-S--V page kmap_atomic(KM_SKB_SUNRPC_DATA) + DD1C0000 DAMR8 -L-S--V page kmap_atomic(KM_SKB_DATA_SOFTIRQ) + DD200000 DAMR9 -L-S--V page kmap_atomic(KM_USER0) + DD240000 DAMR10 -L-S--V page kmap_atomic(KM_USER1) + E0000000-FFFFFFFF E0000000 DAMR11 -L-SC-V 512MB I/O region + +IAMPR1 and DAMPR1 are used as an extension to the TLB. + + +==================== +KMAP AND KMAP_ATOMIC +==================== + +To access pages in the page cache (which may not be directly accessible if highmem is available), +the kernel calls kmap(), does the access and then calls kunmap(); or it calls kmap_atomic(), does +the access and then calls kunmap_atomic(). + +kmap() creates an attachment between an arbitrary inaccessible page and a range of virtual +addresses by installing a PTE in a special page table. The kernel can then access this page as it +wills. When it's finished, the kernel calls kunmap() to clear the PTE. + +kmap_atomic() does something slightly different. In the interests of speed, it chooses one of two +strategies: + + (1) If possible, kmap_atomic() attaches the requested page to one of DAMPR5 through DAMPR10 + register pairs; and the matching kunmap_atomic() clears the DAMPR. This makes high memory + support really fast as there's no need to flush the TLB or modify the page tables. The DAMLR + registers being used for this are preset during boot and don't change over the lifetime of the + process. There's a direct mapping between the first few kmap_atomic() types, DAMR number and + virtual address slot. + + However, there are more kmap_atomic() types defined than there are DAMR registers available, + so we fall back to: + + (2) kmap_atomic() uses a slot in the secondary frame (determined by the type parameter), and then + locks an entry in the TLB to translate that slot to the specified page. The number of slots is + obviously limited, and their positions are controlled such that each slot is matched by a + different line in the TLB. kunmap() ejects the entry from the TLB. + +Note that the first three kmap atomic types are really just declared as placeholders. The DAMPR +registers involved are actually modified directly. + +Also note that kmap() itself may sleep, kmap_atomic() may never sleep and both always succeed; +furthermore, a driver using kmap() may sleep before calling kunmap(), but may not sleep before +calling kunmap_atomic() if it had previously called kmap_atomic(). + + +=============================== +USING MORE THAN 256MB OF MEMORY +=============================== + +The kernel cannot access more than 256MB of memory directly. The physical layout, however, permits +up to 3GB of SDRAM (possibly 3.25GB) to be made available. By using CONFIG_HIGHMEM, the kernel can +allow userspace (by way of page tables) and itself (by way of kmap) to deal with the memory +allocation. + +External devices can, of course, still DMA to and from all of the SDRAM, even if the kernel can't +see it directly. The kernel translates page references into real addresses for communicating to the +devices. + + +=================== +PAGE TABLE TOPOLOGY +=================== + +The page tables are arranged in 2-layer format. There is a middle layer (PMD) that would be used in +3-layer format tables but that is folded into the top layer (PGD) and so consumes no extra memory +or processing power. + + +------+ PGD PMD + | TTBR |--->+-------------------+ + +------+ | | : STE | + | PGE0 | PME0 : STE | + | | : STE | + +-------------------+ Page Table + | | : STE -------------->+--------+ +0x0000 + | PGE1 | PME0 : STE -----------+ | PTE0 | + | | : STE -------+ | +--------+ + +-------------------+ | | | PTE63 | + | | : STE | | +-->+--------+ +0x0100 + | PGE2 | PME0 : STE | | | PTE64 | + | | : STE | | +--------+ + +-------------------+ | | PTE127 | + | | : STE | +------>+--------+ +0x0200 + | PGE3 | PME0 : STE | | PTE128 | + | | : STE | +--------+ + +-------------------+ | PTE191 | + +--------+ +0x0300 + +Each Page Directory (PGD) is 16KB (page size) in size and is divided into 64 entries (PGEs). Each +PGE contains one Page Mid Directory (PMD). + +Each PMD is 256 bytes in size and contains a single entry (PME). Each PME holds 64 FR451 MMU +segment table entries of 4 bytes apiece. Each PME "points to" a page table. In practice, each STE +points to a subset of the page table, the first to PT+0x0000, the second to PT+0x0100, the third to +PT+0x200, and so on. + +Each PGE and PME covers 64MB of the total virtual address space. + +Each Page Table (PTD) is 16KB (page size) in size, and is divided into 4096 entries (PTEs). Each +entry can point to one 16KB page. In practice, each Linux page table is subdivided into 64 FR451 +MMU page tables. But they are all grouped together to make management easier, in particular rmap +support is then trivial. + +Grouping page tables in this fashion makes PGE caching in SCR0/SCR1 more efficient because the +coverage of the cached item is greater. + +Page tables for the vmalloc area are allocated at boot time and shared between all mm_structs. + + +================= +USER SPACE LAYOUT +================= + +For MMU capable Linux, the regions userspace code are allowed to access are kept entirely separate +from those dedicated to the kernel: + + VIRTUAL ADDRESS SIZE PURPOSE + ================= ===== =================================== + 00000000-00003fff 4KB NULL pointer access trap + 00004000-01ffffff ~32MB lower mmap space (grows up) + 02000000-021fffff 2MB Stack space (grows down from top) + 02200000-nnnnnnnn Executable mapping + nnnnnnnn- brk space (grows up) + -bfffffff upper mmap space (grows down) + +This is so arranged so as to make best use of the 16KB page tables and the way in which PGEs/PMEs +are cached by the TLB handler. The lower mmap space is filled first, and then the upper mmap space +is filled. + + +=============================== +GDB-STUB MMU DEBUGGING SERVICES +=============================== + +The gdb-stub included in this kernel provides a number of services to aid in the debugging of MMU +related kernel services: + + (*) Every time the kernel stops, certain state information is dumped into __debug_mmu. This + variable is defined in arch/frv/kernel/gdb-stub.c. Note that the gdbinit file in this + directory has some useful macros for dealing with this. + + (*) __debug_mmu.tlb[] + + This receives the current TLB contents. This can be viewed with the _tlb GDB macro: + + (gdb) _tlb + tlb[0x00]: 01000005 00718203 01000002 00718203 + tlb[0x01]: 01004002 006d4201 01004005 006d4203 + tlb[0x02]: 01008002 006d0201 01008006 00004200 + tlb[0x03]: 0100c006 007f4202 0100c002 0064c202 + tlb[0x04]: 01110005 00774201 01110002 00774201 + tlb[0x05]: 01114005 00770201 01114002 00770201 + tlb[0x06]: 01118002 0076c201 01118005 0076c201 + ... + tlb[0x3d]: 010f4002 00790200 001f4002 0054ca02 + tlb[0x3e]: 010f8005 0078c201 010f8002 0078c201 + tlb[0x3f]: 001fc002 0056ca01 001fc005 00538a01 + + (*) __debug_mmu.iamr[] + (*) __debug_mmu.damr[] + + These receive the current IAMR and DAMR contents. These can be viewed with with the _amr + GDB macro: + + (gdb) _amr + AMRx DAMR IAMR + ==== ===================== ===================== + amr0 : L:c0000000 P:00000cb9 : L:c0000000 P:000004b9 + amr1 : L:01070005 P:006f9203 : L:0102c005 P:006a1201 + amr2 : L:d8d00000 P:00000000 : L:d8d00000 P:00000000 + amr3 : L:d8d04000 P:00534c0d : L:00000000 P:00000000 + amr4 : L:d8d08000 P:00554c0d : L:00000000 P:00000000 + amr5 : L:d8d0c000 P:00554c0d : L:00000000 P:00000000 + amr6 : L:d8d10000 P:00000000 : L:00000000 P:00000000 + amr7 : L:d8d14000 P:00000000 : L:00000000 P:00000000 + amr8 : L:d8d18000 P:00000000 + amr9 : L:d8d1c000 P:00000000 + amr10: L:d8d20000 P:00000000 + amr11: L:e0000000 P:e0000ccd + + (*) The current task's page directory is bound to DAMR3. + + This can be viewed with the _pgd GDB macro: + + (gdb) _pgd + $3 = {{pge = {{ste = {0x554001, 0x554101, 0x554201, 0x554301, 0x554401, + 0x554501, 0x554601, 0x554701, 0x554801, 0x554901, 0x554a01, + 0x554b01, 0x554c01, 0x554d01, 0x554e01, 0x554f01, 0x555001, + 0x555101, 0x555201, 0x555301, 0x555401, 0x555501, 0x555601, + 0x555701, 0x555801, 0x555901, 0x555a01, 0x555b01, 0x555c01, + 0x555d01, 0x555e01, 0x555f01, 0x556001, 0x556101, 0x556201, + 0x556301, 0x556401, 0x556501, 0x556601, 0x556701, 0x556801, + 0x556901, 0x556a01, 0x556b01, 0x556c01, 0x556d01, 0x556e01, + 0x556f01, 0x557001, 0x557101, 0x557201, 0x557301, 0x557401, + 0x557501, 0x557601, 0x557701, 0x557801, 0x557901, 0x557a01, + 0x557b01, 0x557c01, 0x557d01, 0x557e01, 0x557f01}}}}, {pge = {{ + ste = {0x0 <repeats 64 times>}}}} <repeats 51 times>, {pge = {{ste = { + 0x248001, 0x248101, 0x248201, 0x248301, 0x248401, 0x248501, + 0x248601, 0x248701, 0x248801, 0x248901, 0x248a01, 0x248b01, + 0x248c01, 0x248d01, 0x248e01, 0x248f01, 0x249001, 0x249101, + 0x249201, 0x249301, 0x249401, 0x249501, 0x249601, 0x249701, + 0x249801, 0x249901, 0x249a01, 0x249b01, 0x249c01, 0x249d01, + 0x249e01, 0x249f01, 0x24a001, 0x24a101, 0x24a201, 0x24a301, + 0x24a401, 0x24a501, 0x24a601, 0x24a701, 0x24a801, 0x24a901, + 0x24aa01, 0x24ab01, 0x24ac01, 0x24ad01, 0x24ae01, 0x24af01, + 0x24b001, 0x24b101, 0x24b201, 0x24b301, 0x24b401, 0x24b501, + 0x24b601, 0x24b701, 0x24b801, 0x24b901, 0x24ba01, 0x24bb01, + 0x24bc01, 0x24bd01, 0x24be01, 0x24bf01}}}}, {pge = {{ste = { + 0x0 <repeats 64 times>}}}} <repeats 11 times>} + + (*) The PTD last used by the instruction TLB miss handler is attached to DAMR4. + (*) The PTD last used by the data TLB miss handler is attached to DAMR5. + + These can be viewed with the _ptd_i and _ptd_d GDB macros: + + (gdb) _ptd_d + $5 = {{pte = 0x0} <repeats 127 times>, {pte = 0x539b01}, { + pte = 0x0} <repeats 896 times>, {pte = 0x719303}, {pte = 0x6d5303}, { + pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, { + pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x6a1303}, { + pte = 0x0} <repeats 12 times>, {pte = 0x709303}, {pte = 0x0}, {pte = 0x0}, + {pte = 0x6fd303}, {pte = 0x6f9303}, {pte = 0x6f5303}, {pte = 0x0}, { + pte = 0x6ed303}, {pte = 0x531b01}, {pte = 0x50db01}, { + pte = 0x0} <repeats 13 times>, {pte = 0x5303}, {pte = 0x7f5303}, { + pte = 0x509b01}, {pte = 0x505b01}, {pte = 0x7c9303}, {pte = 0x7b9303}, { + pte = 0x7b5303}, {pte = 0x7b1303}, {pte = 0x7ad303}, {pte = 0x0}, { + pte = 0x0}, {pte = 0x7a1303}, {pte = 0x0}, {pte = 0x795303}, {pte = 0x0}, { + pte = 0x78d303}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, {pte = 0x0}, { + pte = 0x0}, {pte = 0x775303}, {pte = 0x771303}, {pte = 0x76d303}, { + pte = 0x0}, {pte = 0x765303}, {pte = 0x7c5303}, {pte = 0x501b01}, { + pte = 0x4f1b01}, {pte = 0x4edb01}, {pte = 0x0}, {pte = 0x4f9b01}, { + pte = 0x4fdb01}, {pte = 0x0} <repeats 2992 times>} diff --git a/Documentation/hayes-esp.txt b/Documentation/hayes-esp.txt new file mode 100644 index 000000000000..09b5d5856758 --- /dev/null +++ b/Documentation/hayes-esp.txt @@ -0,0 +1,154 @@ +HAYES ESP DRIVER VERSION 2.1 + +A big thanks to the people at Hayes, especially Alan Adamson. Their support +has enabled me to provide enhancements to the driver. + +Please report your experiences with this driver to me (arobinso@nyx.net). I +am looking for both positive and negative feedback. + +*** IMPORTANT CHANGES FOR 2.1 *** +Support for PIO mode. Five situations will cause PIO mode to be used: +1) A multiport card is detected. PIO mode will always be used. (8 port cards +do not support DMA). +2) The DMA channel is set to an invalid value (anything other than 1 or 3). +3) The DMA buffer/channel could not be allocated. The port will revert to PIO +mode until it is reopened. +4) Less than a specified number of bytes need to be transferred to/from the +FIFOs. PIO mode will be used for that transfer only. +5) A port needs to do a DMA transfer and another port is already using the +DMA channel. PIO mode will be used for that transfer only. + +Since the Hayes ESP seems to conflict with other cards (notably sound cards) +when using DMA, DMA is turned off by default. To use DMA, it must be turned +on explicitly, either with the "dma=" option described below or with +setserial. A multiport card can be forced into DMA mode by using setserial; +however, most multiport cards don't support DMA. + +The latest version of setserial allows the enhanced configuration of the ESP +card to be viewed and modified. +*** + +This package contains the files needed to compile a module to support the Hayes +ESP card. The drivers are basically a modified version of the serial drivers. + +Features: + +- Uses the enhanced mode of the ESP card, allowing a wider range of + interrupts and features than compatibility mode +- Uses DMA and 16 bit PIO mode to transfer data to and from the ESP's FIFOs, + reducing CPU load +- Supports primary and secondary ports + + +If the driver is compiled as a module, the IRQs to use can be specified by +using the irq= option. The format is: + +irq=[0x100],[0x140],[0x180],[0x200],[0x240],[0x280],[0x300],[0x380] + +The address in brackets is the base address of the card. The IRQ of +nonexistent cards can be set to 0. If an IRQ of a card that does exist is set +to 0, the driver will attempt to guess at the correct IRQ. For example, to set +the IRQ of the card at address 0x300 to 12, the insmod command would be: + +insmod esp irq=0,0,0,0,0,0,12,0 + +The custom divisor can be set by using the divisor= option. The format is the +same as for the irq= option. Each divisor value is a series of hex digits, +with each digit representing the divisor to use for a corresponding port. The +divisor value is constructed RIGHT TO LEFT. Specifying a nonzero divisor value +will automatically set the spd_cust flag. To calculate the divisor to use for +a certain baud rate, divide the port's base baud (generally 921600) by the +desired rate. For example, to set the divisor of the primary port at 0x300 to +4 and the divisor of the secondary port at 0x308 to 8, the insmod command would +be: + +insmod esp divisor=0,0,0,0,0,0,0x84,0 + +The dma= option can be used to set the DMA channel. The channel can be either +1 or 3. Specifying any other value will force the driver to use PIO mode. +For example, to set the DMA channel to 3, the insmod command would be: + +insmod esp dma=3 + +The rx_trigger= and tx_trigger= options can be used to set the FIFO trigger +levels. They specify when the ESP card should send an interrupt. Larger +values will decrease the number of interrupts; however, a value too high may +result in data loss. Valid values are 1 through 1023, with 768 being the +default. For example, to set the receive trigger level to 512 bytes and the +transmit trigger level to 700 bytes, the insmod command would be: + +insmod esp rx_trigger=512 tx_trigger=700 + +The flow_off= and flow_on= options can be used to set the hardware flow off/ +flow on levels. The flow on level must be lower than the flow off level, and +the flow off level should be higher than rx_trigger. Valid values are 1 +through 1023, with 1016 being the default flow off level and 944 being the +default flow on level. For example, to set the flow off level to 1000 bytes +and the flow on level to 935 bytes, the insmod command would be: + +insmod esp flow_off=1000 flow_on=935 + +The rx_timeout= option can be used to set the receive timeout value. This +value indicates how long after receiving the last character that the ESP card +should wait before signalling an interrupt. Valid values are 0 though 255, +with 128 being the default. A value too high will increase latency, and a +value too low will cause unnecessary interrupts. For example, to set the +receive timeout to 255, the insmod command would be: + +insmod esp rx_timeout=255 + +The pio_threshold= option sets the threshold (in number of characters) for +using PIO mode instead of DMA mode. For example, if this value is 32, +transfers of 32 bytes or less will always use PIO mode. + +insmod esp pio_threshold=32 + +Multiple options can be listed on the insmod command line by separating each +option with a space. For example: + +insmod esp dma=3 trigger=512 + +The esp module can be automatically loaded when needed. To cause this to +happen, add the following lines to /etc/modprobe.conf (replacing the last line +with options for your configuration): + +alias char-major-57 esp +alias char-major-58 esp +options esp irq=0,0,0,0,0,0,3,0 divisor=0,0,0,0,0,0,0x4,0 + +You may also need to run 'depmod -a'. + +Devices must be created manually. To create the devices, note the output from +the module after it is inserted. The output will appear in the location where +kernel messages usually appear (usually /var/adm/messages). Create two devices +for each 'tty' mentioned, one with major of 57 and the other with major of 58. +The minor number should be the same as the tty number reported. The commands +would be (replace ? with the tty number): + +mknod /dev/ttyP? c 57 ? +mknod /dev/cup? c 58 ? + +For example, if the following line appears: + +Oct 24 18:17:23 techno kernel: ttyP8 at 0x0140 (irq = 3) is an ESP primary port + +...two devices should be created: + +mknod /dev/ttyP8 c 57 8 +mknod /dev/cup8 c 58 8 + +You may need to set the permissions on the devices: + +chmod 666 /dev/ttyP* +chmod 666 /dev/cup* + +The ESP module and the serial module should not conflict (they can be used at +the same time). After the ESP module has been loaded the ports on the ESP card +will no longer be accessible by the serial driver. + +If I/O errors are experienced when accessing the port, check for IRQ and DMA +conflicts ('cat /proc/interrupts' and 'cat /proc/dma' for a list of IRQs and +DMAs currently in use). + +Enjoy! +Andrew J. Robinson <arobinso@nyx.net> diff --git a/Documentation/highuid.txt b/Documentation/highuid.txt new file mode 100644 index 000000000000..2c33926b9099 --- /dev/null +++ b/Documentation/highuid.txt @@ -0,0 +1,79 @@ +Notes on the change from 16-bit UIDs to 32-bit UIDs: + +- kernel code MUST take into account __kernel_uid_t and __kernel_uid32_t + when communicating between user and kernel space in an ioctl or data + structure. + +- kernel code should use uid_t and gid_t in kernel-private structures and + code. + +What's left to be done for 32-bit UIDs on all Linux architectures: + +- Disk quotas have an interesting limitation that is not related to the + maximum UID/GID. They are limited by the maximum file size on the + underlying filesystem, because quota records are written at offsets + corresponding to the UID in question. + Further investigation is needed to see if the quota system can cope + properly with huge UIDs. If it can deal with 64-bit file offsets on all + architectures, this should not be a problem. + +- Decide whether or not to keep backwards compatibility with the system + accounting file, or if we should break it as the comments suggest + (currently, the old 16-bit UID and GID are still written to disk, and + part of the former pad space is used to store separate 32-bit UID and + GID) + +- Need to validate that OS emulation calls the 16-bit UID + compatibility syscalls, if the OS being emulated used 16-bit UIDs, or + uses the 32-bit UID system calls properly otherwise. + + This affects at least: + SunOS emulation + Solaris emulation + iBCS on Intel + + sparc32 emulation on sparc64 + (need to support whatever new 32-bit UID system calls are added to + sparc32) + +- Validate that all filesystems behave properly. + + At present, 32-bit UIDs _should_ work for: + ext2 + ufs + isofs + nfs + coda + udf + + Ioctl() fixups have been made for: + ncpfs + smbfs + + Filesystems with simple fixups to prevent 16-bit UID wraparound: + minix + sysv + qnx4 + + Other filesystems have not been checked yet. + +- The ncpfs and smpfs filesystems can not presently use 32-bit UIDs in + all ioctl()s. Some new ioctl()s have been added with 32-bit UIDs, but + more are needed. (as well as new user<->kernel data structures) + +- The ELF core dump format only supports 16-bit UIDs on arm, i386, m68k, + sh, and sparc32. Fixing this is probably not that important, but would + require adding a new ELF section. + +- The ioctl()s used to control the in-kernel NFS server only support + 16-bit UIDs on arm, i386, m68k, sh, and sparc32. + +- make sure that the UID mapping feature of AX25 networking works properly + (it should be safe because it's always used a 32-bit integer to + communicate between user and kernel) + + +Chris Wing +wingc@umich.edu + +last updated: January 11, 2000 diff --git a/Documentation/hpet.txt b/Documentation/hpet.txt new file mode 100644 index 000000000000..4e7cc8d3359b --- /dev/null +++ b/Documentation/hpet.txt @@ -0,0 +1,298 @@ + High Precision Event Timer Driver for Linux + +The High Precision Event Timer (HPET) hardware is the future replacement for the 8254 and Real +Time Clock (RTC) periodic timer functionality. Each HPET can have up two 32 timers. It is possible +to configure the first two timers as legacy replacements for 8254 and RTC periodic. A specification +done by INTEL and Microsoft can be found at http://www.intel.com/labs/platcomp/hpet/hpetspec.htm. + +The driver supports detection of HPET driver allocation and initialization of the HPET before the +driver module_init routine is called. This enables platform code which uses timer 0 or 1 as the +main timer to intercept HPET initialization. An example of this initialization can be found in +arch/i386/kernel/time_hpet.c. + +The driver provides two APIs which are very similar to the API found in the rtc.c driver. +There is a user space API and a kernel space API. An example user space program is provided +below. + +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <fcntl.h> +#include <string.h> +#include <memory.h> +#include <malloc.h> +#include <time.h> +#include <ctype.h> +#include <sys/types.h> +#include <sys/wait.h> +#include <signal.h> +#include <fcntl.h> +#include <errno.h> +#include <sys/time.h> +#include <linux/hpet.h> + + +extern void hpet_open_close(int, const char **); +extern void hpet_info(int, const char **); +extern void hpet_poll(int, const char **); +extern void hpet_fasync(int, const char **); +extern void hpet_read(int, const char **); + +#include <sys/poll.h> +#include <sys/ioctl.h> +#include <signal.h> + +struct hpet_command { + char *command; + void (*func)(int argc, const char ** argv); +} hpet_command[] = { + { + "open-close", + hpet_open_close + }, + { + "info", + hpet_info + }, + { + "poll", + hpet_poll + }, + { + "fasync", + hpet_fasync + }, +}; + +int +main(int argc, const char ** argv) +{ + int i; + + argc--; + argv++; + + if (!argc) { + fprintf(stderr, "-hpet: requires command\n"); + return -1; + } + + + for (i = 0; i < (sizeof (hpet_command) / sizeof (hpet_command[0])); i++) + if (!strcmp(argv[0], hpet_command[i].command)) { + argc--; + argv++; + fprintf(stderr, "-hpet: executing %s\n", + hpet_command[i].command); + hpet_command[i].func(argc, argv); + return 0; + } + + fprintf(stderr, "do_hpet: command %s not implemented\n", argv[0]); + + return -1; +} + +void +hpet_open_close(int argc, const char **argv) +{ + int fd; + + if (argc != 1) { + fprintf(stderr, "hpet_open_close: device-name\n"); + return; + } + + fd = open(argv[0], O_RDONLY); + if (fd < 0) + fprintf(stderr, "hpet_open_close: open failed\n"); + else + close(fd); + + return; +} + +void +hpet_info(int argc, const char **argv) +{ +} + +void +hpet_poll(int argc, const char **argv) +{ + unsigned long freq; + int iterations, i, fd; + struct pollfd pfd; + struct hpet_info info; + struct timeval stv, etv; + struct timezone tz; + long usec; + + if (argc != 3) { + fprintf(stderr, "hpet_poll: device-name freq iterations\n"); + return; + } + + freq = atoi(argv[1]); + iterations = atoi(argv[2]); + + fd = open(argv[0], O_RDONLY); + + if (fd < 0) { + fprintf(stderr, "hpet_poll: open of %s failed\n", argv[0]); + return; + } + + if (ioctl(fd, HPET_IRQFREQ, freq) < 0) { + fprintf(stderr, "hpet_poll: HPET_IRQFREQ failed\n"); + goto out; + } + + if (ioctl(fd, HPET_INFO, &info) < 0) { + fprintf(stderr, "hpet_poll: failed to get info\n"); + goto out; + } + + fprintf(stderr, "hpet_poll: info.hi_flags 0x%lx\n", info.hi_flags); + + if (info.hi_flags && (ioctl(fd, HPET_EPI, 0) < 0)) { + fprintf(stderr, "hpet_poll: HPET_EPI failed\n"); + goto out; + } + + if (ioctl(fd, HPET_IE_ON, 0) < 0) { + fprintf(stderr, "hpet_poll, HPET_IE_ON failed\n"); + goto out; + } + + pfd.fd = fd; + pfd.events = POLLIN; + + for (i = 0; i < iterations; i++) { + pfd.revents = 0; + gettimeofday(&stv, &tz); + if (poll(&pfd, 1, -1) < 0) + fprintf(stderr, "hpet_poll: poll failed\n"); + else { + long data; + + gettimeofday(&etv, &tz); + usec = stv.tv_sec * 1000000 + stv.tv_usec; + usec = (etv.tv_sec * 1000000 + etv.tv_usec) - usec; + + fprintf(stderr, + "hpet_poll: expired time = 0x%lx\n", usec); + + fprintf(stderr, "hpet_poll: revents = 0x%x\n", + pfd.revents); + + if (read(fd, &data, sizeof(data)) != sizeof(data)) { + fprintf(stderr, "hpet_poll: read failed\n"); + } + else + fprintf(stderr, "hpet_poll: data 0x%lx\n", + data); + } + } + +out: + close(fd); + return; +} + +static int hpet_sigio_count; + +static void +hpet_sigio(int val) +{ + fprintf(stderr, "hpet_sigio: called\n"); + hpet_sigio_count++; +} + +void +hpet_fasync(int argc, const char **argv) +{ + unsigned long freq; + int iterations, i, fd, value; + sig_t oldsig; + struct hpet_info info; + + hpet_sigio_count = 0; + fd = -1; + + if ((oldsig = signal(SIGIO, hpet_sigio)) == SIG_ERR) { + fprintf(stderr, "hpet_fasync: failed to set signal handler\n"); + return; + } + + if (argc != 3) { + fprintf(stderr, "hpet_fasync: device-name freq iterations\n"); + goto out; + } + + fd = open(argv[0], O_RDONLY); + + if (fd < 0) { + fprintf(stderr, "hpet_fasync: failed to open %s\n", argv[0]); + return; + } + + + if ((fcntl(fd, F_SETOWN, getpid()) == 1) || + ((value = fcntl(fd, F_GETFL)) == 1) || + (fcntl(fd, F_SETFL, value | O_ASYNC) == 1)) { + fprintf(stderr, "hpet_fasync: fcntl failed\n"); + goto out; + } + + freq = atoi(argv[1]); + iterations = atoi(argv[2]); + + if (ioctl(fd, HPET_IRQFREQ, freq) < 0) { + fprintf(stderr, "hpet_fasync: HPET_IRQFREQ failed\n"); + goto out; + } + + if (ioctl(fd, HPET_INFO, &info) < 0) { + fprintf(stderr, "hpet_fasync: failed to get info\n"); + goto out; + } + + fprintf(stderr, "hpet_fasync: info.hi_flags 0x%lx\n", info.hi_flags); + + if (info.hi_flags && (ioctl(fd, HPET_EPI, 0) < 0)) { + fprintf(stderr, "hpet_fasync: HPET_EPI failed\n"); + goto out; + } + + if (ioctl(fd, HPET_IE_ON, 0) < 0) { + fprintf(stderr, "hpet_fasync, HPET_IE_ON failed\n"); + goto out; + } + + for (i = 0; i < iterations; i++) { + (void) pause(); + fprintf(stderr, "hpet_fasync: count = %d\n", hpet_sigio_count); + } + +out: + signal(SIGIO, oldsig); + + if (fd >= 0) + close(fd); + + return; +} + +The kernel API has three interfaces exported from the driver: + + hpet_register(struct hpet_task *tp, int periodic) + hpet_unregister(struct hpet_task *tp) + hpet_control(struct hpet_task *tp, unsigned int cmd, unsigned long arg) + +The kernel module using this interface fills in the ht_func and ht_data members of the +hpet_task structure before calling hpet_register. hpet_control simply vectors to the hpet_ioctl +routine and has the same commands and respective arguments as the user API. hpet_unregister +is used to terminate usage of the HPET timer reserved by hpet_register. + + diff --git a/Documentation/hw_random.txt b/Documentation/hw_random.txt new file mode 100644 index 000000000000..bb58c36b5845 --- /dev/null +++ b/Documentation/hw_random.txt @@ -0,0 +1,69 @@ + Hardware driver for Intel/AMD/VIA Random Number Generators (RNG) + Copyright 2000,2001 Jeff Garzik <jgarzik@pobox.com> + Copyright 2000,2001 Philipp Rumpf <prumpf@mandrakesoft.com> + +Introduction: + + The hw_random device driver is software that makes use of a + special hardware feature on your CPU or motherboard, + a Random Number Generator (RNG). + + In order to make effective use of this device driver, you + should download the support software as well. Download the + latest version of the "rng-tools" package from the + hw_random driver's official Web site: + + http://sourceforge.net/projects/gkernel/ + +About the Intel RNG hardware, from the firmware hub datasheet: + + The Firmware Hub integrates a Random Number Generator (RNG) + using thermal noise generated from inherently random quantum + mechanical properties of silicon. When not generating new random + bits the RNG circuitry will enter a low power state. Intel will + provide a binary software driver to give third party software + access to our RNG for use as a security feature. At this time, + the RNG is only to be used with a system in an OS-present state. + +Theory of operation: + + Character driver. Using the standard open() + and read() system calls, you can read random data from + the hardware RNG device. This data is NOT CHECKED by any + fitness tests, and could potentially be bogus (if the + hardware is faulty or has been tampered with). Data is only + output if the hardware "has-data" flag is set, but nevertheless + a security-conscious person would run fitness tests on the + data before assuming it is truly random. + + /dev/hwrandom is char device major 10, minor 183. + +Driver notes: + + * FIXME: support poll(2) + + NOTE: request_mem_region was removed, for two reasons: + 1) Only one RNG is supported by this driver, 2) The location + used by the RNG is a fixed location in MMIO-addressable memory, + 3) users with properly working BIOS e820 handling will always + have the region in which the RNG is located reserved, so + request_mem_region calls always fail for proper setups. + However, for people who use mem=XX, BIOS e820 information is + -not- in /proc/iomem, and request_mem_region(RNG_ADDR) can + succeed. + +Driver details: + + Based on: + Intel 82802AB/82802AC Firmware Hub (FWH) Datasheet + May 1999 Order Number: 290658-002 R + + Intel 82802 Firmware Hub: Random Number Generator + Programmer's Reference Manual + December 1999 Order Number: 298029-001 R + + Intel 82802 Firmware HUB Random Number Generator Driver + Copyright (c) 2000 Matt Sottek <msottek@quiknet.com> + + Special thanks to Matt Sottek. I did the "guts", he + did the "brains" and all the testing. diff --git a/Documentation/i2c/busses/i2c-ali1535 b/Documentation/i2c/busses/i2c-ali1535 new file mode 100644 index 000000000000..0db3b4c74ad1 --- /dev/null +++ b/Documentation/i2c/busses/i2c-ali1535 @@ -0,0 +1,42 @@ +Kernel driver i2c-ali1535 + +Supported adapters: + * Acer Labs, Inc. ALI 1535 (south bridge) + Datasheet: Now under NDA + http://www.ali.com.tw/eng/support/datasheet_request.php + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Mark D. Studebaker <mdsxyz123@yahoo.com>, + Dan Eaton <dan.eaton@rocketlogix.com>, + Stephen Rousset<stephen.rousset@rocketlogix.com> + +Description +----------- + +This is the driver for the SMB Host controller on Acer Labs Inc. (ALI) +M1535 South Bridge. + +The M1535 is a South bridge for portable systems. It is very similar to the +M15x3 South bridges also produced by Acer Labs Inc. Some of the registers +within the part have moved and some have been redefined slightly. +Additionally, the sequencing of the SMBus transactions has been modified to +be more consistent with the sequencing recommended by the manufacturer and +observed through testing. These changes are reflected in this driver and +can be identified by comparing this driver to the i2c-ali15x3 driver. For +an overview of these chips see http://www.acerlabs.com + +The SMB controller is part of the M7101 device, which is an ACPI-compliant +Power Management Unit (PMU). + +The whole M7101 device has to be enabled for the SMB to work. You can't +just enable the SMB alone. The SMB and the ACPI have separate I/O spaces. +We make sure that the SMB is enabled. We leave the ACPI alone. + + +Features +-------- + +This driver controls the SMB Host only. This driver does not use +interrupts. diff --git a/Documentation/i2c/busses/i2c-ali1563 b/Documentation/i2c/busses/i2c-ali1563 new file mode 100644 index 000000000000..99ad4b9bcc32 --- /dev/null +++ b/Documentation/i2c/busses/i2c-ali1563 @@ -0,0 +1,27 @@ +Kernel driver i2c-ali1563 + +Supported adapters: + * Acer Labs, Inc. ALI 1563 (south bridge) + Datasheet: Now under NDA + http://www.ali.com.tw/eng/support/datasheet_request.php + +Author: Patrick Mochel <mochel@digitalimplant.org> + +Description +----------- + +This is the driver for the SMB Host controller on Acer Labs Inc. (ALI) +M1563 South Bridge. + +For an overview of these chips see http://www.acerlabs.com + +The M1563 southbridge is deceptively similar to the M1533, with a few +notable exceptions. One of those happens to be the fact they upgraded the +i2c core to be SMBus 2.0 compliant, and happens to be almost identical to +the i2c controller found in the Intel 801 south bridges. + +Features +-------- + +This driver controls the SMB Host only. This driver does not use +interrupts. diff --git a/Documentation/i2c/busses/i2c-ali15x3 b/Documentation/i2c/busses/i2c-ali15x3 new file mode 100644 index 000000000000..ff28d381bebe --- /dev/null +++ b/Documentation/i2c/busses/i2c-ali15x3 @@ -0,0 +1,112 @@ +Kernel driver i2c-ali15x3 + +Supported adapters: + * Acer Labs, Inc. ALI 1533 and 1543C (south bridge) + Datasheet: Now under NDA + http://www.ali.com.tw/eng/support/datasheet_request.php + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Mark D. Studebaker <mdsxyz123@yahoo.com> + +Module Parameters +----------------- + +* force_addr: int + Initialize the base address of the i2c controller + + +Notes +----- + +The force_addr parameter is useful for boards that don't set the address in +the BIOS. Does not do a PCI force; the device must still be present in +lspci. Don't use this unless the driver complains that the base address is +not set. + +Example: 'modprobe i2c-ali15x3 force_addr=0xe800' + +SMBus periodically hangs on ASUS P5A motherboards and can only be cleared +by a power cycle. Cause unknown (see Issues below). + + +Description +----------- + +This is the driver for the SMB Host controller on Acer Labs Inc. (ALI) +M1541 and M1543C South Bridges. + +The M1543C is a South bridge for desktop systems. +The M1541 is a South bridge for portable systems. +They are part of the following ALI chipsets: + + * "Aladdin Pro 2" includes the M1621 Slot 1 North bridge with AGP and + 100MHz CPU Front Side bus + * "Aladdin V" includes the M1541 Socket 7 North bridge with AGP and 100MHz + CPU Front Side bus + Some Aladdin V motherboards: + Asus P5A + Atrend ATC-5220 + BCM/GVC VP1541 + Biostar M5ALA + Gigabyte GA-5AX (** Generally doesn't work because the BIOS doesn't + enable the 7101 device! **) + Iwill XA100 Plus + Micronics C200 + Microstar (MSI) MS-5169 + + * "Aladdin IV" includes the M1541 Socket 7 North bridge + with host bus up to 83.3 MHz. + +For an overview of these chips see http://www.acerlabs.com. At this time the +full data sheets on the web site are password protected, however if you +contact the ALI office in San Jose they may give you the password. + +The M1533/M1543C devices appear as FOUR separate devices on the PCI bus. An +output of lspci will show something similar to the following: + + 00:02.0 USB Controller: Acer Laboratories Inc. M5237 (rev 03) + 00:03.0 Bridge: Acer Laboratories Inc. M7101 <= THIS IS THE ONE WE NEED + 00:07.0 ISA bridge: Acer Laboratories Inc. M1533 (rev c3) + 00:0f.0 IDE interface: Acer Laboratories Inc. M5229 (rev c1) + +** IMPORTANT ** +** If you have a M1533 or M1543C on the board and you get +** "ali15x3: Error: Can't detect ali15x3!" +** then run lspci. +** If you see the 1533 and 5229 devices but NOT the 7101 device, +** then you must enable ACPI, the PMU, SMB, or something similar +** in the BIOS. +** The driver won't work if it can't find the M7101 device. + +The SMB controller is part of the M7101 device, which is an ACPI-compliant +Power Management Unit (PMU). + +The whole M7101 device has to be enabled for the SMB to work. You can't +just enable the SMB alone. The SMB and the ACPI have separate I/O spaces. +We make sure that the SMB is enabled. We leave the ACPI alone. + +Features +-------- + +This driver controls the SMB Host only. The SMB Slave +controller on the M15X3 is not enabled. This driver does not use +interrupts. + + +Issues +------ + +This driver requests the I/O space for only the SMB +registers. It doesn't use the ACPI region. + +On the ASUS P5A motherboard, there are several reports that +the SMBus will hang and this can only be resolved by +powering off the computer. It appears to be worse when the board +gets hot, for example under heavy CPU load, or in the summer. +There may be electrical problems on this board. +On the P5A, the W83781D sensor chip is on both the ISA and +SMBus. Therefore the SMBus hangs can generally be avoided +by accessing the W83781D on the ISA bus only. + diff --git a/Documentation/i2c/busses/i2c-amd756 b/Documentation/i2c/busses/i2c-amd756 new file mode 100644 index 000000000000..67f30874d0bf --- /dev/null +++ b/Documentation/i2c/busses/i2c-amd756 @@ -0,0 +1,25 @@ +Kernel driver i2c-amd756 + +Supported adapters: + * AMD 756 + * AMD 766 + * AMD 768 + * AMD 8111 + Datasheets: Publicly available on AMD website + + * nVidia nForce + Datasheet: Unavailable + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com> + +Description +----------- + +This driver supports the AMD 756, 766, 768 and 8111 Peripheral Bus +Controllers, and the nVidia nForce. + +Note that for the 8111, there are two SMBus adapters. The SMBus 1.0 adapter +is supported by this driver, and the SMBus 2.0 adapter is supported by the +i2c-amd8111 driver. diff --git a/Documentation/i2c/busses/i2c-amd8111 b/Documentation/i2c/busses/i2c-amd8111 new file mode 100644 index 000000000000..db294ee7455a --- /dev/null +++ b/Documentation/i2c/busses/i2c-amd8111 @@ -0,0 +1,41 @@ +Kernel driver i2c-adm8111 + +Supported adapters: + * AMD-8111 SMBus 2.0 PCI interface + +Datasheets: + AMD datasheet not yet available, but almost everything can be found + in publically available ACPI 2.0 specification, which the adapter + follows. + +Author: Vojtech Pavlik <vojtech@suse.cz> + +Description +----------- + +If you see something like this: + +00:07.2 SMBus: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0 (rev 02) + Subsystem: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0 + Flags: medium devsel, IRQ 19 + I/O ports at d400 [size=32] + +in your 'lspci -v', then this driver is for your chipset. + +Process Call Support +-------------------- + +Supported. + +SMBus 2.0 Support +----------------- + +Supported. Both PEC and block process call support is implemented. Slave +mode or host notification are not yet implemented. + +Notes +----- + +Note that for the 8111, there are two SMBus adapters. The SMBus 2.0 adapter +is supported by this driver, and the SMBus 1.0 adapter is supported by the +i2c-amd756 driver. diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801 new file mode 100644 index 000000000000..fd4b2712d570 --- /dev/null +++ b/Documentation/i2c/busses/i2c-i801 @@ -0,0 +1,80 @@ +Kernel driver i2c-i801 + +Supported adapters: + * Intel 82801AA and 82801AB (ICH and ICH0 - part of the + '810' and '810E' chipsets) + * Intel 82801BA (ICH2 - part of the '815E' chipset) + * Intel 82801CA/CAM (ICH3) + * Intel 82801DB (ICH4) (HW PEC supported, 32 byte buffer not supported) + * Intel 82801EB/ER (ICH5) (HW PEC supported, 32 byte buffer not supported) + * Intel 6300ESB + * Intel 82801FB/FR/FW/FRW (ICH6) + * Intel ICH7 + Datasheets: Publicly available at the Intel website + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Mark Studebaker <mdsxyz123@yahoo.com> + + +Module Parameters +----------------- + +* force_addr: int + Forcibly enable the ICH at the given address. EXTREMELY DANGEROUS! + + +Description +----------- + +The ICH (properly known as the 82801AA), ICH0 (82801AB), ICH2 (82801BA), +ICH3 (82801CA/CAM) and later devices are Intel chips that are a part of +Intel's '810' chipset for Celeron-based PCs, '810E' chipset for +Pentium-based PCs, '815E' chipset, and others. + +The ICH chips contain at least SEVEN separate PCI functions in TWO logical +PCI devices. An output of lspci will show something similar to the +following: + + 00:1e.0 PCI bridge: Intel Corporation: Unknown device 2418 (rev 01) + 00:1f.0 ISA bridge: Intel Corporation: Unknown device 2410 (rev 01) + 00:1f.1 IDE interface: Intel Corporation: Unknown device 2411 (rev 01) + 00:1f.2 USB Controller: Intel Corporation: Unknown device 2412 (rev 01) + 00:1f.3 Unknown class [0c05]: Intel Corporation: Unknown device 2413 (rev 01) + +The SMBus controller is function 3 in device 1f. Class 0c05 is SMBus Serial +Controller. + +If you do NOT see the 24x3 device at function 3, and you can't figure out +any way in the BIOS to enable it, + +The ICH chips are quite similar to Intel's PIIX4 chip, at least in the +SMBus controller. + +See the file i2c-piix4 for some additional information. + + +Process Call Support +-------------------- + +Not supported. + + +I2C Block Read Support +---------------------- + +Not supported at the moment. + + +SMBus 2.0 Support +----------------- + +The 82801DB (ICH4) and later chips support several SMBus 2.0 features. + +********************** +The lm_sensors project gratefully acknowledges the support of Texas +Instruments in the initial development of this driver. + +The lm_sensors project gratefully acknowledges the support of Intel in the +development of SMBus 2.0 / ICH4 features of this driver. diff --git a/Documentation/i2c/busses/i2c-i810 b/Documentation/i2c/busses/i2c-i810 new file mode 100644 index 000000000000..0544eb332887 --- /dev/null +++ b/Documentation/i2c/busses/i2c-i810 @@ -0,0 +1,46 @@ +Kernel driver i2c-i810 + +Supported adapters: + * Intel 82810, 82810-DC100, 82810E, and 82815 (GMCH) + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Kyösti Mälkki <kmalkki@cc.hut.fi>, + Ralph Metzler <rjkm@thp.uni-koeln.de>, + Mark D. Studebaker <mdsxyz123@yahoo.com> + +Main contact: Mark Studebaker <mdsxyz123@yahoo.com> + +Description +----------- + +WARNING: If you have an '810' or '815' motherboard, your standard I2C +temperature sensors are most likely on the 801's I2C bus. You want the +i2c-i801 driver for those, not this driver. + +Now for the i2c-i810... + +The GMCH chip contains two I2C interfaces. + +The first interface is used for DDC (Data Display Channel) which is a +serial channel through the VGA monitor connector to a DDC-compliant +monitor. This interface is defined by the Video Electronics Standards +Association (VESA). The standards are available for purchase at +http://www.vesa.org . + +The second interface is a general-purpose I2C bus. It may be connected to a +TV-out chip such as the BT869 or possibly to a digital flat-panel display. + +Features +-------- + +Both busses use the i2c-algo-bit driver for 'bit banging' +and support for specific transactions is provided by i2c-algo-bit. + +Issues +------ + +If you enable bus testing in i2c-algo-bit (insmod i2c-algo-bit bit_test=1), +the test may fail; if so, the i2c-i810 driver won't be inserted. However, +we think this has been fixed. diff --git a/Documentation/i2c/busses/i2c-nforce2 b/Documentation/i2c/busses/i2c-nforce2 new file mode 100644 index 000000000000..e379e182e64f --- /dev/null +++ b/Documentation/i2c/busses/i2c-nforce2 @@ -0,0 +1,41 @@ +Kernel driver i2c-nforce2 + +Supported adapters: + * nForce2 MCP 10de:0064 + * nForce2 Ultra 400 MCP 10de:0084 + * nForce3 Pro150 MCP 10de:00D4 + * nForce3 250Gb MCP 10de:00E4 + * nForce4 MCP 10de:0052 + +Datasheet: not publically available, but seems to be similar to the + AMD-8111 SMBus 2.0 adapter. + +Authors: + Hans-Frieder Vogt <hfvogt@arcor.de>, + Thomas Leibold <thomas@plx.com>, + Patrick Dreker <patrick@dreker.de> + +Description +----------- + +i2c-nforce2 is a driver for the SMBuses included in the nVidia nForce2 MCP. + +If your 'lspci -v' listing shows something like the following, + +00:01.1 SMBus: nVidia Corporation: Unknown device 0064 (rev a2) + Subsystem: Asustek Computer, Inc.: Unknown device 0c11 + Flags: 66Mhz, fast devsel, IRQ 5 + I/O ports at c000 [size=32] + Capabilities: <available only to root> + +then this driver should support the SMBuses of your motherboard. + + +Notes +----- + +The SMBus adapter in the nForce2 chipset seems to be very similar to the +SMBus 2.0 adapter in the AMD-8111 southbridge. However, I could only get +the driver to work with direct I/O access, which is different to the EC +interface of the AMD-8111. Tested on Asus A7N8X. The ACPI DSDT table of the +Asus A7N8X lists two SMBuses, both of which are supported by this driver. diff --git a/Documentation/i2c/busses/i2c-parport b/Documentation/i2c/busses/i2c-parport new file mode 100644 index 000000000000..9f1d0082da18 --- /dev/null +++ b/Documentation/i2c/busses/i2c-parport @@ -0,0 +1,154 @@ +Kernel driver i2c-parport + +Author: Jean Delvare <khali@linux-fr.org> + +This is a unified driver for several i2c-over-parallel-port adapters, +such as the ones made by Philips, Velleman or ELV. This driver is +meant as a replacement for the older, individual drivers: + * i2c-philips-par + * i2c-elv + * i2c-velleman + * video/i2c-parport (NOT the same as this one, dedicated to home brew + teletext adapters) + +It currently supports the following devices: + * Philips adapter + * home brew teletext adapter + * Velleman K8000 adapter + * ELV adapter + * Analog Devices evaluation boards (ADM1025, ADM1030, ADM1031, ADM1032) + +These devices use different pinout configurations, so you have to tell +the driver what you have, using the type module parameter. There is no +way to autodetect the devices. Support for different pinout configurations +can be easily added when needed. + + +Building your own adapter +------------------------- + +If you want to build you own i2c-over-parallel-port adapter, here is +a sample electronics schema (credits go to Sylvain Munaut): + +Device PC +Side ___________________Vdd (+) Side + | | | + --- --- --- + | | | | | | + |R| |R| |R| + | | | | | | + --- --- --- + | | | + | | /| | +SCL ----------x--------o |-----------x------------------- pin 2 + | \| | | + | | | + | |\ | | +SDA ----------x----x---| o---x--------------------------- pin 13 + | |/ | + | | + | /| | + ---------o |----------------x-------------- pin 3 + \| | | + | | + --- --- + | | | | + |R| |R| + | | | | + --- --- + | | + ### ### + GND GND + +Remarks: + - This is the exact pinout and electronics used on the Analog Devices + evaluation boards. + /| + - All inverters -o |- must be 74HC05, they must be open collector output. + \| + - All resitors are 10k. + - Pins 18-25 of the parallel port connected to GND. + - Pins 4-9 (D2-D7) could be used as VDD is the driver drives them high. + The ADM1032 evaluation board uses D4-D7. Beware that the amount of + current you can draw from the parallel port is limited. Also note that + all connected lines MUST BE driven at the same state, else you'll short + circuit the output buffers! So plugging the I2C adapter after loading + the i2c-parport module might be a good safety since data line state + prior to init may be unknown. + - This is 5V! + - Obviously you cannot read SCL (so it's not really standard-compliant). + Pretty easy to add, just copy the SDA part and use another input pin. + That would give (ELV compatible pinout): + + +Device PC +Side ______________________________Vdd (+) Side + | | | | + --- --- --- --- + | | | | | | | | + |R| |R| |R| |R| + | | | | | | | | + --- --- --- --- + | | | | + | | |\ | | +SCL ----------x--------x--| o---x------------------------ pin 15 + | | |/ | + | | | + | | /| | + | ---o |-------------x-------------- pin 2 + | \| | | + | | | + | | | + | |\ | | +SDA ---------------x---x--| o--------x------------------- pin 10 + | |/ | + | | + | /| | + ---o |------------------x--------- pin 3 + \| | | + | | + --- --- + | | | | + |R| |R| + | | | | + --- --- + | | + ### ### + GND GND + + +If possible, you should use the same pinout configuration as existing +adapters do, so you won't even have to change the code. + + +Similar (but different) drivers +------------------------------- + +This driver is NOT the same as the i2c-pport driver found in the i2c +package. The i2c-pport driver makes use of modern parallel port features so +that you don't need additional electronics. It has other restrictions +however, and was not ported to Linux 2.6 (yet). + +This driver is also NOT the same as the i2c-pcf-epp driver found in the +lm_sensors package. The i2c-pcf-epp driver doesn't use the parallel port as +an I2C bus directly. Instead, it uses it to control an external I2C bus +master. That driver was not ported to Linux 2.6 (yet) either. + + +Legacy documentation for Velleman adapter +----------------------------------------- + +Useful links: +Velleman http://www.velleman.be/ +Velleman K8000 Howto http://howto.htlw16.ac.at/k8000-howto.html + +The project has lead to new libs for the Velleman K8000 and K8005: + LIBK8000 v1.99.1 and LIBK8005 v0.21 +With these libs, you can control the K8000 interface card and the K8005 +stepper motor card with the simple commands which are in the original +Velleman software, like SetIOchannel, ReadADchannel, SendStepCCWFull and +many more, using /dev/velleman. + http://home.wanadoo.nl/hihihi/libk8000.htm + http://home.wanadoo.nl/hihihi/libk8005.htm + http://struyve.mine.nu:8080/index.php?block=k8000 + http://sourceforge.net/projects/libk8005/ diff --git a/Documentation/i2c/busses/i2c-parport-light b/Documentation/i2c/busses/i2c-parport-light new file mode 100644 index 000000000000..287436478520 --- /dev/null +++ b/Documentation/i2c/busses/i2c-parport-light @@ -0,0 +1,11 @@ +Kernel driver i2c-parport-light + +Author: Jean Delvare <khali@linux-fr.org> + +This driver is a light version of i2c-parport. It doesn't depend +on the parport driver, and uses direct I/O access instead. This might be +prefered on embedded systems where wasting memory for the clean but heavy +parport handling is not an option. The drawback is a reduced portability +and the impossibility to daisy-chain other parallel port devices. + +Please see i2c-parport for documentation. diff --git a/Documentation/i2c/busses/i2c-pca-isa b/Documentation/i2c/busses/i2c-pca-isa new file mode 100644 index 000000000000..6fc8f4c27c3c --- /dev/null +++ b/Documentation/i2c/busses/i2c-pca-isa @@ -0,0 +1,23 @@ +Kernel driver i2c-pca-isa + +Supported adapters: +This driver supports ISA boards using the Philips PCA 9564 +Parallel bus to I2C bus controller + +Author: Ian Campbell <icampbell@arcom.com>, Arcom Control Systems + +Module Parameters +----------------- + +* base int + I/O base address +* irq int + IRQ interrupt +* clock int + Clock rate as described in table 1 of PCA9564 datasheet + +Description +----------- + +This driver supports ISA boards using the Philips PCA 9564 +Parallel bus to I2C bus controller diff --git a/Documentation/i2c/busses/i2c-piix4 b/Documentation/i2c/busses/i2c-piix4 new file mode 100644 index 000000000000..856b4b8b962c --- /dev/null +++ b/Documentation/i2c/busses/i2c-piix4 @@ -0,0 +1,72 @@ +Kernel driver i2c-piix4 + +Supported adapters: + * Intel 82371AB PIIX4 and PIIX4E + * Intel 82443MX (440MX) + Datasheet: Publicly available at the Intel website + * ServerWorks OSB4, CSB5 and CSB6 southbridges + Datasheet: Only available via NDA from ServerWorks + * Standard Microsystems (SMSC) SLC90E66 (Victory66) southbridge + Datasheet: Publicly available at the SMSC website http://www.smsc.com + +Authors: + Frodo Looijaard <frodol@dds.nl> + Philip Edelbrock <phil@netroedge.com> + + +Module Parameters +----------------- + +* force: int + Forcibly enable the PIIX4. DANGEROUS! +* force_addr: int + Forcibly enable the PIIX4 at the given address. EXTREMELY DANGEROUS! +* fix_hstcfg: int + Fix config register. Needed on some boards (Force CPCI735). + + +Description +----------- + +The PIIX4 (properly known as the 82371AB) is an Intel chip with a lot of +functionality. Among other things, it implements the PCI bus. One of its +minor functions is implementing a System Management Bus. This is a true +SMBus - you can not access it on I2C levels. The good news is that it +natively understands SMBus commands and you do not have to worry about +timing problems. The bad news is that non-SMBus devices connected to it can +confuse it mightily. Yes, this is known to happen... + +Do 'lspci -v' and see whether it contains an entry like this: + +0000:00:02.3 Bridge: Intel Corp. 82371AB/EB/MB PIIX4 ACPI (rev 02) + Flags: medium devsel, IRQ 9 + +Bus and device numbers may differ, but the function number must be +identical (like many PCI devices, the PIIX4 incorporates a number of +different 'functions', which can be considered as separate devices). If you +find such an entry, you have a PIIX4 SMBus controller. + +On some computers (most notably, some Dells), the SMBus is disabled by +default. If you use the insmod parameter 'force=1', the kernel module will +try to enable it. THIS IS VERY DANGEROUS! If the BIOS did not set up a +correct address for this module, you could get in big trouble (read: +crashes, data corruption, etc.). Try this only as a last resort (try BIOS +updates first, for example), and backup first! An even more dangerous +option is 'force_addr=<IOPORT>'. This will not only enable the PIIX4 like +'force' foes, but it will also set a new base I/O port address. The SMBus +parts of the PIIX4 needs a range of 8 of these addresses to function +correctly. If these addresses are already reserved by some other device, +you will get into big trouble! DON'T USE THIS IF YOU ARE NOT VERY SURE +ABOUT WHAT YOU ARE DOING! + +The PIIX4E is just an new version of the PIIX4; it is supported as well. +The PIIX/PIIX3 does not implement an SMBus or I2C bus, so you can't use +this driver on those mainboards. + +The ServerWorks Southbridges, the Intel 440MX, and the Victory766 are +identical to the PIIX4 in I2C/SMBus support. + +A few OSB4 southbridges are known to be misconfigured by the BIOS. In this +case, you have you use the fix_hstcfg module parameter. Do not use it +unless you know you have to, because in some cases it also breaks +configuration on southbridges that don't need it. diff --git a/Documentation/i2c/busses/i2c-prosavage b/Documentation/i2c/busses/i2c-prosavage new file mode 100644 index 000000000000..703687902511 --- /dev/null +++ b/Documentation/i2c/busses/i2c-prosavage @@ -0,0 +1,23 @@ +Kernel driver i2c-prosavage + +Supported adapters: + + S3/VIA KM266/VT8375 aka ProSavage8 + S3/VIA KM133/VT8365 aka Savage4 + +Author: Henk Vergonet <henk@god.dyndns.org> + +Description +----------- + +The Savage4 chips contain two I2C interfaces (aka a I2C 'master' or +'host'). + +The first interface is used for DDC (Data Display Channel) which is a +serial channel through the VGA monitor connector to a DDC-compliant +monitor. This interface is defined by the Video Electronics Standards +Association (VESA). The standards are available for purchase at +http://www.vesa.org . The second interface is a general-purpose I2C bus. + +Usefull for gaining access to the TV Encoder chips. + diff --git a/Documentation/i2c/busses/i2c-savage4 b/Documentation/i2c/busses/i2c-savage4 new file mode 100644 index 000000000000..6ecceab618d3 --- /dev/null +++ b/Documentation/i2c/busses/i2c-savage4 @@ -0,0 +1,26 @@ +Kernel driver i2c-savage4 + +Supported adapters: + * Savage4 + * Savage2000 + +Authors: + Alexander Wold <awold@bigfoot.com>, + Mark D. Studebaker <mdsxyz123@yahoo.com> + +Description +----------- + +The Savage4 chips contain two I2C interfaces (aka a I2C 'master' +or 'host'). + +The first interface is used for DDC (Data Display Channel) which is a +serial channel through the VGA monitor connector to a DDC-compliant +monitor. This interface is defined by the Video Electronics Standards +Association (VESA). The standards are available for purchase at +http://www.vesa.org . The DDC bus is not yet supported because its register +is not directly memory-mapped. + +The second interface is a general-purpose I2C bus. This is the only +interface supported by the driver at the moment. + diff --git a/Documentation/i2c/busses/i2c-sis5595 b/Documentation/i2c/busses/i2c-sis5595 new file mode 100644 index 000000000000..cc47db7d00a9 --- /dev/null +++ b/Documentation/i2c/busses/i2c-sis5595 @@ -0,0 +1,59 @@ +Kernel driver i2c-sis5595 + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Mark D. Studebaker <mdsxyz123@yahoo.com>, + Philip Edelbrock <phil@netroedge.com> + +Supported adapters: + * Silicon Integrated Systems Corp. SiS5595 Southbridge + Datasheet: Publicly available at the Silicon Integrated Systems Corp. site. + +Note: all have mfr. ID 0x1039. + + SUPPORTED PCI ID + 5595 0008 + + Note: these chips contain a 0008 device which is incompatible with the + 5595. We recognize these by the presence of the listed + "blacklist" PCI ID and refuse to load. + + NOT SUPPORTED PCI ID BLACKLIST PCI ID + 540 0008 0540 + 550 0008 0550 + 5513 0008 5511 + 5581 0008 5597 + 5582 0008 5597 + 5597 0008 5597 + 5598 0008 5597/5598 + 630 0008 0630 + 645 0008 0645 + 646 0008 0646 + 648 0008 0648 + 650 0008 0650 + 651 0008 0651 + 730 0008 0730 + 735 0008 0735 + 745 0008 0745 + 746 0008 0746 + +Module Parameters +----------------- + +* force_addr=0xaddr Set the I/O base address. Useful for boards + that don't set the address in the BIOS. Does not do a + PCI force; the device must still be present in lspci. + Don't use this unless the driver complains that the + base address is not set. + +Description +----------- + +i2c-sis5595 is a true SMBus host driver for motherboards with the SiS5595 +southbridges. + +WARNING: If you are trying to access the integrated sensors on the SiS5595 +chip, you want the sis5595 driver for those, not this driver. This driver +is a BUS driver, not a CHIP driver. A BUS driver is used by other CHIP +drivers to access chips on the bus. + diff --git a/Documentation/i2c/busses/i2c-sis630 b/Documentation/i2c/busses/i2c-sis630 new file mode 100644 index 000000000000..9aca6889f748 --- /dev/null +++ b/Documentation/i2c/busses/i2c-sis630 @@ -0,0 +1,49 @@ +Kernel driver i2c-sis630 + +Supported adapters: + * Silicon Integrated Systems Corp (SiS) + 630 chipset (Datasheet: available at http://amalysh.bei.t-online.de/docs/SIS/) + 730 chipset + * Possible other SiS chipsets ? + +Author: Alexander Malysh <amalysh@web.de> + +Module Parameters +----------------- + +* force = [1|0] Forcibly enable the SIS630. DANGEROUS! + This can be interesting for chipsets not named + above to check if it works for you chipset, but DANGEROUS! + +* high_clock = [1|0] Forcibly set Host Master Clock to 56KHz (default, + what your BIOS use). DANGEROUS! This should be a bit + faster, but freeze some systems (i.e. my Laptop). + + +Description +----------- + +This SMBus only driver is known to work on motherboards with the above +named chipsets. + +If you see something like this: + +00:00.0 Host bridge: Silicon Integrated Systems [SiS] 630 Host (rev 31) +00:01.0 ISA bridge: Silicon Integrated Systems [SiS] 85C503/5513 + +or like this: + +00:00.0 Host bridge: Silicon Integrated Systems [SiS] 730 Host (rev 02) +00:01.0 ISA bridge: Silicon Integrated Systems [SiS] 85C503/5513 + +in your 'lspci' output , then this driver is for your chipset. + +Thank You +--------- +Philip Edelbrock <phil@netroedge.com> +- testing SiS730 support +Mark M. Hoffman <mhoffman@lightlink.com> +- bug fixes + +To anyone else which I forgot here ;), thanks! + diff --git a/Documentation/i2c/busses/i2c-sis69x b/Documentation/i2c/busses/i2c-sis69x new file mode 100644 index 000000000000..5be48769f65b --- /dev/null +++ b/Documentation/i2c/busses/i2c-sis69x @@ -0,0 +1,73 @@ +Kernel driver i2c-sis96x + +Replaces 2.4.x i2c-sis645 + +Supported adapters: + * Silicon Integrated Systems Corp (SiS) + Any combination of these host bridges: + 645, 645DX (aka 646), 648, 650, 651, 655, 735, 745, 746 + and these south bridges: + 961, 962, 963(L) + +Author: Mark M. Hoffman <mhoffman@lightlink.com> + +Description +----------- + +This SMBus only driver is known to work on motherboards with the above +named chipset combinations. The driver was developed without benefit of a +proper datasheet from SiS. The SMBus registers are assumed compatible with +those of the SiS630, although they are located in a completely different +place. Thanks to Alexander Malysh <amalysh@web.de> for providing the +SiS630 datasheet (and driver). + +The command "lspci" as root should produce something like these lines: + +00:00.0 Host bridge: Silicon Integrated Systems [SiS]: Unknown device 0645 +00:02.0 ISA bridge: Silicon Integrated Systems [SiS] 85C503/5513 +00:02.1 SMBus: Silicon Integrated Systems [SiS]: Unknown device 0016 + +or perhaps this... + +00:00.0 Host bridge: Silicon Integrated Systems [SiS]: Unknown device 0645 +00:02.0 ISA bridge: Silicon Integrated Systems [SiS]: Unknown device 0961 +00:02.1 SMBus: Silicon Integrated Systems [SiS]: Unknown device 0016 + +(kernel versions later than 2.4.18 may fill in the "Unknown"s) + +If you cant see it please look on quirk_sis_96x_smbus +(drivers/pci/quirks.c) (also if southbridge detection fails) + +I suspect that this driver could be made to work for the following SiS +chipsets as well: 635, and 635T. If anyone owns a board with those chips +AND is willing to risk crashing & burning an otherwise well-behaved kernel +in the name of progress... please contact me at <mhoffman@lightlink.com> or +via the project's mailing list: <sensors@stimpy.netroedge.com>. Please +send bug reports and/or success stories as well. + + +TO DOs +------ + +* The driver does not support SMBus block reads/writes; I may add them if a +scenario is found where they're needed. + + +Thank You +--------- + +Mark D. Studebaker <mdsxyz123@yahoo.com> + - design hints and bug fixes +Alexander Maylsh <amalysh@web.de> + - ditto, plus an important datasheet... almost the one I really wanted +Hans-Günter Lütke Uphues <hg_lu@t-online.de> + - patch for SiS735 +Robert Zwerus <arzie@dds.nl> + - testing for SiS645DX +Kianusch Sayah Karadji <kianusch@sk-tech.net> + - patch for SiS645DX/962 +Ken Healy + - patch for SiS655 + +To anyone else who has written w/ feedback, thanks! + diff --git a/Documentation/i2c/busses/i2c-via b/Documentation/i2c/busses/i2c-via new file mode 100644 index 000000000000..55edfe1a640b --- /dev/null +++ b/Documentation/i2c/busses/i2c-via @@ -0,0 +1,34 @@ +Kernel driver i2c-via + +Supported adapters: + * VIA Technologies, InC. VT82C586B + Datasheet: Publicly available at the VIA website + +Author: Kyösti Mälkki <kmalkki@cc.hut.fi> + +Description +----------- + +i2c-via is an i2c bus driver for motherboards with VIA chipset. + +The following VIA pci chipsets are supported: + - MVP3, VP3, VP2/97, VPX/97 + - others with South bridge VT82C586B + +Your lspci listing must show this : + + Bridge: VIA Technologies, Inc. VT82C586B ACPI (rev 10) + + Problems? + + Q: You have VT82C586B on the motherboard, but not in the listing. + + A: Go to your BIOS setup, section PCI devices or similar. + Turn USB support on, and try again. + + Q: No error messages, but still i2c doesn't seem to work. + + A: This can happen. This driver uses the pins VIA recommends in their + datasheets, but there are several ways the motherboard manufacturer + can actually wire the lines. + diff --git a/Documentation/i2c/busses/i2c-viapro b/Documentation/i2c/busses/i2c-viapro new file mode 100644 index 000000000000..702f5ac68c09 --- /dev/null +++ b/Documentation/i2c/busses/i2c-viapro @@ -0,0 +1,47 @@ +Kernel driver i2c-viapro + +Supported adapters: + * VIA Technologies, Inc. VT82C596A/B + Datasheet: Sometimes available at the VIA website + + * VIA Technologies, Inc. VT82C686A/B + Datasheet: Sometimes available at the VIA website + + * VIA Technologies, Inc. VT8231, VT8233, VT8233A, VT8235, VT8237 + Datasheet: available on request from Via + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Kyösti Mälkki <kmalkki@cc.hut.fi>, + Mark D. Studebaker <mdsxyz123@yahoo.com> + +Module Parameters +----------------- + +* force: int + Forcibly enable the SMBus controller. DANGEROUS! +* force_addr: int + Forcibly enable the SMBus at the given address. EXTREMELY DANGEROUS! + +Description +----------- + +i2c-viapro is a true SMBus host driver for motherboards with one of the +supported VIA southbridges. + +Your lspci -n listing must show one of these : + + device 1106:3050 (VT82C596 function 3) + device 1106:3051 (VT82C596 function 3) + device 1106:3057 (VT82C686 function 4) + device 1106:3074 (VT8233) + device 1106:3147 (VT8233A) + device 1106:8235 (VT8231) + devide 1106:3177 (VT8235) + devide 1106:3227 (VT8237) + +If none of these show up, you should look in the BIOS for settings like +enable ACPI / SMBus or even USB. + + diff --git a/Documentation/i2c/busses/i2c-voodoo3 b/Documentation/i2c/busses/i2c-voodoo3 new file mode 100644 index 000000000000..62d90a454d39 --- /dev/null +++ b/Documentation/i2c/busses/i2c-voodoo3 @@ -0,0 +1,62 @@ +Kernel driver i2c-voodoo3 + +Supported adapters: + * 3dfx Voodoo3 based cards + * Voodoo Banshee based cards + +Authors: + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Ralph Metzler <rjkm@thp.uni-koeln.de>, + Mark D. Studebaker <mdsxyz123@yahoo.com> + +Main contact: Philip Edelbrock <phil@netroedge.com> + +The code is based upon Ralph's test code (he did the hard stuff ;') + +Description +----------- + +The 3dfx Voodoo3 chip contains two I2C interfaces (aka a I2C 'master' or +'host'). + +The first interface is used for DDC (Data Display Channel) which is a +serial channel through the VGA monitor connector to a DDC-compliant +monitor. This interface is defined by the Video Electronics Standards +Association (VESA). The standards are available for purchase at +http://www.vesa.org . + +The second interface is a general-purpose I2C bus. The intent by 3dfx was +to allow manufacturers to add extra chips to the video card such as a +TV-out chip such as the BT869 or possibly even I2C based temperature +sensors like the ADM1021 or LM75. + +Stability +--------- + +Seems to be stable on the test machine, but needs more testing on other +machines. Simultaneous accesses of the DDC and I2C busses may cause errors. + +Supported Devices +----------------- + +Specifically, this driver was written and tested on the '3dfx Voodoo3 AGP +3000' which has a tv-out feature (s-video or composite). According to the +docs and discussions, this code should work for any Voodoo3 based cards as +well as Voodoo Banshee based cards. The DDC interface has been tested on a +Voodoo Banshee card. + +Issues +------ + +Probably many, but it seems to work OK on my system. :') + + +External Device Connection +-------------------------- + +The digital video input jumpers give availability to the I2C bus. +Specifically, pins 13 and 25 (bottom row middle, and bottom right-end) are +the I2C clock and I2C data lines, respectively. +5V and GND are probably +also easily available making the addition of extra I2C/SMBus devices easy +to implement. diff --git a/Documentation/i2c/busses/scx200_acb b/Documentation/i2c/busses/scx200_acb new file mode 100644 index 000000000000..08c8cd1df60c --- /dev/null +++ b/Documentation/i2c/busses/scx200_acb @@ -0,0 +1,14 @@ +Kernel driver scx200_acb + +Author: Christer Weinigel <wingel@nano-system.com> + +Module Parameters +----------------- + +* base: int + Base addresses for the ACCESS.bus controllers + +Description +----------- + +Enable the use of the ACCESS.bus controllers of a SCx200 processor. diff --git a/Documentation/i2c/chips/smsc47b397.txt b/Documentation/i2c/chips/smsc47b397.txt new file mode 100644 index 000000000000..389edae7f8df --- /dev/null +++ b/Documentation/i2c/chips/smsc47b397.txt @@ -0,0 +1,146 @@ +November 23, 2004 + +The following specification describes the SMSC LPC47B397-NC sensor chip +(for which there is no public datasheet available). This document was +provided by Craig Kelly (In-Store Broadcast Network) and edited/corrected +by Mark M. Hoffman <mhoffman@lightlink.com>. + +* * * * * + +Methods for detecting the HP SIO and reading the thermal data on a dc7100. + +The thermal information on the dc7100 is contained in the SIO Hardware Monitor +(HWM). The information is accessed through an index/data pair. The index/data +pair is located at the HWM Base Address + 0 and the HWM Base Address + 1. The +HWM Base address can be obtained from Logical Device 8, registers 0x60 (MSB) +and 0x61 (LSB). Currently we are using 0x480 for the HWM Base Address and +0x480 and 0x481 for the index/data pair. + +Reading temperature information. +The temperature information is located in the following registers: +Temp1 0x25 (Currently, this reflects the CPU temp on all systems). +Temp2 0x26 +Temp3 0x27 +Temp4 0x80 + +Programming Example +The following is an example of how to read the HWM temperature registers: +MOV DX,480H +MOV AX,25H +OUT DX,AL +MOV DX,481H +IN AL,DX + +AL contains the data in hex, the temperature in Celsius is the decimal +equivalent. + +Ex: If AL contains 0x2A, the temperature is 42 degrees C. + +Reading tach information. +The fan speed information is located in the following registers: + LSB MSB +Tach1 0x28 0x29 (Currently, this reflects the CPU + fan speed on all systems). +Tach2 0x2A 0x2B +Tach3 0x2C 0x2D +Tach4 0x2E 0x2F + +Important!!! +Reading the tach LSB locks the tach MSB. +The LSB Must be read first. + +How to convert the tach reading to RPM. +The tach reading (TCount) is given by: (Tach MSB * 256) + (Tach LSB) +The SIO counts the number of 90kHz (11.111us) pulses per revolution. +RPM = 60/(TCount * 11.111us) + +Example: +Reg 0x28 = 0x9B +Reg 0x29 = 0x08 + +TCount = 0x89B = 2203 + +RPM = 60 / (2203 * 11.11111 E-6) = 2451 RPM + +Obtaining the SIO version. + +CONFIGURATION SEQUENCE +To program the configuration registers, the following sequence must be followed: +1. Enter Configuration Mode +2. Configure the Configuration Registers +3. Exit Configuration Mode. + +Enter Configuration Mode +To place the chip into the Configuration State The config key (0x55) is written +to the CONFIG PORT (0x2E). + +Configuration Mode +In configuration mode, the INDEX PORT is located at the CONFIG PORT address and +the DATA PORT is at INDEX PORT address + 1. + +The desired configuration registers are accessed in two steps: +a. Write the index of the Logical Device Number Configuration Register + (i.e., 0x07) to the INDEX PORT and then write the number of the + desired logical device to the DATA PORT. + +b. Write the address of the desired configuration register within the + logical device to the INDEX PORT and then write or read the config- + uration register through the DATA PORT. + +Note: If accessing the Global Configuration Registers, step (a) is not required. + +Exit Configuration Mode +To exit the Configuration State the write 0xAA to the CONFIG PORT (0x2E). +The chip returns to the RUN State. (This is important). + +Programming Example +The following is an example of how to read the SIO Device ID located at 0x20 + +; ENTER CONFIGURATION MODE +MOV DX,02EH +MOV AX,055H +OUT DX,AL +; GLOBAL CONFIGURATION REGISTER +MOV DX,02EH +MOV AL,20H +OUT DX,AL +; READ THE DATA +MOV DX,02FH +IN AL,DX +; EXIT CONFIGURATION MODE +MOV DX,02EH +MOV AX,0AAH +OUT DX,AL + +The registers of interest for identifying the SIO on the dc7100 are Device ID +(0x20) and Device Rev (0x21). + +The Device ID will read 0X6F +The Device Rev currently reads 0x01 + +Obtaining the HWM Base Address. +The following is an example of how to read the HWM Base Address located in +Logical Device 8. + +; ENTER CONFIGURATION MODE +MOV DX,02EH +MOV AX,055H +OUT DX,AL +; CONFIGURE REGISTER CRE0, +; LOGICAL DEVICE 8 +MOV DX,02EH +MOV AL,07H +OUT DX,AL ;Point to LD# Config Reg +MOV DX,02FH +MOV AL, 08H +OUT DX,AL;Point to Logical Device 8 +; +MOV DX,02EH +MOV AL,60H +OUT DX,AL ; Point to HWM Base Addr MSB +MOV DX,02FH +IN AL,DX ; Get MSB of HWM Base Addr +; EXIT CONFIGURATION MODE +MOV DX,02EH +MOV AX,0AAH +OUT DX,AL diff --git a/Documentation/i2c/dev-interface b/Documentation/i2c/dev-interface new file mode 100644 index 000000000000..09d6cda2a1fb --- /dev/null +++ b/Documentation/i2c/dev-interface @@ -0,0 +1,146 @@ +Usually, i2c devices are controlled by a kernel driver. But it is also +possible to access all devices on an adapter from userspace, through +the /dev interface. You need to load module i2c-dev for this. + +Each registered i2c adapter gets a number, counting from 0. You can +examine /sys/class/i2c-dev/ to see what number corresponds to which adapter. +I2C device files are character device files with major device number 89 +and a minor device number corresponding to the number assigned as +explained above. They should be called "i2c-%d" (i2c-0, i2c-1, ..., +i2c-10, ...). All 256 minor device numbers are reserved for i2c. + + +C example +========= + +So let's say you want to access an i2c adapter from a C program. The +first thing to do is `#include <linux/i2c.h>" and "#include <linux/i2c-dev.h>. +Yes, I know, you should never include kernel header files, but until glibc +knows about i2c, there is not much choice. + +Now, you have to decide which adapter you want to access. You should +inspect /sys/class/i2c-dev/ to decide this. Adapter numbers are assigned +somewhat dynamically, so you can not even assume /dev/i2c-0 is the +first adapter. + +Next thing, open the device file, as follows: + int file; + int adapter_nr = 2; /* probably dynamically determined */ + char filename[20]; + + sprintf(filename,"/dev/i2c-%d",adapter_nr); + if ((file = open(filename,O_RDWR)) < 0) { + /* ERROR HANDLING; you can check errno to see what went wrong */ + exit(1); + } + +When you have opened the device, you must specify with what device +address you want to communicate: + int addr = 0x40; /* The I2C address */ + if (ioctl(file,I2C_SLAVE,addr) < 0) { + /* ERROR HANDLING; you can check errno to see what went wrong */ + exit(1); + } + +Well, you are all set up now. You can now use SMBus commands or plain +I2C to communicate with your device. SMBus commands are preferred if +the device supports them. Both are illustrated below. + __u8 register = 0x10; /* Device register to access */ + __s32 res; + char buf[10]; + /* Using SMBus commands */ + res = i2c_smbus_read_word_data(file,register); + if (res < 0) { + /* ERROR HANDLING: i2c transaction failed */ + } else { + /* res contains the read word */ + } + /* Using I2C Write, equivalent of + i2c_smbus_write_word_data(file,register,0x6543) */ + buf[0] = register; + buf[1] = 0x43; + buf[2] = 0x65; + if ( write(file,buf,3) != 3) { + /* ERROR HANDLING: i2c transaction failed */ + } + /* Using I2C Read, equivalent of i2c_smbus_read_byte(file) */ + if (read(file,buf,1) != 1) { + /* ERROR HANDLING: i2c transaction failed */ + } else { + /* buf[0] contains the read byte */ + } + +IMPORTANT: because of the use of inline functions, you *have* to use +'-O' or some variation when you compile your program! + + +Full interface description +========================== + +The following IOCTLs are defined and fully supported +(see also i2c-dev.h and i2c.h): + +ioctl(file,I2C_SLAVE,long addr) + Change slave address. The address is passed in the 7 lower bits of the + argument (except for 10 bit addresses, passed in the 10 lower bits in this + case). + +ioctl(file,I2C_TENBIT,long select) + Selects ten bit addresses if select not equals 0, selects normal 7 bit + addresses if select equals 0. Default 0. + +ioctl(file,I2C_PEC,long select) + Selects SMBus PEC (packet error checking) generation and verification + if select not equals 0, disables if select equals 0. Default 0. + Used only for SMBus transactions. + +ioctl(file,I2C_FUNCS,unsigned long *funcs) + Gets the adapter functionality and puts it in *funcs. + +ioctl(file,I2C_RDWR,struct i2c_ioctl_rdwr_data *msgset) + + Do combined read/write transaction without stop in between. + The argument is a pointer to a struct i2c_ioctl_rdwr_data { + + struct i2c_msg *msgs; /* ptr to array of simple messages */ + int nmsgs; /* number of messages to exchange */ + } + + The msgs[] themselves contain further pointers into data buffers. + The function will write or read data to or from that buffers depending + on whether the I2C_M_RD flag is set in a particular message or not. + The slave address and whether to use ten bit address mode has to be + set in each message, overriding the values set with the above ioctl's. + + +Other values are NOT supported at this moment, except for I2C_SMBUS, +which you should never directly call; instead, use the access functions +below. + +You can do plain i2c transactions by using read(2) and write(2) calls. +You do not need to pass the address byte; instead, set it through +ioctl I2C_SLAVE before you try to access the device. + +You can do SMBus level transactions (see documentation file smbus-protocol +for details) through the following functions: + __s32 i2c_smbus_write_quick(int file, __u8 value); + __s32 i2c_smbus_read_byte(int file); + __s32 i2c_smbus_write_byte(int file, __u8 value); + __s32 i2c_smbus_read_byte_data(int file, __u8 command); + __s32 i2c_smbus_write_byte_data(int file, __u8 command, __u8 value); + __s32 i2c_smbus_read_word_data(int file, __u8 command); + __s32 i2c_smbus_write_word_data(int file, __u8 command, __u16 value); + __s32 i2c_smbus_process_call(int file, __u8 command, __u16 value); + __s32 i2c_smbus_read_block_data(int file, __u8 command, __u8 *values); + __s32 i2c_smbus_write_block_data(int file, __u8 command, __u8 length, + __u8 *values); +All these transactions return -1 on failure; you can read errno to see +what happened. The 'write' transactions return 0 on success; the +'read' transactions return the read value, except for read_block, which +returns the number of values read. The block buffers need not be longer +than 32 bytes. + +The above functions are all macros, that resolve to calls to the +i2c_smbus_access function, that on its turn calls a specific ioctl +with the data in a specific format. Read the source code if you +want to know what happens behind the screens. diff --git a/Documentation/i2c/functionality b/Documentation/i2c/functionality new file mode 100644 index 000000000000..8a78a95ae04e --- /dev/null +++ b/Documentation/i2c/functionality @@ -0,0 +1,135 @@ +INTRODUCTION +------------ + +Because not every I2C or SMBus adapter implements everything in the +I2C specifications, a client can not trust that everything it needs +is implemented when it is given the option to attach to an adapter: +the client needs some way to check whether an adapter has the needed +functionality. + + +FUNCTIONALITY CONSTANTS +----------------------- + +For the most up-to-date list of functionality constants, please check +<linux/i2c.h>! + + I2C_FUNC_I2C Plain i2c-level commands (Pure SMBus + adapters typically can not do these) + I2C_FUNC_10BIT_ADDR Handles the 10-bit address extensions + I2C_FUNC_PROTOCOL_MANGLING Knows about the I2C_M_REV_DIR_ADDR, + I2C_M_REV_DIR_ADDR and I2C_M_REV_DIR_NOSTART + flags (which modify the i2c protocol!) + I2C_FUNC_SMBUS_QUICK Handles the SMBus write_quick command + I2C_FUNC_SMBUS_READ_BYTE Handles the SMBus read_byte command + I2C_FUNC_SMBUS_WRITE_BYTE Handles the SMBus write_byte command + I2C_FUNC_SMBUS_READ_BYTE_DATA Handles the SMBus read_byte_data command + I2C_FUNC_SMBUS_WRITE_BYTE_DATA Handles the SMBus write_byte_data command + I2C_FUNC_SMBUS_READ_WORD_DATA Handles the SMBus read_word_data command + I2C_FUNC_SMBUS_WRITE_WORD_DATA Handles the SMBus write_byte_data command + I2C_FUNC_SMBUS_PROC_CALL Handles the SMBus process_call command + I2C_FUNC_SMBUS_READ_BLOCK_DATA Handles the SMBus read_block_data command + I2C_FUNC_SMBUS_WRITE_BLOCK_DATA Handles the SMBus write_block_data command + I2C_FUNC_SMBUS_READ_I2C_BLOCK Handles the SMBus read_i2c_block_data command + I2C_FUNC_SMBUS_WRITE_I2C_BLOCK Handles the SMBus write_i2c_block_data command + +A few combinations of the above flags are also defined for your convenience: + + I2C_FUNC_SMBUS_BYTE Handles the SMBus read_byte + and write_byte commands + I2C_FUNC_SMBUS_BYTE_DATA Handles the SMBus read_byte_data + and write_byte_data commands + I2C_FUNC_SMBUS_WORD_DATA Handles the SMBus read_word_data + and write_word_data commands + I2C_FUNC_SMBUS_BLOCK_DATA Handles the SMBus read_block_data + and write_block_data commands + I2C_FUNC_SMBUS_I2C_BLOCK Handles the SMBus read_i2c_block_data + and write_i2c_block_data commands + I2C_FUNC_SMBUS_EMUL Handles all SMBus commands than can be + emulated by a real I2C adapter (using + the transparent emulation layer) + + +ALGORITHM/ADAPTER IMPLEMENTATION +-------------------------------- + +When you write a new algorithm driver, you will have to implement a +function callback `functionality', that gets an i2c_adapter structure +pointer as its only parameter: + + struct i2c_algorithm { + /* Many other things of course; check <linux/i2c.h>! */ + u32 (*functionality) (struct i2c_adapter *); + } + +A typically implementation is given below, from i2c-algo-bit.c: + + static u32 bit_func(struct i2c_adapter *adap) + { + return I2C_FUNC_SMBUS_EMUL | I2C_FUNC_10BIT_ADDR | + I2C_FUNC_PROTOCOL_MANGLING; + } + + + +CLIENT CHECKING +--------------- + +Before a client tries to attach to an adapter, or even do tests to check +whether one of the devices it supports is present on an adapter, it should +check whether the needed functionality is present. There are two functions +defined which should be used instead of calling the functionality hook +in the algorithm structure directly: + + /* Return the functionality mask */ + extern u32 i2c_get_functionality (struct i2c_adapter *adap); + + /* Return 1 if adapter supports everything we need, 0 if not. */ + extern int i2c_check_functionality (struct i2c_adapter *adap, u32 func); + +This is a typical way to use these functions (from the writing-clients +document): + int foo_detect_client(struct i2c_adapter *adapter, int address, + unsigned short flags, int kind) + { + /* Define needed variables */ + + /* As the very first action, we check whether the adapter has the + needed functionality: we need the SMBus read_word_data, + write_word_data and write_byte functions in this example. */ + if (!i2c_check_functionality(adapter,I2C_FUNC_SMBUS_WORD_DATA | + I2C_FUNC_SMBUS_WRITE_BYTE)) + goto ERROR0; + + /* Now we can do the real detection */ + + ERROR0: + /* Return an error */ + } + + + +CHECKING THROUGH /DEV +--------------------- + +If you try to access an adapter from a userspace program, you will have +to use the /dev interface. You will still have to check whether the +functionality you need is supported, of course. This is done using +the I2C_FUNCS ioctl. An example, adapted from the lm_sensors i2c_detect +program, is below: + + int file; + if (file = open("/dev/i2c-0",O_RDWR) < 0) { + /* Some kind of error handling */ + exit(1); + } + if (ioctl(file,I2C_FUNCS,&funcs) < 0) { + /* Some kind of error handling */ + exit(1); + } + if (! (funcs & I2C_FUNC_SMBUS_QUICK)) { + /* Oops, the needed functionality (SMBus write_quick function) is + not available! */ + exit(1); + } + /* Now it is safe to use the SMBus write_quick command */ diff --git a/Documentation/i2c/i2c-protocol b/Documentation/i2c/i2c-protocol new file mode 100644 index 000000000000..b4022c914210 --- /dev/null +++ b/Documentation/i2c/i2c-protocol @@ -0,0 +1,76 @@ +This document describes the i2c protocol. Or will, when it is finished :-) + +Key to symbols +============== + +S (1 bit) : Start bit +P (1 bit) : Stop bit +Rd/Wr (1 bit) : Read/Write bit. Rd equals 1, Wr equals 0. +A, NA (1 bit) : Accept and reverse accept bit. +Addr (7 bits): I2C 7 bit address. Note that this can be expanded as usual to + get a 10 bit I2C address. +Comm (8 bits): Command byte, a data byte which often selects a register on + the device. +Data (8 bits): A plain data byte. Sometimes, I write DataLow, DataHigh + for 16 bit data. +Count (8 bits): A data byte containing the length of a block operation. + +[..]: Data sent by I2C device, as opposed to data sent by the host adapter. + + +Simple send transaction +====================== + +This corresponds to i2c_master_send. + + S Addr Wr [A] Data [A] Data [A] ... [A] Data [A] P + + +Simple receive transaction +=========================== + +This corresponds to i2c_master_recv + + S Addr Rd [A] [Data] A [Data] A ... A [Data] NA P + + +Combined transactions +==================== + +This corresponds to i2c_transfer + +They are just like the above transactions, but instead of a stop bit P +a start bit S is sent and the transaction continues. An example of +a byte read, followed by a byte write: + + S Addr Rd [A] [Data] NA S Addr Wr [A] Data [A] P + + +Modified transactions +===================== + +We have found some I2C devices that needs the following modifications: + + Flag I2C_M_NOSTART: + In a combined transaction, no 'S Addr Wr/Rd [A]' is generated at some + point. For example, setting I2C_M_NOSTART on the second partial message + generates something like: + S Addr Rd [A] [Data] NA Data [A] P + If you set the I2C_M_NOSTART variable for the first partial message, + we do not generate Addr, but we do generate the startbit S. This will + probably confuse all other clients on your bus, so don't try this. + + Flags I2C_M_REV_DIR_ADDR + This toggles the Rd/Wr flag. That is, if you want to do a write, but + need to emit an Rd instead of a Wr, or vice versa, you set this + flag. For example: + S Addr Rd [A] Data [A] Data [A] ... [A] Data [A] P + + Flags I2C_M_IGNORE_NAK + Normally message is interrupted immediately if there is [NA] from the + client. Setting this flag treats any [NA] as [A], and all of + message is sent. + These messages may still fail to SCL lo->hi timeout. + + Flags I2C_M_NO_RD_ACK + In a read message, master A/NA bit is skipped. diff --git a/Documentation/i2c/i2c-stub b/Documentation/i2c/i2c-stub new file mode 100644 index 000000000000..d6dcb138abf5 --- /dev/null +++ b/Documentation/i2c/i2c-stub @@ -0,0 +1,38 @@ +MODULE: i2c-stub + +DESCRIPTION: + +This module is a very simple fake I2C/SMBus driver. It implements four +types of SMBus commands: write quick, (r/w) byte, (r/w) byte data, and +(r/w) word data. + +No hardware is needed nor associated with this module. It will accept write +quick commands to all addresses; it will respond to the other commands (also +to all addresses) by reading from or writing to an array in memory. It will +also spam the kernel logs for every command it handles. + +A pointer register with auto-increment is implemented for all byte +operations. This allows for continuous byte reads like those supported by +EEPROMs, among others. + +The typical use-case is like this: + 1. load this module + 2. use i2cset (from lm_sensors project) to pre-load some data + 3. load the target sensors chip driver module + 4. observe its behavior in the kernel log + +CAVEATS: + +There are independent arrays for byte/data and word/data commands. Depending +on if/how a target driver mixes them, you'll need to be careful. + +If your target driver polls some byte or word waiting for it to change, the +stub could lock it up. Use i2cset to unlock it. + +If the hardware for your driver has banked registers (e.g. Winbond sensors +chips) this module will not work well - although it could be extended to +support that pretty easily. + +If you spam it hard enough, printk can be lossy. This module really wants +something like relayfs. + diff --git a/Documentation/i2c/porting-clients b/Documentation/i2c/porting-clients new file mode 100644 index 000000000000..56404918eabc --- /dev/null +++ b/Documentation/i2c/porting-clients @@ -0,0 +1,133 @@ +Revision 4, 2004-03-30 +Jean Delvare <khali@linux-fr.org> +Greg KH <greg@kroah.com> + +This is a guide on how to convert I2C chip drivers from Linux 2.4 to +Linux 2.6. I have been using existing drivers (lm75, lm78) as examples. +Then I converted a driver myself (lm83) and updated this document. + +There are two sets of points below. The first set concerns technical +changes. The second set concerns coding policy. Both are mandatory. + +Although reading this guide will help you porting drivers, I suggest +you keep an eye on an already ported driver while porting your own +driver. This will help you a lot understanding what this guide +exactly means. Choose the chip driver that is the more similar to +yours for best results. + +Technical changes: + +* [Includes] Get rid of "version.h". Replace <linux/i2c-proc.h> with + <linux/i2c-sensor.h>. Includes typically look like that: + #include <linux/module.h> + #include <linux/init.h> + #include <linux/slab.h> + #include <linux/i2c.h> + #include <linux/i2c-sensor.h> + #include <linux/i2c-vid.h> /* if you need VRM support */ + #include <asm/io.h> /* if you have I/O operations */ + Please respect this inclusion order. Some extra headers may be + required for a given driver (e.g. "lm75.h"). + +* [Addresses] SENSORS_I2C_END becomes I2C_CLIENT_END, SENSORS_ISA_END + becomes I2C_CLIENT_ISA_END. + +* [Client data] Get rid of sysctl_id. Try using standard names for + register values (for example, temp_os becomes temp_max). You're + still relatively free here, but you *have* to follow the standard + names for sysfs files (see the Sysctl section below). + +* [Function prototypes] The detect functions loses its flags + parameter. Sysctl (e.g. lm75_temp) and miscellaneous functions + are off the list of prototypes. This usually leaves five + prototypes: + static int lm75_attach_adapter(struct i2c_adapter *adapter); + static int lm75_detect(struct i2c_adapter *adapter, int address, + int kind); + static void lm75_init_client(struct i2c_client *client); + static int lm75_detach_client(struct i2c_client *client); + static void lm75_update_client(struct i2c_client *client); + +* [Sysctl] All sysctl stuff is of course gone (defines, ctl_table + and functions). Instead, you have to define show and set functions for + each sysfs file. Only define set for writable values. Take a look at an + existing 2.6 driver for details (lm78 for example). Don't forget + to define the attributes for each file (this is that step that + links callback functions). Use the file names specified in + Documentation/i2c/sysfs-interface for the individual files. Also + convert the units these files read and write to the specified ones. + If you need to add a new type of file, please discuss it on the + sensors mailing list <sensors@stimpy.netroedge.com> by providing a + patch to the Documentation/i2c/sysfs-interface file. + +* [Attach] For I2C drivers, the attach function should make sure + that the adapter's class has I2C_CLASS_HWMON, using the + following construct: + if (!(adapter->class & I2C_CLASS_HWMON)) + return 0; + ISA-only drivers of course don't need this. + +* [Detect] As mentioned earlier, the flags parameter is gone. + The type_name and client_name strings are replaced by a single + name string, which will be filled with a lowercase, short string + (typically the driver name, e.g. "lm75"). + In i2c-only drivers, drop the i2c_is_isa_adapter check, it's + useless. + The errorN labels are reduced to the number needed. If that number + is 2 (i2c-only drivers), it is advised that the labels are named + exit and exit_free. For i2c+isa drivers, labels should be named + ERROR0, ERROR1 and ERROR2. Don't forget to properly set err before + jumping to error labels. By the way, labels should be left-aligned. + Use memset to fill the client and data area with 0x00. + Use i2c_set_clientdata to set the client data (as opposed to + a direct access to client->data). + Use strlcpy instead of strcpy to copy the client name. + Replace the sysctl directory registration by calls to + device_create_file. Move the driver initialization before any + sysfs file creation. + Drop client->id. + +* [Init] Limits must not be set by the driver (can be done later in + user-space). Chip should not be reset default (although a module + parameter may be used to force is), and initialization should be + limited to the strictly necessary steps. + +* [Detach] Get rid of data, remove the call to + i2c_deregister_entry. + +* [Update] Don't access client->data directly, use + i2c_get_clientdata(client) instead. + +* [Interface] Init function should not print anything. Make sure + there is a MODULE_LICENSE() line, at the bottom of the file + (after MODULE_AUTHOR() and MODULE_DESCRIPTION(), in this order). + +Coding policy: + +* [Copyright] Use (C), not (c), for copyright. + +* [Debug/log] Get rid of #ifdef DEBUG/#endif constructs whenever you + can. Calls to printk/pr_debug for debugging purposes are replaced + by calls to dev_dbg. Here is an example on how to call it (taken + from lm75_detect): + dev_dbg(&client->dev, "Starting lm75 update\n"); + Replace other printk calls with the dev_info, dev_err or dev_warn + function, as appropriate. + +* [Constants] Constants defines (registers, conversions, initial + values) should be aligned. This greatly improves readability. + Same goes for variables declarations. Alignments are achieved by the + means of tabs, not spaces. Remember that tabs are set to 8 in the + Linux kernel code. + +* [Structure definition] The name field should be standardized. All + lowercase and as simple as the driver name itself (e.g. "lm75"). + +* [Layout] Avoid extra empty lines between comments and what they + comment. Respect the coding style (see Documentation/CodingStyle), + in particular when it comes to placing curly braces. + +* [Comments] Make sure that no comment refers to a file that isn't + part of the Linux source tree (typically doc/chips/<chip name>), + and that remaining comments still match the code. Merging comment + lines when possible is encouraged. diff --git a/Documentation/i2c/smbus-protocol b/Documentation/i2c/smbus-protocol new file mode 100644 index 000000000000..09f5e5ca4927 --- /dev/null +++ b/Documentation/i2c/smbus-protocol @@ -0,0 +1,216 @@ +SMBus Protocol Summary +====================== +The following is a summary of the SMBus protocol. It applies to +all revisions of the protocol (1.0, 1.1, and 2.0). +Certain protocol features which are not supported by +this package are briefly described at the end of this document. + +Some adapters understand only the SMBus (System Management Bus) protocol, +which is a subset from the I2C protocol. Fortunately, many devices use +only the same subset, which makes it possible to put them on an SMBus. +If you write a driver for some I2C device, please try to use the SMBus +commands if at all possible (if the device uses only that subset of the +I2C protocol). This makes it possible to use the device driver on both +SMBus adapters and I2C adapters (the SMBus command set is automatically +translated to I2C on I2C adapters, but plain I2C commands can not be +handled at all on most pure SMBus adapters). + +Below is a list of SMBus commands. + +Key to symbols +============== + +S (1 bit) : Start bit +P (1 bit) : Stop bit +Rd/Wr (1 bit) : Read/Write bit. Rd equals 1, Wr equals 0. +A, NA (1 bit) : Accept and reverse accept bit. +Addr (7 bits): I2C 7 bit address. Note that this can be expanded as usual to + get a 10 bit I2C address. +Comm (8 bits): Command byte, a data byte which often selects a register on + the device. +Data (8 bits): A plain data byte. Sometimes, I write DataLow, DataHigh + for 16 bit data. +Count (8 bits): A data byte containing the length of a block operation. + +[..]: Data sent by I2C device, as opposed to data sent by the host adapter. + + +SMBus Write Quick +================= + +This sends a single bit to the device, at the place of the Rd/Wr bit. +There is no equivalent Read Quick command. + +A Addr Rd/Wr [A] P + + +SMBus Read Byte +=============== + +This reads a single byte from a device, without specifying a device +register. Some devices are so simple that this interface is enough; for +others, it is a shorthand if you want to read the same register as in +the previous SMBus command. + +S Addr Rd [A] [Data] NA P + + +SMBus Write Byte +================ + +This is the reverse of Read Byte: it sends a single byte to a device. +See Read Byte for more information. + +S Addr Wr [A] Data [A] P + + +SMBus Read Byte Data +==================== + +This reads a single byte from a device, from a designated register. +The register is specified through the Comm byte. + +S Addr Wr [A] Comm [A] S Addr Rd [A] [Data] NA P + + +SMBus Read Word Data +==================== + +This command is very like Read Byte Data; again, data is read from a +device, from a designated register that is specified through the Comm +byte. But this time, the data is a complete word (16 bits). + +S Addr Wr [A] Comm [A] S Addr Rd [A] [DataLow] A [DataHigh] NA P + + +SMBus Write Byte Data +===================== + +This writes a single byte to a device, to a designated register. The +register is specified through the Comm byte. This is the opposite of +the Read Byte Data command. + +S Addr Wr [A] Comm [A] Data [A] P + + +SMBus Write Word Data +===================== + +This is the opposite operation of the Read Word Data command. 16 bits +of data is read from a device, from a designated register that is +specified through the Comm byte. + +S Addr Wr [A] Comm [A] DataLow [A] DataHigh [A] P + + +SMBus Process Call +================== + +This command selects a device register (through the Comm byte), sends +16 bits of data to it, and reads 16 bits of data in return. + +S Addr Wr [A] Comm [A] DataLow [A] DataHigh [A] + S Addr Rd [A] [DataLow] A [DataHigh] NA P + + +SMBus Block Read +================ + +This command reads a block of up to 32 bytes from a device, from a +designated register that is specified through the Comm byte. The amount +of data is specified by the device in the Count byte. + +S Addr Wr [A] Comm [A] + S Addr Rd [A] [Count] A [Data] A [Data] A ... A [Data] NA P + + +SMBus Block Write +================= + +The opposite of the Block Read command, this writes up to 32 bytes to +a device, to a designated register that is specified through the +Comm byte. The amount of data is specified in the Count byte. + +S Addr Wr [A] Comm [A] Count [A] Data [A] Data [A] ... [A] Data [A] P + + +SMBus Block Process Call +======================== + +SMBus Block Process Call was introduced in Revision 2.0 of the specification. + +This command selects a device register (through the Comm byte), sends +1 to 31 bytes of data to it, and reads 1 to 31 bytes of data in return. + +S Addr Wr [A] Comm [A] Count [A] Data [A] ... + S Addr Rd [A] [Count] A [Data] ... A P + + +SMBus Host Notify +================= + +This command is sent from a SMBus device acting as a master to the +SMBus host acting as a slave. +It is the same form as Write Word, with the command code replaced by the +alerting device's address. + +[S] [HostAddr] [Wr] A [DevAddr] A [DataLow] A [DataHigh] A [P] + + +Packet Error Checking (PEC) +=========================== +Packet Error Checking was introduced in Revision 1.1 of the specification. + +PEC adds a CRC-8 error-checking byte to all transfers. + + +Address Resolution Protocol (ARP) +================================= +The Address Resolution Protocol was introduced in Revision 2.0 of +the specification. It is a higher-layer protocol which uses the +messages above. + +ARP adds device enumeration and dynamic address assignment to +the protocol. All ARP communications use slave address 0x61 and +require PEC checksums. + + +I2C Block Transactions +====================== +The following I2C block transactions are supported by the +SMBus layer and are described here for completeness. +I2C block transactions do not limit the number of bytes transferred +but the SMBus layer places a limit of 32 bytes. + + +I2C Block Read +============== + +This command reads a block of bytes from a device, from a +designated register that is specified through the Comm byte. + +S Addr Wr [A] Comm [A] + S Addr Rd [A] [Data] A [Data] A ... A [Data] NA P + + +I2C Block Read (2 Comm bytes) +============================= + +This command reads a block of bytes from a device, from a +designated register that is specified through the two Comm bytes. + +S Addr Wr [A] Comm1 [A] Comm2 [A] + S Addr Rd [A] [Data] A [Data] A ... A [Data] NA P + + +I2C Block Write +=============== + +The opposite of the Block Read command, this writes bytes to +a device, to a designated register that is specified through the +Comm byte. Note that command lengths of 0, 2, or more bytes are +supported as they are indistinguishable from data. + +S Addr Wr [A] Comm [A] Data [A] Data [A] ... [A] Data [A] P + + diff --git a/Documentation/i2c/summary b/Documentation/i2c/summary new file mode 100644 index 000000000000..41dde8776791 --- /dev/null +++ b/Documentation/i2c/summary @@ -0,0 +1,75 @@ +This is an explanation of what i2c is, and what is supported in this package. + +I2C and SMBus +============= + +I2C (pronounce: I squared C) is a protocol developed by Philips. It is a +slow two-wire protocol (10-400 kHz), but it suffices for many types of +devices. + +SMBus (System Management Bus) is a subset of the I2C protocol. Many +modern mainboards have a System Management Bus. There are a lot of +devices which can be connected to a SMBus; the most notable are modern +memory chips with EEPROM memories and chips for hardware monitoring. + +Because the SMBus is just a special case of the generalized I2C bus, we +can simulate the SMBus protocol on plain I2C busses. The reverse is +regretfully impossible. + + +Terminology +=========== + +When we talk about I2C, we use the following terms: + Bus -> Algorithm + Adapter + Device -> Driver + Client + +An Algorithm driver contains general code that can be used for a whole class +of I2C adapters. Each specific adapter driver depends on one algorithm +driver. +A Driver driver (yes, this sounds ridiculous, sorry) contains the general +code to access some type of device. Each detected device gets its own +data in the Client structure. Usually, Driver and Client are more closely +integrated than Algorithm and Adapter. + +For a given configuration, you will need a driver for your I2C bus (usually +a separate Adapter and Algorithm driver), and drivers for your I2C devices +(usually one driver for each device). There are no I2C device drivers +in this package. See the lm_sensors project http://www.lm-sensors.nu +for device drivers. + + +Included Bus Drivers +==================== +Note that only stable drivers are patched into the kernel by 'mkpatch'. + + +Base modules +------------ + +i2c-core: The basic I2C code, including the /proc/bus/i2c* interface +i2c-dev: The /dev/i2c-* interface +i2c-proc: The /proc/sys/dev/sensors interface for device (client) drivers + +Algorithm drivers +----------------- + +i2c-algo-8xx: An algorithm for CPM's I2C device in Motorola 8xx processors (NOT BUILT BY DEFAULT) +i2c-algo-bit: A bit-banging algorithm +i2c-algo-pcf: A PCF 8584 style algorithm +i2c-algo-ibm_ocp: An algorithm for the I2C device in IBM 4xx processors (NOT BUILT BY DEFAULT) + +Adapter drivers +--------------- + +i2c-elektor: Elektor ISA card (uses i2c-algo-pcf) +i2c-elv: ELV parallel port adapter (uses i2c-algo-bit) +i2c-pcf-epp: PCF8584 on a EPP parallel port (uses i2c-algo-pcf) (NOT mkpatched) +i2c-philips-par: Philips style parallel port adapter (uses i2c-algo-bit) +i2c-adap-ibm_ocp: IBM 4xx processor I2C device (uses i2c-algo-ibm_ocp) (NOT BUILT BY DEFAULT) +i2c-pport: Primitive parallel port adapter (uses i2c-algo-bit) +i2c-rpx: RPX board Motorola 8xx I2C device (uses i2c-algo-8xx) (NOT BUILT BY DEFAULT) +i2c-velleman: Velleman K8000 parallel port adapter (uses i2c-algo-bit) + diff --git a/Documentation/i2c/sysfs-interface b/Documentation/i2c/sysfs-interface new file mode 100644 index 000000000000..346400519d0d --- /dev/null +++ b/Documentation/i2c/sysfs-interface @@ -0,0 +1,274 @@ +Naming and data format standards for sysfs files +------------------------------------------------ + +The libsensors library offers an interface to the raw sensors data +through the sysfs interface. See libsensors documentation and source for +more further information. As of writing this document, libsensors +(from lm_sensors 2.8.3) is heavily chip-dependant. Adding or updating +support for any given chip requires modifying the library's code. +This is because libsensors was written for the procfs interface +older kernel modules were using, which wasn't standardized enough. +Recent versions of libsensors (from lm_sensors 2.8.2 and later) have +support for the sysfs interface, though. + +The new sysfs interface was designed to be as chip-independant as +possible. + +Note that motherboards vary widely in the connections to sensor chips. +There is no standard that ensures, for example, that the second +temperature sensor is connected to the CPU, or that the second fan is on +the CPU. Also, some values reported by the chips need some computation +before they make full sense. For example, most chips can only measure +voltages between 0 and +4V. Other voltages are scaled back into that +range using external resistors. Since the values of these resistors +can change from motherboard to motherboard, the conversions cannot be +hard coded into the driver and have to be done in user space. + +For this reason, even if we aim at a chip-independant libsensors, it will +still require a configuration file (e.g. /etc/sensors.conf) for proper +values conversion, labeling of inputs and hiding of unused inputs. + +An alternative method that some programs use is to access the sysfs +files directly. This document briefly describes the standards that the +drivers follow, so that an application program can scan for entries and +access this data in a simple and consistent way. That said, such programs +will have to implement conversion, labeling and hiding of inputs. For +this reason, it is still not recommended to bypass the library. + +If you are developing a userspace application please send us feedback on +this standard. + +Note that this standard isn't completely established yet, so it is subject +to changes, even important ones. One more reason to use the library instead +of accessing sysfs files directly. + +Each chip gets its own directory in the sysfs /sys/devices tree. To +find all sensor chips, it is easier to follow the symlinks from +/sys/i2c/devices/ + +All sysfs values are fixed point numbers. To get the true value of some +of the values, you should divide by the specified value. + +There is only one value per file, unlike the older /proc specification. +The common scheme for files naming is: <type><number>_<item>. Usual +types for sensor chips are "in" (voltage), "temp" (temperature) and +"fan" (fan). Usual items are "input" (measured value), "max" (high +threshold, "min" (low threshold). Numbering usually starts from 1, +except for voltages which start from 0 (because most data sheets use +this). A number is always used for elements that can be present more +than once, even if there is a single element of the given type on the +specific chip. Other files do not refer to a specific element, so +they have a simple name, and no number. + +Alarms are direct indications read from the chips. The drivers do NOT +make comparisons of readings to thresholds. This allows violations +between readings to be caught and alarmed. The exact definition of an +alarm (for example, whether a threshold must be met or must be exceeded +to cause an alarm) is chip-dependent. + + +------------------------------------------------------------------------- + +************ +* Voltages * +************ + +in[0-8]_min Voltage min value. + Unit: millivolt + Read/Write + +in[0-8]_max Voltage max value. + Unit: millivolt + Read/Write + +in[0-8]_input Voltage input value. + Unit: millivolt + Read only + Actual voltage depends on the scaling resistors on the + motherboard, as recommended in the chip datasheet. + This varies by chip and by motherboard. + Because of this variation, values are generally NOT scaled + by the chip driver, and must be done by the application. + However, some drivers (notably lm87 and via686a) + do scale, with various degrees of success. + These drivers will output the actual voltage. + + Typical usage: + in0_* CPU #1 voltage (not scaled) + in1_* CPU #2 voltage (not scaled) + in2_* 3.3V nominal (not scaled) + in3_* 5.0V nominal (scaled) + in4_* 12.0V nominal (scaled) + in5_* -12.0V nominal (scaled) + in6_* -5.0V nominal (scaled) + in7_* varies + in8_* varies + +cpu[0-1]_vid CPU core reference voltage. + Unit: millivolt + Read only. + Not always correct. + +vrm Voltage Regulator Module version number. + Read only. + Two digit number, first is major version, second is + minor version. + Affects the way the driver calculates the CPU core reference + voltage from the vid pins. + + +******** +* Fans * +******** + +fan[1-3]_min Fan minimum value + Unit: revolution/min (RPM) + Read/Write. + +fan[1-3]_input Fan input value. + Unit: revolution/min (RPM) + Read only. + +fan[1-3]_div Fan divisor. + Integer value in powers of two (1, 2, 4, 8, 16, 32, 64, 128). + Some chips only support values 1, 2, 4 and 8. + Note that this is actually an internal clock divisor, which + affects the measurable speed range, not the read value. + +******* +* PWM * +******* + +pwm[1-3] Pulse width modulation fan control. + Integer value in the range 0 to 255 + Read/Write + 255 is max or 100%. + +pwm[1-3]_enable + Switch PWM on and off. + Not always present even if fan*_pwm is. + 0 to turn off + 1 to turn on in manual mode + 2 to turn on in automatic mode + Read/Write + +pwm[1-*]_auto_channels_temp + Select which temperature channels affect this PWM output in + auto mode. Bitfield, 1 is temp1, 2 is temp2, 4 is temp3 etc... + Which values are possible depend on the chip used. + +pwm[1-*]_auto_point[1-*]_pwm +pwm[1-*]_auto_point[1-*]_temp +pwm[1-*]_auto_point[1-*]_temp_hyst + Define the PWM vs temperature curve. Number of trip points is + chip-dependent. Use this for chips which associate trip points + to PWM output channels. + +OR + +temp[1-*]_auto_point[1-*]_pwm +temp[1-*]_auto_point[1-*]_temp +temp[1-*]_auto_point[1-*]_temp_hyst + Define the PWM vs temperature curve. Number of trip points is + chip-dependent. Use this for chips which associate trip points + to temperature channels. + + +**************** +* Temperatures * +**************** + +temp[1-3]_type Sensor type selection. + Integers 1, 2, 3 or thermistor Beta value (3435) + Read/Write. + 1: PII/Celeron Diode + 2: 3904 transistor + 3: thermal diode + Not all types are supported by all chips + +temp[1-4]_max Temperature max value. + Unit: millidegree Celcius + Read/Write value. + +temp[1-3]_min Temperature min value. + Unit: millidegree Celcius + Read/Write value. + +temp[1-3]_max_hyst + Temperature hysteresis value for max limit. + Unit: millidegree Celcius + Must be reported as an absolute temperature, NOT a delta + from the max value. + Read/Write value. + +temp[1-4]_input Temperature input value. + Unit: millidegree Celcius + Read only value. + +temp[1-4]_crit Temperature critical value, typically greater than + corresponding temp_max values. + Unit: millidegree Celcius + Read/Write value. + +temp[1-2]_crit_hyst + Temperature hysteresis value for critical limit. + Unit: millidegree Celcius + Must be reported as an absolute temperature, NOT a delta + from the critical value. + Read/Write value. + + If there are multiple temperature sensors, temp1_* is + generally the sensor inside the chip itself, + reported as "motherboard temperature". temp2_* to + temp4_* are generally sensors external to the chip + itself, for example the thermal diode inside the CPU or + a thermistor nearby. + + +************ +* Currents * +************ + +Note that no known chip provides current measurements as of writing, +so this part is theoretical, so to say. + +curr[1-n]_max Current max value + Unit: milliampere + Read/Write. + +curr[1-n]_min Current min value. + Unit: milliampere + Read/Write. + +curr[1-n]_input Current input value + Unit: milliampere + Read only. + + +********* +* Other * +********* + +alarms Alarm bitmask. + Read only. + Integer representation of one to four bytes. + A '1' bit means an alarm. + Chips should be programmed for 'comparator' mode so that + the alarm will 'come back' after you read the register + if it is still valid. + Generally a direct representation of a chip's internal + alarm registers; there is no standard for the position + of individual bits. + Bits are defined in kernel/include/sensors.h. + +beep_enable Beep/interrupt enable + 0 to disable. + 1 to enable. + Read/Write + +beep_mask Bitmask for beep. + Same format as 'alarms' with the same bit locations. + Read/Write + +eeprom Raw EEPROM data in binary form. + Read only. diff --git a/Documentation/i2c/ten-bit-addresses b/Documentation/i2c/ten-bit-addresses new file mode 100644 index 000000000000..200074f81360 --- /dev/null +++ b/Documentation/i2c/ten-bit-addresses @@ -0,0 +1,22 @@ +The I2C protocol knows about two kinds of device addresses: normal 7 bit +addresses, and an extended set of 10 bit addresses. The sets of addresses +do not intersect: the 7 bit address 0x10 is not the same as the 10 bit +address 0x10 (though a single device could respond to both of them). You +select a 10 bit address by adding an extra byte after the address +byte: + S Addr7 Rd/Wr .... +becomes + S 11110 Addr10 Rd/Wr +S is the start bit, Rd/Wr the read/write bit, and if you count the number +of bits, you will see the there are 8 after the S bit for 7 bit addresses, +and 16 after the S bit for 10 bit addresses. + +WARNING! The current 10 bit address support is EXPERIMENTAL. There are +several places in the code that will cause SEVERE PROBLEMS with 10 bit +addresses, even though there is some basic handling and hooks. Also, +almost no supported adapter handles the 10 bit addresses correctly. + +As soon as a real 10 bit address device is spotted 'in the wild', we +can and will add proper support. Right now, 10 bit address devices +are defined by the I2C protocol, but we have never seen a single device +which supports them. diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients new file mode 100644 index 000000000000..ad27511e3c7d --- /dev/null +++ b/Documentation/i2c/writing-clients @@ -0,0 +1,816 @@ +This is a small guide for those who want to write kernel drivers for I2C +or SMBus devices. + +To set up a driver, you need to do several things. Some are optional, and +some things can be done slightly or completely different. Use this as a +guide, not as a rule book! + + +General remarks +=============== + +Try to keep the kernel namespace as clean as possible. The best way to +do this is to use a unique prefix for all global symbols. This is +especially important for exported symbols, but it is a good idea to do +it for non-exported symbols too. We will use the prefix `foo_' in this +tutorial, and `FOO_' for preprocessor variables. + + +The driver structure +==================== + +Usually, you will implement a single driver structure, and instantiate +all clients from it. Remember, a driver structure contains general access +routines, a client structure specific information like the actual I2C +address. + +static struct i2c_driver foo_driver = { + .owner = THIS_MODULE, + .name = "Foo version 2.3 driver", + .id = I2C_DRIVERID_FOO, /* from i2c-id.h, optional */ + .flags = I2C_DF_NOTIFY, + .attach_adapter = &foo_attach_adapter, + .detach_client = &foo_detach_client, + .command = &foo_command /* may be NULL */ +} + +The name can be chosen freely, and may be upto 40 characters long. Please +use something descriptive here. + +If used, the id should be a unique ID. The range 0xf000 to 0xffff is +reserved for local use, and you can use one of those until you start +distributing the driver, at which time you should contact the i2c authors +to get your own ID(s). Note that most of the time you don't need an ID +at all so you can just omit it. + +Don't worry about the flags field; just put I2C_DF_NOTIFY into it. This +means that your driver will be notified when new adapters are found. +This is almost always what you want. + +All other fields are for call-back functions which will be explained +below. + +There use to be two additional fields in this structure, inc_use et dec_use, +for module usage count, but these fields were obsoleted and removed. + + +Extra client data +================= + +The client structure has a special `data' field that can point to any +structure at all. You can use this to keep client-specific data. You +do not always need this, but especially for `sensors' drivers, it can +be very useful. + +An example structure is below. + + struct foo_data { + struct semaphore lock; /* For ISA access in `sensors' drivers. */ + int sysctl_id; /* To keep the /proc directory entry for + `sensors' drivers. */ + enum chips type; /* To keep the chips type for `sensors' drivers. */ + + /* Because the i2c bus is slow, it is often useful to cache the read + information of a chip for some time (for example, 1 or 2 seconds). + It depends of course on the device whether this is really worthwhile + or even sensible. */ + struct semaphore update_lock; /* When we are reading lots of information, + another process should not update the + below information */ + char valid; /* != 0 if the following fields are valid. */ + unsigned long last_updated; /* In jiffies */ + /* Add the read information here too */ + }; + + +Accessing the client +==================== + +Let's say we have a valid client structure. At some time, we will need +to gather information from the client, or write new information to the +client. How we will export this information to user-space is less +important at this moment (perhaps we do not need to do this at all for +some obscure clients). But we need generic reading and writing routines. + +I have found it useful to define foo_read and foo_write function for this. +For some cases, it will be easier to call the i2c functions directly, +but many chips have some kind of register-value idea that can easily +be encapsulated. Also, some chips have both ISA and I2C interfaces, and +it useful to abstract from this (only for `sensors' drivers). + +The below functions are simple examples, and should not be copied +literally. + + int foo_read_value(struct i2c_client *client, u8 reg) + { + if (reg < 0x10) /* byte-sized register */ + return i2c_smbus_read_byte_data(client,reg); + else /* word-sized register */ + return i2c_smbus_read_word_data(client,reg); + } + + int foo_write_value(struct i2c_client *client, u8 reg, u16 value) + { + if (reg == 0x10) /* Impossible to write - driver error! */ { + return -1; + else if (reg < 0x10) /* byte-sized register */ + return i2c_smbus_write_byte_data(client,reg,value); + else /* word-sized register */ + return i2c_smbus_write_word_data(client,reg,value); + } + +For sensors code, you may have to cope with ISA registers too. Something +like the below often works. Note the locking! + + int foo_read_value(struct i2c_client *client, u8 reg) + { + int res; + if (i2c_is_isa_client(client)) { + down(&(((struct foo_data *) (client->data)) -> lock)); + outb_p(reg,client->addr + FOO_ADDR_REG_OFFSET); + res = inb_p(client->addr + FOO_DATA_REG_OFFSET); + up(&(((struct foo_data *) (client->data)) -> lock)); + return res; + } else + return i2c_smbus_read_byte_data(client,reg); + } + +Writing is done the same way. + + +Probing and attaching +===================== + +Most i2c devices can be present on several i2c addresses; for some this +is determined in hardware (by soldering some chip pins to Vcc or Ground), +for others this can be changed in software (by writing to specific client +registers). Some devices are usually on a specific address, but not always; +and some are even more tricky. So you will probably need to scan several +i2c addresses for your clients, and do some sort of detection to see +whether it is actually a device supported by your driver. + +To give the user a maximum of possibilities, some default module parameters +are defined to help determine what addresses are scanned. Several macros +are defined in i2c.h to help you support them, as well as a generic +detection algorithm. + +You do not have to use this parameter interface; but don't try to use +function i2c_probe() (or i2c_detect()) if you don't. + +NOTE: If you want to write a `sensors' driver, the interface is slightly + different! See below. + + + +Probing classes (i2c) +--------------------- + +All parameters are given as lists of unsigned 16-bit integers. Lists are +terminated by I2C_CLIENT_END. +The following lists are used internally: + + normal_i2c: filled in by the module writer. + A list of I2C addresses which should normally be examined. + normal_i2c_range: filled in by the module writer. + A list of pairs of I2C addresses, each pair being an inclusive range of + addresses which should normally be examined. + probe: insmod parameter. + A list of pairs. The first value is a bus number (-1 for any I2C bus), + the second is the address. These addresses are also probed, as if they + were in the 'normal' list. + probe_range: insmod parameter. + A list of triples. The first value is a bus number (-1 for any I2C bus), + the second and third are addresses. These form an inclusive range of + addresses that are also probed, as if they were in the 'normal' list. + ignore: insmod parameter. + A list of pairs. The first value is a bus number (-1 for any I2C bus), + the second is the I2C address. These addresses are never probed. + This parameter overrules 'normal' and 'probe', but not the 'force' lists. + ignore_range: insmod parameter. + A list of triples. The first value is a bus number (-1 for any I2C bus), + the second and third are addresses. These form an inclusive range of + I2C addresses that are never probed. + This parameter overrules 'normal' and 'probe', but not the 'force' lists. + force: insmod parameter. + A list of pairs. The first value is a bus number (-1 for any I2C bus), + the second is the I2C address. A device is blindly assumed to be on + the given address, no probing is done. + +Fortunately, as a module writer, you just have to define the `normal' +and/or `normal_range' parameters. The complete declaration could look +like this: + + /* Scan 0x20 to 0x2f, 0x37, and 0x40 to 0x4f */ + static unsigned short normal_i2c[] = { 0x37,I2C_CLIENT_END }; + static unsigned short normal_i2c_range[] = { 0x20, 0x2f, 0x40, 0x4f, + I2C_CLIENT_END }; + + /* Magic definition of all other variables and things */ + I2C_CLIENT_INSMOD; + +Note that you *have* to call the two defined variables `normal_i2c' and +`normal_i2c_range', without any prefix! + + +Probing classes (sensors) +------------------------- + +If you write a `sensors' driver, you use a slightly different interface. +As well as I2C addresses, we have to cope with ISA addresses. Also, we +use a enum of chip types. Don't forget to include `sensors.h'. + +The following lists are used internally. They are all lists of integers. + + normal_i2c: filled in by the module writer. Terminated by SENSORS_I2C_END. + A list of I2C addresses which should normally be examined. + normal_i2c_range: filled in by the module writer. Terminated by + SENSORS_I2C_END + A list of pairs of I2C addresses, each pair being an inclusive range of + addresses which should normally be examined. + normal_isa: filled in by the module writer. Terminated by SENSORS_ISA_END. + A list of ISA addresses which should normally be examined. + normal_isa_range: filled in by the module writer. Terminated by + SENSORS_ISA_END + A list of triples. The first two elements are ISA addresses, being an + range of addresses which should normally be examined. The third is the + modulo parameter: only addresses which are 0 module this value relative + to the first address of the range are actually considered. + probe: insmod parameter. Initialize this list with SENSORS_I2C_END values. + A list of pairs. The first value is a bus number (SENSORS_ISA_BUS for + the ISA bus, -1 for any I2C bus), the second is the address. These + addresses are also probed, as if they were in the 'normal' list. + probe_range: insmod parameter. Initialize this list with SENSORS_I2C_END + values. + A list of triples. The first value is a bus number (SENSORS_ISA_BUS for + the ISA bus, -1 for any I2C bus), the second and third are addresses. + These form an inclusive range of addresses that are also probed, as + if they were in the 'normal' list. + ignore: insmod parameter. Initialize this list with SENSORS_I2C_END values. + A list of pairs. The first value is a bus number (SENSORS_ISA_BUS for + the ISA bus, -1 for any I2C bus), the second is the I2C address. These + addresses are never probed. This parameter overrules 'normal' and + 'probe', but not the 'force' lists. + ignore_range: insmod parameter. Initialize this list with SENSORS_I2C_END + values. + A list of triples. The first value is a bus number (SENSORS_ISA_BUS for + the ISA bus, -1 for any I2C bus), the second and third are addresses. + These form an inclusive range of I2C addresses that are never probed. + This parameter overrules 'normal' and 'probe', but not the 'force' lists. + +Also used is a list of pointers to sensors_force_data structures: + force_data: insmod parameters. A list, ending with an element of which + the force field is NULL. + Each element contains the type of chip and a list of pairs. + The first value is a bus number (SENSORS_ISA_BUS for the ISA bus, + -1 for any I2C bus), the second is the address. + These are automatically translated to insmod variables of the form + force_foo. + +So we have a generic insmod variabled `force', and chip-specific variables +`force_CHIPNAME'. + +Fortunately, as a module writer, you just have to define the `normal' +and/or `normal_range' parameters, and define what chip names are used. +The complete declaration could look like this: + /* Scan i2c addresses 0x20 to 0x2f, 0x37, and 0x40 to 0x4f + static unsigned short normal_i2c[] = {0x37,SENSORS_I2C_END}; + static unsigned short normal_i2c_range[] = {0x20,0x2f,0x40,0x4f, + SENSORS_I2C_END}; + /* Scan ISA address 0x290 */ + static unsigned int normal_isa[] = {0x0290,SENSORS_ISA_END}; + static unsigned int normal_isa_range[] = {SENSORS_ISA_END}; + + /* Define chips foo and bar, as well as all module parameters and things */ + SENSORS_INSMOD_2(foo,bar); + +If you have one chip, you use macro SENSORS_INSMOD_1(chip), if you have 2 +you use macro SENSORS_INSMOD_2(chip1,chip2), etc. If you do not want to +bother with chip types, you can use SENSORS_INSMOD_0. + +A enum is automatically defined as follows: + enum chips { any_chip, chip1, chip2, ... } + + +Attaching to an adapter +----------------------- + +Whenever a new adapter is inserted, or for all adapters if the driver is +being registered, the callback attach_adapter() is called. Now is the +time to determine what devices are present on the adapter, and to register +a client for each of them. + +The attach_adapter callback is really easy: we just call the generic +detection function. This function will scan the bus for us, using the +information as defined in the lists explained above. If a device is +detected at a specific address, another callback is called. + + int foo_attach_adapter(struct i2c_adapter *adapter) + { + return i2c_probe(adapter,&addr_data,&foo_detect_client); + } + +For `sensors' drivers, use the i2c_detect function instead: + + int foo_attach_adapter(struct i2c_adapter *adapter) + { + return i2c_detect(adapter,&addr_data,&foo_detect_client); + } + +Remember, structure `addr_data' is defined by the macros explained above, +so you do not have to define it yourself. + +The i2c_probe or i2c_detect function will call the foo_detect_client +function only for those i2c addresses that actually have a device on +them (unless a `force' parameter was used). In addition, addresses that +are already in use (by some other registered client) are skipped. + + +The detect client function +-------------------------- + +The detect client function is called by i2c_probe or i2c_detect. +The `kind' parameter contains 0 if this call is due to a `force' +parameter, and -1 otherwise (for i2c_detect, it contains 0 if +this call is due to the generic `force' parameter, and the chip type +number if it is due to a specific `force' parameter). + +Below, some things are only needed if this is a `sensors' driver. Those +parts are between /* SENSORS ONLY START */ and /* SENSORS ONLY END */ +markers. + +This function should only return an error (any value != 0) if there is +some reason why no more detection should be done anymore. If the +detection just fails for this address, return 0. + +For now, you can ignore the `flags' parameter. It is there for future use. + + int foo_detect_client(struct i2c_adapter *adapter, int address, + unsigned short flags, int kind) + { + int err = 0; + int i; + struct i2c_client *new_client; + struct foo_data *data; + const char *client_name = ""; /* For non-`sensors' drivers, put the real + name here! */ + + /* Let's see whether this adapter can support what we need. + Please substitute the things you need here! + For `sensors' drivers, add `! is_isa &&' to the if statement */ + if (!i2c_check_functionality(adapter,I2C_FUNC_SMBUS_WORD_DATA | + I2C_FUNC_SMBUS_WRITE_BYTE)) + goto ERROR0; + + /* SENSORS ONLY START */ + const char *type_name = ""; + int is_isa = i2c_is_isa_adapter(adapter); + + if (is_isa) { + + /* If this client can't be on the ISA bus at all, we can stop now + (call `goto ERROR0'). But for kicks, we will assume it is all + right. */ + + /* Discard immediately if this ISA range is already used */ + if (check_region(address,FOO_EXTENT)) + goto ERROR0; + + /* Probe whether there is anything on this address. + Some example code is below, but you will have to adapt this + for your own driver */ + + if (kind < 0) /* Only if no force parameter was used */ { + /* We may need long timeouts at least for some chips. */ + #define REALLY_SLOW_IO + i = inb_p(address + 1); + if (inb_p(address + 2) != i) + goto ERROR0; + if (inb_p(address + 3) != i) + goto ERROR0; + if (inb_p(address + 7) != i) + goto ERROR0; + #undef REALLY_SLOW_IO + + /* Let's just hope nothing breaks here */ + i = inb_p(address + 5) & 0x7f; + outb_p(~i & 0x7f,address+5); + if ((inb_p(address + 5) & 0x7f) != (~i & 0x7f)) { + outb_p(i,address+5); + return 0; + } + } + } + + /* SENSORS ONLY END */ + + /* OK. For now, we presume we have a valid client. We now create the + client structure, even though we cannot fill it completely yet. + But it allows us to access several i2c functions safely */ + + /* Note that we reserve some space for foo_data too. If you don't + need it, remove it. We do it here to help to lessen memory + fragmentation. */ + if (! (new_client = kmalloc(sizeof(struct i2c_client) + + sizeof(struct foo_data), + GFP_KERNEL))) { + err = -ENOMEM; + goto ERROR0; + } + + /* This is tricky, but it will set the data to the right value. */ + client->data = new_client + 1; + data = (struct foo_data *) (client->data); + + new_client->addr = address; + new_client->data = data; + new_client->adapter = adapter; + new_client->driver = &foo_driver; + new_client->flags = 0; + + /* Now, we do the remaining detection. If no `force' parameter is used. */ + + /* First, the generic detection (if any), that is skipped if any force + parameter was used. */ + if (kind < 0) { + /* The below is of course bogus */ + if (foo_read(new_client,FOO_REG_GENERIC) != FOO_GENERIC_VALUE) + goto ERROR1; + } + + /* SENSORS ONLY START */ + + /* Next, specific detection. This is especially important for `sensors' + devices. */ + + /* Determine the chip type. Not needed if a `force_CHIPTYPE' parameter + was used. */ + if (kind <= 0) { + i = foo_read(new_client,FOO_REG_CHIPTYPE); + if (i == FOO_TYPE_1) + kind = chip1; /* As defined in the enum */ + else if (i == FOO_TYPE_2) + kind = chip2; + else { + printk("foo: Ignoring 'force' parameter for unknown chip at " + "adapter %d, address 0x%02x\n",i2c_adapter_id(adapter),address); + goto ERROR1; + } + } + + /* Now set the type and chip names */ + if (kind == chip1) { + type_name = "chip1"; /* For /proc entry */ + client_name = "CHIP 1"; + } else if (kind == chip2) { + type_name = "chip2"; /* For /proc entry */ + client_name = "CHIP 2"; + } + + /* Reserve the ISA region */ + if (is_isa) + request_region(address,FOO_EXTENT,type_name); + + /* SENSORS ONLY END */ + + /* Fill in the remaining client fields. */ + strcpy(new_client->name,client_name); + + /* SENSORS ONLY BEGIN */ + data->type = kind; + /* SENSORS ONLY END */ + + data->valid = 0; /* Only if you use this field */ + init_MUTEX(&data->update_lock); /* Only if you use this field */ + + /* Any other initializations in data must be done here too. */ + + /* Tell the i2c layer a new client has arrived */ + if ((err = i2c_attach_client(new_client))) + goto ERROR3; + + /* SENSORS ONLY BEGIN */ + /* Register a new directory entry with module sensors. See below for + the `template' structure. */ + if ((i = i2c_register_entry(new_client, type_name, + foo_dir_table_template,THIS_MODULE)) < 0) { + err = i; + goto ERROR4; + } + data->sysctl_id = i; + + /* SENSORS ONLY END */ + + /* This function can write default values to the client registers, if + needed. */ + foo_init_client(new_client); + return 0; + + /* OK, this is not exactly good programming practice, usually. But it is + very code-efficient in this case. */ + + ERROR4: + i2c_detach_client(new_client); + ERROR3: + ERROR2: + /* SENSORS ONLY START */ + if (is_isa) + release_region(address,FOO_EXTENT); + /* SENSORS ONLY END */ + ERROR1: + kfree(new_client); + ERROR0: + return err; + } + + +Removing the client +=================== + +The detach_client call back function is called when a client should be +removed. It may actually fail, but only when panicking. This code is +much simpler than the attachment code, fortunately! + + int foo_detach_client(struct i2c_client *client) + { + int err,i; + + /* SENSORS ONLY START */ + /* Deregister with the `i2c-proc' module. */ + i2c_deregister_entry(((struct lm78_data *)(client->data))->sysctl_id); + /* SENSORS ONLY END */ + + /* Try to detach the client from i2c space */ + if ((err = i2c_detach_client(client))) { + printk("foo.o: Client deregistration failed, client not detached.\n"); + return err; + } + + /* SENSORS ONLY START */ + if i2c_is_isa_client(client) + release_region(client->addr,LM78_EXTENT); + /* SENSORS ONLY END */ + + kfree(client); /* Frees client data too, if allocated at the same time */ + return 0; + } + + +Initializing the module or kernel +================================= + +When the kernel is booted, or when your foo driver module is inserted, +you have to do some initializing. Fortunately, just attaching (registering) +the driver module is usually enough. + + /* Keep track of how far we got in the initialization process. If several + things have to initialized, and we fail halfway, only those things + have to be cleaned up! */ + static int __initdata foo_initialized = 0; + + static int __init foo_init(void) + { + int res; + printk("foo version %s (%s)\n",FOO_VERSION,FOO_DATE); + + if ((res = i2c_add_driver(&foo_driver))) { + printk("foo: Driver registration failed, module not inserted.\n"); + foo_cleanup(); + return res; + } + foo_initialized ++; + return 0; + } + + void foo_cleanup(void) + { + if (foo_initialized == 1) { + if ((res = i2c_del_driver(&foo_driver))) { + printk("foo: Driver registration failed, module not removed.\n"); + return; + } + foo_initialized --; + } + } + + /* Substitute your own name and email address */ + MODULE_AUTHOR("Frodo Looijaard <frodol@dds.nl>" + MODULE_DESCRIPTION("Driver for Barf Inc. Foo I2C devices"); + + module_init(foo_init); + module_exit(foo_cleanup); + +Note that some functions are marked by `__init', and some data structures +by `__init_data'. Hose functions and structures can be removed after +kernel booting (or module loading) is completed. + +Command function +================ + +A generic ioctl-like function call back is supported. You will seldom +need this. You may even set it to NULL. + + /* No commands defined */ + int foo_command(struct i2c_client *client, unsigned int cmd, void *arg) + { + return 0; + } + + +Sending and receiving +===================== + +If you want to communicate with your device, there are several functions +to do this. You can find all of them in i2c.h. + +If you can choose between plain i2c communication and SMBus level +communication, please use the last. All adapters understand SMBus level +commands, but only some of them understand plain i2c! + + +Plain i2c communication +----------------------- + + extern int i2c_master_send(struct i2c_client *,const char* ,int); + extern int i2c_master_recv(struct i2c_client *,char* ,int); + +These routines read and write some bytes from/to a client. The client +contains the i2c address, so you do not have to include it. The second +parameter contains the bytes the read/write, the third the length of the +buffer. Returned is the actual number of bytes read/written. + + extern int i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msg, + int num); + +This sends a series of messages. Each message can be a read or write, +and they can be mixed in any way. The transactions are combined: no +stop bit is sent between transaction. The i2c_msg structure contains +for each message the client address, the number of bytes of the message +and the message data itself. + +You can read the file `i2c-protocol' for more information about the +actual i2c protocol. + + +SMBus communication +------------------- + + extern s32 i2c_smbus_xfer (struct i2c_adapter * adapter, u16 addr, + unsigned short flags, + char read_write, u8 command, int size, + union i2c_smbus_data * data); + + This is the generic SMBus function. All functions below are implemented + in terms of it. Never use this function directly! + + + extern s32 i2c_smbus_write_quick(struct i2c_client * client, u8 value); + extern s32 i2c_smbus_read_byte(struct i2c_client * client); + extern s32 i2c_smbus_write_byte(struct i2c_client * client, u8 value); + extern s32 i2c_smbus_read_byte_data(struct i2c_client * client, u8 command); + extern s32 i2c_smbus_write_byte_data(struct i2c_client * client, + u8 command, u8 value); + extern s32 i2c_smbus_read_word_data(struct i2c_client * client, u8 command); + extern s32 i2c_smbus_write_word_data(struct i2c_client * client, + u8 command, u16 value); + extern s32 i2c_smbus_write_block_data(struct i2c_client * client, + u8 command, u8 length, + u8 *values); + +These ones were removed in Linux 2.6.10 because they had no users, but could +be added back later if needed: + + extern s32 i2c_smbus_read_i2c_block_data(struct i2c_client * client, + u8 command, u8 *values); + extern s32 i2c_smbus_read_block_data(struct i2c_client * client, + u8 command, u8 *values); + extern s32 i2c_smbus_write_i2c_block_data(struct i2c_client * client, + u8 command, u8 length, + u8 *values); + extern s32 i2c_smbus_process_call(struct i2c_client * client, + u8 command, u16 value); + extern s32 i2c_smbus_block_process_call(struct i2c_client *client, + u8 command, u8 length, + u8 *values) + +All these transactions return -1 on failure. The 'write' transactions +return 0 on success; the 'read' transactions return the read value, except +for read_block, which returns the number of values read. The block buffers +need not be longer than 32 bytes. + +You can read the file `smbus-protocol' for more information about the +actual SMBus protocol. + + +General purpose routines +======================== + +Below all general purpose routines are listed, that were not mentioned +before. + + /* This call returns a unique low identifier for each registered adapter, + * or -1 if the adapter was not registered. + */ + extern int i2c_adapter_id(struct i2c_adapter *adap); + + +The sensors sysctl/proc interface +================================= + +This section only applies if you write `sensors' drivers. + +Each sensors driver creates a directory in /proc/sys/dev/sensors for each +registered client. The directory is called something like foo-i2c-4-65. +The sensors module helps you to do this as easily as possible. + +The template +------------ + +You will need to define a ctl_table template. This template will automatically +be copied to a newly allocated structure and filled in where necessary when +you call sensors_register_entry. + +First, I will give an example definition. + static ctl_table foo_dir_table_template[] = { + { FOO_SYSCTL_FUNC1, "func1", NULL, 0, 0644, NULL, &i2c_proc_real, + &i2c_sysctl_real,NULL,&foo_func }, + { FOO_SYSCTL_FUNC2, "func2", NULL, 0, 0644, NULL, &i2c_proc_real, + &i2c_sysctl_real,NULL,&foo_func }, + { FOO_SYSCTL_DATA, "data", NULL, 0, 0644, NULL, &i2c_proc_real, + &i2c_sysctl_real,NULL,&foo_data }, + { 0 } + }; + +In the above example, three entries are defined. They can either be +accessed through the /proc interface, in the /proc/sys/dev/sensors/* +directories, as files named func1, func2 and data, or alternatively +through the sysctl interface, in the appropriate table, with identifiers +FOO_SYSCTL_FUNC1, FOO_SYSCTL_FUNC2 and FOO_SYSCTL_DATA. + +The third, sixth and ninth parameters should always be NULL, and the +fourth should always be 0. The fifth is the mode of the /proc file; +0644 is safe, as the file will be owned by root:root. + +The seventh and eighth parameters should be &i2c_proc_real and +&i2c_sysctl_real if you want to export lists of reals (scaled +integers). You can also use your own function for them, as usual. +Finally, the last parameter is the call-back to gather the data +(see below) if you use the *_proc_real functions. + + +Gathering the data +------------------ + +The call back functions (foo_func and foo_data in the above example) +can be called in several ways; the operation parameter determines +what should be done: + + * If operation == SENSORS_PROC_REAL_INFO, you must return the + magnitude (scaling) in nrels_mag; + * If operation == SENSORS_PROC_REAL_READ, you must read information + from the chip and return it in results. The number of integers + to display should be put in nrels_mag; + * If operation == SENSORS_PROC_REAL_WRITE, you must write the + supplied information to the chip. nrels_mag will contain the number + of integers, results the integers themselves. + +The *_proc_real functions will display the elements as reals for the +/proc interface. If you set the magnitude to 2, and supply 345 for +SENSORS_PROC_REAL_READ, it would display 3.45; and if the user would +write 45.6 to the /proc file, it would be returned as 4560 for +SENSORS_PROC_REAL_WRITE. A magnitude may even be negative! + +An example function: + + /* FOO_FROM_REG and FOO_TO_REG translate between scaled values and + register values. Note the use of the read cache. */ + void foo_in(struct i2c_client *client, int operation, int ctl_name, + int *nrels_mag, long *results) + { + struct foo_data *data = client->data; + int nr = ctl_name - FOO_SYSCTL_FUNC1; /* reduce to 0 upwards */ + + if (operation == SENSORS_PROC_REAL_INFO) + *nrels_mag = 2; + else if (operation == SENSORS_PROC_REAL_READ) { + /* Update the readings cache (if necessary) */ + foo_update_client(client); + /* Get the readings from the cache */ + results[0] = FOO_FROM_REG(data->foo_func_base[nr]); + results[1] = FOO_FROM_REG(data->foo_func_more[nr]); + results[2] = FOO_FROM_REG(data->foo_func_readonly[nr]); + *nrels_mag = 2; + } else if (operation == SENSORS_PROC_REAL_WRITE) { + if (*nrels_mag >= 1) { + /* Update the cache */ + data->foo_base[nr] = FOO_TO_REG(results[0]); + /* Update the chip */ + foo_write_value(client,FOO_REG_FUNC_BASE(nr),data->foo_base[nr]); + } + if (*nrels_mag >= 2) { + /* Update the cache */ + data->foo_more[nr] = FOO_TO_REG(results[1]); + /* Update the chip */ + foo_write_value(client,FOO_REG_FUNC_MORE(nr),data->foo_more[nr]); + } + } + } diff --git a/Documentation/i2o/README b/Documentation/i2o/README new file mode 100644 index 000000000000..9aa6ddb446eb --- /dev/null +++ b/Documentation/i2o/README @@ -0,0 +1,63 @@ + + Linux I2O Support (c) Copyright 1999 Red Hat Software + and others. + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License + as published by the Free Software Foundation; either version + 2 of the License, or (at your option) any later version. + +AUTHORS (so far) + +Alan Cox, Building Number Three Ltd. + Core code, SCSI and Block OSMs + +Steve Ralston, LSI Logic Corp. + Debugging SCSI and Block OSM + +Deepak Saxena, Intel Corp. + Various core/block extensions + /proc interface, bug fixes + Ioctl interfaces for control + Debugging LAN OSM + +Philip Rumpf + Fixed assorted dumb SMP locking bugs + +Juha Sievanen, University of Helsinki Finland + LAN OSM code + /proc interface to LAN class + Bug fixes + Core code extensions + +Auvo Häkkinen, University of Helsinki Finland + LAN OSM code + /Proc interface to LAN class + Bug fixes + Core code extensions + +Taneli Vähäkangas, University of Helsinki Finland + Fixes to i2o_config + +CREDITS + + This work was made possible by + +Red Hat Software + Funding for the Building #3 part of the project + +Symbios Logic (Now LSI) + Host adapters, hints, known to work platforms when I hit + compatibility problems + +BoxHill Corporation + Loan of initial FibreChannel disk array used for development work. + +European Comission + Funding the work done by the University of Helsinki + +SysKonnect + Loan of FDDI and Gigabit Ethernet cards + +ASUSTeK + Loan of I2O motherboard diff --git a/Documentation/i2o/ioctl b/Documentation/i2o/ioctl new file mode 100644 index 000000000000..3e174978997d --- /dev/null +++ b/Documentation/i2o/ioctl @@ -0,0 +1,394 @@ + +Linux I2O User Space Interface +rev 0.3 - 04/20/99 + +============================================================================= +Originally written by Deepak Saxena(deepak@plexity.net) +Currently maintained by Deepak Saxena(deepak@plexity.net) +============================================================================= + +I. Introduction + +The Linux I2O subsystem provides a set of ioctl() commands that can be +utilized by user space applications to communicate with IOPs and devices +on individual IOPs. This document defines the specific ioctl() commands +that are available to the user and provides examples of their uses. + +This document assumes the reader is familiar with or has access to the +I2O specification as no I2O message parameters are outlined. For information +on the specification, see http://www.i2osig.org + +This document and the I2O user space interface are currently maintained +by Deepak Saxena. Please send all comments, errata, and bug fixes to +deepak@csociety.purdue.edu + +II. IOP Access + +Access to the I2O subsystem is provided through the device file named +/dev/i2o/ctl. This file is a character file with major number 10 and minor +number 166. It can be created through the following command: + + mknod /dev/i2o/ctl c 10 166 + +III. Determining the IOP Count + + SYNOPSIS + + ioctl(fd, I2OGETIOPS, int *count); + + u8 count[MAX_I2O_CONTROLLERS]; + + DESCRIPTION + + This function returns the system's active IOP table. count should + point to a buffer containing MAX_I2O_CONTROLLERS entries. Upon + returning, each entry will contain a non-zero value if the given + IOP unit is active, and NULL if it is inactive or non-existent. + + RETURN VALUE. + + Returns 0 if no errors occur, and -1 otherwise. If an error occurs, + errno is set appropriately: + + EFAULT Invalid user space pointer was passed + +IV. Getting Hardware Resource Table + + SYNOPSIS + + ioctl(fd, I2OHRTGET, struct i2o_cmd_hrt *hrt); + + struct i2o_cmd_hrtlct + { + u32 iop; /* IOP unit number */ + void *resbuf; /* Buffer for result */ + u32 *reslen; /* Buffer length in bytes */ + }; + + DESCRIPTION + + This function returns the Hardware Resource Table of the IOP specified + by hrt->iop in the buffer pointed to by hrt->resbuf. The actual size of + the data is written into *(hrt->reslen). + + RETURNS + + This function returns 0 if no errors occur. If an error occurs, -1 + is returned and errno is set appropriately: + + EFAULT Invalid user space pointer was passed + ENXIO Invalid IOP number + ENOBUFS Buffer not large enough. If this occurs, the required + buffer length is written into *(hrt->reslen) + +V. Getting Logical Configuration Table + + SYNOPSIS + + ioctl(fd, I2OLCTGET, struct i2o_cmd_lct *lct); + + struct i2o_cmd_hrtlct + { + u32 iop; /* IOP unit number */ + void *resbuf; /* Buffer for result */ + u32 *reslen; /* Buffer length in bytes */ + }; + + DESCRIPTION + + This function returns the Logical Configuration Table of the IOP specified + by lct->iop in the buffer pointed to by lct->resbuf. The actual size of + the data is written into *(lct->reslen). + + RETURNS + + This function returns 0 if no errors occur. If an error occurs, -1 + is returned and errno is set appropriately: + + EFAULT Invalid user space pointer was passed + ENXIO Invalid IOP number + ENOBUFS Buffer not large enough. If this occurs, the required + buffer length is written into *(lct->reslen) + +VI. Settting Parameters + + SYNOPSIS + + ioctl(fd, I2OPARMSET, struct i2o_parm_setget *ops); + + struct i2o_cmd_psetget + { + u32 iop; /* IOP unit number */ + u32 tid; /* Target device TID */ + void *opbuf; /* Operation List buffer */ + u32 oplen; /* Operation List buffer length in bytes */ + void *resbuf; /* Result List buffer */ + u32 *reslen; /* Result List buffer length in bytes */ + }; + + DESCRIPTION + + This function posts a UtilParamsSet message to the device identified + by ops->iop and ops->tid. The operation list for the message is + sent through the ops->opbuf buffer, and the result list is written + into the buffer pointed to by ops->resbuf. The number of bytes + written is placed into *(ops->reslen). + + RETURNS + + The return value is the size in bytes of the data written into + ops->resbuf if no errors occur. If an error occurs, -1 is returned + and errno is set appropriatly: + + EFAULT Invalid user space pointer was passed + ENXIO Invalid IOP number + ENOBUFS Buffer not large enough. If this occurs, the required + buffer length is written into *(ops->reslen) + ETIMEDOUT Timeout waiting for reply message + ENOMEM Kernel memory allocation error + + A return value of 0 does not mean that the value was actually + changed properly on the IOP. The user should check the result + list to determine the specific status of the transaction. + +VII. Getting Parameters + + SYNOPSIS + + ioctl(fd, I2OPARMGET, struct i2o_parm_setget *ops); + + struct i2o_parm_setget + { + u32 iop; /* IOP unit number */ + u32 tid; /* Target device TID */ + void *opbuf; /* Operation List buffer */ + u32 oplen; /* Operation List buffer length in bytes */ + void *resbuf; /* Result List buffer */ + u32 *reslen; /* Result List buffer length in bytes */ + }; + + DESCRIPTION + + This function posts a UtilParamsGet message to the device identified + by ops->iop and ops->tid. The operation list for the message is + sent through the ops->opbuf buffer, and the result list is written + into the buffer pointed to by ops->resbuf. The actual size of data + written is placed into *(ops->reslen). + + RETURNS + + EFAULT Invalid user space pointer was passed + ENXIO Invalid IOP number + ENOBUFS Buffer not large enough. If this occurs, the required + buffer length is written into *(ops->reslen) + ETIMEDOUT Timeout waiting for reply message + ENOMEM Kernel memory allocation error + + A return value of 0 does not mean that the value was actually + properly retreived. The user should check the result list + to determine the specific status of the transaction. + +VIII. Downloading Software + + SYNOPSIS + + ioctl(fd, I2OSWDL, struct i2o_sw_xfer *sw); + + struct i2o_sw_xfer + { + u32 iop; /* IOP unit number */ + u8 flags; /* DownloadFlags field */ + u8 sw_type; /* Software type */ + u32 sw_id; /* Software ID */ + void *buf; /* Pointer to software buffer */ + u32 *swlen; /* Length of software buffer */ + u32 *maxfrag; /* Number of fragments */ + u32 *curfrag; /* Current fragment number */ + }; + + DESCRIPTION + + This function downloads a software fragment pointed by sw->buf + to the iop identified by sw->iop. The DownloadFlags, SwID, SwType + and SwSize fields of the ExecSwDownload message are filled in with + the values of sw->flags, sw->sw_id, sw->sw_type and *(sw->swlen). + + The fragments _must_ be sent in order and be 8K in size. The last + fragment _may_ be shorter, however. The kernel will compute its + size based on information in the sw->swlen field. + + Please note that SW transfers can take a long time. + + RETURNS + + This function returns 0 no errors occur. If an error occurs, -1 + is returned and errno is set appropriatly: + + EFAULT Invalid user space pointer was passed + ENXIO Invalid IOP number + ETIMEDOUT Timeout waiting for reply message + ENOMEM Kernel memory allocation error + +IX. Uploading Software + + SYNOPSIS + + ioctl(fd, I2OSWUL, struct i2o_sw_xfer *sw); + + struct i2o_sw_xfer + { + u32 iop; /* IOP unit number */ + u8 flags; /* UploadFlags */ + u8 sw_type; /* Software type */ + u32 sw_id; /* Software ID */ + void *buf; /* Pointer to software buffer */ + u32 *swlen; /* Length of software buffer */ + u32 *maxfrag; /* Number of fragments */ + u32 *curfrag; /* Current fragment number */ + }; + + DESCRIPTION + + This function uploads a software fragment from the IOP identified + by sw->iop, sw->sw_type, sw->sw_id and optionally sw->swlen fields. + The UploadFlags, SwID, SwType and SwSize fields of the ExecSwUpload + message are filled in with the values of sw->flags, sw->sw_id, + sw->sw_type and *(sw->swlen). + + The fragments _must_ be requested in order and be 8K in size. The + user is responsible for allocating memory pointed by sw->buf. The + last fragment _may_ be shorter. + + Please note that SW transfers can take a long time. + + RETURNS + + This function returns 0 if no errors occur. If an error occurs, -1 + is returned and errno is set appropriatly: + + EFAULT Invalid user space pointer was passed + ENXIO Invalid IOP number + ETIMEDOUT Timeout waiting for reply message + ENOMEM Kernel memory allocation error + +X. Removing Software + + SYNOPSIS + + ioctl(fd, I2OSWDEL, struct i2o_sw_xfer *sw); + + struct i2o_sw_xfer + { + u32 iop; /* IOP unit number */ + u8 flags; /* RemoveFlags */ + u8 sw_type; /* Software type */ + u32 sw_id; /* Software ID */ + void *buf; /* Unused */ + u32 *swlen; /* Length of the software data */ + u32 *maxfrag; /* Unused */ + u32 *curfrag; /* Unused */ + }; + + DESCRIPTION + + This function removes software from the IOP identified by sw->iop. + The RemoveFlags, SwID, SwType and SwSize fields of the ExecSwRemove message + are filled in with the values of sw->flags, sw->sw_id, sw->sw_type and + *(sw->swlen). Give zero in *(sw->len) if the value is unknown. IOP uses + *(sw->swlen) value to verify correct identication of the module to remove. + The actual size of the module is written into *(sw->swlen). + + RETURNS + + This function returns 0 if no errors occur. If an error occurs, -1 + is returned and errno is set appropriatly: + + EFAULT Invalid user space pointer was passed + ENXIO Invalid IOP number + ETIMEDOUT Timeout waiting for reply message + ENOMEM Kernel memory allocation error + +X. Validating Configuration + + SYNOPSIS + + ioctl(fd, I2OVALIDATE, int *iop); + u32 iop; + + DESCRIPTION + + This function posts an ExecConfigValidate message to the controller + identified by iop. This message indicates that the current + configuration is accepted. The iop changes the status of suspect drivers + to valid and may delete old drivers from its store. + + RETURNS + + This function returns 0 if no erro occur. If an error occurs, -1 is + returned and errno is set appropriatly: + + ETIMEDOUT Timeout waiting for reply message + ENXIO Invalid IOP number + +XI. Configuration Dialog + + SYNOPSIS + + ioctl(fd, I2OHTML, struct i2o_html *htquery); + struct i2o_html + { + u32 iop; /* IOP unit number */ + u32 tid; /* Target device ID */ + u32 page; /* HTML page */ + void *resbuf; /* Buffer for reply HTML page */ + u32 *reslen; /* Length in bytes of reply buffer */ + void *qbuf; /* Pointer to HTTP query string */ + u32 qlen; /* Length in bytes of query string buffer */ + }; + + DESCRIPTION + + This function posts an UtilConfigDialog message to the device identified + by htquery->iop and htquery->tid. The requested HTML page number is + provided by the htquery->page field, and the resultant data is stored + in the buffer pointed to by htquery->resbuf. If there is an HTTP query + string that is to be sent to the device, it should be sent in the buffer + pointed to by htquery->qbuf. If there is no query string, this field + should be set to NULL. The actual size of the reply received is written + into *(htquery->reslen). + + RETURNS + + This function returns 0 if no error occur. If an error occurs, -1 + is returned and errno is set appropriatly: + + EFAULT Invalid user space pointer was passed + ENXIO Invalid IOP number + ENOBUFS Buffer not large enough. If this occurs, the required + buffer length is written into *(ops->reslen) + ETIMEDOUT Timeout waiting for reply message + ENOMEM Kernel memory allocation error + +XII. Events + + In the process of determining this. Current idea is to have use + the select() interface to allow user apps to periodically poll + the /dev/i2o/ctl device for events. When select() notifies the user + that an event is available, the user would call read() to retrieve + a list of all the events that are pending for the specific device. + +============================================================================= +Revision History +============================================================================= + +Rev 0.1 - 04/01/99 +- Initial revision + +Rev 0.2 - 04/06/99 +- Changed return values to match UNIX ioctl() standard. Only return values + are 0 and -1. All errors are reported through errno. +- Added summary of proposed possible event interfaces + +Rev 0.3 - 04/20/99 +- Changed all ioctls() to use pointers to user data instead of actual data +- Updated error values to match the code diff --git a/Documentation/i386/IO-APIC.txt b/Documentation/i386/IO-APIC.txt new file mode 100644 index 000000000000..435e69e6e9aa --- /dev/null +++ b/Documentation/i386/IO-APIC.txt @@ -0,0 +1,117 @@ +Most (all) Intel-MP compliant SMP boards have the so-called 'IO-APIC', +which is an enhanced interrupt controller, it enables us to route +hardware interrupts to multiple CPUs, or to CPU groups. + +Linux supports all variants of compliant SMP boards, including ones with +multiple IO-APICs. (multiple IO-APICs are used in high-end servers to +distribute IRQ load further). + +There are (a few) known breakages in certain older boards, which bugs are +usually worked around by the kernel. If your MP-compliant SMP board does +not boot Linux, then consult the linux-smp mailing list archives first. + +If your box boots fine with enabled IO-APIC IRQs, then your +/proc/interrupts will look like this one: + + ----------------------------> + hell:~> cat /proc/interrupts + CPU0 + 0: 1360293 IO-APIC-edge timer + 1: 4 IO-APIC-edge keyboard + 2: 0 XT-PIC cascade + 13: 1 XT-PIC fpu + 14: 1448 IO-APIC-edge ide0 + 16: 28232 IO-APIC-level Intel EtherExpress Pro 10/100 Ethernet + 17: 51304 IO-APIC-level eth0 + NMI: 0 + ERR: 0 + hell:~> + <---------------------------- + +some interrupts are still listed as 'XT PIC', but this is not a problem, +none of those IRQ sources is performance-critical. + + +in the unlikely case that your board does not create a working mp-table, +you can use the pirq= boot parameter to 'hand-construct' IRQ entries. This +is nontrivial though and cannot be automated. One sample /etc/lilo.conf +entry: + + append="pirq=15,11,10" + +the actual numbers depend on your system, on your PCI cards and on their +PCI slot position. Usually PCI slots are 'daisy chained' before they are +connected to the PCI chipset IRQ routing facility (the incoming PIRQ1-4 +lines): + + ,-. ,-. ,-. ,-. ,-. + PIRQ4 ----| |-. ,-| |-. ,-| |-. ,-| |--------| | + |S| \ / |S| \ / |S| \ / |S| |S| + PIRQ3 ----|l|-. `/---|l|-. `/---|l|-. `/---|l|--------|l| + |o| \/ |o| \/ |o| \/ |o| |o| + PIRQ2 ----|t|-./`----|t|-./`----|t|-./`----|t|--------|t| + |1| /\ |2| /\ |3| /\ |4| |5| + PIRQ1 ----| |- `----| |- `----| |- `----| |--------| | + `-' `-' `-' `-' `-' + +every PCI card emits a PCI IRQ, which can be INTA,INTB,INTC,INTD: + + ,-. + INTD--| | + |S| + INTC--|l| + |o| + INTB--|t| + |x| + INTA--| | + `-' + +These INTA-D PCI IRQs are always 'local to the card', their real meaning +depends on which slot they are in. If you look at the daisy chaining diagram, +a card in slot4, issuing INTA IRQ, it will end up as a signal on PIRQ2 of +the PCI chipset. Most cards issue INTA, this creates optimal distribution +between the PIRQ lines. (distributing IRQ sources properly is not a +necessity, PCI IRQs can be shared at will, but it's a good for performance +to have non shared interrupts). Slot5 should be used for videocards, they +do not use interrupts normally, thus they are not daisy chained either. + +so if you have your SCSI card (IRQ11) in Slot1, Tulip card (IRQ9) in +Slot2, then you'll have to specify this pirq= line: + + append="pirq=11,9" + +the following script tries to figure out such a default pirq= line from +your PCI configuration: + + echo -n pirq=; echo `scanpci | grep T_L | cut -c56-` | sed 's/ /,/g' + +note that this script wont work if you have skipped a few slots or if your +board does not do default daisy-chaining. (or the IO-APIC has the PIRQ pins +connected in some strange way). E.g. if in the above case you have your SCSI +card (IRQ11) in Slot3, and have Slot1 empty: + + append="pirq=0,9,11" + +[value '0' is a generic 'placeholder', reserved for empty (or non-IRQ emitting) +slots.] + +generally, it's always possible to find out the correct pirq= settings, just +permute all IRQ numbers properly ... it will take some time though. An +'incorrect' pirq line will cause the booting process to hang, or a device +won't function properly (if it's inserted as eg. a module). + +If you have 2 PCI buses, then you can use up to 8 pirq values. Although such +boards tend to have a good configuration. + +Be prepared that it might happen that you need some strange pirq line: + + append="pirq=0,0,0,0,0,0,9,11" + +use smart try-and-err techniques to find out the correct pirq line ... + +good luck and mail to linux-smp@vger.kernel.org or +linux-kernel@vger.kernel.org if you have any problems that are not covered +by this document. + +-- mingo + diff --git a/Documentation/i386/boot.txt b/Documentation/i386/boot.txt new file mode 100644 index 000000000000..1c48f0eba6fb --- /dev/null +++ b/Documentation/i386/boot.txt @@ -0,0 +1,441 @@ + THE LINUX/I386 BOOT PROTOCOL + ---------------------------- + + H. Peter Anvin <hpa@zytor.com> + Last update 2002-01-01 + +On the i386 platform, the Linux kernel uses a rather complicated boot +convention. This has evolved partially due to historical aspects, as +well as the desire in the early days to have the kernel itself be a +bootable image, the complicated PC memory model and due to changed +expectations in the PC industry caused by the effective demise of +real-mode DOS as a mainstream operating system. + +Currently, four versions of the Linux/i386 boot protocol exist. + +Old kernels: zImage/Image support only. Some very early kernels + may not even support a command line. + +Protocol 2.00: (Kernel 1.3.73) Added bzImage and initrd support, as + well as a formalized way to communicate between the + boot loader and the kernel. setup.S made relocatable, + although the traditional setup area still assumed + writable. + +Protocol 2.01: (Kernel 1.3.76) Added a heap overrun warning. + +Protocol 2.02: (Kernel 2.4.0-test3-pre3) New command line protocol. + Lower the conventional memory ceiling. No overwrite + of the traditional setup area, thus making booting + safe for systems which use the EBDA from SMM or 32-bit + BIOS entry points. zImage deprecated but still + supported. + +Protocol 2.03: (Kernel 2.4.18-pre1) Explicitly makes the highest possible + initrd address available to the bootloader. + + +**** MEMORY LAYOUT + +The traditional memory map for the kernel loader, used for Image or +zImage kernels, typically looks like: + + | | +0A0000 +------------------------+ + | Reserved for BIOS | Do not use. Reserved for BIOS EBDA. +09A000 +------------------------+ + | Stack/heap/cmdline | For use by the kernel real-mode code. +098000 +------------------------+ + | Kernel setup | The kernel real-mode code. +090200 +------------------------+ + | Kernel boot sector | The kernel legacy boot sector. +090000 +------------------------+ + | Protected-mode kernel | The bulk of the kernel image. +010000 +------------------------+ + | Boot loader | <- Boot sector entry point 0000:7C00 +001000 +------------------------+ + | Reserved for MBR/BIOS | +000800 +------------------------+ + | Typically used by MBR | +000600 +------------------------+ + | BIOS use only | +000000 +------------------------+ + + +When using bzImage, the protected-mode kernel was relocated to +0x100000 ("high memory"), and the kernel real-mode block (boot sector, +setup, and stack/heap) was made relocatable to any address between +0x10000 and end of low memory. Unfortunately, in protocols 2.00 and +2.01 the command line is still required to live in the 0x9XXXX memory +range, and that memory range is still overwritten by the early kernel. +The 2.02 protocol resolves that problem. + +It is desirable to keep the "memory ceiling" -- the highest point in +low memory touched by the boot loader -- as low as possible, since +some newer BIOSes have begun to allocate some rather large amounts of +memory, called the Extended BIOS Data Area, near the top of low +memory. The boot loader should use the "INT 12h" BIOS call to verify +how much low memory is available. + +Unfortunately, if INT 12h reports that the amount of memory is too +low, there is usually nothing the boot loader can do but to report an +error to the user. The boot loader should therefore be designed to +take up as little space in low memory as it reasonably can. For +zImage or old bzImage kernels, which need data written into the +0x90000 segment, the boot loader should make sure not to use memory +above the 0x9A000 point; too many BIOSes will break above that point. + + +**** THE REAL-MODE KERNEL HEADER + +In the following text, and anywhere in the kernel boot sequence, "a +sector" refers to 512 bytes. It is independent of the actual sector +size of the underlying medium. + +The first step in loading a Linux kernel should be to load the +real-mode code (boot sector and setup code) and then examine the +following header at offset 0x01f1. The real-mode code can total up to +32K, although the boot loader may choose to load only the first two +sectors (1K) and then examine the bootup sector size. + +The header looks like: + +Offset Proto Name Meaning +/Size + +01F1/1 ALL setup_sects The size of the setup in sectors +01F2/2 ALL root_flags If set, the root is mounted readonly +01F4/2 ALL syssize DO NOT USE - for bootsect.S use only +01F6/2 ALL swap_dev DO NOT USE - obsolete +01F8/2 ALL ram_size DO NOT USE - for bootsect.S use only +01FA/2 ALL vid_mode Video mode control +01FC/2 ALL root_dev Default root device number +01FE/2 ALL boot_flag 0xAA55 magic number +0200/2 2.00+ jump Jump instruction +0202/4 2.00+ header Magic signature "HdrS" +0206/2 2.00+ version Boot protocol version supported +0208/4 2.00+ realmode_swtch Boot loader hook (see below) +020C/2 2.00+ start_sys The load-low segment (0x1000) (obsolete) +020E/2 2.00+ kernel_version Pointer to kernel version string +0210/1 2.00+ type_of_loader Boot loader identifier +0211/1 2.00+ loadflags Boot protocol option flags +0212/2 2.00+ setup_move_size Move to high memory size (used with hooks) +0214/4 2.00+ code32_start Boot loader hook (see below) +0218/4 2.00+ ramdisk_image initrd load address (set by boot loader) +021C/4 2.00+ ramdisk_size initrd size (set by boot loader) +0220/4 2.00+ bootsect_kludge DO NOT USE - for bootsect.S use only +0224/2 2.01+ heap_end_ptr Free memory after setup end +0226/2 N/A pad1 Unused +0228/4 2.02+ cmd_line_ptr 32-bit pointer to the kernel command line +022C/4 2.03+ initrd_addr_max Highest legal initrd address + +For backwards compatibility, if the setup_sects field contains 0, the +real value is 4. + +If the "HdrS" (0x53726448) magic number is not found at offset 0x202, +the boot protocol version is "old". Loading an old kernel, the +following parameters should be assumed: + + Image type = zImage + initrd not supported + Real-mode kernel must be located at 0x90000. + +Otherwise, the "version" field contains the protocol version, +e.g. protocol version 2.01 will contain 0x0201 in this field. When +setting fields in the header, you must make sure only to set fields +supported by the protocol version in use. + +The "kernel_version" field, if set to a nonzero value, contains a +pointer to a null-terminated human-readable kernel version number +string, less 0x200. This can be used to display the kernel version to +the user. This value should be less than (0x200*setup_sects). For +example, if this value is set to 0x1c00, the kernel version number +string can be found at offset 0x1e00 in the kernel file. This is a +valid value if and only if the "setup_sects" field contains the value +14 or higher. + +Most boot loaders will simply load the kernel at its target address +directly. Such boot loaders do not need to worry about filling in +most of the fields in the header. The following fields should be +filled out, however: + + vid_mode: + Please see the section on SPECIAL COMMAND LINE OPTIONS. + + type_of_loader: + If your boot loader has an assigned id (see table below), enter + 0xTV here, where T is an identifier for the boot loader and V is + a version number. Otherwise, enter 0xFF here. + + Assigned boot loader ids: + 0 LILO + 1 Loadlin + 2 bootsect-loader + 3 SYSLINUX + 4 EtherBoot + 5 ELILO + 7 GRuB + 8 U-BOOT + + Please contact <hpa@zytor.com> if you need a bootloader ID + value assigned. + + loadflags, heap_end_ptr: + If the protocol version is 2.01 or higher, enter the + offset limit of the setup heap into heap_end_ptr and set the + 0x80 bit (CAN_USE_HEAP) of loadflags. heap_end_ptr appears to + be relative to the start of setup (offset 0x0200). + + setup_move_size: + When using protocol 2.00 or 2.01, if the real mode + kernel is not loaded at 0x90000, it gets moved there later in + the loading sequence. Fill in this field if you want + additional data (such as the kernel command line) moved in + addition to the real-mode kernel itself. + + ramdisk_image, ramdisk_size: + If your boot loader has loaded an initial ramdisk (initrd), + set ramdisk_image to the 32-bit pointer to the ramdisk data + and the ramdisk_size to the size of the ramdisk data. + + The initrd should typically be located as high in memory as + possible, as it may otherwise get overwritten by the early + kernel initialization sequence. However, it must never be + located above the address specified in the initrd_addr_max + field. The initrd should be at least 4K page aligned. + + cmd_line_ptr: + If the protocol version is 2.02 or higher, this is a 32-bit + pointer to the kernel command line. The kernel command line + can be located anywhere between the end of setup and 0xA0000. + Fill in this field even if your boot loader does not support a + command line, in which case you can point this to an empty + string (or better yet, to the string "auto".) If this field + is left at zero, the kernel will assume that your boot loader + does not support the 2.02+ protocol. + + ramdisk_max: + The maximum address that may be occupied by the initrd + contents. For boot protocols 2.02 or earlier, this field is + not present, and the maximum address is 0x37FFFFFF. (This + address is defined as the address of the highest safe byte, so + if your ramdisk is exactly 131072 bytes long and this field is + 0x37FFFFFF, you can start your ramdisk at 0x37FE0000.) + + +**** THE KERNEL COMMAND LINE + +The kernel command line has become an important way for the boot +loader to communicate with the kernel. Some of its options are also +relevant to the boot loader itself, see "special command line options" +below. + +The kernel command line is a null-terminated string up to 255 +characters long, plus the final null. + +If the boot protocol version is 2.02 or later, the address of the +kernel command line is given by the header field cmd_line_ptr (see +above.) + +If the protocol version is *not* 2.02 or higher, the kernel +command line is entered using the following protocol: + + At offset 0x0020 (word), "cmd_line_magic", enter the magic + number 0xA33F. + + At offset 0x0022 (word), "cmd_line_offset", enter the offset + of the kernel command line (relative to the start of the + real-mode kernel). + + The kernel command line *must* be within the memory region + covered by setup_move_size, so you may need to adjust this + field. + + +**** SAMPLE BOOT CONFIGURATION + +As a sample configuration, assume the following layout of the real +mode segment: + + 0x0000-0x7FFF Real mode kernel + 0x8000-0x8FFF Stack and heap + 0x9000-0x90FF Kernel command line + +Such a boot loader should enter the following fields in the header: + + unsigned long base_ptr; /* base address for real-mode segment */ + + if ( setup_sects == 0 ) { + setup_sects = 4; + } + + if ( protocol >= 0x0200 ) { + type_of_loader = <type code>; + if ( loading_initrd ) { + ramdisk_image = <initrd_address>; + ramdisk_size = <initrd_size>; + } + if ( protocol >= 0x0201 ) { + heap_end_ptr = 0x9000 - 0x200; + loadflags |= 0x80; /* CAN_USE_HEAP */ + } + if ( protocol >= 0x0202 ) { + cmd_line_ptr = base_ptr + 0x9000; + } else { + cmd_line_magic = 0xA33F; + cmd_line_offset = 0x9000; + setup_move_size = 0x9100; + } + } else { + /* Very old kernel */ + + cmd_line_magic = 0xA33F; + cmd_line_offset = 0x9000; + + /* A very old kernel MUST have its real-mode code + loaded at 0x90000 */ + + if ( base_ptr != 0x90000 ) { + /* Copy the real-mode kernel */ + memcpy(0x90000, base_ptr, (setup_sects+1)*512); + /* Copy the command line */ + memcpy(0x99000, base_ptr+0x9000, 256); + + base_ptr = 0x90000; /* Relocated */ + } + + /* It is recommended to clear memory up to the 32K mark */ + memset(0x90000 + (setup_sects+1)*512, 0, + (64-(setup_sects+1))*512); + } + + +**** LOADING THE REST OF THE KERNEL + +The non-real-mode kernel starts at offset (setup_sects+1)*512 in the +kernel file (again, if setup_sects == 0 the real value is 4.) It +should be loaded at address 0x10000 for Image/zImage kernels and +0x100000 for bzImage kernels. + +The kernel is a bzImage kernel if the protocol >= 2.00 and the 0x01 +bit (LOAD_HIGH) in the loadflags field is set: + + is_bzImage = (protocol >= 0x0200) && (loadflags & 0x01); + load_address = is_bzImage ? 0x100000 : 0x10000; + +Note that Image/zImage kernels can be up to 512K in size, and thus use +the entire 0x10000-0x90000 range of memory. This means it is pretty +much a requirement for these kernels to load the real-mode part at +0x90000. bzImage kernels allow much more flexibility. + + +**** SPECIAL COMMAND LINE OPTIONS + +If the command line provided by the boot loader is entered by the +user, the user may expect the following command line options to work. +They should normally not be deleted from the kernel command line even +though not all of them are actually meaningful to the kernel. Boot +loader authors who need additional command line options for the boot +loader itself should get them registered in +Documentation/kernel-parameters.txt to make sure they will not +conflict with actual kernel options now or in the future. + + vga=<mode> + <mode> here is either an integer (in C notation, either + decimal, octal, or hexadecimal) or one of the strings + "normal" (meaning 0xFFFF), "ext" (meaning 0xFFFE) or "ask" + (meaning 0xFFFD). This value should be entered into the + vid_mode field, as it is used by the kernel before the command + line is parsed. + + mem=<size> + <size> is an integer in C notation optionally followed by K, M + or G (meaning << 10, << 20 or << 30). This specifies the end + of memory to the kernel. This affects the possible placement + of an initrd, since an initrd should be placed near end of + memory. Note that this is an option to *both* the kernel and + the bootloader! + + initrd=<file> + An initrd should be loaded. The meaning of <file> is + obviously bootloader-dependent, and some boot loaders + (e.g. LILO) do not have such a command. + +In addition, some boot loaders add the following options to the +user-specified command line: + + BOOT_IMAGE=<file> + The boot image which was loaded. Again, the meaning of <file> + is obviously bootloader-dependent. + + auto + The kernel was booted without explicit user intervention. + +If these options are added by the boot loader, it is highly +recommended that they are located *first*, before the user-specified +or configuration-specified command line. Otherwise, "init=/bin/sh" +gets confused by the "auto" option. + + +**** RUNNING THE KERNEL + +The kernel is started by jumping to the kernel entry point, which is +located at *segment* offset 0x20 from the start of the real mode +kernel. This means that if you loaded your real-mode kernel code at +0x90000, the kernel entry point is 9020:0000. + +At entry, ds = es = ss should point to the start of the real-mode +kernel code (0x9000 if the code is loaded at 0x90000), sp should be +set up properly, normally pointing to the top of the heap, and +interrupts should be disabled. Furthermore, to guard against bugs in +the kernel, it is recommended that the boot loader sets fs = gs = ds = +es = ss. + +In our example from above, we would do: + + /* Note: in the case of the "old" kernel protocol, base_ptr must + be == 0x90000 at this point; see the previous sample code */ + + seg = base_ptr >> 4; + + cli(); /* Enter with interrupts disabled! */ + + /* Set up the real-mode kernel stack */ + _SS = seg; + _SP = 0x9000; /* Load SP immediately after loading SS! */ + + _DS = _ES = _FS = _GS = seg; + jmp_far(seg+0x20, 0); /* Run the kernel */ + +If your boot sector accesses a floppy drive, it is recommended to +switch off the floppy motor before running the kernel, since the +kernel boot leaves interrupts off and thus the motor will not be +switched off, especially if the loaded kernel has the floppy driver as +a demand-loaded module! + + +**** ADVANCED BOOT TIME HOOKS + +If the boot loader runs in a particularly hostile environment (such as +LOADLIN, which runs under DOS) it may be impossible to follow the +standard memory location requirements. Such a boot loader may use the +following hooks that, if set, are invoked by the kernel at the +appropriate time. The use of these hooks should probably be +considered an absolutely last resort! + +IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and +%edi across invocation. + + realmode_swtch: + A 16-bit real mode far subroutine invoked immediately before + entering protected mode. The default routine disables NMI, so + your routine should probably do so, too. + + code32_start: + A 32-bit flat-mode routine *jumped* to immediately after the + transition to protected mode, but before the kernel is + uncompressed. No segments, except CS, are set up; you should + set them up to KERNEL_DS (0x18) yourself. + + After completing your hook, you should jump to the address + that was in this field before your boot loader overwrote it. diff --git a/Documentation/i386/usb-legacy-support.txt b/Documentation/i386/usb-legacy-support.txt new file mode 100644 index 000000000000..1894cdfc69d9 --- /dev/null +++ b/Documentation/i386/usb-legacy-support.txt @@ -0,0 +1,44 @@ +USB Legacy support +~~~~~~~~~~~~~~~~~~ + +Vojtech Pavlik <vojtech@suse.cz>, January 2004 + + +Also known as "USB Keyboard" or "USB Mouse support" in the BIOS Setup is a +feature that allows one to use the USB mouse and keyboard as if they were +their classic PS/2 counterparts. This means one can use an USB keyboard to +type in LILO for example. + +It has several drawbacks, though: + +1) On some machines, the emulated PS/2 mouse takes over even when no USB + mouse is present and a real PS/2 mouse is present. In that case the extra + features (wheel, extra buttons, touchpad mode) of the real PS/2 mouse may + not be available. + +2) If CONFIG_HIGHMEM64G is enabled, the PS/2 mouse emulation can cause + system crashes, because the SMM BIOS is not expecting to be in PAE mode. + The Intel E7505 is a typical machine where this happens. + +3) If AMD64 64-bit mode is enabled, again system crashes often happen, + because the SMM BIOS isn't expecting the CPU to be in 64-bit mode. The + BIOS manufacturers only test with Windows, and Windows doesn't do 64-bit + yet. + +Solutions: + +Problem 1) can be solved by loading the USB drivers prior to loading the +PS/2 mouse driver. Since the PS/2 mouse driver is in 2.6 compiled into +the kernel unconditionally, this means the USB drivers need to be +compiled-in, too. + +Problem 2) can currently only be solved by either disabling HIGHMEM64G +in the kernel config or USB Legacy support in the BIOS. A BIOS update +could help, but so far no such update exists. + +Problem 3) is usually fixed by a BIOS update. Check the board +manufacturers web site. If an update is not available, disable USB +Legacy support in the BIOS. If this alone doesn't help, try also adding +idle=poll on the kernel command line. The BIOS may be entering the SMM +on the HLT instruction as well. + diff --git a/Documentation/i386/zero-page.txt b/Documentation/i386/zero-page.txt new file mode 100644 index 000000000000..67c053a099ed --- /dev/null +++ b/Documentation/i386/zero-page.txt @@ -0,0 +1,84 @@ +Summary of boot_params layout (kernel point of view) + ( collected by Hans Lermen and Martin Mares ) + +The contents of boot_params are used to pass parameters from the +16-bit realmode code of the kernel to the 32-bit part. References/settings +to it mainly are in: + + arch/i386/boot/setup.S + arch/i386/boot/video.S + arch/i386/kernel/head.S + arch/i386/kernel/setup.c + + +Offset Type Description +------ ---- ----------- + 0 32 bytes struct screen_info, SCREEN_INFO + ATTENTION, overlaps the following !!! + 2 unsigned short EXT_MEM_K, extended memory size in Kb (from int 0x15) + 0x20 unsigned short CL_MAGIC, commandline magic number (=0xA33F) + 0x22 unsigned short CL_OFFSET, commandline offset + Address of commandline is calculated: + 0x90000 + contents of CL_OFFSET + (only taken, when CL_MAGIC = 0xA33F) + 0x40 20 bytes struct apm_bios_info, APM_BIOS_INFO + 0x60 16 bytes Intel SpeedStep (IST) BIOS support information + 0x80 16 bytes hd0-disk-parameter from intvector 0x41 + 0x90 16 bytes hd1-disk-parameter from intvector 0x46 + + 0xa0 16 bytes System description table truncated to 16 bytes. + ( struct sys_desc_table_struct ) + 0xb0 - 0x13f Free. Add more parameters here if you really need them. + 0x140- 0x1be EDID_INFO Video mode setup + +0x1c4 unsigned long EFI system table pointer +0x1c8 unsigned long EFI memory descriptor size +0x1cc unsigned long EFI memory descriptor version +0x1d0 unsigned long EFI memory descriptor map pointer +0x1d4 unsigned long EFI memory descriptor map size +0x1e0 unsigned long ALT_MEM_K, alternative mem check, in Kb +0x1e8 char number of entries in E820MAP (below) +0x1e9 unsigned char number of entries in EDDBUF (below) +0x1ea unsigned char number of entries in EDD_MBR_SIG_BUFFER (below) +0x1f1 char size of setup.S, number of sectors +0x1f2 unsigned short MOUNT_ROOT_RDONLY (if !=0) +0x1f4 unsigned short size of compressed kernel-part in the + (b)zImage-file (in 16 byte units, rounded up) +0x1f6 unsigned short swap_dev (unused AFAIK) +0x1f8 unsigned short RAMDISK_FLAGS +0x1fa unsigned short VGA-Mode (old one) +0x1fc unsigned short ORIG_ROOT_DEV (high=Major, low=minor) +0x1ff char AUX_DEVICE_INFO + +0x200 short jump to start of setup code aka "reserved" field. +0x202 4 bytes Signature for SETUP-header, ="HdrS" +0x206 unsigned short Version number of header format + Current version is 0x0201... +0x208 8 bytes (used by setup.S for communication with boot loaders, + look there) +0x210 char LOADER_TYPE, = 0, old one + else it is set by the loader: + 0xTV: T=0 for LILO + 1 for Loadlin + 2 for bootsect-loader + 3 for SYSLINUX + 4 for ETHERBOOT + V = version +0x211 char loadflags: + bit0 = 1: kernel is loaded high (bzImage) + bit7 = 1: Heap and pointer (see below) set by boot + loader. +0x212 unsigned short (setup.S) +0x214 unsigned long KERNEL_START, where the loader started the kernel +0x218 unsigned long INITRD_START, address of loaded ramdisk image +0x21c unsigned long INITRD_SIZE, size in bytes of ramdisk image +0x220 4 bytes (setup.S) +0x224 unsigned short setup.S heap end pointer +0x226 unsigned short zero_pad +0x228 unsigned long cmd_line_ptr +0x22c unsigned long ramdisk_max +0x230 16 bytes trampoline +0x290 - 0x2cf EDD_MBR_SIG_BUFFER (edd.S) +0x2d0 - 0x600 E820MAP +0x600 - 0x7ff EDDBUF (edd.S) for disk signature read sector +0x600 - 0x7eb EDDBUF (edd.S) for edd data diff --git a/Documentation/ia64/IRQ-redir.txt b/Documentation/ia64/IRQ-redir.txt new file mode 100644 index 000000000000..f7bd72261283 --- /dev/null +++ b/Documentation/ia64/IRQ-redir.txt @@ -0,0 +1,69 @@ +IRQ affinity on IA64 platforms +------------------------------ + 07.01.2002, Erich Focht <efocht@ess.nec.de> + + +By writing to /proc/irq/IRQ#/smp_affinity the interrupt routing can be +controlled. The behavior on IA64 platforms is slightly different from +that described in Documentation/IRQ-affinity.txt for i386 systems. + +Because of the usage of SAPIC mode and physical destination mode the +IRQ target is one particular CPU and cannot be a mask of several +CPUs. Only the first non-zero bit is taken into account. + + +Usage examples: + +The target CPU has to be specified as a hexadecimal CPU mask. The +first non-zero bit is the selected CPU. This format has been kept for +compatibility reasons with i386. + +Set the delivery mode of interrupt 41 to fixed and route the +interrupts to CPU #3 (logical CPU number) (2^3=0x08): + echo "8" >/proc/irq/41/smp_affinity + +Set the default route for IRQ number 41 to CPU 6 in lowest priority +delivery mode (redirectable): + echo "r 40" >/proc/irq/41/smp_affinity + +The output of the command + cat /proc/irq/IRQ#/smp_affinity +gives the target CPU mask for the specified interrupt vector. If the CPU +mask is preceded by the character "r", the interrupt is redirectable +(i.e. lowest priority mode routing is used), otherwise its route is +fixed. + + + +Initialization and default behavior: + +If the platform features IRQ redirection (info provided by SAL) all +IO-SAPIC interrupts are initialized with CPU#0 as their default target +and the routing is the so called "lowest priority mode" (actually +fixed SAPIC mode with hint). The XTP chipset registers are used as hints +for the IRQ routing. Currently in Linux XTP registers can have three +values: + - minimal for an idle task, + - normal if any other task runs, + - maximal if the CPU is going to be switched off. +The IRQ is routed to the CPU with lowest XTP register value, the +search begins at the default CPU. Therefore most of the interrupts +will be handled by CPU #0. + +If the platform doesn't feature interrupt redirection IOSAPIC fixed +routing is used. The target CPUs are distributed in a round robin +manner. IRQs will be routed only to the selected target CPUs. Check +with + cat /proc/interrupts + + + +Comments: + +On large (multi-node) systems it is recommended to route the IRQs to +the node to which the corresponding device is connected. +For systems like the NEC AzusA we get IRQ node-affinity for free. This +is because usually the chipsets on each node redirect the interrupts +only to their own CPUs (as they cannot see the XTP registers on the +other nodes). + diff --git a/Documentation/ia64/README b/Documentation/ia64/README new file mode 100644 index 000000000000..aa17f2154cba --- /dev/null +++ b/Documentation/ia64/README @@ -0,0 +1,43 @@ + Linux kernel release 2.4.xx for the IA-64 Platform + + These are the release notes for Linux version 2.4 for IA-64 + platform. This document provides information specific to IA-64 + ONLY, to get additional information about the Linux kernel also + read the original Linux README provided with the kernel. + +INSTALLING the kernel: + + - IA-64 kernel installation is the same as the other platforms, see + original README for details. + + +SOFTWARE REQUIREMENTS + + Compiling and running this kernel requires an IA-64 compliant GCC + compiler. And various software packages also compiled with an + IA-64 compliant GCC compiler. + + +CONFIGURING the kernel: + + Configuration is the same, see original README for details. + + +COMPILING the kernel: + + - Compiling this kernel doesn't differ from other platform so read + the original README for details BUT make sure you have an IA-64 + compliant GCC compiler. + +IA-64 SPECIFICS + + - General issues: + + o Hardly any performance tuning has been done. Obvious targets + include the library routines (IP checksum, etc.). Less + obvious targets include making sure we don't flush the TLB + needlessly, etc. + + o SMP locks cleanup/optimization + + o IA32 support. Currently experimental. It mostly works. diff --git a/Documentation/ia64/efirtc.txt b/Documentation/ia64/efirtc.txt new file mode 100644 index 000000000000..ede2c1e51cd7 --- /dev/null +++ b/Documentation/ia64/efirtc.txt @@ -0,0 +1,128 @@ +EFI Real Time Clock driver +------------------------------- +S. Eranian <eranian@hpl.hp.com> +March 2000 + +I/ Introduction + +This document describes the efirtc.c driver has provided for +the IA-64 platform. + +The purpose of this driver is to supply an API for kernel and user applications +to get access to the Time Service offered by EFI version 0.92. + +EFI provides 4 calls one can make once the OS is booted: GetTime(), +SetTime(), GetWakeupTime(), SetWakeupTime() which are all supported by this +driver. We describe those calls as well the design of the driver in the +following sections. + +II/ Design Decisions + +The original ideas was to provide a very simple driver to get access to, +at first, the time of day service. This is required in order to access, in a +portable way, the CMOS clock. A program like /sbin/hwclock uses such a clock +to initialize the system view of the time during boot. + +Because we wanted to minimize the impact on existing user-level apps using +the CMOS clock, we decided to expose an API that was very similar to the one +used today with the legacy RTC driver (driver/char/rtc.c). However, because +EFI provides a simpler services, not all all ioctl() are available. Also +new ioctl()s have been introduced for things that EFI provides but not the +legacy. + +EFI uses a slightly different way of representing the time, noticeably +the reference date is different. Year is the using the full 4-digit format. +The Epoch is January 1st 1998. For backward compatibility reasons we don't +expose this new way of representing time. Instead we use something very +similar to the struct tm, i.e. struct rtc_time, as used by hwclock. +One of the reasons for doing it this way is to allow for EFI to still evolve +without necessarily impacting any of the user applications. The decoupling +enables flexibility and permits writing wrapper code is ncase things change. + +The driver exposes two interfaces, one via the device file and a set of +ioctl()s. The other is read-only via the /proc filesystem. + +As of today we don't offer a /proc/sys interface. + +To allow for a uniform interface between the legacy RTC and EFI time service, +we have created the include/linux/rtc.h header file to contain only the +"public" API of the two drivers. The specifics of the legacy RTC are still +in include/linux/mc146818rtc.h. + + +III/ Time of day service + +The part of the driver gives access to the time of day service of EFI. +Two ioctl()s, compatible with the legacy RTC calls: + + Read the CMOS clock: ioctl(d, RTC_RD_TIME, &rtc); + + Write the CMOS clock: ioctl(d, RTC_SET_TIME, &rtc); + +The rtc is a pointer to a data structure defined in rtc.h which is close +to a struct tm: + +struct rtc_time { + int tm_sec; + int tm_min; + int tm_hour; + int tm_mday; + int tm_mon; + int tm_year; + int tm_wday; + int tm_yday; + int tm_isdst; +}; + +The driver takes care of converting back an forth between the EFI time and +this format. + +Those two ioctl()s can be exercised with the hwclock command: + +For reading: +# /sbin/hwclock --show +Mon Mar 6 15:32:32 2000 -0.910248 seconds + +For setting: +# /sbin/hwclock --systohc + +Root privileges are required to be able to set the time of day. + +IV/ Wakeup Alarm service + +EFI provides an API by which one can program when a machine should wakeup, +i.e. reboot. This is very different from the alarm provided by the legacy +RTC which is some kind of interval timer alarm. For this reason we don't use +the same ioctl()s to get access to the service. Instead we have +introduced 2 news ioctl()s to the interface of an RTC. + +We have added 2 new ioctl()s that are specific to the EFI driver: + + Read the current state of the alarm + ioctl(d, RTC_WKLAM_RD, &wkt) + + Set the alarm or change its status + ioctl(d, RTC_WKALM_SET, &wkt) + +The wkt structure encapsulates a struct rtc_time + 2 extra fields to get +status information: + +struct rtc_wkalrm { + + unsigned char enabled; /* =1 if alarm is enabled */ + unsigned char pending; /* =1 if alarm is pending */ + + struct rtc_time time; +} + +As of today, none of the existing user-level apps supports this feature. +However writing such a program should be hard by simply using those two +ioctl(). + +Root privileges are required to be able to set the alarm. + +V/ References. + +Checkout the following Web site for more information on EFI: + +http://developer.intel.com/technology/efi/ diff --git a/Documentation/ia64/fsys.txt b/Documentation/ia64/fsys.txt new file mode 100644 index 000000000000..28da181f9966 --- /dev/null +++ b/Documentation/ia64/fsys.txt @@ -0,0 +1,286 @@ +-*-Mode: outline-*- + + Light-weight System Calls for IA-64 + ----------------------------------- + + Started: 13-Jan-2003 + Last update: 27-Sep-2003 + + David Mosberger-Tang + <davidm@hpl.hp.com> + +Using the "epc" instruction effectively introduces a new mode of +execution to the ia64 linux kernel. We call this mode the +"fsys-mode". To recap, the normal states of execution are: + + - kernel mode: + Both the register stack and the memory stack have been + switched over to kernel memory. The user-level state is saved + in a pt-regs structure at the top of the kernel memory stack. + + - user mode: + Both the register stack and the kernel stack are in + user memory. The user-level state is contained in the + CPU registers. + + - bank 0 interruption-handling mode: + This is the non-interruptible state which all + interruption-handlers start execution in. The user-level + state remains in the CPU registers and some kernel state may + be stored in bank 0 of registers r16-r31. + +In contrast, fsys-mode has the following special properties: + + - execution is at privilege level 0 (most-privileged) + + - CPU registers may contain a mixture of user-level and kernel-level + state (it is the responsibility of the kernel to ensure that no + security-sensitive kernel-level state is leaked back to + user-level) + + - execution is interruptible and preemptible (an fsys-mode handler + can disable interrupts and avoid all other interruption-sources + to avoid preemption) + + - neither the memory-stack nor the register-stack can be trusted while + in fsys-mode (they point to the user-level stacks, which may + be invalid, or completely bogus addresses) + +In summary, fsys-mode is much more similar to running in user-mode +than it is to running in kernel-mode. Of course, given that the +privilege level is at level 0, this means that fsys-mode requires some +care (see below). + + +* How to tell fsys-mode + +Linux operates in fsys-mode when (a) the privilege level is 0 (most +privileged) and (b) the stacks have NOT been switched to kernel memory +yet. For convenience, the header file <asm-ia64/ptrace.h> provides +three macros: + + user_mode(regs) + user_stack(task,regs) + fsys_mode(task,regs) + +The "regs" argument is a pointer to a pt_regs structure. The "task" +argument is a pointer to the task structure to which the "regs" +pointer belongs to. user_mode() returns TRUE if the CPU state pointed +to by "regs" was executing in user mode (privilege level 3). +user_stack() returns TRUE if the state pointed to by "regs" was +executing on the user-level stack(s). Finally, fsys_mode() returns +TRUE if the CPU state pointed to by "regs" was executing in fsys-mode. +The fsys_mode() macro is equivalent to the expression: + + !user_mode(regs) && user_stack(task,regs) + +* How to write an fsyscall handler + +The file arch/ia64/kernel/fsys.S contains a table of fsyscall-handlers +(fsyscall_table). This table contains one entry for each system call. +By default, a system call is handled by fsys_fallback_syscall(). This +routine takes care of entering (full) kernel mode and calling the +normal Linux system call handler. For performance-critical system +calls, it is possible to write a hand-tuned fsyscall_handler. For +example, fsys.S contains fsys_getpid(), which is a hand-tuned version +of the getpid() system call. + +The entry and exit-state of an fsyscall handler is as follows: + +** Machine state on entry to fsyscall handler: + + - r10 = 0 + - r11 = saved ar.pfs (a user-level value) + - r15 = system call number + - r16 = "current" task pointer (in normal kernel-mode, this is in r13) + - r32-r39 = system call arguments + - b6 = return address (a user-level value) + - ar.pfs = previous frame-state (a user-level value) + - PSR.be = cleared to zero (i.e., little-endian byte order is in effect) + - all other registers may contain values passed in from user-mode + +** Required machine state on exit to fsyscall handler: + + - r11 = saved ar.pfs (as passed into the fsyscall handler) + - r15 = system call number (as passed into the fsyscall handler) + - r32-r39 = system call arguments (as passed into the fsyscall handler) + - b6 = return address (as passed into the fsyscall handler) + - ar.pfs = previous frame-state (as passed into the fsyscall handler) + +Fsyscall handlers can execute with very little overhead, but with that +speed comes a set of restrictions: + + o Fsyscall-handlers MUST check for any pending work in the flags + member of the thread-info structure and if any of the + TIF_ALLWORK_MASK flags are set, the handler needs to fall back on + doing a full system call (by calling fsys_fallback_syscall). + + o Fsyscall-handlers MUST preserve incoming arguments (r32-r39, r11, + r15, b6, and ar.pfs) because they will be needed in case of a + system call restart. Of course, all "preserved" registers also + must be preserved, in accordance to the normal calling conventions. + + o Fsyscall-handlers MUST check argument registers for containing a + NaT value before using them in any way that could trigger a + NaT-consumption fault. If a system call argument is found to + contain a NaT value, an fsyscall-handler may return immediately + with r8=EINVAL, r10=-1. + + o Fsyscall-handlers MUST NOT use the "alloc" instruction or perform + any other operation that would trigger mandatory RSE + (register-stack engine) traffic. + + o Fsyscall-handlers MUST NOT write to any stacked registers because + it is not safe to assume that user-level called a handler with the + proper number of arguments. + + o Fsyscall-handlers need to be careful when accessing per-CPU variables: + unless proper safe-guards are taken (e.g., interruptions are avoided), + execution may be pre-empted and resumed on another CPU at any given + time. + + o Fsyscall-handlers must be careful not to leak sensitive kernel' + information back to user-level. In particular, before returning to + user-level, care needs to be taken to clear any scratch registers + that could contain sensitive information (note that the current + task pointer is not considered sensitive: it's already exposed + through ar.k6). + + o Fsyscall-handlers MUST NOT access user-memory without first + validating access-permission (this can be done typically via + probe.r.fault and/or probe.w.fault) and without guarding against + memory access exceptions (this can be done with the EX() macros + defined by asmmacro.h). + +The above restrictions may seem draconian, but remember that it's +possible to trade off some of the restrictions by paying a slightly +higher overhead. For example, if an fsyscall-handler could benefit +from the shadow register bank, it could temporarily disable PSR.i and +PSR.ic, switch to bank 0 (bsw.0) and then use the shadow registers as +needed. In other words, following the above rules yields extremely +fast system call execution (while fully preserving system call +semantics), but there is also a lot of flexibility in handling more +complicated cases. + +* Signal handling + +The delivery of (asynchronous) signals must be delayed until fsys-mode +is exited. This is acomplished with the help of the lower-privilege +transfer trap: arch/ia64/kernel/process.c:do_notify_resume_user() +checks whether the interrupted task was in fsys-mode and, if so, sets +PSR.lp and returns immediately. When fsys-mode is exited via the +"br.ret" instruction that lowers the privilege level, a trap will +occur. The trap handler clears PSR.lp again and returns immediately. +The kernel exit path then checks for and delivers any pending signals. + +* PSR Handling + +The "epc" instruction doesn't change the contents of PSR at all. This +is in contrast to a regular interruption, which clears almost all +bits. Because of that, some care needs to be taken to ensure things +work as expected. The following discussion describes how each PSR bit +is handled. + +PSR.be Cleared when entering fsys-mode. A srlz.d instruction is used + to ensure the CPU is in little-endian mode before the first + load/store instruction is executed. PSR.be is normally NOT + restored upon return from an fsys-mode handler. In other + words, user-level code must not rely on PSR.be being preserved + across a system call. +PSR.up Unchanged. +PSR.ac Unchanged. +PSR.mfl Unchanged. Note: fsys-mode handlers must not write-registers! +PSR.mfh Unchanged. Note: fsys-mode handlers must not write-registers! +PSR.ic Unchanged. Note: fsys-mode handlers can clear the bit, if needed. +PSR.i Unchanged. Note: fsys-mode handlers can clear the bit, if needed. +PSR.pk Unchanged. +PSR.dt Unchanged. +PSR.dfl Unchanged. Note: fsys-mode handlers must not write-registers! +PSR.dfh Unchanged. Note: fsys-mode handlers must not write-registers! +PSR.sp Unchanged. +PSR.pp Unchanged. +PSR.di Unchanged. +PSR.si Unchanged. +PSR.db Unchanged. The kernel prevents user-level from setting a hardware + breakpoint that triggers at any privilege level other than 3 (user-mode). +PSR.lp Unchanged. +PSR.tb Lazy redirect. If a taken-branch trap occurs while in + fsys-mode, the trap-handler modifies the saved machine state + such that execution resumes in the gate page at + syscall_via_break(), with privilege level 3. Note: the + taken branch would occur on the branch invoking the + fsyscall-handler, at which point, by definition, a syscall + restart is still safe. If the system call number is invalid, + the fsys-mode handler will return directly to user-level. This + return will trigger a taken-branch trap, but since the trap is + taken _after_ restoring the privilege level, the CPU has already + left fsys-mode, so no special treatment is needed. +PSR.rt Unchanged. +PSR.cpl Cleared to 0. +PSR.is Unchanged (guaranteed to be 0 on entry to the gate page). +PSR.mc Unchanged. +PSR.it Unchanged (guaranteed to be 1). +PSR.id Unchanged. Note: the ia64 linux kernel never sets this bit. +PSR.da Unchanged. Note: the ia64 linux kernel never sets this bit. +PSR.dd Unchanged. Note: the ia64 linux kernel never sets this bit. +PSR.ss Lazy redirect. If set, "epc" will cause a Single Step Trap to + be taken. The trap handler then modifies the saved machine + state such that execution resumes in the gate page at + syscall_via_break(), with privilege level 3. +PSR.ri Unchanged. +PSR.ed Unchanged. Note: This bit could only have an effect if an fsys-mode + handler performed a speculative load that gets NaTted. If so, this + would be the normal & expected behavior, so no special treatment is + needed. +PSR.bn Unchanged. Note: fsys-mode handlers may clear the bit, if needed. + Doing so requires clearing PSR.i and PSR.ic as well. +PSR.ia Unchanged. Note: the ia64 linux kernel never sets this bit. + +* Using fast system calls + +To use fast system calls, userspace applications need simply call +__kernel_syscall_via_epc(). For example + +-- example fgettimeofday() call -- +-- fgettimeofday.S -- + +#include <asm/asmmacro.h> + +GLOBAL_ENTRY(fgettimeofday) +.prologue +.save ar.pfs, r11 +mov r11 = ar.pfs +.body + +mov r2 = 0xa000000000020660;; // gate address + // found by inspection of System.map for the + // __kernel_syscall_via_epc() function. See + // below for how to do this for real. + +mov b7 = r2 +mov r15 = 1087 // gettimeofday syscall +;; +br.call.sptk.many b6 = b7 +;; + +.restore sp + +mov ar.pfs = r11 +br.ret.sptk.many rp;; // return to caller +END(fgettimeofday) + +-- end fgettimeofday.S -- + +In reality, getting the gate address is accomplished by two extra +values passed via the ELF auxiliary vector (include/asm-ia64/elf.h) + + o AT_SYSINFO : is the address of __kernel_syscall_via_epc() + o AT_SYSINFO_EHDR : is the address of the kernel gate ELF DSO + +The ELF DSO is a pre-linked library that is mapped in by the kernel at +the gate page. It is a proper ELF shared object so, with a dynamic +loader that recognises the library, you should be able to make calls to +the exported functions within it as with any other shared library. +AT_SYSINFO points into the kernel DSO at the +__kernel_syscall_via_epc() function for historical reasons (it was +used before the kernel DSO) and as a convenience. diff --git a/Documentation/ia64/serial.txt b/Documentation/ia64/serial.txt new file mode 100644 index 000000000000..f51eb4bc2ff1 --- /dev/null +++ b/Documentation/ia64/serial.txt @@ -0,0 +1,144 @@ +SERIAL DEVICE NAMING + + As of 2.6.10, serial devices on ia64 are named based on the + order of ACPI and PCI enumeration. The first device in the + ACPI namespace (if any) becomes /dev/ttyS0, the second becomes + /dev/ttyS1, etc., and PCI devices are named sequentially + starting after the ACPI devices. + + Prior to 2.6.10, there were confusing exceptions to this: + + - Firmware on some machines (mostly from HP) provides an HCDP + table[1] that tells the kernel about devices that can be used + as a serial console. If the user specified "console=ttyS0" + or the EFI ConOut path contained only UART devices, the + kernel registered the device described by the HCDP as + /dev/ttyS0. + + - If there was no HCDP, we assumed there were UARTs at the + legacy COM port addresses (I/O ports 0x3f8 and 0x2f8), so + the kernel registered those as /dev/ttyS0 and /dev/ttyS1. + + Any additional ACPI or PCI devices were registered sequentially + after /dev/ttyS0 as they were discovered. + + With an HCDP, device names changed depending on EFI configuration + and "console=" arguments. Without an HCDP, device names didn't + change, but we registered devices that might not really exist. + + For example, an HP rx1600 with a single built-in serial port + (described in the ACPI namespace) plus an MP[2] (a PCI device) has + these ports: + + pre-2.6.10 pre-2.6.10 + MMIO (EFI console (EFI console + address on builtin) on MP port) 2.6.10 + ========== ========== ========== ====== + builtin 0xff5e0000 ttyS0 ttyS1 ttyS0 + MP UPS 0xf8031000 ttyS1 ttyS2 ttyS1 + MP Console 0xf8030000 ttyS2 ttyS0 ttyS2 + MP 2 0xf8030010 ttyS3 ttyS3 ttyS3 + MP 3 0xf8030038 ttyS4 ttyS4 ttyS4 + +CONSOLE SELECTION + + EFI knows what your console devices are, but it doesn't tell the + kernel quite enough to actually locate them. The DIG64 HCDP + table[1] does tell the kernel where potential serial console + devices are, but not all firmware supplies it. Also, EFI supports + multiple simultaneous consoles and doesn't tell the kernel which + should be the "primary" one. + + So how do you tell Linux which console device to use? + + - If your firmware supplies the HCDP, it is simplest to + configure EFI with a single device (either a UART or a VGA + card) as the console. Then you don't need to tell Linux + anything; the kernel will automatically use the EFI console. + + (This works only in 2.6.6 or later; prior to that you had + to specify "console=ttyS0" to get a serial console.) + + - Without an HCDP, Linux defaults to a VGA console unless you + specify a "console=" argument. + + NOTE: Don't assume that a serial console device will be /dev/ttyS0. + It might be ttyS1, ttyS2, etc. Make sure you have the appropriate + entries in /etc/inittab (for getty) and /etc/securetty (to allow + root login). + +EARLY SERIAL CONSOLE + + The kernel can't start using a serial console until it knows where + the device lives. Normally this happens when the driver enumerates + all the serial devices, which can happen a minute or more after the + kernel starts booting. + + 2.6.10 and later kernels have an "early uart" driver that works + very early in the boot process. The kernel will automatically use + this if the user supplies an argument like "console=uart,io,0x3f8", + or if the EFI console path contains only a UART device and the + firmware supplies an HCDP. + +TROUBLESHOOTING SERIAL CONSOLE PROBLEMS + + No kernel output after elilo prints "Uncompressing Linux... done": + + - You specified "console=ttyS0" but Linux changed the device + to which ttyS0 refers. Configure exactly one EFI console + device[3] and remove the "console=" option. + + - The EFI console path contains both a VGA device and a UART. + EFI and elilo use both, but Linux defaults to VGA. Remove + the VGA device from the EFI console path[3]. + + - Multiple UARTs selected as EFI console devices. EFI and + elilo use all selected devices, but Linux uses only one. + Make sure only one UART is selected in the EFI console + path[3]. + + - You're connected to an HP MP port[2] but have a non-MP UART + selected as EFI console device. EFI uses the MP as a + console device even when it isn't explicitly selected. + Either move the console cable to the non-MP UART, or change + the EFI console path[3] to the MP UART. + + Long pause (60+ seconds) between "Uncompressing Linux... done" and + start of kernel output: + + - No early console because you used "console=ttyS<n>". Remove + the "console=" option if your firmware supplies an HCDP. + + - If you don't have an HCDP, the kernel doesn't know where + your console lives until the driver discovers serial + devices. Use "console=uart, io,0x3f8" (or appropriate + address for your machine). + + Kernel and init script output works fine, but no "login:" prompt: + + - Add getty entry to /etc/inittab for console tty. Look for + the "Adding console on ttyS<n>" message that tells you which + device is the console. + + "login:" prompt, but can't login as root: + + - Add entry to /etc/securetty for console tty. + + + +[1] http://www.dig64.org/specifications/DIG64_PCDPv20.pdf + The table was originally defined as the "HCDP" for "Headless + Console/Debug Port." The current version is the "PCDP" for + "Primary Console and Debug Port Devices." + +[2] The HP MP (management processor) is a PCI device that provides + several UARTs. One of the UARTs is often used as a console; the + EFI Boot Manager identifies it as "Acpi(HWP0002,700)/Pci(...)/Uart". + The external connection is usually a 25-pin connector, and a + special dongle converts that to three 9-pin connectors, one of + which is labelled "Console." + +[3] EFI console devices are configured using the EFI Boot Manager + "Boot option maintenance" menu. You may have to interrupt the + boot sequence to use this menu, and you will have to reset the + box after changing console configuration. diff --git a/Documentation/ibm-acpi.txt b/Documentation/ibm-acpi.txt new file mode 100644 index 000000000000..c437b1aeff55 --- /dev/null +++ b/Documentation/ibm-acpi.txt @@ -0,0 +1,474 @@ + IBM ThinkPad ACPI Extras Driver + + Version 0.8 + 8 November 2004 + + Borislav Deianov <borislav@users.sf.net> + http://ibm-acpi.sf.net/ + + +This is a Linux ACPI driver for the IBM ThinkPad laptops. It aims to +support various features of these laptops which are accessible through +the ACPI framework but not otherwise supported by the generic Linux +ACPI drivers. + + +Status +------ + +The features currently supported are the following (see below for +detailed description): + + - Fn key combinations + - Bluetooth enable and disable + - video output switching, expansion control + - ThinkLight on and off + - limited docking and undocking + - UltraBay eject + - Experimental: CMOS control + - Experimental: LED control + - Experimental: ACPI sounds + +A compatibility table by model and feature is maintained on the web +site, http://ibm-acpi.sf.net/. I appreciate any success or failure +reports, especially if they add to or correct the compatibility table. +Please include the following information in your report: + + - ThinkPad model name + - a copy of your DSDT, from /proc/acpi/dsdt + - which driver features work and which don't + - the observed behavior of non-working features + +Any other comments or patches are also more than welcome. + + +Installation +------------ + +If you are compiling this driver as included in the Linux kernel +sources, simply enable the CONFIG_ACPI_IBM option (Power Management / +ACPI / IBM ThinkPad Laptop Extras). The rest of this section describes +how to install this driver when downloaded from the web site. + +First, you need to get a kernel with ACPI support up and running. +Please refer to http://acpi.sourceforge.net/ for help with this +step. How successful you will be depends a lot on you ThinkPad model, +the kernel you are using and any additional patches applied. The +kernel provided with your distribution may not be good enough. I +needed to compile a 2.6.7 kernel with the 20040715 ACPI patch to get +ACPI working reliably on my ThinkPad X40. Old ThinkPad models may not +be supported at all. + +Assuming you have the basic ACPI support working (e.g. you can see the +/proc/acpi directory), follow the following steps to install this +driver: + + - unpack the archive: + + tar xzvf ibm-acpi-x.y.tar.gz; cd ibm-acpi-x.y + + - compile the driver: + + make + + - install the module in your kernel modules directory: + + make install + + - load the module: + + modprobe ibm_acpi + +After loading the module, check the "dmesg" output for any error messages. + + +Features +-------- + +The driver creates the /proc/acpi/ibm directory. There is a file under +that directory for each feature described below. Note that while the +driver is still in the alpha stage, the exact proc file format and +commands supported by the various features is guaranteed to change +frequently. + +Driver Version -- /proc/acpi/ibm/driver +-------------------------------------- + +The driver name and version. No commands can be written to this file. + +Hot Keys -- /proc/acpi/ibm/hotkey +--------------------------------- + +Without this driver, only the Fn-F4 key (sleep button) generates an +ACPI event. With the driver loaded, the hotkey feature enabled and the +mask set (see below), the various hot keys generate ACPI events in the +following format: + + ibm/hotkey HKEY 00000080 0000xxxx + +The last four digits vary depending on the key combination pressed. +All labeled Fn-Fx key combinations generate distinct events. In +addition, the lid microswitch and some docking station buttons may +also generate such events. + +The following commands can be written to this file: + + echo enable > /proc/acpi/ibm/hotkey -- enable the hot keys feature + echo disable > /proc/acpi/ibm/hotkey -- disable the hot keys feature + echo 0xffff > /proc/acpi/ibm/hotkey -- enable all possible hot keys + echo 0x0000 > /proc/acpi/ibm/hotkey -- disable all possible hot keys + ... any other 4-hex-digit mask ... + echo reset > /proc/acpi/ibm/hotkey -- restore the original mask + +The bit mask allows some control over which hot keys generate ACPI +events. Not all bits in the mask can be modified. Not all bits that +can be modified do anything. Not all hot keys can be individually +controlled by the mask. Most recent ThinkPad models honor the +following bits (assuming the hot keys feature has been enabled): + + key bit behavior when set behavior when unset + + Fn-F3 always generates ACPI event + Fn-F4 always generates ACPI event + Fn-F5 0010 generate ACPI event enable/disable Bluetooth + Fn-F7 0040 generate ACPI event switch LCD and external display + Fn-F8 0080 generate ACPI event expand screen or none + Fn-F9 0100 generate ACPI event none + Fn-F12 always generates ACPI event + +Some models do not support all of the above. For example, the T30 does +not support Fn-F5 and Fn-F9. Other models do not support the mask at +all. On those models, hot keys cannot be controlled individually. + +Note that enabling ACPI events for some keys prevents their default +behavior. For example, if events for Fn-F5 are enabled, that key will +no longer enable/disable Bluetooth by itself. This can still be done +from an acpid handler for the ibm/hotkey event. + +Note also that not all Fn key combinations are supported through +ACPI. For example, on the X40, the brightness, volume and "Access IBM" +buttons do not generate ACPI events even with this driver. They *can* +be used through the "ThinkPad Buttons" utility, see +http://www.nongnu.org/tpb/ + +Bluetooth -- /proc/acpi/ibm/bluetooth +------------------------------------- + +This feature shows the presence and current state of a Bluetooth +device. If Bluetooth is installed, the following commands can be used: + + echo enable > /proc/acpi/ibm/bluetooth + echo disable > /proc/acpi/ibm/bluetooth + +Video output control -- /proc/acpi/ibm/video +-------------------------------------------- + +This feature allows control over the devices used for video output - +LCD, CRT or DVI (if available). The following commands are available: + + echo lcd_enable > /proc/acpi/ibm/video + echo lcd_disable > /proc/acpi/ibm/video + echo crt_enable > /proc/acpi/ibm/video + echo crt_disable > /proc/acpi/ibm/video + echo dvi_enable > /proc/acpi/ibm/video + echo dvi_disable > /proc/acpi/ibm/video + echo auto_enable > /proc/acpi/ibm/video + echo auto_disable > /proc/acpi/ibm/video + echo expand_toggle > /proc/acpi/ibm/video + echo video_switch > /proc/acpi/ibm/video + +Each video output device can be enabled or disabled individually. +Reading /proc/acpi/ibm/video shows the status of each device. + +Automatic video switching can be enabled or disabled. When automatic +video switching is enabled, certain events (e.g. opening the lid, +docking or undocking) cause the video output device to change +automatically. While this can be useful, it also causes flickering +and, on the X40, video corruption. By disabling automatic switching, +the flickering or video corruption can be avoided. + +The video_switch command cycles through the available video outputs +(it sumulates the behavior of Fn-F7). + +Video expansion can be toggled through this feature. This controls +whether the display is expanded to fill the entire LCD screen when a +mode with less than full resolution is used. Note that the current +video expansion status cannot be determined through this feature. + +Note that on many models (particularly those using Radeon graphics +chips) the X driver configures the video card in a way which prevents +Fn-F7 from working. This also disables the video output switching +features of this driver, as it uses the same ACPI methods as +Fn-F7. Video switching on the console should still work. + +ThinkLight control -- /proc/acpi/ibm/light +------------------------------------------ + +The current status of the ThinkLight can be found in this file. A few +models which do not make the status available will show it as +"unknown". The available commands are: + + echo on > /proc/acpi/ibm/light + echo off > /proc/acpi/ibm/light + +Docking / Undocking -- /proc/acpi/ibm/dock +------------------------------------------ + +Docking and undocking (e.g. with the X4 UltraBase) requires some +actions to be taken by the operating system to safely make or break +the electrical connections with the dock. + +The docking feature of this driver generates the following ACPI events: + + ibm/dock GDCK 00000003 00000001 -- eject request + ibm/dock GDCK 00000003 00000002 -- undocked + ibm/dock GDCK 00000000 00000003 -- docked + +NOTE: These events will only be generated if the laptop was docked +when originally booted. This is due to the current lack of support for +hot plugging of devices in the Linux ACPI framework. If the laptop was +booted while not in the dock, the following message is shown in the +logs: "ibm_acpi: dock device not present". No dock-related events are +generated but the dock and undock commands described below still +work. They can be executed manually or triggered by Fn key +combinations (see the example acpid configuration files included in +the driver tarball package available on the web site). + +When the eject request button on the dock is pressed, the first event +above is generated. The handler for this event should issue the +following command: + + echo undock > /proc/acpi/ibm/dock + +After the LED on the dock goes off, it is safe to eject the laptop. +Note: if you pressed this key by mistake, go ahead and eject the +laptop, then dock it back in. Otherwise, the dock may not function as +expected. + +When the laptop is docked, the third event above is generated. The +handler for this event should issue the following command to fully +enable the dock: + + echo dock > /proc/acpi/ibm/dock + +The contents of the /proc/acpi/ibm/dock file shows the current status +of the dock, as provided by the ACPI framework. + +The docking support in this driver does not take care of enabling or +disabling any other devices you may have attached to the dock. For +example, a CD drive plugged into the UltraBase needs to be disabled or +enabled separately. See the provided example acpid configuration files +for how this can be accomplished. + +There is no support yet for PCI devices that may be attached to a +docking station, e.g. in the ThinkPad Dock II. The driver currently +does not recognize, enable or disable such devices. This means that +the only docking stations currently supported are the X-series +UltraBase docks and "dumb" port replicators like the Mini Dock (the +latter don't need any ACPI support, actually). + +UltraBay Eject -- /proc/acpi/ibm/bay +------------------------------------ + +Inserting or ejecting an UltraBay device requires some actions to be +taken by the operating system to safely make or break the electrical +connections with the device. + +This feature generates the following ACPI events: + + ibm/bay MSTR 00000003 00000000 -- eject request + ibm/bay MSTR 00000001 00000000 -- eject lever inserted + +NOTE: These events will only be generated if the UltraBay was present +when the laptop was originally booted (on the X series, the UltraBay +is in the dock, so it may not be present if the laptop was undocked). +This is due to the current lack of support for hot plugging of devices +in the Linux ACPI framework. If the laptop was booted without the +UltraBay, the following message is shown in the logs: "ibm_acpi: bay +device not present". No bay-related events are generated but the eject +command described below still works. It can be executed manually or +triggered by a hot key combination. + +Sliding the eject lever generates the first event shown above. The +handler for this event should take whatever actions are necessary to +shut down the device in the UltraBay (e.g. call idectl), then issue +the following command: + + echo eject > /proc/acpi/ibm/bay + +After the LED on the UltraBay goes off, it is safe to pull out the +device. + +When the eject lever is inserted, the second event above is +generated. The handler for this event should take whatever actions are +necessary to enable the UltraBay device (e.g. call idectl). + +The contents of the /proc/acpi/ibm/bay file shows the current status +of the UltraBay, as provided by the ACPI framework. + +Experimental Features +--------------------- + +The following features are marked experimental because using them +involves guessing the correct values of some parameters. Guessing +incorrectly may have undesirable effects like crashing your +ThinkPad. USE THESE WITH CAUTION! To activate them, you'll need to +supply the experimental=1 parameter when loading the module. + +Experimental: CMOS control - /proc/acpi/ibm/cmos +------------------------------------------------ + +This feature is used internally by the ACPI firmware to control the +ThinkLight on most newer ThinkPad models. It appears that it can also +control LCD brightness, sounds volume and more, but only on some +models. + +The commands are non-negative integer numbers: + + echo 0 >/proc/acpi/ibm/cmos + echo 1 >/proc/acpi/ibm/cmos + echo 2 >/proc/acpi/ibm/cmos + ... + +The range of numbers which are used internally by various models is 0 +to 21, but it's possible that numbers outside this range have +interesting behavior. Here is the behavior on the X40 (tpb is the +ThinkPad Buttons utility): + + 0 - no effect but tpb reports "Volume down" + 1 - no effect but tpb reports "Volume up" + 2 - no effect but tpb reports "Mute on" + 3 - simulate pressing the "Access IBM" button + 4 - LCD brightness up + 5 - LCD brightness down + 11 - toggle screen expansion + 12 - ThinkLight on + 13 - ThinkLight off + 14 - no effect but tpb reports ThinkLight status change + +If you try this feature, please send me a report similar to the +above. On models which allow control of LCD brightness or sound +volume, I'd like to provide this functionality in an user-friendly +way, but first I need a way to identify the models which this is +possible. + +Experimental: LED control - /proc/acpi/ibm/LED +---------------------------------------------- + +Some of the LED indicators can be controlled through this feature. The +available commands are: + + echo <led number> on >/proc/acpi/ibm/led + echo <led number> off >/proc/acpi/ibm/led + echo <led number> blink >/proc/acpi/ibm/led + +The <led number> parameter is a non-negative integer. The range of LED +numbers used internally by various models is 0 to 7 but it's possible +that numbers outside this range are also valid. Here is the mapping on +the X40: + + 0 - power + 1 - battery (orange) + 2 - battery (green) + 3 - UltraBase + 4 - UltraBay + 7 - standby + +All of the above can be turned on and off and can be made to blink. + +If you try this feature, please send me a report similar to the +above. I'd like to provide this functionality in an user-friendly way, +but first I need to identify the which numbers correspond to which +LEDs on various models. + +Experimental: ACPI sounds - /proc/acpi/ibm/beep +----------------------------------------------- + +The BEEP method is used internally by the ACPI firmware to provide +audible alerts in various situtation. This feature allows the same +sounds to be triggered manually. + +The commands are non-negative integer numbers: + + echo 0 >/proc/acpi/ibm/beep + echo 1 >/proc/acpi/ibm/beep + echo 2 >/proc/acpi/ibm/beep + ... + +The range of numbers which are used internally by various models is 0 +to 17, but it's possible that numbers outside this range are also +valid. Here is the behavior on the X40: + + 2 - two beeps, pause, third beep + 3 - single beep + 4 - "unable" + 5 - single beep + 6 - "AC/DC" + 7 - high-pitched beep + 9 - three short beeps + 10 - very long beep + 12 - low-pitched beep + +(I've only been able to identify a couple of them). + +If you try this feature, please send me a report similar to the +above. I'd like to provide this functionality in an user-friendly way, +but first I need to identify the which numbers correspond to which +sounds on various models. + + +Multiple Command, Module Parameters +----------------------------------- + +Multiple commands can be written to the proc files in one shot by +separating them with commas, for example: + + echo enable,0xffff > /proc/acpi/ibm/hotkey + echo lcd_disable,crt_enable > /proc/acpi/ibm/video + +Commands can also be specified when loading the ibm_acpi module, for +example: + + modprobe ibm_acpi hotkey=enable,0xffff video=auto_disable + + +Example Configuration +--------------------- + +The ACPI support in the kernel is intended to be used in conjunction +with a user-space daemon, acpid. The configuration files for this +daemon control what actions are taken in response to various ACPI +events. An example set of configuration files are included in the +config/ directory of the tarball package available on the web +site. Note that these are provided for illustration purposes only and +may need to be adapted to your particular setup. + +The following utility scripts are used by the example action +scripts (included with ibm-acpi for completeness): + + /usr/local/sbin/idectl -- from the hdparm source distribution, + see http://www.ibiblio.org/pub/Linux/system/hardware + /usr/local/sbin/laptop_mode -- from the Linux kernel source + distribution, see Documentation/laptop-mode.txt + /sbin/service -- comes with Redhat/Fedora distributions + +Toan T Nguyen <ntt@control.uchicago.edu> has written a SuSE powersave +script for the X20, included in config/usr/sbin/ibm_hotkeys_X20 + +Henrik Brix Andersen <brix@gentoo.org> has written a Gentoo ACPI event +handler script for the X31. You can get the latest version from +http://dev.gentoo.org/~brix/files/x31.sh + +David Schweikert <dws@ee.eth.ch> has written an alternative blank.sh +script which works on Debian systems, included in +configs/etc/acpi/actions/blank-debian.sh + + +TODO +---- + +I'd like to implement the following features but haven't yet found the +time and/or I don't yet know how to implement them: + +- UltraBay floppy drive support + diff --git a/Documentation/ide.txt b/Documentation/ide.txt new file mode 100644 index 000000000000..29866fbfb229 --- /dev/null +++ b/Documentation/ide.txt @@ -0,0 +1,394 @@ + + Information regarding the Enhanced IDE drive in Linux 2.6 + +============================================================================== + + + The hdparm utility can be used to control various IDE features on a + running system. It is packaged separately. Please Look for it on popular + linux FTP sites. + + + +*** IMPORTANT NOTICES: BUGGY IDE CHIPSETS CAN CORRUPT DATA!! +*** ================= +*** PCI versions of the CMD640 and RZ1000 interfaces are now detected +*** automatically at startup when PCI BIOS support is configured. +*** +*** Linux disables the "prefetch" ("readahead") mode of the RZ1000 +*** to prevent data corruption possible due to hardware design flaws. +*** +*** For the CMD640, linux disables "IRQ unmasking" (hdparm -u1) on any +*** drive for which the "prefetch" mode of the CMD640 is turned on. +*** If "prefetch" is disabled (hdparm -p8), then "IRQ unmasking" can be +*** used again. +*** +*** For the CMD640, linux disables "32bit I/O" (hdparm -c1) on any drive +*** for which the "prefetch" mode of the CMD640 is turned off. +*** If "prefetch" is enabled (hdparm -p9), then "32bit I/O" can be +*** used again. +*** +*** The CMD640 is also used on some Vesa Local Bus (VLB) cards, and is *NOT* +*** automatically detected by Linux. For safe, reliable operation with such +*** interfaces, one *MUST* use the "ide0=cmd640_vlb" kernel option. +*** +*** Use of the "serialize" option is no longer necessary. + +================================================================================ +Common pitfalls: + +- 40-conductor IDE cables are capable of transferring data in DMA modes up to + udma2, but no faster. + +- If possible devices should be attached to separate channels if they are + available. Typically the disk on the first and CD-ROM on the second. + +- If you mix devices on the same cable, please consider using similar devices + in respect of the data transfer mode they support. + +- Even better try to stick to the same vendor and device type on the same + cable. + +================================================================================ + +This is the multiple IDE interface driver, as evolved from hd.c. + +It supports up to 9 IDE interfaces per default, on one or more IRQs (usually +14 & 15). There can be up to two drives per interface, as per the ATA-6 spec. + +Primary: ide0, port 0x1f0; major=3; hda is minor=0; hdb is minor=64 +Secondary: ide1, port 0x170; major=22; hdc is minor=0; hdd is minor=64 +Tertiary: ide2, port 0x1e8; major=33; hde is minor=0; hdf is minor=64 +Quaternary: ide3, port 0x168; major=34; hdg is minor=0; hdh is minor=64 +fifth.. ide4, usually PCI, probed +sixth.. ide5, usually PCI, probed + +To access devices on interfaces > ide0, device entries please make sure that +device files for them are present in /dev. If not, please create such +entries, by using /dev/MAKEDEV. + +This driver automatically probes for most IDE interfaces (including all PCI +ones), for the drives/geometries attached to those interfaces, and for the IRQ +lines being used by the interfaces (normally 14, 15 for ide0/ide1). + +For special cases, interfaces may be specified using kernel "command line" +options. For example, + + ide3=0x168,0x36e,10 /* ioports 0x168-0x16f,0x36e, irq 10 */ + +Normally the irq number need not be specified, as ide.c will probe for it: + + ide3=0x168,0x36e /* ioports 0x168-0x16f,0x36e */ + +The standard port, and irq values are these: + + ide0=0x1f0,0x3f6,14 + ide1=0x170,0x376,15 + ide2=0x1e8,0x3ee,11 + ide3=0x168,0x36e,10 + +Note that the first parameter reserves 8 contiguous ioports, whereas the +second value denotes a single ioport. If in doubt, do a 'cat /proc/ioports'. + +In all probability the device uses these ports and IRQs if it is attached +to the appropriate ide channel. Pass the parameter for the correct ide +channel to the kernel, as explained above. + +Any number of interfaces may share a single IRQ if necessary, at a slight +performance penalty, whether on separate cards or a single VLB card. +The IDE driver automatically detects and handles this. However, this may +or may not be harmful to your hardware.. two or more cards driving the same IRQ +can potentially burn each other's bus driver, though in practice this +seldom occurs. Be careful, and if in doubt, don't do it! + +Drives are normally found by auto-probing and/or examining the CMOS/BIOS data. +For really weird situations, the apparent (fdisk) geometry can also be specified +on the kernel "command line" using LILO. The format of such lines is: + + hdx=cyls,heads,sects,wpcom,irq +or hdx=cdrom + +where hdx can be any of hda through hdh, Three values are required +(cyls,heads,sects). For example: + + hdc=1050,32,64 hdd=cdrom + +either {hda,hdb} or {hdc,hdd}. The results of successful auto-probing may +override the physical geometry/irq specified, though the "original" geometry +may be retained as the "logical" geometry for partitioning purposes (fdisk). + +If the auto-probing during boot time confuses a drive (ie. the drive works +with hd.c but not with ide.c), then an command line option may be specified +for each drive for which you'd like the drive to skip the hardware +probe/identification sequence. For example: + + hdb=noprobe +or + hdc=768,16,32 + hdc=noprobe + +Note that when only one IDE device is attached to an interface, it should be +jumpered as "single" or "master", *not* "slave". Many folks have had +"trouble" with cdroms because of this requirement, so the driver now probes +for both units, though success is more likely when the drive is jumpered +correctly. + +Courtesy of Scott Snyder and others, the driver supports ATAPI cdrom drives +such as the NEC-260 and the new MITSUMI triple/quad speed drives. +Such drives will be identified at boot time, just like a hard disk. + +If for some reason your cdrom drive is *not* found at boot time, you can force +the probe to look harder by supplying a kernel command line parameter +via LILO, such as: + + hdc=cdrom /* hdc = "master" on second interface */ +or + hdd=cdrom /* hdd = "slave" on second interface */ + +For example, a GW2000 system might have a hard drive on the primary +interface (/dev/hda) and an IDE cdrom drive on the secondary interface +(/dev/hdc). To mount a CD in the cdrom drive, one would use something like: + + ln -sf /dev/hdc /dev/cdrom + mkdir /mnt/cdrom + mount /dev/cdrom /mnt/cdrom -t iso9660 -o ro + +If, after doing all of the above, mount doesn't work and you see +errors from the driver (with dmesg) complaining about `status=0xff', +this means that the hardware is not responding to the driver's attempts +to read it. One of the following is probably the problem: + + - Your hardware is broken. + + - You are using the wrong address for the device, or you have the + drive jumpered wrong. Review the configuration instructions above. + + - Your IDE controller requires some nonstandard initialization sequence + before it will work properly. If this is the case, there will often + be a separate MS-DOS driver just for the controller. IDE interfaces + on sound cards usually fall into this category. Such configurations + can often be made to work by first booting MS-DOS, loading the + appropriate drivers, and then warm-booting linux (without powering + off). This can be automated using loadlin in the MS-DOS autoexec. + +If you always get timeout errors, interrupts from the drive are probably +not making it to the host. Check how you have the hardware jumpered +and make sure it matches what the driver expects (see the configuration +instructions above). If you have a PCI system, also check the BIOS +setup; I've had one report of a system which was shipped with IRQ 15 +disabled by the BIOS. + +The kernel is able to execute binaries directly off of the cdrom, +provided it is mounted with the default block size of 1024 (as above). + +Please pass on any feedback on any of this stuff to the maintainer, +whose address can be found in linux/MAINTAINERS. + +Note that if BOTH hd.c and ide.c are configured into the kernel, +hd.c will normally be allowed to control the primary IDE interface. +This is useful for older hardware that may be incompatible with ide.c, +and still allows newer hardware to run on the 2nd/3rd/4th IDE ports +under control of ide.c. To have ide.c also "take over" the primary +IDE port in this situation, use the "command line" parameter: ide0=0x1f0 + +The IDE driver is modularized. The high level disk/CD-ROM/tape/floppy +drivers can always be compiled as loadable modules, the chipset drivers +can only be compiled into the kernel, and the core code (ide.c) can be +compiled as a loadable module provided no chipset support is needed. + +When using ide.c as a module in combination with kmod, add: + + alias block-major-3 ide-probe + +to /etc/modprobe.conf. + +When ide.c is used as a module, you can pass command line parameters to the +driver using the "options=" keyword to insmod, while replacing any ',' with +';'. For example: + + insmod ide.o options="ide0=serialize ide1=serialize ide2=0x1e8;0x3ee;11" + + +================================================================================ + +Summary of ide driver parameters for kernel command line +-------------------------------------------------------- + + "hdx=" is recognized for all "x" from "a" to "h", such as "hdc". + + "idex=" is recognized for all "x" from "0" to "3", such as "ide1". + + "hdx=noprobe" : drive may be present, but do not probe for it + + "hdx=none" : drive is NOT present, ignore cmos and do not probe + + "hdx=nowerr" : ignore the WRERR_STAT bit on this drive + + "hdx=cdrom" : drive is present, and is a cdrom drive + + "hdx=cyl,head,sect" : disk drive is present, with specified geometry + + "hdx=remap" : remap access of sector 0 to sector 1 (for EZDrive) + + "hdx=remap63" : remap the drive: add 63 to all sector numbers + (for DM OnTrack) + + "hdx=autotune" : driver will attempt to tune interface speed + to the fastest PIO mode supported, + if possible for this drive only. + Not fully supported by all chipset types, + and quite likely to cause trouble with + older/odd IDE drives. + + "hdx=swapdata" : when the drive is a disk, byte swap all data + + "hdx=bswap" : same as above.......... + + "hdx=scsi" : the return of the ide-scsi flag, this is useful for + allowing ide-floppy, ide-tape, and ide-cdrom|writers + to use ide-scsi emulation on a device specific option. + + "idebus=xx" : inform IDE driver of VESA/PCI bus speed in MHz, + where "xx" is between 20 and 66 inclusive, + used when tuning chipset PIO modes. + For PCI bus, 25 is correct for a P75 system, + 30 is correct for P90,P120,P180 systems, + and 33 is used for P100,P133,P166 systems. + If in doubt, use idebus=33 for PCI. + As for VLB, it is safest to not specify it. + Bigger values are safer than smaller ones. + + "idex=noprobe" : do not attempt to access/use this interface + + "idex=base" : probe for an interface at the addr specified, + where "base" is usually 0x1f0 or 0x170 + and "ctl" is assumed to be "base"+0x206 + + "idex=base,ctl" : specify both base and ctl + + "idex=base,ctl,irq" : specify base, ctl, and irq number + + "idex=autotune" : driver will attempt to tune interface speed + to the fastest PIO mode supported, + for all drives on this interface. + Not fully supported by all chipset types, + and quite likely to cause trouble with + older/odd IDE drives. + + "idex=noautotune" : driver will NOT attempt to tune interface speed + This is the default for most chipsets, + except the cmd640. + + "idex=serialize" : do not overlap operations on idex. Please note + that you will have to specify this option for + both the respecitve primary and secondary channel + to take effect. + + "idex=four" : four drives on idex and ide(x^1) share same ports + + "idex=reset" : reset interface after probe + + "idex=dma" : automatically configure/use DMA if possible. + + "idex=ata66" : informs the interface that it has an 80c cable + for chipsets that are ATA-66 capable, but the + ability to bit test for detection is currently + unknown. + + "ide=reverse" : formerly called to pci sub-system, but now local. + + "ide=nodma" : disable DMA globally for the IDE subsystem. + +The following are valid ONLY on ide0, which usually corresponds +to the first ATA interface found on the particular host, and the defaults for +the base,ctl ports must not be altered. + + "ide0=dtc2278" : probe/support DTC2278 interface + "ide0=ht6560b" : probe/support HT6560B interface + "ide0=cmd640_vlb" : *REQUIRED* for VLB cards with the CMD640 chip + (not for PCI -- automatically detected) + "ide0=qd65xx" : probe/support qd65xx interface + "ide0=ali14xx" : probe/support ali14xx chipsets (ALI M1439/M1443/M1445) + "ide0=umc8672" : probe/support umc8672 chipsets + + "ide=doubler" : probe/support IDE doublers on Amiga + +There may be more options than shown -- use the source, Luke! + +Everything else is rejected with a "BAD OPTION" message. + +================================================================================ + +IDE ATAPI streaming tape driver +------------------------------- + +This driver is a part of the Linux ide driver and works in co-operation +with linux/drivers/block/ide.c. + +The driver, in co-operation with ide.c, basically traverses the +request-list for the block device interface. The character device +interface, on the other hand, creates new requests, adds them +to the request-list of the block device, and waits for their completion. + +Pipelined operation mode is now supported on both reads and writes. + +The block device major and minor numbers are determined from the +tape's relative position in the ide interfaces, as explained in ide.c. + +The character device interface consists of the following devices: + + ht0 major 37, minor 0 first IDE tape, rewind on close. + ht1 major 37, minor 1 second IDE tape, rewind on close. + ... + nht0 major 37, minor 128 first IDE tape, no rewind on close. + nht1 major 37, minor 129 second IDE tape, no rewind on close. + ... + +Run /dev/MAKEDEV to create the above entries. + +The general magnetic tape commands compatible interface, as defined by +include/linux/mtio.h, is accessible through the character device. + +General ide driver configuration options, such as the interrupt-unmask +flag, can be configured by issuing an ioctl to the block device interface, +as any other ide device. + +Our own ide-tape ioctl's can be issued to either the block device or +the character device interface. + +Maximal throughput with minimal bus load will usually be achieved in the +following scenario: + + 1. ide-tape is operating in the pipelined operation mode. + 2. No buffering is performed by the user backup program. + + + +================================================================================ + +Some Terminology +---------------- +IDE = Integrated Drive Electronics, meaning that each drive has a built-in +controller, which is why an "IDE interface card" is not a "controller card". + +ATA = AT (the old IBM 286 computer) Attachment Interface, a draft American +National Standard for connecting hard drives to PCs. This is the official +name for "IDE". + +The latest standards define some enhancements, known as the ATA-6 spec, +which grew out of vendor-specific "Enhanced IDE" (EIDE) implementations. + +ATAPI = ATA Packet Interface, a new protocol for controlling the drives, +similar to SCSI protocols, created at the same time as the ATA2 standard. +ATAPI is currently used for controlling CDROM, TAPE and FLOPPY (ZIP or +LS120/240) devices, removable R/W cartridges, and for high capacity hard disk +drives. + +mlord@pobox.com +-- + +Wed Apr 17 22:52:44 CEST 2002 edited by Marcin Dalecki, the current +maintainer. + +Wed Aug 20 22:31:29 CEST 2003 updated ide boot uptions to current ide.c +comments at 2.6.0-test4 time. Maciej Soltysiak <solt@dns.toxicfilms.tv> diff --git a/Documentation/infiniband/ipoib.txt b/Documentation/infiniband/ipoib.txt new file mode 100644 index 000000000000..18e184b56040 --- /dev/null +++ b/Documentation/infiniband/ipoib.txt @@ -0,0 +1,56 @@ +IP OVER INFINIBAND + + The ib_ipoib driver is an implementation of the IP over InfiniBand + protocol as specified by the latest Internet-Drafts issued by the + IETF ipoib working group. It is a "native" implementation in the + sense of setting the interface type to ARPHRD_INFINIBAND and the + hardware address length to 20 (earlier proprietary implementations + masqueraded to the kernel as ethernet interfaces). + +Partitions and P_Keys + + When the IPoIB driver is loaded, it creates one interface for each + port using the P_Key at index 0. To create an interface with a + different P_Key, write the desired P_Key into the main interface's + /sys/class/net/<intf name>/create_child file. For example: + + echo 0x8001 > /sys/class/net/ib0/create_child + + This will create an interface named ib0.8001 with P_Key 0x8001. To + remove a subinterface, use the "delete_child" file: + + echo 0x8001 > /sys/class/net/ib0/delete_child + + The P_Key for any interface is given by the "pkey" file, and the + main interface for a subinterface is in "parent." + +Debugging Information + + By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set + to 'y', tracing messages are compiled into the driver. They are + turned on by setting the module parameters debug_level and + mcast_debug_level to 1. These parameters can be controlled at + runtime through files in /sys/module/ib_ipoib/. + + CONFIG_INFINIBAND_IPOIB_DEBUG also enables the "ipoib_debugfs" + virtual filesystem. By mounting this filesystem, for example with + + mkdir -p /ipoib_debugfs + mount -t ipoib_debugfs none /ipoib_debufs + + it is possible to get statistics about multicast groups from the + files /ipoib_debugfs/ib0_mcg and so on. + + The performance impact of this option is negligible, so it + is safe to enable this option with debug_level set to 0 for normal + operation. + + CONFIG_INFINIBAND_IPOIB_DEBUG_DATA enables even more debug output in + the data path when data_debug_level is set to 1. However, even with + the output disabled, enabling this configuration option will affect + performance, because it adds tests to the fast path. + +References + + IETF IP over InfiniBand (ipoib) Working Group + http://ietf.org/html.charters/ipoib-charter.html diff --git a/Documentation/infiniband/sysfs.txt b/Documentation/infiniband/sysfs.txt new file mode 100644 index 000000000000..ddd519b72ee1 --- /dev/null +++ b/Documentation/infiniband/sysfs.txt @@ -0,0 +1,66 @@ +SYSFS FILES + + For each InfiniBand device, the InfiniBand drivers create the + following files under /sys/class/infiniband/<device name>: + + node_type - Node type (CA, switch or router) + node_guid - Node GUID + sys_image_guid - System image GUID + + In addition, there is a "ports" subdirectory, with one subdirectory + for each port. For example, if mthca0 is a 2-port HCA, there will + be two directories: + + /sys/class/infiniband/mthca0/ports/1 + /sys/class/infiniband/mthca0/ports/2 + + (A switch will only have a single "0" subdirectory for switch port + 0; no subdirectory is created for normal switch ports) + + In each port subdirectory, the following files are created: + + cap_mask - Port capability mask + lid - Port LID + lid_mask_count - Port LID mask count + rate - Port data rate (active width * active speed) + sm_lid - Subnet manager LID for port's subnet + sm_sl - Subnet manager SL for port's subnet + state - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER) + phys_state - Port physical state (Sleep, Polling, LinkUp, etc) + + There is also a "counters" subdirectory, with files + + VL15_dropped + excessive_buffer_overrun_errors + link_downed + link_error_recovery + local_link_integrity_errors + port_rcv_constraint_errors + port_rcv_data + port_rcv_errors + port_rcv_packets + port_rcv_remote_physical_errors + port_rcv_switch_relay_errors + port_xmit_constraint_errors + port_xmit_data + port_xmit_discards + port_xmit_packets + symbol_error + + Each of these files contains the corresponding value from the port's + Performance Management PortCounters attribute, as described in + section 16.1.3.5 of the InfiniBand Architecture Specification. + + The "pkeys" and "gids" subdirectories contain one file for each + entry in the port's P_Key or GID table respectively. For example, + ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key + table. + +MTHCA + + The Mellanox HCA driver also creates the files: + + hw_rev - Hardware revision number + fw_ver - Firmware version + hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)", + or "MT25208" diff --git a/Documentation/infiniband/user_mad.txt b/Documentation/infiniband/user_mad.txt new file mode 100644 index 000000000000..cae0c83f1ee9 --- /dev/null +++ b/Documentation/infiniband/user_mad.txt @@ -0,0 +1,99 @@ +USERSPACE MAD ACCESS + +Device files + + Each port of each InfiniBand device has a "umad" device and an + "issm" device attached. For example, a two-port HCA will have two + umad devices and two issm devices, while a switch will have one + device of each type (for switch port 0). + +Creating MAD agents + + A MAD agent can be created by filling in a struct ib_user_mad_reg_req + and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file + descriptor for the appropriate device file. If the registration + request succeeds, a 32-bit id will be returned in the structure. + For example: + + struct ib_user_mad_reg_req req = { /* ... */ }; + ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req); + if (!ret) + my_agent = req.id; + else + perror("agent register"); + + Agents can be unregistered with the IB_USER_MAD_UNREGISTER_AGENT + ioctl. Also, all agents registered through a file descriptor will + be unregistered when the descriptor is closed. + +Receiving MADs + + MADs are received using read(). The buffer passed to read() must be + large enough to hold at least one struct ib_user_mad. For example: + + struct ib_user_mad mad; + ret = read(fd, &mad, sizeof mad); + if (ret != sizeof mad) + perror("read"); + + In addition to the actual MAD contents, the other struct ib_user_mad + fields will be filled in with information on the received MAD. For + example, the remote LID will be in mad.lid. + + If a send times out, a receive will be generated with mad.status set + to ETIMEDOUT. Otherwise when a MAD has been successfully received, + mad.status will be 0. + + poll()/select() may be used to wait until a MAD can be read. + +Sending MADs + + MADs are sent using write(). The agent ID for sending should be + filled into the id field of the MAD, the destination LID should be + filled into the lid field, and so on. For example: + + struct ib_user_mad mad; + + /* fill in mad.data */ + + mad.id = my_agent; /* req.id from agent registration */ + mad.lid = my_dest; /* in network byte order... */ + /* etc. */ + + ret = write(fd, &mad, sizeof mad); + if (ret != sizeof mad) + perror("write"); + +Setting IsSM Capability Bit + + To set the IsSM capability bit for a port, simply open the + corresponding issm device file. If the IsSM bit is already set, + then the open call will block until the bit is cleared (or return + immediately with errno set to EAGAIN if the O_NONBLOCK flag is + passed to open()). The IsSM bit will be cleared when the issm file + is closed. No read, write or other operations can be performed on + the issm file. + +/dev files + + To create the appropriate character device files automatically with + udev, a rule like + + KERNEL="umad*", NAME="infiniband/%k" + KERNEL="issm*", NAME="infiniband/%k" + + can be used. This will create device nodes named + + /dev/infiniband/umad0 + /dev/infiniband/issm0 + + for the first port, and so on. The InfiniBand device and port + associated with these devices can be determined from the files + + /sys/class/infiniband_mad/umad0/ibdev + /sys/class/infiniband_mad/umad0/port + + and + + /sys/class/infiniband_mad/issm0/ibdev + /sys/class/infiniband_mad/issm0/port diff --git a/Documentation/initrd.txt b/Documentation/initrd.txt new file mode 100644 index 000000000000..7de1c80cd719 --- /dev/null +++ b/Documentation/initrd.txt @@ -0,0 +1,340 @@ +Using the initial RAM disk (initrd) +=================================== + +Written 1996,2000 by Werner Almesberger <werner.almesberger@epfl.ch> and + Hans Lermen <lermen@fgan.de> + + +initrd provides the capability to load a RAM disk by the boot loader. +This RAM disk can then be mounted as the root file system and programs +can be run from it. Afterwards, a new root file system can be mounted +from a different device. The previous root (from initrd) is then moved +to a directory and can be subsequently unmounted. + +initrd is mainly designed to allow system startup to occur in two phases, +where the kernel comes up with a minimum set of compiled-in drivers, and +where additional modules are loaded from initrd. + +This document gives a brief overview of the use of initrd. A more detailed +discussion of the boot process can be found in [1]. + + +Operation +--------- + +When using initrd, the system typically boots as follows: + + 1) the boot loader loads the kernel and the initial RAM disk + 2) the kernel converts initrd into a "normal" RAM disk and + frees the memory used by initrd + 3) initrd is mounted read-write as root + 4) /linuxrc is executed (this can be any valid executable, including + shell scripts; it is run with uid 0 and can do basically everything + init can do) + 5) linuxrc mounts the "real" root file system + 6) linuxrc places the root file system at the root directory using the + pivot_root system call + 7) the usual boot sequence (e.g. invocation of /sbin/init) is performed + on the root file system + 8) the initrd file system is removed + +Note that changing the root directory does not involve unmounting it. +It is therefore possible to leave processes running on initrd during that +procedure. Also note that file systems mounted under initrd continue to +be accessible. + + +Boot command-line options +------------------------- + +initrd adds the following new options: + + initrd=<path> (e.g. LOADLIN) + + Loads the specified file as the initial RAM disk. When using LILO, you + have to specify the RAM disk image file in /etc/lilo.conf, using the + INITRD configuration variable. + + noinitrd + + initrd data is preserved but it is not converted to a RAM disk and + the "normal" root file system is mounted. initrd data can be read + from /dev/initrd. Note that the data in initrd can have any structure + in this case and doesn't necessarily have to be a file system image. + This option is used mainly for debugging. + + Note: /dev/initrd is read-only and it can only be used once. As soon + as the last process has closed it, all data is freed and /dev/initrd + can't be opened anymore. + + root=/dev/ram0 (without devfs) + root=/dev/rd/0 (with devfs) + + initrd is mounted as root, and the normal boot procedure is followed, + with the RAM disk still mounted as root. + + +Installation +------------ + +First, a directory for the initrd file system has to be created on the +"normal" root file system, e.g. + +# mkdir /initrd + +The name is not relevant. More details can be found on the pivot_root(2) +man page. + +If the root file system is created during the boot procedure (i.e. if +you're building an install floppy), the root file system creation +procedure should create the /initrd directory. + +If initrd will not be mounted in some cases, its content is still +accessible if the following device has been created (note that this +does not work if using devfs): + +# mknod /dev/initrd b 1 250 +# chmod 400 /dev/initrd + +Second, the kernel has to be compiled with RAM disk support and with +support for the initial RAM disk enabled. Also, at least all components +needed to execute programs from initrd (e.g. executable format and file +system) must be compiled into the kernel. + +Third, you have to create the RAM disk image. This is done by creating a +file system on a block device, copying files to it as needed, and then +copying the content of the block device to the initrd file. With recent +kernels, at least three types of devices are suitable for that: + + - a floppy disk (works everywhere but it's painfully slow) + - a RAM disk (fast, but allocates physical memory) + - a loopback device (the most elegant solution) + +We'll describe the loopback device method: + + 1) make sure loopback block devices are configured into the kernel + 2) create an empty file system of the appropriate size, e.g. + # dd if=/dev/zero of=initrd bs=300k count=1 + # mke2fs -F -m0 initrd + (if space is critical, you may want to use the Minix FS instead of Ext2) + 3) mount the file system, e.g. + # mount -t ext2 -o loop initrd /mnt + 4) create the console device (not necessary if using devfs, but it can't + hurt to do it anyway): + # mkdir /mnt/dev + # mknod /mnt/dev/console c 5 1 + 5) copy all the files that are needed to properly use the initrd + environment. Don't forget the most important file, /linuxrc + Note that /linuxrc's permissions must include "x" (execute). + 6) correct operation the initrd environment can frequently be tested + even without rebooting with the command + # chroot /mnt /linuxrc + This is of course limited to initrds that do not interfere with the + general system state (e.g. by reconfiguring network interfaces, + overwriting mounted devices, trying to start already running demons, + etc. Note however that it is usually possible to use pivot_root in + such a chroot'ed initrd environment.) + 7) unmount the file system + # umount /mnt + 8) the initrd is now in the file "initrd". Optionally, it can now be + compressed + # gzip -9 initrd + +For experimenting with initrd, you may want to take a rescue floppy and +only add a symbolic link from /linuxrc to /bin/sh. Alternatively, you +can try the experimental newlib environment [2] to create a small +initrd. + +Finally, you have to boot the kernel and load initrd. Almost all Linux +boot loaders support initrd. Since the boot process is still compatible +with an older mechanism, the following boot command line parameters +have to be given: + + root=/dev/ram0 init=/linuxrc rw + +if not using devfs, or + + root=/dev/rd/0 init=/linuxrc rw + +if using devfs. (rw is only necessary if writing to the initrd file +system.) + +With LOADLIN, you simply execute + + LOADLIN <kernel> initrd=<disk_image> +e.g. LOADLIN C:\LINUX\BZIMAGE initrd=C:\LINUX\INITRD.GZ root=/dev/ram0 + init=/linuxrc rw + +With LILO, you add the option INITRD=<path> to either the global section +or to the section of the respective kernel in /etc/lilo.conf, and pass +the options using APPEND, e.g. + + image = /bzImage + initrd = /boot/initrd.gz + append = "root=/dev/ram0 init=/linuxrc rw" + +and run /sbin/lilo + +For other boot loaders, please refer to the respective documentation. + +Now you can boot and enjoy using initrd. + + +Changing the root device +------------------------ + +When finished with its duties, linuxrc typically changes the root device +and proceeds with starting the Linux system on the "real" root device. + +The procedure involves the following steps: + - mounting the new root file system + - turning it into the root file system + - removing all accesses to the old (initrd) root file system + - unmounting the initrd file system and de-allocating the RAM disk + +Mounting the new root file system is easy: it just needs to be mounted on +a directory under the current root. Example: + +# mkdir /new-root +# mount -o ro /dev/hda1 /new-root + +The root change is accomplished with the pivot_root system call, which +is also available via the pivot_root utility (see pivot_root(8) man +page; pivot_root is distributed with util-linux version 2.10h or higher +[3]). pivot_root moves the current root to a directory under the new +root, and puts the new root at its place. The directory for the old root +must exist before calling pivot_root. Example: + +# cd /new-root +# mkdir initrd +# pivot_root . initrd + +Now, the linuxrc process may still access the old root via its +executable, shared libraries, standard input/output/error, and its +current root directory. All these references are dropped by the +following command: + +# exec chroot . what-follows <dev/console >dev/console 2>&1 + +Where what-follows is a program under the new root, e.g. /sbin/init +If the new root file system will be used with devfs and has no valid +/dev directory, devfs must be mounted before invoking chroot in order to +provide /dev/console. + +Note: implementation details of pivot_root may change with time. In order +to ensure compatibility, the following points should be observed: + + - before calling pivot_root, the current directory of the invoking + process should point to the new root directory + - use . as the first argument, and the _relative_ path of the directory + for the old root as the second argument + - a chroot program must be available under the old and the new root + - chroot to the new root afterwards + - use relative paths for dev/console in the exec command + +Now, the initrd can be unmounted and the memory allocated by the RAM +disk can be freed: + +# umount /initrd +# blockdev --flushbufs /dev/ram0 # /dev/rd/0 if using devfs + +It is also possible to use initrd with an NFS-mounted root, see the +pivot_root(8) man page for details. + +Note: if linuxrc or any program exec'ed from it terminates for some +reason, the old change_root mechanism is invoked (see section "Obsolete +root change mechanism"). + + +Usage scenarios +--------------- + +The main motivation for implementing initrd was to allow for modular +kernel configuration at system installation. The procedure would work +as follows: + + 1) system boots from floppy or other media with a minimal kernel + (e.g. support for RAM disks, initrd, a.out, and the Ext2 FS) and + loads initrd + 2) /linuxrc determines what is needed to (1) mount the "real" root FS + (i.e. device type, device drivers, file system) and (2) the + distribution media (e.g. CD-ROM, network, tape, ...). This can be + done by asking the user, by auto-probing, or by using a hybrid + approach. + 3) /linuxrc loads the necessary kernel modules + 4) /linuxrc creates and populates the root file system (this doesn't + have to be a very usable system yet) + 5) /linuxrc invokes pivot_root to change the root file system and + execs - via chroot - a program that continues the installation + 6) the boot loader is installed + 7) the boot loader is configured to load an initrd with the set of + modules that was used to bring up the system (e.g. /initrd can be + modified, then unmounted, and finally, the image is written from + /dev/ram0 or /dev/rd/0 to a file) + 8) now the system is bootable and additional installation tasks can be + performed + +The key role of initrd here is to re-use the configuration data during +normal system operation without requiring the use of a bloated "generic" +kernel or re-compiling or re-linking the kernel. + +A second scenario is for installations where Linux runs on systems with +different hardware configurations in a single administrative domain. In +such cases, it is desirable to generate only a small set of kernels +(ideally only one) and to keep the system-specific part of configuration +information as small as possible. In this case, a common initrd could be +generated with all the necessary modules. Then, only /linuxrc or a file +read by it would have to be different. + +A third scenario are more convenient recovery disks, because information +like the location of the root FS partition doesn't have to be provided at +boot time, but the system loaded from initrd can invoke a user-friendly +dialog and it can also perform some sanity checks (or even some form of +auto-detection). + +Last not least, CD-ROM distributors may use it for better installation +from CD, e.g. by using a boot floppy and bootstrapping a bigger RAM disk +via initrd from CD; or by booting via a loader like LOADLIN or directly +from the CD-ROM, and loading the RAM disk from CD without need of +floppies. + + +Obsolete root change mechanism +------------------------------ + +The following mechanism was used before the introduction of pivot_root. +Current kernels still support it, but you should _not_ rely on its +continued availability. + +It works by mounting the "real" root device (i.e. the one set with rdev +in the kernel image or with root=... at the boot command line) as the +root file system when linuxrc exits. The initrd file system is then +unmounted, or, if it is still busy, moved to a directory /initrd, if +such a directory exists on the new root file system. + +In order to use this mechanism, you do not have to specify the boot +command options root, init, or rw. (If specified, they will affect +the real root file system, not the initrd environment.) + +If /proc is mounted, the "real" root device can be changed from within +linuxrc by writing the number of the new root FS device to the special +file /proc/sys/kernel/real-root-dev, e.g. + + # echo 0x301 >/proc/sys/kernel/real-root-dev + +Note that the mechanism is incompatible with NFS and similar file +systems. + +This old, deprecated mechanism is commonly called "change_root", while +the new, supported mechanism is called "pivot_root". + + +Resources +--------- + +[1] Almesberger, Werner; "Booting Linux: The History and the Future" + http://www.almesberger.net/cv/papers/ols2k-9.ps.gz +[2] newlib package (experimental), with initrd example + http://sources.redhat.com/newlib/ +[3] Brouwer, Andries; "util-linux: Miscellaneous utilities for Linux" + ftp://ftp.win.tue.nl/pub/linux-local/utils/util-linux/ diff --git a/Documentation/input/amijoy.txt b/Documentation/input/amijoy.txt new file mode 100644 index 000000000000..3b8b2d43a68e --- /dev/null +++ b/Documentation/input/amijoy.txt @@ -0,0 +1,184 @@ +Amiga 4-joystick parport extension +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Parallel port pins: + + (2) - Up1 (6) - Up2 + (3) - Down1 (7) - Down2 + (4) - Left1 (8) - Left2 + (5) - Right1 (9) - Right2 +(13) - Fire1 (11) - Fire2 +(18) - Gnd1 (18) - Gnd2 + +Amiga digital joystick pinout +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +(1) - Up +(2) - Down +(3) - Left +(4) - Right +(5) - n/c +(6) - Fire button +(7) - +5V (50mA) +(8) - Gnd +(9) - Thumb button + +Amiga mouse pinout +~~~~~~~~~~~~~~~~~~ +(1) - V-pulse +(2) - H-pulse +(3) - VQ-pulse +(4) - HQ-pulse +(5) - Middle button +(6) - Left button +(7) - +5V (50mA) +(8) - Gnd +(9) - Right button + +Amiga analog joystick pinout +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +(1) - Top button +(2) - Top2 button +(3) - Trigger button +(4) - Thumb button +(5) - Analog X +(6) - n/c +(7) - +5V (50mA) +(8) - Gnd +(9) - Analog Y + +Amiga lightpen pinout +~~~~~~~~~~~~~~~~~~~~~ +(1) - n/c +(2) - n/c +(3) - n/c +(4) - n/c +(5) - Touch button +(6) - /Beamtrigger +(7) - +5V (50mA) +(8) - Gnd +(9) - Stylus button + +------------------------------------------------------------------------------- + +NAME rev ADDR type chip Description +JOY0DAT 00A R Denise Joystick-mouse 0 data (left vert, horiz) +JOY1DAT 00C R Denise Joystick-mouse 1 data (right vert,horiz) + + These addresses each read a 16 bit register. These in turn + are loaded from the MDAT serial stream and are clocked in on + the rising edge of SCLK. MLD output is used to parallel load + the external parallel-to-serial converter.This in turn is + loaded with the 4 quadrature inputs from each of two game + controller ports (8 total) plus 8 miscellaneous control bits + which are new for LISA and can be read in upper 8 bits of + LISAID. + Register bits are as follows: + Mouse counter usage (pins 1,3 =Yclock, pins 2,4 =Xclock) + + BIT# 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 +JOY0DAT Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 X7 X6 X5 X4 X3 X2 X1 X0 +JOY1DAT Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 X7 X6 X5 X4 X3 X2 X1 X0 + + 0=LEFT CONTROLLER PAIR, 1=RIGHT CONTROLLER PAIR. + (4 counters total).The bit usage for both left and right + addresses is shown below. Each 6 bit counter (Y7-Y2,X7-X2) is + clocked by 2 of the signals input from the mouse serial + stream. Starting with first bit recived: + + +-------------------+-----------------------------------------+ + | Serial | Bit Name | Description | + +--------+----------+-----------------------------------------+ + | 0 | M0H | JOY0DAT Horizontal Clock | + | 1 | M0HQ | JOY0DAT Horizontal Clock (quadrature) | + | 2 | M0V | JOY0DAT Vertical Clock | + | 3 | M0VQ | JOY0DAT Vertical Clock (quadrature) | + | 4 | M1V | JOY1DAT Horizontall Clock | + | 5 | M1VQ | JOY1DAT Horizontall Clock (quadrature) | + | 6 | M1V | JOY1DAT Vertical Clock | + | 7 | M1VQ | JOY1DAT Vertical Clock (quadrature) | + +--------+----------+-----------------------------------------+ + + Bits 1 and 0 of each counter (Y1-Y0,X1-X0) may be + read to determine the state of the related input signal pair. + This allows these pins to double as joystick switch inputs. + Joystick switch closures can be deciphered as follows: + + +------------+------+---------------------------------+ + | Directions | Pin# | Counter bits | + +------------+------+---------------------------------+ + | Forward | 1 | Y1 xor Y0 (BIT#09 xor BIT#08) | + | Left | 3 | Y1 | + | Back | 2 | X1 xor X0 (BIT#01 xor BIT#00) | + | Right | 4 | X1 | + +------------+------+---------------------------------+ + +------------------------------------------------------------------------------- + +NAME rev ADDR type chip Description +JOYTEST 036 W Denise Write to all 4 joystick-mouse counters at once. + + Mouse counter write test data: + BIT# 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 + JOYxDAT Y7 Y6 Y5 Y4 Y3 Y2 xx xx X7 X6 X5 X4 X3 X2 xx xx + JOYxDAT Y7 Y6 Y5 Y4 Y3 Y2 xx xx X7 X6 X5 X4 X3 X2 xx xx + +------------------------------------------------------------------------------- + +NAME rev ADDR type chip Description +POT0DAT h 012 R Paula Pot counter data left pair (vert, horiz) +POT1DAT h 014 R Paula Pot counter data right pair (vert,horiz) + + These addresses each read a pair of 8 bit pot counters. + (4 counters total). The bit assignment for both + addresses is shown below. The counters are stopped by signals + from 2 controller connectors (left-right) with 2 pins each. + + BIT# 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 + RIGHT Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 X7 X6 X5 X4 X3 X2 X1 X0 + LEFT Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 X7 X6 X5 X4 X3 X2 X1 X0 + + +--------------------------+-------+ + | CONNECTORS | PAULA | + +-------+------+-----+-----+-------+ + | Loc. | Dir. | Sym | pin | pin | + +-------+------+-----+-----+-------+ + | RIGHT | Y | RX | 9 | 33 | + | RIGHT | X | RX | 5 | 32 | + | LEFT | Y | LY | 9 | 36 | + | LEFT | X | LX | 5 | 35 | + +-------+------+-----+-----+-------+ + + With normal (NTSC or PAL) horiz. line rate, the pots will + give a full scale (FF) reading with about 500kohms in one + frame time. With proportionally faster horiz line times, + the counters will count proportionally faster. + This should be noted when doing variable beam displays. + +------------------------------------------------------------------------------- + +NAME rev ADDR type chip Description +POTGO 034 W Paula Pot port (4 bit) bi-direction and data, and pot counter start. + +------------------------------------------------------------------------------- + +NAME rev ADDR type chip Description +POTINP 016 R Paula Pot pin data read + + This register controls a 4 bit bi-direction I/O port + that shares the same 4 pins as the 4 pot counters above. + + +-------+----------+---------------------------------------------+ + | BIT# | FUNCTION | DESCRIPTION | + +-------+----------+---------------------------------------------+ + | 15 | OUTRY | Output enable for Paula pin 33 | + | 14 | DATRY | I/O data Paula pin 33 | + | 13 | OUTRX | Output enable for Paula pin 32 | + | 12 | DATRX | I/O data Paula pin 32 | + | 11 | OUTLY | Out put enable for Paula pin 36 | + | 10 | DATLY | I/O data Paula pin 36 | + | 09 | OUTLX | Output enable for Paula pin 35 | + | 08 | DATLX | I/O data Paula pin 35 | + | 07-01 | X | Not used | + | 00 | START | Start pots (dump capacitors,start counters) | + +-------+----------+---------------------------------------------+ + +------------------------------------------------------------------------------- diff --git a/Documentation/input/atarikbd.txt b/Documentation/input/atarikbd.txt new file mode 100644 index 000000000000..8fb896c74114 --- /dev/null +++ b/Documentation/input/atarikbd.txt @@ -0,0 +1,709 @@ +Intelligent Keyboard (ikbd) Protocol + + +1. Introduction + +The Atari Corp. Intelligent Keyboard (ikbd) is a general purpose keyboard +controller that is flexible enough that it can be used in a variety of +products without modification. The keyboard, with its microcontroller, +provides a convenient connection point for a mouse and switch-type joysticks. +The ikbd processor also maintains a time-of-day clock with one second +resolution. +The ikbd has been designed to be general enough that it can be used with a +ariety of new computer products. Product variations in a number of +keyswitches, mouse resolution, etc. can be accommodated. +The ikbd communicates with the main processor over a high speed bi-directional +serial interface. It can function in a variety of modes to facilitate +different applications of the keyboard, joysticks, or mouse. Limited use of +the controller is possible in applications in which only a unidirectional +communications medium is available by carefully designing the default modes. + +3. Keyboard + +The keyboard always returns key make/break scan codes. The ikbd generates +keyboard scan codes for each key press and release. The key scan make (key +closure) codes start at 1, and are defined in Appendix A. For example, the +ISO key position in the scan code table should exist even if no keyswitch +exists in that position on a particular keyboard. The break code for each key +is obtained by ORing 0x80 with the make code. + +The special codes 0xF6 through 0xFF are reserved for use as follows: + 0xF6 status report + 0xF7 absolute mouse position record + 0xF8-0xFB relative mouse position records(lsbs determind by + mouse button states) + 0xFC time-of-day + 0xFD joystick report (both sticks) + 0xFE joystick 0 event + 0xFF joystick 1 event + +The two shift keys return different scan codes in this mode. The ENTER key +and the RETurn key are also distinct. + +4. Mouse + +The mouse port should be capable of supporting a mouse with resolution of +approximately 200 counts (phase changes or 'clicks') per inch of travel. The +mouse should be scanned at a rate that will permit accurate tracking at +velocities up to 10 inches per second. +The ikbd can report mouse motion in three distinctly different ways. It can +report relative motion, absolute motion in a coordinate system maintained +within the ikbd, or by converting mouse motion into keyboard cursor control +key equivalents. +The mouse buttons can be treated as part of the mouse or as additional +keyboard keys. + +4.1 Relative Position Reporting + +In relative position mode, the ikbd will return relative mouse position +records whenever a mouse event occurs. A mouse event consists of a mouse +button being pressed or released, or motion in either axis exceeding a +settable threshold of motion. Regardless of the threshold, all bits of +resolution are returned to the host computer. +Note that the ikbd may return mouse relative position reports with +significantly more than the threshold delta x or y. This may happen since no +relative mouse motion events will be generated: (a) while the keyboard has +been 'paused' ( the event will be stored until keyboard communications is +resumed) (b) while any event is being transmitted. + +The relative mouse position record is a three byte record of the form +(regardless of keyboard mode): + %111110xy ; mouse position record flag + ; where y is the right button state + ; and x is the left button state + X ; delta x as twos complement integer + Y ; delta y as twos complement integer + +Note that the value of the button state bits should be valid even if the +MOUSE BUTTON ACTION has set the buttons to act like part of the keyboard. +If the accumulated motion before the report packet is generated exceeds the ++127...-128 range, the motion is broken into multiple packets. +Note that the sign of the delta y reported is a function of the Y origin +selected. + +4.2 Absolute Position reporting + +The ikbd can also maintain absolute mouse position. Commands exist for +reseting the mouse position, setting X/Y scaling, and interrogating the +current mouse position. + +4.3 Mouse Cursor Key Mode + +The ikbd can translate mouse motion into the equivalent cursor keystrokes. +The number of mouse clicks per keystroke is independently programmable in +each axis. The ikbd internally maintains mouse motion information to the +highest resolution available, and merely generates a pair of cursor key events +for each multiple of the scale factor. +Mouse motion produces the cursor key make code immediately followed by the +break code for the appropriate cursor key. The mouse buttons produce scan +codes above those normally assigned for the largest envisioned keyboard (i.e. +LEFT=0x74 & RIGHT=0x75). + +5. Joystick + +5.1 Joystick Event Reporting + +In this mode, the ikbd generates a record whever the joystick position is +changed (i.e. for each opening or closing of a joystick switch or trigger). + +The joystick event record is two bytes of the form: + %1111111x ; Joystick event marker + ; where x is Joystick 0 or 1 + %x000yyyy ; where yyyy is the stick position + ; and x is the trigger + +5.2 Joystick Interrogation + +The current state of the joystick ports may be interrogated at any time in +this mode by sending an 'Interrogate Joystick' command to the ikbd. + +The ikbd response to joystick interrogation is a three byte report of the form + 0xFD ; joystick report header + %x000yyyy ; Joystick 0 + %x000yyyy ; Joystick 1 + ; where x is the trigger + ; and yyy is the stick position + +5.3 Joystick Monitoring + +A mode is available that devotes nearly all of the keyboard communications +time to reporting the state of the joystick ports at a user specifiable rate. +It remains in this mode until reset or commanded into another mode. The PAUSE +command in this mode not only stop the output but also temporarily stops +scanning the joysticks (samples are not queued). + +5.4 Fire Button Monitoring + +A mode is provided to permit monitoring a single input bit at a high rate. In +this mode the ikbd monitors the state of the Joystick 1 fire button at the +maximum rate permitted by the serial communication channel. The data is packed +8 bits per byte for transmission to the host. The ikbd remains in this mode +until reset or commanded into another mode. The PAUSE command in this mode not +only stops the output but also temporarily stops scanning the button (samples +are not queued). + +5.5 Joystick Key Code Mode + +The ikbd may be commanded to translate the use of either joystick into the +equivalent cursor control keystroke(s). The ikbd provides a single breakpoint +velocity joystick cursor. +Joystick events produce the make code, immediately followed by the break code +for the appropriate cursor motion keys. The trigger or fire buttons of the +joysticks produce pseudo key scan codes above those used by the largest key +matrix envisioned (i.e. JOYSTICK0=0x74, JOYSTICK1=0x75). + +6. Time-of-Day Clock + +The ikbd also maintains a time-of-day clock for the system. Commands are +available to set and interrogate the timer-of-day clock. Time-keeping is +maintained down to a resolution of one second. + +7. Status Inquiries + +The current state of ikbd modes and parameters may be found by sending status +inquiry commands that correspond to the ikbd set commands. + +8. Power-Up Mode + +The keyboard controller will perform a simple self-test on power-up to detect +major controller faults (ROM checksum and RAM test) and such things as stuck +keys. Any keys down at power-up are presumed to be stuck, and their BREAK +(sic) code is returned (which without the preceding MAKE code is a flag for a +keyboard error). If the controller self-test completes without error, the code +0xF0 is returned. (This code will be used to indicate the version/rlease of +the ikbd controller. The first release of the ikbd is version 0xF0, should +there be a second release it will be 0xF1, and so on.) +The ikbd defaults to a mouse position reporting with threshold of 1 unit in +either axis and the Y=0 origin at the top of the screen, and joystick event +reporting mode for joystick 1, with both buttons being logically assigned to +the mouse. After any joystick command, the ikbd assumes that joysticks are +connected to both Joystick0 and Joystick1. Any mouse command (except MOUSE +DISABLE) then causes port 0 to again be scanned as if it were a mouse, and +both buttons are logically connected to it. If a mouse diable command is +received while port 0 is presumed to be a mouse, the button is logically +assigned to Joystick1 ( until the mouse is reenabled by another mouse command). + +9. ikbd Command Set + +This section contains a list of commands that can be sent to the ikbd. Command +codes (such as 0x00) which are not specified should perform no operation +(NOPs). + +9.1 RESET + + 0x80 + 0x01 + +N.B. The RESET command is the only two byte command understood by the ikbd. +Any byte following an 0x80 command byte other than 0x01 is ignored (and causes +the 0x80 to be ignored). +A reset may also be caused by sending a break lasting at least 200mS to the +ikbd. +Executing the RESET command returns the keyboard to its default (power-up) +mode and parameter settings. It does not affect the time-of-day clock. +The RESET command or function causes the ikbd to perform a simple self-test. +If the test is successful, the ikbd will send the code of 0xF0 within 300mS +of receipt of the RESET command (or the end of the break, or power-up). The +ikbd will then scan the key matrix for any stuck (closed) keys. Any keys found +closed will cause the break scan code to be generated (the break code arriving +without being preceded by the make code is a flag for a key matrix error). + +9.2. SET MOUSE BUTTON ACTION + + 0x07 + %00000mss ; mouse button action + ; (m is presumed = 1 when in MOUSE KEYCODE mode) + ; mss=0xy, mouse button press or release causes mouse + ; position report + ; where y=1, mouse key press causes absolute report + ; and x=1, mouse key release causes absolute report + ; mss=100, mouse buttons act like keys + +This command sets how the ikbd should treat the buttons on the mouse. The +default mouse button action mode is %00000000, the buttons are treated as part +of the mouse logically. +When buttons act like keys, LEFT=0x74 & RIGHT=0x75. + +9.3 SET RELATIVE MOUSE POSITION REPORTING + + 0x08 + +Set relative mouse position reporting. (DEFAULT) Mouse position packets are +generated asynchronously by the ikbd whenever motion exceeds the setable +threshold in either axis (see SET MOUSE THRESHOLD). Depending upon the mouse +key mode, mouse position reports may also be generated when either mouse +button is pressed or released. Otherwise the mouse buttons behave as if they +were keyboard keys. + +9.4 SET ABSOLUTE MOUSE POSITIONING + + 0x09 + XMSB ; X maximum (in scaled mouse clicks) + XLSB + YMSB ; Y maximum (in scaled mouse clicks) + YLSB + +Set absolute mouse position maintenance. Resets the ikbd maintained X and Y +coordinates. +In this mode, the value of the internally maintained coordinates does NOT wrap +between 0 and large positive numbers. Excess motion below 0 is ignored. The +command sets the maximum positive value that can be attained in the scaled +coordinate system. Motion beyond that value is also ignored. + +9.5 SET MOUSE KEYCODE MOSE + + 0x0A + deltax ; distance in X clicks to return (LEFT) or (RIGHT) + deltay ; distance in Y clicks to return (UP) or (DOWN) + +Set mouse monitoring routines to return cursor motion keycodes instead of +either RELATIVE or ABSOLUTE motion records. The ikbd returns the appropriate +cursor keycode after mouse travel exceeding the user specified deltas in +either axis. When the keyboard is in key scan code mode, mouse motion will +cause the make code immediately followed by the break code. Note that this +command is not affected by the mouse motion origin. + +9..6 SET MOUSE THRESHOLD + + 0x0B + X ; x threshold in mouse ticks (positive integers) + Y ; y threshold in mouse ticks (positive integers) + +This command sets the threshold before a mouse event is generated. Note that +it does NOT affect the resolution of the data returned to the host. This +command is valid only in RELATIVE MOUSE POSITIONING mode. The thresholds +default to 1 at RESET (or power-up). + +9.7 SET MOUSE SCALE + + 0x0C + X ; horizontal mouse ticks per internel X + Y ; vertical mouse ticks per internel Y + +This command sets the scale factor for the ABSOLUTE MOUSE POSITIONING mode. +In this mode, the specified number of mouse phase changes ('clicks') must +occur before the internally maintained coordinate is changed by one +(independently scaled for each axis). Remember that the mouse position +information is available only by interrogating the ikbd in the ABSOLUTE MOUSE +POSITIONING mode unless the ikbd has been commanded to report on button press +or release (see SET MOSE BUTTON ACTION). + +9.8 INTERROGATE MOUSE POSITION + + 0x0D + Returns: + 0xF7 ; absolute mouse position header + BUTTONS + 0000dcba ; where a is right button down since last interrogation + ; b is right button up since last + ; c is left button down since last + ; d is left button up since last + XMSB ; X coordinate + XLSB + YMSB ; Y coordinate + YLSB + +The INTERROGATE MOUSE POSITION command is valid when in the ABSOLUTE MOUSE +POSITIONING mode, regardless of the setting of the MOUSE BUTTON ACTION. + +9.9 LOAD MOUSE POSITION + + 0x0E + 0x00 ; filler + XMSB ; X coordinate + XLSB ; (in scaled coordinate system) + YMSB ; Y coordinate + YLSB + +This command allows the user to preset the internally maintained absolute +mouse position. + +9.10 SET Y=0 AT BOTTOM + + 0x0F + +This command makes the origin of the Y axis to be at the bottom of the +logical coordinate system internel to the ikbd for all relative or absolute +mouse motion. This causes mouse motion toward the user to be negative in sign +and away from the user to be positive. + +9.11 SET Y=0 AT TOP + + 0x10 + +Makes the origin of the Y axis to be at the top of the logical coordinate +system within the ikbd for all relative or absolute mouse motion. (DEFAULT) +This causes mouse motion toward the user to be positive in sign and away from +the user to be negative. + +9.12 RESUME + + 0x11 + +Resume sending data to the host. Since any command received by the ikbd after +its output has been paused also causes an implicit RESUME this command can be +thought of as a NO OPERATION command. If this command is received by the ikbd +and it is not PAUSED, it is simply ignored. + +9.13 DISABLE MOUSE + + 0x12 + +All mouse event reporting is disabled (and scanning may be internally +disabled). Any valid mouse mode command resumes mouse motion monitoring. (The +valid mouse mode commands are SET RELATIVE MOUSE POSITION REPORTING, SET +ABSOLUTE MOUSE POSITIONING, and SET MOUSE KEYCODE MODE. ) +N.B. If the mouse buttons have been commanded to act like keyboard keys, this +command DOES affect their actions. + +9.14 PAUSE OUTPUT + + 0x13 + +Stop sending data to the host until another valid command is received. Key +matrix activity is still monitored and scan codes or ASCII characters enqueued +(up to the maximum supported by the microcontroller) to be sent when the host +allows the output to be resumed. If in the JOYSTICK EVENT REPORTING mode, +joystick events are also queued. +Mouse motion should be accumulated while the output is paused. If the ikbd is +in RELATIVE MOUSE POSITIONING REPORTING mode, motion is accumulated beyond the +normal threshold limits to produce the minimum number of packets necessary for +transmission when output is resumed. Pressing or releasing either mouse button +causes any accumulated motion to be immediately queued as packets, if the +mouse is in RELATIVE MOUSE POSITION REPORTING mode. +Because of the limitations of the microcontroller memory this command should +be used sparingly, and the output should not be shut of for more than <tbd> +milliseconds at a time. +The output is stopped only at the end of the current 'even'. If the PAUSE +OUTPUT command is received in the middle of a multiple byte report, the packet +will still be transmitted to conclusion and then the PAUSE will take effect. +When the ikbd is in either the JOYSTICK MONITORING mode or the FIRE BUTTON +MONITORING mode, the PAUSE OUTPUT command also temporarily stops the +monitoring process (i.e. the samples are not enqueued for transmission). + +0.15 SET JOYSTICK EVENT REPORTING + + 0x14 + +Enter JOYSTICK EVENT REPORTING mode (DEFAULT). Each opening or closure of a +joystick switch or trigger causes a joystick event record to be generated. + +9.16 SET JOYSTICK INTERROGATION MODE + + 0x15 + +Disables JOYSTICK EVENT REPORTING. Host must send individual JOYSTICK +INTERROGATE commands to sense joystick state. + +9.17 JOYSTICK INTERROGATE + + 0x16 + +Return a record indicating the current state of the joysticks. This command +is valid in either the JOYSTICK EVENT REPORTING mode or the JOYSTICK +INTERROGATION MODE. + +9.18 SET JOYSTICK MONITORING + + 0x17 + rate ; time between samples in hundreths of a second + Returns: (in packets of two as long as in mode) + %000000xy ; where y is JOYSTICK1 Fire button + ; and x is JOYSTICK0 Fire button + %nnnnmmmm ; where m is JOYSTICK1 state + ; and n is JOYSTICK0 state + +Sets the ikbd to do nothing but monitor the serial command lne, maintain the +time-of-day clock, and monitor the joystick. The rate sets the interval +between joystick samples. +N.B. The user should not set the rate higher than the serial communications +channel will allow the 2 bytes packets to be transmitted. + +9.19 SET FIRE BUTTON MONITORING + + 0x18 + Returns: (as long as in mode) + %bbbbbbbb ; state of the JOYSTICK1 fire button packed + ; 8 bits per byte, the first sample if the MSB + +Set the ikbd to do nothing but monitor the serial command line, maintain the +time-of-day clock, and monitor the fire button on Joystick 1. The fire button +is scanned at a rate that causes 8 samples to be made in the time it takes for +the previous byte to be sent to the host (i.e. scan rate = 8/10 * baud rate). +The sample interval should be as constant as possible. + +9.20 SET JOYSTICK KEYCODE MODE + + 0x19 + RX ; length of time (in tenths of seconds) until + ; horizontal velocity breakpoint is reached + RY ; length of time (in tenths of seconds) until + ; vertical velocity breakpoint is reached + TX ; length (in tenths of seconds) of joystick closure + ; until horizontal cursor key is generated before RX + ; has elapsed + TY ; length (in tenths of seconds) of joystick closure + ; until vertical cursor key is generated before RY + ; has elapsed + VX ; length (in tenths of seconds) of joystick closure + ; until horizontal cursor keystokes are generated + ; after RX has elapsed + VY ; length (in tenths of seconds) of joystick closure + ; until vertical cursor keystokes are generated + ; after RY has elapsed + +In this mode, joystick 0 is scanned in a way that simulates cursor keystrokes. +On initial closure, a keystroke pair (make/break) is generated. Then up to Rn +tenths of seconds later, keystroke pairs are generated every Tn tenths of +seconds. After the Rn breakpoint is reached, keystroke pairs are generated +every Vn tenths of seconds. This provides a velocity (auto-repeat) breakpoint +feature. +Note that by setting RX and/or Ry to zero, the velocity feature can be +disabled. The values of TX and TY then become meaningless, and the generation +of cursor 'keystrokes' is set by VX and VY. + +9.21 DISABLE JOYSTICKS + + 0x1A + +Disable the generation of any joystick events (and scanning may be internally +disabled). Any valid joystick mode command resumes joystick monitoring. (The +joystick mode commands are SET JOYSTICK EVENT REPORTING, SET JOYSTICK +INTERROGATION MODE, SET JOYSTICK MONITORING, SET FIRE BUTTON MONITORING, and +SET JOYSTICK KEYCODE MODE.) + +9.22 TIME-OF-DAY CLOCK SET + + 0x1B + YY ; year (2 least significant digits) + MM ; month + DD ; day + hh ; hour + mm ; minute + ss ; second + +All time-of-day data should be sent to the ikbd in packed BCD format. +Any digit that is not a valid BCD digit should be treated as a 'don't care' +and not alter that particular field of the date or time. This permits setting +only some subfields of the time-of-day clock. + +9.23 INTERROGATE TIME-OF-DAT CLOCK + + 0x1C + Returns: + 0xFC ; time-of-day event header + YY ; year (2 least significant digits) + MM ; month + DD ; day + hh ; hour + mm ; minute + ss ; second + + All time-of-day is sent in packed BCD format. + +9.24 MEMORY LOAD + + 0x20 + ADRMSB ; address in controller + ADRLSB ; memory to be loaded + NUM ; number of bytes (0-128) + { data } + +This command permits the host to load arbitrary values into the ikbd +controller memory. The time between data bytes must be less than 20ms. + +9.25 MEMORY READ + + 0x21 + ADRMSB ; address in controller + ADRLSB ; memory to be read + Returns: + 0xF6 ; status header + 0x20 ; memory access + { data } ; 6 data bytes starting at ADR + +This comand permits the host to read from the ikbd controller memory. + +9.26 CONTROLLER EXECUTE + + 0x22 + ADRMSB ; address of subroutine in + ADRLSB ; controller memory to be called + +This command allows the host to command the execution of a subroutine in the +ikbd controller memory. + +9.27 STATUS INQUIRIES + + Status commands are formed by inclusively ORing 0x80 with the + relevant SET command. + + Example: + 0x88 (or 0x89 or 0x8A) ; request mouse mode + Returns: + 0xF6 ; status response header + mode ; 0x08 is RELATIVE + ; 0x09 is ABSOLUTE + ; 0x0A is KEYCODE + param1 ; 0 is RELATIVE + ; XMSB maximum if ABSOLUTE + ; DELTA X is KEYCODE + param2 ; 0 is RELATIVE + ; YMSB maximum if ABSOLUTE + ; DELTA Y is KEYCODE + param3 ; 0 if RELATIVE + ; or KEYCODE + ; YMSB is ABSOLUTE + param4 ; 0 if RELATIVE + ; or KEYCODE + ; YLSB is ABSOLUTE + 0 ; pad + 0 + +The STATUS INQUIRY commands request the ikbd to return either the current mode +or the parameters associated with a given command. All status reports are +padded to form 8 byte long return packets. The responses to the status +requests are designed so that the host may store them away (after stripping +off the status report header byte) and later send them back as commands to +ikbd to restore its state. The 0 pad bytes will be treated as NOPs by the +ikbd. + + Valid STATUS INQUIRY commands are: + + 0x87 mouse button action + 0x88 mouse mode + 0x89 + 0x8A + 0x8B mnouse threshold + 0x8C mouse scale + 0x8F mouse vertical coordinates + 0x90 ( returns 0x0F Y=0 at bottom + 0x10 Y=0 at top ) + 0x92 mouse enable/disable + ( returns 0x00 enabled) + 0x12 disabled ) + 0x94 joystick mode + 0x95 + 0x96 + 0x9A joystick enable/disable + ( returns 0x00 enabled + 0x1A disabled ) + +It is the (host) programmer's responsibility to have only one unanswered +inquiry in process at a time. +STATUS INQUIRY commands are not valid if the ikbd is in JOYSTICK MONITORING +mode or FIRE BUTTON MONITORING mode. + + +10. SCAN CODES + +The key scan codes return by the ikbd are chosen to simplify the +implementaion of GSX. + +GSX Standard Keyboard Mapping. + +Hex Keytop +01 Esc +02 1 +03 2 +04 3 +05 4 +06 5 +07 6 +08 7 +09 8 +0A 9 +0B 0 +0C - +0D == +0E BS +0F TAB +10 Q +11 W +12 E +13 R +14 T +15 Y +16 U +17 I +18 O +19 P +1A [ +1B ] +1C RET +1D CTRL +1E A +1F S +20 D +21 F +22 G +23 H +24 J +25 K +26 L +27 ; +28 ' +29 ` +2A (LEFT) SHIFT +2B \ +2C Z +2D X +2E C +2F V +30 B +31 N +32 M +33 , +34 . +35 / +36 (RIGHT) SHIFT +37 { NOT USED } +38 ALT +39 SPACE BAR +3A CAPS LOCK +3B F1 +3C F2 +3D F3 +3E F4 +3F F5 +40 F6 +41 F7 +42 F8 +43 F9 +44 F10 +45 { NOT USED } +46 { NOT USED } +47 HOME +48 UP ARROW +49 { NOT USED } +4A KEYPAD - +4B LEFT ARROW +4C { NOT USED } +4D RIGHT ARROW +4E KEYPAD + +4F { NOT USED } +50 DOWN ARROW +51 { NOT USED } +52 INSERT +53 DEL +54 { NOT USED } +5F { NOT USED } +60 ISO KEY +61 UNDO +62 HELP +63 KEYPAD ( +64 KEYPAD / +65 KEYPAD * +66 KEYPAD * +67 KEYPAD 7 +68 KEYPAD 8 +69 KEYPAD 9 +6A KEYPAD 4 +6B KEYPAD 5 +6C KEYPAD 6 +6D KEYPAD 1 +6E KEYPAD 2 +6F KEYPAD 3 +70 KEYPAD 0 +71 KEYPAD . +72 KEYPAD ENTER diff --git a/Documentation/input/cd32.txt b/Documentation/input/cd32.txt new file mode 100644 index 000000000000..a003d9b41eca --- /dev/null +++ b/Documentation/input/cd32.txt @@ -0,0 +1,19 @@ +I have written a small patch that let's me use my Amiga CD32 +joypad connected to the parallel port. Thought I'd share it with you so +you can add it to the list of supported joysticks (hopefully someone will +find it useful). + +It needs the following wiring: + +CD32 pad | Parallel port +---------------------------- +1 (Up) | 2 (D0) +2 (Down) | 3 (D1) +3 (Left) | 4 (D2) +4 (Right) | 5 (D3) +5 (Fire3) | 14 (AUTOFD) +6 (Fire1) | 17 (SELIN) +7 (+5V) | 1 (STROBE) +8 (Gnd) | 18 (Gnd) +9 (Fire2) | 7 (D5) + diff --git a/Documentation/input/cs461x.txt b/Documentation/input/cs461x.txt new file mode 100644 index 000000000000..6181747a14d8 --- /dev/null +++ b/Documentation/input/cs461x.txt @@ -0,0 +1,45 @@ +Preface. + +This is a new low-level driver to support analog joystick attached to +Crystal SoundFusion CS4610/CS4612/CS4615. This code is based upon +Vortex/Solo drivers as an example of decoration style, and ALSA +0.5.8a kernel drivers as an chipset documentation and samples. + +This version does not have cooked mode support; the basic code +is present here, but have not tested completely. The button analysis +is completed in this mode, but the axis movement is not. + +Raw mode works fine with analog joystick front-end driver and cs461x +driver as a backend. I've tested this driver with CS4610, 4-axis and +4-button joystick; I mean the jstest utility. Also I've tried to +play in xracer game using joystick, and the result is better than +keyboard only mode. + +The sensitivity and calibrate quality have not been tested; the two +reasons are performed: the same hardware cannot work under Win95 (blue +screen in VJOYD); I have no documentation on my chip; and the existing +behavior in my case was not raised the requirement of joystick calibration. +So the driver have no code to perform hardware related calibration. + +The patch contains minor changes of Config.in and Makefile files. All +needed code have been moved to one separate file cs461x.c like ns558.c +This driver have the basic support for PCI devices only; there is no +ISA or PnP ISA cards supported. AFAIK the ns558 have support for Crystal +ISA and PnP ISA series. + +The driver works witn ALSA drivers simultaneously. For exmple, the xracer +uses joystick as input device and PCM device as sound output in one time. +There are no sound or input collisions detected. The source code have +comments about them; but I've found the joystick can be initialized +separately of ALSA modules. So, you canm use only one joystick driver +without ALSA drivers. The ALSA drivers are not needed to compile or +run this driver. + +There are no debug information print have been placed in source, and no +specific options required to work this driver. The found chipset parameters +are printed via printk(KERN_INFO "..."), see the /var/log/messages to +inspect cs461x: prefixed messages to determine possible card detection +errors. + +Regards, +Viktor diff --git a/Documentation/input/ff.txt b/Documentation/input/ff.txt new file mode 100644 index 000000000000..efa7dd6751f3 --- /dev/null +++ b/Documentation/input/ff.txt @@ -0,0 +1,227 @@ +Force feedback for Linux. +By Johann Deneux <deneux@ifrance.com> on 2001/04/22. +You may redistribute this file. Please remember to include shape.fig and +interactive.fig as well. +---------------------------------------------------------------------------- + +0. Introduction +~~~~~~~~~~~~~~~ +This document describes how to use force feedback devices under Linux. The +goal is not to support these devices as if they were simple input-only devices +(as it is already the case), but to really enable the rendering of force +effects. +At the moment, only I-Force devices are supported, and not officially. That +means I had to find out how the protocol works on my own. Of course, the +information I managed to grasp is far from being complete, and I can not +guarranty that this driver will work for you. +This document only describes the force feedback part of the driver for I-Force +devices. Please read joystick.txt before reading further this document. + +2. Instructions to the user +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Here are instructions on how to compile and use the driver. In fact, this +driver is the normal iforce, input and evdev drivers written by Vojtech +Pavlik, plus additions to support force feedback. + +Before you start, let me WARN you that some devices shake violently during the +initialisation phase. This happens for example with my "AVB Top Shot Pegasus". +To stop this annoying behaviour, move you joystick to its limits. Anyway, you +should keep a hand on your device, in order to avoid it to brake down if +something goes wrong. + +At the kernel's compilation: + - Enable IForce/Serial + - Enable Event interface + +Compile the modules, install them. + +You also need inputattach. + +You then need to insert the modules into the following order: +% modprobe joydev +% modprobe serport # Only for serial +% modprobe iforce +% modprobe evdev +% ./inputattach -ifor $2 & # Only for serial +If you are using USB, you don't need the inputattach step. + +Please check that you have all the /dev/input entries needed: +cd /dev +rm js* +mkdir input +mknod input/js0 c 13 0 +mknod input/js1 c 13 1 +mknod input/js2 c 13 2 +mknod input/js3 c 13 3 +ln -s input/js0 js0 +ln -s input/js1 js1 +ln -s input/js2 js2 +ln -s input/js3 js3 + +mknod input/event0 c 13 64 +mknod input/event1 c 13 65 +mknod input/event2 c 13 66 +mknod input/event3 c 13 67 + +2.1 Does it work ? +~~~~~~~~~~~~~~~~~~ +There is an utility called fftest that will allow you to test the driver. +% fftest /dev/input/eventXX + +3. Instructions to the developper +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + All interactions are done using the event API. That is, you can use ioctl() +and write() on /dev/input/eventXX. + This information is subject to change. + +3.1 Querying device capabilities +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +#include <linux/input.h> +#include <sys/ioctl.h> + +unsigned long features[1 + FF_MAX/sizeof(unsigned long)]; +int ioctl(int file_descriptor, int request, unsigned long *features); + +"request" must be EVIOCGBIT(EV_FF, size of features array in bytes ) + +Returns the features supported by the device. features is a bitfield with the +following bits: +- FF_X has an X axis (usually joysticks) +- FF_Y has an Y axis (usually joysticks) +- FF_WHEEL has a wheel (usually sterring wheels) +- FF_CONSTANT can render constant force effects +- FF_PERIODIC can render periodic effects (sine, triangle, square...) +- FF_RAMP can render ramp effects +- FF_SPRING can simulate the presence of a spring +- FF_FRICTION can simulate friction +- FF_DAMPER can simulate damper effects +- FF_RUMBLE rumble effects (normally the only effect supported by rumble + pads) +- FF_INERTIA can simulate inertia + + +int ioctl(int fd, EVIOCGEFFECTS, int *n); + +Returns the number of effects the device can keep in its memory. + +3.2 Uploading effects to the device +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +#include <linux/input.h> +#include <sys/ioctl.h> + +int ioctl(int file_descriptor, int request, struct ff_effect *effect); + +"request" must be EVIOCSFF. + +"effect" points to a structure describing the effect to upload. The effect is +uploaded, but not played. +The content of effect may be modified. In particular, its field "id" is set +to the unique id assigned by the driver. This data is required for performing +some operations (removing an effect, controlling the playback). +This if field must be set to -1 by the user in order to tell the driver to +allocate a new effect. +See <linux/input.h> for a description of the ff_effect stuct. You should also +find help in a few sketches, contained in files shape.fig and interactive.fig. +You need xfig to visualize these files. + +3.3 Removing an effect from the device +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +int ioctl(int fd, EVIOCRMFF, effect.id); + +This makes room for new effects in the device's memory. Please note this won't +stop the effect if it was playing. + +3.4 Controlling the playback of effects +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Control of playing is done with write(). Below is an example: + +#include <linux/input.h> +#include <unistd.h> + + struct input_event play; + struct input_event stop; + struct ff_effect effect; + int fd; +... + fd = open("/dev/input/eventXX", O_RDWR); +... + /* Play three times */ + play.type = EV_FF; + play.code = effect.id; + play.value = 3; + + write(fd, (const void*) &play, sizeof(play)); +... + /* Stop an effect */ + stop.type = EV_FF; + stop.code = effect.id; + stop.value = 0; + + write(fd, (const void*) &play, sizeof(stop)); + +3.5 Setting the gain +~~~~~~~~~~~~~~~~~~~~ +Not all devices have the same strength. Therefore, users should set a gain +factor depending on how strong they want effects to be. This setting is +persistent across access to the driver, so you should not care about it if +you are writing games, as another utility probably already set this for you. + +/* Set the gain of the device +int gain; /* between 0 and 100 */ +struct input_event ie; /* structure used to communicate with the driver */ + +ie.type = EV_FF; +ie.code = FF_GAIN; +ie.value = 0xFFFFUL * gain / 100; + +if (write(fd, &ie, sizeof(ie)) == -1) + perror("set gain"); + +3.6 Enabling/Disabling autocenter +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The autocenter feature quite disturbs the rendering of effects in my opinion, +and I think it should be an effect, which computation depends on the game +type. But you can enable it if you want. + +int autocenter; /* between 0 and 100 */ +struct input_event ie; + +ie.type = EV_FF; +ie.code = FF_AUTOCENTER; +ie.value = 0xFFFFUL * autocenter / 100; + +if (write(fd, &ie, sizeof(ie)) == -1) + perror("set auto-center"); + +A value of 0 means "no auto-center". + +3.7 Dynamic update of an effect +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Proceed as if you wanted to upload a new effect, except that instead of +setting the id field to -1, you set it to the wanted effect id. +Normally, the effect is not stopped and restarted. However, depending on the +type of device, not all parameters can be dynamically updated. For example, +the direction of an effect cannot be updated with iforce devices. In this +case, the driver stops the effect, up-load it, and restart it. + + +3.8 Information about the status of effects +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Every time the status of an effect is changed, an event is sent. The values +and meanings of the fields of the event are as follows: +struct input_event { +/* When the status of the effect changed */ + struct timeval time; + +/* Set to EV_FF_STATUS */ + unsigned short type; + +/* Contains the id of the effect */ + unsigned short code; + +/* Indicates the status */ + unsigned int value; +}; + +FF_STATUS_STOPPED The effect stopped playing +FF_STATUS_PLAYING The effect started to play diff --git a/Documentation/input/gameport-programming.txt b/Documentation/input/gameport-programming.txt new file mode 100644 index 000000000000..1ba3d322e0ac --- /dev/null +++ b/Documentation/input/gameport-programming.txt @@ -0,0 +1,189 @@ +$Id: gameport-programming.txt,v 1.3 2001/04/24 13:51:37 vojtech Exp $ + +Programming gameport drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +1. A basic classic gameport +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If the gameport doesn't provide more than the inb()/outb() functionality, +the code needed to register it with the joystick drivers is simple: + + struct gameport gameport; + + gameport.io = MY_IO_ADDRESS; + gameport_register_port(&gameport); + +Make sure struct gameport is initialized to 0 in all other fields. The +gameport generic code will take care of the rest. + +If your hardware supports more than one io address, and your driver can +choose which one program the hardware to, starting from the more exotic +addresses is preferred, because the likelyhood of clashing with the standard +0x201 address is smaller. + +Eg. if your driver supports addresses 0x200, 0x208, 0x210 and 0x218, then +0x218 would be the address of first choice. + +If your hardware supports a gameport address that is not mapped to ISA io +space (is above 0x1000), use that one, and don't map the ISA mirror. + +Also, always request_region() on the whole io space occupied by the +gameport. Although only one ioport is really used, the gameport usually +occupies from one to sixteen addresses in the io space. + +Please also consider enabling the gameport on the card in the ->open() +callback if the io is mapped to ISA space - this way it'll occupy the io +space only when something really is using it. Disable it again in the +->close() callback. You also can select the io address in the ->open() +callback, so that it doesn't fail if some of the possible addresses are +already occupied by other gameports. + +2. Memory mapped gameport +~~~~~~~~~~~~~~~~~~~~~~~~~ + +When a gameport can be accessed through MMIO, this way is preferred, because +it is faster, allowing more reads per second. Registering such a gameport +isn't as easy as a basic IO one, but not so much complex: + + struct gameport gameport; + + void my_trigger(struct gameport *gameport) + { + my_mmio = 0xff; + } + + unsigned char my_read(struct gameport *gameport) + { + return my_mmio; + } + + gameport.read = my_read; + gameport.trigger = my_trigger; + gameport_register_port(&gameport); + +3. Cooked mode gameport +~~~~~~~~~~~~~~~~~~~~~~~ + +There are gameports that can report the axis values as numbers, that means +the driver doesn't have to measure them the old way - an ADC is built into +the gameport. To register a cooked gameport: + + struct gameport gameport; + + int my_cooked_read(struct gameport *gameport, int *axes, int *buttons) + { + int i; + + for (i = 0; i < 4; i++) + axes[i] = my_mmio[i]; + buttons[i] = my_mmio[4]; + } + + int my_open(struct gameport *gameport, int mode) + { + return -(mode != GAMEPORT_MODE_COOKED); + } + + gameport.cooked_read = my_cooked_read; + gameport.open = my_open; + gameport.fuzz = 8; + gameport_register_port(&gameport); + +The only confusing thing here is the fuzz value. Best determined by +experimentation, it is the amount of noise in the ADC data. Perfect +gameports can set this to zero, most common have fuzz between 8 and 32. +See analog.c and input.c for handling of fuzz - the fuzz value determines +the size of a gaussian filter window that is used to eliminate the noise +in the data. + +4. More complex gameports +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Gameports can support both raw and cooked modes. In that case combine either +examples 1+2 or 1+3. Gameports can support internal calibration - see below, +and also lightning.c and analog.c on how that works. If your driver supports +more than one gameport instance simultaneously, use the ->private member of +the gameport struct to point to your data. + +5. Unregistering a gameport +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Simple: + +gameport_unregister_port(&gameport); + +6. The gameport structure +~~~~~~~~~~~~~~~~~~~~~~~~~ + +struct gameport { + + void *private; + +A private pointer for free use in the gameport driver. (Not the joystick +driver!) + + int number; + +Number assigned to the gameport when registered. Informational purpose only. + + int io; + +I/O address for use with raw mode. You have to either set this, or ->read() +to some value if your gameport supports raw mode. + + int speed; + +Raw mode speed of the gameport reads in thousands of reads per second. + + int fuzz; + +If the gameport supports cooked mode, this should be set to a value that +represents the amount of noise in the data. See section 3. + + void (*trigger)(struct gameport *); + +Trigger. This function should trigger the ns558 oneshots. If set to NULL, +outb(0xff, io) will be used. + + unsigned char (*read)(struct gameport *); + +Read the buttons and ns558 oneshot bits. If set to NULL, inb(io) will be +used instead. + + int (*cooked_read)(struct gameport *, int *axes, int *buttons); + +If the gameport supports cooked mode, it should point this to its cooked +read function. It should fill axes[0..3] with four values of the joystick axes +and buttons[0] with four bits representing the buttons. + + int (*calibrate)(struct gameport *, int *axes, int *max); + +Function for calibrating the ADC hardware. When called, axes[0..3] should be +pre-filled by cooked data by the caller, max[0..3] should be pre-filled with +expected maximums for each axis. The calibrate() function should set the +sensitivity of the ADC hardware so that the maximums fit in its range and +recompute the axes[] values to match the new sensitivity or re-read them from +the hardware so that they give valid values. + + int (*open)(struct gameport *, int mode); + +Open() serves two purposes. First a driver either opens the port in raw or +in cooked mode, the open() callback can decide which modes are supported. +Second, resource allocation can happen here. The port can also be enabled +here. Prior to this call, other fields of the gameport struct (namely the io +member) need not to be valid. + + void (*close)(struct gameport *); + +Close() should free the resources allocated by open, possibly disabling the +gameport. + + struct gameport_dev *dev; + struct gameport *next; + +For internal use by the gameport layer. + +}; + +Enjoy! diff --git a/Documentation/input/iforce-protocol.txt b/Documentation/input/iforce-protocol.txt new file mode 100644 index 000000000000..95df4ca70e71 --- /dev/null +++ b/Documentation/input/iforce-protocol.txt @@ -0,0 +1,254 @@ +** Introduction
+This document describes what I managed to discover about the protocol used to
+specify force effects to I-Force 2.0 devices. None of this information comes
+from Immerse. That's why you should not trust what is written in this
+document. This document is intended to help understanding the protocol.
+This is not a reference. Comments and corrections are welcome. To contact me,
+send an email to: deneux@ifrance.com
+
+** WARNING **
+I may not be held responsible for any dammage or harm caused if you try to
+send data to your I-Force device based on what you read in this document.
+
+** Preliminary Notes:
+All values are hexadecimal with big-endian encoding (msb on the left). Beware,
+values inside packets are encoded using little-endian. Bytes whose roles are
+unknown are marked ??? Information that needs deeper inspection is marked (?)
+
+** General form of a packet **
+This is how packets look when the device uses the rs232 to communicate.
+2B OP LEN DATA CS
+CS is the checksum. It is equal to the exclusive or of all bytes.
+
+When using USB:
+OP DATA
+The 2B, LEN and CS fields have disappeared, probably because USB handles frames and
+data corruption is handled or unsignificant.
+
+First, I describe effects that are sent by the device to the computer
+
+** Device input state
+This packet is used to indicate the state of each button and the value of each
+axis
+OP= 01 for a joystick, 03 for a wheel
+LEN= Varies from device to device
+00 X-Axis lsb
+01 X-Axis msb
+02 Y-Axis lsb, or gas pedal for a wheel
+03 Y-Axis msb, or brake pedal for a wheel
+04 Throttle
+05 Buttons
+06 Lower 4 bits: Buttons
+ Upper 4 bits: Hat
+07 Rudder
+
+** Device effects states
+OP= 02
+LEN= Varies
+00 ? Bit 1 (Value 2) is the value of the deadman switch
+01 Bit 8 is set if the effect is playing. Bits 0 to 7 are the effect id.
+02 ??
+03 Address of parameter block changed (lsb)
+04 Address of parameter block changed (msb)
+05 Address of second parameter block changed (lsb)
+... depending on the number of parameter blocks updated
+
+** Force effect **
+OP= 01
+LEN= 0e
+00 Channel (when playing several effects at the same time, each must be assigned a channel)
+01 Wave form
+ Val 00 Constant
+ Val 20 Square
+ Val 21 Triangle
+ Val 22 Sine
+ Val 23 Sawtooth up
+ Val 24 Sawtooth down
+ Val 40 Spring (Force = f(pos))
+ Val 41 Friction (Force = f(velocity)) and Inertia (Force = f(acceleration))
+
+
+02 Axes affected and trigger
+ Bits 4-7: Val 2 = effect along one axis. Byte 05 indicates direction
+ Val 4 = X axis only. Byte 05 must contain 5a
+ Val 8 = Y axis only. Byte 05 must contain b4
+ Val c = X and Y axes. Bytes 05 must contain 60
+ Bits 0-3: Val 0 = No trigger
+ Val x+1 = Button x triggers the effect
+ When the whole byte is 0, cancel the previously set trigger
+
+03-04 Duration of effect (little endian encoding, in ms)
+
+05 Direction of effect, if applicable. Else, see 02 for value to assign.
+
+06-07 Minimum time between triggering.
+
+08-09 Address of periodicity or magnitude parameters
+0a-0b Address of attack and fade parameters, or ffff if none.
+*or*
+08-09 Address of interactive parameters for X-axis, or ffff if not applicable
+0a-0b Address of interactive parameters for Y-axis, or ffff if not applicable
+
+0c-0d Delay before execution of effect (little endian encoding, in ms)
+
+
+** Time based parameters **
+
+*** Attack and fade ***
+OP= 02
+LEN= 08
+00-01 Address where to store the parameteres
+02-03 Duration of attack (little endian encoding, in ms)
+04 Level at end of attack. Signed byte.
+05-06 Duration of fade.
+07 Level at end of fade.
+
+*** Magnitude ***
+OP= 03
+LEN= 03
+00-01 Address
+02 Level. Signed byte.
+
+*** Periodicity ***
+OP= 04
+LEN= 07
+00-01 Address
+02 Magnitude. Signed byte.
+03 Offset. Signed byte.
+04 Phase. Val 00 = 0 deg, Val 40 = 90 degs.
+05-06 Period (little endian encoding, in ms)
+
+** Interactive parameters **
+OP= 05
+LEN= 0a
+00-01 Address
+02 Positive Coeff
+03 Negative Coeff
+04+05 Offset (center)
+06+07 Dead band (Val 01F4 = 5000 (decimal))
+08 Positive saturation (Val 0a = 1000 (decimal) Val 64 = 10000 (decimal))
+09 Negative saturation
+
+The encoding is a bit funny here: For coeffs, these are signed values. The
+maximum value is 64 (100 decimal), the min is 9c.
+For the offset, the minimum value is FE0C, the maximum value is 01F4.
+For the deadband, the minimum value is 0, the max is 03E8.
+
+** Controls **
+OP= 41
+LEN= 03
+00 Channel
+01 Start/Stop
+ Val 00: Stop
+ Val 01: Start and play once.
+ Val 41: Start and play n times (See byte 02 below)
+02 Number of iterations n.
+
+** Init **
+
+*** Querying features ***
+OP= ff
+Query command. Length varies according to the query type.
+The general format of this packet is:
+ff 01 QUERY [INDEX] CHECKSUM
+reponses are of the same form:
+FF LEN QUERY VALUE_QUERIED CHECKSUM2
+where LEN = 1 + length(VALUE_QUERIED)
+
+**** Query ram size ****
+QUERY = 42 ('B'uffer size)
+The device should reply with the same packet plus two additionnal bytes
+containing the size of the memory:
+ff 03 42 03 e8 CS would mean that the device has 1000 bytes of ram available.
+
+**** Query number of effects ****
+QUERY = 4e ('N'umber of effects)
+The device should respond by sending the number of effects that can be played
+at the same time (one byte)
+ff 02 4e 14 CS would stand for 20 effects.
+
+**** Vendor's id ****
+QUERY = 4d ('M'anufacturer)
+Query the vendors'id (2 bytes)
+
+**** Product id *****
+QUERY = 50 ('P'roduct)
+Query the product id (2 bytes)
+
+**** Open device ****
+QUERY = 4f ('O'pen)
+No data returned.
+
+**** Close device *****
+QUERY = 43 ('C')lose
+No data returned.
+
+**** Query effect ****
+QUERY = 45 ('E')
+Send effect type.
+Returns nonzero if supported (2 bytes)
+
+**** Firmware Version ****
+QUERY = 56 ('V'ersion)
+Sends back 3 bytes - major, minor, subminor
+
+*** Initialisation of the device ***
+
+**** Set Control ****
+!!! Device dependent, can be different on different models !!!
+OP= 40 <idx> <val> [<val>]
+LEN= 2 or 3
+00 Idx
+ Idx 00 Set dead zone (0..2048)
+ Idx 01 Ignore Deadman sensor (0..1)
+ Idx 02 Enable comm watchdog (0..1)
+ Idx 03 Set the strength of the spring (0..100)
+ Idx 04 Enable or disable the spring (0/1)
+ Idx 05 Set axis saturation threshold (0..2048)
+
+**** Set Effect State ****
+OP= 42 <val>
+LEN= 1
+00 State
+ Bit 3 Pause force feedback
+ Bit 2 Enable force feedback
+ Bit 0 Stop all effects
+
+**** Set overall gain ****
+OP= 43 <val>
+LEN= 1
+00 Gain
+ Val 00 = 0%
+ Val 40 = 50%
+ Val 80 = 100%
+
+** Parameter memory **
+
+Each device has a certain amount of memory to store parameters of effects.
+The amount of RAM may vary, I encountered values from 200 to 1000 bytes. Below
+is the amount of memory apparently needed for every set of parameters:
+ - period : 0c
+ - magnitude : 02
+ - attack and fade : 0e
+ - interactive : 08
+
+** Appendix: How to study the protocol ? **
+
+1. Generate effects using the force editor provided with the DirectX SDK, or use Immersion Studio (freely available at their web site in the developer section: www.immersion.com)
+2. Start a soft spying RS232 or USB (depending on where you connected your joystick/wheel). I used ComPortSpy from fCoder (alpha version!)
+3. Play the effect, and watch what happens on the spy screen.
+
+A few words about ComPortSpy:
+At first glance, this soft seems, hum, well... buggy. In fact, data appear with a few seconds latency. Personnaly, I restart it every time I play an effect.
+Remember it's free (as in free beer) and alpha!
+
+** URLS **
+Check www.immerse.com for Immersion Studio, and www.fcoder.com for ComPortSpy.
+
+** Author of this document **
+Johann Deneux <deneux@ifrance.com>
+Home page at http://www.esil.univ-mrs.fr/~jdeneux/projects/ff/
+
+Additions by Vojtech Pavlik.
+
+I-Force is trademark of Immersion Corp.
diff --git a/Documentation/input/input-programming.txt b/Documentation/input/input-programming.txt new file mode 100644 index 000000000000..180e0689676c --- /dev/null +++ b/Documentation/input/input-programming.txt @@ -0,0 +1,281 @@ +$Id: input-programming.txt,v 1.4 2001/05/04 09:47:14 vojtech Exp $ + +Programming input drivers +~~~~~~~~~~~~~~~~~~~~~~~~~ + +1. Creating an input device driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +1.0 The simplest example +~~~~~~~~~~~~~~~~~~~~~~~~ + +Here comes a very simple example of an input device driver. The device has +just one button and the button is accessible at i/o port BUTTON_PORT. When +pressed or released a BUTTON_IRQ happens. The driver could look like: + +#include <linux/input.h> +#include <linux/module.h> +#include <linux/init.h> + +#include <asm/irq.h> +#include <asm/io.h> + +static void button_interrupt(int irq, void *dummy, struct pt_regs *fp) +{ + input_report_key(&button_dev, BTN_1, inb(BUTTON_PORT) & 1); + input_sync(&button_dev); +} + +static int __init button_init(void) +{ + if (request_irq(BUTTON_IRQ, button_interrupt, 0, "button", NULL)) { + printk(KERN_ERR "button.c: Can't allocate irq %d\n", button_irq); + return -EBUSY; + } + + button_dev.evbit[0] = BIT(EV_KEY); + button_dev.keybit[LONG(BTN_0)] = BIT(BTN_0); + + input_register_device(&button_dev); +} + +static void __exit button_exit(void) +{ + input_unregister_device(&button_dev); + free_irq(BUTTON_IRQ, button_interrupt); +} + +module_init(button_init); +module_exit(button_exit); + +1.1 What the example does +~~~~~~~~~~~~~~~~~~~~~~~~~ + +First it has to include the <linux/input.h> file, which interfaces to the +input subsystem. This provides all the definitions needed. + +In the _init function, which is called either upon module load or when +booting the kernel, it grabs the required resources (it should also check +for the presence of the device). + +Then it sets the input bitfields. This way the device driver tells the other +parts of the input systems what it is - what events can be generated or +accepted by this input device. Our example device can only generate EV_KEY type +events, and from those only BTN_0 event code. Thus we only set these two +bits. We could have used + + set_bit(EV_KEY, button_dev.evbit); + set_bit(BTN_0, button_dev.keybit); + +as well, but with more than single bits the first approach tends to be +shorter. + +Then the example driver registers the input device structure by calling + + input_register_device(&button_dev); + +This adds the button_dev structure to linked lists of the input driver and +calls device handler modules _connect functions to tell them a new input +device has appeared. Because the _connect functions may call kmalloc(, +GFP_KERNEL), which can sleep, input_register_device() must not be called +from an interrupt or with a spinlock held. + +While in use, the only used function of the driver is + + button_interrupt() + +which upon every interrupt from the button checks its state and reports it +via the + + input_report_key() + +call to the input system. There is no need to check whether the interrupt +routine isn't reporting two same value events (press, press for example) to +the input system, because the input_report_* functions check that +themselves. + +Then there is the + + input_sync() + +call to tell those who receive the events that we've sent a complete report. +This doesn't seem important in the one button case, but is quite important +for for example mouse movement, where you don't want the X and Y values +to be interpreted separately, because that'd result in a different movement. + +1.2 dev->open() and dev->close() +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In case the driver has to repeatedly poll the device, because it doesn't +have an interrupt coming from it and the polling is too expensive to be done +all the time, or if the device uses a valuable resource (eg. interrupt), it +can use the open and close callback to know when it can stop polling or +release the interrupt and when it must resume polling or grab the interrupt +again. To do that, we would add this to our example driver: + +int button_used = 0; + +static int button_open(struct input_dev *dev) +{ + if (button_used++) + return 0; + + if (request_irq(BUTTON_IRQ, button_interrupt, 0, "button", NULL)) { + printk(KERN_ERR "button.c: Can't allocate irq %d\n", button_irq); + button_used--; + return -EBUSY; + } + + return 0; +} + +static void button_close(struct input_dev *dev) +{ + if (!--button_used) + free_irq(IRQ_AMIGA_VERTB, button_interrupt); +} + +static int __init button_init(void) +{ + ... + button_dev.open = button_open; + button_dev.close = button_close; + ... +} + +Note the button_used variable - we have to track how many times the open +function was called to know when exactly our device stops being used. + +The open() callback should return a 0 in case of success or any nonzero value +in case of failure. The close() callback (which is void) must always succeed. + +1.3 Basic event types +~~~~~~~~~~~~~~~~~~~~~ + +The most simple event type is EV_KEY, which is used for keys and buttons. +It's reported to the input system via: + + input_report_key(struct input_dev *dev, int code, int value) + +See linux/input.h for the allowable values of code (from 0 to KEY_MAX). +Value is interpreted as a truth value, ie any nonzero value means key +pressed, zero value means key released. The input code generates events only +in case the value is different from before. + +In addition to EV_KEY, there are two more basic event types: EV_REL and +EV_ABS. They are used for relative and absolute values supplied by the +device. A relative value may be for example a mouse movement in the X axis. +The mouse reports it as a relative difference from the last position, +because it doesn't have any absolute coordinate system to work in. Absolute +events are namely for joysticks and digitizers - devices that do work in an +absolute coordinate systems. + +Having the device report EV_REL buttons is as simple as with EV_KEY, simply +set the corresponding bits and call the + + input_report_rel(struct input_dev *dev, int code, int value) + +function. Events are generated only for nonzero value. + +However EV_ABS requires a little special care. Before calling +input_register_device, you have to fill additional fields in the input_dev +struct for each absolute axis your device has. If our button device had also +the ABS_X axis: + + button_dev.absmin[ABS_X] = 0; + button_dev.absmax[ABS_X] = 255; + button_dev.absfuzz[ABS_X] = 4; + button_dev.absflat[ABS_X] = 8; + +This setting would be appropriate for a joystick X axis, with the minimum of +0, maximum of 255 (which the joystick *must* be able to reach, no problem if +it sometimes reports more, but it must be able to always reach the min and +max values), with noise in the data up to +- 4, and with a center flat +position of size 8. + +If you don't need absfuzz and absflat, you can set them to zero, which mean +that the thing is precise and always returns to exactly the center position +(if it has any). + +1.4 The void *private field +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This field in the input structure can be used to point to any private data +structures in the input device driver, in case the driver handles more than +one device. You'll need it in the open and close callbacks. + +1.5 NBITS(), LONG(), BIT() +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +These three macros from input.h help some bitfield computations: + + NBITS(x) - returns the length of a bitfield array in longs for x bits + LONG(x) - returns the index in the array in longs for bit x + BIT(x) - returns the index in a long for bit x + +1.6 The number, id* and name fields +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The dev->number is assigned by the input system to the input device when it +is registered. It has no use except for identifying the device to the user +in system messages. + +The dev->name should be set before registering the input device by the input +device driver. It's a string like 'Generic button device' containing a +user friendly name of the device. + +The id* fields contain the bus ID (PCI, USB, ...), vendor ID and device ID +of the device. The bus IDs are defined in input.h. The vendor and device ids +are defined in pci_ids.h, usb_ids.h and similar include files. These fields +should be set by the input device driver before registering it. + +The idtype field can be used for specific information for the input device +driver. + +The id and name fields can be passed to userland via the evdev interface. + +1.7 The keycode, keycodemax, keycodesize fields +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +These two fields will be used for any input devices that report their data +as scancodes. If not all scancodes can be known by autodetection, they may +need to be set by userland utilities. The keycode array then is an array +used to map from scancodes to input system keycodes. The keycode max will +contain the size of the array and keycodesize the size of each entry in it +(in bytes). + +1.8 Key autorepeat +~~~~~~~~~~~~~~~~~~ + +... is simple. It is handled by the input.c module. Hardware autorepeat is +not used, because it's not present in many devices and even where it is +present, it is broken sometimes (at keyboards: Toshiba notebooks). To enable +autorepeat for your device, just set EV_REP in dev->evbit. All will be +handled by the input system. + +1.9 Other event types, handling output events +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The other event types up to now are: + +EV_LED - used for the keyboard LEDs. +EV_SND - used for keyboard beeps. + +They are very similar to for example key events, but they go in the other +direction - from the system to the input device driver. If your input device +driver can handle these events, it has to set the respective bits in evbit, +*and* also the callback routine: + + button_dev.event = button_event; + +int button_event(struct input_dev *dev, unsigned int type, unsigned int code, int value); +{ + if (type == EV_SND && code == SND_BELL) { + outb(value, BUTTON_BELL); + return 0; + } + return -1; +} + +This callback routine can be called from an interrupt or a BH (although that +isn't a rule), and thus must not sleep, and must not take too long to finish. diff --git a/Documentation/input/input.txt b/Documentation/input/input.txt new file mode 100644 index 000000000000..47137e75fdb8 --- /dev/null +++ b/Documentation/input/input.txt @@ -0,0 +1,312 @@ + Linux Input drivers v1.0 + (c) 1999-2001 Vojtech Pavlik <vojtech@ucw.cz> + Sponsored by SuSE + $Id: input.txt,v 1.8 2002/05/29 03:15:01 bradleym Exp $ +---------------------------------------------------------------------------- + +0. Disclaimer +~~~~~~~~~~~~~ + This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the Free +Software Foundation; either version 2 of the License, or (at your option) +any later version. + + This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. + + You should have received a copy of the GNU General Public License along +with this program; if not, write to the Free Software Foundation, Inc., 59 +Temple Place, Suite 330, Boston, MA 02111-1307 USA + + Should you need to contact me, the author, you can do so either by e-mail +- mail your message to <vojtech@ucw.cz>, or by paper mail: Vojtech Pavlik, +Simunkova 1594, Prague 8, 182 00 Czech Republic + + For your convenience, the GNU General Public License version 2 is included +in the package: See the file COPYING. + +1. Introduction +~~~~~~~~~~~~~~~ + This is a collection of drivers that is designed to support all input +devices under Linux. While it is currently used only on for USB input +devices, future use (say 2.5/2.6) is expected to expand to replace +most of the existing input system, which is why it lives in +drivers/input/ instead of drivers/usb/. + + The centre of the input drivers is the input module, which must be +loaded before any other of the input modules - it serves as a way of +communication between two groups of modules: + +1.1 Device drivers +~~~~~~~~~~~~~~~~~~ + These modules talk to the hardware (for example via USB), and provide +events (keystrokes, mouse movements) to the input module. + +1.2 Event handlers +~~~~~~~~~~~~~~~~~~ + These modules get events from input and pass them where needed via +various interfaces - keystrokes to the kernel, mouse movements via a +simulated PS/2 interface to GPM and X and so on. + +2. Simple Usage +~~~~~~~~~~~~~~~ + For the most usual configuration, with one USB mouse and one USB keyboard, +you'll have to load the following modules (or have them built in to the +kernel): + + input + mousedev + keybdev + usbcore + uhci_hcd or ohci_hcd or ehci_hcd + usbhid + + After this, the USB keyboard will work straight away, and the USB mouse +will be available as a character device on major 13, minor 63: + + crw-r--r-- 1 root root 13, 63 Mar 28 22:45 mice + + This device has to be created, unless you use devfs, in which case it's +created automatically. The commands to do create it by hand are: + + cd /dev + mkdir input + mknod input/mice c 13 63 + + After that you have to point GPM (the textmode mouse cut&paste tool) and +XFree to this device to use it - GPM should be called like: + + gpm -t ps2 -m /dev/input/mice + + And in X: + + Section "Pointer" + Protocol "ImPS/2" + Device "/dev/input/mice" + ZAxisMapping 4 5 + EndSection + + When you do all of the above, you can use your USB mouse and keyboard. + +3. Detailed Description +~~~~~~~~~~~~~~~~~~~~~~~ +3.1 Device drivers +~~~~~~~~~~~~~~~~~~ + Device drivers are the modules that generate events. The events are +however not useful without being handled, so you also will need to use some +of the modules from section 3.2. + +3.1.1 usbhid +~~~~~~~~~~~~ + usbhid is the largest and most complex driver of the whole suite. It +handles all HID devices, and because there is a very wide variety of them, +and because the USB HID specification isn't simple, it needs to be this big. + + Currently, it handles USB mice, joysticks, gamepads, steering wheels +keyboards, trackballs and digitizers. + + However, USB uses HID also for monitor controls, speaker controls, UPSs, +LCDs and many other purposes. + + The monitor and speaker controls should be easy to add to the hid/input +interface, but for the UPSs and LCDs it doesn't make much sense. For this, +the hiddev interface was designed. See Documentation/usb/hiddev.txt +for more information about it. + + The usage of the usbhid module is very simple, it takes no parameters, +detects everything automatically and when a HID device is inserted, it +detects it appropriately. + + However, because the devices vary wildly, you might happen to have a +device that doesn't work well. In that case #define DEBUG at the beginning +of hid-core.c and send me the syslog traces. + +3.1.2 usbmouse +~~~~~~~~~~~~~~ + For embedded systems, for mice with broken HID descriptors and just any +other use when the big usbhid wouldn't be a good choice, there is the +usbmouse driver. It handles USB mice only. It uses a simpler HIDBP +protocol. This also means the mice must support this simpler protocol. Not +all do. If you don't have any strong reason to use this module, use usbhid +instead. + +3.1.3 usbkbd +~~~~~~~~~~~~ + Much like usbmouse, this module talks to keyboards with a simplified +HIDBP protocol. It's smaller, but doesn't support any extra special keys. +Use usbhid instead if there isn't any special reason to use this. + +3.1.4 wacom +~~~~~~~~~~~ + This is a driver for Wacom Graphire and Intuos tablets. Not for Wacom +PenPartner, that one is handled by the HID driver. Although the Intuos and +Graphire tablets claim that they are HID tablets as well, they are not and +thus need this specific driver. + +3.1.5 iforce +~~~~~~~~~~~~ + A driver for I-Force joysticks and wheels, both over USB and RS232. +It includes ForceFeedback support now, even though Immersion +Corp. considers the protocol a trade secret and won't disclose a word +about it. + +3.2 Event handlers +~~~~~~~~~~~~~~~~~~ + Event handlers distrubite the events from the devices to userland and +kernel, as needed. + +3.2.1 keybdev +~~~~~~~~~~~~~ + keybdev is currently a rather ugly hack that translates the input +events into architecture-specific keyboard raw mode (Xlated AT Set2 on +x86), and passes them into the handle_scancode function of the +keyboard.c module. This works well enough on all architectures that +keybdev can generate rawmode on, other architectures can be added to +it. + + The right way would be to pass the events to keyboard.c directly, +best if keyboard.c would itself be an event handler. This is done in +the input patch, available on the webpage mentioned below. + +3.2.2 mousedev +~~~~~~~~~~~~~~ + mousedev is also a hack to make programs that use mouse input +work. It takes events from either mice or digitizers/tablets and makes +a PS/2-style (a la /dev/psaux) mouse device available to the +userland. Ideally, the programs could use a more reasonable interface, +for example evdev + + Mousedev devices in /dev/input (as shown above) are: + + crw-r--r-- 1 root root 13, 32 Mar 28 22:45 mouse0 + crw-r--r-- 1 root root 13, 33 Mar 29 00:41 mouse1 + crw-r--r-- 1 root root 13, 34 Mar 29 00:41 mouse2 + crw-r--r-- 1 root root 13, 35 Apr 1 10:50 mouse3 + ... + ... + crw-r--r-- 1 root root 13, 62 Apr 1 10:50 mouse30 + crw-r--r-- 1 root root 13, 63 Apr 1 10:50 mice + +Each 'mouse' device is assigned to a single mouse or digitizer, except +the last one - 'mice'. This single character device is shared by all +mice and digitizers, and even if none are connected, the device is +present. This is useful for hotplugging USB mice, so that programs +can open the device even when no mice are present. + + CONFIG_INPUT_MOUSEDEV_SCREEN_[XY] in the kernel configuration are +the size of your screen (in pixels) in XFree86. This is needed if you +want to use your digitizer in X, because its movement is sent to X +via a virtual PS/2 mouse and thus needs to be scaled +accordingly. These values won't be used if you use a mouse only. + + Mousedev will generate either PS/2, ImPS/2 (Microsoft IntelliMouse) or +ExplorerPS/2 (IntelliMouse Explorer) protocols, depending on what the +program reading the data wishes. You can set GPM and X to any of +these. You'll need ImPS/2 if you want to make use of a wheel on a USB +mouse and ExplorerPS/2 if you want to use extra (up to 5) buttons. + +3.2.3 joydev +~~~~~~~~~~~~ + Joydev implements v0.x and v1.x Linux joystick api, much like +drivers/char/joystick/joystick.c used to in earlier versions. See +joystick-api.txt in the Documentation subdirectory for details. As +soon as any joystick is connected, it can be accessed in /dev/input +on: + + crw-r--r-- 1 root root 13, 0 Apr 1 10:50 js0 + crw-r--r-- 1 root root 13, 1 Apr 1 10:50 js1 + crw-r--r-- 1 root root 13, 2 Apr 1 10:50 js2 + crw-r--r-- 1 root root 13, 3 Apr 1 10:50 js3 + ... + +And so on up to js31. + +3.2.4 evdev +~~~~~~~~~~~ + evdev is the generic input event interface. It passes the events +generated in the kernel straight to the program, with timestamps. The +API is still evolving, but should be useable now. It's described in +section 5. + + This should be the way for GPM and X to get keyboard and mouse mouse +events. It allows for multihead in X without any specific multihead +kernel support. The event codes are the same on all architectures and +are hardware independent. + + The devices are in /dev/input: + + crw-r--r-- 1 root root 13, 64 Apr 1 10:49 event0 + crw-r--r-- 1 root root 13, 65 Apr 1 10:50 event1 + crw-r--r-- 1 root root 13, 66 Apr 1 10:50 event2 + crw-r--r-- 1 root root 13, 67 Apr 1 10:50 event3 + ... + +And so on up to event31. + +4. Verifying if it works +~~~~~~~~~~~~~~~~~~~~~~~~ + Typing a couple keys on the keyboard should be enough to check that +a USB keyboard works and is correctly connected to the kernel keyboard +driver. + + Doing a cat /dev/input/mouse0 (c, 13, 32) will verify that a mouse +is also emulated, characters should appear if you move it. + + You can test the joystick emulation with the 'jstest' utility, +available in the joystick package (see Documentation/input/joystick.txt). + + You can test the event devices with the 'evtest' utility available +in the LinuxConsole project CVS archive (see the URL below). + +5. Event interface +~~~~~~~~~~~~~~~~~~ + Should you want to add event device support into any application (X, gpm, +svgalib ...) I <vojtech@ucw.cz> will be happy to provide you any help I +can. Here goes a description of the current state of things, which is going +to be extended, but not changed incompatibly as time goes: + + You can use blocking and nonblocking reads, also select() on the +/dev/input/eventX devices, and you'll always get a whole number of input +events on a read. Their layout is: + +struct input_event { + struct timeval time; + unsigned short type; + unsigned short code; + unsigned int value; +}; + + 'time' is the timestamp, it returns the time at which the event happened. +Type is for example EV_REL for relative momement, REL_KEY for a keypress or +release. More types are defined in include/linux/input.h. + + 'code' is event code, for example REL_X or KEY_BACKSPACE, again a complete +list is in include/linux/input.h. + + 'value' is the value the event carries. Either a relative change for +EV_REL, absolute new value for EV_ABS (joysticks ...), or 0 for EV_KEY for +release, 1 for keypress and 2 for autorepeat. + +6. Contacts +~~~~~~~~~~~ + This effort has its home page at: + + http://www.suse.cz/development/input/ + +You'll find both the latest HID driver and the complete Input driver +there as well as information how to access the CVS repository for +latest revisions of the drivers. + + There is also a mailing list for this: + + majordomo@atrey.karlin.mff.cuni.cz + +Send "subscribe linux-input" to subscribe to it. + +The input changes are also being worked on as part of the LinuxConsole +project, see: + + http://sourceforge.net/projects/linuxconsole/ + diff --git a/Documentation/input/interactive.fig b/Documentation/input/interactive.fig new file mode 100644 index 000000000000..1e7de387723a --- /dev/null +++ b/Documentation/input/interactive.fig @@ -0,0 +1,42 @@ +#FIG 3.2 +Landscape +Center +Inches +Letter +100.00 +Single +-2 +1200 2 +2 1 0 2 0 7 50 0 -1 6.000 0 0 -1 0 0 6 + 1200 3600 1800 3600 2400 4800 3000 4800 4200 5700 4800 5700 +2 2 0 1 0 7 50 0 -1 4.000 0 0 -1 0 0 5 + 1200 3150 4800 3150 4800 6300 1200 6300 1200 3150 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 1200 4800 4800 4800 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 4 + 2400 4800 2400 6525 1950 7125 1950 7800 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 4 + 3000 4800 3000 6525 3600 7125 3600 7800 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 0 1 3 + 0 0 1.00 60.00 120.00 + 3825 5400 4125 5100 5400 5100 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 0 1 3 + 0 0 1.00 60.00 120.00 + 2100 4200 2400 3900 5400 3900 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 4800 5700 5400 5700 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 1800 3600 5400 3600 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 0 1 3 + 0 0 1.00 60.00 120.00 + 2700 4800 2700 4425 5400 4425 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 120.00 + 0 0 1.00 60.00 120.00 + 1950 7800 3600 7800 +4 1 0 50 0 0 12 0.0000 4 135 810 2775 7725 Dead band\001 +4 0 0 50 0 0 12 0.0000 4 180 1155 5400 5700 right saturation\001 +4 0 0 50 0 0 12 0.0000 4 135 1065 5400 3600 left saturation\001 +4 0 0 50 0 0 12 0.0000 4 180 2505 5400 3900 left coeff ( positive in that case )\001 +4 0 0 50 0 0 12 0.0000 4 180 2640 5475 5100 right coeff ( negative in that case )\001 +4 0 0 50 0 0 12 0.0000 4 105 480 5400 4425 center\001 diff --git a/Documentation/input/joystick-api.txt b/Documentation/input/joystick-api.txt new file mode 100644 index 000000000000..acbd32b88454 --- /dev/null +++ b/Documentation/input/joystick-api.txt @@ -0,0 +1,316 @@ + Joystick API Documentation -*-Text-*- + + Ragnar Hojland Espinosa + <ragnar@macula.net> + + 7 Aug 1998 + + $Id: joystick-api.txt,v 1.2 2001/05/08 21:21:23 vojtech Exp $ + +1. Initialization +~~~~~~~~~~~~~~~~~ + +Open the joystick device following the usual semantics (that is, with open). +Since the driver now reports events instead of polling for changes, +immediately after the open it will issue a series of synthetic events +(JS_EVENT_INIT) that you can read to check the initial state of the +joystick. + +By default, the device is opened in blocking mode. + + int fd = open ("/dev/js0", O_RDONLY); + + +2. Event Reading +~~~~~~~~~~~~~~~~ + + struct js_event e; + read (fd, &e, sizeof(struct js_event)); + +where js_event is defined as + + struct js_event { + __u32 time; /* event timestamp in milliseconds */ + __s16 value; /* value */ + __u8 type; /* event type */ + __u8 number; /* axis/button number */ + }; + +If the read is successful, it will return sizeof(struct js_event), unless +you wanted to read more than one event per read as described in section 3.1. + + +2.1 js_event.type +~~~~~~~~~~~~~~~~~ + +The possible values of ``type'' are + + #define JS_EVENT_BUTTON 0x01 /* button pressed/released */ + #define JS_EVENT_AXIS 0x02 /* joystick moved */ + #define JS_EVENT_INIT 0x80 /* initial state of device */ + +As mentioned above, the driver will issue synthetic JS_EVENT_INIT ORed +events on open. That is, if it's issuing a INIT BUTTON event, the +current type value will be + + int type = JS_EVENT_BUTTON | JS_EVENT_INIT; /* 0x81 */ + +If you choose not to differentiate between synthetic or real events +you can turn off the JS_EVENT_INIT bits + + type &= ~JS_EVENT_INIT; /* 0x01 */ + + +2.2 js_event.number +~~~~~~~~~~~~~~~~~~~ + +The values of ``number'' correspond to the axis or button that +generated the event. Note that they carry separate numeration (that +is, you have both an axis 0 and a button 0). Generally, + + number + 1st Axis X 0 + 1st Axis Y 1 + 2nd Axis X 2 + 2nd Axis Y 3 + ...and so on + +Hats vary from one joystick type to another. Some can be moved in 8 +directions, some only in 4, The driver, however, always reports a hat as two +independent axis, even if the hardware doesn't allow independent movement. + + +2.3 js_event.value +~~~~~~~~~~~~~~~~~~ + +For an axis, ``value'' is a signed integer between -32767 and +32767 +representing the position of the joystick along that axis. If you +don't read a 0 when the joystick is `dead', or if it doesn't span the +full range, you should recalibrate it (with, for example, jscal). + +For a button, ``value'' for a press button event is 1 and for a release +button event is 0. + +Though this + + if (js_event.type == JS_EVENT_BUTTON) { + buttons_state ^= (1 << js_event.number); + } + +may work well if you handle JS_EVENT_INIT events separately, + + if ((js_event.type & ~JS_EVENT_INIT) == JS_EVENT_BUTTON) { + if (js_event.value) + buttons_state |= (1 << js_event.number); + else + buttons_state &= ~(1 << js_event.number); + } + +is much safer since it can't lose sync with the driver. As you would +have to write a separate handler for JS_EVENT_INIT events in the first +snippet, this ends up being shorter. + + +2.4 js_event.time +~~~~~~~~~~~~~~~~~ + +The time an event was generated is stored in ``js_event.time''. It's a time +in milliseconds since ... well, since sometime in the past. This eases the +task of detecting double clicks, figuring out if movement of axis and button +presses happened at the same time, and similar. + + +3. Reading +~~~~~~~~~~ + +If you open the device in blocking mode, a read will block (that is, +wait) forever until an event is generated and effectively read. There +are two alternatives if you can't afford to wait forever (which is, +admittedly, a long time;) + + a) use select to wait until there's data to be read on fd, or + until it timeouts. There's a good example on the select(2) + man page. + + b) open the device in non-blocking mode (O_NONBLOCK) + + +3.1 O_NONBLOCK +~~~~~~~~~~~~~~ + +If read returns -1 when reading in O_NONBLOCK mode, this isn't +necessarily a "real" error (check errno(3)); it can just mean there +are no events pending to be read on the driver queue. You should read +all events on the queue (that is, until you get a -1). + +For example, + + while (1) { + while (read (fd, &e, sizeof(struct js_event)) > 0) { + process_event (e); + } + /* EAGAIN is returned when the queue is empty */ + if (errno != EAGAIN) { + /* error */ + } + /* do something interesting with processed events */ + } + +One reason for emptying the queue is that if it gets full you'll start +missing events since the queue is finite, and older events will get +overwritten. + +The other reason is that you want to know all what happened, and not +delay the processing till later. + +Why can get the queue full? Because you don't empty the queue as +mentioned, or because too much time elapses from one read to another +and too many events to store in the queue get generated. Note that +high system load may contribute to space those reads even more. + +If time between reads is enough to fill the queue and lose an event, +the driver will switch to startup mode and next time you read it, +synthetic events (JS_EVENT_INIT) will be generated to inform you of +the actual state of the joystick. + +[As for version 1.2.8, the queue is circular and able to hold 64 + events. You can increment this size bumping up JS_BUFF_SIZE in + joystick.h and recompiling the driver.] + + +In the above code, you might as well want to read more than one event +at a time using the typical read(2) functionality. For that, you would +replace the read above with something like + + struct js_event mybuffer[0xff]; + int i = read (fd, mybuffer, sizeof(struct mybuffer)); + +In this case, read would return -1 if the queue was empty, or some +other value in which the number of events read would be i / +sizeof(js_event) Again, if the buffer was full, it's a good idea to +process the events and keep reading it until you empty the driver queue. + + +4. IOCTLs +~~~~~~~~~ + +The joystick driver defines the following ioctl(2) operations. + + /* function 3rd arg */ + #define JSIOCGAXES /* get number of axes char */ + #define JSIOCGBUTTONS /* get number of buttons char */ + #define JSIOCGVERSION /* get driver version int */ + #define JSIOCGNAME(len) /* get identifier string char */ + #define JSIOCSCORR /* set correction values &js_corr */ + #define JSIOCGCORR /* get correction values &js_corr */ + +For example, to read the number of axes + + char number_of_axes; + ioctl (fd, JSIOCGAXES, &number_of_axes); + + +4.1 JSIOGCVERSION +~~~~~~~~~~~~~~~~~ + +JSIOGCVERSION is a good way to check in run-time whether the running +driver is 1.0+ and supports the event interface. If it is not, the +IOCTL will fail. For a compile-time decision, you can test the +JS_VERSION symbol + + #ifdef JS_VERSION + #if JS_VERSION > 0xsomething + + +4.2 JSIOCGNAME +~~~~~~~~~~~~~~ + +JSIOCGNAME(len) allows you to get the name string of the joystick - the same +as is being printed at boot time. The 'len' argument is the length of the +buffer provided by the application asking for the name. It is used to avoid +possible overrun should the name be too long. + + char name[128]; + if (ioctl(fd, JSIOCGNAME(sizeof(name)), name) < 0) + strncpy(name, "Unknown", sizeof(name)); + printf("Name: %s\n", name); + + +4.3 JSIOC[SG]CORR +~~~~~~~~~~~~~~~~~ + +For usage on JSIOC[SG]CORR I suggest you to look into jscal.c They are +not needed in a normal program, only in joystick calibration software +such as jscal or kcmjoy. These IOCTLs and data types aren't considered +to be in the stable part of the API, and therefore may change without +warning in following releases of the driver. + +Both JSIOCSCORR and JSIOCGCORR expect &js_corr to be able to hold +information for all axis. That is, struct js_corr corr[MAX_AXIS]; + +struct js_corr is defined as + + struct js_corr { + __s32 coef[8]; + __u16 prec; + __u16 type; + }; + +and ``type'' + + #define JS_CORR_NONE 0x00 /* returns raw values */ + #define JS_CORR_BROKEN 0x01 /* broken line */ + + +5. Backward compatibility +~~~~~~~~~~~~~~~~~~~~~~~~~ + +The 0.x joystick driver API is quite limited and its usage is deprecated. +The driver offers backward compatibility, though. Here's a quick summary: + + struct JS_DATA_TYPE js; + while (1) { + if (read (fd, &js, JS_RETURN) != JS_RETURN) { + /* error */ + } + usleep (1000); + } + +As you can figure out from the example, the read returns immediately, +with the actual state of the joystick. + + struct JS_DATA_TYPE { + int buttons; /* immediate button state */ + int x; /* immediate x axis value */ + int y; /* immediate y axis value */ + }; + +and JS_RETURN is defined as + + #define JS_RETURN sizeof(struct JS_DATA_TYPE) + +To test the state of the buttons, + + first_button_state = js.buttons & 1; + second_button_state = js.buttons & 2; + +The axis values do not have a defined range in the original 0.x driver, +except for that the values are non-negative. The 1.2.8+ drivers use a +fixed range for reporting the values, 1 being the minimum, 128 the +center, and 255 maximum value. + +The v0.8.0.2 driver also had an interface for 'digital joysticks', (now +called Multisystem joysticks in this driver), under /dev/djsX. This driver +doesn't try to be compatible with that interface. + + +6. Final Notes +~~~~~~~~~~~~~~ + +____/| Comments, additions, and specially corrections are welcome. +\ o.O| Documentation valid for at least version 1.2.8 of the joystick + =(_)= driver and as usual, the ultimate source for documentation is + U to "Use The Source Luke" or, at your convenience, Vojtech ;) + + - Ragnar +EOF diff --git a/Documentation/input/joystick-parport.txt b/Documentation/input/joystick-parport.txt new file mode 100644 index 000000000000..88a011c9f985 --- /dev/null +++ b/Documentation/input/joystick-parport.txt @@ -0,0 +1,542 @@ + Linux Joystick parport drivers v2.0 + (c) 1998-2000 Vojtech Pavlik <vojtech@ucw.cz> + (c) 1998 Andree Borrmann <a.borrmann@tu-bs.de> + Sponsored by SuSE + $Id: joystick-parport.txt,v 1.6 2001/09/25 09:31:32 vojtech Exp $ +---------------------------------------------------------------------------- + +0. Disclaimer +~~~~~~~~~~~~~ + Any information in this file is provided as-is, without any guarantee that +it will be true. So, use it at your own risk. The possible damages that can +happen include burning your parallel port, and/or the sticks and joystick +and maybe even more. Like when a lightning kills you it is not our problem. + +1. Intro +~~~~~~~~ + The joystick parport drivers are used for joysticks and gamepads not +originally designed for PCs and other computers Linux runs on. Because of +that, PCs usually lack the right ports to connect these devices to. Parallel +port, because of its ability to change single bits at will, and providing +both output and input bits is the most suitable port on the PC for +connecting such devices. + +2. Devices supported +~~~~~~~~~~~~~~~~~~~~ + Many console and 8-bit computer gamepads and joysticks are supported. The +following subsections discuss usage of each. + +2.1 NES and SNES +~~~~~~~~~~~~~~~~ + The Nintendo Entertainment System and Super Nintendo Entertainment System +gamepads are widely available, and easy to get. Also, they are quite easy to +connect to a PC, and don't need much processing speed (108 us for NES and +165 us for SNES, compared to about 1000 us for PC gamepads) to communicate +with them. + + All NES and SNES use the same synchronous serial protocol, clocked from +the computer's side (and thus timing insensitive). To allow up to 5 NES +and/or SNES gamepads connected to the parallel port at once, the output +lines of the parallel port are shared, while one of 5 available input lines +is assigned to each gamepad. + + This protocol is handled by the gamecon.c driver, so that's the one +you'll use for NES and SNES gamepads. + + The main problem with PC parallel ports is that they don't have +5V power +source on any of their pins. So, if you want a reliable source of power +for your pads, use either keyboard or joystick port, and make a pass-through +cable. You can also pull the power directly from the power supply (the red +wire is +5V). + + If you want to use the parallel port only, you can take the power is from +some data pin. For most gamepad and parport implementations only one pin is +needed, and I'd recommend pin 9 for that, the highest data bit. On the other +hand, if you are not planning to use anything else than NES / SNES on the +port, anything between and including pin 4 and pin 9 will work. + +(pin 9) -----> Power + + Unfortunately, there are pads that need a lot more of power, and parallel +ports that can't give much current through the data pins. If this is your +case, you'll need to use diodes (as a prevention of destroying your parallel +port), and combine the currents of two or more data bits together. + + Diodes +(pin 9) ----|>|-------+------> Power + | +(pin 8) ----|>|-------+ + | +(pin 7) ----|>|-------+ + | + <and so on> : + | +(pin 4) ----|>|-------+ + + Ground is quite easy. On PC's parallel port the ground is on any of the +pins from pin 18 to pin 25. So use any pin of these you like for the ground. + +(pin 18) -----> Ground + + NES and SNES pads have two input bits, Clock and Latch, which drive the +serial transfer. These are connected to pins 2 and 3 of the parallel port, +respectively. + +(pin 2) -----> Clock +(pin 3) -----> Latch + + And the last thing is the NES / SNES data wire. Only that isn't shared and +each pad needs its own data pin. The parallel port pins are: + +(pin 10) -----> Pad 1 data +(pin 11) -----> Pad 2 data +(pin 12) -----> Pad 3 data +(pin 13) -----> Pad 4 data +(pin 15) -----> Pad 5 data + + Note that pin 14 is not used, since it is not an input pin on the parallel +port. + + This is everything you need on the PC's side of the connection, now on to +the gamepads side. The NES and SNES have different connectors. Also, there +are quite a lot of NES clones, and because Nintendo used proprietary +connectors for their machines, the cloners couldn't and used standard D-Cannon +connectors. Anyway, if you've got a gamepad, and it has buttons A, B, Turbo +A, Turbo B, Select and Start, and is connected through 5 wires, then it is +either a NES or NES clone and will work with this connection. SNES gamepads +also use 5 wires, but have more buttons. They will work as well, of course. + +Pinout for NES gamepads Pinout for SNES gamepads + + +----> Power +-----------------------\ + | 7 | o o o o | x x o | 1 + 5 +---------+ 7 +-----------------------/ + | x x o \ | | | | | + | o o o o | | | | | +-> Ground + 4 +------------+ 1 | | | +------------> Data + | | | | | | +---------------> Latch + | | | +-> Ground | +------------------> Clock + | | +----> Clock +---------------------> Power + | +-------> Latch + +----------> Data + +Pinout for NES clone (db9) gamepads Pinout for NES clone (db15) gamepads + + +---------> Clock +-----------------> Data + | +-------> Latch | +---> Ground + | | +-----> Data | | + | | | ___________________ + _____________ 8 \ o x x x x x x o / 1 + 5 \ x o o o x / 1 \ o x x o x x o / + \ x o x o / 15 `~~~~~~~~~~~~~' 9 + 9 `~~~~~~~' 6 | | | + | | | | +----> Clock + | +----> Power | +----------> Latch + +--------> Ground +----------------> Power + +2.2 Multisystem joysticks +~~~~~~~~~~~~~~~~~~~~~~~~~ + In the era of 8-bit machines, there was something like de-facto standard +for joystick ports. They were all digital, and all used D-Cannon 9 pin +connectors (db9). Because of that, a single joystick could be used without +hassle on Atari (130, 800XE, 800XL, 2600, 7200), Amiga, Commodore C64, +Amstrad CPC, Sinclair ZX Spectrum and many other machines. That's why these +joysticks are called "Multisystem". + + Now their pinout: + + +---------> Right + | +-------> Left + | | +-----> Down + | | | +---> Up + | | | | + _____________ +5 \ x o o o o / 1 + \ x o x o / + 9 `~~~~~~~' 6 + | | + | +----> Button + +--------> Ground + + However, as time passed, extensions to this standard developed, and these +were not compatible with each other: + + + Atari 130, 800/XL/XE MSX + + +-----------> Power + +---------> Right | +---------> Right + | +-------> Left | | +-------> Left + | | +-----> Down | | | +-----> Down + | | | +---> Up | | | | +---> Up + | | | | | | | | | + _____________ _____________ +5 \ x o o o o / 1 5 \ o o o o o / 1 + \ x o o o / \ o o o o / + 9 `~~~~~~~' 6 9 `~~~~~~~' 6 + | | | | | | | + | | +----> Button | | | +----> Button 1 + | +------> Power | | +------> Button 2 + +--------> Ground | +--------> Output 3 + +----------> Ground + + Amstrad CPC Commodore C64 + + +-----------> Analog Y + +---------> Right | +---------> Right + | +-------> Left | | +-------> Left + | | +-----> Down | | | +-----> Down + | | | +---> Up | | | | +---> Up + | | | | | | | | | + _____________ _____________ +5 \ x o o o o / 1 5 \ o o o o o / 1 + \ x o o o / \ o o o o / + 9 `~~~~~~~' 6 9 `~~~~~~~' 6 + | | | | | | | + | | +----> Button 1 | | | +----> Button + | +------> Button 2 | | +------> Power + +--------> Ground | +--------> Ground + +----------> Analog X + + Sinclair Spectrum +2A/+3 Amiga 1200 + + +-----------> Up +-----------> Button 3 + | +---------> Fire | +---------> Right + | | | | +-------> Left + | | +-----> Ground | | | +-----> Down + | | | | | | | +---> Up + | | | | | | | | + _____________ _____________ +5 \ o o x o x / 1 5 \ o o o o o / 1 + \ o o o o / \ o o o o / + 9 `~~~~~~~' 6 9 `~~~~~~~' 6 + | | | | | | | | + | | | +----> Right | | | +----> Button 1 + | | +------> Left | | +------> Power + | +--------> Ground | +--------> Ground + +----------> Down +----------> Button 2 + + And there were many others. + +2.2.1 Multisystem joysticks using db9.c +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + For the Multisystem joysticks, and their derivatives, the db9.c driver +was written. It allows only one joystick / gamepad per parallel port, but +the interface is easy to build and works with almost anything. + + For the basic 1-button Multisystem joystick you connect its wires to the +parallel port like this: + +(pin 1) -----> Power +(pin 18) -----> Ground + +(pin 2) -----> Up +(pin 3) -----> Down +(pin 4) -----> Left +(pin 5) -----> Right +(pin 6) -----> Button 1 + + However, if the joystick is switch based (eg. clicks when you move it), +you might or might not, depending on your parallel port, need 10 kOhm pullup +resistors on each of the direction and button signals, like this: + +(pin 2) ------------+------> Up + Resistor | +(pin 1) --[10kOhm]--+ + + Try without, and if it doesn't work, add them. For TTL based joysticks / +gamepads the pullups are not needed. + + For joysticks with two buttons you connect the second button to pin 7 on +the parallel port. + +(pin 7) -----> Button 2 + + And that's it. + + On a side note, if you have already built a different adapter for use with +the digital joystick driver 0.8.0.2, this is also supported by the db9.c +driver, as device type 8. (See section 3.2) + +2.2.2 Multisystem joysticks using gamecon.c +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + For some people just one joystick per parallel port is not enough, and/or +want to use them on one parallel port together with NES/SNES/PSX pads. This is +possible using the gamecon.c. It supports up to 5 devices of the above types, +including 1 and 2 buttons Multisystem joysticks. + + However, there is nothing for free. To allow more sticks to be used at +once, you need the sticks to be purely switch based (that is non-TTL), and +not to need power. Just a plain simple six switches inside. If your +joystick can do more (eg. turbofire) you'll need to disable it totally first +if you want to use gamecon.c. + + Also, the connection is a bit more complex. You'll need a bunch of diodes, +and one pullup resistor. First, you connect the Directions and the button +the same as for db9, however with the diodes inbetween. + + Diodes +(pin 2) -----|<|----> Up +(pin 3) -----|<|----> Down +(pin 4) -----|<|----> Left +(pin 5) -----|<|----> Right +(pin 6) -----|<|----> Button 1 + + For two button sticks you also connect the other button. + +(pin 7) -----|<|----> Button 2 + + And finally, you connect the Ground wire of the joystick, like done in +this little schematic to Power and Data on the parallel port, as described +for the NES / SNES pads in section 2.1 of this file - that is, one data pin +for each joystick. The power source is shared. + +Data ------------+-----> Ground + Resistor | +Power --[10kOhm]--+ + + And that's all, here we go! + +2.2.3 Multisystem joysticks using turbografx.c +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The TurboGraFX interface, designed by + + Steffen Schwenke <schwenke@burg-halle.de> + + allows up to 7 Multisystem joysticks connected to the parallel port. In +Steffen's version, there is support for up to 5 buttons per joystick. However, +since this doesn't work reliably on all parallel ports, the turbografx.c driver +supports only one button per joystick. For more information on how to build the +interface, see + + http://www2.burg-halle.de/~schwenke/parport.html + +2.3 Sony Playstation +~~~~~~~~~~~~~~~~~~~~ + + The PSX controller is supported by the gamecon.c. Pinout of the PSX +controller (compatible with DirectPadPro): + + +---------+---------+---------+ +9 | o o o | o o o | o o o | 1 parallel + \________|_________|________/ port pins + | | | | | | + | | | | | +--------> Clock --- (4) + | | | | +------------> Select --- (3) + | | | +---------------> Power --- (5-9) + | | +------------------> Ground --- (18-25) + | +-------------------------> Command --- (2) + +----------------------------> Data --- (one of 10,11,12,13,15) + + The driver supports these controllers: + + * Standard PSX Pad + * NegCon PSX Pad + * Analog PSX Pad (red mode) + * Analog PSX Pad (green mode) + * PSX Rumble Pad + * PSX DDR Pad + +2.4 Sega +~~~~~~~~ + All the Sega controllers are more or less based on the standard 2-button +Multisystem joystick. However, since they don't use switches and use TTL +logic, the only driver usable with them is the db9.c driver. + +2.4.1 Sega Master System +~~~~~~~~~~~~~~~~~~~~~~~~ + The SMS gamepads are almost exactly the same as normal 2-button +Multisystem joysticks. Set the driver to Multi2 mode, use the corresponding +parallel port pins, and the following schematic: + + +-----------> Power + | +---------> Right + | | +-------> Left + | | | +-----> Down + | | | | +---> Up + | | | | | + _____________ +5 \ o o o o o / 1 + \ o o x o / + 9 `~~~~~~~' 6 + | | | + | | +----> Button 1 + | +--------> Ground + +----------> Button 2 + +2.4.2 Sega Genesis aka MegaDrive +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The Sega Genesis (in Europe sold as Sega MegaDrive) pads are an extension +to the Sega Master System pads. They use more buttons (3+1, 5+1, 6+1). Use +the following schematic: + + +-----------> Power + | +---------> Right + | | +-------> Left + | | | +-----> Down + | | | | +---> Up + | | | | | + _____________ +5 \ o o o o o / 1 + \ o o o o / + 9 `~~~~~~~' 6 + | | | | + | | | +----> Button 1 + | | +------> Select + | +--------> Ground + +----------> Button 2 + + The Select pin goes to pin 14 on the parallel port. + +(pin 14) -----> Select + + The rest is the same as for Multi2 joysticks using db9.c + +2.4.3 Sega Saturn +~~~~~~~~~~~~~~~~~ + Sega Saturn has eight buttons, and to transfer that, without hacks like +Genesis 6 pads use, it needs one more select pin. Anyway, it is still +handled by the db9.c driver. Its pinout is very different from anything +else. Use this schematic: + + +-----------> Select 1 + | +---------> Power + | | +-------> Up + | | | +-----> Down + | | | | +---> Ground + | | | | | + _____________ +5 \ o o o o o / 1 + \ o o o o / + 9 `~~~~~~~' 6 + | | | | + | | | +----> Select 2 + | | +------> Right + | +--------> Left + +----------> Power + + Select 1 is pin 14 on the parallel port, Select 2 is pin 16 on the +parallel port. + +(pin 14) -----> Select 1 +(pin 16) -----> Select 2 + + The other pins (Up, Down, Right, Left, Power, Ground) are the same as for +Multi joysticks using db9.c + +3. The drivers +~~~~~~~~~~~~~~ + There are three drivers for the parallel port interfaces. Each, as +described above, allows to connect a different group of joysticks and pads. +Here are described their command lines: + +3.1 gamecon.c +~~~~~~~~~~~~~ + Using gamecon.c you can connect up to five devices to one parallel port. It +uses the following kernel/module command line: + + gamecon.map=port,pad1,pad2,pad3,pad4,pad5 + + Where 'port' the number of the parport interface (eg. 0 for parport0). + + And 'pad1' to 'pad5' are pad types connected to different data input pins +(10,11,12,13,15), as described in section 2.1 of this file. + + The types are: + + Type | Joystick/Pad + -------------------- + 0 | None + 1 | SNES pad + 2 | NES pad + 4 | Multisystem 1-button joystick + 5 | Multisystem 2-button joystick + 6 | N64 pad + 7 | Sony PSX controller + 8 | Sony PSX DDR controller + + The exact type of the PSX controller type is autoprobed when used so +hot swapping should work (but is not recomended). + + Should you want to use more than one of parallel ports at once, you can use +gamecon.map2 and gamecon.map3 as additional command line parameters for two +more parallel ports. + + There are two options specific to PSX driver portion. gamecon.psx_delay sets +the command delay when talking to the controllers. The default of 25 should +work but you can try lowering it for better performace. If your pads don't +respond try raising it untill they work. Setting the type to 8 allows the +driver to be used with Dance Dance Revolution or similar games. Arrow keys are +registered as key presses instead of X and Y axes. + +3.2 db9.c +~~~~~~~~~ + Apart from making an interface, there is nothing difficult on using the +db9.c driver. It uses the following kernel/module command line: + + db9.dev=port,type + + Where 'port' is the number of the parport interface (eg. 0 for parport0). + + Caveat here: This driver only works on bidirectional parallel ports. If +your parallel port is recent enough, you should have no trouble with this. +Old parallel ports may not have this feature. + + 'Type' is the type of joystick or pad attached: + + Type | Joystick/Pad + -------------------- + 0 | None + 1 | Multisystem 1-button joystick + 2 | Multisystem 2-button joystick + 3 | Genesis pad (3+1 buttons) + 5 | Genesis pad (5+1 buttons) + 6 | Genesis pad (6+2 buttons) + 7 | Saturn pad (8 buttons) + 8 | Multisystem 1-button joystick (v0.8.0.2 pin-out) + 9 | Two Multisystem 1-button joysticks (v0.8.0.2 pin-out) + 10 | Amiga CD32 pad + + Should you want to use more than one of these joysticks/pads at once, you +can use db9.dev2 and db9.dev3 as additional command line parameters for two +more joysticks/pads. + +3.3 turbografx.c +~~~~~~~~~~~~~~~~ + The turbografx.c driver uses a very simple kernel/module command line: + + turbografx.map=port,js1,js2,js3,js4,js5,js6,js7 + + Where 'port' is the number of the parport interface (eg. 0 for parport0). + + 'jsX' is the number of buttons the Multisystem joysticks connected to the +interface ports 1-7 have. For a standard multisystem joystick, this is 1. + + Should you want to use more than one of these interfaces at once, you can +use turbografx.map2 and turbografx.map3 as additional command line parameters +for two more interfaces. + +3.4 PC parallel port pinout +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + .----------------------------------------. + At the PC: \ 13 12 11 10 9 8 7 6 5 4 3 2 1 / + \ 25 24 23 22 21 20 19 18 17 16 15 14 / + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + Pin | Name | Description + ~~~~~~|~~~~~~~~~|~~~~~~~~~~ + 1 | /STROBE | Strobe + 2-9 | D0-D7 | Data Bit 0-7 + 10 | /ACK | Acknowledge + 11 | BUSY | Busy + 12 | PE | Paper End + 13 | SELIN | Select In + 14 | /AUTOFD | Autofeed + 15 | /ERROR | Error + 16 | /INIT | Initialize + 17 | /SEL | Select + 18-25 | GND | Signal Ground + +3.5 End +~~~~~~~ + That's all, folks! Have fun! diff --git a/Documentation/input/joystick.txt b/Documentation/input/joystick.txt new file mode 100644 index 000000000000..d53b857a3710 --- /dev/null +++ b/Documentation/input/joystick.txt @@ -0,0 +1,588 @@ + Linux Joystick driver v2.0.0 + (c) 1996-2000 Vojtech Pavlik <vojtech@ucw.cz> + Sponsored by SuSE + $Id: joystick.txt,v 1.12 2002/03/03 12:13:07 jdeneux Exp $ +---------------------------------------------------------------------------- + +0. Disclaimer +~~~~~~~~~~~~~ + This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the Free +Software Foundation; either version 2 of the License, or (at your option) +any later version. + + This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. + + You should have received a copy of the GNU General Public License along +with this program; if not, write to the Free Software Foundation, Inc., 59 +Temple Place, Suite 330, Boston, MA 02111-1307 USA + + Should you need to contact me, the author, you can do so either by e-mail +- mail your message to <vojtech@ucw.cz>, or by paper mail: Vojtech Pavlik, +Simunkova 1594, Prague 8, 182 00 Czech Republic + + For your convenience, the GNU General Public License version 2 is included +in the package: See the file COPYING. + +1. Intro +~~~~~~~~ + The joystick driver for Linux provides support for a variety of joysticks +and similar devices. It is based on a larger project aiming to support all +input devices in Linux. + + Should you encounter any problems while using the driver, or joysticks +this driver can't make complete use of, I'm very interested in hearing about +them. Bug reports and success stories are also welcome. + + The input project website is at: + + http://www.suse.cz/development/input/ + http://atrey.karlin.mff.cuni.cz/~vojtech/input/ + + There is also a mailing list for the driver at: + + listproc@atrey.karlin.mff.cuni.cz + +send "subscribe linux-joystick Your Name" to subscribe to it. + +2. Usage +~~~~~~~~ + For basic usage you just choose the right options in kernel config and +you should be set. + +2.1 inpututils +~~~~~~~~~~~~~~ +For testing and other purposes (for example serial devices), a set of +utilities is available at the abovementioned website. I suggest you download +and install it before going on. + +2.2 Device nodes +~~~~~~~~~~~~~~~~ +For applications to be able to use the joysticks, in you don't use devfs, +you'll have to manually create these nodes in /dev: + +cd /dev +rm js* +mkdir input +mknod input/js0 c 13 0 +mknod input/js1 c 13 1 +mknod input/js2 c 13 2 +mknod input/js3 c 13 3 +ln -s input/js0 js0 +ln -s input/js1 js1 +ln -s input/js2 js2 +ln -s input/js3 js3 + +For testing with inpututils it's also convenient to create these: + +mknod input/event0 c 13 64 +mknod input/event1 c 13 65 +mknod input/event2 c 13 66 +mknod input/event3 c 13 67 + +2.4 Modules needed +~~~~~~~~~~~~~~~~~~ + For all joystick drivers to function, you'll need the userland interface +module in kernel, either loaded or compiled in: + + modprobe joydev + + For gameport joysticks, you'll have to load the gameport driver as well; + + modprobe ns558 + + And for serial port joysticks, you'll need the serial input line +discipline module loaded and the inputattach utility started: + + modprobe serport + inputattach -xxx /dev/tts/X & + + In addition to that, you'll need the joystick driver module itself, most +usually you'll have an analog joystick: + + modprobe analog + + For automatic module loading, something like this might work - tailor to +your needs: + + alias tty-ldisc-2 serport + alias char-major-13 input + above input joydev ns558 analog + options analog map=gamepad,none,2btn + +2.5 Verifying that it works +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + For testing the joystick driver functionality, there is the jstest +program in the utilities package. You run it by typing: + + jstest /dev/js0 + + And it should show a line with the joystick values, which update as you +move the stick, and press its buttons. The axes should all be zero when the +joystick is in the center position. They should not jitter by themselves to +other close values, and they also should be steady in any other position of +the stick. They should have the full range from -32767 to 32767. If all this +is met, then it's all fine, and you can play the games. :) + + If it's not, then there might be a problem. Try to calibrate the joystick, +and if it still doesn't work, read the drivers section of this file, the +troubleshooting section, and the FAQ. + +2.6. Calibration +~~~~~~~~~~~~~~~~ + For most joysticks you won't need any manual calibration, since the +joystick should be autocalibrated by the driver automagically. However, with +some analog joysticks, that either do not use linear resistors, or if you +want better precision, you can use the jscal program + + jscal -c /dev/js0 + + included in the joystick package to set better correction coefficients than +what the driver would choose itself. + + After calibrating the joystick you can verify if you like the new +calibration using the jstest command, and if you do, you then can save the +correction coefficients into a file + + jscal -p /dev/js0 > /etc/joystick.cal + + And add a line to your rc script executing that file + + source /etc/joystick.cal + + This way, after the next reboot your joystick will remain calibrated. You +can also add the jscal -p line to your shutdown script. + + +3. HW specific driver information +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In this section each of the separate hardware specific drivers is described. + +3.1 Analog joysticks +~~~~~~~~~~~~~~~~~~~~ + The analog.c uses the standard analog inputs of the gameport, and thus +supports all standard joysticks and gamepads. It uses a very advanced +routine for this, allowing for data precision that can't be found on any +other system. + + It also supports extensions like additional hats and buttons compatible +with CH Flightstick Pro, ThrustMaster FCS or 6 and 8 button gamepads. Saitek +Cyborg 'digital' joysticks are also supported by this driver, because +they're basically souped up CHF sticks. + + However the only types that can be autodetected are: + +* 2-axis, 4-button joystick +* 3-axis, 4-button joystick +* 4-axis, 4-button joystick +* Saitek Cyborg 'digital' joysticks + + For other joystick types (more/less axes, hats, and buttons) support +you'll need to specify the types either on the kernel command line or on the +module command line, when inserting analog into the kernel. The +parameters are: + + analog.map=<type1>,<type2>,<type3>,.... + + 'type' is type of the joystick from the table below, defining joysticks +present on gameports in the system, starting with gameport0, second 'type' +entry defining joystick on gameport1 and so on. + + Type | Meaning + ----------------------------------- + none | No analog joystick on that port + auto | Autodetect joystick + 2btn | 2-button n-axis joystick + y-joy | Two 2-button 2-axis joysticks on an Y-cable + y-pad | Two 2-button 2-axis gamepads on an Y-cable + fcs | Thrustmaster FCS compatible joystick + chf | Joystick with a CH Flightstick compatible hat + fullchf | CH Flightstick compatible with two hats and 6 buttons + gamepad | 4/6-button n-axis gamepad + gamepad8 | 8-button 2-axis gamepad + + In case your joystick doesn't fit in any of the above categories, you can +specify the type as a number by combining the bits in the table below. This +is not recommended unless you really know what are you doing. It's not +dangerous, but not simple either. + + Bit | Meaning + -------------------------- + 0 | Axis X1 + 1 | Axis Y1 + 2 | Axis X2 + 3 | Axis Y2 + 4 | Button A + 5 | Button B + 6 | Button C + 7 | Button D + 8 | CHF Buttons X and Y + 9 | CHF Hat 1 + 10 | CHF Hat 2 + 11 | FCS Hat + 12 | Pad Button X + 13 | Pad Button Y + 14 | Pad Button U + 15 | Pad Button V + 16 | Saitek F1-F4 Buttons + 17 | Saitek Digital Mode + 19 | GamePad + 20 | Joy2 Axis X1 + 21 | Joy2 Axis Y1 + 22 | Joy2 Axis X2 + 23 | Joy2 Axis Y2 + 24 | Joy2 Button A + 25 | Joy2 Button B + 26 | Joy2 Button C + 27 | Joy2 Button D + 31 | Joy2 GamePad + +3.2 Microsoft SideWinder joysticks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Microsoft 'Digital Overdrive' protocol is supported by the sidewinder.c +module. All currently supported joysticks: + +* Microsoft SideWinder 3D Pro +* Microsoft SideWinder Force Feedback Pro +* Microsoft SideWinder Force Feedback Wheel +* Microsoft SideWinder FreeStyle Pro +* Microsoft SideWinder GamePad (up to four, chained) +* Microsoft SideWinder Precision Pro +* Microsoft SideWinder Precision Pro USB + + are autodetected, and thus no module parameters are needed. + + There is one caveat with the 3D Pro. There are 9 buttons reported, +although the joystick has only 8. The 9th button is the mode switch on the +rear side of the joystick. However, moving it, you'll reset the joystick, +and make it unresponsive for about a one third of a second. Furthermore, the +joystick will also re-center itself, taking the position it was in during +this time as a new center position. Use it if you want, but think first. + + The SideWinder Standard is not a digital joystick, and thus is supported +by the analog driver described above. + +3.3 Logitech ADI devices +~~~~~~~~~~~~~~~~~~~~~~~~ + Logitech ADI protocol is supported by the adi.c module. It should support +any Logitech device using this protocol. This includes, but is not limited +to: + +* Logitech CyberMan 2 +* Logitech ThunderPad Digital +* Logitech WingMan Extreme Digital +* Logitech WingMan Formula +* Logitech WingMan Interceptor +* Logitech WingMan GamePad +* Logitech WingMan GamePad USB +* Logitech WingMan GamePad Extreme +* Logitech WingMan Extreme Digital 3D + + ADI devices are autodetected, and the driver supports up to two (any +combination of) devices on a single gameport, using an Y-cable or chained +together. + + Logitech WingMan Joystick, Logitech WingMan Attack, Logitech WingMan +Extreme and Logitech WingMan ThunderPad are not digital joysticks and are +handled by the analog driver described above. Logitech WingMan Warrior and +Logitech Magellan are supported by serial drivers described below. Logitech +WingMan Force and Logitech WingMan Formula Force are supported by the +I-Force driver described below. Logitech CyberMan is not supported yet. + +3.4 Gravis GrIP +~~~~~~~~~~~~~~~ + Gravis GrIP protocol is supported by the grip.c module. It currently +supports: + +* Gravis GamePad Pro +* Gravis BlackHawk Digital +* Gravis Xterminator +* Gravis Xterminator DualControl + + All these devices are autodetected, and you can even use any combination +of up to two of these pads either chained together or using an Y-cable on a +single gameport. + +GrIP MultiPort isn't supported yet. Gravis Stinger is a serial device and is +supported by the stinger driver. Other Gravis joysticks are supported by the +analog driver. + +3.5 FPGaming A3D and MadCatz A3D +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The Assassin 3D protocol created by FPGaming, is used both by FPGaming +themselves and is licensed to MadCatz. A3D devices are supported by the +a3d.c module. It currently supports: + +* FPGaming Assassin 3D +* MadCatz Panther +* MadCatz Panther XL + + All these devices are autodetected. Because the Assassin 3D and the Panther +allow connecting analog joysticks to them, you'll need to load the analog +driver as well to handle the attached joysticks. + + The trackball should work with USB mousedev module as a normal mouse. See +the USB documentation for how to setup an USB mouse. + +3.6 ThrustMaster DirectConnect (BSP) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The TM DirectConnect (BSP) protocol is supported by the tmdc.c +module. This includes, but is not limited to: + +* ThrustMaster Millenium 3D Inceptor +* ThrustMaster 3D Rage Pad +* ThrustMaster Fusion Digital Game Pad + + Devices not directly supported, but hopefully working are: + +* ThrustMaster FragMaster +* ThrustMaster Attack Throttle + + If you have one of these, contact me. + + TMDC devices are autodetected, and thus no parameters to the module +are needed. Up to two TMDC devices can be connected to one gameport, using +an Y-cable. + +3.7 Creative Labs Blaster +~~~~~~~~~~~~~~~~~~~~~~~~~ + The Blaster protocol is supported by the cobra.c module. It supports only +the: + +* Creative Blaster GamePad Cobra + + Up to two of these can be used on a single gameport, using an Y-cable. + +3.8 Genius Digital joysticks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The Genius digitally communicating joysticks are supported by the gf2k.c +module. This includes: + +* Genius Flight2000 F-23 joystick +* Genius Flight2000 F-31 joystick +* Genius G-09D gamepad + + Other Genius digital joysticks are not supported yet, but support can be +added fairly easily. + +3.9 InterAct Digital joysticks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The InterAct digitally communicating joysticks are supported by the +interact.c module. This includes: + +* InterAct HammerHead/FX gamepad +* InterAct ProPad8 gamepad + + Other InterAct digital joysticks are not supported yet, but support can be +added fairly easily. + +3.10 PDPI Lightning 4 gamecards +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + PDPI Lightning 4 gamecards are supported by the lightning.c module. +Once the module is loaded, the analog driver can be used to handle the +joysticks. Digitally communicating joystick will work only on port 0, while +using Y-cables, you can connect up to 8 analog joysticks to a single L4 +card, 16 in case you have two in your system. + +3.11 Trident 4DWave / Aureal Vortex +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Soundcards with a Trident 4DWave DX/NX or Aureal Vortex/Vortex2 chipsets +provide an "Enhanced Game Port" mode where the soundcard handles polling the +joystick. This mode is supported by the pcigame.c module. Once loaded the +analog driver can use the enhanced features of these gameports.. + +3.13 Crystal SoundFusion +~~~~~~~~~~~~~~~~~~~~~~~~ + Soundcards with Crystal SoundFusion chipsets provide an "Enhanced Game +Port", much like the 4DWave or Vortex above. This, and also the normal mode +for the port of the SoundFusion is supported by the cs461x.c module. + +3.14 SoundBlaster Live! +~~~~~~~~~~~~~~~~~~~~~~~~ + The Live! has a special PCI gameport, which, although it doesn't provide +any "Enhanced" stuff like 4DWave and friends, is quite a bit faster than +it's ISA counterparts. It also requires special support, hence the +emu10k1-gp.c module for it instead of the normal ns558.c one. + +3.15 SoundBlaster 64 and 128 - ES1370 and ES1371, ESS Solo1 and S3 SonicVibes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + These PCI soundcards have specific gameports. They are handled by the +sound drivers themselves. Make sure you select gameport support in the +joystick menu and sound card support in the sound menu for your appropriate +card. + +3.16 Amiga +~~~~~~~~~~ + Amiga joysticks, connected to an Amiga, are supported by the amijoy.c +driver. Since they can't be autodetected, the driver has a command line. + + amijoy.map=<a>,<b> + + a and b define the joysticks connected to the JOY0DAT and JOY1DAT ports of +the Amiga. + + Value | Joystick type + --------------------- + 0 | None + 1 | 1-button digital joystick + + No more joystick types are supported now, but that should change in the +future if I get an Amiga in the reach of my fingers. + +3.17 Game console and 8-bit pads and joysticks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +See joystick-parport.txt for more info. + +3.18 SpaceTec/LabTec devices +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + SpaceTec serial devices communicate using the SpaceWare protocol. It is +supported by the spaceorb.c and spaceball.c drivers. The devices currently +supported by spaceorb.c are: + +* SpaceTec SpaceBall Avenger +* SpaceTec SpaceOrb 360 + +Devices currently supported by spaceball.c are: + +* SpaceTec SpaceBall 4000 FLX + + In addition to having the spaceorb/spaceball and serport modules in the +kernel, you also need to attach a serial port to it. to do that, run the +inputattach program: + + inputattach --spaceorb /dev/tts/x & +or + inputattach --spaceball /dev/tts/x & + +where /dev/tts/x is the serial port which the device is connected to. After +doing this, the device will be reported and will start working. + + There is one caveat with the SpaceOrb. The button #6, the on the bottom +side of the orb, although reported as an ordinary button, causes internal +recentering of the spaceorb, moving the zero point to the position in which +the ball is at the moment of pressing the button. So, think first before +you bind it to some other function. + +SpaceTec SpaceBall 2003 FLX and 3003 FLX are not supported yet. + +3.19 Logitech SWIFT devices +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The SWIFT serial protocol is supported by the warrior.c module. It +currently supports only the: + +* Logitech WingMan Warrior + +but in the future, Logitech CyberMan (the original one, not CM2) could be +supported as well. To use the module, you need to run inputattach after you +insert/compile the module into your kernel: + + inputattach --warrior /dev/tts/x & + +/dev/tts/x is the serial port your Warrior is attached to. + +3.20 Magellan / Space Mouse +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The Magellan (or Space Mouse), manufactured by LogiCad3d (formerly Space +Systems), for many other companies (Logitech, HP, ...) is supported by the +joy-magellan module. It currently supports only the: + +* Magellan 3D +* Space Mouse + +models, the additional buttons on the 'Plus' versions are not supported yet. + + To use it, you need to attach the serial port to the driver using the + + inputattach --magellan /dev/tts/x & + +command. After that the Magellan will be detected, initialized, will beep, +and the /dev/input/jsX device should become usable. + +3.21 I-Force devices +~~~~~~~~~~~~~~~~~~~~ + All I-Force devices are supported by the iforce module. This includes: + +* AVB Mag Turbo Force +* AVB Top Shot Pegasus +* AVB Top Shot Force Feedback Racing Wheel +* Logitech WingMan Force +* Logitech WingMan Force Wheel +* Guillemot Race Leader Force Feedback +* Guillemot Force Feedback Racing Wheel +* Thrustmaster Motor Sport GT + + To use it, you need to attach the serial port to the driver using the + + inputattach --iforce /dev/tts/x & + +command. After that the I-Force device will be detected, and the +/dev/input/jsX device should become usable. + + In case you're using the device via the USB port, the inputattach command +isn't needed. + + The I-Force driver now supports force feedback via the event interface. + + Please note that Logitech WingMan *3D devices are _not_ supported by this +module, rather by hid. Force feedback is not supported for those devices. +Logitech gamepads are also hid devices. + +3.22 Gravis Stinger gamepad +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + The Gravis Stinger serial port gamepad, designed for use with laptop +computers, is supported by the stinger.c module. To use it, attach the +serial port to the driver using: + + inputattach --stinger /dev/tty/x & + +where x is the number of the serial port. + +4. Troubleshooting +~~~~~~~~~~~~~~~~~~ + There is quite a high probability that you run into some problems. For +testing whether the driver works, if in doubt, use the jstest utility in +some of its modes. The most useful modes are "normal" - for the 1.x +interface, and "old" for the "0.x" interface. You run it by typing: + + jstest --normal /dev/input/js0 + jstest --old /dev/input/js0 + + Additionally you can do a test with the evtest utility: + + evtest /dev/input/event0 + + Oh, and read the FAQ! :) + +5. FAQ +~~~~~~ +Q: Running 'jstest /dev/js0' results in "File not found" error. What's the + cause? +A: The device files don't exist. Create them (see section 2.2). + +Q: Is it possible to connect my old Atari/Commodore/Amiga/console joystick + or pad that uses a 9-pin D-type cannon connector to the serial port of my + PC? +A: Yes, it is possible, but it'll burn your serial port or the pad. It + won't work, of course. + +Q: My joystick doesn't work with Quake / Quake 2. What's the cause? +A: Quake / Quake 2 don't support joystick. Use joy2key to simulate keypresses + for them. + +6. Programming Interface +~~~~~~~~~~~~~~~~~~~~~~~~ + The 1.0 driver uses a new, event based approach to the joystick driver. +Instead of the user program polling for the joystick values, the joystick +driver now reports only any changes of its state. See joystick-api.txt, +joystick.h and jstest.c included in the joystick package for more +information. The joystick device can be used in either blocking or +nonblocking mode and supports select() calls. + + For backward compatibility the old (v0.x) interface is still included. +Any call to the joystick driver using the old interface will return values +that are compatible to the old interface. This interface is still limited +to 2 axes, and applications using it usually decode only 2 buttons, although +the driver provides up to 32. diff --git a/Documentation/input/shape.fig b/Documentation/input/shape.fig new file mode 100644 index 000000000000..c22bff83d06f --- /dev/null +++ b/Documentation/input/shape.fig @@ -0,0 +1,65 @@ +#FIG 3.2 +Landscape +Center +Inches +Letter +100.00 +Single +-2 +1200 2 +2 1 0 2 0 7 50 0 -1 0.000 0 0 -1 0 0 6 + 4200 3600 4200 3075 4950 2325 7425 2325 8250 3150 8250 3600 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 4200 3675 4200 5400 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 8250 3675 8250 5400 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 3675 3600 8700 3600 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 8775 3600 10200 3600 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 8325 3150 9075 3150 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 7500 2325 10200 2325 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 3600 3600 3000 3600 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 4125 3075 3000 3075 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 120.00 + 0 0 1.00 60.00 120.00 + 4200 5400 8175 5400 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 120.00 + 0 0 1.00 60.00 120.00 + 10125 2325 10125 3600 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 120.00 + 0 0 1.00 60.00 120.00 + 3000 3150 3000 3600 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 120.00 + 0 0 1.00 60.00 120.00 + 9075 3150 9075 3600 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 4950 2325 4950 1200 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 2 + 7425 2325 7425 1200 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 4 + 4200 3075 4200 2400 3600 1800 3600 1200 +2 1 1 1 0 7 50 0 -1 4.000 0 0 -1 0 0 4 + 8250 3150 8250 2475 8775 1950 8775 1200 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 120.00 + 0 0 1.00 60.00 120.00 + 3600 1275 4950 1275 +2 1 0 1 0 7 50 0 -1 4.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 120.00 + 0 0 1.00 60.00 120.00 + 7425 1275 8700 1275 +4 1 0 50 0 0 12 0.0000 4 135 1140 6075 5325 Effect duration\001 +4 0 0 50 0 0 12 0.0000 4 180 1305 10200 3000 Effect magnitude\001 +4 0 0 50 0 0 12 0.0000 4 135 780 9150 3450 Fade level\001 +4 1 0 50 0 0 12 0.0000 4 180 1035 4275 1200 Attack length\001 +4 1 0 50 0 0 12 0.0000 4 180 885 8175 1200 Fade length\001 +4 2 0 50 0 0 12 0.0000 4 135 930 2925 3375 Attack level\001 diff --git a/Documentation/input/xpad.txt b/Documentation/input/xpad.txt new file mode 100644 index 000000000000..b9111a703ce0 --- /dev/null +++ b/Documentation/input/xpad.txt @@ -0,0 +1,116 @@ +xpad - Linux USB driver for X-Box gamepads + +This is the very first release of a driver for X-Box gamepads. +Basically, this was hacked away in just a few hours, so don't expect +miracles. +In particular, there is currently NO support for the rumble pack. +You won't find many ff-aware linux applications anyway. + + +0. Status +--------- + +For now, this driver has only been tested on just one Linux-Box. +This one is running a 2.4.18 kernel with usb-uhci on an amd athlon 600. + +The jstest-program from joystick-1.2.15 (jstest-version 2.1.0) reports +8 axes and 10 buttons. + +Alls 8 axes work, though they all have the same range (-32768..32767) +and the zero-setting is not correct for the triggers (I don't know if that +is some limitation of jstest, since the input device setup should be fine. I +didn't have a look at jstest itself yet). + +All of the 10 buttons work (in digital mode). The six buttons on the +right side (A, B, X, Y, black, white) are said to be "analog" and +report their values as 8 bit unsigned, not sure what this is good for. + +I tested the controller with quake3, and configuration and +in game functionality were OK. However, I find it rather difficult to +play first person shooters with a pad. Your mileage may vary. + + +1. USB adapter +-------------- + +Before you can actually use the driver, you need to get yourself an +adapter cable to connect the X-Box controller to your Linux-Box. + +Such a cable is pretty easy to build. The Controller itself is a USB compound +device (a hub with three ports for two expansion slots and the controller +device) with the only difference in a nonstandard connector (5 pins vs. 4 on +standard USB connector). + +You just need to solder a USB connector onto the cable and keep the +yellow wire unconnected. The other pins have the same order on both +connectors so there is no magic to it. Detailed info on these matters +can be found on the net ([1], [2], [3]). + +Thanks to the trip splitter found on the cable you don't even need to cut the +original one. You can buy an extension cable and cut that instead. That way, +you can still use the controller with your X-Box, if you have one ;) + + +2. driver installation +---------------------- + +Once you have the adapter cable and the controller is connected, you need +to load your USB subsystem and should cat /proc/bus/usb/devices. +There should be an entry like the one at the end [4]. + +Currently (as of version 0.0.4), the following three devices are included: + original Microsoft XBOX controller (US), vendor=0x045e, product=0x0202 + original Microsoft XBOX controller (Japan), vendor=0x045e, product=0x0285 + InterAct PowerPad Pro (Germany), vendor=0x05fd, product=0x107a + +If you have another controller that is not listed above and is not recognized +by the driver, please drop me a line with the appropriate info (that is, include +the name, vendor and product ID, as well as the country where you bought it; +sending the whole dump out of /proc/bus/usb/devices along would be even better). + +In theory, the driver should work with other controllers than mine +(InterAct PowerPad pro, bought in Germany) just fine, but I cannot test this +for I only have this one controller. + +If you compiled and installed the driver, test the functionality: +> modprobe xpad +> modprobe joydev +> jstest /dev/js0 + +There should be a single line showing 18 inputs (8 axes, 10 buttons), and +it's values should change if you move the sticks and push the buttons. + +It works? Voila, your done ;) + + +3. Thanks +--------- + +I have to thank ITO Takayuki for the detailed info on his site + http://euc.jp/periphs/xbox-controller.ja.html. + +His useful info and both the usb-skeleton as well as the iforce input driver +(Greg Kroah-Hartmann; Vojtech Pavlik) helped a lot in rapid prototyping +the basic functionality. + + +4. References +------------- + +1. http://euc.jp/periphs/xbox-controller.ja.html (ITO Takayuki) +2. http://xpad.xbox-scene.com/ +3. http://www.xboxhackz.com/Hackz-Reference.htm + +4. /proc/bus/usb/devices - dump from InterAct PowerPad Pro (Germany): + +T: Bus=01 Lev=03 Prnt=04 Port=00 Cnt=01 Dev#= 5 Spd=12 MxCh= 0 +D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=32 #Cfgs= 1 +P: Vendor=05fd ProdID=107a Rev= 1.00 +C:* #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=100mA +I: If#= 0 Alt= 0 #EPs= 2 Cls=58(unk. ) Sub=42 Prot=00 Driver=(none) +E: Ad=81(I) Atr=03(Int.) MxPS= 32 Ivl= 10ms +E: Ad=02(O) Atr=03(Int.) MxPS= 32 Ivl= 10ms + +-- +Marko Friedemann <mfr@bmx-chemnitz.de> +2002-07-16 diff --git a/Documentation/io_ordering.txt b/Documentation/io_ordering.txt new file mode 100644 index 000000000000..9faae6f26d32 --- /dev/null +++ b/Documentation/io_ordering.txt @@ -0,0 +1,47 @@ +On some platforms, so-called memory-mapped I/O is weakly ordered. On such +platforms, driver writers are responsible for ensuring that I/O writes to +memory-mapped addresses on their device arrive in the order intended. This is +typically done by reading a 'safe' device or bridge register, causing the I/O +chipset to flush pending writes to the device before any reads are posted. A +driver would usually use this technique immediately prior to the exit of a +critical section of code protected by spinlocks. This would ensure that +subsequent writes to I/O space arrived only after all prior writes (much like a +memory barrier op, mb(), only with respect to I/O). + +A more concrete example from a hypothetical device driver: + + ... +CPU A: spin_lock_irqsave(&dev_lock, flags) +CPU A: val = readl(my_status); +CPU A: ... +CPU A: writel(newval, ring_ptr); +CPU A: spin_unlock_irqrestore(&dev_lock, flags) + ... +CPU B: spin_lock_irqsave(&dev_lock, flags) +CPU B: val = readl(my_status); +CPU B: ... +CPU B: writel(newval2, ring_ptr); +CPU B: spin_unlock_irqrestore(&dev_lock, flags) + ... + +In the case above, the device may receive newval2 before it receives newval, +which could cause problems. Fixing it is easy enough though: + + ... +CPU A: spin_lock_irqsave(&dev_lock, flags) +CPU A: val = readl(my_status); +CPU A: ... +CPU A: writel(newval, ring_ptr); +CPU A: (void)readl(safe_register); /* maybe a config register? */ +CPU A: spin_unlock_irqrestore(&dev_lock, flags) + ... +CPU B: spin_lock_irqsave(&dev_lock, flags) +CPU B: val = readl(my_status); +CPU B: ... +CPU B: writel(newval2, ring_ptr); +CPU B: (void)readl(safe_register); /* maybe a config register? */ +CPU B: spin_unlock_irqrestore(&dev_lock, flags) + +Here, the reads from safe_register will cause the I/O chipset to flush any +pending writes before actually posting the read to the chipset, preventing +possible data corruption. diff --git a/Documentation/ioctl-number.txt b/Documentation/ioctl-number.txt new file mode 100644 index 000000000000..769f925c8526 --- /dev/null +++ b/Documentation/ioctl-number.txt @@ -0,0 +1,196 @@ +Ioctl Numbers +19 October 1999 +Michael Elizabeth Chastain +<mec@shout.net> + +If you are adding new ioctl's to the kernel, you should use the _IO +macros defined in <linux/ioctl.h>: + + _IO an ioctl with no parameters + _IOW an ioctl with write parameters (copy_from_user) + _IOR an ioctl with read parameters (copy_to_user) + _IOWR an ioctl with both write and read parameters. + +'Write' and 'read' are from the user's point of view, just like the +system calls 'write' and 'read'. For example, a SET_FOO ioctl would +be _IOW, although the kernel would actually read data from user space; +a GET_FOO ioctl would be _IOR, although the kernel would actually write +data to user space. + +The first argument to _IO, _IOW, _IOR, or _IOWR is an identifying letter +or number from the table below. Because of the large number of drivers, +many drivers share a partial letter with other drivers. + +If you are writing a driver for a new device and need a letter, pick an +unused block with enough room for expansion: 32 to 256 ioctl commands. +You can register the block by patching this file and submitting the +patch to Linus Torvalds. Or you can e-mail me at <mec@shout.net> and +I'll register one for you. + +The second argument to _IO, _IOW, _IOR, or _IOWR is a sequence number +to distinguish ioctls from each other. The third argument to _IOW, +_IOR, or _IOWR is the type of the data going into the kernel or coming +out of the kernel (e.g. 'int' or 'struct foo'). NOTE! Do NOT use +sizeof(arg) as the third argument as this results in your ioctl thinking +it passes an argument of type size_t. + +Some devices use their major number as the identifier; this is OK, as +long as it is unique. Some devices are irregular and don't follow any +convention at all. + +Following this convention is good because: + +(1) Keeping the ioctl's globally unique helps error checking: + if a program calls an ioctl on the wrong device, it will get an + error rather than some unexpected behaviour. + +(2) The 'strace' build procedure automatically finds ioctl numbers + defined with _IO, _IOW, _IOR, or _IOWR. + +(3) 'strace' can decode numbers back into useful names when the + numbers are unique. + +(4) People looking for ioctls can grep for them more easily when + this convention is used to define the ioctl numbers. + +(5) When following the convention, the driver code can use generic + code to copy the parameters between user and kernel space. + +This table lists ioctls visible from user land for Linux/i386. It contains +most drivers up to 2.3.14, but I know I am missing some. + +Code Seq# Include File Comments +======================================================== +0x00 00-1F linux/fs.h conflict! +0x00 00-1F scsi/scsi_ioctl.h conflict! +0x00 00-1F linux/fb.h conflict! +0x00 00-1F linux/wavefront.h conflict! +0x02 all linux/fd.h +0x03 all linux/hdreg.h +0x04 all linux/umsdos_fs.h +0x06 all linux/lp.h +0x09 all linux/md.h +0x12 all linux/fs.h + linux/blkpg.h +0x1b all InfiniBand Subsystem <http://www.openib.org/> +0x20 all drivers/cdrom/cm206.h +0x22 all scsi/sg.h +'#' 00-3F IEEE 1394 Subsystem Block for the entire subsystem +'1' 00-1F <linux/timepps.h> PPS kit from Ulrich Windl + <ftp://ftp.de.kernel.org/pub/linux/daemons/ntp/PPS/> +'6' 00-10 <asm-i386/processor.h> Intel IA32 microcode update driver + <mailto:tigran@veritas.com> +'8' all SNP8023 advanced NIC card + <mailto:mcr@solidum.com> +'A' 00-1F linux/apm_bios.h +'B' C0-FF advanced bbus + <mailto:maassen@uni-freiburg.de> +'C' all linux/soundcard.h +'D' all asm-s390/dasd.h +'F' all linux/fb.h +'I' all linux/isdn.h +'J' 00-1F drivers/scsi/gdth_ioctl.h +'K' all linux/kd.h +'L' 00-1F linux/loop.h +'L' E0-FF linux/ppdd.h encrypted disk device driver + <http://linux01.gwdg.de/~alatham/ppdd.html> +'M' all linux/soundcard.h conflict! +'M' 00-1F linux/isicom.h conflict! +'N' 00-1F drivers/usb/scanner.h +'P' all linux/soundcard.h +'Q' all linux/soundcard.h +'R' 00-1F linux/random.h +'S' all linux/cdrom.h conflict! +'S' 80-81 scsi/scsi_ioctl.h conflict! +'S' 82-FF scsi/scsi.h conflict! +'T' all linux/soundcard.h conflict! +'T' all asm-i386/ioctls.h conflict! +'U' 00-EF linux/drivers/usb/usb.h +'U' F0-FF drivers/usb/auerswald.c +'V' all linux/vt.h +'W' 00-1F linux/watchdog.h conflict! +'W' 00-1F linux/wanrouter.h conflict! +'X' all linux/xfs_fs.h +'Y' all linux/cyclades.h +'a' all ATM on linux + <http://lrcwww.epfl.ch/linux-atm/magic.html> +'b' 00-FF bit3 vme host bridge + <mailto:natalia@nikhefk.nikhef.nl> +'c' 00-7F linux/comstats.h conflict! +'c' 00-7F linux/coda.h conflict! +'d' 00-FF linux/char/drm/drm/h conflict! +'d' 00-1F linux/devfs_fs.h conflict! +'d' 00-DF linux/video_decoder.h conflict! +'d' F0-FF linux/digi1.h +'e' all linux/digi1.h conflict! +'e' 00-1F linux/video_encoder.h conflict! +'e' 00-1F net/irda/irtty.h conflict! +'f' 00-1F linux/ext2_fs.h +'h' 00-7F Charon filesystem + <mailto:zapman@interlan.net> +'i' 00-3F linux/i2o.h +'j' 00-3F linux/joystick.h +'k' all asm-sparc/kbio.h + asm-sparc64/kbio.h +'l' 00-3F linux/tcfs_fs.h transparent cryptographic file system + <http://mikonos.dia.unisa.it/tcfs> +'l' 40-7F linux/udf_fs_i.h in development: + <http://www.trylinux.com/projects/udf/> +'m' all linux/mtio.h conflict! +'m' all linux/soundcard.h conflict! +'m' all linux/synclink.h conflict! +'m' 00-1F net/irda/irmod.h conflict! +'n' 00-7F linux/ncp_fs.h +'n' E0-FF video/matrox.h matroxfb +'p' 00-3F linux/mc146818rtc.h +'p' 40-7F linux/nvram.h +'p' 80-9F user-space parport + <mailto:tim@cyberelk.net> +'q' 00-1F linux/serio.h +'q' 80-FF Internet PhoneJACK, Internet LineJACK + <http://www.quicknet.net> +'r' 00-1F linux/msdos_fs.h +'s' all linux/cdk.h +'t' 00-7F linux/if_ppp.h +'t' 80-8F linux/isdn_ppp.h +'u' 00-1F linux/smb_fs.h +'v' 00-1F linux/ext2_fs.h conflict! +'v' all linux/videodev.h conflict! +'w' all CERN SCI driver +'y' 00-1F packet based user level communications + <mailto:zapman@interlan.net> +'z' 00-3F CAN bus card + <mailto:hdstich@connectu.ulm.circular.de> +'z' 40-7F CAN bus card + <mailto:oe@port.de> +0x80 00-1F linux/fb.h +0x81 00-1F linux/videotext.h +0x89 00-06 asm-i386/sockios.h +0x89 0B-DF linux/sockios.h +0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range +0x89 F0-FF linux/sockios.h SIOCDEVPRIVATE range +0x8B all linux/wireless.h +0x8C 00-3F WiNRADiO driver + <http://www.proximity.com.au/~brian/winradio/> +0x90 00 drivers/cdrom/sbpcd.h +0x93 60-7F linux/auto_fs.h +0x99 00-0F 537-Addinboard driver + <mailto:buk@buks.ipn.de> +0xA0 all linux/sdp/sdp.h Industrial Device Project + <mailto:kenji@bitgate.com> +0xA3 80-8F Port ACL in development: + <mailto:tlewis@mindspring.com> +0xA3 90-9F linux/dtlk.h +0xAB 00-1F linux/nbd.h +0xAC 00-1F linux/raw.h +0xAD 00 Netfilter device in development: + <mailto:rusty@rustcorp.com.au> +0xB0 all RATIO devices in development: + <mailto:vgo@ratio.de> +0xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca> +0xCB 00-1F CBM serial IEC bus in development: + <mailto:michael.klein@puffin.lb.shuttle.de> +0xDD 00-3F ZFCP device driver see drivers/s390/scsi/ + <mailto:aherrman@de.ibm.com> +0xF3 00-3F video/sisfb.h sisfb (in development) + <mailto:thomas@winischhofer.net> diff --git a/Documentation/ioctl/cdrom.txt b/Documentation/ioctl/cdrom.txt new file mode 100644 index 000000000000..4ccdcc6fe364 --- /dev/null +++ b/Documentation/ioctl/cdrom.txt @@ -0,0 +1,966 @@ + Summary of CDROM ioctl calls. + ============================ + + Edward A. Falk <efalk@google.com> + + November, 2004 + +This document attempts to describe the ioctl(2) calls supported by +the CDROM layer. These are by-and-large implemented (as of Linux 2.6) +in drivers/cdrom/cdrom.c and drivers/block/scsi_ioctl.c + +ioctl values are listed in <linux/cdrom.h>. As of this writing, they +are as follows: + + CDROMPAUSE Pause Audio Operation + CDROMRESUME Resume paused Audio Operation + CDROMPLAYMSF Play Audio MSF (struct cdrom_msf) + CDROMPLAYTRKIND Play Audio Track/index (struct cdrom_ti) + CDROMREADTOCHDR Read TOC header (struct cdrom_tochdr) + CDROMREADTOCENTRY Read TOC entry (struct cdrom_tocentry) + CDROMSTOP Stop the cdrom drive + CDROMSTART Start the cdrom drive + CDROMEJECT Ejects the cdrom media + CDROMVOLCTRL Control output volume (struct cdrom_volctrl) + CDROMSUBCHNL Read subchannel data (struct cdrom_subchnl) + CDROMREADMODE2 Read CDROM mode 2 data (2336 Bytes) + (struct cdrom_read) + CDROMREADMODE1 Read CDROM mode 1 data (2048 Bytes) + (struct cdrom_read) + CDROMREADAUDIO (struct cdrom_read_audio) + CDROMEJECT_SW enable(1)/disable(0) auto-ejecting + CDROMMULTISESSION Obtain the start-of-last-session + address of multi session disks + (struct cdrom_multisession) + CDROM_GET_MCN Obtain the "Universal Product Code" + if available (struct cdrom_mcn) + CDROM_GET_UPC Deprecated, use CDROM_GET_MCN instead. + CDROMRESET hard-reset the drive + CDROMVOLREAD Get the drive's volume setting + (struct cdrom_volctrl) + CDROMREADRAW read data in raw mode (2352 Bytes) + (struct cdrom_read) + CDROMREADCOOKED read data in cooked mode + CDROMSEEK seek msf address + CDROMPLAYBLK scsi-cd only, (struct cdrom_blk) + CDROMREADALL read all 2646 bytes + CDROMGETSPINDOWN return 4-bit spindown value + CDROMSETSPINDOWN set 4-bit spindown value + CDROMCLOSETRAY pendant of CDROMEJECT + CDROM_SET_OPTIONS Set behavior options + CDROM_CLEAR_OPTIONS Clear behavior options + CDROM_SELECT_SPEED Set the CD-ROM speed + CDROM_SELECT_DISC Select disc (for juke-boxes) + CDROM_MEDIA_CHANGED Check is media changed + CDROM_DRIVE_STATUS Get tray position, etc. + CDROM_DISC_STATUS Get disc type, etc. + CDROM_CHANGER_NSLOTS Get number of slots + CDROM_LOCKDOOR lock or unlock door + CDROM_DEBUG Turn debug messages on/off + CDROM_GET_CAPABILITY get capabilities + CDROMAUDIOBUFSIZ set the audio buffer size + DVD_READ_STRUCT Read structure + DVD_WRITE_STRUCT Write structure + DVD_AUTH Authentication + CDROM_SEND_PACKET send a packet to the drive + CDROM_NEXT_WRITABLE get next writable block + CDROM_LAST_WRITTEN get last block written on disc + + +The information that follows was determined from reading kernel source +code. It is likely that some corrections will be made over time. + + + + + + + +General: + + Unless otherwise specified, all ioctl calls return 0 on success + and -1 with errno set to an appropriate value on error. (Some + ioctls return non-negative data values.) + + Unless otherwise specified, all ioctl calls return -1 and set + errno to EFAULT on a failed attempt to copy data to or from user + address space. + + Individual drivers may return error codes not listed here. + + Unless otherwise specified, all data structures and constants + are defined in <linux/cdrom.h> + + + + +CDROMPAUSE Pause Audio Operation + + usage: + + ioctl(fd, CDROMPAUSE, 0); + + inputs: none + + outputs: none + + error return: + ENOSYS cd drive not audio-capable. + + +CDROMRESUME Resume paused Audio Operation + + usage: + + ioctl(fd, CDROMRESUME, 0); + + inputs: none + + outputs: none + + error return: + ENOSYS cd drive not audio-capable. + + +CDROMPLAYMSF Play Audio MSF (struct cdrom_msf) + + usage: + + struct cdrom_msf msf; + ioctl(fd, CDROMPLAYMSF, &msf); + + inputs: + cdrom_msf structure, describing a segment of music to play + + outputs: none + + error return: + ENOSYS cd drive not audio-capable. + + notes: + MSF stands for minutes-seconds-frames + LBA stands for logical block address + + Segment is described as start and end times, where each time + is described as minutes:seconds:frames. A frame is 1/75 of + a second. + + +CDROMPLAYTRKIND Play Audio Track/index (struct cdrom_ti) + + usage: + + struct cdrom_ti ti; + ioctl(fd, CDROMPLAYTRKIND, &ti); + + inputs: + cdrom_ti structure, describing a segment of music to play + + outputs: none + + error return: + ENOSYS cd drive not audio-capable. + + notes: + Segment is described as start and end times, where each time + is described as a track and an index. + + + +CDROMREADTOCHDR Read TOC header (struct cdrom_tochdr) + + usage: + + cdrom_tochdr header; + ioctl(fd, CDROMREADTOCHDR, &header); + + inputs: + cdrom_tochdr structure + + outputs: + cdrom_tochdr structure + + error return: + ENOSYS cd drive not audio-capable. + + + +CDROMREADTOCENTRY Read TOC entry (struct cdrom_tocentry) + + usage: + + struct cdrom_tocentry entry; + ioctl(fd, CDROMREADTOCENTRY, &entry); + + inputs: + cdrom_tocentry structure + + outputs: + cdrom_tocentry structure + + error return: + ENOSYS cd drive not audio-capable. + EINVAL entry.cdte_format not CDROM_MSF or CDROM_LBA + EINVAL requested track out of bounds + EIO I/O error reading TOC + + notes: + TOC stands for Table Of Contents + MSF stands for minutes-seconds-frames + LBA stands for logical block address + + + +CDROMSTOP Stop the cdrom drive + + usage: + + ioctl(fd, CDROMSTOP, 0); + + inputs: none + + outputs: none + + error return: + ENOSYS cd drive not audio-capable. + + notes: + Exact interpretation of this ioctl depends on the device, + but most seem to spin the drive down. + + +CDROMSTART Start the cdrom drive + + usage: + + ioctl(fd, CDROMSTART, 0); + + inputs: none + + outputs: none + + error return: + ENOSYS cd drive not audio-capable. + + notes: + Exact interpretation of this ioctl depends on the device, + but most seem to spin the drive up and/or close the tray. + Other devices ignore the ioctl completely. + + +CDROMEJECT Ejects the cdrom media + + usage: + + ioctl(fd, CDROMEJECT, 0); + + inputs: none + + outputs: none + + error returns: + ENOSYS cd drive not capable of ejecting + EBUSY other processes are accessing drive, or door is locked + + notes: + See CDROM_LOCKDOOR, below. + + + +CDROMCLOSETRAY pendant of CDROMEJECT + + usage: + + ioctl(fd, CDROMEJECT, 0); + + inputs: none + + outputs: none + + error returns: + ENOSYS cd drive not capable of ejecting + EBUSY other processes are accessing drive, or door is locked + + notes: + See CDROM_LOCKDOOR, below. + + + +CDROMVOLCTRL Control output volume (struct cdrom_volctrl) + + usage: + + struct cdrom_volctrl volume; + ioctl(fd, CDROMVOLCTRL, &volume); + + inputs: + cdrom_volctrl structure containing volumes for up to 4 + channels. + + outputs: none + + error return: + ENOSYS cd drive not audio-capable. + + + +CDROMVOLREAD Get the drive's volume setting + (struct cdrom_volctrl) + + usage: + + struct cdrom_volctrl volume; + ioctl(fd, CDROMVOLREAD, &volume); + + inputs: none + + outputs: + The current volume settings. + + error return: + ENOSYS cd drive not audio-capable. + + + +CDROMSUBCHNL Read subchannel data (struct cdrom_subchnl) + + usage: + + struct cdrom_subchnl q; + ioctl(fd, CDROMSUBCHNL, &q); + + inputs: + cdrom_subchnl structure + + outputs: + cdrom_subchnl structure + + error return: + ENOSYS cd drive not audio-capable. + EINVAL format not CDROM_MSF or CDROM_LBA + + notes: + Format is converted to CDROM_MSF on return + + + +CDROMREADRAW read data in raw mode (2352 Bytes) + (struct cdrom_read) + + usage: + + union { + struct cdrom_msf msf; /* input */ + char buffer[CD_FRAMESIZE_RAW]; /* return */ + } arg; + ioctl(fd, CDROMREADRAW, &arg); + + inputs: + cdrom_msf structure indicating an address to read. + Only the start values are significant. + + outputs: + Data written to address provided by user. + + error return: + EINVAL address less than 0, or msf less than 0:2:0 + ENOMEM out of memory + + notes: + As of 2.6.8.1, comments in <linux/cdrom.h> indicate that this + ioctl accepts a cdrom_read structure, but actual source code + reads a cdrom_msf structure and writes a buffer of data to + the same address. + + MSF values are converted to LBA values via this formula: + + lba = (((m * CD_SECS) + s) * CD_FRAMES + f) - CD_MSF_OFFSET; + + + + +CDROMREADMODE1 Read CDROM mode 1 data (2048 Bytes) + (struct cdrom_read) + + notes: + Identical to CDROMREADRAW except that block size is + CD_FRAMESIZE (2048) bytes + + + +CDROMREADMODE2 Read CDROM mode 2 data (2336 Bytes) + (struct cdrom_read) + + notes: + Identical to CDROMREADRAW except that block size is + CD_FRAMESIZE_RAW0 (2336) bytes + + + +CDROMREADAUDIO (struct cdrom_read_audio) + + usage: + + struct cdrom_read_audio ra; + ioctl(fd, CDROMREADAUDIO, &ra); + + inputs: + cdrom_read_audio structure containing read start + point and length + + outputs: + audio data, returned to buffer indicated by ra + + error return: + EINVAL format not CDROM_MSF or CDROM_LBA + EINVAL nframes not in range [1 75] + ENXIO drive has no queue (probably means invalid fd) + ENOMEM out of memory + + +CDROMEJECT_SW enable(1)/disable(0) auto-ejecting + + usage: + + int val; + ioctl(fd, CDROMEJECT_SW, val); + + inputs: + Flag specifying auto-eject flag. + + outputs: none + + error return: + ENOSYS Drive is not capable of ejecting. + EBUSY Door is locked + + + + +CDROMMULTISESSION Obtain the start-of-last-session + address of multi session disks + (struct cdrom_multisession) + usage: + + struct cdrom_multisession ms_info; + ioctl(fd, CDROMMULTISESSION, &ms_info); + + inputs: + cdrom_multisession structure containing desired + format. + + outputs: + cdrom_multisession structure is filled with last_session + information. + + error return: + EINVAL format not CDROM_MSF or CDROM_LBA + + +CDROM_GET_MCN Obtain the "Universal Product Code" + if available (struct cdrom_mcn) + + usage: + + struct cdrom_mcn mcn; + ioctl(fd, CDROM_GET_MCN, &mcn); + + inputs: none + + outputs: + Universal Product Code + + error return: + ENOSYS Drive is not capable of reading MCN data. + + notes: + Source code comments state: + + The following function is implemented, although very few + audio discs give Universal Product Code information, which + should just be the Medium Catalog Number on the box. Note, + that the way the code is written on the CD is /not/ uniform + across all discs! + + + + +CDROM_GET_UPC CDROM_GET_MCN (deprecated) + + Not implemented, as of 2.6.8.1 + + + +CDROMRESET hard-reset the drive + + usage: + + ioctl(fd, CDROMRESET, 0); + + inputs: none + + outputs: none + + error return: + EACCES Access denied: requires CAP_SYS_ADMIN + ENOSYS Drive is not capable of resetting. + + + + +CDROMREADCOOKED read data in cooked mode + + usage: + + u8 buffer[CD_FRAMESIZE] + ioctl(fd, CDROMREADCOOKED, buffer); + + inputs: none + + outputs: + 2048 bytes of data, "cooked" mode. + + notes: + Not implemented on all drives. + + + + +CDROMREADALL read all 2646 bytes + + Same as CDROMREADCOOKED, but reads 2646 bytes. + + + +CDROMSEEK seek msf address + + usage: + + struct cdrom_msf msf; + ioctl(fd, CDROMSEEK, &msf); + + inputs: + MSF address to seek to. + + outputs: none + + + +CDROMPLAYBLK scsi-cd only, (struct cdrom_blk) + + usage: + + struct cdrom_blk blk; + ioctl(fd, CDROMPLAYBLK, &blk); + + inputs: + Region to play + + outputs: none + + + +CDROMGETSPINDOWN + + usage: + + char spindown; + ioctl(fd, CDROMGETSPINDOWN, &spindown); + + inputs: none + + outputs: + The value of the current 4-bit spindown value. + + + + +CDROMSETSPINDOWN + + usage: + + char spindown + ioctl(fd, CDROMSETSPINDOWN, &spindown); + + inputs: + 4-bit value used to control spindown (TODO: more detail here) + + outputs: none + + + + + +CDROM_SET_OPTIONS Set behavior options + + usage: + + int options; + ioctl(fd, CDROM_SET_OPTIONS, options); + + inputs: + New values for drive options. The logical 'or' of: + CDO_AUTO_CLOSE close tray on first open(2) + CDO_AUTO_EJECT open tray on last release + CDO_USE_FFLAGS use O_NONBLOCK information on open + CDO_LOCK lock tray on open files + CDO_CHECK_TYPE check type on open for data + + outputs: + Returns the resulting options settings in the + ioctl return value. Returns -1 on error. + + error return: + ENOSYS selected option(s) not supported by drive. + + + + +CDROM_CLEAR_OPTIONS Clear behavior options + + Same as CDROM_SET_OPTIONS, except that selected options are + turned off. + + + +CDROM_SELECT_SPEED Set the CD-ROM speed + + usage: + + int speed; + ioctl(fd, CDROM_SELECT_SPEED, speed); + + inputs: + New drive speed. + + outputs: none + + error return: + ENOSYS speed selection not supported by drive. + + + +CDROM_SELECT_DISC Select disc (for juke-boxes) + + usage: + + int disk; + ioctl(fd, CDROM_SELECT_DISC, disk); + + inputs: + Disk to load into drive. + + outputs: none + + error return: + EINVAL Disk number beyond capacity of drive + + + +CDROM_MEDIA_CHANGED Check is media changed + + usage: + + int slot; + ioctl(fd, CDROM_MEDIA_CHANGED, slot); + + inputs: + Slot number to be tested, always zero except for jukeboxes. + May also be special values CDSL_NONE or CDSL_CURRENT + + outputs: + Ioctl return value is 0 or 1 depending on whether the media + has been changed, or -1 on error. + + error returns: + ENOSYS Drive can't detect media change + EINVAL Slot number beyond capacity of drive + ENOMEM Out of memory + + + +CDROM_DRIVE_STATUS Get tray position, etc. + + usage: + + int slot; + ioctl(fd, CDROM_DRIVE_STATUS, slot); + + inputs: + Slot number to be tested, always zero except for jukeboxes. + May also be special values CDSL_NONE or CDSL_CURRENT + + outputs: + Ioctl return value will be one of the following values + from <linux/cdrom.h>: + + CDS_NO_INFO Information not available. + CDS_NO_DISC + CDS_TRAY_OPEN + CDS_DRIVE_NOT_READY + CDS_DISC_OK + -1 error + + error returns: + ENOSYS Drive can't detect drive status + EINVAL Slot number beyond capacity of drive + ENOMEM Out of memory + + + + +CDROM_DISC_STATUS Get disc type, etc. + + usage: + + ioctl(fd, CDROM_DISC_STATUS, 0); + + inputs: none + + outputs: + Ioctl return value will be one of the following values + from <linux/cdrom.h>: + CDS_NO_INFO + CDS_AUDIO + CDS_MIXED + CDS_XA_2_2 + CDS_XA_2_1 + CDS_DATA_1 + + error returns: none at present + + notes: + Source code comments state: + + Ok, this is where problems start. The current interface for + the CDROM_DISC_STATUS ioctl is flawed. It makes the false + assumption that CDs are all CDS_DATA_1 or all CDS_AUDIO, etc. + Unfortunatly, while this is often the case, it is also + very common for CDs to have some tracks with data, and some + tracks with audio. Just because I feel like it, I declare + the following to be the best way to cope. If the CD has + ANY data tracks on it, it will be returned as a data CD. + If it has any XA tracks, I will return it as that. Now I + could simplify this interface by combining these returns with + the above, but this more clearly demonstrates the problem + with the current interface. Too bad this wasn't designed + to use bitmasks... -Erik + + Well, now we have the option CDS_MIXED: a mixed-type CD. + User level programmers might feel the ioctl is not very + useful. + ---david + + + + +CDROM_CHANGER_NSLOTS Get number of slots + + usage: + + ioctl(fd, CDROM_CHANGER_NSLOTS, 0); + + inputs: none + + outputs: + The ioctl return value will be the number of slots in a + CD changer. Typically 1 for non-multi-disk devices. + + error returns: none + + + +CDROM_LOCKDOOR lock or unlock door + + usage: + + int lock; + ioctl(fd, CDROM_LOCKDOOR, lock); + + inputs: + Door lock flag, 1=lock, 0=unlock + + outputs: none + + error returns: + EDRIVE_CANT_DO_THIS Door lock function not supported. + EBUSY Attempt to unlock when multiple users + have the drive open and not CAP_SYS_ADMIN + + notes: + As of 2.6.8.1, the lock flag is a global lock, meaning that + all CD drives will be locked or unlocked together. This is + probably a bug. + + The EDRIVE_CANT_DO_THIS value is defined in <linux/cdrom.h> + and is currently (2.6.8.1) the same as EOPNOTSUPP + + + +CDROM_DEBUG Turn debug messages on/off + + usage: + + int debug; + ioctl(fd, CDROM_DEBUG, debug); + + inputs: + Cdrom debug flag, 0=disable, 1=enable + + outputs: + The ioctl return value will be the new debug flag. + + error return: + EACCES Access denied: requires CAP_SYS_ADMIN + + + +CDROM_GET_CAPABILITY get capabilities + + usage: + + ioctl(fd, CDROM_GET_CAPABILITY, 0); + + inputs: none + + outputs: + The ioctl return value is the current device capability + flags. See CDC_CLOSE_TRAY, CDC_OPEN_TRAY, etc. + + + +CDROMAUDIOBUFSIZ set the audio buffer size + + usage: + + int arg; + ioctl(fd, CDROMAUDIOBUFSIZ, val); + + inputs: + New audio buffer size + + outputs: + The ioctl return value is the new audio buffer size, or -1 + on error. + + error return: + ENOSYS Not supported by this driver. + + notes: + Not supported by all drivers. + + + +DVD_READ_STRUCT Read structure + + usage: + + dvd_struct s; + ioctl(fd, DVD_READ_STRUCT, &s); + + inputs: + dvd_struct structure, containing: + type specifies the information desired, one of + DVD_STRUCT_PHYSICAL, DVD_STRUCT_COPYRIGHT, + DVD_STRUCT_DISCKEY, DVD_STRUCT_BCA, + DVD_STRUCT_MANUFACT + physical.layer_num desired layer, indexed from 0 + copyright.layer_num desired layer, indexed from 0 + disckey.agid + + outputs: + dvd_struct structure, containing: + physical for type == DVD_STRUCT_PHYSICAL + copyright for type == DVD_STRUCT_COPYRIGHT + disckey.value for type == DVD_STRUCT_DISCKEY + bca.{len,value} for type == DVD_STRUCT_BCA + manufact.{len,valu} for type == DVD_STRUCT_MANUFACT + + error returns: + EINVAL physical.layer_num exceeds number of layers + EIO Recieved invalid response from drive + + + +DVD_WRITE_STRUCT Write structure + + Not implemented, as of 2.6.8.1 + + + +DVD_AUTH Authentication + + usage: + + dvd_authinfo ai; + ioctl(fd, DVD_AUTH, &ai); + + inputs: + dvd_authinfo structure. See <linux/cdrom.h> + + outputs: + dvd_authinfo structure. + + error return: + ENOTTY ai.type not recognized. + + + +CDROM_SEND_PACKET send a packet to the drive + + usage: + + struct cdrom_generic_command cgc; + ioctl(fd, CDROM_SEND_PACKET, &cgc); + + inputs: + cdrom_generic_command structure containing the packet to send. + + outputs: none + cdrom_generic_command structure containing results. + + error return: + EIO command failed. + EPERM Operation not permitted, either because a + write command was attempted on a drive which + is opened read-only, or because the command + requires CAP_SYS_RAWIO + EINVAL cgc.data_direction not set + + + +CDROM_NEXT_WRITABLE get next writable block + + usage: + + long next; + ioctl(fd, CDROM_NEXT_WRITABLE, &next); + + inputs: none + + outputs: + The next writable block. + + notes: + If the device does not support this ioctl directly, the + ioctl will return CDROM_LAST_WRITTEN + 7. + + + +CDROM_LAST_WRITTEN get last block written on disc + + usage: + + long last; + ioctl(fd, CDROM_LAST_WRITTEN, &last); + + inputs: none + + outputs: + The last block written on disc + + notes: + If the device does not support this ioctl directly, the + result is derived from the disc's table of contents. If the + table of contents can't be read, this ioctl returns an + error. diff --git a/Documentation/ioctl/hdio.txt b/Documentation/ioctl/hdio.txt new file mode 100644 index 000000000000..9a7aea0636a5 --- /dev/null +++ b/Documentation/ioctl/hdio.txt @@ -0,0 +1,1070 @@ + Summary of HDIO_ ioctl calls. + ============================ + + Edward A. Falk <efalk@google.com> + + November, 2004 + +This document attempts to describe the ioctl(2) calls supported by +the HD/IDE layer. These are by-and-large implemented (as of Linux 2.6) +in drivers/ide/ide.c and drivers/block/scsi_ioctl.c + +ioctl values are listed in <linux/hdreg.h>. As of this writing, they +are as follows: + + ioctls that pass argument pointers to user space: + + HDIO_GETGEO get device geometry + HDIO_GET_UNMASKINTR get current unmask setting + HDIO_GET_MULTCOUNT get current IDE blockmode setting + HDIO_GET_QDMA get use-qdma flag + HDIO_SET_XFER set transfer rate via proc + HDIO_OBSOLETE_IDENTITY OBSOLETE, DO NOT USE + HDIO_GET_KEEPSETTINGS get keep-settings-on-reset flag + HDIO_GET_32BIT get current io_32bit setting + HDIO_GET_NOWERR get ignore-write-error flag + HDIO_GET_DMA get use-dma flag + HDIO_GET_NICE get nice flags + HDIO_GET_IDENTITY get IDE identification info + HDIO_GET_WCACHE get write cache mode on|off + HDIO_GET_ACOUSTIC get acoustic value + HDIO_GET_ADDRESS get sector addressing mode + HDIO_GET_BUSSTATE get the bus state of the hwif + HDIO_TRISTATE_HWIF execute a channel tristate + HDIO_DRIVE_RESET execute a device reset + HDIO_DRIVE_TASKFILE execute raw taskfile + HDIO_DRIVE_TASK execute task and special drive command + HDIO_DRIVE_CMD execute a special drive command + HDIO_DRIVE_CMD_AEB HDIO_DRIVE_TASK + + ioctls that pass non-pointer values: + + HDIO_SET_MULTCOUNT change IDE blockmode + HDIO_SET_UNMASKINTR permit other irqs during I/O + HDIO_SET_KEEPSETTINGS keep ioctl settings on reset + HDIO_SET_32BIT change io_32bit flags + HDIO_SET_NOWERR change ignore-write-error flag + HDIO_SET_DMA change use-dma flag + HDIO_SET_PIO_MODE reconfig interface to new speed + HDIO_SCAN_HWIF register and (re)scan interface + HDIO_SET_NICE set nice flags + HDIO_UNREGISTER_HWIF unregister interface + HDIO_SET_WCACHE change write cache enable-disable + HDIO_SET_ACOUSTIC change acoustic behavior + HDIO_SET_BUSSTATE set the bus state of the hwif + HDIO_SET_QDMA change use-qdma flag + HDIO_SET_ADDRESS change lba addressing modes + + HDIO_SET_IDE_SCSI Set scsi emulation mode on/off + HDIO_SET_SCSI_IDE not implemented yet + + +The information that follows was determined from reading kernel source +code. It is likely that some corrections will be made over time. + + + + + + + +General: + + Unless otherwise specified, all ioctl calls return 0 on success + and -1 with errno set to an appropriate value on error. + + Unless otherwise specified, all ioctl calls return -1 and set + errno to EFAULT on a failed attempt to copy data to or from user + address space. + + Unless otherwise specified, all data structures and constants + are defined in <linux/hdreg.h> + + + +HDIO_GETGEO get device geometry + + usage: + + struct hd_geometry geom; + ioctl(fd, HDIO_GETGEO, &geom); + + + inputs: none + + outputs: + + hd_geometry structure containing: + + heads number of heads + sectors number of sectors/track + cylinders number of cylinders, mod 65536 + start starting sector of this partition. + + + error returns: + EINVAL if the device is not a disk drive or floppy drive, + or if the user passes a null pointer + + + notes: + + Not particularly useful with modern disk drives, whose geometry + is a polite fiction anyway. Modern drives are addressed + purely by sector number nowadays (lba addressing), and the + drive geometry is an abstraction which is actually subject + to change. Currently (as of Nov 2004), the geometry values + are the "bios" values -- presumably the values the drive had + when Linux first booted. + + In addition, the cylinders field of the hd_geometry is an + unsigned short, meaning that on most architectures, this + ioctl will not return a meaningful value on drives with more + than 65535 tracks. + + The start field is unsigned long, meaning that it will not + contain a meaningful value for disks over 219 Gb in size. + + + + +HDIO_GET_UNMASKINTR get current unmask setting + + usage: + + long val; + ioctl(fd, HDIO_GET_UNMASKINTR, &val); + + inputs: none + + outputs: + The value of the drive's current unmask setting + + + +HDIO_SET_UNMASKINTR permit other irqs during I/O + + usage: + + unsigned long val; + ioctl(fd, HDIO_SET_UNMASKINTR, val); + + inputs: + New value for unmask flag + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 1] + EBUSY Controller busy + + + + +HDIO_GET_MULTCOUNT get current IDE blockmode setting + + usage: + + long val; + ioctl(fd, HDIO_GET_MULTCOUNT, &val); + + inputs: none + + outputs: + The value of the current IDE block mode setting. This + controls how many sectors the drive will transfer per + interrupt. + + + +HDIO_SET_MULTCOUNT change IDE blockmode + + usage: + + int val; + ioctl(fd, HDIO_SET_MULTCOUNT, val); + + inputs: + New value for IDE block mode setting. This controls how many + sectors the drive will transfer per interrupt. + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range supported by disk. + EBUSY Controller busy or blockmode already set. + EIO Drive did not accept new block mode. + + notes: + + Source code comments read: + + This is tightly woven into the driver->do_special can not + touch. DON'T do it again until a total personality rewrite + is committed. + + If blockmode has already been set, this ioctl will fail with + EBUSY + + + +HDIO_GET_QDMA get use-qdma flag + + Not implemented, as of 2.6.8.1 + + + +HDIO_SET_XFER set transfer rate via proc + + Not implemented, as of 2.6.8.1 + + + +HDIO_OBSOLETE_IDENTITY OBSOLETE, DO NOT USE + + Same as HDIO_GET_IDENTITY (see below), except that it only + returns the first 142 bytes of drive identity information. + + + +HDIO_GET_IDENTITY get IDE identification info + + usage: + + unsigned char identity[512]; + ioctl(fd, HDIO_GET_IDENTITY, identity); + + inputs: none + + outputs: + + ATA drive identity information. For full description, see + the IDENTIFY DEVICE and IDENTIFY PACKET DEVICE commands in + the ATA specification. + + error returns: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + ENOMSG IDENTIFY DEVICE information not available + + notes: + + Returns information that was obtained when the drive was + probed. Some of this information is subject to change, and + this ioctl does not re-probe the drive to update the + information. + + This information is also available from /proc/ide/hdX/identify + + + +HDIO_GET_KEEPSETTINGS get keep-settings-on-reset flag + + usage: + + long val; + ioctl(fd, HDIO_GET_KEEPSETTINGS, &val); + + inputs: none + + outputs: + The value of the current "keep settings" flag + + notes: + + When set, indicates that kernel should restore settings + after a drive reset. + + + +HDIO_SET_KEEPSETTINGS keep ioctl settings on reset + + usage: + + long val; + ioctl(fd, HDIO_SET_KEEPSETTINGS, val); + + inputs: + New value for keep_settings flag + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 1] + EBUSY Controller busy + + + +HDIO_GET_32BIT get current io_32bit setting + + usage: + + long val; + ioctl(fd, HDIO_GET_32BIT, &val); + + inputs: none + + outputs: + The value of the current io_32bit setting + + notes: + + 0=16-bit, 1=32-bit, 2,3 = 32bit+sync + + + +HDIO_GET_NOWERR get ignore-write-error flag + + usage: + + long val; + ioctl(fd, HDIO_GET_NOWERR, &val); + + inputs: none + + outputs: + The value of the current ignore-write-error flag + + + +HDIO_GET_DMA get use-dma flag + + usage: + + long val; + ioctl(fd, HDIO_GET_DMA, &val); + + inputs: none + + outputs: + The value of the current use-dma flag + + + +HDIO_GET_NICE get nice flags + + usage: + + long nice; + ioctl(fd, HDIO_GET_NICE, &nice); + + inputs: none + + outputs: + + The drive's "nice" values. + + notes: + + Per-drive flags which determine when the system will give more + bandwidth to other devices sharing the same IDE bus. + See <linux/hdreg.h>, near symbol IDE_NICE_DSC_OVERLAP. + + + + +HDIO_SET_NICE set nice flags + + usage: + + unsigned long nice; + ... + ioctl(fd, HDIO_SET_NICE, nice); + + inputs: + bitmask of nice flags. + + outputs: none + + error returns: + EACCES Access denied: requires CAP_SYS_ADMIN + EPERM Flags other than DSC_OVERLAP and NICE_1 set. + EPERM DSC_OVERLAP specified but not supported by drive + + notes: + + This ioctl sets the DSC_OVERLAP and NICE_1 flags from values + provided by the user. + + Nice flags are listed in <linux/hdreg.h>, starting with + IDE_NICE_DSC_OVERLAP. These values represent shifts. + + + + + +HDIO_GET_WCACHE get write cache mode on|off + + usage: + + long val; + ioctl(fd, HDIO_GET_WCACHE, &val); + + inputs: none + + outputs: + The value of the current write cache mode + + + +HDIO_GET_ACOUSTIC get acoustic value + + usage: + + long val; + ioctl(fd, HDIO_GET_ACOUSTIC, &val); + + inputs: none + + outputs: + The value of the current acoustic settings + + notes: + + See HDIO_SET_ACOUSTIC + + + +HDIO_GET_ADDRESS + + usage: + + long val; + ioctl(fd, HDIO_GET_ADDRESS, &val); + + inputs: none + + outputs: + The value of the current addressing mode: + 0 = 28-bit + 1 = 48-bit + 2 = 48-bit doing 28-bit + 3 = 64-bit + + + +HDIO_GET_BUSSTATE get the bus state of the hwif + + usage: + + long state; + ioctl(fd, HDIO_SCAN_HWIF, &state); + + inputs: none + + outputs: + Current power state of the IDE bus. One of BUSSTATE_OFF, + BUSSTATE_ON, or BUSSTATE_TRISTATE + + error returns: + EACCES Access denied: requires CAP_SYS_ADMIN + + + + +HDIO_SET_BUSSTATE set the bus state of the hwif + + usage: + + int state; + ... + ioctl(fd, HDIO_SCAN_HWIF, state); + + inputs: + Desired IDE power state. One of BUSSTATE_OFF, BUSSTATE_ON, + or BUSSTATE_TRISTATE + + outputs: none + + error returns: + EACCES Access denied: requires CAP_SYS_RAWIO + EOPNOTSUPP Hardware interface does not support bus power control + + + + +HDIO_TRISTATE_HWIF execute a channel tristate + + Not implemented, as of 2.6.8.1. See HDIO_SET_BUSSTATE + + + +HDIO_DRIVE_RESET execute a device reset + + usage: + + int args[3] + ... + ioctl(fd, HDIO_DRIVE_RESET, args); + + inputs: none + + outputs: none + + error returns: + EACCES Access denied: requires CAP_SYS_ADMIN + + notes: + + Abort any current command, prevent anything else from being + queued, execute a reset on the device, and issue BLKRRPART + ioctl on the block device. + + Executes an ATAPI soft reset if applicable, otherwise + executes an ATA soft reset on the controller. + + + +HDIO_DRIVE_TASKFILE execute raw taskfile + + Note: If you don't have a copy of the ANSI ATA specification + handy, you should probably ignore this ioctl. + + Execute an ATA disk command directly by writing the "taskfile" + registers of the drive. Requires ADMIN and RAWIO access + privileges. + + usage: + + struct { + ide_task_request_t req_task; + u8 outbuf[OUTPUT_SIZE]; + u8 inbuf[INPUT_SIZE]; + } task; + memset(&task.req_task, 0, sizeof(task.req_task)); + task.req_task.out_size = sizeof(task.outbuf); + task.req_task.in_size = sizeof(task.inbuf); + ... + ioctl(fd, HDIO_DRIVE_TASKFILE, &task); + ... + + inputs: + + (See below for details on memory area passed to ioctl.) + + io_ports[8] values to be written to taskfile registers + hob_ports[8] high-order bytes, for extended commands. + out_flags flags indicating which registers are valid + in_flags flags indicating which registers should be returned + data_phase see below + req_cmd command type to be executed + out_size size of output buffer + outbuf buffer of data to be transmitted to disk + inbuf buffer of data to be received from disk (see [1]) + + outputs: + + io_ports[] values returned in the taskfile registers + hob_ports[] high-order bytes, for extended commands. + out_flags flags indicating which registers are valid (see [2]) + in_flags flags indicating which registers should be returned + outbuf buffer of data to be transmitted to disk (see [1]) + inbuf buffer of data to be received from disk + + error returns: + EACCES CAP_SYS_ADMIN or CAP_SYS_RAWIO privilege not set. + ENOMSG Device is not a disk drive. + ENOMEM Unable to allocate memory for task + EFAULT req_cmd == TASKFILE_IN_OUT (not implemented as of 2.6.8) + EPERM req_cmd == TASKFILE_MULTI_OUT and drive + multi-count not yet set. + EIO Drive failed the command. + + notes: + + [1] READ THE FOLLOWING NOTES *CAREFULLY*. THIS IOCTL IS + FULL OF GOTCHAS. Extreme caution should be used with using + this ioctl. A mistake can easily corrupt data or hang the + system. + + [2] Both the input and output buffers are copied from the + user and written back to the user, even when not used. + + [3] If one or more bits are set in out_flags and in_flags is + zero, the following values are used for in_flags.all and + written back into in_flags on completion. + + * IDE_TASKFILE_STD_IN_FLAGS | (IDE_HOB_STD_IN_FLAGS << 8) + if LBA48 addressing is enabled for the drive + * IDE_TASKFILE_STD_IN_FLAGS + if CHS/LBA28 + + The association between in_flags.all and each enable + bitfield flips depending on endianess; fortunately, TASKFILE + only uses inflags.b.data bit and ignores all other bits. + The end result is that, on any endian machines, it has no + effect other than modifying in_flags on completion. + + [4] The default value of SELECT is (0xa0|DEV_bit|LBA_bit) + except for four drives per port chipsets. For four drives + per port chipsets, it's (0xa0|DEV_bit|LBA_bit) for the first + pair and (0x80|DEV_bit|LBA_bit) for the second pair. + + [5] The argument to the ioctl is a pointer to a region of + memory containing a ide_task_request_t structure, followed + by an optional buffer of data to be transmitted to the + drive, followed by an optional buffer to receive data from + the drive. + + Command is passed to the disk drive via the ide_task_request_t + structure, which contains these fields: + + io_ports[8] values for the taskfile registers + hob_ports[8] high-order bytes, for extended commands + out_flags flags indicating which entries in the + io_ports[] and hob_ports[] arrays + contain valid values. Type ide_reg_valid_t. + in_flags flags indicating which entries in the + io_ports[] and hob_ports[] arrays + are expected to contain valid values + on return. + data_phase See below + req_cmd Command type, see below + out_size output (user->drive) buffer size, bytes + in_size input (drive->user) buffer size, bytes + + When out_flags is zero, the following registers are loaded. + + HOB_FEATURE If the drive supports LBA48 + HOB_NSECTOR If the drive supports LBA48 + HOB_SECTOR If the drive supports LBA48 + HOB_LCYL If the drive supports LBA48 + HOB_HCYL If the drive supports LBA48 + FEATURE + NSECTOR + SECTOR + LCYL + HCYL + SELECT First, masked with 0xE0 if LBA48, 0xEF + otherwise; then, or'ed with the default + value of SELECT. + + If any bit in out_flags is set, the following registers are loaded. + + HOB_DATA If out_flags.b.data is set. HOB_DATA will + travel on DD8-DD15 on little endian machines + and on DD0-DD7 on big endian machines. + DATA If out_flags.b.data is set. DATA will + travel on DD0-DD7 on little endian machines + and on DD8-DD15 on big endian machines. + HOB_NSECTOR If out_flags.b.nsector_hob is set + HOB_SECTOR If out_flags.b.sector_hob is set + HOB_LCYL If out_flags.b.lcyl_hob is set + HOB_HCYL If out_flags.b.hcyl_hob is set + FEATURE If out_flags.b.feature is set + NSECTOR If out_flags.b.nsector is set + SECTOR If out_flags.b.sector is set + LCYL If out_flags.b.lcyl is set + HCYL If out_flags.b.hcyl is set + SELECT Or'ed with the default value of SELECT and + loaded regardless of out_flags.b.select. + + Taskfile registers are read back from the drive into + {io|hob}_ports[] after the command completes iff one of the + following conditions is met; otherwise, the original values + will be written back, unchanged. + + 1. The drive fails the command (EIO). + 2. One or more than one bits are set in out_flags. + 3. The requested data_phase is TASKFILE_NO_DATA. + + HOB_DATA If in_flags.b.data is set. It will contain + DD8-DD15 on little endian machines and + DD0-DD7 on big endian machines. + DATA If in_flags.b.data is set. It will contain + DD0-DD7 on little endian machines and + DD8-DD15 on big endian machines. + HOB_FEATURE If the drive supports LBA48 + HOB_NSECTOR If the drive supports LBA48 + HOB_SECTOR If the drive supports LBA48 + HOB_LCYL If the drive supports LBA48 + HOB_HCYL If the drive supports LBA48 + NSECTOR + SECTOR + LCYL + HCYL + + The data_phase field describes the data transfer to be + performed. Value is one of: + + TASKFILE_IN + TASKFILE_MULTI_IN + TASKFILE_OUT + TASKFILE_MULTI_OUT + TASKFILE_IN_OUT + TASKFILE_IN_DMA + TASKFILE_IN_DMAQ == IN_DMA (queueing not supported) + TASKFILE_OUT_DMA + TASKFILE_OUT_DMAQ == OUT_DMA (queueing not supported) + TASKFILE_P_IN unimplemented + TASKFILE_P_IN_DMA unimplemented + TASKFILE_P_IN_DMAQ unimplemented + TASKFILE_P_OUT unimplemented + TASKFILE_P_OUT_DMA unimplemented + TASKFILE_P_OUT_DMAQ unimplemented + + The req_cmd field classifies the command type. It may be + one of: + + IDE_DRIVE_TASK_NO_DATA + IDE_DRIVE_TASK_SET_XFER unimplemented + IDE_DRIVE_TASK_IN + IDE_DRIVE_TASK_OUT unimplemented + IDE_DRIVE_TASK_RAW_WRITE + + [6] Do not access {in|out}_flags->all except for resetting + all the bits. Always access individual bit fields. ->all + value will flip depending on endianess. For the same + reason, do not use IDE_{TASKFILE|HOB}_STD_{OUT|IN}_FLAGS + constants defined in hdreg.h. + + + +HDIO_DRIVE_CMD execute a special drive command + + Note: If you don't have a copy of the ANSI ATA specification + handy, you should probably ignore this ioctl. + + usage: + + u8 args[4+XFER_SIZE]; + ... + ioctl(fd, HDIO_DRIVE_CMD, args); + + inputs: + + Commands other than WIN_SMART + args[0] COMMAND + args[1] NSECTOR + args[2] FEATURE + args[3] NSECTOR + + WIN_SMART + args[0] COMMAND + args[1] SECTOR + args[2] FEATURE + args[3] NSECTOR + + outputs: + + args[] buffer is filled with register values followed by any + data returned by the disk. + args[0] status + args[1] error + args[2] NSECTOR + args[3] undefined + args[4+] NSECTOR * 512 bytes of data returned by the command. + + error returns: + EACCES Access denied: requires CAP_SYS_RAWIO + ENOMEM Unable to allocate memory for task + EIO Drive reports error + + notes: + + [1] For commands other than WIN_SMART, args[1] should equal + args[3]. SECTOR, LCYL and HCYL are undefined. For + WIN_SMART, 0x4f and 0xc2 are loaded into LCYL and HCYL + respectively. In both cases SELECT will contain the default + value for the drive. Please refer to HDIO_DRIVE_TASKFILE + notes for the default value of SELECT. + + [2] If NSECTOR value is greater than zero and the drive sets + DRQ when interrupting for the command, NSECTOR * 512 bytes + are read from the device into the area following NSECTOR. + In the above example, the area would be + args[4..4+XFER_SIZE]. 16bit PIO is used regardless of + HDIO_SET_32BIT setting. + + [3] If COMMAND == WIN_SETFEATURES && FEATURE == SETFEATURES_XFER + && NSECTOR >= XFER_SW_DMA_0 && the drive supports any DMA + mode, IDE driver will try to tune the transfer mode of the + drive accordingly. + + + +HDIO_DRIVE_TASK execute task and special drive command + + Note: If you don't have a copy of the ANSI ATA specification + handy, you should probably ignore this ioctl. + + usage: + + u8 args[7]; + ... + ioctl(fd, HDIO_DRIVE_TASK, args); + + inputs: + + Taskfile register values: + args[0] COMMAND + args[1] FEATURE + args[2] NSECTOR + args[3] SECTOR + args[4] LCYL + args[5] HCYL + args[6] SELECT + + outputs: + + Taskfile register values: + args[0] status + args[1] error + args[2] NSECTOR + args[3] SECTOR + args[4] LCYL + args[5] HCYL + args[6] SELECT + + error returns: + EACCES Access denied: requires CAP_SYS_RAWIO + ENOMEM Unable to allocate memory for task + ENOMSG Device is not a disk drive. + EIO Drive failed the command. + + notes: + + [1] DEV bit (0x10) of SELECT register is ignored and the + appropriate value for the drive is used. All other bits + are used unaltered. + + + +HDIO_DRIVE_CMD_AEB HDIO_DRIVE_TASK + + Not implemented, as of 2.6.8.1 + + + +HDIO_SET_32BIT change io_32bit flags + + usage: + + int val; + ioctl(fd, HDIO_SET_32BIT, val); + + inputs: + New value for io_32bit flag + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 3] + EBUSY Controller busy + + + + +HDIO_SET_NOWERR change ignore-write-error flag + + usage: + + int val; + ioctl(fd, HDIO_SET_NOWERR, val); + + inputs: + New value for ignore-write-error flag. Used for ignoring + WRERR_STAT + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 1] + EBUSY Controller busy + + + +HDIO_SET_DMA change use-dma flag + + usage: + + long val; + ioctl(fd, HDIO_SET_DMA, val); + + inputs: + New value for use-dma flag + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 1] + EBUSY Controller busy + + + +HDIO_SET_PIO_MODE reconfig interface to new speed + + usage: + + long val; + ioctl(fd, HDIO_SET_PIO_MODE, val); + + inputs: + New interface speed. + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 255] + EBUSY Controller busy + + + +HDIO_SCAN_HWIF register and (re)scan interface + + usage: + + int args[3] + ... + ioctl(fd, HDIO_SCAN_HWIF, args); + + inputs: + args[0] io address to probe + args[1] control address to probe + args[2] irq number + + outputs: none + + error returns: + EACCES Access denied: requires CAP_SYS_RAWIO + EIO Probe failed. + + notes: + + This ioctl initializes the addresses and irq for a disk + controller, probes for drives, and creates /proc/ide + interfaces as appropiate. + + + +HDIO_UNREGISTER_HWIF unregister interface + + usage: + + int index; + ioctl(fd, HDIO_UNREGISTER_HWIF, index); + + inputs: + index index of hardware interface to unregister + + outputs: none + + error returns: + EACCES Access denied: requires CAP_SYS_RAWIO + + notes: + + This ioctl removes a hardware interface from the kernel. + + Currently (2.6.8) this ioctl silently fails if any drive on + the interface is busy. + + + +HDIO_SET_WCACHE change write cache enable-disable + + usage: + + int val; + ioctl(fd, HDIO_SET_WCACHE, val); + + inputs: + New value for write cache enable + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 1] + EBUSY Controller busy + + + +HDIO_SET_ACOUSTIC change acoustic behavior + + usage: + + int val; + ioctl(fd, HDIO_SET_ACOUSTIC, val); + + inputs: + New value for drive acoustic settings + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 254] + EBUSY Controller busy + + + +HDIO_SET_QDMA change use-qdma flag + + Not implemented, as of 2.6.8.1 + + + +HDIO_SET_ADDRESS change lba addressing modes + + usage: + + int val; + ioctl(fd, HDIO_SET_ADDRESS, val); + + inputs: + New value for addressing mode + 0 = 28-bit + 1 = 48-bit + 2 = 48-bit doing 28-bit + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 2] + EBUSY Controller busy + EIO Drive does not support lba48 mode. + + +HDIO_SET_IDE_SCSI + + usage: + + long val; + ioctl(fd, HDIO_SET_IDE_SCSI, val); + + inputs: + New value for scsi emulation mode (?) + + outputs: none + + error return: + EINVAL (bdev != bdev->bd_contains) (not sure what this means) + EACCES Access denied: requires CAP_SYS_ADMIN + EINVAL value out of range [0 1] + EBUSY Controller busy + + + +HDIO_SET_SCSI_IDE + + Not implemented, as of 2.6.8.1 + + diff --git a/Documentation/iostats.txt b/Documentation/iostats.txt new file mode 100644 index 000000000000..09a1bafe2528 --- /dev/null +++ b/Documentation/iostats.txt @@ -0,0 +1,150 @@ +I/O statistics fields +--------------- + +Last modified Sep 30, 2003 + +Since 2.4.20 (and some versions before, with patches), and 2.5.45, +more extensive disk statistics have been introduced to help measure disk +activity. Tools such as sar and iostat typically interpret these and do +the work for you, but in case you are interested in creating your own +tools, the fields are explained here. + +In 2.4 now, the information is found as additional fields in +/proc/partitions. In 2.6, the same information is found in two +places: one is in the file /proc/diskstats, and the other is within +the sysfs file system, which must be mounted in order to obtain +the information. Throughout this document we'll assume that sysfs +is mounted on /sys, although of course it may be mounted anywhere. +Both /proc/diskstats and sysfs use the same source for the information +and so should not differ. + +Here are examples of these different formats: + +2.4: + 3 0 39082680 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160 + 3 1 9221278 hda1 35486 0 35496 38030 0 0 0 0 0 38030 38030 + + +2.6 sysfs: + 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160 + 35486 38030 38030 38030 + +2.6 diskstats: + 3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160 + 3 1 hda1 35486 38030 38030 38030 + +On 2.4 you might execute "grep 'hda ' /proc/partitions". On 2.6, you have +a choice of "cat /sys/block/hda/stat" or "grep 'hda ' /proc/diskstats". +The advantage of one over the other is that the sysfs choice works well +if you are watching a known, small set of disks. /proc/diskstats may +be a better choice if you are watching a large number of disks because +you'll avoid the overhead of 50, 100, or 500 or more opens/closes with +each snapshot of your disk statistics. + +In 2.4, the statistics fields are those after the device name. In +the above example, the first field of statistics would be 446216. +By contrast, in 2.6 if you look at /sys/block/hda/stat, you'll +find just the eleven fields, beginning with 446216. If you look at +/proc/diskstats, the eleven fields will be preceded by the major and +minor device numbers, and device name. Each of these formats provide +eleven fields of statistics, each meaning exactly the same things. +All fields except field 9 are cumulative since boot. Field 9 should +go to zero as I/Os complete; all others only increase. Yes, these are +32 bit unsigned numbers, and on a very busy or long-lived system they +may wrap. Applications should be prepared to deal with that; unless +your observations are measured in large numbers of minutes or hours, +they should not wrap twice before you notice them. + +Each set of stats only applies to the indicated device; if you want +system-wide stats you'll have to find all the devices and sum them all up. + +Field 1 -- # of reads issued + This is the total number of reads completed successfully. +Field 2 -- # of reads merged, field 6 -- # of writes merged + Reads and writes which are adjacent to each other may be merged for + efficiency. Thus two 4K reads may become one 8K read before it is + ultimately handed to the disk, and so it will be counted (and queued) + as only one I/O. This field lets you know how often this was done. +Field 3 -- # of sectors read + This is the total number of sectors read successfully. +Field 4 -- # of milliseconds spent reading + This is the total number of milliseconds spent by all reads (as + measured from __make_request() to end_that_request_last()). +Field 5 -- # of writes completed + This is the total number of writes completed successfully. +Field 7 -- # of sectors written + This is the total number of sectors written successfully. +Field 8 -- # of milliseconds spent writing + This is the total number of milliseconds spent by all writes (as + measured from __make_request() to end_that_request_last()). +Field 9 -- # of I/Os currently in progress + The only field that should go to zero. Incremented as requests are + given to appropriate request_queue_t and decremented as they finish. +Field 10 -- # of milliseconds spent doing I/Os + This field is increases so long as field 9 is nonzero. +Field 11 -- weighted # of milliseconds spent doing I/Os + This field is incremented at each I/O start, I/O completion, I/O + merge, or read of these stats by the number of I/Os in progress + (field 9) times the number of milliseconds spent doing I/O since the + last update of this field. This can provide an easy measure of both + I/O completion time and the backlog that may be accumulating. + + +To avoid introducing performance bottlenecks, no locks are held while +modifying these counters. This implies that minor inaccuracies may be +introduced when changes collide, so (for instance) adding up all the +read I/Os issued per partition should equal those made to the disks ... +but due to the lack of locking it may only be very close. + +In 2.6, there are counters for each cpu, which made the lack of locking +almost a non-issue. When the statistics are read, the per-cpu counters +are summed (possibly overflowing the unsigned 32-bit variable they are +summed to) and the result given to the user. There is no convenient +user interface for accessing the per-cpu counters themselves. + +Disks vs Partitions +------------------- + +There were significant changes between 2.4 and 2.6 in the I/O subsystem. +As a result, some statistic information disappeared. The translation from +a disk address relative to a partition to the disk address relative to +the host disk happens much earlier. All merges and timings now happen +at the disk level rather than at both the disk and partition level as +in 2.4. Consequently, you'll see a different statistics output on 2.6 for +partitions from that for disks. There are only *four* fields available +for partitions on 2.6 machines. This is reflected in the examples above. + +Field 1 -- # of reads issued + This is the total number of reads issued to this partition. +Field 2 -- # of sectors read + This is the total number of sectors requested to be read from this + partition. +Field 3 -- # of writes issued + This is the total number of writes issued to this partition. +Field 4 -- # of sectors written + This is the total number of sectors requested to be written to + this partition. + +Note that since the address is translated to a disk-relative one, and no +record of the partition-relative address is kept, the subsequent success +or failure of the read cannot be attributed to the partition. In other +words, the number of reads for partitions is counted slightly before time +of queuing for partitions, and at completion for whole disks. This is +a subtle distinction that is probably uninteresting for most cases. + +Additional notes +---------------- + +In 2.6, sysfs is not mounted by default. If your distribution of +Linux hasn't added it already, here's the line you'll want to add to +your /etc/fstab: + +none /sys sysfs defaults 0 0 + + +In 2.6, all disk statistics were removed from /proc/stat. In 2.4, they +appear in both /proc/partitions and /proc/stat, although the ones in +/proc/stat take a very different format from those in /proc/partitions +(see proc(5), if your system has it.) + +-- ricklind@us.ibm.com diff --git a/Documentation/isapnp.txt b/Documentation/isapnp.txt new file mode 100644 index 000000000000..400d1b5b523d --- /dev/null +++ b/Documentation/isapnp.txt @@ -0,0 +1,14 @@ +ISA Plug & Play support by Jaroslav Kysela <perex@suse.cz> +========================================================== + +Interface /proc/isapnp +====================== + +The interface has been removed. See pnp.txt for more details. + +Interface /proc/bus/isapnp +========================== + +This directory allows access to ISA PnP cards and logical devices. +The regular files contain the contents of ISA PnP registers for +a logical device. diff --git a/Documentation/isdn/00-INDEX b/Documentation/isdn/00-INDEX new file mode 100644 index 000000000000..9fee5f2e5c62 --- /dev/null +++ b/Documentation/isdn/00-INDEX @@ -0,0 +1,43 @@ +00-INDEX + - this file (info on ISDN implementation for Linux) +CREDITS + - list of the kind folks that brought you this stuff. +INTERFACE + - description of Linklevel and Hardwarelevel ISDN interface. +README + - general info on what you need and what to do for Linux ISDN. +README.FAQ + - general info for FAQ. +README.audio + - info for running audio over ISDN. +README.fax + - info for using Fax over ISDN. +README.icn + - info on the ICN-ISDN-card and its driver. +README.HiSax + - info on the HiSax driver which replaces the old teles. +README.hfc-pci + - info on hfc-pci based cards. +README.pcbit + - info on the PCBIT-D ISDN adapter and driver. +README.syncppp + - info on running Sync PPP over ISDN. +syncPPP.FAQ + - frequently asked questions about running PPP over ISDN. +README.avmb1 + - info on driver for AVM-B1 ISDN card. +README.act2000 + - info on driver for IBM ACT-2000 card. +README.eicon + - info on driver for Eicon active cards. +README.concap + - info on "CONCAP" encapsulation protocol interface used for X.25. +README.diversion + - info on module for isdn diversion services. +README.sc + - info on driver for Spellcaster cards. +README.x25 + _ info for running X.25 over ISDN. +README.hysdn + - info on driver for Hypercope active HYSDN cards + diff --git a/Documentation/isdn/CREDITS b/Documentation/isdn/CREDITS new file mode 100644 index 000000000000..e1b3023efaa8 --- /dev/null +++ b/Documentation/isdn/CREDITS @@ -0,0 +1,70 @@ + +I want to thank all who contributed to this project and especially to: +(in alphabetical order) + +Thomas Bogendörfer (tsbogend@bigbug.franken.de) + Tester, lots of bugfixes and hints. + +Alan Cox (alan@redhat.com) + For help getting into standard-kernel. + +Henner Eisen (eis@baty.hanse.de) + For X.25 implementation. + +Volker Götz (volker@oops.franken.de) + For contribution of man-pages, the imontty-tool and a perfect + maintaining of the mailing-list at hub-wue. + +Matthias Hessler (hessler@isdn4linux.de) + For creating and maintaining the FAQ. + +Bernhard Hailer (Bernhard.Hailer@lrz.uni-muenchen.de) + For creating the FAQ, and the leafsite HOWTO. + +Michael 'Ghandi' Herold (michael@abadonna.franken.de) + For contribution of the vbox answering machine. + +Michael Hipp (Michael.Hipp@student.uni-tuebingen.de) + For his Sync-PPP-code. + +Karsten Keil (keil@isdn4linux.de) + For adding 1TR6-support to the Teles-driver. + For the HiSax-driver. + +Michael Knigge (knick@cove.han.de) + For contributing the imon-tool + +Andreas Kool (akool@Kool.f.EUnet.de) + For contribution of the isdnlog/isdnrep-tool + +Pedro Roque Marques (roque@di.fc.ul.pt) + For lot of new ideas and the pcbit driver. + +Eberhard Moenkeberg (emoenke@gwdg.de) + For testing and help to get into kernel. + +Thomas Neumann (tn@ruhr.de) + For help with Cisco-SLARP and keepalive + +Jan den Ouden (denouden@groovin.xs4all.nl) + For contribution of the original teles-driver + +Carsten Paeth (calle@calle.in-berlin.de) + For the AVM-B1-CAPI2.0 driver + +Thomas Pfeiffer (pfeiffer@pds.de) + For V.110, extended T.70 and Hylafax extensions in isdn_tty.c + +Max Riegel (riegel@max.franken.de) + For making the ICN hardware-documentation and test-equipment available. + +Armin Schindler (mac@melware.de) + For the eicon active card driver. + +Gerhard 'Fido' Schneider (fido@wuff.mayn.de) + For heavy-duty-beta-testing with his BBS ;) + +Thomas Uhl (uhl@think.de) + For distributing the cards. + For pushing me to work ;-) + diff --git a/Documentation/isdn/HiSax.cert b/Documentation/isdn/HiSax.cert new file mode 100644 index 000000000000..f2a6fcb8efee --- /dev/null +++ b/Documentation/isdn/HiSax.cert @@ -0,0 +1,96 @@ +-----BEGIN PGP SIGNED MESSAGE----- + +First: + + HiSax is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + +However, if you wish to modify the HiSax sources, please note the following: + +HiSax has passed the ITU approval test suite with ELSA Quickstep ISDN cards +and Eicon Technology Diva 2.01 PCI card. +The certification is only valid for the combination of the tested software +version and the tested hardware. Any changes to the HiSax source code may +therefore affect the certification. + +Additional ITU approval tests have been carried out for all generic cards +using Colognechip single chip solutions HFC-S PCI A for PCI cards as well +as HFC-S USB based USB ISDN ta adapters. +These tests included all layers 1-3 and as well all functional tests for +the layer 1. Because all hardware based on these chips are complete ISDN +solutions in one chip all cards and USB-TAs using these chips are to be +regarded as approved for those tests. Some additional electrical tests +of the layer 1 which are independent of the driver and related to a +special hardware used will be regarded as approved if at least one +solution has been tested including those electrical tests. So if cards +or tas have been completely approved for any other os, the approval +for those electrical tests is valid for linux, too. +Please send any questions regarding this drivers or approval abouts to +werner@isdn-development.de +Additional information and the type approval documents will be found +shortly on the Colognechip website www.colognechip.com + +If you change the main files of the HiSax ISDN stack, the certification will +become invalid. Because in most countries it is illegal to connect +unapproved ISDN equipment to the public network, I have to guarantee that +changes in HiSax do not affect the certification. + +In order to make a valid certification apparent to the user, I have built in +some validation checks that are made during the make process. The HiSax main +files are protected by md5 checksums and the md5sum file is pgp signed by +myself: + +KeyID 1024/FF992F6D 1997/01/16 Karsten Keil <kkeil@suse.de> +Key fingerprint = 92 6B F7 58 EE 86 28 C8 C4 1A E6 DC 39 89 F2 AA + +Only if the checksums are OK, and the signature of the file +"drivers/isdn/hisax/md5sums.asc" match, is the certification valid; a +message confirming this is then displayed during the hisax init process. + +The affected files are: + +drivers/isdn/hisax/isac.c +drivers/isdn/hisax/isdnl1.c +drivers/isdn/hisax/isdnl2.c +drivers/isdn/hisax/isdnl3.c +drivers/isdn/hisax/tei.c +drivers/isdn/hisax/callc.c +drivers/isdn/hisax/l3dss1.c +drivers/isdn/hisax/l3_1tr6.c +drivers/isdn/hisax/cert.c +drivers/isdn/hisax/elsa.c +drivers/isdn/hisax/diva.c +drivers/isdn/hisax/hfc_pci.c + +Please send any changes, bugfixes and patches to me rather than implementing +them directly into the HiSax sources. + +This does not reduce your rights granted by the GNU General Public License. +If you wish to change the sources, go ahead; but note that then the +certification is invalid even if you use one of the approved cards. + +Here are the certification registration numbers for ELSA Quickstep cards: +German D133361J CETECOM ICT Services GmbH 0682 +European D133362J CETECOM ICT Services GmbH 0682 + + +Karsten Keil +keil@isdn4linux.de + +-----BEGIN PGP SIGNATURE----- +Version: 2.6.3i +Charset: noconv + +iQCVAwUBOFAwqTpxHvX/mS9tAQFI2QP9GLDK2iy/KBhwReE3F7LeO+tVhffTVZ3a +20q5/z/WcIg/pnH0uTkl2UgDXBFXYl45zJyDGNpAposIFmT+Edd14o7Vj1w/BBdn +Y+5rBmJf+gyBu61da5d6bv0lpymwRa/um+ri+ilYnZ/XPfg5JKhdjGSBCJuJAElM +d2jFbTrsMYw= +=LNf9 +-----END PGP SIGNATURE----- diff --git a/Documentation/isdn/INTERFACE b/Documentation/isdn/INTERFACE new file mode 100644 index 000000000000..5df17e5b25c8 --- /dev/null +++ b/Documentation/isdn/INTERFACE @@ -0,0 +1,759 @@ +$Id: INTERFACE,v 1.15.8.2 2001/03/13 16:17:07 kai Exp $ + +Description of the Interface between Linklevel and Hardwarelevel + of isdn4linux: + + + The Communication between Linklevel (LL) and Hardwarelevel (HL) + is based on the struct isdn_if (defined in isdnif.h). + + An HL-driver can register itself at LL by calling the function + register_isdn() with a pointer to that struct. Prior to that, it has + to preset some of the fields of isdn_if. The LL sets the rest of + the fields. All further communication is done via callbacks using + the function-pointers defined in isdn_if. + + Changes/Version numbering: + + During development of the ISDN subsystem, several changes have been + made to the interface. Before it went into kernel, the package + had a unique version number. The last version, distributed separately + was 0.7.4. When the subsystem went into kernel, every functional unit + got a separate version number. These numbers are shown at initialization, + separated by slashes: + + c.c/t.t/n.n/p.p/a.a/v.v + + where + + c.c is the revision of the common code. + t.t is the revision of the tty related code. + n.n is the revision of the network related code. + p.p is the revision of the ppp related code. + a.a is the revision of the audio related code. + v.v is the revision of the V.110 related code. + + Changes in this document are marked with '***CHANGEx' where x representing + the version number. If that number starts with 0, it refers to the old, + separately distributed package. If it starts with one of the letters + above, it refers to the revision of the corresponding module. + ***CHANGEIx refers to the revision number of the isdnif.h + +1. Description of the fields of isdn_if: + + int channels; + + This field has to be set by the HL-driver to the number of channels + supported prior to calling register_isdn(). Upon return of the call, + the LL puts an id there, which has to be used by the HL-driver when + invoking the other callbacks. + + int maxbufsize; + + ***CHANGE0.6: New since this version. + + Also to be preset by the HL-driver. With this value the HL-driver + tells the LL the maximum size of a data-packet it will accept. + + unsigned long features; + + To be preset by the HL-driver. Using this field, the HL-driver + announces the features supported. At the moment this is limited to + report the supported layer2 and layer3-protocols. For setting this + field the constants ISDN_FEATURE..., declared in isdnif.h have to be + used. + + ***CHANGE0.7.1: The line type (1TR6, EDSS1) has to be set. + + unsigned short hl_hdrlen; + + ***CHANGE0.7.4: New field. + + To be preset by the HL-driver, if it supports sk_buff's. The driver + should put here the amount of additional space needed in sk_buff's for + its internal purposes. Drivers not supporting sk_buff's should + initialize this field to 0. + + void (*rcvcallb_skb)(int, int, struct sk_buff *) + + ***CHANGE0.7.4: New field. + + This field will be set by LL. The HL-driver delivers received data- + packets by calling this function. Upon calling, the HL-driver must + already have its private data pulled off the head of the sk_buff. + + Parameter: + int driver-Id + int Channel-number locally to the driver. (starting with 0) + struct sk_buff * Pointer to sk_buff, containing received data. + + int (*statcallb)(isdn_ctrl*); + + This field will be set by LL. This function has to be called by the + HL-driver for signaling status-changes or other events to the LL. + + Parameter: + isdn_ctrl* + + The struct isdn_ctrl also defined in isdn_if. The exact meanings of its + fields are described together with the descriptions of the possible + events. Here is only a short description of the fields: + + driver = driver Id. + command = event-type. (one of the constants ISDN_STAT_...) + arg = depends on event-type. + num = depends on event-type. + + Returnvalue: + 0 on success, else -1 + + int (*command)(isdn_ctrl*); + + This field has to be preset by the HL-driver. It points to a function, + to be called by LL to perform functions like dialing, B-channel + setup, etc. The exact meaning of the parameters is described with the + descriptions of the possible commands. + + Parameter: + isdn_ctrl* + driver = driver-Id + command = command to perform. (one of the constants ISDN_CMD_...) + arg = depends on command. + num = depends on command. + + Returnvalue: + >=0 on success, else error-code (-ENODEV etc.) + + int (*writebuf_skb)(int, int, int, struct sk_buff *) + + ***CHANGE0.7.4: New field. + ***CHANGEI.1.21: New field. + + This field has to be preset by the HL-driver. The given function will + be called by the LL for delivering data to be send via B-Channel. + + + Parameter: + int driver-Id ***CHANGE0.7.4: New parameter. + int channel-number locally to the HL-driver. (starts with 0) + int ack ***ChangeI1.21: New parameter + If this is !0, the driver has to signal the delivery + by sending an ISDN_STAT_BSENT. If this is 0, the driver + MUST NOT send an ISDN_STAT_BSENT. + struct sk_buff * Pointer to sk_buff containing data to be send via + B-channel. + + Returnvalue: + Length of data accepted on success, else error-code (-EINVAL on + oversized packets etc.) + + int (*writecmd)(u_char*, int, int, int, int); + + This field has to be preset by the HL-driver. The given function will be + called to perform write-requests on /dev/isdnctrl (i.e. sending commands + to the card) The data-format is hardware-specific. This function is + intended for debugging only. It is not necessary for normal operation + and never will be called by the tty-emulation- or network-code. If + this function is not supported, the driver has to set NULL here. + + Parameter: + u_char* pointer to data. + int length of data. + int flag: 0 = call from within kernel-space. (HL-driver must use + memcpy, may NOT use schedule()) + 1 = call from user-space. (HL-driver must use + memcpy_fromfs, use of schedule() allowed) + int driver-Id. + int channel-number locally to the HL-driver. (starts with 0) + +***CHANGEI1.14: The driver-Id and channel-number are new since this revision. + + Returnvalue: + Length of data accepted on success, else error-code (-EINVAL etc.) + + int (*readstat)(u_char*, int, int, int, int); + + This field has to be preset by the HL-driver. The given function will be + called to perform read-requests on /dev/isdnctrl (i.e. reading replies + from the card) The data-format is hardware-specific. This function is + intended for debugging only. It is not necessary for normal operation + and never will be called by the tty-emulation- or network-code. If + this function is not supported, the driver has to set NULL here. + + Parameter: + u_char* pointer to data. + int length of data. + int flag: 0 = call from within kernel-space. (HL-driver must use + memcpy, may NOT use schedule()) + 1 = call from user-space. (HL-driver must use + memcpy_fromfs, use of schedule() allowed) + int driver-Id. + int channel-number locally to the HL-driver. (starts with 0) + +***CHANGEI1.14: The driver-Id and channel-number are new since this revision. + + Returnvalue: + Length of data on success, else error-code (-EINVAL etc.) + + char id[20]; + ***CHANGE0.7: New since this version. + + This string has to be preset by the HL-driver. Its purpose is for + identification of the driver by the user. Eg.: it is shown in the + status-info of /dev/isdninfo. Furthermore it is used as Id for binding + net-interfaces to a specific channel. If a string of length zero is + given, upon return, isdn4linux will replace it by a generic name. (line0, + line1 etc.) It is recommended to make this string configurable during + module-load-time. (copy a global variable to this string.) For doing that, + modules 1.2.8 or newer are necessary. + +2. Description of the commands, a HL-driver has to support: + + All commands will be performed by calling the function command() described + above from within the LL. The field command of the struct-parameter will + contain the desired command, the field driver is always set to the + appropriate driver-Id. + + Until now, the following commands are defined: + +***CHANGEI1.34: The parameter "num" has been replaced by a union "parm" containing + the old "num" and a new setup_type struct used for ISDN_CMD_DIAL + and ISDN_STAT_ICALL callback. + + ISDN_CMD_IOCTL: + + This command is intended for performing ioctl-calls for configuring + hardware or similar purposes (setting port-addresses, loading firmware + etc.) For this purpose, in the LL all ioctl-calls with an argument + >= IIOCDRVCTL (0x100) will be handed transparently to this + function after subtracting 0x100 and placing the result in arg. + Example: + If a userlevel-program calls ioctl(0x101,...) the function gets + called with the field command set to 1. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_IOCTL + arg = Original ioctl-cmd - IIOCDRVCTL + parm.num = first bytes filled with (unsigned long)arg + + Returnvalue: + Depending on driver. + + + ISDN_CMD_DIAL: + + This command is used to tell the HL-driver it should dial a given + number. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_DIAL + arg = channel-number locally to the driver. (starting with 0) + + parm.setup.phone = An ASCII-String containing the number to dial. + parm.setup.eazmsn = An ASCII-Sting containing the own EAZ or MSN. + parm.setup.si1 = The Service-Indicator. + parm.setup.si2 = Additional Service-Indicator. + + If the Line has been designed as SPV (a special german + feature, meaning semi-leased-line) the phone has to + start with an "S". + ***CHANGE0.6: In previous versions the EAZ has been given in the + highbyte of arg. + ***CHANGE0.7.1: New since this version: ServiceIndicator and AddInfo. + + ISDN_CMD_ACCEPTD: + + With this command, the HL-driver is told to accept a D-Channel-setup. + (Response to an incoming call) + + Parameter: + driver = driver-Id. + command = ISDN_CMD_ACCEPTD + arg = channel-number locally to the driver. (starting with 0) + parm = unused. + + ISDN_CMD_ACCEPTB: + + With this command, the HL-driver is told to perform a B-Channel-setup. + (after establishing D-Channel-Connection) + + Parameter: + driver = driver-Id. + command = ISDN_CMD_ACCEPTB + arg = channel-number locally to the driver. (starting with 0) + parm = unused. + + ISDN_CMD_HANGUP: + + With this command, the HL-driver is told to hangup (B-Channel if + established first, then D-Channel). This command is also used for + actively rejecting an incoming call. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_HANGUP + arg = channel-number locally to the driver. (starting with 0) + parm = unused. + + ISDN_CMD_CLREAZ: + + With this command, the HL-driver is told not to signal incoming + calls to the LL. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_CLREAZ + arg = channel-number locally to the driver. (starting with 0) + parm = unused. + + ISDN_CMD_SETEAZ: + + With this command, the HL-driver is told to signal incoming calls for + the given EAZs/MSNs to the LL. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_SETEAZ + arg = channel-number locally to the driver. (starting with 0) + parm.num = ASCII-String, containing the desired EAZ's/MSN's + (comma-separated). If an empty String is given, the + HL-driver should respond to ALL incoming calls, + regardless of the destination-address. + ***CHANGE0.6: New since this version the "empty-string"-feature. + + ISDN_CMD_GETEAZ: (currently unused) + + With this command, the HL-driver is told to report the current setting + given with ISDN_CMD_SETEAZ. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_GETEAZ + arg = channel-number locally to the driver. (starting with 0) + parm.num = ASCII-String, containing the current EAZ's/MSN's + + ISDN_CMD_SETSIL: (currently unused) + + With this command, the HL-driver is told to signal only incoming + calls with the given Service-Indicators. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_SETSIL + arg = channel-number locally to the driver. (starting with 0) + parm.num = ASCII-String, containing the desired Service-Indicators. + + ISDN_CMD_GETSIL: (currently unused) + + With this command, the HL-driver is told to return the current + Service-Indicators it will respond to. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_SETSIL + arg = channel-number locally to the driver. (starting with 0) + parm.num = ASCII-String, containing the current Service-Indicators. + + ISDN_CMD_SETL2: + + With this command, the HL-driver is told to select the given Layer-2- + protocol. This command is issued by the LL prior to ISDN_CMD_DIAL or + ISDN_CMD_ACCEPTD. + + + Parameter: + driver = driver-Id. + command = ISDN_CMD_SETL2 + arg = channel-number locally to the driver. (starting with 0) + logical or'ed with (protocol-Id << 8) + protocol-Id is one of the constants ISDN_PROTO_L2... + parm = unused. + + ISDN_CMD_GETL2: (currently unused) + + With this command, the HL-driver is told to return the current + setting of the Layer-2-protocol. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_GETL2 + arg = channel-number locally to the driver. (starting with 0) + parm = unused. + Returnvalue: + current protocol-Id (one of the constants ISDN_L2_PROTO) + + ISDN_CMD_SETL3: + + With this command, the HL-driver is told to select the given Layer-3- + protocol. This command is issued by the LL prior to ISDN_CMD_DIAL or + ISDN_CMD_ACCEPTD. + + + Parameter: + driver = driver-Id. + command = ISDN_CMD_SETL3 + arg = channel-number locally to the driver. (starting with 0) + logical or'ed with (protocol-Id << 8) + protocol-Id is one of the constants ISDN_PROTO_L3... + parm.fax = Pointer to T30_s fax struct. (fax usage only) + + ISDN_CMD_GETL2: (currently unused) + + With this command, the HL-driver is told to return the current + setting of the Layer-3-protocol. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_GETL3 + arg = channel-number locally to the driver. (starting with 0) + parm = unused. + Returnvalue: + current protocol-Id (one of the constants ISDN_L3_PROTO) + + ISDN_CMD_PROCEED: + + With this command, the HL-driver is told to proceed with a incoming call. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_PROCEED + arg = channel-number locally to the driver. (starting with 0) + setup.eazmsn= empty string or string send as uus1 in DSS1 with + PROCEED message + + ISDN_CMD_ALERT: + + With this command, the HL-driver is told to alert a proceeding call. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_ALERT + arg = channel-number locally to the driver. (starting with 0) + setup.eazmsn= empty string or string send as uus1 in DSS1 with + ALERT message + + ISDN_CMD_REDIR: + + With this command, the HL-driver is told to redirect a call in proceeding + or alerting state. + + Parameter: + driver = driver-Id. + command = ISDN_CMD_REDIR + arg = channel-number locally to the driver. (starting with 0) + setup.eazmsn= empty string or string send as uus1 in DSS1 protocol + setup.screen= screening indicator + setup.phone = redirected to party number + + ISDN_CMD_PROT_IO: + + With this call, the LL-driver invokes protocol specific features through + the LL. + The call is not implicitely bound to a connection. + + Parameter: + driver = driver-Id + command = ISDN_CMD_PROT_IO + arg = The lower 8 Bits define the addressed protocol as defined + in ISDN_PTYPE..., the upper bits are used to differentiate + the protocol specific CMD. + + para = protocol and function specific. See isdnif.h for detail. + + + ISDN_CMD_FAXCMD: + + With this command the HL-driver receives a fax sub-command. + For details refer to INTERFACE.fax + + Parameter: + driver = driver-Id. + command = ISDN_CMD_FAXCMD + arg = channel-number locally to the driver. (starting with 0) + parm = unused. + + +3. Description of the events to be signaled by the HL-driver to the LL. + + All status-changes are signaled via calling the previously described + function statcallb(). The field command of the struct isdn_cmd has + to be set by the HL-driver with the appropriate Status-Id (event-number). + The field arg has to be set to the channel-number (locally to the driver, + starting with 0) to which this event applies. (Exception: STAVAIL-event) + + Until now, the following Status-Ids are defined: + + ISDN_STAT_AVAIL: + + With this call, the HL-driver signals the availability of new data + for readstat(). Used only for debugging-purposes, see description + of readstat(). + + Parameter: + driver = driver-Id + command = ISDN_STAT_STAVAIL + arg = length of available data. + parm = unused. + + ISDN_STAT_ICALL: + ISDN_STAT_ICALLW: + + With this call, the HL-driver signals an incoming call to the LL. + If ICALLW is signalled the incoming call is a waiting call without + a available B-chan. + + Parameter: + driver = driver-Id + command = ISDN_STAT_ICALL + arg = channel-number, locally to the driver. (starting with 0) + para.setup.phone = Callernumber. + para.setup.eazmsn = CalledNumber. + para.setup.si1 = Service Indicator. + para.setup.si2 = Additional Service Indicator. + para.setup.plan = octet 3 from Calling party number Information Element. + para.setup.screen = octet 3a from Calling party number Information Element. + + Return: + 0 = No device matching this call. + 1 = At least one device matching this call (RING on ttyI). + HL-driver may send ALERTING on the D-channel in this case. + 2 = Call will be rejected. + 3 = Incoming called party number is currently incomplete. + Additional digits are required. + Used for signalling with PtP connections. + 4 = Call will be held in a proceeding state + (HL driver sends PROCEEDING) + Used when a user space prog needs time to interpret a call + para.setup.eazmsn may be filled with an uus1 message of + 30 octets maximum. Empty string if no uus. + 5 = Call will be actively deflected to another party + Only available in DSS1/EURO protocol + para.setup.phone must be set to destination party number + para.setup.eazmsn may be filled with an uus1 message of + 30 octets maximum. Empty string if no uus. + -1 = An error happened. (Invalid parameters for example.) + The keypad support now is included in the dial command. + + + ISDN_STAT_RUN: + + With this call, the HL-driver signals availability of the ISDN-card. + (after initializing, loading firmware) + + Parameter: + driver = driver-Id + command = ISDN_STAT_RUN + arg = unused. + parm = unused. + + ISDN_STAT_STOP: + + With this call, the HL-driver signals unavailability of the ISDN-card. + (before unloading, while resetting/reconfiguring the card) + + Parameter: + driver = driver-Id + command = ISDN_STAT_STOP + arg = unused. + parm = unused. + + ISDN_STAT_DCONN: + + With this call, the HL-driver signals the successful establishment of + a D-Channel-connection. (Response to ISDN_CMD_ACCEPTD or ISDN_CMD_DIAL) + + Parameter: + driver = driver-Id + command = ISDN_STAT_DCONN + arg = channel-number, locally to the driver. (starting with 0) + parm = unused. + + ISDN_STAT_BCONN: + + With this call, the HL-driver signals the successful establishment of + a B-Channel-connection. (Response to ISDN_CMD_ACCEPTB or because the + remote-station has initiated establishment) + + The HL driver should call this when the logical l2/l3 protocol + connection on top of the physical B-channel is established. + + Parameter: + driver = driver-Id + command = ISDN_STAT_BCONN + arg = channel-number, locally to the driver. (starting with 0) + parm.num = ASCII-String, containing type of connection (for analog + modem only). This will be appended to the CONNECT message + e.g. 14400/V.32bis + + ISDN_STAT_DHUP: + + With this call, the HL-driver signals the shutdown of a + D-Channel-connection. This could be a response to a prior ISDN_CMD_HANGUP, + or caused by a remote-hangup or if the remote-station has actively + rejected a call. + + Parameter: + driver = driver-Id + command = ISDN_STAT_DHUP + arg = channel-number, locally to the driver. (starting with 0) + parm = unused. + + ISDN_STAT_BHUP: + + With this call, the HL-driver signals the shutdown of a + B-Channel-connection. This could be a response to a prior ISDN_CMD_HANGUP, + or caused by a remote-hangup. + + The HL driver should call this as soon as the logical l2/l3 protocol + connection on top of the physical B-channel is released. + + Parameter: + driver = driver-Id + command = ISDN_STAT_BHUP + arg = channel-number, locally to the driver. (starting with 0) + parm = unused. + + ISDN_STAT_CINF: + + With this call, the HL-driver delivers charge-unit information to the + LL. + + Parameter: + driver = driver-Id + command = ISDN_STAT_CINF + arg = channel-number, locally to the driver. (starting with 0) + parm.num = ASCII string containing charge-units (digits only). + + ISDN_STAT_LOAD: (currently unused) + + ISDN_STAT_UNLOAD: + + With this call, the HL-driver signals that it will be unloaded now. This + tells the LL to release all corresponding data-structures. + + Parameter: + driver = driver-Id + command = ISDN_STAT_UNLOAD + arg = unused. + parm = unused. + + ISDN_STAT_BSENT: + + With this call the HL-driver signals the delivery of a data-packet. + This callback is used by the network-interfaces only, tty-Emulation + does not need this call. + + Parameter: + driver = driver-Id + command = ISDN_STAT_BSENT + arg = channel-number, locally to the driver. (starting with 0) + parm.length = ***CHANGEI.1.21: New field. + the driver has to set this to the original length + of the skb at the time of receiving it from the linklevel. + + ISDN_STAT_NODCH: + + With this call, the driver has to respond to a prior ISDN_CMD_DIAL, if + no D-Channel is available. + + Parameter: + driver = driver-Id + command = ISDN_STAT_NODCH + arg = channel-number, locally to the driver. (starting with 0) + parm = unused. + + ISDN_STAT_ADDCH: + + This call is for HL-drivers, which are unable to check card-type + or numbers of supported channels before they have loaded any firmware + using ioctl. Those HL-driver simply set the channel-parameter to a + minimum channel-number when registering, and later if they know + the real amount, perform this call, allocating additional channels. + + Parameter: + driver = driver-Id + command = ISDN_STAT_ADDCH + arg = number of channels to be added. + parm = unused. + + ISDN_STAT_CAUSE: + + With this call, the HL-driver delivers CAUSE-messages to the LL. + Currently the LL does not use this messages. Their contents is simply + logged via kernel-messages. Therefore, currently the format of the + messages is completely free. However they should be printable. + + Parameter: + driver = driver-Id + command = ISDN_STAT_NODCH + arg = channel-number, locally to the driver. (starting with 0) + parm.num = ASCII string containing CAUSE-message. + + ISDN_STAT_DISPLAY: + + With this call, the HL-driver delivers DISPLAY-messages to the LL. + Currently the LL does not use this messages. + + Parameter: + driver = driver-Id + command = ISDN_STAT_DISPLAY + arg = channel-number, locally to the driver. (starting with 0) + para.display= string containing DISPLAY-message. + + ISDN_STAT_PROT: + + With this call, the HL-driver delivers protocol specific infos to the LL. + The call is not implicitely bound to a connection. + + Parameter: + driver = driver-Id + command = ISDN_STAT_PROT + arg = The lower 8 Bits define the addressed protocol as defined + in ISDN_PTYPE..., the upper bits are used to differentiate + the protocol specific STAT. + + para = protocol and function specific. See isdnif.h for detail. + + ISDN_STAT_DISCH: + + With this call, the HL-driver signals the LL to disable or enable the + use of supplied channel and driver. + The call may be used to reduce the available number of B-channels after + loading the driver. The LL has to ignore a disabled channel when searching + for free channels. The HL driver itself never delivers STAT callbacks for + disabled channels. + The LL returns a nonzero code if the operation was not successful or the + selected channel is actually regarded as busy. + + Parameter: + driver = driver-Id + command = ISDN_STAT_DISCH + arg = channel-number, locally to the driver. (starting with 0) + parm.num[0] = 0 if channel shall be disabled, else enabled. + + ISDN_STAT_L1ERR: + + ***CHANGEI1.21 new status message. + A signal can be sent to the linklevel if an Layer1-error results in + packet-loss on receive or send. The field errcode of the cmd.parm + union describes the error more precisely. + + Parameter: + driver = driver-Id + command = ISDN_STAT_L1ERR + arg = channel-number, locally to the driver. (starting with 0) + parm.errcode= ISDN_STAT_L1ERR_SEND: Packet lost while sending. + ISDN_STAT_L1ERR_RECV: Packet lost while receiving. + ISDN_STAT_FAXIND: + + With this call the HL-driver signals a fax sub-command to the LL. + For details refer to INTERFACE.fax + + Parameter: + driver = driver-Id. + command = ISDN_STAT_FAXIND + arg = channel-number, locally to the driver. (starting with 0) + parm = unused. + diff --git a/Documentation/isdn/INTERFACE.fax b/Documentation/isdn/INTERFACE.fax new file mode 100644 index 000000000000..7e5731319e30 --- /dev/null +++ b/Documentation/isdn/INTERFACE.fax @@ -0,0 +1,163 @@ +$Id: INTERFACE.fax,v 1.2 2000/08/06 09:22:50 armin Exp $ + + +Description of the fax-subinterface between linklevel and hardwarelevel of + isdn4linux. + + The communication between linklevel (LL) and hardwarelevel (HL) for fax + is based on the struct T30_s (defined in isdnif.h). + This struct is allocated in the LL. + In order to use fax, the LL provides the pointer to this struct with the + command ISDN_CMD_SETL3 (parm.fax). This pointer expires in case of hangup + and when a new channel to a new connection is assigned. + + +Data handling: + In send-mode the HL-driver has to handle the <DLE> codes and the bit-order + conversion by itself. + In receive-mode the LL-driver takes care of the bit-order conversion + (specified by +FBOR) + +Structure T30_s description: + + This structure stores the values (set by AT-commands), the remote- + capability-values and the command-codes between LL and HL. + + If the HL-driver receives ISDN_CMD_FAXCMD, all needed information + is in this struct set by the LL. + To signal information to the LL, the HL-driver has to set the + the parameters and use ISDN_STAT_FAXIND. + (Please refer to INTERFACE) + +Structure T30_s: + + All members are 8-bit unsigned (__u8) + + - resolution + - rate + - width + - length + - compression + - ecm + - binary + - scantime + - id[] + Local faxmachine's parameters, set by +FDIS, +FDCS, +FLID, ... + + - r_resolution + - r_rate + - r_width + - r_length + - r_compression + - r_ecm + - r_binary + - r_scantime + - r_id[] + Remote faxmachine's parameters. To be set by HL-driver. + + - phase + Defines the actual state of fax connection. Set by HL or LL + depending on progress and type of connection. + If the phase changes because of an AT command, the LL driver + changes this value. Otherwise the HL-driver takes care of it, but + only necessary on call establishment (from IDLE to PHASE_A). + (one of the constants ISDN_FAX_PHASE_[IDLE,A,B,C,D,E]) + + - direction + Defines outgoing/send or incoming/receive connection. + (ISDN_TTY_FAX_CONN_[IN,OUT]) + + - code + Commands from LL to HL; possible constants : + ISDN_TTY_FAX_DR signals +FDR command to HL + + ISDN_TTY_FAX_DT signals +FDT command to HL + + ISDN_TTY_FAX_ET signals +FET command to HL + + + Other than that the "code" is set with the hangup-code value at + the end of connection for the +FHNG message. + + - r_code + Commands from HL to LL; possible constants : + ISDN_TTY_FAX_CFR output of +FCFR message. + + ISDN_TTY_FAX_RID output of remote ID set in r_id[] + (+FCSI/+FTSI on send/receive) + + ISDN_TTY_FAX_DCS output of +FDCS and CONNECT message, + switching to phase C. + + ISDN_TTY_FAX_ET signals end of data, + switching to phase D. + + ISDN_TTY_FAX_FCON signals the established, outgoing connection, + switching to phase B. + + ISDN_TTY_FAX_FCON_I signals the established, incoming connection, + switching to phase B. + + ISDN_TTY_FAX_DIS output of +FDIS message and values. + + ISDN_TTY_FAX_SENT signals that all data has been sent + and <DLE><ETX> is acknowledged, + OK message will be sent. + + ISDN_TTY_FAX_PTS signals a msg-confirmation (page sent successful), + depending on fet value: + 0: output OK message (more pages follow) + 1: switching to phase B (next document) + + ISDN_TTY_FAX_TRAIN_OK output of +FDCS and OK message (for receive mode). + + ISDN_TTY_FAX_EOP signals end of data in receive mode, + switching to phase D. + + ISDN_TTY_FAX_HNG output of the +FHNG and value set by code and + OK message, switching to phase E. + + + - badlin + Value of +FBADLIN + + - badmul + Value of +FBADMUL + + - bor + Value of +FBOR + + - fet + Value of +FET command in send-mode. + Set by HL in receive-mode for +FET message. + + - pollid[] + ID-string, set by +FCIG + + - cq + Value of +FCQ + + - cr + Value of +FCR + + - ctcrty + Value of +FCTCRTY + + - minsp + Value of +FMINSP + + - phcto + Value of +FPHCTO + + - rel + Value of +FREL + + - nbc + Value of +FNBC (0,1) + (+FNBC is not a known class 2 fax command, I added this to change the + automatic "best capabilities" connection in the eicon HL-driver) + + +Armin +mac@melware.de + diff --git a/Documentation/isdn/README b/Documentation/isdn/README new file mode 100644 index 000000000000..761595243931 --- /dev/null +++ b/Documentation/isdn/README @@ -0,0 +1,599 @@ +README for the ISDN-subsystem + +1. Preface + + 1.1 Introduction + + This README describes how to set up and how to use the different parts + of the ISDN-subsystem. + + For using the ISDN-subsystem, some additional userlevel programs are + necessary. Those programs and some contributed utilities are available + at + + ftp.isdn4linux.de + + /pub/isdn4linux/isdn4k-utils-<VersionNumber>.tar.gz + + + We also have set up a mailing-list: + + The isdn4linux-project originates in Germany, and therefore by historical + reasons, the mailing-list's primary language is german. However mails + written in english have been welcome all the time. + + to subscribe: write a email to majordomo@listserv.isdn4linux.de, + Subject irrelevant, in the message body: + subscribe isdn4linux <your_email_address> + + To write to the mailing-list, write to isdn4linux@listserv.isdn4linux.de + + This mailinglist is bidirectionally gated to the newsgroup + + de.alt.comm.isdn4linux + + There is also a well maintained FAQ in English available at + http://www.mhessler.de/i4lfaq/ + It can be viewed online, or downloaded in sgml/text/html format. + The FAQ can also be viewed online at + http://www.isdn4inux.de/faq/ + or downloaded from + ftp://ftp.isdn4linux.de/pub/isdn4linux/FAQ/ + + 1.1 Technical details + + In the following Text, the terms MSN and EAZ are used. + + MSN is the abbreviation for (M)ultiple(S)ubscriber(N)umber, and applies + to Euro(EDSS1)-type lines. Usually it is simply the phone number. + + EAZ is the abbreviation of (E)ndgeraete(A)uswahl(Z)iffer and + applies to German 1TR6-type lines. This is a one-digit string, + simply appended to the base phone number + + The internal handling is nearly identical, so replace the appropriate + term to that one, which applies to your local ISDN-environment. + + When the link-level-module isdn.o is loaded, it supports up to 16 + low-level-modules with up to 64 channels. (The number 64 is arbitrarily + chosen and can be configured at compile-time --ISDN_MAX in isdn.h). + A low-level-driver can register itself through an interface (which is + defined in isdnif.h) and gets assigned a slot. + The following char-devices are made available for each channel: + + A raw-control-device with the following functions: + write: raw D-channel-messages (format: depends on driver). + read: raw D-channel-messages (format: depends on driver). + ioctl: depends on driver, i.e. for the ICN-driver, the base-address of + the ports and the shared memory on the card can be set and read + also the boot-code and the protocol software can be loaded into + the card. + + O N L Y !!! for debugging (no locking against other devices): + One raw-data-device with the following functions: + write: data to B-channel. + read: data from B-channel. + + In addition the following devices are made available: + + 128 tty-devices (64 cuix and 64 ttyIx) with integrated modem-emulator: + The functionality is almost the same as that of a serial device + (the line-discs are handled by the kernel), which lets you run + SLIP, CSLIP and asynchronous PPP through the devices. We have tested + Seyon, minicom, CSLIP (uri-dip) PPP, mgetty, XCept and Hylafax. + + The modem-emulation supports the following: + 1.3.1 Commands: + + ATA Answer incoming call. + ATD<No.> Dial, the number may contain: + [0-9] and [,#.*WPT-S] + the latter are ignored until 'S'. + The 'S' must precede the number, if + the line is a SPV (German 1TR6). + ATE0 Echo off. + ATE1 Echo on (default). + ATH Hang-up. + ATH1 Off hook (ignored). + ATH0 Hang-up. + ATI Return "ISDN for Linux...". + ATI0 " + ATI1 " + ATI2 Report of last connection. + ATO On line (data mode). + ATQ0 Enable result codes (default). + ATQ1 Disable result codes (default). + ATSx=y Set register x to y. + ATSx? Show contents of register x. + ATV0 Numeric responses. + ATV1 English responses (default). + ATZ Load registers and EAZ/MSN from Profile. + AT&Bx Set Send-Packet-size to x (max. 4000) + The real packet-size may be limited by the + low-level-driver used. e.g. the HiSax-Module- + limit is 2000. You will get NO Error-Message, + if you set it to higher values, because at the + time of giving this command the corresponding + driver may not be selected (see "Automatic + Assignment") however the size of outgoing packets + will be limited correctly. + AT&D0 Ignore DTR + AT&D2 DTR-low-edge: Hang up and return to + command mode (default). + AT&D3 Same as AT&D2 but also resets all registers. + AT&Ex Set the EAZ/MSN for this channel to x. + AT&F Reset all registers and profile to "factory-defaults" + AT&Lx Set list of phone numbers to listen on. x is a + list of wildcard patterns separated by semicolon. + If this is set, it has precedence over the MSN set + by AT&E. + AT&Rx Select V.110 bitrate adaption. + This command enables V.110 protocol with 9600 baud + (x=9600), 19200 baud (x=19200) or 38400 baud + (x=38400). A value of x=0 disables V.110 switching + back to default X.75. This command sets the following + Registers: + Reg 14 (Layer-2 protocol): + x = 0: 0 + x = 9600: 7 + x = 19200: 8 + x = 38400: 9 + Reg 18.2 = 1 + Reg 19 (Additional Service Indicator): + x = 0: 0 + x = 9600: 197 + x = 19200: 199 + x = 38400: 198 + Note on value in Reg 19: + There is _NO_ common convention for 38400 baud. + The value 198 is chosen arbitrarily. Users + _MUST_ negotiate this value before establishing + a connection. + AT&Sx Set window-size (x = 1..8) (not yet implemented) + AT&V Show all settings. + AT&W0 Write registers and EAZ/MSN to profile. See also + iprofd (5.c in this README). + AT&X0 BTX-mode and T.70-mode off (default) + AT&X1 BTX-mode on. (S13.1=1, S13.5=0 S14=0, S16=7, S18=7, S19=0) + AT&X2 T.70-mode on. (S13.1=1, S13.5=1, S14=0, S16=7, S18=7, S19=0) + AT+Rx Resume a suspended call with CallID x (x = 1,2,3...) + AT+Sx Suspend a call with CallID x (x = 1,2,3...) + + For voice-mode commands refer to README.audio + + 1.3.2 Escape sequence: + During a connection, the emulation reacts just like + a normal modem to the escape sequence <DELAY>+++<DELAY>. + (The escape character - default '+' - can be set in the + register 2). + The DELAY must at least be 1.5 seconds long and delay + between the escape characters must not exceed 0.5 seconds. + + 1.3.3 Registers: + + Nr. Default Description + 0 0 Answer on ring number. + (no auto-answer if S0=0). + 1 0 Count of rings. + 2 43 Escape character. + (a value >= 128 disables the escape sequence). + 3 13 Carriage return character (ASCII). + 4 10 Line feed character (ASCII). + 5 8 Backspace character (ASCII). + 6 3 Delay in seconds before dialing. + 7 60 Wait for carrier. + 8 2 Pause time for comma (ignored) + 9 6 Carrier detect time (ignored) + 10 7 Carrier loss to disconnect time (ignored). + 11 70 Touch tone timing (ignored). + 12 69 Bit coded register: + Bit 0: 0 = Suppress response messages. + 1 = Show response messages. + Bit 1: 0 = English response messages. + 1 = Numeric response messages. + Bit 2: 0 = Echo off. + 1 = Echo on. + Bit 3 0 = DCD always on. + 1 = DCD follows carrier. + Bit 4 0 = CTS follows RTS + 1 = Ignore RTS, CTS always on. + Bit 5 0 = return to command mode on DTR low. + 1 = Same as 0 but also resets all + registers. + See also register 13, bit 2 + Bit 6 0 = DSR always on. + 1 = DSR only on if channel is available. + Bit 7 0 = Cisco-PPP-flag-hack off (default). + 1 = Cisco-PPP-flag-hack on. + 13 0 Bit coded register: + Bit 0: 0 = Use delayed tty-send-algorithm + 1 = Direct tty-send. + Bit 1: 0 = T.70 protocol (Only for BTX!) off + 1 = T.70 protocol (Only for BTX!) on + Bit 2: 0 = Don't hangup on DTR low. + 1 = Hangup on DTR low. + Bit 3: 0 = Standard response messages + 1 = Extended response messages + Bit 4: 0 = CALLER NUMBER before every RING. + 1 = CALLER NUMBER after first RING. + Bit 5: 0 = T.70 extended protocol off + 1 = T.70 extended protocol on + Bit 6: 0 = Special RUNG Message off + 1 = Special RUNG Message on + "RUNG" is delivered on a ttyI, if + an incoming call happened (RING) and + the remote party hung up before any + local ATA was given. + Bit 7: 0 = Don't show display messages from net + 1 = Show display messages from net + (S12 Bit 1 must be 0 too) + 14 0 Layer-2 protocol: + 0 = X75/LAPB with I-frames + 1 = X75/LAPB with UI-frames + 2 = X75/LAPB with BUI-frames + 3 = HDLC + 4 = Transparent (audio) + 7 = V.110, 9600 baud + 8 = V.110, 19200 baud + 9 = V.110, 38400 baud + 10 = Analog Modem (only if hardware supports this) + 11 = Fax G3 (only if hardware supports this) + 15 0 Layer-3 protocol: + 0 = transparent + 1 = transparent with audio features (e.g. DSP) + 2 = Fax G3 Class 2 commands (S14 has to be set to 11) + 3 = Fax G3 Class 1 commands (S14 has to be set to 11) + 16 250 Send-Packet-size/16 + 17 8 Window-size (not yet implemented) + 18 4 Bit coded register, Service-Octet-1 to accept, + or to be used on dialout: + Bit 0: Service 1 (audio) when set. + Bit 1: Service 5 (BTX) when set. + Bit 2: Service 7 (data) when set. + Note: It is possible to set more than one + bit. In this case, on incoming calls + the selected services are accepted, + and if the service is "audio", the + Layer-2-protocol is automatically + changed to 4 regardless of the setting + of register 14. On outgoing calls, + the most significant 1-bit is chosen to + select the outgoing service octet. + 19 0 Service-Octet-2 + 20 0 Bit coded register (readonly) + Service-Octet-1 of last call. + Bit mapping is the same as register 18 + 21 0 Bit coded register (readonly) + Set on incoming call (during RING) to + octet 3 of calling party number IE (Numbering plan) + See section 4.5.10 of ITU Q.931 + 22 0 Bit coded register (readonly) + Set on incoming call (during RING) to + octet 3a of calling party number IE (Screening info) + See section 4.5.10 of ITU Q.931 + 23 0 Bit coded register: + Bit 0: 0 = Add CPN to RING message off + 1 = Add CPN to RING message on + Bit 1: 0 = Add CPN to FCON message off + 1 = Add CPN to FCON message on + Bit 2: 0 = Add CDN to RING/FCON message off + 1 = Add CDN to RING/FCON message on + + Last but not least a (at the moment fairly primitive) device to request + the line-status (/dev/isdninfo) is made available. + + Automatic assignment of devices to lines: + + All inactive physical lines are listening to all EAZs for incoming + calls and are NOT assigned to a specific tty or network interface. + When an incoming call is detected, the driver looks first for a network + interface and then for an opened tty which: + + 1. is configured for the same EAZ. + 2. has the same protocol settings for the B-channel. + 3. (only for network interfaces if the security flag is set) + contains the caller number in its access list. + 4. Either the channel is not bound exclusively to another Net-interface, or + it is bound AND the other checks apply to exactly this interface. + (For usage of the bind-features, refer to the isdnctrl-man-page) + + Only when a matching interface or tty is found is the call accepted + and the "connection" between the low-level-layer and the link-level-layer + is established and kept until the end of the connection. + In all other cases no connection is established. Isdn4linux can be + configured to either do NOTHING in this case (which is useful, if + other, external devices with the same EAZ/MSN are connected to the bus) + or to reject the call actively. (isdnctrl busreject ...) + + For an outgoing call, the inactive physical lines are searched. + The call is placed on the first physical line, which supports the + requested protocols for the B-channel. If a net-interface, however + is pre-bound to a channel, this channel is used directly. + + This makes it possible to configure several network interfaces and ttys + for one EAZ, if the network interfaces are set to secure operation. + If an incoming call matches one network interface, it gets connected to it. + If another incoming call for the same EAZ arrives, which does not match + a network interface, the first tty gets a "RING" and so on. + +2 System prerequisites: + + ATTENTION! + + Always use the latest module utilities. The current version is + named in Documentation/Changes. Some old versions of insmod + are not capable of setting the driver-Ids correctly. + +3. Lowlevel-driver configuration. + + Configuration depends on how the drivers are built. See the + README.<yourDriver> for information on driver-specific setup. + +4. Device-inodes + + The major and minor numbers and their names are described in + Documentation/devices.txt. The major numbers are: + + 43 for the ISDN-tty's. + 44 for the ISDN-callout-tty's. + 45 for control/info/debug devices. + +5. Application + + a) For some card-types, firmware has to be loaded into the cards, before + proceeding with device-independent setup. See README.<yourDriver> + for how to do that. + + b) If you only intend to use ttys, you are nearly ready now. + + c) If you want to have really permanent "Modem"-settings on disk, you + can start the daemon iprofd. Give it a path to a file at the command- + line. It will store the profile-settings in this file every time + an AT&W0 is performed on any ISDN-tty. If the file already exists, + all profiles are initialized from this file. If you want to unload + any of the modules, kill iprofd first. + + d) For networking, continue: Create an interface: + isdnctrl addif isdn0 + + e) Set the EAZ (or MSN for Euro-ISDN): + isdnctrl eaz isdn0 2 + + (For 1TR6 a single digit is allowed, for Euro-ISDN the number is your + real MSN e.g.: Phone-Number) + + f) Set the number for outgoing calls on the interface: + isdnctrl addphone isdn0 out 1234567 + ... (this can be executed more than once, all assigned numbers are + tried in order) + and the number(s) for incoming calls: + isdnctrl addphone isdn0 in 1234567 + + g) Set the timeout for hang-up: + isdnctrl huptimeout isdn0 <timeout_in_seconds> + + h) additionally you may activate charge-hang-up (= Hang up before + next charge-info, this only works, if your isdn-provider transmits + the charge-info during and after the connection): + isdnctrl chargehup isdn0 on + + i) Set the dial mode of the interface: + isdnctrl dialmode isdn0 auto + "off" means that you (or the system) cannot make any connection + (neither incoming or outgoing connections are possible). Use + this if you want to be sure that no connections will be made. + "auto" means that the interface is in auto-dial mode, and will + attempt to make a connection whenever a network data packet needs + the interface's link. Note that this can cause unexpected dialouts, + and lead to a high phone bill! Some daemons or other pc's that use + this interface can cause this. + Incoming connections are also possible. + "manual" is a dial mode created to prevent the unexpected dialouts. + In this mode, the interface will never make any connections on its + own. You must explicitly initiate a connection with "isdnctrl dial + isdn0". However, after an idle time of no traffic as configured for + the huptimeout value with isdnctrl, the connection _will_ be ended. + If you don't want any automatic hangup, set the huptimeout value to 0. + "manual" is the default. + + j) Setup the interface with ifconfig as usual, and set a route to it. + + k) (optional) If you run X11 and have Tcl/Tk-wish version 4.0, you can use + the script tools/tcltk/isdnmon. You can add actions for line-status + changes. See the comments at the beginning of the script for how to + do that. There are other tty-based tools in the tools-subdirectory + contributed by Michael Knigge (imon), Volker Götz (imontty) and + Andreas Kool (isdnmon). + + l) For initial testing, you can set the verbose-level to 2 (default: 0). + Then all incoming calls are logged, even if they are not addressed + to one of the configured net-interfaces: + isdnctrl verbose 2 + + Now you are ready! A ping to the set address should now result in an + automatic dial-out (look at syslog kernel-messages). + The phone numbers and EAZs can be assigned at any time with isdnctrl. + You can add as many interfaces as you like with addif following the + directions above. Of course, there may be some limitations. But we have + tested as many as 20 interfaces without any problem. However, if you + don't give an interface name to addif, the kernel will assign a name + which starts with "eth". The number of "eth"-interfaces is limited by + the kernel. + +5. Additional options for isdnctrl: + + "isdnctrl secure <InterfaceName> on" + Only incoming calls, for which the caller-id is listed in the access + list of the interface are accepted. You can add caller-id's With the + command "isdnctrl addphone <InterfaceName> in <caller-id>" + Euro-ISDN does not transmit the leading '0' of the caller-id for an + incoming call, therefore you should configure it accordingly. + If the real number for the dialout e.g. is "09311234567" the number + to configure here is "9311234567". The pattern-match function + works similar to the shell mechanism. + + ? one arbitrary digit + * zero or arbitrary many digits + [123] one of the digits in the list + [1-5] one digit between '1' and '5' + a '^' as the first character in a list inverts the list + + + "isdnctrl secure <InterfaceName> off" + Switch off secure operation (default). + + "isdnctrl ihup <InterfaceName> [on|off]" + Switch the hang-up-timer for incoming calls on or off. + + "isdnctrl eaz <InterfaceName>" + Returns the EAZ of an interface. + + "isdnctrl delphone <InterfaceName> in|out <number>" + Deletes a number from one of the access-lists of the interface. + + "isdnctrl delif <InterfaceName>" + Removes the interface (and possible slaves) from the kernel. + (You have to unregister it with "ifconfig <InterfaceName> down" before). + + "isdnctrl callback <InterfaceName> [on|off]" + Switches an interface to callback-mode. In this mode, an incoming call + will be rejected and after this the remote-station will be called. If + you test this feature by using ping, some routers will re-dial very + quickly, so that the callback from isdn4linux may not be recognized. + In this case use ping with the option -i <sec> to increase the interval + between echo-packets. + + "isdnctrl cbdelay <InterfaceName> [seconds]" + Sets the delay (default 5 sec) between an incoming call and start of + dialing when callback is enabled. + + "isdnctrl cbhup <InterfaceName> [on|off]" + This enables (default) or disables an active hangup (reject) when getting an + incoming call for an interface which is configured for callback. + + "isdnctrl encap <InterfaceName> <EncapType>" + Selects the type of packet-encapsulation. The encapsulation can be changed + only while an interface is down. + + At the moment the following values are supported: + + rawip (Default) Selects raw-IP-encapsulation. This means, MAC-headers + are stripped off. + ip IP with type-field. Same as IP but the type-field of the MAC-header + is preserved. + x25iface X.25 interface encapsulation (first byte semantics as defined in + ../networking/x25-iface.txt). Use this for running the linux + X.25 network protocol stack (AF_X25 sockets) on top of isdn. + cisco-h A special-mode for communicating with a Cisco, which is configured + to do "hdlc" + ethernet No stripping. Packets are sent with full MAC-header. + The Ethernet-address of the interface is faked, from its + IP-address: fc:fc:i1:i2:i3:i4, where i1-4 are the IP-addr.-values. + syncppp Synchronous PPP + + uihdlc HDLC with UI-frame-header (for use with DOS ISPA, option -h1) + + + NOTE: x25iface encapsulation is currently experimental. Please + read README.x25 for further details + + + Watching packets, using standard-tcpdump will fail for all encapsulations + except ethernet because tcpdump does not know how to handle packets + without MAC-header. A patch for tcpdump is included in the utility-package + mentioned above. + + "isdnctrl l2_prot <InterfaceName> <L2-ProtocolName>" + Selects a layer-2-protocol. + (With the ICN-driver and the HiSax-driver, "x75i" and "hdlc" is available. + With other drivers, "x75ui", "x75bui", "x25dte", "x25dce" may be + possible too. See README.x25 for x25 related l2 protocols.) + + isdnctrl l3_prot <InterfaceName> <L3-ProtocolName> + The same for layer-3. (At the moment only "trans" is allowed) + + "isdnctrl list <InterfaceName>" + Shows all parameters of an interface and the charge-info. + Try "all" as the interface name. + + "isdnctrl hangup <InterfaceName>" + Forces hangup of an interface. + + "isdnctrl bind <InterfaceName> <DriverId>,<ChannelNumber> [exclusive]" + If you are using more than one ISDN card, it is sometimes necessary to + dial out using a specific card or even preserve a specific channel for + dialout of a specific net-interface. This can be done with the above + command. Replace <DriverId> by whatever you assigned while loading the + module. The <ChannelNumber> is counted from zero. The upper limit + depends on the card used. At the moment no card supports more than + 2 channels, so the upper limit is one. + + "isdnctrl unbind <InterfaceName>" + unbinds a previously bound interface. + + "isdnctrl busreject <DriverId> on|off" + If switched on, isdn4linux replies a REJECT to incoming calls, it + cannot match to any configured interface. + If switched off, nothing happens in this case. + You normally should NOT enable this feature, if the ISDN adapter is not + the only device connected to the S0-bus. Otherwise it could happen that + isdn4linux rejects an incoming call, which belongs to another device on + the bus. + + "isdnctrl addslave <InterfaceName> <SlaveName> + Creates a slave interface for channel-bundling. Slave interfaces are + not seen by the kernel, but their ISDN-part can be configured with + isdnctrl as usual. (Phone numbers, EAZ/MSN, timeouts etc.) If more + than two channels are to be bundled, feel free to create as many as you + want. InterfaceName must be a real interface, NOT a slave. Slave interfaces + start dialing, if the master interface resp. the previous slave interface + has a load of more than 7000 cps. They hangup if the load goes under 7000 + cps, according to their "huptimeout"-parameter. + + "isdnctrl sdelay <InterfaceName> secs." + This sets the minimum time an Interface has to be fully loaded, until + it sends a dial-request to its slave. + + "isdnctrl dial <InterfaceName>" + Forces an interface to start dialing even if no packets are to be + transferred. + + "isdnctrl mapping <DriverId> MSN0,MSN1,MSN2,...MSN9" + This installs a mapping table for EAZ<->MSN-mapping for a single line. + Missing MSN's have to be given as "-" or can be omitted, if at the end + of the commandline. + With this command, it's now possible to have an interface listening to + mixed 1TR6- and Euro-Type lines. In this case, the interface has to be + configured to a 1TR6-type EAZ (one digit). The mapping is also valid + for tty-emulation. Seen from the interface/tty-level the mapping + CAN be used, however it's possible to use single tty's/interfaces with + real MSN's (more digits) also, in which case the mapping will be ignored. + Here is an example: + + You have a 1TR6-type line with base-nr. 1234567 and a Euro-line with + MSN's 987654, 987655 and 987656. The DriverId for the Euro-line is "EURO". + + isdnctrl mapping EURO -,987654,987655,987656,-,987655 + ... + isdnctrl eaz isdn0 1 # listen on 12345671(1tr6) and 987654(euro) + ... + isdnctrl eaz isdn1 4 # listen on 12345674(1tr6) only. + ... + isdnctrl eaz isdn2 987654 # listen on 987654(euro) only. + + Same scheme is used with AT&E... at the tty's. + +6. If you want to write a new low-level-driver, you are welcome. + The interface to the link-level-module is described in the file INTERFACE. + If the interface should be expanded for any reason, don't do it + on your own, send me a mail containing the proposed changes and + some reasoning about them. + If other drivers will not be affected, I will include the changes + in the next release. + For developers only, there is a second mailing-list. Write to me + (fritz@isdn4linux.de), if you want to join that list. + +Have fun! + + -Fritz + diff --git a/Documentation/isdn/README.FAQ b/Documentation/isdn/README.FAQ new file mode 100644 index 000000000000..356f7944641d --- /dev/null +++ b/Documentation/isdn/README.FAQ @@ -0,0 +1,26 @@ + +The FAQ for isdn4linux +====================== + +Please note that there is a big FAQ available in the isdn4k-utils. +You find it in: + isdn4k-utils/FAQ/i4lfaq.sgml + +In case you just want to see the FAQ online, or download the newest version, +you can have a look at my website: +http://www.mhessler.de/i4lfaq/ (view + download) +or: +http://www.isdn4linux.de/faq/ (view) + +As the extension tells, the FAQ is in SGML format, and you can convert it +into text/html/... format by using the sgml2txt/sgml2html/... tools. +Alternatively, you can also do a 'configure; make all' in the FAQ directory. + + +Please have a look at the FAQ before posting anything in the Mailinglist, +or the newsgroup! + + +Matthias Hessler +hessler@isdn4linux.de + diff --git a/Documentation/isdn/README.HiSax b/Documentation/isdn/README.HiSax new file mode 100644 index 000000000000..031c8d814337 --- /dev/null +++ b/Documentation/isdn/README.HiSax @@ -0,0 +1,659 @@ +HiSax is a Linux hardware-level driver for passive ISDN cards with Siemens +chipset (ISAC_S 2085/2086/2186, HSCX SAB 82525). It is based on the Teles +driver from Jan den Ouden. +It is meant to be used with isdn4linux, an ISDN link-level module for Linux +written by Fritz Elfert. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + + +Supported cards +--------------- + +Teles 8.0/16.0/16.3 and compatible ones +Teles 16.3c +Teles S0/PCMCIA +Teles PCI +Teles S0Box +Creatix S0Box +Creatix PnP S0 +Compaq ISDN S0 ISA card +AVM A1 (Fritz, Teledat 150) +AVM Fritz PCMCIA +AVM Fritz PnP +AVM Fritz PCI +ELSA Microlink PCC-16, PCF, PCF-Pro, PCC-8 +ELSA Quickstep 1000 +ELSA Quickstep 1000PCI +ELSA Quickstep 3000 (same settings as QS1000) +ELSA Quickstep 3000PCI +ELSA PCMCIA +ITK ix1-micro Rev.2 +Eicon Diva 2.0 ISA and PCI (S0 and U interface, no PRO version) +Eicon Diva 2.01 ISA and PCI +Eicon Diva 2.02 PCI +Eicon Diva Piccola +ASUSCOM NETWORK INC. ISDNLink 128K PC adapter (order code I-IN100-ST-D) +Dynalink IS64PH (OEM version of ASUSCOM NETWORK INC. ISDNLink 128K adapter) +PCBIT-DP (OEM version of ASUSCOM NETWORK INC. ISDNLink) +HFC-2BS0 based cards (TeleInt SA1) +Sedlbauer Speed Card (Speed Win, Teledat 100, PCI, Fax+) +Sedlbauer Speed Star/Speed Star2 (PCMCIA) +Sedlbauer ISDN-Controller PC/104 +USR Sportster internal TA (compatible Stollmann tina-pp V3) +USR internal TA PCI +ith Kommunikationstechnik GmbH MIC 16 ISA card +Traverse Technologie NETjet PCI S0 card and NETspider U card +Ovislink ISDN sc100-p card (NETjet driver) +Dr. Neuhaus Niccy PnP/PCI +Siemens I-Surf 1.0 +Siemens I-Surf 2.0 (with IPAC, try type 12 asuscom) +ACER P10 +HST Saphir +Berkom Telekom A4T +Scitel Quadro +Gazel ISDN cards +HFC-PCI based cards +Winbond W6692 based cards +HFC-S+, HFC-SP/PCMCIA cards +formula-n enternow +Gerdes Power ISDN + +Note: PCF, PCF-Pro: up to now, only the ISDN part is supported + PCC-8: not tested yet + Eicon.Diehl Diva U interface not tested + +If you know other passive cards with the Siemens chipset, please let me know. +You can combine any card, if there is no conflict between the resources +(io, mem, irq). + + +Configuring the driver +---------------------- + +The HiSax driver can either be built directly into the kernel or as a module. +It can be configured using the command line feature while loading the kernel +with LILO or LOADLIN or, if built as a module, using insmod/modprobe with +parameters. +There is also some config needed before you compile the kernel and/or +modules. It is included in the normal "make [menu]config" target at the +kernel. Don't forget it, especially to select the right D-channel protocol. + +Please note: In older versions of the HiSax driver, all PnP cards +needed to be configured with isapnp and worked only with the HiSax +driver used as a module. + +In the current version, HiSax will automatically use the in-kernel +ISAPnP support, provided you selected it during kernel configuration +(CONFIG_ISAPNP), if you don't give the io=, irq= command line parameters. + +The affected card types are: 4,7,12,14,19,27-30 + +a) when built as a module +------------------------- + +insmod/modprobe hisax.o \ + io=iobase irq=IRQ mem=membase type=card_type \ + protocol=D_channel_protocol id=idstring + +or, if several cards are installed: + +insmod/modprobe hisax.o \ + io=iobase1,iobase2,... irq=IRQ1,IRQ2,... mem=membase1,membase2,... \ + type=card_type1,card_type2,... \ + protocol=D_channel_protocol1,D_channel_protocol2,... \ + id=idstring1%idstring2 ... + +where "iobaseN" represents the I/O base address of the Nth card, "membaseN" +the memory base address of the Nth card, etc. + +The reason for the delimiter "%" being used in the idstrings is that "," +won't work with the current modules package. + +The parameters may be specified in any order. For example, the "io" +parameter may precede the "irq" parameter, or vice versa. If several +cards are installed, the ordering within the comma separated parameter +lists must of course be consistent. + +Only parameters applicable to the card type need to be specified. For +example, the Teles 16.3 card is not memory-mapped, so the "mem" +parameter may be omitted for this card. Sometimes it may be necessary +to specify a dummy parameter, however. This is the case when there is +a card of a different type later in the list that needs a parameter +which the preceding card does not. For instance, if a Teles 16.0 card +is listed after a Teles 16.3 card, a dummy memory base parameter of 0 +must be specified for the 16.3. Instead of a dummy value, the parameter +can also be skipped by simply omitting the value. For example: +mem=,0xd0000. See example 6 below. + +The parameter for the D-Channel protocol may be omitted if you selected the +correct one during kernel config. Valid values are "1" for German 1TR6, +"2" for EDSS1 (Euro ISDN), "3" for leased lines (no D-Channel) and "4" +for US NI1. +With US NI1 you have to include your SPID into the MSN setting in the form +<MSN>:<SPID> for example (your phonenumber is 1234 your SPID 5678): +AT&E1234:5678 on ttyI interfaces +isdnctrl eaz ippp0 1234:5678 on network devices + +The Creatix/Teles PnP cards use io1= and io2= instead of io= for specifying +the I/O addresses of the ISAC and HSCX chips, respectively. + +Card types: + + Type Required parameters (in addition to type and protocol) + + 1 Teles 16.0 irq, mem, io + 2 Teles 8.0 irq, mem + 3 Teles 16.3 (non PnP) irq, io + 4 Creatix/Teles PnP irq, io0 (ISAC), io1 (HSCX) + 5 AVM A1 (Fritz) irq, io + 6 ELSA PCC/PCF cards io or nothing for autodetect (the iobase is + required only if you have more than one ELSA + card in your PC) + 7 ELSA Quickstep 1000 irq, io (from isapnp setup) + 8 Teles 16.3 PCMCIA irq, io + 9 ITK ix1-micro Rev.2 irq, io + 10 ELSA PCMCIA irq, io (set with card manager) + 11 Eicon.Diehl Diva ISA PnP irq, io + 11 Eicon.Diehl Diva PCI no parameter + 12 ASUS COM ISDNLink irq, io (from isapnp setup) + 13 HFC-2BS0 based cards irq, io + 14 Teles 16.3c PnP irq, io + 15 Sedlbauer Speed Card irq, io + 15 Sedlbauer PC/104 irq, io + 15 Sedlbauer Speed PCI no parameter + 16 USR Sportster internal irq, io + 17 MIC card irq, io + 18 ELSA Quickstep 1000PCI no parameter + 19 Compaq ISDN S0 ISA card irq, io0, io1, io (from isapnp setup io=IO2) + 20 NETjet PCI card no parameter + 21 Teles PCI no parameter + 22 Sedlbauer Speed Star (PCMCIA) irq, io (set with card manager) + 24 Dr. Neuhaus Niccy PnP irq, io0, io1 (from isapnp setup) + 24 Dr. Neuhaus Niccy PCI no parameter + 25 Teles S0Box irq, io (of the used lpt port) + 26 AVM A1 PCMCIA (Fritz!) irq, io (set with card manager) + 27 AVM PnP (Fritz!PnP) irq, io (from isapnp setup) + 27 AVM PCI (Fritz!PCI) no parameter + 28 Sedlbauer Speed Fax+ irq, io (from isapnp setup) + 29 Siemens I-Surf 1.0 irq, io, memory (from isapnp setup) + 30 ACER P10 irq, io (from isapnp setup) + 31 HST Saphir irq, io + 32 Telekom A4T none + 33 Scitel Quadro subcontroller (4*S0, subctrl 1...4) + 34 Gazel ISDN cards (ISA) irq,io + 34 Gazel ISDN cards (PCI) none + 35 HFC 2BDS0 PCI none + 36 W6692 based PCI cards none + 37 HFC 2BDS0 S+, SP irq,io + 38 NETspider U PCI card none + 39 HFC 2BDS0 SP/PCMCIA irq,io (set with cardmgr) + 40 hotplug interface + 41 Formula-n enter:now PCI none + +At the moment IRQ sharing is only possible with PCI cards. Please make sure +that your IRQ is free and enabled for ISA use. + + +Examples for module loading + +1. Teles 16.3, Euro ISDN, I/O base 280 hex, IRQ 10 + modprobe hisax type=3 protocol=2 io=0x280 irq=10 + +2. Teles 16.0, 1TR6 ISDN, I/O base d80 hex, IRQ 5, Memory d0000 hex + modprobe hisax protocol=1 type=1 io=0xd80 mem=0xd0000 irq=5 + +3. Fritzcard, Euro ISDN, I/O base 340 hex, IRQ 10 and ELSA PCF, Euro ISDN + modprobe hisax type=5,6 protocol=2,2 io=0x340 irq=10 id=Fritz%Elsa + +4. Any ELSA PCC/PCF card, Euro ISDN + modprobe hisax type=6 protocol=2 + +5. Teles 16.3 PnP, Euro ISDN, with isapnp configured + isapnp config: (INT 0 (IRQ 10 (MODE +E))) + (IO 0 (BASE 0x0580)) + (IO 1 (BASE 0x0180)) + modprobe hisax type=4 protocol=2 irq=10 io0=0x580 io1=0x180 + + In the current version of HiSax, you can instead simply use + + modprobe hisax type=4 protocol=2 + + if you configured your kernel for ISAPnP. Don't run isapnp in + this case! + +6. Teles 16.3, Euro ISDN, I/O base 280 hex, IRQ 12 and + Teles 16.0, 1TR6, IRQ 5, Memory d0000 hex + modprobe hisax type=3,1 protocol=2,1 io=0x280 mem=0,0xd0000 + + Please note the dummy 0 memory address for the Teles 16.3, used as a + placeholder as described above, in the last example. + +7. Teles PCMCIA, Euro ISDN, I/O base 180 hex, IRQ 15 (default values) + modprobe hisax type=8 protocol=2 io=0x180 irq=15 + + +b) using LILO/LOADLIN, with the driver compiled directly into the kernel +------------------------------------------------------------------------ + +hisax=typ1,dp1,pa_1,pb_1,pc_1[,typ2,dp2,pa_2 ... \ + typn,dpn,pa_n,pb_n,pc_n][,idstring1[,idstring2,...,idstringn]] + +where + typ1 = type of 1st card (default depends on kernel settings) + dp1 = D-Channel protocol of 1st card. 1=1TR6, 2=EDSS1, 3=leased + pa_1 = 1st parameter (depending on the type of the card) + pb_1 = 2nd parameter ( " " " " " " " ) + pc_1 = 3rd parameter ( " " " " " " " ) + + typ2,dp2,pa_2,pb_2,pc_2 = Parameters of the second card (defaults: none) + typn,dpn,pa_n,pb_n,pc_n = Parameters of the n'th card (up to 16 cards are + supported) + + idstring = Driver ID for accessing the particular card with utility + programs and for identification when using a line monitor + (default: "HiSax") + + Note: the ID string must start with an alphabetical character! + +Card types: + +type + 1 Teles 16.0 pa=irq pb=membase pc=iobase + 2 Teles 8.0 pa=irq pb=membase + 3 Teles 16.3 pa=irq pb=iobase + 4 Creatix/Teles PNP ONLY WORKS AS A MODULE ! + 5 AVM A1 (Fritz) pa=irq pb=iobase + 6 ELSA PCC/PCF cards pa=iobase or nothing for autodetect + 7 ELSA Quickstep 1000 ONLY WORKS AS A MODULE ! + 8 Teles S0 PCMCIA pa=irq pb=iobase + 9 ITK ix1-micro Rev.2 pa=irq pb=iobase + 10 ELSA PCMCIA pa=irq, pb=io (set with card manager) + 11 Eicon.Diehl Diva ISAPnP ONLY WORKS AS A MODULE ! + 11 Eicon.Diehl Diva PCI no parameter + 12 ASUS COM ISDNLink ONLY WORKS AS A MODULE ! + 13 HFC-2BS0 based cards pa=irq pb=io + 14 Teles 16.3c PnP ONLY WORKS AS A MODULE ! + 15 Sedlbauer Speed Card pa=irq pb=io (Speed Win only as module !) + 15 Sedlbauer PC/104 pa=irq pb=io + 15 Sedlbauer Speed PCI no parameter + 16 USR Sportster internal pa=irq pb=io + 17 MIC card pa=irq pb=io + 18 ELSA Quickstep 1000PCI no parameter + 19 Compaq ISDN S0 ISA card ONLY WORKS AS A MODULE ! + 20 NETjet PCI card no parameter + 21 Teles PCI no parameter + 22 Sedlbauer Speed Star (PCMCIA) pa=irq, pb=io (set with card manager) + 24 Dr. Neuhaus Niccy PnP ONLY WORKS AS A MODULE ! + 24 Dr. Neuhaus Niccy PCI no parameter + 25 Teles S0Box pa=irq, pb=io (of the used lpt port) + 26 AVM A1 PCMCIA (Fritz!) pa=irq, pb=io (set with card manager) + 27 AVM PnP (Fritz!PnP) ONLY WORKS AS A MODULE ! + 27 AVM PCI (Fritz!PCI) no parameter + 28 Sedlbauer Speed Fax+ ONLY WORKS AS A MODULE ! + 29 Siemens I-Surf 1.0 ONLY WORKS AS A MODULE ! + 30 ACER P10 ONLY WORKS AS A MODULE ! + 31 HST Saphir pa=irq, pb=io + 32 Telekom A4T no parameter + 33 Scitel Quadro subcontroller (4*S0, subctrl 1...4) + 34 Gazel ISDN cards (ISA) pa=irq, pb=io + 34 Gazel ISDN cards (PCI) no parameter + 35 HFC 2BDS0 PCI no parameter + 36 W6692 based PCI cards none + 37 HFC 2BDS0 S+,SP/PCMCIA ONLY WORKS AS A MODULE ! + 38 NETspider U PCI card none + 39 HFC 2BDS0 SP/PCMCIA ONLY WORKS AS A MODULE ! + 40 hotplug interface ONLY WORKS AS A MODULE ! + 41 Formula-n enter:now PCI none + +Running the driver +------------------ + +When you insmod isdn.o and hisax.o (or with the in-kernel version, during +boot time), a few lines should appear in your syslog. Look for something like: + +Apr 13 21:01:59 kke01 kernel: HiSax: Driver for Siemens chip set ISDN cards +Apr 13 21:01:59 kke01 kernel: HiSax: Version 2.9 +Apr 13 21:01:59 kke01 kernel: HiSax: Revisions 1.14/1.9/1.10/1.25/1.8 +Apr 13 21:01:59 kke01 kernel: HiSax: Total 1 card defined +Apr 13 21:01:59 kke01 kernel: HiSax: Card 1 Protocol EDSS1 Id=HiSax1 (0) +Apr 13 21:01:59 kke01 kernel: HiSax: Elsa driver Rev. 1.13 +... +Apr 13 21:01:59 kke01 kernel: Elsa: PCF-Pro found at 0x360 Rev.:C IRQ 10 +Apr 13 21:01:59 kke01 kernel: Elsa: timer OK; resetting card +Apr 13 21:01:59 kke01 kernel: Elsa: HSCX version A: V2.1 B: V2.1 +Apr 13 21:01:59 kke01 kernel: Elsa: ISAC 2086/2186 V1.1 +... +Apr 13 21:01:59 kke01 kernel: HiSax: DSS1 Rev. 1.14 +Apr 13 21:01:59 kke01 kernel: HiSax: 2 channels added + +This means that the card is ready for use. +Cabling problems or line-downs are not detected, and only some ELSA cards can +detect the S0 power. + +Remember that, according to the new strategy for accessing low-level drivers +from within isdn4linux, you should also define a driver ID while doing +insmod: Simply append hisax_id=<SomeString> to the insmod command line. This +string MUST NOT start with a digit or a small 'x'! + +At this point you can run a 'cat /dev/isdnctrl0' and view debugging messages. + +At the moment, debugging messages are enabled with the hisaxctrl tool: + + hisaxctrl <DriverId> DebugCmd <debugging_flags> + +<DriverId> default is HiSax, if you didn't specify one. + +DebugCmd is 1 for generic debugging + 11 for layer 1 development debugging + 13 for layer 3 development debugging + +where <debugging_flags> is the integer sum of the following debugging +options you wish enabled: + +With DebugCmd set to 1: + + 0x0001 Link-level <--> hardware-level communication + 0x0002 Top state machine + 0x0004 D-Channel Frames for isdnlog + 0x0008 D-Channel Q.921 + 0x0010 B-Channel X.75 + 0x0020 D-Channel l2 + 0x0040 B-Channel l2 + 0x0080 D-Channel link state debugging + 0x0100 B-Channel link state debugging + 0x0200 TEI debug + 0x0400 LOCK debug in callc.c + 0x0800 More paranoid debug in callc.c (not for normal use) + 0x1000 D-Channel l1 state debugging + 0x2000 B-Channel l1 state debugging + +With DebugCmd set to 11: + + 0x0001 Warnings (default: on) + 0x0002 IRQ status + 0x0004 ISAC + 0x0008 ISAC FIFO + 0x0010 HSCX + 0x0020 HSCX FIFO (attention: full B-Channel output!) + 0x0040 D-Channel LAPD frame types + 0x0080 IPAC debug + 0x0100 HFC receive debug + 0x0200 ISAC monitor debug + 0x0400 D-Channel frames for isdnlog (set with 1 0x4 too) + 0x0800 D-Channel message verbose + +With DebugCmd set to 13: + + 1 Warnings (default: on) + 2 l3 protocol descriptor errors + 4 l3 state machine + 8 charge info debugging (1TR6) + +For example, 'hisaxctrl HiSax 1 0x3ff' enables full generic debugging. + +Because of some obscure problems with some switch equipment, the delay +between the CONNECT message and sending the first data on the B-channel is now +configurable with + +hisaxctrl <DriverId> 2 <delay> +<delay> in ms Value between 50 and 800 ms is recommended. + +Downloading Firmware +-------------------- +At the moment, the Sedlbauer speed fax+ is the only card, which +needs to download firmware. +The firmware is downloaded with the hisaxctrl tool: + + hisaxctrl <DriverId> 9 <firmware_filename> + +<DriverId> default is HiSax, if you didn't specify one, + +where <firmware_filename> is the filename of the firmware file. + +For example, 'hisaxctrl HiSax 9 ISAR.BIN' downloads the firmware for +ISAR based cards (like the Sedlbauer speed fax+). + +Warning +------- +HiSax is a work in progress and may crash your machine. +For certification look at HiSax.cert file. + +Limitations +----------- +At this time, HiSax only works on Euro ISDN lines and German 1TR6 lines. +For leased lines see appendix. + +Bugs +---- +If you find any, please let me know. + + +Thanks +------ +Special thanks to: + + Emil Stephan for the name HiSax which is a mix of HSCX and ISAC. + + Fritz Elfert, Jan den Ouden, Michael Hipp, Michael Wein, + Andreas Kool, Pekka Sarnila, Sim Yskes, Johan Myrre'en, + Klaus-Peter Nischke (ITK AG), Christof Petig, Werner Fehn (ELSA GmbH), + Volker Schmidt + Edgar Toernig and Marcus Niemann for the Sedlbauer driver + Stephan von Krawczynski + Juergen Quade for the Leased Line part + Klaus Lichtenwalder (Klaus.Lichtenwalder@WebForum.DE), for ELSA PCMCIA support + Enrik Berkhan (enrik@starfleet.inka.de) for S0BOX specific stuff + Ton van Rosmalen for Teles PCI + Petr Novak <petr.novak@i.cz> for Winbond W6692 support + Werner Cornelius <werner@isdn4linux.de> for HFC-PCI, HFC-S(+/P) and supplementary services support + and more people who are hunting bugs. (If I forgot somebody, please + send me a mail). + + Firma ELSA GmbH + Firma Eicon.Diehl GmbH + Firma Dynalink NL + Firma ASUSCOM NETWORK INC. Taiwan + Firma S.u.S.E + Firma ith Kommunikationstechnik GmbH + Firma Traverse Technologie Australia + Firma Medusa GmbH (www.medusa.de). + Firma Quant-X Austria for sponsoring a DEC Alpha board+CPU + Firma Cologne Chip Designs GmbH + + My girl friend and partner in life Ute for her patience with me. + + +Enjoy, + +Karsten Keil +keil@isdn4linux.de + + +Appendix: Teles PCMCIA driver +----------------------------- + +See + http://www.stud.uni-wuppertal.de/~ea0141/pcmcia.html +for instructions. + +Appendix: Linux and ISDN-leased lines +------------------------------------- + +Original from Juergen Quade, new version KKe. + +Attention NEW VERSION, the old leased line syntax won't work !!! + +You can use HiSax to connect your Linux-Box via an ISDN leased line +to e.g. the Internet: + +1. Build a kernel which includes the HiSax driver either as a module + or as part of the kernel. + cd /usr/src/linux + make menuconfig + <ISDN subsystem - ISDN support -- HiSax> + make clean; make zImage; make modules; make modules_install +2. Install the new kernel + cp /usr/src/linux/arch/i386/boot/zImage /etc/kernel/linux.isdn + vi /etc/lilo.conf + <add new kernel in the bootable image section> + lilo +3. in case the hisax driver is a "fixed" part of the kernel, configure + the driver with lilo: + vi /etc/lilo.conf + <add HiSax driver parameter in the global section (see below)> + lilo + Your lilo.conf _might_ look like the following: + + # LILO configuration-file + # global section + # teles 16.0 on IRQ=5, MEM=0xd8000, PORT=0xd80 + append="hisax=1,3,5,0xd8000,0xd80,HiSax" + # teles 16.3 (non pnp) on IRQ=15, PORT=0xd80 + # append="hisax=3,3,5,0xd8000,0xd80,HiSax" + boot=/dev/sda + compact # faster, but won't work on all systems. + linear + read-only + prompt + timeout=100 + vga = normal # force sane state + # Linux bootable partition config begins + image = /etc/kernel/linux.isdn + root = /dev/sda1 + label = linux.isdn + # + image = /etc/kernel/linux-2.0.30 + root = /dev/sda1 + label = linux.secure + + In the line starting with "append" you have to adapt the parameters + according to your card (see above in this file) + +3. boot the new linux.isdn kernel +4. start the ISDN subsystem: + a) load - if necessary - the modules (depends, whether you compiled + the ISDN driver as module or not) + According to the type of card you have to specify the necessary + driver parameter (irq, io, mem, type, protocol). + For the leased line the protocol is "3". See the table above for + the parameters, which you have to specify depending on your card. + b) configure i4l + /sbin/isdnctrl addif isdn0 + # EAZ 1 -- B1 channel 2 --B2 channel + /sbin/isdnctrl eaz isdn0 1 + /sbin/isdnctrl secure isdn0 on + /sbin/isdnctrl huptimeout isdn0 0 + /sbin/isdnctrl l2_prot isdn0 hdlc + # Attention you must not set an outgoing number !!! This won't work !!! + # The incoming number is LEASED0 for the first card, LEASED1 for the + # second and so on. + /sbin/isdnctrl addphone isdn0 in LEASED0 + # Here is no need to bind the channel. + c) in case the remote partner is a CISCO: + /sbin/isdnctrl encap isdn0 cisco-h + d) configure the interface + /sbin/ifconfig isdn0 ${LOCAL_IP} pointopoint ${REMOTE_IP} + e) set the routes + /sbin/route add -host ${REMOTE_IP} isdn0 + /sbin/route add default gw ${REMOTE_IP} + f) switch the card into leased mode for each used B-channel + /sbin/hisaxctrl HiSax 5 1 + +Remarks: +a) Use state of the art isdn4k-utils + +Here an example script: +#!/bin/sh +# Start/Stop ISDN leased line connection + +I4L_AS_MODULE=yes +I4L_REMOTE_IS_CISCO=no +I4L_MODULE_PARAMS="type=16 io=0x268 irq=7 " +I4L_DEBUG=no +I4L_LEASED_128K=yes +LOCAL_IP=192.168.1.1 +REMOTE_IP=192.168.2.1 + +case "$1" in + start) + echo "Starting ISDN ..." + if [ ${I4L_AS_MODULE} = "yes" ]; then + echo "loading modules..." + /sbin/modprobe hisax ${I4L_MODULE_PARAMS} + fi + # configure interface + /sbin/isdnctrl addif isdn0 + /sbin/isdnctrl secure isdn0 on + if [ ${I4L_DEBUG} = "yes" ]; then + /sbin/isdnctrl verbose 7 + /sbin/hisaxctrl HiSax 1 0xffff + /sbin/hisaxctrl HiSax 11 0xff + cat /dev/isdnctrl >/tmp/lea.log & + fi + if [ ${I4L_REMOTE_IS_CISCO} = "yes" ]; then + /sbin/isdnctrl encap isdn0 cisco-h + fi + /sbin/isdnctrl huptimeout isdn0 0 + # B-CHANNEL 1 + /sbin/isdnctrl eaz isdn0 1 + /sbin/isdnctrl l2_prot isdn0 hdlc + # 1. card + /sbin/isdnctrl addphone isdn0 in LEASED0 + if [ ${I4L_LEASED_128K} = "yes" ]; then + /sbin/isdnctrl addslave isdn0 isdn0s + /sbin/isdnctrl secure isdn0s on + /sbin/isdnctrl huptimeout isdn0s 0 + # B-CHANNEL 2 + /sbin/isdnctrl eaz isdn0s 2 + /sbin/isdnctrl l2_prot isdn0s hdlc + # 1. card + /sbin/isdnctrl addphone isdn0s in LEASED0 + if [ ${I4L_REMOTE_IS_CISCO} = "yes" ]; then + /sbin/isdnctrl encap isdn0s cisco-h + fi + fi + /sbin/isdnctrl dialmode isdn0 manual + # configure tcp/ip + /sbin/ifconfig isdn0 ${LOCAL_IP} pointopoint ${REMOTE_IP} + /sbin/route add -host ${REMOTE_IP} isdn0 + /sbin/route add default gw ${REMOTE_IP} + # switch to leased mode + # B-CHANNEL 1 + /sbin/hisaxctrl HiSax 5 1 + if [ ${I4L_LEASED_128K} = "yes" ]; then + # B-CHANNEL 2 + sleep 10; /* Wait for master */ + /sbin/hisaxctrl HiSax 5 2 + fi + ;; + stop) + /sbin/ifconfig isdn0 down + /sbin/isdnctrl delif isdn0 + if [ ${I4L_DEBUG} = "yes" ]; then + killall cat + fi + if [ ${I4L_AS_MODULE} = "yes" ]; then + /sbin/rmmod hisax + /sbin/rmmod isdn + /sbin/rmmod ppp + /sbin/rmmod slhc + fi + ;; + *) + echo "Usage: $0 {start|stop}" + exit 1 +esac +exit 0 diff --git a/Documentation/isdn/README.act2000 b/Documentation/isdn/README.act2000 new file mode 100644 index 000000000000..ce7115e7f4ce --- /dev/null +++ b/Documentation/isdn/README.act2000 @@ -0,0 +1,104 @@ +$Id: README.act2000,v 1.3 2000/08/06 09:22:51 armin Exp $ + +This document describes the ACT2000 driver for the +IBM Active 2000 ISDN card. + +There are 3 Types of this card available. A ISA-, MCA-, and PCMCIA-Bus +Version. Currently, only the ISA-Bus version of the card is supported. +However MCA and PCMCIA will follow soon. + +The ISA-Bus Version uses 8 IO-ports. The base port address has to be set +manually using the DIP switches. + +Setting up the DIP switches for the IBM Active 2000 ISDN card: + + Note: S5 and S6 always set off! + + S1 S2 S3 S4 Base-port + on on on on 0x0200 (Factory default) + off on on on 0x0240 + on off on on 0x0280 + off off on on 0x02c0 + on on off on 0x0300 + off on off on 0x0340 + on off off on 0x0380 + on on on off 0xcfe0 + off on on off 0xcfa0 + on off on off 0xcf60 + off off on off 0xcf20 + on on off off 0xcee0 + off on off off 0xcea0 + on off off off 0xce60 + off off off off Card disabled + +IRQ is configured by software. Possible values are: + + 3, 5, 7, 10, 11, 12, 15 and none (polled mode) + + +The ACT2000 driver may either be built into the kernel or as a module. +Initialization depends on how the driver is built: + +Driver built into the kernel: + + The ACT2000 driver can be configured using the commandline-feature while + loading the kernel with LILO or LOADLIN. It accepts the following syntax: + + act2000=b,p,i[,idstring] + + where + + b = Bus-Type (1=ISA, 2=MCA, 3=PCMCIA) + p = portbase (-1 means autoprobe) + i = Interrupt (-1 means use next free IRQ, 0 means polled mode) + + The idstring is an arbitrary string used for referencing the card + by the actctrl tool later. + + Defaults used, when no parameters given at all: + + 1,-1,-1,"" + + which means: Autoprobe for an ISA card, use next free IRQ, let the + ISDN linklevel fill the IdString (usually "line0" for the first card). + + If you like to use more than one card, you can use the program + "actctrl" from the utility-package to configure additional cards. + + Using the "actctrl"-utility, portbase and irq can also be changed + during runtime. The D-channel protocol is configured by the "dproto" + option of the "actctrl"-utility after loading the firmware into the + card's memory using the "actctrl"-utility. + +Driver built as module: + + The module act2000.o can be configured during modprobe (insmod) by + appending its parameters to the modprobe resp. insmod commandline. + The following syntax is accepted: + + act_bus=b act_port=p act_irq=i act_id=idstring + + where b, p, i and idstring have the same meanings as the parameters + described for the builtin version above. + + Using the "actctrl"-utility, the same features apply to the modularized + version as to the kernel-builtin one. (i.e. loading of firmware and + configuring the D-channel protocol) + +Loading the firmware into the card: + + The firmware is supplied together with the isdn4k-utils package. It + can be found in the subdirectory act2000/firmware/ + + Assuming you have installed the utility-package correctly, the firmware + will be downloaded into the card using the following command: + + actctrl -d idstring load /etc/isdn/bip11.btl + + where idstring is the Name of the card, given during insmod-time or + (for kernel-builtin driver) on the kernel commandline. If only one + ISDN card is used, the -d isdstrin may be omitted. + + For further documentation (adding more IBM Active 2000 cards), refer to + the manpage actctrl.8 which is included in the isdn4k-utils package. + diff --git a/Documentation/isdn/README.audio b/Documentation/isdn/README.audio new file mode 100644 index 000000000000..8ebca19290d9 --- /dev/null +++ b/Documentation/isdn/README.audio @@ -0,0 +1,138 @@ +$Id: README.audio,v 1.8 1999/07/11 17:17:29 armin Exp $ + +ISDN subsystem for Linux. + Description of audio mode. + +When enabled during kernel configuration, the tty emulator of the ISDN +subsystem is capable of a reduced set of commands to support audio. +This document describes the commands supported and the format of +audio data. + +Commands for enabling/disabling audio mode: + + AT+FCLASS=8 Enable audio mode. + This affects the following registers: + S18: Bits 0 and 2 are set. + S16: Set to 48 and any further change to + larger values is blocked. + AT+FCLASS=0 Disable audio mode. + Register 18 is set to 4. + AT+FCLASS=? Show possible modes. + AT+FCLASS? Report current mode (0 or 8). + +Commands supported in audio mode: + +All audio mode commands have one of the following forms: + + AT+Vxx? Show current setting. + AT+Vxx=? Show possible settings. + AT+Vxx=v Set simple parameter. + AT+Vxx=v,v ... Set complex parameter. + +where xx is a two-character code and v are alphanumerical parameters. +The following commands are supported: + + AT+VNH=x Auto hangup setting. NO EFFECT, supported + for compatibility only. + AT+VNH? Always reporting "1" + AT+VNH=? Always reporting "1" + + AT+VIP Reset all audio parameters. + + AT+VLS=x Line select. x is one of the following: + 0 = No device. + 2 = Phone line. + AT+VLS=? Always reporting "0,2" + AT+VLS? Show current line. + + AT+VRX Start recording. Emulator responds with + CONNECT and starts sending audio data to + the application. See below for data format + + AT+VSD=x,y Set silence-detection parameters. + Possible parameters: + x = 0 ... 31 sensitivity threshold level. + (default 0 , deactivated) + y = 0 ... 255 range of interval in units + of 0.1 second. (default 70) + AT+VSD=? Report possible parameters. + AT+VSD? Show current parameters. + + AT+VDD=x,y Set DTMF-detection parameters. + Only possible if online and during this connection. + Possible parameters: + x = 0 ... 15 sensitivity threshold level. + (default 0 , I4L soft-decode) + (1-15 soft-decode off, hardware on) + y = 0 ... 255 tone duration in units of 5ms. + Not for I4L soft decode (default 8, 40ms) + AT+VDD=? Report possible parameters. + AT+VDD? Show current parameters. + + AT+VSM=x Select audio data format. + Possible parameters: + 2 = ADPCM-2 + 3 = ADPCM-3 + 4 = ADPCM-4 + 5 = aLAW + 6 = uLAW + AT+VSM=? Show possible audio formats. + + AT+VTX Start audio playback. Emulator responds + with CONNECT and starts sending audio data + received from the application via phone line. +General behavior and description of data formats/protocol. + when a connection is made: + + On incoming calls, if the application responds to a RING + with ATA, depending on the calling service, the emulator + responds with either CONNECT (data call) or VCON (voice call). + + On outgoing voice calls, the emulator responds with VCON + upon connection setup. + + Audio recording. + + When receiving audio data, a kind of bisync protocol is used. + Upon AT+VRX command, the emulator responds with CONNECT, and + starts sending audio data to the application. There are several + escape sequences defined, all using DLE (0x10) as Escape char: + + <DLE><ETX> End of audio data. (i.e. caused by a + hangup of the remote side) Emulator stops + recording, responding with VCON. + <DLE><DC4> Abort recording, (send by appl.) Emulator + stops recording, sends DLE,ETX. + <DLE><DLE> Escape sequence for DLE in data stream. + <DLE>0 Touchtone "0" received. + ... + <DLE>9 Touchtone "9" received. + <DLE># Touchtone "#" received. + <DLE>* Touchtone "*" received. + <DLE>A Touchtone "A" received. + <DLE>B Touchtone "B" received. + <DLE>C Touchtone "C" received. + <DLE>D Touchtone "D" received. + + <DLE>q quiet. Silence detected after non-silence. + <DLE>s silence. Silence detected from the + start of recording. + + Currently unsupported DLE sequences: + + <DLE>c FAX calling tone received. + <DLE>b busy tone received. + + Audio playback. + + When sending audio data, upon AT+VTX command, emulator responds with + CONNECT, and starts transferring data from application to the phone line. + The same DLE sequences apply to this mode. + + Full-Duplex-Audio: + + When _both_ commands for recording and playback are given in _one_ + AT-command-line (i.e.: "AT+VTX+VRX"), full-duplex-mode is selected. + In this mode, the only way to stop recording is sending <DLE><DC4> + and the only way to stop playback is to send <DLE><ETX>. + diff --git a/Documentation/isdn/README.avmb1 b/Documentation/isdn/README.avmb1 new file mode 100644 index 000000000000..9e075484ef1e --- /dev/null +++ b/Documentation/isdn/README.avmb1 @@ -0,0 +1,187 @@ +Driver for active AVM Controller. + +The driver provides a kernel capi2.0 Interface (kernelcapi) and +on top of this a User-Level-CAPI2.0-interface (capi) +and a driver to connect isdn4linux with CAPI2.0 (capidrv). +The lowlevel interface can be used to implement a CAPI2.0 +also for passive cards since July 1999. + +The author can be reached at calle@calle.in-berlin.de. +The command avmcapictrl is part of the isdn4k-utils. +t4-files can be found at ftp://ftp.avm.de/cardware/b1/linux/firmware + +Currently supported cards: + B1 ISA (all versions) + B1 PCI + T1/T1B (HEMA card) + M1 + M2 + B1 PCMCIA + +Installing +---------- + +You need at least /dev/capi20 to load the firmware. + +mknod /dev/capi20 c 68 0 +mknod /dev/capi20.00 c 68 1 +mknod /dev/capi20.01 c 68 2 +. +. +. +mknod /dev/capi20.19 c 68 20 + +Running +------- + +To use the card you need the t4-files to download the firmware. +AVM GmbH provides several t4-files for the different D-channel +protocols (b1.t4 for Euro-ISDN). Install these file in /lib/isdn. + +if you configure as modules load the modules this way: + +insmod /lib/modules/current/misc/capiutil.o +insmod /lib/modules/current/misc/b1.o +insmod /lib/modules/current/misc/kernelcapi.o +insmod /lib/modules/current/misc/capidrv.o +insmod /lib/modules/current/misc/capi.o + +if you have an B1-PCI card load the module b1pci.o +insmod /lib/modules/current/misc/b1pci.o +and load the firmware with +avmcapictrl load /lib/isdn/b1.t4 1 + +if you have an B1-ISA card load the module b1isa.o +and add the card by calling +avmcapictrl add 0x150 15 +and load the firmware by calling +avmcapictrl load /lib/isdn/b1.t4 1 + +if you have an T1-ISA card load the module t1isa.o +and add the card by calling +avmcapictrl add 0x450 15 T1 0 +and load the firmware by calling +avmcapictrl load /lib/isdn/t1.t4 1 + +if you have an PCMCIA card (B1/M1/M2) load the module b1pcmcia.o +before you insert the card. + +Leased Lines with B1 +-------------------- +Init card and load firmware. +For an D64S use "FV: 1" as phone number +For an D64S2 use "FV: 1" and "FV: 2" for multilink +or "FV: 1,2" to use CAPI channel bundling. + +/proc-Interface +----------------- + +/proc/capi: + dr-xr-xr-x 2 root root 0 Jul 1 14:03 . + dr-xr-xr-x 82 root root 0 Jun 30 19:08 .. + -r--r--r-- 1 root root 0 Jul 1 14:03 applications + -r--r--r-- 1 root root 0 Jul 1 14:03 applstats + -r--r--r-- 1 root root 0 Jul 1 14:03 capi20 + -r--r--r-- 1 root root 0 Jul 1 14:03 capidrv + -r--r--r-- 1 root root 0 Jul 1 14:03 controller + -r--r--r-- 1 root root 0 Jul 1 14:03 contrstats + -r--r--r-- 1 root root 0 Jul 1 14:03 driver + -r--r--r-- 1 root root 0 Jul 1 14:03 ncci + -r--r--r-- 1 root root 0 Jul 1 14:03 users + +/proc/capi/applications: + applid level3cnt datablkcnt datablklen ncci-cnt recvqueuelen + level3cnt: capi_register parameter + datablkcnt: capi_register parameter + ncci-cnt: current number of nccis (connections) + recvqueuelen: number of messages on receive queue + for example: +1 -2 16 2048 1 0 +2 2 7 2048 1 0 + +/proc/capi/applstats: + applid recvctlmsg nrecvdatamsg nsentctlmsg nsentdatamsg + recvctlmsg: capi messages received without DATA_B3_IND + recvdatamsg: capi DATA_B3_IND received + sentctlmsg: capi messages sent without DATA_B3_REQ + sentdatamsg: capi DATA_B3_REQ sent + for example: +1 2057 1699 1721 1699 + +/proc/capi/capi20: statistics of capi.o (/dev/capi20) + minor nopen nrecvdropmsg nrecvctlmsg nrecvdatamsg sentctlmsg sentdatamsg + minor: minor device number of capi device + nopen: number of calls to devices open + nrecvdropmsg: capi messages dropped (messages in recvqueue in close) + nrecvctlmsg: capi messages received without DATA_B3_IND + nrecvdatamsg: capi DATA_B3_IND received + nsentctlmsg: capi messages sent without DATA_B3_REQ + nsentdatamsg: capi DATA_B3_REQ sent + + for example: +1 2 18 0 16 2 + +/proc/capi/capidrv: statistics of capidrv.o (capi messages) + nrecvctlmsg nrecvdatamsg sentctlmsg sentdatamsg + nrecvctlmsg: capi messages received without DATA_B3_IND + nrecvdatamsg: capi DATA_B3_IND received + nsentctlmsg: capi messages sent without DATA_B3_REQ + nsentdatamsg: capi DATA_B3_REQ sent + for example: +2780 2226 2256 2226 + +/proc/capi/controller: + controller drivername state cardname controllerinfo + for example: +1 b1pci running b1pci-e000 B1 3.07-01 0xe000 19 +2 t1isa running t1isa-450 B1 3.07-01 0x450 11 0 +3 b1pcmcia running m2-150 B1 3.07-01 0x150 5 + +/proc/capi/contrstats: + controller nrecvctlmsg nrecvdatamsg sentctlmsg sentdatamsg + nrecvctlmsg: capi messages received without DATA_B3_IND + nrecvdatamsg: capi DATA_B3_IND received + nsentctlmsg: capi messages sent without DATA_B3_REQ + nsentdatamsg: capi DATA_B3_REQ sent + for example: +1 2845 2272 2310 2274 +2 2 0 2 0 +3 2 0 2 0 + +/proc/capi/driver: + drivername ncontroller + for example: +b1pci 1 +t1isa 1 +b1pcmcia 1 +b1isa 0 + +/proc/capi/ncci: + apllid ncci winsize sendwindow + for example: +1 0x10101 8 0 + +/proc/capi/users: kernelmodules that use the kernelcapi. + name + for example: +capidrv +capi20 + +Questions +--------- +Check out the FAQ (ftp.isdn4linux.de) or subscribe to the +linux-avmb1@calle.in-berlin.de mailing list by sending +a mail to majordomo@calle.in-berlin.de with +subscribe linux-avmb1 +in the body. + +German documentation and several scripts can be found at +ftp://ftp.avm.de/cardware/b1/linux/ + +Bugs +---- +If you find any please let me know. + +Enjoy, + +Carsten Paeth (calle@calle.in-berlin.de) diff --git a/Documentation/isdn/README.concap b/Documentation/isdn/README.concap new file mode 100644 index 000000000000..2f114babe4b6 --- /dev/null +++ b/Documentation/isdn/README.concap @@ -0,0 +1,259 @@ +Description of the "concap" encapsulation protocol interface +============================================================ + +The "concap" interface is intended to be used by network device +drivers that need to process an encapsulation protocol. +It is assumed that the protocol interacts with a linux network device by +- data transmission +- connection control (establish, release) +Thus, the mnemonic: "CONnection CONtrolling eNCAPsulation Protocol". + +This is currently only used inside the isdn subsystem. But it might +also be useful to other kinds of network devices. Thus, if you want +to suggest changes that improve usability or performance of the +interface, please let me know. I'm willing to include them in future +releases (even if I needed to adapt the current isdn code to the +changed interface). + + +Why is this useful? +=================== + +The encapsulation protocol used on top of WAN connections or permanent +point-to-point links are frequently chosen upon bilateral agreement. +Thus, a device driver for a certain type of hardware must support +several different encapsulation protocols at once. + +The isdn device driver did already support several different +encapsulation protocols. The encapsulation protocol is configured by a +user space utility (isdnctrl). The isdn network interface code then +uses several case statements which select appropriate actions +depending on the currently configured encapsulation protocol. + +In contrast, LAN network interfaces always used a single encapsulation +protocol which is unique to the hardware type of the interface. The LAN +encapsulation is usually done by just sticking a header on the data. Thus, +traditional linux network device drivers used to process the +encapsulation protocol directly (usually by just providing a hard_header() +method in the device structure) using some hardware type specific support +functions. This is simple, direct and efficient. But it doesn't fit all +the requirements for complex WAN encapsulations. + + + The configurability of the encapsulation protocol to be used + makes isdn network interfaces more flexible, but also much more + complex than traditional lan network interfaces. + + +Many Encapsulation protocols used on top of WAN connections will not just +stick a header on the data. They also might need to set up or release +the WAN connection. They also might want to send other data for their +private purpose over the wire, e.g. ppp does a lot of link level +negotiation before the first piece of user data can be transmitted. +Such encapsulation protocols for WAN devices are typically more complex +than encapsulation protocols for lan devices. Thus, network interface +code for typical WAN devices also tends to be more complex. + + +In order to support Linux' x25 PLP implementation on top of +isdn network interfaces I could have introduced yet another branch to +the various case statements inside drivers/isdn/isdn_net.c. +This eventually made isdn_net.c even more complex. In addition, it made +isdn_net.c harder to maintain. Thus, by identifying an abstract +interface between the network interface code and the encapsulation +protocol, complexity could be reduced and maintainability could be +increased. + + +Likewise, a similar encapsulation protocol will frequently be needed by +several different interfaces of even different hardware type, e.g. the +synchronous ppp implementation used by the isdn driver and the +asynchronous ppp implementation used by the ppp driver have a lot of +similar code in them. By cleanly separating the encapsulation protocol +from the hardware specific interface stuff such code could be shared +better in future. + + +When operating over dial-up-connections (e.g. telephone lines via modem, +non-permanent virtual circuits of wide area networks, ISDN) many +encapsulation protocols will need to control the connection. Therefore, +some basic connection control primitives are supported. The type and +semantics of the connection (i.e the ISO layer where connection service +is provided) is outside our scope and might be different depending on +the encapsulation protocol used, e.g. for a ppp module using our service +on top of a modem connection a connect_request will result in dialing +a (somewhere else configured) remote phone number. For an X25-interface +module (LAPB semantics, as defined in Documentation/networking/x25-iface.txt) +a connect_request will ask for establishing a reliable lapb +datalink connection. + + +The encapsulation protocol currently provides the following +service primitives to the network device. + +- create a new encapsulation protocol instance +- delete encapsulation protocol instance and free all its resources +- initialize (open) the encapsulation protocol instance for use. +- deactivate (close) an encapsulation protocol instance. +- process (xmit) data handed down by upper protocol layer +- receive data from lower (hardware) layer +- process connect indication from lower (hardware) layer +- process disconnect indication from lower (hardware) layer + + +The network interface driver accesses those primitives via callbacks +provided by the encapsulation protocol instance within a +struct concap_proto_ops. + +struct concap_proto_ops{ + + /* create a new encapsulation protocol instance of same type */ + struct concap_proto * (*proto_new) (void); + + /* delete encapsulation protocol instance and free all its resources. + cprot may no loger be referenced after calling this */ + void (*proto_del)(struct concap_proto *cprot); + + /* initialize the protocol's data. To be called at interface startup + or when the device driver resets the interface. All services of the + encapsulation protocol may be used after this*/ + int (*restart)(struct concap_proto *cprot, + struct net_device *ndev, + struct concap_device_ops *dops); + + /* deactivate an encapsulation protocol instance. The encapsulation + protocol may not call any *dops methods after this. */ + int (*close)(struct concap_proto *cprot); + + /* process a frame handed down to us by upper layer */ + int (*encap_and_xmit)(struct concap_proto *cprot, struct sk_buff *skb); + + /* to be called for each data entity received from lower layer*/ + int (*data_ind)(struct concap_proto *cprot, struct sk_buff *skb); + + /* to be called when a connection was set up/down. + Protocols that don't process these primitives might fill in + dummy methods here */ + int (*connect_ind)(struct concap_proto *cprot); + int (*disconn_ind)(struct concap_proto *cprot); +}; + + +The data structures are defined in the header file include/linux/concap.h. + + +A Network interface using encapsulation protocols must also provide +some service primitives to the encapsulation protocol: + +- request data being submitted by lower layer (device hardware) +- request a connection being set up by lower layer +- request a connection being released by lower layer + +The encapsulation protocol accesses those primitives via callbacks +provided by the network interface within a struct concap_device_ops. + +struct concap_device_ops{ + + /* to request data be submitted by device */ + int (*data_req)(struct concap_proto *, struct sk_buff *); + + /* Control methods must be set to NULL by devices which do not + support connection control. */ + /* to request a connection be set up */ + int (*connect_req)(struct concap_proto *); + + /* to request a connection be released */ + int (*disconn_req)(struct concap_proto *); +}; + +The network interface does not explicitly provide a receive service +because the encapsulation protocol directly calls netif_rx(). + + + + +An encapsulation protocol itself is actually the +struct concap_proto{ + struct net_device *net_dev; /* net device using our service */ + struct concap_device_ops *dops; /* callbacks provided by device */ + struct concap_proto_ops *pops; /* callbacks provided by us */ + int flags; + void *proto_data; /* protocol specific private data, to + be accessed via *pops methods only*/ + /* + : + whatever + : + */ +}; + +Most of this is filled in when the device requests the protocol to +be reset (opend). The network interface must provide the net_dev and +dops pointers. Other concap_proto members should be considered private +data that are only accessed by the pops callback functions. Likewise, +a concap proto should access the network device's private data +only by means of the callbacks referred to by the dops pointer. + + +A possible extended device structure which uses the connection controlling +encapsulation services could look like this: + +struct concap_device{ + struct net_device net_dev; + struct my_priv /* device->local stuff */ + /* the my_priv struct might contain a + struct concap_device_ops *dops; + to provide the device specific callbacks + */ + struct concap_proto *cprot; /* callbacks provided by protocol */ +}; + + + +Misc Thoughts +============= + +The concept of the concap proto might help to reuse protocol code and +reduce the complexity of certain network interface implementations. +The trade off is that it introduces yet another procedure call layer +when processing the protocol. This has of course some impact on +performance. However, typically the concap interface will be used by +devices attached to slow lines (like telephone, isdn, leased synchronous +lines). For such slow lines, the overhead is probably negligible. +This might no longer hold for certain high speed WAN links (like +ATM). + + +If general linux network interfaces explicitly supported concap +protocols (e.g. by a member struct concap_proto* in struct net_device) +then the interface of the service function could be changed +by passing a pointer of type (struct net_device*) instead of +type (struct concap_proto*). Doing so would make many of the service +functions compatible to network device support functions. + +e.g. instead of the concap protocol's service function + + int (*encap_and_xmit)(struct concap_proto *cprot, struct sk_buff *skb); + +we could have + + int (*encap_and_xmit)(struct net_device *ndev, struct sk_buff *skb); + +As this is compatible to the dev->hard_start_xmit() method, the device +driver could directly register the concap protocol's encap_and_xmit() +function as its hard_start_xmit() method. This would eliminate one +procedure call layer. + + +The device's data request function could also be defined as + + int (*data_req)(struct net_device *ndev, struct sk_buff *skb); + +This might even allow for some protocol stacking. And the network +interface might even register the same data_req() function directly +as its hard_start_xmit() method when a zero layer encapsulation +protocol is configured. Thus, eliminating the performance penalty +of the concap interface when a trivial concap protocol is used. +Nevertheless, the device remains able to support encapsulation +protocol configuration. + diff --git a/Documentation/isdn/README.diversion b/Documentation/isdn/README.diversion new file mode 100644 index 000000000000..bddcd5fb86ff --- /dev/null +++ b/Documentation/isdn/README.diversion @@ -0,0 +1,127 @@ +The isdn diversion services are a supporting module working together with +the isdn4linux and the HiSax module for passive cards. +Active cards, TAs and cards using a own or other driver than the HiSax +module need to be adapted to the HL<->LL interface described in a separate +document. The diversion services may be used with all cards supported by +the HiSax driver. +The diversion kernel interface and controlling tool divertctrl were written +by Werner Cornelius (werner@isdn4linux.de or werner@titro.de) under the +GNU General Public License. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + +Table of contents +================= + +1. Features of the i4l diversion services + (Or what can the i4l diversion services do for me) + +2. Required hard- and software + +3. Compiling, installing and loading/unloading the module + Tracing calling and diversion information + +4. Tracing calling and diversion information + +5. Format of the divert device ASCII output + + +1. Features of the i4l diversion services + (Or what can the i4l diversion services do for me) + + The i4l diversion services offers call forwarding and logging normally + only supported by isdn phones. Incoming calls may be diverted + unconditionally (CFU), when not reachable (CFNR) or on busy condition + (CFB). + The diversions may be invoked statically in the providers exchange + as normally done by isdn phones. In this case all incoming calls + with a special (or all) service identifiers are forwarded if the + forwarding reason is met. Activated static services may also be + interrogated (queried). + The i4l diversion services additionally offers a dynamic version of + call forwarding which is not preprogrammed inside the providers exchange + but dynamically activated by i4l. + In this case all incoming calls are checked by rules that may be + compared to the mechanism of ipfwadm or ipchains. If a given rule matches + the checking process is finished and the rule matching will be applied + to the call. + The rules include primary and secondary service identifiers, called + number and subaddress, callers number and subaddress and whether the rule + matches to all filtered calls or only those when all B-channel resources + are exhausted. + Actions that may be invoked by a rule are ignore, proceed, reject, + direct divert or delayed divert of a call. + All incoming calls matching a rule except the ignore rule a reported and + logged as ASCII via the proc filesystem (/proc/net/isdn/divert). If proceed + is selected the call will be held in a proceeding state (without ringing) + for a certain amount of time to let an external program or client decide + how to handle the call. + + +2. Required hard- and software + + For using the i4l diversion services the isdn line must be of a EURO/DSS1 + type. Additionally the i4l services only work together with the HiSax + driver for passive isdn cards. All HiSax supported cards may be used for + the diversion purposes. + The static diversion services require the provider having static services + CFU, CFNR, CFB activated on an MSN-line. The static services may not be + used on a point-to-point connection. Further the static services are only + available in some countries (for example germany). Countries requiring the + keypad protocol for activating static diversions (like the netherlands) are + not supported but may use the tty devices for this purpose. + The dynamic diversion services may be used in all countries if the provider + enables the feature CF (call forwarding). This should work on both MSN- and + point-to-point lines. + To add and delete rules the additional divertctrl program is needed. This + program is part of the isdn4kutils package. + +3. Compiling, installing and loading/unloading the module + Tracing calling and diversion information + + + To compile the i4l code with diversion support you need to say yes to the + DSS1 diversion services when selecting the i4l options in the kernel + config (menuconfig or config). + After having properly activated a make modules and make modules_install all + required modules will be correctly installed in the needed modules dirs. + As the diversion services are currently not included in the scripts of most + standard distributions you will have to add a "insmod dss1_divert" after + having loaded the global isdn module. + The module can be loaded without any command line parameters. + If the module is actually loaded and active may be checked with a + "cat /proc/modules" or "ls /proc/net/isdn/divert". The divert file is + dynamically created by the diversion module and removed when the module is + unloaded. + + +4. Tracing calling and diversion information + + You also may put a "cat /proc/net/isdn/divert" in the background with the + output redirected to a file. Then all actions of the module are logged. + The divert file in the proc system may be opened more than once, so in + conjunction with inetd and a small remote client on other machines inside + your network incoming calls and reactions by the module may be shown on + every listening machine. + If a call is reported as proceeding an external program or client may + specify during a certain amount of time (normally 4 to 10 seconds) what + to do with that call. + To unload the module all open files to the device in the proc system must + be closed. Otherwise the module (and isdn.o) may not be unloaded. + +5. Format of the divert device ASCII output + + To be done later + diff --git a/Documentation/isdn/README.fax b/Documentation/isdn/README.fax new file mode 100644 index 000000000000..5314958a8a6e --- /dev/null +++ b/Documentation/isdn/README.fax @@ -0,0 +1,45 @@ + +Fax with isdn4linux +=================== + +When enabled during kernel configuration, the tty emulator +of the ISDN subsystem is capable of the Fax Class 2 commands. + +This only makes sense under the following conditions : + +- You need the commands as dummy, because you are using + hylafax (with patch) for AVM capi. +- You want to use the fax capabilities of your isdn-card. + (supported cards are listed below) + + +NOTE: This implementation does *not* support fax with passive + ISDN-cards (known as softfax). The low-level driver of + the ISDN-card and/or the card itself must support this. + + +Supported ISDN-Cards +-------------------- + +Eicon DIVA Server BRI/PCI + - full support with both B-channels. + +Eicon DIVA Server 4BRI/PCI + - full support with all B-channels. + +Eicon DIVA Server PRI/PCI + - full support on amount of B-channels + depending on DSPs on board. + + + +The command set is known as Class 2 (not Class 2.0) and +can be activated by AT+FCLASS=2 + + +The interface between the link-level-module and the hardware-level driver +is described in the files INTERFACE.fax and INTERFACE. + +Armin +mac@melware.de + diff --git a/Documentation/isdn/README.hfc-pci b/Documentation/isdn/README.hfc-pci new file mode 100644 index 000000000000..e8a4ef0226e8 --- /dev/null +++ b/Documentation/isdn/README.hfc-pci @@ -0,0 +1,41 @@ +The driver for the HFC-PCI and HFC-PCI-A chips from CCD may be used +for many OEM cards using this chips. +Additionally the driver has a special feature which makes it possible +to read the echo-channel of the isdn bus. So all frames in both directions +may be logged. +When the echo logging feature is used the number of available B-channels +for a HFC-PCI card is reduced to 1. Of course this is only relevant to +the card, not to the isdn line. +To activate the echo mode the following ioctls must be entered: + +hisaxctrl <driver/cardname> 10 1 + +This reduces the available channels to 1. There must not be open connections +through this card when entering the command. +And then: + +hisaxctrl <driver/cardname> 12 1 + +This enables the echo mode. If Hex logging is activated the isdnctrlx +devices show a output with a line beginning of HEX: for the providers +exchange and ECHO: for isdn devices sending to the provider. + +If more than one HFC-PCI cards are installed, a specific card may be selected +at the hisax module load command line. Supply the load command with the desired +IO-address of the desired card. +Example: +There tree cards installed in your machine at IO-base addresses 0xd000, 0xd400 +and 0xdc00 +If you want to use the card at 0xd400 standalone you should supply the insmod +or depmod with type=35 io=0xd400. +If you want to use all three cards, but the order needs to be at 0xdc00,0xd400, +0xd000 you may give the parameters type=35,35,35 io=0xdc00,0xd400,0xd00 +Then the desired card will be the initialised in the desired order. +If the io parameter is used the io addresses of all used cards should be +supplied else the parameter is assumed 0 and a auto search for a free card is +invoked which may not give the wanted result. + +Comments and reports to werner@isdn4linux.de or werner@isdn-development.de + + + diff --git a/Documentation/isdn/README.hysdn b/Documentation/isdn/README.hysdn new file mode 100644 index 000000000000..56cc59df1fb7 --- /dev/null +++ b/Documentation/isdn/README.hysdn @@ -0,0 +1,195 @@ +$Id: README.hysdn,v 1.3.6.1 2001/02/10 14:41:19 kai Exp $ +The hysdn driver has been written by +by Werner Cornelius (werner@isdn4linux.de or werner@titro.de) +for Hypercope GmbH Aachen Germany. Hypercope agreed to publish this driver +under the GNU General Public License. + +The CAPI 2.0-support was added by Ulrich Albrecht (ualbrecht@hypercope.de) +for Hypercope GmbH Aachen, Germany. + + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + +Table of contents +================= + +1. About the driver + +2. Loading/Unloading the driver + +3. Entries in the /proc filesystem + +4. The /proc/net/hysdn/cardconfX file + +5. The /proc/net/hysdn/cardlogX file + +6. Where to get additional info and help + + +1. About the driver + + The drivers/isdn/hysdn subdir contains a driver for HYPERCOPEs active + PCI isdn cards Champ, Ergo and Metro. To enable support for this cards + enable ISDN support in the kernel config and support for HYSDN cards in + the active cards submenu. The driver may only be compiled and used if + support for loadable modules and the process filesystem have been enabled. + + These cards provide two different interfaces to the kernel. Without the + optional CAPI 2.0 support, they register as ethernet card. IP-routing + to a ISDN-destination is performed on the card itself. All necessary + handlers for various protocols like ppp and others as well as config info + and firmware may be fetched from Hypercopes WWW-Site www.hypercope.de. + + With CAPI 2.0 support enabled, the card can also be used as a CAPI 2.0 + compliant devices with either CAPI 2.0 applications + (check isdn4k-utils) or -using the capidrv module- as a regular + isdn4linux device. This is done via the same mechanism as with the + active AVM cards and in fact uses the same module. + + +2. Loading/Unloading the driver + + The module has no command line parameters and auto detects up to 10 cards + in the id-range 0-9. + If a loaded driver shall be unloaded all open files in the /proc/net/hysdn + subdir need to be closed and all ethernet interfaces allocated by this + driver must be shut down. Otherwise the module counter will avoid a module + unload. + + If you are using the CAPI 2.0-interface, make sure to load/modprobe the + kernelcapi-module first. + + If you plan to use the capidrv-link to isdn4linux, make sure to load + capidrv.o after all modules using this driver (i.e. after hysdn and + any avm-specific modules). + +3. Entries in the /proc filesystem + + When the module has been loaded it adds the directory hysdn in the + /proc/net tree. This directory contains exactly 2 file entries for each + card. One is called cardconfX and the other cardlogX, where X is the + card id number from 0 to 9. + The cards are numbered in the order found in the PCI config data. + +4. The /proc/net/hysdn/cardconfX file + + This file may be read to get by everyone to get info about the cards type, + actual state, available features and used resources. + The first 3 entries (id, bus and slot) are PCI info fields, the following + type field gives the information about the cards type: + + 4 -> Ergo card (server card with 2 b-chans) + 5 -> Metro card (server card with 4 or 8 b-chans) + 6 -> Champ card (client card with 2 b-chans) + + The following 3 fields show the hardware assignments for irq, iobase and the + dual ported memory (dp-mem). + The fields b-chans and fax-chans announce the available card resources of + this types for the user. + The state variable indicates the actual drivers state for this card with the + following assignments. + + 0 -> card has not been booted since driver load + 1 -> card booting is actually in progess + 2 -> card is in an error state due to a previous boot failure + 3 -> card is booted and active + + And the last field (device) shows the name of the ethernet device assigned + to this card. Up to the first successful boot this field only shows a - + to tell that no net device has been allocated up to now. Once a net device + has been allocated it remains assigned to this card, even if a card is + rebooted and an boot error occurs. + + Writing to the cardconfX file boots the card or transfers config lines to + the cards firmware. The type of data is automatically detected when the + first data is written. Only root has write access to this file. + The firmware boot files are normally called hyclient.pof for client cards + and hyserver.pof for server cards. + After successfully writing the boot file, complete config files or single + config lines may be copied to this file. + If an error occurs the return value given to the writing process has the + following additional codes (decimal): + + 1000 Another process is currently bootng the card + 1001 Invalid firmware header + 1002 Boards dual-port RAM test failed + 1003 Internal firmware handler error + 1004 Boot image size invalid + 1005 First boot stage (bootstrap loader) failed + 1006 Second boot stage failure + 1007 Timeout waiting for card ready during boot + 1008 Operation only allowed in booted state + 1009 Config line too long + 1010 Invalid channel number + 1011 Timeout sending config data + + Additional info about error reasons may be fetched from the log output. + +5. The /proc/net/hysdn/cardlogX file + + The cardlogX file entry may be opened multiple for reading by everyone to + get the cards and drivers log data. Card messages always start with the + keyword LOG. All other lines are output from the driver. + The driver log data may be redirected to the syslog by selecting the + appropriate bitmask. The cards log messages will always be send to this + interface but never to the syslog. + + A root user may write a decimal or hex (with 0x) value t this file to select + desired output options. As mentioned above the cards log dat is always + written to the cardlog file independent of the following options only used + to check and debug the driver itself: + + For example: + echo "0x34560078" > /proc/net/hysdn/cardlog0 + to output the hex log mask 34560078 for card 0. + + The written value is regarded as an unsigned 32-Bit value, bit ored for + desired output. The following bits are already assigned: + + 0x80000000 All driver log data is alternatively via syslog + 0x00000001 Log memory allocation errors + 0x00000010 Firmware load start and close are logged + 0x00000020 Log firmware record parser + 0x00000040 Log every firmware write actions + 0x00000080 Log all card related boot messages + 0x00000100 Output all config data sent for debugging purposes + 0x00000200 Only non comment config lines are shown wth channel + 0x00000400 Additional conf log output + 0x00001000 Log the asynchronous scheduler actions (config and log) + 0x00100000 Log all open and close actions to /proc/net/hysdn/card files + 0x00200000 Log all actions from /proc file entries + 0x00010000 Log network interface init and deinit + +6. Where to get additional info and help + + If you have any problems concerning the driver or configuration contact + the Hypercope support team (support@hypercope.de) and or the authors + Werner Cornelius (werner@isdn4linux or cornelius@titro.de) or + Ulrich Albrecht (ualbrecht@hypercope.de). + + + + + + + + + + + + + + + diff --git a/Documentation/isdn/README.icn b/Documentation/isdn/README.icn new file mode 100644 index 000000000000..a5f55eadb3ca --- /dev/null +++ b/Documentation/isdn/README.icn @@ -0,0 +1,148 @@ +$Id: README.icn,v 1.7 2000/08/06 09:22:51 armin Exp $ + +You can get the ICN-ISDN-card from: + +Thinking Objects Software GmbH +Versbacher Röthe 159 +97078 Würzburg +Tel: +49 931 2877950 +Fax: +49 931 2877951 + +email info@think.de +WWW http:/www.think.de + + +The card communicates with the PC by two interfaces: + 1. A range of 4 successive port-addresses, whose base address can be + configured with the switches. + 2. A memory window with 16KB-256KB size, which can be setup in 16k steps + over the whole range of 16MB. Isdn4linux only uses a 16k window. + The base address of the window can be configured when loading + the lowlevel-module (see README). If using more than one card, + all cards are mapped to the same window and activated as needed. + +Setting up the IO-address dipswitches for the ICN-ISDN-card: + + Two types of cards exist, one with dip-switches and one with + hook-switches. + + 1. Setting for the card with hook-switches: + + (0 = switch closed, 1 = switch open) + + S3 S2 S1 Base-address + 0 0 0 0x300 + 0 0 1 0x310 + 0 1 0 0x320 (Default for isdn4linux) + 0 1 1 0x330 + 1 0 0 0x340 + 1 0 1 0x350 + 1 1 0 0x360 + 1 1 1 NOT ALLOWED! + + 2. Setting for the card with dip-switches: + + (0 = switch closed, 1 = switch open) + + S1 S2 S3 S4 Base-Address + 0 0 0 0 0x300 + 0 0 0 1 0x310 + 0 0 1 0 0x320 (Default for isdn4linux) + 0 0 1 1 0x330 + 0 1 0 0 0x340 + 0 1 0 1 0x350 + 0 1 1 0 0x360 + 0 1 1 1 NOT ALLOWED! + 1 0 0 0 0x308 + 1 0 0 1 0x318 + 1 0 1 0 0x328 + 1 0 1 1 0x338 + 1 1 0 0 0x348 + 1 1 0 1 0x358 + 1 1 1 0 0x368 + 1 1 1 1 NOT ALLOWED! + +The ICN driver may be built into the kernel or as a module. Initialization +depends on how the driver is built: + +Driver built into the kernel: + + The ICN driver can be configured using the commandline-feature while + loading the kernel with LILO or LOADLIN. It accepts the following syntax: + + icn=p,m[,idstring1[,idstring2]] + + where + + p = portbase (default: 0x320) + m = shared memory (default: 0xd0000) + + When using the ICN double card (4B), you MUST define TWO idstrings. + idstring must start with a character! There is no way for the driver + to distinguish between a 2B and 4B type card. Therefore, by supplying + TWO idstrings, you tell the driver that you have a 4B installed. + + If you like to use more than one card, you can use the program + "icnctrl" from the utility-package to configure additional cards. + You need to configure shared memory only once, since the icn-driver + maps all cards into the same address-space. + + Using the "icnctrl"-utility, portbase and shared memory can also be + changed during runtime. + + The D-channel protocol is configured by loading different firmware + into the card's memory using the "icnctrl"-utility. + + +Driver built as module: + + The module icn.o can be configured during "insmod'ing" it by + appending its parameters to the insmod-commandline. The following + syntax is accepted: + + portbase=p membase=m icn_id=idstring [icn_id2=idstring2] + + where p, m, idstring1 and idstring2 have the same meanings as the + parameters described for the kernel-version above. + + When using the ICN double card (4B), you MUST define TWO idstrings. + idstring must start with a character! There is no way for the driver + to distinguish between a 2B and 4B type card. Therefore, by supplying + TWO idstrings, you tell the driver that you have a 4B installed. + + Using the "icnctrl"-utility, the same features apply to the modularized + version like to the kernel-builtin one. + + The D-channel protocol is configured by loading different firmware + into the card's memory using the "icnctrl"-utility. + +Loading the firmware into the card: + + The firmware is supplied together with the isdn4k-utils package. It + can be found in the subdirectory icnctrl/firmware/ + + There are 3 files: + + loadpg.bin - Image of the bootstrap loader. + pc_1t_ca.bin - Image of firmware for german 1TR6 protocol. + pc_eu_ca.bin - Image if firmware for EDSS1 (Euro-ISDN) protocol. + + Assuming you have installed the utility-package correctly, the firmware + will be downloaded into the 2B-card using the following command: + + icnctrl -d Idstring load /etc/isdn/loadpg.bin /etc/isdn/pc_XX_ca.bin + + where XX is either "1t" or "eu", depending on the D-Channel protocol + used on your S0-bus and Idstring is the Name of the card, given during + insmod-time or (for kernel-builtin driver) on the kernel commandline. + + To load a 4B-card, the same command is used, except a second firmware + file is appended to the commandline of icnctrl. + + -> After downloading firmware, the two LEDs at the back cover of the card + (ICN-4B: 4 LEDs) must be blinking intermittently now. If a connection + is up, the corresponding led is lit continuously. + + For further documentation (adding more ICN-cards), refer to the manpage + icnctrl.8 which is included in the isdn4k-utils package. + diff --git a/Documentation/isdn/README.pcbit b/Documentation/isdn/README.pcbit new file mode 100644 index 000000000000..5125002282e5 --- /dev/null +++ b/Documentation/isdn/README.pcbit @@ -0,0 +1,40 @@ +------------------------------------------------------------------------------ + README file for the PCBIT-D Device Driver. +------------------------------------------------------------------------------ + +The PCBIT is a Euro ISDN adapter manufactured in Portugal by Octal and +developed in cooperation with Portugal Telecom and Inesc. +The driver interfaces with the standard kernel isdn facilities +originally developed by Fritz Elfert in the isdn4linux project. + +The common versions of the pcbit board require a firmware that is +distributed (and copyrighted) by the manufacturer. To load this +firmware you need "pcbitctl" available on the standard isdn4k-utils +package or in the pcbit package available in: + +ftp://ftp.di.fc.ul.pt/pub/systems/Linux/isdn + +Known Limitations: + +- The board reset procedure is at the moment incorrect and will only +allow you to load the firmware after a hard reset. + +- Only HDLC in B-channels is supported at the moment. There is no +current support for X.25 in B or D channels nor LAPD in B +channels. The main reason is that these two other protocol modes have, +to my knowledge, very little use. If you want to see them implemented +*do* send me a mail. + +- The driver often triggers errors in the board that I and the +manufacturer believe to be caused by bugs in the firmware. The current +version includes several procedures for error recovery that should +allow normal operation. Plans for the future include cooperation with +the manufacturer in order to solve this problem. + +Information/hints/help can be obtained in the linux isdn +mailing list (isdn4linux@listserv.isdn4linux.de) or directly from me. + +regards, + Pedro. + +<roque@di.fc.ul.pt> diff --git a/Documentation/isdn/README.sc b/Documentation/isdn/README.sc new file mode 100644 index 000000000000..1153cd926059 --- /dev/null +++ b/Documentation/isdn/README.sc @@ -0,0 +1,281 @@ +Welcome to Beta Release 2 of the combination ISDN driver for SpellCaster's +ISA ISDN adapters. Please note this release 2 includes support for the +DataCommute/BRI and TeleCommute/BRI adapters only and any other use is +guaranteed to fail. If you have a DataCommute/PRI installed in the test +computer, we recommend removing it as it will be detected but will not +be usable. To see what we have done to Beta Release 2, see section 3. + +Speaking of guarantees, THIS IS BETA SOFTWARE and as such contains +bugs and defects either known or unknown. Use this software at your own +risk. There is NO SUPPORT for this software. Some help may be available +through the web site or the mailing list but such support is totally at +our own option and without warranty. If you choose to assume all and +total risk by using this driver, we encourage you to join the beta +mailing list. + +To join the Linux beta mailing list, send a message to: +majordomo@spellcast.com with the words "subscribe linux-beta" as the only +contents of the message. Do not include a signature. If you choose to +remove yourself from this list at a later date, send another message to +the same address with the words "unsubscribe linux-beta" as its only +contents. + +TABLE OF CONTENTS +----------------- + 1. Introduction + 1.1 What is ISDN4Linux? + 1.2 What is different between this driver and previous drivers? + 1.3 How do I setup my system with the correct software to use + this driver release? + + 2. Basic Operations + 2.1 Unpacking and installing the driver + 2.2 Read the man pages!!! + 2.3 Installing the driver + 2.4 Removing the driver + 2.5 What to do if it doesn't load + 2.6 How to setup ISDN4Linux with the driver + + 3. Beta Change Summaries and Miscellaneous Notes + +1. Introduction +--------------- + +The revision 2 Linux driver for SpellCaster ISA ISDN adapters is built +upon ISDN4Linux available separately or as included in Linux 2.0 and later. +The driver will support a maximum of 4 adapters in any one system of any +type including DataCommute/BRI, DataCommute/PRI and TeleCommute/BRI for a +maximum of 92 channels for host. The driver is supplied as a module in +source form and needs to be complied before it can be used. It has been +tested on Linux 2.0.20. + +1.1 What Is ISDN4Linux + +ISDN4Linux is a driver and set of tools used to access and use ISDN devices +on a Linux platform in a common and standard way. It supports HDLC and PPP +protocols and offers channel bundling and MLPPP support. To use ISDN4Linux +you need to configure your kernel for ISDN support and get the ISDN4Linux +tool kit from our web site. + +ISDN4Linux creates a channel pool from all of the available ISDN channels +and therefore can function across adapters. When an ISDN4Linux compliant +driver (such as ours) is loaded, all of the channels go into a pool and +are used on a first-come first-served basis. In addition, individual +channels can be specifically bound to particular interfaces. + +1.2 What is different between this driver and previous drivers? + +The revision 2 driver besides adopting the ISDN4Linux architecture has many +subtle and not so subtle functional differences from previous releases. These +include: + - More efficient shared memory management combined with a simpler + configuration. All adapters now use only 16Kbytes of shared RAM + versus between 16K and 64K. New methods for using the shared RAM + allow us to utilize all of the available RAM on the adapter through + only one 16K page. + - Better detection of available upper memory. The probing routines + have been improved to better detect available shared RAM pages and + used pages are now locked. + - Decreased loading time and a wider range of I/O ports probed. + We have significantly reduced the amount of time it takes to load + the driver and at the same time doubled the number of I/O ports + probed increasing the likelihood of finding an adapter. + - We now support all ISA adapter models with a single driver instead + of separate drivers for each model. The revision 2 driver supports + the DataCommute/BRI, DataCommute/PRI and TeleCommute/BRI in any + combination up to a maximum of four adapters per system. + - On board PPP protocol support has been removed in favour of the + sync-PPP support used in ISDN4Linux. This means more control of + the protocol parameters, faster negotiation time and a more + familiar interface. + +1.3 How do I setup my system with the correct software to use + this driver release? + +Before you can compile, install and use the SpellCaster ISA ISDN driver, you +must ensure that the following software is installed, configured and running: + + - Linux kernel 2.0.20 or later with the required init and ps + versions. Please see your distribution vendor for the correct + utility packages. The latest kernel is available from + ftp://sunsite.unc.edu/pub/Linux/kernel/v2.0/ + + - The latest modules package (modules-2.0.0.tar.gz) from + ftp://sunsite.unc.edu/pub/Linux/kernel/modules-2.0.0.tar.gz + + - The ISDN4Linux tools available from + ftp://ftp.franken.de/pub/isdn4linux/v2.0/isdn4k-utils-2.0.tar.gz + This package may fail to compile for you so you can alternatively + get a pre-compiled version from + ftp://ftp.spellcast.com/pub/drivers/isdn4linux/isdn4k-bin-2.0.tar.gz + + +2. Basic Operations +------------------- + +2.1 Unpacking and installing the driver + + 1. As root, create a directory in a convenient place. We suggest + /usr/src/spellcaster. + + 2. Unpack the archive with : + tar xzf sc-n.nn.tar.gz -C /usr/src/spellcaster + + 3. Change directory to /usr/src/spellcaster + + 4. Read the README and RELNOTES files. + + 5. Run 'make' and if all goes well, run 'make install'. + +2.2 Read the man pages!!! + +Make sure you read the scctrl(8) and sc(4) manual pages before continuing +any further. Type 'man 8 scctrl' and 'man 4 sc'. + +2.3 Installing the driver + +To install the driver, type '/sbin/insmod sc' as root. sc(4) details options +you can specify but you shouldn't need to use any unless this doesn't work. + +Make sure the driver loaded and detected all of the adapters by typing +'dmesg'. + +The driver can be configured so that it is loaded upon startup. To do this, +edit the file "/etc/modules/'uname -f'/'uname -v'" and insert the driver name +"sc" into this file. + +2.4 Removing the driver + +To remove the driver, delete any interfaces that may exist (see isdnctrl(8) +for more on this) and then type '/sbin/rmmod sc'. + +2.5 What to do if it doesn't load + +If, when you try to install the driver, you get a message mentioning +'register_isdn' then you do not have the ISDN4Linux system installed. Please +make sure that ISDN support is configured in the kernel. + +If you get a message that says 'initialization of sc failed', then the +driver failed to detect an adapter or failed to find resources needed such +as a free IRQ line or shared memory segment. If you are sure there are free +resources available, use the insmod options detailed in sc(4) to override +the probing function. + +Upon testing, the following problem was noted, the driver would load without +problems, but the board would not respond beyond that point. When a check was +done with 'cat /proc/interrupts' the interrupt count for sc was 0. In the event +of this problem, change the BIOS settings so that the interrupts in question are +reserved for ISA use only. + + +2.6 How to setup ISDN4Linux with the driver + +There are three main configurations which you can use with the driver: + +A) Basic HDLC connection +B) PPP connection +C) MLPPP connection + +It should be mentioned here that you may also use a tty connection if you +desire. The Documentation directory of the isdn4linux subsystem offers good +documentation on this feature. + +A) 10 steps to the establishment of a basic HDLC connection +----------------------------------------------------------- + +- please open the isdn-hdlc file in the examples directory and follow along... + + This file is a script used to configure a BRI ISDN TA to establish a + basic HDLC connection between its two channels. Two network + interfaces are created and two routes added between the channels. + + i) using the isdnctrl utility, add an interface with "addif" and + name it "isdn0" + ii) add the outgoing and inbound telephone numbers + iii) set the Layer 2 protocol to hdlc + iv) set the eaz of the interface to be the phone number of that + specific channel + v) to turn the callback features off, set the callback to "off" and + the callback delay (cbdelay) to 0. + vi) the hangup timeout can be set to a specified number of seconds + vii) the hangup upon incoming call can be set on or off + viii) use the ifconfig command to bring up the network interface with + a specific IP address and point to point address + ix) add a route to the IP address through the isdn0 interface + x) a ping should result in the establishment of the connection + + +B) Establishment of a PPP connection +------------------------------------ + +- please open the isdn-ppp file in the examples directory and follow along... + + This file is a script used to configure a BRI ISDN TA to establish a + PPP connection between the two channels. The file is almost + identical to the HDLC connection example except that the packet + encapsulation type has to be set. + + use the same procedure as in the HDLC connection from steps i) to + iii) then, after the Layer 2 protocol is set, set the encapsulation + "encap" to syncppp. With this done, the rest of the steps, iv) to x) + can be followed from above. + + Then, the ipppd (ippp daemon) must be setup: + + xi) use the ipppd function found in /sbin/ipppd to set the following: + xii) take out (minus) VJ compression and bsd compression + xiii) set the mru size to 2000 + xiv) link the two /dev interfaces to the daemon + +NOTE: A "*" in the inbound telephone number specifies that a call can be +accepted on any number. + +C) Establishment of a MLPPP connection +-------------------------------------- + +- please open the isdn-mppp file in the examples directory and follow along... + + This file is a script used to configure a BRI ISDN TA to accept a + Multi Link PPP connection. + + i) using the isdnctrl utility, add an interface with "addif" and + name it "ippp0" + ii) add the inbound telephone number + iii) set the Layer 2 protocol to hdlc and the Layer 3 protocol to + trans (transparent) + iv) set the packet encapsulation to syncppp + v) set the eaz of the interface to be the phone number of that + specific channel + vi) to turn the callback features off, set the callback to "off" and + the callback delay (cbdelay) to 0. + vi) the hangup timeout can be set to a specified number of seconds + vii) the hangup upon incoming call can be set on or off + viii) add a slave interface and name it "ippp32" for example + ix) set the similar parameters for the ippp32 interface + x) use the ifconfig command to bring-up the ippp0 interface with a + specific IP address and point to point address + xi) add a route to the IP address through the ippp0 interface + xii) use the ipppd function found in /sbin/ipppd to set the following: + xiii) take out (minus) bsd compression + xiv) set the mru size to 2000 + xv) add (+) the multi-link function "+mp" + xvi) link the two /dev interfaces to the daemon + +NOTE: To use the MLPPP connection to dial OUT to a MLPPP connection, change +the inbound telephone numbers to the outgoing telephone numbers of the MLPPP +host. + + +3. Beta Change Summaries and Miscellaneous Notes +------------------------------------------------ +When using the "scctrl" utility to upload firmware revisions on the board, +please note that the byte count displayed at the end of the operation may be +different from the total number of bytes in the "dcbfwn.nn.sr" file. Please +disregard the displayed byte count. + +It was noted that in Beta Release 1, the module would fail to load and result +in a segmentation fault when 'insmod'ed. This problem was created when one of +the isdn4linux parameters, (isdn_ctrl, data field) was filled in. In some +cases, this data field was NULL, and was left unchecked, so when it was +referenced... segv. The bug has been fixed around line 63-68 of event.c. + diff --git a/Documentation/isdn/README.syncppp b/Documentation/isdn/README.syncppp new file mode 100644 index 000000000000..27d260095cce --- /dev/null +++ b/Documentation/isdn/README.syncppp @@ -0,0 +1,58 @@ +Some additional information for setting up a syncPPP +connection using network interfaces. +--------------------------------------------------------------- + +You need one thing beside the isdn4linux package: + + a patched pppd .. (I called it ipppd to show the difference) + +Compiling isdn4linux with sync PPP: +----------------------------------- +To compile isdn4linux with the sync PPP part, you have +to answer the appropriate question when doing a "make config" +Don't forget to load the slhc.o +module before the isdn.o module, if VJ-compression support +is not compiled into your kernel. (e.g if you have no PPP or +CSLIP in the kernel) + +Using isdn4linux with sync PPP: +------------------------------- +Sync PPP is just another encapsulation for isdn4linux. The +name to enable sync PPP encapsulation is 'syncppp' .. e.g: + + /sbin/isdnctrl encap ippp0 syncppp + +The name of the interface is here 'ippp0'. You need +one interface with the name 'ippp0' to saturate the +ipppd, which checks the ppp version via this interface. +Currently, all devices must have the name ipppX where +'X' is a decimal value. + +To set up a PPP connection you need the ipppd .. You must start +the ipppd once after installing the modules. The ipppd +communicates with the isdn4linux link-level driver using the +/dev/ippp0 to /dev/ippp15 devices. One ipppd can handle +all devices at once. If you want to use two PPP connections +at the same time, you have to connect the ipppd to two +devices .. and so on. +I've implemented one additional option for the ipppd: + 'useifip' will get (if set to not 0.0.0.0) the IP address + for the negotiation from the attached network-interface. +(also: ipppd will try to negotiate pointopoint IP as remote IP) +You must disable BSD-compression, this implementation can't +handle compressed packets. + +Check the etc/rc.isdn.syncppp in the isdn4kernel-util package +for an example setup script. + +To use the MPPP stuff, you must configure a slave device +with isdn4linux. Now call the ipppd with the '+mp' option. +To increase the number of links, you must use the +'addlink' option of the isdnctrl tool. (rc.isdn.syncppp.MPPP is +an example script) + +enjoy it, + michael + + + diff --git a/Documentation/isdn/README.x25 b/Documentation/isdn/README.x25 new file mode 100644 index 000000000000..e561a77c4e22 --- /dev/null +++ b/Documentation/isdn/README.x25 @@ -0,0 +1,184 @@ + +X.25 support within isdn4linux +============================== + +This is alpha/beta test code. Use it completely at your own risk. +As new versions appear, the stuff described here might suddenly change +or become invalid without notice. + +Keep in mind: + +You are using several new parts of the 2.2.x kernel series which +have not been tested in a large scale. Therefore, you might encounter +more bugs as usual. + +- If you connect to an X.25 neighbour not operated by yourself, ASK the + other side first. Be prepared that bugs in the protocol implementation + might result in problems. + +- This implementation has never wiped out my whole hard disk yet. But as + this is experimental code, don't blame me if that happened to you. + Backing up important data will never harm. + +- Monitor your isdn connections while using this software. This should + prevent you from undesired phone bills in case of driver problems. + + + + +How to configure the kernel +=========================== + +The ITU-T (former CCITT) X.25 network protocol layer has been implemented +in the Linux source tree since version 2.1.16. The isdn subsystem might be +useful to run X.25 on top of ISDN. If you want to try it, select + + "CCITT X.25 Packet Layer" + +from the networking options as well as + + "ISDN Support" and "X.25 PLP on Top of ISDN" + +from the ISDN subsystem options when you configure your kernel for +compilation. You currently also need to enable +"Prompt for development and/or incomplete code/drivers" from the +"Code maturity level options" menu. For the x25trace utility to work +you also need to enable "Packet socket". + +For local testing it is also recommended to enable the isdnloop driver +from the isdn subsystem's configuration menu. + +For testing, it is recommended that all isdn drivers and the X.25 PLP +protocol are compiled as loadable modules. Like this, you can recover +from certain errors by simply unloading and reloading the modules. + + + +What's it for? How to use it? +============================= + +X.25 on top of isdn might be useful with two different scenarios: + +- You might want to access a public X.25 data network from your Linux box. + You can use i4l if you were physically connected to the X.25 switch + by an ISDN B-channel (leased line as well as dial up connection should + work). + + This corresponds to ITU-T recommendation X.31 Case A (circuit-mode + access to PSPDN [packet switched public data network]). + + NOTE: X.31 also covers a Case B (access to PSPDN via virtual + circuit / packet mode service). The latter mode (which in theory + also allows using the D-channel) is not supported by isdn4linux. + It should however be possible to establish such packet mode connections + with certain active isdn cards provided that the firmware supports X.31 + and the driver exports this functionality to the user. Currently, + the AVM B1 driver is the only driver which does so. (It should be + possible to access D-channel X.31 with active AVM cards using the + CAPI interface of the AVM-B1 driver). + +- Or you might want to operate certain ISDN teleservices on your linux + box. A lot of those teleservices run on top of the ISO-8208 + (DTE-DTE mode) network layer protocol. ISO-8208 is essentially the + same as ITU-T X.25. + + Popular candidates of such teleservices are EUROfile transfer or any + teleservice applying ITU-T recommendation T.90. + +To use the X.25 protocol on top of isdn, just create an isdn network +interface as usual, configure your own and/or peer's ISDN numbers, +and choose x25iface encapsulation by + + isdnctrl encap <iface-name> x25iface. + +Once encap is set like this, the device can be used by the X.25 packet layer. + +All the stuff needed for X.25 is implemented inside the isdn link +level (mainly isdn_net.c and some new source files). Thus, it should +work with every existing HL driver. I was able to successfully open X.25 +connections on top of the isdnloop driver and the hisax driver. +"x25iface"-encapsulation bypasses demand dialing. Dialing will be +initiated when the upper (X.25 packet) layer requests the lapb datalink to +be established. But hangup timeout is still active. Whenever a hangup +occurs, all existing X.25 connections on that link will be cleared +It is recommended to use sufficiently large hangup-timeouts for the +isdn interfaces. + + +In order to set up a conforming protocol stack you also need to +specify the proper l2_prot parameter: + +To operate in ISO-8208 X.25 DTE-DTE mode, use + + isdnctrl l2_prot <iface-name> x75i + +To access an X.25 network switch via isdn (your linux box is the DTE), use + + isdnctrl l2_prot <iface-name> x25dte + +To mimic an X.25 network switch (DCE side of the connection), use + + isdnctrl l2_prot <iface-name> x25dce + +However, x25dte or x25dce is currently not supported by any real HL +level driver. The main difference between x75i and x25dte/dce is that +x25d[tc]e uses fixed lap_b addresses. With x75i, the side which +initiates the isdn connection uses the DTE's lap_b address while the +called side used the DCE's lap_b address. Thus, l2_prot x75i might +probably work if you access a public X.25 network as long as the +corresponding isdn connection is set up by you. At least one test +was successful to connect via isdn4linux to an X.25 switch using this +trick. At the switch side, a terminal adapter X.21 was used to connect +it to the isdn. + + +How to set up a test installation? +================================== + +To test X.25 on top of isdn, you need to get + +- a recent version of the "isdnctrl" program that supports setting the new + X.25 specific parameters. + +- the x25-utils-2.X package from + ftp://ftp.hes.iki.fi/pub/ham/linux/ax25/x25utils-* + (don't confuse the x25-utils with the ax25-utils) + +- an application program that uses linux PF_X25 sockets (some are + contained in the x25-util package). + +Before compiling the user level utilities make sure that the compiler/ +preprocessor will fetch the proper kernel header files of this kernel +source tree. Either make /usr/include/linux a symbolic link pointing to +this kernel's include/linux directory or set the appropriate compiler flags. + +When all drivers and interfaces are loaded and configured you need to +ifconfig the network interfaces up and add X.25-routes to them. Use +the usual ifconfig tool. + +ifconfig <iface-name> up + +But a special x25route tool (distributed with the x25-util package) +is needed to set up X.25 routes. I.e. + +x25route add 01 <iface-name> + +will cause all x.25 connections to the destination X.25-address +"01" to be routed to your created isdn network interface. + +There are currently no real X.25 applications available. However, for +tests, the x25-utils package contains a modified version of telnet +and telnetd that uses X.25 sockets instead of tcp/ip sockets. You can +use those for your first tests. Furthermore, you might check +ftp://ftp.hamburg.pop.de/pub/LOCAL/linux/i4l-eft/ which contains some +alpha-test implementation ("eftp4linux") of the EUROfile transfer +protocol. + +The scripts distributed with the eftp4linux test releases might also +provide useful examples for setting up X.25 on top of isdn. + +The x25-utility package also contains an x25trace tool that can be +used to monitor X.25 packets received by the network interfaces. +The /proc/net/x25* files also contain useful information. + +- Henner diff --git a/Documentation/isdn/syncPPP.FAQ b/Documentation/isdn/syncPPP.FAQ new file mode 100644 index 000000000000..3257a4bc0786 --- /dev/null +++ b/Documentation/isdn/syncPPP.FAQ @@ -0,0 +1,224 @@ +simple isdn4linux PPP FAQ .. to be continued .. not 'debugged' +------------------------------------------------------------------- + +Q01: what's pppd, ipppd, syncPPP, asyncPPP ?? +Q02: error message "this system lacks PPP support" +Q03: strange information using 'ifconfig' +Q04: MPPP?? What's that and how can I use it ... +Q05: I tried MPPP but it doesn't work +Q06: can I use asynchronous PPP encapsulation with network devices +Q07: A SunISDN machine can't connect to my i4l system +Q08: I wanna talk to several machines, which need different configs +Q09: Starting the ipppd, I get only error messages from i4l +Q10: I wanna use dynamic IP address assignment +Q11: I can't connect. How can I check where the problem is. +Q12: How can I reduce login delay? + +------------------------------------------------------------------- + +Q01: pppd, ipppd, syncPPP, asyncPPP .. what is that ? + what should I use? +A: The pppd is for asynchronous PPP .. asynchronous means + here, the framing is character based. (e.g when + using ttyI* or tty* devices) + + The ipppd handles PPP packets coming in HDLC + frames (bit based protocol) ... The PPP driver + in isdn4linux pushes all IP packets direct + to the network layer and all PPP protocol + frames to the /dev/ippp* device. + So, the ipppd is a simple external network + protocol handler. + + If you login into a remote machine using the + /dev/ttyI* devices and then enable PPP on the + remote terminal server -> use the 'old' pppd + + If your remote side immediately starts to send + frames ... you probably connect to a + syncPPP machine .. use the network device part + of isdn4linux with the 'syncppp' encapsulation + and make sure, that the ipppd is running and + connected to at least one /dev/ippp*. Check the + isdn4linux manual on how to configure a network device. + +-- + +Q02: when I start the ipppd .. I only get the + error message "this system lacks PPP support" +A: check that at least the device 'ippp0' exists. + (you can check this e.g with the program 'ifconfig') + The ipppd NEEDS this device under THIS name .. + If this device doesn't exists, use: + isdnctrl addif ippp0 + isdnctrl encap ippp0 syncppp + ... (see isdn4linux doc for more) ... +A: Maybe you have compiled the ipppd with another + kernel source tree than the kernel you currently + run ... + +-- + +Q03: when I list the netdevices with ifconfig I see, that + my ISDN interface has a HWaddr and IRQ=0 and Base + address = 0 +A: The device is a fake ethernet device .. ignore IRQ and baseaddr + You need the HWaddr only for ethernet encapsulation. + +-- + +Q04: MPPP?? What's that and how can I use it ... + +A: MPPP or MP or MPP (Warning: MP is also an + acronym for 'Multi Processor') stands for + Multi Point to Point and means bundling + of several channels to one logical stream. + To enable MPPP negotiation you must call the + ipppd with the '+mp' option. + You must also configure a slave device for + every additional channel. (see the i4l manual + for more) + To use channel bundling you must first activate + the 'master' or initial call. Now you can add + the slave channels with the command: + isdnctrl addlink <device> + e.g: + isdnctrl addlink ippp0 + This is different from other encapsulations of + isdn4linux! With syncPPP, there is no automatic + activation of slave devices. + +-- + +Q05: I tried MPPP but it doesn't work .. the ipppd + writes in the debug log something like: + .. rcvd [0][proto=0x3d] c0 00 00 00 80 fd 01 01 00 0a ... + .. sent [0][LCP ProtRej id=0x2 00 3d c0 00 00 00 80 fd 01 ... + +A: you forgot to compile MPPP/RFC1717 support into the + ISDN Subsystem. Recompile with this option enabled. + +-- + +Q06: can I use asynchronous PPP encapsulation + over the network interface of isdn4linux .. + +A: No .. that's not possible .. Use the standard + PPP package over the /dev/ttyI* devices. You + must not use the ipppd for this. + +-- + +Q07: A SunISDN machine tries to connect my i4l system, + which doesn't work. + Checking the debug log I just saw garbage like: +!![ ... fill in the line ... ]!! + +A: The Sun tries to talk asynchronous PPP ... i4l + can't understand this ... try to use the ttyI* + devices with the standard PPP/pppd package + +A: (from Alexanter Strauss: ) +!![ ... fill in mail ]!! + +-- + +Q08: I wanna talk to remote machines, which need + a different configuration. The only way + I found to do this is to kill the ipppd and + start a new one with another config to connect + to the second machine. + +A: you must bind a network interface explicitly to + an ippp device, where you can connect a (for this + interface) individually configured ipppd. + +-- + +Q09: When I start the ipppd I only get error messages + from the i4l driver .. + +A: When starting, the ipppd calls functions which may + trigger a network packet. (e.g gethostbyname()). + Without the ipppd (at this moment, it is not + fully started) we can't handle this network request. + Try to configure hostnames necessary for the ipppd + in your local /etc/hosts file or in a way, that + your system can resolve it without using an + isdn/ippp network-interface. + +-- + +Q10: I wanna use dynamic IP address assignment ... How + must I configure the network device. + +A: At least you must have a route which forwards + a packet to the ippp network-interface to trigger + the dial-on-demand. + A default route to the ippp-interface will work. + Now you must choose a dummy IP address for your + interface. + If for some reason you can't set the default + route to the ippp interface, you may take any + address of the subnet from which you expect your + dynamic IP number and set a 'network route' for + this subnet to the ippp interface. + To allow overriding of the dummy address you + must call the ipppd with the 'ipcp-accept-local' option. + +A: You must know, how the ipppd gets the addresses it wanna + configure. If you don't give any option, the ipppd + tries to negotiate the local host address! + With the option 'noipdefault' it requests an address + from the remote machine. With 'useifip' it gets the + addresses from the net interface. Or you set the address + on the option line with the <a.b.c.d:e.f.g.h> option. + Note: the IP address of the remote machine must be configured + locally or the remote machine must send it in an IPCP request. + If your side doesn't know the IP address after negotiation, it + closes the connection! + You must allow overriding of address with the 'ipcp-accept-*' + options, if you have set your own or the remote address + explicitly. + +A: Maybe you try these options .. e.g: + + /sbin/ipppd :$REMOTE noipdefault /dev/ippp0 + + where REMOTE must be the address of the remote machine (the + machine, which gives you your address) + +-- + +Q11: I can't connect. How can I check where the problem is. + +A: A good help log is the debug output from the ipppd... + Check whether you can find there: + - only a few LCP-conf-req SENT messages (less then 10) + and then a Term-REQ: + -> check whether your ISDN card is well configured + it seems, that your machine doesn't dial + (IRQ,IO,Proto, etc problems) + Configure your ISDN card to print debug messages and + check the /dev/isdnctrl output next time. There + you can see, whether there is activity on the card/line. + - there are at least a few RECV messages in the log: + -> fine: your card is dialing and your remote machine + tries to talk with you. Maybe only a missing + authentication. Check your ipppd configuration again. + - the ipppd exits for some reason: + -> not good ... check /var/adm/syslog and /var/adm/daemon. + Could be a bug in the ipppd. + +-- + +Q12: How can I reduce login delay? + +A: Log a login session ('debug' log) and check which options + your remote side rejects. Next time configure your ipppd + to not negotiate these options. Another 'side effect' is, that + this increases redundancy. (e.g your remote side is buggy and + rejects options in a wrong way). + + + diff --git a/Documentation/java.txt b/Documentation/java.txt new file mode 100644 index 000000000000..e4814c213301 --- /dev/null +++ b/Documentation/java.txt @@ -0,0 +1,396 @@ + Java(tm) Binary Kernel Support for Linux v1.03 + ---------------------------------------------- + +Linux beats them ALL! While all other OS's are TALKING about direct +support of Java Binaries in the OS, Linux is doing it! + +You can execute Java applications and Java Applets just like any +other program after you have done the following: + +1) You MUST FIRST install the Java Developers Kit for Linux. + The Java on Linux HOWTO gives the details on getting and + installing this. This HOWTO can be found at: + + ftp://sunsite.unc.edu/pub/Linux/docs/HOWTO/Java-HOWTO + + You should also set up a reasonable CLASSPATH environment + variable to use Java applications that make use of any + nonstandard classes (not included in the same directory + as the application itself). + +2) You have to compile BINFMT_MISC either as a module or into + the kernel (CONFIG_BINFMT_MISC) and set it up properly. + If you choose to compile it as a module, you will have + to insert it manually with modprobe/insmod, as kmod + can not easily be supported with binfmt_misc. + Read the file 'binfmt_misc.txt' in this directory to know + more about the configuration process. + +3) Add the following configuration items to binfmt_misc + (you should really have read binfmt_misc.txt now): + support for Java applications: + ':Java:M::\xca\xfe\xba\xbe::/usr/local/bin/javawrapper:' + support for executable Jar files: + ':ExecutableJAR:E::jar::/usr/local/bin/jarwrapper:' + support for Java Applets: + ':Applet:E::html::/usr/bin/appletviewer:' + or the following, if you want to be more selective: + ':Applet:M::<!--applet::/usr/bin/appletviewer:' + + Of cause you have to fix the path names. Given path/file names in this + document match the Debian 2.1 system. (i.e. jdk installed in /usr, + custom wrappers from this document in /usr/local) + + Note, that for the more selective applet support you have to modify + existing html-files to contain <!--applet--> in the first line + ('<' has to be the first character!) to let this work! + + For the compiled Java programs you need a wrapper script like the + following (this is because Java is broken in case of the filename + handling), again fix the path names, both in the script and in the + above given configuration string. + + You, too, need the little program after the script. Compile like + gcc -O2 -o javaclassname javaclassname.c + and stick it to /usr/local/bin. + + Both the javawrapper shellscript and the javaclassname program + were supplied by Colin J. Watson <cjw44@cam.ac.uk>. + +====================== Cut here =================== +#!/bin/bash +# /usr/local/bin/javawrapper - the wrapper for binfmt_misc/java + +if [ -z "$1" ]; then + exec 1>&2 + echo Usage: $0 class-file + exit 1 +fi + +CLASS=$1 +FQCLASS=`/usr/local/bin/javaclassname $1` +FQCLASSN=`echo $FQCLASS | sed -e 's/^.*\.\([^.]*\)$/\1/'` +FQCLASSP=`echo $FQCLASS | sed -e 's-\.-/-g' -e 's-^[^/]*$--' -e 's-/[^/]*$--'` + +# for example: +# CLASS=Test.class +# FQCLASS=foo.bar.Test +# FQCLASSN=Test +# FQCLASSP=foo/bar + +unset CLASSBASE + +declare -i LINKLEVEL=0 + +while :; do + if [ "`basename $CLASS .class`" == "$FQCLASSN" ]; then + # See if this directory works straight off + cd -L `dirname $CLASS` + CLASSDIR=$PWD + cd $OLDPWD + if echo $CLASSDIR | grep -q "$FQCLASSP$"; then + CLASSBASE=`echo $CLASSDIR | sed -e "s.$FQCLASSP$.."` + break; + fi + # Try dereferencing the directory name + cd -P `dirname $CLASS` + CLASSDIR=$PWD + cd $OLDPWD + if echo $CLASSDIR | grep -q "$FQCLASSP$"; then + CLASSBASE=`echo $CLASSDIR | sed -e "s.$FQCLASSP$.."` + break; + fi + # If no other possible filename exists + if [ ! -L $CLASS ]; then + exec 1>&2 + echo $0: + echo " $CLASS should be in a" \ + "directory tree called $FQCLASSP" + exit 1 + fi + fi + if [ ! -L $CLASS ]; then break; fi + # Go down one more level of symbolic links + let LINKLEVEL+=1 + if [ $LINKLEVEL -gt 5 ]; then + exec 1>&2 + echo $0: + echo " Too many symbolic links encountered" + exit 1 + fi + CLASS=`ls --color=no -l $CLASS | sed -e 's/^.* \([^ ]*\)$/\1/'` +done + +if [ -z "$CLASSBASE" ]; then + if [ -z "$FQCLASSP" ]; then + GOODNAME=$FQCLASSN.class + else + GOODNAME=$FQCLASSP/$FQCLASSN.class + fi + exec 1>&2 + echo $0: + echo " $FQCLASS should be in a file called $GOODNAME" + exit 1 +fi + +if ! echo $CLASSPATH | grep -q "^\(.*:\)*$CLASSBASE\(:.*\)*"; then + # class is not in CLASSPATH, so prepend dir of class to CLASSPATH + if [ -z "${CLASSPATH}" ] ; then + export CLASSPATH=$CLASSBASE + else + export CLASSPATH=$CLASSBASE:$CLASSPATH + fi +fi + +shift +/usr/bin/java $FQCLASS "$@" +====================== Cut here =================== + + +====================== Cut here =================== +/* javaclassname.c + * + * Extracts the class name from a Java class file; intended for use in a Java + * wrapper of the type supported by the binfmt_misc option in the Linux kernel. + * + * Copyright (C) 1999 Colin J. Watson <cjw44@cam.ac.uk>. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <stdlib.h> +#include <stdio.h> +#include <stdarg.h> +#include <sys/types.h> + +/* From Sun's Java VM Specification, as tag entries in the constant pool. */ + +#define CP_UTF8 1 +#define CP_INTEGER 3 +#define CP_FLOAT 4 +#define CP_LONG 5 +#define CP_DOUBLE 6 +#define CP_CLASS 7 +#define CP_STRING 8 +#define CP_FIELDREF 9 +#define CP_METHODREF 10 +#define CP_INTERFACEMETHODREF 11 +#define CP_NAMEANDTYPE 12 + +/* Define some commonly used error messages */ + +#define seek_error() error("%s: Cannot seek\n", program) +#define corrupt_error() error("%s: Class file corrupt\n", program) +#define eof_error() error("%s: Unexpected end of file\n", program) +#define utf8_error() error("%s: Only ASCII 1-255 supported\n", program); + +char *program; + +long *pool; + +u_int8_t read_8(FILE *classfile); +u_int16_t read_16(FILE *classfile); +void skip_constant(FILE *classfile, u_int16_t *cur); +void error(const char *format, ...); +int main(int argc, char **argv); + +/* Reads in an unsigned 8-bit integer. */ +u_int8_t read_8(FILE *classfile) +{ + int b = fgetc(classfile); + if(b == EOF) + eof_error(); + return (u_int8_t)b; +} + +/* Reads in an unsigned 16-bit integer. */ +u_int16_t read_16(FILE *classfile) +{ + int b1, b2; + b1 = fgetc(classfile); + if(b1 == EOF) + eof_error(); + b2 = fgetc(classfile); + if(b2 == EOF) + eof_error(); + return (u_int16_t)((b1 << 8) | b2); +} + +/* Reads in a value from the constant pool. */ +void skip_constant(FILE *classfile, u_int16_t *cur) +{ + u_int16_t len; + int seekerr = 1; + pool[*cur] = ftell(classfile); + switch(read_8(classfile)) + { + case CP_UTF8: + len = read_16(classfile); + seekerr = fseek(classfile, len, SEEK_CUR); + break; + case CP_CLASS: + case CP_STRING: + seekerr = fseek(classfile, 2, SEEK_CUR); + break; + case CP_INTEGER: + case CP_FLOAT: + case CP_FIELDREF: + case CP_METHODREF: + case CP_INTERFACEMETHODREF: + case CP_NAMEANDTYPE: + seekerr = fseek(classfile, 4, SEEK_CUR); + break; + case CP_LONG: + case CP_DOUBLE: + seekerr = fseek(classfile, 8, SEEK_CUR); + ++(*cur); + break; + default: + corrupt_error(); + } + if(seekerr) + seek_error(); +} + +void error(const char *format, ...) +{ + va_list ap; + va_start(ap, format); + vfprintf(stderr, format, ap); + va_end(ap); + exit(1); +} + +int main(int argc, char **argv) +{ + FILE *classfile; + u_int16_t cp_count, i, this_class, classinfo_ptr; + u_int8_t length; + + program = argv[0]; + + if(!argv[1]) + error("%s: Missing input file\n", program); + classfile = fopen(argv[1], "rb"); + if(!classfile) + error("%s: Error opening %s\n", program, argv[1]); + + if(fseek(classfile, 8, SEEK_SET)) /* skip magic and version numbers */ + seek_error(); + cp_count = read_16(classfile); + pool = calloc(cp_count, sizeof(long)); + if(!pool) + error("%s: Out of memory for constant pool\n", program); + + for(i = 1; i < cp_count; ++i) + skip_constant(classfile, &i); + if(fseek(classfile, 2, SEEK_CUR)) /* skip access flags */ + seek_error(); + + this_class = read_16(classfile); + if(this_class < 1 || this_class >= cp_count) + corrupt_error(); + if(!pool[this_class] || pool[this_class] == -1) + corrupt_error(); + if(fseek(classfile, pool[this_class] + 1, SEEK_SET)) + seek_error(); + + classinfo_ptr = read_16(classfile); + if(classinfo_ptr < 1 || classinfo_ptr >= cp_count) + corrupt_error(); + if(!pool[classinfo_ptr] || pool[classinfo_ptr] == -1) + corrupt_error(); + if(fseek(classfile, pool[classinfo_ptr] + 1, SEEK_SET)) + seek_error(); + + length = read_16(classfile); + for(i = 0; i < length; ++i) + { + u_int8_t x = read_8(classfile); + if((x & 0x80) || !x) + { + if((x & 0xE0) == 0xC0) + { + u_int8_t y = read_8(classfile); + if((y & 0xC0) == 0x80) + { + int c = ((x & 0x1f) << 6) + (y & 0x3f); + if(c) putchar(c); + else utf8_error(); + } + else utf8_error(); + } + else utf8_error(); + } + else if(x == '/') putchar('.'); + else putchar(x); + } + putchar('\n'); + free(pool); + fclose(classfile); + return 0; +} +====================== Cut here =================== + + +====================== Cut here =================== +#!/bin/bash +# /usr/local/java/bin/jarwrapper - the wrapper for binfmt_misc/jar + +java -jar $1 +====================== Cut here =================== + + +Now simply chmod +x the .class, .jar and/or .html files you want to execute. +To add a Java program to your path best put a symbolic link to the main +.class file into /usr/bin (or another place you like) omitting the .class +extension. The directory containing the original .class file will be +added to your CLASSPATH during execution. + + +To test your new setup, enter in the following simple Java app, and name +it "HelloWorld.java": + + class HelloWorld { + public static void main(String args[]) { + System.out.println("Hello World!"); + } + } + +Now compile the application with: + javac HelloWorld.java + +Set the executable permissions of the binary file, with: + chmod 755 HelloWorld.class + +And then execute it: + ./HelloWorld.class + + +To execute Java Jar files, simple chmod the *.jar files to include +the execution bit, then just do + ./Application.jar + + +To execute Java Applets, simple chmod the *.html files to include +the execution bit, then just do + ./Applet.html + + +originally by Brian A. Lantz, brian@lantz.com +heavily edited for binfmt_misc by Richard Günther +new scripts by Colin J. Watson <cjw44@cam.ac.uk> +added executable Jar file support by Kurt Huwig <kurt@iku-netz.de> + diff --git a/Documentation/kbuild/00-INDEX b/Documentation/kbuild/00-INDEX new file mode 100644 index 000000000000..114644285454 --- /dev/null +++ b/Documentation/kbuild/00-INDEX @@ -0,0 +1,8 @@ +00-INDEX + - this file: info on the kernel build process +kconfig-language.txt + - specification of Config Language, the language in Kconfig files +makefiles.txt + - developer information for linux kernel makefiles +modules.txt + - how to build modules and to install them diff --git a/Documentation/kbuild/kconfig-language.txt b/Documentation/kbuild/kconfig-language.txt new file mode 100644 index 000000000000..ca1967f36423 --- /dev/null +++ b/Documentation/kbuild/kconfig-language.txt @@ -0,0 +1,282 @@ +Introduction +------------ + +The configuration database is collection of configuration options +organized in a tree structure: + + +- Code maturity level options + | +- Prompt for development and/or incomplete code/drivers + +- General setup + | +- Networking support + | +- System V IPC + | +- BSD Process Accounting + | +- Sysctl support + +- Loadable module support + | +- Enable loadable module support + | +- Set version information on all module symbols + | +- Kernel module loader + +- ... + +Every entry has its own dependencies. These dependencies are used +to determine the visibility of an entry. Any child entry is only +visible if its parent entry is also visible. + +Menu entries +------------ + +Most entries define a config option, all other entries help to organize +them. A single configuration option is defined like this: + +config MODVERSIONS + bool "Set version information on all module symbols" + depends MODULES + help + Usually, modules have to be recompiled whenever you switch to a new + kernel. ... + +Every line starts with a key word and can be followed by multiple +arguments. "config" starts a new config entry. The following lines +define attributes for this config option. Attributes can be the type of +the config option, input prompt, dependencies, help text and default +values. A config option can be defined multiple times with the same +name, but every definition can have only a single input prompt and the +type must not conflict. + +Menu attributes +--------------- + +A menu entry can have a number of attributes. Not all of them are +applicable everywhere (see syntax). + +- type definition: "bool"/"tristate"/"string"/"hex"/"int" + Every config option must have a type. There are only two basic types: + tristate and string, the other types are based on these two. The type + definition optionally accepts an input prompt, so these two examples + are equivalent: + + bool "Networking support" + and + bool + prompt "Networking support" + +- input prompt: "prompt" <prompt> ["if" <expr>] + Every menu entry can have at most one prompt, which is used to display + to the user. Optionally dependencies only for this prompt can be added + with "if". + +- default value: "default" <expr> ["if" <expr>] + A config option can have any number of default values. If multiple + default values are visible, only the first defined one is active. + Default values are not limited to the menu entry, where they are + defined, this means the default can be defined somewhere else or be + overridden by an earlier definition. + The default value is only assigned to the config symbol if no other + value was set by the user (via the input prompt above). If an input + prompt is visible the default value is presented to the user and can + be overridden by him. + Optionally dependencies only for this default value can be added with + "if". + +- dependencies: "depends on"/"requires" <expr> + This defines a dependency for this menu entry. If multiple + dependencies are defined they are connected with '&&'. Dependencies + are applied to all other options within this menu entry (which also + accept an "if" expression), so these two examples are equivalent: + + bool "foo" if BAR + default y if BAR + and + depends on BAR + bool "foo" + default y + +- reverse dependencies: "select" <symbol> ["if" <expr>] + While normal dependencies reduce the upper limit of a symbol (see + below), reverse dependencies can be used to force a lower limit of + another symbol. The value of the current menu symbol is used as the + minimal value <symbol> can be set to. If <symbol> is selected multiple + times, the limit is set to the largest selection. + Reverse dependencies can only be used with boolean or tristate + symbols. + +- numerical ranges: "range" <symbol> <symbol> ["if" <expr>] + This allows to limit the range of possible input values for int + and hex symbols. The user can only input a value which is larger than + or equal to the first symbol and smaller than or equal to the second + symbol. + +- help text: "help" or "---help---" + This defines a help text. The end of the help text is determined by + the indentation level, this means it ends at the first line which has + a smaller indentation than the first line of the help text. + "---help---" and "help" do not differ in behaviour, "---help---" is + used to help visually seperate configuration logic from help within + the file as an aid to developers. + + +Menu dependencies +----------------- + +Dependencies define the visibility of a menu entry and can also reduce +the input range of tristate symbols. The tristate logic used in the +expressions uses one more state than normal boolean logic to express the +module state. Dependency expressions have the following syntax: + +<expr> ::= <symbol> (1) + <symbol> '=' <symbol> (2) + <symbol> '!=' <symbol> (3) + '(' <expr> ')' (4) + '!' <expr> (5) + <expr> '&&' <expr> (6) + <expr> '||' <expr> (7) + +Expressions are listed in decreasing order of precedence. + +(1) Convert the symbol into an expression. Boolean and tristate symbols + are simply converted into the respective expression values. All + other symbol types result in 'n'. +(2) If the values of both symbols are equal, it returns 'y', + otherwise 'n'. +(3) If the values of both symbols are equal, it returns 'n', + otherwise 'y'. +(4) Returns the value of the expression. Used to override precedence. +(5) Returns the result of (2-/expr/). +(6) Returns the result of min(/expr/, /expr/). +(7) Returns the result of max(/expr/, /expr/). + +An expression can have a value of 'n', 'm' or 'y' (or 0, 1, 2 +respectively for calculations). A menu entry becomes visible when it's +expression evaluates to 'm' or 'y'. + +There are two types of symbols: constant and nonconstant symbols. +Nonconstant symbols are the most common ones and are defined with the +'config' statement. Nonconstant symbols consist entirely of alphanumeric +characters or underscores. +Constant symbols are only part of expressions. Constant symbols are +always surrounded by single or double quotes. Within the quote any +other character is allowed and the quotes can be escaped using '\'. + +Menu structure +-------------- + +The position of a menu entry in the tree is determined in two ways. First +it can be specified explicitly: + +menu "Network device support" + depends NET + +config NETDEVICES + ... + +endmenu + +All entries within the "menu" ... "endmenu" block become a submenu of +"Network device support". All subentries inherit the dependencies from +the menu entry, e.g. this means the dependency "NET" is added to the +dependency list of the config option NETDEVICES. + +The other way to generate the menu structure is done by analyzing the +dependencies. If a menu entry somehow depends on the previous entry, it +can be made a submenu of it. First, the previous (parent) symbol must +be part of the dependency list and then one of these two conditions +must be true: +- the child entry must become invisible, if the parent is set to 'n' +- the child entry must only be visible, if the parent is visible + +config MODULES + bool "Enable loadable module support" + +config MODVERSIONS + bool "Set version information on all module symbols" + depends MODULES + +comment "module support disabled" + depends !MODULES + +MODVERSIONS directly depends on MODULES, this means it's only visible if +MODULES is different from 'n'. The comment on the other hand is always +visible when MODULES is visible (the (empty) dependency of MODULES is +also part of the comment dependencies). + + +Kconfig syntax +-------------- + +The configuration file describes a series of menu entries, where every +line starts with a keyword (except help texts). The following keywords +end a menu entry: +- config +- menuconfig +- choice/endchoice +- comment +- menu/endmenu +- if/endif +- source +The first five also start the definition of a menu entry. + +config: + + "config" <symbol> + <config options> + +This defines a config symbol <symbol> and accepts any of above +attributes as options. + +menuconfig: + "menuconfig" <symbol> + <config options> + +This is similiar to the simple config entry above, but it also gives a +hint to front ends, that all suboptions should be displayed as a +separate list of options. + +choices: + + "choice" + <choice options> + <choice block> + "endchoice" + +This defines a choice group and accepts any of above attributes as +options. A choice can only be of type bool or tristate, while a boolean +choice only allows a single config entry to be selected, a tristate +choice also allows any number of config entries to be set to 'm'. This +can be used if multiple drivers for a single hardware exists and only a +single driver can be compiled/loaded into the kernel, but all drivers +can be compiled as modules. +A choice accepts another option "optional", which allows to set the +choice to 'n' and no entry needs to be selected. + +comment: + + "comment" <prompt> + <comment options> + +This defines a comment which is displayed to the user during the +configuration process and is also echoed to the output files. The only +possible options are dependencies. + +menu: + + "menu" <prompt> + <menu options> + <menu block> + "endmenu" + +This defines a menu block, see "Menu structure" above for more +information. The only possible options are dependencies. + +if: + + "if" <expr> + <if block> + "endif" + +This defines an if block. The dependency expression <expr> is appended +to all enclosed menu entries. + +source: + + "source" <prompt> + +This reads the specified configuration file. This file is always parsed. diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt new file mode 100644 index 000000000000..2616a58a5a4b --- /dev/null +++ b/Documentation/kbuild/makefiles.txt @@ -0,0 +1,1122 @@ +Linux Kernel Makefiles + +This document describes the Linux kernel Makefiles. + +=== Table of Contents + + === 1 Overview + === 2 Who does what + === 3 The kbuild files + --- 3.1 Goal definitions + --- 3.2 Built-in object goals - obj-y + --- 3.3 Loadable module goals - obj-m + --- 3.4 Objects which export symbols + --- 3.5 Library file goals - lib-y + --- 3.6 Descending down in directories + --- 3.7 Compilation flags + --- 3.8 Command line dependency + --- 3.9 Dependency tracking + --- 3.10 Special Rules + + === 4 Host Program support + --- 4.1 Simple Host Program + --- 4.2 Composite Host Programs + --- 4.3 Defining shared libraries + --- 4.4 Using C++ for host programs + --- 4.5 Controlling compiler options for host programs + --- 4.6 When host programs are actually built + --- 4.7 Using hostprogs-$(CONFIG_FOO) + + === 5 Kbuild clean infrastructure + + === 6 Architecture Makefiles + --- 6.1 Set variables to tweak the build to the architecture + --- 6.2 Add prerequisites to prepare: + --- 6.3 List directories to visit when descending + --- 6.4 Architecture specific boot images + --- 6.5 Building non-kbuild targets + --- 6.6 Commands useful for building a boot image + --- 6.7 Custom kbuild commands + --- 6.8 Preprocessing linker scripts + --- 6.9 $(CC) support functions + + === 7 Kbuild Variables + === 8 Makefile language + === 9 Credits + === 10 TODO + +=== 1 Overview + +The Makefiles have five parts: + + Makefile the top Makefile. + .config the kernel configuration file. + arch/$(ARCH)/Makefile the arch Makefile. + scripts/Makefile.* common rules etc. for all kbuild Makefiles. + kbuild Makefiles there are about 500 of these. + +The top Makefile reads the .config file, which comes from the kernel +configuration process. + +The top Makefile is responsible for building two major products: vmlinux +(the resident kernel image) and modules (any module files). +It builds these goals by recursively descending into the subdirectories of +the kernel source tree. +The list of subdirectories which are visited depends upon the kernel +configuration. The top Makefile textually includes an arch Makefile +with the name arch/$(ARCH)/Makefile. The arch Makefile supplies +architecture-specific information to the top Makefile. + +Each subdirectory has a kbuild Makefile which carries out the commands +passed down from above. The kbuild Makefile uses information from the +.config file to construct various file lists used by kbuild to build +any built-in or modular targets. + +scripts/Makefile.* contains all the definitions/rules etc. that +are used to build the kernel based on the kbuild makefiles. + + +=== 2 Who does what + +People have four different relationships with the kernel Makefiles. + +*Users* are people who build kernels. These people type commands such as +"make menuconfig" or "make". They usually do not read or edit +any kernel Makefiles (or any other source files). + +*Normal developers* are people who work on features such as device +drivers, file systems, and network protocols. These people need to +maintain the kbuild Makefiles for the subsystem that they are +working on. In order to do this effectively, they need some overall +knowledge about the kernel Makefiles, plus detailed knowledge about the +public interface for kbuild. + +*Arch developers* are people who work on an entire architecture, such +as sparc or ia64. Arch developers need to know about the arch Makefile +as well as kbuild Makefiles. + +*Kbuild developers* are people who work on the kernel build system itself. +These people need to know about all aspects of the kernel Makefiles. + +This document is aimed towards normal developers and arch developers. + + +=== 3 The kbuild files + +Most Makefiles within the kernel are kbuild Makefiles that use the +kbuild infrastructure. This chapter introduce the syntax used in the +kbuild makefiles. +The preferred name for the kbuild files is 'Kbuild' but 'Makefile' will +continue to be supported. All new developmen is expected to use the +Kbuild filename. + +Section 3.1 "Goal definitions" is a quick intro, further chapters provide +more details, with real examples. + +--- 3.1 Goal definitions + + Goal definitions are the main part (heart) of the kbuild Makefile. + These lines define the files to be built, any special compilation + options, and any subdirectories to be entered recursively. + + The most simple kbuild makefile contains one line: + + Example: + obj-y += foo.o + + This tell kbuild that there is one object in that directory named + foo.o. foo.o will be built from foo.c or foo.S. + + If foo.o shall be built as a module, the variable obj-m is used. + Therefore the following pattern is often used: + + Example: + obj-$(CONFIG_FOO) += foo.o + + $(CONFIG_FOO) evaluates to either y (for built-in) or m (for module). + If CONFIG_FOO is neither y nor m, then the file will not be compiled + nor linked. + +--- 3.2 Built-in object goals - obj-y + + The kbuild Makefile specifies object files for vmlinux + in the lists $(obj-y). These lists depend on the kernel + configuration. + + Kbuild compiles all the $(obj-y) files. It then calls + "$(LD) -r" to merge these files into one built-in.o file. + built-in.o is later linked into vmlinux by the parent Makefile. + + The order of files in $(obj-y) is significant. Duplicates in + the lists are allowed: the first instance will be linked into + built-in.o and succeeding instances will be ignored. + + Link order is significant, because certain functions + (module_init() / __initcall) will be called during boot in the + order they appear. So keep in mind that changing the link + order may e.g. change the order in which your SCSI + controllers are detected, and thus you disks are renumbered. + + Example: + #drivers/isdn/i4l/Makefile + # Makefile for the kernel ISDN subsystem and device drivers. + # Each configuration option enables a list of files. + obj-$(CONFIG_ISDN) += isdn.o + obj-$(CONFIG_ISDN_PPP_BSDCOMP) += isdn_bsdcomp.o + +--- 3.3 Loadable module goals - obj-m + + $(obj-m) specify object files which are built as loadable + kernel modules. + + A module may be built from one source file or several source + files. In the case of one source file, the kbuild makefile + simply adds the file to $(obj-m). + + Example: + #drivers/isdn/i4l/Makefile + obj-$(CONFIG_ISDN_PPP_BSDCOMP) += isdn_bsdcomp.o + + Note: In this example $(CONFIG_ISDN_PPP_BSDCOMP) evaluates to 'm' + + If a kernel module is built from several source files, you specify + that you want to build a module in the same way as above. + + Kbuild needs to know which the parts that you want to build your + module from, so you have to tell it by setting an + $(<module_name>-objs) variable. + + Example: + #drivers/isdn/i4l/Makefile + obj-$(CONFIG_ISDN) += isdn.o + isdn-objs := isdn_net_lib.o isdn_v110.o isdn_common.o + + In this example, the module name will be isdn.o. Kbuild will + compile the objects listed in $(isdn-objs) and then run + "$(LD) -r" on the list of these files to generate isdn.o. + + Kbuild recognises objects used for composite objects by the suffix + -objs, and the suffix -y. This allows the Makefiles to use + the value of a CONFIG_ symbol to determine if an object is part + of a composite object. + + Example: + #fs/ext2/Makefile + obj-$(CONFIG_EXT2_FS) += ext2.o + ext2-y := balloc.o bitmap.o + ext2-$(CONFIG_EXT2_FS_XATTR) += xattr.o + + In this example xattr.o is only part of the composite object + ext2.o, if $(CONFIG_EXT2_FS_XATTR) evaluates to 'y'. + + Note: Of course, when you are building objects into the kernel, + the syntax above will also work. So, if you have CONFIG_EXT2_FS=y, + kbuild will build an ext2.o file for you out of the individual + parts and then link this into built-in.o, as you would expect. + +--- 3.4 Objects which export symbols + + No special notation is required in the makefiles for + modules exporting symbols. + +--- 3.5 Library file goals - lib-y + + Objects listed with obj-* are used for modules or + combined in a built-in.o for that specific directory. + There is also the possibility to list objects that will + be included in a library, lib.a. + All objects listed with lib-y are combined in a single + library for that directory. + Objects that are listed in obj-y and additional listed in + lib-y will not be included in the library, since they will anyway + be accessible. + For consistency objects listed in lib-m will be included in lib.a. + + Note that the same kbuild makefile may list files to be built-in + and to be part of a library. Therefore the same directory + may contain both a built-in.o and a lib.a file. + + Example: + #arch/i386/lib/Makefile + lib-y := checksum.o delay.o + + This will create a library lib.a based on checksum.o and delay.o. + For kbuild to actually recognize that there is a lib.a being build + the directory shall be listed in libs-y. + See also "6.3 List directories to visit when descending". + + Usage of lib-y is normally restricted to lib/ and arch/*/lib. + +--- 3.6 Descending down in directories + + A Makefile is only responsible for building objects in its own + directory. Files in subdirectories should be taken care of by + Makefiles in these subdirs. The build system will automatically + invoke make recursively in subdirectories, provided you let it know of + them. + + To do so obj-y and obj-m are used. + ext2 lives in a separate directory, and the Makefile present in fs/ + tells kbuild to descend down using the following assignment. + + Example: + #fs/Makefile + obj-$(CONFIG_EXT2_FS) += ext2/ + + If CONFIG_EXT2_FS is set to either 'y' (built-in) or 'm' (modular) + the corresponding obj- variable will be set, and kbuild will descend + down in the ext2 directory. + Kbuild only uses this information to decide that it needs to visit + the directory, it is the Makefile in the subdirectory that + specifies what is modules and what is built-in. + + It is good practice to use a CONFIG_ variable when assigning directory + names. This allows kbuild to totally skip the directory if the + corresponding CONFIG_ option is neither 'y' nor 'm'. + +--- 3.7 Compilation flags + + EXTRA_CFLAGS, EXTRA_AFLAGS, EXTRA_LDFLAGS, EXTRA_ARFLAGS + + All the EXTRA_ variables apply only to the kbuild makefile + where they are assigned. The EXTRA_ variables apply to all + commands executed in the kbuild makefile. + + $(EXTRA_CFLAGS) specifies options for compiling C files with + $(CC). + + Example: + # drivers/sound/emu10k1/Makefile + EXTRA_CFLAGS += -I$(obj) + ifdef DEBUG + EXTRA_CFLAGS += -DEMU10K1_DEBUG + endif + + + This variable is necessary because the top Makefile owns the + variable $(CFLAGS) and uses it for compilation flags for the + entire tree. + + $(EXTRA_AFLAGS) is a similar string for per-directory options + when compiling assembly language source. + + Example: + #arch/x86_64/kernel/Makefile + EXTRA_AFLAGS := -traditional + + + $(EXTRA_LDFLAGS) and $(EXTRA_ARFLAGS) are similar strings for + per-directory options to $(LD) and $(AR). + + Example: + #arch/m68k/fpsp040/Makefile + EXTRA_LDFLAGS := -x + + CFLAGS_$@, AFLAGS_$@ + + CFLAGS_$@ and AFLAGS_$@ only apply to commands in current + kbuild makefile. + + $(CFLAGS_$@) specifies per-file options for $(CC). The $@ + part has a literal value which specifies the file that it is for. + + Example: + # drivers/scsi/Makefile + CFLAGS_aha152x.o = -DAHA152X_STAT -DAUTOCONF + CFLAGS_gdth.o = # -DDEBUG_GDTH=2 -D__SERIAL__ -D__COM2__ \ + -DGDTH_STATISTICS + CFLAGS_seagate.o = -DARBITRATE -DPARITY -DSEAGATE_USE_ASM + + These three lines specify compilation flags for aha152x.o, + gdth.o, and seagate.o + + $(AFLAGS_$@) is a similar feature for source files in assembly + languages. + + Example: + # arch/arm/kernel/Makefile + AFLAGS_head-armv.o := -DTEXTADDR=$(TEXTADDR) -traditional + AFLAGS_head-armo.o := -DTEXTADDR=$(TEXTADDR) -traditional + +--- 3.9 Dependency tracking + + Kbuild tracks dependencies on the following: + 1) All prerequisite files (both *.c and *.h) + 2) CONFIG_ options used in all prerequisite files + 3) Command-line used to compile target + + Thus, if you change an option to $(CC) all affected files will + be re-compiled. + +--- 3.10 Special Rules + + Special rules are used when the kbuild infrastructure does + not provide the required support. A typical example is + header files generated during the build process. + Another example is the architecture specific Makefiles which + needs special rules to prepare boot images etc. + + Special rules are written as normal Make rules. + Kbuild is not executing in the directory where the Makefile is + located, so all special rules shall provide a relative + path to prerequisite files and target files. + + Two variables are used when defining special rules: + + $(src) + $(src) is a relative path which points to the directory + where the Makefile is located. Always use $(src) when + referring to files located in the src tree. + + $(obj) + $(obj) is a relative path which points to the directory + where the target is saved. Always use $(obj) when + referring to generated files. + + Example: + #drivers/scsi/Makefile + $(obj)/53c8xx_d.h: $(src)/53c7,8xx.scr $(src)/script_asm.pl + $(CPP) -DCHIP=810 - < $< | ... $(src)/script_asm.pl + + This is a special rule, following the normal syntax + required by make. + The target file depends on two prerequisite files. References + to the target file are prefixed with $(obj), references + to prerequisites are referenced with $(src) (because they are not + generated files). + + +=== 4 Host Program support + +Kbuild supports building executables on the host for use during the +compilation stage. +Two steps are required in order to use a host executable. + +The first step is to tell kbuild that a host program exists. This is +done utilising the variable hostprogs-y. + +The second step is to add an explicit dependency to the executable. +This can be done in two ways. Either add the dependency in a rule, +or utilise the variable $(always). +Both possibilities are described in the following. + +--- 4.1 Simple Host Program + + In some cases there is a need to compile and run a program on the + computer where the build is running. + The following line tells kbuild that the program bin2hex shall be + built on the build host. + + Example: + hostprogs-y := bin2hex + + Kbuild assumes in the above example that bin2hex is made from a single + c-source file named bin2hex.c located in the same directory as + the Makefile. + +--- 4.2 Composite Host Programs + + Host programs can be made up based on composite objects. + The syntax used to define composite objects for host programs is + similar to the syntax used for kernel objects. + $(<executeable>-objs) list all objects used to link the final + executable. + + Example: + #scripts/lxdialog/Makefile + hostprogs-y := lxdialog + lxdialog-objs := checklist.o lxdialog.o + + Objects with extension .o are compiled from the corresponding .c + files. In the above example checklist.c is compiled to checklist.o + and lxdialog.c is compiled to lxdialog.o. + Finally the two .o files are linked to the executable, lxdialog. + Note: The syntax <executable>-y is not permitted for host-programs. + +--- 4.3 Defining shared libraries + + Objects with extension .so are considered shared libraries, and + will be compiled as position independent objects. + Kbuild provides support for shared libraries, but the usage + shall be restricted. + In the following example the libkconfig.so shared library is used + to link the executable conf. + + Example: + #scripts/kconfig/Makefile + hostprogs-y := conf + conf-objs := conf.o libkconfig.so + libkconfig-objs := expr.o type.o + + Shared libraries always require a corresponding -objs line, and + in the example above the shared library libkconfig is composed by + the two objects expr.o and type.o. + expr.o and type.o will be built as position independent code and + linked as a shared library libkconfig.so. C++ is not supported for + shared libraries. + +--- 4.4 Using C++ for host programs + + kbuild offers support for host programs written in C++. This was + introduced solely to support kconfig, and is not recommended + for general use. + + Example: + #scripts/kconfig/Makefile + hostprogs-y := qconf + qconf-cxxobjs := qconf.o + + In the example above the executable is composed of the C++ file + qconf.cc - identified by $(qconf-cxxobjs). + + If qconf is composed by a mixture of .c and .cc files, then an + additional line can be used to identify this. + + Example: + #scripts/kconfig/Makefile + hostprogs-y := qconf + qconf-cxxobjs := qconf.o + qconf-objs := check.o + +--- 4.5 Controlling compiler options for host programs + + When compiling host programs, it is possible to set specific flags. + The programs will always be compiled utilising $(HOSTCC) passed + the options specified in $(HOSTCFLAGS). + To set flags that will take effect for all host programs created + in that Makefile use the variable HOST_EXTRACFLAGS. + + Example: + #scripts/lxdialog/Makefile + HOST_EXTRACFLAGS += -I/usr/include/ncurses + + To set specific flags for a single file the following construction + is used: + + Example: + #arch/ppc64/boot/Makefile + HOSTCFLAGS_piggyback.o := -DKERNELBASE=$(KERNELBASE) + + It is also possible to specify additional options to the linker. + + Example: + #scripts/kconfig/Makefile + HOSTLOADLIBES_qconf := -L$(QTDIR)/lib + + When linking qconf it will be passed the extra option "-L$(QTDIR)/lib". + +--- 4.6 When host programs are actually built + + Kbuild will only build host-programs when they are referenced + as a prerequisite. + This is possible in two ways: + + (1) List the prerequisite explicitly in a special rule. + + Example: + #drivers/pci/Makefile + hostprogs-y := gen-devlist + $(obj)/devlist.h: $(src)/pci.ids $(obj)/gen-devlist + ( cd $(obj); ./gen-devlist ) < $< + + The target $(obj)/devlist.h will not be built before + $(obj)/gen-devlist is updated. Note that references to + the host programs in special rules must be prefixed with $(obj). + + (2) Use $(always) + When there is no suitable special rule, and the host program + shall be built when a makefile is entered, the $(always) + variable shall be used. + + Example: + #scripts/lxdialog/Makefile + hostprogs-y := lxdialog + always := $(hostprogs-y) + + This will tell kbuild to build lxdialog even if not referenced in + any rule. + +--- 4.7 Using hostprogs-$(CONFIG_FOO) + + A typcal pattern in a Kbuild file lok like this: + + Example: + #scripts/Makefile + hostprogs-$(CONFIG_KALLSYMS) += kallsyms + + Kbuild knows about both 'y' for built-in and 'm' for module. + So if a config symbol evaluate to 'm', kbuild will still build + the binary. In other words Kbuild handle hostprogs-m exactly + like hostprogs-y. But only hostprogs-y is recommend used + when no CONFIG symbol are involved. + +=== 5 Kbuild clean infrastructure + +"make clean" deletes most generated files in the src tree where the kernel +is compiled. This includes generated files such as host programs. +Kbuild knows targets listed in $(hostprogs-y), $(hostprogs-m), $(always), +$(extra-y) and $(targets). They are all deleted during "make clean". +Files matching the patterns "*.[oas]", "*.ko", plus some additional files +generated by kbuild are deleted all over the kernel src tree when +"make clean" is executed. + +Additional files can be specified in kbuild makefiles by use of $(clean-files). + + Example: + #drivers/pci/Makefile + clean-files := devlist.h classlist.h + +When executing "make clean", the two files "devlist.h classlist.h" will +be deleted. Kbuild will assume files to be in same relative directory as the +Makefile except if an absolute path is specified (path starting with '/'). + +To delete a directory hirachy use: + Example: + #scripts/package/Makefile + clean-dirs := $(objtree)/debian/ + +This will delete the directory debian, including all subdirectories. +Kbuild will assume the directories to be in the same relative path as the +Makefile if no absolute path is specified (path does not start with '/'). + +Usually kbuild descends down in subdirectories due to "obj-* := dir/", +but in the architecture makefiles where the kbuild infrastructure +is not sufficient this sometimes needs to be explicit. + + Example: + #arch/i386/boot/Makefile + subdir- := compressed/ + +The above assignment instructs kbuild to descend down in the +directory compressed/ when "make clean" is executed. + +To support the clean infrastructure in the Makefiles that builds the +final bootimage there is an optional target named archclean: + + Example: + #arch/i386/Makefile + archclean: + $(Q)$(MAKE) $(clean)=arch/i386/boot + +When "make clean" is executed, make will descend down in arch/i386/boot, +and clean as usual. The Makefile located in arch/i386/boot/ may use +the subdir- trick to descend further down. + +Note 1: arch/$(ARCH)/Makefile cannot use "subdir-", because that file is +included in the top level makefile, and the kbuild infrastructure +is not operational at that point. + +Note 2: All directories listed in core-y, libs-y, drivers-y and net-y will +be visited during "make clean". + +=== 6 Architecture Makefiles + +The top level Makefile sets up the environment and does the preparation, +before starting to descend down in the individual directories. +The top level makefile contains the generic part, whereas the +arch/$(ARCH)/Makefile contains what is required to set-up kbuild +to the said architecture. +To do so arch/$(ARCH)/Makefile sets a number of variables, and defines +a few targets. + +When kbuild executes the following steps are followed (roughly): +1) Configuration of the kernel => produced .config +2) Store kernel version in include/linux/version.h +3) Symlink include/asm to include/asm-$(ARCH) +4) Updating all other prerequisites to the target prepare: + - Additional prerequisites are specified in arch/$(ARCH)/Makefile +5) Recursively descend down in all directories listed in + init-* core* drivers-* net-* libs-* and build all targets. + - The value of the above variables are extended in arch/$(ARCH)/Makefile. +6) All object files are then linked and the resulting file vmlinux is + located at the root of the src tree. + The very first objects linked are listed in head-y, assigned by + arch/$(ARCH)/Makefile. +7) Finally the architecture specific part does any required post processing + and builds the final bootimage. + - This includes building boot records + - Preparing initrd images and the like + + +--- 6.1 Set variables to tweak the build to the architecture + + LDFLAGS Generic $(LD) options + + Flags used for all invocations of the linker. + Often specifying the emulation is sufficient. + + Example: + #arch/s390/Makefile + LDFLAGS := -m elf_s390 + Note: EXTRA_LDFLAGS and LDFLAGS_$@ can be used to further customise + the flags used. See chapter 7. + + LDFLAGS_MODULE Options for $(LD) when linking modules + + LDFLAGS_MODULE is used to set specific flags for $(LD) when + linking the .ko files used for modules. + Default is "-r", for relocatable output. + + LDFLAGS_vmlinux Options for $(LD) when linking vmlinux + + LDFLAGS_vmlinux is used to specify additional flags to pass to + the linker when linking the final vmlinux. + LDFLAGS_vmlinux uses the LDFLAGS_$@ support. + + Example: + #arch/i386/Makefile + LDFLAGS_vmlinux := -e stext + + OBJCOPYFLAGS objcopy flags + + When $(call if_changed,objcopy) is used to translate a .o file, + then the flags specified in OBJCOPYFLAGS will be used. + $(call if_changed,objcopy) is often used to generate raw binaries on + vmlinux. + + Example: + #arch/s390/Makefile + OBJCOPYFLAGS := -O binary + + #arch/s390/boot/Makefile + $(obj)/image: vmlinux FORCE + $(call if_changed,objcopy) + + In this example the binary $(obj)/image is a binary version of + vmlinux. The usage of $(call if_changed,xxx) will be described later. + + AFLAGS $(AS) assembler flags + + Default value - see top level Makefile + Append or modify as required per architecture. + + Example: + #arch/sparc64/Makefile + AFLAGS += -m64 -mcpu=ultrasparc + + CFLAGS $(CC) compiler flags + + Default value - see top level Makefile + Append or modify as required per architecture. + + Often the CFLAGS variable depends on the configuration. + + Example: + #arch/i386/Makefile + cflags-$(CONFIG_M386) += -march=i386 + CFLAGS += $(cflags-y) + + Many arch Makefiles dynamically run the target C compiler to + probe supported options: + + #arch/i386/Makefile + + ... + cflags-$(CONFIG_MPENTIUMII) += $(call cc-option,\ + -march=pentium2,-march=i686) + ... + # Disable unit-at-a-time mode ... + CFLAGS += $(call cc-option,-fno-unit-at-a-time) + ... + + + The first examples utilises the trick that a config option expands + to 'y' when selected. + + CFLAGS_KERNEL $(CC) options specific for built-in + + $(CFLAGS_KERNEL) contains extra C compiler flags used to compile + resident kernel code. + + CFLAGS_MODULE $(CC) options specific for modules + + $(CFLAGS_MODULE) contains extra C compiler flags used to compile code + for loadable kernel modules. + + +--- 6.2 Add prerequisites to prepare: + + The prepare: rule is used to list prerequisites that needs to be + built before starting to descend down in the subdirectories. + This is usual header files containing assembler constants. + + Example: + #arch/s390/Makefile + prepare: include/asm-$(ARCH)/offsets.h + + In this example the file include/asm-$(ARCH)/offsets.h will + be built before descending down in the subdirectories. + See also chapter XXX-TODO that describe how kbuild supports + generating offset header files. + + +--- 6.3 List directories to visit when descending + + An arch Makefile cooperates with the top Makefile to define variables + which specify how to build the vmlinux file. Note that there is no + corresponding arch-specific section for modules; the module-building + machinery is all architecture-independent. + + + head-y, init-y, core-y, libs-y, drivers-y, net-y + + $(head-y) list objects to be linked first in vmlinux. + $(libs-y) list directories where a lib.a archive can be located. + The rest list directories where a built-in.o object file can be located. + + $(init-y) objects will be located after $(head-y). + Then the rest follows in this order: + $(core-y), $(libs-y), $(drivers-y) and $(net-y). + + The top level Makefile define values for all generic directories, + and arch/$(ARCH)/Makefile only adds architecture specific directories. + + Example: + #arch/sparc64/Makefile + core-y += arch/sparc64/kernel/ + libs-y += arch/sparc64/prom/ arch/sparc64/lib/ + drivers-$(CONFIG_OPROFILE) += arch/sparc64/oprofile/ + + +--- 6.4 Architecture specific boot images + + An arch Makefile specifies goals that take the vmlinux file, compress + it, wrap it in bootstrapping code, and copy the resulting files + somewhere. This includes various kinds of installation commands. + The actual goals are not standardized across architectures. + + It is common to locate any additional processing in a boot/ + directory below arch/$(ARCH)/. + + Kbuild does not provide any smart way to support building a + target specified in boot/. Therefore arch/$(ARCH)/Makefile shall + call make manually to build a target in boot/. + + The recommended approach is to include shortcuts in + arch/$(ARCH)/Makefile, and use the full path when calling down + into the arch/$(ARCH)/boot/Makefile. + + Example: + #arch/i386/Makefile + boot := arch/i386/boot + bzImage: vmlinux + $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@ + + "$(Q)$(MAKE) $(build)=<dir>" is the recommended way to invoke + make in a subdirectory. + + There are no rules for naming of the architecture specific targets, + but executing "make help" will list all relevant targets. + To support this $(archhelp) must be defined. + + Example: + #arch/i386/Makefile + define archhelp + echo '* bzImage - Image (arch/$(ARCH)/boot/bzImage)' + endef + + When make is executed without arguments, the first goal encountered + will be built. In the top level Makefile the first goal present + is all:. + An architecture shall always per default build a bootable image. + In "make help" the default goal is highlighted with a '*'. + Add a new prerequisite to all: to select a default goal different + from vmlinux. + + Example: + #arch/i386/Makefile + all: bzImage + + When "make" is executed without arguments, bzImage will be built. + +--- 6.5 Building non-kbuild targets + + extra-y + + extra-y specify additional targets created in the current + directory, in addition to any targets specified by obj-*. + + Listing all targets in extra-y is required for two purposes: + 1) Enable kbuild to check changes in command lines + - When $(call if_changed,xxx) is used + 2) kbuild knows what files to delete during "make clean" + + Example: + #arch/i386/kernel/Makefile + extra-y := head.o init_task.o + + In this example extra-y is used to list object files that + shall be built, but shall not be linked as part of built-in.o. + + +--- 6.6 Commands useful for building a boot image + + Kbuild provides a few macros that are useful when building a + boot image. + + if_changed + + if_changed is the infrastructure used for the following commands. + + Usage: + target: source(s) FORCE + $(call if_changed,ld/objcopy/gzip) + + When the rule is evaluated it is checked to see if any files + needs an update, or the commandline has changed since last + invocation. The latter will force a rebuild if any options + to the executable have changed. + Any target that utilises if_changed must be listed in $(targets), + otherwise the command line check will fail, and the target will + always be built. + Assignments to $(targets) are without $(obj)/ prefix. + if_changed may be used in conjunction with custom commands as + defined in 6.7 "Custom kbuild commands". + Note: It is a typical mistake to forget the FORCE prerequisite. + + ld + Link target. Often LDFLAGS_$@ is used to set specific options to ld. + + objcopy + Copy binary. Uses OBJCOPYFLAGS usually specified in + arch/$(ARCH)/Makefile. + OBJCOPYFLAGS_$@ may be used to set additional options. + + gzip + Compress target. Use maximum compression to compress target. + + Example: + #arch/i386/boot/Makefile + LDFLAGS_bootsect := -Ttext 0x0 -s --oformat binary + LDFLAGS_setup := -Ttext 0x0 -s --oformat binary -e begtext + + targets += setup setup.o bootsect bootsect.o + $(obj)/setup $(obj)/bootsect: %: %.o FORCE + $(call if_changed,ld) + + In this example there are two possible targets, requiring different + options to the linker. the linker options are specified using the + LDFLAGS_$@ syntax - one for each potential target. + $(targets) are assinged all potential targets, herby kbuild knows + the targets and will: + 1) check for commandline changes + 2) delete target during make clean + + The ": %: %.o" part of the prerequisite is a shorthand that + free us from listing the setup.o and bootsect.o files. + Note: It is a common mistake to forget the "target :=" assignment, + resulting in the target file being recompiled for no + obvious reason. + + +--- 6.7 Custom kbuild commands + + When kbuild is executing with KBUILD_VERBOSE=0 then only a shorthand + of a command is normally displayed. + To enable this behaviour for custom commands kbuild requires + two variables to be set: + quiet_cmd_<command> - what shall be echoed + cmd_<command> - the command to execute + + Example: + # + quiet_cmd_image = BUILD $@ + cmd_image = $(obj)/tools/build $(BUILDFLAGS) \ + $(obj)/vmlinux.bin > $@ + + targets += bzImage + $(obj)/bzImage: $(obj)/vmlinux.bin $(obj)/tools/build FORCE + $(call if_changed,image) + @echo 'Kernel: $@ is ready' + + When updating the $(obj)/bzImage target the line: + + BUILD arch/i386/boot/bzImage + + will be displayed with "make KBUILD_VERBOSE=0". + + +--- 6.8 Preprocessing linker scripts + + When the vmlinux image is build the linker script: + arch/$(ARCH)/kernel/vmlinux.lds is used. + The script is a preprocessed variant of the file vmlinux.lds.S + located in the same directory. + kbuild knows .lds file and includes a rule *lds.S -> *lds. + + Example: + #arch/i386/kernel/Makefile + always := vmlinux.lds + + #Makefile + export CPPFLAGS_vmlinux.lds += -P -C -U$(ARCH) + + The assigment to $(always) is used to tell kbuild to build the + target: vmlinux.lds. + The assignment to $(CPPFLAGS_vmlinux.lds) tell kbuild to use the + specified options when building the target vmlinux.lds. + + When building the *.lds target kbuild used the variakles: + CPPFLAGS : Set in top-level Makefile + EXTRA_CPPFLAGS : May be set in the kbuild makefile + CPPFLAGS_$(@F) : Target specific flags. + Note that the full filename is used in this + assignment. + + The kbuild infrastructure for *lds file are used in several + architecture specific files. + + +--- 6.9 $(CC) support functions + + The kernel may be build with several different versions of + $(CC), each supporting a unique set of features and options. + kbuild provide basic support to check for valid options for $(CC). + $(CC) is useally the gcc compiler, but other alternatives are + available. + + cc-option + cc-option is used to check if $(CC) support a given option, and not + supported to use an optional second option. + + Example: + #arch/i386/Makefile + cflags-y += $(call cc-option,-march=pentium-mmx,-march=i586) + + In the above example cflags-y will be assigned the option + -march=pentium-mmx if supported by $(CC), otherwise -march-i586. + The second argument to cc-option is optional, and if omitted + cflags-y will be assigned no value if first option is not supported. + + cc-option-yn + cc-option-yn is used to check if gcc supports a given option + and return 'y' if supported, otherwise 'n'. + + Example: + #arch/ppc/Makefile + biarch := $(call cc-option-yn, -m32) + aflags-$(biarch) += -a32 + cflags-$(biarch) += -m32 + + In the above example $(biarch) is set to y if $(CC) supports the -m32 + option. When $(biarch) equals to y the expanded variables $(aflags-y) + and $(cflags-y) will be assigned the values -a32 and -m32. + + cc-option-align + gcc version >= 3.0 shifted type of options used to speify + alignment of functions, loops etc. $(cc-option-align) whrn used + as prefix to the align options will select the right prefix: + gcc < 3.00 + cc-option-align = -malign + gcc >= 3.00 + cc-option-align = -falign + + Example: + CFLAGS += $(cc-option-align)-functions=4 + + In the above example the option -falign-functions=4 is used for + gcc >= 3.00. For gcc < 3.00 -malign-functions=4 is used. + + cc-version + cc-version return a numerical version of the $(CC) compiler version. + The format is <major><minor> where both are two digits. So for example + gcc 3.41 would return 0341. + cc-version is useful when a specific $(CC) version is faulty in one + area, for example the -mregparm=3 were broken in some gcc version + even though the option was accepted by gcc. + + Example: + #arch/i386/Makefile + GCC_VERSION := $(call cc-version) + cflags-y += $(shell \ + if [ $(GCC_VERSION) -ge 0300 ] ; then echo "-mregparm=3"; fi ;) + + In the above example -mregparm=3 is only used for gcc version greater + than or equal to gcc 3.0. + + +=== 7 Kbuild Variables + +The top Makefile exports the following variables: + + VERSION, PATCHLEVEL, SUBLEVEL, EXTRAVERSION + + These variables define the current kernel version. A few arch + Makefiles actually use these values directly; they should use + $(KERNELRELEASE) instead. + + $(VERSION), $(PATCHLEVEL), and $(SUBLEVEL) define the basic + three-part version number, such as "2", "4", and "0". These three + values are always numeric. + + $(EXTRAVERSION) defines an even tinier sublevel for pre-patches + or additional patches. It is usually some non-numeric string + such as "-pre4", and is often blank. + + KERNELRELEASE + + $(KERNELRELEASE) is a single string such as "2.4.0-pre4", suitable + for constructing installation directory names or showing in + version strings. Some arch Makefiles use it for this purpose. + + ARCH + + This variable defines the target architecture, such as "i386", + "arm", or "sparc". Some kbuild Makefiles test $(ARCH) to + determine which files to compile. + + By default, the top Makefile sets $(ARCH) to be the same as the + host system architecture. For a cross build, a user may + override the value of $(ARCH) on the command line: + + make ARCH=m68k ... + + + INSTALL_PATH + + This variable defines a place for the arch Makefiles to install + the resident kernel image and System.map file. + Use this for architecture specific install targets. + + INSTALL_MOD_PATH, MODLIB + + $(INSTALL_MOD_PATH) specifies a prefix to $(MODLIB) for module + installation. This variable is not defined in the Makefile but + may be passed in by the user if desired. + + $(MODLIB) specifies the directory for module installation. + The top Makefile defines $(MODLIB) to + $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE). The user may + override this value on the command line if desired. + +=== 8 Makefile language + +The kernel Makefiles are designed to run with GNU Make. The Makefiles +use only the documented features of GNU Make, but they do use many +GNU extensions. + +GNU Make supports elementary list-processing functions. The kernel +Makefiles use a novel style of list building and manipulation with few +"if" statements. + +GNU Make has two assignment operators, ":=" and "=". ":=" performs +immediate evaluation of the right-hand side and stores an actual string +into the left-hand side. "=" is like a formula definition; it stores the +right-hand side in an unevaluated form and then evaluates this form each +time the left-hand side is used. + +There are some cases where "=" is appropriate. Usually, though, ":=" +is the right choice. + +=== 9 Credits + +Original version made by Michael Elizabeth Chastain, <mailto:mec@shout.net> +Updates by Kai Germaschewski <kai@tp1.ruhr-uni-bochum.de> +Updates by Sam Ravnborg <sam@ravnborg.org> + +=== 10 TODO + +- Describe how kbuild support shipped files with _shipped. +- Generating offset header files. +- Add more variables to section 7? + diff --git a/Documentation/kbuild/modules.txt b/Documentation/kbuild/modules.txt new file mode 100644 index 000000000000..c91caf7eb303 --- /dev/null +++ b/Documentation/kbuild/modules.txt @@ -0,0 +1,419 @@ + +In this document you will find information about: +- how to build external modules +- how to make your module use kbuild infrastructure +- how kbuild will install a kernel +- how to install modules in a non-standard location + +=== Table of Contents + + === 1 Introduction + === 2 How to build external modules + --- 2.1 Building external modules + --- 2.2 Available targets + --- 2.3 Available options + --- 2.4 Preparing the kernel tree for module build + === 3. Example commands + === 4. Creating a kbuild file for an external module + === 5. Include files + --- 5.1 How to include files from the kernel include dir + --- 5.2 External modules using an include/ dir + === 6. Module installation + --- 6.1 INSTALL_MOD_PATH + --- 6.2 INSTALL_MOD_DIR + === 7. Module versioning + === 8. Tips & Tricks + --- 8.1 Testing for CONFIG_FOO_BAR + + + +=== 1. Introduction + +kbuild includes functionality for building modules both +within the kernel source tree and outside the kernel source tree. +The latter is usually referred to as external modules and is used +both during development and for modules that are not planned to be +included in the kernel tree. + +What is covered within this file is mainly information to authors +of modules. The author of an external modules should supply +a makefile that hides most of the complexity so one only has to type +'make' to buld the module. A complete example will be present in +chapter ¤. Creating a kbuild file for an external module". + + +=== 2. How to build external modules + +kbuild offers functionality to build external modules, with the +prerequisite that there is a pre-built kernel available with full source. +A subset of the targets available when building the kernel is available +when building an external module. + +--- 2.1 Building external modules + + Use the following command to build an external module: + + make -C <path-to-kernel> M=`pwd` + + For the running kernel use: + make -C /lib/modules/`uname -r`/build M=`pwd` + + For the above command to succeed the kernel must have been built with + modules enabled. + + To install the modules that were just built: + + make -C <path-to-kernel> M=`pwd` modules_install + + More complex examples later, the above should get you going. + +--- 2.2 Available targets + + $KDIR refers to path to kernel source top-level directory + + make -C $KDIR M=`pwd` + Will build the module(s) located in current directory. + All output files will be located in the same directory + as the module source. + No attempts are made to update the kernel source, and it is + a precondition that a successful make has been executed + for the kernel. + + make -C $KDIR M=`pwd` modules + The modules target is implied when no target is given. + Same functionality as if no target was specified. + See description above. + + make -C $KDIR M=$PWD modules_install + Install the external module(s). + Installation default is in /lib/modules/<kernel-version>/extra, + but may be prefixed with INSTALL_MOD_PATH - see separate chater. + + make -C $KDIR M=$PWD clean + Remove all generated files for the module - the kernel + source directory is not moddified. + + make -C $KDIR M=`pwd` help + help will list the available target when building external + modules. + +--- 2.3 Available options: + + $KDIR refer to path to kernel src + + make -C $KDIR + Used to specify where to find the kernel source. + '$KDIR' represent the directory where the kernel source is. + Make will actually change directory to the specified directory + when executed but change back when finished. + + make -C $KDIR M=`pwd` + M= is used to tell kbuild that an external module is + being built. + The option given to M= is the directory where the external + module (kbuild file) is located. + When an external module is being built only a subset of the + usual targets are available. + + make -C $KDIR SUBDIRS=`pwd` + Same as M=. The SUBDIRS= syntax is kept for backwards + compatibility. + +--- 2.4 Preparing the kernel tree for module build + + To make sure the kernel contains the information required to + build external modules the target 'modules_prepare' must be used. + 'module_prepare' solely exists as a simple way to prepare + a kernel for building external modules. + Note: modules_prepare will not build Module.symvers even if + CONFIG_MODULEVERSIONING is set. + Therefore a full kernel build needs to be executed to make + module versioning work. + + +=== 3. Example commands + +This example shows the actual commands to be executed when building +an external module for the currently running kernel. +In the example below the distribution is supposed to use the +facility to locate output files for a kernel compile in a different +directory than the kernel source - but the examples will also work +when the source and the output files are mixed in the same directory. + +# Kernel source +/lib/modules/<kernel-version>/source -> /usr/src/linux-<version> + +# Output from kernel compile +/lib/modules/<kernel-version>/build -> /usr/src/linux-<version>-up + +Change to the directory where the kbuild file is located and execute +the following commands to build the module: + + cd /home/user/src/module + make -C /usr/src/`uname -r`/source \ + O=/lib/modules/`uname-r`/build \ + M=`pwd` + +Then to install the module use the following command: + + make -C /usr/src/`uname -r`/source \ + O=/lib/modules/`uname-r`/build \ + M=`pwd` \ + modules_install + +If one looks closely you will see that this is the same commands as +listed before - with the directories spelled out. + +The above are rather long commands, and the following chapter +lists a few tricks to make it all easier. + + +=== 4. Creating a kbuild file for an external module + +kbuild is the build system for the kernel, and external modules +must use kbuild to stay compatible with changes in the build system +and to pick up the right flags to gcc etc. + +The kbuild file used as input shall follow the syntax described +in Documentation/kbuild/makefiles.txt. This chapter will introduce a few +more tricks to be used when dealing with external modules. + +In the following a Makefile will be created for a module with the +following files: + 8123_if.c + 8123_if.h + 8123_pci.c + 8123_bin.o_shipped <= Binary blob + +--- 4.1 Shared Makefile for module and kernel + + An external module always includes a wrapper Makefile supporting + building the module using 'make' with no arguments. + The Makefile provided will most likely include additional + functionality such as test targets etc. and this part shall + be filtered away from kbuild since it may impact kbuild if + name clashes occurs. + + Example 1: + --> filename: Makefile + ifneq ($(KERNELRELEASE),) + # kbuild part of makefile + obj-m := 8123.o + 8123-y := 8123_if.o 8123_pci.o 8123_bin.o + + else + # Normal Makefile + + KERNELDIR := /lib/modules/`uname -r`/build + all:: + $(MAKE) -C $KERNELDIR M=`pwd` $@ + + # Module specific targets + genbin: + echo "X" > 8123_bini.o_shipped + + endif + + In example 1 the check for KERNELRELEASE is used to separate + the two parts of the Makefile. kbuild will only see the two + assignments whereas make will see everything except the two + kbuild assignments. + + In recent versions of the kernel, kbuild will look for a file named + Kbuild and as second option look for a file named Makefile. + Utilising the Kbuild file makes us split up the Makefile in example 1 + into two files as shown in example 2: + + Example 2: + --> filename: Kbuild + obj-m := 8123.o + 8123-y := 8123_if.o 8123_pci.o 8123_bin.o + + --> filename: Makefile + KERNELDIR := /lib/modules/`uname -r`/build + all:: + $(MAKE) -C $KERNELDIR M=`pwd` $@ + + # Module specific targets + genbin: + echo "X" > 8123_bin_shipped + + + In example 2 we are down to two fairly simple files and for simple + files as used in this example the split is questionable. But some + external modules use Makefiles of several hundred lines and here it + really pays off to separate the kbuild part from the rest. + Example 3 shows a backward compatible version. + + Example 3: + --> filename: Kbuild + obj-m := 8123.o + 8123-y := 8123_if.o 8123_pci.o 8123_bin.o + + --> filename: Makefile + ifneq ($(KERNELRELEASE),) + include Kbuild + else + # Normal Makefile + + KERNELDIR := /lib/modules/`uname -r`/build + all:: + $(MAKE) -C $KERNELDIR M=`pwd` $@ + + # Module specific targets + genbin: + echo "X" > 8123_bin_shipped + + endif + + The trick here is to include the Kbuild file from Makefile so + if an older version of kbuild picks up the Makefile the Kbuild + file will be included. + +--- 4.2 Binary blobs included in a module + + Some external modules needs to include a .o as a blob. kbuild + has support for this, but requires the blob file to be named + <filename>_shipped. In our example the blob is named + 8123_bin.o_shipped and when the kbuild rules kick in the file + 8123_bin.o is created as a simple copy off the 8213_bin.o_shipped file + with the _shipped part stripped of the filename. + This allows the 8123_bin.o filename to be used in the assignment to + the module. + + Example 4: + obj-m := 8123.o + 8123-y := 8123_if.o 8123_pci.o 8123_bin.o + + In example 4 there is no distinction between the ordinary .c/.h files + and the binary file. But kbuild will pick up different rules to create + the .o file. + + +=== 5. Include files + +Include files are a necessity when a .c file uses something from another .c +files (not strictly in the sense of .c but if good programming practice is +used). Any module that consist of more than one .c file will have a .h file +for one of the .c files. +- If the .h file only describes a module internal interface then the .h file + shall be placed in the same directory as the .c files. +- If the .h files describe an interface used by other parts of the kernel + located in different directories, the .h files shall be located in + include/linux/ or other include/ directories as appropriate. + +One exception for this rule is larger subsystems that have their own directory +under include/ such as include/scsi. Another exception is arch-specific +.h files which are located under include/asm-$(ARCH)/*. + +External modules have a tendency to locate include files in a separate include/ +directory and therefore needs to deal with this in their kbuild file. + +--- 5.1 How to include files from the kernel include dir + + When a module needs to include a file from include/linux/ then one + just uses: + + #include <linux/modules.h> + + kbuild will make sure to add options to gcc so the relevant + directories are searched. + Likewise for .h files placed in the same directory as the .c file. + + #include "8123_if.h" + + will do the job. + +--- 5.2 External modules using an include/ dir + + External modules often locate their .h files in a separate include/ + directory although this is not usual kernel style. When an external + module uses an include/ dir then kbuild needs to be told so. + The trick here is to use either EXTRA_CFLAGS (take effect for all .c + files) or CFLAGS_$F.o (take effect only for a single file). + + In our example if we move 8123_if.h to a subdirectory named include/ + the resulting Kbuild file would look like: + + --> filename: Kbuild + obj-m := 8123.o + + EXTRA_CFLAGS := -Iinclude + 8123-y := 8123_if.o 8123_pci.o 8123_bin.o + + Note that in the assingment there is no space between -I and the path. + This is a kbuild limitation and no space must be present. + + +=== 6. Module installation + +Modules which are included in the kernel is installed in the directory: + + /lib/modules/$(KERNELRELEASE)/kernel + +External modules are installed in the directory: + + /lib/modules/$(KERNELRELEASE)/extra + +--- 6.1 INSTALL_MOD_PATH + + Above are the default directories, but as always some level of + customization is possible. One can prefix the path using the variable + INSTALL_MOD_PATH: + + $ make INSTALL_MOD_PATH=/frodo modules_install + => Install dir: /frodo/lib/modules/$(KERNELRELEASE)/kernel + + INSTALL_MOD_PATH may be set as an ordinary shell variable or as in the + example above be specified on the commandline when calling make. + INSTALL_MOD_PATH has effect both when installing modules included in + the kernel as well as when installing external modules. + +--- 6.2 INSTALL_MOD_DIR + + When installing external modules they are default installed in a + directory under /lib/modules/$(KERNELRELEASE)/extra, but one may wish + to locate modules for a specific functionality in a separate + directory. For this purpose one can use INSTALL_MOD_DIR to specify an + alternative name than 'extra'. + + $ make INSTALL_MOD_DIR=gandalf -C KERNELDIR \ + M=`pwd` modules_install + => Install dir: /lib/modules/$(KERNELRELEASE)/gandalf + + +=== 7. Module versioning + +Module versioning are enabled by the CONFIG_MODVERSIONS tag. + +Module versioning is used as a simple ABI consistency check. The Module +versioning creates a CRC value of the full prototype for an exported symbol and +when a module is loaded/used then the CRC values contained in the kernel are +compared with similar values in the module. If they are not equal then the +kernel refuses to load the module. + +During a kernel build a file named Module.symvers will be generated. This +file includes the symbol version of all symbols within the kernel. If the +Module.symvers file is saved from the last full kernel compile one does not +have to do a full kernel compile to build a module version's compatible module. + +=== 8. Tips & Tricks + +--- 8.1 Testing for CONFIG_FOO_BAR + + Modules often needs to check for certain CONFIG_ options to decide if + a specific feature shall be included in the module. When kbuild is used + this is done by referencing the CONFIG_ variable directly. + + #fs/ext2/Makefile + obj-$(CONFIG_EXT2_FS) += ext2.o + + ext2-y := balloc.o bitmap.o dir.o + ext2-$(CONFIG_EXT2_FS_XATTR) += xattr.o + + External modules have traditionally used grep to check for specific + CONFIG_ settings directly in .config. This usage is broken. + As introduced before external modules shall use kbuild when building + and therefore can use the same methods as in-kernel modules when testing + for CONFIG_ definitions. + diff --git a/Documentation/kernel-doc-nano-HOWTO.txt b/Documentation/kernel-doc-nano-HOWTO.txt new file mode 100644 index 000000000000..c406ce67edd0 --- /dev/null +++ b/Documentation/kernel-doc-nano-HOWTO.txt @@ -0,0 +1,150 @@ +kernel-doc nano-HOWTO +===================== + +Many places in the source tree have extractable documentation in the +form of block comments above functions. The components of this system +are: + +- scripts/kernel-doc + + This is a perl script that hunts for the block comments and can mark + them up directly into DocBook, man, text, and HTML. (No, not + texinfo.) + +- Documentation/DocBook/*.tmpl + + These are SGML template files, which are normal SGML files with + special place-holders for where the extracted documentation should + go. + +- scripts/docproc.c + + This is a program for converting SGML template files into SGML + files. When a file is referenced it is searched for symbols + exported (EXPORT_SYMBOL), to be able to distinguish between internal + and external functions. + It invokes kernel-doc, giving it the list of functions that + are to be documented. + Additionally it is used to scan the SGML template files to locate + all the files referenced herein. This is used to generate dependency + information as used by make. + +- Makefile + + The targets 'sgmldocs', 'psdocs', 'pdfdocs', and 'htmldocs' are used + to build DocBook files, PostScript files, PDF files, and html files + in Documentation/DocBook. + +- Documentation/DocBook/Makefile + + This is where C files are associated with SGML templates. + + +How to extract the documentation +-------------------------------- + +If you just want to read the ready-made books on the various +subsystems (see Documentation/DocBook/*.tmpl), just type 'make +psdocs', or 'make pdfdocs', or 'make htmldocs', depending on your +preference. If you would rather read a different format, you can type +'make sgmldocs' and then use DocBook tools to convert +Documentation/DocBook/*.sgml to a format of your choice (for example, +'db2html ...' if 'make htmldocs' was not defined). + +If you want to see man pages instead, you can do this: + +$ cd linux +$ scripts/kernel-doc -man $(find -name '*.c') | split-man.pl /tmp/man +$ scripts/kernel-doc -man $(find -name '*.h') | split-man.pl /tmp/man + +Here is split-man.pl: + +--> +#!/usr/bin/perl + +if ($#ARGV < 0) { + die "where do I put the results?\n"; +} + +mkdir $ARGV[0],0777; +$state = 0; +while (<STDIN>) { + if (/^\.TH \"[^\"]*\" 4 \"([^\"]*)\"/) { + if ($state == 1) { close OUT } + $state = 1; + $fn = "$ARGV[0]/$1.4"; + print STDERR "Creating $fn\n"; + open OUT, ">$fn" or die "can't open $fn: $!\n"; + print OUT $_; + } elsif ($state != 0) { + print OUT $_; + } +} + +close OUT; +<-- + +If you just want to view the documentation for one function in one +file, you can do this: + +$ scripts/kernel-doc -man -function fn file | nroff -man | less + +or this: + +$ scripts/kernel-doc -text -function fn file + + +How to add extractable documentation to your source files +--------------------------------------------------------- + +The format of the block comment is like this: + +/** + * function_name(:)? (- short description)? +(* @parameterx: (description of parameter x)?)* +(* a blank line)? + * (Description:)? (Description of function)? + * (section header: (section description)? )* +(*)?*/ + +The short function description cannot be multiline, but the other +descriptions can be (and they can contain blank lines). Avoid putting a +spurious blank line after the function name, or else the description will +be repeated! + +All descriptive text is further processed, scanning for the following special +patterns, which are highlighted appropriately. + +'funcname()' - function +'$ENVVAR' - environment variable +'&struct_name' - name of a structure (up to two words including 'struct') +'@parameter' - name of a parameter +'%CONST' - name of a constant. + +Take a look around the source tree for examples. + + +How to make new SGML template files +----------------------------------- + +SGML template files (*.tmpl) are like normal SGML files, except that +they can contain escape sequences where extracted documentation should +be inserted. + +!E<filename> is replaced by the documentation, in <filename>, for +functions that are exported using EXPORT_SYMBOL: the function list is +collected from files listed in Documentation/DocBook/Makefile. + +!I<filename> is replaced by the documentation for functions that are +_not_ exported using EXPORT_SYMBOL. + +!D<filename> is used to name additional files to search for functions +exported using EXPORT_SYMBOL. + +!F<filename> <function [functions...]> is replaced by the +documentation, in <filename>, for the functions listed. + + +Tim. +*/ <twaugh@redhat.com> + diff --git a/Documentation/kernel-docs.txt b/Documentation/kernel-docs.txt new file mode 100644 index 000000000000..cb89fb3b61ef --- /dev/null +++ b/Documentation/kernel-docs.txt @@ -0,0 +1,777 @@ + + Index of Documentation for People Interested in Writing and/or + + Understanding the Linux Kernel. + + Juan-Mariano de Goyeneche <jmseyas@dit.upm.es> + +/* + * The latest version of this document may be found at: + * http://www.dit.upm.es/~jmseyas/linux/kernel/hackers-docs.html + */ + + The need for a document like this one became apparent in the + linux-kernel mailing list as the same questions, asking for pointers + to information, appeared again and again. + + Fortunately, as more and more people get to GNU/Linux, more and more + get interested in the Kernel. But reading the sources is not always + enough. It is easy to understand the code, but miss the concepts, the + philosophy and design decisions behind this code. + + Unfortunately, not many documents are available for beginners to + start. And, even if they exist, there was no "well-known" place which + kept track of them. These lines try to cover this lack. All documents + available on line known by the author are listed, while some reference + books are also mentioned. + + PLEASE, if you know any paper not listed here or write a new document, + send me an e-mail, and I'll include a reference to it here. Any + corrections, ideas or comments are also welcomed. + + The papers that follow are listed in no particular order. All are + cataloged with the following fields: the document's "Title", the + "Author"/s, the "URL" where they can be found, some "Keywords" helpful + when searching for specific topics, and a brief "Description" of the + Document. + + Enjoy! + + ON-LINE DOCS: + + * Title: "Linux Device Drivers, Third Edition" + Author: Jonathan Corbet, Alessandro Rubini, Greg Kroah-Hartman + URL: http://lwn.net/Kernel/LDD3/ + Description: A 600-page book covering the (2.6.10) driver + programming API and kernel hacking in general. Available under the + Creative Commons Attribution-ShareAlike 2.0 license. + + * Title: "The Linux Kernel" + Author: David A. Rusling. + URL: http://www.tldp.org/LDP/tlk/tlk.html + Keywords: everything!, book. + Description: On line, 200 pages book describing most aspects of + the Linux Kernel. Probably, the first reference for beginners. + Lots of illustrations explaining data structures use and + relationships in the purest Richard W. Stevens' style. Contents: + "1.-Hardware Basics, 2.-Software Basics, 3.-Memory Management, + 4.-Processes, 5.-Interprocess Communication Mechanisms, 6.-PCI, + 7.-Interrupts and Interrupt Handling, 8.-Device Drivers, 9.-The + File system, 10.-Networks, 11.-Kernel Mechanisms, 12.-Modules, + 13.-The Linux Kernel Sources, A.-Linux Data Structures, B.-The + Alpha AXP Processor, C.-Useful Web and FTP Sites, D.-The GNU + General Public License, Glossary". In short: a must have. + + * Title: "The Linux Kernel Hackers' Guide" + Author: Michael K.Johnson and others. + URL: http://www.tldp.org/LDP/khg/HyperNews/get/khg.html + Keywords: everything! + Description: No more Postscript book-like version. Only HTML now. + Many people have contributed. The interface is similar to web + available mailing lists archives. You can find some articles and + then some mails asking questions about them and/or complementing + previous contributions. A little bit anarchic in this aspect, but + with some valuable information in some cases. + + * Title: "Conceptual Architecture of the Linux Kernel" + Author: Ivan T. Bowman. + URL: http://plg.uwaterloo.ca/~itbowman/papers/CS746G-a1.html + Keywords: conceptual software arquitecture, extracted design, + reverse engineering, system structure. + Description: Conceptual software arquitecture of the Linux kernel, + automatically extracted from the source code. Very detailed. Good + figures. Gives good overall kernel understanding. + + * Title: "Concrete Architecture of the Linux Kernel" + Author: Ivan T. Bowman, Saheem Siddiqi, and Meyer C. Tanuan. + URL: http://plg.uwaterloo.ca/~itbowman/papers/CS746G-a2.html + Keywords: concrete arquitecture, extracted design, reverse + engineering, system structure, dependencies. + Description: Concrete arquitecture of the Linux kernel, + automatically extracted from the source code. Very detailed. Good + figures. Gives good overall kernel understanding. This papers + focus on lower details than its predecessor (files, variables...). + + * Title: "Linux as a Case Study: Its Extracted Software + Architecture" + Author: Ivan T. Bowman, Richard C. Holt and Neil V. Brewster. + URL: http://plg.uwaterloo.ca/~itbowman/papers/linuxcase.html + Keywords: software architecture, architecture recovery, + redocumentation. + Description: Paper appeared at ICSE'99, Los Angeles, May 16-22, + 1999. A mixture of the previous two documents from the same + author. + + * Title: "Overview of the Virtual File System" + Author: Richard Gooch. + URL: http://www.atnf.csiro.au/~rgooch/linux/vfs.txt + Keywords: VFS, File System, mounting filesystems, opening files, + dentries, dcache. + Description: Brief introduction to the Linux Virtual File System. + What is it, how it works, operations taken when opening a file or + mounting a file system and description of important data + structures explaining the purpose of each of their entries. + + * Title: "The Linux RAID-1, 4, 5 Code" + Author: Ingo Molnar, Gadi Oxman and Miguel de Icaza. + URL: http://www2.linuxjournal.com/lj-issues/issue44/2391.html + Keywords: RAID, MD driver. + Description: Linux Journal Kernel Korner article. Here is it's + abstract: "A description of the implementation of the RAID-1, + RAID-4 and RAID-5 personalities of the MD device driver in the + Linux kernel, providing users with high performance and reliable, + secondary-storage capability using software". + + * Title: "Dynamic Kernels: Modularized Device Drivers" + Author: Alessandro Rubini. + URL: http://www2.linuxjournal.com/lj-issues/issue23/1219.html + Keywords: device driver, module, loading/unloading modules, + allocating resources. + Description: Linux Journal Kernel Korner article. Here is it's + abstract: "This is the first of a series of four articles + co-authored by Alessandro Rubini and Georg Zezchwitz which present + a practical approach to writing Linux device drivers as kernel + loadable modules. This installment presents an introduction to the + topic, preparing the reader to understand next month's + installment". + + * Title: "Dynamic Kernels: Discovery" + Author: Alessandro Rubini. + URL: http://www2.linuxjournal.com/lj-issues/issue24/1220.html + Keywords: character driver, init_module, clean_up module, + autodetection, mayor number, minor number, file operations, + open(), close(). + Description: Linux Journal Kernel Korner article. Here is it's + abstract: "This article, the second of four, introduces part of + the actual code to create custom module implementing a character + device driver. It describes the code for module initialization and + cleanup, as well as the open() and close() system calls". + + * Title: "The Devil's in the Details" + Author: Georg v. Zezschwitz and Alessandro Rubini. + URL: http://www2.linuxjournal.com/lj-issues/issue25/1221.html + Keywords: read(), write(), select(), ioctl(), blocking/non + blocking mode, interrupt handler. + Description: Linux Journal Kernel Korner article. Here is it's + abstract: "This article, the third of four on writing character + device drivers, introduces concepts of reading, writing, and using + ioctl-calls". + + * Title: "Dissecting Interrupts and Browsing DMA" + Author: Alessandro Rubini and Georg v. Zezschwitz. + URL: http://www2.linuxjournal.com/lj-issues/issue26/1222.html + Keywords: interrupts, irqs, DMA, bottom halves, task queues. + Description: Linux Journal Kernel Korner article. Here is it's + abstract: "This is the fourth in a series of articles about + writing character device drivers as loadable kernel modules. This + month, we further investigate the field of interrupt handling. + Though it is conceptually simple, practical limitations and + constraints make this an ``interesting'' part of device driver + writing, and several different facilities have been provided for + different situations. We also investigate the complex topic of + DMA". + + * Title: "Device Drivers Concluded" + Author: Georg v. Zezschwitz. + URL: http://www2.linuxjournal.com/lj-issues/issue28/1287.html + Keywords: address spaces, pages, pagination, page management, + demand loading, swapping, memory protection, memory mapping, mmap, + virtual memory areas (VMAs), vremap, PCI. + Description: Finally, the above turned out into a five articles + series. This latest one's introduction reads: "This is the last of + five articles about character device drivers. In this final + section, Georg deals with memory mapping devices, beginning with + an overall description of the Linux memory management concepts". + + * Title: "Network Buffers And Memory Management" + Author: Alan Cox. + URL: http://www2.linuxjournal.com/lj-issues/issue30/1312.html + Keywords: sk_buffs, network devices, protocol/link layer + variables, network devices flags, transmit, receive, + configuration, multicast. + Description: Linux Journal Kernel Korner. Here is the abstract: + "Writing a network device driver for Linux is fundamentally + simple---most of the complexity (other than talking to the + hardware) involves managing network packets in memory". + + * Title: "Writing Linux Device Drivers" + Author: Michael K. Johnson. + URL: http://people.redhat.com/johnsonm/devices.html + Keywords: files, VFS, file operations, kernel interface, character + vs block devices, I/O access, hardware interrupts, DMA, access to + user memory, memory allocation, timers. + Description: Introductory 50-minutes (sic) tutorial on writing + device drivers. 12 pages written by the same author of the "Kernel + Hackers' Guide" which give a very good overview of the topic. + + * Title: "The Venus kernel interface" + Author: Peter J. Braam. + URL: + http://www.coda.cs.cmu.edu/doc/html/kernel-venus-protocol.html + Keywords: coda, filesystem, venus, cache manager. + Description: "This document describes the communication between + Venus and kernel level file system code needed for the operation + of the Coda filesystem. This version document is meant to describe + the current interface (version 1.0) as well as improvements we + envisage". + + * Title: "Programming PCI-Devices under Linux" + Author: Claus Schroeter. + URL: + ftp://ftp.llp.fu-berlin.de/pub/linux/LINUX-LAB/whitepapers/pcip.ps + .gz + Keywords: PCI, device, busmastering. + Description: 6 pages tutorial on PCI programming under Linux. + Gives the basic concepts on the architecture of the PCI subsystem, + as long as basic functions and macros to read/write the devices + and perform busmastering. + + * Title: "Writing Character Device Driver for Linux" + Author: R. Baruch and C. Schroeter. + URL: + ftp://ftp.llp.fu-berlin.de/pub/linux/LINUX-LAB/whitepapers/drivers + .ps.gz + Keywords: character device drivers, I/O, signals, DMA, accessing + ports in user space, kernel environment. + Description: 68 pages paper on writing character drivers. A little + bit old (1.993, 1.994) although still useful. + + * Title: "Design and Implementation of the Second Extended + Filesystem" + Author: Rémy Card, Theodore Ts'o, Stephen Tweedie. + URL: http://web.mit.edu/tytso/www/linux/ext2intro.html + Keywords: ext2, linux fs history, inode, directory, link, devices, + VFS, physical structure, performance, benchmarks, ext2fs library, + ext2fs tools, e2fsck. + Description: Paper written by three of the top ext2 hackers. + Covers Linux filesystems history, ext2 motivation, ext2 features, + design, physical structure on disk, performance, benchmarks, + e2fsck's passes description... A must read! + Notes: This paper was first published in the Proceedings of the + First Dutch International Symposium on Linux, ISBN 90-367-0385-9. + + * Title: "Analysis of the Ext2fs structure" + Author: Louis-Dominique Dubeau. + URL: http://step.polymtl.ca/~ldd/ext2fs/ext2fs_toc.html + Keywords: ext2, filesystem, ext2fs. + Description: Description of ext2's blocks, directories, inodes, + bitmaps, invariants... + + * Title: "Journaling the Linux ext2fs Filesystem" + Author: Stephen C. Tweedie. + URL: + ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/journal-design.ps.gz + Keywords: ext3, journaling. + Description: Excellent 8-pages paper explaining the journaling + capabilities added to ext2 by the author, showing different + problems faced and the alternatives chosen. + + * Title: "Kernel API changes from 2.0 to 2.2" + Author: Richard Gooch. + URL: + http://www.atnf.csiro.au/~rgooch/linux/docs/porting-to-2.2.html + Keywords: 2.2, changes. + Description: Kernel functions/structures/variables which changed + from 2.0.x to 2.2.x. + + * Title: "Kernel API changes from 2.2 to 2.4" + Author: Richard Gooch. + URL: + http://www.atnf.csiro.au/~rgooch/linux/docs/porting-to-2.4.html + Keywords: 2.4, changes. + Description: Kernel functions/structures/variables which changed + from 2.2.x to 2.4.x. + + * Title: "Linux Kernel Module Programming Guide" + Author: Ori Pomerantz. + URL: http://www.tldp.org/LDP/lkmpg/mpg.html + Keywords: modules, GPL book, /proc, ioctls, system calls, + interrupt handlers . + Description: Very nice 92 pages GPL book on the topic of modules + programming. Lots of examples. + + * Title: "Device File System (devfs) Overview" + Author: Richard Gooch. + URL: http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.txt + Keywords: filesystem, /dev, devfs, dynamic devices, major/minor + allocation, device management. + Description: Document describing Richard Gooch's controversial + devfs, which allows for dynamic devices, only shows present + devices in /dev, gets rid of major/minor numbers allocation + problems, and allows for hundreds of identical devices (which some + USB systems might demand soon). + + * Title: "I/O Event Handling Under Linux" + Author: Richard Gooch. + URL: http://www.atnf.csiro.au/~rgooch/linux/docs/io-events.html + Keywords: IO, I/O, select(2), poll(2), FDs, aio_read(2), readiness + event queues. + Description: From the Introduction: "I/O Event handling is about + how your Operating System allows you to manage a large number of + open files (file descriptors in UNIX/POSIX, or FDs) in your + application. You want the OS to notify you when FDs become active + (have data ready to be read or are ready for writing). Ideally you + want a mechanism that is scalable. This means a large number of + inactive FDs cost very little in memory and CPU time to manage". + + * Title: "The Kernel Hacking HOWTO" + Author: Various Talented People, and Rusty. + URL: + http://www.lisoleg.net/doc/Kernel-Hacking-HOWTO/kernel-hacking-HOW + TO.html + Keywords: HOWTO, kernel contexts, deadlock, locking, modules, + symbols, return conventions. + Description: From the Introduction: "Please understand that I + never wanted to write this document, being grossly underqualified, + but I always wanted to read it, and this was the only way. I + simply explain some best practices, and give reading entry-points + into the kernel sources. I avoid implementation details: that's + what the code is for, and I ignore whole tracts of useful + routines. This document assumes familiarity with C, and an + understanding of what the kernel is, and how it is used. It was + originally written for the 2.3 kernels, but nearly all of it + applies to 2.2 too; 2.0 is slightly different". + + * Title: "ALSA 0.5.0 Developer documentation" + Author: Stephan 'Jumpy' Bartels . + URL: http://www.math.TU-Berlin.de/~sbartels/alsa/ + Keywords: ALSA, sound, soundcard, driver, lowlevel, hardware. + Description: Advanced Linux Sound Architecture for developers, + both at kernel and user-level sides. Work in progress. ALSA is + supposed to be Linux's next generation sound architecture. + + * Title: "Programming Guide for Linux USB Device Drivers" + Author: Detlef Fliegl. + URL: http://usb.in.tum.de/usbdoc/ + Keywords: USB, universal serial bus. + Description: A must-read. From the Preface: "This document should + give detailed information about the current state of the USB + subsystem and its API for USB device drivers. The first section + will deal with the basics of USB devices. You will learn about + different types of devices and their properties. Going into detail + you will see how USB devices communicate on the bus. The second + section gives an overview of the Linux USB subsystem [2] and the + device driver framework. Then the API and its data structures will + be explained step by step. The last section of this document + contains a reference of all API calls and their return codes". + Notes: Beware: the main page states: "This document may not be + published, printed or used in excerpts without explicit permission + of the author". Fortunately, it may still be read... + + * Title: "Tour Of the Linux Kernel Source" + Author: Vijo Cherian. + URL: http://www.geocities.com/vijoc/tolks/tolks.html + Keywords: . + Description: A classic of this page! Was lost for a while and is + back again. Thanks Vijo! TOLKS: the name says it all. A tour of + the sources, describing directories, files, variables, data + structures... It covers general stuff, device drivers, + filesystems, IPC and Networking Code. + + * Title: "Linux Kernel Mailing List Glossary" + Author: John Levon. + URL: http://www.movement.uklinux.net/glossary.html + Keywords: glossary, terms, linux-kernel. + Description: From the introduction: "This glossary is intended as + a brief description of some of the acronyms and terms you may hear + during discussion of the Linux kernel". + + * Title: "Linux Kernel Locking HOWTO" + Author: Various Talented People, and Rusty. + URL: + http://netfilter.kernelnotes.org/unreliable-guides/kernel-locking- + HOWTO.html + Keywords: locks, locking, spinlock, semaphore, atomic, race + condition, bottom halves, tasklets, softirqs. + Description: The title says it all: document describing the + locking system in the Linux Kernel either in uniprocessor or SMP + systems. + Notes: "It was originally written for the later (>2.3.47) 2.3 + kernels, but most of it applies to 2.2 too; 2.0 is slightly + different". Freely redistributable under the conditions of the GNU + General Public License. + + * Title: "Porting Linux 2.0 Drivers To Linux 2.2: Changes and New + Features " + Author: Alan Cox. + URL: http://www.linux-mag.com/1999-05/gear_01.html + Keywords: ports, porting. + Description: Article from Linux Magazine on porting from 2.0 to + 2.2 kernels. + + * Title: "Porting Device Drivers To Linux 2.2: part II" + Author: Alan Cox. + URL: http://www.linux-mag.com/1999-06/gear_01.html + Keywords: ports, porting. + Description: Second part on porting from 2.0 to 2.2 kernels. + + * Title: "How To Make Sure Your Driver Will Work On The Power + Macintosh" + Author: Paul Mackerras. + URL: http://www.linux-mag.com/1999-07/gear_01.html + Keywords: Mac, Power Macintosh, porting, drivers, compatibility. + Description: The title says it all. + + * Title: "An Introduction to SCSI Drivers" + Author: Alan Cox. + URL: http://www.linux-mag.com/1999-08/gear_01.html + Keywords: SCSI, device, driver. + Description: The title says it all. + + * Title: "Advanced SCSI Drivers And Other Tales" + Author: Alan Cox. + URL: http://www.linux-mag.com/1999-09/gear_01.html + Keywords: SCSI, device, driver, advanced. + Description: The title says it all. + + * Title: "Writing Linux Mouse Drivers" + Author: Alan Cox. + URL: http://www.linux-mag.com/1999-10/gear_01.html + Keywords: mouse, driver, gpm. + Description: The title says it all. + + * Title: "More on Mouse Drivers" + Author: Alan Cox. + URL: http://www.linux-mag.com/1999-11/gear_01.html + Keywords: mouse, driver, gpm, races, asynchronous I/O. + Description: The title still says it all. + + * Title: "Writing Video4linux Radio Driver" + Author: Alan Cox. + URL: http://www.linux-mag.com/1999-12/gear_01.html + Keywords: video4linux, driver, radio, radio devices. + Description: The title says it all. + + * Title: "Video4linux Drivers, Part 1: Video-Capture Device" + Author: Alan Cox. + URL: http://www.linux-mag.com/2000-01/gear_01.html + Keywords: video4linux, driver, video capture, capture devices, + camera driver. + Description: The title says it all. + + * Title: "Video4linux Drivers, Part 2: Video-capture Devices" + Author: Alan Cox. + URL: http://www.linux-mag.com/2000-02/gear_01.html + Keywords: video4linux, driver, video capture, capture devices, + camera driver, control, query capabilities, capability, facility. + Description: The title says it all. + + * Title: "PCI Management in Linux 2.2" + Author: Alan Cox. + URL: http://www.linux-mag.com/2000-03/gear_01.html + Keywords: PCI, bus, bus-mastering. + Description: The title says it all. + + * Title: "Linux 2.4 Kernel Internals" + Author: Tigran Aivazian and Christoph Hellwig. + URL: http://www.moses.uklinux.net/patches/lki.html + Keywords: Linux, kernel, booting, SMB boot, VFS, page cache. + Description: A little book used for a short training course. + Covers building the kernel image, booting (including SMP bootup), + process management, VFS and more. + + * Title: "Linux IP Networking. A Guide to the Implementation and + Modification of the Linux Protocol Stack." + Author: Glenn Herrin. + URL: + http://kernelnewbies.org/documents/ipnetworking/linuxipnetworking. + html + Keywords: network, networking, protocol, IP, UDP, TCP, connection, + socket, receiving, transmitting, forwarding, routing, packets, + modules, /proc, sk_buff, FIB, tags. + Description: Excellent paper devoted to the Linux IP Networking, + explaining anything from the kernel's to the user space + configuration tools' code. Very good to get a general overview of + the kernel networking implementation and understand all steps + packets follow from the time they are received at the network + device till they are delivered to applications. The studied kernel + code is from 2.2.14 version. Provides code for a working packet + dropper example. + + * Title: "Get those boards talking under Linux." + Author: Alex Ivchenko. + URL: http://www.ednmag.com/ednmag/reg/2000/06222000/13df2.htm + Keywords: data-acquisition boards, drivers, modules, interrupts, + memory allocation. + Description: Article written for people wishing to make their data + acquisition boards work on their GNU/Linux machines. Gives a basic + overview on writing drivers, from the naming of functions to + interrupt handling. + Notes: Two-parts article. Part II is at + http://www.ednmag.com/ednmag/reg/2000/07062000/14df.htm + + * Title: "Linux PCMCIA Programmer's Guide" + Author: David Hinds. + URL: http://pcmcia-cs.sourceforge.net/ftp/doc/PCMCIA-PROG.html + Keywords: PCMCIA. + Description: "This document describes how to write kernel device + drivers for the Linux PCMCIA Card Services interface. It also + describes how to write user-mode utilities for communicating with + Card Services. + + * Title: "The Linux Kernel NFSD Implementation" + Author: Neil Brown. + URL: + http://www.cse.unsw.edu.au/~neilb/oss/linux-commentary/nfsd.html + Keywords: knfsd, nfsd, NFS, RPC, lockd, mountd, statd. + Description: The title says it all. + Notes: Covers knfsd's version 1.4.7 (patch against 2.2.7 kernel). + + * Title: "A Linux vm README" + Author: Kanoj Sarcar. + URL: http://reality.sgi.com/kanoj_engr/vm229.html + Keywords: virtual memory, mm, pgd, vma, page, page flags, page + cache, swap cache, kswapd. + Description: Telegraphic, short descriptions and definitions + relating the Linux virtual memory implementation. + + * Title: "(nearly) Complete Linux Loadable Kernel Modules. The + definitive guide for hackers, virus coders and system + administrators." + Author: pragmatic/THC. + URL: http://packetstorm.securify.com/groups/thc/LKM_HACKING.html + Keywords: syscalls, intercept, hide, abuse, symbol table. + Description: Interesting paper on how to abuse the Linux kernel in + order to intercept and modify syscalls, make + files/directories/processes invisible, become root, hijack ttys, + write kernel modules based virus... and solutions for admins to + avoid all those abuses. + Notes: For 2.0.x kernels. Gives guidances to port it to 2.2.x + kernels. Also available in txt format at + http://www.blacknemesis.org/hacking/txt/cllkm.txt + + BOOKS: (Not on-line) + + * Title: "Linux Device Drivers" + Author: Alessandro Rubini. + Publisher: O'Reilly & Associates. + Date: 1998. + Pages: 439. + ISBN: 1-56592-292-1 + + * Title: "Linux Device Drivers, 2nd Edition" + Author: Alessandro Rubini and Jonathan Corbet. + Publisher: O'Reilly & Associates. + Date: 2001. + Pages: 586. + ISBN: 0-59600-008-1 + Notes: Further information in + http://www.oreilly.com/catalog/linuxdrive2/ + + * Title: "Linux Kernel Internals" + Author: Michael Beck. + Publisher: Addison-Wesley. + Date: 1997. + ISBN: 0-201-33143-8 (second edition) + + * Title: "The Design of the UNIX Operating System" + Author: Maurice J. Bach. + Publisher: Prentice Hall. + Date: 1986. + Pages: 471. + ISBN: 0-13-201757-1 + + * Title: "The Design and Implementation of the 4.3 BSD UNIX + Operating System" + Author: Samuel J. Leffler, Marshall Kirk McKusick, Michael J. + Karels, John S. Quarterman. + Publisher: Addison-Wesley. + Date: 1989 (reprinted with corrections on October, 1990). + ISBN: 0-201-06196-1 + + * Title: "The Design and Implementation of the 4.4 BSD UNIX + Operating System" + Author: Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, + John S. Quarterman. + Publisher: Addison-Wesley. + Date: 1996. + ISBN: 0-201-54979-4 + + * Title: "Programmation Linux 2.0 API systeme et fonctionnement du + noyau" + Author: Remy Card, Eric Dumas, Franck Mevel. + Publisher: Eyrolles. + Date: 1997. + Pages: 520. + ISBN: 2-212-08932-5 + Notes: French. + + * Title: "The Linux Kernel Book" + Author: Remy Card, Eric Dumas, Franck Mevel. + Publisher: John Wiley & Sons. + Date: 1998. + ISBN: 0-471-98141-9 + Notes: English translation. + + * Title: "Linux 2.0" + Author: Remy Card, Eric Dumas, Franck Mevel. + Publisher: Gestión 2000. + Date: 1997. + Pages: 501. + ISBN: 8-480-88208-5 + Notes: Spanish translation. + + * Title: "Unix internals -- the new frontiers" + Author: Uresh Vahalia. + Publisher: Prentice Hall. + Date: 1996. + Pages: 600. + ISBN: 0-13-101908-2 + + * Title: "Linux Core Kernel Commentary. Guide to Insider's Knowledge + on the Core Kernel of the Linux Code" + Author: Scott Maxwell. + Publisher: Coriolis. + Date: 1999. + Pages: 592. + ISBN: 1-57610-469-9 + Notes: CD-ROM included. Line by line commentary of the kernel + code. + + * Title: "Linux IP Stacks Commentary" + Author: Stephen Satchell and HBJ Clifford. + Publisher: Coriolis. + Date: 2000. + Pages: ???. + ISBN: 1-57610-470-2 + Notes: Line by line source code commentary book. + + * Title: "Programming for the real world - POSIX.4" + Author: Bill O. Gallmeister. + Publisher: O'Reilly & Associates, Inc.. + Date: 1995. + Pages: ???. + ISBN: I-56592-074-0 + Notes: Though not being directly about Linux, Linux aims to be + POSIX. Good reference. + + * Title: "Understanding the Linux Kernel" + Author: Daniel P. Bovet and Marco Cesati. + Publisher: O'Reilly & Associates, Inc.. + Date: 2000. + Pages: 702. + ISBN: 0-596-00002-2 + Notes: Further information in + http://www.oreilly.com/catalog/linuxkernel/ + + MISCELLANEOUS: + + * Name: linux/Documentation + Author: Many. + URL: Just look inside your kernel sources. + Keywords: anything, DocBook. + Description: Documentation that comes with the kernel sources, + inside the Documentation directory. Some pages from this document + (including this document itself) have been moved there, and might + be more up to date than the web version. + + * Name: "Linux Source Driver" + URL: http://lsd.linux.cz + Keywords: Browsing source code. + Description: "Linux Source Driver (LSD) is an application, which + can make browsing source codes of Linux kernel easier than you can + imagine. You can select between multiple versions of kernel (e.g. + 0.01, 1.0.0, 2.0.33, 2.0.34pre13, 2.0.0, 2.1.101 etc.). With LSD + you can search Linux kernel (fulltext, macros, types, functions + and variables) and LSD can generate patches for you on the fly + (files, directories or kernel)". + + * Name: "Linux Kernel Source Reference" + Author: Thomas Graichen. + URL: http://innominate.org/~graichen/projects/lksr/ + Keywords: CVS, web, cvsweb, browsing source code. + Description: Web interface to a CVS server with the kernel + sources. "Here you can have a look at any file of the Linux kernel + sources of any version starting from 1.0 up to the (daily updated) + current version available. Also you can check the differences + between two versions of a file". + + * Name: "Cross-Referencing Linux" + URL: http://lxr.linux.no/source/ + Keywords: Browsing source code. + Description: Another web-based Linux kernel source code browser. + Lots of cross references to variables and functions. You can see + where they are defined and where they are used. + + * Name: "Linux Weekly News" + URL: http://lwn.net + Keywords: latest kernel news. + Description: The title says it all. There's a fixed kernel section + summarizing developers' work, bug fixes, new features and versions + produced during the week. Published every Thursday. + + * Name: "Kernel Traffic" + URL: http://www.kerneltraffic.org/kernel-traffic/ + Keywords: linux-kernel mailing list, weekly kernel news. + Description: Weekly newsletter covering the most relevant + discussions of the linux-kernel mailing list. + + * Name: "CuTTiNG.eDGe.LiNuX" + URL: http://edge.kernelnotes.org + Keywords: changelist. + Description: Site which provides the changelist for every kernel + release. What's new, what's better, what's changed. Myrdraal reads + the patches and describes them. Pointers to the patches are there, + too. + + * Name: "New linux-kernel Mailing List FAQ" + URL: http://www.tux.org/lkml/ + Keywords: linux-kernel mailing list FAQ. + Description: linux-kernel is a mailing list for developers to + communicate. This FAQ builds on the previous linux-kernel mailing + list FAQ maintained by Frohwalt Egerer, who no longer maintains + it. Read it to see how to join the mailing list. Dozens of + interesting questions regarding the list, Linux, developers (who + is ...?), terms (what is...?) are answered here too. Just read it. + + * Name: "Linux Virtual File System" + Author: Peter J. Braam. + URL: http://www.coda.cs.cmu.edu/doc/talks/linuxvfs/ + Keywords: slides, VFS, inode, superblock, dentry, dcache. + Description: Set of slides, presumably from a presentation on the + Linux VFS layer. Covers version 2.1.x, with dentries and the + dcache. + + * Name: "Gary's Encyclopedia - The Linux Kernel" + Author: Gary (I suppose...). + URL: http://members.aa.net/~swear/pedia/kernel.html + Keywords: links, not found here?. + Description: Gary's Encyclopedia exists to allow the rapid finding + of documentation and other information of interest to GNU/Linux + users. It has about 4000 links to external pages in 150 major + categories. This link is for kernel-specific links, documents, + sites... Look there if you could not find here what you were + looking for. + + * Name: "The home page of Linux-MM" + Author: The Linux-MM team. + URL: http://linux-mm.org/ + Keywords: memory management, Linux-MM, mm patches, TODO, docs, + mailing list. + Description: Site devoted to Linux Memory Management development. + Memory related patches, HOWTOs, links, mm developers... Don't miss + it if you are interested in memory management development! + + * Name: "Kernel Newbies IRC Channel" + URL: http://www.kernelnewbies.org + Keywords: IRC, newbies, channel, asking doubts. + Description: #kernelnewbies on irc.openprojects.net. From the web + page: "#kernelnewbies is an IRC network dedicated to the 'newbie' + kernel hacker. The audience mostly consists of people who are + learning about the kernel, working on kernel projects or + professional kernel hackers that want to help less seasoned kernel + people. [...] #kernelnewbies is on the Open Projects IRC Network, + try irc.openprojects.net or irc.<country>.openprojects.net as your + server and then /join #kernelnewbies". It also hosts articles, + documents, FAQs... + + * Name: "linux-kernel mailing list archives and search engines" + URL: http://www.uwsg.indiana.edu/hypermail/linux/kernel/index.html + URL: http://www.kernelnotes.org/lnxlists/linux-kernel/ + URL: http://www.geocrawler.com + Keywords: linux-kernel, archives, search. + Description: Some of the linux-kernel mailing list archivers. If + you have a better/another one, please let me know. + _________________________________________________________________ + + Document last updated on Thu Jun 28 15:09:39 CEST 2001 diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt new file mode 100644 index 000000000000..4924d387a657 --- /dev/null +++ b/Documentation/kernel-parameters.txt @@ -0,0 +1,1511 @@ +February 2003 Kernel Parameters v2.5.59 + ~~~~~~~~~~~~~~~~~ + +The following is a consolidated list of the kernel parameters as implemented +(mostly) by the __setup() macro and sorted into English Dictionary order +(defined as ignoring all punctuation and sorting digits before letters in a +case insensitive manner), and with descriptions where known. + +Module parameters for loadable modules are specified only as the +parameter name with optional '=' and value as appropriate, such as: + + modprobe usbcore blinkenlights=1 + +Module parameters for modules that are built into the kernel image +are specified on the kernel command line with the module name plus +'.' plus parameter name, with '=' and value if appropriate, such as: + + usbcore.blinkenlights=1 + +The text in square brackets at the beginning of the description state the +restrictions on the kernel for the said kernel parameter to be valid. The +restrictions referred to are that the relevant option is valid if: + + ACPI ACPI support is enabled. + ALSA ALSA sound support is enabled. + APIC APIC support is enabled. + APM Advanced Power Management support is enabled. + AX25 Appropriate AX.25 support is enabled. + CD Appropriate CD support is enabled. + DEVFS devfs support is enabled. + DRM Direct Rendering Management support is enabled. + EDD BIOS Enhanced Disk Drive Services (EDD) is enabled + EFI EFI Partitioning (GPT) is enabled + EIDE EIDE/ATAPI support is enabled. + FB The frame buffer device is enabled. + HW Appropriate hardware is enabled. + IA-32 IA-32 aka i386 architecture is enabled. + IA-64 IA-64 architecture is enabled. + IOSCHED More than one I/O scheduler is enabled. + IP_PNP IP DCHP, BOOTP, or RARP is enabled. + ISAPNP ISA PnP code is enabled. + ISDN Appropriate ISDN support is enabled. + JOY Appropriate joystick support is enabled. + LP Printer support is enabled. + LOOP Loopback device support is enabled. + M68k M68k architecture is enabled. + These options have more detailed description inside of + Documentation/m68k/kernel-options.txt. + MCA MCA bus support is enabled. + MDA MDA console support is enabled. + MOUSE Appropriate mouse support is enabled. + MTD MTD support is enabled. + NET Appropriate network support is enabled. + NUMA NUMA support is enabled. + NFS Appropriate NFS support is enabled. + OSS OSS sound support is enabled. + PARIDE The ParIDE subsystem is enabled. + PARISC The PA-RISC architecture is enabled. + PCI PCI bus support is enabled. + PCMCIA The PCMCIA subsystem is enabled. + PNP Plug & Play support is enabled. + PPC PowerPC architecture is enabled. + PPT Parallel port support is enabled. + PS2 Appropriate PS/2 support is enabled. + RAM RAM disk support is enabled. + S390 S390 architecture is enabled. + SCSI Appropriate SCSI support is enabled. + A lot of drivers has their options described inside of + Documentation/scsi/. + SELINUX SELinux support is enabled. + SERIAL Serial support is enabled. + SMP The kernel is an SMP kernel. + SPARC Sparc architecture is enabled. + SWSUSP Software suspension is enabled. + TS Appropriate touchscreen support is enabled. + USB USB support is enabled. + USBHID USB Human Interface Device support is enabled. + V4L Video For Linux support is enabled. + VGA The VGA console has been enabled. + VT Virtual terminal support is enabled. + WDT Watchdog support is enabled. + XT IBM PC/XT MFM hard disk support is enabled. + X86-64 X86-64 architecture is enabled. + More X86-64 boot options can be found in + Documentation/x86_64/boot-options.txt . + +In addition, the following text indicates that the option: + + BUGS= Relates to possible processor bugs on the said processor. + KNL Is a kernel start-up parameter. + BOOT Is a boot loader parameter. + +Parameters denoted with BOOT are actually interpreted by the boot +loader, and have no meaning to the kernel directly. +Do not modify the syntax of boot loader parameters without extreme +need or coordination with <Documentation/i386/boot.txt>. + +Note that ALL kernel parameters listed below are CASE SENSITIVE, and that +a trailing = on the name of any parameter states that that parameter will +be entered as an environment variable, whereas its absence indicates that +it will appear as a kernel argument readable via /proc/cmdline by programs +running once the system is up. + + 53c7xx= [HW,SCSI] Amiga SCSI controllers + See header of drivers/scsi/53c7xx.c. + See also Documentation/scsi/ncr53c7xx.txt. + + acpi= [HW,ACPI] Advanced Configuration and Power Interface + Format: { force | off | ht | strict } + force -- enable ACPI if default was off + off -- disable ACPI if default was on + noirq -- do not use ACPI for IRQ routing + ht -- run only enough ACPI to enable Hyper Threading + strict -- Be less tolerant of platforms that are not + strictly ACPI specification compliant. + + See also Documentation/pm.txt, pci=noacpi + + acpi_sleep= [HW,ACPI] Sleep options + Format: { s3_bios, s3_mode } + See Documentation/power/video.txt + + acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode + Format: { level | edge | high | low } + + acpi_irq_balance [HW,ACPI] ACPI will balance active IRQs + default in APIC mode + + acpi_irq_nobalance [HW,ACPI] ACPI will not move active IRQs (default) + default in PIC mode + + acpi_irq_pci= [HW,ACPI] If irq_balance, Clear listed IRQs for use by PCI + Format: <irq>,<irq>... + + acpi_irq_isa= [HW,ACPI] If irq_balance, Mark listed IRQs used by ISA + Format: <irq>,<irq>... + + acpi_osi= [HW,ACPI] empty param disables _OSI + + acpi_serialize [HW,ACPI] force serialization of AML methods + + acpi_skip_timer_override [HW,ACPI] + Recognize and ignore IRQ0/pin2 Interrupt Override. + For broken nForce2 BIOS resulting in XT-PIC timer. + + acpi_dbg_layer= [HW,ACPI] + Format: <int> + Each bit of the <int> indicates an acpi debug layer, + 1: enable, 0: disable. It is useful for boot time + debugging. After system has booted up, it can be set + via /proc/acpi/debug_layer. + + acpi_dbg_level= [HW,ACPI] + Format: <int> + Each bit of the <int> indicates an acpi debug level, + 1: enable, 0: disable. It is useful for boot time + debugging. After system has booted up, it can be set + via /proc/acpi/debug_level. + + acpi_fake_ecdt [HW,ACPI] Workaround failure due to BIOS lacking ECDT + + ad1816= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma2> + See also Documentation/sound/oss/AD1816. + + ad1848= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma2>,<type> + + adlib= [HW,OSS] + Format: <io> + + advansys= [HW,SCSI] + See header of drivers/scsi/advansys.c. + + advwdt= [HW,WDT] Advantech WDT + Format: <iostart>,<iostop> + + aedsp16= [HW,OSS] Audio Excel DSP 16 + Format: <io>,<irq>,<dma>,<mss_io>,<mpu_io>,<mpu_irq> + See also header of sound/oss/aedsp16.c. + + aha152x= [HW,SCSI] + See Documentation/scsi/aha152x.txt. + + aha1542= [HW,SCSI] + Format: <portbase>[,<buson>,<busoff>[,<dmaspeed>]] + + aic7xxx= [HW,SCSI] + See Documentation/scsi/aic7xxx.txt. + + aic79xx= [HW,SCSI] + See Documentation/scsi/aic79xx.txt. + + AM53C974= [HW,SCSI] + Format: <host-scsi-id>,<target-scsi-id>,<max-rate>,<max-offset> + See also header of drivers/scsi/AM53C974.c. + + amijoy.map= [HW,JOY] Amiga joystick support + Map of devices attached to JOY0DAT and JOY1DAT + Format: <a>,<b> + See also Documentation/kernel/input/joystick.txt + + analog.map= [HW,JOY] Analog joystick and gamepad support + Specifies type or capabilities of an analog joystick + connected to one of 16 gameports + Format: <type1>,<type2>,..<type16> + + apc= [HW,SPARC] Power management functions (SPARCstation-4/5 + deriv.) + Format: noidle + Disable APC CPU standby support. SPARCstation-Fox does + not play well with APC CPU idle - disable it if you have + APC and your system crashes randomly. + + apic= [APIC,i386] Change the output verbosity whilst booting + Format: { quiet (default) | verbose | debug } + Change the amount of debugging information output + when initialising the APIC and IO-APIC components. + + apm= [APM] Advanced Power Management + See header of arch/i386/kernel/apm.c. + + applicom= [HW] + Format: <mem>,<irq> + + arcrimi= [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards + Format: <io>,<irq>,<nodeID> + + ataflop= [HW,M68k] + + atarimouse= [HW,MOUSE] Atari Mouse + + atascsi= [HW,SCSI] Atari SCSI + + atkbd.extra= [HW] Enable extra LEDs and keys on IBM RapidAccess, + EzKey and similar keyboards + + atkbd.reset= [HW] Reset keyboard during initialization + + atkbd.set= [HW] Select keyboard code set + Format: <int> (2 = AT (default) 3 = PS/2) + + atkbd.scroll= [HW] Enable scroll wheel on MS Office and similar + keyboards + + atkbd.softraw= [HW] Choose between synthetic and real raw mode + Format: <bool> (0 = real, 1 = synthetic (default)) + + atkbd.softrepeat= + [HW] Use software keyboard repeat + + autotest [IA64] + + awe= [HW,OSS] AWE32/SB32/AWE64 wave table synth + Format: <io>,<memsize>,<isapnp> + + aztcd= [HW,CD] Aztech CD268 CDROM driver + Format: <io>,0x79 (?) + + baycom_epp= [HW,AX25] + Format: <io>,<mode> + + baycom_par= [HW,AX25] BayCom Parallel Port AX.25 Modem + Format: <io>,<mode> + See header of drivers/net/hamradio/baycom_par.c. + + baycom_ser_fdx= [HW,AX25] BayCom Serial Port AX.25 Modem (Full Duplex Mode) + Format: <io>,<irq>,<mode>[,<baud>] + See header of drivers/net/hamradio/baycom_ser_fdx.c. + + baycom_ser_hdx= [HW,AX25] BayCom Serial Port AX.25 Modem (Half Duplex Mode) + Format: <io>,<irq>,<mode> + See header of drivers/net/hamradio/baycom_ser_hdx.c. + + blkmtd_device= [HW,MTD] + blkmtd_erasesz= + blkmtd_ro= + blkmtd_bs= + blkmtd_count= + + bttv.card= [HW,V4L] bttv (bt848 + bt878 based grabber cards) + bttv.radio= Most important insmod options are available as kernel args too. + bttv.pll= See Documentation/video4linux/bttv/Insmod-options + bttv.tuner= and Documentation/video4linux/bttv/CARDLIST + + BusLogic= [HW,SCSI] + See drivers/scsi/BusLogic.c, comment before function + BusLogic_ParseDriverOptions(). + + c101= [NET] Moxa C101 synchronous serial card + + cachesize= [BUGS=IA-32] Override level 2 CPU cache size detection. + Sometimes CPU hardware bugs make them report the cache + size incorrectly. The kernel will attempt work arounds + to fix known problems, but for some CPUs it is not + possible to determine what the correct size should be. + This option provides an override for these situations. + + cdu31a= [HW,CD] + Format: <io>,<irq>[,PAS] + See header of drivers/cdrom/cdu31a.c. + + chandev= [HW,NET] Generic channel device initialisation + + checkreqprot [SELINUX] Set initial checkreqprot flag value. + Format: { "0" | "1" } + See security/selinux/Kconfig help text. + 0 -- check protection applied by kernel (includes any implied execute protection). + 1 -- check protection requested by application. + Default value is set via a kernel config option. + Value can be changed at runtime via /selinux/checkreqprot. + + clock= [BUGS=IA-32, HW] gettimeofday timesource override. + Forces specified timesource (if avaliable) to be used + when calculating gettimeofday(). If specicified timesource + is not avalible, it defaults to PIT. + Format: { pit | tsc | cyclone | pmtmr } + + hpet= [IA-32,HPET] option to disable HPET and use PIT. + Format: disable + + cm206= [HW,CD] + Format: { auto | [<io>,][<irq>] } + + com20020= [HW,NET] ARCnet - COM20020 chipset + Format: <io>[,<irq>[,<nodeID>[,<backplane>[,<ckp>[,<timeout>]]]]] + + com90io= [HW,NET] ARCnet - COM90xx chipset (IO-mapped buffers) + Format: <io>[,<irq>] + + com90xx= [HW,NET] ARCnet - COM90xx chipset (memory-mapped buffers) + Format: <io>[,<irq>[,<memstart>]] + + condev= [HW,S390] console device + conmode= + + console= [KNL] Output console device and options. + + tty<n> Use the virtual console device <n>. + + ttyS<n>[,options] + Use the specified serial port. The options are of + the form "bbbbpn", where "bbbb" is the baud rate, + "p" is parity ("n", "o", or "e"), and "n" is bits. + Default is "9600n8". + + See also Documentation/serial-console.txt. + + uart,io,<addr>[,options] + uart,mmio,<addr>[,options] + Start an early, polled-mode console on the 8250/16550 + UART at the specified I/O port or MMIO address, + switching to the matching ttyS device later. The + options are the same as for ttyS, above. + + cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver + Format: <first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>] + + cpia_pp= [HW,PPT] + Format: { parport<nr> | auto | none } + + cs4232= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma2>,<mpuio>,<mpuirq> + + cs89x0_dma= [HW,NET] + Format: <dma> + + cs89x0_media= [HW,NET] + Format: { rj45 | aui | bnc } + + cyclades= [HW,SERIAL] Cyclades multi-serial port adapter. + + dasd= [HW,NET] + See header of drivers/s390/block/dasd_devmap.c. + + db9.dev[2|3]= [HW,JOY] Multisystem joystick support via parallel port + (one device per port) + Format: <port#>,<type> + See also Documentation/input/joystick-parport.txt + + debug [KNL] Enable kernel debugging (events log level). + + decnet= [HW,NET] + Format: <area>[,<node>] + See also Documentation/networking/decnet.txt. + + devfs= [DEVFS] + See Documentation/filesystems/devfs/boot-options. + + dhash_entries= [KNL] + Set number of hash buckets for dentry cache. + + digi= [HW,SERIAL] + IO parameters + enable/disable command. + + digiepca= [HW,SERIAL] + See drivers/char/README.epca and + Documentation/digiepca.txt. + + dmascc= [HW,AX25,SERIAL] AX.25 Z80SCC driver with DMA + support available. + Format: <io_dev0>[,<io_dev1>[,..<io_dev32>]] + + dmasound= [HW,OSS] Sound subsystem buffers + + dscc4.setup= [NET] + + dtc3181e= [HW,SCSI] + + earlyprintk= [IA-32, X86-64] + earlyprintk=vga + earlyprintk=serial[,ttySn[,baudrate]] + + Append ,keep to not disable it when the real console + takes over. + + Only vga or serial at a time, not both. + + Currently only ttyS0 and ttyS1 are supported. + + Interaction with the standard serial driver is not + very good. + + The VGA output is eventually overwritten by the real + console. + + eata= [HW,SCSI] + + eda= [HW,PS2] + + edb= [HW,PS2] + + edd= [EDD] + Format: {"of[f]" | "sk[ipmbr]"} + See comment in arch/i386/boot/edd.S + + eicon= [HW,ISDN] + Format: <id>,<membase>,<irq> + + eisa_irq_edge= [PARISC,HW] + See header of drivers/parisc/eisa.c. + + elanfreq= [IA-32] + See comment before function elanfreq_setup() in + arch/i386/kernel/cpu/cpufreq/elanfreq.c. + + elevator= [IOSCHED] + Format: {"as"|"cfq"|"deadline"|"noop"} + See Documentation/block/as-iosched.txt + and Documentation/block/deadline-iosched.txt for details. + + enforcing [SELINUX] Set initial enforcing status. + Format: {"0" | "1"} + See security/selinux/Kconfig help text. + 0 -- permissive (log only, no denials). + 1 -- enforcing (deny and log). + Default value is 0. + Value can be changed at runtime via /selinux/enforce. + + es1370= [HW,OSS] + Format: <lineout>[,<micbias>] + See also header of sound/oss/es1370.c. + + es1371= [HW,OSS] + Format: <spdif>,[<nomix>,[<amplifier>]] + See also header of sound/oss/es1371.c. + + ether= [HW,NET] Ethernet cards parameters + This option is obsoleted by the "netdev=" option, which + has equivalent usage. See its documentation for details. + + eurwdt= [HW,WDT] Eurotech CPU-1220/1410 onboard watchdog. + Format: <io>[,<irq>] + + fd_mcs= [HW,SCSI] + See header of drivers/scsi/fd_mcs.c. + + fdomain= [HW,SCSI] + See header of drivers/scsi/fdomain.c. + + floppy= [HW] + See Documentation/floppy.txt. + + ftape= [HW] Floppy Tape subsystem debugging options. + See Documentation/ftape.txt. + + gamecon.map[2|3]= + [HW,JOY] Multisystem joystick and NES/SNES/PSX pad + support via parallel port (up to 5 devices per port) + Format: <port#>,<pad1>,<pad2>,<pad3>,<pad4>,<pad5> + See also Documentation/input/joystick-parport.txt + + gamma= [HW,DRM] + + gdth= [HW,SCSI] + See header of drivers/scsi/gdth.c. + + gpt [EFI] Forces disk with valid GPT signature but + invalid Protective MBR to be treated as GPT. + + gscd= [HW,CD] + Format: <io> + + gt96100eth= [NET] MIPS GT96100 Advanced Communication Controller + + gus= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma16> + + gvp11= [HW,SCSI] + + hashdist= [KNL,NUMA] Large hashes allocated during boot + are distributed across NUMA nodes. Defaults on + for IA-64, off otherwise. + + hcl= [IA-64] SGI's Hardware Graph compatibility layer + + hd= [EIDE] (E)IDE hard drive subsystem geometry + Format: <cyl>,<head>,<sect> + + hd?= [HW] (E)IDE subsystem + hd?lun= See Documentation/ide.txt. + + highmem=nn[KMG] [KNL,BOOT] forces the highmem zone to have an exact + size of <nn>. This works even on boxes that have no + highmem otherwise. This also works to reduce highmem + size on bigger boxes. + + hisax= [HW,ISDN] + See Documentation/isdn/README.HiSax. + + hugepages= [HW,IA-32,IA-64] Maximal number of HugeTLB pages. + + noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing + + i8042.direct [HW] Put keyboard port into non-translated mode + i8042.dumbkbd [HW] Pretend that controlled can only read data from + keyboard and can not control its state + (Don't attempt to blink the leds) + i8042.noaux [HW] Don't check for auxiliary (== mouse) port + i8042.nomux [HW] Don't check presence of an active multiplexing + controller + i8042.nopnp [HW] Don't use ACPIPnP / PnPBIOS to discover KBD/AUX + controllers + i8042.panicblink= + [HW] Frequency with which keyboard LEDs should blink + when kernel panics (default is 0.5 sec) + i8042.reset [HW] Reset the controller during init and cleanup + i8042.unlock [HW] Unlock (ignore) the keylock + + i810= [HW,DRM] + + i8k.force [HW] Activate i8k driver even if SMM BIOS signature + does not match list of supported models. + i8k.power_status + [HW] Report power status in /proc/i8k + (disabled by default) + i8k.restricted [HW] Allow controlling fans only if SYS_ADMIN + capability is set. + + ibmmcascsi= [HW,MCA,SCSI] IBM MicroChannel SCSI adapter + See Documentation/mca.txt. + + icn= [HW,ISDN] + Format: <io>[,<membase>[,<icn_id>[,<icn_id2>]]] + + ide= [HW] (E)IDE subsystem + Format: ide=nodma or ide=doubler or ide=reverse + See Documentation/ide.txt. + + ide?= [HW] (E)IDE subsystem + Format: ide?=noprobe or chipset specific parameters. + See Documentation/ide.txt. + + idebus= [HW] (E)IDE subsystem - VLB/PCI bus speed + See Documentation/ide.txt. + + idle= [HW] + Format: idle=poll or idle=halt + + ihash_entries= [KNL] + Set number of hash buckets for inode cache. + + in2000= [HW,SCSI] + See header of drivers/scsi/in2000.c. + + init= [KNL] + Format: <full_path> + Run specified binary instead of /sbin/init as init + process. + + initcall_debug [KNL] Trace initcalls as they are executed. Useful + for working out where the kernel is dying during + startup. + + initrd= [BOOT] Specify the location of the initial ramdisk + + inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver + Format: <irq> + + inttest= [IA64] + + io7= [HW] IO7 for Marvel based alpha systems + See comment before marvel_specify_io7 in + arch/alpha/kernel/core_marvel.c. + + ip= [IP_PNP] + See Documentation/nfsroot.txt. + + ip2= [HW] Set IO/IRQ pairs for up to 4 IntelliPort boards + See comment before ip2_setup() in drivers/char/ip2.c. + + ips= [HW,SCSI] Adaptec / IBM ServeRAID controller + See header of drivers/scsi/ips.c. + + isapnp= [ISAPNP] + Format: <RDP>, <reset>, <pci_scan>, <verbosity> + + isolcpus= [KNL,SMP] Isolate CPUs from the general scheduler. + Format: <cpu number>,...,<cpu number> + This option can be used to specify one or more CPUs + to isolate from the general SMP balancing and scheduling + algorithms. The only way to move a process onto or off + an "isolated" CPU is via the CPU affinity syscalls. + <cpu number> begins at 0 and the maximum value is + "number of CPUs in system - 1". + + This option is the preferred way to isolate CPUs. The + alternative - manually setting the CPU mask of all tasks + in the system can cause problems and suboptimal load + balancer performance. + + isp16= [HW,CD] + Format: <io>,<irq>,<dma>,<setup> + + iucv= [HW,NET] + + js= [HW,JOY] Analog joystick + See Documentation/input/joystick.txt. + + keepinitrd [HW,ARM] + + kstack=N [IA-32, X86-64] Print N words from the kernel stack + in oops dumps. + + l2cr= [PPC] + + lapic [IA-32,APIC] Enable the local APIC even if BIOS disabled it. + + lasi= [HW,SCSI] PARISC LASI driver for the 53c700 chip + Format: addr:<io>,irq:<irq> + + llsc*= [IA64] + See function print_params() in arch/ia64/sn/kernel/llsc4.c. + + load_ramdisk= [RAM] List of ramdisks to load from floppy + See Documentation/ramdisk.txt. + + lockd.udpport= [NFS] + + lockd.tcpport= [NFS] + + logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver + Format: <irq> + + loglevel= All Kernel Messages with a loglevel smaller than the + console loglevel will be printed to the console. It can + also be changed with klogd or other programs. The + loglevels are defined as follows: + + 0 (KERN_EMERG) system is unusable + 1 (KERN_ALERT) action must be taken immediately + 2 (KERN_CRIT) critical conditions + 3 (KERN_ERR) error conditions + 4 (KERN_WARNING) warning conditions + 5 (KERN_NOTICE) normal but significant condition + 6 (KERN_INFO) informational + 7 (KERN_DEBUG) debug-level messages + + log_buf_len=n Sets the size of the printk ring buffer, in bytes. + Format is n, nk, nM. n must be a power of two. The + default is set in kernel config. + + lp=0 [LP] Specify parallel ports to use, e.g, + lp=port[,port...] lp=none,parport0 (lp0 not configured, lp1 uses + lp=reset first parallel port). 'lp=0' disables the + lp=auto printer driver. 'lp=reset' (which can be + specified in addition to the ports) causes + attached printers to be reset. Using + lp=port1,port2,... specifies the parallel ports + to associate lp devices with, starting with + lp0. A port specification may be 'none' to skip + that lp device, or a parport name such as + 'parport0'. Specifying 'lp=auto' instead of a + port specification list means that device IDs + from each port should be examined, to see if + an IEEE 1284-compliant printer is attached; if + so, the driver will manage that printer. + See also header of drivers/char/lp.c. + + lpj=n [KNL] + Sets loops_per_jiffy to given constant, thus avoiding + time-consuming boot-time autodetection (up to 250 ms per + CPU). 0 enables autodetection (default). To determine + the correct value for your kernel, boot with normal + autodetection and see what value is printed. Note that + on SMP systems the preset will be applied to all CPUs, + which is likely to cause problems if your CPUs need + significantly divergent settings. An incorrect value + will cause delays in the kernel to be wrong, leading to + unpredictable I/O errors and other breakage. Although + unlikely, in the extreme case this might damage your + hardware. + + ltpc= [NET] + Format: <io>,<irq>,<dma> + + mac5380= [HW,SCSI] + Format: <can_queue>,<cmd_per_lun>,<sg_tablesize>,<hostid>,<use_tags> + + mac53c9x= [HW,SCSI] + Format: <num_esps>,<disconnect>,<nosync>,<can_queue>,<cmd_per_lun>,<sg_tablesize>,<hostid>,<use_tags> + + machvec= [IA64] + Force the use of a particular machine-vector (machvec) in a generic + kernel. Example: machvec=hpzx1_swiotlb + + mad16= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma16>,<mpu_io>,<mpu_irq>,<joystick> + + maui= [HW,OSS] + Format: <io>,<irq> + + max_loop= [LOOP] Maximum number of loopback devices that can + be mounted + Format: <1-256> + + maxcpus= [SMP] Maximum number of processors that an SMP kernel + should make use of + + max_luns= [SCSI] Maximum number of LUNs to probe + Should be between 1 and 2^32-1. + + max_report_luns= + [SCSI] Maximum number of LUNs received + Should be between 1 and 16384. + + mca-pentium [BUGS=IA-32] + + mcatest= [IA-64] + + mcd= [HW,CD] + Format: <port>,<irq>,<mitsumi_bug_93_wait> + + mcdx= [HW,CD] + + mce [IA-32] Machine Check Exception + + md= [HW] RAID subsystems devices and level + See Documentation/md.txt. + + mdacon= [MDA] + Format: <first>,<last> + Specifies range of consoles to be captured by the MDA. + + mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory + Amount of memory to be used when the kernel is not able + to see the whole system memory or for test. + [IA-32] Use together with memmap= to avoid physical + address space collisions. Without memmap= PCI devices + could be placed at addresses belonging to unused RAM. + + mem=nopentium [BUGS=IA-32] Disable usage of 4MB pages for kernel + memory. + + memmap=exactmap [KNL,IA-32] Enable setting of an exact + E820 memory map, as specified by the user. + Such memmap=exactmap lines can be constructed based on + BIOS output or other requirements. See the memmap=nn@ss + option description. + + memmap=nn[KMG]@ss[KMG] + [KNL] Force usage of a specific region of memory + Region of memory to be used, from ss to ss+nn. + + memmap=nn[KMG]#ss[KMG] + [KNL,ACPI] Mark specific memory as ACPI data. + Region of memory to be used, from ss to ss+nn. + + memmap=nn[KMG]$ss[KMG] + [KNL,ACPI] Mark specific memory as reserved. + Region of memory to be used, from ss to ss+nn. + + meye.*= [HW] Set MotionEye Camera parameters + See Documentation/video4linux/meye.txt. + + mga= [HW,DRM] + + mousedev.tap_time= + [MOUSE] Maximum time between finger touching and + leaving touchpad surface for touch to be considered + a tap and be reported as a left button click (for + touchpads working in absolute mode only). + Format: <msecs> + mousedev.xres= [MOUSE] Horizontal screen resolution, used for devices + reporting absolute coordinates, such as tablets + mousedev.yres= [MOUSE] Vertical screen resolution, used for devices + reporting absolute coordinates, such as tablets + + mpu401= [HW,OSS] + Format: <io>,<irq> + + MTD_Partition= [MTD] + Format: <name>,<region-number>,<size>,<offset> + + MTD_Region= [MTD] + Format: <name>,<region-number>[,<base>,<size>,<buswidth>,<altbuswidth>] + + mtdparts= [MTD] + See drivers/mtd/cmdline.c. + + mtouchusb.raw_coordinates= + [HW] Make the MicroTouch USB driver use raw coordinates ('y', default) + or cooked coordinates ('n') + + n2= [NET] SDL Inc. RISCom/N2 synchronous serial card + + NCR_D700= [HW,SCSI] + See header of drivers/scsi/NCR_D700.c. + + ncr5380= [HW,SCSI] + + ncr53c400= [HW,SCSI] + + ncr53c400a= [HW,SCSI] + + ncr53c406a= [HW,SCSI] + + ncr53c8xx= [HW,SCSI] + + netdev= [NET] Network devices parameters + Format: <irq>,<io>,<mem_start>,<mem_end>,<name> + Note that mem_start is often overloaded to mean + something different and driver-specific. + + nfsaddrs= [NFS] + See Documentation/nfsroot.txt. + + nfsroot= [NFS] nfs root filesystem for disk-less boxes. + See Documentation/nfsroot.txt. + + nmi_watchdog= [KNL,BUGS=IA-32] Debugging features for SMP kernels + + no387 [BUGS=IA-32] Tells the kernel to use the 387 maths + emulation library even if a 387 maths coprocessor + is present. + + noalign [KNL,ARM] + + noapic [SMP,APIC] Tells the kernel to not make use of any + IOAPICs that may be present in the system. + + noasync [HW,M68K] Disables async and sync negotiation for + all devices. + + nobats [PPC] Do not use BATs for mapping kernel lowmem + on "Classic" PPC cores. + + nocache [ARM] + + nodisconnect [HW,SCSI,M68K] Disables SCSI disconnects. + + noexec [IA-64] + + noexec [IA-32, X86-64] + noexec=on: enable non-executable mappings (default) + noexec=off: disable nn-executable mappings + + nofxsr [BUGS=IA-32] + + nohlt [BUGS=ARM] + + no-hlt [BUGS=IA-32] Tells the kernel that the hlt + instruction doesn't work correctly and not to + use it. + + nohalt [IA-64] Tells the kernel not to use the power saving + function PAL_HALT_LIGHT when idle. This increases + power-consumption. On the positive side, it reduces + interrupt wake-up latency, which may improve performance + in certain environments such as networked servers or + real-time systems. + + noirqdebug [IA-32] Disables the code which attempts to detect and + disable unhandled interrupt sources. + + noisapnp [ISAPNP] Disables ISA PnP code. + + noinitrd [RAM] Tells the kernel not to load any configured + initial RAM disk. + + nointroute [IA-64] + + nolapic [IA-32,APIC] Do not enable or use the local APIC. + + noltlbs [PPC] Do not use large page/tlb entries for kernel + lowmem mapping on PPC40x. + + nomce [IA-32] Machine Check Exception + + noresidual [PPC] Don't use residual data on PReP machines. + + noresume [SWSUSP] Disables resume and restore original swap space. + + no-scroll [VGA] Disables scrollback. + This is required for the Braillex ib80-piezo Braille + reader made by F.H. Papenmeier (Germany). + + nosbagart [IA-64] + + nosmp [SMP] Tells an SMP kernel to act as a UP kernel. + + nosync [HW,M68K] Disables sync negotiation for all devices. + + notsc [BUGS=IA-32] Disable Time Stamp Counter + + nousb [USB] Disable the USB subsystem + + nowb [ARM] + + opl3= [HW,OSS] + Format: <io> + + opl3sa= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma2>,<mpu_io>,<mpu_irq> + + opl3sa2= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma2>,<mss_io>,<mpu_io>,<ymode>,<loopback>[,<isapnp>,<multiple] + + oprofile.timer= [HW] + Use timer interrupt instead of performance counters + + optcd= [HW,CD] + Format: <io> + + osst= [HW,SCSI] SCSI Tape Driver + Format: <buffer_size>,<write_threshold> + See also Documentation/scsi/st.txt. + + panic= [KNL] Kernel behaviour on panic + Format: <timeout> + + parkbd.port= [HW] Parallel port number the keyboard adapter is + connected to, default is 0. + Format: <parport#> + parkbd.mode= [HW] Parallel port keyboard adapter mode of operation, + 0 for XT, 1 for AT (default is AT). + Format: <mode> + + parport=0 [HW,PPT] Specify parallel ports. 0 disables. + parport=auto Use 'auto' to force the driver to use + parport=0xBBB[,IRQ[,DMA]] any IRQ/DMA settings detected (the + default is to ignore detected IRQ/DMA + settings because of possible + conflicts). You can specify the base + address, IRQ, and DMA settings; IRQ and + DMA should be numbers, or 'auto' (for + using detected settings on that + particular port), or 'nofifo' (to avoid + using a FIFO even if it is detected). + Parallel ports are assigned in the + order they are specified on the command + line, starting with parport0. + + parport_init_mode= + [HW,PPT] Configure VIA parallel port to + operate in specific mode. This is + necessary on Pegasos computer where + firmware has no options for setting up + parallel port mode and sets it to + spp. Currently this function knows + 686a and 8231 chips. + Format: [spp|ps2|epp|ecp|ecpepp] + + pas2= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma16>,<sb_io>,<sb_irq>,<sb_dma>,<sb_dma16> + + pas16= [HW,SCSI] + See header of drivers/scsi/pas16.c. + + pcbit= [HW,ISDN] + + pcd. [PARIDE] + See header of drivers/block/paride/pcd.c. + See also Documentation/paride.txt. + + pci=option[,option...] [PCI] various PCI subsystem options: + off [IA-32] don't probe for the PCI bus + bios [IA-32] force use of PCI BIOS, don't access + the hardware directly. Use this if your machine + has a non-standard PCI host bridge. + nobios [IA-32] disallow use of PCI BIOS, only direct + hardware access methods are allowed. Use this + if you experience crashes upon bootup and you + suspect they are caused by the BIOS. + conf1 [IA-32] Force use of PCI Configuration Mechanism 1. + conf2 [IA-32] Force use of PCI Configuration Mechanism 2. + nosort [IA-32] Don't sort PCI devices according to + order given by the PCI BIOS. This sorting is done + to get a device order compatible with older kernels. + biosirq [IA-32] Use PCI BIOS calls to get the interrupt + routing table. These calls are known to be buggy + on several machines and they hang the machine when used, + but on other computers it's the only way to get the + interrupt routing table. Try this option if the kernel + is unable to allocate IRQs or discover secondary PCI + buses on your motherboard. + rom [IA-32] Assign address space to expansion ROMs. + Use with caution as certain devices share address + decoders between ROMs and other resources. + irqmask=0xMMMM [IA-32] Set a bit mask of IRQs allowed to be assigned + automatically to PCI devices. You can make the kernel + exclude IRQs of your ISA cards this way. + lastbus=N [IA-32] Scan all buses till bus #N. Can be useful + if the kernel is unable to find your secondary buses + and you want to tell it explicitly which ones they are. + assign-busses [IA-32] Always assign all PCI bus + numbers ourselves, overriding + whatever the firmware may have + done. + usepirqmask [IA-32] Honor the possible IRQ mask + stored in the BIOS $PIR table. This is + needed on some systems with broken + BIOSes, notably some HP Pavilion N5400 + and Omnibook XE3 notebooks. This will + have no effect if ACPI IRQ routing is + enabled. + noacpi [IA-32] Do not use ACPI for IRQ routing + or for PCI scanning. + routeirq Do IRQ routing for all PCI devices. + This is normally done in pci_enable_device(), + so this option is a temporary workaround + for broken drivers that don't call it. + + firmware [ARM] Do not re-enumerate the bus but + instead just use the configuration + from the bootloader. This is currently + used on IXP2000 systems where the + bus has to be configured a certain way + for adjunct CPUs. + + pcmv= [HW,PCMCIA] BadgePAD 4 + + pd. [PARIDE] + See Documentation/paride.txt. + + pdcchassis= [PARISC,HW] Disable/Enable PDC Chassis Status codes at + boot time. + Format: { 0 | 1 } + See arch/parisc/kernel/pdc_chassis.c + + pf. [PARIDE] + See Documentation/paride.txt. + + pg. [PARIDE] + See Documentation/paride.txt. + + pirq= [SMP,APIC] Manual mp-table setup + See Documentation/i386/IO-APIC.txt. + + plip= [PPT,NET] Parallel port network link + Format: { parport<nr> | timid | 0 } + See also Documentation/parport.txt. + + pnpacpi= [ACPI] + { off } + + pnpbios= [ISAPNP] + { on | off | curr | res | no-curr | no-res } + + pnp_reserve_irq= + [ISAPNP] Exclude IRQs for the autoconfiguration + + pnp_reserve_dma= + [ISAPNP] Exclude DMAs for the autoconfiguration + + pnp_reserve_io= [ISAPNP] Exclude I/O ports for the autoconfiguration + Ranges are in pairs (I/O port base and size). + + pnp_reserve_mem= + [ISAPNP] Exclude memory regions for the autoconfiguration + Ranges are in pairs (memory base and size). + + profile= [KNL] Enable kernel profiling via /proc/profile + { schedule | <number> } + (param: schedule - profile schedule points} + (param: profile step/bucket size as a power of 2 for + statistical time based profiling) + + processor.max_cstate= [HW, ACPI] + Limit processor to maximum C-state + max_cstate=9 overrides any DMI blacklist limit. + + prompt_ramdisk= [RAM] List of RAM disks to prompt for floppy disk + before loading. + See Documentation/ramdisk.txt. + + psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to + probe for (bare|imps|exps). + psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports + per second. + psmouse.resetafter= + [HW,MOUSE] Try to reset the device after so many bad packets + (0 = never). + psmouse.resolution= + [HW,MOUSE] Set desired mouse resolution, in dpi. + psmouse.smartscroll= + [HW,MOUSE] Controls Logitech smartscroll autorepeat, + 0 = disabled, 1 = enabled (default). + + pss= [HW,OSS] Personal Sound System (ECHO ESC614) + Format: <io>,<mss_io>,<mss_irq>,<mss_dma>,<mpu_io>,<mpu_irq> + + pt. [PARIDE] + See Documentation/paride.txt. + + quiet= [KNL] Disable log messages + + r128= [HW,DRM] + + raid= [HW,RAID] + See Documentation/md.txt. + + ramdisk= [RAM] Sizes of RAM disks in kilobytes [deprecated] + See Documentation/ramdisk.txt. + + ramdisk_blocksize= + [RAM] + See Documentation/ramdisk.txt. + + ramdisk_size= [RAM] Sizes of RAM disks in kilobytes + New name for the ramdisk parameter. + See Documentation/ramdisk.txt. + + reboot= [BUGS=IA-32,BUGS=ARM,BUGS=IA-64] Rebooting mode + Format: <reboot_mode>[,<reboot_mode2>[,...]] + See arch/*/kernel/reboot.c. + + reserve= [KNL,BUGS] Force the kernel to ignore some iomem area + + resume= [SWSUSP] Specify the partition device for software suspension + + rhash_entries= [KNL,NET] + Set number of hash buckets for route cache + + riscom8= [HW,SERIAL] + Format: <io_board1>[,<io_board2>[,...<io_boardN>]] + + ro [KNL] Mount root device read-only on boot + + root= [KNL] Root filesystem + + rootdelay= [KNL] Delay (in seconds) to pause before attempting to + mount the root filesystem + + rootflags= [KNL] Set root filesystem mount option string + + rootfstype= [KNL] Set root filesystem type + + rw [KNL] Mount root device read-write on boot + + S [KNL] Run init in single mode + + sa1100ir [NET] + See drivers/net/irda/sa1100_ir.c. + + sb= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma2> + + sbni= [NET] Granch SBNI12 leased line adapter + + sbpcd= [HW,CD] Soundblaster CD adapter + Format: <io>,<type> + See a comment before function sbpcd_setup() in + drivers/cdrom/sbpcd.c. + + sc1200wdt= [HW,WDT] SC1200 WDT (watchdog) driver + Format: <io>[,<timeout>[,<isapnp>]] + + scsi_debug_*= [SCSI] + See drivers/scsi/scsi_debug.c. + + scsi_default_dev_flags= + [SCSI] SCSI default device flags + Format: <integer> + + scsi_dev_flags= [SCSI] Black/white list entry for vendor and model + Format: <vendor>:<model>:<flags> + (flags are integer value) + + scsi_logging= [SCSI] + + selinux [SELINUX] Disable or enable SELinux at boot time. + Format: { "0" | "1" } + See security/selinux/Kconfig help text. + 0 -- disable. + 1 -- enable. + Default value is set via kernel config option. + If enabled at boot time, /selinux/disable can be used + later to disable prior to initial policy load. + + serialnumber [BUGS=IA-32] + + sg_def_reserved_size= + [SCSI] + + sgalaxy= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma2>,<sgbase> + + shapers= [NET] + Maximal number of shapers. + + sim710= [SCSI,HW] + See header of drivers/scsi/sim710.c. + + simeth= [IA-64] + simscsi= + + sjcd= [HW,CD] + Format: <io>,<irq>,<dma> + See header of drivers/cdrom/sjcd.c. + + slram= [HW,MTD] + + smart2= [HW] + Format: <io1>[,<io2>[,...,<io8>]] + + snd-ad1816a= [HW,ALSA] + + snd-ad1848= [HW,ALSA] + + snd-ali5451= [HW,ALSA] + + snd-als100= [HW,ALSA] + + snd-als4000= [HW,ALSA] + + snd-azt2320= [HW,ALSA] + + snd-cmi8330= [HW,ALSA] + + snd-cmipci= [HW,ALSA] + + snd-cs4231= [HW,ALSA] + + snd-cs4232= [HW,ALSA] + + snd-cs4236= [HW,ALSA] + + snd-cs4281= [HW,ALSA] + + snd-cs46xx= [HW,ALSA] + + snd-dt019x= [HW,ALSA] + + snd-dummy= [HW,ALSA] + + snd-emu10k1= [HW,ALSA] + + snd-ens1370= [HW,ALSA] + + snd-ens1371= [HW,ALSA] + + snd-es968= [HW,ALSA] + + snd-es1688= [HW,ALSA] + + snd-es18xx= [HW,ALSA] + + snd-es1938= [HW,ALSA] + + snd-es1968= [HW,ALSA] + + snd-fm801= [HW,ALSA] + + snd-gusclassic= [HW,ALSA] + + snd-gusextreme= [HW,ALSA] + + snd-gusmax= [HW,ALSA] + + snd-hdsp= [HW,ALSA] + + snd-ice1712= [HW,ALSA] + + snd-intel8x0= [HW,ALSA] + + snd-interwave= [HW,ALSA] + + snd-interwave-stb= + [HW,ALSA] + + snd-korg1212= [HW,ALSA] + + snd-maestro3= [HW,ALSA] + + snd-mpu401= [HW,ALSA] + + snd-mtpav= [HW,ALSA] + + snd-nm256= [HW,ALSA] + + snd-opl3sa2= [HW,ALSA] + + snd-opti92x-ad1848= + [HW,ALSA] + + snd-opti92x-cs4231= + [HW,ALSA] + + snd-opti93x= [HW,ALSA] + + snd-pmac= [HW,ALSA] + + snd-rme32= [HW,ALSA] + + snd-rme96= [HW,ALSA] + + snd-rme9652= [HW,ALSA] + + snd-sb8= [HW,ALSA] + + snd-sb16= [HW,ALSA] + + snd-sbawe= [HW,ALSA] + + snd-serial= [HW,ALSA] + + snd-sgalaxy= [HW,ALSA] + + snd-sonicvibes= [HW,ALSA] + + snd-sun-amd7930= + [HW,ALSA] + + snd-sun-cs4231= [HW,ALSA] + + snd-trident= [HW,ALSA] + + snd-usb-audio= [HW,ALSA,USB] + + snd-via82xx= [HW,ALSA] + + snd-virmidi= [HW,ALSA] + + snd-wavefront= [HW,ALSA] + + snd-ymfpci= [HW,ALSA] + + sonicvibes= [HW,OSS] + Format: <reverb> + + sonycd535= [HW,CD] + Format: <io>[,<irq>] + + sonypi.*= [HW] Sony Programmable I/O Control Device driver + See Documentation/sonypi.txt + + specialix= [HW,SERIAL] Specialix multi-serial port adapter + See Documentation/specialix.txt. + + spia_io_base= [HW,MTD] + spia_fio_base= + spia_pedr= + spia_peddr= + + sscape= [HW,OSS] + Format: <io>,<irq>,<dma>,<mpu_io>,<mpu_irq> + + st= [HW,SCSI] SCSI tape parameters (buffers, etc.) + See Documentation/scsi/st.txt. + + st0x= [HW,SCSI] + See header of drivers/scsi/seagate.c. + + sti= [PARISC,HW] + Format: <num> + Set the STI (builtin display/keyboard on the HP-PARISC + machines) console (graphic card) which should be used + as the initial boot-console. + See also comment in drivers/video/console/sticore.c. + + sti_font= [HW] + See comment in drivers/video/console/sticore.c. + + stifb= [HW] + Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]] + + stram_swap= [HW,M68k] + + swiotlb= [IA-64] Number of I/O TLB slabs + + switches= [HW,M68k] + + sym53c416= [HW,SCSI] + See header of drivers/scsi/sym53c416.c. + + t128= [HW,SCSI] + See header of drivers/scsi/t128.c. + + tdfx= [HW,DRM] + + thash_entries= [KNL,NET] + Set number of hash buckets for TCP connection + + time Show timing data prefixed to each printk message line + + tipar.timeout= [HW,PPT] + Set communications timeout in tenths of a second + (default 15). + + tipar.delay= [HW,PPT] + Set inter-bit delay in microseconds (default 10). + + tmc8xx= [HW,SCSI] + See header of drivers/scsi/seagate.c. + + tmscsim= [HW,SCSI] + See comment before function dc390_setup() in + drivers/scsi/tmscsim.c. + + tp720= [HW,PS2] + + trix= [HW,OSS] MediaTrix AudioTrix Pro + Format: <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq> + + tsdev.xres= [TS] Horizontal screen resolution. + tsdev.yres= [TS] Vertical screen resolution. + + turbografx.map[2|3]= + [HW,JOY] TurboGraFX parallel port interface + Format: <port#>,<js1>,<js2>,<js3>,<js4>,<js5>,<js6>,<js7> + See also Documentation/input/joystick-parport.txt + + u14-34f= [HW,SCSI] UltraStor 14F/34F SCSI host adapter + See header of drivers/scsi/u14-34f.c. + + uart401= [HW,OSS] + Format: <io>,<irq> + + uart6850= [HW,OSS] + Format: <io>,<irq> + + usb-handoff [HW] Enable early USB BIOS -> OS handoff + + usbhid.mousepoll= + [USBHID] The interval which mice are to be polled at. + + video= [FB] Frame buffer configuration + See Documentation/fb/modedb.txt. + + vga= [BOOT,IA-32] Select a particular video mode + See Documentation/i386/boot.txt and Documentation/svga.txt. + Use vga=ask for menu. + This is actually a boot loader parameter; the value is + passed to the kernel using a special protocol. + + vmalloc=nn[KMG] [KNL,BOOT] forces the vmalloc area to have an exact + size of <nn>. This can be used to increase the + minimum size (128MB on x86). It can also be used to + decrease the size and leave more room for directly + mapped kernel RAM. + + vmhalt= [KNL,S390] + + vmpoff= [KNL,S390] + + waveartist= [HW,OSS] + Format: <io>,<irq>,<dma>,<dma2> + + wd33c93= [HW,SCSI] + See header of drivers/scsi/wd33c93.c. + + wd7000= [HW,SCSI] + See header of drivers/scsi/wd7000.c. + + wdt= [WDT] Watchdog + See Documentation/watchdog/watchdog.txt. + + xd= [HW,XT] Original XT pre-IDE (RLL encoded) disks. + xd_geo= See header of drivers/block/xd.c. + + xirc2ps_cs= [NET,PCMCIA] + Format: <irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]] + + + +Changelog: + + The last known update (for 2.4.0) - the changelog was not kept before. + 2000-06-?? Mr. Unknown + + Update for 2.5.49, description for most of the options introduced, + references to other documentation (C files, READMEs, ..), added S390, + PPC, SPARC, MTD, ALSA and OSS category. Minor corrections and + reformatting. + 2002-11-24 Petr Baudis <pasky@ucw.cz> + Randy Dunlap <randy.dunlap@verizon.net> + +TODO: + + Add documentation for ALSA options. + Add more DRM drivers. diff --git a/Documentation/keys.txt b/Documentation/keys.txt new file mode 100644 index 000000000000..36d80aeeaf28 --- /dev/null +++ b/Documentation/keys.txt @@ -0,0 +1,869 @@ + ============================ + KERNEL KEY RETENTION SERVICE + ============================ + +This service allows cryptographic keys, authentication tokens, cross-domain +user mappings, and similar to be cached in the kernel for the use of +filesystems other kernel services. + +Keyrings are permitted; these are a special type of key that can hold links to +other keys. Processes each have three standard keyring subscriptions that a +kernel service can search for relevant keys. + +The key service can be configured on by enabling: + + "Security options"/"Enable access key retention support" (CONFIG_KEYS) + +This document has the following sections: + + - Key overview + - Key service overview + - Key access permissions + - New procfs files + - Userspace system call interface + - Kernel services + - Defining a key type + - Request-key callback service + - Key access filesystem + + +============ +KEY OVERVIEW +============ + +In this context, keys represent units of cryptographic data, authentication +tokens, keyrings, etc.. These are represented in the kernel by struct key. + +Each key has a number of attributes: + + - A serial number. + - A type. + - A description (for matching a key in a search). + - Access control information. + - An expiry time. + - A payload. + - State. + + + (*) Each key is issued a serial number of type key_serial_t that is unique + for the lifetime of that key. All serial numbers are positive non-zero + 32-bit integers. + + Userspace programs can use a key's serial numbers as a way to gain access + to it, subject to permission checking. + + (*) Each key is of a defined "type". Types must be registered inside the + kernel by a kernel service (such as a filesystem) before keys of that + type can be added or used. Userspace programs cannot define new types + directly. + + Key types are represented in the kernel by struct key_type. This defines + a number of operations that can be performed on a key of that type. + + Should a type be removed from the system, all the keys of that type will + be invalidated. + + (*) Each key has a description. This should be a printable string. The key + type provides an operation to perform a match between the description on + a key and a criterion string. + + (*) Each key has an owner user ID, a group ID and a permissions mask. These + are used to control what a process may do to a key from userspace, and + whether a kernel service will be able to find the key. + + (*) Each key can be set to expire at a specific time by the key type's + instantiation function. Keys can also be immortal. + + (*) Each key can have a payload. This is a quantity of data that represent + the actual "key". In the case of a keyring, this is a list of keys to + which the keyring links; in the case of a user-defined key, it's an + arbitrary blob of data. + + Having a payload is not required; and the payload can, in fact, just be a + value stored in the struct key itself. + + When a key is instantiated, the key type's instantiation function is + called with a blob of data, and that then creates the key's payload in + some way. + + Similarly, when userspace wants to read back the contents of the key, if + permitted, another key type operation will be called to convert the key's + attached payload back into a blob of data. + + (*) Each key can be in one of a number of basic states: + + (*) Uninstantiated. The key exists, but does not have any data + attached. Keys being requested from userspace will be in this state. + + (*) Instantiated. This is the normal state. The key is fully formed, and + has data attached. + + (*) Negative. This is a relatively short-lived state. The key acts as a + note saying that a previous call out to userspace failed, and acts as + a throttle on key lookups. A negative key can be updated to a normal + state. + + (*) Expired. Keys can have lifetimes set. If their lifetime is exceeded, + they traverse to this state. An expired key can be updated back to a + normal state. + + (*) Revoked. A key is put in this state by userspace action. It can't be + found or operated upon (apart from by unlinking it). + + (*) Dead. The key's type was unregistered, and so the key is now useless. + + +==================== +KEY SERVICE OVERVIEW +==================== + +The key service provides a number of features besides keys: + + (*) The key service defines two special key types: + + (+) "keyring" + + Keyrings are special keys that contain a list of other keys. Keyring + lists can be modified using various system calls. Keyrings should not + be given a payload when created. + + (+) "user" + + A key of this type has a description and a payload that are arbitrary + blobs of data. These can be created, updated and read by userspace, + and aren't intended for use by kernel services. + + (*) Each process subscribes to three keyrings: a thread-specific keyring, a + process-specific keyring, and a session-specific keyring. + + The thread-specific keyring is discarded from the child when any sort of + clone, fork, vfork or execve occurs. A new keyring is created only when + required. + + The process-specific keyring is replaced with an empty one in the child + on clone, fork, vfork unless CLONE_THREAD is supplied, in which case it + is shared. execve also discards the process's process keyring and creates + a new one. + + The session-specific keyring is persistent across clone, fork, vfork and + execve, even when the latter executes a set-UID or set-GID binary. A + process can, however, replace its current session keyring with a new one + by using PR_JOIN_SESSION_KEYRING. It is permitted to request an anonymous + new one, or to attempt to create or join one of a specific name. + + The ownership of the thread keyring changes when the real UID and GID of + the thread changes. + + (*) Each user ID resident in the system holds two special keyrings: a user + specific keyring and a default user session keyring. The default session + keyring is initialised with a link to the user-specific keyring. + + When a process changes its real UID, if it used to have no session key, it + will be subscribed to the default session key for the new UID. + + If a process attempts to access its session key when it doesn't have one, + it will be subscribed to the default for its current UID. + + (*) Each user has two quotas against which the keys they own are tracked. One + limits the total number of keys and keyrings, the other limits the total + amount of description and payload space that can be consumed. + + The user can view information on this and other statistics through procfs + files. + + Process-specific and thread-specific keyrings are not counted towards a + user's quota. + + If a system call that modifies a key or keyring in some way would put the + user over quota, the operation is refused and error EDQUOT is returned. + + (*) There's a system call interface by which userspace programs can create + and manipulate keys and keyrings. + + (*) There's a kernel interface by which services can register types and + search for keys. + + (*) There's a way for the a search done from the kernel to call back to + userspace to request a key that can't be found in a process's keyrings. + + (*) An optional filesystem is available through which the key database can be + viewed and manipulated. + + +====================== +KEY ACCESS PERMISSIONS +====================== + +Keys have an owner user ID, a group access ID, and a permissions mask. The +mask has up to eight bits each for user, group and other access. Only five of +each set of eight bits are defined. These permissions granted are: + + (*) View + + This permits a key or keyring's attributes to be viewed - including key + type and description. + + (*) Read + + This permits a key's payload to be viewed or a keyring's list of linked + keys. + + (*) Write + + This permits a key's payload to be instantiated or updated, or it allows + a link to be added to or removed from a keyring. + + (*) Search + + This permits keyrings to be searched and keys to be found. Searches can + only recurse into nested keyrings that have search permission set. + + (*) Link + + This permits a key or keyring to be linked to. To create a link from a + keyring to a key, a process must have Write permission on the keyring and + Link permission on the key. + +For changing the ownership, group ID or permissions mask, being the owner of +the key or having the sysadmin capability is sufficient. + + +================ +NEW PROCFS FILES +================ + +Two files have been added to procfs by which an administrator can find out +about the status of the key service: + + (*) /proc/keys + + This lists all the keys on the system, giving information about their + type, description and permissions. The payload of the key is not + available this way: + + SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY + 00000001 I----- 39 perm 1f0000 0 0 keyring _uid_ses.0: 1/4 + 00000002 I----- 2 perm 1f0000 0 0 keyring _uid.0: empty + 00000007 I----- 1 perm 1f0000 0 0 keyring _pid.1: empty + 0000018d I----- 1 perm 1f0000 0 0 keyring _pid.412: empty + 000004d2 I--Q-- 1 perm 1f0000 32 -1 keyring _uid.32: 1/4 + 000004d3 I--Q-- 3 perm 1f0000 32 -1 keyring _uid_ses.32: empty + 00000892 I--QU- 1 perm 1f0000 0 0 user metal:copper: 0 + 00000893 I--Q-N 1 35s 1f0000 0 0 user metal:silver: 0 + 00000894 I--Q-- 1 10h 1f0000 0 0 user metal:gold: 0 + + The flags are: + + I Instantiated + R Revoked + D Dead + Q Contributes to user's quota + U Under contruction by callback to userspace + N Negative key + + This file must be enabled at kernel configuration time as it allows anyone + to list the keys database. + + (*) /proc/key-users + + This file lists the tracking data for each user that has at least one key + on the system. Such data includes quota information and statistics: + + [root@andromeda root]# cat /proc/key-users + 0: 46 45/45 1/100 13/10000 + 29: 2 2/2 2/100 40/10000 + 32: 2 2/2 2/100 40/10000 + 38: 2 2/2 2/100 40/10000 + + The format of each line is + <UID>: User ID to which this applies + <usage> Structure refcount + <inst>/<keys> Total number of keys and number instantiated + <keys>/<max> Key count quota + <bytes>/<max> Key size quota + + +=============================== +USERSPACE SYSTEM CALL INTERFACE +=============================== + +Userspace can manipulate keys directly through three new syscalls: add_key, +request_key and keyctl. The latter provides a number of functions for +manipulating keys. + +When referring to a key directly, userspace programs should use the key's +serial number (a positive 32-bit integer). However, there are some special +values available for referring to special keys and keyrings that relate to the +process making the call: + + CONSTANT VALUE KEY REFERENCED + ============================== ====== =========================== + KEY_SPEC_THREAD_KEYRING -1 thread-specific keyring + KEY_SPEC_PROCESS_KEYRING -2 process-specific keyring + KEY_SPEC_SESSION_KEYRING -3 session-specific keyring + KEY_SPEC_USER_KEYRING -4 UID-specific keyring + KEY_SPEC_USER_SESSION_KEYRING -5 UID-session keyring + KEY_SPEC_GROUP_KEYRING -6 GID-specific keyring + + +The main syscalls are: + + (*) Create a new key of given type, description and payload and add it to the + nominated keyring: + + key_serial_t add_key(const char *type, const char *desc, + const void *payload, size_t plen, + key_serial_t keyring); + + If a key of the same type and description as that proposed already exists + in the keyring, this will try to update it with the given payload, or it + will return error EEXIST if that function is not supported by the key + type. The process must also have permission to write to the key to be + able to update it. The new key will have all user permissions granted and + no group or third party permissions. + + Otherwise, this will attempt to create a new key of the specified type + and description, and to instantiate it with the supplied payload and + attach it to the keyring. In this case, an error will be generated if the + process does not have permission to write to the keyring. + + The payload is optional, and the pointer can be NULL if not required by + the type. The payload is plen in size, and plen can be zero for an empty + payload. + + A new keyring can be generated by setting type "keyring", the keyring + name as the description (or NULL) and setting the payload to NULL. + + User defined keys can be created by specifying type "user". It is + recommended that a user defined key's description by prefixed with a type + ID and a colon, such as "krb5tgt:" for a Kerberos 5 ticket granting + ticket. + + Any other type must have been registered with the kernel in advance by a + kernel service such as a filesystem. + + The ID of the new or updated key is returned if successful. + + + (*) Search the process's keyrings for a key, potentially calling out to + userspace to create it. + + key_serial_t request_key(const char *type, const char *description, + const char *callout_info, + key_serial_t dest_keyring); + + This function searches all the process's keyrings in the order thread, + process, session for a matching key. This works very much like + KEYCTL_SEARCH, including the optional attachment of the discovered key to + a keyring. + + If a key cannot be found, and if callout_info is not NULL, then + /sbin/request-key will be invoked in an attempt to obtain a key. The + callout_info string will be passed as an argument to the program. + + +The keyctl syscall functions are: + + (*) Map a special key ID to a real key ID for this process: + + key_serial_t keyctl(KEYCTL_GET_KEYRING_ID, key_serial_t id, + int create); + + The special key specified by "id" is looked up (with the key being + created if necessary) and the ID of the key or keyring thus found is + returned if it exists. + + If the key does not yet exist, the key will be created if "create" is + non-zero; and the error ENOKEY will be returned if "create" is zero. + + + (*) Replace the session keyring this process subscribes to with a new one: + + key_serial_t keyctl(KEYCTL_JOIN_SESSION_KEYRING, const char *name); + + If name is NULL, an anonymous keyring is created attached to the process + as its session keyring, displacing the old session keyring. + + If name is not NULL, if a keyring of that name exists, the process + attempts to attach it as the session keyring, returning an error if that + is not permitted; otherwise a new keyring of that name is created and + attached as the session keyring. + + To attach to a named keyring, the keyring must have search permission for + the process's ownership. + + The ID of the new session keyring is returned if successful. + + + (*) Update the specified key: + + long keyctl(KEYCTL_UPDATE, key_serial_t key, const void *payload, + size_t plen); + + This will try to update the specified key with the given payload, or it + will return error EOPNOTSUPP if that function is not supported by the key + type. The process must also have permission to write to the key to be + able to update it. + + The payload is of length plen, and may be absent or empty as for + add_key(). + + + (*) Revoke a key: + + long keyctl(KEYCTL_REVOKE, key_serial_t key); + + This makes a key unavailable for further operations. Further attempts to + use the key will be met with error EKEYREVOKED, and the key will no longer + be findable. + + + (*) Change the ownership of a key: + + long keyctl(KEYCTL_CHOWN, key_serial_t key, uid_t uid, gid_t gid); + + This function permits a key's owner and group ID to be changed. Either + one of uid or gid can be set to -1 to suppress that change. + + Only the superuser can change a key's owner to something other than the + key's current owner. Similarly, only the superuser can change a key's + group ID to something other than the calling process's group ID or one of + its group list members. + + + (*) Change the permissions mask on a key: + + long keyctl(KEYCTL_SETPERM, key_serial_t key, key_perm_t perm); + + This function permits the owner of a key or the superuser to change the + permissions mask on a key. + + Only bits the available bits are permitted; if any other bits are set, + error EINVAL will be returned. + + + (*) Describe a key: + + long keyctl(KEYCTL_DESCRIBE, key_serial_t key, char *buffer, + size_t buflen); + + This function returns a summary of the key's attributes (but not its + payload data) as a string in the buffer provided. + + Unless there's an error, it always returns the amount of data it could + produce, even if that's too big for the buffer, but it won't copy more + than requested to userspace. If the buffer pointer is NULL then no copy + will take place. + + A process must have view permission on the key for this function to be + successful. + + If successful, a string is placed in the buffer in the following format: + + <type>;<uid>;<gid>;<perm>;<description> + + Where type and description are strings, uid and gid are decimal, and perm + is hexadecimal. A NUL character is included at the end of the string if + the buffer is sufficiently big. + + This can be parsed with + + sscanf(buffer, "%[^;];%d;%d;%o;%s", type, &uid, &gid, &mode, desc); + + + (*) Clear out a keyring: + + long keyctl(KEYCTL_CLEAR, key_serial_t keyring); + + This function clears the list of keys attached to a keyring. The calling + process must have write permission on the keyring, and it must be a + keyring (or else error ENOTDIR will result). + + + (*) Link a key into a keyring: + + long keyctl(KEYCTL_LINK, key_serial_t keyring, key_serial_t key); + + This function creates a link from the keyring to the key. The process + must have write permission on the keyring and must have link permission + on the key. + + Should the keyring not be a keyring, error ENOTDIR will result; and if + the keyring is full, error ENFILE will result. + + The link procedure checks the nesting of the keyrings, returning ELOOP if + it appears to deep or EDEADLK if the link would introduce a cycle. + + + (*) Unlink a key or keyring from another keyring: + + long keyctl(KEYCTL_UNLINK, key_serial_t keyring, key_serial_t key); + + This function looks through the keyring for the first link to the + specified key, and removes it if found. Subsequent links to that key are + ignored. The process must have write permission on the keyring. + + If the keyring is not a keyring, error ENOTDIR will result; and if the + key is not present, error ENOENT will be the result. + + + (*) Search a keyring tree for a key: + + key_serial_t keyctl(KEYCTL_SEARCH, key_serial_t keyring, + const char *type, const char *description, + key_serial_t dest_keyring); + + This searches the keyring tree headed by the specified keyring until a + key is found that matches the type and description criteria. Each keyring + is checked for keys before recursion into its children occurs. + + The process must have search permission on the top level keyring, or else + error EACCES will result. Only keyrings that the process has search + permission on will be recursed into, and only keys and keyrings for which + a process has search permission can be matched. If the specified keyring + is not a keyring, ENOTDIR will result. + + If the search succeeds, the function will attempt to link the found key + into the destination keyring if one is supplied (non-zero ID). All the + constraints applicable to KEYCTL_LINK apply in this case too. + + Error ENOKEY, EKEYREVOKED or EKEYEXPIRED will be returned if the search + fails. On success, the resulting key ID will be returned. + + + (*) Read the payload data from a key: + + key_serial_t keyctl(KEYCTL_READ, key_serial_t keyring, char *buffer, + size_t buflen); + + This function attempts to read the payload data from the specified key + into the buffer. The process must have read permission on the key to + succeed. + + The returned data will be processed for presentation by the key type. For + instance, a keyring will return an array of key_serial_t entries + representing the IDs of all the keys to which it is subscribed. The user + defined key type will return its data as is. If a key type does not + implement this function, error EOPNOTSUPP will result. + + As much of the data as can be fitted into the buffer will be copied to + userspace if the buffer pointer is not NULL. + + On a successful return, the function will always return the amount of + data available rather than the amount copied. + + + (*) Instantiate a partially constructed key. + + key_serial_t keyctl(KEYCTL_INSTANTIATE, key_serial_t key, + const void *payload, size_t plen, + key_serial_t keyring); + + If the kernel calls back to userspace to complete the instantiation of a + key, userspace should use this call to supply data for the key before the + invoked process returns, or else the key will be marked negative + automatically. + + The process must have write access on the key to be able to instantiate + it, and the key must be uninstantiated. + + If a keyring is specified (non-zero), the key will also be linked into + that keyring, however all the constraints applying in KEYCTL_LINK apply + in this case too. + + The payload and plen arguments describe the payload data as for add_key(). + + + (*) Negatively instantiate a partially constructed key. + + key_serial_t keyctl(KEYCTL_NEGATE, key_serial_t key, + unsigned timeout, key_serial_t keyring); + + If the kernel calls back to userspace to complete the instantiation of a + key, userspace should use this call mark the key as negative before the + invoked process returns if it is unable to fulfil the request. + + The process must have write access on the key to be able to instantiate + it, and the key must be uninstantiated. + + If a keyring is specified (non-zero), the key will also be linked into + that keyring, however all the constraints applying in KEYCTL_LINK apply + in this case too. + + +=============== +KERNEL SERVICES +=============== + +The kernel services for key managment are fairly simple to deal with. They can +be broken down into two areas: keys and key types. + +Dealing with keys is fairly straightforward. Firstly, the kernel service +registers its type, then it searches for a key of that type. It should retain +the key as long as it has need of it, and then it should release it. For a +filesystem or device file, a search would probably be performed during the +open call, and the key released upon close. How to deal with conflicting keys +due to two different users opening the same file is left to the filesystem +author to solve. + +When accessing a key's payload data, key->lock should be at least read locked, +or else the data may be changed by an update being performed from userspace +whilst the driver or filesystem is trying to access it. If no update method is +supplied, then the key's payload may be accessed without holding a lock as +there is no way to change it, provided it can be guaranteed that the key's +type definition won't go away. + +(*) To search for a key, call: + + struct key *request_key(const struct key_type *type, + const char *description, + const char *callout_string); + + This is used to request a key or keyring with a description that matches + the description specified according to the key type's match function. This + permits approximate matching to occur. If callout_string is not NULL, then + /sbin/request-key will be invoked in an attempt to obtain the key from + userspace. In that case, callout_string will be passed as an argument to + the program. + + Should the function fail error ENOKEY, EKEYEXPIRED or EKEYREVOKED will be + returned. + + +(*) When it is no longer required, the key should be released using: + + void key_put(struct key *key); + + This can be called from interrupt context. If CONFIG_KEYS is not set then + the argument will not be parsed. + + +(*) Extra references can be made to a key by calling the following function: + + struct key *key_get(struct key *key); + + These need to be disposed of by calling key_put() when they've been + finished with. The key pointer passed in will be returned. If the pointer + is NULL or CONFIG_KEYS is not set then the key will not be dereferenced and + no increment will take place. + + +(*) A key's serial number can be obtained by calling: + + key_serial_t key_serial(struct key *key); + + If key is NULL or if CONFIG_KEYS is not set then 0 will be returned (in the + latter case without parsing the argument). + + +(*) If a keyring was found in the search, this can be further searched by: + + struct key *keyring_search(struct key *keyring, + const struct key_type *type, + const char *description) + + This searches the keyring tree specified for a matching key. Error ENOKEY + is returned upon failure. If successful, the returned key will need to be + released. + + +(*) To check the validity of a key, this function can be called: + + int validate_key(struct key *key); + + This checks that the key in question hasn't expired or and hasn't been + revoked. Should the key be invalid, error EKEYEXPIRED or EKEYREVOKED will + be returned. If the key is NULL or if CONFIG_KEYS is not set then 0 will be + returned (in the latter case without parsing the argument). + + +(*) To register a key type, the following function should be called: + + int register_key_type(struct key_type *type); + + This will return error EEXIST if a type of the same name is already + present. + + +(*) To unregister a key type, call: + + void unregister_key_type(struct key_type *type); + + +=================== +DEFINING A KEY TYPE +=================== + +A kernel service may want to define its own key type. For instance, an AFS +filesystem might want to define a Kerberos 5 ticket key type. To do this, it +author fills in a struct key_type and registers it with the system. + +The structure has a number of fields, some of which are mandatory: + + (*) const char *name + + The name of the key type. This is used to translate a key type name + supplied by userspace into a pointer to the structure. + + + (*) size_t def_datalen + + This is optional - it supplies the default payload data length as + contributed to the quota. If the key type's payload is always or almost + always the same size, then this is a more efficient way to do things. + + The data length (and quota) on a particular key can always be changed + during instantiation or update by calling: + + int key_payload_reserve(struct key *key, size_t datalen); + + With the revised data length. Error EDQUOT will be returned if this is + not viable. + + + (*) int (*instantiate)(struct key *key, const void *data, size_t datalen); + + This method is called to attach a payload to a key during construction. + The payload attached need not bear any relation to the data passed to + this function. + + If the amount of data attached to the key differs from the size in + keytype->def_datalen, then key_payload_reserve() should be called. + + This method does not have to lock the key in order to attach a payload. + The fact that KEY_FLAG_INSTANTIATED is not set in key->flags prevents + anything else from gaining access to the key. + + This method may sleep if it wishes. + + + (*) int (*duplicate)(struct key *key, const struct key *source); + + If this type of key can be duplicated, then this method should be + provided. It is called to copy the payload attached to the source into + the new key. The data length on the new key will have been updated and + the quota adjusted already. + + This method will be called with the source key's semaphore read-locked to + prevent its payload from being changed. It is safe to sleep here. + + + (*) int (*update)(struct key *key, const void *data, size_t datalen); + + If this type of key can be updated, then this method should be + provided. It is called to update a key's payload from the blob of data + provided. + + key_payload_reserve() should be called if the data length might change + before any changes are actually made. Note that if this succeeds, the + type is committed to changing the key because it's already been altered, + so all memory allocation must be done first. + + key_payload_reserve() should be called with the key->lock write locked, + and the changes to the key's attached payload should be made before the + key is locked. + + The key will have its semaphore write-locked before this method is + called. Any changes to the key should be made with the key's rwlock + write-locked also. It is safe to sleep here. + + + (*) int (*match)(const struct key *key, const void *desc); + + This method is called to match a key against a description. It should + return non-zero if the two match, zero if they don't. + + This method should not need to lock the key in any way. The type and + description can be considered invariant, and the payload should not be + accessed (the key may not yet be instantiated). + + It is not safe to sleep in this method; the caller may hold spinlocks. + + + (*) void (*destroy)(struct key *key); + + This method is optional. It is called to discard the payload data on a + key when it is being destroyed. + + This method does not need to lock the key; it can consider the key as + being inaccessible. Note that the key's type may have changed before this + function is called. + + It is not safe to sleep in this method; the caller may hold spinlocks. + + + (*) void (*describe)(const struct key *key, struct seq_file *p); + + This method is optional. It is called during /proc/keys reading to + summarise a key's description and payload in text form. + + This method will be called with the key's rwlock read-locked. This will + prevent the key's payload and state changing; also the description should + not change. This also means it is not safe to sleep in this method. + + + (*) long (*read)(const struct key *key, char __user *buffer, size_t buflen); + + This method is optional. It is called by KEYCTL_READ to translate the + key's payload into something a blob of data for userspace to deal + with. Ideally, the blob should be in the same format as that passed in to + the instantiate and update methods. + + If successful, the blob size that could be produced should be returned + rather than the size copied. + + This method will be called with the key's semaphore read-locked. This + will prevent the key's payload changing. It is not necessary to also + read-lock key->lock when accessing the key's payload. It is safe to sleep + in this method, such as might happen when the userspace buffer is + accessed. + + +============================ +REQUEST-KEY CALLBACK SERVICE +============================ + +To create a new key, the kernel will attempt to execute the following command +line: + + /sbin/request-key create <key> <uid> <gid> \ + <threadring> <processring> <sessionring> <callout_info> + +<key> is the key being constructed, and the three keyrings are the process +keyrings from the process that caused the search to be issued. These are +included for two reasons: + + (1) There may be an authentication token in one of the keyrings that is + required to obtain the key, eg: a Kerberos Ticket-Granting Ticket. + + (2) The new key should probably be cached in one of these rings. + +This program should set it UID and GID to those specified before attempting to +access any more keys. It may then look around for a user specific process to +hand the request off to (perhaps a path held in placed in another key by, for +example, the KDE desktop manager). + +The program (or whatever it calls) should finish construction of the key by +calling KEYCTL_INSTANTIATE, which also permits it to cache the key in one of +the keyrings (probably the session ring) before returning. Alternatively, the +key can be marked as negative with KEYCTL_NEGATE; this also permits the key to +be cached in one of the keyrings. + +If it returns with the key remaining in the unconstructed state, the key will +be marked as being negative, it will be added to the session keyring, and an +error will be returned to the key requestor. + +Supplementary information may be provided from whoever or whatever invoked +this service. This will be passed as the <callout_info> parameter. If no such +information was made available, then "-" will be passed as this parameter +instead. + + +Similarly, the kernel may attempt to update an expired or a soon to expire key +by executing: + + /sbin/request-key update <key> <uid> <gid> \ + <threadring> <processring> <sessionring> + +In this case, the program isn't required to actually attach the key to a ring; +the rings are provided for reference. diff --git a/Documentation/kobject.txt b/Documentation/kobject.txt new file mode 100644 index 000000000000..8d9bffbd192c --- /dev/null +++ b/Documentation/kobject.txt @@ -0,0 +1,367 @@ +The kobject Infrastructure + +Patrick Mochel <mochel@osdl.org> + +Updated: 3 June 2003 + + +Copyright (c) 2003 Patrick Mochel +Copyright (c) 2003 Open Source Development Labs + + +0. Introduction + +The kobject infrastructure performs basic object management that larger +data structures and subsystems can leverage, rather than reimplement +similar functionality. This functionality primarily concerns: + +- Object reference counting. +- Maintaining lists (sets) of objects. +- Object set locking. +- Userspace representation. + +The infrastructure consists of a number of object types to support +this functionality. Their programming interfaces are described below +in detail, and briefly here: + +- kobjects a simple object. +- kset a set of objects of a certain type. +- ktype a set of helpers for objects of a common type. +- subsystem a controlling object for a number of ksets. + + +The kobject infrastructure maintains a close relationship with the +sysfs filesystem. Each kobject that is registered with the kobject +core receives a directory in sysfs. Attributes about the kobject can +then be exported. Please see Documentation/filesystems/sysfs.txt for +more information. + +The kobject infrastructure provides a flexible programming interface, +and allows kobjects and ksets to be used without being registered +(i.e. with no sysfs representation). This is also described later. + + +1. kobjects + +1.1 Description + + +struct kobject is a simple data type that provides a foundation for +more complex object types. It provides a set of basic fields that +almost all complex data types share. kobjects are intended to be +embedded in larger data structures and replace fields they duplicate. + +1.2 Defintion + +struct kobject { + char name[KOBJ_NAME_LEN]; + atomic_t refcount; + struct list_head entry; + struct kobject * parent; + struct kset * kset; + struct kobj_type * ktype; + struct dentry * dentry; +}; + +void kobject_init(struct kobject *); +int kobject_add(struct kobject *); +int kobject_register(struct kobject *); + +void kobject_del(struct kobject *); +void kobject_unregister(struct kobject *); + +struct kobject * kobject_get(struct kobject *); +void kobject_put(struct kobject *); + + +1.3 kobject Programming Interface + +kobjects may be dynamically added and removed from the kobject core +using kobject_register() and kobject_unregister(). Registration +includes inserting the kobject in the list of its dominant kset and +creating a directory for it in sysfs. + +Alternatively, one may use a kobject without adding it to its kset's list +or exporting it via sysfs, by simply calling kobject_init(). An +initialized kobject may later be added to the object hierarchy by +calling kobject_add(). An initialized kobject may be used for +reference counting. + +Note: calling kobject_init() then kobject_add() is functionally +equivalent to calling kobject_register(). + +When a kobject is unregistered, it is removed from its kset's list, +removed from the sysfs filesystem, and its reference count is decremented. +List and sysfs removal happen in kobject_del(), and may be called +manually. kobject_put() decrements the reference count, and may also +be called manually. + +A kobject's reference count may be incremented with kobject_get(), +which returns a valid reference to a kobject; and decremented with +kobject_put(). An object's reference count may only be incremented if +it is already positive. + +When a kobject's reference count reaches 0, the method struct +kobj_type::release() (which the kobject's kset points to) is called. +This allows any memory allocated for the object to be freed. + + +NOTE!!! + +It is _imperative_ that you supply a destructor for dynamically +allocated kobjects to free them if you are using kobject reference +counts. The reference count controls the lifetime of the object. +If it goes to 0, then it is assumed that the object will +be freed and cannot be used. + +More importantly, you must free the object there, and not immediately +after an unregister call. If someone else is referencing the object +(e.g. through a sysfs file), they will obtain a reference to the +object, assume it's valid and operate on it. If the object is +unregistered and freed in the meantime, the operation will then +reference freed memory and go boom. + +This can be prevented, in the simplest case, by defining a release +method and freeing the object from there only. Note that this will not +secure reference count/object management models that use a dual +reference count or do other wacky things with the reference count +(like the networking layer). + + +1.4 sysfs + +Each kobject receives a directory in sysfs. This directory is created +under the kobject's parent directory. + +If a kobject does not have a parent when it is registered, its parent +becomes its dominant kset. + +If a kobject does not have a parent nor a dominant kset, its directory +is created at the top-level of the sysfs partition. This should only +happen for kobjects that are embedded in a struct subsystem. + + + +2. ksets + +2.1 Description + +A kset is a set of kobjects that are embedded in the same type. + + +struct kset { + struct subsystem * subsys; + struct kobj_type * ktype; + struct list_head list; + struct kobject kobj; +}; + + +void kset_init(struct kset * k); +int kset_add(struct kset * k); +int kset_register(struct kset * k); +void kset_unregister(struct kset * k); + +struct kset * kset_get(struct kset * k); +void kset_put(struct kset * k); + +struct kobject * kset_find_obj(struct kset *, char *); + + +The type that the kobjects are embedded in is described by the ktype +pointer. The subsystem that the kobject belongs to is pointed to by the +subsys pointer. + +A kset contains a kobject itself, meaning that it may be registered in +the kobject hierarchy and exported via sysfs. More importantly, the +kset may be embedded in a larger data type, and may be part of another +kset (of that object type). + +For example, a block device is an object (struct gendisk) that is +contained in a set of block devices. It may also contain a set of +partitions (struct hd_struct) that have been found on the device. The +following code snippet illustrates how to express this properly. + + struct gendisk * disk; + ... + disk->kset.kobj.kset = &block_kset; + disk->kset.ktype = &partition_ktype; + kset_register(&disk->kset); + +- The kset that the disk's embedded object belongs to is the + block_kset, and is pointed to by disk->kset.kobj.kset. + +- The type of objects on the disk's _subordinate_ list are partitions, + and is set in disk->kset.ktype. + +- The kset is then registered, which handles initializing and adding + the embedded kobject to the hierarchy. + + +2.2 kset Programming Interface + +All kset functions, except kset_find_obj(), eventually forward the +calls to their embedded kobjects after performing kset-specific +operations. ksets offer a similar programming model to kobjects: they +may be used after they are initialized, without registering them in +the hierarchy. + +kset_find_obj() may be used to locate a kobject with a particular +name. The kobject, if found, is returned. + + +2.3 sysfs + +ksets are represented in sysfs when their embedded kobjects are +registered. They follow the same rules of parenting, with one +exception. If a kset does not have a parent, nor is its embedded +kobject part of another kset, the kset's parent becomes its dominant +subsystem. + +If the kset does not have a parent, its directory is created at the +sysfs root. This should only happen when the kset registered is +embedded in a subsystem itself. + + +3. struct ktype + +3.1. Description + +struct kobj_type { + void (*release)(struct kobject *); + struct sysfs_ops * sysfs_ops; + struct attribute ** default_attrs; +}; + + +Object types require specific functions for converting between the +generic object and the more complex type. struct kobj_type provides +the object-specific fields, which include: + +- release: Called when the kobject's reference count reaches 0. This + should convert the object to the more complex type and free it. + +- sysfs_ops: Provides conversion functions for sysfs access. Please + see the sysfs documentation for more information. + +- default_attrs: Default attributes to be exported via sysfs when the + object is registered.Note that the last attribute has to be + initialized to NULL ! You can find a complete implementation + in drivers/block/genhd.c + + +Instances of struct kobj_type are not registered; only referenced by +the kset. A kobj_type may be referenced by an arbitrary number of +ksets, as there may be disparate sets of identical objects. + + + +4. subsystems + +4.1 Description + +A subsystem represents a significant entity of code that maintains an +arbitrary number of sets of objects of various types. Since the number +of ksets and the type of objects they contain are variable, a +generic representation of a subsystem is minimal. + + +struct subsystem { + struct kset kset; + struct rw_semaphore rwsem; +}; + +int subsystem_register(struct subsystem *); +void subsystem_unregister(struct subsystem *); + +struct subsystem * subsys_get(struct subsystem * s); +void subsys_put(struct subsystem * s); + + +A subsystem contains an embedded kset so: + +- It can be represented in the object hierarchy via the kset's + embedded kobject. + +- It can maintain a default list of objects of one type. + +Additional ksets may attach to the subsystem simply by referencing the +subsystem before they are registered. (This one-way reference means +that there is no way to determine the ksets that are attached to the +subsystem.) + +All ksets that are attached to a subsystem share the subsystem's R/W +semaphore. + + +4.2 subsystem Programming Interface. + +The subsystem programming interface is simple and does not offer the +flexibility that the kset and kobject programming interfaces do. They +may be registered and unregistered, as well as reference counted. Each +call forwards the calls to their embedded ksets (which forward the +calls to their embedded kobjects). + + +4.3 Helpers + +A number of macros are available to make dealing with subsystems and +their embedded objects easier. + + +decl_subsys(name,type) + +Declares a subsystem named '<name>_subsys', with an embedded kset of +type <type>. For example, + +decl_subsys(devices,&ktype_devices); + +is equivalent to doing: + +struct subsystem device_subsys = { + .kset = { + .kobj = { + .name = "devices", + }, + .ktype = &ktype_devices, + } +}; + + +The objects that are registered with a subsystem that use the +subsystem's default list must have their kset ptr set properly. These +objects may have embedded kobjects, ksets, or other subsystems. The +following helpers make setting the kset easier: + + +kobj_set_kset_s(obj,subsys) + +- Assumes that obj->kobj exists, and is a struct kobject. +- Sets the kset of that kobject to the subsystem's embedded kset. + + +kset_set_kset_s(obj,subsys) + +- Assumes that obj->kset exists, and is a struct kset. +- Sets the kset of the embedded kobject to the subsystem's + embedded kset. + +subsys_set_kset(obj,subsys) + +- Assumes obj->subsys exists, and is a struct subsystem. +- Sets obj->subsys.kset.kobj.kset to the subsystem's embedded kset. + + +4.4 sysfs + +subsystems are represented in sysfs via their embedded kobjects. They +follow the same rules as previously mentioned with no exceptions. They +typically receive a top-level directory in sysfs, except when their +embedded kobject is part of another kset, or the parent of the +embedded kobject is explicitly set. + +Note that the subsystem's embedded kset must be 'attached' to the +subsystem itself in order to use its rwsem. This is done after +kset_add() has been called. (Not before, because kset_add() uses its +subsystem for a default parent if it doesn't already have one). + diff --git a/Documentation/laptop-mode.txt b/Documentation/laptop-mode.txt new file mode 100644 index 000000000000..dc4e810afdcd --- /dev/null +++ b/Documentation/laptop-mode.txt @@ -0,0 +1,950 @@ +How to conserve battery power using laptop-mode +----------------------------------------------- + +Document Author: Bart Samwel (bart@samwel.tk) +Date created: January 2, 2004 +Last modified: July 10, 2004 + +Introduction +------------ + +Laptop mode is used to minimize the time that the hard disk needs to be spun up, +to conserve battery power on laptops. It has been reported to cause significant +power savings. + +Contents +-------- + +* Introduction +* Installation +* Caveats +* The Details +* Tips & Tricks +* Control script +* ACPI integration +* Monitoring tool + + +Installation +------------ + +To use laptop mode, you don't need to set any kernel configuration options +or anything. Simply install all the files included in this document, and +laptop mode will automatically be started when you're on battery. For +your convenience, a tarball containing an installer can be downloaded at: + +http://www.xs4all.nl/~bsamwel/laptop_mode/tools + +To configure laptop mode, you need to edit the configuration file, which is +located in /etc/default/laptop-mode on Debian-based systems, or in +/etc/sysconfig/laptop-mode on other systems. + +Unfortunately, automatic enabling of laptop mode does not work for +laptops that don't have ACPI. On those laptops, you need to start laptop +mode manually. To start laptop mode, run "laptop_mode start", and to +stop it, run "laptop_mode stop". (Note: The laptop mode tools package now +has experimental support for APM, you might want to try that first.) + + +Caveats +------- + +* The downside of laptop mode is that you have a chance of losing up to 10 + minutes of work. If you cannot afford this, don't use it! The supplied ACPI + scripts automatically turn off laptop mode when the battery almost runs out, + so that you won't lose any data at the end of your battery life. + +* Most desktop hard drives have a very limited lifetime measured in spindown + cycles, typically about 50.000 times (it's usually listed on the spec sheet). + Check your drive's rating, and don't wear down your drive's lifetime if you + don't need to. + +* If you mount some of your ext3/reiserfs filesystems with the -n option, then + the control script will not be able to remount them correctly. You must set + DO_REMOUNTS=0 in the control script, otherwise it will remount them with the + wrong options -- or it will fail because it cannot write to /etc/mtab. + +* If you have your filesystems listed as type "auto" in fstab, like I did, then + the control script will not recognize them as filesystems that need remounting. + You must list the filesystems with their true type instead. + +* It has been reported that some versions of the mutt mail client use file access + times to determine whether a folder contains new mail. If you use mutt and + experience this, you must disable the noatime remounting by setting the option + DO_REMOUNT_NOATIME to 0 in the configuration file. + + +The Details +----------- + +Laptop mode is controlled by the knob /proc/sys/vm/laptop_mode. This knob is +present for all kernels that have the laptop mode patch, regardless of any +configuration options. When the knob is set, any physical disk I/O (that might +have caused the hard disk to spin up) causes Linux to flush all dirty blocks. The +result of this is that after a disk has spun down, it will not be spun up +anymore to write dirty blocks, because those blocks had already been written +immediately after the most recent read operation. The value of the laptop_mode +knob determines the time between the occurrence of disk I/O and when the flush +is triggered. A sensible value for the knob is 5 seconds. Setting the knob to +0 disables laptop mode. + +To increase the effectiveness of the laptop_mode strategy, the laptop_mode +control script increases dirty_expire_centisecs and dirty_writeback_centisecs in +/proc/sys/vm to about 10 minutes (by default), which means that pages that are +dirtied are not forced to be written to disk as often. The control script also +changes the dirty background ratio, so that background writeback of dirty pages +is not done anymore. Combined with a higher commit value (also 10 minutes) for +ext3 or ReiserFS filesystems (also done automatically by the control script), +this results in concentration of disk activity in a small time interval which +occurs only once every 10 minutes, or whenever the disk is forced to spin up by +a cache miss. The disk can then be spun down in the periods of inactivity. + +If you want to find out which process caused the disk to spin up, you can +gather information by setting the flag /proc/sys/vm/block_dump. When this flag +is set, Linux reports all disk read and write operations that take place, and +all block dirtyings done to files. This makes it possible to debug why a disk +needs to spin up, and to increase battery life even more. The output of +block_dump is written to the kernel output, and it can be retrieved using +"dmesg". When you use block_dump and your kernel logging level also includes +kernel debugging messages, you probably want to turn off klogd, otherwise +the output of block_dump will be logged, causing disk activity that is not +normally there. + + +Configuration +------------- + +The laptop mode configuration file is located in /etc/default/laptop-mode on +Debian-based systems, or in /etc/sysconfig/laptop-mode on other systems. It +contains the following options: + +MAX_AGE: + +Maximum time, in seconds, of hard drive spindown time that you are +confortable with. Worst case, it's possible that you could lose this +amount of work if your battery fails while you're in laptop mode. + +MINIMUM_BATTERY_MINUTES: + +Automatically disable laptop mode if the remaining number of minutes of +battery power is less than this value. Default is 10 minutes. + +AC_HD/BATT_HD: + +The idle timeout that should be set on your hard drive when laptop mode +is active (BATT_HD) and when it is not active (AC_HD). The defaults are +20 seconds (value 4) for BATT_HD and 2 hours (value 244) for AC_HD. The +possible values are those listed in the manual page for "hdparm" for the +"-S" option. + +HD: + +The devices for which the spindown timeout should be adjusted by laptop mode. +Default is /dev/hda. If you specify multiple devices, separate them by a space. + +READAHEAD: + +Disk readahead, in 512-byte sectors, while laptop mode is active. A large +readahead can prevent disk accesses for things like executable pages (which are +loaded on demand while the application executes) and sequentially accessed data +(MP3s). + +DO_REMOUNTS: + +The control script automatically remounts any mounted journaled filesystems +with approriate commit interval options. When this option is set to 0, this +feature is disabled. + +DO_REMOUNT_NOATIME: + +When remounting, should the filesystems be remounted with the noatime option? +Normally, this is set to "1" (enabled), but there may be programs that require +access time recording. + +DIRTY_RATIO: + +The percentage of memory that is allowed to contain "dirty" or unsaved data +before a writeback is forced, while laptop mode is active. Corresponds to +the /proc/sys/vm/dirty_ratio sysctl. + +DIRTY_BACKGROUND_RATIO: + +The percentage of memory that is allowed to contain "dirty" or unsaved data +after a forced writeback is done due to an exceeding of DIRTY_RATIO. Set +this nice and low. This corresponds to the /proc/sys/vm/dirty_background_ratio +sysctl. + +Note that the behaviour of dirty_background_ratio is quite different +when laptop mode is active and when it isn't. When laptop mode is inactive, +dirty_background_ratio is the threshold percentage at which background writeouts +start taking place. When laptop mode is active, however, background writeouts +are disabled, and the dirty_background_ratio only determines how much writeback +is done when dirty_ratio is reached. + +DO_CPU: + +Enable CPU frequency scaling when in laptop mode. (Requires CPUFreq to be setup. +See Documentation/cpu-freq/user-guide.txt for more info. Disabled by default.) + +CPU_MAXFREQ: + +When on battery, what is the maximum CPU speed that the system should use? Legal +values are "slowest" for the slowest speed that your CPU is able to operate at, +or a value listed in /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies. + + +Tips & Tricks +------------- + +* Bartek Kania reports getting up to 50 minutes of extra battery life (on top + of his regular 3 to 3.5 hours) using a spindown time of 5 seconds (BATT_HD=1). + +* You can spin down the disk while playing MP3, by setting disk readahead + to 8MB (READAHEAD=16384). Effectively, the disk will read a complete MP3 at + once, and will then spin down while the MP3 is playing. (Thanks to Bartek + Kania.) + +* Drew Scott Daniels observed: "I don't know why, but when I decrease the number + of colours that my display uses it consumes less battery power. I've seen + this on powerbooks too. I hope that this is a piece of information that + might be useful to the Laptop Mode patch or it's users." + +* In syslog.conf, you can prefix entries with a dash ``-'' to omit syncing the + file after every logging. When you're using laptop-mode and your disk doesn't + spin down, this is a likely culprit. + +* Richard Atterer observed that laptop mode does not work well with noflushd + (http://noflushd.sourceforge.net/), it seems that noflushd prevents laptop-mode + from doing its thing. + +* If you're worried about your data, you might want to consider using a USB + memory stick or something like that as a "working area". (Be aware though + that flash memory can only handle a limited number of writes, and overuse + may wear out your memory stick pretty quickly. Do _not_ use journalling + filesystems on flash memory sticks.) + + +Configuration file for control and ACPI battery scripts +------------------------------------------------------- + +This allows the tunables to be changed for the scripts via an external +configuration file + +It should be installed as /etc/default/laptop-mode on Debian, and as +/etc/sysconfig/laptop-mode on Red Hat, SUSE, Mandrake, and other work-alikes. + +--------------------CONFIG FILE BEGIN------------------------------------------- +# Maximum time, in seconds, of hard drive spindown time that you are +# confortable with. Worst case, it's possible that you could lose this +# amount of work if your battery fails you while in laptop mode. +#MAX_AGE=600 + +# Automatically disable laptop mode when the number of minutes of battery +# that you have left goes below this threshold. +MINIMUM_BATTERY_MINUTES=10 + +# Read-ahead, in 512-byte sectors. You can spin down the disk while playing MP3/OGG +# by setting the disk readahead to 8MB (READAHEAD=16384). Effectively, the disk +# will read a complete MP3 at once, and will then spin down while the MP3/OGG is +# playing. +#READAHEAD=4096 + +# Shall we remount journaled fs. with appropriate commit interval? (1=yes) +#DO_REMOUNTS=1 + +# And shall we add the "noatime" option to that as well? (1=yes) +#DO_REMOUNT_NOATIME=1 + +# Dirty synchronous ratio. At this percentage of dirty pages the process +# which +# calls write() does its own writeback +#DIRTY_RATIO=40 + +# +# Allowed dirty background ratio, in percent. Once DIRTY_RATIO has been +# exceeded, the kernel will wake pdflush which will then reduce the amount +# of dirty memory to dirty_background_ratio. Set this nice and low, so once +# some writeout has commenced, we do a lot of it. +# +#DIRTY_BACKGROUND_RATIO=5 + +# kernel default dirty buffer age +#DEF_AGE=30 +#DEF_UPDATE=5 +#DEF_DIRTY_BACKGROUND_RATIO=10 +#DEF_DIRTY_RATIO=40 +#DEF_XFS_AGE_BUFFER=15 +#DEF_XFS_SYNC_INTERVAL=30 +#DEF_XFS_BUFD_INTERVAL=1 + +# This must be adjusted manually to the value of HZ in the running kernel +# on 2.4, until the XFS people change their 2.4 external interfaces to work in +# centisecs. This can be automated, but it's a work in progress that still +# needs# some fixes. On 2.6 kernels, XFS uses USER_HZ instead of HZ for +# external interfaces, and that is currently always set to 100. So you don't +# need to change this on 2.6. +#XFS_HZ=100 + +# Should the maximum CPU frequency be adjusted down while on battery? +# Requires CPUFreq to be setup. +# See Documentation/cpu-freq/user-guide.txt for more info +#DO_CPU=0 + +# When on battery what is the maximum CPU speed that the system should +# use? Legal values are "slowest" for the slowest speed that your +# CPU is able to operate at, or a value listed in: +# /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies +# Only applicable if DO_CPU=1. +#CPU_MAXFREQ=slowest + +# Idle timeout for your hard drive (man hdparm for valid values, -S option) +# Default is 2 hours on AC (AC_HD=244) and 20 seconds for battery (BATT_HD=4). +#AC_HD=244 +#BATT_HD=4 + +# The drives for which to adjust the idle timeout. Separate them by a space, +# e.g. HD="/dev/hda /dev/hdb". +#HD="/dev/hda" + +# Set the spindown timeout on a hard drive? +#DO_HD=1 + +--------------------CONFIG FILE END--------------------------------------------- + + +Control script +-------------- + +Please note that this control script works for the Linux 2.4 and 2.6 series (thanks +to Kiko Piris). + +--------------------CONTROL SCRIPT BEGIN---------------------------------------- +#!/bin/bash + +# start or stop laptop_mode, best run by a power management daemon when +# ac gets connected/disconnected from a laptop +# +# install as /sbin/laptop_mode +# +# Contributors to this script: Kiko Piris +# Bart Samwel +# Micha Feigin +# Andrew Morton +# Herve Eychenne +# Dax Kelson +# +# Original Linux 2.4 version by: Jens Axboe + +############################################################################# + +# Source config +if [ -f /etc/default/laptop-mode ] ; then + # Debian + . /etc/default/laptop-mode +elif [ -f /etc/sysconfig/laptop-mode ] ; then + # Others + . /etc/sysconfig/laptop-mode +fi + +# Don't raise an error if the config file is incomplete +# set defaults instead: + +# Maximum time, in seconds, of hard drive spindown time that you are +# confortable with. Worst case, it's possible that you could lose this +# amount of work if your battery fails you while in laptop mode. +MAX_AGE=${MAX_AGE:-'600'} + +# Read-ahead, in kilobytes +READAHEAD=${READAHEAD:-'4096'} + +# Shall we remount journaled fs. with appropiate commit interval? (1=yes) +DO_REMOUNTS=${DO_REMOUNTS:-'1'} + +# And shall we add the "noatime" option to that as well? (1=yes) +DO_REMOUNT_NOATIME=${DO_REMOUNT_NOATIME:-'1'} + +# Shall we adjust the idle timeout on a hard drive? +DO_HD=${DO_HD:-'1'} + +# Adjust idle timeout on which hard drive? +HD="${HD:-'/dev/hda'}" + +# spindown time for HD (hdparm -S values) +AC_HD=${AC_HD:-'244'} +BATT_HD=${BATT_HD:-'4'} + +# Dirty synchronous ratio. At this percentage of dirty pages the process which +# calls write() does its own writeback +DIRTY_RATIO=${DIRTY_RATIO:-'40'} + +# cpu frequency scaling +# See Documentation/cpu-freq/user-guide.txt for more info +DO_CPU=${CPU_MANAGE:-'0'} +CPU_MAXFREQ=${CPU_MAXFREQ:-'slowest'} + +# +# Allowed dirty background ratio, in percent. Once DIRTY_RATIO has been +# exceeded, the kernel will wake pdflush which will then reduce the amount +# of dirty memory to dirty_background_ratio. Set this nice and low, so once +# some writeout has commenced, we do a lot of it. +# +DIRTY_BACKGROUND_RATIO=${DIRTY_BACKGROUND_RATIO:-'5'} + +# kernel default dirty buffer age +DEF_AGE=${DEF_AGE:-'30'} +DEF_UPDATE=${DEF_UPDATE:-'5'} +DEF_DIRTY_BACKGROUND_RATIO=${DEF_DIRTY_BACKGROUND_RATIO:-'10'} +DEF_DIRTY_RATIO=${DEF_DIRTY_RATIO:-'40'} +DEF_XFS_AGE_BUFFER=${DEF_XFS_AGE_BUFFER:-'15'} +DEF_XFS_SYNC_INTERVAL=${DEF_XFS_SYNC_INTERVAL:-'30'} +DEF_XFS_BUFD_INTERVAL=${DEF_XFS_BUFD_INTERVAL:-'1'} + +# This must be adjusted manually to the value of HZ in the running kernel +# on 2.4, until the XFS people change their 2.4 external interfaces to work in +# centisecs. This can be automated, but it's a work in progress that still needs +# some fixes. On 2.6 kernels, XFS uses USER_HZ instead of HZ for external +# interfaces, and that is currently always set to 100. So you don't need to +# change this on 2.6. +XFS_HZ=${XFS_HZ:-'100'} + +############################################################################# + +KLEVEL="$(uname -r | + { + IFS='.' read a b c + echo $a.$b + } +)" +case "$KLEVEL" in + "2.4"|"2.6") + ;; + *) + echo "Unhandled kernel version: $KLEVEL ('uname -r' = '$(uname -r)')" >&2 + exit 1 + ;; +esac + +if [ ! -e /proc/sys/vm/laptop_mode ] ; then + echo "Kernel is not patched with laptop_mode patch." >&2 + exit 1 +fi + +if [ ! -w /proc/sys/vm/laptop_mode ] ; then + echo "You do not have enough privileges to enable laptop_mode." >&2 + exit 1 +fi + +# Remove an option (the first parameter) of the form option=<number> from +# a mount options string (the rest of the parameters). +parse_mount_opts () { + OPT="$1" + shift + echo ",$*," | sed \ + -e 's/,'"$OPT"'=[0-9]*,/,/g' \ + -e 's/,,*/,/g' \ + -e 's/^,//' \ + -e 's/,$//' +} + +# Remove an option (the first parameter) without any arguments from +# a mount option string (the rest of the parameters). +parse_nonumber_mount_opts () { + OPT="$1" + shift + echo ",$*," | sed \ + -e 's/,'"$OPT"',/,/g' \ + -e 's/,,*/,/g' \ + -e 's/^,//' \ + -e 's/,$//' +} + +# Find out the state of a yes/no option (e.g. "atime"/"noatime") in +# fstab for a given filesystem, and use this state to replace the +# value of the option in another mount options string. The device +# is the first argument, the option name the second, and the default +# value the third. The remainder is the mount options string. +# +# Example: +# parse_yesno_opts_wfstab /dev/hda1 atime atime defaults,noatime +# +# If fstab contains, say, "rw" for this filesystem, then the result +# will be "defaults,atime". +parse_yesno_opts_wfstab () { + L_DEV="$1" + OPT="$2" + DEF_OPT="$3" + shift 3 + L_OPTS="$*" + PARSEDOPTS1="$(parse_nonumber_mount_opts $OPT $L_OPTS)" + PARSEDOPTS1="$(parse_nonumber_mount_opts no$OPT $PARSEDOPTS1)" + # Watch for a default atime in fstab + FSTAB_OPTS="$(awk '$1 == "'$L_DEV'" { print $4 }' /etc/fstab)" + if echo "$FSTAB_OPTS" | grep "$OPT" > /dev/null ; then + # option specified in fstab: extract the value and use it + if echo "$FSTAB_OPTS" | grep "no$OPT" > /dev/null ; then + echo "$PARSEDOPTS1,no$OPT" + else + # no$OPT not found -- so we must have $OPT. + echo "$PARSEDOPTS1,$OPT" + fi + else + # option not specified in fstab -- choose the default. + echo "$PARSEDOPTS1,$DEF_OPT" + fi +} + +# Find out the state of a numbered option (e.g. "commit=NNN") in +# fstab for a given filesystem, and use this state to replace the +# value of the option in another mount options string. The device +# is the first argument, and the option name the second. The +# remainder is the mount options string in which the replacement +# must be done. +# +# Example: +# parse_mount_opts_wfstab /dev/hda1 commit defaults,commit=7 +# +# If fstab contains, say, "commit=3,rw" for this filesystem, then the +# result will be "rw,commit=3". +parse_mount_opts_wfstab () { + L_DEV="$1" + OPT="$2" + shift 2 + L_OPTS="$*" + PARSEDOPTS1="$(parse_mount_opts $OPT $L_OPTS)" + # Watch for a default commit in fstab + FSTAB_OPTS="$(awk '$1 == "'$L_DEV'" { print $4 }' /etc/fstab)" + if echo "$FSTAB_OPTS" | grep "$OPT=" > /dev/null ; then + # option specified in fstab: extract the value, and use it + echo -n "$PARSEDOPTS1,$OPT=" + echo ",$FSTAB_OPTS," | sed \ + -e 's/.*,'"$OPT"'=//' \ + -e 's/,.*//' + else + # option not specified in fstab: set it to 0 + echo "$PARSEDOPTS1,$OPT=0" + fi +} + +deduce_fstype () { + MP="$1" + # My root filesystem unfortunately has + # type "unknown" in /etc/mtab. If we encounter + # "unknown", we try to get the type from fstab. + cat /etc/fstab | + grep -v '^#' | + while read FSTAB_DEV FSTAB_MP FSTAB_FST FSTAB_OPTS FSTAB_DUMP FSTAB_DUMP ; do + if [ "$FSTAB_MP" = "$MP" ]; then + echo $FSTAB_FST + exit 0 + fi + done +} + +if [ $DO_REMOUNT_NOATIME -eq 1 ] ; then + NOATIME_OPT=",noatime" +fi + +case "$1" in + start) + AGE=$((100*$MAX_AGE)) + XFS_AGE=$(($XFS_HZ*$MAX_AGE)) + echo -n "Starting laptop_mode" + + if [ -d /proc/sys/vm/pagebuf ] ; then + # (For 2.4 and early 2.6.) + # This only needs to be set, not reset -- it is only used when + # laptop mode is enabled. + echo $XFS_AGE > /proc/sys/vm/pagebuf/lm_flush_age + echo $XFS_AGE > /proc/sys/fs/xfs/lm_sync_interval + elif [ -f /proc/sys/fs/xfs/lm_age_buffer ] ; then + # (A couple of early 2.6 laptop mode patches had these.) + # The same goes for these. + echo $XFS_AGE > /proc/sys/fs/xfs/lm_age_buffer + echo $XFS_AGE > /proc/sys/fs/xfs/lm_sync_interval + elif [ -f /proc/sys/fs/xfs/age_buffer ] ; then + # (2.6.6) + # But not for these -- they are also used in normal + # operation. + echo $XFS_AGE > /proc/sys/fs/xfs/age_buffer + echo $XFS_AGE > /proc/sys/fs/xfs/sync_interval + elif [ -f /proc/sys/fs/xfs/age_buffer_centisecs ] ; then + # (2.6.7 upwards) + # And not for these either. These are in centisecs, + # not USER_HZ, so we have to use $AGE, not $XFS_AGE. + echo $AGE > /proc/sys/fs/xfs/age_buffer_centisecs + echo $AGE > /proc/sys/fs/xfs/xfssyncd_centisecs + echo 3000 > /proc/sys/fs/xfs/xfsbufd_centisecs + fi + + case "$KLEVEL" in + "2.4") + echo 1 > /proc/sys/vm/laptop_mode + echo "30 500 0 0 $AGE $AGE 60 20 0" > /proc/sys/vm/bdflush + ;; + "2.6") + echo 5 > /proc/sys/vm/laptop_mode + echo "$AGE" > /proc/sys/vm/dirty_writeback_centisecs + echo "$AGE" > /proc/sys/vm/dirty_expire_centisecs + echo "$DIRTY_RATIO" > /proc/sys/vm/dirty_ratio + echo "$DIRTY_BACKGROUND_RATIO" > /proc/sys/vm/dirty_background_ratio + ;; + esac + if [ $DO_REMOUNTS -eq 1 ]; then + cat /etc/mtab | while read DEV MP FST OPTS DUMP PASS ; do + PARSEDOPTS="$(parse_mount_opts "$OPTS")" + if [ "$FST" = 'unknown' ]; then + FST=$(deduce_fstype $MP) + fi + case "$FST" in + "ext3"|"reiserfs") + PARSEDOPTS="$(parse_mount_opts commit "$OPTS")" + mount $DEV -t $FST $MP -o remount,$PARSEDOPTS,commit=$MAX_AGE$NOATIME_OPT + ;; + "xfs") + mount $DEV -t $FST $MP -o remount,$OPTS$NOATIME_OPT + ;; + esac + if [ -b $DEV ] ; then + blockdev --setra $(($READAHEAD * 2)) $DEV + fi + done + fi + if [ $DO_HD -eq 1 ] ; then + for THISHD in $HD ; do + /sbin/hdparm -S $BATT_HD $THISHD > /dev/null 2>&1 + /sbin/hdparm -B 1 $THISHD > /dev/null 2>&1 + done + fi + if [ $DO_CPU -eq 1 -a -e /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq ]; then + if [ $CPU_MAXFREQ = 'slowest' ]; then + CPU_MAXFREQ=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq` + fi + echo $CPU_MAXFREQ > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq + fi + echo "." + ;; + stop) + U_AGE=$((100*$DEF_UPDATE)) + B_AGE=$((100*$DEF_AGE)) + echo -n "Stopping laptop_mode" + echo 0 > /proc/sys/vm/laptop_mode + if [ -f /proc/sys/fs/xfs/age_buffer -a ! -f /proc/sys/fs/xfs/lm_age_buffer ] ; then + # These need to be restored, if there are no lm_*. + echo $(($XFS_HZ*$DEF_XFS_AGE_BUFFER)) > /proc/sys/fs/xfs/age_buffer + echo $(($XFS_HZ*$DEF_XFS_SYNC_INTERVAL)) > /proc/sys/fs/xfs/sync_interval + elif [ -f /proc/sys/fs/xfs/age_buffer_centisecs ] ; then + # These need to be restored as well. + echo $((100*$DEF_XFS_AGE_BUFFER)) > /proc/sys/fs/xfs/age_buffer_centisecs + echo $((100*$DEF_XFS_SYNC_INTERVAL)) > /proc/sys/fs/xfs/xfssyncd_centisecs + echo $((100*$DEF_XFS_BUFD_INTERVAL)) > /proc/sys/fs/xfs/xfsbufd_centisecs + fi + case "$KLEVEL" in + "2.4") + echo "30 500 0 0 $U_AGE $B_AGE 60 20 0" > /proc/sys/vm/bdflush + ;; + "2.6") + echo "$U_AGE" > /proc/sys/vm/dirty_writeback_centisecs + echo "$B_AGE" > /proc/sys/vm/dirty_expire_centisecs + echo "$DEF_DIRTY_RATIO" > /proc/sys/vm/dirty_ratio + echo "$DEF_DIRTY_BACKGROUND_RATIO" > /proc/sys/vm/dirty_background_ratio + ;; + esac + if [ $DO_REMOUNTS -eq 1 ] ; then + cat /etc/mtab | while read DEV MP FST OPTS DUMP PASS ; do + # Reset commit and atime options to defaults. + if [ "$FST" = 'unknown' ]; then + FST=$(deduce_fstype $MP) + fi + case "$FST" in + "ext3"|"reiserfs") + PARSEDOPTS="$(parse_mount_opts_wfstab $DEV commit $OPTS)" + PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $PARSEDOPTS)" + mount $DEV -t $FST $MP -o remount,$PARSEDOPTS + ;; + "xfs") + PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $OPTS)" + mount $DEV -t $FST $MP -o remount,$PARSEDOPTS + ;; + esac + if [ -b $DEV ] ; then + blockdev --setra 256 $DEV + fi + done + fi + if [ $DO_HD -eq 1 ] ; then + for THISHD in $HD ; do + /sbin/hdparm -S $AC_HD $THISHD > /dev/null 2>&1 + /sbin/hdparm -B 255 $THISHD > /dev/null 2>&1 + done + fi + if [ $DO_CPU -eq 1 -a -e /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq ]; then + echo `cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq` > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq + fi + echo "." + ;; + *) + echo "Usage: $0 {start|stop}" 2>&1 + exit 1 + ;; + +esac + +exit 0 +--------------------CONTROL SCRIPT END------------------------------------------ + + +ACPI integration +---------------- + +Dax Kelson submitted this so that the ACPI acpid daemon will +kick off the laptop_mode script and run hdparm. The part that +automatically disables laptop mode when the battery is low was +writen by Jan Topinski. + +-----------------/etc/acpi/events/ac_adapter BEGIN------------------------------ +event=ac_adapter +action=/etc/acpi/actions/ac.sh %e +----------------/etc/acpi/events/ac_adapter END--------------------------------- + + +-----------------/etc/acpi/events/battery BEGIN--------------------------------- +event=battery.* +action=/etc/acpi/actions/battery.sh %e +----------------/etc/acpi/events/battery END------------------------------------ + + +----------------/etc/acpi/actions/ac.sh BEGIN----------------------------------- +#!/bin/bash + +# ac on/offline event handler + +status=`awk '/^state: / { print $2 }' /proc/acpi/ac_adapter/$2/state` + +case $status in + "on-line") + /sbin/laptop_mode stop + exit 0 + ;; + "off-line") + /sbin/laptop_mode start + exit 0 + ;; +esac +---------------------------/etc/acpi/actions/ac.sh END-------------------------- + + +---------------------------/etc/acpi/actions/battery.sh BEGIN------------------- +#! /bin/bash + +# Automatically disable laptop mode when the battery almost runs out. + +BATT_INFO=/proc/acpi/battery/$2/state + +if [[ -f /proc/sys/vm/laptop_mode ]] +then + LM=`cat /proc/sys/vm/laptop_mode` + if [[ $LM -gt 0 ]] + then + if [[ -f $BATT_INFO ]] + then + # Source the config file only now that we know we need + if [ -f /etc/default/laptop-mode ] ; then + # Debian + . /etc/default/laptop-mode + elif [ -f /etc/sysconfig/laptop-mode ] ; then + # Others + . /etc/sysconfig/laptop-mode + fi + MINIMUM_BATTERY_MINUTES=${MINIMUM_BATTERY_MINUTES:-'10'} + + ACTION="`cat $BATT_INFO | grep charging | cut -c 26-`" + if [[ ACTION -eq "discharging" ]] + then + PRESENT_RATE=`cat $BATT_INFO | grep "present rate:" | sed "s/.* \([0-9][0-9]* \).*/\1/" ` + REMAINING=`cat $BATT_INFO | grep "remaining capacity:" | sed "s/.* \([0-9][0-9]* \).*/\1/" ` + fi + if (($REMAINING * 60 / $PRESENT_RATE < $MINIMUM_BATTERY_MINUTES)) + then + /sbin/laptop_mode stop + fi + else + logger -p daemon.warning "You are using laptop mode and your battery interface $BATT_INFO is missing. This may lead to loss of data when the battery runs out. Check kernel ACPI support and /proc/acpi/battery folder, and edit /etc/acpi/battery.sh to set BATT_INFO to the correct path." + fi + fi +fi +---------------------------/etc/acpi/actions/battery.sh END-------------------- + + +Monitoring tool +--------------- + +Bartek Kania submitted this, it can be used to measure how much time your disk +spends spun up/down. + +---------------------------dslm.c BEGIN----------------------------------------- +/* + * Simple Disk Sleep Monitor + * by Bartek Kania + * Licenced under the GPL + */ +#include <unistd.h> +#include <stdlib.h> +#include <stdio.h> +#include <fcntl.h> +#include <errno.h> +#include <time.h> +#include <string.h> +#include <signal.h> +#include <sys/ioctl.h> +#include <linux/hdreg.h> + +#ifdef DEBUG +#define D(x) x +#else +#define D(x) +#endif + +int endit = 0; + +/* Check if the disk is in powersave-mode + * Most of the code is stolen from hdparm. + * 1 = active, 0 = standby/sleep, -1 = unknown */ +int check_powermode(int fd) +{ + unsigned char args[4] = {WIN_CHECKPOWERMODE1,0,0,0}; + int state; + + if (ioctl(fd, HDIO_DRIVE_CMD, &args) + && (args[0] = WIN_CHECKPOWERMODE2) /* try again with 0x98 */ + && ioctl(fd, HDIO_DRIVE_CMD, &args)) { + if (errno != EIO || args[0] != 0 || args[1] != 0) { + state = -1; /* "unknown"; */ + } else + state = 0; /* "sleeping"; */ + } else { + state = (args[2] == 255) ? 1 : 0; + } + D(printf(" drive state is: %d\n", state)); + + return state; +} + +char *state_name(int i) +{ + if (i == -1) return "unknown"; + if (i == 0) return "sleeping"; + if (i == 1) return "active"; + + return "internal error"; +} + +char *myctime(time_t time) +{ + char *ts = ctime(&time); + ts[strlen(ts) - 1] = 0; + + return ts; +} + +void measure(int fd) +{ + time_t start_time; + int last_state; + time_t last_time; + int curr_state; + time_t curr_time = 0; + time_t time_diff; + time_t active_time = 0; + time_t sleep_time = 0; + time_t unknown_time = 0; + time_t total_time = 0; + int changes = 0; + float tmp; + + printf("Starting measurements\n"); + + last_state = check_powermode(fd); + start_time = last_time = time(0); + printf(" System is in state %s\n\n", state_name(last_state)); + + while(!endit) { + sleep(1); + curr_state = check_powermode(fd); + + if (curr_state != last_state || endit) { + changes++; + curr_time = time(0); + time_diff = curr_time - last_time; + + if (last_state == 1) active_time += time_diff; + else if (last_state == 0) sleep_time += time_diff; + else unknown_time += time_diff; + + last_state = curr_state; + last_time = curr_time; + + printf("%s: State-change to %s\n", myctime(curr_time), + state_name(curr_state)); + } + } + changes--; /* Compensate for SIGINT */ + + total_time = time(0) - start_time; + printf("\nTotal running time: %lus\n", curr_time - start_time); + printf(" State changed %d times\n", changes); + + tmp = (float)sleep_time / (float)total_time * 100; + printf(" Time in sleep state: %lus (%.2f%%)\n", sleep_time, tmp); + tmp = (float)active_time / (float)total_time * 100; + printf(" Time in active state: %lus (%.2f%%)\n", active_time, tmp); + tmp = (float)unknown_time / (float)total_time * 100; + printf(" Time in unknown state: %lus (%.2f%%)\n", unknown_time, tmp); +} + +void ender(int s) +{ + endit = 1; +} + +void usage() +{ + puts("usage: dslm [-w <time>] <disk>"); + exit(0); +} + +int main(int ac, char **av) +{ + int fd; + char *disk = 0; + int settle_time = 60; + + /* Parse the simple command-line */ + if (ac == 2) + disk = av[1]; + else if (ac == 4) { + settle_time = atoi(av[2]); + disk = av[3]; + } else + usage(); + + if (!(fd = open(disk, O_RDONLY|O_NONBLOCK))) { + printf("Can't open %s, because: %s\n", disk, strerror(errno)); + exit(-1); + } + + if (settle_time) { + printf("Waiting %d seconds for the system to settle down to " + "'normal'\n", settle_time); + sleep(settle_time); + } else + puts("Not waiting for system to settle down"); + + signal(SIGINT, ender); + + measure(fd); + + close(fd); + + return 0; +} +---------------------------dslm.c END------------------------------------------- diff --git a/Documentation/ldm.txt b/Documentation/ldm.txt new file mode 100644 index 000000000000..e266e11c19a3 --- /dev/null +++ b/Documentation/ldm.txt @@ -0,0 +1,102 @@ + + LDM - Logical Disk Manager (Dynamic Disks) + ------------------------------------------ + +Overview +-------- + +Windows 2000 and XP use a new partitioning scheme. It is a complete +replacement for the MSDOS style partitions. It stores its information in a +1MiB journalled database at the end of the physical disk. The size of +partitions is limited only by disk space. The maximum number of partitions is +nearly 2000. + +Any partitions created under the LDM are called "Dynamic Disks". There are no +longer any primary or extended partitions. Normal MSDOS style partitions are +now known as Basic Disks. + +If you wish to use Spanned, Striped, Mirrored or RAID 5 Volumes, you must use +Dynamic Disks. The journalling allows Windows to make changes to these +partitions and filesystems without the need to reboot. + +Once the LDM driver has divided up the disk, you can use the MD driver to +assemble any multi-partition volumes, e.g. Stripes, RAID5. + +To prevent legacy applications from repartitioning the disk, the LDM creates a +dummy MSDOS partition containing one disk-sized partition. + + +Example +------- + +Below we have a 50MiB disk, divided into seven partitions. +N.B. The missing 1MiB at the end of the disk is where the LDM database is + stored. + + Device | Offset Bytes Sectors MiB | Size Bytes Sectors MiB + -------+----------------------------+--------------------------- + hda | 0 0 0 | 52428800 102400 50 + hda1 | 51380224 100352 49 | 1048576 2048 1 + hda2 | 16384 32 0 | 6979584 13632 6 + hda3 | 6995968 13664 6 | 10485760 20480 10 + hda4 | 17481728 34144 16 | 4194304 8192 4 + hda5 | 21676032 42336 20 | 5242880 10240 5 + hda6 | 26918912 52576 25 | 10485760 20480 10 + hda7 | 37404672 73056 35 | 13959168 27264 13 + +The LDM Database may not store the partitions in the order that they appear on +disk, but the driver will sort them. + +When Linux boots, you will see something like: + + hda: 102400 sectors w/32KiB Cache, CHS=50/64/32 + hda: [LDM] hda1 hda2 hda3 hda4 hda5 hda6 hda7 + + +Compiling LDM Support +--------------------- + +To enable LDM, choose the following two options: + + "Advanced partition selection" CONFIG_PARTITION_ADVANCED + "Windows Logical Disk Manager (Dynamic Disk) support" CONFIG_LDM_PARTITION + +If you believe the driver isn't working as it should, you can enable the extra +debugging code. This will produce a LOT of output. The option is: + + "Windows LDM extra logging" CONFIG_LDM_DEBUG + +N.B. The partition code cannot be compiled as a module. + +As with all the partition code, if the driver doesn't see signs of its type of +partition, it will pass control to another driver, so there is no harm in +enabling it. + +If you have Dynamic Disks but don't enable the driver, then all you will see +is a dummy MSDOS partition filling the whole disk. You won't be able to mount +any of the volumes on the disk. + + +Booting +------- + +If you enable LDM support, then lilo is capable of booting from any of the +discovered partitions. However, grub does not understand the LDM partitioning +and cannot boot from a Dynamic Disk. + + +More Documentation +------------------ + +There is an Overview of the LDM online together with complete Technical +Documentation. It can also be downloaded in html. + + http://linux-ntfs.sourceforge.net/ldm/index.html + http://linux-ntfs.sourceforge.net/downloads.html + +If you have any LDM questions that aren't answered on the website, email me. + +Cheers, + FlatCap - Richard Russon + ldm@flatcap.org + diff --git a/Documentation/locks.txt b/Documentation/locks.txt new file mode 100644 index 000000000000..ce1be79edfb8 --- /dev/null +++ b/Documentation/locks.txt @@ -0,0 +1,84 @@ + File Locking Release Notes + + Andy Walker <andy@lysaker.kvaerner.no> + + 12 May 1997 + + +1. What's New? +-------------- + +1.1 Broken Flock Emulation +-------------------------- + +The old flock(2) emulation in the kernel was swapped for proper BSD +compatible flock(2) support in the 1.3.x series of kernels. With the +release of the 2.1.x kernel series, support for the old emulation has +been totally removed, so that we don't need to carry this baggage +forever. + +This should not cause problems for anybody, since everybody using a +2.1.x kernel should have updated their C library to a suitable version +anyway (see the file "Documentation/Changes".) + +1.2 Allow Mixed Locks Again +--------------------------- + +1.2.1 Typical Problems - Sendmail +--------------------------------- +Because sendmail was unable to use the old flock() emulation, many sendmail +installations use fcntl() instead of flock(). This is true of Slackware 3.0 +for example. This gave rise to some other subtle problems if sendmail was +configured to rebuild the alias file. Sendmail tried to lock the aliases.dir +file with fcntl() at the same time as the GDBM routines tried to lock this +file with flock(). With pre 1.3.96 kernels this could result in deadlocks that, +over time, or under a very heavy mail load, would eventually cause the kernel +to lock solid with deadlocked processes. + + +1.2.2 The Solution +------------------ +The solution I have chosen, after much experimentation and discussion, +is to make flock() and fcntl() locks oblivious to each other. Both can +exists, and neither will have any effect on the other. + +I wanted the two lock styles to be cooperative, but there were so many +race and deadlock conditions that the current solution was the only +practical one. It puts us in the same position as, for example, SunOS +4.1.x and several other commercial Unices. The only OS's that support +cooperative flock()/fcntl() are those that emulate flock() using +fcntl(), with all the problems that implies. + + +1.3 Mandatory Locking As A Mount Option +--------------------------------------- + +Mandatory locking, as described in 'Documentation/mandatory.txt' was prior +to this release a general configuration option that was valid for all +mounted filesystems. This had a number of inherent dangers, not the least +of which was the ability to freeze an NFS server by asking it to read a +file for which a mandatory lock existed. + +From this release of the kernel, mandatory locking can be turned on and off +on a per-filesystem basis, using the mount options 'mand' and 'nomand'. +The default is to disallow mandatory locking. The intention is that +mandatory locking only be enabled on a local filesystem as the specific need +arises. + +Until an updated version of mount(8) becomes available you may have to apply +this patch to the mount sources (based on the version distributed with Rick +Faith's util-linux-2.5 package): + +*** mount.c.orig Sat Jun 8 09:14:31 1996 +--- mount.c Sat Jun 8 09:13:02 1996 +*************** +*** 100,105 **** +--- 100,107 ---- + { "noauto", 0, MS_NOAUTO }, /* Can only be mounted explicitly */ + { "user", 0, MS_USER }, /* Allow ordinary user to mount */ + { "nouser", 1, MS_USER }, /* Forbid ordinary user to mount */ ++ { "mand", 0, MS_MANDLOCK }, /* Allow mandatory locks on this FS */ ++ { "nomand", 1, MS_MANDLOCK }, /* Forbid mandatory locks on this FS */ + /* add new options here */ + #ifdef MS_NOSUB + { "sub", 1, MS_NOSUB }, /* allow submounts */ diff --git a/Documentation/logo.gif b/Documentation/logo.gif Binary files differnew file mode 100644 index 000000000000..2eae75fecfb9 --- /dev/null +++ b/Documentation/logo.gif diff --git a/Documentation/logo.txt b/Documentation/logo.txt new file mode 100644 index 000000000000..296f0f7f67eb --- /dev/null +++ b/Documentation/logo.txt @@ -0,0 +1,13 @@ +This is the full-colour version of the currently unofficial Linux logo +("currently unofficial" just means that there has been no paperwork and +that I have not really announced it yet). It was created by Larry Ewing, +and is freely usable as long as you acknowledge Larry as the original +artist. + +Note that there are black-and-white versions of this available that +scale down to smaller sizes and are better for letterheads or whatever +you want to use it for: for the full range of logos take a look at +Larry's web-page: + + http://www.isc.tamu.edu/~lewing/linux/ + diff --git a/Documentation/m68k/00-INDEX b/Documentation/m68k/00-INDEX new file mode 100644 index 000000000000..a014e9f00765 --- /dev/null +++ b/Documentation/m68k/00-INDEX @@ -0,0 +1,5 @@ +00-INDEX + - this file +kernel-options.txt + - command line options for Linux/m68k + diff --git a/Documentation/m68k/README.buddha b/Documentation/m68k/README.buddha new file mode 100644 index 000000000000..bf802ffc98ad --- /dev/null +++ b/Documentation/m68k/README.buddha @@ -0,0 +1,210 @@ + +The Amiga Buddha and Catweasel IDE Driver (part of ide.c) was written by +Geert Uytterhoeven based on the following specifications: + +------------------------------------------------------------------------ + +Register map of the Buddha IDE controller and the +Buddha-part of the Catweasel Zorro-II version + +The Autoconfiguration has been implemented just as Commodore +described in their manuals, no tricks have been used (for +example leaving some address lines out of the equations...). +If you want to configure the board yourself (for example let +a Linux kernel configure the card), look at the Commodore +Docs. Reading the nibbles should give this information: + +Vendor number: 4626 ($1212) +product number: 0 (42 for Catweasel Z-II) +Serial number: 0 +Rom-vector: $1000 + +The card should be a Z-II board, size 64K, not for freemem +list, Rom-Vektor is valid, no second Autoconfig-board on the +same card, no space preference, supports "Shutup_forever". + +Setting the base address should be done in two steps, just +as the Amiga Kickstart does: The lower nibble of the 8-Bit +address is written to $4a, then the whole Byte is written to +$48, while it doesn't matter how often you're writing to $4a +as long as $48 is not touched. After $48 has been written, +the whole card disappears from $e8 and is mapped to the new +address just written. Make shure $4a is written before $48, +otherwise your chance is only 1:16 to find the board :-). + +The local memory-map is even active when mapped to $e8: + +$0-$7e Autokonfig-space, see Z-II docs. + +$80-$7fd reserved + +$7fe Speed-select Register: Read & Write + (description see further down) + +$800-$8ff IDE-Select 0 (Port 0, Register set 0) + +$900-$9ff IDE-Select 1 (Port 0, Register set 1) + +$a00-$aff IDE-Select 2 (Port 1, Register set 0) + +$b00-$bff IDE-Select 3 (Port 1, Register set 1) + +$c00-$cff IDE-Select 4 (Port 2, Register set 0, + Catweasel only!) + +$d00-$dff IDE-Select 5 (Port 3, Register set 1, + Catweasel only!) + +$e00-$eff local expansion port, on Catweasel Z-II the + Catweasel registers are also mapped here. + Never touch, use multidisk.device! + +$f00 read only, Byte-access: Bit 7 shows the + level of the IRQ-line of IDE port 0. + +$f01-$f3f mirror of $f00 + +$f40 read only, Byte-access: Bit 7 shows the + level of the IRQ-line of IDE port 1. + +$f41-$f7f mirror of $f40 + +$f80 read only, Byte-access: Bit 7 shows the + level of the IRQ-line of IDE port 2. + (Catweasel only!) + +$f81-$fbf mirror of $f80 + +$fc0 write-only: Writing any value to this + register enables IRQs to be passed from the + IDE ports to the Zorro bus. This mechanism + has been implemented to be compatible with + harddisks that are either defective or have + a buggy firmware and pull the IRQ line up + while starting up. If interrupts would + always be passed to the bus, the computer + might not start up. Once enabled, this flag + can not be disabled again. The level of the + flag can not be determined by software + (what for? Write to me if it's necessary!). + +$fc1-$fff mirror of $fc0 + +$1000-$ffff Buddha-Rom with offset $1000 in the rom + chip. The addresses $0 to $fff of the rom + chip cannot be read. Rom is Byte-wide and + mapped to even addresses. + +The IDE ports issue an INT2. You can read the level of the +IRQ-lines of the IDE-ports by reading from the three (two +for Buddha-only) registers $f00, $f40 and $f80. This way +more than one I/O request can be handled and you can easily +determine what driver has to serve the INT2. Buddha and +Catweasel expansion boards can issue an INT6. A separate +memory map is available for the I/O module and the sysop's +I/O module. + +The IDE ports are fed by the address lines A2 to A4, just as +the Amiga 1200 and Amiga 4000 IDE ports are. This way +existing drivers can be easily ported to Buddha. A move.l +polls two words out of the same address of IDE port since +every word is mirrored once. movem is not possible, but +it's not necessary either, because you can only speedup +68000 systems with this technique. A 68020 system with +fastmem is faster with move.l. + +If you're using the mirrored registers of the IDE-ports with +A6=1, the Buddha doesn't care about the speed that you have +selected in the speed register (see further down). With +A6=1 (for example $840 for port 0, register set 0), a 780ns +access is being made. These registers should be used for a +command access to the harddisk/CD-Rom, since command +accesses are Byte-wide and have to be made slower according +to the ATA-X3T9 manual. + +Now for the speed-register: The register is byte-wide, and +only the upper three bits are used (Bits 7 to 5). Bit 4 +must always be set to 1 to be compatible with later Buddha +versions (if I'll ever update this one). I presume that +I'll never use the lower four bits, but they have to be set +to 1 by definition. + The values in this table have to be shifted 5 bits to the +left and or'd with $1f (this sets the lower 5 bits). + +All the timings have in common: Select and IOR/IOW rise at +the same time. IOR and IOW have a propagation delay of +about 30ns to the clocks on the Zorro bus, that's why the +values are no multiple of 71. One clock-cycle is 71ns long +(exactly 70,5 at 14,18 Mhz on PAL systems). + +value 0 (Default after reset) + +497ns Select (7 clock cycles) , IOR/IOW after 172ns (2 clock cycles) +(same timing as the Amiga 1200 does on it's IDE port without +accelerator card) + +value 1 + +639ns Select (9 clock cycles), IOR/IOW after 243ns (3 clock cycles) + +value 2 + +781ns Select (11 clock cycles), IOR/IOW after 314ns (4 clock cycles) + +value 3 + +355ns Select (5 clock cycles), IOR/IOW after 101ns (1 clock cycle) + +value 4 + +355ns Select (5 clock cycles), IOR/IOW after 172ns (2 clock cycles) + +value 5 + +355ns Select (5 clock cycles), IOR/IOW after 243ns (3 clock cycles) + +value 6 + +1065ns Select (15 clock cycles), IOR/IOW after 314ns (4 clock cycles) + +value 7 + +355ns Select, (5 clock cycles), IOR/IOW after 101ns (1 clock cycle) + +When accessing IDE registers with A6=1 (for example $84x), +the timing will always be mode 0 8-bit compatible, no matter +what you have selected in the speed register: + +781ns select, IOR/IOW after 4 clock cycles (=314ns) aktive. + +All the timings with a very short select-signal (the 355ns +fast accesses) depend on the accelerator card used in the +system: Sometimes two more clock cycles are inserted by the +bus interface, making the whole access 497ns long. This +doesn't affect the reliability of the controller nor the +performance of the card, since this doesn't happen very +often. + +All the timings are calculated and only confirmed by +measurements that allowed me to count the clock cycles. If +the system is clocked by an oscillator other than 28,37516 +Mhz (for example the NTSC-frequency 28,63636 Mhz), each +clock cycle is shortened to a bit less than 70ns (not worth +mentioning). You could think of a small performance boost +by overclocking the system, but you would either need a +multisync monitor, or a graphics card, and your internal +diskdrive would go crazy, that's why you shouldn't tune your +Amiga this way. + +Giving you the possibility to write software that is +compatible with both the Buddha and the Catweasel Z-II, The +Buddha acts just like a Catweasel Z-II with no device +connected to the third IDE-port. The IRQ-register $f80 +always shows a "no IRQ here" on the Buddha, and accesses to +the third IDE port are going into data's Nirwana on the +Buddha. + + Jens Schönfeld february 19th, 1997 + updated may 27th, 1997 + eMail: sysop@nostlgic.tng.oche.de + diff --git a/Documentation/m68k/kernel-options.txt b/Documentation/m68k/kernel-options.txt new file mode 100644 index 000000000000..e191baad8308 --- /dev/null +++ b/Documentation/m68k/kernel-options.txt @@ -0,0 +1,964 @@ + + + Command Line Options for Linux/m68k + =================================== + +Last Update: 2 May 1999 +Linux/m68k version: 2.2.6 +Author: Roman.Hodek@informatik.uni-erlangen.de (Roman Hodek) +Update: jds@kom.auc.dk (Jes Sorensen) and faq@linux-m68k.org (Chris Lawrence) + +0) Introduction +=============== + + Often I've been asked which command line options the Linux/m68k +kernel understands, or how the exact syntax for the ... option is, or +... about the option ... . I hope, this document supplies all the +answers... + + Note that some options might be outdated, their descriptions being +incomplete or missing. Please update the information and send in the +patches. + + +1) Overview of the Kernel's Option Processing +============================================= + +The kernel knows three kinds of options on its command line: + + 1) kernel options + 2) environment settings + 3) arguments for init + +To which of these classes an argument belongs is determined as +follows: If the option is known to the kernel itself, i.e. if the name +(the part before the '=') or, in some cases, the whole argument string +is known to the kernel, it belongs to class 1. Otherwise, if the +argument contains an '=', it is of class 2, and the definition is put +into init's environment. All other arguments are passed to init as +command line options. + + This document describes the valid kernel options for Linux/m68k in +the version mentioned at the start of this file. Later revisions may +add new such options, and some may be missing in older versions. + + In general, the value (the part after the '=') of an option is a +list of values separated by commas. The interpretation of these values +is up to the driver that "owns" the option. This association of +options with drivers is also the reason that some are further +subdivided. + + +2) General Kernel Options +========================= + +2.1) root= +---------- + +Syntax: root=/dev/<device> + or: root=<hex_number> + +This tells the kernel which device it should mount as the root +filesystem. The device must be a block device with a valid filesystem +on it. + + The first syntax gives the device by name. These names are converted +into a major/minor number internally in the kernel in an unusual way. +Normally, this "conversion" is done by the device files in /dev, but +this isn't possible here, because the root filesystem (with /dev) +isn't mounted yet... So the kernel parses the name itself, with some +hardcoded name to number mappings. The name must always be a +combination of two or three letters, followed by a decimal number. +Valid names are: + + /dev/ram: -> 0x0100 (initial ramdisk) + /dev/hda: -> 0x0300 (first IDE disk) + /dev/hdb: -> 0x0340 (second IDE disk) + /dev/sda: -> 0x0800 (first SCSI disk) + /dev/sdb: -> 0x0810 (second SCSI disk) + /dev/sdc: -> 0x0820 (third SCSI disk) + /dev/sdd: -> 0x0830 (forth SCSI disk) + /dev/sde: -> 0x0840 (fifth SCSI disk) + /dev/fd : -> 0x0200 (floppy disk) + /dev/xda: -> 0x0c00 (first XT disk, unused in Linux/m68k) + /dev/xdb: -> 0x0c40 (second XT disk, unused in Linux/m68k) + /dev/ada: -> 0x1c00 (first ACSI device) + /dev/adb: -> 0x1c10 (second ACSI device) + /dev/adc: -> 0x1c20 (third ACSI device) + /dev/add: -> 0x1c30 (forth ACSI device) + +The last four names are available only if the kernel has been compiled +with Atari and ACSI support. + + The name must be followed by a decimal number, that stands for the +partition number. Internally, the value of the number is just +added to the device number mentioned in the table above. The +exceptions are /dev/ram and /dev/fd, where /dev/ram refers to an +initial ramdisk loaded by your bootstrap program (please consult the +instructions for your bootstrap program to find out how to load an +initial ramdisk). As of kernel version 2.0.18 you must specify +/dev/ram as the root device if you want to boot from an initial +ramdisk. For the floppy devices, /dev/fd, the number stands for the +floppy drive number (there are no partitions on floppy disks). I.e., +/dev/fd0 stands for the first drive, /dev/fd1 for the second, and so +on. Since the number is just added, you can also force the disk format +by adding a number greater than 3. If you look into your /dev +directory, use can see the /dev/fd0D720 has major 2 and minor 16. You +can specify this device for the root FS by writing "root=/dev/fd16" on +the kernel command line. + +[Strange and maybe uninteresting stuff ON] + + This unusual translation of device names has some strange +consequences: If, for example, you have a symbolic link from /dev/fd +to /dev/fd0D720 as an abbreviation for floppy driver #0 in DD format, +you cannot use this name for specifying the root device, because the +kernel cannot see this symlink before mounting the root FS and it +isn't in the table above. If you use it, the root device will not be +set at all, without an error message. Another example: You cannot use a +partition on e.g. the sixth SCSI disk as the root filesystem, if you +want to specify it by name. This is, because only the devices up to +/dev/sde are in the table above, but not /dev/sdf. Although, you can +use the sixth SCSI disk for the root FS, but you have to specify the +device by number... (see below). Or, even more strange, you can use the +fact that there is no range checking of the partition number, and your +knowledge that each disk uses 16 minors, and write "root=/dev/sde17" +(for /dev/sdf1). + +[Strange and maybe uninteresting stuff OFF] + + If the device containing your root partition isn't in the table +above, you can also specify it by major and minor numbers. These are +written in hex, with no prefix and no separator between. E.g., if you +have a CD with contents appropriate as a root filesystem in the first +SCSI CD-ROM drive, you boot from it by "root=0b00". Here, hex "0b" = +decimal 11 is the major of SCSI CD-ROMs, and the minor 0 stands for +the first of these. You can find out all valid major numbers by +looking into include/linux/major.h. + + +2.2) ro, rw +----------- + +Syntax: ro + or: rw + +These two options tell the kernel whether it should mount the root +filesystem read-only or read-write. The default is read-only, except +for ramdisks, which default to read-write. + + +2.3) debug +---------- + +Syntax: debug + +This raises the kernel log level to 10 (the default is 7). This is the +same level as set by the "dmesg" command, just that the maximum level +selectable by dmesg is 8. + + +2.4) debug= +----------- + +Syntax: debug=<device> + +This option causes certain kernel messages be printed to the selected +debugging device. This can aid debugging the kernel, since the +messages can be captured and analyzed on some other machine. Which +devices are possible depends on the machine type. There are no checks +for the validity of the device name. If the device isn't implemented, +nothing happens. + + Messages logged this way are in general stack dumps after kernel +memory faults or bad kernel traps, and kernel panics. To be exact: all +messages of level 0 (panic messages) and all messages printed while +the log level is 8 or more (their level doesn't matter). Before stack +dumps, the kernel sets the log level to 10 automatically. A level of +at least 8 can also be set by the "debug" command line option (see +2.3) and at run time with "dmesg -n 8". + +Devices possible for Amiga: + + - "ser": built-in serial port; parameters: 9600bps, 8N1 + - "mem": Save the messages to a reserved area in chip mem. After + rebooting, they can be read under AmigaOS with the tool + 'dmesg'. + +Devices possible for Atari: + + - "ser1": ST-MFP serial port ("Modem1"); parameters: 9600bps, 8N1 + - "ser2": SCC channel B serial port ("Modem2"); parameters: 9600bps, 8N1 + - "ser" : default serial port + This is "ser2" for a Falcon, and "ser1" for any other machine + - "midi": The MIDI port; parameters: 31250bps, 8N1 + - "par" : parallel port + The printing routine for this implements a timeout for the + case there's no printer connected (else the kernel would + lock up). The timeout is not exact, but usually a few + seconds. + + +2.6) ramdisk= +------------- + +Syntax: ramdisk=<size> + + This option instructs the kernel to set up a ramdisk of the given +size in KBytes. Do not use this option if the ramdisk contents are +passed by bootstrap! In this case, the size is selected automatically +and should not be overwritten. + + The only application is for root filesystems on floppy disks, that +should be loaded into memory. To do that, select the corresponding +size of the disk as ramdisk size, and set the root device to the disk +drive (with "root="). + + +2.7) swap= +2.8) buff= +----------- + + I can't find any sign of these options in 2.2.6. + + +3) General Device Options (Amiga and Atari) +=========================================== + +3.1) ether= +----------- + +Syntax: ether=[<irq>[,<base_addr>[,<mem_start>[,<mem_end>]]]],<dev-name> + + <dev-name> is the name of a net driver, as specified in +drivers/net/Space.c in the Linux source. Most prominent are eth0, ... +eth3, sl0, ... sl3, ppp0, ..., ppp3, dummy, and lo. + + The non-ethernet drivers (sl, ppp, dummy, lo) obviously ignore the +settings by this options. Also, the existing ethernet drivers for +Linux/m68k (ariadne, a2065, hydra) don't use them because Zorro boards +are really Plug-'n-Play, so the "ether=" option is useless altogether +for Linux/m68k. + + +3.2) hd= +-------- + +Syntax: hd=<cylinders>,<heads>,<sectors> + + This option sets the disk geometry of an IDE disk. The first hd= +option is for the first IDE disk, the second for the second one. +(I.e., you can give this option twice.) In most cases, you won't have +to use this option, since the kernel can obtain the geometry data +itself. It exists just for the case that this fails for one of your +disks. + + +3.3) max_scsi_luns= +------------------- + +Syntax: max_scsi_luns=<n> + + Sets the maximum number of LUNs (logical units) of SCSI devices to +be scanned. Valid values for <n> are between 1 and 8. Default is 8 if +"Probe all LUNs on each SCSI device" was selected during the kernel +configuration, else 1. + + +3.4) st= +-------- + +Syntax: st=<buffer_size>,[<write_thres>,[<max_buffers>]] + + Sets several parameters of the SCSI tape driver. <buffer_size> is +the number of 512-byte buffers reserved for tape operations for each +device. <write_thres> sets the number of blocks which must be filled +to start an actual write operation to the tape. Maximum value is the +total number of buffers. <max_buffer> limits the total number of +buffers allocated for all tape devices. + + +3.5) dmasound= +-------------- + +Syntax: dmasound=[<buffers>,<buffer-size>[,<catch-radius>]] + + This option controls some configurations of the Linux/m68k DMA sound +driver (Amiga and Atari): <buffers> is the number of buffers you want +to use (minimum 4, default 4), <buffer-size> is the size of each +buffer in kilobytes (minimum 4, default 32) and <catch-radius> says +how much percent of error will be tolerated when setting a frequency +(maximum 10, default 0). For example with 3% you can play 8000Hz +AU-Files on the Falcon with its hardware frequency of 8195Hz and thus +don't need to expand the sound. + + + +4) Options for Atari Only +========================= + +4.1) video= +----------- + +Syntax: video=<fbname>:<sub-options...> + +The <fbname> parameter specifies the name of the frame buffer, +eg. most atari users will want to specify `atafb' here. The +<sub-options> is a comma-separated list of the sub-options listed +below. + +NB: Please notice that this option was renamed from `atavideo' to + `video' during the development of the 1.3.x kernels, thus you + might need to update your boot-scripts if upgrading to 2.x from + an 1.2.x kernel. + +NBB: The behavior of video= was changed in 2.1.57 so the recommended +option is to specify the name of the frame buffer. + +4.1.1) Video Mode +----------------- + +This sub-option may be any of the predefined video modes, as listed +in atari/atafb.c in the Linux/m68k source tree. The kernel will +activate the given video mode at boot time and make it the default +mode, if the hardware allows. Currently defined names are: + + - stlow : 320x200x4 + - stmid, default5 : 640x200x2 + - sthigh, default4: 640x400x1 + - ttlow : 320x480x8, TT only + - ttmid, default1 : 640x480x4, TT only + - tthigh, default2: 1280x960x1, TT only + - vga2 : 640x480x1, Falcon only + - vga4 : 640x480x2, Falcon only + - vga16, default3 : 640x480x4, Falcon only + - vga256 : 640x480x8, Falcon only + - falh2 : 896x608x1, Falcon only + - falh16 : 896x608x4, Falcon only + + If no video mode is given on the command line, the kernel tries the +modes names "default<n>" in turn, until one is possible with the +hardware in use. + + A video mode setting doesn't make sense, if the external driver is +activated by a "external:" sub-option. + +4.1.2) inverse +-------------- + +Invert the display. This affects both, text (consoles) and graphics +(X) display. Usually, the background is chosen to be black. With this +option, you can make the background white. + +4.1.3) font +----------- + +Syntax: font:<fontname> + +Specify the font to use in text modes. Currently you can choose only +between `VGA8x8', `VGA8x16' and `PEARL8x8'. `VGA8x8' is default, if the +vertical size of the display is less than 400 pixel rows. Otherwise, the +`VGA8x16' font is the default. + +4.1.4) hwscroll_ +---------------- + +Syntax: hwscroll_<n> + +The number of additional lines of video memory to reserve for +speeding up the scrolling ("hardware scrolling"). Hardware scrolling +is possible only if the kernel can set the video base address in steps +fine enough. This is true for STE, MegaSTE, TT, and Falcon. It is not +possible with plain STs and graphics cards (The former because the +base address must be on a 256 byte boundary there, the latter because +the kernel doesn't know how to set the base address at all.) + + By default, <n> is set to the number of visible text lines on the +display. Thus, the amount of video memory is doubled, compared to no +hardware scrolling. You can turn off the hardware scrolling altogether +by setting <n> to 0. + +4.1.5) internal: +---------------- + +Syntax: internal:<xres>;<yres>[;<xres_max>;<yres_max>;<offset>] + +This option specifies the capabilities of some extended internal video +hardware, like e.g. OverScan. <xres> and <yres> give the (extended) +dimensions of the screen. + + If your OverScan needs a black border, you have to write the last +three arguments of the "internal:". <xres_max> is the maximum line +length the hardware allows, <yres_max> the maximum number of lines. +<offset> is the offset of the visible part of the screen memory to its +physical start, in bytes. + + Often, extended interval video hardware has to be activated somehow. +For this, see the "sw_*" options below. + +4.1.6) external: +---------------- + +Syntax: + external:<xres>;<yres>;<depth>;<org>;<scrmem>[;<scrlen>[;<vgabase>\ + [;<colw>[;<coltype>[;<xres_virtual>]]]]] + +[I had to break this line...] + + This is probably the most complicated parameter... It specifies that +you have some external video hardware (a graphics board), and how to +use it under Linux/m68k. The kernel cannot know more about the hardware +than you tell it here! The kernel also is unable to set or change any +video modes, since it doesn't know about any board internal. So, you +have to switch to that video mode before you start Linux, and cannot +switch to another mode once Linux has started. + + The first 3 parameters of this sub-option should be obvious: <xres>, +<yres> and <depth> give the dimensions of the screen and the number of +planes (depth). The depth is is the logarithm to base 2 of the number +of colors possible. (Or, the other way round: The number of colors is +2^depth). + + You have to tell the kernel furthermore how the video memory is +organized. This is done by a letter as <org> parameter: + + 'n': "normal planes", i.e. one whole plane after another + 'i': "interleaved planes", i.e. 16 bit of the first plane, than 16 bit + of the next, and so on... This mode is used only with the + built-in Atari video modes, I think there is no card that + supports this mode. + 'p': "packed pixels", i.e. <depth> consecutive bits stand for all + planes of one pixel; this is the most common mode for 8 planes + (256 colors) on graphic cards + 't': "true color" (more or less packed pixels, but without a color + lookup table); usually depth is 24 + +For monochrome modes (i.e., <depth> is 1), the <org> letter has a +different meaning: + + 'n': normal colors, i.e. 0=white, 1=black + 'i': inverted colors, i.e. 0=black, 1=white + + The next important information about the video hardware is the base +address of the video memory. That is given in the <scrmem> parameter, +as a hexadecimal number with a "0x" prefix. You have to find out this +address in the documentation of your hardware. + + The next parameter, <scrlen>, tells the kernel about the size of the +video memory. If it's missing, the size is calculated from <xres>, +<yres>, and <depth>. For now, it is not useful to write a value here. +It would be used only for hardware scrolling (which isn't possible +with the external driver, because the kernel cannot set the video base +address), or for virtual resolutions under X (which the X server +doesn't support yet). So, it's currently best to leave this field +empty, either by ending the "external:" after the video address or by +writing two consecutive semicolons, if you want to give a <vgabase> +(it is allowed to leave this parameter empty). + + The <vgabase> parameter is optional. If it is not given, the kernel +cannot read or write any color registers of the video hardware, and +thus you have to set appropriate colors before you start Linux. But if +your card is somehow VGA compatible, you can tell the kernel the base +address of the VGA register set, so it can change the color lookup +table. You have to look up this address in your board's documentation. +To avoid misunderstandings: <vgabase> is the _base_ address, i.e. a 4k +aligned address. For read/writing the color registers, the kernel +uses the addresses vgabase+0x3c7...vgabase+0x3c9. The <vgabase> +parameter is written in hexadecimal with a "0x" prefix, just as +<scrmem>. + + <colw> is meaningful only if <vgabase> is specified. It tells the +kernel how wide each of the color register is, i.e. the number of bits +per single color (red/green/blue). Default is 6, another quite usual +value is 8. + + Also <coltype> is used together with <vgabase>. It tells the kernel +about the color register model of your gfx board. Currently, the types +"vga" (which is also the default) and "mv300" (SANG MV300) are +implemented. + + Parameter <xres_virtual> is required for ProMST or ET4000 cards where +the physical linelength differs from the visible length. With ProMST, +xres_virtual must be set to 2048. For ET4000, xres_virtual depends on the +initialisation of the video-card. +If you're missing a corresponding yres_virtual: the external part is legacy, +therefore we don't support hardware-dependent functions like hardware-scroll, +panning or blanking. + +4.1.7) eclock: +-------------- + +The external pixel clock attached to the Falcon VIDEL shifter. This +currently works only with the ScreenWonder! + +4.1.8) monitorcap: +------------------- + +Syntax: monitorcap:<vmin>;<vmax>;<hmin>;<hmax> + +This describes the capabilities of a multisync monitor. Don't use it +with a fixed-frequency monitor! For now, only the Falcon frame buffer +uses the settings of "monitorcap:". + + <vmin> and <vmax> are the minimum and maximum, resp., vertical frequencies +your monitor can work with, in Hz. <hmin> and <hmax> are the same for +the horizontal frequency, in kHz. + + The defaults are 58;62;31;32 (VGA compatible). + + The defaults for TV/SC1224/SC1435 cover both PAL and NTSC standards. + +4.1.9) keep +------------ + +If this option is given, the framebuffer device doesn't do any video +mode calculations and settings on its own. The only Atari fb device +that does this currently is the Falcon. + + What you reach with this: Settings for unknown video extensions +aren't overridden by the driver, so you can still use the mode found +when booting, when the driver doesn't know to set this mode itself. +But this also means, that you can't switch video modes anymore... + + An example where you may want to use "keep" is the ScreenBlaster for +the Falcon. + + +4.2) atamouse= +-------------- + +Syntax: atamouse=<x-threshold>,[<y-threshold>] + + With this option, you can set the mouse movement reporting threshold. +This is the number of pixels of mouse movement that have to accumulate +before the IKBD sends a new mouse packet to the kernel. Higher values +reduce the mouse interrupt load and thus reduce the chance of keyboard +overruns. Lower values give a slightly faster mouse responses and +slightly better mouse tracking. + + You can set the threshold in x and y separately, but usually this is +of little practical use. If there's just one number in the option, it +is used for both dimensions. The default value is 2 for both +thresholds. + + +4.3) ataflop= +------------- + +Syntax: ataflop=<drive type>[,<trackbuffering>[,<steprateA>[,<steprateB>]]] + + The drive type may be 0, 1, or 2, for DD, HD, and ED, resp. This + setting affects how many buffers are reserved and which formats are + probed (see also below). The default is 1 (HD). Only one drive type + can be selected. If you have two disk drives, select the "better" + type. + + The second parameter <trackbuffer> tells the kernel whether to use + track buffering (1) or not (0). The default is machine-dependent: + no for the Medusa and yes for all others. + + With the two following parameters, you can change the default + steprate used for drive A and B, resp. + + +4.4) atascsi= +------------- + +Syntax: atascsi=<can_queue>[,<cmd_per_lun>[,<scat-gat>[,<host-id>[,<tagged>]]]] + + This option sets some parameters for the Atari native SCSI driver. +Generally, any number of arguments can be omitted from the end. And +for each of the numbers, a negative value means "use default". The +defaults depend on whether TT-style or Falcon-style SCSI is used. +Below, defaults are noted as n/m, where the first value refers to +TT-SCSI and the latter to Falcon-SCSI. If an illegal value is given +for one parameter, an error message is printed and that one setting is +ignored (others aren't affected). + + <can_queue>: + This is the maximum number of SCSI commands queued internally to the + Atari SCSI driver. A value of 1 effectively turns off the driver + internal multitasking (if it causes problems). Legal values are >= + 1. <can_queue> can be as high as you like, but values greater than + <cmd_per_lun> times the number of SCSI targets (LUNs) you have + don't make sense. Default: 16/8. + + <cmd_per_lun>: + Maximum number of SCSI commands issued to the driver for one + logical unit (LUN, usually one SCSI target). Legal values start + from 1. If tagged queuing (see below) is not used, values greater + than 2 don't make sense, but waste memory. Otherwise, the maximum + is the number of command tags available to the driver (currently + 32). Default: 8/1. (Note: Values > 1 seem to cause problems on a + Falcon, cause not yet known.) + + The <cmd_per_lun> value at a great part determines the amount of + memory SCSI reserves for itself. The formula is rather + complicated, but I can give you some hints: + no scatter-gather : cmd_per_lun * 232 bytes + full scatter-gather: cmd_per_lun * approx. 17 Kbytes + + <scat-gat>: + Size of the scatter-gather table, i.e. the number of requests + consecutive on the disk that can be merged into one SCSI command. + Legal values are between 0 and 255. Default: 255/0. Note: This + value is forced to 0 on a Falcon, since scatter-gather isn't + possible with the ST-DMA. Not using scatter-gather hurts + performance significantly. + + <host-id>: + The SCSI ID to be used by the initiator (your Atari). This is + usually 7, the highest possible ID. Every ID on the SCSI bus must + be unique. Default: determined at run time: If the NV-RAM checksum + is valid, and bit 7 in byte 30 of the NV-RAM is set, the lower 3 + bits of this byte are used as the host ID. (This method is defined + by Atari and also used by some TOS HD drivers.) If the above + isn't given, the default ID is 7. (both, TT and Falcon). + + <tagged>: + 0 means turn off tagged queuing support, all other values > 0 mean + use tagged queuing for targets that support it. Default: currently + off, but this may change when tagged queuing handling has been + proved to be reliable. + + Tagged queuing means that more than one command can be issued to + one LUN, and the SCSI device itself orders the requests so they + can be performed in optimal order. Not all SCSI devices support + tagged queuing (:-(). + +4.6 switches= +------------- + +Syntax: switches=<list of switches> + + With this option you can switch some hardware lines that are often +used to enable/disable certain hardware extensions. Examples are +OverScan, overclocking, ... + + The <list of switches> is a comma-separated list of the following +items: + + ikbd: set RTS of the keyboard ACIA high + midi: set RTS of the MIDI ACIA high + snd6: set bit 6 of the PSG port A + snd7: set bit 6 of the PSG port A + +It doesn't make sense to mention a switch more than once (no +difference to only once), but you can give as many switches as you +want to enable different features. The switch lines are set as early +as possible during kernel initialization (even before determining the +present hardware.) + + All of the items can also be prefixed with "ov_", i.e. "ov_ikbd", +"ov_midi", ... These options are meant for switching on an OverScan +video extension. The difference to the bare option is that the +switch-on is done after video initialization, and somehow synchronized +to the HBLANK. A speciality is that ov_ikbd and ov_midi are switched +off before rebooting, so that OverScan is disabled and TOS boots +correctly. + + If you give an option both, with and without the "ov_" prefix, the +earlier initialization ("ov_"-less) takes precedence. But the +switching-off on reset still happens in this case. + +4.5) stram_swap= +---------------- + +Syntax: stram_swap=<do_swap>[,<max_swap>] + + This option is available only if the kernel has been compiled with +CONFIG_STRAM_SWAP enabled. Normally, the kernel then determines +dynamically whether to actually use ST-RAM as swap space. (Currently, +the fraction of ST-RAM must be less or equal 1/3 of total memory to +enable this swapping.) You can override the kernel's decision by +specifying this option. 1 for <do_swap> means always enable the swap, +even if you have less alternate RAM. 0 stands for never swap to +ST-RAM, even if it's small enough compared to the rest of memory. + + If ST-RAM swapping is enabled, the kernel usually uses all free +ST-RAM as swap "device". If the kernel resides in ST-RAM, the region +allocated by it is obviously never used for swapping :-) You can also +limit this amount by specifying the second parameter, <max_swap>, if +you want to use parts of ST-RAM as normal system memory. <max_swap> is +in kBytes and the number should be a multiple of 4 (otherwise: rounded +down). + +5) Options for Amiga Only: +========================== + +5.1) video= +----------- + +Syntax: video=<fbname>:<sub-options...> + +The <fbname> parameter specifies the name of the frame buffer, valid +options are `amifb', `cyber', 'virge', `retz3' and `clgen', provided +that the respective frame buffer devices have been compiled into the +kernel (or compiled as loadable modules). The behavior of the <fbname> +option was changed in 2.1.57 so it is now recommended to specify this +option. + +The <sub-options> is a comma-separated list of the sub-options listed +below. This option is organized similar to the Atari version of the +"video"-option (4.1), but knows fewer sub-options. + +5.1.1) video mode +----------------- + +Again, similar to the video mode for the Atari (see 4.1.1). Predefined +modes depend on the used frame buffer device. + +OCS, ECS and AGA machines all use the color frame buffer. The following +predefined video modes are available: + +NTSC modes: + - ntsc : 640x200, 15 kHz, 60 Hz + - ntsc-lace : 640x400, 15 kHz, 60 Hz interlaced +PAL modes: + - pal : 640x256, 15 kHz, 50 Hz + - pal-lace : 640x512, 15 kHz, 50 Hz interlaced +ECS modes: + - multiscan : 640x480, 29 kHz, 57 Hz + - multiscan-lace : 640x960, 29 kHz, 57 Hz interlaced + - euro36 : 640x200, 15 kHz, 72 Hz + - euro36-lace : 640x400, 15 kHz, 72 Hz interlaced + - euro72 : 640x400, 29 kHz, 68 Hz + - euro72-lace : 640x800, 29 kHz, 68 Hz interlaced + - super72 : 800x300, 23 kHz, 70 Hz + - super72-lace : 800x600, 23 kHz, 70 Hz interlaced + - dblntsc-ff : 640x400, 27 kHz, 57 Hz + - dblntsc-lace : 640x800, 27 kHz, 57 Hz interlaced + - dblpal-ff : 640x512, 27 kHz, 47 Hz + - dblpal-lace : 640x1024, 27 kHz, 47 Hz interlaced + - dblntsc : 640x200, 27 kHz, 57 Hz doublescan + - dblpal : 640x256, 27 kHz, 47 Hz doublescan +VGA modes: + - vga : 640x480, 31 kHz, 60 Hz + - vga70 : 640x400, 31 kHz, 70 Hz + +Please notice that the ECS and VGA modes require either an ECS or AGA +chipset, and that these modes are limited to 2-bit color for the ECS +chipset and 8-bit color for the AGA chipset. + +5.1.2) depth +------------ + +Syntax: depth:<nr. of bit-planes> + +Specify the number of bit-planes for the selected video-mode. + +5.1.3) inverse +-------------- + +Use inverted display (black on white). Functionally the same as the +"inverse" sub-option for the Atari. + +5.1.4) font +----------- + +Syntax: font:<fontname> + +Specify the font to use in text modes. Functionally the same as the +"font" sub-option for the Atari, except that `PEARL8x8' is used instead +of `VGA8x8' if the vertical size of the display is less than 400 pixel +rows. + +5.1.5) monitorcap: +------------------- + +Syntax: monitorcap:<vmin>;<vmax>;<hmin>;<hmax> + +This describes the capabilities of a multisync monitor. For now, only +the color frame buffer uses the settings of "monitorcap:". + + <vmin> and <vmax> are the minimum and maximum, resp., vertical frequencies +your monitor can work with, in Hz. <hmin> and <hmax> are the same for +the horizontal frequency, in kHz. + + The defaults are 50;90;15;38 (Generic Amiga multisync monitor). + + +5.2) fd_def_df0= +---------------- + +Syntax: fd_def_df0=<value> + +Sets the df0 value for "silent" floppy drives. The value should be in +hexadecimal with "0x" prefix. + + +5.3) wd33c93= +------------- + +Syntax: wd33c93=<sub-options...> + +These options affect the A590/A2091, A3000 and GVP Series II SCSI +controllers. + +The <sub-options> is a comma-separated list of the sub-options listed +below. + +5.3.1) nosync +------------- + +Syntax: nosync:bitmask + + bitmask is a byte where the 1st 7 bits correspond with the 7 +possible SCSI devices. Set a bit to prevent sync negotiation on that +device. To maintain backwards compatibility, a command-line such as +"wd33c93=255" will be automatically translated to +"wd33c93=nosync:0xff". The default is to disable sync negotiation for +all devices, eg. nosync:0xff. + +5.3.2) period +------------- + +Syntax: period:ns + + `ns' is the minimum # of nanoseconds in a SCSI data transfer +period. Default is 500; acceptable values are 250 - 1000. + +5.3.3) disconnect +----------------- + +Syntax: disconnect:x + + Specify x = 0 to never allow disconnects, 2 to always allow them. +x = 1 does 'adaptive' disconnects, which is the default and generally +the best choice. + +5.3.4) debug +------------ + +Syntax: debug:x + + If `DEBUGGING_ON' is defined, x is a bit mask that causes various +types of debug output to printed - see the DB_xxx defines in +wd33c93.h. + +5.3.5) clock +------------ + +Syntax: clock:x + + x = clock input in MHz for WD33c93 chip. Normal values would be from +8 through 20. The default value depends on your hostadapter(s), +default for the A3000 internal controller is 14, for the A2091 it's 8 +and for the GVP hostadapters it's either 8 or 14, depending on the +hostadapter and the SCSI-clock jumper present on some GVP +hostadapters. + +5.3.6) next +----------- + + No argument. Used to separate blocks of keywords when there's more +than one wd33c93-based host adapter in the system. + +5.3.7) nodma +------------ + +Syntax: nodma:x + + If x is 1 (or if the option is just written as "nodma"), the WD33c93 +controller will not use DMA (= direct memory access) to access the +Amiga's memory. This is useful for some systems (like A3000's and +A4000's with the A3640 accelerator, revision 3.0) that have problems +using DMA to chip memory. The default is 0, i.e. to use DMA if +possible. + + +5.4) gvp11= +----------- + +Syntax: gvp11=<addr-mask> + + The earlier versions of the GVP driver did not handle DMA +address-mask settings correctly which made it necessary for some +people to use this option, in order to get their GVP controller +running under Linux. These problems have hopefully been solved and the +use of this option is now highly unrecommended! + + Incorrect use can lead to unpredictable behavior, so please only use +this option if you *know* what you are doing and have a reason to do +so. In any case if you experience problems and need to use this +option, please inform us about it by mailing to the Linux/68k kernel +mailing list. + + The address mask set by this option specifies which addresses are +valid for DMA with the GVP Series II SCSI controller. An address is +valid, if no bits are set except the bits that are set in the mask, +too. + + Some versions of the GVP can only DMA into a 24 bit address range, +some can address a 25 bit address range while others can use the whole +32 bit address range for DMA. The correct setting depends on your +controller and should be autodetected by the driver. An example is the +24 bit region which is specified by a mask of 0x00fffffe. + + +5.5) 53c7xx= +------------ + +Syntax: 53c7xx=<sub-options...> + +These options affect the A4000T, A4091, WarpEngine, Blizzard 603e+, +and GForce 040/060 SCSI controllers on the Amiga, as well as the +builtin MVME 16x SCSI controller. + +The <sub-options> is a comma-separated list of the sub-options listed +below. + +5.5.1) nosync +------------- + +Syntax: nosync:0 + + Disables sync negotiation for all devices. Any value after the + colon is acceptable (and has the same effect). + +5.5.2) noasync +-------------- + +Syntax: noasync:0 + + Disables async and sync negotiation for all devices. Any value + after the colon is acceptable (and has the same effect). + +5.5.3) nodisconnect +------------------- + +Syntax: nodisconnect:0 + + Disables SCSI disconnects. Any value after the colon is acceptable + (and has the same effect). + +5.5.4) validids +--------------- + +Syntax: validids:0xNN + + Specify which SCSI ids the driver should pay attention to. This is + a bitmask (i.e. to only pay attention to ID#4, you'd use 0x10). + Default is 0x7f (devices 0-6). + +5.5.5) opthi +5.5.6) optlo +------------ + +Syntax: opthi:M,optlo:N + + Specify options for "hostdata->options". The acceptable definitions + are listed in drivers/scsi/53c7xx.h; the 32 high bits should be in + opthi and the 32 low bits in optlo. They must be specified in the + order opthi=M,optlo=N. + +5.5.7) next +----------- + + No argument. Used to separate blocks of keywords when there's more + than one 53c7xx host adapter in the system. + + +/* Local Variables: */ +/* mode: text */ +/* End: */ diff --git a/Documentation/magic-number.txt b/Documentation/magic-number.txt new file mode 100644 index 000000000000..bd8eefa17587 --- /dev/null +++ b/Documentation/magic-number.txt @@ -0,0 +1,174 @@ +This file is a registry of magic numbers which are in use. When you +add a magic number to a structure, you should also add it to this +file, since it is best if the magic numbers used by various structures +are unique. + +It is a *very* good idea to protect kernel data structures with magic +numbers. This allows you to check at run time whether (a) a structure +has been clobbered, or (b) you've passed the wrong structure to a +routine. This last is especially useful --- particularly when you are +passing pointers to structures via a void * pointer. The tty code, +for example, does this frequently to pass driver-specific and line +discipline-specific structures back and forth. + +The way to use magic numbers is to declare then at the beginning of +the structure, like so: + +struct tty_ldisc { + int magic; + ... +}; + +Please follow this discipline when you are adding future enhancements +to the kernel! It has saved me countless hours of debugging, +especially in the screwy cases where an array has been overrun and +structures following the array have been overwritten. Using this +discipline, these cases get detected quickly and safely. + + Theodore Ts'o + 31 Mar 94 + +The magic table is current to Linux 2.1.55. + + Michael Chastain + <mailto:mec@shout.net> + 22 Sep 1997 + +Now it should be up to date with Linux 2.1.112. Because +we are in feature freeze time it is very unlikely that +something will change before 2.2.x. The entries are +sorted by number field. + + Krzysztof G. Baranowski + <mailto: kgb@knm.org.pl> + 29 Jul 1998 + +Updated the magic table to Linux 2.5.45. Right over the feature freeze, +but it is possible that some new magic numbers will sneak into the +kernel before 2.6.x yet. + + Petr Baudis + <pasky@ucw.cz> + 03 Nov 2002 + +Updated the magic table to Linux 2.5.74. + + Fabian Frederick + <ffrederick@users.sourceforge.net> + 09 Jul 2003 + + +Magic Name Number Structure File +=========================================================================== +PG_MAGIC 'P' pg_{read,write}_hdr include/linux/pg.h +CMAGIC 0x0111 user include/linux/a.out.h +MKISS_DRIVER_MAGIC 0x04bf mkiss_channel drivers/net/mkiss.h +RISCOM8_MAGIC 0x0907 riscom_port drivers/char/riscom8.h +SPECIALIX_MAGIC 0x0907 specialix_port drivers/char/specialix_io8.h +AURORA_MAGIC 0x0A18 Aurora_port drivers/sbus/char/aurora.h +HDLC_MAGIC 0x239e n_hdlc drivers/char/n_hdlc.c +APM_BIOS_MAGIC 0x4101 apm_user arch/i386/kernel/apm.c +CYCLADES_MAGIC 0x4359 cyclades_port include/linux/cyclades.h +DB_MAGIC 0x4442 fc_info drivers/net/iph5526_novram.c +DL_MAGIC 0x444d fc_info drivers/net/iph5526_novram.c +FASYNC_MAGIC 0x4601 fasync_struct include/linux/fs.h +FF_MAGIC 0x4646 fc_info drivers/net/iph5526_novram.c +ISICOM_MAGIC 0x4d54 isi_port include/linux/isicom.h +PTY_MAGIC 0x5001 drivers/char/pty.c +PPP_MAGIC 0x5002 ppp include/linux/if_pppvar.h +SERIAL_MAGIC 0x5301 async_struct include/linux/serial.h +SSTATE_MAGIC 0x5302 serial_state include/linux/serial.h +SLIP_MAGIC 0x5302 slip drivers/net/slip.h +STRIP_MAGIC 0x5303 strip drivers/net/strip.c +X25_ASY_MAGIC 0x5303 x25_asy drivers/net/x25_asy.h +SIXPACK_MAGIC 0x5304 sixpack drivers/net/hamradio/6pack.h +AX25_MAGIC 0x5316 ax_disp drivers/net/mkiss.h +ESP_MAGIC 0x53ee esp_struct drivers/char/esp.h +TTY_MAGIC 0x5401 tty_struct include/linux/tty.h +MGSL_MAGIC 0x5401 mgsl_info drivers/char/synclink.c +TTY_DRIVER_MAGIC 0x5402 tty_driver include/linux/tty_driver.h +MGSLPC_MAGIC 0x5402 mgslpc_info drivers/char/pcmcia/synclink_cs.c +TTY_LDISC_MAGIC 0x5403 tty_ldisc include/linux/tty_ldisc.h +USB_SERIAL_MAGIC 0x6702 usb_serial drivers/usb/serial/usb-serial.h +FULL_DUPLEX_MAGIC 0x6969 drivers/net/tulip/de2104x.c +USB_BLUETOOTH_MAGIC 0x6d02 usb_bluetooth drivers/usb/class/bluetty.c +RFCOMM_TTY_MAGIC 0x6d02 net/bluetooth/rfcomm/tty.c +USB_SERIAL_PORT_MAGIC 0x7301 usb_serial_port drivers/usb/serial/usb-serial.h +CG_MAGIC 0x00090255 ufs_cylinder_group include/linux/ufs_fs.h +A2232_MAGIC 0x000a2232 gs_port drivers/char/ser_a2232.h +SOLARIS_SOCKET_MAGIC 0x000ADDED sol_socket_struct arch/sparc64/solaris/socksys.h +RPORT_MAGIC 0x00525001 r_port drivers/char/rocket_int.h +LSEMAGIC 0x05091998 lse drivers/fc4/fc.c +GDTIOCTL_MAGIC 0x06030f07 gdth_iowr_str drivers/scsi/gdth_ioctl.h +RIEBL_MAGIC 0x09051990 drivers/net/atarilance.c +RIO_MAGIC 0x12345678 gs_port drivers/char/rio/rio_linux.c +SX_MAGIC 0x12345678 gs_port drivers/char/sx.h +NBD_REQUEST_MAGIC 0x12560953 nbd_request include/linux/nbd.h +RED_MAGIC2 0x170fc2a5 (any) mm/slab.c +BAYCOM_MAGIC 0x19730510 baycom_state drivers/net/baycom_epp.c +ISDN_X25IFACE_MAGIC 0x1e75a2b9 isdn_x25iface_proto_data + drivers/isdn/isdn_x25iface.h +ECP_MAGIC 0x21504345 cdkecpsig include/linux/cdk.h +LSOMAGIC 0x27091997 lso drivers/fc4/fc.c +LSMAGIC 0x2a3b4d2a ls drivers/fc4/fc.c +WANPIPE_MAGIC 0x414C4453 sdla_{dump,exec} include/linux/wanpipe.h +CS_CARD_MAGIC 0x43525553 cs_card sound/oss/cs46xx.c +LABELCL_MAGIC 0x4857434c labelcl_info_s include/asm/ia64/sn/labelcl.h +ISDN_ASYNC_MAGIC 0x49344C01 modem_info include/linux/isdn.h +CTC_ASYNC_MAGIC 0x49344C01 ctc_tty_info drivers/s390/net/ctctty.c +ISDN_NET_MAGIC 0x49344C02 isdn_net_local_s drivers/isdn/i4l/isdn_net_lib.h +SAVEKMSG_MAGIC2 0x4B4D5347 savekmsg arch/*/amiga/config.c +STLI_BOARDMAGIC 0x4bc6c825 stlibrd include/linux/istallion.h +CS_STATE_MAGIC 0x4c4f4749 cs_state sound/oss/cs46xx.c +SLAB_C_MAGIC 0x4f17a36d kmem_cache_s mm/slab.c +COW_MAGIC 0x4f4f4f4d cow_header_v1 arch/um/drivers/ubd_user.c +I810_CARD_MAGIC 0x5072696E i810_card sound/oss/i810_audio.c +TRIDENT_CARD_MAGIC 0x5072696E trident_card sound/oss/trident.c +ROUTER_MAGIC 0x524d4157 wan_device include/linux/wanrouter.h +SCC_MAGIC 0x52696368 gs_port drivers/char/scc.h +SAVEKMSG_MAGIC1 0x53415645 savekmsg arch/*/amiga/config.c +GDA_MAGIC 0x58464552 gda include/asm-mips64/sn/gda.h +RED_MAGIC1 0x5a2cf071 (any) mm/slab.c +STL_PORTMAGIC 0x5a7182c9 stlport include/linux/stallion.h +EEPROM_MAGIC_VALUE 0X5ab478d2 lanai_dev drivers/atm/lanai.c +HDLCDRV_MAGIC 0x5ac6e778 hdlcdrv_state include/linux/hdlcdrv.h +EPCA_MAGIC 0x5c6df104 channel include/linux/epca.h +PCXX_MAGIC 0x5c6df104 channel drivers/char/pcxx.h +KV_MAGIC 0x5f4b565f kernel_vars_s include/asm-mips64/sn/klkernvars.h +I810_STATE_MAGIC 0x63657373 i810_state sound/oss/i810_audio.c +TRIDENT_STATE_MAGIC 0x63657373 trient_state sound/oss/trident.c +M3_CARD_MAGIC 0x646e6f50 m3_card sound/oss/maestro3.c +FW_HEADER_MAGIC 0x65726F66 fw_header drivers/atm/fore200e.h +SLOT_MAGIC 0x67267321 slot drivers/hotplug/cpqphp.h +SLOT_MAGIC 0x67267322 slot drivers/hotplug/acpiphp.h +LO_MAGIC 0x68797548 nbd_device include/linux/nbd.h +OPROFILE_MAGIC 0x6f70726f super_block drivers/oprofile/oprofilefs.h +M3_STATE_MAGIC 0x734d724d m3_state sound/oss/maestro3.c +STL_PANELMAGIC 0x7ef621a1 stlpanel include/linux/stallion.h +VMALLOC_MAGIC 0x87654320 snd_alloc_track sound/core/memory.c +KMALLOC_MAGIC 0x87654321 snd_alloc_track sound/core/memory.c +PWC_MAGIC 0x89DC10AB pwc_device drivers/usb/media/pwc.h +NBD_REPLY_MAGIC 0x96744668 nbd_reply include/linux/nbd.h +STL_BOARDMAGIC 0xa2267f52 stlbrd include/linux/stallion.h +ENI155_MAGIC 0xa54b872d midway_eprom drivers/atm/eni.h +SCI_MAGIC 0xbabeface gs_port drivers/char/sh-sci.h +CODA_MAGIC 0xC0DAC0DA coda_file_info include/linux/coda_fs_i.h +DPMEM_MAGIC 0xc0ffee11 gdt_pci_sram drivers/scsi/gdth.h +STLI_PORTMAGIC 0xe671c7a1 stliport include/linux/istallion.h +YAM_MAGIC 0xF10A7654 yam_port drivers/net/hamradio/yam.c +CCB_MAGIC 0xf2691ad2 ccb drivers/scsi/ncr53c8xx.c +QUEUE_MAGIC_FREE 0xf7e1c9a3 queue_entry drivers/scsi/arm/queue.c +QUEUE_MAGIC_USED 0xf7e1cc33 queue_entry drivers/scsi/arm/queue.c +HTB_CMAGIC 0xFEFAFEF1 htb_class net/sched/sch_htb.c +NMI_MAGIC 0x48414d4d455201 nmi_s include/asm-mips64/sn/nmi.h + +Note that there are also defined special per-driver magic numbers in sound +memory management. See include/sound/sndmagic.h for complete list of them. Many +OSS sound drivers have their magic numbers constructed from the soundcard PCI +ID - these are not listed here as well. + +IrDA subsystem also uses large number of own magic numbers, see +include/net/irda/irda.h for a complete list of them. + +HFS is another larger user of magic numbers - you can find them in +fs/hfs/hfs.h. diff --git a/Documentation/mandatory.txt b/Documentation/mandatory.txt new file mode 100644 index 000000000000..bc449d49eee5 --- /dev/null +++ b/Documentation/mandatory.txt @@ -0,0 +1,152 @@ + Mandatory File Locking For The Linux Operating System + + Andy Walker <andy@lysaker.kvaerner.no> + + 15 April 1996 + + +1. What is mandatory locking? +------------------------------ + +Mandatory locking is kernel enforced file locking, as opposed to the more usual +cooperative file locking used to guarantee sequential access to files among +processes. File locks are applied using the flock() and fcntl() system calls +(and the lockf() library routine which is a wrapper around fcntl().) It is +normally a process' responsibility to check for locks on a file it wishes to +update, before applying its own lock, updating the file and unlocking it again. +The most commonly used example of this (and in the case of sendmail, the most +troublesome) is access to a user's mailbox. The mail user agent and the mail +transfer agent must guard against updating the mailbox at the same time, and +prevent reading the mailbox while it is being updated. + +In a perfect world all processes would use and honour a cooperative, or +"advisory" locking scheme. However, the world isn't perfect, and there's +a lot of poorly written code out there. + +In trying to address this problem, the designers of System V UNIX came up +with a "mandatory" locking scheme, whereby the operating system kernel would +block attempts by a process to write to a file that another process holds a +"read" -or- "shared" lock on, and block attempts to both read and write to a +file that a process holds a "write " -or- "exclusive" lock on. + +The System V mandatory locking scheme was intended to have as little impact as +possible on existing user code. The scheme is based on marking individual files +as candidates for mandatory locking, and using the existing fcntl()/lockf() +interface for applying locks just as if they were normal, advisory locks. + +Note 1: In saying "file" in the paragraphs above I am actually not telling +the whole truth. System V locking is based on fcntl(). The granularity of +fcntl() is such that it allows the locking of byte ranges in files, in addition +to entire files, so the mandatory locking rules also have byte level +granularity. + +Note 2: POSIX.1 does not specify any scheme for mandatory locking, despite +borrowing the fcntl() locking scheme from System V. The mandatory locking +scheme is defined by the System V Interface Definition (SVID) Version 3. + +2. Marking a file for mandatory locking +--------------------------------------- + +A file is marked as a candidate for mandatory locking by setting the group-id +bit in its file mode but removing the group-execute bit. This is an otherwise +meaningless combination, and was chosen by the System V implementors so as not +to break existing user programs. + +Note that the group-id bit is usually automatically cleared by the kernel when +a setgid file is written to. This is a security measure. The kernel has been +modified to recognize the special case of a mandatory lock candidate and to +refrain from clearing this bit. Similarly the kernel has been modified not +to run mandatory lock candidates with setgid privileges. + +3. Available implementations +---------------------------- + +I have considered the implementations of mandatory locking available with +SunOS 4.1.x, Solaris 2.x and HP-UX 9.x. + +Generally I have tried to make the most sense out of the behaviour exhibited +by these three reference systems. There are many anomalies. + +All the reference systems reject all calls to open() for a file on which +another process has outstanding mandatory locks. This is in direct +contravention of SVID 3, which states that only calls to open() with the +O_TRUNC flag set should be rejected. The Linux implementation follows the SVID +definition, which is the "Right Thing", since only calls with O_TRUNC can +modify the contents of the file. + +HP-UX even disallows open() with O_TRUNC for a file with advisory locks, not +just mandatory locks. That would appear to contravene POSIX.1. + +mmap() is another interesting case. All the operating systems mentioned +prevent mandatory locks from being applied to an mmap()'ed file, but HP-UX +also disallows advisory locks for such a file. SVID actually specifies the +paranoid HP-UX behaviour. + +In my opinion only MAP_SHARED mappings should be immune from locking, and then +only from mandatory locks - that is what is currently implemented. + +SunOS is so hopeless that it doesn't even honour the O_NONBLOCK flag for +mandatory locks, so reads and writes to locked files always block when they +should return EAGAIN. + +I'm afraid that this is such an esoteric area that the semantics described +below are just as valid as any others, so long as the main points seem to +agree. + +4. Semantics +------------ + +1. Mandatory locks can only be applied via the fcntl()/lockf() locking + interface - in other words the System V/POSIX interface. BSD style + locks using flock() never result in a mandatory lock. + +2. If a process has locked a region of a file with a mandatory read lock, then + other processes are permitted to read from that region. If any of these + processes attempts to write to the region it will block until the lock is + released, unless the process has opened the file with the O_NONBLOCK + flag in which case the system call will return immediately with the error + status EAGAIN. + +3. If a process has locked a region of a file with a mandatory write lock, all + attempts to read or write to that region block until the lock is released, + unless a process has opened the file with the O_NONBLOCK flag in which case + the system call will return immediately with the error status EAGAIN. + +4. Calls to open() with O_TRUNC, or to creat(), on a existing file that has + any mandatory locks owned by other processes will be rejected with the + error status EAGAIN. + +5. Attempts to apply a mandatory lock to a file that is memory mapped and + shared (via mmap() with MAP_SHARED) will be rejected with the error status + EAGAIN. + +6. Attempts to create a shared memory map of a file (via mmap() with MAP_SHARED) + that has any mandatory locks in effect will be rejected with the error status + EAGAIN. + +5. Which system calls are affected? +----------------------------------- + +Those which modify a file's contents, not just the inode. That gives read(), +write(), readv(), writev(), open(), creat(), mmap(), truncate() and +ftruncate(). truncate() and ftruncate() are considered to be "write" actions +for the purposes of mandatory locking. + +The affected region is usually defined as stretching from the current position +for the total number of bytes read or written. For the truncate calls it is +defined as the bytes of a file removed or added (we must also consider bytes +added, as a lock can specify just "the whole file", rather than a specific +range of bytes.) + +Note 3: I may have overlooked some system calls that need mandatory lock +checking in my eagerness to get this code out the door. Please let me know, or +better still fix the system calls yourself and submit a patch to me or Linus. + +6. Warning! +----------- + +Not even root can override a mandatory lock, so runaway processes can wreak +havoc if they lock crucial files. The way around it is to change the file +permissions (remove the setgid bit) before trying to read or write to it. +Of course, that might be a bit tricky if the system is hung :-( + diff --git a/Documentation/mca.txt b/Documentation/mca.txt new file mode 100644 index 000000000000..6e32c305c65a --- /dev/null +++ b/Documentation/mca.txt @@ -0,0 +1,320 @@ +i386 Micro Channel Architecture Support +======================================= + +MCA support is enabled using the CONFIG_MCA define. A machine with a MCA +bus will have the kernel variable MCA_bus set, assuming the BIOS feature +bits are set properly (see arch/i386/boot/setup.S for information on +how this detection is done). + +Adapter Detection +================= + +The ideal MCA adapter detection is done through the use of the +Programmable Option Select registers. Generic functions for doing +this have been added in include/linux/mca.h and arch/i386/kernel/mca.c. +Everything needed to detect adapters and read (and write) configuration +information is there. A number of MCA-specific drivers already use +this. The typical probe code looks like the following: + + #include <linux/mca.h> + + unsigned char pos2, pos3, pos4, pos5; + struct net_device* dev; + int slot; + + if( MCA_bus ) { + slot = mca_find_adapter( ADAPTER_ID, 0 ); + if( slot == MCA_NOTFOUND ) { + return -ENODEV; + } + /* optional - see below */ + mca_set_adapter_name( slot, "adapter name & description" ); + mca_set_adapter_procfn( slot, dev_getinfo, dev ); + + /* read the POS registers. Most devices only use 2 and 3 */ + pos2 = mca_read_stored_pos( slot, 2 ); + pos3 = mca_read_stored_pos( slot, 3 ); + pos4 = mca_read_stored_pos( slot, 4 ); + pos5 = mca_read_stored_pos( slot, 5 ); + } else { + return -ENODEV; + } + + /* extract configuration from pos[2345] and set everything up */ + +Loadable modules should modify this to test that the specified IRQ and +IO ports (plus whatever other stuff) match. See 3c523.c for example +code (actually, smc-mca.c has a slightly more complex example that can +handle a list of adapter ids). + +Keep in mind that devices should never directly access the POS registers +(via inb(), outb(), etc). While it's generally safe, there is a small +potential for blowing up hardware when it's done at the wrong time. +Furthermore, accessing a POS register disables a device temporarily. +This is usually okay during startup, but do _you_ want to rely on it? +During initial configuration, mca_init() reads all the POS registers +into memory. mca_read_stored_pos() accesses that data. mca_read_pos() +and mca_write_pos() are also available for (safer) direct POS access, +but their use is _highly_ discouraged. mca_write_pos() is particularly +dangerous, as it is possible for adapters to be put in inconsistent +states (i.e. sharing IO address, etc) and may result in crashes, toasted +hardware, and blindness. + +User level drivers (such as the AGX X server) can use /proc/mca/pos to +find adapters (see below). + +Some MCA adapters can also be detected via the usual ISA-style device +probing (many SCSI adapters, for example). This sort of thing is highly +discouraged. Perfectly good information is available telling you what's +there, so there's no excuse for messing with random IO ports. However, +we MCA people still appreciate any ISA-style driver that will work with +our hardware. You take what you can get... + +Level-Triggered Interrupts +========================== + +Because MCA uses level-triggered interrupts, a few problems arise with +what might best be described as the ISA mindset and its effects on +drivers. These sorts of problems are expected to become less common as +more people use shared IRQs on PCI machines. + +In general, an interrupt must be acknowledged not only at the ICU (which +is done automagically by the kernel), but at the device level. In +particular, IRQ 0 must be reset after a timer interrupt (now done in +arch/i386/kernel/time.c) or the first timer interrupt hangs the system. +There were also problems with the 1.3.x floppy drivers, but that seems +to have been fixed. + +IRQs are also shareable, and most MCA-specific devices should be coded +with shared IRQs in mind. + +/proc/mca +========= + +/proc/mca is a directory containing various files for adapters and +other stuff. + + /proc/mca/pos Straight listing of POS registers + /proc/mca/slot[1-8] Information on adapter in specific slot + /proc/mca/video Same for integrated video + /proc/mca/scsi Same for integrated SCSI + /proc/mca/machine Machine information + +See Appendix A for a sample. + +Device drivers can easily add their own information function for +specific slots (including integrated ones) via the +mca_set_adapter_procfn() call. Drivers that support this are ESDI, IBM +SCSI, and 3c523. If a device is also a module, make sure that the proc +function is removed in the module cleanup. This will require storing +the slot information in a private structure somewhere. See the 3c523 +driver for details. + +Your typical proc function will look something like this: + + static int + dev_getinfo( char* buf, int slot, void* d ) { + struct net_device* dev = (struct net_device*) d; + int len = 0; + + len += sprintf( buf+len, "Device: %s\n", dev->name ); + len += sprintf( buf+len, "IRQ: %d\n", dev->irq ); + len += sprintf( buf+len, "IO Port: %#lx-%#lx\n", ... ); + ... + + return len; + } + +Some of the standard MCA information will already be printed, so don't +bother repeating it. Don't try putting in more than 3K of information. + +Enable this function with: + mca_set_adapter_procfn( slot, dev_getinfo, dev ); + +Disable it with: + mca_set_adapter_procfn( slot, NULL, NULL ); + +It is also recommended that, even if you don't write a proc function, to +set the name of the adapter (i.e. "PS/2 ESDI Controller") via +mca_set_adapter_name( int slot, char* name ). + +MCA Device Drivers +================== + +Currently, there are a number of MCA-specific device drivers. + +1) PS/2 ESDI + drivers/block/ps2esdi.c + include/linux/ps2esdi.h + Uses major number 36, and should use /dev files /dev/eda, /dev/edb. + Supports two drives, but only one controller. May use the + command-line args "ed=cyl,head,sec" and "tp720". + +2) PS/2 SCSI + drivers/scsi/ibmmca.c + drivers/scsi/ibmmca.h + The driver for the IBM SCSI subsystem. Includes both integrated + controllers and adapter cards. May require command-line arg + "ibmmcascsi=io_port" to force detection of an adapter. If you have a + machine with a front-panel display (i.e. model 95), you can use + "ibmmcascsi=display" to enable a drive activity indicator. + +3) 3c523 + drivers/net/3c523.c + drivers/net/3c523.h + 3Com 3c523 Etherlink/MC ethernet driver. + +4) SMC Ultra/MCA and IBM Adapter/A + drivers/net/smc-mca.c + drivers/net/smc-mca.h + Driver for the MCA version of the SMC Ultra and various other + OEM'ed and work-alike cards (Elite, Adapter/A, etc). + +5) NE/2 + driver/net/ne2.c + driver/net/ne2.h + The NE/2 is the MCA version of the NE2000. This may not work + with clones that have a different adapter id than the original + NE/2. + +6) Future Domain MCS-600/700, OEM'd IBM Fast SCSI Aapter/A and + Reply Sound Blaster/SCSI (SCSI part) + Better support for these cards than the driver for ISA. + Supports multiple cards with IRQ sharing. + +Also added boot time option of scsi-probe, which can do reordering of +SCSI host adapters. This will direct the kernel on the order which +SCSI adapter should be detected. Example: + scsi-probe=ibmmca,fd_mcs,adaptec1542,buslogic + +The serial drivers were modified to support the extended IO port range +of the typical MCA system (also #ifdef CONFIG_MCA). + +The following devices work with existing drivers: +1) Token-ring +2) Future Domain SCSI (MCS-600, MCS-700, not MCS-350, OEM'ed IBM SCSI) +3) Adaptec 1640 SCSI (using the aha1542 driver) +4) Bustek/Buslogic SCSI (various) +5) Probably all Arcnet cards. +6) Some, possibly all, MCA IDE controllers. +7) 3Com 3c529 (MCA version of 3c509) (patched) + +8) Intel EtherExpressMC (patched version) + You need to have CONFIG_MCA defined to have EtherExpressMC support. +9) Reply Sound Blaster/SCSI (SB part) (patched version) + +Bugs & Other Weirdness +====================== + +NMIs tend to occur with MCA machines because of various hardware +weirdness, bus timeouts, and many other non-critical things. Some basic +code to handle them (inspired by the NetBSD MCA code) has been added to +detect the guilty device, but it's pretty incomplete. If NMIs are a +persistent problem (on some model 70 or 80s, they occur every couple +shell commands), the CONFIG_IGNORE_NMI flag will take care of that. + +Various Pentium machines have had serious problems with the FPU test in +bugs.h. Basically, the machine hangs after the HLT test. This occurs, +as far as we know, on the Pentium-equipped 85s, 95s, and some PC Servers. +The PCI/MCA PC 750s are fine as far as I can tell. The ``mca-pentium'' +boot-prompt flag will disable the FPU bug check if this is a problem +with your machine. + +The model 80 has a raft of problems that are just too weird and unique +to get into here. Some people have no trouble while others have nothing +but problems. I'd suspect some problems are related to the age of the +average 80 and accompanying hardware deterioration, although others +are definitely design problems with the hardware. Among the problems +include SCSI controller problems, ESDI controller problems, and serious +screw-ups in the floppy controller. Oh, and the parallel port is also +pretty flaky. There were about 5 or 6 different model 80 motherboards +produced to fix various obscure problems. As far as I know, it's pretty +much impossible to tell which bugs a particular model 80 has (other than +triggering them, that is). + +Drivers are required for some MCA memory adapters. If you're suddenly +short a few megs of RAM, this might be the reason. The (I think) Enhanced +Memory Adapter commonly found on the model 70 is one. There's a very +alpha driver floating around, but it's pretty ugly (disassembled from +the DOS driver, actually). See the MCA Linux web page (URL below) +for more current memory info. + +The Thinkpad 700 and 720 will work, but various components are either +non-functional, flaky, or we don't know anything about them. The +graphics controller is supposed to be some WD, but we can't get things +working properly. The PCMCIA slots don't seem to work. Ditto for APM. +The serial ports work, but detection seems to be flaky. + +Credits +======= +A whole pile of people have contributed to the MCA code. I'd include +their names here, but I don't have a list handy. Check the MCA Linux +home page (URL below) for a perpetually out-of-date list. + +===================================================================== +MCA Linux Home Page: http://glycerine.itsmm.uni.edu/mca/ + +Christophe Beauregard +chrisb@truespectra.com +cpbeaure@calum.csclub.uwaterloo.ca + +===================================================================== +Appendix A: Sample /proc/mca + +This is from my model 8595. Slot 1 contains the standard IBM SCSI +adapter, slot 3 is an Adaptec AHA-1640, slot 5 is a XGA-1 video adapter, +and slot 7 is the 3c523 Etherlink/MC. + +/proc/mca/machine: +Model Id: 0xf8 +Submodel Id: 0x14 +BIOS Revision: 0x5 + +/proc/mca/pos: +Slot 1: ff 8e f1 fc a0 ff ff ff IBM SCSI Adapter w/Cache +Slot 2: ff ff ff ff ff ff ff ff +Slot 3: 1f 0f 81 3b bf b6 ff ff +Slot 4: ff ff ff ff ff ff ff ff +Slot 5: db 8f 1d 5e fd c0 00 00 +Slot 6: ff ff ff ff ff ff ff ff +Slot 7: 42 60 ff 08 ff ff ff ff 3Com 3c523 Etherlink/MC +Slot 8: ff ff ff ff ff ff ff ff +Video : ff ff ff ff ff ff ff ff +SCSI : ff ff ff ff ff ff ff ff + +/proc/mca/slot1: +Slot: 1 +Adapter Name: IBM SCSI Adapter w/Cache +Id: 8eff +Enabled: Yes +POS: ff 8e f1 fc a0 ff ff ff +Subsystem PUN: 7 +Detected at boot: Yes + +/proc/mca/slot3: +Slot: 3 +Adapter Name: Unknown +Id: 0f1f +Enabled: Yes +POS: 1f 0f 81 3b bf b6 ff ff + +/proc/mca/slot5: +Slot: 5 +Adapter Name: Unknown +Id: 8fdb +Enabled: Yes +POS: db 8f 1d 5e fd c0 00 00 + +/proc/mca/slot7: +Slot: 7 +Adapter Name: 3Com 3c523 Etherlink/MC +Id: 6042 +Enabled: Yes +POS: 42 60 ff 08 ff ff ff ff +Revision: 0xe +IRQ: 9 +IO Address: 0x3300-0x3308 +Memory: 0xd8000-0xdbfff +Transceiver: External +Device: eth0 +Hardware Address: 02 60 8c 45 c4 2a diff --git a/Documentation/md.txt b/Documentation/md.txt new file mode 100644 index 000000000000..e2b536992a27 --- /dev/null +++ b/Documentation/md.txt @@ -0,0 +1,118 @@ +Tools that manage md devices can be found at + http://www.<country>.kernel.org/pub/linux/utils/raid/.... + + +Boot time assembly of RAID arrays +--------------------------------- + +You can boot with your md device with the following kernel command +lines: + +for old raid arrays without persistent superblocks: + md=<md device no.>,<raid level>,<chunk size factor>,<fault level>,dev0,dev1,...,devn + +for raid arrays with persistent superblocks + md=<md device no.>,dev0,dev1,...,devn +or, to assemble a partitionable array: + md=d<md device no.>,dev0,dev1,...,devn + +md device no. = the number of the md device ... + 0 means md0, + 1 md1, + 2 md2, + 3 md3, + 4 md4 + +raid level = -1 linear mode + 0 striped mode + other modes are only supported with persistent super blocks + +chunk size factor = (raid-0 and raid-1 only) + Set the chunk size as 4k << n. + +fault level = totally ignored + +dev0-devn: e.g. /dev/hda1,/dev/hdc1,/dev/sda1,/dev/sdb1 + +A possible loadlin line (Harald Hoyer <HarryH@Royal.Net>) looks like this: + +e:\loadlin\loadlin e:\zimage root=/dev/md0 md=0,0,4,0,/dev/hdb2,/dev/hdc3 ro + + +Boot time autodetection of RAID arrays +-------------------------------------- + +When md is compiled into the kernel (not as module), partitions of +type 0xfd are scanned and automatically assembled into RAID arrays. +This autodetection may be suppressed with the kernel parameter +"raid=noautodetect". As of kernel 2.6.9, only drives with a type 0 +superblock can be autodetected and run at boot time. + +The kernel parameter "raid=partitionable" (or "raid=part") means +that all auto-detected arrays are assembled as partitionable. + + +Superblock formats +------------------ + +The md driver can support a variety of different superblock formats. +Currently, it supports superblock formats "0.90.0" and the "md-1" format +introduced in the 2.5 development series. + +The kernel will autodetect which format superblock is being used. + +Superblock format '0' is treated differently to others for legacy +reasons - it is the original superblock format. + + +General Rules - apply for all superblock formats +------------------------------------------------ + +An array is 'created' by writing appropriate superblocks to all +devices. + +It is 'assembled' by associating each of these devices with an +particular md virtual device. Once it is completely assembled, it can +be accessed. + +An array should be created by a user-space tool. This will write +superblocks to all devices. It will usually mark the array as +'unclean', or with some devices missing so that the kernel md driver +can create appropriate redundancy (copying in raid1, parity +calculation in raid4/5). + +When an array is assembled, it is first initialized with the +SET_ARRAY_INFO ioctl. This contains, in particular, a major and minor +version number. The major version number selects which superblock +format is to be used. The minor number might be used to tune handling +of the format, such as suggesting where on each device to look for the +superblock. + +Then each device is added using the ADD_NEW_DISK ioctl. This +provides, in particular, a major and minor number identifying the +device to add. + +The array is started with the RUN_ARRAY ioctl. + +Once started, new devices can be added. They should have an +appropriate superblock written to them, and then passed be in with +ADD_NEW_DISK. + +Devices that have failed or are not yet active can be detached from an +array using HOT_REMOVE_DISK. + + +Specific Rules that apply to format-0 super block arrays, and + arrays with no superblock (non-persistent). +------------------------------------------------------------- + +An array can be 'created' by describing the array (level, chunksize +etc) in a SET_ARRAY_INFO ioctl. This must has major_version==0 and +raid_disks != 0. + +Then uninitialized devices can be added with ADD_NEW_DISK. The +structure passed to ADD_NEW_DISK must specify the state of the device +and it's role in the array. + +Once started with RUN_ARRAY, uninitialized spares can be added with +HOT_ADD_DISK. diff --git a/Documentation/memory.txt b/Documentation/memory.txt new file mode 100644 index 000000000000..2b3dedd39538 --- /dev/null +++ b/Documentation/memory.txt @@ -0,0 +1,60 @@ +There are several classic problems related to memory on Linux +systems. + + 1) There are some buggy motherboards which cannot properly + deal with the memory above 16MB. Consider exchanging + your motherboard. + + 2) You cannot do DMA on the ISA bus to addresses above + 16M. Most device drivers under Linux allow the use + of bounce buffers which work around this problem. Drivers + that don't use bounce buffers will be unstable with + more than 16M installed. Drivers that use bounce buffers + will be OK, but may have slightly higher overhead. + + 3) There are some motherboards that will not cache above + a certain quantity of memory. If you have one of these + motherboards, your system will be SLOWER, not faster + as you add more memory. Consider exchanging your + motherboard. + +All of these problems can be addressed with the "mem=XXXM" boot option +(where XXX is the size of RAM to use in megabytes). +It can also tell Linux to use less memory than is actually installed. +If you use "mem=" on a machine with PCI, consider using "memmap=" to avoid +physical address space collisions. + +See the documentation of your boot loader (LILO, loadlin, etc.) about +how to pass options to the kernel. + +There are other memory problems which Linux cannot deal with. Random +corruption of memory is usually a sign of serious hardware trouble. +Try: + + * Reducing memory settings in the BIOS to the most conservative + timings. + + * Adding a cooling fan. + + * Not overclocking your CPU. + + * Having the memory tested in a memory tester or exchanged + with the vendor. Consider testing it with memtest86 yourself. + + * Exchanging your CPU, cache, or motherboard for one that works. + + * Disabling the cache from the BIOS. + + * Try passing the "mem=4M" option to the kernel to limit + Linux to using a very small amount of memory. Use "memmap="-option + together with "mem=" on systems with PCI to avoid physical address + space collisions. + + +Other tricks: + + * Try passing the "no-387" option to the kernel to ignore + a buggy FPU. + + * Try passing the "no-hlt" option to disable the potentially + buggy HLT instruction in your CPU. diff --git a/Documentation/mips/GT64120.README b/Documentation/mips/GT64120.README new file mode 100644 index 000000000000..2d0eec91dc59 --- /dev/null +++ b/Documentation/mips/GT64120.README @@ -0,0 +1,65 @@ +README for arch/mips/gt64120 directory and subdirectories + +Jun Sun, jsun@mvista.com or jsun@junsun.net +01/27, 2001 + +MOTIVATION +---------- + +Many MIPS boards share the same system controller (or CPU companian chip), +such as GT-64120. It is highly desirable to let these boards share +the same controller code instead of duplicating them. + +This directory is meant to hold all MIPS boards that use GT-64120 or GT-64120A. + + +HOW TO ADD A BOARD +------------------ + +. Create a subdirectory include/asm/gt64120/<board>. + +. Create a file called gt64120_dep.h under that directory. + +. Modify include/asm/gt64120/gt64120.h file to include the new gt64120_dep.h + based on config options. The board-dep section is at the end of + include/asm/gt64120/gt64120.h file. There you can find all required + definitions include/asm/gt64120/<board>/gt64120_dep.h file must supply. + +. Create a subdirectory arch/mips/gt64120/<board> directory to hold + board specific routines. + +. The GT-64120 common code is supplied under arch/mips/gt64120/common directory. + It includes: + 1) arch/mips/gt64120/pci.c - + common PCI routine, include the top-level pcibios_init() + 2) arch/mips/gt64120/irq.c - + common IRQ routine, include the top-level do_IRQ() + [This part really belongs to arch/mips/kernel. jsun] + 3) arch/mips/gt64120/gt_irq.c - + common IRQ routines for GT-64120 chip. Currently it only handles + the timer interrupt. + +. Board-specific routines are supplied under arch/mips/gt64120/<board> dir. + 1) arch/mips/gt64120/<board>/pci.c - it provides bus fixup routine + 2) arch/mips/gt64120/<board>/irq.c - it provides enable/disable irqs + and board irq setup routine (irq_setup) + 3) arch/mips/gt64120/<board>/int-handler.S - + The first-level interrupt dispatching routine. + 4) a bunch of other "normal" stuff (setup, prom, dbg_io, reset, etc) + +. Follow other "normal" procedure to modify configuration files, etc. + + +TO-DO LIST +---------- + +. Expand arch/mips/gt64120/gt_irq.c to handle all GT-64120 interrupts. + We probably need to introduce GT_IRQ_BASE in board-dep header file, + which is used the starting irq_nr for all GT irqs. + + A function, gt64120_handle_irq(), will be added so that the first-level + irq dispatcher will call this function if it detects an interrupt + from GT-64120. + +. More support for GT-64120 PCI features (2nd PCI bus, perhaps) + diff --git a/Documentation/mips/pci/pci.README b/Documentation/mips/pci/pci.README new file mode 100644 index 000000000000..8697ee41372d --- /dev/null +++ b/Documentation/mips/pci/pci.README @@ -0,0 +1,54 @@ + +Pete Popov, ppopov@pacbell.net +07/11/2001 + +This README briefly explains how to use the pci and pci_auto +code in arch/mips/kernel. The code was ported from PowerPC and +modified slightly. It has been tested pretty well on PPC on some +rather complex systems with multiple bridges and devices behind +each bridge. However, at the time this README was written, the +mips port was tested only on boards with a single pci bus and +no P2P bridges. It's very possible that on boards with P2P +bridges some modifications have to be made. The code will +evolve, no doubt, but currently every single mips board +is doing its own pcibios thing and it has become a big +mess. This generic pci code is meant to clean up the mips +pci mess and make it easier to add pci support to new boards. + +inside the define for your board in arch/mips/config.in. +For example, the Galileo EV96100 board looks like this: + +if [ "$CONFIG_MIPS_EV96100" = "y" ]; then + define_bool CONFIG_PCI y + define_bool CONFIG_MIPS_GT96100 y + define_bool CONFIG_NEW_PCI y + define_bool CONFIG_SWAP_IO_SPACE y +fi + + +Next, if you want to use the arch/mips/kernel/pci code, which has the +pcibios_init() function, add + +define_bool CONFIG_NEW_PCI y + +inside the define for your board. Again, the EV96100 example above +show NEW_PCI turned on. + + +Now you need to add your files to hook in your pci configuration +cycles. Usually you'll need only a couple of files named something +like pci_fixups.c and pci_ops.c. You can copy the templates +provided and fill in the code. + +The file pci_ops.c should contain the pci configuration cycles routines. +It also has the mips_pci_channels[] array which contains the descriptors +of each pci controller. + +The file pci_fixups.c contains a few routines to do interrupt fixups, +resources fixups, and, if needed, pci bios fixups. + +Usually you'll put your pci_fixups.c file in your board specific directory, +since the functions in that file are board specific. The functions in +pci_ops.c, on the other hand, are usually pci controller specific so that +file could be shared among a few different boards using the same +pci controller. diff --git a/Documentation/mips/time.README b/Documentation/mips/time.README new file mode 100644 index 000000000000..70bc0dd43d6d --- /dev/null +++ b/Documentation/mips/time.README @@ -0,0 +1,198 @@ +README for MIPS time services + +Jun Sun +jsun@mvista.com or jsun@junsun.net + + +ABOUT +----- +This file describes the new arch/mips/kernel/time.c, related files and the +services they provide. + +If you are short in patience and just want to know how to use time.c for a +new board or convert an existing board, go to the last section. + + +FILES, COMPATABILITY AND CONFIGS +--------------------------------- + +The old arch/mips/kernel/time.c is renamed to old-time.c. + +A new time.c is put there, together with include/asm-mips/time.h. + +Two configs variables are introduced, CONFIG_OLD_TIME_C and CONFIG_NEW_TIME_C. +So we allow boards using + + 1) old time.c (CONFIG_OLD_TIME_C) + 2) new time.c (CONFIG_NEW_TIME_C) + 3) neither (their own private time.c) + +However, it is expected every board will move to the new time.c in the near +future. + + +WHAT THE NEW CODE PROVIDES? +--------------------------- + +The new time code provide the following services: + + a) Implements functions required by Linux common code: + time_init + do_gettimeofday + do_settimeofday + + b) provides an abstraction of RTC and null RTC implementation as default. + extern unsigned long (*rtc_get_time)(void); + extern int (*rtc_set_time)(unsigned long); + + c) a set of gettimeoffset functions for different CPUs and different + needs. + + d) high-level and low-level timer interrupt routines where the timer + interrupt source may or may not be the CPU timer. The high-level + routine is dispatched through do_IRQ() while the low-level is + dispatched in assemably code (usually int-handler.S) + + +WHAT THE NEW CODE REQUIRES? +--------------------------- + +For the new code to work properly, each board implementation needs to supply +the following functions or values: + + a) board_time_init - a function pointer. Invoked at the beginnig of + time_init(). It is optional. + 1. (optional) set up RTC routines + 2. (optional) calibrate and set the mips_counter_frequency + + b) board_timer_setup - a function pointer. Invoked at the end of time_init() + 1. (optional) over-ride any decisions made in time_init() + 2. set up the irqaction for timer interrupt. + 3. enable the timer interrupt + + c) (optional) board-specific RTC routines. + + d) (optional) mips_counter_frequency - It must be definied if the board + is using CPU counter for timer interrupt or it is using fixed rate + gettimeoffset(). + + +PORTING GUIDE +------------- + +Step 1: decide how you like to implement the time services. + + a) does this board have a RTC? If yes, implement the two RTC funcs. + + b) does the CPU have counter/compare registers? + + If the answer is no, you need a timer to provide the timer interrupt + at 100 HZ speed. + + You cannot use the fast gettimeoffset functions, i.e., + + unsigned long fixed_rate_gettimeoffset(void); + unsigned long calibrate_div32_gettimeoffset(void); + unsigned long calibrate_div64_gettimeoffset(void); + + You can use null_gettimeoffset() will gives the same time resolution as + jiffy. Or you can implement your own gettimeoffset (probably based on + some ad hoc hardware on your machine.) + + c) The following sub steps assume your CPU has counter register. + Do you plan to use the CPU counter register as the timer interrupt + or use an exnternal timer? + + In order to use CPU counter register as the timer interrupt source, you + must know the counter speed (mips_counter_frequency). It is usually the + same as the CPU speed or an integral divisor of it. + + d) decide on whether you want to use high-level or low-level timer + interrupt routines. The low-level one is presumably faster, but should + not make too mcuh difference. + + +Step 2: the machine setup() function + + If you supply board_time_init(), set the function poointer. + + Set the function pointer board_timer_setup() (mandatory) + + +Step 3: implement rtc routines, board_time_init() and board_timer_setup() + if needed. + + board_time_init() - + a) (optional) set up RTC routines, + b) (optional) calibrate and set the mips_counter_frequency + (only needed if you intended to use fixed_rate_gettimeoffset + or use cpu counter as timer interrupt source) + + board_timer_setup() - + a) (optional) over-write any choices made above by time_init(). + b) machine specific code should setup the timer irqaction. + c) enable the timer interrupt + + + If the RTC chip is a common chip, I suggest the routines are put under + arch/mips/libs. For example, for DS1386 chip, one would create + rtc-ds1386.c under arch/mips/lib directory. Add the following line to + the arch/mips/lib/Makefile: + + obj-$(CONFIG_DDB5476) += rtc-ds1386.o + +Step 4: if you are using low-level timer interrupt, change your interrupt + dispathcing code to check for timer interrupt and jump to + ll_timer_interrupt() directly if one is detected. + +Step 5: Modify arch/mips/config.in and add CONFIG_NEW_TIME_C to your machine. + Modify the appropriate defconfig if applicable. + +Final notes: + +For some tricky cases, you may need to add your own wrapper functions +for some of the functions in time.c. + +For example, you may define your own timer interrupt routine, which does +some of its own processing and then calls timer_interrupt(). + +You can also over-ride any of the built-in functions (gettimeoffset, +RTC routines and/or timer interrupt routine). + + +PORTING NOTES FOR SMP +---------------------- + +If you have a SMP box, things are slightly more complicated. + +The time service running every jiffy is logically divided into two parts: + + 1) the one for the whole system (defined in timer_interrupt()) + 2) the one that should run for each CPU (defined in local_timer_interrupt()) + +You need to decide on your timer interrupt sources. + + case 1) - whole system has only one timer interrupt delivered to one CPU + + In this case, you set up timer interrupt as in UP systems. In addtion, + you need to set emulate_local_timer_interrupt to 1 so that other + CPUs get to call local_timer_interrupt(). + + THIS IS CURRENTLY NOT IMPLEMNETED. However, it is rather easy to write + one should such a need arise. You simply make a IPI call. + + case 2) - each CPU has a separate timer interrupt + + In this case, you need to set up IRQ such that each of them will + call local_timer_interrupt(). In addition, you need to arrange + one and only one of them to call timer_interrupt(). + + You can also do the low-level version of those interrupt routines, + following similar dispatching routes described above. + +Note about do_gettimeoffset(): + + It is very likely the CPU counter registers are not sync'ed up in a SMP box. + Therefore you cannot really use the many of the existing routines that + are based on CPU counter. You should wirte your own gettimeoffset rouinte + if you want intra-jiffy resolution. diff --git a/Documentation/mono.txt b/Documentation/mono.txt new file mode 100644 index 000000000000..6739ab9615ef --- /dev/null +++ b/Documentation/mono.txt @@ -0,0 +1,66 @@ + Mono(tm) Binary Kernel Support for Linux + ----------------------------------------- + +To configure Linux to automatically execute Mono-based .NET binaries +(in the form of .exe files) without the need to use the mono CLR +wrapper, you can use the BINFMT_MISC kernel support. + +This will allow you to execute Mono-based .NET binaries just like any +other program after you have done the following: + +1) You MUST FIRST install the Mono CLR support, either by downloading + a binary package, a source tarball or by installing from CVS. Binary + packages for several distributions can be found at: + + http://go-mono.com/download.html + + Instructions for compiling Mono can be found at: + + http://www.go-mono.com/compiling.html + + Once the Mono CLR support has been installed, just check that + /usr/bin/mono (which could be located elsewhere, for example + /usr/local/bin/mono) is working. + +2) You have to compile BINFMT_MISC either as a module or into + the kernel (CONFIG_BINFMT_MISC) and set it up properly. + If you choose to compile it as a module, you will have + to insert it manually with modprobe/insmod, as kmod + can not be easily supported with binfmt_misc. + Read the file 'binfmt_misc.txt' in this directory to know + more about the configuration process. + +3) Add the following enries to /etc/rc.local or similar script + to be run at system startup: + +# Insert BINFMT_MISC module into the kernel +if [ ! -e /proc/sys/fs/binfmt_misc/register ]; then + /sbin/modprobe binfmt_misc + # Some distributions, like Fedora Core, perform + # the following command automatically when the + # binfmt_misc module is loaded into the kernel. + # Thus, it is possible that the following line + # is not needed at all. Look at /etc/modprobe.conf + # to check whether this is applicable or not. + mount -t binfmt_misc none /proc/sys/fs/binfmt_misc +fi + +# Register support for .NET CLR binaries +if [ -e /proc/sys/fs/binfmt_misc/register ]; then + # Replace /usr/bin/mono with the correct pathname to + # the Mono CLR runtime (usually /usr/local/bin/mono + # when compiling from sources or CVS). + echo ':CLR:M::MZ::/usr/bin/mono:' > /proc/sys/fs/binfmt_misc/register +else + echo "No binfmt_misc support" + exit 1 +fi + +4) Check that .exe binaries can be ran without the need of a + wrapper script, simply by launching the .exe file directly + from a command prompt, for example: + + /usr/bin/xsd.exe + + NOTE: If this fails with a permission denied error, check + that the .exe file has execute permissions. diff --git a/Documentation/moxa-smartio b/Documentation/moxa-smartio new file mode 100644 index 000000000000..fe24ecc6372e --- /dev/null +++ b/Documentation/moxa-smartio @@ -0,0 +1,411 @@ +============================================================================= + + MOXA Smartio Family Device Driver Ver 1.1 Installation Guide + for Linux Kernel 2.2.x and 2.0.3x + Copyright (C) 1999, Moxa Technologies Co, Ltd. +============================================================================= +Content + +1. Introduction +2. System Requirement +3. Installation +4. Utilities +5. Setserial +6. Troubleshooting + +----------------------------------------------------------------------------- +1. Introduction + + The Smartio family Linux driver, Ver. 1.1, supports following multiport + boards. + + -C104P/H/HS, C104H/PCI, C104HS/PCI, CI-104J 4 port multiport board. + -C168P/H/HS, C168H/PCI 8 port multiport board. + + This driver has been modified a little and cleaned up from the Moxa + contributed driver code and merged into Linux 2.2.14pre. In particular + official major/minor numbers have been assigned which are different to + those the original Moxa supplied driver used. + + This driver and installation procedure have been developed upon Linux Kernel + 2.2.5 and backward compatible to 2.0.3x. This driver supports Intel x86 and + Alpha hardware platform. In order to maintain compatibility, this version + has also been properly tested with RedHat, OpenLinux, TurboLinux and + S.u.S.E Linux. However, if compatibility problem occurs, please contact + Moxa at support@moxa.com.tw. + + In addition to device driver, useful utilities are also provided in this + version. They are + - msdiag Diagnostic program for detecting installed Moxa Smartio boards. + - msmon Monitor program to observe data count and line status signals. + - msterm A simple terminal program which is useful in testing serial + ports. + - io-irq.exe Configuration program to setup ISA boards. Please note that + this program can only be executed under DOS. + + All the drivers and utilities are published in form of source code under + GNU General Public License in this version. Please refer to GNU General + Public License announcement in each source code file for more detail. + + In Moxa's ftp sites, you may always find latest driver at + ftp://ftp.moxa.com or ftp://ftp.moxa.com.tw. + + This version of driver can be installed as Loadable Module (Module driver) + or built-in into kernel (Static driver). You may refer to following + installation procedure for suitable one. Before you install the driver, + please refer to hardware installation procedure in the User's Manual. + + We assume the user should be familiar with following documents. + - Serial-HOWTO + - Kernel-HOWTO + +----------------------------------------------------------------------------- +2. System Requirement + - Hardware platform: Intel x86 or Alpha machine + - Kernel version: 2.0.3x or 2.2.x + - gcc version 2.72 or later + - Maximum 4 boards can be installed in combination + +----------------------------------------------------------------------------- +3. Installation + + 3.1 Hardware installation + + There are two types of buses, ISA and PCI, for Smartio family multiport + board. + + ISA board + --------- + You'll have to configure CAP address, I/O address, Interrupt Vector + as well as IRQ before installing this driver. Please refer to hardware + installation procedure in User's Manual before proceed any further. + Please make sure the JP1 is open after the ISA board is set properly. + + PCI board + --------- + You may need to adjust IRQ usage in BIOS to avoid from IRQ conflict + with other ISA devices. Please refer to hardware installation + procedure in User's Manual in advance. + + IRQ Sharing + ----------- + Each port within the same multiport board shares the same IRQ. Up to + 4 Moxa Smartio Family multiport boards can be installed together on + one system and they can share the same IRQ. + + 3.2 Driver files and device naming convention + + The driver file may be obtained from ftp, CD-ROM or floppy disk. The + first step, anyway, is to copy driver file "mxser.tgz" into specified + directory. e.g. /moxa. The execute commands as below. + + # cd /moxa + # tar xvf /dev/fd0 + or + # cd /moxa + # cp /mnt/cdrom/<driver directory>/mxser.tgz . + # tar xvfz mxser.tgz + + You may find all the driver and utilities files in /moxa/mxser. + Following installation procedure depends on the model you'd like to + run the driver. If you prefer module driver, please refer to 3.3. + If static driver is required, please refer to 3.4. + + Dialin and callout port + ----------------------- + This driver remains traditional serial device properties. There're + two special file name for each serial port. One is dial-in port + which is named "ttyMxx". For callout port, the naming convention + is "cumxx". + + Device naming when more than 2 boards installed + ----------------------------------------------- + Naming convention for each Smartio multiport board is pre-defined + as below. + + Board Num. Dial-in Port Callout port + 1st board ttyM0 - ttyM7 cum0 - cum7 + 2nd board ttyM8 - ttyM15 cum8 - cum15 + 3rd board ttyM16 - ttyM23 cum16 - cum23 + 4th board ttyM24 - ttym31 cum24 - cum31 + + Board sequence + -------------- + This driver will activate ISA boards according to the parameter set + in the driver. After all specified ISA board activated, PCI board + will be installed in the system automatically driven. + Therefore the board number is sorted by the CAP address of ISA boards. + For PCI boards, their sequence will be after ISA boards and C168H/PCI + has higher priority than C104H/PCI boards. + + 3.3 Module driver configuration + Module driver is easiest way to install. If you prefer static driver + installation, please skip this paragraph. + 1. Find "Makefile" in /moxa/mxser, then run + + # make install + + The driver files "mxser.o" and utilities will be properly compiled + and copied to system directories respectively.Then run + + # insmod mxser + + to activate the modular driver. You may run "lsmod" to check + if "mxser.o" is activated. + + 2. Create special files by executing "msmknod". + # cd /moxa/mxser/driver + # ./msmknod + + Default major numbers for dial-in device and callout device are + 174, 175. Msmknod will delete any special files occupying the same + device naming. + + 3. Up to now, you may manually execute "insmod mxser" to activate + this driver and run "rmmod mxser" to remove it. However, it's + better to have a boot time configuration to eliminate manual + operation. + Boot time configuration can be achieved by rc file. Run following + command for setting rc files. + + # cd /moxa/mxser/driver + # cp ./rc.mxser /etc/rc.d + # cd /etc/rc.d + + You may have to modify part of the content in rc.mxser to specify + parameters for ISA board. Please refer to rc.mxser for more detail. + Find "rc.serial". If "rc.serial" doesn't exist, create it by vi. + Add "rc.mxser" in last line. Next, open rc.local by vi + and append following content. + + if [ -f /etc/rc.d/rc.serial ]; then + sh /etc/rc.d/rc.serial + fi + + 4. Reboot and check if mxser.o activated by "lsmod" command. + 5. If you'd like to drive Smartio ISA boards in the system, you'll + have to add parameter to specify CAP address of given board while + activating "mxser.o". The format for parameters are as follows. + + insmod mxser ioaddr=0x???,0x???,0x???,0x??? + | | | | + | | | +- 4th ISA board + | | +------ 3rd ISA board + | +------------ 2nd ISA board + +------------------- 1st ISA board + + 3.4 Static driver configuration + + 1. Create link + # cd /usr/src/linux/drivers/char + # ln -s /moxa/mxser/driver/mxser.c mxser.c + + 2. Add CAP address list for ISA boards + In module mode, the CAP address for ISA board is given by + parameter. In static driver configuration, you'll have to + assign it within driver's source code. If you will not + install any ISA boards, you may skip to next portion. + The instructions to modify driver source code are as + below. + a. # cd /moxa/mxser/driver + # vi mxser.c + b. Find the array mxserBoardCAP[] as below. + + static int mxserBoardCAP[] + = {0x00, 0x00, 0x00, 0x00}; + + c. Change the address within this array using vi. For + example, to driver 2 ISA boards with CAP address + 0x280 and 0x180 as 1st and 2nd board. Just to change + the source code as follows. + + static int mxserBoardCAP[] + = {0x280, 0x180, 0x00, 0x00}; + + 3. Modify tty_io.c + # cd /usr/src/linux/drivers/char/ + # vi tty_io.c + Find pty_init(), insert "mxser_init()" as + + pty_init(); + mxser_init(); + + 4. Modify tty.h + # cd /usr/src/linux/include/linux + # vi tty.h + Find extern int tty_init(void), insert "mxser_init()" as + + extern int tty_init(void); + extern int mxser_init(void); + + 5. Modify Makefile + # cd /usr/src/linux/drivers/char + # vi Makefile + Find L_OBJS := tty_io.o ...... random.o, add + "mxser.o" at last of this line as + L_OBJS := tty_io.o ....... mxser.o + + 6. Rebuild kernel + The following are for Linux kernel rebuilding,for your reference only. + For appropriate details, please refer to the Linux document. + + If 'lilo' utility is installed, please use 'make zlilo' to rebuild + kernel. If 'lilo' is not installed, please follow the following steps. + + a. cd /usr/src/linux + b. make clean /* take a few minutes */ + c. make bzImage /* take probably 10-20 minutes */ + d. Backup original boot kernel. /* optional step */ + e. cp /usr/src/linux/arch/i386/boot/bzImage /boot/vmlinuz + f. Please make sure the boot kernel (vmlinuz) is in the + correct position. If you use 'lilo' utility, you should + check /etc/lilo.conf 'image' item specified the path + which is the 'vmlinuz' path, or you will load wrong + (or old) boot kernel image (vmlinuz). + g. chmod 400 /vmlinuz + h. lilo + i. rdev -R /vmlinuz 1 + j. sync + + Note that if the result of "make zImage" is ERROR, then you have to + go back to Linux configuration Setup. Type "make config" in directory + /usr/src/linux or "setup". + + Since system include file, /usr/src/linux/include/linux/interrupt.h, + is modified each time the MOXA driver is installed, kernel rebuilding + is inevitable. And it takes about 10 to 20 minutes depends on the + machine. + + 7. Make utility + # cd /moxa/mxser/utility + # make install + + 8. Make special file + # cd /moxa/mxser/driver + # ./msmknod + + 9. Reboot + + 3.5 Custom configuration + Although this driver already provides you default configuration, you + still can change the device name and major number.The instruction to + change these parameters are shown as below. + + Change Device name + ------------------ + If you'd like to use other device names instead of default naming + convention, all you have to do is to modify the internal code + within the shell script "msmknod". First, you have to open "msmknod" + by vi. Locate each line contains "ttyM" and "cum" and change them + to the device name you desired. "msmknod" creates the device names + you need next time executed. + + Change Major number + ------------------- + If major number 30 and 35 had been occupied, you may have to select + 2 free major numbers for this driver. There are 3 steps to change + major numbers. + + 1. Find free major numbers + In /proc/devices, you may find all the major numbers occupied + in the system. Please select 2 major numbers that are available. + e.g. 40, 45. + 2. Create special files + Run /moxa/mxser/driver/msmknod to create special files with + specified major numbers. + 3. Modify driver with new major number + Run vi to open /moxa/mxser/driver/mxser.c. Locate the line + contains "MXSERMAJOR". Change the content as below. + #define MXSERMAJOR 40 + #define MXSERCUMAJOR 45 + 4. Run # make install in /moxa/mxser/driver. + + 3.6 Verify driver installation + You may refer to /var/log/messages to check the latest status + log reported by this driver whenever it's activated. +----------------------------------------------------------------------------- +4. Utilities + There are 3 utilities contained in this driver. They are msdiag, msmon and + msterm. These 3 utilities are released in form of source code. They should + be compiled into executable file and copied into /usr/bin. + + msdiag - Diagnostic + -------------------- + This utility provides the function to detect what Moxa Smartio multiport + board exists in the system. + + msmon - Port Monitoring + ----------------------- + This utility gives the user a quick view about all the MOXA ports' + activities. One can easily learn each port's total received/transmitted + (Rx/Tx) character count since the time when the monitoring is started. + Rx/Tx throughputs per second are also reported in interval basis (e.g. + the last 5 seconds) and in average basis (since the time the monitoring + is started). You can reset all ports' count by <HOME> key. <+> <-> + (plus/minus) keys to change the displaying time interval. Press <ENTER> + on the port, that cursor stay, to view the port's communication + parameters, signal status, and input/output queue. + + msterm - Terminal Emulation + --------------------------- + This utility provides data sending and receiving ability of all tty ports, + especially for MOXA ports. It is quite useful for testing simple + application, for example, sending AT command to a modem connected to the + port or used as a terminal for login purpose. Note that this is only a + dumb terminal emulation without handling full screen operation. +----------------------------------------------------------------------------- +5. Setserial + + Supported Setserial parameters are listed as below. + + uart set UART type(16450-->disable FIFO, 16550A-->enable FIFO) + close_delay set the amount of time(in 1/100 of a second) that DTR + should be kept low while being closed. + closing_wait set the amount of time(in 1/100 of a second) that the + serial port should wait for data to be drained while + being closed, before the receiver is disable. + spd_hi Use 57.6kb when the application requests 38.4kb. + spd_vhi Use 115.2kb when the application requests 38.4kb. + spd_normal Use 38.4kb when the application requests 38.4kb. + +----------------------------------------------------------------------------- +6. Troubleshooting + + The boot time error messages and solutions are stated as clearly as + possible. If all the possible solutions fail, please contact our technical + support team to get more help. + + Error msg: More than 4 Moxa Smartio family boards found. Fifth board and + after are ignored. + Solution: + To avoid this problem, please unplug fifth and after board, because Moxa + driver supports up to 4 boards. + + Error msg: Request_irq fail, IRQ(?) may be conflict with another device. + Solution: + Other PCI or ISA devices occupy the assigned IRQ. If you are not sure + which device causes the situation,please check /proc/interrupts to find + free IRQ and simply change another free IRQ for Moxa board. + + Error msg: Board #: C1xx Series(CAP=xxx) interrupt number invalid. + Solution: + Each port within the same multiport board shares the same IRQ. Please set + one IRQ (IRQ doesn't equal to zero) for one Moxa board. + + Error msg: No interrupt vector be set for Moxa ISA board(CAP=xxx). + Solution: + Moxa ISA board needs an interrupt vector.Please refer to user's manual + "Hardware Installation" chapter to set interrupt vector. + + Error msg: Couldn't install MOXA Smartio family driver! + Solution: + Load Moxa driver fail, the major number may conflict with other devices. + Please refer to previous section 3.5 to change a free major number for + Moxa driver. + + Error msg: Couldn't install MOXA Smartio family callout driver! + Solution: + Load Moxa callout driver fail, the callout device major number may + conflict with other devices. Please refer to previous section 3.5 to + change a free callout device major number for Moxa driver. +----------------------------------------------------------------------------- diff --git a/Documentation/mtrr.txt b/Documentation/mtrr.txt new file mode 100644 index 000000000000..b78af1c32996 --- /dev/null +++ b/Documentation/mtrr.txt @@ -0,0 +1,286 @@ +MTRR (Memory Type Range Register) control +3 Jun 1999 +Richard Gooch +<rgooch@atnf.csiro.au> + + On Intel P6 family processors (Pentium Pro, Pentium II and later) + the Memory Type Range Registers (MTRRs) may be used to control + processor access to memory ranges. This is most useful when you have + a video (VGA) card on a PCI or AGP bus. Enabling write-combining + allows bus write transfers to be combined into a larger transfer + before bursting over the PCI/AGP bus. This can increase performance + of image write operations 2.5 times or more. + + The Cyrix 6x86, 6x86MX and M II processors have Address Range + Registers (ARRs) which provide a similar functionality to MTRRs. For + these, the ARRs are used to emulate the MTRRs. + + The AMD K6-2 (stepping 8 and above) and K6-3 processors have two + MTRRs. These are supported. The AMD Athlon family provide 8 Intel + style MTRRs. + + The Centaur C6 (WinChip) has 8 MCRs, allowing write-combining. These + are supported. + + The VIA Cyrix III and VIA C3 CPUs offer 8 Intel style MTRRs. + + The CONFIG_MTRR option creates a /proc/mtrr file which may be used + to manipulate your MTRRs. Typically the X server should use + this. This should have a reasonably generic interface so that + similar control registers on other processors can be easily + supported. + + +There are two interfaces to /proc/mtrr: one is an ASCII interface +which allows you to read and write. The other is an ioctl() +interface. The ASCII interface is meant for administration. The +ioctl() interface is meant for C programs (i.e. the X server). The +interfaces are described below, with sample commands and C code. + +=============================================================================== +Reading MTRRs from the shell: + +% cat /proc/mtrr +reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1 +reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1 +=============================================================================== +Creating MTRRs from the C-shell: +# echo "base=0xf8000000 size=0x400000 type=write-combining" >! /proc/mtrr +or if you use bash: +# echo "base=0xf8000000 size=0x400000 type=write-combining" >| /proc/mtrr + +And the result thereof: +% cat /proc/mtrr +reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1 +reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1 +reg02: base=0xf8000000 (3968MB), size= 4MB: write-combining, count=1 + +This is for video RAM at base address 0xf8000000 and size 4 megabytes. To +find out your base address, you need to look at the output of your X +server, which tells you where the linear framebuffer address is. A +typical line that you may get is: + +(--) S3: PCI: 968 rev 0, Linear FB @ 0xf8000000 + +Note that you should only use the value from the X server, as it may +move the framebuffer base address, so the only value you can trust is +that reported by the X server. + +To find out the size of your framebuffer (what, you don't actually +know?), the following line will tell you: + +(--) S3: videoram: 4096k + +That's 4 megabytes, which is 0x400000 bytes (in hexadecimal). +A patch is being written for XFree86 which will make this automatic: +in other words the X server will manipulate /proc/mtrr using the +ioctl() interface, so users won't have to do anything. If you use a +commercial X server, lobby your vendor to add support for MTRRs. +=============================================================================== +Creating overlapping MTRRs: + +%echo "base=0xfb000000 size=0x1000000 type=write-combining" >/proc/mtrr +%echo "base=0xfb000000 size=0x1000 type=uncachable" >/proc/mtrr + +And the results: cat /proc/mtrr +reg00: base=0x00000000 ( 0MB), size= 64MB: write-back, count=1 +reg01: base=0xfb000000 (4016MB), size= 16MB: write-combining, count=1 +reg02: base=0xfb000000 (4016MB), size= 4kB: uncachable, count=1 + +Some cards (especially Voodoo Graphics boards) need this 4 kB area +excluded from the beginning of the region because it is used for +registers. + +NOTE: You can only create type=uncachable region, if the first +region that you created is type=write-combining. +=============================================================================== +Removing MTRRs from the C-shell: +% echo "disable=2" >! /proc/mtrr +or using bash: +% echo "disable=2" >| /proc/mtrr +=============================================================================== +Reading MTRRs from a C program using ioctl()'s: + +/* mtrr-show.c + + Source file for mtrr-show (example program to show MTRRs using ioctl()'s) + + Copyright (C) 1997-1998 Richard Gooch + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + + Richard Gooch may be reached by email at rgooch@atnf.csiro.au + The postal address is: + Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia. +*/ + +/* + This program will use an ioctl() on /proc/mtrr to show the current MTRR + settings. This is an alternative to reading /proc/mtrr. + + + Written by Richard Gooch 17-DEC-1997 + + Last updated by Richard Gooch 2-MAY-1998 + + +*/ +#include <stdio.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <errno.h> +#define MTRR_NEED_STRINGS +#include <asm/mtrr.h> + +#define TRUE 1 +#define FALSE 0 +#define ERRSTRING strerror (errno) + + +int main () +{ + int fd; + struct mtrr_gentry gentry; + + if ( ( fd = open ("/proc/mtrr", O_RDONLY, 0) ) == -1 ) + { + if (errno == ENOENT) + { + fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n", + stderr); + exit (1); + } + fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING); + exit (2); + } + for (gentry.regnum = 0; ioctl (fd, MTRRIOC_GET_ENTRY, &gentry) == 0; + ++gentry.regnum) + { + if (gentry.size < 1) + { + fprintf (stderr, "Register: %u disabled\n", gentry.regnum); + continue; + } + fprintf (stderr, "Register: %u base: 0x%lx size: 0x%lx type: %s\n", + gentry.regnum, gentry.base, gentry.size, + mtrr_strings[gentry.type]); + } + if (errno == EINVAL) exit (0); + fprintf (stderr, "Error doing ioctl(2) on /dev/mtrr\t%s\n", ERRSTRING); + exit (3); +} /* End Function main */ +=============================================================================== +Creating MTRRs from a C programme using ioctl()'s: + +/* mtrr-add.c + + Source file for mtrr-add (example programme to add an MTRRs using ioctl()) + + Copyright (C) 1997-1998 Richard Gooch + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + + Richard Gooch may be reached by email at rgooch@atnf.csiro.au + The postal address is: + Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia. +*/ + +/* + This programme will use an ioctl() on /proc/mtrr to add an entry. The first + available mtrr is used. This is an alternative to writing /proc/mtrr. + + + Written by Richard Gooch 17-DEC-1997 + + Last updated by Richard Gooch 2-MAY-1998 + + +*/ +#include <stdio.h> +#include <string.h> +#include <stdlib.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <errno.h> +#define MTRR_NEED_STRINGS +#include <asm/mtrr.h> + +#define TRUE 1 +#define FALSE 0 +#define ERRSTRING strerror (errno) + + +int main (int argc, char **argv) +{ + int fd; + struct mtrr_sentry sentry; + + if (argc != 4) + { + fprintf (stderr, "Usage:\tmtrr-add base size type\n"); + exit (1); + } + sentry.base = strtoul (argv[1], NULL, 0); + sentry.size = strtoul (argv[2], NULL, 0); + for (sentry.type = 0; sentry.type < MTRR_NUM_TYPES; ++sentry.type) + { + if (strcmp (argv[3], mtrr_strings[sentry.type]) == 0) break; + } + if (sentry.type >= MTRR_NUM_TYPES) + { + fprintf (stderr, "Illegal type: \"%s\"\n", argv[3]); + exit (2); + } + if ( ( fd = open ("/proc/mtrr", O_WRONLY, 0) ) == -1 ) + { + if (errno == ENOENT) + { + fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n", + stderr); + exit (3); + } + fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING); + exit (4); + } + if (ioctl (fd, MTRRIOC_ADD_ENTRY, &sentry) == -1) + { + fprintf (stderr, "Error doing ioctl(2) on /dev/mtrr\t%s\n", ERRSTRING); + exit (5); + } + fprintf (stderr, "Sleeping for 5 seconds so you can see the new entry\n"); + sleep (5); + close (fd); + fputs ("I've just closed /proc/mtrr so now the new entry should be gone\n", + stderr); +} /* End Function main */ +=============================================================================== diff --git a/Documentation/nbd.txt b/Documentation/nbd.txt new file mode 100644 index 000000000000..aeb93ffe6416 --- /dev/null +++ b/Documentation/nbd.txt @@ -0,0 +1,47 @@ + Network Block Device (TCP version) + + What is it: With this compiled in the kernel (or as a module), Linux + can use a remote server as one of its block devices. So every time + the client computer wants to read, e.g., /dev/nb0, it sends a + request over TCP to the server, which will reply with the data read. + This can be used for stations with low disk space (or even diskless - + if you boot from floppy) to borrow disk space from another computer. + Unlike NFS, it is possible to put any filesystem on it, etc. It should + even be possible to use NBD as a root filesystem (I've never tried), + but it requires a user-level program to be in the initrd to start. + It also allows you to run block-device in user land (making server + and client physically the same computer, communicating using loopback). + + Current state: It currently works. Network block device is stable. + I originally thought that it was impossible to swap over TCP. It + turned out not to be true - swapping over TCP now works and seems + to be deadlock-free, but it requires heavy patches into Linux's + network layer. + + For more information, or to download the nbd-client and nbd-server + tools, go to http://nbd.sf.net/. + + Howto: To setup nbd, you can simply do the following: + + First, serve a device or file from a remote server: + + nbd-server <port-number> <device-or-file-to-serve-to-client> + + e.g., + root@server1 # nbd-server 1234 /dev/sdb1 + + (serves sdb1 partition on TCP port 1234) + + Then, on the local (client) system: + + nbd-client <server-name-or-IP> <server-port-number> /dev/nb[0-n] + + e.g., + root@client1 # nbd-client server1 1234 /dev/nb0 + + (creates the nb0 device on client1) + + The nbd kernel module need only be installed on the client + system, as the nbd-server is completely in userspace. In fact, + the nbd-server has been successfully ported to other operating + systems, including Windows. diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX new file mode 100644 index 000000000000..834993d26730 --- /dev/null +++ b/Documentation/networking/00-INDEX @@ -0,0 +1,127 @@ +00-INDEX + - this file +3c505.txt + - information on the 3Com EtherLink Plus (3c505) driver. +6pack.txt + - info on the 6pack protocol, an alternative to KISS for AX.25 +Configurable + - info on some of the configurable network parameters +DLINK.txt + - info on the D-Link DE-600/DE-620 parallel port pocket adapters +PLIP.txt + - PLIP: The Parallel Line Internet Protocol device driver +README.sb1000 + - info on General Instrument/NextLevel SURFboard1000 cable modem. +alias.txt + - info on using alias network devices +arcnet-hardware.txt + - tons of info on ARCnet, hubs, jumper settings for ARCnet cards, etc. +arcnet.txt + - info on the using the ARCnet driver itself. +atm.txt + - info on where to get ATM programs and support for Linux. +ax25.txt + - info on using AX.25 and NET/ROM code for Linux +baycom.txt + - info on the driver for Baycom style amateur radio modems +bridge.txt + - where to get user space programs for ethernet bridging with Linux. +comx.txt + - info on drivers for COMX line of synchronous serial adapters. +cops.txt + - info on the COPS LocalTalk Linux driver +cs89x0.txt + - the Crystal LAN (CS8900/20-based) Ethernet ISA adapter driver +de4x5.txt + - the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver +decnet.txt + - info on using the DECnet networking layer in Linux. +depca.txt + - the Digital DEPCA/EtherWORKS DE1?? and DE2?? LANCE Ethernet driver +dgrs.txt + - the Digi International RightSwitch SE-X Ethernet driver +dmfe.txt + - info on the Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver. +e100.txt + - info on Intel's EtherExpress PRO/100 line of 10/100 boards +e1000.txt + - info on Intel's E1000 line of gigabit ethernet boards +eql.txt + - serial IP load balancing +ethertap.txt + - the Ethertap user space packet reception and transmission driver +ewrk3.txt + - the Digital EtherWORKS 3 DE203/4/5 Ethernet driver +filter.txt + - Linux Socket Filtering +fore200e.txt + - FORE Systems PCA-200E/SBA-200E ATM NIC driver info. +framerelay.txt + - info on using Frame Relay/Data Link Connection Identifier (DLCI). +ip-sysctl.txt + - /proc/sys/net/ipv4/* variables +ip_dynaddr.txt + - IP dynamic address hack e.g. for auto-dialup links +ipddp.txt + - AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation +iphase.txt + - Interphase PCI ATM (i)Chip IA Linux driver info. +irda.txt + - where to get IrDA (infrared) utilities and info for Linux. +lapb-module.txt + - programming information of the LAPB module. +ltpc.txt + - the Apple or Farallon LocalTalk PC card driver +multicast.txt + - Behaviour of cards under Multicast +ncsa-telnet + - notes on how NCSA telnet (DOS) breaks with MTU discovery enabled. +net-modules.txt + - info and "insmod" parameters for all network driver modules. +netdevices.txt + - info on network device driver functions exported to the kernel. +olympic.txt + - IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info. +policy-routing.txt + - IP policy-based routing +pt.txt + - the Gracilis Packetwin AX.25 device driver +ray_cs.txt + - Raylink Wireless LAN card driver info. +routing.txt + - the new routing mechanism +shaper.txt + - info on the module that can shape/limit transmitted traffic. +sis900.txt + - SiS 900/7016 Fast Ethernet device driver info. +sk98lin.txt + - Marvell Yukon Chipset / SysKonnect SK-98xx compliant Gigabit + Ethernet Adapter family driver info +skfp.txt + - SysKonnect FDDI (SK-5xxx, Compaq Netelligent) driver info. +smc9.txt + - the driver for SMC's 9000 series of Ethernet cards +smctr.txt + - SMC TokenCard TokenRing Linux driver info. +tcp.txt + - short blurb on how TCP output takes place. +tlan.txt + - ThunderLAN (Compaq Netelligent 10/100, Olicom OC-2xxx) driver info. +tms380tr.txt + - SysKonnect Token Ring ISA/PCI adapter driver info. +tuntap.txt + - TUN/TAP device driver, allowing user space Rx/Tx of packets. +vortex.txt + - info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards. +wan-router.txt + - Wan router documentation +wanpipe.txt + - WANPIPE(tm) Multiprotocol WAN Driver for Linux WAN Router +wavelan.txt + - AT&T GIS (nee NCR) WaveLAN card: An Ethernet-like radio transceiver +x25.txt + - general info on X.25 development. +x25-iface.txt + - description of the X.25 Packet Layer to LAPB device interface. +z8530drv.txt + - info about Linux driver for Z8530 based HDLC cards for AX.25 diff --git a/Documentation/networking/3c359.txt b/Documentation/networking/3c359.txt new file mode 100644 index 000000000000..4af8071a6d18 --- /dev/null +++ b/Documentation/networking/3c359.txt @@ -0,0 +1,58 @@ + +3COM PCI TOKEN LINK VELOCITY XL TOKEN RING CARDS README + +Release 0.9.0 - Release + Jul 17th 2000 Mike Phillips + + 1.2.0 - Final + Feb 17th 2002 Mike Phillips + Updated for submission to the 2.4.x kernel. + +Thanks: + Terry Murphy from 3Com for tech docs and support, + Adam D. Ligas for testing the driver. + +Note: + This driver will NOT work with the 3C339 Token Ring cards, you need +to use the tms380 driver instead. + +Options: + +The driver accepts three options: ringspeed, pkt_buf_sz and message_level. + +These options can be specified differently for each card found. + +ringspeed: Has one of three settings 0 (default), 4 or 16. 0 will +make the card autosense the ringspeed and join at the appropriate speed, +this will be the default option for most people. 4 or 16 allow you to +explicitly force the card to operate at a certain speed. The card will fail +if you try to insert it at the wrong speed. (Although some hubs will allow +this so be *very* careful). The main purpose for explicitly setting the ring +speed is for when the card is first on the ring. In autosense mode, if the card +cannot detect any active monitors on the ring it will open at the same speed as +its last opening. This can be hazardous if this speed does not match the speed +you want the ring to operate at. + +pkt_buf_sz: This is this initial receive buffer allocation size. This will +default to 4096 if no value is entered. You may increase performance of the +driver by setting this to a value larger than the network packet size, although +the driver now re-sizes buffers based on MTU settings as well. + +message_level: Controls level of messages created by the driver. Defaults to 0: +which only displays start-up and critical messages. Presently any non-zero +value will display all soft messages as well. NB This does not turn +debugging messages on, that must be done by modified the source code. + +Variable MTU size: + +The driver can handle a MTU size upto either 4500 or 18000 depending upon +ring speed. The driver also changes the size of the receive buffers as part +of the mtu re-sizing, so if you set mtu = 18000, you will need to be able +to allocate 16 * (sk_buff with 18000 buffer size) call it 18500 bytes per ring +position = 296,000 bytes of memory space, plus of course anything +necessary for the tx sk_buff's. Remember this is per card, so if you are +building routers, gateway's etc, you could start to use a lot of memory +real fast. + +2/17/02 Mike Phillips + diff --git a/Documentation/networking/3c505.txt b/Documentation/networking/3c505.txt new file mode 100644 index 000000000000..b9d5b7230118 --- /dev/null +++ b/Documentation/networking/3c505.txt @@ -0,0 +1,46 @@ +The 3Com Etherlink Plus (3c505) driver. + +This driver now uses DMA. There is currently no support for PIO operation. +The default DMA channel is 6; this is _not_ autoprobed, so you must +make sure you configure it correctly. If loading the driver as a +module, you can do this with "modprobe 3c505 dma=n". If the driver is +linked statically into the kernel, you must either use an "ether=" +statement on the command line, or change the definition of ELP_DMA in 3c505.h. + +The driver will warn you if it has to fall back on the compiled in +default DMA channel. + +If no base address is given at boot time, the driver will autoprobe +ports 0x300, 0x280 and 0x310 (in that order). If no IRQ is given, the driver +will try to probe for it. + +The driver can be used as a loadable module. See net-modules.txt for details +of the parameters it can take. + +Theoretically, one instance of the driver can now run multiple cards, +in the standard way (when loading a module, say "modprobe 3c505 +io=0x300,0x340 irq=10,11 dma=6,7" or whatever). I have not tested +this, though. + +The driver may now support revision 2 hardware; the dependency on +being able to read the host control register has been removed. This +is also untested, since I don't have a suitable card. + +Known problems: + I still see "DMA upload timed out" messages from time to time. These +seem to be fairly non-fatal though. + The card is old and slow. + +To do: + Improve probe/setup code + Test multicast and promiscuous operation + +Authors: + The driver is mainly written by Craig Southeren, email + <craigs@ineluki.apana.org.au>. + Parts of the driver (adapting the driver to 1.1.4+ kernels, + IRQ/address detection, some changes) and this README by + Juha Laiho <jlaiho@ichaos.nullnet.fi>. + DMA mode, more fixes, etc, by Philip Blundell <pjb27@cam.ac.uk> + Multicard support, Software configurable DMA, etc., by + Christopher Collins <ccollins@pcug.org.au> diff --git a/Documentation/networking/3c509.txt b/Documentation/networking/3c509.txt new file mode 100644 index 000000000000..867a99f88c68 --- /dev/null +++ b/Documentation/networking/3c509.txt @@ -0,0 +1,210 @@ +Linux and the 3Com EtherLink III Series Ethercards (driver v1.18c and higher) +---------------------------------------------------------------------------- + +This file contains the instructions and caveats for v1.18c and higher versions +of the 3c509 driver. You should not use the driver without reading this file. + +release 1.0 +28 February 2002 +Current maintainer (corrections to): + David Ruggiero <jdr@farfalle.com> + +---------------------------------------------------------------------------- + +(0) Introduction + +The following are notes and information on using the 3Com EtherLink III series +ethercards in Linux. These cards are commonly known by the most widely-used +card's 3Com model number, 3c509. They are all 10mb/s ISA-bus cards and shouldn't +be (but sometimes are) confused with the similarly-numbered PCI-bus "3c905" +(aka "Vortex" or "Boomerang") series. Kernel support for the 3c509 family is +provided by the module 3c509.c, which has code to support all of the following +models: + + 3c509 (original ISA card) + 3c509B (later revision of the ISA card; supports full-duplex) + 3c589 (PCMCIA) + 3c589B (later revision of the 3c589; supports full-duplex) + 3c529 (MCA) + 3c579 (EISA) + +Large portions of this documentation were heavily borrowed from the guide +written the original author of the 3c509 driver, Donald Becker. The master +copy of that document, which contains notes on older versions of the driver, +currently resides on Scyld web server: http://www.scyld.com/network/3c509.html. + + +(1) Special Driver Features + +Overriding card settings + +The driver allows boot- or load-time overriding of the card's detected IOADDR, +IRQ, and transceiver settings, although this capability shouldn't generally be +needed except to enable full-duplex mode (see below). An example of the syntax +for LILO parameters for doing this: + + ether=10,0x310,3,0x3c509,eth0 + +This configures the first found 3c509 card for IRQ 10, base I/O 0x310, and +transceiver type 3 (10base2). The flag "0x3c509" must be set to avoid conflicts +with other card types when overriding the I/O address. When the driver is +loaded as a module, only the IRQ and transceiver setting may be overridden. +For example, setting two cards to 10base2/IRQ10 and AUI/IRQ11 is done by using +the xcvr and irq module options: + + options 3c509 xcvr=3,1 irq=10,11 + + +(2) Full-duplex mode + +The v1.18c driver added support for the 3c509B's full-duplex capabilities. +In order to enable and successfully use full-duplex mode, three conditions +must be met: + +(a) You must have a Etherlink III card model whose hardware supports full- +duplex operations. Currently, the only members of the 3c509 family that are +positively known to support full-duplex are the 3c509B (ISA bus) and 3c589B +(PCMCIA) cards. Cards without the "B" model designation do *not* support +full-duplex mode; these include the original 3c509 (no "B"), the original +3c589, the 3c529 (MCA bus), and the 3c579 (EISA bus). + +(b) You must be using your card's 10baseT transceiver (i.e., the RJ-45 +connector), not its AUI (thick-net) or 10base2 (thin-net/coax) interfaces. +AUI and 10base2 network cabling is physically incapable of full-duplex +operation. + +(c) Most importantly, your 3c509B must be connected to a link partner that is +itself full-duplex capable. This is almost certainly one of two things: a full- +duplex-capable Ethernet switch (*not* a hub), or a full-duplex-capable NIC on +another system that's connected directly to the 3c509B via a crossover cable. + +/////Extremely important caution concerning full-duplex mode///// +Understand that the 3c509B's hardware's full-duplex support is much more +limited than that provide by more modern network interface cards. Although +at the physical layer of the network it fully supports full-duplex operation, +the card was designed before the current Ethernet auto-negotiation (N-way) +spec was written. This means that the 3c509B family ***cannot and will not +auto-negotiate a full-duplex connection with its link partner under any +circumstances, no matter how it is initialized***. If the full-duplex mode +of the 3c509B is enabled, its link partner will very likely need to be +independently _forced_ into full-duplex mode as well; otherwise various nasty +failures will occur - at the very least, you'll see massive numbers of packet +collisions. This is one of very rare circumstances where disabling auto- +negotiation and forcing the duplex mode of a network interface card or switch +would ever be necessary or desirable. + + +(3) Available Transceiver Types + +For versions of the driver v1.18c and above, the available transceiver types are: + +0 transceiver type from EEPROM config (normally 10baseT); force half-duplex +1 AUI (thick-net / DB15 connector) +2 (undefined) +3 10base2 (thin-net == coax / BNC connector) +4 10baseT (RJ-45 connector); force half-duplex mode +8 transceiver type and duplex mode taken from card's EEPROM config settings +12 10baseT (RJ-45 connector); force full-duplex mode + +Prior to driver version 1.18c, only transceiver codes 0-4 were supported. Note +that the new transceiver codes 8 and 12 are the *only* ones that will enable +full-duplex mode, no matter what the card's detected EEPROM settings might be. +This insured that merely upgrading the driver from an earlier version would +never automatically enable full-duplex mode in an existing installation; +it must always be explicitly enabled via one of these code in order to be +activated. + + +(4a) Interpretation of error messages and common problems + +Error Messages + +eth0: Infinite loop in interrupt, status 2011. +These are "mostly harmless" message indicating that the driver had too much +work during that interrupt cycle. With a status of 0x2011 you are receiving +packets faster than they can be removed from the card. This should be rare +or impossible in normal operation. Possible causes of this error report are: + + - a "green" mode enabled that slows the processor down when there is no + keyboard activitiy. + + - some other device or device driver hogging the bus or disabling interrupts. + Check /proc/interrupts for excessive interrupt counts. The timer tick + interrupt should always be incrementing faster than the others. + +No received packets +If a 3c509, 3c562 or 3c589 can successfully transmit packets, but never +receives packets (as reported by /proc/net/dev or 'ifconfig') you likely +have an interrupt line problem. Check /proc/interrupts to verify that the +card is actually generating interrupts. If the interrupt count is not +increasing you likely have a physical conflict with two devices trying to +use the same ISA IRQ line. The common conflict is with a sound card on IRQ10 +or IRQ5, and the easiest solution is to move the 3c509 to a different +interrupt line. If the device is receiving packets but 'ping' doesn't work, +you have a routing problem. + +Tx Carrier Errors Reported in /proc/net/dev +If an EtherLink III appears to transmit packets, but the "Tx carrier errors" +field in /proc/net/dev increments as quickly as the Tx packet count, you +likely have an unterminated network or the incorrect media transceiver selected. + +3c509B card is not detected on machines with an ISA PnP BIOS. +While the updated driver works with most PnP BIOS programs, it does not work +with all. This can be fixed by disabling PnP support using the 3Com-supplied +setup program. + +3c509 card is not detected on overclocked machines +Increase the delay time in id_read_eeprom() from the current value, 500, +to an absurdly high value, such as 5000. + + +(4b) Decoding Status and Error Messages + +The bits in the main status register are: + +value description +0x01 Interrupt latch +0x02 Tx overrun, or Rx underrun +0x04 Tx complete +0x08 Tx FIFO room available +0x10 A complete Rx packet has arrived +0x20 A Rx packet has started to arrive +0x40 The driver has requested an interrupt +0x80 Statistics counter nearly full + +The bits in the transmit (Tx) status word are: + +value description +0x02 Out-of-window collision. +0x04 Status stack overflow (normally impossible). +0x08 16 collisions. +0x10 Tx underrun (not enough PCI bus bandwidth). +0x20 Tx jabber. +0x40 Tx interrupt requested. +0x80 Status is valid (this should always be set). + + +When a transmit error occurs the driver produces a status message such as + + eth0: Transmit error, Tx status register 82 + +The two values typically seen here are: + +0x82 +Out of window collision. This typically occurs when some other Ethernet +host is incorrectly set to full duplex on a half duplex network. + +0x88 +16 collisions. This typically occurs when the network is exceptionally busy +or when another host doesn't correctly back off after a collision. If this +error is mixed with 0x82 errors it is the result of a host incorrectly set +to full duplex (see above). + +Both of these errors are the result of network problems that should be +corrected. They do not represent driver malfunction. + + +(5) Revision history (this file) + +28Feb02 v1.0 DR New; major portions based on Becker original 3c509 docs + diff --git a/Documentation/networking/6pack.txt b/Documentation/networking/6pack.txt new file mode 100644 index 000000000000..48ed2b711bd2 --- /dev/null +++ b/Documentation/networking/6pack.txt @@ -0,0 +1,175 @@ +This is the 6pack-mini-HOWTO, written by + +Andreas Könsgen DG3KQ +Internet: ajk@iehk.rwth-aachen.de +AMPR-net: dg3kq@db0pra.ampr.org +AX.25: dg3kq@db0ach.#nrw.deu.eu + +Last update: April 7, 1998 + +1. What is 6pack, and what are the advantages to KISS? + +6pack is a transmission protocol for data exchange between the PC and +the TNC over a serial line. It can be used as an alternative to KISS. + +6pack has two major advantages: +- The PC is given full control over the radio + channel. Special control data is exchanged between the PC and the TNC so + that the PC knows at any time if the TNC is receiving data, if a TNC + buffer underrun or overrun has occurred, if the PTT is + set and so on. This control data is processed at a higher priority than + normal data, so a data stream can be interrupted at any time to issue an + important event. This helps to improve the channel access and timing + algorithms as everything is computed in the PC. It would even be possible + to experiment with something completely different from the known CSMA and + DAMA channel access methods. + This kind of real-time control is especially important to supply several + TNCs that are connected between each other and the PC by a daisy chain + (however, this feature is not supported yet by the Linux 6pack driver). + +- Each packet transferred over the serial line is supplied with a checksum, + so it is easy to detect errors due to problems on the serial line. + Received packets that are corrupt are not passed on to the AX.25 layer. + Damaged packets that the TNC has received from the PC are not transmitted. + +More details about 6pack are described in the file 6pack.ps that is located +in the doc directory of the AX.25 utilities package. + +2. Who has developed the 6pack protocol? + +The 6pack protocol has been developed by Ekki Plicht DF4OR, Henning Rech +DF9IC and Gunter Jost DK7WJ. A driver for 6pack, written by Gunter Jost and +Matthias Welwarsky DG2FEF, comes along with the PC version of FlexNet. +They have also written a firmware for TNCs to perform the 6pack +protocol (see section 4 below). + +3. Where can I get the latest version of 6pack for LinuX? + +At the moment, the 6pack stuff can obtained via anonymous ftp from +db0bm.automation.fh-aachen.de. In the directory /incoming/dg3kq, +there is a file named 6pack.tgz. + +4. Preparing the TNC for 6pack operation + +To be able to use 6pack, a special firmware for the TNC is needed. The EPROM +of a newly bought TNC does not contain 6pack, so you will have to +program an EPROM yourself. The image file for 6pack EPROMs should be +available on any packet radio box where PC/FlexNet can be found. The name of +the file is 6pack.bin. This file is copyrighted and maintained by the FlexNet +team. It can be used under the terms of the license that comes along +with PC/FlexNet. Please do not ask me about the internals of this file as I +don't know anything about it. I used a textual description of the 6pack +protocol to program the Linux driver. + +TNCs contain a 64kByte EPROM, the lower half of which is used for +the firmware/KISS. The upper half is either empty or is sometimes +programmed with software called TAPR. In the latter case, the TNC +is supplied with a DIP switch so you can easily change between the +two systems. When programming a new EPROM, one of the systems is replaced +by 6pack. It is useful to replace TAPR, as this software is rarely used +nowadays. If your TNC is not equipped with the switch mentioned above, you +can build in one yourself that switches over the highest address pin +of the EPROM between HIGH and LOW level. After having inserted the new EPROM +and switched to 6pack, apply power to the TNC for a first test. The connect +and the status LED are lit for about a second if the firmware initialises +the TNC correctly. + +5. Building and installing the 6pack driver + +The driver has been tested with kernel version 2.1.90. Use with older +kernels may lead to a compilation error because the interface to a kernel +function has been changed in the 2.1.8x kernels. + +How to turn on 6pack support: + +- In the linux kernel configuration program, select the code maturity level + options menu and turn on the prompting for development drivers. + +- Select the amateur radio support menu and turn on the serial port 6pack + driver. + +- Compile and install the kernel and the modules. + +To use the driver, the kissattach program delivered with the AX.25 utilities +has to be modified. + +- Do a cd to the directory that holds the kissattach sources. Edit the + kissattach.c file. At the top, insert the following lines: + + #ifndef N_6PACK + #define N_6PACK (N_AX25+1) + #endif + + Then find the line + + int disc = N_AX25; + + and replace N_AX25 by N_6PACK. + +- Recompile kissattach. Rename it to spattach to avoid confusions. + +Installing the driver: + +- Do an insmod 6pack. Look at your /var/log/messages file to check if the + module has printed its initialization message. + +- Do a spattach as you would launch kissattach when starting a KISS port. + Check if the kernel prints the message '6pack: TNC found'. + +- From here, everything should work as if you were setting up a KISS port. + The only difference is that the network device that represents + the 6pack port is called sp instead of sl or ax. So, sp0 would be the + first 6pack port. + +Although the driver has been tested on various platforms, I still declare it +ALPHA. BE CAREFUL! Sync your disks before insmoding the 6pack module +and spattaching. Watch out if your computer behaves strangely. Read section +6 of this file about known problems. + +Note that the connect and status LEDs of the TNC are controlled in a +different way than they are when the TNC is used with PC/FlexNet. When using +FlexNet, the connect LED is on if there is a connection; the status LED is +on if there is data in the buffer of the PC's AX.25 engine that has to be +transmitted. Under Linux, the 6pack layer is beyond the AX.25 layer, +so the 6pack driver doesn't know anything about connects or data that +has not yet been transmitted. Therefore the LEDs are controlled +as they are in KISS mode: The connect LED is turned on if data is transferred +from the PC to the TNC over the serial line, the status LED if data is +sent to the PC. + +6. Known problems + +When testing the driver with 2.0.3x kernels and +operating with data rates on the radio channel of 9600 Baud or higher, +the driver may, on certain systems, sometimes print the message '6pack: +bad checksum', which is due to data loss if the other station sends two +or more subsequent packets. I have been told that this is due to a problem +with the serial driver of 2.0.3x kernels. I don't know yet if the problem +still exists with 2.1.x kernels, as I have heard that the serial driver +code has been changed with 2.1.x. + +When shutting down the sp interface with ifconfig, the kernel crashes if +there is still an AX.25 connection left over which an IP connection was +running, even if that IP connection is already closed. The problem does not +occur when there is a bare AX.25 connection still running. I don't know if +this is a problem of the 6pack driver or something else in the kernel. + +The driver has been tested as a module, not yet as a kernel-builtin driver. + +The 6pack protocol supports daisy-chaining of TNCs in a token ring, which is +connected to one serial port of the PC. This feature is not implemented +and at least at the moment I won't be able to do it because I do not have +the opportunity to build a TNC daisy-chain and test it. + +Some of the comments in the source code are inaccurate. They are left from +the SLIP/KISS driver, from which the 6pack driver has been derived. +I haven't modified or removed them yet -- sorry! The code itself needs +some cleaning and optimizing. This will be done in a later release. + +If you encounter a bug or if you have a question or suggestion concerning the +driver, feel free to mail me, using the addresses given at the beginning of +this file. + +Have fun! + +Andreas diff --git a/Documentation/networking/Configurable b/Documentation/networking/Configurable new file mode 100644 index 000000000000..69c0dd466ead --- /dev/null +++ b/Documentation/networking/Configurable @@ -0,0 +1,34 @@ + +There are a few network parameters that can be tuned to better match +the kernel to your system hardware and intended usage. The defaults +are usually a good choice for 99% of the people 99% of the time, but +you should be aware they do exist and can be changed. + +The current list of parameters can be found in the files: + + linux/net/TUNABLE + Documentation/networking/ip-sysctl.txt + +Some of these are accessible via the sysctl interface, and many more are +scheduled to be added in this way. For example, some parameters related +to Address Resolution Protocol (ARP) are very easily viewed and altered. + + # cat /proc/sys/net/ipv4/arp_timeout + 6000 + # echo 7000 > /proc/sys/net/ipv4/arp_timeout + # cat /proc/sys/net/ipv4/arp_timeout + 7000 + +Others are already accessible via the related user space programs. +For example, MAX_WINDOW has a default of 32 k which is a good choice for +modern hardware, but if you have a slow (8 bit) Ethernet card and/or a slow +machine, then this will be far too big for the card to keep up with fast +machines transmitting on the same net, resulting in overruns and receive errors. +A value of about 4 k would be more appropriate, which can be set via: + + # route add -net 192.168.3.0 window 4096 + +The remainder of these can only be presently changed by altering a #define +in the related header file. This means an edit and recompile cycle. + + Paul Gortmaker 06/96 diff --git a/Documentation/networking/DLINK.txt b/Documentation/networking/DLINK.txt new file mode 100644 index 000000000000..083d24752b83 --- /dev/null +++ b/Documentation/networking/DLINK.txt @@ -0,0 +1,204 @@ +Released 1994-06-13 + + + CONTENTS: + + 1. Introduction. + 2. License. + 3. Files in this release. + 4. Installation. + 5. Problems and tuning. + 6. Using the drivers with earlier releases. + 7. Acknowledgments. + + + 1. INTRODUCTION. + + This is a set of Ethernet drivers for the D-Link DE-600/DE-620 + pocket adapters, for the parallel port on a Linux based machine. + Some adapter "clones" will also work. Xircom is _not_ a clone... + These drivers _can_ be used as loadable modules, + and were developed for use on Linux 1.1.13 and above. + For use on Linux 1.0.X, or earlier releases, see below. + + I have used these drivers for NFS, ftp, telnet and X-clients on + remote machines. Transmissions with ftp seems to work as + good as can be expected (i.e. > 80k bytes/sec) from a + parallel port...:-) Receive speeds will be about 60-80% of this. + Depending on your machine, somewhat higher speeds can be achieved. + + All comments/fixes to Bjorn Ekwall (bj0rn@blox.se). + + + 2. LICENSE. + + This program is free software; you can redistribute it + and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2, or (at your option) any later version. + + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR + PURPOSE. See the GNU General Public License for more + details. + + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 675 Mass Ave, Cambridge, MA + 02139, USA. + + + 3. FILES IN THIS RELEASE. + + README.DLINK This file. + de600.c The Source (may it be with You :-) for the DE-600 + de620.c ditto for the DE-620 + de620.h Macros for de620.c + + If you are upgrading from the d-link tar release, there will + also be a "dlink-patches" file that will patch Linux 1.1.18: + linux/drivers/net/Makefile + linux/drivers/net/CONFIG + linux/drivers/net/MODULES + linux/drivers/net/Space.c + linux/config.in + Apply the patch by: + "cd /usr/src; patch -p0 < linux/drivers/net/dlink-patches" + The old source, "linux/drivers/net/d_link.c", can be removed. + + + 4. INSTALLATION. + + o Get the latest net binaries, according to current net.wisdom. + + o Read the NET-2 and Ethernet HOWTOs and modify your setup. + + o If your parallel port has a strange address or irq, + modify "linux/drivers/net/CONFIG" accordingly, or adjust + the parameters in the "tuning" section in the sources. + + If you are going to use the drivers as loadable modules, do _not_ + enable them while doing "make config", but instead make sure that + the drivers are included in "linux/drivers/net/MODULES". + + If you are _not_ going to use the driver(s) as loadable modules, + but instead have them included in the kernel, remember to enable + the drivers while doing "make config". + + o To include networking and DE600/DE620 support in your kernel: + # cd /linux + (as modules:) + # make config (answer yes on CONFIG_NET and CONFIG_INET) + (else included in the kernel:) + # make config (answer yes on CONFIG _NET, _INET and _DE600 or _DE620) + # make clean + # make zImage (or whatever magic you usually do) + + o I use lilo to boot multiple kernels, so that I at least + can have one working kernel :-). If you do too, append + these lines to /etc/lilo/config: + + image = /linux/zImage + label = newlinux + root = /dev/hda2 (or whatever YOU have...) + + # /etc/lilo/install + + o Do "sync" and reboot the new kernel with a D-Link + DE-600/DE-620 pocket adapter connected. + + o The adapter can be configured with ifconfig eth? + where the actual number is decided by the kernel + when the drivers are initialized. + + + 5. "PROBLEMS" AND TUNING, + + o If you see error messages from the driver, and if the traffic + stops on the adapter, try to do "ifconfig" and "route" once + more, just as in "rc.inet1". This should take care of most + problems, including effects from power loss, or adapters that + aren't connected to the printer port in some way or another. + You can somewhat change the behaviour by enabling/disabling + the macro SHUTDOWN_WHEN_LOST in the "tuning" section. + For the DE-600 there is another macro, CHECK_LOST_DE600, + that you might want to read about in the "tuning" section. + + o Some machines have trouble handling the parallel port and + the adapter at high speed. If you experience problems: + + DE-600: + - The adapter is not recognized at boot, i.e. an Ethernet + address of 00:80:c8:... is not shown, try to add another + "; SLOW_DOWN_IO" + at DE600_SLOW_DOWN in the "tuning" section. As a last resort, + uncomment: "#define REALLY_SLOW_IO" (see <asm/io.h> for hints). + + - You experience "timeout" messages: first try to add another + "; SLOW_DOWN_IO" + at DE600_SLOW_DOWN in the "tuning" section, _then_ try to + increase the value (original value: 5) at + "if (tickssofar < 5)" near line 422. + + DE-620: + - Your parallel port might be "sluggish". To cater for + this, there are the macros LOWSPEED and READ_DELAY/WRITE_DELAY + in the "tuning" section. Your first step should be to enable + LOWSPEED, and after that you can "tune" the XXX_DELAY values. + + o If the adapter _is_ recognized at boot but you get messages + about "Network Unreachable", then the problem is probably + _not_ with the driver. Check your net configuration instead + (ifconfig and route) in "rc.inet1". + + o There is some rudimentary support for debugging, look at + the source. Use "-DDE600_DEBUG=3" or "-DDE620_DEBUG=3" + when compiling, or include it in "linux/drivers/net/CONFIG". + IF YOU HAVE PROBLEMS YOU CAN'T SOLVE: PLEASE COMPILE THE DRIVER + WITH DEBUGGING ENABLED, AND SEND ME THE RESULTING OUTPUT! + + + 6. USING THE DRIVERS WITH EARLIER RELEASES. + + The later 1.1.X releases of the Linux kernel include some + changes in the networking layer (a.k.a. NET3). This affects + these drivers in a few places. The hints that follow are + _not_ tested by me, since I don't have the disk space to keep + all releases on-line. + Known needed changes to date: + - release patchfile: some patches will fail, but they should + be easy to apply "by hand", since they are trivial. + (Space.c: d_link_init() is now called de600_probe()) + - de600.c: change "mark_bh(NET_BH)" to "mark_bh(INET_BH)". + - de620.c: (maybe) change the code around "netif_rx(skb);" to be + similar to the code around "dev_rint(...)" in de600.c + + + 7. ACKNOWLEDGMENTS. + + These drivers wouldn't have been done without the base + (and support) from Ross Biro <bir7@leland.stanford.edu>, + and D-Link Systems Inc. The driver relies upon GPL-ed + source from D-Link Systems Inc. and from Russel Nelson at + Crynwr Software <nelson@crynwr.com>. + + Additional input also from: + Donald Becker <becker@super.org>, Alan Cox <A.Cox@swansea.ac.uk> + and Fred N. van Kempen <waltje@uWalt.NL.Mugnet.ORG> + + DE-600 alpha release primary victim^H^H^H^H^H^Htester: + - Erik Proper <erikp@cs.kun.nl>. + Good input also from several users, most notably + - Mark Burton <markb@ordern.demon.co.uk>. + + DE-620 alpha release victims^H^H^H^H^H^H^Htesters: + - J. Joshua Kopper <kopper@rtsg.mot.com> + - Olav Kvittem <Olav.Kvittem@uninett.no> + - Germano Caronni <caronni@nessie.cs.id.ethz.ch> + - Jeremy Fitzhardinge <jeremy@suite.sw.oz.au> + + + Happy hacking! + + Bjorn Ekwall == bj0rn@blox.se diff --git a/Documentation/networking/NAPI_HOWTO.txt b/Documentation/networking/NAPI_HOWTO.txt new file mode 100644 index 000000000000..54376e8249c1 --- /dev/null +++ b/Documentation/networking/NAPI_HOWTO.txt @@ -0,0 +1,766 @@ +HISTORY: +February 16/2002 -- revision 0.2.1: +COR typo corrected +February 10/2002 -- revision 0.2: +some spell checking ;-> +January 12/2002 -- revision 0.1 +This is still work in progress so may change. +To keep up to date please watch this space. + +Introduction to NAPI +==================== + +NAPI is a proven (www.cyberus.ca/~hadi/usenix-paper.tgz) technique +to improve network performance on Linux. For more details please +read that paper. +NAPI provides a "inherent mitigation" which is bound by system capacity +as can be seen from the following data collected by Robert on Gigabit +ethernet (e1000): + + Psize Ipps Tput Rxint Txint Done Ndone + --------------------------------------------------------------- + 60 890000 409362 17 27622 7 6823 + 128 758150 464364 21 9301 10 7738 + 256 445632 774646 42 15507 21 12906 + 512 232666 994445 241292 19147 241192 1062 + 1024 119061 1000003 872519 19258 872511 0 + 1440 85193 1000003 946576 19505 946569 0 + + +Legend: +"Ipps" stands for input packets per second. +"Tput" == packets out of total 1M that made it out. +"txint" == transmit completion interrupts seen +"Done" == The number of times that the poll() managed to pull all +packets out of the rx ring. Note from this that the lower the +load the more we could clean up the rxring +"Ndone" == is the converse of "Done". Note again, that the higher +the load the more times we couldnt clean up the rxring. + +Observe that: +when the NIC receives 890Kpackets/sec only 17 rx interrupts are generated. +The system cant handle the processing at 1 interrupt/packet at that load level. +At lower rates on the other hand, rx interrupts go up and therefore the +interrupt/packet ratio goes up (as observable from that table). So there is +possibility that under low enough input, you get one poll call for each +input packet caused by a single interrupt each time. And if the system +cant handle interrupt per packet ratio of 1, then it will just have to +chug along .... + + +0) Prerequisites: +================== +A driver MAY continue using the old 2.4 technique for interfacing +to the network stack and not benefit from the NAPI changes. +NAPI additions to the kernel do not break backward compatibility. +NAPI, however, requires the following features to be available: + +A) DMA ring or enough RAM to store packets in software devices. + +B) Ability to turn off interrupts or maybe events that send packets up +the stack. + +NAPI processes packet events in what is known as dev->poll() method. +Typically, only packet receive events are processed in dev->poll(). +The rest of the events MAY be processed by the regular interrupt handler +to reduce processing latency (justified also because there are not that +many of them). +Note, however, NAPI does not enforce that dev->poll() only processes +receive events. +Tests with the tulip driver indicated slightly increased latency if +all of the interrupt handler is moved to dev->poll(). Also MII handling +gets a little trickier. +The example used in this document is to move the receive processing only +to dev->poll(); this is shown with the patch for the tulip driver. +For an example of code that moves all the interrupt driver to +dev->poll() look at the ported e1000 code. + +There are caveats that might force you to go with moving everything to +dev->poll(). Different NICs work differently depending on their status/event +acknowledgement setup. +There are two types of event register ACK mechanisms. + I) what is known as Clear-on-read (COR). + when you read the status/event register, it clears everything! + The natsemi and sunbmac NICs are known to do this. + In this case your only choice is to move all to dev->poll() + + II) Clear-on-write (COW) + i) you clear the status by writing a 1 in the bit-location you want. + These are the majority of the NICs and work the best with NAPI. + Put only receive events in dev->poll(); leave the rest in + the old interrupt handler. + ii) whatever you write in the status register clears every thing ;-> + Cant seem to find any supported by Linux which do this. If + someone knows such a chip email us please. + Move all to dev->poll() + +C) Ability to detect new work correctly. +NAPI works by shutting down event interrupts when theres work and +turning them on when theres none. +New packets might show up in the small window while interrupts were being +re-enabled (refer to appendix 2). A packet might sneak in during the period +we are enabling interrupts. We only get to know about such a packet when the +next new packet arrives and generates an interrupt. +Essentially, there is a small window of opportunity for a race condition +which for clarity we'll refer to as the "rotting packet". + +This is a very important topic and appendix 2 is dedicated for more +discussion. + +Locking rules and environmental guarantees +========================================== + +-Guarantee: Only one CPU at any time can call dev->poll(); this is because +only one CPU can pick the initial interrupt and hence the initial +netif_rx_schedule(dev); +- The core layer invokes devices to send packets in a round robin format. +This implies receive is totaly lockless because of the guarantee only that +one CPU is executing it. +- contention can only be the result of some other CPU accessing the rx +ring. This happens only in close() and suspend() (when these methods +try to clean the rx ring); +****guarantee: driver authors need not worry about this; synchronization +is taken care for them by the top net layer. +-local interrupts are enabled (if you dont move all to dev->poll()). For +example link/MII and txcomplete continue functioning just same old way. +This improves the latency of processing these events. It is also assumed that +the receive interrupt is the largest cause of noise. Note this might not +always be true. +[according to Manfred Spraul, the winbond insists on sending one +txmitcomplete interrupt for each packet (although this can be mitigated)]. +For these broken drivers, move all to dev->poll(). + +For the rest of this text, we'll assume that dev->poll() only +processes receive events. + +new methods introduce by NAPI +============================= + +a) netif_rx_schedule(dev) +Called by an IRQ handler to schedule a poll for device + +b) netif_rx_schedule_prep(dev) +puts the device in a state which allows for it to be added to the +CPU polling list if it is up and running. You can look at this as +the first half of netif_rx_schedule(dev) above; the second half +being c) below. + +c) __netif_rx_schedule(dev) +Add device to the poll list for this CPU; assuming that _prep above +has already been called and returned 1. + +d) netif_rx_reschedule(dev, undo) +Called to reschedule polling for device specifically for some +deficient hardware. Read Appendix 2 for more details. + +e) netif_rx_complete(dev) + +Remove interface from the CPU poll list: it must be in the poll list +on current cpu. This primitive is called by dev->poll(), when +it completes its work. The device cannot be out of poll list at this +call, if it is then clearly it is a BUG(). You'll know ;-> + +All these above nethods are used below. So keep reading for clarity. + +Device driver changes to be made when porting NAPI +================================================== + +Below we describe what kind of changes are required for NAPI to work. + +1) introduction of dev->poll() method +===================================== + +This is the method that is invoked by the network core when it requests +for new packets from the driver. A driver is allowed to send upto +dev->quota packets by the current CPU before yielding to the network +subsystem (so other devices can also get opportunity to send to the stack). + +dev->poll() prototype looks as follows: +int my_poll(struct net_device *dev, int *budget) + +budget is the remaining number of packets the network subsystem on the +current CPU can send up the stack before yielding to other system tasks. +*Each driver is responsible for decrementing budget by the total number of +packets sent. + Total number of packets cannot exceed dev->quota. + +dev->poll() method is invoked by the top layer, the driver just sends if it +can to the stack the packet quantity requested. + +more on dev->poll() below after the interrupt changes are explained. + +2) registering dev->poll() method +=================================== + +dev->poll should be set in the dev->probe() method. +e.g: +dev->open = my_open; +. +. +/* two new additions */ +/* first register my poll method */ +dev->poll = my_poll; +/* next register my weight/quanta; can be overridden in /proc */ +dev->weight = 16; +. +. +dev->stop = my_close; + + + +3) scheduling dev->poll() +============================= +This involves modifying the interrupt handler and the code +path which takes the packet off the NIC and sends them to the +stack. + +it's important at this point to introduce the classical D Becker +interrupt processor: + +------------------ +static irqreturn_t +netdevice_interrupt(int irq, void *dev_id, struct pt_regs *regs) +{ + + struct net_device *dev = (struct net_device *)dev_instance; + struct my_private *tp = (struct my_private *)dev->priv; + + int work_count = my_work_count; + status = read_interrupt_status_reg(); + if (status == 0) + return IRQ_NONE; /* Shared IRQ: not us */ + if (status == 0xffff) + return IRQ_HANDLED; /* Hot unplug */ + if (status & error) + do_some_error_handling() + + do { + acknowledge_ints_ASAP(); + + if (status & link_interrupt) { + spin_lock(&tp->link_lock); + do_some_link_stat_stuff(); + spin_lock(&tp->link_lock); + } + + if (status & rx_interrupt) { + receive_packets(dev); + } + + if (status & rx_nobufs) { + make_rx_buffs_avail(); + } + + if (status & tx_related) { + spin_lock(&tp->lock); + tx_ring_free(dev); + if (tx_died) + restart_tx(); + spin_unlock(&tp->lock); + } + + status = read_interrupt_status_reg(); + + } while (!(status & error) || more_work_to_be_done); + return IRQ_HANDLED; +} + +---------------------------------------------------------------------- + +We now change this to what is shown below to NAPI-enable it: + +---------------------------------------------------------------------- +static irqreturn_t +netdevice_interrupt(int irq, void *dev_id, struct pt_regs *regs) +{ + struct net_device *dev = (struct net_device *)dev_instance; + struct my_private *tp = (struct my_private *)dev->priv; + + status = read_interrupt_status_reg(); + if (status == 0) + return IRQ_NONE; /* Shared IRQ: not us */ + if (status == 0xffff) + return IRQ_HANDLED; /* Hot unplug */ + if (status & error) + do_some_error_handling(); + + do { +/************************ start note *********************************/ + acknowledge_ints_ASAP(); // dont ack rx and rxnobuff here +/************************ end note *********************************/ + + if (status & link_interrupt) { + spin_lock(&tp->link_lock); + do_some_link_stat_stuff(); + spin_unlock(&tp->link_lock); + } +/************************ start note *********************************/ + if (status & rx_interrupt || (status & rx_nobuffs)) { + if (netif_rx_schedule_prep(dev)) { + + /* disable interrupts caused + * by arriving packets */ + disable_rx_and_rxnobuff_ints(); + /* tell system we have work to be done. */ + __netif_rx_schedule(dev); + } else { + printk("driver bug! interrupt while in poll\n"); + /* FIX by disabling interrupts */ + disable_rx_and_rxnobuff_ints(); + } + } +/************************ end note note *********************************/ + + if (status & tx_related) { + spin_lock(&tp->lock); + tx_ring_free(dev); + + if (tx_died) + restart_tx(); + spin_unlock(&tp->lock); + } + + status = read_interrupt_status_reg(); + +/************************ start note *********************************/ + } while (!(status & error) || more_work_to_be_done(status)); +/************************ end note note *********************************/ + return IRQ_HANDLED; +} + +--------------------------------------------------------------------- + + +We note several things from above: + +I) Any interrupt source which is caused by arriving packets is now +turned off when it occurs. Depending on the hardware, there could be +several reasons that arriving packets would cause interrupts; these are the +interrupt sources we wish to avoid. The two common ones are a) a packet +arriving (rxint) b) a packet arriving and finding no DMA buffers available +(rxnobuff) . +This means also acknowledge_ints_ASAP() will not clear the status +register for those two items above; clearing is done in the place where +proper work is done within NAPI; at the poll() and refill_rx_ring() +discussed further below. +netif_rx_schedule_prep() returns 1 if device is in running state and +gets successfully added to the core poll list. If we get a zero value +we can _almost_ assume are already added to the list (instead of not running. +Logic based on the fact that you shouldn't get interrupt if not running) +We rectify this by disabling rx and rxnobuf interrupts. + +II) that receive_packets(dev) and make_rx_buffs_avail() may have disappeared. +These functionalities are still around actually...... + +infact, receive_packets(dev) is very close to my_poll() and +make_rx_buffs_avail() is invoked from my_poll() + +4) converting receive_packets() to dev->poll() +=============================================== + +We need to convert the classical D Becker receive_packets(dev) to my_poll() + +First the typical receive_packets() below: +------------------------------------------------------------------- + +/* this is called by interrupt handler */ +static void receive_packets (struct net_device *dev) +{ + + struct my_private *tp = (struct my_private *)dev->priv; + rx_ring = tp->rx_ring; + cur_rx = tp->cur_rx; + int entry = cur_rx % RX_RING_SIZE; + int received = 0; + int rx_work_limit = tp->dirty_rx + RX_RING_SIZE - tp->cur_rx; + + while (rx_ring_not_empty) { + u32 rx_status; + unsigned int rx_size; + unsigned int pkt_size; + struct sk_buff *skb; + /* read size+status of next frame from DMA ring buffer */ + /* the number 16 and 4 are just examples */ + rx_status = le32_to_cpu (*(u32 *) (rx_ring + ring_offset)); + rx_size = rx_status >> 16; + pkt_size = rx_size - 4; + + /* process errors */ + if ((rx_size > (MAX_ETH_FRAME_SIZE+4)) || + (!(rx_status & RxStatusOK))) { + netdrv_rx_err (rx_status, dev, tp, ioaddr); + return; + } + + if (--rx_work_limit < 0) + break; + + /* grab a skb */ + skb = dev_alloc_skb (pkt_size + 2); + if (skb) { + . + . + netif_rx (skb); + . + . + } else { /* OOM */ + /*seems very driver specific ... some just pass + whatever is on the ring already. */ + } + + /* move to the next skb on the ring */ + entry = (++tp->cur_rx) % RX_RING_SIZE; + received++ ; + + } + + /* store current ring pointer state */ + tp->cur_rx = cur_rx; + + /* Refill the Rx ring buffers if they are needed */ + refill_rx_ring(); + . + . + +} +------------------------------------------------------------------- +We change it to a new one below; note the additional parameter in +the call. + +------------------------------------------------------------------- + +/* this is called by the network core */ +static int my_poll (struct net_device *dev, int *budget) +{ + + struct my_private *tp = (struct my_private *)dev->priv; + rx_ring = tp->rx_ring; + cur_rx = tp->cur_rx; + int entry = cur_rx % RX_BUF_LEN; + /* maximum packets to send to the stack */ +/************************ note note *********************************/ + int rx_work_limit = dev->quota; + +/************************ end note note *********************************/ + do { // outer beginning loop starts here + + clear_rx_status_register_bit(); + + while (rx_ring_not_empty) { + u32 rx_status; + unsigned int rx_size; + unsigned int pkt_size; + struct sk_buff *skb; + /* read size+status of next frame from DMA ring buffer */ + /* the number 16 and 4 are just examples */ + rx_status = le32_to_cpu (*(u32 *) (rx_ring + ring_offset)); + rx_size = rx_status >> 16; + pkt_size = rx_size - 4; + + /* process errors */ + if ((rx_size > (MAX_ETH_FRAME_SIZE+4)) || + (!(rx_status & RxStatusOK))) { + netdrv_rx_err (rx_status, dev, tp, ioaddr); + return 1; + } + +/************************ note note *********************************/ + if (--rx_work_limit < 0) { /* we got packets, but no quota */ + /* store current ring pointer state */ + tp->cur_rx = cur_rx; + + /* Refill the Rx ring buffers if they are needed */ + refill_rx_ring(dev); + goto not_done; + } +/********************** end note **********************************/ + + /* grab a skb */ + skb = dev_alloc_skb (pkt_size + 2); + if (skb) { + . + . +/************************ note note *********************************/ + netif_receive_skb (skb); +/********************** end note **********************************/ + . + . + } else { /* OOM */ + /*seems very driver specific ... common is just pass + whatever is on the ring already. */ + } + + /* move to the next skb on the ring */ + entry = (++tp->cur_rx) % RX_RING_SIZE; + received++ ; + + } + + /* store current ring pointer state */ + tp->cur_rx = cur_rx; + + /* Refill the Rx ring buffers if they are needed */ + refill_rx_ring(dev); + + /* no packets on ring; but new ones can arrive since we last + checked */ + status = read_interrupt_status_reg(); + if (rx status is not set) { + /* If something arrives in this narrow window, + an interrupt will be generated */ + goto done; + } + /* done! at least thats what it looks like ;-> + if new packets came in after our last check on status bits + they'll be caught by the while check and we go back and clear them + since we havent exceeded our quota */ + } while (rx_status_is_set); + +done: + +/************************ note note *********************************/ + dev->quota -= received; + *budget -= received; + + /* If RX ring is not full we are out of memory. */ + if (tp->rx_buffers[tp->dirty_rx % RX_RING_SIZE].skb == NULL) + goto oom; + + /* we are happy/done, no more packets on ring; put us back + to where we can start processing interrupts again */ + netif_rx_complete(dev); + enable_rx_and_rxnobuf_ints(); + + /* The last op happens after poll completion. Which means the following: + * 1. it can race with disabling irqs in irq handler (which are done to + * schedule polls) + * 2. it can race with dis/enabling irqs in other poll threads + * 3. if an irq raised after the begining of the outer beginning + * loop(marked in the code above), it will be immediately + * triggered here. + * + * Summarizing: the logic may results in some redundant irqs both + * due to races in masking and due to too late acking of already + * processed irqs. The good news: no events are ever lost. + */ + + return 0; /* done */ + +not_done: + if (tp->cur_rx - tp->dirty_rx > RX_RING_SIZE/2 || + tp->rx_buffers[tp->dirty_rx % RX_RING_SIZE].skb == NULL) + refill_rx_ring(dev); + + if (!received) { + printk("received==0\n"); + received = 1; + } + dev->quota -= received; + *budget -= received; + return 1; /* not_done */ + +oom: + /* Start timer, stop polling, but do not enable rx interrupts. */ + start_poll_timer(dev); + return 0; /* we'll take it from here so tell core "done"*/ + +/************************ End note note *********************************/ +} +------------------------------------------------------------------- + +From above we note that: +0) rx_work_limit = dev->quota +1) refill_rx_ring() is in charge of clearing the bit for rxnobuff when +it does the work. +2) We have a done and not_done state. +3) instead of netif_rx() we call netif_receive_skb() to pass the skb. +4) we have a new way of handling oom condition +5) A new outer for (;;) loop has been added. This serves the purpose of +ensuring that if a new packet has come in, after we are all set and done, +and we have not exceeded our quota that we continue sending packets up. + + +----------------------------------------------------------- +Poll timer code will need to do the following: + +a) + + if (tp->cur_rx - tp->dirty_rx > RX_RING_SIZE/2 || + tp->rx_buffers[tp->dirty_rx % RX_RING_SIZE].skb == NULL) + refill_rx_ring(dev); + + /* If RX ring is not full we are still out of memory. + Restart the timer again. Else we re-add ourselves + to the master poll list. + */ + + if (tp->rx_buffers[tp->dirty_rx % RX_RING_SIZE].skb == NULL) + restart_timer(); + + else netif_rx_schedule(dev); /* we are back on the poll list */ + +5) dev->close() and dev->suspend() issues +========================================== +The driver writter neednt worry about this. The top net layer takes +care of it. + +6) Adding new Stats to /proc +============================= +In order to debug some of the new features, we introduce new stats +that need to be collected. +TODO: Fill this later. + +APPENDIX 1: discussion on using ethernet HW FC +============================================== +Most chips with FC only send a pause packet when they run out of Rx buffers. +Since packets are pulled off the DMA ring by a softirq in NAPI, +if the system is slow in grabbing them and we have a high input +rate (faster than the system's capacity to remove packets), then theoretically +there will only be one rx interrupt for all packets during a given packetstorm. +Under low load, we might have a single interrupt per packet. +FC should be programmed to apply in the case when the system cant pull out +packets fast enough i.e send a pause only when you run out of rx buffers. +Note FC in itself is a good solution but we have found it to not be +much of a commodity feature (both in NICs and switches) and hence falls +under the same category as using NIC based mitigation. Also experiments +indicate that its much harder to resolve the resource allocation +issue (aka lazy receiving that NAPI offers) and hence quantify its usefullness +proved harder. In any case, FC works even better with NAPI but is not +necessary. + + +APPENDIX 2: the "rotting packet" race-window avoidance scheme +============================================================= + +There are two types of associations seen here + +1) status/int which honors level triggered IRQ + +If a status bit for receive or rxnobuff is set and the corresponding +interrupt-enable bit is not on, then no interrupts will be generated. However, +as soon as the "interrupt-enable" bit is unmasked, an immediate interrupt is +generated. [assuming the status bit was not turned off]. +Generally the concept of level triggered IRQs in association with a status and +interrupt-enable CSR register set is used to avoid the race. + +If we take the example of the tulip: +"pending work" is indicated by the status bit(CSR5 in tulip). +the corresponding interrupt bit (CSR7 in tulip) might be turned off (but +the CSR5 will continue to be turned on with new packet arrivals even if +we clear it the first time) +Very important is the fact that if we turn on the interrupt bit on when +status is set that an immediate irq is triggered. + +If we cleared the rx ring and proclaimed there was "no more work +to be done" and then went on to do a few other things; then when we enable +interrupts, there is a possibility that a new packet might sneak in during +this phase. It helps to look at the pseudo code for the tulip poll +routine: + +-------------------------- + do { + ACK; + while (ring_is_not_empty()) { + work-work-work + if quota is exceeded: exit, no touching irq status/mask + } + /* No packets, but new can arrive while we are doing this*/ + CSR5 := read + if (CSR5 is not set) { + /* If something arrives in this narrow window here, + * where the comments are ;-> irq will be generated */ + unmask irqs; + exit poll; + } + } while (rx_status_is_set); +------------------------ + +CSR5 bit of interest is only the rx status. +If you look at the last if statement: +you just finished grabbing all the packets from the rx ring .. you check if +status bit says theres more packets just in ... it says none; you then +enable rx interrupts again; if a new packet just came in during this check, +we are counting that CSR5 will be set in that small window of opportunity +and that by re-enabling interrupts, we would actually triger an interrupt +to register the new packet for processing. + +[The above description nay be very verbose, if you have better wording +that will make this more understandable, please suggest it.] + +2) non-capable hardware + +These do not generally respect level triggered IRQs. Normally, +irqs may be lost while being masked and the only way to leave poll is to do +a double check for new input after netif_rx_complete() is invoked +and re-enable polling (after seeing this new input). + +Sample code: + +--------- + . + . +restart_poll: + while (ring_is_not_empty()) { + work-work-work + if quota is exceeded: exit, not touching irq status/mask + } + . + . + . + enable_rx_interrupts() + netif_rx_complete(dev); + if (ring_has_new_packet() && netif_rx_reschedule(dev, received)) { + disable_rx_and_rxnobufs() + goto restart_poll + } while (rx_status_is_set); +--------- + +Basically netif_rx_complete() removes us from the poll list, but because a +new packet which will never be caught due to the possibility of a race +might come in, we attempt to re-add ourselves to the poll list. + + + + +APPENDIX 3: Scheduling issues. +============================== +As seen NAPI moves processing to softirq level. Linux uses the ksoftirqd as the +general solution to schedule softirq's to run before next interrupt and by putting +them under scheduler control. Also this prevents consecutive softirq's from +monopolize the CPU. This also have the effect that the priority of ksoftirq needs +to be considered when running very CPU-intensive applications and networking to +get the proper balance of softirq/user balance. Increasing ksoftirq priority to 0 +(eventually more) is reported cure problems with low network performance at high +CPU load. + +Most used processes in a GIGE router: +USER PID %CPU %MEM SIZE RSS TTY STAT START TIME COMMAND +root 3 0.2 0.0 0 0 ? RWN Aug 15 602:00 (ksoftirqd_CPU0) +root 232 0.0 7.9 41400 40884 ? S Aug 15 74:12 gated + +-------------------------------------------------------------------- + +relevant sites: +================== +ftp://robur.slu.se/pub/Linux/net-development/NAPI/ + + +-------------------------------------------------------------------- +TODO: Write net-skeleton.c driver. +------------------------------------------------------------- + +Authors: +======== +Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> +Jamal Hadi Salim <hadi@cyberus.ca> +Robert Olsson <Robert.Olsson@data.slu.se> + +Acknowledgements: +================ +People who made this document better: + +Lennert Buytenhek <buytenh@gnu.org> +Andrew Morton <akpm@zip.com.au> +Manfred Spraul <manfred@colorfullife.com> +Donald Becker <becker@scyld.com> +Jeff Garzik <jgarzik@pobox.com> diff --git a/Documentation/networking/PLIP.txt b/Documentation/networking/PLIP.txt new file mode 100644 index 000000000000..ad7e3f7c3bbf --- /dev/null +++ b/Documentation/networking/PLIP.txt @@ -0,0 +1,215 @@ +PLIP: The Parallel Line Internet Protocol Device + +Donald Becker (becker@super.org) +I.D.A. Supercomputing Research Center, Bowie MD 20715 + +At some point T. Thorn will probably contribute text, +Tommy Thorn (tthorn@daimi.aau.dk) + +PLIP Introduction +----------------- + +This document describes the parallel port packet pusher for Net/LGX. +This device interface allows a point-to-point connection between two +parallel ports to appear as a IP network interface. + +What is PLIP? +============= + +PLIP is Parallel Line IP, that is, the transportation of IP packages +over a parallel port. In the case of a PC, the obvious choice is the +printer port. PLIP is a non-standard, but [can use] uses the standard +LapLink null-printer cable [can also work in turbo mode, with a PLIP +cable]. [The protocol used to pack IP packages, is a simple one +initiated by Crynwr.] + +Advantages of PLIP +================== + +It's cheap, it's available everywhere, and it's easy. + +The PLIP cable is all that's needed to connect two Linux boxes, and it +can be built for very few bucks. + +Connecting two Linux boxes takes only a second's decision and a few +minutes' work, no need to search for a [supported] netcard. This might +even be especially important in the case of notebooks, where netcards +are not easily available. + +Not requiring a netcard also means that apart from connecting the +cables, everything else is software configuration [which in principle +could be made very easy.] + +Disadvantages of PLIP +===================== + +Doesn't work over a modem, like SLIP and PPP. Limited range, 15 m. +Can only be used to connect three (?) Linux boxes. Doesn't connect to +an existing Ethernet. Isn't standard (not even de facto standard, like +SLIP). + +Performance +=========== + +PLIP easily outperforms Ethernet cards....(ups, I was dreaming, but +it *is* getting late. EOB) + +PLIP driver details +------------------- + +The Linux PLIP driver is an implementation of the original Crynwr protocol, +that uses the parallel port subsystem of the kernel in order to properly +share parallel ports between PLIP and other services. + +IRQs and trigger timeouts +========================= + +When a parallel port used for a PLIP driver has an IRQ configured to it, the +PLIP driver is signaled whenever data is sent to it via the cable, such that +when no data is available, the driver isn't being used. + +However, on some machines it is hard, if not impossible, to configure an IRQ +to a certain parallel port, mainly because it is used by some other device. +On these machines, the PLIP driver can be used in IRQ-less mode, where +the PLIP driver would constantly poll the parallel port for data waiting, +and if such data is available, process it. This mode is less efficient than +the IRQ mode, because the driver has to check the parallel port many times +per second, even when no data at all is sent. Some rough measurements +indicate that there isn't a noticeable performance drop when using IRQ-less +mode as compared to IRQ mode as far as the data transfer speed is involved. +There is a performance drop on the machine hosting the driver. + +When the PLIP driver is used in IRQ mode, the timeout used for triggering a +data transfer (the maximal time the PLIP driver would allow the other side +before announcing a timeout, when trying to handshake a transfer of some +data) is, by default, 500usec. As IRQ delivery is more or less immediate, +this timeout is quite sufficient. + +When in IRQ-less mode, the PLIP driver polls the parallel port HZ times +per second (where HZ is typically 100 on most platforms, and 1024 on an +Alpha, as of this writing). Between two such polls, there are 10^6/HZ usecs. +On an i386, for example, 10^6/100 = 10000usec. It is easy to see that it is +quite possible for the trigger timeout to expire between two such polls, as +the timeout is only 500usec long. As a result, it is required to change the +trigger timeout on the *other* side of a PLIP connection, to about +10^6/HZ usecs. If both sides of a PLIP connection are used in IRQ-less mode, +this timeout is required on both sides. + +It appears that in practice, the trigger timeout can be shorter than in the +above calculation. It isn't an important issue, unless the wire is faulty, +in which case a long timeout would stall the machine when, for whatever +reason, bits are dropped. + +A utility that can perform this change in Linux is plipconfig, which is part +of the net-tools package (its location can be found in the +Documentation/Changes file). An example command would be +'plipconfig plipX trigger 10000', where plipX is the appropriate +PLIP device. + +PLIP hardware interconnection +----------------------------- + +PLIP uses several different data transfer methods. The first (and the +only one implemented in the early version of the code) uses a standard +printer "null" cable to transfer data four bits at a time using +data bit outputs connected to status bit inputs. + +The second data transfer method relies on both machines having +bi-directional parallel ports, rather than output-only ``printer'' +ports. This allows byte-wide transfers and avoids reconstructing +nibbles into bytes, leading to much faster transfers. + +Parallel Transfer Mode 0 Cable +============================== + +The cable for the first transfer mode is a standard +printer "null" cable which transfers data four bits at a time using +data bit outputs of the first port (machine T) connected to the +status bit inputs of the second port (machine R). There are five +status inputs, and they are used as four data inputs and a clock (data +strobe) input, arranged so that the data input bits appear as contiguous +bits with standard status register implementation. + +A cable that implements this protocol is available commercially as a +"Null Printer" or "Turbo Laplink" cable. It can be constructed with +two DB-25 male connectors symmetrically connected as follows: + + STROBE output 1* + D0->ERROR 2 - 15 15 - 2 + D1->SLCT 3 - 13 13 - 3 + D2->PAPOUT 4 - 12 12 - 4 + D3->ACK 5 - 10 10 - 5 + D4->BUSY 6 - 11 11 - 6 + D5,D6,D7 are 7*, 8*, 9* + AUTOFD output 14* + INIT output 16* + SLCTIN 17 - 17 + extra grounds are 18*,19*,20*,21*,22*,23*,24* + GROUND 25 - 25 +* Do not connect these pins on either end + +If the cable you are using has a metallic shield it should be +connected to the metallic DB-25 shell at one end only. + +Parallel Transfer Mode 1 +======================== + +The second data transfer method relies on both machines having +bi-directional parallel ports, rather than output-only ``printer'' +ports. This allows byte-wide transfers, and avoids reconstructing +nibbles into bytes. This cable should not be used on unidirectional +``printer'' (as opposed to ``parallel'') ports or when the machine +isn't configured for PLIP, as it will result in output driver +conflicts and the (unlikely) possibility of damage. + +The cable for this transfer mode should be constructed as follows: + + STROBE->BUSY 1 - 11 + D0->D0 2 - 2 + D1->D1 3 - 3 + D2->D2 4 - 4 + D3->D3 5 - 5 + D4->D4 6 - 6 + D5->D5 7 - 7 + D6->D6 8 - 8 + D7->D7 9 - 9 + INIT -> ACK 16 - 10 + AUTOFD->PAPOUT 14 - 12 + SLCT->SLCTIN 13 - 17 + GND->ERROR 18 - 15 + extra grounds are 19*,20*,21*,22*,23*,24* + GROUND 25 - 25 +* Do not connect these pins on either end + +Once again, if the cable you are using has a metallic shield it should +be connected to the metallic DB-25 shell at one end only. + +PLIP Mode 0 transfer protocol +============================= + +The PLIP driver is compatible with the "Crynwr" parallel port transfer +standard in Mode 0. That standard specifies the following protocol: + + send header nibble '0x8' + count-low octet + count-high octet + ... data octets + checksum octet + +Each octet is sent as + <wait for rx. '0x1?'> <send 0x10+(octet&0x0F)> + <wait for rx. '0x0?'> <send 0x00+((octet>>4)&0x0F)> + +To start a transfer the transmitting machine outputs a nibble 0x08. +That raises the ACK line, triggering an interrupt in the receiving +machine. The receiving machine disables interrupts and raises its own ACK +line. + +Restated: + +(OUT is bit 0-4, OUT.j is bit j from OUT. IN likewise) +Send_Byte: + OUT := low nibble, OUT.4 := 1 + WAIT FOR IN.4 = 1 + OUT := high nibble, OUT.4 := 0 + WAIT FOR IN.4 = 0 diff --git a/Documentation/networking/README.sb1000 b/Documentation/networking/README.sb1000 new file mode 100644 index 000000000000..f82d42584e98 --- /dev/null +++ b/Documentation/networking/README.sb1000 @@ -0,0 +1,207 @@ +sb1000 is a module network device driver for the General Instrument (also known +as NextLevel) SURFboard1000 internal cable modem board. This is an ISA card +which is used by a number of cable TV companies to provide cable modem access. +It's a one-way downstream-only cable modem, meaning that your upstream net link +is provided by your regular phone modem. + +This driver was written by Franco Venturi <fventuri@mediaone.net>. He deserves +a great deal of thanks for this wonderful piece of code! + +----------------------------------------------------------------------------- + +Support for this device is now a part of the standard Linux kernel. The +driver source code file is drivers/net/sb1000.c. In addition to this +you will need: + +1.) The "cmconfig" program. This is a utility which supplements "ifconfig" +to configure the cable modem and network interface (usually called "cm0"); +and + +2.) Several PPP scripts which live in /etc/ppp to make connecting via your +cable modem easy. + + These utilities can be obtained from: + + http://www.jacksonville.net/~fventuri/ + + in Franco's original source code distribution .tar.gz file. Support for + the sb1000 driver can be found at: + + http://home.adelphia.net/~siglercm/sb1000.html + http://linuxpower.cx/~cable/ + + along with these utilities. + +3.) The standard isapnp tools. These are necessary to configure your SB1000 +card at boot time (or afterwards by hand) since it's a PnP card. + + If you don't have these installed as a standard part of your Linux + distribution, you can find them at: + + http://www.roestock.demon.co.uk/isapnptools/ + + or check your Linux distribution binary CD or their web site. For help with + isapnp, pnpdump, or /etc/isapnp.conf, go to: + + http://www.roestock.demon.co.uk/isapnptools/isapnpfaq.html + +----------------------------------------------------------------------------- + +To make the SB1000 card work, follow these steps: + +1.) Run `make config', or `make menuconfig', or `make xconfig', whichever +you prefer, in the top kernel tree directory to set up your kernel +configuration. Make sure to say "Y" to "Prompt for development drivers" +and to say "M" to the sb1000 driver. Also say "Y" or "M" to all the standard +networking questions to get TCP/IP and PPP networking support. + +2.) *BEFORE* you build the kernel, edit drivers/net/sb1000.c. Make sure +to redefine the value of READ_DATA_PORT to match the I/O address used +by isapnp to access your PnP cards. This is the value of READPORT in +/etc/isapnp.conf or given by the output of pnpdump. + +3.) Build and install the kernel and modules as usual. + +4.) Boot your new kernel following the usual procedures. + +5.) Set up to configure the new SB1000 PnP card by capturing the output +of "pnpdump" to a file and editing this file to set the correct I/O ports, +IRQ, and DMA settings for all your PnP cards. Make sure none of the settings +conflict with one another. Then test this configuration by running the +"isapnp" command with your new config file as the input. Check for +errors and fix as necessary. (As an aside, I use I/O ports 0x110 and +0x310 and IRQ 11 for my SB1000 card and these work well for me. YMMV.) +Then save the finished config file as /etc/isapnp.conf for proper configuration +on subsequent reboots. + +6.) Download the original file sb1000-1.1.2.tar.gz from Franco's site or one of +the others referenced above. As root, unpack it into a temporary directory and +do a `make cmconfig' and then `install -c cmconfig /usr/local/sbin'. Don't do +`make install' because it expects to find all the utilities built and ready for +installation, not just cmconfig. + +7.) As root, copy all the files under the ppp/ subdirectory in Franco's +tar file into /etc/ppp, being careful not to overwrite any files that are +already in there. Then modify ppp@gi-on to set the correct login name, +phone number, and frequency for the cable modem. Also edit pap-secrets +to specify your login name and password and any site-specific information +you need. + +8.) Be sure to modify /etc/ppp/firewall to use ipchains instead of +the older ipfwadm commands from the 2.0.x kernels. There's a neat utility to +convert ipfwadm commands to ipchains commands: + + http://users.dhp.com/~whisper/ipfwadm2ipchains/ + +You may also wish to modify the firewall script to implement a different +firewalling scheme. + +9.) Start the PPP connection via the script /etc/ppp/ppp@gi-on. You must be +root to do this. It's better to use a utility like sudo to execute +frequently used commands like this with root permissions if possible. If you +connect successfully the cable modem interface will come up and you'll see a +driver message like this at the console: + + cm0: sb1000 at (0x110,0x310), csn 1, S/N 0x2a0d16d8, IRQ 11. + sb1000.c:v1.1.2 6/01/98 (fventuri@mediaone.net) + +The "ifconfig" command should show two new interfaces, ppp0 and cm0. +The command "cmconfig cm0" will give you information about the cable modem +interface. + +10.) Try pinging a site via `ping -c 5 www.yahoo.com', for example. You should +see packets received. + +11.) If you can't get site names (like www.yahoo.com) to resolve into +IP addresses (like 204.71.200.67), be sure your /etc/resolv.conf file +has no syntax errors and has the right nameserver IP addresses in it. +If this doesn't help, try something like `ping -c 5 204.71.200.67' to +see if the networking is running but the DNS resolution is where the +problem lies. + +12.) If you still have problems, go to the support web sites mentioned above +and read the information and documentation there. + +----------------------------------------------------------------------------- + +Common problems: + +1.) Packets go out on the ppp0 interface but don't come back on the cm0 +interface. It looks like I'm connected but I can't even ping any +numerical IP addresses. (This happens predominantly on Debian systems due +to a default boot-time configuration script.) + +Solution -- As root `echo 0 > /proc/sys/net/ipv4/conf/cm0/rp_filter' so it +can share the same IP address as the ppp0 interface. Note that this +command should probably be added to the /etc/ppp/cablemodem script +*right*between* the "/sbin/ifconfig" and "/sbin/cmconfig" commands. +You may need to do this to /proc/sys/net/ipv4/conf/ppp0/rp_filter as well. +If you do this to /proc/sys/net/ipv4/conf/default/rp_filter on each reboot +(in rc.local or some such) then any interfaces can share the same IP +addresses. + +2.) I get "unresolved symbol" error messages on executing `insmod sb1000.o'. + +Solution -- You probably have a non-matching kernel source tree and +/usr/include/linux and /usr/include/asm header files. Make sure you +install the correct versions of the header files in these two directories. +Then rebuild and reinstall the kernel. + +3.) When isapnp runs it reports an error, and my SB1000 card isn't working. + +Solution -- There's a problem with later versions of isapnp using the "(CHECK)" +option in the lines that allocate the two I/O addresses for the SB1000 card. +This first popped up on RH 6.0. Delete "(CHECK)" for the SB1000 I/O addresses. +Make sure they don't conflict with any other pieces of hardware first! Then +rerun isapnp and go from there. + +4.) I can't execute the /etc/ppp/ppp@gi-on file. + +Solution -- As root do `chmod ug+x /etc/ppp/ppp@gi-on'. + +5.) The firewall script isn't working (with 2.2.x and higher kernels). + +Solution -- Use the ipfwadm2ipchains script referenced above to convert the +/etc/ppp/firewall script from the deprecated ipfwadm commands to ipchains. + +6.) I'm getting *tons* of firewall deny messages in the /var/kern.log, +/var/messages, and/or /var/syslog files, and they're filling up my /var +partition!!! + +Solution -- First, tell your ISP that you're receiving DoS (Denial of Service) +and/or portscanning (UDP connection attempts) attacks! Look over the deny +messages to figure out what the attack is and where it's coming from. Next, +edit /etc/ppp/cablemodem and make sure the ",nobroadcast" option is turned on +to the "cmconfig" command (uncomment that line). If you're not receiving these +denied packets on your broadcast interface (IP address xxx.yyy.zzz.255 +typically), then someone is attacking your machine in particular. Be careful +out there.... + +7.) Everything seems to work fine but my computer locks up after a while +(and typically during a lengthy download through the cable modem)! + +Solution -- You may need to add a short delay in the driver to 'slow down' the +SURFboard because your PC might not be able to keep up with the transfer rate +of the SB1000. To do this, it's probably best to download Franco's +sb1000-1.1.2.tar.gz archive and build and install sb1000.o manually. You'll +want to edit the 'Makefile' and look for the 'SB1000_DELAY' +define. Uncomment those 'CFLAGS' lines (and comment out the default ones) +and try setting the delay to something like 60 microseconds with: +'-DSB1000_DELAY=60'. Then do `make' and as root `make install' and try +it out. If it still doesn't work or you like playing with the driver, you may +try other numbers. Remember though that the higher the delay, the slower the +driver (which slows down the rest of the PC too when it is actively +used). Thanks to Ed Daiga for this tip! + +----------------------------------------------------------------------------- + +Credits: This README came from Franco Venturi's original README file which is +still supplied with his driver .tar.gz archive. I and all other sb1000 users +owe Franco a tremendous "Thank you!" Additional thanks goes to Carl Patten +and Ralph Bonnell who are now managing the Linux SB1000 web site, and to +the SB1000 users who reported and helped debug the common problems listed +above. + + + Clemmitt Sigler + csigler@vt.edu diff --git a/Documentation/networking/TODO b/Documentation/networking/TODO new file mode 100644 index 000000000000..66d36ff14bae --- /dev/null +++ b/Documentation/networking/TODO @@ -0,0 +1,18 @@ +To-do items for network drivers +------------------------------- + +* Move ethernet crc routine to generic code + +* (for 2.5) Integrate Jamal Hadi Salim's netdev Rx polling API change + +* Audit all net drivers to make sure magic packet / wake-on-lan / + similar features are disabled in the driver by default. + +* Audit all net drivers to make sure the module always prints out a + version string when loaded as a module, but only prints a version + string when built into the kernel if a device is detected. + +* Add ETHTOOL_GDRVINFO ioctl support to all ethernet drivers. + +* dmfe PCI DMA is totally wrong and only works on x86 + diff --git a/Documentation/networking/alias.txt b/Documentation/networking/alias.txt new file mode 100644 index 000000000000..cd12c2ff518a --- /dev/null +++ b/Documentation/networking/alias.txt @@ -0,0 +1,53 @@ + +IP-Aliasing: +============ + +IP-aliases are additional IP-addresses/masks hooked up to a base +interface by adding a colon and a string when running ifconfig. +This string is usually numeric, but this is not a must. + +IP-Aliases are avail if CONFIG_INET (`standard' IPv4 networking) +is configured in the kernel. + + +o Alias creation. + Alias creation is done by 'magic' interface naming: eg. to create a + 200.1.1.1 alias for eth0 ... + + # ifconfig eth0:0 200.1.1.1 etc,etc.... + ~~ -> request alias #0 creation (if not yet exists) for eth0 + + The corresponding route is also set up by this command. + Please note: The route always points to the base interface. + + +o Alias deletion. + The alias is removed by shutting the alias down: + + # ifconfig eth0:0 down + ~~~~~~~~~~ -> will delete alias + + +o Alias (re-)configuring + + Aliases are not real devices, but programs should be able to configure and + refer to them as usual (ifconfig, route, etc). + + +o Relationship with main device + + If the base device is shut down the added aliases will be deleted + too. + + +Contact +------- +Please finger or e-mail me: + Juan Jose Ciarlante <jjciarla@raiz.uncu.edu.ar> + +Updated by Erik Schoenfelder <schoenfr@gaertner.DE> + +; local variables: +; mode: indented-text +; mode: auto-fill +; end: diff --git a/Documentation/networking/arcnet-hardware.txt b/Documentation/networking/arcnet-hardware.txt new file mode 100644 index 000000000000..30a5f01403d3 --- /dev/null +++ b/Documentation/networking/arcnet-hardware.txt @@ -0,0 +1,3133 @@ + +----------------------------------------------------------------------------- +1) This file is a supplement to arcnet.txt. Please read that for general + driver configuration help. +----------------------------------------------------------------------------- +2) This file is no longer Linux-specific. It should probably be moved out of + the kernel sources. Ideas? +----------------------------------------------------------------------------- + +Because so many people (myself included) seem to have obtained ARCnet cards +without manuals, this file contains a quick introduction to ARCnet hardware, +some cabling tips, and a listing of all jumper settings I can find. Please +e-mail apenwarr@worldvisions.ca with any settings for your particular card, +or any other information you have! + + +INTRODUCTION TO ARCNET +---------------------- + +ARCnet is a network type which works in a way similar to popular Ethernet +networks but which is also different in some very important ways. + +First of all, you can get ARCnet cards in at least two speeds: 2.5 Mbps +(slower than Ethernet) and 100 Mbps (faster than normal Ethernet). In fact, +there are others as well, but these are less common. The different hardware +types, as far as I'm aware, are not compatible and so you cannot wire a +100 Mbps card to a 2.5 Mbps card, and so on. From what I hear, my driver does +work with 100 Mbps cards, but I haven't been able to verify this myself, +since I only have the 2.5 Mbps variety. It is probably not going to saturate +your 100 Mbps card. Stop complaining. :) + +You also cannot connect an ARCnet card to any kind of Ethernet card and +expect it to work. + +There are two "types" of ARCnet - STAR topology and BUS topology. This +refers to how the cards are meant to be wired together. According to most +available documentation, you can only connect STAR cards to STAR cards and +BUS cards to BUS cards. That makes sense, right? Well, it's not quite +true; see below under "Cabling." + +Once you get past these little stumbling blocks, ARCnet is actually quite a +well-designed standard. It uses something called "modified token passing" +which makes it completely incompatible with so-called "Token Ring" cards, +but which makes transfers much more reliable than Ethernet does. In fact, +ARCnet will guarantee that a packet arrives safely at the destination, and +even if it can't possibly be delivered properly (ie. because of a cable +break, or because the destination computer does not exist) it will at least +tell the sender about it. + +Because of the carefully defined action of the "token", it will always make +a pass around the "ring" within a maximum length of time. This makes it +useful for realtime networks. + +In addition, all known ARCnet cards have an (almost) identical programming +interface. This means that with one ARCnet driver you can support any +card, whereas with Ethernet each manufacturer uses what is sometimes a +completely different programming interface, leading to a lot of different, +sometimes very similar, Ethernet drivers. Of course, always using the same +programming interface also means that when high-performance hardware +facilities like PCI bus mastering DMA appear, it's hard to take advantage of +them. Let's not go into that. + +One thing that makes ARCnet cards difficult to program for, however, is the +limit on their packet sizes; standard ARCnet can only send packets that are +up to 508 bytes in length. This is smaller than the Internet "bare minimum" +of 576 bytes, let alone the Ethernet MTU of 1500. To compensate, an extra +level of encapsulation is defined by RFC1201, which I call "packet +splitting," that allows "virtual packets" to grow as large as 64K each, +although they are generally kept down to the Ethernet-style 1500 bytes. + +For more information on the advantages and disadvantages (mostly the +advantages) of ARCnet networks, you might try the "ARCnet Trade Association" +WWW page: + http://www.arcnet.com + + +CABLING ARCNET NETWORKS +----------------------- + +This section was rewritten by + Vojtech Pavlik <vojtech@suse.cz> +using information from several people, including: + Avery Pennraun <apenwarr@worldvisions.ca> + Stephen A. Wood <saw@hallc1.cebaf.gov> + John Paul Morrison <jmorriso@bogomips.ee.ubc.ca> + Joachim Koenig <jojo@repas.de> +and Avery touched it up a bit, at Vojtech's request. + +ARCnet (the classic 2.5 Mbps version) can be connected by two different +types of cabling: coax and twisted pair. The other ARCnet-type networks +(100 Mbps TCNS and 320 kbps - 32 Mbps ARCnet Plus) use different types of +cabling (Type1, Fiber, C1, C4, C5). + +For a coax network, you "should" use 93 Ohm RG-62 cable. But other cables +also work fine, because ARCnet is a very stable network. I personally use 75 +Ohm TV antenna cable. + +Cards for coax cabling are shipped in two different variants: for BUS and +STAR network topologies. They are mostly the same. The only difference +lies in the hybrid chip installed. BUS cards use high impedance output, +while STAR use low impedance. Low impedance card (STAR) is electrically +equal to a high impedance one with a terminator installed. + +Usually, the ARCnet networks are built up from STAR cards and hubs. There +are two types of hubs - active and passive. Passive hubs are small boxes +with four BNC connectors containing four 47 Ohm resistors: + + | | wires + R + junction +-R-+-R- R 47 Ohm resistors + R + | + +The shielding is connected together. Active hubs are much more complicated; +they are powered and contain electronics to amplify the signal and send it +to other segments of the net. They usually have eight connectors. Active +hubs come in two variants - dumb and smart. The dumb variant just +amplifies, but the smart one decodes to digital and encodes back all packets +coming through. This is much better if you have several hubs in the net, +since many dumb active hubs may worsen the signal quality. + +And now to the cabling. What you can connect together: + +1. A card to a card. This is the simplest way of creating a 2-computer + network. + +2. A card to a passive hub. Remember that all unused connectors on the hub + must be properly terminated with 93 Ohm (or something else if you don't + have the right ones) terminators. + (Avery's note: oops, I didn't know that. Mine (TV cable) works + anyway, though.) + +3. A card to an active hub. Here is no need to terminate the unused + connectors except some kind of aesthetic feeling. But, there may not be + more than eleven active hubs between any two computers. That of course + doesn't limit the number of active hubs on the network. + +4. An active hub to another. + +5. An active hub to passive hub. + +Remember, that you can not connect two passive hubs together. The power loss +implied by such a connection is too high for the net to operate reliably. + +An example of a typical ARCnet network: + + R S - STAR type card + S------H--------A-------S R - Terminator + | | H - Hub + | | A - Active hub + | S----H----S + S | + | + S + +The BUS topology is very similar to the one used by Ethernet. The only +difference is in cable and terminators: they should be 93 Ohm. Ethernet +uses 50 Ohm impedance. You use T connectors to put the computers on a single +line of cable, the bus. You have to put terminators at both ends of the +cable. A typical BUS ARCnet network looks like: + + RT----T------T------T------T------TR + B B B B B B + + B - BUS type card + R - Terminator + T - T connector + +But that is not all! The two types can be connected together. According to +the official documentation the only way of connecting them is using an active +hub: + + A------T------T------TR + | B B B + S---H---S + | + S + +The official docs also state that you can use STAR cards at the ends of +BUS network in place of a BUS card and a terminator: + + S------T------T------S + B B + +But, according to my own experiments, you can simply hang a BUS type card +anywhere in middle of a cable in a STAR topology network. And more - you +can use the bus card in place of any star card if you use a terminator. Then +you can build very complicated networks fulfilling all your needs! An +example: + + S + | + RT------T-------T------H------S + B B B | + | R + S------A------T-------T-------A-------H------TR + | B B | | B + | S BT | + | | | S----A-----S + S------H---A----S | | + | | S------T----H---S | + S S B R S + +A basically different cabling scheme is used with Twisted Pair cabling. Each +of the TP cards has two RJ (phone-cord style) connectors. The cards are +then daisy-chained together using a cable connecting every two neighboring +cards. The ends are terminated with RJ 93 Ohm terminators which plug into +the empty connectors of cards on the ends of the chain. An example: + + ___________ ___________ + _R_|_ _|_|_ _|_R_ + | | | | | | + |Card | |Card | |Card | + |_____| |_____| |_____| + + +There are also hubs for the TP topology. There is nothing difficult +involved in using them; you just connect a TP chain to a hub on any end or +even at both. This way you can create almost any network configuration. +The maximum of 11 hubs between any two computers on the net applies here as +well. An example: + + RP-------P--------P--------H-----P------P-----PR + | + RP-----H--------P--------H-----P------PR + | | + PR PR + + R - RJ Terminator + P - TP Card + H - TP Hub + +Like any network, ARCnet has a limited cable length. These are the maximum +cable lengths between two active ends (an active end being an active hub or +a STAR card). + + RG-62 93 Ohm up to 650 m + RG-59/U 75 Ohm up to 457 m + RG-11/U 75 Ohm up to 533 m + IBM Type 1 150 Ohm up to 200 m + IBM Type 3 100 Ohm up to 100 m + +The maximum length of all cables connected to a passive hub is limited to 65 +meters for RG-62 cabling; less for others. You can see that using passive +hubs in a large network is a bad idea. The maximum length of a single "BUS +Trunk" is about 300 meters for RG-62. The maximum distance between the two +most distant points of the net is limited to 3000 meters. The maximum length +of a TP cable between two cards/hubs is 650 meters. + + +SETTING THE JUMPERS +------------------- + +All ARCnet cards should have a total of four or five different settings: + + - the I/O address: this is the "port" your ARCnet card is on. Probed + values in the Linux ARCnet driver are only from 0x200 through 0x3F0. (If + your card has additional ones, which is possible, please tell me.) This + should not be the same as any other device on your system. According to + a doc I got from Novell, MS Windows prefers values of 0x300 or more, + eating net connections on my system (at least) otherwise. My guess is + this may be because, if your card is at 0x2E0, probing for a serial port + at 0x2E8 will reset the card and probably mess things up royally. + - Avery's favourite: 0x300. + + - the IRQ: on 8-bit cards, it might be 2 (9), 3, 4, 5, or 7. + on 16-bit cards, it might be 2 (9), 3, 4, 5, 7, or 10-15. + + Make sure this is different from any other card on your system. Note + that IRQ2 is the same as IRQ9, as far as Linux is concerned. You can + "cat /proc/interrupts" for a somewhat complete list of which ones are in + use at any given time. Here is a list of common usages from Vojtech + Pavlik <vojtech@suse.cz>: + ("Not on bus" means there is no way for a card to generate this + interrupt) + IRQ 0 - Timer 0 (Not on bus) + IRQ 1 - Keyboard (Not on bus) + IRQ 2 - IRQ Controller 2 (Not on bus, nor does interrupt the CPU) + IRQ 3 - COM2 + IRQ 4 - COM1 + IRQ 5 - FREE (LPT2 if you have it; sometimes COM3; maybe PLIP) + IRQ 6 - Floppy disk controller + IRQ 7 - FREE (LPT1 if you don't use the polling driver; PLIP) + IRQ 8 - Realtime Clock Interrupt (Not on bus) + IRQ 9 - FREE (VGA vertical sync interrupt if enabled) + IRQ 10 - FREE + IRQ 11 - FREE + IRQ 12 - FREE + IRQ 13 - Numeric Coprocessor (Not on bus) + IRQ 14 - Fixed Disk Controller + IRQ 15 - FREE (Fixed Disk Controller 2 if you have it) + + Note: IRQ 9 is used on some video cards for the "vertical retrace" + interrupt. This interrupt would have been handy for things like + video games, as it occurs exactly once per screen refresh, but + unfortunately IBM cancelled this feature starting with the original + VGA and thus many VGA/SVGA cards do not support it. For this + reason, no modern software uses this interrupt and it can almost + always be safely disabled, if your video card supports it at all. + + If your card for some reason CANNOT disable this IRQ (usually there + is a jumper), one solution would be to clip the printed circuit + contact on the board: it's the fourth contact from the left on the + back side. I take no responsibility if you try this. + + - Avery's favourite: IRQ2 (actually IRQ9). Watch that VGA, though. + + - the memory address: Unlike most cards, ARCnets use "shared memory" for + copying buffers around. Make SURE it doesn't conflict with any other + used memory in your system! + A0000 - VGA graphics memory (ok if you don't have VGA) + B0000 - Monochrome text mode + C0000 \ One of these is your VGA BIOS - usually C0000. + E0000 / + F0000 - System BIOS + + Anything less than 0xA0000 is, well, a BAD idea since it isn't above + 640k. + - Avery's favourite: 0xD0000 + + - the station address: Every ARCnet card has its own "unique" network + address from 0 to 255. Unlike Ethernet, you can set this address + yourself with a jumper or switch (or on some cards, with special + software). Since it's only 8 bits, you can only have 254 ARCnet cards + on a network. DON'T use 0 or 255, since these are reserved (although + neat stuff will probably happen if you DO use them). By the way, if you + haven't already guessed, don't set this the same as any other ARCnet on + your network! + - Avery's favourite: 3 and 4. Not that it matters. + + - There may be ETS1 and ETS2 settings. These may or may not make a + difference on your card (many manuals call them "reserved"), but are + used to change the delays used when powering up a computer on the + network. This is only necessary when wiring VERY long range ARCnet + networks, on the order of 4km or so; in any case, the only real + requirement here is that all cards on the network with ETS1 and ETS2 + jumpers have them in the same position. Chris Hindy <chrish@io.org> + sent in a chart with actual values for this: + ET1 ET2 Response Time Reconfiguration Time + --- --- ------------- -------------------- + open open 74.7us 840us + open closed 283.4us 1680us + closed open 561.8us 1680us + closed closed 1118.6us 1680us + + Make sure you set ETS1 and ETS2 to the SAME VALUE for all cards on your + network. + +Also, on many cards (not mine, though) there are red and green LED's. +Vojtech Pavlik <vojtech@suse.cz> tells me this is what they mean: + GREEN RED Status + ----- --- ------ + OFF OFF Power off + OFF Short flashes Cabling problems (broken cable or not + terminated) + OFF (short) ON Card init + ON ON Normal state - everything OK, nothing + happens + ON Long flashes Data transfer + ON OFF Never happens (maybe when wrong ID) + + +The following is all the specific information people have sent me about +their own particular ARCnet cards. It is officially a mess, and contains +huge amounts of duplicated information. I have no time to fix it. If you +want to, PLEASE DO! Just send me a 'diff -u' of all your changes. + +The model # is listed right above specifics for that card, so you should be +able to use your text viewer's "search" function to find the entry you want. +If you don't KNOW what kind of card you have, try looking through the +various diagrams to see if you can tell. + +If your model isn't listed and/or has different settings, PLEASE PLEASE +tell me. I had to figure mine out without the manual, and it WASN'T FUN! + +Even if your ARCnet model isn't listed, but has the same jumpers as another +model that is, please e-mail me to say so. + +Cards Listed in this file (in this order, mostly): + + Manufacturer Model # Bits + ------------ ------- ---- + SMC PC100 8 + SMC PC110 8 + SMC PC120 8 + SMC PC130 8 + SMC PC270E 8 + SMC PC500 16 + SMC PC500Longboard 16 + SMC PC550Longboard 16 + SMC PC600 16 + SMC PC710 8 + SMC? LCS-8830(-T) 8/16 + Puredata PDI507 8 + CNet Tech CN120-Series 8 + CNet Tech CN160-Series 16 + Lantech? UM9065L chipset 8 + Acer 5210-003 8 + Datapoint? LAN-ARC-8 8 + Topware TA-ARC/10 8 + Thomas-Conrad 500-6242-0097 REV A 8 + Waterloo? (C)1985 Waterloo Micro. 8 + No Name -- 8/16 + No Name Taiwan R.O.C? 8 + No Name Model 9058 8 + Tiara Tiara Lancard? 8 + + +** SMC = Standard Microsystems Corp. +** CNet Tech = CNet Technology, Inc. + + +Unclassified Stuff +------------------ + - Please send any other information you can find. + + - And some other stuff (more info is welcome!): + From: root@ultraworld.xs4all.nl (Timo Hilbrink) + To: apenwarr@foxnet.net (Avery Pennarun) + Date: Wed, 26 Oct 1994 02:10:32 +0000 (GMT) + Reply-To: timoh@xs4all.nl + + [...parts deleted...] + + About the jumpers: On my PC130 there is one more jumper, located near the + cable-connector and it's for changing to star or bus topology; + closed: star - open: bus + On the PC500 are some more jumper-pins, one block labeled with RX,PDN,TXI + and another with ALE,LA17,LA18,LA19 these are undocumented.. + + [...more parts deleted...] + + --- CUT --- + + +** Standard Microsystems Corp (SMC) ** +PC100, PC110, PC120, PC130 (8-bit cards) +PC500, PC600 (16-bit cards) +--------------------------------- + - mainly from Avery Pennarun <apenwarr@worldvisions.ca>. Values depicted + are from Avery's setup. + - special thanks to Timo Hilbrink <timoh@xs4all.nl> for noting that PC120, + 130, 500, and 600 all have the same switches as Avery's PC100. + PC500/600 have several extra, undocumented pins though. (?) + - PC110 settings were verified by Stephen A. Wood <saw@cebaf.gov> + - Also, the JP- and S-numbers probably don't match your card exactly. Try + to find jumpers/switches with the same number of settings - it's + probably more reliable. + + + JP5 [|] : : : : +(IRQ Setting) IRQ2 IRQ3 IRQ4 IRQ5 IRQ7 + Put exactly one jumper on exactly one set of pins. + + + 1 2 3 4 5 6 7 8 9 10 + S1 /----------------------------------\ +(I/O and Memory | 1 1 * 0 0 0 0 * 1 1 0 1 | + addresses) \----------------------------------/ + |--| |--------| |--------| + (a) (b) (m) + + WARNING. It's very important when setting these which way + you're holding the card, and which way you think is '1'! + + If you suspect that your settings are not being made + correctly, try reversing the direction or inverting the + switch positions. + + a: The first digit of the I/O address. + Setting Value + ------- ----- + 00 0 + 01 1 + 10 2 + 11 3 + + b: The second digit of the I/O address. + Setting Value + ------- ----- + 0000 0 + 0001 1 + 0010 2 + ... ... + 1110 E + 1111 F + + The I/O address is in the form ab0. For example, if + a is 0x2 and b is 0xE, the address will be 0x2E0. + + DO NOT SET THIS LESS THAN 0x200!!!!! + + + m: The first digit of the memory address. + Setting Value + ------- ----- + 0000 0 + 0001 1 + 0010 2 + ... ... + 1110 E + 1111 F + + The memory address is in the form m0000. For example, if + m is D, the address will be 0xD0000. + + DO NOT SET THIS TO C0000, F0000, OR LESS THAN A0000! + + 1 2 3 4 5 6 7 8 + S2 /--------------------------\ +(Station Address) | 1 1 0 0 0 0 0 0 | + \--------------------------/ + + Setting Value + ------- ----- + 00000000 00 + 10000000 01 + 01000000 02 + ... + 01111111 FE + 11111111 FF + + Note that this is binary with the digits reversed! + + DO NOT SET THIS TO 0 OR 255 (0xFF)! + + +***************************************************************************** + +** Standard Microsystems Corp (SMC) ** +PC130E/PC270E (8-bit cards) +--------------------------- + - from Juergen Seifert <seifert@htwm.de> + + +STANDARD MICROSYSTEMS CORPORATION (SMC) ARCNET(R)-PC130E/PC270E +=============================================================== + +This description has been written by Juergen Seifert <seifert@htwm.de> +using information from the following Original SMC Manual + + "Configuration Guide for + ARCNET(R)-PC130E/PC270 + Network Controller Boards + Pub. # 900.044A + June, 1989" + +ARCNET is a registered trademark of the Datapoint Corporation +SMC is a registered trademark of the Standard Microsystems Corporation + +The PC130E is an enhanced version of the PC130 board, is equipped with a +standard BNC female connector for connection to RG-62/U coax cable. +Since this board is designed both for point-to-point connection in star +networks and for connection to bus networks, it is downwardly compatible +with all the other standard boards designed for coax networks (that is, +the PC120, PC110 and PC100 star topology boards and the PC220, PC210 and +PC200 bus topology boards). + +The PC270E is an enhanced version of the PC260 board, is equipped with two +modular RJ11-type jacks for connection to twisted pair wiring. +It can be used in a star or a daisy-chained network. + + + 8 7 6 5 4 3 2 1 + ________________________________________________________________ + | | S1 | | + | |_________________| | + | Offs|Base |I/O Addr | + | RAM Addr | ___| + | ___ ___ CR3 |___| + | | \/ | CR4 |___| + | | PROM | ___| + | | | N | | 8 + | | SOCKET | o | | 7 + | |________| d | | 6 + | ___________________ e | | 5 + | | | A | S | 4 + | |oo| EXT2 | | d | 2 | 3 + | |oo| EXT1 | SMC | d | | 2 + | |oo| ROM | 90C63 | r |___| 1 + | |oo| IRQ7 | | |o| _____| + | |oo| IRQ5 | | |o| | J1 | + | |oo| IRQ4 | | STAR |_____| + | |oo| IRQ3 | | | J2 | + | |oo| IRQ2 |___________________| |_____| + |___ ______________| + | | + |_____________________________________________| + +Legend: + +SMC 90C63 ARCNET Controller / Transceiver /Logic +S1 1-3: I/O Base Address Select + 4-6: Memory Base Address Select + 7-8: RAM Offset Select +S2 1-8: Node ID Select +EXT Extended Timeout Select +ROM ROM Enable Select +STAR Selected - Star Topology (PC130E only) + Deselected - Bus Topology (PC130E only) +CR3/CR4 Diagnostic LEDs +J1 BNC RG62/U Connector (PC130E only) +J1 6-position Telephone Jack (PC270E only) +J2 6-position Telephone Jack (PC270E only) + +Setting one of the switches to Off/Open means "1", On/Closed means "0". + + +Setting the Node ID +------------------- + +The eight switches in group S2 are used to set the node ID. +These switches work in a way similar to the PC100-series cards; see that +entry for more information. + + +Setting the I/O Base Address +---------------------------- + +The first three switches in switch group S1 are used to select one +of eight possible I/O Base addresses using the following table + + + Switch | Hex I/O + 1 2 3 | Address + -------|-------- + 0 0 0 | 260 + 0 0 1 | 290 + 0 1 0 | 2E0 (Manufacturer's default) + 0 1 1 | 2F0 + 1 0 0 | 300 + 1 0 1 | 350 + 1 1 0 | 380 + 1 1 1 | 3E0 + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The memory buffer requires 2K of a 16K block of RAM. The base of this +16K block can be located in any of eight positions. +Switches 4-6 of switch group S1 select the Base of the 16K block. +Within that 16K address space, the buffer may be assigned any one of four +positions, determined by the offset, switches 7 and 8 of group S1. + + Switch | Hex RAM | Hex ROM + 4 5 6 7 8 | Address | Address *) + -----------|---------|----------- + 0 0 0 0 0 | C0000 | C2000 + 0 0 0 0 1 | C0800 | C2000 + 0 0 0 1 0 | C1000 | C2000 + 0 0 0 1 1 | C1800 | C2000 + | | + 0 0 1 0 0 | C4000 | C6000 + 0 0 1 0 1 | C4800 | C6000 + 0 0 1 1 0 | C5000 | C6000 + 0 0 1 1 1 | C5800 | C6000 + | | + 0 1 0 0 0 | CC000 | CE000 + 0 1 0 0 1 | CC800 | CE000 + 0 1 0 1 0 | CD000 | CE000 + 0 1 0 1 1 | CD800 | CE000 + | | + 0 1 1 0 0 | D0000 | D2000 (Manufacturer's default) + 0 1 1 0 1 | D0800 | D2000 + 0 1 1 1 0 | D1000 | D2000 + 0 1 1 1 1 | D1800 | D2000 + | | + 1 0 0 0 0 | D4000 | D6000 + 1 0 0 0 1 | D4800 | D6000 + 1 0 0 1 0 | D5000 | D6000 + 1 0 0 1 1 | D5800 | D6000 + | | + 1 0 1 0 0 | D8000 | DA000 + 1 0 1 0 1 | D8800 | DA000 + 1 0 1 1 0 | D9000 | DA000 + 1 0 1 1 1 | D9800 | DA000 + | | + 1 1 0 0 0 | DC000 | DE000 + 1 1 0 0 1 | DC800 | DE000 + 1 1 0 1 0 | DD000 | DE000 + 1 1 0 1 1 | DD800 | DE000 + | | + 1 1 1 0 0 | E0000 | E2000 + 1 1 1 0 1 | E0800 | E2000 + 1 1 1 1 0 | E1000 | E2000 + 1 1 1 1 1 | E1800 | E2000 + +*) To enable the 8K Boot PROM install the jumper ROM. + The default is jumper ROM not installed. + + +Setting the Timeouts and Interrupt +---------------------------------- + +The jumpers labeled EXT1 and EXT2 are used to determine the timeout +parameters. These two jumpers are normally left open. + +To select a hardware interrupt level set one (only one!) of the jumpers +IRQ2, IRQ3, IRQ4, IRQ5, IRQ7. The Manufacturer's default is IRQ2. + + +Configuring the PC130E for Star or Bus Topology +----------------------------------------------- + +The single jumper labeled STAR is used to configure the PC130E board for +star or bus topology. +When the jumper is installed, the board may be used in a star network, when +it is removed, the board can be used in a bus topology. + + +Diagnostic LEDs +--------------- + +Two diagnostic LEDs are visible on the rear bracket of the board. +The green LED monitors the network activity: the red one shows the +board activity: + + Green | Status Red | Status + -------|------------------- ---------|------------------- + on | normal activity flash/on | data transfer + blink | reconfiguration off | no data transfer; + off | defective board or | incorrect memory or + | node ID is zero | I/O address + + +***************************************************************************** + +** Standard Microsystems Corp (SMC) ** +PC500/PC550 Longboard (16-bit cards) +------------------------------------- + - from Juergen Seifert <seifert@htwm.de> + + +STANDARD MICROSYSTEMS CORPORATION (SMC) ARCNET-PC500/PC550 Long Board +===================================================================== + +Note: There is another Version of the PC500 called Short Version, which + is different in hard- and software! The most important differences + are: + - The long board has no Shared memory. + - On the long board the selection of the interrupt is done by binary + coded switch, on the short board directly by jumper. + +[Avery's note: pay special attention to that: the long board HAS NO SHARED +MEMORY. This means the current Linux-ARCnet driver can't use these cards. +I have obtained a PC500Longboard and will be doing some experiments on it in +the future, but don't hold your breath. Thanks again to Juergen Seifert for +his advice about this!] + +This description has been written by Juergen Seifert <seifert@htwm.de> +using information from the following Original SMC Manual + + "Configuration Guide for + SMC ARCNET-PC500/PC550 + Series Network Controller Boards + Pub. # 900.033 Rev. A + November, 1989" + +ARCNET is a registered trademark of the Datapoint Corporation +SMC is a registered trademark of the Standard Microsystems Corporation + +The PC500 is equipped with a standard BNC female connector for connection +to RG-62/U coax cable. +The board is designed both for point-to-point connection in star networks +and for connection to bus networks. + +The PC550 is equipped with two modular RJ11-type jacks for connection +to twisted pair wiring. +It can be used in a star or a daisy-chained (BUS) network. + + 1 + 0 9 8 7 6 5 4 3 2 1 6 5 4 3 2 1 + ____________________________________________________________________ + < | SW1 | | SW2 | | + > |_____________________| |_____________| | + < IRQ |I/O Addr | + > ___| + < CR4 |___| + > CR3 |___| + < ___| + > N | | 8 + < o | | 7 + > d | S | 6 + < e | W | 5 + > A | 3 | 4 + < d | | 3 + > d | | 2 + < r |___| 1 + > |o| _____| + < |o| | J1 | + > 3 1 JP6 |_____| + < |o|o| JP2 | J2 | + > |o|o| |_____| + < 4 2__ ______________| + > | | | + <____| |_____________________________________________| + +Legend: + +SW1 1-6: I/O Base Address Select + 7-10: Interrupt Select +SW2 1-6: Reserved for Future Use +SW3 1-8: Node ID Select +JP2 1-4: Extended Timeout Select +JP6 Selected - Star Topology (PC500 only) + Deselected - Bus Topology (PC500 only) +CR3 Green Monitors Network Activity +CR4 Red Monitors Board Activity +J1 BNC RG62/U Connector (PC500 only) +J1 6-position Telephone Jack (PC550 only) +J2 6-position Telephone Jack (PC550 only) + +Setting one of the switches to Off/Open means "1", On/Closed means "0". + + +Setting the Node ID +------------------- + +The eight switches in group SW3 are used to set the node ID. Each node +attached to the network must have an unique node ID which must be +different from 0. +Switch 1 serves as the least significant bit (LSB). + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Value + -------|------- + 1 | 1 + 2 | 2 + 3 | 4 + 4 | 8 + 5 | 16 + 6 | 32 + 7 | 64 + 8 | 128 + +Some Examples: + + Switch | Hex | Decimal + 8 7 6 5 4 3 2 1 | Node ID | Node ID + ----------------|---------|--------- + 0 0 0 0 0 0 0 0 | not allowed + 0 0 0 0 0 0 0 1 | 1 | 1 + 0 0 0 0 0 0 1 0 | 2 | 2 + 0 0 0 0 0 0 1 1 | 3 | 3 + . . . | | + 0 1 0 1 0 1 0 1 | 55 | 85 + . . . | | + 1 0 1 0 1 0 1 0 | AA | 170 + . . . | | + 1 1 1 1 1 1 0 1 | FD | 253 + 1 1 1 1 1 1 1 0 | FE | 254 + 1 1 1 1 1 1 1 1 | FF | 255 + + +Setting the I/O Base Address +---------------------------- + +The first six switches in switch group SW1 are used to select one +of 32 possible I/O Base addresses using the following table + + Switch | Hex I/O + 6 5 4 3 2 1 | Address + -------------|-------- + 0 1 0 0 0 0 | 200 + 0 1 0 0 0 1 | 210 + 0 1 0 0 1 0 | 220 + 0 1 0 0 1 1 | 230 + 0 1 0 1 0 0 | 240 + 0 1 0 1 0 1 | 250 + 0 1 0 1 1 0 | 260 + 0 1 0 1 1 1 | 270 + 0 1 1 0 0 0 | 280 + 0 1 1 0 0 1 | 290 + 0 1 1 0 1 0 | 2A0 + 0 1 1 0 1 1 | 2B0 + 0 1 1 1 0 0 | 2C0 + 0 1 1 1 0 1 | 2D0 + 0 1 1 1 1 0 | 2E0 (Manufacturer's default) + 0 1 1 1 1 1 | 2F0 + 1 1 0 0 0 0 | 300 + 1 1 0 0 0 1 | 310 + 1 1 0 0 1 0 | 320 + 1 1 0 0 1 1 | 330 + 1 1 0 1 0 0 | 340 + 1 1 0 1 0 1 | 350 + 1 1 0 1 1 0 | 360 + 1 1 0 1 1 1 | 370 + 1 1 1 0 0 0 | 380 + 1 1 1 0 0 1 | 390 + 1 1 1 0 1 0 | 3A0 + 1 1 1 0 1 1 | 3B0 + 1 1 1 1 0 0 | 3C0 + 1 1 1 1 0 1 | 3D0 + 1 1 1 1 1 0 | 3E0 + 1 1 1 1 1 1 | 3F0 + + +Setting the Interrupt +--------------------- + +Switches seven through ten of switch group SW1 are used to select the +interrupt level. The interrupt level is binary coded, so selections +from 0 to 15 would be possible, but only the following eight values will +be supported: 3, 4, 5, 7, 9, 10, 11, 12. + + Switch | IRQ + 10 9 8 7 | + ---------|-------- + 0 0 1 1 | 3 + 0 1 0 0 | 4 + 0 1 0 1 | 5 + 0 1 1 1 | 7 + 1 0 0 1 | 9 (=2) (default) + 1 0 1 0 | 10 + 1 0 1 1 | 11 + 1 1 0 0 | 12 + + +Setting the Timeouts +-------------------- + +The two jumpers JP2 (1-4) are used to determine the timeout parameters. +These two jumpers are normally left open. +Refer to the COM9026 Data Sheet for alternate configurations. + + +Configuring the PC500 for Star or Bus Topology +---------------------------------------------- + +The single jumper labeled JP6 is used to configure the PC500 board for +star or bus topology. +When the jumper is installed, the board may be used in a star network, when +it is removed, the board can be used in a bus topology. + + +Diagnostic LEDs +--------------- + +Two diagnostic LEDs are visible on the rear bracket of the board. +The green LED monitors the network activity: the red one shows the +board activity: + + Green | Status Red | Status + -------|------------------- ---------|------------------- + on | normal activity flash/on | data transfer + blink | reconfiguration off | no data transfer; + off | defective board or | incorrect memory or + | node ID is zero | I/O address + + +***************************************************************************** + +** SMC ** +PC710 (8-bit card) +------------------ + - from J.S. van Oosten <jvoosten@compiler.tdcnet.nl> + +Note: this data is gathered by experimenting and looking at info of other +cards. However, I'm sure I got 99% of the settings right. + +The SMC710 card resembles the PC270 card, but is much more basic (i.e. no +LEDs, RJ11 jacks, etc.) and 8 bit. Here's a little drawing: + + _______________________________________ + | +---------+ +---------+ |____ + | | S2 | | S1 | | + | +---------+ +---------+ | + | | + | +===+ __ | + | | R | | | X-tal ###___ + | | O | |__| ####__'| + | | M | || ### + | +===+ | + | | + | .. JP1 +----------+ | + | .. | big chip | | + | .. | 90C63 | | + | .. | | | + | .. +----------+ | + ------- ----------- + ||||||||||||||||||||| + +The row of jumpers at JP1 actually consists of 8 jumpers, (sometimes +labelled) the same as on the PC270, from top to bottom: EXT2, EXT1, ROM, +IRQ7, IRQ5, IRQ4, IRQ3, IRQ2 (gee, wonder what they would do? :-) ) + +S1 and S2 perform the same function as on the PC270, only their numbers +are swapped (S1 is the nodeaddress, S2 sets IO- and RAM-address). + +I know it works when connected to a PC110 type ARCnet board. + + +***************************************************************************** + +** Possibly SMC ** +LCS-8830(-T) (8 and 16-bit cards) +--------------------------------- + - from Mathias Katzer <mkatzer@HRZ.Uni-Bielefeld.DE> + - Marek Michalkiewicz <marekm@i17linuxb.ists.pwr.wroc.pl> says the + LCS-8830 is slightly different from LCS-8830-T. These are 8 bit, BUS + only (the JP0 jumper is hardwired), and BNC only. + +This is a LCS-8830-T made by SMC, I think ('SMC' only appears on one PLCC, +nowhere else, not even on the few Xeroxed sheets from the manual). + +SMC ARCnet Board Type LCS-8830-T + + ------------------------------------ + | | + | JP3 88 8 JP2 | + | ##### | \ | + | ##### ET1 ET2 ###| + | 8 ###| + | U3 SW 1 JP0 ###| Phone Jacks + | -- ###| + | | | | + | | | SW2 | + | | | | + | | | ##### | + | -- ##### #### BNC Connector + | #### + | 888888 JP1 | + | 234567 | + -- ------- + ||||||||||||||||||||||||||| + -------------------------- + + +SW1: DIP-Switches for Station Address +SW2: DIP-Switches for Memory Base and I/O Base addresses + +JP0: If closed, internal termination on (default open) +JP1: IRQ Jumpers +JP2: Boot-ROM enabled if closed +JP3: Jumpers for response timeout + +U3: Boot-ROM Socket + + +ET1 ET2 Response Time Idle Time Reconfiguration Time + + 78 86 840 + X 285 316 1680 + X 563 624 1680 + X X 1130 1237 1680 + +(X means closed jumper) + +(DIP-Switch downwards means "0") + +The station address is binary-coded with SW1. + +The I/O base address is coded with DIP-Switches 6,7 and 8 of SW2: + +Switches Base +678 Address +000 260-26f +100 290-29f +010 2e0-2ef +110 2f0-2ff +001 300-30f +101 350-35f +011 380-38f +111 3e0-3ef + + +DIP Switches 1-5 of SW2 encode the RAM and ROM Address Range: + +Switches RAM ROM +12345 Address Range Address Range +00000 C:0000-C:07ff C:2000-C:3fff +10000 C:0800-C:0fff +01000 C:1000-C:17ff +11000 C:1800-C:1fff +00100 C:4000-C:47ff C:6000-C:7fff +10100 C:4800-C:4fff +01100 C:5000-C:57ff +11100 C:5800-C:5fff +00010 C:C000-C:C7ff C:E000-C:ffff +10010 C:C800-C:Cfff +01010 C:D000-C:D7ff +11010 C:D800-C:Dfff +00110 D:0000-D:07ff D:2000-D:3fff +10110 D:0800-D:0fff +01110 D:1000-D:17ff +11110 D:1800-D:1fff +00001 D:4000-D:47ff D:6000-D:7fff +10001 D:4800-D:4fff +01001 D:5000-D:57ff +11001 D:5800-D:5fff +00101 D:8000-D:87ff D:A000-D:bfff +10101 D:8800-D:8fff +01101 D:9000-D:97ff +11101 D:9800-D:9fff +00011 D:C000-D:c7ff D:E000-D:ffff +10011 D:C800-D:cfff +01011 D:D000-D:d7ff +11011 D:D800-D:dfff +00111 E:0000-E:07ff E:2000-E:3fff +10111 E:0800-E:0fff +01111 E:1000-E:17ff +11111 E:1800-E:1fff + + +***************************************************************************** + +** PureData Corp ** +PDI507 (8-bit card) +-------------------- + - from Mark Rejhon <mdrejhon@magi.com> (slight modifications by Avery) + - Avery's note: I think PDI508 cards (but definitely NOT PDI508Plus cards) + are mostly the same as this. PDI508Plus cards appear to be mainly + software-configured. + +Jumpers: + There is a jumper array at the bottom of the card, near the edge + connector. This array is labelled J1. They control the IRQs and + something else. Put only one jumper on the IRQ pins. + + ETS1, ETS2 are for timing on very long distance networks. See the + more general information near the top of this file. + + There is a J2 jumper on two pins. A jumper should be put on them, + since it was already there when I got the card. I don't know what + this jumper is for though. + + There is a two-jumper array for J3. I don't know what it is for, + but there were already two jumpers on it when I got the card. It's + a six pin grid in a two-by-three fashion. The jumpers were + configured as follows: + + .-------. + o | o o | + :-------: ------> Accessible end of card with connectors + o | o o | in this direction -------> + `-------' + +Carl de Billy <CARL@carainfo.com> explains J3 and J4: + + J3 Diagram: + + .-------. + o | o o | + :-------: TWIST Technology + o | o o | + `-------' + .-------. + | o o | o + :-------: COAX Technology + | o o | o + `-------' + + - If using coax cable in a bus topology the J4 jumper must be removed; + place it on one pin. + + - If using bus topology with twisted pair wiring move the J3 + jumpers so they connect the middle pin and the pins closest to the RJ11 + Connectors. Also the J4 jumper must be removed; place it on one pin of + J4 jumper for storage. + + - If using star topology with twisted pair wiring move the J3 + jumpers so they connect the middle pin and the pins closest to the RJ11 + connectors. + + +DIP Switches: + + The DIP switches accessible on the accessible end of the card while + it is installed, is used to set the ARCnet address. There are 8 + switches. Use an address from 1 to 254. + + Switch No. + 12345678 ARCnet address + ----------------------------------------- + 00000000 FF (Don't use this!) + 00000001 FE + 00000010 FD + .... + 11111101 2 + 11111110 1 + 11111111 0 (Don't use this!) + + There is another array of eight DIP switches at the top of the + card. There are five labelled MS0-MS4 which seem to control the + memory address, and another three labelled IO0-IO2 which seem to + control the base I/O address of the card. + + This was difficult to test by trial and error, and the I/O addresses + are in a weird order. This was tested by setting the DIP switches, + rebooting the computer, and attempting to load ARCETHER at various + addresses (mostly between 0x200 and 0x400). The address that caused + the red transmit LED to blink, is the one that I thought works. + + Also, the address 0x3D0 seem to have a special meaning, since the + ARCETHER packet driver loaded fine, but without the red LED + blinking. I don't know what 0x3D0 is for though. I recommend using + an address of 0x300 since Windows may not like addresses below + 0x300. + + IO Switch No. + 210 I/O address + ------------------------------- + 111 0x260 + 110 0x290 + 101 0x2E0 + 100 0x2F0 + 011 0x300 + 010 0x350 + 001 0x380 + 000 0x3E0 + + The memory switches set a reserved address space of 0x1000 bytes + (0x100 segment units, or 4k). For example if I set an address of + 0xD000, it will use up addresses 0xD000 to 0xD100. + + The memory switches were tested by booting using QEMM386 stealth, + and using LOADHI to see what address automatically became excluded + from the upper memory regions, and then attempting to load ARCETHER + using these addresses. + + I recommend using an ARCnet memory address of 0xD000, and putting + the EMS page frame at 0xC000 while using QEMM stealth mode. That + way, you get contiguous high memory from 0xD100 almost all the way + the end of the megabyte. + + Memory Switch 0 (MS0) didn't seem to work properly when set to OFF + on my card. It could be malfunctioning on my card. Experiment with + it ON first, and if it doesn't work, set it to OFF. (It may be a + modifier for the 0x200 bit?) + + MS Switch No. + 43210 Memory address + -------------------------------- + 00001 0xE100 (guessed - was not detected by QEMM) + 00011 0xE000 (guessed - was not detected by QEMM) + 00101 0xDD00 + 00111 0xDC00 + 01001 0xD900 + 01011 0xD800 + 01101 0xD500 + 01111 0xD400 + 10001 0xD100 + 10011 0xD000 + 10101 0xCD00 + 10111 0xCC00 + 11001 0xC900 (guessed - crashes tested system) + 11011 0xC800 (guessed - crashes tested system) + 11101 0xC500 (guessed - crashes tested system) + 11111 0xC400 (guessed - crashes tested system) + + +***************************************************************************** + +** CNet Technology Inc. ** +120 Series (8-bit cards) +------------------------ + - from Juergen Seifert <seifert@htwm.de> + + +CNET TECHNOLOGY INC. (CNet) ARCNET 120A SERIES +============================================== + +This description has been written by Juergen Seifert <seifert@htwm.de> +using information from the following Original CNet Manual + + "ARCNET + USER'S MANUAL + for + CN120A + CN120AB + CN120TP + CN120ST + CN120SBT + P/N:12-01-0007 + Revision 3.00" + +ARCNET is a registered trademark of the Datapoint Corporation + +P/N 120A ARCNET 8 bit XT/AT Star +P/N 120AB ARCNET 8 bit XT/AT Bus +P/N 120TP ARCNET 8 bit XT/AT Twisted Pair +P/N 120ST ARCNET 8 bit XT/AT Star, Twisted Pair +P/N 120SBT ARCNET 8 bit XT/AT Star, Bus, Twisted Pair + + __________________________________________________________________ + | | + | ___| + | LED |___| + | ___| + | N | | ID7 + | o | | ID6 + | d | S | ID5 + | e | W | ID4 + | ___________________ A | 2 | ID3 + | | | d | | ID2 + | | | 1 2 3 4 5 6 7 8 d | | ID1 + | | | _________________ r |___| ID0 + | | 90C65 || SW1 | ____| + | JP 8 7 | ||_________________| | | + | |o|o| JP1 | | | J2 | + | |o|o| |oo| | | JP 1 1 1 | | + | ______________ | | 0 1 2 |____| + | | PROM | |___________________| |o|o|o| _____| + | > SOCKET | JP 6 5 4 3 2 |o|o|o| | J1 | + | |______________| |o|o|o|o|o| |o|o|o| |_____| + |_____ |o|o|o|o|o| ______________| + | | + |_____________________________________________| + +Legend: + +90C65 ARCNET Probe +S1 1-5: Base Memory Address Select + 6-8: Base I/O Address Select +S2 1-8: Node ID Select (ID0-ID7) +JP1 ROM Enable Select +JP2 IRQ2 +JP3 IRQ3 +JP4 IRQ4 +JP5 IRQ5 +JP6 IRQ7 +JP7/JP8 ET1, ET2 Timeout Parameters +JP10/JP11 Coax / Twisted Pair Select (CN120ST/SBT only) +JP12 Terminator Select (CN120AB/ST/SBT only) +J1 BNC RG62/U Connector (all except CN120TP) +J2 Two 6-position Telephone Jack (CN120TP/ST/SBT only) + +Setting one of the switches to Off means "1", On means "0". + + +Setting the Node ID +------------------- + +The eight switches in SW2 are used to set the node ID. Each node attached +to the network must have an unique node ID which must be different from 0. +Switch 1 (ID0) serves as the least significant bit (LSB). + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Label | Value + -------|-------|------- + 1 | ID0 | 1 + 2 | ID1 | 2 + 3 | ID2 | 4 + 4 | ID3 | 8 + 5 | ID4 | 16 + 6 | ID5 | 32 + 7 | ID6 | 64 + 8 | ID7 | 128 + +Some Examples: + + Switch | Hex | Decimal + 8 7 6 5 4 3 2 1 | Node ID | Node ID + ----------------|---------|--------- + 0 0 0 0 0 0 0 0 | not allowed + 0 0 0 0 0 0 0 1 | 1 | 1 + 0 0 0 0 0 0 1 0 | 2 | 2 + 0 0 0 0 0 0 1 1 | 3 | 3 + . . . | | + 0 1 0 1 0 1 0 1 | 55 | 85 + . . . | | + 1 0 1 0 1 0 1 0 | AA | 170 + . . . | | + 1 1 1 1 1 1 0 1 | FD | 253 + 1 1 1 1 1 1 1 0 | FE | 254 + 1 1 1 1 1 1 1 1 | FF | 255 + + +Setting the I/O Base Address +---------------------------- + +The last three switches in switch block SW1 are used to select one +of eight possible I/O Base addresses using the following table + + + Switch | Hex I/O + 6 7 8 | Address + ------------|-------- + ON ON ON | 260 + OFF ON ON | 290 + ON OFF ON | 2E0 (Manufacturer's default) + OFF OFF ON | 2F0 + ON ON OFF | 300 + OFF ON OFF | 350 + ON OFF OFF | 380 + OFF OFF OFF | 3E0 + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The memory buffer (RAM) requires 2K. The base of this buffer can be +located in any of eight positions. The address of the Boot Prom is +memory base + 8K or memory base + 0x2000. +Switches 1-5 of switch block SW1 select the Memory Base address. + + Switch | Hex RAM | Hex ROM + 1 2 3 4 5 | Address | Address *) + --------------------|---------|----------- + ON ON ON ON ON | C0000 | C2000 + ON ON OFF ON ON | C4000 | C6000 + ON ON ON OFF ON | CC000 | CE000 + ON ON OFF OFF ON | D0000 | D2000 (Manufacturer's default) + ON ON ON ON OFF | D4000 | D6000 + ON ON OFF ON OFF | D8000 | DA000 + ON ON ON OFF OFF | DC000 | DE000 + ON ON OFF OFF OFF | E0000 | E2000 + +*) To enable the Boot ROM install the jumper JP1 + +Note: Since the switches 1 and 2 are always set to ON it may be possible + that they can be used to add an offset of 2K, 4K or 6K to the base + address, but this feature is not documented in the manual and I + haven't tested it yet. + + +Setting the Interrupt Line +-------------------------- + +To select a hardware interrupt level install one (only one!) of the jumpers +JP2, JP3, JP4, JP5, JP6. JP2 is the default. + + Jumper | IRQ + -------|----- + 2 | 2 + 3 | 3 + 4 | 4 + 5 | 5 + 6 | 7 + + +Setting the Internal Terminator on CN120AB/TP/SBT +-------------------------------------------------- + +The jumper JP12 is used to enable the internal terminator. + + ----- + 0 | 0 | + ----- ON | | ON + | 0 | | 0 | + | | OFF ----- OFF + | 0 | 0 + ----- + Terminator Terminator + disabled enabled + + +Selecting the Connector Type on CN120ST/SBT +------------------------------------------- + + JP10 JP11 JP10 JP11 + ----- ----- + 0 0 | 0 | | 0 | + ----- ----- | | | | + | 0 | | 0 | | 0 | | 0 | + | | | | ----- ----- + | 0 | | 0 | 0 0 + ----- ----- + Coaxial Cable Twisted Pair Cable + (Default) + + +Setting the Timeout Parameters +------------------------------ + +The jumpers labeled EXT1 and EXT2 are used to determine the timeout +parameters. These two jumpers are normally left open. + + + +***************************************************************************** + +** CNet Technology Inc. ** +160 Series (16-bit cards) +------------------------- + - from Juergen Seifert <seifert@htwm.de> + +CNET TECHNOLOGY INC. (CNet) ARCNET 160A SERIES +============================================== + +This description has been written by Juergen Seifert <seifert@htwm.de> +using information from the following Original CNet Manual + + "ARCNET + USER'S MANUAL + for + CN160A + CN160AB + CN160TP + P/N:12-01-0006 + Revision 3.00" + +ARCNET is a registered trademark of the Datapoint Corporation + +P/N 160A ARCNET 16 bit XT/AT Star +P/N 160AB ARCNET 16 bit XT/AT Bus +P/N 160TP ARCNET 16 bit XT/AT Twisted Pair + + ___________________________________________________________________ + < _________________________ ___| + > |oo| JP2 | | LED |___| + < |oo| JP1 | 9026 | LED |___| + > |_________________________| ___| + < N | | ID7 + > 1 o | | ID6 + < 1 2 3 4 5 6 7 8 9 0 d | S | ID5 + > _______________ _____________________ e | W | ID4 + < | PROM | | SW1 | A | 2 | ID3 + > > SOCKET | |_____________________| d | | ID2 + < |_______________| | IO-Base | MEM | d | | ID1 + > r |___| ID0 + < ____| + > | | + < | J1 | + > | | + < |____| + > 1 1 1 1 | + < 3 4 5 6 7 JP 8 9 0 1 2 3 | + > |o|o|o|o|o| |o|o|o|o|o|o| | + < |o|o|o|o|o| __ |o|o|o|o|o|o| ___________| + > | | | + <____________| |_______________________________________| + +Legend: + +9026 ARCNET Probe +SW1 1-6: Base I/O Address Select + 7-10: Base Memory Address Select +SW2 1-8: Node ID Select (ID0-ID7) +JP1/JP2 ET1, ET2 Timeout Parameters +JP3-JP13 Interrupt Select +J1 BNC RG62/U Connector (CN160A/AB only) +J1 Two 6-position Telephone Jack (CN160TP only) +LED + +Setting one of the switches to Off means "1", On means "0". + + +Setting the Node ID +------------------- + +The eight switches in SW2 are used to set the node ID. Each node attached +to the network must have an unique node ID which must be different from 0. +Switch 1 (ID0) serves as the least significant bit (LSB). + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Label | Value + -------|-------|------- + 1 | ID0 | 1 + 2 | ID1 | 2 + 3 | ID2 | 4 + 4 | ID3 | 8 + 5 | ID4 | 16 + 6 | ID5 | 32 + 7 | ID6 | 64 + 8 | ID7 | 128 + +Some Examples: + + Switch | Hex | Decimal + 8 7 6 5 4 3 2 1 | Node ID | Node ID + ----------------|---------|--------- + 0 0 0 0 0 0 0 0 | not allowed + 0 0 0 0 0 0 0 1 | 1 | 1 + 0 0 0 0 0 0 1 0 | 2 | 2 + 0 0 0 0 0 0 1 1 | 3 | 3 + . . . | | + 0 1 0 1 0 1 0 1 | 55 | 85 + . . . | | + 1 0 1 0 1 0 1 0 | AA | 170 + . . . | | + 1 1 1 1 1 1 0 1 | FD | 253 + 1 1 1 1 1 1 1 0 | FE | 254 + 1 1 1 1 1 1 1 1 | FF | 255 + + +Setting the I/O Base Address +---------------------------- + +The first six switches in switch block SW1 are used to select the I/O Base +address using the following table: + + Switch | Hex I/O + 1 2 3 4 5 6 | Address + ------------------------|-------- + OFF ON ON OFF OFF ON | 260 + OFF ON OFF ON ON OFF | 290 + OFF ON OFF OFF OFF ON | 2E0 (Manufacturer's default) + OFF ON OFF OFF OFF OFF | 2F0 + OFF OFF ON ON ON ON | 300 + OFF OFF ON OFF ON OFF | 350 + OFF OFF OFF ON ON ON | 380 + OFF OFF OFF OFF OFF ON | 3E0 + +Note: Other IO-Base addresses seem to be selectable, but only the above + combinations are documented. + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The switches 7-10 of switch block SW1 are used to select the Memory +Base address of the RAM (2K) and the PROM. + + Switch | Hex RAM | Hex ROM + 7 8 9 10 | Address | Address + ----------------|---------|----------- + OFF OFF ON ON | C0000 | C8000 + OFF OFF ON OFF | D0000 | D8000 (Default) + OFF OFF OFF ON | E0000 | E8000 + +Note: Other MEM-Base addresses seem to be selectable, but only the above + combinations are documented. + + +Setting the Interrupt Line +-------------------------- + +To select a hardware interrupt level install one (only one!) of the jumpers +JP3 through JP13 using the following table: + + Jumper | IRQ + -------|----------------- + 3 | 14 + 4 | 15 + 5 | 12 + 6 | 11 + 7 | 10 + 8 | 3 + 9 | 4 + 10 | 5 + 11 | 6 + 12 | 7 + 13 | 2 (=9) Default! + +Note: - Do not use JP11=IRQ6, it may conflict with your Floppy Disk + Controller + - Use JP3=IRQ14 only, if you don't have an IDE-, MFM-, or RLL- + Hard Disk, it may conflict with their controllers + + +Setting the Timeout Parameters +------------------------------ + +The jumpers labeled JP1 and JP2 are used to determine the timeout +parameters. These two jumpers are normally left open. + + +***************************************************************************** + +** Lantech ** +8-bit card, unknown model +------------------------- + - from Vlad Lungu <vlungu@ugal.ro> - his e-mail address seemed broken at + the time I tried to reach him. Sorry Vlad, if you didn't get my reply. + + ________________________________________________________________ + | 1 8 | + | ___________ __| + | | SW1 | LED |__| + | |__________| | + | ___| + | _____________________ |S | 8 + | | | |W | + | | | |2 | + | | | |__| 1 + | | UM9065L | |o| JP4 ____|____ + | | | |o| | CN | + | | | |________| + | | | | + | |___________________| | + | | + | | + | _____________ | + | | | | + | | PROM | |ooooo| JP6 | + | |____________| |ooooo| | + |_____________ _ _| + |____________________________________________| |__| + + +UM9065L : ARCnet Controller + +SW 1 : Shared Memory Address and I/O Base + + ON=0 + + 12345|Memory Address + -----|-------------- + 00001| D4000 + 00010| CC000 + 00110| D0000 + 01110| D1000 + 01101| D9000 + 10010| CC800 + 10011| DC800 + 11110| D1800 + +It seems that the bits are considered in reverse order. Also, you must +observe that some of those addresses are unusual and I didn't probe them; I +used a memory dump in DOS to identify them. For the 00000 configuration and +some others that I didn't write here the card seems to conflict with the +video card (an S3 GENDAC). I leave the full decoding of those addresses to +you. + + 678| I/O Address + ---|------------ + 000| 260 + 001| failed probe + 010| 2E0 + 011| 380 + 100| 290 + 101| 350 + 110| failed probe + 111| 3E0 + +SW 2 : Node ID (binary coded) + +JP 4 : Boot PROM enable CLOSE - enabled + OPEN - disabled + +JP 6 : IRQ set (ONLY ONE jumper on 1-5 for IRQ 2-6) + + +***************************************************************************** + +** Acer ** +8-bit card, Model 5210-003 +-------------------------- + - from Vojtech Pavlik <vojtech@suse.cz> using portions of the existing + arcnet-hardware file. + +This is a 90C26 based card. Its configuration seems similar to the SMC +PC100, but has some additional jumpers I don't know the meaning of. + + __ + | | + ___________|__|_________________________ + | | | | + | | BNC | | + | |______| ___| + | _____________________ |___ + | | | | + | | Hybrid IC | | + | | | o|o J1 | + | |_____________________| 8|8 | + | 8|8 J5 | + | o|o | + | 8|8 | + |__ 8|8 | + (|__| LED o|o | + | 8|8 | + | 8|8 J15 | + | | + | _____ | + | | | _____ | + | | | | | ___| + | | | | | | + | _____ | ROM | | UFS | | + | | | | | | | | + | | | ___ | | | | | + | | | | | |__.__| |__.__| | + | | NCR | |XTL| _____ _____ | + | | | |___| | | | | | + | |90C26| | | | | | + | | | | RAM | | UFS | | + | | | J17 o|o | | | | | + | | | J16 o|o | | | | | + | |__.__| |__.__| |__.__| | + | ___ | + | | |8 | + | |SW2| | + | | | | + | |___|1 | + | ___ | + | | |10 J18 o|o | + | | | o|o | + | |SW1| o|o | + | | | J21 o|o | + | |___|1 | + | | + |____________________________________| + + +Legend: + +90C26 ARCNET Chip +XTL 20 MHz Crystal +SW1 1-6 Base I/O Address Select + 7-10 Memory Address Select +SW2 1-8 Node ID Select (ID0-ID7) +J1-J5 IRQ Select +J6-J21 Unknown (Probably extra timeouts & ROM enable ...) +LED1 Activity LED +BNC Coax connector (STAR ARCnet) +RAM 2k of SRAM +ROM Boot ROM socket +UFS Unidentified Flying Sockets + + +Setting the Node ID +------------------- + +The eight switches in SW2 are used to set the node ID. Each node attached +to the network must have an unique node ID which must not be 0. +Switch 1 (ID0) serves as the least significant bit (LSB). + +Setting one of the switches to OFF means "1", ON means "0". + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Value + -------|------- + 1 | 1 + 2 | 2 + 3 | 4 + 4 | 8 + 5 | 16 + 6 | 32 + 7 | 64 + 8 | 128 + +Don't set this to 0 or 255; these values are reserved. + + +Setting the I/O Base Address +---------------------------- + +The switches 1 to 6 of switch block SW1 are used to select one +of 32 possible I/O Base addresses using the following tables + + | Hex + Switch | Value + -------|------- + 1 | 200 + 2 | 100 + 3 | 80 + 4 | 40 + 5 | 20 + 6 | 10 + +The I/O address is sum of all switches set to "1". Remember that +the I/O address space bellow 0x200 is RESERVED for mainboard, so +switch 1 should be ALWAYS SET TO OFF. + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The memory buffer (RAM) requires 2K. The base of this buffer can be +located in any of sixteen positions. However, the addresses below +A0000 are likely to cause system hang because there's main RAM. + +Jumpers 7-10 of switch block SW1 select the Memory Base address. + + Switch | Hex RAM + 7 8 9 10 | Address + ----------------|--------- + OFF OFF OFF OFF | F0000 (conflicts with main BIOS) + OFF OFF OFF ON | E0000 + OFF OFF ON OFF | D0000 + OFF OFF ON ON | C0000 (conflicts with video BIOS) + OFF ON OFF OFF | B0000 (conflicts with mono video) + OFF ON OFF ON | A0000 (conflicts with graphics) + + +Setting the Interrupt Line +-------------------------- + +Jumpers 1-5 of the jumper block J1 control the IRQ level. ON means +shorted, OFF means open. + + Jumper | IRQ + 1 2 3 4 5 | + ---------------------------- + ON OFF OFF OFF OFF | 7 + OFF ON OFF OFF OFF | 5 + OFF OFF ON OFF OFF | 4 + OFF OFF OFF ON OFF | 3 + OFF OFF OFF OFF ON | 2 + + +Unknown jumpers & sockets +------------------------- + +I know nothing about these. I just guess that J16&J17 are timeout +jumpers and maybe one of J18-J21 selects ROM. Also J6-J10 and +J11-J15 are connecting IRQ2-7 to some pins on the UFSs. I can't +guess the purpose. + + +***************************************************************************** + +** Datapoint? ** +LAN-ARC-8, an 8-bit card +------------------------ + - from Vojtech Pavlik <vojtech@suse.cz> + +This is another SMC 90C65-based ARCnet card. I couldn't identify the +manufacturer, but it might be DataPoint, because the card has the +original arcNet logo in its upper right corner. + + _______________________________________________________ + | _________ | + | | SW2 | ON arcNet | + | |_________| OFF ___| + | _____________ 1 ______ 8 | | 8 + | | | SW1 | XTAL | ____________ | S | + | > RAM (2k) | |______|| | | W | + | |_____________| | H | | 3 | + | _________|_____ y | |___| 1 + | _________ | | |b | | + | |_________| | | |r | | + | | SMC | |i | | + | | 90C65| |d | | + | _________ | | | | | + | | SW1 | ON | | |I | | + | |_________| OFF |_________|_____/C | _____| + | 1 8 | | | |___ + | ______________ | | | BNC |___| + | | | |____________| |_____| + | > EPROM SOCKET | _____________ | + | |______________| |_____________| | + | ______________| + | | + |________________________________________| + +Legend: + +90C65 ARCNET Chip +SW1 1-5: Base Memory Address Select + 6-8: Base I/O Address Select +SW2 1-8: Node ID Select +SW3 1-5: IRQ Select + 6-7: Extra Timeout + 8 : ROM Enable +BNC Coax connector +XTAL 20 MHz Crystal + + +Setting the Node ID +------------------- + +The eight switches in SW3 are used to set the node ID. Each node attached +to the network must have an unique node ID which must not be 0. +Switch 1 serves as the least significant bit (LSB). + +Setting one of the switches to Off means "1", On means "0". + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Value + -------|------- + 1 | 1 + 2 | 2 + 3 | 4 + 4 | 8 + 5 | 16 + 6 | 32 + 7 | 64 + 8 | 128 + + +Setting the I/O Base Address +---------------------------- + +The last three switches in switch block SW1 are used to select one +of eight possible I/O Base addresses using the following table + + + Switch | Hex I/O + 6 7 8 | Address + ------------|-------- + ON ON ON | 260 + OFF ON ON | 290 + ON OFF ON | 2E0 (Manufacturer's default) + OFF OFF ON | 2F0 + ON ON OFF | 300 + OFF ON OFF | 350 + ON OFF OFF | 380 + OFF OFF OFF | 3E0 + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The memory buffer (RAM) requires 2K. The base of this buffer can be +located in any of eight positions. The address of the Boot Prom is +memory base + 0x2000. +Jumpers 3-5 of switch block SW1 select the Memory Base address. + + Switch | Hex RAM | Hex ROM + 1 2 3 4 5 | Address | Address *) + --------------------|---------|----------- + ON ON ON ON ON | C0000 | C2000 + ON ON OFF ON ON | C4000 | C6000 + ON ON ON OFF ON | CC000 | CE000 + ON ON OFF OFF ON | D0000 | D2000 (Manufacturer's default) + ON ON ON ON OFF | D4000 | D6000 + ON ON OFF ON OFF | D8000 | DA000 + ON ON ON OFF OFF | DC000 | DE000 + ON ON OFF OFF OFF | E0000 | E2000 + +*) To enable the Boot ROM set the switch 8 of switch block SW3 to position ON. + +The switches 1 and 2 probably add 0x0800 and 0x1000 to RAM base address. + + +Setting the Interrupt Line +-------------------------- + +Switches 1-5 of the switch block SW3 control the IRQ level. + + Jumper | IRQ + 1 2 3 4 5 | + ---------------------------- + ON OFF OFF OFF OFF | 3 + OFF ON OFF OFF OFF | 4 + OFF OFF ON OFF OFF | 5 + OFF OFF OFF ON OFF | 7 + OFF OFF OFF OFF ON | 2 + + +Setting the Timeout Parameters +------------------------------ + +The switches 6-7 of the switch block SW3 are used to determine the timeout +parameters. These two switches are normally left in the OFF position. + + +***************************************************************************** + +** Topware ** +8-bit card, TA-ARC/10 +------------------------- + - from Vojtech Pavlik <vojtech@suse.cz> + +This is another very similar 90C65 card. Most of the switches and jumpers +are the same as on other clones. + + _____________________________________________________________________ +| ___________ | | ______ | +| |SW2 NODE ID| | | | XTAL | | +| |___________| | Hybrid IC | |______| | +| ___________ | | __| +| |SW1 MEM+I/O| |_________________________| LED1|__|) +| |___________| 1 2 | +| J3 |o|o| TIMEOUT ______| +| ______________ |o|o| | | +| | | ___________________ | RJ | +| > EPROM SOCKET | | \ |------| +|J2 |______________| | | | | +||o| | | |______| +||o| ROM ENABLE | SMC | _________ | +| _____________ | 90C65 | |_________| _____| +| | | | | | |___ +| > RAM (2k) | | | | BNC |___| +| |_____________| | | |_____| +| |____________________| | +| ________ IRQ 2 3 4 5 7 ___________ | +||________| |o|o|o|o|o| |___________| | +|________ J1|o|o|o|o|o| ______________| + | | + |_____________________________________________| + +Legend: + +90C65 ARCNET Chip +XTAL 20 MHz Crystal +SW1 1-5 Base Memory Address Select + 6-8 Base I/O Address Select +SW2 1-8 Node ID Select (ID0-ID7) +J1 IRQ Select +J2 ROM Enable +J3 Extra Timeout +LED1 Activity LED +BNC Coax connector (BUS ARCnet) +RJ Twisted Pair Connector (daisy chain) + + +Setting the Node ID +------------------- + +The eight switches in SW2 are used to set the node ID. Each node attached to +the network must have an unique node ID which must not be 0. Switch 1 (ID0) +serves as the least significant bit (LSB). + +Setting one of the switches to Off means "1", On means "0". + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Label | Value + -------|-------|------- + 1 | ID0 | 1 + 2 | ID1 | 2 + 3 | ID2 | 4 + 4 | ID3 | 8 + 5 | ID4 | 16 + 6 | ID5 | 32 + 7 | ID6 | 64 + 8 | ID7 | 128 + +Setting the I/O Base Address +---------------------------- + +The last three switches in switch block SW1 are used to select one +of eight possible I/O Base addresses using the following table: + + + Switch | Hex I/O + 6 7 8 | Address + ------------|-------- + ON ON ON | 260 (Manufacturer's default) + OFF ON ON | 290 + ON OFF ON | 2E0 + OFF OFF ON | 2F0 + ON ON OFF | 300 + OFF ON OFF | 350 + ON OFF OFF | 380 + OFF OFF OFF | 3E0 + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The memory buffer (RAM) requires 2K. The base of this buffer can be +located in any of eight positions. The address of the Boot Prom is +memory base + 0x2000. +Jumpers 3-5 of switch block SW1 select the Memory Base address. + + Switch | Hex RAM | Hex ROM + 1 2 3 4 5 | Address | Address *) + --------------------|---------|----------- + ON ON ON ON ON | C0000 | C2000 + ON ON OFF ON ON | C4000 | C6000 (Manufacturer's default) + ON ON ON OFF ON | CC000 | CE000 + ON ON OFF OFF ON | D0000 | D2000 + ON ON ON ON OFF | D4000 | D6000 + ON ON OFF ON OFF | D8000 | DA000 + ON ON ON OFF OFF | DC000 | DE000 + ON ON OFF OFF OFF | E0000 | E2000 + +*) To enable the Boot ROM short the jumper J2. + +The jumpers 1 and 2 probably add 0x0800 and 0x1000 to RAM address. + + +Setting the Interrupt Line +-------------------------- + +Jumpers 1-5 of the jumper block J1 control the IRQ level. ON means +shorted, OFF means open. + + Jumper | IRQ + 1 2 3 4 5 | + ---------------------------- + ON OFF OFF OFF OFF | 2 + OFF ON OFF OFF OFF | 3 + OFF OFF ON OFF OFF | 4 + OFF OFF OFF ON OFF | 5 + OFF OFF OFF OFF ON | 7 + + +Setting the Timeout Parameters +------------------------------ + +The jumpers J3 are used to set the timeout parameters. These two +jumpers are normally left open. + + +***************************************************************************** + +** Thomas-Conrad ** +Model #500-6242-0097 REV A (8-bit card) +--------------------------------------- + - from Lars Karlsson <100617.3473@compuserve.com> + + ________________________________________________________ + | ________ ________ |_____ + | |........| |........| | + | |________| |________| ___| + | SW 3 SW 1 | | + | Base I/O Base Addr. Station | | + | address | | + | ______ switch | | + | | | | | + | | | |___| + | | | ______ |___._ + | |______| |______| ____| BNC + | Jumper- _____| Connector + | Main chip block _ __| ' + | | | | RJ Connector + | |_| | with 110 Ohm + | |__ Terminator + | ___________ __| + | |...........| | RJ-jack + | |...........| _____ | (unused) + | |___________| |_____| |__ + | Boot PROM socket IRQ-jumpers |_ Diagnostic + |________ __ _| LED (red) + | | | | | | | | | | | | | | | | | | | | | | + | | | | | | | | | | | | | | | | | | | | |________| + | + | + +And here are the settings for some of the switches and jumpers on the cards. + + + I/O + + 1 2 3 4 5 6 7 8 + +2E0----- 0 0 0 1 0 0 0 1 +2F0----- 0 0 0 1 0 0 0 0 +300----- 0 0 0 0 1 1 1 1 +350----- 0 0 0 0 1 1 1 0 + +"0" in the above example means switch is off "1" means that it is on. + + + ShMem address. + + 1 2 3 4 5 6 7 8 + +CX00--0 0 1 1 | | | +DX00--0 0 1 0 | +X000--------- 1 1 | +X400--------- 1 0 | +X800--------- 0 1 | +XC00--------- 0 0 +ENHANCED----------- 1 +COMPATIBLE--------- 0 + + + IRQ + + + 3 4 5 7 2 + . . . . . + . . . . . + + +There is a DIP-switch with 8 switches, used to set the shared memory address +to be used. The first 6 switches set the address, the 7th doesn't have any +function, and the 8th switch is used to select "compatible" or "enhanced". +When I got my two cards, one of them had this switch set to "enhanced". That +card didn't work at all, it wasn't even recognized by the driver. The other +card had this switch set to "compatible" and it behaved absolutely normally. I +guess that the switch on one of the cards, must have been changed accidentally +when the card was taken out of its former host. The question remains +unanswered, what is the purpose of the "enhanced" position? + +[Avery's note: "enhanced" probably either disables shared memory (use IO +ports instead) or disables IO ports (use memory addresses instead). This +varies by the type of card involved. I fail to see how either of these +enhance anything. Send me more detailed information about this mode, or +just use "compatible" mode instead.] + + +***************************************************************************** + +** Waterloo Microsystems Inc. ?? ** +8-bit card (C) 1985 +------------------- + - from Robert Michael Best <rmb117@cs.usask.ca> + +[Avery's note: these don't work with my driver for some reason. These cards +SEEM to have settings similar to the PDI508Plus, which is +software-configured and doesn't work with my driver either. The "Waterloo +chip" is a boot PROM, probably designed specifically for the University of +Waterloo. If you have any further information about this card, please +e-mail me.] + +The probe has not been able to detect the card on any of the J2 settings, +and I tried them again with the "Waterloo" chip removed. + + _____________________________________________________________________ +| \/ \/ ___ __ __ | +| C4 C4 |^| | M || ^ ||^| | +| -- -- |_| | 5 || || | C3 | +| \/ \/ C10 |___|| ||_| | +| C4 C4 _ _ | | ?? | +| -- -- | \/ || | | +| | || | | +| | || C1 | | +| | || | \/ _____| +| | C6 || | C9 | |___ +| | || | -- | BNC |___| +| | || | >C7| |_____| +| | || | | +| __ __ |____||_____| 1 2 3 6 | +|| ^ | >C4| |o|o|o|o|o|o| J2 >C4| | +|| | |o|o|o|o|o|o| | +|| C2 | >C4| >C4| | +|| | >C8| | +|| | 2 3 4 5 6 7 IRQ >C4| | +||_____| |o|o|o|o|o|o| J3 | +|_______ |o|o|o|o|o|o| _______________| + | | + |_____________________________________________| + +C1 -- "COM9026 + SMC 8638" + In a chip socket. + +C2 -- "@Copyright + Waterloo Microsystems Inc. + 1985" + In a chip Socket with info printed on a label covering a round window + showing the circuit inside. (The window indicates it is an EPROM chip.) + +C3 -- "COM9032 + SMC 8643" + In a chip socket. + +C4 -- "74LS" + 9 total no sockets. + +M5 -- "50006-136 + 20.000000 MHZ + MTQ-T1-S3 + 0 M-TRON 86-40" + Metallic case with 4 pins, no socket. + +C6 -- "MOSTEK@TC8643 + MK6116N-20 + MALAYSIA" + No socket. + +C7 -- No stamp or label but in a 20 pin chip socket. + +C8 -- "PAL10L8CN + 8623" + In a 20 pin socket. + +C9 -- "PAl16R4A-2CN + 8641" + In a 20 pin socket. + +C10 -- "M8640 + NMC + 9306N" + In an 8 pin socket. + +?? -- Some components on a smaller board and attached with 20 pins all + along the side closest to the BNC connector. The are coated in a dark + resin. + +On the board there are two jumper banks labeled J2 and J3. The +manufacturer didn't put a J1 on the board. The two boards I have both +came with a jumper box for each bank. + +J2 -- Numbered 1 2 3 4 5 6. + 4 and 5 are not stamped due to solder points. + +J3 -- IRQ 2 3 4 5 6 7 + +The board itself has a maple leaf stamped just above the irq jumpers +and "-2 46-86" beside C2. Between C1 and C6 "ASS 'Y 300163" and "@1986 +CORMAN CUSTOM ELECTRONICS CORP." stamped just below the BNC connector. +Below that "MADE IN CANADA" + + +***************************************************************************** + +** No Name ** +8-bit cards, 16-bit cards +------------------------- + - from Juergen Seifert <seifert@htwm.de> + +NONAME 8-BIT ARCNET +=================== + +I have named this ARCnet card "NONAME", since there is no name of any +manufacturer on the Installation manual nor on the shipping box. The only +hint to the existence of a manufacturer at all is written in copper, +it is "Made in Taiwan" + +This description has been written by Juergen Seifert <seifert@htwm.de> +using information from the Original + "ARCnet Installation Manual" + + + ________________________________________________________________ + | |STAR| BUS| T/P| | + | |____|____|____| | + | _____________________ | + | | | | + | | | | + | | | | + | | SMC | | + | | | | + | | COM90C65 | | + | | | | + | | | | + | |__________-__________| | + | _____| + | _______________ | CN | + | | PROM | |_____| + | > SOCKET | | + | |_______________| 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 | + | _______________ _______________ | + | |o|o|o|o|o|o|o|o| | SW1 || SW2 || + | |o|o|o|o|o|o|o|o| |_______________||_______________|| + |___ 2 3 4 5 7 E E R Node ID IOB__|__MEM____| + | \ IRQ / T T O | + |__________________1_2_M______________________| + +Legend: + +COM90C65: ARCnet Probe +S1 1-8: Node ID Select +S2 1-3: I/O Base Address Select + 4-6: Memory Base Address Select + 7-8: RAM Offset Select +ET1, ET2 Extended Timeout Select +ROM ROM Enable Select +CN RG62 Coax Connector +STAR| BUS | T/P Three fields for placing a sign (colored circle) + indicating the topology of the card + +Setting one of the switches to Off means "1", On means "0". + + +Setting the Node ID +------------------- + +The eight switches in group SW1 are used to set the node ID. +Each node attached to the network must have an unique node ID which +must be different from 0. +Switch 8 serves as the least significant bit (LSB). + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Value + -------|------- + 8 | 1 + 7 | 2 + 6 | 4 + 5 | 8 + 4 | 16 + 3 | 32 + 2 | 64 + 1 | 128 + +Some Examples: + + Switch | Hex | Decimal + 1 2 3 4 5 6 7 8 | Node ID | Node ID + ----------------|---------|--------- + 0 0 0 0 0 0 0 0 | not allowed + 0 0 0 0 0 0 0 1 | 1 | 1 + 0 0 0 0 0 0 1 0 | 2 | 2 + 0 0 0 0 0 0 1 1 | 3 | 3 + . . . | | + 0 1 0 1 0 1 0 1 | 55 | 85 + . . . | | + 1 0 1 0 1 0 1 0 | AA | 170 + . . . | | + 1 1 1 1 1 1 0 1 | FD | 253 + 1 1 1 1 1 1 1 0 | FE | 254 + 1 1 1 1 1 1 1 1 | FF | 255 + + +Setting the I/O Base Address +---------------------------- + +The first three switches in switch group SW2 are used to select one +of eight possible I/O Base addresses using the following table + + Switch | Hex I/O + 1 2 3 | Address + ------------|-------- + ON ON ON | 260 + ON ON OFF | 290 + ON OFF ON | 2E0 (Manufacturer's default) + ON OFF OFF | 2F0 + OFF ON ON | 300 + OFF ON OFF | 350 + OFF OFF ON | 380 + OFF OFF OFF | 3E0 + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The memory buffer requires 2K of a 16K block of RAM. The base of this +16K block can be located in any of eight positions. +Switches 4-6 of switch group SW2 select the Base of the 16K block. +Within that 16K address space, the buffer may be assigned any one of four +positions, determined by the offset, switches 7 and 8 of group SW2. + + Switch | Hex RAM | Hex ROM + 4 5 6 7 8 | Address | Address *) + -----------|---------|----------- + 0 0 0 0 0 | C0000 | C2000 + 0 0 0 0 1 | C0800 | C2000 + 0 0 0 1 0 | C1000 | C2000 + 0 0 0 1 1 | C1800 | C2000 + | | + 0 0 1 0 0 | C4000 | C6000 + 0 0 1 0 1 | C4800 | C6000 + 0 0 1 1 0 | C5000 | C6000 + 0 0 1 1 1 | C5800 | C6000 + | | + 0 1 0 0 0 | CC000 | CE000 + 0 1 0 0 1 | CC800 | CE000 + 0 1 0 1 0 | CD000 | CE000 + 0 1 0 1 1 | CD800 | CE000 + | | + 0 1 1 0 0 | D0000 | D2000 (Manufacturer's default) + 0 1 1 0 1 | D0800 | D2000 + 0 1 1 1 0 | D1000 | D2000 + 0 1 1 1 1 | D1800 | D2000 + | | + 1 0 0 0 0 | D4000 | D6000 + 1 0 0 0 1 | D4800 | D6000 + 1 0 0 1 0 | D5000 | D6000 + 1 0 0 1 1 | D5800 | D6000 + | | + 1 0 1 0 0 | D8000 | DA000 + 1 0 1 0 1 | D8800 | DA000 + 1 0 1 1 0 | D9000 | DA000 + 1 0 1 1 1 | D9800 | DA000 + | | + 1 1 0 0 0 | DC000 | DE000 + 1 1 0 0 1 | DC800 | DE000 + 1 1 0 1 0 | DD000 | DE000 + 1 1 0 1 1 | DD800 | DE000 + | | + 1 1 1 0 0 | E0000 | E2000 + 1 1 1 0 1 | E0800 | E2000 + 1 1 1 1 0 | E1000 | E2000 + 1 1 1 1 1 | E1800 | E2000 + +*) To enable the 8K Boot PROM install the jumper ROM. + The default is jumper ROM not installed. + + +Setting Interrupt Request Lines (IRQ) +------------------------------------- + +To select a hardware interrupt level set one (only one!) of the jumpers +IRQ2, IRQ3, IRQ4, IRQ5 or IRQ7. The manufacturer's default is IRQ2. + + +Setting the Timeouts +-------------------- + +The two jumpers labeled ET1 and ET2 are used to determine the timeout +parameters (response and reconfiguration time). Every node in a network +must be set to the same timeout values. + + ET1 ET2 | Response Time (us) | Reconfiguration Time (ms) + --------|--------------------|-------------------------- + Off Off | 78 | 840 (Default) + Off On | 285 | 1680 + On Off | 563 | 1680 + On On | 1130 | 1680 + +On means jumper installed, Off means jumper not installed + + +NONAME 16-BIT ARCNET +==================== + +The manual of my 8-Bit NONAME ARCnet Card contains another description +of a 16-Bit Coax / Twisted Pair Card. This description is incomplete, +because there are missing two pages in the manual booklet. (The table +of contents reports pages ... 2-9, 2-11, 2-12, 3-1, ... but inside +the booklet there is a different way of counting ... 2-9, 2-10, A-1, +(empty page), 3-1, ..., 3-18, A-1 (again), A-2) +Also the picture of the board layout is not as good as the picture of +8-Bit card, because there isn't any letter like "SW1" written to the +picture. +Should somebody have such a board, please feel free to complete this +description or to send a mail to me! + +This description has been written by Juergen Seifert <seifert@htwm.de> +using information from the Original + "ARCnet Installation Manual" + + + ___________________________________________________________________ + < _________________ _________________ | + > | SW? || SW? | | + < |_________________||_________________| | + > ____________________ | + < | | | + > | | | + < | | | + > | | | + < | | | + > | | | + < | | | + > |____________________| | + < ____| + > ____________________ | | + < | | | J1 | + > | < | | + < |____________________| ? ? ? ? ? ? |____| + > |o|o|o|o|o|o| | + < |o|o|o|o|o|o| | + > | + < __ ___________| + > | | | + <____________| |_______________________________________| + + +Setting one of the switches to Off means "1", On means "0". + + +Setting the Node ID +------------------- + +The eight switches in group SW2 are used to set the node ID. +Each node attached to the network must have an unique node ID which +must be different from 0. +Switch 8 serves as the least significant bit (LSB). + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Value + -------|------- + 8 | 1 + 7 | 2 + 6 | 4 + 5 | 8 + 4 | 16 + 3 | 32 + 2 | 64 + 1 | 128 + +Some Examples: + + Switch | Hex | Decimal + 1 2 3 4 5 6 7 8 | Node ID | Node ID + ----------------|---------|--------- + 0 0 0 0 0 0 0 0 | not allowed + 0 0 0 0 0 0 0 1 | 1 | 1 + 0 0 0 0 0 0 1 0 | 2 | 2 + 0 0 0 0 0 0 1 1 | 3 | 3 + . . . | | + 0 1 0 1 0 1 0 1 | 55 | 85 + . . . | | + 1 0 1 0 1 0 1 0 | AA | 170 + . . . | | + 1 1 1 1 1 1 0 1 | FD | 253 + 1 1 1 1 1 1 1 0 | FE | 254 + 1 1 1 1 1 1 1 1 | FF | 255 + + +Setting the I/O Base Address +---------------------------- + +The first three switches in switch group SW1 are used to select one +of eight possible I/O Base addresses using the following table + + Switch | Hex I/O + 3 2 1 | Address + ------------|-------- + ON ON ON | 260 + ON ON OFF | 290 + ON OFF ON | 2E0 (Manufacturer's default) + ON OFF OFF | 2F0 + OFF ON ON | 300 + OFF ON OFF | 350 + OFF OFF ON | 380 + OFF OFF OFF | 3E0 + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The memory buffer requires 2K of a 16K block of RAM. The base of this +16K block can be located in any of eight positions. +Switches 6-8 of switch group SW1 select the Base of the 16K block. +Within that 16K address space, the buffer may be assigned any one of four +positions, determined by the offset, switches 4 and 5 of group SW1. + + Switch | Hex RAM | Hex ROM + 8 7 6 5 4 | Address | Address + -----------|---------|----------- + 0 0 0 0 0 | C0000 | C2000 + 0 0 0 0 1 | C0800 | C2000 + 0 0 0 1 0 | C1000 | C2000 + 0 0 0 1 1 | C1800 | C2000 + | | + 0 0 1 0 0 | C4000 | C6000 + 0 0 1 0 1 | C4800 | C6000 + 0 0 1 1 0 | C5000 | C6000 + 0 0 1 1 1 | C5800 | C6000 + | | + 0 1 0 0 0 | CC000 | CE000 + 0 1 0 0 1 | CC800 | CE000 + 0 1 0 1 0 | CD000 | CE000 + 0 1 0 1 1 | CD800 | CE000 + | | + 0 1 1 0 0 | D0000 | D2000 (Manufacturer's default) + 0 1 1 0 1 | D0800 | D2000 + 0 1 1 1 0 | D1000 | D2000 + 0 1 1 1 1 | D1800 | D2000 + | | + 1 0 0 0 0 | D4000 | D6000 + 1 0 0 0 1 | D4800 | D6000 + 1 0 0 1 0 | D5000 | D6000 + 1 0 0 1 1 | D5800 | D6000 + | | + 1 0 1 0 0 | D8000 | DA000 + 1 0 1 0 1 | D8800 | DA000 + 1 0 1 1 0 | D9000 | DA000 + 1 0 1 1 1 | D9800 | DA000 + | | + 1 1 0 0 0 | DC000 | DE000 + 1 1 0 0 1 | DC800 | DE000 + 1 1 0 1 0 | DD000 | DE000 + 1 1 0 1 1 | DD800 | DE000 + | | + 1 1 1 0 0 | E0000 | E2000 + 1 1 1 0 1 | E0800 | E2000 + 1 1 1 1 0 | E1000 | E2000 + 1 1 1 1 1 | E1800 | E2000 + + +Setting Interrupt Request Lines (IRQ) +------------------------------------- + +?????????????????????????????????????? + + +Setting the Timeouts +-------------------- + +?????????????????????????????????????? + + +***************************************************************************** + +** No Name ** +8-bit cards ("Made in Taiwan R.O.C.") +----------- + - from Vojtech Pavlik <vojtech@suse.cz> + +I have named this ARCnet card "NONAME", since I got only the card with +no manual at all and the only text identifying the manufacturer is +"MADE IN TAIWAN R.O.C" printed on the card. + + ____________________________________________________________ + | 1 2 3 4 5 6 7 8 | + | |o|o| JP1 o|o|o|o|o|o|o|o| ON | + | + o|o|o|o|o|o|o|o| ___| + | _____________ o|o|o|o|o|o|o|o| OFF _____ | | ID7 + | | | SW1 | | | | ID6 + | > RAM (2k) | ____________________ | H | | S | ID5 + | |_____________| | || y | | W | ID4 + | | || b | | 2 | ID3 + | | || r | | | ID2 + | | || i | | | ID1 + | | 90C65 || d | |___| ID0 + | SW3 | || | | + | |o|o|o|o|o|o|o|o| ON | || I | | + | |o|o|o|o|o|o|o|o| | || C | | + | |o|o|o|o|o|o|o|o| OFF |____________________|| | _____| + | 1 2 3 4 5 6 7 8 | | | |___ + | ______________ | | | BNC |___| + | | | |_____| |_____| + | > EPROM SOCKET | | + | |______________| | + | ______________| + | | + |_____________________________________________| + +Legend: + +90C65 ARCNET Chip +SW1 1-5: Base Memory Address Select + 6-8: Base I/O Address Select +SW2 1-8: Node ID Select (ID0-ID7) +SW3 1-5: IRQ Select + 6-7: Extra Timeout + 8 : ROM Enable +JP1 Led connector +BNC Coax connector + +Although the jumpers SW1 and SW3 are marked SW, not JP, they are jumpers, not +switches. + +Setting the jumpers to ON means connecting the upper two pins, off the bottom +two - or - in case of IRQ setting, connecting none of them at all. + +Setting the Node ID +------------------- + +The eight switches in SW2 are used to set the node ID. Each node attached +to the network must have an unique node ID which must not be 0. +Switch 1 (ID0) serves as the least significant bit (LSB). + +Setting one of the switches to Off means "1", On means "0". + +The node ID is the sum of the values of all switches set to "1" +These values are: + + Switch | Label | Value + -------|-------|------- + 1 | ID0 | 1 + 2 | ID1 | 2 + 3 | ID2 | 4 + 4 | ID3 | 8 + 5 | ID4 | 16 + 6 | ID5 | 32 + 7 | ID6 | 64 + 8 | ID7 | 128 + +Some Examples: + + Switch | Hex | Decimal + 8 7 6 5 4 3 2 1 | Node ID | Node ID + ----------------|---------|--------- + 0 0 0 0 0 0 0 0 | not allowed + 0 0 0 0 0 0 0 1 | 1 | 1 + 0 0 0 0 0 0 1 0 | 2 | 2 + 0 0 0 0 0 0 1 1 | 3 | 3 + . . . | | + 0 1 0 1 0 1 0 1 | 55 | 85 + . . . | | + 1 0 1 0 1 0 1 0 | AA | 170 + . . . | | + 1 1 1 1 1 1 0 1 | FD | 253 + 1 1 1 1 1 1 1 0 | FE | 254 + 1 1 1 1 1 1 1 1 | FF | 255 + + +Setting the I/O Base Address +---------------------------- + +The last three switches in switch block SW1 are used to select one +of eight possible I/O Base addresses using the following table + + + Switch | Hex I/O + 6 7 8 | Address + ------------|-------- + ON ON ON | 260 + OFF ON ON | 290 + ON OFF ON | 2E0 (Manufacturer's default) + OFF OFF ON | 2F0 + ON ON OFF | 300 + OFF ON OFF | 350 + ON OFF OFF | 380 + OFF OFF OFF | 3E0 + + +Setting the Base Memory (RAM) buffer Address +-------------------------------------------- + +The memory buffer (RAM) requires 2K. The base of this buffer can be +located in any of eight positions. The address of the Boot Prom is +memory base + 0x2000. +Jumpers 3-5 of jumper block SW1 select the Memory Base address. + + Switch | Hex RAM | Hex ROM + 1 2 3 4 5 | Address | Address *) + --------------------|---------|----------- + ON ON ON ON ON | C0000 | C2000 + ON ON OFF ON ON | C4000 | C6000 + ON ON ON OFF ON | CC000 | CE000 + ON ON OFF OFF ON | D0000 | D2000 (Manufacturer's default) + ON ON ON ON OFF | D4000 | D6000 + ON ON OFF ON OFF | D8000 | DA000 + ON ON ON OFF OFF | DC000 | DE000 + ON ON OFF OFF OFF | E0000 | E2000 + +*) To enable the Boot ROM set the jumper 8 of jumper block SW3 to position ON. + +The jumpers 1 and 2 probably add 0x0800, 0x1000 and 0x1800 to RAM adders. + +Setting the Interrupt Line +-------------------------- + +Jumpers 1-5 of the jumper block SW3 control the IRQ level. + + Jumper | IRQ + 1 2 3 4 5 | + ---------------------------- + ON OFF OFF OFF OFF | 2 + OFF ON OFF OFF OFF | 3 + OFF OFF ON OFF OFF | 4 + OFF OFF OFF ON OFF | 5 + OFF OFF OFF OFF ON | 7 + + +Setting the Timeout Parameters +------------------------------ + +The jumpers 6-7 of the jumper block SW3 are used to determine the timeout +parameters. These two jumpers are normally left in the OFF position. + + +***************************************************************************** + +** No Name ** +(Generic Model 9058) +-------------------- + - from Andrew J. Kroll <ag784@freenet.buffalo.edu> + - Sorry this sat in my to-do box for so long, Andrew! (yikes - over a + year!) + _____ + | < + | .---' + ________________________________________________________________ | | + | | SW2 | | | + | ___________ |_____________| | | + | | | 1 2 3 4 5 6 ___| | + | > 6116 RAM | _________ 8 | | | + | |___________| |20MHzXtal| 7 | | | + | |_________| __________ 6 | S | | + | 74LS373 | |- 5 | W | | + | _________ | E |- 4 | | | + | >_______| ______________|..... P |- 3 | 3 | | + | | | : O |- 2 | | | + | | | : X |- 1 |___| | + | ________________ | | : Y |- | | + | | SW1 | | SL90C65 | : |- | | + | |________________| | | : B |- | | + | 1 2 3 4 5 6 7 8 | | : O |- | | + | |_________o____|..../ A |- _______| | + | ____________________ | R |- | |------, + | | | | D |- | BNC | # | + | > 2764 PROM SOCKET | |__________|- |_______|------' + | |____________________| _________ | | + | >________| <- 74LS245 | | + | | | + |___ ______________| | + |H H H H H H H H H H H H H H H H H H H H H H H| | | + |U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U| | | + \| +Legend: + +SL90C65 ARCNET Controller / Transceiver /Logic +SW1 1-5: IRQ Select + 6: ET1 + 7: ET2 + 8: ROM ENABLE +SW2 1-3: Memory Buffer/PROM Address + 3-6: I/O Address Map +SW3 1-8: Node ID Select +BNC BNC RG62/U Connection + *I* have had success using RG59B/U with *NO* terminators! + What gives?! + +SW1: Timeouts, Interrupt and ROM +--------------------------------- + +To select a hardware interrupt level set one (only one!) of the dip switches +up (on) SW1...(switches 1-5) +IRQ3, IRQ4, IRQ5, IRQ7, IRQ2. The Manufacturer's default is IRQ2. + +The switches on SW1 labeled EXT1 (switch 6) and EXT2 (switch 7) +are used to determine the timeout parameters. These two dip switches +are normally left off (down). + + To enable the 8K Boot PROM position SW1 switch 8 on (UP) labeled ROM. + The default is jumper ROM not installed. + + +Setting the I/O Base Address +---------------------------- + +The last three switches in switch group SW2 are used to select one +of eight possible I/O Base addresses using the following table + + + Switch | Hex I/O + 4 5 6 | Address + -------|-------- + 0 0 0 | 260 + 0 0 1 | 290 + 0 1 0 | 2E0 (Manufacturer's default) + 0 1 1 | 2F0 + 1 0 0 | 300 + 1 0 1 | 350 + 1 1 0 | 380 + 1 1 1 | 3E0 + + +Setting the Base Memory Address (RAM & ROM) +------------------------------------------- + +The memory buffer requires 2K of a 16K block of RAM. The base of this +16K block can be located in any of eight positions. +Switches 1-3 of switch group SW2 select the Base of the 16K block. +(0 = DOWN, 1 = UP) +I could, however, only verify two settings... + + Switch| Hex RAM | Hex ROM + 1 2 3 | Address | Address + ------|---------|----------- + 0 0 0 | E0000 | E2000 + 0 0 1 | D0000 | D2000 (Manufacturer's default) + 0 1 0 | ????? | ????? + 0 1 1 | ????? | ????? + 1 0 0 | ????? | ????? + 1 0 1 | ????? | ????? + 1 1 0 | ????? | ????? + 1 1 1 | ????? | ????? + + +Setting the Node ID +------------------- + +The eight switches in group SW3 are used to set the node ID. +Each node attached to the network must have an unique node ID which +must be different from 0. +Switch 1 serves as the least significant bit (LSB). +switches in the DOWN position are OFF (0) and in the UP position are ON (1) + +The node ID is the sum of the values of all switches set to "1" +These values are: + Switch | Value + -------|------- + 1 | 1 + 2 | 2 + 3 | 4 + 4 | 8 + 5 | 16 + 6 | 32 + 7 | 64 + 8 | 128 + +Some Examples: + + Switch# | Hex | Decimal +8 7 6 5 4 3 2 1 | Node ID | Node ID +----------------|---------|--------- +0 0 0 0 0 0 0 0 | not allowed <-. +0 0 0 0 0 0 0 1 | 1 | 1 | +0 0 0 0 0 0 1 0 | 2 | 2 | +0 0 0 0 0 0 1 1 | 3 | 3 | + . . . | | | +0 1 0 1 0 1 0 1 | 55 | 85 | + . . . | | + Don't use 0 or 255! +1 0 1 0 1 0 1 0 | AA | 170 | + . . . | | | +1 1 1 1 1 1 0 1 | FD | 253 | +1 1 1 1 1 1 1 0 | FE | 254 | +1 1 1 1 1 1 1 1 | FF | 255 <-' + + +***************************************************************************** + +** Tiara ** +(model unknown) +------------------------- + - from Christoph Lameter <christoph@lameter.com> + + +Here is information about my card as far as I could figure it out: +----------------------------------------------- tiara +Tiara LanCard of Tiara Computer Systems. + ++----------------------------------------------+ +! ! Transmitter Unit ! ! +! +------------------+ ------- +! MEM Coax Connector +! ROM 7654321 <- I/O ------- +! : : +--------+ ! +! : : ! 90C66LJ! +++ +! : : ! ! !D Switch to set +! : : ! ! !I the Nodenumber +! : : +--------+ !P +! !++ +! 234567 <- IRQ ! ++------------!!!!!!!!!!!!!!!!!!!!!!!!--------+ + !!!!!!!!!!!!!!!!!!!!!!!! + +0 = Jumper Installed +1 = Open + +Top Jumper line Bit 7 = ROM Enable 654=Memory location 321=I/O + +Settings for Memory Location (Top Jumper Line) +456 Address selected +000 C0000 +001 C4000 +010 CC000 +011 D0000 +100 D4000 +101 D8000 +110 DC000 +111 E0000 + +Settings for I/O Address (Top Jumper Line) +123 Port +000 260 +001 290 +010 2E0 +011 2F0 +100 300 +101 350 +110 380 +111 3E0 + +Settings for IRQ Selection (Lower Jumper Line) +234567 +011111 IRQ 2 +101111 IRQ 3 +110111 IRQ 4 +111011 IRQ 5 +111110 IRQ 7 + +***************************************************************************** + + +Other Cards +----------- + +I have no information on other models of ARCnet cards at the moment. Please +send any and all info to: + apenwarr@worldvisions.ca + +Thanks. diff --git a/Documentation/networking/arcnet.txt b/Documentation/networking/arcnet.txt new file mode 100644 index 000000000000..770fc41a78e8 --- /dev/null +++ b/Documentation/networking/arcnet.txt @@ -0,0 +1,555 @@ +---------------------------------------------------------------------------- +NOTE: See also arcnet-hardware.txt in this directory for jumper-setting +and cabling information if you're like many of us and didn't happen to get a +manual with your ARCnet card. +---------------------------------------------------------------------------- + +Since no one seems to listen to me otherwise, perhaps a poem will get your +attention: + This driver's getting fat and beefy, + But my cat is still named Fifi. + +Hmm, I think I'm allowed to call that a poem, even though it's only two +lines. Hey, I'm in Computer Science, not English. Give me a break. + +The point is: I REALLY REALLY REALLY REALLY REALLY want to hear from you if +you test this and get it working. Or if you don't. Or anything. + +ARCnet 0.32 ALPHA first made it into the Linux kernel 1.1.80 - this was +nice, but after that even FEWER people started writing to me because they +didn't even have to install the patch. <sigh> + +Come on, be a sport! Send me a success report! + +(hey, that was even better than my original poem... this is getting bad!) + + +-------- +WARNING: +-------- + +If you don't e-mail me about your success/failure soon, I may be forced to +start SINGING. And we don't want that, do we? + +(You know, it might be argued that I'm pushing this point a little too much. +If you think so, why not flame me in a quick little e-mail? Please also +include the type of card(s) you're using, software, size of network, and +whether it's working or not.) + +My e-mail address is: apenwarr@worldvisions.ca + + +--------------------------------------------------------------------------- + + +These are the ARCnet drivers for Linux. + + +This new release (2.91) has been put together by David Woodhouse +<dwmw2@cam.ac.uk>, in an attempt to tidy up the driver after adding support +for yet another chipset. Now the generic support has been separated from the +individual chipset drivers, and the source files aren't quite so packed with +#ifdefs! I've changed this file a bit, but kept it in the first person from +Avery, because I didn't want to completely rewrite it. + +The previous release resulted from many months of on-and-off effort from me +(Avery Pennarun), many bug reports/fixes and suggestions from others, and in +particular a lot of input and coding from Tomasz Motylewski. Starting with +ARCnet 2.10 ALPHA, Tomasz's all-new-and-improved RFC1051 support has been +included and seems to be working fine! + + +Where do I discuss these drivers? +--------------------------------- + +Tomasz has been so kind as to set up a new and improved mailing list. +Subscribe by sending a message with the BODY "subscribe linux-arcnet YOUR +REAL NAME" to listserv@tichy.ch.uj.edu.pl. Then, to submit messages to the +list, mail to linux-arcnet@tichy.ch.uj.edu.pl. + +There are archives of the mailing list at: + http://tichy.ch.uj.edu.pl/lists/linux-arcnet + +The people on linux-net@vger.kernel.org have also been known to be very +helpful, especially when we're talking about ALPHA Linux kernels that may or +may not work right in the first place. + + +Other Drivers and Info +---------------------- + +You can try my ARCNET page on the World Wide Web at: + http://www.worldvisions.ca/~apenwarr/arcnet/ + +Also, SMC (one of the companies that makes ARCnet cards) has a WWW site you +might be interested in, which includes several drivers for various cards +including ARCnet. Try: + http://www.smc.com/ + +Performance Technologies makes various network software that supports +ARCnet: + http://www.perftech.com/ or ftp to ftp.perftech.com. + +Novell makes a networking stack for DOS which includes ARCnet drivers. Try +FTPing to ftp.novell.com. + +You can get the Crynwr packet driver collection (including arcether.com, the +one you'll want to use with ARCnet cards) from +oak.oakland.edu:/simtel/msdos/pktdrvr. It won't work perfectly on a 386+ +without patches, though, and also doesn't like several cards. Fixed +versions are available on my WWW page, or via e-mail if you don't have WWW +access. + + +Installing the Driver +--------------------- + +All you will need to do in order to install the driver is: + make config + (be sure to choose ARCnet in the network devices + and at least one chipset driver.) + make clean + make zImage + +If you obtained this ARCnet package as an upgrade to the ARCnet driver in +your current kernel, you will need to first copy arcnet.c over the one in +the linux/drivers/net directory. + +You will know the driver is installed properly if you get some ARCnet +messages when you reboot into the new Linux kernel. + +There are four chipset options: + + 1. Standard ARCnet COM90xx chipset. + +This is the normal ARCnet card, which you've probably got. This is the only +chipset driver which will autoprobe if not told where the card is. +It following options on the command line: + com90xx=[<io>[,<irq>[,<shmem>]]][,<name>] | <name> + +If you load the chipset support as a module, the options are: + io=<io> irq=<irq> shmem=<shmem> device=<name> + +To disable the autoprobe, just specify "com90xx=" on the kernel command line. +To specify the name alone, but allow autoprobe, just put "com90xx=<name>" + + 2. ARCnet COM20020 chipset. + +This is the new chipset from SMC with support for promiscuous mode (packet +sniffing), extra diagnostic information, etc. Unfortunately, there is no +sensible method of autoprobing for these cards. You must specify the I/O +address on the kernel command line. +The command line options are: + com20020=<io>[,<irq>[,<node_ID>[,backplane[,CKP[,timeout]]]]][,name] + +If you load the chipset support as a module, the options are: + io=<io> irq=<irq> node=<node_ID> backplane=<backplane> clock=<CKP> + timeout=<timeout> device=<name> + +The COM20020 chipset allows you to set the node ID in software, overriding the +default which is still set in DIP switches on the card. If you don't have the +COM20020 data sheets, and you don't know what the other three options refer +to, then they won't interest you - forget them. + + 3. ARCnet COM90xx chipset in IO-mapped mode. + +This will also work with the normal ARCnet cards, but doesn't use the shared +memory. It performs less well than the above driver, but is provided in case +you have a card which doesn't support shared memory, or (strangely) in case +you have so many ARCnet cards in your machine that you run out of shmem slots. +If you don't give the IO address on the kernel command line, then the driver +will not find the card. +The command line options are: + com90io=<io>[,<irq>][,<name>] + +If you load the chipset support as a module, the options are: + io=<io> irq=<irq> device=<name> + + 4. ARCnet RIM I cards. + +These are COM90xx chips which are _completely_ memory mapped. The support for +these is not tested. If you have one, please mail the author with a success +report. All options must be specified, except the device name. +Command line options: + arcrimi=<shmem>,<irq>,<node_ID>[,<name>] + +If you load the chipset support as a module, the options are: + shmem=<shmem> irq=<irq> node=<node_ID> device=<name> + + +Loadable Module Support +----------------------- + +Configure and rebuild Linux. When asked, answer 'm' to "Generic ARCnet +support" and to support for your ARCnet chipset if you want to use the +loadable module. You can also say 'y' to "Generic ARCnet support" and 'm' +to the chipset support if you wish. + + make config + make clean + make zImage + make modules + +If you're using a loadable module, you need to use insmod to load it, and +you can specify various characteristics of your card on the command +line. (In recent versions of the driver, autoprobing is much more reliable +and works as a module, so most of this is now unnecessary.) + +For example: + cd /usr/src/linux/modules + insmod arcnet.o + insmod com90xx.o + insmod com20020.o io=0x2e0 device=eth1 + + +Using the Driver +---------------- + +If you build your kernel with ARCnet COM90xx support included, it should +probe for your card automatically when you boot. If you use a different +chipset driver complied into the kernel, you must give the necessary options +on the kernel command line, as detailed above. + +Go read the NET-2-HOWTO and ETHERNET-HOWTO for Linux; they should be +available where you picked up this driver. Think of your ARCnet as a +souped-up (or down, as the case may be) Ethernet card. + +By the way, be sure to change all references from "eth0" to "arc0" in the +HOWTOs. Remember that ARCnet isn't a "true" Ethernet, and the device name +is DIFFERENT. + + +Multiple Cards in One Computer +------------------------------ + +Linux has pretty good support for this now, but since I've been busy, the +ARCnet driver has somewhat suffered in this respect. COM90xx support, if +compiled into the kernel, will (try to) autodetect all the installed cards. + +If you have other cards, with support compiled into the kernel, then you can +just repeat the options on the kernel command line, e.g.: +LILO: linux com20020=0x2e0 com20020=0x380 com90io=0x260 + +If you have the chipset support built as a loadable module, then you need to +do something like this: + insmod -o arc0 com90xx + insmod -o arc1 com20020 io=0x2e0 + insmod -o arc2 com90xx +The ARCnet drivers will now sort out their names automatically. + + +How do I get it to work with...? +-------------------------------- + +NFS: Should be fine linux->linux, just pretend you're using Ethernet cards. + oak.oakland.edu:/simtel/msdos/nfs has some nice DOS clients. There + is also a DOS-based NFS server called SOSS. It doesn't multitask + quite the way Linux does (actually, it doesn't multitask AT ALL) but + you never know what you might need. + + With AmiTCP (and possibly others), you may need to set the following + options in your Amiga nfstab: MD 1024 MR 1024 MW 1024 + (Thanks to Christian Gottschling <ferksy@indigo.tng.oche.de> + for this.) + + Probably these refer to maximum NFS data/read/write block sizes. I + don't know why the defaults on the Amiga didn't work; write to me if + you know more. + +DOS: If you're using the freeware arcether.com, you might want to install + the driver patch from my web page. It helps with PC/TCP, and also + can get arcether to load if it timed out too quickly during + initialization. In fact, if you use it on a 386+ you REALLY need + the patch, really. + +Windows: See DOS :) Trumpet Winsock works fine with either the Novell or + Arcether client, assuming you remember to load winpkt of course. + +LAN Manager and Windows for Workgroups: These programs use protocols that + are incompatible with the Internet standard. They try to pretend + the cards are Ethernet, and confuse everyone else on the network. + + However, v2.00 and higher of the Linux ARCnet driver supports this + protocol via the 'arc0e' device. See the section on "Multiprotocol + Support" for more information. + + Using the freeware Samba server and clients for Linux, you can now + interface quite nicely with TCP/IP-based WfWg or Lan Manager + networks. + +Windows 95: Tools are included with Win95 that let you use either the LANMAN + style network drivers (NDIS) or Novell drivers (ODI) to handle your + ARCnet packets. If you use ODI, you'll need to use the 'arc0' + device with Linux. If you use NDIS, then try the 'arc0e' device. + See the "Multiprotocol Support" section below if you need arc0e, + you're completely insane, and/or you need to build some kind of + hybrid network that uses both encapsulation types. + +OS/2: I've been told it works under Warp Connect with an ARCnet driver from + SMC. You need to use the 'arc0e' interface for this. If you get + the SMC driver to work with the TCP/IP stuff included in the + "normal" Warp Bonus Pack, let me know. + + ftp.microsoft.com also has a freeware "Lan Manager for OS/2" client + which should use the same protocol as WfWg does. I had no luck + installing it under Warp, however. Please mail me with any results. + +NetBSD/AmiTCP: These use an old version of the Internet standard ARCnet + protocol (RFC1051) which is compatible with the Linux driver v2.10 + ALPHA and above using the arc0s device. (See "Multiprotocol ARCnet" + below.) ** Newer versions of NetBSD apparently support RFC1201. + + +Using Multiprotocol ARCnet +-------------------------- + +The ARCnet driver v2.10 ALPHA supports three protocols, each on its own +"virtual network device": + + arc0 - RFC1201 protocol, the official Internet standard which just + happens to be 100% compatible with Novell's TRXNET driver. + Version 1.00 of the ARCnet driver supported _only_ this + protocol. arc0 is the fastest of the three protocols (for + whatever reason), and allows larger packets to be used + because it supports RFC1201 "packet splitting" operations. + Unless you have a specific need to use a different protocol, + I strongly suggest that you stick with this one. + + arc0e - "Ethernet-Encapsulation" which sends packets over ARCnet + that are actually a lot like Ethernet packets, including the + 6-byte hardware addresses. This protocol is compatible with + Microsoft's NDIS ARCnet driver, like the one in WfWg and + LANMAN. Because the MTU of 493 is actually smaller than the + one "required" by TCP/IP (576), there is a chance that some + network operations will not function properly. The Linux + TCP/IP layer can compensate in most cases, however, by + automatically fragmenting the TCP/IP packets to make them + fit. arc0e also works slightly more slowly than arc0, for + reasons yet to be determined. (Probably it's the smaller + MTU that does it.) + + arc0s - The "[s]imple" RFC1051 protocol is the "previous" Internet + standard that is completely incompatible with the new + standard. Some software today, however, continues to + support the old standard (and only the old standard) + including NetBSD and AmiTCP. RFC1051 also does not support + RFC1201's packet splitting, and the MTU of 507 is still + smaller than the Internet "requirement," so it's quite + possible that you may run into problems. It's also slower + than RFC1201 by about 25%, for the same reason as arc0e. + + The arc0s support was contributed by Tomasz Motylewski + and modified somewhat by me. Bugs are probably my fault. + +You can choose not to compile arc0e and arc0s into the driver if you want - +this will save you a bit of memory and avoid confusion when eg. trying to +use the "NFS-root" stuff in recent Linux kernels. + +The arc0e and arc0s devices are created automatically when you first +ifconfig the arc0 device. To actually use them, though, you need to also +ifconfig the other virtual devices you need. There are a number of ways you +can set up your network then: + + +1. Single Protocol. + + This is the simplest way to configure your network: use just one of the + two available protocols. As mentioned above, it's a good idea to use + only arc0 unless you have a good reason (like some other software, ie. + WfWg, that only works with arc0e). + + If you need only arc0, then the following commands should get you going: + ifconfig arc0 MY.IP.ADD.RESS + route add MY.IP.ADD.RESS arc0 + route add -net SUB.NET.ADD.RESS arc0 + [add other local routes here] + + If you need arc0e (and only arc0e), it's a little different: + ifconfig arc0 MY.IP.ADD.RESS + ifconfig arc0e MY.IP.ADD.RESS + route add MY.IP.ADD.RESS arc0e + route add -net SUB.NET.ADD.RESS arc0e + + arc0s works much the same way as arc0e. + + +2. More than one protocol on the same wire. + + Now things start getting confusing. To even try it, you may need to be + partly crazy. Here's what *I* did. :) Note that I don't include arc0s in + my home network; I don't have any NetBSD or AmiTCP computers, so I only + use arc0s during limited testing. + + I have three computers on my home network; two Linux boxes (which prefer + RFC1201 protocol, for reasons listed above), and one XT that can't run + Linux but runs the free Microsoft LANMAN Client instead. + + Worse, one of the Linux computers (freedom) also has a modem and acts as + a router to my Internet provider. The other Linux box (insight) also has + its own IP address and needs to use freedom as its default gateway. The + XT (patience), however, does not have its own Internet IP address and so + I assigned it one on a "private subnet" (as defined by RFC1597). + + To start with, take a simple network with just insight and freedom. + Insight needs to: + - talk to freedom via RFC1201 (arc0) protocol, because I like it + more and it's faster. + - use freedom as its Internet gateway. + + That's pretty easy to do. Set up insight like this: + ifconfig arc0 insight + route add insight arc0 + route add freedom arc0 /* I would use the subnet here (like I said + to to in "single protocol" above), + but the rest of the subnet + unfortunately lies across the PPP + link on freedom, which confuses + things. */ + route add default gw freedom + + And freedom gets configured like so: + ifconfig arc0 freedom + route add freedom arc0 + route add insight arc0 + /* and default gateway is configured by pppd */ + + Great, now insight talks to freedom directly on arc0, and sends packets + to the Internet through freedom. If you didn't know how to do the above, + you should probably stop reading this section now because it only gets + worse. + + Now, how do I add patience into the network? It will be using LANMAN + Client, which means I need the arc0e device. It needs to be able to talk + to both insight and freedom, and also use freedom as a gateway to the + Internet. (Recall that patience has a "private IP address" which won't + work on the Internet; that's okay, I configured Linux IP masquerading on + freedom for this subnet). + + So patience (necessarily; I don't have another IP number from my + provider) has an IP address on a different subnet than freedom and + insight, but needs to use freedom as an Internet gateway. Worse, most + DOS networking programs, including LANMAN, have braindead networking + schemes that rely completely on the netmask and a 'default gateway' to + determine how to route packets. This means that to get to freedom or + insight, patience WILL send through its default gateway, regardless of + the fact that both freedom and insight (courtesy of the arc0e device) + could understand a direct transmission. + + I compensate by giving freedom an extra IP address - aliased 'gatekeeper' + - that is on my private subnet, the same subnet that patience is on. I + then define gatekeeper to be the default gateway for patience. + + To configure freedom (in addition to the commands above): + ifconfig arc0e gatekeeper + route add gatekeeper arc0e + route add patience arc0e + + This way, freedom will send all packets for patience through arc0e, + giving its IP address as gatekeeper (on the private subnet). When it + talks to insight or the Internet, it will use its "freedom" Internet IP + address. + + You will notice that we haven't configured the arc0e device on insight. + This would work, but is not really necessary, and would require me to + assign insight another special IP number from my private subnet. Since + both insight and patience are using freedom as their default gateway, the + two can already talk to each other. + + It's quite fortunate that I set things up like this the first time (cough + cough) because it's really handy when I boot insight into DOS. There, it + runs the Novell ODI protocol stack, which only works with RFC1201 ARCnet. + In this mode it would be impossible for insight to communicate directly + with patience, since the Novell stack is incompatible with Microsoft's + Ethernet-Encap. Without changing any settings on freedom or patience, I + simply set freedom as the default gateway for insight (now in DOS, + remember) and all the forwarding happens "automagically" between the two + hosts that would normally not be able to communicate at all. + + For those who like diagrams, I have created two "virtual subnets" on the + same physical ARCnet wire. You can picture it like this: + + + [RFC1201 NETWORK] [ETHER-ENCAP NETWORK] + (registered Internet subnet) (RFC1597 private subnet) + + (IP Masquerade) + /---------------\ * /---------------\ + | | * | | + | +-Freedom-*-Gatekeeper-+ | + | | | * | | + \-------+-------/ | * \-------+-------/ + | | | + Insight | Patience + (Internet) + + + +It works: what now? +------------------- + +Send mail describing your setup, preferably including driver version, kernel +version, ARCnet card model, CPU type, number of systems on your network, and +list of software in use to me at the following address: + apenwarr@worldvisions.ca + +I do send (sometimes automated) replies to all messages I receive. My email +can be weird (and also usually gets forwarded all over the place along the +way to me), so if you don't get a reply within a reasonable time, please +resend. + + +It doesn't work: what now? +-------------------------- + +Do the same as above, but also include the output of the ifconfig and route +commands, as well as any pertinent log entries (ie. anything that starts +with "arcnet:" and has shown up since the last reboot) in your mail. + +If you want to try fixing it yourself (I strongly recommend that you mail me +about the problem first, since it might already have been solved) you may +want to try some of the debug levels available. For heavy testing on +D_DURING or more, it would be a REALLY good idea to kill your klogd daemon +first! D_DURING displays 4-5 lines for each packet sent or received. D_TX, +D_RX, and D_SKB actually DISPLAY each packet as it is sent or received, +which is obviously quite big. + +Starting with v2.40 ALPHA, the autoprobe routines have changed +significantly. In particular, they won't tell you why the card was not +found unless you turn on the D_INIT_REASONS debugging flag. + +Once the driver is running, you can run the arcdump shell script (available +from me or in the full ARCnet package, if you have it) as root to list the +contents of the arcnet buffers at any time. To make any sense at all out of +this, you should grab the pertinent RFCs. (some are listed near the top of +arcnet.c). arcdump assumes your card is at 0xD0000. If it isn't, edit the +script. + +Buffers 0 and 1 are used for receiving, and Buffers 2 and 3 are for sending. +Ping-pong buffers are implemented both ways. + +If your debug level includes D_DURING and you did NOT define SLOW_XMIT_COPY, +the buffers are cleared to a constant value of 0x42 every time the card is +reset (which should only happen when you do an ifconfig up, or when Linux +decides that the driver is broken). During a transmit, unused parts of the +buffer will be cleared to 0x42 as well. This is to make it easier to figure +out which bytes are being used by a packet. + +You can change the debug level without recompiling the kernel by typing: + ifconfig arc0 down metric 1xxx + /etc/rc.d/rc.inet1 +where "xxx" is the debug level you want. For example, "metric 1015" would put +you at debug level 15. Debug level 7 is currently the default. + +Note that the debug level is (starting with v1.90 ALPHA) a binary +combination of different debug flags; so debug level 7 is really 1+2+4 or +D_NORMAL+D_EXTRA+D_INIT. To include D_DURING, you would add 16 to this, +resulting in debug level 23. + +If you don't understand that, you probably don't want to know anyway. +E-mail me about your problem. + + +I want to send money: what now? +------------------------------- + +Go take a nap or something. You'll feel better in the morning. diff --git a/Documentation/networking/atm.txt b/Documentation/networking/atm.txt new file mode 100644 index 000000000000..82921cee77fe --- /dev/null +++ b/Documentation/networking/atm.txt @@ -0,0 +1,8 @@ +In order to use anything but the most primitive functions of ATM, +several user-mode programs are required to assist the kernel. These +programs and related material can be found via the ATM on Linux Web +page at http://linux-atm.sourceforge.net/ + +If you encounter problems with ATM, please report them on the ATM +on Linux mailing list. Subscription information, archives, etc., +can be found on http://linux-atm.sourceforge.net/ diff --git a/Documentation/networking/ax25.txt b/Documentation/networking/ax25.txt new file mode 100644 index 000000000000..37c25b0925f0 --- /dev/null +++ b/Documentation/networking/ax25.txt @@ -0,0 +1,16 @@ +To use the amateur radio protocols within Linux you will need to get a +suitable copy of the AX.25 Utilities. More detailed information about these +and associated programs can be found on http://zone.pspt.fi/~jsn/. + +For more information about the AX.25, NET/ROM and ROSE protocol stacks, see +the AX25-HOWTO written by Terry Dawson <terry@perf.no.itg.telstra.com.au> +who is also the AX.25 Utilities maintainer. + +There is an active mailing list for discussing Linux amateur radio matters +called linux-hams. To subscribe to it, send a message to +majordomo@vger.kernel.org with the words "subscribe linux-hams" in the body +of the message, the subject field is ignored. + +Jonathan G4KLX + +g4klx@g4klx.demon.co.uk diff --git a/Documentation/networking/baycom.txt b/Documentation/networking/baycom.txt new file mode 100644 index 000000000000..4e68849d5639 --- /dev/null +++ b/Documentation/networking/baycom.txt @@ -0,0 +1,158 @@ + LINUX DRIVERS FOR BAYCOM MODEMS + + Thomas M. Sailer, HB9JNX/AE4WA, <sailer@ife.ee.ethz.ch> + +!!NEW!! (04/98) The drivers for the baycom modems have been split into +separate drivers as they did not share any code, and the driver +and device names have changed. + +This document describes the Linux Kernel Drivers for simple Baycom style +amateur radio modems. + +The following drivers are available: + +baycom_ser_fdx: + This driver supports the SER12 modems either full or half duplex. + Its baud rate may be changed via the `baud' module parameter, + therefore it supports just about every bit bang modem on a + serial port. Its devices are called bcsf0 through bcsf3. + This is the recommended driver for SER12 type modems, + however if you have a broken UART clone that does not have working + delta status bits, you may try baycom_ser_hdx. + +baycom_ser_hdx: + This is an alternative driver for SER12 type modems. + It only supports half duplex, and only 1200 baud. Its devices + are called bcsh0 through bcsh3. Use this driver only if baycom_ser_fdx + does not work with your UART. + +baycom_par: + This driver supports the par96 and picpar modems. + Its devices are called bcp0 through bcp3. + +baycom_epp: + This driver supports the EPP modem. + Its devices are called bce0 through bce3. + This driver is work-in-progress. + +The following modems are supported: + +ser12: This is a very simple 1200 baud AFSK modem. The modem consists only + of a modulator/demodulator chip, usually a TI TCM3105. The computer + is responsible for regenerating the receiver bit clock, as well as + for handling the HDLC protocol. The modem connects to a serial port, + hence the name. Since the serial port is not used as an async serial + port, the kernel driver for serial ports cannot be used, and this + driver only supports standard serial hardware (8250, 16450, 16550) + +par96: This is a modem for 9600 baud FSK compatible to the G3RUH standard. + The modem does all the filtering and regenerates the receiver clock. + Data is transferred from and to the PC via a shift register. + The shift register is filled with 16 bits and an interrupt is signalled. + The PC then empties the shift register in a burst. This modem connects + to the parallel port, hence the name. The modem leaves the + implementation of the HDLC protocol and the scrambler polynomial to + the PC. + +picpar: This is a redesign of the par96 modem by Henning Rech, DF9IC. The modem + is protocol compatible to par96, but uses only three low power ICs + and can therefore be fed from the parallel port and does not require + an additional power supply. Furthermore, it incorporates a carrier + detect circuitry. + +EPP: This is a high-speed modem adaptor that connects to an enhanced parallel port. + Its target audience is users working over a high speed hub (76.8kbit/s). + +eppfpga: This is a redesign of the EPP adaptor. + + + +All of the above modems only support half duplex communications. However, +the driver supports the KISS (see below) fullduplex command. It then simply +starts to send as soon as there's a packet to transmit and does not care +about DCD, i.e. it starts to send even if there's someone else on the channel. +This command is required by some implementations of the DAMA channel +access protocol. + + +The Interface of the drivers + +Unlike previous drivers, these drivers are no longer character devices, +but they are now true kernel network interfaces. Installation is therefore +simple. Once installed, four interfaces named bc{sf,sh,p,e}[0-3] are available. +sethdlc from the ax25 utilities may be used to set driver states etc. +Users of userland AX.25 stacks may use the net2kiss utility (also available +in the ax25 utilities package) to convert packets of a network interface +to a KISS stream on a pseudo tty. There's also a patch available from +me for WAMPES which allows attaching a kernel network interface directly. + + +Configuring the driver + +Every time a driver is inserted into the kernel, it has to know which +modems it should access at which ports. This can be done with the setbaycom +utility. If you are only using one modem, you can also configure the +driver from the insmod command line (or by means of an option line in +/etc/modprobe.conf). + +Examples: + modprobe baycom_ser_fdx mode="ser12*" iobase=0x3f8 irq=4 + sethdlc -i bcsf0 -p mode "ser12*" io 0x3f8 irq 4 + +Both lines configure the first port to drive a ser12 modem at the first +serial port (COM1 under DOS). The * in the mode parameter instructs the driver to use +the software DCD algorithm (see below). + + insmod baycom_par mode="picpar" iobase=0x378 + sethdlc -i bcp0 -p mode "picpar" io 0x378 + +Both lines configure the first port to drive a picpar modem at the +first parallel port (LPT1 under DOS). (Note: picpar implies +hardware DCD, par96 implies software DCD). + +The channel access parameters can be set with sethdlc -a or kissparms. +Note that both utilities interpret the values slightly differently. + + +Hardware DCD versus Software DCD + +To avoid collisions on the air, the driver must know when the channel is +busy. This is the task of the DCD circuitry/software. The driver may either +utilise a software DCD algorithm (options=1) or use a DCD signal from +the hardware (options=0). + +ser12: if software DCD is utilised, the radio's squelch should always be + open. It is highly recommended to use the software DCD algorithm, + as it is much faster than most hardware squelch circuitry. The + disadvantage is a slightly higher load on the system. + +par96: the software DCD algorithm for this type of modem is rather poor. + The modem simply does not provide enough information to implement + a reasonable DCD algorithm in software. Therefore, if your radio + feeds the DCD input of the PAR96 modem, the use of the hardware + DCD circuitry is recommended. + +picpar: the picpar modem features a builtin DCD hardware, which is highly + recommended. + + + +Compatibility with the rest of the Linux kernel + +The serial driver and the baycom serial drivers compete +for the same hardware resources. Of course only one driver can access a given +interface at a time. The serial driver grabs all interfaces it can find at +startup time. Therefore the baycom drivers subsequently won't be able to +access a serial port. You might therefore find it necessary to release +a port owned by the serial driver with 'setserial /dev/ttyS# uart none', where +# is the number of the interface. The baycom drivers do not reserve any +ports at startup, unless one is specified on the 'insmod' command line. Another +method to solve the problem is to compile all drivers as modules and +leave it to kmod to load the correct driver depending on the application. + +The parallel port drivers (baycom_par, baycom_epp) now use the parport subsystem +to arbitrate the ports between different client drivers. + +vy 73s de +Tom Sailer, sailer@ife.ee.ethz.ch +hb9jnx @ hb9w.ampr.org diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt new file mode 100644 index 000000000000..0bc2ed136a38 --- /dev/null +++ b/Documentation/networking/bonding.txt @@ -0,0 +1,1618 @@ + + Linux Ethernet Bonding Driver HOWTO + +Initial release : Thomas Davis <tadavis at lbl.gov> +Corrections, HA extensions : 2000/10/03-15 : + - Willy Tarreau <willy at meta-x.org> + - Constantine Gavrilov <const-g at xpert.com> + - Chad N. Tindel <ctindel at ieee dot org> + - Janice Girouard <girouard at us dot ibm dot com> + - Jay Vosburgh <fubar at us dot ibm dot com> + +Reorganized and updated Feb 2005 by Jay Vosburgh + +Note : +------ + +The bonding driver originally came from Donald Becker's beowulf patches for +kernel 2.0. It has changed quite a bit since, and the original tools from +extreme-linux and beowulf sites will not work with this version of the driver. + +For new versions of the driver, patches for older kernels and the updated +userspace tools, please follow the links at the end of this file. + +Table of Contents +================= + +1. Bonding Driver Installation + +2. Bonding Driver Options + +3. Configuring Bonding Devices +3.1 Configuration with sysconfig support +3.2 Configuration with initscripts support +3.3 Configuring Bonding Manually +3.4 Configuring Multiple Bonds + +5. Querying Bonding Configuration +5.1 Bonding Configuration +5.2 Network Configuration + +6. Switch Configuration + +7. 802.1q VLAN Support + +8. Link Monitoring +8.1 ARP Monitor Operation +8.2 Configuring Multiple ARP Targets +8.3 MII Monitor Operation + +9. Potential Trouble Sources +9.1 Adventures in Routing +9.2 Ethernet Device Renaming +9.3 Painfully Slow Or No Failed Link Detection By Miimon + +10. SNMP agents + +11. Promiscuous mode + +12. High Availability Information +12.1 High Availability in a Single Switch Topology +12.1.1 Bonding Mode Selection for Single Switch Topology +12.1.2 Link Monitoring for Single Switch Topology +12.2 High Availability in a Multiple Switch Topology +12.2.1 Bonding Mode Selection for Multiple Switch Topology +12.2.2 Link Monitoring for Multiple Switch Topology +12.3 Switch Behavior Issues for High Availability + +13. Hardware Specific Considerations +13.1 IBM BladeCenter + +14. Frequently Asked Questions + +15. Resources and Links + + +1. Bonding Driver Installation +============================== + + Most popular distro kernels ship with the bonding driver +already available as a module and the ifenslave user level control +program installed and ready for use. If your distro does not, or you +have need to compile bonding from source (e.g., configuring and +installing a mainline kernel from kernel.org), you'll need to perform +the following steps: + +1.1 Configure and build the kernel with bonding +----------------------------------------------- + + The latest version of the bonding driver is available in the +drivers/net/bonding subdirectory of the most recent kernel source +(which is available on http://kernel.org). + + Prior to the 2.4.11 kernel, the bonding driver was maintained +largely outside the kernel tree; patches for some earlier kernels are +available on the bonding sourceforge site, although those patches are +still several years out of date. Most users will want to use either +the most recent kernel from kernel.org or whatever kernel came with +their distro. + + Configure kernel with "make menuconfig" (or "make xconfig" or +"make config"), then select "Bonding driver support" in the "Network +device support" section. It is recommended that you configure the +driver as module since it is currently the only way to pass parameters +to the driver or configure more than one bonding device. + + Build and install the new kernel and modules, then proceed to +step 2. + +1.2 Install ifenslave Control Utility +------------------------------------- + + The ifenslave user level control program is included in the +kernel source tree, in the file Documentation/networking/ifenslave.c. +It is generally recommended that you use the ifenslave that +corresponds to the kernel that you are using (either from the same +source tree or supplied with the distro), however, ifenslave +executables from older kernels should function (but features newer +than the ifenslave release are not supported). Running an ifenslave +that is newer than the kernel is not supported, and may or may not +work. + + To install ifenslave, do the following: + +# gcc -Wall -O -I/usr/src/linux/include ifenslave.c -o ifenslave +# cp ifenslave /sbin/ifenslave + + If your kernel source is not in "/usr/src/linux," then replace +"/usr/src/linux/include" in the above with the location of your kernel +source include directory. + + You may wish to back up any existing /sbin/ifenslave, or, for +testing or informal use, tag the ifenslave to the kernel version +(e.g., name the ifenslave executable /sbin/ifenslave-2.6.10). + +IMPORTANT NOTE: + + If you omit the "-I" or specify an incorrect directory, you +may end up with an ifenslave that is incompatible with the kernel +you're trying to build it for. Some distros (e.g., Red Hat from 7.1 +onwards) do not have /usr/include/linux symbolically linked to the +default kernel source include directory. + + +2. Bonding Driver Options +========================= + + Options for the bonding driver are supplied as parameters to +the bonding module at load time. They may be given as command line +arguments to the insmod or modprobe command, but are usually specified +in either the /etc/modprobe.conf configuration file, or in a +distro-specific configuration file (some of which are detailed in the +next section). + + The available bonding driver parameters are listed below. If a +parameter is not specified the default value is used. When initially +configuring a bond, it is recommended "tail -f /var/log/messages" be +run in a separate window to watch for bonding driver error messages. + + It is critical that either the miimon or arp_interval and +arp_ip_target parameters be specified, otherwise serious network +degradation will occur during link failures. Very few devices do not +support at least miimon, so there is really no reason not to use it. + + Options with textual values will accept either the text name + or, for backwards compatibility, the option value. E.g., + "mode=802.3ad" and "mode=4" set the same mode. + + The parameters are as follows: + +arp_interval + + Specifies the ARP monitoring frequency in milli-seconds. If + ARP monitoring is used in a load-balancing mode (mode 0 or 2), + the switch should be configured in a mode that evenly + distributes packets across all links - such as round-robin. If + the switch is configured to distribute the packets in an XOR + fashion, all replies from the ARP targets will be received on + the same link which could cause the other team members to + fail. ARP monitoring should not be used in conjunction with + miimon. A value of 0 disables ARP monitoring. The default + value is 0. + +arp_ip_target + + Specifies the ip addresses to use when arp_interval is > 0. + These are the targets of the ARP request sent to determine the + health of the link to the targets. Specify these values in + ddd.ddd.ddd.ddd format. Multiple ip adresses must be + seperated by a comma. At least one IP address must be given + for ARP monitoring to function. The maximum number of targets + that can be specified is 16. The default value is no IP + addresses. + +downdelay + + Specifies the time, in milliseconds, to wait before disabling + a slave after a link failure has been detected. This option + is only valid for the miimon link monitor. The downdelay + value should be a multiple of the miimon value; if not, it + will be rounded down to the nearest multiple. The default + value is 0. + +lacp_rate + + Option specifying the rate in which we'll ask our link partner + to transmit LACPDU packets in 802.3ad mode. Possible values + are: + + slow or 0 + Request partner to transmit LACPDUs every 30 seconds (default) + + fast or 1 + Request partner to transmit LACPDUs every 1 second + +max_bonds + + Specifies the number of bonding devices to create for this + instance of the bonding driver. E.g., if max_bonds is 3, and + the bonding driver is not already loaded, then bond0, bond1 + and bond2 will be created. The default value is 1. + +miimon + + Specifies the frequency in milli-seconds that MII link + monitoring will occur. A value of zero disables MII link + monitoring. A value of 100 is a good starting point. The + use_carrier option, below, affects how the link state is + determined. See the High Availability section for additional + information. The default value is 0. + +mode + + Specifies one of the bonding policies. The default is + balance-rr (round robin). Possible values are: + + balance-rr or 0 + + Round-robin policy: Transmit packets in sequential + order from the first available slave through the + last. This mode provides load balancing and fault + tolerance. + + active-backup or 1 + + Active-backup policy: Only one slave in the bond is + active. A different slave becomes active if, and only + if, the active slave fails. The bond's MAC address is + externally visible on only one port (network adapter) + to avoid confusing the switch. This mode provides + fault tolerance. The primary option affects the + behavior of this mode. + + balance-xor or 2 + + XOR policy: Transmit based on [(source MAC address + XOR'd with destination MAC address) modulo slave + count]. This selects the same slave for each + destination MAC address. This mode provides load + balancing and fault tolerance. + + broadcast or 3 + + Broadcast policy: transmits everything on all slave + interfaces. This mode provides fault tolerance. + + 802.3ad or 4 + + IEEE 802.3ad Dynamic link aggregation. Creates + aggregation groups that share the same speed and + duplex settings. Utilizes all slaves in the active + aggregator according to the 802.3ad specification. + + Pre-requisites: + + 1. Ethtool support in the base drivers for retrieving + the speed and duplex of each slave. + + 2. A switch that supports IEEE 802.3ad Dynamic link + aggregation. + + Most switches will require some type of configuration + to enable 802.3ad mode. + + balance-tlb or 5 + + Adaptive transmit load balancing: channel bonding that + does not require any special switch support. The + outgoing traffic is distributed according to the + current load (computed relative to the speed) on each + slave. Incoming traffic is received by the current + slave. If the receiving slave fails, another slave + takes over the MAC address of the failed receiving + slave. + + Prerequisite: + + Ethtool support in the base drivers for retrieving the + speed of each slave. + + balance-alb or 6 + + Adaptive load balancing: includes balance-tlb plus + receive load balancing (rlb) for IPV4 traffic, and + does not require any special switch support. The + receive load balancing is achieved by ARP negotiation. + The bonding driver intercepts the ARP Replies sent by + the local system on their way out and overwrites the + source hardware address with the unique hardware + address of one of the slaves in the bond such that + different peers use different hardware addresses for + the server. + + Receive traffic from connections created by the server + is also balanced. When the local system sends an ARP + Request the bonding driver copies and saves the peer's + IP information from the ARP packet. When the ARP + Reply arrives from the peer, its hardware address is + retrieved and the bonding driver initiates an ARP + reply to this peer assigning it to one of the slaves + in the bond. A problematic outcome of using ARP + negotiation for balancing is that each time that an + ARP request is broadcast it uses the hardware address + of the bond. Hence, peers learn the hardware address + of the bond and the balancing of receive traffic + collapses to the current slave. This is handled by + sending updates (ARP Replies) to all the peers with + their individually assigned hardware address such that + the traffic is redistributed. Receive traffic is also + redistributed when a new slave is added to the bond + and when an inactive slave is re-activated. The + receive load is distributed sequentially (round robin) + among the group of highest speed slaves in the bond. + + When a link is reconnected or a new slave joins the + bond the receive traffic is redistributed among all + active slaves in the bond by intiating ARP Replies + with the selected mac address to each of the + clients. The updelay parameter (detailed below) must + be set to a value equal or greater than the switch's + forwarding delay so that the ARP Replies sent to the + peers will not be blocked by the switch. + + Prerequisites: + + 1. Ethtool support in the base drivers for retrieving + the speed of each slave. + + 2. Base driver support for setting the hardware + address of a device while it is open. This is + required so that there will always be one slave in the + team using the bond hardware address (the + curr_active_slave) while having a unique hardware + address for each slave in the bond. If the + curr_active_slave fails its hardware address is + swapped with the new curr_active_slave that was + chosen. + +primary + + A string (eth0, eth2, etc) specifying which slave is the + primary device. The specified device will always be the + active slave while it is available. Only when the primary is + off-line will alternate devices be used. This is useful when + one slave is preferred over another, e.g., when one slave has + higher throughput than another. + + The primary option is only valid for active-backup mode. + +updelay + + Specifies the time, in milliseconds, to wait before enabling a + slave after a link recovery has been detected. This option is + only valid for the miimon link monitor. The updelay value + should be a multiple of the miimon value; if not, it will be + rounded down to the nearest multiple. The default value is 0. + +use_carrier + + Specifies whether or not miimon should use MII or ETHTOOL + ioctls vs. netif_carrier_ok() to determine the link + status. The MII or ETHTOOL ioctls are less efficient and + utilize a deprecated calling sequence within the kernel. The + netif_carrier_ok() relies on the device driver to maintain its + state with netif_carrier_on/off; at this writing, most, but + not all, device drivers support this facility. + + If bonding insists that the link is up when it should not be, + it may be that your network device driver does not support + netif_carrier_on/off. The default state for netif_carrier is + "carrier on," so if a driver does not support netif_carrier, + it will appear as if the link is always up. In this case, + setting use_carrier to 0 will cause bonding to revert to the + MII / ETHTOOL ioctl method to determine the link state. + + A value of 1 enables the use of netif_carrier_ok(), a value of + 0 will use the deprecated MII / ETHTOOL ioctls. The default + value is 1. + + + +3. Configuring Bonding Devices +============================== + + There are, essentially, two methods for configuring bonding: +with support from the distro's network initialization scripts, and +without. Distros generally use one of two packages for the network +initialization scripts: initscripts or sysconfig. Recent versions of +these packages have support for bonding, while older versions do not. + + We will first describe the options for configuring bonding for +distros using versions of initscripts and sysconfig with full or +partial support for bonding, then provide information on enabling +bonding without support from the network initialization scripts (i.e., +older versions of initscripts or sysconfig). + + If you're unsure whether your distro uses sysconfig or +initscripts, or don't know if it's new enough, have no fear. +Determining this is fairly straightforward. + + First, issue the command: + +$ rpm -qf /sbin/ifup + + It will respond with a line of text starting with either +"initscripts" or "sysconfig," followed by some numbers. This is the +package that provides your network initialization scripts. + + Next, to determine if your installation supports bonding, +issue the command: + +$ grep ifenslave /sbin/ifup + + If this returns any matches, then your initscripts or +sysconfig has support for bonding. + +3.1 Configuration with sysconfig support +---------------------------------------- + + This section applies to distros using a version of sysconfig +with bonding support, for example, SuSE Linux Enterprise Server 9. + + SuSE SLES 9's networking configuration system does support +bonding, however, at this writing, the YaST system configuration +frontend does not provide any means to work with bonding devices. +Bonding devices can be managed by hand, however, as follows. + + First, if they have not already been configured, configure the +slave devices. On SLES 9, this is most easily done by running the +yast2 sysconfig configuration utility. The goal is for to create an +ifcfg-id file for each slave device. The simplest way to accomplish +this is to configure the devices for DHCP. The name of the +configuration file for each device will be of the form: + +ifcfg-id-xx:xx:xx:xx:xx:xx + + Where the "xx" portion will be replaced with the digits from +the device's permanent MAC address. + + Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been +created, it is necessary to edit the configuration files for the slave +devices (the MAC addresses correspond to those of the slave devices). +Before editing, the file will contain muliple lines, and will look +something like this: + +BOOTPROTO='dhcp' +STARTMODE='on' +USERCTL='no' +UNIQUE='XNzu.WeZGOGF+4wE' +_nm_name='bus-pci-0001:61:01.0' + + Change the BOOTPROTO and STARTMODE lines to the following: + +BOOTPROTO='none' +STARTMODE='off' + + Do not alter the UNIQUE or _nm_name lines. Remove any other +lines (USERCTL, etc). + + Once the ifcfg-id-xx:xx:xx:xx:xx:xx files have been modified, +it's time to create the configuration file for the bonding device +itself. This file is named ifcfg-bondX, where X is the number of the +bonding device to create, starting at 0. The first such file is +ifcfg-bond0, the second is ifcfg-bond1, and so on. The sysconfig +network configuration system will correctly start multiple instances +of bonding. + + The contents of the ifcfg-bondX file is as follows: + +BOOTPROTO="static" +BROADCAST="10.0.2.255" +IPADDR="10.0.2.10" +NETMASK="255.255.0.0" +NETWORK="10.0.2.0" +REMOTE_IPADDR="" +STARTMODE="onboot" +BONDING_MASTER="yes" +BONDING_MODULE_OPTS="mode=active-backup miimon=100" +BONDING_SLAVE0="eth0" +BONDING_SLAVE1="eth1" + + Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK +values with the appropriate values for your network. + + Note that configuring the bonding device with BOOTPROTO='dhcp' +does not work; the scripts attempt to obtain the device address from +DHCP prior to adding any of the slave devices. Without active slaves, +the DHCP requests are not sent to the network. + + The STARTMODE specifies when the device is brought online. +The possible values are: + + onboot: The device is started at boot time. If you're not + sure, this is probably what you want. + + manual: The device is started only when ifup is called + manually. Bonding devices may be configured this + way if you do not wish them to start automatically + at boot for some reason. + + hotplug: The device is started by a hotplug event. This is not + a valid choice for a bonding device. + + off or ignore: The device configuration is ignored. + + The line BONDING_MASTER='yes' indicates that the device is a +bonding master device. The only useful value is "yes." + + The contents of BONDING_MODULE_OPTS are supplied to the +instance of the bonding module for this device. Specify the options +for the bonding mode, link monitoring, and so on here. Do not include +the max_bonds bonding parameter; this will confuse the configuration +system if you have multiple bonding devices. + + Finally, supply one BONDING_SLAVEn="ethX" for each slave, +where "n" is an increasing value, one for each slave, and "ethX" is +the name of the slave device (eth0, eth1, etc). + + When all configuration files have been modified or created, +networking must be restarted for the configuration changes to take +effect. This can be accomplished via the following: + +# /etc/init.d/network restart + + Note that the network control script (/sbin/ifdown) will +remove the bonding module as part of the network shutdown processing, +so it is not necessary to remove the module by hand if, e.g., the +module paramters have changed. + + Also, at this writing, YaST/YaST2 will not manage bonding +devices (they do not show bonding interfaces on its list of network +devices). It is necessary to edit the configuration file by hand to +change the bonding configuration. + + Additional general options and details of the ifcfg file +format can be found in an example ifcfg template file: + +/etc/sysconfig/network/ifcfg.template + + Note that the template does not document the various BONDING_ +settings described above, but does describe many of the other options. + +3.2 Configuration with initscripts support +------------------------------------------ + + This section applies to distros using a version of initscripts +with bonding support, for example, Red Hat Linux 9 or Red Hat +Enterprise Linux version 3. On these systems, the network +initialization scripts have some knowledge of bonding, and can be +configured to control bonding devices. + + These distros will not automatically load the network adapter +driver unless the ethX device is configured with an IP address. +Because of this constraint, users must manually configure a +network-script file for all physical adapters that will be members of +a bondX link. Network script files are located in the directory: + +/etc/sysconfig/network-scripts + + The file name must be prefixed with "ifcfg-eth" and suffixed +with the adapter's physical adapter number. For example, the script +for eth0 would be named /etc/sysconfig/network-scripts/ifcfg-eth0. +Place the following text in the file: + +DEVICE=eth0 +USERCTL=no +ONBOOT=yes +MASTER=bond0 +SLAVE=yes +BOOTPROTO=none + + The DEVICE= line will be different for every ethX device and +must correspond with the name of the file, i.e., ifcfg-eth1 must have +a device line of DEVICE=eth1. The setting of the MASTER= line will +also depend on the final bonding interface name chosen for your bond. +As with other network devices, these typically start at 0, and go up +one for each device, i.e., the first bonding instance is bond0, the +second is bond1, and so on. + + Next, create a bond network script. The file name for this +script will be /etc/sysconfig/network-scripts/ifcfg-bondX where X is +the number of the bond. For bond0 the file is named "ifcfg-bond0", +for bond1 it is named "ifcfg-bond1", and so on. Within that file, +place the following text: + +DEVICE=bond0 +IPADDR=192.168.1.1 +NETMASK=255.255.255.0 +NETWORK=192.168.1.0 +BROADCAST=192.168.1.255 +ONBOOT=yes +BOOTPROTO=none +USERCTL=no + + Be sure to change the networking specific lines (IPADDR, +NETMASK, NETWORK and BROADCAST) to match your network configuration. + + Finally, it is necessary to edit /etc/modules.conf to load the +bonding module when the bond0 interface is brought up. The following +sample lines in /etc/modules.conf will load the bonding module, and +select its options: + +alias bond0 bonding +options bond0 mode=balance-alb miimon=100 + + Replace the sample parameters with the appropriate set of +options for your configuration. + + Finally run "/etc/rc.d/init.d/network restart" as root. This +will restart the networking subsystem and your bond link should be now +up and running. + + +3.3 Configuring Bonding Manually +-------------------------------- + + This section applies to distros whose network initialization +scripts (the sysconfig or initscripts package) do not have specific +knowledge of bonding. One such distro is SuSE Linux Enterprise Server +version 8. + + The general methodology for these systems is to place the +bonding module parameters into /etc/modprobe.conf, then add modprobe +and/or ifenslave commands to the system's global init script. The +name of the global init script differs; for sysconfig, it is +/etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local. + + For example, if you wanted to make a simple bond of two e100 +devices (presumed to be eth0 and eth1), and have it persist across +reboots, edit the appropriate file (/etc/init.d/boot.local or +/etc/rc.d/rc.local), and add the following: + +modprobe bonding -obond0 mode=balance-alb miimon=100 +modprobe e100 +ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up +ifenslave bond0 eth0 +ifenslave bond0 eth1 + + Replace the example bonding module parameters and bond0 +network configuration (IP address, netmask, etc) with the appropriate +values for your configuration. The above example loads the bonding +module with the name "bond0," this simplifies the naming if multiple +bonding modules are loaded (each successive instance of the module is +given a different name, and the module instance names match the +bonding interface names). + + Unfortunately, this method will not provide support for the +ifup and ifdown scripts on the bond devices. To reload the bonding +configuration, it is necessary to run the initialization script, e.g., + +# /etc/init.d/boot.local + + or + +# /etc/rc.d/rc.local + + It may be desirable in such a case to create a separate script +which only initializes the bonding configuration, then call that +separate script from within boot.local. This allows for bonding to be +enabled without re-running the entire global init script. + + To shut down the bonding devices, it is necessary to first +mark the bonding device itself as being down, then remove the +appropriate device driver modules. For our example above, you can do +the following: + +# ifconfig bond0 down +# rmmod bond0 +# rmmod e100 + + Again, for convenience, it may be desirable to create a script +with these commands. + + +3.4 Configuring Multiple Bonds +------------------------------ + + This section contains information on configuring multiple +bonding devices with differing options. If you require multiple +bonding devices, but all with the same options, see the "max_bonds" +module paramter, documented above. + + To create multiple bonding devices with differing options, it +is necessary to load the bonding driver multiple times. Note that +current versions of the sysconfig network initialization scripts +handle this automatically; if your distro uses these scripts, no +special action is needed. See the section Configuring Bonding +Devices, above, if you're not sure about your network initialization +scripts. + + To load multiple instances of the module, it is necessary to +specify a different name for each instance (the module loading system +requires that every loaded module, even multiple instances of the same +module, have a unique name). This is accomplished by supplying +multiple sets of bonding options in /etc/modprobe.conf, for example: + +alias bond0 bonding +options bond0 -o bond0 mode=balance-rr miimon=100 + +alias bond1 bonding +options bond1 -o bond1 mode=balance-alb miimon=50 + + will load the bonding module two times. The first instance is +named "bond0" and creates the bond0 device in balance-rr mode with an +miimon of 100. The second instance is named "bond1" and creates the +bond1 device in balance-alb mode with an miimon of 50. + + This may be repeated any number of times, specifying a new and +unique name in place of bond0 or bond1 for each instance. + + When the appropriate module paramters are in place, then +configure bonding according to the instructions for your distro. + +5. Querying Bonding Configuration +================================= + +5.1 Bonding Configuration +------------------------- + + Each bonding device has a read-only file residing in the +/proc/net/bonding directory. The file contents include information +about the bonding configuration, options and state of each slave. + + For example, the contents of /proc/net/bonding/bond0 after the +driver is loaded with parameters of mode=0 and miimon=1000 is +generally as follows: + + Ethernet Channel Bonding Driver: 2.6.1 (October 29, 2004) + Bonding Mode: load balancing (round-robin) + Currently Active Slave: eth0 + MII Status: up + MII Polling Interval (ms): 1000 + Up Delay (ms): 0 + Down Delay (ms): 0 + + Slave Interface: eth1 + MII Status: up + Link Failure Count: 1 + + Slave Interface: eth0 + MII Status: up + Link Failure Count: 1 + + The precise format and contents will change depending upon the +bonding configuration, state, and version of the bonding driver. + +5.2 Network configuration +------------------------- + + The network configuration can be inspected using the ifconfig +command. Bonding devices will have the MASTER flag set; Bonding slave +devices will have the SLAVE flag set. The ifconfig output does not +contain information on which slaves are associated with which masters. + + In the example below, the bond0 interface is the master +(MASTER) while eth0 and eth1 are slaves (SLAVE). Notice all slaves of +bond0 have the same MAC address (HWaddr) as bond0 for all modes except +TLB and ALB that require a unique MAC address for each slave. + +# /sbin/ifconfig +bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 + inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0 + UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 + RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0 + TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0 + collisions:0 txqueuelen:0 + +eth0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 + inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0 + UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1 + RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0 + TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0 + collisions:0 txqueuelen:100 + Interrupt:10 Base address:0x1080 + +eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 + inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0 + UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1 + RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0 + TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0 + collisions:0 txqueuelen:100 + Interrupt:9 Base address:0x1400 + +6. Switch Configuration +======================= + + For this section, "switch" refers to whatever system the +bonded devices are directly connected to (i.e., where the other end of +the cable plugs into). This may be an actual dedicated switch device, +or it may be another regular system (e.g., another computer running +Linux), + + The active-backup, balance-tlb and balance-alb modes do not +require any specific configuration of the switch. + + The 802.3ad mode requires that the switch have the appropriate +ports configured as an 802.3ad aggregation. The precise method used +to configure this varies from switch to switch, but, for example, a +Cisco 3550 series switch requires that the appropriate ports first be +grouped together in a single etherchannel instance, then that +etherchannel is set to mode "lacp" to enable 802.3ad (instead of +standard EtherChannel). + + The balance-rr, balance-xor and broadcast modes generally +require that the switch have the appropriate ports grouped together. +The nomenclature for such a group differs between switches, it may be +called an "etherchannel" (as in the Cisco example, above), a "trunk +group" or some other similar variation. For these modes, each switch +will also have its own configuration options for the switch's transmit +policy to the bond. Typical choices include XOR of either the MAC or +IP addresses. The transmit policy of the two peers does not need to +match. For these three modes, the bonding mode really selects a +transmit policy for an EtherChannel group; all three will interoperate +with another EtherChannel group. + + +7. 802.1q VLAN Support +====================== + + It is possible to configure VLAN devices over a bond interface +using the 8021q driver. However, only packets coming from the 8021q +driver and passing through bonding will be tagged by default. Self +generated packets, for example, bonding's learning packets or ARP +packets generated by either ALB mode or the ARP monitor mechanism, are +tagged internally by bonding itself. As a result, bonding must +"learn" the VLAN IDs configured above it, and use those IDs to tag +self generated packets. + + For reasons of simplicity, and to support the use of adapters +that can do VLAN hardware acceleration offloding, the bonding +interface declares itself as fully hardware offloaing capable, it gets +the add_vid/kill_vid notifications to gather the necessary +information, and it propagates those actions to the slaves. In case +of mixed adapter types, hardware accelerated tagged packets that +should go through an adapter that is not offloading capable are +"un-accelerated" by the bonding driver so the VLAN tag sits in the +regular location. + + VLAN interfaces *must* be added on top of a bonding interface +only after enslaving at least one slave. The bonding interface has a +hardware address of 00:00:00:00:00:00 until the first slave is added. +If the VLAN interface is created prior to the first enslavement, it +would pick up the all-zeroes hardware address. Once the first slave +is attached to the bond, the bond device itself will pick up the +slave's hardware address, which is then available for the VLAN device. + + Also, be aware that a similar problem can occur if all slaves +are released from a bond that still has one or more VLAN interfaces on +top of it. When a new slave is added, the bonding interface will +obtain its hardware address from the first slave, which might not +match the hardware address of the VLAN interfaces (which was +ultimately copied from an earlier slave). + + There are two methods to insure that the VLAN device operates +with the correct hardware address if all slaves are removed from a +bond interface: + + 1. Remove all VLAN interfaces then recreate them + + 2. Set the bonding interface's hardware address so that it +matches the hardware address of the VLAN interfaces. + + Note that changing a VLAN interface's HW address would set the +underlying device -- i.e. the bonding interface -- to promiscouos +mode, which might not be what you want. + + +8. Link Monitoring +================== + + The bonding driver at present supports two schemes for +monitoring a slave device's link state: the ARP monitor and the MII +monitor. + + At the present time, due to implementation restrictions in the +bonding driver itself, it is not possible to enable both ARP and MII +monitoring simultaneously. + +8.1 ARP Monitor Operation +------------------------- + + The ARP monitor operates as its name suggests: it sends ARP +queries to one or more designated peer systems on the network, and +uses the response as an indication that the link is operating. This +gives some assurance that traffic is actually flowing to and from one +or more peers on the local network. + + The ARP monitor relies on the device driver itself to verify +that traffic is flowing. In particular, the driver must keep up to +date the last receive time, dev->last_rx, and transmit start time, +dev->trans_start. If these are not updated by the driver, then the +ARP monitor will immediately fail any slaves using that driver, and +those slaves will stay down. If networking monitoring (tcpdump, etc) +shows the ARP requests and replies on the network, then it may be that +your device driver is not updating last_rx and trans_start. + +8.2 Configuring Multiple ARP Targets +------------------------------------ + + While ARP monitoring can be done with just one target, it can +be useful in a High Availability setup to have several targets to +monitor. In the case of just one target, the target itself may go +down or have a problem making it unresponsive to ARP requests. Having +an additional target (or several) increases the reliability of the ARP +monitoring. + + Multiple ARP targets must be seperated by commas as follows: + +# example options for ARP monitoring with three targets +alias bond0 bonding +options bond0 arp_interval=60 arp_ip_target=192.168.0.1,192.168.0.3,192.168.0.9 + + For just a single target the options would resemble: + +# example options for ARP monitoring with one target +alias bond0 bonding +options bond0 arp_interval=60 arp_ip_target=192.168.0.100 + + +8.3 MII Monitor Operation +------------------------- + + The MII monitor monitors only the carrier state of the local +network interface. It accomplishes this in one of three ways: by +depending upon the device driver to maintain its carrier state, by +querying the device's MII registers, or by making an ethtool query to +the device. + + If the use_carrier module parameter is 1 (the default value), +then the MII monitor will rely on the driver for carrier state +information (via the netif_carrier subsystem). As explained in the +use_carrier parameter information, above, if the MII monitor fails to +detect carrier loss on the device (e.g., when the cable is physically +disconnected), it may be that the driver does not support +netif_carrier. + + If use_carrier is 0, then the MII monitor will first query the +device's (via ioctl) MII registers and check the link state. If that +request fails (not just that it returns carrier down), then the MII +monitor will make an ethtool ETHOOL_GLINK request to attempt to obtain +the same information. If both methods fail (i.e., the driver either +does not support or had some error in processing both the MII register +and ethtool requests), then the MII monitor will assume the link is +up. + +9. Potential Sources of Trouble +=============================== + +9.1 Adventures in Routing +------------------------- + + When bonding is configured, it is important that the slave +devices not have routes that supercede routes of the master (or, +generally, not have routes at all). For example, suppose the bonding +device bond0 has two slaves, eth0 and eth1, and the routing table is +as follows: + +Kernel IP routing table +Destination Gateway Genmask Flags MSS Window irtt Iface +10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 eth0 +10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 eth1 +10.0.0.0 0.0.0.0 255.255.0.0 U 40 0 0 bond0 +127.0.0.0 0.0.0.0 255.0.0.0 U 40 0 0 lo + + This routing configuration will likely still update the +receive/transmit times in the driver (needed by the ARP monitor), but +may bypass the bonding driver (because outgoing traffic to, in this +case, another host on network 10 would use eth0 or eth1 before bond0). + + The ARP monitor (and ARP itself) may become confused by this +configuration, because ARP requests (generated by the ARP monitor) +will be sent on one interface (bond0), but the corresponding reply +will arrive on a different interface (eth0). This reply looks to ARP +as an unsolicited ARP reply (because ARP matches replies on an +interface basis), and is discarded. The MII monitor is not affected +by the state of the routing table. + + The solution here is simply to insure that slaves do not have +routes of their own, and if for some reason they must, those routes do +not supercede routes of their master. This should generally be the +case, but unusual configurations or errant manual or automatic static +route additions may cause trouble. + +9.2 Ethernet Device Renaming +---------------------------- + + On systems with network configuration scripts that do not +associate physical devices directly with network interface names (so +that the same physical device always has the same "ethX" name), it may +be necessary to add some special logic to either /etc/modules.conf or +/etc/modprobe.conf (depending upon which is installed on the system). + + For example, given a modules.conf containing the following: + +alias bond0 bonding +options bond0 mode=some-mode miimon=50 +alias eth0 tg3 +alias eth1 tg3 +alias eth2 e1000 +alias eth3 e1000 + + If neither eth0 and eth1 are slaves to bond0, then when the +bond0 interface comes up, the devices may end up reordered. This +happens because bonding is loaded first, then its slave device's +drivers are loaded next. Since no other drivers have been loaded, +when the e1000 driver loads, it will receive eth0 and eth1 for its +devices, but the bonding configuration tries to enslave eth2 and eth3 +(which may later be assigned to the tg3 devices). + + Adding the following: + +add above bonding e1000 tg3 + + causes modprobe to load e1000 then tg3, in that order, when +bonding is loaded. This command is fully documented in the +modules.conf manual page. + + On systems utilizing modprobe.conf (or modprobe.conf.local), +an equivalent problem can occur. In this case, the following can be +added to modprobe.conf (or modprobe.conf.local, as appropriate), as +follows (all on one line; it has been split here for clarity): + +install bonding /sbin/modprobe tg3; /sbin/modprobe e1000; + /sbin/modprobe --ignore-install bonding + + This will, when loading the bonding module, rather than +performing the normal action, instead execute the provided command. +This command loads the device drivers in the order needed, then calls +modprobe with --ingore-install to cause the normal action to then take +place. Full documentation on this can be found in the modprobe.conf +and modprobe manual pages. + +9.3. Painfully Slow Or No Failed Link Detection By Miimon +--------------------------------------------------------- + + By default, bonding enables the use_carrier option, which +instructs bonding to trust the driver to maintain carrier state. + + As discussed in the options section, above, some drivers do +not support the netif_carrier_on/_off link state tracking system. +With use_carrier enabled, bonding will always see these links as up, +regardless of their actual state. + + Additionally, other drivers do support netif_carrier, but do +not maintain it in real time, e.g., only polling the link state at +some fixed interval. In this case, miimon will detect failures, but +only after some long period of time has expired. If it appears that +miimon is very slow in detecting link failures, try specifying +use_carrier=0 to see if that improves the failure detection time. If +it does, then it may be that the driver checks the carrier state at a +fixed interval, but does not cache the MII register values (so the +use_carrier=0 method of querying the registers directly works). If +use_carrier=0 does not improve the failover, then the driver may cache +the registers, or the problem may be elsewhere. + + Also, remember that miimon only checks for the device's +carrier state. It has no way to determine the state of devices on or +beyond other ports of a switch, or if a switch is refusing to pass +traffic while still maintaining carrier on. + +10. SNMP agents +=============== + + If running SNMP agents, the bonding driver should be loaded +before any network drivers participating in a bond. This requirement +is due to the the interface index (ipAdEntIfIndex) being associated to +the first interface found with a given IP address. That is, there is +only one ipAdEntIfIndex for each IP address. For example, if eth0 and +eth1 are slaves of bond0 and the driver for eth0 is loaded before the +bonding driver, the interface for the IP address will be associated +with the eth0 interface. This configuration is shown below, the IP +address 192.168.1.1 has an interface index of 2 which indexes to eth0 +in the ifDescr table (ifDescr.2). + + interfaces.ifTable.ifEntry.ifDescr.1 = lo + interfaces.ifTable.ifEntry.ifDescr.2 = eth0 + interfaces.ifTable.ifEntry.ifDescr.3 = eth1 + interfaces.ifTable.ifEntry.ifDescr.4 = eth2 + interfaces.ifTable.ifEntry.ifDescr.5 = eth3 + interfaces.ifTable.ifEntry.ifDescr.6 = bond0 + ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 5 + ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2 + ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 4 + ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1 + + This problem is avoided by loading the bonding driver before +any network drivers participating in a bond. Below is an example of +loading the bonding driver first, the IP address 192.168.1.1 is +correctly associated with ifDescr.2. + + interfaces.ifTable.ifEntry.ifDescr.1 = lo + interfaces.ifTable.ifEntry.ifDescr.2 = bond0 + interfaces.ifTable.ifEntry.ifDescr.3 = eth0 + interfaces.ifTable.ifEntry.ifDescr.4 = eth1 + interfaces.ifTable.ifEntry.ifDescr.5 = eth2 + interfaces.ifTable.ifEntry.ifDescr.6 = eth3 + ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 6 + ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2 + ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 5 + ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1 + + While some distributions may not report the interface name in +ifDescr, the association between the IP address and IfIndex remains +and SNMP functions such as Interface_Scan_Next will report that +association. + +11. Promiscuous mode +==================== + + When running network monitoring tools, e.g., tcpdump, it is +common to enable promiscuous mode on the device, so that all traffic +is seen (instead of seeing only traffic destined for the local host). +The bonding driver handles promiscuous mode changes to the bonding +master device (e.g., bond0), and propogates the setting to the slave +devices. + + For the balance-rr, balance-xor, broadcast, and 802.3ad modes, +the promiscuous mode setting is propogated to all slaves. + + For the active-backup, balance-tlb and balance-alb modes, the +promiscuous mode setting is propogated only to the active slave. + + For balance-tlb mode, the active slave is the slave currently +receiving inbound traffic. + + For balance-alb mode, the active slave is the slave used as a +"primary." This slave is used for mode-specific control traffic, for +sending to peers that are unassigned or if the load is unbalanced. + + For the active-backup, balance-tlb and balance-alb modes, when +the active slave changes (e.g., due to a link failure), the +promiscuous setting will be propogated to the new active slave. + +12. High Availability Information +================================= + + High Availability refers to configurations that provide +maximum network availability by having redundant or backup devices, +links and switches between the host and the rest of the world. + + There are currently two basic methods for configuring to +maximize availability. They are dependent on the network topology and +the primary goal of the configuration, but in general, a configuration +can be optimized for maximum available bandwidth, or for maximum +network availability. + +12.1 High Availability in a Single Switch Topology +-------------------------------------------------- + + If two hosts (or a host and a switch) are directly connected +via multiple physical links, then there is no network availability +penalty for optimizing for maximum bandwidth: there is only one switch +(or peer), so if it fails, you have no alternative access to fail over +to. + +Example 1 : host to switch (or other host) + + +----------+ +----------+ + | |eth0 eth0| switch | + | Host A +--------------------------+ or | + | +--------------------------+ other | + | |eth1 eth1| host | + +----------+ +----------+ + + +12.1.1 Bonding Mode Selection for single switch topology +-------------------------------------------------------- + + This configuration is the easiest to set up and to understand, +although you will have to decide which bonding mode best suits your +needs. The tradeoffs for each mode are detailed below: + +balance-rr: This mode is the only mode that will permit a single + TCP/IP connection to stripe traffic across multiple + interfaces. It is therefore the only mode that will allow a + single TCP/IP stream to utilize more than one interface's + worth of throughput. This comes at a cost, however: the + striping often results in peer systems receiving packets out + of order, causing TCP/IP's congestion control system to kick + in, often by retransmitting segments. + + It is possible to adjust TCP/IP's congestion limits by + altering the net.ipv4.tcp_reordering sysctl parameter. The + usual default value is 3, and the maximum useful value is 127. + For a four interface balance-rr bond, expect that a single + TCP/IP stream will utilize no more than approximately 2.3 + interface's worth of throughput, even after adjusting + tcp_reordering. + + If you are utilizing protocols other than TCP/IP, UDP for + example, and your application can tolerate out of order + delivery, then this mode can allow for single stream datagram + performance that scales near linearly as interfaces are added + to the bond. + + This mode requires the switch to have the appropriate ports + configured for "etherchannel" or "trunking." + +active-backup: There is not much advantage in this network topology to + the active-backup mode, as the inactive backup devices are all + connected to the same peer as the primary. In this case, a + load balancing mode (with link monitoring) will provide the + same level of network availability, but with increased + available bandwidth. On the plus side, it does not require + any configuration of the switch. + +balance-xor: This mode will limit traffic such that packets destined + for specific peers will always be sent over the same + interface. Since the destination is determined by the MAC + addresses involved, this may be desirable if you have a large + network with many hosts. It is likely to be suboptimal if all + your traffic is passed through a single router, however. As + with balance-rr, the switch ports need to be configured for + "etherchannel" or "trunking." + +broadcast: Like active-backup, there is not much advantage to this + mode in this type of network topology. + +802.3ad: This mode can be a good choice for this type of network + topology. The 802.3ad mode is an IEEE standard, so all peers + that implement 802.3ad should interoperate well. The 802.3ad + protocol includes automatic configuration of the aggregates, + so minimal manual configuration of the switch is needed + (typically only to designate that some set of devices is + usable for 802.3ad). The 802.3ad standard also mandates that + frames be delivered in order (within certain limits), so in + general single connections will not see misordering of + packets. The 802.3ad mode does have some drawbacks: the + standard mandates that all devices in the aggregate operate at + the same speed and duplex. Also, as with all bonding load + balance modes other than balance-rr, no single connection will + be able to utilize more than a single interface's worth of + bandwidth. Additionally, the linux bonding 802.3ad + implementation distributes traffic by peer (using an XOR of + MAC addresses), so in general all traffic to a particular + destination will use the same interface. Finally, the 802.3ad + mode mandates the use of the MII monitor, therefore, the ARP + monitor is not available in this mode. + +balance-tlb: This mode is also a good choice for this type of + topology. It has no special switch configuration + requirements, and balances outgoing traffic by peer, in a + vaguely intelligent manner (not a simple XOR as in balance-xor + or 802.3ad mode), so that unlucky MAC addresses will not all + "bunch up" on a single interface. Interfaces may be of + differing speeds. On the down side, in this mode all incoming + traffic arrives over a single interface, this mode requires + certain ethtool support in the network device driver of the + slave interfaces, and the ARP monitor is not available. + +balance-alb: This mode is everything that balance-tlb is, and more. It + has all of the features (and restrictions) of balance-tlb, and + will also balance incoming traffic from peers (as described in + the Bonding Module Options section, above). The only extra + down side to this mode is that the network device driver must + support changing the hardware address while the device is + open. + +12.1.2 Link Monitoring for Single Switch Topology +------------------------------------------------- + + The choice of link monitoring may largely depend upon which +mode you choose to use. The more advanced load balancing modes do not +support the use of the ARP monitor, and are thus restricted to using +the MII monitor (which does not provide as high a level of assurance +as the ARP monitor). + + +12.2 High Availability in a Multiple Switch Topology +---------------------------------------------------- + + With multiple switches, the configuration of bonding and the +network changes dramatically. In multiple switch topologies, there is +a tradeoff between network availability and usable bandwidth. + + Below is a sample network, configured to maximize the +availability of the network: + + | | + |port3 port3| + +-----+----+ +-----+----+ + | |port2 ISL port2| | + | switch A +--------------------------+ switch B | + | | | | + +-----+----+ +-----++---+ + |port1 port1| + | +-------+ | + +-------------+ host1 +---------------+ + eth0 +-------+ eth1 + + In this configuration, there is a link between the two +switches (ISL, or inter switch link), and multiple ports connecting to +the outside world ("port3" on each switch). There is no technical +reason that this could not be extended to a third switch. + +12.2.1 Bonding Mode Selection for Multiple Switch Topology +---------------------------------------------------------- + + In a topology such as this, the active-backup and broadcast +modes are the only useful bonding modes; the other modes require all +links to terminate on the same peer for them to behave rationally. + +active-backup: This is generally the preferred mode, particularly if + the switches have an ISL and play together well. If the + network configuration is such that one switch is specifically + a backup switch (e.g., has lower capacity, higher cost, etc), + then the primary option can be used to insure that the + preferred link is always used when it is available. + +broadcast: This mode is really a special purpose mode, and is suitable + only for very specific needs. For example, if the two + switches are not connected (no ISL), and the networks beyond + them are totally independant. In this case, if it is + necessary for some specific one-way traffic to reach both + independent networks, then the broadcast mode may be suitable. + +12.2.2 Link Monitoring Selection for Multiple Switch Topology +------------------------------------------------------------- + + The choice of link monitoring ultimately depends upon your +switch. If the switch can reliably fail ports in response to other +failures, then either the MII or ARP monitors should work. For +example, in the above example, if the "port3" link fails at the remote +end, the MII monitor has no direct means to detect this. The ARP +monitor could be configured with a target at the remote end of port3, +thus detecting that failure without switch support. + + In general, however, in a multiple switch topology, the ARP +monitor can provide a higher level of reliability in detecting link +failures. Additionally, it should be configured with multiple targets +(at least one for each switch in the network). This will insure that, +regardless of which switch is active, the ARP monitor has a suitable +target to query. + + +12.3 Switch Behavior Issues for High Availability +------------------------------------------------- + + You may encounter issues with the timing of link up and down +reporting by the switch. + + First, when a link comes up, some switches may indicate that +the link is up (carrier available), but not pass traffic over the +interface for some period of time. This delay is typically due to +some type of autonegotiation or routing protocol, but may also occur +during switch initialization (e.g., during recovery after a switch +failure). If you find this to be a problem, specify an appropriate +value to the updelay bonding module option to delay the use of the +relevant interface(s). + + Second, some switches may "bounce" the link state one or more +times while a link is changing state. This occurs most commonly while +the switch is initializing. Again, an appropriate updelay value may +help, but note that if all links are down, then updelay is ignored +when any link becomes active (the slave closest to completing its +updelay is chosen). + + Note that when a bonding interface has no active links, the +driver will immediately reuse the first link that goes up, even if +updelay parameter was specified. If there are slave interfaces +waiting for the updelay timeout to expire, the interface that first +went into that state will be immediately reused. This reduces down +time of the network if the value of updelay has been overestimated. + + In addition to the concerns about switch timings, if your +switches take a long time to go into backup mode, it may be desirable +to not activate a backup interface immediately after a link goes down. +Failover may be delayed via the downdelay bonding module option. + +13. Hardware Specific Considerations +==================================== + + This section contains additional information for configuring +bonding on specific hardware platforms, or for interfacing bonding +with particular switches or other devices. + +13.1 IBM BladeCenter +-------------------- + + This applies to the JS20 and similar systems. + + On the JS20 blades, the bonding driver supports only +balance-rr, active-backup, balance-tlb and balance-alb modes. This is +largely due to the network topology inside the BladeCenter, detailed +below. + +JS20 network adapter information +-------------------------------- + + All JS20s come with two Broadcom Gigabit Ethernet ports +integrated on the planar. In the BladeCenter chassis, the eth0 port +of all JS20 blades is hard wired to I/O Module #1; similarly, all eth1 +ports are wired to I/O Module #2. An add-on Broadcom daughter card +can be installed on a JS20 to provide two more Gigabit Ethernet ports. +These ports, eth2 and eth3, are wired to I/O Modules 3 and 4, +respectively. + + Each I/O Module may contain either a switch or a passthrough +module (which allows ports to be directly connected to an external +switch). Some bonding modes require a specific BladeCenter internal +network topology in order to function; these are detailed below. + + Additional BladeCenter-specific networking information can be +found in two IBM Redbooks (www.ibm.com/redbooks): + +"IBM eServer BladeCenter Networking Options" +"IBM eServer BladeCenter Layer 2-7 Network Switching" + +BladeCenter networking configuration +------------------------------------ + + Because a BladeCenter can be configured in a very large number +of ways, this discussion will be confined to describing basic +configurations. + + Normally, Ethernet Switch Modules (ESM) are used in I/O +modules 1 and 2. In this configuration, the eth0 and eth1 ports of a +JS20 will be connected to different internal switches (in the +respective I/O modules). + + An optical passthru module (OPM) connects the I/O module +directly to an external switch. By using OPMs in I/O module #1 and +#2, the eth0 and eth1 interfaces of a JS20 can be redirected to the +outside world and connected to a common external switch. + + Depending upon the mix of ESM and OPM modules, the network +will appear to bonding as either a single switch topology (all OPM +modules) or as a multiple switch topology (one or more ESM modules, +zero or more OPM modules). It is also possible to connect ESM modules +together, resulting in a configuration much like the example in "High +Availability in a multiple switch topology." + +Requirements for specifc modes +------------------------------ + + The balance-rr mode requires the use of OPM modules for +devices in the bond, all connected to an common external switch. That +switch must be configured for "etherchannel" or "trunking" on the +appropriate ports, as is usual for balance-rr. + + The balance-alb and balance-tlb modes will function with +either switch modules or passthrough modules (or a mix). The only +specific requirement for these modes is that all network interfaces +must be able to reach all destinations for traffic sent over the +bonding device (i.e., the network must converge at some point outside +the BladeCenter). + + The active-backup mode has no additional requirements. + +Link monitoring issues +---------------------- + + When an Ethernet Switch Module is in place, only the ARP +monitor will reliably detect link loss to an external switch. This is +nothing unusual, but examination of the BladeCenter cabinet would +suggest that the "external" network ports are the ethernet ports for +the system, when it fact there is a switch between these "external" +ports and the devices on the JS20 system itself. The MII monitor is +only able to detect link failures between the ESM and the JS20 system. + + When a passthrough module is in place, the MII monitor does +detect failures to the "external" port, which is then directly +connected to the JS20 system. + +Other concerns +-------------- + + The Serial Over LAN link is established over the primary +ethernet (eth0) only, therefore, any loss of link to eth0 will result +in losing your SoL connection. It will not fail over with other +network traffic. + + It may be desirable to disable spanning tree on the switch +(either the internal Ethernet Switch Module, or an external switch) to +avoid fail-over delays issues when using bonding. + + +14. Frequently Asked Questions +============================== + +1. Is it SMP safe? + + Yes. The old 2.0.xx channel bonding patch was not SMP safe. +The new driver was designed to be SMP safe from the start. + +2. What type of cards will work with it? + + Any Ethernet type cards (you can even mix cards - a Intel +EtherExpress PRO/100 and a 3com 3c905b, for example). They need not +be of the same speed. + +3. How many bonding devices can I have? + + There is no limit. + +4. How many slaves can a bonding device have? + + This is limited only by the number of network interfaces Linux +supports and/or the number of network cards you can place in your +system. + +5. What happens when a slave link dies? + + If link monitoring is enabled, then the failing device will be +disabled. The active-backup mode will fail over to a backup link, and +other modes will ignore the failed link. The link will continue to be +monitored, and should it recover, it will rejoin the bond (in whatever +manner is appropriate for the mode). See the section on High +Availability for additional information. + + Link monitoring can be enabled via either the miimon or +arp_interval paramters (described in the module paramters section, +above). In general, miimon monitors the carrier state as sensed by +the underlying network device, and the arp monitor (arp_interval) +monitors connectivity to another host on the local network. + + If no link monitoring is configured, the bonding driver will +be unable to detect link failures, and will assume that all links are +always available. This will likely result in lost packets, and a +resulting degredation of performance. The precise performance loss +depends upon the bonding mode and network configuration. + +6. Can bonding be used for High Availability? + + Yes. See the section on High Availability for details. + +7. Which switches/systems does it work with? + + The full answer to this depends upon the desired mode. + + In the basic balance modes (balance-rr and balance-xor), it +works with any system that supports etherchannel (also called +trunking). Most managed switches currently available have such +support, and many unmananged switches as well. + + The advanced balance modes (balance-tlb and balance-alb) do +not have special switch requirements, but do need device drivers that +support specific features (described in the appropriate section under +module paramters, above). + + In 802.3ad mode, it works with with systems that support IEEE +802.3ad Dynamic Link Aggregation. Most managed and many unmanaged +switches currently available support 802.3ad. + + The active-backup mode should work with any Layer-II switch. + +8. Where does a bonding device get its MAC address from? + + If not explicitly configured with ifconfig, the MAC address of +the bonding device is taken from its first slave device. This MAC +address is then passed to all following slaves and remains persistent +(even if the the first slave is removed) until the bonding device is +brought down or reconfigured. + + If you wish to change the MAC address, you can set it with +ifconfig: + +# ifconfig bond0 hw ether 00:11:22:33:44:55 + + The MAC address can be also changed by bringing down/up the +device and then changing its slaves (or their order): + +# ifconfig bond0 down ; modprobe -r bonding +# ifconfig bond0 .... up +# ifenslave bond0 eth... + + This method will automatically take the address from the next +slave that is added. + + To restore your slaves' MAC addresses, you need to detach them +from the bond (`ifenslave -d bond0 eth0'). The bonding driver will +then restore the MAC addresses that the slaves had before they were +enslaved. + +15. Resources and Links +======================= + +The latest version of the bonding driver can be found in the latest +version of the linux kernel, found on http://kernel.org + +Discussions regarding the bonding driver take place primarily on the +bonding-devel mailing list, hosted at sourceforge.net. If you have +questions or problems, post them to the list. + +bonding-devel@lists.sourceforge.net + +https://lists.sourceforge.net/lists/listinfo/bonding-devel + +There is also a project site on sourceforge. + +http://www.sourceforge.net/projects/bonding + +Donald Becker's Ethernet Drivers and diag programs may be found at : + - http://www.scyld.com/network/ + +You will also find a lot of information regarding Ethernet, NWay, MII, +etc. at www.scyld.com. + +-- END -- diff --git a/Documentation/networking/bridge.txt b/Documentation/networking/bridge.txt new file mode 100644 index 000000000000..bdae2db4119c --- /dev/null +++ b/Documentation/networking/bridge.txt @@ -0,0 +1,8 @@ +In order to use the Ethernet bridging functionality, you'll need the +userspace tools. These programs and documentation are available +at http://bridge.sourceforge.net. The download page is +http://prdownloads.sourceforge.net/bridge. + +If you still have questions, don't hesitate to post to the mailing list +(more info http://lists.osdl.org/mailman/listinfo/bridge). + diff --git a/Documentation/networking/comx.txt b/Documentation/networking/comx.txt new file mode 100644 index 000000000000..d1526eba2645 --- /dev/null +++ b/Documentation/networking/comx.txt @@ -0,0 +1,248 @@ + + COMX drivers for the 2.2 kernel + +Originally written by: Tivadar Szemethy, <tiv@itc.hu> +Currently maintained by: Gergely Madarasz <gorgo@itc.hu> + +Last change: 21/06/1999. + +INTRODUCTION + +This document describes the software drivers and their use for the +COMX line of synchronous serial adapters for Linux version 2.2.0 and +above. +The cards are produced and sold by ITC-Pro Ltd. Budapest, Hungary +For further info contact <info@itc.hu> +or http://www.itc.hu (mostly in Hungarian). +The firmware files and software are available from ftp://ftp.itc.hu + +Currently, the drivers support the following cards and protocols: + +COMX (2x64 kbps intelligent board) +CMX (1x256 + 1x128 kbps intelligent board) +HiCOMX (2x2Mbps intelligent board) +LoCOMX (1x512 kbps passive board) +MixCOM (1x512 or 2x512kbps passive board with a hardware watchdog an + optional BRI interface and optional flashROM (1-32M)) +SliceCOM (1x2Mbps channelized E1 board) +PciCOM (X21) + +At the moment of writing this document, the (Cisco)-HDLC, LAPB, SyncPPP and +Frame Relay (DTE, rfc1294 IP encapsulation with partially implemented Q933a +LMI) protocols are available as link-level protocol. +X.25 support is being worked on. + +USAGE + +Load the comx.o module and the hardware-specific and protocol-specific +modules you'll need into the running kernel using the insmod utility. +This creates the /proc/comx directory. +See the example scripts in the 'etc' directory. + +/proc INTERFACE INTRO + +The COMX driver set has a new type of user interface based on the /proc +filesystem which eliminates the need for external user-land software doing +IOCTL calls. +Each network interface or device (i.e. those ones you configure with 'ifconfig' +and 'route' etc.) has a corresponding directory under /proc/comx. You can +dynamically create a new interface by saying 'mkdir /proc/comx/comx0' (or you +can name it whatever you want up to 8 characters long, comx[n] is just a +convention). +Generally the files contained in these directories are text files, which can +be viewed by 'cat filename' and you can write a string to such a file by +saying 'echo _string_ >filename'. This is very similar to the sysctl interface. +Don't use a text editor to edit these files, always use 'echo' (or 'cat' +where appropriate). +When you've created the comx[n] directory, two files are created automagically +in it: 'boardtype' and 'protocol'. You have to fill in these files correctly +for your board and protocol you intend to use (see the board and protocol +descriptions in this file below or the example scripts in the 'etc' directory). +After filling in these files, other files will appear in the directory for +setting the various hardware- and protocol-related informations (for example +irq and io addresses, keepalive values etc.) These files are set to default +values upon creation, so you don't necessarily have to change all of them. + +When you're ready with filling in the files in the comx[n] directory, you can +configure the corresponding network interface with the standard network +configuration utilities. If you're unable to bring the interfaces up, look up +the various kernel log files on your system, and consult the messages for +a probable reason. + +EXAMPLE + +To create the interface 'comx0' which is the first channel of a COMX card: + +insmod comx +# insmod comx-hw-comx ; insmod comx-proto-ppp (these are usually +autoloaded if you use the kernel module loader) + +mkdir /proc/comx/comx0 +echo comx >/proc/comx/comx0/boardtype +echo 0x360 >/proc/comx/comx0/io <- jumper-selectable I/O port +echo 0x0a >/proc/comx/comx0/irq <- jumper-selectable IRQ line +echo 0xd000 >/proc/comx/comx0/memaddr <- software-configurable memory + address. COMX uses 64 KB, and this + can be: 0xa000, 0xb000, 0xc000, + 0xd000, 0xe000. Avoid conflicts + with other hardware. +cat </etc/siol1.rom >/proc/comx/comx0/firmware <- the firmware for the card +echo HDLC >/proc/comx/comx0/protocol <- the data-link protocol +echo 10 >/proc/comx/comx0/keepalive <- the keepalive for the protocol +ifconfig comx0 1.2.3.4 pointopoint 5.6.7.8 netmask 255.255.255.255 <- + finally configure it with ifconfig +Check its status: +cat /proc/comx/comx0/status + +If you want to use the second channel of this board: + +mkdir /proc/comx/comx1 +echo comx >/proc/comx/comx1/boardtype +echo 0x360 >/proc/comx/comx1/io +echo 10 >/proc/comx/comx1/irq +echo 0xd000 >/proc/comx/comx1/memaddr +echo 1 >/proc/comx/comx1/channel <- channels are numbered + as 0 (default) and 1 + +Now, check if the driver recognized that you're going to use the other +channel of the same adapter: + +cat /proc/comx/comx0/twin +comx1 +cat /proc/comx/comx1/twin +comx0 + +You don't have to load the firmware twice, if you use both channels of +an adapter, just write it into the channel 0's /proc firmware file. + +Default values: io 0x360 for COMX, 0x320 (HICOMX), irq 10, memaddr 0xd0000 + +THE LOCOMX HARDWARE DRIVER + +The LoCOMX driver doesn't require firmware, and it doesn't use memory either, +but it uses DMA channels 1 and 3. You can set the clock rate (if enabled by +jumpers on the board) by writing the kbps value into the file named 'clock'. +Set it to 'external' (it is the default) if you have external clock source. + +(Note: currently the LoCOMX driver does not support the internal clock) + +THE COMX, CMX AND HICOMX DRIVERS + +On the HICOMX, COMX and CMX, you have to load the firmware (it is different for +the three cards!). All these adapters can share the same memory +address (we usually use 0xd0000). On the CMX you can set the internal +clock rate (if enabled by jumpers on the small adapter boards) by writing +the kbps value into the 'clock' file. You have to do this before initializing +the card. If you use both HICOMX and CMX/COMX cards, initialize the HICOMX +first. The I/O address of the HICOMX board is not configurable by any +method available to the user: it is hardwired to 0x320, and if you have to +change it, consult ITC-Pro Ltd. + +THE MIXCOM DRIVER + +The MixCOM board doesn't require firmware, the driver communicates with +it through I/O ports. You can have three of these cards in one machine. + +THE SLICECOM DRIVER + +The SliceCOM board doesn't require firmware. You can have 4 of these cards +in one machine. The driver doesn't (yet) support shared interrupts, so +you will need a separate IRQ line for every board. +Read Documentation/networking/slicecom.txt for help on configuring +this adapter. + +THE HDLC/PPP LINE PROTOCOL DRIVER + +The HDLC/SyncPPP line protocol driver uses the kernel's built-in syncppp +driver (syncppp.o). You don't have to manually select syncppp.o when building +the kernel, the dependencies compile it in automatically. + + + + +EXAMPLE +(setting up hw parameters, see above) + +# using HDLC: +echo hdlc >/proc/comx/comx0/protocol +echo 10 >/proc/comx/comx0/keepalive <- not necessary, 10 is the default +ifconfig comx0 1.2.3.4 pointopoint 5.6.7.8 netmask 255.255.255.255 + +(setting up hw parameters, see above) + +# using PPP: +echo ppp >/proc/comx/comx0/protocol +ifconfig comx0 up +ifconfig comx0 1.2.3.4 pointopoint 5.6.7.8 netmask 255.255.255.255 + + +THE LAPB LINE PROTOCOL DRIVER + +For this, you'll need to configure LAPB support (See 'LAPB Data Link Driver' in +'Network options' section) into your kernel (thanks to Jonathan Naylor for his +excellent implementation). +comx-proto-lapb.o provides the following files in the appropriate directory +(the default values in parens): t1 (5), t2 (1), n2 (20), mode (DTE, STD) and +window (7). Agree with the administrator of your peer router on these +settings (most people use defaults, but you have to know if you are DTE or +DCE). + +EXAMPLE + +(setting up hw parameters, see above) +echo lapb >/proc/comx/comx0/protocol +echo dce >/proc/comx/comx0/mode <- DCE interface in this example +ifconfig comx0 1.2.3.4 pointopoint 5.6.7.8 netmask 255.255.255.255 + + +THE FRAME RELAY PROTOCOL DRIVER + +You DON'T need any other frame relay related modules from the kernel to use +COMX-Frame Relay. This protocol is a bit more complicated than the others, +because it allows to use 'subinterfaces' or DLCIs within one physical device. +First you have to create the 'master' device (the actual physical interface) +as you would do for other protocols. Specify 'frad' as protocol type. +Now you can bring this interface up by saying 'ifconfig comx0 up' (or whatever +you've named the interface). Do not assign any IP address to this interface +and do not set any routes through it. +Then, set up your DLCIs the following way: create a comx interface for each +DLCI you intend to use (with mkdir), and write 'dlci' to the 'boardtype' file, +and 'ietf-ip' to the 'protocol' file. Currently, the only supported +encapsulation type is this (also called as RFC1294/1490 IP encapsulation). +Write the DLCI number to the 'dlci' file, and write the name of the physical +COMX device to the file called 'master'. +Now you can assign an IP address to this interface and set routes using it. +See the example file for further info and example config script. +Notes: this driver implements a DTE interface with partially implemented +Q933a LMI. +You can find an extensively commented example in the 'etc' directory. + +FURTHER /proc FILES + +boardtype: +Type of the hardware. Valid values are: + 'comx', 'hicomx', 'locomx', 'cmx', 'slicecom'. + +protocol: +Data-link protocol on this channel. Can be: HDLC, LAPB, PPP, FRAD + +status: +You can read the channel's actual status from the 'status' file, for example +'cat /proc/comx/comx3/status'. + +lineup_delay: +Interpreted in seconds (default is 1). Used to avoid line jitter: the system +will consider the line status 'UP' only if it is up for at least this number +of seconds. + +debug: +You can set various debug options through this file. Valid options are: +'comx_events', 'comx_tx', 'comx_rx', 'hw_events', 'hw_tx', 'hw_rx'. +You can enable a debug options by writing its name prepended by a '+' into +the debug file, for example 'echo +comx_rx >comx0/debug'. +Disabling an option happens similarly, use the '-' prefix +(e.g. 'echo -hw_rx >debug'). +Debug results can be read from the debug file, for example: +tail -f /proc/comx/comx2/debug + + diff --git a/Documentation/networking/cops.txt b/Documentation/networking/cops.txt new file mode 100644 index 000000000000..3e344b448e07 --- /dev/null +++ b/Documentation/networking/cops.txt @@ -0,0 +1,63 @@ +Text File for the COPS LocalTalk Linux driver (cops.c). + By Jay Schulist <jschlst@samba.org> + +This driver has two modes and they are: Dayna mode and Tangent mode. +Each mode corresponds with the type of card. It has been found +that there are 2 main types of cards and all other cards are +the same and just have different names or only have minor differences +such as more IO ports. As this driver is tested it will +become more clear exactly what cards are supported. + +Right now these cards are known to work with the COPS driver. The +LT-200 cards work in a somewhat more limited capacity than the +DL200 cards, which work very well and are in use by many people. + +TANGENT driver mode: + Tangent ATB-II, Novell NL-1000, Daystar Digital LT-200 +DAYNA driver mode: + Dayna DL2000/DaynaTalk PC (Half Length), COPS LT-95, + Farallon PhoneNET PC III, Farallon PhoneNET PC II +Other cards possibly supported mode unknown though: + Dayna DL2000 (Full length) + +The COPS driver defaults to using Dayna mode. To change the driver's +mode if you built a driver with dual support use board_type=1 or +board_type=2 for Dayna or Tangent with insmod. + +** Operation/loading of the driver. +Use modprobe like this: /sbin/modprobe cops.o (IO #) (IRQ #) +If you do not specify any options the driver will try and use the IO = 0x240, +IRQ = 5. As of right now I would only use IRQ 5 for the card, if autoprobing. + +To load multiple COPS driver Localtalk cards you can do one of the following. + +insmod cops io=0x240 irq=5 +insmod -o cops2 cops io=0x260 irq=3 + +Or in lilo.conf put something like this: + append="ether=5,0x240,lt0 ether=3,0x260,lt1" + +Then bring up the interface with ifconfig. It will look something like this: +lt0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-F7-00-00-00-00-00-00-00-00 + inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0 + UP BROADCAST RUNNING NOARP MULTICAST MTU:600 Metric:1 + RX packets:0 errors:0 dropped:0 overruns:0 frame:0 + TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 coll:0 + +** Netatalk Configuration +You will need to configure atalkd with something like the following to make +it work with the cops.c driver. + +* For single LTalk card use. +dummy -seed -phase 2 -net 2000 -addr 2000.10 -zone "1033" +lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033" + +* For multiple cards, Ethernet and LocalTalk. +eth0 -seed -phase 2 -net 3000 -addr 3000.20 -zone "1033" +lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033" + +* For multiple LocalTalk cards, and an Ethernet card. +* Order seems to matter here, Ethernet last. +lt0 -seed -phase 1 -net 1000 -addr 1000.10 -zone "LocalTalk1" +lt1 -seed -phase 1 -net 2000 -addr 2000.20 -zone "LocalTalk2" +eth0 -seed -phase 2 -net 3000 -addr 3000.30 -zone "EtherTalk" diff --git a/Documentation/networking/cs89x0.txt b/Documentation/networking/cs89x0.txt new file mode 100644 index 000000000000..188beb7d6a17 --- /dev/null +++ b/Documentation/networking/cs89x0.txt @@ -0,0 +1,703 @@ + +NOTE +---- + +This document was contributed by Cirrus Logic for kernel 2.2.5. This version +has been updated for 2.3.48 by Andrew Morton <andrewm@uow.edu.au> + +Cirrus make a copy of this driver available at their website, as +described below. In general, you should use the driver version which +comes with your Linux distribution. + + + +CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS +Linux Network Interface Driver ver. 2.00 <kernel 2.3.48> +=============================================================================== + + +TABLE OF CONTENTS + +1.0 CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS + 1.1 Product Overview + 1.2 Driver Description + 1.2.1 Driver Name + 1.2.2 File in the Driver Package + 1.3 System Requirements + 1.4 Licensing Information + +2.0 ADAPTER INSTALLATION and CONFIGURATION + 2.1 CS8900-based Adapter Configuration + 2.2 CS8920-based Adapter Configuration + +3.0 LOADING THE DRIVER AS A MODULE + +4.0 COMPILING THE DRIVER + 4.1 Compiling the Driver as a Loadable Module + 4.2 Compiling the driver to support memory mode + 4.3 Compiling the driver to support Rx DMA + 4.4 Compiling the Driver into the Kernel + +5.0 TESTING AND TROUBLESHOOTING + 5.1 Known Defects and Limitations + 5.2 Testing the Adapter + 5.2.1 Diagnostic Self-Test + 5.2.2 Diagnostic Network Test + 5.3 Using the Adapter's LEDs + 5.4 Resolving I/O Conflicts + +6.0 TECHNICAL SUPPORT + 6.1 Contacting Cirrus Logic's Technical Support + 6.2 Information Required Before Contacting Technical Support + 6.3 Obtaining the Latest Driver Version + 6.4 Current maintainer + 6.5 Kernel boot parameters + + +1.0 CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS +=============================================================================== + + +1.1 PRODUCT OVERVIEW + +The CS8900-based ISA Ethernet Adapters from Cirrus Logic follow +IEEE 802.3 standards and support half or full-duplex operation in ISA bus +computers on 10 Mbps Ethernet networks. The adapters are designed for operation +in 16-bit ISA or EISA bus expansion slots and are available in +10BaseT-only or 3-media configurations (10BaseT, 10Base2, and AUI for 10Base-5 +or fiber networks). + +CS8920-based adapters are similar to the CS8900-based adapter with additional +features for Plug and Play (PnP) support and Wakeup Frame recognition. As +such, the configuration procedures differ somewhat between the two types of +adapters. Refer to the "Adapter Configuration" section for details on +configuring both types of adapters. + + +1.2 DRIVER DESCRIPTION + +The CS8900/CS8920 Ethernet Adapter driver for Linux supports the Linux +v2.3.48 or greater kernel. It can be compiled directly into the kernel +or loaded at run-time as a device driver module. + +1.2.1 Driver Name: cs89x0 + +1.2.2 Files in the Driver Archive: + +The files in the driver at Cirrus' website include: + + readme.txt - this file + build - batch file to compile cs89x0.c. + cs89x0.c - driver C code + cs89x0.h - driver header file + cs89x0.o - pre-compiled module (for v2.2.5 kernel) + config/Config.in - sample file to include cs89x0 driver in the kernel. + config/Makefile - sample file to include cs89x0 driver in the kernel. + config/Space.c - sample file to include cs89x0 driver in the kernel. + + + +1.3 SYSTEM REQUIREMENTS + +The following hardware is required: + + * Cirrus Logic LAN (CS8900/20-based) Ethernet ISA Adapter + + * IBM or IBM-compatible PC with: + * An 80386 or higher processor + * 16 bytes of contiguous IO space available between 210h - 370h + * One available IRQ (5,10,11,or 12 for the CS8900, 3-7,9-15 for CS8920). + + * Appropriate cable (and connector for AUI, 10BASE-2) for your network + topology. + +The following software is required: + +* LINUX kernel version 2.3.48 or higher + + * CS8900/20 Setup Utility (DOS-based) + + * LINUX kernel sources for your kernel (if compiling into kernel) + + * GNU Toolkit (gcc and make) v2.6 or above (if compiling into kernel + or a module) + + + +1.4 LICENSING INFORMATION + +This program is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free Software +Foundation, version 1. + +This program is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. + +For a full copy of the GNU General Public License, write to the Free Software +Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + + + +2.0 ADAPTER INSTALLATION and CONFIGURATION +=============================================================================== + +Both the CS8900 and CS8920-based adapters can be configured using parameters +stored in an on-board EEPROM. You must use the DOS-based CS8900/20 Setup +Utility if you want to change the adapter's configuration in EEPROM. + +When loading the driver as a module, you can specify many of the adapter's +configuration parameters on the command-line to override the EEPROM's settings +or for interface configuration when an EEPROM is not used. (CS8920-based +adapters must use an EEPROM.) See Section 3.0 LOADING THE DRIVER AS A MODULE. + +Since the CS8900/20 Setup Utility is a DOS-based application, you must install +and configure the adapter in a DOS-based system using the CS8900/20 Setup +Utility before installation in the target LINUX system. (Not required if +installing a CS8900-based adapter and the default configuration is acceptable.) + + +2.1 CS8900-BASED ADAPTER CONFIGURATION + +CS8900-based adapters shipped from Cirrus Logic have been configured +with the following "default" settings: + + Operation Mode: Memory Mode + IRQ: 10 + Base I/O Address: 300 + Memory Base Address: D0000 + Optimization: DOS Client + Transmission Mode: Half-duplex + BootProm: None + Media Type: Autodetect (3-media cards) or + 10BASE-T (10BASE-T only adapter) + +You should only change the default configuration settings if conflicts with +another adapter exists. To change the adapter's configuration, run the +CS8900/20 Setup Utility. + + +2.2 CS8920-BASED ADAPTER CONFIGURATION + +CS8920-based adapters are shipped from Cirrus Logic configured as Plug +and Play (PnP) enabled. However, since the cs89x0 driver does NOT +support PnP, you must install the CS8920 adapter in a DOS-based PC and +run the CS8900/20 Setup Utility to disable PnP and configure the +adapter before installation in the target Linux system. Failure to do +this will leave the adapter inactive and the driver will be unable to +communicate with the adapter. + + + **************************************************************** + * CS8920-BASED ADAPTERS: * + * * + * CS8920-BASED ADAPTERS ARE PLUG and PLAY ENABLED BY DEFAULT. * + * THE CS89X0 DRIVER DOES NOT SUPPORT PnP. THEREFORE, YOU MUST * + * RUN THE CS8900/20 SETUP UTILITY TO DISABLE PnP SUPPORT AND * + * TO ACTIVATE THE ADAPTER. * + **************************************************************** + + + + +3.0 LOADING THE DRIVER AS A MODULE +=============================================================================== + +If the driver is compiled as a loadable module, you can load the driver module +with the 'modprobe' command. Many of the adapter's configuration parameters can +be specified as command-line arguments to the load command. This facility +provides a means to override the EEPROM's settings or for interface +configuration when an EEPROM is not used. + +Example: + + insmod cs89x0.o io=0x200 irq=0xA media=aui + +This example loads the module and configures the adapter to use an IO port base +address of 200h, interrupt 10, and use the AUI media connection. The following +configuration options are available on the command line: + +* io=### - specify IO address (200h-360h) +* irq=## - specify interrupt level +* use_dma=1 - Enable DMA +* dma=# - specify dma channel (Driver is compiled to support + Rx DMA only) +* dmasize=# (16 or 64) - DMA size 16K or 64K. Default value is set to 16. +* media=rj45 - specify media type + or media=bnc + or media=aui + or medai=auto +* duplex=full - specify forced half/full/autonegotiate duplex + or duplex=half + or duplex=auto +* debug=# - debug level (only available if the driver was compiled + for debugging) + +NOTES: + +a) If an EEPROM is present, any specified command-line parameter + will override the corresponding configuration value stored in + EEPROM. + +b) The "io" parameter must be specified on the command-line. + +c) The driver's hardware probe routine is designed to avoid + writing to I/O space until it knows that there is a cs89x0 + card at the written addresses. This could cause problems + with device probing. To avoid this behaviour, add one + to the `io=' module parameter. This doesn't actually change + the I/O address, but it is a flag to tell the driver + topartially initialise the hardware before trying to + identify the card. This could be dangerous if you are + not sure that there is a cs89x0 card at the provided address. + + For example, to scan for an adapter located at IO base 0x300, + specify an IO address of 0x301. + +d) The "duplex=auto" parameter is only supported for the CS8920. + +e) The minimum command-line configuration required if an EEPROM is + not present is: + + io + irq + media type (no autodetect) + +f) The following additional parameters are CS89XX defaults (values + used with no EEPROM or command-line argument). + + * DMA Burst = enabled + * IOCHRDY Enabled = enabled + * UseSA = enabled + * CS8900 defaults to half-duplex if not specified on command-line + * CS8920 defaults to autoneg if not specified on command-line + * Use reset defaults for other config parameters + * dma_mode = 0 + +g) You can use ifconfig to set the adapter's Ethernet address. + +h) Many Linux distributions use the 'modprobe' command to load + modules. This program uses the '/etc/conf.modules' file to + determine configuration information which is passed to a driver + module when it is loaded. All the configuration options which are + described above may be placed within /etc/conf.modules. + + For example: + + > cat /etc/conf.modules + ... + alias eth0 cs89x0 + options cs89x0 io=0x0200 dma=5 use_dma=1 + ... + + In this example we are telling the module system that the + ethernet driver for this machine should use the cs89x0 driver. We + are asking 'modprobe' to pass the 'io', 'dma' and 'use_dma' + arguments to the driver when it is loaded. + +i) Cirrus recommend that the cs89x0 use the ISA DMA channels 5, 6 or + 7. You will probably find that other DMA channels will not work. + +j) The cs89x0 supports DMA for receiving only. DMA mode is + significantly more efficient. Flooding a 400 MHz Celeron machine + with large ping packets consumes 82% of its CPU capacity in non-DMA + mode. With DMA this is reduced to 45%. + +k) If your Linux kernel was compiled with inbuilt plug-and-play + support you will be able to find information about the cs89x0 card + with the command + + cat /proc/isapnp + +l) If during DMA operation you find erratic behavior or network data + corruption you should use your PC's BIOS to slow the EISA bus clock. + +m) If the cs89x0 driver is compiled directly into the kernel + (non-modular) then its I/O address is automatically determined by + ISA bus probing. The IRQ number, media options, etc are determined + from the card's EEPROM. + +n) If the cs89x0 driver is compiled directly into the kernel, DMA + mode may be selected by providing the kernel with a boot option + 'cs89x0_dma=N' where 'N' is the desired DMA channel number (5, 6 or 7). + + Kernel boot options may be provided on the LILO command line: + + LILO boot: linux cs89x0_dma=5 + + or they may be placed in /etc/lilo.conf: + + image=/boot/bzImage-2.3.48 + append="cs89x0_dma=5" + label=linux + root=/dev/hda5 + read-only + + The DMA Rx buffer size is hardwired to 16 kbytes in this mode. + (64k mode is not available). + + +4.0 COMPILING THE DRIVER +=============================================================================== + +The cs89x0 driver can be compiled directly into the kernel or compiled into +a loadable device driver module. + + +4.1 COMPILING THE DRIVER AS A LOADABLE MODULE + +To compile the driver into a loadable module, use the following command +(single command line, without quotes): + +"gcc -D__KERNEL__ -I/usr/src/linux/include -I/usr/src/linux/net/inet -Wall +-Wstrict-prototypes -O2 -fomit-frame-pointer -DMODULE -DCONFIG_MODVERSIONS +-c cs89x0.c" + +4.2 COMPILING THE DRIVER TO SUPPORT MEMORY MODE + +Support for memory mode was not carried over into the 2.3 series kernels. + +4.3 COMPILING THE DRIVER TO SUPPORT Rx DMA + +The compile-time optionality for DMA was removed in the 2.3 kernel +series. DMA support is now unconditionally part of the driver. It is +enabled by the 'use_dma=1' module option. + +4.4 COMPILING THE DRIVER INTO THE KERNEL + +If your Linux distribution already has support for the cs89x0 driver +then simply copy the source file to the /usr/src/linux/drivers/net +directory to replace the original ones and run the make utility to +rebuild the kernel. See Step 3 for rebuilding the kernel. + +If your Linux does not include the cs89x0 driver, you need to edit three +configuration files, copy the source file to the /usr/src/linux/drivers/net +directory, and then run the make utility to rebuild the kernel. + +1. Edit the following configuration files by adding the statements as +indicated. (When possible, try to locate the added text to the section of the +file containing similar statements). + + +a.) In /usr/src/linux/drivers/net/Config.in, add: + +tristate 'CS89x0 support' CONFIG_CS89x0 + +Example: + + if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then + tristate 'ICL EtherTeam 16i/32 support' CONFIG_ETH16I + fi + + tristate 'CS89x0 support' CONFIG_CS89x0 + + tristate 'NE2000/NE1000 support' CONFIG_NE2000 + if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then + tristate 'NI5210 support' CONFIG_NI52 + + +b.) In /usr/src/linux/drivers/net/Makefile, add the following lines: + +ifeq ($(CONFIG_CS89x0),y) +L_OBJS += cs89x0.o +else + ifeq ($(CONFIG_CS89x0),m) + M_OBJS += cs89x0.o + endif +endif + + +c.) In /linux/drivers/net/Space.c file, add the line: + +extern int cs89x0_probe(struct device *dev); + + +Example: + + extern int ultra_probe(struct device *dev); + extern int wd_probe(struct device *dev); + extern int el2_probe(struct device *dev); + + extern int cs89x0_probe(struct device *dev); + + extern int ne_probe(struct device *dev); + extern int hp_probe(struct device *dev); + extern int hp_plus_probe(struct device *dev); + + +Also add: + + #ifdef CONFIG_CS89x0 + { cs89x0_probe,0 }, + #endif + + +2.) Copy the driver source files (cs89x0.c and cs89x0.h) +into the /usr/src/linux/drivers/net directory. + + +3.) Go to /usr/src/linux directory and run 'make config' followed by 'make' +(or make bzImage) to rebuild the kernel. + +4.) Use the DOS 'setup' utility to disable plug and play on the NIC. + + +5.0 TESTING AND TROUBLESHOOTING +=============================================================================== + +5.1 KNOWN DEFECTS and LIMITATIONS + +Refer to the RELEASE.TXT file distributed as part of this archive for a list of +known defects, driver limitations, and work arounds. + + +5.2 TESTING THE ADAPTER + +Once the adapter has been installed and configured, the diagnostic option of +the CS8900/20 Setup Utility can be used to test the functionality of the +adapter and its network connection. Use the diagnostics 'Self Test' option to +test the functionality of the adapter with the hardware configuration you have +assigned. You can use the diagnostics 'Network Test' to test the ability of the +adapter to communicate across the Ethernet with another PC equipped with a +CS8900/20-based adapter card (it must also be running the CS8900/20 Setup +Utility). + + NOTE: The Setup Utility's diagnostics are designed to run in a + DOS-only operating system environment. DO NOT run the diagnostics + from a DOS or command prompt session under Windows 95, Windows NT, + OS/2, or other operating system. + +To run the diagnostics tests on the CS8900/20 adapter: + + 1.) Boot DOS on the PC and start the CS8900/20 Setup Utility. + + 2.) The adapter's current configuration is displayed. Hit the ENTER key to + get to the main menu. + + 4.) Select 'Diagnostics' (ALT-G) from the main menu. + * Select 'Self-Test' to test the adapter's basic functionality. + * Select 'Network Test' to test the network connection and cabling. + + +5.2.1 DIAGNOSTIC SELF-TEST + +The diagnostic self-test checks the adapter's basic functionality as well as +its ability to communicate across the ISA bus based on the system resources +assigned during hardware configuration. The following tests are performed: + + * IO Register Read/Write Test + The IO Register Read/Write test insures that the CS8900/20 can be + accessed in IO mode, and that the IO base address is correct. + + * Shared Memory Test + The Shared Memory test insures the CS8900/20 can be accessed in memory + mode and that the range of memory addresses assigned does not conflict + with other devices in the system. + + * Interrupt Test + The Interrupt test insures there are no conflicts with the assigned IRQ + signal. + + * EEPROM Test + The EEPROM test insures the EEPROM can be read. + + * Chip RAM Test + The Chip RAM test insures the 4K of memory internal to the CS8900/20 is + working properly. + + * Internal Loop-back Test + The Internal Loop Back test insures the adapter's transmitter and + receiver are operating properly. If this test fails, make sure the + adapter's cable is connected to the network (check for LED activity for + example). + + * Boot PROM Test + The Boot PROM test insures the Boot PROM is present, and can be read. + Failure indicates the Boot PROM was not successfully read due to a + hardware problem or due to a conflicts on the Boot PROM address + assignment. (Test only applies if the adapter is configured to use the + Boot PROM option.) + +Failure of a test item indicates a possible system resource conflict with +another device on the ISA bus. In this case, you should use the Manual Setup +option to reconfigure the adapter by selecting a different value for the system +resource that failed. + + +5.2.2 DIAGNOSTIC NETWORK TEST + +The Diagnostic Network Test verifies a working network connection by +transferring data between two CS8900/20 adapters installed in different PCs +on the same network. (Note: the diagnostic network test should not be run +between two nodes across a router.) + +This test requires that each of the two PCs have a CS8900/20-based adapter +installed and have the CS8900/20 Setup Utility running. The first PC is +configured as a Responder and the other PC is configured as an Initiator. +Once the Initiator is started, it sends data frames to the Responder which +returns the frames to the Initiator. + +The total number of frames received and transmitted are displayed on the +Initiator's display, along with a count of the number of frames received and +transmitted OK or in error. The test can be terminated anytime by the user at +either PC. + +To setup the Diagnostic Network Test: + + 1.) Select a PC with a CS8900/20-based adapter and a known working network + connection to act as the Responder. Run the CS8900/20 Setup Utility + and select 'Diagnostics -> Network Test -> Responder' from the main + menu. Hit ENTER to start the Responder. + + 2.) Return to the PC with the CS8900/20-based adapter you want to test and + start the CS8900/20 Setup Utility. + + 3.) From the main menu, Select 'Diagnostic -> Network Test -> Initiator'. + Hit ENTER to start the test. + +You may stop the test on the Initiator at any time while allowing the Responder +to continue running. In this manner, you can move to additional PCs and test +them by starting the Initiator on another PC without having to stop/start the +Responder. + + + +5.3 USING THE ADAPTER'S LEDs + +The 2 and 3-media adapters have two LEDs visible on the back end of the board +located near the 10Base-T connector. + +Link Integrity LED: A "steady" ON of the green LED indicates a valid 10Base-T +connection. (Only applies to 10Base-T. The green LED has no significance for +a 10Base-2 or AUI connection.) + +TX/RX LED: The yellow LED lights briefly each time the adapter transmits or +receives data. (The yellow LED will appear to "flicker" on a typical network.) + + +5.4 RESOLVING I/O CONFLICTS + +An IO conflict occurs when two or more adapter use the same ISA resource (IO +address, memory address or IRQ). You can usually detect an IO conflict in one +of four ways after installing and or configuring the CS8900/20-based adapter: + + 1.) The system does not boot properly (or at all). + + 2.) The driver can not communicate with the adapter, reporting an "Adapter + not found" error message. + + 3.) You cannot connect to the network or the driver will not load. + + 4.) If you have configured the adapter to run in memory mode but the driver + reports it is using IO mode when loading, this is an indication of a + memory address conflict. + +If an IO conflict occurs, run the CS8900/20 Setup Utility and perform a +diagnostic self-test. Normally, the ISA resource in conflict will fail the +self-test. If so, reconfigure the adapter selecting another choice for the +resource in conflict. Run the diagnostics again to check for further IO +conflicts. + +In some cases, such as when the PC will not boot, it may be necessary to remove +the adapter and reconfigure it by installing it in another PC to run the +CS8900/20 Setup Utility. Once reinstalled in the target system, run the +diagnostics self-test to ensure the new configuration is free of conflicts +before loading the driver again. + +When manually configuring the adapter, keep in mind the typical ISA system +resource usage as indicated in the tables below. + +I/O Address Device IRQ Device +----------- -------- --- -------- + 200-20F Game I/O adapter 3 COM2, Bus Mouse + 230-23F Bus Mouse 4 COM1 + 270-27F LPT3: third parallel port 5 LPT2 + 2F0-2FF COM2: second serial port 6 Floppy Disk controller + 320-32F Fixed disk controller 7 LPT1 + 8 Real-time Clock + 9 EGA/VGA display adapter + 12 Mouse (PS/2) +Memory Address Device 13 Math Coprocessor +-------------- --------------------- 14 Hard Disk controller +A000-BFFF EGA Graphics Adpater +A000-C7FF VGA Graphics Adpater +B000-BFFF Mono Graphics Adapter +B800-BFFF Color Graphics Adapter +E000-FFFF AT BIOS + + + + +6.0 TECHNICAL SUPPORT +=============================================================================== + +6.1 CONTACTING CIRRUS LOGIC'S TECHNICAL SUPPORT + +Cirrus Logic's CS89XX Technical Application Support can be reached at: + +Telephone :(800) 888-5016 (from inside U.S. and Canada) + :(512) 442-7555 (from outside the U.S. and Canada) +Fax :(512) 912-3871 +Email :ethernet@crystal.cirrus.com +WWW :http://www.cirrus.com + + +6.2 INFORMATION REQUIRED BEFORE CONTACTING TECHNICAL SUPPORT + +Before contacting Cirrus Logic for technical support, be prepared to provide as +Much of the following information as possible. + +1.) Adapter type (CRD8900, CDB8900, CDB8920, etc.) + +2.) Adapter configuration + + * IO Base, Memory Base, IO or memory mode enabled, IRQ, DMA channel + * Plug and Play enabled/disabled (CS8920-based adapters only) + * Configured for media auto-detect or specific media type (which type). + +3.) PC System's Configuration + + * Plug and Play system (yes/no) + * BIOS (make and version) + * System make and model + * CPU (type and speed) + * System RAM + * SCSI Adapter + +4.) Software + + * CS89XX driver and version + * Your network operating system and version + * Your system's OS version + * Version of all protocol support files + +5.) Any Error Message displayed. + + + +6.3 OBTAINING THE LATEST DRIVER VERSION + +You can obtain the latest CS89XX drivers and support software from Cirrus Logic's +Web site. You can also contact Cirrus Logic's Technical Support (email: +ethernet@crystal.cirrus.com) and request that you be registered for automatic +software-update notification. + +Cirrus Logic maintains a web page at http://www.cirrus.com with the +the latest drivers and technical publications. + + +6.4 Current maintainer + +In February 2000 the maintenance of this driver was assumed by Andrew +Morton <akpm@zip.com.au> + +6.5 Kernel module parameters + +For use in embedded environments with no cs89x0 EEPROM, the kernel boot +parameter `cs89x0_media=' has been implemented. Usage is: + + cs89x0_media=rj45 or + cs89x0_media=aui or + cs89x0_media=bnc + diff --git a/Documentation/networking/de4x5.txt b/Documentation/networking/de4x5.txt new file mode 100644 index 000000000000..c8e4ca9b2c3e --- /dev/null +++ b/Documentation/networking/de4x5.txt @@ -0,0 +1,178 @@ + Originally, this driver was written for the Digital Equipment + Corporation series of EtherWORKS Ethernet cards: + + DE425 TP/COAX EISA + DE434 TP PCI + DE435 TP/COAX/AUI PCI + DE450 TP/COAX/AUI PCI + DE500 10/100 PCI Fasternet + + but it will now attempt to support all cards which conform to the + Digital Semiconductor SROM Specification. The driver currently + recognises the following chips: + + DC21040 (no SROM) + DC21041[A] + DC21140[A] + DC21142 + DC21143 + + So far the driver is known to work with the following cards: + + KINGSTON + Linksys + ZNYX342 + SMC8432 + SMC9332 (w/new SROM) + ZNYX31[45] + ZNYX346 10/100 4 port (can act as a 10/100 bridge!) + + The driver has been tested on a relatively busy network using the DE425, + DE434, DE435 and DE500 cards and benchmarked with 'ttcp': it transferred + 16M of data to a DECstation 5000/200 as follows: + + TCP UDP + TX RX TX RX + DE425 1030k 997k 1170k 1128k + DE434 1063k 995k 1170k 1125k + DE435 1063k 995k 1170k 1125k + DE500 1063k 998k 1170k 1125k in 10Mb/s mode + + All values are typical (in kBytes/sec) from a sample of 4 for each + measurement. Their error is +/-20k on a quiet (private) network and also + depend on what load the CPU has. + + ========================================================================= + + The ability to load this driver as a loadable module has been included + and used extensively during the driver development (to save those long + reboot sequences). Loadable module support under PCI and EISA has been + achieved by letting the driver autoprobe as if it were compiled into the + kernel. Do make sure you're not sharing interrupts with anything that + cannot accommodate interrupt sharing! + + To utilise this ability, you have to do 8 things: + + 0) have a copy of the loadable modules code installed on your system. + 1) copy de4x5.c from the /linux/drivers/net directory to your favourite + temporary directory. + 2) for fixed autoprobes (not recommended), edit the source code near + line 5594 to reflect the I/O address you're using, or assign these when + loading by: + + insmod de4x5 io=0xghh where g = bus number + hh = device number + + NB: autoprobing for modules is now supported by default. You may just + use: + + insmod de4x5 + + to load all available boards. For a specific board, still use + the 'io=?' above. + 3) compile de4x5.c, but include -DMODULE in the command line to ensure + that the correct bits are compiled (see end of source code). + 4) if you are wanting to add a new card, goto 5. Otherwise, recompile a + kernel with the de4x5 configuration turned off and reboot. + 5) insmod de4x5 [io=0xghh] + 6) run the net startup bits for your new eth?? interface(s) manually + (usually /etc/rc.inet[12] at boot time). + 7) enjoy! + + To unload a module, turn off the associated interface(s) + 'ifconfig eth?? down' then 'rmmod de4x5'. + + Automedia detection is included so that in principle you can disconnect + from, e.g. TP, reconnect to BNC and things will still work (after a + pause whilst the driver figures out where its media went). My tests + using ping showed that it appears to work.... + + By default, the driver will now autodetect any DECchip based card. + Should you have a need to restrict the driver to DIGITAL only cards, you + can compile with a DEC_ONLY define, or if loading as a module, use the + 'dec_only=1' parameter. + + I've changed the timing routines to use the kernel timer and scheduling + functions so that the hangs and other assorted problems that occurred + while autosensing the media should be gone. A bonus for the DC21040 + auto media sense algorithm is that it can now use one that is more in + line with the rest (the DC21040 chip doesn't have a hardware timer). + The downside is the 1 'jiffies' (10ms) resolution. + + IEEE 802.3u MII interface code has been added in anticipation that some + products may use it in the future. + + The SMC9332 card has a non-compliant SROM which needs fixing - I have + patched this driver to detect it because the SROM format used complies + to a previous DEC-STD format. + + I have removed the buffer copies needed for receive on Intels. I cannot + remove them for Alphas since the Tulip hardware only does longword + aligned DMA transfers and the Alphas get alignment traps with non + longword aligned data copies (which makes them really slow). No comment. + + I have added SROM decoding routines to make this driver work with any + card that supports the Digital Semiconductor SROM spec. This will help + all cards running the dc2114x series chips in particular. Cards using + the dc2104x chips should run correctly with the basic driver. I'm in + debt to <mjacob@feral.com> for the testing and feedback that helped get + this feature working. So far we have tested KINGSTON, SMC8432, SMC9332 + (with the latest SROM complying with the SROM spec V3: their first was + broken), ZNYX342 and LinkSys. ZNYX314 (dual 21041 MAC) and ZNYX 315 + (quad 21041 MAC) cards also appear to work despite their incorrectly + wired IRQs. + + I have added a temporary fix for interrupt problems when some SCSI cards + share the same interrupt as the DECchip based cards. The problem occurs + because the SCSI card wants to grab the interrupt as a fast interrupt + (runs the service routine with interrupts turned off) vs. this card + which really needs to run the service routine with interrupts turned on. + This driver will now add the interrupt service routine as a fast + interrupt if it is bounced from the slow interrupt. THIS IS NOT A + RECOMMENDED WAY TO RUN THE DRIVER and has been done for a limited time + until people sort out their compatibility issues and the kernel + interrupt service code is fixed. YOU SHOULD SEPARATE OUT THE FAST + INTERRUPT CARDS FROM THE SLOW INTERRUPT CARDS to ensure that they do not + run on the same interrupt. PCMCIA/CardBus is another can of worms... + + Finally, I think I have really fixed the module loading problem with + more than one DECchip based card. As a side effect, I don't mess with + the device structure any more which means that if more than 1 card in + 2.0.x is installed (4 in 2.1.x), the user will have to edit + linux/drivers/net/Space.c to make room for them. Hence, module loading + is the preferred way to use this driver, since it doesn't have this + limitation. + + Where SROM media detection is used and full duplex is specified in the + SROM, the feature is ignored unless lp->params.fdx is set at compile + time OR during a module load (insmod de4x5 args='eth??:fdx' [see + below]). This is because there is no way to automatically detect full + duplex links except through autonegotiation. When I include the + autonegotiation feature in the SROM autoconf code, this detection will + occur automatically for that case. + + Command line arguments are now allowed, similar to passing arguments + through LILO. This will allow a per adapter board set up of full duplex + and media. The only lexical constraints are: the board name (dev->name) + appears in the list before its parameters. The list of parameters ends + either at the end of the parameter list or with another board name. The + following parameters are allowed: + + fdx for full duplex + autosense to set the media/speed; with the following + sub-parameters: + TP, TP_NW, BNC, AUI, BNC_AUI, 100Mb, 10Mb, AUTO + + Case sensitivity is important for the sub-parameters. They *must* be + upper case. Examples: + + insmod de4x5 args='eth1:fdx autosense=BNC eth0:autosense=100Mb'. + + For a compiled in driver, in linux/drivers/net/CONFIG, place e.g. + DE4X5_OPTS = -DDE4X5_PARM='"eth0:fdx autosense=AUI eth2:autosense=TP"' + + Yes, I know full duplex isn't permissible on BNC or AUI; they're just + examples. By default, full duplex is turned off and AUTO is the default + autosense setting. In reality, I expect only the full duplex option to + be used. Note the use of single quotes in the two examples above and the + lack of commas to separate items. diff --git a/Documentation/networking/decnet.txt b/Documentation/networking/decnet.txt new file mode 100644 index 000000000000..c6bd25f5d61d --- /dev/null +++ b/Documentation/networking/decnet.txt @@ -0,0 +1,234 @@ + Linux DECnet Networking Layer Information + =========================================== + +1) Other documentation.... + + o Project Home Pages + http://www.chygwyn.com/DECnet/ - Kernel info + http://linux-decnet.sourceforge.net/ - Userland tools + http://www.sourceforge.net/projects/linux-decnet/ - Status page + +2) Configuring the kernel + +Be sure to turn on the following options: + + CONFIG_DECNET (obviously) + CONFIG_PROC_FS (to see what's going on) + CONFIG_SYSCTL (for easy configuration) + +if you want to try out router support (not properly debugged yet) +you'll need the following options as well... + + CONFIG_DECNET_ROUTER (to be able to add/delete routes) + CONFIG_NETFILTER (will be required for the DECnet routing daemon) + + CONFIG_DECNET_ROUTE_FWMARK is optional + +Don't turn on SIOCGIFCONF support for DECnet unless you are really sure +that you need it, in general you won't and it can cause ifconfig to +malfunction. + +Run time configuration has changed slightly from the 2.4 system. If you +want to configure an endnode, then the simplified procedure is as follows: + + o Set the MAC address on your ethernet card before starting _any_ other + network protocols. + +As soon as your network card is brought into the UP state, DECnet should +start working. If you need something more complicated or are unsure how +to set the MAC address, see the next section. Also all configurations which +worked with 2.4 will work under 2.5 with no change. + +3) Command line options + +You can set a DECnet address on the kernel command line for compatibility +with the 2.4 configuration procedure, but in general it's not needed any more. +If you do st a DECnet address on the command line, it has only one purpose +which is that its added to the addresses on the loopback device. + +With 2.4 kernels, DECnet would only recognise addresses as local if they +were added to the loopback device. In 2.5, any local interface address +can be used to loop back to the local machine. Of course this does not +prevent you adding further addresses to the loopback device if you +want to. + +N.B. Since the address list of an interface determines the addresses for +which "hello" messages are sent, if you don't set an address on the loopback +interface then you won't see any entries in /proc/net/neigh for the local +host until such time as you start a connection. This doesn't affect the +operation of the local communications in any other way though. + +The kernel command line takes options looking like the following: + + decnet=1,2 + +the two numbers are the node address 1,2 = 1.2 For 2.2.xx kernels +and early 2.3.xx kernels, you must use a comma when specifying the +DECnet address like this. For more recent 2.3.xx kernels, you may +use almost any character except space, although a `.` would be the most +obvious choice :-) + +There used to be a third number specifying the node type. This option +has gone away in favour of a per interface node type. This is now set +using /proc/sys/net/decnet/conf/<dev>/forwarding. This file can be +set with a single digit, 0=EndNode, 1=L1 Router and 2=L2 Router. + +There are also equivalent options for modules. The node address can +also be set through the /proc/sys/net/decnet/ files, as can other system +parameters. + +Currently the only supported devices are ethernet and ip_gre. The +ethernet address of your ethernet card has to be set according to the DECnet +address of the node in order for it to be autoconfigured (and then appear in +/proc/net/decnet_dev). There is a utility available at the above +FTP sites called dn2ethaddr which can compute the correct ethernet +address to use. The address can be set by ifconfig either before at +at the time the device is brought up. If you are using RedHat you can +add the line: + + MACADDR=AA:00:04:00:03:04 + +or something similar, to /etc/sysconfig/network-scripts/ifcfg-eth0 or +wherever your network card's configuration lives. Setting the MAC address +of your ethernet card to an address starting with "hi-ord" will cause a +DECnet address which matches to be added to the interface (which you can +verify with iproute2). + +The default device for routing can be set through the /proc filesystem +by setting /proc/sys/net/decnet/default_device to the +device you want DECnet to route packets out of when no specific route +is available. Usually this will be eth0, for example: + + echo -n "eth0" >/proc/sys/net/decnet/default_device + +If you don't set the default device, then it will default to the first +ethernet card which has been autoconfigured as described above. You can +confirm that by looking in the default_device file of course. + +There is a list of what the other files under /proc/sys/net/decnet/ do +on the kernel patch web site (shown above). + +4) Run time kernel configuration + +This is either done through the sysctl/proc interface (see the kernel web +pages for details on what the various options do) or through the iproute2 +package in the same way as IPv4/6 configuration is performed. + +Documentation for iproute2 is included with the package, although there is +as yet no specific section on DECnet, most of the features apply to both +IP and DECnet, albeit with DECnet addresses instead of IP addresses and +a reduced functionality. + +If you want to configure a DECnet router you'll need the iproute2 package +since its the _only_ way to add and delete routes currently. Eventually +there will be a routing daemon to send and receive routing messages for +each interface and update the kernel routing tables accordingly. The +routing daemon will use netfilter to listen to routing packets, and +rtnetlink to update the kernels routing tables. + +The DECnet raw socket layer has been removed since it was there purely +for use by the routing daemon which will now use netfilter (a much cleaner +and more generic solution) instead. + +5) How can I tell if its working ? + +Here is a quick guide of what to look for in order to know if your DECnet +kernel subsystem is working. + + - Is the node address set (see /proc/sys/net/decnet/node_address) + - Is the node of the correct type + (see /proc/sys/net/decnet/conf/<dev>/forwarding) + - Is the Ethernet MAC address of each Ethernet card set to match + the DECnet address. If in doubt use the dn2ethaddr utility available + at the ftp archive. + - If the previous two steps are satisfied, and the Ethernet card is up, + you should find that it is listed in /proc/net/decnet_dev and also + that it appears as a directory in /proc/sys/net/decnet/conf/. The + loopback device (lo) should also appear and is required to communicate + within a node. + - If you have any DECnet routers on your network, they should appear + in /proc/net/decnet_neigh, otherwise this file will only contain the + entry for the node itself (if it doesn't check to see if lo is up). + - If you want to send to any node which is not listed in the + /proc/net/decnet_neigh file, you'll need to set the default device + to point to an Ethernet card with connection to a router. This is + again done with the /proc/sys/net/decnet/default_device file. + - Try starting a simple server and client, like the dnping/dnmirror + over the loopback interface. With luck they should communicate. + For this step and those after, you'll need the DECnet library + which can be obtained from the above ftp sites as well as the + actual utilities themselves. + - If this seems to work, then try talking to a node on your local + network, and see if you can obtain the same results. + - At this point you are on your own... :-) + +6) How to send a bug report + +If you've found a bug and want to report it, then there are several things +you can do to help me work out exactly what it is that is wrong. Useful +information (_most_ of which _is_ _essential_) includes: + + - What kernel version are you running ? + - What version of the patch are you running ? + - How far though the above set of tests can you get ? + - What is in the /proc/decnet* files and /proc/sys/net/decnet/* files ? + - Which services are you running ? + - Which client caused the problem ? + - How much data was being transferred ? + - Was the network congested ? + - If there was a kernel panic, please run the output through ksymoops + before sending it to me, otherwise its _useless_. + - How can the problem be reproduced ? + - Can you use tcpdump to get a trace ? (N.B. Most (all?) versions of + tcpdump don't understand how to dump DECnet properly, so including + the hex listing of the packet contents is _essential_, usually the -x flag. + You may also need to increase the length grabbed with the -s flag. The + -e flag also provides very useful information (ethernet MAC addresses)) + +7) MAC FAQ + +A quick FAQ on ethernet MAC addresses to explain how Linux and DECnet +interact and how to get the best performance from your hardware. + +Ethernet cards are designed to normally only pass received network frames +to a host computer when they are addressed to it, or to the broadcast address. + +Linux has an interface which allows the setting of extra addresses for +an ethernet card to listen to. If the ethernet card supports it, the +filtering operation will be done in hardware, if not the extra unwanted packets +received will be discarded by the host computer. In the latter case, +significant processor time and bus bandwidth can be used up on a busy +network (see the NAPI documentation for a longer explanation of these +effects). + +DECnet makes use of this interface to allow running DECnet on an ethernet +card which has already been configured using TCP/IP (presumably using the +built in MAC address of the card, as usual) and/or to allow multiple DECnet +addresses on each physical interface. If you do this, be aware that if your +ethernet card doesn't support perfect hashing in its MAC address filter +then your computer will be doing more work than required. Some cards +will simply set themselves into promiscuous mode in order to receive +packets from the DECnet specified addresses. So if you have one of these +cards its better to set the MAC address of the card as described above +to gain the best efficiency. Better still is to use a card which supports +NAPI as well. + + +8) Mailing list + +If you are keen to get involved in development, or want to ask questions +about configuration, or even just report bugs, then there is a mailing +list that you can join, details are at: + +http://sourceforge.net/mail/?group_id=4993 + +9) Legal Info + +The Linux DECnet project team have placed their code under the GPL. The +software is provided "as is" and without warranty express or implied. +DECnet is a trademark of Compaq. This software is not a product of +Compaq. We acknowledge the help of people at Compaq in providing extra +documentation above and beyond what was previously publicly available. + +Steve Whitehouse <SteveW@ACM.org> + diff --git a/Documentation/networking/depca.txt b/Documentation/networking/depca.txt new file mode 100644 index 000000000000..24c6b26e5658 --- /dev/null +++ b/Documentation/networking/depca.txt @@ -0,0 +1,92 @@ + +DE10x +===== + +Memory Addresses: + + SW1 SW2 SW3 SW4 +64K on on on on d0000 dbfff + off on on on c0000 cbfff + off off on on e0000 ebfff + +32K on on off on d8000 dbfff + off on off on c8000 cbfff + off off off on e8000 ebfff + +DBR ROM on on dc000 dffff + off on cc000 cffff + off off ec000 effff + +Note that the 2K mode is set by SW3/SW4 on/off or off/off. Address +assignment is through the RBSA register. + +I/O Address: + SW5 +0x300 on +0x200 off + +Remote Boot: + SW6 +Disable on +Enable off + +Remote Boot Timeout: + SW7 +2.5min on +30s off + +IRQ: + SW8 SW9 SW10 SW11 SW12 +2 on off off off off +3 off on off off off +4 off off on off off +5 off off off on off +7 off off off off on + +DE20x +===== + +Memory Size: + + SW3 SW4 +64K on on +32K off on +2K on off +2K off off + +Start Addresses: + + SW1 SW2 SW3 SW4 +64K on on on on c0000 cffff + on off on on d0000 dffff + off on on on e0000 effff + +32K on on off off c8000 cffff + on off off off d8000 dffff + off on off off e8000 effff + +Illegal off off - - - - + +I/O Address: + SW5 +0x300 on +0x200 off + +Remote Boot: + SW6 +Disable on +Enable off + +Remote Boot Timeout: + SW7 +2.5min on +30s off + +IRQ: + SW8 SW9 SW10 SW11 SW12 +5 on off off off off +9 off on off off off +10 off off on off off +11 off off off on off +15 off off off off on + diff --git a/Documentation/networking/dgrs.txt b/Documentation/networking/dgrs.txt new file mode 100644 index 000000000000..1aa1bb3f94ab --- /dev/null +++ b/Documentation/networking/dgrs.txt @@ -0,0 +1,52 @@ + The Digi International RightSwitch SE-X (dgrs) Device Driver + +This is a Linux driver for the Digi International RightSwitch SE-X +EISA and PCI boards. These are 4 (EISA) or 6 (PCI) port Ethernet +switches and a NIC combined into a single board. This driver can +be compiled into the kernel statically or as a loadable module. + +There is also a companion management tool, called "xrightswitch". +The management tool lets you watch the performance graphically, +as well as set the SNMP agent IP and IPX addresses, IEEE Spanning +Tree, and Aging time. These can also be set from the command line +when the driver is loaded. The driver command line options are: + + debug=NNN Debug printing level + dma=0/1 Disable/Enable DMA on PCI card + spantree=0/1 Disable/Enable IEEE spanning tree + hashexpire=NNN Change address aging time (default 300 seconds) + ipaddr=A,B,C,D Set SNMP agent IP address i.e. 199,86,8,221 + iptrap=A,B,C,D Set SNMP agent IP trap address i.e. 199,86,8,221 + ipxnet=NNN Set SNMP agent IPX network number + nicmode=0/1 Disable/Enable multiple NIC mode + +There is also a tool for setting up input and output packet filters +on each port, called "dgrsfilt". + +Both the management tool and the filtering tool are available +separately from the following FTP site: + + ftp://ftp.dgii.com/drivers/rightswitch/linux/ + +When nicmode=1, the board and driver operate as 4 or 6 individual +NIC ports (eth0...eth5) instead of as a switch. All switching +functions are disabled. In the future, the board firmware may include +a routing cache when in this mode. + +Copyright 1995-1996 Digi International Inc. + +This software may be used and distributed according to the terms +of the GNU General Public License, incorporated herein by reference. + +For information on purchasing a RightSwitch SE-4 or SE-6 +board, please contact Digi's sales department at 1-612-912-3444 +or 1-800-DIGIBRD. Outside the U.S., please check our Web page at: + + http://www.dgii.com + +for sales offices worldwide. Tech support is also available through +the channels listed on the Web site, although as long as I am +employed on networking products at Digi I will be happy to provide +any bug fixes that may be needed. + +-Rick Richardson, rick@dgii.com diff --git a/Documentation/networking/dl2k.txt b/Documentation/networking/dl2k.txt new file mode 100644 index 000000000000..d460492037ef --- /dev/null +++ b/Documentation/networking/dl2k.txt @@ -0,0 +1,281 @@ + + D-Link DL2000-based Gigabit Ethernet Adapter Installation + for Linux + May 23, 2002 + +Contents +======== + - Compatibility List + - Quick Install + - Compiling the Driver + - Installing the Driver + - Option parameter + - Configuration Script Sample + - Troubleshooting + + +Compatibility List +================= +Adapter Support: + +D-Link DGE-550T Gigabit Ethernet Adapter. +D-Link DGE-550SX Gigabit Ethernet Adapter. +D-Link DL2000-based Gigabit Ethernet Adapter. + + +The driver support Linux kernel 2.4.7 later. We had tested it +on the environments below. + + . Red Hat v6.2 (update kernel to 2.4.7) + . Red Hat v7.0 (update kernel to 2.4.7) + . Red Hat v7.1 (kernel 2.4.7) + . Red Hat v7.2 (kernel 2.4.7-10) + + +Quick Install +============= +Install linux driver as following command: + +1. make all +2. insmod dl2k.ko +3. ifconfig eth0 up 10.xxx.xxx.xxx netmask 255.0.0.0 + ^^^^^^^^^^^^^^^\ ^^^^^^^^\ + IP NETMASK +Now eth0 should active, you can test it by "ping" or get more information by +"ifconfig". If tested ok, continue the next step. + +4. cp dl2k.ko /lib/modules/`uname -r`/kernel/drivers/net +5. Add the following line to /etc/modprobe.conf: + alias eth0 dl2k +6. Run "netconfig" or "netconf" to create configuration script ifcfg-eth0 + located at /etc/sysconfig/network-scripts or create it manually. + [see - Configuration Script Sample] +7. Driver will automatically load and configure at next boot time. + +Compiling the Driver +==================== + In Linux, NIC drivers are most commonly configured as loadable modules. +The approach of building a monolithic kernel has become obsolete. The driver +can be compiled as part of a monolithic kernel, but is strongly discouraged. +The remainder of this section assumes the driver is built as a loadable module. +In the Linux environment, it is a good idea to rebuild the driver from the +source instead of relying on a precompiled version. This approach provides +better reliability since a precompiled driver might depend on libraries or +kernel features that are not present in a given Linux installation. + +The 3 files necessary to build Linux device driver are dl2k.c, dl2k.h and +Makefile. To compile, the Linux installation must include the gcc compiler, +the kernel source, and the kernel headers. The Linux driver supports Linux +Kernels 2.4.7. Copy the files to a directory and enter the following command +to compile and link the driver: + +CD-ROM drive +------------ + +[root@XXX /] mkdir cdrom +[root@XXX /] mount -r -t iso9660 -o conv=auto /dev/cdrom /cdrom +[root@XXX /] cd root +[root@XXX /root] mkdir dl2k +[root@XXX /root] cd dl2k +[root@XXX dl2k] cp /cdrom/linux/dl2k.tgz /root/dl2k +[root@XXX dl2k] tar xfvz dl2k.tgz +[root@XXX dl2k] make all + +Floppy disc drive +----------------- + +[root@XXX /] cd root +[root@XXX /root] mkdir dl2k +[root@XXX /root] cd dl2k +[root@XXX dl2k] mcopy a:/linux/dl2k.tgz /root/dl2k +[root@XXX dl2k] tar xfvz dl2k.tgz +[root@XXX dl2k] make all + +Installing the Driver +===================== + + Manual Installation + ------------------- + Once the driver has been compiled, it must be loaded, enabled, and bound + to a protocol stack in order to establish network connectivity. To load a + module enter the command: + + insmod dl2k.o + + or + + insmod dl2k.o <optional parameter> ; add parameter + + =============================================================== + example: insmod dl2k.o media=100mbps_hd + or insmod dl2k.o media=3 + or insmod dl2k.o media=3,2 ; for 2 cards + =============================================================== + + Please reference the list of the command line parameters supported by + the Linux device driver below. + + The insmod command only loads the driver and gives it a name of the form + eth0, eth1, etc. To bring the NIC into an operational state, + it is necessary to issue the following command: + + ifconfig eth0 up + + Finally, to bind the driver to the active protocol (e.g., TCP/IP with + Linux), enter the following command: + + ifup eth0 + + Note that this is meaningful only if the system can find a configuration + script that contains the necessary network information. A sample will be + given in the next paragraph. + + The commands to unload a driver are as follows: + + ifdown eth0 + ifconfig eth0 down + rmmod dl2k.o + + The following are the commands to list the currently loaded modules and + to see the current network configuration. + + lsmod + ifconfig + + + Automated Installation + ---------------------- + This section describes how to install the driver such that it is + automatically loaded and configured at boot time. The following description + is based on a Red Hat 6.0/7.0 distribution, but it can easily be ported to + other distributions as well. + + Red Hat v6.x/v7.x + ----------------- + 1. Copy dl2k.o to the network modules directory, typically + /lib/modules/2.x.x-xx/net or /lib/modules/2.x.x/kernel/drivers/net. + 2. Locate the boot module configuration file, most commonly modprobe.conf + or modules.conf (for 2.4) in the /etc directory. Add the following lines: + + alias ethx dl2k + options dl2k <optional parameters> + + where ethx will be eth0 if the NIC is the only ethernet adapter, eth1 if + one other ethernet adapter is installed, etc. Refer to the table in the + previous section for the list of optional parameters. + 3. Locate the network configuration scripts, normally the + /etc/sysconfig/network-scripts directory, and create a configuration + script named ifcfg-ethx that contains network information. + 4. Note that for most Linux distributions, Red Hat included, a configuration + utility with a graphical user interface is provided to perform steps 2 + and 3 above. + + +Parameter Description +===================== +You can install this driver without any addtional parameter. However, if you +are going to have extensive functions then it is necessary to set extra +parameter. Below is a list of the command line parameters supported by the +Linux device +driver. + +mtu=packet_size - Specifies the maximum packet size. default + is 1500. + +media=media_type - Specifies the media type the NIC operates at. + autosense Autosensing active media. + 10mbps_hd 10Mbps half duplex. + 10mbps_fd 10Mbps full duplex. + 100mbps_hd 100Mbps half duplex. + 100mbps_fd 100Mbps full duplex. + 1000mbps_fd 1000Mbps full duplex. + 1000mbps_hd 1000Mbps half duplex. + 0 Autosensing active media. + 1 10Mbps half duplex. + 2 10Mbps full duplex. + 3 100Mbps half duplex. + 4 100Mbps full duplex. + 5 1000Mbps half duplex. + 6 1000Mbps full duplex. + + By default, the NIC operates at autosense. + 1000mbps_fd and 1000mbps_hd types are only + available for fiber adapter. + +vlan=n - Specifies the VLAN ID. If vlan=0, the + Virtual Local Area Network (VLAN) function is + disable. + +jumbo=[0|1] - Specifies the jumbo frame support. If jumbo=1, + the NIC accept jumbo frames. By default, this + function is disabled. + Jumbo frame usually improve the performance + int gigabit. + This feature need jumbo frame compatible + remote. + +rx_coalesce=m - Number of rx frame handled each interrupt. +rx_timeout=n - Rx DMA wait time for an interrupt. + If set rx_coalesce > 0, hardware only assert + an interrupt for m frames. Hardware won't + assert rx interrupt until m frames received or + reach timeout of n * 640 nano seconds. + Set proper rx_coalesce and rx_timeout can + reduce congestion collapse and overload which + has been a bottlenect for high speed network. + + For example, rx_coalesce=10 rx_timeout=800. + that is, hardware assert only 1 interrupt + for 10 frames received or timeout of 512 us. + +tx_coalesce=n - Number of tx frame handled each interrupt. + Set n > 1 can reduce the interrupts + congestion usually lower performance of + high speed network card. Default is 16. + +tx_flow=[1|0] - Specifies the Tx flow control. If tx_flow=0, + the Tx flow control disable else driver + autodetect. +rx_flow=[1|0] - Specifies the Rx flow control. If rx_flow=0, + the Rx flow control enable else driver + autodetect. + + +Configuration Script Sample +=========================== +Here is a sample of a simple configuration script: + +DEVICE=eth0 +USERCTL=no +ONBOOT=yes +POOTPROTO=none +BROADCAST=207.200.5.255 +NETWORK=207.200.5.0 +NETMASK=255.255.255.0 +IPADDR=207.200.5.2 + + +Troubleshooting +=============== +Q1. Source files contain ^ M behind every line. + Make sure all files are Unix file format (no LF). Try the following + shell command to convert files. + + cat dl2k.c | col -b > dl2k.tmp + mv dl2k.tmp dl2k.c + + OR + + cat dl2k.c | tr -d "\r" > dl2k.tmp + mv dl2k.tmp dl2k.c + +Q2: Could not find header files (*.h) ? + To compile the driver, you need kernel header files. After + installing the kernel source, the header files are usually located in + /usr/src/linux/include, which is the default include directory configured + in Makefile. For some distributions, there is a copy of header files in + /usr/src/include/linux and /usr/src/include/asm, that you can change the + INCLUDEDIR in Makefile to /usr/include without installing kernel source. + Note that RH 7.0 didn't provide correct header files in /usr/include, + including those files will make a wrong version driver. + diff --git a/Documentation/networking/dmfe.txt b/Documentation/networking/dmfe.txt new file mode 100644 index 000000000000..c0e8398674ef --- /dev/null +++ b/Documentation/networking/dmfe.txt @@ -0,0 +1,59 @@ + dmfe.c: Version 1.28 01/18/2000 + + A Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux. + Copyright (C) 1997 Sten Wang + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License + as published by the Free Software Foundation; either version 2 + of the License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + + A. Compiler command: + + A-1: For normal single or multiple processor kernel + "gcc -DMODULE -D__KERNEL__ -I/usr/src/linux/net/inet -Wall + -Wstrict-prototypes -O6 -c dmfe.c" + + A-2: For single or multiple processor with kernel module version function + "gcc -DMODULE -DMODVERSIONS -D__KERNEL__ -I/usr/src/linux/net/inet + -Wall -Wstrict-prototypes -O6 -c dmfe.c" + + + B. The following steps teach you how to activate a DM9102 board: + + 1. Used the upper compiler command to compile dmfe.c + + 2. Insert dmfe module into kernel + "insmod dmfe" ;;Auto Detection Mode (Suggest) + "insmod dmfe mode=0" ;;Force 10M Half Duplex + "insmod dmfe mode=1" ;;Force 100M Half Duplex + "insmod dmfe mode=4" ;;Force 10M Full Duplex + "insmod dmfe mode=5" ;;Force 100M Full Duplex + + 3. Config a dm9102 network interface + "ifconfig eth0 172.22.3.18" + ^^^^^^^^^^^ Your IP address + + 4. Activate the IP routing table. For some distributions, it is not + necessary. You can type "route" to check. + + "route add default eth0" + + + 5. Well done. Your DM9102 adapter is now activated. + + + C. Object files description: + 1. dmfe_rh61.o: For Redhat 6.1 + + If you can make sure your kernel version, you can rename + to dmfe.o and directly use it without re-compiling. + + + Author: Sten Wang, 886-3-5798797-8517, E-mail: sten_wang@davicom.com.tw diff --git a/Documentation/networking/driver.txt b/Documentation/networking/driver.txt new file mode 100644 index 000000000000..11fd0ef5ff57 --- /dev/null +++ b/Documentation/networking/driver.txt @@ -0,0 +1,94 @@ +Documents about softnet driver issues in general can be found +at: + + http://www.firstfloor.org/~andi/softnet/ + +Transmit path guidelines: + +1) The hard_start_xmit method must never return '1' under any + normal circumstances. It is considered a hard error unless + there is no way your device can tell ahead of time when it's + transmit function will become busy. + + Instead it must maintain the queue properly. For example, + for a driver implementing scatter-gather this means: + + static int drv_hard_start_xmit(struct sk_buff *skb, + struct net_device *dev) + { + struct drv *dp = dev->priv; + + lock_tx(dp); + ... + /* This is a hard error log it. */ + if (TX_BUFFS_AVAIL(dp) <= (skb_shinfo(skb)->nr_frags + 1)) { + netif_stop_queue(dev); + unlock_tx(dp); + printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n", + dev->name); + return 1; + } + + ... queue packet to card ... + ... update tx consumer index ... + + if (TX_BUFFS_AVAIL(dp) <= (MAX_SKB_FRAGS + 1)) + netif_stop_queue(dev); + + ... + unlock_tx(dp); + ... + } + + And then at the end of your TX reclaimation event handling: + + if (netif_queue_stopped(dp->dev) && + TX_BUFFS_AVAIL(dp) > (MAX_SKB_FRAGS + 1)) + netif_wake_queue(dp->dev); + + For a non-scatter-gather supporting card, the three tests simply become: + + /* This is a hard error log it. */ + if (TX_BUFFS_AVAIL(dp) <= 0) + + and: + + if (TX_BUFFS_AVAIL(dp) == 0) + + and: + + if (netif_queue_stopped(dp->dev) && + TX_BUFFS_AVAIL(dp) > 0) + netif_wake_queue(dp->dev); + +2) Do not forget to update netdev->trans_start to jiffies after + each new tx packet is given to the hardware. + +3) Do not forget that once you return 0 from your hard_start_xmit + method, it is your driver's responsibility to free up the SKB + and in some finite amount of time. + + For example, this means that it is not allowed for your TX + mitigation scheme to let TX packets "hang out" in the TX + ring unreclaimed forever if no new TX packets are sent. + This error can deadlock sockets waiting for send buffer room + to be freed up. + + If you return 1 from the hard_start_xmit method, you must not keep + any reference to that SKB and you must not attempt to free it up. + +Probing guidelines: + +1) Any hardware layer address you obtain for your device should + be verified. For example, for ethernet check it with + linux/etherdevice.h:is_valid_ether_addr() + +Close/stop guidelines: + +1) After the dev->stop routine has been called, the hardware must + not receive or transmit any data. All in flight packets must + be aborted. If necessary, poll or wait for completion of + any reset commands. + +2) The dev->stop routine will be called by unregister_netdevice + if device is still UP. diff --git a/Documentation/networking/e100.txt b/Documentation/networking/e100.txt new file mode 100644 index 000000000000..4ef9f7cd5dc3 --- /dev/null +++ b/Documentation/networking/e100.txt @@ -0,0 +1,170 @@ +Linux* Base Driver for the Intel(R) PRO/100 Family of Adapters +============================================================== + +November 17, 2004 + + +Contents +======== + +- In This Release +- Identifying Your Adapter +- Driver Configuration Parameters +- Additional Configurations +- Support + + +In This Release +=============== + +This file describes the Linux* Base Driver for the Intel(R) PRO/100 Family of +Adapters, version 3.3.x. This driver supports 2.4.x and 2.6.x kernels. + +Identifying Your Adapter +======================== + +For more information on how to identify your adapter, go to the Adapter & +Driver ID Guide at: + + http://support.intel.com/support/network/adapter/pro100/21397.htm + +For the latest Intel network drivers for Linux, refer to the following +website. In the search field, enter your adapter name or type, or use the +networking link on the left to search for your adapter: + + http://downloadfinder.intel.com/scripts-df/support_intel.asp + +Driver Configuration Parameters +=============================== + +The default value for each parameter is generally the recommended setting, +unless otherwise noted. + +Rx Descriptors: Number of receive descriptors. A receive descriptor is a data + structure that describes a receive buffer and its attributes to the network + controller. The data in the descriptor is used by the controller to write + data from the controller to host memory. In the 3.0.x driver the valid + range for this parameter is 64-256. The default value is 64. This parameter + can be changed using the command + + ethtool -G eth? rx n, where n is the number of desired rx descriptors. + +Tx Descriptors: Number of transmit descriptors. A transmit descriptor is a + data structure that describes a transmit buffer and its attributes to the + network controller. The data in the descriptor is used by the controller to + read data from the host memory to the controller. In the 3.0.x driver the + valid range for this parameter is 64-256. The default value is 64. This + parameter can be changed using the command + + ethtool -G eth? tx n, where n is the number of desired tx descriptors. + +Speed/Duplex: The driver auto-negotiates the link speed and duplex settings by + default. Ethtool can be used as follows to force speed/duplex. + + ethtool -s eth? autoneg off speed {10|100} duplex {full|half} + + NOTE: setting the speed/duplex to incorrect values will cause the link to + fail. + +Event Log Message Level: The driver uses the message level flag to log events + to syslog. The message level can be set at driver load time. It can also be + set using the command + + ethtool -s eth? msglvl n + +Additional Configurations +========================= + + Configuring the Driver on Different Distributions + ------------------------------------------------- + + Configuring a network driver to load properly when the system is started is + distribution dependent. Typically, the configuration process involves adding + an alias line to /etc/modules.conf as well as editing other system startup + scripts and/or configuration files. Many popular Linux distributions ship + with tools to make these changes for you. To learn the proper way to + configure a network device for your system, refer to your distribution + documentation. If during this process you are asked for the driver or module + name, the name for the Linux Base Driver for the Intel PRO/100 Family of + Adapters is e100. + + As an example, if you install the e100 driver for two PRO/100 adapters + (eth0 and eth1), add the following to modules.conf: + + alias eth0 e100 + alias eth1 e100 + + Viewing Link Messages + --------------------- + In order to see link messages and other Intel driver information on your + console, you must set the dmesg level up to six. This can be done by + entering the following on the command line before loading the e100 driver: + + dmesg -n 8 + + If you wish to see all messages issued by the driver, including debug + messages, set the dmesg level to eight. + + NOTE: This setting is not saved across reboots. + + Ethtool + ------- + + The driver utilizes the ethtool interface for driver configuration and + diagnostics, as well as displaying statistical information. Ethtool + version 1.6 or later is required for this functionality. + + The latest release of ethtool can be found at: + http://sf.net/projects/gkernel. + + NOTE: This driver uses mii support from the kernel. As a result, when + there is no link, ethtool will report speed/duplex to be 10/half. + + NOTE: Ethtool 1.6 only supports a limited set of ethtool options. Support + for a more complete ethtool feature set can be enabled by upgrading + ethtool to ethtool-1.8.1. + + Enabling Wake on LAN* (WoL) + --------------------------- + WoL is provided through the Ethtool* utility. Ethtool is included with Red + Hat* 8.0. For other Linux distributions, download and install Ethtool from + the following website: http://sourceforge.net/projects/gkernel. + + For instructions on enabling WoL with Ethtool, refer to the Ethtool man + page. + + WoL will be enabled on the system during the next shut down or reboot. For + this driver version, in order to enable WoL, the e100 driver must be + loaded when shutting down or rebooting the system. + + NAPI + ---- + + NAPI (Rx polling mode) is supported in the e100 driver. + + See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI. + +Support +======= + +For general information, go to the Intel support website at: + + http://support.intel.com + +If an issue is identified with the released source code on the supported +kernel with a supported adapter, email the specific information related to +the issue to linux.nics@intel.com. + + +License +======= + +This software program is released under the terms of a license agreement +between you ('Licensee') and Intel. Do not use or load this software or any +associated materials (collectively, the 'Software') until you have carefully +read the full terms and conditions of the LICENSE located in this software +package. By loading or using the Software, you agree to the terms of this +Agreement. If you do not agree with the terms of this Agreement, do not +install or use the Software. + +* Other names and brands may be claimed as the property of others. diff --git a/Documentation/networking/e1000.txt b/Documentation/networking/e1000.txt new file mode 100644 index 000000000000..2ebd4058d46d --- /dev/null +++ b/Documentation/networking/e1000.txt @@ -0,0 +1,401 @@ +Linux* Base Driver for the Intel(R) PRO/1000 Family of Adapters +=============================================================== + +November 17, 2004 + + +Contents +======== + +- In This Release +- Identifying Your Adapter +- Command Line Parameters +- Speed and Duplex Configuration +- Additional Configurations +- Known Issues +- Support + + +In This Release +=============== + +This file describes the Linux* Base Driver for the Intel(R) PRO/1000 Family +of Adapters, version 5.x.x. + +For questions related to hardware requirements, refer to the documentation +supplied with your Intel PRO/1000 adapter. All hardware requirements listed +apply to use with Linux. + +Native VLANs are now available with supported kernels. + +Identifying Your Adapter +======================== + +For more information on how to identify your adapter, go to the Adapter & +Driver ID Guide at: + + http://support.intel.com/support/network/adapter/pro100/21397.htm + +For the latest Intel network drivers for Linux, refer to the following +website. In the search field, enter your adapter name or type, or use the +networking link on the left to search for your adapter: + + http://downloadfinder.intel.com/scripts-df/support_intel.asp + +Command Line Parameters +======================= + +If the driver is built as a module, the following optional parameters are +used by entering them on the command line with the modprobe or insmod command +using this syntax: + + modprobe e1000 [<option>=<VAL1>,<VAL2>,...] + + insmod e1000 [<option>=<VAL1>,<VAL2>,...] + +For example, with two PRO/1000 PCI adapters, entering: + + insmod e1000 TxDescriptors=80,128 + +loads the e1000 driver with 80 TX descriptors for the first adapter and 128 TX +descriptors for the second adapter. + +The default value for each parameter is generally the recommended setting, +unless otherwise noted. Also, if the driver is statically built into the +kernel, the driver is loaded with the default values for all the parameters. +Ethtool can be used to change some of the parameters at runtime. + + NOTES: For more information about the AutoNeg, Duplex, and Speed + parameters, see the "Speed and Duplex Configuration" section in + this document. + + For more information about the InterruptThrottleRate, RxIntDelay, + TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay parameters, see the + application note at: + http://www.intel.com/design/network/applnots/ap450.htm + + A descriptor describes a data buffer and attributes related to the + data buffer. This information is accessed by the hardware. + +AutoNeg (adapters using copper connections only) +Valid Range: 0x01-0x0F, 0x20-0x2F +Default Value: 0x2F + This parameter is a bit mask that specifies which speed and duplex + settings the board advertises. When this parameter is used, the Speed and + Duplex parameters must not be specified. + NOTE: Refer to the Speed and Duplex section of this readme for more + information on the AutoNeg parameter. + +Duplex (adapters using copper connections only) +Valid Range: 0-2 (0=auto-negotiate, 1=half, 2=full) +Default Value: 0 + Defines the direction in which data is allowed to flow. Can be either one + or two-directional. If both Duplex and the link partner are set to auto- + negotiate, the board auto-detects the correct duplex. If the link partner + is forced (either full or half), Duplex defaults to half-duplex. + +FlowControl +Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx) +Default: Read flow control settings from the EEPROM + This parameter controls the automatic generation(Tx) and response(Rx) to + Ethernet PAUSE frames. + +InterruptThrottleRate +Valid Range: 100-100000 (0=off, 1=dynamic) +Default Value: 8000 + This value represents the maximum number of interrupts per second the + controller generates. InterruptThrottleRate is another setting used in + interrupt moderation. Dynamic mode uses a heuristic algorithm to adjust + InterruptThrottleRate based on the current traffic load. +Un-supported Adapters: InterruptThrottleRate is NOT supported by 82542, 82543 + or 82544-based adapters. + + NOTE: InterruptThrottleRate takes precedence over the TxAbsIntDelay and + RxAbsIntDelay parameters. In other words, minimizing the receive + and/or transmit absolute delays does not force the controller to + generate more interrupts than what the Interrupt Throttle Rate + allows. + CAUTION: If you are using the Intel PRO/1000 CT Network Connection + (controller 82547), setting InterruptThrottleRate to a value + greater than 75,000, may hang (stop transmitting) adapters under + certain network conditions. If this occurs a NETDEV WATCHDOG + message is logged in the system event log. In addition, the + controller is automatically reset, restoring the network + connection. To eliminate the potential for the hang, ensure + that InterruptThrottleRate is set no greater than 75,000 and is + not set to 0. + NOTE: When e1000 is loaded with default settings and multiple adapters are + in use simultaneously, the CPU utilization may increase non-linearly. + In order to limit the CPU utilization without impacting the overall + throughput, we recommend that you load the driver as follows: + + insmod e1000.o InterruptThrottleRate=3000,3000,3000 + + This sets the InterruptThrottleRate to 3000 interrupts/sec for the + first, second, and third instances of the driver. The range of 2000 to + 3000 interrupts per second works on a majority of systems and is a + good starting point, but the optimal value will be platform-specific. + If CPU utilization is not a concern, use RX_POLLING (NAPI) and default + driver settings. + +RxDescriptors +Valid Range: 80-256 for 82542 and 82543-based adapters + 80-4096 for all other supported adapters +Default Value: 256 + This value is the number of receive descriptors allocated by the driver. + Increasing this value allows the driver to buffer more incoming packets. + Each descriptor is 16 bytes. A receive buffer is allocated for each + descriptor and can either be 2048 or 4096 bytes long, depending on the MTU + + setting. An incoming packet can span one or more receive descriptors. + The maximum MTU size is 16110. + + NOTE: MTU designates the frame size. It only needs to be set for Jumbo + Frames. + NOTE: Depending on the available system resources, the request for a + higher number of receive descriptors may be denied. In this case, + use a lower number. + +RxIntDelay +Valid Range: 0-65535 (0=off) +Default Value: 0 + This value delays the generation of receive interrupts in units of 1.024 + microseconds. Receive interrupt reduction can improve CPU efficiency if + properly tuned for specific network traffic. Increasing this value adds + extra latency to frame reception and can end up decreasing the throughput + of TCP traffic. If the system is reporting dropped receives, this value + may be set too high, causing the driver to run out of available receive + descriptors. + + CAUTION: When setting RxIntDelay to a value other than 0, adapters may + hang (stop transmitting) under certain network conditions. If + this occurs a NETDEV WATCHDOG message is logged in the system + event log. In addition, the controller is automatically reset, + restoring the network connection. To eliminate the potential for + the hang ensure that RxIntDelay is set to 0. + +RxAbsIntDelay (82540, 82545 and later adapters only) +Valid Range: 0-65535 (0=off) +Default Value: 128 + This value, in units of 1.024 microseconds, limits the delay in which a + receive interrupt is generated. Useful only if RxIntDelay is non-zero, + this value ensures that an interrupt is generated after the initial + packet is received within the set amount of time. Proper tuning, + along with RxIntDelay, may improve traffic throughput in specific network + conditions. + +Speed (adapters using copper connections only) +Valid Settings: 0, 10, 100, 1000 +Default Value: 0 (auto-negotiate at all supported speeds) + Speed forces the line speed to the specified value in megabits per second + (Mbps). If this parameter is not specified or is set to 0 and the link + partner is set to auto-negotiate, the board will auto-detect the correct + speed. Duplex should also be set when Speed is set to either 10 or 100. + +TxDescriptors +Valid Range: 80-256 for 82542 and 82543-based adapters + 80-4096 for all other supported adapters +Default Value: 256 + This value is the number of transmit descriptors allocated by the driver. + Increasing this value allows the driver to queue more transmits. Each + descriptor is 16 bytes. + + NOTE: Depending on the available system resources, the request for a + higher number of transmit descriptors may be denied. In this case, + use a lower number. + +TxIntDelay +Valid Range: 0-65535 (0=off) +Default Value: 64 + This value delays the generation of transmit interrupts in units of + 1.024 microseconds. Transmit interrupt reduction can improve CPU + efficiency if properly tuned for specific network traffic. If the + system is reporting dropped transmits, this value may be set too high + causing the driver to run out of available transmit descriptors. + +TxAbsIntDelay (82540, 82545 and later adapters only) +Valid Range: 0-65535 (0=off) +Default Value: 64 + This value, in units of 1.024 microseconds, limits the delay in which a + transmit interrupt is generated. Useful only if TxIntDelay is non-zero, + this value ensures that an interrupt is generated after the initial + packet is sent on the wire within the set amount of time. Proper tuning, + along with TxIntDelay, may improve traffic throughput in specific + network conditions. + +XsumRX (not available on the 82542-based adapter) +Valid Range: 0-1 +Default Value: 1 + A value of '1' indicates that the driver should enable IP checksum + offload for received packets (both UDP and TCP) to the adapter hardware. + +Speed and Duplex Configuration +============================== + +Three keywords are used to control the speed and duplex configuration. These +keywords are Speed, Duplex, and AutoNeg. + +If the board uses a fiber interface, these keywords are ignored, and the +fiber interface board only links at 1000 Mbps full-duplex. + +For copper-based boards, the keywords interact as follows: + + The default operation is auto-negotiate. The board advertises all supported + speed and duplex combinations, and it links at the highest common speed and + duplex mode IF the link partner is set to auto-negotiate. + + If Speed = 1000, limited auto-negotiation is enabled and only 1000 Mbps is + advertised (The 1000BaseT spec requires auto-negotiation.) + + If Speed = 10 or 100, then both Speed and Duplex should be set. Auto- + negotiation is disabled, and the AutoNeg parameter is ignored. Partner SHOULD + also be forced. + +The AutoNeg parameter is used when more control is required over the auto- +negotiation process. When this parameter is used, Speed and Duplex parameters +must not be specified. The following table describes supported values for the +AutoNeg parameter: + +Speed (Mbps) 1000 100 100 10 10 +Duplex Full Full Half Full Half +Value (in base 16) 0x20 0x08 0x04 0x02 0x01 + +Example: insmod e1000 AutoNeg=0x03, loads e1000 and specifies (10 full duplex, +10 half duplex) for negotiation with the peer. + +Note that setting AutoNeg does not guarantee that the board will link at the +highest specified speed or duplex mode, but the board will link at the +highest possible speed/duplex of the link partner IF the link partner is also +set to auto-negotiate. If the link partner is forced speed/duplex, the +adapter MUST be forced to the same speed/duplex. + + +Additional Configurations +========================= + + Configuring the Driver on Different Distributions + ------------------------------------------------- + + Configuring a network driver to load properly when the system is started is + distribution dependent. Typically, the configuration process involves adding + an alias line to /etc/modules.conf as well as editing other system startup + scripts and/or configuration files. Many popular Linux distributions ship + with tools to make these changes for you. To learn the proper way to + configure a network device for your system, refer to your distribution + documentation. If during this process you are asked for the driver or module + name, the name for the Linux Base Driver for the Intel PRO/1000 Family of + Adapters is e1000. + + As an example, if you install the e1000 driver for two PRO/1000 adapters + (eth0 and eth1) and set the speed and duplex to 10full and 100half, add the + following to modules.conf: + + alias eth0 e1000 + alias eth1 e1000 + options e1000 Speed=10,100 Duplex=2,1 + + Viewing Link Messages + --------------------- + + Link messages will not be displayed to the console if the distribution is + restricting system messages. In order to see network driver link messages on + your console, set dmesg to eight by entering the following: + + dmesg -n 8 + + NOTE: This setting is not saved across reboots. + + Jumbo Frames + ------------ + + The driver supports Jumbo Frames for all adapters except 82542-based + adapters. Jumbo Frames support is enabled by changing the MTU to a value + larger than the default of 1500. Use the ifconfig command to increase the + MTU size. For example: + + ifconfig ethx mtu 9000 up + + The maximum MTU setting for Jumbo Frames is 16110. This value coincides + with the maximum Jumbo Frames size of 16128. + + NOTE: Jumbo Frames are supported at 1000 Mbps only. Using Jumbo Frames at + 10 or 100 Mbps may result in poor performance or loss of link. + + + NOTE: MTU designates the frame size. To enable Jumbo Frames, increase the + MTU size on the interface beyond 1500. + + Ethtool + ------- + + The driver utilizes the ethtool interface for driver configuration and + diagnostics, as well as displaying statistical information. Ethtool + version 1.6 or later is required for this functionality. + + The latest release of ethtool can be found from + http://sf.net/projects/gkernel. + + NOTE: Ethtool 1.6 only supports a limited set of ethtool options. Support + for a more complete ethtool feature set can be enabled by upgrading + ethtool to ethtool-1.8.1. + + Enabling Wake on LAN* (WoL) + --------------------------- + + WoL is configured through the Ethtool* utility. Ethtool is included with + all versions of Red Hat after Red Hat 7.2. For other Linux distributions, + download and install Ethtool from the following website: + http://sourceforge.net/projects/gkernel. + + For instructions on enabling WoL with Ethtool, refer to the website listed + above. + + WoL will be enabled on the system during the next shut down or reboot. + For this driver version, in order to enable WoL, the e1000 driver must be + loaded when shutting down or rebooting the system. + + NAPI + ---- + + NAPI (Rx polling mode) is supported in the e1000 driver. NAPI is enabled + or disabled based on the configuration of the kernel. + + See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI. + + +Known Issues +============ + + Jumbo Frames System Requirement + ------------------------------- + + Memory allocation failures have been observed on Linux systems with 64 MB + of RAM or less that are running Jumbo Frames. If you are using Jumbo Frames, + your system may require more than the advertised minimum requirement of 64 MB + of system memory. + + +Support +======= + +For general information, go to the Intel support website at: + + http://support.intel.com + +If an issue is identified with the released source code on the supported +kernel with a supported adapter, email the specific information related to +the issue to linux.nics@intel.com. + + +License +======= + +This software program is released under the terms of a license agreement +between you ('Licensee') and Intel. Do not use or load this software or any +associated materials (collectively, the 'Software') until you have carefully +read the full terms and conditions of the LICENSE located in this software +package. By loading or using the Software, you agree to the terms of this +Agreement. If you do not agree with the terms of this Agreement, do not +install or use the Software. + +* Other names and brands may be claimed as the property of others. diff --git a/Documentation/networking/eql.txt b/Documentation/networking/eql.txt new file mode 100644 index 000000000000..0f1550150f05 --- /dev/null +++ b/Documentation/networking/eql.txt @@ -0,0 +1,528 @@ + EQL Driver: Serial IP Load Balancing HOWTO + Simon "Guru Aleph-Null" Janes, simon@ncm.com + v1.1, February 27, 1995 + + This is the manual for the EQL device driver. EQL is a software device + that lets you load-balance IP serial links (SLIP or uncompressed PPP) + to increase your bandwidth. It will not reduce your latency (i.e. ping + times) except in the case where you already have lots of traffic on + your link, in which it will help them out. This driver has been tested + with the 1.1.75 kernel, and is known to have patched cleanly with + 1.1.86. Some testing with 1.1.92 has been done with the v1.1 patch + which was only created to patch cleanly in the very latest kernel + source trees. (Yes, it worked fine.) + + 1. Introduction + + Which is worse? A huge fee for a 56K leased line or two phone lines? + It's probably the former. If you find yourself craving more bandwidth, + and have a ISP that is flexible, it is now possible to bind modems + together to work as one point-to-point link to increase your + bandwidth. All without having to have a special black box on either + side. + + + The eql driver has only been tested with the Livingston PortMaster-2e + terminal server. I do not know if other terminal servers support load- + balancing, but I do know that the PortMaster does it, and does it + almost as well as the eql driver seems to do it (-- Unfortunately, in + my testing so far, the Livingston PortMaster 2e's load-balancing is a + good 1 to 2 KB/s slower than the test machine working with a 28.8 Kbps + and 14.4 Kbps connection. However, I am not sure that it really is + the PortMaster, or if it's Linux's TCP drivers. I'm told that Linux's + TCP implementation is pretty fast though.--) + + + I suggest to ISPs out there that it would probably be fair to charge + a load-balancing client 75% of the cost of the second line and 50% of + the cost of the third line etc... + + + Hey, we can all dream you know... + + + 2. Kernel Configuration + + Here I describe the general steps of getting a kernel up and working + with the eql driver. From patching, building, to installing. + + + 2.1. Patching The Kernel + + If you do not have or cannot get a copy of the kernel with the eql + driver folded into it, get your copy of the driver from + ftp://slaughter.ncm.com/pub/Linux/LOAD_BALANCING/eql-1.1.tar.gz. + Unpack this archive someplace obvious like /usr/local/src/. It will + create the following files: + + + + ______________________________________________________________________ + -rw-r--r-- guru/ncm 198 Jan 19 18:53 1995 eql-1.1/NO-WARRANTY + -rw-r--r-- guru/ncm 30620 Feb 27 21:40 1995 eql-1.1/eql-1.1.patch + -rwxr-xr-x guru/ncm 16111 Jan 12 22:29 1995 eql-1.1/eql_enslave + -rw-r--r-- guru/ncm 2195 Jan 10 21:48 1995 eql-1.1/eql_enslave.c + ______________________________________________________________________ + + Unpack a recent kernel (something after 1.1.92) someplace convenient + like say /usr/src/linux-1.1.92.eql. Use symbolic links to point + /usr/src/linux to this development directory. + + + Apply the patch by running the commands: + + + ______________________________________________________________________ + cd /usr/src + patch </usr/local/src/eql-1.1/eql-1.1.patch + ______________________________________________________________________ + + + + + + 2.2. Building The Kernel + + After patching the kernel, run make config and configure the kernel + for your hardware. + + + After configuration, make and install according to your habit. + + + 3. Network Configuration + + So far, I have only used the eql device with the DSLIP SLIP connection + manager by Matt Dillon (-- "The man who sold his soul to code so much + so quickly."--) . How you configure it for other "connection" + managers is up to you. Most other connection managers that I've seen + don't do a very good job when it comes to handling more than one + connection. + + + 3.1. /etc/rc.d/rc.inet1 + + In rc.inet1, ifconfig the eql device to the IP address you usually use + for your machine, and the MTU you prefer for your SLIP lines. One + could argue that MTU should be roughly half the usual size for two + modems, one-third for three, one-fourth for four, etc... But going + too far below 296 is probably overkill. Here is an example ifconfig + command that sets up the eql device: + + + + ______________________________________________________________________ + ifconfig eql 198.67.33.239 mtu 1006 + ______________________________________________________________________ + + + + + + Once the eql device is up and running, add a static default route to + it in the routing table using the cool new route syntax that makes + life so much easier: + + + + ______________________________________________________________________ + route add default eql + ______________________________________________________________________ + + + 3.2. Enslaving Devices By Hand + + Enslaving devices by hand requires two utility programs: eql_enslave + and eql_emancipate (-- eql_emancipate hasn't been written because when + an enslaved device "dies", it is automatically taken out of the queue. + I haven't found a good reason to write it yet... other than for + completeness, but that isn't a good motivator is it?--) + + + The syntax for enslaving a device is "eql_enslave <master-name> + <slave-name> <estimated-bps>". Here are some example enslavings: + + + + ______________________________________________________________________ + eql_enslave eql sl0 28800 + eql_enslave eql ppp0 14400 + eql_enslave eql sl1 57600 + ______________________________________________________________________ + + + + + + When you want to free a device from its life of slavery, you can + either down the device with ifconfig (eql will automatically bury the + dead slave and remove it from its queue) or use eql_emancipate to free + it. (-- Or just ifconfig it down, and the eql driver will take it out + for you.--) + + + + ______________________________________________________________________ + eql_emancipate eql sl0 + eql_emancipate eql ppp0 + eql_emancipate eql sl1 + ______________________________________________________________________ + + + + + + 3.3. DSLIP Configuration for the eql Device + + The general idea is to bring up and keep up as many SLIP connections + as you need, automatically. + + + 3.3.1. /etc/slip/runslip.conf + + Here is an example runslip.conf: + + + + + + + + + + + + + + + + ______________________________________________________________________ + name sl-line-1 + enabled + baud 38400 + mtu 576 + ducmd -e /etc/slip/dialout/cua2-288.xp -t 9 + command eql_enslave eql $interface 28800 + address 198.67.33.239 + line /dev/cua2 + + name sl-line-2 + enabled + baud 38400 + mtu 576 + ducmd -e /etc/slip/dialout/cua3-288.xp -t 9 + command eql_enslave eql $interface 28800 + address 198.67.33.239 + line /dev/cua3 + ______________________________________________________________________ + + + + + + 3.4. Using PPP and the eql Device + + I have not yet done any load-balancing testing for PPP devices, mainly + because I don't have a PPP-connection manager like SLIP has with + DSLIP. I did find a good tip from LinuxNET:Billy for PPP performance: + make sure you have asyncmap set to something so that control + characters are not escaped. + + + I tried to fix up a PPP script/system for redialing lost PPP + connections for use with the eql driver the weekend of Feb 25-26 '95 + (Hereafter known as the 8-hour PPP Hate Festival). Perhaps later this + year. + + + 4. About the Slave Scheduler Algorithm + + The slave scheduler probably could be replaced with a dozen other + things and push traffic much faster. The formula in the current set + up of the driver was tuned to handle slaves with wildly different + bits-per-second "priorities". + + + All testing I have done was with two 28.8 V.FC modems, one connecting + at 28800 bps or slower, and the other connecting at 14400 bps all the + time. + + + One version of the scheduler was able to push 5.3 K/s through the + 28800 and 14400 connections, but when the priorities on the links were + very wide apart (57600 vs. 14400) the "faster" modem received all + traffic and the "slower" modem starved. + + + 5. Testers' Reports + + Some people have experimented with the eql device with newer + kernels (than 1.1.75). I have since updated the driver to patch + cleanly in newer kernels because of the removal of the old "slave- + balancing" driver config option. + + + o icee from LinuxNET patched 1.1.86 without any rejects and was able + to boot the kernel and enslave a couple of ISDN PPP links. + + 5.1. Randolph Bentson's Test Report + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + From bentson@grieg.seaslug.org Wed Feb 8 19:08:09 1995 + Date: Tue, 7 Feb 95 22:57 PST + From: Randolph Bentson <bentson@grieg.seaslug.org> + To: guru@ncm.com + Subject: EQL driver tests + + + I have been checking out your eql driver. (Nice work, that!) + Although you may already done this performance testing, here + are some data I've discovered. + + Randolph Bentson + bentson@grieg.seaslug.org + + --------------------------------------------------------- + + + A pseudo-device driver, EQL, written by Simon Janes, can be used + to bundle multiple SLIP connections into what appears to be a + single connection. This allows one to improve dial-up network + connectivity gradually, without having to buy expensive DSU/CSU + hardware and services. + + I have done some testing of this software, with two goals in + mind: first, to ensure it actually works as described and + second, as a method of exercising my device driver. + + The following performance measurements were derived from a set + of SLIP connections run between two Linux systems (1.1.84) using + a 486DX2/66 with a Cyclom-8Ys and a 486SLC/40 with a Cyclom-16Y. + (Ports 0,1,2,3 were used. A later configuration will distribute + port selection across the different Cirrus chips on the boards.) + Once a link was established, I timed a binary ftp transfer of + 289284 bytes of data. If there were no overhead (packet headers, + inter-character and inter-packet delays, etc.) the transfers + would take the following times: + + bits/sec seconds + 345600 8.3 + 234600 12.3 + 172800 16.7 + 153600 18.8 + 76800 37.6 + 57600 50.2 + 38400 75.3 + 28800 100.4 + 19200 150.6 + 9600 301.3 + + A single line running at the lower speeds and with large packets + comes to within 2% of this. Performance is limited for the higher + speeds (as predicted by the Cirrus databook) to an aggregate of + about 160 kbits/sec. The next round of testing will distribute + the load across two or more Cirrus chips. + + The good news is that one gets nearly the full advantage of the + second, third, and fourth line's bandwidth. (The bad news is + that the connection establishment seemed fragile for the higher + speeds. Once established, the connection seemed robust enough.) + + #lines speed mtu seconds theory actual %of + kbit/sec duration speed speed max + 3 115200 900 _ 345600 + 3 115200 400 18.1 345600 159825 46 + 2 115200 900 _ 230400 + 2 115200 600 18.1 230400 159825 69 + 2 115200 400 19.3 230400 149888 65 + 4 57600 900 _ 234600 + 4 57600 600 _ 234600 + 4 57600 400 _ 234600 + 3 57600 600 20.9 172800 138413 80 + 3 57600 900 21.2 172800 136455 78 + 3 115200 600 21.7 345600 133311 38 + 3 57600 400 22.5 172800 128571 74 + 4 38400 900 25.2 153600 114795 74 + 4 38400 600 26.4 153600 109577 71 + 4 38400 400 27.3 153600 105965 68 + 2 57600 900 29.1 115200 99410.3 86 + 1 115200 900 30.7 115200 94229.3 81 + 2 57600 600 30.2 115200 95789.4 83 + 3 38400 900 30.3 115200 95473.3 82 + 3 38400 600 31.2 115200 92719.2 80 + 1 115200 600 31.3 115200 92423 80 + 2 57600 400 32.3 115200 89561.6 77 + 1 115200 400 32.8 115200 88196.3 76 + 3 38400 400 33.5 115200 86353.4 74 + 2 38400 900 43.7 76800 66197.7 86 + 2 38400 600 44 76800 65746.4 85 + 2 38400 400 47.2 76800 61289 79 + 4 19200 900 50.8 76800 56945.7 74 + 4 19200 400 53.2 76800 54376.7 70 + 4 19200 600 53.7 76800 53870.4 70 + 1 57600 900 54.6 57600 52982.4 91 + 1 57600 600 56.2 57600 51474 89 + 3 19200 900 60.5 57600 47815.5 83 + 1 57600 400 60.2 57600 48053.8 83 + 3 19200 600 62 57600 46658.7 81 + 3 19200 400 64.7 57600 44711.6 77 + 1 38400 900 79.4 38400 36433.8 94 + 1 38400 600 82.4 38400 35107.3 91 + 2 19200 900 84.4 38400 34275.4 89 + 1 38400 400 86.8 38400 33327.6 86 + 2 19200 600 87.6 38400 33023.3 85 + 2 19200 400 91.2 38400 31719.7 82 + 4 9600 900 94.7 38400 30547.4 79 + 4 9600 400 106 38400 27290.9 71 + 4 9600 600 110 38400 26298.5 68 + 3 9600 900 118 28800 24515.6 85 + 3 9600 600 120 28800 24107 83 + 3 9600 400 131 28800 22082.7 76 + 1 19200 900 155 19200 18663.5 97 + 1 19200 600 161 19200 17968 93 + 1 19200 400 170 19200 17016.7 88 + 2 9600 600 176 19200 16436.6 85 + 2 9600 900 180 19200 16071.3 83 + 2 9600 400 181 19200 15982.5 83 + 1 9600 900 305 9600 9484.72 98 + 1 9600 600 314 9600 9212.87 95 + 1 9600 400 332 9600 8713.37 90 + + + + + + 5.2. Anthony Healy's Report + + + + + + + + Date: Mon, 13 Feb 1995 16:17:29 +1100 (EST) + From: Antony Healey <ahealey@st.nepean.uws.edu.au> + To: Simon Janes <guru@ncm.com> + Subject: Re: Load Balancing + + Hi Simon, + I've installed your patch and it works great. I have trialed + it over twin SL/IP lines, just over null modems, but I was + able to data at over 48Kb/s [ISDN link -Simon]. I managed a + transfer of up to 7.5 Kbyte/s on one go, but averaged around + 6.4 Kbyte/s, which I think is pretty cool. :) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/Documentation/networking/ewrk3.txt b/Documentation/networking/ewrk3.txt new file mode 100644 index 000000000000..90e9e5f16e6b --- /dev/null +++ b/Documentation/networking/ewrk3.txt @@ -0,0 +1,46 @@ +The EtherWORKS 3 driver in this distribution is designed to work with all +kernels > 1.1.33 (approx) and includes tools in the 'ewrk3tools' +subdirectory to allow set up of the card, similar to the MSDOS +'NICSETUP.EXE' tools provided on the DOS drivers disk (type 'make' in that +subdirectory to make the tools). + +The supported cards are DE203, DE204 and DE205. All other cards are NOT +supported - refer to 'depca.c' for running the LANCE based network cards and +'de4x5.c' for the DIGITAL Semiconductor PCI chip based adapters from +Digital. + +The ability to load this driver as a loadable module has been included and +used extensively during the driver development (to save those long reboot +sequences). To utilise this ability, you have to do 8 things: + + 0) have a copy of the loadable modules code installed on your system. + 1) copy ewrk3.c from the /linux/drivers/net directory to your favourite + temporary directory. + 2) edit the source code near line 1898 to reflect the I/O address and + IRQ you're using. + 3) compile ewrk3.c, but include -DMODULE in the command line to ensure + that the correct bits are compiled (see end of source code). + 4) if you are wanting to add a new card, goto 5. Otherwise, recompile a + kernel with the ewrk3 configuration turned off and reboot. + 5) insmod ewrk3.o + [Alan Cox: Changed this so you can insmod ewrk3.o irq=x io=y] + [Adam Kropelin: Multiple cards now supported by irq=x1,x2 io=y1,y2] + 6) run the net startup bits for your new eth?? interface manually + (usually /etc/rc.inet[12] at boot time). + 7) enjoy! + + Note that autoprobing is not allowed in loadable modules - the system is + already up and running and you're messing with interrupts. + + To unload a module, turn off the associated interface + 'ifconfig eth?? down' then 'rmmod ewrk3'. + +The performance we've achieved so far has been measured through the 'ttcp' +tool at 975kB/s. This measures the total TCP stack performance which +includes the card, so don't expect to get much nearer the 1.25MB/s +theoretical Ethernet rate. + + +Enjoy! + +Dave diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt new file mode 100644 index 000000000000..bbf2005270b5 --- /dev/null +++ b/Documentation/networking/filter.txt @@ -0,0 +1,42 @@ +filter.txt: Linux Socket Filtering +Written by: Jay Schulist <jschlst@samba.org> + +Introduction +============ + + Linux Socket Filtering is derived from the Berkeley +Packet Filter. There are some distinct differences between +the BSD and Linux Kernel Filtering. + +Linux Socket Filtering (LSF) allows a user-space program to +attach a filter onto any socket and allow or disallow certain +types of data to come through the socket. LSF follows exactly +the same filter code structure as the BSD Berkeley Packet Filter +(BPF), so referring to the BSD bpf.4 manpage is very helpful in +creating filters. + +LSF is much simpler than BPF. One does not have to worry about +devices or anything like that. You simply create your filter +code, send it to the kernel via the SO_ATTACH_FILTER ioctl and +if your filter code passes the kernel check on it, you then +immediately begin filtering data on that socket. + +You can also detach filters from your socket via the +SO_DETACH_FILTER ioctl. This will probably not be used much +since when you close a socket that has a filter on it the +filter is automagically removed. The other less common case +may be adding a different filter on the same socket where you had another +filter that is still running: the kernel takes care of removing +the old one and placing your new one in its place, assuming your +filter has passed the checks, otherwise if it fails the old filter +will remain on that socket. + +Examples +======== + +Ioctls- +setsockopt(sockfd, SOL_SOCKET, SO_ATTACH_FILTER, &Filter, sizeof(Filter)); +setsockopt(sockfd, SOL_SOCKET, SO_DETACH_FILTER, &value, sizeof(value)); + +See the BSD bpf.4 manpage and the BSD Packet Filter paper written by +Steven McCanne and Van Jacobson of Lawrence Berkeley Laboratory. diff --git a/Documentation/networking/fore200e.txt b/Documentation/networking/fore200e.txt new file mode 100644 index 000000000000..b1f337f0f4ca --- /dev/null +++ b/Documentation/networking/fore200e.txt @@ -0,0 +1,66 @@ + +FORE Systems PCA-200E/SBA-200E ATM NIC driver +--------------------------------------------- + +This driver adds support for the FORE Systems 200E-series ATM adapters +to the Linux operating system. It is based on the earlier PCA-200E driver +written by Uwe Dannowski. + +The driver simultaneously supports PCA-200E and SBA-200E adapters on +i386, alpha (untested), powerpc, sparc and sparc64 archs. + +The intent is to enable the use of different models of FORE adapters at the +same time, by hosts that have several bus interfaces (such as PCI+SBUS, +PCI+MCA or PCI+EISA). + +Only PCI and SBUS devices are currently supported by the driver, but support +for other bus interfaces such as EISA should not be too hard to add (this may +be more tricky for the MCA bus, though, as FORE made some MCA-specific +modifications to the adapter's AALI interface). + + +Firmware Copyright Notice +------------------------- + +Please read the fore200e_firmware_copyright file present +in the linux/drivers/atm directory for details and restrictions. + + +Firmware Updates +---------------- + +The FORE Systems 200E-series driver is shipped with firmware data being +uploaded to the ATM adapters at system boot time or at module loading time. +The supplied firmware images should work with all adapters. + +However, if you encounter problems (the firmware doesn't start or the driver +is unable to read the PROM data), you may consider trying another firmware +version. Alternative binary firmware images can be found somewhere on the +ForeThought CD-ROM supplied with your adapter by FORE Systems. + +You can also get the latest firmware images from FORE Systems at +http://www.fore.com. Register TACTics Online and go to +the 'software updates' pages. The firmware binaries are part of +the various ForeThought software distributions. + +Notice that different versions of the PCA-200E firmware exist, depending +on the endianess of the host architecture. The driver is shipped with +both little and big endian PCA firmware images. + +Name and location of the new firmware images can be set at kernel +configuration time: + +1. Copy the new firmware binary files (with .bin, .bin1 or .bin2 suffix) + to some directory, such as linux/drivers/atm. + +2. Reconfigure your kernel to set the new firmware name and location. + Expected pathnames are absolute or relative to the drivers/atm directory. + +3. Rebuild and re-install your kernel or your module. + + +Feedback +-------- + +Feedback is welcome. Please send success stories/bug reports/ +patches/improvement/comments/flames to <lizzi@cnam.fr>. diff --git a/Documentation/networking/framerelay.txt b/Documentation/networking/framerelay.txt new file mode 100644 index 000000000000..1a0b720440dd --- /dev/null +++ b/Documentation/networking/framerelay.txt @@ -0,0 +1,39 @@ +Frame Relay (FR) support for linux is built into a two tiered system of device +drivers. The upper layer implements RFC1490 FR specification, and uses the +Data Link Connection Identifier (DLCI) as its hardware address. Usually these +are assigned by your network supplier, they give you the number/numbers of +the Virtual Connections (VC) assigned to you. + +Each DLCI is a point-to-point link between your machine and a remote one. +As such, a separate device is needed to accommodate the routing. Within the +net-tools archives is 'dlcicfg'. This program will communicate with the +base "DLCI" device, and create new net devices named 'dlci00', 'dlci01'... +The configuration script will ask you how many DLCIs you need, as well as +how many DLCIs you want to assign to each Frame Relay Access Device (FRAD). + +The DLCI uses a number of function calls to communicate with the FRAD, all +of which are stored in the FRAD's private data area. assoc/deassoc, +activate/deactivate and dlci_config. The DLCI supplies a receive function +to the FRAD to accept incoming packets. + +With this initial offering, only 1 FRAD driver is available. With many thanks +to Sangoma Technologies, David Mandelstam & Gene Kozin, the S502A, S502E & +S508 are supported. This driver is currently set up for only FR, but as +Sangoma makes more firmware modules available, it can be updated to provide +them as well. + +Configuration of the FRAD makes use of another net-tools program, 'fradcfg'. +This program makes use of a configuration file (which dlcicfg can also read) +to specify the types of boards to be configured as FRADs, as well as perform +any board specific configuration. The Sangoma module of fradcfg loads the +FR firmware into the card, sets the irq/port/memory information, and provides +an initial configuration. + +Additional FRAD device drivers can be added as hardware is available. + +At this time, the dlcicfg and fradcfg programs have not been incorporated into +the net-tools distribution. They can be found at ftp.invlogic.com, in +/pub/linux. Note that with OS/2 FTPD, you end up in /pub by default, so just +use 'cd linux'. v0.10 is for use on pre-2.0.3 and earlier, v0.15 is for +pre-2.0.4 and later. + diff --git a/Documentation/networking/gen_stats.txt b/Documentation/networking/gen_stats.txt new file mode 100644 index 000000000000..c3297f79c137 --- /dev/null +++ b/Documentation/networking/gen_stats.txt @@ -0,0 +1,117 @@ +Generic networking statistics for netlink users +====================================================================== + +Statistic counters are grouped into structs: + +Struct TLV type Description +---------------------------------------------------------------------- +gnet_stats_basic TCA_STATS_BASIC Basic statistics +gnet_stats_rate_est TCA_STATS_RATE_EST Rate estimator +gnet_stats_queue TCA_STATS_QUEUE Queue statistics +none TCA_STATS_APP Application specific + + +Collecting: +----------- + +Declare the statistic structs you need: +struct mystruct { + struct gnet_stats_basic bstats; + struct gnet_stats_queue qstats; + ... +}; + +Update statistics: +mystruct->tstats.packet++; +mystruct->qstats.backlog += skb->pkt_len; + + +Export to userspace (Dump): +--------------------------- + +my_dumping_routine(struct sk_buff *skb, ...) +{ + struct gnet_dump dump; + + if (gnet_stats_start_copy(skb, TCA_STATS2, &mystruct->lock, &dump) < 0) + goto rtattr_failure; + + if (gnet_stats_copy_basic(&dump, &mystruct->bstats) < 0 || + gnet_stats_copy_queue(&dump, &mystruct->qstats) < 0 || + gnet_stats_copy_app(&dump, &xstats, sizeof(xstats)) < 0) + goto rtattr_failure; + + if (gnet_stats_finish_copy(&dump) < 0) + goto rtattr_failure; + ... +} + +TCA_STATS/TCA_XSTATS backward compatibility: +-------------------------------------------- + +Prior users of struct tc_stats and xstats can maintain backward +compatibility by calling the compat wrappers to keep providing the +existing TLV types. + +my_dumping_routine(struct sk_buff *skb, ...) +{ + if (gnet_stats_start_copy_compat(skb, TCA_STATS2, TCA_STATS, + TCA_XSTATS, &mystruct->lock, &dump) < 0) + goto rtattr_failure; + ... +} + +A struct tc_stats will be filled out during gnet_stats_copy_* calls +and appended to the skb. TCA_XSTATS is provided if gnet_stats_copy_app +was called. + + +Locking: +-------- + +Locks are taken before writing and released once all statistics have +been written. Locks are always released in case of an error. You +are responsible for making sure that the lock is initialized. + + +Rate Estimator: +-------------- + +0) Prepare an estimator attribute. Most likely this would be in user + space. The value of this TLV should contain a tc_estimator structure. + As usual, such a TLV nees to be 32 bit aligned and therefore the + length needs to be appropriately set etc. The estimator interval + and ewma log need to be converted to the appropriate values. + tc_estimator.c::tc_setup_estimator() is advisable to be used as the + conversion routine. It does a few clever things. It takes a time + interval in microsecs, a time constant also in microsecs and a struct + tc_estimator to be populated. The returned tc_estimator can be + transported to the kernel. Transfer such a structure in a TLV of type + TCA_RATE to your code in the kernel. + +In the kernel when setting up: +1) make sure you have basic stats and rate stats setup first. +2) make sure you have initialized stats lock that is used to setup such + stats. +3) Now initialize a new estimator: + + int ret = gen_new_estimator(my_basicstats,my_rate_est_stats, + mystats_lock, attr_with_tcestimator_struct); + + if ret == 0 + success + else + failed + +From now on, everytime you dump my_rate_est_stats it will contain +uptodate info. + +Once you are done, call gen_kill_estimator(my_basicstats, +my_rate_est_stats) Make sure that my_basicstats and my_rate_est_stats +are still valid (i.e still exist) at the time of making this call. + + +Authors: +-------- +Thomas Graf <tgraf@suug.ch> +Jamal Hadi Salim <hadi@cyberus.ca> diff --git a/Documentation/networking/generic-hdlc.txt b/Documentation/networking/generic-hdlc.txt new file mode 100644 index 000000000000..7d1dc6b884f3 --- /dev/null +++ b/Documentation/networking/generic-hdlc.txt @@ -0,0 +1,131 @@ +Generic HDLC layer +Krzysztof Halasa <khc@pm.waw.pl> +January, 2003 + + +Generic HDLC layer currently supports: +- Frame Relay (ANSI, CCITT and no LMI), with ARP support (no InARP). + Normal (routed) and Ethernet-bridged (Ethernet device emulation) + interfaces can share a single PVC. +- raw HDLC - either IP (IPv4) interface or Ethernet device emulation. +- Cisco HDLC, +- PPP (uses syncppp.c), +- X.25 (uses X.25 routines). + +There are hardware drivers for the following cards: +- C101 by Moxa Technologies Co., Ltd. +- RISCom/N2 by SDL Communications Inc. +- and others, some not in the official kernel. + +Ethernet device emulation (using HDLC or Frame-Relay PVC) is compatible +with IEEE 802.1Q (VLANs) and 802.1D (Ethernet bridging). + + +Make sure the hdlc.o and the hardware driver are loaded. It should +create a number of "hdlc" (hdlc0 etc) network devices, one for each +WAN port. You'll need the "sethdlc" utility, get it from: + http://hq.pm.waw.pl/hdlc/ + +Compile sethdlc.c utility: + gcc -O2 -Wall -o sethdlc sethdlc.c +Make sure you're using a correct version of sethdlc for your kernel. + +Use sethdlc to set physical interface, clock rate, HDLC mode used, +and add any required PVCs if using Frame Relay. +Usually you want something like: + + sethdlc hdlc0 clock int rate 128000 + sethdlc hdlc0 cisco interval 10 timeout 25 +or + sethdlc hdlc0 rs232 clock ext + sethdlc hdlc0 fr lmi ansi + sethdlc hdlc0 create 99 + ifconfig hdlc0 up + ifconfig pvc0 localIP pointopoint remoteIP + +In Frame Relay mode, ifconfig master hdlc device up (without assigning +any IP address to it) before using pvc devices. + + +Setting interface: + +* v35 | rs232 | x21 | t1 | e1 - sets physical interface for a given port + if the card has software-selectable interfaces + loopback - activate hardware loopback (for testing only) +* clock ext - external clock (uses DTE RX and TX clock) +* clock int - internal clock (provides clock signal on DCE clock output) +* clock txint - TX internal, RX external (provides TX clock on DCE output) +* clock txfromrx - TX clock derived from RX clock (TX clock on DCE output) +* rate - sets clock rate in bps (not required for external clock or + for txfromrx) + +Setting protocol: + +* hdlc - sets raw HDLC (IP-only) mode + nrz / nrzi / fm-mark / fm-space / manchester - sets transmission code + no-parity / crc16 / crc16-pr0 (CRC16 with preset zeros) / crc32-itu + crc16-itu (CRC16 with ITU-T polynomial) / crc16-itu-pr0 - sets parity + +* hdlc-eth - Ethernet device emulation using HDLC. Parity and encoding + as above. + +* cisco - sets Cisco HDLC mode (IP, IPv6 and IPX supported) + interval - time in seconds between keepalive packets + timeout - time in seconds after last received keepalive packet before + we assume the link is down + +* ppp - sets synchronous PPP mode + +* x25 - sets X.25 mode + +* fr - Frame Relay mode + lmi ansi / ccitt / none - LMI (link management) type + dce - Frame Relay DCE (network) side LMI instead of default DTE (user). + It has nothing to do with clocks! + t391 - link integrity verification polling timer (in seconds) - user + t392 - polling verification timer (in seconds) - network + n391 - full status polling counter - user + n392 - error threshold - both user and network + n393 - monitored events count - both user and network + +Frame-Relay only: +* create n | delete n - adds / deletes PVC interface with DLCI #n. + Newly created interface will be named pvc0, pvc1 etc. + +* create ether n | delete ether n - adds a device for Ethernet-bridged + frames. The device will be named pvceth0, pvceth1 etc. + + + + +Board-specific issues +--------------------- + +n2.o and c101.o need parameters to work: + + insmod n2 hw=io,irq,ram,ports[:io,irq,...] +example: + insmod n2 hw=0x300,10,0xD0000,01 + +or + insmod c101 hw=irq,ram[:irq,...] +example: + insmod c101 hw=9,0xdc000 + +If built into the kernel, these drivers need kernel (command line) parameters: + n2.hw=io,irq,ram,ports:... +or + c101.hw=irq,ram:... + + + +If you have a problem with N2 or C101 card, you can issue the "private" +command to see port's packet descriptor rings (in kernel logs): + + sethdlc hdlc0 private + +The hardware driver has to be build with CONFIG_HDLC_DEBUG_RINGS. +Attaching this info to bug reports would be helpful. Anyway, let me know +if you have problems using this. + +For patches and other info look at http://hq.pm.waw.pl/hdlc/ diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c new file mode 100644 index 000000000000..f315d20d3867 --- /dev/null +++ b/Documentation/networking/ifenslave.c @@ -0,0 +1,1110 @@ +/* Mode: C; + * ifenslave.c: Configure network interfaces for parallel routing. + * + * This program controls the Linux implementation of running multiple + * network interfaces in parallel. + * + * Author: Donald Becker <becker@cesdis.gsfc.nasa.gov> + * Copyright 1994-1996 Donald Becker + * + * This program is free software; you can redistribute it + * and/or modify it under the terms of the GNU General Public + * License as published by the Free Software Foundation. + * + * The author may be reached as becker@CESDIS.gsfc.nasa.gov, or C/O + * Center of Excellence in Space Data and Information Sciences + * Code 930.5, Goddard Space Flight Center, Greenbelt MD 20771 + * + * Changes : + * - 2000/10/02 Willy Tarreau <willy at meta-x.org> : + * - few fixes. Master's MAC address is now correctly taken from + * the first device when not previously set ; + * - detach support : call BOND_RELEASE to detach an enslaved interface. + * - give a mini-howto from command-line help : # ifenslave -h + * + * - 2001/02/16 Chad N. Tindel <ctindel at ieee dot org> : + * - Master is now brought down before setting the MAC address. In + * the 2.4 kernel you can't change the MAC address while the device is + * up because you get EBUSY. + * + * - 2001/09/13 Takao Indoh <indou dot takao at jp dot fujitsu dot com> + * - Added the ability to change the active interface on a mode 1 bond + * at runtime. + * + * - 2001/10/23 Chad N. Tindel <ctindel at ieee dot org> : + * - No longer set the MAC address of the master. The bond device will + * take care of this itself + * - Try the SIOC*** versions of the bonding ioctls before using the + * old versions + * - 2002/02/18 Erik Habbinga <erik_habbinga @ hp dot com> : + * - ifr2.ifr_flags was not initialized in the hwaddr_notset case, + * SIOCGIFFLAGS now called before hwaddr_notset test + * + * - 2002/10/31 Tony Cureington <tony.cureington * hp_com> : + * - If the master does not have a hardware address when the first slave + * is enslaved, the master is assigned the hardware address of that + * slave - there is a comment in bonding.c stating "ifenslave takes + * care of this now." This corrects the problem of slaves having + * different hardware addresses in active-backup mode when + * multiple interfaces are specified on a single ifenslave command + * (ifenslave bond0 eth0 eth1). + * + * - 2003/03/18 - Tsippy Mendelson <tsippy.mendelson at intel dot com> and + * Shmulik Hen <shmulik.hen at intel dot com> + * - Moved setting the slave's mac address and openning it, from + * the application to the driver. This enables support of modes + * that need to use the unique mac address of each slave. + * The driver also takes care of closing the slave and restoring its + * original mac address upon release. + * In addition, block possibility of enslaving before the master is up. + * This prevents putting the system in an undefined state. + * + * - 2003/05/01 - Amir Noam <amir.noam at intel dot com> + * - Added ABI version control to restore compatibility between + * new/old ifenslave and new/old bonding. + * - Prevent adding an adapter that is already a slave. + * Fixes the problem of stalling the transmission and leaving + * the slave in a down state. + * + * - 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com> + * - Prevent enslaving if the bond device is down. + * Fixes the problem of leaving the system in unstable state and + * halting when trying to remove the module. + * - Close socket on all abnormal exists. + * - Add versioning scheme that follows that of the bonding driver. + * current version is 1.0.0 as a base line. + * + * - 2003/05/22 - Jay Vosburgh <fubar at us dot ibm dot com> + * - ifenslave -c was broken; it's now fixed + * - Fixed problem with routes vanishing from master during enslave + * processing. + * + * - 2003/05/27 - Amir Noam <amir.noam at intel dot com> + * - Fix backward compatibility issues: + * For drivers not using ABI versions, slave was set down while + * it should be left up before enslaving. + * Also, master was not set down and the default set_mac_address() + * would fail and generate an error message in the system log. + * - For opt_c: slave should not be set to the master's setting + * while it is running. It was already set during enslave. To + * simplify things, it is now handeled separately. + * + * - 2003/12/01 - Shmulik Hen <shmulik.hen at intel dot com> + * - Code cleanup and style changes + * set version to 1.1.0 + */ + +#define APP_VERSION "1.1.0" +#define APP_RELDATE "December 1, 2003" +#define APP_NAME "ifenslave" + +static char *version = +APP_NAME ".c:v" APP_VERSION " (" APP_RELDATE ")\n" +"o Donald Becker (becker@cesdis.gsfc.nasa.gov).\n" +"o Detach support added on 2000/10/02 by Willy Tarreau (willy at meta-x.org).\n" +"o 2.4 kernel support added on 2001/02/16 by Chad N. Tindel\n" +" (ctindel at ieee dot org).\n"; + +static const char *usage_msg = +"Usage: ifenslave [-f] <master-if> <slave-if> [<slave-if>...]\n" +" ifenslave -d <master-if> <slave-if> [<slave-if>...]\n" +" ifenslave -c <master-if> <slave-if>\n" +" ifenslave --help\n"; + +static const char *help_msg = +"\n" +" To create a bond device, simply follow these three steps :\n" +" - ensure that the required drivers are properly loaded :\n" +" # modprobe bonding ; modprobe <3c59x|eepro100|pcnet32|tulip|...>\n" +" - assign an IP address to the bond device :\n" +" # ifconfig bond0 <addr> netmask <mask> broadcast <bcast>\n" +" - attach all the interfaces you need to the bond device :\n" +" # ifenslave [{-f|--force}] bond0 eth0 [eth1 [eth2]...]\n" +" If bond0 didn't have a MAC address, it will take eth0's. Then, all\n" +" interfaces attached AFTER this assignment will get the same MAC addr.\n" +" (except for ALB/TLB modes)\n" +"\n" +" To set the bond device down and automatically release all the slaves :\n" +" # ifconfig bond0 down\n" +"\n" +" To detach a dead interface without setting the bond device down :\n" +" # ifenslave {-d|--detach} bond0 eth0 [eth1 [eth2]...]\n" +"\n" +" To change active slave :\n" +" # ifenslave {-c|--change-active} bond0 eth0\n" +"\n" +" To show master interface info\n" +" # ifenslave bond0\n" +"\n" +" To show all interfaces info\n" +" # ifenslave {-a|--all-interfaces}\n" +"\n" +" To be more verbose\n" +" # ifenslave {-v|--verbose} ...\n" +"\n" +" # ifenslave {-u|--usage} Show usage\n" +" # ifenslave {-V|--version} Show version\n" +" # ifenslave {-h|--help} This message\n" +"\n"; + +#include <unistd.h> +#include <stdlib.h> +#include <stdio.h> +#include <ctype.h> +#include <string.h> +#include <errno.h> +#include <fcntl.h> +#include <getopt.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <sys/ioctl.h> +#include <linux/if.h> +#include <net/if_arp.h> +#include <linux/if_ether.h> +#include <linux/if_bonding.h> +#include <linux/sockios.h> + +typedef unsigned long long u64; /* hack, so we may include kernel's ethtool.h */ +typedef __uint32_t u32; /* ditto */ +typedef __uint16_t u16; /* ditto */ +typedef __uint8_t u8; /* ditto */ +#include <linux/ethtool.h> + +struct option longopts[] = { + /* { name has_arg *flag val } */ + {"all-interfaces", 0, 0, 'a'}, /* Show all interfaces. */ + {"change-active", 0, 0, 'c'}, /* Change the active slave. */ + {"detach", 0, 0, 'd'}, /* Detach a slave interface. */ + {"force", 0, 0, 'f'}, /* Force the operation. */ + {"help", 0, 0, 'h'}, /* Give help */ + {"usage", 0, 0, 'u'}, /* Give usage */ + {"verbose", 0, 0, 'v'}, /* Report each action taken. */ + {"version", 0, 0, 'V'}, /* Emit version information. */ + { 0, 0, 0, 0} +}; + +/* Command-line flags. */ +unsigned int +opt_a = 0, /* Show-all-interfaces flag. */ +opt_c = 0, /* Change-active-slave flag. */ +opt_d = 0, /* Detach a slave interface. */ +opt_f = 0, /* Force the operation. */ +opt_h = 0, /* Help */ +opt_u = 0, /* Usage */ +opt_v = 0, /* Verbose flag. */ +opt_V = 0; /* Version */ + +int skfd = -1; /* AF_INET socket for ioctl() calls.*/ +int abi_ver = 0; /* userland - kernel ABI version */ +int hwaddr_set = 0; /* Master's hwaddr is set */ +int saved_errno; + +struct ifreq master_mtu, master_flags, master_hwaddr; +struct ifreq slave_mtu, slave_flags, slave_hwaddr; + +struct dev_ifr { + struct ifreq *req_ifr; + char *req_name; + int req_type; +}; + +struct dev_ifr master_ifra[] = { + {&master_mtu, "SIOCGIFMTU", SIOCGIFMTU}, + {&master_flags, "SIOCGIFFLAGS", SIOCGIFFLAGS}, + {&master_hwaddr, "SIOCGIFHWADDR", SIOCGIFHWADDR}, + {NULL, "", 0} +}; + +struct dev_ifr slave_ifra[] = { + {&slave_mtu, "SIOCGIFMTU", SIOCGIFMTU}, + {&slave_flags, "SIOCGIFFLAGS", SIOCGIFFLAGS}, + {&slave_hwaddr, "SIOCGIFHWADDR", SIOCGIFHWADDR}, + {NULL, "", 0} +}; + +static void if_print(char *ifname); +static int get_drv_info(char *master_ifname); +static int get_if_settings(char *ifname, struct dev_ifr ifra[]); +static int get_slave_flags(char *slave_ifname); +static int set_master_hwaddr(char *master_ifname, struct sockaddr *hwaddr); +static int set_slave_hwaddr(char *slave_ifname, struct sockaddr *hwaddr); +static int set_slave_mtu(char *slave_ifname, int mtu); +static int set_if_flags(char *ifname, short flags); +static int set_if_up(char *ifname, short flags); +static int set_if_down(char *ifname, short flags); +static int clear_if_addr(char *ifname); +static int set_if_addr(char *master_ifname, char *slave_ifname); +static int change_active(char *master_ifname, char *slave_ifname); +static int enslave(char *master_ifname, char *slave_ifname); +static int release(char *master_ifname, char *slave_ifname); +#define v_print(fmt, args...) \ + if (opt_v) \ + fprintf(stderr, fmt, ## args ) + +int main(int argc, char *argv[]) +{ + char **spp, *master_ifname, *slave_ifname; + int c, i, rv; + int res = 0; + int exclusive = 0; + + while ((c = getopt_long(argc, argv, "acdfhuvV", longopts, 0)) != EOF) { + switch (c) { + case 'a': opt_a++; exclusive++; break; + case 'c': opt_c++; exclusive++; break; + case 'd': opt_d++; exclusive++; break; + case 'f': opt_f++; exclusive++; break; + case 'h': opt_h++; exclusive++; break; + case 'u': opt_u++; exclusive++; break; + case 'v': opt_v++; break; + case 'V': opt_V++; exclusive++; break; + + case '?': + fprintf(stderr, usage_msg); + res = 2; + goto out; + } + } + + /* options check */ + if (exclusive > 1) { + fprintf(stderr, usage_msg); + res = 2; + goto out; + } + + if (opt_v || opt_V) { + printf(version); + if (opt_V) { + res = 0; + goto out; + } + } + + if (opt_u) { + printf(usage_msg); + res = 0; + goto out; + } + + if (opt_h) { + printf(usage_msg); + printf(help_msg); + res = 0; + goto out; + } + + /* Open a basic socket */ + if ((skfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) { + perror("socket"); + res = 1; + goto out; + } + + if (opt_a) { + if (optind == argc) { + /* No remaining args */ + /* show all interfaces */ + if_print((char *)NULL); + goto out; + } else { + /* Just show usage */ + fprintf(stderr, usage_msg); + res = 2; + goto out; + } + } + + /* Copy the interface name */ + spp = argv + optind; + master_ifname = *spp++; + + if (master_ifname == NULL) { + fprintf(stderr, usage_msg); + res = 2; + goto out; + } + + /* exchange abi version with bonding module */ + res = get_drv_info(master_ifname); + if (res) { + fprintf(stderr, + "Master '%s': Error: handshake with driver failed. " + "Aborting\n", + master_ifname); + goto out; + } + + slave_ifname = *spp++; + + if (slave_ifname == NULL) { + if (opt_d || opt_c) { + fprintf(stderr, usage_msg); + res = 2; + goto out; + } + + /* A single arg means show the + * configuration for this interface + */ + if_print(master_ifname); + goto out; + } + + res = get_if_settings(master_ifname, master_ifra); + if (res) { + /* Probably a good reason not to go on */ + fprintf(stderr, + "Master '%s': Error: get settings failed: %s. " + "Aborting\n", + master_ifname, strerror(res)); + goto out; + } + + /* check if master is indeed a master; + * if not then fail any operation + */ + if (!(master_flags.ifr_flags & IFF_MASTER)) { + fprintf(stderr, + "Illegal operation; the specified interface '%s' " + "is not a master. Aborting\n", + master_ifname); + res = 1; + goto out; + } + + /* check if master is up; if not then fail any operation */ + if (!(master_flags.ifr_flags & IFF_UP)) { + fprintf(stderr, + "Illegal operation; the specified master interface " + "'%s' is not up.\n", + master_ifname); + res = 1; + goto out; + } + + /* Only for enslaving */ + if (!opt_c && !opt_d) { + sa_family_t master_family = master_hwaddr.ifr_hwaddr.sa_family; + unsigned char *hwaddr = + (unsigned char *)master_hwaddr.ifr_hwaddr.sa_data; + + /* The family '1' is ARPHRD_ETHER for ethernet. */ + if (master_family != 1 && !opt_f) { + fprintf(stderr, + "Illegal operation: The specified master " + "interface '%s' is not ethernet-like.\n " + "This program is designed to work with " + "ethernet-like network interfaces.\n " + "Use the '-f' option to force the " + "operation.\n", + master_ifname); + res = 1; + goto out; + } + + /* Check master's hw addr */ + for (i = 0; i < 6; i++) { + if (hwaddr[i] != 0) { + hwaddr_set = 1; + break; + } + } + + if (hwaddr_set) { + v_print("current hardware address of master '%s' " + "is %2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x, " + "type %d\n", + master_ifname, + hwaddr[0], hwaddr[1], + hwaddr[2], hwaddr[3], + hwaddr[4], hwaddr[5], + master_family); + } + } + + /* Accepts only one slave */ + if (opt_c) { + /* change active slave */ + res = get_slave_flags(slave_ifname); + if (res) { + fprintf(stderr, + "Slave '%s': Error: get flags failed. " + "Aborting\n", + slave_ifname); + goto out; + } + res = change_active(master_ifname, slave_ifname); + if (res) { + fprintf(stderr, + "Master '%s', Slave '%s': Error: " + "Change active failed\n", + master_ifname, slave_ifname); + } + } else { + /* Accept multiple slaves */ + do { + if (opt_d) { + /* detach a slave interface from the master */ + rv = get_slave_flags(slave_ifname); + if (rv) { + /* Can't work with this slave. */ + /* remember the error and skip it*/ + fprintf(stderr, + "Slave '%s': Error: get flags " + "failed. Skipping\n", + slave_ifname); + res = rv; + continue; + } + rv = release(master_ifname, slave_ifname); + if (rv) { + fprintf(stderr, + "Master '%s', Slave '%s': Error: " + "Release failed\n", + master_ifname, slave_ifname); + res = rv; + } + } else { + /* attach a slave interface to the master */ + rv = get_if_settings(slave_ifname, slave_ifra); + if (rv) { + /* Can't work with this slave. */ + /* remember the error and skip it*/ + fprintf(stderr, + "Slave '%s': Error: get " + "settings failed: %s. " + "Skipping\n", + slave_ifname, strerror(rv)); + res = rv; + continue; + } + rv = enslave(master_ifname, slave_ifname); + if (rv) { + fprintf(stderr, + "Master '%s', Slave '%s': Error: " + "Enslave failed\n", + master_ifname, slave_ifname); + res = rv; + } + } + } while ((slave_ifname = *spp++) != NULL); + } + +out: + if (skfd >= 0) { + close(skfd); + } + + return res; +} + +static short mif_flags; + +/* Get the inteface configuration from the kernel. */ +static int if_getconfig(char *ifname) +{ + struct ifreq ifr; + int metric, mtu; /* Parameters of the master interface. */ + struct sockaddr dstaddr, broadaddr, netmask; + unsigned char *hwaddr; + + strcpy(ifr.ifr_name, ifname); + if (ioctl(skfd, SIOCGIFFLAGS, &ifr) < 0) + return -1; + mif_flags = ifr.ifr_flags; + printf("The result of SIOCGIFFLAGS on %s is %x.\n", + ifname, ifr.ifr_flags); + + strcpy(ifr.ifr_name, ifname); + if (ioctl(skfd, SIOCGIFADDR, &ifr) < 0) + return -1; + printf("The result of SIOCGIFADDR is %2.2x.%2.2x.%2.2x.%2.2x.\n", + ifr.ifr_addr.sa_data[0], ifr.ifr_addr.sa_data[1], + ifr.ifr_addr.sa_data[2], ifr.ifr_addr.sa_data[3]); + + strcpy(ifr.ifr_name, ifname); + if (ioctl(skfd, SIOCGIFHWADDR, &ifr) < 0) + return -1; + + /* Gotta convert from 'char' to unsigned for printf(). */ + hwaddr = (unsigned char *)ifr.ifr_hwaddr.sa_data; + printf("The result of SIOCGIFHWADDR is type %d " + "%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", + ifr.ifr_hwaddr.sa_family, hwaddr[0], hwaddr[1], + hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]); + + strcpy(ifr.ifr_name, ifname); + if (ioctl(skfd, SIOCGIFMETRIC, &ifr) < 0) { + metric = 0; + } else + metric = ifr.ifr_metric; + + strcpy(ifr.ifr_name, ifname); + if (ioctl(skfd, SIOCGIFMTU, &ifr) < 0) + mtu = 0; + else + mtu = ifr.ifr_mtu; + + strcpy(ifr.ifr_name, ifname); + if (ioctl(skfd, SIOCGIFDSTADDR, &ifr) < 0) { + memset(&dstaddr, 0, sizeof(struct sockaddr)); + } else + dstaddr = ifr.ifr_dstaddr; + + strcpy(ifr.ifr_name, ifname); + if (ioctl(skfd, SIOCGIFBRDADDR, &ifr) < 0) { + memset(&broadaddr, 0, sizeof(struct sockaddr)); + } else + broadaddr = ifr.ifr_broadaddr; + + strcpy(ifr.ifr_name, ifname); + if (ioctl(skfd, SIOCGIFNETMASK, &ifr) < 0) { + memset(&netmask, 0, sizeof(struct sockaddr)); + } else + netmask = ifr.ifr_netmask; + + return 0; +} + +static void if_print(char *ifname) +{ + char buff[1024]; + struct ifconf ifc; + struct ifreq *ifr; + int i; + + if (ifname == (char *)NULL) { + ifc.ifc_len = sizeof(buff); + ifc.ifc_buf = buff; + if (ioctl(skfd, SIOCGIFCONF, &ifc) < 0) { + perror("SIOCGIFCONF failed"); + return; + } + + ifr = ifc.ifc_req; + for (i = ifc.ifc_len / sizeof(struct ifreq); --i >= 0; ifr++) { + if (if_getconfig(ifr->ifr_name) < 0) { + fprintf(stderr, + "%s: unknown interface.\n", + ifr->ifr_name); + continue; + } + + if (((mif_flags & IFF_UP) == 0) && !opt_a) continue; + /*ife_print(&ife);*/ + } + } else { + if (if_getconfig(ifname) < 0) { + fprintf(stderr, + "%s: unknown interface.\n", ifname); + } + } +} + +static int get_drv_info(char *master_ifname) +{ + struct ifreq ifr; + struct ethtool_drvinfo info; + char *endptr; + + memset(&ifr, 0, sizeof(ifr)); + strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ); + ifr.ifr_data = (caddr_t)&info; + + info.cmd = ETHTOOL_GDRVINFO; + strncpy(info.driver, "ifenslave", 32); + snprintf(info.fw_version, 32, "%d", BOND_ABI_VERSION); + + if (ioctl(skfd, SIOCETHTOOL, &ifr) < 0) { + if (errno == EOPNOTSUPP) { + goto out; + } + + saved_errno = errno; + v_print("Master '%s': Error: get bonding info failed %s\n", + master_ifname, strerror(saved_errno)); + return 1; + } + + abi_ver = strtoul(info.fw_version, &endptr, 0); + if (*endptr) { + v_print("Master '%s': Error: got invalid string as an ABI " + "version from the bonding module\n", + master_ifname); + return 1; + } + +out: + v_print("ABI ver is %d\n", abi_ver); + + return 0; +} + +static int change_active(char *master_ifname, char *slave_ifname) +{ + struct ifreq ifr; + int res = 0; + + if (!(slave_flags.ifr_flags & IFF_SLAVE)) { + fprintf(stderr, + "Illegal operation: The specified slave interface " + "'%s' is not a slave\n", + slave_ifname); + return 1; + } + + strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ); + strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ); + if ((ioctl(skfd, SIOCBONDCHANGEACTIVE, &ifr) < 0) && + (ioctl(skfd, BOND_CHANGE_ACTIVE_OLD, &ifr) < 0)) { + saved_errno = errno; + v_print("Master '%s': Error: SIOCBONDCHANGEACTIVE failed: " + "%s\n", + master_ifname, strerror(saved_errno)); + res = 1; + } + + return res; +} + +static int enslave(char *master_ifname, char *slave_ifname) +{ + struct ifreq ifr; + int res = 0; + + if (slave_flags.ifr_flags & IFF_SLAVE) { + fprintf(stderr, + "Illegal operation: The specified slave interface " + "'%s' is already a slave\n", + slave_ifname); + return 1; + } + + res = set_if_down(slave_ifname, slave_flags.ifr_flags); + if (res) { + fprintf(stderr, + "Slave '%s': Error: bring interface down failed\n", + slave_ifname); + return res; + } + + if (abi_ver < 2) { + /* Older bonding versions would panic if the slave has no IP + * address, so get the IP setting from the master. + */ + res = set_if_addr(master_ifname, slave_ifname); + if (res) { + fprintf(stderr, + "Slave '%s': Error: set address failed\n", + slave_ifname); + return res; + } + } else { + res = clear_if_addr(slave_ifname); + if (res) { + fprintf(stderr, + "Slave '%s': Error: clear address failed\n", + slave_ifname); + return res; + } + } + + if (master_mtu.ifr_mtu != slave_mtu.ifr_mtu) { + res = set_slave_mtu(slave_ifname, master_mtu.ifr_mtu); + if (res) { + fprintf(stderr, + "Slave '%s': Error: set MTU failed\n", + slave_ifname); + return res; + } + } + + if (hwaddr_set) { + /* Master already has an hwaddr + * so set it's hwaddr to the slave + */ + if (abi_ver < 1) { + /* The driver is using an old ABI, so + * the application sets the slave's + * hwaddr + */ + res = set_slave_hwaddr(slave_ifname, + &(master_hwaddr.ifr_hwaddr)); + if (res) { + fprintf(stderr, + "Slave '%s': Error: set hw address " + "failed\n", + slave_ifname); + goto undo_mtu; + } + + /* For old ABI the application needs to bring the + * slave back up + */ + res = set_if_up(slave_ifname, slave_flags.ifr_flags); + if (res) { + fprintf(stderr, + "Slave '%s': Error: bring interface " + "down failed\n", + slave_ifname); + goto undo_slave_mac; + } + } + /* The driver is using a new ABI, + * so the driver takes care of setting + * the slave's hwaddr and bringing + * it up again + */ + } else { + /* No hwaddr for master yet, so + * set the slave's hwaddr to it + */ + if (abi_ver < 1) { + /* For old ABI, the master needs to be + * down before setting it's hwaddr + */ + res = set_if_down(master_ifname, master_flags.ifr_flags); + if (res) { + fprintf(stderr, + "Master '%s': Error: bring interface " + "down failed\n", + master_ifname); + goto undo_mtu; + } + } + + res = set_master_hwaddr(master_ifname, + &(slave_hwaddr.ifr_hwaddr)); + if (res) { + fprintf(stderr, + "Master '%s': Error: set hw address " + "failed\n", + master_ifname); + goto undo_mtu; + } + + if (abi_ver < 1) { + /* For old ABI, bring the master + * back up + */ + res = set_if_up(master_ifname, master_flags.ifr_flags); + if (res) { + fprintf(stderr, + "Master '%s': Error: bring interface " + "up failed\n", + master_ifname); + goto undo_master_mac; + } + } + + hwaddr_set = 1; + } + + /* Do the real thing */ + strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ); + strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ); + if ((ioctl(skfd, SIOCBONDENSLAVE, &ifr) < 0) && + (ioctl(skfd, BOND_ENSLAVE_OLD, &ifr) < 0)) { + saved_errno = errno; + v_print("Master '%s': Error: SIOCBONDENSLAVE failed: %s\n", + master_ifname, strerror(saved_errno)); + res = 1; + } + + if (res) { + goto undo_master_mac; + } + + return 0; + +/* rollback (best effort) */ +undo_master_mac: + set_master_hwaddr(master_ifname, &(master_hwaddr.ifr_hwaddr)); + hwaddr_set = 0; + goto undo_mtu; +undo_slave_mac: + set_slave_hwaddr(slave_ifname, &(slave_hwaddr.ifr_hwaddr)); +undo_mtu: + set_slave_mtu(slave_ifname, slave_mtu.ifr_mtu); + return res; +} + +static int release(char *master_ifname, char *slave_ifname) +{ + struct ifreq ifr; + int res = 0; + + if (!(slave_flags.ifr_flags & IFF_SLAVE)) { + fprintf(stderr, + "Illegal operation: The specified slave interface " + "'%s' is not a slave\n", + slave_ifname); + return 1; + } + + strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ); + strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ); + if ((ioctl(skfd, SIOCBONDRELEASE, &ifr) < 0) && + (ioctl(skfd, BOND_RELEASE_OLD, &ifr) < 0)) { + saved_errno = errno; + v_print("Master '%s': Error: SIOCBONDRELEASE failed: %s\n", + master_ifname, strerror(saved_errno)); + return 1; + } else if (abi_ver < 1) { + /* The driver is using an old ABI, so we'll set the interface + * down to avoid any conflicts due to same MAC/IP + */ + res = set_if_down(slave_ifname, slave_flags.ifr_flags); + if (res) { + fprintf(stderr, + "Slave '%s': Error: bring interface " + "down failed\n", + slave_ifname); + } + } + + /* set to default mtu */ + set_slave_mtu(slave_ifname, 1500); + + return res; +} + +static int get_if_settings(char *ifname, struct dev_ifr ifra[]) +{ + int i; + int res = 0; + + for (i = 0; ifra[i].req_ifr; i++) { + strncpy(ifra[i].req_ifr->ifr_name, ifname, IFNAMSIZ); + res = ioctl(skfd, ifra[i].req_type, ifra[i].req_ifr); + if (res < 0) { + saved_errno = errno; + v_print("Interface '%s': Error: %s failed: %s\n", + ifname, ifra[i].req_name, + strerror(saved_errno)); + + return saved_errno; + } + } + + return 0; +} + +static int get_slave_flags(char *slave_ifname) +{ + int res = 0; + + strncpy(slave_flags.ifr_name, slave_ifname, IFNAMSIZ); + res = ioctl(skfd, SIOCGIFFLAGS, &slave_flags); + if (res < 0) { + saved_errno = errno; + v_print("Slave '%s': Error: SIOCGIFFLAGS failed: %s\n", + slave_ifname, strerror(saved_errno)); + } else { + v_print("Slave %s: flags %04X.\n", + slave_ifname, slave_flags.ifr_flags); + } + + return res; +} + +static int set_master_hwaddr(char *master_ifname, struct sockaddr *hwaddr) +{ + unsigned char *addr = (unsigned char *)hwaddr->sa_data; + struct ifreq ifr; + int res = 0; + + strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ); + memcpy(&(ifr.ifr_hwaddr), hwaddr, sizeof(struct sockaddr)); + res = ioctl(skfd, SIOCSIFHWADDR, &ifr); + if (res < 0) { + saved_errno = errno; + v_print("Master '%s': Error: SIOCSIFHWADDR failed: %s\n", + master_ifname, strerror(saved_errno)); + return res; + } else { + v_print("Master '%s': hardware address set to " + "%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", + master_ifname, addr[0], addr[1], addr[2], + addr[3], addr[4], addr[5]); + } + + return res; +} + +static int set_slave_hwaddr(char *slave_ifname, struct sockaddr *hwaddr) +{ + unsigned char *addr = (unsigned char *)hwaddr->sa_data; + struct ifreq ifr; + int res = 0; + + strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ); + memcpy(&(ifr.ifr_hwaddr), hwaddr, sizeof(struct sockaddr)); + res = ioctl(skfd, SIOCSIFHWADDR, &ifr); + if (res < 0) { + saved_errno = errno; + + v_print("Slave '%s': Error: SIOCSIFHWADDR failed: %s\n", + slave_ifname, strerror(saved_errno)); + + if (saved_errno == EBUSY) { + v_print(" The device is busy: it must be idle " + "before running this command.\n"); + } else if (saved_errno == EOPNOTSUPP) { + v_print(" The device does not support setting " + "the MAC address.\n" + " Your kernel likely does not support slave " + "devices.\n"); + } else if (saved_errno == EINVAL) { + v_print(" The device's address type does not match " + "the master's address type.\n"); + } + return res; + } else { + v_print("Slave '%s': hardware address set to " + "%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", + slave_ifname, addr[0], addr[1], addr[2], + addr[3], addr[4], addr[5]); + } + + return res; +} + +static int set_slave_mtu(char *slave_ifname, int mtu) +{ + struct ifreq ifr; + int res = 0; + + ifr.ifr_mtu = mtu; + strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ); + + res = ioctl(skfd, SIOCSIFMTU, &ifr); + if (res < 0) { + saved_errno = errno; + v_print("Slave '%s': Error: SIOCSIFMTU failed: %s\n", + slave_ifname, strerror(saved_errno)); + } else { + v_print("Slave '%s': MTU set to %d.\n", slave_ifname, mtu); + } + + return res; +} + +static int set_if_flags(char *ifname, short flags) +{ + struct ifreq ifr; + int res = 0; + + ifr.ifr_flags = flags; + strncpy(ifr.ifr_name, ifname, IFNAMSIZ); + + res = ioctl(skfd, SIOCSIFFLAGS, &ifr); + if (res < 0) { + saved_errno = errno; + v_print("Interface '%s': Error: SIOCSIFFLAGS failed: %s\n", + ifname, strerror(saved_errno)); + } else { + v_print("Interface '%s': flags set to %04X.\n", ifname, flags); + } + + return res; +} + +static int set_if_up(char *ifname, short flags) +{ + return set_if_flags(ifname, flags | IFF_UP); +} + +static int set_if_down(char *ifname, short flags) +{ + return set_if_flags(ifname, flags & ~IFF_UP); +} + +static int clear_if_addr(char *ifname) +{ + struct ifreq ifr; + int res = 0; + + strncpy(ifr.ifr_name, ifname, IFNAMSIZ); + ifr.ifr_addr.sa_family = AF_INET; + memset(ifr.ifr_addr.sa_data, 0, sizeof(ifr.ifr_addr.sa_data)); + + res = ioctl(skfd, SIOCSIFADDR, &ifr); + if (res < 0) { + saved_errno = errno; + v_print("Interface '%s': Error: SIOCSIFADDR failed: %s\n", + ifname, strerror(saved_errno)); + } else { + v_print("Interface '%s': address cleared\n", ifname); + } + + return res; +} + +static int set_if_addr(char *master_ifname, char *slave_ifname) +{ + struct ifreq ifr; + int res; + unsigned char *ipaddr; + int i; + struct { + char *req_name; + char *desc; + int g_ioctl; + int s_ioctl; + } ifra[] = { + {"IFADDR", "addr", SIOCGIFADDR, SIOCSIFADDR}, + {"DSTADDR", "destination addr", SIOCGIFDSTADDR, SIOCSIFDSTADDR}, + {"BRDADDR", "broadcast addr", SIOCGIFBRDADDR, SIOCSIFBRDADDR}, + {"NETMASK", "netmask", SIOCGIFNETMASK, SIOCSIFNETMASK}, + {NULL, NULL, 0, 0}, + }; + + for (i = 0; ifra[i].req_name; i++) { + strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ); + res = ioctl(skfd, ifra[i].g_ioctl, &ifr); + if (res < 0) { + int saved_errno = errno; + + v_print("Interface '%s': Error: SIOCG%s failed: %s\n", + master_ifname, ifra[i].req_name, + strerror(saved_errno)); + + ifr.ifr_addr.sa_family = AF_INET; + memset(ifr.ifr_addr.sa_data, 0, + sizeof(ifr.ifr_addr.sa_data)); + } + + strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ); + res = ioctl(skfd, ifra[i].s_ioctl, &ifr); + if (res < 0) { + int saved_errno = errno; + + v_print("Interface '%s': Error: SIOCS%s failed: %s\n", + slave_ifname, ifra[i].req_name, + strerror(saved_errno)); + + return res; + } + + ipaddr = ifr.ifr_addr.sa_data; + v_print("Interface '%s': set IP %s to %d.%d.%d.%d\n", + slave_ifname, ifra[i].desc, + ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]); + } + + return 0; +} + +/* + * Local variables: + * version-control: t + * kept-new-versions: 5 + * c-indent-level: 4 + * c-basic-offset: 4 + * tab-width: 4 + * compile-command: "gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave" + * End: + */ + diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt new file mode 100644 index 000000000000..a2c893a7475d --- /dev/null +++ b/Documentation/networking/ip-sysctl.txt @@ -0,0 +1,878 @@ +/proc/sys/net/ipv4/* Variables: + +ip_forward - BOOLEAN + 0 - disabled (default) + not 0 - enabled + + Forward Packets between interfaces. + + This variable is special, its change resets all configuration + parameters to their default state (RFC1122 for hosts, RFC1812 + for routers) + +ip_default_ttl - INTEGER + default 64 + +ip_no_pmtu_disc - BOOLEAN + Disable Path MTU Discovery. + default FALSE + +min_pmtu - INTEGER + default 562 - minimum discovered Path MTU + +mtu_expires - INTEGER + Time, in seconds, that cached PMTU information is kept. + +min_adv_mss - INTEGER + The advertised MSS depends on the first hop route MTU, but will + never be lower than this setting. + +IP Fragmentation: + +ipfrag_high_thresh - INTEGER + Maximum memory used to reassemble IP fragments. When + ipfrag_high_thresh bytes of memory is allocated for this purpose, + the fragment handler will toss packets until ipfrag_low_thresh + is reached. + +ipfrag_low_thresh - INTEGER + See ipfrag_high_thresh + +ipfrag_time - INTEGER + Time in seconds to keep an IP fragment in memory. + +ipfrag_secret_interval - INTEGER + Regeneration interval (in seconds) of the hash secret (or lifetime + for the hash secret) for IP fragments. + Default: 600 + +INET peer storage: + +inet_peer_threshold - INTEGER + The approximate size of the storage. Starting from this threshold + entries will be thrown aggressively. This threshold also determines + entries' time-to-live and time intervals between garbage collection + passes. More entries, less time-to-live, less GC interval. + +inet_peer_minttl - INTEGER + Minimum time-to-live of entries. Should be enough to cover fragment + time-to-live on the reassembling side. This minimum time-to-live is + guaranteed if the pool size is less than inet_peer_threshold. + Measured in jiffies(1). + +inet_peer_maxttl - INTEGER + Maximum time-to-live of entries. Unused entries will expire after + this period of time if there is no memory pressure on the pool (i.e. + when the number of entries in the pool is very small). + Measured in jiffies(1). + +inet_peer_gc_mintime - INTEGER + Minimum interval between garbage collection passes. This interval is + in effect under high memory pressure on the pool. + Measured in jiffies(1). + +inet_peer_gc_maxtime - INTEGER + Minimum interval between garbage collection passes. This interval is + in effect under low (or absent) memory pressure on the pool. + Measured in jiffies(1). + +TCP variables: + +tcp_syn_retries - INTEGER + Number of times initial SYNs for an active TCP connection attempt + will be retransmitted. Should not be higher than 255. Default value + is 5, which corresponds to ~180seconds. + +tcp_synack_retries - INTEGER + Number of times SYNACKs for a passive TCP connection attempt will + be retransmitted. Should not be higher than 255. Default value + is 5, which corresponds to ~180seconds. + +tcp_keepalive_time - INTEGER + How often TCP sends out keepalive messages when keepalive is enabled. + Default: 2hours. + +tcp_keepalive_probes - INTEGER + How many keepalive probes TCP sends out, until it decides that the + connection is broken. Default value: 9. + +tcp_keepalive_intvl - INTEGER + How frequently the probes are send out. Multiplied by + tcp_keepalive_probes it is time to kill not responding connection, + after probes started. Default value: 75sec i.e. connection + will be aborted after ~11 minutes of retries. + +tcp_retries1 - INTEGER + How many times to retry before deciding that something is wrong + and it is necessary to report this suspicion to network layer. + Minimal RFC value is 3, it is default, which corresponds + to ~3sec-8min depending on RTO. + +tcp_retries2 - INTEGER + How may times to retry before killing alive TCP connection. + RFC1122 says that the limit should be longer than 100 sec. + It is too small number. Default value 15 corresponds to ~13-30min + depending on RTO. + +tcp_orphan_retries - INTEGER + How may times to retry before killing TCP connection, closed + by our side. Default value 7 corresponds to ~50sec-16min + depending on RTO. If you machine is loaded WEB server, + you should think about lowering this value, such sockets + may consume significant resources. Cf. tcp_max_orphans. + +tcp_fin_timeout - INTEGER + Time to hold socket in state FIN-WAIT-2, if it was closed + by our side. Peer can be broken and never close its side, + or even died unexpectedly. Default value is 60sec. + Usual value used in 2.2 was 180 seconds, you may restore + it, but remember that if your machine is even underloaded WEB server, + you risk to overflow memory with kilotons of dead sockets, + FIN-WAIT-2 sockets are less dangerous than FIN-WAIT-1, + because they eat maximum 1.5K of memory, but they tend + to live longer. Cf. tcp_max_orphans. + +tcp_max_tw_buckets - INTEGER + Maximal number of timewait sockets held by system simultaneously. + If this number is exceeded time-wait socket is immediately destroyed + and warning is printed. This limit exists only to prevent + simple DoS attacks, you _must_ not lower the limit artificially, + but rather increase it (probably, after increasing installed memory), + if network conditions require more than default value. + +tcp_tw_recycle - BOOLEAN + Enable fast recycling TIME-WAIT sockets. Default value is 0. + It should not be changed without advice/request of technical + experts. + +tcp_tw_reuse - BOOLEAN + Allow to reuse TIME-WAIT sockets for new connections when it is + safe from protocol viewpoint. Default value is 0. + It should not be changed without advice/request of technical + experts. + +tcp_max_orphans - INTEGER + Maximal number of TCP sockets not attached to any user file handle, + held by system. If this number is exceeded orphaned connections are + reset immediately and warning is printed. This limit exists + only to prevent simple DoS attacks, you _must_ not rely on this + or lower the limit artificially, but rather increase it + (probably, after increasing installed memory), + if network conditions require more than default value, + and tune network services to linger and kill such states + more aggressively. Let me to remind again: each orphan eats + up to ~64K of unswappable memory. + +tcp_abort_on_overflow - BOOLEAN + If listening service is too slow to accept new connections, + reset them. Default state is FALSE. It means that if overflow + occurred due to a burst, connection will recover. Enable this + option _only_ if you are really sure that listening daemon + cannot be tuned to accept connections faster. Enabling this + option can harm clients of your server. + +tcp_syncookies - BOOLEAN + Only valid when the kernel was compiled with CONFIG_SYNCOOKIES + Send out syncookies when the syn backlog queue of a socket + overflows. This is to prevent against the common 'syn flood attack' + Default: FALSE + + Note, that syncookies is fallback facility. + It MUST NOT be used to help highly loaded servers to stand + against legal connection rate. If you see synflood warnings + in your logs, but investigation shows that they occur + because of overload with legal connections, you should tune + another parameters until this warning disappear. + See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow. + + syncookies seriously violate TCP protocol, do not allow + to use TCP extensions, can result in serious degradation + of some services (f.e. SMTP relaying), visible not by you, + but your clients and relays, contacting you. While you see + synflood warnings in logs not being really flooded, your server + is seriously misconfigured. + +tcp_stdurg - BOOLEAN + Use the Host requirements interpretation of the TCP urg pointer field. + Most hosts use the older BSD interpretation, so if you turn this on + Linux might not communicate correctly with them. + Default: FALSE + +tcp_max_syn_backlog - INTEGER + Maximal number of remembered connection requests, which are + still did not receive an acknowledgment from connecting client. + Default value is 1024 for systems with more than 128Mb of memory, + and 128 for low memory machines. If server suffers of overload, + try to increase this number. + +tcp_window_scaling - BOOLEAN + Enable window scaling as defined in RFC1323. + +tcp_timestamps - BOOLEAN + Enable timestamps as defined in RFC1323. + +tcp_sack - BOOLEAN + Enable select acknowledgments (SACKS). + +tcp_fack - BOOLEAN + Enable FACK congestion avoidance and fast retransmission. + The value is not used, if tcp_sack is not enabled. + +tcp_dsack - BOOLEAN + Allows TCP to send "duplicate" SACKs. + +tcp_ecn - BOOLEAN + Enable Explicit Congestion Notification in TCP. + +tcp_reordering - INTEGER + Maximal reordering of packets in a TCP stream. + Default: 3 + +tcp_retrans_collapse - BOOLEAN + Bug-to-bug compatibility with some broken printers. + On retransmit try to send bigger packets to work around bugs in + certain TCP stacks. + +tcp_wmem - vector of 3 INTEGERs: min, default, max + min: Amount of memory reserved for send buffers for TCP socket. + Each TCP socket has rights to use it due to fact of its birth. + Default: 4K + + default: Amount of memory allowed for send buffers for TCP socket + by default. This value overrides net.core.wmem_default used + by other protocols, it is usually lower than net.core.wmem_default. + Default: 16K + + max: Maximal amount of memory allowed for automatically selected + send buffers for TCP socket. This value does not override + net.core.wmem_max, "static" selection via SO_SNDBUF does not use this. + Default: 128K + +tcp_rmem - vector of 3 INTEGERs: min, default, max + min: Minimal size of receive buffer used by TCP sockets. + It is guaranteed to each TCP socket, even under moderate memory + pressure. + Default: 8K + + default: default size of receive buffer used by TCP sockets. + This value overrides net.core.rmem_default used by other protocols. + Default: 87380 bytes. This value results in window of 65535 with + default setting of tcp_adv_win_scale and tcp_app_win:0 and a bit + less for default tcp_app_win. See below about these variables. + + max: maximal size of receive buffer allowed for automatically + selected receiver buffers for TCP socket. This value does not override + net.core.rmem_max, "static" selection via SO_RCVBUF does not use this. + Default: 87380*2 bytes. + +tcp_mem - vector of 3 INTEGERs: min, pressure, max + low: below this number of pages TCP is not bothered about its + memory appetite. + + pressure: when amount of memory allocated by TCP exceeds this number + of pages, TCP moderates its memory consumption and enters memory + pressure mode, which is exited when memory consumption falls + under "low". + + high: number of pages allowed for queueing by all TCP sockets. + + Defaults are calculated at boot time from amount of available + memory. + +tcp_app_win - INTEGER + Reserve max(window/2^tcp_app_win, mss) of window for application + buffer. Value 0 is special, it means that nothing is reserved. + Default: 31 + +tcp_adv_win_scale - INTEGER + Count buffering overhead as bytes/2^tcp_adv_win_scale + (if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale), + if it is <= 0. + Default: 2 + +tcp_rfc1337 - BOOLEAN + If set, the TCP stack behaves conforming to RFC1337. If unset, + we are not conforming to RFC, but prevent TCP TIME_WAIT + assassination. + Default: 0 + +tcp_low_latency - BOOLEAN + If set, the TCP stack makes decisions that prefer lower + latency as opposed to higher throughput. By default, this + option is not set meaning that higher throughput is preferred. + An example of an application where this default should be + changed would be a Beowulf compute cluster. + Default: 0 + +tcp_westwood - BOOLEAN + Enable TCP Westwood+ congestion control algorithm. + TCP Westwood+ is a sender-side only modification of the TCP Reno + protocol stack that optimizes the performance of TCP congestion + control. It is based on end-to-end bandwidth estimation to set + congestion window and slow start threshold after a congestion + episode. Using this estimation, TCP Westwood+ adaptively sets a + slow start threshold and a congestion window which takes into + account the bandwidth used at the time congestion is experienced. + TCP Westwood+ significantly increases fairness wrt TCP Reno in + wired networks and throughput over wireless links. + Default: 0 + +tcp_vegas_cong_avoid - BOOLEAN + Enable TCP Vegas congestion avoidance algorithm. + TCP Vegas is a sender-side only change to TCP that anticipates + the onset of congestion by estimating the bandwidth. TCP Vegas + adjusts the sending rate by modifying the congestion + window. TCP Vegas should provide less packet loss, but it is + not as aggressive as TCP Reno. + Default:0 + +tcp_bic - BOOLEAN + Enable BIC TCP congestion control algorithm. + BIC-TCP is a sender-side only change that ensures a linear RTT + fairness under large windows while offering both scalability and + bounded TCP-friendliness. The protocol combines two schemes + called additive increase and binary search increase. When the + congestion window is large, additive increase with a large + increment ensures linear RTT fairness as well as good + scalability. Under small congestion windows, binary search + increase provides TCP friendliness. + Default: 0 + +tcp_bic_low_window - INTEGER + Sets the threshold window (in packets) where BIC TCP starts to + adjust the congestion window. Below this threshold BIC TCP behaves + the same as the default TCP Reno. + Default: 14 + +tcp_bic_fast_convergence - BOOLEAN + Forces BIC TCP to more quickly respond to changes in congestion + window. Allows two flows sharing the same connection to converge + more rapidly. + Default: 1 + +tcp_default_win_scale - INTEGER + Sets the minimum window scale TCP will negotiate for on all + conections. + Default: 7 + +tcp_tso_win_divisor - INTEGER + This allows control over what percentage of the congestion window + can be consumed by a single TSO frame. + The setting of this parameter is a choice between burstiness and + building larger TSO frames. + Default: 8 + +tcp_frto - BOOLEAN + Enables F-RTO, an enhanced recovery algorithm for TCP retransmission + timeouts. It is particularly beneficial in wireless environments + where packet loss is typically due to random radio interference + rather than intermediate router congestion. + +somaxconn - INTEGER + Limit of socket listen() backlog, known in userspace as SOMAXCONN. + Defaults to 128. See also tcp_max_syn_backlog for additional tuning + for TCP sockets. + +IP Variables: + +ip_local_port_range - 2 INTEGERS + Defines the local port range that is used by TCP and UDP to + choose the local port. The first number is the first, the + second the last local port number. Default value depends on + amount of memory available on the system: + > 128Mb 32768-61000 + < 128Mb 1024-4999 or even less. + This number defines number of active connections, which this + system can issue simultaneously to systems not supporting + TCP extensions (timestamps). With tcp_tw_recycle enabled + (i.e. by default) range 1024-4999 is enough to issue up to + 2000 connections per second to systems supporting timestamps. + +ip_nonlocal_bind - BOOLEAN + If set, allows processes to bind() to non-local IP addresses, + which can be quite useful - but may break some applications. + Default: 0 + +ip_dynaddr - BOOLEAN + If set non-zero, enables support for dynamic addresses. + If set to a non-zero value larger than 1, a kernel log + message will be printed when dynamic address rewriting + occurs. + Default: 0 + +icmp_echo_ignore_all - BOOLEAN +icmp_echo_ignore_broadcasts - BOOLEAN + If either is set to true, then the kernel will ignore either all + ICMP ECHO requests sent to it or just those to broadcast/multicast + addresses, respectively. + +icmp_ratelimit - INTEGER + Limit the maximal rates for sending ICMP packets whose type matches + icmp_ratemask (see below) to specific targets. + 0 to disable any limiting, otherwise the maximal rate in jiffies(1) + Default: 100 + +icmp_ratemask - INTEGER + Mask made of ICMP types for which rates are being limited. + Significant bits: IHGFEDCBA9876543210 + Default mask: 0000001100000011000 (6168) + + Bit definitions (see include/linux/icmp.h): + 0 Echo Reply + 3 Destination Unreachable * + 4 Source Quench * + 5 Redirect + 8 Echo Request + B Time Exceeded * + C Parameter Problem * + D Timestamp Request + E Timestamp Reply + F Info Request + G Info Reply + H Address Mask Request + I Address Mask Reply + + * These are rate limited by default (see default mask above) + +icmp_ignore_bogus_error_responses - BOOLEAN + Some routers violate RFC1122 by sending bogus responses to broadcast + frames. Such violations are normally logged via a kernel warning. + If this is set to TRUE, the kernel will not give such warnings, which + will avoid log file clutter. + Default: FALSE + +igmp_max_memberships - INTEGER + Change the maximum number of multicast groups we can subscribe to. + Default: 20 + +conf/interface/* changes special settings per interface (where "interface" is + the name of your network interface) +conf/all/* is special, changes the settings for all interfaces + + +log_martians - BOOLEAN + Log packets with impossible addresses to kernel log. + log_martians for the interface will be enabled if at least one of + conf/{all,interface}/log_martians is set to TRUE, + it will be disabled otherwise + +accept_redirects - BOOLEAN + Accept ICMP redirect messages. + accept_redirects for the interface will be enabled if: + - both conf/{all,interface}/accept_redirects are TRUE in the case forwarding + for the interface is enabled + or + - at least one of conf/{all,interface}/accept_redirects is TRUE in the case + forwarding for the interface is disabled + accept_redirects for the interface will be disabled otherwise + default TRUE (host) + FALSE (router) + +forwarding - BOOLEAN + Enable IP forwarding on this interface. + +mc_forwarding - BOOLEAN + Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE + and a multicast routing daemon is required. + conf/all/mc_forwarding must also be set to TRUE to enable multicast routing + for the interface + +medium_id - INTEGER + Integer value used to differentiate the devices by the medium they + are attached to. Two devices can have different id values when + the broadcast packets are received only on one of them. + The default value 0 means that the device is the only interface + to its medium, value of -1 means that medium is not known. + + Currently, it is used to change the proxy_arp behavior: + the proxy_arp feature is enabled for packets forwarded between + two devices attached to different media. + +proxy_arp - BOOLEAN + Do proxy arp. + proxy_arp for the interface will be enabled if at least one of + conf/{all,interface}/proxy_arp is set to TRUE, + it will be disabled otherwise + +shared_media - BOOLEAN + Send(router) or accept(host) RFC1620 shared media redirects. + Overrides ip_secure_redirects. + shared_media for the interface will be enabled if at least one of + conf/{all,interface}/shared_media is set to TRUE, + it will be disabled otherwise + default TRUE + +secure_redirects - BOOLEAN + Accept ICMP redirect messages only for gateways, + listed in default gateway list. + secure_redirects for the interface will be enabled if at least one of + conf/{all,interface}/secure_redirects is set to TRUE, + it will be disabled otherwise + default TRUE + +send_redirects - BOOLEAN + Send redirects, if router. + send_redirects for the interface will be enabled if at least one of + conf/{all,interface}/send_redirects is set to TRUE, + it will be disabled otherwise + Default: TRUE + +bootp_relay - BOOLEAN + Accept packets with source address 0.b.c.d destined + not to this host as local ones. It is supposed, that + BOOTP relay daemon will catch and forward such packets. + conf/all/bootp_relay must also be set to TRUE to enable BOOTP relay + for the interface + default FALSE + Not Implemented Yet. + +accept_source_route - BOOLEAN + Accept packets with SRR option. + conf/all/accept_source_route must also be set to TRUE to accept packets + with SRR option on the interface + default TRUE (router) + FALSE (host) + +rp_filter - BOOLEAN + 1 - do source validation by reversed path, as specified in RFC1812 + Recommended option for single homed hosts and stub network + routers. Could cause troubles for complicated (not loop free) + networks running a slow unreliable protocol (sort of RIP), + or using static routes. + + 0 - No source validation. + + conf/all/rp_filter must also be set to TRUE to do source validation + on the interface + + Default value is 0. Note that some distributions enable it + in startup scripts. + +arp_filter - BOOLEAN + 1 - Allows you to have multiple network interfaces on the same + subnet, and have the ARPs for each interface be answered + based on whether or not the kernel would route a packet from + the ARP'd IP out that interface (therefore you must use source + based routing for this to work). In other words it allows control + of which cards (usually 1) will respond to an arp request. + + 0 - (default) The kernel can respond to arp requests with addresses + from other interfaces. This may seem wrong but it usually makes + sense, because it increases the chance of successful communication. + IP addresses are owned by the complete host on Linux, not by + particular interfaces. Only for more complex setups like load- + balancing, does this behaviour cause problems. + + arp_filter for the interface will be enabled if at least one of + conf/{all,interface}/arp_filter is set to TRUE, + it will be disabled otherwise + +arp_announce - INTEGER + Define different restriction levels for announcing the local + source IP address from IP packets in ARP requests sent on + interface: + 0 - (default) Use any local address, configured on any interface + 1 - Try to avoid local addresses that are not in the target's + subnet for this interface. This mode is useful when target + hosts reachable via this interface require the source IP + address in ARP requests to be part of their logical network + configured on the receiving interface. When we generate the + request we will check all our subnets that include the + target IP and will preserve the source address if it is from + such subnet. If there is no such subnet we select source + address according to the rules for level 2. + 2 - Always use the best local address for this target. + In this mode we ignore the source address in the IP packet + and try to select local address that we prefer for talks with + the target host. Such local address is selected by looking + for primary IP addresses on all our subnets on the outgoing + interface that include the target IP address. If no suitable + local address is found we select the first local address + we have on the outgoing interface or on all other interfaces, + with the hope we will receive reply for our request and + even sometimes no matter the source IP address we announce. + + The max value from conf/{all,interface}/arp_announce is used. + + Increasing the restriction level gives more chance for + receiving answer from the resolved target while decreasing + the level announces more valid sender's information. + +arp_ignore - INTEGER + Define different modes for sending replies in response to + received ARP requests that resolve local target IP addresses: + 0 - (default): reply for any local target IP address, configured + on any interface + 1 - reply only if the target IP address is local address + configured on the incoming interface + 2 - reply only if the target IP address is local address + configured on the incoming interface and both with the + sender's IP address are part from same subnet on this interface + 3 - do not reply for local addresses configured with scope host, + only resolutions for global and link addresses are replied + 4-7 - reserved + 8 - do not reply for all local addresses + + The max value from conf/{all,interface}/arp_ignore is used + when ARP request is received on the {interface} + +app_solicit - INTEGER + The maximum number of probes to send to the user space ARP daemon + via netlink before dropping back to multicast probes (see + mcast_solicit). Defaults to 0. + +disable_policy - BOOLEAN + Disable IPSEC policy (SPD) for this interface + +disable_xfrm - BOOLEAN + Disable IPSEC encryption on this interface, whatever the policy + + + +tag - INTEGER + Allows you to write a number, which can be used as required. + Default value is 0. + +(1) Jiffie: internal timeunit for the kernel. On the i386 1/100s, on the +Alpha 1/1024s. See the HZ define in /usr/include/asm/param.h for the exact +value on your system. + +Alexey Kuznetsov. +kuznet@ms2.inr.ac.ru + +Updated by: +Andi Kleen +ak@muc.de +Nicolas Delon +delon.nicolas@wanadoo.fr + + + + +/proc/sys/net/ipv6/* Variables: + +IPv6 has no global variables such as tcp_*. tcp_* settings under ipv4/ also +apply to IPv6 [XXX?]. + +bindv6only - BOOLEAN + Default value for IPV6_V6ONLY socket option, + which restricts use of the IPv6 socket to IPv6 communication + only. + TRUE: disable IPv4-mapped address feature + FALSE: enable IPv4-mapped address feature + + Default: FALSE (as specified in RFC2553bis) + +IPv6 Fragmentation: + +ip6frag_high_thresh - INTEGER + Maximum memory used to reassemble IPv6 fragments. When + ip6frag_high_thresh bytes of memory is allocated for this purpose, + the fragment handler will toss packets until ip6frag_low_thresh + is reached. + +ip6frag_low_thresh - INTEGER + See ip6frag_high_thresh + +ip6frag_time - INTEGER + Time in seconds to keep an IPv6 fragment in memory. + +ip6frag_secret_interval - INTEGER + Regeneration interval (in seconds) of the hash secret (or lifetime + for the hash secret) for IPv6 fragments. + Default: 600 + +conf/default/*: + Change the interface-specific default settings. + + +conf/all/*: + Change all the interface-specific settings. + + [XXX: Other special features than forwarding?] + +conf/all/forwarding - BOOLEAN + Enable global IPv6 forwarding between all interfaces. + + IPv4 and IPv6 work differently here; e.g. netfilter must be used + to control which interfaces may forward packets and which not. + + This also sets all interfaces' Host/Router setting + 'forwarding' to the specified value. See below for details. + + This referred to as global forwarding. + +conf/interface/*: + Change special settings per interface. + + The functional behaviour for certain settings is different + depending on whether local forwarding is enabled or not. + +accept_ra - BOOLEAN + Accept Router Advertisements; autoconfigure using them. + + Functional default: enabled if local forwarding is disabled. + disabled if local forwarding is enabled. + +accept_redirects - BOOLEAN + Accept Redirects. + + Functional default: enabled if local forwarding is disabled. + disabled if local forwarding is enabled. + +autoconf - BOOLEAN + Autoconfigure addresses using Prefix Information in Router + Advertisements. + + Functional default: enabled if accept_ra is enabled. + disabled if accept_ra is disabled. + +dad_transmits - INTEGER + The amount of Duplicate Address Detection probes to send. + Default: 1 + +forwarding - BOOLEAN + Configure interface-specific Host/Router behaviour. + + Note: It is recommended to have the same setting on all + interfaces; mixed router/host scenarios are rather uncommon. + + FALSE: + + By default, Host behaviour is assumed. This means: + + 1. IsRouter flag is not set in Neighbour Advertisements. + 2. Router Solicitations are being sent when necessary. + 3. If accept_ra is TRUE (default), accept Router + Advertisements (and do autoconfiguration). + 4. If accept_redirects is TRUE (default), accept Redirects. + + TRUE: + + If local forwarding is enabled, Router behaviour is assumed. + This means exactly the reverse from the above: + + 1. IsRouter flag is set in Neighbour Advertisements. + 2. Router Solicitations are not sent. + 3. Router Advertisements are ignored. + 4. Redirects are ignored. + + Default: FALSE if global forwarding is disabled (default), + otherwise TRUE. + +hop_limit - INTEGER + Default Hop Limit to set. + Default: 64 + +mtu - INTEGER + Default Maximum Transfer Unit + Default: 1280 (IPv6 required minimum) + +router_solicitation_delay - INTEGER + Number of seconds to wait after interface is brought up + before sending Router Solicitations. + Default: 1 + +router_solicitation_interval - INTEGER + Number of seconds to wait between Router Solicitations. + Default: 4 + +router_solicitations - INTEGER + Number of Router Solicitations to send until assuming no + routers are present. + Default: 3 + +use_tempaddr - INTEGER + Preference for Privacy Extensions (RFC3041). + <= 0 : disable Privacy Extensions + == 1 : enable Privacy Extensions, but prefer public + addresses over temporary addresses. + > 1 : enable Privacy Extensions and prefer temporary + addresses over public addresses. + Default: 0 (for most devices) + -1 (for point-to-point devices and loopback devices) + +temp_valid_lft - INTEGER + valid lifetime (in seconds) for temporary addresses. + Default: 604800 (7 days) + +temp_prefered_lft - INTEGER + Preferred lifetime (in seconds) for temporary addresses. + Default: 86400 (1 day) + +max_desync_factor - INTEGER + Maximum value for DESYNC_FACTOR, which is a random value + that ensures that clients don't synchronize with each + other and generate new addresses at exactly the same time. + value is in seconds. + Default: 600 + +regen_max_retry - INTEGER + Number of attempts before give up attempting to generate + valid temporary addresses. + Default: 5 + +max_addresses - INTEGER + Number of maximum addresses per interface. 0 disables limitation. + It is recommended not set too large value (or 0) because it would + be too easy way to crash kernel to allow to create too much of + autoconfigured addresses. + Default: 16 + +icmp/*: +ratelimit - INTEGER + Limit the maximal rates for sending ICMPv6 packets. + 0 to disable any limiting, otherwise the maximal rate in jiffies(1) + Default: 100 + + +IPv6 Update by: +Pekka Savola <pekkas@netcore.fi> +YOSHIFUJI Hideaki / USAGI Project <yoshfuji@linux-ipv6.org> + + +/proc/sys/net/bridge/* Variables: + +bridge-nf-call-arptables - BOOLEAN + 1 : pass bridged ARP traffic to arptables' FORWARD chain. + 0 : disable this. + Default: 1 + +bridge-nf-call-iptables - BOOLEAN + 1 : pass bridged IPv4 traffic to iptables' chains. + 0 : disable this. + Default: 1 + +bridge-nf-call-ip6tables - BOOLEAN + 1 : pass bridged IPv6 traffic to ip6tables' chains. + 0 : disable this. + Default: 1 + +bridge-nf-filter-vlan-tagged - BOOLEAN + 1 : pass bridged vlan-tagged ARP/IP traffic to arptables/iptables. + 0 : disable this. + Default: 1 + + +UNDOCUMENTED: + +dev_weight FIXME +discovery_slots FIXME +discovery_timeout FIXME +fast_poll_increase FIXME +ip6_queue_maxlen FIXME +lap_keepalive_time FIXME +lo_cong FIXME +max_baud_rate FIXME +max_dgram_qlen FIXME +max_noreply_time FIXME +max_tx_data_size FIXME +max_tx_window FIXME +min_tx_turn_time FIXME +mod_cong FIXME +no_cong FIXME +no_cong_thresh FIXME +slot_timeout FIXME +warn_noreply_time FIXME + +$Id: ip-sysctl.txt,v 1.20 2001/12/13 09:00:18 davem Exp $ diff --git a/Documentation/networking/ip_dynaddr.txt b/Documentation/networking/ip_dynaddr.txt new file mode 100644 index 000000000000..45f3c1268e86 --- /dev/null +++ b/Documentation/networking/ip_dynaddr.txt @@ -0,0 +1,29 @@ +IP dynamic address hack-port v0.03 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +This stuff allows diald ONESHOT connections to get established by +dynamically changing packet source address (and socket's if local procs). +It is implemented for TCP diald-box connections(1) and IP_MASQuerading(2). + +If enabled[*] and forwarding interface has changed: + 1) Socket (and packet) source address is rewritten ON RETRANSMISSIONS + while in SYN_SENT state (diald-box processes). + 2) Out-bounded MASQueraded source address changes ON OUTPUT (when + internal host does retransmission) until a packet from outside is + received by the tunnel. + +This is specially helpful for auto dialup links (diald), where the +``actual'' outgoing address is unknown at the moment the link is +going up. So, the *same* (local AND masqueraded) connections requests that +bring the link up will be able to get established. + +[*] At boot, by default no address rewriting is attempted. + To enable: + # echo 1 > /proc/sys/net/ipv4/ip_dynaddr + To enable verbose mode: + # echo 2 > /proc/sys/net/ipv4/ip_dynaddr + To disable (default) + # echo 0 > /proc/sys/net/ipv4/ip_dynaddr + +Enjoy! + +-- Juanjo <jjciarla@raiz.uncu.edu.ar> diff --git a/Documentation/networking/ipddp.txt b/Documentation/networking/ipddp.txt new file mode 100644 index 000000000000..661a5558dd8e --- /dev/null +++ b/Documentation/networking/ipddp.txt @@ -0,0 +1,78 @@ +Text file for ipddp.c: + AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation + +This text file is written by Jay Schulist <jschlst@samba.org> + +Introduction +------------ + +AppleTalk-IP (IPDDP) is the method computers connected to AppleTalk +networks can use to communicate via IP. AppleTalk-IP is simply IP datagrams +inside AppleTalk packets. + +Through this driver you can either allow your Linux box to communicate +IP over an AppleTalk network or you can provide IP gatewaying functions +for your AppleTalk users. + +You can currently encapsulate or decapsulate AppleTalk-IP on LocalTalk, +EtherTalk and PPPTalk. The only limit on the protocol is that of what +kernel AppleTalk layer and drivers are available. + +Each mode requires its own user space software. + +Compiling AppleTalk-IP Decapsulation/Encapsulation +================================================= + +AppleTalk-IP decapsulation needs to be compiled into your kernel. You +will need to turn on AppleTalk-IP driver support. Then you will need to +select ONE of the two options; IP to AppleTalk-IP encapsulation support or +AppleTalk-IP to IP decapsulation support. If you compile the driver +statically you will only be able to use the driver for the function you have +enabled in the kernel. If you compile the driver as a module you can +select what mode you want it to run in via a module loading param. +ipddp_mode=1 for AppleTalk-IP encapsulation and ipddp_mode=2 for +AppleTalk-IP to IP decapsulation. + +Basic instructions for user space tools +======================================= + +To enable AppleTalk-IP decapsulation/encapsulation you will need the +proper tools. You can get the tools for decapsulation from +http://spacs1.spacs.k12.wi.us/~jschlst/index.html and for encapsulation +from http://www.maths.unm.edu/~bradford/ltpc.html + +I will briefly describe the operation of the tools, but you will +need to consult the supporting documentation for each set of tools. + +Decapsulation - You will need to download a software package called +MacGate. In this distribution there will be a tool called MacRoute +which enables you to add routes to the kernel for your Macs by hand. +Also the tool MacRegGateWay is included to register the +proper IP Gateway and IP addresses for your machine. Included in this +distribution is a patch to netatalk-1.4b2+asun2.0a17.2 (available from +ftp.u.washington.edu/pub/user-supported/asun/) this patch is optional +but it allows automatic adding and deleting of routes for Macs. (Handy +for locations with large Mac installations) + +Encapsulation - You will need to download a software daemon called ipddpd. +This software expects there to be an AppleTalk-IP gateway on the network. +You will also need to add the proper routes to route your Linux box's IP +traffic out the ipddp interface. + +Common Uses of ipddp.c +---------------------- +Of course AppleTalk-IP decapsulation and encapsulation, but specifically +decapsulation is being used most for connecting LocalTalk networks to +IP networks. Although it has been used on EtherTalk networks to allow +Macs that are only able to tunnel IP over EtherTalk. + +Encapsulation has been used to allow a Linux box stuck on a LocalTalk +network to use IP. It should work equally well if you are stuck on an +EtherTalk only network. + +Further Assistance +------------------- +You can contact me (Jay Schulist <jschlst@samba.org>) with any +questions regarding decapsulation or encapsulation. Bradford W. Johnson +<johns393@maroon.tc.umn.edu> originally wrote the ipddp.c driver for IP +encapsulation in AppleTalk. diff --git a/Documentation/networking/iphase.txt b/Documentation/networking/iphase.txt new file mode 100644 index 000000000000..39ccb8595bf1 --- /dev/null +++ b/Documentation/networking/iphase.txt @@ -0,0 +1,158 @@ + + READ ME FISRT + ATM (i)Chip IA Linux Driver Source +-------------------------------------------------------------------------------- + Read This Before You Begin! +-------------------------------------------------------------------------------- + +Description +----------- + +This is the README file for the Interphase PCI ATM (i)Chip IA Linux driver +source release. + +The features and limitations of this driver are as follows: + - A single VPI (VPI value of 0) is supported. + - Supports 4K VCs for the server board (with 512K control memory) and 1K + VCs for the client board (with 128K control memory). + - UBR, ABR and CBR service categories are supported. + - Only AAL5 is supported. + - Supports setting of PCR on the VCs. + - Multiple adapters in a system are supported. + - All variants of Interphase ATM PCI (i)Chip adapter cards are supported, + including x575 (OC3, control memory 128K , 512K and packet memory 128K, + 512K and 1M), x525 (UTP25) and x531 (DS3 and E3). See + http://www.iphase.com/products/ClassSheet.cfm?ClassID=ATM + for details. + - Only x86 platforms are supported. + - SMP is supported. + + +Before You Start +---------------- + + +Installation +------------ + +1. Installing the adapters in the system + To install the ATM adapters in the system, follow the steps below. + a. Login as root. + b. Shut down the system and power off the system. + c. Install one or more ATM adapters in the system. + d. Connect each adapter to a port on an ATM switch. The green 'Link' + LED on the front panel of the adapter will be on if the adapter is + connected to the switch properly when the system is powered up. + e. Power on and boot the system. + +2. [ Removed ] + +3. Rebuild kernel with ABR support + [ a. and b. removed ] + c. Reconfigure the kernel, choose the Interphase ia driver through "make + menuconfig" or "make xconfig". + d. Rebuild the kernel, loadable modules and the atm tools. + e. Install the new built kernel and modules and reboot. + +4. Load the adapter hardware driver (ia driver) if it is built as a module + a. Login as root. + b. Change directory to /lib/modules/<kernel-version>/atm. + c. Run "insmod suni.o;insmod iphase.o" + The yellow 'status' LED on the front panel of the adapter will blink + while the driver is loaded in the system. + d. To verify that the 'ia' driver is loaded successfully, run the + following command: + + cat /proc/atm/devices + + If the driver is loaded successfully, the output of the command will + be similar to the following lines: + + Itf Type ESI/"MAC"addr AAL(TX,err,RX,err,drop) ... + 0 ia xxxxxxxxx 0 ( 0 0 0 0 0 ) 5 ( 0 0 0 0 0 ) + + You can also check the system log file /var/log/messages for messages + related to the ATM driver. + +5. Ia Driver Configuration + +5.1 Configuration of adapter buffers + The (i)Chip boards have 3 different packet RAM size variants: 128K, 512K and + 1M. The RAM size decides the number of buffers and buffer size. The default + size and number of buffers are set as following: + + Totol Rx RAM Tx RAM Rx Buf Tx Buf Rx buf Tx buf + RAM size size size size size cnt cnt + -------- ------ ------ ------ ------ ------ ------ + 128K 64K 64K 10K 10K 6 6 + 512K 256K 256K 10K 10K 25 25 + 1M 512K 512K 10K 10K 51 51 + + These setting should work well in most environments, but can be + changed by typing the following command: + + insmod <IA_DIR>/ia.o IA_RX_BUF=<RX_CNT> IA_RX_BUF_SZ=<RX_SIZE> \ + IA_TX_BUF=<TX_CNT> IA_TX_BUF_SZ=<TX_SIZE> + Where: + RX_CNT = number of receive buffers in the range (1-128) + RX_SIZE = size of receive buffers in the range (48-64K) + TX_CNT = number of transmit buffers in the range (1-128) + TX_SIZE = size of transmit buffers in the range (48-64K) + + 1. Transmit and receive buffer size must be a multiple of 4. + 2. Care should be taken so that the memory required for the + transmit and receive buffers is less than or equal to the + total adapter packet memory. + +5.2 Turn on ia debug trace + + When the ia driver is built with the CONFIG_ATM_IA_DEBUG flag, the driver + can provide more debug trace if needed. There is a bit mask variable, + IADebugFlag, which controls the output of the traces. You can find the bit + map of the IADebugFlag in iphase.h. + The debug trace can be turn on through the insmod command line option, for + example, "insmod iphase.o IADebugFlag=0xffffffff" can turn on all the debug + traces together with loading the driver. + +6. Ia Driver Test Using ttcp_atm and PVC + + For the PVC setup, the test machines can either be connected back-to-back or + through a switch. If connected through the switch, the switch must be + configured for the PVC(s). + + a. For UBR test: + At the test machine intended to receive data, type: + ttcp_atm -r -a -s 0.100 + At the other test machine, type: + ttcp_atm -t -a -s 0.100 -n 10000 + Run "ttcp_atm -h" to display more options of the ttcp_atm tool. + b. For ABR test: + It is the same as the UBR testing, but with an extra command option: + -Pabr:max_pcr=<xxx> + where: + xxx = the maximum peak cell rate, from 170 - 353207. + This option must be set on both the machines. + c. For CBR test: + It is the same as the UBR testing, but with an extra command option: + -Pcbr:max_pcr=<xxx> + where: + xxx = the maximum peak cell rate, from 170 - 353207. + This option may only be set on the transmit machine. + + +OUTSTANDING ISSUES +------------------ + + + +Contact Information +------------------- + + Customer Support: + United States: Telephone: (214) 654-5555 + Fax: (214) 654-5500 + E-Mail: intouch@iphase.com + Europe: Telephone: 33 (0)1 41 15 44 00 + Fax: 33 (0)1 41 15 12 13 + World Wide Web: http://www.iphase.com + Anonymous FTP: ftp.iphase.com diff --git a/Documentation/networking/irda.txt b/Documentation/networking/irda.txt new file mode 100644 index 000000000000..9e5b8e66d6a5 --- /dev/null +++ b/Documentation/networking/irda.txt @@ -0,0 +1,14 @@ +To use the IrDA protocols within Linux you will need to get a suitable copy +of the IrDA Utilities. More detailed information about these and associated +programs can be found on http://irda.sourceforge.net/ + +For more information about how to use the IrDA protocol stack, see the +Linux Infared HOWTO (http://www.tuxmobil.org/Infrared-HOWTO/Infrared-HOWTO.html) +by Werner Heuser <wehe@tuxmobil.org> + +There is an active mailing list for discussing Linux-IrDA matters called + irda-users@lists.sourceforge.net + + + + diff --git a/Documentation/networking/ixgb.txt b/Documentation/networking/ixgb.txt new file mode 100644 index 000000000000..7c98277777eb --- /dev/null +++ b/Documentation/networking/ixgb.txt @@ -0,0 +1,212 @@ +Linux* Base Driver for the Intel(R) PRO/10GbE Family of Adapters +================================================================ + +November 17, 2004 + + +Contents +======== + +- In This Release +- Identifying Your Adapter +- Command Line Parameters +- Improving Performance +- Support + + +In This Release +=============== + +This file describes the Linux* Base Driver for the Intel(R) PRO/10GbE Family +of Adapters, version 1.0.x. + +For questions related to hardware requirements, refer to the documentation +supplied with your Intel PRO/10GbE adapter. All hardware requirements listed +apply to use with Linux. + +Identifying Your Adapter +======================== + +To verify your Intel adapter is supported, find the board ID number on the +adapter. Look for a label that has a barcode and a number in the format +A12345-001. + +Use the above information and the Adapter & Driver ID Guide at: + + http://support.intel.com/support/network/adapter/pro100/21397.htm + +For the latest Intel network drivers for Linux, go to: + + http://downloadfinder.intel.com/scripts-df/support_intel.asp + +Command Line Parameters +======================= + +If the driver is built as a module, the following optional parameters are +used by entering them on the command line with the modprobe or insmod command +using this syntax: + + modprobe ixgb [<option>=<VAL1>,<VAL2>,...] + + insmod ixgb [<option>=<VAL1>,<VAL2>,...] + +For example, with two PRO/10GbE PCI adapters, entering: + + insmod ixgb TxDescriptors=80,128 + +loads the ixgb driver with 80 TX resources for the first adapter and 128 TX +resources for the second adapter. + +The default value for each parameter is generally the recommended setting, +unless otherwise noted. Also, if the driver is statically built into the +kernel, the driver is loaded with the default values for all the parameters. +Ethtool can be used to change some of the parameters at runtime. + +FlowControl +Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx) +Default: Read from the EEPROM + If EEPROM is not detected, default is 3 + This parameter controls the automatic generation(Tx) and response(Rx) to + Ethernet PAUSE frames. + +RxDescriptors +Valid Range: 64-512 +Default Value: 512 + This value is the number of receive descriptors allocated by the driver. + Increasing this value allows the driver to buffer more incoming packets. + Each descriptor is 16 bytes. A receive buffer is also allocated for + each descriptor and can be either 2048, 4056, 8192, or 16384 bytes, + depending on the MTU setting. When the MTU size is 1500 or less, the + receive buffer size is 2048 bytes. When the MTU is greater than 1500 the + receive buffer size will be either 4056, 8192, or 16384 bytes. The + maximum MTU size is 16114. + +RxIntDelay +Valid Range: 0-65535 (0=off) +Default Value: 6 + This value delays the generation of receive interrupts in units of + 0.8192 microseconds. Receive interrupt reduction can improve CPU + efficiency if properly tuned for specific network traffic. Increasing + this value adds extra latency to frame reception and can end up + decreasing the throughput of TCP traffic. If the system is reporting + dropped receives, this value may be set too high, causing the driver to + run out of available receive descriptors. + +TxDescriptors +Valid Range: 64-4096 +Default Value: 256 + This value is the number of transmit descriptors allocated by the driver. + Increasing this value allows the driver to queue more transmits. Each + descriptor is 16 bytes. + +XsumRX +Valid Range: 0-1 +Default Value: 1 + A value of '1' indicates that the driver should enable IP checksum + offload for received packets (both UDP and TCP) to the adapter hardware. + +XsumTX +Valid Range: 0-1 +Default Value: 1 + A value of '1' indicates that the driver should enable IP checksum + offload for transmitted packets (both UDP and TCP) to the adapter + hardware. + +Improving Performance +===================== + +With the Intel PRO/10 GbE adapter, the default Linux configuration will very +likely limit the total available throughput artificially. There is a set of +things that when applied together increase the ability of Linux to transmit +and receive data. The following enhancements were originally acquired from +settings published at http://www.spec.org/web99 for various submitted results +using Linux. + +NOTE: These changes are only suggestions, and serve as a starting point for +tuning your network performance. + +The changes are made in three major ways, listed in order of greatest effect: +- Use ifconfig to modify the mtu (maximum transmission unit) and the txqueuelen + parameter. +- Use sysctl to modify /proc parameters (essentially kernel tuning) +- Use setpci to modify the MMRBC field in PCI-X configuration space to increase + transmit burst lengths on the bus. + +NOTE: setpci modifies the adapter's configuration registers to allow it to read +up to 4k bytes at a time (for transmits). However, for some systems the +behavior after modifying this register may be undefined (possibly errors of some +kind). A power-cycle, hard reset or explicitly setting the e6 register back to +22 (setpci -d 8086:1048 e6.b=22) may be required to get back to a stable +configuration. + +- COPY these lines and paste them into ixgb_perf.sh: +#!/bin/bash +echo "configuring network performance , edit this file to change the interface" +# set mmrbc to 4k reads, modify only Intel 10GbE device IDs +setpci -d 8086:1048 e6.b=2e +# set the MTU (max transmission unit) - it requires your switch and clients to change too! +# set the txqueuelen +# your ixgb adapter should be loaded as eth1 for this to work, change if needed +ifconfig eth1 mtu 9000 txqueuelen 1000 up +# call the sysctl utility to modify /proc/sys entries +sysctl -p ./sysctl_ixgb.conf +- END ixgb_perf.sh + +- COPY these lines and paste them into sysctl_ixgb.conf: +# some of the defaults may be different for your kernel +# call this file with sysctl -p <this file> +# these are just suggested values that worked well to increase throughput in +# several network benchmark tests, your mileage may vary + +### IPV4 specific settings +net.ipv4.tcp_timestamps = 0 # turns TCP timestamp support off, default 1, reduces CPU use +net.ipv4.tcp_sack = 0 # turn SACK support off, default on +# on systems with a VERY fast bus -> memory interface this is the big gainer +net.ipv4.tcp_rmem = 10000000 10000000 10000000 # sets min/default/max TCP read buffer, default 4096 87380 174760 +net.ipv4.tcp_wmem = 10000000 10000000 10000000 # sets min/pressure/max TCP write buffer, default 4096 16384 131072 +net.ipv4.tcp_mem = 10000000 10000000 10000000 # sets min/pressure/max TCP buffer space, default 31744 32256 32768 + +### CORE settings (mostly for socket and UDP effect) +net.core.rmem_max = 524287 # maximum receive socket buffer size, default 131071 +net.core.wmem_max = 524287 # maximum send socket buffer size, default 131071 +net.core.rmem_default = 524287 # default receive socket buffer size, default 65535 +net.core.wmem_default = 524287 # default send socket buffer size, default 65535 +net.core.optmem_max = 524287 # maximum amount of option memory buffers, default 10240 +net.core.netdev_max_backlog = 300000 # number of unprocessed input packets before kernel starts dropping them, default 300 +- END sysctl_ixgb.conf + +Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface +your ixgb driver is using. + +NOTE: Unless these scripts are added to the boot process, these changes will +only last only until the next system reboot. + + +Resolving Slow UDP Traffic +-------------------------- + +If your server does not seem to be able to receive UDP traffic as fast as it +can receive TCP traffic, it could be because Linux, by default, does not set +the network stack buffers as large as they need to be to support high UDP +transfer rates. One way to alleviate this problem is to allow more memory to +be used by the IP stack to store incoming data. + +For instance, use the commands: + sysctl -w net.core.rmem_max=262143 +and + sysctl -w net.core.rmem_default=262143 +to increase the read buffer memory max and default to 262143 (256k - 1) from +defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables +will increase the amount of memory used by the network stack for receives, and +can be increased significantly more if necessary for your application. + +Support +======= + +For general information and support, go to the Intel support website at: + + http://support.intel.com + +If an issue is identified with the released source code on the supported +kernel with a supported adapter, email the specific information related to +the issue to linux.nics@intel.com. diff --git a/Documentation/networking/lapb-module.txt b/Documentation/networking/lapb-module.txt new file mode 100644 index 000000000000..d4fc8f221559 --- /dev/null +++ b/Documentation/networking/lapb-module.txt @@ -0,0 +1,263 @@ + The Linux LAPB Module Interface 1.3 + + Jonathan Naylor 29.12.96 + +Changed (Henner Eisen, 2000-10-29): int return value for data_indication() + +The LAPB module will be a separately compiled module for use by any parts of +the Linux operating system that require a LAPB service. This document +defines the interfaces to, and the services provided by this module. The +term module in this context does not imply that the LAPB module is a +separately loadable module, although it may be. The term module is used in +its more standard meaning. + +The interface to the LAPB module consists of functions to the module, +callbacks from the module to indicate important state changes, and +structures for getting and setting information about the module. + +Structures +---------- + +Probably the most important structure is the skbuff structure for holding +received and transmitted data, however it is beyond the scope of this +document. + +The two LAPB specific structures are the LAPB initialisation structure and +the LAPB parameter structure. These will be defined in a standard header +file, <linux/lapb.h>. The header file <net/lapb.h> is internal to the LAPB +module and is not for use. + +LAPB Initialisation Structure +----------------------------- + +This structure is used only once, in the call to lapb_register (see below). +It contains information about the device driver that requires the services +of the LAPB module. + +struct lapb_register_struct { + void (*connect_confirmation)(int token, int reason); + void (*connect_indication)(int token, int reason); + void (*disconnect_confirmation)(int token, int reason); + void (*disconnect_indication)(int token, int reason); + int (*data_indication)(int token, struct sk_buff *skb); + void (*data_transmit)(int token, struct sk_buff *skb); +}; + +Each member of this structure corresponds to a function in the device driver +that is called when a particular event in the LAPB module occurs. These will +be described in detail below. If a callback is not required (!!) then a NULL +may be substituted. + + +LAPB Parameter Structure +------------------------ + +This structure is used with the lapb_getparms and lapb_setparms functions +(see below). They are used to allow the device driver to get and set the +operational parameters of the LAPB implementation for a given connection. + +struct lapb_parms_struct { + unsigned int t1; + unsigned int t1timer; + unsigned int t2; + unsigned int t2timer; + unsigned int n2; + unsigned int n2count; + unsigned int window; + unsigned int state; + unsigned int mode; +}; + +T1 and T2 are protocol timing parameters and are given in units of 100ms. N2 +is the maximum number of tries on the link before it is declared a failure. +The window size is the maximum number of outstanding data packets allowed to +be unacknowledged by the remote end, the value of the window is between 1 +and 7 for a standard LAPB link, and between 1 and 127 for an extended LAPB +link. + +The mode variable is a bit field used for setting (at present) three values. +The bit fields have the following meanings: + +Bit Meaning +0 LAPB operation (0=LAPB_STANDARD 1=LAPB_EXTENDED). +1 [SM]LP operation (0=LAPB_SLP 1=LAPB=MLP). +2 DTE/DCE operation (0=LAPB_DTE 1=LAPB_DCE) +3-31 Reserved, must be 0. + +Extended LAPB operation indicates the use of extended sequence numbers and +consequently larger window sizes, the default is standard LAPB operation. +MLP operation is the same as SLP operation except that the addresses used by +LAPB are different to indicate the mode of operation, the default is Single +Link Procedure. The difference between DCE and DTE operation is (i) the +addresses used for commands and responses, and (ii) when the DCE is not +connected, it sends DM without polls set, every T1. The upper case constant +names will be defined in the public LAPB header file. + + +Functions +--------- + +The LAPB module provides a number of function entry points. + + +int lapb_register(void *token, struct lapb_register_struct); + +This must be called before the LAPB module may be used. If the call is +successful then LAPB_OK is returned. The token must be a unique identifier +generated by the device driver to allow for the unique identification of the +instance of the LAPB link. It is returned by the LAPB module in all of the +callbacks, and is used by the device driver in all calls to the LAPB module. +For multiple LAPB links in a single device driver, multiple calls to +lapb_register must be made. The format of the lapb_register_struct is given +above. The return values are: + +LAPB_OK LAPB registered successfully. +LAPB_BADTOKEN Token is already registered. +LAPB_NOMEM Out of memory + + +int lapb_unregister(void *token); + +This releases all the resources associated with a LAPB link. Any current +LAPB link will be abandoned without further messages being passed. After +this call, the value of token is no longer valid for any calls to the LAPB +function. The valid return values are: + +LAPB_OK LAPB unregistered successfully. +LAPB_BADTOKEN Invalid/unknown LAPB token. + + +int lapb_getparms(void *token, struct lapb_parms_struct *parms); + +This allows the device driver to get the values of the current LAPB +variables, the lapb_parms_struct is described above. The valid return values +are: + +LAPB_OK LAPB getparms was successful. +LAPB_BADTOKEN Invalid/unknown LAPB token. + + +int lapb_setparms(void *token, struct lapb_parms_struct *parms); + +This allows the device driver to set the values of the current LAPB +variables, the lapb_parms_struct is described above. The values of t1timer, +t2timer and n2count are ignored, likewise changing the mode bits when +connected will be ignored. An error implies that none of the values have +been changed. The valid return values are: + +LAPB_OK LAPB getparms was successful. +LAPB_BADTOKEN Invalid/unknown LAPB token. +LAPB_INVALUE One of the values was out of its allowable range. + + +int lapb_connect_request(void *token); + +Initiate a connect using the current parameter settings. The valid return +values are: + +LAPB_OK LAPB is starting to connect. +LAPB_BADTOKEN Invalid/unknown LAPB token. +LAPB_CONNECTED LAPB module is already connected. + + +int lapb_disconnect_request(void *token); + +Initiate a disconnect. The valid return values are: + +LAPB_OK LAPB is starting to disconnect. +LAPB_BADTOKEN Invalid/unknown LAPB token. +LAPB_NOTCONNECTED LAPB module is not connected. + + +int lapb_data_request(void *token, struct sk_buff *skb); + +Queue data with the LAPB module for transmitting over the link. If the call +is successful then the skbuff is owned by the LAPB module and may not be +used by the device driver again. The valid return values are: + +LAPB_OK LAPB has accepted the data. +LAPB_BADTOKEN Invalid/unknown LAPB token. +LAPB_NOTCONNECTED LAPB module is not connected. + + +int lapb_data_received(void *token, struct sk_buff *skb); + +Queue data with the LAPB module which has been received from the device. It +is expected that the data passed to the LAPB module has skb->data pointing +to the beginning of the LAPB data. If the call is successful then the skbuff +is owned by the LAPB module and may not be used by the device driver again. +The valid return values are: + +LAPB_OK LAPB has accepted the data. +LAPB_BADTOKEN Invalid/unknown LAPB token. + + +Callbacks +--------- + +These callbacks are functions provided by the device driver for the LAPB +module to call when an event occurs. They are registered with the LAPB +module with lapb_register (see above) in the structure lapb_register_struct +(see above). + + +void (*connect_confirmation)(void *token, int reason); + +This is called by the LAPB module when a connection is established after +being requested by a call to lapb_connect_request (see above). The reason is +always LAPB_OK. + + +void (*connect_indication)(void *token, int reason); + +This is called by the LAPB module when the link is established by the remote +system. The value of reason is always LAPB_OK. + + +void (*disconnect_confirmation)(void *token, int reason); + +This is called by the LAPB module when an event occurs after the device +driver has called lapb_disconnect_request (see above). The reason indicates +what has happened. In all cases the LAPB link can be regarded as being +terminated. The values for reason are: + +LAPB_OK The LAPB link was terminated normally. +LAPB_NOTCONNECTED The remote system was not connected. +LAPB_TIMEDOUT No response was received in N2 tries from the remote + system. + + +void (*disconnect_indication)(void *token, int reason); + +This is called by the LAPB module when the link is terminated by the remote +system or another event has occurred to terminate the link. This may be +returned in response to a lapb_connect_request (see above) if the remote +system refused the request. The values for reason are: + +LAPB_OK The LAPB link was terminated normally by the remote + system. +LAPB_REFUSED The remote system refused the connect request. +LAPB_NOTCONNECTED The remote system was not connected. +LAPB_TIMEDOUT No response was received in N2 tries from the remote + system. + + +int (*data_indication)(void *token, struct sk_buff *skb); + +This is called by the LAPB module when data has been received from the +remote system that should be passed onto the next layer in the protocol +stack. The skbuff becomes the property of the device driver and the LAPB +module will not perform any more actions on it. The skb->data pointer will +be pointing to the first byte of data after the LAPB header. + +This method should return NET_RX_DROP (as defined in the header +file include/linux/netdevice.h) if and only if the frame was dropped +before it could be delivered to the upper layer. + + +void (*data_transmit)(void *token, struct sk_buff *skb); + +This is called by the LAPB module when data is to be transmitted to the +remote system by the device driver. The skbuff becomes the property of the +device driver and the LAPB module will not perform any more actions on it. +The skb->data pointer will be pointing to the first byte of the LAPB header. diff --git a/Documentation/networking/ltpc.txt b/Documentation/networking/ltpc.txt new file mode 100644 index 000000000000..fe2a9129d959 --- /dev/null +++ b/Documentation/networking/ltpc.txt @@ -0,0 +1,131 @@ +This is the ALPHA version of the ltpc driver. + +In order to use it, you will need at least version 1.3.3 of the +netatalk package, and the Apple or Farallon LocalTalk PC card. +There are a number of different LocalTalk cards for the PC; this +driver applies only to the one with the 65c02 processor chip on it. + +To include it in the kernel, select the CONFIG_LTPC switch in the +configuration dialog. You can also compile it as a module. + +While the driver will attempt to autoprobe the I/O port address, IRQ +line, and DMA channel of the card, this does not always work. For +this reason, you should be prepared to supply these parameters +yourself. (see "Card Configuration" below for how to determine or +change the settings on your card) + +When the driver is compiled into the kernel, you can add a line such +as the following to your /etc/lilo.conf: + + append="ltpc=0x240,9,1" + +where the parameters (in order) are the port address, IRQ, and DMA +channel. The second and third values can be omitted, in which case +the driver will try to determine them itself. + +If you load the driver as a module, you can pass the parameters "io=", +"irq=", and "dma=" on the command line with insmod or modprobe, or add +them as options in /etc/modprobe.conf: + + alias lt0 ltpc # autoload the module when the interface is configured + options ltpc io=0x240 irq=9 dma=1 + +Before starting up the netatalk demons (perhaps in rc.local), you +need to add a line such as: + + /sbin/ifconfig lt0 127.0.0.42 + +The address is unimportant - however, the card needs to be configured +with ifconfig so that Netatalk can find it. + +The appropriate netatalk configuration depends on whether you are +attached to a network that includes AppleTalk routers or not. If, +like me, you are simply connecting to your home Macintoshes and +printers, you need to set up netatalk to "seed". The way I do this +is to have the lines + + dummy -seed -phase 2 -net 2000 -addr 2000.26 -zone "1033" + lt0 -seed -phase 1 -net 1033 -addr 1033.27 -zone "1033" + +in my atalkd.conf. What is going on here is that I need to fool +netatalk into thinking that there are two AppleTalk interfaces +present; otherwise, it refuses to seed. This is a hack, and a more +permanent solution would be to alter the netatalk code. Also, make +sure you have the correct name for the dummy interface - If it's +compiled as a module, you will need to refer to it as "dummy0" or some +such. + +If you are attached to an extended AppleTalk network, with routers on +it, then you don't need to fool around with this -- the appropriate +line in atalkd.conf is + + lt0 -phase 1 + +-------------------------------------- + +Card Configuration: + +The interrupts and so forth are configured via the dipswitch on the +board. Set the switches so as not to conflict with other hardware. + + Interrupts -- set at most one. If none are set, the driver uses + polled mode. Because the card was developed in the XT era, the + original documentation refers to IRQ2. Since you'll be running + this on an AT (or later) class machine, that really means IRQ9. + + SW1 IRQ 4 + SW2 IRQ 3 + SW3 IRQ 9 (2 in original card documentation only applies to XT) + + + DMA -- choose DMA 1 or 3, and set both corresponding switches. + + SW4 DMA 3 + SW5 DMA 1 + SW6 DMA 3 + SW7 DMA 1 + + + I/O address -- choose one. + + SW8 220 / 240 + +-------------------------------------- + +IP: + +Yes, it is possible to do IP over LocalTalk. However, you can't just +treat the LocalTalk device like an ordinary Ethernet device, even if +that's what it looks like to Netatalk. + +Instead, you follow the same procedure as for doing IP in EtherTalk. +See Documentation/networking/ipddp.txt for more information about the +kernel driver and userspace tools needed. + +-------------------------------------- + +BUGS: + +IRQ autoprobing often doesn't work on a cold boot. To get around +this, either compile the driver as a module, or pass the parameters +for the card to the kernel as described above. + +Also, as usual, autoprobing is not recommended when you use the driver +as a module. (though it usually works at boot time, at least) + +Polled mode is *really* slow sometimes, but this seems to depend on +the configuration of the network. + +It may theoretically be possible to use two LTPC cards in the same +machine, but this is unsupported, so if you really want to do this, +you'll probably have to hack the initialization code a bit. + +______________________________________ + +THANKS: + Thanks to Alan Cox for helpful discussions early on in this +work, and to Denis Hainsworth for doing the bleeding-edge testing. + +-- Bradford Johnson <bradford@math.umn.edu> + +-- Updated 11/09/1998 by David Huggins-Daines <dhd@debian.org> diff --git a/Documentation/networking/multicast.txt b/Documentation/networking/multicast.txt new file mode 100644 index 000000000000..5049a64313d1 --- /dev/null +++ b/Documentation/networking/multicast.txt @@ -0,0 +1,64 @@ +Behaviour of Cards Under Multicast +================================== + +This is how they currently behave, not what the hardware can do--for example, +the Lance driver doesn't use its filter, even though the code for loading +it is in the DEC Lance-based driver. + +The following are requirements for multicasting +----------------------------------------------- +AppleTalk Multicast hardware filtering not important but + avoid cards only doing promisc +IP-Multicast Multicast hardware filters really help +IP-MRoute AllMulti hardware filters are of no help + + +Board Multicast AllMulti Promisc Filter +------------------------------------------------------------------------ +3c501 YES YES YES Software +3c503 YES YES YES Hardware +3c505 YES NO YES Hardware +3c507 NO NO NO N/A +3c509 YES YES YES Software +3c59x YES YES YES Software +ac3200 YES YES YES Hardware +apricot YES PROMISC YES Hardware +arcnet NO NO NO N/A +at1700 PROMISC PROMISC YES Software +atp PROMISC PROMISC YES Software +cs89x0 YES YES YES Software +de4x5 YES YES YES Hardware +de600 NO NO NO N/A +de620 PROMISC PROMISC YES Software +depca YES PROMISC YES Hardware +dmfe YES YES YES Software(*) +e2100 YES YES YES Hardware +eepro YES PROMISC YES Hardware +eexpress NO NO NO N/A +ewrk3 YES PROMISC YES Hardware +hp-plus YES YES YES Hardware +hp YES YES YES Hardware +hp100 YES YES YES Hardware +ibmtr NO NO NO N/A +ioc3-eth YES YES YES Hardware +lance YES YES YES Software(#) +ne YES YES YES Hardware +ni52 <------------------ Buggy ------------------> +ni65 YES YES YES Software(#) +seeq NO NO NO N/A +sgiseek <------------------ Buggy ------------------> +sk_g16 NO NO YES N/A +smc-ultra YES YES YES Hardware +sunlance YES YES YES Hardware +tulip YES YES YES Hardware +wavelan YES PROMISC YES Hardware +wd YES YES YES Hardware +xirc2ps_cs YES YES YES Hardware +znet YES YES YES Software + + +PROMISC = This multicast mode is in fact promiscuous mode. Avoid using +cards who go PROMISC on any multicast in a multicast kernel. + +(#) = Hardware multicast support is not used yet. +(*) = Hardware support for Davicom 9132 chipset only. diff --git a/Documentation/networking/ncsa-telnet b/Documentation/networking/ncsa-telnet new file mode 100644 index 000000000000..d77d28b09093 --- /dev/null +++ b/Documentation/networking/ncsa-telnet @@ -0,0 +1,16 @@ +NCSA telnet doesn't work with path MTU discovery enabled. This is due to a +bug in NCSA that also stops it working with other modern networking code +such as Solaris. + +The following information is courtesy of +Marek <marekm@i17linuxb.ists.pwr.wroc.pl> + +There is a fixed version somewhere on ftp.upe.ac.za (sorry, I don't +remember the exact pathname, and this site is very slow from here). +It may or may not be faster for you to get it from +ftp://ftp.ists.pwr.wroc.pl/pub/msdos/telnet/ncsa_upe/tel23074.zip +(source is in v230704s.zip). I have tested it with 1.3.79 (with +path mtu discovery enabled - ncsa 2.3.08 didn't work) and it seems +to work. I don't know if anyone is working on this code - this +version is over a year old. Too bad - it's faster and often more +stable than these windoze telnets, and runs on almost anything... diff --git a/Documentation/networking/net-modules.txt b/Documentation/networking/net-modules.txt new file mode 100644 index 000000000000..3830a83513d2 --- /dev/null +++ b/Documentation/networking/net-modules.txt @@ -0,0 +1,324 @@ +Wed 2-Aug-95 <matti.aarnio@utu.fi> + + Linux network driver modules + + Do not mistake this for "README.modules" at the top-level + directory! That document tells about modules in general, while + this one tells only about network device driver modules. + + This is a potpourri of INSMOD-time(*) configuration options + (if such exists) and their default values of various modules + in the Linux network drivers collection. + + Some modules have also hidden (= non-documented) tunable values. + The choice of not documenting them is based on general belief, that + the less the user needs to know, the better. (There are things that + driver developers can use, others should not confuse themselves.) + + In many cases it is highly preferred that insmod:ing is done + ONLY with defining an explicit address for the card, AND BY + NOT USING AUTO-PROBING! + + Now most cards have some explicitly defined base address that they + are compiled with (to avoid auto-probing, among other things). + If that compiled value does not match your actual configuration, + do use the "io=0xXXX" -parameter for the insmod, and give there + a value matching your environment. + + If you are adventurous, you can ask the driver to autoprobe + by using the "io=0" parameter, however it is a potentially dangerous + thing to do in a live system. (If you don't know where the + card is located, you can try autoprobing, and after possible + crash recovery, insmod with proper IO-address..) + + -------------------------- + (*) "INSMOD-time" means when you load module with + /sbin/insmod you can feed it optional parameters. + See "man insmod". + -------------------------- + + + 8390 based Network Modules (Paul Gortmaker, Nov 12, 1995) + -------------------------- + +(Includes: smc-ultra, ne, wd, 3c503, hp, hp-plus, e2100 and ac3200) + +The 8390 series of network drivers now support multiple card systems without +reloading the same module multiple times (memory efficient!) This is done by +specifying multiple comma separated values, such as: + + insmod 3c503.o io=0x280,0x300,0x330,0x350 xcvr=0,1,0,1 + +The above would have the one module controlling four 3c503 cards, with card 2 +and 4 using external transceivers. The "insmod" manual describes the usage +of comma separated value lists. + +It is *STRONGLY RECOMMENDED* that you supply "io=" instead of autoprobing. +If an "io=" argument is not supplied, then the ISA drivers will complain +about autoprobing being not recommended, and begrudgingly autoprobe for +a *SINGLE CARD ONLY* -- if you want to use multiple cards you *have* to +supply an "io=0xNNN,0xQQQ,..." argument. + +The ne module is an exception to the above. A NE2000 is essentially an +8390 chip, some bus glue and some RAM. Because of this, the ne probe is +more invasive than the rest, and so at boot we make sure the ne probe is +done last of all the 8390 cards (so that it won't trip over other 8390 based +cards) With modules we can't ensure that all other non-ne 8390 cards have +already been found. Because of this, the ne module REQUIRES an "io=0xNNN" +argument passed in via insmod. It will refuse to autoprobe. + +It is also worth noting that auto-IRQ probably isn't as reliable during +the flurry of interrupt activity on a running machine. Cards such as the +ne2000 that can't get the IRQ setting from an EEPROM or configuration +register are probably best supplied with an "irq=M" argument as well. + + +---------------------------------------------------------------------- +Card/Module List - Configurable Parameters and Default Values +---------------------------------------------------------------------- + +3c501.c: + io = 0x280 IO base address + irq = 5 IRQ + (Probes ports: 0x280, 0x300) + +3c503.c: + io = 0 (It will complain if you don't supply an "io=0xNNN") + irq = 0 (IRQ software selected by driver using autoIRQ) + xcvr = 0 (Use xcvr=1 to select external transceiver.) + (Probes ports: 0x300, 0x310, 0x330, 0x350, 0x250, 0x280, 0x2A0, 0x2E0) + +3c505.c: + io = 0 + irq = 0 + dma = 6 (not autoprobed) + (Probes ports: 0x300, 0x280, 0x310) + +3c507.c: + io = 0x300 + irq = 0 + (Probes ports: 0x300, 0x320, 0x340, 0x280) + +3c509.c: + io = 0 + irq = 0 + ( Module load-time probing Works reliably only on EISA, ISA ID-PROBE + IS NOT RELIABLE! Compile this driver statically into kernel for + now, if you need it auto-probing on an ISA-bus machine. ) + +8390.c: + (No public options, several other modules need this one) + +a2065.c: + Since this is a Zorro board, it supports full autoprobing, even for + multiple boards. (m68k/Amiga) + +ac3200.c: + io = 0 (Checks 0x1000 to 0x8fff in 0x1000 intervals) + irq = 0 (Read from config register) + (EISA probing..) + +apricot.c: + io = 0x300 (Can't be altered!) + irq = 10 + +arcnet.c: + io = 0 + irqnum = 0 + shmem = 0 + num = 0 + DO SET THESE MANUALLY AT INSMOD! + (When probing, looks at the following possible addresses: + Suggested ones: + 0x300, 0x2E0, 0x2F0, 0x2D0 + Other ones: + 0x200, 0x210, 0x220, 0x230, 0x240, 0x250, 0x260, 0x270, + 0x280, 0x290, 0x2A0, 0x2B0, 0x2C0, + 0x310, 0x320, 0x330, 0x340, 0x350, 0x360, 0x370, + 0x380, 0x390, 0x3A0, 0x3E0, 0x3F0 ) + +ariadne.c: + Since this is a Zorro board, it supports full autoprobing, even for + multiple boards. (m68k/Amiga) + +at1700.c: + io = 0x260 + irq = 0 + (Probes ports: 0x260, 0x280, 0x2A0, 0x240, 0x340, 0x320, 0x380, 0x300) + +atari_bionet.c: + Supports full autoprobing. (m68k/Atari) + +atari_pamsnet.c: + Supports full autoprobing. (m68k/Atari) + +atarilance.c: + Supports full autoprobing. (m68k/Atari) + +atp.c: *Not modularized* + (Probes ports: 0x378, 0x278, 0x3BC; + fixed IRQs: 5 and 7 ) + +cops.c: + io = 0x240 + irq = 5 + nodeid = 0 (AutoSelect = 0, NodeID 1-254 is hand selected.) + (Probes ports: 0x240, 0x340, 0x200, 0x210, 0x220, 0x230, 0x260, + 0x2A0, 0x300, 0x310, 0x320, 0x330, 0x350, 0x360) + +de4x5.c: + io = 0x000b + irq = 10 + is_not_dec = 0 -- For non-DEC card using DEC 21040/21041/21140 chip, set this to 1 + (EISA, and PCI probing) + +de600.c: + de600_debug = 0 + (On port 0x378, irq 7 -- lpt1; compile time configurable) + +de620.c: + bnc = 0, utp = 0 <-- Force media by setting either. + io = 0x378 (also compile-time configurable) + irq = 7 + +depca.c: + io = 0x200 + irq = 7 + (Probes ports: ISA: 0x300, 0x200; + EISA: 0x0c00 ) + +dummy.c: + No options + +e2100.c: + io = 0 (It will complain if you don't supply an "io=0xNNN") + irq = 0 (IRQ software selected by driver) + mem = 0 (Override default shared memory start of 0xd0000) + xcvr = 0 (Use xcvr=1 to select external transceiver.) + (Probes ports: 0x300, 0x280, 0x380, 0x220) + +eepro.c: + io = 0x200 + irq = 0 + (Probes ports: 0x200, 0x240, 0x280, 0x2C0, 0x300, 0x320, 0x340, 0x360) + +eexpress.c: + io = 0x300 + irq = 0 (IRQ value read from EEPROM) + (Probes ports: 0x300, 0x270, 0x320, 0x340) + +eql.c: + (No parameters) + +ewrk3.c: + io = 0x300 + irq = 5 + (With module no autoprobing! + On EISA-bus does EISA probing. + Static linkage probes ports on ISA bus: + 0x100, 0x120, 0x140, 0x160, 0x180, 0x1A0, 0x1C0, + 0x200, 0x220, 0x240, 0x260, 0x280, 0x2A0, 0x2C0, 0x2E0, + 0x300, 0x340, 0x360, 0x380, 0x3A0, 0x3C0) + +hp-plus.c: + io = 0 (It will complain if you don't supply an "io=0xNNN") + irq = 0 (IRQ read from configuration register) + (Probes ports: 0x200, 0x240, 0x280, 0x2C0, 0x300, 0x320, 0x340) + +hp.c: + io = 0 (It will complain if you don't supply an "io=0xNNN") + irq = 0 (IRQ software selected by driver using autoIRQ) + (Probes ports: 0x300, 0x320, 0x340, 0x280, 0x2C0, 0x200, 0x240) + +hp100.c: + hp100_port = 0 (IO-base address) + (Does EISA-probing, if on EISA-slot; + On ISA-bus probes all ports from 0x100 thru to 0x3E0 + in increments of 0x020) + +hydra.c: + Since this is a Zorro board, it supports full autoprobing, even for + multiple boards. (m68k/Amiga) + +ibmtr.c: + io = 0xa20, 0xa24 (autoprobed by default) + irq = 0 (driver cannot select irq - read from hardware) + mem = 0 (shared memory base set at 0xd0000 and not yet + able to override thru mem= parameter.) + +lance.c: *Not modularized* + (PCI, and ISA probing; "CONFIG_PCI" needed for PCI support) + (Probes ISA ports: 0x300, 0x320, 0x340, 0x360) + +loopback.c: *Static kernel component* + +ne.c: + io = 0 (Explicitly *requires* an "io=0xNNN" value) + irq = 0 (Tries to determine configured IRQ via autoIRQ) + (Probes ports: 0x300, 0x280, 0x320, 0x340, 0x360) + +net_init.c: *Static kernel component* + +ni52.c: *Not modularized* + (Probes ports: 0x300, 0x280, 0x360, 0x320, 0x340 + mems: 0xD0000, 0xD2000, 0xC8000, 0xCA000, + 0xD4000, 0xD6000, 0xD8000 ) + +ni65.c: *Not modularized* **16MB MEMORY BARRIER BUG** + (Probes ports: 0x300, 0x320, 0x340, 0x360) + +pi2.c: *Not modularized* (well, NON-STANDARD modularization!) + Only one card supported at this time. + (Probes ports: 0x380, 0x300, 0x320, 0x340, 0x360, 0x3A0) + +plip.c: + io = 0 + irq = 0 (by default, uses IRQ 5 for port at 0x3bc, IRQ 7 + for port at 0x378, and IRQ 2 for port at 0x278) + (Probes ports: 0x278, 0x378, 0x3bc) + +ppp.c: + No options (ppp-2.2+ has some, this is based on non-dynamic + version from ppp-2.1.2d) + +seeq8005.c: *Not modularized* + (Probes ports: 0x300, 0x320, 0x340, 0x360) + +sk_g16.c: *Not modularized* + (Probes ports: 0x100, 0x180, 0x208, 0x220m 0x288, 0x320, 0x328, 0x390) + +skeleton.c: *Skeleton* + +slhc.c: + No configuration parameters + +slip.c: + slip_maxdev = 256 (default value from SL_NRUNIT on slip.h) + + +smc-ultra.c: + io = 0 (It will complain if you don't supply an "io=0xNNN") + irq = 0 (IRQ val. read from EEPROM) + (Probes ports: 0x200, 0x220, 0x240, 0x280, 0x300, 0x340, 0x380) + +tulip.c: *Partial modularization* + (init-time memory allocation makes problems..) + +tunnel.c: + No insmod parameters + +wavelan.c: + io = 0x390 (Settable, but change not recommended) + irq = 0 (Not honoured, if changed..) + +wd.c: + io = 0 (It will complain if you don't supply an "io=0xNNN") + irq = 0 (IRQ val. read from EEPROM, ancient cards use autoIRQ) + mem = 0 (Force shared-memory on address 0xC8000, or whatever..) + mem_end = 0 (Force non-std. mem. size via supplying mem_end val.) + (eg. for 32k WD8003EBT, use mem=0xd0000 mem_end=0xd8000) + (Probes ports: 0x300, 0x280, 0x380, 0x240) + +znet.c: *Not modularized* + (Only one device on Zenith Z-Note (notebook?) systems, + configuration information from (EE)PROM) diff --git a/Documentation/networking/netconsole.txt b/Documentation/networking/netconsole.txt new file mode 100644 index 000000000000..53618fb1a717 --- /dev/null +++ b/Documentation/networking/netconsole.txt @@ -0,0 +1,57 @@ + +started by Ingo Molnar <mingo@redhat.com>, 2001.09.17 +2.6 port and netpoll api by Matt Mackall <mpm@selenic.com>, Sep 9 2003 + +Please send bug reports to Matt Mackall <mpm@selenic.com> + +This module logs kernel printk messages over UDP allowing debugging of +problem where disk logging fails and serial consoles are impractical. + +It can be used either built-in or as a module. As a built-in, +netconsole initializes immediately after NIC cards and will bring up +the specified interface as soon as possible. While this doesn't allow +capture of early kernel panics, it does capture most of the boot +process. + +It takes a string configuration parameter "netconsole" in the +following format: + + netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr] + + where + src-port source for UDP packets (defaults to 6665) + src-ip source IP to use (interface address) + dev network interface (eth0) + tgt-port port for logging agent (6666) + tgt-ip IP address for logging agent + tgt-macaddr ethernet MAC address for logging agent (broadcast) + +Examples: + + linux netconsole=4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc + + or + + insmod netconsole netconsole=@/,@10.0.0.2/ + +Built-in netconsole starts immediately after the TCP stack is +initialized and attempts to bring up the supplied dev at the supplied +address. + +The remote host can run either 'netcat -u -l -p <port>' or syslogd. + +WARNING: the default target ethernet setting uses the broadcast +ethernet address to send packets, which can cause increased load on +other systems on the same ethernet segment. + +NOTE: the network device (eth1 in the above case) can run any kind +of other network traffic, netconsole is not intrusive. Netconsole +might cause slight delays in other traffic if the volume of kernel +messages is high, but should have no other impact. + +Netconsole was designed to be as instantaneous as possible, to +enable the logging of even the most critical kernel bugs. It works +from IRQ contexts as well, and does not enable interrupts while +sending packets. Due to these unique needs, configuration can not +be more automatic, and some fundamental limitations will remain: +only IP networks, UDP packets and ethernet devices are supported. diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt new file mode 100644 index 000000000000..1509f3aff968 --- /dev/null +++ b/Documentation/networking/netdevices.txt @@ -0,0 +1,75 @@ + +Network Devices, the Kernel, and You! + + +Introduction +============ +The following is a random collection of documentation regarding +network devices. + +struct net_device allocation rules +================================== +Network device structures need to persist even after module is unloaded and +must be allocated with kmalloc. If device has registered successfully, +it will be freed on last use by free_netdev. This is required to handle the +pathologic case cleanly (example: rmmod mydriver </sys/class/net/myeth/mtu ) + +There are routines in net_init.c to handle the common cases of +alloc_etherdev, alloc_netdev. These reserve extra space for driver +private data which gets freed when the network device is freed. If +separately allocated data is attached to the network device +(dev->priv) then it is up to the module exit handler to free that. + + +struct net_device synchronization rules +======================================= +dev->open: + Synchronization: rtnl_lock() semaphore. + Context: process + +dev->stop: + Synchronization: rtnl_lock() semaphore. + Context: process + Note1: netif_running() is guaranteed false + Note2: dev->poll() is guaranteed to be stopped + +dev->do_ioctl: + Synchronization: rtnl_lock() semaphore. + Context: process + +dev->get_stats: + Synchronization: dev_base_lock rwlock. + Context: nominally process, but don't sleep inside an rwlock + +dev->hard_start_xmit: + Synchronization: dev->xmit_lock spinlock. + When the driver sets NETIF_F_LLTX in dev->features this will be + called without holding xmit_lock. In this case the driver + has to lock by itself when needed. It is recommended to use a try lock + for this and return -1 when the spin lock fails. + The locking there should also properly protect against + set_multicast_list + Context: BHs disabled + Notes: netif_queue_stopped() is guaranteed false + Return codes: + o NETDEV_TX_OK everything ok. + o NETDEV_TX_BUSY Cannot transmit packet, try later + Usually a bug, means queue start/stop flow control is broken in + the driver. Note: the driver must NOT put the skb in its DMA ring. + o NETDEV_TX_LOCKED Locking failed, please retry quickly. + Only valid when NETIF_F_LLTX is set. + +dev->tx_timeout: + Synchronization: dev->xmit_lock spinlock. + Context: BHs disabled + Notes: netif_queue_stopped() is guaranteed true + +dev->set_multicast_list: + Synchronization: dev->xmit_lock spinlock. + Context: BHs disabled + +dev->poll: + Synchronization: __LINK_STATE_RX_SCHED bit in dev->state. See + dev_close code and comments in net/core/dev.c for more info. + Context: softirq + diff --git a/Documentation/networking/netif-msg.txt b/Documentation/networking/netif-msg.txt new file mode 100644 index 000000000000..18ad4cea6259 --- /dev/null +++ b/Documentation/networking/netif-msg.txt @@ -0,0 +1,79 @@ + +________________ +NETIF Msg Level + +The design of the network interface message level setting. + +History + + The design of the debugging message interface was guided and + constrained by backwards compatibility previous practice. It is useful + to understand the history and evolution in order to understand current + practice and relate it to older driver source code. + + From the beginning of Linux, each network device driver has had a local + integer variable that controls the debug message level. The message + level ranged from 0 to 7, and monotonically increased in verbosity. + + The message level was not precisely defined past level 3, but were + always implemented within +-1 of the specified level. Drivers tended + to shed the more verbose level messages as they matured. + 0 Minimal messages, only essential information on fatal errors. + 1 Standard messages, initialization status. No run-time messages + 2 Special media selection messages, generally timer-driver. + 3 Interface starts and stops, including normal status messages + 4 Tx and Rx frame error messages, and abnormal driver operation + 5 Tx packet queue information, interrupt events. + 6 Status on each completed Tx packet and received Rx packets + 7 Initial contents of Tx and Rx packets + + Initially this message level variable was uniquely named in each driver + e.g. "lance_debug", so that a kernel symbolic debugger could locate and + modify the setting. When kernel modules became common, the variables + were consistently renamed to "debug" and allowed to be set as a module + parameter. + + This approach worked well. However there is always a demand for + additional features. Over the years the following emerged as + reasonable and easily implemented enhancements + Using an ioctl() call to modify the level. + Per-interface rather than per-driver message level setting. + More selective control over the type of messages emitted. + + The netif_msg recommandation adds these features with only a minor + complexity and code size increase. + + The recommendation is the following points + Retaining the per-driver integer variable "debug" as a module + parameter with a default level of '1'. + + Adding a per-interface private variable named "msg_enable". The + variable is a bit map rather than a level, and is initialized as + 1 << debug + Or more precisely + debug < 0 ? 0 : 1 << min(sizeof(int)-1, debug) + + Messages should changes from + if (debug > 1) + printk(MSG_DEBUG "%s: ... + to + if (np->msg_enable & NETIF_MSG_LINK) + printk(MSG_DEBUG "%s: ... + + +The set of message levels is named + Old level Name Bit position + 0 NETIF_MSG_DRV 0x0001 + 1 NETIF_MSG_PROBE 0x0002 + 2 NETIF_MSG_LINK 0x0004 + 2 NETIF_MSG_TIMER 0x0004 + 3 NETIF_MSG_IFDOWN 0x0008 + 3 NETIF_MSG_IFUP 0x0008 + 4 NETIF_MSG_RX_ERR 0x0010 + 4 NETIF_MSG_TX_ERR 0x0010 + 5 NETIF_MSG_TX_QUEUED 0x0020 + 5 NETIF_MSG_INTR 0x0020 + 6 NETIF_MSG_TX_DONE 0x0040 + 6 NETIF_MSG_RX_STATUS 0x0040 + 7 NETIF_MSG_PKTDATA 0x0080 + diff --git a/Documentation/networking/olympic.txt b/Documentation/networking/olympic.txt new file mode 100644 index 000000000000..c65a94010ea8 --- /dev/null +++ b/Documentation/networking/olympic.txt @@ -0,0 +1,79 @@ + +IBM PCI Pit/Pit-Phy/Olympic CHIPSET BASED TOKEN RING CARDS README + +Release 0.2.0 - Release + June 8th 1999 Peter De Schrijver & Mike Phillips +Release 0.9.C - Release + April 18th 2001 Mike Phillips + +Thanks: +Erik De Cock, Adrian Bridgett and Frank Fiene for their +patience and testing. +Donald Champion for the cardbus support +Kyle Lucke for the dma api changes. +Jonathon Bitner for hardware support. +Everybody on linux-tr for their continued support. + +Options: + +The driver accepts four options: ringspeed, pkt_buf_sz, +message_level and network_monitor. + +These options can be specified differently for each card found. + +ringspeed: Has one of three settings 0 (default), 4 or 16. 0 will +make the card autosense the ringspeed and join at the appropriate speed, +this will be the default option for most people. 4 or 16 allow you to +explicitly force the card to operate at a certain speed. The card will fail +if you try to insert it at the wrong speed. (Although some hubs will allow +this so be *very* careful). The main purpose for explicitly setting the ring +speed is for when the card is first on the ring. In autosense mode, if the card +cannot detect any active monitors on the ring it will not open, so you must +re-init the card at the appropriate speed. Unfortunately at present the only +way of doing this is rmmod and insmod which is a bit tough if it is compiled +in the kernel. + +pkt_buf_sz: This is this initial receive buffer allocation size. This will +default to 4096 if no value is entered. You may increase performance of the +driver by setting this to a value larger than the network packet size, although +the driver now re-sizes buffers based on MTU settings as well. + +message_level: Controls level of messages created by the driver. Defaults to 0: +which only displays start-up and critical messages. Presently any non-zero +value will display all soft messages as well. NB This does not turn +debugging messages on, that must be done by modified the source code. + +network_monitor: Any non-zero value will provide a quasi network monitoring +mode. All unexpected MAC frames (beaconing etc.) will be received +by the driver and the source and destination addresses printed. +Also an entry will be added in /proc/net called olympic_tr%d, where tr%d +is the registered device name, i.e tr0, tr1, etc. This displays low +level information about the configuration of the ring and the adapter. +This feature has been designed for network administrators to assist in +the diagnosis of network / ring problems. (This used to OLYMPIC_NETWORK_MONITOR, +but has now changed to allow each adapter to be configured differently and +to alleviate the necessity to re-compile olympic to turn the option on). + +Multi-card: + +The driver will detect multiple cards and will work with shared interrupts, +each card is assigned the next token ring device, i.e. tr0 , tr1, tr2. The +driver should also happily reside in the system with other drivers. It has +been tested with ibmtr.c running, and I personally have had one Olicom PCI +card and two IBM olympic cards (all on the same interrupt), all running +together. + +Variable MTU size: + +The driver can handle a MTU size upto either 4500 or 18000 depending upon +ring speed. The driver also changes the size of the receive buffers as part +of the mtu re-sizing, so if you set mtu = 18000, you will need to be able +to allocate 16 * (sk_buff with 18000 buffer size) call it 18500 bytes per ring +position = 296,000 bytes of memory space, plus of course anything +necessary for the tx sk_buff's. Remember this is per card, so if you are +building routers, gateway's etc, you could start to use a lot of memory +real fast. + + +6/8/99 Peter De Schrijver and Mike Phillips + diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt new file mode 100644 index 000000000000..8d4cf78258e4 --- /dev/null +++ b/Documentation/networking/packet_mmap.txt @@ -0,0 +1,399 @@ +-------------------------------------------------------------------------------- ++ ABSTRACT +-------------------------------------------------------------------------------- + +This file documents the CONFIG_PACKET_MMAP option available with the PACKET +socket interface on 2.4 and 2.6 kernels. This type of sockets is used for +capture network traffic with utilities like tcpdump or any other that uses +the libpcap library. + +You can find the latest version of this document at + + http://pusa.uv.es/~ulisses/packet_mmap/ + +Please send me your comments to + + Ulisses Alonso Camaró <uaca@i.hate.spam.alumni.uv.es> + +------------------------------------------------------------------------------- ++ Why use PACKET_MMAP +-------------------------------------------------------------------------------- + +In Linux 2.4/2.6 if PACKET_MMAP is not enabled, the capture process is very +inefficient. It uses very limited buffers and requires one system call +to capture each packet, it requires two if you want to get packet's +timestamp (like libpcap always does). + +In the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a size +configurable circular buffer mapped in user space. This way reading packets just +needs to wait for them, most of the time there is no need to issue a single +system call. By using a shared buffer between the kernel and the user +also has the benefit of minimizing packet copies. + +It's fine to use PACKET_MMAP to improve the performance of the capture process, +but it isn't everything. At least, if you are capturing at high speeds (this +is relative to the cpu speed), you should check if the device driver of your +network interface card supports some sort of interrupt load mitigation or +(even better) if it supports NAPI, also make sure it is enabled. + +-------------------------------------------------------------------------------- ++ How to use CONFIG_PACKET_MMAP +-------------------------------------------------------------------------------- + +From the user standpoint, you should use the higher level libpcap library, wich +is a de facto standard, portable across nearly all operating systems +including Win32. + +Said that, at time of this writing, official libpcap 0.8.1 is out and doesn't include +support for PACKET_MMAP, and also probably the libpcap included in your distribution. + +I'm aware of two implementations of PACKET_MMAP in libpcap: + + http://pusa.uv.es/~ulisses/packet_mmap/ (by Simon Patarin, based on libpcap 0.6.2) + http://public.lanl.gov/cpw/ (by Phil Wood, based on lastest libpcap) + +The rest of this document is intended for people who want to understand +the low level details or want to improve libpcap by including PACKET_MMAP +support. + +-------------------------------------------------------------------------------- ++ How to use CONFIG_PACKET_MMAP directly +-------------------------------------------------------------------------------- + +From the system calls stand point, the use of PACKET_MMAP involves +the following process: + + +[setup] socket() -------> creation of the capture socket + setsockopt() ---> allocation of the circular buffer (ring) + mmap() ---------> maping of the allocated buffer to the + user process + +[capture] poll() ---------> to wait for incoming packets + +[shutdown] close() --------> destruction of the capture socket and + deallocation of all associated + resources. + + +socket creation and destruction is straight forward, and is done +the same way with or without PACKET_MMAP: + +int fd; + +fd= socket(PF_PACKET, mode, htons(ETH_P_ALL)) + +where mode is SOCK_RAW for the raw interface were link level +information can be captured or SOCK_DGRAM for the cooked +interface where link level information capture is not +supported and a link level pseudo-header is provided +by the kernel. + +The destruction of the socket and all associated resources +is done by a simple call to close(fd). + +Next I will describe PACKET_MMAP settings and it's constraints, +also the maping of the circular buffer in the user process and +the use of this buffer. + +-------------------------------------------------------------------------------- ++ PACKET_MMAP settings +-------------------------------------------------------------------------------- + + +To setup PACKET_MMAP from user level code is done with a call like + + setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *) &req, sizeof(req)) + +The most significant argument in the previous call is the req parameter, +this parameter must to have the following structure: + + struct tpacket_req + { + unsigned int tp_block_size; /* Minimal size of contiguous block */ + unsigned int tp_block_nr; /* Number of blocks */ + unsigned int tp_frame_size; /* Size of frame */ + unsigned int tp_frame_nr; /* Total number of frames */ + }; + +This structure is defined in /usr/include/linux/if_packet.h and establishes a +circular buffer (ring) of unswappable memory mapped in the capture process. +Being mapped in the capture process allows reading the captured frames and +related meta-information like timestamps without requiring a system call. + +Captured frames are grouped in blocks. Each block is a physically contiguous +region of memory and holds tp_block_size/tp_frame_size frames. The total number +of blocks is tp_block_nr. Note that tp_frame_nr is a redundant parameter because + + frames_per_block = tp_block_size/tp_frame_size + +indeed, packet_set_ring checks that the following condition is true + + frames_per_block * tp_block_nr == tp_frame_nr + + +Lets see an example, with the following values: + + tp_block_size= 4096 + tp_frame_size= 2048 + tp_block_nr = 4 + tp_frame_nr = 8 + +we will get the following buffer structure: + + block #1 block #2 ++---------+---------+ +---------+---------+ +| frame 1 | frame 2 | | frame 3 | frame 4 | ++---------+---------+ +---------+---------+ + + block #3 block #4 ++---------+---------+ +---------+---------+ +| frame 5 | frame 6 | | frame 7 | frame 8 | ++---------+---------+ +---------+---------+ + +A frame can be of any size with the only condition it can fit in a block. A block +can only hold an integer number of frames, or in other words, a frame cannot +be spawn accross two blocks so there are some datails you have to take into +account when choosing the frame_size. See "Maping and use of the circular +buffer (ring)". + + +-------------------------------------------------------------------------------- ++ PACKET_MMAP setting constraints +-------------------------------------------------------------------------------- + +In kernel versions prior to 2.4.26 (for the 2.4 branch) and 2.6.5 (2.6 branch), +the PACKET_MMAP buffer could hold only 32768 frames in a 32 bit architecture or +16384 in a 64 bit architecture. For information on these kernel versions +see http://pusa.uv.es/~ulisses/packet_mmap/packet_mmap.pre-2.4.26_2.6.5.txt + + Block size limit +------------------ + +As stated earlier, each block is a contiguous physical region of memory. These +memory regions are allocated with calls to the __get_free_pages() function. As +the name indicates, this function allocates pages of memory, and the second +argument is "order" or a power of two number of pages, that is +(for PAGE_SIZE == 4096) order=0 ==> 4096 bytes, order=1 ==> 8192 bytes, +order=2 ==> 16384 bytes, etc. The maximum size of a +region allocated by __get_free_pages is determined by the MAX_ORDER macro. More +precisely the limit can be calculated as: + + PAGE_SIZE << MAX_ORDER + + In a i386 architecture PAGE_SIZE is 4096 bytes + In a 2.4/i386 kernel MAX_ORDER is 10 + In a 2.6/i386 kernel MAX_ORDER is 11 + +So get_free_pages can allocate as much as 4MB or 8MB in a 2.4/2.6 kernel +respectively, with an i386 architecture. + +User space programs can include /usr/include/sys/user.h and +/usr/include/linux/mmzone.h to get PAGE_SIZE MAX_ORDER declarations. + +The pagesize can also be determined dynamically with the getpagesize (2) +system call. + + + Block number limit +-------------------- + +To understand the constraints of PACKET_MMAP, we have to see the structure +used to hold the pointers to each block. + +Currently, this structure is a dynamically allocated vector with kmalloc +called pg_vec, its size limits the number of blocks that can be allocated. + + +---+---+---+---+ + | x | x | x | x | + +---+---+---+---+ + | | | | + | | | v + | | v block #4 + | v block #3 + v block #2 + block #1 + + +kmalloc allocates any number of bytes of phisically contiguous memory from +a pool of pre-determined sizes. This pool of memory is mantained by the slab +allocator wich is at the end the responsible for doing the allocation and +hence wich imposes the maximum memory that kmalloc can allocate. + +In a 2.4/2.6 kernel and the i386 architecture, the limit is 131072 bytes. The +predetermined sizes that kmalloc uses can be checked in the "size-<bytes>" +entries of /proc/slabinfo + +In a 32 bit architecture, pointers are 4 bytes long, so the total number of +pointers to blocks is + + 131072/4 = 32768 blocks + + + PACKET_MMAP buffer size calculator +------------------------------------ + +Definitions: + +<size-max> : is the maximum size of allocable with kmalloc (see /proc/slabinfo) +<pointer size>: depends on the architecture -- sizeof(void *) +<page size> : depends on the architecture -- PAGE_SIZE or getpagesize (2) +<max-order> : is the value defined with MAX_ORDER +<frame size> : it's an upper bound of frame's capture size (more on this later) + +from these definitions we will derive + + <block number> = <size-max>/<pointer size> + <block size> = <pagesize> << <max-order> + +so, the max buffer size is + + <block number> * <block size> + +and, the number of frames be + + <block number> * <block size> / <frame size> + +Suposse the following parameters, wich apply for 2.6 kernel and an +i386 architecture: + + <size-max> = 131072 bytes + <pointer size> = 4 bytes + <pagesize> = 4096 bytes + <max-order> = 11 + +and a value for <frame size> of 2048 byteas. These parameters will yield + + <block number> = 131072/4 = 32768 blocks + <block size> = 4096 << 11 = 8 MiB. + +and hence the buffer will have a 262144 MiB size. So it can hold +262144 MiB / 2048 bytes = 134217728 frames + + +Actually, this buffer size is not possible with an i386 architecture. +Remember that the memory is allocated in kernel space, in the case of +an i386 kernel's memory size is limited to 1GiB. + +All memory allocations are not freed until the socket is closed. The memory +allocations are done with GFP_KERNEL priority, this basically means that +the allocation can wait and swap other process' memory in order to allocate +the nececessary memory, so normally limits can be reached. + + Other constraints +------------------- + +If you check the source code you will see that what I draw here as a frame +is not only the link level frame. At the begining of each frame there is a +header called struct tpacket_hdr used in PACKET_MMAP to hold link level's frame +meta information like timestamp. So what we draw here a frame it's really +the following (from include/linux/if_packet.h): + +/* + Frame structure: + + - Start. Frame must be aligned to TPACKET_ALIGNMENT=16 + - struct tpacket_hdr + - pad to TPACKET_ALIGNMENT=16 + - struct sockaddr_ll + - Gap, chosen so that packet data (Start+tp_net) alignes to + TPACKET_ALIGNMENT=16 + - Start+tp_mac: [ Optional MAC header ] + - Start+tp_net: Packet data, aligned to TPACKET_ALIGNMENT=16. + - Pad to align to TPACKET_ALIGNMENT=16 + */ + + + The following are conditions that are checked in packet_set_ring + + tp_block_size must be a multiple of PAGE_SIZE (1) + tp_frame_size must be greater than TPACKET_HDRLEN (obvious) + tp_frame_size must be a multiple of TPACKET_ALIGNMENT + tp_frame_nr must be exactly frames_per_block*tp_block_nr + +Note that tp_block_size should be choosed to be a power of two or there will +be a waste of memory. + +-------------------------------------------------------------------------------- ++ Maping and use of the circular buffer (ring) +-------------------------------------------------------------------------------- + +The maping of the buffer in the user process is done with the conventional +mmap function. Even the circular buffer is compound of several physically +discontiguous blocks of memory, they are contiguous to the user space, hence +just one call to mmap is needed: + + mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); + +If tp_frame_size is a divisor of tp_block_size frames will be +contiguosly spaced by tp_frame_size bytes. If not, each +tp_block_size/tp_frame_size frames there will be a gap between +the frames. This is because a frame cannot be spawn across two +blocks. + +At the beginning of each frame there is an status field (see +struct tpacket_hdr). If this field is 0 means that the frame is ready +to be used for the kernel, If not, there is a frame the user can read +and the following flags apply: + + from include/linux/if_packet.h + + #define TP_STATUS_COPY 2 + #define TP_STATUS_LOSING 4 + #define TP_STATUS_CSUMNOTREADY 8 + + +TP_STATUS_COPY : This flag indicates that the frame (and associated + meta information) has been truncated because it's + larger than tp_frame_size. This packet can be + read entirely with recvfrom(). + + In order to make this work it must to be + enabled previously with setsockopt() and + the PACKET_COPY_THRESH option. + + The number of frames than can be buffered to + be read with recvfrom is limited like a normal socket. + See the SO_RCVBUF option in the socket (7) man page. + +TP_STATUS_LOSING : indicates there were packet drops from last time + statistics where checked with getsockopt() and + the PACKET_STATISTICS option. + +TP_STATUS_CSUMNOTREADY: currently it's used for outgoing IP packets wich + it's checksum will be done in hardware. So while + reading the packet we should not try to check the + checksum. + +for convenience there are also the following defines: + + #define TP_STATUS_KERNEL 0 + #define TP_STATUS_USER 1 + +The kernel initializes all frames to TP_STATUS_KERNEL, when the kernel +receives a packet it puts in the buffer and updates the status with +at least the TP_STATUS_USER flag. Then the user can read the packet, +once the packet is read the user must zero the status field, so the kernel +can use again that frame buffer. + +The user can use poll (any other variant should apply too) to check if new +packets are in the ring: + + struct pollfd pfd; + + pfd.fd = fd; + pfd.revents = 0; + pfd.events = POLLIN|POLLRDNORM|POLLERR; + + if (status == TP_STATUS_KERNEL) + retval = poll(&pfd, 1, timeout); + +It doesn't incur in a race condition to first check the status value and +then poll for frames. + +-------------------------------------------------------------------------------- ++ THANKS +-------------------------------------------------------------------------------- + + Jesse Brandeburg, for fixing my grammathical/spelling errors + diff --git a/Documentation/networking/pktgen.txt b/Documentation/networking/pktgen.txt new file mode 100644 index 000000000000..cc4b4d04129c --- /dev/null +++ b/Documentation/networking/pktgen.txt @@ -0,0 +1,214 @@ + + + HOWTO for the linux packet generator + ------------------------------------ + +Date: 041221 + +Enable CONFIG_NET_PKTGEN to compile and build pktgen.o either in kernel +or as module. Module is preferred. insmod pktgen if needed. Once running +pktgen creates a thread on each CPU where each thread has affinty it's CPU. +Monitoring and controlling is done via /proc. Easiest to select a suitable +a sample script and configure. + +On a dual CPU: + +ps aux | grep pkt +root 129 0.3 0.0 0 0 ? SW 2003 523:20 [pktgen/0] +root 130 0.3 0.0 0 0 ? SW 2003 509:50 [pktgen/1] + + +For montoring and control pktgen creates: + /proc/net/pktgen/pgctrl + /proc/net/pktgen/kpktgend_X + /proc/net/pktgen/ethX + + +Viewing threads +=============== +/proc/net/pktgen/kpktgend_0 +Name: kpktgend_0 max_before_softirq: 10000 +Running: +Stopped: eth1 +Result: OK: max_before_softirq=10000 + +Most important the devices assigend to thread. Note! A device can only belong +to one thread. + + +Viewing devices +=============== + +Parm section holds configured info. Current hold running stats. +Result is printed after run or after interruption. Example: + +/proc/net/pktgen/eth1 + +Params: count 10000000 min_pkt_size: 60 max_pkt_size: 60 + frags: 0 delay: 0 clone_skb: 1000000 ifname: eth1 + flows: 0 flowlen: 0 + dst_min: 10.10.11.2 dst_max: + src_min: src_max: + src_mac: 00:00:00:00:00:00 dst_mac: 00:04:23:AC:FD:82 + udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9 + src_mac_count: 0 dst_mac_count: 0 + Flags: +Current: + pkts-sofar: 10000000 errors: 39664 + started: 1103053986245187us stopped: 1103053999346329us idle: 880401us + seq_num: 10000011 cur_dst_mac_offset: 0 cur_src_mac_offset: 0 + cur_saddr: 0x10a0a0a cur_daddr: 0x20b0a0a + cur_udp_dst: 9 cur_udp_src: 9 + flows: 0 +Result: OK: 13101142(c12220741+d880401) usec, 10000000 (60byte,0frags) + 763292pps 390Mb/sec (390805504bps) errors: 39664 + +Confguring threads and devices +============================== +This is done via the /proc interface easiest done via pgset in the scripts + +Examples: + + pgset "clone_skb 1" sets the number of copies of the same packet + pgset "clone_skb 0" use single SKB for all transmits + pgset "pkt_size 9014" sets packet size to 9014 + pgset "frags 5" packet will consist of 5 fragments + pgset "count 200000" sets number of packets to send, set to zero + for continious sends untill explicitl stopped. + + pgset "delay 5000" adds delay to hard_start_xmit(). nanoseconds + + pgset "dst 10.0.0.1" sets IP destination address + (BEWARE! This generator is very aggressive!) + + pgset "dst_min 10.0.0.1" Same as dst + pgset "dst_max 10.0.0.254" Set the maximum destination IP. + pgset "src_min 10.0.0.1" Set the minimum (or only) source IP. + pgset "src_max 10.0.0.254" Set the maximum source IP. + pgset "dst6 fec0::1" IPV6 destination address + pgset "src6 fec0::2" IPV6 source address + pgset "dstmac 00:00:00:00:00:00" sets MAC destination address + pgset "srcmac 00:00:00:00:00:00" sets MAC source address + + pgset "src_mac_count 1" Sets the number of MACs we'll range through. + The 'minimum' MAC is what you set with srcmac. + + pgset "dst_mac_count 1" Sets the number of MACs we'll range through. + The 'minimum' MAC is what you set with dstmac. + + pgset "flag [name]" Set a flag to determine behaviour. Current flags + are: IPSRC_RND #IP Source is random (between min/max), + IPDST_RND, UDPSRC_RND, + UDPDST_RND, MACSRC_RND, MACDST_RND + + pgset "udp_src_min 9" set UDP source port min, If < udp_src_max, then + cycle through the port range. + + pgset "udp_src_max 9" set UDP source port max. + pgset "udp_dst_min 9" set UDP destination port min, If < udp_dst_max, then + cycle through the port range. + pgset "udp_dst_max 9" set UDP destination port max. + + pgset stop aborts injection. Also, ^C aborts generator. + + +Example scripts +=============== + +A collection of small tutorial scripts for pktgen is in expamples dir. + +pktgen.conf-1-1 # 1 CPU 1 dev +pktgen.conf-1-2 # 1 CPU 2 dev +pktgen.conf-2-1 # 2 CPU's 1 dev +pktgen.conf-2-2 # 2 CPU's 2 dev +pktgen.conf-1-1-rdos # 1 CPU 1 dev w. route DoS +pktgen.conf-1-1-ip6 # 1 CPU 1 dev ipv6 +pktgen.conf-1-1-ip6-rdos # 1 CPU 1 dev ipv6 w. route DoS +pktgen.conf-1-1-flows # 1 CPU 1 dev multiple flows. + +Run in shell: ./pktgen.conf-X-Y It does all the setup including sending. + + +Interrupt affinity +=================== +Note when adding devices to a specific CPU there good idea to also assign +/proc/irq/XX/smp_affinity so the TX-interrupts gets bound to the same CPU. +as this reduces cache bouncing when freeing skb's. + + +Current commands and configuration options +========================================== + +** Pgcontrol commands: + +start +stop + +** Thread commands: + +add_device +rem_device_all +max_before_softirq + + +** Device commands: + +count +clone_skb +debug + +frags +delay + +src_mac_count +dst_mac_count + +pkt_size +min_pkt_size +max_pkt_size + +udp_src_min +udp_src_max + +udp_dst_min +udp_dst_max + +flag + IPSRC_RND + TXSIZE_RND + IPDST_RND + UDPSRC_RND + UDPDST_RND + MACSRC_RND + MACDST_RND + +dst_min +dst_max + +src_min +src_max + +dst_mac +src_mac + +clear_counters + +dst6 +src6 + +flows +flowlen + +References: +ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/ +ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/ + +Paper from Linux-Kongress in Erlangen 2004. +ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf + +Thanks to: +Grant Grundler for testing on IA-64 and parisc, Harald Welte, Lennert Buytenhek +Stephen Hemminger, Andi Kleen, Dave Miller and many others. + + +Good luck with the linux net-development.
\ No newline at end of file diff --git a/Documentation/networking/policy-routing.txt b/Documentation/networking/policy-routing.txt new file mode 100644 index 000000000000..36f6936d7f21 --- /dev/null +++ b/Documentation/networking/policy-routing.txt @@ -0,0 +1,150 @@ +Classes +------- + + "Class" is a complete routing table in common sense. + I.e. it is tree of nodes (destination prefix, tos, metric) + with attached information: gateway, device etc. + This tree is looked up as specified in RFC1812 5.2.4.3 + 1. Basic match + 2. Longest match + 3. Weak TOS. + 4. Metric. (should not be in kernel space, but they are) + 5. Additional pruning rules. (not in kernel space). + + We have two special type of nodes: + REJECT - abort route lookup and return an error value. + THROW - abort route lookup in this class. + + + Currently the number of classes is limited to 255 + (0 is reserved for "not specified class") + + Three classes are builtin: + + RT_CLASS_LOCAL=255 - local interface addresses, + broadcasts, nat addresses. + + RT_CLASS_MAIN=254 - all normal routes are put there + by default. + + RT_CLASS_DEFAULT=253 - if ip_fib_model==1, then + normal default routes are put there, if ip_fib_model==2 + all gateway routes are put there. + + +Rules +----- + Rule is a record of (src prefix, src interface, tos, dst prefix) + with attached information. + + Rule types: + RTP_ROUTE - lookup in attached class + RTP_NAT - lookup in attached class and if a match is found, + translate packet source address. + RTP_MASQUERADE - lookup in attached class and if a match is found, + masquerade packet as sourced by us. + RTP_DROP - silently drop the packet. + RTP_REJECT - drop the packet and send ICMP NET UNREACHABLE. + RTP_PROHIBIT - drop the packet and send ICMP COMM. ADM. PROHIBITED. + + Rule flags: + RTRF_LOG - log route creations. + RTRF_VALVE - One way route (used with masquerading) + +Default setup: + +root@amber:/pub/ip-routing # iproute -r +Kernel routing policy rules +Pref Source Destination TOS Iface Cl + 0 default default 00 * 255 + 254 default default 00 * 254 + 255 default default 00 * 253 + + +Lookup algorithm +---------------- + + We scan rules list, and if a rule is matched, apply it. + If a route is found, return it. + If it is not found or a THROW node was matched, continue + to scan rules. + +Applications +------------ + +1. Just ignore classes. All the routes are put into MAIN class + (and/or into DEFAULT class). + + HOWTO: iproute add PREFIX [ tos TOS ] [ gw GW ] [ dev DEV ] + [ metric METRIC ] [ reject ] ... (look at iproute utility) + + or use route utility from current net-tools. + +2. Opposite case. Just forget all that you know about routing + tables. Every rule is supplied with its own gateway, device + info. record. This approach is not appropriate for automated + route maintenance, but it is ideal for manual configuration. + + HOWTO: iproute addrule [ from PREFIX ] [ to PREFIX ] [ tos TOS ] + [ dev INPUTDEV] [ pref PREFERENCE ] route [ gw GATEWAY ] + [ dev OUTDEV ] ..... + + Warning: As of now the size of the routing table in this + approach is limited to 256. If someone likes this model, I'll + relax this limitation. + +3. OSPF classes (see RFC1583, RFC1812 E.3.3) + Very clean, stable and robust algorithm for OSPF routing + domains. Unfortunately, it is not widely used in the Internet. + + Proposed setup: + 255 local addresses + 254 interface routes + 253 ASE routes with external metric + 252 ASE routes with internal metric + 251 inter-area routes + 250 intra-area routes for 1st area + 249 intra-area routes for 2nd area + etc. + + Rules: + iproute addrule class 253 + iproute addrule class 252 + iproute addrule class 251 + iproute addrule to a-prefix-for-1st-area class 250 + iproute addrule to another-prefix-for-1st-area class 250 + ... + iproute addrule to a-prefix-for-2nd-area class 249 + ... + + Area classes must be terminated with reject record. + iproute add default reject class 250 + iproute add default reject class 249 + ... + +4. The Variant Router Requirements Algorithm (RFC1812 E.3.2) + Create 16 classes for different TOS values. + It is a funny, but pretty useless algorithm. + I listed it just to show the power of new routing code. + +5. All the variety of combinations...... + + +GATED +----- + + Gated does not understand classes, but it will work + happily in MAIN+DEFAULT. All policy routes can be set + and maintained manually. + +IMPORTANT NOTE +-------------- + route.c has a compilation time switch CONFIG_IP_LOCAL_RT_POLICY. + If it is set, locally originated packets are routed + using all the policy list. This is not very convenient and + pretty ambiguous when used with NAT and masquerading. + I set it to FALSE by default. + + +Alexey Kuznetov +kuznet@ms2.inr.ac.ru diff --git a/Documentation/networking/ppp_generic.txt b/Documentation/networking/ppp_generic.txt new file mode 100644 index 000000000000..15b5172fbb98 --- /dev/null +++ b/Documentation/networking/ppp_generic.txt @@ -0,0 +1,432 @@ + PPP Generic Driver and Channel Interface + ---------------------------------------- + + Paul Mackerras + paulus@samba.org + 7 Feb 2002 + +The generic PPP driver in linux-2.4 provides an implementation of the +functionality which is of use in any PPP implementation, including: + +* the network interface unit (ppp0 etc.) +* the interface to the networking code +* PPP multilink: splitting datagrams between multiple links, and + ordering and combining received fragments +* the interface to pppd, via a /dev/ppp character device +* packet compression and decompression +* TCP/IP header compression and decompression +* detecting network traffic for demand dialling and for idle timeouts +* simple packet filtering + +For sending and receiving PPP frames, the generic PPP driver calls on +the services of PPP `channels'. A PPP channel encapsulates a +mechanism for transporting PPP frames from one machine to another. A +PPP channel implementation can be arbitrarily complex internally but +has a very simple interface with the generic PPP code: it merely has +to be able to send PPP frames, receive PPP frames, and optionally +handle ioctl requests. Currently there are PPP channel +implementations for asynchronous serial ports, synchronous serial +ports, and for PPP over ethernet. + +This architecture makes it possible to implement PPP multilink in a +natural and straightforward way, by allowing more than one channel to +be linked to each ppp network interface unit. The generic layer is +responsible for splitting datagrams on transmit and recombining them +on receive. + + +PPP channel API +--------------- + +See include/linux/ppp_channel.h for the declaration of the types and +functions used to communicate between the generic PPP layer and PPP +channels. + +Each channel has to provide two functions to the generic PPP layer, +via the ppp_channel.ops pointer: + +* start_xmit() is called by the generic layer when it has a frame to + send. The channel has the option of rejecting the frame for + flow-control reasons. In this case, start_xmit() should return 0 + and the channel should call the ppp_output_wakeup() function at a + later time when it can accept frames again, and the generic layer + will then attempt to retransmit the rejected frame(s). If the frame + is accepted, the start_xmit() function should return 1. + +* ioctl() provides an interface which can be used by a user-space + program to control aspects of the channel's behaviour. This + procedure will be called when a user-space program does an ioctl + system call on an instance of /dev/ppp which is bound to the + channel. (Usually it would only be pppd which would do this.) + +The generic PPP layer provides seven functions to channels: + +* ppp_register_channel() is called when a channel has been created, to + notify the PPP generic layer of its presence. For example, setting + a serial port to the PPPDISC line discipline causes the ppp_async + channel code to call this function. + +* ppp_unregister_channel() is called when a channel is to be + destroyed. For example, the ppp_async channel code calls this when + a hangup is detected on the serial port. + +* ppp_output_wakeup() is called by a channel when it has previously + rejected a call to its start_xmit function, and can now accept more + packets. + +* ppp_input() is called by a channel when it has received a complete + PPP frame. + +* ppp_input_error() is called by a channel when it has detected that a + frame has been lost or dropped (for example, because of a FCS (frame + check sequence) error). + +* ppp_channel_index() returns the channel index assigned by the PPP + generic layer to this channel. The channel should provide some way + (e.g. an ioctl) to transmit this back to user-space, as user-space + will need it to attach an instance of /dev/ppp to this channel. + +* ppp_unit_number() returns the unit number of the ppp network + interface to which this channel is connected, or -1 if the channel + is not connected. + +Connecting a channel to the ppp generic layer is initiated from the +channel code, rather than from the generic layer. The channel is +expected to have some way for a user-level process to control it +independently of the ppp generic layer. For example, with the +ppp_async channel, this is provided by the file descriptor to the +serial port. + +Generally a user-level process will initialize the underlying +communications medium and prepare it to do PPP. For example, with an +async tty, this can involve setting the tty speed and modes, issuing +modem commands, and then going through some sort of dialog with the +remote system to invoke PPP service there. We refer to this process +as `discovery'. Then the user-level process tells the medium to +become a PPP channel and register itself with the generic PPP layer. +The channel then has to report the channel number assigned to it back +to the user-level process. From that point, the PPP negotiation code +in the PPP daemon (pppd) can take over and perform the PPP +negotiation, accessing the channel through the /dev/ppp interface. + +At the interface to the PPP generic layer, PPP frames are stored in +skbuff structures and start with the two-byte PPP protocol number. +The frame does *not* include the 0xff `address' byte or the 0x03 +`control' byte that are optionally used in async PPP. Nor is there +any escaping of control characters, nor are there any FCS or framing +characters included. That is all the responsibility of the channel +code, if it is needed for the particular medium. That is, the skbuffs +presented to the start_xmit() function contain only the 2-byte +protocol number and the data, and the skbuffs presented to ppp_input() +must be in the same format. + +The channel must provide an instance of a ppp_channel struct to +represent the channel. The channel is free to use the `private' field +however it wishes. The channel should initialize the `mtu' and +`hdrlen' fields before calling ppp_register_channel() and not change +them until after ppp_unregister_channel() returns. The `mtu' field +represents the maximum size of the data part of the PPP frames, that +is, it does not include the 2-byte protocol number. + +If the channel needs some headroom in the skbuffs presented to it for +transmission (i.e., some space free in the skbuff data area before the +start of the PPP frame), it should set the `hdrlen' field of the +ppp_channel struct to the amount of headroom required. The generic +PPP layer will attempt to provide that much headroom but the channel +should still check if there is sufficient headroom and copy the skbuff +if there isn't. + +On the input side, channels should ideally provide at least 2 bytes of +headroom in the skbuffs presented to ppp_input(). The generic PPP +code does not require this but will be more efficient if this is done. + + +Buffering and flow control +-------------------------- + +The generic PPP layer has been designed to minimize the amount of data +that it buffers in the transmit direction. It maintains a queue of +transmit packets for the PPP unit (network interface device) plus a +queue of transmit packets for each attached channel. Normally the +transmit queue for the unit will contain at most one packet; the +exceptions are when pppd sends packets by writing to /dev/ppp, and +when the core networking code calls the generic layer's start_xmit() +function with the queue stopped, i.e. when the generic layer has +called netif_stop_queue(), which only happens on a transmit timeout. +The start_xmit function always accepts and queues the packet which it +is asked to transmit. + +Transmit packets are dequeued from the PPP unit transmit queue and +then subjected to TCP/IP header compression and packet compression +(Deflate or BSD-Compress compression), as appropriate. After this +point the packets can no longer be reordered, as the decompression +algorithms rely on receiving compressed packets in the same order that +they were generated. + +If multilink is not in use, this packet is then passed to the attached +channel's start_xmit() function. If the channel refuses to take +the packet, the generic layer saves it for later transmission. The +generic layer will call the channel's start_xmit() function again +when the channel calls ppp_output_wakeup() or when the core +networking code calls the generic layer's start_xmit() function +again. The generic layer contains no timeout and retransmission +logic; it relies on the core networking code for that. + +If multilink is in use, the generic layer divides the packet into one +or more fragments and puts a multilink header on each fragment. It +decides how many fragments to use based on the length of the packet +and the number of channels which are potentially able to accept a +fragment at the moment. A channel is potentially able to accept a +fragment if it doesn't have any fragments currently queued up for it +to transmit. The channel may still refuse a fragment; in this case +the fragment is queued up for the channel to transmit later. This +scheme has the effect that more fragments are given to higher- +bandwidth channels. It also means that under light load, the generic +layer will tend to fragment large packets across all the channels, +thus reducing latency, while under heavy load, packets will tend to be +transmitted as single fragments, thus reducing the overhead of +fragmentation. + + +SMP safety +---------- + +The PPP generic layer has been designed to be SMP-safe. Locks are +used around accesses to the internal data structures where necessary +to ensure their integrity. As part of this, the generic layer +requires that the channels adhere to certain requirements and in turn +provides certain guarantees to the channels. Essentially the channels +are required to provide the appropriate locking on the ppp_channel +structures that form the basis of the communication between the +channel and the generic layer. This is because the channel provides +the storage for the ppp_channel structure, and so the channel is +required to provide the guarantee that this storage exists and is +valid at the appropriate times. + +The generic layer requires these guarantees from the channel: + +* The ppp_channel object must exist from the time that + ppp_register_channel() is called until after the call to + ppp_unregister_channel() returns. + +* No thread may be in a call to any of ppp_input(), ppp_input_error(), + ppp_output_wakeup(), ppp_channel_index() or ppp_unit_number() for a + channel at the time that ppp_unregister_channel() is called for that + channel. + +* ppp_register_channel() and ppp_unregister_channel() must be called + from process context, not interrupt or softirq/BH context. + +* The remaining generic layer functions may be called at softirq/BH + level but must not be called from a hardware interrupt handler. + +* The generic layer may call the channel start_xmit() function at + softirq/BH level but will not call it at interrupt level. Thus the + start_xmit() function may not block. + +* The generic layer will only call the channel ioctl() function in + process context. + +The generic layer provides these guarantees to the channels: + +* The generic layer will not call the start_xmit() function for a + channel while any thread is already executing in that function for + that channel. + +* The generic layer will not call the ioctl() function for a channel + while any thread is already executing in that function for that + channel. + +* By the time a call to ppp_unregister_channel() returns, no thread + will be executing in a call from the generic layer to that channel's + start_xmit() or ioctl() function, and the generic layer will not + call either of those functions subsequently. + + +Interface to pppd +----------------- + +The PPP generic layer exports a character device interface called +/dev/ppp. This is used by pppd to control PPP interface units and +channels. Although there is only one /dev/ppp, each open instance of +/dev/ppp acts independently and can be attached either to a PPP unit +or a PPP channel. This is achieved using the file->private_data field +to point to a separate object for each open instance of /dev/ppp. In +this way an effect similar to Solaris' clone open is obtained, +allowing us to control an arbitrary number of PPP interfaces and +channels without having to fill up /dev with hundreds of device names. + +When /dev/ppp is opened, a new instance is created which is initially +unattached. Using an ioctl call, it can then be attached to an +existing unit, attached to a newly-created unit, or attached to an +existing channel. An instance attached to a unit can be used to send +and receive PPP control frames, using the read() and write() system +calls, along with poll() if necessary. Similarly, an instance +attached to a channel can be used to send and receive PPP frames on +that channel. + +In multilink terms, the unit represents the bundle, while the channels +represent the individual physical links. Thus, a PPP frame sent by a +write to the unit (i.e., to an instance of /dev/ppp attached to the +unit) will be subject to bundle-level compression and to fragmentation +across the individual links (if multilink is in use). In contrast, a +PPP frame sent by a write to the channel will be sent as-is on that +channel, without any multilink header. + +A channel is not initially attached to any unit. In this state it can +be used for PPP negotiation but not for the transfer of data packets. +It can then be connected to a PPP unit with an ioctl call, which +makes it available to send and receive data packets for that unit. + +The ioctl calls which are available on an instance of /dev/ppp depend +on whether it is unattached, attached to a PPP interface, or attached +to a PPP channel. The ioctl calls which are available on an +unattached instance are: + +* PPPIOCNEWUNIT creates a new PPP interface and makes this /dev/ppp + instance the "owner" of the interface. The argument should point to + an int which is the desired unit number if >= 0, or -1 to assign the + lowest unused unit number. Being the owner of the interface means + that the interface will be shut down if this instance of /dev/ppp is + closed. + +* PPPIOCATTACH attaches this instance to an existing PPP interface. + The argument should point to an int containing the unit number. + This does not make this instance the owner of the PPP interface. + +* PPPIOCATTCHAN attaches this instance to an existing PPP channel. + The argument should point to an int containing the channel number. + +The ioctl calls available on an instance of /dev/ppp attached to a +channel are: + +* PPPIOCDETACH detaches the instance from the channel. This ioctl is + deprecated since the same effect can be achieved by closing the + instance. In order to prevent possible races this ioctl will fail + with an EINVAL error if more than one file descriptor refers to this + instance (i.e. as a result of dup(), dup2() or fork()). + +* PPPIOCCONNECT connects this channel to a PPP interface. The + argument should point to an int containing the interface unit + number. It will return an EINVAL error if the channel is already + connected to an interface, or ENXIO if the requested interface does + not exist. + +* PPPIOCDISCONN disconnects this channel from the PPP interface that + it is connected to. It will return an EINVAL error if the channel + is not connected to an interface. + +* All other ioctl commands are passed to the channel ioctl() function. + +The ioctl calls that are available on an instance that is attached to +an interface unit are: + +* PPPIOCSMRU sets the MRU (maximum receive unit) for the interface. + The argument should point to an int containing the new MRU value. + +* PPPIOCSFLAGS sets flags which control the operation of the + interface. The argument should be a pointer to an int containing + the new flags value. The bits in the flags value that can be set + are: + SC_COMP_TCP enable transmit TCP header compression + SC_NO_TCP_CCID disable connection-id compression for + TCP header compression + SC_REJ_COMP_TCP disable receive TCP header decompression + SC_CCP_OPEN Compression Control Protocol (CCP) is + open, so inspect CCP packets + SC_CCP_UP CCP is up, may (de)compress packets + SC_LOOP_TRAFFIC send IP traffic to pppd + SC_MULTILINK enable PPP multilink fragmentation on + transmitted packets + SC_MP_SHORTSEQ expect short multilink sequence + numbers on received multilink fragments + SC_MP_XSHORTSEQ transmit short multilink sequence nos. + + The values of these flags are defined in <linux/if_ppp.h>. Note + that the values of the SC_MULTILINK, SC_MP_SHORTSEQ and + SC_MP_XSHORTSEQ bits are ignored if the CONFIG_PPP_MULTILINK option + is not selected. + +* PPPIOCGFLAGS returns the value of the status/control flags for the + interface unit. The argument should point to an int where the ioctl + will store the flags value. As well as the values listed above for + PPPIOCSFLAGS, the following bits may be set in the returned value: + SC_COMP_RUN CCP compressor is running + SC_DECOMP_RUN CCP decompressor is running + SC_DC_ERROR CCP decompressor detected non-fatal error + SC_DC_FERROR CCP decompressor detected fatal error + +* PPPIOCSCOMPRESS sets the parameters for packet compression or + decompression. The argument should point to a ppp_option_data + structure (defined in <linux/if_ppp.h>), which contains a + pointer/length pair which should describe a block of memory + containing a CCP option specifying a compression method and its + parameters. The ppp_option_data struct also contains a `transmit' + field. If this is 0, the ioctl will affect the receive path, + otherwise the transmit path. + +* PPPIOCGUNIT returns, in the int pointed to by the argument, the unit + number of this interface unit. + +* PPPIOCSDEBUG sets the debug flags for the interface to the value in + the int pointed to by the argument. Only the least significant bit + is used; if this is 1 the generic layer will print some debug + messages during its operation. This is only intended for debugging + the generic PPP layer code; it is generally not helpful for working + out why a PPP connection is failing. + +* PPPIOCGDEBUG returns the debug flags for the interface in the int + pointed to by the argument. + +* PPPIOCGIDLE returns the time, in seconds, since the last data + packets were sent and received. The argument should point to a + ppp_idle structure (defined in <linux/ppp_defs.h>). If the + CONFIG_PPP_FILTER option is enabled, the set of packets which reset + the transmit and receive idle timers is restricted to those which + pass the `active' packet filter. + +* PPPIOCSMAXCID sets the maximum connection-ID parameter (and thus the + number of connection slots) for the TCP header compressor and + decompressor. The lower 16 bits of the int pointed to by the + argument specify the maximum connection-ID for the compressor. If + the upper 16 bits of that int are non-zero, they specify the maximum + connection-ID for the decompressor, otherwise the decompressor's + maximum connection-ID is set to 15. + +* PPPIOCSNPMODE sets the network-protocol mode for a given network + protocol. The argument should point to an npioctl struct (defined + in <linux/if_ppp.h>). The `protocol' field gives the PPP protocol + number for the protocol to be affected, and the `mode' field + specifies what to do with packets for that protocol: + + NPMODE_PASS normal operation, transmit and receive packets + NPMODE_DROP silently drop packets for this protocol + NPMODE_ERROR drop packets and return an error on transmit + NPMODE_QUEUE queue up packets for transmit, drop received + packets + + At present NPMODE_ERROR and NPMODE_QUEUE have the same effect as + NPMODE_DROP. + +* PPPIOCGNPMODE returns the network-protocol mode for a given + protocol. The argument should point to an npioctl struct with the + `protocol' field set to the PPP protocol number for the protocol of + interest. On return the `mode' field will be set to the network- + protocol mode for that protocol. + +* PPPIOCSPASS and PPPIOCSACTIVE set the `pass' and `active' packet + filters. These ioctls are only available if the CONFIG_PPP_FILTER + option is selected. The argument should point to a sock_fprog + structure (defined in <linux/filter.h>) containing the compiled BPF + instructions for the filter. Packets are dropped if they fail the + `pass' filter; otherwise, if they fail the `active' filter they are + passed but they do not reset the transmit or receive idle timer. + +* PPPIOCSMRRU enables or disables multilink processing for received + packets and sets the multilink MRRU (maximum reconstructed receive + unit). The argument should point to an int containing the new MRRU + value. If the MRRU value is 0, processing of received multilink + fragments is disabled. This ioctl is only available if the + CONFIG_PPP_MULTILINK option is selected. + +Last modified: 7-feb-2002 diff --git a/Documentation/networking/proc_net_tcp.txt b/Documentation/networking/proc_net_tcp.txt new file mode 100644 index 000000000000..59cb915c3713 --- /dev/null +++ b/Documentation/networking/proc_net_tcp.txt @@ -0,0 +1,47 @@ +This document describes the interfaces /proc/net/tcp and /proc/net/tcp6. + +These /proc interfaces provide information about currently active TCP +connections, and are implemented by tcp_get_info() in net/ipv4/tcp_ipv4.c and +tcp6_get_info() in net/ipv6/tcp_ipv6.c, respectively. + +It will first list all listening TCP sockets, and next list all established +TCP connections. A typical entry of /proc/net/tcp would look like this (split +up into 3 parts because of the length of the line): + + 46: 010310AC:9C4C 030310AC:1770 01 + | | | | | |--> connection state + | | | | |------> remote TCP port number + | | | |-------------> remote IPv4 address + | | |--------------------> local TCP port number + | |---------------------------> local IPv4 address + |----------------------------------> number of entry + + 00000150:00000000 01:00000019 00000000 + | | | | |--> number of unrecovered RTO timeouts + | | | |----------> number of jiffies until timer expires + | | |----------------> timer_active (see below) + | |----------------------> receive-queue + |-------------------------------> transmit-queue + + 1000 0 54165785 4 cd1e6040 25 4 27 3 -1 + | | | | | | | | | |--> slow start size threshold, + | | | | | | | | | or -1 if the treshold + | | | | | | | | | is >= 0xFFFF + | | | | | | | | |----> sending congestion window + | | | | | | | |-------> (ack.quick<<1)|ack.pingpong + | | | | | | |---------> Predicted tick of soft clock + | | | | | | (delayed ACK control data) + | | | | | |------------> retransmit timeout + | | | | |------------------> location of socket in memory + | | | |-----------------------> socket reference count + | | |-----------------------------> inode + | |----------------------------------> unanswered 0-window probes + |---------------------------------------------> uid + +timer_active: + 0 no timer is pending + 1 retransmit-timer is pending + 2 another timer (e.g. delayed ack or keepalive) is pending + 3 this is a socket in TIME_WAIT state. Not all fields will contain + data (or even exist) + 4 zero window probe timer is pending diff --git a/Documentation/networking/pt.txt b/Documentation/networking/pt.txt new file mode 100644 index 000000000000..72e888c1d988 --- /dev/null +++ b/Documentation/networking/pt.txt @@ -0,0 +1,58 @@ +This is the README for the Gracilis Packetwin device driver, version 0.5 +ALPHA for Linux 1.3.43. + +These files will allow you to talk to the PackeTwin (now know as PT) and +connect through it just like a pair of TNCs. To do this you will also +require the AX.25 code in the kernel enabled. + +There are four files in this archive; this readme, a patch file, a .c file +and finally a .h file. The two program files need to be put into the +drivers/net directory in the Linux source tree, for me this is the +directory /usr/src/linux/drivers/net. The patch file needs to be patched in +at the top of the Linux source tree (/usr/src/linux in my case). + +You will most probably have to edit the pt.c file to suit your own setup, +this should just involve changing some of the defines at the top of the file. +Please note that if you run an external modem you must specify a speed of 0. + +The program is currently setup to run a 4800 baud external modem on port A +and a Kantronics DE-9600 daughter board on port B so if you have this (or +something similar) then you're right. + +To compile in the driver, put the files in the correct place and patch in +the diff. You will have to re-configure the kernel again before you +recompile it. + +The driver is not real good at the moment for finding the card. You can +'help' it by changing the order of the potential addresses in the structure +found in the pt_init() function so the address of where the card is is put +first. + +After compiling, you have to get them going, they are pretty well like any +other net device and just need ifconfig to get them going. +As an example, here is my /etc/rc.net +-------------------------- + +# +# Configure the PackeTwin, port A. +/sbin/ifconfig pt0a 44.136.8.87 hw ax25 vk2xlz mtu 512 +/sbin/ifconfig pt0a 44.136.8.87 broadcast 44.136.8.255 netmask 255.255.255.0 +/sbin/route add -net 44.136.8.0 netmask 255.255.255.0 dev pt0a +/sbin/route add -net 44.0.0.0 netmask 255.0.0.0 gw 44.136.8.68 dev pt0a +/sbin/route add -net 138.25.16.0 netmask 255.255.240.0 dev pt0a +/sbin/route add -host 44.136.8.255 dev pt0a +# +# Configure the PackeTwin, port B. +/sbin/ifconfig pt0b 44.136.8.87 hw ax25 vk2xlz-1 mtu 512 +/sbin/ifconfig pt0b 44.136.8.87 broadcast 44.255.255.255 netmask 255.0.0.0 +/sbin/route add -host 44.136.8.216 dev pt0b +/sbin/route add -host 44.136.8.95 dev pt0b +/sbin/route add -host 44.255.255.255 dev pt0b + +This version of the driver comes under the GNU GPL. If you have one of my +previous (non-GPL) versions of the driver, please update to this one. + +I hope that this all works well for you. I would be pleased to hear how +many people use the driver and if it does its job. + + - Craig vk2xlz <csmall@small.dropbear.id.au> diff --git a/Documentation/networking/ray_cs.txt b/Documentation/networking/ray_cs.txt new file mode 100644 index 000000000000..b1def00bc4a3 --- /dev/null +++ b/Documentation/networking/ray_cs.txt @@ -0,0 +1,151 @@ +September 21, 1999 + +Copyright (c) 1998 Corey Thomas (corey@world.std.com) + +This file is the documentation for the Raylink Wireless LAN card driver for +Linux. The Raylink wireless LAN card is a PCMCIA card which provides IEEE +802.11 compatible wireless network connectivity at 1 and 2 megabits/second. +See http://www.raytheon.com/micro/raylink/ for more information on the Raylink +card. This driver is in early development and does have bugs. See the known +bugs and limitations at the end of this document for more information. +This driver also works with WebGear's Aviator 2.4 and Aviator Pro +wireless LAN cards. + +As of kernel 2.3.18, the ray_cs driver is part of the Linux kernel +source. My web page for the development of ray_cs is at +http://world.std.com/~corey/raylink.html and I can be emailed at +corey@world.std.com + +The kernel driver is based on ray_cs-1.62.tgz + +The driver at my web page is intended to be used as an add on to +David Hinds pcmcia package. All the command line parameters are +available when compiled as a module. When built into the kernel, only +the essid= string parameter is available via the kernel command line. +This will change after the method of sorting out parameters for all +the PCMCIA drivers is agreed upon. If you must have a built in driver +with nondefault parameters, they can be edited in +/usr/src/linux/drivers/net/pcmcia/ray_cs.c. Searching for MODULE_PARM +will find them all. + +Information on card services is available at: + ftp://hyper.stanford.edu/pub/pcmcia/doc + http://hyper.stanford.edu/HyperNews/get/pcmcia/home.html + + +Card services user programs are still required for PCMCIA devices. +pcmcia-cs-3.1.1 or greater is required for the kernel version of +the driver. + +Currently, ray_cs is not part of David Hinds card services package, +so the following magic is required. + +At the end of the /etc/pcmcia/config.opts file, add the line: +source ./ray_cs.opts +This will make card services read the ray_cs.opts file +when starting. Create the file /etc/pcmcia/ray_cs.opts containing the +following: + +#### start of /etc/pcmcia/ray_cs.opts ################### +# Configuration options for Raylink Wireless LAN PCMCIA card +device "ray_cs" + class "network" module "misc/ray_cs" + +card "RayLink PC Card WLAN Adapter" + manfid 0x01a6, 0x0000 + bind "ray_cs" + +module "misc/ray_cs" opts "" +#### end of /etc/pcmcia/ray_cs.opts ##################### + + +To join an existing network with +different parameters, contact the network administrator for the +configuration information, and edit /etc/pcmcia/ray_cs.opts. +Add the parameters below between the empty quotes. + +Parameters for ray_cs driver which may be specified in ray_cs.opts: + +bc integer 0 = normal mode (802.11 timing) + 1 = slow down inter frame timing to allow + operation with older breezecom access + points. + +beacon_period integer beacon period in Kilo-microseconds + legal values = must be integer multiple + of hop dwell + default = 256 + +country integer 1 = USA (default) + 2 = Europe + 3 = Japan + 4 = Korea + 5 = Spain + 6 = France + 7 = Israel + 8 = Australia + +essid string ESS ID - network name to join + string with maximum length of 32 chars + default value = "ADHOC_ESSID" + +hop_dwell integer hop dwell time in Kilo-microseconds + legal values = 16,32,64,128(default),256 + +irq_mask integer linux standard 16 bit value 1bit/IRQ + lsb is IRQ 0, bit 1 is IRQ 1 etc. + Used to restrict choice of IRQ's to use. + Recommended method for controlling + interrupts is in /etc/pcmcia/config.opts + +net_type integer 0 (default) = adhoc network, + 1 = infrastructure + +phy_addr string string containing new MAC address in + hex, must start with x eg + x00008f123456 + +psm integer 0 = continuously active + 1 = power save mode (not useful yet) + +pc_debug integer (0-5) larger values for more verbose + logging. Replaces ray_debug. + +ray_debug integer Replaced with pc_debug + +ray_mem_speed integer defaults to 500 + +sniffer integer 0 = not sniffer (default) + 1 = sniffer which can be used to record all + network traffic using tcpdump or similar, + but no normal network use is allowed. + +translate integer 0 = no translation (encapsulate frames) + 1 = translation (RFC1042/802.1) + + +More on sniffer mode: + +tcpdump does not understand 802.11 headers, so it can't +interpret the contents, but it can record to a file. This is only +useful for debugging 802.11 lowlevel protocols that are not visible to +linux. If you want to watch ftp xfers, or do similar things, you +don't need to use sniffer mode. Also, some packet types are never +sent up by the card, so you will never see them (ack, rts, cts, probe +etc.) There is a simple program (showcap) included in the ray_cs +package which parses the 802.11 headers. + +Known Problems and missing features + + Does not work with non x86 + + Does not work with SMP + + Support for defragmenting frames is not yet debugged, and in + fact is known to not work. I have never encountered a net set + up to fragment, but still, it should be fixed. + + The ioctl support is incomplete. The hardware address cannot be set + using ifconfig yet. If a different hardware address is needed, it may + be set using the phy_addr parameter in ray_cs.opts. This requires + a card insertion to take effect. diff --git a/Documentation/networking/routing.txt b/Documentation/networking/routing.txt new file mode 100644 index 000000000000..a26838b930f2 --- /dev/null +++ b/Documentation/networking/routing.txt @@ -0,0 +1,46 @@ +The directory ftp.inr.ac.ru:/ip-routing contains: + +- iproute.c - "professional" routing table maintenance utility. + +- rdisc.tar.gz - rdisc daemon, ported from Sun. + STRONGLY RECOMMENDED FOR ALL HOSTS. + +- routing.tgz - original Mike McLagan's route by source patch. + Currently it is obsolete. + +- gated.dif-ss<NEWEST>.gz - gated-R3_6Alpha_2 fixes. + Look at README.gated + +- mrouted-3.8.dif.gz - mrouted-3.8 fixes. + +- rtmon.c - trivial debugging utility: reads and stores netlink. + + +NEWS for user. + +- Policy based routing. Routing decisions are made on the basis + not only of destination address, but also source address, + TOS and incoming interface. +- Complete set of IP level control messages. + Now Linux is the only OS in the world complying to RFC requirements. + Great win 8) +- New interface addressing paradigm. + Assignment of address ranges to interface, + multiple prefixes etc. etc. + Do not bother, it is compatible with the old one. Moreover: +- You don't need to do "route add aaa.bbb.ccc... eth0" anymore, + it is done automatically. +- "Abstract" UNIX sockets and security enhancements. + This is necessary to use TIRPC and TLI emulation library. + +NEWS for hacker. + +- New destination cache. Flexible, robust and just beautiful. +- Network stack is reordered, simplified, optimized, a lot of bugs fixed. + (well, and new bugs were introduced, but I haven't seen them yet 8)) + It is difficult to describe all the changes, look into source. + +If you see this file, then this patch works 8) + +Alexey Kuznetsov. +kuznet@ms2.inr.ac.ru diff --git a/Documentation/networking/s2io.txt b/Documentation/networking/s2io.txt new file mode 100644 index 000000000000..6726b524ec45 --- /dev/null +++ b/Documentation/networking/s2io.txt @@ -0,0 +1,48 @@ +S2IO Technologies XFrame 10 Gig adapter. +------------------------------------------- + +I. Module loadable parameters. +When loaded as a module, the driver provides a host of Module loadable +parameters, so the device can be tuned as per the users needs. +A list of the Module params is given below. +(i) ring_num: This can be used to program the number of + receive rings used in the driver. +(ii) ring_len: This defines the number of descriptors each ring + can have. There can be a maximum of 8 rings. +(iii) frame_len: This is an array of size 8. Using this we can + set the maximum size of the received frame that can + be steered into the corrsponding receive ring. +(iv) fifo_num: This defines the number of Tx FIFOs thats used in + the driver. +(v) fifo_len: Each element defines the number of + Tx descriptors that can be associated with each + corresponding FIFO. There are a maximum of 8 FIFOs. +(vi) tx_prio: This is a bool, if module is loaded with a non-zero + value for tx_prio multi FIFO scheme is activated. +(vii) rx_prio: This is a bool, if module is loaded with a non-zero + value for tx_prio multi RING scheme is activated. +(viii) latency_timer: The value given against this param will be + loaded into the latency timer register in PCI Config + space, else the register is left with its reset value. + +II. Performance tuning. + By changing a few sysctl parameters. + Copy the following lines into a file and run the following command, + "sysctl -p <file_name>" +### IPV4 specific settings +net.ipv4.tcp_timestamps = 0 # turns TCP timestamp support off, default 1, reduces CPU use +net.ipv4.tcp_sack = 0 # turn SACK support off, default on +# on systems with a VERY fast bus -> memory interface this is the big gainer +net.ipv4.tcp_rmem = 10000000 10000000 10000000 # sets min/default/max TCP read buffer, default 4096 87380 174760 +net.ipv4.tcp_wmem = 10000000 10000000 10000000 # sets min/pressure/max TCP write buffer, default 4096 16384 131072 +net.ipv4.tcp_mem = 10000000 10000000 10000000 # sets min/pressure/max TCP buffer space, default 31744 32256 32768 + +### CORE settings (mostly for socket and UDP effect) +net.core.rmem_max = 524287 # maximum receive socket buffer size, default 131071 +net.core.wmem_max = 524287 # maximum send socket buffer size, default 131071 +net.core.rmem_default = 524287 # default receive socket buffer size, default 65535 +net.core.wmem_default = 524287 # default send socket buffer size, default 65535 +net.core.optmem_max = 524287 # maximum amount of option memory buffers, default 10240 +net.core.netdev_max_backlog = 300000 # number of unprocessed input packets before kernel starts dropping them, default 300 +---End of performance tuning file--- + diff --git a/Documentation/networking/sctp.txt b/Documentation/networking/sctp.txt new file mode 100644 index 000000000000..0c790a76910e --- /dev/null +++ b/Documentation/networking/sctp.txt @@ -0,0 +1,38 @@ +Linux Kernel SCTP + +This is the current BETA release of the Linux Kernel SCTP reference +implementation. + +SCTP (Stream Control Transmission Protocol) is a IP based, message oriented, +reliable transport protocol, with congestion control, support for +transparent multi-homing, and multiple ordered streams of messages. +RFC2960 defines the core protocol. The IETF SIGTRAN working group originally +developed the SCTP protocol and later handed the protocol over to the +Transport Area (TSVWG) working group for the continued evolvement of SCTP as a +general purpose transport. + +See the IETF website (http://www.ietf.org) for further documents on SCTP. +See http://www.ietf.org/rfc/rfc2960.txt + +The initial project goal is to create an Linux kernel reference implementation +of SCTP that is RFC 2960 compliant and provides an programming interface +referred to as the UDP-style API of the Sockets Extensions for SCTP, as +proposed in IETF Internet-Drafts. + + +Caveats: + +-lksctp can be built as statically or as a module. However, be aware that +module removal of lksctp is not yet a safe activity. + +-There is tentative support for IPv6, but most work has gone towards +implementation and testing lksctp on IPv4. + + +For more information, please visit the lksctp project website: + http://www.sf.net/projects/lksctp + +Or contact the lksctp developers through the mailing list: + <lksctp-developers@lists.sourceforge.net> + + diff --git a/Documentation/networking/shaper.txt b/Documentation/networking/shaper.txt new file mode 100644 index 000000000000..6c4ebb66a906 --- /dev/null +++ b/Documentation/networking/shaper.txt @@ -0,0 +1,48 @@ +Traffic Shaper For Linux + +This is the current BETA release of the traffic shaper for Linux. It works +within the following limits: + +o Minimum shaping speed is currently about 9600 baud (it can only +shape down to 1 byte per clock tick) + +o Maximum is about 256K, it will go above this but get a bit blocky. + +o If you ifconfig the master device that a shaper is attached to down +then your machine will follow. + +o The shaper must be a module. + + +Setup: + + A shaper device is configured using the shapeconfig program. +Typically you will do something like this + +shapecfg attach shaper0 eth1 +shapecfg speed shaper0 64000 +ifconfig shaper0 myhost netmask 255.255.255.240 broadcast 1.2.3.4.255 up +route add -net some.network netmask a.b.c.d dev shaper0 + +The shaper should have the same IP address as the device it is attached to +for normal use. + +Gotchas: + + The shaper shapes transmitted traffic. It's rather impossible to +shape received traffic except at the end (or a router) transmitting it. + + Gated/routed/rwhod/mrouted all see the shaper as an additional device +and will treat it as such unless patched. Note that for mrouted you can run +mrouted tunnels via a traffic shaper to control bandwidth usage. + + The shaper is device/route based. This makes it very easy to use +with any setup BUT less flexible. You may need to use iproute2 to set up +multiple route tables to get the flexibility. + + There is no "borrowing" or "sharing" scheme. This is a simple +traffic limiter. We implement Van Jacobson and Sally Floyd's CBQ +architecture into Linux 2.2. This is the preferred solution. Shaper is +for simple or back compatible setups. + +Alan diff --git a/Documentation/networking/sis900.txt b/Documentation/networking/sis900.txt new file mode 100644 index 000000000000..bddffd7385ae --- /dev/null +++ b/Documentation/networking/sis900.txt @@ -0,0 +1,257 @@ + +SiS 900/7016 Fast Ethernet Device Driver + +Ollie Lho + +Lei Chun Chang + + Copyright © 1999 by Silicon Integrated System Corp. + + This document gives some information on installation and usage of SiS + 900/7016 device driver under Linux. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or (at + your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + USA + _________________________________________________________________ + + Table of Contents + 1. Introduction + 2. Changes + 3. Tested Environment + 4. Files in This Package + 5. Installation + + Building the driver as loadable module + Building the driver into kernel + + 6. Known Problems and Bugs + 7. Revision History + 8. Acknowledgements + _________________________________________________________________ + +Chapter 1. Introduction + + This document describes the revision 1.06 and 1.07 of SiS 900/7016 + Fast Ethernet device driver under Linux. The driver is developed by + Silicon Integrated System Corp. and distributed freely under the GNU + General Public License (GPL). The driver can be compiled as a loadable + module and used under Linux kernel version 2.2.x. (rev. 1.06) With + minimal changes, the driver can also be used under 2.3.x and 2.4.x + kernel (rev. 1.07), please see Chapter 5. If you are intended to use + the driver for earlier kernels, you are on your own. + + The driver is tested with usual TCP/IP applications including FTP, + Telnet, Netscape etc. and is used constantly by the developers. + + Please send all comments/fixes/questions to Lei-Chun Chang. + _________________________________________________________________ + +Chapter 2. Changes + + Changes made in Revision 1.07 + + 1. Separation of sis900.c and sis900.h in order to move most constant + definition to sis900.h (many of those constants were corrected) + 2. Clean up PCI detection, the pci-scan from Donald Becker were not + used, just simple pci_find_*. + 3. MII detection is modified to support multiple mii transceiver. + 4. Bugs in read_eeprom, mdio_* were removed. + 5. Lot of sis900 irrelevant comments were removed/changed and more + comments were added to reflect the real situation. + 6. Clean up of physical/virtual address space mess in buffer + descriptors. + 7. Better transmit/receive error handling. + 8. The driver now uses zero-copy single buffer management scheme to + improve performance. + 9. Names of variables were changed to be more consistent. + 10. Clean up of auo-negotiation and timer code. + 11. Automatic detection and change of PHY on the fly. + 12. Bug in mac probing fixed. + 13. Fix 630E equalier problem by modifying the equalizer workaround + rule. + 14. Support for ICS1893 10/100 Interated PHYceiver. + 15. Support for media select by ifconfig. + 16. Added kernel-doc extratable documentation. + _________________________________________________________________ + +Chapter 3. Tested Environment + + This driver is developed on the following hardware + + * Intel Celeron 500 with SiS 630 (rev 02) chipset + * SiS 900 (rev 01) and SiS 7016/7014 Fast Ethernet Card + + and tested with these software environments + + * Red Hat Linux version 6.2 + * Linux kernel version 2.4.0 + * Netscape version 4.6 + * NcFTP 3.0.0 beta 18 + * Samba version 2.0.3 + _________________________________________________________________ + +Chapter 4. Files in This Package + + In the package you can find these files: + + sis900.c + Driver source file in C + + sis900.h + Header file for sis900.c + + sis900.sgml + DocBook SGML source of the document + + sis900.txt + Driver document in plain text + _________________________________________________________________ + +Chapter 5. Installation + + Silicon Integrated System Corp. is cooperating closely with core Linux + Kernel developers. The revisions of SiS 900 driver are distributed by + the usuall channels for kernel tar files and patches. Those kernel tar + files for official kernel and patches for kernel pre-release can be + download at official kernel ftp site and its mirrors. The 1.06 + revision can be found in kernel version later than 2.3.15 and + pre-2.2.14, and 1.07 revision can be found in kernel version 2.4.0. If + you have no prior experience in networking under Linux, please read + Ethernet HOWTO and Networking HOWTO available from Linux Documentation + Project (LDP). + + The driver is bundled in release later than 2.2.11 and 2.3.15 so this + is the most easy case. Be sure you have the appropriate packages for + compiling kernel source. Those packages are listed in Document/Changes + in kernel source distribution. If you have to install the driver other + than those bundled in kernel release, you should have your driver file + sis900.c and sis900.h copied into /usr/src/linux/drivers/net/ first. + There are two alternative ways to install the driver + _________________________________________________________________ + +Building the driver as loadable module + + To build the driver as a loadable kernel module you have to + reconfigure the kernel to activate network support by + +make menuconfig + + Choose "Loadable module support --->", then select "Enable loadable + module support". + + Choose "Network Device Support --->", select "Ethernet (10 or + 100Mbit)". Then select "EISA, VLB, PCI and on board controllers", and + choose "SiS 900/7016 PCI Fast Ethernet Adapter support" to "M". + + After reconfiguring the kernel, you can make the driver module by + +make modules + + The driver should be compiled with no errors. After compiling the + driver, the driver can be installed to proper place by + +make modules_install + + Load the driver into kernel by + +insmod sis900 + + When loading the driver into memory, some information message can be + view by + +dmesg + + or +cat /var/log/message + + If the driver is loaded properly you will have messages similar to + this: + +sis900.c: v1.07.06 11/07/2000 +eth0: SiS 900 PCI Fast Ethernet at 0xd000, IRQ 10, 00:00:e8:83:7f:a4. +eth0: SiS 900 Internal MII PHY transceiver found at address 1. +eth0: Using SiS 900 Internal MII PHY as default + + showing the version of the driver and the results of probing routine. + + Once the driver is loaded, network can be brought up by + +/sbin/ifconfig eth0 IPADDR broadcast BROADCAST netmask NETMASK media TYPE + + where IPADDR, BROADCAST, NETMASK are your IP address, broadcast + address and netmask respectively. TYPE is used to set medium type used + by the device. Typical values are "10baseT"(twisted-pair 10Mbps + Ethernet) or "100baseT" (twisted-pair 100Mbps Ethernet). For more + information on how to configure network interface, please refer to + Networking HOWTO. + + The link status is also shown by kernel messages. For example, after + the network interface is activated, you may have the message: + +eth0: Media Link On 100mbps full-duplex + + If you try to unplug the twist pair (TP) cable you will get + +eth0: Media Link Off + + indicating that the link is failed. + _________________________________________________________________ + +Building the driver into kernel + + If you want to make the driver into kernel, choose "Y" rather than "M" + on "SiS 900/7016 PCI Fast Ethernet Adapter support" when configuring + the kernel. Build the kernel image in the usual way + +make clean + +make bzlilo + + Next time the system reboot, you have the driver in memory. + _________________________________________________________________ + +Chapter 6. Known Problems and Bugs + + There are some known problems and bugs. If you find any other bugs + please mail to lcchang@sis.com.tw + + 1. AM79C901 HomePNA PHY is not thoroughly tested, there may be some + bugs in the "on the fly" change of transceiver. + 2. A bug is hidden somewhere in the receive buffer management code, + the bug causes NULL pointer reference in the kernel. This fault is + caught before bad things happen and reported with the message: + eth0: NULL pointer encountered in Rx ring, skipping which can be + viewed with dmesg or cat /var/log/message. + 3. The media type change from 10Mbps to 100Mbps twisted-pair ethernet + by ifconfig causes the media link down. + _________________________________________________________________ + +Chapter 7. Revision History + + * November 13, 2000, Revision 1.07, seventh release, 630E problem + fixed and further clean up. + * November 4, 1999, Revision 1.06, Second release, lots of clean up + and optimization. + * August 8, 1999, Revision 1.05, Initial Public Release + _________________________________________________________________ + +Chapter 8. Acknowledgements + + This driver was originally derived form Donald Becker's pci-skeleton + and rtl8139 drivers. Donald also provided various suggestion regarded + with improvements made in revision 1.06. + + The 1.05 revision was created by Jim Huang, AMD 79c901 support was + added by Chin-Shan Li. diff --git a/Documentation/networking/sk98lin.txt b/Documentation/networking/sk98lin.txt new file mode 100644 index 000000000000..851fc97bb22f --- /dev/null +++ b/Documentation/networking/sk98lin.txt @@ -0,0 +1,568 @@ +(C)Copyright 1999-2004 Marvell(R). +All rights reserved +=========================================================================== + +sk98lin.txt created 13-Feb-2004 + +Readme File for sk98lin v6.23 +Marvell Yukon/SysKonnect SK-98xx Gigabit Ethernet Adapter family driver for LINUX + +This file contains + 1 Overview + 2 Required Files + 3 Installation + 3.1 Driver Installation + 3.2 Inclusion of adapter at system start + 4 Driver Parameters + 4.1 Per-Port Parameters + 4.2 Adapter Parameters + 5 Large Frame Support + 6 VLAN and Link Aggregation Support (IEEE 802.1, 802.1q, 802.3ad) + 7 Troubleshooting + +=========================================================================== + + +1 Overview +=========== + +The sk98lin driver supports the Marvell Yukon and SysKonnect +SK-98xx/SK-95xx compliant Gigabit Ethernet Adapter on Linux. It has +been tested with Linux on Intel/x86 machines. +*** + + +2 Required Files +================= + +The linux kernel source. +No additional files required. +*** + + +3 Installation +=============== + +It is recommended to download the latest version of the driver from the +SysKonnect web site www.syskonnect.com. If you have downloaded the latest +driver, the Linux kernel has to be patched before the driver can be +installed. For details on how to patch a Linux kernel, refer to the +patch.txt file. + +3.1 Driver Installation +------------------------ + +The following steps describe the actions that are required to install +the driver and to start it manually. These steps should be carried +out for the initial driver setup. Once confirmed to be ok, they can +be included in the system start. + +NOTE 1: To perform the following tasks you need 'root' access. + +NOTE 2: In case of problems, please read the section "Troubleshooting" + below. + +The driver can either be integrated into the kernel or it can be compiled +as a module. Select the appropriate option during the kernel +configuration. + +Compile/use the driver as a module +---------------------------------- +To compile the driver, go to the directory /usr/src/linux and +execute the command "make menuconfig" or "make xconfig" and proceed as +follows: + +To integrate the driver permanently into the kernel, proceed as follows: + +1. Select the menu "Network device support" and then "Ethernet(1000Mbit)" +2. Mark "Marvell Yukon Chipset / SysKonnect SK-98xx family support" + with (*) +3. Build a new kernel when the configuration of the above options is + finished. +4. Install the new kernel. +5. Reboot your system. + +To use the driver as a module, proceed as follows: + +1. Enable 'loadable module support' in the kernel. +2. For automatic driver start, enable the 'Kernel module loader'. +3. Select the menu "Network device support" and then "Ethernet(1000Mbit)" +4. Mark "Marvell Yukon Chipset / SysKonnect SK-98xx family support" + with (M) +5. Execute the command "make modules". +6. Execute the command "make modules_install". + The appropiate modules will be installed. +7. Reboot your system. + + +Load the module manually +------------------------ +To load the module manually, proceed as follows: + +1. Enter "modprobe sk98lin". +2. If a Marvell Yukon or SysKonnect SK-98xx adapter is installed in + your computer and you have a /proc file system, execute the command: + "ls /proc/net/sk98lin/" + This should produce an output containing a line with the following + format: + eth0 eth1 ... + which indicates that your adapter has been found and initialized. + + NOTE 1: If you have more than one Marvell Yukon or SysKonnect SK-98xx + adapter installed, the adapters will be listed as 'eth0', + 'eth1', 'eth2', etc. + For each adapter, repeat steps 3 and 4 below. + + NOTE 2: If you have other Ethernet adapters installed, your Marvell + Yukon or SysKonnect SK-98xx adapter will be mapped to the + next available number, e.g. 'eth1'. The mapping is executed + automatically. + The module installation message (displayed either in a system + log file or on the console) prints a line for each adapter + found containing the corresponding 'ethX'. + +3. Select an IP address and assign it to the respective adapter by + entering: + ifconfig eth0 <ip-address> + With this command, the adapter is connected to the Ethernet. + + SK-98xx Gigabit Ethernet Server Adapters: The yellow LED on the adapter + is now active, the link status LED of the primary port is active and + the link status LED of the secondary port (on dual port adapters) is + blinking (if the ports are connected to a switch or hub). + SK-98xx V2.0 Gigabit Ethernet Adapters: The link status LED is active. + In addition, you will receive a status message on the console stating + "ethX: network connection up using port Y" and showing the selected + connection parameters (x stands for the ethernet device number + (0,1,2, etc), y stands for the port name (A or B)). + + NOTE: If you are in doubt about IP addresses, ask your network + administrator for assistance. + +4. Your adapter should now be fully operational. + Use 'ping <otherstation>' to verify the connection to other computers + on your network. +5. To check the adapter configuration view /proc/net/sk98lin/[devicename]. + For example by executing: + "cat /proc/net/sk98lin/eth0" + +Unload the module +----------------- +To stop and unload the driver modules, proceed as follows: + +1. Execute the command "ifconfig eth0 down". +2. Execute the command "rmmod sk98lin". + +3.2 Inclusion of adapter at system start +----------------------------------------- + +Since a large number of different Linux distributions are +available, we are unable to describe a general installation procedure +for the driver module. +Because the driver is now integrated in the kernel, installation should +be easy, using the standard mechanism of your distribution. +Refer to the distribution's manual for installation of ethernet adapters. + +*** + +4 Driver Parameters +==================== + +Parameters can be set at the command line after the module has been +loaded with the command 'modprobe'. +In some distributions, the configuration tools are able to pass parameters +to the driver module. + +If you use the kernel module loader, you can set driver parameters +in the file /etc/modprobe.conf (or /etc/modules.conf in 2.4 or earlier). +To set the driver parameters in this file, proceed as follows: + +1. Insert a line of the form : + options sk98lin ... + For "...", the same syntax is required as described for the command + line paramaters of modprobe below. +2. To activate the new parameters, either reboot your computer + or + unload and reload the driver. + The syntax of the driver parameters is: + + modprobe sk98lin parameter=value1[,value2[,value3...]] + + where value1 refers to the first adapter, value2 to the second etc. + +NOTE: All parameters are case sensitive. Write them exactly as shown + below. + +Example: +Suppose you have two adapters. You want to set auto-negotiation +on the first adapter to ON and on the second adapter to OFF. +You also want to set DuplexCapabilities on the first adapter +to FULL, and on the second adapter to HALF. +Then, you must enter: + + modprobe sk98lin AutoNeg_A=On,Off DupCap_A=Full,Half + +NOTE: The number of adapters that can be configured this way is + limited in the driver (file skge.c, constant SK_MAX_CARD_PARAM). + The current limit is 16. If you happen to install + more adapters, adjust this and recompile. + + +4.1 Per-Port Parameters +------------------------ + +These settings are available for each port on the adapter. +In the following description, '?' stands for the port for +which you set the parameter (A or B). + +Speed +----- +Parameter: Speed_? +Values: 10, 100, 1000, Auto +Default: Auto + +This parameter is used to set the speed capabilities. It is only valid +for the SK-98xx V2.0 copper adapters. +Usually, the speed is negotiated between the two ports during link +establishment. If this fails, a port can be forced to a specific setting +with this parameter. + +Auto-Negotiation +---------------- +Parameter: AutoNeg_? +Values: On, Off, Sense +Default: On + +The "Sense"-mode automatically detects whether the link partner supports +auto-negotiation or not. + +Duplex Capabilities +------------------- +Parameter: DupCap_? +Values: Half, Full, Both +Default: Both + +This parameters is only relevant if auto-negotiation for this port is +not set to "Sense". If auto-negotiation is set to "On", all three values +are possible. If it is set to "Off", only "Full" and "Half" are allowed. +This parameter is usefull if your link partner does not support all +possible combinations. + +Flow Control +------------ +Parameter: FlowCtrl_? +Values: Sym, SymOrRem, LocSend, None +Default: SymOrRem + +This parameter can be used to set the flow control capabilities the +port reports during auto-negotiation. It can be set for each port +individually. +Possible modes: + -- Sym = Symmetric: both link partners are allowed to send + PAUSE frames + -- SymOrRem = SymmetricOrRemote: both or only remote partner + are allowed to send PAUSE frames + -- LocSend = LocalSend: only local link partner is allowed + to send PAUSE frames + -- None = no link partner is allowed to send PAUSE frames + +NOTE: This parameter is ignored if auto-negotiation is set to "Off". + +Role in Master-Slave-Negotiation (1000Base-T only) +-------------------------------------------------- +Parameter: Role_? +Values: Auto, Master, Slave +Default: Auto + +This parameter is only valid for the SK-9821 and SK-9822 adapters. +For two 1000Base-T ports to communicate, one must take the role of the +master (providing timing information), while the other must be the +slave. Usually, this is negotiated between the two ports during link +establishment. If this fails, a port can be forced to a specific setting +with this parameter. + + +4.2 Adapter Parameters +----------------------- + +Connection Type (SK-98xx V2.0 copper adapters only) +--------------- +Parameter: ConType +Values: Auto, 100FD, 100HD, 10FD, 10HD +Default: Auto + +The parameter 'ConType' is a combination of all five per-port parameters +within one single parameter. This simplifies the configuration of both ports +of an adapter card! The different values of this variable reflect the most +meaningful combinations of port parameters. + +The following table shows the values of 'ConType' and the corresponding +combinations of the per-port parameters: + + ConType | DupCap AutoNeg FlowCtrl Role Speed + ----------+------------------------------------------------------ + Auto | Both On SymOrRem Auto Auto + 100FD | Full Off None Auto (ignored) 100 + 100HD | Half Off None Auto (ignored) 100 + 10FD | Full Off None Auto (ignored) 10 + 10HD | Half Off None Auto (ignored) 10 + +Stating any other port parameter together with this 'ConType' variable +will result in a merged configuration of those settings. This due to +the fact, that the per-port parameters (e.g. Speed_? ) have a higher +priority than the combined variable 'ConType'. + +NOTE: This parameter is always used on both ports of the adapter card. + +Interrupt Moderation +-------------------- +Parameter: Moderation +Values: None, Static, Dynamic +Default: None + +Interrupt moderation is employed to limit the maxmimum number of interrupts +the driver has to serve. That is, one or more interrupts (which indicate any +transmit or receive packet to be processed) are queued until the driver +processes them. When queued interrupts are to be served, is determined by the +'IntsPerSec' parameter, which is explained later below. + +Possible modes: + + -- None - No interrupt moderation is applied on the adapter card. + Therefore, each transmit or receive interrupt is served immediately + as soon as it appears on the interrupt line of the adapter card. + + -- Static - Interrupt moderation is applied on the adapter card. + All transmit and receive interrupts are queued until a complete + moderation interval ends. If such a moderation interval ends, all + queued interrupts are processed in one big bunch without any delay. + The term 'static' reflects the fact, that interrupt moderation is + always enabled, regardless how much network load is currently + passing via a particular interface. In addition, the duration of + the moderation interval has a fixed length that never changes while + the driver is operational. + + -- Dynamic - Interrupt moderation might be applied on the adapter card, + depending on the load of the system. If the driver detects that the + system load is too high, the driver tries to shield the system against + too much network load by enabling interrupt moderation. If - at a later + time - the CPU utilizaton decreases again (or if the network load is + negligible) the interrupt moderation will automatically be disabled. + +Interrupt moderation should be used when the driver has to handle one or more +interfaces with a high network load, which - as a consequence - leads also to a +high CPU utilization. When moderation is applied in such high network load +situations, CPU load might be reduced by 20-30%. + +NOTE: The drawback of using interrupt moderation is an increase of the round- +trip-time (RTT), due to the queueing and serving of interrupts at dedicated +moderation times. + +Interrupts per second +--------------------- +Parameter: IntsPerSec +Values: 30...40000 (interrupts per second) +Default: 2000 + +This parameter is only used, if either static or dynamic interrupt moderation +is used on a network adapter card. Using this paramter if no moderation is +applied, will lead to no action performed. + +This parameter determines the length of any interrupt moderation interval. +Assuming that static interrupt moderation is to be used, an 'IntsPerSec' +parameter value of 2000 will lead to an interrupt moderation interval of +500 microseconds. + +NOTE: The duration of the moderation interval is to be chosen with care. +At first glance, selecting a very long duration (e.g. only 100 interrupts per +second) seems to be meaningful, but the increase of packet-processing delay +is tremendous. On the other hand, selecting a very short moderation time might +compensate the use of any moderation being applied. + + +Preferred Port +-------------- +Parameter: PrefPort +Values: A, B +Default: A + +This is used to force the preferred port to A or B (on dual-port network +adapters). The preferred port is the one that is used if both are detected +as fully functional. + +RLMT Mode (Redundant Link Management Technology) +------------------------------------------------ +Parameter: RlmtMode +Values: CheckLinkState,CheckLocalPort, CheckSeg, DualNet +Default: CheckLinkState + +RLMT monitors the status of the port. If the link of the active port +fails, RLMT switches immediately to the standby link. The virtual link is +maintained as long as at least one 'physical' link is up. + +Possible modes: + + -- CheckLinkState - Check link state only: RLMT uses the link state + reported by the adapter hardware for each individual port to + determine whether a port can be used for all network traffic or + not. + + -- CheckLocalPort - In this mode, RLMT monitors the network path + between the two ports of an adapter by regularly exchanging packets + between them. This mode requires a network configuration in which + the two ports are able to "see" each other (i.e. there must not be + any router between the ports). + + -- CheckSeg - Check local port and segmentation: This mode supports the + same functions as the CheckLocalPort mode and additionally checks + network segmentation between the ports. Therefore, this mode is only + to be used if Gigabit Ethernet switches are installed on the network + that have been configured to use the Spanning Tree protocol. + + -- DualNet - In this mode, ports A and B are used as separate devices. + If you have a dual port adapter, port A will be configured as eth0 + and port B as eth1. Both ports can be used independently with + distinct IP addresses. The preferred port setting is not used. + RLMT is turned off. + +NOTE: RLMT modes CLP and CLPSS are designed to operate in configurations + where a network path between the ports on one adapter exists. + Moreover, they are not designed to work where adapters are connected + back-to-back. +*** + + +5 Large Frame Support +====================== + +The driver supports large frames (also called jumbo frames). Using large +frames can result in an improved throughput if transferring large amounts +of data. +To enable large frames, set the MTU (maximum transfer unit) of the +interface to the desired value (up to 9000), execute the following +command: + ifconfig eth0 mtu 9000 +This will only work if you have two adapters connected back-to-back +or if you use a switch that supports large frames. When using a switch, +it should be configured to allow large frames and auto-negotiation should +be set to OFF. The setting must be configured on all adapters that can be +reached by the large frames. If one adapter is not set to receive large +frames, it will simply drop them. + +You can switch back to the standard ethernet frame size by executing the +following command: + ifconfig eth0 mtu 1500 + +To permanently configure this setting, add a script with the 'ifconfig' +line to the system startup sequence (named something like "S99sk98lin" +in /etc/rc.d/rc2.d). +*** + + +6 VLAN and Link Aggregation Support (IEEE 802.1, 802.1q, 802.3ad) +================================================================== + +The Marvell Yukon/SysKonnect Linux drivers are able to support VLAN and +Link Aggregation according to IEEE standards 802.1, 802.1q, and 802.3ad. +These features are only available after installation of open source +modules available on the Internet: +For VLAN go to: http://www.candelatech.com/~greear/vlan.html +For Link Aggregation go to: http://www.st.rim.or.jp/~yumo + +NOTE: SysKonnect GmbH does not offer any support for these open source + modules and does not take the responsibility for any kind of + failures or problems arising in connection with these modules. + +NOTE: Configuring Link Aggregation on a SysKonnect dual link adapter may + cause problems when unloading the driver. + + +7 Troubleshooting +================== + +If any problems occur during the installation process, check the +following list: + + +Problem: The SK-98xx adapter can not be found by the driver. +Solution: In /proc/pci search for the following entry: + 'Ethernet controller: SysKonnect SK-98xx ...' + If this entry exists, the SK-98xx or SK-98xx V2.0 adapter has + been found by the system and should be operational. + If this entry does not exist or if the file '/proc/pci' is not + found, there may be a hardware problem or the PCI support may + not be enabled in your kernel. + The adapter can be checked using the diagnostics program which + is available on the SysKonnect web site: + www.syskonnect.com + + Some COMPAQ machines have problems dealing with PCI under Linux. + Linux. This problem is described in the 'PCI howto' document + (included in some distributions or available from the + web, e.g. at 'www.linux.org'). + + +Problem: Programs such as 'ifconfig' or 'route' can not be found or the + error message 'Operation not permitted' is displayed. +Reason: You are not logged in as user 'root'. +Solution: Logout and login as 'root' or change to 'root' via 'su'. + + +Problem: Upon use of the command 'ping <address>' the message + "ping: sendto: Network is unreachable" is displayed. +Reason: Your route is not set correctly. +Solution: If you are using RedHat, you probably forgot to set up the + route in the 'network configuration'. + Check the existing routes with the 'route' command and check + if an entry for 'eth0' exists, and if so, if it is set correctly. + + +Problem: The driver can be started, the adapter is connected to the + network, but you cannot receive or transmit any packets; + e.g. 'ping' does not work. +Reason: There is an incorrect route in your routing table. +Solution: Check the routing table with the command 'route' and read the + manual help pages dealing with routes (enter 'man route'). + +NOTE: Although the 2.2.x kernel versions generate the routing entry + automatically, problems of this kind may occur here as well. We've + come across a situation in which the driver started correctly at + system start, but after the driver has been removed and reloaded, + the route of the adapter's network pointed to the 'dummy0'device + and had to be corrected manually. + + +Problem: Your computer should act as a router between multiple + IP subnetworks (using multiple adapters), but computers in + other subnetworks cannot be reached. +Reason: Either the router's kernel is not configured for IP forwarding + or the routing table and gateway configuration of at least one + computer is not working. + +Problem: Upon driver start, the following error message is displayed: + "eth0: -- ERROR -- + Class: internal Software error + Nr: 0xcc + Msg: SkGeInitPort() cannot init running ports" +Reason: You are using a driver compiled for single processor machines + on a multiprocessor machine with SMP (Symmetric MultiProcessor) + kernel. +Solution: Configure your kernel appropriately and recompile the kernel or + the modules. + + + +If your problem is not listed here, please contact SysKonnect's technical +support for help (linux@syskonnect.de). +When contacting our technical support, please ensure that the following +information is available: +- System Manufacturer and HW Informations (CPU, Memory... ) +- PCI-Boards in your system +- Distribution +- Kernel version +- Driver version +*** + + + +***End of Readme File*** diff --git a/Documentation/networking/skfp.txt b/Documentation/networking/skfp.txt new file mode 100644 index 000000000000..3a419ed42f81 --- /dev/null +++ b/Documentation/networking/skfp.txt @@ -0,0 +1,220 @@ +(C)Copyright 1998-2000 SysKonnect, +=========================================================================== + +skfp.txt created 11-May-2000 + +Readme File for skfp.o v2.06 + + +This file contains +(1) OVERVIEW +(2) SUPPORTED ADAPTERS +(3) GENERAL INFORMATION +(4) INSTALLATION +(5) INCLUSION OF THE ADAPTER IN SYSTEM START +(6) TROUBLESHOOTING +(7) FUNCTION OF THE ADAPTER LEDS +(8) HISTORY + +=========================================================================== + + + +(1) OVERVIEW +============ + +This README explains how to use the driver 'skfp' for Linux with your +network adapter. + +Chapter 2: Contains a list of all network adapters that are supported by + this driver. + +Chapter 3: Gives some general information. + +Chapter 4: Describes common problems and solutions. + +Chapter 5: Shows the changed functionality of the adapter LEDs. + +Chapter 6: History of development. + +*** + + +(2) SUPPORTED ADAPTERS +====================== + +The network driver 'skfp' supports the following network adapters: +SysKonnect adapters: + - SK-5521 (SK-NET FDDI-UP) + - SK-5522 (SK-NET FDDI-UP DAS) + - SK-5541 (SK-NET FDDI-FP) + - SK-5543 (SK-NET FDDI-LP) + - SK-5544 (SK-NET FDDI-LP DAS) + - SK-5821 (SK-NET FDDI-UP64) + - SK-5822 (SK-NET FDDI-UP64 DAS) + - SK-5841 (SK-NET FDDI-FP64) + - SK-5843 (SK-NET FDDI-LP64) + - SK-5844 (SK-NET FDDI-LP64 DAS) +Compaq adapters (not tested): + - Netelligent 100 FDDI DAS Fibre SC + - Netelligent 100 FDDI SAS Fibre SC + - Netelligent 100 FDDI DAS UTP + - Netelligent 100 FDDI SAS UTP + - Netelligent 100 FDDI SAS Fibre MIC +*** + + +(3) GENERAL INFORMATION +======================= + +From v2.01 on, the driver is integrated in the linux kernel sources. +Therefor, the installation is the same as for any other adapter +supported by the kernel. +Refer to the manual of your distribution about the installation +of network adapters. +Makes my life much easier :-) +*** + + +(4) TROUBLESHOOTING +=================== + +If you run into problems during installation, check those items: + +Problem: The FDDI adapter can not be found by the driver. +Reason: Look in /proc/pci for the following entry: + 'FDDI network controller: SysKonnect SK-FDDI-PCI ...' + If this entry exists, then the FDDI adapter has been + found by the system and should be able to be used. + If this entry does not exist or if the file '/proc/pci' + is not there, then you may have a hardware problem or PCI + support may not be enabled in your kernel. + The adapter can be checked using the diagnostic program + which is available from the SysKonnect web site: + www.syskonnect.de + Some COMPAQ machines have a problem with PCI under + Linux. This is described in the 'PCI howto' document + (included in some distributions or available from the + www, e.g. at 'www.linux.org') and no workaround is available. + +Problem: You want to use your computer as a router between + multiple IP subnetworks (using multiple adapters), but + you can not reach computers in other subnetworks. +Reason: Either the router's kernel is not configured for IP + forwarding or there is a problem with the routing table + and gateway configuration in at least one of the + computers. + +If your problem is not listed here, please contact our +technical support for help. +You can send email to: + linux@syskonnect.de +When contacting our technical support, +please ensure that the following information is available: +- System Manufacturer and Model +- Boards in your system +- Distribution +- Kernel version + +*** + + +(5) FUNCTION OF THE ADAPTER LEDS +================================ + + The functionality of the LED's on the FDDI network adapters was + changed in SMT version v2.82. With this new SMT version, the yellow + LED works as a ring operational indicator. An active yellow LED + indicates that the ring is down. The green LED on the adapter now + works as a link indicator where an active GREEN LED indicates that + the respective port has a physical connection. + + With versions of SMT prior to v2.82 a ring up was indicated if the + yellow LED was off while the green LED(s) showed the connection + status of the adapter. During a ring down the green LED was off and + the yellow LED was on. + + All implementations indicate that a driver is not loaded if + all LEDs are off. + +*** + + +(6) HISTORY +=========== + +v2.06 (20000511) (In-Kernel version) + New features: + - 64 bit support + - new pci dma interface + - in kernel 2.3.99 + +v2.05 (20000217) (In-Kernel version) + New features: + - Changes for 2.3.45 kernel + +v2.04 (20000207) (Standalone version) + New features: + - Added rx/tx byte counter + +v2.03 (20000111) (Standalone version) + Problems fixed: + - Fixed printk statements from v2.02 + +v2.02 (991215) (Standalone version) + Problems fixed: + - Removed unnecessary output + - Fixed path for "printver.sh" in makefile + +v2.01 (991122) (In-Kernel version) + New features: + - Integration in Linux kernel sources + - Support for memory mapped I/O. + +v2.00 (991112) + New features: + - Full source released under GPL + +v1.05 (991023) + Problems fixed: + - Compilation with kernel version 2.2.13 failed + +v1.04 (990427) + Changes: + - New SMT module included, changing LED functionality + Problems fixed: + - Synchronization on SMP machines was buggy + +v1.03 (990325) + Problems fixed: + - Interrupt routing on SMP machines could be incorrect + +v1.02 (990310) + New features: + - Support for kernel versions 2.2.x added + - Kernel patch instead of private duplicate of kernel functions + +v1.01 (980812) + Problems fixed: + Connection hangup with telnet + Slow telnet connection + +v1.00 beta 01 (980507) + New features: + None. + Problems fixed: + None. + Known limitations: + - tar archive instead of standard package format (rpm). + - FDDI statistic is empty. + - not tested with 2.1.xx kernels + - integration in kernel not tested + - not tested simultaneously with FDDI adapters from other vendors. + - only X86 processors supported. + - SBA (Synchronous Bandwidth Allocator) parameters can + not be configured. + - does not work on some COMPAQ machines. See the PCI howto + document for details about this problem. + - data corruption with kernel versions below 2.0.33. + +*** End of information file *** diff --git a/Documentation/networking/slicecom.hun b/Documentation/networking/slicecom.hun new file mode 100644 index 000000000000..5acf1918694a --- /dev/null +++ b/Documentation/networking/slicecom.hun @@ -0,0 +1,371 @@ + +SliceCOM adapter felhasznaloi dokumentacioja - 0.51 verziohoz + +Bartók István <bartoki@itc.hu> +Utolso modositas: Wed Aug 29 17:26:58 CEST 2001 + +----------------------------------------------------------------- + +Hasznalata: + +Forditas: + +Code maturity level options + [*] Prompt for development and/or incomplete code/drivers + +Network device support + Wan interfaces + <M> MultiGate (COMX) synchronous + <M> Support for MUNICH based boards: SliceCOM, PCICOM (NEW) + <M> Support for HDLC and syncPPP... + + +A modulok betoltese: + +modprobe comx + +modprobe comx-proto-ppp # a Cisco-HDLC es a SyncPPP protokollt is + # ez a modul adja + +modprobe comx-hw-munich # a modul betoltodeskor azonnal jelent a + # syslogba a detektalt kartyakrol + + +Konfiguralas: + +# Ezen az interfeszen Cisco-HDLC vonali protokoll fog futni +# Az interfeszhez rendelt idoszeletek: 1,2 (128 kbit/sec-es vonal) +# (a G.703 keretben az elso adatot vivo idoszelet az 1-es) +# +mkdir /proc/comx/comx0.1/ +echo slicecom >/proc/comx/comx0.1/boardtype +echo hdlc >/proc/comx/comx0.1/protocol +echo 1 2 >/proc/comx/comx0.1/timeslots + + +# Ezen az interfeszen SyncPPP vonali protokoll fog futni +# Az interfeszhez rendelt idoszelet: 3 (64 kbit/sec-es vonal) +# +mkdir /proc/comx/comx0.2/ +echo slicecom >/proc/comx/comx0.2/boardtype +echo ppp >/proc/comx/comx0.2/protocol +echo 3 >/proc/comx/comx0.2/timeslots + +... + +ifconfig comx0.1 up +ifconfig comx0.2 up + +----------------------------------------------------------------- + +A COMX driverek default 20 csomagnyi transmit queue-t rendelnek a halozati +interfeszekhez. WAN halozatokban ennel hosszabbat is szokas hasznalni +(20 es 100 kozott), hogy a vonal kihasznaltsaga nagy terheles eseten jobb +legyen (bar ezzel megno a varhato kesleltetes a csomagok sorban allasa miatt): + +# ifconfig comx0 txqueuelen 50 + +Ezt a beallitasi lehetoseget csak az ujabb disztribuciok ifconfig parancsa +tamogatja (amik mar a 2.2 kernelekhez keszultek, mint a RedHat 6.1 vagy a +Debian 2.2). + +A 2.1-es Debian disztribuciohoz a http://www.debian.org/~rcw/2.2/netbase/ +cimrol toltheto le ujabb netbase csomag, ami mar ilyet tamogato ifconfig +parancsot tartalmaz. Bovebben a 2.2 kernel hasznalatarol Debian 2.1 alatt: +http://www.debian.org/releases/stable/running-kernel-2.2 + +----------------------------------------------------------------- + +A kartya LED-jeinek jelentese: + +piros - eg, ha Remote Alarm-ot kuld a tuloldal +zold - eg, ha a vett jelben megtalalja a keretszinkront + +Reszletesebben: + +piros: zold: jelentes: + +- - nincs keretszinkron (nincs jel, vagy rossz a jel) +- eg "minden rendben" +eg eg a vetel OK, de a tuloldal Remote Alarm-ot kuld +eg - ez nincs ertelmezve, egyelore funkcio nelkul + +----------------------------------------------------------------- + +Reszletesebb leiras a hardver beallitasi lehetosegeirol: + +Az altalanos,- es a protokoll-retegek beallitasi lehetosegeirol a 'comx.txt' +fajlban leirtak SliceCOM kartyanal is ervenyesek, itt csak a hardver-specifikus +beallitasi lehetosegek vannak osszefoglalva: + +Konfiguralasi interfesz a /proc/comx/ alatt: + +Minden timeslot-csoportnak kulon comx* interfeszt kell letrehozni mkdir-rel: +comx0, comx1, .. stb. Itt beallithato, hogy az adott interfesz hanyadik kartya +melyik timeslotja(i)bol alljon ossze. A Cisco-fele serial3:1 elnevezesek +(serial3:1 = a 3. kartyaban az 1-es idoszelet-csoport) Linuxon aliasing-ot +jelentenenek, ezert mi nem tudunk ilyen elnevezest hasznalni. + +Tobb kartya eseten a comx0.1, comx0.2, ... vagy slice0.1, slice0.2 nevek +hasznalhatoak. + +Tobb SliceCOM kartya is lehet egy gepben, de sajat interrupt kell mindegyiknek, +nem tud meg megosztott interruptot kezelni. + +Az egesz kartyat erinto beallitasok: + +Az ioport es irq beallitas nincs: amit a PCI BIOS kioszt a rendszernek, +azt hasznalja a driver. + + +comx0/boardnum - hanyadik SliceCOM kartya a gepben (a 'termeszetes' PCI + sorrendben ertve: ahogyan a /proc/pci-ban vagy az 'lspci' + kimeneteben megjelenik, altalaban az alaplapi PCI meghajto + aramkorokhoz kozelebb eso kartyak a kisebb sorszamuak) + + Default: 0 (0-tol kezdodik a szamolas) + + +Bar a kovetkezoket csak egy-egy interfeszen allitjuk at, megis az egesz kartya +mukodeset egyszerre allitjak. A megkotes hogy csak UP-ban levo interfeszen +hasznalhatoak, azert van, mert kulonben nem vart eredmenyekre vezetne egy ilyen +paranccsorozat: + + echo 0 >boardnum + echo internal >clock_source + echo 1 >boardnum + +- Ez a 0-s board clock_source-at allitana at. + +Ezek a beallitasok megmaradnak az osszes interfesz torlesekor, de torlodnek +a driver modul ki/betoltesekor. + + +comx0/clock_source - A Tx orajelforrasa, a Cisco-val hasonlatosra keszult. + Hasznalata: + + papaya:# echo line >/proc/comx/comx0/clock_source + papaya:# echo internal >/proc/comx/comx0/clock_source + + line - A Tx orajelet a vett adatfolyambol dekodolja, igyekszik + igazodni hozza. Ha nem lat orajelet az inputon, akkor + atall a sajat orajelgeneratorara. + internal - A Tx orajelet a sajat orajelgeneratora szolgaltatja. + + Default: line + + Normal osszeallitas eseten a tavkozlesi szolgaltato eszkoze + (pl. HDSL modem) adja az orajelet, ezert ez a default. + + +comx0/framing - A CRC4 ki/be kapcsolasa + + A CRC4: 16 PCM keretet (A PCM keret az, amibe a 32 darab 64 + kilobites csatorna van bemultiplexalva. Nem osszetevesztendo a HDLC + kerettel.) 2x8 -as csoportokra osztanak, es azokhoz 4-4 bites CRC-t + szamolnak. Elsosorban a vonal minosegenek a monitorozasara szolgal. + + papaya:~# echo crc4 >/proc/comx/comx0/framing + papaya:~# echo no-crc4 >/proc/comx/comx0/framing + + Default a 'crc4', a MATAV vonalak altalaban igy futnak. De ha nem + egyforma is a beallitas a vonal ket vegen, attol a forgalom altalaban + at tud menni. + + +comx0/linecode - A vonali kodolas beallitasa + + papaya:~# echo hdb3 >/proc/comx/comx0/linecode + papaya:~# echo ami >/proc/comx/comx0/linecode + + Default a 'hdb3', a MATAV vonalak igy futnak. + + (az AMI kodolas igen ritka E1-es vonalaknal). Ha ez a beallitas nem + egyezik a vonal ket vegen, akkor elofordulhat hogy a keretszinkron + osszejon, de CRC4-hibak es a vonalakon atvitt adatokban is hibak + keletkeznek (amit a HDLC/SyncPPP szinten CRC-hibaval jelez) + + +comx0/reg - a kartya aramkoreinek, a MUNICH (reg) es a FALC (lbireg) +comx0/lbireg regisztereinek kozvetlen elerese. Hasznalata: + + echo >reg 0x04 0x0 - a 4-es regiszterbe 0-t ir + echo >reg 0x104 - printk()-val kiirja a 4-es regiszter + tartalmat a syslogba. + + WARNING: ezek csak a fejleszteshez keszultek, sok galibat + lehet veluk okozni! + + +comx0/loopback - A kartya G.703 jelenek a visszahurkolasara is van lehetoseg: + + papaya:# echo none >/proc/comx/comx0/loopback + papaya:# echo local >/proc/comx/comx0/loopback + papaya:# echo remote >/proc/comx/comx0/loopback + + none - nincs visszahurkolas, normal mukodes + local - a kartya a sajat maga altal adott jelet kapja vissza + remote - a kartya a kivulrol vett jelet adja kifele + + Default: none + +----------------------------------------------------------------- + +Az interfeszhez (Cisco terminologiaban 'channel-group') kapcsolodo beallitasok: + +comx0/timeslots - mely timeslotok (idoszeletek) tartoznak az adott interfeszhez. + + papaya:~# cat /proc/comx/comx0/timeslots + 1 3 4 5 6 + papaya:~# + + Egy timeslot megkeresese (hanyas interfeszbe tartozik nalunk): + + papaya:~# grep ' 4' /proc/comx/comx*/timeslots + /proc/comx/comx0/timeslots:1 3 4 5 6 + papaya:~# + + Beallitasa: + papaya:~# echo '1 5 2 6 7 8' >/proc/comx/comx0/timeslots + + A timeslotok sorrendje nem szamit, '1 3 2' ugyanaz mint az '1 2 3'. + + Beallitashoz az adott interfesznek DOWN-ban kell lennie + (ifconfig comx0 down), de ugyanannak a kartyanak a tobbi interfesze + uzemelhet kozben. + + Beallitaskor leellenorzi, hogy az uj timeslotok nem utkoznek-e egy + masik interfesz timeslotjaival. Ha utkoznek, akkor nem allitja at. + + Mindig 10-es szamrendszerben tortenik a timeslotok ertelmezese, nehogy + a 08, 09 alaku felirast rosszul ertelmezze. + +----------------------------------------------------------------- + +Az interfeszek es a kartya allapotanak lekerdezese: + +- A ' '-szel kezdodo sorok az eredeti kimenetet, a //-rel kezdodo sorok a +magyarazatot jelzik. + + papaya:~$ cat /proc/comx/comx1/status + Interface administrative status is UP, modem status is UP, protocol is UP + Modem status changes: 0, Transmitter status is IDLE, tbusy: 0 + Interface load (input): 978376 / 947808 / 951024 bits/s (5s/5m/15m) + (output): 978376 / 947848 / 951024 bits/s (5s/5m/15m) + Debug flags: none + RX errors: len: 22, overrun: 1, crc: 0, aborts: 0 + buffer overrun: 0, pbuffer overrun: 0 + TX errors: underrun: 0 + Line keepalive (value: 10) status UP [0] + +// Itt kezdodik a hardver-specifikus resz: + Controller status: + No alarms + +// Alarm: hibajelzes: +// +// No alarms - minden rendben +// +// LOS - Loss Of Signal - nem erzekel jelet a bemeneten. +// AIS - Alarm Indication Signal - csak egymas utani 1-esek jonnek +// a bemeneten, a tuloldal igy is jelezheti hogy meghibasodott vagy +// nincs inicializalva. +// AUXP - Auxiliary Pattern Indication - 01010101.. sorozat jon a bemeneten. +// LFA - Loss of Frame Alignment - nincs keretszinkron +// RRA - Receive Remote Alarm - a tuloldal el, de hibat jelez. +// LMFA - Loss of CRC4 Multiframe Alignment - nincs CRC4-multikeret-szinkron +// NMF - No Multiframe alignment Found after 400 msec - ilyen alarm a no-crc4 +// es crc4 keretezesek eseten nincs, lasd lentebb +// +// Egyeb lehetseges hibajelzesek: +// +// Transmit Line Short - a kartya ugy erzi hogy az adasi kimenete rovidre +// van zarva, ezert kikapcsolta az adast. (nem feltetlenul veszi eszre +// a kulso rovidzarat) + +// A veteli oldal csomagjainak lancolt listai, debug celokra: + + Rx ring: + rafutott: 0 + lastcheck: 50845731, jiffies: 51314281 + base: 017b1858 + rx_desc_ptr: 0 + rx_desc_ptr: 017b1858 + hw_curr_ptr: 017b1858 + 06040000 017b1868 017b1898 c016ff00 + 06040000 017b1878 017b1e9c c016ff00 + 46040000 017b1888 017b24a0 c016ff00 + 06040000 017b1858 017b2aa4 c016ff00 + +// A kartyat hasznalo tobbi interfesz: a 0-s channel-group a comx1 interfesz, +// es az 1,2,...,16 timeslotok tartoznak hozza: + + Interfaces using this board: (channel-group, interface, timeslots) + 0 comx1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 + 1 comx2: 17 + 2 comx3: 18 + 3 comx4: 19 + 4 comx5: 20 + 5 comx6: 21 + 6 comx7: 22 + 7 comx8: 23 + 8 comx9: 24 + 9 comx10: 25 + 10 comx11: 26 + 11 comx12: 27 + 12 comx13: 28 + 13 comx14: 29 + 14 comx15: 30 + 15 comx16: 31 + +// Hany esemenyt kezelt le a driver egy-egy hardver-interrupt kiszolgalasanal: + + Interrupt work histogram: + hist[ 0]: 0 hist[ 1]: 2 hist[ 2]: 18574 hist[ 3]: 79 + hist[ 4]: 14 hist[ 5]: 1 hist[ 6]: 0 hist[ 7]: 1 + hist[ 8]: 0 hist[ 9]: 7 + +// Hany kikuldendo csomag volt mar a Tx-ringben amikor ujabb lett irva bele: + + Tx ring histogram: + hist[ 0]: 2329 hist[ 1]: 0 hist[ 2]: 0 hist[ 3]: 0 + +// Az E1-interfesz hiba-szamlaloi, az rfc2495-nek megfeleloen: +// (kb. a Cisco routerek "show controllers e1" formatumaban: http://www.cisco.com/univercd/cc/td/doc/product/software/ios11/rbook/rinterfc.htm#xtocid25669126) + +Data in current interval (91 seconds elapsed): + 9516 Line Code Violations, 65 Path Code Violations, 2 E-Bit Errors + 0 Slip Secs, 2 Fr Loss Secs, 2 Line Err Secs, 0 Degraded Mins + 0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 11 Unavail Secs +Data in Interval 1 (15 minutes): + 0 Line Code Violations, 0 Path Code Violations, 0 E-Bit Errors + 0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins + 0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs +Data in last 4 intervals (1 hour): + 0 Line Code Violations, 0 Path Code Violations, 0 E-Bit Errors + 0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins + 0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs +Data in last 96 intervals (24 hours): + 0 Line Code Violations, 0 Path Code Violations, 0 E-Bit Errors + 0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins + 0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs + +----------------------------------------------------------------- + +Nehany kulonlegesebb beallitasi lehetoseg (idovel beepulhetnek majd a driverbe): +Ezekkel sok galibat lehet okozni, nagyon ovatosan kell oket hasznalni! + + modified CRC-4, for improved interworking of CRC-4 and non-CRC-4 + devices: (lasd page 107 es g706 Annex B) + lbireg[ 0x1b ] |= 0x08 + lbireg[ 0x1c ] |= 0xc0 + - ilyenkor ertelmezett az NMF - 'No Multiframe alignment Found after + 400 msec' alarm. + + FALC - a vonali meghajto IC + local loop - a sajat adasomat halljam vissza + remote loop - a kivulrol jovo adast adom vissza + + Egy hibakeresesre hasznalhato dolog: + - 1-es timeslot local loop a FALC-ban: echo >lbireg 0x1d 0x21 + - local loop kikapcsolasa: echo >lbireg 0x1d 0x00 diff --git a/Documentation/networking/slicecom.txt b/Documentation/networking/slicecom.txt new file mode 100644 index 000000000000..59cfd95121fb --- /dev/null +++ b/Documentation/networking/slicecom.txt @@ -0,0 +1,369 @@ + +SliceCOM adapter user's documentation - for the 0.51 driver version + +Written by Bartók István <bartoki@itc.hu> + +English translation: Lakatos György <gyuri@itc.hu> +Mon Dec 11 15:28:42 CET 2000 + +Last modified: Wed Aug 29 17:25:37 CEST 2001 + +----------------------------------------------------------------- + +Usage: + +Compiling the kernel: + +Code maturity level options + [*] Prompt for development and/or incomplete code/drivers + +Network device support + Wan interfaces + <M> MultiGate (COMX) synchronous + <M> Support for MUNICH based boards: SliceCOM, PCICOM (NEW) + <M> Support for HDLC and syncPPP... + + +Loading the modules: + +modprobe comx + +modprobe comx-proto-ppp # module for Cisco-HDLC and SyncPPP protocols + +modprobe comx-hw-munich # the module logs information by the kernel + # about the detected boards + + +Configuring the board: + +# This interface will use the Cisco-HDLC line protocol, +# the timeslices assigned are 1,2 (128 KiBit line speed) +# (the first data timeslice in the G.703 frame is no. 1) +# +mkdir /proc/comx/comx0.1/ +echo slicecom >/proc/comx/comx0.1/boardtype +echo hdlc >/proc/comx/comx0.1/protocol +echo 1 2 >/proc/comx/comx0.1/timeslots + + +# This interface uses SyncPPP line protocol, the assigned +# is no. 3 (64 KiBit line speed) +# +mkdir /proc/comx/comx0.2/ +echo slicecom >/proc/comx/comx0.2/boardtype +echo ppp >/proc/comx/comx0.2/protocol +echo 3 >/proc/comx/comx0.2/timeslots + +... + +ifconfig comx0.1 up +ifconfig comx0.2 up + +----------------------------------------------------------------- + +The COMX interfaces use a 10 packet transmit queue by default, however WAN +networks sometimes use bigger values (20 to 100), to utilize the line better +by large traffic (though the line delay increases because of more packets +join the queue). + +# ifconfig comx0 txqueuelen 50 + +This option is only supported by the ifconfig command of the later +distributions, which came with 2.2 kernels, such as RedHat 6.1 or Debian 2.2. + +You can download a newer netbase packet from +http://www.debian.org/~rcw/2.2/netbase/ for Debian 2.1, which has a new +ifconfig. You can get further information about using 2.2 kernel with +Debian 2.1 from http://www.debian.org/releases/stable/running-kernel-2.2 + +----------------------------------------------------------------- + +The SliceCom LEDs: + +red - on, if the interface is unconfigured, or it gets Remote Alarm-s +green - on, if the board finds frame-sync in the received signal + +A bit more detailed: + +red: green: meaning: + +- - no frame-sync, no signal received, or signal SNAFU. +- on "Everything is OK" +on on Recepion is ok, but the remote end sends Remote Alarm +on - The interface is unconfigured + +----------------------------------------------------------------- + +A more detailed description of the hardware setting options: + +The general and the protocol layer options described in the 'comx.txt' file +apply to the SliceCom as well, I only summarize the SliceCom hardware specific +settings below. + +The '/proc/comx' configuring interface: + +An interface directory should be created for every timeslot group with +'mkdir', e,g: 'comx0', 'comx1' etc. The timeslots can be assigned here to the +specific interface. The Cisco-like naming convention (serial3:1 - first +timeslot group of the 3rd. board) can't be used here, because these mean IP +aliasing in Linux. + +You can give any meaningful name to keep the configuration clear; +e.g: 'comx0.1', 'comx0.2', 'comx1.1', comx1.2', if you have two boards +with two interfaces each. + +Settings, which apply to the board: + +Neither 'io' nor 'irq' settings required, the driver uses the resources +given by the PCI BIOS. + +comx0/boardnum - board number of the SliceCom in the PC (using the 'natural' + PCI order) as listed in '/proc/pci' or the output of the + 'lspci' command, generally the slots nearer to the motherboard + PCI driver chips have the lower numbers. + + Default: 0 (the counting starts with 0) + +Though the options below are to be set on a single interface, they apply to the +whole board. The restriction, to use them on 'UP' interfaces, is because the +command sequence below could lead to unpredicable results. + + # echo 0 >boardnum + # echo internal >clock_source + # echo 1 >boardnum + +The sequence would set the clock source of board 0. + +These settings will persist after all the interfaces are cleared, but are +cleared when the driver module is unloaded and loaded again. + +comx0/clock_source - source of the transmit clock + Usage: + + # echo line >/proc/comx/comx0/clock_source + # echo internal >/proc/comx/comx0/clock_source + + line - The Tx clock is being decoded if the input data stream, + if no clock seen on the input, then the board will use it's + own clock generator. + + internal - The Tx clock is supplied by the builtin clock generator. + + Default: line + + Normally, the telecommunication company's end device (the HDSL + modem) provides the Tx clock, that's why 'line' is the default. + +comx0/framing - Switching CRC4 off/on + + CRC4: 16 PCM frames (The 32 64Kibit channels are multiplexed into a + PCM frame, nothing to do with HDLC frames) are divided into 2x8 + groups, each group has a 4 bit CRC. + + # echo crc4 >/proc/comx/comx0/framing + # echo no-crc4 >/proc/comx/comx0/framing + + Default is 'crc4', the Hungarian MATAV lines behave like this. + The traffic generally passes if this setting on both ends don't match. + +comx0/linecode - Setting the line coding + + # echo hdb3 >/proc/comx/comx0/linecode + # echo ami >/proc/comx/comx0/linecode + + Default a 'hdb3', MATAV lines use this. + + (AMI coding is rarely used with E1 lines). Frame sync may occur, if + this setting doesn't match the other end's, but CRC4 and data errors + will come, which will result in CRC errors on HDLC/SyncPPP level. + +comx0/reg - direct access to the board's MUNICH (reg) and FALC (lbireg) +comx0/lbireg circuit's registers + + # echo >reg 0x04 0x0 - write 0 to register 4 + # echo >reg 0x104 - write the contents of register 4 with + printk() to syslog + +WARNING! These are only for development purposes, messing with this will + result much trouble! + +comx0/loopback - Places a loop to the board's G.703 signals + + # echo none >/proc/comx/comx0/loopback + # echo local >/proc/comx/comx0/loopback + # echo remote >/proc/comx/comx0/loopback + + none - normal operation, no loop + local - the board receives it's own output + remote - the board sends the received data to the remote side + + Default: none + +----------------------------------------------------------------- + +Interface (channel group in Cisco terms) settings: + +comx0/timeslots - which timeslots belong to the given interface + + Setting: + + # echo '1 5 2 6 7 8' >/proc/comx/comx0/timeslots + + # cat /proc/comx/comx0/timeslots + 1 2 5 6 7 8 + # + + Finding a timeslot: + + # grep ' 4' /proc/comx/comx*/timeslots + /proc/comx/comx0/timeslots:1 3 4 5 6 + # + + The timeslots can be in any order, '1 2 3' is the same as '1 3 2'. + + The interface has to be DOWN during the setting ('ifconfig comx0 + down'), but the other interfaces could operate normally. + + The driver checks if the assigned timeslots are vacant, if not, then + the setting won't be applied. + + The timeslot values are treated as decimal numbers, not to misunderstand + values of 08, 09 form. + +----------------------------------------------------------------- + +Checking the interface and board status: + +- Lines beginning with ' ' (space) belong to the original output, the lines +which begin with '//' are the comments. + + papaya:~$ cat /proc/comx/comx1/status + Interface administrative status is UP, modem status is UP, protocol is UP + Modem status changes: 0, Transmitter status is IDLE, tbusy: 0 + Interface load (input): 978376 / 947808 / 951024 bits/s (5s/5m/15m) + (output): 978376 / 947848 / 951024 bits/s (5s/5m/15m) + Debug flags: none + RX errors: len: 22, overrun: 1, crc: 0, aborts: 0 + buffer overrun: 0, pbuffer overrun: 0 + TX errors: underrun: 0 + Line keepalive (value: 10) status UP [0] + +// The hardware specific part starts here: + Controller status: + No alarms + +// Alarm: +// +// No alarms - Everything OK +// +// LOS - Loss Of Signal - No signal sensed on the input +// AIS - Alarm Indication Signal - The remot side sends '11111111'-s, +// it tells, that there's an error condition, or it's not +// initialised. +// AUXP - Auxiliary Pattern Indication - 01010101.. received. +// LFA - Loss of Frame Alignment - no frame sync received. +// RRA - Receive Remote Alarm - the remote end's OK, but singnals error cond. +// LMFA - Loss of CRC4 Multiframe Alignment - no CRC4 multiframe sync. +// NMF - No Multiframe alignment Found after 400 msec - no such alarm using +// no-crc4 or crc4 framing, see below. +// +// Other possible error messages: +// +// Transmit Line Short - the board felt, that it's output is short-circuited, +// so it switched the transmission off. (The board can't definitely tell, +// that it's output is short-circuited.) + +// Chained list of the received packets, for debug purposes: + + Rx ring: + rafutott: 0 + lastcheck: 50845731, jiffies: 51314281 + base: 017b1858 + rx_desc_ptr: 0 + rx_desc_ptr: 017b1858 + hw_curr_ptr: 017b1858 + 06040000 017b1868 017b1898 c016ff00 + 06040000 017b1878 017b1e9c c016ff00 + 46040000 017b1888 017b24a0 c016ff00 + 06040000 017b1858 017b2aa4 c016ff00 + +// All the interfaces using the board: comx1, using the 1,2,...16 timeslots, +// comx2, using timeslot 17, etc. + + Interfaces using this board: (channel-group, interface, timeslots) + 0 comx1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 + 1 comx2: 17 + 2 comx3: 18 + 3 comx4: 19 + 4 comx5: 20 + 5 comx6: 21 + 6 comx7: 22 + 7 comx8: 23 + 8 comx9: 24 + 9 comx10: 25 + 10 comx11: 26 + 11 comx12: 27 + 12 comx13: 28 + 13 comx14: 29 + 14 comx15: 30 + 15 comx16: 31 + +// The number of events handled by the driver during an interrupt cycle: + + Interrupt work histogram: + hist[ 0]: 0 hist[ 1]: 2 hist[ 2]: 18574 hist[ 3]: 79 + hist[ 4]: 14 hist[ 5]: 1 hist[ 6]: 0 hist[ 7]: 1 + hist[ 8]: 0 hist[ 9]: 7 + +// The number of packets to send in the Tx ring, when a new one arrived: + + Tx ring histogram: + hist[ 0]: 2329 hist[ 1]: 0 hist[ 2]: 0 hist[ 3]: 0 + +// The error counters of the E1 interface, according to the RFC2495, +// (similar to the Cisco "show controllers e1" command's output: +// http://www.cisco.com/univercd/cc/td/doc/product/software/ios11/rbook/rinterfc.htm#xtocid25669126) + +Data in current interval (91 seconds elapsed): + 9516 Line Code Violations, 65 Path Code Violations, 2 E-Bit Errors + 0 Slip Secs, 2 Fr Loss Secs, 2 Line Err Secs, 0 Degraded Mins + 0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 11 Unavail Secs +Data in Interval 1 (15 minutes): + 0 Line Code Violations, 0 Path Code Violations, 0 E-Bit Errors + 0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins + 0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs +Data in last 4 intervals (1 hour): + 0 Line Code Violations, 0 Path Code Violations, 0 E-Bit Errors + 0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins + 0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs +Data in last 96 intervals (24 hours): + 0 Line Code Violations, 0 Path Code Violations, 0 E-Bit Errors + 0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins + 0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs + +----------------------------------------------------------------- + +Some unique options, (may get into the driver later): +Treat them very carefully, these can cause much trouble! + + modified CRC-4, for improved interworking of CRC-4 and non-CRC-4 + devices: (see page 107 and g706 Annex B) + lbireg[ 0x1b ] |= 0x08 + lbireg[ 0x1c ] |= 0xc0 + + - The NMF - 'No Multiframe alignment Found after 400 msec' alarm + comes into account. + + FALC - the line driver chip. + local loop - I hear my transmission back. + remote loop - I echo the remote transmission back. + + Something useful for finding errors: + + - local loop for timeslot 1 in the FALC chip: + + # echo >lbireg 0x1d 0x21 + + - Swithing the loop off: + + # echo >lbireg 0x1d 0x00 diff --git a/Documentation/networking/smc9.txt b/Documentation/networking/smc9.txt new file mode 100644 index 000000000000..d1e15074e43d --- /dev/null +++ b/Documentation/networking/smc9.txt @@ -0,0 +1,42 @@ + +SMC 9xxxx Driver +Revision 0.12 +3/5/96 +Copyright 1996 Erik Stahlman +Released under terms of the GNU General Public License. + +This file contains the instructions and caveats for my SMC9xxx driver. You +should not be using the driver without reading this file. + +Things to note about installation: + + 1. The driver should work on all kernels from 1.2.13 until 1.3.71. + (A kernel patch is supplied for 1.3.71 ) + + 2. If you include this into the kernel, you might need to change some + options, such as for forcing IRQ. + + + 3. To compile as a module, run 'make' . + Make will give you the appropriate options for various kernel support. + + 4. Loading the driver as a module : + + use: insmod smc9194.o + optional parameters: + io=xxxx : your base address + irq=xx : your irq + ifport=x : 0 for whatever is default + 1 for twisted pair + 2 for AUI ( or BNC on some cards ) + +How to obtain the latest version? + +FTP: + ftp://fenris.campus.vt.edu/smc9/smc9-12.tar.gz + ftp://sfbox.vt.edu/filebox/F/fenris/smc9/smc9-12.tar.gz + + +Contacting me: + erik@mail.vt.edu + diff --git a/Documentation/networking/smctr.txt b/Documentation/networking/smctr.txt new file mode 100644 index 000000000000..4c866f5a0ee4 --- /dev/null +++ b/Documentation/networking/smctr.txt @@ -0,0 +1,66 @@ +Text File for the SMC TokenCard TokenRing Linux driver (smctr.c). + By Jay Schulist <jschlst@samba.org> + +The Linux SMC Token Ring driver works with the SMC TokenCard Elite (8115T) +ISA and SMC TokenCard Elite/A (8115T/A) MCA adapters. + +Latest information on this driver can be obtained on the Linux-SNA WWW site. +Please point your browser to: http://www.linux-sna.org + +This driver is rather simple to use. Select Y to Token Ring adapter support +in the kernel configuration. A choice for SMC Token Ring adapters will +appear. This drives supports all SMC ISA/MCA adapters. Choose this +option. I personally recommend compiling the driver as a module (M), but if you +you would like to compile it staticly answer Y instead. + +This driver supports multiple adapters without the need to load multiple copies +of the driver. You should be able to load up to 7 adapters without any kernel +modifications, if you are in need of more please contact the maintainer of this +driver. + +Load the driver either by lilo/loadlin or as a module. When a module using the +following command will suffice for most: + +# modprobe smctr +smctr.c: v1.00 12/6/99 by jschlst@samba.org +tr0: SMC TokenCard 8115T at Io 0x300, Irq 10, Rom 0xd8000, Ram 0xcc000. + +Now just setup the device via ifconfig and set and routes you may have. After +this you are ready to start sending some tokens. + +Errata: +1). For anyone wondering where to pick up the SMC adapters please browse + to http://www.smc.com + +2). If you are the first/only Token Ring Client on a Token Ring LAN, please + specify the ringspeed with the ringspeed=[4/16] module option. If no + ringspeed is specified the driver will attempt to autodetect the ring + speed and/or if the adapter is the first/only station on the ring take + the appropriate actions. + + NOTE: Default ring speed is 16MB UTP. + +3). PnP support for this adapter sucks. I recommend hard setting the + IO/MEM/IRQ by the jumpers on the adapter. If this is not possible + load the module with the following io=[ioaddr] mem=[mem_addr] + irq=[irq_num]. + + The following IRQ, IO, and MEM settings are supported. + + IO ports: + 0x200, 0x220, 0x240, 0x260, 0x280, 0x2A0, 0x2C0, 0x2E0, 0x300, + 0x320, 0x340, 0x360, 0x380. + + IRQs: + 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15 + + Memory addresses: + 0xA0000, 0xA4000, 0xA8000, 0xAC000, 0xB0000, 0xB4000, + 0xB8000, 0xBC000, 0xC0000, 0xC4000, 0xC8000, 0xCC000, + 0xD0000, 0xD4000, 0xD8000, 0xDC000, 0xE0000, 0xE4000, + 0xE8000, 0xEC000, 0xF0000, 0xF4000, 0xF8000, 0xFC000 + +This driver is under the GNU General Public License. Its Firmware image is +included as an initialized C-array and is licensed by SMC to the Linux +users of this driver. However no warranty about its fitness is expressed or +implied by SMC. diff --git a/Documentation/networking/tcp.txt b/Documentation/networking/tcp.txt new file mode 100644 index 000000000000..71749007091e --- /dev/null +++ b/Documentation/networking/tcp.txt @@ -0,0 +1,39 @@ +How the new TCP output machine [nyi] works. + + +Data is kept on a single queue. The skb->users flag tells us if the frame is +one that has been queued already. To add a frame we throw it on the end. Ack +walks down the list from the start. + +We keep a set of control flags + + + sk->tcp_pend_event + + TCP_PEND_ACK Ack needed + TCP_ACK_NOW Needed now + TCP_WINDOW Window update check + TCP_WINZERO Zero probing + + + sk->transmit_queue The transmission frame begin + sk->transmit_new First new frame pointer + sk->transmit_end Where to add frames + + sk->tcp_last_tx_ack Last ack seen + sk->tcp_dup_ack Dup ack count for fast retransmit + + +Frames are queued for output by tcp_write. We do our best to send the frames +off immediately if possible, but otherwise queue and compute the body +checksum in the copy. + +When a write is done we try to clear any pending events and piggy back them. +If the window is full we queue full sized frames. On the first timeout in +zero window we split this. + +On a timer we walk the retransmit list to send any retransmits, update the +backoff timers etc. A change of route table stamp causes a change of header +and recompute. We add any new tcp level headers and refinish the checksum +before sending. + diff --git a/Documentation/networking/tlan.txt b/Documentation/networking/tlan.txt new file mode 100644 index 000000000000..7e6aa5b20c37 --- /dev/null +++ b/Documentation/networking/tlan.txt @@ -0,0 +1,117 @@ +(C) 1997-1998 Caldera, Inc. +(C) 1998 James Banks +(C) 1999-2001 Torben Mathiasen <tmm@image.dk, torben.mathiasen@compaq.com> + +For driver information/updates visit http://opensource.compaq.com + + +TLAN driver for Linux, version 1.14a +README + + +I. Supported Devices. + + Only PCI devices will work with this driver. + + Supported: + Vendor ID Device ID Name + 0e11 ae32 Compaq Netelligent 10/100 TX PCI UTP + 0e11 ae34 Compaq Netelligent 10 T PCI UTP + 0e11 ae35 Compaq Integrated NetFlex 3/P + 0e11 ae40 Compaq Netelligent Dual 10/100 TX PCI UTP + 0e11 ae43 Compaq Netelligent Integrated 10/100 TX UTP + 0e11 b011 Compaq Netelligent 10/100 TX Embedded UTP + 0e11 b012 Compaq Netelligent 10 T/2 PCI UTP/Coax + 0e11 b030 Compaq Netelligent 10/100 TX UTP + 0e11 f130 Compaq NetFlex 3/P + 0e11 f150 Compaq NetFlex 3/P + 108d 0012 Olicom OC-2325 + 108d 0013 Olicom OC-2183 + 108d 0014 Olicom OC-2326 + + + Caveats: + + I am not sure if 100BaseTX daughterboards (for those cards which + support such things) will work. I haven't had any solid evidence + either way. + + However, if a card supports 100BaseTx without requiring an add + on daughterboard, it should work with 100BaseTx. + + The "Netelligent 10 T/2 PCI UTP/Coax" (b012) device is untested, + but I do not expect any problems. + + +II. Driver Options + 1. You can append debug=x to the end of the insmod line to get + debug messages, where x is a bit field where the bits mean + the following: + + 0x01 Turn on general debugging messages. + 0x02 Turn on receive debugging messages. + 0x04 Turn on transmit debugging messages. + 0x08 Turn on list debugging messages. + + 2. You can append aui=1 to the end of the insmod line to cause + the adapter to use the AUI interface instead of the 10 Base T + interface. This is also what to do if you want to use the BNC + connector on a TLAN based device. (Setting this option on a + device that does not have an AUI/BNC connector will probably + cause it to not function correctly.) + + 3. You can set duplex=1 to force half duplex, and duplex=2 to + force full duplex. + + 4. You can set speed=10 to force 10Mbs operation, and speed=100 + to force 100Mbs operation. (I'm not sure what will happen + if a card which only supports 10Mbs is forced into 100Mbs + mode.) + + 5. You have to use speed=X duplex=Y together now. If you just + do "insmod tlan.o speed=100" the driver will do Auto-Neg. + To force a 10Mbps Half-Duplex link do "insmod tlan.o speed=10 + duplex=1". + + 6. If the driver is built into the kernel, you can use the 3rd + and 4th parameters to set aui and debug respectively. For + example: + + ether=0,0,0x1,0x7,eth0 + + This sets aui to 0x1 and debug to 0x7, assuming eth0 is a + supported TLAN device. + + The bits in the third byte are assigned as follows: + + 0x01 = aui + 0x02 = use half duplex + 0x04 = use full duplex + 0x08 = use 10BaseT + 0x10 = use 100BaseTx + + You also need to set both speed and duplex settings when forcing + speeds with kernel-parameters. + ether=0,0,0x12,0,eth0 will force link to 100Mbps Half-Duplex. + + 7. If you have more than one tlan adapter in your system, you can + use the above options on a per adapter basis. To force a 100Mbit/HD + link with your eth1 adapter use: + + insmod tlan speed=0,100 duplex=0,1 + + Now eth0 will use auto-neg and eth1 will be forced to 100Mbit/HD. + Note that the tlan driver supports a maximum of 8 adapters. + + +III. Things to try if you have problems. + 1. Make sure your card's PCI id is among those listed in + section I, above. + 2. Make sure routing is correct. + 3. Try forcing different speed/duplex settings + + +There is also a tlan mailing list which you can join by sending "subscribe tlan" +in the body of an email to majordomo@vuser.vu.union.edu. +There is also a tlan website at http://opensource.compaq.com + diff --git a/Documentation/networking/tms380tr.txt b/Documentation/networking/tms380tr.txt new file mode 100644 index 000000000000..179e527b9da1 --- /dev/null +++ b/Documentation/networking/tms380tr.txt @@ -0,0 +1,147 @@ +Text file for the Linux SysKonnect Token Ring ISA/PCI Adapter Driver. + Text file by: Jay Schulist <jschlst@samba.org> + +The Linux SysKonnect Token Ring driver works with the SysKonnect TR4/16(+) ISA, +SysKonnect TR4/16(+) PCI, SysKonnect TR4/16 PCI, and older revisions of the +SK NET TR4/16 ISA card. + +Latest information on this driver can be obtained on the Linux-SNA WWW site. +Please point your browser to: +http://www.linux-sna.org + +Many thanks to Christoph Goos for his excellent work on this driver and +SysKonnect for donating the adapters to Linux-SNA for the testing and +maintenance of this device driver. + +Important information to be noted: +1. Adapters can be slow to open (~20 secs) and close (~5 secs), please be + patient. +2. This driver works very well when autoprobing for adapters. Why even + think about those nasty io/int/dma settings of modprobe when the driver + will do it all for you! + +This driver is rather simple to use. Select Y to Token Ring adapter support +in the kernel configuration. A choice for SysKonnect Token Ring adapters will +appear. This drives supports all SysKonnect ISA and PCI adapters. Choose this +option. I personally recommend compiling the driver as a module (M), but if you +you would like to compile it staticly answer Y instead. + +This driver supports multiple adapters without the need to load multiple copies +of the driver. You should be able to load up to 7 adapters without any kernel +modifications, if you are in need of more please contact the maintainer of this +driver. + +Load the driver either by lilo/loadlin or as a module. When a module using the +following command will suffice for most: + +# modprobe sktr + +This will produce output similar to the following: (Output is user specific) + +sktr.c: v1.01 08/29/97 by Christoph Goos +tr0: SK NET TR 4/16 PCI found at 0x6100, using IRQ 17. +tr1: SK NET TR 4/16 PCI found at 0x6200, using IRQ 16. +tr2: SK NET TR 4/16 ISA found at 0xa20, using IRQ 10 and DMA 5. + +Now just setup the device via ifconfig and set and routes you may have. After +this you are ready to start sending some tokens. + +Errata: +For anyone wondering where to pick up the SysKonnect adapters please browse +to http://www.syskonnect.com + +This driver is under the GNU General Public License. Its Firmware image is +included as an initialized C-array and is licensed by SysKonnect to the Linux +users of this driver. However no warranty about its fitness is expressed or +implied by SysKonnect. + +Below find attached the setting for the SK NET TR 4/16 ISA adapters +------------------------------------------------------------------- + + *************************** + *** C O N T E N T S *** + *************************** + + 1) Location of DIP-Switch W1 + 2) Default settings + 3) DIP-Switch W1 description + + + ============================================================== + CHAPTER 1 LOCATION OF DIP-SWITCH + ============================================================== + +UÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ +þUÄÄÄÄÄÄ¿ UÄÄÄÄÄ¿ UÄÄÄ¿ þ +þAÄÄÄÄÄÄU W1 AÄÄÄÄÄU UÄÄÄÄ¿ þ þ þ +þUÄÄÄÄÄÄ¿ þ þ þ þ UÄÄÅ¿ +þAÄÄÄÄÄÄU UÄÄÄÄÄÄÄÄÄÄÄ¿ AÄÄÄÄU þ þ þ þþ +þUÄÄÄÄÄÄ¿ þ þ UÄÄÄ¿ AÄÄÄU AÄÄÅU +þAÄÄÄÄÄÄU þ TMS380C26 þ þ þ þ +þUÄÄÄÄÄÄ¿ þ þ AÄÄÄU AÄ¿ +þAÄÄÄÄÄÄU þ þ þ þ +þ AÄÄÄÄÄÄÄÄÄÄÄU þ þ +þ þ þ +þ AÄU +þ þ +þ þ +þ þ +þ þ +AÄÄÄÄÄÄÄÄÄÄÄÄAÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄAÄÄAÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄAÄÄÄÄÄÄÄÄÄU + AÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄU AÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄU + + ============================================================== + CHAPTER 2 DEFAULT SETTINGS + ============================================================== + + W1 1 2 3 4 5 6 7 8 + +------------------------------+ + | ON X | + | OFF X X X X X X X | + +------------------------------+ + + W1.1 = ON Adapter drives address lines SA17..19 + W1.2 - 1.5 = OFF BootROM disabled + W1.6 - 1.8 = OFF I/O address 0A20h + + ============================================================== + CHAPTER 3 DIP SWITCH W1 DESCRIPTION + ============================================================== + + UÄÄÄAÄÄÄAÄÄÄAÄÄÄAÄÄÄAÄÄÄAÄÄÄAÄÄÄ¿ ON + þ 1 þ 2 þ 3 þ 4 þ 5 þ 6 þ 7 þ 8 þ + AÄÄÄAÄÄÄAÄÄÄAÄÄÄAÄÄÄAÄÄÄAÄÄÄAÄÄÄU OFF + |AD | BootROM Addr. | I/O | + +-+-+-------+-------+-----+-----+ + | | | + | | +------ 6 7 8 + | | ON ON ON 1900h + | | ON ON OFF 0900h + | | ON OFF ON 1980h + | | ON OFF OFF 0980h + | | OFF ON ON 1b20h + | | OFF ON OFF 0b20h + | | OFF OFF ON 1a20h + | | OFF OFF OFF 0a20h (+) + | | + | | + | +-------- 2 3 4 5 + | OFF x x x disabled (+) + | ON ON ON ON C0000 + | ON ON ON OFF C4000 + | ON ON OFF ON C8000 + | ON ON OFF OFF CC000 + | ON OFF ON ON D0000 + | ON OFF ON OFF D4000 + | ON OFF OFF ON D8000 + | ON OFF OFF OFF DC000 + | + | + +----- 1 + OFF adapter does NOT drive SA<17..19> + ON adapter drives SA<17..19> (+) + + + (+) means default setting + + ******************************** diff --git a/Documentation/networking/tuntap.txt b/Documentation/networking/tuntap.txt new file mode 100644 index 000000000000..ec3d109d787a --- /dev/null +++ b/Documentation/networking/tuntap.txt @@ -0,0 +1,147 @@ +Universal TUN/TAP device driver. +Copyright (C) 1999-2000 Maxim Krasnyansky <max_mk@yahoo.com> + + Linux, Solaris drivers + Copyright (C) 1999-2000 Maxim Krasnyansky <max_mk@yahoo.com> + + FreeBSD TAP driver + Copyright (c) 1999-2000 Maksim Yevmenkin <m_evmenkin@yahoo.com> + + Revision of this document 2002 by Florian Thiel <florian.thiel@gmx.net> + +1. Description + TUN/TAP provides packet reception and transmission for user space programs. + It can be seen as a simple Point-to-Point or Ethernet device, which, + instead of receiving packets from physical media, receives them from + user space program and instead of sending packets via physical media + writes them to the user space program. + + In order to use the driver a program has to open /dev/net/tun and issue a + corresponding ioctl() to register a network device with the kernel. A network + device will appear as tunXX or tapXX, depending on the options chosen. When + the program closes the file descriptor, the network device and all + corresponding routes will disappear. + + Depending on the type of device chosen the userspace program has to read/write + IP packets (with tun) or ethernet frames (with tap). Which one is being used + depends on the flags given with the ioctl(). + + The package from http://vtun.sourceforge.net/tun contains two simple examples + for how to use tun and tap devices. Both programs work like a bridge between + two network interfaces. + br_select.c - bridge based on select system call. + br_sigio.c - bridge based on async io and SIGIO signal. + However, the best example is VTun http://vtun.sourceforge.net :)) + +2. Configuration + Create device node: + mkdir /dev/net (if it doesn't exist already) + mknod /dev/net/tun c 10 200 + + Set permissions: + e.g. chmod 0700 /dev/net/tun + if you want the device only accessible by root. Giving regular users the + right to assign network devices is NOT a good idea. Users could assign + bogus network interfaces to trick firewalls or administrators. + + Driver module autoloading + + Make sure that "Kernel module loader" - module auto-loading + support is enabled in your kernel. The kernel should load it on + first access. + + Manual loading + insert the module by hand: + modprobe tun + + If you do it the latter way, you have to load the module every time you + need it, if you do it the other way it will be automatically loaded when + /dev/net/tun is being opened. + +3. Program interface + 3.1 Network device allocation: + + char *dev should be the name of the device with a format string (e.g. + "tun%d"), but (as far as I can see) this can be any valid network device name. + Note that the character pointer becomes overwritten with the real device name + (e.g. "tun0") + + #include <linux/if.h> + #include <linux/if_tun.h> + + int tun_alloc(char *dev) + { + struct ifreq ifr; + int fd, err; + + if( (fd = open("/dev/net/tun", O_RDWR)) < 0 ) + return tun_alloc_old(dev); + + memset(&ifr, 0, sizeof(ifr)); + + /* Flags: IFF_TUN - TUN device (no Ethernet headers) + * IFF_TAP - TAP device + * + * IFF_NO_PI - Do not provide packet information + */ + ifr.ifr_flags = IFF_TUN; + if( *dev ) + strncpy(ifr.ifr_name, dev, IFNAMSIZ); + + if( (err = ioctl(fd, TUNSETIFF, (void *) &ifr)) < 0 ){ + close(fd); + return err; + } + strcpy(dev, ifr.ifr_name); + return fd; + } + + 3.2 Frame format: + If flag IFF_NO_PI is not set each frame format is: + Flags [2 bytes] + Proto [2 bytes] + Raw protocol(IP, IPv6, etc) frame. + +Universal TUN/TAP device driver Frequently Asked Question. + +1. What platforms are supported by TUN/TAP driver ? +Currently driver has been written for 3 Unices: + Linux kernels 2.2.x, 2.4.x + FreeBSD 3.x, 4.x, 5.x + Solaris 2.6, 7.0, 8.0 + +2. What is TUN/TAP driver used for? +As mentioned above, main purpose of TUN/TAP driver is tunneling. +It is used by VTun (http://vtun.sourceforge.net). + +Another interesting application using TUN/TAP is pipsecd +(http://perso.enst.fr/~beyssac/pipsec/), an userspace IPSec +implementation that can use complete kernel routing (unlike FreeS/WAN). + +3. How does Virtual network device actually work ? +Virtual network device can be viewed as a simple Point-to-Point or +Ethernet device, which instead of receiving packets from a physical +media, receives them from user space program and instead of sending +packets via physical media sends them to the user space program. + +Let's say that you configured IPX on the tap0, then whenever +the kernel sends an IPX packet to tap0, it is passed to the application +(VTun for example). The application encrypts, compresses and sends it to +the other side over TCP or UDP. The application on the other side decompresses +and decrypts the data received and writes the packet to the TAP device, +the kernel handles the packet like it came from real physical device. + +4. What is the difference between TUN driver and TAP driver? +TUN works with IP frames. TAP works with Ethernet frames. + +This means that you have to read/write IP packets when you are using tun and +ethernet frames when using tap. + +5. What is the difference between BPF and TUN/TAP driver? +BFP is an advanced packet filter. It can be attached to existing +network interface. It does not provide a virtual network interface. +A TUN/TAP driver does provide a virtual network interface and it is possible +to attach BPF to this interface. + +6. Does TAP driver support kernel Ethernet bridging? +Yes. Linux and FreeBSD drivers support Ethernet bridging. diff --git a/Documentation/networking/vortex.txt b/Documentation/networking/vortex.txt new file mode 100644 index 000000000000..fa12a9e4abdd --- /dev/null +++ b/Documentation/networking/vortex.txt @@ -0,0 +1,450 @@ +Documentation/networking/vortex.txt +Andrew Morton <andrewm@uow.edu.au> +30 April 2000 + + +This document describes the usage and errata of the 3Com "Vortex" device +driver for Linux, 3c59x.c. + +The driver was written by Donald Becker <becker@scyld.com> + +Don is no longer the prime maintainer of this version of the driver. +Please report problems to one or more of: + + Andrew Morton <andrewm@uow.edu.au> + Netdev mailing list <netdev@oss.sgi.com> + Linux kernel mailing list <linux-kernel@vger.kernel.org> + +Please note the 'Reporting and Diagnosing Problems' section at the end +of this file. + + +Since kernel 2.3.99-pre6, this driver incorporates the support for the +3c575-series Cardbus cards which used to be handled by 3c575_cb.c. + +This driver supports the following hardware: + + 3c590 Vortex 10Mbps + 3c592 EISA 10mbps Demon/Vortex + 3c597 EISA Fast Demon/Vortex + 3c595 Vortex 100baseTx + 3c595 Vortex 100baseT4 + 3c595 Vortex 100base-MII + 3Com Vortex + 3c900 Boomerang 10baseT + 3c900 Boomerang 10Mbps Combo + 3c900 Cyclone 10Mbps TPO + 3c900B Cyclone 10Mbps T + 3c900 Cyclone 10Mbps Combo + 3c900 Cyclone 10Mbps TPC + 3c900B-FL Cyclone 10base-FL + 3c905 Boomerang 100baseTx + 3c905 Boomerang 100baseT4 + 3c905B Cyclone 100baseTx + 3c905B Cyclone 10/100/BNC + 3c905B-FX Cyclone 100baseFx + 3c905C Tornado + 3c980 Cyclone + 3cSOHO100-TX Hurricane + 3c555 Laptop Hurricane + 3c575 Boomerang CardBus + 3CCFE575 Cyclone CardBus + 3CCFE575CT Cyclone CardBus + 3CCFE656 Cyclone CardBus + 3CCFEM656 Cyclone CardBus + 3c450 Cyclone/unknown + + +Module parameters +================= + +There are several parameters which may be provided to the driver when +its module is loaded. These are usually placed in /etc/modprobe.conf +(/etc/modules.conf in 2.4). Example: + +options 3c59x debug=3 rx_copybreak=300 + +If you are using the PCMCIA tools (cardmgr) then the options may be +placed in /etc/pcmcia/config.opts: + +module "3c59x" opts "debug=3 rx_copybreak=300" + + +The supported parameters are: + +debug=N + + Where N is a number from 0 to 7. Anything above 3 produces a lot + of output in your system logs. debug=1 is default. + +options=N1,N2,N3,... + + Each number in the list provides an option to the corresponding + network card. So if you have two 3c905's and you wish to provide + them with option 0x204 you would use: + + options=0x204,0x204 + + The individual options are composed of a number of bitfields which + have the following meanings: + + Possible media type settings + 0 10baseT + 1 10Mbs AUI + 2 undefined + 3 10base2 (BNC) + 4 100base-TX + 5 100base-FX + 6 MII (Media Independent Interface) + 7 Use default setting from EEPROM + 8 Autonegotiate + 9 External MII + 10 Use default setting from EEPROM + + When generating a value for the 'options' setting, the above media + selection values may be OR'ed (or added to) the following: + + 0x8000 Set driver debugging level to 7 + 0x4000 Set driver debugging level to 2 + 0x0400 Enable Wake-on-LAN + 0x0200 Force full duplex mode. + 0x0010 Bus-master enable bit (Old Vortex cards only) + + For example: + + insmod 3c59x options=0x204 + + will force full-duplex 100base-TX, rather than allowing the usual + autonegotiation. + +global_options=N + + Sets the `options' parameter for all 3c59x NICs in the machine. + Entries in the `options' array above will override any setting of + this. + +full_duplex=N1,N2,N3... + + Similar to bit 9 of 'options'. Forces the corresponding card into + full-duplex mode. Please use this in preference to the `options' + parameter. + + In fact, please don't use this at all! You're better off getting + autonegotiation working properly. + +global_full_duplex=N1 + + Sets full duplex mode for all 3c59x NICs in the machine. Entries + in the `full_duplex' array above will override any setting of this. + +flow_ctrl=N1,N2,N3... + + Use 802.3x MAC-layer flow control. The 3com cards only support the + PAUSE command, which means that they will stop sending packets for a + short period if they receive a PAUSE frame from the link partner. + + The driver only allows flow control on a link which is operating in + full duplex mode. + + This feature does not appear to work on the 3c905 - only 3c905B and + 3c905C have been tested. + + The 3com cards appear to only respond to PAUSE frames which are + sent to the reserved destination address of 01:80:c2:00:00:01. They + do not honour PAUSE frames which are sent to the station MAC address. + +rx_copybreak=M + + The driver preallocates 32 full-sized (1536 byte) network buffers + for receiving. When a packet arrives, the driver has to decide + whether to leave the packet in its full-sized buffer, or to allocate + a smaller buffer and copy the packet across into it. + + This is a speed/space tradeoff. + + The value of rx_copybreak is used to decide when to make the copy. + If the packet size is less than rx_copybreak, the packet is copied. + The default value for rx_copybreak is 200 bytes. + +max_interrupt_work=N + + The driver's interrupt service routine can handle many receive and + transmit packets in a single invocation. It does this in a loop. + The value of max_interrupt_work governs how mnay times the interrupt + service routine will loop. The default value is 32 loops. If this + is exceeded the interrupt service routine gives up and generates a + warning message "eth0: Too much work in interrupt". + +hw_checksums=N1,N2,N3,... + + Recent 3com NICs are able to generate IPv4, TCP and UDP checksums + in hardware. Linux has used the Rx checksumming for a long time. + The "zero copy" patch which is planned for the 2.4 kernel series + allows you to make use of the NIC's DMA scatter/gather and transmit + checksumming as well. + + The driver is set up so that, when the zerocopy patch is applied, + all Tornado and Cyclone devices will use S/G and Tx checksums. + + This module parameter has been provided so you can override this + decision. If you think that Tx checksums are causing a problem, you + may disable the feature with `hw_checksums=0'. + + If you think your NIC should be performing Tx checksumming and the + driver isn't enabling it, you can force the use of hardware Tx + checksumming with `hw_checksums=1'. + + The driver drops a message in the logfiles to indicate whether or + not it is using hardware scatter/gather and hardware Tx checksums. + + Scatter/gather and hardware checksums provide considerable + performance improvement for the sendfile() system call, but a small + decrease in throughput for send(). There is no effect upon receive + efficiency. + +compaq_ioaddr=N +compaq_irq=N +compaq_device_id=N + + "Variables to work-around the Compaq PCI BIOS32 problem".... + +watchdog=N + + Sets the time duration (in milliseconds) after which the kernel + decides that the transmitter has become stuck and needs to be reset. + This is mainly for debugging purposes, although it may be advantageous + to increase this value on LANs which have very high collision rates. + The default value is 5000 (5.0 seconds). + +enable_wol=N1,N2,N3,... + + Enable Wake-on-LAN support for the relevant interface. Donald + Becker's `ether-wake' application may be used to wake suspended + machines. + + Also enables the NIC's power management support. + +global_enable_wol=N + + Sets enable_wol mode for all 3c59x NICs in the machine. Entries in + the `enable_wol' array above will override any setting of this. + +Media selection +--------------- + +A number of the older NICs such as the 3c590 and 3c900 series have +10base2 and AUI interfaces. + +Prior to January, 2001 this driver would autoeselect the 10base2 or AUI +port if it didn't detect activity on the 10baseT port. It would then +get stuck on the 10base2 port and a driver reload was necessary to +switch back to 10baseT. This behaviour could not be prevented with a +module option override. + +Later (current) versions of the driver _do_ support locking of the +media type. So if you load the driver module with + + modprobe 3c59x options=0 + +it will permanently select the 10baseT port. Automatic selection of +other media types does not occur. + + +Transmit error, Tx status register 82 +------------------------------------- + +This is a common error which is almost always caused by another host on +the same network being in full-duplex mode, while this host is in +half-duplex mode. You need to find that other host and make it run in +half-duplex mode or fix this host to run in full-duplex mode. + +As a last resort, you can force the 3c59x driver into full-duplex mode +with + + options 3c59x full_duplex=1 + +but this has to be viewed as a workaround for broken network gear and +should only really be used for equipment which cannot autonegotiate. + + +Additional resources +-------------------- + +Details of the device driver implementation are at the top of the source file. + +Additional documentation is available at Don Becker's Linux Drivers site: + + http://www.scyld.com/network/vortex.html + +Donald Becker's driver development site: + + http://www.scyld.com/network + +Donald's vortex-diag program is useful for inspecting the NIC's state: + + http://www.scyld.com/diag/#pci-diags + +Donald's mii-diag program may be used for inspecting and manipulating +the NIC's Media Independent Interface subsystem: + + http://www.scyld.com/diag/#mii-diag + +Donald's wake-on-LAN page: + + http://www.scyld.com/expert/wake-on-lan.html + +3Com's documentation for many NICs, including the ones supported by +this driver is available at + + http://support.3com.com/partners/developer/developer_form.html + +3Com's DOS-based application for setting up the NICs EEPROMs: + + ftp://ftp.3com.com/pub/nic/3c90x/3c90xx2.exe + +Driver updates and a detailed changelog for the modifications which +were made for the 2.3/2,4 series kernel is available at + + http://www.uow.edu.au/~andrewm/linux/#3c59x-2.3 + + +Autonegotiation notes +--------------------- + + The driver uses a one-minute heartbeat for adapting to changes in + the external LAN environment. This means that when, for example, a + machine is unplugged from a hubbed 10baseT LAN plugged into a + switched 100baseT LAN, the throughput will be quite dreadful for up + to sixty seconds. Be patient. + + Cisco interoperability note from Walter Wong <wcw+@CMU.EDU>: + + On a side note, adding HAS_NWAY seems to share a problem with the + Cisco 6509 switch. Specifically, you need to change the spanning + tree parameter for the port the machine is plugged into to 'portfast' + mode. Otherwise, the negotiation fails. This has been an issue + we've noticed for a while but haven't had the time to track down. + + Cisco switches (Jeff Busch <jbusch@deja.com>) + + My "standard config" for ports to which PC's/servers connect directly: + + interface FastEthernet0/N + description machinename + load-interval 30 + spanning-tree portfast + + If autonegotiation is a problem, you may need to specify "speed + 100" and "duplex full" as well (or "speed 10" and "duplex half"). + + WARNING: DO NOT hook up hubs/switches/bridges to these + specially-configured ports! The switch will become very confused. + + +Reporting and diagnosing problems +--------------------------------- + +Maintainers find that accurate and complete problem reports are +invaluable in resolving driver problems. We are frequently not able to +reproduce problems and must rely on your patience and efforts to get to +the bottom of the problem. + +If you believe you have a driver problem here are some of the +steps you should take: + +- Is it really a driver problem? + + Eliminate some variables: try different cards, different + computers, different cables, different ports on the switch/hub, + different versions of the kernel or ofthe driver, etc. + +- OK, it's a driver problem. + + You need to generate a report. Typically this is an email to the + maintainer and/or linux-net@vger.kernel.org. The maintainer's + email address will be inthe driver source or in the MAINTAINERS file. + +- The contents of your report will vary a lot depending upon the + problem. If it's a kernel crash then you should refer to the + REPORTING-BUGS file. + + But for most problems it is useful to provide the following: + + o Kernel version, driver version + + o A copy of the banner message which the driver generates when + it is initialised. For example: + + eth0: 3Com PCI 3c905C Tornado at 0xa400, 00:50:da:6a:88:f0, IRQ 19 + 8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface. + MII transceiver found at address 24, status 782d. + Enabling bus-master transmits and whole-frame receives. + + NOTE: You must provide the `debug=2' modprobe option to generate + a full detection message. Please do this: + + modprobe 3c59x debug=2 + + o If it is a PCI device, the relevant output from 'lspci -vx', eg: + + 00:09.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74) + Subsystem: 3Com Corporation: Unknown device 9200 + Flags: bus master, medium devsel, latency 32, IRQ 19 + I/O ports at a400 [size=128] + Memory at db000000 (32-bit, non-prefetchable) [size=128] + Expansion ROM at <unassigned> [disabled] [size=128K] + Capabilities: [dc] Power Management version 2 + 00: b7 10 00 92 07 00 10 02 74 00 00 02 08 20 00 00 + 10: 01 a4 00 00 00 00 00 db 00 00 00 00 00 00 00 00 + 20: 00 00 00 00 00 00 00 00 00 00 00 00 b7 10 00 10 + 30: 00 00 00 00 dc 00 00 00 00 00 00 00 05 01 0a 0a + + o A description of the environment: 10baseT? 100baseT? + full/half duplex? switched or hubbed? + + o Any additional module parameters which you may be providing to the driver. + + o Any kernel logs which are produced. The more the merrier. + If this is a large file and you are sending your report to a + mailing list, mention that you have the logfile, but don't send + it. If you're reporting direct to the maintainer then just send + it. + + To ensure that all kernel logs are available, add the + following line to /etc/syslog.conf: + + kern.* /var/log/messages + + Then restart syslogd with: + + /etc/rc.d/init.d/syslog restart + + (The above may vary, depending upon which Linux distribution you use). + + o If your problem is reproducible then that's great. Try the + following: + + 1) Increase the debug level. Usually this is done via: + + a) modprobe driver debug=7 + b) In /etc/modprobe.conf (or /etc/modules.conf for 2.4): + options driver debug=7 + + 2) Recreate the problem with the higher debug level, + send all logs to the maintainer. + + 3) Download you card's diagnostic tool from Donald + Backer's website http://www.scyld.com/diag. Download + mii-diag.c as well. Build these. + + a) Run 'vortex-diag -aaee' and 'mii-diag -v' when the card is + working correctly. Save the output. + + b) Run the above commands when the card is malfunctioning. Send + both sets of output. + +Finally, please be patient and be prepared to do some work. You may end up working on +this problem for a week or more as the maintainer asks more questions, asks for more +tests, asks for patches to be applied, etc. At the end of it all, the problem may even +remain unresolved. + diff --git a/Documentation/networking/wan-router.txt b/Documentation/networking/wan-router.txt new file mode 100644 index 000000000000..aea20cd2a56e --- /dev/null +++ b/Documentation/networking/wan-router.txt @@ -0,0 +1,622 @@ +------------------------------------------------------------------------------ +Linux WAN Router Utilities Package +------------------------------------------------------------------------------ +Version 2.2.1 +Mar 28, 2001 +Author: Nenad Corbic <ncorbic@sangoma.com> +Copyright (c) 1995-2001 Sangoma Technologies Inc. +------------------------------------------------------------------------------ + +INTRODUCTION + +Wide Area Networks (WANs) are used to interconnect Local Area Networks (LANs) +and/or stand-alone hosts over vast distances with data transfer rates +significantly higher than those achievable with commonly used dial-up +connections. + +Usually an external device called `WAN router' sitting on your local network +or connected to your machine's serial port provides physical connection to +WAN. Although router's job may be as simple as taking your local network +traffic, converting it to WAN format and piping it through the WAN link, these +devices are notoriously expensive, with prices as much as 2 - 5 times higher +then the price of a typical PC box. + +Alternatively, considering robustness and multitasking capabilities of Linux, +an internal router can be built (most routers use some sort of stripped down +Unix-like operating system anyway). With a number of relatively inexpensive WAN +interface cards available on the market, a perfectly usable router can be +built for less than half a price of an external router. Yet a Linux box +acting as a router can still be used for other purposes, such as fire-walling, +running FTP, WWW or DNS server, etc. + +This kernel module introduces the notion of a WAN Link Driver (WLD) to Linux +operating system and provides generic hardware-independent services for such +drivers. Why can existing Linux network device interface not be used for +this purpose? Well, it can. However, there are a few key differences between +a typical network interface (e.g. Ethernet) and a WAN link. + +Many WAN protocols, such as X.25 and frame relay, allow for multiple logical +connections (known as `virtual circuits' in X.25 terminology) over a single +physical link. Each such virtual circuit may (and almost always does) lead +to a different geographical location and, therefore, different network. As a +result, it is the virtual circuit, not the physical link, that represents a +route and, therefore, a network interface in Linux terms. + +To further complicate things, virtual circuits are usually volatile in nature +(excluding so called `permanent' virtual circuits or PVCs). With almost no +time required to set up and tear down a virtual circuit, it is highly desirable +to implement on-demand connections in order to minimize network charges. So +unlike a typical network driver, the WAN driver must be able to handle multiple +network interfaces and cope as multiple virtual circuits come into existence +and go away dynamically. + +Last, but not least, WAN configuration is much more complex than that of say +Ethernet and may well amount to several dozens of parameters. Some of them +are "link-wide" while others are virtual circuit-specific. The same holds +true for WAN statistics which is by far more extensive and extremely useful +when troubleshooting WAN connections. Extending the ifconfig utility to suit +these needs may be possible, but does not seem quite reasonable. Therefore, a +WAN configuration utility and corresponding application programmer's interface +is needed for this purpose. + +Most of these problems are taken care of by this module. Its goal is to +provide a user with more-or-less standard look and feel for all WAN devices and +assist a WAN device driver writer by providing common services, such as: + + o User-level interface via /proc file system + o Centralized configuration + o Device management (setup, shutdown, etc.) + o Network interface management (dynamic creation/destruction) + o Protocol encapsulation/decapsulation + +To ba able to use the Linux WAN Router you will also need a WAN Tools package +available from + + ftp.sangoma.com/pub/linux/current_wanpipe/wanpipe-X.Y.Z.tgz + +where vX.Y.Z represent the wanpipe version number. + +For technical questions and/or comments please e-mail to ncorbic@sangoma.com. +For general inquiries please contact Sangoma Technologies Inc. by + + Hotline: 1-800-388-2475 (USA and Canada, toll free) + Phone: (905) 474-1990 ext: 106 + Fax: (905) 474-9223 + E-mail: dm@sangoma.com (David Mandelstam) + WWW: http://www.sangoma.com + + +INSTALLATION + +Please read the WanpipeForLinux.pdf manual on how to +install the WANPIPE tools and drivers properly. + + +After installing wanpipe package: /usr/local/wanrouter/doc. +On the ftp.sangoma.com : /linux/current_wanpipe/doc + + +COPYRIGHT AND LICENSING INFORMATION + +This program is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free Software +Foundation; either version 2, or (at your option) any later version. + +This program is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. + +You should have received a copy of the GNU General Public License along with +this program; if not, write to the Free Software Foundation, Inc., 675 Mass +Ave, Cambridge, MA 02139, USA. + + + +ACKNOWLEDGEMENTS + +This product is based on the WANPIPE(tm) Multiprotocol WAN Router developed +by Sangoma Technologies Inc. for Linux 2.0.x and 2.2.x. Success of the WANPIPE +together with the next major release of Linux kernel in summer 1996 commanded +adequate changes to the WANPIPE code to take full advantage of new Linux +features. + +Instead of continuing developing proprietary interface tied to Sangoma WAN +cards, we decided to separate all hardware-independent code into a separate +module and defined two levels of interfaces - one for user-level applications +and another for kernel-level WAN drivers. WANPIPE is now implemented as a +WAN driver compliant with the WAN Link Driver interface. Also a general +purpose WAN configuration utility and a set of shell scripts was developed to +support WAN router at the user level. + +Many useful ideas concerning hardware-independent interface implementation +were given by Mike McLagan <mike.mclagan@linux.org> and his implementation +of the Frame Relay router and drivers for Sangoma cards (dlci/sdla). + +With the new implementation of the APIs being incorporated into the WANPIPE, +a special thank goes to Alan Cox in providing insight into BSD sockets. + +Special thanks to all the WANPIPE users who performed field-testing, reported +bugs and made valuable comments and suggestions that help us to improve this +product. + + + +NEW IN THIS RELEASE + + o Updated the WANCFG utility + Calls the pppconfig to configure the PPPD + for async connections. + + o Added the PPPCONFIG utility + Used to configure the PPPD dameon for the + WANPIPE Async PPP and standard serial port. + The wancfg calls the pppconfig to configure + the pppd. + + o Fixed the PCI autodetect feature. + The SLOT 0 was used as an autodetect option + however, some high end PC's slot numbers start + from 0. + + o This release has been tested with the new backupd + daemon release. + + +PRODUCT COMPONENTS AND RELATED FILES + +/etc: (or user defined) + wanpipe1.conf default router configuration file + +/lib/modules/X.Y.Z/misc: + wanrouter.o router kernel loadable module + af_wanpipe.o wanpipe api socket module + +/lib/modules/X.Y.Z/net: + sdladrv.o Sangoma SDLA support module + wanpipe.o Sangoma WANPIPE(tm) driver module + +/proc/net/wanrouter + Config reads current router configuration + Status reads current router status + {name} reads WAN driver statistics + +/usr/sbin: + wanrouter wanrouter start-up script + wanconfig wanrouter configuration utility + sdladump WANPIPE adapter memory dump utility + fpipemon Monitor for Frame Relay + cpipemon Monitor for Cisco HDLC + ppipemon Monitor for PPP + xpipemon Monitor for X25 + wpkbdmon WANPIPE keyboard led monitor/debugger + +/usr/local/wanrouter: + README this file + COPYING GNU General Public License + Setup installation script + Filelist distribution definition file + wanrouter.rc meta-configuration file + (used by the Setup and wanrouter script) + +/usr/local/wanrouter/doc: + wanpipeForLinux.pdf WAN Router User's Manual + +/usr/local/wanrouter/patches: + wanrouter-v2213.gz patch for Linux kernels 2.2.11 up to 2.2.13. + wanrouter-v2214.gz patch for Linux kernel 2.2.14. + wanrouter-v2215.gz patch for Linux kernels 2.2.15 to 2.2.17. + wanrouter-v2218.gz patch for Linux kernels 2.2.18 and up. + wanrouter-v240.gz patch for Linux kernel 2.4.0. + wanrouter-v242.gz patch for Linux kernel 2.4.2 and up. + wanrouter-v2034.gz patch for Linux kernel 2.0.34 + wanrouter-v2036.gz patch for Linux kernel 2.0.36 and up. + +/usr/local/wanrouter/patches/kdrivers: + Sources of the latest WANPIPE device drivers. + These are used to UPGRADE the linux kernel to the newest + version if the kernel source has already been pathced with + WANPIPE drivers. + +/usr/local/wanrouter/samples: + interface sample interface configuration file + wanpipe1.cpri CHDLC primary port + wanpipe2.csec CHDLC secondary port + wanpipe1.fr Frame Relay protocol + wanpipe1.ppp PPP protocol ) + wanpipe1.asy CHDLC ASYNC protocol + wanpipe1.x25 X25 protocol + wanpipe1.stty Sync TTY driver (Used by Kernel PPPD daemon) + wanpipe1.atty Async TTY driver (Used by Kernel PPPD daemon) + wanrouter.rc sample meta-configuration file + +/usr/local/wanrouter/util: + * wan-tools utilities source code + +/usr/local/wanrouter/api/x25: + * x25 api sample programs. +/usr/local/wanrouter/api/chdlc: + * chdlc api sample programs. +/usr/local/wanrouter/api/fr: + * fr api sample programs. +/usr/local/wanrouter/config/wancfg: + wancfg WANPIPE GUI configuration program. + Creates wanpipe#.conf files. +/usr/local/wanrouter/config/cfgft1: + cfgft1 GUI CSU/DSU configuration program. + +/usr/include/linux: + wanrouter.h router API definitions + wanpipe.h WANPIPE API definitions + sdladrv.h SDLA support module API definitions + sdlasfm.h SDLA firmware module definitions + if_wanpipe.h WANPIPE Socket definitions + if_wanpipe_common.h WANPIPE Socket/Driver common definitions. + sdlapci.h WANPIPE PCI definitions + + +/usr/src/linux/net/wanrouter: + * wanrouter source code + +/var/log: + wanrouter wanrouter start-up log (created by the Setup script) + +/var/lock: (or /var/lock/subsys for RedHat) + wanrouter wanrouter lock file (created by the Setup script) + +/usr/local/wanrouter/firmware: + fr514.sfm Frame relay firmware for Sangoma S508/S514 card + cdual514.sfm Dual Port Cisco HDLC firmware for Sangoma S508/S514 card + ppp514.sfm PPP Firmware for Sangoma S508 and S514 cards + x25_508.sfm X25 Firmware for Sangoma S508 card. + + +REVISION HISTORY + +1.0.0 December 31, 1996 Initial version + +1.0.1 January 30, 1997 Status and statistics can be read via /proc + filesystem entries. + +1.0.2 April 30, 1997 Added UDP management via monitors. + +1.0.3 June 3, 1997 UDP management for multiple boards using Frame + Relay and PPP + Enabled continuous transmission of Configure + Request Packet for PPP (for 508 only) + Connection Timeout for PPP changed from 900 to 0 + Flow Control Problem fixed for Frame Relay + +1.0.4 July 10, 1997 S508/FT1 monitoring capability in fpipemon and + ppipemon utilities. + Configurable TTL for UDP packets. + Multicast and Broadcast IP source addresses are + silently discarded. + +1.0.5 July 28, 1997 Configurable T391,T392,N391,N392,N393 for Frame + Relay in router.conf. + Configurable Memory Address through router.conf + for Frame Relay, PPP and X.25. (commenting this + out enables auto-detection). + Fixed freeing up received buffers using kfree() + for Frame Relay and X.25. + Protect sdla_peek() by calling save_flags(), + cli() and restore_flags(). + Changed number of Trace elements from 32 to 20 + Added DLCI specific data monitoring in FPIPEMON. +2.0.0 Nov 07, 1997 Implemented protection of RACE conditions by + critical flags for FRAME RELAY and PPP. + DLCI List interrupt mode implemented. + IPX support in FRAME RELAY and PPP. + IPX Server Support (MARS) + More driver specific stats included in FPIPEMON + and PIPEMON. + +2.0.1 Nov 28, 1997 Bug Fixes for version 2.0.0. + Protection of "enable_irq()" while + "disable_irq()" has been enabled from any other + routine (for Frame Relay, PPP and X25). + Added additional Stats for Fpipemon and Ppipemon + Improved Load Sharing for multiple boards + +2.0.2 Dec 09, 1997 Support for PAP and CHAP for ppp has been + implemented. + +2.0.3 Aug 15, 1998 New release supporting Cisco HDLC, CIR for Frame + relay, Dynamic IP assignment for PPP and Inverse + Arp support for Frame-relay. Man Pages are + included for better support and a new utility + for configuring FT1 cards. + +2.0.4 Dec 09, 1998 Dual Port support for Cisco HDLC. + Support for HDLC (LAPB) API. + Supports BiSync Streaming code for S502E + and S503 cards. + Support for Streaming HDLC API. + Provides a BSD socket interface for + creating applications using BiSync + streaming. + +2.0.5 Aug 04, 1999 CHDLC initializatin bug fix. + PPP interrupt driven driver: + Fix to the PPP line hangup problem. + New PPP firmware + Added comments to the startup SYSTEM ERROR messages + Xpipemon debugging application for the X25 protocol + New USER_MANUAL.txt + Fixed the odd boundary 4byte writes to the board. + BiSync Streaming code has been taken out. + Available as a patch. + Streaming HDLC API has been taken out. + Available as a patch. + +2.0.6 Aug 17, 1999 Increased debugging in statup scripts + Fixed insallation bugs from 2.0.5 + Kernel patch works for both 2.2.10 and 2.2.11 kernels. + There is no functional difference between the two packages + +2.0.7 Aug 26, 1999 o Merged X25API code into WANPIPE. + o Fixed a memeory leak for X25API + o Updated the X25API code for 2.2.X kernels. + o Improved NEM handling. + +2.1.0 Oct 25, 1999 o New code for S514 PCI Card + o New CHDLC and Frame Relay drivers + o PPP and X25 are not supported in this release + +2.1.1 Nov 30, 1999 o PPP support for S514 PCI Cards + +2.1.3 Apr 06, 2000 o Socket based x25api + o Socket based chdlc api + o Socket based fr api + o Dual Port Receive only CHDLC support. + o Asynchronous CHDLC support (Secondary Port) + o cfgft1 GUI csu/dsu configurator + o wancfg GUI configuration file + configurator. + o Architectual directory changes. + +beta-2.1.4 Jul 2000 o Dynamic interface configuration: + Network interfaces reflect the state + of protocol layer. If the protocol becomes + disconnected, driver will bring down + the interface. Once the protocol reconnects + the interface will be brought up. + + Note: This option is turned off by default. + + o Dynamic wanrouter setup using 'wanconfig': + wanconfig utility can be used to + shutdown,restart,start or reconfigure + a virtual circuit dynamically. + + Frame Relay: Each DLCI can be: + created,stopped,restarted and reconfigured + dynamically using wanconfig. + + ex: wanconfig card wanpipe1 dev wp1_fr16 up + + o Wanrouter startup via command line arguments: + wanconfig also supports wanrouter startup via command line + arguments. Thus, there is no need to create a wanpipe#.conf + configuration file. + + o Socket based x25api update/bug fixes. + Added support for LCN numbers greater than 255. + Option to pass up modem messages. + Provided a PCI IRQ check, so a single S514 + card is guaranteed to have a non-sharing interrupt. + + o Fixes to the wancfg utility. + o New FT1 debugging support via *pipemon utilities. + o Frame Relay ARP support Enabled. + +beta3-2.1.4 Jul 2000 o X25 M_BIT Problem fix. + o Added the Multi-Port PPP + Updated utilites for the Multi-Port PPP. + +2.1.4 Aut 2000 + o In X25API: + Maximum packet an application can send + to the driver has been extended to 4096 bytes. + + Fixed the x25 startup bug. Enable + communications only after all interfaces + come up. HIGH SVC/PVC is used to calculate + the number of channels. + Enable protocol only after all interfaces + are enabled. + + o Added an extra state to the FT1 config, kernel module. + o Updated the pipemon debuggers. + + o Blocked the Multi-Port PPP from running on kernels + 2.2.16 or greater, due to syncppp kernel module + change. + +beta1-2.1.5 Nov 15 2000 + o Fixed the MulitPort PPP Support for kernels 2.2.16 and above. + 2.2.X kernels only + + o Secured the driver UDP debugging calls + - All illegal netowrk debugging calls are reported to + the log. + - Defined a set of allowed commands, all other denied. + + o Cpipemon + - Added set FT1 commands to the cpipemon. Thus CSU/DSU + configuraiton can be performed using cpipemon. + All systems that cannot run cfgft1 GUI utility should + use cpipemon to configure the on board CSU/DSU. + + + o Keyboard Led Monitor/Debugger + - A new utilty /usr/sbin/wpkbdmon uses keyboard leds + to convey operatinal statistic information of the + Sangoma WANPIPE cards. + NUM_LOCK = Line State (On=connected, Off=disconnected) + CAPS_LOCK = Tx data (On=transmitting, Off=no tx data) + SCROLL_LOCK = Rx data (On=receiving, Off=no rx data + + o Hardware probe on module load and dynamic device allocation + - During WANPIPE module load, all Sangoma cards are probed + and found information is printed in the /var/log/messages. + - If no cards are found, the module load fails. + - Appropriate number of devices are dynamically loaded + based on the number of Sangoma cards found. + + Note: The kernel configuraiton option + CONFIG_WANPIPE_CARDS has been taken out. + + o Fixed the Frame Relay and Chdlc network interfaces so they are + compatible with libpcap libraries. Meaning, tcpdump, snort, + ethereal, and all other packet sniffers and debuggers work on + all WANPIPE netowrk interfaces. + - Set the network interface encoding type to ARPHRD_PPP. + This tell the sniffers that data obtained from the + network interface is in pure IP format. + Fix for 2.2.X kernels only. + + o True interface encoding option for Frame Relay and CHDLC + - The above fix sets the network interface encoding + type to ARPHRD_PPP, however some customers use + the encoding interface type to determine the + protocol running. Therefore, the TURE ENCODING + option will set the interface type back to the + original value. + + NOTE: If this option is used with Frame Relay and CHDLC + libpcap library support will be broken. + i.e. tcpdump will not work. + Fix for 2.2.x Kernels only. + + o Ethernet Bridgind over Frame Relay + - The Frame Relay bridging has been developed by + Kristian Hoffmann and Mark Wells. + - The Linux kernel bridge is used to send ethernet + data over the frame relay links. + For 2.2.X Kernels only. + + o Added extensive 2.0.X support. Most new features of + 2.1.5 for protocols Frame Relay, PPP and CHDLC are + supported under 2.0.X kernels. + +beta1-2.2.0 Dec 30 2000 + o Updated drivers for 2.4.X kernels. + o Updated drivers for SMP support. + o X25API is now able to share PCI interrupts. + o Took out a general polling routine that was used + only by X25API. + o Added appropriate locks to the dynamic reconfiguration + code. + o Fixed a bug in the keyboard debug monitor. + +beta2-2.2.0 Jan 8 2001 + o Patches for 2.4.0 kernel + o Patches for 2.2.18 kernel + o Minor updates to PPP and CHLDC drivers. + Note: No functinal difference. + +beta3-2.2.9 Jan 10 2001 + o I missed the 2.2.18 kernel patches in beta2-2.2.0 + release. They are included in this release. + +Stable Release +2.2.0 Feb 01 2001 + o Bug fix in wancfg GUI configurator. + The edit function didn't work properly. + + +bata1-2.2.1 Feb 09 2001 + o WANPIPE TTY Driver emulation. + Two modes of operation Sync and Async. + Sync: Using the PPPD daemon, kernel SyncPPP layer + and the Wanpipe sync TTY driver: a PPP protocol + connection can be established via Sangoma adapter, over + a T1 leased line. + + The 2.4.0 kernel PPP layer supports MULTILINK + protocol, that can be used to bundle any number of Sangoma + adapters (T1 lines) into one, under a single IP address. + Thus, efficiently obtaining multiple T1 throughput. + + NOTE: The remote side must also implement MULTILINK PPP + protocol. + + Async:Using the PPPD daemon, kernel AsyncPPP layer + and the WANPIPE async TTY driver: a PPP protocol + connection can be established via Sangoma adapter and + a modem, over a telephone line. + + Thus, the WANPIPE async TTY driver simulates a serial + TTY driver that would normally be used to interface the + MODEM to the linux kernel. + + o WANPIPE PPP Backup Utility + This utility will monitor the state of the PPP T1 line. + In case of failure, a dial up connection will be established + via pppd daemon, ether via a serial tty driver (serial port), + or a WANPIPE async TTY driver (in case serial port is unavailable). + + Furthermore, while in dial up mode, the primary PPP T1 link + will be monitored for signs of life. + + If the PPP T1 link comes back to life, the dial up connection + will be shutdown and T1 line re-established. + + + o New Setup installation script. + Option to UPGRADE device drivers if the kernel source has + already been patched with WANPIPE. + + Option to COMPILE WANPIPE modules against the currently + running kernel, thus no need for manual kernel and module + re-compilatin. + + o Updates and Bug Fixes to wancfg utility. + +bata2-2.2.1 Feb 20 2001 + + o Bug fixes to the CHDLC device drivers. + The driver had compilation problems under kernels + 2.2.14 or lower. + + o Bug fixes to the Setup installation script. + The device drivers compilation options didn't work + properly. + + o Update to the wpbackupd daemon. + Optimized the cross-over times, between the primary + link and the backup dialup. + +beta3-2.2.1 Mar 02 2001 + o Patches for 2.4.2 kernel. + + o Bug fixes to util/ make files. + o Bug fixes to the Setup installation script. + + o Took out the backupd support and made it into + as separate package. + +beta4-2.2.1 Mar 12 2001 + + o Fix to the Frame Relay Device driver. + IPSAC sends a packet of zero length + header to the frame relay driver. The + driver tries to push its own 2 byte header + into the packet, which causes the driver to + crash. + + o Fix the WANPIPE re-configuration code. + Bug was found by trying to run the cfgft1 while the + interface was already running. + + o Updates to cfgft1. + Writes a wanpipe#.cfgft1 configuration file + once the CSU/DSU is configured. This file can + holds the current CSU/DSU configuration. + + + +>>>>>> END OF README <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< + + diff --git a/Documentation/networking/wanpipe.txt b/Documentation/networking/wanpipe.txt new file mode 100644 index 000000000000..aea20cd2a56e --- /dev/null +++ b/Documentation/networking/wanpipe.txt @@ -0,0 +1,622 @@ +------------------------------------------------------------------------------ +Linux WAN Router Utilities Package +------------------------------------------------------------------------------ +Version 2.2.1 +Mar 28, 2001 +Author: Nenad Corbic <ncorbic@sangoma.com> +Copyright (c) 1995-2001 Sangoma Technologies Inc. +------------------------------------------------------------------------------ + +INTRODUCTION + +Wide Area Networks (WANs) are used to interconnect Local Area Networks (LANs) +and/or stand-alone hosts over vast distances with data transfer rates +significantly higher than those achievable with commonly used dial-up +connections. + +Usually an external device called `WAN router' sitting on your local network +or connected to your machine's serial port provides physical connection to +WAN. Although router's job may be as simple as taking your local network +traffic, converting it to WAN format and piping it through the WAN link, these +devices are notoriously expensive, with prices as much as 2 - 5 times higher +then the price of a typical PC box. + +Alternatively, considering robustness and multitasking capabilities of Linux, +an internal router can be built (most routers use some sort of stripped down +Unix-like operating system anyway). With a number of relatively inexpensive WAN +interface cards available on the market, a perfectly usable router can be +built for less than half a price of an external router. Yet a Linux box +acting as a router can still be used for other purposes, such as fire-walling, +running FTP, WWW or DNS server, etc. + +This kernel module introduces the notion of a WAN Link Driver (WLD) to Linux +operating system and provides generic hardware-independent services for such +drivers. Why can existing Linux network device interface not be used for +this purpose? Well, it can. However, there are a few key differences between +a typical network interface (e.g. Ethernet) and a WAN link. + +Many WAN protocols, such as X.25 and frame relay, allow for multiple logical +connections (known as `virtual circuits' in X.25 terminology) over a single +physical link. Each such virtual circuit may (and almost always does) lead +to a different geographical location and, therefore, different network. As a +result, it is the virtual circuit, not the physical link, that represents a +route and, therefore, a network interface in Linux terms. + +To further complicate things, virtual circuits are usually volatile in nature +(excluding so called `permanent' virtual circuits or PVCs). With almost no +time required to set up and tear down a virtual circuit, it is highly desirable +to implement on-demand connections in order to minimize network charges. So +unlike a typical network driver, the WAN driver must be able to handle multiple +network interfaces and cope as multiple virtual circuits come into existence +and go away dynamically. + +Last, but not least, WAN configuration is much more complex than that of say +Ethernet and may well amount to several dozens of parameters. Some of them +are "link-wide" while others are virtual circuit-specific. The same holds +true for WAN statistics which is by far more extensive and extremely useful +when troubleshooting WAN connections. Extending the ifconfig utility to suit +these needs may be possible, but does not seem quite reasonable. Therefore, a +WAN configuration utility and corresponding application programmer's interface +is needed for this purpose. + +Most of these problems are taken care of by this module. Its goal is to +provide a user with more-or-less standard look and feel for all WAN devices and +assist a WAN device driver writer by providing common services, such as: + + o User-level interface via /proc file system + o Centralized configuration + o Device management (setup, shutdown, etc.) + o Network interface management (dynamic creation/destruction) + o Protocol encapsulation/decapsulation + +To ba able to use the Linux WAN Router you will also need a WAN Tools package +available from + + ftp.sangoma.com/pub/linux/current_wanpipe/wanpipe-X.Y.Z.tgz + +where vX.Y.Z represent the wanpipe version number. + +For technical questions and/or comments please e-mail to ncorbic@sangoma.com. +For general inquiries please contact Sangoma Technologies Inc. by + + Hotline: 1-800-388-2475 (USA and Canada, toll free) + Phone: (905) 474-1990 ext: 106 + Fax: (905) 474-9223 + E-mail: dm@sangoma.com (David Mandelstam) + WWW: http://www.sangoma.com + + +INSTALLATION + +Please read the WanpipeForLinux.pdf manual on how to +install the WANPIPE tools and drivers properly. + + +After installing wanpipe package: /usr/local/wanrouter/doc. +On the ftp.sangoma.com : /linux/current_wanpipe/doc + + +COPYRIGHT AND LICENSING INFORMATION + +This program is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free Software +Foundation; either version 2, or (at your option) any later version. + +This program is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. + +You should have received a copy of the GNU General Public License along with +this program; if not, write to the Free Software Foundation, Inc., 675 Mass +Ave, Cambridge, MA 02139, USA. + + + +ACKNOWLEDGEMENTS + +This product is based on the WANPIPE(tm) Multiprotocol WAN Router developed +by Sangoma Technologies Inc. for Linux 2.0.x and 2.2.x. Success of the WANPIPE +together with the next major release of Linux kernel in summer 1996 commanded +adequate changes to the WANPIPE code to take full advantage of new Linux +features. + +Instead of continuing developing proprietary interface tied to Sangoma WAN +cards, we decided to separate all hardware-independent code into a separate +module and defined two levels of interfaces - one for user-level applications +and another for kernel-level WAN drivers. WANPIPE is now implemented as a +WAN driver compliant with the WAN Link Driver interface. Also a general +purpose WAN configuration utility and a set of shell scripts was developed to +support WAN router at the user level. + +Many useful ideas concerning hardware-independent interface implementation +were given by Mike McLagan <mike.mclagan@linux.org> and his implementation +of the Frame Relay router and drivers for Sangoma cards (dlci/sdla). + +With the new implementation of the APIs being incorporated into the WANPIPE, +a special thank goes to Alan Cox in providing insight into BSD sockets. + +Special thanks to all the WANPIPE users who performed field-testing, reported +bugs and made valuable comments and suggestions that help us to improve this +product. + + + +NEW IN THIS RELEASE + + o Updated the WANCFG utility + Calls the pppconfig to configure the PPPD + for async connections. + + o Added the PPPCONFIG utility + Used to configure the PPPD dameon for the + WANPIPE Async PPP and standard serial port. + The wancfg calls the pppconfig to configure + the pppd. + + o Fixed the PCI autodetect feature. + The SLOT 0 was used as an autodetect option + however, some high end PC's slot numbers start + from 0. + + o This release has been tested with the new backupd + daemon release. + + +PRODUCT COMPONENTS AND RELATED FILES + +/etc: (or user defined) + wanpipe1.conf default router configuration file + +/lib/modules/X.Y.Z/misc: + wanrouter.o router kernel loadable module + af_wanpipe.o wanpipe api socket module + +/lib/modules/X.Y.Z/net: + sdladrv.o Sangoma SDLA support module + wanpipe.o Sangoma WANPIPE(tm) driver module + +/proc/net/wanrouter + Config reads current router configuration + Status reads current router status + {name} reads WAN driver statistics + +/usr/sbin: + wanrouter wanrouter start-up script + wanconfig wanrouter configuration utility + sdladump WANPIPE adapter memory dump utility + fpipemon Monitor for Frame Relay + cpipemon Monitor for Cisco HDLC + ppipemon Monitor for PPP + xpipemon Monitor for X25 + wpkbdmon WANPIPE keyboard led monitor/debugger + +/usr/local/wanrouter: + README this file + COPYING GNU General Public License + Setup installation script + Filelist distribution definition file + wanrouter.rc meta-configuration file + (used by the Setup and wanrouter script) + +/usr/local/wanrouter/doc: + wanpipeForLinux.pdf WAN Router User's Manual + +/usr/local/wanrouter/patches: + wanrouter-v2213.gz patch for Linux kernels 2.2.11 up to 2.2.13. + wanrouter-v2214.gz patch for Linux kernel 2.2.14. + wanrouter-v2215.gz patch for Linux kernels 2.2.15 to 2.2.17. + wanrouter-v2218.gz patch for Linux kernels 2.2.18 and up. + wanrouter-v240.gz patch for Linux kernel 2.4.0. + wanrouter-v242.gz patch for Linux kernel 2.4.2 and up. + wanrouter-v2034.gz patch for Linux kernel 2.0.34 + wanrouter-v2036.gz patch for Linux kernel 2.0.36 and up. + +/usr/local/wanrouter/patches/kdrivers: + Sources of the latest WANPIPE device drivers. + These are used to UPGRADE the linux kernel to the newest + version if the kernel source has already been pathced with + WANPIPE drivers. + +/usr/local/wanrouter/samples: + interface sample interface configuration file + wanpipe1.cpri CHDLC primary port + wanpipe2.csec CHDLC secondary port + wanpipe1.fr Frame Relay protocol + wanpipe1.ppp PPP protocol ) + wanpipe1.asy CHDLC ASYNC protocol + wanpipe1.x25 X25 protocol + wanpipe1.stty Sync TTY driver (Used by Kernel PPPD daemon) + wanpipe1.atty Async TTY driver (Used by Kernel PPPD daemon) + wanrouter.rc sample meta-configuration file + +/usr/local/wanrouter/util: + * wan-tools utilities source code + +/usr/local/wanrouter/api/x25: + * x25 api sample programs. +/usr/local/wanrouter/api/chdlc: + * chdlc api sample programs. +/usr/local/wanrouter/api/fr: + * fr api sample programs. +/usr/local/wanrouter/config/wancfg: + wancfg WANPIPE GUI configuration program. + Creates wanpipe#.conf files. +/usr/local/wanrouter/config/cfgft1: + cfgft1 GUI CSU/DSU configuration program. + +/usr/include/linux: + wanrouter.h router API definitions + wanpipe.h WANPIPE API definitions + sdladrv.h SDLA support module API definitions + sdlasfm.h SDLA firmware module definitions + if_wanpipe.h WANPIPE Socket definitions + if_wanpipe_common.h WANPIPE Socket/Driver common definitions. + sdlapci.h WANPIPE PCI definitions + + +/usr/src/linux/net/wanrouter: + * wanrouter source code + +/var/log: + wanrouter wanrouter start-up log (created by the Setup script) + +/var/lock: (or /var/lock/subsys for RedHat) + wanrouter wanrouter lock file (created by the Setup script) + +/usr/local/wanrouter/firmware: + fr514.sfm Frame relay firmware for Sangoma S508/S514 card + cdual514.sfm Dual Port Cisco HDLC firmware for Sangoma S508/S514 card + ppp514.sfm PPP Firmware for Sangoma S508 and S514 cards + x25_508.sfm X25 Firmware for Sangoma S508 card. + + +REVISION HISTORY + +1.0.0 December 31, 1996 Initial version + +1.0.1 January 30, 1997 Status and statistics can be read via /proc + filesystem entries. + +1.0.2 April 30, 1997 Added UDP management via monitors. + +1.0.3 June 3, 1997 UDP management for multiple boards using Frame + Relay and PPP + Enabled continuous transmission of Configure + Request Packet for PPP (for 508 only) + Connection Timeout for PPP changed from 900 to 0 + Flow Control Problem fixed for Frame Relay + +1.0.4 July 10, 1997 S508/FT1 monitoring capability in fpipemon and + ppipemon utilities. + Configurable TTL for UDP packets. + Multicast and Broadcast IP source addresses are + silently discarded. + +1.0.5 July 28, 1997 Configurable T391,T392,N391,N392,N393 for Frame + Relay in router.conf. + Configurable Memory Address through router.conf + for Frame Relay, PPP and X.25. (commenting this + out enables auto-detection). + Fixed freeing up received buffers using kfree() + for Frame Relay and X.25. + Protect sdla_peek() by calling save_flags(), + cli() and restore_flags(). + Changed number of Trace elements from 32 to 20 + Added DLCI specific data monitoring in FPIPEMON. +2.0.0 Nov 07, 1997 Implemented protection of RACE conditions by + critical flags for FRAME RELAY and PPP. + DLCI List interrupt mode implemented. + IPX support in FRAME RELAY and PPP. + IPX Server Support (MARS) + More driver specific stats included in FPIPEMON + and PIPEMON. + +2.0.1 Nov 28, 1997 Bug Fixes for version 2.0.0. + Protection of "enable_irq()" while + "disable_irq()" has been enabled from any other + routine (for Frame Relay, PPP and X25). + Added additional Stats for Fpipemon and Ppipemon + Improved Load Sharing for multiple boards + +2.0.2 Dec 09, 1997 Support for PAP and CHAP for ppp has been + implemented. + +2.0.3 Aug 15, 1998 New release supporting Cisco HDLC, CIR for Frame + relay, Dynamic IP assignment for PPP and Inverse + Arp support for Frame-relay. Man Pages are + included for better support and a new utility + for configuring FT1 cards. + +2.0.4 Dec 09, 1998 Dual Port support for Cisco HDLC. + Support for HDLC (LAPB) API. + Supports BiSync Streaming code for S502E + and S503 cards. + Support for Streaming HDLC API. + Provides a BSD socket interface for + creating applications using BiSync + streaming. + +2.0.5 Aug 04, 1999 CHDLC initializatin bug fix. + PPP interrupt driven driver: + Fix to the PPP line hangup problem. + New PPP firmware + Added comments to the startup SYSTEM ERROR messages + Xpipemon debugging application for the X25 protocol + New USER_MANUAL.txt + Fixed the odd boundary 4byte writes to the board. + BiSync Streaming code has been taken out. + Available as a patch. + Streaming HDLC API has been taken out. + Available as a patch. + +2.0.6 Aug 17, 1999 Increased debugging in statup scripts + Fixed insallation bugs from 2.0.5 + Kernel patch works for both 2.2.10 and 2.2.11 kernels. + There is no functional difference between the two packages + +2.0.7 Aug 26, 1999 o Merged X25API code into WANPIPE. + o Fixed a memeory leak for X25API + o Updated the X25API code for 2.2.X kernels. + o Improved NEM handling. + +2.1.0 Oct 25, 1999 o New code for S514 PCI Card + o New CHDLC and Frame Relay drivers + o PPP and X25 are not supported in this release + +2.1.1 Nov 30, 1999 o PPP support for S514 PCI Cards + +2.1.3 Apr 06, 2000 o Socket based x25api + o Socket based chdlc api + o Socket based fr api + o Dual Port Receive only CHDLC support. + o Asynchronous CHDLC support (Secondary Port) + o cfgft1 GUI csu/dsu configurator + o wancfg GUI configuration file + configurator. + o Architectual directory changes. + +beta-2.1.4 Jul 2000 o Dynamic interface configuration: + Network interfaces reflect the state + of protocol layer. If the protocol becomes + disconnected, driver will bring down + the interface. Once the protocol reconnects + the interface will be brought up. + + Note: This option is turned off by default. + + o Dynamic wanrouter setup using 'wanconfig': + wanconfig utility can be used to + shutdown,restart,start or reconfigure + a virtual circuit dynamically. + + Frame Relay: Each DLCI can be: + created,stopped,restarted and reconfigured + dynamically using wanconfig. + + ex: wanconfig card wanpipe1 dev wp1_fr16 up + + o Wanrouter startup via command line arguments: + wanconfig also supports wanrouter startup via command line + arguments. Thus, there is no need to create a wanpipe#.conf + configuration file. + + o Socket based x25api update/bug fixes. + Added support for LCN numbers greater than 255. + Option to pass up modem messages. + Provided a PCI IRQ check, so a single S514 + card is guaranteed to have a non-sharing interrupt. + + o Fixes to the wancfg utility. + o New FT1 debugging support via *pipemon utilities. + o Frame Relay ARP support Enabled. + +beta3-2.1.4 Jul 2000 o X25 M_BIT Problem fix. + o Added the Multi-Port PPP + Updated utilites for the Multi-Port PPP. + +2.1.4 Aut 2000 + o In X25API: + Maximum packet an application can send + to the driver has been extended to 4096 bytes. + + Fixed the x25 startup bug. Enable + communications only after all interfaces + come up. HIGH SVC/PVC is used to calculate + the number of channels. + Enable protocol only after all interfaces + are enabled. + + o Added an extra state to the FT1 config, kernel module. + o Updated the pipemon debuggers. + + o Blocked the Multi-Port PPP from running on kernels + 2.2.16 or greater, due to syncppp kernel module + change. + +beta1-2.1.5 Nov 15 2000 + o Fixed the MulitPort PPP Support for kernels 2.2.16 and above. + 2.2.X kernels only + + o Secured the driver UDP debugging calls + - All illegal netowrk debugging calls are reported to + the log. + - Defined a set of allowed commands, all other denied. + + o Cpipemon + - Added set FT1 commands to the cpipemon. Thus CSU/DSU + configuraiton can be performed using cpipemon. + All systems that cannot run cfgft1 GUI utility should + use cpipemon to configure the on board CSU/DSU. + + + o Keyboard Led Monitor/Debugger + - A new utilty /usr/sbin/wpkbdmon uses keyboard leds + to convey operatinal statistic information of the + Sangoma WANPIPE cards. + NUM_LOCK = Line State (On=connected, Off=disconnected) + CAPS_LOCK = Tx data (On=transmitting, Off=no tx data) + SCROLL_LOCK = Rx data (On=receiving, Off=no rx data + + o Hardware probe on module load and dynamic device allocation + - During WANPIPE module load, all Sangoma cards are probed + and found information is printed in the /var/log/messages. + - If no cards are found, the module load fails. + - Appropriate number of devices are dynamically loaded + based on the number of Sangoma cards found. + + Note: The kernel configuraiton option + CONFIG_WANPIPE_CARDS has been taken out. + + o Fixed the Frame Relay and Chdlc network interfaces so they are + compatible with libpcap libraries. Meaning, tcpdump, snort, + ethereal, and all other packet sniffers and debuggers work on + all WANPIPE netowrk interfaces. + - Set the network interface encoding type to ARPHRD_PPP. + This tell the sniffers that data obtained from the + network interface is in pure IP format. + Fix for 2.2.X kernels only. + + o True interface encoding option for Frame Relay and CHDLC + - The above fix sets the network interface encoding + type to ARPHRD_PPP, however some customers use + the encoding interface type to determine the + protocol running. Therefore, the TURE ENCODING + option will set the interface type back to the + original value. + + NOTE: If this option is used with Frame Relay and CHDLC + libpcap library support will be broken. + i.e. tcpdump will not work. + Fix for 2.2.x Kernels only. + + o Ethernet Bridgind over Frame Relay + - The Frame Relay bridging has been developed by + Kristian Hoffmann and Mark Wells. + - The Linux kernel bridge is used to send ethernet + data over the frame relay links. + For 2.2.X Kernels only. + + o Added extensive 2.0.X support. Most new features of + 2.1.5 for protocols Frame Relay, PPP and CHDLC are + supported under 2.0.X kernels. + +beta1-2.2.0 Dec 30 2000 + o Updated drivers for 2.4.X kernels. + o Updated drivers for SMP support. + o X25API is now able to share PCI interrupts. + o Took out a general polling routine that was used + only by X25API. + o Added appropriate locks to the dynamic reconfiguration + code. + o Fixed a bug in the keyboard debug monitor. + +beta2-2.2.0 Jan 8 2001 + o Patches for 2.4.0 kernel + o Patches for 2.2.18 kernel + o Minor updates to PPP and CHLDC drivers. + Note: No functinal difference. + +beta3-2.2.9 Jan 10 2001 + o I missed the 2.2.18 kernel patches in beta2-2.2.0 + release. They are included in this release. + +Stable Release +2.2.0 Feb 01 2001 + o Bug fix in wancfg GUI configurator. + The edit function didn't work properly. + + +bata1-2.2.1 Feb 09 2001 + o WANPIPE TTY Driver emulation. + Two modes of operation Sync and Async. + Sync: Using the PPPD daemon, kernel SyncPPP layer + and the Wanpipe sync TTY driver: a PPP protocol + connection can be established via Sangoma adapter, over + a T1 leased line. + + The 2.4.0 kernel PPP layer supports MULTILINK + protocol, that can be used to bundle any number of Sangoma + adapters (T1 lines) into one, under a single IP address. + Thus, efficiently obtaining multiple T1 throughput. + + NOTE: The remote side must also implement MULTILINK PPP + protocol. + + Async:Using the PPPD daemon, kernel AsyncPPP layer + and the WANPIPE async TTY driver: a PPP protocol + connection can be established via Sangoma adapter and + a modem, over a telephone line. + + Thus, the WANPIPE async TTY driver simulates a serial + TTY driver that would normally be used to interface the + MODEM to the linux kernel. + + o WANPIPE PPP Backup Utility + This utility will monitor the state of the PPP T1 line. + In case of failure, a dial up connection will be established + via pppd daemon, ether via a serial tty driver (serial port), + or a WANPIPE async TTY driver (in case serial port is unavailable). + + Furthermore, while in dial up mode, the primary PPP T1 link + will be monitored for signs of life. + + If the PPP T1 link comes back to life, the dial up connection + will be shutdown and T1 line re-established. + + + o New Setup installation script. + Option to UPGRADE device drivers if the kernel source has + already been patched with WANPIPE. + + Option to COMPILE WANPIPE modules against the currently + running kernel, thus no need for manual kernel and module + re-compilatin. + + o Updates and Bug Fixes to wancfg utility. + +bata2-2.2.1 Feb 20 2001 + + o Bug fixes to the CHDLC device drivers. + The driver had compilation problems under kernels + 2.2.14 or lower. + + o Bug fixes to the Setup installation script. + The device drivers compilation options didn't work + properly. + + o Update to the wpbackupd daemon. + Optimized the cross-over times, between the primary + link and the backup dialup. + +beta3-2.2.1 Mar 02 2001 + o Patches for 2.4.2 kernel. + + o Bug fixes to util/ make files. + o Bug fixes to the Setup installation script. + + o Took out the backupd support and made it into + as separate package. + +beta4-2.2.1 Mar 12 2001 + + o Fix to the Frame Relay Device driver. + IPSAC sends a packet of zero length + header to the frame relay driver. The + driver tries to push its own 2 byte header + into the packet, which causes the driver to + crash. + + o Fix the WANPIPE re-configuration code. + Bug was found by trying to run the cfgft1 while the + interface was already running. + + o Updates to cfgft1. + Writes a wanpipe#.cfgft1 configuration file + once the CSU/DSU is configured. This file can + holds the current CSU/DSU configuration. + + + +>>>>>> END OF README <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< + + diff --git a/Documentation/networking/wavelan.txt b/Documentation/networking/wavelan.txt new file mode 100644 index 000000000000..c1acf5eb3712 --- /dev/null +++ b/Documentation/networking/wavelan.txt @@ -0,0 +1,73 @@ + The Wavelan drivers saga + ------------------------ + + By Jean Tourrilhes <jt@hpl.hp.com> + + The Wavelan is a Radio network adapter designed by +Lucent. Under this generic name is hidden quite a variety of hardware, +and many Linux driver to support it. + The get the full story on Wireless LANs, please consult : + http://www.hpl.hp.com/personal/Jean_Tourrilhes/Linux/ + +"wavelan" driver (old ISA Wavelan) +---------------- + o Config : Network device -> Wireless LAN -> AT&T WaveLAN + o Location : .../drivers/net/wavelan* + o in-line doc : .../drivers/net/wavelan.p.h + o on-line doc : + http://www.hpl.hp.com/personal/Jean_Tourrilhes/Linux/Wavelan.html + + This is the driver for the ISA version of the first generation +of the Wavelan, now discontinued. The device is 2 Mb/s, composed of a +Intel 82586 controller and a Lucent Modem, and is NOT 802.11 compliant. + The driver has been tested with the following hardware : + o Wavelan ISA 915 MHz (full length ISA card) + o Wavelan ISA 915 MHz 2.0 (half length ISA card) + o Wavelan ISA 2.4 GHz (full length ISA card, fixed frequency) + o Wavelan ISA 2.4 GHz 2.0 (half length ISA card, frequency selectable) + o Above cards with the optional DES encryption feature + +"wavelan_cs" driver (old Pcmcia Wavelan) +------------------- + o Config : Network device -> PCMCIA network -> + Pcmcia Wireless LAN -> AT&T/Lucent WaveLAN + o Location : .../drivers/net/pcmcia/wavelan* + o in-line doc : .../drivers/net/pcmcia/wavelan_cs.h + o on-line doc : + http://www.hpl.hp.com/personal/Jean_Tourrilhes/Linux/Wavelan.html + + This is the driver for the PCMCIA version of the first +generation of the Wavelan, now discontinued. The device is 2 Mb/s, +composed of a Intel 82593 controller (totally different from the 82586) +and a Lucent Modem, and NOT 802.11 compatible. + The driver has been tested with the following hardware : + o Wavelan Pcmcia 915 MHz 2.0 (Pcmcia card + separate + modem/antenna block) + o Wavelan Pcmcia 2.4 GHz 2.0 (Pcmcia card + separate + modem/antenna block) + +"wvlan_cs" driver (Wavelan IEEE, GPL) +----------------- + o Config : Not yet in kernel + o Location : Pcmcia package 3.1.10+ + o on-line doc : http://www.fasta.fh-dortmund.de/users/andy/wvlan/ + + This is the driver for the current generation of Wavelan IEEE, +which is 802.11 compatible. Depending on version, it is 2 Mb/s or 11 +Mb/s, with or without encryption, all implemented in Lucent specific +DSP (the Hermes). + This is a GPL full source PCMCIA driver (ISA is just a Pcmcia +card with ISA-Pcmcia bridge). + +"wavelan2_cs" driver (Wavelan IEEE, binary) +-------------------- + o Config : Not yet in kernel + o Location : ftp://sourceforge.org/pcmcia/contrib/ + + This driver support exactly the same hardware as the previous +driver, the main difference is that it is based on a binary library +and supported by Lucent. + + I hope it clears the confusion ;-) + + Jean diff --git a/Documentation/networking/x25-iface.txt b/Documentation/networking/x25-iface.txt new file mode 100644 index 000000000000..975cc87ebdd1 --- /dev/null +++ b/Documentation/networking/x25-iface.txt @@ -0,0 +1,123 @@ + X.25 Device Driver Interface 1.1 + + Jonathan Naylor 26.12.96 + +This is a description of the messages to be passed between the X.25 Packet +Layer and the X.25 device driver. They are designed to allow for the easy +setting of the LAPB mode from within the Packet Layer. + +The X.25 device driver will be coded normally as per the Linux device driver +standards. Most X.25 device drivers will be moderately similar to the +already existing Ethernet device drivers. However unlike those drivers, the +X.25 device driver has a state associated with it, and this information +needs to be passed to and from the Packet Layer for proper operation. + +All messages are held in sk_buff's just like real data to be transmitted +over the LAPB link. The first byte of the skbuff indicates the meaning of +the rest of the skbuff, if any more information does exist. + + +Packet Layer to Device Driver +----------------------------- + +First Byte = 0x00 + +This indicates that the rest of the skbuff contains data to be transmitted +over the LAPB link. The LAPB link should already exist before any data is +passed down. + +First Byte = 0x01 + +Establish the LAPB link. If the link is already established then the connect +confirmation message should be returned as soon as possible. + +First Byte = 0x02 + +Terminate the LAPB link. If it is already disconnected then the disconnect +confirmation message should be returned as soon as possible. + +First Byte = 0x03 + +LAPB parameters. To be defined. + + +Device Driver to Packet Layer +----------------------------- + +First Byte = 0x00 + +This indicates that the rest of the skbuff contains data that has been +received over the LAPB link. + +First Byte = 0x01 + +LAPB link has been established. The same message is used for both a LAPB +link connect_confirmation and a connect_indication. + +First Byte = 0x02 + +LAPB link has been terminated. This same message is used for both a LAPB +link disconnect_confirmation and a disconnect_indication. + +First Byte = 0x03 + +LAPB parameters. To be defined. + + + +Possible Problems +================= + +(Henner Eisen, 2000-10-28) + +The X.25 packet layer protocol depends on a reliable datalink service. +The LAPB protocol provides such reliable service. But this reliability +is not preserved by the Linux network device driver interface: + +- With Linux 2.4.x (and above) SMP kernels, packet ordering is not + preserved. Even if a device driver calls netif_rx(skb1) and later + netif_rx(skb2), skb2 might be delivered to the network layer + earlier that skb1. +- Data passed upstream by means of netif_rx() might be dropped by the + kernel if the backlog queue is congested. + +The X.25 packet layer protocol will detect this and reset the virtual +call in question. But many upper layer protocols are not designed to +handle such N-Reset events gracefully. And frequent N-Reset events +will always degrade performance. + +Thus, driver authors should make netif_rx() as reliable as possible: + +SMP re-ordering will not occur if the driver's interrupt handler is +always executed on the same CPU. Thus, + +- Driver authors should use irq affinity for the interrupt handler. + +The probability of packet loss due to backlog congestion can be +reduced by the following measures or a combination thereof: + +(1) Drivers for kernel versions 2.4.x and above should always check the + return value of netif_rx(). If it returns NET_RX_DROP, the + driver's LAPB protocol must not confirm reception of the frame + to the peer. + This will reliably suppress packet loss. The LAPB protocol will + automatically cause the peer to re-transmit the dropped packet + later. + The lapb module interface was modified to support this. Its + data_indication() method should now transparently pass the + netif_rx() return value to the (lapb mopdule) caller. +(2) Drivers for kernel versions 2.2.x should always check the global + variable netdev_dropping when a new frame is received. The driver + should only call netif_rx() if netdev_dropping is zero. Otherwise + the driver should not confirm delivery of the frame and drop it. + Alternatively, the driver can queue the frame internally and call + netif_rx() later when netif_dropping is 0 again. In that case, delivery + confirmation should also be deferred such that the internal queue + cannot grow to much. + This will not reliably avoid packet loss, but the probability + of packet loss in netif_rx() path will be significantly reduced. +(3) Additionally, driver authors might consider to support + CONFIG_NET_HW_FLOWCONTROL. This allows the driver to be woken up + when a previously congested backlog queue becomes empty again. + The driver could uses this for flow-controlling the peer by means + of the LAPB protocol's flow-control service. diff --git a/Documentation/networking/x25.txt b/Documentation/networking/x25.txt new file mode 100644 index 000000000000..c91c6d7159ff --- /dev/null +++ b/Documentation/networking/x25.txt @@ -0,0 +1,44 @@ +Linux X.25 Project + +As my third year dissertation at University I have taken it upon myself to +write an X.25 implementation for Linux. My aim is to provide a complete X.25 +Packet Layer and a LAPB module to allow for "normal" X.25 to be run using +Linux. There are two sorts of X.25 cards available, intelligent ones that +implement LAPB on the card itself, and unintelligent ones that simply do +framing, bit-stuffing and checksumming. These both need to be handled by the +system. + +I therefore decided to write the implementation such that as far as the +Packet Layer is concerned, the link layer was being performed by a lower +layer of the Linux kernel and therefore it did not concern itself with +implementation of LAPB. Therefore the LAPB modules would be called by +unintelligent X.25 card drivers and not by intelligent ones, this would +provide a uniform device driver interface, and simplify configuration. + +To confuse matters a little, an 802.2 LLC implementation for Linux is being +written which will allow X.25 to be run over an Ethernet (or Token Ring) and +conform with the JNT "Pink Book", this will have a different interface to +the Packet Layer but there will be no confusion since the class of device +being served by the LLC will be completely separate from LAPB. The LLC +implementation is being done as part of another protocol project (SNA) and +by a different author. + +Just when you thought that it could not become more confusing, another +option appeared, XOT. This allows X.25 Packet Layer frames to operate over +the Internet using TCP/IP as a reliable link layer. RFC1613 specifies the +format and behaviour of the protocol. If time permits this option will also +be actively considered. + +A linux-x25 mailing list has been created at vger.kernel.org to support the +development and use of Linux X.25. It is early days yet, but interested +parties are welcome to subscribe to it. Just send a message to +majordomo@vger.kernel.org with the following in the message body: + +subscribe linux-x25 +end + +The contents of the Subject line are ignored. + +Jonathan + +g4klx@g4klx.demon.co.uk diff --git a/Documentation/networking/z8530drv.txt b/Documentation/networking/z8530drv.txt new file mode 100644 index 000000000000..2206abbc3e1b --- /dev/null +++ b/Documentation/networking/z8530drv.txt @@ -0,0 +1,657 @@ +This is a subset of the documentation. To use this driver you MUST have the +full package from: + +Internet: +========= + +1. ftp://ftp.ccac.rwth-aachen.de/pub/jr/z8530drv-utils_3.0-3.tar.gz + +2. ftp://ftp.pspt.fi/pub/ham/linux/ax25/z8530drv-utils_3.0-3.tar.gz + +Please note that the information in this document may be hopelessly outdated. +A new version of the documentation, along with links to other important +Linux Kernel AX.25 documentation and programs, is available on +http://yaina.de/jreuter + +----------------------------------------------------------------------------- + + + SCC.C - Linux driver for Z8530 based HDLC cards for AX.25 + + ******************************************************************** + + (c) 1993,2000 by Joerg Reuter DL1BKE <jreuter@yaina.de> + + portions (c) 1993 Guido ten Dolle PE1NNZ + + for the complete copyright notice see >> Copying.Z8530DRV << + + ******************************************************************** + + +1. Initialization of the driver +=============================== + +To use the driver, 3 steps must be performed: + + 1. if compiled as module: loading the module + 2. Setup of hardware, MODEM and KISS parameters with sccinit + 3. Attach each channel to the Linux kernel AX.25 with "ifconfig" + +Unlike the versions below 2.4 this driver is a real network device +driver. If you want to run xNOS instead of our fine kernel AX.25 +use a 2.x version (available from above sites) or read the +AX.25-HOWTO on how to emulate a KISS TNC on network device drivers. + + +1.1 Loading the module +====================== + +(If you're going to compile the driver as a part of the kernel image, + skip this chapter and continue with 1.2) + +Before you can use a module, you'll have to load it with + + insmod scc.o + +please read 'man insmod' that comes with module-init-tools. + +You should include the insmod in one of the /etc/rc.d/rc.* files, +and don't forget to insert a call of sccinit after that. It +will read your /etc/z8530drv.conf. + +1.2. /etc/z8530drv.conf +======================= + +To setup all parameters you must run /sbin/sccinit from one +of your rc.*-files. This has to be done BEFORE you can +"ifconfig" an interface. Sccinit reads the file /etc/z8530drv.conf +and sets the hardware, MODEM and KISS parameters. A sample file is +delivered with this package. Change it to your needs. + +The file itself consists of two main sections. + +1.2.1 configuration of hardware parameters +========================================== + +The hardware setup section defines the following parameters for each +Z8530: + +chip 1 +data_a 0x300 # data port A +ctrl_a 0x304 # control port A +data_b 0x301 # data port B +ctrl_b 0x305 # control port B +irq 5 # IRQ No. 5 +pclock 4915200 # clock +board BAYCOM # hardware type +escc no # enhanced SCC chip? (8580/85180/85280) +vector 0 # latch for interrupt vector +special no # address of special function register +option 0 # option to set via sfr + + +chip - this is just a delimiter to make sccinit a bit simpler to + program. A parameter has no effect. + +data_a - the address of the data port A of this Z8530 (needed) +ctrl_a - the address of the control port A (needed) +data_b - the address of the data port B (needed) +ctrl_b - the address of the control port B (needed) + +irq - the used IRQ for this chip. Different chips can use different + IRQs or the same. If they share an interrupt, it needs to be + specified within one chip-definition only. + +pclock - the clock at the PCLK pin of the Z8530 (option, 4915200 is + default), measured in Hertz + +board - the "type" of the board: + + SCC type value + --------------------------------- + PA0HZP SCC card PA0HZP + EAGLE card EAGLE + PC100 card PC100 + PRIMUS-PC (DG9BL) card PRIMUS + BayCom (U)SCC card BAYCOM + +escc - if you want support for ESCC chips (8580, 85180, 85280), set + this to "yes" (option, defaults to "no") + +vector - address of the vector latch (aka "intack port") for PA0HZP + cards. There can be only one vector latch for all chips! + (option, defaults to 0) + +special - address of the special function register on several cards. + (option, defaults to 0) + +option - The value you write into that register (option, default is 0) + +You can specify up to four chips (8 channels). If this is not enough, +just change + + #define MAXSCC 4 + +to a higher value. + +Example for the BAYCOM USCC: +---------------------------- + +chip 1 +data_a 0x300 # data port A +ctrl_a 0x304 # control port A +data_b 0x301 # data port B +ctrl_b 0x305 # control port B +irq 5 # IRQ No. 5 (#) +board BAYCOM # hardware type (*) +# +# SCC chip 2 +# +chip 2 +data_a 0x302 +ctrl_a 0x306 +data_b 0x303 +ctrl_b 0x307 +board BAYCOM + +An example for a PA0HZP card: +----------------------------- + +chip 1 +data_a 0x153 +data_b 0x151 +ctrl_a 0x152 +ctrl_b 0x150 +irq 9 +pclock 4915200 +board PA0HZP +vector 0x168 +escc no +# +# +# +chip 2 +data_a 0x157 +data_b 0x155 +ctrl_a 0x156 +ctrl_b 0x154 +irq 9 +pclock 4915200 +board PA0HZP +vector 0x168 +escc no + +A DRSI would should probably work with this: +-------------------------------------------- +(actually: two DRSI cards...) + +chip 1 +data_a 0x303 +data_b 0x301 +ctrl_a 0x302 +ctrl_b 0x300 +irq 7 +pclock 4915200 +board DRSI +escc no +# +# +# +chip 2 +data_a 0x313 +data_b 0x311 +ctrl_a 0x312 +ctrl_b 0x310 +irq 7 +pclock 4915200 +board DRSI +escc no + +Note that you cannot use the on-board baudrate generator off DRSI +cards. Use "mode dpll" for clock source (see below). + +This is based on information provided by Mike Bilow (and verified +by Paul Helay) + +The utility "gencfg" +-------------------- + +If you only know the parameters for the PE1CHL driver for DOS, +run gencfg. It will generate the correct port addresses (I hope). +Its parameters are exactly the same as the ones you use with +the "attach scc" command in net, except that the string "init" must +not appear. Example: + +gencfg 2 0x150 4 2 0 1 0x168 9 4915200 + +will print a skeleton z8530drv.conf for the OptoSCC to stdout. + +gencfg 2 0x300 2 4 5 -4 0 7 4915200 0x10 + +does the same for the BAYCOM USCC card. In my opinion it is much easier +to edit scc_config.h... + + +1.2.2 channel configuration +=========================== + +The channel definition is divided into three sub sections for each +channel: + +An example for scc0: + +# DEVICE + +device scc0 # the device for the following params + +# MODEM / BUFFERS + +speed 1200 # the default baudrate +clock dpll # clock source: + # dpll = normal half duplex operation + # external = MODEM provides own Rx/Tx clock + # divider = use full duplex divider if + # installed (1) +mode nrzi # HDLC encoding mode + # nrzi = 1k2 MODEM, G3RUH 9k6 MODEM + # nrz = DF9IC 9k6 MODEM + # +bufsize 384 # size of buffers. Note that this must include + # the AX.25 header, not only the data field! + # (optional, defaults to 384) + +# KISS (Layer 1) + +txdelay 36 # (see chapter 1.4) +persist 64 +slot 8 +tail 8 +fulldup 0 +wait 12 +min 3 +maxkey 7 +idle 3 +maxdef 120 +group 0 +txoff off +softdcd on +slip off + +The order WITHIN these sections is unimportant. The order OF these +sections IS important. The MODEM parameters are set with the first +recognized KISS parameter... + +Please note that you can initialize the board only once after boot +(or insmod). You can change all parameters but "mode" and "clock" +later with the Sccparam program or through KISS. Just to avoid +security holes... + +(1) this divider is usually mounted on the SCC-PBC (PA0HZP) or not + present at all (BayCom). It feeds back the output of the DPLL + (digital pll) as transmit clock. Using this mode without a divider + installed will normally result in keying the transceiver until + maxkey expires --- of course without sending anything (useful). + +2. Attachment of a channel by your AX.25 software +================================================= + +2.1 Kernel AX.25 +================ + +To set up an AX.25 device you can simply type: + + ifconfig scc0 44.128.1.1 hw ax25 dl0tha-7 + +This will create a network interface with the IP number 44.128.20.107 +and the callsign "dl0tha". If you do not have any IP number (yet) you +can use any of the 44.128.0.0 network. Note that you do not need +axattach. The purpose of axattach (like slattach) is to create a KISS +network device linked to a TTY. Please read the documentation of the +ax25-utils and the AX.25-HOWTO to learn how to set the parameters of +the kernel AX.25. + +2.2 NOS, NET and TFKISS +======================= + +Since the TTY driver (aka KISS TNC emulation) is gone you need +to emulate the old behaviour. The cost of using these programs is +that you probably need to compile the kernel AX.25, regardless of whether +you actually use it or not. First setup your /etc/ax25/axports, +for example: + + 9k6 dl0tha-9 9600 255 4 9600 baud port (scc3) + axlink dl0tha-15 38400 255 4 Link to NOS + +Now "ifconfig" the scc device: + + ifconfig scc3 44.128.1.1 hw ax25 dl0tha-9 + +You can now axattach a pseudo-TTY: + + axattach /dev/ptys0 axlink + +and start your NOS and attach /dev/ptys0 there. The problem is that +NOS is reachable only via digipeating through the kernel AX.25 +(disastrous on a DAMA controlled channel). To solve this problem, +configure "rxecho" to echo the incoming frames from "9k6" to "axlink" +and outgoing frames from "axlink" to "9k6" and start: + + rxecho + +Or simply use "kissbridge" coming with z8530drv-utils: + + ifconfig scc3 hw ax25 dl0tha-9 + kissbridge scc3 /dev/ptys0 + + +3. Adjustment and Display of parameters +======================================= + +3.1 Displaying SCC Parameters: +============================== + +Once a SCC channel has been attached, the parameter settings and +some statistic information can be shown using the param program: + +dl1bke-u:~$ sccstat scc0 + +Parameters: + +speed : 1200 baud +txdelay : 36 +persist : 255 +slottime : 0 +txtail : 8 +fulldup : 1 +waittime : 12 +mintime : 3 sec +maxkeyup : 7 sec +idletime : 3 sec +maxdefer : 120 sec +group : 0x00 +txoff : off +softdcd : on +SLIP : off + +Status: + +HDLC Z8530 Interrupts Buffers +----------------------------------------------------------------------- +Sent : 273 RxOver : 0 RxInts : 125074 Size : 384 +Received : 1095 TxUnder: 0 TxInts : 4684 NoSpace : 0 +RxErrors : 1591 ExInts : 11776 +TxErrors : 0 SpInts : 1503 +Tx State : idle + + +The status info shown is: + +Sent - number of frames transmitted +Received - number of frames received +RxErrors - number of receive errors (CRC, ABORT) +TxErrors - number of discarded Tx frames (due to various reasons) +Tx State - status of the Tx interrupt handler: idle/busy/active/tail (2) +RxOver - number of receiver overruns +TxUnder - number of transmitter underruns +RxInts - number of receiver interrupts +TxInts - number of transmitter interrupts +EpInts - number of receiver special condition interrupts +SpInts - number of external/status interrupts +Size - maximum size of an AX.25 frame (*with* AX.25 headers!) +NoSpace - number of times a buffer could not get allocated + +An overrun is abnormal. If lots of these occur, the product of +baudrate and number of interfaces is too high for the processing +power of your computer. NoSpace errors are unlikely to be caused by the +driver or the kernel AX.25. + + +3.2 Setting Parameters +====================== + + +The setting of parameters of the emulated KISS TNC is done in the +same way in the SCC driver. You can change parameters by using +the kissparms program from the ax25-utils package or use the program +"sccparam": + + sccparam <device> <paramname> <decimal-|hexadecimal value> + +You can change the following parameters: + +param : value +------------------------ +speed : 1200 +txdelay : 36 +persist : 255 +slottime : 0 +txtail : 8 +fulldup : 1 +waittime : 12 +mintime : 3 +maxkeyup : 7 +idletime : 3 +maxdefer : 120 +group : 0x00 +txoff : off +softdcd : on +SLIP : off + + +The parameters have the following meaning: + +speed: + The baudrate on this channel in bits/sec + + Example: sccparam /dev/scc3 speed 9600 + +txdelay: + The delay (in units of 10 ms) after keying of the + transmitter, until the first byte is sent. This is usually + called "TXDELAY" in a TNC. When 0 is specified, the driver + will just wait until the CTS signal is asserted. This + assumes the presence of a timer or other circuitry in the + MODEM and/or transmitter, that asserts CTS when the + transmitter is ready for data. + A normal value of this parameter is 30-36. + + Example: sccparam /dev/scc0 txd 20 + +persist: + This is the probability that the transmitter will be keyed + when the channel is found to be free. It is a value from 0 + to 255, and the probability is (value+1)/256. The value + should be somewhere near 50-60, and should be lowered when + the channel is used more heavily. + + Example: sccparam /dev/scc2 persist 20 + +slottime: + This is the time between samples of the channel. It is + expressed in units of 10 ms. About 200-300 ms (value 20-30) + seems to be a good value. + + Example: sccparam /dev/scc0 slot 20 + +tail: + The time the transmitter will remain keyed after the last + byte of a packet has been transferred to the SCC. This is + necessary because the CRC and a flag still have to leave the + SCC before the transmitter is keyed down. The value depends + on the baudrate selected. A few character times should be + sufficient, e.g. 40ms at 1200 baud. (value 4) + The value of this parameter is in 10 ms units. + + Example: sccparam /dev/scc2 4 + +full: + The full-duplex mode switch. This can be one of the following + values: + + 0: The interface will operate in CSMA mode (the normal + half-duplex packet radio operation) + 1: Fullduplex mode, i.e. the transmitter will be keyed at + any time, without checking the received carrier. It + will be unkeyed when there are no packets to be sent. + 2: Like 1, but the transmitter will remain keyed, also + when there are no packets to be sent. Flags will be + sent in that case, until a timeout (parameter 10) + occurs. + + Example: sccparam /dev/scc0 fulldup off + +wait: + The initial waittime before any transmit attempt, after the + frame has been queue for transmit. This is the length of + the first slot in CSMA mode. In full duplex modes it is + set to 0 for maximum performance. + The value of this parameter is in 10 ms units. + + Example: sccparam /dev/scc1 wait 4 + +maxkey: + The maximal time the transmitter will be keyed to send + packets, in seconds. This can be useful on busy CSMA + channels, to avoid "getting a bad reputation" when you are + generating a lot of traffic. After the specified time has + elapsed, no new frame will be started. Instead, the trans- + mitter will be switched off for a specified time (parameter + min), and then the selected algorithm for keyup will be + started again. + The value 0 as well as "off" will disable this feature, + and allow infinite transmission time. + + Example: sccparam /dev/scc0 maxk 20 + +min: + This is the time the transmitter will be switched off when + the maximum transmission time is exceeded. + + Example: sccparam /dev/scc3 min 10 + +idle + This parameter specifies the maximum idle time in full duplex + 2 mode, in seconds. When no frames have been sent for this + time, the transmitter will be keyed down. A value of 0 is + has same result as the fullduplex mode 1. This parameter + can be disabled. + + Example: sccparam /dev/scc2 idle off # transmit forever + +maxdefer + This is the maximum time (in seconds) to wait for a free channel + to send. When this timer expires the transmitter will be keyed + IMMEDIATELY. If you love to get trouble with other users you + should set this to a very low value ;-) + + Example: sccparam /dev/scc0 maxdefer 240 # 2 minutes + + +txoff: + When this parameter has the value 0, the transmission of packets + is enable. Otherwise it is disabled. + + Example: sccparam /dev/scc2 txoff on + +group: + It is possible to build special radio equipment to use more than + one frequency on the same band, e.g. using several receivers and + only one transmitter that can be switched between frequencies. + Also, you can connect several radios that are active on the same + band. In these cases, it is not possible, or not a good idea, to + transmit on more than one frequency. The SCC driver provides a + method to lock transmitters on different interfaces, using the + "param <interface> group <x>" command. This will only work when + you are using CSMA mode (parameter full = 0). + The number <x> must be 0 if you want no group restrictions, and + can be computed as follows to create restricted groups: + <x> is the sum of some OCTAL numbers: + + 200 This transmitter will only be keyed when all other + transmitters in the group are off. + 100 This transmitter will only be keyed when the carrier + detect of all other interfaces in the group is off. + 0xx A byte that can be used to define different groups. + Interfaces are in the same group, when the logical AND + between their xx values is nonzero. + + Examples: + When 2 interfaces use group 201, their transmitters will never be + keyed at the same time. + When 2 interfaces use group 101, the transmitters will only key + when both channels are clear at the same time. When group 301, + the transmitters will not be keyed at the same time. + + Don't forget to convert the octal numbers into decimal before + you set the parameter. + + Example: (to be written) + +softdcd: + use a software dcd instead of the real one... Useful for a very + slow squelch. + + Example: sccparam /dev/scc0 soft on + + +4. Problems +=========== + +If you have tx-problems with your BayCom USCC card please check +the manufacturer of the 8530. SGS chips have a slightly +different timing. Try Zilog... A solution is to write to register 8 +instead to the data port, but this won't work with the ESCC chips. +*SIGH!* + +A very common problem is that the PTT locks until the maxkeyup timer +expires, although interrupts and clock source are correct. In most +cases compiling the driver with CONFIG_SCC_DELAY (set with +make config) solves the problems. For more hints read the (pseudo) FAQ +and the documentation coming with z8530drv-utils. + +I got reports that the driver has problems on some 386-based systems. +(i.e. Amstrad) Those systems have a bogus AT bus timing which will +lead to delayed answers on interrupts. You can recognize these +problems by looking at the output of Sccstat for the suspected +port. If it shows under- and overruns you own such a system. + +Delayed processing of received data: This depends on + +- the kernel version + +- kernel profiling compiled or not + +- a high interrupt load + +- a high load of the machine --- running X, Xmorph, XV and Povray, + while compiling the kernel... hmm ... even with 32 MB RAM ... ;-) + Or running a named for the whole .ampr.org domain on an 8 MB + box... + +- using information from rxecho or kissbridge. + +Kernel panics: please read /linux/README and find out if it +really occurred within the scc driver. + +If you cannot solve a problem, send me + +- a description of the problem, +- information on your hardware (computer system, scc board, modem) +- your kernel version +- the output of cat /proc/net/z8530 + +4. Thor RLC100 +============== + +Mysteriously this board seems not to work with the driver. Anyone +got it up-and-running? + + +Many thanks to Linus Torvalds and Alan Cox for including the driver +in the Linux standard distribution and their support. + +Joerg Reuter ampr-net: dl1bke@db0pra.ampr.org + AX-25 : DL1BKE @ DB0ABH.#BAY.DEU.EU + Internet: jreuter@yaina.de + WWW : http://yaina.de/jreuter diff --git a/Documentation/nfsroot.txt b/Documentation/nfsroot.txt new file mode 100644 index 000000000000..a87d4af216c0 --- /dev/null +++ b/Documentation/nfsroot.txt @@ -0,0 +1,210 @@ +Mounting the root filesystem via NFS (nfsroot) +=============================================== + +Written 1996 by Gero Kuhlmann <gero@gkminix.han.de> +Updated 1997 by Martin Mares <mj@atrey.karlin.mff.cuni.cz> + + + +If you want to use a diskless system, as an X-terminal or printer +server for example, you have to put your root filesystem onto a +non-disk device. This can either be a ramdisk (see initrd.txt in +this directory for further information) or a filesystem mounted +via NFS. The following text describes on how to use NFS for the +root filesystem. For the rest of this text 'client' means the +diskless system, and 'server' means the NFS server. + + + + +1.) Enabling nfsroot capabilities + ----------------------------- + +In order to use nfsroot you have to select support for NFS during +kernel configuration. Note that NFS cannot be loaded as a module +in this case. The configuration script will then ask you whether +you want to use nfsroot, and if yes what kind of auto configuration +system you want to use. Selecting both BOOTP and RARP is safe. + + + + +2.) Kernel command line + ------------------- + +When the kernel has been loaded by a boot loader (either by loadlin, +LILO or a network boot program) it has to be told what root fs device +to use, and where to find the server and the name of the directory +on the server to mount as root. This can be established by a couple +of kernel command line parameters: + + +root=/dev/nfs + + This is necessary to enable the pseudo-NFS-device. Note that it's not a + real device but just a synonym to tell the kernel to use NFS instead of + a real device. + + +nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>] + + If the `nfsroot' parameter is NOT given on the command line, the default + "/tftpboot/%s" will be used. + + <server-ip> Specifies the IP address of the NFS server. If this field + is not given, the default address as determined by the + `ip' variable (see below) is used. One use of this + parameter is for example to allow using different servers + for RARP and NFS. Usually you can leave this blank. + + <root-dir> Name of the directory on the server to mount as root. If + there is a "%s" token in the string, the token will be + replaced by the ASCII-representation of the client's IP + address. + + <nfs-options> Standard NFS options. All options are separated by commas. + If the options field is not given, the following defaults + will be used: + port = as given by server portmap daemon + rsize = 1024 + wsize = 1024 + timeo = 7 + retrans = 3 + acregmin = 3 + acregmax = 60 + acdirmin = 30 + acdirmax = 60 + flags = hard, nointr, noposix, cto, ac + + +ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf> + + This parameter tells the kernel how to configure IP addresses of devices + and also how to set up the IP routing table. It was originally called `nfsaddrs', + but now the boot-time IP configuration works independently of NFS, so it + was renamed to `ip' and the old name remained as an alias for compatibility + reasons. + + If this parameter is missing from the kernel command line, all fields are + assumed to be empty, and the defaults mentioned below apply. In general + this means that the kernel tries to configure everything using both + RARP and BOOTP (depending on what has been enabled during kernel confi- + guration, and if both what protocol answer got in first). + + <client-ip> IP address of the client. If empty, the address will either + be determined by RARP or BOOTP. What protocol is used de- + pends on what has been enabled during kernel configuration + and on the <autoconf> parameter. If this parameter is not + empty, neither RARP nor BOOTP will be used. + + <server-ip> IP address of the NFS server. If RARP is used to determine + the client address and this parameter is NOT empty only + replies from the specified server are accepted. To use + different RARP and NFS server, specify your RARP server + here (or leave it blank), and specify your NFS server in + the `nfsroot' parameter (see above). If this entry is blank + the address of the server is used which answered the RARP + or BOOTP request. + + <gw-ip> IP address of a gateway if the server is on a different + subnet. If this entry is empty no gateway is used and the + server is assumed to be on the local network, unless a + value has been received by BOOTP. + + <netmask> Netmask for local network interface. If this is empty, + the netmask is derived from the client IP address assuming + classful addressing, unless overridden in BOOTP reply. + + <hostname> Name of the client. If empty, the client IP address is + used in ASCII notation, or the value received by BOOTP. + + <device> Name of network device to use. If this is empty, all + devices are used for RARP and BOOTP requests, and the + first one we receive a reply on is configured. If you have + only one device, you can safely leave this blank. + + <autoconf> Method to use for autoconfiguration. If this is either + 'rarp' or 'bootp', the specified protocol is used. + If the value is 'both' or empty, both protocols are used + so far as they have been enabled during kernel configura- + tion. 'off' means no autoconfiguration. + + The <autoconf> parameter can appear alone as the value to the `ip' + parameter (without all the ':' characters before) in which case auto- + configuration is used. + + + + +3.) Kernel loader + ------------- + +To get the kernel into memory different approaches can be used. They +depend on what facilities are available: + + +3.1) Writing the kernel onto a floppy using dd: + As always you can just write the kernel onto a floppy using dd, + but then it's not possible to use kernel command lines at all. + To substitute the 'root=' parameter, create a dummy device on any + linux system with major number 0 and minor number 255 using mknod: + + mknod /dev/boot255 c 0 255 + + Then copy the kernel zImage file onto a floppy using dd: + + dd if=/usr/src/linux/arch/i386/boot/zImage of=/dev/fd0 + + And finally use rdev to set the root device: + + rdev /dev/fd0 /dev/boot255 + + You can then remove the dummy device /dev/boot255 again. There + is no real device available for it. + The other two kernel command line parameters cannot be substi- + tuted with rdev. Therefore, using this method the kernel will + by default use RARP and/or BOOTP, and if it gets an answer via + RARP will mount the directory /tftpboot/<client-ip>/ as its + root. If it got a BOOTP answer the directory name in that answer + is used. + + +3.2) Using LILO + When using LILO you can specify all necessary command line + parameters with the 'append=' command in the LILO configuration + file. However, to use the 'root=' command you also need to + set up a dummy device as described in 3.1 above. For how to use + LILO and its 'append=' command please refer to the LILO + documentation. + +3.3) Using loadlin + When you want to boot Linux from a DOS command prompt without + having a local hard disk to mount as root, you can use loadlin. + I was told that it works, but haven't used it myself yet. In + general you should be able to create a kernel command line simi- + lar to how LILO is doing it. Please refer to the loadlin docu- + mentation for further information. + +3.4) Using a boot ROM + This is probably the most elegant way of booting a diskless + client. With a boot ROM the kernel gets loaded using the TFTP + protocol. As far as I know, no commercial boot ROMs yet + support booting Linux over the network, but there are two + free implementations of a boot ROM available on sunsite.unc.edu + and its mirrors. They are called 'netboot-nfs' and 'etherboot'. + Both contain everything you need to boot a diskless Linux client. + + + + +4.) Credits + ------- + + The nfsroot code in the kernel and the RARP support have been written + by Gero Kuhlmann <gero@gkminix.han.de>. + + The rest of the IP layer autoconfiguration code has been written + by Martin Mares <mj@atrey.karlin.mff.cuni.cz>. + + In order to write the initial version of nfsroot I would like to thank + Jens-Uwe Mager <jum@anubis.han.de> for his help. diff --git a/Documentation/nmi_watchdog.txt b/Documentation/nmi_watchdog.txt new file mode 100644 index 000000000000..c025a4561c10 --- /dev/null +++ b/Documentation/nmi_watchdog.txt @@ -0,0 +1,81 @@ + +[NMI watchdog is available for x86 and x86-64 architectures] + +Is your system locking up unpredictably? No keyboard activity, just +a frustrating complete hard lockup? Do you want to help us debugging +such lockups? If all yes then this document is definitely for you. + +On many x86/x86-64 type hardware there is a feature that enables +us to generate 'watchdog NMI interrupts'. (NMI: Non Maskable Interrupt +which get executed even if the system is otherwise locked up hard). +This can be used to debug hard kernel lockups. By executing periodic +NMI interrupts, the kernel can monitor whether any CPU has locked up, +and print out debugging messages if so. + +In order to use the NMI watchdog, you need to have APIC support in your +kernel. For SMP kernels, APIC support gets compiled in automatically. For +UP, enable either CONFIG_X86_UP_APIC (Processor type and features -> Local +APIC support on uniprocessors) or CONFIG_X86_UP_IOAPIC (Processor type and +features -> IO-APIC support on uniprocessors) in your kernel config. +CONFIG_X86_UP_APIC is for uniprocessor machines without an IO-APIC. +CONFIG_X86_UP_IOAPIC is for uniprocessor with an IO-APIC. [Note: certain +kernel debugging options, such as Kernel Stack Meter or Kernel Tracer, +may implicitly disable the NMI watchdog.] + +For x86-64, the needed APIC is always compiled in, and the NMI watchdog is +always enabled with I/O-APIC mode (nmi_watchdog=1). Currently, local APIC +mode (nmi_watchdog=2) does not work on x86-64. + +Using local APIC (nmi_watchdog=2) needs the first performance register, so +you can't use it for other purposes (such as high precision performance +profiling.) However, at least oprofile and the perfctr driver disable the +local APIC NMI watchdog automatically. + +To actually enable the NMI watchdog, use the 'nmi_watchdog=N' boot +parameter. Eg. the relevant lilo.conf entry: + + append="nmi_watchdog=1" + +For SMP machines and UP machines with an IO-APIC use nmi_watchdog=1. +For UP machines without an IO-APIC use nmi_watchdog=2, this only works +for some processor types. If in doubt, boot with nmi_watchdog=1 and +check the NMI count in /proc/interrupts; if the count is zero then +reboot with nmi_watchdog=2 and check the NMI count. If it is still +zero then log a problem, you probably have a processor that needs to be +added to the nmi code. + +A 'lockup' is the following scenario: if any CPU in the system does not +execute the period local timer interrupt for more than 5 seconds, then +the NMI handler generates an oops and kills the process. This +'controlled crash' (and the resulting kernel messages) can be used to +debug the lockup. Thus whenever the lockup happens, wait 5 seconds and +the oops will show up automatically. If the kernel produces no messages +then the system has crashed so hard (eg. hardware-wise) that either it +cannot even accept NMI interrupts, or the crash has made the kernel +unable to print messages. + +Be aware that when using local APIC, the frequency of NMI interrupts +it generates, depends on the system load. The local APIC NMI watchdog, +lacking a better source, uses the "cycles unhalted" event. As you may +guess it doesn't tick when the CPU is in the halted state (which happens +when the system is idle), but if your system locks up on anything but the +"hlt" processor instruction, the watchdog will trigger very soon as the +"cycles unhalted" event will happen every clock tick. If it locks up on +"hlt", then you are out of luck -- the event will not happen at all and the +watchdog won't trigger. This is a shortcoming of the local APIC watchdog +-- unfortunately there is no "clock ticks" event that would work all the +time. The I/O APIC watchdog is driven externally and has no such shortcoming. +But its NMI frequency is much higher, resulting in a more significant hit +to the overall system performance. + +NOTE: starting with 2.4.2-ac18 the NMI-oopser is disabled by default, +you have to enable it with a boot time parameter. Prior to 2.4.2-ac18 +the NMI-oopser is enabled unconditionally on x86 SMP boxes. + +On x86-64 the NMI oopser is on by default. On 64bit Intel CPUs +it uses IO-APIC by default and on AMD it uses local APIC. + +[ feel free to send bug reports, suggestions and patches to + Ingo Molnar <mingo@redhat.com> or the Linux SMP mailing + list at <linux-smp@vger.kernel.org> ] + diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt new file mode 100644 index 000000000000..b88ebe4d808c --- /dev/null +++ b/Documentation/nommu-mmap.txt @@ -0,0 +1,198 @@ + ============================= + NO-MMU MEMORY MAPPING SUPPORT + ============================= + +The kernel has limited support for memory mapping under no-MMU conditions, such +as are used in uClinux environments. From the userspace point of view, memory +mapping is made use of in conjunction with the mmap() system call, the shmat() +call and the execve() system call. From the kernel's point of view, execve() +mapping is actually performed by the binfmt drivers, which call back into the +mmap() routines to do the actual work. + +Memory mapping behaviour also involves the way fork(), vfork(), clone() and +ptrace() work. Under uClinux there is no fork(), and clone() must be supplied +the CLONE_VM flag. + +The behaviour is similar between the MMU and no-MMU cases, but not identical; +and it's also much more restricted in the latter case: + + (*) Anonymous mapping, MAP_PRIVATE + + In the MMU case: VM regions backed by arbitrary pages; copy-on-write + across fork. + + In the no-MMU case: VM regions backed by arbitrary contiguous runs of + pages. + + (*) Anonymous mapping, MAP_SHARED + + These behave very much like private mappings, except that they're + shared across fork() or clone() without CLONE_VM in the MMU case. Since + the no-MMU case doesn't support these, behaviour is identical to + MAP_PRIVATE there. + + (*) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, !PROT_WRITE + + In the MMU case: VM regions backed by pages read from file; changes to + the underlying file are reflected in the mapping; copied across fork. + + In the no-MMU case: + + - If one exists, the kernel will re-use an existing mapping to the + same segment of the same file if that has compatible permissions, + even if this was created by another process. + + - If possible, the file mapping will be directly on the backing device + if the backing device has the BDI_CAP_MAP_DIRECT capability and + appropriate mapping protection capabilities. Ramfs, romfs, cramfs + and mtd might all permit this. + + - If the backing device device can't or won't permit direct sharing, + but does have the BDI_CAP_MAP_COPY capability, then a copy of the + appropriate bit of the file will be read into a contiguous bit of + memory and any extraneous space beyond the EOF will be cleared + + - Writes to the file do not affect the mapping; writes to the mapping + are visible in other processes (no MMU protection), but should not + happen. + + (*) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, PROT_WRITE + + In the MMU case: like the non-PROT_WRITE case, except that the pages in + question get copied before the write actually happens. From that point + on writes to the file underneath that page no longer get reflected into + the mapping's backing pages. The page is then backed by swap instead. + + In the no-MMU case: works much like the non-PROT_WRITE case, except + that a copy is always taken and never shared. + + (*) Regular file / blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE + + In the MMU case: VM regions backed by pages read from file; changes to + pages written back to file; writes to file reflected into pages backing + mapping; shared across fork. + + In the no-MMU case: not supported. + + (*) Memory backed regular file, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE + + In the MMU case: As for ordinary regular files. + + In the no-MMU case: The filesystem providing the memory-backed file + (such as ramfs or tmpfs) may choose to honour an open, truncate, mmap + sequence by providing a contiguous sequence of pages to map. In that + case, a shared-writable memory mapping will be possible. It will work + as for the MMU case. If the filesystem does not provide any such + support, then the mapping request will be denied. + + (*) Memory backed blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE + + In the MMU case: As for ordinary regular files. + + In the no-MMU case: As for memory backed regular files, but the + blockdev must be able to provide a contiguous run of pages without + truncate being called. The ramdisk driver could do this if it allocated + all its memory as a contiguous array upfront. + + (*) Memory backed chardev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE + + In the MMU case: As for ordinary regular files. + + In the no-MMU case: The character device driver may choose to honour + the mmap() by providing direct access to the underlying device if it + provides memory or quasi-memory that can be accessed directly. Examples + of such are frame buffers and flash devices. If the driver does not + provide any such support, then the mapping request will be denied. + + +============================ +FURTHER NOTES ON NO-MMU MMAP +============================ + + (*) A request for a private mapping of less than a page in size may not return + a page-aligned buffer. This is because the kernel calls kmalloc() to + allocate the buffer, not get_free_page(). + + (*) A list of all the mappings on the system is visible through /proc/maps in + no-MMU mode. + + (*) Supplying MAP_FIXED or a requesting a particular mapping address will + result in an error. + + (*) Files mapped privately usually have to have a read method provided by the + driver or filesystem so that the contents can be read into the memory + allocated if mmap() chooses not to map the backing device directly. An + error will result if they don't. This is most likely to be encountered + with character device files, pipes, fifos and sockets. + +============================================ +PROVIDING SHAREABLE CHARACTER DEVICE SUPPORT +============================================ + +To provide shareable character device support, a driver must provide a +file->f_op->get_unmapped_area() operation. The mmap() routines will call this +to get a proposed address for the mapping. This may return an error if it +doesn't wish to honour the mapping because it's too long, at a weird offset, +under some unsupported combination of flags or whatever. + +The driver should also provide backing device information with capabilities set +to indicate the permitted types of mapping on such devices. The default is +assumed to be readable and writable, not executable, and only shareable +directly (can't be copied). + +The file->f_op->mmap() operation will be called to actually inaugurate the +mapping. It can be rejected at that point. Returning the ENOSYS error will +cause the mapping to be copied instead if BDI_CAP_MAP_COPY is specified. + +The vm_ops->close() routine will be invoked when the last mapping on a chardev +is removed. An existing mapping will be shared, partially or not, if possible +without notifying the driver. + +It is permitted also for the file->f_op->get_unmapped_area() operation to +return -ENOSYS. This will be taken to mean that this operation just doesn't +want to handle it, despite the fact it's got an operation. For instance, it +might try directing the call to a secondary driver which turns out not to +implement it. Such is the case for the framebuffer driver which attempts to +direct the call to the device-specific driver. Under such circumstances, the +mapping request will be rejected if BDI_CAP_MAP_COPY is not specified, and a +copy mapped otherwise. + +IMPORTANT NOTE: + + Some types of device may present a different appearance to anyone + looking at them in certain modes. Flash chips can be like this; for + instance if they're in programming or erase mode, you might see the + status reflected in the mapping, instead of the data. + + In such a case, care must be taken lest userspace see a shared or a + private mapping showing such information when the driver is busy + controlling the device. Remember especially: private executable + mappings may still be mapped directly off the device under some + circumstances! + + +============================================== +PROVIDING SHAREABLE MEMORY-BACKED FILE SUPPORT +============================================== + +Provision of shared mappings on memory backed files is similar to the provision +of support for shared mapped character devices. The main difference is that the +filesystem providing the service will probably allocate a contiguous collection +of pages and permit mappings to be made on that. + +It is recommended that a truncate operation applied to such a file that +increases the file size, if that file is empty, be taken as a request to gather +enough pages to honour a mapping. This is required to support POSIX shared +memory. + +Memory backed devices are indicated by the mapping's backing device info having +the memory_backed flag set. + + +======================================== +PROVIDING SHAREABLE BLOCK DEVICE SUPPORT +======================================== + +Provision of shared mappings on block device files is exactly the same as for +character devices. If there isn't a real device underneath, then the driver +should allocate sufficient contiguous memory to honour any supported mapping. diff --git a/Documentation/numastat.txt b/Documentation/numastat.txt new file mode 100644 index 000000000000..80133ace1eb2 --- /dev/null +++ b/Documentation/numastat.txt @@ -0,0 +1,22 @@ + +Numa policy hit/miss statistics + +/sys/devices/system/node/node*/numastat + +All units are pages. Hugepages have separate counters. + +numa_hit A process wanted to allocate memory from this node, + and succeeded. +numa_miss A process wanted to allocate memory from this node, + but ended up with memory from another. +numa_foreign A process wanted to allocate on another node, + but ended up with memory from this one. +local_node A process ran on this node and got memory from it. +other_node A process ran on this node and got memory from another node. +interleave_hit Interleaving wanted to allocate from this node + and succeeded. + +For easier reading you can use the numastat utility from the numactl package +(ftp://ftp.suse.com/pub/people/ak/numa/numactl*). Note that it only works +well right now on machines with a small number of CPUs. + diff --git a/Documentation/oops-tracing.txt b/Documentation/oops-tracing.txt new file mode 100644 index 000000000000..da711028e5f7 --- /dev/null +++ b/Documentation/oops-tracing.txt @@ -0,0 +1,229 @@ +NOTE: ksymoops is useless on 2.6. Please use the Oops in its original format +(from dmesg, etc). Ignore any references in this or other docs to "decoding +the Oops" or "running it through ksymoops". If you post an Oops fron 2.6 that +has been run through ksymoops, people will just tell you to repost it. + +Quick Summary +------------- + +Find the Oops and send it to the maintainer of the kernel area that seems to be +involved with the problem. Don't worry too much about getting the wrong person. +If you are unsure send it to the person responsible for the code relevant to +what you were doing. If it occurs repeatably try and describe how to recreate +it. That's worth even more than the oops. + +If you are totally stumped as to whom to send the report, send it to +linux-kernel@vger.kernel.org. Thanks for your help in making Linux as +stable as humanly possible. + +Where is the Oops? +---------------------- + +Normally the Oops text is read from the kernel buffers by klogd and +handed to syslogd which writes it to a syslog file, typically +/var/log/messages (depends on /etc/syslog.conf). Sometimes klogd dies, +in which case you can run dmesg > file to read the data from the kernel +buffers and save it. Or you can cat /proc/kmsg > file, however you +have to break in to stop the transfer, kmsg is a "never ending file". +If the machine has crashed so badly that you cannot enter commands or +the disk is not available then you have three options :- + +(1) Hand copy the text from the screen and type it in after the machine + has restarted. Messy but it is the only option if you have not + planned for a crash. + +(2) Boot with a serial console (see Documentation/serial-console.txt), + run a null modem to a second machine and capture the output there + using your favourite communication program. Minicom works well. + +(3) Patch the kernel with one of the crash dump patches. These save + data to a floppy disk or video rom or a swap partition. None of + these are standard kernel patches so you have to find and apply + them yourself. Search kernel archives for kmsgdump, lkcd and + oops+smram. + + +Full Information +---------------- + +NOTE: the message from Linus below applies to 2.4 kernel. I have preserved it +for historical reasons, and because some of the information in it still +applies. Especially, please ignore any references to ksymoops. + +From: Linus Torvalds <torvalds@osdl.org> + +How to track down an Oops.. [originally a mail to linux-kernel] + +The main trick is having 5 years of experience with those pesky oops +messages ;-) + +Actually, there are things you can do that make this easier. I have two +separate approaches: + + gdb /usr/src/linux/vmlinux + gdb> disassemble <offending_function> + +That's the easy way to find the problem, at least if the bug-report is +well made (like this one was - run through ksymoops to get the +information of which function and the offset in the function that it +happened in). + +Oh, it helps if the report happens on a kernel that is compiled with the +same compiler and similar setups. + +The other thing to do is disassemble the "Code:" part of the bug report: +ksymoops will do this too with the correct tools, but if you don't have +the tools you can just do a silly program: + + char str[] = "\xXX\xXX\xXX..."; + main(){} + +and compile it with gcc -g and then do "disassemble str" (where the "XX" +stuff are the values reported by the Oops - you can just cut-and-paste +and do a replace of spaces to "\x" - that's what I do, as I'm too lazy +to write a program to automate this all). + +Finally, if you want to see where the code comes from, you can do + + cd /usr/src/linux + make fs/buffer.s # or whatever file the bug happened in + +and then you get a better idea of what happens than with the gdb +disassembly. + +Now, the trick is just then to combine all the data you have: the C +sources (and general knowledge of what it _should_ do), the assembly +listing and the code disassembly (and additionally the register dump you +also get from the "oops" message - that can be useful to see _what_ the +corrupted pointers were, and when you have the assembler listing you can +also match the other registers to whatever C expressions they were used +for). + +Essentially, you just look at what doesn't match (in this case it was the +"Code" disassembly that didn't match with what the compiler generated). +Then you need to find out _why_ they don't match. Often it's simple - you +see that the code uses a NULL pointer and then you look at the code and +wonder how the NULL pointer got there, and if it's a valid thing to do +you just check against it.. + +Now, if somebody gets the idea that this is time-consuming and requires +some small amount of concentration, you're right. Which is why I will +mostly just ignore any panic reports that don't have the symbol table +info etc looked up: it simply gets too hard to look it up (I have some +programs to search for specific patterns in the kernel code segment, and +sometimes I have been able to look up those kinds of panics too, but +that really requires pretty good knowledge of the kernel just to be able +to pick out the right sequences etc..) + +_Sometimes_ it happens that I just see the disassembled code sequence +from the panic, and I know immediately where it's coming from. That's when +I get worried that I've been doing this for too long ;-) + + Linus + + +--------------------------------------------------------------------------- +Notes on Oops tracing with klogd: + +In order to help Linus and the other kernel developers there has been +substantial support incorporated into klogd for processing protection +faults. In order to have full support for address resolution at least +version 1.3-pl3 of the sysklogd package should be used. + +When a protection fault occurs the klogd daemon automatically +translates important addresses in the kernel log messages to their +symbolic equivalents. This translated kernel message is then +forwarded through whatever reporting mechanism klogd is using. The +protection fault message can be simply cut out of the message files +and forwarded to the kernel developers. + +Two types of address resolution are performed by klogd. The first is +static translation and the second is dynamic translation. Static +translation uses the System.map file in much the same manner that +ksymoops does. In order to do static translation the klogd daemon +must be able to find a system map file at daemon initialization time. +See the klogd man page for information on how klogd searches for map +files. + +Dynamic address translation is important when kernel loadable modules +are being used. Since memory for kernel modules is allocated from the +kernel's dynamic memory pools there are no fixed locations for either +the start of the module or for functions and symbols in the module. + +The kernel supports system calls which allow a program to determine +which modules are loaded and their location in memory. Using these +system calls the klogd daemon builds a symbol table which can be used +to debug a protection fault which occurs in a loadable kernel module. + +At the very minimum klogd will provide the name of the module which +generated the protection fault. There may be additional symbolic +information available if the developer of the loadable module chose to +export symbol information from the module. + +Since the kernel module environment can be dynamic there must be a +mechanism for notifying the klogd daemon when a change in module +environment occurs. There are command line options available which +allow klogd to signal the currently executing daemon that symbol +information should be refreshed. See the klogd manual page for more +information. + +A patch is included with the sysklogd distribution which modifies the +modules-2.0.0 package to automatically signal klogd whenever a module +is loaded or unloaded. Applying this patch provides essentially +seamless support for debugging protection faults which occur with +kernel loadable modules. + +The following is an example of a protection fault in a loadable module +processed by klogd: +--------------------------------------------------------------------------- +Aug 29 09:51:01 blizard kernel: Unable to handle kernel paging request at virtual address f15e97cc +Aug 29 09:51:01 blizard kernel: current->tss.cr3 = 0062d000, %cr3 = 0062d000 +Aug 29 09:51:01 blizard kernel: *pde = 00000000 +Aug 29 09:51:01 blizard kernel: Oops: 0002 +Aug 29 09:51:01 blizard kernel: CPU: 0 +Aug 29 09:51:01 blizard kernel: EIP: 0010:[oops:_oops+16/3868] +Aug 29 09:51:01 blizard kernel: EFLAGS: 00010212 +Aug 29 09:51:01 blizard kernel: eax: 315e97cc ebx: 003a6f80 ecx: 001be77b edx: 00237c0c +Aug 29 09:51:01 blizard kernel: esi: 00000000 edi: bffffdb3 ebp: 00589f90 esp: 00589f8c +Aug 29 09:51:01 blizard kernel: ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018 +Aug 29 09:51:01 blizard kernel: Process oops_test (pid: 3374, process nr: 21, stackpage=00589000) +Aug 29 09:51:01 blizard kernel: Stack: 315e97cc 00589f98 0100b0b4 bffffed4 0012e38e 00240c64 003a6f80 00000001 +Aug 29 09:51:01 blizard kernel: 00000000 00237810 bfffff00 0010a7fa 00000003 00000001 00000000 bfffff00 +Aug 29 09:51:01 blizard kernel: bffffdb3 bffffed4 ffffffda 0000002b 0007002b 0000002b 0000002b 00000036 +Aug 29 09:51:01 blizard kernel: Call Trace: [oops:_oops_ioctl+48/80] [_sys_ioctl+254/272] [_system_call+82/128] +Aug 29 09:51:01 blizard kernel: Code: c7 00 05 00 00 00 eb 08 90 90 90 90 90 90 90 90 89 ec 5d c3 +--------------------------------------------------------------------------- + +Dr. G.W. Wettstein Oncology Research Div. Computing Facility +Roger Maris Cancer Center INTERNET: greg@wind.rmcc.com +820 4th St. N. +Fargo, ND 58122 +Phone: 701-234-7556 + + +--------------------------------------------------------------------------- +Tainted kernels: + +Some oops reports contain the string 'Tainted: ' after the program +counter, this indicates that the kernel has been tainted by some +mechanism. The string is followed by a series of position sensitive +characters, each representing a particular tainted value. + + 1: 'G' if all modules loaded have a GPL or compatible license, 'P' if + any proprietary module has been loaded. Modules without a + MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by + insmod as GPL compatible are assumed to be proprietary. + + 2: 'F' if any module was force loaded by insmod -f, ' ' if all + modules were loaded normally. + + 3: 'S' if the oops occurred on an SMP kernel running on hardware that + hasn't been certified as safe to run multiprocessor. + Currently this occurs only on various Athlons that are not + SMP capable. + +The primary reason for the 'Tainted: ' string is to tell kernel +debuggers if this is a clean kernel or if anything unusual has +occurred. Tainting is permanent, even if an offending module is +unloading the tainted value remains to indicate that the kernel is not +trustworthy. diff --git a/Documentation/paride.txt b/Documentation/paride.txt new file mode 100644 index 000000000000..e4312676bdda --- /dev/null +++ b/Documentation/paride.txt @@ -0,0 +1,417 @@ + + Linux and parallel port IDE devices + +PARIDE v1.03 (c) 1997-8 Grant Guenther <grant@torque.net> + +1. Introduction + +Owing to the simplicity and near universality of the parallel port interface +to personal computers, many external devices such as portable hard-disk, +CD-ROM, LS-120 and tape drives use the parallel port to connect to their +host computer. While some devices (notably scanners) use ad-hoc methods +to pass commands and data through the parallel port interface, most +external devices are actually identical to an internal model, but with +a parallel-port adapter chip added in. Some of the original parallel port +adapters were little more than mechanisms for multiplexing a SCSI bus. +(The Iomega PPA-3 adapter used in the ZIP drives is an example of this +approach). Most current designs, however, take a different approach. +The adapter chip reproduces a small ISA or IDE bus in the external device +and the communication protocol provides operations for reading and writing +device registers, as well as data block transfer functions. Sometimes, +the device being addressed via the parallel cable is a standard SCSI +controller like an NCR 5380. The "ditto" family of external tape +drives use the ISA replicator to interface a floppy disk controller, +which is then connected to a floppy-tape mechanism. The vast majority +of external parallel port devices, however, are now based on standard +IDE type devices, which require no intermediate controller. If one +were to open up a parallel port CD-ROM drive, for instance, one would +find a standard ATAPI CD-ROM drive, a power supply, and a single adapter +that interconnected a standard PC parallel port cable and a standard +IDE cable. It is usually possible to exchange the CD-ROM device with +any other device using the IDE interface. + +The document describes the support in Linux for parallel port IDE +devices. It does not cover parallel port SCSI devices, "ditto" tape +drives or scanners. Many different devices are supported by the +parallel port IDE subsystem, including: + + MicroSolutions backpack CD-ROM + MicroSolutions backpack PD/CD + MicroSolutions backpack hard-drives + MicroSolutions backpack 8000t tape drive + SyQuest EZ-135, EZ-230 & SparQ drives + Avatar Shark + Imation Superdisk LS-120 + Maxell Superdisk LS-120 + FreeCom Power CD + Hewlett-Packard 5GB and 8GB tape drives + Hewlett-Packard 7100 and 7200 CD-RW drives + +as well as most of the clone and no-name products on the market. + +To support such a wide range of devices, PARIDE, the parallel port IDE +subsystem, is actually structured in three parts. There is a base +paride module which provides a registry and some common methods for +accessing the parallel ports. The second component is a set of +high-level drivers for each of the different types of supported devices: + + pd IDE disk + pcd ATAPI CD-ROM + pf ATAPI disk + pt ATAPI tape + pg ATAPI generic + +(Currently, the pg driver is only used with CD-R drives). + +The high-level drivers function according to the relevant standards. +The third component of PARIDE is a set of low-level protocol drivers +for each of the parallel port IDE adapter chips. Thanks to the interest +and encouragement of Linux users from many parts of the world, +support is available for almost all known adapter protocols: + + aten ATEN EH-100 (HK) + bpck Microsolutions backpack (US) + comm DataStor (old-type) "commuter" adapter (TW) + dstr DataStor EP-2000 (TW) + epat Shuttle EPAT (UK) + epia Shuttle EPIA (UK) + fit2 FIT TD-2000 (US) + fit3 FIT TD-3000 (US) + friq Freecom IQ cable (DE) + frpw Freecom Power (DE) + kbic KingByte KBIC-951A and KBIC-971A (TW) + ktti KT Technology PHd adapter (SG) + on20 OnSpec 90c20 (US) + on26 OnSpec 90c26 (US) + + +2. Using the PARIDE subsystem + +While configuring the Linux kernel, you may choose either to build +the PARIDE drivers into your kernel, or to build them as modules. + +In either case, you will need to select "Parallel port IDE device support" +as well as at least one of the high-level drivers and at least one +of the parallel port communication protocols. If you do not know +what kind of parallel port adapter is used in your drive, you could +begin by checking the file names and any text files on your DOS +installation floppy. Alternatively, you can look at the markings on +the adapter chip itself. That's usually sufficient to identify the +correct device. + +You can actually select all the protocol modules, and allow the PARIDE +subsystem to try them all for you. + +For the "brand-name" products listed above, here are the protocol +and high-level drivers that you would use: + + Manufacturer Model Driver Protocol + + MicroSolutions CD-ROM pcd bpck + MicroSolutions PD drive pf bpck + MicroSolutions hard-drive pd bpck + MicroSolutions 8000t tape pt bpck + SyQuest EZ, SparQ pd epat + Imation Superdisk pf epat + Maxell Superdisk pf friq + Avatar Shark pd epat + FreeCom CD-ROM pcd frpw + Hewlett-Packard 5GB Tape pt epat + Hewlett-Packard 7200e (CD) pcd epat + Hewlett-Packard 7200e (CD-R) pg epat + +2.1 Configuring built-in drivers + +We recommend that you get to know how the drivers work and how to +configure them as loadable modules, before attempting to compile a +kernel with the drivers built-in. + +If you built all of your PARIDE support directly into your kernel, +and you have just a single parallel port IDE device, your kernel should +locate it automatically for you. If you have more than one device, +you may need to give some command line options to your bootloader +(eg: LILO), how to do that is beyond the scope of this document. + +The high-level drivers accept a number of command line parameters, all +of which are documented in the source files in linux/drivers/block/paride. +By default, each driver will automatically try all parallel ports it +can find, and all protocol types that have been installed, until it finds +a parallel port IDE adapter. Once it finds one, the probe stops. So, +if you have more than one device, you will need to tell the drivers +how to identify them. This requires specifying the port address, the +protocol identification number and, for some devices, the drive's +chain ID. While your system is booting, a number of messages are +displayed on the console. Like all such messages, they can be +reviewed with the 'dmesg' command. Among those messages will be +some lines like: + + paride: bpck registered as protocol 0 + paride: epat registered as protocol 1 + +The numbers will always be the same until you build a new kernel with +different protocol selections. You should note these numbers as you +will need them to identify the devices. + +If you happen to be using a MicroSolutions backpack device, you will +also need to know the unit ID number for each drive. This is usually +the last two digits of the drive's serial number (but read MicroSolutions' +documentation about this). + +As an example, let's assume that you have a MicroSolutions PD/CD drive +with unit ID number 36 connected to the parallel port at 0x378, a SyQuest +EZ-135 connected to the chained port on the PD/CD drive and also an +Imation Superdisk connected to port 0x278. You could give the following +options on your boot command: + + pd.drive0=0x378,1 pf.drive0=0x278,1 pf.drive1=0x378,0,36 + +In the last option, pf.drive1 configures device /dev/pf1, the 0x378 +is the parallel port base address, the 0 is the protocol registration +number and 36 is the chain ID. + +Please note: while PARIDE will work both with and without the +PARPORT parallel port sharing system that is included by the +"Parallel port support" option, PARPORT must be included and enabled +if you want to use chains of devices on the same parallel port. + +2.2 Loading and configuring PARIDE as modules + +It is much faster and simpler to get to understand the PARIDE drivers +if you use them as loadable kernel modules. + +Note 1: using these drivers with the "kerneld" automatic module loading +system is not recommended for beginners, and is not documented here. + +Note 2: if you build PARPORT support as a loadable module, PARIDE must +also be built as loadable modules, and PARPORT must be loaded before the +PARIDE modules. + +To use PARIDE, you must begin by + + insmod paride + +this loads a base module which provides a registry for the protocols, +among other tasks. + +Then, load as many of the protocol modules as you think you might need. +As you load each module, it will register the protocols that it supports, +and print a log message to your kernel log file and your console. For +example: + + # insmod epat + paride: epat registered as protocol 0 + # insmod kbic + paride: k951 registered as protocol 1 + paride: k971 registered as protocol 2 + +Finally, you can load high-level drivers for each kind of device that +you have connected. By default, each driver will autoprobe for a single +device, but you can support up to four similar devices by giving their +individual co-ordinates when you load the driver. + +For example, if you had two no-name CD-ROM drives both using the +KingByte KBIC-951A adapter, one on port 0x378 and the other on 0x3bc +you could give the following command: + + # insmod pcd drive0=0x378,1 drive1=0x3bc,1 + +For most adapters, giving a port address and protocol number is sufficient, +but check the source files in linux/drivers/block/paride for more +information. (Hopefully someone will write some man pages one day !). + +As another example, here's what happens when PARPORT is installed, and +a SyQuest EZ-135 is attached to port 0x378: + + # insmod paride + paride: version 1.0 installed + # insmod epat + paride: epat registered as protocol 0 + # insmod pd + pd: pd version 1.0, major 45, cluster 64, nice 0 + pda: Sharing parport1 at 0x378 + pda: epat 1.0, Shuttle EPAT chip c3 at 0x378, mode 5 (EPP-32), delay 1 + pda: SyQuest EZ135A, 262144 blocks [128M], (512/16/32), removable media + pda: pda1 + +Note that the last line is the output from the generic partition table +scanner - in this case it reports that it has found a disk with one partition. + +2.3 Using a PARIDE device + +Once the drivers have been loaded, you can access PARIDE devices in the +same way as their traditional counterparts. You will probably need to +create the device "special files". Here is a simple script that you can +cut to a file and execute: + +#!/bin/bash +# +# mkd -- a script to create the device special files for the PARIDE subsystem +# +function mkdev { + mknod $1 $2 $3 $4 ; chmod 0660 $1 ; chown root:disk $1 +} +# +function pd { + D=$( printf \\$( printf "x%03x" $[ $1 + 97 ] ) ) + mkdev pd$D b 45 $[ $1 * 16 ] + for P in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + do mkdev pd$D$P b 45 $[ $1 * 16 + $P ] + done +} +# +cd /dev +# +for u in 0 1 2 3 ; do pd $u ; done +for u in 0 1 2 3 ; do mkdev pcd$u b 46 $u ; done +for u in 0 1 2 3 ; do mkdev pf$u b 47 $u ; done +for u in 0 1 2 3 ; do mkdev pt$u c 96 $u ; done +for u in 0 1 2 3 ; do mkdev npt$u c 96 $[ $u + 128 ] ; done +for u in 0 1 2 3 ; do mkdev pg$u c 97 $u ; done +# +# end of mkd + +With the device files and drivers in place, you can access PARIDE devices +like any other Linux device. For example, to mount a CD-ROM in pcd0, use: + + mount /dev/pcd0 /cdrom + +If you have a fresh Avatar Shark cartridge, and the drive is pda, you +might do something like: + + fdisk /dev/pda -- make a new partition table with + partition 1 of type 83 + + mke2fs /dev/pda1 -- to build the file system + + mkdir /shark -- make a place to mount the disk + + mount /dev/pda1 /shark + +Devices like the Imation superdisk work in the same way, except that +they do not have a partition table. For example to make a 120MB +floppy that you could share with a DOS system: + + mkdosfs /dev/pf0 + mount /dev/pf0 /mnt + + +2.4 The pf driver + +The pf driver is intended for use with parallel port ATAPI disk +devices. The most common devices in this category are PD drives +and LS-120 drives. Traditionally, media for these devices are not +partitioned. Consequently, the pf driver does not support partitioned +media. This may be changed in a future version of the driver. + +2.5 Using the pt driver + +The pt driver for parallel port ATAPI tape drives is a minimal driver. +It does not yet support many of the standard tape ioctl operations. +For best performance, a block size of 32KB should be used. You will +probably want to set the parallel port delay to 0, if you can. + +2.6 Using the pg driver + +The pg driver can be used in conjunction with the cdrecord program +to create CD-ROMs. Please get cdrecord version 1.6.1 or later +from ftp://ftp.fokus.gmd.de/pub/unix/cdrecord/ . To record CD-R media +your parallel port should ideally be set to EPP mode, and the "port delay" +should be set to 0. With those settings it is possible to record at 2x +speed without any buffer underruns. If you cannot get the driver to work +in EPP mode, try to use "bidirectional" or "PS/2" mode and 1x speeds only. + + +3. Troubleshooting + +3.1 Use EPP mode if you can + +The most common problems that people report with the PARIDE drivers +concern the parallel port CMOS settings. At this time, none of the +PARIDE protocol modules support ECP mode, or any ECP combination modes. +If you are able to do so, please set your parallel port into EPP mode +using your CMOS setup procedure. + +3.2 Check the port delay + +Some parallel ports cannot reliably transfer data at full speed. To +offset the errors, the PARIDE protocol modules introduce a "port +delay" between each access to the i/o ports. Each protocol sets +a default value for this delay. In most cases, the user can override +the default and set it to 0 - resulting in somewhat higher transfer +rates. In some rare cases (especially with older 486 systems) the +default delays are not long enough. if you experience corrupt data +transfers, or unexpected failures, you may wish to increase the +port delay. The delay can be programmed using the "driveN" parameters +to each of the high-level drivers. Please see the notes above, or +read the comments at the beginning of the driver source files in +linux/drivers/block/paride. + +3.3 Some drives need a printer reset + +There appear to be a number of "noname" external drives on the market +that do not always power up correctly. We have noticed this with some +drives based on OnSpec and older Freecom adapters. In these rare cases, +the adapter can often be reinitialised by issuing a "printer reset" on +the parallel port. As the reset operation is potentially disruptive in +multiple device environments, the PARIDE drivers will not do it +automatically. You can however, force a printer reset by doing: + + insmod lp reset=1 + rmmod lp + +If you have one of these marginal cases, you should probably build +your paride drivers as modules, and arrange to do the printer reset +before loading the PARIDE drivers. + +3.4 Use the verbose option and dmesg if you need help + +While a lot of testing has gone into these drivers to make them work +as smoothly as possible, problems will arise. If you do have problems, +please check all the obvious things first: does the drive work in +DOS with the manufacturer's drivers ? If that doesn't yield any useful +clues, then please make sure that only one drive is hooked to your system, +and that either (a) PARPORT is enabled or (b) no other device driver +is using your parallel port (check in /proc/ioports). Then, load the +appropriate drivers (you can load several protocol modules if you want) +as in: + + # insmod paride + # insmod epat + # insmod bpck + # insmod kbic + ... + # insmod pd verbose=1 + +(using the correct driver for the type of device you have, of course). +The verbose=1 parameter will cause the drivers to log a trace of their +activity as they attempt to locate your drive. + +Use 'dmesg' to capture a log of all the PARIDE messages (any messages +beginning with paride:, a protocol module's name or a driver's name) and +include that with your bug report. You can submit a bug report in one +of two ways. Either send it directly to the author of the PARIDE suite, +by e-mail to grant@torque.net, or join the linux-parport mailing list +and post your report there. + +3.5 For more information or help + +You can join the linux-parport mailing list by sending a mail message +to + linux-parport-request@torque.net + +with the single word + + subscribe + +in the body of the mail message (not in the subject line). Please be +sure that your mail program is correctly set up when you do this, as +the list manager is a robot that will subscribe you using the reply +address in your mail headers. REMOVE any anti-spam gimmicks you may +have in your mail headers, when sending mail to the list server. + +You might also find some useful information on the linux-parport +web pages (although they are not always up to date) at + + http://www.torque.net/parport/ + + diff --git a/Documentation/parisc/00-INDEX b/Documentation/parisc/00-INDEX new file mode 100644 index 000000000000..cbd060961f43 --- /dev/null +++ b/Documentation/parisc/00-INDEX @@ -0,0 +1,6 @@ +00-INDEX + - this file. +debugging + - some debugging hints for real-mode code +registers + - current/planned usage of registers diff --git a/Documentation/parisc/debugging b/Documentation/parisc/debugging new file mode 100644 index 000000000000..d728594058e5 --- /dev/null +++ b/Documentation/parisc/debugging @@ -0,0 +1,39 @@ +okay, here are some hints for debugging the lower-level parts of +linux/parisc. + + +1. Absolute addresses + +A lot of the assembly code currently runs in real mode, which means +absolute addresses are used instead of virtual addresses as in the +rest of the kernel. To translate an absolute address to a virtual +address you can lookup in System.map, add __PAGE_OFFSET (0x10000000 +currently). + + +2. HPMCs + +When real-mode code tries to access non-existent memory, you'll get +an HPMC instead of a kernel oops. To debug an HPMC, try to find +the System Responder/Requestor addresses. The System Requestor +address should match (one of the) processor HPAs (high addresses in +the I/O range); the System Responder address is the address real-mode +code tried to access. + +Typical values for the System Responder address are addresses larger +than __PAGE_OFFSET (0x10000000) which mean a virtual address didn't +get translated to a physical address before real-mode code tried to +access it. + + +3. Q bit fun + +Certain, very critical code has to clear the Q bit in the PSW. What +happens when the Q bit is cleared is the CPU does not update the +registers interruption handlers read to find out where the machine +was interrupted - so if you get an interruption between the instruction +that clears the Q bit and the RFI that sets it again you don't know +where exactly it happened. If you're lucky the IAOQ will point to the +instrucion that cleared the Q bit, if you're not it points anywhere +at all. Usually Q bit problems will show themselves in unexplainable +system hangs or running off the end of physical memory. diff --git a/Documentation/parisc/registers b/Documentation/parisc/registers new file mode 100644 index 000000000000..dd3caddd1ad9 --- /dev/null +++ b/Documentation/parisc/registers @@ -0,0 +1,121 @@ +Register Usage for Linux/PA-RISC + +[ an asterisk is used for planned usage which is currently unimplemented ] + + General Registers as specified by ABI + + Control Registers + +CR 0 (Recovery Counter) used for ptrace +CR 1-CR 7(undefined) unused +CR 8 (Protection ID) per-process value* +CR 9, 12, 13 (PIDS) unused +CR10 (CCR) lazy FPU saving* +CR11 as specified by ABI (SAR) +CR14 (interruption vector) initialized to fault_vector +CR15 (EIEM) initialized to all ones* +CR16 (Interval Timer) read for cycle count/write starts Interval Tmr +CR17-CR22 interruption parameters +CR19 Interrupt Instruction Register +CR20 Interrupt Space Register +CR21 Interrupt Offset Register +CR22 Interrupt PSW +CR23 (EIRR) read for pending interrupts/write clears bits +CR24 (TR 0) Kernel Space Page Directory Pointer +CR25 (TR 1) User Space Page Directory Pointer +CR26 (TR 2) not used +CR27 (TR 3) Thread descriptor pointer +CR28 (TR 4) not used +CR29 (TR 5) not used +CR30 (TR 6) current / 0 +CR31 (TR 7) Temporary register, used in various places + + Space Registers (kernel mode) + +SR0 temporary space register +SR4-SR7 set to 0 +SR1 temporary space register +SR2 kernel should not clobber this +SR3 used for userspace accesses (current process) + + Space Registers (user mode) + +SR0 temporary space register +SR1 temporary space register +SR2 holds space of linux gateway page +SR3 holds user address space value while in kernel +SR4-SR7 Defines short address space for user/kernel + + + Processor Status Word + +W (64-bit addresses) 0 +E (Little-endian) 0 +S (Secure Interval Timer) 0 +T (Taken Branch Trap) 0 +H (Higher-privilege trap) 0 +L (Lower-privilege trap) 0 +N (Nullify next instruction) used by C code +X (Data memory break disable) 0 +B (Taken Branch) used by C code +C (code address translation) 1, 0 while executing real-mode code +V (divide step correction) used by C code +M (HPMC mask) 0, 1 while executing HPMC handler* +C/B (carry/borrow bits) used by C code +O (ordered references) 1* +F (performance monitor) 0 +R (Recovery Counter trap) 0 +Q (collect interruption state) 1 (0 in code directly preceding an rfi) +P (Protection Identifiers) 1* +D (Data address translation) 1, 0 while executing real-mode code +I (external interrupt mask) used by cli()/sti() macros + + "Invisible" Registers + +PSW default W value 0 +PSW default E value 0 +Shadow Registers used by interruption handler code +TOC enable bit 1 + +========================================================================= +Register usage notes, originally from John Marvin, with some additional +notes from Randolph Chung. + +For the general registers: + +r1,r2,r19-r26,r28,r29 & r31 can be used without saving them first. And of +course, you need to save them if you care about them, before calling +another procedure. Some of the above registers do have special meanings +that you should be aware of: + + r1: The addil instruction is hardwired to place its result in r1, + so if you use that instruction be aware of that. + + r2: This is the return pointer. In general you don't want to + use this, since you need the pointer to get back to your + caller. However, it is grouped with this set of registers + since the caller can't rely on the value being the same + when you return, i.e. you can copy r2 to another register + and return through that register after trashing r2, and + that should not cause a problem for the calling routine. + + r19-r22: these are generally regarded as temporary registers. + Note that in 64 bit they are arg7-arg4. + + r23-r26: these are arg3-arg0, i.e. you can use them if you + don't care about the values that were passed in anymore. + + r28,r29: are ret0 and ret1. They are what you pass return values + in. r28 is the primary return. When returning small structures + r29 may also be used to pass data back to the caller. + + r30: stack pointer + + r31: the ble instruction puts the return pointer in here. + + +r3-r18,r27,r30 need to be saved and restored. r3-r18 are just + general purpose registers. r27 is the data pointer, and is + used to make references to global variables easier. r30 is + the stack pointer. + diff --git a/Documentation/parport-lowlevel.txt b/Documentation/parport-lowlevel.txt new file mode 100644 index 000000000000..1d40008a1926 --- /dev/null +++ b/Documentation/parport-lowlevel.txt @@ -0,0 +1,1490 @@ +PARPORT interface documentation +------------------------------- + +Time-stamp: <2000-02-24 13:30:20 twaugh> + +Described here are the following functions: + +Global functions: + parport_register_driver + parport_unregister_driver + parport_enumerate + parport_register_device + parport_unregister_device + parport_claim + parport_claim_or_block + parport_release + parport_yield + parport_yield_blocking + parport_wait_peripheral + parport_poll_peripheral + parport_wait_event + parport_negotiate + parport_read + parport_write + parport_open + parport_close + parport_device_id + parport_device_num + parport_device_coords + parport_find_class + parport_find_device + parport_set_timeout + +Port functions (can be overridden by low-level drivers): + SPP: + port->ops->read_data + port->ops->write_data + port->ops->read_status + port->ops->read_control + port->ops->write_control + port->ops->frob_control + port->ops->enable_irq + port->ops->disable_irq + port->ops->data_forward + port->ops->data_reverse + + EPP: + port->ops->epp_write_data + port->ops->epp_read_data + port->ops->epp_write_addr + port->ops->epp_read_addr + + ECP: + port->ops->ecp_write_data + port->ops->ecp_read_data + port->ops->ecp_write_addr + + Other: + port->ops->nibble_read_data + port->ops->byte_read_data + port->ops->compat_write_data + +The parport subsystem comprises 'parport' (the core port-sharing +code), and a variety of low-level drivers that actually do the port +accesses. Each low-level driver handles a particular style of port +(PC, Amiga, and so on). + +The parport interface to the device driver author can be broken down +into global functions and port functions. + +The global functions are mostly for communicating between the device +driver and the parport subsystem: acquiring a list of available ports, +claiming a port for exclusive use, and so on. They also include +'generic' functions for doing standard things that will work on any +IEEE 1284-capable architecture. + +The port functions are provided by the low-level drivers, although the +core parport module provides generic 'defaults' for some routines. +The port functions can be split into three groups: SPP, EPP, and ECP. + +SPP (Standard Parallel Port) functions modify so-called 'SPP' +registers: data, status, and control. The hardware may not actually +have registers exactly like that, but the PC does and this interface is +modelled after common PC implementations. Other low-level drivers may +be able to emulate most of the functionality. + +EPP (Enhanced Parallel Port) functions are provided for reading and +writing in IEEE 1284 EPP mode, and ECP (Extended Capabilities Port) +functions are used for IEEE 1284 ECP mode. (What about BECP? Does +anyone care?) + +Hardware assistance for EPP and/or ECP transfers may or may not be +available, and if it is available it may or may not be used. If +hardware is not used, the transfer will be software-driven. In order +to cope with peripherals that only tenuously support IEEE 1284, a +low-level driver specific function is provided, for altering 'fudge +factors'. + +GLOBAL FUNCTIONS +---------------- + +parport_register_driver - register a device driver with parport +----------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_driver { + const char *name; + void (*attach) (struct parport *); + void (*detach) (struct parport *); + struct parport_driver *next; +}; +int parport_register_driver (struct parport_driver *driver); + +DESCRIPTION + +In order to be notified about parallel ports when they are detected, +parport_register_driver should be called. Your driver will +immediately be notified of all ports that have already been detected, +and of each new port as low-level drivers are loaded. + +A 'struct parport_driver' contains the textual name of your driver, +a pointer to a function to handle new ports, and a pointer to a +function to handle ports going away due to a low-level driver +unloading. Ports will only be detached if they are not being used +(i.e. there are no devices registered on them). + +The visible parts of the 'struct parport *' argument given to +attach/detach are: + +struct parport +{ + struct parport *next; /* next parport in list */ + const char *name; /* port's name */ + unsigned int modes; /* bitfield of hardware modes */ + struct parport_device_info probe_info; + /* IEEE1284 info */ + int number; /* parport index */ + struct parport_operations *ops; + ... +}; + +There are other members of the structure, but they should not be +touched. + +The 'modes' member summarises the capabilities of the underlying +hardware. It consists of flags which may be bitwise-ored together: + + PARPORT_MODE_PCSPP IBM PC registers are available, + i.e. functions that act on data, + control and status registers are + probably writing directly to the + hardware. + PARPORT_MODE_TRISTATE The data drivers may be turned off. + This allows the data lines to be used + for reverse (peripheral to host) + transfers. + PARPORT_MODE_COMPAT The hardware can assist with + compatibility-mode (printer) + transfers, i.e. compat_write_block. + PARPORT_MODE_EPP The hardware can assist with EPP + transfers. + PARPORT_MODE_ECP The hardware can assist with ECP + transfers. + PARPORT_MODE_DMA The hardware can use DMA, so you might + want to pass ISA DMA-able memory + (i.e. memory allocated using the + GFP_DMA flag with kmalloc) to the + low-level driver in order to take + advantage of it. + +There may be other flags in 'modes' as well. + +The contents of 'modes' is advisory only. For example, if the +hardware is capable of DMA, and PARPORT_MODE_DMA is in 'modes', it +doesn't necessarily mean that DMA will always be used when possible. +Similarly, hardware that is capable of assisting ECP transfers won't +necessarily be used. + +RETURN VALUE + +Zero on success, otherwise an error code. + +ERRORS + +None. (Can it fail? Why return int?) + +EXAMPLE + +static void lp_attach (struct parport *port) +{ + ... + private = kmalloc (...); + dev[count++] = parport_register_device (...); + ... +} + +static void lp_detach (struct parport *port) +{ + ... +} + +static struct parport_driver lp_driver = { + "lp", + lp_attach, + lp_detach, + NULL /* always put NULL here */ +}; + +int lp_init (void) +{ + ... + if (parport_register_driver (&lp_driver)) { + /* Failed; nothing we can do. */ + return -EIO; + } + ... +} + +SEE ALSO + +parport_unregister_driver, parport_register_device, parport_enumerate + +parport_unregister_driver - tell parport to forget about this driver +------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_driver { + const char *name; + void (*attach) (struct parport *); + void (*detach) (struct parport *); + struct parport_driver *next; +}; +void parport_unregister_driver (struct parport_driver *driver); + +DESCRIPTION + +This tells parport not to notify the device driver of new ports or of +ports going away. Registered devices belonging to that driver are NOT +unregistered: parport_unregister_device must be used for each one. + +EXAMPLE + +void cleanup_module (void) +{ + ... + /* Stop notifications. */ + parport_unregister_driver (&lp_driver); + + /* Unregister devices. */ + for (i = 0; i < NUM_DEVS; i++) + parport_unregister_device (dev[i]); + ... +} + +SEE ALSO + +parport_register_driver, parport_enumerate + +parport_enumerate - retrieve a list of parallel ports (DEPRECATED) +----------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport *parport_enumerate (void); + +DESCRIPTION + +Retrieve the first of a list of valid parallel ports for this machine. +Successive parallel ports can be found using the 'struct parport +*next' element of the 'struct parport *' that is returned. If 'next' +is NULL, there are no more parallel ports in the list. The number of +ports in the list will not exceed PARPORT_MAX. + +RETURN VALUE + +A 'struct parport *' describing a valid parallel port for the machine, +or NULL if there are none. + +ERRORS + +This function can return NULL to indicate that there are no parallel +ports to use. + +EXAMPLE + +int detect_device (void) +{ + struct parport *port; + + for (port = parport_enumerate (); + port != NULL; + port = port->next) { + /* Try to detect a device on the port... */ + ... + } + } + + ... +} + +NOTES + +parport_enumerate is deprecated; parport_register_driver should be +used instead. + +SEE ALSO + +parport_register_driver, parport_unregister_driver + +parport_register_device - register to use a port +----------------------- + +SYNOPSIS + +#include <linux/parport.h> + +typedef int (*preempt_func) (void *handle); +typedef void (*wakeup_func) (void *handle); +typedef int (*irq_func) (int irq, void *handle, struct pt_regs *); + +struct pardevice *parport_register_device(struct parport *port, + const char *name, + preempt_func preempt, + wakeup_func wakeup, + irq_func irq, + int flags, + void *handle); + +DESCRIPTION + +Use this function to register your device driver on a parallel port +('port'). Once you have done that, you will be able to use +parport_claim and parport_release in order to use the port. + +This function will register three callbacks into your driver: +'preempt', 'wakeup' and 'irq'. Each of these may be NULL in order to +indicate that you do not want a callback. + +When the 'preempt' function is called, it is because another driver +wishes to use the parallel port. The 'preempt' function should return +non-zero if the parallel port cannot be released yet -- if zero is +returned, the port is lost to another driver and the port must be +re-claimed before use. + +The 'wakeup' function is called once another driver has released the +port and no other driver has yet claimed it. You can claim the +parallel port from within the 'wakeup' function (in which case the +claim is guaranteed to succeed), or choose not to if you don't need it +now. + +If an interrupt occurs on the parallel port your driver has claimed, +the 'irq' function will be called. (Write something about shared +interrupts here.) + +The 'handle' is a pointer to driver-specific data, and is passed to +the callback functions. + +'flags' may be a bitwise combination of the following flags: + + Flag Meaning + PARPORT_DEV_EXCL The device cannot share the parallel port at all. + Use this only when absolutely necessary. + +The typedefs are not actually defined -- they are only shown in order +to make the function prototype more readable. + +The visible parts of the returned 'struct pardevice' are: + +struct pardevice { + struct parport *port; /* Associated port */ + void *private; /* Device driver's 'handle' */ + ... +}; + +RETURN VALUE + +A 'struct pardevice *': a handle to the registered parallel port +device that can be used for parport_claim, parport_release, etc. + +ERRORS + +A return value of NULL indicates that there was a problem registering +a device on that port. + +EXAMPLE + +static int preempt (void *handle) +{ + if (busy_right_now) + return 1; + + must_reclaim_port = 1; + return 0; +} + +static void wakeup (void *handle) +{ + struct toaster *private = handle; + struct pardevice *dev = private->dev; + if (!dev) return; /* avoid races */ + + if (want_port) + parport_claim (dev); +} + +static int toaster_detect (struct toaster *private, struct parport *port) +{ + private->dev = parport_register_device (port, "toaster", preempt, + wakeup, NULL, 0, + private); + if (!private->dev) + /* Couldn't register with parport. */ + return -EIO; + + must_reclaim_port = 0; + busy_right_now = 1; + parport_claim_or_block (private->dev); + ... + /* Don't need the port while the toaster warms up. */ + busy_right_now = 0; + ... + busy_right_now = 1; + if (must_reclaim_port) { + parport_claim_or_block (private->dev); + must_reclaim_port = 0; + } + ... +} + +SEE ALSO + +parport_unregister_device, parport_claim + +parport_unregister_device - finish using a port +------------------------- + +SYNPOPSIS + +#include <linux/parport.h> + +void parport_unregister_device (struct pardevice *dev); + +DESCRIPTION + +This function is the opposite of parport_register_device. After using +parport_unregister_device, 'dev' is no longer a valid device handle. + +You should not unregister a device that is currently claimed, although +if you do it will be released automatically. + +EXAMPLE + + ... + kfree (dev->private); /* before we lose the pointer */ + parport_unregister_device (dev); + ... + +SEE ALSO + +parport_unregister_driver + +parport_claim, parport_claim_or_block - claim the parallel port for a device +------------------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +int parport_claim (struct pardevice *dev); +int parport_claim_or_block (struct pardevice *dev); + +DESCRIPTION + +These functions attempt to gain control of the parallel port on which +'dev' is registered. 'parport_claim' does not block, but +'parport_claim_or_block' may do. (Put something here about blocking +interruptibly or non-interruptibly.) + +You should not try to claim a port that you have already claimed. + +RETURN VALUE + +A return value of zero indicates that the port was successfully +claimed, and the caller now has possession of the parallel port. + +If 'parport_claim_or_block' blocks before returning successfully, the +return value is positive. + +ERRORS + + -EAGAIN The port is unavailable at the moment, but another attempt + to claim it may succeed. + +SEE ALSO + +parport_release + +parport_release - release the parallel port +--------------- + +SYNOPSIS + +#include <linux/parport.h> + +void parport_release (struct pardevice *dev); + +DESCRIPTION + +Once a parallel port device has been claimed, it can be released using +'parport_release'. It cannot fail, but you should not release a +device that you do not have possession of. + +EXAMPLE + +static size_t write (struct pardevice *dev, const void *buf, + size_t len) +{ + ... + written = dev->port->ops->write_ecp_data (dev->port, buf, + len); + parport_release (dev); + ... +} + + +SEE ALSO + +change_mode, parport_claim, parport_claim_or_block, parport_yield + +parport_yield, parport_yield_blocking - temporarily release a parallel port +------------------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +int parport_yield (struct pardevice *dev) +int parport_yield_blocking (struct pardevice *dev); + +DESCRIPTION + +When a driver has control of a parallel port, it may allow another +driver to temporarily 'borrow' it. 'parport_yield' does not block; +'parport_yield_blocking' may do. + +RETURN VALUE + +A return value of zero indicates that the caller still owns the port +and the call did not block. + +A positive return value from 'parport_yield_blocking' indicates that +the caller still owns the port and the call blocked. + +A return value of -EAGAIN indicates that the caller no longer owns the +port, and it must be re-claimed before use. + +ERRORS + + -EAGAIN Ownership of the parallel port was given away. + +SEE ALSO + +parport_release + +parport_wait_peripheral - wait for status lines, up to 35ms +----------------------- + +SYNOPSIS + +#include <linux/parport.h> + +int parport_wait_peripheral (struct parport *port, + unsigned char mask, + unsigned char val); + +DESCRIPTION + +Wait for the status lines in mask to match the values in val. + +RETURN VALUE + + -EINTR a signal is pending + 0 the status lines in mask have values in val + 1 timed out while waiting (35ms elapsed) + +SEE ALSO + +parport_poll_peripheral + +parport_poll_peripheral - wait for status lines, in usec +----------------------- + +SYNOPSIS + +#include <linux/parport.h> + +int parport_poll_peripheral (struct parport *port, + unsigned char mask, + unsigned char val, + int usec); + +DESCRIPTION + +Wait for the status lines in mask to match the values in val. + +RETURN VALUE + + -EINTR a signal is pending + 0 the status lines in mask have values in val + 1 timed out while waiting (usec microseconds have elapsed) + +SEE ALSO + +parport_wait_peripheral + +parport_wait_event - wait for an event on a port +------------------ + +SYNOPSIS + +#include <linux/parport.h> + +int parport_wait_event (struct parport *port, signed long timeout) + +DESCRIPTION + +Wait for an event (e.g. interrupt) on a port. The timeout is in +jiffies. + +RETURN VALUE + + 0 success + <0 error (exit as soon as possible) + >0 timed out + +parport_negotiate - perform IEEE 1284 negotiation +----------------- + +SYNOPSIS + +#include <linux/parport.h> + +int parport_negotiate (struct parport *, int mode); + +DESCRIPTION + +Perform IEEE 1284 negotiation. + +RETURN VALUE + + 0 handshake OK; IEEE 1284 peripheral and mode available + -1 handshake failed; peripheral not compliant (or none present) + 1 handshake OK; IEEE 1284 peripheral present but mode not + available + +SEE ALSO + +parport_read, parport_write + +parport_read - read data from device +------------ + +SYNOPSIS + +#include <linux/parport.h> + +ssize_t parport_read (struct parport *, void *buf, size_t len); + +DESCRIPTION + +Read data from device in current IEEE 1284 transfer mode. This only +works for modes that support reverse data transfer. + +RETURN VALUE + +If negative, an error code; otherwise the number of bytes transferred. + +SEE ALSO + +parport_write, parport_negotiate + +parport_write - write data to device +------------- + +SYNOPSIS + +#include <linux/parport.h> + +ssize_t parport_write (struct parport *, const void *buf, size_t len); + +DESCRIPTION + +Write data to device in current IEEE 1284 transfer mode. This only +works for modes that support forward data transfer. + +RETURN VALUE + +If negative, an error code; otherwise the number of bytes transferred. + +SEE ALSO + +parport_read, parport_negotiate + +parport_open - register device for particular device number +------------ + +SYNOPSIS + +#include <linux/parport.h> + +struct pardevice *parport_open (int devnum, const char *name, + int (*pf) (void *), + void (*kf) (void *), + void (*irqf) (int, void *, + struct pt_regs *), + int flags, void *handle); + +DESCRIPTION + +This is like parport_register_device but takes a device number instead +of a pointer to a struct parport. + +RETURN VALUE + +See parport_register_device. If no device is associated with devnum, +NULL is returned. + +SEE ALSO + +parport_register_device, parport_device_num + +parport_close - unregister device for particular device number +------------- + +SYNOPSIS + +#include <linux/parport.h> + +void parport_close (struct pardevice *dev); + +DESCRIPTION + +This is the equivalent of parport_unregister_device for parport_open. + +SEE ALSO + +parport_unregister_device, parport_open + +parport_device_id - obtain IEEE 1284 Device ID +----------------- + +SYNOPSIS + +#include <linux/parport.h> + +ssize_t parport_device_id (int devnum, char *buffer, size_t len); + +DESCRIPTION + +Obtains the IEEE 1284 Device ID associated with a given device. + +RETURN VALUE + +If negative, an error code; otherwise, the number of bytes of buffer +that contain the device ID. The format of the device ID is as +follows: + +[length][ID] + +The first two bytes indicate the inclusive length of the entire Device +ID, and are in big-endian order. The ID is a sequence of pairs of the +form: + +key:value; + +NOTES + +Many devices have ill-formed IEEE 1284 Device IDs. + +SEE ALSO + +parport_find_class, parport_find_device, parport_device_num + +parport_device_num - convert device coordinates to device number +------------------ + +SYNOPSIS + +#include <linux/parport.h> + +int parport_device_num (int parport, int mux, int daisy); + +DESCRIPTION + +Convert between device coordinates (port, multiplexor, daisy chain +address) and device number (zero-based). + +RETURN VALUE + +Device number, or -1 if no device at given coordinates. + +SEE ALSO + +parport_device_coords, parport_open, parport_device_id + +parport_device_coords - convert device number to device coordinates +------------------ + +SYNOPSIS + +#include <linux/parport.h> + +int parport_device_coords (int devnum, int *parport, int *mux, + int *daisy); + +DESCRIPTION + +Convert between device number (zero-based) and device coordinates +(port, multiplexor, daisy chain address). + +RETURN VALUE + +Zero on success, in which case the coordinates are (*parport, *mux, +*daisy). + +SEE ALSO + +parport_device_num, parport_open, parport_device_id + +parport_find_class - find a device by its class +------------------ + +SYNOPSIS + +#include <linux/parport.h> + +typedef enum { + PARPORT_CLASS_LEGACY = 0, /* Non-IEEE1284 device */ + PARPORT_CLASS_PRINTER, + PARPORT_CLASS_MODEM, + PARPORT_CLASS_NET, + PARPORT_CLASS_HDC, /* Hard disk controller */ + PARPORT_CLASS_PCMCIA, + PARPORT_CLASS_MEDIA, /* Multimedia device */ + PARPORT_CLASS_FDC, /* Floppy disk controller */ + PARPORT_CLASS_PORTS, + PARPORT_CLASS_SCANNER, + PARPORT_CLASS_DIGCAM, + PARPORT_CLASS_OTHER, /* Anything else */ + PARPORT_CLASS_UNSPEC, /* No CLS field in ID */ + PARPORT_CLASS_SCSIADAPTER +} parport_device_class; + +int parport_find_class (parport_device_class cls, int from); + +DESCRIPTION + +Find a device by class. The search starts from device number from+1. + +RETURN VALUE + +The device number of the next device in that class, or -1 if no such +device exists. + +NOTES + +Example usage: + +int devnum = -1; +while ((devnum = parport_find_class (PARPORT_CLASS_DIGCAM, devnum)) != -1) { + struct pardevice *dev = parport_open (devnum, ...); + ... +} + +SEE ALSO + +parport_find_device, parport_open, parport_device_id + +parport_find_device - find a device by its class +------------------ + +SYNOPSIS + +#include <linux/parport.h> + +int parport_find_device (const char *mfg, const char *mdl, int from); + +DESCRIPTION + +Find a device by vendor and model. The search starts from device +number from+1. + +RETURN VALUE + +The device number of the next device matching the specifications, or +-1 if no such device exists. + +NOTES + +Example usage: + +int devnum = -1; +while ((devnum = parport_find_device ("IOMEGA", "ZIP+", devnum)) != -1) { + struct pardevice *dev = parport_open (devnum, ...); + ... +} + +SEE ALSO + +parport_find_class, parport_open, parport_device_id + +parport_set_timeout - set the inactivity timeout +------------------- + +SYNOPSIS + +#include <linux/parport.h> + +long parport_set_timeout (struct pardevice *dev, long inactivity); + +DESCRIPTION + +Set the inactivity timeout, in jiffies, for a registered device. The +previous timeout is returned. + +RETURN VALUE + +The previous timeout, in jiffies. + +NOTES + +Some of the port->ops functions for a parport may take time, owing to +delays at the peripheral. After the peripheral has not responded for +'inactivity' jiffies, a timeout will occur and the blocking function +will return. + +A timeout of 0 jiffies is a special case: the function must do as much +as it can without blocking or leaving the hardware in an unknown +state. If port operations are performed from within an interrupt +handler, for instance, a timeout of 0 jiffies should be used. + +Once set for a registered device, the timeout will remain at the set +value until set again. + +SEE ALSO + +port->ops->xxx_read/write_yyy + +PORT FUNCTIONS +-------------- + +The functions in the port->ops structure (struct parport_operations) +are provided by the low-level driver responsible for that port. + +port->ops->read_data - read the data register +-------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + unsigned char (*read_data) (struct parport *port); + ... +}; + +DESCRIPTION + +If port->modes contains the PARPORT_MODE_TRISTATE flag and the +PARPORT_CONTROL_DIRECTION bit in the control register is set, this +returns the value on the data pins. If port->modes contains the +PARPORT_MODE_TRISTATE flag and the PARPORT_CONTROL_DIRECTION bit is +not set, the return value _may_ be the last value written to the data +register. Otherwise the return value is undefined. + +SEE ALSO + +write_data, read_status, write_control + +port->ops->write_data - write the data register +--------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + void (*write_data) (struct parport *port, unsigned char d); + ... +}; + +DESCRIPTION + +Writes to the data register. May have side-effects (a STROBE pulse, +for instance). + +SEE ALSO + +read_data, read_status, write_control + +port->ops->read_status - read the status register +---------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + unsigned char (*read_status) (struct parport *port); + ... +}; + +DESCRIPTION + +Reads from the status register. This is a bitmask: + +- PARPORT_STATUS_ERROR (printer fault, "nFault") +- PARPORT_STATUS_SELECT (on-line, "Select") +- PARPORT_STATUS_PAPEROUT (no paper, "PError") +- PARPORT_STATUS_ACK (handshake, "nAck") +- PARPORT_STATUS_BUSY (busy, "Busy") + +There may be other bits set. + +SEE ALSO + +read_data, write_data, write_control + +port->ops->read_control - read the control register +----------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + unsigned char (*read_control) (struct parport *port); + ... +}; + +DESCRIPTION + +Returns the last value written to the control register (either from +write_control or frob_control). No port access is performed. + +SEE ALSO + +read_data, write_data, read_status, write_control + +port->ops->write_control - write the control register +------------------------ + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + void (*write_status) (struct parport *port, unsigned char s); + ... +}; + +DESCRIPTION + +Writes to the control register. This is a bitmask: + _______ +- PARPORT_CONTROL_STROBE (nStrobe) + _______ +- PARPORT_CONTROL_AUTOFD (nAutoFd) + _____ +- PARPORT_CONTROL_INIT (nInit) + _________ +- PARPORT_CONTROL_SELECT (nSelectIn) + +SEE ALSO + +read_data, write_data, read_status, frob_control + +port->ops->frob_control - write control register bits +----------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + void (*frob_control) (struct parport *port, + unsigned char mask, + unsigned char val); + ... +}; + +DESCRIPTION + +This is equivalent to reading from the control register, masking out +the bits in mask, exclusive-or'ing with the bits in val, and writing +the result to the control register. + +As some ports don't allow reads from the control port, a software copy +of its contents is maintained, so frob_control is in fact only one +port access. + +SEE ALSO + +read_data, write_data, read_status, write_control + +port->ops->enable_irq - enable interrupt generation +--------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + void (*enable_irq) (struct parport *port); + ... +}; + +DESCRIPTION + +The parallel port hardware is instructed to generate interrupts at +appropriate moments, although those moments are +architecture-specific. For the PC architecture, interrupts are +commonly generated on the rising edge of nAck. + +SEE ALSO + +disable_irq + +port->ops->disable_irq - disable interrupt generation +---------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + void (*disable_irq) (struct parport *port); + ... +}; + +DESCRIPTION + +The parallel port hardware is instructed not to generate interrupts. +The interrupt itself is not masked. + +SEE ALSO + +enable_irq + +port->ops->data_forward - enable data drivers +----------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + void (*data_forward) (struct parport *port); + ... +}; + +DESCRIPTION + +Enables the data line drivers, for 8-bit host-to-peripheral +communications. + +SEE ALSO + +data_reverse + +port->ops->data_reverse - tristate the buffer +----------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + void (*data_reverse) (struct parport *port); + ... +}; + +DESCRIPTION + +Places the data bus in a high impedance state, if port->modes has the +PARPORT_MODE_TRISTATE bit set. + +SEE ALSO + +data_forward + +port->ops->epp_write_data - write EPP data +------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*epp_write_data) (struct parport *port, const void *buf, + size_t len, int flags); + ... +}; + +DESCRIPTION + +Writes data in EPP mode, and returns the number of bytes written. + +The 'flags' parameter may be one or more of the following, +bitwise-or'ed together: + +PARPORT_EPP_FAST Use fast transfers. Some chips provide 16-bit and + 32-bit registers. However, if a transfer + times out, the return value may be unreliable. + +SEE ALSO + +epp_read_data, epp_write_addr, epp_read_addr + +port->ops->epp_read_data - read EPP data +------------------------ + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*epp_read_data) (struct parport *port, void *buf, + size_t len, int flags); + ... +}; + +DESCRIPTION + +Reads data in EPP mode, and returns the number of bytes read. + +The 'flags' parameter may be one or more of the following, +bitwise-or'ed together: + +PARPORT_EPP_FAST Use fast transfers. Some chips provide 16-bit and + 32-bit registers. However, if a transfer + times out, the return value may be unreliable. + +SEE ALSO + +epp_write_data, epp_write_addr, epp_read_addr + +port->ops->epp_write_addr - write EPP address +------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*epp_write_addr) (struct parport *port, + const void *buf, size_t len, int flags); + ... +}; + +DESCRIPTION + +Writes EPP addresses (8 bits each), and returns the number written. + +The 'flags' parameter may be one or more of the following, +bitwise-or'ed together: + +PARPORT_EPP_FAST Use fast transfers. Some chips provide 16-bit and + 32-bit registers. However, if a transfer + times out, the return value may be unreliable. + +(Does PARPORT_EPP_FAST make sense for this function?) + +SEE ALSO + +epp_write_data, epp_read_data, epp_read_addr + +port->ops->epp_read_addr - read EPP address +------------------------ + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*epp_read_addr) (struct parport *port, void *buf, + size_t len, int flags); + ... +}; + +DESCRIPTION + +Reads EPP addresses (8 bits each), and returns the number read. + +The 'flags' parameter may be one or more of the following, +bitwise-or'ed together: + +PARPORT_EPP_FAST Use fast transfers. Some chips provide 16-bit and + 32-bit registers. However, if a transfer + times out, the return value may be unreliable. + +(Does PARPORT_EPP_FAST make sense for this function?) + +SEE ALSO + +epp_write_data, epp_read_data, epp_write_addr + +port->ops->ecp_write_data - write a block of ECP data +------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*ecp_write_data) (struct parport *port, + const void *buf, size_t len, int flags); + ... +}; + +DESCRIPTION + +Writes a block of ECP data. The 'flags' parameter is ignored. + +RETURN VALUE + +The number of bytes written. + +SEE ALSO + +ecp_read_data, ecp_write_addr + +port->ops->ecp_read_data - read a block of ECP data +------------------------ + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*ecp_read_data) (struct parport *port, + void *buf, size_t len, int flags); + ... +}; + +DESCRIPTION + +Reads a block of ECP data. The 'flags' parameter is ignored. + +RETURN VALUE + +The number of bytes read. NB. There may be more unread data in a +FIFO. Is there a way of stunning the FIFO to prevent this? + +SEE ALSO + +ecp_write_block, ecp_write_addr + +port->ops->ecp_write_addr - write a block of ECP addresses +------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*ecp_write_addr) (struct parport *port, + const void *buf, size_t len, int flags); + ... +}; + +DESCRIPTION + +Writes a block of ECP addresses. The 'flags' parameter is ignored. + +RETURN VALUE + +The number of bytes written. + +NOTES + +This may use a FIFO, and if so shall not return until the FIFO is empty. + +SEE ALSO + +ecp_read_data, ecp_write_data + +port->ops->nibble_read_data - read a block of data in nibble mode +--------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*nibble_read_data) (struct parport *port, + void *buf, size_t len, int flags); + ... +}; + +DESCRIPTION + +Reads a block of data in nibble mode. The 'flags' parameter is ignored. + +RETURN VALUE + +The number of whole bytes read. + +SEE ALSO + +byte_read_data, compat_write_data + +port->ops->byte_read_data - read a block of data in byte mode +------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*byte_read_data) (struct parport *port, + void *buf, size_t len, int flags); + ... +}; + +DESCRIPTION + +Reads a block of data in byte mode. The 'flags' parameter is ignored. + +RETURN VALUE + +The number of bytes read. + +SEE ALSO + +nibble_read_data, compat_write_data + +port->ops->compat_write_data - write a block of data in compatibility mode +---------------------------- + +SYNOPSIS + +#include <linux/parport.h> + +struct parport_operations { + ... + size_t (*compat_write_data) (struct parport *port, + const void *buf, size_t len, int flags); + ... +}; + +DESCRIPTION + +Writes a block of data in compatibility mode. The 'flags' parameter +is ignored. + +RETURN VALUE + +The number of bytes written. + +SEE ALSO + +nibble_read_data, byte_read_data diff --git a/Documentation/parport.txt b/Documentation/parport.txt new file mode 100644 index 000000000000..93a7ceef398d --- /dev/null +++ b/Documentation/parport.txt @@ -0,0 +1,268 @@ +The `parport' code provides parallel-port support under Linux. This +includes the ability to share one port between multiple device +drivers. + +You can pass parameters to the parport code to override its automatic +detection of your hardware. This is particularly useful if you want +to use IRQs, since in general these can't be autoprobed successfully. +By default IRQs are not used even if they _can_ be probed. This is +because there are a lot of people using the same IRQ for their +parallel port and a sound card or network card. + +The parport code is split into two parts: generic (which deals with +port-sharing) and architecture-dependent (which deals with actually +using the port). + + +Parport as modules +================== + +If you load the parport code as a module, say + + # insmod parport + +to load the generic parport code. You then must load the +architecture-dependent code with (for example): + + # insmod parport_pc io=0x3bc,0x378,0x278 irq=none,7,auto + +to tell the parport code that you want three PC-style ports, one at +0x3bc with no IRQ, one at 0x378 using IRQ 7, and one at 0x278 with an +auto-detected IRQ. Currently, PC-style (parport_pc), Sun `bpp', +Amiga, Atari, and MFC3 hardware is supported. + +PCI parallel I/O card support comes from parport_pc. Base I/O +addresses should not be specified for supported PCI cards since they +are automatically detected. + + +KMod +---- + +If you use kmod, you will find it useful to edit /etc/modprobe.conf. +Here is an example of the lines that need to be added: + + alias parport_lowlevel parport_pc + options parport_pc io=0x378,0x278 irq=7,auto + +KMod will then automatically load parport_pc (with the options +"io=0x378,0x278 irq=7,auto") whenever a parallel port device driver +(such as lp) is loaded. + +Note that these are example lines only! You shouldn't in general need +to specify any options to parport_pc in order to be able to use a +parallel port. + + +Parport probe [optional] +------------- + +In 2.2 kernels there was a module called parport_probe, which was used +for collecting IEEE 1284 device ID information. This has now been +enhanced and now lives with the IEEE 1284 support. When a parallel +port is detected, the devices that are connected to it are analysed, +and information is logged like this: + + parport0: Printer, BJC-210 (Canon) + +The probe information is available from files in /proc/sys/dev/parport/. + + +Parport linked into the kernel statically +========================================= + +If you compile the parport code into the kernel, then you can use +kernel boot parameters to get the same effect. Add something like the +following to your LILO command line: + + parport=0x3bc parport=0x378,7 parport=0x278,auto,nofifo + +You can have many `parport=...' statements, one for each port you want +to add. Adding `parport=0' to the kernel command-line will disable +parport support entirely. Adding `parport=auto' to the kernel +command-line will make parport use any IRQ lines or DMA channels that +it auto-detects. + + +Files in /proc +============== + +If you have configured the /proc filesystem into your kernel, you will +see a new directory entry: /proc/sys/dev/parport. In there will be a +directory entry for each parallel port for which parport is +configured. In each of those directories are a collection of files +describing that parallel port. + +The /proc/sys/dev/parport directory tree looks like: + +parport +|-- default +| |-- spintime +| `-- timeslice +|-- parport0 +| |-- autoprobe +| |-- autoprobe0 +| |-- autoprobe1 +| |-- autoprobe2 +| |-- autoprobe3 +| |-- devices +| | |-- active +| | `-- lp +| | `-- timeslice +| |-- base-addr +| |-- irq +| |-- dma +| |-- modes +| `-- spintime +`-- parport1 + |-- autoprobe + |-- autoprobe0 + |-- autoprobe1 + |-- autoprobe2 + |-- autoprobe3 + |-- devices + | |-- active + | `-- ppa + | `-- timeslice + |-- base-addr + |-- irq + |-- dma + |-- modes + `-- spintime + + +File: Contents: + +devices/active A list of the device drivers using that port. A "+" + will appear by the name of the device currently using + the port (it might not appear against any). The + string "none" means that there are no device drivers + using that port. + +base-addr Parallel port's base address, or addresses if the port + has more than one in which case they are separated + with tabs. These values might not have any sensible + meaning for some ports. + +irq Parallel port's IRQ, or -1 if none is being used. + +dma Parallel port's DMA channel, or -1 if none is being + used. + +modes Parallel port's hardware modes, comma-separated, + meaning: + + PCSPP PC-style SPP registers are available. + TRISTATE Port is bidirectional. + COMPAT Hardware acceleration for printers is + available and will be used. + EPP Hardware acceleration for EPP protocol + is available and will be used. + ECP Hardware acceleration for ECP protocol + is available and will be used. + DMA DMA is available and will be used. + + Note that the current implementation will only take + advantage of COMPAT and ECP modes if it has an IRQ + line to use. + +autoprobe Any IEEE-1284 device ID information that has been + acquired from the (non-IEEE 1284.3) device. + +autoprobe[0-3] IEEE 1284 device ID information retrieved from + daisy-chain devices that conform to IEEE 1284.3. + +spintime The number of microseconds to busy-loop while waiting + for the peripheral to respond. You might find that + adjusting this improves performance, depending on your + peripherals. This is a port-wide setting, i.e. it + applies to all devices on a particular port. + +timeslice The number of milliseconds that a device driver is + allowed to keep a port claimed for. This is advisory, + and driver can ignore it if it must. + +default/* The defaults for spintime and timeslice. When a new + port is registered, it picks up the default spintime. + When a new device is registered, it picks up the + default timeslice. + +Device drivers +============== + +Once the parport code is initialised, you can attach device drivers to +specific ports. Normally this happens automatically; if the lp driver +is loaded it will create one lp device for each port found. You can +override this, though, by using parameters either when you load the lp +driver: + + # insmod lp parport=0,2 + +or on the LILO command line: + + lp=parport0 lp=parport2 + +Both the above examples would inform lp that you want /dev/lp0 to be +the first parallel port, and /dev/lp1 to be the _third_ parallel port, +with no lp device associated with the second port (parport1). Note +that this is different to the way older kernels worked; there used to +be a static association between the I/O port address and the device +name, so /dev/lp0 was always the port at 0x3bc. This is no longer the +case - if you only have one port, it will default to being /dev/lp0, +regardless of base address. + +Also: + + * If you selected the IEEE 1284 support at compile time, you can say + `lp=auto' on the kernel command line, and lp will create devices + only for those ports that seem to have printers attached. + + * If you give PLIP the `timid' parameter, either with `plip=timid' on + the command line, or with `insmod plip timid=1' when using modules, + it will avoid any ports that seem to be in use by other devices. + + * IRQ autoprobing works only for a few port types at the moment. + +Reporting printer problems with parport +======================================= + +If you are having problems printing, please go through these steps to +try to narrow down where the problem area is. + +When reporting problems with parport, really you need to give all of +the messages that parport_pc spits out when it initialises. There are +several code paths: + +o polling +o interrupt-driven, protocol in software +o interrupt-driven, protocol in hardware using PIO +o interrupt-driven, protocol in hardware using DMA + +The kernel messages that parport_pc logs give an indication of which +code path is being used. (They could be a lot better actually..) + +For normal printer protocol, having IEEE 1284 modes enabled or not +should not make a difference. + +To turn off the 'protocol in hardware' code paths, disable +CONFIG_PARPORT_PC_FIFO. Note that when they are enabled they are not +necessarily _used_; it depends on whether the hardware is available, +enabled by the BIOS, and detected by the driver. + +So, to start with, disable CONFIG_PARPORT_PC_FIFO, and load parport_pc +with 'irq=none'. See if printing works then. It really should, +because this is the simplest code path. + +If that works fine, try with 'io=0x378 irq=7' (adjust for your +hardware), to make it use interrupt-driven in-software protocol. + +If _that_ works fine, then one of the hardware modes isn't working +right. Enable CONFIG_PARPORT_PC_FIFO (no, it isn't a module option, +and yes, it should be), set the port to ECP mode in the BIOS and note +the DMA channel, and try with: + + io=0x378 irq=7 dma=none (for PIO) + io=0x378 irq=7 dma=3 (for DMA) +-- +philb@gnu.org +tim@cyberelk.net diff --git a/Documentation/pci.txt b/Documentation/pci.txt new file mode 100644 index 000000000000..67514bf87ccd --- /dev/null +++ b/Documentation/pci.txt @@ -0,0 +1,284 @@ + How To Write Linux PCI Drivers + + by Martin Mares <mj@ucw.cz> on 07-Feb-2000 + +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The world of PCI is vast and it's full of (mostly unpleasant) surprises. +Different PCI devices have different requirements and different bugs -- +because of this, the PCI support layer in Linux kernel is not as trivial +as one would wish. This short pamphlet tries to help all potential driver +authors find their way through the deep forests of PCI handling. + + +0. Structure of PCI drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +There exist two kinds of PCI drivers: new-style ones (which leave most of +probing for devices to the PCI layer and support online insertion and removal +of devices [thus supporting PCI, hot-pluggable PCI and CardBus in a single +driver]) and old-style ones which just do all the probing themselves. Unless +you have a very good reason to do so, please don't use the old way of probing +in any new code. After the driver finds the devices it wishes to operate +on (either the old or the new way), it needs to perform the following steps: + + Enable the device + Access device configuration space + Discover resources (addresses and IRQ numbers) provided by the device + Allocate these resources + Communicate with the device + Disable the device + +Most of these topics are covered by the following sections, for the rest +look at <linux/pci.h>, it's hopefully well commented. + +If the PCI subsystem is not configured (CONFIG_PCI is not set), most of +the functions described below are defined as inline functions either completely +empty or just returning an appropriate error codes to avoid lots of ifdefs +in the drivers. + + +1. New-style drivers +~~~~~~~~~~~~~~~~~~~~ +The new-style drivers just call pci_register_driver during their initialization +with a pointer to a structure describing the driver (struct pci_driver) which +contains: + + name Name of the driver + id_table Pointer to table of device ID's the driver is + interested in. Most drivers should export this + table using MODULE_DEVICE_TABLE(pci,...). + probe Pointer to a probing function which gets called (during + execution of pci_register_driver for already existing + devices or later if a new device gets inserted) for all + PCI devices which match the ID table and are not handled + by the other drivers yet. This function gets passed a + pointer to the pci_dev structure representing the device + and also which entry in the ID table did the device + match. It returns zero when the driver has accepted the + device or an error code (negative number) otherwise. + This function always gets called from process context, + so it can sleep. + remove Pointer to a function which gets called whenever a + device being handled by this driver is removed (either + during deregistration of the driver or when it's + manually pulled out of a hot-pluggable slot). This + function always gets called from process context, so it + can sleep. + save_state Save a device's state before it's suspend. + suspend Put device into low power state. + resume Wake device from low power state. + enable_wake Enable device to generate wake events from a low power + state. + + (Please see Documentation/power/pci.txt for descriptions + of PCI Power Management and the related functions) + +The ID table is an array of struct pci_device_id ending with a all-zero entry. +Each entry consists of: + + vendor, device Vendor and device ID to match (or PCI_ANY_ID) + subvendor, Subsystem vendor and device ID to match (or PCI_ANY_ID) + subdevice + class, Device class to match. The class_mask tells which bits + class_mask of the class are honored during the comparison. + driver_data Data private to the driver. + +Most drivers don't need to use the driver_data field. Best practice +for use of driver_data is to use it as an index into a static list of +equivalant device types, not to use it as a pointer. + +Have a table entry {PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID} +to have probe() called for every PCI device known to the system. + +New PCI IDs may be added to a device driver at runtime by writing +to the file /sys/bus/pci/drivers/{driver}/new_id. When added, the +driver will probe for all devices it can support. + +echo "vendor device subvendor subdevice class class_mask driver_data" > \ + /sys/bus/pci/drivers/{driver}/new_id +where all fields are passed in as hexadecimal values (no leading 0x). +Users need pass only as many fields as necessary; vendor, device, +subvendor, and subdevice fields default to PCI_ANY_ID (FFFFFFFF), +class and classmask fields default to 0, and driver_data defaults to +0UL. Device drivers must initialize use_driver_data in the dynids struct +in their pci_driver struct prior to calling pci_register_driver in order +for the driver_data field to get passed to the driver. Otherwise, only a +0 is passed in that field. + +When the driver exits, it just calls pci_unregister_driver() and the PCI layer +automatically calls the remove hook for all devices handled by the driver. + +Please mark the initialization and cleanup functions where appropriate +(the corresponding macros are defined in <linux/init.h>): + + __init Initialization code. Thrown away after the driver + initializes. + __exit Exit code. Ignored for non-modular drivers. + __devinit Device initialization code. Identical to __init if + the kernel is not compiled with CONFIG_HOTPLUG, normal + function otherwise. + __devexit The same for __exit. + +Tips: + The module_init()/module_exit() functions (and all initialization + functions called only from these) should be marked __init/exit. + The struct pci_driver shouldn't be marked with any of these tags. + The ID table array should be marked __devinitdata. + The probe() and remove() functions (and all initialization + functions called only from these) should be marked __devinit/exit. + If you are sure the driver is not a hotplug driver then use only + __init/exit __initdata/exitdata. + + Pointers to functions marked as __devexit must be created using + __devexit_p(function_name). That will generate the function + name or NULL if the __devexit function will be discarded. + + +2. How to find PCI devices manually (the old style) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +PCI drivers not using the pci_register_driver() interface search +for PCI devices manually using the following constructs: + +Searching by vendor and device ID: + + struct pci_dev *dev = NULL; + while (dev = pci_get_device(VENDOR_ID, DEVICE_ID, dev)) + configure_device(dev); + +Searching by class ID (iterate in a similar way): + + pci_get_class(CLASS_ID, dev) + +Searching by both vendor/device and subsystem vendor/device ID: + + pci_get_subsys(VENDOR_ID, DEVICE_ID, SUBSYS_VENDOR_ID, SUBSYS_DEVICE_ID, dev). + + You can use the constant PCI_ANY_ID as a wildcard replacement for +VENDOR_ID or DEVICE_ID. This allows searching for any device from a +specific vendor, for example. + + These functions are hotplug-safe. They increment the reference count on +the pci_dev that they return. You must eventually (possibly at module unload) +decrement the reference count on these devices by calling pci_dev_put(). + + +3. Enabling and disabling devices +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Before you do anything with the device you've found, you need to enable +it by calling pci_enable_device() which enables I/O and memory regions of +the device, allocates an IRQ if necessary, assigns missing resources if +needed and wakes up the device if it was in suspended state. Please note +that this function can fail. + + If you want to use the device in bus mastering mode, call pci_set_master() +which enables the bus master bit in PCI_COMMAND register and also fixes +the latency timer value if it's set to something bogus by the BIOS. + + If you want to use the PCI Memory-Write-Invalidate transaction, +call pci_set_mwi(). This enables the PCI_COMMAND bit for Mem-Wr-Inval +and also ensures that the cache line size register is set correctly. +Make sure to check the return value of pci_set_mwi(), not all architectures +may support Memory-Write-Invalidate. + + If your driver decides to stop using the device (e.g., there was an +error while setting it up or the driver module is being unloaded), it +should call pci_disable_device() to deallocate any IRQ resources, disable +PCI bus-mastering, etc. You should not do anything with the device after +calling pci_disable_device(). + +4. How to access PCI config space +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + You can use pci_(read|write)_config_(byte|word|dword) to access the config +space of a device represented by struct pci_dev *. All these functions return 0 +when successful or an error code (PCIBIOS_...) which can be translated to a text +string by pcibios_strerror. Most drivers expect that accesses to valid PCI +devices don't fail. + + If you don't have a struct pci_dev available, you can call +pci_bus_(read|write)_config_(byte|word|dword) to access a given device +and function on that bus. + + If you access fields in the standard portion of the config header, please +use symbolic names of locations and bits declared in <linux/pci.h>. + + If you need to access Extended PCI Capability registers, just call +pci_find_capability() for the particular capability and it will find the +corresponding register block for you. + + +5. Addresses and interrupts +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Memory and port addresses and interrupt numbers should NOT be read from the +config space. You should use the values in the pci_dev structure as they might +have been remapped by the kernel. + + See Documentation/IO-mapping.txt for how to access device memory. + + You still need to call request_region() for I/O regions and +request_mem_region() for memory regions to make sure nobody else is using the +same device. + + All interrupt handlers should be registered with SA_SHIRQ and use the devid +to map IRQs to devices (remember that all PCI interrupts are shared). + + +6. Other interesting functions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +pci_find_slot() Find pci_dev corresponding to given bus and + slot numbers. +pci_set_power_state() Set PCI Power Management state (0=D0 ... 3=D3) +pci_find_capability() Find specified capability in device's capability + list. +pci_module_init() Inline helper function for ensuring correct + pci_driver initialization and error handling. +pci_resource_start() Returns bus start address for a given PCI region +pci_resource_end() Returns bus end address for a given PCI region +pci_resource_len() Returns the byte length of a PCI region +pci_set_drvdata() Set private driver data pointer for a pci_dev +pci_get_drvdata() Return private driver data pointer for a pci_dev +pci_set_mwi() Enable Memory-Write-Invalidate transactions. +pci_clear_mwi() Disable Memory-Write-Invalidate transactions. + + +7. Miscellaneous hints +~~~~~~~~~~~~~~~~~~~~~~ +When displaying PCI slot names to the user (for example when a driver wants +to tell the user what card has it found), please use pci_name(pci_dev) +for this purpose. + +Always refer to the PCI devices by a pointer to the pci_dev structure. +All PCI layer functions use this identification and it's the only +reasonable one. Don't use bus/slot/function numbers except for very +special purposes -- on systems with multiple primary buses their semantics +can be pretty complex. + +If you're going to use PCI bus mastering DMA, take a look at +Documentation/DMA-mapping.txt. + +Don't try to turn on Fast Back to Back writes in your driver. All devices +on the bus need to be capable of doing it, so this is something which needs +to be handled by platform and generic code, not individual drivers. + + +8. Obsolete functions +~~~~~~~~~~~~~~~~~~~~~ +There are several functions which you might come across when trying to +port an old driver to the new PCI interface. They are no longer present +in the kernel as they aren't compatible with hotplug or PCI domains or +having sane locking. + +pcibios_present() and Since ages, you don't need to test presence +pci_present() of PCI subsystem when trying to talk to it. + If it's not there, the list of PCI devices + is empty and all functions for searching for + devices just return NULL. +pcibios_(read|write)_* Superseded by their pci_(read|write)_* + counterparts. +pcibios_find_* Superseded by their pci_get_* counterparts. +pci_for_each_dev() Superseded by pci_get_device() +pci_for_each_dev_reverse() Superseded by pci_find_device_reverse() +pci_for_each_bus() Superseded by pci_find_next_bus() +pci_find_device() Superseded by pci_get_device() +pci_find_subsys() Superseded by pci_get_subsys() +pcibios_find_class() Superseded by pci_get_class() +pci_find_class() Superseded by pci_get_class() +pci_(read|write)_*_nodev() Superseded by pci_bus_(read|write)_*() diff --git a/Documentation/pm.txt b/Documentation/pm.txt new file mode 100644 index 000000000000..cc63ae18d147 --- /dev/null +++ b/Documentation/pm.txt @@ -0,0 +1,251 @@ + Linux Power Management Support + +This document briefly describes how to use power management with your +Linux system and how to add power management support to Linux drivers. + +APM or ACPI? +------------ +If you have a relatively recent x86 mobile, desktop, or server system, +odds are it supports either Advanced Power Management (APM) or +Advanced Configuration and Power Interface (ACPI). ACPI is the newer +of the two technologies and puts power management in the hands of the +operating system, allowing for more intelligent power management than +is possible with BIOS controlled APM. + +The best way to determine which, if either, your system supports is to +build a kernel with both ACPI and APM enabled (as of 2.3.x ACPI is +enabled by default). If a working ACPI implementation is found, the +ACPI driver will override and disable APM, otherwise the APM driver +will be used. + +No sorry, you can not have both ACPI and APM enabled and running at +once. Some people with broken ACPI or broken APM implementations +would like to use both to get a full set of working features, but you +simply can not mix and match the two. Only one power management +interface can be in control of the machine at once. Think about it.. + +User-space Daemons +------------------ +Both APM and ACPI rely on user-space daemons, apmd and acpid +respectively, to be completely functional. Obtain both of these +daemons from your Linux distribution or from the Internet (see below) +and be sure that they are started sometime in the system boot process. +Go ahead and start both. If ACPI or APM is not available on your +system the associated daemon will exit gracefully. + + apmd: http://worldvisions.ca/~apenwarr/apmd/ + acpid: http://acpid.sf.net/ + +Driver Interface -- OBSOLETE, DO NOT USE! +----------------************************* +If you are writing a new driver or maintaining an old driver, it +should include power management support. Without power management +support, a single driver may prevent a system with power management +capabilities from ever being able to suspend (safely). + +Overview: +1) Register each instance of a device with "pm_register" +2) Call "pm_access" before accessing the hardware. + (this will ensure that the hardware is awake and ready) +3) Your "pm_callback" is called before going into a + suspend state (ACPI D1-D3) or after resuming (ACPI D0) + from a suspend. +4) Call "pm_dev_idle" when the device is not being used + (optional but will improve device idle detection) +5) When unloaded, unregister the device with "pm_unregister" + +/* + * Description: Register a device with the power-management subsystem + * + * Parameters: + * type - device type (PCI device, system device, ...) + * id - instance number or unique identifier + * cback - request handler callback (suspend, resume, ...) + * + * Returns: Registered PM device or NULL on error + * + * Examples: + * dev = pm_register(PM_SYS_DEV, PM_SYS_VGA, vga_callback); + * + * struct pci_dev *pci_dev = pci_find_dev(...); + * dev = pm_register(PM_PCI_DEV, PM_PCI_ID(pci_dev), callback); + */ +struct pm_dev *pm_register(pm_dev_t type, unsigned long id, pm_callback cback); + +/* + * Description: Unregister a device with the power management subsystem + * + * Parameters: + * dev - PM device previously returned from pm_register + */ +void pm_unregister(struct pm_dev *dev); + +/* + * Description: Unregister all devices with a matching callback function + * + * Parameters: + * cback - previously registered request callback + * + * Notes: Provided for easier porting from old APM interface + */ +void pm_unregister_all(pm_callback cback); + +/* + * Power management request callback + * + * Parameters: + * dev - PM device previously returned from pm_register + * rqst - request type + * data - data, if any, associated with the request + * + * Returns: 0 if the request is successful + * EINVAL if the request is not supported + * EBUSY if the device is now busy and can not handle the request + * ENOMEM if the device was unable to handle the request due to memory + * + * Details: The device request callback will be called before the + * device/system enters a suspend state (ACPI D1-D3) or + * or after the device/system resumes from suspend (ACPI D0). + * For PM_SUSPEND, the ACPI D-state being entered is passed + * as the "data" argument to the callback. The device + * driver should save (PM_SUSPEND) or restore (PM_RESUME) + * device context when the request callback is called. + * + * Once a driver returns 0 (success) from a suspend + * request, it should not process any further requests or + * access the device hardware until a call to "pm_access" is made. + */ +typedef int (*pm_callback)(struct pm_dev *dev, pm_request_t rqst, void *data); + +Driver Details +-------------- +This is just a quick Q&A as a stopgap until a real driver writers' +power management guide is available. + +Q: When is a device suspended? + +Devices can be suspended based on direct user request (eg. laptop lid +closes), system power policy (eg. sleep after 30 minutes of console +inactivity), or device power policy (eg. power down device after 5 +minutes of inactivity) + +Q: Must a driver honor a suspend request? + +No, a driver can return -EBUSY from a suspend request and this +will stop the system from suspending. When a suspend request +fails, all suspended devices are resumed and the system continues +to run. Suspend can be retried at a later time. + +Q: Can the driver block suspend/resume requests? + +Yes, a driver can delay its return from a suspend or resume +request until the device is ready to handle requests. It +is advantageous to return as quickly as possible from a +request as suspend/resume are done serially. + +Q: What context is a suspend/resume initiated from? + +A suspend or resume is initiated from a kernel thread context. +It is safe to block, allocate memory, initiate requests +or anything else you can do within the kernel. + +Q: Will requests continue to arrive after a suspend? + +Possibly. It is the driver's responsibility to queue(*), +fail, or drop any requests that arrive after returning +success to a suspend request. It is important that the +driver not access its device until after it receives +a resume request as the device's bus may no longer +be active. + +(*) If a driver queues requests for processing after + resume be aware that the device, network, etc. + might be in a different state than at suspend time. + It's probably better to drop requests unless + the driver is a storage device. + +Q: Do I have to manage bus-specific power management registers + +No. It is the responsibility of the bus driver to manage +PCI, USB, etc. power management registers. The bus driver +or the power management subsystem will also enable any +wake-on functionality that the device has. + +Q: So, really, what do I need to do to support suspend/resume? + +You need to save any device context that would +be lost if the device was powered off and then restore +it at resume time. When ACPI is active, there are +three levels of device suspend states; D1, D2, and D3. +(The suspend state is passed as the "data" argument +to the device callback.) With D3, the device is powered +off and loses all context, D1 and D2 are shallower power +states and require less device context to be saved. To +play it safe, just save everything at suspend and restore +everything at resume. + +Q: Where do I store device context for suspend? + +Anywhere in memory, kmalloc a buffer or store it +in the device descriptor. You are guaranteed that the +contents of memory will be restored and accessible +before resume, even when the system suspends to disk. + +Q: What do I need to do for ACPI vs. APM vs. etc? + +Drivers need not be aware of the specific power management +technology that is active. They just need to be aware +of when the overlying power management system requests +that they suspend or resume. + +Q: What about device dependencies? + +When a driver registers a device, the power management +subsystem uses the information provided to build a +tree of device dependencies (eg. USB device X is on +USB controller Y which is on PCI bus Z) When power +management wants to suspend a device, it first sends +a suspend request to its driver, then the bus driver, +and so on up to the system bus. Device resumes +proceed in the opposite direction. + +Q: Who do I contact for additional information about + enabling power management for my specific driver/device? + +ACPI Development mailing list: acpi-devel@lists.sourceforge.net + +System Interface -- OBSOLETE, DO NOT USE! +----------------************************* +If you are providing new power management support to Linux (ie. +adding support for something like APM or ACPI), you should +communicate with drivers through the existing generic power +management interface. + +/* + * Send a request to all devices + * + * Parameters: + * rqst - request type + * data - data, if any, associated with the request + * + * Returns: 0 if the request is successful + * See "pm_callback" return for errors + * + * Details: Walk list of registered devices and call pm_send + * for each until complete or an error is encountered. + * If an error is encountered for a suspend request, + * return all devices to the state they were in before + * the suspend request. + */ +int pm_send_all(pm_request_t rqst, void *data); + +/* + * Find a matching device + * + * Parameters: + * type - device type (PCI device, system device, or 0 to match all devices) + * from - previous match or NULL to start from the beginning + * + * Returns: Matching device or NULL if none found + */ +struct pm_dev *pm_find(pm_dev_t type, struct pm_dev *from); diff --git a/Documentation/pnp.txt b/Documentation/pnp.txt new file mode 100644 index 000000000000..af0f6eabfa1c --- /dev/null +++ b/Documentation/pnp.txt @@ -0,0 +1,249 @@ +Linux Plug and Play Documentation +by Adam Belay <ambx1@neo.rr.com> +last updated: Oct. 16, 2002 +--------------------------------------------------------------------------------------- + + + +Overview +-------- + Plug and Play provides a means of detecting and setting resources for legacy or +otherwise unconfigurable devices. The Linux Plug and Play Layer provides these +services to compatible drivers. + + + +The User Interface +------------------ + The Linux Plug and Play user interface provides a means to activate PnP devices +for legacy and user level drivers that do not support Linux Plug and Play. The +user interface is integrated into driverfs. + +In addition to the standard driverfs file the following are created in each +device's directory: +id - displays a list of support EISA IDs +options - displays possible resource configurations +resources - displays currently allocated resources and allows resource changes + +-activating a device + +#echo "auto" > resources + +this will invoke the automatic resource config system to activate the device + +-manually activating a device + +#echo "manual <depnum> <mode>" > resources +<depnum> - the configuration number +<mode> - static or dynamic + static = for next boot + dynamic = now + +-disabling a device + +#echo "disable" > resources + + +EXAMPLE: + +Suppose you need to activate the floppy disk controller. +1.) change to the proper directory, in my case it is +/driver/bus/pnp/devices/00:0f +# cd /driver/bus/pnp/devices/00:0f +# cat name +PC standard floppy disk controller + +2.) check if the device is already active +# cat resources +DISABLED + +- Notice the string "DISABLED". THis means the device is not active. + +3.) check the device's possible configurations (optional) +# cat options +Dependent: 01 - Priority acceptable + port 0x3f0-0x3f0, align 0x7, size 0x6, 16-bit address decoding + port 0x3f7-0x3f7, align 0x0, size 0x1, 16-bit address decoding + irq 6 + dma 2 8-bit compatible +Dependent: 02 - Priority acceptable + port 0x370-0x370, align 0x7, size 0x6, 16-bit address decoding + port 0x377-0x377, align 0x0, size 0x1, 16-bit address decoding + irq 6 + dma 2 8-bit compatible + +4.) now activate the device +# echo "auto" > resources + +5.) finally check if the device is active +# cat resources +io 0x3f0-0x3f5 +io 0x3f7-0x3f7 +irq 6 +dma 2 + +also there are a series of kernel parameters: +pnp_reserve_irq=irq1[,irq2] .... +pnp_reserve_dma=dma1[,dma2] .... +pnp_reserve_io=io1,size1[,io2,size2] .... +pnp_reserve_mem=mem1,size1[,mem2,size2] .... + + + +The Unified Plug and Play Layer +------------------------------- + All Plug and Play drivers, protocols, and services meet at a central location +called the Plug and Play Layer. This layer is responsible for the exchange of +information between PnP drivers and PnP protocols. Thus it automatically +forwards commands to the proper protocol. This makes writing PnP drivers +significantly easier. + +The following functions are available from the Plug and Play Layer: + +pnp_get_protocol +- increments the number of uses by one + +pnp_put_protocol +- deincrements the number of uses by one + +pnp_register_protocol +- use this to register a new PnP protocol + +pnp_unregister_protocol +- use this function to remove a PnP protocol from the Plug and Play Layer + +pnp_register_driver +- adds a PnP driver to the Plug and Play Layer +- this includes driver model integration + +pnp_unregister_driver +- removes a PnP driver from the Plug and Play Layer + + + +Plug and Play Protocols +----------------------- + This section contains information for PnP protocol developers. + +The following Protocols are currently available in the computing world: +- PNPBIOS: used for system devices such as serial and parallel ports. +- ISAPNP: provides PnP support for the ISA bus +- ACPI: among its many uses, ACPI provides information about system level +devices. +It is meant to replace the PNPBIOS. It is not currently supported by Linux +Plug and Play but it is planned to be in the near future. + + +Requirements for a Linux PnP protocol: +1.) the protocol must use EISA IDs +2.) the protocol must inform the PnP Layer of a devices current configuration +- the ability to set resources is optional but prefered. + +The following are PnP protocol related functions: + +pnp_add_device +- use this function to add a PnP device to the PnP layer +- only call this function when all wanted values are set in the pnp_dev +structure + +pnp_init_device +- call this to initialize the PnP structure + +pnp_remove_device +- call this to remove a device from the Plug and Play Layer. +- it will fail if the device is still in use. +- automatically will free mem used by the device and related structures + +pnp_add_id +- adds a EISA ID to the list of supported IDs for the specified device + +For more information consult the source of a protocol such as +/drivers/pnp/pnpbios/core.c. + + + +Linux Plug and Play Drivers +--------------------------- + This section contains information for linux PnP driver developers. + +The New Way +........... +1.) first make a list of supported EISA IDS +ex: +static const struct pnp_id pnp_dev_table[] = { + /* Standard LPT Printer Port */ + {.id = "PNP0400", .driver_data = 0}, + /* ECP Printer Port */ + {.id = "PNP0401", .driver_data = 0}, + {.id = ""} +}; + +Please note that the character 'X' can be used as a wild card in the function +portion (last four characters). +ex: + /* Unkown PnP modems */ + { "PNPCXXX", UNKNOWN_DEV }, + +Supported PnP card IDs can optionally be defined. +ex: +static const struct pnp_id pnp_card_table[] = { + { "ANYDEVS", 0 }, + { "", 0 } +}; + +2.) Optionally define probe and remove functions. It may make sense not to +define these functions if the driver already has a reliable method of detecting +the resources, such as the parport_pc driver. +ex: +static int +serial_pnp_probe(struct pnp_dev * dev, const struct pnp_id *card_id, const + struct pnp_id *dev_id) +{ +. . . + +ex: +static void serial_pnp_remove(struct pnp_dev * dev) +{ +. . . + +consult /drivers/serial/8250_pnp.c for more information. + +3.) create a driver structure +ex: + +static struct pnp_driver serial_pnp_driver = { + .name = "serial", + .card_id_table = pnp_card_table, + .id_table = pnp_dev_table, + .probe = serial_pnp_probe, + .remove = serial_pnp_remove, +}; + +* name and id_table can not be NULL. + +4.) register the driver +ex: + +static int __init serial8250_pnp_init(void) +{ + return pnp_register_driver(&serial_pnp_driver); +} + +The Old Way +........... + +a series of compatibility functions have been created to make it easy to convert + +ISAPNP drivers. They should serve as a temporary solution only. + +they are as follows: + +struct pnp_card *pnp_find_card(unsigned short vendor, + unsigned short device, + struct pnp_card *from) + +struct pnp_dev *pnp_find_dev(struct pnp_card *card, + unsigned short vendor, + unsigned short function, + struct pnp_dev *from) + diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt new file mode 100644 index 000000000000..5d4ae9a39f1d --- /dev/null +++ b/Documentation/power/devices.txt @@ -0,0 +1,319 @@ + +Device Power Management + + +Device power management encompasses two areas - the ability to save +state and transition a device to a low-power state when the system is +entering a low-power state; and the ability to transition a device to +a low-power state while the system is running (and independently of +any other power management activity). + + +Methods + +The methods to suspend and resume devices reside in struct bus_type: + +struct bus_type { + ... + int (*suspend)(struct device * dev, pm_message_t state); + int (*resume)(struct device * dev); +}; + +Each bus driver is responsible implementing these methods, translating +the call into a bus-specific request and forwarding the call to the +bus-specific drivers. For example, PCI drivers implement suspend() and +resume() methods in struct pci_driver. The PCI core is simply +responsible for translating the pointers to PCI-specific ones and +calling the low-level driver. + +This is done to a) ease transition to the new power management methods +and leverage the existing PM code in various bus drivers; b) allow +buses to implement generic and default PM routines for devices, and c) +make the flow of execution obvious to the reader. + + +System Power Management + +When the system enters a low-power state, the device tree is walked in +a depth-first fashion to transition each device into a low-power +state. The ordering of the device tree is guaranteed by the order in +which devices get registered - children are never registered before +their ancestors, and devices are placed at the back of the list when +registered. By walking the list in reverse order, we are guaranteed to +suspend devices in the proper order. + +Devices are suspended once with interrupts enabled. Drivers are +expected to stop I/O transactions, save device state, and place the +device into a low-power state. Drivers may sleep, allocate memory, +etc. at will. + +Some devices are broken and will inevitably have problems powering +down or disabling themselves with interrupts enabled. For these +special cases, they may return -EAGAIN. This will put the device on a +list to be taken care of later. When interrupts are disabled, before +we enter the low-power state, their drivers are called again to put +their device to sleep. + +On resume, the devices that returned -EAGAIN will be called to power +themselves back on with interrupts disabled. Once interrupts have been +re-enabled, the rest of the drivers will be called to resume their +devices. On resume, a driver is responsible for powering back on each +device, restoring state, and re-enabling I/O transactions for that +device. + +System devices follow a slightly different API, which can be found in + + include/linux/sysdev.h + drivers/base/sys.c + +System devices will only be suspended with interrupts disabled, and +after all other devices have been suspended. On resume, they will be +resumed before any other devices, and also with interrupts disabled. + + +Runtime Power Management + +Many devices are able to dynamically power down while the system is +still running. This feature is useful for devices that are not being +used, and can offer significant power savings on a running system. + +In each device's directory, there is a 'power' directory, which +contains at least a 'state' file. Reading from this file displays what +power state the device is currently in. Writing to this file initiates +a transition to the specified power state, which must be a decimal in +the range 1-3, inclusive; or 0 for 'On'. + +The PM core will call the ->suspend() method in the bus_type object +that the device belongs to if the specified state is not 0, or +->resume() if it is. + +Nothing will happen if the specified state is the same state the +device is currently in. + +If the device is already in a low-power state, and the specified state +is another, but different, low-power state, the ->resume() method will +first be called to power the device back on, then ->suspend() will be +called again with the new state. + +The driver is responsible for saving the working state of the device +and putting it into the low-power state specified. If this was +successful, it returns 0, and the device's power_state field is +updated. + +The driver must take care to know whether or not it is able to +properly resume the device, including all step of reinitialization +necessary. (This is the hardest part, and the one most protected by +NDA'd documents). + +The driver must also take care not to suspend a device that is +currently in use. It is their responsibility to provide their own +exclusion mechanisms. + +The runtime power transition happens with interrupts enabled. If a +device cannot support being powered down with interrupts, it may +return -EAGAIN (as it would during a system power management +transition), but it will _not_ be called again, and the transaction +will fail. + +There is currently no way to know what states a device or driver +supports a priori. This will change in the future. + +pm_message_t meaning + +pm_message_t has two fields. event ("major"), and flags. If driver +does not know event code, it aborts the request, returning error. Some +drivers may need to deal with special cases based on the actual type +of suspend operation being done at the system level. This is why +there are flags. + +Event codes are: + +ON -- no need to do anything except special cases like broken +HW. + +# NOTIFICATION -- pretty much same as ON? + +FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from +scratch. That probably means stop accepting upstream requests, the +actual policy of what to do with them beeing specific to a given +driver. It's acceptable for a network driver to just drop packets +while a block driver is expected to block the queue so no request is +lost. (Use IDE as an example on how to do that). FREEZE requires no +power state change, and it's expected for drivers to be able to +quickly transition back to operating state. + +SUSPEND -- like FREEZE, but also put hardware into low-power state. If +there's need to distinguish several levels of sleep, additional flag +is probably best way to do that. + +Transitions are only from a resumed state to a suspended state, never +between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, +FREEZE -> SUSPEND or SUSPEND -> FREEZE can not). + +All events are: + +[NOTE NOTE NOTE: If you are driver author, you should not care; you +should only look at event, and ignore flags.] + +#Prepare for suspend -- userland is still running but we are going to +#enter suspend state. This gives drivers chance to load firmware from +#disk and store it in memory, or do other activities taht require +#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these +#are forbiden once the suspend dance is started.. event = ON, flags = +#PREPARE_TO_SUSPEND + +Apm standby -- prepare for APM event. Quiesce devices to make life +easier for APM BIOS. event = FREEZE, flags = APM_STANDBY + +Apm suspend -- same as APM_STANDBY, but it we should probably avoid +spinning down disks. event = FREEZE, flags = APM_SUSPEND + +System halt, reboot -- quiesce devices to make life easier for BIOS. event += FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT + +System shutdown -- at least disks need to be spun down, or data may be +lost. Quiesce devices, just to make life easier for BIOS. event = +FREEZE, flags = SYSTEM_SHUTDOWN + +Kexec -- turn off DMAs and put hardware into some state where new +kernel can take over. event = FREEZE, flags = KEXEC + +Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake +may need to be enabled on some devices. This actually has at least 3 +subtypes, system can reboot, enter S4 and enter S5 at the end of +swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, +SYSTEM_SHUTDOWN, SYSTEM_S4 + +Suspend to ram -- put devices into low power state. event = SUSPEND, +flags = SUSPEND_TO_RAM + +Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put +devices into low power mode, but you must be able to reinitialize +device from scratch in resume method. This has two flavors, its done +once on suspending kernel, once on resuming kernel. event = FREEZE, +flags = DURING_SUSPEND or DURING_RESUME + +Device detach requested from /sys -- deinitialize device; proably same as +SYSTEM_SHUTDOWN, I do not understand this one too much. probably event += FREEZE, flags = DEV_DETACH. + +#These are not really events sent: +# +#System fully on -- device is working normally; this is probably never +#passed to suspend() method... event = ON, flags = 0 +# +#Ready after resume -- userland is now running, again. Time to free any +#memory you ate during prepare to suspend... event = ON, flags = +#READY_AFTER_RESUME +# + +Driver Detach Power Management + +The kernel now supports the ability to place a device in a low-power +state when it is detached from its driver, which happens when its +module is removed. + +Each device contains a 'detach_state' file in its sysfs directory +which can be used to control this state. Reading from this file +displays what the current detach state is set to. This is 0 (On) by +default. A user may write a positive integer value to this file in the +range of 1-4 inclusive. + +A value of 1-3 will indicate the device should be placed in that +low-power state, which will cause ->suspend() to be called for that +device. A value of 4 indicates that the device should be shutdown, so +->shutdown() will be called for that device. + +The driver is responsible for reinitializing the device when the +module is re-inserted during it's ->probe() (or equivalent) method. +The driver core will not call any extra functions when binding the +device to the driver. + +pm_message_t meaning + +pm_message_t has two fields. event ("major"), and flags. If driver +does not know event code, it aborts the request, returning error. Some +drivers may need to deal with special cases based on the actual type +of suspend operation being done at the system level. This is why +there are flags. + +Event codes are: + +ON -- no need to do anything except special cases like broken +HW. + +# NOTIFICATION -- pretty much same as ON? + +FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from +scratch. That probably means stop accepting upstream requests, the +actual policy of what to do with them being specific to a given +driver. It's acceptable for a network driver to just drop packets +while a block driver is expected to block the queue so no request is +lost. (Use IDE as an example on how to do that). FREEZE requires no +power state change, and it's expected for drivers to be able to +quickly transition back to operating state. + +SUSPEND -- like FREEZE, but also put hardware into low-power state. If +there's need to distinguish several levels of sleep, additional flag +is probably best way to do that. + +Transitions are only from a resumed state to a suspended state, never +between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, +FREEZE -> SUSPEND or SUSPEND -> FREEZE can not). + +All events are: + +[NOTE NOTE NOTE: If you are driver author, you should not care; you +should only look at event, and ignore flags.] + +#Prepare for suspend -- userland is still running but we are going to +#enter suspend state. This gives drivers chance to load firmware from +#disk and store it in memory, or do other activities taht require +#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these +#are forbiden once the suspend dance is started.. event = ON, flags = +#PREPARE_TO_SUSPEND + +Apm standby -- prepare for APM event. Quiesce devices to make life +easier for APM BIOS. event = FREEZE, flags = APM_STANDBY + +Apm suspend -- same as APM_STANDBY, but it we should probably avoid +spinning down disks. event = FREEZE, flags = APM_SUSPEND + +System halt, reboot -- quiesce devices to make life easier for BIOS. event += FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT + +System shutdown -- at least disks need to be spun down, or data may be +lost. Quiesce devices, just to make life easier for BIOS. event = +FREEZE, flags = SYSTEM_SHUTDOWN + +Kexec -- turn off DMAs and put hardware into some state where new +kernel can take over. event = FREEZE, flags = KEXEC + +Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake +may need to be enabled on some devices. This actually has at least 3 +subtypes, system can reboot, enter S4 and enter S5 at the end of +swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, +SYSTEM_SHUTDOWN, SYSTEM_S4 + +Suspend to ram -- put devices into low power state. event = SUSPEND, +flags = SUSPEND_TO_RAM + +Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put +devices into low power mode, but you must be able to reinitialize +device from scratch in resume method. This has two flavors, its done +once on suspending kernel, once on resuming kernel. event = FREEZE, +flags = DURING_SUSPEND or DURING_RESUME + +Device detach requested from /sys -- deinitialize device; proably same as +SYSTEM_SHUTDOWN, I do not understand this one too much. probably event += FREEZE, flags = DEV_DETACH. + +#These are not really events sent: +# +#System fully on -- device is working normally; this is probably never +#passed to suspend() method... event = ON, flags = 0 +# +#Ready after resume -- userland is now running, again. Time to free any +#memory you ate during prepare to suspend... event = ON, flags = +#READY_AFTER_RESUME +# diff --git a/Documentation/power/interface.txt b/Documentation/power/interface.txt new file mode 100644 index 000000000000..f5ebda5f4276 --- /dev/null +++ b/Documentation/power/interface.txt @@ -0,0 +1,43 @@ +Power Management Interface + + +The power management subsystem provides a unified sysfs interface to +userspace, regardless of what architecture or platform one is +running. The interface exists in /sys/power/ directory (assuming sysfs +is mounted at /sys). + +/sys/power/state controls system power state. Reading from this file +returns what states are supported, which is hard-coded to 'standby' +(Power-On Suspend), 'mem' (Suspend-to-RAM), and 'disk' +(Suspend-to-Disk). + +Writing to this file one of those strings causes the system to +transition into that state. Please see the file +Documentation/power/states.txt for a description of each of those +states. + + +/sys/power/disk controls the operating mode of the suspend-to-disk +mechanism. Suspend-to-disk can be handled in several ways. The +greatest distinction is who writes memory to disk - the firmware or +the kernel. If the firmware does it, we assume that it also handles +suspending the system. + +If the kernel does it, then we have three options for putting the system +to sleep - using the platform driver (e.g. ACPI or other PM +registers), powering off the system or rebooting the system (for +testing). The system will support either 'firmware' or 'platform', and +that is known a priori. But, the user may choose 'shutdown' or +'reboot' as alternatives. + +Reading from this file will display what the mode is currently set +to. Writing to this file will accept one of + + 'firmware' + 'platform' + 'shutdown' + 'reboot' + +It will only change to 'firmware' or 'platform' if the system supports +it. + diff --git a/Documentation/power/kernel_threads.txt b/Documentation/power/kernel_threads.txt new file mode 100644 index 000000000000..60b548105edf --- /dev/null +++ b/Documentation/power/kernel_threads.txt @@ -0,0 +1,41 @@ +KERNEL THREADS + + +Freezer + +Upon entering a suspended state the system will freeze all +tasks. This is done by delivering pseudosignals. This affects +kernel threads, too. To successfully freeze a kernel thread +the thread has to check for the pseudosignal and enter the +refrigerator. Code to do this looks like this: + + do { + hub_events(); + wait_event_interruptible(khubd_wait, !list_empty(&hub_event_list)); + if (current->flags & PF_FREEZE) + refrigerator(PF_FREEZE); + } while (!signal_pending(current)); + +from drivers/usb/core/hub.c::hub_thread() + + +The Unfreezable + +Some kernel threads however, must not be frozen. The kernel must +be able to finish pending IO operations and later on be able to +write the memory image to disk. Kernel threads needed to do IO +must stay awake. Such threads must mark themselves unfreezable +like this: + + /* + * This thread doesn't need any user-level access, + * so get rid of all our resources. + */ + daemonize("usb-storage"); + + current->flags |= PF_NOFREEZE; + +from drivers/usb/storage/usb.c::usb_stor_control_thread() + +Such drivers are themselves responsible for staying quiet during +the actual snapshotting. diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt new file mode 100644 index 000000000000..c85428e7ad92 --- /dev/null +++ b/Documentation/power/pci.txt @@ -0,0 +1,332 @@ + +PCI Power Management +~~~~~~~~~~~~~~~~~~~~ + +An overview of the concepts and the related functions in the Linux kernel + +Patrick Mochel <mochel@transmeta.com> +(and others) + +--------------------------------------------------------------------------- + +1. Overview +2. How the PCI Subsystem Does Power Management +3. PCI Utility Functions +4. PCI Device Drivers +5. Resources + +1. Overview +~~~~~~~~~~~ + +The PCI Power Management Specification was introduced between the PCI 2.1 and +PCI 2.2 Specifications. It a standard interface for controlling various +power management operations. + +Implementation of the PCI PM Spec is optional, as are several sub-components of +it. If a device supports the PCI PM Spec, the device will have an 8 byte +capability field in its PCI configuration space. This field is used to describe +and control the standard PCI power management features. + +The PCI PM spec defines 4 operating states for devices (D0 - D3) and for buses +(B0 - B3). The higher the number, the less power the device consumes. However, +the higher the number, the longer the latency is for the device to return to +an operational state (D0). + +There are actually two D3 states. When someone talks about D3, they usually +mean D3hot, which corresponds to an ACPI D2 state (power is reduced, the +device may lose some context). But they may also mean D3cold, which is an +ACPI D3 state (power is fully off, all state was discarded); or both. + +Bus power management is not covered in this version of this document. + +Note that all PCI devices support D0 and D3cold by default, regardless of +whether or not they implement any of the PCI PM spec. + +The possible state transitions that a device can undergo are: + ++---------------------------+ +| Current State | New State | ++---------------------------+ +| D0 | D1, D2, D3| ++---------------------------+ +| D1 | D2, D3 | ++---------------------------+ +| D2 | D3 | ++---------------------------+ +| D1, D2, D3 | D0 | ++---------------------------+ + +Note that when the system is entering a global suspend state, all devices will +be placed into D3 and when resuming, all devices will be placed into D0. +However, when the system is running, other state transitions are possible. + +2. How The PCI Subsystem Handles Power Management +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The PCI suspend/resume functionality is accessed indirectly via the Power +Management subsystem. At boot, the PCI driver registers a power management +callback with that layer. Upon entering a suspend state, the PM layer iterates +through all of its registered callbacks. This currently takes place only during +APM state transitions. + +Upon going to sleep, the PCI subsystem walks its device tree twice. Both times, +it does a depth first walk of the device tree. The first walk saves each of the +device's state and checks for devices that will prevent the system from entering +a global power state. The next walk then places the devices in a low power +state. + +The first walk allows a graceful recovery in the event of a failure, since none +of the devices have actually been powered down. + +In both walks, in particular the second, all children of a bridge are touched +before the actual bridge itself. This allows the bridge to retain power while +its children are being accessed. + +Upon resuming from sleep, just the opposite must be true: all bridges must be +powered on and restored before their children are powered on. This is easily +accomplished with a breadth-first walk of the PCI device tree. + + +3. PCI Utility Functions +~~~~~~~~~~~~~~~~~~~~~~~~ + +These are helper functions designed to be called by individual device drivers. +Assuming that a device behaves as advertised, these should be applicable in most +cases. However, results may vary. + +Note that these functions are never implicitly called for the driver. The driver +is always responsible for deciding when and if to call these. + + +pci_save_state +-------------- + +Usage: + pci_save_state(dev, buffer); + +Description: + Save first 64 bytes of PCI config space. Buffer must be allocated by + caller. + + +pci_restore_state +----------------- + +Usage: + pci_restore_state(dev, buffer); + +Description: + Restore previously saved config space. (First 64 bytes only); + + If buffer is NULL, then restore what information we know about the + device from bootup: BARs and interrupt line. + + +pci_set_power_state +------------------- + +Usage: + pci_set_power_state(dev, state); + +Description: + Transition device to low power state using PCI PM Capabilities + registers. + + Will fail under one of the following conditions: + - If state is less than current state, but not D0 (illegal transition) + - Device doesn't support PM Capabilities + - Device does not support requested state + + +pci_enable_wake +--------------- + +Usage: + pci_enable_wake(dev, state, enable); + +Description: + Enable device to generate PME# during low power state using PCI PM + Capabilities. + + Checks whether if device supports generating PME# from requested state + and fail if it does not, unless enable == 0 (request is to disable wake + events, which is implicit if it doesn't even support it in the first + place). + + Note that the PMC Register in the device's PM Capabilties has a bitmask + of the states it supports generating PME# from. D3hot is bit 3 and + D3cold is bit 4. So, while a value of 4 as the state may not seem + semantically correct, it is. + + +4. PCI Device Drivers +~~~~~~~~~~~~~~~~~~~~~ + +These functions are intended for use by individual drivers, and are defined in +struct pci_driver: + + int (*save_state) (struct pci_dev *dev, u32 state); + int (*suspend) (struct pci_dev *dev, u32 state); + int (*resume) (struct pci_dev *dev); + int (*enable_wake) (struct pci_dev *dev, u32 state, int enable); + + +save_state +---------- + +Usage: + +if (dev->driver && dev->driver->save_state) + dev->driver->save_state(dev,state); + +The driver should use this callback to save device state. It should take into +account the current state of the device and the requested state in order to +avoid any unnecessary operations. + +For example, a video card that supports all 4 states (D0-D3), all controller +context is preserved when entering D1, but the screen is placed into a low power +state (blanked). + +The driver can also interpret this function as a notification that it may be +entering a sleep state in the near future. If it knows that the device cannot +enter the requested state, either because of lack of support for it, or because +the device is middle of some critical operation, then it should fail. + +This function should not be used to set any state in the device or the driver +because the device may not actually enter the sleep state (e.g. another driver +later causes causes a global state transition to fail). + +Note that in intermediate low power states, a device's I/O and memory spaces may +be disabled and may not be available in subsequent transitions to lower power +states. + + +suspend +------- + +Usage: + +if (dev->driver && dev->driver->suspend) + dev->driver->suspend(dev,state); + +A driver uses this function to actually transition the device into a low power +state. This should include disabling I/O, IRQs, and bus-mastering, as well as +physically transitioning the device to a lower power state; it may also include +calls to pci_enable_wake(). + +Bus mastering may be disabled by doing: + +pci_disable_device(dev); + +For devices that support the PCI PM Spec, this may be used to set the device's +power state to match the suspend() parameter: + +pci_set_power_state(dev,state); + +The driver is also responsible for disabling any other device-specific features +(e.g blanking screen, turning off on-card memory, etc). + +The driver should be sure to track the current state of the device, as it may +obviate the need for some operations. + +The driver should update the current_state field in its pci_dev structure in +this function, except for PM-capable devices when pci_set_power_state is used. + +resume +------ + +Usage: + +if (dev->driver && dev->driver->suspend) + dev->driver->resume(dev) + +The resume callback may be called from any power state, and is always meant to +transition the device to the D0 state. + +The driver is responsible for reenabling any features of the device that had +been disabled during previous suspend calls, such as IRQs and bus mastering, +as well as calling pci_restore_state(). + +If the device is currently in D3, it may need to be reinitialized in resume(). + + * Some types of devices, like bus controllers, will preserve context in D3hot + (using Vcc power). Their drivers will often want to avoid re-initializing + them after re-entering D0 (perhaps to avoid resetting downstream devices). + + * Other kinds of devices in D3hot will discard device context as part of a + soft reset when re-entering the D0 state. + + * Devices resuming from D3cold always go through a power-on reset. Some + device context can also be preserved using Vaux power. + + * Some systems hide D3cold resume paths from drivers. For example, on PCs + the resume path for suspend-to-disk often runs BIOS powerup code, which + will sometimes re-initialize the device. + +To handle resets during D3 to D0 transitions, it may be convenient to share +device initialization code between probe() and resume(). Device parameters +can also be saved before the driver suspends into D3, avoiding re-probe. + +If the device supports the PCI PM Spec, it can use this to physically transition +the device to D0: + +pci_set_power_state(dev,0); + +Note that if the entire system is transitioning out of a global sleep state, all +devices will be placed in the D0 state, so this is not necessary. However, in +the event that the device is placed in the D3 state during normal operation, +this call is necessary. It is impossible to determine which of the two events is +taking place in the driver, so it is always a good idea to make that call. + +The driver should take note of the state that it is resuming from in order to +ensure correct (and speedy) operation. + +The driver should update the current_state field in its pci_dev structure in +this function, except for PM-capable devices when pci_set_power_state is used. + + +enable_wake +----------- + +Usage: + +if (dev->driver && dev->driver->enable_wake) + dev->driver->enable_wake(dev,state,enable); + +This callback is generally only relevant for devices that support the PCI PM +spec and have the ability to generate a PME# (Power Management Event Signal) +to wake the system up. (However, it is possible that a device may support +some non-standard way of generating a wake event on sleep.) + +Bits 15:11 of the PMC (Power Mgmt Capabilities) Register in a device's +PM Capabilties describe what power states the device supports generating a +wake event from: + ++------------------+ +| Bit | State | ++------------------+ +| 11 | D0 | +| 12 | D1 | +| 13 | D2 | +| 14 | D3hot | +| 15 | D3cold | ++------------------+ + +A device can use this to enable wake events: + + pci_enable_wake(dev,state,enable); + +Note that to enable PME# from D3cold, a value of 4 should be passed to +pci_enable_wake (since it uses an index into a bitmask). If a driver gets +a request to enable wake events from D3, two calls should be made to +pci_enable_wake (one for both D3hot and D3cold). + + +5. Resources +~~~~~~~~~~~~ + +PCI Local Bus Specification +PCI Bus Power Management Interface Specification + + http://pcisig.org + diff --git a/Documentation/power/states.txt b/Documentation/power/states.txt new file mode 100644 index 000000000000..3e5e5d3ff419 --- /dev/null +++ b/Documentation/power/states.txt @@ -0,0 +1,79 @@ + +System Power Management States + + +The kernel supports three power management states generically, though +each is dependent on platform support code to implement the low-level +details for each state. This file describes each state, what they are +commonly called, what ACPI state they map to, and what string to write +to /sys/power/state to enter that state + + +State: Standby / Power-On Suspend +ACPI State: S1 +String: "standby" + +This state offers minimal, though real, power savings, while providing +a very low-latency transition back to a working system. No operating +state is lost (the CPU retains power), so the system easily starts up +again where it left off. + +We try to put devices in a low-power state equivalent to D1, which +also offers low power savings, but low resume latency. Not all devices +support D1, and those that don't are left on. + +A transition from Standby to the On state should take about 1-2 +seconds. + + +State: Suspend-to-RAM +ACPI State: S3 +String: "mem" + +This state offers significant power savings as everything in the +system is put into a low-power state, except for memory, which is +placed in self-refresh mode to retain its contents. + +System and device state is saved and kept in memory. All devices are +suspended and put into D3. In many cases, all peripheral buses lose +power when entering STR, so devices must be able to handle the +transition back to the On state. + +For at least ACPI, STR requires some minimal boot-strapping code to +resume the system from STR. This may be true on other platforms. + +A transition from Suspend-to-RAM to the On state should take about +3-5 seconds. + + +State: Suspend-to-disk +ACPI State: S4 +String: "disk" + +This state offers the greatest power savings, and can be used even in +the absence of low-level platform support for power management. This +state operates similarly to Suspend-to-RAM, but includes a final step +of writing memory contents to disk. On resume, this is read and memory +is restored to its pre-suspend state. + +STD can be handled by the firmware or the kernel. If it is handled by +the firmware, it usually requires a dedicated partition that must be +setup via another operating system for it to use. Despite the +inconvenience, this method requires minimal work by the kernel, since +the firmware will also handle restoring memory contents on resume. + +If the kernel is responsible for persistantly saving state, a mechanism +called 'swsusp' (Swap Suspend) is used to write memory contents to +free swap space. swsusp has some restrictive requirements, but should +work in most cases. Some, albeit outdated, documentation can be found +in Documentation/power/swsusp.txt. + +Once memory state is written to disk, the system may either enter a +low-power state (like ACPI S4), or it may simply power down. Powering +down offers greater savings, and allows this mechanism to work on any +system. However, entering a real low-power state allows the user to +trigger wake up events (e.g. pressing a key or opening a laptop lid). + +A transition from Suspend-to-Disk to the On state should take about 30 +seconds, though it's typically a bit more with the current +implementation. diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt new file mode 100644 index 000000000000..c7c3459fde43 --- /dev/null +++ b/Documentation/power/swsusp.txt @@ -0,0 +1,235 @@ +From kernel/suspend.c: + + * BIG FAT WARNING ********************************************************* + * + * If you have unsupported (*) devices using DMA... + * ...say goodbye to your data. + * + * If you touch anything on disk between suspend and resume... + * ...kiss your data goodbye. + * + * If your disk driver does not support suspend... (IDE does) + * ...you'd better find out how to get along + * without your data. + * + * If you change kernel command line between suspend and resume... + * ...prepare for nasty fsck or worse. + * + * If you change your hardware while system is suspended... + * ...well, it was not good idea. + * + * (*) suspend/resume support is needed to make it safe. + +You need to append resume=/dev/your_swap_partition to kernel command +line. Then you suspend by + +echo shutdown > /sys/power/disk; echo disk > /sys/power/state + +. If you feel ACPI works pretty well on your system, you might try + +echo platform > /sys/power/disk; echo disk > /sys/power/state + + + +Article about goals and implementation of Software Suspend for Linux +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Author: G‚ábor Kuti +Last revised: 2003-10-20 by Pavel Machek + +Idea and goals to achieve + +Nowadays it is common in several laptops that they have a suspend button. It +saves the state of the machine to a filesystem or to a partition and switches +to standby mode. Later resuming the machine the saved state is loaded back to +ram and the machine can continue its work. It has two real benefits. First we +save ourselves the time machine goes down and later boots up, energy costs +are real high when running from batteries. The other gain is that we don't have to +interrupt our programs so processes that are calculating something for a long +time shouldn't need to be written interruptible. + +swsusp saves the state of the machine into active swaps and then reboots or +powerdowns. You must explicitly specify the swap partition to resume from with +``resume='' kernel option. If signature is found it loads and restores saved +state. If the option ``noresume'' is specified as a boot parameter, it skips +the resuming. + +In the meantime while the system is suspended you should not add/remove any +of the hardware, write to the filesystems, etc. + +Sleep states summary +==================== + +There are three different interfaces you can use, /proc/acpi should +work like this: + +In a really perfect world: +echo 1 > /proc/acpi/sleep # for standby +echo 2 > /proc/acpi/sleep # for suspend to ram +echo 3 > /proc/acpi/sleep # for suspend to ram, but with more power conservative +echo 4 > /proc/acpi/sleep # for suspend to disk +echo 5 > /proc/acpi/sleep # for shutdown unfriendly the system + +and perhaps +echo 4b > /proc/acpi/sleep # for suspend to disk via s4bios + +Frequently Asked Questions +========================== + +Q: well, suspending a server is IMHO a really stupid thing, +but... (Diego Zuccato): + +A: You bought new UPS for your server. How do you install it without +bringing machine down? Suspend to disk, rearrange power cables, +resume. + +You have your server on UPS. Power died, and UPS is indicating 30 +seconds to failure. What do you do? Suspend to disk. + +Ethernet card in your server died. You want to replace it. Your +server is not hotplug capable. What do you do? Suspend to disk, +replace ethernet card, resume. If you are fast your users will not +even see broken connections. + + +Q: Maybe I'm missing something, but why don't the regular I/O paths work? + +A: We do use the regular I/O paths. However we cannot restore the data +to its original location as we load it. That would create an +inconsistent kernel state which would certainly result in an oops. +Instead, we load the image into unused memory and then atomically copy +it back to it original location. This implies, of course, a maximum +image size of half the amount of memory. + +There are two solutions to this: + +* require half of memory to be free during suspend. That way you can +read "new" data onto free spots, then cli and copy + +* assume we had special "polling" ide driver that only uses memory +between 0-640KB. That way, I'd have to make sure that 0-640KB is free +during suspending, but otherwise it would work... + +suspend2 shares this fundamental limitation, but does not include user +data and disk caches into "used memory" by saving them in +advance. That means that the limitation goes away in practice. + +Q: Does linux support ACPI S4? + +A: Yes. That's what echo platform > /sys/power/disk does. + +Q: My machine doesn't work with ACPI. How can I use swsusp than ? + +A: Do a reboot() syscall with right parameters. Warning: glibc gets in +its way, so check with strace: + +reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, 0xd000fce2) + +(Thanks to Peter Osterlund:) + +#include <unistd.h> +#include <syscall.h> + +#define LINUX_REBOOT_MAGIC1 0xfee1dead +#define LINUX_REBOOT_MAGIC2 672274793 +#define LINUX_REBOOT_CMD_SW_SUSPEND 0xD000FCE2 + +int main() +{ + syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, + LINUX_REBOOT_CMD_SW_SUSPEND, 0); + return 0; +} + +Also /sys/ interface should be still present. + +Q: What is 'suspend2'? + +A: suspend2 is 'Software Suspend 2', a forked implementation of +suspend-to-disk which is available as separate patches for 2.4 and 2.6 +kernels from swsusp.sourceforge.net. It includes support for SMP, 4GB +highmem and preemption. It also has a extensible architecture that +allows for arbitrary transformations on the image (compression, +encryption) and arbitrary backends for writing the image (eg to swap +or an NFS share[Work In Progress]). Questions regarding suspend2 +should be sent to the mailing list available through the suspend2 +website, and not to the Linux Kernel Mailing List. We are working +toward merging suspend2 into the mainline kernel. + +Q: A kernel thread must voluntarily freeze itself (call 'refrigerator'). +I found some kernel threads that don't do it, and they don't freeze +so the system can't sleep. Is this a known behavior? + +A: All such kernel threads need to be fixed, one by one. Select the +place where the thread is safe to be frozen (no kernel semaphores +should be held at that point and it must be safe to sleep there), and +add: + + if (current->flags & PF_FREEZE) + refrigerator(PF_FREEZE); + +If the thread is needed for writing the image to storage, you should +instead set the PF_NOFREEZE process flag when creating the thread. + + +Q: What is the difference between between "platform", "shutdown" and +"firmware" in /sys/power/disk? + +A: + +shutdown: save state in linux, then tell bios to powerdown + +platform: save state in linux, then tell bios to powerdown and blink + "suspended led" + +firmware: tell bios to save state itself [needs BIOS-specific suspend + partition, and has very little to do with swsusp] + +"platform" is actually right thing to do, but "shutdown" is most +reliable. + +Q: I do not understand why you have such strong objections to idea of +selective suspend. + +A: Do selective suspend during runtime power managment, that's okay. But +its useless for suspend-to-disk. (And I do not see how you could use +it for suspend-to-ram, I hope you do not want that). + +Lets see, so you suggest to + +* SUSPEND all but swap device and parents +* Snapshot +* Write image to disk +* SUSPEND swap device and parents +* Powerdown + +Oh no, that does not work, if swap device or its parents uses DMA, +you've corrupted data. You'd have to do + +* SUSPEND all but swap device and parents +* FREEZE swap device and parents +* Snapshot +* UNFREEZE swap device and parents +* Write +* SUSPEND swap device and parents + +Which means that you still need that FREEZE state, and you get more +complicated code. (And I have not yet introduce details like system +devices). + +Q: There don't seem to be any generally useful behavioral +distinctions between SUSPEND and FREEZE. + +A: Doing SUSPEND when you are asked to do FREEZE is always correct, +but it may be unneccessarily slow. If you want USB to stay simple, +slowness may not matter to you. It can always be fixed later. + +For devices like disk it does matter, you do not want to spindown for +FREEZE. + +Q: After resuming, system is paging heavilly, leading to very bad interactivity. + +A: Try running + +cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null + +after resume. swapoff -a; swapon -a may also be usefull. diff --git a/Documentation/power/tricks.txt b/Documentation/power/tricks.txt new file mode 100644 index 000000000000..c6d58d3da133 --- /dev/null +++ b/Documentation/power/tricks.txt @@ -0,0 +1,27 @@ + swsusp/S3 tricks + ~~~~~~~~~~~~~~~~ +Pavel Machek <pavel@suse.cz> + +If you want to trick swsusp/S3 into working, you might want to try: + +* go with minimal config, turn off drivers like USB, AGP you don't + really need + +* turn off APIC and preempt + +* use ext2. At least it has working fsck. [If something seemes to go + wrong, force fsck when you have a chance] + +* turn off modules + +* use vga text console, shut down X. [If you really want X, you might + want to try vesafb later] + +* try running as few processes as possible, preferably go to single + user mode. + +* due to video issues, swsusp should be easier to get working than + S3. Try that first. + +When you make it work, try to find out what exactly was it that broke +suspend, and preferably fix that. diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt new file mode 100644 index 000000000000..8686968416ca --- /dev/null +++ b/Documentation/power/video.txt @@ -0,0 +1,169 @@ + + Video issues with S3 resume + ~~~~~~~~~~~~~~~~~~~~~~~~~~~ + 2003-2005, Pavel Machek + +During S3 resume, hardware needs to be reinitialized. For most +devices, this is easy, and kernel driver knows how to do +it. Unfortunately there's one exception: video card. Those are usually +initialized by BIOS, and kernel does not have enough information to +boot video card. (Kernel usually does not even contain video card +driver -- vesafb and vgacon are widely used). + +This is not problem for swsusp, because during swsusp resume, BIOS is +run normally so video card is normally initialized. S3 has absolutely +no chance of working with SMP/HT. Be sure it to turn it off before +testing (swsusp should work ok, OTOH). + +There are a few types of systems where video works after S3 resume: + +(1) systems where video state is preserved over S3. + +(2) systems where it is possible to call the video BIOS during S3 + resume. Unfortunately, it is not correct to call the video BIOS at + that point, but it happens to work on some machines. Use + acpi_sleep=s3_bios. + +(3) systems that initialize video card into vga text mode and where + the BIOS works well enough to be able to set video mode. Use + acpi_sleep=s3_mode on these. + +(4) on some systems s3_bios kicks video into text mode, and + acpi_sleep=s3_bios,s3_mode is needed. + +(5) radeon systems, where X can soft-boot your video card. You'll need + new enough X, and plain text console (no vesafb or radeonfb), see + http://www.doesi.gmxhome.de/linux/tm800s3/s3.html. Actually you + should probably use vbetool (6) instead. + +(6) other radeon systems, where vbetool is enough to bring system back + to life. It needs text console to be working. Do vbetool vbestate + save > /tmp/delme; echo 3 > /proc/acpi/sleep; vbetool post; vbetool + vbestate restore < /tmp/delme; setfont <whatever>, and your video + should work. + +(7) on some systems, it is possible to boot most of kernel, and then + POSTing bios works. Ole Rohne has patch to do just that at + http://dev.gentoo.org/~marineam/patch-radeonfb-2.6.11-rc2-mm2. + +Now, if you pass acpi_sleep=something, and it does not work with your +bios, you'll get a hard crash during resume. Be careful. Also it is +safest to do your experiments with plain old VGA console. The vesafb +and radeonfb (etc) drivers have a tendency to crash the machine during +resume. + +You may have a system where none of above works. At that point you +either invent another ugly hack that works, or write proper driver for +your video card (good luck getting docs :-(). Maybe suspending from X +(proper X, knowing your hardware, not XF68_FBcon) might have better +chance of working. + +Table of known working systems: + +Model hack (or "how to do it") +------------------------------------------------------------------------------ +Acer Aspire 1406LC ole's late BIOS init (7), turn off DRI +Acer TM 242FX vbetool (6) +Acer TM C300 vga=normal (only suspend on console, not in X), vbetool (6) +Acer TM 4052LCi s3_bios (2) +Acer TM 636Lci s3_bios vga=normal (2) +Acer TM 650 (Radeon M7) vga=normal plus boot-radeon (5) gets text console back +Acer TM 660 ??? (*) +Acer TM 800 vga=normal, X patches, see webpage (5) or vbetool (6) +Acer TM 803 vga=normal, X patches, see webpage (5) or vbetool (6) +Acer TM 803LCi vga=normal, vbetool (6) +Arima W730a vbetool needed (6) +Asus L2400D s3_mode (3)(***) (S1 also works OK) +Asus L3800C (Radeon M7) s3_bios (2) (S1 also works OK) +Asus M6NE ??? (*) +Athlon64 desktop prototype s3_bios (2) +Compal CL-50 ??? (*) +Compaq Armada E500 - P3-700 none (1) (S1 also works OK) +Compaq Evo N620c vga=normal, s3_bios (2) +Dell 600m, ATI R250 Lf none (1), but needs xorg-x11-6.8.1.902-1 +Dell D600, ATI RV250 vga=normal and X, or try vbestate (6) +Dell Inspiron 4000 ??? (*) +Dell Inspiron 500m ??? (*) +Dell Inspiron 600m ??? (*) +Dell Inspiron 8200 ??? (*) +Dell Inspiron 8500 ??? (*) +Dell Inspiron 8600 ??? (*) +eMachines athlon64 machines vbetool needed (6) (someone please get me model #s) +HP NC6000 s3_bios, may not use radeonfb (2); or vbetool (6) +HP NX7000 ??? (*) +HP Pavilion ZD7000 vbetool post needed, need open-source nv driver for X +HP Omnibook XE3 athlon version none (1) +HP Omnibook XE3GC none (1), video is S3 Savage/IX-MV +IBM TP T20, model 2647-44G none (1), video is S3 Inc. 86C270-294 Savage/IX-MV, vesafb gets "interesting" but X work. +IBM TP A31 / Type 2652-M5G s3_mode (3) [works ok with BIOS 1.04 2002-08-23, but not at all with BIOS 1.11 2004-11-05 :-(] +IBM TP R32 / Type 2658-MMG none (1) +IBM TP R40 2722B3G ??? (*) +IBM TP R50p / Type 1832-22U s3_bios (2) +IBM TP R51 ??? (*) +IBM TP T30 236681A ??? (*) +IBM TP T40 / Type 2373-MU4 none (1) +IBM TP T40p none (1) +IBM TP R40p s3_bios (2) +IBM TP T41p s3_bios (2), switch to X after resume +IBM TP T42 ??? (*) +IBM ThinkPad T42p (2373-GTG) s3_bios (2) +IBM TP X20 ??? (*) +IBM TP X30 ??? (*) +IBM TP X31 / Type 2672-XXH none (1), use radeontool (http://fdd.com/software/radeon/) to turn off backlight. +IBM Thinkpad X40 Type 2371-7JG s3_bios,s3_mode (4) +Medion MD4220 ??? (*) +Samsung P35 vbetool needed (6) +Sharp PC-AR10 (ATI rage) none (1) +Sony Vaio PCG-F403 ??? (*) +Sony Vaio PCG-N505SN ??? (*) +Sony Vaio vgn-s260 X or boot-radeon can init it (5) +Toshiba Libretto L5 none (1) +Toshiba Satellite 4030CDT s3_mode (3) +Toshiba Satellite 4080XCDT s3_mode (3) +Toshiba Satellite 4090XCDT ??? (*) +Toshiba Satellite P10-554 s3_bios,s3_mode (4)(****) +Uniwill 244IIO ??? (*) + + +(*) from http://www.ubuntulinux.org/wiki/HoaryPMResults, not sure + which options to use. If you know, please tell me. + +(***) To be tested with a newer kernel. + +(****) Not with SMP kernel, UP only. + +VBEtool details +~~~~~~~~~~~~~~~ +(with thanks to Carl-Daniel Hailfinger) + +First, boot into X and run the following script ONCE: +#!/bin/bash +statedir=/root/s3/state +mkdir -p $statedir +chvt 2 +sleep 1 +vbetool vbestate save >$statedir/vbe + + +To suspend and resume properly, call the following script as root: +#!/bin/bash +statedir=/root/s3/state +curcons=`fgconsole` +fuser /dev/tty$curcons 2>/dev/null|xargs ps -o comm= -p|grep -q X && chvt 2 +cat /dev/vcsa >$statedir/vcsa +sync +echo 3 >/proc/acpi/sleep +sync +vbetool post +vbetool vbestate restore <$statedir/vbe +cat $statedir/vcsa >/dev/vcsa +rckbd restart +chvt $[curcons%6+1] +chvt $curcons + + +Unless you change your graphics card or other hardware configuration, +the state once saved will be OK for every resume afterwards. +NOTE: The "rckbd restart" command may be different for your +distribution. Simply replace it with the command you would use to +set the fonts on screen. diff --git a/Documentation/power/video_extension.txt b/Documentation/power/video_extension.txt new file mode 100644 index 000000000000..8e33d7c82c49 --- /dev/null +++ b/Documentation/power/video_extension.txt @@ -0,0 +1,34 @@ +This driver implement the ACPI Extensions For Display Adapters +for integrated graphics devices on motherboard, as specified in +ACPI 2.0 Specification, Appendix B, allowing to perform some basic +control like defining the video POST device, retrieving EDID information +or to setup a video output, etc. Note that this is an ref. implementation only. +It may or may not work for your integrated video device. + +Interfaces exposed to userland through /proc/acpi/video: + +VGA/info : display the supported video bus device capability like ,Video ROM, CRT/LCD/TV. +VGA/ROM : Used to get a copy of the display devices' ROM data (up to 4k). +VGA/POST_info : Used to determine what options are implemented. +VGA/POST : Used to get/set POST device. +VGA/DOS : Used to get/set ownership of output switching: + Please refer ACPI spec B.4.1 _DOS +VGA/CRT : CRT output +VGA/LCD : LCD output +VGA/TV : TV output +VGA/*/brightness : Used to get/set brightness of output device + +Notify event through /proc/acpi/event: + +#define ACPI_VIDEO_NOTIFY_SWITCH 0x80 +#define ACPI_VIDEO_NOTIFY_PROBE 0x81 +#define ACPI_VIDEO_NOTIFY_CYCLE 0x82 +#define ACPI_VIDEO_NOTIFY_NEXT_OUTPUT 0x83 +#define ACPI_VIDEO_NOTIFY_PREV_OUTPUT 0x84 + +#define ACPI_VIDEO_NOTIFY_CYCLE_BRIGHTNESS 0x82 +#define ACPI_VIDEO_NOTIFY_INC_BRIGHTNESS 0x83 +#define ACPI_VIDEO_NOTIFY_DEC_BRIGHTNESS 0x84 +#define ACPI_VIDEO_NOTIFY_ZERO_BRIGHTNESS 0x85 +#define ACPI_VIDEO_NOTIFY_DISPLAY_OFF 0x86 + diff --git a/Documentation/powerpc/00-INDEX b/Documentation/powerpc/00-INDEX new file mode 100644 index 000000000000..e7bea0a407b4 --- /dev/null +++ b/Documentation/powerpc/00-INDEX @@ -0,0 +1,20 @@ +Index of files in Documentation/powerpc. If you think something about +Linux/PPC needs an entry here, needs correction or you've written one +please mail me. + Cort Dougan (cort@fsmlabs.com) + +00-INDEX + - this file +cpu_features.txt + - info on how we support a variety of CPUs with minimal compile-time + options. +ppc_htab.txt + - info about the Linux/PPC /proc/ppc_htab entry +smp.txt + - use and state info about Linux/PPC on MP machines +SBC8260_memory_mapping.txt + - EST SBC8260 board info +sound.txt + - info on sound support under Linux/PPC +zImage_layout.txt + - info on the kernel images for Linux/PPC diff --git a/Documentation/powerpc/SBC8260_memory_mapping.txt b/Documentation/powerpc/SBC8260_memory_mapping.txt new file mode 100644 index 000000000000..e6e9ee0506c3 --- /dev/null +++ b/Documentation/powerpc/SBC8260_memory_mapping.txt @@ -0,0 +1,197 @@ +Please mail me (Jon Diekema, diekema_jon@si.com or diekema@cideas.com) +if you have questions, comments or corrections. + + * EST SBC8260 Linux memory mapping rules + + http://www.estc.com/ + http://www.estc.com/products/boards/SBC8260-8240_ds.html + + Initial conditions: + ------------------- + + Tasks that need to be perform by the boot ROM before control is + transferred to zImage (compressed Linux kernel): + + - Define the IMMR to 0xf0000000 + + - Initialize the memory controller so that RAM is available at + physical address 0x00000000. On the SBC8260 is this 16M (64M) + SDRAM. + + - The boot ROM should only clear the RAM that it is using. + + The reason for doing this is to enhances the chances of a + successful post mortem on a Linux panic. One of the first + items to examine is the 16k (LOG_BUF_LEN) circular console + buffer called log_buf which is defined in kernel/printk.c. + + - To enhance boot ROM performance, the I-cache can be enabled. + + Date: Mon, 22 May 2000 14:21:10 -0700 + From: Neil Russell <caret@c-side.com> + + LiMon (LInux MONitor) runs with and starts Linux with MMU + off, I-cache enabled, D-cache disabled. The I-cache doesn't + need hints from the MMU to work correctly as the D-cache + does. No D-cache means no special code to handle devices in + the presence of cache (no snooping, etc). The use of the + I-cache means that the monitor can run acceptably fast + directly from ROM, rather than having to copy it to RAM. + + - Build the board information structure (see + include/asm-ppc/est8260.h for its definition) + + - The compressed Linux kernel (zImage) contains a bootstrap loader + that is position independent; you can load it into any RAM, + ROM or FLASH memory address >= 0x00500000 (above 5 MB), or + at its link address of 0x00400000 (4 MB). + + Note: If zImage is loaded at its link address of 0x00400000 (4 MB), + then zImage will skip the step of moving itself to + its link address. + + - Load R3 with the address of the board information structure + + - Transfer control to zImage + + - The Linux console port is SMC1, and the baud rate is controlled + from the bi_baudrate field of the board information structure. + On thing to keep in mind when picking the baud rate, is that + there is no flow control on the SMC ports. I would stick + with something safe and standard like 19200. + + On the EST SBC8260, the SMC1 port is on the COM1 connector of + the board. + + + EST SBC8260 defaults: + --------------------- + + Chip + Memory Sel Bus Use + --------------------- --- --- ---------------------------------- + 0x00000000-0x03FFFFFF CS2 60x (16M or 64M)/64M SDRAM + 0x04000000-0x04FFFFFF CS4 local 4M/16M SDRAM (soldered to the board) + 0x21000000-0x21000000 CS7 60x 1B/64K Flash present detect (from the flash SIMM) + 0x21000001-0x21000001 CS7 60x 1B/64K Switches (read) and LEDs (write) + 0x22000000-0x2200FFFF CS5 60x 8K/64K EEPROM + 0xFC000000-0xFCFFFFFF CS6 60x 2M/16M flash (8 bits wide, soldered to the board) + 0xFE000000-0xFFFFFFFF CS0 60x 4M/16M flash (SIMM) + + Notes: + ------ + + - The chip selects can map 32K blocks and up (powers of 2) + + - The SDRAM machine can handled up to 128Mbytes per chip select + + - Linux uses the 60x bus memory (the SDRAM DIMM) for the + communications buffers. + + - BATs can map 128K-256Mbytes each. There are four data BATs and + four instruction BATs. Generally the data and instruction BATs + are mapped the same. + + - The IMMR must be set above the kernel virtual memory addresses, + which start at 0xC0000000. Otherwise, the kernel may crash as + soon as you start any threads or processes due to VM collisions + in the kernel or user process space. + + + Details from Dan Malek <dan_malek@mvista.com> on 10/29/1999: + + The user application virtual space consumes the first 2 Gbytes + (0x00000000 to 0x7FFFFFFF). The kernel virtual text starts at + 0xC0000000, with data following. There is a "protection hole" + between the end of kernel data and the start of the kernel + dynamically allocated space, but this space is still within + 0xCxxxxxxx. + + Obviously the kernel can't map any physical addresses 1:1 in + these ranges. + + + Details from Dan Malek <dan_malek@mvista.com> on 5/19/2000: + + During the early kernel initialization, the kernel virtual + memory allocator is not operational. Prior to this KVM + initialization, we choose to map virtual to physical addresses + 1:1. That is, the kernel virtual address exactly matches the + physical address on the bus. These mappings are typically done + in arch/ppc/kernel/head.S, or arch/ppc/mm/init.c. Only + absolutely necessary mappings should be done at this time, for + example board control registers or a serial uart. Normal device + driver initialization should map resources later when necessary. + + Although platform dependent, and certainly the case for embedded + 8xx, traditionally memory is mapped at physical address zero, + and I/O devices above physical address 0x80000000. The lowest + and highest (above 0xf0000000) I/O addresses are traditionally + used for devices or registers we need to map during kernel + initialization and prior to KVM operation. For this reason, + and since it followed prior PowerPC platform examples, I chose + to map the embedded 8xx kernel to the 0xc0000000 virtual address. + This way, we can enable the MMU to map the kernel for proper + operation, and still map a few windows before the KVM is operational. + + On some systems, you could possibly run the kernel at the + 0x80000000 or any other virtual address. It just depends upon + mapping that must be done prior to KVM operational. You can never + map devices or kernel spaces that overlap with the user virtual + space. This is why default IMMR mapping used by most BDM tools + won't work. They put the IMMR at something like 0x10000000 or + 0x02000000 for example. You simply can't map these addresses early + in the kernel, and continue proper system operation. + + The embedded 8xx/82xx kernel is mature enough that all you should + need to do is map the IMMR someplace at or above 0xf0000000 and it + should boot far enough to get serial console messages and KGDB + connected on any platform. There are lots of other subtle memory + management design features that you simply don't need to worry + about. If you are changing functions related to MMU initialization, + you are likely breaking things that are known to work and are + heading down a path of disaster and frustration. Your changes + should be to make the flexibility of the processor fit Linux, + not force arbitrary and non-workable memory mappings into Linux. + + - You don't want to change KERNELLOAD or KERNELBASE, otherwise the + virtual memory and MMU code will get confused. + + arch/ppc/Makefile:KERNELLOAD = 0xc0000000 + + include/asm-ppc/page.h:#define PAGE_OFFSET 0xc0000000 + include/asm-ppc/page.h:#define KERNELBASE PAGE_OFFSET + + - RAM is at physical address 0x00000000, and gets mapped to + virtual address 0xC0000000 for the kernel. + + + Physical addresses used by the Linux kernel: + -------------------------------------------- + + 0x00000000-0x3FFFFFFF 1GB reserved for RAM + 0xF0000000-0xF001FFFF 128K IMMR 64K used for dual port memory, + 64K for 8260 registers + + + Logical addresses used by the Linux kernel: + ------------------------------------------- + + 0xF0000000-0xFFFFFFFF 256M BAT0 (IMMR: dual port RAM, registers) + 0xE0000000-0xEFFFFFFF 256M BAT1 (I/O space for custom boards) + 0xC0000000-0xCFFFFFFF 256M BAT2 (RAM) + 0xD0000000-0xDFFFFFFF 256M BAT3 (if RAM > 256MByte) + + + EST SBC8260 Linux mapping: + -------------------------- + + DBAT0, IBAT0, cache inhibited: + + Chip + Memory Sel Use + --------------------- --- --------------------------------- + 0xF0000000-0xF001FFFF n/a IMMR: dual port RAM, registers + + DBAT1, IBAT1, cache inhibited: + diff --git a/Documentation/powerpc/cpu_features.txt b/Documentation/powerpc/cpu_features.txt new file mode 100644 index 000000000000..472739880e87 --- /dev/null +++ b/Documentation/powerpc/cpu_features.txt @@ -0,0 +1,56 @@ +Hollis Blanchard <hollis@austin.ibm.com> +5 Jun 2002 + +This document describes the system (including self-modifying code) used in the +PPC Linux kernel to support a variety of PowerPC CPUs without requiring +compile-time selection. + +Early in the boot process the ppc32 kernel detects the current CPU type and +chooses a set of features accordingly. Some examples include Altivec support, +split instruction and data caches, and if the CPU supports the DOZE and NAP +sleep modes. + +Detection of the feature set is simple. A list of processors can be found in +arch/ppc/kernel/cputable.c. The PVR register is masked and compared with each +value in the list. If a match is found, the cpu_features of cur_cpu_spec is +assigned to the feature bitmask for this processor and a __setup_cpu function +is called. + +C code may test 'cur_cpu_spec[smp_processor_id()]->cpu_features' for a +particular feature bit. This is done in quite a few places, for example +in ppc_setup_l2cr(). + +Implementing cpufeatures in assembly is a little more involved. There are +several paths that are performance-critical and would suffer if an array +index, structure dereference, and conditional branch were added. To avoid the +performance penalty but still allow for runtime (rather than compile-time) CPU +selection, unused code is replaced by 'nop' instructions. This nop'ing is +based on CPU 0's capabilities, so a multi-processor system with non-identical +processors will not work (but such a system would likely have other problems +anyways). + +After detecting the processor type, the kernel patches out sections of code +that shouldn't be used by writing nop's over it. Using cpufeatures requires +just 2 macros (found in include/asm-ppc/cputable.h), as seen in head.S +transfer_to_handler: + + #ifdef CONFIG_ALTIVEC + BEGIN_FTR_SECTION + mfspr r22,SPRN_VRSAVE /* if G4, save vrsave register value */ + stw r22,THREAD_VRSAVE(r23) + END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC) + #endif /* CONFIG_ALTIVEC */ + +If CPU 0 supports Altivec, the code is left untouched. If it doesn't, both +instructions are replaced with nop's. + +The END_FTR_SECTION macro has two simpler variations: END_FTR_SECTION_IFSET +and END_FTR_SECTION_IFCLR. These simply test if a flag is set (in +cur_cpu_spec[0]->cpu_features) or is cleared, respectively. These two macros +should be used in the majority of cases. + +The END_FTR_SECTION macros are implemented by storing information about this +code in the '__ftr_fixup' ELF section. When do_cpu_ftr_fixups +(arch/ppc/kernel/misc.S) is invoked, it will iterate over the records in +__ftr_fixup, and if the required feature is not present it will loop writing +nop's from each BEGIN_FTR_SECTION to END_FTR_SECTION. diff --git a/Documentation/powerpc/eeh-pci-error-recovery.txt b/Documentation/powerpc/eeh-pci-error-recovery.txt new file mode 100644 index 000000000000..2bfe71beec5b --- /dev/null +++ b/Documentation/powerpc/eeh-pci-error-recovery.txt @@ -0,0 +1,332 @@ + + + PCI Bus EEH Error Recovery + -------------------------- + Linas Vepstas + <linas@austin.ibm.com> + 12 January 2005 + + +Overview: +--------- +The IBM POWER-based pSeries and iSeries computers include PCI bus +controller chips that have extended capabilities for detecting and +reporting a large variety of PCI bus error conditions. These features +go under the name of "EEH", for "Extended Error Handling". The EEH +hardware features allow PCI bus errors to be cleared and a PCI +card to be "rebooted", without also having to reboot the operating +system. + +This is in contrast to traditional PCI error handling, where the +PCI chip is wired directly to the CPU, and an error would cause +a CPU machine-check/check-stop condition, halting the CPU entirely. +Another "traditional" technique is to ignore such errors, which +can lead to data corruption, both of user data or of kernel data, +hung/unresponsive adapters, or system crashes/lockups. Thus, +the idea behind EEH is that the operating system can become more +reliable and robust by protecting it from PCI errors, and giving +the OS the ability to "reboot"/recover individual PCI devices. + +Future systems from other vendors, based on the PCI-E specification, +may contain similar features. + + +Causes of EEH Errors +-------------------- +EEH was originally designed to guard against hardware failure, such +as PCI cards dying from heat, humidity, dust, vibration and bad +electrical connections. The vast majority of EEH errors seen in +"real life" are due to eithr poorly seated PCI cards, or, +unfortunately quite commonly, due device driver bugs, device firmware +bugs, and sometimes PCI card hardware bugs. + +The most common software bug, is one that causes the device to +attempt to DMA to a location in system memory that has not been +reserved for DMA access for that card. This is a powerful feature, +as it prevents what; otherwise, would have been silent memory +corruption caused by the bad DMA. A number of device driver +bugs have been found and fixed in this way over the past few +years. Other possible causes of EEH errors include data or +address line parity errors (for example, due to poor electrical +connectivity due to a poorly seated card), and PCI-X split-completion +errors (due to software, device firmware, or device PCI hardware bugs). +The vast majority of "true hardware failures" can be cured by +physically removing and re-seating the PCI card. + + +Detection and Recovery +---------------------- +In the following discussion, a generic overview of how to detect +and recover from EEH errors will be presented. This is followed +by an overview of how the current implementation in the Linux +kernel does it. The actual implementation is subject to change, +and some of the finer points are still being debated. These +may in turn be swayed if or when other architectures implement +similar functionality. + +When a PCI Host Bridge (PHB, the bus controller connecting the +PCI bus to the system CPU electronics complex) detects a PCI error +condition, it will "isolate" the affected PCI card. Isolation +will block all writes (either to the card from the system, or +from the card to the system), and it will cause all reads to +return all-ff's (0xff, 0xffff, 0xffffffff for 8/16/32-bit reads). +This value was chosen because it is the same value you would +get if the device was physically unplugged from the slot. +This includes access to PCI memory, I/O space, and PCI config +space. Interrupts; however, will continued to be delivered. + +Detection and recovery are performed with the aid of ppc64 +firmware. The programming interfaces in the Linux kernel +into the firmware are referred to as RTAS (Run-Time Abstraction +Services). The Linux kernel does not (should not) access +the EEH function in the PCI chipsets directly, primarily because +there are a number of different chipsets out there, each with +different interfaces and quirks. The firmware provides a +uniform abstraction layer that will work with all pSeries +and iSeries hardware (and be forwards-compatible). + +If the OS or device driver suspects that a PCI slot has been +EEH-isolated, there is a firmware call it can make to determine if +this is the case. If so, then the device driver should put itself +into a consistent state (given that it won't be able to complete any +pending work) and start recovery of the card. Recovery normally +would consist of reseting the PCI device (holding the PCI #RST +line high for two seconds), followed by setting up the device +config space (the base address registers (BAR's), latency timer, +cache line size, interrupt line, and so on). This is followed by a +reinitialization of the device driver. In a worst-case scenario, +the power to the card can be toggled, at least on hot-plug-capable +slots. In principle, layers far above the device driver probably +do not need to know that the PCI card has been "rebooted" in this +way; ideally, there should be at most a pause in Ethernet/disk/USB +I/O while the card is being reset. + +If the card cannot be recovered after three or four resets, the +kernel/device driver should assume the worst-case scenario, that the +card has died completely, and report this error to the sysadmin. +In addition, error messages are reported through RTAS and also through +syslogd (/var/log/messages) to alert the sysadmin of PCI resets. +The correct way to deal with failed adapters is to use the standard +PCI hotplug tools to remove and replace the dead card. + + +Current PPC64 Linux EEH Implementation +-------------------------------------- +At this time, a generic EEH recovery mechanism has been implemented, +so that individual device drivers do not need to be modified to support +EEH recovery. This generic mechanism piggy-backs on the PCI hotplug +infrastructure, and percolates events up through the hotplug/udev +infrastructure. Followiing is a detailed description of how this is +accomplished. + +EEH must be enabled in the PHB's very early during the boot process, +and if a PCI slot is hot-plugged. The former is performed by +eeh_init() in arch/ppc64/kernel/eeh.c, and the later by +drivers/pci/hotplug/pSeries_pci.c calling in to the eeh.c code. +EEH must be enabled before a PCI scan of the device can proceed. +Current Power5 hardware will not work unless EEH is enabled; +although older Power4 can run with it disabled. Effectively, +EEH can no longer be turned off. PCI devices *must* be +registered with the EEH code; the EEH code needs to know about +the I/O address ranges of the PCI device in order to detect an +error. Given an arbitrary address, the routine +pci_get_device_by_addr() will find the pci device associated +with that address (if any). + +The default include/asm-ppc64/io.h macros readb(), inb(), insb(), +etc. include a check to see if the the i/o read returned all-0xff's. +If so, these make a call to eeh_dn_check_failure(), which in turn +asks the firmware if the all-ff's value is the sign of a true EEH +error. If it is not, processing continues as normal. The grand +total number of these false alarms or "false positives" can be +seen in /proc/ppc64/eeh (subject to change). Normally, almost +all of these occur during boot, when the PCI bus is scanned, where +a large number of 0xff reads are part of the bus scan procedure. + +If a frozen slot is detected, code in arch/ppc64/kernel/eeh.c will +print a stack trace to syslog (/var/log/messages). This stack trace +has proven to be very useful to device-driver authors for finding +out at what point the EEH error was detected, as the error itself +usually occurs slightly beforehand. + +Next, it uses the Linux kernel notifier chain/work queue mechanism to +allow any interested parties to find out about the failure. Device +drivers, or other parts of the kernel, can use +eeh_register_notifier(struct notifier_block *) to find out about EEH +events. The event will include a pointer to the pci device, the +device node and some state info. Receivers of the event can "do as +they wish"; the default handler will be described further in this +section. + +To assist in the recovery of the device, eeh.c exports the +following functions: + +rtas_set_slot_reset() -- assert the PCI #RST line for 1/8th of a second +rtas_configure_bridge() -- ask firmware to configure any PCI bridges + located topologically under the pci slot. +eeh_save_bars() and eeh_restore_bars(): save and restore the PCI + config-space info for a device and any devices under it. + + +A handler for the EEH notifier_block events is implemented in +drivers/pci/hotplug/pSeries_pci.c, called handle_eeh_events(). +It saves the device BAR's and then calls rpaphp_unconfig_pci_adapter(). +This last call causes the device driver for the card to be stopped, +which causes hotplug events to go out to user space. This triggers +user-space scripts that might issue commands such as "ifdown eth0" +for ethernet cards, and so on. This handler then sleeps for 5 seconds, +hoping to give the user-space scripts enough time to complete. +It then resets the PCI card, reconfigures the device BAR's, and +any bridges underneath. It then calls rpaphp_enable_pci_slot(), +which restarts the device driver and triggers more user-space +events (for example, calling "ifup eth0" for ethernet cards). + + +Device Shutdown and User-Space Events +------------------------------------- +This section documents what happens when a pci slot is unconfigured, +focusing on how the device driver gets shut down, and on how the +events get delivered to user-space scripts. + +Following is an example sequence of events that cause a device driver +close function to be called during the first phase of an EEH reset. +The following sequence is an example of the pcnet32 device driver. + + rpa_php_unconfig_pci_adapter (struct slot *) // in rpaphp_pci.c + { + calls + pci_remove_bus_device (struct pci_dev *) // in /drivers/pci/remove.c + { + calls + pci_destroy_dev (struct pci_dev *) + { + calls + device_unregister (&dev->dev) // in /drivers/base/core.c + { + calls + device_del (struct device *) + { + calls + bus_remove_device() // in /drivers/base/bus.c + { + calls + device_release_driver() + { + calls + struct device_driver->remove() which is just + pci_device_remove() // in /drivers/pci/pci_driver.c + { + calls + struct pci_driver->remove() which is just + pcnet32_remove_one() // in /drivers/net/pcnet32.c + { + calls + unregister_netdev() // in /net/core/dev.c + { + calls + dev_close() // in /net/core/dev.c + { + calls dev->stop(); + which is just pcnet32_close() // in pcnet32.c + { + which does what you wanted + to stop the device + } + } + } + which + frees pcnet32 device driver memory + } + }}}}}} + + + in drivers/pci/pci_driver.c, + struct device_driver->remove() is just pci_device_remove() + which calls struct pci_driver->remove() which is pcnet32_remove_one() + which calls unregister_netdev() (in net/core/dev.c) + which calls dev_close() (in net/core/dev.c) + which calls dev->stop() which is pcnet32_close() + which then does the appropriate shutdown. + +--- +Following is the analogous stack trace for events sent to user-space +when the pci device is unconfigured. + +rpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c + calls + pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c + calls + pci_destroy_dev (struct pci_dev *) { + calls + device_unregister (&dev->dev) { // in /drivers/base/core.c + calls + device_del(struct device * dev) { // in /drivers/base/core.c + calls + kobject_del() { //in /libs/kobject.c + calls + kobject_hotplug() { // in /libs/kobject.c + calls + kset_hotplug() { // in /lib/kobject.c + calls + kset->hotplug_ops->hotplug() which is really just + a call to + dev_hotplug() { // in /drivers/base/core.c + calls + dev->bus->hotplug() which is really just a call to + pci_hotplug () { // in drivers/pci/hotplug.c + which prints device name, etc.... + } + } + then kset_hotplug() calls + call_usermodehelper () with + argv[0]=hotplug_path[] which is "/sbin/hotplug" + --> event to userspace, + } + } + kobject_del() then calls sysfs_remove_dir(), which would + trigger any user-space daemon that was watching /sysfs, + and notice the delete event. + + +Pro's and Con's of the Current Design +------------------------------------- +There are several issues with the current EEH software recovery design, +which may be addressed in future revisions. But first, note that the +big plus of the current design is that no changes need to be made to +individual device drivers, so that the current design throws a wide net. +The biggest negative of the design is that it potentially disturbs +network daemons and file systems that didn't need to be disturbed. + +-- A minor complaint is that resetting the network card causes + user-space back-to-back ifdown/ifup burps that potentially disturb + network daemons, that didn't need to even know that the pci + card was being rebooted. + +-- A more serious concern is that the same reset, for SCSI devices, + causes havoc to mounted file systems. Scripts cannot post-facto + unmount a file system without flushing pending buffers, but this + is impossible, because I/O has already been stopped. Thus, + ideally, the reset should happen at or below the block layer, + so that the file systems are not disturbed. + + Reiserfs does not tolerate errors returned from the block device. + Ext3fs seems to be tolerant, retrying reads/writes until it does + succeed. Both have been only lightly tested in this scenario. + + The SCSI-generic subsystem already has built-in code for performing + SCSI device resets, SCSI bus resets, and SCSI host-bus-adapter + (HBA) resets. These are cascaded into a chain of attempted + resets if a SCSI command fails. These are completely hidden + from the block layer. It would be very natural to add an EEH + reset into this chain of events. + +-- If a SCSI error occurs for the root device, all is lost unless + the sysadmin had the foresight to run /bin, /sbin, /etc, /var + and so on, out of ramdisk/tmpfs. + + +Conclusions +----------- +There's forward progress ... + + diff --git a/Documentation/powerpc/hvcs.txt b/Documentation/powerpc/hvcs.txt new file mode 100644 index 000000000000..c0a62e116e6e --- /dev/null +++ b/Documentation/powerpc/hvcs.txt @@ -0,0 +1,567 @@ +=========================================================================== + HVCS + IBM "Hypervisor Virtual Console Server" Installation Guide + for Linux Kernel 2.6.4+ + Copyright (C) 2004 IBM Corporation + +=========================================================================== +NOTE:Eight space tabs are the optimum editor setting for reading this file. +=========================================================================== + + Author(s) : Ryan S. Arnold <rsa@us.ibm.com> + Date Created: March, 02, 2004 + Last Changed: August, 24, 2004 + +--------------------------------------------------------------------------- +Table of contents: + + 1. Driver Introduction: + 2. System Requirements + 3. Build Options: + 3.1 Built-in: + 3.2 Module: + 4. Installation: + 5. Connection: + 6. Disconnection: + 7. Configuration: + 8. Questions & Answers: + 9. Reporting Bugs: + +--------------------------------------------------------------------------- +1. Driver Introduction: + +This is the device driver for the IBM Hypervisor Virtual Console Server, +"hvcs". The IBM hvcs provides a tty driver interface to allow Linux user +space applications access to the system consoles of logically partitioned +operating systems (Linux and AIX) running on the same partitioned Power5 +ppc64 system. Physical hardware consoles per partition are not practical +on this hardware so system consoles are accessed by this driver using +firmware interfaces to virtual terminal devices. + +--------------------------------------------------------------------------- +2. System Requirements: + +This device driver was written using 2.6.4 Linux kernel APIs and will only +build and run on kernels of this version or later. + +This driver was written to operate solely on IBM Power5 ppc64 hardware +though some care was taken to abstract the architecture dependent firmware +calls from the driver code. + +Sysfs must be mounted on the system so that the user can determine which +major and minor numbers are associated with each vty-server. Directions +for sysfs mounting are outside the scope of this document. + +--------------------------------------------------------------------------- +3. Build Options: + +The hvcs driver registers itself as a tty driver. The tty layer +dynamically allocates a block of major and minor numbers in a quantity +requested by the registering driver. The hvcs driver asks the tty layer +for 64 of these major/minor numbers by default to use for hvcs device node +entries. + +If the default number of device entries is adequate then this driver can be +built into the kernel. If not, the default can be over-ridden by inserting +the driver as a module with insmod parameters. + +--------------------------------------------------------------------------- +3.1 Built-in: + +The following menuconfig example demonstrates selecting to build this +driver into the kernel. + + Device Drivers ---> + Character devices ---> + <*> IBM Hypervisor Virtual Console Server Support + +Begin the kernel make process. + +--------------------------------------------------------------------------- +3.2 Module: + +The following menuconfig example demonstrates selecting to build this +driver as a kernel module. + + Device Drivers ---> + Character devices ---> + <M> IBM Hypervisor Virtual Console Server Support + +The make process will build the following kernel modules: + + hvcs.ko + hvcserver.ko + +To insert the module with the default allocation execute the following +commands in the order they appear: + + insmod hvcserver.ko + insmod hvcs.ko + +The hvcserver module contains architecture specific firmware calls and must +be inserted first, otherwise the hvcs module will not find some of the +symbols it expects. + +To override the default use an insmod parameter as follows (requesting 4 +tty devices as an example): + + insmod hvcs.ko hvcs_parm_num_devs=4 + +There is a maximum number of dev entries that can be specified on insmod. +We think that 1024 is currently a decent maximum number of server adapters +to allow. This can always be changed by modifying the constant in the +source file before building. + +NOTE: The length of time it takes to insmod the driver seems to be related +to the number of tty interfaces the registering driver requests. + +In order to remove the driver module execute the following command: + + rmmod hvcs.ko + +The recommended method for installing hvcs as a module is to use depmod to +build a current modules.dep file in /lib/modules/`uname -r` and then +execute: + +modprobe hvcs hvcs_parm_num_devs=4 + +The modules.dep file indicates that hvcserver.ko needs to be inserted +before hvcs.ko and modprobe uses this file to smartly insert the modules in +the proper order. + +The following modprobe command is used to remove hvcs and hvcserver in the +proper order: + +modprobe -r hvcs + +--------------------------------------------------------------------------- +4. Installation: + +The tty layer creates sysfs entries which contain the major and minor +numbers allocated for the hvcs driver. The following snippet of "tree" +output of the sysfs directory shows where these numbers are presented: + + sys/ + |-- *other sysfs base dirs* + | + |-- class + | |-- *other classes of devices* + | | + | `-- tty + | |-- *other tty devices* + | | + | |-- hvcs0 + | | `-- dev + | |-- hvcs1 + | | `-- dev + | |-- hvcs2 + | | `-- dev + | |-- hvcs3 + | | `-- dev + | | + | |-- *other tty devices* + | + |-- *other sysfs base dirs* + +For the above examples the following output is a result of cat'ing the +"dev" entry in the hvcs directory: + + Pow5:/sys/class/tty/hvcs0/ # cat dev + 254:0 + + Pow5:/sys/class/tty/hvcs1/ # cat dev + 254:1 + + Pow5:/sys/class/tty/hvcs2/ # cat dev + 254:2 + + Pow5:/sys/class/tty/hvcs3/ # cat dev + 254:3 + +The output from reading the "dev" attribute is the char device major and +minor numbers that the tty layer has allocated for this driver's use. Most +systems running hvcs will already have the device entries created or udev +will do it automatically. + +Given the example output above, to manually create a /dev/hvcs* node entry +mknod can be used as follows: + + mknod /dev/hvcs0 c 254 0 + mknod /dev/hvcs1 c 254 1 + mknod /dev/hvcs2 c 254 2 + mknod /dev/hvcs3 c 254 3 + +Using mknod to manually create the device entries makes these device nodes +persistent. Once created they will exist prior to the driver insmod. + +Attempting to connect an application to /dev/hvcs* prior to insertion of +the hvcs module will result in an error message similar to the following: + + "/dev/hvcs*: No such device". + +NOTE: Just because there is a device node present doesn't mean that there +is a vty-server device configured for that node. + +--------------------------------------------------------------------------- +5. Connection + +Since this driver controls devices that provide a tty interface a user can +interact with the device node entries using any standard tty-interactive +method (e.g. "cat", "dd", "echo"). The intent of this driver however, is +to provide real time console interaction with a Linux partition's console, +which requires the use of applications that provide bi-directional, +interactive I/O with a tty device. + +Applications (e.g. "minicom" and "screen") that act as terminal emulators +or perform terminal type control sequence conversion on the data being +passed through them are NOT acceptable for providing interactive console +I/O. These programs often emulate antiquated terminal types (vt100 and +ANSI) and expect inbound data to take the form of one of these supported +terminal types but they either do not convert, or do not _adequately_ +convert, outbound data into the terminal type of the terminal which invoked +them (though screen makes an attempt and can apparently be configured with +much termcap wrestling.) + +For this reason kermit and cu are two of the recommended applications for +interacting with a Linux console via an hvcs device. These programs simply +act as a conduit for data transfer to and from the tty device. They do not +require inbound data to take the form of a particular terminal type, nor do +they cook outbound data to a particular terminal type. + +In order to ensure proper functioning of console applications one must make +sure that once connected to a /dev/hvcs console that the console's $TERM +env variable is set to the exact terminal type of the terminal emulator +used to launch the interactive I/O application. If one is using xterm and +kermit to connect to /dev/hvcs0 when the console prompt becomes available +one should "export TERM=xterm" on the console. This tells ncurses +applications that are invoked from the console that they should output +control sequences that xterm can understand. + +As a precautionary measure an hvcs user should always "exit" from their +session before disconnecting an application such as kermit from the device +node. If this is not done, the next user to connect to the console will +continue using the previous user's logged in session which includes +using the $TERM variable that the previous user supplied. + +Hotplug add and remove of vty-server adapters affects which /dev/hvcs* node +is used to connect to each vty-server adapter. In order to determine which +vty-server adapter is associated with which /dev/hvcs* node a special sysfs +attribute has been added to each vty-server sysfs entry. This entry is +called "index" and showing it reveals an integer that refers to the +/dev/hvcs* entry to use to connect to that device. For instance cating the +index attribute of vty-server adapter 30000004 shows the following. + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat index + 2 + +This index of '2' means that in order to connect to vty-server adapter +30000004 the user should interact with /dev/hvcs2. + +It should be noted that due to the system hotplug I/O capabilities of a +system the /dev/hvcs* entry that interacts with a particular vty-server +adapter is not guarenteed to remain the same across system reboots. Look +in the Q & A section for more on this issue. + +--------------------------------------------------------------------------- +6. Disconnection + +As a security feature to prevent the delivery of stale data to an +unintended target the Power5 system firmware disables the fetching of data +and discards that data when a connection between a vty-server and a vty has +been severed. As an example, when a vty-server is immediately disconnected +from a vty following output of data to the vty the vty adapter may not have +enough time between when it received the data interrupt and when the +connection was severed to fetch the data from firmware before the fetch is +disabled by firmware. + +When hvcs is being used to serve consoles this behavior is not a huge issue +because the adapter stays connected for large amounts of time following +almost all data writes. When hvcs is being used as a tty conduit to tunnel +data between two partitions [see Q & A below] this is a huge problem +because the standard Linux behavior when cat'ing or dd'ing data to a device +is to open the tty, send the data, and then close the tty. If this driver +manually terminated vty-server connections on tty close this would close +the vty-server and vty connection before the target vty has had a chance to +fetch the data. + +Additionally, disconnecting a vty-server and vty only on module removal or +adapter removal is impractical because other vty-servers in other +partitions may require the usage of the target vty at any time. + +Due to this behavioral restriction disconnection of vty-servers from the +connected vty is a manual procedure using a write to a sysfs attribute +outlined below, on the other hand the initial vty-server connection to a +vty is established automatically by this driver. Manual vty-server +connection is never required. + +In order to terminate the connection between a vty-server and vty the +"vterm_state" sysfs attribute within each vty-server's sysfs entry is used. +Reading this attribute reveals the current connection state of the +vty-server adapter. A zero means that the vty-server is not connected to a +vty. A one indicates that a connection is active. + +Writing a '0' (zero) to the vterm_state attribute will disconnect the VTERM +connection between the vty-server and target vty ONLY if the vterm_state +previously read '1'. The write directive is ignored if the vterm_state +read '0' or if any value other than '0' was written to the vterm_state +attribute. The following example will show the method used for verifying +the vty-server connection status and disconnecting a vty-server connection. + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat vterm_state + 1 + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # echo 0 > vterm_state + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat vterm_state + 0 + +All vty-server connections are automatically terminated when the device is +hotplug removed and when the module is removed. + +--------------------------------------------------------------------------- +7. Configuration + +Each vty-server has a sysfs entry in the /sys/devices/vio directory, which +is symlinked in several other sysfs tree directories, notably under the +hvcs driver entry, which looks like the following example: + + Pow5:/sys/bus/vio/drivers/hvcs # ls + . .. 30000003 30000004 rescan + +By design, firmware notifies the hvcs driver of vty-server lifetimes and +partner vty removals but not the addition of partner vtys. Since an HMC +Super Admin can add partner info dynamically we have provided the hvcs +driver sysfs directory with the "rescan" update attribute which will query +firmware and update the partner info for all the vty-servers that this +driver manages. Writing a '1' to the attribute triggers the update. An +explicit example follows: + + Pow5:/sys/bus/vio/drivers/hvcs # echo 1 > rescan + +Reading the attribute will indicate a state of '1' or '0'. A one indicates +that an update is in process. A zero indicates that an update has +completed or was never executed. + +Vty-server entries in this directory are a 32 bit partition unique unit +address that is created by firmware. An example vty-server sysfs entry +looks like the following: + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # ls + . current_vty devspec name partner_vtys + .. detach_state index partner_clcs vterm_state + +Each entry is provided, by default with a "name" attribute. Reading the +"name" attribute will reveal the device type as shown in the following +example: + + Pow5:/sys/bus/vio/drivers/hvcs/30000003 # cat name + vty-server + +Each entry is also provided, by default, with a "devspec" attribute which +reveals the full device specification when read, as shown in the following +example: + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat devspec + /vdevice/vty-server@30000004 + +Each vty-server sysfs dir is provided with two read-only attributes that +provide lists of easily parsed partner vty data: "partner_vtys" and +"partner_clcs". + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat partner_vtys + 30000000 + 30000001 + 30000002 + 30000000 + 30000000 + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat partner_clcs + U5112.428.103048A-V3-C0 + U5112.428.103048A-V3-C2 + U5112.428.103048A-V3-C3 + U5112.428.103048A-V4-C0 + U5112.428.103048A-V5-C0 + +Reading partner_vtys returns a list of partner vtys. Vty unit address +numbering is only per-partition-unique so entries will frequently repeat. + +Reading partner_clcs returns a list of "converged location codes" which are +composed of a system serial number followed by "-V*", where the '*' is the +target partition number, and "-C*", where the '*' is the slot of the +adapter. The first vty partner corresponds to the first clc item, the +second vty partner to the second clc item, etc. + +A vty-server can only be connected to a single vty at a time. The entry, +"current_vty" prints the clc of the currently selected partner vty when +read. + +The current_vty can be changed by writing a valid partner clc to the entry +as in the following example: + + Pow5:/sys/bus/vio/drivers/hvcs/30000004 # echo U5112.428.10304 + 8A-V4-C0 > current_vty + +Changing the current_vty when a vty-server is already connected to a vty +does not affect the current connection. The change takes effect when the +currently open connection is freed. + +Information on the "vterm_state" attribute was covered earlier on the +chapter entitled "disconnection". + +--------------------------------------------------------------------------- +8. Questions & Answers: +=========================================================================== +Q: What are the security concerns involving hvcs? + +A: There are three main security concerns: + + 1. The creator of the /dev/hvcs* nodes has the ability to restrict + the access of the device entries to certain users or groups. It + may be best to create a special hvcs group privilege for providing + access to system consoles. + + 2. To provide network security when grabbing the console it is + suggested that the user connect to the console hosting partition + using a secure method, such as SSH or sit at a hardware console. + + 3. Make sure to exit the user session when done with a console or + the next vty-server connection (which may be from another + partition) will experience the previously logged in session. + +--------------------------------------------------------------------------- +Q: How do I multiplex a console that I grab through hvcs so that other +people can see it: + +A: You can use "screen" to directly connect to the /dev/hvcs* device and +setup a session on your machine with the console group privileges. As +pointed out earlier by default screen doesn't provide the termcap settings +for most terminal emulators to provide adequate character conversion from +term type "screen" to others. This means that curses based programs may +not display properly in screen sessions. + +--------------------------------------------------------------------------- +Q: Why are the colors all messed up? +Q: Why are the control characters acting strange or not working? +Q: Why is the console output all strange and unintelligible? + +A: Please see the preceding section on "Connection" for a discussion of how +applications can affect the display of character control sequences. +Additionally, just because you logged into the console using and xterm +doesn't mean someone else didn't log into the console with the HMC console +(vt320) before you and leave the session logged in. The best thing to do +is to export TERM to the terminal type of your terminal emulator when you +get the console. Additionally make sure to "exit" the console before you +disconnect from the console. This will ensure that the next user gets +their own TERM type set when they login. + +--------------------------------------------------------------------------- +Q: When I try to CONNECT kermit to an hvcs device I get: +"Sorry, can't open connection: /dev/hvcs*"What is happening? + +A: Some other Power5 console mechanism has a connection to the vty and +isn't giving it up. You can try to force disconnect the consoles from the +HMC by right clicking on the partition and then selecting "close terminal". +Otherwise you have to hunt down the people who have console authority. It +is possible that you already have the console open using another kermit +session and just forgot about it. Please review the console options for +Power5 systems to determine the many ways a system console can be held. + +OR + +A: Another user may not have a connectivity method currently attached to a +/dev/hvcs device but the vterm_state may reveal that they still have the +vty-server connection established. They need to free this using the method +outlined in the section on "Disconnection" in order for others to connect +to the target vty. + +OR + +A: The user profile you are using to execute kermit probably doesn't have +permissions to use the /dev/hvcs* device. + +OR + +A: You probably haven't inserted the hvcs.ko module yet but the /dev/hvcs* +entry still exists (on systems without udev). + +OR + +A: There is not a corresponding vty-server device that maps to an existing +/dev/hvcs* entry. + +--------------------------------------------------------------------------- +Q: When I try to CONNECT kermit to an hvcs device I get: +"Sorry, write access to UUCP lockfile directory denied." + +A: The /dev/hvcs* entry you have specified doesn't exist where you said it +does? Maybe you haven't inserted the module (on systems with udev). + +--------------------------------------------------------------------------- +Q: If I already have one Linux partition installed can I use hvcs on said +partition to provide the console for the install of a second Linux +partition? + +A: Yes granted that your are connected to the /dev/hvcs* device using +kermit or cu or some other program that doesn't provide terminal emulation. + +--------------------------------------------------------------------------- +Q: Can I connect to more than one partition's console at a time using this +driver? + +A: Yes. Of course this means that there must be more than one vty-server +configured for this partition and each must point to a disconnected vty. + +--------------------------------------------------------------------------- +Q: Does the hvcs driver support dynamic (hotplug) addition of devices? + +A: Yes, if you have dlpar and hotplug enabled for your system and it has +been built into the kernel the hvcs drivers is configured to dynamically +handle additions of new devices and removals of unused devices. + +--------------------------------------------------------------------------- +Q: For some reason /dev/hvcs* doesn't map to the same vty-server adapter +after a reboot. What happened? + +A: Assignment of vty-server adapters to /dev/hvcs* entries is always done +in the order that the adapters are exposed. Due to hotplug capabilities of +this driver assignment of hotplug added vty-servers may be in a different +order than how they would be exposed on module load. Rebooting or +reloading the module after dynamic addition may result in the /dev/hvcs* +and vty-server coupling changing if a vty-server adapter was added in a +slot inbetween two other vty-server adapters. Refer to the section above +on how to determine which vty-server goes with which /dev/hvcs* node. +Hint; look at the sysfs "index" attribute for the vty-server. + +--------------------------------------------------------------------------- +Q: Can I use /dev/hvcs* as a conduit to another partition and use a tty +device on that partition as the other end of the pipe? + +A: Yes, on Power5 platforms the hvc_console driver provides a tty interface +for extra /dev/hvc* devices (where /dev/hvc0 is most likely the console). +In order to get a tty conduit working between the two partitions the HMC +Super Admin must create an additional "serial server" for the target +partition with the HMC gui which will show up as /dev/hvc* when the target +partition is rebooted. + +The HMC Super Admin then creates an additional "serial client" for the +current partition and points this at the target partition's newly created +"serial server" adapter (remember the slot). This shows up as an +additional /dev/hvcs* device. + +Now a program on the target system can be configured to read or write to +/dev/hvc* and another program on the current partition can be configured to +read or write to /dev/hvcs*. Now you have a tty conduit between two +partitions. + +--------------------------------------------------------------------------- +9. Reporting Bugs: + +The proper channel for reporting bugs is either through the Linux OS +distribution company that provided your OS or by posting issues to the +ppc64 development mailing list at: + +linuxppc64-dev@lists.linuxppc.org + +This request is to provide a documented and searchable public exchange +of the problems and solutions surrounding this driver for the benefit of +all users. diff --git a/Documentation/powerpc/mpc52xx.txt b/Documentation/powerpc/mpc52xx.txt new file mode 100644 index 000000000000..10dd4ab93b85 --- /dev/null +++ b/Documentation/powerpc/mpc52xx.txt @@ -0,0 +1,39 @@ +Linux 2.6.x on MPC52xx family +----------------------------- + +For the latest info, go to http://www.246tNt.com/mpc52xx/ + +To compile/use : + + - U-Boot: + # <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION + if you wish to ). + # make lite5200_defconfig + # make uImage + + then, on U-boot: + => tftpboot 200000 uImage + => tftpboot 400000 pRamdisk + => bootm 200000 400000 + + - DBug: + # <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION + if you wish to ). + # make lite5200_defconfig + # cp your_initrd.gz arch/ppc/boot/images/ramdisk.image.gz + # make zImage.initrd + # make + + then in DBug: + DBug> dn -i zImage.initrd.lite5200 + + +Some remarks : + - The port is named mpc52xxx, and config options are PPC_MPC52xx. The MGT5100 + is not supported, and I'm not sure anyone is interesting in working on it + so. I didn't took 5xxx because there's apparently a lot of 5xxx that have + nothing to do with the MPC5200. I also included the 'MPC' for the same + reason. + - Of course, I inspired myself from the 2.4 port. If you think I forgot to + mention you/your company in the copyright of some code, I'll correct it + ASAP. diff --git a/Documentation/powerpc/ppc_htab.txt b/Documentation/powerpc/ppc_htab.txt new file mode 100644 index 000000000000..8b8c7df29fa9 --- /dev/null +++ b/Documentation/powerpc/ppc_htab.txt @@ -0,0 +1,118 @@ + Information about /proc/ppc_htab +===================================================================== + +This document and the related code was written by me (Cort Dougan), please +email me (cort@fsmlabs.com) if you have questions, comments or corrections. + +Last Change: 2.16.98 + +This entry in the proc directory is readable by all users but only +writable by root. + +The ppc_htab interface is a user level way of accessing the +performance monitoring registers as well as providing information +about the PTE hash table. + +1. Reading + + Reading this file will give you information about the memory management + hash table that serves as an extended tlb for page translation on the + powerpc. It will also give you information about performance measurement + specific to the cpu that you are using. + + Explanation of the 604 Performance Monitoring Fields: + MMCR0 - the current value of the MMCR0 register + PMC1 + PMC2 - the value of the performance counters and a + description of what events they are counting + which are based on MMCR0 bit settings. + Explanation of the PTE Hash Table fields: + + Size - hash table size in Kb. + Buckets - number of buckets in the table. + Address - the virtual kernel address of the hash table base. + Entries - the number of ptes that can be stored in the hash table. + User/Kernel - how many pte's are in use by the kernel or user at that time. + Overflows - How many of the entries are in their secondary hash location. + Percent full - ratio of free pte entries to in use entries. + Reloads - Count of how many hash table misses have occurred + that were fixed with a reload from the linux tables. + Should always be 0 on 603 based machines. + Non-error Misses - Count of how many hash table misses have occurred + that were completed with the creation of a pte in the linux + tables with a call to do_page_fault(). + Error Misses - Number of misses due to errors such as bad address + and permission violations. This includes kernel access of + bad user addresses that are fixed up by the trap handler. + + Note that calculation of the data displayed from /proc/ppc_htab takes + a long time and spends a great deal of time in the kernel. It would + be quite hard on performance to read this file constantly. In time + there may be a counter in the kernel that allows successive reads from + this file only after a given amount of time has passed to reduce the + possibility of a user slowing the system by reading this file. + +2. Writing + + Writing to the ppc_htab allows you to change the characteristics of + the powerpc PTE hash table and setup performance monitoring. + + Resizing the PTE hash table is not enabled right now due to many + complications with moving the hash table, rehashing the entries + and many many SMP issues that would have to be dealt with. + + Write options to ppc_htab: + + - To set the size of the hash table to 64Kb: + + echo 'size 64' > /proc/ppc_htab + + The size must be a multiple of 64 and must be greater than or equal to + 64. + + - To turn off performance monitoring: + + echo 'off' > /proc/ppc_htab + + - To reset the counters without changing what they're counting: + + echo 'reset' > /proc/ppc_htab + + Note that counting will continue after the reset if it is enabled. + + - To count only events in user mode or only in kernel mode: + + echo 'user' > /proc/ppc_htab + ...or... + echo 'kernel' > /proc/ppc_htab + + Note that these two options are exclusive of one another and the + lack of either of these options counts user and kernel. + Using 'reset' and 'off' reset these flags. + + - The 604 has 2 performance counters which can each count events from + a specific set of events. These sets are disjoint so it is not + possible to count _any_ combination of 2 events. One event can + be counted by PMC1 and one by PMC2. + + To start counting a particular event use: + + echo 'event' > /proc/ppc_htab + + and choose from these events: + + PMC1 + ---- + 'ic miss' - instruction cache misses + 'dtlb' - data tlb misses (not hash table misses) + + PMC2 + ---- + 'dc miss' - data cache misses + 'itlb' - instruction tlb misses (not hash table misses) + 'load miss time' - cycles to complete a load miss + +3. Bugs + + The PMC1 and PMC2 counters can overflow and give no indication of that + in /proc/ppc_htab. diff --git a/Documentation/powerpc/smp.txt b/Documentation/powerpc/smp.txt new file mode 100644 index 000000000000..5b581b849ff7 --- /dev/null +++ b/Documentation/powerpc/smp.txt @@ -0,0 +1,34 @@ + Information about Linux/PPC SMP mode +===================================================================== + +This document and the related code was written by me +(Cort Dougan, cort@fsmlabs.com) please email me if you have questions, +comments or corrections. + +Last Change: 3.31.99 + +If you want to help by writing code or testing different hardware please +email me! + +1. State of Supported Hardware + + PowerSurge Architecture - tested on UMAX s900, Apple 9600 + The second processor on this machine boots up just fine and + enters its idle loop. Hopefully a completely working SMP kernel + on this machine will be done shortly. + + The code makes the assumption of only two processors. The changes + necessary to work with any number would not be overly difficult but + I don't have any machines with >2 processors so it's not high on my + list of priorities. If anyone else would like do to the work email + me and I can point out the places that need changed. If you have >2 + processors and don't want to add support yourself let me know and I + can take a look into it. + + BeBox + BeBox support hasn't been added to the 2.1.X kernels from 2.0.X + but work is being done and SMP support for BeBox is in the works. + + CHRP + CHRP SMP works and is fairly solid. It's been tested on the IBM F50 + with 4 processors for quite some time now. diff --git a/Documentation/powerpc/sound.txt b/Documentation/powerpc/sound.txt new file mode 100644 index 000000000000..df23d95e03a0 --- /dev/null +++ b/Documentation/powerpc/sound.txt @@ -0,0 +1,81 @@ + Information about PowerPC Sound support +===================================================================== + +Please mail me (Cort Dougan, cort@fsmlabs.com) if you have questions, +comments or corrections. + +Last Change: 6.16.99 + +This just covers sound on the PReP and CHRP systems for now and later +will contain information on the PowerMac's. + +Sound on PReP has been tested and is working with the PowerStack and IBM +Power Series onboard sound systems which are based on the cs4231(2) chip. +The sound options when doing the make config are a bit different from +the default, though. + +The I/O base, irq and dma lines that you enter during the make config +are ignored and are set when booting according to the machine type. +This is so that one binary can be used for Motorola and IBM machines +which use different values and isn't allowed by the driver, so things +are hacked together in such a way as to allow this information to be +set automatically on boot. + +1. Motorola PowerStack PReP machines + + Enable support for "Crystal CS4232 based (PnP) cards" and for the + Microsoft Sound System. The MSS isn't used, but some of the routines + that the CS4232 driver uses are in it. + + Although the options you set are ignored and determined automatically + on boot these are included for information only: + + (830) CS4232 audio I/O base 530, 604, E80 or F40 + (10) CS4232 audio IRQ 5, 7, 9, 11, 12 or 15 + (6) CS4232 audio DMA 0, 1 or 3 + (7) CS4232 second (duplex) DMA 0, 1 or 3 + + This will allow simultaneous record and playback, as 2 different dma + channels are used. + + The sound will be all left channel and very low volume since the + auxiliary input isn't muted by default. I had the changes necessary + for this in the kernel but the sound driver maintainer didn't want + to include them since it wasn't common in other machines. To fix this + you need to mute it using a mixer utility of some sort (if you find one + please let me know) or by patching the driver yourself and recompiling. + + There is a problem on the PowerStack 2's (PowerStack Pro's) using a + different irq/drq than the kernel expects. Unfortunately, I don't know + which irq/drq it is so if anyone knows please email me. + + Midi is not supported since the cs4232 driver doesn't support midi yet. + +2. IBM PowerPersonal PReP machines + + I've only tested sound on the Power Personal Series of IBM workstations + so if you try it on others please let me know the result. I'm especially + interested in the 43p's sound system, which I know nothing about. + + Enable support for "Crystal CS4232 based (PnP) cards" and for the + Microsoft Sound System. The MSS isn't used, but some of the routines + that the CS4232 driver uses are in it. + + Although the options you set are ignored and determined automatically + on boot these are included for information only: + + (530) CS4232 audio I/O base 530, 604, E80 or F40 + (5) CS4232 audio IRQ 5, 7, 9, 11, 12 or 15 + (1) CS4232 audio DMA 0, 1 or 3 + (7) CS4232 second (duplex) DMA 0, 1 or 3 + (330) CS4232 MIDI I/O base 330, 370, 3B0 or 3F0 + (9) CS4232 MIDI IRQ 5, 7, 9, 11, 12 or 15 + + This setup does _NOT_ allow for recording yet. + + Midi is not supported since the cs4232 driver doesn't support midi yet. + +2. IBM CHRP + + I have only tested this on the 43P-150. Build the kernel with the cs4232 + set as a module and load the module with irq=9 dma=1 dma2=2 io=0x550 diff --git a/Documentation/powerpc/zImage_layout.txt b/Documentation/powerpc/zImage_layout.txt new file mode 100644 index 000000000000..048e0150f571 --- /dev/null +++ b/Documentation/powerpc/zImage_layout.txt @@ -0,0 +1,47 @@ + Information about the Linux/PPC kernel images +===================================================================== + +Please mail me (Cort Dougan, cort@fsmlabs.com) if you have questions, +comments or corrections. + +This document is meant to answer several questions I've had about how +the PReP system boots and how Linux/PPC interacts with that mechanism. +It would be nice if we could have information on how other architectures +boot here as well. If you have anything to contribute, please +let me know. + + +1. PReP boot file + + This is the file necessary to boot PReP systems from floppy or + hard drive. The firmware reads the PReP partition table entry + and will load the image accordingly. + + To boot the zImage, copy it onto a floppy with dd if=zImage of=/dev/fd0h1440 + or onto a PReP hard drive partition with dd if=zImage of=/dev/sda4 + assuming you've created a PReP partition (type 0x41) with fdisk on + /dev/sda4. + + The layout of the image format is: + + 0x0 +------------+ + | | PReP partition table entry + | | + 0x400 +------------+ + | | Bootstrap program code + data + | | + | | + +------------+ + | | compressed kernel, elf header removed + +------------+ + | | initrd (if loaded) + +------------+ + | | Elf section table for bootstrap program + +------------+ + + +2. MBX boot file + + The MBX boards can load an elf image, and relocate it to the + proper location in memory - it copies the image to the location it was + linked at. diff --git a/Documentation/preempt-locking.txt b/Documentation/preempt-locking.txt new file mode 100644 index 000000000000..57883ca2498b --- /dev/null +++ b/Documentation/preempt-locking.txt @@ -0,0 +1,135 @@ + Proper Locking Under a Preemptible Kernel: + Keeping Kernel Code Preempt-Safe + Robert Love <rml@tech9.net> + Last Updated: 28 Aug 2002 + + +INTRODUCTION + + +A preemptible kernel creates new locking issues. The issues are the same as +those under SMP: concurrency and reentrancy. Thankfully, the Linux preemptible +kernel model leverages existing SMP locking mechanisms. Thus, the kernel +requires explicit additional locking for very few additional situations. + +This document is for all kernel hackers. Developing code in the kernel +requires protecting these situations. + + +RULE #1: Per-CPU data structures need explicit protection + + +Two similar problems arise. An example code snippet: + + struct this_needs_locking tux[NR_CPUS]; + tux[smp_processor_id()] = some_value; + /* task is preempted here... */ + something = tux[smp_processor_id()]; + +First, since the data is per-CPU, it may not have explicit SMP locking, but +require it otherwise. Second, when a preempted task is finally rescheduled, +the previous value of smp_processor_id may not equal the current. You must +protect these situations by disabling preemption around them. + +You can also use put_cpu() and get_cpu(), which will disable preemption. + + +RULE #2: CPU state must be protected. + + +Under preemption, the state of the CPU must be protected. This is arch- +dependent, but includes CPU structures and state not preserved over a context +switch. For example, on x86, entering and exiting FPU mode is now a critical +section that must occur while preemption is disabled. Think what would happen +if the kernel is executing a floating-point instruction and is then preempted. +Remember, the kernel does not save FPU state except for user tasks. Therefore, +upon preemption, the FPU registers will be sold to the lowest bidder. Thus, +preemption must be disabled around such regions. + +Note, some FPU functions are already explicitly preempt safe. For example, +kernel_fpu_begin and kernel_fpu_end will disable and enable preemption. +However, math_state_restore must be called with preemption disabled. + + +RULE #3: Lock acquire and release must be performed by same task + + +A lock acquired in one task must be released by the same task. This +means you can't do oddball things like acquire a lock and go off to +play while another task releases it. If you want to do something +like this, acquire and release the task in the same code path and +have the caller wait on an event by the other task. + + +SOLUTION + + +Data protection under preemption is achieved by disabling preemption for the +duration of the critical region. + +preempt_enable() decrement the preempt counter +preempt_disable() increment the preempt counter +preempt_enable_no_resched() decrement, but do not immediately preempt +preempt_check_resched() if needed, reschedule +preempt_count() return the preempt counter + +The functions are nestable. In other words, you can call preempt_disable +n-times in a code path, and preemption will not be reenabled until the n-th +call to preempt_enable. The preempt statements define to nothing if +preemption is not enabled. + +Note that you do not need to explicitly prevent preemption if you are holding +any locks or interrupts are disabled, since preemption is implicitly disabled +in those cases. + +But keep in mind that 'irqs disabled' is a fundamentally unsafe way of +disabling preemption - any spin_unlock() decreasing the preemption count +to 0 might trigger a reschedule. A simple printk() might trigger a reschedule. +So use this implicit preemption-disabling property only if you know that the +affected codepath does not do any of this. Best policy is to use this only for +small, atomic code that you wrote and which calls no complex functions. + +Example: + + cpucache_t *cc; /* this is per-CPU */ + preempt_disable(); + cc = cc_data(searchp); + if (cc && cc->avail) { + __free_block(searchp, cc_entry(cc), cc->avail); + cc->avail = 0; + } + preempt_enable(); + return 0; + +Notice how the preemption statements must encompass every reference of the +critical variables. Another example: + + int buf[NR_CPUS]; + set_cpu_val(buf); + if (buf[smp_processor_id()] == -1) printf(KERN_INFO "wee!\n"); + spin_lock(&buf_lock); + /* ... */ + +This code is not preempt-safe, but see how easily we can fix it by simply +moving the spin_lock up two lines. + + +PREVENTING PREEMPTION USING INTERRUPT DISABLING + + +It is possible to prevent a preemption event using local_irq_disable and +local_irq_save. Note, when doing so, you must be very careful to not cause +an event that would set need_resched and result in a preemption check. When +in doubt, rely on locking or explicit preemption disabling. + +Note in 2.5 interrupt disabling is now only per-CPU (e.g. local). + +An additional concern is proper usage of local_irq_disable and local_irq_save. +These may be used to protect from preemption, however, on exit, if preemption +may be enabled, a test to see if preemption is required should be done. If +these are called from the spin_lock and read/write lock macros, the right thing +is done. They may also be called within a spin-lock protected region, however, +if they are ever called outside of this context, a test for preemption should +be made. Do note that calls from interrupt context or bottom half/ tasklets +are also protected by preemption locks and so may use the versions which do +not check preemption. diff --git a/Documentation/prio_tree.txt b/Documentation/prio_tree.txt new file mode 100644 index 000000000000..2fbb0c49bc5b --- /dev/null +++ b/Documentation/prio_tree.txt @@ -0,0 +1,107 @@ +The prio_tree.c code indexes vmas using 3 different indexes: + * heap_index = vm_pgoff + vm_size_in_pages : end_vm_pgoff + * radix_index = vm_pgoff : start_vm_pgoff + * size_index = vm_size_in_pages + +A regular radix-priority-search-tree indexes vmas using only heap_index and +radix_index. The conditions for indexing are: + * ->heap_index >= ->left->heap_index && + ->heap_index >= ->right->heap_index + * if (->heap_index == ->left->heap_index) + then ->radix_index < ->left->radix_index; + * if (->heap_index == ->right->heap_index) + then ->radix_index < ->right->radix_index; + * nodes are hashed to left or right subtree using radix_index + similar to a pure binary radix tree. + +A regular radix-priority-search-tree helps to store and query +intervals (vmas). However, a regular radix-priority-search-tree is only +suitable for storing vmas with different radix indices (vm_pgoff). + +Therefore, the prio_tree.c extends the regular radix-priority-search-tree +to handle many vmas with the same vm_pgoff. Such vmas are handled in +2 different ways: 1) All vmas with the same radix _and_ heap indices are +linked using vm_set.list, 2) if there are many vmas with the same radix +index, but different heap indices and if the regular radix-priority-search +tree cannot index them all, we build an overflow-sub-tree that indexes such +vmas using heap and size indices instead of heap and radix indices. For +example, in the figure below some vmas with vm_pgoff = 0 (zero) are +indexed by regular radix-priority-search-tree whereas others are pushed +into an overflow-subtree. Note that all vmas in an overflow-sub-tree have +the same vm_pgoff (radix_index) and if necessary we build different +overflow-sub-trees to handle each possible radix_index. For example, +in figure we have 3 overflow-sub-trees corresponding to radix indices +0, 2, and 4. + +In the final tree the first few (prio_tree_root->index_bits) levels +are indexed using heap and radix indices whereas the overflow-sub-trees below +those levels (i.e. levels prio_tree_root->index_bits + 1 and higher) are +indexed using heap and size indices. In overflow-sub-trees the size_index +is used for hashing the nodes to appropriate places. + +Now, an example prio_tree: + + vmas are represented [radix_index, size_index, heap_index] + i.e., [start_vm_pgoff, vm_size_in_pages, end_vm_pgoff] + +level prio_tree_root->index_bits = 3 +----- + _ + 0 [0,7,7] | + / \ | + ------------------ ------------ | Regular + / \ | radix priority + 1 [1,6,7] [4,3,7] | search tree + / \ / \ | + ------- ----- ------ ----- | heap-and-radix + / \ / \ | indexed + 2 [0,6,6] [2,5,7] [5,2,7] [6,1,7] | + / \ / \ / \ / \ | + 3 [0,5,5] [1,5,6] [2,4,6] [3,4,7] [4,2,6] [5,1,6] [6,0,6] [7,0,7] | + / / / _ + / / / _ + 4 [0,4,4] [2,3,5] [4,1,5] | + / / / | + 5 [0,3,3] [2,2,4] [4,0,4] | Overflow-sub-trees + / / | + 6 [0,2,2] [2,1,3] | heap-and-size + / / | indexed + 7 [0,1,1] [2,0,2] | + / | + 8 [0,0,0] | + _ + +Note that we use prio_tree_root->index_bits to optimize the height +of the heap-and-radix indexed tree. Since prio_tree_root->index_bits is +set according to the maximum end_vm_pgoff mapped, we are sure that all +bits (in vm_pgoff) above prio_tree_root->index_bits are 0 (zero). Therefore, +we only use the first prio_tree_root->index_bits as radix_index. +Whenever index_bits is increased in prio_tree_expand, we shuffle the tree +to make sure that the first prio_tree_root->index_bits levels of the tree +is indexed properly using heap and radix indices. + +We do not optimize the height of overflow-sub-trees using index_bits. +The reason is: there can be many such overflow-sub-trees and all of +them have to be suffled whenever the index_bits increases. This may involve +walking the whole prio_tree in prio_tree_insert->prio_tree_expand code +path which is not desirable. Hence, we do not optimize the height of the +heap-and-size indexed overflow-sub-trees using prio_tree->index_bits. +Instead the overflow sub-trees are indexed using full BITS_PER_LONG bits +of size_index. This may lead to skewed sub-trees because most of the +higher significant bits of the size_index are likely to be be 0 (zero). In +the example above, all 3 overflow-sub-trees are skewed. This may marginally +affect the performance. However, processes rarely map many vmas with the +same start_vm_pgoff but different end_vm_pgoffs. Therefore, we normally +do not require overflow-sub-trees to index all vmas. + +From the above discussion it is clear that the maximum height of +a prio_tree can be prio_tree_root->index_bits + BITS_PER_LONG. +However, in most of the common cases we do not need overflow-sub-trees, +so the tree height in the common cases will be prio_tree_root->index_bits. + +It is fair to mention here that the prio_tree_root->index_bits +is increased on demand, however, the index_bits is not decreased when +vmas are removed from the prio_tree. That's tricky to do. Hence, it's +left as a home work problem. + + diff --git a/Documentation/ramdisk.txt b/Documentation/ramdisk.txt new file mode 100644 index 000000000000..7c25584e082c --- /dev/null +++ b/Documentation/ramdisk.txt @@ -0,0 +1,167 @@ +Using the RAM disk block device with Linux +------------------------------------------ + +Contents: + + 1) Overview + 2) Kernel Command Line Parameters + 3) Using "rdev -r" + 4) An Example of Creating a Compressed RAM Disk + + +1) Overview +----------- + +The RAM disk driver is a way to use main system memory as a block device. It +is required for initrd, an initial filesystem used if you need to load modules +in order to access the root filesystem (see Documentation/initrd.txt). It can +also be used for a temporary filesystem for crypto work, since the contents +are erased on reboot. + +The RAM disk dynamically grows as more space is required. It does this by using +RAM from the buffer cache. The driver marks the buffers it is using as dirty +so that the VM subsystem does not try to reclaim them later. + +Also, the RAM disk supports up to 16 RAM disks out of the box, and can +be reconfigured to support up to 255 RAM disks - change "#define NUM_RAMDISKS" +in drivers/block/rd.c. To use RAM disk support with your system, run +'./MAKEDEV ram' from the /dev directory. RAM disks are all major number 1, and +start with minor number 0 for /dev/ram0, etc. If used, modern kernels use +/dev/ram0 for an initrd. + +The old "ramdisk=<ram_size>" has been changed to "ramdisk_size=<ram_size>" to +make it clearer. The original "ramdisk=<ram_size>" has been kept around for +compatibility reasons, but it may be removed in the future. + +The new RAM disk also has the ability to load compressed RAM disk images, +allowing one to squeeze more programs onto an average installation or +rescue floppy disk. + + +2) Kernel Command Line Parameters +--------------------------------- + + ramdisk_size=N + ============== + +This parameter tells the RAM disk driver to set up RAM disks of N k size. The +default is 4096 (4 MB) (8192 (8 MB) on S390). + + ramdisk_blocksize=N + =================== + +This parameter tells the RAM disk driver how many bytes to use per block. The +default is 512. + + +3) Using "rdev -r" +------------------ + +The usage of the word (two bytes) that "rdev -r" sets in the kernel image is +as follows. The low 11 bits (0 -> 10) specify an offset (in 1 k blocks) of up +to 2 MB (2^11) of where to find the RAM disk (this used to be the size). Bit +14 indicates that a RAM disk is to be loaded, and bit 15 indicates whether a +prompt/wait sequence is to be given before trying to read the RAM disk. Since +the RAM disk dynamically grows as data is being written into it, a size field +is not required. Bits 11 to 13 are not currently used and may as well be zero. +These numbers are no magical secrets, as seen below: + +./arch/i386/kernel/setup.c:#define RAMDISK_IMAGE_START_MASK 0x07FF +./arch/i386/kernel/setup.c:#define RAMDISK_PROMPT_FLAG 0x8000 +./arch/i386/kernel/setup.c:#define RAMDISK_LOAD_FLAG 0x4000 + +Consider a typical two floppy disk setup, where you will have the +kernel on disk one, and have already put a RAM disk image onto disk #2. + +Hence you want to set bits 0 to 13 as 0, meaning that your RAM disk +starts at an offset of 0 kB from the beginning of the floppy. +The command line equivalent is: "ramdisk_start=0" + +You want bit 14 as one, indicating that a RAM disk is to be loaded. +The command line equivalent is: "load_ramdisk=1" + +You want bit 15 as one, indicating that you want a prompt/keypress +sequence so that you have a chance to switch floppy disks. +The command line equivalent is: "prompt_ramdisk=1" + +Putting that together gives 2^15 + 2^14 + 0 = 49152 for an rdev word. +So to create disk one of the set, you would do: + + /usr/src/linux# cat arch/i386/boot/zImage > /dev/fd0 + /usr/src/linux# rdev /dev/fd0 /dev/fd0 + /usr/src/linux# rdev -r /dev/fd0 49152 + +If you make a boot disk that has LILO, then for the above, you would use: + append = "ramdisk_start=0 load_ramdisk=1 prompt_ramdisk=1" +Since the default start = 0 and the default prompt = 1, you could use: + append = "load_ramdisk=1" + + +4) An Example of Creating a Compressed RAM Disk +---------------------------------------------- + +To create a RAM disk image, you will need a spare block device to +construct it on. This can be the RAM disk device itself, or an +unused disk partition (such as an unmounted swap partition). For this +example, we will use the RAM disk device, "/dev/ram0". + +Note: This technique should not be done on a machine with less than 8 MB +of RAM. If using a spare disk partition instead of /dev/ram0, then this +restriction does not apply. + +a) Decide on the RAM disk size that you want. Say 2 MB for this example. + Create it by writing to the RAM disk device. (This step is not currently + required, but may be in the future.) It is wise to zero out the + area (esp. for disks) so that maximal compression is achieved for + the unused blocks of the image that you are about to create. + + dd if=/dev/zero of=/dev/ram0 bs=1k count=2048 + +b) Make a filesystem on it. Say ext2fs for this example. + + mke2fs -vm0 /dev/ram0 2048 + +c) Mount it, copy the files you want to it (eg: /etc/* /dev/* ...) + and unmount it again. + +d) Compress the contents of the RAM disk. The level of compression + will be approximately 50% of the space used by the files. Unused + space on the RAM disk will compress to almost nothing. + + dd if=/dev/ram0 bs=1k count=2048 | gzip -v9 > /tmp/ram_image.gz + +e) Put the kernel onto the floppy + + dd if=zImage of=/dev/fd0 bs=1k + +f) Put the RAM disk image onto the floppy, after the kernel. Use an offset + that is slightly larger than the kernel, so that you can put another + (possibly larger) kernel onto the same floppy later without overlapping + the RAM disk image. An offset of 400 kB for kernels about 350 kB in + size would be reasonable. Make sure offset+size of ram_image.gz is + not larger than the total space on your floppy (usually 1440 kB). + + dd if=/tmp/ram_image.gz of=/dev/fd0 bs=1k seek=400 + +g) Use "rdev" to set the boot device, RAM disk offset, prompt flag, etc. + For prompt_ramdisk=1, load_ramdisk=1, ramdisk_start=400, one would + have 2^15 + 2^14 + 400 = 49552. + + rdev /dev/fd0 /dev/fd0 + rdev -r /dev/fd0 49552 + +That is it. You now have your boot/root compressed RAM disk floppy. Some +users may wish to combine steps (d) and (f) by using a pipe. + +-------------------------------------------------------------------------- + Paul Gortmaker 12/95 + +Changelog: +---------- + +10-22-04 : Updated to reflect changes in command line options, remove + obsolete references, general cleanup. + James Nelson (james4765@gmail.com) + + +12-95 : Original Document diff --git a/Documentation/riscom8.txt b/Documentation/riscom8.txt new file mode 100644 index 000000000000..14f61fdad7ca --- /dev/null +++ b/Documentation/riscom8.txt @@ -0,0 +1,36 @@ +* NOTE - this is an unmaintained driver. The original author cannot be located. + +SDL Communications is now SBS Technologies, and does not have any +information on these ancient ISA cards on their website. + +James Nelson <james4765@gmail.com> - 12-12-2004 + + This is the README for RISCom/8 multi-port serial driver + (C) 1994-1996 D.Gorodchanin + See file LICENSE for terms and conditions. + +NOTE: English is not my native language. + I'm sorry for any mistakes in this text. + +Misc. notes for RISCom/8 serial driver, in no particular order :) + +1) This driver can support up to 4 boards at time. + Use string "riscom8=0xXXX,0xXXX,0xXXX,0xXXX" at LILO prompt, for + setting I/O base addresses for boards. If you compile driver + as module use modprobe options "iobase=0xXXX iobase1=0xXXX iobase2=..." + +2) The driver partially supports famous 'setserial' program, you can use almost + any of its options, excluding port & irq settings. + +3) There are some misc. defines at the beginning of riscom8.c, please read the + comments and try to change some of them in case of problems. + +4) I consider the current state of the driver as BETA. + +5) SDL Communications WWW page is http://www.sdlcomm.com. + +6) You can use the MAKEDEV program to create RISCom/8 /dev/ttyL* entries. + +7) Minor numbers for first board are 0-7, for second 8-15, etc. + +22 Apr 1996. diff --git a/Documentation/rocket.txt b/Documentation/rocket.txt new file mode 100644 index 000000000000..a10678004451 --- /dev/null +++ b/Documentation/rocket.txt @@ -0,0 +1,189 @@ +Comtrol(tm) RocketPort(R)/RocketModem(TM) Series +Device Driver for the Linux Operating System + +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- + +PRODUCT OVERVIEW +---------------- + +This driver provides a loadable kernel driver for the Comtrol RocketPort +and RocketModem PCI boards. These boards provide, 2, 4, 8, 16, or 32 +high-speed serial ports or modems. This driver supports up to a combination +of four RocketPort or RocketModems boards in one machine simultaneously. +This file assumes that you are using the RocketPort driver which is +integrated into the kernel sources. + +The driver can also be installed as an external module using the usual +"make;make install" routine. This external module driver, obtainable +from the Comtrol website listed below, is useful for updating the driver +or installing it into kernels which do not have the driver configured +into them. Installations instructions for the external module +are in the included README and HW_INSTALL files. + +RocketPort ISA and RocketModem II PCI boards currently are only supported by +this driver in module form. + +The RocketPort ISA board requires I/O ports to be configured by the DIP +switches on the board. See the section "ISA Rocketport Boards" below for +information on how to set the DIP switches. + +You pass the I/O port to the driver using the following module parameters: + +board1 : I/O port for the first ISA board +board2 : I/O port for the second ISA board +board3 : I/O port for the third ISA board +board4 : I/O port for the fourth ISA board + +There is a set of utilities and scripts provided with the external driver +( downloadable from http://www.comtrol.com ) that ease the configuration and +setup of the ISA cards. + +The RocketModem II PCI boards require firmware to be loaded into the card +before it will function. The driver has only been tested as a module for this +board. + +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- + +INSTALLATION PROCEDURES +----------------------- + +RocketPort/RocketModem PCI cards require no driver configuration, they are +automatically detected and configured. + +The RocketPort driver can be installed as a module (recommended) or built +into the kernel. This is selected, as for other drivers, through the `make config` +command from the root of the Linux source tree during the kernel build process. + +The RocketPort/RocketModem serial ports installed by this driver are assigned +device major number 46, and will be named /dev/ttyRx, where x is the port number +starting at zero (ex. /dev/ttyR0, /devttyR1, ...). If you have multiple cards +installed in the system, the mapping of port names to serial ports is displayed +in the system log at /var/log/messages. + +If installed as a module, the module must be loaded. This can be done +manually by entering "modprobe rocket". To have the module loaded automatically +upon system boot, edit the /etc/modprobe.conf file and add the line +"alias char-major-46 rocket". + +In order to use the ports, their device names (nodes) must be created with mknod. +This is only required once, the system will retain the names once created. To +create the RocketPort/RocketModem device names, use the command +"mknod /dev/ttyRx c 46 x" where x is the port number starting at zero. For example: + +>mknod /dev/ttyR0 c 46 0 +>mknod /dev/ttyR1 c 46 1 +>mknod /dev/ttyR2 c 46 2 + +The Linux script MAKEDEV will create the first 16 ttyRx device names (nodes) +for you: + +>/dev/MAKEDEV ttyR + +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- + +ISA Rocketport Boards +--------------------- + +You must assign and configure the I/O addresses used by the ISA Rocketport +card before installing and using it. This is done by setting a set of DIP +switches on the Rocketport board. + + +SETTING THE I/O ADDRESS +----------------------- + +Before installing RocketPort(R) or RocketPort RA boards, you must find +a range of I/O addresses for it to use. The first RocketPort card +requires a 68-byte contiguous block of I/O addresses, starting at one +of the following: 0x100h, 0x140h, 0x180h, 0x200h, 0x240h, 0x280h, +0x300h, 0x340h, 0x380h. This I/O address must be reflected in the DIP +switiches of *all* of the Rocketport cards. + +The second, third, and fourth RocketPort cards require a 64-byte +contiguous block of I/O addresses, starting at one of the following +I/O addresses: 0x100h, 0x140h, 0x180h, 0x1C0h, 0x200h, 0x240h, 0x280h, +0x2C0h, 0x300h, 0x340h, 0x380h, 0x3C0h. The I/O address used by the +second, third, and fourth Rocketport cards (if present) are set via +software control. The DIP switch settings for the I/O address must be +set to the value of the first Rocketport cards. + +In order to destinguish each of the card from the others, each card +must have a unique board ID set on the dip switches. The first +Rocketport board must be set with the DIP switches corresponding to +the first board, the second board must be set with the DIP switches +corresponding to the second board, etc. IMPORTANT: The board ID is +the only place where the DIP switch settings should differ between the +various Rocketport boards in a system. + +The I/O address range used by any of the RocketPort cards must not +conflict with any other cards in the system, including other +RocketPort cards. Below, you will find a list of commonly used I/O +address ranges which may be in use by other devices in your system. +On a Linux system, "cat /proc/ioports" will also be helpful in +identifying what I/O addresses are being used by devics on your +system. + +Remember, the FIRST RocketPort uses 68 I/O addresses. So, if you set it +for 0x100, it will occupy 0x100 to 0x143. This would mean that you +CAN NOT set the second, third or fourth board for address 0x140 since +the first 4 bytes of that range are used by the first board. You would +need to set the second, third, or fourth board to one of the next available +blocks such as 0x180. + +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- + +RocketPort and RocketPort RA SW1 Settings: + + +-------------------------------+ + | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | + +-------+-------+---------------+ + | Unused| Card | I/O Port Block| + +-------------------------------+ + +DIP Switches DIP Switches +7 8 6 5 +=================== =================== +On On UNUSED, MUST BE ON. On On First Card <==== Default + On Off Second Card + Off On Third Card + Off Off Fourth Card + +DIP Switches I/O Address Range +4 3 2 1 Used by the First Card +===================================== +On Off On Off 100-143 +On Off Off On 140-183 +On Off Off Off 180-1C3 <==== Default +Off On On Off 200-243 +Off On Off On 240-283 +Off On Off Off 280-2C3 +Off Off On Off 300-343 +Off Off Off On 340-383 +Off Off Off Off 380-3C3 + +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- + +REPORTING BUGS +-------------- + +For technical support, please provide the following +information: Driver version, kernel release, distribution of +kernel, and type of board you are using. Error messages and log +printouts port configuration details are especially helpful. + +USA + Phone: (612) 494-4100 + FAX: (612) 494-4199 + email: support@comtrol.com + +Comtrol Europe + Phone: +44 (0) 1 869 323-220 + FAX: +44 (0) 1 869 323-211 + email: support@comtrol.co.uk + +Web: http://www.comtrol.com +FTP: ftp.comtrol.com + +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- + + diff --git a/Documentation/rpc-cache.txt b/Documentation/rpc-cache.txt new file mode 100644 index 000000000000..2b5d4434fa5a --- /dev/null +++ b/Documentation/rpc-cache.txt @@ -0,0 +1,171 @@ +This document gives a brief introduction to the caching +mechanisms in the sunrpc layer that is used, in particular, +for NFS authentication. + +CACHES +====== +The caching replaces the old exports table and allows for +a wide variety of values to be caches. + +There are a number of caches that are similar in structure though +quite possibly very different in content and use. There is a corpus +of common code for managing these caches. + +Examples of caches that are likely to be needed are: + - mapping from IP address to client name + - mapping from client name and filesystem to export options + - mapping from UID to list of GIDs, to work around NFS's limitation + of 16 gids. + - mappings between local UID/GID and remote UID/GID for sites that + do not have uniform uid assignment + - mapping from network identify to public key for crypto authentication. + +The common code handles such things as: + - general cache lookup with correct locking + - supporting 'NEGATIVE' as well as positive entries + - allowing an EXPIRED time on cache items, and removing + items after they expire, and are no longe in-use. + + Future code extensions are expect to handle + - making requests to user-space to fill in cache entries + - allowing user-space to directly set entries in the cache + - delaying RPC requests that depend on as-yet incomplete + cache entries, and replaying those requests when the cache entry + is complete. + - maintaining last-access times on cache entries + - clean out old entries when the caches become full + +The code for performing a cache lookup is also common, but in the form +of a template. i.e. a #define. +Each cache defines a lookup function by using the DefineCacheLookup +macro, or the simpler DefineSimpleCacheLookup macro + +Creating a Cache +---------------- + +1/ A cache needs a datum to cache. This is in the form of a + structure definition that must contain a + struct cache_head + as an element, usually the first. + It will also contain a key and some content. + Each cache element is reference counted and contains + expiry and update times for use in cache management. +2/ A cache needs a "cache_detail" structure that + describes the cache. This stores the hash table, and some + parameters for cache management. +3/ A cache needs a lookup function. This is created using + the DefineCacheLookup macro. This lookup function is used both + to find entries and to update entries. The normal mode for + updating an entry is to replace the old entry with a new + entry. However it is possible to allow update-in-place + for those caches where it makes sense (no atomicity issues + or indirect reference counting issue) +4/ A cache needs to be registered using cache_register(). This + includes in on a list of caches that will be regularly + cleaned to discard old data. For this to work, some + thread must periodically call cache_clean + +Using a cache +------------- + +To find a value in a cache, call the lookup function passing it a the +datum which contains key, and possibly content, and a flag saying +whether to update the cache with new data from the datum. Depending +on how the cache lookup function was defined, it may take an extra +argument to identify the particular cache in question. + +Except in cases of kmalloc failure, the lookup function +will return a new datum which will store the key and +may contain valid content, or may not. +This datum is typically passed to cache_check which determines the +validity of the datum and may later initiate an upcall to fill +in the data. + +cache_check can be passed a "struct cache_req *". This structure is +typically embedded in the actual request and can be used to create a +deferred copy of the request (struct cache_deferred_req). This is +done when the found cache item is not uptodate, but the is reason to +believe that userspace might provide information soon. When the cache +item does become valid, the deferred copy of the request will be +revisited (->revisit). It is expected that this method will +reschedule the request for processing. + + +Populating a cache +------------------ + +Each cache has a name, and when the cache is registered, a directory +with that name is created in /proc/net/rpc + +This directory contains a file called 'channel' which is a channel +for communicating between kernel and user for populating the cache. +This directory may later contain other files of interacting +with the cache. + +The 'channel' works a bit like a datagram socket. Each 'write' is +passed as a whole to the cache for parsing and interpretation. +Each cache can treat the write requests differently, but it is +expected that a message written will contain: + - a key + - an expiry time + - a content. +with the intention that an item in the cache with the give key +should be create or updated to have the given content, and the +expiry time should be set on that item. + +Reading from a channel is a bit more interesting. When a cache +lookup fail, or when it suceeds but finds an entry that may soon +expiry, a request is lodged for that cache item to be updated by +user-space. These requests appear in the channel file. + +Successive reads will return successive requests. +If there are no more requests to return, read will return EOF, but a +select or poll for read will block waiting for another request to be +added. + +Thus a user-space helper is likely to: + open the channel. + select for readable + read a request + write a response + loop. + +If it dies and needs to be restarted, any requests that have not be +answered will still appear in the file and will be read by the new +instance of the helper. + +Each cache should define a "cache_parse" method which takes a message +written from user-space and processes it. It should return an error +(which propagates back to the write syscall) or 0. + +Each cache should also define a "cache_request" method which +takes a cache item and encodes a request into the buffer +provided. + + +Note: If a cache has no active readers on the channel, and has had not +active readers for more than 60 seconds, further requests will not be +added to the channel but instead all looks that do not find a valid +entry will fail. This is partly for backward compatibility: The +previous nfs exports table was deemed to be authoritative and a +failed lookup meant a definite 'no'. + +request/response format +----------------------- + +While each cache is free to use it's own format for requests +and responses over channel, the following is recommended are +appropriate and support routines are available to help: +Each request or response record should be printable ASCII +with precisely one newline character which should be at the end. +Fields within the record should be separated by spaces, normally one. +If spaces, newlines, or nul characters are needed in a field they +much be quotes. two mechanisms are available: +1/ If a field begins '\x' then it must contain an even number of + hex digits, and pairs of these digits provide the bytes in the + field. +2/ otherwise a \ in the field must be followed by 3 octal digits + which give the code for a byte. Other characters are treated + as them selves. At the very least, space, newlines nul, and + '\' must be quoted in this way. + diff --git a/Documentation/rtc.txt b/Documentation/rtc.txt new file mode 100644 index 000000000000..95d17b3e2eee --- /dev/null +++ b/Documentation/rtc.txt @@ -0,0 +1,282 @@ + + Real Time Clock Driver for Linux + ================================ + +All PCs (even Alpha machines) have a Real Time Clock built into them. +Usually they are built into the chipset of the computer, but some may +actually have a Motorola MC146818 (or clone) on the board. This is the +clock that keeps the date and time while your computer is turned off. + +However it can also be used to generate signals from a slow 2Hz to a +relatively fast 8192Hz, in increments of powers of two. These signals +are reported by interrupt number 8. (Oh! So *that* is what IRQ 8 is +for...) It can also function as a 24hr alarm, raising IRQ 8 when the +alarm goes off. The alarm can also be programmed to only check any +subset of the three programmable values, meaning that it could be set to +ring on the 30th second of the 30th minute of every hour, for example. +The clock can also be set to generate an interrupt upon every clock +update, thus generating a 1Hz signal. + +The interrupts are reported via /dev/rtc (major 10, minor 135, read only +character device) in the form of an unsigned long. The low byte contains +the type of interrupt (update-done, alarm-rang, or periodic) that was +raised, and the remaining bytes contain the number of interrupts since +the last read. Status information is reported through the pseudo-file +/proc/driver/rtc if the /proc filesystem was enabled. The driver has +built in locking so that only one process is allowed to have the /dev/rtc +interface open at a time. + +A user process can monitor these interrupts by doing a read(2) or a +select(2) on /dev/rtc -- either will block/stop the user process until +the next interrupt is received. This is useful for things like +reasonably high frequency data acquisition where one doesn't want to +burn up 100% CPU by polling gettimeofday etc. etc. + +At high frequencies, or under high loads, the user process should check +the number of interrupts received since the last read to determine if +there has been any interrupt "pileup" so to speak. Just for reference, a +typical 486-33 running a tight read loop on /dev/rtc will start to suffer +occasional interrupt pileup (i.e. > 1 IRQ event since last read) for +frequencies above 1024Hz. So you really should check the high bytes +of the value you read, especially at frequencies above that of the +normal timer interrupt, which is 100Hz. + +Programming and/or enabling interrupt frequencies greater than 64Hz is +only allowed by root. This is perhaps a bit conservative, but we don't want +an evil user generating lots of IRQs on a slow 386sx-16, where it might have +a negative impact on performance. Note that the interrupt handler is only +a few lines of code to minimize any possibility of this effect. + +Also, if the kernel time is synchronized with an external source, the +kernel will write the time back to the CMOS clock every 11 minutes. In +the process of doing this, the kernel briefly turns off RTC periodic +interrupts, so be aware of this if you are doing serious work. If you +don't synchronize the kernel time with an external source (via ntp or +whatever) then the kernel will keep its hands off the RTC, allowing you +exclusive access to the device for your applications. + +The alarm and/or interrupt frequency are programmed into the RTC via +various ioctl(2) calls as listed in ./include/linux/rtc.h +Rather than write 50 pages describing the ioctl() and so on, it is +perhaps more useful to include a small test program that demonstrates +how to use them, and demonstrates the features of the driver. This is +probably a lot more useful to people interested in writing applications +that will be using this driver. + + Paul Gortmaker + +-------------------- 8< ---------------- 8< ----------------------------- + +/* + * Real Time Clock Driver Test/Example Program + * + * Compile with: + * gcc -s -Wall -Wstrict-prototypes rtctest.c -o rtctest + * + * Copyright (C) 1996, Paul Gortmaker. + * + * Released under the GNU General Public License, version 2, + * included herein by reference. + * + */ + +#include <stdio.h> +#include <linux/rtc.h> +#include <sys/ioctl.h> +#include <sys/time.h> +#include <sys/types.h> +#include <fcntl.h> +#include <unistd.h> +#include <errno.h> + +int main(void) { + +int i, fd, retval, irqcount = 0; +unsigned long tmp, data; +struct rtc_time rtc_tm; + +fd = open ("/dev/rtc", O_RDONLY); + +if (fd == -1) { + perror("/dev/rtc"); + exit(errno); +} + +fprintf(stderr, "\n\t\t\tRTC Driver Test Example.\n\n"); + +/* Turn on update interrupts (one per second) */ +retval = ioctl(fd, RTC_UIE_ON, 0); +if (retval == -1) { + perror("ioctl"); + exit(errno); +} + +fprintf(stderr, "Counting 5 update (1/sec) interrupts from reading /dev/rtc:"); +fflush(stderr); +for (i=1; i<6; i++) { + /* This read will block */ + retval = read(fd, &data, sizeof(unsigned long)); + if (retval == -1) { + perror("read"); + exit(errno); + } + fprintf(stderr, " %d",i); + fflush(stderr); + irqcount++; +} + +fprintf(stderr, "\nAgain, from using select(2) on /dev/rtc:"); +fflush(stderr); +for (i=1; i<6; i++) { + struct timeval tv = {5, 0}; /* 5 second timeout on select */ + fd_set readfds; + + FD_ZERO(&readfds); + FD_SET(fd, &readfds); + /* The select will wait until an RTC interrupt happens. */ + retval = select(fd+1, &readfds, NULL, NULL, &tv); + if (retval == -1) { + perror("select"); + exit(errno); + } + /* This read won't block unlike the select-less case above. */ + retval = read(fd, &data, sizeof(unsigned long)); + if (retval == -1) { + perror("read"); + exit(errno); + } + fprintf(stderr, " %d",i); + fflush(stderr); + irqcount++; +} + +/* Turn off update interrupts */ +retval = ioctl(fd, RTC_UIE_OFF, 0); +if (retval == -1) { + perror("ioctl"); + exit(errno); +} + +/* Read the RTC time/date */ +retval = ioctl(fd, RTC_RD_TIME, &rtc_tm); +if (retval == -1) { + perror("ioctl"); + exit(errno); +} + +fprintf(stderr, "\n\nCurrent RTC date/time is %d-%d-%d, %02d:%02d:%02d.\n", + rtc_tm.tm_mday, rtc_tm.tm_mon + 1, rtc_tm.tm_year + 1900, + rtc_tm.tm_hour, rtc_tm.tm_min, rtc_tm.tm_sec); + +/* Set the alarm to 5 sec in the future, and check for rollover */ +rtc_tm.tm_sec += 5; +if (rtc_tm.tm_sec >= 60) { + rtc_tm.tm_sec %= 60; + rtc_tm.tm_min++; +} +if (rtc_tm.tm_min == 60) { + rtc_tm.tm_min = 0; + rtc_tm.tm_hour++; +} +if (rtc_tm.tm_hour == 24) + rtc_tm.tm_hour = 0; + +retval = ioctl(fd, RTC_ALM_SET, &rtc_tm); +if (retval == -1) { + perror("ioctl"); + exit(errno); +} + +/* Read the current alarm settings */ +retval = ioctl(fd, RTC_ALM_READ, &rtc_tm); +if (retval == -1) { + perror("ioctl"); + exit(errno); +} + +fprintf(stderr, "Alarm time now set to %02d:%02d:%02d.\n", + rtc_tm.tm_hour, rtc_tm.tm_min, rtc_tm.tm_sec); + +/* Enable alarm interrupts */ +retval = ioctl(fd, RTC_AIE_ON, 0); +if (retval == -1) { + perror("ioctl"); + exit(errno); +} + +fprintf(stderr, "Waiting 5 seconds for alarm..."); +fflush(stderr); +/* This blocks until the alarm ring causes an interrupt */ +retval = read(fd, &data, sizeof(unsigned long)); +if (retval == -1) { + perror("read"); + exit(errno); +} +irqcount++; +fprintf(stderr, " okay. Alarm rang.\n"); + +/* Disable alarm interrupts */ +retval = ioctl(fd, RTC_AIE_OFF, 0); +if (retval == -1) { + perror("ioctl"); + exit(errno); +} + +/* Read periodic IRQ rate */ +retval = ioctl(fd, RTC_IRQP_READ, &tmp); +if (retval == -1) { + perror("ioctl"); + exit(errno); +} +fprintf(stderr, "\nPeriodic IRQ rate was %ldHz.\n", tmp); + +fprintf(stderr, "Counting 20 interrupts at:"); +fflush(stderr); + +/* The frequencies 128Hz, 256Hz, ... 8192Hz are only allowed for root. */ +for (tmp=2; tmp<=64; tmp*=2) { + + retval = ioctl(fd, RTC_IRQP_SET, tmp); + if (retval == -1) { + perror("ioctl"); + exit(errno); + } + + fprintf(stderr, "\n%ldHz:\t", tmp); + fflush(stderr); + + /* Enable periodic interrupts */ + retval = ioctl(fd, RTC_PIE_ON, 0); + if (retval == -1) { + perror("ioctl"); + exit(errno); + } + + for (i=1; i<21; i++) { + /* This blocks */ + retval = read(fd, &data, sizeof(unsigned long)); + if (retval == -1) { + perror("read"); + exit(errno); + } + fprintf(stderr, " %d",i); + fflush(stderr); + irqcount++; + } + + /* Disable periodic interrupts */ + retval = ioctl(fd, RTC_PIE_OFF, 0); + if (retval == -1) { + perror("ioctl"); + exit(errno); + } +} + +fprintf(stderr, "\n\n\t\t\t *** Test complete ***\n"); +fprintf(stderr, "\nTyping \"cat /proc/interrupts\" will show %d more events on IRQ 8.\n\n", + irqcount); + +close(fd); +return 0; + +} /* end main */ diff --git a/Documentation/s390/3270.ChangeLog b/Documentation/s390/3270.ChangeLog new file mode 100644 index 000000000000..031c36081946 --- /dev/null +++ b/Documentation/s390/3270.ChangeLog @@ -0,0 +1,44 @@ +ChangeLog for the UTS Global 3270-support patch + +Sep 2002: Get bootup colors right on 3270 console + * In tubttybld.c, substantially revise ESC processing so that + ESC sequences (especially coloring ones) and the strings + they affect work as right as 3270 can get them. Also, set + screen height to omit the two rows used for input area, in + tty3270_open() in tubtty.c. + +Sep 2002: Dynamically get 3270 input buffer + * Oversize 3270 screen widths may exceed GEOM_MAXINPLEN columns, + so get input-area buffer dynamically when sizing the device in + tubmakemin() in tuball.c (if it's the console) or tty3270_open() + in tubtty.c (if needed). Change tubp->tty_input to be a + pointer rather than an array, in tubio.h. + +Sep 2002: Fix tubfs kmalloc()s + * Do read and write lengths correctly in fs3270_read() + and fs3270_write(), whilst never asking kmalloc() + for more than 0x800 bytes. Affects tubfs.c and tubio.h. + +Sep 2002: Recognize 3270 control unit type 3174 + * Recognize control-unit type 0x3174 as well as 0x327?. + The IBM 2047 device emulates a 3174 control unit. + Modularize control-unit recognition in tuball.c by + adding and invoking new tub3270_is_ours(). + +Apr 2002: Fix 3270 console reboot loop + * (Belated log entry) Fixed reboot loop if 3270 console, + in tubtty.c:ttu3270_bh(). + +Feb 6, 2001: + * This changelog is new + * tub3270 now supports 3270 console: + Specify y for CONFIG_3270 and y for CONFIG_3270_CONSOLE. + Support for 3215 will not appear if 3270 console support + is chosen. + NOTE: The default is 3270 console support, NOT 3215. + * the components are remodularized: added source modules are + tubttybld.c and tubttyscl.c, for screen-building code and + scroll-timeout code. + * tub3270 source for this (2.4.0) version is #ifdeffed to + build with both 2.4.0 and 2.2.16.2. + * color support and minimal other ESC-sequence support is added. diff --git a/Documentation/s390/3270.txt b/Documentation/s390/3270.txt new file mode 100644 index 000000000000..0a044e647d2d --- /dev/null +++ b/Documentation/s390/3270.txt @@ -0,0 +1,274 @@ +IBM 3270 Display System support + +This file describes the driver that supports local channel attachment +of IBM 3270 devices. It consists of three sections: + * Introduction + * Installation + * Operation + + +INTRODUCTION. + +This paper describes installing and operating 3270 devices under +Linux/390. A 3270 device is a block-mode rows-and-columns terminal of +which I'm sure hundreds of millions were sold by IBM and clonemakers +twenty and thirty years ago. + +You may have 3270s in-house and not know it. If you're using the +VM-ESA operating system, define a 3270 to your virtual machine by using +the command "DEF GRAF <hex-address>" This paper presumes you will be +defining four 3270s with the CP/CMS commands + + DEF GRAF 620 + DEF GRAF 621 + DEF GRAF 622 + DEF GRAF 623 + +Your network connection from VM-ESA allows you to use x3270, tn3270, or +another 3270 emulator, started from an xterm window on your PC or +workstation. With the DEF GRAF command, an application such as xterm, +and this Linux-390 3270 driver, you have another way of talking to your +Linux box. + +This paper covers installation of the driver and operation of a +dialed-in x3270. + + +INSTALLATION. + +You install the driver by installing a patch, doing a kernel build, and +running the configuration script (config3270.sh, in this directory). + +WARNING: If you are using 3270 console support, you must rerun the +configuration script every time you change the console's address (perhaps +by using the condev= parameter in silo's /boot/parmfile). More precisely, +you should rerun the configuration script every time your set of 3270s, +including the console 3270, changes subchannel identifier relative to +one another. ReIPL as soon as possible after running the configuration +script and the resulting /tmp/mkdev3270. + +If you have chosen to make tub3270 a module, you add a line to +/etc/modprobe.conf. If you are working on a VM virtual machine, you +can use DEF GRAF to define virtual 3270 devices. + +You may generate both 3270 and 3215 console support, or one or the +other, or neither. If you generate both, the console type under VM is +not changed. Use #CP Q TERM to see what the current console type is. +Use #CP TERM CONMODE 3270 to change it to 3270. If you generate only +3270 console support, then the driver automatically converts your console +at boot time to a 3270 if it is a 3215. + +In brief, these are the steps: + 1. Install the tub3270 patch + 2. (If a module) add a line to /etc/modprobe.conf + 3. (If VM) define devices with DEF GRAF + 4. Reboot + 5. Configure + +To test that everything works, assuming VM and x3270, + 1. Bring up an x3270 window. + 2. Use the DIAL command in that window. + 3. You should immediately see a Linux login screen. + +Here are the installation steps in detail: + + 1. The 3270 driver is a part of the official Linux kernel + source. Build a tree with the kernel source and any necessary + patches. Then do + make oldconfig + (If you wish to disable 3215 console support, edit + .config; change CONFIG_TN3215's value to "n"; + and rerun "make oldconfig".) + make image + make modules + make modules_install + + 2. (Perform this step only if you have configured tub3270 as a + module.) Add a line to /etc/modprobe.conf to automatically + load the driver when it's needed. With this line added, + you will see login prompts appear on your 3270s as soon as + boot is complete (or with emulated 3270s, as soon as you dial + into your vm guest using the command "DIAL <vmguestname>"). + Since the line-mode major number is 227, the line to add to + /etc/modprobe.conf should be: + alias char-major-227 tub3270 + + 3. Define graphic devices to your vm guest machine, if you + haven't already. Define them before you reboot (reipl): + DEFINE GRAF 620 + DEFINE GRAF 621 + DEFINE GRAF 622 + DEFINE GRAF 623 + + 4. Reboot. The reboot process scans hardware devices, including + 3270s, and this enables the tub3270 driver once loaded to respond + correctly to the configuration requests of the next step. If + you have chosen 3270 console support, your console now behaves + as a 3270, not a 3215. + + 5. Run the 3270 configuration script config3270. It is + distributed in this same directory, Documentation/s390, as + config3270.sh. Inspect the output script it produces, + /tmp/mkdev3270, and then run that script. This will create the + necessary character special device files and make the necessary + changes to /etc/inittab. If you have selected DEVFS, the driver + itself creates the device files, and /tmp/mkdev3270 only changes + /etc/inittab. + + Then notify /sbin/init that /etc/inittab has changed, by issuing + the telinit command with the q operand: + cd Documentation/s390 + sh config3270.sh + sh /tmp/mkdev3270 + telinit q + + This should be sufficient for your first time. If your 3270 + configuration has changed and you're reusing config3270, you + should follow these steps: + Change 3270 configuration + Reboot + Run config3270 and /tmp/mkdev3270 + Reboot + +Here are the testing steps in detail: + + 1. Bring up an x3270 window, or use an actual hardware 3278 or + 3279, or use the 3270 emulator of your choice. You would be + running the emulator on your PC or workstation. You would use + the command, for example, + x3270 vm-esa-domain-name & + if you wanted a 3278 Model 4 with 43 rows of 80 columns, the + default model number. The driver does not take advantage of + extended attributes. + + The screen you should now see contains a VM logo with input + lines near the bottom. Use TAB to move to the bottom line, + probably labeled "COMMAND ===>". + + 2. Use the DIAL command instead of the LOGIN command to connect + to one of the virtual 3270s you defined with the DEF GRAF + commands: + dial my-vm-guest-name + + 3. You should immediately see a login prompt from your + Linux-390 operating system. If that does not happen, you would + see instead the line "DIALED TO my-vm-guest-name 0620". + + To troubleshoot: do these things. + + A. Is the driver loaded? Use the lsmod command (no operands) + to find out. Probably it isn't. Try loading it manually, with + the command "insmod tub3270". Does that command give error + messages? Ha! There's your problem. + + B. Is the /etc/inittab file modified as in installation step 3 + above? Use the grep command to find out; for instance, issue + "grep 3270 /etc/inittab". Nothing found? There's your + problem! + + C. Are the device special files created, as in installation + step 2 above? Use the ls -l command to find out; for instance, + issue "ls -l /dev/3270/tty620". The output should start with the + letter "c" meaning character device and should contain "227, 1" + just to the left of the device name. No such file? no "c"? + Wrong major number? Wrong minor number? There's your + problem! + + D. Do you get the message + "HCPDIA047E my-vm-guest-name 0620 does not exist"? + If so, you must issue the command "DEF GRAF 620" from your VM + 3215 console and then reboot the system. + + + +OPERATION. + +The driver defines three areas on the 3270 screen: the log area, the +input area, and the status area. + +The log area takes up all but the bottom two lines of the screen. The +driver writes terminal output to it, starting at the top line and going +down. When it fills, the status area changes from "Linux Running" to +"Linux More...". After a scrolling timeout of (default) 5 sec, the +screen clears and more output is written, from the top down. + +The input area extends from the beginning of the second-to-last screen +line to the start of the status area. You type commands in this area +and hit ENTER to execute them. + +The status area initializes to "Linux Running" to give you a warm +fuzzy feeling. When the log area fills up and output awaits, it +changes to "Linux More...". At this time you can do several things or +nothing. If you do nothing, the screen will clear in (default) 5 sec +and more output will appear. You may hit ENTER with nothing typed in +the input area to toggle between "Linux More..." and "Linux Holding", +which indicates no scrolling will occur. (If you hit ENTER with "Linux +Running" and nothing typed, the application receives a newline.) + +You may change the scrolling timeout value. For example, the following +command line: + echo scrolltime=60 > /proc/tty/driver/tty3270 +changes the scrolling timeout value to 60 sec. Set scrolltime to 0 if +you wish to prevent scrolling entirely. + +Other things you may do when the log area fills up are: hit PA2 to +clear the log area and write more output to it, or hit CLEAR to clear +the log area and the input area and write more output to the log area. + +Some of the Program Function (PF) and Program Attention (PA) keys are +preassigned special functions. The ones that are not yield an alarm +when pressed. + +PA1 causes a SIGINT to the currently running application. You may do +the same thing from the input area, by typing "^C" and hitting ENTER. + +PA2 causes the log area to be cleared. If output awaits, it is then +written to the log area. + +PF3 causes an EOF to be received as input by the application. You may +cause an EOF also by typing "^D" and hitting ENTER. + +No PF key is preassigned to cause a job suspension, but you may cause a +job suspension by typing "^Z" and hitting ENTER. You may wish to +assign this function to a PF key. To make PF7 cause job suspension, +execute the command: + echo pf7=^z > /proc/tty/driver/tty3270 + +If the input you type does not end with the two characters "^n", the +driver appends a newline character and sends it to the tty driver; +otherwise the driver strips the "^n" and does not append a newline. +The IBM 3215 driver behaves similarly. + +Pf10 causes the most recent command to be retrieved from the tube's +command stack (default depth 20) and displayed in the input area. You +may hit PF10 again for the next-most-recent command, and so on. A +command is entered into the stack only when the input area is not made +invisible (such as for password entry) and it is not identical to the +current top entry. PF10 rotates backward through the command stack; +PF11 rotates forward. You may assign the backward function to any PF +key (or PA key, for that matter), say, PA3, with the command: + echo -e pa3=\\033k > /proc/tty/driver/tty3270 +This assigns the string ESC-k to PA3. Similarly, the string ESC-j +performs the forward function. (Rationale: In bash with vi-mode line +editing, ESC-k and ESC-j retrieve backward and forward history. +Suggestions welcome.) + +Is a stack size of twenty commands not to your liking? Change it on +the fly. To change to saving the last 100 commands, execute the +command: + echo recallsize=100 > /proc/tty/driver/tty3270 + +Have a command you issue frequently? Assign it to a PF or PA key! Use +the command + echo pf24="mkdir foobar; cd foobar" > /proc/tty/driver/tty3270 +to execute the commands mkdir foobar and cd foobar immediately when you +hit PF24. Want to see the command line first, before you execute it? +Use the -n option of the echo command: + echo -n pf24="mkdir foo; cd foo" > /proc/tty/driver/tty3270 + + + +Happy testing! I welcome any and all comments about this document, the +driver, etc etc. + +Dick Hitt <rbh00@utsglobal.com> diff --git a/Documentation/s390/CommonIO b/Documentation/s390/CommonIO new file mode 100644 index 000000000000..a831d9ae5a5e --- /dev/null +++ b/Documentation/s390/CommonIO @@ -0,0 +1,109 @@ +S/390 common I/O-Layer - command line parameters and /proc entries +================================================================== + +Command line parameters +----------------------- + +* cio_msg = yes | no + + Determines whether information on found devices and sensed device + characteristics should be shown during startup, i. e. messages of the types + "Detected device 0.0.4711 on subchannel 0.0.0042" and "SenseID: Device + 0.0.4711 reports: ...". + + Default is off. + + +* cio_ignore = {all} | + {<device> | <range of devices>} | + {!<device> | !<range of devices>} + + The given devices will be ignored by the common I/O-layer; no detection + and device sensing will be done on any of those devices. The subchannel to + which the device in question is attached will be treated as if no device was + attached. + + An ignored device can be un-ignored later; see the "/proc entries"-section for + details. + + The devices must be given either as bus ids (0.0.abcd) or as hexadecimal + device numbers (0xabcd or abcd, for 2.4 backward compatibility). + You can use the 'all' keyword to ignore all devices. + The '!' operator will cause the I/O-layer to _not_ ignore a device. + The order on the command line is not important. + + For example, + cio_ignore=0.0.0023-0.0.0042,0.0.4711 + will ignore all devices ranging from 0.0.0023 to 0.0.0042 and the device + 0.0.4711, if detected. + As another example, + cio_ignore=all,!0.0.4711,!0.0.fd00-0.0.fd02 + will ignore all devices but 0.0.4711, 0.0.fd00, 0.0.fd01, 0.0.fd02. + + By default, no devices are ignored. + + +/proc entries +------------- + +* /proc/cio_ignore + + Lists the ranges of devices (by bus id) which are ignored by common I/O. + + You can un-ignore certain or all devices by piping to /proc/cio_ignore. + "free all" will un-ignore all ignored devices, + "free <device range>, <device range>, ..." will un-ignore the specified + devices. + + For example, if devices 0.0.0023 to 0.0.0042 and 0.0.4711 are ignored, + - echo free 0.0.0030-0.0.0032 > /proc/cio_ignore + will un-ignore devices 0.0.0030 to 0.0.0032 and will leave devices 0.0.0023 + to 0.0.002f, 0.0.0033 to 0.0.0042 and 0.0.4711 ignored; + - echo free 0.0.0041 > /proc/cio_ignore will furthermore un-ignore device + 0.0.0041; + - echo free all > /proc/cio_ignore will un-ignore all remaining ignored + devices. + + When a device is un-ignored, device recognition and sensing is performed and + the device driver will be notified if possible, so the device will become + available to the system. + + You can also add ranges of devices to be ignored by piping to + /proc/cio_ignore; "add <device range>, <device range>, ..." will ignore the + specified devices. + + Note: Already known devices cannot be ignored. + + For example, if device 0.0.abcd is already known and all other devices + 0.0.a000-0.0.afff are not known, + "echo add 0.0.a000-0.0.accc, 0.0.af00-0.0.afff > /proc/cio_ignore" + will add 0.0.a000-0.0.abcc, 0.0.abce-0.0.accc and 0.0.af00-0.0.afff to the + list of ignored devices and skip 0.0.abcd. + + The devices can be specified either by bus id (0.0.abcd) or, for 2.4 backward + compatibilty, by the device number in hexadecimal (0xabcd or abcd). + + +* /proc/s390dbf/cio_*/ (S/390 debug feature) + + Some views generated by the debug feature to hold various debug outputs. + + - /proc/s390dbf/cio_crw/sprintf + Messages from the processing of pending channel report words (machine check + handling), which will also show when CONFIG_DEBUG_CRW is defined. + + - /proc/s390dbf/cio_msg/sprintf + Various debug messages from the common I/O-layer; generally, messages which + will also show when CONFIG_DEBUG_IO is defined. + + - /proc/s390dbf/cio_trace/hex_ascii + Logs the calling of functions in the common I/O-layer and, if applicable, + which subchannel they were called for. + + The level of logging can be changed to be more or less verbose by piping to + /proc/s390dbf/cio_*/level a number between 0 and 6; see the documentation on + the S/390 debug feature (Documentation/s390/s390dbf.txt) for details. + +* For some of the information present in the /proc filesystem in 2.4 (namely, + /proc/subchannels and /proc/chpids), see driver-model.txt. + Information formerly in /proc/irq_count is now in /proc/interrupts. diff --git a/Documentation/s390/DASD b/Documentation/s390/DASD new file mode 100644 index 000000000000..9963f1e9c98a --- /dev/null +++ b/Documentation/s390/DASD @@ -0,0 +1,73 @@ +DASD device driver + +S/390's disk devices (DASDs) are managed by Linux via the DASD device +driver. It is valid for all types of DASDs and represents them to +Linux as block devices, namely "dd". Currently the DASD driver uses a +single major number (254) and 4 minor numbers per volume (1 for the +physical volume and 3 for partitions). With respect to partitions see +below. Thus you may have up to 64 DASD devices in your system. + +The kernel parameter 'dasd=from-to,...' may be issued arbitrary times +in the kernel's parameter line or not at all. The 'from' and 'to' +parameters are to be given in hexadecimal notation without a leading +0x. +If you supply kernel parameters the different instances are processed +in order of appearance and a minor number is reserved for any device +covered by the supplied range up to 64 volumes. Additional DASDs are +ignored. If you do not supply the 'dasd=' kernel parameter at all, the +DASD driver registers all supported DASDs of your system to a minor +number in ascending order of the subchannel number. + +The driver currently supports ECKD-devices and there are stubs for +support of the FBA and CKD architectures. For the FBA architecture +only some smart data structures are missing to make the support +complete. +We performed our testing on 3380 and 3390 type disks of different +sizes, under VM and on the bare hardware (LPAR), using internal disks +of the multiprise as well as a RAMAC virtual array. Disks exported by +an Enterprise Storage Server (Seascape) should work fine as well. + +We currently implement one partition per volume, which is the whole +volume, skipping the first blocks up to the volume label. These are +reserved for IPL records and IBM's volume label to assure +accessibility of the DASD from other OSs. In a later stage we will +provide support of partitions, maybe VTOC oriented or using a kind of +partition table in the label record. + +USAGE + +-Low-level format (?CKD only) +For using an ECKD-DASD as a Linux harddisk you have to low-level +format the tracks by issuing the BLKDASDFORMAT-ioctl on that +device. This will erase any data on that volume including IBM volume +labels, VTOCs etc. The ioctl may take a 'struct format_data *' or +'NULL' as an argument. +typedef struct { + int start_unit; + int stop_unit; + int blksize; +} format_data_t; +When a NULL argument is passed to the BLKDASDFORMAT ioctl the whole +disk is formatted to a blocksize of 1024 bytes. Otherwise start_unit +and stop_unit are the first and last track to be formatted. If +stop_unit is -1 it implies that the DASD is formatted from start_unit +up to the last track. blksize can be any power of two between 512 and +4096. We recommend no blksize lower than 1024 because the ext2fs uses +1kB blocks anyway and you gain approx. 50% of capacity increasing your +blksize from 512 byte to 1kB. + +-Make a filesystem +Then you can mk??fs the filesystem of your choice on that volume or +partition. For reasons of sanity you should build your filesystem on +the partition /dev/dd?1 instead of the whole volume. You only lose 3kB +but may be sure that you can reuse your data after introduction of a +real partition table. + +BUGS: +- Performance sometimes is rather low because we don't fully exploit clustering + +TODO-List: +- Add IBM'S Disk layout to genhd +- Enhance driver to use more than one major number +- Enable usage as a module +- Support Cache fast write and DASD fast write (ECKD) diff --git a/Documentation/s390/Debugging390.txt b/Documentation/s390/Debugging390.txt new file mode 100644 index 000000000000..adbfe620c061 --- /dev/null +++ b/Documentation/s390/Debugging390.txt @@ -0,0 +1,2536 @@ + + Debugging on Linux for s/390 & z/Architecture + by + Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com) + Copyright (C) 2000-2001 IBM Deutschland Entwicklung GmbH, IBM Corporation + Best viewed with fixed width fonts + +Overview of Document: +===================== +This document is intended to give an good overview of how to debug +Linux for s/390 & z/Architecture it isn't intended as a complete reference & not a +tutorial on the fundamentals of C & assembly, it dosen't go into +390 IO in any detail. It is intended to complement the documents in the +reference section below & any other worthwhile references you get. + +It is intended like the Enterprise Systems Architecture/390 Reference Summary +to be printed out & used as a quick cheat sheet self help style reference when +problems occur. + +Contents +======== +Register Set +Address Spaces on Intel Linux +Address Spaces on Linux for s/390 & z/Architecture +The Linux for s/390 & z/Architecture Kernel Task Structure +Register Usage & Stackframes on Linux for s/390 & z/Architecture +A sample program with comments +Compiling programs for debugging on Linux for s/390 & z/Architecture +Figuring out gcc compile errors +Debugging Tools +objdump +strace +Performance Debugging +Debugging under VM +s/390 & z/Architecture IO Overview +Debugging IO on s/390 & z/Architecture under VM +GDB on s/390 & z/Architecture +Stack chaining in gdb by hand +Examining core dumps +ldd +Debugging modules +The proc file system +Starting points for debugging scripting languages etc. +Dumptool & Lcrash +SysRq +References +Special Thanks + +Register Set +============ +The current architectures have the following registers. + +16 General propose registers, 32 bit on s/390 64 bit on z/Architecture, r0-r15 or gpr0-gpr15 used for arithmetic & addressing. + +16 Control registers, 32 bit on s/390 64 bit on z/Architecture, ( cr0-cr15 kernel usage only ) used for memory management, +interrupt control,debugging control etc. + +16 Access registers ( ar0-ar15 ) 32 bit on s/390 & z/Architecture +not used by normal programs but potentially could +be used as temporary storage. Their main purpose is their 1 to 1 +association with general purpose registers and are used in +the kernel for copying data between kernel & user address spaces. +Access register 0 ( & access register 1 on z/Architecture ( needs 64 bit +pointer ) ) is currently used by the pthread library as a pointer to +the current running threads private area. + +16 64 bit floating point registers (fp0-fp15 ) IEEE & HFP floating +point format compliant on G5 upwards & a Floating point control reg (FPC) +4 64 bit registers (fp0,fp2,fp4 & fp6) HFP only on older machines. +Note: +Linux (currently) always uses IEEE & emulates G5 IEEE format on older machines, +( provided the kernel is configured for this ). + + +The PSW is the most important register on the machine it +is 64 bit on s/390 & 128 bit on z/Architecture & serves the roles of +a program counter (pc), condition code register,memory space designator. +In IBM standard notation I am counting bit 0 as the MSB. +It has several advantages over a normal program counter +in that you can change address translation & program counter +in a single instruction. To change address translation, +e.g. switching address translation off requires that you +have a logical=physical mapping for the address you are +currently running at. + + Bit Value +s/390 z/Architecture +0 0 Reserved ( must be 0 ) otherwise specification exception occurs. + +1 1 Program Event Recording 1 PER enabled, + PER is used to facilititate debugging e.g. single stepping. + +2-4 2-4 Reserved ( must be 0 ). + +5 5 Dynamic address translation 1=DAT on. + +6 6 Input/Output interrupt Mask + +7 7 External interrupt Mask used primarily for interprocessor signalling & + clock interrupts. + +8-11 8-11 PSW Key used for complex memory protection mechanism not used under linux + +12 12 1 on s/390 0 on z/Architecture + +13 13 Machine Check Mask 1=enable machine check interrupts + +14 14 Wait State set this to 1 to stop the processor except for interrupts & give + time to other LPARS used in CPU idle in the kernel to increase overall + usage of processor resources. + +15 15 Problem state ( if set to 1 certain instructions are disabled ) + all linux user programs run with this bit 1 + ( useful info for debugging under VM ). + +16-17 16-17 Address Space Control + + 00 Primary Space Mode when DAT on + The linux kernel currently runs in this mode, CR1 is affiliated with + this mode & points to the primary segment table origin etc. + + 01 Access register mode this mode is used in functions to + copy data between kernel & user space. + + 10 Secondary space mode not used in linux however CR7 the + register affiliated with this mode is & this & normally + CR13=CR7 to allow us to copy data between kernel & user space. + We do this as follows: + We set ar2 to 0 to designate its + affiliated gpr ( gpr2 )to point to primary=kernel space. + We set ar4 to 1 to designate its + affiliated gpr ( gpr4 ) to point to secondary=home=user space + & then essentially do a memcopy(gpr2,gpr4,size) to + copy data between the address spaces, the reason we use home space for the + kernel & don't keep secondary space free is that code will not run in + secondary space. + + 11 Home Space Mode all user programs run in this mode. + it is affiliated with CR13. + +18-19 18-19 Condition codes (CC) + +20 20 Fixed point overflow mask if 1=FPU exceptions for this event + occur ( normally 0 ) + +21 21 Decimal overflow mask if 1=FPU exceptions for this event occur + ( normally 0 ) + +22 22 Exponent underflow mask if 1=FPU exceptions for this event occur + ( normally 0 ) + +23 23 Significance Mask if 1=FPU exceptions for this event occur + ( normally 0 ) + +24-31 24-30 Reserved Must be 0. + + 31 Extended Addressing Mode + 32 Basic Addressing Mode + Used to set addressing mode + PSW 31 PSW 32 + 0 0 24 bit + 0 1 31 bit + 1 1 64 bit + +32 1=31 bit addressing mode 0=24 bit addressing mode (for backward + compatibility ), linux always runs with this bit set to 1 + +33-64 Instruction address. + 33-63 Reserved must be 0 + 64-127 Address + In 24 bits mode bits 64-103=0 bits 104-127 Address + In 31 bits mode bits 64-96=0 bits 97-127 Address + Note: unlike 31 bit mode on s/390 bit 96 must be zero + when loading the address with LPSWE otherwise a + specification exception occurs, LPSW is fully backward + compatible. + + +Prefix Page(s) +-------------- +This per cpu memory area is too intimately tied to the processor not to mention. +It exists between the real addresses 0-4096 on s/390 & 0-8192 z/Architecture & is exchanged +with a 1 page on s/390 or 2 pages on z/Architecture in absolute storage by the set +prefix instruction in linux'es startup. +This page is mapped to a different prefix for each processor in an SMP configuration +( assuming the os designer is sane of course :-) ). +Bytes 0-512 ( 200 hex ) on s/390 & 0-512,4096-4544,4604-5119 currently on z/Architecture +are used by the processor itself for holding such information as exception indications & +entry points for exceptions. +Bytes after 0xc00 hex are used by linux for per processor globals on s/390 & z/Architecture +( there is a gap on z/Architecure too currently between 0xc00 & 1000 which linux uses ). +The closest thing to this on traditional architectures is the interrupt +vector table. This is a good thing & does simplify some of the kernel coding +however it means that we now cannot catch stray NULL pointers in the +kernel without hard coded checks. + + + +Address Spaces on Intel Linux +============================= + +The traditional Intel Linux is approximately mapped as follows forgive +the ascii art. +0xFFFFFFFF 4GB Himem ***************** + * * + * Kernel Space * + * * + ***************** **************** +User Space Himem (typically 0xC0000000 3GB )* User Stack * * * + ***************** * * + * Shared Libs * * Next Process * + ***************** * to * + * * <== * Run * <== + * User Program * * * + * Data BSS * * * + * Text * * * + * Sections * * * +0x00000000 ***************** **************** + +Now it is easy to see that on Intel it is quite easy to recognise a kernel address +as being one greater than user space himem ( in this case 0xC0000000). +& addresses of less than this are the ones in the current running program on this +processor ( if an smp box ). +If using the virtual machine ( VM ) as a debugger it is quite difficult to +know which user process is running as the address space you are looking at +could be from any process in the run queue. + +The limitation of Intels addressing technique is that the linux +kernel uses a very simple real address to virtual addressing technique +of Real Address=Virtual Address-User Space Himem. +This means that on Intel the kernel linux can typically only address +Himem=0xFFFFFFFF-0xC0000000=1GB & this is all the RAM these machines +can typically use. +They can lower User Himem to 2GB or lower & thus be +able to use 2GB of RAM however this shrinks the maximum size +of User Space from 3GB to 2GB they have a no win limit of 4GB unless +they go to 64 Bit. + + +On 390 our limitations & strengths make us slightly different. +For backward compatibility we are only allowed use 31 bits (2GB) +of our 32 bit addresses,however, we use entirely separate address +spaces for the user & kernel. + +This means we can support 2GB of non Extended RAM on s/390, & more +with the Extended memory management swap device & +currently 4TB of physical memory currently on z/Architecture. + + +Address Spaces on Linux for s/390 & z/Architecture +================================================== + +Our addressing scheme is as follows + + +Himem 0x7fffffff 2GB on s/390 ***************** **************** +currently 0x3ffffffffff (2^42)-1 * User Stack * * * +on z/Architecture. ***************** * * + * Shared Libs * * * + ***************** * * + * * * Kernel * + * User Program * * * + * Data BSS * * * + * Text * * * + * Sections * * * +0x00000000 ***************** **************** + +This also means that we need to look at the PSW problem state bit +or the addressing mode to decide whether we are looking at +user or kernel space. + +Virtual Addresses on s/390 & z/Architecture +=========================================== + +A virtual address on s/390 is made up of 3 parts +The SX ( segment index, roughly corresponding to the PGD & PMD in linux terminology ) +being bits 1-11. +The PX ( page index, corresponding to the page table entry (pte) in linux terminology ) +being bits 12-19. +The remaining bits BX (the byte index are the offset in the page ) +i.e. bits 20 to 31. + +On z/Architecture in linux we currently make up an address from 4 parts. +The region index bits (RX) 0-32 we currently use bits 22-32 +The segment index (SX) being bits 33-43 +The page index (PX) being bits 44-51 +The byte index (BX) being bits 52-63 + +Notes: +1) s/390 has no PMD so the PMD is really the PGD also. +A lot of this stuff is defined in pgtable.h. + +2) Also seeing as s/390's page indexes are only 1k in size +(bits 12-19 x 4 bytes per pte ) we use 1 ( page 4k ) +to make the best use of memory by updating 4 segment indices +entries each time we mess with a PMD & use offsets +0,1024,2048 & 3072 in this page as for our segment indexes. +On z/Architecture our page indexes are now 2k in size +( bits 12-19 x 8 bytes per pte ) we do a similar trick +but only mess with 2 segment indices each time we mess with +a PMD. + +3) As z/Architecture supports upto a massive 5-level page table lookup we +can only use 3 currently on Linux ( as this is all the generic kernel +currently supports ) however this may change in future +this allows us to access ( according to my sums ) +4TB of virtual storage per process i.e. +4096*512(PTES)*1024(PMDS)*2048(PGD) = 4398046511104 bytes, +enough for another 2 or 3 of years I think :-). +to do this we use a region-third-table designation type in +our address space control registers. + + +The Linux for s/390 & z/Architecture Kernel Task Structure +========================================================== +Each process/thread under Linux for S390 has its own kernel task_struct +defined in linux/include/linux/sched.h +The S390 on initialisation & resuming of a process on a cpu sets +the __LC_KERNEL_STACK variable in the spare prefix area for this cpu +( which we use for per processor globals). + +The kernel stack pointer is intimately tied with the task stucture for +each processor as follows. + + s/390 + ************************ + * 1 page kernel stack * + * ( 4K ) * + ************************ + * 1 page task_struct * + * ( 4K ) * +8K aligned ************************ + + z/Architecture + ************************ + * 2 page kernel stack * + * ( 8K ) * + ************************ + * 2 page task_struct * + * ( 8K ) * +16K aligned ************************ + +What this means is that we don't need to dedicate any register or global variable +to point to the current running process & can retrieve it with the following +very simple construct for s/390 & one very similar for z/Architecture. + +static inline struct task_struct * get_current(void) +{ + struct task_struct *current; + __asm__("lhi %0,-8192\n\t" + "nr %0,15" + : "=r" (current) ); + return current; +} + +i.e. just anding the current kernel stack pointer with the mask -8192. +Thankfully because Linux dosen't have support for nested IO interrupts +& our devices have large buffers can survive interrupts being shut for +short amounts of time we don't need a separate stack for interrupts. + + + + +Register Usage & Stackframes on Linux for s/390 & z/Architecture +================================================================= +Overview: +--------- +This is the code that gcc produces at the top & the bottom of +each function, it usually is fairly consistent & similar from +function to function & if you know its layout you can probalby +make some headway in finding the ultimate cause of a problem +after a crash without a source level debugger. + +Note: To follow stackframes requires a knowledge of C or Pascal & +limited knowledge of one assembly language. + +It should be noted that there are some differences between the +s/390 & z/Architecture stack layouts as the z/Architecture stack layout didn't have +to maintain compatibility with older linkage formats. + +Glossary: +--------- +alloca: +This is a built in compiler function for runtime allocation +of extra space on the callers stack which is obviously freed +up on function exit ( e.g. the caller may choose to allocate nothing +of a buffer of 4k if required for temporary purposes ), it generates +very efficient code ( a few cycles ) when compared to alternatives +like malloc. + +automatics: These are local variables on the stack, +i.e they aren't in registers & they aren't static. + +back-chain: +This is a pointer to the stack pointer before entering a +framed functions ( see frameless function ) prologue got by +deferencing the address of the current stack pointer, + i.e. got by accessing the 32 bit value at the stack pointers +current location. + +base-pointer: +This is a pointer to the back of the literal pool which +is an area just behind each procedure used to store constants +in each function. + +call-clobbered: The caller probably needs to save these registers if there +is something of value in them, on the stack or elsewhere before making a +call to another procedure so that it can restore it later. + +epilogue: +The code generated by the compiler to return to the caller. + +frameless-function +A frameless function in Linux for s390 & z/Architecture is one which doesn't +need more than the register save area ( 96 bytes on s/390, 160 on z/Architecture ) +given to it by the caller. +A frameless function never: +1) Sets up a back chain. +2) Calls alloca. +3) Calls other normal functions +4) Has automatics. + +GOT-pointer: +This is a pointer to the global-offset-table in ELF +( Executable Linkable Format, Linux'es most common executable format ), +all globals & shared library objects are found using this pointer. + +lazy-binding +ELF shared libraries are typically only loaded when routines in the shared +library are actually first called at runtime. This is lazy binding. + +procedure-linkage-table +This is a table found from the GOT which contains pointers to routines +in other shared libraries which can't be called to by easier means. + +prologue: +The code generated by the compiler to set up the stack frame. + +outgoing-args: +This is extra area allocated on the stack of the calling function if the +parameters for the callee's cannot all be put in registers, the same +area can be reused by each function the caller calls. + +routine-descriptor: +A COFF executable format based concept of a procedure reference +actually being 8 bytes or more as opposed to a simple pointer to the routine. +This is typically defined as follows +Routine Descriptor offset 0=Pointer to Function +Routine Descriptor offset 4=Pointer to Table of Contents +The table of contents/TOC is roughly equivalent to a GOT pointer. +& it means that shared libraries etc. can be shared between several +environments each with their own TOC. + + +static-chain: This is used in nested functions a concept adopted from pascal +by gcc not used in ansi C or C++ ( although quite useful ), basically it +is a pointer used to reference local variables of enclosing functions. +You might come across this stuff once or twice in your lifetime. + +e.g. +The function below should return 11 though gcc may get upset & toss warnings +about unused variables. +int FunctionA(int a) +{ + int b; + FunctionC(int c) + { + b=c+1; + } + FunctionC(10); + return(b); +} + + +s/390 & z/Architecture Register usage +===================================== +r0 used by syscalls/assembly call-clobbered +r1 used by syscalls/assembly call-clobbered +r2 argument 0 / return value 0 call-clobbered +r3 argument 1 / return value 1 (if long long) call-clobbered +r4 argument 2 call-clobbered +r5 argument 3 call-clobbered +r6 argument 5 saved +r7 pointer-to arguments 5 to ... saved +r8 this & that saved +r9 this & that saved +r10 static-chain ( if nested function ) saved +r11 frame-pointer ( if function used alloca ) saved +r12 got-pointer saved +r13 base-pointer saved +r14 return-address saved +r15 stack-pointer saved + +f0 argument 0 / return value ( float/double ) call-clobbered +f2 argument 1 call-clobbered +f4 z/Architecture argument 2 saved +f6 z/Architecture argument 3 saved +The remaining floating points +f1,f3,f5 f7-f15 are call-clobbered. + +Notes: +------ +1) The only requirement is that registers which are used +by the callee are saved, e.g. the compiler is perfectly +capible of using r11 for purposes other than a frame a +frame pointer if a frame pointer is not needed. +2) In functions with variable arguments e.g. printf the calling procedure +is identical to one without variable arguments & the same number of +parameters. However, the prologue of this function is somewhat more +hairy owing to it having to move these parameters to the stack to +get va_start, va_arg & va_end to work. +3) Access registers are currently unused by gcc but are used in +the kernel. Possibilities exist to use them at the moment for +temporary storage but it isn't recommended. +4) Only 4 of the floating point registers are used for +parameter passing as older machines such as G3 only have only 4 +& it keeps the stack frame compatible with other compilers. +However with IEEE floating point emulation under linux on the +older machines you are free to use the other 12. +5) A long long or double parameter cannot be have the +first 4 bytes in a register & the second four bytes in the +outgoing args area. It must be purely in the outgoing args +area if crossing this boundary. +6) Floating point parameters are mixed with outgoing args +on the outgoing args area in the order the are passed in as parameters. +7) Floating point arguments 2 & 3 are saved in the outgoing args area for +z/Architecture + + +Stack Frame Layout +------------------ +s/390 z/Architecture +0 0 back chain ( a 0 here signifies end of back chain ) +4 8 eos ( end of stack, not used on Linux for S390 used in other linkage formats ) +8 16 glue used in other s/390 linkage formats for saved routine descriptors etc. +12 24 glue used in other s/390 linkage formats for saved routine descriptors etc. +16 32 scratch area +20 40 scratch area +24 48 saved r6 of caller function +28 56 saved r7 of caller function +32 64 saved r8 of caller function +36 72 saved r9 of caller function +40 80 saved r10 of caller function +44 88 saved r11 of caller function +48 96 saved r12 of caller function +52 104 saved r13 of caller function +56 112 saved r14 of caller function +60 120 saved r15 of caller function +64 128 saved f4 of caller function +72 132 saved f6 of caller function +80 undefined +96 160 outgoing args passed from caller to callee +96+x 160+x possible stack alignment ( 8 bytes desirable ) +96+x+y 160+x+y alloca space of caller ( if used ) +96+x+y+z 160+x+y+z automatics of caller ( if used ) +0 back-chain + +A sample program with comments. +=============================== + +Comments on the function test +----------------------------- +1) It didn't need to set up a pointer to the constant pool gpr13 as it isn't used +( :-( ). +2) This is a frameless function & no stack is bought. +3) The compiler was clever enough to recognise that it could return the +value in r2 as well as use it for the passed in parameter ( :-) ). +4) The basr ( branch relative & save ) trick works as follows the instruction +has a special case with r0,r0 with some instruction operands is understood as +the literal value 0, some risc architectures also do this ). So now +we are branching to the next address & the address new program counter is +in r13,so now we subtract the size of the function prologue we have executed ++ the size of the literal pool to get to the top of the literal pool +0040037c int test(int b) +{ # Function prologue below + 40037c: 90 de f0 34 stm %r13,%r14,52(%r15) # Save registers r13 & r14 + 400380: 0d d0 basr %r13,%r0 # Set up pointer to constant pool using + 400382: a7 da ff fa ahi %r13,-6 # basr trick + return(5+b); + # Huge main program + 400386: a7 2a 00 05 ahi %r2,5 # add 5 to r2 + + # Function epilogue below + 40038a: 98 de f0 34 lm %r13,%r14,52(%r15) # restore registers r13 & 14 + 40038e: 07 fe br %r14 # return +} + +Comments on the function main +----------------------------- +1) The compiler did this function optimally ( 8-) ) + +Literal pool for main. +400390: ff ff ff ec .long 0xffffffec +main(int argc,char *argv[]) +{ # Function prologue below + 400394: 90 bf f0 2c stm %r11,%r15,44(%r15) # Save necessary registers + 400398: 18 0f lr %r0,%r15 # copy stack pointer to r0 + 40039a: a7 fa ff a0 ahi %r15,-96 # Make area for callee saving + 40039e: 0d d0 basr %r13,%r0 # Set up r13 to point to + 4003a0: a7 da ff f0 ahi %r13,-16 # literal pool + 4003a4: 50 00 f0 00 st %r0,0(%r15) # Save backchain + + return(test(5)); # Main Program Below + 4003a8: 58 e0 d0 00 l %r14,0(%r13) # load relative address of test from + # literal pool + 4003ac: a7 28 00 05 lhi %r2,5 # Set first parameter to 5 + 4003b0: 4d ee d0 00 bas %r14,0(%r14,%r13) # jump to test setting r14 as return + # address using branch & save instruction. + + # Function Epilogue below + 4003b4: 98 bf f0 8c lm %r11,%r15,140(%r15)# Restore necessary registers. + 4003b8: 07 fe br %r14 # return to do program exit +} + + +Compiler updates +---------------- + +main(int argc,char *argv[]) +{ + 4004fc: 90 7f f0 1c stm %r7,%r15,28(%r15) + 400500: a7 d5 00 04 bras %r13,400508 <main+0xc> + 400504: 00 40 04 f4 .long 0x004004f4 + # compiler now puts constant pool in code to so it saves an instruction + 400508: 18 0f lr %r0,%r15 + 40050a: a7 fa ff a0 ahi %r15,-96 + 40050e: 50 00 f0 00 st %r0,0(%r15) + return(test(5)); + 400512: 58 10 d0 00 l %r1,0(%r13) + 400516: a7 28 00 05 lhi %r2,5 + 40051a: 0d e1 basr %r14,%r1 + # compiler adds 1 extra instruction to epilogue this is done to + # avoid processor pipeline stalls owing to data dependencies on g5 & + # above as register 14 in the old code was needed directly after being loaded + # by the lm %r11,%r15,140(%r15) for the br %14. + 40051c: 58 40 f0 98 l %r4,152(%r15) + 400520: 98 7f f0 7c lm %r7,%r15,124(%r15) + 400524: 07 f4 br %r4 +} + + +Hartmut ( our compiler developer ) also has been threatening to take out the +stack backchain in optimised code as this also causes pipeline stalls, you +have been warned. + +64 bit z/Architecture code disassembly +-------------------------------------- + +If you understand the stuff above you'll understand the stuff +below too so I'll avoid repeating myself & just say that +some of the instructions have g's on the end of them to indicate +they are 64 bit & the stack offsets are a bigger, +the only other difference you'll find between 32 & 64 bit is that +we now use f4 & f6 for floating point arguments on 64 bit. +00000000800005b0 <test>: +int test(int b) +{ + return(5+b); + 800005b0: a7 2a 00 05 ahi %r2,5 + 800005b4: b9 14 00 22 lgfr %r2,%r2 # downcast to integer + 800005b8: 07 fe br %r14 + 800005ba: 07 07 bcr 0,%r7 + + +} + +00000000800005bc <main>: +main(int argc,char *argv[]) +{ + 800005bc: eb bf f0 58 00 24 stmg %r11,%r15,88(%r15) + 800005c2: b9 04 00 1f lgr %r1,%r15 + 800005c6: a7 fb ff 60 aghi %r15,-160 + 800005ca: e3 10 f0 00 00 24 stg %r1,0(%r15) + return(test(5)); + 800005d0: a7 29 00 05 lghi %r2,5 + # brasl allows jumps > 64k & is overkill here bras would do fune + 800005d4: c0 e5 ff ff ff ee brasl %r14,800005b0 <test> + 800005da: e3 40 f1 10 00 04 lg %r4,272(%r15) + 800005e0: eb bf f0 f8 00 04 lmg %r11,%r15,248(%r15) + 800005e6: 07 f4 br %r4 +} + + + +Compiling programs for debugging on Linux for s/390 & z/Architecture +==================================================================== +-gdwarf-2 now works it should be considered the default debugging +format for s/390 & z/Architecture as it is more reliable for debugging +shared libraries, normal -g debugging works much better now +Thanks to the IBM java compiler developers bug reports. + +This is typically done adding/appending the flags -g or -gdwarf-2 to the +CFLAGS & LDFLAGS variables Makefile of the program concerned. + +If using gdb & you would like accurate displays of registers & + stack traces compile without optimisation i.e make sure +that there is no -O2 or similar on the CFLAGS line of the Makefile & +the emitted gcc commands, obviously this will produce worse code +( not advisable for shipment ) but it is an aid to the debugging process. + +This aids debugging because the compiler will copy parameters passed in +in registers onto the stack so backtracing & looking at passed in +parameters will work, however some larger programs which use inline functions +will not compile without optimisation. + +Debugging with optimisation has since much improved after fixing +some bugs, please make sure you are using gdb-5.0 or later developed +after Nov'2000. + +Figuring out gcc compile errors +=============================== +If you are getting a lot of syntax errors compiling a program & the problem +isn't blatantly obvious from the source. +It often helps to just preprocess the file, this is done with the -E +option in gcc. +What this does is that it runs through the very first phase of compilation +( compilation in gcc is done in several stages & gcc calls many programs to +achieve its end result ) with the -E option gcc just calls the gcc preprocessor (cpp). +The c preprocessor does the following, it joins all the files #included together +recursively ( #include files can #include other files ) & also the c file you wish to compile. +It puts a fully qualified path of the #included files in a comment & it +does macro expansion. +This is useful for debugging because +1) You can double check whether the files you expect to be included are the ones +that are being included ( e.g. double check that you aren't going to the i386 asm directory ). +2) Check that macro definitions aren't clashing with typedefs, +3) Check that definitons aren't being used before they are being included. +4) Helps put the line emitting the error under the microscope if it contains macros. + +For convenience the Linux kernel's makefile will do preprocessing automatically for you +by suffixing the file you want built with .i ( instead of .o ) + +e.g. +from the linux directory type +make arch/s390/kernel/signal.i +this will build + +s390-gcc -D__KERNEL__ -I/home1/barrow/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer +-fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce -E arch/s390/kernel/signal.c +> arch/s390/kernel/signal.i + +Now look at signal.i you should see something like. + + +# 1 "/home1/barrow/linux/include/asm/types.h" 1 +typedef unsigned short umode_t; +typedef __signed__ char __s8; +typedef unsigned char __u8; +typedef __signed__ short __s16; +typedef unsigned short __u16; + +If instead you are getting errors further down e.g. +unknown instruction:2515 "move.l" or better still unknown instruction:2515 +"Fixme not implemented yet, call Martin" you are probably are attempting to compile some code +meant for another architecture or code that is simply not implemented, with a fixme statement +stuck into the inline assembly code so that the author of the file now knows he has work to do. +To look at the assembly emitted by gcc just before it is about to call gas ( the gnu assembler ) +use the -S option. +Again for your convenience the Linux kernel's Makefile will hold your hand & +do all this donkey work for you also by building the file with the .s suffix. +e.g. +from the Linux directory type +make arch/s390/kernel/signal.s + +s390-gcc -D__KERNEL__ -I/home1/barrow/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer +-fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce -S arch/s390/kernel/signal.c +-o arch/s390/kernel/signal.s + + +This will output something like, ( please note the constant pool & the useful comments +in the prologue to give you a hand at interpreting it ). + +.LC54: + .string "misaligned (__u16 *) in __xchg\n" +.LC57: + .string "misaligned (__u32 *) in __xchg\n" +.L$PG1: # Pool sys_sigsuspend +.LC192: + .long -262401 +.LC193: + .long -1 +.LC194: + .long schedule-.L$PG1 +.LC195: + .long do_signal-.L$PG1 + .align 4 +.globl sys_sigsuspend + .type sys_sigsuspend,@function +sys_sigsuspend: +# leaf function 0 +# automatics 16 +# outgoing args 0 +# need frame pointer 0 +# call alloca 0 +# has varargs 0 +# incoming args (stack) 0 +# function length 168 + STM 8,15,32(15) + LR 0,15 + AHI 15,-112 + BASR 13,0 +.L$CO1: AHI 13,.L$PG1-.L$CO1 + ST 0,0(15) + LR 8,2 + N 5,.LC192-.L$PG1(13) + +Adding -g to the above output makes the output even more useful +e.g. typing +make CC:="s390-gcc -g" kernel/sched.s + +which compiles. +s390-gcc -g -D__KERNEL__ -I/home/barrow/linux-2.3/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -fno-strength-reduce -S kernel/sched.c -o kernel/sched.s + +also outputs stabs ( debugger ) info, from this info you can find out the +offsets & sizes of various elements in structures. +e.g. the stab for the structure +struct rlimit { + unsigned long rlim_cur; + unsigned long rlim_max; +}; +is +.stabs "rlimit:T(151,2)=s8rlim_cur:(0,5),0,32;rlim_max:(0,5),32,32;;",128,0,0,0 +from this stab you can see that +rlimit_cur starts at bit offset 0 & is 32 bits in size +rlimit_max starts at bit offset 32 & is 32 bits in size. + + +Debugging Tools: +================ + +objdump +======= +This is a tool with many options the most useful being ( if compiled with -g). +objdump --source <victim program or object file> > <victims debug listing > + + +The whole kernel can be compiled like this ( Doing this will make a 17MB kernel +& a 200 MB listing ) however you have to strip it before building the image +using the strip command to make it a more reasonable size to boot it. + +A source/assembly mixed dump of the kernel can be done with the line +objdump --source vmlinux > vmlinux.lst +Also if the file isn't compiled -g this will output as much debugging information +as it can ( e.g. function names ), however, this is very slow as it spends lots +of time searching for debugging info, the following self explanitory line should be used +instead if the code isn't compiled -g. +objdump --disassemble-all --syms vmlinux > vmlinux.lst +as it is much faster + +As hard drive space is valuble most of us use the following approach. +1) Look at the emitted psw on the console to find the crash address in the kernel. +2) Look at the file System.map ( in the linux directory ) produced when building +the kernel to find the closest address less than the current PSW to find the +offending function. +3) use grep or similar to search the source tree looking for the source file + with this function if you don't know where it is. +4) rebuild this object file with -g on, as an example suppose the file was +( /arch/s390/kernel/signal.o ) +5) Assuming the file with the erroneous function is signal.c Move to the base of the +Linux source tree. +6) rm /arch/s390/kernel/signal.o +7) make /arch/s390/kernel/signal.o +8) watch the gcc command line emitted +9) type it in again or alernatively cut & paste it on the console adding the -g option. +10) objdump --source arch/s390/kernel/signal.o > signal.lst +This will output the source & the assembly intermixed, as the snippet below shows +This will unfortunately output addresses which aren't the same +as the kernel ones you should be able to get around the mental arithmetic +by playing with the --adjust-vma parameter to objdump. + + + + +extern inline void spin_lock(spinlock_t *lp) +{ + a0: 18 34 lr %r3,%r4 + a2: a7 3a 03 bc ahi %r3,956 + __asm__ __volatile(" lhi 1,-1\n" + a6: a7 18 ff ff lhi %r1,-1 + aa: 1f 00 slr %r0,%r0 + ac: ba 01 30 00 cs %r0,%r1,0(%r3) + b0: a7 44 ff fd jm aa <sys_sigsuspend+0x2e> + saveset = current->blocked; + b4: d2 07 f0 68 mvc 104(8,%r15),972(%r4) + b8: 43 cc + return (set->sig[0] & mask) != 0; +} + +6) If debugging under VM go down to that section in the document for more info. + + +I now have a tool which takes the pain out of --adjust-vma +& you are able to do something like +make /arch/s390/kernel/traps.lst +& it automatically generates the correctly relocated entries for +the text segment in traps.lst. +This tool is now standard in linux distro's in scripts/makelst + +strace: +------- +Q. What is it ? +A. It is a tool for intercepting calls to the kernel & logging them +to a file & on the screen. + +Q. What use is it ? +A. You can used it to find out what files a particular program opens. + + + +Example 1 +--------- +If you wanted to know does ping work but didn't have the source +strace ping -c 1 127.0.0.1 +& then look at the man pages for each of the syscalls below, +( In fact this is sometimes easier than looking at some spagetti +source which conditionally compiles for several architectures ) +Not everything that it throws out needs to make sense immeadiately + +Just looking quickly you can see that it is making up a RAW socket +for the ICMP protocol. +Doing an alarm(10) for a 10 second timeout +& doing a gettimeofday call before & after each read to see +how long the replies took, & writing some text to stdout so the user +has an idea what is going on. + +socket(PF_INET, SOCK_RAW, IPPROTO_ICMP) = 3 +getuid() = 0 +setuid(0) = 0 +stat("/usr/share/locale/C/libc.cat", 0xbffff134) = -1 ENOENT (No such file or directory) +stat("/usr/share/locale/libc/C", 0xbffff134) = -1 ENOENT (No such file or directory) +stat("/usr/local/share/locale/C/libc.cat", 0xbffff134) = -1 ENOENT (No such file or directory) +getpid() = 353 +setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0 +setsockopt(3, SOL_SOCKET, SO_RCVBUF, [49152], 4) = 0 +fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(3, 1), ...}) = 0 +mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40008000 +ioctl(1, TCGETS, {B9600 opost isig icanon echo ...}) = 0 +write(1, "PING 127.0.0.1 (127.0.0.1): 56 d"..., 42PING 127.0.0.1 (127.0.0.1): 56 data bytes +) = 42 +sigaction(SIGINT, {0x8049ba0, [], SA_RESTART}, {SIG_DFL}) = 0 +sigaction(SIGALRM, {0x8049600, [], SA_RESTART}, {SIG_DFL}) = 0 +gettimeofday({948904719, 138951}, NULL) = 0 +sendto(3, "\10\0D\201a\1\0\0\17#\2178\307\36"..., 64, 0, {sin_family=AF_INET, +sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 64 +sigaction(SIGALRM, {0x8049600, [], SA_RESTART}, {0x8049600, [], SA_RESTART}) = 0 +sigaction(SIGALRM, {0x8049ba0, [], SA_RESTART}, {0x8049600, [], SA_RESTART}) = 0 +alarm(10) = 0 +recvfrom(3, "E\0\0T\0005\0\0@\1|r\177\0\0\1\177"..., 192, 0, +{sin_family=AF_INET, sin_port=htons(50882), sin_addr=inet_addr("127.0.0.1")}, [16]) = 84 +gettimeofday({948904719, 160224}, NULL) = 0 +recvfrom(3, "E\0\0T\0006\0\0\377\1\275p\177\0"..., 192, 0, +{sin_family=AF_INET, sin_port=htons(50882), sin_addr=inet_addr("127.0.0.1")}, [16]) = 84 +gettimeofday({948904719, 166952}, NULL) = 0 +write(1, "64 bytes from 127.0.0.1: icmp_se"..., +5764 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=28.0 ms + +Example 2 +--------- +strace passwd 2>&1 | grep open +produces the following output +open("/etc/ld.so.cache", O_RDONLY) = 3 +open("/opt/kde/lib/libc.so.5", O_RDONLY) = -1 ENOENT (No such file or directory) +open("/lib/libc.so.5", O_RDONLY) = 3 +open("/dev", O_RDONLY) = 3 +open("/var/run/utmp", O_RDONLY) = 3 +open("/etc/passwd", O_RDONLY) = 3 +open("/etc/shadow", O_RDONLY) = 3 +open("/etc/login.defs", O_RDONLY) = 4 +open("/dev/tty", O_RDONLY) = 4 + +The 2>&1 is done to redirect stderr to stdout & grep is then filtering this input +through the pipe for each line containing the string open. + + +Example 3 +--------- +Getting sophistocated +telnetd crashes on & I don't know why +Steps +----- +1) Replace the following line in /etc/inetd.conf +telnet stream tcp nowait root /usr/sbin/in.telnetd -h +with +telnet stream tcp nowait root /blah + +2) Create the file /blah with the following contents to start tracing telnetd +#!/bin/bash +/usr/bin/strace -o/t1 -f /usr/sbin/in.telnetd -h +3) chmod 700 /blah to make it executable only to root +4) +killall -HUP inetd +or ps aux | grep inetd +get inetd's process id +& kill -HUP inetd to restart it. + +Important options +----------------- +-o is used to tell strace to output to a file in our case t1 in the root directory +-f is to follow children i.e. +e.g in our case above telnetd will start the login process & subsequently a shell like bash. +You will be able to tell which is which from the process ID's listed on the left hand side +of the strace output. +-p<pid> will tell strace to attach to a running process, yup this can be done provided + it isn't being traced or debugged already & you have enough privileges, +the reason 2 processes cannot trace or debug the same program is that strace +becomes the parent process of the one being debugged & processes ( unlike people ) +can have only one parent. + + +However the file /t1 will get big quite quickly +to test it telnet 127.0.0.1 + +now look at what files in.telnetd execve'd +413 execve("/usr/sbin/in.telnetd", ["/usr/sbin/in.telnetd", "-h"], [/* 17 vars */]) = 0 +414 execve("/bin/login", ["/bin/login", "-h", "localhost", "-p"], [/* 2 vars */]) = 0 + +Whey it worked!. + + +Other hints: +------------ +If the program is not very interactive ( i.e. not much keyboard input ) +& is crashing in one architecture but not in another you can do +an strace of both programs under as identical a scenario as you can +on both architectures outputting to a file then. +do a diff of the two traces using the diff program +i.e. +diff output1 output2 +& maybe you'll be able to see where the call paths differed, this +is possibly near the cause of the crash. + +More info +--------- +Look at man pages for strace & the various syscalls +e.g. man strace, man alarm, man socket. + + +Performance Debugging +===================== +gcc is capible of compiling in profiling code just add the -p option +to the CFLAGS, this obviously affects program size & performance. +This can be used by the gprof gnu profiling tool or the +gcov the gnu code coverage tool ( code coverage is a means of testing +code quality by checking if all the code in an executable in exercised by +a tester ). + + +Using top to find out where processes are sleeping in the kernel +---------------------------------------------------------------- +To do this copy the System.map from the root directory where +the linux kernel was built to the /boot directory on your +linux machine. +Start top +Now type fU<return> +You should see a new field called WCHAN which +tells you where each process is sleeping here is a typical output. + + 6:59pm up 41 min, 1 user, load average: 0.00, 0.00, 0.00 +28 processes: 27 sleeping, 1 running, 0 zombie, 0 stopped +CPU states: 0.0% user, 0.1% system, 0.0% nice, 99.8% idle +Mem: 254900K av, 45976K used, 208924K free, 0K shrd, 28636K buff +Swap: 0K av, 0K used, 0K free 8620K cached + + PID USER PRI NI SIZE RSS SHARE WCHAN STAT LIB %CPU %MEM TIME COMMAND + 750 root 12 0 848 848 700 do_select S 0 0.1 0.3 0:00 in.telnetd + 767 root 16 0 1140 1140 964 R 0 0.1 0.4 0:00 top + 1 root 8 0 212 212 180 do_select S 0 0.0 0.0 0:00 init + 2 root 9 0 0 0 0 down_inte SW 0 0.0 0.0 0:00 kmcheck + +The time command +---------------- +Another related command is the time command which gives you an indication +of where a process is spending the majority of its time. +e.g. +time ping -c 5 nc +outputs +real 0m4.054s +user 0m0.010s +sys 0m0.010s + +Debugging under VM +================== + +Notes +----- +Addresses & values in the VM debugger are always hex never decimal +Address ranges are of the format <HexValue1>-<HexValue2> or <HexValue1>.<HexValue2> +e.g. The address range 0x2000 to 0x3000 can be described described as +2000-3000 or 2000.1000 + +The VM Debugger is case insensitive. + +VM's strengths are usually other debuggers weaknesses you can get at any resource +no matter how sensitive e.g. memory management resources,change address translation +in the PSW. For kernel hacking you will reap dividends if you get good at it. + +The VM Debugger displays operators but not operands, probably because some +of it was written when memory was expensive & the programmer was probably proud that +it fitted into 2k of memory & the programmers & didn't want to shock hardcore VM'ers by +changing the interface :-), also the debugger displays useful information on the same line & +the author of the code probably felt that it was a good idea not to go over +the 80 columns on the screen. + +As some of you are probably in a panic now this isn't as unintuitive as it may seem +as the 390 instructions are easy to decode mentally & you can make a good guess at a lot +of them as all the operands are nibble ( half byte aligned ) & if you have an objdump listing +also it is quite easy to follow, if you don't have an objdump listing keep a copy of +the s/390 Reference Summary & look at between pages 2 & 7 or alternatively the +s/390 principles of operation. +e.g. even I can guess that +0001AFF8' LR 180F CC 0 +is a ( load register ) lr r0,r15 + +Also it is very easy to tell the length of a 390 instruction from the 2 most significant +bits in the instruction ( not that this info is really useful except if you are trying to +make sense of a hexdump of code ). +Here is a table +Bits Instruction Length +------------------------------------------ +00 2 Bytes +01 4 Bytes +10 4 Bytes +11 6 Bytes + + + + +The debugger also displays other useful info on the same line such as the +addresses being operated on destination addresses of branches & condition codes. +e.g. +00019736' AHI A7DAFF0E CC 1 +000198BA' BRC A7840004 -> 000198C2' CC 0 +000198CE' STM 900EF068 >> 0FA95E78 CC 2 + + + +Useful VM debugger commands +--------------------------- + +I suppose I'd better mention this before I start +to list the current active traces do +Q TR +there can be a maximum of 255 of these per set +( more about trace sets later ). +To stop traces issue a +TR END. +To delete a particular breakpoint issue +TR DEL <breakpoint number> + +The PA1 key drops to CP mode so you can issue debugger commands, +Doing alt c (on my 3270 console at least ) clears the screen. +hitting b <enter> comes back to the running operating system +from cp mode ( in our case linux ). +It is typically useful to add shortcuts to your profile.exec file +if you have one ( this is roughly equivalent to autoexec.bat in DOS ). +file here are a few from mine. +/* this gives me command history on issuing f12 */ +set pf12 retrieve +/* this continues */ +set pf8 imm b +/* goes to trace set a */ +set pf1 imm tr goto a +/* goes to trace set b */ +set pf2 imm tr goto b +/* goes to trace set c */ +set pf3 imm tr goto c + + + +Instruction Tracing +------------------- +Setting a simple breakpoint +TR I PSWA <address> +To debug a particular function try +TR I R <function address range> +TR I on its own will single step. +TR I DATA <MNEMONIC> <OPTIONAL RANGE> will trace for particular mnemonics +e.g. +TR I DATA 4D R 0197BC.4000 +will trace for BAS'es ( opcode 4D ) in the range 0197BC.4000 +if you were inclined you could add traces for all branch instructions & +suffix them with the run prefix so you would have a backtrace on screen +when a program crashes. +TR BR <INTO OR FROM> will trace branches into or out of an address. +e.g. +TR BR INTO 0 is often quite useful if a program is getting awkward & deciding +to branch to 0 & crashing as this will stop at the address before in jumps to 0. +TR I R <address range> RUN cmd d g +single steps a range of addresses but stays running & +displays the gprs on each step. + + + +Displaying & modifying Registers +-------------------------------- +D G will display all the gprs +Adding a extra G to all the commands is necessary to access the full 64 bit +content in VM on z/Architecture obviously this isn't required for access registers +as these are still 32 bit. +e.g. DGG instead of DG +D X will display all the control registers +D AR will display all the access registers +D AR4-7 will display access registers 4 to 7 +CPU ALL D G will display the GRPS of all CPUS in the configuration +D PSW will display the current PSW +st PSW 2000 will put the value 2000 into the PSW & +cause crash your machine. +D PREFIX displays the prefix offset + + +Displaying Memory +----------------- +To display memory mapped using the current PSW's mapping try +D <range> +To make VM display a message each time it hits a particular address & continue try +D I<range> will disassemble/display a range of instructions. +ST addr 32 bit word will store a 32 bit aligned address +D T<range> will display the EBCDIC in an address ( if you are that way inclined ) +D R<range> will display real addresses ( without DAT ) but with prefixing. +There are other complex options to display if you need to get at say home space +but are in primary space the easiest thing to do is to temporarily +modify the PSW to the other addressing mode, display the stuff & then +restore it. + + + +Hints +----- +If you want to issue a debugger command without halting your virtual machine with the +PA1 key try prefixing the command with #CP e.g. +#cp tr i pswa 2000 +also suffixing most debugger commands with RUN will cause them not +to stop just display the mnemonic at the current instruction on the console. +If you have several breakpoints you want to put into your program & +you get fed up of cross referencing with System.map +you can do the following trick for several symbols. +grep do_signal System.map +which emits the following among other things +0001f4e0 T do_signal +now you can do + +TR I PSWA 0001f4e0 cmd msg * do_signal +This sends a message to your own console each time do_signal is entered. +( As an aside I wrote a perl script once which automatically generated a REXX +script with breakpoints on every kernel procedure, this isn't a good idea +because there are thousands of these routines & VM can only set 255 breakpoints +at a time so you nearly had to spend as long pruning the file down as you would +entering the msg's by hand ),however, the trick might be useful for a single object file. +On linux'es 3270 emulator x3270 there is a very useful option under the file ment +Save Screens In File this is very good of keeping a copy of traces. + +From CMS help <command name> will give you online help on a particular command. +e.g. +HELP DISPLAY + +Also CP has a file called profile.exec which automatically gets called +on startup of CMS ( like autoexec.bat ), keeping on a DOS analogy session +CP has a feature similar to doskey, it may be useful for you to +use profile.exec to define some keystrokes. +e.g. +SET PF9 IMM B +This does a single step in VM on pressing F8. +SET PF10 ^ +This sets up the ^ key. +which can be used for ^c (ctrl-c),^z (ctrl-z) which can't be typed directly into some 3270 consoles. +SET PF11 ^- +This types the starting keystrokes for a sysrq see SysRq below. +SET PF12 RETRIEVE +This retrieves command history on pressing F12. + + +Sometimes in VM the display is set up to scroll automatically this +can be very annoying if there are messages you wish to look at +to stop this do +TERM MORE 255 255 +This will nearly stop automatic screen updates, however it will +cause a denial of service if lots of messages go to the 3270 console, +so it would be foolish to use this as the default on a production machine. + + +Tracing particular processes +---------------------------- +The kernel's text segment is intentionally at an address in memory that it will +very seldom collide with text segments of user programs ( thanks Martin ), +this simplifies debugging the kernel. +However it is quite common for user processes to have addresses which collide +this can make debugging a particular process under VM painful under normal +circumstances as the process may change when doing a +TR I R <address range>. +Thankfully after reading VM's online help I figured out how to debug +I particular process. + +Your first problem is to find the STD ( segment table designation ) +of the program you wish to debug. +There are several ways you can do this here are a few +1) objdump --syms <program to be debugged> | grep main +To get the address of main in the program. +tr i pswa <address of main> +Start the program, if VM drops to CP on what looks like the entry +point of the main function this is most likely the process you wish to debug. +Now do a D X13 or D XG13 on z/Architecture. +On 31 bit the STD is bits 1-19 ( the STO segment table origin ) +& 25-31 ( the STL segment table length ) of CR13. +now type +TR I R STD <CR13's value> 0.7fffffff +e.g. +TR I R STD 8F32E1FF 0.7fffffff +Another very useful variation is +TR STORE INTO STD <CR13's value> <address range> +for finding out when a particular variable changes. + +An alternative way of finding the STD of a currently running process +is to do the following, ( this method is more complex but +could be quite convient if you aren't updating the kernel much & +so your kernel structures will stay constant for a reasonable period of +time ). + +grep task /proc/<pid>/status +from this you should see something like +task: 0f160000 ksp: 0f161de8 pt_regs: 0f161f68 +This now gives you a pointer to the task structure. +Now make CC:="s390-gcc -g" kernel/sched.s +To get the task_struct stabinfo. +( task_struct is defined in include/linux/sched.h ). +Now we want to look at +task->active_mm->pgd +on my machine the active_mm in the task structure stab is +active_mm:(4,12),672,32 +its offset is 672/8=84=0x54 +the pgd member in the mm_struct stab is +pgd:(4,6)=*(29,5),96,32 +so its offset is 96/8=12=0xc + +so we'll +hexdump -s 0xf160054 /dev/mem | more +i.e. task_struct+active_mm offset +to look at the active_mm member +f160054 0fee cc60 0019 e334 0000 0000 0000 0011 +hexdump -s 0x0feecc6c /dev/mem | more +i.e. active_mm+pgd offset +feecc6c 0f2c 0000 0000 0001 0000 0001 0000 0010 +we get something like +now do +TR I R STD <pgd|0x7f> 0.7fffffff +i.e. the 0x7f is added because the pgd only +gives the page table origin & we need to set the low bits +to the maximum possible segment table length. +TR I R STD 0f2c007f 0.7fffffff +on z/Architecture you'll probably need to do +TR I R STD <pgd|0x7> 0.ffffffffffffffff +to set the TableType to 0x1 & the Table length to 3. + + + +Tracing Program Exceptions +-------------------------- +If you get a crash which says something like +illegal operation or specification exception followed by a register dump +You can restart linux & trace these using the tr prog <range or value> trace option. + + + +The most common ones you will normally be tracing for is +1=operation exception +2=privileged operation exception +4=protection exception +5=addressing exception +6=specification exception +10=segment translation exception +11=page translation exception + +The full list of these is on page 22 of the current s/390 Reference Summary. +e.g. +tr prog 10 will trace segment translation exceptions. +tr prog on its own will trace all program interruption codes. + +Trace Sets +---------- +On starting VM you are initially in the INITIAL trace set. +You can do a Q TR to verify this. +If you have a complex tracing situation where you wish to wait for instance +till a driver is open before you start tracing IO, but know in your +heart that you are going to have to make several runs through the code till you +have a clue whats going on. + +What you can do is +TR I PSWA <Driver open address> +hit b to continue till breakpoint +reach the breakpoint +now do your +TR GOTO B +TR IO 7c08-7c09 inst int run +or whatever the IO channels you wish to trace are & hit b + +To got back to the initial trace set do +TR GOTO INITIAL +& the TR I PSWA <Driver open address> will be the only active breakpoint again. + + +Tracing linux syscalls under VM +------------------------------- +Syscalls are implemented on Linux for S390 by the Supervisor call instruction (SVC) there 256 +possibilities of these as the instruction is made up of a 0xA opcode & the second byte being +the syscall number. They are traced using the simple command. +TR SVC <Optional value or range> +the syscalls are defined in linux/include/asm-s390/unistd.h +e.g. to trace all file opens just do +TR SVC 5 ( as this is the syscall number of open ) + + +SMP Specific commands +--------------------- +To find out how many cpus you have +Q CPUS displays all the CPU's available to your virtual machine +To find the cpu that the current cpu VM debugger commands are being directed at do +Q CPU to change the current cpu cpu VM debugger commands are being directed at do +CPU <desired cpu no> + +On a SMP guest issue a command to all CPUs try prefixing the command with cpu all. +To issue a command to a particular cpu try cpu <cpu number> e.g. +CPU 01 TR I R 2000.3000 +If you are running on a guest with several cpus & you have a IO related problem +& cannot follow the flow of code but you know it isnt smp related. +from the bash prompt issue +shutdown -h now or halt. +do a Q CPUS to find out how many cpus you have +detach each one of them from cp except cpu 0 +by issuing a +DETACH CPU 01-(number of cpus in configuration) +& boot linux again. +TR SIGP will trace inter processor signal processor instructions. +DEFINE CPU 01-(number in configuration) +will get your guests cpus back. + + +Help for displaying ascii textstrings +------------------------------------- +On the very latest VM Nucleus'es VM can now display ascii +( thanks Neale for the hint ) by doing +D TX<lowaddr>.<len> +e.g. +D TX0.100 + +Alternatively +============= +Under older VM debuggers ( I love EBDIC too ) you can use this little program I wrote which +will convert a command line of hex digits to ascii text which can be compiled under linux & +you can copy the hex digits from your x3270 terminal to your xterm if you are debugging +from a linuxbox. + +This is quite useful when looking at a parameter passed in as a text string +under VM ( unless you are good at decoding ASCII in your head ). + +e.g. consider tracing an open syscall +TR SVC 5 +We have stopped at a breakpoint +000151B0' SVC 0A05 -> 0001909A' CC 0 + +D 20.8 to check the SVC old psw in the prefix area & see was it from userspace +( for the layout of the prefix area consult P18 of the s/390 390 Reference Summary +if you have it available ). +V00000020 070C2000 800151B2 +The problem state bit wasn't set & it's also too early in the boot sequence +for it to be a userspace SVC if it was we would have to temporarily switch the +psw to user space addressing so we could get at the first parameter of the open in +gpr2. +Next do a +D G2 +GPR 2 = 00014CB4 +Now display what gpr2 is pointing to +D 00014CB4.20 +V00014CB4 2F646576 2F636F6E 736F6C65 00001BF5 +V00014CC4 FC00014C B4001001 E0001000 B8070707 +Now copy the text till the first 00 hex ( which is the end of the string +to an xterm & do hex2ascii on it. +hex2ascii 2F646576 2F636F6E 736F6C65 00 +outputs +Decoded Hex:=/ d e v / c o n s o l e 0x00 +We were opening the console device, + +You can compile the code below yourself for practice :-), +/* + * hex2ascii.c + * a useful little tool for converting a hexadecimal command line to ascii + * + * Author(s): Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com) + * (C) 2000 IBM Deutschland Entwicklung GmbH, IBM Corporation. + */ +#include <stdio.h> + +int main(int argc,char *argv[]) +{ + int cnt1,cnt2,len,toggle=0; + int startcnt=1; + unsigned char c,hex; + + if(argc>1&&(strcmp(argv[1],"-a")==0)) + startcnt=2; + printf("Decoded Hex:="); + for(cnt1=startcnt;cnt1<argc;cnt1++) + { + len=strlen(argv[cnt1]); + for(cnt2=0;cnt2<len;cnt2++) + { + c=argv[cnt1][cnt2]; + if(c>='0'&&c<='9') + c=c-'0'; + if(c>='A'&&c<='F') + c=c-'A'+10; + if(c>='a'&&c<='f') + c=c-'a'+10; + switch(toggle) + { + case 0: + hex=c<<4; + toggle=1; + break; + case 1: + hex+=c; + if(hex<32||hex>127) + { + if(startcnt==1) + printf("0x%02X ",(int)hex); + else + printf("."); + } + else + { + printf("%c",hex); + if(startcnt==1) + printf(" "); + } + toggle=0; + break; + } + } + } + printf("\n"); +} + + + + +Stack tracing under VM +---------------------- +A basic backtrace +----------------- + +Here are the tricks I use 9 out of 10 times it works pretty well, + +When your backchain reaches a dead end +-------------------------------------- +This can happen when an exception happens in the kernel & the kernel is entered twice +if you reach the NULL pointer at the end of the back chain you should be +able to sniff further back if you follow the following tricks. +1) A kernel address should be easy to recognise since it is in +primary space & the problem state bit isn't set & also +The Hi bit of the address is set. +2) Another backchain should also be easy to recognise since it is an +address pointing to another address approximately 100 bytes or 0x70 hex +behind the current stackpointer. + + +Here is some practice. +boot the kernel & hit PA1 at some random time +d g to display the gprs, this should display something like +GPR 0 = 00000001 00156018 0014359C 00000000 +GPR 4 = 00000001 001B8888 000003E0 00000000 +GPR 8 = 00100080 00100084 00000000 000FE000 +GPR 12 = 00010400 8001B2DC 8001B36A 000FFED8 +Note that GPR14 is a return address but as we are real men we are going to +trace the stack. +display 0x40 bytes after the stack pointer. + +V000FFED8 000FFF38 8001B838 80014C8E 000FFF38 +V000FFEE8 00000000 00000000 000003E0 00000000 +V000FFEF8 00100080 00100084 00000000 000FE000 +V000FFF08 00010400 8001B2DC 8001B36A 000FFED8 + + +Ah now look at whats in sp+56 (sp+0x38) this is 8001B36A our saved r14 if +you look above at our stackframe & also agrees with GPR14. + +now backchain +d 000FFF38.40 +we now are taking the contents of SP to get our first backchain. + +V000FFF38 000FFFA0 00000000 00014995 00147094 +V000FFF48 00147090 001470A0 000003E0 00000000 +V000FFF58 00100080 00100084 00000000 001BF1D0 +V000FFF68 00010400 800149BA 80014CA6 000FFF38 + +This displays a 2nd return address of 80014CA6 + +now do d 000FFFA0.40 for our 3rd backchain + +V000FFFA0 04B52002 0001107F 00000000 00000000 +V000FFFB0 00000000 00000000 FF000000 0001107F +V000FFFC0 00000000 00000000 00000000 00000000 +V000FFFD0 00010400 80010802 8001085A 000FFFA0 + + +our 3rd return address is 8001085A + +as the 04B52002 looks suspiciously like rubbish it is fair to assume that the kernel entry routines +for the sake of optimisation dont set up a backchain. + +now look at System.map to see if the addresses make any sense. + +grep -i 0001b3 System.map +outputs among other things +0001b304 T cpu_idle +so 8001B36A +is cpu_idle+0x66 ( quiet the cpu is asleep, don't wake it ) + + +grep -i 00014 System.map +produces among other things +00014a78 T start_kernel +so 0014CA6 is start_kernel+some hex number I can't add in my head. + +grep -i 00108 System.map +this produces +00010800 T _stext +so 8001085A is _stext+0x5a + +Congrats you've done your first backchain. + + + +s/390 & z/Architecture IO Overview +================================== + +I am not going to give a course in 390 IO architecture as this would take me quite a +while & I'm no expert. Instead I'll give a 390 IO architecture summary for Dummies if you have +the s/390 principles of operation available read this instead. If nothing else you may find a few +useful keywords in here & be able to use them on a web search engine like altavista to find +more useful information. + +Unlike other bus architectures modern 390 systems do their IO using mostly +fibre optics & devices such as tapes & disks can be shared between several mainframes, +also S390 can support upto 65536 devices while a high end PC based system might be choking +with around 64. Here is some of the common IO terminology + +Subchannel: +This is the logical number most IO commands use to talk to an IO device there can be upto +0x10000 (65536) of these in a configuration typically there is a few hundred. Under VM +for simplicity they are allocated contiguously, however on the native hardware they are not +they typically stay consistent between boots provided no new hardware is inserted or removed. +Under Linux for 390 we use these as IRQ's & also when issuing an IO command (CLEAR SUBCHANNEL, +HALT SUBCHANNEL,MODIFY SUBCHANNEL,RESUME SUBCHANNEL,START SUBCHANNEL,STORE SUBCHANNEL & +TEST SUBCHANNEL ) we use this as the ID of the device we wish to talk to, the most +important of these instructions are START SUBCHANNEL ( to start IO ), TEST SUBCHANNEL ( to check +whether the IO completed successfully ), & HALT SUBCHANNEL ( to kill IO ), a subchannel +can have up to 8 channel paths to a device this offers redunancy if one is not available. + + +Device Number: +This number remains static & Is closely tied to the hardware, there are 65536 of these +also they are made up of a CHPID ( Channel Path ID, the most significant 8 bits ) +& another lsb 8 bits. These remain static even if more devices are inserted or removed +from the hardware, there is a 1 to 1 mapping between Subchannels & Device Numbers provided +devices arent inserted or removed. + +Channel Control Words: +CCWS are linked lists of instructions initially pointed to by an operation request block (ORB), +which is initially given to Start Subchannel (SSCH) command along with the subchannel number +for the IO subsystem to process while the CPU continues executing normal code. +These come in two flavours, Format 0 ( 24 bit for backward ) +compatibility & Format 1 ( 31 bit ). These are typically used to issue read & write +( & many other instructions ) they consist of a length field & an absolute address field. +For each IO typically get 1 or 2 interrupts one for channel end ( primary status ) when the +channel is idle & the second for device end ( secondary status ) sometimes you get both +concurrently, you check how the IO went on by issuing a TEST SUBCHANNEL at each interrupt, +from which you receive an Interruption response block (IRB). If you get channel & device end +status in the IRB without channel checks etc. your IO probably went okay. If you didn't you +probably need a doctorto examine the IRB & extended status word etc. +If an error occurs more sophistocated control units have a facitity known as +concurrent sense this means that if an error occurs Extended sense information will +be presented in the Extended status word in the IRB if not you have to issue a +subsequent SENSE CCW command after the test subchannel. + + +TPI( Test pending interrupt) can also be used for polled IO but in multitasking multiprocessor +systems it isn't recommended except for checking special cases ( i.e. non looping checks for +pending IO etc. ). + +Store Subchannel & Modify Subchannel can be used to examine & modify operating characteristics +of a subchannel ( e.g. channel paths ). + +Other IO related Terms: +Sysplex: S390's Clustering Technology +QDIO: S390's new high speed IO architecture to support devices such as gigabit ethernet, +this architecture is also designed to be forward compatible with up & coming 64 bit machines. + + +General Concepts + +Input Output Processors (IOP's) are responsible for communicating between +the mainframe CPU's & the channel & relieve the mainframe CPU's from the +burden of communicating with IO devices directly, this allows the CPU's to +concentrate on data processing. + +IOP's can use one or more links ( known as channel paths ) to talk to each +IO device. It first checks for path availability & chooses an available one, +then starts ( & sometimes terminates IO ). +There are two types of channel path ESCON & the Paralell IO interface. + +IO devices are attached to control units, control units provide the +logic to interface the channel paths & channel path IO protocols to +the IO devices, they can be integrated with the devices or housed separately +& often talk to several similar devices ( typical examples would be raid +controllers or a control unit which connects to 1000 3270 terminals ). + + + +---------------------------------------------------------------+ + | +-----+ +-----+ +-----+ +-----+ +----------+ +----------+ | + | | CPU | | CPU | | CPU | | CPU | | Main | | Expanded | | + | | | | | | | | | | Memory | | Storage | | + | +-----+ +-----+ +-----+ +-----+ +----------+ +----------+ | + |---------------------------------------------------------------+ + | IOP | IOP | IOP | + |--------------------------------------------------------------- + | C | C | C | C | C | C | C | C | C | C | C | C | C | C | C | C | + ---------------------------------------------------------------- + || || + || Bus & Tag Channel Path || ESCON + || ====================== || Channel + || || || || Path + +----------+ +----------+ +----------+ + | | | | | | + | CU | | CU | | CU | + | | | | | | + +----------+ +----------+ +----------+ + | | | | | ++----------+ +----------+ +----------+ +----------+ +----------+ +|I/O Device| |I/O Device| |I/O Device| |I/O Device| |I/O Device| ++----------+ +----------+ +----------+ +----------+ +----------+ + CPU = Central Processing Unit + C = Channel + IOP = IP Processor + CU = Control Unit + +The 390 IO systems come in 2 flavours the current 390 machines support both + +The Older 360 & 370 Interface,sometimes called the paralell I/O interface, +sometimes called Bus-and Tag & sometimes Original Equipment Manufacturers +Interface (OEMI). + +This byte wide paralell channel path/bus has parity & data on the "Bus" cable +& control lines on the "Tag" cable. These can operate in byte multiplex mode for +sharing between several slow devices or burst mode & monopolize the channel for the +whole burst. Upto 256 devices can be addressed on one of these cables. These cables are +about one inch in diameter. The maximum unextended length supported by these cables is +125 Meters but this can be extended up to 2km with a fibre optic channel extended +such as a 3044. The maximum burst speed supported is 4.5 megabytes per second however +some really old processors support only transfer rates of 3.0, 2.0 & 1.0 MB/sec. +One of these paths can be daisy chained to up to 8 control units. + + +ESCON if fibre optic it is also called FICON +Was introduced by IBM in 1990. Has 2 fibre optic cables & uses either leds or lasers +for communication at a signaling rate of upto 200 megabits/sec. As 10bits are transferred +for every 8 bits info this drops to 160 megabits/sec & to 18.6 Megabytes/sec once +control info & CRC are added. ESCON only operates in burst mode. + +ESCONs typical max cable length is 3km for the led version & 20km for the laser version +known as XDF ( extended distance facility ). This can be further extended by using an +ESCON director which triples the above mentioned ranges. Unlike Bus & Tag as ESCON is +serial it uses a packet switching architecture the standard Bus & Tag control protocol +is however present within the packets. Upto 256 devices can be attached to each control +unit that uses one of these interfaces. + +Common 390 Devices include: +Network adapters typically OSA2,3172's,2116's & OSA-E gigabit ethernet adapters, +Consoles 3270 & 3215 ( a teletype emulated under linux for a line mode console ). +DASD's direct access storage devices ( otherwise known as hard disks ). +Tape Drives. +CTC ( Channel to Channel Adapters ), +ESCON or Paralell Cables used as a very high speed serial link +between 2 machines. We use 2 cables under linux to do a bi-directional serial link. + + +Debugging IO on s/390 & z/Architecture under VM +=============================================== + +Now we are ready to go on with IO tracing commands under VM + +A few self explanatory queries: +Q OSA +Q CTC +Q DISK ( This command is CMS specific ) +Q DASD + + + + + + +Q OSA on my machine returns +OSA 7C08 ON OSA 7C08 SUBCHANNEL = 0000 +OSA 7C09 ON OSA 7C09 SUBCHANNEL = 0001 +OSA 7C14 ON OSA 7C14 SUBCHANNEL = 0002 +OSA 7C15 ON OSA 7C15 SUBCHANNEL = 0003 + +If you have a guest with certain priviliges you may be able to see devices +which don't belong to you to avoid this do add the option V. +e.g. +Q V OSA + +Now using the device numbers returned by this command we will +Trace the io starting up on the first device 7c08 & 7c09 +In our simplest case we can trace the +start subchannels +like TR SSCH 7C08-7C09 +or the halt subchannels +or TR HSCH 7C08-7C09 +MSCH's ,STSCH's I think you can guess the rest + +Ingo's favourite trick is tracing all the IO's & CCWS & spooling them into the reader of another +VM guest so he can ftp the logfile back to his own machine.I'll do a small bit of this & give you + a look at the output. + +1) Spool stdout to VM reader +SP PRT TO (another vm guest ) or * for the local vm guest +2) Fill the reader with the trace +TR IO 7c08-7c09 INST INT CCW PRT RUN +3) Start up linux +i 00c +4) Finish the trace +TR END +5) close the reader +C PRT +6) list reader contents +RDRLIST +7) copy it to linux4's minidisk +RECEIVE / LOG TXT A1 ( replace +8) +filel & press F11 to look at it +You should see someting like. + +00020942' SSCH B2334000 0048813C CC 0 SCH 0000 DEV 7C08 + CPA 000FFDF0 PARM 00E2C9C4 KEY 0 FPI C0 LPM 80 + CCW 000FFDF0 E4200100 00487FE8 0000 E4240100 ........ + IDAL 43D8AFE8 + IDAL 0FB76000 +00020B0A' I/O DEV 7C08 -> 000197BC' SCH 0000 PARM 00E2C9C4 +00021628' TSCH B2354000 >> 00488164 CC 0 SCH 0000 DEV 7C08 + CCWA 000FFDF8 DEV STS 0C SCH STS 00 CNT 00EC + KEY 0 FPI C0 CC 0 CTLS 4007 +00022238' STSCH B2344000 >> 00488108 CC 0 SCH 0000 DEV 7C08 + +If you don't like messing up your readed ( because you possibly booted from it ) +you can alternatively spool it to another readers guest. + + +Other common VM device related commands +--------------------------------------------- +These commands are listed only because they have +been of use to me in the past & may be of use to +you too. For more complete info on each of the commands +use type HELP <command> from CMS. +detaching devices +DET <devno range> +ATT <devno range> <guest> +attach a device to guest * for your own guest +READY <devno> cause VM to issue a fake interrupt. + +The VARY command is normally only available to VM administrators. +VARY ON PATH <path> TO <devno range> +VARY OFF PATH <PATH> FROM <devno range> +This is used to switch on or off channel paths to devices. + +Q CHPID <channel path ID> +This displays state of devices using this channel path +D SCHIB <subchannel> +This displays the subchannel information SCHIB block for the device. +this I believe is also only available to administrators. +DEFINE CTC <devno> +defines a virtual CTC channel to channel connection +2 need to be defined on each guest for the CTC driver to use. +COUPLE devno userid remote devno +Joins a local virtual device to a remote virtual device +( commonly used for the CTC driver ). + +Building a VM ramdisk under CMS which linux can use +def vfb-<blocksize> <subchannel> <number blocks> +blocksize is commonly 4096 for linux. +Formatting it +format <subchannel> <driver letter e.g. x> (blksize <blocksize> + +Sharing a disk between multiple guests +LINK userid devno1 devno2 mode password + + + +GDB on S390 +=========== +N.B. if compiling for debugging gdb works better without optimisation +( see Compiling programs for debugging ) + +invocation +---------- +gdb <victim program> <optional corefile> + +Online help +----------- +help: gives help on commands +e.g. +help +help display +Note gdb's online help is very good use it. + + +Assembly +-------- +info registers: displays registers other than floating point. +info all-registers: displays floating points as well. +disassemble: dissassembles +e.g. +disassemble without parameters will disassemble the current function +disassemble $pc $pc+10 + +Viewing & modifying variables +----------------------------- +print or p: displays variable or register +e.g. p/x $sp will display the stack pointer + +display: prints variable or register each time program stops +e.g. +display/x $pc will display the program counter +display argc + +undisplay : undo's display's + +info breakpoints: shows all current breakpoints + +info stack: shows stack back trace ( if this dosent work too well, I'll show you the +stacktrace by hand below ). + +info locals: displays local variables. + +info args: display current procedure arguments. + +set args: will set argc & argv each time the victim program is invoked. + +set <variable>=value +set argc=100 +set $pc=0 + + + +Modifying execution +------------------- +step: steps n lines of sourcecode +step steps 1 line. +step 100 steps 100 lines of code. + +next: like step except this will not step into subroutines + +stepi: steps a single machine code instruction. +e.g. stepi 100 + +nexti: steps a single machine code instruction but will not step into subroutines. + +finish: will run until exit of the current routine + +run: (re)starts a program + +cont: continues a program + +quit: exits gdb. + + +breakpoints +------------ + +break +sets a breakpoint +e.g. + +break main + +break *$pc + +break *0x400618 + +heres a really useful one for large programs +rbr +Set a breakpoint for all functions matching REGEXP +e.g. +rbr 390 +will set a breakpoint with all functions with 390 in their name. + +info breakpoints +lists all breakpoints + +delete: delete breakpoint by number or delete them all +e.g. +delete 1 will delete the first breakpoint +delete will delete them all + +watch: This will set a watchpoint ( usually hardware assisted ), +This will watch a variable till it changes +e.g. +watch cnt, will watch the variable cnt till it changes. +As an aside unfortunately gdb's, architecture independent watchpoint code +is inconsistent & not very good, watchpoints usually work but not always. + +info watchpoints: Display currently active watchpoints + +condition: ( another useful one ) +Specify breakpoint number N to break only if COND is true. +Usage is `condition N COND', where N is an integer and COND is an +expression to be evaluated whenever breakpoint N is reached. + + + +User defined functions/macros +----------------------------- +define: ( Note this is very very useful,simple & powerful ) +usage define <name> <list of commands> end + +examples which you should consider putting into .gdbinit in your home directory +define d +stepi +disassemble $pc $pc+10 +end + +define e +nexti +disassemble $pc $pc+10 +end + + +Other hard to classify stuff +---------------------------- +signal n: +sends the victim program a signal. +e.g. signal 3 will send a SIGQUIT. + +info signals: +what gdb does when the victim receives certain signals. + +list: +e.g. +list lists current function source +list 1,10 list first 10 lines of curret file. +list test.c:1,10 + + +directory: +Adds directories to be searched for source if gdb cannot find the source. +(note it is a bit sensititive about slashes ) +e.g. To add the root of the filesystem to the searchpath do +directory // + + +call <function> +This calls a function in the victim program, this is pretty powerful +e.g. +(gdb) call printf("hello world") +outputs: +$1 = 11 + +You might now be thinking that the line above didn't work, something extra had to be done. +(gdb) call fflush(stdout) +hello world$2 = 0 +As an aside the debugger also calls malloc & free under the hood +to make space for the "hello world" string. + + + +hints +----- +1) command completion works just like bash +( if you are a bad typist like me this really helps ) +e.g. hit br <TAB> & cursor up & down :-). + +2) if you have a debugging problem that takes a few steps to recreate +put the steps into a file called .gdbinit in your current working directory +if you have defined a few extra useful user defined commands put these in +your home directory & they will be read each time gdb is launched. + +A typical .gdbinit file might be. +break main +run +break runtime_exception +cont + + +stack chaining in gdb by hand +----------------------------- +This is done using a the same trick described for VM +p/x (*($sp+56))&0x7fffffff get the first backchain. + +For z/Architecture +Replace 56 with 112 & ignore the &0x7fffffff +in the macros below & do nasty casts to longs like the following +as gdb unfortunately deals with printed arguments as ints which +messes up everything. +i.e. here is a 3rd backchain dereference +p/x *(long *)(***(long ***)$sp+112) + + +this outputs +$5 = 0x528f18 +on my machine. +Now you can use +info symbol (*($sp+56))&0x7fffffff +you might see something like. +rl_getc + 36 in section .text telling you what is located at address 0x528f18 +Now do. +p/x (*(*$sp+56))&0x7fffffff +This outputs +$6 = 0x528ed0 +Now do. +info symbol (*(*$sp+56))&0x7fffffff +rl_read_key + 180 in section .text +now do +p/x (*(**$sp+56))&0x7fffffff +& so on. + +Disassembling instructions without debug info +--------------------------------------------- +gdb typically compains if there is a lack of debugging +symbols in the disassemble command with +"No function contains specified address." to get around +this do +x/<number lines to disassemble>xi <address> +e.g. +x/20xi 0x400730 + + + +Note: Remember gdb has history just like bash you don't need to retype the +whole line just use the up & down arrows. + + + +For more info +------------- +From your linuxbox do +man gdb or info gdb. + +core dumps +---------- +What a core dump ?, +A core dump is a file generated by the kernel ( if allowed ) which contains the registers, +& all active pages of the program which has crashed. +From this file gdb will allow you to look at the registers & stack trace & memory of the +program as if it just crashed on your system, it is usually called core & created in the +current working directory. +This is very useful in that a customer can mail a core dump to a technical support department +& the technical support department can reconstruct what happened. +Provided the have an identical copy of this program with debugging symbols compiled in & +the source base of this build is available. +In short it is far more useful than something like a crash log could ever hope to be. + +In theory all that is missing to restart a core dumped program is a kernel patch which +will do the following. +1) Make a new kernel task structure +2) Reload all the dumped pages back into the kernel's memory management structures. +3) Do the required clock fixups +4) Get all files & network connections for the process back into an identical state ( really difficult ). +5) A few more difficult things I haven't thought of. + + + +Why have I never seen one ?. +Probably because you haven't used the command +ulimit -c unlimited in bash +to allow core dumps, now do +ulimit -a +to verify that the limit was accepted. + +A sample core dump +To create this I'm going to do +ulimit -c unlimited +gdb +to launch gdb (my victim app. ) now be bad & do the following from another +telnet/xterm session to the same machine +ps -aux | grep gdb +kill -SIGSEGV <gdb's pid> +or alternatively use killall -SIGSEGV gdb if you have the killall command. +Now look at the core dump. +./gdb ./gdb core +Displays the following +GNU gdb 4.18 +Copyright 1998 Free Software Foundation, Inc. +GDB is free software, covered by the GNU General Public License, and you are +welcome to change it and/or distribute copies of it under certain conditions. +Type "show copying" to see the conditions. +There is absolutely no warranty for GDB. Type "show warranty" for details. +This GDB was configured as "s390-ibm-linux"... +Core was generated by `./gdb'. +Program terminated with signal 11, Segmentation fault. +Reading symbols from /usr/lib/libncurses.so.4...done. +Reading symbols from /lib/libm.so.6...done. +Reading symbols from /lib/libc.so.6...done. +Reading symbols from /lib/ld-linux.so.2...done. +#0 0x40126d1a in read () from /lib/libc.so.6 +Setting up the environment for debugging gdb. +Breakpoint 1 at 0x4dc6f8: file utils.c, line 471. +Breakpoint 2 at 0x4d87a4: file top.c, line 2609. +(top-gdb) info stack +#0 0x40126d1a in read () from /lib/libc.so.6 +#1 0x528f26 in rl_getc (stream=0x7ffffde8) at input.c:402 +#2 0x528ed0 in rl_read_key () at input.c:381 +#3 0x5167e6 in readline_internal_char () at readline.c:454 +#4 0x5168ee in readline_internal_charloop () at readline.c:507 +#5 0x51692c in readline_internal () at readline.c:521 +#6 0x5164fe in readline (prompt=0x7ffff810 "\177ÿøx\177ÿ÷Ø\177ÿøxÀ") + at readline.c:349 +#7 0x4d7a8a in command_line_input (prrompt=0x564420 "(gdb) ", repeat=1, + annotation_suffix=0x4d6b44 "prompt") at top.c:2091 +#8 0x4d6cf0 in command_loop () at top.c:1345 +#9 0x4e25bc in main (argc=1, argv=0x7ffffdf4) at main.c:635 + + +LDD +=== +This is a program which lists the shared libraries which a library needs, +Note you also get the relocations of the shared library text segments which +help when using objdump --source. +e.g. + ldd ./gdb +outputs +libncurses.so.4 => /usr/lib/libncurses.so.4 (0x40018000) +libm.so.6 => /lib/libm.so.6 (0x4005e000) +libc.so.6 => /lib/libc.so.6 (0x40084000) +/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000) + + +Debugging shared libraries +========================== +Most programs use shared libraries, however it can be very painful +when you single step instruction into a function like printf for the +first time & you end up in functions like _dl_runtime_resolve this is +the ld.so doing lazy binding, lazy binding is a concept in ELF where +shared library functions are not loaded into memory unless they are +actually used, great for saving memory but a pain to debug. +To get around this either relink the program -static or exit gdb type +export LD_BIND_NOW=true this will stop lazy binding & restart the gdb'ing +the program in question. + + + +Debugging modules +================= +As modules are dynamically loaded into the kernel their address can be +anywhere to get around this use the -m option with insmod to emit a load +map which can be piped into a file if required. + +The proc file system +==================== +What is it ?. +It is a filesystem created by the kernel with files which are created on demand +by the kernel if read, or can be used to modify kernel parameters, +it is a powerful concept. + +e.g. + +cat /proc/sys/net/ipv4/ip_forward +On my machine outputs +0 +telling me ip_forwarding is not on to switch it on I can do +echo 1 > /proc/sys/net/ipv4/ip_forward +cat it again +cat /proc/sys/net/ipv4/ip_forward +On my machine now outputs +1 +IP forwarding is on. +There is a lot of useful info in here best found by going in & having a look around, +so I'll take you through some entries I consider important. + +All the processes running on the machine have there own entry defined by +/proc/<pid> +So lets have a look at the init process +cd /proc/1 + +cat cmdline +emits +init [2] + +cd /proc/1/fd +This contains numerical entries of all the open files, +some of these you can cat e.g. stdout (2) + +cat /proc/29/maps +on my machine emits + +00400000-00478000 r-xp 00000000 5f:00 4103 /bin/bash +00478000-0047e000 rw-p 00077000 5f:00 4103 /bin/bash +0047e000-00492000 rwxp 00000000 00:00 0 +40000000-40015000 r-xp 00000000 5f:00 14382 /lib/ld-2.1.2.so +40015000-40016000 rw-p 00014000 5f:00 14382 /lib/ld-2.1.2.so +40016000-40017000 rwxp 00000000 00:00 0 +40017000-40018000 rw-p 00000000 00:00 0 +40018000-4001b000 r-xp 00000000 5f:00 14435 /lib/libtermcap.so.2.0.8 +4001b000-4001c000 rw-p 00002000 5f:00 14435 /lib/libtermcap.so.2.0.8 +4001c000-4010d000 r-xp 00000000 5f:00 14387 /lib/libc-2.1.2.so +4010d000-40111000 rw-p 000f0000 5f:00 14387 /lib/libc-2.1.2.so +40111000-40114000 rw-p 00000000 00:00 0 +40114000-4011e000 r-xp 00000000 5f:00 14408 /lib/libnss_files-2.1.2.so +4011e000-4011f000 rw-p 00009000 5f:00 14408 /lib/libnss_files-2.1.2.so +7fffd000-80000000 rwxp ffffe000 00:00 0 + + +Showing us the shared libraries init uses where they are in memory +& memory access permissions for each virtual memory area. + +/proc/1/cwd is a softlink to the current working directory. +/proc/1/root is the root of the filesystem for this process. + +/proc/1/mem is the current running processes memory which you +can read & write to like a file. +strace uses this sometimes as it is a bit faster than the +rather inefficent ptrace interface for peeking at DATA. + + +cat status + +Name: init +State: S (sleeping) +Pid: 1 +PPid: 0 +Uid: 0 0 0 0 +Gid: 0 0 0 0 +Groups: +VmSize: 408 kB +VmLck: 0 kB +VmRSS: 208 kB +VmData: 24 kB +VmStk: 8 kB +VmExe: 368 kB +VmLib: 0 kB +SigPnd: 0000000000000000 +SigBlk: 0000000000000000 +SigIgn: 7fffffffd7f0d8fc +SigCgt: 00000000280b2603 +CapInh: 00000000fffffeff +CapPrm: 00000000ffffffff +CapEff: 00000000fffffeff + +User PSW: 070de000 80414146 +task: 004b6000 tss: 004b62d8 ksp: 004b7ca8 pt_regs: 004b7f68 +User GPRS: +00000400 00000000 0000000b 7ffffa90 +00000000 00000000 00000000 0045d9f4 +0045cafc 7ffffa90 7fffff18 0045cb08 +00010400 804039e8 80403af8 7ffff8b0 +User ACRS: +00000000 00000000 00000000 00000000 +00000001 00000000 00000000 00000000 +00000000 00000000 00000000 00000000 +00000000 00000000 00000000 00000000 +Kernel BackChain CallChain BackChain CallChain + 004b7ca8 8002bd0c 004b7d18 8002b92c + 004b7db8 8005cd50 004b7e38 8005d12a + 004b7f08 80019114 +Showing among other things memory usage & status of some signals & +the processes'es registers from the kernel task_structure +as well as a backchain which may be useful if a process crashes +in the kernel for some unknown reason. + +Some driver debugging techniques +================================ +debug feature +------------- +Some of our drivers now support a "debug feature" in +/proc/s390dbf see s390dbf.txt in the linux/Documentation directory +for more info. +e.g. +to switch on the lcs "debug feature" +echo 5 > /proc/s390dbf/lcs/level +& then after the error occurred. +cat /proc/s390dbf/lcs/sprintf >/logfile +the logfile now contains some information which may help +tech support resolve a problem in the field. + + + +high level debugging network drivers +------------------------------------ +ifconfig is a quite useful command +it gives the current state of network drivers. + +If you suspect your network device driver is dead +one way to check is type +ifconfig <network device> +e.g. tr0 +You should see something like +tr0 Link encap:16/4 Mbps Token Ring (New) HWaddr 00:04:AC:20:8E:48 + inet addr:9.164.185.132 Bcast:9.164.191.255 Mask:255.255.224.0 + UP BROADCAST RUNNING MULTICAST MTU:2000 Metric:1 + RX packets:246134 errors:0 dropped:0 overruns:0 frame:0 + TX packets:5 errors:0 dropped:0 overruns:0 carrier:0 + collisions:0 txqueuelen:100 + +if the device doesn't say up +try +/etc/rc.d/init.d/network start +( this starts the network stack & hopefully calls ifconfig tr0 up ). +ifconfig looks at the output of /proc/net/dev & presents it in a more presentable form +Now ping the device from a machine in the same subnet. +if the RX packets count & TX packets counts don't increment you probably +have problems. +next +cat /proc/net/arp +Do you see any hardware addresses in the cache if not you may have problems. +Next try +ping -c 5 <broadcast_addr> i.e. the Bcast field above in the output of +ifconfig. Do you see any replies from machines other than the local machine +if not you may have problems. also if the TX packets count in ifconfig +hasn't incremented either you have serious problems in your driver +(e.g. the txbusy field of the network device being stuck on ) +or you may have multiple network devices connected. + + +chandev +------- +There is a new device layer for channel devices, some +drivers e.g. lcs are registered with this layer. +If the device uses the channel device layer you'll be +able to find what interrupts it uses & the current state +of the device. +See the manpage chandev.8 &type cat /proc/chandev for more info. + + + +Starting points for debugging scripting languages etc. +====================================================== + +bash/sh + +bash -x <scriptname> +e.g. bash -x /usr/bin/bashbug +displays the following lines as it executes them. ++ MACHINE=i586 ++ OS=linux-gnu ++ CC=gcc ++ CFLAGS= -DPROGRAM='bash' -DHOSTTYPE='i586' -DOSTYPE='linux-gnu' -DMACHTYPE='i586-pc-linux-gnu' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./lib -O2 -pipe ++ RELEASE=2.01 ++ PATCHLEVEL=1 ++ RELSTATUS=release ++ MACHTYPE=i586-pc-linux-gnu + +perl -d <scriptname> runs the perlscript in a fully intercative debugger +<like gdb>. +Type 'h' in the debugger for help. + +for debugging java type +jdb <filename> another fully interactive gdb style debugger. +& type ? in the debugger for help. + + + +Dumptool & Lcrash ( lkcd ) +========================== +Michael Holzheu & others here at IBM have a fairly mature port of +SGI's lcrash tool which allows one to look at kernel structures in a +running kernel. + +It also complements a tool called dumptool which dumps all the kernel's +memory pages & registers to either a tape or a disk. +This can be used by tech support or an ambitious end user do +post mortem debugging of a machine like gdb core dumps. + +Going into how to use this tool in detail will be explained +in other documentation supplied by IBM with the patches & the +lcrash homepage http://oss.sgi.com/projects/lkcd/ & the lcrash manpage. + +How they work +------------- +Lcrash is a perfectly normal program,however, it requires 2 +additional files, Kerntypes which is built using a patch to the +linux kernel sources in the linux root directory & the System.map. + +Kerntypes is an an objectfile whose sole purpose in life +is to provide stabs debug info to lcrash, to do this +Kerntypes is built from kerntypes.c which just includes the most commonly +referenced header files used when debugging, lcrash can then read the +.stabs section of this file. + +Debugging a live system it uses /dev/mem +alternatively for post mortem debugging it uses the data +collected by dumptool. + + + +SysRq +===== +This is now supported by linux for s/390 & z/Architecture. +To enable it do compile the kernel with +Kernel Hacking -> Magic SysRq Key Enabled +echo "1" > /proc/sys/kernel/sysrq +also type +echo "8" >/proc/sys/kernel/printk +To make printk output go to console. +On 390 all commands are prefixed with +^- +e.g. +^-t will show tasks. +^-? or some unknown command will display help. +The sysrq key reading is very picky ( I have to type the keys in an + xterm session & paste them into the x3270 console ) +& it may be wise to predefine the keys as described in the VM hints above + +This is particularly useful for syncing disks unmounting & rebooting +if the machine gets partially hung. + +Read Documentation/sysrq.txt for more info + +References: +=========== +Enterprise Systems Architecture Reference Summary +Enterprise Systems Architecture Principles of Operation +Hartmut Penners s390 stack frame sheet. +IBM Mainframe Channel Attachment a technology brief from a CISCO webpage +Various bits of man & info pages of Linux. +Linux & GDB source. +Various info & man pages. +CMS Help on tracing commands. +Linux for s/390 Elf Application Binary Interface +Linux for z/Series Elf Application Binary Interface ( Both Highly Recommended ) +z/Architecture Principles of Operation SA22-7832-00 +Enterprise Systems Architecture/390 Reference Summary SA22-7209-01 & the +Enterprise Systems Architecture/390 Principles of Operation SA22-7201-05 + +Special Thanks +============== +Special thanks to Neale Ferguson who maintains a much +prettier HTML version of this page at +http://penguinvm.princeton.edu/notes.html#Debug390 +Bob Grainger Stefan Bader & others for reporting bugs diff --git a/Documentation/s390/TAPE b/Documentation/s390/TAPE new file mode 100644 index 000000000000..c639aa5603ff --- /dev/null +++ b/Documentation/s390/TAPE @@ -0,0 +1,122 @@ +Channel attached Tape device driver + +-----------------------------WARNING----------------------------------------- +This driver is considered to be EXPERIMENTAL. Do NOT use it in +production environments. Feel free to test it and report problems back to us. +----------------------------------------------------------------------------- + +The LINUX for zSeries tape device driver manages channel attached tape drives +which are compatible to IBM 3480 or IBM 3490 magnetic tape subsystems. This +includes various models of these devices (for example the 3490E). + + +Tape driver features + +The device driver supports a maximum of 128 tape devices. +No official LINUX device major number is assigned to the zSeries tape device +driver. It allocates major numbers dynamically and reports them on system +startup. +Typically it will get major number 254 for both the character device front-end +and the block device front-end. + +The tape device driver needs no kernel parameters. All supported devices +present are detected on driver initialization at system startup or module load. +The devices detected are ordered by their subchannel numbers. The device with +the lowest subchannel number becomes device 0, the next one will be device 1 +and so on. + + +Tape character device front-end + +The usual way to read or write to the tape device is through the character +device front-end. The zSeries tape device driver provides two character devices +for each physical device -- the first of these will rewind automatically when +it is closed, the second will not rewind automatically. + +The character device nodes are named /dev/rtibm0 (rewinding) and /dev/ntibm0 +(non-rewinding) for the first device, /dev/rtibm1 and /dev/ntibm1 for the +second, and so on. + +The character device front-end can be used as any other LINUX tape device. You +can write to it and read from it using LINUX facilities such as GNU tar. The +tool mt can be used to perform control operations, such as rewinding the tape +or skipping a file. + +Most LINUX tape software should work with either tape character device. + + +Tape block device front-end + +The tape device may also be accessed as a block device in read-only mode. +This could be used for software installation in the same way as it is used with +other operation systems on the zSeries platform (and most LINUX +distributions are shipped on compact disk using ISO9660 filesystems). + +One block device node is provided for each physical device. These are named +/dev/btibm0 for the first device, /dev/btibm1 for the second and so on. +You should only use the ISO9660 filesystem on LINUX for zSeries tapes because +the physical tape devices cannot perform fast seeks and the ISO9660 system is +optimized for this situation. + + +Tape block device example + +In this example a tape with an ISO9660 filesystem is created using the first +tape device. ISO9660 filesystem support must be built into your system kernel +for this. +The mt command is used to issue tape commands and the mkisofs command to +create an ISO9660 filesystem: + +- create a LINUX directory (somedir) with the contents of the filesystem + mkdir somedir + cp contents somedir + +- insert a tape + +- ensure the tape is at the beginning + mt -f /dev/ntibm0 rewind + +- set the blocksize of the character driver. The blocksize 2048 bytes + is commonly used on ISO9660 CD-Roms + mt -f /dev/ntibm0 setblk 2048 + +- write the filesystem to the character device driver + mkisofs -o /dev/ntibm0 somedir + +- rewind the tape again + mt -f /dev/ntibm0 rewind + +- Now you can mount your new filesystem as a block device: + mount -t iso9660 -o ro,block=2048 /dev/btibm0 /mnt + +TODO List + + - Driver has to be stabilized still + +BUGS + +This driver is considered BETA, which means some weaknesses may still +be in it. +If an error occurs which cannot be handled by the code you will get a +sense-data dump.In that case please do the following: + +1. set the tape driver debug level to maximum: + echo 6 >/proc/s390dbf/tape/level + +2. re-perform the actions which produced the bug. (Hopefully the bug will + reappear.) + +3. get a snapshot from the debug-feature: + cat /proc/s390dbf/tape/hex_ascii >somefile + +4. Now put the snapshot together with a detailed description of the situation + that led to the bug: + - Which tool did you use? + - Which hardware do you have? + - Was your tape unit online? + - Is it a shared tape unit? + +5. Send an email with your bug report to: + mailto:Linux390@de.ibm.com + + diff --git a/Documentation/s390/cds.txt b/Documentation/s390/cds.txt new file mode 100644 index 000000000000..d9397170fb36 --- /dev/null +++ b/Documentation/s390/cds.txt @@ -0,0 +1,513 @@ +Linux for S/390 and zSeries + +Common Device Support (CDS) +Device Driver I/O Support Routines + +Authors : Ingo Adlung + Cornelia Huck + +Copyright, IBM Corp. 1999-2002 + +Introduction + +This document describes the common device support routines for Linux/390. +Different than other hardware architectures, ESA/390 has defined a unified +I/O access method. This gives relief to the device drivers as they don't +have to deal with different bus types, polling versus interrupt +processing, shared versus non-shared interrupt processing, DMA versus port +I/O (PIO), and other hardware features more. However, this implies that +either every single device driver needs to implement the hardware I/O +attachment functionality itself, or the operating system provides for a +unified method to access the hardware, providing all the functionality that +every single device driver would have to provide itself. + +The document does not intend to explain the ESA/390 hardware architecture in +every detail.This information can be obtained from the ESA/390 Principles of +Operation manual (IBM Form. No. SA22-7201). + +In order to build common device support for ESA/390 I/O interfaces, a +functional layer was introduced that provides generic I/O access methods to +the hardware. + +The common device support layer comprises the I/O support routines defined +below. Some of them implement common Linux device driver interfaces, while +some of them are ESA/390 platform specific. + +Note: +In order to write a driver for S/390, you also need to look into the interface +described in Documentation/s390/driver-model.txt. + +Note for porting drivers from 2.4: +The major changes are: +* The functions use a ccw_device instead of an irq (subchannel). +* All drivers must define a ccw_driver (see driver-model.txt) and the associated + functions. +* request_irq() and free_irq() are no longer done by the driver. +* The oper_handler is (kindof) replaced by the probe() and set_online() functions + of the ccw_driver. +* The not_oper_handler is (kindof) replaced by the remove() and set_offline() + functions of the ccw_driver. +* The channel device layer is gone. +* The interrupt handlers must be adapted to use a ccw_device as argument. + Moreover, they don't return a devstat, but an irb. +* Before initiating an io, the options must be set via ccw_device_set_options(). + +read_dev_chars() + read device characteristics + +read_conf_data() + read configuration data. + +ccw_device_get_ciw() + get commands from extended sense data. + +ccw_device_start() + initiate an I/O request. + +ccw_device_resume() + resume channel program execution. + +ccw_device_halt() + terminate the current I/O request processed on the device. + +do_IRQ() + generic interrupt routine. This function is called by the interrupt entry + routine whenever an I/O interrupt is presented to the system. The do_IRQ() + routine determines the interrupt status and calls the device specific + interrupt handler according to the rules (flags) defined during I/O request + initiation with do_IO(). + +The next chapters describe the functions other than do_IRQ() in more details. +The do_IRQ() interface is not described, as it is called from the Linux/390 +first level interrupt handler only and does not comprise a device driver +callable interface. Instead, the functional description of do_IO() also +describes the input to the device specific interrupt handler. + +Note: All explanations apply also to the 64 bit architecture s390x. + + +Common Device Support (CDS) for Linux/390 Device Drivers + +General Information + +The following chapters describe the I/O related interface routines the +Linux/390 common device support (CDS) provides to allow for device specific +driver implementations on the IBM ESA/390 hardware platform. Those interfaces +intend to provide the functionality required by every device driver +implementaion to allow to drive a specific hardware device on the ESA/390 +platform. Some of the interface routines are specific to Linux/390 and some +of them can be found on other Linux platforms implementations too. +Miscellaneous function prototypes, data declarations, and macro definitions +can be found in the architecture specific C header file +linux/include/asm-s390/irq.h. + +Overview of CDS interface concepts + +Different to other hardware platforms, the ESA/390 architecture doesn't define +interrupt lines managed by a specific interrupt controller and bus systems +that may or may not allow for shared interrupts, DMA processing, etc.. Instead, +the ESA/390 architecture has implemented a so called channel subsystem, that +provides a unified view of the devices physically attached to the systems. +Though the ESA/390 hardware platform knows about a huge variety of different +peripheral attachments like disk devices (aka. DASDs), tapes, communication +controllers, etc. they can all by accessed by a well defined access method and +they are presenting I/O completion a unified way : I/O interruptions. Every +single device is uniquely identified to the system by a so called subchannel, +where the ESA/390 architecture allows for 64k devices be attached. + +Linux, however, was first built on the Intel PC architecture, with its two +cascaded 8259 programmable interrupt controllers (PICs), that allow for a +maximum of 15 different interrupt lines. All devices attached to such a system +share those 15 interrupt levels. Devices attached to the ISA bus system must +not share interrupt levels (aka. IRQs), as the ISA bus bases on edge triggered +interrupts. MCA, EISA, PCI and other bus systems base on level triggered +interrupts, and therewith allow for shared IRQs. However, if multiple devices +present their hardware status by the same (shared) IRQ, the operating system +has to call every single device driver registered on this IRQ in order to +determine the device driver owning the device that raised the interrupt. + +In order not to introduce a new I/O concept to the common Linux code, +Linux/390 preserves the IRQ concept and semantically maps the ESA/390 +subchannels to Linux as IRQs. This allows Linux/390 to support up to 64k +different IRQs, uniquely representig a single device each. + +Up to kernel 2.4, Linux/390 used to provide interfaces via the IRQ (subchannel). +For internal use of the common I/O layer, these are still there. However, +device drivers should use the new calling interface via the ccw_device only. + +During its startup the Linux/390 system checks for peripheral devices. Each +of those devices is uniquely defined by a so called subchannel by the ESA/390 +channel subsystem. While the subchannel numbers are system generated, each +subchannel also takes a user defined attribute, the so called device number. +Both subchannel number and device number can not exceed 65535. During driverfs +initialisation, the information about control unit type and device types that +imply specific I/O commands (channel command words - CCWs) in order to operate +the device are gathered. Device drivers can retrieve this set of hardware +information during their initialization step to recognize the devices they +support using the information saved in the struct ccw_device given to them. +This methods implies that Linux/390 doesn't require to probe for free (not +armed) interrupt request lines (IRQs) to drive its devices with. Where +applicable, the device drivers can use the read_dev_chars() to retrieve device +characteristics. This can be done without having to request device ownership +previously. + +In order to allow for easy I/O initiation the CDS layer provides a +ccw_device_start() interface that takes a device specific channel program (one +or more CCWs) as input sets up the required architecture specific control blocks +and initiates an I/O request on behalf of the device driver. The +ccw_device_start() routine allows to specify whether it expects the CDS layer +to notify the device driver for every interrupt it observes, or with final status +only. See ccw_device_start() for more details. A device driver must never issue +ESA/390 I/O commands itself, but must use the Linux/390 CDS interfaces instead. + +For long running I/O request to be canceled, the CDS layer provides the +ccw_device_halt() function. Some devices require to initially issue a HALT +SUBCHANNEL (HSCH) command without having pending I/O requests. This function is +also covered by ccw_device_halt(). + + +read_dev_chars() - Read Device Characteristics + +This routine returns the characteristics for the device specified. + +The function is meant to be called with an irq handler in place; that is, +at earliest during set_online() processing. + +While the request is procesed synchronously, the device interrupt +handler is called for final ending status. In case of error situations the +interrupt handler may recover appropriately. The device irq handler can +recognize the corresponding interrupts by the interruption parameter be +0x00524443.The ccw_device must not be locked prior to calling read_dev_chars(). + +The function may be called enabled or disabled. + +int read_dev_chars(struct ccw_device *cdev, void **buffer, int length ); + +cdev - the ccw_device the information is requested for. +buffer - pointer to a buffer pointer. The buffer pointer itself + must contain a valid buffer area. +length - length of the buffer provided. + +The read_dev_chars() function returns : + + 0 - successful completion +-ENODEV - cdev invalid +-EINVAL - an invalid parameter was detected, or the function was called early. +-EBUSY - an irrecoverable I/O error occurred or the device is not + operational. + + +read_conf_data() - Read Configuration Data + +Retrieve the device dependent configuration data. Please have a look at your +device dependent I/O commands for the device specific layout of the node +descriptor elements. + +The function is meant to be called with an irq handler in place; that is, +at earliest during set_online() processing. + +The function may be called enabled or disabled, but the device must not be +locked + +int read_conf_data(struct ccw_device, void **buffer, int *length, __u8 lpm); + +cdev - the ccw_device the data is requested for. +buffer - Pointer to a buffer pointer. The read_conf_data() routine + will allocate a buffer and initialize the buffer pointer + accordingly. It's the device driver's responsibility to + release the kernel memory if no longer needed. +length - Length of the buffer allocated and retrieved. +lpm - Logical path mask to be used for retrieving the data. If + zero the data is retrieved on the next path available. + +The read_conf_data() function returns : + 0 - Successful completion +-ENODEV - cdev invalid. +-EINVAL - An invalid parameter was detected, or the function was called early. +-EIO - An irrecoverable I/O error occurred or the device is + not operational. +-ENOMEM - The read_conf_data() routine couldn't obtain storage. +-EOPNOTSUPP - The device doesn't support the read configuration + data command. + + +get_ciw() - get command information word + +This call enables a device driver to get information about supported commands +from the extended SenseID data. + +struct ciw * +ccw_device_get_ciw(struct ccw_device *cdev, __u32 cmd); + +cdev - The ccw_device for which the command is to be retrieved. +cmd - The command type to be retrieved. + +ccw_device_get_ciw() returns: +NULL - No extended data available, invalid device or command not found. +!NULL - The command requested. + + +ccw_device_start() - Initiate I/O Request + +The ccw_device_start() routines is the I/O request front-end processor. All +device driver I/O requests must be issued using this routine. A device driver +must not issue ESA/390 I/O commands itself. Instead the ccw_device_start() +routine provides all interfaces required to drive arbitrary devices. + +This description also covers the status information passed to the device +driver's interrupt handler as this is related to the rules (flags) defined +with the associated I/O request when calling ccw_device_start(). + +int ccw_device_start(struct ccw_device *cdev, + struct ccw1 *cpa, + unsigned long intparm, + __u8 lpm, + unsigned long flags); + +cdev : ccw_device the I/O is destined for +cpa : logical start address of channel program +user_intparm : user specific interrupt information; will be presented + back to the device driver's interrupt handler. Allows a + device driver to associate the interrupt with a + particular I/O request. +lpm : defines the channel path to be used for a specific I/O + request. A value of 0 will make cio use the opm. +flag : defines the action to be performed for I/O processing + +Possible flag values are : + +DOIO_ALLOW_SUSPEND - channel program may become suspended +DOIO_DENY_PREFETCH - don't allow for CCW prefetch; usually + this implies the channel program might + become modified +DOIO_SUPPRESS_INTER - don't call the handler on intermediate status + +The cpa parameter points to the first format 1 CCW of a channel program : + +struct ccw1 { + __u8 cmd_code;/* command code */ + __u8 flags; /* flags, like IDA addressing, etc. */ + __u16 count; /* byte count */ + __u32 cda; /* data address */ +} __attribute__ ((packed,aligned(8))); + +with the following CCW flags values defined : + +CCW_FLAG_DC - data chaining +CCW_FLAG_CC - command chaining +CCW_FLAG_SLI - suppress incorrct length +CCW_FLAG_SKIP - skip +CCW_FLAG_PCI - PCI +CCW_FLAG_IDA - indirect addressing +CCW_FLAG_SUSPEND - suspend + + +Via ccw_device_set_options(), the device driver may specify the following +options for the device: + +DOIO_EARLY_NOTIFICATION - allow for early interrupt notification +DOIO_REPORT_ALL - report all interrupt conditions + + +The ccw_device_start() function returns : + + 0 - successful completion or request successfully initiated +-EBUSY - The device is currently processing a previous I/O request, or ther is + a status pending at the device. +-ENODEV - cdev is invalid, the device is not operational or the ccw_device is + not online. + +When the I/O request completes, the CDS first level interrupt handler will +accumalate the status in a struct irb and then call the device interrupt handler. +The intparm field will contain the value the device driver has associated with a +particular I/O request. If a pending device status was recognized, +intparm will be set to 0 (zero). This may happen during I/O initiation or delayed +by an alert status notification. In any case this status is not related to the +current (last) I/O request. In case of a delayed status notification no special +interrupt will be presented to indicate I/O completion as the I/O request was +never started, even though ccw_device_start() returned with successful completion. + +If the concurrent sense flag in the extended status word in the irb is set, the +field irb->scsw.count describes the numer of device specific sense bytes +available in the extended control word irb->scsw.ecw[0]. No device sensing by +the device driver itself is required. + +The device interrupt handler can use the following definitions to investigate +the primary unit check source coded in sense byte 0 : + +SNS0_CMD_REJECT 0x80 +SNS0_INTERVENTION_REQ 0x40 +SNS0_BUS_OUT_CHECK 0x20 +SNS0_EQUIPMENT_CHECK 0x10 +SNS0_DATA_CHECK 0x08 +SNS0_OVERRUN 0x04 +SNS0_INCOMPL_DOMAIN 0x01 + +Depending on the device status, multiple of those values may be set together. +Please refer to the device specific documentation for details. + +The irb->scsw.cstat field provides the (accumulated) subchannel status : + +SCHN_STAT_PCI - program controlled interrupt +SCHN_STAT_INCORR_LEN - incorrect length +SCHN_STAT_PROG_CHECK - program check +SCHN_STAT_PROT_CHECK - protection check +SCHN_STAT_CHN_DATA_CHK - channel data check +SCHN_STAT_CHN_CTRL_CHK - channel control check +SCHN_STAT_INTF_CTRL_CHK - interface control check +SCHN_STAT_CHAIN_CHECK - chaining check + +The irb->scsw.dstat field provides the (accumulated) device status : + +DEV_STAT_ATTENTION - attention +DEV_STAT_STAT_MOD - status modifier +DEV_STAT_CU_END - control unit end +DEV_STAT_BUSY - busy +DEV_STAT_CHN_END - channel end +DEV_STAT_DEV_END - device end +DEV_STAT_UNIT_CHECK - unit check +DEV_STAT_UNIT_EXCEP - unit exception + +Please see the ESA/390 Principles of Operation manual for details on the +individual flag meanings. + +Usage Notes : + +Prior to call ccw_device_start() the device driver must assure disabled state, +i.e. the I/O mask value in the PSW must be disabled. This can be accomplished +by calling local_save_flags( flags). The current PSW flags are preserved and +can be restored by local_irq_restore( flags) at a later time. + +If the device driver violates this rule while running in a uni-processor +environment an interrupt might be presented prior to the ccw_device_start() +routine returning to the device driver main path. In this case we will end in a +deadlock situation as the interrupt handler will try to obtain the irq +lock the device driver still owns (see below) ! + +The driver must assure to hold the device specific lock. This can be +accomplished by + +(i) spin_lock(get_ccwdev_lock(cdev)), or +(ii) spin_lock_irqsave(get_ccwdev_lock(cdev), flags) + +Option (i) should be used if the calling routine is running disabled for +I/O interrupts (see above) already. Option (ii) obtains the device gate und +puts the CPU into I/O disabled state by preserving the current PSW flags. + +The device driver is allowed to issue the next ccw_device_start() call from +within its interrupt handler already. It is not required to schedule a +bottom-half, unless an non deterministicly long running error recovery procedure +or similar needs to be scheduled. During I/O processing the Linux/390 generic +I/O device driver support has already obtained the IRQ lock, i.e. the handler +must not try to obtain it again when calling ccw_device_start() or we end in a +deadlock situation! + +If a device driver relies on an I/O request to be completed prior to start the +next it can reduce I/O processing overhead by chaining a NoOp I/O command +CCW_CMD_NOOP to the end of the submitted CCW chain. This will force Channel-End +and Device-End status to be presented together, with a single interrupt. +However, this should be used with care as it implies the channel will remain +busy, not being able to process I/O requests for other devices on the same +channel. Therefore e.g. read commands should never use this technique, as the +result will be presented by a single interrupt anyway. + +In order to minimize I/O overhead, a device driver should use the +DOIO_REPORT_ALL only if the device can report intermediate interrupt +information prior to device-end the device driver urgently relies on. In this +case all I/O interruptions are presented to the device driver until final +status is recognized. + +If a device is able to recover from asynchronosly presented I/O errors, it can +perform overlapping I/O using the DOIO_EARLY_NOTIFICATION flag. While some +devices always report channel-end and device-end together, with a single +interrupt, others present primary status (channel-end) when the channel is +ready for the next I/O request and secondary status (device-end) when the data +transmission has been completed at the device. + +Above flag allows to exploit this feature, e.g. for communication devices that +can handle lost data on the network to allow for enhanced I/O processing. + +Unless the channel subsystem at any time presents a secondary status interrupt, +exploiting this feature will cause only primary status interrupts to be +presented to the device driver while overlapping I/O is performed. When a +secondary status without error (alert status) is presented, this indicates +successful completion for all overlapping ccw_device_start() requests that have +been issued since the last secondary (final) status. + +Channel programs that intend to set the suspend flag on a channel command word +(CCW) must start the I/O operation with the DOIO_ALLOW_SUSPEND option or the +suspend flag will cause a channel program check. At the time the channel program +becomes suspended an intermediate interrupt will be generated by the channel +subsystem. + +ccw_device_resume() - Resume Channel Program Execution + +If a device driver chooses to suspend the current channel program execution by +setting the CCW suspend flag on a particular CCW, the channel program execution +is suspended. In order to resume channel program execution the CIO layer +provides the ccw_device_resume() routine. + +int ccw_device_resume(struct ccw_device *cdev); + +cdev - ccw_device the resume operation is requested for + +The resume_IO() function returns: + + 0 - suspended channel program is resumed +-EBUSY - status pending +-ENODEV - cdev invalid or not-operational subchannel +-EINVAL - resume function not applicable +-ENOTCONN - there is no I/O request pending for completion + +Usage Notes: +Please have a look at the ccw_device_start() usage notes for more details on +suspended channel programs. + +ccw_device_halt() - Halt I/O Request Processing + +Sometimes a device driver might need a possibility to stop the processing of +a long-running channel program or the device might require to initially issue +a halt subchannel (HSCH) I/O command. For those purposes the ccw_device_halt() +command is provided. + +int ccw_device_halt(struct ccw_device *cdev, + unsigned long intparm); + +cdev : ccw_device the halt operation is requested for +intparm : interruption parameter; value is only used if no I/O + is outstanding, otherwise the intparm associated with + the I/O request is returned + +The ccw_device_halt() function returns : + + 0 - successful completion or request successfully initiated +-EBUSY - the device is currently busy, or status pending. +-ENODEV - cdev invalid. +-EINVAL - The device is not operational or the ccw device is not online. + +Usage Notes : + +A device driver may write a never-ending channel program by writing a channel +program that at its end loops back to its beginning by means of a transfer in +channel (TIC) command (CCW_CMD_TIC). Usually this is performed by network +device drivers by setting the PCI CCW flag (CCW_FLAG_PCI). Once this CCW is +executed a program controlled interrupt (PCI) is generated. The device driver +can then perform an appropriate action. Prior to interrupt of an outstanding +read to a network device (with or without PCI flag) a ccw_device_halt() +is required to end the pending operation. + + +Miscellaneous Support Routines + +This chapter describes various routines to be used in a Linux/390 device +driver programming environment. + +get_ccwdev_lock() + +Get the address of the device specific lock. This is then used in +spin_lock() / spin_unlock() calls. + + +__u8 ccw_device_get_path_mask(struct ccw_device *cdev); + +Get the mask of the path currently available for cdev. diff --git a/Documentation/s390/config3270.sh b/Documentation/s390/config3270.sh new file mode 100644 index 000000000000..515e2f431487 --- /dev/null +++ b/Documentation/s390/config3270.sh @@ -0,0 +1,76 @@ +#!/bin/sh +# +# config3270 -- Autoconfigure /dev/3270/* and /etc/inittab +# +# Usage: +# config3270 +# +# Output: +# /tmp/mkdev3270 +# +# Operation: +# 1. Run this script +# 2. Run the script it produces: /tmp/mkdev3270 +# 3. Issue "telinit q" or reboot, as appropriate. +# +P=/proc/tty/driver/tty3270 +ROOT= +D=$ROOT/dev +SUBD=3270 +TTY=$SUBD/tty +TUB=$SUBD/tub +SCR=$ROOT/tmp/mkdev3270 +SCRTMP=$SCR.a +GETTYLINE=:2345:respawn:/sbin/mingetty +INITTAB=$ROOT/etc/inittab +NINITTAB=$ROOT/etc/NEWinittab +OINITTAB=$ROOT/etc/OLDinittab +ADDNOTE=\\"# Additional mingettys for the 3270/tty* driver, tub3270 ---\\" + +if ! ls $P > /dev/null 2>&1; then + modprobe tub3270 > /dev/null 2>&1 +fi +ls $P > /dev/null 2>&1 || exit 1 + +# Initialize two files, one for /dev/3270 commands and one +# to replace the /etc/inittab file (old one saved in OLDinittab) +echo "#!/bin/sh" > $SCR || exit 1 +echo " " >> $SCR +echo "# Script built by /sbin/config3270" >> $SCR +if [ ! -d /dev/dasd ]; then + echo rm -rf "$D/$SUBD/*" >> $SCR +fi +echo "grep -v $TTY $INITTAB > $NINITTAB" > $SCRTMP || exit 1 +echo "echo $ADDNOTE >> $NINITTAB" >> $SCRTMP +if [ ! -d /dev/dasd ]; then + echo mkdir -p $D/$SUBD >> $SCR +fi + +# Now query the tub3270 driver for 3270 device information +# and add appropriate mknod and mingetty lines to our files +echo what=config > $P +while read devno maj min;do + if [ $min = 0 ]; then + fsmaj=$maj + if [ ! -d /dev/dasd ]; then + echo mknod $D/$TUB c $fsmaj 0 >> $SCR + echo chmod 666 $D/$TUB >> $SCR + fi + elif [ $maj = CONSOLE ]; then + if [ ! -d /dev/dasd ]; then + echo mknod $D/$TUB$devno c $fsmaj $min >> $SCR + fi + else + if [ ! -d /dev/dasd ]; then + echo mknod $D/$TTY$devno c $maj $min >>$SCR + echo mknod $D/$TUB$devno c $fsmaj $min >> $SCR + fi + echo "echo t$min$GETTYLINE $TTY$devno >> $NINITTAB" >> $SCRTMP + fi +done < $P + +echo mv $INITTAB $OINITTAB >> $SCRTMP || exit 1 +echo mv $NINITTAB $INITTAB >> $SCRTMP +cat $SCRTMP >> $SCR +rm $SCRTMP +exit 0 diff --git a/Documentation/s390/crypto/crypto-API.txt b/Documentation/s390/crypto/crypto-API.txt new file mode 100644 index 000000000000..78a77624a716 --- /dev/null +++ b/Documentation/s390/crypto/crypto-API.txt @@ -0,0 +1,83 @@ +crypto-API support for z990 Message Security Assist (MSA) instructions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +AUTHOR: Thomas Spatzier (tspat@de.ibm.com) + + +1. Introduction crypto-API +~~~~~~~~~~~~~~~~~~~~~~~~~~ +See Documentation/crypto/api-intro.txt for an introduction/description of the +kernel crypto API. +According to api-intro.txt support for z990 crypto instructions has been added +in the algorithm api layer of the crypto API. Several files containing z990 +optimized implementations of crypto algorithms are placed in the +arch/s390/crypto directory. + + +2. Probing for availability of MSA +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +It should be possible to use Kernels with the z990 crypto implementations both +on machines with MSA available an on those without MSA (pre z990 or z990 +without MSA). Therefore a simple probing mechanisms has been implemented: +In the init function of each crypto module the availability of MSA and of the +respective crypto algorithm in particular will be tested. If the algorithm is +available the module will load and register its algorithm with the crypto API. + +If the respective crypto algorithm is not available, the init function will +return -ENOSYS. In that case a fallback to the standard software implementation +of the crypto algorithm must be taken ( -> the standard crypto modules are +also build when compiling the kernel). + + +3. Ensuring z990 crypto module preference +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +If z990 crypto instructions are available the optimized modules should be +preferred instead of standard modules. + +3.1. compiled-in modules +~~~~~~~~~~~~~~~~~~~~~~~~ +For compiled-in modules it has to be ensured that the z990 modules are linked +before the standard crypto modules. Then, on system startup the init functions +of z990 crypto modules will be called first and query for availability of z990 +crypto instructions. If instruction is available, the z990 module will register +its crypto algorithm implementation -> the load of the standard module will fail +since the algorithm is already registered. +If z990 crypto instruction is not available the load of the z990 module will +fail -> the standard module will load and register its algorithm. + +3.2. dynamic modules +~~~~~~~~~~~~~~~~~~~~ +A system administrator has to take care of giving preference to z990 crypto +modules. If MSA is available appropriate lines have to be added to +/etc/modprobe.conf. + +Example: z990 crypto instruction for SHA1 algorithm is available + + add the following line to /etc/modprobe.conf (assuming the + z990 crypto modules for SHA1 is called sha1_z990): + + alias sha1 sha1_z990 + + -> when the sha1 algorithm is requested through the crypto API + (which has a module autoloader) the z990 module will be loaded. + +TBD: a userspace module probin mechanism + something like 'probe sha1 sha1_z990 sha1' in modprobe.conf + -> try module sha1_z990, if it fails to load load standard module sha1 + the 'probe' statement is currently not supported in modprobe.conf + + +4. Currently implemented z990 crypto algorithms +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The following crypto algorithms with z990 MSA support are currently implemented. +The name of each algorithm under which it is registered in crypto API and the +name of the respective module is given in square brackets. + +- SHA1 Digest Algorithm [sha1 -> sha1_z990] +- DES Encrypt/Decrypt Algorithm (64bit key) [des -> des_z990] +- Tripple DES Encrypt/Decrypt Algorithm (128bit key) [des3_ede128 -> des_z990] +- Tripple DES Encrypt/Decrypt Algorithm (192bit key) [des3_ede -> des_z990] + +In order to load, for example, the sha1_z990 module when the sha1 algorithm is +requested (see 3.2.) add 'alias sha1 sha1_z990' to /etc/modprobe.conf. + diff --git a/Documentation/s390/driver-model.txt b/Documentation/s390/driver-model.txt new file mode 100644 index 000000000000..19461958e2bd --- /dev/null +++ b/Documentation/s390/driver-model.txt @@ -0,0 +1,265 @@ +S/390 driver model interfaces +----------------------------- + +1. CCW devices +-------------- + +All devices which can be addressed by means of ccws are called 'CCW devices' - +even if they aren't actually driven by ccws. + +All ccw devices are accessed via a subchannel, this is reflected in the +structures under root/: + +root/ + - sys + - legacy + - css0/ + - 0.0.0000/0.0.0815/ + - 0.0.0001/0.0.4711/ + - 0.0.0002/ + ... + +In this example, device 0815 is accessed via subchannel 0, device 4711 via +subchannel 1, and subchannel 2 is a non-I/O subchannel. + +You should address a ccw device via its bus id (e.g. 0.0.4711); the device can +be found under bus/ccw/devices/. + +All ccw devices export some data via sysfs. + +cutype: The control unit type / model. + +devtype: The device type / model, if applicable. + +availability: Can be 'good' or 'boxed'; 'no path' or 'no device' for + disconnected devices. + +online: An interface to set the device online and offline. + In the special case of the device being disconnected (see the + notify function under 1.2), piping 0 to online will focibly delete + the device. + +The device drivers can add entries to export per-device data and interfaces. + +There is also some data exported on a per-subchannel basis (see under +bus/css/devices/): + +chpids: Via which chpids the device is connected. + +pimpampom: The path installed, path available and path operational masks. + +There also might be additional data, for example for block devices. + + +1.1 Bringing up a ccw device +---------------------------- + +This is done in several steps. + +a. Each driver can provide one or more parameter interfaces where parameters can + be specified. These interfaces are also in the driver's responsibility. +b. After a. has been performed, if necessary, the device is finally brought up + via the 'online' interface. + + +1.2 Writing a driver for ccw devices +------------------------------------ + +The basic struct ccw_device and struct ccw_driver data structures can be found +under include/asm/ccwdev.h. + +struct ccw_device { + spinlock_t *ccwlock; + struct ccw_device_private *private; + struct ccw_device_id id; + + struct ccw_driver *drv; + struct device dev; + int online; + + void (*handler) (struct ccw_device *dev, unsigned long intparm, + struct irb *irb); +}; + +struct ccw_driver { + struct module *owner; + struct ccw_device_id *ids; + int (*probe) (struct ccw_device *); + int (*remove) (struct ccw_device *); + int (*set_online) (struct ccw_device *); + int (*set_offline) (struct ccw_device *); + int (*notify) (struct ccw_device *, int); + struct device_driver driver; + char *name; +}; + +The 'private' field contains data needed for internal i/o operation only, and +is not available to the device driver. + +Each driver should declare in a MODULE_DEVICE_TABLE into which CU types/models +and/or device types/models it is interested. This information can later be found +found in the struct ccw_device_id fields: + +struct ccw_device_id { + __u16 match_flags; + + __u16 cu_type; + __u16 dev_type; + __u8 cu_model; + __u8 dev_model; + + unsigned long driver_info; +}; + +The functions in ccw_driver should be used in the following way: +probe: This function is called by the device layer for each device the driver + is interested in. The driver should only allocate private structures + to put in dev->driver_data and create attributes (if needed). Also, + the interrupt handler (see below) should be set here. + +int (*probe) (struct ccw_device *cdev); + +Parameters: cdev - the device to be probed. + + +remove: This function is called by the device layer upon removal of the driver, + the device or the module. The driver should perform cleanups here. + +int (*remove) (struct ccw_device *cdev); + +Parameters: cdev - the device to be removed. + + +set_online: This function is called by the common I/O layer when the device is + activated via the 'online' attribute. The driver should finally + setup and activate the device here. + +int (*set_online) (struct ccw_device *); + +Parameters: cdev - the device to be activated. The common layer has + verified that the device is not already online. + + +set_offline: This function is called by the common I/O layer when the device is + de-activated via the 'online' attribute. The driver should shut + down the device, but not de-allocate its private data. + +int (*set_offline) (struct ccw_device *); + +Parameters: cdev - the device to be deactivated. The common layer has + verified that the device is online. + + +notify: This function is called by the common I/O layer for some state changes + of the device. + Signalled to the driver are: + * In online state, device detached (CIO_GONE) or last path gone + (CIO_NO_PATH). The driver must return !0 to keep the device; for + return code 0, the device will be deleted as usual (also when no + notify function is registerd). If the driver wants to keep the + device, it is moved into disconnected state. + * In disconnected state, device operational again (CIO_OPER). The + common I/O layer performs some sanity checks on device number and + Device / CU to be reasonably sure if it is still the same device. + If not, the old device is removed and a new one registered. By the + return code of the notify function the device driver signals if it + wants the device back: !0 for keeping, 0 to make the device being + removed and re-registered. + +int (*notify) (struct ccw_device *, int); + +Parameters: cdev - the device whose state changed. + event - the event that happened. This can be one of CIO_GONE, + CIO_NO_PATH or CIO_OPER. + +The handler field of the struct ccw_device is meant to be set to the interrupt +handler for the device. In order to accommodate drivers which use several +distinct handlers (e.g. multi subchannel devices), this is a member of ccw_device +instead of ccw_driver. +The handler is registered with the common layer during set_online() processing +before the driver is called, and is deregistered during set_offline() after the +driver has been called. Also, after registering / before deregistering, path +grouping resp. disbanding of the path group (if applicable) are performed. + +void (*handler) (struct ccw_device *dev, unsigned long intparm, struct irb *irb); + +Parameters: dev - the device the handler is called for + intparm - the intparm which allows the device driver to identify + the i/o the interrupt is associated with, or to recognize + the interrupt as unsolicited. + irb - interruption response block which contains the accumulated + status. + +The device driver is called from the common ccw_device layer and can retrieve +information about the interrupt from the irb parameter. + + +1.3 ccwgroup devices +-------------------- + +The ccwgroup mechanism is designed to handle devices consisting of multiple ccw +devices, like lcs or ctc. + +The ccw driver provides a 'group' attribute. Piping bus ids of ccw devices to +this attributes creates a ccwgroup device consisting of these ccw devices (if +possible). This ccwgroup device can be set online or offline just like a normal +ccw device. + +Each ccwgroup device also provides an 'ungroup' attribute to destroy the device +again (only when offline). This is a generic ccwgroup mechanism (the driver does +not need to implement anything beyond normal removal routines). + +To implement a ccwgroup driver, please refer to include/asm/ccwgroup.h. Keep in +mind that most drivers will need to implement both a ccwgroup and a ccw driver +(unless you have a meta ccw driver, like cu3088 for lcs and ctc). + + +2. Channel paths +----------------- + +Channel paths show up, like subchannels, under the channel subsystem root (css0) +and are called 'chp0.<chpid>'. They have no driver and do not belong to any bus. +Please note, that unlike /proc/chpids in 2.4, the channel path objects reflect +only the logical state and not the physical state, since we cannot track the +latter consistently due to lacking machine support (we don't need to be aware +of anyway). + +status - Can be 'online' or 'offline'. + Piping 'on' or 'off' sets the chpid logically online/offline. + Piping 'on' to an online chpid triggers path reprobing for all devices + the chpid connects to. This can be used to force the kernel to re-use + a channel path the user knows to be online, but the machine hasn't + created a machine check for. + + +3. System devices +----------------- + +Note: cpus may yet be added here. + +3.1 xpram +--------- + +xpram shows up under sys/ as 'xpram'. + + +4. Other devices +---------------- + +4.1 Netiucv +----------- + +The netiucv driver creates an attribute 'connection' under +bus/iucv/drivers/netiucv. Piping to this attibute creates a new netiucv +connection to the specified host. + +Netiucv connections show up under devices/iucv/ as "netiucv<ifnum>". The interface +number is assigned sequentially to the connections defined via the 'connection' +attribute. + +user - shows the connection partner. + +buffer - maximum buffer size. + Pipe to it to change buffer size. + + diff --git a/Documentation/s390/monreader.txt b/Documentation/s390/monreader.txt new file mode 100644 index 000000000000..d843bb04906e --- /dev/null +++ b/Documentation/s390/monreader.txt @@ -0,0 +1,197 @@ + +Date : 2004-Nov-26 +Author: Gerald Schaefer (geraldsc@de.ibm.com) + + + Linux API for read access to z/VM Monitor Records + ================================================= + + +Description +=========== +This item delivers a new Linux API in the form of a misc char device that is +useable from user space and allows read access to the z/VM Monitor Records +collected by the *MONITOR System Service of z/VM. + + +User Requirements +================= +The z/VM guest on which you want to access this API needs to be configured in +order to allow IUCV connections to the *MONITOR service, i.e. it needs the +IUCV *MONITOR statement in its user entry. If the monitor DCSS to be used is +restricted (likely), you also need the NAMESAVE <DCSS NAME> statement. +This item will use the IUCV device driver to access the z/VM services, so you +need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1. + +There are two options for being able to load the monitor DCSS (examples assume +that the monitor DCSS begins at 144 MB and ends at 152 MB). You can query the +location of the monitor DCSS with the Class E privileged CP command Q NSS MAP +(the values BEGPAG and ENDPAG are given in units of 4K pages). + +See also "CP Command and Utility Reference" (SC24-6081-00) for more information +on the DEF STOR and Q NSS MAP commands, as well as "Saved Segments Planning +and Administration" (SC24-6116-00) for more information on DCSSes. + +1st option: +----------- +You can use the CP command DEF STOR CONFIG to define a "memory hole" in your +guest virtual storage around the address range of the DCSS. + +Example: DEF STOR CONFIG 0.140M 200M.200M + +This defines two blocks of storage, the first is 140MB in size an begins at +address 0MB, the second is 200MB in size and begins at address 200MB, +resulting in a total storage of 340MB. Note that the first block should +always start at 0 and be at least 64MB in size. + +2nd option: +----------- +Your guest virtual storage has to end below the starting address of the DCSS +and you have to specify the "mem=" kernel parameter in your parmfile with a +value greater than the ending address of the DCSS. + +Example: DEF STOR 140M + +This defines 140MB storage size for your guest, the parameter "mem=160M" is +added to the parmfile. + + +User Interface +============== +The char device is implemented as a kernel module named "monreader", +which can be loaded via the modprobe command, or it can be compiled into the +kernel instead. There is one optional module (or kernel) parameter, "mondcss", +to specify the name of the monitor DCSS. If the module is compiled into the +kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified +in the parmfile. + +The default name for the DCSS is "MONDCSS" if none is specified. In case that +there are other users already connected to the *MONITOR service (e.g. +Performance Toolkit), the monitor DCSS is already defined and you have to use +the same DCSS. The CP command Q MONITOR (Class E privileged) shows the name +of the monitor DCSS, if already defined, and the users connected to the +*MONITOR service. +Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor +DCSS if your z/VM doesn't have one already, you need Class E privileges to +define and save a DCSS. + +Example: +-------- +modprobe monreader mondcss=MYDCSS + +This loads the module and sets the DCSS name to "MYDCSS". + +NOTE: +----- +This API provides no interface to control the *MONITOR service, e.g. specifiy +which data should be collected. This can be done by the CP command MONITOR +(Class E privileged), see "CP Command and Utility Reference". + +Device nodes with udev: +----------------------- +After loading the module, a char device will be created along with the device +node /<udev directory>/monreader. + +Device nodes without udev: +-------------------------- +If your distribution does not support udev, a device node will not be created +automatically and you have to create it manually after loading the module. +Therefore you need to know the major and minor numbers of the device. These +numbers can be found in /sys/class/misc/monreader/dev. +Typing cat /sys/class/misc/monreader/dev will give an output of the form +<major>:<minor>. The device node can be created via the mknod command, enter +mknod <name> c <major> <minor>, where <name> is the name of the device node +to be created. + +Example: +-------- +# modprobe monreader +# cat /sys/class/misc/monreader/dev +10:63 +# mknod /dev/monreader c 10 63 + +This loads the module with the default monitor DCSS (MONDCSS) and creates a +device node. + +File operations: +---------------- +The following file operations are supported: open, release, read, poll. +There are two alternative methods for reading: either non-blocking read in +conjunction with polling, or blocking read without polling. IOCTLs are not +supported. + +Read: +----- +Reading from the device provides a 12 Byte monitor control element (MCE), +followed by a set of one or more contiguous monitor records (similar to the +output of the CMS utility MONWRITE without the 4K control blocks). The MCE +contains information on the type of the following record set (sample/event +data), the monitor domains contained within it and the start and end address +of the record set in the monitor DCSS. The start and end address can be used +to determine the size of the record set, the end address is the address of the +last byte of data. The start address is needed to handle "end-of-frame" records +correctly (domain 1, record 13), i.e. it can be used to determine the record +start offset relative to a 4K page (frame) boundary. + +See "Appendix A: *MONITOR" in the "z/VM Performance" document for a description +of the monitor control element layout. The layout of the monitor records can +be found here (z/VM 5.1): http://www.vm.ibm.com/pubs/mon510/index.html + +The layout of the data stream provided by the monreader device is as follows: +... +<0 byte read> +<first MCE> \ +<first set of records> | +... |- data set +<last MCE> | +<last set of records> / +<0 byte read> +... + +There may be more than one combination of MCE and corresponding record set +within one data set and the end of each data set is indicated by a successful +read with a return value of 0 (0 byte read). +Any received data must be considered invalid until a complete set was +read successfully, including the closing 0 byte read. Therefore you should +always read the complete set into a buffer before processing the data. + +The maximum size of a data set can be as large as the size of the +monitor DCSS, so design the buffer adequately or use dynamic memory allocation. +The size of the monitor DCSS will be printed into syslog after loading the +module. You can also use the (Class E privileged) CP command Q NSS MAP to +list all available segments and information about them. + +As with most char devices, error conditions are indicated by returning a +negative value for the number of bytes read. In this case, the errno variable +indicates the error condition: + +EIO: reply failed, read data is invalid and the application + should discard the data read since the last successful read with 0 size. +EFAULT: copy_to_user failed, read data is invalid and the application should + discard the data read since the last successful read with 0 size. +EAGAIN: occurs on a non-blocking read if there is no data available at the + moment. There is no data missing or corrupted, just try again or rather + use polling for non-blocking reads. +EOVERFLOW: message limit reached, the data read since the last successful + read with 0 size is valid but subsequent records may be missing. + +In the last case (EOVERFLOW) there may be missing data, in the first two cases +(EIO, EFAULT) there will be missing data. It's up to the application if it will +continue reading subsequent data or rather exit. + +Open: +----- +Only one user is allowed to open the char device. If it is already in use, the +open function will fail (return a negative value) and set errno to EBUSY. +The open function may also fail if an IUCV connection to the *MONITOR service +cannot be established. In this case errno will be set to EIO and an error +message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER +codes are described in the "z/VM Performance" book, Appendix A. + +NOTE: +----- +As soon as the device is opened, incoming messages will be accepted and they +will account for the message limit, i.e. opening the device without reading +from it will provoke the "message limit reached" error (EOVERFLOW error code) +eventually. + diff --git a/Documentation/s390/s390dbf.txt b/Documentation/s390/s390dbf.txt new file mode 100644 index 000000000000..2d1cd939b4df --- /dev/null +++ b/Documentation/s390/s390dbf.txt @@ -0,0 +1,615 @@ +S390 Debug Feature +================== + +files: arch/s390/kernel/debug.c + include/asm-s390/debug.h + +Description: +------------ +The goal of this feature is to provide a kernel debug logging API +where log records can be stored efficiently in memory, where each component +(e.g. device drivers) can have one separate debug log. +One purpose of this is to inspect the debug logs after a production system crash +in order to analyze the reason for the crash. +If the system still runs but only a subcomponent which uses dbf failes, +it is possible to look at the debug logs on a live system via the Linux proc +filesystem. +The debug feature may also very useful for kernel and driver development. + +Design: +------- +Kernel components (e.g. device drivers) can register themselves at the debug +feature with the function call debug_register(). This function initializes a +debug log for the caller. For each debug log exists a number of debug areas +where exactly one is active at one time. Each debug area consists of contiguous +pages in memory. In the debug areas there are stored debug entries (log records) +which are written by event- and exception-calls. + +An event-call writes the specified debug entry to the active debug +area and updates the log pointer for the active area. If the end +of the active debug area is reached, a wrap around is done (ring buffer) +and the next debug entry will be written at the beginning of the active +debug area. + +An exception-call writes the specified debug entry to the log and +switches to the next debug area. This is done in order to be sure +that the records which describe the origin of the exception are not +overwritten when a wrap around for the current area occurs. + +The debug areas itselve are also ordered in form of a ring buffer. +When an exception is thrown in the last debug area, the following debug +entries are then written again in the very first area. + +There are three versions for the event- and exception-calls: One for +logging raw data, one for text and one for numbers. + +Each debug entry contains the following data: + +- Timestamp +- Cpu-Number of calling task +- Level of debug entry (0...6) +- Return Address to caller +- Flag, if entry is an exception or not + +The debug logs can be inspected in a live system through entries in +the proc-filesystem. Under the path /proc/s390dbf there is +a directory for each registered component, which is named like the +corresponding component. + +The content of the directories are files which represent different views +to the debug log. Each component can decide which views should be +used through registering them with the function debug_register_view(). +Predefined views for hex/ascii, sprintf and raw binary data are provided. +It is also possible to define other views. The content of +a view can be inspected simply by reading the corresponding proc file. + +All debug logs have an an actual debug level (range from 0 to 6). +The default level is 3. Event and Exception functions have a 'level' +parameter. Only debug entries with a level that is lower or equal +than the actual level are written to the log. This means, when +writing events, high priority log entries should have a low level +value whereas low priority entries should have a high one. +The actual debug level can be changed with the help of the proc-filesystem +through writing a number string "x" to the 'level' proc file which is +provided for every debug log. Debugging can be switched off completely +by using "-" on the 'level' proc file. + +Example: + +> echo "-" > /proc/s390dbf/dasd/level + +It is also possible to deactivate the debug feature globally for every +debug log. You can change the behavior using 2 sysctl parameters in +/proc/sys/s390dbf: +There are currently 2 possible triggers, which stop the debug feature +globally. The first possbility is to use the "debug_active" sysctl. If +set to 1 the debug feature is running. If "debug_active" is set to 0 the +debug feature is turned off. +The second trigger which stops the debug feature is an kernel oops. +That prevents the debug feature from overwriting debug information that +happened before the oops. After an oops you can reactivate the debug feature +by piping 1 to /proc/sys/s390dbf/debug_active. Nevertheless, its not +suggested to use an oopsed kernel in an production environment. +If you want to disallow the deactivation of the debug feature, you can use +the "debug_stoppable" sysctl. If you set "debug_stoppable" to 0 the debug +feature cannot be stopped. If the debug feature is already stopped, it +will stay deactivated. + +Kernel Interfaces: +------------------ + +---------------------------------------------------------------------------- +debug_info_t *debug_register(char *name, int pages_index, int nr_areas, + int buf_size); + +Parameter: name: Name of debug log (e.g. used for proc entry) + pages_index: 2^pages_index pages will be allocated per area + nr_areas: number of debug areas + buf_size: size of data area in each debug entry + +Return Value: Handle for generated debug area + NULL if register failed + +Description: Allocates memory for a debug log + Must not be called within an interrupt handler + +--------------------------------------------------------------------------- +void debug_unregister (debug_info_t * id); + +Parameter: id: handle for debug log + +Return Value: none + +Description: frees memory for a debug log + Must not be called within an interrupt handler + +--------------------------------------------------------------------------- +void debug_set_level (debug_info_t * id, int new_level); + +Parameter: id: handle for debug log + new_level: new debug level + +Return Value: none + +Description: Sets new actual debug level if new_level is valid. + +--------------------------------------------------------------------------- ++void debug_stop_all(void); + +Parameter: none + +Return Value: none + +Description: stops the debug feature if stopping is allowed. Currently + used in case of a kernel oops. + +--------------------------------------------------------------------------- +debug_entry_t* debug_event (debug_info_t* id, int level, void* data, + int length); + +Parameter: id: handle for debug log + level: debug level + data: pointer to data for debug entry + length: length of data in bytes + +Return Value: Address of written debug entry + +Description: writes debug entry to active debug area (if level <= actual + debug level) + +--------------------------------------------------------------------------- +debug_entry_t* debug_int_event (debug_info_t * id, int level, + unsigned int data); +debug_entry_t* debug_long_event(debug_info_t * id, int level, + unsigned long data); + +Parameter: id: handle for debug log + level: debug level + data: integer value for debug entry + +Return Value: Address of written debug entry + +Description: writes debug entry to active debug area (if level <= actual + debug level) + +--------------------------------------------------------------------------- +debug_entry_t* debug_text_event (debug_info_t * id, int level, + const char* data); + +Parameter: id: handle for debug log + level: debug level + data: string for debug entry + +Return Value: Address of written debug entry + +Description: writes debug entry in ascii format to active debug area + (if level <= actual debug level) + +--------------------------------------------------------------------------- +debug_entry_t* debug_sprintf_event (debug_info_t * id, int level, + char* string,...); + +Parameter: id: handle for debug log + level: debug level + string: format string for debug entry + ...: varargs used as in sprintf() + +Return Value: Address of written debug entry + +Description: writes debug entry with format string and varargs (longs) to + active debug area (if level $<=$ actual debug level). + floats and long long datatypes cannot be used as varargs. + +--------------------------------------------------------------------------- + +debug_entry_t* debug_exception (debug_info_t* id, int level, void* data, + int length); + +Parameter: id: handle for debug log + level: debug level + data: pointer to data for debug entry + length: length of data in bytes + +Return Value: Address of written debug entry + +Description: writes debug entry to active debug area (if level <= actual + debug level) and switches to next debug area + +--------------------------------------------------------------------------- +debug_entry_t* debug_int_exception (debug_info_t * id, int level, + unsigned int data); +debug_entry_t* debug_long_exception(debug_info_t * id, int level, + unsigned long data); + +Parameter: id: handle for debug log + level: debug level + data: integer value for debug entry + +Return Value: Address of written debug entry + +Description: writes debug entry to active debug area (if level <= actual + debug level) and switches to next debug area + +--------------------------------------------------------------------------- +debug_entry_t* debug_text_exception (debug_info_t * id, int level, + const char* data); + +Parameter: id: handle for debug log + level: debug level + data: string for debug entry + +Return Value: Address of written debug entry + +Description: writes debug entry in ascii format to active debug area + (if level <= actual debug level) and switches to next debug + area + +--------------------------------------------------------------------------- +debug_entry_t* debug_sprintf_exception (debug_info_t * id, int level, + char* string,...); + +Parameter: id: handle for debug log + level: debug level + string: format string for debug entry + ...: varargs used as in sprintf() + +Return Value: Address of written debug entry + +Description: writes debug entry with format string and varargs (longs) to + active debug area (if level $<=$ actual debug level) and + switches to next debug area. + floats and long long datatypes cannot be used as varargs. + +--------------------------------------------------------------------------- + +int debug_register_view (debug_info_t * id, struct debug_view *view); + +Parameter: id: handle for debug log + view: pointer to debug view struct + +Return Value: 0 : ok + < 0: Error + +Description: registers new debug view and creates proc dir entry + +--------------------------------------------------------------------------- +int debug_unregister_view (debug_info_t * id, struct debug_view *view); + +Parameter: id: handle for debug log + view: pointer to debug view struct + +Return Value: 0 : ok + < 0: Error + +Description: unregisters debug view and removes proc dir entry + + + +Predefined views: +----------------- + +extern struct debug_view debug_hex_ascii_view; +extern struct debug_view debug_raw_view; +extern struct debug_view debug_sprintf_view; + +Examples +-------- + +/* + * hex_ascii- + raw-view Example + */ + +#include <linux/init.h> +#include <asm/debug.h> + +static debug_info_t* debug_info; + +static int init(void) +{ + /* register 4 debug areas with one page each and 4 byte data field */ + + debug_info = debug_register ("test", 0, 4, 4 ); + debug_register_view(debug_info,&debug_hex_ascii_view); + debug_register_view(debug_info,&debug_raw_view); + + debug_text_event(debug_info, 4 , "one "); + debug_int_exception(debug_info, 4, 4711); + debug_event(debug_info, 3, &debug_info, 4); + + return 0; +} + +static void cleanup(void) +{ + debug_unregister (debug_info); +} + +module_init(init); +module_exit(cleanup); + +--------------------------------------------------------------------------- + +/* + * sprintf-view Example + */ + +#include <linux/init.h> +#include <asm/debug.h> + +static debug_info_t* debug_info; + +static int init(void) +{ + /* register 4 debug areas with one page each and data field for */ + /* format string pointer + 2 varargs (= 3 * sizeof(long)) */ + + debug_info = debug_register ("test", 0, 4, sizeof(long) * 3); + debug_register_view(debug_info,&debug_sprintf_view); + + debug_sprintf_event(debug_info, 2 , "first event in %s:%i\n",__FILE__,__LINE__); + debug_sprintf_exception(debug_info, 1, "pointer to debug info: %p\n",&debug_info); + + return 0; +} + +static void cleanup(void) +{ + debug_unregister (debug_info); +} + +module_init(init); +module_exit(cleanup); + + + +ProcFS Interface +---------------- +Views to the debug logs can be investigated through reading the corresponding +proc-files: + +Example: + +> ls /proc/s390dbf/dasd +flush hex_ascii level raw +> cat /proc/s390dbf/dasd/hex_ascii | sort +1 +00 00974733272:680099 2 - 02 0006ad7e 07 ea 4a 90 | .... +00 00974733272:682210 2 - 02 0006ade6 46 52 45 45 | FREE +00 00974733272:682213 2 - 02 0006adf6 07 ea 4a 90 | .... +00 00974733272:682281 1 * 02 0006ab08 41 4c 4c 43 | EXCP +01 00974733272:682284 2 - 02 0006ab16 45 43 4b 44 | ECKD +01 00974733272:682287 2 - 02 0006ab28 00 00 00 04 | .... +01 00974733272:682289 2 - 02 0006ab3e 00 00 00 20 | ... +01 00974733272:682297 2 - 02 0006ad7e 07 ea 4a 90 | .... +01 00974733272:684384 2 - 00 0006ade6 46 52 45 45 | FREE +01 00974733272:684388 2 - 00 0006adf6 07 ea 4a 90 | .... + +See section about predefined views for explanation of the above output! + +Changing the debug level +------------------------ + +Example: + + +> cat /proc/s390dbf/dasd/level +3 +> echo "5" > /proc/s390dbf/dasd/level +> cat /proc/s390dbf/dasd/level +5 + +Flushing debug areas +-------------------- +Debug areas can be flushed with piping the number of the desired +area (0...n) to the proc file "flush". When using "-" all debug areas +are flushed. + +Examples: + +1. Flush debug area 0: +> echo "0" > /proc/s390dbf/dasd/flush + +2. Flush all debug areas: +> echo "-" > /proc/s390dbf/dasd/flush + +Stooping the debug feature +-------------------------- +Example: + +1. Check if stopping is allowed +> cat /proc/sys/s390dbf/debug_stoppable +2. Stop debug feature +> echo 0 > /proc/sys/s390dbf/debug_active + +lcrash Interface +---------------- +It is planned that the dump analysis tool lcrash gets an additional command +'s390dbf' to display all the debug logs. With this tool it will be possible +to investigate the debug logs on a live system and with a memory dump after +a system crash. + +Investigating raw memory +------------------------ +One last possibility to investigate the debug logs at a live +system and after a system crash is to look at the raw memory +under VM or at the Service Element. +It is possible to find the anker of the debug-logs through +the 'debug_area_first' symbol in the System map. Then one has +to follow the correct pointers of the data-structures defined +in debug.h and find the debug-areas in memory. +Normally modules which use the debug feature will also have +a global variable with the pointer to the debug-logs. Following +this pointer it will also be possible to find the debug logs in +memory. + +For this method it is recommended to use '16 * x + 4' byte (x = 0..n) +for the length of the data field in debug_register() in +order to see the debug entries well formatted. + + +Predefined Views +---------------- + +There are three predefined views: hex_ascii, raw and sprintf. +The hex_ascii view shows the data field in hex and ascii representation +(e.g. '45 43 4b 44 | ECKD'). +The raw view returns a bytestream as the debug areas are stored in memory. + +The sprintf view formats the debug entries in the same way as the sprintf +function would do. The sprintf event/expection fuctions write to the +debug entry a pointer to the format string (size = sizeof(long)) +and for each vararg a long value. So e.g. for a debug entry with a format +string plus two varargs one would need to allocate a (3 * sizeof(long)) +byte data area in the debug_register() function. + + +NOTE: If using the sprintf view do NOT use other event/exception functions +than the sprintf-event and -exception functions. + +The format of the hex_ascii and sprintf view is as follows: +- Number of area +- Timestamp (formatted as seconds and microseconds since 00:00:00 Coordinated + Universal Time (UTC), January 1, 1970) +- level of debug entry +- Exception flag (* = Exception) +- Cpu-Number of calling task +- Return Address to caller +- data field + +The format of the raw view is: +- Header as described in debug.h +- datafield + +A typical line of the hex_ascii view will look like the following (first line +is only for explanation and will not be displayed when 'cating' the view): + +area time level exception cpu caller data (hex + ascii) +-------------------------------------------------------------------------- +00 00964419409:440690 1 - 00 88023fe + + +Defining views +-------------- + +Views are specified with the 'debug_view' structure. There are defined +callback functions which are used for reading and writing the proc files: + +struct debug_view { + char name[DEBUG_MAX_PROCF_LEN]; + debug_prolog_proc_t* prolog_proc; + debug_header_proc_t* header_proc; + debug_format_proc_t* format_proc; + debug_input_proc_t* input_proc; + void* private_data; +}; + +where + +typedef int (debug_header_proc_t) (debug_info_t* id, + struct debug_view* view, + int area, + debug_entry_t* entry, + char* out_buf); + +typedef int (debug_format_proc_t) (debug_info_t* id, + struct debug_view* view, char* out_buf, + const char* in_buf); +typedef int (debug_prolog_proc_t) (debug_info_t* id, + struct debug_view* view, + char* out_buf); +typedef int (debug_input_proc_t) (debug_info_t* id, + struct debug_view* view, + struct file* file, const char* user_buf, + size_t in_buf_size, loff_t* offset); + + +The "private_data" member can be used as pointer to view specific data. +It is not used by the debug feature itself. + +The output when reading a debug-proc file is structured like this: + +"prolog_proc output" + +"header_proc output 1" "format_proc output 1" +"header_proc output 2" "format_proc output 2" +"header_proc output 3" "format_proc output 3" +... + +When a view is read from the proc fs, the Debug Feature calls the +'prolog_proc' once for writing the prolog. +Then 'header_proc' and 'format_proc' are called for each +existing debug entry. + +The input_proc can be used to implement functionality when it is written to +the view (e.g. like with 'echo "0" > /proc/s390dbf/dasd/level). + +For header_proc there can be used the default function +debug_dflt_header_fn() which is defined in in debug.h. +and which produces the same header output as the predefined views. +E.g: +00 00964419409:440761 2 - 00 88023ec + +In order to see how to use the callback functions check the implementation +of the default views! + +Example + +#include <asm/debug.h> + +#define UNKNOWNSTR "data: %08x" + +const char* messages[] = +{"This error...........\n", + "That error...........\n", + "Problem..............\n", + "Something went wrong.\n", + "Everything ok........\n", + NULL +}; + +static int debug_test_format_fn( + debug_info_t * id, struct debug_view *view, + char *out_buf, const char *in_buf +) +{ + int i, rc = 0; + + if(id->buf_size >= 4) { + int msg_nr = *((int*)in_buf); + if(msg_nr < sizeof(messages)/sizeof(char*) - 1) + rc += sprintf(out_buf, "%s", messages[msg_nr]); + else + rc += sprintf(out_buf, UNKNOWNSTR, msg_nr); + } + out: + return rc; +} + +struct debug_view debug_test_view = { + "myview", /* name of view */ + NULL, /* no prolog */ + &debug_dflt_header_fn, /* default header for each entry */ + &debug_test_format_fn, /* our own format function */ + NULL, /* no input function */ + NULL /* no private data */ +}; + +===== +test: +===== +debug_info_t *debug_info; +... +debug_info = debug_register ("test", 0, 4, 4 )); +debug_register_view(debug_info, &debug_test_view); +for(i = 0; i < 10; i ++) debug_int_event(debug_info, 1, i); + +> cat /proc/s390dbf/test/myview +00 00964419734:611402 1 - 00 88042ca This error........... +00 00964419734:611405 1 - 00 88042ca That error........... +00 00964419734:611408 1 - 00 88042ca Problem.............. +00 00964419734:611411 1 - 00 88042ca Something went wrong. +00 00964419734:611414 1 - 00 88042ca Everything ok........ +00 00964419734:611417 1 - 00 88042ca data: 00000005 +00 00964419734:611419 1 - 00 88042ca data: 00000006 +00 00964419734:611422 1 - 00 88042ca data: 00000007 +00 00964419734:611425 1 - 00 88042ca data: 00000008 +00 00964419734:611428 1 - 00 88042ca data: 00000009 diff --git a/Documentation/sched-coding.txt b/Documentation/sched-coding.txt new file mode 100644 index 000000000000..2b75ef67c9fe --- /dev/null +++ b/Documentation/sched-coding.txt @@ -0,0 +1,126 @@ + Reference for various scheduler-related methods in the O(1) scheduler + Robert Love <rml@tech9.net>, MontaVista Software + + +Note most of these methods are local to kernel/sched.c - this is by design. +The scheduler is meant to be self-contained and abstracted away. This document +is primarily for understanding the scheduler, not interfacing to it. Some of +the discussed interfaces, however, are general process/scheduling methods. +They are typically defined in include/linux/sched.h. + + +Main Scheduling Methods +----------------------- + +void load_balance(runqueue_t *this_rq, int idle) + Attempts to pull tasks from one cpu to another to balance cpu usage, + if needed. This method is called explicitly if the runqueues are + inbalanced or periodically by the timer tick. Prior to calling, + the current runqueue must be locked and interrupts disabled. + +void schedule() + The main scheduling function. Upon return, the highest priority + process will be active. + + +Locking +------- + +Each runqueue has its own lock, rq->lock. When multiple runqueues need +to be locked, lock acquires must be ordered by ascending &runqueue value. + +A specific runqueue is locked via + + task_rq_lock(task_t pid, unsigned long *flags) + +which disables preemption, disables interrupts, and locks the runqueue pid is +running on. Likewise, + + task_rq_unlock(task_t pid, unsigned long *flags) + +unlocks the runqueue pid is running on, restores interrupts to their previous +state, and reenables preemption. + +The routines + + double_rq_lock(runqueue_t *rq1, runqueue_t *rq2) + +and + + double_rq_unlock(runqueue_t *rq1, runqueue_t *rq2) + +safely lock and unlock, respectively, the two specified runqueues. They do +not, however, disable and restore interrupts. Users are required to do so +manually before and after calls. + + +Values +------ + +MAX_PRIO + The maximum priority of the system, stored in the task as task->prio. + Lower priorities are higher. Normal (non-RT) priorities range from + MAX_RT_PRIO to (MAX_PRIO - 1). +MAX_RT_PRIO + The maximum real-time priority of the system. Valid RT priorities + range from 0 to (MAX_RT_PRIO - 1). +MAX_USER_RT_PRIO + The maximum real-time priority that is exported to user-space. Should + always be equal to or less than MAX_RT_PRIO. Setting it less allows + kernel threads to have higher priorities than any user-space task. +MIN_TIMESLICE +MAX_TIMESLICE + Respectively, the minimum and maximum timeslices (quanta) of a process. + +Data +---- + +struct runqueue + The main per-CPU runqueue data structure. +struct task_struct + The main per-process data structure. + + +General Methods +--------------- + +cpu_rq(cpu) + Returns the runqueue of the specified cpu. +this_rq() + Returns the runqueue of the current cpu. +task_rq(pid) + Returns the runqueue which holds the specified pid. +cpu_curr(cpu) + Returns the task currently running on the given cpu. +rt_task(pid) + Returns true if pid is real-time, false if not. + + +Process Control Methods +----------------------- + +void set_user_nice(task_t *p, long nice) + Sets the "nice" value of task p to the given value. +int setscheduler(pid_t pid, int policy, struct sched_param *param) + Sets the scheduling policy and parameters for the given pid. +int set_cpus_allowed(task_t *p, unsigned long new_mask) + Sets a given task's CPU affinity and migrates it to a proper cpu. + Callers must have a valid reference to the task and assure the + task not exit prematurely. No locks can be held during the call. +set_task_state(tsk, state_value) + Sets the given task's state to the given value. +set_current_state(state_value) + Sets the current task's state to the given value. +void set_tsk_need_resched(struct task_struct *tsk) + Sets need_resched in the given task. +void clear_tsk_need_resched(struct task_struct *tsk) + Clears need_resched in the given task. +void set_need_resched() + Sets need_resched in the current task. +void clear_need_resched() + Clears need_resched in the current task. +int need_resched() + Returns true if need_resched is set in the current task, false + otherwise. +yield() + Place the current process at the end of the runqueue and call schedule. diff --git a/Documentation/sched-design.txt b/Documentation/sched-design.txt new file mode 100644 index 000000000000..9d04e7bbf45f --- /dev/null +++ b/Documentation/sched-design.txt @@ -0,0 +1,165 @@ + Goals, Design and Implementation of the + new ultra-scalable O(1) scheduler + + + This is an edited version of an email Ingo Molnar sent to + lkml on 4 Jan 2002. It describes the goals, design, and + implementation of Ingo's new ultra-scalable O(1) scheduler. + Last Updated: 18 April 2002. + + +Goal +==== + +The main goal of the new scheduler is to keep all the good things we know +and love about the current Linux scheduler: + + - good interactive performance even during high load: if the user + types or clicks then the system must react instantly and must execute + the user tasks smoothly, even during considerable background load. + + - good scheduling/wakeup performance with 1-2 runnable processes. + + - fairness: no process should stay without any timeslice for any + unreasonable amount of time. No process should get an unjustly high + amount of CPU time. + + - priorities: less important tasks can be started with lower priority, + more important tasks with higher priority. + + - SMP efficiency: no CPU should stay idle if there is work to do. + + - SMP affinity: processes which run on one CPU should stay affine to + that CPU. Processes should not bounce between CPUs too frequently. + + - plus additional scheduler features: RT scheduling, CPU binding. + +and the goal is also to add a few new things: + + - fully O(1) scheduling. Are you tired of the recalculation loop + blowing the L1 cache away every now and then? Do you think the goodness + loop is taking a bit too long to finish if there are lots of runnable + processes? This new scheduler takes no prisoners: wakeup(), schedule(), + the timer interrupt are all O(1) algorithms. There is no recalculation + loop. There is no goodness loop either. + + - 'perfect' SMP scalability. With the new scheduler there is no 'big' + runqueue_lock anymore - it's all per-CPU runqueues and locks - two + tasks on two separate CPUs can wake up, schedule and context-switch + completely in parallel, without any interlocking. All + scheduling-relevant data is structured for maximum scalability. + + - better SMP affinity. The old scheduler has a particular weakness that + causes the random bouncing of tasks between CPUs if/when higher + priority/interactive tasks, this was observed and reported by many + people. The reason is that the timeslice recalculation loop first needs + every currently running task to consume its timeslice. But when this + happens on eg. an 8-way system, then this property starves an + increasing number of CPUs from executing any process. Once the last + task that has a timeslice left has finished using up that timeslice, + the recalculation loop is triggered and other CPUs can start executing + tasks again - after having idled around for a number of timer ticks. + The more CPUs, the worse this effect. + + Furthermore, this same effect causes the bouncing effect as well: + whenever there is such a 'timeslice squeeze' of the global runqueue, + idle processors start executing tasks which are not affine to that CPU. + (because the affine tasks have finished off their timeslices already.) + + The new scheduler solves this problem by distributing timeslices on a + per-CPU basis, without having any global synchronization or + recalculation. + + - batch scheduling. A significant proportion of computing-intensive tasks + benefit from batch-scheduling, where timeslices are long and processes + are roundrobin scheduled. The new scheduler does such batch-scheduling + of the lowest priority tasks - so nice +19 jobs will get + 'batch-scheduled' automatically. With this scheduler, nice +19 jobs are + in essence SCHED_IDLE, from an interactiveness point of view. + + - handle extreme loads more smoothly, without breakdown and scheduling + storms. + + - O(1) RT scheduling. For those RT folks who are paranoid about the + O(nr_running) property of the goodness loop and the recalculation loop. + + - run fork()ed children before the parent. Andrea has pointed out the + advantages of this a few months ago, but patches for this feature + do not work with the old scheduler as well as they should, + because idle processes often steal the new child before the fork()ing + CPU gets to execute it. + + +Design +====== + +the core of the new scheduler are the following mechanizms: + + - *two*, priority-ordered 'priority arrays' per CPU. There is an 'active' + array and an 'expired' array. The active array contains all tasks that + are affine to this CPU and have timeslices left. The expired array + contains all tasks which have used up their timeslices - but this array + is kept sorted as well. The active and expired array is not accessed + directly, it's accessed through two pointers in the per-CPU runqueue + structure. If all active tasks are used up then we 'switch' the two + pointers and from now on the ready-to-go (former-) expired array is the + active array - and the empty active array serves as the new collector + for expired tasks. + + - there is a 64-bit bitmap cache for array indices. Finding the highest + priority task is thus a matter of two x86 BSFL bit-search instructions. + +the split-array solution enables us to have an arbitrary number of active +and expired tasks, and the recalculation of timeslices can be done +immediately when the timeslice expires. Because the arrays are always +access through the pointers in the runqueue, switching the two arrays can +be done very quickly. + +this is a hybride priority-list approach coupled with roundrobin +scheduling and the array-switch method of distributing timeslices. + + - there is a per-task 'load estimator'. + +one of the toughest things to get right is good interactive feel during +heavy system load. While playing with various scheduler variants i found +that the best interactive feel is achieved not by 'boosting' interactive +tasks, but by 'punishing' tasks that want to use more CPU time than there +is available. This method is also much easier to do in an O(1) fashion. + +to establish the actual 'load' the task contributes to the system, a +complex-looking but pretty accurate method is used: there is a 4-entry +'history' ringbuffer of the task's activities during the last 4 seconds. +This ringbuffer is operated without much overhead. The entries tell the +scheduler a pretty accurate load-history of the task: has it used up more +CPU time or less during the past N seconds. [the size '4' and the interval +of 4x 1 seconds was found by lots of experimentation - this part is +flexible and can be changed in both directions.] + +the penalty a task gets for generating more load than the CPU can handle +is a priority decrease - there is a maximum amount to this penalty +relative to their static priority, so even fully CPU-bound tasks will +observe each other's priorities, and will share the CPU accordingly. + +the SMP load-balancer can be extended/switched with additional parallel +computing and cache hierarchy concepts: NUMA scheduling, multi-core CPUs +can be supported easily by changing the load-balancer. Right now it's +tuned for my SMP systems. + +i skipped the prev->mm == next->mm advantage - no workload i know of shows +any sensitivity to this. It can be added back by sacrificing O(1) +schedule() [the current and one-lower priority list can be searched for a +that->mm == current->mm condition], but costs a fair number of cycles +during a number of important workloads, so i wanted to avoid this as much +as possible. + +- the SMP idle-task startup code was still racy and the new scheduler +triggered this. So i streamlined the idle-setup code a bit. We do not call +into schedule() before all processors have started up fully and all idle +threads are in place. + +- the patch also cleans up a number of aspects of sched.c - moves code +into other areas of the kernel where it's appropriate, and simplifies +certain code paths and data constructs. As a result, the new scheduler's +code is smaller than the old one. + + Ingo diff --git a/Documentation/sched-domains.txt b/Documentation/sched-domains.txt new file mode 100644 index 000000000000..a9e990ab980f --- /dev/null +++ b/Documentation/sched-domains.txt @@ -0,0 +1,70 @@ +Each CPU has a "base" scheduling domain (struct sched_domain). These are +accessed via cpu_sched_domain(i) and this_sched_domain() macros. The domain +hierarchy is built from these base domains via the ->parent pointer. ->parent +MUST be NULL terminated, and domain structures should be per-CPU as they +are locklessly updated. + +Each scheduling domain spans a number of CPUs (stored in the ->span field). +A domain's span MUST be a superset of it child's span (this restriction could +be relaxed if the need arises), and a base domain for CPU i MUST span at least +i. The top domain for each CPU will generally span all CPUs in the system +although strictly it doesn't have to, but this could lead to a case where some +CPUs will never be given tasks to run unless the CPUs allowed mask is +explicitly set. A sched domain's span means "balance process load among these +CPUs". + +Each scheduling domain must have one or more CPU groups (struct sched_group) +which are organised as a circular one way linked list from the ->groups +pointer. The union of cpumasks of these groups MUST be the same as the +domain's span. The intersection of cpumasks from any two of these groups +MUST be the empty set. The group pointed to by the ->groups pointer MUST +contain the CPU to which the domain belongs. Groups may be shared among +CPUs as they contain read only data after they have been set up. + +Balancing within a sched domain occurs between groups. That is, each group +is treated as one entity. The load of a group is defined as the sum of the +load of each of its member CPUs, and only when the load of a group becomes +out of balance are tasks moved between groups. + +In kernel/sched.c, rebalance_tick is run periodically on each CPU. This +function takes its CPU's base sched domain and checks to see if has reached +its rebalance interval. If so, then it will run load_balance on that domain. +rebalance_tick then checks the parent sched_domain (if it exists), and the +parent of the parent and so forth. + +*** Implementing sched domains *** +The "base" domain will "span" the first level of the hierarchy. In the case +of SMT, you'll span all siblings of the physical CPU, with each group being +a single virtual CPU. + +In SMP, the parent of the base domain will span all physical CPUs in the +node. Each group being a single physical CPU. Then with NUMA, the parent +of the SMP domain will span the entire machine, with each group having the +cpumask of a node. Or, you could do multi-level NUMA or Opteron, for example, +might have just one domain covering its one NUMA level. + +The implementor should read comments in include/linux/sched.h: +struct sched_domain fields, SD_FLAG_*, SD_*_INIT to get an idea of +the specifics and what to tune. + +For SMT, the architecture must define CONFIG_SCHED_SMT and provide a +cpumask_t cpu_sibling_map[NR_CPUS], where cpu_sibling_map[i] is the mask of +all "i"'s siblings as well as "i" itself. + +Architectures may retain the regular override the default SD_*_INIT flags +while using the generic domain builder in kernel/sched.c if they wish to +retain the traditional SMT->SMP->NUMA topology (or some subset of that). This +can be done by #define'ing ARCH_HASH_SCHED_TUNE. + +Alternatively, the architecture may completely override the generic domain +builder by #define'ing ARCH_HASH_SCHED_DOMAIN, and exporting your +arch_init_sched_domains function. This function will attach domains to all +CPUs using cpu_attach_domain. + +Implementors should change the line +#undef SCHED_DOMAIN_DEBUG +to +#define SCHED_DOMAIN_DEBUG +in kernel/sched.c as this enables an error checking parse of the sched domains +which should catch most possible errors (described above). It also prints out +the domain structure in a visual format. diff --git a/Documentation/sched-stats.txt b/Documentation/sched-stats.txt new file mode 100644 index 000000000000..6f72021aae51 --- /dev/null +++ b/Documentation/sched-stats.txt @@ -0,0 +1,153 @@ +Version 10 of schedstats includes support for sched_domains, which +hit the mainline kernel in 2.6.7. Some counters make more sense to be +per-runqueue; other to be per-domain. Note that domains (and their associated +information) will only be pertinent and available on machines utilizing +CONFIG_SMP. + +In version 10 of schedstat, there is at least one level of domain +statistics for each cpu listed, and there may well be more than one +domain. Domains have no particular names in this implementation, but +the highest numbered one typically arbitrates balancing across all the +cpus on the machine, while domain0 is the most tightly focused domain, +sometimes balancing only between pairs of cpus. At this time, there +are no architectures which need more than three domain levels. The first +field in the domain stats is a bit map indicating which cpus are affected +by that domain. + +These fields are counters, and only increment. Programs which make use +of these will need to start with a baseline observation and then calculate +the change in the counters at each subsequent observation. A perl script +which does this for many of the fields is available at + + http://eaglet.rain.com/rick/linux/schedstat/ + +Note that any such script will necessarily be version-specific, as the main +reason to change versions is changes in the output format. For those wishing +to write their own scripts, the fields are described here. + +CPU statistics +-------------- +cpu<N> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 + +NOTE: In the sched_yield() statistics, the active queue is considered empty + if it has only one process in it, since obviously the process calling + sched_yield() is that process. + +First four fields are sched_yield() statistics: + 1) # of times both the active and the expired queue were empty + 2) # of times just the active queue was empty + 3) # of times just the expired queue was empty + 4) # of times sched_yield() was called + +Next four are schedule() statistics: + 5) # of times the active queue had at least one other process on it + 6) # of times we switched to the expired queue and reused it + 7) # of times schedule() was called + 8) # of times schedule() left the processor idle + +Next four are active_load_balance() statistics: + 9) # of times active_load_balance() was called + 10) # of times active_load_balance() caused this cpu to gain a task + 11) # of times active_load_balance() caused this cpu to lose a task + 12) # of times active_load_balance() tried to move a task and failed + +Next three are try_to_wake_up() statistics: + 13) # of times try_to_wake_up() was called + 14) # of times try_to_wake_up() successfully moved the awakening task + 15) # of times try_to_wake_up() attempted to move the awakening task + +Next two are wake_up_new_task() statistics: + 16) # of times wake_up_new_task() was called + 17) # of times wake_up_new_task() successfully moved the new task + +Next one is a sched_migrate_task() statistic: + 18) # of times sched_migrate_task() was called + +Next one is a sched_balance_exec() statistic: + 19) # of times sched_balance_exec() was called + +Next three are statistics describing scheduling latency: + 20) sum of all time spent running by tasks on this processor (in ms) + 21) sum of all time spent waiting to run by tasks on this processor (in ms) + 22) # of tasks (not necessarily unique) given to the processor + +The last six are statistics dealing with pull_task(): + 23) # of times pull_task() moved a task to this cpu when newly idle + 24) # of times pull_task() stole a task from this cpu when another cpu + was newly idle + 25) # of times pull_task() moved a task to this cpu when idle + 26) # of times pull_task() stole a task from this cpu when another cpu + was idle + 27) # of times pull_task() moved a task to this cpu when busy + 28) # of times pull_task() stole a task from this cpu when another cpu + was busy + + +Domain statistics +----------------- +One of these is produced per domain for each cpu described. (Note that if +CONFIG_SMP is not defined, *no* domains are utilized and these lines +will not appear in the output.) + +domain<N> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 + +The first field is a bit mask indicating what cpus this domain operates over. + +The next fifteen are a variety of load_balance() statistics: + + 1) # of times in this domain load_balance() was called when the cpu + was idle + 2) # of times in this domain load_balance() was called when the cpu + was busy + 3) # of times in this domain load_balance() was called when the cpu + was just becoming idle + 4) # of times in this domain load_balance() tried to move one or more + tasks and failed, when the cpu was idle + 5) # of times in this domain load_balance() tried to move one or more + tasks and failed, when the cpu was busy + 6) # of times in this domain load_balance() tried to move one or more + tasks and failed, when the cpu was just becoming idle + 7) sum of imbalances discovered (if any) with each call to + load_balance() in this domain when the cpu was idle + 8) sum of imbalances discovered (if any) with each call to + load_balance() in this domain when the cpu was busy + 9) sum of imbalances discovered (if any) with each call to + load_balance() in this domain when the cpu was just becoming idle + 10) # of times in this domain load_balance() was called but did not find + a busier queue while the cpu was idle + 11) # of times in this domain load_balance() was called but did not find + a busier queue while the cpu was busy + 12) # of times in this domain load_balance() was called but did not find + a busier queue while the cpu was just becoming idle + 13) # of times in this domain a busier queue was found while the cpu was + idle but no busier group was found + 14) # of times in this domain a busier queue was found while the cpu was + busy but no busier group was found + 15) # of times in this domain a busier queue was found while the cpu was + just becoming idle but no busier group was found + +Next two are sched_balance_exec() statistics: + 17) # of times in this domain sched_balance_exec() successfully pushed + a task to a new cpu + 18) # of times in this domain sched_balance_exec() tried but failed to + push a task to a new cpu + +Next two are try_to_wake_up() statistics: + 19) # of times in this domain try_to_wake_up() tried to move a task based + on affinity and cache warmth + 20) # of times in this domain try_to_wake_up() tried to move a task based + on load balancing + + +/proc/<pid>/schedstat +---------------- +schedstats also adds a new /proc/<pid/schedstat file to include some of +the same information on a per-process level. There are three fields in +this file correlating to fields 20, 21, and 22 in the CPU fields, but +they only apply for that process. + +A program could be easily written to make use of these extra fields to +report on how well a particular process or set of processes is faring +under the scheduler's policies. A simple version of such a program is +available at + http://eaglet.rain.com/rick/linux/schedstat/v10/latency.c diff --git a/Documentation/scsi/00-INDEX b/Documentation/scsi/00-INDEX new file mode 100644 index 000000000000..f9cb5bdcce41 --- /dev/null +++ b/Documentation/scsi/00-INDEX @@ -0,0 +1,70 @@ +00-INDEX + - this file +53c700.txt + - info on driver for 53c700 based adapters +AM53C974.txt + - info on driver for AM53c974 based adapters +BusLogic.txt + - info on driver for adapters with BusLogic chips +ChangeLog + - Changes to scsi files, if not listed elsewhere +ChangeLog.ips + - IBM ServeRAID driver Changelog +ChangeLog.ncr53c8xx + - Changes to ncr53c8xx driver +ChangeLog.sym53c8xx + - Changes to sym53c8xx driver +ChangeLog.sym53c8xx_2 + - Changes to second generation of sym53c8xx driver +FlashPoint.txt + - info on driver for BusLogic FlashPoint adapters +LICENSE.FlashPoint + - Licence of the Flashpoint driver +Mylex.txt + - info on driver for Mylex adapters +NinjaSCSI.txt + - info on WorkBiT NinjaSCSI-32/32Bi driver +aha152x.txt + - info on driver for Adaptec AHA152x based adapters +aic7xxx.txt + - info on driver for Adaptec controllers +aic7xxx_old.txt + - info on driver for Adaptec controllers, old generation +cpqfc.txt + - info on driver for Compaq Tachyon TS adapters +dpti.txt + - info on driver for DPT SmartRAID and Adaptec I2O RAID based adapters +dtc3x80.txt + - info on driver for DTC 2x80 based adapters +g_NCR5380.txt + - info on driver for NCR5380 and NCR53c400 based adapters +ibmmca.txt + - info on driver for IBM adapters with MCA bus +in2000.txt + - info on in2000 driver +ncr53c7xx.txt + - info on driver for NCR53c7xx based adapters +ncr53c8xx.txt + - info on driver for NCR53c8xx based adapters +osst.txt + - info on driver for OnStream SC-x0 SCSI tape +ppa.txt + - info on driver for IOmega zip drive +qlogicfas.txt + - info on driver for QLogic FASxxx based adapters +qlogicisp.txt + - info on driver for QLogic ISP 1020 based adapters +scsi-generic.txt + - info on the sg driver for generic (non-disk/CD/tape) SCSI devices. +scsi.txt + - short blurb on using SCSI support as a module. +scsi_mid_low_api.txt + - info on API between SCSI layer and low level drivers +st.txt + - info on scsi tape driver +sym53c500_cs.txt + - info on PCMCIA driver for Symbios Logic 53c500 based adapters +sym53c8xx_2.txt + - info on second generation driver for sym53c8xx based adapters +tmscsim.txt + - info on driver for AM53c974 based adapters diff --git a/Documentation/scsi/53c700.txt b/Documentation/scsi/53c700.txt new file mode 100644 index 000000000000..0da681d497a2 --- /dev/null +++ b/Documentation/scsi/53c700.txt @@ -0,0 +1,154 @@ +General Description +=================== + +This driver supports the 53c700 and 53c700-66 chips. It also supports +the 53c710 but only in 53c700 emulation mode. It is full featured and +does sync (-66 and 710 only), disconnects and tag command queueing. + +Since the 53c700 must be interfaced to a bus, you need to wrapper the +card detector around this driver. For an example, see the +NCR_D700.[ch] or lasi700.[ch] files. + +The comments in the 53c700.[ch] files tell you which parts you need to +fill in to get the driver working. + + +Compile Time Flags +================== + +The driver may be either io mapped or memory mapped. This is +selectable by configuration flags: + +CONFIG_53C700_MEM_MAPPED + +define if the driver is memory mapped. + +CONFIG_53C700_IO_MAPPED + +define if the driver is to be io mapped. + +One or other of the above flags *must* be defined. + +Other flags are: + +CONFIG_53C700_LE_ON_BE + +define if the chipset must be supported in little endian mode on a big +endian architecture (used for the 700 on parisc). + +CONFIG_53C700_USE_CONSISTENT + +allocate consistent memory (should only be used if your architecture +has a mixture of consistent and inconsistent memory). Fully +consistent or fully inconsistent architectures should not define this. + + +Using the Chip Core Driver +========================== + +In order to plumb the 53c700 chip core driver into a working SCSI +driver, you need to know three things about the way the chip is wired +into your system (or expansion card). + +1. The clock speed of the SCSI core +2. The interrupt line used +3. The memory (or io space) location of the 53c700 registers. + +Optionally, you may also need to know other things, like how to read +the SCSI Id from the card bios or whether the chip is wired for +differential operation. + +Usually you can find items 2. and 3. from general spec. documents or +even by examining the configuration of a working driver under another +operating system. + +The clock speed is usually buried deep in the technical literature. +It is required because it is used to set up both the synchronous and +asynchronous dividers for the chip. As a general rule of thumb, +manufacturers set the clock speed at the lowest possible setting +consistent with the best operation of the chip (although some choose +to drive it off the CPU or bus clock rather than going to the expense +of an extra clock chip). The best operation clock speeds are: + +53c700 - 25MHz +53c700-66 - 50MHz +53c710 - 40Mhz + +Writing Your Glue Driver +======================== + +This will be a standard SCSI driver (I don't know of a good document +describing this, just copy from some other driver) with at least a +detect and release entry. + +In the detect routine, you need to allocate a struct +NCR_700_Host_Parameters sized memory area and clear it (so that the +default values for everything are 0). Then you must fill in the +parameters that matter to you (see below), plumb the NCR_700_intr +routine into the interrupt line and call NCR_700_detect with the host +template and the new parameters as arguments. You should also call +the relevant request_*_region function and place the register base +address into the `base' pointer of the host parameters. + +In the release routine, you must free the NCR_700_Host_Parameters that +you allocated, call the corresponding release_*_region and free the +interrupt. + +Handling Interrupts +------------------- + +In general, you should just plumb the card's interrupt line in with + +request_irq(irq, NCR_700_intr, <irq flags>, <driver name>, host); + +where host is the return from the relevant NCR_700_detect() routine. + +You may also write your own interrupt handling routine which calls +NCR_700_intr() directly. However, you should only really do this if +you have a card with more than one chip on it and you can read a +register to tell which set of chips wants the interrupt. + +Settable NCR_700_Host_Parameters +-------------------------------- + +The following are a list of the user settable parameters: + +clock: (MANDATORY) + +Set to the clock speed of the chip in MHz. + +base: (MANDATORY) + +set to the base of the io or mem region for the register set. On 64 +bit architectures this is only 32 bits wide, so the registers must be +mapped into the low 32 bits of memory. + +pci_dev: (OPTIONAL) + +set to the PCI board device. Leave NULL for a non-pci board. This is +used for the pci_alloc_consistent() and pci_map_*() functions. + +dmode_extra: (OPTIONAL, 53c710 only) + +extra flags for the DMODE register. These are used to control bus +output pins on the 710. The settings should be a combination of +DMODE_FC1 and DMODE_FC2. What these pins actually do is entirely up +to the board designer. Usually it is safe to ignore this setting. + +differential: (OPTIONAL) + +set to 1 if the chip drives a differential bus. + +force_le_on_be: (OPTIONAL, only if CONFIG_53C700_LE_ON_BE is set) + +set to 1 if the chip is operating in little endian mode on a big +endian architecture. + +chip710: (OPTIONAL) + +set to 1 if the chip is a 53c710. + +burst_disable: (OPTIONAL, 53c710 only) + +disable 8 byte bursting for DMA transfers. + diff --git a/Documentation/scsi/BusLogic.txt b/Documentation/scsi/BusLogic.txt new file mode 100644 index 000000000000..98023baa0f0d --- /dev/null +++ b/Documentation/scsi/BusLogic.txt @@ -0,0 +1,566 @@ + BusLogic MultiMaster and FlashPoint SCSI Driver for Linux + + Version 2.0.15 for Linux 2.0 + Version 2.1.15 for Linux 2.1 + + PRODUCTION RELEASE + + 17 August 1998 + + Leonard N. Zubkoff + Dandelion Digital + lnz@dandelion.com + + Copyright 1995-1998 by Leonard N. Zubkoff <lnz@dandelion.com> + + + INTRODUCTION + +BusLogic, Inc. designed and manufactured a variety of high performance SCSI +host adapters which share a common programming interface across a diverse +collection of bus architectures by virtue of their MultiMaster ASIC technology. +BusLogic was acquired by Mylex Corporation in February 1996, but the products +supported by this driver originated under the BusLogic name and so that name is +retained in the source code and documentation. + +This driver supports all present BusLogic MultiMaster Host Adapters, and should +support any future MultiMaster designs with little or no modification. More +recently, BusLogic introduced the FlashPoint Host Adapters, which are less +costly and rely on the host CPU, rather than including an onboard processor. +Despite not having an onboard CPU, the FlashPoint Host Adapters perform very +well and have very low command latency. BusLogic has recently provided me with +the FlashPoint Driver Developer's Kit, which comprises documentation and freely +redistributable source code for the FlashPoint SCCB Manager. The SCCB Manager +is the library of code that runs on the host CPU and performs functions +analogous to the firmware on the MultiMaster Host Adapters. Thanks to their +having provided the SCCB Manager, this driver now supports the FlashPoint Host +Adapters as well. + +My primary goals in writing this completely new BusLogic driver for Linux are +to achieve the full performance that BusLogic SCSI Host Adapters and modern +SCSI peripherals are capable of, and to provide a highly robust driver that can +be depended upon for high performance mission critical applications. All of +the major performance features can be configured from the Linux kernel command +line or at module initialization time, allowing individual installations to +tune driver performance and error recovery to their particular needs. + +The latest information on Linux support for BusLogic SCSI Host Adapters, as +well as the most recent release of this driver and the latest firmware for the +BT-948/958/958D, will always be available from my Linux Home Page at URL +"http://www.dandelion.com/Linux/". + +Bug reports should be sent via electronic mail to "lnz@dandelion.com". Please +include with the bug report the complete configuration messages reported by the +driver and SCSI subsystem at startup, along with any subsequent system messages +relevant to SCSI operations, and a detailed description of your system's +hardware configuration. + +Mylex has been an excellent company to work with and I highly recommend their +products to the Linux community. In November 1995, I was offered the +opportunity to become a beta test site for their latest MultiMaster product, +the BT-948 PCI Ultra SCSI Host Adapter, and then again for the BT-958 PCI Wide +Ultra SCSI Host Adapter in January 1996. This was mutually beneficial since +Mylex received a degree and kind of testing that their own testing group cannot +readily achieve, and the Linux community has available high performance host +adapters that have been well tested with Linux even before being brought to +market. This relationship has also given me the opportunity to interact +directly with their technical staff, to understand more about the internal +workings of their products, and in turn to educate them about the needs and +potential of the Linux community. + +More recently, Mylex has reaffirmed the company's interest in supporting the +Linux community, and I am now working on a Linux driver for the DAC960 PCI RAID +Controllers. Mylex's interest and support is greatly appreciated. + +Unlike some other vendors, if you contact Mylex Technical Support with a +problem and are running Linux, they will not tell you that your use of their +products is unsupported. Their latest product marketing literature even states +"Mylex SCSI host adapters are compatible with all major operating systems +including: ... Linux ...". + +Mylex Corporation is located at 34551 Ardenwood Blvd., Fremont, California +94555, USA and can be reached at 510/796-6100 or on the World Wide Web at +http://www.mylex.com. Mylex HBA Technical Support can be reached by electronic +mail at techsup@mylex.com, by Voice at 510/608-2400, or by FAX at 510/745-7715. +Contact information for offices in Europe and Japan is available on the Web +site. + + + DRIVER FEATURES + +o Configuration Reporting and Testing + + During system initialization, the driver reports extensively on the host + adapter hardware configuration, including the synchronous transfer parameters + requested and negotiated with each target device. AutoSCSI settings for + Synchronous Negotiation, Wide Negotiation, and Disconnect/Reconnect are + reported for each target device, as well as the status of Tagged Queuing. + If the same setting is in effect for all target devices, then a single word + or phrase is used; otherwise, a letter is provided for each target device to + indicate the individual status. The following examples + should clarify this reporting format: + + Synchronous Negotiation: Ultra + + Synchronous negotiation is enabled for all target devices and the host + adapter will attempt to negotiate for 20.0 mega-transfers/second. + + Synchronous Negotiation: Fast + + Synchronous negotiation is enabled for all target devices and the host + adapter will attempt to negotiate for 10.0 mega-transfers/second. + + Synchronous Negotiation: Slow + + Synchronous negotiation is enabled for all target devices and the host + adapter will attempt to negotiate for 5.0 mega-transfers/second. + + Synchronous Negotiation: Disabled + + Synchronous negotiation is disabled and all target devices are limited to + asynchronous operation. + + Synchronous Negotiation: UFSNUUU#UUUUUUUU + + Synchronous negotiation to Ultra speed is enabled for target devices 0 + and 4 through 15, to Fast speed for target device 1, to Slow speed for + target device 2, and is not permitted to target device 3. The host + adapter's SCSI ID is represented by the "#". + + The status of Wide Negotiation, Disconnect/Reconnect, and Tagged Queuing + are reported as "Enabled", Disabled", or a sequence of "Y" and "N" letters. + +o Performance Features + + BusLogic SCSI Host Adapters directly implement SCSI-2 Tagged Queuing, and so + support has been included in the driver to utilize tagged queuing with any + target devices that report having the tagged queuing capability. Tagged + queuing allows for multiple outstanding commands to be issued to each target + device or logical unit, and can improve I/O performance substantially. In + addition, BusLogic's Strict Round Robin Mode is used to optimize host adapter + performance, and scatter/gather I/O can support as many segments as can be + effectively utilized by the Linux I/O subsystem. Control over the use of + tagged queuing for each target device as well as individual selection of the + tagged queue depth is available through driver options provided on the kernel + command line or at module initialization time. By default, the queue depth + is determined automatically based on the host adapter's total queue depth and + the number, type, speed, and capabilities of the target devices found. In + addition, tagged queuing is automatically disabled whenever the host adapter + firmware version is known not to implement it correctly, or whenever a tagged + queue depth of 1 is selected. Tagged queuing is also disabled for individual + target devices if disconnect/reconnect is disabled for that device. + +o Robustness Features + + The driver implements extensive error recovery procedures. When the higher + level parts of the SCSI subsystem request that a timed out command be reset, + a selection is made between a full host adapter hard reset and SCSI bus reset + versus sending a bus device reset message to the individual target device + based on the recommendation of the SCSI subsystem. Error recovery strategies + are selectable through driver options individually for each target device, + and also include sending a bus device reset to the specific target device + associated with the command being reset, as well as suppressing error + recovery entirely to avoid perturbing an improperly functioning device. If + the bus device reset error recovery strategy is selected and sending a bus + device reset does not restore correct operation, the next command that is + reset will force a full host adapter hard reset and SCSI bus reset. SCSI bus + resets caused by other devices and detected by the host adapter are also + handled by issuing a soft reset to the host adapter and re-initialization. + Finally, if tagged queuing is active and more than one command reset occurs + in a 10 minute interval, or if a command reset occurs within the first 10 + minutes of operation, then tagged queuing will be disabled for that target + device. These error recovery options improve overall system robustness by + preventing individual errant devices from causing the system as a whole to + lock up or crash, and thereby allowing a clean shutdown and restart after the + offending component is removed. + +o PCI Configuration Support + + On PCI systems running kernels compiled with PCI BIOS support enabled, this + driver will interrogate the PCI configuration space and use the I/O port + addresses assigned by the system BIOS, rather than the ISA compatible I/O + port addresses. The ISA compatible I/O port address is then disabled by the + driver. On PCI systems it is also recommended that the AutoSCSI utility be + used to disable the ISA compatible I/O port entirely as it is not necessary. + The ISA compatible I/O port is disabled by default on the BT-948/958/958D. + +o /proc File System Support + + Copies of the host adapter configuration information together with updated + data transfer and error recovery statistics are available through the + /proc/scsi/BusLogic/<N> interface. + +o Shared Interrupts Support + + On systems that support shared interrupts, any number of BusLogic Host + Adapters may share the same interrupt request channel. + + + SUPPORTED HOST ADAPTERS + +The following list comprises the supported BusLogic SCSI Host Adapters as of +the date of this document. It is recommended that anyone purchasing a BusLogic +Host Adapter not in the following table contact the author beforehand to verify +that it is or will be supported. + +FlashPoint Series PCI Host Adapters: + +FlashPoint LT (BT-930) Ultra SCSI-3 +FlashPoint LT (BT-930R) Ultra SCSI-3 with RAIDPlus +FlashPoint LT (BT-920) Ultra SCSI-3 (BT-930 without BIOS) +FlashPoint DL (BT-932) Dual Channel Ultra SCSI-3 +FlashPoint DL (BT-932R) Dual Channel Ultra SCSI-3 with RAIDPlus +FlashPoint LW (BT-950) Wide Ultra SCSI-3 +FlashPoint LW (BT-950R) Wide Ultra SCSI-3 with RAIDPlus +FlashPoint DW (BT-952) Dual Channel Wide Ultra SCSI-3 +FlashPoint DW (BT-952R) Dual Channel Wide Ultra SCSI-3 with RAIDPlus + +MultiMaster "W" Series Host Adapters: + +BT-948 PCI Ultra SCSI-3 +BT-958 PCI Wide Ultra SCSI-3 +BT-958D PCI Wide Differential Ultra SCSI-3 + +MultiMaster "C" Series Host Adapters: + +BT-946C PCI Fast SCSI-2 +BT-956C PCI Wide Fast SCSI-2 +BT-956CD PCI Wide Differential Fast SCSI-2 +BT-445C VLB Fast SCSI-2 +BT-747C EISA Fast SCSI-2 +BT-757C EISA Wide Fast SCSI-2 +BT-757CD EISA Wide Differential Fast SCSI-2 +BT-545C ISA Fast SCSI-2 +BT-540CF ISA Fast SCSI-2 + +MultiMaster "S" Series Host Adapters: + +BT-445S VLB Fast SCSI-2 +BT-747S EISA Fast SCSI-2 +BT-747D EISA Differential Fast SCSI-2 +BT-757S EISA Wide Fast SCSI-2 +BT-757D EISA Wide Differential Fast SCSI-2 +BT-545S ISA Fast SCSI-2 +BT-542D ISA Differential Fast SCSI-2 +BT-742A EISA SCSI-2 (742A revision H) +BT-542B ISA SCSI-2 (542B revision H) + +MultiMaster "A" Series Host Adapters: + +BT-742A EISA SCSI-2 (742A revisions A - G) +BT-542B ISA SCSI-2 (542B revisions A - G) + +AMI FastDisk Host Adapters that are true BusLogic MultiMaster clones are also +supported by this driver. + +BusLogic SCSI Host Adapters are available packaged both as bare boards and as +retail kits. The BT- model numbers above refer to the bare board packaging. +The retail kit model numbers are found by replacing BT- with KT- in the above +list. The retail kit includes the bare board and manual as well as cabling and +driver media and documentation that are not provided with bare boards. + + + FLASHPOINT INSTALLATION NOTES + +o RAIDPlus Support + + FlashPoint Host Adapters now include RAIDPlus, Mylex's bootable software + RAID. RAIDPlus is not supported on Linux, and there are no plans to support + it. The MD driver in Linux 2.0 provides for concatenation (LINEAR) and + striping (RAID-0), and support for mirroring (RAID-1), fixed parity (RAID-4), + and distributed parity (RAID-5) is available separately. The built-in Linux + RAID support is generally more flexible and is expected to perform better + than RAIDPlus, so there is little impetus to include RAIDPlus support in the + BusLogic driver. + +o Enabling UltraSCSI Transfers + + FlashPoint Host Adapters ship with their configuration set to "Factory + Default" settings that are conservative and do not allow for UltraSCSI speed + to be negotiated. This results in fewer problems when these host adapters + are installed in systems with cabling or termination that is not sufficient + for UltraSCSI operation, or where existing SCSI devices do not properly + respond to synchronous transfer negotiation for UltraSCSI speed. AutoSCSI + may be used to load "Optimum Performance" settings which allow UltraSCSI + speed to be negotiated with all devices, or UltraSCSI speed can be enabled on + an individual basis. It is recommended that SCAM be manually disabled after + the "Optimum Performance" settings are loaded. + + + BT-948/958/958D INSTALLATION NOTES + +The BT-948/958/958D PCI Ultra SCSI Host Adapters have some features which may +require attention in some circumstances when installing Linux. + +o PCI I/O Port Assignments + + When configured to factory default settings, the BT-948/958/958D will only + recognize the PCI I/O port assignments made by the motherboard's PCI BIOS. + The BT-948/958/958D will not respond to any of the ISA compatible I/O ports + that previous BusLogic SCSI Host Adapters respond to. This driver supports + the PCI I/O port assignments, so this is the preferred configuration. + However, if the obsolete BusLogic driver must be used for any reason, such as + a Linux distribution that does not yet use this driver in its boot kernel, + BusLogic has provided an AutoSCSI configuration option to enable a legacy ISA + compatible I/O port. + + To enable this backward compatibility option, invoke the AutoSCSI utility via + Ctrl-B at system startup and select "Adapter Configuration", "View/Modify + Configuration", and then change the "ISA Compatible Port" setting from + "Disable" to "Primary" or "Alternate". Once this driver has been installed, + the "ISA Compatible Port" option should be set back to "Disable" to avoid + possible future I/O port conflicts. The older BT-946C/956C/956CD also have + this configuration option, but the factory default setting is "Primary". + +o PCI Slot Scanning Order + + In systems with multiple BusLogic PCI Host Adapters, the order in which the + PCI slots are scanned may appear reversed with the BT-948/958/958D as + compared to the BT-946C/956C/956CD. For booting from a SCSI disk to work + correctly, it is necessary that the host adapter's BIOS and the kernel agree + on which disk is the boot device, which requires that they recognize the PCI + host adapters in the same order. The motherboard's PCI BIOS provides a + standard way of enumerating the PCI host adapters, which is used by the Linux + kernel. Some PCI BIOS implementations enumerate the PCI slots in order of + increasing bus number and device number, while others do so in the opposite + direction. + + Unfortunately, Microsoft decided that Windows 95 would always enumerate the + PCI slots in order of increasing bus number and device number regardless of + the PCI BIOS enumeration, and requires that their scheme be supported by the + host adapter's BIOS to receive Windows 95 certification. Therefore, the + factory default settings of the BT-948/958/958D enumerate the host adapters + by increasing bus number and device number. To disable this feature, invoke + the AutoSCSI utility via Ctrl-B at system startup and select "Adapter + Configuration", "View/Modify Configuration", press Ctrl-F10, and then change + the "Use Bus And Device # For PCI Scanning Seq." option to OFF. + + This driver will interrogate the setting of the PCI Scanning Sequence option + so as to recognize the host adapters in the same order as they are enumerated + by the host adapter's BIOS. + +o Enabling UltraSCSI Transfers + + The BT-948/958/958D ship with their configuration set to "Factory Default" + settings that are conservative and do not allow for UltraSCSI speed to be + negotiated. This results in fewer problems when these host adapters are + installed in systems with cabling or termination that is not sufficient for + UltraSCSI operation, or where existing SCSI devices do not properly respond + to synchronous transfer negotiation for UltraSCSI speed. AutoSCSI may be + used to load "Optimum Performance" settings which allow UltraSCSI speed to be + negotiated with all devices, or UltraSCSI speed can be enabled on an + individual basis. It is recommended that SCAM be manually disabled after the + "Optimum Performance" settings are loaded. + + + DRIVER OPTIONS + +BusLogic Driver Options may be specified either via the Linux Kernel Command +Line or via the Loadable Kernel Module Installation Facility. Driver Options +for multiple host adapters may be specified either by separating the option +strings by a semicolon, or by specifying multiple "BusLogic=" strings on the +command line. Individual option specifications for a single host adapter are +separated by commas. The Probing and Debugging Options apply to all host +adapters whereas the remaining options apply individually only to the +selected host adapter. + +The BusLogic Driver Probing Options comprise the following: + +IO:<integer> + + The "IO:" option specifies an ISA I/O Address to be probed for a non-PCI + MultiMaster Host Adapter. If neither "IO:" nor "NoProbeISA" options are + specified, then the standard list of BusLogic MultiMaster ISA I/O Addresses + will be probed (0x330, 0x334, 0x230, 0x234, 0x130, and 0x134). Multiple + "IO:" options may be specified to precisely determine the I/O Addresses to + be probed, but the probe order will always follow the standard list. + +NoProbe + + The "NoProbe" option disables all probing and therefore no BusLogic Host + Adapters will be detected. + +NoProbeISA + + The "NoProbeISA" option disables probing of the standard BusLogic ISA I/O + Addresses and therefore only PCI MultiMaster and FlashPoint Host Adapters + will be detected. + +NoProbePCI + + The "NoProbePCI" options disables the interrogation of PCI Configuration + Space and therefore only ISA Multimaster Host Adapters will be detected, as + well as PCI Multimaster Host Adapters that have their ISA Compatible I/O + Port set to "Primary" or "Alternate". + +NoSortPCI + + The "NoSortPCI" option forces PCI MultiMaster Host Adapters to be + enumerated in the order provided by the PCI BIOS, ignoring any setting of + the AutoSCSI "Use Bus And Device # For PCI Scanning Seq." option. + +MultiMasterFirst + + The "MultiMasterFirst" option forces MultiMaster Host Adapters to be probed + before FlashPoint Host Adapters. By default, if both FlashPoint and PCI + MultiMaster Host Adapters are present, this driver will probe for + FlashPoint Host Adapters first unless the BIOS primary disk is controlled + by the first PCI MultiMaster Host Adapter, in which case MultiMaster Host + Adapters will be probed first. + +FlashPointFirst + + The "FlashPointFirst" option forces FlashPoint Host Adapters to be probed + before MultiMaster Host Adapters. + +The BusLogic Driver Tagged Queuing Options allow for explicitly specifying +the Queue Depth and whether Tagged Queuing is permitted for each Target +Device (assuming that the Target Device supports Tagged Queuing). The Queue +Depth is the number of SCSI Commands that are allowed to be concurrently +presented for execution (either to the Host Adapter or Target Device). Note +that explicitly enabling Tagged Queuing may lead to problems; the option to +enable or disable Tagged Queuing is provided primarily to allow disabling +Tagged Queuing on Target Devices that do not implement it correctly. The +following options are available: + +QueueDepth:<integer> + + The "QueueDepth:" or QD:" option specifies the Queue Depth to use for all + Target Devices that support Tagged Queuing, as well as the maximum Queue + Depth for devices that do not support Tagged Queuing. If no Queue Depth + option is provided, the Queue Depth will be determined automatically based + on the Host Adapter's Total Queue Depth and the number, type, speed, and + capabilities of the detected Target Devices. For Host Adapters that + require ISA Bounce Buffers, the Queue Depth is automatically set by default + to BusLogic_TaggedQueueDepthBB or BusLogic_UntaggedQueueDepthBB to avoid + excessive preallocation of DMA Bounce Buffer memory. Target Devices that + do not support Tagged Queuing always have their Queue Depth set to + BusLogic_UntaggedQueueDepth or BusLogic_UntaggedQueueDepthBB, unless a + lower Queue Depth option is provided. A Queue Depth of 1 automatically + disables Tagged Queuing. + +QueueDepth:[<integer>,<integer>...] + + The "QueueDepth:[...]" or "QD:[...]" option specifies the Queue Depth + individually for each Target Device. If an <integer> is omitted, the + associated Target Device will have its Queue Depth selected automatically. + +TaggedQueuing:Default + + The "TaggedQueuing:Default" or "TQ:Default" option permits Tagged Queuing + based on the firmware version of the BusLogic Host Adapter and based on + whether the Queue Depth allows queuing multiple commands. + +TaggedQueuing:Enable + + The "TaggedQueuing:Enable" or "TQ:Enable" option enables Tagged Queuing for + all Target Devices on this Host Adapter, overriding any limitation that + would otherwise be imposed based on the Host Adapter firmware version. + +TaggedQueuing:Disable + + The "TaggedQueuing:Disable" or "TQ:Disable" option disables Tagged Queuing + for all Target Devices on this Host Adapter. + +TaggedQueuing:<Target-Spec> + + The "TaggedQueuing:<Target-Spec>" or "TQ:<Target-Spec>" option controls + Tagged Queuing individually for each Target Device. <Target-Spec> is a + sequence of "Y", "N", and "X" characters. "Y" enables Tagged Queuing, "N" + disables Tagged Queuing, and "X" accepts the default based on the firmware + version. The first character refers to Target Device 0, the second to + Target Device 1, and so on; if the sequence of "Y", "N", and "X" characters + does not cover all the Target Devices, unspecified characters are assumed + to be "X". + +The BusLogic Driver Miscellaneous Options comprise the following: + +BusSettleTime:<seconds> + + The "BusSettleTime:" or "BST:" option specifies the Bus Settle Time in + seconds. The Bus Settle Time is the amount of time to wait between a Host + Adapter Hard Reset which initiates a SCSI Bus Reset and issuing any SCSI + Commands. If unspecified, it defaults to BusLogic_DefaultBusSettleTime. + +InhibitTargetInquiry + + The "InhibitTargetInquiry" option inhibits the execution of an Inquire + Target Devices or Inquire Installed Devices command on MultiMaster Host + Adapters. This may be necessary with some older Target Devices that do not + respond correctly when Logical Units above 0 are addressed. + +The BusLogic Driver Debugging Options comprise the following: + +TraceProbe + + The "TraceProbe" option enables tracing of Host Adapter Probing. + +TraceHardwareReset + + The "TraceHardwareReset" option enables tracing of Host Adapter Hardware + Reset. + +TraceConfiguration + + The "TraceConfiguration" option enables tracing of Host Adapter + Configuration. + +TraceErrors + + The "TraceErrors" option enables tracing of SCSI Commands that return an + error from the Target Device. The CDB and Sense Data will be printed for + each SCSI Command that fails. + +Debug + + The "Debug" option enables all debugging options. + +The following examples demonstrate setting the Queue Depth for Target Devices +1 and 2 on the first host adapter to 7 and 15, the Queue Depth for all Target +Devices on the second host adapter to 31, and the Bus Settle Time on the +second host adapter to 30 seconds. + +Linux Kernel Command Line: + + linux BusLogic=QueueDepth:[,7,15];QueueDepth:31,BusSettleTime:30 + +LILO Linux Boot Loader (in /etc/lilo.conf): + + append = "BusLogic=QueueDepth:[,7,15];QueueDepth:31,BusSettleTime:30" + +INSMOD Loadable Kernel Module Installation Facility: + + insmod BusLogic.o \ + 'BusLogic="QueueDepth:[,7,15];QueueDepth:31,BusSettleTime:30"' + +NOTE: Module Utilities 2.1.71 or later is required for correct parsing + of driver options containing commas. + + + DRIVER INSTALLATION + +This distribution was prepared for Linux kernel version 2.0.35, but should be +compatible with 2.0.4 or any later 2.0 series kernel. + +To install the new BusLogic SCSI driver, you may use the following commands, +replacing "/usr/src" with wherever you keep your Linux kernel source tree: + + cd /usr/src + tar -xvzf BusLogic-2.0.15.tar.gz + mv README.* LICENSE.* BusLogic.[ch] FlashPoint.c linux/drivers/scsi + patch -p0 < BusLogic.patch (only for 2.0.33 and below) + cd linux + make config + make zImage + +Then install "arch/i386/boot/zImage" as your standard kernel, run lilo if +appropriate, and reboot. + + + BUSLOGIC ANNOUNCEMENTS MAILING LIST + +The BusLogic Announcements Mailing List provides a forum for informing Linux +users of new driver releases and other announcements regarding Linux support +for BusLogic SCSI Host Adapters. To join the mailing list, send a message to +"buslogic-announce-request@dandelion.com" with the line "subscribe" in the +message body. diff --git a/Documentation/scsi/ChangeLog.1992-1997 b/Documentation/scsi/ChangeLog.1992-1997 new file mode 100644 index 000000000000..dc88ee2ab73d --- /dev/null +++ b/Documentation/scsi/ChangeLog.1992-1997 @@ -0,0 +1,2023 @@ +Sat Jan 18 15:51:45 1997 Richard Henderson <rth@tamu.edu> + + * Don't play with usage_count directly, instead hand around + the module header and use the module macros. + +Fri May 17 00:00:00 1996 Leonard N. Zubkoff <lnz@dandelion.com> + + * BusLogic Driver Version 2.0.3 Released. + +Tue Apr 16 21:00:00 1996 Leonard N. Zubkoff <lnz@dandelion.com> + + * BusLogic Driver Version 1.3.2 Released. + +Sun Dec 31 23:26:00 1995 Leonard N. Zubkoff <lnz@dandelion.com> + + * BusLogic Driver Version 1.3.1 Released. + +Fri Nov 10 15:29:49 1995 Leonard N. Zubkoff <lnz@dandelion.com> + + * Released new BusLogic driver. + +Wed Aug 9 22:37:04 1995 Andries Brouwer <aeb@cwi.nl> + + As a preparation for new device code, separated the various + functions the request->dev field had into the device proper, + request->rq_dev and a status field request->rq_status. + + The 2nd argument of bios_param is now a kdev_t. + +Wed Jul 19 10:43:15 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * scsi.c (scsi_proc_info): /proc/scsi/scsi now also lists all + attached devices. + + * scsi_proc.c (proc_print_scsidevice): Added. Used by scsi.c and + eata_dma_proc.c to produce some device info for /proc/scsi. + + * eata_dma.c (eata_queue)(eata_int_handler)(eata_scsi_done): + Changed handling of internal SCSI commands send to the HBA. + + +Wed Jul 19 10:09:17 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * Linux 1.3.11 released. + + * eata_dma.c (eata_queue)(eata_int_handler): Added code to do + command latency measurements if requested by root through + /proc/scsi interface. + Throughout Use HZ constant for time references. + + * eata_pio.c: Use HZ constant for time references. + + * aic7xxx.c, aic7xxx.h, aic7xxx_asm.c: Changed copyright from BSD + to GNU style. + + * scsi.h: Added READ_12 command opcode constant + +Wed Jul 19 09:25:30 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * Linux 1.3.10 released. + + * scsi_proc.c (dispatch_scsi_info): Removed unused variable. + +Wed Jul 19 09:25:30 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * Linux 1.3.9 released. + + * scsi.c Blacklist concept expanded to 'support' more device + deficiencies. blacklist[] renamed to device_list[] + (scan_scsis): Code cleanup. + + * scsi_debug.c (scsi_debug_proc_info): Added support to control + device lockup simulation via /proc/scsi interface. + + +Wed Jul 19 09:22:34 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * Linux 1.3.7 released. + + * scsi_proc.c: Fixed a number of bugs in directory handling + +Wed Jul 19 09:18:28 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * Linux 1.3.5 released. + + * Native wide, multichannel and /proc/scsi support now in official + kernel distribution. + + * scsi.c/h, hosts.c/h et al reindented to increase readability + (especially on 80 column wide terminals). + + * scsi.c, scsi_proc.c, ../../fs/proc/inode.c: Added + /proc/scsi/scsi which allows root to scan for hotplugged devices. + + * scsi.c (scsi_proc_info): Added, to support /proc/scsi/scsi. + (scan_scsis): Added some 'spaghetti' code to allow scanning for + single devices. + + +Thu Jun 20 15:20:27 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * proc.c: Renamed to scsi_proc.c + +Mon Jun 12 20:32:45 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * Linux 1.3.0 released. + +Mon May 15 19:33:14 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * scsi.c: Added native multichannel and wide scsi support. + + * proc.c (dispatch_scsi_info) (build_proc_dir_hba_entries): + Updated /proc/scsi interface. + +Thu May 4 17:58:48 1995 Michael Neuffer <neuffer@goofy.zdv.uni-mainz.de> + + * sd.c (requeue_sd_request): Zero out the scatterlist only if + scsi_malloc returned memory for it. + + * eata_dma.c (register_HBA) (eata_queue): Add support for + large scatter/gather tables and set use_clustering accordingly + + * hosts.c: Make use_clustering changeable in the Scsi_Host structure. + +Wed Apr 12 15:25:52 1995 Eric Youngdale (eric@andante) + + * Linux 1.2.5 released. + + * buslogic.c: Update to version 1.15 (From Leonard N. Zubkoff). + Fixed interrupt routine to avoid races when handling multiple + complete commands per interrupt. Seems to come up with faster + cards. + + * eata_dma.c: Update to 2.3.5r. Modularize. Improved error handling + throughout and fixed bug interrupt routine which resulted in shifted + status bytes. Added blink LED state checks for ISA and EISA HBAs. + Memory management bug seems to have disappeared ==> increasing + C_P_L_CURRENT_MAX to 16 for now. Decreasing C_P_L_DIV to 3 for + performance reasons. + + * scsi.c: If we get a FMK, EOM, or ILI when attempting to scan + the bus, assume that it was just noise on the bus, and ignore + the device. + + * scsi.h: Update and add a bunch of missing commands which we + were never using. + + * sd.c: Use restore_flags in do_sd_request - this may result in + latency conditions, but it gets rid of races and crashes. + Do not save flags again when searching for a second command to + queue. + + * st.c: Use bytes, not STP->buffer->buffer_size when reading + from tape. + + +Tue Apr 4 09:42:08 1995 Eric Youngdale (eric@andante) + + * Linux 1.2.4 released. + + * st.c: Fix typo - restoring wrong flags. + +Wed Mar 29 06:55:12 1995 Eric Youngdale (eric@andante) + + * Linux 1.2.3 released. + + * st.c: Perform some waiting operations with interrupts off. + Is this correct??? + +Wed Mar 22 10:34:26 1995 Eric Youngdale (eric@andante) + + * Linux 1.2.2 released. + + * aha152x.c: Modularize. Add support for PCMCIA. + + * eata.c: Update to version 2.0. Fixed bug preventing media + detection. If scsi_register_host returns NULL, fail gracefully. + + * scsi.c: Detect as NEC (for photo-cd purposes) for the 84 + and 25 models as "NEC_OLDCDR". + + * scsi.h: Add define for NEC_OLDCDR + + * sr.c: Add handling for NEC_OLDCDR. Treat as unknown. + + * u14-34f.c: Update to version 2.0. Fixed same bug as in + eata.c. + + +Mon Mar 6 11:11:20 1995 Eric Youngdale (eric@andante) + + * Linux 1.2.0 released. Yeah!!! + + * Minor spelling/punctuation changes throughout. Nothing + substantive. + +Mon Feb 20 21:33:03 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.95 released. + + * qlogic.c: Update to version 0.41. + + * seagate.c: Change some message to be more descriptive about what + we detected. + + * sr.c: spelling/whitespace changes. + +Mon Feb 20 21:33:03 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.94 released. + +Mon Feb 20 08:57:17 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.93 released. + + * hosts.h: Change io_port to long int from short. + + * 53c7,8xx.c: crash on AEN fixed, SCSI reset is no longer a NOP, + NULL pointer panic on odd UDCs fixed, two bugs in diagnostic output + fixed, should initialize correctly if left running, now loadable, + new memory allocation, extraneous diagnostic output suppressed, + splx() replaced with save/restore flags. [ Drew ] + + * hosts.c, hosts.h, scsi_ioctl.c, sd.c, sd_ioctl.c, sg.c, sr.c, + sr_ioctl.c: Add special junk at end that Emacs will use for + formatting the file. + + * qlogic.c: Update to v0.40a. Improve parity handling. + + * scsi.c: Add Hitachi DK312C to blacklist. Change "};" to "}" in + many places. Use scsi_init_malloc to get command block - may + need this to be dma compatible for some host adapters. + Restore interrupts after unregistering a host. + + * sd.c: Use sti instead of restore flags - causes latency problems. + + * seagate.c: Use controller_type to determine string used when + registering irq. + + * sr.c: More photo-cd hacks to make sure we get the xa stuff right. + * sr.h, sr.c: Change is_xa to xa_flags field. + + * st.c: Disable retries for write operations. + +Wed Feb 15 10:52:56 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.92 released. + + * eata.c: Update to 1.17. + + * eata_dma.c: Update to 2.31a. Add more support for /proc/scsi. + Continuing modularization. Less crashes because of the bug in the + memory management ==> increase C_P_L_CURRENT_MAX to 10 + and decrease C_P_L_DIV to 4. + + * hosts.c: If we remove last host registered, reuse host number. + When freeing memory from host being deregistered, free extra_bytes + too. + + * scsi.c (scan_scsis): memset(SDpnt, 0) and set SCmd.device to SDpnt. + Change memory allocation to work around bugs in __get_dma_pages. + Do not free host if usage count is not zero (for modules). + + * sr_ioctl.c: Increase IOCTL_TIMEOUT to 3000. + + * st.c: Allow for ST_EXTRA_DEVS in st data structures. + + * u14-34f.c: Update to 1.17. + +Thu Feb 9 10:11:16 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.91 released. + + * eata.c: Update to 1.16. Use wish_block instead of host->block. + + * hosts.c: Initialize wish_block to 0. + + * hosts.h: Add wish_block. + + * scsi.c: Use wish_block as indicator that the host should be added + to block list. + + * sg.c: Add SG_EXTRA_DEVS to number of slots. + + * u14-34f.c: Use wish_block. + +Tue Feb 7 11:46:04 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.90 released. + + * eata.c: Change naming from eata_* to eata2x_*. Now at vers 1.15. + Update interrupt handler to take pt_regs as arg. Allow blocking + even if loaded as module. Initialize target_time_out array. + Do not put sti(); in timing loop. + + * hosts.c: Do not reuse host numbers. + Use scsi_make_blocked_list to generate blocking list. + + * script_asm.pl: Beats me. Don't know perl. Something to do with + phase index. + + * scsi.c (scsi_make_blocked_list): New function - code copied from + hosts.c. + + * scsi.c: Update code to disable photo CD for Toshiba cdroms. + Use just manufacturer name, not model number. + + * sr.c: Fix setting density for Toshiba drives. + + * u14-34f.c: Clear target_time_out array during reset. + +Wed Feb 1 09:20:45 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.89 released. + + * Makefile, u14-34f.c: Modularize. + + * Makefile, eata.c: Modularize. Now version 1.14 + + * NCR5380.c: Update interrupt handler with new arglist. Minor + cleanups. + + * eata_dma.c: Begin to modularize. Add hooks for /proc/scsi. + New version 2.3.0a. Add code in interrupt handler to allow + certain CDROM drivers to be detected which return a + CHECK_CONDITION during SCSI bus scan. Add opcode check to get + all DATA IN and DATA OUT phases right. Utilize HBA_interpret flag. + Improvements in HBA identification. Various other minor stuff. + + * hosts.c: Initialize ->dma_channel and ->io_port when registering + a new host. + + * qlogic.c: Modularize and add PCMCIA support. + + * scsi.c: Add Hitachi to blacklist. + + * scsi.c: Change default to no lun scan (too many problem devices). + + * scsi.h: Define QUEUE_FULL condition. + + * sd.c: Do not check for non-existent partition until after + new media check. + + * sg.c: Undo previous change which was wrong. + + * sr_ioctl.c: Increase IOCTL_TIMEOUT to 2000. + + * st.c: Patches from Kai - improve filemark handling. + +Tue Jan 31 17:32:12 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.88 released. + + * Throughout - spelling/grammar fixups. + + * scsi.c: Make sure that all buffers are 16 byte aligned - some + drivers (buslogic) need this. + + * scsi.c (scan_scsis): Remove message printed. + + * scsi.c (scsi_init): Move message here. + +Mon Jan 30 06:40:25 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.87 released. + + * sr.c: Photo-cd related changes. (Gerd Knorr??). + + * st.c: Changes from Kai related to EOM detection. + +Mon Jan 23 23:53:10 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.86 released. + + * 53c7,8xx.h: Change SG size to 127. + + * eata_dma: Update to version 2.10i. Remove bug in the registration + of multiple HBAs and channels. Minor other improvements and stylistic + changes. + + * scsi.c: Test for Toshiba XM-3401TA and exclude from detection + as toshiba drive - photo cd does not work with this drive. + + * sr.c: Update photocd code. + +Mon Jan 23 23:53:10 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.85 released. + + * st.c, st_ioctl.c, sg.c, sd_ioctl.c, scsi_ioctl.c, hosts.c: + include linux/mm.h + + * qlogic.c, buslogic.c, aha1542.c: Include linux/module.h. + +Sun Jan 22 22:08:46 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.84 released. + + * Makefile: Support for loadable QLOGIC boards. + + * aha152x.c: Update to version 1.8 from Juergen. + + * eata_dma.c: Update from Michael Neuffer. + Remove hard limit of 2 commands per lun and make it better + configurable. Improvements in HBA identification. + + * in2000.c: Fix biosparam to support large disks. + + * qlogic.c: Minor changes (change sti -> restore_flags). + +Wed Jan 18 23:33:09 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.83 released. + + * aha1542.c(aha1542_intr_handle): Use arguments handed down to find + which irq. + + * buslogic.c: Likewise. + + * eata_dma.c: Use min of 2 cmd_per_lun for OCS_enabled boards. + + * scsi.c: Make RECOVERED_ERROR a SUGGEST_IS_OK. + + * sd.c: Fail if we are opening a non-existent partition. + + * sr.c: Bump SR_TIMEOUT to 15000. + Do not probe for media size at boot time(hard on changers). + Flag device as needing sector size instead. + + * sr_ioctl.c: Remove CDROMMULTISESSION_SYS ioctl. + + * ultrastor.c: Fix bug in call to ultrastor_interrupt (wrong #args). + +Mon Jan 16 07:18:23 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.82 released. + + Throughout. + - Change all interrupt handlers to accept new calling convention. + In particular, we now receive the irq number as one of the arguments. + + * More minor spelling corrections in some of the new files. + + * aha1542.c, buslogic.c: Clean up interrupt handler a little now + that we receive the irq as an arg. + + * aha274x.c: s/snarf_region/request_region/ + + * eata.c: Update to version 1.12. Fix some comments and display a + message if we cannot reserve the port addresses. + + * u14-34f.c: Update to version 1.13. Fix some comments and display a + message if we cannot reserve the port addresses. + + * eata_dma.c: Define get_board_data function (send INQUIRY command). + Use to improve detection of variants of different DPT boards. Change + version subnumber to "0g". + + * fdomain.c: Update to version 5.26. Improve detection of some boards + repackaged by IBM. + + * scsi.c (scsi_register_host): Change "name" to const char *. + + * sr.c: Fix problem in set mode command for Toshiba drives. + + * sr.c: Fix typo from patch 81. + +Fri Jan 13 12:54:46 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.81 released. Codefreeze for 1.2 release announced. + + Big changes here. + + * eata_dma.*: New files from Michael Neuffer. + (neuffer@goofy.zdv.uni-mainz.de). Should support + all eata/dpt cards. + + * hosts.c, Makefile: Add eata_dma. + + * README.st: Document MTEOM. + + Patches from me (ERY) to finish support for low-level loadable scsi. + It now works, and is actually useful. + + * Throughout - add new argument to scsi_init_malloc that takes an + additional parameter. This is used as a priority to kmalloc, + and you can specify the GFP_DMA flag if you need DMA-able memory. + + * Makefile: For source files that are loadable, always add name + to SCSI_SRCS. Fill in modules: target. + + * hosts.c: Change next_host to next_scsi_host, and make global. + Print hosts after we have identified all of them. Use info() + function if present, otherwise use name field. + + * hosts.h: Change attach function to return int, not void. + Define number of device slots to allow for loadable devices. + Define tags to tell scsi module code what type of module we + are loading. + + * scsi.c: Fix scan_scsis so that it can be run by a user process. + Do not use waiting loops - use up and down mechanism as long + as current != task[0]. + + * scsi.c(scan_scsis): Do not use stack variables for I/O - this + could be > 16Mb if we are loading a module at runtime (i.e. use + scsi_init_malloc to get some memory we know will be safe). + + * scsi.c: Change dma freelist to be a set of pages. This allows + us to dynamically adjust the size of the list by adding more pages + to the pagelist. Fix scsi_malloc and scsi_free accordingly. + + * scsi_module.c: Fix include. + + * sd.c: Declare detach function. Increment/decrement module usage + count as required. Fix init functions to allow loaded devices. + Revalidate all new disks so we get the partition tables. Define + detach function. + + * sr.c: Likewise. + + * sg.c: Declare detach function. Allow attachment of devices on + loaded drivers. + + * st.c: Declare detach function. Increment/decrement module usage + count as required. + +Tue Jan 10 10:09:58 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.79 released. + + Patch from some undetermined individual who needs to get a life :-). + + * sr.c: Attacked by spelling bee... + + Patches from Gerd Knorr: + + * sr.c: make printk messages for photoCD a little more informative. + + * sr_ioctl.c: Fix CDROMMULTISESSION_SYS ioctl. + +Mon Jan 9 10:01:37 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.78 released. + + * Makefile: Add empty modules: target. + + * Wheee. Now change register_iomem to request_region. + + * in2000.c: Bugfix - apparently this is the fix that we have + all been waiting for. It fixes a problem whereby the driver + is not stable under heavy load. Race condition and all that. + Patch from Peter Lu. + +Wed Jan 4 21:17:40 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.77 released. + + * 53c7,8xx.c: Fix from Linus - emulate splx. + + Throughout: + + Change "snarf_region" with "register_iomem". + + * scsi_module.c: New file. Contains support for low-level loadable + scsi drivers. [ERY]. + + * sd.c: More s/int/long/ changes. + + * seagate.c: Explicitly include linux/config.h + + * sg.c: Increment/decrement module usage count on open/close. + + * sg.c: Be a bit more careful about the user not supplying enough + information for a valid command. Pass correct size down to + scsi_do_cmd. + + * sr.c: More changes for Photo-CD. This apparently breaks NEC drives. + + * sr_ioctl.c: Support CDROMMULTISESSION ioctl. + + +Sun Jan 1 19:55:21 1995 Eric Youngdale (eric@andante) + + * Linux 1.1.76 released. + + * constants.c: Add type cast in switch statement. + + * scsi.c (scsi_free): Change datatype of "offset" to long. + (scsi_malloc): Change a few more variables to long. Who + did this and why was it important? 64 bit machines? + + + Lots of changes to use save_state/restore_state instead of cli/sti. + Files changed include: + + * aha1542.c: + * aha1740.c: + * buslogic.c: + * in2000.c: + * scsi.c: + * scsi_debug.c: + * sd.c: + * sr.c: + * st.c: + +Wed Dec 28 16:38:29 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.75 released. + + * buslogic.c: Spelling fix. + + * scsi.c: Add HP C1790A and C2500A scanjet to blacklist. + + * scsi.c: Spelling fixup. + + * sd.c: Add support for sd_hardsizes (hard sector sizes). + + * ultrastor.c: Use save_flags/restore_flags instead of cli/sti. + +Fri Dec 23 13:36:25 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.74 released. + + * README.st: Update from Kai Makisara. + + * eata.c: New version from Dario - version 1.11. + use scsicam bios_param routine. Add support for 2011 + and 2021 boards. + + * hosts.c: Add support for blocking. Linked list automatically + generated when shpnt->block is set. + + * scsi.c: Add sankyo & HP scanjet to blacklist. Add support for + kicking things loose when we deadlock. + + * scsi.c: Recognize scanners and processors in scan_scsis. + + * scsi_ioctl.h: Increase timeout to 9 seconds. + + * st.c: New version from Kai - add better support for backspace. + + * u14-34f.c: New version from Dario. Supports blocking. + +Wed Dec 14 14:46:30 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.73 released. + + * buslogic.c: Update from Dave Gentzel. Version 1.14. + Add module related stuff. More fault tolerant if out of + DMA memory. + + * fdomain.c: New version from Rik Faith - version 5.22. Add support + for ISA-200S SCSI adapter. + + * hosts.c: Spelling. + + * qlogic.c: Update to version 0.38a. Add more support for PCMCIA. + + * scsi.c: Mask device type with 0x1f during scan_scsis. + Add support for deadlocking, err, make that getting out of + deadlock situations that are created when we allow the user + to limit requests to one host adapter at a time. + + * scsi.c: Bugfix - pass pid, not SCpnt as second arg to + scsi_times_out. + + * scsi.c: Restore interrupt state to previous value instead of using + cli/sti pairs. + + * scsi.c: Add a bunch of module stuff (all commented out for now). + + * scsi.c: Clean up scsi_dump_status. + +Tue Dec 6 12:34:20 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.72 released. + + * sg.c: Bugfix - always use sg_free, since we might have big buff. + +Fri Dec 2 11:24:53 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.71 released. + + * sg.c: Clear buff field when not in use. Only call scsi_free if + non-null. + + * scsi.h: Call wake_up(&wait_for_request) when done with a + command. + + * scsi.c (scsi_times_out): Pass pid down so that we can protect + against race conditions. + + * scsi.c (scsi_abort): Zero timeout field if we get the + NOT_RUNNING message back from low-level driver. + + + * scsi.c (scsi_done): Restore cmd_len, use_sg here. + + * scsi.c (request_sense): Not here. + + * hosts.h: Add new forbidden_addr, forbidden_size fields. Who + added these and why???? + + * hosts.c (scsi_mem_init): Mark pages as reserved if they fall in + the forbidden regions. I am not sure - I think this is so that + we can deal with boards that do incomplete decoding of their + address lines for the bios chips, but I am not entirely sure. + + * buslogic.c: Set forbidden_addr stuff if using a buggy board. + + * aha1740.c: Test for NULL pointer in SCtmp. This should not + occur, but a nice message is better than a kernel segfault. + + * 53c7,8xx.c: Add new PCI chip ID for 815. + +Fri Dec 2 11:24:53 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.70 released. + + * ChangeLog, st.c: Spelling. + +Tue Nov 29 18:48:42 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.69 released. + + * u14-34f.h: Non-functional change. [Dario]. + + * u14-34f.c: Use block field in Scsi_Host to prevent commands from + being queued to more than one host at the same time (used when + motherboard does not deal with multiple bus-masters very well). + Only when SINGLE_HOST_OPERATIONS is defined. + Use new cmd_per_lun field. [Dario] + + * eata.c: Likewise. + + * st.c: More changes from Kai. Add ready flag to indicate drive + status. + + * README.st: Document this. + + * sr.c: Bugfix (do not subtract CD_BLOCK_OFFSET) for photo-cd + code. + + * sg.c: Bugfix - fix problem where opcode is not correctly set up. + + * seagate.[c,h]: Use #defines to set driver name. + + * scsi_ioctl.c: Zero buffer before executing command. + + * scsi.c: Use new cmd_per_lun field in Scsi_Hosts as appropriate. + Add Sony CDU55S to blacklist. + + * hosts.h: Add new cmd_per_lun field to Scsi_Hosts. + + * hosts.c: Initialize cmd_per_lun in Scsi_Hosts from template. + + * buslogic.c: Use cmd_per_lun field - initialize to different + values depending upon bus type (i.e. use 1 if ISA, so we do not + hog memory). Use other patches which got lost from 1.1.68. + + * aha1542.c: Spelling. + +Tue Nov 29 15:43:50 1994 Eric Youngdale (eric@andante.aib.com) + + * Linux 1.1.68 released. + + Add support for 12 byte vendor specific commands in scsi-generics, + more (i.e. the last mandatory) low-level changes to support + loadable modules, plus a few other changes people have requested + lately. Changes by me (ERY) unless otherwise noted. Spelling + changes appear from some unknown corner of the universe. + + * Throughout: Change COMMAND_SIZE() to use SCpnt->cmd_len. + + * Throughout: Change info() low level function to take a Scsi_Host + pointer. This way the info function can return specific + information about the host in question, if desired. + + * All low-level drivers: Add NULL in initializer for the + usage_count field added to Scsi_Host_Template. + + * aha152x.[c,h]: Remove redundant info() function. + + * aha1542.[c,h]: Likewise. + + * aha1740.[c,h]: Likewise. + + * aha274x.[c,h]: Likewise. + + * eata.[c,h]: Likewise. + + * pas16.[c,h]: Likewise. + + * scsi_debug.[c,h]: Likewise. + + * t128.[c,h]: Likewise. + + * u14-34f.[c,h]: Likewise. + + * ultrastor.[c,h]: Likewise. + + * wd7000.[c,h]: Likewise. + + * aha1542.c: Add support for command line options with lilo to set + DMA parameters, I/O port. From Matt Aarnio. + + * buslogic.[c,h]: New version (1.13) from Dave Gentzel. + + * hosts.h: Add new field to Scsi_Hosts "block" to allow blocking + all I/O to certain other cards. Helps prevent problems with some + ISA motherboards. + + * hosts.h: Add usage_count to Scsi_Host_Template. + + * hosts.h: Add n_io_port to Scsi_Host (used when releasing module). + + * hosts.c: Initialize block field. + + * in2000.c: Remove "static" declarations from exported functions. + + * in2000.h: Likewise. + + * scsi.c: Correctly set cmd_len field as required. Save and + change setting when doing a request_sense, restore when done. + Move abort timeout message. Fix panic in request_queueable to + print correct function name. + + * scsi.c: When incrementing usage count, walk block linked list + for host, and or in SCSI_HOST_BLOCK bit. When decrementing usage + count to 0, clear this bit to allow usage to continue, wake up + processes waiting. + + + * scsi_ioctl.c: If we have an info() function, call it, otherwise + if we have a "name" field, use it, else do nothing. + + * sd.c, sr.c: Clear cmd_len field prior to each command we + generate. + + * sd.h: Add "has_part_table" bit to rscsi_disks. + + * sg.[c,h]: Add support for vendor specific 12 byte commands (i.e. + override command length in COMMAND_SIZE). + + * sr.c: Bugfix from Gerd in photocd code. + + * sr.c: Bugfix in get_sectorsize - always use scsi_malloc buffer - + we cannot guarantee that the stack is < 16Mb. + +Tue Nov 22 15:40:46 1994 Eric Youngdale (eric@andante.aib.com) + + * Linux 1.1.67 released. + + * sr.c: Change spelling of manufactor to manufacturer. + + * scsi.h: Likewise. + + * scsi.c: Likewise. + + * qlogic.c: Spelling corrections. + + * in2000.h: Spelling corrections. + + * in2000.c: Update from Bill Earnest, change from + jshiffle@netcom.com. Support new bios versions. + + * README.qlogic: Spelling correction. + +Tue Nov 22 15:40:46 1994 Eric Youngdale (eric@andante.aib.com) + + * Linux 1.1.66 released. + + * u14-34f.c: Spelling corrections. + + * sr.[h,c]: Add support for multi-session CDs from Gerd Knorr. + + * scsi.h: Add manufactor field for keeping track of device + manufacturer. + + * scsi.c: More spelling corrections. + + * qlogic.h, qlogic.c, README.qlogic: New driver from Tom Zerucha. + + * in2000.c, in2000.h: New driver from Brad McLean/Bill Earnest. + + * fdomain.c: Spelling correction. + + * eata.c: Spelling correction. + +Fri Nov 18 15:22:44 1994 Eric Youngdale (eric@andante.aib.com) + + * Linux 1.1.65 released. + + * eata.h: Update version string to 1.08.00. + + * eata.c: Set sg_tablesize correctly for DPT PM2012 boards. + + * aha274x.seq: Spell checking. + + * README.st: Likewise. + + * README.aha274x: Likewise. + + * ChangeLog: Likewise. + +Tue Nov 15 15:35:08 1994 Eric Youngdale (eric@andante.aib.com) + + * Linux 1.1.64 released. + + * u14-34f.h: Update version number to 1.10.01. + + * u14-34f.c: Use Scsi_Host can_queue variable instead of one from template. + + * eata.[c,h]: New driver for DPT boards from Dario Ballabio. + + * buslogic.c: Use can_queue field. + +Wed Nov 30 12:09:09 1994 Eric Youngdale (eric@andante.aib.com) + + * Linux 1.1.63 released. + + * sd.c: Give I/O error if we attempt 512 byte I/O to a disk with + 1024 byte sectors. + + * scsicam.c: Make sure we do read from whole disk (mask off + partition). + + * scsi.c: Use can_queue in Scsi_Host structure. + Fix panic message about invalid host. + + * hosts.c: Initialize can_queue from template. + + * hosts.h: Add can_queue to Scsi_Host structure. + + * aha1740.c: Print out warning about NULL ecbptr. + +Fri Nov 4 12:40:30 1994 Eric Youngdale (eric@andante.aib.com) + + * Linux 1.1.62 released. + + * fdomain.c: Update to version 5.20. (From Rik Faith). Support + BIOS version 3.5. + + * st.h: Add ST_EOD symbol. + + * st.c: Patches from Kai Makisara - support additional densities, + add support for MTFSS, MTBSS, MTWSM commands. + + * README.st: Update to document new commands. + + * scsi.c: Add Mediavision CDR-H93MV to blacklist. + +Sat Oct 29 20:57:36 1994 Eric Youngdale (eric@andante.aib.com) + + * Linux 1.1.60 released. + + * u14-34f.[c,h]: New driver from Dario Ballabio. + + * aic7770.c, aha274x_seq.h, aha274x.seq, aha274x.h, aha274x.c, + README.aha274x: New files, new driver from John Aycock. + + +Tue Oct 11 08:47:39 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.54 released. + + * Add third PCI chip id. [Drew] + + * buslogic.c: Set BUSLOGIC_CMDLUN back to 1 [Eric]. + + * ultrastor.c: Fix asm directives for new GCC. + + * sr.c, sd.c: Use new end_scsi_request function. + + * scsi.h(end_scsi_request): Return pointer to block if still + active, else return NULL if inactive. Fixes race condition. + +Sun Oct 9 20:23:14 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.53 released. + + * scsi.c: Do not allocate dma bounce buffers if we have exactly + 16Mb. + +Fri Sep 9 05:35:30 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.51 released. + + * aha152x.c: Add support for disabling the parity check. Update + to version 1.4. [Juergen]. + + * seagate.c: Tweak debugging message. + +Wed Aug 31 10:15:55 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.50 released. + + * aha152x.c: Add eb800 for Vtech Platinum SMP boards. [Juergen]. + + * scsi.c: Add Quantum PD1225S to blacklist. + +Fri Aug 26 09:38:45 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.49 released. + + * sd.c: Fix bug when we were deleting the wrong entry if we + get an unsupported sector size device. + + * sr.c: Another spelling patch. + +Thu Aug 25 09:15:27 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.48 released. + + * Throughout: Use new semantics for request_dma, as appropriate. + + * sr.c: Print correct device number. + +Sun Aug 21 17:49:23 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.47 released. + + * NCR5380.c: Add support for LIMIT_TRANSFERSIZE. + + * constants.h: Add prototype for print_Scsi_Cmnd. + + * pas16.c: Some more minor tweaks. Test for Mediavision board. + Allow for disks > 1Gb. [Drew??] + + * sr.c: Set SCpnt->transfersize. + +Tue Aug 16 17:29:35 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.46 released. + + * Throughout: More spelling fixups. + + * buslogic.c: Add a few more fixups from Dave. Disk translation + mainly. + + * pas16.c: Add a few patches (Drew?). + + +Thu Aug 11 20:45:15 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.44 released. + + * hosts.c: Add type casts for scsi_init_malloc. + + * scsicam.c: Add type cast. + +Wed Aug 10 19:23:01 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.43 released. + + * Throughout: Spelling cleanups. [??] + + * aha152x.c, NCR53*.c, fdomain.c, g_NCR5380.c, pas16.c, seagate.c, + t128.c: Use request_irq, not irqaction. [??] + + * aha1542.c: Move test for shost before we start to use shost. + + * aha1542.c, aha1740.c, ultrastor.c, wd7000.c: Use new + calling sequence for request_irq. + + * buslogic.c: Update from Dave Gentzel. + +Tue Aug 9 09:32:59 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.42 released. + + * NCR5380.c: Change NCR5380_print_status to static. + + * seagate.c: A few more bugfixes. Only Drew knows what they are + for. + + * ultrastor.c: Tweak some __asm__ directives so that it works + with newer compilers. [??] + +Sat Aug 6 21:29:36 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.40 released. + + * NCR5380.c: Return SCSI_RESET_WAKEUP from reset function. + + * aha1542.c: Reset mailbox status after a bus device reset. + + * constants.c: Fix typo (;;). + + * g_NCR5380.c: + * pas16.c: Correct usage of NCR5380_init. + + * scsi.c: Remove redundant (and unused variables). + + * sd.c: Use memset to clear all of rscsi_disks before we use it. + + * sg.c: Ditto, except for scsi_generics. + + * sr.c: Ditto, except for scsi_CDs. + + * st.c: Initialize STp->device. + + * seagate.c: Fix bug. [Drew] + +Thu Aug 4 08:47:27 1994 Eric Youngdale (eric@andante) + + * Linux 1.1.39 released. + + * Makefile: Fix typo in NCR53C7xx. + + * st.c: Print correct number for device. + +Tue Aug 2 11:29:14 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.38 released. + + Lots of changes in 1.1.38. All from Drew unless otherwise noted. + + * 53c7,8xx.c: New file from Drew. PCI driver. + + * 53c7,8xx.h: Likewise. + + * 53c7,8xx.scr: Likewise. + + * 53c8xx_d.h, 53c8xx_u.h, script_asm.pl: Likewise. + + * scsicam.c: New file from Drew. Read block 0 on the disk and + read the partition table. Attempt to deduce the geometry from + the partition table if possible. Only used by 53c[7,8]xx right + now, but could be used by any device for which we have no way + of identifying the geometry. + + * sd.c: Use device letters instead of sd%d in a lot of messages. + + * seagate.c: Fix bug that resulted in lockups with some devices. + + * sr.c (sr_open): Return -EROFS, not -EACCES if we attempt to open + device for write. + + * hosts.c, Makefile: Update for new driver. + + * NCR5380.c, NCR5380.h, g_NCR5380.h: Update from Drew to support + 53C400 chip. + + * constants.c: Define CONST_CMND and CONST_MSG. Other minor + cleanups along the way. Improve handling of CONST_MSG. + + * fdomain.c, fdomain.h: New version from Rik Faith. Update to + 5.18. Should now support TMC-3260 PCI card with 18C30 chip. + + * pas16.c: Update with new irq initialization. + + * t128.c: Update with minor cleanups. + + * scsi.c (scsi_pid): New variable - gives each command a unique + id. Add Quantum LPS5235S to blacklist. Change in_scan to + in_scan_scsis and make global. + + * scsi.h: Add some defines for extended message handling, + INITIATE/RELEASE_RECOVERY. Add a few new fields to support sync + transfers. + + * scsi_ioctl.h: Add ioctl to request synchronous transfers. + + +Tue Jul 26 21:36:58 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.37 released. + + * aha1542.c: Always call aha1542_mbenable, use new udelay + mechanism so we do not wait a long time if the board does not + implement this command. + + * g_NCR5380.c: Remove #include <linux/config.h> and #if + defined(CONFIG_SCSI_*). + + * seagate.c: Likewise. + + Next round of changes to support loadable modules. Getting closer + now, still not possible to do anything remotely usable. + + hosts.c: Create a linked list of detected high level devices. + (scsi_register_device): New function to insert into this list. + (scsi_init): Call scsi_register_device for each of the known high + level drivers. + + hosts.h: Add prototype for linked list header. Add structure + definition for device template structure which defines the linked + list. + + scsi.c: (scan_scsis): Use linked list instead of knowledge about + existing high level device drivers. + (scsi_dev_init): Use init functions for drivers on linked list + instead of explicit list to initialize and attach devices to high + level drivers. + + scsi.h: Add new field "attached" to scsi_device - count of number + of high level devices attached. + + sd.c, sr.c, sg.c, st.c: Adjust init/attach functions to use new + scheme. + +Sat Jul 23 13:03:17 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.35 released. + + * ultrastor.c: Change constraint on asm() operand so that it works + with gcc 2.6.0. + +Thu Jul 21 10:37:39 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.33 released. + + * sr.c(sr_open): Do not allow opens with write access. + +Mon Jul 18 09:51:22 1994 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.31 released. + + * sd.c: Increase SD_TIMEOUT from 300 to 600. + + * sr.c: Remove stray task_struct* variable that was no longer + used. + + * sr_ioctl.c: Fix typo in up() call. + +Sun Jul 17 16:25:29 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.30 released. + + * scsi.c (scan_scsis): Fix detection of some Toshiba CDROM drives + that report themselves as disk drives. + + * (Throughout): Use request.sem instead of request.waiting. + Should fix swap problem with fdomain. + +Thu Jul 14 10:51:42 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.29 released. + + * scsi.c (scan_scsis): Add new devices to end of linked list, not + to the beginning. + + * scsi.h (SCSI_SLEEP): Remove brain dead hack to try to save + the task state before sleeping. + +Sat Jul 9 15:01:03 1994 Eric Youngdale (eric@esp22) + + More changes to eventually support loadable modules. Mainly + we want to use linked lists instead of arrays because it is easier + to dynamically add and remove things this way. + + Quite a bit more work is needed before loadable modules are + possible (and usable) with scsi, but this is most of the grunge + work. + + * Linux 1.1.28 released. + + * scsi.c, scsi.h (allocate_device, request_queueable): Change + argument from index into scsi_devices to a pointer to the + Scsi_Device struct. + + * Throughout: Change all calls to allocate_device, + request_queueable to use new calling sequence. + + * Throughout: Use SCpnt->device instead of + scsi_devices[SCpnt->index]. Ugh - the pointer was there all along + - much cleaner this way. + + * scsi.c (scsi_init_malloc, scsi_free_malloc): New functions - + allow us to pretend that we have a working malloc when we + initialize. Use this instead of passing memory_start, memory_end + around all over the place. + + * scsi.h, st.c, sr.c, sd.c, sg.c: Change *_init1 functions to use + scsi_init_malloc, remove all arguments, no return value. + + * scsi.h: Remove index field from Scsi_Device and Scsi_Cmnd + structs. + + * scsi.c (scsi_dev_init): Set up for scsi_init_malloc. + (scan_scsis): Get SDpnt from scsi_init_malloc, and refresh + when we discover a device. Free pointer before returning. + Change scsi_devices into a linked list. + + * scsi.c (scan_scsis): Change to only scan one host. + (scsi_dev_init): Loop over all detected hosts, and scan them. + + * hosts.c (scsi_init_free): Change so that number of extra bytes + is stored in struct, and we do not have to pass it each time. + + * hosts.h: Change Scsi_Host_Template struct to include "next" and + "release" functions. Initialize to NULL in all low level + adapters. + + * hosts.c: Rename scsi_hosts to builtin_scsi_hosts, create linked + list scsi_hosts, linked together with the new "next" field. + +Wed Jul 6 05:45:02 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.25 released. + + * aha152x.c: Changes from Juergen - cleanups and updates. + + * sd.c, sr.c: Use new check_media_change and revalidate + file_operations fields. + + * st.c, st.h: Add changes from Kai Makisara, dated Jun 22. + + * hosts.h: Change SG_ALL back to 0xff. Apparently soft error + in /dev/brain resulted in having this bumped up. + Change first parameter in bios_param function to be Disk * instead + of index into rscsi_disks. + + * sd_ioctl.c: Pass pointer to rscsi_disks element instead of index + to array. + + * sd.h: Add struct name "scsi_disk" to typedef for Scsi_Disk. + + * scsi.c: Remove redundant Maxtor XT8760S from blacklist. + In scsi_reset, add printk when DEBUG defined. + + * All low level drivers: Modify definitions of bios_param in + appropriate way. + +Thu Jun 16 10:31:59 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.20 released. + + * scsi_ioctl.c: Only pass down the actual number of characters + required to scsi_do_cmd, not the one rounded up to a even number + of sectors. + + * ultrastor.c: Changes from Caleb Epstein for 24f cards. Support + larger SG lists. + + * ultrastor.c: Changes from me - use scsi_register to register + host. Add some consistency checking, + +Wed Jun 1 21:12:13 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.19 released. + + * scsi.h: Add new return code for reset() function: + SCSI_RESET_PUNT. + + * scsi.c: Make SCSI_RESET_PUNT the same as SCSI_RESET_WAKEUP for + now. + + * aha1542.c: If the command responsible for the reset is not + pending, return SCSI_RESET_PUNT. + + * aha1740.c, buslogic.c, wd7000.c, ultrastor.c: Return + SCSI_RESET_PUNT instead of SCSI_RESET_SNOOZE. + +Tue May 31 19:36:01 1994 Eric Youngdale (eric@esp22) + + * buslogic.c: Do not print out message about "must be Adaptec" + if we have detected a buslogic card. Print out a warning message + if we are configuring for >16Mb, since the 445S at board level + D or earlier does not work right. The "D" level board can be made + to work by flipping an undocumented switch, but this is too subtle. + + Changes based upon patches in Yggdrasil distribution. + + * sg.c, sg.h: Return sense data to user. + + * aha1542.c, aha1740.c, buslogic.c: Do not panic if + sense buffer is wrong size. + + * hosts.c: Test for ultrastor card before any of the others. + + * scsi.c: Allow boot-time option for max_scsi_luns=? so that + buggy firmware has an easy work-around. + +Sun May 15 20:24:34 1994 Eric Youngdale (eric@esp22) + + * Linux 1.1.15 released. + + Post-codefreeze thaw... + + * buslogic.[c,h]: New driver from David Gentzel. + + * hosts.h: Add use_clustering field to explicitly say whether + clustering should be used for devices attached to this host + adapter. The buslogic board apparently supports large SG lists, + but it is apparently faster if sd.c condenses this into a smaller + list. + + * sd.c: Use this field instead of heuristic. + + * All host adapter include files: Add appropriate initializer for + use_clustering field. + + * scsi.h: Add #defines for return codes for the abort and reset + functions. There are now a specific set of return codes to fully + specify all of the possible things that the low-level adapter + could do. + + * scsi.c: Act based upon return codes from abort/reset functions. + + * All host adapter abort/reset functions: Return new return code. + + * Add code in scsi.c to help debug timeouts. Use #define + DEBUG_TIMEOUT to enable this. + + * scsi.c: If the host->irq field is set, use + disable_irq/enable_irq before calling queuecommand if we + are not already in an interrupt. Reduce races, and we + can be sloppier about cli/sti in the interrupt routines now + (reduce interrupt latency). + + * constants.c: Fix some things to eliminate warnings. Add some + sense descriptions that were omitted before. + + * aha1542.c: Watch for SCRD from host adapter - if we see it, set + a flag. Currently we only print out the number of pending + commands that might need to be restarted. + + * aha1542.c (aha1542_abort): Look for lost interrupts, OGMB still + full, and attempt to recover. Otherwise give up. + + * aha1542.c (aha1542_reset): Try BUS DEVICE RESET, and then pass + DID_RESET back up to the upper level code for all commands running + on this target (even on different LUNs). + +Sat May 7 14:54:01 1994 + + * Linux 1.1.12 released. + + * st.c, st.h: New version from Kai. Supports boot time + specification of number of buffers. + + * wd7000.[c,h]: Updated driver from John Boyd. Now supports + more than one wd7000 board in machine at one time, among other things. + +Wed Apr 20 22:20:35 1994 + + * Linux 1.1.8 released. + + * sd.c: Add a few type casts where scsi_malloc is called. + +Wed Apr 13 12:53:29 1994 + + * Linux 1.1.4 released. + + * scsi.c: Clean up a few printks (use %p to print pointers). + +Wed Apr 13 11:33:02 1994 + + * Linux 1.1.3 released. + + * fdomain.c: Update to version 5.16 (Handle different FIFO sizes + better). + +Fri Apr 8 08:57:19 1994 + + * Linux 1.1.2 released. + + * Throughout: SCSI portion of cluster diffs added. + +Tue Apr 5 07:41:50 1994 + + * Linux 1.1 development tree initiated. + + * The linux 1.0 development tree is now effectively frozen except + for obvious bugfixes. + +****************************************************************** +****************************************************************** +****************************************************************** +****************************************************************** + +Sun Apr 17 00:17:39 1994 + + * Linux 1.0, patchlevel 9 released. + + * fdomain.c: Update to version 5.16 (Handle different FIFO sizes + better). + +Thu Apr 7 08:36:20 1994 + + * Linux 1.0, patchlevel8 released. + + * fdomain.c: Update to version 5.15 from 5.9. Handles 3.4 bios. + +Sun Apr 3 14:43:03 1994 + + * Linux 1.0, patchlevel6 released. + + * wd7000.c: Make stab at fixing race condition. + +Sat Mar 26 14:14:50 1994 + + * Linux 1.0, patchlevel5 released. + + * aha152x.c, Makefile: Fix a few bugs (too much data message). + Add a few more bios signatures. (Patches from Juergen). + + * aha1542.c: Fix race condition in aha1542_out. + +Mon Mar 21 16:36:20 1994 + + * Linux 1.0, patchlevel3 released. + + * sd.c, st.c, sr.c, sg.c: Return -ENXIO, not -ENODEV if we attempt + to open a non-existent device. + + * scsi.c: Add Chinon cdrom to blacklist. + + * sr_ioctl.c: Check return status of verify_area. + +Sat Mar 6 16:06:19 1994 + + * Linux 1.0 released (technically a pre-release). + + * scsi.c: Add IMS CDD521, Maxtor XT-8760S to blacklist. + +Tue Feb 15 10:58:20 1994 + + * pl15e released. + + * aha1542.c: For 1542C, allow dynamic device scan with >1Gb turned + off. + + * constants.c: Fix typo in definition of CONSTANTS. + + * pl15d released. + +Fri Feb 11 10:10:16 1994 + + * pl15c released. + + * scsi.c: Add Maxtor XT-3280 and Rodime RO3000S to blacklist. + + * scsi.c: Allow tagged queueing for scsi 3 devices as well. + Some really old devices report a version number of 0. Disallow + LUN != 0 for these. + +Thu Feb 10 09:48:57 1994 + + * pl15b released. + +Sun Feb 6 12:19:46 1994 + + * pl15a released. + +Fri Feb 4 09:02:17 1994 + + * scsi.c: Add Teac cdrom to blacklist. + +Thu Feb 3 14:16:43 1994 + + * pl15 released. + +Tue Feb 1 15:47:43 1994 + + * pl14w released. + + * wd7000.c (wd_bases): Fix typo in last change. + +Mon Jan 24 17:37:23 1994 + + * pl14u released. + + * aha1542.c: Support 1542CF/extended bios. Different from 1542C + + * wd7000.c: Allow bios at 0xd8000 as well. + + * ultrastor.c: Do not truncate cylinders to 1024. + + * fdomain.c: Update to version 5.9 (add new bios signature). + + * NCR5380.c: Update from Drew - should work a lot better now. + +Sat Jan 8 15:13:10 1994 + + * pl14o released. + + * sr_ioctl.c: Zero reserved field before trying to set audio volume. + +Wed Jan 5 13:21:10 1994 + + * pl14m released. + + * fdomain.c: Update to version 5.8. No functional difference??? + +Tue Jan 4 14:26:13 1994 + + * pl14l released. + + * ultrastor.c: Remove outl, inl functions (now provided elsewhere). + +Mon Jan 3 12:27:25 1994 + + * pl14k released. + + * aha152x.c: Remove insw and outsw functions. + + * fdomain.c: Ditto. + +Wed Dec 29 09:47:20 1993 + + * pl14i released. + + * scsi.c: Support RECOVERED_ERROR for tape drives. + + * st.c: Update of tape driver from Kai. + +Tue Dec 21 09:18:30 1993 + + * pl14g released. + + * aha1542.[c,h]: Support extended BIOS stuff. + + * scsi.c: Clean up messages about disks, so they are displayed as + sda, sdb, etc instead of sd0, sd1, etc. + + * sr.c: Force reread of capacity if disk was changed. + Clear buffer before asking for capacity/sectorsize (some drives + do not report this properly). Set needs_sector_size flag if + drive did not return sensible sector size. + +Mon Dec 13 12:13:47 1993 + + * aha152x.c: Update to version .101 from Juergen. + +Mon Nov 29 03:03:00 1993 + + * linux 0.99.14 released. + + * All scsi stuff moved from kernel/blk_drv/scsi to drivers/scsi. + + * Throughout: Grammatical corrections to various comments. + + * Makefile: fix so that we do not need to compile things we are + not going to use. + + * NCR5380.c, NCR5380.h, g_NCR5380.c, g_NCR5380.h, pas16.c, + pas16.h, t128.c, t128.h: New files from Drew. + + * aha152x.c, aha152x.h: New files from Juergen Fischer. + + * aha1542.c: Support for more than one 1542 in the machine + at the same time. Make functions static that do not need + visibility. + + * aha1740.c: Set NEEDS_JUMPSTART flag in reset function, so we + know to restart the command. Change prototype of aha1740_reset + to take a command pointer. + + * constants.c: Clean up a few things. + + * fdomain.c: Update to version 5.6. Move snarf_region. Allow + board to be set at different SCSI ids. Remove support for + reselection (did not work well). Set JUMPSTART flag in reset + code. + + * hosts.c: Support new low-level adapters. Allow for more than + one adapter of a given type. + + * hosts.h: Allow for more than one adapter of a given type. + + * scsi.c: Add scsi_device_types array, if NEEDS_JUMPSTART is set + after a low-level reset, start the command again. Sort blacklist, + and add Maxtor MXT-1240S, XT-4170S, NEC CDROM 84, Seagate ST157N. + + * scsi.h: Add constants for tagged queueing. + + * Throughout: Use constants from major.h instead of hardcoded + numbers for major numbers. + + * scsi_ioctl.c: Fix bug in buffer length in ioctl_command. Use + verify_area in GET_IDLUN ioctl. Add new ioctls for + TAGGED_QUEUE_ENABLE, DISABLE. Only allow IOCTL_SEND_COMMAND by + superuser. + + * sd.c: Only pay attention to UNIT_ATTENTION for removable disks. + Fix bug where sometimes portions of blocks would get lost + resulting in processes hanging. Add messages when we spin up a + disk, and fix a bug in the timing. Increase read-ahead for disks + that are on a scatter-gather capable host adapter. + + * seagate.c: Fix so that some parameters can be set from the lilo + prompt. Supply jumpstart flag if we are resetting and need the + command restarted. Fix so that we return 1 if we detect a card + so that multiple card detection works correctly. Add yet another + signature for FD cards (950). Add another signature for ST0x. + + * sg.c, sg.h: New files from Lawrence Foard for generic scsi + access. + + * sr.c: Add type casts for (void*) so that we can do pointer + arithmetic. Works with GCC without this, but it is not strictly + correct. Same bugfix as was in sd.c. Increase read-ahead a la + disk driver. + + * sr_ioctl.c: Use scsi_malloc buffer instead of buffer from stack + since we cannot guarantee that the stack is < 16Mb. + + ultrastor.c: Update to support 24f properly (JFC's driver). + + wd7000.c: Supply jumpstart flag for reset. Do not round up + number of cylinders in biosparam function. + +Sat Sep 4 20:49:56 1993 + + * 0.99pl13 released. + + * Throughout: Use check_region/snarf_region for all low-level + drivers. + + * aha1542.c: Do hard reset instead of soft (some ethercard probes + screw us up). + + * scsi.c: Add new flag ASKED_FOR_SENSE so that we can tell if we are + in a loop whereby the device returns null sense data. + + * sd.c: Add code to spin up a drive if it is not already spinning. + Do this one at a time to make it easier on power supplies. + + * sd_ioctl.c: Use sync_dev instead of fsync_dev in BLKFLSBUF ioctl. + + * seagate.c: Switch around DATA/CONTROL lines. + + * st.c: Change sense to unsigned. + +Thu Aug 5 11:59:18 1993 + + * 0.99pl12 released. + + * constants.c, constants.h: New files with ascii descriptions of + various conditions. + + * Makefile: Do not try to count the number of low-level drivers, + just generate the list of .o files. + + * aha1542.c: Replace 16 with sizeof(SCpnt->sense_buffer). Add tests + for addresses > 16Mb, panic if we find one. + + * aha1740.c: Ditto with sizeof(). + + * fdomain.c: Update to version 3.18. Add new signature, register IRQ + with irqaction. Use ID 7 for new board. Be more intelligent about + obtaining the h/s/c numbers for biosparam. + + * hosts.c: Do not depend upon Makefile generated count of the number + of low-level host adapters. + + * scsi.c: Use array for scsi_command_size instead of a function. Add + Texel cdrom and Maxtor XT-4380S to blacklist. Allow compile time + option for no-multi lun scan. Add semaphore for possible problems + with handshaking, assume device is faulty until we know it not to be + the case. Add DEBUG_INIT symbol to dump info as we scan for devices. + Zero sense buffer so we can tell if we need to request it. When + examining sense information, request sense if buffer is all zero. + If RESET, request sense information to see what to do next. + + * scsi_debug.c: Change some constants to use symbols like INT_MAX. + + * scsi_ioctl.c (kernel_scsi_ioctl): New function -for making ioctl + calls from kernel space. + + * sd.c: Increase timeout to 300. Use functions in constants.h to + display info. Use scsi_malloc buffer for READ_CAPACITY, since + we cannot guarantee that a stack based buffer is < 16Mb. + + * sd_ioctl.c: Add BLKFLSBUF ioctl. + + * seagate.c: Add new compile time options for ARBITRATE, + SLOW_HANDSHAKE, and SLOW_RATE. Update assembly loops for transferring + data. Use kernel_scsi_ioctl to request mode page with geometry. + + * sr.c: Use functions in constants.c to display messages. + + * st.c: Support for variable block size. + + * ultrastor.c: Do not use cache for tape drives. Set + unchecked_isa_dma flag, even though this may not be needed (gets set + later). + +Sat Jul 17 18:32:44 1993 + + * 0.99pl11 released. C++ compilable. + + * Throughout: Add type casts all over the place, and use "ip" instead + of "info" in the various biosparam functions. + + * Makefile: Compile seagate.c with C++ compiler. + + * aha1542.c: Always set ccb pointer as this gets trashed somehow on + some systems. Add a few type casts. Update biosparam function a little. + + * aha1740.c: Add a few type casts. + + * fdomain.c: Update to version 3.17 from 3.6. Now works with + TMC-18C50. + + * scsi.c: Minor changes here and there with datatypes. Save use_sg + when requesting sense information so that this can properly be + restored if we retry the command. Set aside dma buffers assuming each + block is 1 page, not 1Kb minix block. + + * scsi_ioctl.c: Add a few type casts. Other minor changes. + + * sd.c: Correctly free all scsi_malloc'd memory if we run out of + dma_pool. Store blocksize information for each partition. + + * seagate.c: Minor cleanups here and there. + + * sr.c: Set up blocksize array for all discs. Fix bug in freeing + buffers if we run out of dma pool. + +Thu Jun 2 17:58:11 1993 + + * 0.99pl10 released. + + * aha1542.c: Support for BT 445S (VL-bus board with no dma channel). + + * fdomain.c: Upgrade to version 3.6. Preliminary support for TNC-18C50. + + * scsi.c: First attempt to fix problem with old_use_sg. Change + NOT_READY to a SUGGEST_ABORT. Fix timeout race where time might + get decremented past zero. + + * sd.c: Add block_fsync function to dispatch table. + + * sr.c: Increase timeout to 500 from 250. Add entry for sync in + dispatch table (supply NULL). If we do not have a sectorsize, + try to get it in the sd_open function. Add new function just to + obtain sectorsize. + + * sr.h: Add needs_sector_size semaphore. + + * st.c: Add NULL for fsync in dispatch table. + + * wd7000.c: Allow another condition for power on that are normal + and do not require a panic. + +Thu Apr 22 23:10:11 1993 + + * 0.99pl9 released. + + * aha1542.c: Use (void) instead of () in setup_mailboxes. + + * scsi.c: Initialize transfersize and underflow fields in SCmd to 0. + Do not panic for unsupported message bytes. + + * scsi.h: Allocate 12 bytes instead of 10 for commands. Add + transfersize and underflow fields. + + * scsi_ioctl.c: Further bugfix to ioctl_probe. + + * sd.c: Use long instead of int for last parameter in sd_ioctl. + Initialize transfersize and underflow fields. + + * sd_ioctl.c: Ditto for sd_ioctl(,,,,); + + * seagate.c: New version from Drew. Includes new signatures for FD + cards. Support for 0ws jumper. Correctly initialize + scsi_hosts[hostnum].this_id. Improved handing of + disconnect/reconnect, and support command linking. Use + transfersize and underflow fields. Support scatter-gather. + + * sr.c, sr_ioctl.c: Use long instead of int for last parameter in sr_ioctl. + Use buffer and buflength in do_ioctl. Patches from Chris Newbold for + scsi-2 audio commands. + + * ultrastor.c: Comment out in_byte (compiler warning). + + * wd7000.c: Change () to (void) in wd7000_enable_dma. + +Wed Mar 31 16:36:25 1993 + + * 0.99pl8 released. + + * aha1542.c: Handle mailboxes better for 1542C. + Do not truncate number of cylinders at 1024 for biosparam call. + + * aha1740.c: Fix a few minor bugs for multiple devices. + Same as above for biosparam. + + * scsi.c: Add lockable semaphore for removable devices that can have + media removal prevented. Add another signature for flopticals. + (allocate_device): Fix race condition. Allow more space in dma pool + for blocksizes of up to 4Kb. + + * scsi.h: Define COMMAND_SIZE. Define a SCSI specific version of + INIT_REQUEST that can run with interrupts off. + + * scsi_ioctl.c: Make ioctl_probe function more idiot-proof. If + a removable device says ILLEGAL REQUEST to a door-locking command, + clear lockable flag. Add SCSI_IOCTL_GET_IDLUN ioctl. Do not attempt + to lock door for devices that do not have lockable semaphore set. + + * sd.c: Fix race condition for multiple disks. Use INIT_SCSI_REQUEST + instead of INIT_REQUEST. Allow sector sizes of 1024 and 256. For + removable disks that are not ready, mark them as having a media change + (some drives do not report this later). + + * seagate.c: Use volatile keyword for memory-mapped register pointers. + + * sr.c: Fix race condition, a la sd.c. Increase the number of retries + to 1. Use INIT_SCSI_REQUEST. Allow 512 byte sector sizes. Do a + read_capacity when we init the device so we know the size and + sectorsize. + + * st.c: If ioctl not found in st.c, try scsi_ioctl for others. + + * ultrastor.c: Do not truncate number of cylinders at 1024 for + biosparam call. + + * wd7000.c: Ditto. + Throughout: Use COMMAND_SIZE macro to determine length of scsi + command. + + + +Sat Mar 13 17:31:29 1993 + + * 0.99pl7 released. + + Throughout: Improve punctuation in some messages, and use new + verify_area syntax. + + * aha1542.c: Handle unexpected interrupts better. + + * scsi.c: Ditto. Handle reset conditions a bit better, asking for + sense information and retrying if required. + + * scsi_ioctl.c: Allow for 12 byte scsi commands. + + * ultrastor.c: Update to use scatter-gather. + +Sat Feb 20 17:57:15 1993 + + * 0.99pl6 released. + + * fdomain.c: Update to version 3.5. Handle spurious interrupts + better. + + * sd.c: Use register_blkdev function. + + * sr.c: Ditto. + + * st.c: Use register_chrdev function. + + * wd7000.c: Undo previous change. + +Sat Feb 6 11:20:43 1993 + + * 0.99pl5 released. + + * scsi.c: Fix bug in testing for UNIT_ATTENTION. + + * wd7000.c: Check at more addresses for bios. Fix bug in biosparam + (heads & sectors turned around). + +Wed Jan 20 18:13:59 1993 + + * 0.99pl4 released. + + * scsi.c: Ignore leading spaces when looking for blacklisted devices. + + * seagate.c: Add a few new signatures for FD cards. Another patch + with SCint to fix race condition. Use recursion_depth to keep track + of how many times we have been recursively called, and do not start + another command unless we are on the outer level. Fixes bug + with Syquest cartridge drives (used to crash kernel), because + they do not disconnect with large data transfers. + +Tue Jan 12 14:33:36 1993 + + * 0.99pl3 released. + + * fdomain.c: Update to version 3.3 (a few new signatures). + + * scsi.c: Add CDU-541, Denon DRD-25X to blacklist. + (allocate_request, request_queueable): Init request.waiting to NULL if + non-buffer type of request. + + * seagate.c: Allow controller to be overridden with CONTROLLER symbol. + Set SCint=NULL when we are done, to remove race condition. + + * st.c: Changes from Kai. + +Wed Dec 30 20:03:47 1992 + + * 0.99pl2 released. + + * scsi.c: Blacklist back in. Remove Newbury drive as other bugfix + eliminates need for it here. + + * sd.c: Return ENODEV instead of EACCES if no such device available. + (sd_init) Init blkdev_fops earlier so that sd_open is available sooner. + + * sr.c: Same as above for sd.c. + + * st.c: Return ENODEV instead of ENXIO if no device. Init chrdev_fops + sooner, so that it is always there even if no tapes. + + * seagate.c (controller_type): New variable to keep track of ST0x or + FD. Modify signatures list to indicate controller type, and init + controller_type once we find a match. + + * wd7000.c (wd7000_set_sync): Remove redundant function. + +Sun Dec 20 16:26:24 1992 + + * 0.99pl1 released. + + * scsi_ioctl.c: Bugfix - check dev->index, not dev->id against + NR_SCSI_DEVICES. + + * sr_ioctl.c: Verify that device exists before allowing an ioctl. + + * st.c: Patches from Kai - change timeout values, improve end of tape + handling. + +Sun Dec 13 18:15:23 1992 + + * 0.99 kernel released. Baseline for this ChangeLog. diff --git a/Documentation/scsi/ChangeLog.ips b/Documentation/scsi/ChangeLog.ips new file mode 100644 index 000000000000..5019f5182bf4 --- /dev/null +++ b/Documentation/scsi/ChangeLog.ips @@ -0,0 +1,122 @@ +IBM ServeRAID driver Change Log +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + 5.00.01 - Sarasota ( 5i ) adapters must always be scanned first + - Get rid on IOCTL_NEW_COMMAND code + - Add Extended DCDB Commands for Tape Support in 5I + + 4.90.11 - Don't actually RESET unless it's physically required + - Remove unused compile options + + 4.90.08 - Data Corruption if First Scatter Gather Element is > 64K + + 4.90.08 - Increase Delays in Flashing ( Trombone Only - 4H ) + + 4.90.05 - Use New PCI Architecture to facilitate Hot Plug Development + + 4.90.01 - Add ServeRAID Version Checking + + 4.80.26 - Clean up potential code problems ( Arjan's recommendations ) + + 4.80.21 - Change memcpy() to copy_to_user() in NVRAM Page 5 IOCTL path + + 4.80.20 - Set max_sectors in Scsi_Host structure ( if >= 2.4.7 kernel ) + - 5 second delay needed after resetting an i960 adapter + + 4.80.14 - Take all semaphores off stack + - Clean Up New_IOCTL path + + 4.80.04 - Eliminate calls to strtok() if 2.4.x or greater + - Adjustments to Device Queue Depth + + 4.80.00 - Make ia64 Safe + + 4.72.01 - I/O Mapped Memory release ( so "insmod ips" does not Fail ) + - Don't Issue Internal FFDC Command if there are Active Commands + - Close Window for getting too many IOCTL's active + + 4.72.00 - Allow for a Scatter-Gather Element to exceed MAX_XFER Size + + 4.71.00 - Change all memory allocations to not use GFP_DMA flag + - Code Clean-Up for 2.4.x kernel + + 4.70.15 - Fix Breakup for very large ( non-SG ) requests + + 4.70.13 - Don't release HA Lock in ips_next() until SC taken off queue + - Unregister SCSI device in ips_release() + - Don't Send CDB's if we already know the device is not present + + 4.70.12 - Corrective actions for bad controller ( during initialization ) + + 4.70.09 - Use a Common ( Large Buffer ) for Flashing from the JCRM CD + - Add IPSSEND Flash Support + - Set Sense Data for Unknown SCSI Command + - Use Slot Number from NVRAM Page 5 + - Restore caller's DCDB Structure + + 4.20.14 - Update patch files for kernel 2.4.0-test5 + + 4.20.13 - Fix some failure cases / reset code + - Hook into the reboot_notifier to flush the controller + cache + + 4.20.03 - Rename version to coincide with new release schedules + - Performance fixes + - Fix truncation of /proc files with cat + - Merge in changes through kernel 2.4.0test1ac21 + + 4.10.13 - Fix for dynamic unload and proc file system + + 4.10.00 - Add support for ServeRAID 4M/4L + + 4.00.06 - Fix timeout with initial FFDC command + + 4.00.05 - Remove wish_block from init routine + - Use linux/spinlock.h instead of asm/spinlock.h for kernels + 2.3.18 and later + - Sync with other changes from the 2.3 kernels + + 4.00.04 - Rename structures/constants to be prefixed with IPS_ + + 4.00.03 - Add alternative passthru interface + - Add ability to flash ServeRAID BIOS + + 4.00.02 - Fix problem with PT DCDB with no buffer + + 4.00.01 - Add support for First Failure Data Capture + + 4.00.00 - Add support for ServeRAID 4 + + 3.60.02 - Make DCDB direction based on lookup table. + - Only allow one DCDB command to a SCSI ID at a time. + + 3.60.01 - Remove bogus error check in passthru routine. + + 3.60.00 - Bump max commands to 128 for use with ServeRAID + firmware 3.60. + - Change version to 3.60 to coincide with ServeRAID release + numbering. + + 1.00.00 - Initial Public Release + - Functionally equivalent to 0.99.05 + + 0.99.05 - Fix an oops on certain passthru commands + + 0.99.04 - Fix race condition in the passthru mechanism + -- this required the interface to the utilities to change + - Fix error recovery code + + 0.99.03 - Make interrupt routine handle all completed request on the + adapter not just the first one + - Make sure passthru commands get woken up if we run out of + SCBs + - Send all of the commands on the queue at once rather than + one at a time since the card will support it. + + 0.99.02 - Added some additional debug statements to print out + errors if an error occurs while trying to read/write + to a logical drive (IPS_DEBUG). + + - Fixed read/write errors when the adapter is using an + 8K stripe size. + diff --git a/Documentation/scsi/ChangeLog.megaraid b/Documentation/scsi/ChangeLog.megaraid new file mode 100644 index 000000000000..a9356c63b800 --- /dev/null +++ b/Documentation/scsi/ChangeLog.megaraid @@ -0,0 +1,349 @@ +Release Date : Thu Feb 03 12:27:22 EST 2005 - Seokmann Ju <sju@lsil.com> +Current Version : 2.20.4.5 (scsi module), 2.20.2.5 (cmm module) +Older Version : 2.20.4.4 (scsi module), 2.20.2.4 (cmm module) + +1. Modified name of two attributes in scsi_host_template. + On Wed, 2005-02-02 at 10:56 -0500, Ju, Seokmann wrote: + > + .sdev_attrs = megaraid_device_attrs, + > + .shost_attrs = megaraid_class_device_attrs, + + These are, perhaps, slightly confusing names. + The terms device and class_device have well defined meanings in the + generic device model, neither of which is what you mean here. + Why not simply megaraid_sdev_attrs and megaraid_shost_attrs? + + Other than this, it looks fine to me too. + +Release Date : Thu Jan 27 00:01:03 EST 2005 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.4.4 (scsi module), 2.20.2.5 (cmm module) +Older Version : 2.20.4.3 (scsi module), 2.20.2.4 (cmm module) + +1. Bump up the version of scsi module due to its conflict. + +Release Date : Thu Jan 21 00:01:03 EST 2005 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.4.3 (scsi module), 2.20.2.5 (cmm module) +Older Version : 2.20.4.2 (scsi module), 2.20.2.4 (cmm module) + +1. Remove driver ioctl for logical drive to scsi address translation and + replace with the sysfs attribute. To remove drives and change + capacity, application shall now use the device attribute to get the + logical drive number for a scsi device. For adding newly created + logical drives, class device attribute would be required to uniquely + identify each controller. + - Atul Mukker <atulm@lsil.com> + + "James, I've been thinking about this a little more, and you may be on + to something here. Let each driver add files as such:" + + - Matt Domsch <Matt_Domsch@dell.com>, 12.15.2004 + linux-scsi mailing list + + + "Then, if you simply publish your LD number as an extra parameter of + the device, you can look through /sys to find it." + + - James Bottomley <James.Bottomley@SteelEye.com>, 01.03.2005 + linux-scsi mailing list + + + "I don't see why not ... it's your driver, you can publish whatever + extra information you need as scsi_device attributes; that was one of + the designs of the extensible attribute system." + + - James Bottomley <James.Bottomley@SteelEye.com>, 01.06.2005 + linux-scsi mailing list + +2. Add AMI megaraid support - Brian King <brking@charter.net> + PCI_VENDOR_ID_AMI, PCI_DEVICE_ID_AMI_MEGARAID3, + PCI_VENDOR_ID_AMI, PCI_SUBSYS_ID_PERC3_DC, + +3. Make some code static - Adrian Bunk <bunk@stusta.de> + Date: Mon, 15 Nov 2004 03:14:57 +0100 + + The patch below makes some needlessly global code static. + -wait_queue_head_t wait_q; + +static wait_queue_head_t wait_q; + + Signed-off-by: Adrian Bunk <bunk@stusta.de> + +4. Added NEC ROMB support - NEC MegaRAID PCI Express ROMB controller + PCI_VENDOR_ID_LSI_LOGIC, PCI_DEVICE_ID_MEGARAID_NEC_ROMB_2E, + PCI_SUBSYS_ID_NEC, PCI_SUBSYS_ID_MEGARAID_NEC_ROMB_2E, + +5. Fixed Tape drive issue : For any Direct CDB command to physical device + including tape, timeout value set by driver was 10 minutes. With this + value, most of command will return within timeout. However, for those + command like ERASE or FORMAT, it takes more than an hour depends on + capacity of the device and the command could be terminated before it + completes. + To address this issue, the 'timeout' field in the DCDB command will + have NO TIMEOUT (i.e., 4) value as its timeout on DCDB command. + + + +Release Date : Thu Dec 9 19:10:23 EST 2004 + - Sreenivas Bagalkote <sreenib@lsil.com> + +Current Version : 2.20.4.2 (scsi module), 2.20.2.4 (cmm module) +Older Version : 2.20.4.1 (scsi module), 2.20.2.3 (cmm module) + +i. Introduced driver ioctl that returns scsi address for a given ld. + + "Why can't the existing sysfs interfaces be used to do this?" + - Brian King (brking@us.ibm.com) + + "I've looked into solving this another way, but I cannot see how + to get this driver-private mapping of logical drive number-> HCTL + without putting code something like this into the driver." + + "...and by providing a mapping a function to userspace, the driver + is free to change its mapping algorithm in the future if necessary .." + - Matt Domsch (Matt_Domsch@dell.com) + +Release Date : Thu Dec 9 19:02:14 EST 2004 - Sreenivas Bagalkote <sreenib@lsil.com> + +Current Version : 2.20.4.1 (scsi module), 2.20.2.3 (cmm module) +Older Version : 2.20.4.1 (scsi module), 2.20.2.2 (cmm module) + +i. Fix a bug in kioc's dma buffer deallocation + +Release Date : Thu Nov 4 18:24:56 EST 2004 - Sreenivas Bagalkote <sreenib@lsil.com> + +Current Version : 2.20.4.1 (scsi module), 2.20.2.2 (cmm module) +Older Version : 2.20.4.0 (scsi module), 2.20.2.1 (cmm module) + +i. Handle IOCTL cmd timeouts more properly. + +ii. pci_dma_sync_{sg,single}_for_cpu was introduced into megaraid_mbox + incorrectly (instead of _for_device). Changed to appropriate + pci_dma_sync_{sg,single}_for_device. + +Release Date : Wed Oct 06 11:15:29 EDT 2004 - Sreenivas Bagalkote <sreenib@lsil.com> +Current Version : 2.20.4.0 (scsi module), 2.20.2.1 (cmm module) +Older Version : 2.20.4.0 (scsi module), 2.20.2.0 (cmm module) + +i. Remove CONFIG_COMPAT around register_ioctl32_conversion + +Release Date : Mon Sep 27 22:15:07 EDT 2004 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.4.0 (scsi module), 2.20.2.0 (cmm module) +Older Version : 2.20.3.1 (scsi module), 2.20.2.0 (cmm module) + +i. Fix data corruption. Because of a typo in the driver, the IO packets + were wrongly shared by the ioctl path. This causes a whole IO command + to be replaced by an incoming ioctl command. + +Release Date : Tue Aug 24 09:43:35 EDT 2004 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.3.1 (scsi module), 2.20.2.0 (cmm module) +Older Version : 2.20.3.0 (scsi module), 2.20.2.0 (cmm module) + +i. Function reordering so that inline functions are defined before they + are actually used. It is now mandatory for GCC 3.4.1 (current stable) + + Declare some heavy-weight functions to be non-inlined, + megaraid_mbox_build_cmd, megaraid_mbox_runpendq, + megaraid_mbox_prepare_pthru, megaraid_mbox_prepare_epthru, + megaraid_busywait_mbox + + - Andrew Morton <akpm@osdl.org>, 08.19.2004 + linux-scsi mailing list + + "Something else to clean up after inclusion: every instance of an + inline function is actually rendered as a full function call, because + the function is always used before it is defined. Atul, please + re-arrange the code to eliminate the need for most (all) of the + function prototypes at the top of each file, and define (not just + declare with a prototype) each inline function before its first use" + + - Matt Domsch <Matt_Domsch@dell.com>, 07.27.2004 + linux-scsi mailing list + + +ii. Display elapsed time (countdown) while waiting for FW to boot. + +iii. Module compilation reorder in Makefile so that unresolved symbols do + not occur when driver is compiled non-modular. + + Patrick J. LoPresti <patl@users.sourceforge.net>, 8.22.2004 + linux-scsi mailing list + + +Release Date : Thu Aug 19 09:58:33 EDT 2004 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.3.0 (scsi module), 2.20.2.0 (cmm module) +Older Version : 2.20.2.0 (scsi module), 2.20.1.0 (cmm module) + +i. When copying the mailbox packets, copy only first 14 bytes (for 32-bit + mailboxes) and only first 22 bytes (for 64-bit mailboxes). This is to + avoid getting the stale values for busy bit. We want to set the busy + bit just before issuing command to the FW. + +ii. In the reset handling, if the reseted command is not owned by the + driver, do not (wrongly) print information for the "attached" driver + packet. + +iii. Have extended wait when issuing command in synchronous mode. This is + required for the cases where the option ROM is disabled and there is + no BIOS to start the controller. The FW starts to boot after receiving + the first command from the driver. The current driver has 1 second + timeout for the synchronous commands, which is far less than what is + actually required. We now wait up to MBOX_RESET_TIME (180 seconds) for + FW boot process. + +iv. In megaraid_mbox_product_info, clear the mailbox contents completely + before preparing the command for inquiry3. This is to ensure that the + FW does not get junk values in the command. + +v. Do away with the redundant LSI_CONFIG_COMPAT redefinition for + CONFIG_COMPAT. Replace <asm/ioctl32.h> with <linux/ioctl32.h> + + - James Bottomley <James.Bottomley@SteelEye.com>, 08.17.2004 + linux-scsi mailing list + +vi. Add support for 64-bit applications. Current drivers assume only + 32-bit applications, even on 64-bit platforms. Use the "data" and + "buffer" fields of the mimd_t structure, instead of embedded 32-bit + addresses in application mailbox and passthru structures. + +vii. Move the function declarations for the management module from + megaraid_mm.h to megaraid_mm.c + + - Andrew Morton <akpm@osdl.org>, 08.19.2004 + linux-scsi mailing list + +viii. Change default values for MEGARAID_NEWGEN, MEGARAID_MM, and + MEGARAID_MAILBOX to 'n' in Kconfig.megaraid + + - Andrew Morton <akpm@osdl.org>, 08.19.2004 + linux-scsi mailing list + +ix. replace udelay with msleep + +x. Typos corrected in comments and whitespace adjustments, explicit + grouping of expressions. + + +Release Date : Fri Jul 23 15:22:07 EDT 2004 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.2.0 (scsi module), 2.20.1.0 (cmm module) +Older Version : 2.20.1.0 (scsi module), 2.20.0.0 (cmm module) + +i. Add PCI ids for Acer ROMB 2E solution + +ii. Add PCI ids for I4 + +iii. Typo corrected for subsys id for megaraid sata 300-4x + +iv. Remove yield() while mailbox handshake in synchronous commands + + + "My other main gripe is things like this: + + + // wait for maximum 1 second for status to post + + for (i = 0; i < 40000; i++) { + + if (mbox->numstatus != 0xFF) break; + + udelay(25); yield(); + + } + + which litter the driver. Use of yield() in drivers is deprecated." + + - James Bottomley <James.Bottomley@SteelEye.com>, 07.14.2004 + linux-scsi mailing list + +v. Remove redundant __megaraid_busywait_mbox routine + +vi. Fix bug in the managment module, which causes a system lockup when the + IO module is loaded and then unloaded, followed by executing any + management utility. The current version of management module does not + handle the adapter unregister properly. + + Specifically, it still keeps a reference to the unregistered + controllers. To avoid this, the static array adapters has been + replaced by a dynamic list, which gets updated every time an adapter + is added or removed. + + Also, during unregistration of the IO module, the resources are + now released in the exact reverse order of the allocation time + sequence. + + +Release Date : Fri Jun 25 18:58:43 EDT 2004 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.1.0 +Older Version : megaraid 2.20.0.1 + +i. Stale list pointer in adapter causes kernel panic when module + megaraid_mbox is unloaded + + +Release Date : Thu Jun 24 20:37:11 EDT 2004 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.0.1 +Older Version : megaraid 2.20.0.00 + +i. Modules are not 'y' by default, but depend on current definition of + SCSI & PCI. + +ii. Redundant structure mraid_driver_t removed. + +iii. Miscellaneous indentation and goto/label fixes. + - Christoph Hellwig <hch@infradead.org>, 06.24.2004 linux-scsi + +iv. scsi_host_put(), do just before completing HBA shutdown. + + + +Release Date : Mon Jun 21 19:53:54 EDT 2004 - Atul Mukker <atulm@lsil.com> +Current Version : 2.20.0.0 +Older Version : megaraid 2.20.0.rc2 and 2.00.3 + +i. Independent module to interact with userland applications and + multiplex command to low level RAID module(s). + + "Shared code in a third module, a "library module", is an acceptable + solution. modprobe automatically loads dependent modules, so users + running "modprobe driver1" or "modprobe driver2" would automatically + load the shared library module." + + - Jeff Garzik <jgarzik@pobox.com> 02.25.2004 LKML + + "As Jeff hinted, if your userspace<->driver API is consistent between + your new MPT-based RAID controllers and your existing megaraid driver, + then perhaps you need a single small helper module (lsiioctl or some + better name), loaded by both mptraid and megaraid automatically, which + handles registering the /dev/megaraid node dynamically. In this case, + both mptraid and megaraid would register with lsiioctl for each + adapter discovered, and lsiioctl would essentially be a switch, + redirecting userspace tool ioctls to the appropriate driver." + + - Matt Domsch <Matt_Domsch@dell.com> 02.25.2004 LKML + +ii. Remove C99 initializations from pci_device id. + + "pci_id_table_g would be much more readable when not using C99 + initializers. + PCI table doesn't change, there's lots of users that prefer the more + readable variant. And it's really far less and much easier to grok + lines without C99 initializers." + + - Christoph Hellwig <hch@infradead.org>, 05.28.2004 linux-scsi + +iii. Many fixes as suggested by Christoph Hellwig <hch@infradead.org> on + linux-scsi, 05.28.2004 + +iv. We now support up to 32 parallel ioctl commands instead of current 1. + There is a conscious effort to let memory allocation not fail for ioctl + commands. + +v. Do away with internal memory management. Use pci_pool_(create|alloc) + instead. + +vi. Kill tasklet when unloading the driver. + +vii. Do not use "host_lock', driver has fine-grain locks now to protect all + data structures. + +viii. Optimize the build scatter-gather list routine. The callers already + know the data transfer address and length. + +ix. Better implementation of error handling and recovery. Driver now + performs extended errors recovery for instances like scsi cable pull. + +x. Disassociate the management commands with an overlaid scsi command. + Driver now treats the management packets as special packets and has a + dedicated callback routine. diff --git a/Documentation/scsi/ChangeLog.ncr53c8xx b/Documentation/scsi/ChangeLog.ncr53c8xx new file mode 100644 index 000000000000..7d03e9d5b5f7 --- /dev/null +++ b/Documentation/scsi/ChangeLog.ncr53c8xx @@ -0,0 +1,495 @@ +Sat May 12 12:00 2001 Gerard Roudier (groudier@club-internet.fr) + * version ncr53c8xx-3.4.3b + - Ensure LEDC bit in GPCNTL is cleared when reading the NVRAM. + Fix sent by Stig Telfer <stig@api-networks.com>. + - Define scsi_set_pci_device() as nil for kernel < 2.4.4. + +Mon Feb 12 22:30 2001 Gerard Roudier (groudier@club-internet.fr) + * version ncr53c8xx-3.4.3 + - Call pci_enable_device() as AC wants this to be done. + - Get both the BAR cookies actual and PCI BAR values. + (see Changelog.sym53c8xx rev. 1.7.3 for details) + - Merge changes for linux-2.4 that declare the host template + in the driver object also when the driver is statically + linked with the kernel. + +Sun Sep 24 21:30 2000 Gerard Roudier (groudier@club-internet.fr) + * version ncr53c8xx-3.4.2 + - See Changelog.sym53c8xx, driver version 1.7.2. + +Wed Jul 26 23:30 2000 Gerard Roudier (groudier@club-internet.fr) + * version ncr53c8xx-3.4.1 + - Provide OpenFirmare path through the proc FS on PPC. + - Remove trailing argument #2 from a couple of #undefs. + +Sun Jul 09 16:30 2000 Gerard Roudier (groudier@club-internet.fr) + * version ncr53c8xx-3.4.0 + - Remove the PROFILE C and SCRIPTS code. + This facility was not this useful and thus was not longer + desirable given the increasing complexity of the driver code. + - Merges from FreeBSD sym-1.6.2 driver: + * Clarify memory barriers needed by the driver for architectures + that implement a weak memory ordering. + - General cleanup: + Move definitions for barriers and IO/MMIO operations to the + sym53c8xx_defs.h header files. They are now shared by the + both drivers. + Use SCSI_NCR_IOMAPPED instead of NCR_IOMAPPED. + +Thu May 11 12:30 2000 Pam Delaney (pam.delaney@lsil.com) + * revision 3.3b + +Mon Apr 24 12:00 2000 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2i + - Return value 1 (instead of 0) from the driver setup routine. + - Let the driver also attach controllers that have been set to + OFF in the NVRAM as it did prior to revision 3.2g. + +Sat Apr 1 12:00 2000 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2h + - Fix a compilation problem on Alpha introduced in version 3.2g. + (`port' changed to `base_io'). + - Move from `sym' to this driver a tiny change for __sparc__ that + applies to cache line size (? Probably from David S Miller). + - Make sure no data transfer will happen for Scsi_Cmnd requests + that supply SCSI_DATA_NONE direction (this avoids some BUG() + statement in the PCI code when a data buffer is also supplied). + +Thu Mar 16 9:30 2000 Pam Delaney (pam.delaney@lsil.com) + * revision 3.3b-3 + - Added exclusion for the 53C1010 and 53C1010_66 chips + to the driver (change to sym53c8xx_comm.h). + +Mon March 6 23:15 2000 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2g + - Add the file sym53c8xx_comm.h that collects code that should + be shared by sym53c8xx and ncr53c8xx drivers. For now, it is + a header file that is only included by the ncr53c8xx driver, + but things will be cleaned up later. This code addresses + notably: + * Chip detection and PCI related initialisations + * NVRAM detection and reading + * DMA mapping + * Boot setup command + * And some other ... + - Add support for the new dynamic dma mapping kernel interface. + Requires Linux-2.3.47 (tested with pre-2.3.47-6). + - Get data transfer direction from the scsi command structure + (Scsi_Cmnd) when this information is available. + +Mon March 6 23:15 2000 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2g + - Add the file sym53c8xx_comm.h that collects code that should + be shared by sym53c8xx and ncr53c8xx drivers. For now, it is + a header file that is only included by the ncr53c8xx driver, + but things will be cleaned up later. This code addresses + notably: + * Chip detection and PCI related initialisations + * NVRAM detection and reading + * DMA mapping + * Boot setup command + * And some other ... + - Add support for the new dynamic dma mapping kernel interface. + Requires Linux-2.3.47 (tested with pre-2.3.47-6). + - Get data transfer direction from the scsi command structure + (Scsi_Cmnd) when this information is available. + +Fri Jan 14 14:00 2000 Pam Delaney (pam.delaney@lsil.com) + * revision pre-3.3b-1 + - Merge parallel driver series 3.31 and 3.2e + +Tue Jan 11 14:00 2000 Pam Delaney (pam.delaney@lsil.com) + * revision 3.31 + - Added support for mounting disks on wide-narrow-wide + scsi configurations. + - Built off of version 3.30 + +Mon Jan 10 13:30 2000 Pam Delaney (pam.delaney@lsil.com) + * revision 3.30 + - Added capability to use the integrity checking code + in the kernel (optional). + - Disabled support for the 53C1010. + - Built off of version 3.2c + +Sat Jan 8 22:00 2000 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2e + - Add year 2000 copyright. + - Display correctly bus signals when bus is detected wrong. + - Remove the dead code that broke driver 3.2d. + +Mon Dec 6 22:00 1999 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2d + - Change messages written by the driver at initialisation and + through the /proc FS (rather cosmetic changes that consist in + printing out the PCI bus number and device/function). + - Get rid of the old PCI bios interface, but preserve kernel 2.0 + compatibility from a simple wrapper. + - Remove the compilation condition about having to acquire the + io_request_lock since it seems to be a definite feature now.:) + - proc_dir structure no longer needed for kernel >= 2.3.27. + - Change the driver detection code by the sym53c8xx one, modulo + some minor changes. The driver can now attach any number of + controllers (>40) and does no longer hoger stack space at + initialisation. + - Definitely disable overlapped PCI arbitration for all dual + function chips, since I cannot make sure for what chip revisions + it is actually safe. + - Add support for the SYM53C1510D. + - Update the poor Tekram sync factor table. + - Remove the compilation condition about having to acquire the + io_request_lock since it seems to be a definite feature now.:) + - proc_dir structure no longer needed for kernel >= 2.3.27. + +Sat Sep 11 18:00 1999 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2c + - Handle correctly (hopefully) jiffies wrap-around. + - Restore the entry used to detect 875 until revision 0xff. + (I removed it inadvertently, it seems :) ) + - Replace __initfunc() which is deprecated stuff by __init which + is not yet so. ;-) + - Add support of some 'resource handling' for linux-2.3.13. + Basically the BARs have been changed to something more complex + in the pci_dev structure. + - Remove some deprecated code. + +Sat May 10 11:00 1999 Gerard Roudier (groudier@club-internet.fr) + * revision pre-3.2b-1 + - Support for the 53C895A by Pamela Delaney <pam.delaney@lsil.com> + The 53C895A contains all of the features of the 896 but has only + one channel and has a 32 bit PCI bus. It does 64 bit PCI addressing + using dual cycle PCI data transfers. + - Miscellaneous minor fixes. + - Some additions to the README.ncr53c8xx file. + +Sun Apr 11 10:00 1999 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2a + - Add 'hostid:#id' boot option. This option allows to change the + default SCSI id the driver uses for controllers. + - Remove nvram layouts and driver set-up structures from the C source, + and use the one defined in sym53c8xx_defs.h file. + (shared by both drivers). + - Set for now MAX LUNS to 16 (instead of 8). + +Thu Mar 11 23:00 1999 Gerard Roudier (groudier@club-internet.fr) + * revision 3.2 (8xx-896 driver bundle) + - Only define the host template in ncr53c8xx.h and include the + sym53c8xx_defs.h file. + - Declare static all symbols that do not need to be visible from + outside the driver code. + - Add 'excl' boot command option that allows to pass to the driver + io address of devices not to attach. + - Add info() function called from the host template to print + driver/host information. + - Minor documentation additions. + +Sat Mar 6 11:00 1999 Gerard Roudier (groudier@club-internet.fr) + * revision 3.1h + - Fix some oooold bug that hangs the bus if a device rejects a + negotiation. Btw, the corresponding stuff also needed some cleanup + and thus the change is a bit larger than it could have been. + - Still some typo that made compilation fail for 64 bit (trivial fix). + +Sun Feb 14:00 1999 Gerard Roudier (groudier@club-internet.fr) + * revision 3.1g + - Deal correctly with 64 bit PCI address registers on Linux 2.2. + Pointed out by Leonard Zubkoff. + - Allow to tune request_irq() flags from the boot command line using + ncr53c8xx=irqm:??, as follows: + a) If bit 0x10 is set in irqm, SA_SHIRQ flag is not used. + b) If bit 0x20 is set in irqm, SA_INTERRUPT flag is not used. + By default the driver uses both SA_SHIRQ and SA_INTERRUPT. + Option 'ncr53c8xx=irqm:0x20' may be used when an IRQ is shared by + a 53C8XX adapter and a network board. + - Tiny mispelling fixed (ABORT instead of ABRT). Was fortunately + harmless. + - Negotiate SYNC data transfers with CCS devices. + +Sat Jan 16 17:30 1999 Gerard Roudier (groudier@club-internet.fr) + * revision 3.1f + - Some PCI fix-ups not needed any more for PPC (from Cort). + - Cache line size set to 16 DWORDS for Sparc (from DSM). + - Waiting list look-up didn't work for the first command of the list. + - Remove 2 useless lines of code. + +Sun Dec 13 18:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.1e + - Same work-around as for the 53c876 rev <= 0x15 for 53c896 rev 1: + Disable overlapped arbitration. This will not make difference + since the chip has on-chip RAM. + +Thu Nov 26 22:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.1d + - The SISL RAID change requires now remap_pci_mem() stuff to be + compiled for __i386__ when normal IOs are used. + - Minor spelling fixes in doc files. + +Sat Nov 21 18:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.1c + - Ignore chips that are driven by SISL RAID (DAC 960). + Change sent by Leonard Zubkoff and slightly reworked. + - Still a buglet in the tags initial settings that needed to be fixed. + It was not possible to disable TGQ at system startup for devices + that claim TGQ support. The driver used at least 2 for the queue + depth but did'nt keep track of user settings for tags depth lower + than 2. + +Wed Nov 11 10:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.1b + - The driver was unhappy when configured with default_tags > MAX_TAGS + Hopefully doubly-fixed. + - Update the Configure.help driver section that speaks of TAGS. + +Wed Oct 21 21:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.1a + - Changes from Eddie Dost for Sparc and Alpha: + ioremap/iounmap support for Sparc. + pcivtophys changed to bus_dvma_to_phys. + - Add the 53c876 description to the chip table. This is only useful + for printing the right name of the controller. + - DEL-441 Item 2 work-around for the 53c876 rev <= 5 (0x15). + - Add additional checking of INQUIRY data: + Check INQUIRY data received length is at least 7. Byte 7 of + inquiry data contains device features bits and the driver might + be confused by garbage. Also check peripheral qualifier. + - Cleanup of the SCSI tasks management: + Remove the special case for 32 tags. Now the driver only uses the + scheme that allows up to 64 tags per LUN. + Merge some code from the 896 driver. + Use a 1,3,5,...MAXTAGS*2+1 tag numbering. Previous driver could + use any tag number from 1 to 253 and some non conformant devices + might have problems with large tag numbers. + - 'no_sync' changed to 'no_disc' in the README file. This is an old + and trivial mistake that seems to demonstrate the README file is + not often read. :) + +Sun Oct 4 14:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.0i + - Cosmetic changes for sparc (but not for the driver) that needs + __irq_itoa() to be used for printed IRQ value to be understandable. + - Some problems with the driver that didn't occur using driver 2.5f + were due to a SCSI selection problem triggered by a clearly + documented feature that in fact seems not to work: (53C8XX chips + are claimed by the manuals to be able to execute SCSI scripts just + after abitration while the SCSI core is performing SCSI selection). + This optimization is broken and has been removed. + - Some broken scsi devices are confused when a negotiation is started + on a LUN that does not correspond to a real device. According to + SCSI specs, this is a device firmware bug. This has been worked + around by only starting negotiation if the LUN has previously be + used for at least 1 successful SCSI command. + - The 'last message sent' printed out on M_REJECT message reception + was read from the SFBR i/o register after the previous message had + been sent. + This was not correct and affects all previous driver versions and + the original FreeBSD one as well. The SCSI scripts has been fixed + so that it now provides the right information to the C code. + +Sat Jul 18 13:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.0g + - Preliminary fixes for Big Endian (sent by Eddie C. Dost). + Big Endian architectures should work again with the driver. + Eddie's patch has been partially applied since current 2.1.109 + does not have all the Sparc changes of the vger tree. + - Use of BITS_PER_LONG instead of (~0UL == 0xffffffffUL) has fixed + the problem observed when the driver was compiled using EGCS or + PGCC. + +Mon Jul 13 20:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.0f + - Some spelling fixes. + - linux/config.h misplaced in ncr53c8xx.h + - MODULE_PARM stuff added for linux 2.1. + - check INQUIRY response data format is exactly 2. + - use BITS_PER_LONG if defined. + +Sun Jun 28 12:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.0e + - Some cleanup, spelling fixes, version checks, documentations + changes, etc ... + +Sat Jun 20 20:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.0c + - Add a boot setup option that allows to set up device queue depths + at boot-up. This option is very useful since Linux does not + allow to change scsi device queue depth once the system has been + booted up. + +Sun Jun 15 23:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.0a + - Support for up to 64 TAGS per LUN. + - Rewrite the TARGET vs LUN capabilities management. + CmdQueue is now handled as a LUN capability as it shall be. + This also fixes a bug triggered when disabling tagged command + queuing for a device that had this feature enabled. + - Remove the ncr_opennings() stuff that was useless under Linux + and hard to understand to me. + - Add "setverbose" procfs driver command. It allows to tune + verbose level after boot-up. Setting this level to zero, for + example avoid flooding the syslog file. + - Add KERN_XXX to some printk's. + +Tue Jun 10 23:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 3.0 + - Linux config changes for 2.0.34: + Remove NVRAM detection config option. This option is now enabled + by default but can be disabled by editing the driver header file. + Add a PROFILE config option. + - Update Configure.help + - Add calls to new function mdelay() for milli-seconds delay if + kernel version >= 2.1.105. + - Replace all printf(s) by printk(s). After all, the ncr53c8xx is + a driver for Linux. + - Perform auto-sense on COMMAND TERMINATED. Not sure it is useful. + - Some other minor changes. + +Tue Jun 4 23:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6n + - Code cleanup and simplification: + Remove kernel 1.2.X and 1.3.X support. + Remove the _old_ target capabilities table. + Remove the error recovery code that have'nt been really useful. + Use a single alignment boundary (CACHE_LINE_SIZE) for data + structures. + - Several aggressive SCRIPTS optimizations and changes: + Reselect SCRIPTS code rewritten. + Support for selection/reselection without ATN. + And some others. + - Miscallaneous changes in the C code: + Count actual number of CCB queued to the controller (future use). + Lots of other minor changes. + +Wed May 13 20:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6m + - Problem of missed SCSI bus reset with the 53C895 fixed by + Richard Waltham. The 53C895 needs about 650 us for the bus + mode to settle. Delays used while resetting the controller + and the bus have been adjusted. Thanks Richard! + - Some simplification for 64 bit arch done ccb address testing. + - Add a check of the MSG_OUT phase after Selection with ATN. + - The new tagged queue stuff seems ok, so some informationnal + message have been conditionned by verbose >= 3. + - Donnot reset if a SBMC interrupt reports the same bus mode. + - Print out the whole driver set-up. Some options were missing and + the print statement was misplaced for modules. + - Ignore a SCSI parity interrupt if the chip is not connected to + the SCSI bus. + +Sat May 1 16:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6l + - Add CCB done queue support for Alpha and perhaps some other + architectures. + - Add some barriers to enforce memory ordering for x86 and + Alpha architectures. + - Fix something that looks like an old bug in the nego SIR + interrupt code in case of negotiation failure. + +Sat Apr 25 21:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6k + - Remove all accesses to the on-chip RAM from the C code: + Use SCRIPTS to load the on-chip RAM. + Use SCRIPTS to repair the start queue on selection timeout. + Use the copy of script in main memory to calculate the chip + context on phase mismatch. + - The above allows now to use the on-chip RAM without requiring + to get access to the on-chip RAM from the C code. This makes + on-chip RAM useable for linux-1.2.13 and for Linux-Alpha for + instance. + - Some simplifications and cleanups in the SCRIPTS and C code. + - Buglet fixed in parity error recovery SCRIPTS (never tested). + - Minor updates in README.ncr53c8xx. + +Wed Apr 15 21:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6j + - Incorporate changes from linux-2.1.95 ncr53c8xx driver version. + - Add SMP support for linux-2.1.95 and above. + - Fix a bug when QUEUE FULL is returned and no commands are + disconnected. This happens with Atlas I / L912 and may happen + with Atlas II / LXY4. + - Nail another one on CHECK condition when requeuing the command + for auto-sense. + - Call scsi_done() for all completed commands after interrupt + handling. + - Increase the done queue to 24 entries. + +Sat Apr 4 20:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6i + - CTEST0 is used by the 53C885 for Power Management and + priority setting between the 2 functions. + Use SDID instead as actual target number. Just have had to + overwrite it with SSID on reselection. + - Split DATA_IN and DATA_OUT scripts into 2 sub-scripts. + 64 segments are moved from on-chip RAM scripts. + If more segments, a script in main memory is used for the + additional segments. + - Since the SCRIPTS processor continues SCRIPTS execution after + having won arbitration, do some stuff prior to testing any SCSI + phase on reselection. This should have the vertue to process + scripts in parallel with the SCSI core performing selection. + - Increase the done queue to 12 entries. + +Sun Mar 29 12:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6h + - Some fixes. + +Tue Mar 26 23:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6g + - New done queue. 8 entries by default (6 always useable). + Can be increased if needed. + - Resources management using doubly linked queues. + - New auto-sense and QUEUE FULL handling that does not need to + stall the NCR queue any more. + - New CCB starvation avoiding algorithm. + - Prepare CCBs for SCSI commands that cannot be queued, instead of + inserting these commands into the waiting list. The waiting list + is now only used while resetting and when memory for CCBs is not + yet available? + +Sun Feb 8 22:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6f + - Some fixes in order to really support the 53C895, at least with + FAST-20 devices. + - Heavy changes in the target/lun resources management to allow + the scripts to jump directly to the CCB on reselection instead + of walking on the lun CCBs list. Up to 32 tags per lun are now + supported without script processor and PCI traffic overhead. + +Sun Jan 11 22:00 1998 Gerard Roudier (groudier@club-internet.fr) + * revision 2.6d + - new (different ?) implementation of the start queue: + Use a simple CALL to a launch script in the CCB. + - implement a minimal done queue (1 entry :-) ). + this avoid scanning all CCBs on INT FLY (Only scan all CCBs, on + overflow). Hit ratio is better than 99.9 % on my system, so no + need to have a larger done queue. + - generalization of the restart of CCB on special condition as + Abort, QUEUE FULL, CHECK CONDITION. + This has been called 'silly scheduler'. + - make all the profiling code conditionned by a config option. + This spare some PCI traffic and C code when this feature is not + needed. + - handle more cleanly the situation where direction is unknown. + The pointers patching is now performed by the SCRIPTS processor. + - remove some useless scripts instructions. + + Ported from driver 2.5 series: + ------------------------------ + - Use FAST-5 instead of SLOW for slow scsi devices according to + new SPI-2 draft. + - Make some changes in order to accommodate with 875 rev <= 3 + device errata listing 397. Minor consequences are: + . Leave use of PCI Write and Invalidate under user control. + Now, by default the driver does not enable PCI MWI and option + 'specf:y' is required in order to enable this feature. + . Memory Read Line is not enabled for 875 and 875-like chips. + . Programmed burst length set to 64 DWORDS (instead of 128). + (Note: SYMBIOS uses 32 DWORDS for the SDMS BIOS) + - Add 'buschk' boot option. + This option enables checking of SCSI BUS data lines after SCSI + RESET (set by default). (Submitted by Richard Waltham). + - Update the README file. + - Dispatch CONDITION MET and RESERVATION CONFLICT scsi status + as OK driver status. + - Update the README file and the Symbios NVRAM format definition + with removable media flags values (available with SDMS 4.09). + - Several PCI configuration registers fix-ups for powerpc. + (Patch sent by Cort). diff --git a/Documentation/scsi/ChangeLog.sym53c8xx b/Documentation/scsi/ChangeLog.sym53c8xx new file mode 100644 index 000000000000..ef985ec348e6 --- /dev/null +++ b/Documentation/scsi/ChangeLog.sym53c8xx @@ -0,0 +1,593 @@ +Sat May 12 12:00 2001 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.7.3c + - Ensure LEDC bit in GPCNTL is cleared when reading the NVRAM. + Fix sent by Stig Telfer <stig@api-networks.com>. + - Backport from SYM-2 the work-around that allows to support + hardwares that fail PCI parity checking. + - Check that we received at least 8 bytes of INQUIRY response + for byte 7, that contains device capabilities, to be valid. + - Define scsi_set_pci_device() as nil for kernel < 2.4.4. + - + A couple of minor changes. + +Sat Apr 7 19:30 2001 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.7.3b + - Fix an unaligned LOAD from scripts (was used as dummy read). + - In ncr_soft_reset(), only try to ABORT the current operation + for chips that support SRUN bit in ISTAT1 and if SCRIPTS are + currently running, as 896 and 1010 manuals suggest. + - In the CCB abort path, do not assume that the CCB is currently + queued to SCRIPTS. This is not always true, notably after a + QUEUE FULL status or when using untagged commands. + +Sun Mar 4 18:30 2001 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.7.3a + - Fix an issue in the ncr_int_udc() (unexpected disconnect) + handling. If the DSA didn't match a CCB, a bad write to + memory could happen. + +Mon Feb 12 22:30 2001 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.7.3 + - Support for hppa. + Tiny patch sent to me by Robert Hirst. + - Tiny patch for ia64 sent to me by Pamela Delaney. + +Tue Feb 6 13:30 2001 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.7.3-pre1 + - Call pci_enable_device() as AC wants this to be done. + - Get both the BAR cookies used by CPU and actual PCI BAR + values used from SCRIPTS. Recent PCI chips are able to + access themselves using internal cycles, but they compare + BAR values to destination address to make decision. + Earlier chips simply use PCI transactions to access IO + registers from SCRIPTS. + The bus_dvma_to_mem() interface that reverses the actual + PCI BAR value from the BAR cookie is now useless. + This point had been discussed at the list and the solution + got approved by PCI code maintainer (Martin Mares). + - Merge changes for linux-2.4 that declare the host template + in the driver object also when the driver is statically + linked with the kernel. + - Increase SCSI message size up to 12 bytes, given that 8 + bytes was not enough for the PPR message (fix). + - Add field 'maxoffs_st' (max offset for ST data transfers). + The C1010 supports offset 62 in DT mode but only 31 in + ST mode, to 2 different values for the max SCSI offset + are needed. Replace the obviously wrong masking of the + offset against 0x1f for ST mode by a lowering to + maxoffs_st of the SCSI offset in ST mode. + - Refine a work-around for the C1010-66. Revision 1 does + not requires extra cycles in DT DATA OUT phase. + - Add a missing endian-ization (abrt_tbl.addr). + - Minor clean-up in the np structure for fields accessed + from SCRIPTS that requires special alignments. + +Sun Sep 24 21:30 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.7.2 + - Remove the hack for PPC added in previous driver version. + - Add FE_DAC feature bit to distinguish between 64 bit PCI + addressing (FE_DAC) and 64 bit PCI interface (FE_64BIT). + - Get rid of the boot command line "ultra:" argument. + This parameter wasn't that clever since we can use "sync:" + for Ultra/Ultra2 settings, and for Ultra3 we may want to + pass PPR options (for now only DT clocking). + - Add FE_VARCLK feature bit that indicates that SCSI clock + frequency may vary depending on board design and thus, + the driver should try to evaluate the SCSI clock. + - Simplify the way the driver determine the SCSI clock: + ULTRA3 -> 160 MHz, ULTRA2 -> 80 MHz otherwise 40 MHz. + Measure the SCSI clock frequency if FE_VARCLK is set. + - Remove FE_CLK80 feature bit that got useless. + - Add support for the SYM53C875A (Pamela Delaney). + +Wed Jul 26 23:30 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.7.1 + - Provide OpenFirmare path through the proc FS on PPC. + - Download of on-chip SRAM using memcpy_toio() doesn't work + on PPC. Restore previous method (MEMORY MOVE from SCRIPTS). + - Remove trailing argument #2 from a couple of #undefs. + +Sun Jul 09 16:30 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.7.0 + - Remove the PROFILE C and SCRIPTS code. + This facility was not this useful and thus was not longer + desirable given the increasing complexity of the driver code. + - Merges from FreeBSD sym-1.6.2 driver: + * Clarify memory barriers needed by the driver for architectures + that implement a weak memory ordering. + * Simpler handling of illegal phases and data overrun from + SCRIPTS. These errors are now immediately reported to + the C code by an interrupt. + * Sync the residual handling code with sym-1.6.2 and now + report `resid' to user for linux version >= 2.3.99 + - General cleanup: + Move definitions for barriers and IO/MMIO operations to the + sym53c8xx_defs.h header files. They are now shared by the + both drivers. + Remove unused options that claimed to optimize for the 896. + If fact, they were not this clever. :) + Use SCSI_NCR_IOMAPPED instead of NCR_IOMAPPED. + Remove a couple of unused fields from data structures. + +Thu May 11 12:40 2000 Pam Delaney (pam.delaney@lsil.com) + * version sym53c8xx-1.6b + - Merged version. + +Mon Apr 24 12:00 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5m + - Return value 1 (instead of 0) from the driver setup routine. + - Do not enable PCI DAC cycles. This just broke support for + SYM534C896 on sparc64. Problem fixed by David S. Miller. + +Fri Apr 14 9:00 2000 Pam Delaney (pam.delaney@lsil.com) + * version sym53c8xx-1.6b-9 + - Added 53C1010_66 support. + - Small fix to integrity checking code. + - Removed requirement for integrity checking if want to run + at ultra 3. + +Sat Apr 1 12:00 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5l + - Tiny change for __sparc__ appeared in 2.3.99-pre4.1 that + applies to cache line size (? Probably from David S Miller). + - Make sure no data transfer will happen for Scsi_Cmnd requests + that supply SCSI_DATA_NONE direction (this avoids some BUG() + statement in the PCI code when a data buffer is also supplied). + +Sat Mar 11 12:00 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.6b-5 + - Test against expected data transfer direction from SCRIPTS. + - Add support for the new dynamic dma mapping kernel interface. + Requires Linux-2.3.47 (tested with pre-2.3.47-6). + Many thanks to David S. Miller for his preliminary changes + that have been useful guidelines. + - Get data transfer direction from the scsi command structure + (Scsi_Cmnd) with kernels that provide this information. + +Mon Mar 6 23:30 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5k + - Test against expected data transfer direction from SCRIPTS. + - Revert the change in 'ncr_flush_done_cmds()' but unmap the + scsi dma buffer prior to queueing the command to our done + list. + - Miscellaneous (minor) fixes in the code added in driver + version 1.5j. + +Mon Feb 14 4:00 2000 Pam Delaney (pam.delaney@lsil.com) + * version sym53c8xx-pre-1.6b-2. + - Updated the SCRIPTS error handling of the SWIDE + condition - to remove any reads of the sbdl + register. Changes needed because the 896 and 1010 + chips will check parity in some special circumstances. + This will cause a parity error interrupt if not in + data phase. Changes based on those made in the + FreeBSD driver version 1.3.2. + +Sun Feb 20 11:00 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5j + - Add support for the new dynamic dma mapping kernel interface. + Requires Linux-2.3.47 (tested with pre-2.3.47-6). + Many thanks to David S. Miller for his preliminary changes + that have been useful guidelines, for having reviewed the + code and having tested this driver version on Ultra-Sparc. + - 2 tiny bugs fixed in the PCI wrapper that provides support + for early kernels without pci device structure. + - Get data transfer direction from the scsi command structure + (Scsi_Cmnd) with kernels that provide this information. + - Fix an old bug that only affected 896 rev. 1 when driver + profile support option was set in kernel configuration. + +Fri Jan 14 14:00 2000 Pam Delaney (pam.delaney@lsil.com) + * version sym53c8xx-pre-1.6b-1. + - Merge parallel driver series 1.61 and 1.5e + +Tue Jan 11 14:00 2000 Pam Delaney (pam.delaney@lsil.com) + * version sym53c8xx-1.61 + - Added support for mounting disks on wide-narrow-wide + scsi configurations. + - Modified offset to be a maximum of 31 in ST mode, + 62 in DT mode. + - Based off of 1.60 + +Mon Jan 10 10:00 2000 Pam Delaney (pam.delaney@lsil.com) + * version sym53c8xx-1.60 + - Added capability to use the integrity checking code + in the kernel (optional). + - Added PPR negotiation. + - Added support for 53C1010 Ultra 3 part. + - Based off of 1.5f + +Sat Jan 8 22:00 2000 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5h + - Add year 2000 copyright. + - Display correctly bus signals when bus is detected wrong. + - Some fix for Sparc from DSM that went directly to kernel tree. + +Mon Dec 6 22:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5g + - Change messages written by the driver at initialisation and + through the /proc FS (rather cosmetic changes that consist in + printing out the PCI bus number and PCI device/function). + - Ensure the SCRIPTS processor is stopped while calibrating the + SCSI clock (the initialisation code has been a bit reworked). + Change moved to the FreeBSD sym_hipd driver). + - Some fixes in the MODIFY_DP/IGN_RESIDUE code and residual + calculation (moved from FreeBSD sym_hipd driver). + - Add NVRAM support for Tekram boards that use 24C16 EEPROM. + Code moved from the FreeBSD sym_hipd driver, since it has + been that one that got this feature first. + - Definitely disable overlapped PCI arbitration for all dual + function chips, since I cannot make sure for what chip revisions + it is actually safe. + - Add support for the SYM53C1510D (also for ncr53c8xx). + - Fix up properly the PCI latency timer when needed or asked for. + - Get rid of the old PCI bios interface, but preserve kernel 2.0 + compatibility from a simple wrapper. + - Update the poor Tekram sync factor table. + - Fix in a tiny 'printk' bug that may oops in case of extended + errors (unrecovered parity error, data overrun, etc ...) + (Sent by Pamela Delaney from LSILOGIC) + - Remove the compilation condition about having to acquire the + io_request_lock since it seems to be a definite feature now.:) + - Change get_pages by GetPages since Linux >= 2.3.27 now wants + get_pages to ever be used as a kernel symbol (from 2.3.27). + - proc_dir structure no longer needed for kernel >= 2.3.27. + +Sun Oct 3 19:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5f + - Change the way the driver checks the PCI clock frequency, so + that overclocked PCI BUS up to 48 MHz will not be refused. + The more the BUS is overclocked, the less the driver will + guarantee that its measure of the SCSI clock is correct. + - Backport some minor improvements of SCRIPTS from the sym_hipd + driver. + - Backport the code rewrite of the START QUEUE dequeuing (on + bad scsi status received) from the sym_hipd driver. + +Sat Sep 11 11:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5e + - New linux-2.3.13 __setup scheme support added. + - Cleanup of the extended error status handling: + Use 1 bit per error type. + - Also save the extended error status prior to auto-sense. + - Add the FE_DIFF chip feature bit to indicate support of + diff probing from GPIO3 (825/825A/876/875). + - Remove the quirk handling that has been useless since day one. + - Work-around PCI chips being reported twice on some platforms. + - Add some redundant PCI reads in order to deal with common + bridge misbehaviour regarding posted write flushing. + - Add some other conditionnal code for people who have to deal + with really broken bridges (they will have to edit a source + file to try these options). + - Handle correctly (hopefully) jiffies wrap-around. + - Restore the entry used to detect 875 until revision 0xff. + (I removed it inadvertently, it seems :) ) + - Replace __initfunc() which is deprecated stuff by __init which + is not yet so. ;-) + - Rewrite the MESSAGE IN scripts more generic by using a MOVE + table indirect. Extended messages of any size are accepted now. + (Size is limited to 8 for now, but a constant is just to be + increased if necessary) + - Fix some bug in the fully untested MDP handling:) and share + some code between MDP handling and residual calculation. + - Calculate the data transfer residual as the 2's complement + integer (A positive value in returned on data overrun, and + a negative one on underrun). + - Add support of some 'resource handling' for linux-2.3.13. + Basically the BARs have been changed to something more complex + in the pci_dev structure. + - Remove some deprecated code. + +Sat Jun 5 11:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5c + - Do not negotiate on auto-sense if we are currently using 8 bit + async transfer for the target. + - Only check for SISL/RAID on i386 platforms. + (A problem has been reported on PPC with that code). + - On MSG REJECT for a negotiation, the driver attempted to restart + the SCRIPT processor when this one was already running. + +Sat May 29 12:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5b + - Force negotiation prior auto-sense. + This ensures that the driver will be able to grab the sense data + from a device that has received a BUS DEVICE RESET message from + another initiator. + - Complete all disconnected CCBs for a logical UNIT if we are told + about a UNIT ATTENTION for a RESET condition by this target. + - Add the control command 'cleardev' that allows to send a ABORT + message to a logical UNIT (for test purpose). + +Tue May 25 23:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5a + - Add support for task abort and bus device reset SCSI message + and implement proper synchonisation with SCRIPTS to handle + correctly task abortion without races. + - Send an ABORT message (if untagged) or ABORT TAG message (if tagged) + when the driver is told to abort a command that is disconnected and + complete the command with appropriate error. + If the aborted command is not yet started, remove it from the start + queue and complete it with error. + - Add the control command 'resetdev' that allows to send a BUS + DEVICE RESET message to a target (for test purpose). + - Clean-up some unused or useless code. + +Fri May 21 23:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.5 + - Add support for CHMOV with Wide controllers. + - Handling of the SWIDE (low byte residue at the end of a CHMOV + in DATA IN phase with WIDE transfer when the byte count gets odd). + - Handling of the IGNORE WIDE RESIDUE message. + Handled from SCRIPTS as possible with some optimizations when both + a wide device and the controller are odd at the same time (SWIDE + present and IGNORE WIDE RESIDUE message on the BUS at the same time). + - Check against data OVERRUN/UNDERRUN condition at the end of a data + transfer, whatever a SWIDE is present (OVERRUN in DATA IN phase) + or the SODL is full (UNDERRUN in DATA out phase). + - Handling of the MODIFY DATA POINTER message. + This one cannot be handled from SCRIPTS, but hopefully it will not + happen very often. :) + - Large rewrite of the SCSI MESSAGE handling. + +Sun May 9 11:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.4 + - Support for IMMEDIATE ARBITRATION. + See the README file for detailed information about this feature. + Requires both a compile option and a boot option. + - Minor SCRIPTS optimization in reselection pattern for LUN 0. + - Simpler algorithm to deal with SCSI command starvation. + Just use 2 tag counters in flip/flop and switch to the other + one every 3 seconds. + - Do some work in SCRIPTS after the SELECT instruction and prior + to testing for a PHASE. SYMBIOS say this feature is working fine. + (Btw, only problems with Toshiba 3401B had been reported). + - Measure the PCI clock speed and do not attach controllers if + result is greater than 37 MHz. Since the precision of the + algorithm (from Stefan Esser) is better than 2%, this should + be fine. + - Fix the misdetection of SYM53C875E (was detected as a 876). + - Fix the misdetection of SYM53C810 not A (was detected as a 810A). + - Support for up to 256 TAGS per LUN (CMD_PER_LUN). + Currently limited to 255 due to Linux limitation. :) + - Support for up to 508 active commands (CAN_QUEUE). + - Support for the 53C895A by Pamela Delaney <pam.delaney@lsil.com> + The 53C895A contains all of the features of the 896 but has only + one channel and has a 32 bit PCI bus. It does 64 bit PCI addressing + using dual cycle PCI data transfers. + - Miscellaneous minor fixes. + - Some additions to the README.ncr53c8xx file. + +Tue Apr 15 10:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.3e + - Support for any number of LUNs (64) (SPI2-compliant). + (Btw, this may only be ever useful under linux-2.2 ;-)) + +Sun Apr 11 10:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.3d + - Add 'hostid:#id' boot option. This option allows to change the + default SCSI id the driver uses for controllers. + - Make SCRIPTS not use self-mastering for PCI. + There were still 2 places the driver used this feature of the + 53C8XX family. + - Move some data structures (nvram layouts and driver set-up) to + the sym53c8xx_defs.h file. So, the both drivers will share them. + - Set MAX LUNS to 16 (instead of 8). + +Sat Mar 20 21:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.3b + - Add support for NCR PQS PDS. + James Bottomley <James.Bottomley@columbiasc.ncr.com> + - Allow value 0 for host ID. + - Support more than 8 controllers (> 40 in fact :-) ) + - Add 'excl=#ioaddr' boot option: exclude controller. + (Version 1.3a driver) + +Thu Mar 11 23:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.3 (8xx-896 driver bundle) + - Equivalent changes as ncr53c8xx-3.2 due to the driver bundle. + (See Changelog.ncr53c8xx) + - Do a normal soft reset as first chip reset, since aborting current + operation may raise an interrupt we are not able to handle since + the interrupt handler is not yet established. + +Sat Mar 6 11:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.2b + - Fix some oooold bug that hangs the bus if a device rejects a + negotiation. Btw, the corresponding stuff also needed some cleanup + and thus the change is a bit larger than it could have been. + - Still some typo that made compilation fail for 64 bit (trivial fix). + +Sun Feb 21 20:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.2a + - The rewrite of the interrupt handling broke the SBMC interrupt + handling due to a 1 bit mask tiny error. Hopefully fixed. + - If INQUIRY came from a scatter list, the driver looked into + the scatterlist instead of the data.:) Since this should never + happen, we just discard the data if use_sg is not zero. + +Fri Feb 12 23:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.2 + - Major rewrite of the interrupt handling and recovery stuff for + the support of non compliant SCSI removal, insertion and all + kinds of screw-up that may happen on the SCSI BUS. + Hopefully, the driver is now unbreakable or may-be, it is just + quite brocken. :-) + Many thanks to Johnson Russel (Symbios) for having responded to + my questions and for his interesting advices and comments about + support of SCSI hot-plug. + - Add 'recovery' option to driver set-up. + - Negotiate SYNC data transfers with CCS devices. + - Deal correctly with 64 bit PCI address registers on Linux 2.2. + Pointed out by Leonard Zubkoff. + +Sun Jan 31 18:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.1a + - Some 896 chip revisions (all for now :-)), may hang-up if the + soft reset bit is set at the wrong time while SCRIPTS are running. + We need to first abort the current SCRIPTS operation prior to + resetting the chip. This fix has been sent to me by SYMBIOS/LSI + and I just translated it into ncr53c8xx syntax. + Must be considered 100 % trustable, unless I did some mistake + when translating it. :-) + +Sun Jan 24 18:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.1 + - Major rewrite of the SCSI parity error handling. + The informations contained in the data manuals are incomplete about + this feature. + I asked SYMBIOS about and got in reply the explanations that are + _indeed_ missing in the data manuals. + - Allow to tune request_irq() flags from the boot command line using + ncr53c8xx=irqm:??, as follows: + a) If bit 0x10 is set in irqm, SA_SHIRQ flag is not used. + b) If bit 0x20 is set in irqm, SA_INTERRUPT flag is not used. + By default the driver uses both SA_SHIRQ and SA_INTERRUPT. + Option 'ncr53c8xx=irqm:0x20' may be used when an IRQ is shared by + a 53C8XX adapter and a network board. + - Fix for 64 bit PCI address register calculation. (Lance Robinson) + - Fix for big-endian in phase mismatch handling. (Michal Jaegermann) + +Fri Jan 1 20:00 1999 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.0a + - Waiting list look-up didn't work for the first command of the list. + Hopefully fixed, but tested on paper only. ;) + - Remove the most part of PPC specific code for Linux-2.2. + Thanks to Cort. + - Some other minors changes. + +Sat Dec 19 21:00 1998 Gerard Roudier (groudier@club-internet.fr) + * version sym53c8xx-1.0 + - Define some new IO registers for the 896 (istat1, mbox0, mbox1) + - Revamp slighly the Symbios NVRAM lay-out based on the excerpt of + the header file I received from Symbios. + - Check the PCI bus number for the boot order (Using a fast + PCI controller behing a PCI-PCI bridge seems sub-optimal). + - Disable overlapped PCI arbitration for the 896 revision 1. + - Reduce a bit the number of IO register reads for phase mismatch + by reading DWORDS at a time instead of BYTES. + +Thu Dec 3 24:00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.18 + - I received this afternoon a 896 from SYMBIOS and started testing + the driver with this beast. After having fixed 3 buglets, it worked + with all features enabled including the phase mismatch handling + from SCRIPTS. Since this feature is not yet tested enough, the + boot option 'ncr53c8xx=specf:1' is still required to enable the + driver to handle PM from SCRIPTS. + +Sun Nov 29 18:00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.17 + - The SISL RAID change requires now remap_pci_mem() stuff to be + compiled for __i386__ when normal IOs are used. + - The PCI memory read from SCRIPTS that should ensure ordering + was in fact misplaced. BTW, this may explain why broken PCI + device drivers regarding ordering are working so well. ;-) + - Rewrite ncr53c8xx_setup (boot command line options) since the + binary code was a bit too bloated in my opinion. + - Make the code simpler in the wakeup_done routine. + +Tue Nov 24 23:00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.16 + - Add SCSI_NCR_OPTIMIZE_896_1 compile option and 'optim' boot option. + When set, the driver unconditionnaly assumes that the interrupt + handler is called for command completion, then clears INTF, scans + the done queue and returns if some completed CCB is found. If no + completed CCB are found, interrupt handling will proceed normally. + With a 896 that handles MA from SCRIPTS, this can be a great win, + since the driver will never performs PCI read transactions, but + only PCI write transactions that may be posted. + If the driver haven't to also raise the SIGP this would be perfect. + Even with this penalty, I think that this will work great. + Obviously this optimization makes sense only if the IRQ is not + shared with another device. + - Still a buglet in the tags initial settings that needed to be fixed. + It was not possible to disable TGQ at system startup for devices + that claim TGQ support. The driver used at least 2 for the queue + depth but did'nt keep track of user settings for tags depth lower + than 2. + +Thu Nov 19 23:00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.15 + - Add support for hardware LED control of the 896. + - Ignore chips that are driven by SISL RAID (DAC 960). + Change sent by Leonard Zubkoff and slightly reworked. + - Prevent 810A rev 11 and 860 rev 1 from using cache line based + transactions since those early chip revisions may use such on + LOAD/STORE instructions (work-around). + - Remove some useless and bloat code from the pci init stuff. + - Do not use the readX()/writeX() kernel functions for __i386__, + since they perform useless masking operations in order to deal + with broken driver in 2.1.X kernel. + +Wed Nov 11 10:00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.14 + - The driver was unhappy when configured with default_tags > MAX_TAGS + Hopefully doubly-fixed. + - Set PCI_PARITY in PCI_COMMAND register in not set (PCI fix-up). + - Print out some message if phase mismatch is handled from SCRIPTS. + +Sun Nov 1 14H00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.13 + - Some rewrite of the device detection code. This code had been + patched too much and needed to be face-lifted a bit. + Remove all platform dependent fix-ups that was not needed or + conflicted with some other driver code as work-arounds. + Reread the NVRAM before the calling of ncr_attach(). This spares + stack space and so allows to handle more boards. + Handle 64 bit base addresses under linux-2.0.X. + Set MASTER bit in PCI COMMAND register if not set. + +Wed Oct 30 22H00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.12 + - Damned! I just broke the driver for Alpha by leaving a stale + instruction in the source code. Hopefully fixed. + - Do not set PFEN when it is useless. Doing so we are sure that BOF + will be active, since the manual appears to be very unclear on what + feature is actually used by the chip when both PFEN and BOF are + set. + +Sat Oct 24 16H00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.11 + - LOAD/STORE instructions were miscompiled for register offsets + beyond 0x7f. This broke accesses to 896' new registers. + - Disable by default Phase Mismatch handling from SCRIPTS, since + current 896 rev.1 seems not to operate safely with the driver + when this feature is enabled (and above LOAD/STORE fix applied). + I will change the default to 'enabled' when this problem will be + solved. + Using boot option 'ncr53c8xx=specf:1' enables this feature. + - Implement a work-around (DEL 472 - ITEM 5) that should allow the + driver to safely enable hardware phase mismatch with 896 rev. 1. + +Tue Oct 20 22H00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.10 + - Add the 53c876 description to the chip table. This is only useful + for printing the right name of the controller. + - Add additional checking of INQUIRY data: + Check INQUIRY data received length is at least 7. Byte 7 of + inquiry data contains device features bits and the driver might + be confused by garbage. Also check peripheral qualifier. + - Use a 1,3,5,...MAXTAGS*2+1 tag numbering. Previous driver could + use any tag number from 1 to 253 and some non conformant devices + might have problems with large tag numbers. + - Use NAME53C and NAME53C8XX for chip name prefix chip family name. + Just give a try using "sym53c" and "sym53c8xx" instead of "ncr53c" + and "ncr53c8xx". :-) + +Sun Oct 11 17H00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.9 + - DEL-441 Item 2 work-around for the 53c876 rev <= 5 (0x15). + - Break ncr_scatter() into 2 functions in order to guarantee best + possible code optimization for the case we get a scatter list. + - Add the code intended to support up to 1 tera-byte for 64 bit systems. + It is probably too early, but I wanted to complete the thing. + +Sat Oct 3 14H00 1998 Gerard Roudier (groudier@club-internet.fr) + * version pre-sym53c8xx-0.8 + - Do some testing with io_mapped and fix what needed to be so. + - Wait for SCSI selection to complete or time-out immediately after + the chip won arbitration, since executing SCRIPTS while the SCSI + core is performing SCSI selection breaks the selection procedure + at least for some chip revisions. + - Interrupt the SCRIPTS if a device does not go to MSG OUT phase after + having been selected with ATN. Such a situation is not recoverable, + better to fail when we are stuck. diff --git a/Documentation/scsi/ChangeLog.sym53c8xx_2 b/Documentation/scsi/ChangeLog.sym53c8xx_2 new file mode 100644 index 000000000000..18a5d712a56a --- /dev/null +++ b/Documentation/scsi/ChangeLog.sym53c8xx_2 @@ -0,0 +1,144 @@ +Sat Dec 30 21:30 2000 Gerard Roudier + * version sym-2.1.0-20001230 + - Initial release of SYM-2. + +Mon Jan 08 21:30 2001 Gerard Roudier + * version sym-2.1.1-20010108 + - Change a couple of defines containing ncr or NCR by their + equivalent containing sym or SYM instead. + +Sun Jan 14 22:30 2001 Gerard Roudier + * version sym-2.1.2-20010114 + - Fix a couple of printfs: + * Add the target number to the display of transfer parameters. + * Make the display of TCQ and queue depth clearer. + +Wed Jan 17 23:30 2001 Gerard Roudier + * version sym-2.1.3-20010117 + - Wrong residual values were returned in some situations. + This broke cdrecord with linux-2.4.0, for example. + +Sat Jan 20 18:00 2001 Gerard Roudier + * version sym-2.1.4-20010120 + - Add year 2001 to Copyright. + - A tiny bug in the dma memory freeing path has been fixed. + (Driver unload failed with a bad address reference). + +Wed Jan 24 21:00 2001 Gerard Roudier + * version sym-2.1.5-20010124 + - Make the driver work under Linux-2.4.x when statically linked + with the kernel. + - Check against memory allocation failure for SCRIPTZ and add the + missing free of this memory on instance detach. + - Check against GPIO3 pulled low for HVD controllers (driver did + just the opposite). + Misdetection of BUS mode was triggered on module reload only, + since BIOS settings were trusted instead on first load. + +Wed Feb 7 21:00 2001 Gerard Roudier + * version sym-2.1.6-20010207 + - Call pci_enable_device() as wished by kernel maintainers. + - Change the sym_queue_scsiio() interface. + This is intended to simplify portability. + - Move the code intended to deal with the dowloading of SCRIPTS + from SCRIPTS :) in the patch method (was wrongly placed in + the SCRIPTS setup method). + - Add a missing cpu_to_scr() (np->abort_tbl.addr) + - Remove a wrong cpu_to_scr() (np->targtbl_ba) + - Cleanup a bit the PPR failure recovery code. + +Sat Mar 3 21:00 2001 Gerard Roudier + - Add option SYM_OPT_ANNOUNCE_TRANSFER_RATE and move the + corresponding code to file sym_misc.c. + Also move the code that sniffes INQUIRY to sym_misc.c. + This allows to share the corresponding code with NetBSD + without polluating the core driver source (sym_hipd.c). + - Add optionnal code that handles IO timeouts from the driver. + (not used under Linux, but required for NetBSD) + - Donnot assume any longer that PAGE_SHIFT and PAGE_SIZE are + defined at compile time, as at least NetBSD uses variables + in memory for that. + - Refine a work-around for the C1010-33 that consists in + disabling internal LOAD/STORE. Was applied up to revision 1. + Is now only applied to revision 0. + - Some code reorganisations due to code moves between files. + +Tues Apr 10 21:00 2001 Gerard Roudier + * version sym-2.1.9-20010412 + - Reset 53C896 and 53C1010 chip according to the manual. + (i.e.: set the ABRT bit in ISTAT if SCRIPTS are running) + - Set #LUN in request sense only if scsi version <= 2 and + #LUN <= 7. + - Set busy_itl in LCB to 1 if the LCB is allocated and a + SCSI command is active. This is a simplification. + - In sym_hcb_free(), do not scan the free_ccbq if no CCBs + has been allocated. This fixes a panic if attach failed. + - Add DT/ST (double/simple transition) in the transfer + negotiation announce. + - Forces the max number of tasks per LUN to at least 64. + - Use pci_set_dma_mask() for linux-2.4.3 and above. + - A couple of comments fixes. + +Wed May 22:00 2001 Gerard Roudier + * version sym-2.1.10-20010509 + - Mask GPCNTL against 0x1c (was 0xfc) for the reading of the NVRAM. + This ensure LEDC bit will not be set on 896 and later chips. + Fix sent by Chip Salzenberg <chip@perlsupport.com>. + - Define the number of PQS BUSes supported. + Fix sent by Stig Telfer <stig@api-networks.com> + - Miscellaneous common code rearrangements due to NetBSD accel + ioctl support, without impact on Linux (hopefully). + +Mon July 2 12:00 2001 Gerard Roudier + * version sym-2.1.11-20010702 + - Add Tekram 390 U2B/U2W SCSI LED handling. + Submitted by Chip Salzenberg <chip@valinux.com> + - Add call to scsi_set_pci_device() for kernels >= 2.4.4. + - Check pci dma mapping failures and complete the IO with some + error when such mapping fails. + - Fill in instance->max_cmd_len for kernels > 2.4.0. + - A couple of tiny fixes ... + +Sun Sep 9 18:00 2001 Gerard Roudier + * version sym-2.1.12-20010909 + - Change my email address. + - Add infrastructure for the forthcoming 64 bit DMA addressing support. + (Based on PCI 64 bit patch from David S. Miller) + - Donnot use anymore vm_offset_t type. + +Sat Sep 15 20:00 2001 Gerard Roudier + * version sym-2.1.13-20010916 + - Add support for 64 bit DMA addressing using segment registers. + 16 registers for up to 4 GB x 16 -> 64 GB. + +Sat Sep 22 12:00 2001 Gerard Roudier + * version sym-2.1.14-20010922 + - Complete rewrite of the eh handling. The driver is now using a + semaphore in order to behave synchronously as required by the eh + threads. A timer is also used to prevent from waiting indefinitely. + +Sun Sep 30 17:00 2001 Gerard Roudier + * version sym-2.1.15-20010930 + - Include <linux/module.h> unconditionnaly as expected by latest + kernels. + - Use del_timer_sync() for recent kernels to kill the driver timer + on module release. + +Sun Oct 28 15:00 2001 Gerard Roudier + * version sym-2.1.16-20011028 + - Slightly simplify driver configuration. + - Prepare a new patch against linux-2.4.13. + +Sat Nov 17 10:00 2001 Gerard Roudier + * version sym-2.1.17 + - Fix a couple of gcc/gcc3 warnings. + - Allocate separately from the HCB the array for CCBs hashed by DSA. + All driver memory allocations are now not greater than 1 PAGE + even on PPC64 / 4KB PAGE surprising setup. + +Sat Dec 01 18:00 2001 Gerard Roudier + * version sym-2.1.17a + - Use u_long instead of U32 for the IO base cookie. This is more + consistent with what archs are expecting. + - Use MMIO per default for Power PC instead of some fake normal IO, + as Paul Mackerras stated that MMIO works fine now on this arch. diff --git a/Documentation/scsi/FlashPoint.txt b/Documentation/scsi/FlashPoint.txt new file mode 100644 index 000000000000..d5acaa300a46 --- /dev/null +++ b/Documentation/scsi/FlashPoint.txt @@ -0,0 +1,163 @@ +The BusLogic FlashPoint SCSI Host Adapters are now fully supported on Linux. +The upgrade program described below has been officially terminated effective +31 March 1997 since it is no longer needed. + + + + MYLEX INTRODUCES LINUX OPERATING SYSTEM SUPPORT FOR ITS + BUSLOGIC FLASHPOINT LINE OF SCSI HOST ADAPTERS + + +FREMONT, CA, -- October 8, 1996 -- Mylex Corporation has expanded Linux +operating system support to its BusLogic brand of FlashPoint Ultra SCSI +host adapters. All of BusLogic's other SCSI host adapters, including the +MultiMaster line, currently support the Linux operating system. Linux +drivers and information will be available on October 15th at +http://www.dandelion.com/Linux/. + +"Mylex is committed to supporting the Linux community," says Peter Shambora, +vice president of marketing for Mylex. "We have supported Linux driver +development and provided technical support for our host adapters for several +years, and are pleased to now make our FlashPoint products available to this +user base." + +The Linux Operating System + +Linux is a freely-distributed implementation of UNIX for Intel x86, Sun +SPARC, SGI MIPS, Motorola 68k, Digital Alpha AXP and Motorola PowerPC +machines. It supports a wide range of software, including the X Window +System, Emacs, and TCP/IP networking. Further information is available at +http://www.linux.org and http://www.ssc.com/linux. + +FlashPoint Host Adapters + +The FlashPoint family of Ultra SCSI host adapters, designed for workstation +and file server environments, are available in narrow, wide, dual channel, +and dual channel wide versions. These adapters feature SeqEngine +automation technology, which minimizes SCSI command overhead and reduces +the number of interrupts generated to the CPU. + +About Mylex + +Mylex Corporation (NASDAQ/NM SYMBOL: MYLX), founded in 1983, is a leading +producer of RAID technology and network management products. The company +produces high performance disk array (RAID) controllers, and complementary +computer products for network servers, mass storage systems, workstations +and system boards. Through its wide range of RAID controllers and its +BusLogic line of Ultra SCSI host adapter products, Mylex provides enabling +intelligent I/O technologies that increase network management control, +enhance CPU utilization, optimize I/O performance, and ensure data security +and availability. Products are sold globally through a network of OEMs, +major distributors, VARs, and system integrators. Mylex Corporation is +headquartered at 34551 Ardenwood Blvd., Fremont, CA. + + #### + +Contact: + +Peter Shambora +Vice President of Marketing +Mylex Corp. +510/796-6100 +peters@mylex.com + + ANNOUNCEMENT + BusLogic FlashPoint LT/BT-948 Upgrade Program + 1 February 1996 + + ADDITIONAL ANNOUNCEMENT + BusLogic FlashPoint LW/BT-958 Upgrade Program + 14 June 1996 + +Ever since its introduction last October, the BusLogic FlashPoint LT has +been problematic for members of the Linux community, in that no Linux +drivers have been available for this new Ultra SCSI product. Despite it's +officially being positioned as a desktop workstation product, and not being +particularly well suited for a high performance multitasking operating +system like Linux, the FlashPoint LT has been touted by computer system +vendors as the latest thing, and has been sold even on many of their high +end systems, to the exclusion of the older MultiMaster products. This has +caused grief for many people who inadvertently purchased a system expecting +that all BusLogic SCSI Host Adapters were supported by Linux, only to +discover that the FlashPoint was not supported and would not be for quite +some time, if ever. + +After this problem was identified, BusLogic contacted its major OEM +customers to make sure the BT-946C/956C MultiMaster cards would still be +made available, and that Linux users who mistakenly ordered systems with +the FlashPoint would be able to upgrade to the BT-946C. While this helped +many purchasers of new systems, it was only a partial solution to the +overall problem of FlashPoint support for Linux users. It did nothing to +assist the people who initially purchased a FlashPoint for a supported +operating system and then later decided to run Linux, or those who had +ended up with a FlashPoint LT, believing it was supported, and were unable +to return it. + +In the middle of December, I asked to meet with BusLogic's senior +management to discuss the issues related to Linux and free software support +for the FlashPoint. Rumors of varying accuracy had been circulating +publicly about BusLogic's attitude toward the Linux community, and I felt +it was best that these issues be addressed directly. I sent an email +message after 11pm one evening, and the meeting took place the next +afternoon. Unfortunately, corporate wheels sometimes grind slowly, +especially when a company is being acquired, and so it's taken until now +before the details were completely determined and a public statement could +be made. + +BusLogic is not prepared at this time to release the information necessary +for third parties to write drivers for the FlashPoint. The only existing +FlashPoint drivers have been written directly by BusLogic Engineering, and +there is no FlashPoint documentation sufficiently detailed to allow outside +developers to write a driver without substantial assistance. While there +are people at BusLogic who would rather not release the details of the +FlashPoint architecture at all, that debate has not yet been settled either +way. In any event, even if documentation were available today it would +take quite a while for a usable driver to be written, especially since I'm +not convinced that the effort required would be worthwhile. + +However, BusLogic does remain committed to providing a high performance +SCSI solution for the Linux community, and does not want to see anyone left +unable to run Linux because they have a Flashpoint LT. Therefore, BusLogic +has put in place a direct upgrade program to allow any Linux user worldwide +to trade in their FlashPoint LT for the new BT-948 MultiMaster PCI Ultra +SCSI Host Adapter. The BT-948 is the Ultra SCSI successor to the BT-946C +and has all the best features of both the BT-946C and FlashPoint LT, +including smart termination and a flash PROM for easy firmware updates, and +is of course compatible with the present Linux driver. The price for this +upgrade has been set at US $45 plus shipping and handling, and the upgrade +program will be administered through BusLogic Technical Support, which can +be reached by electronic mail at techsup@buslogic.com, by Voice at +1 408 +654-0760, or by FAX at +1 408 492-1542. + +As of 14 June 1996, the original BusLogic FlashPoint LT to BT-948 upgrade +program has now been extended to encompass the FlashPoint LW Wide Ultra +SCSI Host Adapter. Any Linux user worldwide may trade in their FlashPoint +LW (BT-950) for a BT-958 MultiMaster PCI Ultra SCSI Host Adapter. The +price for this upgrade has been set at US $65 plus shipping and handling. + +I was a beta test site for the BT-948/958, and versions 1.2.1 and 1.3.1 of +my BusLogic driver already included latent support for the BT-948/958. +Additional cosmetic support for the Ultra SCSI MultiMaster cards was added +subsequent releases. As a result of this cooperative testing process, +several firmware bugs were found and corrected. My heavily loaded Linux +test system provided an ideal environment for testing error recovery +processes that are much more rarely exercised in production systems, but +are crucial to overall system stability. It was especially convenient +being able to work directly with their firmware engineer in demonstrating +the problems under control of the firmware debugging environment; things +sure have come a long way since the last time I worked on firmware for an +embedded system. I am presently working on some performance testing and +expect to have some data to report in the not too distant future. + +BusLogic asked me to send this announcement since a large percentage of the +questions regarding support for the FlashPoint have either been sent to me +directly via email, or have appeared in the Linux newsgroups in which I +participate. To summarize, BusLogic is offering Linux users an upgrade +from the unsupported FlashPoint LT (BT-930) to the supported BT-948 for US +$45 plus shipping and handling, or from the unsupported FlashPoint LW +(BT-950) to the supported BT-958 for $65 plus shipping and handling. +Contact BusLogic Technical Support at techsup@buslogic.com or +1 408 +654-0760 to take advantage of their offer. + + Leonard N. Zubkoff + lnz@dandelion.com diff --git a/Documentation/scsi/LICENSE.FlashPoint b/Documentation/scsi/LICENSE.FlashPoint new file mode 100644 index 000000000000..ffd0fe226ee2 --- /dev/null +++ b/Documentation/scsi/LICENSE.FlashPoint @@ -0,0 +1,60 @@ + FlashPoint Driver Developer's Kit + Version 1.0 + + Copyright 1995-1996 by Mylex Corporation + All Rights Reserved + +This program is free software; you may redistribute and/or modify it under +the terms of either: + + a) the GNU General Public License as published by the Free Software + Foundation; either version 2, or (at your option) any later version, + + or + + b) the "BSD-style License" included below. + +This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY, without even the implied warranty of MERCHANTABILITY +or FITNESS FOR A PARTICULAR PURPOSE. See either the GNU General Public +License or the BSD-style License below for more details. + +You should have received a copy of the GNU General Public License along +with this program; if not, write to the Free Software Foundation, Inc., +675 Mass Ave, Cambridge, MA 02139, USA. + +The BSD-style License is as follows: + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain this LICENSE.FlashPoint + file, without modification, this list of conditions, and the following + disclaimer. The following copyright notice must appear immediately at + the beginning of all source files: + + Copyright 1995-1996 by Mylex Corporation. All Rights Reserved + + This file is available under both the GNU General Public License + and a BSD-style copyright; see LICENSE.FlashPoint for details. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. The name of Mylex Corporation may not be used to endorse or promote + products derived from this software without specific prior written + permission. + +THIS SOFTWARE IS PROVIDED BY MYLEX CORP. ``AS IS'' AND ANY EXPRESS OR +IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN +NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +SUCH DAMAGE. diff --git a/Documentation/scsi/Mylex.txt b/Documentation/scsi/Mylex.txt new file mode 100644 index 000000000000..cdf69293f7d5 --- /dev/null +++ b/Documentation/scsi/Mylex.txt @@ -0,0 +1,5 @@ +Please see the file README.BusLogic for information about Linux support for +Mylex (formerly BusLogic) MultiMaster and FlashPoint SCSI Host Adapters. + +The Mylex DAC960 PCI RAID Controllers are now supported. Please consult +http://www.dandelion.com/Linux/ for further information on the DAC960 driver. diff --git a/Documentation/scsi/NinjaSCSI.txt b/Documentation/scsi/NinjaSCSI.txt new file mode 100644 index 000000000000..041780f428ac --- /dev/null +++ b/Documentation/scsi/NinjaSCSI.txt @@ -0,0 +1,130 @@ + + WorkBiT NinjaSCSI-3/32Bi driver for Linux + +1. Comment + This is Workbit corp.'s(http://www.workbit.co.jp/) NinjaSCSI-3 +(http://www.workbit.co.jp/ts/z_nj3r.html) and NinjaSCSI-32Bi +(http://www.workbit.co.jp/ts/z_njsc32bi.html) PCMCIA card driver module +for Linux. + +2. My Linux environment +Linux kernel: 2.4.7 / 2.2.19 +pcmcia-cs: 3.1.27 +gcc: gcc-2.95.4 +PC card: I-O data PCSC-F (NinjaSCSI-3) + I-O data CBSC-II in 16 bit mode (NinjaSCSI-32Bi) +SCSI device: I-O data CDPS-PX24 (CD-ROM drive) + Media Intelligent MMO-640GT (Optical disk drive) + +3. Install +[1] Check your PC card is true "NinjaSCSI-3" card. + If you installed pcmcia-cs already, pcmcia reports your card as UNKNOWN + card, and write ["WBT", "NinjaSCSI-3", "R1.0"] or some other string to + your console or log file. + You can also use "cardctl" program (this program is in pcmcia-cs source + code) to get more info. + +# cat /var/log/messgaes +... +Jan 2 03:45:06 lindberg cardmgr[78]: unsupported card in socket 1 +Jan 2 03:45:06 lindberg cardmgr[78]: product info: "WBT", "NinjaSCSI-3", "R1.0" +... +# cardctl ident +Socket 0: + no product info available +Socket 1: + product info: "IO DATA", "CBSC16 ", "1" + + +[2] Get Linux kernel source, and extract it to /usr/src. + Because NinjaSCSI driver requiers some SCSI header files in Linux kernel + source. + I recomend rebuilding your kernel. This eliminate some versioning problem. +$ cd /usr/src +$ tar -zxvf linux-x.x.x.tar.gz +$ cd linux +$ make config +... + +[3] If you use this driver with Kernel 2.2, Unpack pcmcia-cs in some directory + and make & install. This driver requies pcmcia-cs header file. +$ cd /usr/src +$ tar zxvf cs-pcmcia-cs-3.x.x.tar.gz +... + +[4] Extract this driver's archive somewhere, and edit Makefile, then do make. +$ tar -zxvf nsp_cs-x.x.tar.gz +$ cd nsp_cs-x.x +$ emacs Makefile +... +$ make + +[5] Copy nsp_cs.o to suitable plase, like /lib/modules/<Kernel version>/pcmcia/ . + +[6] Add these lines to /etc/pcmcia/config . + If you yse pcmcia-cs-3.1.8 or later, we can use "nsp_cs.conf" file. + So, you don't need to edit file. Just copy to /etc/pcmcia/ . + +------------------------------------- +device "nsp_cs" + class "scsi" module "nsp_cs" + +card "WorkBit NinjaSCSI-3" + version "WBT", "NinjaSCSI-3", "R1.0" + bind "nsp_cs" + +card "WorkBit NinjaSCSI-32Bi (16bit)" + version "WORKBIT", "UltraNinja-16", "1" + bind "nsp_cs" + +# OEM +card "WorkBit NinjaSCSI-32Bi (16bit) / IO-DATA" + version "IO DATA", "CBSC16 ", "1" + bind "nsp_cs" + +# OEM +card "WorkBit NinjaSCSI-32Bi (16bit) / KME-1" + version "KME ", "SCSI-CARD-001", "1" + bind "nsp_cs" +card "WorkBit NinjaSCSI-32Bi (16bit) / KME-2" + version "KME ", "SCSI-CARD-002", "1" + bind "nsp_cs" +card "WorkBit NinjaSCSI-32Bi (16bit) / KME-3" + version "KME ", "SCSI-CARD-003", "1" + bind "nsp_cs" +card "WorkBit NinjaSCSI-32Bi (16bit) / KME-4" + version "KME ", "SCSI-CARD-004", "1" + bind "nsp_cs" +------------------------------------- + +[7] Start (or restart) pcmcia-cs. +# /etc/rc.d/rc.pcmcia start (BSD style) +or +# /etc/init.d/pcmcia start (SYSV style) + + +4. History +See README.nin_cs . + +5. Caution + If you eject card when doing some operation for your SCSI device or suspend +your computer, you encount some *BAD* error like disk crash. + It works good when I using this driver right way. But I'm not guarantee +your data. Please backup your data when you use this driver. + +6. Known Bugs + In 2.4 kernel, you can't use 640MB Optical disk. This error comes from +high level SCSI driver. + +7. Testing + Please send me some reports(bug reports etc..) of this software. +When you send report, please tell me these or more. + card name + kernel version + your SCSI device name(hard drive, CD-ROM, etc...) + +8. Copyright + See GPL. + + +2001/08/08 yokota@netlab.is.tsukuba.ac.jp <YOKOTA Hiroshi> diff --git a/Documentation/scsi/aha152x.txt b/Documentation/scsi/aha152x.txt new file mode 100644 index 000000000000..2ce022cec9be --- /dev/null +++ b/Documentation/scsi/aha152x.txt @@ -0,0 +1,183 @@ +$Id: README.aha152x,v 1.2 1999/12/25 15:32:30 fischer Exp fischer $ +Adaptec AHA-1520/1522 SCSI driver for Linux (aha152x) + +Copyright 1993-1999 Jürgen Fischer <fischer@norbit.de> +TC1550 patches by Luuk van Dijk (ldz@xs4all.nl) + + +In Revision 2 the driver was modified a lot (especially the +bottom-half handler complete()). + +The driver is much cleaner now, has support for the new +error handling code in 2.3, produced less cpu load (much +less polling loops), has slightly higher throughput (at +least on my ancient test box; a i486/33Mhz/20MB). + + +CONFIGURATION ARGUMENTS: + +IOPORT base io address (0x340/0x140) +IRQ interrupt level (9-12; default 11) +SCSI_ID scsi id of controller (0-7; default 7) +RECONNECT allow targets to disconnect from the bus (0/1; default 1 [on]) +PARITY enable parity checking (0/1; default 1 [on]) +SYNCHRONOUS enable synchronous transfers (0/1; default 1 [on]) +DELAY: bus reset delay (default 100) +EXT_TRANS: enable extended translation (0/1: default 0 [off]) + (see NOTES) + +COMPILE TIME CONFIGURATION (go into AHA152X in drivers/scsi/Makefile): + +-DAUTOCONF + use configuration the controller reports (AHA-152x only) + +-DSKIP_BIOSTEST + Don't test for BIOS signature (AHA-1510 or disabled BIOS) + +-DSETUP0="{ IOPORT, IRQ, SCSI_ID, RECONNECT, PARITY, SYNCHRONOUS, DELAY, EXT_TRANS }" + override for the first controller + +-DSETUP1="{ IOPORT, IRQ, SCSI_ID, RECONNECT, PARITY, SYNCHRONOUS, DELAY, EXT_TRANS }" + override for the second controller + +-DAHA152X_DEBUG + enable debugging output + +-DAHA152X_STAT + enable some statistics + + +LILO COMMAND LINE OPTIONS: + +aha152x=<IOPORT>[,<IRQ>[,<SCSI-ID>[,<RECONNECT>[,<PARITY>[,<SYNCHRONOUS>[,<DELAY> [,<EXT_TRANS]]]]]]] + + The normal configuration can be overridden by specifying a command line. + When you do this, the BIOS test is skipped. Entered values have to be + valid (known). Don't use values that aren't supported under normal + operation. If you think that you need other values: contact me. + For two controllers use the aha152x statement twice. + + +SYMBOLS FOR MODULE CONFIGURATION: + +Choose from 2 alternatives: + +1. specify everything (old) + +aha152x=IOPORT,IRQ,SCSI_ID,RECONNECT,PARITY,SYNCHRONOUS,DELAY,EXT_TRANS + configuration override for first controller + + +aha152x1=IOPORT,IRQ,SCSI_ID,RECONNECT,PARITY,SYNCHRONOUS,DELAY,EXT_TRANS + configuration override for second controller + +2. specify only what you need to (irq or io is required; new) + +io=IOPORT0[,IOPORT1] + IOPORT for first and second controller + +irq=IRQ0[,IRQ1] + IRQ for first and second controller + +scsiid=SCSIID0[,SCSIID1] + SCSIID for first and second controller + +reconnect=RECONNECT0[,RECONNECT1] + allow targets to disconnect for first and second controller + +parity=PAR0[PAR1] + use parity for first and second controller + +sync=SYNCHRONOUS0[,SYNCHRONOUS1] + enable synchronous transfers for first and second controller + +delay=DELAY0[,DELAY1] + reset DELAY for first and second controller + +exttrans=EXTTRANS0[,EXTTRANS1] + enable extended translation for first and second controller + + +If you use both alternatives the first will be taken. + + +NOTES ON EXT_TRANS: + +SCSI uses block numbers to address blocks/sectors on a device. +The BIOS uses a cylinder/head/sector addressing scheme (C/H/S) +scheme instead. DOS expects a BIOS or driver that understands this +C/H/S addressing. + +The number of cylinders/heads/sectors is called geometry and is required +as base for requests in C/H/S addressing. SCSI only knows about the +total capacity of disks in blocks (sectors). + +Therefore the SCSI BIOS/DOS driver has to calculate a logical/virtual +geometry just to be able to support that addressing scheme. The geometry +returned by the SCSI BIOS is a pure calculation and has nothing to +do with the real/physical geometry of the disk (which is usually +irrelevant anyway). + +Basically this has no impact at all on Linux, because it also uses block +instead of C/H/S addressing. Unfortunately C/H/S addressing is also used +in the partition table and therefore every operating system has to know +the right geometry to be able to interpret it. + +Moreover there are certain limitations to the C/H/S addressing scheme, +namely the address space is limited to upto 255 heads, upto 63 sectors +and a maximum of 1023 cylinders. + +The AHA-1522 BIOS calculates the geometry by fixing the number of heads +to 64, the number of sectors to 32 and by calculating the number of +cylinders by dividing the capacity reported by the disk by 64*32 (1 MB). +This is considered to be the default translation. + +With respect to the limit of 1023 cylinders using C/H/S you can only +address the first GB of your disk in the partition table. Therefore +BIOSes of some newer controllers based on the AIC-6260/6360 support +extended translation. This means that the BIOS uses 255 for heads, +63 for sectors and then divides the capacity of the disk by 255*63 +(about 8 MB), as soon it sees a disk greater than 1 GB. That results +in a maximum of about 8 GB addressable diskspace in the partition table +(but there are already bigger disks out there today). + +To make it even more complicated the translation mode might/might +not be configurable in certain BIOS setups. + +This driver does some more or less failsafe guessing to get the +geometry right in most cases: + +- for disks<1GB: use default translation (C/32/64) + +- for disks>1GB: + - take current geometry from the partition table + (using scsicam_bios_param and accept only `valid' geometries, + ie. either (C/32/64) or (C/63/255)). This can be extended translation + even if it's not enabled in the driver. + + - if that fails, take extended translation if enabled by override, + kernel or module parameter, otherwise take default translation and + ask the user for verification. This might on not yet partitioned + disks. + + +REFERENCES USED: + + "AIC-6260 SCSI Chip Specification", Adaptec Corporation. + + "SCSI COMPUTER SYSTEM INTERFACE - 2 (SCSI-2)", X3T9.2/86-109 rev. 10h + + "Writing a SCSI device driver for Linux", Rik Faith (faith@cs.unc.edu) + + "Kernel Hacker's Guide", Michael K. Johnson (johnsonm@sunsite.unc.edu) + + "Adaptec 1520/1522 User's Guide", Adaptec Corporation. + + Michael K. Johnson (johnsonm@sunsite.unc.edu) + + Drew Eckhardt (drew@cs.colorado.edu) + + Eric Youngdale (eric@andante.org) + + special thanks to Eric Youngdale for the free(!) supplying the + documentation on the chip. diff --git a/Documentation/scsi/aic79xx.txt b/Documentation/scsi/aic79xx.txt new file mode 100644 index 000000000000..0aeef740a95a --- /dev/null +++ b/Documentation/scsi/aic79xx.txt @@ -0,0 +1,516 @@ +==================================================================== += Adaptec Ultra320 Family Manager Set v1.3.11 = += = += README for = += The Linux Operating System = +==================================================================== + +The following information is available in this file: + + 1. Supported Hardware + 2. Version History + 3. Command Line Options + 4. Additional Notes + 5. Contacting Adaptec + + +1. Supported Hardware + + The following Adaptec SCSI Host Adapters are supported by this + driver set. + + Ultra320 ASIC Description + ---------------------------------------------------------------- + AIC-7901A Single Channel 64-bit PCI-X 133MHz to + Ultra320 SCSI ASIC + AIC-7901B Single Channel 64-bit PCI-X 133MHz to + Ultra320 SCSI ASIC with Retained Training + AIC-7902A4 Dual Channel 64-bit PCI-X 133MHz to + Ultra320 SCSI ASIC + AIC-7902B Dual Channel 64-bit PCI-X 133MHz to + Ultra320 SCSI ASIC with Retained Training + + Ultra320 Adapters Description ASIC + -------------------------------------------------------------------------- + Adaptec SCSI Card 39320 Dual Channel 64-bit PCI-X 133MHz to 7902A4/7902B + Ultra320 SCSI Card (one external + 68-pin, two internal 68-pin) + Adaptec SCSI Card 39320A Dual Channel 64-bit PCI-X 133MHz to 7902B + Ultra320 SCSI Card (one external + 68-pin, two internal 68-pin) + Adaptec SCSI Card 39320D Dual Channel 64-bit PCI-X 133MHz to 7902A4 + Ultra320 SCSI Card (two external VHDC + and one internal 68-pin) + Adaptec SCSI Card 39320D Dual Channel 64-bit PCI-X 133MHz to 7902A4 + Ultra320 SCSI Card (two external VHDC + and one internal 68-pin) based on the + AIC-7902B ASIC + Adaptec SCSI Card 29320 Single Channel 64-bit PCI-X 133MHz to 7901A + Ultra320 SCSI Card (one external + 68-pin, two internal 68-pin, one + internal 50-pin) + Adaptec SCSI Card 29320A Single Channel 64-bit PCI-X 133MHz to 7901B + Ultra320 SCSI Card (one external + 68-pin, two internal 68-pin, one + internal 50-pin) + Adaptec SCSI Card 29320LP Single Channel 64-bit Low Profile 7901A + PCI-X 133MHz to Ultra320 SCSI Card + (One external VHDC, one internal + 68-pin) + Adaptec SCSI Card 29320ALP Single Channel 64-bit Low Profile 7901B + PCI-X 133MHz to Ultra320 SCSI Card + (One external VHDC, one internal + 68-pin) +2. Version History + + 1.3.11 (July 11, 2003) + - Fix several deadlock issues. + - Add 29320ALP and 39320B Id's. + + 1.3.10 (June 3rd, 2003) + - Align the SCB_TAG field on a 16byte boundary. This avoids + SCB corruption on some PCI-33 busses. + - Correct non-zero luns on Rev B. hardware. + - Update for change in 2.5.X SCSI proc FS interface. + - When negotiation async via an 8bit WDTR message, send + an SDTR with an offset of 0 to be sure the target + knows we are async. This works around a firmware defect + in the Quantum Atlas 10K. + - Implement controller susupend and resume. + - Clear PCI error state during driver attach so that we + don't disable memory mapped I/O due to a stray write + by some other driver probe that occurred before we + claimed the controller. + + 1.3.9 (May 22nd, 2003) + - Fix compiler errors. + - Remove S/G splitting for segments that cross a 4GB boundary. + This is guaranteed not to happen in Linux. + - Add support for scsi_report_device_reset() found in + 2.5.X kernels. + - Add 7901B support. + - Simplify handling of the packtized lun Rev A workaround. + - Correct and simplify handling of the ignore wide residue + message. The previous code would fail to report a residual + if the transaction data length was even and we received + an IWR message. + + 1.3.8 (April 29th, 2003) + - Fix types accessed via the command line interface code. + - Perform a few firmware optimizations. + - Fix "Unexpected PKT busfree" errors. + - Use a sequencer interrupt to notify the host of + commands with bad status. We defer the notification + until there are no outstanding selections to ensure + that the host is interrupted for as short a time as + possible. + - Remove pre-2.2.X support. + - Add support for new 2.5.X interrupt API. + - Correct big-endian architecture support. + + 1.3.7 (April 16th, 2003) + - Use del_timer_sync() to ensure that no timeouts + are pending during controller shutdown. + - For pre-2.5.X kernels, carefully adjust our segment + list size to avoid SCSI malloc pool fragmentation. + - Cleanup channel display in our /proc output. + - Workaround duplicate device entries in the mid-layer + devlice list during add-single-device. + + 1.3.6 (March 28th, 2003) + - Correct a double free in the Domain Validation code. + - Correct a reference to free'ed memory during controller + shutdown. + - Reset the bus on an SE->LVD change. This is required + to reset our transcievers. + + 1.3.5 (March 24th, 2003) + - Fix a few register window mode bugs. + - Include read streaming in the PPR flags we display in + diagnostics as well as /proc. + - Add PCI hot plug support for 2.5.X kernels. + - Correct default precompensation value for RevA hardware. + - Fix Domain Validation thread shutdown. + - Add a firmware workaround to make the LED blink + brighter during packetized operations on the H2A4. + - Correct /proc display of user read streaming settings. + - Simplify driver locking by releasing the io_request_lock + upon driver entry from the mid-layer. + - Cleanup command line parsing and move much of this code + to aiclib. + + 1.3.4 (February 28th, 2003) + - Correct a race condition in our error recovery handler. + - Allow Test Unit Ready commands to take a full 5 seconds + during Domain Validation. + + 1.3.2 (February 19th, 2003) + - Correct a Rev B. regression due to the GEM318 + compatibility fix included in 1.3.1. + + 1.3.1 (February 11th, 2003) + - Add support for the 39320A. + - Improve recovery for certain PCI-X errors. + - Fix handling of LQ/DATA/LQ/DATA for the + same write transaction that can occur without + interveining training. + - Correct compatibility issues with the GEM318 + enclosure services device. + - Correct data corruption issue that occurred under + high tag depth write loads. + - Adapt to a change in the 2.5.X daemonize() API. + - Correct a "Missing case in ahd_handle_scsiint" panic. + + 1.3.0 (January 21st, 2003) + - Full regression testing for all U320 products completed. + - Added abort and target/lun reset error recovery handler and + interrupt coalessing. + + 1.2.0 (November 14th, 2002) + - Added support for Domain Validation + - Add support for the Hewlett-Packard version of the 39320D + and AIC-7902 adapters. + Support for previous adapters has not been fully tested and should + only be used at the customer's own risk. + + 1.1.1 (September 24th, 2002) + - Added support for the Linux 2.5.X kernel series + + 1.1.0 (September 17th, 2002) + - Added support for four additional SCSI products: + ASC-39320, ASC-29320, ASC-29320LP, AIC-7901. + + 1.0.0 (May 30th, 2002) + - Initial driver release. + + 2.1. Software/Hardware Features + - Support for the SPI-4 "Ultra320" standard: + - 320MB/s transfer rates + - Packetized SCSI Protocol at 160MB/s and 320MB/s + - Quick Arbitration Selection (QAS) + - Retained Training Information (Rev B. ASIC only) + - Interrupt Coalessing + - Initiator Mode (target mode not currently + supported) + - Support for the PCI-X standard up to 133MHz + - Support for the PCI v2.2 standard + - Domain Validation + + 2.2. Operating System Support: + - Redhat Linux 7.2, 7.3, 8.0, Advanced Server 2.1 + - SuSE Linux 7.3, 8.0, 8.1, Enterprise Server 7 + - only Intel and AMD x86 supported at this time + - >4GB memory configurations supported. + + Refer to the User's Guide for more details on this. + +3. Command Line Options + + WARNING: ALTERING OR ADDING THESE DRIVER PARAMETERS + INCORRECTLY CAN RENDER YOUR SYSTEM INOPERABLE. + USE THEM WITH CAUTION. + + Edit the file "modprobe.conf" in the directory /etc and add/edit a + line containing 'options aic79xx aic79xx=[command[,command...]]' where + 'command' is one or more of the following: + ----------------------------------------------------------------- + Option: verbose + Definition: enable additional informative messages during + driver operation. + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: debug:[value] + Definition: Enables various levels of debugging information + The bit definitions for the debugging mask can + be found in drivers/scsi/aic7xxx/aic79xx.h under + the "Debug" heading. + Possible Values: 0x0000 = no debugging, 0xffff = full debugging + Default Value: 0x0000 + ----------------------------------------------------------------- + Option: no_reset + Definition: Do not reset the bus during the initial probe + phase + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: extended + Definition: Force extended translation on the controller + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: periodic_otag + Definition: Send an ordered tag periodically to prevent + tag starvation. Needed for some older devices + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: reverse_scan + Definition: Probe the scsi bus in reverse order, starting + with target 15 + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: global_tag_depth + Definition: Global tag depth for all targets on all busses. + This option sets the default tag depth which + may be selectively overridden vi the tag_info + option. + Possible Values: 1 - 253 + Default Value: 32 + ----------------------------------------------------------------- + Option: tag_info:{{value[,value...]}[,{value[,value...]}...]} + Definition: Set the per-target tagged queue depth on a + per controller basis. Both controllers and targets + may be ommitted indicating that they should retain + the default tag depth. + Examples: tag_info:{{16,32,32,64,8,8,,32,32,32,32,32,32,32,32,32} + On Controller 0 + specifies a tag depth of 16 for target 0 + specifies a tag depth of 64 for target 3 + specifies a tag depth of 8 for targets 4 and 5 + leaves target 6 at the default + specifies a tag depth of 32 for targets 1,2,7-15 + All other targets retain the default depth. + + tag_info:{{},{32,,32}} + On Controller 1 + specifies a tag depth of 32 for targets 0 and 2 + All other targets retain the default depth. + + Possible Values: 1 - 253 + Default Value: 32 + ----------------------------------------------------------------- + Option: rd_strm: {rd_strm_bitmask[,rd_strm_bitmask...]} + Definition: Enable read streaming on a per target basis. + The rd_strm_bitmask is a 16 bit hex value in which + each bit represents a target. Setting the target's + bit to '1' enables read streaming for that + target. Controllers may be ommitted indicating that + they should retain the default read streaming setting. + Example: rd_strm:{0x0041} + On Controller 0 + enables read streaming for targets 0 and 6. + disables read streaming for targets 1-5,7-15. + All other targets retain the default read + streaming setting. + Example: rd_strm:{0x0023,,0xFFFF} + On Controller 0 + enables read streaming for targets 1,2, and 5. + disables read streaming for targets 3,4,6-15. + On Controller 2 + enables read streaming for all targets. + All other targets retain the default read + streaming setting. + + Possible Values: 0x0000 - 0xffff + Default Value: 0x0000 + ----------------------------------------------------------------- + Option: dv: {value[,value...]} + Definition: Set Domain Validation Policy on a per-controller basis. + Controllers may be ommitted indicating that + they should retain the default read streaming setting. + Example: dv:{-1,0,,1,1,0} + On Controller 0 leave DV at its default setting. + On Controller 1 disable DV. + Skip configuration on Controller 2. + On Controllers 3 and 4 enable DV. + On Controller 5 disable DV. + + Possible Values: < 0 Use setting from serial EEPROM. + 0 Disable DV + > 0 Enable DV + Default Value: DV Serial EEPROM configuration setting. + ----------------------------------------------------------------- + Option: seltime:[value] + Definition: Specifies the selection timeout value + Possible Values: 0 = 256ms, 1 = 128ms, 2 = 64ms, 3 = 32ms + Default Value: 0 + ----------------------------------------------------------------- + + *** The following three options should only be changed at *** + *** the direction of a technical support representative. *** + + ----------------------------------------------------------------- + Option: precomp: {value[,value...]} + Definition: Set IO Cell precompensation value on a per-controller + basis. + Controllers may be ommitted indicating that + they should retain the default precompensation setting. + Example: precomp:{0x1} + On Controller 0 set precompensation to 1. + Example: precomp:{1,,7} + On Controller 0 set precompensation to 1. + On Controller 2 set precompensation to 8. + + Possible Values: 0 - 7 + Default Value: Varies based on chip revision + ----------------------------------------------------------------- + Option: slewrate: {value[,value...]} + Definition: Set IO Cell slew rate on a per-controller basis. + Controllers may be ommitted indicating that + they should retain the default slew rate setting. + Example: slewrate:{0x1} + On Controller 0 set slew rate to 1. + Example: slewrate :{1,,8} + On Controller 0 set slew rate to 1. + On Controller 2 set slew rate to 8. + + Possible Values: 0 - 15 + Default Value: Varies based on chip revision + ----------------------------------------------------------------- + Option: amplitude: {value[,value...]} + Definition: Set IO Cell signal amplitude on a per-controller basis. + Controllers may be ommitted indicating that + they should retain the default read streaming setting. + Example: amplitude:{0x1} + On Controller 0 set amplitude to 1. + Example: amplitude :{1,,7} + On Controller 0 set amplitude to 1. + On Controller 2 set amplitude to 7. + + Possible Values: 1 - 7 + Default Value: Varies based on chip revision + ----------------------------------------------------------------- + + Example: 'options aic79xx aic79xx=verbose,rd_strm:{{0x0041}}' + enables verbose output in the driver and turns read streaming on + for targets 0 and 6 of Controller 0. + +4. Additional Notes + + 4.1. Known/Unresolved or FYI Issues + + * Under SuSE Linux Enterprise 7, the driver may fail to operate + correctly due to a problem with PCI interrupt routing in the + Linux kernel. Please contact SuSE for an updated Linux + kernel. + + 4.2. Third-Party Compatibility Issues + + * Adaptec only supports Ultra320 hard drives running + the latest firmware available. Please check with + your hard drive manufacturer to ensure you have the + latest version. + + 4.3. Operating System or Technology Limitations + + * PCI Hot Plug is untested and may cause the operating system + to stop responding. + * Luns that are not numbered contiguously starting with 0 might not + be automatically probed during system startup. This is a limitation + of the OS. Please contact your Linux vendor for instructions on + manually probing non-contiguous luns. + * Using the Driver Update Disk version of this package during OS + installation under RedHat might result in two versions of this + driver being installed into the system module directory. This + might cause problems with the /sbin/mkinitrd program and/or + other RPM packages that try to install system modules. The best + way to correct this once the system is running is to install + the latest RPM package version of this driver, available from + http://www.adaptec.com. + + +5. Contacting Adaptec + + A Technical Support Identification (TSID) Number is required for + Adaptec technical support. + - The 12-digit TSID can be found on the white barcode-type label + included inside the box with your product. The TSID helps us + provide more efficient service by accurately identifying your + product and support status. + Support Options + - Search the Adaptec Support Knowledgebase (ASK) at + http://ask.adaptec.com for articles, troubleshooting tips, and + frequently asked questions for your product. + - For support via Email, submit your question to Adaptec's + Technical Support Specialists at http://ask.adaptec.com. + + North America + - Visit our Web site at http://www.adaptec.com. + - To speak with a Fibre Channel/RAID/External Storage Technical + Support Specialist, call 1-321-207-2000, + Hours: Monday-Friday, 3:00 A.M. to 5:00 P.M., PST. + (Not open on holidays) + - For Technical Support in all other technologies including + SCSI, call 1-408-934-7274, + Hours: Monday-Friday, 6:00 A.M. to 5:00 P.M., PST. + (Not open on holidays) + - For after hours support, call 1-800-416-8066 ($99/call, + $149/call on holidays) + - To order Adaptec products including software and cables, call + 1-800-442-7274 or 1-408-957-7274. You can also visit our + online store at http://www.adaptecstore.com + + Europe + - Visit our Web site at http://www.adaptec-europe.com. + - English and French: To speak with a Technical Support + Specialist, call one of the following numbers: + - English: +32-2-352-3470 + - French: +32-2-352-3460 + Hours: Monday-Thursday, 10:00 to 12:30, 13:30 to 17:30 CET + Friday, 10:00 to 12:30, 13:30 to 16:30 CET + - German: To speak with a Technical Support Specialist, + call +49-89-456-40660 + Hours: Monday-Thursday, 09:30 to 12:30, 13:30 to 16:30 CET + Friday, 09:30 to 12:30, 13:30 to 15:00 CET + - To order Adaptec products, including accessories and cables: + - UK: +0800-96-65-26 or fax +0800-731-02-95 + - Other European countries: +32-11-300-379 + + Australia and New Zealand + - Visit our Web site at http://www.adaptec.com.au. + - To speak with a Technical Support Specialist, call + +612-9416-0698 + Hours: Monday-Friday, 10:00 A.M. to 4:30 P.M., EAT + (Not open on holidays) + + Japan + - To speak with a Technical Support Specialist, call + +81-3-5308-6120 + Hours: Monday-Friday, 9:00 a.m. to 12:00 p.m., 1:00 p.m. to + 6:00 p.m. TSC + + Hong Kong and China + - To speak with a Technical Support Specialist, call + +852-2869-7200 + Hours: Monday-Friday, 10:00 to 17:00. + - Fax Technical Support at +852-2869-7100. + + Singapore + - To speak with a Technical Support Specialist, call + +65-245-7470 + Hours: Monday-Friday, 10:00 to 17:00. + - Fax Technical Support at +852-2869-7100 + +------------------------------------------------------------------- +/* + * Copyright (c) 2003 Adaptec Inc. 691 S. Milpitas Blvd., Milpitas CA 95035 USA. + * All rights reserved. + * + * You are permitted to redistribute, use and modify this README file in whole + * or in part in conjunction with redistribution of software governed by the + * General Public License, provided that the following conditions are met: + * 1. Redistributions of README file must retain the above copyright + * notice, this list of conditions, and the following disclaimer, + * without modification. + * 2. The name of the author may not be used to endorse or promote products + * derived from this software without specific prior written permission. + * 3. Modifications or new contributions must be attributed in a copyright + * notice identifying the author ("Contributor") and added below the + * original copyright notice. The copyright notice is for purposes of + * identifying contributors and should not be deemed as permission to alter + * the permissions given by Adaptec. + * + * THIS README FILE IS PROVIDED BY ADAPTEC AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, ANY + * WARRANTIES OF NON-INFRINGEMENT OR THE IMPLIED WARRANTIES OF MERCHANTABILITY + * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL + * ADAPTEC OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED + * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS README + * FILE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ diff --git a/Documentation/scsi/aic7xxx.txt b/Documentation/scsi/aic7xxx.txt new file mode 100644 index 000000000000..160e7354cd1e --- /dev/null +++ b/Documentation/scsi/aic7xxx.txt @@ -0,0 +1,414 @@ +==================================================================== += Adaptec Aic7xxx Fast -> Ultra160 Family Manager Set v6.2.28 = += README for = += The Linux Operating System = +==================================================================== + +The following information is available in this file: + + 1. Supported Hardware + 2. Version History + 3. Command Line Options + 4. Contacting Adaptec + +1. Supported Hardware + + The following Adaptec SCSI Chips and Host Adapters are supported by + the aic7xxx driver. + + Chip MIPS Host Bus MaxSync MaxWidth SCBs Notes + --------------------------------------------------------------- + aic7770 10 EISA/VL 10MHz 16Bit 4 1 + aic7850 10 PCI/32 10MHz 8Bit 3 + aic7855 10 PCI/32 10MHz 8Bit 3 + aic7856 10 PCI/32 10MHz 8Bit 3 + aic7859 10 PCI/32 20MHz 8Bit 3 + aic7860 10 PCI/32 20MHz 8Bit 3 + aic7870 10 PCI/32 10MHz 16Bit 16 + aic7880 10 PCI/32 20MHz 16Bit 16 + aic7890 20 PCI/32 40MHz 16Bit 16 3 4 5 6 7 8 + aic7891 20 PCI/64 40MHz 16Bit 16 3 4 5 6 7 8 + aic7892 20 PCI/64-66 80MHz 16Bit 16 3 4 5 6 7 8 + aic7895 15 PCI/32 20MHz 16Bit 16 2 3 4 5 + aic7895C 15 PCI/32 20MHz 16Bit 16 2 3 4 5 8 + aic7896 20 PCI/32 40MHz 16Bit 16 2 3 4 5 6 7 8 + aic7897 20 PCI/64 40MHz 16Bit 16 2 3 4 5 6 7 8 + aic7899 20 PCI/64-66 80MHz 16Bit 16 2 3 4 5 6 7 8 + + 1. Multiplexed Twin Channel Device - One controller servicing two + busses. + 2. Multi-function Twin Channel Device - Two controllers on one chip. + 3. Command Channel Secondary DMA Engine - Allows scatter gather list + and SCB prefetch. + 4. 64 Byte SCB Support - Allows disconnected, unttagged request table + for all possible target/lun combinations. + 5. Block Move Instruction Support - Doubles the speed of certain + sequencer operations. + 6. `Bayonet' style Scatter Gather Engine - Improves S/G prefetch + performance. + 7. Queuing Registers - Allows queuing of new transactions without + pausing the sequencer. + 8. Multiple Target IDs - Allows the controller to respond to selection + as a target on multiple SCSI IDs. + + Controller Chip Host-Bus Int-Connectors Ext-Connectors Notes + -------------------------------------------------------------------------- + AHA-274X[A] aic7770 EISA SE-50M SE-HD50F + AHA-274X[A]W aic7770 EISA SE-HD68F SE-HD68F + SE-50M + AHA-274X[A]T aic7770 EISA 2 X SE-50M SE-HD50F + AHA-2842 aic7770 VL SE-50M SE-HD50F + AHA-2940AU aic7860 PCI/32 SE-50M SE-HD50F + AVA-2902I aic7860 PCI/32 SE-50M + AVA-2902E aic7860 PCI/32 SE-50M + AVA-2906 aic7856 PCI/32 SE-50M SE-DB25F + APC-7850 aic7850 PCI/32 SE-50M 1 + AVA-2940 aic7860 PCI/32 SE-50M + AHA-2920B aic7860 PCI/32 SE-50M + AHA-2930B aic7860 PCI/32 SE-50M + AHA-2920C aic7856 PCI/32 SE-50M SE-HD50F + AHA-2930C aic7860 PCI/32 SE-50M + AHA-2930C aic7860 PCI/32 SE-50M + AHA-2910C aic7860 PCI/32 SE-50M + AHA-2915C aic7860 PCI/32 SE-50M + AHA-2940AU/CN aic7860 PCI/32 SE-50M SE-HD50F + AHA-2944W aic7870 PCI/32 HVD-HD68F HVD-HD68F + HVD-50M + AHA-3940W aic7870 PCI/32 2 X SE-HD68F SE-HD68F 2 + AHA-2940UW aic7880 PCI/32 SE-HD68F + SE-50M SE-HD68F + AHA-2940U aic7880 PCI/32 SE-50M SE-HD50F + AHA-2940D aic7880 PCI/32 + aHA-2940 A/T aic7880 PCI/32 + AHA-2940D A/T aic7880 PCI/32 + AHA-3940UW aic7880 PCI/32 2 X SE-HD68F SE-HD68F 3 + AHA-3940UWD aic7880 PCI/32 2 X SE-HD68F 2 X SE-VHD68F 3 + AHA-3940U aic7880 PCI/32 2 X SE-50M SE-HD50F 3 + AHA-2944UW aic7880 PCI/32 HVD-HD68F HVD-HD68F + HVD-50M + AHA-3944UWD aic7880 PCI/32 2 X HVD-HD68F 2 X HVD-VHD68F 3 + AHA-4944UW aic7880 PCI/32 + AHA-2930UW aic7880 PCI/32 + AHA-2940UW Pro aic7880 PCI/32 SE-HD68F SE-HD68F 4 + SE-50M + AHA-2940UW/CN aic7880 PCI/32 + AHA-2940UDual aic7895 PCI/32 + AHA-2940UWDual aic7895 PCI/32 + AHA-3940UWD aic7895 PCI/32 + AHA-3940AUW aic7895 PCI/32 + AHA-3940AUWD aic7895 PCI/32 + AHA-3940AU aic7895 PCI/32 + AHA-3944AUWD aic7895 PCI/32 2 X HVD-HD68F 2 X HVD-VHD68F + AHA-2940U2B aic7890 PCI/32 LVD-HD68F LVD-HD68F + AHA-2940U2 OEM aic7891 PCI/64 + AHA-2940U2W aic7890 PCI/32 LVD-HD68F LVD-HD68F + SE-HD68F + SE-50M + AHA-2950U2B aic7891 PCI/64 LVD-HD68F LVD-HD68F + AHA-2930U2 aic7890 PCI/32 LVD-HD68F SE-HD50F + SE-50M + AHA-3950U2B aic7897 PCI/64 + AHA-3950U2D aic7897 PCI/64 + AHA-29160 aic7892 PCI/64-66 + AHA-29160 CPQ aic7892 PCI/64-66 + AHA-29160N aic7892 PCI/32 LVD-HD68F SE-HD50F + SE-50M + AHA-29160LP aic7892 PCI/64-66 + AHA-19160 aic7892 PCI/64-66 + AHA-29150LP aic7892 PCI/64-66 + AHA-29130LP aic7892 PCI/64-66 + AHA-3960D aic7899 PCI/64-66 2 X LVD-HD68F 2 X LVD-VHD68F + LVD-50M + AHA-3960D CPQ aic7899 PCI/64-66 2 X LVD-HD68F 2 X LVD-VHD68F + LVD-50M + AHA-39160 aic7899 PCI/64-66 2 X LVD-HD68F 2 X LVD-VHD68F + LVD-50M + + 1. No BIOS support + 2. DEC21050 PCI-PCI bridge with multiple controller chips on secondary bus + 3. DEC2115X PCI-PCI bridge with multiple controller chips on secondary bus + 4. All three SCSI connectors may be used simultaneously without + SCSI "stub" effects. + +2. Version History + 6.2.36 (June 3rd, 2003) + - Correct code that disables PCI parity error checking. + - Correct and simplify handling of the ignore wide residue + message. The previous code would fail to report a residual + if the transaction data length was even and we received + an IWR message. + - Add support for the 2.5.X EISA framework. + - Update for change in 2.5.X SCSI proc FS interface. + - Correct Domain Validation command-line option parsing. + - When negotiation async via an 8bit WDTR message, send + an SDTR with an offset of 0 to be sure the target + knows we are async. This works around a firmware defect + in the Quantum Atlas 10K. + - Clear PCI error state during driver attach so that we + don't disable memory mapped I/O due to a stray write + by some other driver probe that occurred before we + claimed the controller. + + 6.2.35 (May 14th, 2003) + - Fix a few GCC 3.3 compiler warnings. + - Correct operation on EISA Twin Channel controller. + - Add support for 2.5.X's scsi_report_device_reset(). + + 6.2.34 (May 5th, 2003) + - Fix locking regression instroduced in 6.2.29 that + could cuase a lock order reversal between the io_request_lock + and our per-softc lock. This was only possible on RH9, + SuSE, and kernel.org 2.4.X kernels. + + 6.2.33 (April 30th, 2003) + - Dynamically disable PCI parity error reporting after + 10 errors are reported to the user. These errors are + the result of some other device issuing PCI transactions + with bad parity. Once the user has been informed of the + problem, continuing to report the errors just degrades + our performance. + + 6.2.32 (March 28th, 2003) + - Dynamically sized S/G lists to avoid SCSI malloc + pool fragmentation and SCSI mid-layer deadlock. + + 6.2.28 (January 20th, 2003) + - Domain Validation Fixes + - Add ability to disable PCI parity error checking. + - Enhanced Memory Mapped I/O probe + + 6.2.20 (November 7th, 2002) + - Added Domain Validation. + +3. Command Line Options + + WARNING: ALTERING OR ADDING THESE DRIVER PARAMETERS + INCORRECTLY CAN RENDER YOUR SYSTEM INOPERABLE. + USE THEM WITH CAUTION. + + Edit the file "modprobe.conf" in the directory /etc and add/edit a + line containing 'options aic7xxx aic7xxx=[command[,command...]]' where + 'command' is one or more of the following: + ----------------------------------------------------------------- + Option: verbose + Definition: enable additional informative messages during + driver operation. + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: debug:[value] + Definition: Enables various levels of debugging information + Possible Values: 0x0000 = no debugging, 0xffff = full debugging + Default Value: 0x0000 + ----------------------------------------------------------------- + Option: no_probe + Option: probe_eisa_vl + Definition: Do not probe for EISA/VLB controllers. + This is a toggle. If the driver is compiled + to not probe EISA/VLB controllers by default, + specifying "no_probe" will enable this probing. + If the driver is compiled to probe EISA/VLB + controllers by default, specifying "no_probe" + will disable this probing. + Possible Values: This option is a toggle + Default Value: EISA/VLB probing is disabled by default. + ----------------------------------------------------------------- + Option: pci_parity + Definition: Toggles the detection of PCI parity errors. + On many motherboards with VIA chipsets, + PCI parity is not generated correctly on the + PCI bus. It is impossible for the hardware to + differentiate between these "spurious" parity + errors and real parity errors. The symptom of + this problem is a stream of the message: + "scsi0: Data Parity Error Detected during address or write data phase" + output by the driver. + Possible Values: This option is a toggle + Default Value: PCI Parity Error reporting is disabled + ----------------------------------------------------------------- + Option: no_reset + Definition: Do not reset the bus during the initial probe + phase + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: extended + Definition: Force extended translation on the controller + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: periodic_otag + Definition: Send an ordered tag periodically to prevent + tag starvation. Needed for some older devices + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: reverse_scan + Definition: Probe the scsi bus in reverse order, starting + with target 15 + Possible Values: This option is a flag + Default Value: disabled + ----------------------------------------------------------------- + Option: global_tag_depth:[value] + Definition: Global tag depth for all targets on all busses. + This option sets the default tag depth which + may be selectively overridden vi the tag_info + option. + Possible Values: 1 - 253 + Default Value: 32 + ----------------------------------------------------------------- + Option: tag_info:{{value[,value...]}[,{value[,value...]}...]} + Definition: Set the per-target tagged queue depth on a + per controller basis. Both controllers and targets + may be ommitted indicating that they should retain + the default tag depth. + Examples: tag_info:{{16,32,32,64,8,8,,32,32,32,32,32,32,32,32,32} + On Controller 0 + specifies a tag depth of 16 for target 0 + specifies a tag depth of 64 for target 3 + specifies a tag depth of 8 for targets 4 and 5 + leaves target 6 at the default + specifies a tag depth of 32 for targets 1,2,7-15 + All other targets retain the default depth. + + tag_info:{{},{32,,32}} + On Controller 1 + specifies a tag depth of 32 for targets 0 and 2 + All other targets retain the default depth. + + Possible Values: 1 - 253 + Default Value: 32 + ----------------------------------------------------------------- + Option: seltime:[value] + Definition: Specifies the selection timeout value + Possible Values: 0 = 256ms, 1 = 128ms, 2 = 64ms, 3 = 32ms + Default Value: 0 + ----------------------------------------------------------------- + Option: dv: {value[,value...]} + Definition: Set Domain Validation Policy on a per-controller basis. + Controllers may be ommitted indicating that + they should retain the default read streaming setting. + Example: dv:{-1,0,,1,1,0} + On Controller 0 leave DV at its default setting. + On Controller 1 disable DV. + Skip configuration on Controller 2. + On Controllers 3 and 4 enable DV. + On Controller 5 disable DV. + + Possible Values: < 0 Use setting from serial EEPROM. + 0 Disable DV + > 0 Enable DV + + Default Value: SCSI-Select setting on controllers with a SCSI Select + option for DV. Otherwise, on for controllers supporting + U160 speeds and off for all other controller types. + ----------------------------------------------------------------- + + Example: + 'options aic7xxx aic7xxx=verbose,no_probe,tag_info:{{},{,,10}},seltime:1" + enables verbose logging, Disable EISA/VLB probing, + and set tag depth on Controller 1/Target 2 to 10 tags. + +3. Contacting Adaptec + + A Technical Support Identification (TSID) Number is required for + Adaptec technical support. + - The 12-digit TSID can be found on the white barcode-type label + included inside the box with your product. The TSID helps us + provide more efficient service by accurately identifying your + product and support status. + Support Options + - Search the Adaptec Support Knowledgebase (ASK) at + http://ask.adaptec.com for articles, troubleshooting tips, and + frequently asked questions for your product. + - For support via Email, submit your question to Adaptec's + Technical Support Specialists at http://ask.adaptec.com. + + North America + - Visit our Web site at http://www.adaptec.com. + - To speak with a Fibre Channel/RAID/External Storage Technical + Support Specialist, call 1-321-207-2000, + Hours: Monday-Friday, 3:00 A.M. to 5:00 P.M., PST. + (Not open on holidays) + - For Technical Support in all other technologies including + SCSI, call 1-408-934-7274, + Hours: Monday-Friday, 6:00 A.M. to 5:00 P.M., PST. + (Not open on holidays) + - For after hours support, call 1-800-416-8066 ($99/call, + $149/call on holidays) + - To order Adaptec products including software and cables, call + 1-800-442-7274 or 1-408-957-7274. You can also visit our + online store at http://www.adaptecstore.com + + Europe + - Visit our Web site at http://www.adaptec-europe.com. + - English and French: To speak with a Technical Support + Specialist, call one of the following numbers: + - English: +32-2-352-3470 + - French: +32-2-352-3460 + Hours: Monday-Thursday, 10:00 to 12:30, 13:30 to 17:30 CET + Friday, 10:00 to 12:30, 13:30 to 16:30 CET + - German: To speak with a Technical Support Specialist, + call +49-89-456-40660 + Hours: Monday-Thursday, 09:30 to 12:30, 13:30 to 16:30 CET + Friday, 09:30 to 12:30, 13:30 to 15:00 CET + - To order Adaptec products, including accessories and cables: + - UK: +0800-96-65-26 or fax +0800-731-02-95 + - Other European countries: +32-11-300-379 + + Australia and New Zealand + - Visit our Web site at http://www.adaptec.com.au. + - To speak with a Technical Support Specialist, call + +612-9416-0698 + Hours: Monday-Friday, 10:00 A.M. to 4:30 P.M., EAT + (Not open on holidays) + + Japan + - To speak with a Technical Support Specialist, call + +81-3-5308-6120 + Hours: Monday-Friday, 9:00 a.m. to 12:00 p.m., 1:00 p.m. to + 6:00 p.m. TSC + + Hong Kong and China + - To speak with a Technical Support Specialist, call + +852-2869-7200 + Hours: Monday-Friday, 10:00 to 17:00. + - Fax Technical Support at +852-2869-7100. + + Singapore + - To speak with a Technical Support Specialist, call + +65-245-7470 + Hours: Monday-Friday, 10:00 to 17:00. + - Fax Technical Support at +852-2869-7100 + +------------------------------------------------------------------- +/* + * Copyright (c) 2003 Adaptec Inc. 691 S. Milpitas Blvd., Milpitas CA 95035 USA. + * All rights reserved. + * + * You are permitted to redistribute, use and modify this README file in whole + * or in part in conjunction with redistribution of software governed by the + * General Public License, provided that the following conditions are met: + * 1. Redistributions of README file must retain the above copyright + * notice, this list of conditions, and the following disclaimer, + * without modification. + * 2. The name of the author may not be used to endorse or promote products + * derived from this software without specific prior written permission. + * 3. Modifications or new contributions must be attributed in a copyright + * notice identifying the author ("Contributor") and added below the + * original copyright notice. The copyright notice is for purposes of + * identifying contributors and should not be deemed as permission to alter + * the permissions given by Adaptec. + * + * THIS README FILE IS PROVIDED BY ADAPTEC AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, ANY + * WARRANTIES OF NON-INFRINGEMENT OR THE IMPLIED WARRANTIES OF MERCHANTABILITY + * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL + * ADAPTEC OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED + * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS README + * FILE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ diff --git a/Documentation/scsi/aic7xxx_old.txt b/Documentation/scsi/aic7xxx_old.txt new file mode 100644 index 000000000000..79e5ac6cb6ff --- /dev/null +++ b/Documentation/scsi/aic7xxx_old.txt @@ -0,0 +1,511 @@ + AIC7xxx Driver for Linux + +Introduction +---------------------------- +The AIC7xxx SCSI driver adds support for Adaptec (http://www.adaptec.com) +SCSI controllers and chipsets. Major portions of the driver and driver +development are shared between both Linux and FreeBSD. Support for the +AIC-7xxx chipsets have been in the default Linux kernel since approximately +linux-1.1.x and fairly stable since linux-1.2.x, and are also in FreeBSD +2.1.0 or later. + + Supported cards/chipsets + ---------------------------- + Adaptec Cards + ---------------------------- + AHA-274x + AHA-274xT + AHA-2842 + AHA-2910B + AHA-2920C + AHA-2930 + AHA-2930U + AHA-2930CU + AHA-2930U2 + AHA-2940 + AHA-2940W + AHA-2940U + AHA-2940UW + AHA-2940UW-PRO + AHA-2940AU + AHA-2940U2W + AHA-2940U2 + AHA-2940U2B + AHA-2940U2BOEM + AHA-2944D + AHA-2944WD + AHA-2944UD + AHA-2944UWD + AHA-2950U2 + AHA-2950U2W + AHA-2950U2B + AHA-29160M + AHA-3940 + AHA-3940U + AHA-3940W + AHA-3940UW + AHA-3940AUW + AHA-3940U2W + AHA-3950U2B + AHA-3950U2D + AHA-3960D + AHA-39160M + AHA-3985 + AHA-3985U + AHA-3985W + AHA-3985UW + + Motherboard Chipsets + ---------------------------- + AIC-777x + AIC-785x + AIC-786x + AIC-787x + AIC-788x + AIC-789x + AIC-3860 + + Bus Types + ---------------------------- + W - Wide SCSI, SCSI-3, 16bit bus, 68pin connector, will also support + SCSI-1/SCSI-2 50pin devices, transfer rates up to 20MB/s. + U - Ultra SCSI, transfer rates up to 40MB/s. + U2- Ultra 2 SCSI, transfer rates up to 80MB/s. + D - Differential SCSI. + T - Twin Channel SCSI. Up to 14 SCSI devices. + + AHA-274x - EISA SCSI controller + AHA-284x - VLB SCSI controller + AHA-29xx - PCI SCSI controller + AHA-394x - PCI controllers with two separate SCSI controllers on-board. + AHA-398x - PCI RAID controllers with three separate SCSI controllers + on-board. + + Not Supported Devices + ------------------------------ + Adaptec Cards + ---------------------------- + AHA-2920 (Only the cards that use the Future Domain chipset are not + supported, any 2920 cards based on Adaptec AIC chipsets, + such as the 2920C, are supported) + AAA-13x Raid Adapters + AAA-113x Raid Port Card + + Motherboard Chipsets + ---------------------------- + AIC-7810 + + Bus Types + ---------------------------- + R - Raid Port busses are not supported. + + The hardware RAID devices sold by Adaptec are *NOT* supported by this + driver (and will people please stop emailing me about them, they are + a totally separate beast from the bare SCSI controllers and this driver + can not be retrofitted in any sane manner to support the hardware RAID + features on those cards - Doug Ledford). + + + People + ------------------------------ + Justin T Gibbs gibbs@plutotech.com + (BSD Driver Author) + Dan Eischen deischen@iworks.InterWorks.org + (Original Linux Driver Co-maintainer) + Dean Gehnert deang@teleport.com + (Original Linux FTP/patch maintainer) + Jess Johnson jester@frenzy.com + (AIC7xxx FAQ author) + Doug Ledford dledford@redhat.com + (Current Linux aic7xxx-5.x.x Driver/Patch/FTP maintainer) + + Special thanks go to John Aycock (aycock@cpsc.ucalgary.ca), the original + author of the driver. John has since retired from the project. Thanks + again for all his work! + + Mailing list + ------------------------------ + There is a mailing list available for users who want to track development + and converse with other users and developers. This list is for both + FreeBSD and Linux support of the AIC7xxx chipsets. + + To subscribe to the AIC7xxx mailing list send mail to the list server, + with "subscribe AIC7xxx" in the body (no Subject: required): + To: majordomo@FreeBSD.ORG + --- + subscribe AIC7xxx + + To unsubscribe from the list, send mail to the list server with: + To: majordomo@FreeBSD.ORG + --- + unsubscribe AIC7xxx + + Send regular messages and replies to: AIC7xxx@FreeBSD.ORG + + Boot Command line options + ------------------------------ + "aic7xxx=no_reset" - Eliminate the SCSI bus reset during startup. + Some SCSI devices need the initial reset that this option disables + in order to work. If you have problems at bootup, please make sure + you aren't using this option. + + "aic7xxx=reverse_scan" - Certain PCI motherboards scan for devices at + bootup by scanning from the highest numbered PCI device to the + lowest numbered PCI device, others do just the opposite and scan + from lowest to highest numbered PCI device. There is no reliable + way to autodetect this ordering. So, we default to the most common + order, which is lowest to highest. Then, in case your motherboard + scans from highest to lowest, we have this option. If your BIOS + finds the drives on controller A before controller B but the linux + kernel finds your drives on controller B before A, then you should + use this option. + + "aic7xxx=extended" - Force the driver to detect extended drive translation + on your controller. This helps those people who have cards without + a SEEPROM make sure that linux and all other operating systems think + the same way about your hard drives. + + "aic7xxx=scbram" - Some cards have external SCB RAM that can be used to + give the card more hardware SCB slots. This allows the driver to use + that SCB RAM. Without this option, the driver won't touch the SCB + RAM because it is known to cause problems on a few cards out there + (such as 3985 class cards). + + "aic7xxx=irq_trigger:x" - Replace x with either 0 or 1 to force the kernel + to use the correct IRQ type for your card. This only applies to EISA + based controllers. On these controllers, 0 is for Edge triggered + interrupts, and 1 is for Level triggered interrupts. If you aren't + sure or don't know which IRQ trigger type your EISA card uses, then + let the kernel autodetect the trigger type. + + "aic7xxx=verbose" - This option can be used in one of two ways. If you + simply specify aic7xxx=verbose, then the kernel will automatically + pick the default set of verbose messages for you to see. + Alternatively, you can specify the command as + "aic7xxx=verbose:0xXXXX" where the X entries are replaced with + hexadecimal digits. This option is a bit field type option. For + a full listing of the available options, search for the + #define VERBOSE_xxxxxx lines in the aic7xxx.c file. If you want + verbose messages, then it is recommended that you simply use the + aic7xxx=verbose variant of this command. + + "aic7xxx=pci_parity:x" - This option controls whether or not the driver + enables PCI parity error checking on the PCI bus. By default, this + checking is disabled. To enable the checks, simply specify pci_parity + with no value afterwords. To reverse the parity from even to odd, + supply any number other than 0 or 255. In short: + pci_parity - Even parity checking (even is the normal PCI parity) + pci_parity:x - Where x > 0, Odd parity checking + pci_parity:0 - No check (default) + NOTE: In order to get Even PCI parity checking, you must use the + version of the option that does not include the : and a number at + the end (unless you want to enter exactly 2^32 - 1 as the number). + + "aic7xxx=no_probe" - This option will disable the probing for any VLB + based 2842 controllers and any EISA based controllers. This is + needed on certain newer motherboards where the normal EISA I/O ranges + have been claimed by other PCI devices. Probing on those machines + will often result in the machine crashing or spontaneously rebooting + during startup. Examples of machines that need this are the + Dell PowerEdge 6300 machines. + + "aic7xxx=seltime:2" - This option controls how long the card waits + during a device selection sequence for the device to respond. + The original SCSI spec says that this "should be" 256ms. This + is generally not required with modern devices. However, some + very old SCSI I devices need the full 256ms. Most modern devices + can run fine with only 64ms. The default for this option is + 64ms. If you need to change this option, then use the following + table to set the proper value in the example above: + 0 - 256ms + 1 - 128ms + 2 - 64ms + 3 - 32ms + + "aic7xxx=panic_on_abort" - This option is for debugging and will cause + the driver to panic the linux kernel and freeze the system the first + time the drivers abort or reset routines are called. This is most + helpful when some problem causes infinite reset loops that scroll too + fast to see. By using this option, you can write down what the errors + actually are and send that information to me so it can be fixed. + + "aic7xxx=dump_card" - This option will print out the *entire* set of + configuration registers on the card during the init sequence. This + is a debugging aid used to see exactly what state the card is in + when we finally finish our initialization routines. If you don't + have documentation on the chipsets, this will do you absolutely + no good unless you are simply trying to write all the information + down in order to send it to me. + + "aic7xxx=dump_sequencer" - This is the same as the above options except + that instead of dumping the register contents on the card, this + option dumps the contents of the sequencer program RAM. This gives + the ability to verify that the instructions downloaded to the + card's sequencer are indeed what they are suppossed to be. Again, + unless you have documentation to tell you how to interpret these + numbers, then it is totally useless. + + "aic7xxx=override_term:0xffffffff" - This option is used to force the + termination on your SCSI controllers to a particular setting. This + is a bit mask variable that applies for up to 8 aic7xxx SCSI channels. + Each channel gets 4 bits, divided as follows: + bit 3 2 1 0 + | | | Enable/Disable Single Ended Low Byte Termination + | | En/Disable Single Ended High Byte Termination + | En/Disable Low Byte LVD Termination + En/Disable High Byte LVD Termination + + The upper 2 bits that deal with LVD termination only apply to Ultra2 + controllers. Futhermore, due to the current Ultra2 controller + designs, these bits are tied together such that setting either bit + enables both low and high byte LVD termination. It is not possible + to only set high or low byte LVD termination in this manner. This is + an artifact of the BIOS definition on Ultra2 controllers. For other + controllers, the only important bits are the two lowest bits. Setting + the higher bits on non-Ultra2 controllers has no effect. A few + examples of how to use this option: + + Enable low and high byte termination on a non-ultra2 controller that + is the first aic7xxx controller (the correct bits are 0011), + aic7xxx=override_term:0x3 + + Enable all termination on the third aic7xxx controller, high byte + termination on the second aic7xxx controller, and low and high byte + SE termination on the first aic7xxx controller + (bits are 1111 0010 0011), + aic7xxx=override_term:0xf23 + + No attempt has been made to make this option non-cryptic. It really + shouldn't be used except in dire circumstances, and if that happens, + I'm probably going to be telling you what to set this to anyway :) + + "aic7xxx=stpwlev:0xffffffff" - This option is used to control the STPWLEV + bit in the DEVCONFIG PCI register. Currently, this is one of the + very few registers that we have absolutely *no* way of detecting + what the variable should be. It depends entirely on how the chipset + and external terminators were coupled by the card/motherboard maker. + Further, a chip reset (at power up) always sets this bit to 0. If + there is no BIOS to run on the chipset/card (such as with a 2910C + or a motherboard controller with the BIOS totally disabled) then + the variable may not get set properly. Of course, if the proper + setting was 0, then that's what it would be after the reset, but if + the proper setting is actually 1.....you get the picture. Now, since + we can't detect this at all, I've added this option to force the + setting. If you have a BIOS on your controller then you should never + need to use this option. However, if you are having lots of SCSI + reset problems and can't seem to get them knocked out, this may help. + + Here's a test to know for certain if you need this option. Make + a boot floppy that you can use to boot your computer up and that + will detect the aic7xxx controller. Next, power down your computer. + While it's down, unplug all SCSI cables from your Adaptec SCSI + controller. Boot the system back up to the Adaptec EZ-SCSI BIOS + and then make sure that termination is enabled on your adapter (if + you have an Adaptec BIOS of course). Next, boot up the floppy you + made and wait for it to detect the aic7xxx controller. If the kernel + finds the controller fine, says scsi : x hosts and then tries to + detect your devices like normal, up to the point where it fails to + mount your root file system and panics, then you're fine. If, on + the other hand, the system goes into an infinite reset loop, then + you need to use this option and/or the previous option to force the + proper termination settings on your controller. If this happens, + then you next need to figure out what your settings should be. + + To find the correct settings, power your machine back down, connect + back up the SCSI cables, and boot back into your machine like normal. + However, boot with the aic7xxx=verbose:0x39 option. Record the + initial DEVCONFIG values for each of your aic7xxx controllers as + they are listed, and also record what the machine is detecting as + the proper termination on your controllers. NOTE: the order in + which the initial DEVCONFIG values are printed out is not gauranteed + to be the same order as the SCSI controllers are registered. The + above option and this option both work on the order of the SCSI + controllers as they are registered, so make sure you match the right + DEVCONFIG values with the right controllers if you have more than + one aic7xxx controller. + + Once you have the detected termination settings and the initial + DEVCONFIG values for each controller, then figure out what the + termination on each of the controllers *should* be. Hopefully, that + part is correct, but it could possibly be wrong if there is + bogus cable detection logic on your controller or something similar. + If all the controllers have the correct termination settings, then + don't set the aic7xxx=override_term variable at all, leave it alone. + Next, on any controllers that go into an infinite reset loop when + you unplug all the SCSI cables, get the starting DEVCONFIG value. + If the initial DEVCONFIG value is divisible by 2, then the correct + setting for that controller is 0. If it's an odd number, then + the correct setting for that controller is 1. For any other + controllers that didn't have an infinite reset problem, then reverse + the above options. If DEVCONFIG was even, then the correct setting + is 1, if not then the correct setting is 0. + + Now that you know what the correct setting was for each controller, + we need to encode that into the aic7xxx=stpwlev:0x... variable. + This variable is a bit field encoded variable. Bit 0 is for the first + aic7xxx controller, bit 1 for the next, etc. Put all these bits + together and you get a number. For example, if the third aic7xxx + needed a 1, but the second and first both needed a 0, then the bits + would be 100 in binary. This then translates to 0x04. You would + therefore set aic7xxx=stpwlev:0x04. This is fairly standard binary + to hexadecimal conversions here. If you aren't up to speed on the + binary->hex conversion then send an email to the aic7xxx mailing + list and someone can help you out. + + "aic7xxx=tag_info:{{8,8..},{8,8..},..}" - This option is used to disable + or enable Tagged Command Queueing (TCQ) on specific devices. As of + driver version 5.1.11, TCQ is now either on or off by default + according to the setting you choose during the make config process. + In order to en/disable TCQ for certian devices at boot time, a user + may use this boot param. The driver will then parse this message out + and en/disable the specific device entries that are present based upon + the value given. The param line is parsed in the following manner: + + { - first instance indicates the start of this parameter values + second instance is the start of entries for a particular + device entry + } - end the entries for a particular host adapter, or end the entire + set of parameter entries + , - move to next entry. Inside of a set of device entries, this + moves us to the next device on the list. Outside of device + entries, this moves us to the next host adapter + . - Same effect as , but is safe to use with insmod. + x - the number to enter into the array at this position. + 0 = Enable tagged queueing on this device and use the default + queue depth + 1-254 = Enable tagged queueing on this device and use this + number as the queue depth + 255 = Disable tagged queueing on this device. + Note: anything above 32 for an actual queue depth is wasteful + and not recommended. + + A few examples of how this can be used: + + tag_info:{{8,12,,0,,255,4}} + This line will only effect the first aic7xxx card registered. It + will set scsi id 0 to a queue depth of 8, id 1 to 12, leave id 2 + at the default, set id 3 to tagged queueing enabled and use the + default queue depth, id 4 default, id 5 disabled, and id 6 to 4. + Any not specified entries stay at the default value, repeated + commas with no value specified will simply increment to the next id + without changing anything for the missing values. + + tag_info:{,,,{,,,255}} + First, second, and third adapters at default values. Fourth + adapter, id 3 is disabled. Notice that leading commas simply + increment what the first number effects, and there are no need + for trailing commas. When you close out an adapter, or the + entire entry, anything not explicitly set stays at the default + value. + + A final note on this option. The scanner I used for this isn't + perfect or highly robust. If you mess the line up, the worst that + should happen is that the line will get ignored. If you don't + close out the entire entry with the final bracket, then any other + aic7xxx options after this will get ignored. So, in general, be + sure of what you are entering, and after you have it right, just + add it to the lilo.conf file so there won't be any mistakes. As + a means of checking this parser, the entire tag_info array for + each card is now printed out in the /proc/scsi/aic7xxx/x file. You + can use that to verify that your options were parsed correctly. + + Boot command line options may be combined to form the proper set of options + a user might need. For example, the following is valid: + + aic7xxx=verbose,extended,irq_trigger:1 + + The only requirement is that individual options be separated by a comma or + a period on the command line. + + Module Loading command options + ------------------------------ + When loading the aic7xxx driver as a module, the exact same options are + available to the user. However, the syntax to specify the options changes + slightly. For insmod, you need to wrap the aic7xxx= argument in quotes + and replace all ',' with '.'. So, for example, a valid insmod line + would be: + + insmod aic7xxx aic7xxx='verbose.irq_trigger:1.extended' + + This line should result in the *exact* same behaviour as if you typed + it in at the lilo prompt and the driver was compiled into the kernel + instead of being a module. The reason for the single quote is so that + the shell won't try to interpret anything in the line, such as {. + Insmod assumes any options starting with a letter instead of a number + is a character string (which is what we want) and by switching all of + the commas to periods, insmod won't interpret this as more than one + string and write junk into our binary image. I consider it a bug in + the insmod program that even if you wrap your string in quotes (quotes + that pass the shell mind you and that insmod sees) it still treates + a comma inside of those quotes as starting a new variable, resulting + in memory scribbles if you don't switch the commas to periods. + + + Kernel Compile options + ------------------------------ + The various kernel compile time options for this driver are now fairly + well documented in the file Documentation/Configure.help. In order to + see this documentation, you need to use one of the advanced configuration + programs (menuconfig and xconfig). If you are using the "make menuconfig" + method of configuring your kernel, then you would simply highlight the + option in question and hit the ? key. If you are using the "make xconfig" + method of configuring your kernel, then simply click on the help button + next to the option you have questions about. The help information from + the Configure.help file will then get automatically displayed. + + /proc support + ------------------------------ + The /proc support for the AIC7xxx can be found in the /proc/scsi/aic7xxx/ + directory. That directory contains a file for each SCSI controller in + the system. Each file presents the current configuration and transfer + statistics (enabled with #define in aic7xxx.c) for each controller. + + Thanks to Michael Neuffer for his upper-level SCSI help, and + Matthew Jacob for statistics support. + + Debugging the driver + ------------------------------ + Should you have problems with this driver, and would like some help in + getting them solved, there are a couple debugging items built into + the driver to facilitate getting the needed information from the system. + In general, I need a complete description of the problem, with as many + logs as possible concerning what happens. To help with this, there is + a command option aic7xxx=panic_on_abort. This option, when set, forces + the driver to panic the kernel on the first SCSI abort issued by the + mid level SCSI code. If your system is going to reset loops and you + can't read the screen, then this is what you need. Not only will it + stop the system, but it also prints out a large amount of state + information in the process. Second, if you specify the option + "aic7xxx=verbose:0x1ffff", the system will print out *SOOOO* much + information as it runs that you won't be able to see anything. + However, this can actually be very useful if your machine simply + locks up when trying to boot, since it will pin-point what was last + happening (in regards to the aic7xxx driver) immediately prior to + the lockup. This is really only useful if your machine simply can + not boot up successfully. If you can get your machine to run, then + this will produce far too much information. + + FTP sites + ------------------------------ + ftp://ftp.redhat.com/pub/aic/ + - Out of date. I used to keep stuff here, but too many people + complained about having a hard time getting into Red Hat's ftp + server. So use the web site below instead. + ftp://ftp.pcnet.com/users/eischen/Linux/ + - Dan Eischen's driver distribution area + ftp://ekf2.vsb.cz/pub/linux/kernel/aic7xxx/ftp.teleport.com/ + - European Linux mirror of Teleport site + + Web sites + ------------------------------ + http://people.redhat.com/dledford/ + - My web site, also the primary aic7xxx site with several related + pages. + +Dean W. Gehnert +deang@teleport.com + +$Revision: 3.0 $ + +Modified by Doug Ledford 1998-2000 + diff --git a/Documentation/scsi/cpqfc.txt b/Documentation/scsi/cpqfc.txt new file mode 100644 index 000000000000..dd33e61c0645 --- /dev/null +++ b/Documentation/scsi/cpqfc.txt @@ -0,0 +1,272 @@ +Notes for CPQFCTS driver for Compaq Tachyon TS +Fibre Channel Host Bus Adapter, PCI 64-bit, 66MHz +for Linux (RH 6.1, 6.2 kernel 2.2.12-32, 2.2.14-5) +SMP tested +Tested in single and dual HBA configuration, 32 and 64bit busses, +33 and 66MHz. Only supports FC-AL. +SEST size 512 Exchanges (simultaneous I/Os) limited by module kmalloc() + max of 128k bytes contiguous. + +Ver 2.5.4 Oct 03, 2002 + * fixed memcpy of sense buffer in ioctl to copy the smaller defined size +Ver 2.5.3 Aug 01, 2002 + * fix the passthru ioctl to handle the Scsi_Cmnd->request being a pointer +Ver 2.5.1 Jul 30, 2002 + * fix ioctl to pay attention to the specified LUN. +Ver 2.5.0 Nov 29, 2001 + * eliminated io_request_lock. This change makes the driver specific + to the 2.5.x kernels. + * silenced excessively noisy printks. + +Ver 2.1.2 July 23, 2002 + * initialize DumCmnd->lun in cpqfcTS_ioctl (used in fcFindLoggedInPorts as LUN index) + +Ver 2.1.1 Oct 18, 2001 + * reinitialize Cmnd->SCp.sent_command (used to identify commands as + passthrus) on calling scsi_done, since the scsi mid layer does not + use (or reinitialize) this field to prevent subsequent comands from + having it set incorrectly. + +Ver 2.1.0 Aug 27, 2001 + * Revise driver to use new kernel 2.4.x PCI DMA API, instead of + virt_to_bus(). (enables driver to work w/ ia64 systems with >2Gb RAM.) + Rework main scatter-gather code to handle cases where SG element + lengths are larger than 0x7FFFF bytes and use as many scatter + gather pages as necessary. (Steve Cameron) + * Makefile changes to bring cpqfc into line w/ rest of SCSI drivers + (thanks to Keith Owens) + +Ver 2.0.5 Aug 06, 2001 + * Reject non-existent luns in the driver rather than letting the + hardware do it. (some HW behaves differently than others in this area.) + * Changed Makefile to rely on "make dep" instead of explicit dependencies + * ifdef'ed out fibre channel analyzer triggering debug code + * fixed a jiffies wrapping issue + +Ver 2.0.4 Aug 01, 2001 + * Incorporated fix for target device reset from Steeleye + * Fixed passthrough ioctl so it doesn't hang. + * Fixed hang in launch_FCworker_thread() that occurred on some machines. + * Avoid problem when number of volumes in a single cabinet > 8 + +Ver 2.0.2 July 23, 2001 + Changed the semiphore changes so the driver would compile in 2.4.7. + This version is for 2.4.7 and beyond. + +Ver 2.0.1 May 7, 2001 + Merged version 1.3.6 fixes into version 2.0.0. + +Ver 2.0.0 May 7, 2001 + Fixed problem so spinlock is being initialized to UNLOCKED. + Fixed updated driver so it compiles in the 2.4 tree. + + Ver 1.3.6 Feb 27, 2001 + Added Target_Device_Reset function for SCSI error handling + Fixed problem with not reseting addressing mode after implicit logout + + +Ver 1.3.4 Sep 7, 2000 + Added Modinfo information + Fixed problem with statically linking the driver + +Ver 1.3.3, Aug 23, 2000 + Fixed device/function number in ioctl + +Ver 1.3.2, July 27, 2000 + Add include for Alpha compile on 2.2.14 kernel (cpq*i2c.c) + Change logic for different FCP-RSP sense_buffer location for HSG80 target + And search for Agilent Tachyon XL2 HBAs (not finished! - in test) + +Tested with +(storage): + Compaq RA-4x000, RAID firmware ver 2.40 - 2.54 + Seagate FC drives model ST39102FC, rev 0006 + Hitachi DK31CJ-72FC rev J8A8 + IBM DDYF-T18350R rev F60K + Compaq FC-SCSI bridge w/ DLT 35/70 Gb DLT (tape) +(servers): + Compaq PL-1850R + Compaq PL-6500 Xeon (400MHz) + Compaq PL-8500 (500MHz, 66MHz, 64bit PCI) + Compaq Alpha DS20 (RH 6.1) +(hubs): + Vixel Rapport 1000 (7-port "dumb") + Gadzoox Gibralter (12-port "dumb") + Gadzoox Capellix 2000, 3000 +(switches): + Brocade 2010, 2400, 2800, rev 2.0.3a (& later) + Gadzoox 3210 (Fabric blade beta) + Vixel 7100 (Fabric beta firmare - known hot plug issues) +using "qa_test" (esp. io_test script) suite modified from Unix tests. + +Installation: +make menuconfig + (select SCSI low-level, Compaq FC HBA) +make modules +make modules_install + +e.g. insmod -f cpqfc + +Due to Fabric/switch delays, driver requires 4 seconds +to initialize. If adapters are found, there will be a entries at +/proc/scsi/cpqfcTS/* + +sample contents of startup messages + +************************* + scsi_register allocating 3596 bytes for CPQFCHBA + ioremap'd Membase: c887e600 + HBA Tachyon RevId 1.2 +Allocating 119808 for 576 Exchanges @ c0dc0000 +Allocating 112904 for LinkQ @ c0c20000 (576 elements) +Allocating 110600 for TachSEST for 512 Exchanges + cpqfcTS: writing IMQ BASE 7C0000h PI 7C4000h + cpqfcTS: SEST c0e40000(virt): Wrote base E40000h @ c887e740 +cpqfcTS: New FC port 0000E8h WWN: 500507650642499D SCSI Chan/Trgt 0/0 +cpqfcTS: New FC port 0000EFh WWN: 50000E100000D5A6 SCSI Chan/Trgt 0/1 +cpqfcTS: New FC port 0000E4h WWN: 21000020370097BB SCSI Chan/Trgt 0/2 +cpqfcTS: New FC port 0000E2h WWN: 2100002037009946 SCSI Chan/Trgt 0/3 +cpqfcTS: New FC port 0000E1h WWN: 21000020370098FE SCSI Chan/Trgt 0/4 +cpqfcTS: New FC port 0000E0h WWN: 21000020370097B2 SCSI Chan/Trgt 0/5 +cpqfcTS: New FC port 0000DCh WWN: 2100002037006CC1 SCSI Chan/Trgt 0/6 +cpqfcTS: New FC port 0000DAh WWN: 21000020370059F6 SCSI Chan/Trgt 0/7 +cpqfcTS: New FC port 00000Fh WWN: 500805F1FADB0E20 SCSI Chan/Trgt 0/8 +cpqfcTS: New FC port 000008h WWN: 500805F1FADB0EBA SCSI Chan/Trgt 0/9 +cpqfcTS: New FC port 000004h WWN: 500805F1FADB1EB9 SCSI Chan/Trgt 0/10 +cpqfcTS: New FC port 000002h WWN: 500805F1FADB1ADE SCSI Chan/Trgt 0/11 +cpqfcTS: New FC port 000001h WWN: 500805F1FADBA2CA SCSI Chan/Trgt 0/12 +scsi4 : Compaq FibreChannel HBA Tachyon TS HPFC-5166A/1.2: WWN 500508B200193F50 + on PCI bus 0 device 0xa0fc irq 5 IObaseL 0x3400, MEMBASE 0xc6ef8600 +PCI bus width 32 bits, bus speed 33 MHz +FCP-SCSI Driver v1.3.0 +GBIC detected: Short-wave. LPSM 0h Monitor +scsi : 5 hosts. + Vendor: IBM Model: DDYF-T18350R Rev: F60K + Type: Direct-Access ANSI SCSI revision: 03 +Detected scsi disk sdb at scsi4, channel 0, id 0, lun 0 + Vendor: HITACHI Model: DK31CJ-72FC Rev: J8A8 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdc at scsi4, channel 0, id 1, lun 0 + Vendor: SEAGATE Model: ST39102FC Rev: 0006 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdd at scsi4, channel 0, id 2, lun 0 + Vendor: SEAGATE Model: ST39102FC Rev: 0006 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sde at scsi4, channel 0, id 3, lun 0 + Vendor: SEAGATE Model: ST39102FC Rev: 0006 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdf at scsi4, channel 0, id 4, lun 0 + Vendor: SEAGATE Model: ST39102FC Rev: 0006 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdg at scsi4, channel 0, id 5, lun 0 + Vendor: SEAGATE Model: ST39102FC Rev: 0006 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdh at scsi4, channel 0, id 6, lun 0 + Vendor: SEAGATE Model: ST39102FC Rev: 0006 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdi at scsi4, channel 0, id 7, lun 0 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.48 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdj at scsi4, channel 0, id 8, lun 0 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.48 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdk at scsi4, channel 0, id 8, lun 1 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.40 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdl at scsi4, channel 0, id 9, lun 0 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.40 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdm at scsi4, channel 0, id 9, lun 1 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.54 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdn at scsi4, channel 0, id 10, lun 0 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.54 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdo at scsi4, channel 0, id 11, lun 0 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.54 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdp at scsi4, channel 0, id 11, lun 1 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.54 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdq at scsi4, channel 0, id 12, lun 0 + Vendor: COMPAQ Model: LOGICAL VOLUME Rev: 2.54 + Type: Direct-Access ANSI SCSI revision: 02 +Detected scsi disk sdr at scsi4, channel 0, id 12, lun 1 +resize_dma_pool: unknown device type 12 +resize_dma_pool: unknown device type 12 +SCSI device sdb: hdwr sector= 512 bytes. Sectors= 35843670 [17501 MB] [17.5 GB] + sdb: sdb1 +SCSI device sdc: hdwr sector= 512 bytes. Sectors= 144410880 [70513 MB] [70.5 GB] + sdc: sdc1 +SCSI device sdd: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB] [8.7 GB] + sdd: sdd1 +SCSI device sde: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB] [8.7 GB] + sde: sde1 +SCSI device sdf: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB] [8.7 GB] + sdf: sdf1 +SCSI device sdg: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB] [8.7 GB] + sdg: sdg1 +SCSI device sdh: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB] [8.7 GB] + sdh: sdh1 +SCSI device sdi: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB] [8.7 GB] + sdi: sdi1 +SCSI device sdj: hdwr sector= 512 bytes. Sectors= 2056160 [1003 MB] [1.0 GB] + sdj: sdj1 +SCSI device sdk: hdwr sector= 512 bytes. Sectors= 2052736 [1002 MB] [1.0 GB] + sdk: sdk1 +SCSI device sdl: hdwr sector= 512 bytes. Sectors= 17764320 [8673 MB] [8.7 GB] + sdl: sdl1 +SCSI device sdm: hdwr sector= 512 bytes. Sectors= 8380320 [4091 MB] [4.1 GB] + sdm: sdm1 +SCSI device sdn: hdwr sector= 512 bytes. Sectors= 17764320 [8673 MB] [8.7 GB] + sdn: sdn1 +SCSI device sdo: hdwr sector= 512 bytes. Sectors= 17764320 [8673 MB] [8.7 GB] + sdo: sdo1 +SCSI device sdp: hdwr sector= 512 bytes. Sectors= 17764320 [8673 MB] [8.7 GB] + sdp: sdp1 +SCSI device sdq: hdwr sector= 512 bytes. Sectors= 2056160 [1003 MB] [1.0 GB] + sdq: sdq1 +SCSI device sdr: hdwr sector= 512 bytes. Sectors= 2052736 [1002 MB] [1.0 GB] + sdr: sdr1 + +************************* + +If a GBIC of type Short-wave, Long-wave, or Copper is detected, it will +print out; otherwise, "none" is displayed. If the cabling is correct +and a loop circuit is completed, you should see "Monitor"; otherwise, +"LoopFail" (on open circuit) or some LPSM number/state with bit 3 set. + + +ERRATA: +1. Normally, Linux Scsi queries FC devices with INQUIRY strings. All LUNs +found according to INQUIRY should get READ commands at sector 0 to find +partition table, etc. Older kernels only query the first 4 devices. Some +Linux kernels only look for one LUN per target (i.e. FC device). + +2. Physically removing a device, or a malfunctioning system which hides a +device, leads to a 30-second timeout and subsequent _abort call. +In some process contexts, this will hang the kernel (crashing the system). +Single bit errors in frames and virtually all hot plugging events are +gracefully handled with internal driver timer and Abort processing. + +3. Some SCSI drives with error conditions will not handle the 7 second timeout +in this software driver, leading to infinite retries on timed out SCSI commands. +The 7 secs balances the need to quickly recover from lost frames (esp. on sequence +initiatives) and time needed by older/slower/error-state drives in responding. +This can be easily changed in "Exchanges[].timeOut". + +4. Due to the nature of FC soft addressing, there is no assurance that the +same LUNs (drives) will have the same path (e.g. /dev/sdb1) from one boot to +next. Dynamic soft address changes (i.e. 24-bit FC port_id) are +supported during run time (e.g. due to hot plug event) by the use of WWN to +SCSI Nexus (channel/target/LUN) mapping. + +5. Compaq RA4x00 firmware version 2.54 and later supports SSP (Selective +Storage Presentation), which maps LUNs to a WWN. If RA4x00 firmware prior +2.54 (e.g. older controller) is used, or the FC HBA is replaced (another WWN +is used), logical volumes on the RA4x00 will no longer be visible. + + +Send questions/comments to: +Amy Vanzant-Hodge (fibrechannel@compaq.com) + diff --git a/Documentation/scsi/dc395x.txt b/Documentation/scsi/dc395x.txt new file mode 100644 index 000000000000..ae3b79a2d275 --- /dev/null +++ b/Documentation/scsi/dc395x.txt @@ -0,0 +1,102 @@ +README file for the dc395x SCSI driver +========================================== + +Status +------ +The driver has been tested with CD-R and CD-R/W drives. These should +be safe to use. Testing with hard disks has not been done to any +great degree and caution should be exercised if you want to attempt +to use this driver with hard disks. + +This is a 2.5 only driver. For a 2.4 driver please see the original +driver (which this driver started from) at +http://www.garloff.de/kurt/linux/dc395/ + +Problems, questions and patches should be submitted to the mailing +list. Details on the list, including archives, are available at +http://lists.twibble.org/mailman/listinfo/dc395x/ + +Parameters +---------- +The driver uses the settings from the EEPROM set in the SCSI BIOS +setup. If there is no EEPROM, the driver uses default values. +Both can be overriden by command line parameters (module or kernel +parameters). + +The following parameters are available: + + - safe + Default: 0, Acceptable values: 0 or 1 + + If safe is set to 1 then the adapter will use conservative + ("safe") default settings. This sets: + + shortcut for dc395x=7,4,9,15,2,10 + + - adapter_id + Default: 7, Acceptable values: 0 to 15 + + Sets the host adapter SCSI ID. + + - max_speed + Default: 1, Acceptable value: 0 to 7 + 0 = 20 Mhz + 1 = 12.2 Mhz + 2 = 10 Mhz + 3 = 8 Mhz + 4 = 6.7 Mhz + 5 = 5.8 Hhz + 6 = 5 Mhz + 7 = 4 Mhz + + - dev_mode + Bitmap for device configuration + + DevMode bit definition: + Bit Val(hex) Val(dec) Meaning + *0 0x01 1 Parity check + *1 0x02 2 Synchronous Negotiation + *2 0x04 4 Disconnection + *3 0x08 8 Send Start command on startup. (Not used) + *4 0x10 16 Tagged Command Queueing + *5 0x20 32 Wide Negotiation + + - adapter_mode + Bitmap for adapter configuration + + AdaptMode bit definition + Bit Val(hex) Val(dec) Meaning + *0 0x01 1 Support more than two drives. (Not used) + *1 0x02 2 Use DOS compatible mapping for HDs greater than 1GB. + *2 0x04 4 Reset SCSI Bus on startup. + *3 0x08 8 Active Negation: Improves SCSI Bus noise immunity. + 4 0x10 16 Immediate return on BIOS seek command. (Not used) + (*)5 0x20 32 Check for LUNs >= 1. + + - tags + Default: 3, Acceptable values: 0-5 + + The number of tags is 1<<x, if x has been specified + + - reset_delay + Default: 1, Acceptable values: 0-180 + + The seconds to not accept commands after a SCSI Reset + + +For the built in driver the parameters should be prefixed with +dc395x. (eg "dc395x.safe=1") + + +Copyright +--------- +The driver is free software. It is protected by the GNU General Public +License (GPL). Please read it, before using this driver. It should be +included in your kernel sources and with your distribution. It carries the +filename COPYING. If you don't have it, please ask me to send you one by +email. +Note: The GNU GPL says also something about warranty and liability. +Please be aware the following: While we do my best to provide a working and +reliable driver, there is a chance, that it will kill your valuable data. +We refuse to take any responsibility for that. The driver is provided as-is +and YOU USE IT AT YOUR OWN RESPONSIBILITY. diff --git a/Documentation/scsi/dpti.txt b/Documentation/scsi/dpti.txt new file mode 100644 index 000000000000..6e45e70243e5 --- /dev/null +++ b/Documentation/scsi/dpti.txt @@ -0,0 +1,83 @@ + /* TERMS AND CONDITIONS OF USE + * + * Redistribution and use in source form, with or without modification, are + * permitted provided that redistributions of source code must retain the + * above copyright notice, this list of conditions and the following disclaimer. + * + * This software is provided `as is' by Adaptec and + * any express or implied warranties, including, but not limited to, the + * implied warranties of merchantability and fitness for a particular purpose, + * are disclaimed. In no event shall Adaptec be + * liable for any direct, indirect, incidental, special, exemplary or + * consequential damages (including, but not limited to, procurement of + * substitute goods or services; loss of use, data, or profits; or business + * interruptions) however caused and on any theory of liability, whether in + * contract, strict liability, or tort (including negligence or otherwise) + * arising in any way out of the use of this driver software, even if advised + * of the possibility of such damage. + * + **************************************************************** + * This driver supports the Adaptec I2O RAID and DPT SmartRAID V I2O boards. + * + * CREDITS: + * The original linux driver was ported to Linux by Karen White while at + * Dell Computer. It was ported from Bob Pasteur's (of DPT) original + * non-Linux driver. Mark Salyzyn and Bob Pasteur consulted on the original + * driver. + * + * 2.0 version of the driver by Deanna Bonds and Mark Salyzyn. + * + * HISTORY: + * The driver was originally ported to linux version 2.0.34 + * + * V2.0 Rewrite of driver. Re-architectured based on i2o subsystem. + * This was the first full GPL version since the last version used + * i2osig headers which were not GPL. Developer Testing version. + * V2.1 Internal testing + * V2.2 First released version + * + * V2.3 + * Changes: + * Added Raptor Support + * Fixed bug causing system to hang under extreme load with + * management utilities running (removed GFP_DMA from kmalloc flags) + * + * + * V2.4 First version ready to be submitted to be embedded in the kernel + * Changes: + * Implemented suggestions from Alan Cox + * Added calculation of resid for sg layer + * Better error handling + * Added checking underflow condtions + * Added DATAPROTECT checking + * Changed error return codes + * Fixed pointer bug in bus reset routine + * Enabled hba reset from ioctls (allows a FW flash to reboot and use the new + * FW without having to reboot) + * Changed proc output + * + * TODO: + * Add 64 bit Scatter Gather when compiled on 64 bit architectures + * Add sparse lun scanning + * Add code that checks if a device that had been taken offline is + * now online (at the FW level) when test unit ready or inquiry + * command from scsi-core + * Add proc read interface + * busrescan command + * rescan command + * Add code to rescan routine that notifies scsi-core about new devices + * Add support for C-PCI (hotplug stuff) + * Add ioctl passthru error recovery + * + * NOTES: + * The DPT card optimizes the order of processing commands. Consequently, + * a command may take up to 6 minutes to complete after it has been sent + * to the board. + * + * The files dpti_ioctl.h dptsig.h osd_defs.h osd_util.h sys_info.h are part of the + * interface files for Adaptec's management routines. These define the structures used + * in the ioctls. They are written to be portable. They are hard to read, but I need + * to use them 'as is' or I can miss changes in the interface. + * + */ + diff --git a/Documentation/scsi/dtc3x80.txt b/Documentation/scsi/dtc3x80.txt new file mode 100644 index 000000000000..e8ae6230ab3e --- /dev/null +++ b/Documentation/scsi/dtc3x80.txt @@ -0,0 +1,43 @@ +README file for the Linux DTC3180/3280 scsi driver. +by Ray Van Tassle (rayvt@comm.mot.com) March 1996 +Based on the generic & core NCR5380 code by Drew Eckhard + +SCSI device driver for the DTC 3180/3280. +Data Technology Corp---a division of Qume. + +The 3280 has a standard floppy interface. + +The 3180 does not. Otherwise, they are identical. + +The DTC3x80 does not support DMA but it does have Pseudo-DMA which is +supported by the driver. + +It's DTC406 scsi chip is supposedly compatible with the NCR 53C400. +It is memory mapped, uses an IRQ, but no dma or io-port. There is +internal DMA, between SCSI bus and an on-chip 128-byte buffer. Double +buffering is done automagically by the chip. Data is transferred +between the on-chip buffer and CPU/RAM via memory moves. + +The driver detects the possible memory addresses (jumper selectable): + CC00, DC00, C800, and D800 +The possible IRQ's (jumper selectable) are: + IRQ 10, 11, 12, 15 +Parity is supported by the chip, but not by this driver. +Information can be obtained from /proc/scsi/dtc3c80/N. + +Note on interrupts: + +The documentation says that it can be set to interrupt whenever the +on-chip buffer needs CPU attention. I couldn't get this to work. So +the driver polls for data-ready in the pseudo-DMA transfer routine. +The interrupt support routines in the NCR3280.c core modules handle +scsi disconnect/reconnect, and this (mostly) works. However..... I +have tested it with 4 totally different hard drives (both SCSI-1 and +SCSI-2), and one CDROM drive. Interrupts works great for all but one +specific hard drive. For this one, the driver will eventually hang in +the transfer state. I have tested with: "dd bs=4k count=2k +of=/dev/null if=/dev/sdb". It reads ok for a while, then hangs. +After beating my head against this for a couple of weeks, getting +nowhere, I give up. So.....This driver does NOT use interrupts, even +if you have the card jumpered to an IRQ. Probably nobody will ever +care. diff --git a/Documentation/scsi/g_NCR5380.txt b/Documentation/scsi/g_NCR5380.txt new file mode 100644 index 000000000000..3b80f567f818 --- /dev/null +++ b/Documentation/scsi/g_NCR5380.txt @@ -0,0 +1,63 @@ +README file for the Linux g_NCR5380 driver. + +(c) 1993 Drew Eckhard +NCR53c400 extensions (c) 1994,1995,1996 Kevin Lentin + +This file documents the NCR53c400 extensions by Kevin Lentin and some +enhancements to the NCR5380 core. + +This driver supports both NCR5380 and NCR53c400 cards in port or memory +mapped modes. Currently this driver can only support one of those mapping +modes at a time but it does support both of these chips at the same time. +The next release of this driver will support port & memory mapped cards at +the same time. It should be able to handle multiple different cards in the +same machine. + +The drivers/scsi/Makefile has an override in it for the most common +NCR53c400 card, the Trantor T130B in its default configuration: + Port: 0x350 + IRQ : 5 + +The NCR53c400 does not support DMA but it does have Pseudo-DMA which is +supported by the driver. + +If the default configuration does not work for you, you can use the kernel +command lines (eg using the lilo append command): + ncr5380=port,irq,dma + ncr53c400=port,irq +or + ncr5380=base,irq,dma + ncr53c400=base,irq + +The driver does not probe for any addresses or ports other than those in +the OVERRIDE or given to the kernel as above. + +This driver provides some information on what it has detected in +/proc/scsi/g_NCR5380/x where x is the scsi card number as detected at boot +time. More info to come in the future. + +When NCR53c400 support is compiled in, BIOS parameters will be returned by +the driver (the raw 5380 driver does not and I don't plan to fiddle with +it!). + +This driver works as a module. +When included as a module, parameters can be passed on the insmod/modprobe +command line: + ncr_irq=xx the interrupt + ncr_addr=xx the port or base address (for port or memory + mapped, resp.) + ncr_dma=xx the DMA + ncr_5380=1 to set up for a NCR5380 board + ncr_53c400=1 to set up for a NCR53C400 board +e.g. +modprobe g_NCR5380 ncr_irq=5 ncr_addr=0x350 ncr_5380=1 + for a port mapped NCR5380 board or +modprobe g_NCR5380 ncr_irq=255 ncr_addr=0xc8000 ncr_53c400=1 + for a memory mapped NCR53C400 board with interrupts disabled. + +(255 should be specified for no or DMA interrupt, 254 to autoprobe for an + IRQ line if overridden on the command line.) + + +Kevin Lentin +K.Lentin@cs.monash.edu.au diff --git a/Documentation/scsi/ibmmca.txt b/Documentation/scsi/ibmmca.txt new file mode 100644 index 000000000000..2814491600ff --- /dev/null +++ b/Documentation/scsi/ibmmca.txt @@ -0,0 +1,1402 @@ + + -=< The IBM Microchannel SCSI-Subsystem >=- + + for the IBM PS/2 series + + Low Level Software-Driver for Linux + + Copyright (c) 1995 Strom Systems, Inc. under the terms of the GNU + General Public License. Originally written by Martin Kolinek, December 1995. + Officially modified and maintained by Michael Lang since January 1999. + + Version 4.0a + + Last update: January 3, 2001 + + Before you Start + ---------------- + This is the common README.ibmmca file for all driver releases of the + IBM MCA SCSI driver for Linux. Please note, that driver releases 4.0 + or newer do not work with kernel versions older than 2.4.0, while driver + versions older than 4.0 do not work with kernels 2.4.0 or later! If you + try to compile your kernel with the wrong driver source, the + compilation is aborted and you get a corresponding error message. This is + no bug in the driver. It prevents you from using the wrong sourcecode + with the wrong kernel version. + + Authors of this Driver + ---------------------- + - Chris Beauregard (improvement of the SCSI-device mapping by the driver) + - Martin Kolinek (origin, first release of this driver) + - Klaus Kudielka (multiple SCSI-host management/detection, adaption to + Linux Kernel 2.1.x, module support) + - Michael Lang (assigning original pun/lun mapping, dynamical ldn + assignment, rewritten adapter detection, this file, + patches, official driver maintenance and subsequent + debugging, related with the driver) + + Table of Contents + ----------------- + 1 Abstract + 2 Driver Description + 2.1 IBM SCSI-Subsystem Detection + 2.2 Physical Units, Logical Units, and Logical Devices + 2.3 SCSI-Device Recognition and dynamical ldn Assignment + 2.4 SCSI-Device Order + 2.5 Regular SCSI-Command-Processing + 2.6 Abort & Reset Commands + 2.7 Disk Geometry + 2.8 Kernel Boot Option + 2.9 Driver Module Support + 2.10 Multiple Hostadapter Support + 2.11 /proc/scsi-Filesystem Information + 2.12 /proc/mca-Filesystem Information + 2.13 Supported IBM SCSI-Subsystems + 2.14 Linux Kernel Versions + 3 Code History + 4 To do + 5 Users' Manual + 5.1 Commandline Parameters + 5.2 Troubleshooting + 5.3 Bugreports + 5.4 Support WWW-page + 6 References + 7 Credits to + 7.1 People + 7.2 Sponsors & Supporters + 8 Trademarks + 9 Disclaimer + + * * * + + 1 Abstract + ---------- + This README-file describes the IBM SCSI-subsystem low level driver for + Linux. The descriptions which were formerly kept in the source-code have + been taken out to this file to easify the codes' readability. The driver + description has been updated, as most of the former description was already + quite outdated. The history of the driver development is also kept inside + here. Multiple historical developments have been summarized to shorten the + textsize a bit. At the end of this file you can find a small manual for + this driver and hints to get it running on your machine. + + 2 Driver Description + -------------------- + 2.1 IBM SCSI-Subsystem Detection + -------------------------------- + This is done in the ibmmca_detect() function. It first checks, if the + Microchannel-bus support is enabled, as the IBM SCSI-subsystem needs the + Microchannel. In a next step, a free interrupt is chosen and the main + interrupt handler is connected to it to handle answers of the SCSI- + subsystem(s). If the F/W SCSI-adapter is forced by the BIOS to use IRQ11 + instead of IRQ14, IRQ11 is used for the IBM SCSI-2 F/W adapter. In a + further step it is checked, if the adapter gets detected by force from + the kernel commandline, where the I/O port and the SCSI-subsystem id can + be specified. The next step checks if there is an integrated SCSI-subsystem + installed. This register area is fixed through all IBM PS/2 MCA-machines + and appears as something like a virtual slot 10 of the MCA-bus. On most + PS/2 machines, the POS registers of slot 10 are set to 0xff or 0x00 if not + integrated SCSI-controller is available. But on certain PS/2s, like model + 9595, this slot 10 is used to store other information which at earlier + stage confused the driver and resulted in the detection of some ghost-SCSI. + If POS-register 2 and 3 are not 0x00 and not 0xff, but all other POS + registers are either 0xff or 0x00, there must be an integrated SCSI- + subsystem present and it will be registered as IBM Integrated SCSI- + Subsystem. The next step checks, if there is a slot-adapter installed on + the MCA-bus. To get this, the first two POS-registers, that represent the + adapter ID are checked. If they fit to one of the ids, stored in the + adapter list, a SCSI-subsystem is assumed to be found in a slot and will be + registered. This check is done through all possible MCA-bus slots to allow + more than one SCSI-adapter to be present in the PS/2-system and this is + already the first point of problems. Looking into the technical reference + manual for the IBM PS/2 common interfaces, the POS2 register must have + different interpretation of its single bits to avoid overlapping I/O + regions. While one can assume, that the integrated subsystem has a fix + I/O-address at 0x3540 - 0x3547, further installed IBM SCSI-adapters must + use a different I/O-address. This is expressed by bit 1 to 3 of POS2 + (multiplied by 8 + 0x3540). Bits 2 and 3 are reserved for the integrated + subsystem, but not for the adapters! The following list shows, how the + bits of POS2 and POS3 should be interpreted. + + The POS2-register of all PS/2 models' integrated SCSI-subsystems has the + following interpretation of bits: + Bit 7 - 4 : Chip Revision ID (Release) + Bit 3 - 2 : Reserved + Bit 1 : 8k NVRAM Disabled + Bit 0 : Chip Enable (EN-Signal) + The POS3-register is interpreted as follows (for most IBM SCSI-subsys.): + Bit 7 - 5 : SCSI ID + Bit 4 - 0 : Reserved = 0 + The slot-adapters have different interpretation of these bits. The IBM SCSI + adapter (w/Cache) and the IBM SCSI-2 F/W adapter use the following + interpretation of the POS2 register: + Bit 7 - 4 : ROM Segment Address Select + Bit 3 - 1 : Adapter I/O Address Select (*8+0x3540) + Bit 0 : Adapter Enable (EN-Signal) + and for the POS3 register: + Bit 7 - 5 : SCSI ID + Bit 4 : Fairness Enable (SCSI ID3 f. F/W) + Bit 3 - 0 : Arbitration Level + The most modern product of the series is the IBM SCSI-2 F/W adapter, it + allows dual-bus SCSI and SCSI-wide addressing, which means, PUNs may be + between 0 and 15. Here, Bit 4 is the high-order bit of the 4-bit wide + adapter PUN expression. In short words, this means, that IBM PS/2 machines + can only support 1 single integrated subsystem by default. Additional + slot-adapters get ports assigned by the automatic configuration tool. + + One day I found a patch in ibmmca_detect(), forcing the I/O-address to be + 0x3540 for integrated SCSI-subsystems, there was a remark placed, that on + integrated IBM SCSI-subsystems of model 56, the POS2 register was showing 5. + This means, that really for these models, POS2 has to be interpreted + sticking to the technical reference guide. In this case, the bit 2 (4) is + a reserved bit and may not be interpreted. These differences between the + adapters and the integrated controllers are taken into account by the + detection routine of the driver on from version >3.0g. + + Every time, a SCSI-subsystem is discovered, the ibmmca_register() function + is called. This function checks first, if the requested area for the I/O- + address of this SCSI-subsystem is still available and assigns this I/O- + area to the SCSI-subsystem. There are always 8 sequential I/O-addresses + taken for each individual SCSI-subsystem found, which are: + + Offset Type Permissions + 0 Command Interface Register 1 Read/Write + 1 Command Interface Register 2 Read/Write + 2 Command Interface Register 3 Read/Write + 3 Command Interface Register 4 Read/Write + 4 Attention Register Read/Write + 5 Basic Control Register Read/Write + 6 Interrupt Status Register Read + 7 Basic Status Register Read + + After the I/O-address range is assigned, the host-adapter is assigned + to a local structure which keeps all adapter information needed for the + driver itself and the mid- and higher-level SCSI-drivers. The SCSI pun/lun + and the adapters' ldn tables are initialized and get probed afterwards by + the check_devices() function. If no further adapters are found, + ibmmca_detect() quits. + + 2.2 Physical Units, Logical Units, and Logical Devices + ------------------------------------------------------ + There can be up to 56 devices on the SCSI bus (besides the adapter): + there are up to 7 "physical units" (each identified by physical unit + number or pun, also called the scsi id, this is the number you select + with hardware jumpers), and each physical unit can have up to 8 + "logical units" (each identified by logical unit number, or lun, + between 0 and 7). The IBM SCSI-2 F/W adapter offers this on up to two + busses and provides support for 30 logical devices at the same time, where + in wide-addressing mode you can have 16 puns with 32 luns on each device. + This section dexribes you the handling of devices on non-F/W adapters. + Just imagine, that you can have 16 * 32 = 512 devices on a F/W adapter + which means a lot of possible devices for such a small machine. + + Typically the adapter has pun=7, so puns of other physical units + are between 0 and 6(15). On a wide-adapter a pun higher than 7 is + possible, but is normally not used. Almost all physical units have only + one logical unit, with lun=0. A CD-ROM jukebox would be an example of a + physical unit with more than one logical unit. + + The embedded microprocessor of the IBM SCSI-subsystem hides the complex + two-dimensional (pun,lun) organization from the operating system. + When the machine is powered-up (or rebooted), the embedded microprocessor + checks, on its own, all 56 possible (pun,lun) combinations, and the first + 15 devices found are assigned into a one-dimensional array of so-called + "logical devices", identified by "logical device numbers" or ldn. The last + ldn=15 is reserved for the subsystem itself. Wide adapters may have + to check up to 15 * 8 = 120 pun/lun combinations. + + 2.3 SCSI-Device Recognition and Dynamical ldn Assignment + -------------------------------------------------------- + One consequence of information hiding is that the real (pun,lun) + numbers are also hidden. The two possibilities to get around this problem + is to offer fake pun/lun combinations to the operating system or to + delete the whole mapping of the adapter and to reassign the ldns, using + the immediate assign command of the SCSI-subsystem for probing through + all possible pun/lun combinations. a ldn is a "logical device number" + which is used by IBM SCSI-subsystems to access some valid SCSI-device. + At the beginning of the development of this driver, the following approach + was used: + + First, the driver checked the ldn's (0 to 6) to find out which ldn's + have devices assigned. This was done by the functions check_devices() and + device_exists(). The interrupt handler has a special paragraph of code + (see local_checking_phase_flag) to assist in the checking. Assume, for + example, that three logical devices were found assigned at ldn 0, 1, 2. + These are presented to the upper layer of Linux SCSI driver + as devices with bogus (pun, lun) equal to (0,0), (1,0), (2,0). + On the other hand, if the upper layer issues a command to device + say (4,0), this driver returns DID_NO_CONNECT error. + + In a second step of the driver development, the following improvement has + been applied: The first approach limited the number of devices to 7, far + fewer than the 15 that it could usem then it just maped ldn -> + (ldn/8,ldn%8) for pun,lun. We ended up with a real mishmash of puns + and luns, but it all seemed to work. + + The latest development, which is implemented from the driver version 3.0 + and later, realizes the device recognition in the following way: + The physical SCSI-devices on the SCSI-bus are probed via immediate_assign- + and device_inquiry-commands, that is all implemented in a completely new + made check_devices() subroutine. This delivers an exact map of the physical + SCSI-world that is now stored in the get_scsi[][]-array. This means, + that the once hidden pun,lun assignment is now known to this driver. + It no longer believes in default-settings of the subsystem and maps all + ldns to existing pun,lun "by foot". This assures full control of the ldn + mapping and allows dynamical remapping of ldns to different pun,lun, if + there are more SCSI-devices installed than ldns available (n>15). The + ldns from 0 to 6 get 'hardwired' by this driver to puns 0 to 7 at lun=0, + excluding the pun of the subsystem. This assures, that at least simple + SCSI-installations have optimum access-speed and are not touched by + dynamical remapping. The ldns 7 to 14 are put to existing devices with + lun>0 or to non-existing devices, in order to satisfy the subsystem, if + there are less than 15 SCSI-devices connected. In the case of more than 15 + devices, the dynamical mapping goes active. If the get_scsi[][] reports a + device to be existant, but it has no ldn assigned, it gets a ldn out of 7 + to 14. The numbers are assigned in cyclic order. Therefore it takes 8 + dynamical reassignments on the SCSI-devices, until a certain device + loses its ldn again. This assures, that dynamical remapping is avoided + during intense I/O between up to 15 SCSI-devices (means pun,lun + combinations). A further advantage of this method is, that people who + build their kernel without probing on all luns will get what they expect, + because the driver just won't assign everything with lun>0 when + multpile lun probing is inactive. + + 2.4 SCSI-Device Order + --------------------- + Because of the now correct recognition of physical pun,lun, and + their report to mid-level- and higher-level-drivers, the new reported puns + can be different from the old, faked puns. Therefore, Linux will eventually + change /dev/sdXXX assignments and prompt you for corrupted superblock + repair on boottime. In this case DO NOT PANIC, YOUR DISKS ARE STILL OK!!! + You have to reboot (CTRL-D) with an old kernel and set the /etc/fstab-file + entries right. After that, the system should come up as errorfree as before. + If your boot-partition is not coming up, also edit the /etc/lilo.conf-file + in a Linux session booted on old kernel and run lilo before reboot. Check + lilo.conf anyway to get boot on other partitions with foreign OSes right + again. But there exists a feature of this driver that allows you to change + the assignment order of the SCSI-devices by flipping the PUN-assignment. + See the next paragraph for a description. + + The problem for this is, that Linux does not assign the SCSI-devices in the + way as described in the ANSI-SCSI-standard. Linux assigns /dev/sda to + the device with at minimum id 0. But the first drive should be at id 6, + because for historical reasons, drive at id 6 has, by hardware, the highest + priority and a drive at id 0 the lowest. IBM was one of the rare producers, + where the BIOS assigns drives belonging to the ANSI-SCSI-standard. Most + other producers' BIOS does not (I think even Adaptec-BIOS). The + IBMMCA_SCSI_ORDER_STANDARD flag, which you set while configuring the + kernel enables to choose the preferred way of SCSI-device-assignment. + Defining this flag would result in Linux determining the devices in the + same order as DOS and OS/2 does on your MCA-machine. This is also standard + on most industrial computers and OSes, like e.g. OS-9. Leaving this flag + undefined will get your devices ordered in the default way of Linux. See + also the remarks of Chris Beauregard from Dec 15, 1997 and the followups + in section 3. + + 2.5 Regular SCSI-Command-Processing + ----------------------------------- + Only three functions get involved: ibmmca_queuecommand(), issue_cmd(), + and interrupt_handler(). + + The upper layer issues a scsi command by calling function + ibmmca_queuecommand(). This function fills a "subsystem control block" + (scb) and calls a local function issue_cmd(), which writes a scb + command into subsystem I/O ports. Once the scb command is carried out, + the interrupt_handler() is invoked. If a device is determined to be + existant and it has not assigned any ldn, it gets one dynamically. + For this, the whole stuff is done in ibmmca_queuecommand(). + + 2.6 Abort & Reset Commands + -------------------------- + These are implemented with busy waiting for interrupt to arrive. + ibmmca_reset() and ibmmca_abort() do not work sufficently well + up to now and need still a lot of development work. But, this seems + to be even a problem with other SCSI-low level drivers, too. However, + this should be no excuse. + + 2.7 Disk Geometry + ----------------- + The ibmmca_biosparams() function should return the same disk geometry + as the bios. This is needed for fdisk, etc. The returned geometry is + certainly correct for disks smaller than 1 gigabyte. In the meantime, + it has been proved, that this works fine even with disks larger than + 1 gigabyte. + + 2.8 Kernel Boot Option + ---------------------- + The function ibmmca_scsi_setup() is called if option ibmmcascsi=n + is passed to the kernel. See file linux/init/main.c for details. + + 2.9 Driver Module Support + ------------------------- + Is implemented and tested by K. Kudielka. This could probably not work + on kernels <2.1.0. + + 2.10 Multiple Hostadapter Support + --------------------------------- + This driver supports up to eight interfaces of type IBM-SCSI-Subsystem. + Integrated-, and MCA-adapters are automatically recognized. Unrecognizable + IBM-SCSI-Subsystem interfaces can be specified as kernel-parameters. + + 2.11 /proc/scsi-Filesystem Information + -------------------------------------- + Information about the driver condition is given in + /proc/scsi/ibmmca/<host_no>. ibmmca_proc_info() provides this information. + + This table is quite informative for interested users. It shows the load + of commands on the subsystem and wether you are running the bypassed + (software) or integrated (hardware) SCSI-command set (see below). The + amount of accesses is shown. Read, write, modeselect is shown separately + in order to help debugging problems with CD-ROMs or tapedrives. + + The following table shows the list of 15 logical device numbers, that are + used by the SCSI-subsystem. The load on each ldn is shown in the table, + again, read and write commands are split. The last column shows the amount + of reassignments, that have been applied to the ldns, if you have more than + 15 pun/lun combinations available on the SCSI-bus. + + The last two tables show the pun/lun map and the positions of the ldns + on this pun/lun map. This may change during operation, when a ldn is + reassigned to another pun/lun combination. If the necessity for dynamical + assignments is set to 'no', the ldn structure keeps static. + + 2.12 /proc/mca-Filesystem Information + ------------------------------------- + The slot-file contains all default entries and in addition chip and I/O- + address information of the SCSI-subsystem. This information is provided + by ibmmca_getinfo(). + + 2.13 Supported IBM SCSI-Subsystems + ---------------------------------- + The following IBM SCSI-subsystems are supported by this driver: + + - IBM Fast/Wide SCSI-2 Adapter + - IBM 7568 Industrial Computer SCSI Adapter w/Cache + - IBM Expansion Unit SCSI Controller + - IBM SCSI Adapter w/Cache + - IBM SCSI Adapter + - IBM Integrated SCSI Controller + - All clones, 100% compatible with the chipset and subsystem command + system of IBM SCSI-adapters (forced detection) + + 2.14 Linux Kernel Versions + -------------------------- + The IBM SCSI-subsystem low level driver is prepared to be used with + all versions of Linux between 2.0.x and 2.4.x. The compatibility checks + are fully implemented up from version 3.1e of the driver. This means, that + you just need the latest ibmmca.h and ibmmca.c file and copy it in the + linux/drivers/scsi directory. The code is automatically adapted during + kernel compilation. This is different from kernel 2.4.0! Here version + 4.0 or later of the driver must be used for kernel 2.4.0 or later. Version + 4.0 or later does not work together with older kernels! Driver versions + older than 4.0 do not work together with kernel 2.4.0 or later. They work + on all older kernels. + + 3 Code History + -------------- + Jan 15 1996: First public release. + - Martin Kolinek + + Jan 23 1996: Scrapped code which reassigned scsi devices to logical + device numbers. Instead, the existing assignment (created + when the machine is powered-up or rebooted) is used. + A side effect is that the upper layer of Linux SCSI + device driver gets bogus scsi ids (this is benign), + and also the hard disks are ordered under Linux the + same way as they are under dos (i.e., C: disk is sda, + D: disk is sdb, etc.). + - Martin Kolinek + + I think that the CD-ROM is now detected only if a CD is + inside CD_ROM while Linux boots. This can be fixed later, + once the driver works on all types of PS/2's. + - Martin Kolinek + + Feb 7 1996: Modified biosparam function. Fixed the CD-ROM detection. + For now, devices other than harddisk and CD_ROM are + ignored. Temporarily modified abort() function + to behave like reset(). + - Martin Kolinek + + Mar 31 1996: The integrated scsi subsystem is correctly found + in PS/2 models 56,57, but not in model 76. Therefore + the ibmmca_scsi_setup() function has been added today. + This function allows the user to force detection of + scsi subsystem. The kernel option has format + ibmmcascsi=n + where n is the scsi_id (pun) of the subsystem. Most likely, n is 7. + - Martin Kolinek + + Aug 21 1996: Modified the code which maps ldns to (pun,0). It was + insufficient for those of us with CD-ROM changers. + - Chris Beauregard + + Dec 14 1996: More improvements to the ldn mapping. See check_devices + for details. Did more fiddling with the integrated SCSI detection, + but I think it's ultimately hopeless without actually testing the + model of the machine. The 56, 57, 76 and 95 (ultimedia) all have + different integrated SCSI register configurations. However, the 56 + and 57 are the only ones that have problems with forced detection. + - Chris Beauregard + + Mar 8-16 1997: Modified driver to run as a module and to support + multiple adapters. A structure, called ibmmca_hostdata, is now + present, containing all the variables, that were once only + available for one single adapter. The find_subsystem-routine has vanished. + The hardware recognition is now done in ibmmca_detect directly. + This routine checks for presence of MCA-bus, checks the interrupt + level and continues with checking the installed hardware. + Certain PS/2-models do not recognize a SCSI-subsystem automatically. + Hence, the setup defined by command-line-parameters is checked first. + Thereafter, the routine probes for an integrated SCSI-subsystem. + Finally, adapters are checked. This method has the advantage to cover all + possible combinations of multiple SCSI-subsystems on one MCA-board. Up to + eight SCSI-subsystems can be recognized and announced to the upper-level + drivers with this improvement. A set of defines made changes to other + routines as small as possible. + - Klaus Kudielka + + May 30 1997: (v1.5b) + 1) SCSI-command capability enlarged by the recognition of MODE_SELECT. + This needs the RD-Bit to be disabled on IM_OTHER_SCSI_CMD_CMD which + allows data to be written from the system to the device. It is a + necessary step to be allowed to set blocksize of SCSI-tape-drives and + the tape-speed, whithout confusing the SCSI-Subsystem. + 2) The recognition of a tape is included in the check_devices routine. + This is done by checking for TYPE_TAPE, that is already defined in + the kernel-scsi-environment. The markup of a tape is done in the + global ldn_is_tape[] array. If the entry on index ldn + is 1, there is a tapedrive connected. + 3) The ldn_is_tape[] array is necessary to distinguish between tape- and + other devices. Fixed blocklength devices should not cause a problem + with the SCB-command for read and write in the ibmmca_queuecommand + subroutine. Therefore, I only derivate the READ_XX, WRITE_XX for + the tape-devices, as recommended by IBM in this Technical Reference, + mentioned below. (IBM recommends to avoid using the read/write of the + subsystem, but the fact was, that read/write causes a command error from + the subsystem and this causes kernel-panic.) + 4) In addition, I propose to use the ldn instead of a fix char for the + display of PS2_DISK_LED_ON(). On 95, one can distinguish between the + devices that are accessed. It shows activity and easyfies debugging. + The tape-support has been tested with a SONY SDT-5200 and a HP DDS-2 + (I do not know yet the type). Optimization and CD-ROM audio-support, + I am working on ... + - Michael Lang + + June 19 1997: (v1.6b) + 1) Submitting the extra-array ldn_is_tape[] -> to the local ld[] + device-array. + 2) CD-ROM Audio-Play seems to work now. + 3) When using DDS-2 (120M) DAT-Tapes, mtst shows still density-code + 0x13 for ordinary DDS (61000 BPM) instead 0x24 for DDS-2. This appears + also on Adaptec 2940 adaptor in a PCI-System. Therefore, I assume that + the problem is independent of the low-level-driver/bus-architecture. + 4) Hexadecimal ldn on PS/2-95 LED-display. + 5) Fixing of the PS/2-LED on/off that it works right with tapedrives and + does not confuse the disk_rw_in_progress counter. + - Michael Lang + + June 21 1997: (v1.7b) + 1) Adding of a proc_info routine to inform in /proc/scsi/ibmmca/<host> the + outer-world about operational load statistics on the different ldns, + seen by the driver. Everybody that has more than one IBM-SCSI should + test this, because I only have one and cannot see what happens with more + than one IBM-SCSI hosts. + 2) Definition of a driver version-number to have a better recognition of + the source when there are existing too much releases that may confuse + the user, when reading about release-specific problems. Up to know, + I calculated the version-number to be 1.7. Because we are in BETA-test + yet, it is today 1.7b. + 3) Sorry for the heavy bug I programmed on June 19 1997! After that, the + CD-ROM did not work any more! The C7-command was a fake impression + I got while programming. Now, the READ and WRITE commands for CD-ROM are + no longer running over the subsystem, but just over + IM_OTHER_SCSI_CMD_CMD. On my observations (PS/2-95), now CD-ROM mounts + much faster(!) and hopefully all fancy multimedia-functions, like direct + digital recording from audio-CDs also work. (I tried it with cdda2wav + from the cdwtools-package and it filled up the harddisk immediately :-).) + To easify boolean logics, a further local device-type in ld[], called + is_cdrom has been included. + 4) If one uses a SCSI-device of unsupported type/commands, one + immediately runs into a kernel-panic caused by Command Error. To better + understand which SCSI-command caused the problem, I extended this + specific panic-message slightly. + - Michael Lang + + June 25 1997: (v1.8b) + 1) Some cosmetical changes for the handling of SCSI-device-types. + Now, also CD-Burners / WORMs and SCSI-scanners should work. For + MO-drives I have no experience, therefore not yet supported. + In logical_devices I changed from different type-variables to one + called 'device_type' where the values, corresponding to scsi.h, + of a SCSI-device are stored. + 2) There existed a small bug, that maps a device, coming after a SCSI-tape + wrong. Therefore, e.g. a CD-ROM changer would have been mapped wrong + -> problem removed. + 3) Extension of the logical_device structure. Now it contains also device, + vendor and revision-level of a SCSI-device for internal usage. + - Michael Lang + + June 26-29 1997: (v2.0b) + 1) The release number 2.0b is necessary because of the completely new done + recognition and handling of SCSI-devices with the adapter. As I got + from Chris the hint, that the subsystem can reassign ldns dynamically, + I remembered this immediate_assign-command, I found once in the handbook. + Now, the driver first kills all ldn assignments that are set by default + on the SCSI-subsystem. After that, it probes on all puns and luns for + devices by going through all combinations with immediate_assign and + probing for devices, using device_inquiry. The found physical(!) pun,lun + structure is stored in get_scsi[][] as device types. This is followed + by the assignment of all ldns to existing SCSI-devices. If more ldns + than devices are available, they are assigned to non existing pun,lun + combinations to satisfy the adapter. With this, the dynamical mapping + was possible to implement. (For further info see the text in the + source-code and in the description below. Read the description + below BEFORE installing this driver on your system!) + 2) Changed the name IBMMCA_DRIVER_VERSION to IBMMCA_SCSI_DRIVER_VERSION. + 3) The LED-display shows on PS/2-95 no longer the ldn, but the SCSI-ID + (pun) of the accessed SCSI-device. This is now senseful, because the + pun known within the driver is exactly the pun of the physical device + and no longer a fake one. + 4) The /proc/scsi/ibmmca/<host_no> consists now of the first part, where + hit-statistics of ldns is shown and a second part, where the maps of + physical and logical SCSI-devices are displayed. This could be very + interesting, when one is using more than 15 SCSI-devices in order to + follow the dynamical remapping of ldns. + - Michael Lang + + June 26-29 1997: (v2.0b-1) + 1) I forgot to switch the local_checking_phase_flag to 1 and back to 0 + in the dynamical remapping part in ibmmca_queuecommand for the + device_exist routine. Sorry. + - Michael Lang + + July 1-13 1997: (v3.0b,c) + 1) Merging of the driver-developments of Klaus Kudielka and Michael Lang + in order to get a optimum and unified driver-release for the + IBM-SCSI-Subsystem-Adapter(s). + For people, using the Kernel-release >=2.1.0, module-support should + be no problem. For users, running under <2.1.0, module-support may not + work, because the methods have changed between 2.0.x and 2.1.x. + 2) Added some more effective statistics for /proc-output. + 3) Change typecasting at necessary points from (unsigned long) to + virt_to_bus(). + 4) Included #if... at special points to have specific adaption of the + driver to kernel 2.0.x and 2.1.x. It should therefore also run with + later releases. + 5) Magneto-Optical drives and medium-changers are also recognized, now. + Therefore, we have a completely gapfree recognition of all SCSI- + device-types, that are known by Linux up to kernel 2.1.31. + 6) The flag SCSI_IBMMCA_DEV_RESET has been inserted. If it is set within + the configuration, each connected SCSI-device will get a reset command + during boottime. This can be necessary for some special SCSI-devices. + This flag should be included in Config.in. + (See also the new Config.in file.) + Probable next improvement: bad disk handler. + - Michael Lang + + Sept 14 1997: (v3.0c) + 1) Some debugging and speed optimization applied. + - Michael Lang + + Dec 15, 1997 + - chrisb@truespectra.com + - made the front panel display thingy optional, specified from the + command-line via ibmmcascsi=display. Along the lines of the /LED + option for the OS/2 driver. + - fixed small bug in the LED display that would hang some machines. + - reversed ordering of the drives (using the + IBMMCA_SCSI_ORDER_STANDARD define). This is necessary for two main + reasons: + - users who've already installed Linux won't be screwed. Keep + in mind that not everyone is a kernel hacker. + - be consistent with the BIOS ordering of the drives. In the + BIOS, id 6 is C:, id 0 might be D:. With this scheme, they'd be + backwards. This confuses the crap out of those heathens who've + got a impure Linux installation (which, <wince>, I'm one of). + This whole problem arises because IBM is actually non-standard with + the id to BIOS mappings. You'll find, in fdomain.c, a similar + comment about a few FD BIOS revisions. The Linux (and apparently + industry) standard is that C: maps to scsi id (0,0). Let's stick + with that standard. + - Since this is technically a branch of my own, I changed the + version number to 3.0e-cpb. + + Jan 17, 1998: (v3.0f) + 1) Addition of some statistical info for /proc in proc_info. + 2) Taking care of the SCSI-assignment problem, dealed by Chris at Dec 15 + 1997. In fact, IBM is right, concerning the assignment of SCSI-devices + to driveletters. It is conform to the ANSI-definition of the SCSI- + standard to assign drive C: to SCSI-id 6, because it is the highest + hardware priority after the hostadapter (that has still today by + default everywhere id 7). Also realtime-operating systems that I use, + like LynxOS and OS9, which are quite industrial systems use top-down + numbering of the harddisks, that is also starting at id 6. Now, one + sits a bit between two chairs. On one hand side, using the define + IBMMCA_SCSI_ORDER_STANDARD makes Linux assigning disks conform to + the IBM- and ANSI-SCSI-standard and keeps this driver downward + compatible to older releases, on the other hand side, people is quite + habituated in believing that C: is assigned to (0,0) and much other + SCSI-BIOS do so. Therefore, I moved the IBMMCA_SCSI_ORDER_STANDARD + define out of the driver and put it into Config.in as subitem of + 'IBM SCSI support'. A help, added to Documentation/Configure.help + explains the differences between saying 'y' or 'n' to the user, when + IBMMCA_SCSI_ORDER_STANDARD prompts, so the ordinary user is enabled to + choose the way of assignment, depending on his own situation and gusto. + 3) Adapted SCSI_IBMMCA_DEV_RESET to the local naming convention, so it is + now called IBMMCA_SCSI_DEV_RESET. + 4) Optimization of proc_info and its subroutines. + 5) Added more in-source-comments and extended the driver description by + some explanation about the SCSI-device-assignment problem. + - Michael Lang + + Jan 18, 1998: (v3.0g) + 1) Correcting names to be absolutely conform to the later 2.1.x releases. + This is necessary for + IBMMCA_SCSI_DEV_RESET -> CONFIG_IBMMCA_SCSI_DEV_RESET + IBMMCA_SCSI_ORDER_STANDARD -> CONFIG_IBMMCA_SCSI_ORDER_STANDARD + - Michael Lang + + Jan 18, 1999: (v3.1 MCA-team internal) + 1) The multiple hosts structure is accessed from every subroutine, so there + is no longer the address of the device structure passed from function + to function, but only the hostindex. A call by value, nothing more. This + should really be understood by the compiler and the subsystem should get + the right values and addresses. + 2) The SCSI-subsystem detection was not complete and quite hugely buggy up + to now, compared to the technical manual. The interpretation of the pos2 + register is not as assumed by people before, therefore, I dropped a note + in the ibmmca_detect function to show the registers' interpretation. + The pos-registers of integrated SCSI-subsystems do not contain any + information concerning the IO-port offset, really. Instead, they contain + some info about the adapter, the chip, the NVRAM .... The I/O-port is + fixed to 0x3540 - 0x3547. There can be more than one adapters in the + slots and they get an offset for the I/O area in order to get their own + I/O-address area. See chapter 2 for detailed description. At least, the + detection should now work right, even on models other than 95. The 95ers + came happily around the bug, as their pos2 register contains always 0 + in the critical area. Reserved bits are not allowed to be interpreted, + therefore, IBM is allowed to set those bits as they like and they may + really vary between different PS/2 models. So, now, no interpretation + of reserved bits - hopefully no trouble here anymore. + 3) The command error, which you may get on models 55, 56, 57, 70, 77 and + P70 may have been caused by the fact, that adapters of older design do + not like sending commands to non-existing SCSI-devices and will react + with a command error as a sign of protest. While this error is not + present on IBM SCSI Adapter w/cache, it appears on IBM Integrated SCSI + Adapters. Therefore, I implemented a workarround to forgive those + adapters their protests, but it is marked up in the statisctis, so + after a successful boot, you can see in /proc/scsi/ibmmca/<host_number> + how often the command errors have been forgiven to the SCSI-subsystem. + If the number is bigger than 0, you have a SCSI subsystem of older + design, what should no longer matter. + 4) ibmmca_getinfo() has been adapted very carefully, so it shows in the + slotn file really, what is senseful to be presented. + 5) ibmmca_register() has been extended in its parameter list in order to + pass the right name of the SCSI-adapter to Linux. + - Michael Lang + + Feb 6, 1999: (v3.1) + 1) Finally, after some 3.1Beta-releases, the 3.1 release. Sorry, for + the delayed release, but it was not finished with the release of + Kernel 2.2.0. + - Michael Lang + + Feb 10, 1999 (v3.1) + 1) Added a new commandline parameter called 'bypass' in order to bypass + every integrated subsystem SCSI-command consequently in case of + troubles. + 2) Concatenated read_capacity requests to the harddisks. It gave a lot + of troubles with some controllers and after I wanted to apply some + extensions, it jumped out in the same situation, on my w/cache, as like + on D. Weinehalls' Model 56, having integrated SCSI. This gave me the + descissive hint to move the code-part out and declare it global. Now, + it seems to work by far much better an more stable. Let us see, what + the world thinks of it... + 3) By the way, only Sony DAT-drives seem to show density code 0x13. A + test with a HP drive gave right results, so the problem is vendor- + specific and not a problem of the OS or the driver. + - Michael Lang + + Feb 18, 1999 (v3.1d) + 1) The abort command and the reset function have been checked for + inconsistencies. From the logical point of thinking, they work + at their optimum, now, but as the subsystem does not answer with an + interrupt, abort never finishes, sigh... + 2) Everything, that is accessed by a busmaster request from the adapter + is now declared as global variable, even the return-buffer in the + local checking phase. This assures, that no accesses to undefined memory + areas are performed. + 3) In ibmmca.h, the line unchecked_isa_dma is added with 1 in order to + avoid memory-pointers for the areas higher than 16MByte in order to + be sure, it also works on 16-Bit Microchannel bus systems. + 4) A lot of small things have been found, but nothing that endangered the + driver operations. Just it should be more stable, now. + - Michael Lang + + Feb 20, 1999 (v3.1e) + 1) I took the warning from the Linux Kernel Hackers Guide serious and + checked the cmd->result return value to the done-function very carefully. + It is obvious, that the IBM SCSI only delivers the tsb.dev_status, if + some error appeared, else it is undefined. Now, this is fixed. Before + any SCB command gets queued, the tsb.dev_status is set to 0, so the + cmd->result won't screw up Linux higher level drivers. + 2) The reset-function has slightly improved. This is still planed for + abort. During the abort and the reset function, no interrupts are + allowed. This is however quite hard to cope with, so the INT-status + register is read. When the interrupt gets queued, one can find its + status immediately on that register and is enabled to continue in the + reset function. I had no chance to test this really, only in a bogus + situation, I got this function running, but the situation was too much + worse for Linux :-(, so tests will continue. + 3) Buffers got now consistent. No open address mapping, as before and + therefore no further troubles with the unassigned memory segmentation + faults that scrambled probes on 95XX series and even on 85XX series, + when the kernel is done in a not so perfectly fitting way. + 4) Spontaneous interrupts from the subsystem, appearing without any + command previously queued are answered with a DID_BAD_INTR result. + 5) Taken into account ZP Gus' proposals to reverse the SCSI-device + scan order. As it does not work on Kernel 2.1.x or 2.2.x, as proposed + by him, I implemented it in a slightly derived way, which offers in + addition more flexibility. + - Michael Lang + + Apr 23, 2000 (v3.2pre1) + 1) During a very long time, I collected a huge amount of bugreports from + various people, trying really quite different things on their SCSI- + PS/2s. Today, all these bugreports are taken into account and should be + mostly solved. The major topics were: + - Driver crashes during boottime by no obvious reason. + - Driver panics while the midlevel-SCSI-driver is trying to inquire + the SCSI-device properties, even though hardware is in perfect state. + - Displayed info for the various slot-cards is interpreted wrong. + The main reasons for the crashes were two: + 1) The commands to check for device information like INQUIRY, + TEST_UNIT_READY, REQUEST_SENSE and MODE_SENSE cause the devices + to deliver information of up to 255 bytes. Midlevel drivers offer + 1024 bytes of space for the answer, but the IBM-SCSI-adapters do + not accept this, as they stick quite near to ANSI-SCSI and report + a COMMAND_ERROR message which causes the driver to panic. The main + problem was located around the INQUIRY command. Now, for all the + mentioned commands, the buffersize, sent to the adapter is at + maximum 255 which seems to be a quite reasonable solution. + TEST_UNIT_READY gets a buffersize of 0 to make sure, that no + data is transferred in order to avoid any possible command failure. + 2) On unsuccessful TEST_UNIT_READY, the midlevel-driver has to send + a REQUEST_SENSE in order to see, where the problem is located. This + REQUEST_SENSE may have various length in its answer-buffer. IBM + SCSI-subsystems report a command failure, if the returned buffersize + is different from the sent buffersize, but this can be supressed by + a special bit, which is now done and problems seem to be solved. + 2) Code adaption to all kernel-releases. Now, the 3.2 code compiles on + 2.0.x, 2.1.x, 2.2.x and 2.3.x kernel releases without any code-changes. + 3) Commandline-parameters are recognized again, even under Kernel 2.3.x or + higher. + - Michael Lang + + April 27, 2000 (v3.2pre2) + 1) Bypassed commands get read by the adapter by one cycle instead of two. + This increases SCSI-performance. + 2) Synchronous datatransfer is provided for sure to be 5 MHz on older + SCSI and 10 MHz on internal F/W SCSI-adapter. + 3) New commandline parameters allow to force the adapter to slow down while + in synchronous transfer. Could be helpful for very old devices. + - Michael Lang + + June 2, 2000 (v3.2pre5) + 1) Added Jim Shorney's contribution to make the activity indicator + flashing in addition to the LED-alphanumeric display-panel on + models 95A. To be enabled to choose this feature freely, a new + commandline parameter is added, called 'activity'. + 2) Added the READ_CONTROL bit for test_unit_ready SCSI-command. + 3) Added some suppress_exception bits to read_device_capacity and + all device_inquiry occurrences in the driver code. + 4) Complaints about the various KERNEL_VERSION implementations are + taken into account. Every local_LinuxKernelVersion occurrence is + now replaced by KERNEL_VERSION, defined in linux/version.h. + Corresponding changes were applied to ibmmca.h, too. This was a + contribution to all kernel-parts by Philipp Hahn. + - Michael Lang + + July 17, 2000 (v3.2pre8) + A long period of collecting bugreports from all corners of the world + now lead to the following corrections to the code: + 1) SCSI-2 F/W support crashed with a COMMAND ERROR. The reason for this + was, that it is possible to disbale Fast-SCSI for the external bus. + The feature-control command, where this crash appeared regularly tried + to set the maximum speed of 10MHz synchronous transfer speed and that + reports a COMMAND ERROR, if external bus Fast-SCSI is disabled. Now, + the feature-command probes down from maximum speed until the adapter + stops to complain, which is at the same time the maximum possible + speed selected in the reference program. So, F/W external can run at + 5 MHz (slow-) or 10 MHz (fast-SCSI). During feature probing, the + COMMAND ERROR message is used to detect if the adapter does not complain. + 2) Up to now, only combined busmode is supported, if you use external + SCSI-devices, attached to the F/W-controller. If dual bus is selected, + only the internal SCSI-devices get accessed by Linux. For most + applications, this should do fine. + 3) Wide-SCSI-addressing (16-Bit) is now possible for the internal F/W + bus on the F/W adapter. If F/W adapter is detected, the driver + automatically uses the extended PUN/LUN <-> LDN mapping tables, which + are now new from 3.2pre8. This allows PUNs between 0 and 15 and should + provide more fun with the F/W adapter. + 4) Several machines use the SCSI: POS registers for internal/undocumented + storage of system relevant info. This confused the driver, mainly on + models 9595, as it expected no onboard SCSI only, if all POS in + the integrated SCSI-area are set to 0x00 or 0xff. Now, the mechanism + to check for integrated SCSI is much more restrictive and these problems + should be history. + - Michael Lang + + July 18, 2000 (v3.2pre9) + This develop rather quickly at the moment. Two major things were still + missing in 3.2pre8: + 1) The adapter PUN for F/W adapters has 4-bits, while all other adapters + have 3-bits. This is now taken into account for F/W. + 2) When you select CONFIG_IBMMCA_SCSI_ORDER_STANDARD, you should + normally get the inverse probing order of your devices on the SCSI-bus. + The ANSI device order gets scrambled in version 3.2pre8!! Now, a new + and tested algorithm inverts the device-order on the SCSI-bus and + automatically avoids accidental access to whatever SCSI PUN the adapter + is set and works with SCSI- and Wide-SCSI-addressing. + - Michael Lang + + July 23, 2000 (v3.2pre10 unpublished) + 1) LED panel display supports wide-addressing in ibmmca=display mode. + 2) Adapter-information and autoadaption to address-space is done. + 3) Auto-probing for maximum synchronous SCSI transfer rate is working. + 4) Optimization to some embedded function calls is applied. + 5) Added some comment for the user to wait for SCSI-devices being probed. + 6) Finished version 3.2 for Kernel 2.4.0. It least, I thought it is but... + - Michael Lang + + July 26, 2000 (v3.2pre11) + 1) I passed a horrible weekend getting mad with NMIs on kernel 2.2.14 and + a model 9595. Asking around in the community, nobody except of me has + seen such errors. Weired, but I am trying to recompile everything on + the model 9595. Maybe, as I use a specially modified gcc, that could + cause problems. But, it was not the reason. The true background was, + that the kernel was compiled for i386 and the 9595 has a 486DX-2. + Normally, no troubles should appear, but for this special machine, + only the right processor support is working fine! + 2) Previous problems with synchronous speed, slowing down from one adapter + to the next during probing are corrected. Now, local variables store + the synchronous bitmask for every single adapter found on the MCA bus. + 3) LED alphanumeric panel support for XX95 systems is now showing some + alive rotator during boottime. This makes sense, when no monitor is + connected to the system. You can get rid of all display activity, if + you do not use any parameter or just ibmmcascsi=activity, for the + harddrive activity LED, existant on all PS/2, except models 8595-XXX. + If no monitor is available, please use ibmmcascsi=display, which works + fine together with the linuxinfo utility for the LED-panel. + - Michael Lang + + July 29, 2000 (v3.2) + 1) Submission of this driver for kernel 2.4test-XX and 2.2.17. + - Michael Lang + + December 28, 2000 (v3.2d / v4.0) + 1) The interrupt handler had some wrong statement to wait for. This + was done due to experimental reasons during 3.2 development but it + has shown that this is not stable enough. Going back to wait for the + adapter to be not busy is best. + 2) Inquiry requests can be shorter than 255 bytes of return buffer. Due + to a bug in the ibmmca_queuecommand routine, this buffer was forced + to 255 at minimum. If the memory address, this return buffer is pointing + to does not offer more space, invalid memory accesses destabilized the + kernel. + 3) version 4.0 is only valid for kernel 2.4.0 or later. This is necessary + to remove old kernel version dependent waste from the driver. 3.2d is + only distributed with older kernels but keeps compatibility with older + kernel versions. 4.0 and higher versions cannot be used with older + kernels anymore!! You must have at least kernel 2.4.0!! + 4) The commandline argument 'bypass' and all its functionality got removed + in version 4.0. This was never really necessary, as all troubles were + based on non-command related reasons up to now, so bypassing commands + did not help to avoid any bugs. It is kept in 3.2X for debugging reasons. + 5) Dynamical reassignment of ldns was again verified and analyzed to be + completely inoperational. This is corrected and should work now. + 6) All commands that get sent to the SCSI adapter were verified and + completed in such a way, that they are now completely conform to the + demands in the technical description of IBM. Main candidates were the + DEVICE_INQUIRY, REQUEST_SENSE and DEVICE_CAPACITY commands. They must + be tranferred by bypassing the internal command buffer of the adapter + or else the response can be a random result. GET_POS_INFO would be more + safe in usage, if one could use the SUPRESS_EXCEPTION_SHORT, but this + is not allowed by the technical references of IBM. (Sorry, folks, the + model 80 problem is still a task to be solved in a different way.) + 7) v3.2d is still hold back for some days for testing, while 4.0 is + released. + - Michael Lang + + January 3, 2001 (v4.0a) + 1) A lot of complains after the 2.4.0-prerelease kernel came in about + the impossibility to compile the driver as a module. This problem is + solved. In combination with that problem, some unprecise declaration + of the function option_setup() gave some warnings during compilation. + This is solved, too by a forward declaration in ibmmca.c. + 2) #ifdef argument concerning CONFIG_SCSI_IBMMCA is no longer needed and + was entirely removed. + 3) Some switch statements got optimized in code, as some minor variables + in internal SCSI-command handlers. + - Michael Lang + + 4 To do + ------- + - IBM SCSI-2 F/W external SCSI bus support in separate mode! + - It seems that the handling of bad disks is really bad - + non-existent, in fact. However, a low-level driver cannot help + much, if such things happen. + + 5 Users' Manual + --------------- + 5.1 Commandline Parameters + -------------------------- + There exist several features for the IBM SCSI-subsystem driver. + The commandline parameter format is: + + ibmmcascsi=<command1>,<command2>,<command3>,... + + where commandN can be one of the following: + + display Owners of a model 95 or other PS/2 systems with an + alphanumeric LED display may set this to have their + display showing the following output of the 8 digits: + + ------DA + + where '-' stays dark, 'D' shows the SCSI-device id + and 'A' shows the SCSI hostindex, being currently + accessed. During boottime, this will give the message + + SCSIini* + + on the LED-panel, where the * represents a rotator, + showing the activity during the probing phase of the + driver which can take up to two minutes per SCSI-adapter. + adisplay This works like display, but gives more optical overview + of the activities on the SCSI-bus. The display will have + the following output: + + 6543210A + + where the numbers 0 to 6 light up at the shown position, + when the SCSI-device is accessed. 'A' shows again the SCSI + hostindex. If display nor adisplay is set, the internal + PS/2 harddisk LED is used for media-activities. So, if + you really do not have a system with a LED-display, you + should not set display or adisplay. Keep in mind, that + display and adisplay can only be used alternatively. It + is not recommended to use this option, if you have some + wide-addressed devices e.g. at the SCSI-2 F/W adapter in + your system. In addition, the usage of the display for + other tasks in parallel, like the linuxinfo-utility makes + no sense with this option. + activity This enables the PS/2 harddisk LED activity indicator. + Most PS/2 have no alphanumeric LED display, but some + indicator. So you should use this parameter to activate it. + If you own model 9595 (Server95), you can have both, the + LED panel and the activity indicator in parallel. However, + some PS/2s, like the 8595 do not have any harddisk LED + activity indicator, which means, that you must use the + alphanumeric LED display if you want to monitor SCSI- + activity. + bypass This is obsolete from driver version 4.0, as the adapters + got that far understood, that the selection between + integrated and bypassed commands should now work completely + correct! For historical reasons, the old description is + kept here: + This commandline parameter forces the driver never to use + SCSI-subsystems' integrated SCSI-command set. Except of + the immediate assign, which is of vital importance for + every IBM SCSI-subsystem to set its ldns right. Instead, + the ordinary ANSI-SCSI-commands are used and passed by the + controller to the SCSI-devices, therefore 'bypass'. The + effort, done by the subsystem is quite bogus and at a + minimum and therefore it should work everywhere. This + could maybe solve troubles with old or integrated SCSI- + controllers and nasty harddisks. Keep in mind, that using + this flag will slow-down SCSI-accesses slightly, as the + software generated commands are always slower than the + hardware. Non-harddisk devices always get read/write- + commands in bypass mode. On the most recent releases of + the Linux IBM-SCSI-driver, the bypass command should be + no longer a necessary thing, if you are sure about your + SCSI-hardware! + normal This is the parameter, introduced on the 2.0.x development + rail by ZP Gu. This parameter defines the SCSI-device + scan order in the new industry standard. This means, that + the first SCSI-device is the one with the lowest pun. + E.g. harddisk at pun=0 is scanned before harddisk at + pun=6, which means, that harddisk at pun=0 gets sda + and the one at pun=6 gets sdb. + ansi The ANSI-standard for the right scan order, as done by + IBM, Microware and Microsoft, scans SCSI-devices starting + at the highest pun, which means, that e.g. harddisk at + pun=6 gets sda and a harddisk at pun=0 gets sdb. If you + like to have the same SCSI-device order, as in DOS, OS-9 + or OS/2, just use this parameter. + fast SCSI-I/O in synchronous mode is done at 5 MHz for IBM- + SCSI-devices. SCSI-2 Fast/Wide Adapter/A external bus + should then run at 10 MHz if Fast-SCSI is enabled, + and at 5 MHz if Fast-SCSI is disabled on the external + bus. This is the default setting when nothing is + specified here. + medium Synchronous rate is at 50% approximately, which means + 2.5 MHz for IBM SCSI-adapters and 5.0 MHz for F/W ext. + SCSI-bus (when Fast-SCSI speed enabled on external bus). + slow The slowest possible synchronous transfer rate is set. + This means 1.82 MHz for IBM SCSI-adapters and 2.0 MHz + for F/W external bus at Fast-SCSI speed on the external + bus. + + A further option is that you can force the SCSI-driver to accept a SCSI- + subsystem at a certain I/O-address with a predefined adapter PUN. This + is done by entering + + commandN = I/O-base + commandN+1 = adapter PUN + + e.g. ibmmcascsi=0x3540,7 will force the driver to detect a SCSI-subsystem + at I/O-address 0x3540 with adapter PUN 7. Please only use this method, if + the driver does really not recognize your SCSI-adapter! With driver version + 3.2, this recognition of various adapters was hugely improved and you + should try first to remove your commandline arguments of such type with a + newer driver. I bet, it will be recognized correctly. Even multiple and + different types of IBM SCSI-adapters should be recognized correctly, too. + Use the forced detection method only as last solution! + + Examples: + + ibmmcascsi=adisplay + + This will use the advanced display mode for the model 95 LED alphanumeric + display. + + ibmmcascsi=display,0x3558,7 + + This will activate the default display mode for the model 95 LED display + and will force the driver to accept a SCSI-subsystem at I/O-base 0x3558 + with adapter PUN 7. + + 5.2 Troubleshooting + ------------------- + The following FAQs should help you to solve some major problems with this + driver. + + Q: "Reset SCSI-devices at boottime" halts the system at boottime, why? + A: This is only tested with the IBM SCSI Adapter w/cache. It is not + yet prooved to run on other adapters, however you may be lucky. + In version 3.1d this has been hugely improved and should work better, + now. Normally you really won't need to activate this flag in the + kernel configuration, as all post 1989 SCSI-devices should accept + the reset-signal, when the computer is switched on. The SCSI- + subsystem generates this reset while being initialized. This flag + is really reserved for users with very old, very strange or self-made + SCSI-devices. + Q: Why is the SCSI-order of my drives mirrored to the device-order + seen from OS/2 or DOS ? + A: It depends on the operating system, if it looks at the devices in + ANSI-SCSI-standard (starting from pun 6 and going down to pun 0) or + if it just starts at pun 0 and counts up. If you want to be conform + with OS/2 and DOS, you have to activate this flag in the kernel + configuration or you should set 'ansi' as parameter for the kernel. + The parameter 'normal' sets the new industry standard, starting + from pun 0, scanning up to pun 6. This allows you to change your + opinion still after having already compiled the kernel. + Q: Why I cannot find the IBM MCA SCSI support in the config menue? + A: You have to activate MCA bus support, first. + Q: Where can I find the latest info about this driver? + A: See the file MAINTAINERS for the current WWW-address, which offers + updates, info and Q/A lists. At this files' origin, the webaddress + was: http://www.uni-mainz.de/~langm000/linux.html + Q: My SCSI-adapter is not recognized by the driver, what can I do? + A: Just force it to be recognized by kernel parameters. See section 5.1. + If this really happens, do also send e-mail to the maintainer, as + forced detection should be never necessary. Forced detection is in + principal some flaw of the driver adapter detection and goes into + bugreports. + Q: The driver screws up, if it starts to probe SCSI-devices, is there + some way out of it? + A: Yes, that was some recognition problem of the correct SCSI-adapter + and its I/O base addresses. Upgrade your driver to the latest release + and it should be fine again. + Q: I get a message: panic IBM MCA SCSI: command error .... , what can + I do against this? + A: Previously, I followed the way by ignoring command errors by using + ibmmcascsi=forgiveall, but this command no longer exists and is + obsolete. If such a problem appears, it is caused by some segmentation + fault of the driver, which maps to some unallowed area. The latest + version of the driver should be ok, as most bugs have been solved. + Q: There are still kernel panics, even after having set + ibmmcascsi=forgiveall. Are there other possibilities to prevent + such panics? + A: No, get just the latest release of the driver and it should work + better and better with increasing version number. Forget about this + ibmmcascsi=forgiveall, as also ignorecmd are obsolete.! + Q: Linux panics or stops without any comment, but it is probable, that my + harddisk(s) have bad blocks. + A: Sorry, the bad-block handling is still a feeble point of this driver, + but is on the schedule for development in the near future. + Q: Linux panics while dynamically assigning SCSI-ids or ldns. + A: If you disconnect a SCSI-device from the machine, while Linux is up + and the driver uses dynamical reassignment of logical device numbers + (ldn), it really gets "angry" if it won't find devices, that were still + present at boottime and stops Linux. + Q: The system does not recover after an abort-command has been generated. + A: This is regrettably true, as it is not yet understood, why the + SCSI-adapter does really NOT generate any interrupt at the end of + the abort-command. As no interrupt is generated, the abort command + cannot get finished and the system hangs, sorry, but checks are + running to hunt down this problem. If there is a real pending command, + the interrupt MUST get generated after abort. In this case, it + should finish well. + Q: The system gets in bad shape after a SCSI-reset, is this known? + A: Yes, as there are a lot of prescriptions (see the Linux Hackers' + Guide) what has to be done for reset, we still share the bad shape of + the reset functions with all other low level SCSI-drivers. + Astonishingly, reset works in most cases quite ok, but the harddisks + won't run in synchonous mode anymore after a reset, until you reboot. + Q: Why does my XXX w/Cache adapter not use read-prefetch? + A: Ok, that is not completely possible. If a cache is present, the + adapter tries to use it internally. Explicitly, one can use the cache + with a read prefetch command, maybe in future, but this requires + some major overhead of SCSI-commands that risks the performance to + go down more than it gets improved. Tests with that are running. + Q: I have a IBM SCSI-2 Fast/Wide adapter, it boots in some way and hangs. + A: Yes, that is understood, as for sure, your SCSI-2 Fast/Wide adapter + was in such a case recognized as integrated SCSI-adapter or something + else, but not as the correct adapter. As the I/O-ports get assigned + wrongly by that reason, the system should crash in most cases. You + should upgrade to the latest release of the SCSI-driver. The + recommended version is 3.2 or later. Here, the F/W support is in + a stable and reliable condition. Wide-addressing is in addition + supported. + Q: I get a Ooops message and something like "killing interrupt". + A: The reason for this is that the IBM SCSI-subsystem only sends a + termination status back, if some error appeared. In former releases + of the driver, it was not checked, if the termination status block + is NULL. From version 3.2, it is taken care of this. + Q: I have a F/W adapter and the driver sees my internal SCSI-devices, + but ignores the external ones. + A: Select combined busmode in the IBM config-program and check for that + no SCSI-id on the external devices appears on internal devices. + Reboot afterwards. Dual busmode is supported, but works only for the + internal bus, yet. External bus is still ignored. Take care for your + SCSI-ids. If combined bus-mode is activated, on some adapters, + the wide-addressing is not possible, so devices with ids between 8 + and 15 get ignored by the driver & adapter! + Q: I have a 9595 and I get a NMI during heavy SCSI I/O e.g. during fsck. + A COMMAND ERROR is reported and characters on the screen are missing. + Warm reboot is not possible. Things look like quite weired. + A: Check the processor type of your 9595. If you have an 80486 or 486DX-2 + processor complex on your mainboard and you compiled a kernel that + supports 80386 processors, it is possible, that the kernel cannot + keep track of the PS/2 interrupt handling and stops on an NMI. Just + compile a kernel for the correct processor type of your PS/2 and + everything should be fine. This is necessary even if one assumes, + that some 80486 system should be downward compatible to 80386 + software. + Q: Some commands hang and interrupts block the machine. After some + timeout, the syslog reports that it tries to call abort, but the + machine is frozen. + A: This can be a busy wait bug in the interrupt handler of driver + version 3.2. You should at least upgrade to 3.2c if you use + kernel < 2.4.0 and driver version 4.0 if you use kernel 2.4.0 or + later (including all test releases). + Q: I have a PS/2 model 80 and more than 16 MBytes of RAM. The driver + completely refuses to work, reports NMIs, COMMAND ERRORs or other + ambiguous stuff. When reducing the RAM size down below 16 MB, + everything is running smoothly. + A: No real answer, yet. In any case, one should force the kernel to + present SCBs only below the 16 MBytes barrier. Maybe this solves the + problem. Not yet tried, but guessing that it could work. To get this, + set unchecked_isa_dma argument of ibmmca.h from 0 to 1. + + 5.3 Bugreports + -------------- + If you really find bugs in the sourcecode or the driver will successfully + refuse to work on your machine, you should send a bug report to me. The + best for this is to follow the instructions on the WWW-page for this + driver. Fill out the bug-report form, placed on the WWW-page and ship it, + so the bugs can be taken into account with maximum efforts. But, please + do not send bug reports about this driver to Linus Torvalds or Leonard + Zubkoff, as Linus is burried in E-Mail and Leonard is supervising all + SCSI-drivers and won't have the time left to look inside every single + driver to fix a bug and especially DO NOT send modified code to Linus + Torvalds or Alan J. Cox which has not been checked here!!! They are both + quite burried in E-mail (as me, sometimes, too) and one should first check + for problems on my local teststand. Recently, I got a lot of + bugreports for errors in the ibmmca.c code, which I could not imagine, but + a look inside some Linux-distribution showed me quite often some modified + code, which did no longer work on most other machines than the one of the + modifier. Ok, so now that there is maintenance service available for this + driver, please use this address first in order to keep the level of + confusion low. Thank you! + + When you get a SCSI-error message that panics your system, a list of + register-entries of the SCSI-subsystem is shown (from Version 3.1d). With + this list, it is very easy for the maintainer to localize the problem in + the driver or in the configuration of the user. Please write down all the + values from this report and send them to the maintainer. This would really + help a lot and makes life easier concerning misunderstandings. + + Use the bug-report form (see 5.4 for its address) to send all the bug- + stuff to the maintainer or write e-mail with the values from the table. + + 5.4 Support WWW-page + -------------------- + The address of the IBM SCSI-subsystem supporting WWW-page is: + + http://www.uni-mainz.de/~langm000/linux.html + + Here you can find info about the background of this driver, patches, + troubleshooting support, news and a bugreport form. Please check that + WWW-page regularly for latest hints. If ever this URL changes, please + refer to the MAINTAINERS file in order to get the latest address. + + For the bugreport, please fill out the formular on the corresponding + WWW-page. Read the dedicated instructions and write as much as you + know about your problem. If you do not like such formulars, please send + some e-mail directly, but at least with the same information as required by + the formular. + + If you have extensive bugreports, including Ooops messages and + screen-shots, please feel free to send it directly to the address + of the maintainer, too. The current address of the maintainer is: + + Michael Lang <langa2@kph.uni-mainz.de> + + 6 References + ------------ + IBM Corp., "Update for the PS/2 Hardware Interface Technical Reference, + Common Interfaces", Armonk, September 1991, PN 04G3281, + (available in the U.S. for $21.75 at 1-800-IBM-PCTB or in Germany for + around 40,-DM at "Hallo IBM"). + + IBM Corp., "Personal System/2 Micro Channel SCSI + Adapter with Cache Technical Reference", Armonk, March 1990, PN 68X2365. + + IBM Corp., "Personal System/2 Micro Channel SCSI + Adapter Technical Reference", Armonk, March 1990, PN 68X2397. + + IBM Corp., "SCSI-2 Fast/Wide Adapter/A Technical Reference - Dual Bus", + Armonk, March 1994, PN 83G7545. + + Friedhelm Schmidt, "SCSI-Bus und IDE-Schnittstelle - Moderne Peripherie- + Schnittstellen: Hardware, Protokollbeschreibung und Anwendung", 2. Aufl. + Addison Wesley, 1996. + + Michael K. Johnson, "The Linux Kernel Hackers' Guide", Version 0.6, Chapel + Hill - North Carolina, 1995 + + Andreas Kaiser, "SCSI TAPE BACKUP for OS/2 2.0", Version 2.12, Stuttgart + 1993 + + Helmut Rompel, "IBM Computerwelt GUIDE", What is what bei IBM., Systeme * + Programme * Begriffe, IWT-Verlag GmbH - Muenchen, 1988 + + 7 Credits to + ------------ + 7.1 People + ---------- + Klaus Grimm + who already a long time ago gave me the old code from the + SCSI-driver in order to get it running for some old machine + in our institute. + Martin Kolinek + who wrote the first release of the IBM SCSI-subsystem driver. + Chris Beauregard + who for a long time maintained MCA-Linux and the SCSI-driver + in the beginning. Chris, wherever you are: Cheers to you! + Klaus Kudielka + with whom in the 2.1.x times, I had a quite fruitful + cooperation to get the driver running as a module and to get + it running with multiple SCSI-adapters. + David Weinehall + for his excellent maintenance of the MCA-stuff and the quite + detailed bug reports and ideas for this driver (and his + patience ;-)). + Alan J. Cox + for his bugreports and his bold activities in cross-checking + the driver-code with his teststand. + + 7.2 Sponsors & Supporters + ------------------------- + "Hallo IBM", + IBM-Deutschland GmbH + the service of IBM-Deutschland for customers. Their E-Mail + service is unbeatable. Whatever old stuff I asked for, I + always got some helpful answers. + Karl-Otto Reimers, + IBM Klub - Sparte IBM Geschichte, Sindelfingen + for sending me a copy of the w/Cache manual from the + IBM-Deutschland archives. + Harald Staiger + for his extensive hardware donations which allows me today + still to test the driver in various constellations. + Erich Fritscher + for his very kind sponsoring. + Louis Ohland, + Charles Lasitter + for support by shipping me an IBM SCSI-2 Fast/Wide manual. + In addition, the contribution of various hardware is quite + decessive and will make it possible to add FWSR (RAID) + adapter support to the driver in the near future! So, + complaints about no RAID support won't remain forever. + Yes, folks, that is no joke, RAID support is going to rise! + Erik Weber + for the great deal we made about a model 9595 and the nice + surrounding equipment and the cool trip to Mannheim + second-hand computer market. In addition, I would like + to thank him for his exhaustive SCSI-driver testing on his + 95er PS/2 park. + Anthony Hogbin + for his direct shipment of a SCSI F/W adapter, which allowed + me immediately on the first stage to try it on model 8557 + together with onboard SCSI adapter and some SCSI w/Cache. + Andreas Hotz + for his support by memory and an IBM SCSI-adapter. Collecting + all this together now allows me to try really things with + the driver at maximum load and variety on various models in + a very quick and efficient way. + Peter Jennewein + for his model 30, which serves me as part of my teststand + and his cool remark about how you make an ordinary diskette + drive working and how to connect it to an IBM-diskette port. + Johannes Gutenberg-Universitaet, Mainz & + Institut fuer Kernphysik, Mainz Microtron (MAMI) + for the offered space, the link, placed on the central + homepage and the space to store and offer the driver and + related material and the free working times, which allow + me to answer all your e-mail. + + 8 Trademarks + ------------ + IBM, PS/2, OS/2, Microchannel are registered trademarks of International + Business Machines Corporation + + MS-DOS is a registered trademark of Microsoft Corporation + + Microware, OS-9 are registered trademarks of Microware Systems + + 9 Disclaimer + ------------ + Beside the GNU General Public License and the dependent disclaimers and disclaimers + concerning the Linux-kernel in special, this SCSI-driver comes without any + warranty. Its functionality is tested as good as possible on certain + machines and combinations of computer hardware, which does not exclude, + that dataloss or severe damage of hardware is possible while using this + part of software on some arbitrary computer hardware or in combination + with other software packages. It is highly recommended to make backup + copies of your data before using this software. Furthermore, personal + injuries by hardware defects, that could be caused by this SCSI-driver are + not excluded and it is highly recommended to handle this driver with a + maximum of carefulness. + + This driver supports hardware, produced by International Business Machines + Corporation (IBM). + +------ +Michael Lang +(langa2@kph.uni-mainz.de) diff --git a/Documentation/scsi/in2000.txt b/Documentation/scsi/in2000.txt new file mode 100644 index 000000000000..80f104042645 --- /dev/null +++ b/Documentation/scsi/in2000.txt @@ -0,0 +1,202 @@ + +UPDATE NEWS: version 1.33 - 26 Aug 98 + + Interrupt management in this driver has become, over + time, increasingly odd and difficult to explain - this + has been mostly due to my own mental inadequacies. In + recent kernels, it has failed to function at all when + compiled for SMP. I've fixed that problem, and after + taking a fresh look at interrupts in general, greatly + reduced the number of places where they're fiddled + with. Done some heavy testing and it looks very good. + The driver now makes use of the __initfunc() and + __initdata macros to save about 4k of kernel memory. + Once again, the same code works for both 2.0.xx and + 2.1.xx kernels. + +UPDATE NEWS: version 1.32 - 28 Mar 98 + + Removed the check for legal IN2000 hardware versions: + It appears that the driver works fine with serial + EPROMs (the 8-pin chip that defines hardware rev) as + old as 2.1, so we'll assume that all cards are OK. + +UPDATE NEWS: version 1.31 - 6 Jul 97 + + Fixed a bug that caused incorrect SCSI status bytes to be + returned from commands sent to LUN's greater than 0. This + means that CDROM changers work now! Fixed a bug in the + handling of command-line arguments when loaded as a module. + Also put all the header data in in2000.h where it belongs. + There are no longer any differences between this driver in + the 2.1.xx source tree and the 2.0.xx tree, as of 2.0.31 + and 2.1.45 (or is it .46?) - this makes things much easier + for me... + +UPDATE NEWS: version 1.30 - 14 Oct 96 + + Fixed a bug in the code that sets the transfer direction + bit (DESTID_DPD in the WD_DESTINATION_ID register). There + are quite a few SCSI commands that do a write-to-device; + now we deal with all of them correctly. Thanks to Joerg + Dorchain for catching this one. + +UPDATE NEWS: version 1.29 - 24 Sep 96 + + The memory-mapped hardware on the card is now accessed via + the 'readb()' and 'readl()' macros - required by the new + memory management scheme in the 2.1.x kernel series. + As suggested by Andries Brouwer, 'bios_param()' no longer + forces an artificial 1023 track limit on drives. Also + removed some kludge-code left over from struggles with + older (buggy) compilers. + +UPDATE NEWS: version 1.28 - 07 May 96 + + Tightened up the "interrupts enabled/disabled" discipline + in 'in2000_queuecommand()' and maybe 1 or 2 other places. + I _think_ it may have been a little too lax, causing an + occasional crash during full moon. A fully functional + /proc interface is now in place - if you want to play + with it, start by doing 'cat /proc/scsi/in2000/0'. You + can also use it to change a few run-time parameters on + the fly, but it's mostly for debugging. The curious + should take a good look at 'in2000_proc_info()' in the + in2000.c file to get an understanding of what it's all + about; I figure that people who are really into it will + want to add features suited to their own needs... + Also, sync is now DISABLED by default. + +UPDATE NEWS: version 1.27 - 10 Apr 96 + + Fixed a well-hidden bug in the adaptive-disconnect code + that would show up every now and then during extreme + heavy loads involving 2 or more simultaneously active + devices. Thanks to Joe Mack for keeping my nose to the + grindstone on this one. + +UPDATE NEWS: version 1.26 - 07 Mar 96 + + 1.25 had a nasty bug that bit people with swap partitions + and tape drives. Also, in my attempt to guess my way + through Intel assembly language, I made an error in the + inline code for IO writes. Made a few other changes and + repairs - this version (fingers crossed) should work well. + +UPDATE NEWS: version 1.25 - 05 Mar 96 + + Kernel 1.3.70 interrupt mods added; old kernels still OK. + Big help from Bill Earnest and David Willmore on speed + testing and optimizing: I think there's a real improvement + in this area. + New! User-friendly command-line interface for LILO and + module loading - the old method is gone, so you'll need + to read the comments for 'setup_strings' near the top + of in2000.c. For people with CDROM's or other devices + that have a tough time with sync negotiation, you can + now selectively disable sync on individual devices - + search for the 'nosync' keyword in the command-line + comments. Some of you disable the BIOS on the card, which + caused the auto-detect function to fail; there is now a + command-line option to force detection of a ROM-less card. + +UPDATE NEWS: version 1.24a - 24 Feb 96 + + There was a bug in the synchronous transfer code. Only + a few people downloaded before I caught it - could have + been worse. + +UPDATE NEWS: version 1.24 - 23 Feb 96 + + Lots of good changes. Advice from Bill Earnest resulted + in much better detection of cards, more efficient usage + of the fifo, and (hopefully) faster data transfers. The + jury is still out on speed - I hope it's improved some. + One nifty new feature is a cool way of doing disconnect/ + reselect. The driver defaults to what I'm calling + 'adaptive disconnect' - meaning that each command is + evaluated individually as to whether or not it should be + run with the option to disconnect/reselect (if the device + chooses), or as a "SCSI-bus-hog". When several devices + are operating simultaneously, disconnects are usually an + advantage. In a single device system, or if only 1 device + is being accessed, transfers usually go faster if disconnects + are not allowed. + + + +The default arguments (you get these when you don't give an 'in2000' +command-line argument, or you give a blank argument) will cause +the driver to do adaptive disconnect, synchronous transfers, and a +minimum of debug messages. If you want to fool with the options, +search for 'setup_strings' near the top of the in2000.c file and +check the 'hostdata->args' section in in2000.h - but be warned! Not +everything is working yet (some things will never work, probably). +I believe that disabling disconnects (DIS_NEVER) will allow you +to choose a LEVEL2 value higher than 'L2_BASIC', but I haven't +spent a lot of time testing this. You might try 'ENABLE_CLUSTERING' +to see what happens: my tests showed little difference either way. +There's also a define called 'DEFAULT_SX_PER'; this sets the data +transfer speed for the asynchronous mode. I've put it at 500 ns +despite the fact that the card could handle settings of 376 or +252, because higher speeds may be a problem with poor quality +cables or improper termination; 500 ns is a compromise. You can +choose your own default through the command-line with the +'period' keyword. + + +------------------------------------------------ +*********** DIP switch settings ************** +------------------------------------------------ + + sw1-1 sw1-2 BIOS address (hex) + ----------------------------------------- + off off C8000 - CBFF0 + on off D8000 - DBFF0 + off on D0000 - D3FF0 + on on BIOS disabled + + sw1-3 sw1-4 IO port address (hex) + ------------------------------------ + off off 220 - 22F + on off 200 - 20F + off on 110 - 11F + on on 100 - 10F + + sw1-5 sw1-6 sw1-7 Interrupt + ------------------------------ + off off off 15 + off on off 14 + off off on 11 + off on on 10 + on - - disabled + + sw1-8 function depends on BIOS version. In earlier versions this + controlled synchronous data transfer support for MSDOS: + off = disabled + on = enabled + In later ROMs (starting with 01.3 in April 1994) sw1-8 controls + the "greater than 2 disk drive" feature that first appeared in + MSDOS 5.0 (ignored by Linux): + off = 2 drives maximum + on = 7 drives maximum + + sw1-9 Floppy controller + -------------------------- + off disabled + on enabled + +------------------------------------------------ + + I should mention that Drew Eckhardt's 'Generic NCR5380' sources + were my main inspiration, with lots of reference to the IN2000 + driver currently distributed in the kernel source. I also owe + much to a driver written by Hamish Macdonald for Linux-m68k(!). + And to Eric Wright for being an ALPHA guinea pig. And to Bill + Earnest for 2 tons of great input and information. And to David + Willmore for extensive 'bonnie' testing. And to Joe Mack for + continual testing and feedback. + + + John Shifflett jshiffle@netcom.com + diff --git a/Documentation/scsi/megaraid.txt b/Documentation/scsi/megaraid.txt new file mode 100644 index 000000000000..ff864c0f494c --- /dev/null +++ b/Documentation/scsi/megaraid.txt @@ -0,0 +1,70 @@ + Notes on Management Module + ~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Overview: +-------- + +Different classes of controllers from LSI Logic, accept and respond to the +user applications in a similar way. They understand the same firmware control +commands. Furthermore, the applications also can treat different classes of +the controllers uniformly. Hence it is logical to have a single module that +interefaces with the applications on one side and all the low level drivers +on the other. + +The advantages, though obvious, are listed for completeness: + + i. Avoid duplicate code from the low level drivers. + ii. Unburden the low level drivers from having to export the + character node device and related handling. + iii. Implement any policy mechanisms in one place. + iv. Applications have to interface with only module instead of + multiple low level drivers. + +Currently this module (called Common Management Module) is used only to issue +ioctl commands. But this module is envisioned to handle all user space level +interactions. So any 'proc', 'sysfs' implementations will be localized in this +common module. + +Credits: +------- + +"Shared code in a third module, a "library module", is an acceptable +solution. modprobe automatically loads dependent modules, so users +running "modprobe driver1" or "modprobe driver2" would automatically +load the shared library module." + + - Jeff Garzik (jgarzik@pobox.com), 02.25.2004 LKML + +"As Jeff hinted, if your userspace<->driver API is consistent between +your new MPT-based RAID controllers and your existing megaraid driver, +then perhaps you need a single small helper module (lsiioctl or some +better name), loaded by both mptraid and megaraid automatically, which +handles registering the /dev/megaraid node dynamically. In this case, +both mptraid and megaraid would register with lsiioctl for each +adapter discovered, and lsiioctl would essentially be a switch, +redirecting userspace tool ioctls to the appropriate driver." + + - Matt Domsch, (Matt_Domsch@dell.com), 02.25.2004 LKML + +Design: +------ + +The Common Management Module is implemented in megaraid_mm.[ch] files. This +module acts as a registry for low level hba drivers. The low level drivers +(currently only megaraid) register each controller with the common module. + +The applications interface with the common module via the character device +node exported by the module. + +The lower level drivers now understand only a new improved ioctl packet called +uioc_t. The management module converts the older ioctl packets from the older +applications into uioc_t. After driver handles the uioc_t, the common module +will convert that back into the old format before returning to applications. + +As new applications evolve and replace the old ones, the old packet format +will be retired. + +Common module dedicates one uioc_t packet to each controller registered. This +can easily be more than one. But since megaraid is the only low level driver +today, and it can handle only one ioctl, there is no reason to have more. But +as new controller classes get added, this will be tuned appropriately. diff --git a/Documentation/scsi/ncr53c7xx.txt b/Documentation/scsi/ncr53c7xx.txt new file mode 100644 index 000000000000..91e9552d63e5 --- /dev/null +++ b/Documentation/scsi/ncr53c7xx.txt @@ -0,0 +1,40 @@ +README for WarpEngine/A4000T/A4091 SCSI kernels. + +Use the following options to disable options in the SCSI driver. + +Using amiboot for example..... + +To disable Synchronous Negotiation.... + + amiboot -k kernel 53c7xx=nosync:0 + +To disable Disconnection.... + + amiboot -k kernel 53c7xx=nodisconnect:0 + +To disable certain SCSI devices... + + amiboot -k kernel 53c7xx=validids:0x3F + + this allows only device ID's 0,1,2,3,4 and 5 for linux to handle. + (this is a bitmasked field - i.e. each bit represents a SCSI ID) + +These commands work on a per controller basis and use the option 'next' to +move to the next controller in the system. + +e.g. + amiboot -k kernel 53c7xx=nodisconnect:0,next,nosync:0 + + this uses No Disconnection on the first controller and Asynchronous + SCSI on the second controller. + +Known Issues: + +Two devices are known not to function with the default settings of using +synchronous SCSI. These are the Archive Viper 150 Tape Drive and the +SyQuest SQ555 removeable hard drive. When using these devices on a controller +use the 'nosync:0' option. + +Please try these options and post any problems/successes to me. + +Alan Hourihane <alanh@fairlite.demon.co.uk> diff --git a/Documentation/scsi/ncr53c8xx.txt b/Documentation/scsi/ncr53c8xx.txt new file mode 100644 index 000000000000..822d2aca3700 --- /dev/null +++ b/Documentation/scsi/ncr53c8xx.txt @@ -0,0 +1,1854 @@ +The Linux NCR53C8XX/SYM53C8XX drivers README file + +Written by Gerard Roudier <groudier@free.fr> +21 Rue Carnot +95170 DEUIL LA BARRE - FRANCE + +29 May 1999 +=============================================================================== + +1. Introduction +2. Supported chips and SCSI features +3. Advantages of the enhanced 896 driver + 3.1 Optimized SCSI SCRIPTS + 3.2 New features of the SYM53C896 (64 bit PCI dual LVD SCSI controller) +4. Memory mapped I/O versus normal I/O +5. Tagged command queueing +6. Parity checking +7. Profiling information +8. Control commands + 8.1 Set minimum synchronous period + 8.2 Set wide size + 8.3 Set maximum number of concurrent tagged commands + 8.4 Set order type for tagged command + 8.5 Set debug mode + 8.6 Clear profile counters + 8.7 Set flag (no_disc) + 8.8 Set verbose level + 8.9 Reset all logical units of a target + 8.10 Abort all tasks of all logical units of a target +9. Configuration parameters +10. Boot setup commands + 10.1 Syntax + 10.2 Available arguments + 10.2.1 Master parity checking + 10.2.2 Scsi parity checking + 10.2.3 Scsi disconnections + 10.2.4 Special features + 10.2.5 Ultra SCSI support + 10.2.6 Default number of tagged commands + 10.2.7 Default synchronous period factor + 10.2.8 Negotiate synchronous with all devices + 10.2.9 Verbosity level + 10.2.10 Debug mode + 10.2.11 Burst max + 10.2.12 LED support + 10.2.13 Max wide + 10.2.14 Differential mode + 10.2.15 IRQ mode + 10.2.16 Reverse probe + 10.2.17 Fix up PCI configuration space + 10.2.18 Serial NVRAM + 10.2.19 Check SCSI BUS + 10.2.20 Exclude a host from being attached + 10.2.21 Suggest a default SCSI id for hosts + 10.2.22 Enable use of IMMEDIATE ARBITRATION + 10.3 Advised boot setup commands + 10.4 PCI configuration fix-up boot option + 10.5 Serial NVRAM support boot option + 10.6 SCSI BUS checking boot option + 10.7 IMMEDIATE ARBITRATION boot option +11. Some constants and flags of the ncr53c8xx.h header file +12. Installation +13. Architecture dependent features +14. Known problems + 14.1 Tagged commands with Iomega Jaz device + 14.2 Device names change when another controller is added + 14.3 Using only 8 bit devices with a WIDE SCSI controller. + 14.4 Possible data corruption during a Memory Write and Invalidate + 14.5 IRQ sharing problems +15. SCSI problem troubleshooting + 15.1 Problem tracking + 15.2 Understanding hardware error reports +16. Synchonous transfer negotiation tables + 16.1 Synchronous timings for 53C875 and 53C860 Ultra-SCSI controllers + 16.2 Synchronous timings for fast SCSI-2 53C8XX controllers +17. Serial NVRAM support (by Richard Waltham) + 17.1 Features + 17.2 Symbios NVRAM layout + 17.3 Tekram NVRAM layout +18. Support for Big Endian + 18.1 Big Endian CPU + 18.2 NCR chip in Big Endian mode of operations + +=============================================================================== + +1. Introduction + +The initial Linux ncr53c8xx driver has been a port of the ncr driver from +FreeBSD that has been achieved in November 1995 by: + Gerard Roudier <groudier@free.fr> + +The original driver has been written for 386bsd and FreeBSD by: + Wolfgang Stanglmeier <wolf@cologne.de> + Stefan Esser <se@mi.Uni-Koeln.de> + +It is now available as a bundle of 2 drivers: + +- ncr53c8xx generic driver that supports all the SYM53C8XX family including + the ealiest 810 rev. 1, the latest 896 (2 channel LVD SCSI controller) and + the new 895A (1 channel LVD SCSI controller). +- sym53c8xx enhanced driver (a.k.a. 896 drivers) that drops support of oldest + chips in order to gain advantage of new features, as LOAD/STORE intructions + available since the 810A and hardware phase mismatch available with the + 896 and the 895A. + +You can find technical information about the NCR 8xx family in the +PCI-HOWTO written by Michael Will and in the SCSI-HOWTO written by +Drew Eckhardt. + +Information about new chips is available at LSILOGIC web server: + + http://www.lsilogic.com/ + +SCSI standard documentations are available at SYMBIOS ftp server: + + ftp://ftp.symbios.com/ + +Usefull SCSI tools written by Eric Youngdale are available at tsx-11: + + ftp://tsx-11.mit.edu/pub/linux/ALPHA/scsi/scsiinfo-X.Y.tar.gz + ftp://tsx-11.mit.edu/pub/linux/ALPHA/scsi/scsidev-X.Y.tar.gz + +These tools are not ALPHA but quite clean and work quite well. +It is essential you have the 'scsiinfo' package. + +This short documentation describes the features of the generic and enhanced +drivers, configuration parameters and control commands available through +the proc SCSI file system read / write operations. + +This driver has been tested OK with linux/i386, Linux/Alpha and Linux/PPC. + +Latest driver version and patches are available at: + + ftp://ftp.tux.org/pub/people/gerard-roudier +or + ftp://ftp.symbios.com/mirror/ftp.tux.org/pub/tux/roudier/drivers + +I am not a native speaker of English and there are probably lots of +mistakes in this README file. Any help will be welcome. + + +2. Supported chips and SCSI features + +The following features are supported for all chips: + + Synchronous negotiation + Disconnection + Tagged command queuing + SCSI parity checking + Master parity checking + +"Wide negotiation" is supported for chips that allow it. The +following table shows some characteristics of NCR 8xx family chips +and what drivers support them. + + Supported by Supported by + On board the generic the enhanced +Chip SDMS BIOS Wide SCSI std. Max. sync driver driver +---- --------- ---- --------- ---------- ------------ ------------- +810 N N FAST10 10 MB/s Y N +810A N N FAST10 10 MB/s Y Y +815 Y N FAST10 10 MB/s Y N +825 Y Y FAST10 20 MB/s Y N +825A Y Y FAST10 20 MB/s Y Y +860 N N FAST20 20 MB/s Y Y +875 Y Y FAST20 40 MB/s Y Y +876 Y Y FAST20 40 MB/s Y Y +895 Y Y FAST40 80 MB/s Y Y +895A Y Y FAST40 80 MB/s Y Y +896 Y Y FAST40 80 MB/s Y Y +897 Y Y FAST40 80 MB/s Y Y +1510D Y Y FAST40 80 MB/s Y Y +1010 Y Y FAST80 160 MB/s N Y +1010_66* Y Y FAST80 160 MB/s N Y + +* Chip supports 33MHz and 66MHz PCI buses. + + +Summary of other supported features: + +Module: allow to load the driver +Memory mapped I/O: increases performance +Profiling information: read operations from the proc SCSI file system +Control commands: write operations to the proc SCSI file system +Debugging information: written to syslog (expert only) +Scatter / gather +Shared interrupt +Boot setup commands +Serial NVRAM: Symbios and Tekram formats + + +3. Advantages of the enhanced 896 driver + +3.1 Optimized SCSI SCRIPTS. + +The 810A, 825A, 875, 895, 896 and 895A support new SCSI SCRIPTS instructions +named LOAD and STORE that allow to move up to 1 DWORD from/to an IO register +to/from memory much faster that the MOVE MEMORY instruction that is supported +by the 53c7xx and 53c8xx family. +The LOAD/STORE instructions support absolute and DSA relative addressing +modes. The SCSI SCRIPTS had been entirely rewritten using LOAD/STORE instead +of MOVE MEMORY instructions. + +3.2 New features of the SYM53C896 (64 bit PCI dual LVD SCSI controller) + +The 896 and the 895A allows handling of the phase mismatch context from +SCRIPTS (avoids the phase mismatch interrupt that stops the SCSI processor +until the C code has saved the context of the transfer). +Implementing this without using LOAD/STORE instructions would be painfull +and I did'nt even want to try it. + +The 896 chip supports 64 bit PCI transactions and addressing, while the +895A supports 32 bit PCI transactions and 64 bit addressing. +The SCRIPTS processor of these chips is not true 64 bit, but uses segment +registers for bit 32-63. Another interesting feature is that LOAD/STORE +instructions that address the on-chip RAM (8k) remain internal to the chip. + +Due to the use of LOAD/STORE SCRIPTS instructions, this driver does not +support the following chips: +- SYM53C810 revision < 0x10 (16) +- SYM53C815 all revisions +- SYM53C825 revision < 0x10 (16) + +4. Memory mapped I/O versus normal I/O + +Memory mapped I/O has less latency than normal I/O. Since +linux-1.3.x, memory mapped I/O is used rather than normal I/O. Memory +mapped I/O seems to work fine on most hardware configurations, but +some poorly designed motherboards may break this feature. + +The configuration option CONFIG_SCSI_NCR53C8XX_IOMAPPED forces the +driver to use normal I/O in all cases. + + +5. Tagged command queueing + +Queuing more than 1 command at a time to a device allows it to perform +optimizations based on actual head positions and its mechanical +characteristics. This feature may also reduce average command latency. +In order to really gain advantage of this feature, devices must have +a reasonable cache size (No miracle is to be expected for a low-end +hard disk with 128 KB or less). +Some kown SCSI devices do not properly support tagged command queuing. +Generally, firmware revisions that fix this kind of problems are available +at respective vendor web/ftp sites. +All I can say is that the hard disks I use on my machines behave well with +this driver with tagged command queuing enabled: + +- IBM S12 0662 +- Conner 1080S +- Quantum Atlas I +- Quantum Atlas II + +If your controller has NVRAM, you can configure this feature per target +from the user setup tool. The Tekram Setup program allows to tune the +maximum number of queued commands up to 32. The Symbios Setup only allows +to enable or disable this feature. + +The maximum number of simultaneous tagged commands queued to a device +is currently set to 8 by default. This value is suitable for most SCSI +disks. With large SCSI disks (>= 2GB, cache >= 512KB, average seek time +<= 10 ms), using a larger value may give better performances. + +The sym53c8xx driver supports up to 255 commands per device, and the +generic ncr53c8xx driver supports up to 64, but using more than 32 is +generally not worth-while, unless you are using a very large disk or disk +array. It is noticeable that most of recent hard disks seem not to accept +more than 64 simultaneous commands. So, using more than 64 queued commands +is probably just resource wasting. + +If your controller does not have NVRAM or if it is managed by the SDMS +BIOS/SETUP, you can configure tagged queueing feature and device queue +depths from the boot command-line. For example: + + ncr53c8xx=tags:4/t2t3q15-t4q7/t1u0q32 + +will set tagged commands queue depths as follow: + +- target 2 all luns on controller 0 --> 15 +- target 3 all luns on controller 0 --> 15 +- target 4 all luns on controller 0 --> 7 +- target 1 lun 0 on controller 1 --> 32 +- all other target/lun --> 4 + +In some special conditions, some SCSI disk firmwares may return a +QUEUE FULL status for a SCSI command. This behaviour is managed by the +driver using the following heuristic: + +- Each time a QUEUE FULL status is returned, tagged queue depth is reduced + to the actual number of disconnected commands. + +- Every 1000 successfully completed SCSI commands, if allowed by the + current limit, the maximum number of queueable commands is incremented. + +Since QUEUE FULL status reception and handling is resource wasting, the +driver notifies by default this problem to user by indicating the actual +number of commands used and their status, as well as its decision on the +device queue depth change. +The heuristic used by the driver in handling QUEUE FULL ensures that the +impact on performances is not too bad. You can get rid of the messages by +setting verbose level to zero, as follow: + +1st method: boot your system using 'ncr53c8xx=verb:0' option. +2nd method: apply "setverbose 0" control command to the proc fs entry + corresponding to your controller after boot-up. + +6. Parity checking + +The driver supports SCSI parity checking and PCI bus master parity +checking. These features must be enabled in order to ensure safe data +transfers. However, some flawed devices or mother boards will have +problems with parity. You can disable either PCI parity or SCSI parity +checking by entering appropriate options from the boot command line. +(See 10: Boot setup commands). + +7. Profiling information + +Profiling information is available through the proc SCSI file system. +Since gathering profiling information may impact performances, this +feature is disabled by default and requires a compilation configuration +option to be set to Y. + +The device associated with a host has the following pathname: + + /proc/scsi/ncr53c8xx/N (N=0,1,2 ....) + +Generally, only 1 board is used on hardware configuration, and that device is: + /proc/scsi/ncr53c8xx/0 + +However, if the driver has been made as module, the number of the +hosts is incremented each time the driver is loaded. + +In order to display profiling information, just enter: + + cat /proc/scsi/ncr53c8xx/0 + +and you will get something like the following text: + +------------------------------------------------------- +General information: + Chip NCR53C810, device id 0x1, revision id 0x2 + IO port address 0x6000, IRQ number 10 + Using memory mapped IO at virtual address 0x282c000 + Synchronous transfer period 25, max commands per lun 4 +Profiling information: + num_trans = 18014 + num_kbytes = 671314 + num_disc = 25763 + num_break = 1673 + num_int = 1685 + num_fly = 18038 + ms_setup = 4940 + ms_data = 369940 + ms_disc = 183090 + ms_post = 1320 +------------------------------------------------------- + +General information is easy to understand. The device ID and the +revision ID identify the SCSI chip as follows: + +Chip Device id Revision Id +---- --------- ----------- +810 0x1 < 0x10 +810A 0x1 >= 0x10 +815 0x4 +825 0x3 < 0x10 +860 0x6 +825A 0x3 >= 0x10 +875 0xf +895 0xc + +The profiling information is updated upon completion of SCSI commands. +A data structure is allocated and zeroed when the host adapter is +attached. So, if the driver is a module, the profile counters are +cleared each time the driver is loaded. The "clearprof" command +allows you to clear these counters at any time. + +The following counters are available: + +("num" prefix means "number of", +"ms" means milli-seconds) + +num_trans + Number of completed commands + Example above: 18014 completed commands + +num_kbytes + Number of kbytes transferred + Example above: 671 MB transferred + +num_disc + Number of SCSI disconnections + Example above: 25763 SCSI disconnections + +num_break + number of script interruptions (phase mismatch) + Example above: 1673 script interruptions + +num_int + Number of interrupts other than "on the fly" + Example above: 1685 interruptions not "on the fly" + +num_fly + Number of interrupts "on the fly" + Example above: 18038 interruptions "on the fly" + +ms_setup + Elapsed time for SCSI commands setups + Example above: 4.94 seconds + +ms_data + Elapsed time for data transfers + Example above: 369.94 seconds spent for data transfer + +ms_disc + Elapsed time for SCSI disconnections + Example above: 183.09 seconds spent disconnected + +ms_post + Elapsed time for command post processing + (time from SCSI status get to command completion call) + Example above: 1.32 seconds spent for post processing + +Due to the 1/100 second tick of the system clock, "ms_post" time may +be wrong. + +In the example above, we got 18038 interrupts "on the fly" and only +1673 script breaks generally due to disconnections inside a segment +of the scatter list. + + +8. Control commands + +Control commands can be sent to the driver with write operations to +the proc SCSI file system. The generic command syntax is the +following: + + echo "<verb> <parameters>" >/proc/scsi/ncr53c8xx/0 + (assumes controller number is 0) + +Using "all" for "<target>" parameter with the commands below will +apply to all targets of the SCSI chain (except the controller). + +Available commands: + +8.1 Set minimum synchronous period factor + + setsync <target> <period factor> + + target: target number + period: minimum synchronous period. + Maximum speed = 1000/(4*period factor) except for special + cases below. + + Specify a period of 255, to force asynchronous transfer mode. + + 10 means 25 nano-seconds synchronous period + 11 means 30 nano-seconds synchronous period + 12 means 50 nano-seconds synchronous period + +8.2 Set wide size + + setwide <target> <size> + + target: target number + size: 0=8 bits, 1=16bits + +8.3 Set maximum number of concurrent tagged commands + + settags <target> <tags> + + target: target number + tags: number of concurrent tagged commands + must not be greater than SCSI_NCR_MAX_TAGS (default: 8) + +8.4 Set order type for tagged command + + setorder <order> + + order: 3 possible values: + simple: use SIMPLE TAG for all operations (read and write) + ordered: use ORDERED TAG for all operations + default: use default tag type, + SIMPLE TAG for read operations + ORDERED TAG for write operations + + +8.5 Set debug mode + + setdebug <list of debug flags> + + Available debug flags: + alloc: print info about memory allocations (ccb, lcb) + queue: print info about insertions into the command start queue + result: print sense data on CHECK CONDITION status + scatter: print info about the scatter process + scripts: print info about the script binding process + tiny: print minimal debugging information + timing: print timing information of the NCR chip + nego: print information about SCSI negotiations + phase: print information on script interruptions + + Use "setdebug" with no argument to reset debug flags. + + +8.6 Clear profile counters + + clearprof + + The profile counters are automatically cleared when the amount of + data transferred reaches 1000 GB in order to avoid overflow. + The "clearprof" command allows you to clear these counters at any time. + + +8.7 Set flag (no_disc) + + setflag <target> <flag> + + target: target number + + For the moment, only one flag is available: + + no_disc: not allow target to disconnect. + + Do not specify any flag in order to reset the flag. For example: + - setflag 4 + will reset no_disc flag for target 4, so will allow it disconnections. + - setflag all + will allow disconnection for all devices on the SCSI bus. + + +8.8 Set verbose level + + setverbose #level + + The driver default verbose level is 1. This command allows to change + th driver verbose level after boot-up. + +8.9 Reset all logical units of a target + + resetdev <target> + + target: target number + The driver will try to send a BUS DEVICE RESET message to the target. + (Only supported by the SYM53C8XX driver and provided for test purpose) + +8.10 Abort all tasks of all logical units of a target + + cleardev <target> + + target: target number + The driver will try to send a ABORT message to all the logical units + of the target. + (Only supported by the SYM53C8XX driver and provided for test purpose) + + +9. Configuration parameters + +If the firmware of all your devices is perfect enough, all the +features supported by the driver can be enabled at start-up. However, +if only one has a flaw for some SCSI feature, you can disable the +support by the driver of this feature at linux start-up and enable +this feature after boot-up only for devices that support it safely. + +CONFIG_SCSI_NCR53C8XX_PROFILE_SUPPORT (default answer: n) + This option must be set for profiling information to be gathered + and printed out through the proc file system. This features may + impact performances. + +CONFIG_SCSI_NCR53C8XX_IOMAPPED (default answer: n) + Answer "y" if you suspect your mother board to not allow memory mapped I/O. + May slow down performance a little. This option is required by + Linux/PPC and is used no matter what you select here. Linux/PPC + suffers no performance loss with this option since all IO is memory + mapped anyway. + +CONFIG_SCSI_NCR53C8XX_DEFAULT_TAGS (default answer: 8) + Default tagged command queue depth. + +CONFIG_SCSI_NCR53C8XX_MAX_TAGS (default answer: 8) + This option allows you to specify the maximum number of tagged commands + that can be queued to a device. The maximum supported value is 32. + +CONFIG_SCSI_NCR53C8XX_SYNC (default answer: 5) + This option allows you to specify the frequency in MHz the driver + will use at boot time for synchronous data transfer negotiations. + This frequency can be changed later with the "setsync" control command. + 0 means "asynchronous data transfers". + +CONFIG_SCSI_NCR53C8XX_FORCE_SYNC_NEGO (default answer: n) + Force synchronous negotiation for all SCSI-2 devices. + Some SCSI-2 devices do not report this feature in byte 7 of inquiry + response but do support it properly (TAMARACK scanners for example). + +CONFIG_SCSI_NCR53C8XX_NO_DISCONNECT (default and only reasonable answer: n) + If you suspect a device of yours does not properly support disconnections, + you can answer "y". Then, all SCSI devices will never disconnect the bus + even while performing long SCSI operations. + +CONFIG_SCSI_NCR53C8XX_SYMBIOS_COMPAT + Genuine SYMBIOS boards use GPIO0 in output for controller LED and GPIO3 + bit as a flag indicating singled-ended/differential interface. + If all the boards of your system are genuine SYMBIOS boards or use + BIOS and drivers from SYMBIOS, you would want to enable this option. + This option must NOT be enabled if your system has at least one 53C8XX + based scsi board with a vendor-specific BIOS. + For example, Tekram DC-390/U, DC-390/W and DC-390/F scsi controllers + use a vendor-specific BIOS and are known to not use SYMBIOS compatible + GPIO wiring. So, this option must not be enabled if your system has + such a board installed. + +CONFIG_SCSI_NCR53C8XX_NVRAM_DETECT + Enable support for reading the serial NVRAM data on Symbios and + some Symbios compatible cards, and Tekram DC390W/U/F cards. Useful for + systems with more than one Symbios compatible controller where at least + one has a serial NVRAM, or for a system with a mixture of Symbios and + Tekram cards. Enables setting the boot order of host adaptors + to something other than the default order or "reverse probe" order. + Also enables Symbios and Tekram cards to be distinguished so + CONFIG_SCSI_NCR53C8XX_SYMBIOS_COMPAT may be set in a system with a + mixture of Symbios and Tekram cards so the Symbios cards can make use of + the full range of Symbios features, differential, led pin, without + causing problems for the Tekram card(s). + +10. Boot setup commands + +10.1 Syntax + +Setup commands can be passed to the driver either at boot time or as a +string variable using 'insmod'. + +A boot setup command for the ncr53c8xx (sym53c8xx) driver begins with the +driver name "ncr53c8xx="(sym53c8xx). The kernel syntax parser then expects +an optionnal list of integers separated with comma followed by an optional +list of comma-separated strings. Example of boot setup command under lilo +prompt: + +lilo: linux root=/dev/hda2 ncr53c8xx=tags:4,sync:10,debug:0x200 + +- enable tagged commands, up to 4 tagged commands queued. +- set synchronous negotiation speed to 10 Mega-transfers / second. +- set DEBUG_NEGO flag. + +Since comma seems not to be allowed when defining a string variable using +'insmod', the driver also accepts <space> as option separator. +The following command will install driver module with the same options as +above. + + insmod ncr53c8xx.o ncr53c8xx="tags:4 sync:10 debug:0x200" + +For the moment, the integer list of arguments is discarded by the driver. +It will be used in the future in order to allow a per controller setup. + +Each string argument must be specified as "keyword:value". Only lower-case +characters and digits are allowed. + +In a system that contains multiple 53C8xx adapters insmod will install the +specified driver on each adapter. To exclude a chip use the 'excl' keyword. + +The sequence of commands, + + insmod sym53c8xx sym53c8xx=excl:0x1400 + insmod ncr53c8xx + +installs the sym53c8xx driver on all adapters except the one at IO port +address 0x1400 and then installs the ncr53c8xx driver to the adapter at IO +port address 0x1400. + + +10.2 Available arguments + +10.2.1 Master parity checking + mpar:y enabled + mpar:n disabled + +10.2.2 Scsi parity checking + spar:y enabled + spar:n disabled + +10.2.3 Scsi disconnections + disc:y enabled + disc:n disabled + +10.2.4 Special features + Only apply to 810A, 825A, 860, 875 and 895 controllers. + Have no effect with other ones. + specf:y (or 1) enabled + specf:n (or 0) disabled + specf:3 enabled except Memory Write And Invalidate + The default driver setup is 'specf:3'. As a consequence, option 'specf:y' + must be specified in the boot setup command to enable Memory Write And + Invalidate. + +10.2.5 Ultra SCSI support + Only apply to 860, 875, 895, 895a, 896, 1010 and 1010_66 controllers. + Have no effect with other ones. + ultra:n All ultra speeds enabled + ultra:2 Ultra2 enabled + ultra:1 Ultra enabled + ultra:0 Ultra speeds disabled + +10.2.6 Default number of tagged commands + tags:0 (or tags:1 ) tagged command queuing disabled + tags:#tags (#tags > 1) tagged command queuing enabled + #tags will be truncated to the max queued commands configuration parameter. + This option also allows to specify a command queue depth for each device + that support tagged command queueing. + Example: + ncr53c8xx=tags:10/t2t3q16-t5q24/t1u2q32 + will set devices queue depth as follow: + - controller #0 target #2 and target #3 -> 16 commands, + - controller #0 target #5 -> 24 commands, + - controller #1 target #1 logical unit #2 -> 32 commands, + - all other logical units (all targets, all controllers) -> 10 commands. + +10.2.7 Default synchronous period factor + sync:255 disabled (asynchronous transfer mode) + sync:#factor + #factor = 10 Ultra-2 SCSI 40 Mega-transfers / second + #factor = 11 Ultra-2 SCSI 33 Mega-transfers / second + #factor < 25 Ultra SCSI 20 Mega-transfers / second + #factor < 50 Fast SCSI-2 + + In all cases, the driver will use the minimum transfer period supported by + controllers according to NCR53C8XX chip type. + +10.2.8 Negotiate synchronous with all devices + (force sync nego) + fsn:y enabled + fsn:n disabled + +10.2.9 Verbosity level + verb:0 minimal + verb:1 normal + verb:2 too much + +10.2.10 Debug mode + debug:0 clear debug flags + debug:#x set debug flags + #x is an integer value combining the following power-of-2 values: + DEBUG_ALLOC 0x1 + DEBUG_PHASE 0x2 + DEBUG_POLL 0x4 + DEBUG_QUEUE 0x8 + DEBUG_RESULT 0x10 + DEBUG_SCATTER 0x20 + DEBUG_SCRIPT 0x40 + DEBUG_TINY 0x80 + DEBUG_TIMING 0x100 + DEBUG_NEGO 0x200 + DEBUG_TAGS 0x400 + DEBUG_FREEZE 0x800 + DEBUG_RESTART 0x1000 + + You can play safely with DEBUG_NEGO. However, some of these flags may + generate bunches of syslog messages. + +10.2.11 Burst max + burst:0 burst disabled + burst:255 get burst length from initial IO register settings. + burst:#x burst enabled (1<<#x burst transfers max) + #x is an integer value which is log base 2 of the burst transfers max. + The NCR53C875 and NCR53C825A support up to 128 burst transfers (#x = 7). + Other chips only support up to 16 (#x = 4). + This is a maximum value. The driver set the burst length according to chip + and revision ids. By default the driver uses the maximum value supported + by the chip. + +10.2.12 LED support + led:1 enable LED support + led:0 disable LED support + Donnot enable LED support if your scsi board does not use SDMS BIOS. + (See 'Configuration parameters') + +10.2.13 Max wide + wide:1 wide scsi enabled + wide:0 wide scsi disabled + Some scsi boards use a 875 (ultra wide) and only supply narrow connectors. + If you have connected a wide device with a 50 pins to 68 pins cable + converter, any accepted wide negotiation will break further data transfers. + In such a case, using "wide:0" in the bootup command will be helpfull. + +10.2.14 Differential mode + diff:0 never set up diff mode + diff:1 set up diff mode if BIOS set it + diff:2 always set up diff mode + diff:3 set diff mode if GPIO3 is not set + +10.2.15 IRQ mode + irqm:0 always open drain + irqm:1 same as initial settings (assumed BIOS settings) + irqm:2 always totem pole + irqm:0x10 driver will not use SA_SHIRQ flag when requesting irq + irqm:0x20 driver will not use SA_INTERRUPT flag when requesting irq + + (Bits 0x10 and 0x20 can be combined with hardware irq mode option) + +10.2.16 Reverse probe + revprob:n probe chip ids from the PCI configuration in this order: + 810, 815, 820, 860, 875, 885, 895, 896 + revprob:y probe chip ids in the reverse order. + +10.2.17 Fix up PCI configuration space + pcifix:<option bits> + + Available option bits: + 0x0: No attempt to fix PCI configuration space registers values. + 0x1: Set PCI cache-line size register if not set. + 0x2: Set write and invalidate bit in PCI command register. + 0x4: Increase if necessary PCI latency timer according to burst max. + + Use 'pcifix:7' in order to allow the driver to fix up all PCI features. + +10.2.18 Serial NVRAM + nvram:n do not look for serial NVRAM + nvram:y test controllers for onboard serial NVRAM + (alternate binary form) + mvram=<bits options> + 0x01 look for NVRAM (equivalent to nvram=y) + 0x02 ignore NVRAM "Synchronous negotiation" parameters for all devices + 0x04 ignore NVRAM "Wide negotiation" parameter for all devices + 0x08 ignore NVRAM "Scan at boot time" parameter for all devices + 0x80 also attach controllers set to OFF in the NVRAM (sym53c8xx only) + +10.2.19 Check SCSI BUS + buschk:<option bits> + + Available option bits: + 0x0: No check. + 0x1: Check and do not attach the controller on error. + 0x2: Check and just warn on error. + 0x4: Disable SCSI bus integrity checking. + +10.2.20 Exclude a host from being attached + excl=<io_address> + + Prevent host at a given io address from being attached. + For example 'ncr53c8xx=excl:0xb400,excl:0xc000' indicate to the + ncr53c8xx driver not to attach hosts at address 0xb400 and 0xc000. + +10.2.21 Suggest a default SCSI id for hosts + hostid:255 no id suggested. + hostid:#x (0 < x < 7) x suggested for hosts SCSI id. + + If a host SCSI id is available from the NVRAM, the driver will ignore + any value suggested as boot option. Otherwise, if a suggested value + different from 255 has been supplied, it will use it. Otherwise, it will + try to deduce the value previously set in the hardware and use value + 7 if the hardware value is zero. + +10.2.22 Enable use of IMMEDIATE ARBITRATION + (only supported by the sym53c8xx driver. See 10.7 for more details) + iarb:0 do not use this feature. + iarb:#x use this feature according to bit fields as follow: + + bit 0 (1) : enable IARB each time the initiator has been reselected + when it arbitrated for the SCSI BUS. + (#x >> 4) : maximum number of successive settings of IARB if the initiator + win arbitration and it has other commands to send to a device. + +Boot fail safe + safe:y load the following assumed fail safe initial setup + + master parity disabled mpar:n + scsi parity enabled spar:y + disconnections not allowed disc:n + special features disabled specf:n + ultra scsi disabled ultra:n + force sync negotiation disabled fsn:n + reverse probe disabled revprob:n + PCI fix up disabled pcifix:0 + serial NVRAM enabled nvram:y + verbosity level 2 verb:2 + tagged command queuing disabled tags:0 + synchronous negotiation disabled sync:255 + debug flags none debug:0 + burst length from BIOS settings burst:255 + LED support disabled led:0 + wide support disabled wide:0 + settle time 10 seconds settle:10 + differential support from BIOS settings diff:1 + irq mode from BIOS settings irqm:1 + SCSI BUS check do not attach on error buschk:1 + immediate arbitration disabled iarb:0 + +10.3 Advised boot setup commands + +If the driver has been configured with default options, the equivalent +boot setup is: + + ncr53c8xx=mpar:y,spar:y,disc:y,specf:3,fsn:n,ultra:2,fsn:n,revprob:n,verb:1\ + tags:0,sync:50,debug:0,burst:7,led:0,wide:1,settle:2,diff:0,irqm:0 + +For an installation diskette or a safe but not fast system, +boot setup can be: + + ncr53c8xx=safe:y,mpar:y,disc:y + ncr53c8xx=safe:y,disc:y + ncr53c8xx=safe:y,mpar:y + ncr53c8xx=safe:y + +My personnal system works flawlessly with the following equivalent setup: + + ncr53c8xx=mpar:y,spar:y,disc:y,specf:1,fsn:n,ultra:2,fsn:n,revprob:n,verb:1\ + tags:32,sync:12,debug:0,burst:7,led:1,wide:1,settle:2,diff:0,irqm:0 + +The driver prints its actual setup when verbosity level is 2. You can try +"ncr53c8xx=verb:2" to get the "static" setup of the driver, or add "verb:2" +to your boot setup command in order to check the actual setup the driver is +using. + +10.4 PCI configuration fix-up boot option + +pcifix:<option bits> + +Available option bits: + 0x1: Set PCI cache-line size register if not set. + 0x2: Set write and invalidate bit in PCI command register. + +Use 'pcifix:3' in order to allow the driver to fix both PCI features. + +These options only apply to new SYMBIOS chips 810A, 825A, 860, 875 +and 895 and are only supported for Pentium and 486 class processors. +Recent SYMBIOS 53C8XX scsi processors are able to use PCI read multiple +and PCI write and invalidate commands. These features require the +cache line size register to be properly set in the PCI configuration +space of the chips. On the other hand, chips will use PCI write and +invalidate commands only if the corresponding bit is set to 1 in the +PCI command register. + +Not all PCI bioses set the PCI cache line register and the PCI write and +invalidate bit in the PCI configuration space of 53C8XX chips. +Optimized PCI accesses may be broken for some PCI/memory controllers or +make problems with some PCI boards. + +This fix-up worked flawlessly on my previous system. +(MB Triton HX / 53C875 / 53C810A) +I use these options at my own risks as you will do if you decide to +use them too. + + +10.5 Serial NVRAM support boot option + +nvram:n do not look for serial NVRAM +nvram:y test controllers for onboard serial NVRAM + +This option can also been entered as an hexadecimal value that allows +to control what information the driver will get from the NVRAM and what +information it will ignore. +For details see '17. Serial NVRAM support'. + +When this option is enabled, the driver tries to detect all boards using +a Serial NVRAM. This memory is used to hold user set up parameters. + +The parameters the driver is able to get from the NVRAM depend on the +data format used, as follow: + + Tekram format Symbios format +General and host parameters + Boot order N Y + Host SCSI ID Y Y + SCSI parity checking Y Y + Verbose boot messages N Y +SCSI devices parameters + Synchronous transfer speed Y Y + Wide 16 / Narrow Y Y + Tagged Command Queuing enabled Y Y + Disconnections enabled Y Y + Scan at boot time N Y + +In order to speed up the system boot, for each device configured without +the "scan at boot time" option, the driver forces an error on the +first TEST UNIT READY command received for this device. + +Some SDMS BIOS revisions seem to be unable to boot cleanly with very fast +hard disks. In such a situation you cannot configure the NVRAM with +optimized parameters value. + +The 'nvram' boot option can be entered in hexadecimal form in order +to ignore some options configured in the NVRAM, as follow: + +mvram=<bits options> + 0x01 look for NVRAM (equivalent to nvram=y) + 0x02 ignore NVRAM "Synchronous negotiation" parameters for all devices + 0x04 ignore NVRAM "Wide negotiation" parameter for all devices + 0x08 ignore NVRAM "Scan at boot time" parameter for all devices + 0x80 also attach controllers set to OFF in the NVRAM (sym53c8xx only) + +Option 0x80 is only supported by the sym53c8xx driver and is disabled by +default. Result is that, by default (option not set), the sym53c8xx driver +will not attach controllers set to OFF in the NVRAM. + +The ncr53c8xx always tries to attach all the controllers. Option 0x80 has +not been added to the ncr53c8xx driver, since it has been reported to +confuse users who use this driver since a long time. If you desire a +controller not to be attached by the ncr53c8xx driver at Linux boot, you +must use the 'excl' driver boot option. + +10.6 SCSI BUS checking boot option. + +When this option is set to a non-zero value, the driver checks SCSI lines +logic state, 100 micro-seconds after having asserted the SCSI RESET line. +The driver just reads SCSI lines and checks all lines read FALSE except RESET. +Since SCSI devices shall release the BUS at most 800 nano-seconds after SCSI +RESET has been asserted, any signal to TRUE may indicate a SCSI BUS problem. +Unfortunately, the following common SCSI BUS problems are not detected: +- Only 1 terminator installed. +- Misplaced terminators. +- Bad quality terminators. +On the other hand, either bad cabling, broken devices, not conformant +devices, ... may cause a SCSI signal to be wrong when te driver reads it. + +10.7 IMMEDIATE ARBITRATION boot option + +This option is only supported by the SYM53C8XX driver (not by the NCR53C8XX). + +SYMBIOS 53C8XX chips are able to arbitrate for the SCSI BUS as soon as they +have detected an expected disconnection (BUS FREE PHASE). For this process +to be started, bit 1 of SCNTL1 IO register must be set when the chip is +connected to the SCSI BUS. + +When this feature has been enabled for the current connection, the chip has +every chance to win arbitration if only devices with lower priority are +competing for the SCSI BUS. By the way, when the chip is using SCSI id 7, +then it will for sure win the next SCSI BUS arbitration. + +Since, there is no way to know what devices are trying to arbitrate for the +BUS, using this feature can be extremely unfair. So, you are not advised +to enable it, or at most enable this feature for the case the chip lost +the previous arbitration (boot option 'iarb:1'). + +This feature has the following advantages: + +a) Allow the initiator with ID 7 to win arbitration when it wants so. +b) Overlap at least 4 micro-seconds of arbitration time with the execution + of SCRIPTS that deal with the end of the current connection and that + starts the next job. + +Hmmm... But (a) may just prevent other devices from reselecting the initiator, +and delay data transfers or status/completions, and (b) may just waste +SCSI BUS bandwidth if the SCRIPTS execution lasts more than 4 micro-seconds. + +The use of IARB needs the SCSI_NCR_IARB_SUPPORT option to have been defined +at compile time and the 'iarb' boot option to have been set to a non zero +value at boot time. It is not that useful for real work, but can be used +to stress SCSI devices or for some applications that can gain advantage of +it. By the way, if you experience badnesses like 'unexpected disconnections', +'bad reselections', etc... when using IARB on heavy IO load, you should not +be surprised, because force-feeding anything and blocking its arse at the +same time cannot work for a long time. :-)) + + +11. Some constants and flags of the ncr53c8xx.h header file + +Some of these are defined from the configuration parameters. To +change other "defines", you must edit the header file. Do that only +if you know what you are doing. + +SCSI_NCR_SETUP_SPECIAL_FEATURES (default: defined) + If defined, the driver will enable some special features according + to chip and revision id. + For 810A, 860, 825A, 875 and 895 scsi chips, this option enables + support of features that reduce load of PCI bus and memory accesses + during scsi transfer processing: burst op-code fetch, read multiple, + read line, prefetch, cache line, write and invalidate, + burst 128 (875 only), large dma fifo (875 only), offset 16 (875 only). + Can be changed by the following boot setup command: + ncr53c8xx=specf:n + +SCSI_NCR_IOMAPPED (default: not defined) + If defined, normal I/O is forced. + +SCSI_NCR_SHARE_IRQ (default: defined) + If defined, request shared IRQ. + +SCSI_NCR_MAX_TAGS (default: 8) + Maximum number of simultaneous tagged commands to a device. + Can be changed by "settags <target> <maxtags>" + +SCSI_NCR_SETUP_DEFAULT_SYNC (default: 50) + Transfer period factor the driver will use at boot time for synchronous + negotiation. 0 means asynchronous. + Can be changed by "setsync <target> <period factor>" + +SCSI_NCR_SETUP_DEFAULT_TAGS (default: 8) + Default number of simultaneous tagged commands to a device. + < 1 means tagged command queuing disabled at start-up. + +SCSI_NCR_ALWAYS_SIMPLE_TAG (default: defined) + Use SIMPLE TAG for read and write commands. + Can be changed by "setorder <ordered|simple|default>" + +SCSI_NCR_SETUP_DISCONNECTION (default: defined) + If defined, targets are allowed to disconnect. + +SCSI_NCR_SETUP_FORCE_SYNC_NEGO (default: not defined) + If defined, synchronous negotiation is tried for all SCSI-2 devices. + Can be changed by "setsync <target> <period>" + +SCSI_NCR_SETUP_MASTER_PARITY (default: defined) + If defined, master parity checking is enabled. + +SCSI_NCR_SETUP_MASTER_PARITY (default: defined) + If defined, SCSI parity checking is enabled. + +SCSI_NCR_PROFILE_SUPPORT (default: not defined) + If defined, profiling information is gathered. + +SCSI_NCR_MAX_SCATTER (default: 128) + Scatter list size of the driver ccb. + +SCSI_NCR_MAX_TARGET (default: 16) + Max number of targets per host. + +SCSI_NCR_MAX_HOST (default: 2) + Max number of host controllers. + +SCSI_NCR_SETTLE_TIME (default: 2) + Number of seconds the driver will wait after reset. + +SCSI_NCR_TIMEOUT_ALERT (default: 3) + If a pending command will time out after this amount of seconds, + an ordered tag is used for the next command. + Avoids timeouts for unordered tagged commands. + +SCSI_NCR_CAN_QUEUE (default: 7*SCSI_NCR_MAX_TAGS) + Max number of commands that can be queued to a host. + +SCSI_NCR_CMD_PER_LUN (default: SCSI_NCR_MAX_TAGS) + Max number of commands queued to a host for a device. + +SCSI_NCR_SG_TABLESIZE (default: SCSI_NCR_MAX_SCATTER-1) + Max size of the Linux scatter/gather list. + +SCSI_NCR_MAX_LUN (default: 8) + Max number of LUNs per target. + + +12. Installation + +This driver is part of the linux kernel distribution. +Driver files are located in the sub-directory "drivers/scsi" of the +kernel source tree. + +Driver files: + + README.ncr53c8xx : this file + ChangeLog.ncr53c8xx : change log + ncr53c8xx.h : definitions + ncr53c8xx.c : the driver code + +New driver versions are made available separately in order to allow testing +changes and new features prior to including them into the linux kernel +distribution. The following URL provides informations on latest avalaible +patches: + + ftp://ftp.tux.org/pub/people/gerard-roudier/README + + +13. Architecture dependent features. + +<Not yet written> + + +14. Known problems + +14.1 Tagged commands with Iomega Jaz device + +I have not tried this device, however it has been reported to me the +following: This device is capable of Tagged command queuing. However +while spinning up, it rejects Tagged commands. This behaviour is +conforms to 6.8.2 of SCSI-2 specifications. The current behaviour of +the driver in that situation is not satisfying. So do not enable +Tagged command queuing for devices that are able to spin down. The +other problem that may appear is timeouts. The only way to avoid +timeouts seems to edit linux/drivers/scsi/sd.c and to increase the +current timeout values. + +14.2 Device names change when another controller is added. + +When you add a new NCR53C8XX chip based controller to a system that already +has one or more controllers of this family, it may happen that the order +the driver registers them to the kernel causes problems due to device +name changes. +When at least one controller uses NvRAM, SDMS BIOS version 4 allows you to +define the order the BIOS will scan the scsi boards. The driver attaches +controllers according to BIOS information if NvRAM detect option is set. + +If your controllers do not have NvRAM, you can: + +- Ask the driver to probe chip ids in reverse order from the boot command + line: ncr53c8xx=revprob:y +- Make appropriate changes in the fstab. +- Use the 'scsidev' tool from Eric Youngdale. + +14.3 Using only 8 bit devices with a WIDE SCSI controller. + +When only 8 bit NARROW devices are connected to a 16 bit WIDE SCSI controller, +you must ensure that lines of the wide part of the SCSI BUS are pulled-up. +This can be achieved by ENABLING the WIDE TERMINATOR portion of the SCSI +controller card. +The TYAN 1365 documentation revision 1.2 is not correct about such settings. +(page 10, figure 3.3). + +14.4 Possible data corruption during a Memory Write and Invalidate + +This problem is described in SYMBIOS DEL 397, Part Number 69-039241, ITEM 4. + +In some complex situations, 53C875 chips revision <= 3 may start a PCI +Write and Invalidate Command at a not cache-line-aligned 4 DWORDS boundary. +This is only possible when Cache Line Size is 8 DWORDS or greater. +Pentium systems use a 8 DWORDS cache line size and so are concerned by +this chip bug, unlike i486 systems that use a 4 DWORDS cache line size. + +When this situation occurs, the chip may complete the Write and Invalidate +command after having only filled part of the last cache line involved in +the transfer, leaving to data corruption the remainder of this cache line. + +Not using Write And Invalidate obviously gets rid of this chip bug, and so +it is now the default setting of the driver. +However, for people like me who want to enable this feature, I have added +part of a work-around suggested by SYMBIOS. This work-around resets the +addressing logic when the DATA IN phase is entered and so prevents the bug +from being triggered for the first SCSI MOVE of the phase. This work-around +should be enough according to the following: + +The only driver internal data structure that is greater than 8 DWORDS and +that is moved by the SCRIPTS processor is the 'CCB header' that contains +the context of the SCSI transfer. This data structure is aligned on 8 DWORDS +boundary (Pentium Cache Line Size), and so is immune to this chip bug, at +least on Pentium systems. +But the conditions of this bug can be met when a SCSI read command is +performed using a buffer that is 4 DWORDS but not cache-line aligned. +This cannot happen under Linux when scatter/gather lists are used since +they only refer to system buffers that are well aligned. So, a work around +may only be needed under Linux when a scatter/gather list is not used and +when the SCSI DATA IN phase is reentered after a phase mismatch. + +14.5 IRQ sharing problems + +When an IRQ is shared by devices that are handled by different drivers, it +may happen that one driver complains about the request of the IRQ having +failed. Inder Linux-2.0, this may be due to one driver having requested the +IRQ using the SA_INTERRUPT flag but some other having requested the same IRQ +without this flag. Under both Linux-2.0 and linux-2.2, this may be caused by +one driver not having requested the IRQ with the SA_SHIRQ flag. + +By default, the ncr53c8xx and sym53c8xx drivers request IRQs with both the +SA_INTERRUPT and the SA_SHIRQ flag under Linux-2.0 and with only the SA_SHIRQ +flag under Linux-2.2. + +Under Linux-2.0, you can disable use of SA_INTERRUPT flag from the boot +command line by using the following option: + + ncr53c8xx=irqm:0x20 (for the generic ncr53c8xx driver) + sym53c8xx=irqm:0x20 (for the sym53c8xx driver) + +If this does not fix the problem, then you may want to check how all other +drivers are requesting the IRQ and report the problem. Note that if at least +a single driver does not request the IRQ with the SA_SHIRQ flag (share IRQ), +then the request of the IRQ obviously will not succeed for all the drivers. + +15. SCSI problem troubleshooting + +15.1 Problem tracking + +Most SCSI problems are due to a non conformant SCSI bus or to buggy +devices. If infortunately you have SCSI problems, you can check the +following things: + +- SCSI bus cables +- terminations at both end of the SCSI chain +- linux syslog messages (some of them may help you) + +If you do not find the source of problems, you can configure the +driver with no features enabled. + +- only asynchronous data transfers +- tagged commands disabled +- disconnections not allowed + +Now, if your SCSI bus is ok, your system have every chance to work +with this safe configuration but performances will not be optimal. + +If it still fails, then you can send your problem description to +appropriate mailing lists or news-groups. Send me a copy in order to +be sure I will receive it. Obviously, a bug in the driver code is +possible. + + My email address: Gerard Roudier <groudier@free.fr> + +Allowing disconnections is important if you use several devices on +your SCSI bus but often causes problems with buggy devices. +Synchronous data transfers increases throughput of fast devices like +hard disks. Good SCSI hard disks with a large cache gain advantage of +tagged commands queuing. + +Try to enable one feature at a time with control commands. For example: + +- echo "setsync all 25" >/proc/scsi/ncr53c8xx/0 + Will enable fast synchronous data transfer negotiation for all targets. + +- echo "setflag 3" >/proc/scsi/ncr53c8xx/0 + Will reset flags (no_disc) for target 3, and so will allow it to disconnect + the SCSI Bus. + +- echo "settags 3 8" >/proc/scsi/ncr53c8xx/0 + Will enable tagged command queuing for target 3 if that device supports it. + +Once you have found the device and the feature that cause problems, just +disable that feature for that device. + +15.2 Understanding hardware error reports + +When the driver detects an unexpected error condition, it may display a +message of the following pattern. + +sym53c876-0:1: ERROR (0:48) (1-21-65) (f/95) @ (script 7c0:19000000). +sym53c876-0: script cmd = 19000000 +sym53c876-0: regdump: da 10 80 95 47 0f 01 07 75 01 81 21 80 01 09 00. + +Some fields in such a message may help you understand the cause of the +problem, as follows: + +sym53c876-0:1: ERROR (0:48) (1-21-65) (f/95) @ (script 7c0:19000000). +............A.........B.C....D.E..F....G.H.......I.....J...K....... + +Field A : target number. + SCSI ID of the device the controller was talking with at the moment the + error occurs. + +Field B : DSTAT io register (DMA STATUS) + Bit 0x40 : MDPE Master Data Parity Error + Data parity error detected on the PCI BUS. + Bit 0x20 : BF Bus Fault + PCI bus fault condition detected + Bit 0x01 : IID Illegal Instruction Detected + Set by the chip when it detects an Illegal Instruction format + on some condition that makes an instruction illegal. + Bit 0x80 : DFE Dma Fifo Empty + Pure status bit that does not indicate an error. + If the reported DSTAT value contains a combination of MDPE (0x40), + BF (0x20), then the cause may be likely due to a PCI BUS problem. + +Field C : SIST io register (SCSI Interrupt Status) + Bit 0x08 : SGE SCSI GROSS ERROR + Indicates that the chip detected a severe error condition + on the SCSI BUS that prevents the SCSI protocol from functioning + properly. + Bit 0x04 : UDC Unexpected Disconnection + Indicates that the device released the SCSI BUS when the chip + was not expecting this to happen. A device may behave so to + indicate the SCSI initiator that an error condition not reportable using the SCSI protocol has occurred. + Bit 0x02 : RST SCSI BUS Reset + Generally SCSI targets do not reset the SCSI BUS, although any + device on the BUS can reset it at any time. + Bit 0x01 : PAR Parity + SCSI parity error detected. + On a faulty SCSI BUS, any error condition among SGE (0x08), UDC (0x04) and + PAR (0x01) may be detected by the chip. If your SCSI system sometimes + encounters such error conditions, especially SCSI GROSS ERROR, then a SCSI + BUS problem is likely the cause of these errors. + +For fields D,E,F,G and H, you may look into the sym53c8xx_defs.h file +that contains some minimal comments on IO register bits. +Field D : SOCL Scsi Output Control Latch + This register reflects the state of the SCSI control lines the + chip want to drive or compare against. +Field E : SBCL Scsi Bus Control Lines + Actual value of control lines on the SCSI BUS. +Field F : SBDL Scsi Bus Data Lines + Actual value of data lines on the SCSI BUS. +Field G : SXFER SCSI Transfer + Contains the setting of the Synchronous Period for output and + the current Synchronous offset (offset 0 means asynchronous). +Field H : SCNTL3 Scsi Control Register 3 + Contains the setting of timing values for both asynchronous and + synchronous data transfers. + +Understanding Fields I, J, K and dumps requires to have good knowledge of +SCSI standards, chip cores functionnals and internal driver data structures. +You are not required to decode and understand them, unless you want to help +maintain the driver code. + +16. Synchonous transfer negotiation tables + +Tables below have been created by calling the routine the driver uses +for synchronisation negotiation timing calculation and chip setting. +The first table corresponds to Ultra chips 53875 and 53C860 with 80 MHz +clock and 5 clock divisors. +The second one has been calculated by setting the scsi clock to 40 Mhz +and using 4 clock divisors and so applies to all NCR53C8XX chips in fast +SCSI-2 mode. + +Periods are in nano-seconds and speeds are in Mega-transfers per second. +1 Mega-transfers/second means 1 MB/s with 8 bits SCSI and 2 MB/s with +Wide16 SCSI. + +16.1 Synchronous timings for 53C895, 53C875 and 53C860 SCSI controllers + + ---------------------------------------------- + Negotiated NCR settings + Factor Period Speed Period Speed + ------ ------ ------ ------ ------ + 10 25 40.000 25 40.000 (53C895 only) + 11 30.2 33.112 31.25 32.000 (53C895 only) + 12 50 20.000 50 20.000 + 13 52 19.230 62 16.000 + 14 56 17.857 62 16.000 + 15 60 16.666 62 16.000 + 16 64 15.625 75 13.333 + 17 68 14.705 75 13.333 + 18 72 13.888 75 13.333 + 19 76 13.157 87 11.428 + 20 80 12.500 87 11.428 + 21 84 11.904 87 11.428 + 22 88 11.363 93 10.666 + 23 92 10.869 93 10.666 + 24 96 10.416 100 10.000 + 25 100 10.000 100 10.000 + 26 104 9.615 112 8.888 + 27 108 9.259 112 8.888 + 28 112 8.928 112 8.888 + 29 116 8.620 125 8.000 + 30 120 8.333 125 8.000 + 31 124 8.064 125 8.000 + 32 128 7.812 131 7.619 + 33 132 7.575 150 6.666 + 34 136 7.352 150 6.666 + 35 140 7.142 150 6.666 + 36 144 6.944 150 6.666 + 37 148 6.756 150 6.666 + 38 152 6.578 175 5.714 + 39 156 6.410 175 5.714 + 40 160 6.250 175 5.714 + 41 164 6.097 175 5.714 + 42 168 5.952 175 5.714 + 43 172 5.813 175 5.714 + 44 176 5.681 187 5.333 + 45 180 5.555 187 5.333 + 46 184 5.434 187 5.333 + 47 188 5.319 200 5.000 + 48 192 5.208 200 5.000 + 49 196 5.102 200 5.000 + + +16.2 Synchronous timings for fast SCSI-2 53C8XX controllers + + ---------------------------------------------- + Negotiated NCR settings + Factor Period Speed Period Speed + ------ ------ ------ ------ ------ + 25 100 10.000 100 10.000 + 26 104 9.615 125 8.000 + 27 108 9.259 125 8.000 + 28 112 8.928 125 8.000 + 29 116 8.620 125 8.000 + 30 120 8.333 125 8.000 + 31 124 8.064 125 8.000 + 32 128 7.812 131 7.619 + 33 132 7.575 150 6.666 + 34 136 7.352 150 6.666 + 35 140 7.142 150 6.666 + 36 144 6.944 150 6.666 + 37 148 6.756 150 6.666 + 38 152 6.578 175 5.714 + 39 156 6.410 175 5.714 + 40 160 6.250 175 5.714 + 41 164 6.097 175 5.714 + 42 168 5.952 175 5.714 + 43 172 5.813 175 5.714 + 44 176 5.681 187 5.333 + 45 180 5.555 187 5.333 + 46 184 5.434 187 5.333 + 47 188 5.319 200 5.000 + 48 192 5.208 200 5.000 + 49 196 5.102 200 5.000 + + +17. Serial NVRAM (added by Richard Waltham: dormouse@farsrobt.demon.co.uk) + +17.1 Features + +Enabling serial NVRAM support enables detection of the serial NVRAM included +on Symbios and some Symbios compatible host adaptors, and Tekram boards. The +serial NVRAM is used by Symbios and Tekram to hold set up parameters for the +host adaptor and it's attached drives. + +The Symbios NVRAM also holds data on the boot order of host adaptors in a +system with more than one host adaptor. This enables the order of scanning +the cards for drives to be changed from the default used during host adaptor +detection. + +This can be done to a limited extent at the moment using "reverse probe" but +this only changes the order of detection of different types of cards. The +NVRAM boot order settings can do this as well as change the order the same +types of cards are scanned in, something "reverse probe" cannot do. + +Tekram boards using Symbios chips, DC390W/F/U, which have NVRAM are detected +and this is used to distinguish between Symbios compatible and Tekram host +adaptors. This is used to disable the Symbios compatible "diff" setting +incorrectly set on Tekram boards if the CONFIG_SCSI_53C8XX_SYMBIOS_COMPAT +configuration parameter is set enabling both Symbios and Tekram boards to be +used together with the Symbios cards using all their features, including +"diff" support. ("led pin" support for Symbios compatible cards can remain +enabled when using Tekram cards. It does nothing useful for Tekram host +adaptors but does not cause problems either.) + + +17.2 Symbios NVRAM layout + +typical data at NVRAM address 0x100 (53c810a NVRAM) +----------------------------------------------------------- +00 00 +64 01 +8e 0b + +00 30 00 00 00 00 07 00 00 00 00 00 00 00 07 04 10 04 00 00 + +04 00 0f 00 00 10 00 50 00 00 01 00 00 62 +04 00 03 00 00 10 00 58 00 00 01 00 00 63 +04 00 01 00 00 10 00 48 00 00 01 00 00 61 +00 00 00 00 00 00 00 00 00 00 00 00 00 00 + +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 + +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 + +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 + +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 + +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 + +fe fe +00 00 +00 00 +----------------------------------------------------------- +NVRAM layout details + +NVRAM Address 0x000-0x0ff not used + 0x100-0x26f initialised data + 0x270-0x7ff not used + +general layout + + header - 6 bytes, + data - 356 bytes (checksum is byte sum of this data) + trailer - 6 bytes + --- + total 368 bytes + +data area layout + + controller set up - 20 bytes + boot configuration - 56 bytes (4x14 bytes) + device set up - 128 bytes (16x8 bytes) + unused (spare?) - 152 bytes (19x8 bytes) + --- + total 356 bytes + +----------------------------------------------------------- +header + +00 00 - ?? start marker +64 01 - byte count (lsb/msb excludes header/trailer) +8e 0b - checksum (lsb/msb excludes header/trailer) +----------------------------------------------------------- +controller set up + +00 30 00 00 00 00 07 00 00 00 00 00 00 00 07 04 10 04 00 00 + | | | | + | | | -- host ID + | | | + | | --Removable Media Support + | | 0x00 = none + | | 0x01 = Bootable Device + | | 0x02 = All with Media + | | + | --flag bits 2 + | 0x00000001= scan order hi->low + | (default 0x00 - scan low->hi) + --flag bits 1 + 0x00000001 scam enable + 0x00000010 parity enable + 0x00000100 verbose boot msgs + +remaining bytes unknown - they do not appear to change in my +current set up for any of the controllers. + +default set up is identical for 53c810a and 53c875 NVRAM +(Removable Media added Symbios BIOS version 4.09) +----------------------------------------------------------- +boot configuration + +boot order set by order of the devices in this table + +04 00 0f 00 00 10 00 50 00 00 01 00 00 62 -- 1st controller +04 00 03 00 00 10 00 58 00 00 01 00 00 63 2nd controller +04 00 01 00 00 10 00 48 00 00 01 00 00 61 3rd controller +00 00 00 00 00 00 00 00 00 00 00 00 00 00 4th controller + | | | | | | | | + | | | | | | ---- PCI io port adr + | | | | | --0x01 init/scan at boot time + | | | | --PCI device/function number (0xdddddfff) + | | ----- ?? PCI vendor ID (lsb/msb) + ----PCI device ID (lsb/msb) + +?? use of this data is a guess but seems reasonable + +remaining bytes unknown - they do not appear to change in my +current set up + +default set up is identical for 53c810a and 53c875 NVRAM +----------------------------------------------------------- +device set up (up to 16 devices - includes controller) + +0f 00 08 08 64 00 0a 00 - id 0 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 + +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 - id 15 + | | | | | | + | | | | ----timeout (lsb/msb) + | | | --synch period (0x?? 40 Mtrans/sec- fast 40) (probably 0x28) + | | | (0x30 20 Mtrans/sec- fast 20) + | | | (0x64 10 Mtrans/sec- fast ) + | | | (0xc8 5 Mtrans/sec) + | | | (0x00 asynchronous) + | | -- ?? max sync offset (0x08 in NVRAM on 53c810a) + | | (0x10 in NVRAM on 53c875) + | --device bus width (0x08 narrow) + | (0x10 16 bit wide) + --flag bits + 0x00000001 - disconnect enabled + 0x00000010 - scan at boot time + 0x00000100 - scan luns + 0x00001000 - queue tags enabled + +remaining bytes unknown - they do not appear to change in my +current set up + +?? use of this data is a guess but seems reasonable +(but it could be max bus width) + +default set up for 53c810a NVRAM +default set up for 53c875 NVRAM - bus width - 0x10 + - sync offset ? - 0x10 + - sync period - 0x30 +----------------------------------------------------------- +?? spare device space (32 bit bus ??) + +00 00 00 00 00 00 00 00 (19x8bytes) +. +. +00 00 00 00 00 00 00 00 + +default set up is identical for 53c810a and 53c875 NVRAM +----------------------------------------------------------- +trailer + +fe fe - ? end marker ? +00 00 +00 00 + +default set up is identical for 53c810a and 53c875 NVRAM +----------------------------------------------------------- + + + +17.3 Tekram NVRAM layout + +nvram 64x16 (1024 bit) + +Drive settings + +Drive ID 0-15 (addr 0x0yyyy0 = device setup, yyyy = ID) + (addr 0x0yyyy1 = 0x0000) + + x x x x x x x x x x x x x x x x + | | | | | | | | | + | | | | | | | | ----- parity check 0 - off + | | | | | | | | 1 - on + | | | | | | | | + | | | | | | | ------- sync neg 0 - off + | | | | | | | 1 - on + | | | | | | | + | | | | | | --------- disconnect 0 - off + | | | | | | 1 - on + | | | | | | + | | | | | ----------- start cmd 0 - off + | | | | | 1 - on + | | | | | + | | | | -------------- tagged cmds 0 - off + | | | | 1 - on + | | | | + | | | ---------------- wide neg 0 - off + | | | 1 - on + | | | + --------------------------- sync rate 0 - 10.0 Mtrans/sec + 1 - 8.0 + 2 - 6.6 + 3 - 5.7 + 4 - 5.0 + 5 - 4.0 + 6 - 3.0 + 7 - 2.0 + 7 - 2.0 + 8 - 20.0 + 9 - 16.7 + a - 13.9 + b - 11.9 + +Global settings + +Host flags 0 (addr 0x100000, 32) + + x x x x x x x x x x x x x x x x + | | | | | | | | | | | | + | | | | | | | | ----------- host ID 0x00 - 0x0f + | | | | | | | | + | | | | | | | ----------------------- support for 0 - off + | | | | | | | > 2 drives 1 - on + | | | | | | | + | | | | | | ------------------------- support drives 0 - off + | | | | | | > 1Gbytes 1 - on + | | | | | | + | | | | | --------------------------- bus reset on 0 - off + | | | | | power on 1 - on + | | | | | + | | | | ----------------------------- active neg 0 - off + | | | | 1 - on + | | | | + | | | -------------------------------- imm seek 0 - off + | | | 1 - on + | | | + | | ---------------------------------- scan luns 0 - off + | | 1 - on + | | + -------------------------------------- removable 0 - disable + as BIOS dev 1 - boot device + 2 - all + +Host flags 1 (addr 0x100001, 33) + + x x x x x x x x x x x x x x x x + | | | | | | + | | | --------- boot delay 0 - 3 sec + | | | 1 - 5 + | | | 2 - 10 + | | | 3 - 20 + | | | 4 - 30 + | | | 5 - 60 + | | | 6 - 120 + | | | + --------------------------- max tag cmds 0 - 2 + 1 - 4 + 2 - 8 + 3 - 16 + 4 - 32 + +Host flags 2 (addr 0x100010, 34) + + x x x x x x x x x x x x x x x x + | + ----- F2/F6 enable 0 - off ??? + 1 - on ??? + +checksum (addr 0x111111) + +checksum = 0x1234 - (sum addr 0-63) + +---------------------------------------------------------------------------- + +default nvram data: + +0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 +0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 +0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 +0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 + +0x0f07 0x0400 0x0001 0x0000 0x0000 0x0000 0x0000 0x0000 +0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 +0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 +0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0xfbbc + + +18. Support for Big Endian + +The PCI local bus has been primarily designed for x86 architecture. +As a consequence, PCI devices generally expect DWORDS using little endian +byte ordering. + +18.1 Big Endian CPU + +In order to support NCR chips on a Big Endian architecture the driver has to +perform byte reordering each time it is needed. This feature has been +added to the driver by Cort <cort@cs.nmt.edu> and is available in driver +version 2.5 and later ones. For the moment Big Endian support has only +been tested on Linux/PPC (PowerPC). + +18.2 NCR chip in Big Endian mode of operations + +It can be read in SYMBIOS documentation that some chips support a special +Big Endian mode, on paper: 53C815, 53C825A, 53C875, 53C875N, 53C895. +This mode of operations is not software-selectable, but needs pin named +BigLit to be pulled-up. Using this mode, most of byte reorderings should +be avoided when the driver is running on a Big Endian CPU. +Driver version 2.5 is also, in theory, ready for this feature. + +=============================================================================== +End of NCR53C8XX driver README file diff --git a/Documentation/scsi/osst.txt b/Documentation/scsi/osst.txt new file mode 100644 index 000000000000..ce574e7791ab --- /dev/null +++ b/Documentation/scsi/osst.txt @@ -0,0 +1,219 @@ +README file for the osst driver +=============================== +(w) Kurt Garloff <garloff@suse.de> 12/2000 + +This file describes the osst driver as of version 0.8.x/0.9.x, the released +version of the osst driver. +It is intended to help advanced users to understand the role of osst and to +get them started using (and maybe debugging) it. +It won't address issues like "How do I compile a kernel?" or "How do I load +a module?", as these are too basic. +Once the OnStream got merged into the official kernel, the distro makers +will provide the OnStream support for those who are not familiar with +hacking their kernels. + + +Purpose +------- +The osst driver was developed, because the standard SCSI tape driver in +Linux, st, does not support the OnStream SC-x0 SCSI tape. The st is not to +blame for that, as the OnStream tape drives do not support the standard SCSI +command set for Serial Access Storage Devices (SASDs), which basically +corresponds to the QIC-157 spec. +Nevertheless, the OnStream tapes are nice pieces of hardware and therefore +the osst driver has been written to make these tape devs supported by Linux. +The driver is free software. It's released under the GNU GPL and planned to +be integrated into the mainstream kernel. + + +Implementation +-------------- +The osst is a new high-level SCSI driver, just like st, sr, sd and sg. It +can be compiled into the kernel or loaded as a module. +As it represents a new device, it got assigned a new device node: /dev/osstX +are character devices with major no 206 and minor numbers like the /dev/stX +devices. If those are not present, you may create them by calling +Makedevs.sh as root (see below). +The driver started being a copy of st and as such, the osst devices' +behavior looks very much the same as st to the userspace applications. + + +History +------- +In the first place, osst shared it's identity very much with st. That meant +that it used the same kernel structures and the same device node as st. +So you could only have either of them being present in the kernel. This has +been fixed by registering an own device, now. +st and osst can coexist, each only accessing the devices it can support by +themselves. + + +Installation +------------ +osst got integrated into the linux kernel. Select it during kernel +configuration as module or compile statically into the kernel. +Compile your kernel and install the modules. + +Now, your osst driver is inside the kernel or available as a module, +depending on your choice during kernel config. You may still need to create +the device nodes by calling the Makedevs.sh script (see below) manually, +unless you use a devfs kernel, where this won't be needed. + +To load your module, you may use the command +modprobe osst +as root. dmesg should show you, whether your OnStream tapes have been +recognized. + +If you want to have the module autoloaded on access to /dev/osst, you may +add something like +alias char-major-206 osst +to your /etc/modprobe.conf (before 2.6: modules.conf). + +You may find it convenient to create a symbolic link +ln -s nosst0 /dev/tape +to make programs assuming a default name of /dev/tape more convenient to +use. + +The device nodes for osst have to be created. Use the Makedevs.sh script +attached to this file. + + +Using it +-------- +You may use the OnStream tape driver with your standard backup software, +which may be tar, cpio, amanda, arkeia, BRU, Lone Tar, ... +by specifying /dev/(n)osst0 as the tape device to use or using the above +symlink trick. The IOCTLs to control tape operation are also mostly +supported and you may try the mt (or mt_st) program to jump between +filemarks, eject the tape, ... + +There's one limitation: You need to use a block size of 32kB. + +(This limitation is worked on and will be fixed in version 0.8.8 of + this driver.) + +If you just want to get started with standard software, here is an example +for creating and restoring a full backup: +# Backup +tar cvf - / --exclude /proc | buffer -s 32k -m 24M -B -t -o /dev/nosst0 +# Restore +buffer -s 32k -m 8M -B -t -i /dev/osst0 | tar xvf - -C / + +The buffer command has been used to buffer the data before it goes to the +tape (or the file system) in order to smooth out the data stream and prevent +the tape from needing to stop and rewind. The OnStream does have an internal +buffer and a variable speed which help this, but especially on writing, the +buffering still proves useful in most cases. It also pads the data to +guarantees the block size of 32k. (Otherwise you may pass the -b64 option to +tar.) +Expect something like 1.8MB/s for the SC-x0 drives and 0.9MB/s for the DI-30. +The USB drive will give you about 0.7MB/s. +On a fast machine, you may profit from software data compression (z flag for +tar). + + +USB and IDE +----------- +Via the SCSI emulation layers usb-storage and ide-scsi, you can also use the +osst driver to drive the USB-30 and the DI-30 drives. (Unfortunately, there +is no such layer for the parallel port, otherwise the DP-30 would work as +well.) For the USB support, you need the latest 2.4.0-test kernels and the +latest usb-storage driver from +http://www.linux-usb.org/ +http://sourceforge.net/cvs/?group_id=3581 + +Note that the ide-tape driver as of 1.16f uses a slightly outdated on-tape +format and therefore is not completely interoperable with osst tapes. + +The ADR-x0 line is fully SCSI-2 compliant and is supported by st, not osst. +The on-tape format is supposed to be compatible with the one used by osst. + + +Feedback and updates +-------------------- +The driver development is coordinated through a mailing list +<osst@linux1.onstream.nl> +a CVS repository and some web pages. +The tester's pages which contain recent news and updated drivers to download +can be found on +http://linux1.onstream.nl/test/ + +If you find any problems, please have a look at the tester's page in order +to see whether the problem is already known and solved. Otherwise, please +report it to the mailing list. Your feedback is welcome. (This holds also +for reports of successful usage, of course.) +In case of trouble, please do always provide the following info: +* driver and kernel version used (see syslog) +* driver messages (syslog) +* SCSI config and OnStream Firmware (/proc/scsi/scsi) +* description of error. Is it reproducible? +* software and commands used + +You may subscribe to the mailing list, BTW, it's a majordomo list. + + +Status +------ +0.8.0 was the first widespread BETA release. Since then a lot of reports +have been sent, but mostly reported success or only minor trouble. +All the issues have been addressed. +Check the web pages for more info about the current developments. +0.9.x is the tree for the 2.3/2.4 kernel. + + +Acknowledgments +---------------- +The driver has been started by making a copy of Kai Makisara's st driver. +Most of the development has been done by Willem Riede. The presence of the +userspace program osg (onstreamsg) from Terry Hardie has been rather +helpful. The same holds for Gadi Oxman's ide-tape support for the DI-30. +I did add some patches to those drivers as well and coordinated things a +little bit. +Note that most of them did mostly spend their spare time for the creation of +this driver. +The people from OnStream, especially Jack Bombeeck did support this project +and always tried to answer HW or FW related questions. Furthermore, he +pushed the FW developers to do the right things. +SuSE did support this project by allowing me to work on it during my working +time for them and by integrating the driver into their distro. + +More people did help by sending useful comments. Sorry to those who have +been forgotten. Thanks to all the GNU/FSF and Linux developers who made this +platform such an interesting, nice and stable platform. +Thanks go to those who tested the drivers and did send useful reports. Your +help is needed! + + +Makedevs.sh +----------- +#!/bin/sh +# Script to create OnStream SC-x0 device nodes (major 206) +# Usage: Makedevs.sh [nos [path to dev]] +# $Id: README.osst.kernel,v 1.4 2000/12/20 14:13:15 garloff Exp $ +major=206 +nrs=4 +dir=/dev +test -z "$1" || nrs=$1 +test -z "$2" || dir=$2 +declare -i nr +nr=0 +test -d $dir || mkdir -p $dir +while test $nr -lt $nrs; do + mknod $dir/osst$nr c $major $nr + chown 0.disk $dir/osst$nr; chmod 660 $dir/osst$nr; + mknod $dir/nosst$nr c $major $[nr+128] + chown 0.disk $dir/nosst$nr; chmod 660 $dir/nosst$nr; + mknod $dir/osst${nr}l c $major $[nr+32] + chown 0.disk $dir/osst${nr}l; chmod 660 $dir/osst${nr}l; + mknod $dir/nosst${nr}l c $major $[nr+160] + chown 0.disk $dir/nosst${nr}l; chmod 660 $dir/nosst${nr}l; + mknod $dir/osst${nr}m c $major $[nr+64] + chown 0.disk $dir/osst${nr}m; chmod 660 $dir/osst${nr}m; + mknod $dir/nosst${nr}m c $major $[nr+192] + chown 0.disk $dir/nosst${nr}m; chmod 660 $dir/nosst${nr}m; + mknod $dir/osst${nr}a c $major $[nr+96] + chown 0.disk $dir/osst${nr}a; chmod 660 $dir/osst${nr}a; + mknod $dir/nosst${nr}a c $major $[nr+224] + chown 0.disk $dir/nosst${nr}a; chmod 660 $dir/nosst${nr}a; + let nr+=1 +done diff --git a/Documentation/scsi/ppa.txt b/Documentation/scsi/ppa.txt new file mode 100644 index 000000000000..0dac88d86d87 --- /dev/null +++ b/Documentation/scsi/ppa.txt @@ -0,0 +1,16 @@ +-------- Terse where to get ZIP Drive help info -------- + +General Iomega ZIP drive page for Linux: +http://www.torque.net/~campbell/ + +Driver achive for old drivers: +http://www.torque.net/~campbell/ppa/ + +Linux Parport page (parallel port) +http://www.torque.net/parport/ + +Email list for Linux Parport +linux-parport@torque.net + +Email for problems with ZIP or ZIP Plus drivers +campbell@torque.net diff --git a/Documentation/scsi/qla2xxx.revision.notes b/Documentation/scsi/qla2xxx.revision.notes new file mode 100644 index 000000000000..290cdaf84f8f --- /dev/null +++ b/Documentation/scsi/qla2xxx.revision.notes @@ -0,0 +1,457 @@ +/* + * QLogic ISP2200 and ISP2300 Linux Driver Revision List File. + * + ******************************************************************** + * + * Revision History + * + * Rev 8.00.00b8 December 5, 2003 AV + * - Instruct mid-layer to perform initial scan. + * + * Rev 8.00.00b7 December 5, 2003 AV + * - Resync with Linux Kernel 2.6.0-test11. + * - Add basic NVRAM parser (extras/qla_nvr). + * + * Rev 8.00.00b7-pre11 December 3, 2003 AV + * - Sanitize the scsi_qla_host structure: + * - Purge unused elements. + * - Reorganize high-priority members (cache coherency). + * - Add support for NVRAM access via a sysfs binary attribute: + * - Consolidate semaphore locking access. + * - Fix more PCI posting issues. + * - Add extras directory for dump/NVRAM tools. + * - Remove unused qla_vendor.c file. + * + * Rev 8.00.00b7-pre11 November 26, 2003 DG/AV + * - Merge several patches from Christoph Hellwig [hch@lst.de]: + * - in Linux 2.6 both pci and the scsi layer use the generic + * dma direction bits, use them directly instead of the scsi + * and pci variants and the (noop) conversion routines. + * - Fix _IOXX_BAD() usage for external IOCTL interface. + * - Use atomic construct for HA loop_state member. + * - Add generic model description text for HBA types. + * + * Rev 8.00.00b7-pre5 November 17, 2003 AV + * - Merge several patches from Christoph Hellwig [hch@lst.de]: + * - patch to split the driver into a common qla2xxx.ko and a + * qla2?00.ko for each HBA type - the latter modules are + * only very small wrappers, mostly for the firmware + * images, all the meat is in the common qla2xxx.ko. + * - make the failover code optional. + * - kill useless lock_kernel in dpc thread startup. + * - no need for modversions hacks in 2.6 (or 2.4). + * - kill qla2x00_register_with_Linux. + * - simplify EH code, cmd or it's hostdata can't be NULL, no + * need to search whether the host it's ours, the midlayer + * makes sure it won't call into a driver for some else + * host. + * - Merge several patches from Jes Sorensen + * [jes@wildopensource.com]: + * - Call qla2x00_config_dma_addressing() before performing + * any consistent allocations. This is required since the + * dma mask settings will affect the memory + * pci_alloc_consistent() will return. + * - Call pci_set_consistent_dma_mask() to allow for 64 bit + * consistent allocations, required on some platforms such + * as the SN2. + * - Wait 20 usecs (not sure how long is really necessary, + * but this seems safe) after setting CSR_ISP_SOFT_RESET in + * the ctrl_status register as the card doesn't respond to + * PCI reads while in reset state. This causes a machine + * check on some architectures. + * - Flush PCI writes before calling udelay() to ensure the + * write is not sitting idle in-flight for a while before + * hitting the hardware. + * - Include linux/vmalloc.h in qla_os.c since it uses + * vmalloc(). + * - Use auto-negotiate link speed when using default + * parameters rather than NVRAM settings. Disable NVRAM + * reading on SN2 since it's not possible to execute the + * HBA's BIOS on an SN2. I suggest doing something similar + * for all architectures that do not provide x86 BIOS + * emulation. + * - Clean-up slab-cache allocations: + * - locking. + * - mempool allocations in case of low-memory situations. + * - Fallback to GA_NXT scan if GID_PT call returns more than + * MAX_FIBRE_DEVICES. + * - Preserve iterating port ID across GA_NXT calls in + * qla2x00_find_all_fabric_devs(). + * - Pre-calculate ASCII firmware dump length as to not incur the + * cost-to-calculate at each invocation of a read(). + * + * Rev 8.00.00b6 November 4, 2003 AV + * - Add new 2300 TPX firmware (3.02.18). + * + * Rev 8.00.00b6-pre25 October 20, 2003 RA/AV + * - Resync with Linux Kernel 2.6.0-test9. + * - Rework firmware dump process: + * - Use binary attribute within sysfs tree. + * - Add user-space tool (gdump.sh) to retrieve formatted + * buffer. + * - Add ISP2100 support. + * - Use a slab cache for SRB allocations to reduce memory + * pressure. + * - Initial conversion of driver logging methods to a new + * qla_printk() function which uses dev_printk (Daniel + * Stekloff, IBM). + * - Further reduce stack usage in qla2x00_configure_local_loop() + * and qla2x00_find_all_fabric_devs(). + * - Separate port state used for routing of I/O's from port + * mgmt-login retry etc. + * + * Rev 8.00.00b6-pre19 October 13, 2003 AV + * - Resync with Linux Kernel 2.6.0-test7-bk5. + * - Add intelligent RSCN event handling: + * - reduce scan time during 'port' RSCN events by only + * querying specified port ids. + * - Available on ISP23xx cards only. + * - Increase maximum number of recognizable targets from 256 + * to 512. + * - Backend changes were previously added to support TPX + * (2K logins) firmware. Mid-layer can now scan for targets + * (H, B, T, L) where 512 < T >= 0. + * - Remove IP support from driver. + * - Switch firmware types from IP->TP for ISP22xx and + * IPX->TPX for ISP23xx cards. + * - Remove files qla_ip.[ch]. + * - Remove type designations from firmware filenames. + * + * Rev 8.00.00b6-pre11 September 15, 2003 DG/AV + * - Resync with 6.06.00. + * - Resync with Linux Kernel 2.6.0-test5-bk3. + * - Add new 2300 IPX firmware (3.02.15). + * + * Rev 8.00.00b5 July 31, 2003 AV + * - Always create an fc_lun_t entry for lun 0 - as the mid- + * layer requires access to this lun for discovery to occur. + * - General sanitizing: + * - Add generic firmware option definitions. + * - Generalize retrieval/update of firmware options. + * - Fix compile errors which occur with extended debug. + * - Handle failure cases for scsi_add_host() and + * down_interruptible(). + * - Host template updates: + * - Use standard bios_param callback function. + * - Disable clustering. + * - Remove unchecked_is_dma entry. + * + * Rev 8.00.00b5-pre5 July 29, 2003 DG/AV + * - Resync with 6.06.00b13. + * - Resync with Linux Kernel 2.6.0-test2. + * - Pass the complete loop_id, not the masked (0xff) value + * while issuing mailbox commands (qla_mbx.c/qla_fo.c/ + * qla_iocb.c/qla_init.c). + * - Properly handle zero-length return status for an RLC CDB. + * - Create an fclun_t structure for 'disconnected' luns, + * peripheral-qualifier of 001b. + * - Remove unused LIP-sequence register access during AE 8010. + * - Generalize qla2x00_mark_device_lost() to handle forced + * login request -- modify all direct/indirect invocations + * with proper flag. + * - Save RSCN notification (AE 8015h) data in a proper and + * consistent format (domain, area, al_pa). + * - General sanitizing: + * - scsi_qla_host structure member reordering for cache-line + * coherency. + * - Remove unused SCSI opcodes, endian-swap definitions. + * - Remove CMD_* pre-processor defines. + * - Remove unused SCSIFCHOTSWAP/GAMAP/MULTIHOST codes. + * - Backout patch which added a per-scsi_qla_host scsi host + * spinlock, since mid-layer already defines one. + * - Add new 2300 IPX firmware (3.02.15). + * + * Rev 8.00.00b4 July 14, 2003 RA/DG/AV + * - Resync with 6.06.00b12. + * - Resync with Linux Kernel 2.6.0-test1. + * - Remove IOCB throttling code -- originally #if'd. + * - Remove apidev_*() routines since proc_mknod() has been + * removed -- need alternate IOCTL interface. + * - Merge several performance/fix patches from Arjan van de + * Ven: + * - Undefined operation >> 32. + * - No need to acquire mid-layer lock during command + * callback. + * - Use a per-HBA mid-layer lock. + * - Use a non-locked cycle for setting the count of the + * newly allocated sp (qla2x00_get_new_sp()). + * - Modify semantic behavior of qla2x00_queuecommand(): + * - Reduce cacheline bouncing by having I/Os submitted + * by the IRQ handler. + * - Remove extraneous calls to qla2x00_next() during I/O + * queuing. + * - Use list_splice_init() during qla2x00_done() handling + * of commands to reduce list_lock contention. + * - RIO mode support for ISP2200: + * - Implementation differs slightly from original patch. + * - Do not use bottom-half handler (tasklet/work queue) + * for qla2x00_done() processing. + * + * Rev 8.00.00b4-pre22 July 12, 2003 AV + * - Check for 'Process Response Queue' requests early during + * the Host Status check. + * - General sanitizing: + * - srb_t structure rewrite, removal of unused members. + * - Remove unused fcdev array, fabricid, and PORT_* + * definitions. + * - Remove unused config_reg_t PCI definitions. + * - Add new 2200 IP firmware (2.02.06). + * - Add new 2300 IPX firmware (3.02.14). + * + * Rev 8.00.00b4-pre19 June 30, 2003 AV + * - Resync with Linux Kernel 2.5.73-bk8. + * - Rework IOCB command queuing methods: + * - Upper-layer driver *MUST* properly set the direction + * bit of SCSI commands. + * - Generalize 32bit/64bit queuing path functions. + * - Remove costly page-boundary cross check when using + * 64bit address capable IOCBs. + * + * Rev 8.00.00b4-pre15 June 19, 2003 AV + * - Resync with 6.06.00b11. + * - Continue fcport list consolidation work: + * - Updated IOCTL implementations to use new fcports + * list. + * - Modified product ID check to not verify ISP chip + * revision -- ISP2312 v3 (qla2x00_chip_diag()). + * - Add new 2300 IPX firmware (3.02.13): + * + * Rev 8.00.00b4-pre13 June 19, 2003 AV + * - Fix build process for qla2100 driver -- no support + * for IP. + * - SCSI host template modifications: + * - Set sg_tablesize based on the derived DMA mask. + * - Increase max_sectors since only limit within RISC + * is transfer of (((2^32) - 1) >> 9) sectors. + * + * Rev 8.00.00b4-pre12 June 18, 2003 RA, DG, RL, AV + * - Resync with 6.06.00b10. + * - Resync with Linux Kernel 2.5.72. + * - Initial fcport list consolidation work: + * - fcports/fcinitiators/fcdev/fc_ip --> ha->fcports + * list. + * + * Rev 8.00.00b4-pre7 June 05, 2003 AV + * - Properly release PCI resouces in init-failure case. + * - Reconcile disparite function return code definitions. + * + * Rev 8.00.00b4-pre4 June 03, 2003 AV + * - Resync with Linux Kernel 2.5.70-bk8: + * - SHT proc_info() changes. + * - Restructure SNS Generic Services routines: + * - Add qla_gs.c file to driver distribution. + * - Configure PCI latency timer for ISP23xx. + * + * Rev 8.00.00b4-pre3 June 02, 2003 RA, DG, RL, AV + * - Resync with 6.06.00b5. + * - Rework (again) PCI I/O space configuration + * (Anton Blanchard): + * - Use pci_set_mwi() routine; + * - Remove uneeded qla2x00_set_cache_line() function. + * - Remove extraneous modification of PCI_COMMAND word. + * + * Rev 8.00.00b3 May 29, 2003 AV + * - Resync with Linux Kernel 2.5.70. + * - Move RISC paused check from ISR fast-path. + * + * Rev 8.00.00b3-pre8 May 26, 2003 AV + * - Add new 2300 IPX firmware (3.02.12): + * - Rework PCI I/O space configuration. + * + * Rev 8.00.00b3-pre6 May 22, 2003 RA, DG, RL, AV + * - Resync with 6.06.00b3. + * + * Rev 8.00.00b3-pre4 May 21 2003 AV + * - Add new 2300 IPX firmware (3.02.11): + * - Remove 2300 TPX firmware from distribution. + * + * Rev 8.00.00b3-pre3 May 21 2003 AV + * - Properly setup PCI configuation space during + * initialization: + * - Properly configure Memory-Mapped I/O during early + * configuration stage. + * - Rework IP functionality to support 2k logins. + * - Add new 2300 IPX firmware (3.02.11): + * - Remove 2300 TPX firmware from distribution. + * + * Rev 8.00.00b3-pre2 May ??, 2003 RA, DG, RL, AV + * - Resync with 6.06.00b1. + * + * Rev 8.00.00b3-pre1 May ??, 2003 RA, DG, RL, AV + * - Resync with 6.05.00. + * + * Rev 8.00.00b2 May 19, 2003 AV + * - Simplify dma_addr_t handling during command queuing given + * new block-layer defined restrictions: + * - Physical addresses not spanning 4GB boundaries. + * - Firmware versions: 2100 TP (1.19.24), 2200 IP (2.02.05), + * 2300 TPX (3.02.10). + * + * Rev 8.00.00b2-pre1 May 13, 2003 AV + * - Add support for new 'Hotplug initialization' model. + * - Simplify host template by removing unused callbacks. + * - Use scsicam facilities to determine geometry. + * - Fix compilation issues for non-ISP23xx builds: + * - Correct register references in qla_dbg.c. + * - Correct Makefile build process. + * + * Rev 8.00.00b1 May 05, 2003 AV + * - Resync with Linux Kernel 2.5.69. + * - Firmware versions: 2100 TP (1.19.24), 2200 TP (2.02.05), + * 2300 TPX (3.02.10). + * + * Rev 8.00.00b1-pre45 April ??, 2003 AV + * - Resync with Linux Kernel 2.5.68-bk11: + * - Fix improper return-code assignment during fabric + * discovery. + * - Remove additional extraneous #defines from + * qla_settings.h. + * - USE_PORTNAME -- FO will always use portname. + * - Default queue depth size set to 64. + * + * Rev 8.00.00b1-pre42 April ??, 2003 AV + * - Convert bottom-half tasklet to a work_queue. + * - Initial basic coding of dynamic queue depth handling + * during QUEUE FULL statuses. + * - Fix mailbox interface problem with + * qla2x00_get_retry_cnt(). + * + * Rev 8.00.00b1-pre41 April ??, 2003 AV + * - Convert build defines qla2[1|2|3]00 macros to + * qla2[1|2|3]xx due to module name stringification clashes. + * - Add additional ISP2322 checks during board configuration. + * + * Rev 8.00.00b1-pre40 April ??, 2003 AV + * - Resync with Linux Kernel 2.5.68-bk8: + * - Updated IRQ handler interface. + * - Add ISP dump code (stub) in case of SYSTEM_ERROR on + * ISP2100. + * - Add new 2200 IP firmware (2.02.05). + * + * Rev 8.00.00b1-pre39 April ??, 2003 AV + * - Resync with Linux Kernel 2.5.68. + * - Add simple build.sh script to aid in external compilation. + * - Clean-break with Kernel 2.4 compatibility. + * - Rework DPC routine -- completion routines for signaling. + * - Re-add HBAAPI character device node for IOCTL support. + * - Remove residual QLA2X_PERFORMANCE defines. + * - Allocate SP pool via __get_free_pages() rather than + * individual kmalloc()'s. + * - Inform SCSI mid-layer of 16-byte CDB support + * (host->max_cmd_len): + * - Remove unecessary 'more_cdb' handling code from + * qla_iocb.c and qla_xioct.c. + * - Reduce duplicate code in fabric scanning logic (MS IOCB + * preparation). + * - Add ISP dump code in case of SYSTEM_ERROR. + * - Remove 2300 VIX firmware from distribution: + * - Add initial code for IPX support. + * - Add new 2300 TPX firmware (3.02.10). + * + * Rev 8.00.00b1-pre34 April ??, 2003 AV + * - Resync with Linux Kernel 2.5.67. + * - Use domain/area/al_pa fields when displaying PortID + * values -- addresses endianess issues. + * - Rework large case statement to check 'common' CDB commands + * early in qla2x00_get_cmd_direction(). + * + * Rev 8.00.00b1-pre31 April ??, 2003 AV + * - Update makefile to support PPC64 build. + * - Retool NVRAM configuration routine and structures: + * - Consoldate ISP21xx/ISP22xx/ISP23xx configuration + * (struct nvram_t). + * - Remove big/little endian support structures in favor of + * simplified bit-operations within byte fields. + * - Fix long-standing 'static' buffer sharing problem in + * qla2x00_configure_fabric(). + * + * Rev 8.00.00b1-pre30 April ??, 2003 AV + * - Complete implementation of GID_PT scan. + * - Use consistent MS IOCB invocation method to query SNS: + * - Add RNN_ID and RSNN_NN registrations in a fabric. + * - Remove unused Mailbox Command 6Eh (Send SNS) support + * structures. + * - Use 64bit safe IOCBs while issuing INQUIRY and RLC during + * topology scan. + * - Until reimplementation of fcdev_t/fcport list + * consolidation, valid loop_id ranges are still limited from + * 0x00 through 0xFF -- enforce this within the code. + * + * Rev 8.00.00b1-pre27 March ??, 2003 AV + * - Resync with 6.05.00b9. + * - Retool HBA PCI configuration -- qla2x00_pci_config(). + * - Remove inconsistent use of delay routines (UDELAY/SYS*). + * - Continue to teardown/clean/add comments and debug + * routines. + * - Properly swap bytes of the device's nodename in + * qla2x00_configure_local_loop(). + * + * Rev 8.00.00b1-pre25 March ??, 2003 AV + * - Resync with 6.05.00b8. + * + * Rev 8.00.00b1-pre23 March ??, 2003 AV + * - Remove (#define) IOCB usage throttling. + * - Abstract interrupt polling with qla2x00_poll(). + * - Modify lun scanning logic: + * - If the device does not support the SCSI Report Luns + * command, the driver will now only scan from 0 to the + * max#-luns as defined in the NVRAM (BIOS), rather than + * blindly scanning from 0 to 255 -- which could result in + * an increase in startup time when running against slow + * (JBOD) devices. + * - Rework reset logic in qla2x00_reset_chip() (spec). + * + * Rev 8.00.00b1-pre22 March ??, 2003 AV + * - Resync with 6.05.00b7. + * - Cleanup (rewrite) ISR handler. + * - Rename kmem_zalloc --> qla2x00_kmem_zalloc(): + * - This function will eventually be removed. + * - Add new 2300 VIX firmware (3.02.09): + * - Support for Tape, Fabric, 2K logins, IP, and VI. + * + * Rev 8.00.00b1-pre18 March ??, 2003 AV + * - Support 232x type ISPs. + * - Support single firmware for each ISP type: + * - Restructure brd_info/fw_info methods. + * - Streamline firmware load process. + * - Properly query firmware for version information. + * - Remove extraneous scsi_qla_host members: + * - device_id ==> pdev->device + * - Fix fc4 features (RFF_ID) registration. + * - Convert kmem_zalloc --> qla2x00_kmem_zalloc(). + * - Remove unused/extraneous #defines (USE_PORTNAME). + * + * Rev 8.00.00b1-pre14 March ??, 2003 AV + * - Resync with 6.05.00b6. + * - Initial source-code restructuring effort. + * - Build procedure. + * - Source file layout -- intuitive component layout. + * - Remove unused #defines (*PERFORMANCE, WORD_FW_LOAD, etc). + * - Add support for 2K logins (TPX -- firmware). + * - Add module parameter ql2xsuspendcount. + * - Add new 2200 IP/TP firmware (2.02.04). + * + * Rev 8.00.00b1-pre9 March ??, 2003 RL/DG/RA/AV + * - Use kernel struct list_head for fcport and fclun lists. + * - Remove extraneous (L|M)S_64BITS() and QL21_64*() defines. + * + * Rev 8.00.00b1-pre8 February 28, 2003 RL/DG/RA/AV + * - Resync with 6.05.00b3. + * + * Rev 8.00.00b1-pre7 February 23, 2003 RL/DG/RA/AV + * - Add alternate fabric scanning logic (GID_PT/GNN_ID/GPN_ID). + * - Remove use of deprecated function check_region(). + * - Add new 2300 IP/TP firmware (3.02.08). + * + * Rev 8.00.00b1-pre5 January 28, 2003 RL/DG/RA/AV + * - Resync with 6.05.00b3. + * - Consolidate device_reg structure definitions for ISP types. + * - Add support for new queue-depth selection. + * - Add new 2300 IP/TP firmware (3.02.07). + * + * Rev 8.00.00b1-pre1 January 17, 2003 AV + * - Initial branch from 6.04.00b8 driver. + * - Remove VMWARE specific code. + * - Add support for pci_driver interface. + * + ********************************************************************/ diff --git a/Documentation/scsi/qlogicfas.txt b/Documentation/scsi/qlogicfas.txt new file mode 100644 index 000000000000..398f99168077 --- /dev/null +++ b/Documentation/scsi/qlogicfas.txt @@ -0,0 +1,79 @@ + +This driver supports the Qlogic FASXXX family of chips. This driver +only works with the ISA, VLB, and PCMCIA versions of the Qlogic +FastSCSI! cards as well as any other card based on the FASXX chip +(including the Control Concepts SCSI/IDE/SIO/PIO/FDC cards). + +This driver does NOT support the PCI version. Support for these PCI +Qlogic boards: + + * IQ-PCI + * IQ-PCI-10 + * IQ-PCI-D + +is provided by the qlogicisp.c driver. Check README.qlogicisp for +details. + +Nor does it support the PCI-Basic, which is supported by the +'am53c974' driver. + +PCMCIA SUPPORT + +This currently only works if the card is enabled first from DOS. This +means you will have to load your socket and card services, and +QL41DOS.SYS and QL40ENBL.SYS. These are a minimum, but loading the +rest of the modules won't interfere with the operation. The next +thing to do is load the kernel without resetting the hardware, which +can be a simple ctrl-alt-delete with a boot floppy, or by using +loadlin with the kernel image accessible from DOS. If you are using +the Linux PCMCIA driver, you will have to adjust it or otherwise stop +it from configuring the card. + +I am working with the PCMCIA group to make it more flexible, but that +may take a while. + +ALL CARDS + +The top of the qlogic.c file has a number of defines that controls +configuration. As shipped, it provides a balance between speed and +function. If there are any problems, try setting SLOW_CABLE to 1, and +then try changing USE_IRQ and TURBO_PDMA to zero. If you are familiar +with SCSI, there are other settings which can tune the bus. + +It may be a good idea to enable RESET_AT_START, especially if the +devices may not have been just powered up, or if you are restarting +after a crash, since they may be busy trying to complete the last +command or something. It comes up faster if this is set to zero, and +if you have reliable hardware and connections it may be more useful to +not reset things. + +SOME TROUBLESHOOTING TIPS + +Make sure it works properly under DOS. You should also do an initial FDISK +on a new drive if you want partitions. + +Don't enable all the speedups first. If anything is wrong, they will make +any problem worse. + +IMPORTANT + +The best way to test if your cables, termination, etc. are good is to +copy a very big file (e.g. a doublespace container file, or a very +large executable or archive). It should be at least 5 megabytes, but +you can do multiple tests on smaller files. Then do a COMP to verify +that the file copied properly. (Turn off all caching when doing these +tests, otherwise you will test your RAM and not the files). Then do +10 COMPs, comparing the same file on the SCSI hard drive, i.e. "COMP +realbig.doc realbig.doc". Then do it after the computer gets warm. + +I noticed my system which seems to work 100% would fail this test if +the computer was left on for a few hours. It was worse with longer +cables, and more devices on the SCSI bus. What seems to happen is +that it gets a false ACK causing an extra byte to be inserted into the +stream (and this is not detected). This can be caused by bad +termination (the ACK can be reflected), or by noise when the chips +work less well because of the heat, or when cables get too long for +the speed. + +Remember, if it doesn't work under DOS, it probably won't work under +Linux. diff --git a/Documentation/scsi/qlogicisp.txt b/Documentation/scsi/qlogicisp.txt new file mode 100644 index 000000000000..6920f6c76a9f --- /dev/null +++ b/Documentation/scsi/qlogicisp.txt @@ -0,0 +1,30 @@ +Notes for the QLogic ISP1020 PCI SCSI Driver: + +This driver works well in practice, but does not support disconnect/ +reconnect, which makes using it with tape drives impractical. + +It should work for most host adaptors with the ISP1020 chip. The +QLogic Corporation produces several PCI SCSI adapters which should +work: + + * IQ-PCI + * IQ-PCI-10 + * IQ-PCI-D + +This driver may work with boards containing the ISP1020A or ISP1040A +chips, but that has not been tested. + +This driver will NOT work with: + + * ISA or VL Bus Qlogic cards (they use the 'qlogicfas' driver) + * PCI-basic (it uses the 'am53c974' driver) + +Much thanks to QLogic's tech support for providing the latest ISP1020 +firmware, and for taking the time to review my code. + +Erik Moe +ehm@cris.com + +Revised: +Michael A. Griffith +grif@cs.ucr.edu diff --git a/Documentation/scsi/scsi-generic.txt b/Documentation/scsi/scsi-generic.txt new file mode 100644 index 000000000000..c38e2b3648e4 --- /dev/null +++ b/Documentation/scsi/scsi-generic.txt @@ -0,0 +1,101 @@ + Notes on Linux SCSI Generic (sg) driver + --------------------------------------- + 20020126 +Introduction +============ +The SCSI Generic driver (sg) is one of the four "high level" SCSI device +drivers along with sd, st and sr (disk, tape and CDROM respectively). Sg +is more generalized (but lower level) than its siblings and tends to be +used on SCSI devices that don't fit into the already serviced categories. +Thus sg is used for scanners, CD writers and reading audio CDs digitally +amongst other things. + +Rather than document the driver's interface here, version information +is provided plus pointers (i.e. URLs) where to find documentation +and examples. + + +Major versions of the sg driver +=============================== +There are three major versions of sg found in the linux kernel (lk): + - sg version 1 (original) from 1992 to early 1999 (lk 2.2.5) . + It is based in the sg_header interface structure. + - sg version 2 from lk 2.2.6 in the 2.2 series. It is based on + an extended version of the sg_header interface structure. + - sg version 3 found in the lk 2.4 series (and the lk 2.5 series). + It adds the sg_io_hdr interface structure. + + +Sg driver documentation +======================= +The most recent documentation of the sg driver is kept at the Linux +Documentation Project's (LDP) site: +http://www.tldp.org/HOWTO/SCSI-Generic-HOWTO +This describes the sg version 3 driver found in the lk 2.4 series. +The LDP renders documents in single and multiple page HTML, postscript +and pdf. This document can also be found at: +http://www.torque.net/sg/p/sg_v3_ho.html + +Documentation for the version 2 sg driver found in the lk 2.2 series can +be found at http://www.torque.net/sg/p/scsi-generic.txt . A larger version +is at: http://www.torque.net/sg/p/scsi-generic_long.txt . + +The original documentation for the sg driver (prior to lk 2.2.6) can be +found at http://www.torque.net/sg/p/original/SCSI-Programming-HOWTO.txt +and in the LDP archives. + +A changelog with brief notes can be found in the +/usr/src/linux/include/scsi/sg.h file. Note that the glibc maintainers copy +and edit this file (removing its changelog for example) before placing it +in /usr/include/scsi/sg.h . Driver debugging information and other notes +can be found at the top of the /usr/src/linux/drivers/scsi/sg.c file. + +A more general description of the Linux SCSI subsystem of which sg is a +part can be found at http://www.tldp.org/HOWTO/SCSI-2.4-HOWTO . + + +Example code and utilities +========================== +There are two packages of sg utilities: + - sg3_utils for the sg version 3 driver found in lk 2.4 + - sg_utils for the sg version 2 (and original) driver found in lk 2.2 + and earlier +Both packages will work in the lk 2.4 series however sg3_utils offers more +capabilities. They can be found at: http://www.torque.net/sg and +freshmeat.net + +Another approach is to look at the applications that use the sg driver. +These include cdrecord, cdparanoia, SANE and cdrdao. + + +Mapping of Linux kernel versions to sg driver versions +====================================================== +Here is a list of linux kernels in the 2.4 series that had new version +of the sg driver: + lk 2.4.0 : sg version 3.1.17 + lk 2.4.7 : sg version 3.1.19 + lk 2.4.10 : sg version 3.1.20 ** + lk 2.4.17 : sg version 3.1.22 + +** There were 3 changes to sg version 3.1.20 by third parties in the + next six linux kernel versions. + +For reference here is a list of linux kernels in the 2.2 series that had +new version of the sg driver: + lk 2.2.0 : original sg version [with no version number] + lk 2.2.6 : sg version 2.1.31 + lk 2.2.8 : sg version 2.1.32 + lk 2.2.10 : sg version 2.1.34 [SG_GET_VERSION_NUM ioctl first appeared] + lk 2.2.14 : sg version 2.1.36 + lk 2.2.16 : sg version 2.1.38 + lk 2.2.17 : sg version 2.1.39 + lk 2.2.20 : sg version 2.1.40 + +The lk 2.5 development series has recently commenced and it currently +contains sg version 3.5.23 which is functionally equivalent to sg +version 3.1.22 found in lk 2.4.17 . + + +Douglas Gilbert +26th January 2002 +dgilbert@interlog.com diff --git a/Documentation/scsi/scsi.txt b/Documentation/scsi/scsi.txt new file mode 100644 index 000000000000..dd1bbf4e98e3 --- /dev/null +++ b/Documentation/scsi/scsi.txt @@ -0,0 +1,44 @@ +SCSI subsystem documentation +============================ +The Linux Documentation Project (LDP) maintains a document describing +the SCSI subsystem in the Linux kernel (lk) 2.4 series. See: +http://www.tldp.org/HOWTO/SCSI-2.4-HOWTO . The LDP has single +and multiple page HTML renderings as well as postscript and pdf. +It can also be found at http://www.torque.net/scsi/SCSI-2.4-HOWTO . + + +Notes on using modules in the SCSI subsystem +============================================ +The scsi support in the linux kernel can be modularized in a number of +different ways depending upon the needs of the end user. To understand +your options, we should first define a few terms. + +The scsi-core (also known as the "mid level") contains the core of scsi +support. Without it you can do nothing with any of the other scsi drivers. +The scsi core support can be a module (scsi_mod.o), or it can be built into +the kernel. If the core is a module, it must be the first scsi module +loaded, and if you unload the modules, it will have to be the last one +unloaded. In practice the modprobe and rmmod commands (and "autoclean") +will enforce the correct ordering of loading and unloading modules in +the SCSI subsystem. + +The individual upper and lower level drivers can be loaded in any order +once the scsi core is present in the kernel (either compiled in or loaded +as a module). The disk driver (sd_mod.o), cdrom driver (sr_mod.o), +tape driver ** (st.o) and scsi generics driver (sg.o) represent the upper +level drivers to support the various assorted devices which can be +controlled. You can for example load the tape driver to use the tape drive, +and then unload it once you have no further need for the driver (and release +the associated memory). + +The lower level drivers are the ones that support the individual cards that +are supported for the hardware platform that you are running under. Those +individual cards are often called Host Bus Adapters (HBAs). For example the +aic7xxx.o driver is used to control all recent SCSI controller cards from +Adaptec. Almost all lower level drivers can be built either as modules or +built into the kernel. + + +** There is a variant of the st driver for controlling OnStream tape + devices. Its module name is osst.o . + diff --git a/Documentation/scsi/scsi_mid_low_api.txt b/Documentation/scsi/scsi_mid_low_api.txt new file mode 100644 index 000000000000..1f24129a3099 --- /dev/null +++ b/Documentation/scsi/scsi_mid_low_api.txt @@ -0,0 +1,1546 @@ + Linux Kernel 2.6 series + SCSI mid_level - lower_level driver interface + ============================================= + +Introduction +============ +This document outlines the interface between the Linux SCSI mid level and +SCSI lower level drivers. Lower level drivers (LLDs) are variously called +host bus adapter (HBA) drivers and host drivers (HD). A "host" in this +context is a bridge between a computer IO bus (e.g. PCI or ISA) and a +single SCSI initiator port on a SCSI transport. An "initiator" port +(SCSI terminology, see SAM-3 at http://www.t10.org) sends SCSI commands +to "target" SCSI ports (e.g. disks). There can be many LLDs in a running +system, but only one per hardware type. Most LLDs can control one or more +SCSI HBAs. Some HBAs contain multiple hosts. + +In some cases the SCSI transport is an external bus that already has +its own subsystem in Linux (e.g. USB and ieee1394). In such cases the +SCSI subsystem LLD is a software bridge to the other driver subsystem. +Examples are the usb-storage driver (found in the drivers/usb/storage +directory) and the ieee1394/sbp2 driver (found in the drivers/ieee1394 +directory). + +For example, the aic7xxx LLD controls Adaptec SCSI parallel interface +(SPI) controllers based on that company's 7xxx chip series. The aic7xxx +LLD can be built into the kernel or loaded as a module. There can only be +one aic7xxx LLD running in a Linux system but it may be controlling many +HBAs. These HBAs might be either on PCI daughter-boards or built into +the motherboard (or both). Some aic7xxx based HBAs are dual controllers +and thus represent two hosts. Like most modern HBAs, each aic7xxx host +has its own PCI device address. [The one-to-one correspondence between +a SCSI host and a PCI device is common but not required (e.g. with +ISA or MCA adapters).] + +The SCSI mid level isolates an LLD from other layers such as the SCSI +upper layer drivers and the block layer. + +This version of the document roughly matches linux kernel version 2.6.8 . + +Documentation +============= +There is a SCSI documentation directory within the kernel source tree, +typically Documentation/scsi . Most documents are in plain +(i.e. ASCII) text. This file is named scsi_mid_low_api.txt and can be +found in that directory. A more recent copy of this document may be found +at http://www.torque.net/scsi/scsi_mid_low_api.txt.gz . +Many LLDs are documented there (e.g. aic7xxx.txt). The SCSI mid-level is +briefly described in scsi.txt which contains a url to a document +describing the SCSI subsystem in the lk 2.4 series. Two upper level +drivers have documents in that directory: st.txt (SCSI tape driver) and +scsi-generic.txt (for the sg driver). + +Some documentation (or urls) for LLDs may be found in the C source code +or in the same directory as the C source code. For example to find a url +about the USB mass storage driver see the +/usr/src/linux/drivers/usb/storage directory. + +The Linux kernel source Documentation/DocBook/scsidrivers.tmpl file +refers to this file. With the appropriate DocBook tool-set, this permits +users to generate html, ps and pdf renderings of information within this +file (e.g. the interface functions). + +Driver structure +================ +Traditionally an LLD for the SCSI subsystem has been at least two files in +the drivers/scsi directory. For example, a driver called "xyz" has a header +file "xyz.h" and a source file "xyz.c". [Actually there is no good reason +why this couldn't all be in one file; the header file is superfluous.] Some +drivers that have been ported to several operating systems have more than +two files. For example the aic7xxx driver has separate files for generic +and OS-specific code (e.g. FreeBSD and Linux). Such drivers tend to have +their own directory under the drivers/scsi directory. + +When a new LLD is being added to Linux, the following files (found in the +drivers/scsi directory) will need some attention: Makefile and Kconfig . +It is probably best to study how existing LLDs are organized. + +As the 2.5 series development kernels evolve into the 2.6 series +production series, changes are being introduced into this interface. An +example of this is driver initialization code where there are now 2 models +available. The older one, similar to what was found in the lk 2.4 series, +is based on hosts that are detected at HBA driver load time. This will be +referred to the "passive" initialization model. The newer model allows HBAs +to be hot plugged (and unplugged) during the lifetime of the LLD and will +be referred to as the "hotplug" initialization model. The newer model is +preferred as it can handle both traditional SCSI equipment that is +permanently connected as well as modern "SCSI" devices (e.g. USB or +IEEE 1394 connected digital cameras) that are hotplugged. Both +initialization models are discussed in the following sections. + +An LLD interfaces to the SCSI subsystem several ways: + a) directly invoking functions supplied by the mid level + b) passing a set of function pointers to a registration function + supplied by the mid level. The mid level will then invoke these + functions at some point in the future. The LLD will supply + implementations of these functions. + c) direct access to instances of well known data structures maintained + by the mid level + +Those functions in group a) are listed in a section entitled "Mid level +supplied functions" below. + +Those functions in group b) are listed in a section entitled "Interface +functions" below. Their function pointers are placed in the members of +"struct scsi_host_template", an instance of which is passed to +scsi_host_alloc() ** . Those interface functions that the LLD does not +wish to supply should have NULL placed in the corresponding member of +struct scsi_host_template. Defining an instance of struct +scsi_host_template at file scope will cause NULL to be placed in function + pointer members not explicitly initialized. + +Those usages in group c) should be handled with care, especially in a +"hotplug" environment. LLDs should be aware of the lifetime of instances +that are shared with the mid level and other layers. + +All functions defined within an LLD and all data defined at file scope +should be static. For example the slave_alloc() function in an LLD +called "xxx" could be defined as +"static int xxx_slave_alloc(struct scsi_device * sdev) { /* code */ }" + +** the scsi_host_alloc() function is a replacement for the rather vaguely +named scsi_register() function in most situations. The scsi_register() +and scsi_unregister() functions remain to support legacy LLDs that use +the passive initialization model. + + +Hotplug initialization model +============================ +In this model an LLD controls when SCSI hosts are introduced and removed +from the SCSI subsystem. Hosts can be introduced as early as driver +initialization and removed as late as driver shutdown. Typically a driver +will respond to a sysfs probe() callback that indicates an HBA has been +detected. After confirming that the new device is one that the LLD wants +to control, the LLD will initialize the HBA and then register a new host +with the SCSI mid level. + +During LLD initialization the driver should register itself with the +appropriate IO bus on which it expects to find HBA(s) (e.g. the PCI bus). +This can probably be done via sysfs. Any driver parameters (especially +those that are writable after the driver is loaded) could also be +registered with sysfs at this point. The SCSI mid level first becomes +aware of an LLD when that LLD registers its first HBA. + +At some later time, the LLD becomes aware of an HBA and what follows +is a typical sequence of calls between the LLD and the mid level. +This example shows the mid level scanning the newly introduced HBA for 3 +scsi devices of which only the first 2 respond: + + HBA PROBE: assume 2 SCSI devices found in scan +LLD mid level LLD +===-------------------=========--------------------===------ +scsi_host_alloc() --> +scsi_add_host() --------+ + | + slave_alloc() + slave_configure() --> scsi_adjust_queue_depth() + | + slave_alloc() + slave_configure() + | + slave_alloc() *** + slave_destroy() *** +------------------------------------------------------------ + +If the LLD wants to adjust the default queue settings, it can invoke +scsi_adjust_queue_depth() in its slave_configure() routine. + +*** For scsi devices that the mid level tries to scan but do not + respond, a slave_alloc(), slave_destroy() pair is called. + +When an HBA is being removed it could be as part of an orderly shutdown +associated with the LLD module being unloaded (e.g. with the "rmmod" +command) or in response to a "hot unplug" indicated by sysfs()'s +remove() callback being invoked. In either case, the sequence is the +same: + + HBA REMOVE: assume 2 SCSI devices attached +LLD mid level LLD +===----------------------=========-----------------===------ +scsi_remove_host() ---------+ + | + slave_destroy() + slave_destroy() +scsi_host_put() +------------------------------------------------------------ + +It may be useful for a LLD to keep track of struct Scsi_Host instances +(a pointer is returned by scsi_host_alloc()). Such instances are "owned" +by the mid-level. struct Scsi_Host instances are freed from +scsi_host_put() when the reference count hits zero. + +Hot unplugging an HBA that controls a disk which is processing SCSI +commands on a mounted file system is an interesting situation. Reference +counting logic is being introduced into the mid level to cope with many +of the issues involved. See the section on reference counting below. + + +The hotplug concept may be extended to SCSI devices. Currently, when an +HBA is added, the scsi_add_host() function causes a scan for SCSI devices +attached to the HBA's SCSI transport. On newer SCSI transports the HBA +may become aware of a new SCSI device _after_ the scan has completed. +An LLD can use this sequence to make the mid level aware of a SCSI device: + + SCSI DEVICE hotplug +LLD mid level LLD +===-------------------=========--------------------===------ +scsi_add_device() ------+ + | + slave_alloc() + slave_configure() [--> scsi_adjust_queue_depth()] +------------------------------------------------------------ + +In a similar fashion, an LLD may become aware that a SCSI device has been +removed (unplugged) or the connection to it has been interrupted. Some +existing SCSI transports (e.g. SPI) may not become aware that a SCSI +device has been removed until a subsequent SCSI command fails which will +probably cause that device to be set offline by the mid level. An LLD that +detects the removal of a SCSI device can instigate its removal from +upper layers with this sequence: + + SCSI DEVICE hot unplug +LLD mid level LLD +===----------------------=========-----------------===------ +scsi_remove_device() -------+ + | + slave_destroy() +------------------------------------------------------------ + +It may be useful for an LLD to keep track of struct scsi_device instances +(a pointer is passed as the parameter to slave_alloc() and +slave_configure() callbacks). Such instances are "owned" by the mid-level. +struct scsi_device instances are freed after slave_destroy(). + + +Passive initialization model +============================ +These older LLDs include a file called "scsi_module.c" [yes the ".c" is a +little surprising] in their source code. For that file to work an +instance of struct scsi_host_template with the name "driver_template" +needs to be defined. Here is a typical code sequence used in this model: + static struct scsi_host_template driver_template = { + ... + }; + #include "scsi_module.c" + +The scsi_module.c file contains two functions: + - init_this_scsi_driver() which is executed when the LLD is + initialized (i.e. boot time or module load time) + - exit_this_scsi_driver() which is executed when the LLD is shut + down (i.e. module unload time) +Note: since these functions are tagged with __init and __exit qualifiers +an LLD should not call them explicitly (since the kernel does that). + +Here is an example of an initialization sequence when two hosts are +detected (so detect() returns 2) and the SCSI bus scan on each host +finds 1 SCSI device (and a second device does not respond). + +LLD mid level LLD +===----------------------=========-----------------===------ +init_this_scsi_driver() ----+ + | + detect() -----------------+ + | | + | scsi_register() + | scsi_register() + | + slave_alloc() + slave_configure() --> scsi_adjust_queue_depth() + slave_alloc() *** + slave_destroy() *** + | + slave_alloc() + slave_configure() + slave_alloc() *** + slave_destroy() *** +------------------------------------------------------------ + +The mid level invokes scsi_adjust_queue_depth() with tagged queuing off and +"cmd_per_lun" for that host as the queue length. These settings can be +overridden by a slave_configure() supplied by the LLD. + +*** For scsi devices that the mid level tries to scan but do not + respond, a slave_alloc(), slave_destroy() pair is called. + +Here is an LLD shutdown sequence: + +LLD mid level LLD +===----------------------=========-----------------===------ +exit_this_scsi_driver() ----+ + | + slave_destroy() + release() --> scsi_unregister() + | + slave_destroy() + release() --> scsi_unregister() +------------------------------------------------------------ + +An LLD need not define slave_destroy() (i.e. it is optional). + +The shortcoming of the "passive initialization model" is that host +registration and de-registration are (typically) tied to LLD initialization +and shutdown. Once the LLD is initialized then a new host that appears +(e.g. via hotplugging) cannot easily be added without a redundant +driver shutdown and re-initialization. It may be possible to write an LLD +that uses both initialization models. + + +Reference Counting +================== +The Scsi_Host structure has had reference counting infrastructure added. +This effectively spreads the ownership of struct Scsi_Host instances +across the various SCSI layers which use them. Previously such instances +were exclusively owned by the mid level. LLDs would not usually need to +directly manipulate these reference counts but there may be some cases +where they do. + +There are 3 reference counting functions of interest associated with +struct Scsi_Host: + - scsi_host_alloc(): returns a pointer to new instance of struct + Scsi_Host which has its reference count ^^ set to 1 + - scsi_host_get(): adds 1 to the reference count of the given instance + - scsi_host_put(): decrements 1 from the reference count of the given + instance. If the reference count reaches 0 then the given instance + is freed + +The Scsi_device structure has had reference counting infrastructure added. +This effectively spreads the ownership of struct Scsi_device instances +across the various SCSI layers which use them. Previously such instances +were exclusively owned by the mid level. See the access functions declared +towards the end of include/scsi/scsi_device.h . If an LLD wants to keep +a copy of a pointer to a Scsi_device instance it should use scsi_device_get() +to bump its reference count. When it is finished with the pointer it can +use scsi_device_put() to decrement its reference count (and potentially +delete it). + +^^ struct Scsi_Host actually has 2 reference counts which are manipulated +in parallel by these functions. + + +Conventions +=========== +First, Linus Torvalds's thoughts on C coding style can be found in the +Documentation/CodingStyle file. + +Next, there is a movement to "outlaw" typedefs introducing synonyms for +struct tags. Both can be still found in the SCSI subsystem, but +the typedefs have been moved to a single file, scsi_typedefs.h to +make their future removal easier, for example: +"typedef struct scsi_host_template Scsi_Host_Template;" + +Also, most C99 enhancements are encouraged to the extent they are supported +by the relevant gcc compilers. So C99 style structure and array +initializers are encouraged where appropriate. Don't go too far, +VLAs are not properly supported yet. An exception to this is the use of +"//" style comments; /*...*/ comments are still preferred in Linux. + +Well written, tested and documented code, need not be re-formatted to +comply with the above conventions. For example, the aic7xxx driver +comes to Linux from FreeBSD and Adaptec's own labs. No doubt FreeBSD +and Adaptec have their own coding conventions. + + +Mid level supplied functions +============================ +These functions are supplied by the SCSI mid level for use by LLDs. +The names (i.e. entry points) of these functions are exported +so an LLD that is a module can access them. The kernel will +arrange for the SCSI mid level to be loaded and initialized before any LLD +is initialized. The functions below are listed alphabetically and their +names all start with "scsi_". + +Summary: + scsi_activate_tcq - turn on tag command queueing + scsi_add_device - creates new scsi device (lu) instance + scsi_add_host - perform sysfs registration and SCSI bus scan. + scsi_add_timer - (re-)start timer on a SCSI command. + scsi_adjust_queue_depth - change the queue depth on a SCSI device + scsi_assign_lock - replace default host_lock with given lock + scsi_bios_ptable - return copy of block device's partition table + scsi_block_requests - prevent further commands being queued to given host + scsi_deactivate_tcq - turn off tag command queueing + scsi_delete_timer - cancel timer on a SCSI command. + scsi_host_alloc - return a new scsi_host instance whose refcount==1 + scsi_host_get - increments Scsi_Host instance's refcount + scsi_host_put - decrements Scsi_Host instance's refcount (free if 0) + scsi_partsize - parse partition table into cylinders, heads + sectors + scsi_register - create and register a scsi host adapter instance. + scsi_remove_device - detach and remove a SCSI device + scsi_remove_host - detach and remove all SCSI devices owned by host + scsi_report_bus_reset - report scsi _bus_ reset observed + scsi_set_device - place device reference in host structure + scsi_to_pci_dma_dir - convert SCSI subsystem direction flag to PCI + scsi_to_sbus_dma_dir - convert SCSI subsystem direction flag to SBUS + scsi_track_queue_full - track successive QUEUE_FULL events + scsi_unblock_requests - allow further commands to be queued to given host + scsi_unregister - [calls scsi_host_put()] + + +Details: + +/** + * scsi_activate_tcq - turn on tag command queueing ("ordered" task attribute) + * @sdev: device to turn on TCQ for + * @depth: queue depth + * + * Returns nothing + * + * Might block: no + * + * Notes: Eventually, it is hoped depth would be the maximum depth + * the device could cope with and the real queue depth + * would be adjustable from 0 to depth. + * + * Defined (inline) in: include/scsi/scsi_tcq.h + **/ +void scsi_activate_tcq(struct scsi_device *sdev, int depth) + + +/** + * scsi_add_device - creates new scsi device (lu) instance + * @shost: pointer to scsi host instance + * @channel: channel number (rarely other than 0) + * @id: target id number + * @lun: logical unit number + * + * Returns pointer to new struct scsi_device instance or + * ERR_PTR(-ENODEV) (or some other bent pointer) if something is + * wrong (e.g. no lu responds at given address) + * + * Might block: yes + * + * Notes: This call is usually performed internally during a scsi + * bus scan when an HBA is added (i.e. scsi_add_host()). So it + * should only be called if the HBA becomes aware of a new scsi + * device (lu) after scsi_add_host() has completed. If successful + * this call we lead to slave_alloc() and slave_configure() callbacks + * into the LLD. + * + * Defined in: drivers/scsi/scsi_scan.c + **/ +struct scsi_device * scsi_add_device(struct Scsi_Host *shost, + unsigned int channel, + unsigned int id, unsigned int lun) + + +/** + * scsi_add_host - perform sysfs registration and SCSI bus scan. + * @shost: pointer to scsi host instance + * @dev: pointer to struct device of type scsi class + * + * Returns 0 on success, negative errno of failure (e.g. -ENOMEM) + * + * Might block: no + * + * Notes: Only required in "hotplug initialization model" after a + * successful call to scsi_host_alloc(). + * + * Defined in: drivers/scsi/hosts.c + **/ +int scsi_add_host(struct Scsi_Host *shost, struct device * dev) + + +/** + * scsi_add_timer - (re-)start timer on a SCSI command. + * @scmd: pointer to scsi command instance + * @timeout: duration of timeout in "jiffies" + * @complete: pointer to function to call if timeout expires + * + * Returns nothing + * + * Might block: no + * + * Notes: Each scsi command has its own timer, and as it is added + * to the queue, we set up the timer. When the command completes, + * we cancel the timer. An LLD can use this function to change + * the existing timeout value. + * + * Defined in: drivers/scsi/scsi_error.c + **/ +void scsi_add_timer(struct scsi_cmnd *scmd, int timeout, + void (*complete)(struct scsi_cmnd *)) + + +/** + * scsi_adjust_queue_depth - allow LLD to change queue depth on a SCSI device + * @sdev: pointer to SCSI device to change queue depth on + * @tagged: 0 - no tagged queuing + * MSG_SIMPLE_TAG - simple tagged queuing + * MSG_ORDERED_TAG - ordered tagged queuing + * @tags Number of tags allowed if tagged queuing enabled, + * or number of commands the LLD can queue up + * in non-tagged mode (as per cmd_per_lun). + * + * Returns nothing + * + * Might block: no + * + * Notes: Can be invoked any time on a SCSI device controlled by this + * LLD. [Specifically during and after slave_configure() and prior to + * slave_destroy().] Can safely be invoked from interrupt code. Actual + * queue depth change may be delayed until the next command is being + * processed. See also scsi_activate_tcq() and scsi_deactivate_tcq(). + * + * Defined in: drivers/scsi/scsi.c [see source code for more notes] + * + **/ +void scsi_adjust_queue_depth(struct scsi_device * sdev, int tagged, + int tags) + + +/** + * scsi_assign_lock - replace default host_lock with given lock + * @shost: a pointer to a scsi host instance + * @lock: pointer to lock to replace host_lock for this host + * + * Returns nothing + * + * Might block: no + * + * Defined in: include/scsi/scsi_host.h . + **/ +void scsi_assign_lock(struct Scsi_Host *shost, spinlock_t *lock) + + +/** + * scsi_bios_ptable - return copy of block device's partition table + * @dev: pointer to block device + * + * Returns pointer to partition table, or NULL for failure + * + * Might block: yes + * + * Notes: Caller owns memory returned (free with kfree() ) + * + * Defined in: drivers/scsi/scsicam.c + **/ +unsigned char *scsi_bios_ptable(struct block_device *dev) + + +/** + * scsi_block_requests - prevent further commands being queued to given host + * + * @shost: pointer to host to block commands on + * + * Returns nothing + * + * Might block: no + * + * Notes: There is no timer nor any other means by which the requests + * get unblocked other than the LLD calling scsi_unblock_requests(). + * + * Defined in: drivers/scsi/scsi_lib.c +**/ +void scsi_block_requests(struct Scsi_Host * shost) + + +/** + * scsi_deactivate_tcq - turn off tag command queueing + * @sdev: device to turn off TCQ for + * @depth: queue depth (stored in sdev) + * + * Returns nothing + * + * Might block: no + * + * Defined (inline) in: include/scsi/scsi_tcq.h + **/ +void scsi_deactivate_tcq(struct scsi_device *sdev, int depth) + + +/** + * scsi_delete_timer - cancel timer on a SCSI command. + * @scmd: pointer to scsi command instance + * + * Returns 1 if able to cancel timer else 0 (i.e. too late or already + * cancelled). + * + * Might block: no [may in the future if it invokes del_timer_sync()] + * + * Notes: All commands issued by upper levels already have a timeout + * associated with them. An LLD can use this function to cancel the + * timer. + * + * Defined in: drivers/scsi/scsi_error.c + **/ +int scsi_delete_timer(struct scsi_cmnd *scmd) + + +/** + * scsi_host_alloc - create a scsi host adapter instance and perform basic + * initialization. + * @sht: pointer to scsi host template + * @privsize: extra bytes to allocate in hostdata array (which is the + * last member of the returned Scsi_Host instance) + * + * Returns pointer to new Scsi_Host instance or NULL on failure + * + * Might block: yes + * + * Notes: When this call returns to the LLD, the SCSI bus scan on + * this host has _not_ yet been done. + * The hostdata array (by default zero length) is a per host scratch + * area for the LLD's exclusive use. + * Both associated refcounting objects have their refcount set to 1. + * Full registration (in sysfs) and a bus scan are performed later when + * scsi_add_host() is called. + * + * Defined in: drivers/scsi/hosts.c . + **/ +struct Scsi_Host * scsi_host_alloc(struct scsi_host_template * sht, + int privsize) + + +/** + * scsi_host_get - increment Scsi_Host instance refcount + * @shost: pointer to struct Scsi_Host instance + * + * Returns nothing + * + * Might block: currently may block but may be changed to not block + * + * Notes: Actually increments the counts in two sub-objects + * + * Defined in: drivers/scsi/hosts.c + **/ +void scsi_host_get(struct Scsi_Host *shost) + + +/** + * scsi_host_put - decrement Scsi_Host instance refcount, free if 0 + * @shost: pointer to struct Scsi_Host instance + * + * Returns nothing + * + * Might block: currently may block but may be changed to not block + * + * Notes: Actually decrements the counts in two sub-objects. If the + * latter refcount reaches 0, the Scsi_Host instance is freed. + * The LLD need not worry exactly when the Scsi_Host instance is + * freed, it just shouldn't access the instance after it has balanced + * out its refcount usage. + * + * Defined in: drivers/scsi/hosts.c + **/ +void scsi_host_put(struct Scsi_Host *shost) + + +/** + * scsi_partsize - parse partition table into cylinders, heads + sectors + * @buf: pointer to partition table + * @capacity: size of (total) disk in 512 byte sectors + * @cyls: outputs number of cylinders calculated via this pointer + * @hds: outputs number of heads calculated via this pointer + * @secs: outputs number of sectors calculated via this pointer + * + * Returns 0 on success, -1 on failure + * + * Might block: no + * + * Notes: Caller owns memory returned (free with kfree() ) + * + * Defined in: drivers/scsi/scsicam.c + **/ +int scsi_partsize(unsigned char *buf, unsigned long capacity, + unsigned int *cyls, unsigned int *hds, unsigned int *secs) + + +/** + * scsi_register - create and register a scsi host adapter instance. + * @sht: pointer to scsi host template + * @privsize: extra bytes to allocate in hostdata array (which is the + * last member of the returned Scsi_Host instance) + * + * Returns pointer to new Scsi_Host instance or NULL on failure + * + * Might block: yes + * + * Notes: When this call returns to the LLD, the SCSI bus scan on + * this host has _not_ yet been done. + * The hostdata array (by default zero length) is a per host scratch + * area for the LLD. + * + * Defined in: drivers/scsi/hosts.c . + **/ +struct Scsi_Host * scsi_register(struct scsi_host_template * sht, + int privsize) + + +/** + * scsi_remove_device - detach and remove a SCSI device + * @sdev: a pointer to a scsi device instance + * + * Returns value: 0 on success, -EINVAL if device not attached + * + * Might block: yes + * + * Notes: If an LLD becomes aware that a scsi device (lu) has + * been removed but its host is still present then it can request + * the removal of that scsi device. If successful this call will + * lead to the slave_destroy() callback being invoked. sdev is an + * invalid pointer after this call. + * + * Defined in: drivers/scsi/scsi_sysfs.c . + **/ +int scsi_remove_device(struct scsi_device *sdev) + + +/** + * scsi_remove_host - detach and remove all SCSI devices owned by host + * @shost: a pointer to a scsi host instance + * + * Returns value: 0 on success, 1 on failure (e.g. LLD busy ??) + * + * Might block: yes + * + * Notes: Should only be invoked if the "hotplug initialization + * model" is being used. It should be called _prior_ to + * scsi_unregister(). + * + * Defined in: drivers/scsi/hosts.c . + **/ +int scsi_remove_host(struct Scsi_Host *shost) + + +/** + * scsi_report_bus_reset - report scsi _bus_ reset observed + * @shost: a pointer to a scsi host involved + * @channel: channel (within) host on which scsi bus reset occurred + * + * Returns nothing + * + * Might block: no + * + * Notes: This only needs to be called if the reset is one which + * originates from an unknown location. Resets originated by the + * mid level itself don't need to call this, but there should be + * no harm. The main purpose of this is to make sure that a + * CHECK_CONDITION is properly treated. + * + * Defined in: drivers/scsi/scsi_error.c . + **/ +void scsi_report_bus_reset(struct Scsi_Host * shost, int channel) + + +/** + * scsi_set_device - place device reference in host structure + * @shost: a pointer to a scsi host instance + * @pdev: pointer to device instance to assign + * + * Returns nothing + * + * Might block: no + * + * Defined in: include/scsi/scsi_host.h . + **/ +void scsi_set_device(struct Scsi_Host * shost, struct device * dev) + + +/** + * scsi_to_pci_dma_dir - convert SCSI subsystem direction flag to PCI + * @scsi_data_direction: SCSI subsystem direction flag + * + * Returns DMA_TO_DEVICE given SCSI_DATA_WRITE, + * DMA_FROM_DEVICE given SCSI_DATA_READ + * DMA_BIDIRECTIONAL given SCSI_DATA_UNKNOWN + * else returns DMA_NONE + * + * Might block: no + * + * Notes: The SCSI subsystem now uses the same values for these + * constants as the PCI subsystem so this function is a nop. + * The recommendation is not to use this conversion function anymore + * (in the 2.6 kernel series) as it is not needed. + * + * Defined in: drivers/scsi/scsi.h . + **/ +int scsi_to_pci_dma_dir(unsigned char scsi_data_direction) + + +/** + * scsi_to_sbus_dma_dir - convert SCSI subsystem direction flag to SBUS + * @scsi_data_direction: SCSI subsystem direction flag + * + * Returns DMA_TO_DEVICE given SCSI_DATA_WRITE, + * FROM_DEVICE given SCSI_DATA_READ + * DMA_BIDIRECTIONAL given SCSI_DATA_UNKNOWN + * else returns DMA_NONE + * + * Notes: The SCSI subsystem now uses the same values for these + * constants as the SBUS subsystem so this function is a nop. + * The recommendation is not to use this conversion function anymore + * (in the 2.6 kernel series) as it is not needed. + * + * Might block: no + * + * Defined in: drivers/scsi/scsi.h . + **/ +int scsi_to_sbus_dma_dir(unsigned char scsi_data_direction) + + +/** + * scsi_track_queue_full - track successive QUEUE_FULL events on given + * device to determine if and when there is a need + * to adjust the queue depth on the device. + * @sdev: pointer to SCSI device instance + * @depth: Current number of outstanding SCSI commands on this device, + * not counting the one returned as QUEUE_FULL. + * + * Returns 0 - no change needed + * >0 - adjust queue depth to this new depth + * -1 - drop back to untagged operation using host->cmd_per_lun + * as the untagged command depth + * + * Might block: no + * + * Notes: LLDs may call this at any time and we will do "The Right + * Thing"; interrupt context safe. + * + * Defined in: drivers/scsi/scsi.c . + **/ +int scsi_track_queue_full(Scsi_Device *sdev, int depth) + + +/** + * scsi_unblock_requests - allow further commands to be queued to given host + * + * @shost: pointer to host to unblock commands on + * + * Returns nothing + * + * Might block: no + * + * Defined in: drivers/scsi/scsi_lib.c . +**/ +void scsi_unblock_requests(struct Scsi_Host * shost) + + +/** + * scsi_unregister - unregister and free memory used by host instance + * @shp: pointer to scsi host instance to unregister. + * + * Returns nothing + * + * Might block: no + * + * Notes: Should not be invoked if the "hotplug initialization + * model" is being used. Called internally by exit_this_scsi_driver() + * in the "passive initialization model". Hence a LLD has no need to + * call this function directly. + * + * Defined in: drivers/scsi/hosts.c . + **/ +void scsi_unregister(struct Scsi_Host * shp) + + + + +Interface Functions +=================== +Interface functions are supplied (defined) by LLDs and their function +pointers are placed in an instance of struct scsi_host_template which +is passed to scsi_host_alloc() [or scsi_register() / init_this_scsi_driver()]. +Some are mandatory. Interface functions should be declared static. The +accepted convention is that driver "xyz" will declare its slave_configure() +function as: + static int xyz_slave_configure(struct scsi_device * sdev); +and so forth for all interface functions listed below. + +A pointer to this function should be placed in the 'slave_configure' member +of a "struct scsi_host_template" instance. A pointer to such an instance +should be passed to the mid level's scsi_host_alloc() [or scsi_register() / +init_this_scsi_driver()]. + +The interface functions are also described in the include/scsi/scsi_host.h +file immediately above their definition point in "struct scsi_host_template". +In some cases more detail is given in scsi_host.h than below. + +The interface functions are listed below in alphabetical order. + +Summary: + bios_param - fetch head, sector, cylinder info for a disk + detect - detects HBAs this driver wants to control + eh_timed_out - notify the host that a command timer expired + eh_abort_handler - abort given command + eh_bus_reset_handler - issue SCSI bus reset + eh_device_reset_handler - issue SCSI device reset + eh_host_reset_handler - reset host (host bus adapter) + eh_strategy_handler - driver supplied alternate to scsi_unjam_host() + info - supply information about given host + ioctl - driver can respond to ioctls + proc_info - supports /proc/scsi/{driver_name}/{host_no} + queuecommand - queue scsi command, invoke 'done' on completion + release - release all resources associated with given host + slave_alloc - prior to any commands being sent to a new device + slave_configure - driver fine tuning for given device after attach + slave_destroy - given device is about to be shut down + + +Details: + +/** + * bios_param - fetch head, sector, cylinder info for a disk + * @sdev: pointer to scsi device context (defined in + * include/scsi/scsi_device.h) + * @bdev: pointer to block device context (defined in fs.h) + * @capacity: device size (in 512 byte sectors) + * @params: three element array to place output: + * params[0] number of heads (max 255) + * params[1] number of sectors (max 63) + * params[2] number of cylinders + * + * Return value is ignored + * + * Locks: none + * + * Calling context: process (sd) + * + * Notes: an arbitrary geometry (based on READ CAPACITY) is used + * if this function is not provided. The params array is + * pre-initialized with made up values just in case this function + * doesn't output anything. + * + * Optionally defined in: LLD + **/ + int bios_param(struct scsi_device * sdev, struct block_device *bdev, + sector_t capacity, int params[3]) + + +/** + * detect - detects HBAs this driver wants to control + * @shtp: host template for this driver. + * + * Returns number of hosts this driver wants to control. 0 means no + * suitable hosts found. + * + * Locks: none held + * + * Calling context: process [invoked from init_this_scsi_driver()] + * + * Notes: First function called from the SCSI mid level on this + * driver. Upper level drivers (e.g. sd) may not (yet) be present. + * For each host found, this method should call scsi_register() + * [see hosts.c]. + * + * Defined in: LLD (required if "passive initialization mode" is used, + * not invoked in "hotplug initialization mode") + **/ + int detect(struct scsi_host_template * shtp) + + +/** + * eh_timed_out - The timer for the command has just fired + * @scp: identifies command timing out + * + * Returns: + * + * EH_HANDLED: I fixed the error, please complete the command + * EH_RESET_TIMER: I need more time, reset the timer and + * begin counting again + * EH_NOT_HANDLED Begin normal error recovery + * + * + * Locks: None held + * + * Calling context: interrupt + * + * Notes: This is to give the LLD an opportunity to do local recovery. + * This recovery is limited to determining if the outstanding command + * will ever complete. You may not abort and restart the command from + * this callback. + * + * Optionally defined in: LLD + **/ + int eh_timed_out(struct scsi_cmnd * scp) + + +/** + * eh_abort_handler - abort command associated with scp + * @scp: identifies command to be aborted + * + * Returns SUCCESS if command aborted else FAILED + * + * Locks: struct Scsi_Host::host_lock held (with irqsave) on entry + * and assumed to be held on return. + * + * Calling context: kernel thread + * + * Notes: Invoked from scsi_eh thread. No other commands will be + * queued on current host during eh. + * + * Optionally defined in: LLD + **/ + int eh_abort_handler(struct scsi_cmnd * scp) + + +/** + * eh_bus_reset_handler - issue SCSI bus reset + * @scp: SCSI bus that contains this device should be reset + * + * Returns SUCCESS if command aborted else FAILED + * + * Locks: struct Scsi_Host::host_lock held (with irqsave) on entry + * and assumed to be held on return. + * + * Calling context: kernel thread + * + * Notes: Invoked from scsi_eh thread. No other commands will be + * queued on current host during eh. + * + * Optionally defined in: LLD + **/ + int eh_bus_reset_handler(struct scsi_cmnd * scp) + + +/** + * eh_device_reset_handler - issue SCSI device reset + * @scp: identifies SCSI device to be reset + * + * Returns SUCCESS if command aborted else FAILED + * + * Locks: struct Scsi_Host::host_lock held (with irqsave) on entry + * and assumed to be held on return. + * + * Calling context: kernel thread + * + * Notes: Invoked from scsi_eh thread. No other commands will be + * queued on current host during eh. + * + * Optionally defined in: LLD + **/ + int eh_device_reset_handler(struct scsi_cmnd * scp) + + +/** + * eh_host_reset_handler - reset host (host bus adapter) + * @scp: SCSI host that contains this device should be reset + * + * Returns SUCCESS if command aborted else FAILED + * + * Locks: struct Scsi_Host::host_lock held (with irqsave) on entry + * and assumed to be held on return. + * + * Calling context: kernel thread + * + * Notes: Invoked from scsi_eh thread. No other commands will be + * queued on current host during eh. + * With the default eh_strategy in place, if none of the _abort_, + * _device_reset_, _bus_reset_ or this eh handler function are + * defined (or they all return FAILED) then the device in question + * will be set offline whenever eh is invoked. + * + * Optionally defined in: LLD + **/ + int eh_host_reset_handler(struct scsi_cmnd * scp) + + +/** + * eh_strategy_handler - driver supplied alternate to scsi_unjam_host() + * @shp: host on which error has occurred + * + * Returns TRUE if host unjammed, else FALSE. + * + * Locks: none + * + * Calling context: kernel thread + * + * Notes: Invoked from scsi_eh thread. LLD supplied alternate to + * scsi_unjam_host() found in scsi_error.c + * + * Optionally defined in: LLD + **/ + int eh_strategy_handler(struct Scsi_Host * shp) + + +/** + * info - supply information about given host: driver name plus data + * to distinguish given host + * @shp: host to supply information about + * + * Return ASCII null terminated string. [This driver is assumed to + * manage the memory pointed to and maintain it, typically for the + * lifetime of this host.] + * + * Locks: none + * + * Calling context: process + * + * Notes: Often supplies PCI or ISA information such as IO addresses + * and interrupt numbers. If not supplied struct Scsi_Host::name used + * instead. It is assumed the returned information fits on one line + * (i.e. does not included embedded newlines). + * The SCSI_IOCTL_PROBE_HOST ioctl yields the string returned by this + * function (or struct Scsi_Host::name if this function is not + * available). + * In a similar manner, init_this_scsi_driver() outputs to the console + * each host's "info" (or name) for the driver it is registering. + * Also if proc_info() is not supplied, the output of this function + * is used instead. + * + * Optionally defined in: LLD + **/ + const char * info(struct Scsi_Host * shp) + + +/** + * ioctl - driver can respond to ioctls + * @sdp: device that ioctl was issued for + * @cmd: ioctl number + * @arg: pointer to read or write data from. Since it points to + * user space, should use appropriate kernel functions + * (e.g. copy_from_user() ). In the Unix style this argument + * can also be viewed as an unsigned long. + * + * Returns negative "errno" value when there is a problem. 0 or a + * positive value indicates success and is returned to the user space. + * + * Locks: none + * + * Calling context: process + * + * Notes: The SCSI subsystem uses a "trickle down" ioctl model. + * The user issues an ioctl() against an upper level driver + * (e.g. /dev/sdc) and if the upper level driver doesn't recognize + * the 'cmd' then it is passed to the SCSI mid level. If the SCSI + * mid level does not recognize it, then the LLD that controls + * the device receives the ioctl. According to recent Unix standards + * unsupported ioctl() 'cmd' numbers should return -ENOTTY. + * + * Optionally defined in: LLD + **/ + int ioctl(struct scsi_device *sdp, int cmd, void *arg) + + +/** + * proc_info - supports /proc/scsi/{driver_name}/{host_no} + * @buffer: anchor point to output to (0==writeto1_read0) or fetch from + * (1==writeto1_read0). + * @start: where "interesting" data is written to. Ignored when + * 1==writeto1_read0. + * @offset: offset within buffer 0==writeto1_read0 is actually + * interested in. Ignored when 1==writeto1_read0 . + * @length: maximum (or actual) extent of buffer + * @host_no: host number of interest (struct Scsi_Host::host_no) + * @writeto1_read0: 1 -> data coming from user space towards driver + * (e.g. "echo some_string > /proc/scsi/xyz/2") + * 0 -> user what data from this driver + * (e.g. "cat /proc/scsi/xyz/2") + * + * Returns length when 1==writeto1_read0. Otherwise number of chars + * output to buffer past offset. + * + * Locks: none held + * + * Calling context: process + * + * Notes: Driven from scsi_proc.c which interfaces to proc_fs. proc_fs + * support can now be configured out of the scsi subsystem. + * + * Optionally defined in: LLD + **/ + int proc_info(char * buffer, char ** start, off_t offset, + int length, int host_no, int writeto1_read0) + + +/** + * queuecommand - queue scsi command, invoke 'done' on completion + * @scp: pointer to scsi command object + * @done: function pointer to be invoked on completion + * + * Returns 0 on success. + * + * If there's a failure, return either: + * + * SCSI_MLQUEUE_DEVICE_BUSY if the device queue is full, or + * SCSI_MLQUEUE_HOST_BUSY if the entire host queue is full + * + * On both of these returns, the mid-layer will requeue the I/O + * + * - if the return is SCSI_MLQUEUE_DEVICE_BUSY, only that particular + * device will be paused, and it will be unpaused when a command to + * the device returns (or after a brief delay if there are no more + * outstanding commands to it). Commands to other devices continue + * to be processed normally. + * + * - if the return is SCSI_MLQUEUE_HOST_BUSY, all I/O to the host + * is paused and will be unpaused when any command returns from + * the host (or after a brief delay if there are no outstanding + * commands to the host). + * + * For compatibility with earlier versions of queuecommand, any + * other return value is treated the same as + * SCSI_MLQUEUE_HOST_BUSY. + * + * Other types of errors that are detected immediately may be + * flagged by setting scp->result to an appropriate value, + * invoking the 'done' callback, and then returning 0 from this + * function. If the command is not performed immediately (and the + * LLD is starting (or will start) the given command) then this + * function should place 0 in scp->result and return 0. + * + * Command ownership. If the driver returns zero, it owns the + * command and must take responsibility for ensuring the 'done' + * callback is executed. Note: the driver may call done before + * returning zero, but after it has called done, it may not + * return any value other than zero. If the driver makes a + * non-zero return, it must not execute the command's done + * callback at any time. + * + * Locks: struct Scsi_Host::host_lock held on entry (with "irqsave") + * and is expected to be held on return. + * + * Calling context: in interrupt (soft irq) or process context + * + * Notes: This function should be relatively fast. Normally it will + * not wait for IO to complete. Hence the 'done' callback is invoked + * (often directly from an interrupt service routine) some time after + * this function has returned. In some cases (e.g. pseudo adapter + * drivers that manufacture the response to a SCSI INQUIRY) + * the 'done' callback may be invoked before this function returns. + * If the 'done' callback is not invoked within a certain period + * the SCSI mid level will commence error processing. + * If a status of CHECK CONDITION is placed in "result" when the + * 'done' callback is invoked, then the LLD driver should + * perform autosense and fill in the struct scsi_cmnd::sense_buffer + * array. The scsi_cmnd::sense_buffer array is zeroed prior to + * the mid level queuing a command to an LLD. + * + * Defined in: LLD + **/ + int queuecommand(struct scsi_cmnd * scp, + void (*done)(struct scsi_cmnd *)) + + +/** + * release - release all resources associated with given host + * @shp: host to be released. + * + * Return value ignored (could soon be a function returning void). + * + * Locks: none held + * + * Calling context: process + * + * Notes: Invoked from scsi_module.c's exit_this_scsi_driver(). + * LLD's implementation of this function should call + * scsi_unregister(shp) prior to returning. + * Only needed for old-style host templates. + * + * Defined in: LLD (required in "passive initialization model", + * should not be defined in hotplug model) + **/ + int release(struct Scsi_Host * shp) + + +/** + * slave_alloc - prior to any commands being sent to a new device + * (i.e. just prior to scan) this call is made + * @sdp: pointer to new device (about to be scanned) + * + * Returns 0 if ok. Any other return is assumed to be an error and + * the device is ignored. + * + * Locks: none + * + * Calling context: process + * + * Notes: Allows the driver to allocate any resources for a device + * prior to its initial scan. The corresponding scsi device may not + * exist but the mid level is just about to scan for it (i.e. send + * and INQUIRY command plus ...). If a device is found then + * slave_configure() will be called while if a device is not found + * slave_destroy() is called. + * For more details see the include/scsi/scsi_host.h file. + * + * Optionally defined in: LLD + **/ + int slave_alloc(struct scsi_device *sdp) + + +/** + * slave_configure - driver fine tuning for given device just after it + * has been first scanned (i.e. it responded to an + * INQUIRY) + * @sdp: device that has just been attached + * + * Returns 0 if ok. Any other return is assumed to be an error and + * the device is taken offline. [offline devices will _not_ have + * slave_destroy() called on them so clean up resources.] + * + * Locks: none + * + * Calling context: process + * + * Notes: Allows the driver to inspect the response to the initial + * INQUIRY done by the scanning code and take appropriate action. + * For more details see the include/scsi/scsi_host.h file. + * + * Optionally defined in: LLD + **/ + int slave_configure(struct scsi_device *sdp) + + +/** + * slave_destroy - given device is about to be shut down. All + * activity has ceased on this device. + * @sdp: device that is about to be shut down + * + * Returns nothing + * + * Locks: none + * + * Calling context: process + * + * Notes: Mid level structures for given device are still in place + * but are about to be torn down. Any per device resources allocated + * by this driver for given device should be freed now. No further + * commands will be sent for this sdp instance. [However the device + * could be re-attached in the future in which case a new instance + * of struct scsi_device would be supplied by future slave_alloc() + * and slave_configure() calls.] + * + * Optionally defined in: LLD + **/ + void slave_destroy(struct scsi_device *sdp) + + + +Data Structures +=============== +struct scsi_host_template +------------------------- +There is one "struct scsi_host_template" instance per LLD ***. It is +typically initialized as a file scope static in a driver's header file. That +way members that are not explicitly initialized will be set to 0 or NULL. +Member of interest: + name - name of driver (may contain spaces, please limit to + less than 80 characters) + proc_name - name used in "/proc/scsi/<proc_name>/<host_no>" and + by sysfs in one of its "drivers" directories. Hence + "proc_name" should only contain characters acceptable + to a Unix file name. + (*queuecommand)() - primary callback that the mid level uses to inject + SCSI commands into an LLD. +The structure is defined and commented in include/scsi/scsi_host.h + +*** In extreme situations a single driver may have several instances + if it controls several different classes of hardware (e.g. an LLD + that handles both ISA and PCI cards and has a separate instance of + struct scsi_host_template for each class). + +struct Scsi_Host +---------------- +There is one struct Scsi_Host instance per host (HBA) that an LLD +controls. The struct Scsi_Host structure has many members in common +with "struct scsi_host_template". When a new struct Scsi_Host instance +is created (in scsi_host_alloc() in hosts.c) those common members are +initialized from the driver's struct scsi_host_template instance. Members +of interest: + host_no - system wide unique number that is used for identifying + this host. Issued in ascending order from 0. + can_queue - must be greater than 0; do not send more than can_queue + commands to the adapter. + this_id - scsi id of host (scsi initiator) or -1 if not known + sg_tablesize - maximum scatter gather elements allowed by host. + 0 implies scatter gather not supported by host + max_sectors - maximum number of sectors (usually 512 bytes) allowed + in a single SCSI command. The default value of 0 leads + to a setting of SCSI_DEFAULT_MAX_SECTORS (defined in + scsi_host.h) which is currently set to 1024. So for a + disk the maximum transfer size is 512 KB when max_sectors + is not defined. Note that this size may not be sufficient + for disk firmware uploads. + cmd_per_lun - maximum number of commands that can be queued on devices + controlled by the host. Overridden by LLD calls to + scsi_adjust_queue_depth(). + unchecked_isa_dma - 1=>only use bottom 16 MB of ram (ISA DMA addressing + restriction), 0=>can use full 32 bit (or better) DMA + address space + use_clustering - 1=>SCSI commands in mid level's queue can be merged, + 0=>disallow SCSI command merging + hostt - pointer to driver's struct scsi_host_template from which + this struct Scsi_Host instance was spawned + hostt->proc_name - name of LLD. This is the driver name that sysfs uses + transportt - pointer to driver's struct scsi_transport_template instance + (if any). FC and SPI transports currently supported. + sh_list - a double linked list of pointers to all struct Scsi_Host + instances (currently ordered by ascending host_no) + my_devices - a double linked list of pointers to struct scsi_device + instances that belong to this host. + hostdata[0] - area reserved for LLD at end of struct Scsi_Host. Size + is set by the second argument (named 'xtr_bytes') to + scsi_host_alloc() or scsi_register(). + +The scsi_host structure is defined in include/scsi/scsi_host.h + +struct scsi_device +------------------ +Generally, there is one instance of this structure for each SCSI logical unit +on a host. Scsi devices connected to a host are uniquely identified by a +channel number, target id and logical unit number (lun). +The structure is defined in include/scsi/scsi_device.h + +struct scsi_cmnd +---------------- +Instances of this structure convey SCSI commands to the LLD and responses +back to the mid level. The SCSI mid level will ensure that no more SCSI +commands become queued against the LLD than are indicated by +scsi_adjust_queue_depth() (or struct Scsi_Host::cmd_per_lun). There will +be at least one instance of struct scsi_cmnd available for each SCSI device. +Members of interest: + cmnd - array containing SCSI command + cmnd_len - length (in bytes) of SCSI command + sc_data_direction - direction of data transfer in data phase. See + "enum dma_data_direction" in include/linux/dma-mapping.h + request_bufflen - number of data bytes to transfer (0 if no data phase) + use_sg - ==0 -> no scatter gather list, hence transfer data + to/from request_buffer + - >0 -> scatter gather list (actually an array) in + request_buffer with use_sg elements + request_buffer - either contains data buffer or scatter gather list + depending on the setting of use_sg. Scatter gather + elements are defined by 'struct scatterlist' found + in include/asm/scatterlist.h . + done - function pointer that should be invoked by LLD when the + SCSI command is completed (successfully or otherwise). + Should only be called by an LLD if the LLD has accepted + the command (i.e. queuecommand() returned or will return + 0). The LLD may invoke 'done' prior to queuecommand() + finishing. + result - should be set by LLD prior to calling 'done'. A value + of 0 implies a successfully completed command (and all + data (if any) has been transferred to or from the SCSI + target device). 'result' is a 32 bit unsigned integer that + can be viewed as 4 related bytes. The SCSI status value is + in the LSB. See include/scsi/scsi.h status_byte(), + msg_byte(), host_byte() and driver_byte() macros and + related constants. + sense_buffer - an array (maximum size: SCSI_SENSE_BUFFERSIZE bytes) that + should be written when the SCSI status (LSB of 'result') + is set to CHECK_CONDITION (2). When CHECK_CONDITION is + set, if the top nibble of sense_buffer[0] has the value 7 + then the mid level will assume the sense_buffer array + contains a valid SCSI sense buffer; otherwise the mid + level will issue a REQUEST_SENSE SCSI command to + retrieve the sense buffer. The latter strategy is error + prone in the presence of command queuing so the LLD should + always "auto-sense". + device - pointer to scsi_device object that this command is + associated with. + resid - an LLD should set this signed integer to the requested + transfer length (i.e. 'request_bufflen') less the number + of bytes that are actually transferred. 'resid' is + preset to 0 so an LLD can ignore it if it cannot detect + underruns (overruns should be rare). If possible an LLD + should set 'resid' prior to invoking 'done'. The most + interesting case is data transfers from a SCSI target + device device (i.e. READs) that underrun. + underflow - LLD should place (DID_ERROR << 16) in 'result' if + actual number of bytes transferred is less than this + figure. Not many LLDs implement this check and some that + do just output an error message to the log rather than + report a DID_ERROR. Better for an LLD to implement + 'resid'. + +The scsi_cmnd structure is defined in include/scsi/scsi_cmnd.h + + +Locks +===== +Each struct Scsi_Host instance has a spin_lock called struct +Scsi_Host::default_lock which is initialized in scsi_host_alloc() [found in +hosts.c]. Within the same function the struct Scsi_Host::host_lock pointer +is initialized to point at default_lock with the scsi_assign_lock() function. +Thereafter lock and unlock operations performed by the mid level use the +struct Scsi_Host::host_lock pointer. + +LLDs can override the use of struct Scsi_Host::default_lock by +using scsi_assign_lock(). The earliest opportunity to do this would +be in the detect() function after it has invoked scsi_register(). It +could be replaced by a coarser grain lock (e.g. per driver) or a +lock of equal granularity (i.e. per host). Using finer grain locks +(e.g. per SCSI device) may be possible by juggling locks in +queuecommand(). + +Autosense +========= +Autosense (or auto-sense) is defined in the SAM-2 document as "the +automatic return of sense data to the application client coincident +with the completion of a SCSI command" when a status of CHECK CONDITION +occurs. LLDs should perform autosense. This should be done when the LLD +detects a CHECK CONDITION status by either: + a) instructing the SCSI protocol (e.g. SCSI Parallel Interface (SPI)) + to perform an extra data in phase on such responses + b) or, the LLD issuing a REQUEST SENSE command itself + +Either way, when a status of CHECK CONDITION is detected, the mid level +decides whether the LLD has performed autosense by checking struct +scsi_cmnd::sense_buffer[0] . If this byte has an upper nibble of 7 (or 0xf) +then autosense is assumed to have taken place. If it has another value (and +this byte is initialized to 0 before each command) then the mid level will +issue a REQUEST SENSE command. + +In the presence of queued commands the "nexus" that maintains sense +buffer data from the command that failed until a following REQUEST SENSE +may get out of synchronization. This is why it is best for the LLD +to perform autosense. + + +Changes since lk 2.4 series +=========================== +io_request_lock has been replaced by several finer grained locks. The lock +relevant to LLDs is struct Scsi_Host::host_lock and there is +one per SCSI host. + +The older error handling mechanism has been removed. This means the +LLD interface functions abort() and reset() have been removed. +The struct scsi_host_template::use_new_eh_code flag has been removed. + +In the 2.4 series the SCSI subsystem configuration descriptions were +aggregated with the configuration descriptions from all other Linux +subsystems in the Documentation/Configure.help file. In the 2.6 series, +the SCSI subsystem now has its own (much smaller) drivers/scsi/Kconfig +file that contains both configuration and help information. + +struct SHT has been renamed to struct scsi_host_template. + +Addition of the "hotplug initialization model" and many extra functions +to support it. + + +Credits +======= +The following people have contributed to this document: + Mike Anderson <andmike at us dot ibm dot com> + James Bottomley <James dot Bottomley at steeleye dot com> + Patrick Mansfield <patmans at us dot ibm dot com> + Christoph Hellwig <hch at infradead dot org> + Doug Ledford <dledford at redhat dot com> + Andries Brouwer <Andries dot Brouwer at cwi dot nl> + Randy Dunlap <rddunlap at osdl dot org> + Alan Stern <stern at rowland dot harvard dot edu> + + +Douglas Gilbert +dgilbert at interlog dot com +21st September 2004 diff --git a/Documentation/scsi/st.txt b/Documentation/scsi/st.txt new file mode 100644 index 000000000000..20e30cf31877 --- /dev/null +++ b/Documentation/scsi/st.txt @@ -0,0 +1,499 @@ +This file contains brief information about the SCSI tape driver. +The driver is currently maintained by Kai Mäkisara (email +Kai.Makisara@kolumbus.fi) + +Last modified: Mon Mar 7 21:14:44 2005 by kai.makisara + + +BASICS + +The driver is generic, i.e., it does not contain any code tailored +to any specific tape drive. The tape parameters can be specified with +one of the following three methods: + +1. Each user can specify the tape parameters he/she wants to use +directly with ioctls. This is administratively a very simple and +flexible method and applicable to single-user workstations. However, +in a multiuser environment the next user finds the tape parameters in +state the previous user left them. + +2. The system manager (root) can define default values for some tape +parameters, like block size and density using the MTSETDRVBUFFER ioctl. +These parameters can be programmed to come into effect either when a +new tape is loaded into the drive or if writing begins at the +beginning of the tape. The second method is applicable if the tape +drive performs auto-detection of the tape format well (like some +QIC-drives). The result is that any tape can be read, writing can be +continued using existing format, and the default format is used if +the tape is rewritten from the beginning (or a new tape is written +for the first time). The first method is applicable if the drive +does not perform auto-detection well enough and there is a single +"sensible" mode for the device. An example is a DAT drive that is +used only in variable block mode (I don't know if this is sensible +or not :-). + +The user can override the parameters defined by the system +manager. The changes persist until the defaults again come into +effect. + +3. By default, up to four modes can be defined and selected using the minor +number (bits 5 and 6). The number of modes can be changed by changing +ST_NBR_MODE_BITS in st.h. Mode 0 corresponds to the defaults discussed +above. Additional modes are dormant until they are defined by the +system manager (root). When specification of a new mode is started, +the configuration of mode 0 is used to provide a starting point for +definition of the new mode. + +Using the modes allows the system manager to give the users choices +over some of the buffering parameters not directly accessible to the +users (buffered and asynchronous writes). The modes also allow choices +between formats in multi-tape operations (the explicitly overridden +parameters are reset when a new tape is loaded). + +If more than one mode is used, all modes should contain definitions +for the same set of parameters. + +Many Unices contain internal tables that associate different modes to +supported devices. The Linux SCSI tape driver does not contain such +tables (and will not do that in future). Instead of that, a utility +program can be made that fetches the inquiry data sent by the device, +scans its database, and sets up the modes using the ioctls. Another +alternative is to make a small script that uses mt to set the defaults +tailored to the system. + +The driver supports fixed and variable block size (within buffer +limits). Both the auto-rewind (minor equals device number) and +non-rewind devices (minor is 128 + device number) are implemented. + +In variable block mode, the byte count in write() determines the size +of the physical block on tape. When reading, the drive reads the next +tape block and returns to the user the data if the read() byte count +is at least the block size. Otherwise, error ENOMEM is returned. + +In fixed block mode, the data transfer between the drive and the +driver is in multiples of the block size. The write() byte count must +be a multiple of the block size. This is not required when reading but +may be advisable for portability. + +Support is provided for changing the tape partition and partitioning +of the tape with one or two partitions. By default support for +partitioned tape is disabled for each driver and it can be enabled +with the ioctl MTSETDRVBUFFER. + +By default the driver writes one filemark when the device is closed after +writing and the last operation has been a write. Two filemarks can be +optionally written. In both cases end of data is signified by +returning zero bytes for two consecutive reads. + +If rewind, offline, bsf, or seek is done and previous tape operation was +write, a filemark is written before moving tape. + +The compile options are defined in the file linux/drivers/scsi/st_options.h. + +4. If the open option O_NONBLOCK is used, open succeeds even if the +drive is not ready. If O_NONBLOCK is not used, the driver waits for +the drive to become ready. If this does not happen in ST_BLOCK_SECONDS +seconds, open fails with the errno value EIO. With O_NONBLOCK the +device can be opened for writing even if there is a write protected +tape in the drive (commands trying to write something return error if +attempted). + + +MINOR NUMBERS + +The tape driver currently supports 128 drives by default. This number +can be increased by editing st.h and recompiling the driver if +necessary. The upper limit is 2^17 drives if 4 modes for each drive +are used. + +The minor numbers consist of the following bit fields: + +dev_upper non-rew mode dev-lower + 20 - 8 7 6 5 4 0 +The non-rewind bit is always bit 7 (the uppermost bit in the lowermost +byte). The bits defining the mode are below the non-rewind bit. The +remaining bits define the tape device number. This numbering is +backward compatible with the numbering used when the minor number was +only 8 bits wide. + + +SYSFS SUPPORT + +The driver creates the directory /sys/class/scsi_tape and populates it with +directories corresponding to the existing tape devices. There are autorewind +and non-rewind entries for each mode. The names are stxy and nstxy, where x +is the tape number and y a character corresponding to the mode (none, l, m, +a). For example, the directories for the first tape device are (assuming four +modes): st0 nst0 st0l nst0l st0m nst0m st0a nst0a. + +Each directory contains the entries: default_blksize default_compression +default_density defined dev device driver. The file 'defined' contains 1 +if the mode is defined and zero if not defined. The files 'default_*' contain +the defaults set by the user. The value -1 means the default is not set. The +file 'dev' contains the device numbers corresponding to this device. The links +'device' and 'driver' point to the SCSI device and driver entries. + +A link named 'tape' is made from the SCSI device directory to the class +directory corresponding to the mode 0 auto-rewind device (e.g., st0). + + +BSD AND SYS V SEMANTICS + +The user can choose between these two behaviours of the tape driver by +defining the value of the symbol ST_SYSV. The semantics differ when a +file being read is closed. The BSD semantics leaves the tape where it +currently is whereas the SYS V semantics moves the tape past the next +filemark unless the filemark has just been crossed. + +The default is BSD semantics. + + +BUFFERING + +The driver tries to do transfers directly to/from user space. If this +is not possible, a driver buffer allocated at run-time is used. If +direct i/o is not possible for the whole transfer, the driver buffer +is used (i.e., bounce buffers for individual pages are not +used). Direct i/o can be impossible because of several reasons, e.g.: +- one or more pages are at addresses not reachable by the HBA +- the number of pages in the transfer exceeds the number of + scatter/gather segments permitted by the HBA +- one or more pages can't be locked into memory (should not happen in + any reasonable situation) + +The size of the driver buffers is always at least one tape block. In fixed +block mode, the minimum buffer size is defined (in 1024 byte units) by +ST_FIXED_BUFFER_BLOCKS. With small block size this allows buffering of +several blocks and using one SCSI read or write to transfer all of the +blocks. Buffering of data across write calls in fixed block mode is +allowed if ST_BUFFER_WRITES is non-zero and direct i/o is not used. +Buffer allocation uses chunks of memory having sizes 2^n * (page +size). Because of this the actual buffer size may be larger than the +minimum allowable buffer size. + +NOTE that if direct i/o is used, the small writes are not buffered. This may +cause a surprise when moving from 2.4. There small writes (e.g., tar without +-b option) may have had good throughput but this is not true any more with +2.6. Direct i/o can be turned off to solve this problem but a better solution +is to use bigger write() byte counts (e.g., tar -b 64). + +Asynchronous writing. Writing the buffer contents to the tape is +started and the write call returns immediately. The status is checked +at the next tape operation. Asynchronous writes are not done with +direct i/o and not in fixed block mode. + +Buffered writes and asynchronous writes may in some rare cases cause +problems in multivolume operations if there is not enough space on the +tape after the early-warning mark to flush the driver buffer. + +Read ahead for fixed block mode (ST_READ_AHEAD). Filling the buffer is +attempted even if the user does not want to get all of the data at +this read command. Should be disabled for those drives that don't like +a filemark to truncate a read request or that don't like backspacing. + +Scatter/gather buffers (buffers that consist of chunks non-contiguous +in the physical memory) are used if contiguous buffers can't be +allocated. To support all SCSI adapters (including those not +supporting scatter/gather), buffer allocation is using the following +three kinds of chunks: +1. The initial segment that is used for all SCSI adapters including +those not supporting scatter/gather. The size of this buffer will be +(PAGE_SIZE << ST_FIRST_ORDER) bytes if the system can give a chunk of +this size (and it is not larger than the buffer size specified by +ST_BUFFER_BLOCKS). If this size is not available, the driver halves +the size and tries again until the size of one page. The default +settings in st_options.h make the driver to try to allocate all of the +buffer as one chunk. +2. The scatter/gather segments to fill the specified buffer size are +allocated so that as many segments as possible are used but the number +of segments does not exceed ST_FIRST_SG. +3. The remaining segments between ST_MAX_SG (or the module parameter +max_sg_segs) and the number of segments used in phases 1 and 2 +are used to extend the buffer at run-time if this is necessary. The +number of scatter/gather segments allowed for the SCSI adapter is not +exceeded if it is smaller than the maximum number of scatter/gather +segments specified. If the maximum number allowed for the SCSI adapter +is smaller than the number of segments used in phases 1 and 2, +extending the buffer will always fail. + + +EOM BEHAVIOUR WHEN WRITING + +When the end of medium early warning is encountered, the current write +is finished and the number of bytes is returned. The next write +returns -1 and errno is set to ENOSPC. To enable writing a trailer, +the next write is allowed to proceed and, if successful, the number of +bytes is returned. After this, -1 and the number of bytes are +alternately returned until the physical end of medium (or some other +error) is encountered. + + +MODULE PARAMETERS + +The buffer size, write threshold, and the maximum number of allocated buffers +are configurable when the driver is loaded as a module. The keywords are: + +buffer_kbs=xxx the buffer size for fixed block mode is set + to xxx kilobytes +write_threshold_kbs=xxx the write threshold in kilobytes set to xxx +max_sg_segs=xxx the maximum number of scatter/gather + segments +try_direct_io=x try direct transfer between user buffer and + tape drive if this is non-zero + +Note that if the buffer size is changed but the write threshold is not +set, the write threshold is set to the new buffer size - 2 kB. + + +BOOT TIME CONFIGURATION + +If the driver is compiled into the kernel, the same parameters can be +also set using, e.g., the LILO command line. The preferred syntax is +is to use the same keyword used when loading as module but prepended +with 'st.'. For instance, to set the maximum number of scatter/gather +segments, the parameter 'st.max_sg_segs=xx' should be used (xx is the +number of scatter/gather segments). + +For compatibility, the old syntax from early 2.5 and 2.4 kernel +versions is supported. The same keywords can be used as when loading +the driver as module. If several parameters are set, the keyword-value +pairs are separated with a comma (no spaces allowed). A colon can be +used instead of the equal mark. The definition is prepended by the +string st=. Here is an example: + + st=buffer_kbs:64,write_threhold_kbs:60 + +The following syntax used by the old kernel versions is also supported: + + st=aa[,bb[,dd]] + +where + aa is the buffer size for fixed block mode in 1024 byte units + bb is the write threshold in 1024 byte units + dd is the maximum number of scatter/gather segments + + +IOCTLS + +The tape is positioned and the drive parameters are set with ioctls +defined in mtio.h The tape control program 'mt' uses these ioctls. Try +to find an mt that supports all of the Linux SCSI tape ioctls and +opens the device for writing if the tape contents will be modified +(look for a package mt-st* from the Linux ftp sites; the GNU mt does +not open for writing for, e.g., erase). + +The supported ioctls are: + +The following use the structure mtop: + +MTFSF Space forward over count filemarks. Tape positioned after filemark. +MTFSFM As above but tape positioned before filemark. +MTBSF Space backward over count filemarks. Tape positioned before + filemark. +MTBSFM As above but ape positioned after filemark. +MTFSR Space forward over count records. +MTBSR Space backward over count records. +MTFSS Space forward over count setmarks. +MTBSS Space backward over count setmarks. +MTWEOF Write count filemarks. +MTWSM Write count setmarks. +MTREW Rewind tape. +MTOFFL Set device off line (often rewind plus eject). +MTNOP Do nothing except flush the buffers. +MTRETEN Re-tension tape. +MTEOM Space to end of recorded data. +MTERASE Erase tape. If the argument is zero, the short erase command + is used. The long erase command is used with all other values + of the argument. +MTSEEK Seek to tape block count. Uses Tandberg-compatible seek (QFA) + for SCSI-1 drives and SCSI-2 seek for SCSI-2 drives. The file and + block numbers in the status are not valid after a seek. +MTSETBLK Set the drive block size. Setting to zero sets the drive into + variable block mode (if applicable). +MTSETDENSITY Sets the drive density code to arg. See drive + documentation for available codes. +MTLOCK and MTUNLOCK Explicitly lock/unlock the tape drive door. +MTLOAD and MTUNLOAD Explicitly load and unload the tape. If the + command argument x is between MT_ST_HPLOADER_OFFSET + 1 and + MT_ST_HPLOADER_OFFSET + 6, the number x is used sent to the + drive with the command and it selects the tape slot to use of + HP C1553A changer. +MTCOMPRESSION Sets compressing or uncompressing drive mode using the + SCSI mode page 15. Note that some drives other methods for + control of compression. Some drives (like the Exabytes) use + density codes for compression control. Some drives use another + mode page but this page has not been implemented in the + driver. Some drives without compression capability will accept + any compression mode without error. +MTSETPART Moves the tape to the partition given by the argument at the + next tape operation. The block at which the tape is positioned + is the block where the tape was previously positioned in the + new active partition unless the next tape operation is + MTSEEK. In this case the tape is moved directly to the block + specified by MTSEEK. MTSETPART is inactive unless + MT_ST_CAN_PARTITIONS set. +MTMKPART Formats the tape with one partition (argument zero) or two + partitions (the argument gives in megabytes the size of + partition 1 that is physically the first partition of the + tape). The drive has to support partitions with size specified + by the initiator. Inactive unless MT_ST_CAN_PARTITIONS set. +MTSETDRVBUFFER + Is used for several purposes. The command is obtained from count + with mask MT_SET_OPTIONS, the low order bits are used as argument. + This command is only allowed for the superuser (root). The + subcommands are: + 0 + The drive buffer option is set to the argument. Zero means + no buffering. + MT_ST_BOOLEANS + Sets the buffering options. The bits are the new states + (enabled/disabled) the following options (in the + parenthesis is specified whether the option is global or + can be specified differently for each mode): + MT_ST_BUFFER_WRITES write buffering (mode) + MT_ST_ASYNC_WRITES asynchronous writes (mode) + MT_ST_READ_AHEAD read ahead (mode) + MT_ST_TWO_FM writing of two filemarks (global) + MT_ST_FAST_EOM using the SCSI spacing to EOD (global) + MT_ST_AUTO_LOCK automatic locking of the drive door (global) + MT_ST_DEF_WRITES the defaults are meant only for writes (mode) + MT_ST_CAN_BSR backspacing over more than one records can + be used for repositioning the tape (global) + MT_ST_NO_BLKLIMS the driver does not ask the block limits + from the drive (block size can be changed only to + variable) (global) + MT_ST_CAN_PARTITIONS enables support for partitioned + tapes (global) + MT_ST_SCSI2LOGICAL the logical block number is used in + the MTSEEK and MTIOCPOS for SCSI-2 drives instead of + the device dependent address. It is recommended to set + this flag unless there are tapes using the device + dependent (from the old times) (global) + MT_ST_SYSV sets the SYSV sematics (mode) + MT_ST_NOWAIT enables immediate mode (i.e., don't wait for + the command to finish) for some commands (e.g., rewind) + MT_ST_DEBUGGING debugging (global; debugging must be + compiled into the driver) + MT_ST_SETBOOLEANS + MT_ST_CLEARBOOLEANS + Sets or clears the option bits. + MT_ST_WRITE_THRESHOLD + Sets the write threshold for this device to kilobytes + specified by the lowest bits. + MT_ST_DEF_BLKSIZE + Defines the default block size set automatically. Value + 0xffffff means that the default is not used any more. + MT_ST_DEF_DENSITY + MT_ST_DEF_DRVBUFFER + Used to set or clear the density (8 bits), and drive buffer + state (3 bits). If the value is MT_ST_CLEAR_DEFAULT + (0xfffff) the default will not be used any more. Otherwise + the lowermost bits of the value contain the new value of + the parameter. + MT_ST_DEF_COMPRESSION + The compression default will not be used if the value of + the lowermost byte is 0xff. Otherwise the lowermost bit + contains the new default. If the bits 8-15 are set to a + non-zero number, and this number is not 0xff, the number is + used as the compression algorithm. The value + MT_ST_CLEAR_DEFAULT can be used to clear the compression + default. + MT_ST_SET_TIMEOUT + Set the normal timeout in seconds for this device. The + default is 900 seconds (15 minutes). The timeout should be + long enough for the retries done by the device while + reading/writing. + MT_ST_SET_LONG_TIMEOUT + Set the long timeout that is used for operations that are + known to take a long time. The default is 14000 seconds + (3.9 hours). For erase this value is further multiplied by + eight. + MT_ST_SET_CLN + Set the cleaning request interpretation parameters using + the lowest 24 bits of the argument. The driver can set the + generic status bit GMT_CLN if a cleaning request bit pattern + is found from the extended sense data. Many drives set one or + more bits in the extended sense data when the drive needs + cleaning. The bits are device-dependent. The driver is + given the number of the sense data byte (the lowest eight + bits of the argument; must be >= 18 (values 1 - 17 + reserved) and <= the maximum requested sense data sixe), + a mask to select the relevant bits (the bits 9-16), and the + bit pattern (bits 17-23). If the bit pattern is zero, one + or more bits under the mask indicate cleaning request. If + the pattern is non-zero, the pattern must match the masked + sense data byte. + + (The cleaning bit is set if the additional sense code and + qualifier 00h 17h are seen regardless of the setting of + MT_ST_SET_CLN.) + +The following ioctl uses the structure mtpos: +MTIOCPOS Reads the current position from the drive. Uses + Tandberg-compatible QFA for SCSI-1 drives and the SCSI-2 + command for the SCSI-2 drives. + +The following ioctl uses the structure mtget to return the status: +MTIOCGET Returns some status information. + The file number and block number within file are returned. The + block is -1 when it can't be determined (e.g., after MTBSF). + The drive type is either MTISSCSI1 or MTISSCSI2. + The number of recovered errors since the previous status call + is stored in the lower word of the field mt_erreg. + The current block size and the density code are stored in the field + mt_dsreg (shifts for the subfields are MT_ST_BLKSIZE_SHIFT and + MT_ST_DENSITY_SHIFT). + The GMT_xxx status bits reflect the drive status. GMT_DR_OPEN + is set if there is no tape in the drive. GMT_EOD means either + end of recorded data or end of tape. GMT_EOT means end of tape. + + +MISCELLANEOUS COMPILE OPTIONS + +The recovered write errors are considered fatal if ST_RECOVERED_WRITE_FATAL +is defined. + +The maximum number of tape devices is determined by the define +ST_MAX_TAPES. If more tapes are detected at driver initialization, the +maximum is adjusted accordingly. + +Immediate return from tape positioning SCSI commands can be enabled by +defining ST_NOWAIT. If this is defined, the user should take care that +the next tape operation is not started before the previous one has +finished. The drives and SCSI adapters should handle this condition +gracefully, but some drive/adapter combinations are known to hang the +SCSI bus in this case. + +The MTEOM command is by default implemented as spacing over 32767 +filemarks. With this method the file number in the status is +correct. The user can request using direct spacing to EOD by setting +ST_FAST_EOM 1 (or using the MT_ST_OPTIONS ioctl). In this case the file +number will be invalid. + +When using read ahead or buffered writes the position within the file +may not be correct after the file is closed (correct position may +require backspacing over more than one record). The correct position +within file can be obtained if ST_IN_FILE_POS is defined at compile +time or the MT_ST_CAN_BSR bit is set for the drive with an ioctl. +(The driver always backs over a filemark crossed by read ahead if the +user does not request data that far.) + + +DEBUGGING HINTS + +To enable debugging messages, edit st.c and #define DEBUG 1. As seen +above, debugging can be switched off with an ioctl if debugging is +compiled into the driver. The debugging output is not voluminous. + +If the tape seems to hang, I would be very interested to hear where +the driver is waiting. With the command 'ps -l' you can see the state +of the process using the tape. If the state is D, the process is +waiting for something. The field WCHAN tells where the driver is +waiting. If you have the current System.map in the correct place (in +/boot for the procps I use) or have updated /etc/psdatabase (for kmem +ps), ps writes the function name in the WCHAN field. If not, you have +to look up the function from System.map. + +Note also that the timeouts are very long compared to most other +drivers. This means that the Linux driver may appear hung although the +real reason is that the tape firmware has got confused. diff --git a/Documentation/scsi/sym53c500_cs.txt b/Documentation/scsi/sym53c500_cs.txt new file mode 100644 index 000000000000..75febcf9298c --- /dev/null +++ b/Documentation/scsi/sym53c500_cs.txt @@ -0,0 +1,23 @@ +The sym53c500_cs driver originated as an add-on to David Hinds' pcmcia-cs +package, and was written by Tom Corner (tcorner@via.at). A rewrite was +long overdue, and the current version addresses the following concerns: + + (1) extensive kernel changes between 2.4 and 2.6. + (2) deprecated PCMCIA support outside the kernel. + +All the USE_BIOS code has been ripped out. It was never used, and could +not have worked anyway. The USE_DMA code is likewise gone. Many thanks +to YOKOTA Hiroshi (nsp_cs driver) and David Hinds (qlogic_cs driver) for +the code fragments I shamelessly adapted for this work. Thanks also to +Christoph Hellwig for his patient tutelage while I stumbled about. + +The Symbios Logic 53c500 chip was used in the "newer" (circa 1997) version +of the New Media Bus Toaster PCMCIA SCSI controller. Presumably there are +other products using this chip, but I've never laid eyes (much less hands) +on one. + +Through the years, there have been a number of downloads of the pcmcia-cs +version of this driver, and I guess it worked for those users. It worked +for Tom Corner, and it works for me. Your mileage will probably vary. + +--Bob Tracy (rct@frus.com) diff --git a/Documentation/scsi/sym53c8xx_2.txt b/Documentation/scsi/sym53c8xx_2.txt new file mode 100644 index 000000000000..7f516cdcd262 --- /dev/null +++ b/Documentation/scsi/sym53c8xx_2.txt @@ -0,0 +1,1059 @@ +The Linux SYM-2 driver documentation file + +Written by Gerard Roudier <groudier@free.fr> +21 Rue Carnot +95170 DEUIL LA BARRE - FRANCE + +Updated by Matthew Wilcox <matthew@wil.cx> + +2004-10-09 +=============================================================================== + +1. Introduction +2. Supported chips and SCSI features +3. Advantages of this driver for newer chips. + 3.1 Optimized SCSI SCRIPTS + 3.2 New features appeared with the SYM53C896 +4. Memory mapped I/O versus normal I/O +5. Tagged command queueing +6. Parity checking +7. Profiling information +8. Control commands + 8.1 Set minimum synchronous period + 8.2 Set wide size + 8.3 Set maximum number of concurrent tagged commands + 8.4 Set debug mode + 8.5 Set flag (no_disc) + 8.6 Set verbose level + 8.7 Reset all logical units of a target + 8.8 Abort all tasks of all logical units of a target +9. Configuration parameters +10. Boot setup commands + 10.1 Syntax + 10.2 Available arguments + 10.2.1 Default number of tagged commands + 10.2.2 Burst max + 10.2.3 LED support + 10.2.4 Differential mode + 10.2.5 IRQ mode + 10.2.6 Check SCSI BUS + 10.2.7 Suggest a default SCSI id for hosts + 10.2.8 Verbosity level + 10.2.9 Debug mode + 10.2.10 Settle delay + 10.2.11 Serial NVRAM + 10.2.12 Exclude a host from being attached + 10.3 Converting from old options + 10.4 SCSI BUS checking boot option +11. SCSI problem troubleshooting + 15.1 Problem tracking + 15.2 Understanding hardware error reports +12. Serial NVRAM support (by Richard Waltham) + 17.1 Features + 17.2 Symbios NVRAM layout + 17.3 Tekram NVRAM layout + +=============================================================================== + +1. Introduction + +This driver supports the whole SYM53C8XX family of PCI-SCSI controllers. +It also support the subset of LSI53C10XX PCI-SCSI controllers that are based +on the SYM53C8XX SCRIPTS language. + +It replaces the sym53c8xx+ncr53c8xx driver bundle and shares its core code +with the FreeBSD SYM-2 driver. The `glue' that allows this driver to work +under Linux is contained in 2 files named sym_glue.h and sym_glue.c. +Other drivers files are intended not to depend on the Operating System +on which the driver is used. + +The history of this driver can be summerized as follows: + +1993: ncr driver written for 386bsd and FreeBSD by: + Wolfgang Stanglmeier <wolf@cologne.de> + Stefan Esser <se@mi.Uni-Koeln.de> + +1996: port of the ncr driver to Linux-1.2.13 and rename it ncr53c8xx. + Gerard Roudier + +1998: new sym53c8xx driver for Linux based on LOAD/STORE instruction and that + adds full support for the 896 but drops support for early NCR devices. + Gerard Roudier + +1999: port of the sym53c8xx driver to FreeBSD and support for the LSI53C1010 + 33 MHz and 66MHz Ultra-3 controllers. The new driver is named `sym'. + Gerard Roudier + +2000: Add support for early NCR devices to FreeBSD `sym' driver. + Break the driver into several sources and separate the OS glue + code from the core code that can be shared among different O/Ses. + Write a glue code for Linux. + Gerard Roudier + +2004: Remove FreeBSD compatibility code. Remove support for versions of + Linux before 2.6. Start using Linux facilities. + +This README file addresses the Linux version of the driver. Under FreeBSD, +the driver documentation is the sym.8 man page. + +Information about new chips is available at LSILOGIC web server: + + http://www.lsilogic.com/ + +SCSI standard documentations are available at T10 site: + + http://www.t10.org/ + +Useful SCSI tools written by Eric Youngdale are part of most Linux +distributions: + scsiinfo: command line tool + scsi-config: TCL/Tk tool using scsiinfo + +2. Supported chips and SCSI features + +The following features are supported for all chips: + + Synchronous negotiation + Disconnection + Tagged command queuing + SCSI parity checking + PCI Master parity checking + +Other features depends on chip capabilities. +The driver notably uses optimized SCRIPTS for devices that support +LOAD/STORE and handles PHASE MISMATCH from SCRIPTS for devices that +support the corresponding feature. + +The following table shows some characteristics of the chip family. + + On board LOAD/STORE HARDWARE +Chip SDMS BIOS Wide SCSI std. Max. sync SCRIPTS PHASE MISMATCH +---- --------- ---- --------- ---------- ---------- -------------- +810 N N FAST10 10 MB/s N N +810A N N FAST10 10 MB/s Y N +815 Y N FAST10 10 MB/s N N +825 Y Y FAST10 20 MB/s N N +825A Y Y FAST10 20 MB/s Y N +860 N N FAST20 20 MB/s Y N +875 Y Y FAST20 40 MB/s Y N +875A Y Y FAST20 40 MB/s Y Y +876 Y Y FAST20 40 MB/s Y N +895 Y Y FAST40 80 MB/s Y N +895A Y Y FAST40 80 MB/s Y Y +896 Y Y FAST40 80 MB/s Y Y +897 Y Y FAST40 80 MB/s Y Y +1510D Y Y FAST40 80 MB/s Y Y +1010 Y Y FAST80 160 MB/s Y Y +1010_66* Y Y FAST80 160 MB/s Y Y + +* Chip supports 33MHz and 66MHz PCI bus clock. + + +Summary of other supported features: + +Module: allow to load the driver +Memory mapped I/O: increases performance +Control commands: write operations to the proc SCSI file system +Debugging information: written to syslog (expert only) +Scatter / gather +Shared interrupt +Boot setup commands +Serial NVRAM: Symbios and Tekram formats + + +3. Advantages of this driver for newer chips. + +3.1 Optimized SCSI SCRIPTS. + +All chips except the 810, 815 and 825, support new SCSI SCRIPTS instructions +named LOAD and STORE that allow to move up to 1 DWORD from/to an IO register +to/from memory much faster that the MOVE MEMORY instruction that is supported +by the 53c7xx and 53c8xx family. + +The LOAD/STORE instructions support absolute and DSA relative addressing +modes. The SCSI SCRIPTS had been entirely rewritten using LOAD/STORE instead +of MOVE MEMORY instructions. + +Due to the lack of LOAD/STORE SCRIPTS instructions by earlier chips, this +driver also incorporates a different SCRIPTS set based on MEMORY MOVE, in +order to provide support for the entire SYM53C8XX chips family. + +3.2 New features appeared with the SYM53C896 + +Newer chips (see above) allows handling of the phase mismatch context from +SCRIPTS (avoids the phase mismatch interrupt that stops the SCSI processor +until the C code has saved the context of the transfer). + +The 896 and 1010 chips support 64 bit PCI transactions and addressing, +while the 895A supports 32 bit PCI transactions and 64 bit addressing. +The SCRIPTS processor of these chips is not true 64 bit, but uses segment +registers for bit 32-63. Another interesting feature is that LOAD/STORE +instructions that address the on-chip RAM (8k) remain internal to the chip. + +4. Memory mapped I/O versus normal I/O + +Memory mapped I/O has less latency than normal I/O and is the recommended +way for doing IO with PCI devices. Memory mapped I/O seems to work fine on +most hardware configurations, but some poorly designed chipsets may break +this feature. A configuration option is provided for normal I/O to be +used but the driver defaults to MMIO. + +5. Tagged command queueing + +Queuing more than 1 command at a time to a device allows it to perform +optimizations based on actual head positions and its mechanical +characteristics. This feature may also reduce average command latency. +In order to really gain advantage of this feature, devices must have +a reasonable cache size (No miracle is to be expected for a low-end +hard disk with 128 KB or less). +Some kown old SCSI devices do not properly support tagged command queuing. +Generally, firmware revisions that fix this kind of problems are available +at respective vendor web/ftp sites. +All I can say is that I never have had problem with tagged queuing using +this driver and its predecessors. Hard disks that behaved correctly for +me using tagged commands are the following: + +- IBM S12 0662 +- Conner 1080S +- Quantum Atlas I +- Quantum Atlas II +- Seagate Cheetah I +- Quantum Viking II +- IBM DRVS +- Quantum Atlas IV +- Seagate Cheetah II + +If your controller has NVRAM, you can configure this feature per target +from the user setup tool. The Tekram Setup program allows to tune the +maximum number of queued commands up to 32. The Symbios Setup only allows +to enable or disable this feature. + +The maximum number of simultaneous tagged commands queued to a device +is currently set to 16 by default. This value is suitable for most SCSI +disks. With large SCSI disks (>= 2GB, cache >= 512KB, average seek time +<= 10 ms), using a larger value may give better performances. + +This driver supports up to 255 commands per device, and but using more than +64 is generally not worth-while, unless you are using a very large disk or +disk arrays. It is noticeable that most of recent hard disks seem not to +accept more than 64 simultaneous commands. So, using more than 64 queued +commands is probably just resource wasting. + +If your controller does not have NVRAM or if it is managed by the SDMS +BIOS/SETUP, you can configure tagged queueing feature and device queue +depths from the boot command-line. For example: + + sym53c8xx=tags:4/t2t3q15-t4q7/t1u0q32 + +will set tagged commands queue depths as follow: + +- target 2 all luns on controller 0 --> 15 +- target 3 all luns on controller 0 --> 15 +- target 4 all luns on controller 0 --> 7 +- target 1 lun 0 on controller 1 --> 32 +- all other target/lun --> 4 + +In some special conditions, some SCSI disk firmwares may return a +QUEUE FULL status for a SCSI command. This behaviour is managed by the +driver using the following heuristic: + +- Each time a QUEUE FULL status is returned, tagged queue depth is reduced + to the actual number of disconnected commands. + +- Every 200 successfully completed SCSI commands, if allowed by the + current limit, the maximum number of queueable commands is incremented. + +Since QUEUE FULL status reception and handling is resource wasting, the +driver notifies by default this problem to user by indicating the actual +number of commands used and their status, as well as its decision on the +device queue depth change. +The heuristic used by the driver in handling QUEUE FULL ensures that the +impact on performances is not too bad. You can get rid of the messages by +setting verbose level to zero, as follow: + +1st method: boot your system using 'sym53c8xx=verb:0' option. +2nd method: apply "setverbose 0" control command to the proc fs entry + corresponding to your controller after boot-up. + +6. Parity checking + +The driver supports SCSI parity checking and PCI bus master parity +checking. These features must be enabled in order to ensure safe +data transfers. Some flawed devices or mother boards may have problems +with parity. The options to defeat parity checking have been removed +from the driver. + +7. Profiling information + +This driver does not provide profiling informations as did its predecessors. +This feature was not this useful and added complexity to the code. +As the driver code got more complex, I have decided to remove everything +that didn't seem actually useful. + +8. Control commands + +Control commands can be sent to the driver with write operations to +the proc SCSI file system. The generic command syntax is the +following: + + echo "<verb> <parameters>" >/proc/scsi/sym53c8xx/0 + (assumes controller number is 0) + +Using "all" for "<target>" parameter with the commands below will +apply to all targets of the SCSI chain (except the controller). + +Available commands: + +8.1 Set minimum synchronous period factor + + setsync <target> <period factor> + + target: target number + period: minimum synchronous period. + Maximum speed = 1000/(4*period factor) except for special + cases below. + + Specify a period of 0, to force asynchronous transfer mode. + + 9 means 12.5 nano-seconds synchronous period + 10 means 25 nano-seconds synchronous period + 11 means 30 nano-seconds synchronous period + 12 means 50 nano-seconds synchronous period + +8.2 Set wide size + + setwide <target> <size> + + target: target number + size: 0=8 bits, 1=16bits + +8.3 Set maximum number of concurrent tagged commands + + settags <target> <tags> + + target: target number + tags: number of concurrent tagged commands + must not be greater than configured (default: 16) + +8.4 Set debug mode + + setdebug <list of debug flags> + + Available debug flags: + alloc: print info about memory allocations (ccb, lcb) + queue: print info about insertions into the command start queue + result: print sense data on CHECK CONDITION status + scatter: print info about the scatter process + scripts: print info about the script binding process + tiny: print minimal debugging information + timing: print timing information of the NCR chip + nego: print information about SCSI negotiations + phase: print information on script interruptions + + Use "setdebug" with no argument to reset debug flags. + + +8.5 Set flag (no_disc) + + setflag <target> <flag> + + target: target number + + For the moment, only one flag is available: + + no_disc: not allow target to disconnect. + + Do not specify any flag in order to reset the flag. For example: + - setflag 4 + will reset no_disc flag for target 4, so will allow it disconnections. + - setflag all + will allow disconnection for all devices on the SCSI bus. + + +8.6 Set verbose level + + setverbose #level + + The driver default verbose level is 1. This command allows to change + th driver verbose level after boot-up. + +8.7 Reset all logical units of a target + + resetdev <target> + + target: target number + The driver will try to send a BUS DEVICE RESET message to the target. + +8.8 Abort all tasks of all logical units of a target + + cleardev <target> + + target: target number + The driver will try to send a ABORT message to all the logical units + of the target. + + +9. Configuration parameters + +Under kernel configuration tools (make menuconfig, for example), it is +possible to change some default driver configuration parameters. +If the firmware of all your devices is perfect enough, all the +features supported by the driver can be enabled at start-up. However, +if only one has a flaw for some SCSI feature, you can disable the +support by the driver of this feature at linux start-up and enable +this feature after boot-up only for devices that support it safely. + +Configuration parameters: + +Use normal IO (default answer: n) + Answer "y" if you suspect your mother board to not allow memory mapped I/O. + May slow down performance a little. + +Default tagged command queue depth (default answer: 16) + Entering 0 defaults to tagged commands not being used. + This parameter can be specified from the boot command line. + +Maximum number of queued commands (default answer: 32) + This option allows you to specify the maximum number of tagged commands + that can be queued to a device. The maximum supported value is 255. + +Synchronous transfers frequency (default answer: 80) + This option allows you to specify the frequency in MHz the driver + will use at boot time for synchronous data transfer negotiations. + 0 means "asynchronous data transfers". + +10. Boot setup commands + +10.1 Syntax + +Setup commands can be passed to the driver either at boot time or as +parameters to modprobe, as described in Documentation/kernel-parameters.txt + +Example of boot setup command under lilo prompt: + +lilo: linux root=/dev/sda2 sym53c8xx.cmd_per_lun=4 sym53c8xx.sync=10 sym53c8xx.debug=0x200 + +- enable tagged commands, up to 4 tagged commands queued. +- set synchronous negotiation speed to 10 Mega-transfers / second. +- set DEBUG_NEGO flag. + +The following command will install the driver module with the same +options as above. + + modprobe sym53c8xx cmd_per_lun=4 sync=10 debug=0x200 + +10.2 Available arguments + +10.2.1 Default number of tagged commands + cmd_per_lun=0 (or cmd_per_lun=1) tagged command queuing disabled + cmd_per_lun=#tags (#tags > 1) tagged command queuing enabled + #tags will be truncated to the max queued commands configuration parameter. + +10.2.2 Detailed control of tagged commands + This option allows you to specify a command queue depth for each device + that supports tagged command queueing. + Example: + tag_ctrl=10/t2t3q16-t5q24/t1u2q32 + will set devices queue depth as follow: + - controller #0 target #2 and target #3 -> 16 commands, + - controller #0 target #5 -> 24 commands, + - controller #1 target #1 logical unit #2 -> 32 commands, + - all other logical units (all targets, all controllers) -> 10 commands. + +10.2.3 Burst max + burst=0 burst disabled + burst=255 get burst length from initial IO register settings. + burst=#x burst enabled (1<<#x burst transfers max) + #x is an integer value which is log base 2 of the burst transfers max. + By default the driver uses the maximum value supported by the chip. + +10.2.4 LED support + led=1 enable LED support + led=0 disable LED support + Do not enable LED support if your scsi board does not use SDMS BIOS. + (See 'Configuration parameters') + +10.2.4 Differential mode + diff=0 never set up diff mode + diff=1 set up diff mode if BIOS set it + diff=2 always set up diff mode + diff=3 set diff mode if GPIO3 is not set + +10.2.5 IRQ mode + irqm=0 always open drain + irqm=1 same as initial settings (assumed BIOS settings) + irqm=2 always totem pole + +10.2.6 Check SCSI BUS + buschk=<option bits> + + Available option bits: + 0x0: No check. + 0x1: Check and do not attach the controller on error. + 0x2: Check and just warn on error. + +10.2.7 Suggest a default SCSI id for hosts + hostid=255 no id suggested. + hostid=#x (0 < x < 7) x suggested for hosts SCSI id. + + If a host SCSI id is available from the NVRAM, the driver will ignore + any value suggested as boot option. Otherwise, if a suggested value + different from 255 has been supplied, it will use it. Otherwise, it will + try to deduce the value previously set in the hardware and use value + 7 if the hardware value is zero. + +10.2.8 Verbosity level + verb=0 minimal + verb=1 normal + verb=2 too much + +10.2.9 Debug mode + debug=0 clear debug flags + debug=#x set debug flags + #x is an integer value combining the following power-of-2 values: + DEBUG_ALLOC 0x1 + DEBUG_PHASE 0x2 + DEBUG_POLL 0x4 + DEBUG_QUEUE 0x8 + DEBUG_RESULT 0x10 + DEBUG_SCATTER 0x20 + DEBUG_SCRIPT 0x40 + DEBUG_TINY 0x80 + DEBUG_TIMING 0x100 + DEBUG_NEGO 0x200 + DEBUG_TAGS 0x400 + DEBUG_FREEZE 0x800 + DEBUG_RESTART 0x1000 + + You can play safely with DEBUG_NEGO. However, some of these flags may + generate bunches of syslog messages. + +10.2.10 Settle delay + settle=n delay for n seconds + + After a bus reset, the driver will delay for n seconds before talking + to any device on the bus. The default is 3 seconds and safe mode will + default it to 10. + +10.2.11 Serial NVRAM + NB: option not currently implemented. + nvram=n do not look for serial NVRAM + nvram=y test controllers for onboard serial NVRAM + (alternate binary form) + nvram=<bits options> + 0x01 look for NVRAM (equivalent to nvram=y) + 0x02 ignore NVRAM "Synchronous negotiation" parameters for all devices + 0x04 ignore NVRAM "Wide negotiation" parameter for all devices + 0x08 ignore NVRAM "Scan at boot time" parameter for all devices + 0x80 also attach controllers set to OFF in the NVRAM (sym53c8xx only) + +10.2.12 Exclude a host from being attached + excl=<io_address>,... + + Prevent host at a given io address from being attached. + For example 'excl=0xb400,0xc000' indicate to the + driver not to attach hosts at address 0xb400 and 0xc000. + +10.3 Converting from old style options + +Previously, the sym2 driver accepted arguments of the form + sym53c8xx=tags:4,sync:10,debug:0x200 + +As a result of the new module parameters, this is no longer available. +Most of the options have remained the same, but tags has split into +cmd_per_lun and tag_ctrl for its two different purposes. The sample above +would be specified as: + modprobe sym53c8xx cmd_per_lun=4 sync=10 debug=0x200 + +or on the kernel boot line as: + sym53c8xx.cmd_per_lun=4 sym53c8xx.sync=10 sym53c8xx.debug=0x200 + +10.4 SCSI BUS checking boot option. + +When this option is set to a non-zero value, the driver checks SCSI lines +logic state, 100 micro-seconds after having asserted the SCSI RESET line. +The driver just reads SCSI lines and checks all lines read FALSE except RESET. +Since SCSI devices shall release the BUS at most 800 nano-seconds after SCSI +RESET has been asserted, any signal to TRUE may indicate a SCSI BUS problem. +Unfortunately, the following common SCSI BUS problems are not detected: +- Only 1 terminator installed. +- Misplaced terminators. +- Bad quality terminators. +On the other hand, either bad cabling, broken devices, not conformant +devices, ... may cause a SCSI signal to be wrong when te driver reads it. + +15. SCSI problem troubleshooting + +15.1 Problem tracking + +Most SCSI problems are due to a non conformant SCSI bus or too buggy +devices. If infortunately you have SCSI problems, you can check the +following things: + +- SCSI bus cables +- terminations at both end of the SCSI chain +- linux syslog messages (some of them may help you) + +If you do not find the source of problems, you can configure the +driver or devices in the NVRAM with minimal features. + +- only asynchronous data transfers +- tagged commands disabled +- disconnections not allowed + +Now, if your SCSI bus is ok, your system has every chance to work +with this safe configuration but performances will not be optimal. + +If it still fails, then you can send your problem description to +appropriate mailing lists or news-groups. Send me a copy in order to +be sure I will receive it. Obviously, a bug in the driver code is +possible. + + My cyrrent email address: Gerard Roudier <groudier@free.fr> + +Allowing disconnections is important if you use several devices on +your SCSI bus but often causes problems with buggy devices. +Synchronous data transfers increases throughput of fast devices like +hard disks. Good SCSI hard disks with a large cache gain advantage of +tagged commands queuing. + +15.2 Understanding hardware error reports + +When the driver detects an unexpected error condition, it may display a +message of the following pattern. + +sym0:1: ERROR (0:48) (1-21-65) (f/95/0) @ (script 7c0:19000000). +sym0: script cmd = 19000000 +sym0: regdump: da 10 80 95 47 0f 01 07 75 01 81 21 80 01 09 00. + +Some fields in such a message may help you understand the cause of the +problem, as follows: + +sym0:1: ERROR (0:48) (1-21-65) (f/95/0) @ (script 7c0:19000000). +.....A.........B.C....D.E..F....G.H..I.......J.....K...L....... + +Field A : target number. + SCSI ID of the device the controller was talking with at the moment the + error occurs. + +Field B : DSTAT io register (DMA STATUS) + Bit 0x40 : MDPE Master Data Parity Error + Data parity error detected on the PCI BUS. + Bit 0x20 : BF Bus Fault + PCI bus fault condition detected + Bit 0x01 : IID Illegal Instruction Detected + Set by the chip when it detects an Illegal Instruction format + on some condition that makes an instruction illegal. + Bit 0x80 : DFE Dma Fifo Empty + Pure status bit that does not indicate an error. + If the reported DSTAT value contains a combination of MDPE (0x40), + BF (0x20), then the cause may be likely due to a PCI BUS problem. + +Field C : SIST io register (SCSI Interrupt Status) + Bit 0x08 : SGE SCSI GROSS ERROR + Indicates that the chip detected a severe error condition + on the SCSI BUS that prevents the SCSI protocol from functioning + properly. + Bit 0x04 : UDC Unexpected Disconnection + Indicates that the device released the SCSI BUS when the chip + was not expecting this to happen. A device may behave so to + indicate the SCSI initiator that an error condition not reportable using the SCSI protocol has occurred. + Bit 0x02 : RST SCSI BUS Reset + Generally SCSI targets do not reset the SCSI BUS, although any + device on the BUS can reset it at any time. + Bit 0x01 : PAR Parity + SCSI parity error detected. + On a faulty SCSI BUS, any error condition among SGE (0x08), UDC (0x04) and + PAR (0x01) may be detected by the chip. If your SCSI system sometimes + encounters such error conditions, especially SCSI GROSS ERROR, then a SCSI + BUS problem is likely the cause of these errors. + +For fields D,E,F,G and H, you may look into the sym53c8xx_defs.h file +that contains some minimal comments on IO register bits. +Field D : SOCL Scsi Output Control Latch + This register reflects the state of the SCSI control lines the + chip want to drive or compare against. +Field E : SBCL Scsi Bus Control Lines + Actual value of control lines on the SCSI BUS. +Field F : SBDL Scsi Bus Data Lines + Actual value of data lines on the SCSI BUS. +Field G : SXFER SCSI Transfer + Contains the setting of the Synchronous Period for output and + the current Synchronous offset (offset 0 means asynchronous). +Field H : SCNTL3 Scsi Control Register 3 + Contains the setting of timing values for both asynchronous and + synchronous data transfers. +Field I : SCNTL4 Scsi Control Register 4 + Only meaninful for 53C1010 Ultra3 controllers. + +Understanding Fields J, K, L and dumps requires to have good knowledge of +SCSI standards, chip cores functionnals and internal driver data structures. +You are not required to decode and understand them, unless you want to help +maintain the driver code. + +17. Serial NVRAM (added by Richard Waltham: dormouse@farsrobt.demon.co.uk) + +17.1 Features + +Enabling serial NVRAM support enables detection of the serial NVRAM included +on Symbios and some Symbios compatible host adaptors, and Tekram boards. The +serial NVRAM is used by Symbios and Tekram to hold set up parameters for the +host adaptor and it's attached drives. + +The Symbios NVRAM also holds data on the boot order of host adaptors in a +system with more than one host adaptor. This information is no longer used +as it's fundamentally incompatible with the hotplug PCI model. + +Tekram boards using Symbios chips, DC390W/F/U, which have NVRAM are detected +and this is used to distinguish between Symbios compatible and Tekram host +adaptors. This is used to disable the Symbios compatible "diff" setting +incorrectly set on Tekram boards if the CONFIG_SCSI_53C8XX_SYMBIOS_COMPAT +configuration parameter is set enabling both Symbios and Tekram boards to be +used together with the Symbios cards using all their features, including +"diff" support. ("led pin" support for Symbios compatible cards can remain +enabled when using Tekram cards. It does nothing useful for Tekram host +adaptors but does not cause problems either.) + +The parameters the driver is able to get from the NVRAM depend on the +data format used, as follow: + + Tekram format Symbios format +General and host parameters + Boot order N Y + Host SCSI ID Y Y + SCSI parity checking Y Y + Verbose boot messages N Y +SCSI devices parameters + Synchronous transfer speed Y Y + Wide 16 / Narrow Y Y + Tagged Command Queuing enabled Y Y + Disconnections enabled Y Y + Scan at boot time N Y + +In order to speed up the system boot, for each device configured without +the "scan at boot time" option, the driver forces an error on the +first TEST UNIT READY command received for this device. + + +17.2 Symbios NVRAM layout + +typical data at NVRAM address 0x100 (53c810a NVRAM) +----------------------------------------------------------- +00 00 +64 01 +8e 0b + +00 30 00 00 00 00 07 00 00 00 00 00 00 00 07 04 10 04 00 00 + +04 00 0f 00 00 10 00 50 00 00 01 00 00 62 +04 00 03 00 00 10 00 58 00 00 01 00 00 63 +04 00 01 00 00 10 00 48 00 00 01 00 00 61 +00 00 00 00 00 00 00 00 00 00 00 00 00 00 + +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 + +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 + +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 + +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 + +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 +00 00 00 00 00 00 00 00 + +fe fe +00 00 +00 00 +----------------------------------------------------------- +NVRAM layout details + +NVRAM Address 0x000-0x0ff not used + 0x100-0x26f initialised data + 0x270-0x7ff not used + +general layout + + header - 6 bytes, + data - 356 bytes (checksum is byte sum of this data) + trailer - 6 bytes + --- + total 368 bytes + +data area layout + + controller set up - 20 bytes + boot configuration - 56 bytes (4x14 bytes) + device set up - 128 bytes (16x8 bytes) + unused (spare?) - 152 bytes (19x8 bytes) + --- + total 356 bytes + +----------------------------------------------------------- +header + +00 00 - ?? start marker +64 01 - byte count (lsb/msb excludes header/trailer) +8e 0b - checksum (lsb/msb excludes header/trailer) +----------------------------------------------------------- +controller set up + +00 30 00 00 00 00 07 00 00 00 00 00 00 00 07 04 10 04 00 00 + | | | | + | | | -- host ID + | | | + | | --Removable Media Support + | | 0x00 = none + | | 0x01 = Bootable Device + | | 0x02 = All with Media + | | + | --flag bits 2 + | 0x00000001= scan order hi->low + | (default 0x00 - scan low->hi) + --flag bits 1 + 0x00000001 scam enable + 0x00000010 parity enable + 0x00000100 verbose boot msgs + +remaining bytes unknown - they do not appear to change in my +current set up for any of the controllers. + +default set up is identical for 53c810a and 53c875 NVRAM +(Removable Media added Symbios BIOS version 4.09) +----------------------------------------------------------- +boot configuration + +boot order set by order of the devices in this table + +04 00 0f 00 00 10 00 50 00 00 01 00 00 62 -- 1st controller +04 00 03 00 00 10 00 58 00 00 01 00 00 63 2nd controller +04 00 01 00 00 10 00 48 00 00 01 00 00 61 3rd controller +00 00 00 00 00 00 00 00 00 00 00 00 00 00 4th controller + | | | | | | | | + | | | | | | ---- PCI io port adr + | | | | | --0x01 init/scan at boot time + | | | | --PCI device/function number (0xdddddfff) + | | ----- ?? PCI vendor ID (lsb/msb) + ----PCI device ID (lsb/msb) + +?? use of this data is a guess but seems reasonable + +remaining bytes unknown - they do not appear to change in my +current set up + +default set up is identical for 53c810a and 53c875 NVRAM +----------------------------------------------------------- +device set up (up to 16 devices - includes controller) + +0f 00 08 08 64 00 0a 00 - id 0 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 + +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 +0f 00 08 08 64 00 0a 00 - id 15 + | | | | | | + | | | | ----timeout (lsb/msb) + | | | --synch period (0x?? 40 Mtrans/sec- fast 40) (probably 0x28) + | | | (0x30 20 Mtrans/sec- fast 20) + | | | (0x64 10 Mtrans/sec- fast ) + | | | (0xc8 5 Mtrans/sec) + | | | (0x00 asynchronous) + | | -- ?? max sync offset (0x08 in NVRAM on 53c810a) + | | (0x10 in NVRAM on 53c875) + | --device bus width (0x08 narrow) + | (0x10 16 bit wide) + --flag bits + 0x00000001 - disconnect enabled + 0x00000010 - scan at boot time + 0x00000100 - scan luns + 0x00001000 - queue tags enabled + +remaining bytes unknown - they do not appear to change in my +current set up + +?? use of this data is a guess but seems reasonable +(but it could be max bus width) + +default set up for 53c810a NVRAM +default set up for 53c875 NVRAM - bus width - 0x10 + - sync offset ? - 0x10 + - sync period - 0x30 +----------------------------------------------------------- +?? spare device space (32 bit bus ??) + +00 00 00 00 00 00 00 00 (19x8bytes) +. +. +00 00 00 00 00 00 00 00 + +default set up is identical for 53c810a and 53c875 NVRAM +----------------------------------------------------------- +trailer + +fe fe - ? end marker ? +00 00 +00 00 + +default set up is identical for 53c810a and 53c875 NVRAM +----------------------------------------------------------- + + + +17.3 Tekram NVRAM layout + +nvram 64x16 (1024 bit) + +Drive settings + +Drive ID 0-15 (addr 0x0yyyy0 = device setup, yyyy = ID) + (addr 0x0yyyy1 = 0x0000) + + x x x x x x x x x x x x x x x x + | | | | | | | | | + | | | | | | | | ----- parity check 0 - off + | | | | | | | | 1 - on + | | | | | | | | + | | | | | | | ------- sync neg 0 - off + | | | | | | | 1 - on + | | | | | | | + | | | | | | --------- disconnect 0 - off + | | | | | | 1 - on + | | | | | | + | | | | | ----------- start cmd 0 - off + | | | | | 1 - on + | | | | | + | | | | -------------- tagged cmds 0 - off + | | | | 1 - on + | | | | + | | | ---------------- wide neg 0 - off + | | | 1 - on + | | | + --------------------------- sync rate 0 - 10.0 Mtrans/sec + 1 - 8.0 + 2 - 6.6 + 3 - 5.7 + 4 - 5.0 + 5 - 4.0 + 6 - 3.0 + 7 - 2.0 + 7 - 2.0 + 8 - 20.0 + 9 - 16.7 + a - 13.9 + b - 11.9 + +Global settings + +Host flags 0 (addr 0x100000, 32) + + x x x x x x x x x x x x x x x x + | | | | | | | | | | | | + | | | | | | | | ----------- host ID 0x00 - 0x0f + | | | | | | | | + | | | | | | | ----------------------- support for 0 - off + | | | | | | | > 2 drives 1 - on + | | | | | | | + | | | | | | ------------------------- support drives 0 - off + | | | | | | > 1Gbytes 1 - on + | | | | | | + | | | | | --------------------------- bus reset on 0 - off + | | | | | power on 1 - on + | | | | | + | | | | ----------------------------- active neg 0 - off + | | | | 1 - on + | | | | + | | | -------------------------------- imm seek 0 - off + | | | 1 - on + | | | + | | ---------------------------------- scan luns 0 - off + | | 1 - on + | | + -------------------------------------- removable 0 - disable + as BIOS dev 1 - boot device + 2 - all + +Host flags 1 (addr 0x100001, 33) + + x x x x x x x x x x x x x x x x + | | | | | | + | | | --------- boot delay 0 - 3 sec + | | | 1 - 5 + | | | 2 - 10 + | | | 3 - 20 + | | | 4 - 30 + | | | 5 - 60 + | | | 6 - 120 + | | | + --------------------------- max tag cmds 0 - 2 + 1 - 4 + 2 - 8 + 3 - 16 + 4 - 32 + +Host flags 2 (addr 0x100010, 34) + + x x x x x x x x x x x x x x x x + | + ----- F2/F6 enable 0 - off ??? + 1 - on ??? + +checksum (addr 0x111111) + +checksum = 0x1234 - (sum addr 0-63) + +---------------------------------------------------------------------------- + +default nvram data: + +0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 +0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 +0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 +0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 0x0037 0x0000 + +0x0f07 0x0400 0x0001 0x0000 0x0000 0x0000 0x0000 0x0000 +0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 +0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 +0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0xfbbc + + +=============================================================================== +End of Linux SYM-2 driver documentation file diff --git a/Documentation/scsi/tmscsim.txt b/Documentation/scsi/tmscsim.txt new file mode 100644 index 000000000000..e165229adf50 --- /dev/null +++ b/Documentation/scsi/tmscsim.txt @@ -0,0 +1,449 @@ +The tmscsim driver +================== + +1. Purpose and history +2. Installation +3. Features +4. Configuration via /proc/scsi/tmscsim/? +5. Configuration via boot/module params +6. Potential improvements +7. Bug reports, debugging and updates +8. Acknowledgements +9. Copyright + + +1. Purpose and history +---------------------- +The tmscsim driver supports PCI SCSI Host Adapters based on the AM53C974 +chip. AM53C974 based SCSI adapters include: + Tekram DC390, DC390T + Dawicontrol 2974 + QLogic Fast! PCI Basic + some on-board adapters +(This is most probably not a complete list) + +It has originally written by C.L. Huang from the Tekram corp. to support the +Tekram DC390(T) adapter. This is where the name comes from: tm = Tekram +scsi = SCSI driver, m = AMD (?) as opposed to w for the DC390W/U/F +(NCR53c8X5, X=2/7) driver. Yes, there was also a driver for the latter, +tmscsiw, which supported DC390W/U/F adapters. It's not maintained any more, +as the ncr53c8xx is perfectly supporting these adpaters since some time. + +The driver first appeared in April 1996, exclusively supported the DC390 +and has been enhanced since then in various steps. In May 1998 support for +general AM53C974 based adapters and some possibilities to configure it were +added. The non-DC390 support works by assuming some values for the data +normally taken from the DC390 EEPROM. See below (chapter 5) for details. + +When using the DC390, the configuration is still be done using the DC390 +BIOS setup. The DC390 EEPROM is read and used by the driver, any boot or +module parameters (chapter 5) are ignored! However, you can change settings +dynamically, as described in chapter 4. + +For a more detailed description of the driver's history, see the first lines +of tmscsim.c. +The numbering scheme isn't consistent. The first versions went from 1.00 to +1.12, then 1.20a to 1.20t. Finally I decided to use the ncr53c8xx scheme. So +the next revisions will be 2.0a to 2.0X (stable), 2.1a to 2.1X (experimental), +2.2a to 2.2X (stable, again) etc. (X = anything between a and z.) If I send +fixes to people for testing, I create intermediate versions with a digit +appended, e.g. 2.0c3. + + +2. Installation +--------------- +If you got any recent kernel with this driver and document included in +linux/drivers/scsi, you basically have to do nothing special to use this +driver. Of course you have to choose to compile SCSI support and DC390(T) +support into your kernel or as module when configuring your kernel for +compiling. +NEW: You may as well compile this module outside your kernel, using the +supplied Makefile. + + If you got an old kernel (pre 2.1.127, pre 2.0.37p1) with an old version of + this driver: Get dc390-21125-20b.diff.gz or dc390-2036p21-20b1.diff.gz from + my web page and apply the patch. Apply further patches to upgrade to the + latest version of the driver. + + If you want to do it manually, you should copy the files (dc390.h, + tmscsim.h, tmscsim.c, scsiiom.c and README.tmscsim) from this directory to + linux/drivers/scsi. You have to recompile your kernel/module of course. + + You should apply the three patches included in dc390-120-kernel.diff + (Applying them: cd /usr/src; patch -p0 <~/dc390-120-kernel.diff) + The patches are against 2.1.125, so you might have to manually resolve + rejections when applying to another kernel version. + + The patches will update the kernel startup code to allow boot parameters to + be passed to the driver, update the Documentation and finally offer you the + possibility to omit the non-DC390 parts of the driver. + (By selecting "Omit support for non DC390" you basically disable the + emulation of a DC390 EEPROM for non DC390 adapters. This saves a few bytes + of memory.) + +If you got a very old kernel without the tmscsim driver (pre 2.0.31) +I recommend upgrading your kernel. However, if you don't want to, please +contact me to get the appropriate patches. + + +Upgrading a SCSI driver is always a delicate thing to do. The 2.0 driver has +proven stable on many systems, but it's still a good idea to take some +precautions. In an ideal world you would have a full backup of your disks. +The world isn't ideal and most people don't have full backups (me neither). +So take at least the following measures: +* make your kernel remount the FS read-only on detecting an error: + tune2fs -e remount-ro /dev/sd?? +* have copies of your SCSI disk's partition tables on some safe location: + dd if=/dev/sda of=/mnt/floppy/sda bs=512 count=1 + or just print it with: + fdisk -l | lpr +* make sure you are able to boot Linux (e.g. from floppy disk using InitRD) + if your SCSI disk gets corrupted. You can use + ftp://student.physik.uni-dortmund.de/pub/linux/kernel/bootdisk.gz + +One more warning: I used to overclock my PCI bus to 41.67 MHz. My Tekram +DC390F (Sym53c875) accepted this as well as my Millenium. But the Am53C974 +produced errors and started to corrupt my disks. So don't do that! A 37.50 +MHz PCI bus works for me, though, but I don't recommend using higher clocks +than the 33.33 MHz being in the PCI spec. + +If you want to share the IRQ with another device and the driver refuses to +do so, you might succeed with changing the DC390_IRQ type in tmscsim.c to +SA_SHIRQ | SA_INTERRUPT. + + +3.Features +---------- +- SCSI + * Tagged command queueing + * Sync speed up to 10 MHz + * Disconnection + * Multiple LUNs + +- General / Linux interface + * Support for up to 4 AM53C974 adapters. + * DC390 EEPROM usage or boot/module params + * Information via cat /proc/scsi/tmscsim/? + * Dynamically configurable by writing to /proc/scsi/tmscsim/? + * Dynamic allocation of resources + * SMP support: Locking on io_request lock (Linux 2.1/2.2) or adapter + specific locks (Linux 2.5?) + * Uniform source code for Linux-2.x.y + * Support for dyn. addition/removal of devices via add/remove-single-device + (Try: echo "scsi add-single-device C B T U" >/proc/scsi/scsi + C = Controller, B = Bus, T = Target SCSI ID, U = Unit SCSI LUN.) + Use with care! + * Try to use the partition table for the determination of the mapping + + +4. Configuration via /proc/scsi/tmscsim/? +----------------------------------------- +First of all look at the output of /proc/scsi/tmscsim/? by typing + cat /proc/scsi/tmscsim/? +The "?" should be replaced by the SCSI host number. (The shell might do this +for you.) +You will see some info regarding the adapter and, at the end, a listing of +the attached devices and their settings. + +Here's an example: +garloff@kurt:/home/garloff > cat /proc/scsi/tmscsim/0 +Tekram DC390/AM53C974 PCI SCSI Host Adapter, Driver Version 2.0e7 2000-11-28 +SCSI Host Nr 1, AM53C974 Adapter Nr 0 +IOPortBase 0xb000, IRQ 10 +MaxID 8, MaxLUN 8, AdapterID 6, SelTimeout 250 ms, DelayReset 1 s +TagMaxNum 16, Status 0x00, ACBFlag 0x00, GlitchEater 24 ns +Statistics: Cmnds 1470165, Cmnds not sent directly 0, Out of SRB conds 0 + Lost arbitrations 587, Sel. connected 0, Connected: No +Nr of attached devices: 4, Nr of DCBs: 4 +Map of attached LUNs: 01 00 00 03 01 00 00 00 +Idx ID LUN Prty Sync DsCn SndS TagQ NegoPeriod SyncSpeed SyncOffs MaxCmd +00 00 00 Yes Yes Yes Yes Yes 100 ns 10.0 M 15 16 +01 03 00 Yes Yes Yes Yes No 100 ns 10.0 M 15 01 +02 03 01 Yes Yes Yes Yes No 100 ns 10.0 M 15 01 +03 04 00 Yes Yes Yes Yes No 100 ns 10.0 M 15 01 + +Note that the settings MaxID and MaxLUN are not zero- but one-based, which +means that a setting MaxLUN=4, will result in the support of LUNs 0..3. This +is somehow inconvenient, but the way the mid-level SCSI code expects it to be. + +ACB and DCB are acronyms for Adapter Control Block and Device Control Block. +These are data structures of the driver containing information about the +adapter and the connected SCSI devices respectively. + +Idx is the device index (just a consecutive number for the driver), ID and +LUN are the SCSI ID and LUN, Prty means Parity checking, Sync synchronous +negotiation, DsCn Disconnection, SndS Send Start command on startup (not +used by the driver) and TagQ Tagged Command Queueing. NegoPeriod and +SyncSpeed are somehow redundant, because they are reciprocal values +(1 / 112 ns = 8.9 MHz). At least in theory. The driver is able to adjust the +NegoPeriod more accurate (4ns) than the SyncSpeed (1 / 25ns). I don't know +if certain devices will have problems with this discrepancy. Max. speed is +10 MHz corresp. to a min. NegoPeriod of 100 ns. +(The driver allows slightly higher speeds if the devices (Ultra SCSI) accept +it, but that's out of adapter spec, on your own risk and unlikely to improve +performance. You're likely to crash your disks.) +SyncOffs is the offset used for synchronous negotiations; max. is 15. +The last values are only shown, if Sync is enabled. (NegoPeriod is still +displayed in brackets to show the values which will be used after enabling +Sync.) +MaxCmd ist the number of commands (=tags) which can be processed at the same +time by the device. + +If you want to change a setting, you can do that by writing to +/proc/scsi/tmscsim/?. Basically you have to imitate the output of driver. +(Don't use the brackets for NegoPeriod on Sync disabled devices.) +You don't have to care about capitalisation. The driver will accept space, +tab, comma, = and : as separators. + +There are three kinds of changes: + +(1) Change driver settings: + You type the names of the parameters and the params following it. + Example: + echo "MaxLUN=8 seltimeout 200" >/proc/scsi/tmscsim/0 + + Note that you can only change MaxID, MaxLUN, AdapterID, SelTimeOut, + TagMaxNum, ACBFlag, GlitchEater and DelayReset. Don't change ACBFlag + unless you want to see what happens, if the driver hangs. + +(2) Change device settings: You write a config line to the driver. The Nr + must match the ID and LUN given. If you give "-" as parameter, it is + ignored and the corresponding setting won't be changed. + You can use "y" or "n" instead of "Yes" and "No" if you want to. + You don't need to specify a full line. The driver automatically performs + an INQUIRY on the device if necessary to check if it is capable to operate + with the given settings (Sync, TagQ). + Examples: + echo "0 0 0 y y y - y - 10 " >/proc/scsi/tmscsim/0 + echo "3 5 0 y n y " >/proc/scsi/tmscsim/0 + + To give a short explanation of the first example: + The first three numbers, "0 0 0" (Device index 0, SCSI ID 0, SCSI LUN 0), + select the device to which the following parameters apply. Note that it + would be sufficient to use the index or both SCSI ID and LUN, but I chose + to require all three to have a syntax similar to the output. + The following "y y y - y" enables Parity checking, enables Synchronous + transfers, Disconnection, leaves Send Start (not used) untouched and + enables Tagged Command Queueing for the selected device. The "-" skips + the Negotiation Period setting but the "10" sets the max sync. speed to + 10 MHz. It's useless to specify both NegoPeriod and SyncSpeed as + discussed above. The values used in this example will result in maximum + performance. + +(3) Special commands: You can force a SCSI bus reset, an INQUIRY command, the + removal or the addition of a device's DCB and a SCSI register dump. + This is only used for debugging when you meet problems. The parameter of + the INQUIRY and REMOVE commands is the device index as shown by the + output of /proc/scsi/tmscsim/? in the device listing in the first column + (Idx). ADD takes the SCSI ID and LUN. + Examples: + echo "reset" >/proc/scsi/tmscsim/0 + echo "inquiry 1" >/proc/scsi/tmscsim/0 + echo "remove 2" >/proc/scsi/tmscsim/1 + echo "add 2 3" >/proc/scsi/tmscsim/? + echo "dump" >/proc/scsi/tmscsim/0 + + Note that you will meet problems when you REMOVE a device's DCB with the + remove command if it contains partitions which are mounted. Only use it + after unmounting its partitions, telling the SCSI mid-level code to + remove it (scsi remove-single-device) and you really need a few bytes of + memory. + The ADD command allows you to configure a device before you tell the + mid-level code to try detection. + + +I'd suggest reviewing the output of /proc/scsi/tmscsim/? after changing +settings to see if everything changed as requested. + + +5. Configuration via boot/module parameters +------------------------------------------- +With the DC390, the driver reads its EEPROM settings and tries to use them. +But you may want to override the settings prior to being able to change the +driver configuration via /proc/scsi/tmscsim/?. +If you do have another AM53C974 based adapter, that's even the only +possibility to adjust settings before you are able to write to the +/proc/scsi/tmscsim/? pseudo-file, e.g. if you want to use another +adapter ID than 7. +(BTW, the log message "DC390: No EEPROM found!" is normal without a DC390.) +For this purpose, you can pass options to the driver before it is initialised +by using kernel or module parameters. See lilo(8) or modprobe(1) manual +pages on how to pass params to the kernel or a module. +[NOTE: Formerly, it was not possible to override the EEPROM supplied + settings of the DC390 with cmd line parameters. This has changed since + 2.0e7] + +The syntax of the params is much shorter than the syntax of the /proc/... +interface. This makes it a little bit more difficult to use. However, long +parameter lines have the risk to be misinterpreted and the length of kernel +parameters is limited. + +As the support for non-DC390 adapters works by simulating the values of the +DC390 EEPROM, the settings are given in a DC390 BIOS' way. + +Here's the syntax: +tmscsim=AdaptID,SpdIdx,DevMode,AdaptMode,TaggedCmnds,DelayReset + +Each of the parameters is a number, containing the described information: + +* AdaptID: The SCSI ID of the host adapter. Must be in the range 0..7 + Default is 7. + +* SpdIdx: The index of the maximum speed as in the DC390 BIOS. The values + 0..7 mean 10, 8.0, 6.7, 5.7, 5.0, 4.0, 3.1 and 2 MHz resp. Default is + 0 (10.0 MHz). + +* DevMode is a bit mapped value describing the per-device features. It + applies to all devices. (Sync, Disc and TagQ will only apply, if the + device supports it.) The meaning of the bits (* = default): + + Bit Val(hex) Val(dec) Meaning + *0 0x01 1 Parity check + *1 0x02 2 Synchronous Negotiation + *2 0x04 4 Disconnection + *3 0x08 8 Send Start command on startup. (Not used) + *4 0x10 16 Tagged Command Queueing + + As usual, the desired value is obtained by adding the wanted values. If + you want to enable all values, e.g., you would use 31(0x1f). Default is 31. + +* AdaptMode is a bit mapped value describing the enabled adapter features. + + Bit Val(hex) Val(dec) Meaning + *0 0x01 1 Support more than two drives. (Not used) + *1 0x02 2 Use DOS compatible mapping for HDs greater than 1GB. + *2 0x04 4 Reset SCSI Bus on startup. + *3 0x08 8 Active Negation: Improves SCSI Bus noise immunity. + 4 0x10 16 Immediate return on BIOS seek command. (Not used) + (*)5 0x20 32 Check for LUNs >= 1. + + The default for LUN Check depends on CONFIG_SCSI_MULTI_LUN. + +* TaggedCmnds is a number indicating the maximum number of Tagged Commands. + It is the binary logarithm - 1 of the actual number. Max is 4 (32). + Value Number of Tagged Commands + 0 2 + 1 4 + 2 8 + *3 16 + 4 32 + +* DelayReset is the time in seconds (minus 0.5s), the adapter waits, after a + bus reset. Default is 1 (corresp. to 1.5s). + +Example: + modprobe tmscsim tmscsim=6,2,31 +would set the adapter ID to 6, max. speed to 6.7 MHz, enable all device +features and leave the adapter features, the number of Tagged Commands +and the Delay after a reset to the defaults. + +As you can see, you don't need to specify all of the six params. +If you want values to be ignored (i.e. the EEprom settings or the defaults +will be used), you may pass -2 (not 0!) at the corresponding position. + +The defaults (7,0,31,15,3,1) are aggressive to allow good performance. You +can use tmscsim=7,0,31,63,4,0 for maximum performance, if your SCSI chain +allows it. If you meet problems, you can use tmscsim=-1 which is a shortcut +for tmscsim=7,4,9,15,2,10. + + +6. Potential improvements +------------------------- +Most of the intended work on the driver has been done. Here are a few ideas +to further improve its usability: + +* Cleanly separate per-Target and per-LUN properties (DCB) +* More intelligent abort() routine +* Use new_eh code (Linux-2.1+) +* Have the mid-level (ML) code (and not the driver) handle more of the + various conditions. +* Command queueing in the driver: Eliminate Query list and use ML instead. +* More user friendly boot/module param syntax + +Further investigation on these problems: + +* Driver hangs with sync readcdda (xcdroast) (most probably VIA PCI error) + +Known problems: +Please see http://www.garloff.de/kurt/linux/dc390/problems.html + +* Changing the parameters of multi-lun by the tmscsim/? interface will + cause problems, cause these settings are mostly per Target and not per LUN + and should be updated accordingly. To be fixed for 2.0d24. +* CDRs (eg Yam CRW4416) not recognized, because some buggy devices don't + recover from a SCSI reset in time. Use a higher delay or don't issue + a SCSI bus reset on driver initialization. See problems page. + For the CRW4416S, this seems to be solved with firmware 1.0g (reported by + Jean-Yves Barbier). +* TEAC CD-532S not being recognized. (Works with 1.11). +* Scanners (eg. Astra UMAX 1220S) don't work: Disable Sync Negotiation. + If this does not help, try echo "INQUIRY t" >/proc/scsi/tmscsim/? (t + replaced by the dev index of your scanner). You may try to reset your SCSI + bus afterwards (echo "RESET" >/proc/scsi/tmscsim/?). + The problem seems to be solved as of 2.0d18, thanks to Andreas Rick. +* If there is a valid partition table, the driver will use it for determing + the mapping. If there's none, a reasonable mapping (Symbios-like) will be + assumed. Other operating systems may not like this mapping, though + it's consistent with the BIOS' behaviour. Old DC390 drivers ignored the + partition table and used a H/S = 64/32 or 255/63 translation. So if you + want to be compatible to those, use this old mapping when creating + partition tables. Even worse, on bootup the DC390 might complain if other + mappings are found, so auto rebooting may fail. +* In some situations, the driver will get stuck in an abort loop. This is a + bad interaction between the Mid-Layer of Linux' SCSI code and the driver. + Try to disable DsCn, if you meet this problem. Please contact me for + further debugging. + + +7. Bug reports, debugging and updates +------------------------------------- +Whenever you have problems with the driver, you are invited to ask the +author for help. However, I'd suggest reading the docs and trying to solve +the problem yourself, first. +If you find something, which you believe to be a bug, please report it to me. +Please append the output of /proc/scsi/scsi, /proc/scsi/tmscsim/? and +maybe the DC390 log messages to the report. + +Bug reports should be send to me (Kurt Garloff <dc390@garloff.de>) as well +as to the linux-scsi list (<linux-scsi@vger.kernel.org>), as sometimes bugs +are caused by the SCSI mid-level code. + +I will ask you for some more details and probably I will also ask you to +enable some of the DEBUG options in the driver (tmscsim.c:DC390_DEBUGXXX +defines). The driver will produce some data for the syslog facility then. +Beware: If your syslog gets written to a SCSI disk connected to your +AM53C974, the logging might produce log output again, and you might end +having your box spending most of its time doing the logging. + +The latest version of the driver can be found at: + http://www.garloff.de/kurt/linux/dc390/ + ftp://ftp.suse.com/pub/people/garloff/linux/dc390/ + + +8. Acknowledgements +------------------- +Thanks to Linus Torvalds, Alan Cox, the FSF people, the XFree86 team and +all the others for the wonderful OS and software. +Thanks to C.L. Huang and Philip Giang (Tekram) for the initial driver +release and support. +Thanks to Doug Ledford, Gérard Roudier for support with SCSI coding. +Thanks to a lot of people (espec. Chiaki Ishikawa, Andreas Haumer, Hubert +Tonneau) for intensively testing the driver (and even risking data loss +doing this during early revisions). +Recently, SuSE GmbH, Nuernberg, FRG, has been paying me for the driver +development and maintenance. Special thanks! + + +9. Copyright +------------ + This driver is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; version 2 of the License. + If you want to use any later version of the GNU GPL, you will probably + be allowed to, but you have to ask me and Tekram <erich@tekram.com.tw> + before. + +------------------------------------------------------------------------- +Written by Kurt Garloff <kurt@garloff.de> 1998/06/11 +Last updated 2000/11/28, driver revision 2.0e7 +$Id: README.tmscsim,v 2.25.2.7 2000/12/20 01:07:12 garloff Exp $ diff --git a/Documentation/seclvl.txt b/Documentation/seclvl.txt new file mode 100644 index 000000000000..97274d122d0e --- /dev/null +++ b/Documentation/seclvl.txt @@ -0,0 +1,97 @@ +BSD Secure Levels Linux Security Module +Michael A. Halcrow <mike@halcrow.us> + + +Introduction + +Under the BSD Secure Levels security model, sets of policies are +associated with levels. Levels range from -1 to 2, with -1 being the +weakest and 2 being the strongest. These security policies are +enforced at the kernel level, so not even the superuser is able to +disable or circumvent them. This hardens the machine against attackers +who gain root access to the system. + + +Levels and Policies + +Level -1 (Permanently Insecure): + - Cannot increase the secure level + +Level 0 (Insecure): + - Cannot ptrace the init process + +Level 1 (Default): + - /dev/mem and /dev/kmem are read-only + - IMMUTABLE and APPEND extended attributes, if set, may not be unset + - Cannot load or unload kernel modules + - Cannot write directly to a mounted block device + - Cannot perform raw I/O operations + - Cannot perform network administrative tasks + - Cannot setuid any file + +Level 2 (Secure): + - Cannot decrement the system time + - Cannot write to any block device, whether mounted or not + - Cannot unmount any mounted filesystems + + +Compilation + +To compile the BSD Secure Levels LSM, seclvl.ko, enable the +SECURITY_SECLVL configuration option. This is found under Security +options -> BSD Secure Levels in the kernel configuration menu. + + +Basic Usage + +Once the machine is in a running state, with all the necessary modules +loaded and all the filesystems mounted, you can load the seclvl.ko +module: + +# insmod seclvl.ko + +The module defaults to secure level 1, except when compiled directly +into the kernel, in which case it defaults to secure level 0. To raise +the secure level to 2, the administrator writes ``2'' to the +seclvl/seclvl file under the sysfs mount point (assumed to be /sys in +these examples): + +# echo -n "2" > /sys/seclvl/seclvl + +Alternatively, you can initialize the module at secure level 2 with +the initlvl module parameter: + +# insmod seclvl.ko initlvl=2 + +At this point, it is impossible to remove the module or reduce the +secure level. If the administrator wishes to have the option of doing +so, he must provide a module parameter, sha1_passwd, that specifies +the SHA1 hash of the password that can be used to reduce the secure +level to 0. + +To generate this SHA1 hash, the administrator can use OpenSSL: + +# echo -n "boogabooga" | openssl sha1 +abeda4e0f33defa51741217592bf595efb8d289c + +In order to use password-instigated secure level reduction, the SHA1 +crypto module must be loaded or compiled into the kernel: + +# insmod sha1.ko + +The administrator can then insmod the seclvl module, including the +SHA1 hash of the password: + +# insmod seclvl.ko + sha1_passwd=abeda4e0f33defa51741217592bf595efb8d289c + +To reduce the secure level, write the password to seclvl/passwd under +your sysfs mount point: + +# echo -n "boogabooga" > /sys/seclvl/passwd + +The September 2004 edition of Sys Admin Magazine has an article about +the BSD Secure Levels LSM. I encourage you to refer to that article +for a more in-depth treatment of this security module: + +http://www.samag.com/documents/s=9304/sam0409a/0409a.htm diff --git a/Documentation/serial-console.txt b/Documentation/serial-console.txt new file mode 100644 index 000000000000..6c689b0df2b8 --- /dev/null +++ b/Documentation/serial-console.txt @@ -0,0 +1,104 @@ + Linux Serial Console + +To use a serial port as console you need to compile the support into your +kernel - by default it is not compiled in. For PC style serial ports +it's the config option next to "Standard/generic (dumb) serial support". +You must compile serial support into the kernel and not as a module. + +It is possible to specify multiple devices for console output. You can +define a new kernel command line option to select which device(s) to +use for console output. + +The format of this option is: + + console=device,options + + device: tty0 for the foreground virtual console + ttyX for any other virtual console + ttySx for a serial port + lp0 for the first parallel port + + options: depend on the driver. For the serial port this + defines the baudrate/parity/bits of the port, + in the format BBBBPN, where BBBB is the speed, + P is parity (n/o/e), and N is bits. Default is + 9600n8. The maximum baudrate is 115200. + +You can specify multiple console= options on the kernel command line. +Output will appear on all of them. The last device will be used when +you open /dev/console. So, for example: + + console=ttyS1,9600 console=tty0 + +defines that opening /dev/console will get you the current foreground +virtual console, and kernel messages will appear on both the VGA +console and the 2nd serial port (ttyS1 or COM2) at 9600 baud. + +Note that you can only define one console per device type (serial, video). + +If no console device is specified, the first device found capable of +acting as a system console will be used. At this time, the system +first looks for a VGA card and then for a serial port. So if you don't +have a VGA card in your system the first serial port will automatically +become the console. + +You will need to create a new device to use /dev/console. The official +/dev/console is now character device 5,1. + +Here's an example that will use /dev/ttyS1 (COM2) as the console. +Replace the sample values as needed. + +1. Create /dev/console (real console) and /dev/tty0 (master virtual + console): + + cd /dev + rm -f console tty0 + mknod -m 622 console c 5 1 + mknod -m 622 tty0 c 4 0 + +2. LILO can also take input from a serial device. This is a very + useful option. To tell LILO to use the serial port: + In lilo.conf (global section): + + serial = 1,9600n8 (ttyS1, 9600 bd, no parity, 8 bits) + +3. Adjust to kernel flags for the new kernel, + again in lilo.conf (kernel section) + + append = "console=ttyS1,9600" + +4. Make sure a getty runs on the serial port so that you can login to + it once the system is done booting. This is done by adding a line + like this to /etc/inittab (exact syntax depends on your getty): + + S1:23:respawn:/sbin/getty -L ttyS1 9600 vt100 + +5. Init and /etc/ioctl.save + + Sysvinit remembers its stty settings in a file in /etc, called + `/etc/ioctl.save'. REMOVE THIS FILE before using the serial + console for the first time, because otherwise init will probably + set the baudrate to 38400 (baudrate of the virtual console). + +6. /dev/console and X + Programs that want to do something with the virtual console usually + open /dev/console. If you have created the new /dev/console device, + and your console is NOT the virtual console some programs will fail. + Those are programs that want to access the VT interface, and use + /dev/console instead of /dev/tty0. Some of those programs are: + + Xfree86, svgalib, gpm, SVGATextMode + + It should be fixed in modern versions of these programs though. + + Note that if you boot without a console= option (or with + console=/dev/tty0), /dev/console is the same as /dev/tty0. In that + case everything will still work. + +7. Thanks + + Thanks to Geert Uytterhoeven <geert@linux-m68k.org> + for porting the patches from 2.1.4x to 2.1.6x for taking care of + the integration of these patches into m68k, ppc and alpha. + +Miquel van Smoorenburg <miquels@cistron.nl>, 11-Jun-2000 diff --git a/Documentation/serial/driver b/Documentation/serial/driver new file mode 100644 index 000000000000..e9c0178cd202 --- /dev/null +++ b/Documentation/serial/driver @@ -0,0 +1,330 @@ + + Low Level Serial API + -------------------- + + + $Id: driver,v 1.10 2002/07/22 15:27:30 rmk Exp $ + + +This document is meant as a brief overview of some aspects of the new serial +driver. It is not complete, any questions you have should be directed to +<rmk@arm.linux.org.uk> + +The reference implementation is contained within serial_amba.c. + + + +Low Level Serial Hardware Driver +-------------------------------- + +The low level serial hardware driver is responsible for supplying port +information (defined by uart_port) and a set of control methods (defined +by uart_ops) to the core serial driver. The low level driver is also +responsible for handling interrupts for the port, and providing any +console support. + + +Console Support +--------------- + +The serial core provides a few helper functions. This includes identifing +the correct port structure (via uart_get_console) and decoding command line +arguments (uart_parse_options). + + +Locking +------- + +It is the responsibility of the low level hardware driver to perform the +necessary locking using port->lock. There are some exceptions (which +are described in the uart_ops listing below.) + +There are three locks. A per-port spinlock, a per-port tmpbuf semaphore, +and an overall semaphore. + +From the core driver perspective, the port->lock locks the following +data: + + port->mctrl + port->icount + info->xmit.head (circ->head) + info->xmit.tail (circ->tail) + +The low level driver is free to use this lock to provide any additional +locking. + +The core driver uses the info->tmpbuf_sem lock to prevent multi-threaded +access to the info->tmpbuf bouncebuffer used for port writes. + +The port_sem semaphore is used to protect against ports being added/ +removed or reconfigured at inappropriate times. + + +uart_ops +-------- + +The uart_ops structure is the main interface between serial_core and the +hardware specific driver. It contains all the methods to control the +hardware. + + tx_empty(port) + This function tests whether the transmitter fifo and shifter + for the port described by 'port' is empty. If it is empty, + this function should return TIOCSER_TEMT, otherwise return 0. + If the port does not support this operation, then it should + return TIOCSER_TEMT. + + Locking: none. + Interrupts: caller dependent. + This call must not sleep + + set_mctrl(port, mctrl) + This function sets the modem control lines for port described + by 'port' to the state described by mctrl. The relevant bits + of mctrl are: + - TIOCM_RTS RTS signal. + - TIOCM_DTR DTR signal. + - TIOCM_OUT1 OUT1 signal. + - TIOCM_OUT2 OUT2 signal. + If the appropriate bit is set, the signal should be driven + active. If the bit is clear, the signal should be driven + inactive. + + Locking: port->lock taken. + Interrupts: locally disabled. + This call must not sleep + + get_mctrl(port) + Returns the current state of modem control inputs. The state + of the outputs should not be returned, since the core keeps + track of their state. The state information should include: + - TIOCM_DCD state of DCD signal + - TIOCM_CTS state of CTS signal + - TIOCM_DSR state of DSR signal + - TIOCM_RI state of RI signal + The bit is set if the signal is currently driven active. If + the port does not support CTS, DCD or DSR, the driver should + indicate that the signal is permanently active. If RI is + not available, the signal should not be indicated as active. + + Locking: none. + Interrupts: caller dependent. + This call must not sleep + + stop_tx(port,tty_stop) + Stop transmitting characters. This might be due to the CTS + line becoming inactive or the tty layer indicating we want + to stop transmission. + + tty_stop: 1 if this call is due to the TTY layer issuing a + TTY stop to the driver (equiv to rs_stop). + + Locking: port->lock taken. + Interrupts: locally disabled. + This call must not sleep + + start_tx(port,tty_start) + start transmitting characters. (incidentally, nonempty will + always be nonzero, and shouldn't be used - it will be dropped). + + tty_start: 1 if this call was due to the TTY layer issuing + a TTY start to the driver (equiv to rs_start) + + Locking: port->lock taken. + Interrupts: locally disabled. + This call must not sleep + + stop_rx(port) + Stop receiving characters; the port is in the process of + being closed. + + Locking: port->lock taken. + Interrupts: locally disabled. + This call must not sleep + + enable_ms(port) + Enable the modem status interrupts. + + Locking: port->lock taken. + Interrupts: locally disabled. + This call must not sleep + + break_ctl(port,ctl) + Control the transmission of a break signal. If ctl is + nonzero, the break signal should be transmitted. The signal + should be terminated when another call is made with a zero + ctl. + + Locking: none. + Interrupts: caller dependent. + This call must not sleep + + startup(port) + Grab any interrupt resources and initialise any low level driver + state. Enable the port for reception. It should not activate + RTS nor DTR; this will be done via a separate call to set_mctrl. + + Locking: port_sem taken. + Interrupts: globally disabled. + + shutdown(port) + Disable the port, disable any break condition that may be in + effect, and free any interrupt resources. It should not disable + RTS nor DTR; this will have already been done via a separate + call to set_mctrl. + + Locking: port_sem taken. + Interrupts: caller dependent. + + set_termios(port,termios,oldtermios) + Change the port parameters, including word length, parity, stop + bits. Update read_status_mask and ignore_status_mask to indicate + the types of events we are interested in receiving. Relevant + termios->c_cflag bits are: + CSIZE - word size + CSTOPB - 2 stop bits + PARENB - parity enable + PARODD - odd parity (when PARENB is in force) + CREAD - enable reception of characters (if not set, + still receive characters from the port, but + throw them away. + CRTSCTS - if set, enable CTS status change reporting + CLOCAL - if not set, enable modem status change + reporting. + Relevant termios->c_iflag bits are: + INPCK - enable frame and parity error events to be + passed to the TTY layer. + BRKINT + PARMRK - both of these enable break events to be + passed to the TTY layer. + + IGNPAR - ignore parity and framing errors + IGNBRK - ignore break errors, If IGNPAR is also + set, ignore overrun errors as well. + The interaction of the iflag bits is as follows (parity error + given as an example): + Parity error INPCK IGNPAR + None n/a n/a character received + Yes n/a 0 character discarded + Yes 0 1 character received, marked as + TTY_NORMAL + Yes 1 1 character received, marked as + TTY_PARITY + + Other flags may be used (eg, xon/xoff characters) if your + hardware supports hardware "soft" flow control. + + Locking: none. + Interrupts: caller dependent. + This call must not sleep + + pm(port,state,oldstate) + Perform any power management related activities on the specified + port. State indicates the new state (defined by ACPI D0-D3), + oldstate indicates the previous state. Essentially, D0 means + fully on, D3 means powered down. + + This function should not be used to grab any resources. + + This will be called when the port is initially opened and finally + closed, except when the port is also the system console. This + will occur even if CONFIG_PM is not set. + + Locking: none. + Interrupts: caller dependent. + + type(port) + Return a pointer to a string constant describing the specified + port, or return NULL, in which case the string 'unknown' is + substituted. + + Locking: none. + Interrupts: caller dependent. + + release_port(port) + Release any memory and IO region resources currently in use by + the port. + + Locking: none. + Interrupts: caller dependent. + + request_port(port) + Request any memory and IO region resources required by the port. + If any fail, no resources should be registered when this function + returns, and it should return -EBUSY on failure. + + Locking: none. + Interrupts: caller dependent. + + config_port(port,type) + Perform any autoconfiguration steps required for the port. `type` + contains a bit mask of the required configuration. UART_CONFIG_TYPE + indicates that the port requires detection and identification. + port->type should be set to the type found, or PORT_UNKNOWN if + no port was detected. + + UART_CONFIG_IRQ indicates autoconfiguration of the interrupt signal, + which should be probed using standard kernel autoprobing techniques. + This is not necessary on platforms where ports have interrupts + internally hard wired (eg, system on a chip implementations). + + Locking: none. + Interrupts: caller dependent. + + verify_port(port,serinfo) + Verify the new serial port information contained within serinfo is + suitable for this port type. + + Locking: none. + Interrupts: caller dependent. + + ioctl(port,cmd,arg) + Perform any port specific IOCTLs. IOCTL commands must be defined + using the standard numbering system found in <asm/ioctl.h> + + Locking: none. + Interrupts: caller dependent. + +Other functions +--------------- + +uart_update_timeout(port,cflag,quot) + Update the FIFO drain timeout, port->timeout, according to the + number of bits, parity, stop bits and quotient. + + Locking: caller is expected to take port->lock + Interrupts: n/a + +uart_get_baud_rate(port,termios) + Return the numeric baud rate for the specified termios, taking + account of the special 38400 baud "kludge". The B0 baud rate + is mapped to 9600 baud. + + Locking: caller dependent. + Interrupts: n/a + +uart_get_divisor(port,termios,oldtermios) + Return the divsor (baud_base / baud) for the selected baud rate + specified by termios. If the baud rate is out of range, try + the original baud rate specified by oldtermios (if non-NULL). + If that fails, try 9600 baud. + + If 38400 baud and custom divisor is selected, return the + custom divisor instead. + + Locking: caller dependent. + Interrupts: n/a + +Other notes +----------- + +It is intended some day to drop the 'unused' entries from uart_port, and +allow low level drivers to register their own individual uart_port's with +the core. This will allow drivers to use uart_port as a pointer to a +structure containing both the uart_port entry with their own extensions, +thus: + + struct my_port { + struct uart_port port; + int my_stuff; + }; diff --git a/Documentation/sgi-visws.txt b/Documentation/sgi-visws.txt new file mode 100644 index 000000000000..7ff0811ca2ba --- /dev/null +++ b/Documentation/sgi-visws.txt @@ -0,0 +1,13 @@ + +The SGI Visual Workstations (models 320 and 540) are based around +the Cobalt, Lithium, and Arsenic ASICs. The Cobalt ASIC is the +main system ASIC which interfaces the 1-4 IA32 cpus, the memory +system, and the I/O system in the Lithium ASIC. The Cobalt ASIC +also contains the 3D gfx rendering engine which renders to main +system memory -- part of which is used as the frame buffer which +is DMA'ed to a video connector using the Arsenic ASIC. A PIIX4 +chip and NS87307 are used to provide legacy device support (IDE, +serial, floppy, and parallel). + +The Visual Workstation chipset largely conforms to the PC architecture +with some notable exceptions such as interrupt handling. diff --git a/Documentation/sh/kgdb.txt b/Documentation/sh/kgdb.txt new file mode 100644 index 000000000000..5b04f7f306fc --- /dev/null +++ b/Documentation/sh/kgdb.txt @@ -0,0 +1,179 @@ + +This file describes the configuration and behavior of KGDB for the SH +kernel. Based on a description from Henry Bell <henry.bell@st.com>, it +has been modified to account for quirks in the current implementation. + +Version +======= + +This version of KGDB was written for 2.4.xx kernels for the SH architecture. +Further documentation is available from the linux-sh project website. + + +Debugging Setup: Host +====================== + +The two machines will be connected together via a serial line - this +should be a null modem cable i.e. with a twist. + +On your DEVELOPMENT machine, go to your kernel source directory and +build the kernel, enabling KGDB support in the "kernel hacking" section. +This includes the KGDB code, and also makes the kernel be compiled with +the "-g" option set -- necessary for debugging. + +To install this new kernel, use the following installation procedure. + +Decide on which tty port you want the machines to communicate, then +cable them up back-to-back using the null modem. On the DEVELOPMENT +machine, you may wish to create an initialization file called .gdbinit +(in the kernel source directory or in your home directory) to execute +commonly-used commands at startup. + +A minimal .gdbinit might look like this: + + file vmlinux + set remotebaud 115200 + target remote /dev/ttyS0 + +Change the "target" definition so that it specifies the tty port that +you intend to use. Change the "remotebaud" definition to match the +data rate that you are going to use for the com line (115200 is the +default). + +Debugging Setup: Target +======================== + +By default, the KGDB stub will communicate with the host GDB using +ttySC1 at 115200 baud, 8 databits, no parity; these defaults can be +changed in the kernel configuration. As the kernel starts up, KGDB will +initialize so that breakpoints, kernel segfaults, and so forth will +generally enter the debugger. + +This behavior can be modified by including the "kgdb" option in the +kernel command line; this option has the general form: + + kgdb=<ttyspec>,<action> + +The <ttyspec> indicates the port to use, and can optionally specify +baud, parity and databits -- e.g. "ttySC0,9600N8" or "ttySC1,19200". + +The <action> can be "halt" or "disabled". The "halt" action enters the +debugger via a breakpoint as soon as kgdb is initialized; the "disabled" +action causes kgdb to ignore kernel segfaults and such until explicitly +entered by a breakpoint in the code or by external action (sysrq or NMI). + +(Both <ttyspec> and <action> can appear alone, w/o the separating comma.) + +For example, if you wish to debug early in kernel startup code, you +might specify the halt option: + + kgdb=halt + +Boot the TARGET machinem, which will appear to hang. + +On your DEVELOPMENT machine, cd to the source directory and run the gdb +program. (This is likely to be a cross GDB which runs on your host but +is built for an SH target.) If everything is working correctly you +should see gdb print out a few lines indicating that a breakpoint has +been taken. It will actually show a line of code in the target kernel +inside the gdbstub activation code. + +NOTE: BE SURE TO TERMINATE OR SUSPEND any other host application which +may be using the same serial port (for example, a terminal emulator you +have been using to connect to the target boot code.) Otherwise, data +from the target may not all get to GDB! + +You can now use whatever gdb commands you like to set breakpoints. +Enter "continue" to start your target machine executing again. At this +point the target system will run at full speed until it encounters +your breakpoint or gets a segment violation in the kernel, or whatever. + +Serial Ports: KGDB, Console +============================ + +This version of KGDB may not gracefully handle conflict with other +drivers in the kernel using the same port. If KGDB is configured on the +same port (and with the same parameters) as the kernel console, or if +CONFIG_SH_KGDB_CONSOLE is configured, things should be fine (though in +some cases console messages may appear twice through GDB). But if the +KGDB port is not the kernel console and used by another serial driver +which assumes different serial parameters (e.g. baud rate) KGDB may not +recover. + +Also, when KGDB is entered via sysrq-g (requires CONFIG_KGDB_SYSRQ) and +the kgdb port uses the same port as the console, detaching GDB will not +restore the console to working order without the port being re-opened. + +Another serious consequence of this is that GDB currently CANNOT break +into KGDB externally (e.g. via ^C or <BREAK>); unless a breakpoint or +error is encountered, the only way to enter KGDB after the initial halt +(see above) is via NMI (CONFIG_KGDB_NMI) or sysrq-g (CONFIG_KGDB_SYSRQ). + +Code is included for the basic Hitachi Solution Engine boards to allow +the use of ttyS0 for KGDB if desired; this is less robust, but may be +useful in some cases. (This cannot be selected using the config file, +but only through the kernel command line, e.g. "kgdb=ttyS0", though the +configured defaults for baud rate etc. still apply if not overridden.) + +If gdbstub Does Not Work +======================== + +If it doesn't work, you will have to troubleshoot it. Do the easy +things first like double checking your cabling and data rates. You +might try some non-kernel based programs to see if the back-to-back +connection works properly. Just something simple like cat /etc/hosts +/dev/ttyS0 on one machine and cat /dev/ttyS0 on the other will tell you +if you can send data from one machine to the other. There is no point +in tearing out your hair in the kernel if the line doesn't work. + +If you need to debug the GDB/KGDB communication itself, the gdb commands +"set debug remote 1" and "set debug serial 1" may be useful, but be +warned: they produce a lot of output. + +Threads +======= + +Each process in a target machine is seen as a gdb thread. gdb thread related +commands (info threads, thread n) can be used. CONFIG_KGDB_THREAD must +be defined for this to work. + +In this version, kgdb reports PID_MAX (32768) as the process ID for the +idle process (pid 0), since GDB does not accept 0 as an ID. + +Detaching (exiting KGDB) +========================= + +There are two ways to resume full-speed target execution: "continue" and +"detach". With "continue", GDB inserts any specified breakpoints in the +target code and resumes execution; the target is still in "gdb mode". +If a breakpoint or other debug event (e.g. NMI) happens, the target +halts and communicates with GDB again, which is waiting for it. + +With "detach", GDB does *not* insert any breakpoints; target execution +is resumed and GDB stops communicating (does not wait for the target). +In this case, the target is no longer in "gdb mode" -- for example, +console messages no longer get sent separately to the KGDB port, or +encapsulated for GDB. If a debug event (e.g. NMI) occurs, the target +will re-enter "gdb mode" and will display this fact on the console; you +must give a new "target remote" command to gdb. + +NOTE: TO AVOID LOSSING CONSOLE MESSAGES IN CASE THE KERNEL CONSOLE AND +KGDB USING THE SAME PORT, THE TARGET WAITS FOR ANY INPUT CHARACTER ON +THE KGDB PORT AFTER A DETACH COMMAND. For example, after the detach you +could start a terminal emulator on the same host port and enter a <cr>; +however, this program must then be terminated or suspended in order to +use GBD again if KGDB is re-entered. + + +Acknowledgements +================ + +This code was mostly generated by Henry Bell <henry.bell@st.com>; +largely from KGDB by Amit S. Kale <akale@veritas.com> - extracts from +code by Glenn Engel, Jim Kingdon, David Grothe <dave@gcom.com>, Tigran +Aivazian <tigran@sco.com>, William Gatliff <bgat@open-widgets.com>, Ben +Lee, Steve Chamberlain and Benoit Miller <fulg@iname.com> are also +included. + +Jeremy Siegel +<jsiegel@mvista.com> diff --git a/Documentation/sh/new-machine.txt b/Documentation/sh/new-machine.txt new file mode 100644 index 000000000000..eb2dd2e6993b --- /dev/null +++ b/Documentation/sh/new-machine.txt @@ -0,0 +1,306 @@ + + Adding a new board to LinuxSH + ================================ + + Paul Mundt <lethal@linux-sh.org> + +This document attempts to outline what steps are necessary to add support +for new boards to the LinuxSH port under the new 2.5 and 2.6 kernels. This +also attempts to outline some of the noticeable changes between the 2.4 +and the 2.5/2.6 SH backend. + +1. New Directory Structure +========================== + +The first thing to note is the new directory structure. Under 2.4, most +of the board-specific code (with the exception of stboards) ended up +in arch/sh/kernel/ directly, with board-specific headers ending up in +include/asm-sh/. For the new kernel, things are broken out by board type, +companion chip type, and CPU type. Looking at a tree view of this directory +heirarchy looks like the following: + +Board-specific code: + +. +|-- arch +| `-- sh +| `-- boards +| |-- adx +| | `-- board-specific files +| |-- bigsur +| | `-- board-specific files +| | +| ... more boards here ... +| +`-- include + `-- asm-sh + |-- adx + | `-- board-specific headers + |-- bigsur + | `-- board-specific headers + | + .. more boards here ... + +It should also be noted that each board is required to have some certain +headers. At the time of this writing, io.h is the only thing that needs +to be provided for each board, and can generally just reference generic +functions (with the exception of isa_port2addr). + +Next, for companion chips: +. +`-- arch + `-- sh + `-- cchips + `-- hd6446x + |-- hd64461 + | `-- cchip-specific files + `-- hd64465 + `-- cchip-specific files + +... and so on. Headers for the companion chips are treated the same way as +board-specific headers. Thus, include/asm-sh/hd64461 is home to all of the +hd64461-specific headers. + +Finally, CPU family support is also abstracted: +. +|-- arch +| `-- sh +| |-- kernel +| | `-- cpu +| | |-- sh2 +| | | `-- SH-2 generic files +| | |-- sh3 +| | | `-- SH-3 generic files +| | `-- sh4 +| | `-- SH-4 generic files +| `-- mm +| `-- This is also broken out per CPU family, so each family can +| have their own set of cache/tlb functions. +| +`-- include + `-- asm-sh + |-- cpu-sh2 + | `-- SH-2 specific headers + |-- cpu-sh3 + | `-- SH-3 specific headers + `-- cpu-sh4 + `-- SH-4 specific headers + +It should be noted that CPU subtypes are _not_ abstracted. Thus, these still +need to be dealt with by the CPU family specific code. + +2. Adding a New Board +===================== + +The first thing to determine is whether the board you are adding will be +isolated, or whether it will be part of a family of boards that can mostly +share the same board-specific code with minor differences. + +In the first case, this is just a matter of making a directory for your +board in arch/sh/boards/ and adding rules to hook your board in with the +build system (more on this in the next section). However, for board families +it makes more sense to have a common top-level arch/sh/boards/ directory +and then populate that with sub-directories for each member of the family. +Both the Solution Engine and the hp6xx boards are an example of this. + +After you have setup your new arch/sh/boards/ directory, remember that you +also must add a directory in include/asm-sh for headers localized to this +board. In order to interoperate seamlessly with the build system, it's best +to have this directory the same as the arch/sh/boards/ directory name, +though if your board is again part of a family, the build system has ways +of dealing with this, and you can feel free to name the directory after +the family member itself. + +There are a few things that each board is required to have, both in the +arch/sh/boards and the include/asm-sh/ heirarchy. In order to better +explain this, we use some examples for adding an imaginary board. For +setup code, we're required at the very least to provide definitions for +get_system_type() and platform_setup(). For our imaginary board, this +might look something like: + +/* + * arch/sh/boards/vapor/setup.c - Setup code for imaginary board + */ +#include <linux/init.h> + +const char *get_system_type(void) +{ + return "FooTech Vaporboard"; +} + +int __init platform_setup(void) +{ + /* + * If our hardware actually existed, we would do real + * setup here. Though it's also sane to leave this empty + * if there's no real init work that has to be done for + * this board. + */ + + /* + * Presume all FooTech boards have the same broken timer, + * and also presume that we've defined foo_timer_init to + * do something useful. + */ + board_time_init = foo_timer_init; + + /* Start-up imaginary PCI ... */ + + /* And whatever else ... */ + + return 0; +} + +Our new imaginary board will also have to tie into the machvec in order for it +to be of any use. Currently the machvec is slowly on its way out, but is still +required for the time being. As such, let us take a look at what needs to be +done for the machvec assignment. + +machvec functions fall into a number of categories: + + - I/O functions to IO memory (inb etc) and PCI/main memory (readb etc). + - I/O remapping functions (ioremap etc) + - some initialisation functions + - a 'heartbeat' function + - some miscellaneous flags + +The tree can be built in two ways: + - as a fully generic build. All drivers are linked in, and all functions + go through the machvec + - as a machine specific build. In this case only the required drivers + will be linked in, and some macros may be redefined to not go through + the machvec where performance is important (in particular IO functions). + +There are three ways in which IO can be performed: + - none at all. This is really only useful for the 'unknown' machine type, + which us designed to run on a machine about which we know nothing, and + so all all IO instructions do nothing. + - fully custom. In this case all IO functions go to a machine specific + set of functions which can do what they like + - a generic set of functions. These will cope with most situations, + and rely on a single function, mv_port2addr, which is called through the + machine vector, and converts an IO address into a memory address, which + can be read from/written to directly. + +Thus adding a new machine involves the following steps (I will assume I am +adding a machine called vapor): + + - add a new file include/asm-sh/vapor/io.h which contains prototypes for + any machine specific IO functions prefixed with the machine name, for + example vapor_inb. These will be needed when filling out the machine + vector. + + This is the minimum that is required, however there are ample + opportunities to optimise this. In particular, by making the prototypes + inline function definitions, it is possible to inline the function when + building machine specific versions. Note that the machine vector + functions will still be needed, so that a module built for a generic + setup can be loaded. + + - add a new file arch/sh/boards/vapor/mach.c. This contains the definition + of the machine vector. When building the machine specific version, this + will be the real machine vector (via an alias), while in the generic + version is used to initialise the machine vector, and then freed, by + making it initdata. This should be defined as: + + struct sh_machine_vector mv_vapor __initmv = { + .mv_name = "vapor", + } + ALIAS_MV(vapor) + + - finally add a file arch/sh/boards/vapor/io.c, which contains + definitions of the machine specific io functions. + +A note about initialisation functions. Three initialisation functions are +provided in the machine vector: + - mv_arch_init - called very early on from setup_arch + - mv_init_irq - called from init_IRQ, after the generic SH interrupt + initialisation + - mv_init_pci - currently not used + +Any other remaining functions which need to be called at start up can be +added to the list using the __initcalls macro (or module_init if the code +can be built as a module). Many generic drivers probe to see if the device +they are targeting is present, however this may not always be appropriate, +so a flag can be added to the machine vector which will be set on those +machines which have the hardware in question, reducing the probe to a +single conditional. + +3. Hooking into the Build System +================================ + +Now that we have the corresponding directories setup, and all of the +board-specific code is in place, it's time to look at how to get the +whole mess to fit into the build system. + +Large portions of the build system are now entirely dynamic, and merely +require the proper entry here and there in order to get things done. + +The first thing to do is to add an entry to arch/sh/Kconfig, under the +"System type" menu: + +config SH_VAPOR + bool "Vapor" + help + select Vapor if configuring for a FooTech Vaporboard. + +next, this has to be added into arch/sh/Makefile. All boards require a +machdir-y entry in order to be built. This entry needs to be the name of +the board directory as it appears in arch/sh/boards, even if it is in a +sub-directory (in which case, all parent directories below arch/sh/boards/ +need to be listed). For our new board, this entry can look like: + +machdir-$(CONFIG_SH_VAPOR) += vapor + +provided that we've placed everything in the arch/sh/boards/vapor/ directory. + +Next, the build system assumes that your include/asm-sh directory will also +be named the same. If this is not the case (as is the case with multiple +boards belonging to a common family), then the directory name needs to be +implicitly appended to incdir-y. The existing code manages this for the +Solution Engine and hp6xx boards, so see these for an example. + +Once that is taken care of, it's time to add an entry for the mach type. +This is done by adding an entry to the end of the arch/sh/tools/mach-types +list. The method for doing this is self explanatory, and so we won't waste +space restating it here. After this is done, you will be able to use +implicit checks for your board if you need this somewhere throughout the +common code, such as: + + /* Make sure we're on the FooTech Vaporboard */ + if (!mach_is_vapor()) + return -ENODEV; + +also note that the mach_is_boardname() check will be implicitly forced to +lowercase, regardless of the fact that the mach-types entries are all +uppercase. You can read the script if you really care, but it's pretty ugly, +so you probably don't want to do that. + +Now all that's left to do is providing a defconfig for your new board. This +way, other people who end up with this board can simply use this config +for reference instead of trying to guess what settings are supposed to be +used on it. + +Also, as soon as you have copied over a sample .config for your new board +(assume arch/sh/configs/vapor_defconfig), you can also use this directly as a +build target, and it will be implicitly listed as such in the help text. + +Looking at the 'make help' output, you should now see something like: + +Architecture specific targets (sh): + zImage - Compressed kernel image (arch/sh/boot/zImage) + adx_defconfig - Build for adx + cqreek_defconfig - Build for cqreek + dreamcast_defconfig - Build for dreamcast +... + vapor_defconfig - Build for vapor + +which then allows you to do: + +$ make ARCH=sh CROSS_COMPILE=sh4-linux- vapor_defconfig vmlinux + +which will in turn copy the defconfig for this board, run it through +oldconfig (prompting you for any new options since the time of creation), +and start you on your way to having a functional kernel for your new +board. + diff --git a/Documentation/smart-config.txt b/Documentation/smart-config.txt new file mode 100644 index 000000000000..c9bed4cf8773 --- /dev/null +++ b/Documentation/smart-config.txt @@ -0,0 +1,102 @@ +Smart CONFIG_* Dependencies +1 August 1999 + +Michael Chastain <mec@shout.net> +Werner Almesberger <almesber@lrc.di.epfl.ch> +Martin von Loewis <martin@mira.isdn.cs.tu-berlin.de> + +Here is the problem: + + Suppose that drivers/net/foo.c has the following lines: + + #include <linux/config.h> + + ... + + #ifdef CONFIG_FOO_AUTOFROB + /* Code for auto-frobbing */ + #else + /* Manual frobbing only */ + #endif + + ... + + #ifdef CONFIG_FOO_MODEL_TWO + /* Code for model two */ + #endif + + Now suppose the user (the person building kernels) reconfigures the + kernel to change some unrelated setting. This will regenerate the + file include/linux/autoconf.h, which will cause include/linux/config.h + to be out of date, which will cause drivers/net/foo.c to be recompiled. + + Most kernel sources, perhaps 80% of them, have at least one CONFIG_* + dependency somewhere. So changing _any_ CONFIG_* setting requires + almost _all_ of the kernel to be recompiled. + +Here is the solution: + + We've made the dependency generator, mkdep.c, smarter. Instead of + generating this dependency: + + drivers/net/foo.c: include/linux/config.h + + It now generates these dependencies: + + drivers/net/foo.c: \ + include/config/foo/autofrob.h \ + include/config/foo/model/two.h + + So drivers/net/foo.c depends only on the CONFIG_* lines that + it actually uses. + + A new program, split-include.c, runs at the beginning of + compilation (make bzImage or make zImage). split-include reads + include/linux/autoconf.h and updates the include/config/ tree, + writing one file per option. It updates only the files for options + that have changed. + + mkdep.c no longer generates warning messages for missing or unneeded + <linux/config.h> lines. The new top-level target 'make checkconfig' + checks for these problems. + +Flag Dependencies + + Martin Von Loewis contributed another feature to this patch: + 'flag dependencies'. The idea is that a .o file depends on + the compilation flags used to build it. The file foo.o has + its flags stored in .flags.foo.o. + + Suppose the user changes the foo driver from resident to modular. + 'make' will notice that the current foo.o was not compiled with + -DMODULE and will recompile foo.c. + + All .o files made from C source have flag dependencies. So do .o + files made with ld, and .a files made with ar. However, .o files + made from assembly source do not have flag dependencies (nobody + needs this yet, but it would be good to fix). + +Per-source-file Flags + + Flag dependencies also work with per-source-file flags. + You can specify compilation flags for individual source files + like this: + + CFLAGS_foo.o = -DSPECIAL_FOO_DEFINE + + This helps clean up drivers/net/Makefile, drivers/scsi/Makefile, + and several other Makefiles. + +Credit + + Werner Almesberger had the original idea and wrote the first + version of this patch. + + Michael Chastain picked it up and continued development. He is + now the principal author and maintainer. Please report any bugs + to him. + + Martin von Loewis wrote flag dependencies, with some modifications + by Michael Chastain. + + Thanks to all of the beta testers. diff --git a/Documentation/smp.txt b/Documentation/smp.txt new file mode 100644 index 000000000000..82fc50b6305d --- /dev/null +++ b/Documentation/smp.txt @@ -0,0 +1,22 @@ +To set up SMP + +Configure the kernel and answer Y to CONFIG_SMP. + +If you are using LILO, it is handy to have both SMP and non-SMP +kernel images on hand. Edit /etc/lilo.conf to create an entry +for another kernel image called "linux-smp" or something. + +The next time you compile the kernel, when running a SMP kernel, +edit linux/Makefile and change "MAKE=make" to "MAKE=make -jN" +(where N = number of CPU + 1, or if you have tons of memory/swap + you can just use "-j" without a number). Feel free to experiment +with this one. + +Of course you should time how long each build takes :-) +Example: + make config + time -v sh -c 'make clean install modules modules_install' + +If you are using some Compaq MP compliant machines you will need to set +the operating system in the BIOS settings to "Unixware" - don't ask me +why Compaqs don't work otherwise. diff --git a/Documentation/sonypi.txt b/Documentation/sonypi.txt new file mode 100644 index 000000000000..0f3b2405d09e --- /dev/null +++ b/Documentation/sonypi.txt @@ -0,0 +1,142 @@ +Sony Programmable I/O Control Device Driver Readme +-------------------------------------------------- + Copyright (C) 2001-2004 Stelian Pop <stelian@popies.net> + Copyright (C) 2001-2002 Alcôve <www.alcove.com> + Copyright (C) 2001 Michael Ashley <m.ashley@unsw.edu.au> + Copyright (C) 2001 Junichi Morita <jun1m@mars.dti.ne.jp> + Copyright (C) 2000 Takaya Kinjo <t-kinjo@tc4.so-net.ne.jp> + Copyright (C) 2000 Andrew Tridgell <tridge@samba.org> + +This driver enables access to the Sony Programmable I/O Control Device which +can be found in many Sony Vaio laptops. Some newer Sony laptops (seems to be +limited to new FX series laptops, at least the FX501 and the FX702) lack a +sonypi device and are not supported at all by this driver. + +It will give access (through a user space utility) to some events those laptops +generate, like: + - jogdial events (the small wheel on the side of Vaios) + - capture button events (only on Vaio Picturebook series) + - Fn keys + - bluetooth button (only on C1VR model) + - programmable keys, back, help, zoom, thumbphrase buttons, etc. + (when available) + +Those events (see linux/sonypi.h) can be polled using the character device node +/dev/sonypi (major 10, minor auto allocated or specified as a option). +A simple daemon which translates the jogdial movements into mouse wheel events +can be downloaded at: <http://popies.net/sonypi/> + +Another option to intercept the events is to get them directly through the +input layer. + +This driver supports also some ioctl commands for setting the LCD screen +brightness and querying the batteries charge information (some more +commands may be added in the future). + +This driver can also be used to set the camera controls on Picturebook series +(brightness, contrast etc), and is used by the video4linux driver for the +Motion Eye camera. + +Please note that this driver was created by reverse engineering the Windows +driver and the ACPI BIOS, because Sony doesn't agree to release any programming +specs for its laptops. If someone convinces them to do so, drop me a note. + +Driver options: +--------------- + +Several options can be passed to the sonypi driver using the standard +module argument syntax (<param>=<value> when passing the option to the +module or sonypi.<param>=<value> on the kernel boot line when sonypi is +statically linked into the kernel). Those options are: + + minor: minor number of the misc device /dev/sonypi, + default is -1 (automatic allocation, see /proc/misc + or kernel logs) + + camera: if you have a PictureBook series Vaio (with the + integrated MotionEye camera), set this parameter to 1 + in order to let the driver access to the camera + + fnkeyinit: on some Vaios (C1VE, C1VR etc), the Fn key events don't + get enabled unless you set this parameter to 1. + Do not use this option unless it's actually necessary, + some Vaio models don't deal well with this option. + This option is available only if the kernel is + compiled without ACPI support (since it conflicts + with it and it shouldn't be required anyway if + ACPI is already enabled). + + verbose: set to 1 to print unknown events received from the + sonypi device. + set to 2 to print all events received from the + sonypi device. + + compat: uses some compatibility code for enabling the sonypi + events. If the driver worked for you in the past + (prior to version 1.5) and does not work anymore, + add this option and report to the author. + + mask: event mask telling the driver what events will be + reported to the user. This parameter is required for + some Vaio models where the hardware reuses values + used in other Vaio models (like the FX series who does + not have a jogdial but reuses the jogdial events for + programmable keys events). The default event mask is + set to 0xffffffff, meaning that all possible events + will be tried. You can use the following bits to + construct your own event mask (from + drivers/char/sonypi.h): + SONYPI_JOGGER_MASK 0x0001 + SONYPI_CAPTURE_MASK 0x0002 + SONYPI_FNKEY_MASK 0x0004 + SONYPI_BLUETOOTH_MASK 0x0008 + SONYPI_PKEY_MASK 0x0010 + SONYPI_BACK_MASK 0x0020 + SONYPI_HELP_MASK 0x0040 + SONYPI_LID_MASK 0x0080 + SONYPI_ZOOM_MASK 0x0100 + SONYPI_THUMBPHRASE_MASK 0x0200 + SONYPI_MEYE_MASK 0x0400 + SONYPI_MEMORYSTICK_MASK 0x0800 + SONYPI_BATTERY_MASK 0x1000 + + useinput: if set (which is the default) two input devices are + created, one which interprets the jogdial events as + mouse events, the other one which acts like a + keyboard reporting the pressing of the special keys. + +Module use: +----------- + +In order to automatically load the sonypi module on use, you can put those +lines in your /etc/modprobe.conf file: + + alias char-major-10-250 sonypi + options sonypi minor=250 + +This supposes the use of minor 250 for the sonypi device: + + # mknod /dev/sonypi c 10 250 + +Bugs: +----- + + - several users reported that this driver disables the BIOS-managed + Fn-keys which put the laptop in sleeping state, or switch the + external monitor on/off. There is no workaround yet, since this + driver disables all APM management for those keys, by enabling the + ACPI management (and the ACPI core stuff is not complete yet). If + you have one of those laptops with working Fn keys and want to + continue to use them, don't use this driver. + + - some users reported that the laptop speed is lower (dhrystone + tested) when using the driver with the fnkeyinit parameter. I cannot + reproduce it on my laptop and not all users have this problem. + This happens because the fnkeyinit parameter enables the ACPI + mode (but without additional ACPI control, like processor + speed handling etc). Use ACPI instead of APM if it works on your + laptop. + + - since all development was done by reverse engineering, there is + _absolutely no guarantee_ that this driver will not crash your + laptop. Permanently. diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt new file mode 100644 index 000000000000..71ef0498d5e0 --- /dev/null +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -0,0 +1,1505 @@ + + Advanced Linux Sound Architecture - Driver + ========================================== + Configuration guide + + +Kernel Configuration +==================== + +To enable ALSA support you need at least to build the kernel with +primary sound card support (CONFIG_SOUND). Since ALSA can emulate OSS, +you don't have to choose any of the OSS modules. + +Enable "OSS API emulation" (CONFIG_SND_OSSEMUL) and both OSS mixer and +PCM supports if you want to run OSS applications with ALSA. + +If you want to support the WaveTable functionality on cards such as +SB Live! then you need to enable "Sequencer support" +(CONFIG_SND_SEQUENCER). + +To make ALSA debug messages more verbose, enable the "Verbose printk" +and "Debug" options. To check for memory leaks, turn on "Debug memory" +too. "Debug detection" will add checks for the detection of cards. + +Please note that all the ALSA ISA drivers support the Linux isapnp API +(if the card supports ISA PnP). You don't need to configure the cards +using isapnptools. + + +Creating ALSA devices +===================== + +This depends on your distribution, but normally you use the /dev/MAKEDEV +script to create the necessary device nodes. On some systems you use a +script named 'snddevices'. + + +Module parameters +================= + +The user can load modules with options. If the module supports more than +one card and you have more than one card of the same type then you can +specify multiple values for the option separated by commas. + +Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. + + Module snd + ---------- + + The core ALSA module. It is used by all ALSA card drivers. + It takes the following options which have global effects. + + major - major number for sound driver + - Default: 116 + cards_limit + - limiting card index for auto-loading (1-8) + - Default: 1 + - For auto-loading more than one card, specify this + option together with snd-card-X aliases. + device_mode + - permission mask for dynamic sound device filesystem + - This is available only when DEVFS is enabled + - Default: 0666 + - E.g.: device_mode=0660 + + + Module snd-pcm-oss + ------------------ + + The PCM OSS emulation module. + This module takes options which change the mapping of devices. + + dsp_map - PCM device number maps assigned to the 1st OSS device. + - Default: 0 + adsp_map - PCM device number maps assigned to the 2st OSS device. + - Default: 1 + nonblock_open + - Don't block opening busy PCM devices. + + For example, when dsp_map=2, /dev/dsp will be mapped to PCM #2 of + the card #0. Similarly, when adsp_map=0, /dev/adsp will be mapped + to PCM #0 of the card #0. + For changing the second or later card, specify the option with + commas, such like "dsp_map=0,1". + + nonblock_open option is used to change the behavior of the PCM + regarding opening the device. When this option is non-zero, + opening a busy OSS PCM device won't be blocked but return + immediately with EAGAIN (just like O_NONBLOCK flag). + + Module snd-rawmidi + ------------------ + + This module takes options which change the mapping of devices. + similar to those of the snd-pcm-oss module. + + midi_map - MIDI device number maps assigned to the 1st OSS device. + - Default: 0 + amidi_map - MIDI device number maps assigned to the 2st OSS device. + - Default: 1 + + Common parameters for top sound card modules + -------------------------------------------- + + Each of top level sound card module takes the following options. + + index - index (slot #) of sound card + - Values: 0 through 7 or negative + - If nonnegative, assign that index number + - if negative, interpret as a bitmask of permissible + indices; the first free permitted index is assigned + - Default: -1 + id - card ID (identifier or name) + - Can be up to 15 characters long + - Default: the card type + - A directory by this name is created under /proc/asound/ + containing information about the card + - This ID can be used instead of the index number in + identifying the card + enable - enable card + - Default: enabled, for PCI and ISA PnP cards + + Module snd-ad1816a + ------------------ + + Module for sound cards based on Analog Devices AD1816A/AD1815 ISA chips. + + port - port # for AD1816A chip (PnP setup) + mpu_port - port # for MPU-401 UART (PnP setup) + fm_port - port # for OPL3 (PnP setup) + irq - IRQ # for AD1816A chip (PnP setup) + mpu_irq - IRQ # for MPU-401 UART (PnP setup) + dma1 - first DMA # for AD1816A chip (PnP setup) + dma2 - second DMA # for AD1816A chip (PnP setup) + + Module supports up to 8 cards, autoprobe and PnP. + + Module snd-ad1848 + ----------------- + + Module for sound cards based on AD1848/AD1847/CS4248 ISA chips. + + port - port # for AD1848 chip + irq - IRQ # for AD1848 chip + dma1 - DMA # for AD1848 chip (0,1,3) + + Module supports up to 8 cards. This module does not support autoprobe + thus main port must be specified!!! Other ports are optional. + + Module snd-ali5451 + ------------------ + + Module for ALi M5451 PCI chip. + + pcm_channels - Number of hardware channels assigned for PCM + spdif - Support SPDIF I/O + - Default: disabled + + Module supports autoprobe and multiple chips (max 8). + + The power-management is supported. + + Module snd-als100 + ----------------- + + Module for sound cards based on Avance Logic ALS100/ALS120 ISA chips. + + port - port # for ALS100 (SB16) chip (PnP setup) + irq - IRQ # for ALS100 (SB16) chip (PnP setup) + dma8 - 8-bit DMA # for ALS100 (SB16) chip (PnP setup) + dma16 - 16-bit DMA # for ALS100 (SB16) chip (PnP setup) + mpu_port - port # for MPU-401 UART (PnP setup) + mpu_irq - IRQ # for MPU-401 (PnP setup) + fm_port - port # for OPL3 FM (PnP setup) + + Module supports up to 8 cards, autoprobe and PnP. + + Module snd-als4000 + ------------------ + + Module for sound cards based on Avance Logic ALS4000 PCI chip. + + joystick_port - port # for legacy joystick support. + 0 = disabled (default), 1 = auto-detect + + Module supports up to 8 cards, autoprobe and PnP. + + Module snd-atiixp + ----------------- + + Module for ATI IXP 150/200/250 AC97 controllers. + + ac97_clock - AC'97 clock (defalut = 48000) + ac97_quirk - AC'97 workaround for strange hardware + See the description of intel8x0 module for details. + spdif_aclink - S/PDIF transfer over AC-link (default = 1) + + This module supports up to 8 cards and autoprobe. + + Module snd-atiixp-modem + ----------------------- + + Module for ATI IXP 150/200/250 AC97 modem controllers. + + Module supports up to 8 cards. + + Note: The default index value of this module is -2, i.e. the first + slot is excluded. + + Module snd-au8810, snd-au8820, snd-au8830 + ----------------------------------------- + + Module for Aureal Vortex, Vortex2 and Advantage device. + + pcifix - Control PCI workarounds + 0 = Disable all workarounds + 1 = Force the PCI latency of the Aureal card to 0xff + 2 = Force the Extend PCI#2 Internal Master for Efficient + Handling of Dummy Requests on the VIA KT133 AGP Bridge + 3 = Force both settings + 255 = Autodetect what is required (default) + + This module supports all ADB PCM channels, ac97 mixer, SPDIF, hardware + EQ, mpu401, gameport. A3D and wavetable support are still in development. + Development and reverse engineering work is being coordinated at + http://savannah.nongnu.org/projects/openvortex/ + SPDIF output has a copy of the AC97 codec output, unless you use the + "spdif" pcm device, which allows raw data passthru. + The hardware EQ hardware and SPDIF is only present in the Vortex2 and + Advantage. + + Note: Some ALSA mixer applicactions don't handle the SPDIF samplerate + control correctly. If you have problems regarding this, try + another ALSA compliant mixer (alsamixer works). + + Module snd-azt2320 + ------------------ + + Module for sound cards based on Aztech System AZT2320 ISA chip (PnP only). + + port - port # for AZT2320 chip (PnP setup) + wss_port - port # for WSS (PnP setup) + mpu_port - port # for MPU-401 UART (PnP setup) + fm_port - FM port # for AZT2320 chip (PnP setup) + irq - IRQ # for AZT2320 (WSS) chip (PnP setup) + mpu_irq - IRQ # for MPU-401 UART (PnP setup) + dma1 - 1st DMA # for AZT2320 (WSS) chip (PnP setup) + dma2 - 2nd DMA # for AZT2320 (WSS) chip (PnP setup) + + Module supports up to 8 cards, PnP and autoprobe. + + Module snd-azt3328 + ------------------ + + Module for sound cards based on Aztech AZF3328 PCI chip. + + joystick - Enable joystick (default off) + + Module supports up to 8 cards. + + Module snd-bt87x + ---------------- + + Module for video cards based on Bt87x chips. + + digital_rate - Override the default digital rate (Hz) + load_all - Load the driver even if the card model isn't known + + Module supports up to 8 cards. + + Note: The default index value of this module is -2, i.e. the first + slot is excluded. + + Module snd-ca0106 + ----------------- + + Module for Creative Audigy LS and SB Live 24bit + + Module supports up to 8 cards. + + + Module snd-cmi8330 + ------------------ + + Module for sound cards based on C-Media CMI8330 ISA chips. + + wssport - port # for CMI8330 chip (WSS) + wssirq - IRQ # for CMI8330 chip (WSS) + wssdma - first DMA # for CMI8330 chip (WSS) + sbport - port # for CMI8330 chip (SB16) + sbirq - IRQ # for CMI8330 chip (SB16) + sbdma8 - 8bit DMA # for CMI8330 chip (SB16) + sbdma16 - 16bit DMA # for CMI8330 chip (SB16) + + Module supports up to 8 cards and autoprobe. + + Module snd-cmipci + ----------------- + + Module for C-Media CMI8338 and 8738 PCI sound cards. + + mpu_port - 0x300,0x310,0x320,0x330, 0 = disable (default) + fm_port - 0x388 (default), 0 = disable (default) + soft_ac3 - Sofware-conversion of raw SPDIF packets (model 033 only) + (default = 1) + joystick_port - Joystick port address (0 = disable, 1 = auto-detect) + + Module supports autoprobe and multiple chips (max 8). + + Module snd-cs4231 + ----------------- + + Module for sound cards based on CS4231 ISA chips. + + port - port # for CS4231 chip + mpu_port - port # for MPU-401 UART (optional), -1 = disable + irq - IRQ # for CS4231 chip + mpu_irq - IRQ # for MPU-401 UART + dma1 - first DMA # for CS4231 chip + dma2 - second DMA # for CS4231 chip + + Module supports up to 8 cards. This module does not support autoprobe + thus main port must be specified!!! Other ports are optional. + + The power-management is supported. + + Module snd-cs4232 + ----------------- + + Module for sound cards based on CS4232/CS4232A ISA chips. + + port - port # for CS4232 chip (PnP setup - 0x534) + cport - control port # for CS4232 chip (PnP setup - 0x120,0x210,0xf00) + mpu_port - port # for MPU-401 UART (PnP setup - 0x300), -1 = disable + fm_port - FM port # for CS4232 chip (PnP setup - 0x388), -1 = disable + irq - IRQ # for CS4232 chip (5,7,9,11,12,15) + mpu_irq - IRQ # for MPU-401 UART (9,11,12,15) + dma1 - first DMA # for CS4232 chip (0,1,3) + dma2 - second DMA # for Yamaha CS4232 chip (0,1,3), -1 = disable + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + Module supports up to 8 cards. This module does not support autoprobe + thus main port must be specified!!! Other ports are optional. + + The power-management is supported. + + Module snd-cs4236 + ----------------- + + Module for sound cards based on CS4235/CS4236/CS4236B/CS4237B/ + CS4238B/CS4239 ISA chips. + + port - port # for CS4236 chip (PnP setup - 0x534) + cport - control port # for CS4236 chip (PnP setup - 0x120,0x210,0xf00) + mpu_port - port # for MPU-401 UART (PnP setup - 0x300), -1 = disable + fm_port - FM port # for CS4236 chip (PnP setup - 0x388), -1 = disable + irq - IRQ # for CS4236 chip (5,7,9,11,12,15) + mpu_irq - IRQ # for MPU-401 UART (9,11,12,15) + dma1 - first DMA # for CS4236 chip (0,1,3) + dma2 - second DMA # for CS4236 chip (0,1,3), -1 = disable + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + Module supports up to 8 cards. This module does not support autoprobe + (if ISA PnP is not used) thus main port and control port must be + specified!!! Other ports are optional. + + The power-management is supported. + + Module snd-cs4281 + ----------------- + + Module for Cirrus Logic CS4281 soundchip. + + dual_codec - Secondary codec ID (0 = disable, default) + + Module supports up to 8 cards. + + The power-management is supported. + + Module snd-cs46xx + ----------------- + + Module for PCI sound cards based on CS4610/CS4612/CS4614/CS4615/CS4622/ + CS4624/CS4630/CS4280 PCI chips. + + external_amp - Force to enable external amplifer. + thinkpad - Force to enable Thinkpad's CLKRUN control. + mmap_valid - Support OSS mmap mode (default = 0). + + Module supports up to 8 cards and autoprobe. + Usually external amp and CLKRUN controls are detected automatically + from PCI sub vendor/device ids. If they don't work, give the options + above explicitly. + + The power-management is supported. + + Module snd-dt019x + ----------------- + + Module for Diamond Technologies DT-019X / Avance Logic ALS-007 (PnP + only) + + port - Port # (PnP setup) + mpu_port - Port # for MPU-401 (PnP setup) + fm_port - Port # for FM OPL-3 (PnP setup) + irq - IRQ # (PnP setup) + mpu_irq - IRQ # for MPU-401 (PnP setup) + dma8 - DMA # (PnP setup) + + Module supports up to 8 cards. This module is enabled only with + ISA PnP support. + + Module snd-dummy + ---------------- + + Module for the dummy sound card. This "card" doesn't do any output + or input, but you may use this module for any application which + requires a sound card (like RealPlayer). + + Module snd-emu10k1 + ------------------ + + Module for EMU10K1/EMU10k2 based PCI sound cards. + * Sound Blaster Live! + * Sound Blaster PCI 512 + * Emu APS (partially supported) + * Sound Blaster Audigy + + extin - bitmap of available external inputs for FX8010 (see bellow) + extout - bitmap of available external outputs for FX8010 (see bellow) + seq_ports - allocated sequencer ports (4 by default) + max_synth_voices - limit of voices used for wavetable (64 by default) + max_buffer_size - specifies the maximum size of wavetable/pcm buffers + given in MB unit. Default value is 128. + enable_ir - enable IR + + Module supports up to 8 cards and autoprobe. + + Input & Output configurations [extin/extout] + * Creative Card wo/Digital out [0x0003/0x1f03] + * Creative Card w/Digital out [0x0003/0x1f0f] + * Creative Card w/Digital CD in [0x000f/0x1f0f] + * Creative Card wo/Digital out + LiveDrive [0x3fc3/0x1fc3] + * Creative Card w/Digital out + LiveDrive [0x3fc3/0x1fcf] + * Creative Card w/Digital CD in + LiveDrive [0x3fcf/0x1fcf] + * Creative Card wo/Digital out + Digital I/O 2 [0x0fc3/0x1f0f] + * Creative Card w/Digital out + Digital I/O 2 [0x0fc3/0x1f0f] + * Creative Card w/Digital CD in + Digital I/O 2 [0x0fcf/0x1f0f] + * Creative Card 5.1/w Digital out + LiveDrive [0x3fc3/0x1fff] + * Creative Card 5.1 (c) 2003 [0x3fc3/0x7cff] + * Creative Card all ins and outs [0x3fff/0x7fff] + + Module snd-emu10k1x + ------------------- + + Module for Creative Emu10k1X (SB Live Dell OEM version) + + Module supports up to 8 cards. + + Module snd-ens1370 + ------------------ + + Module for Ensoniq AudioPCI ES1370 PCI sound cards. + * SoundBlaster PCI 64 + * SoundBlaster PCI 128 + + joystick - Enable joystick (default off) + + Module supports up to 8 cards and autoprobe. + + Module snd-ens1371 + ------------------ + + Module for Ensoniq AudioPCI ES1371 PCI sound cards. + * SoundBlaster PCI 64 + * SoundBlaster PCI 128 + * SoundBlaster Vibra PCI + + joystick_port - port # for joystick (0x200,0x208,0x210,0x218), + 0 = disable (default), 1 = auto-detect + + Module supports up to 8 cards and autoprobe. + + Module snd-es968 + ---------------- + + Module for sound cards based on ESS ES968 chip (PnP only). + + port - port # for ES968 (SB8) chip (PnP setup) + irq - IRQ # for ES968 (SB8) chip (PnP setup) + dma1 - DMA # for ES968 (SB8) chip (PnP setup) + + Module supports up to 8 cards, PnP and autoprobe. + + Module snd-es1688 + ----------------- + + Module for ESS AudioDrive ES-1688 and ES-688 sound cards. + + port - port # for ES-1688 chip (0x220,0x240,0x260) + mpu_port - port # for MPU-401 port (0x300,0x310,0x320,0x330), -1 = disable (default) + irq - IRQ # for ES-1688 chip (5,7,9,10) + mpu_irq - IRQ # for MPU-401 port (5,7,9,10) + dma8 - DMA # for ES-1688 chip (0,1,3) + + Module supports up to 8 cards and autoprobe (without MPU-401 port). + + Module snd-es18xx + ----------------- + + Module for ESS AudioDrive ES-18xx sound cards. + + port - port # for ES-18xx chip (0x220,0x240,0x260) + mpu_port - port # for MPU-401 port (0x300,0x310,0x320,0x330), -1 = disable (default) + fm_port - port # for FM (optional, not used) + irq - IRQ # for ES-18xx chip (5,7,9,10) + dma1 - first DMA # for ES-18xx chip (0,1,3) + dma2 - first DMA # for ES-18xx chip (0,1,3) + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + Module supports up to 8 cards ISA PnP and autoprobe (without MPU-401 port + if native ISA PnP routines are not used). + When dma2 is equal with dma1, the driver works as half-duplex. + + The power-management is supported. + + Module snd-es1938 + ----------------- + + Module for sound cards based on ESS Solo-1 (ES1938,ES1946) chips. + + Module supports up to 8 cards and autoprobe. + + Module snd-es1968 + ----------------- + + Module for sound cards based on ESS Maestro-1/2/2E (ES1968/ES1978) chips. + + total_bufsize - total buffer size in kB (1-4096kB) + pcm_substreams_p - playback channels (1-8, default=2) + pcm_substreams_c - capture channels (1-8, default=0) + clock - clock (0 = auto-detection) + use_pm - support the power-management (0 = off, 1 = on, + 2 = auto (default)) + enable_mpu - enable MPU401 (0 = off, 1 = on, 2 = auto (default)) + joystick - enable joystick (default off) + + Module supports up to 8 cards and autoprobe. + + The power-management is supported. + + Module snd-fm801 + ---------------- + + Module for ForteMedia FM801 based PCI sound cards. + + tea575x_tuner - Enable TEA575x tuner + - 1 = MediaForte 256-PCS + - 2 = MediaForte 256-PCPR + - 3 = MediaForte 64-PCR + - High 16-bits are video (radio) device number + 1 + - example: 0x10002 (MediaForte 256-PCPR, device 1) + + Module supports up to 8 cards and autoprobe. + + Module snd-gusclassic + --------------------- + + Module for Gravis UltraSound Classic sound card. + + port - port # for GF1 chip (0x220,0x230,0x240,0x250,0x260) + irq - IRQ # for GF1 chip (3,5,9,11,12,15) + dma1 - DMA # for GF1 chip (1,3,5,6,7) + dma2 - DMA # for GF1 chip (1,3,5,6,7,-1=disable) + joystick_dac - 0 to 31, (0.59V-4.52V or 0.389V-2.98V) + voices - GF1 voices limit (14-32) + pcm_voices - reserved PCM voices + + Module supports up to 8 cards and autoprobe. + + Module snd-gusextreme + --------------------- + + Module for Gravis UltraSound Extreme (Synergy ViperMax) sound card. + + port - port # for ES-1688 chip (0x220,0x230,0x240,0x250,0x260) + gf1_port - port # for GF1 chip (0x210,0x220,0x230,0x240,0x250,0x260,0x270) + mpu_port - port # for MPU-401 port (0x300,0x310,0x320,0x330), -1 = disable + irq - IRQ # for ES-1688 chip (5,7,9,10) + gf1_irq - IRQ # for GF1 chip (3,5,9,11,12,15) + mpu_irq - IRQ # for MPU-401 port (5,7,9,10) + dma8 - DMA # for ES-1688 chip (0,1,3) + dma1 - DMA # for GF1 chip (1,3,5,6,7) + joystick_dac - 0 to 31, (0.59V-4.52V or 0.389V-2.98V) + voices - GF1 voices limit (14-32) + pcm_voices - reserved PCM voices + + Module supports up to 8 cards and autoprobe (without MPU-401 port). + + Module snd-gusmax + ----------------- + + Module for Gravis UltraSound MAX sound card. + + port - port # for GF1 chip (0x220,0x230,0x240,0x250,0x260) + irq - IRQ # for GF1 chip (3,5,9,11,12,15) + dma1 - DMA # for GF1 chip (1,3,5,6,7) + dma2 - DMA # for GF1 chip (1,3,5,6,7,-1=disable) + joystick_dac - 0 to 31, (0.59V-4.52V or 0.389V-2.98V) + voices - GF1 voices limit (14-32) + pcm_voices - reserved PCM voices + + Module supports up to 8 cards and autoprobe. + + Module snd-hda-intel + -------------------- + + Module for Intel HD Audio (ICH6, ICH6M, ICH7) + + model - force the model name + + Module supports up to 8 cards. + + Each codec may have a model table for different configurations. + If your machine isn't listed there, the default (usually minimal) + configuration is set up. You can pass "model=<name>" option to + specify a certain model in such a case. There are different + models depending on the codec chip. + + Model name Description + ---------- ----------- + ALC880 + 3stack 3-jack in back and a headphone out + 3stack-digout 3-jack in back, a HP out and a SPDIF out + 5stack 5-jack in back, 2-jack in front + 5stack-digout 5-jack in back, 2-jack in front, a SPDIF out + w810 3-jack + + CMI9880 + minimal 3-jack in back + min_fp 3-jack in back, 2-jack in front + full 6-jack in back, 2-jack in front + full_dig 6-jack in back, 2-jack in front, SPDIF I/O + allout 5-jack in back, 2-jack in front, SPDIF out + + Module snd-hdsp + --------------- + + Module for RME Hammerfall DSP audio interface(s) + + Module supports up to 8 cards. + + Note: The firmware data can be automatically loaded via hotplug + when CONFIG_FW_LOADER is set. Otherwise, you need to load + the firmware via hdsploader utility included in alsa-tools + package. + The firmware data is found in alsa-firmware package. + + Note: snd-page-alloc module does the job which snd-hammerfall-mem + module did formerly. It will allocate the buffers in advance + when any HDSP cards are found. To make the buffer + allocation sure, load snd-page-alloc module in the early + stage of boot sequence. + + Module snd-ice1712 + ------------------ + + Module for Envy24 (ICE1712) based PCI sound cards. + * MidiMan M Audio Delta 1010 + * MidiMan M Audio Delta 1010LT + * MidiMan M Audio Delta DiO 2496 + * MidiMan M Audio Delta 66 + * MidiMan M Audio Delta 44 + * MidiMan M Audio Delta 410 + * MidiMan M Audio Audiophile 2496 + * TerraTec EWS 88MT + * TerraTec EWS 88D + * TerraTec EWX 24/96 + * TerraTec DMX 6Fire + * Hoontech SoundTrack DSP 24 + * Hoontech SoundTrack DSP 24 Value + * Hoontech SoundTrack DSP 24 Media 7.1 + * Digigram VX442 + + model - Use the given board model, one of the following: + delta1010, dio2496, delta66, delta44, audiophile, delta410, + delta1010lt, vx442, ewx2496, ews88mt, ews88mt_new, ews88d, + dmx6fire, dsp24, dsp24_value, dsp24_71, ez8 + omni - Omni I/O support for MidiMan M-Audio Delta44/66 + cs8427_timeout - reset timeout for the CS8427 chip (S/PDIF transciever) + in msec resolution, default value is 500 (0.5 sec) + + Module supports up to 8 cards and autoprobe. Note: The consumer part + is not used with all Envy24 based cards (for example in the MidiMan Delta + serie). + + Module snd-ice1724 + ------------------ + + Module for Envy24HT (VT/ICE1724) based PCI sound cards. + * MidiMan M Audio Revolution 7.1 + * AMP Ltd AUDIO2000 + * TerraTec Aureon Sky-5.1, Space-7.1 + + model - Use the given board model, one of the following: + revo71, amp2000, prodigy71, aureon51, aureon71, + k8x800 + + Module supports up to 8 cards and autoprobe. + + Module snd-intel8x0 + ------------------- + + Module for AC'97 motherboards from Intel and compatibles. + * Intel i810/810E, i815, i820, i830, i84x, MX440 + * SiS 7012 (SiS 735) + * NVidia NForce, NForce2 + * AMD AMD768, AMD8111 + * ALi m5455 + + ac97_clock - AC'97 codec clock base (0 = auto-detect) + ac97_quirk - AC'97 workaround for strange hardware + The following strings are accepted: + default = don't override the default setting + disable = disable the quirk + hp_only = use headphone control as master + swap_hp = swap headphone and master controls + swap_surround = swap master and surround controls + ad_sharing = for AD1985, turn on OMS bit and use headphone + alc_jack = for ALC65x, turn on the jack sense mode + inv_eapd = inverted EAPD implementation + mute_led = bind EAPD bit for turning on/off mute LED + For backward compatibility, the corresponding integer + value -1, 0, ... are accepted, too. + buggy_irq - Enable workaround for buggy interrupts on some + motherboards (default off) + + Module supports autoprobe and multiple bus-master chips (max 8). + + Note: the latest driver supports auto-detection of chip clock. + if you still encounter too fast playback, specify the clock + explicitly via the module option "ac97_clock=41194". + + Joystick/MIDI ports are not supported by this driver. If your + motherboard has these devices, use the ns558 or snd-mpu401 + modules, respectively. + + The ac97_quirk option is used to enable/override the workaround + for specific devices. Some hardware have swapped output pins + between Master and Headphone, or Surround. The driver provides + the auto-detection of known problematic devices, but some might + be unknown or wrongly detected. In such a case, pass the proper + value with this option. + + The power-management is supported. + + Module snd-intel8x0m + -------------------- + + Module for Intel ICH (i8x0) chipset MC97 modems. + + ac97_clock - AC'97 codec clock base (0 = auto-detect) + + This module supports up to 8 cards and autoprobe. + + Note: The default index value of this module is -2, i.e. the first + slot is excluded. + + Module snd-interwave + -------------------- + + Module for Gravis UltraSound PnP, Dynasonic 3-D/Pro, STB Sound Rage 32 + and other sound cards based on AMD InterWave (tm) chip. + + port - port # for InterWave chip (0x210,0x220,0x230,0x240,0x250,0x260) + irq - IRQ # for InterWave chip (3,5,9,11,12,15) + dma1 - DMA # for InterWave chip (0,1,3,5,6,7) + dma2 - DMA # for InterWave chip (0,1,3,5,6,7,-1=disable) + joystick_dac - 0 to 31, (0.59V-4.52V or 0.389V-2.98V) + midi - 1 = MIDI UART enable, 0 = MIDI UART disable (default) + pcm_voices - reserved PCM voices for the synthesizer (default 2) + effect - 1 = InterWave effects enable (default 0); + requires 8 voices + + Module supports up to 8 cards, autoprobe and ISA PnP. + + Module snd-interwave-stb + ------------------------ + + Module for UltraSound 32-Pro (sound card from STB used by Compaq) + and other sound cards based on AMD InterWave (tm) chip with TEA6330T + circuit for extended control of bass, treble and master volume. + + port - port # for InterWave chip (0x210,0x220,0x230,0x240,0x250,0x260) + port_tc - tone control (i2c bus) port # for TEA6330T chip (0x350,0x360,0x370,0x380) + irq - IRQ # for InterWave chip (3,5,9,11,12,15) + dma1 - DMA # for InterWave chip (0,1,3,5,6,7) + dma2 - DMA # for InterWave chip (0,1,3,5,6,7,-1=disable) + joystick_dac - 0 to 31, (0.59V-4.52V or 0.389V-2.98V) + midi - 1 = MIDI UART enable, 0 = MIDI UART disable (default) + pcm_voices - reserved PCM voices for the synthesizer (default 2) + effect - 1 = InterWave effects enable (default 0); + requires 8 voices + + Module supports up to 8 cards, autoprobe and ISA PnP. + + Module snd-korg1212 + ------------------- + + Module for Korg 1212 IO PCI card + + Module supports up to 8 cards. + + Module snd-maestro3 + ------------------- + + Module for Allegro/Maestro3 chips + + external_amp - enable external amp (enabled by default) + amp_gpio - GPIO pin number for external amp (0-15) or + -1 for default pin (8 for allegro, 1 for + others) + + Module supports autoprobe and multiple chips (max 8). + + Note: the binding of amplifier is dependent on hardware. + If there is no sound even though all channels are unmuted, try to + specify other gpio connection via amp_gpio option. + For example, a Panasonic notebook might need "amp_gpio=0x0d" + option. + + The power-management is supported. + + Module snd-mixart + ----------------- + + Module for Digigram miXart8 sound cards. + + Module supports multiple cards. + Note: One miXart8 board will be represented as 4 alsa cards. + See MIXART.txt for details. + + When the driver is compiled as a module and the hotplug firmware + is supported, the firmware data is loaded via hotplug automatically. + Install the necessary firmware files in alsa-firmware package. + When no hotplug fw loader is available, you need to load the + firmware via mixartloader utility in alsa-tools package. + + Module snd-mpu401 + ----------------- + + Module for MPU-401 UART devices. + + port - port number or -1 (disable) + irq - IRQ number or -1 (disable) + pnp - PnP detection - 0 = disable, 1 = enable (default) + + Module supports multiple devices (max 8) and PnP. + + Module snd-mtpav + ---------------- + + Module for MOTU MidiTimePiece AV multiport MIDI (on the parallel + port). + + port - I/O port # for MTPAV (0x378,0x278, default=0x378) + irq - IRQ # for MTPAV (7,5, default=7) + hwports - number of supported hardware ports, default=8. + + Module supports only 1 card. This module has no enable option. + + Module snd-nm256 + ---------------- + + Module for NeoMagic NM256AV/ZX chips + + playback_bufsize - max playback frame size in kB (4-128kB) + capture_bufsize - max capture frame size in kB (4-128kB) + force_ac97 - 0 or 1 (disabled by default) + buffer_top - specify buffer top address + use_cache - 0 or 1 (disabled by default) + vaio_hack - alias buffer_top=0x25a800 + reset_workaround - enable AC97 RESET workaround for some laptops + + Module supports autoprobe and multiple chips (max 8). + + The power-management is supported. + + Note: on some notebooks the buffer address cannot be detected + automatically, or causes hang-up during initialization. + In such a case, specify the buffer top address explicity via + buffer_top option. + For example, + Sony F250: buffer_top=0x25a800 + Sony F270: buffer_top=0x272800 + The driver supports only ac97 codec. It's possible to force + to initialize/use ac97 although it's not detected. In such a + case, use force_ac97=1 option - but *NO* guarantee whether it + works! + + Note: The NM256 chip can be linked internally with non-AC97 + codecs. This driver supports only the AC97 codec, and won't work + with machines with other (most likely CS423x or OPL3SAx) chips, + even though the device is detected in lspci. In such a case, try + other drivers, e.g. snd-cs4232 or snd-opl3sa2. Some has ISA-PnP + but some doesn't have ISA PnP. You'll need to speicfy isapnp=0 + and proper hardware parameters in the case without ISA PnP. + + Note: some laptops need a workaround for AC97 RESET. For the + known hardware like Dell Latitude LS and Sony PCG-F305, this + workaround is enabled automatically. For other laptops with a + hard freeze, you can try reset_workaround=1 option. + + Note: This driver is really crappy. It's a porting from the + OSS driver, which is a result of black-magic reverse engineering. + The detection of codec will fail if the driver is loaded *after* + X-server as described above. You might be able to force to load + the module, but it may result in hang-up. Hence, make sure that + you load this module *before* X if you encounter this kind of + problem. + + Module snd-opl3sa2 + ------------------ + + Module for Yamaha OPL3-SA2/SA3 sound cards. + + port - control port # for OPL3-SA chip (0x370) + sb_port - SB port # for OPL3-SA chip (0x220,0x240) + wss_port - WSS port # for OPL3-SA chip (0x530,0xe80,0xf40,0x604) + midi_port - port # for MPU-401 UART (0x300,0x330), -1 = disable + fm_port - FM port # for OPL3-SA chip (0x388), -1 = disable + irq - IRQ # for OPL3-SA chip (5,7,9,10) + dma1 - first DMA # for Yamaha OPL3-SA chip (0,1,3) + dma2 - second DMA # for Yamaha OPL3-SA chip (0,1,3), -1 = disable + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + Module supports up to 8 cards and ISA PnP. This module does not support + autoprobe (if ISA PnP is not used) thus all ports must be specified!!! + + The power-management is supported. + + Module snd-opti92x-ad1848 + ------------------------- + + Module for sound cards based on OPTi 82c92x and Analog Devices AD1848 chips. + Module works with OAK Mozart cards as well. + + port - port # for WSS chip (0x530,0xe80,0xf40,0x604) + mpu_port - port # for MPU-401 UART (0x300,0x310,0x320,0x330) + fm_port - port # for OPL3 device (0x388) + irq - IRQ # for WSS chip (5,7,9,10,11) + mpu_irq - IRQ # for MPU-401 UART (5,7,9,10) + dma1 - first DMA # for WSS chip (0,1,3) + + This module supports only one card, autoprobe and PnP. + + Module snd-opti92x-cs4231 + ------------------------- + + Module for sound cards based on OPTi 82c92x and Crystal CS4231 chips. + + port - port # for WSS chip (0x530,0xe80,0xf40,0x604) + mpu_port - port # for MPU-401 UART (0x300,0x310,0x320,0x330) + fm_port - port # for OPL3 device (0x388) + irq - IRQ # for WSS chip (5,7,9,10,11) + mpu_irq - IRQ # for MPU-401 UART (5,7,9,10) + dma1 - first DMA # for WSS chip (0,1,3) + dma2 - second DMA # for WSS chip (0,1,3) + + This module supports only one card, autoprobe and PnP. + + Module snd-opti93x + ------------------ + + Module for sound cards based on OPTi 82c93x chips. + + port - port # for WSS chip (0x530,0xe80,0xf40,0x604) + mpu_port - port # for MPU-401 UART (0x300,0x310,0x320,0x330) + fm_port - port # for OPL3 device (0x388) + irq - IRQ # for WSS chip (5,7,9,10,11) + mpu_irq - IRQ # for MPU-401 UART (5,7,9,10) + dma1 - first DMA # for WSS chip (0,1,3) + dma2 - second DMA # for WSS chip (0,1,3) + + This module supports only one card, autoprobe and PnP. + + Module snd-powermac (on ppc only) + --------------------------------- + + Module for PowerMac, iMac and iBook on-board soundchips + + enable_beep - enable beep using PCM (enabled as default) + + Module supports autoprobe a chip. + + Note: the driver may have problems regarding endianess. + + The power-management is supported. + + Module snd-rme32 + ---------------- + + Module for RME Digi32, Digi32 Pro and Digi32/8 (Sek'd Prodif32, + Prodif96 and Prodif Gold) sound cards. + + Module supports up to 8 cards. + + Module snd-rme96 + ---------------- + + Module for RME Digi96, Digi96/8 and Digi96/8 PRO/PAD/PST sound cards. + + Module supports up to 8 cards. + + Module snd-rme9652 + ------------------ + + Module for RME Digi9652 (Hammerfall, Hammerfall-Light) sound cards. + + precise_ptr - Enable precise pointer (doesn't work reliably). + (default = 0) + + Module supports up to 8 cards. + + Note: snd-page-alloc module does the job which snd-hammerfall-mem + module did formerly. It will allocate the buffers in advance + when any RME9652 cards are found. To make the buffer + allocation sure, load snd-page-alloc module in the early + stage of boot sequence. + + Module snd-sa11xx-uda1341 (on arm only) + --------------------------------------- + + Module for Philips UDA1341TS on Compaq iPAQ H3600 sound card. + + Module supports only one card. + Module has no enable and index options. + + Module snd-sb8 + -------------- + + Module for 8-bit SoundBlaster cards: SoundBlaster 1.0, + SoundBlaster 2.0, + SoundBlaster Pro + + port - port # for SB DSP chip (0x220,0x240,0x260) + irq - IRQ # for SB DSP chip (5,7,9,10) + dma8 - DMA # for SB DSP chip (1,3) + + Module supports up to 8 cards and autoprobe. + + Module snd-sb16 and snd-sbawe + ----------------------------- + + Module for 16-bit SoundBlaster cards: SoundBlaster 16 (PnP), + SoundBlaster AWE 32 (PnP), + SoundBlaster AWE 64 PnP + + port - port # for SB DSP 4.x chip (0x220,0x240,0x260) + mpu_port - port # for MPU-401 UART (0x300,0x330), -1 = disable + awe_port - base port # for EMU8000 synthesizer (0x620,0x640,0x660) + (snd-sbawe module only) + irq - IRQ # for SB DSP 4.x chip (5,7,9,10) + dma8 - 8-bit DMA # for SB DSP 4.x chip (0,1,3) + dma16 - 16-bit DMA # for SB DSP 4.x chip (5,6,7) + mic_agc - Mic Auto-Gain-Control - 0 = disable, 1 = enable (default) + csp - ASP/CSP chip support - 0 = disable (default), 1 = enable + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + Module supports up to 8 cards, autoprobe and ISA PnP. + + Note: To use Vibra16X cards in 16-bit half duplex mode, you must + disable 16bit DMA with dma16 = -1 module parameter. + Also, all Sound Blaster 16 type cards can operate in 16-bit + half duplex mode through 8-bit DMA channel by disabling their + 16-bit DMA channel. + + Module snd-sgalaxy + ------------------ + + Module for Aztech Sound Galaxy sound card. + + sbport - Port # for SB16 interface (0x220,0x240) + wssport - Port # for WSS interface (0x530,0xe80,0xf40,0x604) + irq - IRQ # (7,9,10,11) + dma1 - DMA # + + Module supports up to 8 cards. + + Module snd-sscape + ----------------- + + Module for ENSONIQ SoundScape PnP cards. + + port - Port # (PnP setup) + irq - IRQ # (PnP setup) + mpu_irq - MPU-401 IRQ # (PnP setup) + dma - DMA # (PnP setup) + + Module supports up to 8 cards. ISA PnP must be enabled. + You need sscape_ctl tool in alsa-tools package for loading + the microcode. + + Module snd-sun-amd7930 (on sparc only) + -------------------------------------- + + Module for AMD7930 sound chips found on Sparcs. + + Module supports up to 8 cards. + + Module snd-sun-cs4231 (on sparc only) + ------------------------------------- + + Module for CS4231 sound chips found on Sparcs. + + Module supports up to 8 cards. + + Module snd-wavefront + -------------------- + + Module for Turtle Beach Maui, Tropez and Tropez+ sound cards. + + cs4232_pcm_port - Port # for CS4232 PCM interface. + cs4232_pcm_irq - IRQ # for CS4232 PCM interface (5,7,9,11,12,15). + cs4232_mpu_port - Port # for CS4232 MPU-401 interface. + cs4232_mpu_irq - IRQ # for CS4232 MPU-401 interface (9,11,12,15). + use_cs4232_midi - Use CS4232 MPU-401 interface + (inaccessibly located inside your computer) + ics2115_port - Port # for ICS2115 + ics2115_irq - IRQ # for ICS2115 + fm_port - FM OPL-3 Port # + dma1 - DMA1 # for CS4232 PCM interface. + dma2 - DMA2 # for CS4232 PCM interface. + isapnp - ISA PnP detection - 0 = disable, 1 = enable (default) + + Module supports up to 8 cards and ISA PnP. + + Module snd-sonicvibes + --------------------- + + Module for S3 SonicVibes PCI sound cards. + * PINE Schubert 32 PCI + + reverb - Reverb Enable - 1 = enable, 0 = disable (default) + - SoundCard must have onboard SRAM for this. + mge - Mic Gain Enable - 1 = enable, 0 = disable (default) + + Module supports up to 8 cards and autoprobe. + + Module snd-serial-u16550 + ------------------------ + + Module for UART16550A serial MIDI ports. + + port - port # for UART16550A chip + irq - IRQ # for UART16550A chip, -1 = poll mode + speed - speed in bauds (9600,19200,38400,57600,115200) + 38400 = default + base - base for divisor in bauds (57600,115200,230400,460800) + 115200 = default + outs - number of MIDI ports in a serial port (1-4) + 1 = default + adaptor - Type of adaptor. + 0 = Soundcanvas, 1 = MS-124T, 2 = MS-124W S/A, + 3 = MS-124W M/B, 4 = Generic + + Module supports up to 8 cards. This module does not support autoprobe + thus the main port must be specified!!! Other options are optional. + + Module snd-trident + ------------------ + + Module for Trident 4DWave DX/NX sound cards. + * Best Union Miss Melody 4DWave PCI + * HIS 4DWave PCI + * Warpspeed ONSpeed 4DWave PCI + * AzTech PCI 64-Q3D + * Addonics SV 750 + * CHIC True Sound 4Dwave + * Shark Predator4D-PCI + * Jaton SonicWave 4D + + pcm_channels - max channels (voices) reserved for PCM + wavetable_size - max wavetable size in kB (4-?kb) + + Module supports up to 8 cards and autoprobe. + + The power-management is supported. + + Module snd-usb-audio + -------------------- + + Module for USB audio and USB MIDI devices. + + vid - Vendor ID for the device (optional) + pid - Product ID for the device (optional) + + This module supports up to 8 cards, autoprobe and hotplugging. + + Module snd-usb-usx2y + -------------------- + + Module for Tascam USB US-122, US-224 and US-428 devices. + + This module supports up to 8 cards, autoprobe and hotplugging. + + Note: you need to load the firmware via usx2yloader utility included + in alsa-tools and alsa-firmware packages. + + Module snd-via82xx + ------------------ + + Module for AC'97 motherboards based on VIA 82C686A/686B, 8233, + 8233A, 8233C, 8235 (south) bridge. + + mpu_port - 0x300,0x310,0x320,0x330, otherwise obtain BIOS setup + [VIA686A/686B only] + joystick - Enable joystick (default off) [VIA686A/686B only] + ac97_clock - AC'97 codec clock base (default 48000Hz) + dxs_support - support DXS channels, + 0 = auto (defalut), 1 = enable, 2 = disable, + 3 = 48k only, 4 = no VRA + [VIA8233/C,8235 only] + ac97_quirk - AC'97 workaround for strange hardware + See the description of intel8x0 module for details. + + Module supports autoprobe and multiple bus-master chips (max 8). + + Note: on some SMP motherboards like MSI 694D the interrupts might + not be generated properly. In such a case, please try to + set the SMP (or MPS) version on BIOS to 1.1 instead of + default value 1.4. Then the interrupt number will be + assigned under 15. You might also upgrade your BIOS. + + Note: VIA8233/5 (not VIA8233A) can support DXS (direct sound) + channels as the first PCM. On these channels, up to 4 + streams can be played at the same time. + As default (dxs_support = 0), 48k fixed rate is chosen + except for the known devices since the output is often + noisy except for 48k on some mother boards due to the + bug of BIOS. + Please try once dxs_support=1 and if it works on other + sample rates (e.g. 44.1kHz of mp3 playback), please let us + know the PCI subsystem vendor/device id's (output of + "lspci -nv"). + If it doesn't work, try dxs_support=4. If it still doesn't + work and the default setting is ok, dxs_support=3 is the + right choice. If the default setting doesn't work at all, + try dxs_support=2 to disable the DXS channels. + In any cases, please let us know the result and the + subsystem vendor/device ids. + + Note: for the MPU401 on VIA823x, use snd-mpu401 driver + additonally. The mpu_port option is for VIA686 chips only. + + Module snd-via82xx-modem + ------------------------ + + Module for VIA82xx AC97 modem + + ac97_clock - AC'97 codec clock base (default 48000Hz) + + Module supports up to 8 cards. + + Note: The default index value of this module is -2, i.e. the first + slot is excluded. + + Module snd-virmidi + ------------------ + + Module for virtual rawmidi devices. + This module creates virtual rawmidi devices which communicate + to the corresponding ALSA sequencer ports. + + midi_devs - MIDI devices # (1-8, default=4) + + Module supports up to 8 cards. + + Module snd-vx222 + ---------------- + + Module for Digigram VX-Pocket VX222, V222 v2 and Mic cards. + + mic - Enable Microphone on V222 Mic (NYI) + ibl - Capture IBL size. (default = 0, minimum size) + + Module supports up to 8 cards. + + When the driver is compiled as a module and the hotplug firmware + is supported, the firmware data is loaded via hotplug automatically. + Install the necessary firmware files in alsa-firmware package. + When no hotplug fw loader is available, you need to load the + firmware via vxloader utility in alsa-tools package. To invoke + vxloader automatically, add the following to /etc/modprobe.conf + + install snd-vx222 /sbin/modprobe --first-time -i snd-vx222 && /usr/bin/vxloader + + (for 2.2/2.4 kernels, add "post-install /usr/bin/vxloader" to + /etc/modules.conf, instead.) + IBL size defines the interrupts period for PCM. The smaller size + gives smaller latency but leads to more CPU consumption, too. + The size is usually aligned to 126. As default (=0), the smallest + size is chosen. The possible IBL values can be found in + /proc/asound/cardX/vx-status proc file. + + Module snd-vxpocket + ------------------- + + Module for Digigram VX-Pocket VX2 PCMCIA card. + + ibl - Capture IBL size. (default = 0, minimum size) + + Module supports up to 8 cards. The module is compiled only when + PCMCIA is supported on kernel. + + To activate the driver via the card manager, you'll need to set + up /etc/pcmcia/vxpocket.conf. See the sound/pcmcia/vx/vxpocket.c. + + When the driver is compiled as a module and the hotplug firmware + is supported, the firmware data is loaded via hotplug automatically. + Install the necessary firmware files in alsa-firmware package. + When no hotplug fw loader is available, you need to load the + firmware via vxloader utility in alsa-tools package. + + About capture IBL, see the description of snd-vx222 module. + + Note: the driver is build only when CONFIG_ISA is set. + + Module snd-vxp440 + ----------------- + + Module for Digigram VX-Pocket 440 PCMCIA card. + + ibl - Capture IBL size. (default = 0, minimum size) + + Module supports up to 8 cards. The module is compiled only when + PCMCIA is supported on kernel. + + To activate the driver via the card manager, you'll need to set + up /etc/pcmcia/vxp440.conf. See the sound/pcmcia/vx/vxp440.c. + + When the driver is compiled as a module and the hotplug firmware + is supported, the firmware data is loaded via hotplug automatically. + Install the necessary firmware files in alsa-firmware package. + When no hotplug fw loader is available, you need to load the + firmware via vxloader utility in alsa-tools package. + + About capture IBL, see the description of snd-vx222 module. + + Note: the driver is build only when CONFIG_ISA is set. + + Module snd-ymfpci + ----------------- + + Module for Yamaha PCI chips (YMF72x, YMF74x & YMF75x). + + mpu_port - 0x300,0x330,0x332,0x334, 0 (disable) by default, + 1 (auto-detect for YMF744/754 only) + fm_port - 0x388,0x398,0x3a0,0x3a8, 0 (disable) by default + 1 (auto-detect for YMF744/754 only) + joystick_port - 0x201,0x202,0x204,0x205, 0 (disable) by default, + 1 (auto-detect) + rear_switch - enable shared rear/line-in switch (bool) + + Module supports autoprobe and multiple chips (max 8). + + The power-management is supported. + + Module snd-pdaudiocf + -------------------- + + Module for Sound Core PDAudioCF sound card. + + Note: the driver is build only when CONFIG_ISA is set. + + +Configuring Non-ISAPNP Cards +============================ + +When the kernel is configured with ISA-PnP support, the modules +supporting the isapnp cards will have module options "isapnp". +If this option is set, *only* the ISA-PnP devices will be probed. +For probing the non ISA-PnP cards, you have to pass "isapnp=0" option +together with the proper i/o and irq configuration. + +When the kernel is configured without ISA-PnP support, isapnp option +will be not built in. + + +Module Autoloading Support +========================== + +The ALSA drivers can be loaded automatically on demand by defining +module aliases. The string 'snd-card-%1' is requested for ALSA native +devices where %i is sound card number from zero to seven. + +To auto-load an ALSA driver for OSS services, define the string +'sound-slot-%i' where %i means the slot number for OSS, which +corresponds to the card index of ALSA. Usually, define this +as the the same card module. + +An example configuration for a single emu10k1 card is like below: +----- /etc/modprobe.conf +alias snd-card-0 snd-emu10k1 +alias sound-slot-0 snd-emu10k1 +----- /etc/modprobe.conf + +The available number of auto-loaded sound cards depends on the module +option "cards_limit" of snd module. As default it's set to 1. +To enable the auto-loading of multiple cards, specify the number of +sound cards in that option. + +When multiple cards are available, it'd better to specify the index +number for each card via module option, too, so that the order of +cards is kept consistent. + +An example configuration for two sound cards is like below: + +----- /etc/modprobe.conf +# ALSA portion +options snd cards_limit=2 +alias snd-card-0 snd-interwave +alias snd-card-1 snd-ens1371 +options snd-interwave index=0 +options snd-ens1371 index=1 +# OSS/Free portion +alias sound-slot-0 snd-interwave +alias sound-slot-1 snd-ens1371 +----- /etc/moprobe.conf + +In this example, the interwave card is always loaded as the first card +(index 0) and ens1371 as the second (index 1). + + +ALSA PCM devices to OSS devices mapping +======================================= + +/dev/snd/pcmC0D0[c|p] -> /dev/audio0 (/dev/audio) -> minor 4 +/dev/snd/pcmC0D0[c|p] -> /dev/dsp0 (/dev/dsp) -> minor 3 +/dev/snd/pcmC0D1[c|p] -> /dev/adsp0 (/dev/adsp) -> minor 12 +/dev/snd/pcmC1D0[c|p] -> /dev/audio1 -> minor 4+16 = 20 +/dev/snd/pcmC1D0[c|p] -> /dev/dsp1 -> minor 3+16 = 19 +/dev/snd/pcmC1D1[c|p] -> /dev/adsp1 -> minor 12+16 = 28 +/dev/snd/pcmC2D0[c|p] -> /dev/audio2 -> minor 4+32 = 36 +/dev/snd/pcmC2D0[c|p] -> /dev/dsp2 -> minor 3+32 = 39 +/dev/snd/pcmC2D1[c|p] -> /dev/adsp2 -> minor 12+32 = 44 + +The first number from /dev/snd/pcmC{X}D{Y}[c|p] expression means +sound card number and second means device number. The ALSA devices +have either 'c' or 'p' suffix indicating the direction, capture and +playback, respectively. + +Please note that the device mapping above may be varied via the module +options of snd-pcm-oss module. + + +DEVFS support +============= + +The ALSA driver fully supports the devfs extension. +You should add lines below to your devfsd.conf file: + +LOOKUP snd MODLOAD ACTION snd +REGISTER ^sound/.* PERMISSIONS root.audio 660 +REGISTER ^snd/.* PERMISSIONS root.audio 660 + +Warning: These lines assume that you have the audio group in your system. + Otherwise replace audio word with another group name (root for + example). + + +Proc interfaces (/proc/asound) +============================== + +/proc/asound/card#/pcm#[cp]/oss +------------------------------- + String "erase" - erase all additional informations about OSS applications + String "<app_name> <fragments> <fragment_size> [<options>]" + + <app_name> - name of application with (higher priority) or without path + <fragments> - number of fragments or zero if auto + <fragment_size> - size of fragment in bytes or zero if auto + <options> - optional parameters + - disable the application tries to open a pcm device for + this channel but does not want to use it. + (Cause a bug or mmap needs) + It's good for Quake etc... + - direct don't use plugins + - block force block mode (rvplayer) + - non-block force non-block mode + - whole-frag write only whole fragments (optimization affecting + playback only) + - no-silence do not fill silence ahead to avoid clicks + + Example: echo "x11amp 128 16384" > /proc/asound/card0/pcm0p/oss + echo "squake 0 0 disable" > /proc/asound/card0/pcm0c/oss + echo "rvplayer 0 0 block" > /proc/asound/card0/pcm0p/oss + + +Links +===== + + ALSA project homepage + http://www.alsa-project.org + diff --git a/Documentation/sound/alsa/Audigy-mixer.txt b/Documentation/sound/alsa/Audigy-mixer.txt new file mode 100644 index 000000000000..5132fd95e074 --- /dev/null +++ b/Documentation/sound/alsa/Audigy-mixer.txt @@ -0,0 +1,345 @@ + + Sound Blaster Audigy mixer / default DSP code + =========================================== + +This is based on SB-Live-mixer.txt. + +The EMU10K2 chips have a DSP part which can be programmed to support +various ways of sample processing, which is described here. +(This acticle does not deal with the overall functionality of the +EMU10K2 chips. See the manuals section for further details.) + +The ALSA driver programs this portion of chip by default code +(can be altered later) which offers the following functionality: + + +1) Digital mixer controls +------------------------- + +These controls are built using the DSP instructions. They offer extended +functionality. Only the default build-in code in the ALSA driver is described +here. Note that the controls work as attenuators: the maximum value is the +neutral position leaving the signal unchanged. Note that if the same destination +is mentioned in multiple controls, the signal is accumulated and can be wrapped +(set to maximal or minimal value without checking of overflow). + + +Explanation of used abbreviations: + +DAC - digital to analog converter +ADC - analog to digital converter +I2S - one-way three wire serial bus for digital sound by Philips Semiconductors + (this standard is used for connecting standalone DAC and ADC converters) +LFE - low frequency effects (subwoofer signal) +AC97 - a chip containing an analog mixer, DAC and ADC converters +IEC958 - S/PDIF +FX-bus - the EMU10K2 chip has an effect bus containing 64 accumulators. + Each of the synthesizer voices can feed its output to these accumulators + and the DSP microcontroller can operate with the resulting sum. + +name='PCM Front Playback Volume',index=0 + +This control is used to attenuate samples for left and right front PCM FX-bus +accumulators. ALSA uses accumulators 8 and 9 for left and right front PCM +samples for 5.1 playback. The result samples are forwarded to the front DAC PCM +slots of the Philips DAC. + +name='PCM Surround Playback Volume',index=0 + +This control is used to attenuate samples for left and right surround PCM FX-bus +accumulators. ALSA uses accumulators 2 and 3 for left and right surround PCM +samples for 5.1 playback. The result samples are forwarded to the surround DAC PCM +slots of the Philips DAC. + +name='PCM Center Playback Volume',index=0 + +This control is used to attenuate samples for center PCM FX-bus accumulator. +ALSA uses accumulator 6 for center PCM sample for 5.1 playback. The result sample +is forwarded to the center DAC PCM slot of the Philips DAC. + +name='PCM LFE Playback Volume',index=0 + +This control is used to attenuate sample for LFE PCM FX-bus accumulator. +ALSA uses accumulator 7 for LFE PCM sample for 5.1 playback. The result sample +is forwarded to the LFE DAC PCM slot of the Philips DAC. + +name='PCM Playback Volume',index=0 + +This control is used to attenuate samples for left and right PCM FX-bus +accumulators. ALSA uses accumulators 0 and 1 for left and right PCM samples for +stereo playback. The result samples are forwarded to the front DAC PCM slots +of the Philips DAC. + +name='PCM Capture Volume',index=0 + +This control is used to attenuate samples for left and right PCM FX-bus +accumulator. ALSA uses accumulators 0 and 1 for left and right PCM. +The result is forwarded to the ADC capture FIFO (thus to the standard capture +PCM device). + +name='Music Playback Volume',index=0 + +This control is used to attenuate samples for left and right MIDI FX-bus +accumulators. ALSA uses accumulators 4 and 5 for left and right MIDI samples. +The result samples are forwarded to the front DAC PCM slots of the AC97 codec. + +name='Music Capture Volume',index=0 + +These controls are used to attenuate samples for left and right MIDI FX-bus +accumulator. ALSA uses accumulators 4 and 5 for left and right PCM. +The result is forwarded to the ADC capture FIFO (thus to the standard capture +PCM device). + +name='Mic Playback Volume',index=0 + +This control is used to attenuate samples for left and right Mic input. +For Mic input is used AC97 codec. The result samples are forwarded to +the front DAC PCM slots of the Philips DAC. Samples are forwarded to Mic +capture FIFO (device 1 - 16bit/8KHz mono) too without volume control. + +name='Mic Capture Volume',index=0 + +This control is used to attenuate samples for left and right Mic input. +The result is forwarded to the ADC capture FIFO (thus to the standard capture +PCM device). + +name='Audigy CD Playback Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 TTL +digital inputs (usually used by a CDROM drive). The result samples are +forwarded to the front DAC PCM slots of the Philips DAC. + +name='Audigy CD Capture Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 TTL +digital inputs (usually used by a CDROM drive). The result samples are +forwarded to the ADC capture FIFO (thus to the standard capture PCM device). + +name='IEC958 Optical Playback Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 optical +digital input. The result samples are forwarded to the front DAC PCM slots +of the Philips DAC. + +name='IEC958 Optical Capture Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 optical +digital inputs. The result samples are forwarded to the ADC capture FIFO +(thus to the standard capture PCM device). + +name='Line2 Playback Volume',index=0 + +This control is used to attenuate samples from left and right I2S ADC +inputs (on the AudigyDrive). The result samples are forwarded to the front +DAC PCM slots of the Philips DAC. + +name='Line2 Capture Volume',index=1 + +This control is used to attenuate samples from left and right I2S ADC +inputs (on the AudigyDrive). The result samples are forwarded to the ADC +capture FIFO (thus to the standard capture PCM device). + +name='Analog Mix Playback Volume',index=0 + +This control is used to attenuate samples from left and right I2S ADC +inputs from Philips ADC. The result samples are forwarded to the front +DAC PCM slots of the Philips DAC. This contains mix from analog sources +like CD, Line In, Aux, .... + +name='Analog Mix Capture Volume',index=1 + +This control is used to attenuate samples from left and right I2S ADC +inputs Philips ADC. The result samples are forwarded to the ADC +capture FIFO (thus to the standard capture PCM device). + +name='Aux2 Playback Volume',index=0 + +This control is used to attenuate samples from left and right I2S ADC +inputs (on the AudigyDrive). The result samples are forwarded to the front +DAC PCM slots of the Philips DAC. + +name='Aux2 Capture Volume',index=1 + +This control is used to attenuate samples from left and right I2S ADC +inputs (on the AudigyDrive). The result samples are forwarded to the ADC +capture FIFO (thus to the standard capture PCM device). + +name='Front Playback Volume',index=0 + +All stereo signals are mixed together and mirrored to surround, center and LFE. +This control is used to attenuate samples for left and right front speakers of +this mix. + +name='Surround Playback Volume',index=0 + +All stereo signals are mixed together and mirrored to surround, center and LFE. +This control is used to attenuate samples for left and right surround speakers of +this mix. + +name='Center Playback Volume',index=0 + +All stereo signals are mixed together and mirrored to surround, center and LFE. +This control is used to attenuate sample for center speaker of this mix. + +name='LFE Playback Volume',index=0 + +All stereo signals are mixed together and mirrored to surround, center and LFE. +This control is used to attenuate sample for LFE speaker of this mix. + +name='Tone Control - Switch',index=0 + +This control turns the tone control on or off. The samples for front, rear +and center / LFE outputs are affected. + +name='Tone Control - Bass',index=0 + +This control sets the bass intensity. There is no neutral value!! +When the tone control code is activated, the samples are always modified. +The closest value to pure signal is 20. + +name='Tone Control - Treble',index=0 + +This control sets the treble intensity. There is no neutral value!! +When the tone control code is activated, the samples are always modified. +The closest value to pure signal is 20. + +name='Master Playback Volume',index=0 + +This control is used to attenuate samples for front, surround, center and +LFE outputs. + +name='IEC958 Optical Raw Playback Switch',index=0 + +If this switch is on, then the samples for the IEC958 (S/PDIF) digital +output are taken only from the raw FX8010 PCM, otherwise standard front +PCM samples are taken. + + +2) PCM stream related controls +------------------------------ + +name='EMU10K1 PCM Volume',index 0-31 + +Channel volume attenuation in range 0-0xffff. The maximum value (no +attenuation) is default. The channel mapping for three values is +as follows: + + 0 - mono, default 0xffff (no attenuation) + 1 - left, default 0xffff (no attenuation) + 2 - right, default 0xffff (no attenuation) + +name='EMU10K1 PCM Send Routing',index 0-31 + +This control specifies the destination - FX-bus accumulators. There 24 +values with this mapping: + + 0 - mono, A destination (FX-bus 0-63), default 0 + 1 - mono, B destination (FX-bus 0-63), default 1 + 2 - mono, C destination (FX-bus 0-63), default 2 + 3 - mono, D destination (FX-bus 0-63), default 3 + 4 - mono, E destination (FX-bus 0-63), default 0 + 5 - mono, F destination (FX-bus 0-63), default 0 + 6 - mono, G destination (FX-bus 0-63), default 0 + 7 - mono, H destination (FX-bus 0-63), default 0 + 8 - left, A destination (FX-bus 0-63), default 0 + 9 - left, B destination (FX-bus 0-63), default 1 + 10 - left, C destination (FX-bus 0-63), default 2 + 11 - left, D destination (FX-bus 0-63), default 3 + 12 - left, E destination (FX-bus 0-63), default 0 + 13 - left, F destination (FX-bus 0-63), default 0 + 14 - left, G destination (FX-bus 0-63), default 0 + 15 - left, H destination (FX-bus 0-63), default 0 + 16 - right, A destination (FX-bus 0-63), default 0 + 17 - right, B destination (FX-bus 0-63), default 1 + 18 - right, C destination (FX-bus 0-63), default 2 + 19 - right, D destination (FX-bus 0-63), default 3 + 20 - right, E destination (FX-bus 0-63), default 0 + 21 - right, F destination (FX-bus 0-63), default 0 + 22 - right, G destination (FX-bus 0-63), default 0 + 23 - right, H destination (FX-bus 0-63), default 0 + +Don't forget that it's illegal to assign a channel to the same FX-bus accumulator +more than once (it means 0=0 && 1=0 is an invalid combination). + +name='EMU10K1 PCM Send Volume',index 0-31 + +It specifies the attenuation (amount) for given destination in range 0-255. +The channel mapping is following: + + 0 - mono, A destination attn, default 255 (no attenuation) + 1 - mono, B destination attn, default 255 (no attenuation) + 2 - mono, C destination attn, default 0 (mute) + 3 - mono, D destination attn, default 0 (mute) + 4 - mono, E destination attn, default 0 (mute) + 5 - mono, F destination attn, default 0 (mute) + 6 - mono, G destination attn, default 0 (mute) + 7 - mono, H destination attn, default 0 (mute) + 8 - left, A destination attn, default 255 (no attenuation) + 9 - left, B destination attn, default 0 (mute) + 10 - left, C destination attn, default 0 (mute) + 11 - left, D destination attn, default 0 (mute) + 12 - left, E destination attn, default 0 (mute) + 13 - left, F destination attn, default 0 (mute) + 14 - left, G destination attn, default 0 (mute) + 15 - left, H destination attn, default 0 (mute) + 16 - right, A destination attn, default 0 (mute) + 17 - right, B destination attn, default 255 (no attenuation) + 18 - right, C destination attn, default 0 (mute) + 19 - right, D destination attn, default 0 (mute) + 20 - right, E destination attn, default 0 (mute) + 21 - right, F destination attn, default 0 (mute) + 22 - right, G destination attn, default 0 (mute) + 23 - right, H destination attn, default 0 (mute) + + + +4) MANUALS/PATENTS: +------------------- + +ftp://opensource.creative.com/pub/doc +------------------------------------- + + Files: + LM4545.pdf AC97 Codec + + m2049.pdf The EMU10K1 Digital Audio Processor + + hog63.ps FX8010 - A DSP Chip Architecture for Audio Effects + + +WIPO Patents +------------ + Patent numbers: + WO 9901813 (A1) Audio Effects Processor with multiple asynchronous (Jan. 14, 1999) + streams + + WO 9901814 (A1) Processor with Instruction Set for Audio Effects (Jan. 14, 1999) + + WO 9901953 (A1) Audio Effects Processor having Decoupled Instruction + Execution and Audio Data Sequencing (Jan. 14, 1999) + + +US Patents (http://www.uspto.gov/) +---------------------------------- + + US 5925841 Digital Sampling Instrument employing cache memory (Jul. 20, 1999) + + US 5928342 Audio Effects Processor integrated on a single chip (Jul. 27, 1999) + with a multiport memory onto which multiple asynchronous + digital sound samples can be concurrently loaded + + US 5930158 Processor with Instruction Set for Audio Effects (Jul. 27, 1999) + + US 6032235 Memory initialization circuit (Tram) (Feb. 29, 2000) + + US 6138207 Interpolation looping of audio samples in cache connected to (Oct. 24, 2000) + system bus with prioritization and modification of bus transfers + in accordance with loop ends and minimum block sizes + + US 6151670 Method for conserving memory storage using a (Nov. 21, 2000) + pool of short term memory registers + + US 6195715 Interrupt control for multiple programs communicating with (Feb. 27, 2001) + a common interrupt by associating programs to GP registers, + defining interrupt register, polling GP registers, and invoking + callback routine associated with defined interrupt register diff --git a/Documentation/sound/alsa/Bt87x.txt b/Documentation/sound/alsa/Bt87x.txt new file mode 100644 index 000000000000..11edb2fd2a5a --- /dev/null +++ b/Documentation/sound/alsa/Bt87x.txt @@ -0,0 +1,78 @@ +Intro +===== + +You might have noticed that the bt878 grabber cards have actually +_two_ PCI functions: + +$ lspci +[ ... ] +00:0a.0 Multimedia video controller: Brooktree Corporation Bt878 (rev 02) +00:0a.1 Multimedia controller: Brooktree Corporation Bt878 (rev 02) +[ ... ] + +The first does video, it is backward compatible to the bt848. The second +does audio. snd-bt87x is a driver for the second function. It's a sound +driver which can be used for recording sound (and _only_ recording, no +playback). As most TV cards come with a short cable which can be plugged +into your sound card's line-in you probably don't need this driver if all +you want to do is just watching TV... + +Some cards do not bother to connect anything to the audio input pins of +the chip, and some other cards use the audio function to transport MPEG +video data, so it's quite possible that audio recording may not work +with your card. + + +Driver Status +============= + +The driver is now stable. However, it doesn't know about many TV cards, +and it refuses to load for cards it doesn't know. + +If the driver complains ("Unknown TV card found, the audio driver will +not load"), you can specify the load_all=1 option to force the driver to +try to use the audio capture function of your card. If the frequency of +recorded data is not right, try to specify the digital_rate option with +other values than the default 32000 (often it's 44100 or 64000). + +If you have an unknown card, please mail the ID and board name to +<alsa-devel@lists.sf.net>, regardless of whether audio capture works or +not, so that future versions of this driver know about your card. + + +Audio modes +=========== + +The chip knows two different modes (digital/analog). snd-bt87x +registers two PCM devices, one for each mode. They cannot be used at +the same time. + + +Digital audio mode +================== + +The first device (hw:X,0) gives you 16 bit stereo sound. The sample +rate depends on the external source which feeds the Bt87x with digital +sound via I2S interface. + + +Analog audio mode (A/D) +======================= + +The second device (hw:X,1) gives you 8 or 16 bit mono sound. Supported +sample rates are between 119466 and 448000 Hz (yes, these numbers are +that high). If you've set the CONFIG_SND_BT87X_OVERCLOCK option, the +maximum sample rate is 1792000 Hz, but audio data becomes unusable +beyond 896000 Hz on my card. + +The chip has three analog inputs. Consequently you'll get a mixer +device to control these. + + +Have fun, + + Clemens + + +Written by Clemens Ladisch <clemens@ladisch.de> +big parts copied from btaudio.txt by Gerd Knorr <kraxel@bytesex.org> diff --git a/Documentation/sound/alsa/CMIPCI.txt b/Documentation/sound/alsa/CMIPCI.txt new file mode 100644 index 000000000000..4a7df771b806 --- /dev/null +++ b/Documentation/sound/alsa/CMIPCI.txt @@ -0,0 +1,242 @@ + Brief Notes on C-Media 8738/8338 Driver + ======================================= + + Takashi Iwai <tiwai@suse.de> + + +Front/Rear Multi-channel Playback +--------------------------------- + +CM8x38 chip can use ADC as the second DAC so that two different stereo +channels can be used for front/rear playbacks. Since there are two +DACs, both streams are handled independently unlike the 4/6ch multi- +channel playbacks in the section below. + +As default, ALSA driver assigns the first PCM device (i.e. hw:0,0 for +card#0) for front and 4/6ch playbacks, while the second PCM device +(hw:0,1) is assigned to the second DAC for rear playback. + +There are slight difference between two DACs. + +- The first DAC supports U8 and S16LE formats, while the second DAC + supports only S16LE. +- The seconde DAC supports only two channel stereo. + +Please note that the CM8x38 DAC doesn't support continuous playback +rate but only fixed rates: 5512, 8000, 11025, 16000, 22050, 32000, +44100 and 48000 Hz. + +The rear output can be heard only when "Four Channel Mode" switch is +disabled. Otherwise no signal will be routed to the rear speakers. +As default it's turned on. + +*** WARNING *** +When "Four Channel Mode" switch is off, the output from rear speakers +will be FULL VOLUME regardless of Master and PCM volumes. +This might damage your audio equipment. Please disconnect speakers +before your turn off this switch. +*** WARNING *** + +[ Well.. I once got the output with correct volume (i.e. same with the + front one) and was so excited. It was even with "Four Channel" bit + on and "double DAC" mode. Actually I could hear separate 4 channels + from front and rear speakers! But.. after reboot, all was gone. + It's a very pity that I didn't save the register dump at that + time.. Maybe there is an unknown register to achieve this... ] + +If your card has an extra output jack for the rear output, the rear +playback should be routed there as default. If not, there is a +control switch in the driver "Line-In As Rear", which you can change +via alsamixer or somewhat else. When this switch is on, line-in jack +is used as rear output. + +There are two more controls regarding to the rear output. +The "Exchange DAC" switch is used to exchange front and rear playback +routes, i.e. the 2nd DAC is output from front output. + + +4/6 Multi-Channel Playback +-------------------------- + +The recent CM8738 chips support for the 4/6 multi-channel playback +function. This is useful especially for AC3 decoding. + +When the multi-channel is supported, the driver name has a suffix +"-MC" such like "CMI8738-MC6". You can check this name from +/proc/asound/cards. + +When the 4/6-ch output is enabled, the second DAC accepts up to 6 (or +4) channels. While the dual DAC supports two different rates or +formats, the 4/6-ch playback supports only the same condition for all +channels. Since the multi-channel playback mode uses both DACs, you +cannot operate with full-duplex. + +The 4.0 and 5.1 modes are defined as the pcm "surround40" and "surround51" +in alsa-lib. For example, you can play a WAV file with 6 channels like + + % aplay -Dsurround51 sixchannels.wav + +For programmin the 4/6 channel playback, you need to specify the PCM +channels as you like and set the format S16LE. For example, for playback +with 4 channels, + + snd_pcm_hw_params_set_access(pcm, hw, SND_PCM_ACCESS_RW_INTERLEAVED); + // or mmap if you like + snd_pcm_hw_params_set_format(pcm, hw, SND_PCM_FORMAT_S16_LE); + snd_pcm_hw_params_set_channels(pcm, hw, 4); + +and use the interleaved 4 channel data. + +There are some control switchs affecting to the speaker connections: + +"Line-In As Rear" - As mentioned above, the line-in jack is used + for the rear (3th and 4th channels) output. +"Line-In As Bass" - The line-in jack is used for the bass (5th + and 6th channels) output. +"Mic As Center/LFE" - The mic jack is used for the bass output. + If this switch is on, you cannot use a microphone as a capture + source, of course. + + +Digital I/O +----------- + +The CM8x38 provides the excellent SPDIF capability with very chip +price (yes, that's the reason I bought the card :) + +The SPDIF playback and capture are done via the third PCM device +(hw:0,2). Usually this is assigned to the PCM device "spdif". +The available rates are 44100 and 48000 Hz. +For playback with aplay, you can run like below: + + % aplay -Dhw:0,2 foo.wav + +or + + % aplay -Dspdif foo.wav + +24bit format is also supported experimentally. + +The playback and capture over SPDIF use normal DAC and ADC, +respectively, so you cannot playback both analog and digital streams +simultaneously. + +To enable SPDIF output, you need to turn on "IEC958 Output Switch" +control via mixer or alsactl. Then you'll see the red light on from +the card so you know that's working obviously :) +The SPDIF input is always enabled, so you can hear SPDIF input data +from line-out with "IEC958 In Monitor" switch at any time (see +below). + +You can play via SPDIF even with the first device (hw:0,0), +but SPDIF is enabled only when the proper format (S16LE), sample rate +(441100 or 48000) and channels (2) are used. Otherwise it's turned +off. (Also don't forget to turn on "IEC958 Output Switch", too.) + + +Additionally there are relevant control switches: + +"IEC958 Mix Analog" - Mix analog PCM playback and FM-OPL/3 streams and + output through SPDIF. This switch appears only on old chip + models (CM8738 033 and 037). + Note: without this control you can output PCM to SPDIF. + This is "mixing" of streams, so e.g. it's not for AC3 output + (see the next section). + +"IEC958 In Select" - Select SPDIF input, the internal CD-in (false) + and the external input (true). + +"IEC958 Loop" - SPDIF input data is loop back into SPDIF + output (aka bypass) + +"IEC958 Copyright" - Set the copyright bit. + +"IEC958 5V" - Select 0.5V (coax) or 5V (optical) interface. + On some cards this doesn't work and you need to change the + configuration with hardware dip-switch. + +"IEC958 In Monitor" - SPDIF input is routed to DAC. + +"IEC958 In Phase Inverse" - Set SPDIF input format as inverse. + [FIXME: this doesn't work on all chips..] + +"IEC958 In Valid" - Set input validity flag detection. + +Note: When "PCM Playback Switch" is on, you'll hear the digital output +stream through analog line-out. + + +The AC3 (RAW DIGITAL) OUTPUT +---------------------------- + +The driver supports raw digital (typically AC3) i/o over SPDIF. This +can be toggled via IEC958 playback control, but usually you need to +access it via alsa-lib. See alsa-lib documents for more details. + +On the raw digital mode, the "PCM Playback Switch" is automatically +turned off so that non-audio data is heard from the analog line-out. +Similarly the following switches are off: "IEC958 Mix Analog" and +"IEC958 Loop". The switches are resumed after closing the SPDIF PCM +device automatically to the previous state. + +On the model 033, AC3 is implemented by the software conversion in +the alsa-lib. If you need to bypass the software conversion of IEC958 +subframes, pass the "soft_ac3=0" module option. This doesn't matter +on the newer models. + + +ANALOG MIXER INTERFACE +---------------------- + +The mixer interface on CM8x38 is similar to SB16. +There are Master, PCM, Synth, CD, Line, Mic and PC Speaker playback +volumes. Synth, CD, Line and Mic have playback and capture switches, +too, as well as SB16. + +In addition to the standard SB mixer, CM8x38 provides more functions. +- PCM playback switch +- PCM capture switch (to capture the data sent to DAC) +- Mic Boost switch +- Mic capture volume +- Aux playback volume/switch and capture switch +- 3D control switch + + +MIDI CONTROLLER +--------------- + +The MPU401-UART interface is enabled as default only for the first +(CMIPCI) card. You need to set module option "midi_port" properly +for the 2nd (CMIPCI) card. + +There is _no_ hardware wavetable function on this chip (except for +OPL3 synth below). +What's said as MIDI synth on Windows is a software synthesizer +emulation. On Linux use TiMidity or other softsynth program for +playing MIDI music. + + +FM OPL/3 Synth +-------------- + +The FM OPL/3 is also enabled as default only for the first card. +Set "fm_port" module option for more cards. + +The output quality of FM OPL/3 is, however, very weird. +I don't know why.. + + +Joystick and Modem +------------------ + +The joystick and modem should be available by enabling the control +switch "Joystick" and "Modem" respectively. But I myself have never +tested them yet. + + +Debugging Information +--------------------- + +The registers are shown in /proc/asound/cardX/cmipci. If you have any +problem (especially unexpected behavior of mixer), please attach the +output of this proc file together with the bug report. diff --git a/Documentation/sound/alsa/ControlNames.txt b/Documentation/sound/alsa/ControlNames.txt new file mode 100644 index 000000000000..5b18298e9495 --- /dev/null +++ b/Documentation/sound/alsa/ControlNames.txt @@ -0,0 +1,84 @@ +This document describes standard names of mixer controls. + +Syntax: SOURCE [DIRECTION] FUNCTION + +DIRECTION: + <nothing> (both directions) + Playback + Capture + Bypass Playback + Bypass Capture + +FUNCTION: + Switch (on/off switch) + Volume + Route (route control, hardware specific) + +SOURCE: + Master + Master Mono + Hardware Master + Headphone + PC Speaker + Phone + Phone Input + Phone Output + Synth + FM + Mic + Line + CD + Video + Zoom Video + Aux + PCM + PCM Front + PCM Rear + PCM Pan + Loopback + Analog Loopback (D/A -> A/D loopback) + Digital Loopback (playback -> capture loopback - without analog path) + Mono + Mono Output + Multi + ADC + Wave + Music + I2S + IEC958 + +Exceptions: + [Digital] Capture Source + [Digital] Capture Switch (aka input gain switch) + [Digital] Capture Volume (aka input gain volume) + [Digital] Playback Switch (aka output gain switch) + [Digital] Playback Volume (aka output gain volume) + Tone Control - Switch + Tone Control - Bass + Tone Control - Treble + 3D Control - Switch + 3D Control - Center + 3D Control - Depth + 3D Control - Wide + 3D Control - Space + 3D Control - Level + Mic Boost [(?dB)] + +PCM interface: + + Sample Clock Source { "Word", "Internal", "AutoSync" } + Clock Sync Status { "Lock", "Sync", "No Lock" } + External Rate /* external capture rate */ + Capture Rate /* capture rate taken from external source */ + +IEC958 (S/PDIF) interface: + + IEC958 [...] [Playback|Capture] Switch /* turn on/off the IEC958 interface */ + IEC958 [...] [Playback|Capture] Volume /* digital volume control */ + IEC958 [...] [Playback|Capture] Default /* default or global value - read/write */ + IEC958 [...] [Playback|Capture] Mask /* consumer and professional mask */ + IEC958 [...] [Playback|Capture] Con Mask /* consumer mask */ + IEC958 [...] [Playback|Capture] Pro Mask /* professional mask */ + IEC958 [...] [Playback|Capture] PCM Stream /* the settings assigned to a PCM stream */ + IEC958 Q-subcode [Playback|Capture] Default /* Q-subcode bits */ + IEC958 Preamble [Playback|Capture] Default /* burst preamble words (4*16bits) */ diff --git a/Documentation/sound/alsa/DocBook/alsa-driver-api.tmpl b/Documentation/sound/alsa/DocBook/alsa-driver-api.tmpl new file mode 100644 index 000000000000..1f3ae3e32d69 --- /dev/null +++ b/Documentation/sound/alsa/DocBook/alsa-driver-api.tmpl @@ -0,0 +1,100 @@ +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN"> + +<book> +<?dbhtml filename="index.html"> + +<!-- ****************************************************** --> +<!-- Header --> +<!-- ****************************************************** --> + <bookinfo> + <title>The ALSA Driver API</title> + + <legalnotice> + <para> + This document is free; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + </para> + + <para> + This document is distributed in the hope that it will be useful, + but <emphasis>WITHOUT ANY WARRANTY</emphasis>; without even the + implied warranty of <emphasis>MERCHANTABILITY or FITNESS FOR A + PARTICULAR PURPOSE</emphasis>. See the GNU General Public License + for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + </legalnotice> + + </bookinfo> + + <chapter><title>Management of Cards and Devices</title> + <sect1><title>Card Managment</title> +!Esound/core/init.c + </sect1> + <sect1><title>Device Components</title> +!Esound/core/device.c + </sect1> + <sect1><title>KMOD and Device File Entries</title> +!Esound/core/sound.c + </sect1> + <sect1><title>Memory Management Helpers</title> +!Esound/core/memory.c +!Esound/core/memalloc.c + </sect1> + </chapter> + <chapter><title>PCM API</title> + <sect1><title>PCM Core</title> +!Esound/core/pcm.c +!Esound/core/pcm_lib.c +!Esound/core/pcm_native.c + </sect1> + <sect1><title>PCM Format Helpers</title> +!Esound/core/pcm_misc.c + </sect1> + <sect1><title>PCM Memory Managment</title> +!Esound/core/pcm_memory.c + </sect1> + </chapter> + <chapter><title>Control/Mixer API</title> + <sect1><title>General Control Interface</title> +!Esound/core/control.c + </sect1> + <sect1><title>AC97 Codec API</title> +!Esound/pci/ac97/ac97_codec.c +!Esound/pci/ac97/ac97_pcm.c + </sect1> + </chapter> + <chapter><title>MIDI API</title> + <sect1><title>Raw MIDI API</title> +!Esound/core/rawmidi.c + </sect1> + <sect1><title>MPU401-UART API</title> +!Esound/drivers/mpu401/mpu401_uart.c + </sect1> + </chapter> + <chapter><title>Proc Info API</title> + <sect1><title>Proc Info Interface</title> +!Esound/core/info.c + </sect1> + </chapter> + <chapter><title>Miscellaneous Functions</title> + <sect1><title>Hardware-Dependent Devices API</title> +!Esound/core/hwdep.c + </sect1> + <sect1><title>ISA DMA Helpers</title> +!Esound/core/isadma.c + </sect1> + <sect1><title>Other Helper Macros</title> +!Iinclude/sound/core.h + </sect1> + </chapter> + +</book> diff --git a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl new file mode 100644 index 000000000000..e789475304b6 --- /dev/null +++ b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl @@ -0,0 +1,6045 @@ +<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN"> + +<book> +<?dbhtml filename="index.html"> + +<!-- ****************************************************** --> +<!-- Header --> +<!-- ****************************************************** --> + <bookinfo> + <title>Writing an ALSA Driver</title> + <author> + <firstname>Takashi</firstname> + <surname>Iwai</surname> + <affiliation> + <address> + <email>tiwai@suse.de</email> + </address> + </affiliation> + </author> + + <date>March 6, 2005</date> + <edition>0.3.4</edition> + + <abstract> + <para> + This document describes how to write an ALSA (Advanced Linux + Sound Architecture) driver. + </para> + </abstract> + + <legalnotice> + <para> + Copyright (c) 2002-2004 Takashi Iwai <email>tiwai@suse.de</email> + </para> + + <para> + This document is free; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + </para> + + <para> + This document is distributed in the hope that it will be useful, + but <emphasis>WITHOUT ANY WARRANTY</emphasis>; without even the + implied warranty of <emphasis>MERCHANTABILITY or FITNESS FOR A + PARTICULAR PURPOSE</emphasis>. See the GNU General Public License + for more details. + </para> + + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + </para> + </legalnotice> + + </bookinfo> + +<!-- ****************************************************** --> +<!-- Preface --> +<!-- ****************************************************** --> + <preface id="preface"> + <title>Preface</title> + <para> + This document describes how to write an + <ulink url="http://www.alsa-project.org/"><citetitle> + ALSA (Advanced Linux Sound Architecture)</citetitle></ulink> + driver. The document focuses mainly on the PCI soundcard. + In the case of other device types, the API might + be different, too. However, at least the ALSA kernel API is + consistent, and therefore it would be still a bit help for + writing them. + </para> + + <para> + The target of this document is ones who already have enough + skill of C language and have the basic knowledge of linux + kernel programming. This document doesn't explain the general + topics of linux kernel codes and doesn't cover the detail of + implementation of each low-level driver. It describes only how is + the standard way to write a PCI sound driver on ALSA. + </para> + + <para> + If you are already familiar with the older ALSA ver.0.5.x, you + can check the drivers such as <filename>es1938.c</filename> or + <filename>maestro3.c</filename> which have also almost the same + code-base in the ALSA 0.5.x tree, so you can compare the differences. + </para> + + <para> + This document is still a draft version. Any feedbacks and + corrections, please!! + </para> + </preface> + + +<!-- ****************************************************** --> +<!-- File Tree Structure --> +<!-- ****************************************************** --> + <chapter id="file-tree"> + <title>File Tree Structure</title> + + <section id="file-tree-general"> + <title>General</title> + <para> + The ALSA drivers are provided in the two ways. + </para> + + <para> + One is the trees provided as a tarball or via cvs from the + ALSA's ftp site, and another is the 2.6 (or later) Linux kernel + tree. To synchronize both, the ALSA driver tree is split into + two different trees: alsa-kernel and alsa-driver. The former + contains purely the source codes for the Linux 2.6 (or later) + tree. This tree is designed only for compilation on 2.6 or + later environment. The latter, alsa-driver, contains many subtle + files for compiling the ALSA driver on the outside of Linux + kernel like configure script, the wrapper functions for older, + 2.2 and 2.4 kernels, to adapt the latest kernel API, + and additional drivers which are still in development or in + tests. The drivers in alsa-driver tree will be moved to + alsa-kernel (eventually 2.6 kernel tree) once when they are + finished and confirmed to work fine. + </para> + + <para> + The file tree structure of ALSA driver is depicted below. Both + alsa-kernel and alsa-driver have almost the same file + structure, except for <quote>core</quote> directory. It's + named as <quote>acore</quote> in alsa-driver tree. + + <example> + <title>ALSA File Tree Structure</title> + <literallayout> + sound + /core + /oss + /seq + /oss + /instr + /ioctl32 + /include + /drivers + /mpu401 + /opl3 + /i2c + /l3 + /synth + /emux + /pci + /(cards) + /isa + /(cards) + /arm + /ppc + /sparc + /usb + /pcmcia /(cards) + /oss + </literallayout> + </example> + </para> + </section> + + <section id="file-tree-core-directory"> + <title>core directory</title> + <para> + This directory contains the middle layer, that is, the heart + of ALSA drivers. In this directory, the native ALSA modules are + stored. The sub-directories contain different modules and are + dependent upon the kernel config. + </para> + + <section id="file-tree-core-directory-oss"> + <title>core/oss</title> + + <para> + The codes for PCM and mixer OSS emulation modules are stored + in this directory. The rawmidi OSS emulation is included in + the ALSA rawmidi code since it's quite small. The sequencer + code is stored in core/seq/oss directory (see + <link linkend="file-tree-core-directory-seq-oss"><citetitle> + below</citetitle></link>). + </para> + </section> + + <section id="file-tree-core-directory-ioctl32"> + <title>core/ioctl32</title> + + <para> + This directory contains the 32bit-ioctl wrappers for 64bit + architectures such like x86-64, ppc64 and sparc64. For 32bit + and alpha architectures, these are not compiled. + </para> + </section> + + <section id="file-tree-core-directory-seq"> + <title>core/seq</title> + <para> + This and its sub-directories are for the ALSA + sequencer. This directory contains the sequencer core and + primary sequencer modules such like snd-seq-midi, + snd-seq-virmidi, etc. They are compiled only when + <constant>CONFIG_SND_SEQUENCER</constant> is set in the kernel + config. + </para> + </section> + + <section id="file-tree-core-directory-seq-oss"> + <title>core/seq/oss</title> + <para> + This contains the OSS sequencer emulation codes. + </para> + </section> + + <section id="file-tree-core-directory-deq-instr"> + <title>core/seq/instr</title> + <para> + This directory contains the modules for the sequencer + instrument layer. + </para> + </section> + </section> + + <section id="file-tree-include-directory"> + <title>include directory</title> + <para> + This is the place for the public header files of ALSA drivers, + which are to be exported to the user-space, or included by + several files at different directories. Basically, the private + header files should not be placed in this directory, but you may + still find files there, due to historical reason :) + </para> + </section> + + <section id="file-tree-drivers-directory"> + <title>drivers directory</title> + <para> + This directory contains the codes shared among different drivers + on the different architectures. They are hence supposed not to be + architecture-specific. + For example, the dummy pcm driver and the serial MIDI + driver are found in this directory. In the sub-directories, + there are the codes for components which are independent from + bus and cpu architectures. + </para> + + <section id="file-tree-drivers-directory-mpu401"> + <title>drivers/mpu401</title> + <para> + The MPU401 and MPU401-UART modules are stored here. + </para> + </section> + + <section id="file-tree-drivers-directory-opl3"> + <title>drivers/opl3 and opl4</title> + <para> + The OPL3 and OPL4 FM-synth stuff is found here. + </para> + </section> + </section> + + <section id="file-tree-i2c-directory"> + <title>i2c directory</title> + <para> + This contains the ALSA i2c components. + </para> + + <para> + Although there is a standard i2c layer on Linux, ALSA has its + own i2c codes for some cards, because the soundcard needs only a + simple operation and the standard i2c API is too complicated for + such a purpose. + </para> + + <section id="file-tree-i2c-directory-l3"> + <title>i2c/l3</title> + <para> + This is a sub-directory for ARM L3 i2c. + </para> + </section> + </section> + + <section id="file-tree-synth-directory"> + <title>synth directory</title> + <para> + This contains the synth middle-level modules. + </para> + + <para> + So far, there is only Emu8000/Emu10k1 synth driver under + synth/emux sub-directory. + </para> + </section> + + <section id="file-tree-pci-directory"> + <title>pci directory</title> + <para> + This and its sub-directories hold the top-level card modules + for PCI soundcards and the codes specific to the PCI BUS. + </para> + + <para> + The drivers compiled from a single file is stored directly on + pci directory, while the drivers with several source files are + stored on its own sub-directory (e.g. emu10k1, ice1712). + </para> + </section> + + <section id="file-tree-isa-directory"> + <title>isa directory</title> + <para> + This and its sub-directories hold the top-level card modules + for ISA soundcards. + </para> + </section> + + <section id="file-tree-arm-ppc-sparc-directories"> + <title>arm, ppc, and sparc directories</title> + <para> + These are for the top-level card modules which are + specific to each given architecture. + </para> + </section> + + <section id="file-tree-usb-directory"> + <title>usb directory</title> + <para> + This contains the USB-audio driver. On the latest version, the + USB MIDI driver is integrated together with usb-audio driver. + </para> + </section> + + <section id="file-tree-pcmcia-directory"> + <title>pcmcia directory</title> + <para> + The PCMCIA, especially PCCard drivers will go here. CardBus + drivers will be on pci directory, because its API is identical + with the standard PCI cards. + </para> + </section> + + <section id="file-tree-oss-directory"> + <title>oss directory</title> + <para> + The OSS/Lite source files are stored here on Linux 2.6 (or + later) tree. (In the ALSA driver tarball, it's empty, of course :) + </para> + </section> + </chapter> + + +<!-- ****************************************************** --> +<!-- Basic Flow for PCI Drivers --> +<!-- ****************************************************** --> + <chapter id="basic-flow"> + <title>Basic Flow for PCI Drivers</title> + + <section id="basic-flow-outline"> + <title>Outline</title> + <para> + The minimum flow of PCI soundcard is like the following: + + <itemizedlist> + <listitem><para>define the PCI ID table (see the section + <link linkend="pci-resource-entries"><citetitle>PCI Entries + </citetitle></link>).</para></listitem> + <listitem><para>create <function>probe()</function> callback.</para></listitem> + <listitem><para>create <function>remove()</function> callback.</para></listitem> + <listitem><para>create pci_driver table which contains the three pointers above.</para></listitem> + <listitem><para>create <function>init()</function> function just calling <function>pci_module_init()</function> to register the pci_driver table defined above.</para></listitem> + <listitem><para>create <function>exit()</function> function to call <function>pci_unregister_driver()</function> function.</para></listitem> + </itemizedlist> + </para> + </section> + + <section id="basic-flow-example"> + <title>Full Code Example</title> + <para> + The code example is shown below. Some parts are kept + unimplemented at this moment but will be filled in the + succeeding sections. The numbers in comment lines of + <function>snd_mychip_probe()</function> function are the + markers. + + <example> + <title>Basic Flow for PCI Drivers Example</title> + <programlisting> +<![CDATA[ + #include <sound/driver.h> + #include <linux/init.h> + #include <linux/pci.h> + #include <linux/slab.h> + #include <sound/core.h> + #include <sound/initval.h> + + /* module parameters (see "Module Parameters") */ + static int index[SNDRV_CARDS] = SNDRV_DEFAULT_IDX; + static char *id[SNDRV_CARDS] = SNDRV_DEFAULT_STR; + static int enable[SNDRV_CARDS] = SNDRV_DEFAULT_ENABLE_PNP; + + /* definition of the chip-specific record */ + typedef struct snd_mychip mychip_t; + struct snd_mychip { + snd_card_t *card; + // rest of implementation will be in the section + // "PCI Resource Managements" + }; + + /* chip-specific destructor + * (see "PCI Resource Managements") + */ + static int snd_mychip_free(mychip_t *chip) + { + .... // will be implemented later... + } + + /* component-destructor + * (see "Management of Cards and Components") + */ + static int snd_mychip_dev_free(snd_device_t *device) + { + mychip_t *chip = device->device_data; + return snd_mychip_free(chip); + } + + /* chip-specific constructor + * (see "Management of Cards and Components") + */ + static int __devinit snd_mychip_create(snd_card_t *card, + struct pci_dev *pci, + mychip_t **rchip) + { + mychip_t *chip; + int err; + static snd_device_ops_t ops = { + .dev_free = snd_mychip_dev_free, + }; + + *rchip = NULL; + + // check PCI availability here + // (see "PCI Resource Managements") + .... + + /* allocate a chip-specific data with zero filled */ + chip = kcalloc(1, sizeof(*chip), GFP_KERNEL); + if (chip == NULL) + return -ENOMEM; + + chip->card = card; + + // rest of initialization here; will be implemented + // later, see "PCI Resource Managements" + .... + + if ((err = snd_device_new(card, SNDRV_DEV_LOWLEVEL, + chip, &ops)) < 0) { + snd_mychip_free(chip); + return err; + } + + snd_card_set_dev(card, &pci->dev); + + *rchip = chip; + return 0; + } + + /* constructor -- see "Constructor" sub-section */ + static int __devinit snd_mychip_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) + { + static int dev; + snd_card_t *card; + mychip_t *chip; + int err; + + /* (1) */ + if (dev >= SNDRV_CARDS) + return -ENODEV; + if (!enable[dev]) { + dev++; + return -ENOENT; + } + + /* (2) */ + card = snd_card_new(index[dev], id[dev], THIS_MODULE, 0); + if (card == NULL) + return -ENOMEM; + + /* (3) */ + if ((err = snd_mychip_create(card, pci, &chip)) < 0) { + snd_card_free(card); + return err; + } + + /* (4) */ + strcpy(card->driver, "My Chip"); + strcpy(card->shortname, "My Own Chip 123"); + sprintf(card->longname, "%s at 0x%lx irq %i", + card->shortname, chip->ioport, chip->irq); + + /* (5) */ + .... // implemented later + + /* (6) */ + if ((err = snd_card_register(card)) < 0) { + snd_card_free(card); + return err; + } + + /* (7) */ + pci_set_drvdata(pci, card); + dev++; + return 0; + } + + /* destructor -- see "Destructor" sub-section */ + static void __devexit snd_mychip_remove(struct pci_dev *pci) + { + snd_card_free(pci_get_drvdata(pci)); + pci_set_drvdata(pci, NULL); + } +]]> + </programlisting> + </example> + </para> + </section> + + <section id="basic-flow-constructor"> + <title>Constructor</title> + <para> + The real constructor of PCI drivers is probe callback. The + probe callback and other component-constructors which are called + from probe callback should be defined with + <parameter>__devinit</parameter> prefix. You + cannot use <parameter>__init</parameter> prefix for them, + because any PCI device could be a hotplug device. + </para> + + <para> + In the probe callback, the following scheme is often used. + </para> + + <section id="basic-flow-constructor-device-index"> + <title>1) Check and increment the device index.</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + static int dev; + .... + if (dev >= SNDRV_CARDS) + return -ENODEV; + if (!enable[dev]) { + dev++; + return -ENOENT; + } +]]> + </programlisting> + </informalexample> + + where enable[dev] is the module option. + </para> + + <para> + At each time probe callback is called, check the + availability of the device. If not available, simply increment + the device index and returns. dev will be incremented also + later (<link + linkend="basic-flow-constructor-set-pci"><citetitle>step + 7</citetitle></link>). + </para> + </section> + + <section id="basic-flow-constructor-create-card"> + <title>2) Create a card instance</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + snd_card_t *card; + .... + card = snd_card_new(index[dev], id[dev], THIS_MODULE, 0); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The detail will be explained in the section + <link linkend="card-management-card-instance"><citetitle> + Management of Cards and Components</citetitle></link>. + </para> + </section> + + <section id="basic-flow-constructor-create-main"> + <title>3) Create a main component</title> + <para> + In this part, the PCI resources are allocated. + + <informalexample> + <programlisting> +<![CDATA[ + mychip_t *chip; + .... + if ((err = snd_mychip_create(card, pci, &chip)) < 0) { + snd_card_free(card); + return err; + } +]]> + </programlisting> + </informalexample> + + The detail will be explained in the section <link + linkend="pci-resource"><citetitle>PCI Resource + Managements</citetitle></link>. + </para> + </section> + + <section id="basic-flow-constructor-main-component"> + <title>4) Set the driver ID and name strings.</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + strcpy(card->driver, "My Chip"); + strcpy(card->shortname, "My Own Chip 123"); + sprintf(card->longname, "%s at 0x%lx irq %i", + card->shortname, chip->ioport, chip->irq); +]]> + </programlisting> + </informalexample> + + The driver field holds the minimal ID string of the + chip. This is referred by alsa-lib's configurator, so keep it + simple but unique. + Even the same driver can have different driver IDs to + distinguish the functionality of each chip type. + </para> + + <para> + The shortname field is a string shown as more verbose + name. The longname field contains the information which is + shown in <filename>/proc/asound/cards</filename>. + </para> + </section> + + <section id="basic-flow-constructor-create-other"> + <title>5) Create other components, such as mixer, MIDI, etc.</title> + <para> + Here you define the basic components such as + <link linkend="pcm-interface"><citetitle>PCM</citetitle></link>, + mixer (e.g. <link linkend="api-ac97"><citetitle>AC97</citetitle></link>), + MIDI (e.g. <link linkend="midi-interface"><citetitle>MPU-401</citetitle></link>), + and other interfaces. + Also, if you want a <link linkend="proc-interface"><citetitle>proc + file</citetitle></link>, define it here, too. + </para> + </section> + + <section id="basic-flow-constructor-register-card"> + <title>6) Register the card instance.</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + if ((err = snd_card_register(card)) < 0) { + snd_card_free(card); + return err; + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + Will be explained in the section <link + linkend="card-management-registration"><citetitle>Management + of Cards and Components</citetitle></link>, too. + </para> + </section> + + <section id="basic-flow-constructor-set-pci"> + <title>7) Set the PCI driver data and return zero.</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + pci_set_drvdata(pci, card); + dev++; + return 0; +]]> + </programlisting> + </informalexample> + + In the above, the card record is stored. This pointer is + referred in the remove callback and power-management + callbacks, too. + </para> + </section> + </section> + + <section id="basic-flow-destructor"> + <title>Destructor</title> + <para> + The destructor, remove callback, simply releases the card + instance. Then the ALSA middle layer will release all the + attached components automatically. + </para> + + <para> + It would be typically like the following: + + <informalexample> + <programlisting> +<![CDATA[ + static void __devexit snd_mychip_remove(struct pci_dev *pci) + { + snd_card_free(pci_get_drvdata(pci)); + pci_set_drvdata(pci, NULL); + } +]]> + </programlisting> + </informalexample> + + The above code assumes that the card pointer is set to the PCI + driver data. + </para> + </section> + + <section id="basic-flow-header-files"> + <title>Header Files</title> + <para> + For the above example, at least the following include files + are necessary. + + <informalexample> + <programlisting> +<![CDATA[ + #include <sound/driver.h> + #include <linux/init.h> + #include <linux/pci.h> + #include <linux/slab.h> + #include <sound/core.h> + #include <sound/initval.h> +]]> + </programlisting> + </informalexample> + + where the last one is necessary only when module options are + defined in the source file. If the codes are split to several + files, the file without module options don't need them. + </para> + + <para> + In addition to them, you'll need + <filename><linux/interrupt.h></filename> for the interrupt + handling, and <filename><asm/io.h></filename> for the i/o + access. If you use <function>mdelay()</function> or + <function>udelay()</function> functions, you'll need to include + <filename><linux/delay.h></filename>, too. + </para> + + <para> + The ALSA interfaces like PCM or control API are defined in other + header files as <filename><sound/xxx.h></filename>. + They have to be included after + <filename><sound/core.h></filename>. + </para> + + </section> + </chapter> + + +<!-- ****************************************************** --> +<!-- Management of Cards and Components --> +<!-- ****************************************************** --> + <chapter id="card-management"> + <title>Management of Cards and Components</title> + + <section id="card-management-card-instance"> + <title>Card Instance</title> + <para> + For each soundcard, a <quote>card</quote> record must be allocated. + </para> + + <para> + A card record is the headquarters of the soundcard. It manages + the list of whole devices (components) on the soundcard, such as + PCM, mixers, MIDI, synthesizer, and so on. Also, the card + record holds the ID and the name strings of the card, manages + the root of proc files, and controls the power-management states + and hotplug disconnections. The component list on the card + record is used to manage the proper releases of resources at + destruction. + </para> + + <para> + As mentioned above, to create a card instance, call + <function>snd_card_new()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + snd_card_t *card; + card = snd_card_new(index, id, module, extra_size); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The function takes four arguments, the card-index number, the + id string, the module pointer (usually + <constant>THIS_MODULE</constant>), + and the size of extra-data space. The last argument is used to + allocate card->private_data for the + chip-specific data. Note that this data + <emphasis>is</emphasis> allocated by + <function>snd_card_new()</function>. + </para> + </section> + + <section id="card-management-component"> + <title>Components</title> + <para> + After the card is created, you can attach the components + (devices) to the card instance. On ALSA driver, a component is + represented as a <type>snd_device_t</type> object. + A component can be a PCM instance, a control interface, a raw + MIDI interface, etc. Each of such instances has one component + entry. + </para> + + <para> + A component can be created via + <function>snd_device_new()</function> function. + + <informalexample> + <programlisting> +<![CDATA[ + snd_device_new(card, SNDRV_DEV_XXX, chip, &ops); +]]> + </programlisting> + </informalexample> + </para> + + <para> + This takes the card pointer, the device-level + (<constant>SNDRV_DEV_XXX</constant>), the data pointer, and the + callback pointers (<parameter>&ops</parameter>). The + device-level defines the type of components and the order of + registration and de-registration. For most of components, the + device-level is already defined. For a user-defined component, + you can use <constant>SNDRV_DEV_LOWLEVEL</constant>. + </para> + + <para> + This function itself doesn't allocate the data space. The data + must be allocated manually beforehand, and its pointer is passed + as the argument. This pointer is used as the identifier + (<parameter>chip</parameter> in the above example) for the + instance. + </para> + + <para> + Each ALSA pre-defined component such as ac97 or pcm calls + <function>snd_device_new()</function> inside its + constructor. The destructor for each component is defined in the + callback pointers. Hence, you don't need to take care of + calling a destructor for such a component. + </para> + + <para> + If you would like to create your own component, you need to + set the destructor function to dev_free callback in + <parameter>ops</parameter>, so that it can be released + automatically via <function>snd_card_free()</function>. The + example will be shown later as an implementation of a + chip-specific data. + </para> + </section> + + <section id="card-management-chip-specific"> + <title>Chip-Specific Data</title> + <para> + The chip-specific information, e.g. the i/o port address, its + resource pointer, or the irq number, is stored in the + chip-specific record. + Usually, the chip-specific record is typedef'ed as + <type>xxx_t</type> like the following: + + <informalexample> + <programlisting> +<![CDATA[ + typedef struct snd_mychip mychip_t; + struct snd_mychip { + .... + }; +]]> + </programlisting> + </informalexample> + </para> + + <para> + In general, there are two ways to allocate the chip record. + </para> + + <section id="card-management-chip-specific-snd-card-new"> + <title>1. Allocating via <function>snd_card_new()</function>.</title> + <para> + As mentioned above, you can pass the extra-data-length to the 4th argument of <function>snd_card_new()</function>, i.e. + + <informalexample> + <programlisting> +<![CDATA[ + card = snd_card_new(index[dev], id[dev], THIS_MODULE, sizeof(mychip_t)); +]]> + </programlisting> + </informalexample> + + whether <type>mychip_t</type> is the type of the chip record. + </para> + + <para> + In return, the allocated record can be accessed as + + <informalexample> + <programlisting> +<![CDATA[ + mychip_t *chip = (mychip_t *)card->private_data; +]]> + </programlisting> + </informalexample> + + With this method, you don't have to allocate twice. + The record is released together with the card instance. + </para> + </section> + + <section id="card-management-chip-specific-allocate-extra"> + <title>2. Allocating an extra device.</title> + + <para> + After allocating a card instance via + <function>snd_card_new()</function> (with + <constant>NULL</constant> on the 4th arg), call + <function>kcalloc()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + snd_card_t *card; + mychip_t *chip; + card = snd_card_new(index[dev], id[dev], THIS_MODULE, NULL); + ..... + chip = kcalloc(1, sizeof(*chip), GFP_KERNEL); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The chip record should have the field to hold the card + pointer at least, + + <informalexample> + <programlisting> +<![CDATA[ + struct snd_mychip { + snd_card_t *card; + .... + }; +]]> + </programlisting> + </informalexample> + </para> + + <para> + Then, set the card pointer in the returned chip instance. + + <informalexample> + <programlisting> +<![CDATA[ + chip->card = card; +]]> + </programlisting> + </informalexample> + </para> + + <para> + Next, initialize the fields, and register this chip + record as a low-level device with a specified + <parameter>ops</parameter>, + + <informalexample> + <programlisting> +<![CDATA[ + static snd_device_ops_t ops = { + .dev_free = snd_mychip_dev_free, + }; + .... + snd_device_new(card, SNDRV_DEV_LOWLEVEL, chip, &ops); +]]> + </programlisting> + </informalexample> + + <function>snd_mychip_dev_free()</function> is the + device-destructor function, which will call the real + destructor. + </para> + + <para> + <informalexample> + <programlisting> +<![CDATA[ + static int snd_mychip_dev_free(snd_device_t *device) + { + mychip_t *chip = device->device_data; + return snd_mychip_free(chip); + } +]]> + </programlisting> + </informalexample> + + where <function>snd_mychip_free()</function> is the real destructor. + </para> + </section> + </section> + + <section id="card-management-registration"> + <title>Registration and Release</title> + <para> + After all components are assigned, register the card instance + by calling <function>snd_card_register()</function>. The access + to the device files are enabled at this point. That is, before + <function>snd_card_register()</function> is called, the + components are safely inaccessible from external side. If this + call fails, exit the probe function after releasing the card via + <function>snd_card_free()</function>. + </para> + + <para> + For releasing the card instance, you can call simply + <function>snd_card_free()</function>. As already mentioned, all + components are released automatically by this call. + </para> + + <para> + As further notes, the destructors (both + <function>snd_mychip_dev_free</function> and + <function>snd_mychip_free</function>) cannot be defined with + <parameter>__devexit</parameter> prefix, because they may be + called from the constructor, too, at the false path. + </para> + + <para> + For a device which allows hotplugging, you can use + <function>snd_card_free_in_thread</function>. This one will + postpone the destruction and wait in a kernel-thread until all + devices are closed. + </para> + + </section> + + </chapter> + + +<!-- ****************************************************** --> +<!-- PCI Resource Managements --> +<!-- ****************************************************** --> + <chapter id="pci-resource"> + <title>PCI Resource Managements</title> + + <section id="pci-resource-example"> + <title>Full Code Example</title> + <para> + In this section, we'll finish the chip-specific constructor, + destructor and PCI entries. The example code is shown first, + below. + + <example> + <title>PCI Resource Managements Example</title> + <programlisting> +<![CDATA[ + struct snd_mychip { + snd_card_t *card; + struct pci_dev *pci; + + unsigned long port; + int irq; + }; + + static int snd_mychip_free(mychip_t *chip) + { + /* disable hardware here if any */ + .... // (not implemented in this document) + + /* release the irq */ + if (chip->irq >= 0) + free_irq(chip->irq, (void *)chip); + /* release the i/o ports & memory */ + pci_release_regions(chip->pci); + /* disable the PCI entry */ + pci_disable_device(chip->pci); + /* release the data */ + kfree(chip); + return 0; + } + + /* chip-specific constructor */ + static int __devinit snd_mychip_create(snd_card_t *card, + struct pci_dev *pci, + mychip_t **rchip) + { + mychip_t *chip; + int err; + static snd_device_ops_t ops = { + .dev_free = snd_mychip_dev_free, + }; + + *rchip = NULL; + + /* initialize the PCI entry */ + if ((err = pci_enable_device(pci)) < 0) + return err; + /* check PCI availability (28bit DMA) */ + if (pci_set_dma_mask(pci, 0x0fffffff) < 0 || + pci_set_consistent_dma_mask(pci, 0x0fffffff) < 0) { + printk(KERN_ERR "error to set 28bit mask DMA\n"); + pci_disable_device(pci); + return -ENXIO; + } + + chip = kcalloc(1, sizeof(*chip), GFP_KERNEL); + if (chip == NULL) { + pci_disable_device(pci); + return -ENOMEM; + } + + /* initialize the stuff */ + chip->card = card; + chip->pci = pci; + chip->irq = -1; + + /* (1) PCI resource allocation */ + if ((err = pci_request_regions(pci, "My Chip")) < 0) { + kfree(chip); + pci_disable_device(pci); + return err; + } + chip->port = pci_resource_start(pci, 0); + if (request_irq(pci->irq, snd_mychip_interrupt, + SA_INTERRUPT|SA_SHIRQ, "My Chip", + (void *)chip)) { + printk(KERN_ERR "cannot grab irq %d\n", pci->irq); + snd_mychip_free(chip); + return -EBUSY; + } + chip->irq = pci->irq; + + /* (2) initialization of the chip hardware */ + .... // (not implemented in this document) + + if ((err = snd_device_new(card, SNDRV_DEV_LOWLEVEL, + chip, &ops)) < 0) { + snd_mychip_free(chip); + return err; + } + + snd_card_set_dev(card, &pci->dev); + + *rchip = chip; + return 0; + } + + /* PCI IDs */ + static struct pci_device_id snd_mychip_ids[] = { + { PCI_VENDOR_ID_FOO, PCI_DEVICE_ID_BAR, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, + .... + { 0, } + }; + MODULE_DEVICE_TABLE(pci, snd_mychip_ids); + + /* pci_driver definition */ + static struct pci_driver driver = { + .name = "My Own Chip", + .id_table = snd_mychip_ids, + .probe = snd_mychip_probe, + .remove = __devexit_p(snd_mychip_remove), + }; + + /* initialization of the module */ + static int __init alsa_card_mychip_init(void) + { + return pci_module_init(&driver); + } + + /* clean up the module */ + static void __exit alsa_card_mychip_exit(void) + { + pci_unregister_driver(&driver); + } + + module_init(alsa_card_mychip_init) + module_exit(alsa_card_mychip_exit) + + EXPORT_NO_SYMBOLS; /* for old kernels only */ +]]> + </programlisting> + </example> + </para> + </section> + + <section id="pci-resource-some-haftas"> + <title>Some Hafta's</title> + <para> + The allocation of PCI resources is done in the + <function>probe()</function> function, and usually an extra + <function>xxx_create()</function> function is written for this + purpose. + </para> + + <para> + In the case of PCI devices, you have to call at first + <function>pci_enable_device()</function> function before + allocating resources. Also, you need to set the proper PCI DMA + mask to limit the accessed i/o range. In some cases, you might + need to call <function>pci_set_master()</function> function, + too. + </para> + + <para> + Suppose the 28bit mask, and the code to be added would be like: + + <informalexample> + <programlisting> +<![CDATA[ + if ((err = pci_enable_device(pci)) < 0) + return err; + if (pci_set_dma_mask(pci, 0x0fffffff) < 0 || + pci_set_consistent_dma_mask(pci, 0x0fffffff) < 0) { + printk(KERN_ERR "error to set 28bit mask DMA\n"); + pci_disable_device(pci); + return -ENXIO; + } + +]]> + </programlisting> + </informalexample> + </para> + </section> + + <section id="pci-resource-resource-allocation"> + <title>Resource Allocation</title> + <para> + The allocation of I/O ports and irqs are done via standard kernel + functions. Unlike ALSA ver.0.5.x., there are no helpers for + that. And these resources must be released in the destructor + function (see below). Also, on ALSA 0.9.x, you don't need to + allocate (pseudo-)DMA for PCI like ALSA 0.5.x. + </para> + + <para> + Now assume that this PCI device has an I/O port with 8 bytes + and an interrupt. Then <type>mychip_t</type> will have the + following fields: + + <informalexample> + <programlisting> +<![CDATA[ + struct snd_mychip { + snd_card_t *card; + + unsigned long port; + int irq; + }; +]]> + </programlisting> + </informalexample> + </para> + + <para> + For an i/o port (and also a memory region), you need to have + the resource pointer for the standard resource management. For + an irq, you have to keep only the irq number (integer). But you + need to initialize this number as -1 before actual allocation, + since irq 0 is valid. The port address and its resource pointer + can be initialized as null by + <function>kcalloc()</function> automatically, so you + don't have to take care of resetting them. + </para> + + <para> + The allocation of an i/o port is done like this: + + <informalexample> + <programlisting> +<![CDATA[ + if ((err = pci_request_regions(pci, "My Chip")) < 0) { + kfree(chip); + pci_disable_device(pci); + return err; + } + chip->port = pci_resource_start(pci, 0); +]]> + </programlisting> + </informalexample> + </para> + + <para> + <!-- obsolete --> + It will reserve the i/o port region of 8 bytes of the given + PCI device. The returned value, chip->res_port, is allocated + via <function>kmalloc()</function> by + <function>request_region()</function>. The pointer must be + released via <function>kfree()</function>, but there is some + problem regarding this. This issue will be explained more below. + </para> + + <para> + The allocation of an interrupt source is done like this: + + <informalexample> + <programlisting> +<![CDATA[ + if (request_irq(pci->irq, snd_mychip_interrupt, + SA_INTERRUPT|SA_SHIRQ, "My Chip", + (void *)chip)) { + printk(KERN_ERR "cannot grab irq %d\n", pci->irq); + snd_mychip_free(chip); + return -EBUSY; + } + chip->irq = pci->irq; +]]> + </programlisting> + </informalexample> + + where <function>snd_mychip_interrupt()</function> is the + interrupt handler defined <link + linkend="pcm-interface-interrupt-handler"><citetitle>later</citetitle></link>. + Note that chip->irq should be defined + only when <function>request_irq()</function> succeeded. + </para> + + <para> + On the PCI bus, the interrupts can be shared. Thus, + <constant>SA_SHIRQ</constant> is given as the interrupt flag of + <function>request_irq()</function>. + </para> + + <para> + The last argument of <function>request_irq()</function> is the + data pointer passed to the interrupt handler. Usually, the + chip-specific record is used for that, but you can use what you + like, too. + </para> + + <para> + I won't define the detail of the interrupt handler at this + point, but at least its appearance can be explained now. The + interrupt handler looks usually like the following: + + <informalexample> + <programlisting> +<![CDATA[ + static irqreturn_t snd_mychip_interrupt(int irq, void *dev_id, + struct pt_regs *regs) + { + mychip_t *chip = dev_id; + .... + return IRQ_HANDLED; + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + Now let's write the corresponding destructor for the resources + above. The role of destructor is simple: disable the hardware + (if already activated) and release the resources. So far, we + have no hardware part, so the disabling is not written here. + </para> + + <para> + For releasing the resources, <quote>check-and-release</quote> + method is a safer way. For the interrupt, do like this: + + <informalexample> + <programlisting> +<![CDATA[ + if (chip->irq >= 0) + free_irq(chip->irq, (void *)chip); +]]> + </programlisting> + </informalexample> + + Since the irq number can start from 0, you should initialize + chip->irq with a negative value (e.g. -1), so that you can + check the validity of the irq number as above. + </para> + + <para> + When you requested I/O ports or memory regions via + <function>pci_request_region()</function> or + <function>pci_request_regions()</function> like this example, + release the resource(s) using the corresponding function, + <function>pci_release_region()</function> or + <function>pci_release_regions()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + pci_release_regions(chip->pci); +]]> + </programlisting> + </informalexample> + </para> + + <para> + When you requested manually via <function>request_region()</function> + or <function>request_mem_region</function>, you can release it via + <function>release_resource()</function>. Suppose that you keep + the resource pointer returned from <function>request_region()</function> + in chip->res_port, the release procedure looks like below: + + <informalexample> + <programlisting> +<![CDATA[ + if (chip->res_port) { + release_resource(chip->res_port); + kfree_nocheck(chip->res_port); + } +]]> + </programlisting> + </informalexample> + + As you can see, the resource pointer is also to be freed + via <function>kfree_nocheck()</function> after + <function>release_resource()</function> is called. You + cannot use <function>kfree()</function> here, because on ALSA, + <function>kfree()</function> may be a wrapper to its own + allocator with the memory debugging. Since the resource pointer + is allocated externally outside the ALSA, it must be released + via the native + <function>kfree()</function>. + <function>kfree_nocheck()</function> is used for that; it calls + the native <function>kfree()</function> without wrapper. + </para> + + <para> + Don't forget to call <function>pci_disable_device()</function> + before all finished. + </para> + + <para> + And finally, release the chip-specific record. + + <informalexample> + <programlisting> +<![CDATA[ + kfree(chip); +]]> + </programlisting> + </informalexample> + </para> + + <para> + Again, remember that you cannot + set <parameter>__devexit</parameter> prefix for this destructor. + </para> + + <para> + We didn't implement the hardware-disabling part in the above. + If you need to do this, please note that the destructor may be + called even before the initialization of the chip is completed. + It would be better to have a flag to skip the hardware-disabling + if the hardware was not initialized yet. + </para> + + <para> + When the chip-data is assigned to the card using + <function>snd_device_new()</function> with + <constant>SNDRV_DEV_LOWLELVEL</constant> , its destructor is + called at the last. That is, it is assured that all other + components like PCMs and controls have been already released. + You don't have to call stopping PCMs, etc. explicitly, but just + stop the hardware in the low-level. + </para> + + <para> + The management of a memory-mapped region is almost as same as + the management of an i/o port. You'll need three fields like + the following: + + <informalexample> + <programlisting> +<![CDATA[ + struct snd_mychip { + .... + unsigned long iobase_phys; + void __iomem *iobase_virt; + }; +]]> + </programlisting> + </informalexample> + + and the allocation would be like below: + + <informalexample> + <programlisting> +<![CDATA[ + if ((err = pci_request_regions(pci, "My Chip")) < 0) { + kfree(chip); + return err; + } + chip->iobase_phys = pci_resource_start(pci, 0); + chip->iobase_virt = ioremap_nocache(chip->iobase_phys, + pci_resource_len(pci, 0)); +]]> + </programlisting> + </informalexample> + + and the corresponding destructor would be: + + <informalexample> + <programlisting> +<![CDATA[ + static int snd_mychip_free(mychip_t *chip) + { + .... + if (chip->iobase_virt) + iounmap(chip->iobase_virt); + .... + pci_release_regions(chip->pci); + .... + } +]]> + </programlisting> + </informalexample> + </para> + + </section> + + <section id="pci-resource-device-struct"> + <title>Registration of Device Struct</title> + <para> + At some point, typically after calling <function>snd_device_new()</function>, + you need to register the <structname>struct device</structname> of the chip + you're handling for udev and co. ALSA provides a macro for compatibility with + older kernels. Simply call like the following: + <informalexample> + <programlisting> +<![CDATA[ + snd_card_set_dev(card, &pci->dev); +]]> + </programlisting> + </informalexample> + so that it stores the PCI's device pointer to the card. This will be + referred by ALSA core functions later when the devices are registered. + </para> + <para> + In the case of non-PCI, pass the proper device struct pointer of the BUS + instead. (In the case of legacy ISA without PnP, you don't have to do + anything.) + </para> + </section> + + <section id="pci-resource-entries"> + <title>PCI Entries</title> + <para> + So far, so good. Let's finish the rest of missing PCI + stuffs. At first, we need a + <structname>pci_device_id</structname> table for this + chipset. It's a table of PCI vendor/device ID number, and some + masks. + </para> + + <para> + For example, + + <informalexample> + <programlisting> +<![CDATA[ + static struct pci_device_id snd_mychip_ids[] = { + { PCI_VENDOR_ID_FOO, PCI_DEVICE_ID_BAR, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, }, + .... + { 0, } + }; + MODULE_DEVICE_TABLE(pci, snd_mychip_ids); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The first and second fields of + <structname>pci_device_id</structname> struct are the vendor and + device IDs. If you have nothing special to filter the matching + devices, you can use the rest of fields like above. The last + field of <structname>pci_device_id</structname> struct is a + private data for this entry. You can specify any value here, for + example, to tell the type of different operations per each + device IDs. Such an example is found in intel8x0 driver. + </para> + + <para> + The last entry of this list is the terminator. You must + specify this all-zero entry. + </para> + + <para> + Then, prepare the <structname>pci_driver</structname> record: + + <informalexample> + <programlisting> +<![CDATA[ + static struct pci_driver driver = { + .name = "My Own Chip", + .id_table = snd_mychip_ids, + .probe = snd_mychip_probe, + .remove = __devexit_p(snd_mychip_remove), + }; +]]> + </programlisting> + </informalexample> + </para> + + <para> + The <structfield>probe</structfield> and + <structfield>remove</structfield> functions are what we already + defined in + the previous sections. The <structfield>remove</structfield> should + be defined with + <function>__devexit_p()</function> macro, so that it's not + defined for built-in (and non-hot-pluggable) case. The + <structfield>name</structfield> + field is the name string of this device. Note that you must not + use a slash <quote>/</quote> in this string. + </para> + + <para> + And at last, the module entries: + + <informalexample> + <programlisting> +<![CDATA[ + static int __init alsa_card_mychip_init(void) + { + return pci_module_init(&driver); + } + + static void __exit alsa_card_mychip_exit(void) + { + pci_unregister_driver(&driver); + } + + module_init(alsa_card_mychip_init) + module_exit(alsa_card_mychip_exit) +]]> + </programlisting> + </informalexample> + </para> + + <para> + Note that these module entries are tagged with + <parameter>__init</parameter> and + <parameter>__exit</parameter> prefixes, not + <parameter>__devinit</parameter> nor + <parameter>__devexit</parameter>. + </para> + + <para> + Oh, one thing was forgotten. If you have no exported symbols, + you need to declare it on 2.2 or 2.4 kernels (on 2.6 kernels + it's not necessary, though). + + <informalexample> + <programlisting> +<![CDATA[ + EXPORT_NO_SYMBOLS; +]]> + </programlisting> + </informalexample> + + That's all! + </para> + </section> + </chapter> + + +<!-- ****************************************************** --> +<!-- PCM Interface --> +<!-- ****************************************************** --> + <chapter id="pcm-interface"> + <title>PCM Interface</title> + + <section id="pcm-interface-general"> + <title>General</title> + <para> + The PCM middle layer of ALSA is quite powerful and it is only + necessary for each driver to implement the low-level functions + to access its hardware. + </para> + + <para> + For accessing to the PCM layer, you need to include + <filename><sound/pcm.h></filename> above all. In addition, + <filename><sound/pcm_params.h></filename> might be needed + if you access to some functions related with hw_param. + </para> + + <para> + Each card device can have up to four pcm instances. A pcm + instance corresponds to a pcm device file. The limitation of + number of instances comes only from the available bit size of + the linux's device number. Once when 64bit device number is + used, we'll have more available pcm instances. + </para> + + <para> + A pcm instance consists of pcm playback and capture streams, + and each pcm stream consists of one or more pcm substreams. Some + soundcard supports the multiple-playback function. For example, + emu10k1 has a PCM playback of 32 stereo substreams. In this case, at + each open, a free substream is (usually) automatically chosen + and opened. Meanwhile, when only one substream exists and it was + already opened, the succeeding open will result in the blocking + or the error with <constant>EAGAIN</constant> according to the + file open mode. But you don't have to know the detail in your + driver. The PCM middle layer will take all such jobs. + </para> + </section> + + <section id="pcm-interface-example"> + <title>Full Code Example</title> + <para> + The example code below does not include any hardware access + routines but shows only the skeleton, how to build up the PCM + interfaces. + + <example> + <title>PCM Example Code</title> + <programlisting> +<![CDATA[ + #include <sound/pcm.h> + .... + + /* hardware definition */ + static snd_pcm_hardware_t snd_mychip_playback_hw = { + .info = (SNDRV_PCM_INFO_MMAP | + SNDRV_PCM_INFO_INTERLEAVED | + SNDRV_PCM_INFO_BLOCK_TRANSFER | + SNDRV_PCM_INFO_MMAP_VALID), + .formats = SNDRV_PCM_FMTBIT_S16_LE, + .rates = SNDRV_PCM_RATE_8000_48000, + .rate_min = 8000, + .rate_max = 48000, + .channels_min = 2, + .channels_max = 2, + .buffer_bytes_max = 32768, + .period_bytes_min = 4096, + .period_bytes_max = 32768, + .periods_min = 1, + .periods_max = 1024, + }; + + /* hardware definition */ + static snd_pcm_hardware_t snd_mychip_capture_hw = { + .info = (SNDRV_PCM_INFO_MMAP | + SNDRV_PCM_INFO_INTERLEAVED | + SNDRV_PCM_INFO_BLOCK_TRANSFER | + SNDRV_PCM_INFO_MMAP_VALID), + .formats = SNDRV_PCM_FMTBIT_S16_LE, + .rates = SNDRV_PCM_RATE_8000_48000, + .rate_min = 8000, + .rate_max = 48000, + .channels_min = 2, + .channels_max = 2, + .buffer_bytes_max = 32768, + .period_bytes_min = 4096, + .period_bytes_max = 32768, + .periods_min = 1, + .periods_max = 1024, + }; + + /* open callback */ + static int snd_mychip_playback_open(snd_pcm_substream_t *substream) + { + mychip_t *chip = snd_pcm_substream_chip(substream); + snd_pcm_runtime_t *runtime = substream->runtime; + + runtime->hw = snd_mychip_playback_hw; + // more hardware-initialization will be done here + return 0; + } + + /* close callback */ + static int snd_mychip_playback_close(snd_pcm_substream_t *substream) + { + mychip_t *chip = snd_pcm_substream_chip(substream); + // the hardware-specific codes will be here + return 0; + + } + + /* open callback */ + static int snd_mychip_capture_open(snd_pcm_substream_t *substream) + { + mychip_t *chip = snd_pcm_substream_chip(substream); + snd_pcm_runtime_t *runtime = substream->runtime; + + runtime->hw = snd_mychip_capture_hw; + // more hardware-initialization will be done here + return 0; + } + + /* close callback */ + static int snd_mychip_capture_close(snd_pcm_substream_t *substream) + { + mychip_t *chip = snd_pcm_substream_chip(substream); + // the hardware-specific codes will be here + return 0; + + } + + /* hw_params callback */ + static int snd_mychip_pcm_hw_params(snd_pcm_substream_t *substream, + snd_pcm_hw_params_t * hw_params) + { + return snd_pcm_lib_malloc_pages(substream, + params_buffer_bytes(hw_params)); + } + + /* hw_free callback */ + static int snd_mychip_pcm_hw_free(snd_pcm_substream_t *substream) + { + return snd_pcm_lib_free_pages(substream); + } + + /* prepare callback */ + static int snd_mychip_pcm_prepare(snd_pcm_substream_t *substream) + { + mychip_t *chip = snd_pcm_substream_chip(substream); + snd_pcm_runtime_t *runtime = substream->runtime; + + /* set up the hardware with the current configuration + * for example... + */ + mychip_set_sample_format(chip, runtime->format); + mychip_set_sample_rate(chip, runtime->rate); + mychip_set_channels(chip, runtime->channels); + mychip_set_dma_setup(chip, runtime->dma_area, + chip->buffer_size, + chip->period_size); + return 0; + } + + /* trigger callback */ + static int snd_mychip_pcm_trigger(snd_pcm_substream_t *substream, + int cmd) + { + switch (cmd) { + case SNDRV_PCM_TRIGGER_START: + // do something to start the PCM engine + break; + case SNDRV_PCM_TRIGGER_STOP: + // do something to stop the PCM engine + break; + default: + return -EINVAL; + } + } + + /* pointer callback */ + static snd_pcm_uframes_t + snd_mychip_pcm_pointer(snd_pcm_substream_t *substream) + { + mychip_t *chip = snd_pcm_substream_chip(substream); + unsigned int current_ptr; + + /* get the current hardware pointer */ + current_ptr = mychip_get_hw_pointer(chip); + return current_ptr; + } + + /* operators */ + static snd_pcm_ops_t snd_mychip_playback_ops = { + .open = snd_mychip_playback_open, + .close = snd_mychip_playback_close, + .ioctl = snd_pcm_lib_ioctl, + .hw_params = snd_mychip_pcm_hw_params, + .hw_free = snd_mychip_pcm_hw_free, + .prepare = snd_mychip_pcm_prepare, + .trigger = snd_mychip_pcm_trigger, + .pointer = snd_mychip_pcm_pointer, + }; + + /* operators */ + static snd_pcm_ops_t snd_mychip_capture_ops = { + .open = snd_mychip_capture_open, + .close = snd_mychip_capture_close, + .ioctl = snd_pcm_lib_ioctl, + .hw_params = snd_mychip_pcm_hw_params, + .hw_free = snd_mychip_pcm_hw_free, + .prepare = snd_mychip_pcm_prepare, + .trigger = snd_mychip_pcm_trigger, + .pointer = snd_mychip_pcm_pointer, + }; + + /* + * definitions of capture are omitted here... + */ + + /* create a pcm device */ + static int __devinit snd_mychip_new_pcm(mychip_t *chip) + { + snd_pcm_t *pcm; + int err; + + if ((err = snd_pcm_new(chip->card, "My Chip", 0, 1, 1, + &pcm)) < 0) + return err; + pcm->private_data = chip; + strcpy(pcm->name, "My Chip"); + chip->pcm = pcm; + /* set operators */ + snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_PLAYBACK, + &snd_mychip_playback_ops); + snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_CAPTURE, + &snd_mychip_capture_ops); + /* pre-allocation of buffers */ + /* NOTE: this may fail */ + snd_pcm_lib_preallocate_pages_for_all(pcm, SNDRV_DMA_TYPE_DEV, + snd_dma_pci_data(chip->pci), + 64*1024, 64*1024); + return 0; + } +]]> + </programlisting> + </example> + </para> + </section> + + <section id="pcm-interface-constructor"> + <title>Constructor</title> + <para> + A pcm instance is allocated by <function>snd_pcm_new()</function> + function. It would be better to create a constructor for pcm, + namely, + + <informalexample> + <programlisting> +<![CDATA[ + static int __devinit snd_mychip_new_pcm(mychip_t *chip) + { + snd_pcm_t *pcm; + int err; + + if ((err = snd_pcm_new(chip->card, "My Chip", 0, 1, 1, + &pcm)) < 0) + return err; + pcm->private_data = chip; + strcpy(pcm->name, "My Chip"); + chip->pcm = pcm; + .... + return 0; + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + The <function>snd_pcm_new()</function> function takes the four + arguments. The first argument is the card pointer to which this + pcm is assigned, and the second is the ID string. + </para> + + <para> + The third argument (<parameter>index</parameter>, 0 in the + above) is the index of this new pcm. It begins from zero. When + you will create more than one pcm instances, specify the + different numbers in this argument. For example, + <parameter>index</parameter> = 1 for the second PCM device. + </para> + + <para> + The fourth and fifth arguments are the number of substreams + for playback and capture, respectively. Here both 1 are given in + the above example. When no playback or no capture is available, + pass 0 to the corresponding argument. + </para> + + <para> + If a chip supports multiple playbacks or captures, you can + specify more numbers, but they must be handled properly in + open/close, etc. callbacks. When you need to know which + substream you are referring to, then it can be obtained from + <type>snd_pcm_substream_t</type> data passed to each callback + as follows: + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_substream_t *substream; + int index = substream->number; +]]> + </programlisting> + </informalexample> + </para> + + <para> + After the pcm is created, you need to set operators for each + pcm stream. + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_PLAYBACK, + &snd_mychip_playback_ops); + snd_pcm_set_ops(pcm, SNDRV_PCM_STREAM_CAPTURE, + &snd_mychip_capture_ops); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The operators are defined typically like this: + + <informalexample> + <programlisting> +<![CDATA[ + static snd_pcm_ops_t snd_mychip_playback_ops = { + .open = snd_mychip_pcm_open, + .close = snd_mychip_pcm_close, + .ioctl = snd_pcm_lib_ioctl, + .hw_params = snd_mychip_pcm_hw_params, + .hw_free = snd_mychip_pcm_hw_free, + .prepare = snd_mychip_pcm_prepare, + .trigger = snd_mychip_pcm_trigger, + .pointer = snd_mychip_pcm_pointer, + }; +]]> + </programlisting> + </informalexample> + + Each of callbacks is explained in the subsection + <link linkend="pcm-interface-operators"><citetitle> + Operators</citetitle></link>. + </para> + + <para> + After setting the operators, most likely you'd like to + pre-allocate the buffer. For the pre-allocation, simply call + the following: + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_lib_preallocate_pages_for_all(pcm, SNDRV_DMA_TYPE_DEV, + snd_dma_pci_data(chip->pci), + 64*1024, 64*1024); +]]> + </programlisting> + </informalexample> + + It will allocate up to 64kB buffer as default. The details of + buffer management will be described in the later section <link + linkend="buffer-and-memory"><citetitle>Buffer and Memory + Management</citetitle></link>. + </para> + + <para> + Additionally, you can set some extra information for this pcm + in pcm->info_flags. + The available values are defined as + <constant>SNDRV_PCM_INFO_XXX</constant> in + <filename><sound/asound.h></filename>, which is used for + the hardware definition (described later). When your soundchip + supports only half-duplex, specify like this: + + <informalexample> + <programlisting> +<![CDATA[ + pcm->info_flags = SNDRV_PCM_INFO_HALF_DUPLEX; +]]> + </programlisting> + </informalexample> + </para> + </section> + + <section id="pcm-interface-destructor"> + <title>... And the Destructor?</title> + <para> + The destructor for a pcm instance is not always + necessary. Since the pcm device will be released by the middle + layer code automatically, you don't have to call destructor + explicitly. + </para> + + <para> + The destructor would be necessary when you created some + special records internally and need to release them. In such a + case, set the destructor function to + pcm->private_free: + + <example> + <title>PCM Instance with a Destructor</title> + <programlisting> +<![CDATA[ + static void mychip_pcm_free(snd_pcm_t *pcm) + { + mychip_t *chip = snd_pcm_chip(pcm); + /* free your own data */ + kfree(chip->my_private_pcm_data); + // do what you like else + .... + } + + static int __devinit snd_mychip_new_pcm(mychip_t *chip) + { + snd_pcm_t *pcm; + .... + /* allocate your own data */ + chip->my_private_pcm_data = kmalloc(...); + /* set the destructor */ + pcm->private_data = chip; + pcm->private_free = mychip_pcm_free; + .... + } +]]> + </programlisting> + </example> + </para> + </section> + + <section id="pcm-interface-runtime"> + <title>Runtime Pointer - The Chest of PCM Information</title> + <para> + When the PCM substream is opened, a PCM runtime instance is + allocated and assigned to the substream. This pointer is + accessible via <constant>substream->runtime</constant>. + This runtime pointer holds the various information; it holds + the copy of hw_params and sw_params configurations, the buffer + pointers, mmap records, spinlocks, etc. Almost everyhing you + need for controlling the PCM can be found there. + </para> + + <para> + The definition of runtime instance is found in + <filename><sound/pcm.h></filename>. Here is the + copy from the file. + <informalexample> + <programlisting> +<![CDATA[ +struct _snd_pcm_runtime { + /* -- Status -- */ + snd_pcm_substream_t *trigger_master; + snd_timestamp_t trigger_tstamp; /* trigger timestamp */ + int overrange; + snd_pcm_uframes_t avail_max; + snd_pcm_uframes_t hw_ptr_base; /* Position at buffer restart */ + snd_pcm_uframes_t hw_ptr_interrupt; /* Position at interrupt time*/ + + /* -- HW params -- */ + snd_pcm_access_t access; /* access mode */ + snd_pcm_format_t format; /* SNDRV_PCM_FORMAT_* */ + snd_pcm_subformat_t subformat; /* subformat */ + unsigned int rate; /* rate in Hz */ + unsigned int channels; /* channels */ + snd_pcm_uframes_t period_size; /* period size */ + unsigned int periods; /* periods */ + snd_pcm_uframes_t buffer_size; /* buffer size */ + unsigned int tick_time; /* tick time */ + snd_pcm_uframes_t min_align; /* Min alignment for the format */ + size_t byte_align; + unsigned int frame_bits; + unsigned int sample_bits; + unsigned int info; + unsigned int rate_num; + unsigned int rate_den; + + /* -- SW params -- */ + int tstamp_timespec; /* use timeval (0) or timespec (1) */ + snd_pcm_tstamp_t tstamp_mode; /* mmap timestamp is updated */ + unsigned int period_step; + unsigned int sleep_min; /* min ticks to sleep */ + snd_pcm_uframes_t xfer_align; /* xfer size need to be a multiple */ + snd_pcm_uframes_t start_threshold; + snd_pcm_uframes_t stop_threshold; + snd_pcm_uframes_t silence_threshold; /* Silence filling happens when + noise is nearest than this */ + snd_pcm_uframes_t silence_size; /* Silence filling size */ + snd_pcm_uframes_t boundary; /* pointers wrap point */ + + snd_pcm_uframes_t silenced_start; + snd_pcm_uframes_t silenced_size; + + snd_pcm_sync_id_t sync; /* hardware synchronization ID */ + + /* -- mmap -- */ + volatile snd_pcm_mmap_status_t *status; + volatile snd_pcm_mmap_control_t *control; + atomic_t mmap_count; + + /* -- locking / scheduling -- */ + spinlock_t lock; + wait_queue_head_t sleep; + struct timer_list tick_timer; + struct fasync_struct *fasync; + + /* -- private section -- */ + void *private_data; + void (*private_free)(snd_pcm_runtime_t *runtime); + + /* -- hardware description -- */ + snd_pcm_hardware_t hw; + snd_pcm_hw_constraints_t hw_constraints; + + /* -- interrupt callbacks -- */ + void (*transfer_ack_begin)(snd_pcm_substream_t *substream); + void (*transfer_ack_end)(snd_pcm_substream_t *substream); + + /* -- timer -- */ + unsigned int timer_resolution; /* timer resolution */ + + /* -- DMA -- */ + unsigned char *dma_area; /* DMA area */ + dma_addr_t dma_addr; /* physical bus address (not accessible from main CPU) */ + size_t dma_bytes; /* size of DMA area */ + + struct snd_dma_buffer *dma_buffer_p; /* allocated buffer */ + +#if defined(CONFIG_SND_PCM_OSS) || defined(CONFIG_SND_PCM_OSS_MODULE) + /* -- OSS things -- */ + snd_pcm_oss_runtime_t oss; +#endif +}; +]]> + </programlisting> + </informalexample> + </para> + + <para> + For the operators (callbacks) of each sound driver, most of + these records are supposed to be read-only. Only the PCM + middle-layer changes / updates these info. The exceptions are + the hardware description (hw), interrupt callbacks + (transfer_ack_xxx), DMA buffer information, and the private + data. Besides, if you use the standard buffer allocation + method via <function>snd_pcm_lib_malloc_pages()</function>, + you don't need to set the DMA buffer information by yourself. + </para> + + <para> + In the sections below, important records are explained. + </para> + + <section id="pcm-interface-runtime-hw"> + <title>Hardware Description</title> + <para> + The hardware descriptor (<type>snd_pcm_hardware_t</type>) + contains the definitions of the fundamental hardware + configuration. Above all, you'll need to define this in + <link linkend="pcm-interface-operators-open-callback"><citetitle> + the open callback</citetitle></link>. + Note that the runtime instance holds the copy of the + descriptor, not the pointer to the existing descriptor. That + is, in the open callback, you can modify the copied descriptor + (<constant>runtime->hw</constant>) as you need. For example, if the maximum + number of channels is 1 only on some chip models, you can + still use the same hardware descriptor and change the + channels_max later: + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_runtime_t *runtime = substream->runtime; + ... + runtime->hw = snd_mychip_playback_hw; /* common definition */ + if (chip->model == VERY_OLD_ONE) + runtime->hw.channels_max = 1; +]]> + </programlisting> + </informalexample> + </para> + + <para> + Typically, you'll have a hardware descriptor like below: + <informalexample> + <programlisting> +<![CDATA[ + static snd_pcm_hardware_t snd_mychip_playback_hw = { + .info = (SNDRV_PCM_INFO_MMAP | + SNDRV_PCM_INFO_INTERLEAVED | + SNDRV_PCM_INFO_BLOCK_TRANSFER | + SNDRV_PCM_INFO_MMAP_VALID), + .formats = SNDRV_PCM_FMTBIT_S16_LE, + .rates = SNDRV_PCM_RATE_8000_48000, + .rate_min = 8000, + .rate_max = 48000, + .channels_min = 2, + .channels_max = 2, + .buffer_bytes_max = 32768, + .period_bytes_min = 4096, + .period_bytes_max = 32768, + .periods_min = 1, + .periods_max = 1024, + }; +]]> + </programlisting> + </informalexample> + </para> + + <para> + <itemizedlist> + <listitem><para> + The <structfield>info</structfield> field contains the type and + capabilities of this pcm. The bit flags are defined in + <filename><sound/asound.h></filename> as + <constant>SNDRV_PCM_INFO_XXX</constant>. Here, at least, you + have to specify whether the mmap is supported and which + interleaved format is supported. + When the mmap is supported, add + <constant>SNDRV_PCM_INFO_MMAP</constant> flag here. When the + hardware supports the interleaved or the non-interleaved + format, <constant>SNDRV_PCM_INFO_INTERLEAVED</constant> or + <constant>SNDRV_PCM_INFO_NONINTERLEAVED</constant> flag must + be set, respectively. If both are supported, you can set both, + too. + </para> + + <para> + In the above example, <constant>MMAP_VALID</constant> and + <constant>BLOCK_TRANSFER</constant> are specified for OSS mmap + mode. Usually both are set. Of course, + <constant>MMAP_VALID</constant> is set only if the mmap is + really supported. + </para> + + <para> + The other possible flags are + <constant>SNDRV_PCM_INFO_PAUSE</constant> and + <constant>SNDRV_PCM_INFO_RESUME</constant>. The + <constant>PAUSE</constant> bit means that the pcm supports the + <quote>pause</quote> operation, while the + <constant>RESUME</constant> bit means that the pcm supports + the <quote>suspend/resume</quote> operation. If these flags + are set, the <structfield>trigger</structfield> callback below + must handle the corresponding commands. + </para> + + <para> + When the PCM substreams can be synchronized (typically, + synchorinized start/stop of a playback and a capture streams), + you can give <constant>SNDRV_PCM_INFO_SYNC_START</constant>, + too. In this case, you'll need to check the linked-list of + PCM substreams in the trigger callback. This will be + described in the later section. + </para> + </listitem> + + <listitem> + <para> + <structfield>formats</structfield> field contains the bit-flags + of supported formats (<constant>SNDRV_PCM_FMTBIT_XXX</constant>). + If the hardware supports more than one format, give all or'ed + bits. In the example above, the signed 16bit little-endian + format is specified. + </para> + </listitem> + + <listitem> + <para> + <structfield>rates</structfield> field contains the bit-flags of + supported rates (<constant>SNDRV_PCM_RATE_XXX</constant>). + When the chip supports continuous rates, pass + <constant>CONTINUOUS</constant> bit additionally. + The pre-defined rate bits are provided only for typical + rates. If your chip supports unconventional rates, you need to add + <constant>KNOT</constant> bit and set up the hardware + constraint manually (explained later). + </para> + </listitem> + + <listitem> + <para> + <structfield>rate_min</structfield> and + <structfield>rate_max</structfield> define the minimal and + maximal sample rate. This should correspond somehow to + <structfield>rates</structfield> bits. + </para> + </listitem> + + <listitem> + <para> + <structfield>channel_min</structfield> and + <structfield>channel_max</structfield> + define, as you might already expected, the minimal and maximal + number of channels. + </para> + </listitem> + + <listitem> + <para> + <structfield>buffer_bytes_max</structfield> defines the + maximal buffer size in bytes. There is no + <structfield>buffer_bytes_min</structfield> field, since + it can be calculated from the minimal period size and the + minimal number of periods. + Meanwhile, <structfield>period_bytes_min</structfield> and + define the minimal and maximal size of the period in bytes. + <structfield>periods_max</structfield> and + <structfield>periods_min</structfield> define the maximal and + minimal number of periods in the buffer. + </para> + + <para> + The <quote>period</quote> is a term, that corresponds to + fragment in the OSS world. The period defines the size at + which the PCM interrupt is generated. This size strongly + depends on the hardware. + Generally, the smaller period size will give you more + interrupts, that is, more controls. + In the case of capture, this size defines the input latency. + On the other hand, the whole buffer size defines the + output latency for the playback direction. + </para> + </listitem> + + <listitem> + <para> + There is also a field <structfield>fifo_size</structfield>. + This specifies the size of the hardware FIFO, but it's not + used currently in the driver nor in the alsa-lib. So, you + can ignore this field. + </para> + </listitem> + </itemizedlist> + </para> + </section> + + <section id="pcm-interface-runtime-config"> + <title>PCM Configurations</title> + <para> + Ok, let's go back again to the PCM runtime records. + The most frequently referred records in the runtime instance are + the PCM configurations. + The PCM configurations are stored on runtime instance + after the application sends <type>hw_params</type> data via + alsa-lib. There are many fields copied from hw_params and + sw_params structs. For example, + <structfield>format</structfield> holds the format type + chosen by the application. This field contains the enum value + <constant>SNDRV_PCM_FORMAT_XXX</constant>. + </para> + + <para> + One thing to be noted is that the configured buffer and period + sizes are stored in <quote>frames</quote> in the runtime + In the ALSA world, 1 frame = channels * samples-size. + For conversion between frames and bytes, you can use the + helper functions, <function>frames_to_bytes()</function> and + <function>bytes_to_frames()</function>. + <informalexample> + <programlisting> +<![CDATA[ + period_bytes = frames_to_bytes(runtime, runtime->period_size); +]]> + </programlisting> + </informalexample> + </para> + + <para> + Also, many software parameters (sw_params) are + stored in frames, too. Please check the type of the field. + <type>snd_pcm_uframes_t</type> is for the frames as unsigned + integer while <type>snd_pcm_sframes_t</type> is for the frames + as signed integer. + </para> + </section> + + <section id="pcm-interface-runtime-dma"> + <title>DMA Buffer Information</title> + <para> + The DMA buffer is defined by the following four fields, + <structfield>dma_area</structfield>, + <structfield>dma_addr</structfield>, + <structfield>dma_bytes</structfield> and + <structfield>dma_private</structfield>. + The <structfield>dma_area</structfield> holds the buffer + pointer (the logical address). You can call + <function>memcpy</function> from/to + this pointer. Meanwhile, <structfield>dma_addr</structfield> + holds the physical address of the buffer. This field is + specified only when the buffer is a linear buffer. + <structfield>dma_bytes</structfield> holds the size of buffer + in bytes. <structfield>dma_private</structfield> is used for + the ALSA DMA allocator. + </para> + + <para> + If you use a standard ALSA function, + <function>snd_pcm_lib_malloc_pages()</function>, for + allocating the buffer, these fields are set by the ALSA middle + layer, and you should <emphasis>not</emphasis> change them by + yourself. You can read them but not write them. + On the other hand, if you want to allocate the buffer by + yourself, you'll need to manage it in hw_params callback. + At least, <structfield>dma_bytes</structfield> is mandatory. + <structfield>dma_area</structfield> is necessary when the + buffer is mmapped. If your driver doesn't support mmap, this + field is not necessary. <structfield>dma_addr</structfield> + is also not mandatory. You can use + <structfield>dma_private</structfield> as you like, too. + </para> + </section> + + <section id="pcm-interface-runtime-status"> + <title>Running Status</title> + <para> + The running status can be referred via <constant>runtime->status</constant>. + This is the pointer to <type>snd_pcm_mmap_status_t</type> + record. For example, you can get the current DMA hardware + pointer via <constant>runtime->status->hw_ptr</constant>. + </para> + + <para> + The DMA application pointer can be referred via + <constant>runtime->control</constant>, which points + <type>snd_pcm_mmap_control_t</type> record. + However, accessing directly to this value is not recommended. + </para> + </section> + + <section id="pcm-interface-runtime-private"> + <title>Private Data</title> + <para> + You can allocate a record for the substream and store it in + <constant>runtime->private_data</constant>. Usually, this + done in + <link linkend="pcm-interface-operators-open-callback"><citetitle> + the open callback</citetitle></link>. + Don't mix this with <constant>pcm->private_data</constant>. + The <constant>pcm->private_data</constant> usually points the + chip instance assigned statically at the creation of PCM, while the + <constant>runtime->private_data</constant> points a dynamic + data created at the PCM open callback. + + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_open(snd_pcm_substream_t *substream) + { + my_pcm_data_t *data; + .... + data = kmalloc(sizeof(*data), GFP_KERNEL); + substream->runtime->private_data = data; + .... + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + The allocated object must be released in + <link linkend="pcm-interface-operators-open-callback"><citetitle> + the close callback</citetitle></link>. + </para> + </section> + + <section id="pcm-interface-runtime-intr"> + <title>Interrupt Callbacks</title> + <para> + The field <structfield>transfer_ack_begin</structfield> and + <structfield>transfer_ack_end</structfield> are called at + the beginning and the end of + <function>snd_pcm_period_elapsed()</function>, respectively. + </para> + </section> + + </section> + + <section id="pcm-interface-operators"> + <title>Operators</title> + <para> + OK, now let me explain the detail of each pcm callback + (<parameter>ops</parameter>). In general, every callback must + return 0 if successful, or a negative number with the error + number such as <constant>-EINVAL</constant> at any + error. + </para> + + <para> + The callback function takes at least the argument with + <type>snd_pcm_substream_t</type> pointer. For retrieving the + chip record from the given substream instance, you can use the + following macro. + + <informalexample> + <programlisting> +<![CDATA[ + int xxx() { + mychip_t *chip = snd_pcm_substream_chip(substream); + .... + } +]]> + </programlisting> + </informalexample> + + The macro reads <constant>substream->private_data</constant>, + which is a copy of <constant>pcm->private_data</constant>. + You can override the former if you need to assign different data + records per PCM substream. For example, cmi8330 driver assigns + different private_data for playback and capture directions, + because it uses two different codecs (SB- and AD-compatible) for + different directions. + </para> + + <section id="pcm-interface-operators-open-callback"> + <title>open callback</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_open(snd_pcm_substream_t *substream); +]]> + </programlisting> + </informalexample> + + This is called when a pcm substream is opened. + </para> + + <para> + At least, here you have to initialize the runtime->hw + record. Typically, this is done by like this: + + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_open(snd_pcm_substream_t *substream) + { + mychip_t *chip = snd_pcm_substream_chip(substream); + snd_pcm_runtime_t *runtime = substream->runtime; + + runtime->hw = snd_mychip_playback_hw; + return 0; + } +]]> + </programlisting> + </informalexample> + + where <parameter>snd_mychip_playback_hw</parameter> is the + pre-defined hardware description. + </para> + + <para> + You can allocate a private data in this callback, as described + in <link linkend="pcm-interface-runtime-private"><citetitle> + Private Data</citetitle></link> section. + </para> + + <para> + If the hardware configuration needs more constraints, set the + hardware constraints here, too. + See <link linkend="pcm-interface-constraints"><citetitle> + Constraints</citetitle></link> for more details. + </para> + </section> + + <section id="pcm-interface-operators-close-callback"> + <title>close callback</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_close(snd_pcm_substream_t *substream); +]]> + </programlisting> + </informalexample> + + Obviously, this is called when a pcm substream is closed. + </para> + + <para> + Any private instance for a pcm substream allocated in the + open callback will be released here. + + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_close(snd_pcm_substream_t *substream) + { + .... + kfree(substream->runtime->private_data); + .... + } +]]> + </programlisting> + </informalexample> + </para> + </section> + + <section id="pcm-interface-operators-ioctl-callback"> + <title>ioctl callback</title> + <para> + This is used for any special action to pcm ioctls. But + usually you can pass a generic ioctl callback, + <function>snd_pcm_lib_ioctl</function>. + </para> + </section> + + <section id="pcm-interface-operators-hw-params-callback"> + <title>hw_params callback</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_hw_params(snd_pcm_substream_t * substream, + snd_pcm_hw_params_t * hw_params); +]]> + </programlisting> + </informalexample> + + This and <structfield>hw_free</structfield> callbacks exist + only on ALSA 0.9.x. + </para> + + <para> + This is called when the hardware parameter + (<structfield>hw_params</structfield>) is set + up by the application, + that is, once when the buffer size, the period size, the + format, etc. are defined for the pcm substream. + </para> + + <para> + Many hardware set-up should be done in this callback, + including the allocation of buffers. + </para> + + <para> + Parameters to be initialized are retrieved by + <function>params_xxx()</function> macros. For allocating a + buffer, you can call a helper function, + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_lib_malloc_pages(substream, params_buffer_bytes(hw_params)); +]]> + </programlisting> + </informalexample> + + <function>snd_pcm_lib_malloc_pages()</function> is available + only when the DMA buffers have been pre-allocated. + See the section <link + linkend="buffer-and-memory-buffer-types"><citetitle> + Buffer Types</citetitle></link> for more details. + </para> + + <para> + Note that this and <structfield>prepare</structfield> callbacks + may be called multiple times per initialization. + For example, the OSS emulation may + call these callbacks at each change via its ioctl. + </para> + + <para> + Thus, you need to take care not to allocate the same buffers + many times, which will lead to memory leak! Calling the + helper function above many times is OK. It will release the + previous buffer automatically when it was already allocated. + </para> + + <para> + Another note is that this callback is non-atomic + (schedulable). This is important, because the + <structfield>trigger</structfield> callback + is atomic (non-schedulable). That is, mutex or any + schedule-related functions are not available in + <structfield>trigger</structfield> callback. + Please see the subsection + <link linkend="pcm-interface-atomicity"><citetitle> + Atomicity</citetitle></link> for details. + </para> + </section> + + <section id="pcm-interface-operators-hw-free-callback"> + <title>hw_free callback</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_hw_free(snd_pcm_substream_t * substream); +]]> + </programlisting> + </informalexample> + </para> + + <para> + This is called to release the resources allocated via + <structfield>hw_params</structfield>. For example, releasing the + buffer via + <function>snd_pcm_lib_malloc_pages()</function> is done by + calling the following: + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_lib_free_pages(substream); +]]> + </programlisting> + </informalexample> + </para> + + <para> + This function is always called before the close callback is called. + Also, the callback may be called multiple times, too. + Keep track whether the resource was already released. + </para> + </section> + + <section id="pcm-interface-operators-prepare-callback"> + <title>prepare callback</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_prepare(snd_pcm_substream_t * substream); +]]> + </programlisting> + </informalexample> + </para> + + <para> + This callback is called when the pcm is + <quote>prepared</quote>. You can set the format type, sample + rate, etc. here. The difference from + <structfield>hw_params</structfield> is that the + <structfield>prepare</structfield> callback will be called at each + time + <function>snd_pcm_prepare()</function> is called, i.e. when + recovered after underruns, etc. + </para> + + <para> + Note that this callback became non-atomic since the recent version. + You can use schedule-related fucntions safely in this callback now. + </para> + + <para> + In this and the following callbacks, you can refer to the + values via the runtime record, + substream->runtime. + For example, to get the current + rate, format or channels, access to + runtime->rate, + runtime->format or + runtime->channels, respectively. + The physical address of the allocated buffer is set to + runtime->dma_area. The buffer and period sizes are + in runtime->buffer_size and runtime->period_size, + respectively. + </para> + + <para> + Be careful that this callback will be called many times at + each set up, too. + </para> + </section> + + <section id="pcm-interface-operators-trigger-callback"> + <title>trigger callback</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_trigger(snd_pcm_substream_t * substream, int cmd); +]]> + </programlisting> + </informalexample> + + This is called when the pcm is started, stopped or paused. + </para> + + <para> + Which action is specified in the second argument, + <constant>SNDRV_PCM_TRIGGER_XXX</constant> in + <filename><sound/pcm.h></filename>. At least, + <constant>START</constant> and <constant>STOP</constant> + commands must be defined in this callback. + + <informalexample> + <programlisting> +<![CDATA[ + switch (cmd) { + case SNDRV_PCM_TRIGGER_START: + // do something to start the PCM engine + break; + case SNDRV_PCM_TRIGGER_STOP: + // do something to stop the PCM engine + break; + default: + return -EINVAL; + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + When the pcm supports the pause operation (given in info + field of the hardware table), <constant>PAUSE_PUSE</constant> + and <constant>PAUSE_RELEASE</constant> commands must be + handled here, too. The former is the command to pause the pcm, + and the latter to restart the pcm again. + </para> + + <para> + When the pcm supports the suspend/resume operation + (i.e. <constant>SNDRV_PCM_INFO_RESUME</constant> flag is set), + <constant>SUSPEND</constant> and <constant>RESUME</constant> + commands must be handled, too. + These commands are issued when the power-management status is + changed. Obviously, the <constant>SUSPEND</constant> and + <constant>RESUME</constant> + do suspend and resume of the pcm substream, and usually, they + are identical with <constant>STOP</constant> and + <constant>START</constant> commands, respectively. + </para> + + <para> + As mentioned, this callback is atomic. You cannot call + the function going to sleep. + The trigger callback should be as minimal as possible, + just really triggering the DMA. The other stuff should be + initialized hw_params and prepare callbacks properly + beforehand. + </para> + </section> + + <section id="pcm-interface-operators-pointer-callback"> + <title>pointer callback</title> + <para> + <informalexample> + <programlisting> +<![CDATA[ + static snd_pcm_uframes_t snd_xxx_pointer(snd_pcm_substream_t * substream) +]]> + </programlisting> + </informalexample> + + This callback is called when the PCM middle layer inquires + the current hardware position on the buffer. The position must + be returned in frames (which was in bytes on ALSA 0.5.x), + ranged from 0 to buffer_size - 1. + </para> + + <para> + This is called usually from the buffer-update routine in the + pcm middle layer, which is invoked when + <function>snd_pcm_period_elapsed()</function> is called in the + interrupt routine. Then the pcm middle layer updates the + position and calculates the available space, and wakes up the + sleeping poll threads, etc. + </para> + + <para> + This callback is also atomic. + </para> + </section> + + <section id="pcm-interface-operators-copy-silence"> + <title>copy and silence callbacks</title> + <para> + These callbacks are not mandatory, and can be omitted in + most cases. These callbacks are used when the hardware buffer + cannot be on the normal memory space. Some chips have their + own buffer on the hardware which is not mappable. In such a + case, you have to transfer the data manually from the memory + buffer to the hardware buffer. Or, if the buffer is + non-contiguous on both physical and virtual memory spaces, + these callbacks must be defined, too. + </para> + + <para> + If these two callbacks are defined, copy and set-silence + operations are done by them. The detailed will be described in + the later section <link + linkend="buffer-and-memory"><citetitle>Buffer and Memory + Management</citetitle></link>. + </para> + </section> + + <section id="pcm-interface-operators-ack"> + <title>ack callback</title> + <para> + This callback is also not mandatory. This callback is called + when the appl_ptr is updated in read or write operations. + Some drivers like emu10k1-fx and cs46xx need to track the + current appl_ptr for the internal buffer, and this callback + is useful only for such a purpose. + </para> + <para> + This callback is atomic. + </para> + </section> + + <section id="pcm-interface-operators-page-callback"> + <title>page callback</title> + + <para> + This callback is also not mandatory. This callback is used + mainly for the non-contiguous buffer. The mmap calls this + callback to get the page address. Some examples will be + explained in the later section <link + linkend="buffer-and-memory"><citetitle>Buffer and Memory + Management</citetitle></link>, too. + </para> + </section> + </section> + + <section id="pcm-interface-interrupt-handler"> + <title>Interrupt Handler</title> + <para> + The rest of pcm stuff is the PCM interrupt handler. The + role of PCM interrupt handler in the sound driver is to update + the buffer position and to tell the PCM middle layer when the + buffer position goes across the prescribed period size. To + inform this, call <function>snd_pcm_period_elapsed()</function> + function. + </para> + + <para> + There are several types of sound chips to generate the interrupts. + </para> + + <section id="pcm-interface-interrupt-handler-boundary"> + <title>Interrupts at the period (fragment) boundary</title> + <para> + This is the most frequently found type: the hardware + generates an interrupt at each period boundary. + In this case, you can call + <function>snd_pcm_period_elapsed()</function> at each + interrupt. + </para> + + <para> + <function>snd_pcm_period_elapsed()</function> takes the + substream pointer as its argument. Thus, you need to keep the + substream pointer accessible from the chip instance. For + example, define substream field in the chip record to hold the + current running substream pointer, and set the pointer value + at open callback (and reset at close callback). + </para> + + <para> + If you aquire a spinlock in the interrupt handler, and the + lock is used in other pcm callbacks, too, then you have to + release the lock before calling + <function>snd_pcm_period_elapsed()</function>, because + <function>snd_pcm_period_elapsed()</function> calls other pcm + callbacks inside. + </para> + + <para> + A typical coding would be like: + + <example> + <title>Interrupt Handler Case #1</title> + <programlisting> +<![CDATA[ + static irqreturn_t snd_mychip_interrupt(int irq, void *dev_id, + struct pt_regs *regs) + { + mychip_t *chip = dev_id; + spin_lock(&chip->lock); + .... + if (pcm_irq_invoked(chip)) { + /* call updater, unlock before it */ + spin_unlock(&chip->lock); + snd_pcm_period_elapsed(chip->substream); + spin_lock(&chip->lock); + // acknowledge the interrupt if necessary + } + .... + spin_unlock(&chip->lock); + return IRQ_HANDLED; + } +]]> + </programlisting> + </example> + </para> + </section> + + <section id="pcm-interface-interrupt-handler-timer"> + <title>High-frequent timer interrupts</title> + <para> + This is the case when the hardware doesn't generate interrupts + at the period boundary but do timer-interrupts at the fixed + timer rate (e.g. es1968 or ymfpci drivers). + In this case, you need to check the current hardware + position and accumulates the processed sample length at each + interrupt. When the accumulated size overcomes the period + size, call + <function>snd_pcm_period_elapsed()</function> and reset the + accumulator. + </para> + + <para> + A typical coding would be like the following. + + <example> + <title>Interrupt Handler Case #2</title> + <programlisting> +<![CDATA[ + static irqreturn_t snd_mychip_interrupt(int irq, void *dev_id, + struct pt_regs *regs) + { + mychip_t *chip = dev_id; + spin_lock(&chip->lock); + .... + if (pcm_irq_invoked(chip)) { + unsigned int last_ptr, size; + /* get the current hardware pointer (in frames) */ + last_ptr = get_hw_ptr(chip); + /* calculate the processed frames since the + * last update + */ + if (last_ptr < chip->last_ptr) + size = runtime->buffer_size + last_ptr + - chip->last_ptr; + else + size = last_ptr - chip->last_ptr; + /* remember the last updated point */ + chip->last_ptr = last_ptr; + /* accumulate the size */ + chip->size += size; + /* over the period boundary? */ + if (chip->size >= runtime->period_size) { + /* reset the accumulator */ + chip->size %= runtime->period_size; + /* call updater */ + spin_unlock(&chip->lock); + snd_pcm_period_elapsed(substream); + spin_lock(&chip->lock); + } + // acknowledge the interrupt if necessary + } + .... + spin_unlock(&chip->lock); + return IRQ_HANDLED; + } +]]> + </programlisting> + </example> + </para> + </section> + + <section id="pcm-interface-interrupt-handler-both"> + <title>On calling <function>snd_pcm_period_elapsed()</function></title> + <para> + In both cases, even if more than one period are elapsed, you + don't have to call + <function>snd_pcm_period_elapsed()</function> many times. Call + only once. And the pcm layer will check the current hardware + pointer and update to the latest status. + </para> + </section> + </section> + + <section id="pcm-interface-atomicity"> + <title>Atomicity</title> + <para> + One of the most important (and thus difficult to debug) problem + on the kernel programming is the race condition. + On linux kernel, usually it's solved via spin-locks or + semaphores. In general, if the race condition may + happen in the interrupt handler, it's handled as atomic, and you + have to use spinlock for protecting the critical session. If it + never happens in the interrupt and it may take relatively long + time, you should use semaphore. + </para> + + <para> + As already seen, some pcm callbacks are atomic and some are + not. For example, <parameter>hw_params</parameter> callback is + non-atomic, while <parameter>trigger</parameter> callback is + atomic. This means, the latter is called already in a spinlock + held by the PCM middle layer. Please take this atomicity into + account when you use a spinlock or a semaphore in the callbacks. + </para> + + <para> + In the atomic callbacks, you cannot use functions which may call + <function>schedule</function> or go to + <function>sleep</function>. The semaphore and mutex do sleep, + and hence they cannot be used inside the atomic callbacks + (e.g. <parameter>trigger</parameter> callback). + For taking a certain delay in such a callback, please use + <function>udelay()</function> or <function>mdelay()</function>. + </para> + + <para> + All three atomic callbacks (trigger, pointer, and ack) are + called with local interrupts disabled. + </para> + + </section> + <section id="pcm-interface-constraints"> + <title>Constraints</title> + <para> + If your chip supports unconventional sample rates, or only the + limited samples, you need to set a constraint for the + condition. + </para> + + <para> + For example, in order to restrict the sample rates in the some + supported values, use + <function>snd_pcm_hw_constraint_list()</function>. + You need to call this function in the open callback. + + <example> + <title>Example of Hardware Constraints</title> + <programlisting> +<![CDATA[ + static unsigned int rates[] = + {4000, 10000, 22050, 44100}; + static snd_pcm_hw_constraint_list_t constraints_rates = { + .count = ARRAY_SIZE(rates), + .list = rates, + .mask = 0, + }; + + static int snd_mychip_pcm_open(snd_pcm_substream_t *substream) + { + int err; + .... + err = snd_pcm_hw_constraint_list(substream->runtime, 0, + SNDRV_PCM_HW_PARAM_RATE, + &constraints_rates); + if (err < 0) + return err; + .... + } +]]> + </programlisting> + </example> + </para> + + <para> + There are many different constraints. + Look in <filename>sound/pcm.h</filename> for a complete list. + You can even define your own constraint rules. + For example, let's suppose my_chip can manage a substream of 1 channel + if and only if the format is S16_LE, otherwise it supports any format + specified in the <type>snd_pcm_hardware_t</type> stucture (or in any + other constraint_list). You can build a rule like this: + + <example> + <title>Example of Hardware Constraints for Channels</title> + <programlisting> +<![CDATA[ + static int hw_rule_format_by_channels(snd_pcm_hw_params_t *params, + snd_pcm_hw_rule_t *rule) + { + snd_interval_t *c = hw_param_interval(params, SNDRV_PCM_HW_PARAM_CHANNELS); + snd_mask_t *f = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT); + snd_mask_t fmt; + + snd_mask_any(&fmt); /* Init the struct */ + if (c->min < 2) { + fmt.bits[0] &= SNDRV_PCM_FMTBIT_S16_LE; + return snd_mask_refine(f, &fmt); + } + return 0; + } +]]> + </programlisting> + </example> + </para> + + <para> + Then you need to call this function to add your rule: + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_hw_rule_add(substream->runtime, 0, SNDRV_PCM_HW_PARAM_CHANNELS, + hw_rule_channels_by_format, 0, SNDRV_PCM_HW_PARAM_FORMAT, + -1); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The rule function is called when an application sets the number of + channels. But an application can set the format before the number of + channels. Thus you also need to define the inverse rule: + + <example> + <title>Example of Hardware Constraints for Channels</title> + <programlisting> +<![CDATA[ + static int hw_rule_channels_by_format(snd_pcm_hw_params_t *params, + snd_pcm_hw_rule_t *rule) + { + snd_interval_t *c = hw_param_interval(params, SNDRV_PCM_HW_PARAM_CHANNELS); + snd_mask_t *f = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT); + snd_interval_t ch; + + snd_interval_any(&ch); + if (f->bits[0] == SNDRV_PCM_FMTBIT_S16_LE) { + ch.min = ch.max = 1; + ch.integer = 1; + return snd_interval_refine(c, &ch); + } + return 0; + } +]]> + </programlisting> + </example> + </para> + + <para> + ...and in the open callback: + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_hw_rule_add(substream->runtime, 0, SNDRV_PCM_HW_PARAM_FORMAT, + hw_rule_format_by_channels, 0, SNDRV_PCM_HW_PARAM_CHANNELS, + -1); +]]> + </programlisting> + </informalexample> + </para> + + <para> + I won't explain more details here, rather I + would like to say, <quote>Luke, use the source.</quote> + </para> + </section> + + </chapter> + + +<!-- ****************************************************** --> +<!-- Control Interface --> +<!-- ****************************************************** --> + <chapter id="control-interface"> + <title>Control Interface</title> + + <section id="control-interface-general"> + <title>General</title> + <para> + The control interface is used widely for many switches, + sliders, etc. which are accessed from the user-space. Its most + important use is the mixer interface. In other words, on ALSA + 0.9.x, all the mixer stuff is implemented on the control kernel + API (while there was an independent mixer kernel API on 0.5.x). + </para> + + <para> + ALSA has a well-defined AC97 control module. If your chip + supports only the AC97 and nothing else, you can skip this + section. + </para> + + <para> + The control API is defined in + <filename><sound/control.h></filename>. + Include this file if you add your own controls. + </para> + </section> + + <section id="control-interface-definition"> + <title>Definition of Controls</title> + <para> + For creating a new control, you need to define the three + callbacks: <structfield>info</structfield>, + <structfield>get</structfield> and + <structfield>put</structfield>. Then, define a + <type>snd_kcontrol_new_t</type> record, such as: + + <example> + <title>Definition of a Control</title> + <programlisting> +<![CDATA[ + static snd_kcontrol_new_t my_control __devinitdata = { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "PCM Playback Switch", + .index = 0, + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, + .private_values = 0xffff, + .info = my_control_info, + .get = my_control_get, + .put = my_control_put + }; +]]> + </programlisting> + </example> + </para> + + <para> + Most likely the control is created via + <function>snd_ctl_new1()</function>, and in such a case, you can + add <parameter>__devinitdata</parameter> prefix to the + definition like above. + </para> + + <para> + The <structfield>iface</structfield> field specifies the type of + the control, + <constant>SNDRV_CTL_ELEM_IFACE_XXX</constant>. There are + <constant>MIXER</constant>, <constant>PCM</constant>, + <constant>CARD</constant>, etc. + </para> + + <para> + The <structfield>name</structfield> is the name identifier + string. On ALSA 0.9.x, the control name is very important, + because its role is classified from its name. There are + pre-defined standard control names. The details are described in + the subsection + <link linkend="control-interface-control-names"><citetitle> + Control Names</citetitle></link>. + </para> + + <para> + The <structfield>index</structfield> field holds the index number + of this control. If there are several different controls with + the same name, they can be distinguished by the index + number. This is the case when + several codecs exist on the card. If the index is zero, you can + omit the definition above. + </para> + + <para> + The <structfield>access</structfield> field contains the access + type of this control. Give the combination of bit masks, + <constant>SNDRV_CTL_ELEM_ACCESS_XXX</constant>, there. + The detailed will be explained in the subsection + <link linkend="control-interface-access-flags"><citetitle> + Access Flags</citetitle></link>. + </para> + + <para> + The <structfield>private_values</structfield> field contains + an arbitrary long integer value for this record. When using + generic <structfield>info</structfield>, + <structfield>get</structfield> and + <structfield>put</structfield> callbacks, you can pass a value + through this field. If several small numbers are necessary, you can + combine them in bitwise. Or, it's possible to give a pointer + (casted to unsigned long) of some record to this field, too. + </para> + + <para> + The other three are + <link linkend="control-interface-callbacks"><citetitle> + callback functions</citetitle></link>. + </para> + </section> + + <section id="control-interface-control-names"> + <title>Control Names</title> + <para> + There are some standards for defining the control names. A + control is usually defined from the three parts as + <quote>SOURCE DIRECTION FUNCTION</quote>. + </para> + + <para> + The first, <constant>SOURCE</constant>, specifies the source + of the control, and is a string such as <quote>Master</quote>, + <quote>PCM</quote>, <quote>CD</quote> or + <quote>Line</quote>. There are many pre-defined sources. + </para> + + <para> + The second, <constant>DIRECTION</constant>, is one of the + following strings according to the direction of the control: + <quote>Playback</quote>, <quote>Capture</quote>, <quote>Bypass + Playback</quote> and <quote>Bypass Capture</quote>. Or, it can + be omitted, meaning both playback and capture directions. + </para> + + <para> + The third, <constant>FUNCTION</constant>, is one of the + following strings according to the function of the control: + <quote>Switch</quote>, <quote>Volume</quote> and + <quote>Route</quote>. + </para> + + <para> + The example of control names are, thus, <quote>Master Capture + Switch</quote> or <quote>PCM Playback Volume</quote>. + </para> + + <para> + There are some exceptions: + </para> + + <section id="control-interface-control-names-global"> + <title>Global capture and playback</title> + <para> + <quote>Capture Source</quote>, <quote>Capture Switch</quote> + and <quote>Capture Volume</quote> are used for the global + capture (input) source, switch and volume. Similarly, + <quote>Playback Switch</quote> and <quote>Playback + Volume</quote> are used for the global output gain switch and + volume. + </para> + </section> + + <section id="control-interface-control-names-tone"> + <title>Tone-controls</title> + <para> + tone-control switch and volumes are specified like + <quote>Tone Control - XXX</quote>, e.g. <quote>Tone Control - + Switch</quote>, <quote>Tone Control - Bass</quote>, + <quote>Tone Control - Center</quote>. + </para> + </section> + + <section id="control-interface-control-names-3d"> + <title>3D controls</title> + <para> + 3D-control switches and volumes are specified like <quote>3D + Control - XXX</quote>, e.g. <quote>3D Control - + Switch</quote>, <quote>3D Control - Center</quote>, <quote>3D + Control - Space</quote>. + </para> + </section> + + <section id="control-interface-control-names-mic"> + <title>Mic boost</title> + <para> + Mic-boost switch is set as <quote>Mic Boost</quote> or + <quote>Mic Boost (6dB)</quote>. + </para> + + <para> + More precise information can be found in + <filename>Documentation/sound/alsa/ControlNames.txt</filename>. + </para> + </section> + </section> + + <section id="control-interface-access-flags"> + <title>Access Flags</title> + + <para> + The access flag is the bit-flags which specifies the access type + of the given control. The default access type is + <constant>SNDRV_CTL_ELEM_ACCESS_READWRITE</constant>, + which means both read and write are allowed to this control. + When the access flag is omitted (i.e. = 0), it is + regarded as <constant>READWRITE</constant> access as default. + </para> + + <para> + When the control is read-only, pass + <constant>SNDRV_CTL_ELEM_ACCESS_READ</constant> instead. + In this case, you don't have to define + <structfield>put</structfield> callback. + Similarly, when the control is write-only (although it's a rare + case), you can use <constant>WRITE</constant> flag instead, and + you don't need <structfield>get</structfield> callback. + </para> + + <para> + If the control value changes frequently (e.g. the VU meter), + <constant>VOLATILE</constant> flag should be given. This means + that the control may be changed without + <link linkend="control-interface-change-notification"><citetitle> + notification</citetitle></link>. Applications should poll such + a control constantly. + </para> + + <para> + When the control is inactive, set + <constant>INACTIVE</constant> flag, too. + There are <constant>LOCK</constant> and + <constant>OWNER</constant> flags for changing the write + permissions. + </para> + + </section> + + <section id="control-interface-callbacks"> + <title>Callbacks</title> + + <section id="control-interface-callbacks-info"> + <title>info callback</title> + <para> + The <structfield>info</structfield> callback is used to get + the detailed information of this control. This must store the + values of the given <type>snd_ctl_elem_info_t</type> + object. For example, for a boolean control with a single + element will be: + + <example> + <title>Example of info callback</title> + <programlisting> +<![CDATA[ + static int snd_myctl_info(snd_kcontrol_t *kcontrol, + snd_ctl_elem_info_t *uinfo) + { + uinfo->type = SNDRV_CTL_ELEM_TYPE_BOOLEAN; + uinfo->count = 1; + uinfo->value.integer.min = 0; + uinfo->value.integer.max = 1; + return 0; + } +]]> + </programlisting> + </example> + </para> + + <para> + The <structfield>type</structfield> field specifies the type + of the control. There are <constant>BOOLEAN</constant>, + <constant>INTEGER</constant>, <constant>ENUMERATED</constant>, + <constant>BYTES</constant>, <constant>IEC958</constant> and + <constant>INTEGER64</constant>. The + <structfield>count</structfield> field specifies the + number of elements in this control. For example, a stereo + volume would have count = 2. The + <structfield>value</structfield> field is a union, and + the values stored are depending on the type. The boolean and + integer are identical. + </para> + + <para> + The enumerated type is a bit different from others. You'll + need to set the string for the currently given item index. + + <informalexample> + <programlisting> +<![CDATA[ + static int snd_myctl_info(snd_kcontrol_t *kcontrol, + snd_ctl_elem_info_t *uinfo) + { + static char *texts[4] = { + "First", "Second", "Third", "Fourth" + }; + uinfo->type = SNDRV_CTL_ELEM_TYPE_ENUMERATED; + uinfo->count = 1; + uinfo->value.enumerated.items = 4; + if (uinfo->value.enumerated.item > 3) + uinfo->value.enumerated.item = 3; + strcpy(uinfo->value.enumerated.name, + texts[uinfo->value.enumerated.item]); + return 0; + } +]]> + </programlisting> + </informalexample> + </para> + </section> + + <section id="control-interface-callbacks-get"> + <title>get callback</title> + + <para> + This callback is used to read the current value of the + control and to return to the user-space. + </para> + + <para> + For example, + + <example> + <title>Example of get callback</title> + <programlisting> +<![CDATA[ + static int snd_myctl_get(snd_kcontrol_t *kcontrol, + snd_ctl_elem_value_t *ucontrol) + { + mychip_t *chip = snd_kcontrol_chip(kcontrol); + ucontrol->value.integer.value[0] = get_some_value(chip); + return 0; + } +]]> + </programlisting> + </example> + </para> + + <para> + Here, the chip instance is retrieved via + <function>snd_kcontrol_chip()</function> macro. This macro + converts from kcontrol->private_data to the type defined by + <type>chip_t</type>. The + kcontrol->private_data field is + given as the argument of <function>snd_ctl_new()</function> + (see the later subsection + <link linkend="control-interface-constructor"><citetitle>Constructor</citetitle></link>). + </para> + + <para> + The <structfield>value</structfield> field is depending on + the type of control as well as on info callback. For example, + the sb driver uses this field to store the register offset, + the bit-shift and the bit-mask. The + <structfield>private_value</structfield> is set like + <informalexample> + <programlisting> +<![CDATA[ + .private_value = reg | (shift << 16) | (mask << 24) +]]> + </programlisting> + </informalexample> + and is retrieved in callbacks like + <informalexample> + <programlisting> +<![CDATA[ + static int snd_sbmixer_get_single(snd_kcontrol_t *kcontrol, + snd_ctl_elem_value_t *ucontrol) + { + int reg = kcontrol->private_value & 0xff; + int shift = (kcontrol->private_value >> 16) & 0xff; + int mask = (kcontrol->private_value >> 24) & 0xff; + .... + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + In <structfield>get</structfield> callback, you have to fill all the elements if the + control has more than one elements, + i.e. <structfield>count</structfield> > 1. + In the example above, we filled only one element + (<structfield>value.integer.value[0]</structfield>) since it's + assumed as <structfield>count</structfield> = 1. + </para> + </section> + + <section id="control-interface-callbacks-put"> + <title>put callback</title> + + <para> + This callback is used to write a value from the user-space. + </para> + + <para> + For example, + + <example> + <title>Example of put callback</title> + <programlisting> +<![CDATA[ + static int snd_myctl_put(snd_kcontrol_t *kcontrol, + snd_ctl_elem_value_t *ucontrol) + { + mychip_t *chip = snd_kcontrol_chip(kcontrol); + int changed = 0; + if (chip->current_value != + ucontrol->value.integer.value[0]) { + change_current_value(chip, + ucontrol->value.integer.value[0]); + changed = 1; + } + return changed; + } +]]> + </programlisting> + </example> + + As seen above, you have to return 1 if the value is + changed. If the value is not changed, return 0 instead. + If any fatal error happens, return a negative error code as + usual. + </para> + + <para> + Like <structfield>get</structfield> callback, + when the control has more than one elements, + all elemehts must be evaluated in this callback, too. + </para> + </section> + + <section id="control-interface-callbacks-all"> + <title>Callbacks are not atomic</title> + <para> + All these three callbacks are basically not atomic. + </para> + </section> + </section> + + <section id="control-interface-constructor"> + <title>Constructor</title> + <para> + When everything is ready, finally we can create a new + control. For creating a control, there are two functions to be + called, <function>snd_ctl_new1()</function> and + <function>snd_ctl_add()</function>. + </para> + + <para> + In the simplest way, you can do like this: + + <informalexample> + <programlisting> +<![CDATA[ + if ((err = snd_ctl_add(card, snd_ctl_new1(&my_control, chip))) < 0) + return err; +]]> + </programlisting> + </informalexample> + + where <parameter>my_control</parameter> is the + <type>snd_kcontrol_new_t</type> object defined above, and chip + is the object pointer to be passed to + kcontrol->private_data + which can be referred in callbacks. + </para> + + <para> + <function>snd_ctl_new1()</function> allocates a new + <type>snd_kcontrol_t</type> instance (that's why the definition + of <parameter>my_control</parameter> can be with + <parameter>__devinitdata</parameter> + prefix), and <function>snd_ctl_add</function> assigns the given + control component to the card. + </para> + </section> + + <section id="control-interface-change-notification"> + <title>Change Notification</title> + <para> + If you need to change and update a control in the interrupt + routine, you can call <function>snd_ctl_notify()</function>. For + example, + + <informalexample> + <programlisting> +<![CDATA[ + snd_ctl_notify(card, SNDRV_CTL_EVENT_MASK_VALUE, id_pointer); +]]> + </programlisting> + </informalexample> + + This function takes the card pointer, the event-mask, and the + control id pointer for the notification. The event-mask + specifies the types of notification, for example, in the above + example, the change of control values is notified. + The id pointer is the pointer of <type>snd_ctl_elem_id_t</type> + to be notified. + You can find some examples in <filename>es1938.c</filename> or + <filename>es1968.c</filename> for hardware volume interrupts. + </para> + </section> + + </chapter> + + +<!-- ****************************************************** --> +<!-- API for AC97 Codec --> +<!-- ****************************************************** --> + <chapter id="api-ac97"> + <title>API for AC97 Codec</title> + + <section> + <title>General</title> + <para> + The ALSA AC97 codec layer is a well-defined one, and you don't + have to write many codes to control it. Only low-level control + routines are necessary. The AC97 codec API is defined in + <filename><sound/ac97_codec.h></filename>. + </para> + </section> + + <section id="api-ac97-example"> + <title>Full Code Example</title> + <para> + <example> + <title>Example of AC97 Interface</title> + <programlisting> +<![CDATA[ + struct snd_mychip { + .... + ac97_t *ac97; + .... + }; + + static unsigned short snd_mychip_ac97_read(ac97_t *ac97, + unsigned short reg) + { + mychip_t *chip = ac97->private_data; + .... + // read a register value here from the codec + return the_register_value; + } + + static void snd_mychip_ac97_write(ac97_t *ac97, + unsigned short reg, unsigned short val) + { + mychip_t *chip = ac97->private_data; + .... + // write the given register value to the codec + } + + static int snd_mychip_ac97(mychip_t *chip) + { + ac97_bus_t *bus; + ac97_template_t ac97; + int err; + static ac97_bus_ops_t ops = { + .write = snd_mychip_ac97_write, + .read = snd_mychip_ac97_read, + }; + + if ((err = snd_ac97_bus(chip->card, 0, &ops, NULL, &bus)) < 0) + return err; + memset(&ac97, 0, sizeof(ac97)); + ac97.private_data = chip; + return snd_ac97_mixer(bus, &ac97, &chip->ac97); + } + +]]> + </programlisting> + </example> + </para> + </section> + + <section id="api-ac97-constructor"> + <title>Constructor</title> + <para> + For creating an ac97 instance, first call <function>snd_ac97_bus</function> + with an <type>ac97_bus_ops_t</type> record with callback functions. + + <informalexample> + <programlisting> +<![CDATA[ + ac97_bus_t *bus; + static ac97_bus_ops_t ops = { + .write = snd_mychip_ac97_write, + .read = snd_mychip_ac97_read, + }; + + snd_ac97_bus(card, 0, &ops, NULL, &pbus); +]]> + </programlisting> + </informalexample> + + The bus record is shared among all belonging ac97 instances. + </para> + + <para> + And then call <function>snd_ac97_mixer()</function> with an <type>ac97_template_t</type> + record together with the bus pointer created above. + + <informalexample> + <programlisting> +<![CDATA[ + ac97_template_t ac97; + int err; + + memset(&ac97, 0, sizeof(ac97)); + ac97.private_data = chip; + snd_ac97_mixer(bus, &ac97, &chip->ac97); +]]> + </programlisting> + </informalexample> + + where chip->ac97 is the pointer of a newly created + <type>ac97_t</type> instance. + In this case, the chip pointer is set as the private data, so that + the read/write callback functions can refer to this chip instance. + This instance is not necessarily stored in the chip + record. When you need to change the register values from the + driver, or need the suspend/resume of ac97 codecs, keep this + pointer to pass to the corresponding functions. + </para> + </section> + + <section id="api-ac97-callbacks"> + <title>Callbacks</title> + <para> + The standard callbacks are <structfield>read</structfield> and + <structfield>write</structfield>. Obviously they + correspond to the functions for read and write accesses to the + hardware low-level codes. + </para> + + <para> + The <structfield>read</structfield> callback returns the + register value specified in the argument. + + <informalexample> + <programlisting> +<![CDATA[ + static unsigned short snd_mychip_ac97_read(ac97_t *ac97, + unsigned short reg) + { + mychip_t *chip = ac97->private_data; + .... + return the_register_value; + } +]]> + </programlisting> + </informalexample> + + Here, the chip can be cast from ac97->private_data. + </para> + + <para> + Meanwhile, the <structfield>write</structfield> callback is + used to set the register value. + + <informalexample> + <programlisting> +<![CDATA[ + static void snd_mychip_ac97_write(ac97_t *ac97, + unsigned short reg, unsigned short val) +]]> + </programlisting> + </informalexample> + </para> + + <para> + These callbacks are non-atomic like the callbacks of control API. + </para> + + <para> + There are also other callbacks: + <structfield>reset</structfield>, + <structfield>wait</structfield> and + <structfield>init</structfield>. + </para> + + <para> + The <structfield>reset</structfield> callback is used to reset + the codec. If the chip requires a special way of reset, you can + define this callback. + </para> + + <para> + The <structfield>wait</structfield> callback is used for a + certain wait at the standard initialization of the codec. If the + chip requires the extra wait-time, define this callback. + </para> + + <para> + The <structfield>init</structfield> callback is used for + additional initialization of the codec. + </para> + </section> + + <section id="api-ac97-updating-registers"> + <title>Updating Registers in The Driver</title> + <para> + If you need to access to the codec from the driver, you can + call the following functions: + <function>snd_ac97_write()</function>, + <function>snd_ac97_read()</function>, + <function>snd_ac97_update()</function> and + <function>snd_ac97_update_bits()</function>. + </para> + + <para> + Both <function>snd_ac97_write()</function> and + <function>snd_ac97_update()</function> functions are used to + set a value to the given register + (<constant>AC97_XXX</constant>). The difference between them is + that <function>snd_ac97_update()</function> doesn't write a + value if the given value has been already set, while + <function>snd_ac97_write()</function> always rewrites the + value. + + <informalexample> + <programlisting> +<![CDATA[ + snd_ac97_write(ac97, AC97_MASTER, 0x8080); + snd_ac97_update(ac97, AC97_MASTER, 0x8080); +]]> + </programlisting> + </informalexample> + </para> + + <para> + <function>snd_ac97_read()</function> is used to read the value + of the given register. For example, + + <informalexample> + <programlisting> +<![CDATA[ + value = snd_ac97_read(ac97, AC97_MASTER); +]]> + </programlisting> + </informalexample> + </para> + + <para> + <function>snd_ac97_update_bits()</function> is used to update + some bits of the given register. + + <informalexample> + <programlisting> +<![CDATA[ + snd_ac97_update_bits(ac97, reg, mask, value); +]]> + </programlisting> + </informalexample> + </para> + + <para> + Also, there is a function to change the sample rate (of a + certain register such as + <constant>AC97_PCM_FRONT_DAC_RATE</constant>) when VRA or + DRA is supported by the codec: + <function>snd_ac97_set_rate()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + snd_ac97_set_rate(ac97, AC97_PCM_FRONT_DAC_RATE, 44100); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The following registers are available for setting the rate: + <constant>AC97_PCM_MIC_ADC_RATE</constant>, + <constant>AC97_PCM_FRONT_DAC_RATE</constant>, + <constant>AC97_PCM_LR_ADC_RATE</constant>, + <constant>AC97_SPDIF</constant>. When the + <constant>AC97_SPDIF</constant> is specified, the register is + not really changed but the corresponding IEC958 status bits will + be updated. + </para> + </section> + + <section id="api-ac97-clock-adjustment"> + <title>Clock Adjustment</title> + <para> + On some chip, the clock of the codec isn't 48000 but using a + PCI clock (to save a quartz!). In this case, change the field + bus->clock to the corresponding + value. For example, intel8x0 + and es1968 drivers have the auto-measurement function of the + clock. + </para> + </section> + + <section id="api-ac97-proc-files"> + <title>Proc Files</title> + <para> + The ALSA AC97 interface will create a proc file such as + <filename>/proc/asound/card0/codec97#0/ac97#0-0</filename> and + <filename>ac97#0-0+regs</filename>. You can refer to these files to + see the current status and registers of the codec. + </para> + </section> + + <section id="api-ac97-multiple-codecs"> + <title>Multiple Codecs</title> + <para> + When there are several codecs on the same card, you need to + call <function>snd_ac97_new()</function> multiple times with + ac97.num=1 or greater. The <structfield>num</structfield> field + specifies the codec + number. + </para> + + <para> + If you have set up multiple codecs, you need to either write + different callbacks for each codec or check + ac97->num in the + callback routines. + </para> + </section> + + </chapter> + + +<!-- ****************************************************** --> +<!-- MIDI (MPU401-UART) Interface --> +<!-- ****************************************************** --> + <chapter id="midi-interface"> + <title>MIDI (MPU401-UART) Interface</title> + + <section id="midi-interface-general"> + <title>General</title> + <para> + Many soundcards have built-in MIDI (MPU401-UART) + interfaces. When the soundcard supports the standard MPU401-UART + interface, most likely you can use the ALSA MPU401-UART API. The + MPU401-UART API is defined in + <filename><sound/mpu401.h></filename>. + </para> + + <para> + Some soundchips have similar but a little bit different + implementation of mpu401 stuff. For example, emu10k1 has its own + mpu401 routines. + </para> + </section> + + <section id="midi-interface-constructor"> + <title>Constructor</title> + <para> + For creating a rawmidi object, call + <function>snd_mpu401_uart_new()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + snd_rawmidi_t *rmidi; + snd_mpu401_uart_new(card, 0, MPU401_HW_MPU401, port, integrated, + irq, irq_flags, &rmidi); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The first argument is the card pointer, and the second is the + index of this component. You can create up to 8 rawmidi + devices. + </para> + + <para> + The third argument is the type of the hardware, + <constant>MPU401_HW_XXX</constant>. If it's not a special one, + you can use <constant>MPU401_HW_MPU401</constant>. + </para> + + <para> + The 4th argument is the i/o port address. Many + backward-compatible MPU401 has an i/o port such as 0x330. Or, it + might be a part of its own PCI i/o region. It depends on the + chip design. + </para> + + <para> + When the i/o port address above is a part of the PCI i/o + region, the MPU401 i/o port might have been already allocated + (reserved) by the driver itself. In such a case, pass non-zero + to the 5th argument + (<parameter>integrated</parameter>). Otherwise, pass 0 to it, + and + the mpu401-uart layer will allocate the i/o ports by itself. + </para> + + <para> + Usually, the port address corresponds to the command port and + port + 1 corresponds to the data port. If not, you may change + the <structfield>cport</structfield> field of + <type>mpu401_t</type> manually + afterward. However, <type>mpu401_t</type> pointer is not + returned explicitly by + <function>snd_mpu401_uart_new()</function>. You need to cast + rmidi->private_data to + <type>mpu401_t</type> explicitly, + + <informalexample> + <programlisting> +<![CDATA[ + mpu401_t *mpu; + mpu = rmidi->private_data; +]]> + </programlisting> + </informalexample> + + and reset the cport as you like: + + <informalexample> + <programlisting> +<![CDATA[ + mpu->cport = my_own_control_port; +]]> + </programlisting> + </informalexample> + </para> + + <para> + The 6th argument specifies the irq number for UART. If the irq + is already allocated, pass 0 to the 7th argument + (<parameter>irq_flags</parameter>). Otherwise, pass the flags + for irq allocation + (<constant>SA_XXX</constant> bits) to it, and the irq will be + reserved by the mpu401-uart layer. If the card doesn't generates + UART interrupts, pass -1 as the irq number. Then a timer + interrupt will be invoked for polling. + </para> + </section> + + <section id="midi-interface-interrupt-handler"> + <title>Interrupt Handler</title> + <para> + When the interrupt is allocated in + <function>snd_mpu401_uart_new()</function>, the private + interrupt handler is used, hence you don't have to do nothing + else than creating the mpu401 stuff. Otherwise, you have to call + <function>snd_mpu401_uart_interrupt()</function> explicitly when + a UART interrupt is invoked and checked in your own interrupt + handler. + </para> + + <para> + In this case, you need to pass the private_data of the + returned rawmidi object from + <function>snd_mpu401_uart_new()</function> as the second + argument of <function>snd_mpu401_uart_interrupt()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + snd_mpu401_uart_interrupt(irq, rmidi->private_data, regs); +]]> + </programlisting> + </informalexample> + </para> + </section> + + </chapter> + + +<!-- ****************************************************** --> +<!-- RawMIDI Interface --> +<!-- ****************************************************** --> + <chapter id="rawmidi-interface"> + <title>RawMIDI Interface</title> + + <section id="rawmidi-interface-overview"> + <title>Overview</title> + + <para> + The raw MIDI interface is used for hardware MIDI ports that can + be accessed as a byte stream. It is not used for synthesizer + chips that do not directly understand MIDI. + </para> + + <para> + ALSA handles file and buffer management. All you have to do is + to write some code to move data between the buffer and the + hardware. + </para> + + <para> + The rawmidi API is defined in + <filename><sound/rawmidi.h></filename>. + </para> + </section> + + <section id="rawmidi-interface-constructor"> + <title>Constructor</title> + + <para> + To create a rawmidi device, call the + <function>snd_rawmidi_new</function> function: + <informalexample> + <programlisting> +<![CDATA[ + snd_rawmidi_t *rmidi; + err = snd_rawmidi_new(chip->card, "MyMIDI", 0, outs, ins, &rmidi); + if (err < 0) + return err; + rmidi->private_data = chip; + strcpy(rmidi->name, "My MIDI"); + rmidi->info_flags = SNDRV_RAWMIDI_INFO_OUTPUT | + SNDRV_RAWMIDI_INFO_INPUT | + SNDRV_RAWMIDI_INFO_DUPLEX; +]]> + </programlisting> + </informalexample> + </para> + + <para> + The first argument is the card pointer, the second argument is + the ID string. + </para> + + <para> + The third argument is the index of this component. You can + create up to 8 rawmidi devices. + </para> + + <para> + The fourth and fifth arguments are the number of output and + input substreams, respectively, of this device. (A substream is + the equivalent of a MIDI port.) + </para> + + <para> + Set the <structfield>info_flags</structfield> field to specify + the capabilities of the device. + Set <constant>SNDRV_RAWMIDI_INFO_OUTPUT</constant> if there is + at least one output port, + <constant>SNDRV_RAWMIDI_INFO_INPUT</constant> if there is at + least one input port, + and <constant>SNDRV_RAWMIDI_INFO_DUPLEX</constant> if the device + can handle output and input at the same time. + </para> + + <para> + After the rawmidi device is created, you need to set the + operators (callbacks) for each substream. There are helper + functions to set the operators for all substream of a device: + <informalexample> + <programlisting> +<![CDATA[ + snd_rawmidi_set_ops(rmidi, SNDRV_RAWMIDI_STREAM_OUTPUT, &snd_mymidi_output_ops); + snd_rawmidi_set_ops(rmidi, SNDRV_RAWMIDI_STREAM_INPUT, &snd_mymidi_input_ops); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The operators are usually defined like this: + <informalexample> + <programlisting> +<![CDATA[ + static snd_rawmidi_ops_t snd_mymidi_output_ops = { + .open = snd_mymidi_output_open, + .close = snd_mymidi_output_close, + .trigger = snd_mymidi_output_trigger, + }; +]]> + </programlisting> + </informalexample> + These callbacks are explained in the <link + linkend="rawmidi-interface-callbacks"><citetitle>Callbacks</citetitle></link> + section. + </para> + + <para> + If there is more than one substream, you should give each one a + unique name: + <informalexample> + <programlisting> +<![CDATA[ + struct list_head *list; + snd_rawmidi_substream_t *substream; + list_for_each(list, &rmidi->streams[SNDRV_RAWMIDI_STREAM_OUTPUT].substreams) { + substream = list_entry(list, snd_rawmidi_substream_t, list); + sprintf(substream->name, "My MIDI Port %d", substream->number + 1); + } + /* same for SNDRV_RAWMIDI_STREAM_INPUT */ +]]> + </programlisting> + </informalexample> + </para> + </section> + + <section id="rawmidi-interface-callbacks"> + <title>Callbacks</title> + + <para> + In all callbacks, the private data that you've set for the + rawmidi device can be accessed as + substream->rmidi->private_data. + <!-- <code> isn't available before DocBook 4.3 --> + </para> + + <para> + If there is more than one port, your callbacks can determine the + port index from the snd_rawmidi_substream_t data passed to each + callback: + <informalexample> + <programlisting> +<![CDATA[ + snd_rawmidi_substream_t *substream; + int index = substream->number; +]]> + </programlisting> + </informalexample> + </para> + + <section id="rawmidi-interface-op-open"> + <title><function>open</function> callback</title> + + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_open(snd_rawmidi_substream_t *substream); +]]> + </programlisting> + </informalexample> + + <para> + This is called when a substream is opened. + You can initialize the hardware here, but you should not yet + start transmitting/receiving data. + </para> + </section> + + <section id="rawmidi-interface-op-close"> + <title><function>close</function> callback</title> + + <informalexample> + <programlisting> +<![CDATA[ + static int snd_xxx_close(snd_rawmidi_substream_t *substream); +]]> + </programlisting> + </informalexample> + + <para> + Guess what. + </para> + + <para> + The <function>open</function> and <function>close</function> + callbacks of a rawmidi device are serialized with a mutex, + and can sleep. + </para> + </section> + + <section id="rawmidi-interface-op-trigger-out"> + <title><function>trigger</function> callback for output + substreams</title> + + <informalexample> + <programlisting> +<![CDATA[ + static void snd_xxx_output_trigger(snd_rawmidi_substream_t *substream, int up); +]]> + </programlisting> + </informalexample> + + <para> + This is called with a nonzero <parameter>up</parameter> + parameter when there is some data in the substream buffer that + must be transmitted. + </para> + + <para> + To read data from the buffer, call + <function>snd_rawmidi_transmit_peek</function>. It will + return the number of bytes that have been read; this will be + less than the number of bytes requested when there is no more + data in the buffer. + After the data has been transmitted successfully, call + <function>snd_rawmidi_transmit_ack</function> to remove the + data from the substream buffer: + <informalexample> + <programlisting> +<![CDATA[ + unsigned char data; + while (snd_rawmidi_transmit_peek(substream, &data, 1) == 1) { + if (mychip_try_to_transmit(data)) + snd_rawmidi_transmit_ack(substream, 1); + else + break; /* hardware FIFO full */ + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + If you know beforehand that the hardware will accept data, you + can use the <function>snd_rawmidi_transmit</function> function + which reads some data and removes it from the buffer at once: + <informalexample> + <programlisting> +<![CDATA[ + while (mychip_transmit_possible()) { + unsigned char data; + if (snd_rawmidi_transmit(substream, &data, 1) != 1) + break; /* no more data */ + mychip_transmit(data); + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + If you know beforehand how many bytes you can accept, you can + use a buffer size greater than one with the + <function>snd_rawmidi_transmit*</function> functions. + </para> + + <para> + The <function>trigger</function> callback must not sleep. If + the hardware FIFO is full before the substream buffer has been + emptied, you have to continue transmitting data later, either + in an interrupt handler, or with a timer if the hardware + doesn't have a MIDI transmit interrupt. + </para> + + <para> + The <function>trigger</function> callback is called with a + zero <parameter>up</parameter> parameter when the transmission + of data should be aborted. + </para> + </section> + + <section id="rawmidi-interface-op-trigger-in"> + <title><function>trigger</function> callback for input + substreams</title> + + <informalexample> + <programlisting> +<![CDATA[ + static void snd_xxx_input_trigger(snd_rawmidi_substream_t *substream, int up); +]]> + </programlisting> + </informalexample> + + <para> + This is called with a nonzero <parameter>up</parameter> + parameter to enable receiving data, or with a zero + <parameter>up</parameter> parameter do disable receiving data. + </para> + + <para> + The <function>trigger</function> callback must not sleep; the + actual reading of data from the device is usually done in an + interrupt handler. + </para> + + <para> + When data reception is enabled, your interrupt handler should + call <function>snd_rawmidi_receive</function> for all received + data: + <informalexample> + <programlisting> +<![CDATA[ + void snd_mychip_midi_interrupt(...) + { + while (mychip_midi_available()) { + unsigned char data; + data = mychip_midi_read(); + snd_rawmidi_receive(substream, &data, 1); + } + } +]]> + </programlisting> + </informalexample> + </para> + </section> + + <section id="rawmidi-interface-op-drain"> + <title><function>drain</function> callback</title> + + <informalexample> + <programlisting> +<![CDATA[ + static void snd_xxx_drain(snd_rawmidi_substream_t *substream); +]]> + </programlisting> + </informalexample> + + <para> + This is only used with output substreams. This function should wait + until all data read from the substream buffer has been transmitted. + This ensures that the device can be closed and the driver unloaded + without losing data. + </para> + + <para> + This callback is optional. If you do not set + <structfield>drain</structfield> in the snd_rawmidi_ops_t + structure, ALSA will simply wait for 50 milliseconds + instead. + </para> + </section> + </section> + + </chapter> + + +<!-- ****************************************************** --> +<!-- Miscellaneous Devices --> +<!-- ****************************************************** --> + <chapter id="misc-devices"> + <title>Miscellaneous Devices</title> + + <section id="misc-devices-opl3"> + <title>FM OPL3</title> + <para> + The FM OPL3 is still used on many chips (mainly for backward + compatibility). ALSA has a nice OPL3 FM control layer, too. The + OPL3 API is defined in + <filename><sound/opl3.h></filename>. + </para> + + <para> + FM registers can be directly accessed through direct-FM API, + defined in <filename><sound/asound_fm.h></filename>. In + ALSA native mode, FM registers are accessed through + Hardware-Dependant Device direct-FM extension API, whereas in + OSS compatible mode, FM registers can be accessed with OSS + direct-FM compatible API on <filename>/dev/dmfmX</filename> device. + </para> + + <para> + For creating the OPL3 component, you have two functions to + call. The first one is a constructor for <type>opl3_t</type> + instance. + + <informalexample> + <programlisting> +<![CDATA[ + opl3_t *opl3; + snd_opl3_create(card, lport, rport, OPL3_HW_OPL3_XXX, + integrated, &opl3); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The first argument is the card pointer, the second one is the + left port address, and the third is the right port address. In + most cases, the right port is placed at the left port + 2. + </para> + + <para> + The fourth argument is the hardware type. + </para> + + <para> + When the left and right ports have been already allocated by + the card driver, pass non-zero to the fifth argument + (<parameter>integrated</parameter>). Otherwise, opl3 module will + allocate the specified ports by itself. + </para> + + <para> + When the accessing to the hardware requires special method + instead of the standard I/O access, you can create opl3 instance + separately with <function>snd_opl3_new()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + opl3_t *opl3; + snd_opl3_new(card, OPL3_HW_OPL3_XXX, &opl3); +]]> + </programlisting> + </informalexample> + </para> + + <para> + Then set <structfield>command</structfield>, + <structfield>private_data</structfield> and + <structfield>private_free</structfield> for the private + access function, the private data and the destructor. + The l_port and r_port are not necessarily set. Only the + command must be set properly. You can retrieve the data + from opl3->private_data field. + </para> + + <para> + After creating the opl3 instance via <function>snd_opl3_new()</function>, + call <function>snd_opl3_init()</function> to initialize the chip to the + proper state. Note that <function>snd_opl3_create()</function> always + calls it internally. + </para> + + <para> + If the opl3 instance is created successfully, then create a + hwdep device for this opl3. + + <informalexample> + <programlisting> +<![CDATA[ + snd_hwdep_t *opl3hwdep; + snd_opl3_hwdep_new(opl3, 0, 1, &opl3hwdep); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The first argument is the <type>opl3_t</type> instance you + created, and the second is the index number, usually 0. + </para> + + <para> + The third argument is the index-offset for the sequencer + client assigned to the OPL3 port. When there is an MPU401-UART, + give 1 for here (UART always takes 0). + </para> + </section> + + <section id="misc-devices-hardware-dependent"> + <title>Hardware-Dependent Devices</title> + <para> + Some chips need the access from the user-space for special + controls or for loading the micro code. In such a case, you can + create a hwdep (hardware-dependent) device. The hwdep API is + defined in <filename><sound/hwdep.h></filename>. You can + find examples in opl3 driver or + <filename>isa/sb/sb16_csp.c</filename>. + </para> + + <para> + Creation of the <type>hwdep</type> instance is done via + <function>snd_hwdep_new()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + snd_hwdep_t *hw; + snd_hwdep_new(card, "My HWDEP", 0, &hw); +]]> + </programlisting> + </informalexample> + + where the third argument is the index number. + </para> + + <para> + You can then pass any pointer value to the + <parameter>private_data</parameter>. + If you assign a private data, you should define the + destructor, too. The destructor function is set to + <structfield>private_free</structfield> field. + + <informalexample> + <programlisting> +<![CDATA[ + mydata_t *p = kmalloc(sizeof(*p), GFP_KERNEL); + hw->private_data = p; + hw->private_free = mydata_free; +]]> + </programlisting> + </informalexample> + + and the implementation of destructor would be: + + <informalexample> + <programlisting> +<![CDATA[ + static void mydata_free(snd_hwdep_t *hw) + { + mydata_t *p = hw->private_data; + kfree(p); + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + The arbitrary file operations can be defined for this + instance. The file operators are defined in + <parameter>ops</parameter> table. For example, assume that + this chip needs an ioctl. + + <informalexample> + <programlisting> +<![CDATA[ + hw->ops.open = mydata_open; + hw->ops.ioctl = mydata_ioctl; + hw->ops.release = mydata_release; +]]> + </programlisting> + </informalexample> + + And implement the callback functions as you like. + </para> + </section> + + <section id="misc-devices-IEC958"> + <title>IEC958 (S/PDIF)</title> + <para> + Usually the controls for IEC958 devices are implemented via + control interface. There is a macro to compose a name string for + IEC958 controls, <function>SNDRV_CTL_NAME_IEC958()</function> + defined in <filename><include/asound.h></filename>. + </para> + + <para> + There are some standard controls for IEC958 status bits. These + controls use the type <type>SNDRV_CTL_ELEM_TYPE_IEC958</type>, + and the size of element is fixed as 4 bytes array + (value.iec958.status[x]). For <structfield>info</structfield> + callback, you don't specify + the value field for this type (the count field must be set, + though). + </para> + + <para> + <quote>IEC958 Playback Con Mask</quote> is used to return the + bit-mask for the IEC958 status bits of consumer mode. Similarly, + <quote>IEC958 Playback Pro Mask</quote> returns the bitmask for + professional mode. They are read-only controls, and are defined + as MIXER controls (iface = + <constant>SNDRV_CTL_ELEM_IFACE_MIXER</constant>). + </para> + + <para> + Meanwhile, <quote>IEC958 Playback Default</quote> control is + defined for getting and setting the current default IEC958 + bits. Note that this one is usually defined as a PCM control + (iface = <constant>SNDRV_CTL_ELEM_IFACE_PCM</constant>), + although in some places it's defined as a MIXER control. + </para> + + <para> + In addition, you can define the control switches to + enable/disable or to set the raw bit mode. The implementation + will depend on the chip, but the control should be named as + <quote>IEC958 xxx</quote>, preferably using + <function>SNDRV_CTL_NAME_IEC958()</function> macro. + </para> + + <para> + You can find several cases, for example, + <filename>pci/emu10k1</filename>, + <filename>pci/ice1712</filename>, or + <filename>pci/cmipci.c</filename>. + </para> + </section> + + </chapter> + + +<!-- ****************************************************** --> +<!-- Buffer and Memory Management --> +<!-- ****************************************************** --> + <chapter id="buffer-and-memory"> + <title>Buffer and Memory Management</title> + + <section id="buffer-and-memory-buffer-types"> + <title>Buffer Types</title> + <para> + ALSA provides several different buffer allocation functions + depending on the bus and the architecture. All these have a + consistent API. The allocation of physically-contiguous pages is + done via + <function>snd_malloc_xxx_pages()</function> function, where xxx + is the bus type. + </para> + + <para> + The allocation of pages with fallback is + <function>snd_malloc_xxx_pages_fallback()</function>. This + function tries to allocate the specified pages but if the pages + are not available, it tries to reduce the page sizes until the + enough space is found. + </para> + + <para> + For releasing the space, call + <function>snd_free_xxx_pages()</function> function. + </para> + + <para> + Usually, ALSA drivers try to allocate and reserve + a large contiguous physical space + at the time the module is loaded for the later use. + This is called <quote>pre-allocation</quote>. + As already written, you can call the following function at the + construction of pcm instance (in the case of PCI bus). + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_lib_preallocate_pages_for_all(pcm, SNDRV_DMA_TYPE_DEV, + snd_dma_pci_data(pci), size, max); +]]> + </programlisting> + </informalexample> + + where <parameter>size</parameter> is the byte size to be + pre-allocated and the <parameter>max</parameter> is the maximal + size to be changed via <filename>prealloc</filename> proc file. + The allocator will try to get as large area as possible + within the given size. + </para> + + <para> + The second argument (type) and the third argument (device pointer) + are dependent on the bus. + In the case of ISA bus, pass <function>snd_dma_isa_data()</function> + as the third argument with <constant>SNDRV_DMA_TYPE_DEV</constant> type. + For the continuous buffer unrelated to the bus can be pre-allocated + with <constant>SNDRV_DMA_TYPE_CONTINUOUS</constant> type and the + <function>snd_dma_continuous_data(GFP_KERNEL)</function> device pointer, + whereh <constant>GFP_KERNEL</constant> is the kernel allocation flag to + use. For the SBUS, <constant>SNDRV_DMA_TYPE_SBUS</constant> and + <function>snd_dma_sbus_data(sbus_dev)</function> are used instead. + For the PCI scatter-gather buffers, use + <constant>SNDRV_DMA_TYPE_DEV_SG</constant> with + <function>snd_dma_pci_data(pci)</function> + (see the section + <link linkend="buffer-and-memory-non-contiguous"><citetitle>Non-Contiguous Buffers + </citetitle></link>). + </para> + + <para> + Once when the buffer is pre-allocated, you can use the + allocator in the <structfield>hw_params</structfield> callback + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_lib_malloc_pages(substream, size); +]]> + </programlisting> + </informalexample> + + Note that you have to pre-allocate to use this function. + </para> + </section> + + <section id="buffer-and-memory-external-hardware"> + <title>External Hardware Buffers</title> + <para> + Some chips have their own hardware buffers and the DMA + transfer from the host memory is not available. In such a case, + you need to either 1) copy/set the audio data directly to the + external hardware buffer, or 2) make an intermediate buffer and + copy/set the data from it to the external hardware buffer in + interrupts (or in tasklets, preferably). + </para> + + <para> + The first case works fine if the external hardware buffer is enough + large. This method doesn't need any extra buffers and thus is + more effective. You need to define the + <structfield>copy</structfield> and + <structfield>silence</structfield> callbacks for + the data transfer. However, there is a drawback: it cannot + be mmapped. The examples are GUS's GF1 PCM or emu8000's + wavetable PCM. + </para> + + <para> + The second case allows the mmap of the buffer, although you have + to handle an interrupt or a tasklet for transferring the data + from the intermediate buffer to the hardware buffer. You can find an + example in vxpocket driver. + </para> + + <para> + Another case is that the chip uses a PCI memory-map + region for the buffer instead of the host memory. In this case, + mmap is available only on certain architectures like intel. In + non-mmap mode, the data cannot be transferred as the normal + way. Thus you need to define <structfield>copy</structfield> and + <structfield>silence</structfield> callbacks as well + as in the cases above. The examples are found in + <filename>rme32.c</filename> and <filename>rme96.c</filename>. + </para> + + <para> + The implementation of <structfield>copy</structfield> and + <structfield>silence</structfield> callbacks depends upon + whether the hardware supports interleaved or non-interleaved + samples. The <structfield>copy</structfield> callback is + defined like below, a bit + differently depending whether the direction is playback or + capture: + + <informalexample> + <programlisting> +<![CDATA[ + static int playback_copy(snd_pcm_substream_t *substream, int channel, + snd_pcm_uframes_t pos, void *src, snd_pcm_uframes_t count); + static int capture_copy(snd_pcm_substream_t *substream, int channel, + snd_pcm_uframes_t pos, void *dst, snd_pcm_uframes_t count); +]]> + </programlisting> + </informalexample> + </para> + + <para> + In the case of interleaved samples, the second argument + (<parameter>channel</parameter>) is not used. The third argument + (<parameter>pos</parameter>) points the + current position offset in frames. + </para> + + <para> + The meaning of the fourth argument is different between + playback and capture. For playback, it holds the source data + pointer, and for capture, it's the destination data pointer. + </para> + + <para> + The last argument is the number of frames to be copied. + </para> + + <para> + What you have to do in this callback is again different + between playback and capture directions. In the case of + playback, you do: copy the given amount of data + (<parameter>count</parameter>) at the specified pointer + (<parameter>src</parameter>) to the specified offset + (<parameter>pos</parameter>) on the hardware buffer. When + coded like memcpy-like way, the copy would be like: + + <informalexample> + <programlisting> +<![CDATA[ + my_memcpy(my_buffer + frames_to_bytes(runtime, pos), src, + frames_to_bytes(runtime, count)); +]]> + </programlisting> + </informalexample> + </para> + + <para> + For the capture direction, you do: copy the given amount of + data (<parameter>count</parameter>) at the specified offset + (<parameter>pos</parameter>) on the hardware buffer to the + specified pointer (<parameter>dst</parameter>). + + <informalexample> + <programlisting> +<![CDATA[ + my_memcpy(dst, my_buffer + frames_to_bytes(runtime, pos), + frames_to_bytes(runtime, count)); +]]> + </programlisting> + </informalexample> + + Note that both of the position and the data amount are given + in frames. + </para> + + <para> + In the case of non-interleaved samples, the implementation + will be a bit more complicated. + </para> + + <para> + You need to check the channel argument, and if it's -1, copy + the whole channels. Otherwise, you have to copy only the + specified channel. Please check + <filename>isa/gus/gus_pcm.c</filename> as an example. + </para> + + <para> + The <structfield>silence</structfield> callback is also + implemented in a similar way. + + <informalexample> + <programlisting> +<![CDATA[ + static int silence(snd_pcm_substream_t *substream, int channel, + snd_pcm_uframes_t pos, snd_pcm_uframes_t count); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The meanings of arguments are identical with the + <structfield>copy</structfield> + callback, although there is no <parameter>src/dst</parameter> + argument. In the case of interleaved samples, the channel + argument has no meaning, as well as on + <structfield>copy</structfield> callback. + </para> + + <para> + The role of <structfield>silence</structfield> callback is to + set the given amount + (<parameter>count</parameter>) of silence data at the + specified offset (<parameter>pos</parameter>) on the hardware + buffer. Suppose that the data format is signed (that is, the + silent-data is 0), and the implementation using a memset-like + function would be like: + + <informalexample> + <programlisting> +<![CDATA[ + my_memcpy(my_buffer + frames_to_bytes(runtime, pos), 0, + frames_to_bytes(runtime, count)); +]]> + </programlisting> + </informalexample> + </para> + + <para> + In the case of non-interleaved samples, again, the + implementation becomes a bit more complicated. See, for example, + <filename>isa/gus/gus_pcm.c</filename>. + </para> + </section> + + <section id="buffer-and-memory-non-contiguous"> + <title>Non-Contiguous Buffers</title> + <para> + If your hardware supports the page table like emu10k1 or the + buffer descriptors like via82xx, you can use the scatter-gather + (SG) DMA. ALSA provides an interface for handling SG-buffers. + The API is provided in <filename><sound/pcm.h></filename>. + </para> + + <para> + For creating the SG-buffer handler, call + <function>snd_pcm_lib_preallocate_pages()</function> or + <function>snd_pcm_lib_preallocate_pages_for_all()</function> + with <constant>SNDRV_DMA_TYPE_DEV_SG</constant> + in the PCM constructor like other PCI pre-allocator. + You need to pass the <function>snd_dma_pci_data(pci)</function>, + where pci is the struct <structname>pci_dev</structname> pointer + of the chip as well. + The <type>snd_sg_buf_t</type> instance is created as + substream->dma_private. You can cast + the pointer like: + + <informalexample> + <programlisting> +<![CDATA[ + snd_pcm_sgbuf_t *sgbuf = (snd_pcm_sgbuf_t*)substream->dma_private; +]]> + </programlisting> + </informalexample> + </para> + + <para> + Then call <function>snd_pcm_lib_malloc_pages()</function> + in <structfield>hw_params</structfield> callback + as well as in the case of normal PCI buffer. + The SG-buffer handler will allocate the non-contiguous kernel + pages of the given size and map them onto the virtually contiguous + memory. The virtual pointer is addressed in runtime->dma_area. + The physical address (runtime->dma_addr) is set to zero, + because the buffer is physically non-contigous. + The physical address table is set up in sgbuf->table. + You can get the physical address at a certain offset via + <function>snd_pcm_sgbuf_get_addr()</function>. + </para> + + <para> + When a SG-handler is used, you need to set + <function>snd_pcm_sgbuf_ops_page</function> as + the <structfield>page</structfield> callback. + (See <link linkend="pcm-interface-operators-page-callback"> + <citetitle>page callback section</citetitle></link>.) + </para> + + <para> + For releasing the data, call + <function>snd_pcm_lib_free_pages()</function> in the + <structfield>hw_free</structfield> callback as usual. + </para> + </section> + + <section id="buffer-and-memory-vmalloced"> + <title>Vmalloc'ed Buffers</title> + <para> + It's possible to use a buffer allocated via + <function>vmalloc</function>, for example, for an intermediate + buffer. Since the allocated pages are not contiguous, you need + to set the <structfield>page</structfield> callback to obtain + the physical address at every offset. + </para> + + <para> + The implementation of <structfield>page</structfield> callback + would be like this: + + <informalexample> + <programlisting> +<![CDATA[ + #include <linux/vmalloc.h> + + /* get the physical page pointer on the given offset */ + static struct page *mychip_page(snd_pcm_substream_t *substream, + unsigned long offset) + { + void *pageptr = substream->runtime->dma_area + offset; + return vmalloc_to_page(pageptr); + } +]]> + </programlisting> + </informalexample> + </para> + </section> + + </chapter> + + +<!-- ****************************************************** --> +<!-- Proc Interface --> +<!-- ****************************************************** --> + <chapter id="proc-interface"> + <title>Proc Interface</title> + <para> + ALSA provides an easy interface for procfs. The proc files are + very useful for debugging. I recommend you set up proc files if + you write a driver and want to get a running status or register + dumps. The API is found in + <filename><sound/info.h></filename>. + </para> + + <para> + For creating a proc file, call + <function>snd_card_proc_new()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + snd_info_entry_t *entry; + int err = snd_card_proc_new(card, "my-file", &entry); +]]> + </programlisting> + </informalexample> + + where the second argument specifies the proc-file name to be + created. The above example will create a file + <filename>my-file</filename> under the card directory, + e.g. <filename>/proc/asound/card0/my-file</filename>. + </para> + + <para> + Like other components, the proc entry created via + <function>snd_card_proc_new()</function> will be registered and + released automatically in the card registration and release + functions. + </para> + + <para> + When the creation is successful, the function stores a new + instance at the pointer given in the third argument. + It is initialized as a text proc file for read only. For using + this proc file as a read-only text file as it is, set the read + callback with a private data via + <function>snd_info_set_text_ops()</function>. + + <informalexample> + <programlisting> +<![CDATA[ + snd_info_set_text_ops(entry, chip, read_size, my_proc_read); +]]> + </programlisting> + </informalexample> + + where the second argument (<parameter>chip</parameter>) is the + private data to be used in the callbacks. The third parameter + specifies the read buffer size and the fourth + (<parameter>my_proc_read</parameter>) is the callback function, which + is defined like + + <informalexample> + <programlisting> +<![CDATA[ + static void my_proc_read(snd_info_entry_t *entry, + snd_info_buffer_t *buffer); +]]> + </programlisting> + </informalexample> + + </para> + + <para> + In the read callback, use <function>snd_iprintf()</function> for + output strings, which works just like normal + <function>printf()</function>. For example, + + <informalexample> + <programlisting> +<![CDATA[ + static void my_proc_read(snd_info_entry_t *entry, + snd_info_buffer_t *buffer) + { + chip_t *chip = entry->private_data; + + snd_iprintf(buffer, "This is my chip!\n"); + snd_iprintf(buffer, "Port = %ld\n", chip->port); + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + The file permission can be changed afterwards. As default, it's + set as read only for all users. If you want to add the write + permission to the user (root as default), set like below: + + <informalexample> + <programlisting> +<![CDATA[ + entry->mode = S_IFREG | S_IRUGO | S_IWUSR; +]]> + </programlisting> + </informalexample> + + and set the write buffer size and the callback + + <informalexample> + <programlisting> +<![CDATA[ + entry->c.text.write_size = 256; + entry->c.text.write = my_proc_write; +]]> + </programlisting> + </informalexample> + </para> + + <para> + The buffer size for read is set to 1024 implicitly by + <function>snd_info_set_text_ops()</function>. It should suffice + in most cases (the size will be aligned to + <constant>PAGE_SIZE</constant> anyway), but if you need to handle + very large text files, you can set it explicitly, too. + + <informalexample> + <programlisting> +<![CDATA[ + entry->c.text.read_size = 65536; +]]> + </programlisting> + </informalexample> + </para> + + <para> + For the write callback, you can use + <function>snd_info_get_line()</function> to get a text line, and + <function>snd_info_get_str()</function> to retrieve a string from + the line. Some examples are found in + <filename>core/oss/mixer_oss.c</filename>, core/oss/and + <filename>pcm_oss.c</filename>. + </para> + + <para> + For a raw-data proc-file, set the attributes like the following: + + <informalexample> + <programlisting> +<![CDATA[ + static struct snd_info_entry_ops my_file_io_ops = { + .read = my_file_io_read, + }; + + entry->content = SNDRV_INFO_CONTENT_DATA; + entry->private_data = chip; + entry->c.ops = &my_file_io_ops; + entry->size = 4096; + entry->mode = S_IFREG | S_IRUGO; +]]> + </programlisting> + </informalexample> + </para> + + <para> + The callback is much more complicated than the text-file + version. You need to use a low-level i/o functions such as + <function>copy_from/to_user()</function> to transfer the + data. + + <informalexample> + <programlisting> +<![CDATA[ + static long my_file_io_read(snd_info_entry_t *entry, + void *file_private_data, + struct file *file, + char *buf, + unsigned long count, + unsigned long pos) + { + long size = count; + if (pos + size > local_max_size) + size = local_max_size - pos; + if (copy_to_user(buf, local_data + pos, size)) + return -EFAULT; + return size; + } +]]> + </programlisting> + </informalexample> + </para> + + </chapter> + + +<!-- ****************************************************** --> +<!-- Power Management --> +<!-- ****************************************************** --> + <chapter id="power-management"> + <title>Power Management</title> + <para> + If the chip is supposed to work with with suspend/resume + functions, you need to add the power-management codes to the + driver. The additional codes for the power-management should be + <function>ifdef</function>'ed with + <constant>CONFIG_PM</constant>. + </para> + + <para> + ALSA provides the common power-management layer. Each card driver + needs to have only low-level suspend and resume callbacks. + + <informalexample> + <programlisting> +<![CDATA[ + #ifdef CONFIG_PM + static int snd_my_suspend(snd_card_t *card, pm_message_t state) + { + .... // do things for suspsend + return 0; + } + static int snd_my_resume(snd_card_t *card) + { + .... // do things for suspsend + return 0; + } + #endif +]]> + </programlisting> + </informalexample> + </para> + + <para> + The scheme of the real suspend job is as following. + + <orderedlist> + <listitem><para>Retrieve the chip data from pm_private_data field.</para></listitem> + <listitem><para>Call <function>snd_pcm_suspend_all()</function> to suspend the running PCM streams.</para></listitem> + <listitem><para>Save the register values if necessary.</para></listitem> + <listitem><para>Stop the hardware if necessary.</para></listitem> + <listitem><para>Disable the PCI device by calling <function>pci_disable_device()</function>.</para></listitem> + </orderedlist> + </para> + + <para> + A typical code would be like: + + <informalexample> + <programlisting> +<![CDATA[ + static int mychip_suspend(snd_card_t *card, pm_message_t state) + { + /* (1) */ + mychip_t *chip = card->pm_private_data; + /* (2) */ + snd_pcm_suspend_all(chip->pcm); + /* (3) */ + snd_mychip_save_registers(chip); + /* (4) */ + snd_mychip_stop_hardware(chip); + /* (5) */ + pci_disable_device(chip->pci); + return 0; + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + The scheme of the real resume job is as following. + + <orderedlist> + <listitem><para>Retrieve the chip data from pm_private_data field.</para></listitem> + <listitem><para>Enable the pci device again by calling + <function>pci_enable_device()</function>.</para></listitem> + <listitem><para>Re-initialize the chip.</para></listitem> + <listitem><para>Restore the saved registers if necessary.</para></listitem> + <listitem><para>Resume the mixer, e.g. calling + <function>snd_ac97_resume()</function>.</para></listitem> + <listitem><para>Restart the hardware (if any).</para></listitem> + </orderedlist> + </para> + + <para> + A typical code would be like: + + <informalexample> + <programlisting> +<![CDATA[ + static void mychip_resume(mychip_t *chip) + { + /* (1) */ + mychip_t *chip = card->pm_private_data; + /* (2) */ + pci_enable_device(chip->pci); + /* (3) */ + snd_mychip_reinit_chip(chip); + /* (4) */ + snd_mychip_restore_registers(chip); + /* (5) */ + snd_ac97_resume(chip->ac97); + /* (6) */ + snd_mychip_restart_chip(chip); + return 0; + } +]]> + </programlisting> + </informalexample> + </para> + + <para> + OK, we have all callbacks now. Let's set up them now. In the + initialization of the card, add the following: + + <informalexample> + <programlisting> +<![CDATA[ + static int __devinit snd_mychip_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) + { + .... + snd_card_t *card; + mychip_t *chip; + .... + snd_card_set_pm_callback(card, snd_my_suspend, snd_my_resume, chip); + .... + } +]]> + </programlisting> + </informalexample> + + Here you don't have to put ifdef CONFIG_PM around, since it's already + checked in the header and expanded to empty if not needed. + </para> + + <para> + If you need a space for saving the registers, you'll need to + allocate the buffer for it here, too, since it would be fatal + if you cannot allocate a memory in the suspend phase. + The allocated buffer should be released in the corresponding + destructor. + </para> + + <para> + And next, set suspend/resume callbacks to the pci_driver, + This can be done by passing a macro SND_PCI_PM_CALLBACKS + in the pci_driver struct. This macro is expanded to the correct + (global) callbacks if CONFIG_PM is set. + + <informalexample> + <programlisting> +<![CDATA[ + static struct pci_driver driver = { + .name = "My Chip", + .id_table = snd_my_ids, + .probe = snd_my_probe, + .remove = __devexit_p(snd_my_remove), + SND_PCI_PM_CALLBACKS + }; +]]> + </programlisting> + </informalexample> + </para> + + </chapter> + + +<!-- ****************************************************** --> +<!-- Module Parameters --> +<!-- ****************************************************** --> + <chapter id="module-parameters"> + <title>Module Parameters</title> + <para> + There are standard module options for ALSA. At least, each + module should have <parameter>index</parameter>, + <parameter>id</parameter> and <parameter>enable</parameter> + options. + </para> + + <para> + If the module supports multiple cards (usually up to + 8 = <constant>SNDRV_CARDS</constant> cards), they should be + arrays. The default initial values are defined already as + constants for ease of programming: + + <informalexample> + <programlisting> +<![CDATA[ + static int index[SNDRV_CARDS] = SNDRV_DEFAULT_IDX; + static char *id[SNDRV_CARDS] = SNDRV_DEFAULT_STR; + static int enable[SNDRV_CARDS] = SNDRV_DEFAULT_ENABLE_PNP; +]]> + </programlisting> + </informalexample> + </para> + + <para> + If the module supports only a single card, they could be single + variables, instead. <parameter>enable</parameter> option is not + always necessary in this case, but it wouldn't be so bad to have a + dummy option for compatibility. + </para> + + <para> + The module parameters must be declared with the standard + <function>module_param()()</function>, + <function>module_param_array()()</function> and + <function>MODULE_PARM_DESC()</function> macros. + </para> + + <para> + The typical coding would be like below: + + <informalexample> + <programlisting> +<![CDATA[ + #define CARD_NAME "My Chip" + + module_param_array(index, int, NULL, 0444); + MODULE_PARM_DESC(index, "Index value for " CARD_NAME " soundcard."); + module_param_array(id, charp, NULL, 0444); + MODULE_PARM_DESC(id, "ID string for " CARD_NAME " soundcard."); + module_param_array(enable, bool, NULL, 0444); + MODULE_PARM_DESC(enable, "Enable " CARD_NAME " soundcard."); +]]> + </programlisting> + </informalexample> + </para> + + <para> + Also, don't forget to define the module description, classes, + license and devices. Especially, the recent modprobe requires to + define the module license as GPL, etc., otherwise the system is + shown as <quote>tainted</quote>. + + <informalexample> + <programlisting> +<![CDATA[ + MODULE_DESCRIPTION("My Chip"); + MODULE_LICENSE("GPL"); + MODULE_SUPPORTED_DEVICE("{{Vendor,My Chip Name}}"); +]]> + </programlisting> + </informalexample> + </para> + + </chapter> + + +<!-- ****************************************************** --> +<!-- How To Put Your Driver --> +<!-- ****************************************************** --> + <chapter id="how-to-put-your-driver"> + <title>How To Put Your Driver Into ALSA Tree</title> + <section> + <title>General</title> + <para> + So far, you've learned how to write the driver codes. + And you might have a question now: how to put my own + driver into the ALSA driver tree? + Here (finally :) the standard procedure is described briefly. + </para> + + <para> + Suppose that you'll create a new PCI driver for the card + <quote>xyz</quote>. The card module name would be + snd-xyz. The new driver is usually put into alsa-driver + tree, <filename>alsa-driver/pci</filename> directory in + the case of PCI cards. + Then the driver is evaluated, audited and tested + by developers and users. After a certain time, the driver + will go to alsa-kernel tree (to the corresponding directory, + such as <filename>alsa-kernel/pci</filename>) and eventually + integrated into Linux 2.6 tree (the directory would be + <filename>linux/sound/pci</filename>). + </para> + + <para> + In the following sections, the driver code is supposed + to be put into alsa-driver tree. The two cases are assumed: + a driver consisting of a single source file and one consisting + of several source files. + </para> + </section> + + <section> + <title>Driver with A Single Source File</title> + <para> + <orderedlist> + <listitem> + <para> + Modify alsa-driver/pci/Makefile + </para> + + <para> + Suppose you have a file xyz.c. Add the following + two lines + <informalexample> + <programlisting> +<![CDATA[ + snd-xyz-objs := xyz.o + obj-$(CONFIG_SND_XYZ) += snd-xyz.o +]]> + </programlisting> + </informalexample> + </para> + </listitem> + + <listitem> + <para> + Create the Kconfig entry + </para> + + <para> + Add the new entry of Kconfig for your xyz driver. + <informalexample> + <programlisting> +<![CDATA[ + config SND_XYZ + tristate "Foobar XYZ" + depends on SND + select SND_PCM + help + Say Y here to include support for Foobar XYZ soundcard. + + To compile this driver as a module, choose M here: the module + will be called snd-xyz. +]]> + </programlisting> + </informalexample> + + the line, select SND_PCM, specifies that the driver xyz supports + PCM. In addition to SND_PCM, the following components are + supported for select command: + SND_RAWMIDI, SND_TIMER, SND_HWDEP, SND_MPU401_UART, + SND_OPL3_LIB, SND_OPL4_LIB, SND_VX_LIB, SND_AC97_CODEC. + Add the select command for each supported component. + </para> + + <para> + Note that some selections imply the lowlevel selections. + For example, PCM includes TIMER, MPU401_UART includes RAWMIDI, + AC97_CODEC includes PCM, and OPL3_LIB includes HWDEP. + You don't need to give the lowlevel selections again. + </para> + + <para> + For the details of Kconfig script, refer to the kbuild + documentation. + </para> + + </listitem> + + <listitem> + <para> + Run cvscompile script to re-generate the configure script and + build the whole stuff again. + </para> + </listitem> + </orderedlist> + </para> + </section> + + <section> + <title>Drivers with Several Source Files</title> + <para> + Suppose that the driver snd-xyz have several source files. + They are located in the new subdirectory, + pci/xyz. + + <orderedlist> + <listitem> + <para> + Add a new directory (<filename>xyz</filename>) in + <filename>alsa-driver/pci/Makefile</filename> like below + + <informalexample> + <programlisting> +<![CDATA[ + obj-$(CONFIG_SND) += xyz/ +]]> + </programlisting> + </informalexample> + </para> + </listitem> + + <listitem> + <para> + Under the directory <filename>xyz</filename>, create a Makefile + + <example> + <title>Sample Makefile for a driver xyz</title> + <programlisting> +<![CDATA[ + ifndef SND_TOPDIR + SND_TOPDIR=../.. + endif + + include $(SND_TOPDIR)/toplevel.config + include $(SND_TOPDIR)/Makefile.conf + + snd-xyz-objs := xyz.o abc.o def.o + + obj-$(CONFIG_SND_XYZ) += snd-xyz.o + + include $(SND_TOPDIR)/Rules.make +]]> + </programlisting> + </example> + </para> + </listitem> + + <listitem> + <para> + Create the Kconfig entry + </para> + + <para> + This procedure is as same as in the last section. + </para> + </listitem> + + <listitem> + <para> + Run cvscompile script to re-generate the configure script and + build the whole stuff again. + </para> + </listitem> + </orderedlist> + </para> + </section> + + </chapter> + +<!-- ****************************************************** --> +<!-- Useful Functions --> +<!-- ****************************************************** --> + <chapter id="useful-functions"> + <title>Useful Functions</title> + + <section id="useful-functions-snd-printk"> + <title><function>snd_printk()</function> and friends</title> + <para> + ALSA provides a verbose version of + <function>printk()</function> function. If a kernel config + <constant>CONFIG_SND_VERBOSE_PRINTK</constant> is set, this + function prints the given message together with the file name + and the line of the caller. The <constant>KERN_XXX</constant> + prefix is processed as + well as the original <function>printk()</function> does, so it's + recommended to add this prefix, e.g. + + <informalexample> + <programlisting> +<![CDATA[ + snd_printk(KERN_ERR "Oh my, sorry, it's extremely bad!\n"); +]]> + </programlisting> + </informalexample> + </para> + + <para> + There are also <function>printk()</function>'s for + debugging. <function>snd_printd()</function> can be used for + general debugging purposes. If + <constant>CONFIG_SND_DEBUG</constant> is set, this function is + compiled, and works just like + <function>snd_printk()</function>. If the ALSA is compiled + without the debugging flag, it's ignored. + </para> + + <para> + <function>snd_printdd()</function> is compiled in only when + <constant>CONFIG_SND_DEBUG_DETECT</constant> is set. Please note + that <constant>DEBUG_DETECT</constant> is not set as default + even if you configure the alsa-driver with + <option>--with-debug=full</option> option. You need to give + explicitly <option>--with-debug=detect</option> option instead. + </para> + </section> + + <section id="useful-functions-snd-assert"> + <title><function>snd_assert()</function></title> + <para> + <function>snd_assert()</function> macro is similar with the + normal <function>assert()</function> macro. For example, + + <informalexample> + <programlisting> +<![CDATA[ + snd_assert(pointer != NULL, return -EINVAL); +]]> + </programlisting> + </informalexample> + </para> + + <para> + The first argument is the expression to evaluate, and the + second argument is the action if it fails. When + <constant>CONFIG_SND_DEBUG</constant>, is set, it will show an + error message such as <computeroutput>BUG? (xxx) (called from + yyy)</computeroutput>. When no debug flag is set, this is + ignored. + </para> + </section> + + <section id="useful-functions-snd-runtime-check"> + <title><function>snd_runtime_check()</function></title> + <para> + This macro is quite similar with + <function>snd_assert()</function>. Unlike + <function>snd_assert()</function>, the expression is always + evaluated regardless of + <constant>CONFIG_SND_DEBUG</constant>. When + <constant>CONFIG_SND_DEBUG</constant> is set, the macro will + show a message like <computeroutput>ERROR (xx) (called from + yyy)</computeroutput>. + </para> + </section> + + <section id="useful-functions-snd-bug"> + <title><function>snd_BUG()</function></title> + <para> + It calls <function>snd_assert(0,)</function> -- that is, just + prints the error message at the point. It's useful to show that + a fatal error happens there. + </para> + </section> + </chapter> + + +<!-- ****************************************************** --> +<!-- Acknowledgments --> +<!-- ****************************************************** --> + <chapter id="acknowledments"> + <title>Acknowledgments</title> + <para> + I would like to thank Phil Kerr for his help for improvement and + corrections of this document. + </para> + <para> + Kevin Conder reformatted the original plain-text to the + DocBook format. + </para> + <para> + Giuliano Pochini corrected typos and contributed the example codes + in the hardware constraints section. + </para> + </chapter> + + +</book> diff --git a/Documentation/sound/alsa/Joystick.txt b/Documentation/sound/alsa/Joystick.txt new file mode 100644 index 000000000000..ccda41b10f8a --- /dev/null +++ b/Documentation/sound/alsa/Joystick.txt @@ -0,0 +1,86 @@ +Analog Joystick Support on ALSA Drivers +======================================= + Oct. 14, 2003 + Takashi Iwai <tiwai@suse.de> + +General +------- + +First of all, you need to enable GAMEPORT support on Linux kernel for +using a joystick with the ALSA driver. For the details of gameport +support, refer to Documentation/input/joystick.txt. + +The joystick support of ALSA drivers is different between ISA and PCI +cards. In the case of ISA (PnP) cards, it's usually handled by the +independent module (ns558). Meanwhile, the ALSA PCI drivers have the +built-in gameport support. Hence, when the ALSA PCI driver is built +in the kernel, CONFIG_GAMEPORT must be 'y', too. Otherwise, the +gameport support on that card will be (silently) disabled. + +Some adapter modules probe the physical connection of the device at +the load time. It'd be safer to plug in the joystick device before +loading the module. + + +PCI Cards +--------- + +For PCI cards, the joystick is enabled when the appropriate module +option is specified. Some drivers don't need options, and the +joystick support is always enabled. In the former ALSA version, there +was a dynamic control API for the joystick activation. It was +changed, however, to the static module options because of the system +stability and the resource management. + +The following PCI drivers support the joystick natively. + + Driver Module Option Available Values + --------------------------------------------------------------------------- + als4000 joystick_port 0 = disable (default), 1 = auto-detect, + manual: any address (e.g. 0x200) + au88x0 N/A N/A + azf3328 joystick 0 = disable, 1 = enable, -1 = auto (default) + ens1370 joystick 0 = disable (default), 1 = enable + ens1371 joystick_port 0 = disable (default), 1 = auto-detect, + manual: 0x200, 0x208, 0x210, 0x218 + cmipci joystick_port 0 = disable (default), 1 = auto-detect, + manual: any address (e.g. 0x200) + cs4281 N/A N/A + cs46xx N/A N/A + es1938 N/A N/A + es1968 joystick 0 = disable (default), 1 = enable + sonicvibes N/A N/A + trident N/A N/A + via82xx(*1) joystick 0 = disable (default), 1 = enable + ymfpci joystick_port 0 = disable (default), 1 = auto-detect, + manual: 0x201, 0x202, 0x204, 0x205(*2) + --------------------------------------------------------------------------- + + *1) VIA686A/B only + *2) With YMF744/754 chips, the port address can be chosen arbitrarily + +The following drivers don't support gameport natively, but there are +additional modules. Load the corresponding module to add the gameport +support. + + Driver Additional Module + ----------------------------- + emu10k1 emu10k1-gp + fm801 fm801-gp + ----------------------------- + +Note: the "pcigame" and "cs461x" modules are for the OSS drivers only. + These ALSA drivers (cs46xx, trident and au88x0) have the + built-in gameport support. + +As mentioned above, ALSA PCI drivers have the built-in gameport +support, so you don't have to load ns558 module. Just load "joydev" +and the appropriate adapter module (e.g. "analog"). + + +ISA Cards +--------- + +ALSA ISA drivers don't have the built-in gameport support. +Instead, you need to load "ns558" module in addition to "joydev" and +the adapter module (e.g. "analog"). diff --git a/Documentation/sound/alsa/MIXART.txt b/Documentation/sound/alsa/MIXART.txt new file mode 100644 index 000000000000..5cb970612870 --- /dev/null +++ b/Documentation/sound/alsa/MIXART.txt @@ -0,0 +1,100 @@ + Alsa driver for Digigram miXart8 and miXart8AES/EBU soundcards + Digigram <alsa@digigram.com> + + +GENERAL +======= + +The miXart8 is a multichannel audio processing and mixing soundcard +that has 4 stereo audio inputs and 4 stereo audio outputs. +The miXart8AES/EBU is the same with a add-on card that offers further +4 digital stereo audio inputs and outputs. +Furthermore the add-on card offers external clock synchronisation +(AES/EBU, Word Clock, Time Code and Video Synchro) + +The mainboard has a PowerPC that offers onboard mpeg encoding and +decoding, samplerate conversions and various effects. + +The driver don't work properly at all until the certain firmwares +are loaded, i.e. no PCM nor mixer devices will appear. +Use the mixartloader that can be found in the alsa-tools package. + + +VERSION 0.1.0 +============= + +One miXart8 board will be represented as 4 alsa cards, each with 1 +stereo analog capture 'pcm0c' and 1 stereo analog playback 'pcm0p' device. +With a miXart8AES/EBU there is in addition 1 stereo digital input +'pcm1c' and 1 stereo digital output 'pcm1p' per card. + +Formats +------- +U8, S16_LE, S16_BE, S24_3LE, S24_3BE, FLOAT_LE, FLOAT_BE +Sample rates : 8000 - 48000 Hz continously + +Playback +-------- +For instance the playback devices are configured to have max. 4 +substreams performing hardware mixing. This could be changed to a +maximum of 24 substreams if wished. +Mono files will be played on the left and right channel. Each channel +can be muted for each stream to use 8 analog/digital outputs seperately. + +Capture +------- +There is one substream per capture device. For instance only stereo +formats are supported. + +Mixer +----- +<Master> and <Master Capture> : analog volume control of playback and capture PCM. +<PCM 0-3> and <PCM Capture> : digital volume control of each analog substream. +<AES 0-3> and <AES Capture> : digital volume control of each AES/EBU substream. +<Monitoring> : Loopback from 'pcm0c' to 'pcm0p' with digital volume +and mute control. + +Rem : for best audio quality try to keep a 0 attenuation on the PCM +and AES volume controls which is set by 219 in the range from 0 to 255 +(about 86% with alsamixer) + + +NOT YET IMPLEMENTED +=================== + +- external clock support (AES/EBU, Word Clock, Time Code, Video Sync) +- MPEG audio formats +- mono record +- on-board effects and samplerate conversions +- linked streams + + +FIRMWARE +======== + +[As of 2.6.11, the firmware can be loaded automatically with hotplug + when CONFIG_FW_LOADER is set. The mixartloader is necessary only + for older versions or when you build the driver into kernel.] + +For loading the firmware automatically after the module is loaded, use +the post-install command. For example, add the following entry to +/etc/modprobe.conf for miXart driver: + + install snd-mixart /sbin/modprobe --first-time -i snd-mixart && \ + /usr/bin/mixartloader +(for 2.2/2.4 kernels, add "post-install snd-mixart /usr/bin/vxloader" to + /etc/modules.conf, instead.) + +The firmware binaries are installed on /usr/share/alsa/firmware +(or /usr/local/share/alsa/firmware, depending to the prefix option of +configure). There will be a miXart.conf file, which define the dsp image +files. + +The firmware files are copyright by Digigram SA + + +COPYRIGHT +========= + +Copyright (c) 2003 Digigram SA <alsa@digigram.com> +Distributalbe under GPL. diff --git a/Documentation/sound/alsa/OSS-Emulation.txt b/Documentation/sound/alsa/OSS-Emulation.txt new file mode 100644 index 000000000000..ec2a02541d5b --- /dev/null +++ b/Documentation/sound/alsa/OSS-Emulation.txt @@ -0,0 +1,297 @@ + NOTES ON KERNEL OSS-EMULATION + ============================= + + Jan. 22, 2004 Takashi Iwai <tiwai@suse.de> + + +Modules +======= + +ALSA provides a powerful OSS emulation on the kernel. +The OSS emulation for PCM, mixer and sequencer devices is implemented +as add-on kernel modules, snd-pcm-oss, snd-mixer-oss and snd-seq-oss. +When you need to access the OSS PCM, mixer or sequencer devices, the +corresponding module has to be loaded. + +These modules are loaded automatically when the corresponding service +is called. The alias is defined sound-service-x-y, where x and y are +the card number and the minor unit number. Usually you don't have to +define these aliases by yourself. + +Only necessary step for auto-loading of OSS modules is to define the +card alias in /etc/modprobe.conf, such as + + alias sound-slot-0 snd-emu10k1 + +As the second card, define sound-slot-1 as well. +Note that you can't use the aliased name as the target name (i.e. +"alias sound-slot-0 snd-card-0" doesn't work any more like the old +modutils). + +The currently available OSS configuration is shown in +/proc/asound/oss/sndstat. This shows in the same syntax of +/dev/sndstat, which is available on the commercial OSS driver. +On ALSA, you can symlink /dev/sndstat to this proc file. + +Please note that the devices listed in this proc file appear only +after the corresponding OSS-emulation module is loaded. Don't worry +even if "NOT ENABLED IN CONFIG" is shown in it. + + +Device Mapping +============== + +ALSA supports the following OSS device files: + + PCM: + /dev/dspX + /dev/adspX + + Mixer: + /dev/mixerX + + MIDI: + /dev/midi0X + /dev/amidi0X + + Sequencer: + /dev/sequencer + /dev/sequencer2 (aka /dev/music) + +where X is the card number from 0 to 7. + +(NOTE: Some distributions have the device files like /dev/midi0 and + /dev/midi1. They are NOT for OSS but for tclmidi, which is + a totally different thing.) + +Unlike the real OSS, ALSA cannot use the device files more than the +assigned ones. For example, the first card cannot use /dev/dsp1 or +/dev/dsp2, but only /dev/dsp0 and /dev/adsp0. + +As seen above, PCM and MIDI may have two devices. Usually, the first +PCM device (hw:0,0 in ALSA) is mapped to /dev/dsp and the secondary +device (hw:0,1) to /dev/adsp (if available). For MIDI, /dev/midi and +/dev/amidi, respectively. + +You can change this device mapping via the module options of +snd-pcm-oss and snd-rawmidi. In the case of PCM, the following +options are available for snd-pcm-oss: + + dsp_map PCM device number assigned to /dev/dspX + (default = 0) + adsp_map PCM device number assigned to /dev/adspX + (default = 1) + +For example, to map the third PCM device (hw:0,2) to /dev/adsp0, +define like this: + + options snd-pcm-oss adsp_map=2 + +The options take arrays. For configuring the second card, specify +two entries separated by comma. For example, to map the third PCM +device on the second card to /dev/adsp1, define like below: + + options snd-pcm-oss adsp_map=0,2 + +To change the mapping of MIDI devices, the following options are +available for snd-rawmidi: + + midi_map MIDI device number assigned to /dev/midi0X + (default = 0) + amidi_map MIDI device number assigned to /dev/amidi0X + (default = 1) + +For example, to assign the third MIDI device on the first card to +/dev/midi00, define as follows: + + options snd-rawmidi midi_map=2 + + +PCM Mode +======== + +As default, ALSA emulates the OSS PCM with so-called plugin layer, +i.e. tries to convert the sample format, rate or channels +automatically when the card doesn't support it natively. +This will lead to some problems for some applications like quake or +wine, especially if they use the card only in the MMAP mode. + +In such a case, you can change the behavior of PCM per application by +writing a command to the proc file. There is a proc file for each PCM +stream, /proc/asound/cardX/pcmY[cp]/oss, where X is the card number +(zero-based), Y the PCM device number (zero-based), and 'p' is for +playback and 'c' for capture, respectively. Note that this proc file +exists only after snd-pcm-oss module is loaded. + +The command sequence has the following syntax: + + app_name fragments fragment_size [options] + +app_name is the name of application with (higher priority) or without +path. +fragments specifies the number of fragments or zero if no specific +number is given. +fragment_size is the size of fragment in bytes or zero if not given. +options is the optional parameters. The following options are +available: + + disable the application tries to open a pcm device for + this channel but does not want to use it. + direct don't use plugins + block force block open mode + non-block force non-block open mode + partial-frag write also partial fragments (affects playback only) + no-silence do not fill silence ahead to avoid clicks + +The disable option is useful when one stream direction (playback or +capture) is not handled correctly by the application although the +hardware itself does support both directions. +The direct option is used, as mentioned above, to bypass the automatic +conversion and useful for MMAP-applications. +For example, to playback the first PCM device without plugins for +quake, send a command via echo like the following: + + % echo "quake 0 0 direct" > /proc/asound/card0/pcm0p/oss + +While quake wants only playback, you may append the second command +to notify driver that only this direction is about to be allocated: + + % echo "quake 0 0 disable" > /proc/asound/card0/pcm0c/oss + +The permission of proc files depend on the module options of snd. +As default it's set as root, so you'll likely need to be superuser for +sending the command above. + +The block and non-block options are used to change the behavior of +opening the device file. + +As default, ALSA behaves as original OSS drivers, i.e. does not block +the file when it's busy. The -EBUSY error is returned in this case. + +This blocking behavior can be changed globally via nonblock_open +module option of snd-pcm-oss. For using the blocking mode as default +for OSS devices, define like the following: + + options snd-pcm-oss nonblock_open=0 + +The partial-frag and no-silence commands have been added recently. +Both commands are for optimization use only. The former command +specifies to invoke the write transfer only when the whole fragment is +filled. The latter stops writing the silence data ahead +automatically. Both are disabled as default. + +You can check the currently defined configuration by reading the proc +file. The read image can be sent to the proc file again, hence you +can save the current configuration + + % cat /proc/asound/card0/pcm0p/oss > /somewhere/oss-cfg + +and restore it like + + % cat /somewhere/oss-cfg > /proc/asound/card0/pcm0p/oss + +Also, for clearing all the current configuration, send "erase" command +as below: + + % echo "erase" > /proc/asound/card0/pcm0p/oss + + +Mixer Elements +============== + +Since ALSA has completely different mixer interface, the emulation of +OSS mixer is relatively complicated. ALSA builds up a mixer element +from several different ALSA (mixer) controls based on the name +string. For example, the volume element SOUND_MIXER_PCM is composed +from "PCM Playback Volume" and "PCM Playback Switch" controls for the +playback direction and from "PCM Capture Volume" and "PCM Capture +Switch" for the capture directory (if exists). When the PCM volume of +OSS is changed, all the volume and switch controls above are adjusted +automatically. + +As default, ALSA uses the following control for OSS volumes: + + OSS volume ALSA control Index + ----------------------------------------------------- + SOUND_MIXER_VOLUME Master 0 + SOUND_MIXER_BASS Tone Control - Bass 0 + SOUND_MIXER_TREBLE Tone Control - Treble 0 + SOUND_MIXER_SYNTH Synth 0 + SOUND_MIXER_PCM PCM 0 + SOUND_MIXER_SPEAKER PC Speaker 0 + SOUND_MIXER_LINE Line 0 + SOUND_MIXER_MIC Mic 0 + SOUND_MIXER_CD CD 0 + SOUND_MIXER_IMIX Monitor Mix 0 + SOUND_MIXER_ALTPCM PCM 1 + SOUND_MIXER_RECLEV (not assigned) + SOUND_MIXER_IGAIN Capture 0 + SOUND_MIXER_OGAIN Playback 0 + SOUND_MIXER_LINE1 Aux 0 + SOUND_MIXER_LINE2 Aux 1 + SOUND_MIXER_LINE3 Aux 2 + SOUND_MIXER_DIGITAL1 Digital 0 + SOUND_MIXER_DIGITAL2 Digital 1 + SOUND_MIXER_DIGITAL3 Digital 2 + SOUND_MIXER_PHONEIN Phone 0 + SOUND_MIXER_PHONEOUT Phone 1 + SOUND_MIXER_VIDEO Video 0 + SOUND_MIXER_RADIO Radio 0 + SOUND_MIXER_MONITOR Monitor 0 + +The second column is the base-string of the corresponding ALSA +control. In fact, the controls with "XXX [Playback|Capture] +[Volume|Switch]" will be checked in addition. + +The current assignment of these mixer elements is listed in the proc +file, /proc/asound/cardX/oss_mixer, which will be like the following + + VOLUME "Master" 0 + BASS "" 0 + TREBLE "" 0 + SYNTH "" 0 + PCM "PCM" 0 + ... + +where the first column is the OSS volume element, the second column +the base-string of the corresponding ALSA control, and the third the +control index. When the string is empty, it means that the +corresponding OSS control is not available. + +For changing the assignment, you can write the configuration to this +proc file. For example, to map "Wave Playback" to the PCM volume, +send the command like the following: + + % echo 'VOLUME "Wave Playback" 0' > /proc/asound/card0/oss_mixer + +The command is exactly as same as listed in the proc file. You can +change one or more elements, one volume per line. In the last +example, both "Wave Playback Volume" and "Wave Playback Switch" will +be affected when PCM volume is changed. + +Like the case of PCM proc file, the permission of proc files depend on +the module options of snd. you'll likely need to be superuser for +sending the command above. + +As well as in the case of PCM proc file, you can save and restore the +current mixer configuration by reading and writing the whole file +image. + + +Unsupported Features +==================== + +MMAP on ICE1712 driver +---------------------- +ICE1712 supports only the unconventional format, interleaved +10-channels 24bit (packed in 32bit) format. Therefore you cannot mmap +the buffer as the conventional (mono or 2-channels, 8 or 16bit) format +on OSS. + +USB devices +----------- +Some USB devices support only 24bit format packed in 3bytes. This +format is not supported by OSS and no conversion is provided by kernel +OSS emulation. You can use the user-space OSS emulation via libaoss +instead. + diff --git a/Documentation/sound/alsa/Procfile.txt b/Documentation/sound/alsa/Procfile.txt new file mode 100644 index 000000000000..25c5d648aef6 --- /dev/null +++ b/Documentation/sound/alsa/Procfile.txt @@ -0,0 +1,191 @@ + Proc Files of ALSA Drivers + ========================== + Takashi Iwai <tiwai@suse.de> + +General +------- + +ALSA has its own proc tree, /proc/asound. Many useful information are +found in this tree. When you encounter a problem and need debugging, +check the files listed in the following sections. + +Each card has its subtree cardX, where X is from 0 to 7. The +card-specific files are stored in the card* subdirectories. + + +Global Information +------------------ + +cards + Shows the list of currently configured ALSA drivers, + index, the id string, short and long descriptions. + +version + Shows the version string and compile date. + +modules + Lists the module of each card + +devices + Lists the ALSA native device mappings. + +meminfo + Shows the status of allocated pages via ALSA drivers. + Appears only when CONFIG_SND_DEBUG=y. + +hwdep + Lists the currently available hwdep devices in format of + <card>-<device>: <name> + +pcm + Lists the currently available PCM devices in format of + <card>-<device>: <id>: <name> : <sub-streams> + +timer + Lists the currently available timer devices + + +oss/devices + Lists the OSS device mappings. + +oss/sndstat + Provides the output compatible with /dev/sndstat. + You can symlink this to /dev/sndstat. + + +Card Specific Files +------------------- + +The card-specific files are found in /proc/asound/card* directories. +Some drivers (e.g. cmipci) have their own proc entries for the +register dump, etc (e.g. /proc/asound/card*/cmipci shows the register +dump). These files would be really helpful for debugging. + +When PCM devices are available on this card, you can see directories +like pcm0p or pcm1c. They hold the PCM information for each PCM +stream. The number after 'pcm' is the PCM device number from 0, and +the last 'p' or 'c' means playback or capture direction. The files in +this subtree is described later. + +The status of MIDI I/O is found in midi* files. It shows the device +name and the received/transmitted bytes through the MIDI device. + +When the card is equipped with AC97 codecs, there are codec97#* +subdirectories (desribed later). + +When the OSS mixer emulation is enabled (and the module is loaded), +oss_mixer file appears here, too. This shows the current mapping of +OSS mixer elements to the ALSA control elements. You can change the +mapping by writing to this device. Read OSS-Emulation.txt for +details. + + +PCM Proc Files +-------------- + +card*/pcm*/info + The general information of this PCM device: card #, device #, + substreams, etc. + +card*/pcm*/xrun_debug + This file appears when CONFIG_SND_DEBUG=y. + This shows the status of xrun (= buffer overrun/xrun) debug of + ALSA PCM middle layer, as an integer from 0 to 2. The value + can be changed by writing to this file, such as + + # cat 2 > /proc/asound/card0/pcm0p/xrun_debug + + When this value is greater than 0, the driver will show the + messages to kernel log when an xrun is detected. The debug + message is shown also when the invalid H/W pointer is detected + at the update of periods (usually called from the interrupt + handler). + + When this value is greater than 1, the driver will show the + stack trace additionally. This may help the debugging. + +card*/pcm*/sub*/info + The general information of this PCM sub-stream. + +card*/pcm*/sub*/status + The current status of this PCM sub-stream, elapsed time, + H/W position, etc. + +card*/pcm*/sub*/hw_params + The hardware parameters set for this sub-stream. + +card*/pcm*/sub*/sw_params + The soft parameters set for this sub-stream. + +card*/pcm*/sub*/prealloc + The buffer pre-allocation information. + + +AC97 Codec Information +---------------------- + +card*/codec97#*/ac97#?-? + Shows the general information of this AC97 codec chip, such as + name, capabilities, set up. + +card*/codec97#0/ac97#?-?+regs + Shows the AC97 register dump. Useful for debugging. + + When CONFIG_SND_DEBUG is enabled, you can write to this file for + changing an AC97 register directly. Pass two hex numbers. + For example, + + # echo 02 9f1f > /proc/asound/card0/codec97#0/ac97#0-0+regs + + +Sequencer Information +--------------------- + +seq/drivers + Lists the currently available ALSA sequencer drivers. + +seq/clients + Shows the list of currently available sequencer clinets and + ports. The connection status and the running status are shown + in this file, too. + +seq/queues + Lists the currently allocated/running sequener queues. + +seq/timer + Lists the currently allocated/running sequencer timers. + +seq/oss + Lists the OSS-compatible sequencer stuffs. + + +Help For Debugging? +------------------- + +When the problem is related with PCM, first try to turn on xrun_debug +mode. This will give you the kernel messages when and where xrun +happened. + +If it's really a bug, report it with the following information + + - the name of the driver/card, show in /proc/asound/cards + - the reigster dump, if available (e.g. card*/cmipci) + +when it's a PCM problem, + + - set-up of PCM, shown in hw_parms, sw_params, and status in the PCM + sub-stream directory + +when it's a mixer problem, + + - AC97 proc files, codec97#*/* files + +for USB audio/midi, + + - output of lsusb -v + - stream* files in card directory + + +The ALSA bug-tracking system is found at: + + https://bugtrack.alsa-project.org/alsa-bug/ diff --git a/Documentation/sound/alsa/SB-Live-mixer.txt b/Documentation/sound/alsa/SB-Live-mixer.txt new file mode 100644 index 000000000000..651adaf60473 --- /dev/null +++ b/Documentation/sound/alsa/SB-Live-mixer.txt @@ -0,0 +1,356 @@ + + Sound Blaster Live mixer / default DSP code + =========================================== + + +The EMU10K1 chips have a DSP part which can be programmed to support +various ways of sample processing, which is described here. +(This acticle does not deal with the overall functionality of the +EMU10K1 chips. See the manuals section for further details.) + +The ALSA driver programs this portion of chip by default code +(can be altered later) which offers the following functionality: + + +1) IEC958 (S/PDIF) raw PCM +-------------------------- + +This PCM device (it's the 4th PCM device (index 3!) and first subdevice +(index 0) for a given card) allows to forward 48kHz, stereo, 16-bit +little endian streams without any modifications to the digital output +(coaxial or optical). The universal interface allows the creation of up +to 8 raw PCM devices operating at 48kHz, 16-bit little endian. It would +be easy to add support for multichannel devices to the current code, +but the conversion routines exist only for stereo (2-channel streams) +at the time. + +Look to tram_poke routines in lowlevel/emu10k1/emufx.c for more details. + + +2) Digital mixer controls +------------------------- + +These controls are built using the DSP instructions. They offer extended +functionality. Only the default build-in code in the ALSA driver is described +here. Note that the controls work as attenuators: the maximum value is the +neutral position leaving the signal unchanged. Note that if the same destination +is mentioned in multiple controls, the signal is accumulated and can be wrapped +(set to maximal or minimal value without checking of overflow). + + +Explanation of used abbreviations: + +DAC - digital to analog converter +ADC - analog to digital converter +I2S - one-way three wire serial bus for digital sound by Philips Semiconductors + (this standard is used for connecting standalone DAC and ADC converters) +LFE - low frequency effects (subwoofer signal) +AC97 - a chip containing an analog mixer, DAC and ADC converters +IEC958 - S/PDIF +FX-bus - the EMU10K1 chip has an effect bus containing 16 accumulators. + Each of the synthesizer voices can feed its output to these accumulators + and the DSP microcontroller can operate with the resulting sum. + + +name='Wave Playback Volume',index=0 + +This control is used to attenuate samples for left and right PCM FX-bus +accumulators. ALSA uses accumulators 0 and 1 for left and right PCM samples. +The result samples are forwarded to the front DAC PCM slots of the AC97 codec. + +name='Wave Surround Playback Volume',index=0 + +This control is used to attenuate samples for left and right PCM FX-bus +accumulators. ALSA uses accumulators 0 and 1 for left and right PCM samples. +The result samples are forwarded to the rear I2S DACs. These DACs operates +separately (they are not inside the AC97 codec). + +name='Wave Center Playback Volume',index=0 + +This control is used to attenuate samples for left and right PCM FX-bus +accumulators. ALSA uses accumulators 0 and 1 for left and right PCM samples. +The result is mixed to mono signal (single channel) and forwarded to +the ??rear?? right DAC PCM slot of the AC97 codec. + +name='Wave LFE Playback Volume',index=0 + +This control is used to attenuate samples for left and right PCM FX-bus +accumulators. ALSA uses accumulators 0 and 1 for left and right PCM. +The result is mixed to mono signal (single channel) and forwarded to +the ??rear?? left DAC PCM slot of the AC97 codec. + +name='Wave Capture Volume',index=0 +name='Wave Capture Switch',index=0 + +These controls are used to attenuate samples for left and right PCM FX-bus +accumulator. ALSA uses accumulators 0 and 1 for left and right PCM. +The result is forwarded to the ADC capture FIFO (thus to the standard capture +PCM device). + +name='Music Playback Volume',index=0 + +This control is used to attenuate samples for left and right MIDI FX-bus +accumulators. ALSA uses accumulators 4 and 5 for left and right MIDI samples. +The result samples are forwarded to the front DAC PCM slots of the AC97 codec. + +name='Music Capture Volume',index=0 +name='Music Capture Switch',index=0 + +These controls are used to attenuate samples for left and right MIDI FX-bus +accumulator. ALSA uses accumulators 4 and 5 for left and right PCM. +The result is forwarded to the ADC capture FIFO (thus to the standard capture +PCM device). + +name='Surround Playback Volume',index=0 + +This control is used to attenuate samples for left and right rear PCM FX-bus +accumulators. ALSA uses accumulators 2 and 3 for left and right rear PCM samples. +The result samples are forwarded to the rear I2S DACs. These DACs operate +separately (they are not inside the AC97 codec). + +name='Surround Capture Volume',index=0 +name='Surround Capture Switch',index=0 + +These controls are used to attenuate samples for left and right rear PCM FX-bus +accumulators. ALSA uses accumulators 2 and 3 for left and right rear PCM samples. +The result is forwarded to the ADC capture FIFO (thus to the standard capture +PCM device). + +name='Center Playback Volume',index=0 + +This control is used to attenuate sample for center PCM FX-bus accumulator. +ALSA uses accumulator 6 for center PCM sample. The result sample is forwarded +to the ??rear?? right DAC PCM slot of the AC97 codec. + +name='LFE Playback Volume',index=0 + +This control is used to attenuate sample for center PCM FX-bus accumulator. +ALSA uses accumulator 6 for center PCM sample. The result sample is forwarded +to the ??rear?? left DAC PCM slot of the AC97 codec. + +name='AC97 Playback Volume',index=0 + +This control is used to attenuate samples for left and right front ADC PCM slots +of the AC97 codec. The result samples are forwarded to the front DAC PCM +slots of the AC97 codec. +******************************************************************************** +*** Note: This control should be zero for the standard operations, otherwise *** +*** a digital loopback is activated. *** +******************************************************************************** + +name='AC97 Capture Volume',index=0 + +This control is used to attenuate samples for left and right front ADC PCM slots +of the AC97 codec. The result is forwarded to the ADC capture FIFO (thus to +the standard capture PCM device). +******************************************************************************** +*** Note: This control should be 100 (maximal value), otherwise no analog *** +*** inputs of the AC97 codec can be captured (recorded). *** +******************************************************************************** + +name='IEC958 TTL Playback Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 TTL +digital inputs (usually used by a CDROM drive). The result samples are +forwarded to the front DAC PCM slots of the AC97 codec. + +name='IEC958 TTL Capture Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 TTL +digital inputs (usually used by a CDROM drive). The result samples are +forwarded to the ADC capture FIFO (thus to the standard capture PCM device). + +name='Zoom Video Playback Volume',index=0 + +This control is used to attenuate samples from left and right zoom video +digital inputs (usually used by a CDROM drive). The result samples are +forwarded to the front DAC PCM slots of the AC97 codec. + +name='Zoom Video Capture Volume',index=0 + +This control is used to attenuate samples from left and right zoom video +digital inputs (usually used by a CDROM drive). The result samples are +forwarded to the ADC capture FIFO (thus to the standard capture PCM device). + +name='IEC958 LiveDrive Playback Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 optical +digital input. The result samples are forwarded to the front DAC PCM slots +of the AC97 codec. + +name='IEC958 LiveDrive Capture Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 optical +digital inputs. The result samples are forwarded to the ADC capture FIFO +(thus to the standard capture PCM device). + +name='IEC958 Coaxial Playback Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 coaxial +digital inputs. The result samples are forwarded to the front DAC PCM slots +of the AC97 codec. + +name='IEC958 Coaxial Capture Volume',index=0 + +This control is used to attenuate samples from left and right IEC958 coaxial +digital inputs. The result samples are forwarded to the ADC capture FIFO +(thus to the standard capture PCM device). + +name='Line LiveDrive Playback Volume',index=0 +name='Line LiveDrive Playback Volume',index=1 + +This control is used to attenuate samples from left and right I2S ADC +inputs (on the LiveDrive). The result samples are forwarded to the front +DAC PCM slots of the AC97 codec. + +name='Line LiveDrive Capture Volume',index=1 +name='Line LiveDrive Capture Volume',index=1 + +This control is used to attenuate samples from left and right I2S ADC +inputs (on the LiveDrive). The result samples are forwarded to the ADC +capture FIFO (thus to the standard capture PCM device). + +name='Tone Control - Switch',index=0 + +This control turns the tone control on or off. The samples for front, rear +and center / LFE outputs are affected. + +name='Tone Control - Bass',index=0 + +This control sets the bass intensity. There is no neutral value!! +When the tone control code is activated, the samples are always modified. +The closest value to pure signal is 20. + +name='Tone Control - Treble',index=0 + +This control sets the treble intensity. There is no neutral value!! +When the tone control code is activated, the samples are always modified. +The closest value to pure signal is 20. + +name='IEC958 Optical Raw Playback Switch',index=0 + +If this switch is on, then the samples for the IEC958 (S/PDIF) digital +output are taken only from the raw FX8010 PCM, otherwise standard front +PCM samples are taken. + +name='Headphone Playback Volume',index=1 + +This control attenuates the samples for the headphone output. + +name='Headphone Center Playback Switch',index=1 + +If this switch is on, then the sample for the center PCM is put to the +left headphone output (useful for SB Live cards without separate center/LFE +output). + +name='Headphone LFE Playback Switch',index=1 + +If this switch is on, then the sample for the center PCM is put to the +right headphone output (useful for SB Live cards without separate center/LFE +output). + + +3) PCM stream related controls +------------------------------ + +name='EMU10K1 PCM Volume',index 0-31 + +Channel volume attenuation in range 0-0xffff. The maximum value (no +attenuation) is default. The channel mapping for three values is +as follows: + + 0 - mono, default 0xffff (no attenuation) + 1 - left, default 0xffff (no attenuation) + 2 - right, default 0xffff (no attenuation) + +name='EMU10K1 PCM Send Routing',index 0-31 + +This control specifies the destination - FX-bus accumulators. There are +twelve values with this mapping: + + 0 - mono, A destination (FX-bus 0-15), default 0 + 1 - mono, B destination (FX-bus 0-15), default 1 + 2 - mono, C destination (FX-bus 0-15), default 2 + 3 - mono, D destination (FX-bus 0-15), default 3 + 4 - left, A destination (FX-bus 0-15), default 0 + 5 - left, B destination (FX-bus 0-15), default 1 + 6 - left, C destination (FX-bus 0-15), default 2 + 7 - left, D destination (FX-bus 0-15), default 3 + 8 - right, A destination (FX-bus 0-15), default 0 + 9 - right, B destination (FX-bus 0-15), default 1 + 10 - right, C destination (FX-bus 0-15), default 2 + 11 - right, D destination (FX-bus 0-15), default 3 + +Don't forget that it's illegal to assign a channel to the same FX-bus accumulator +more than once (it means 0=0 && 1=0 is an invalid combination). + +name='EMU10K1 PCM Send Volume',index 0-31 + +It specifies the attenuation (amount) for given destination in range 0-255. +The channel mapping is following: + + 0 - mono, A destination attn, default 255 (no attenuation) + 1 - mono, B destination attn, default 255 (no attenuation) + 2 - mono, C destination attn, default 0 (mute) + 3 - mono, D destination attn, default 0 (mute) + 4 - left, A destination attn, default 255 (no attenuation) + 5 - left, B destination attn, default 0 (mute) + 6 - left, C destination attn, default 0 (mute) + 7 - left, D destination attn, default 0 (mute) + 8 - right, A destination attn, default 0 (mute) + 9 - right, B destination attn, default 255 (no attenuation) + 10 - right, C destination attn, default 0 (mute) + 11 - right, D destination attn, default 0 (mute) + + + +4) MANUALS/PATENTS: +------------------- + +ftp://opensource.creative.com/pub/doc +------------------------------------- + + Files: + LM4545.pdf AC97 Codec + + m2049.pdf The EMU10K1 Digital Audio Processor + + hog63.ps FX8010 - A DSP Chip Architecture for Audio Effects + + +WIPO Patents +------------ + Patent numbers: + WO 9901813 (A1) Audio Effects Processor with multiple asynchronous (Jan. 14, 1999) + streams + + WO 9901814 (A1) Processor with Instruction Set for Audio Effects (Jan. 14, 1999) + + WO 9901953 (A1) Audio Effects Processor having Decoupled Instruction + Execution and Audio Data Sequencing (Jan. 14, 1999) + + +US Patents (http://www.uspto.gov/) +---------------------------------- + + US 5925841 Digital Sampling Instrument employing cache memory (Jul. 20, 1999) + + US 5928342 Audio Effects Processor integrated on a single chip (Jul. 27, 1999) + with a multiport memory onto which multiple asynchronous + digital sound samples can be concurrently loaded + + US 5930158 Processor with Instruction Set for Audio Effects (Jul. 27, 1999) + + US 6032235 Memory initialization circuit (Tram) (Feb. 29, 2000) + + US 6138207 Interpolation looping of audio samples in cache connected to (Oct. 24, 2000) + system bus with prioritization and modification of bus transfers + in accordance with loop ends and minimum block sizes + + US 6151670 Method for conserving memory storage using a (Nov. 21, 2000) + pool of short term memory registers + + US 6195715 Interrupt control for multiple programs communicating with (Feb. 27, 2001) + a common interrupt by associating programs to GP registers, + defining interrupt register, polling GP registers, and invoking + callback routine associated with defined interrupt register diff --git a/Documentation/sound/alsa/VIA82xx-mixer.txt b/Documentation/sound/alsa/VIA82xx-mixer.txt new file mode 100644 index 000000000000..1b0ac06ba95d --- /dev/null +++ b/Documentation/sound/alsa/VIA82xx-mixer.txt @@ -0,0 +1,8 @@ + + VIA82xx mixer + ============= + +On many VIA82xx boards, the 'Input Source Select' mixer control does not work. +Setting it to 'Input2' on such boards will cause recording to hang, or fail +with EIO (input/output error) via OSS emulation. This control should be left +at 'Input1' for such cards. diff --git a/Documentation/sound/alsa/hda_codec.txt b/Documentation/sound/alsa/hda_codec.txt new file mode 100644 index 000000000000..e9d07b8f1acb --- /dev/null +++ b/Documentation/sound/alsa/hda_codec.txt @@ -0,0 +1,299 @@ +Notes on Universal Interface for Intel High Definition Audio Codec +------------------------------------------------------------------ + +Takashi Iwai <tiwai@suse.de> + + +[Still a draft version] + + +General +======= + +The snd-hda-codec module supports the generic access function for the +High Definition (HD) audio codecs. It's designed to be independent +from the controller code like ac97 codec module. The real accessors +from/to the controller must be implemented in the lowlevel driver. + +The structure of this module is similar with ac97_codec module. +Each codec chip belongs to a bus class which communicates with the +controller. + + +Initialization of Bus Instance +============================== + +The card driver has to create struct hda_bus at first. The template +struct should be filled and passed to the constructor: + +struct hda_bus_template { + void *private_data; + struct pci_dev *pci; + const char *modelname; + struct hda_bus_ops ops; +}; + +The card driver can set and use the private_data field to retrieve its +own data in callback functions. The pci field is used when the patch +needs to check the PCI subsystem IDs, so on. For non-PCI system, it +doesn't have to be set, of course. +The modelname field specifies the board's specific configuration. The +string is passed to the codec parser, and it depends on the parser how +the string is used. +These fields, private_data, pci and modelname are all optional. + +The ops field contains the callback functions as the following: + +struct hda_bus_ops { + int (*command)(struct hda_codec *codec, hda_nid_t nid, int direct, + unsigned int verb, unsigned int parm); + unsigned int (*get_response)(struct hda_codec *codec); + void (*private_free)(struct hda_bus *); +}; + +The command callback is called when the codec module needs to send a +VERB to the controller. It's always a single command. +The get_response callback is called when the codec requires the answer +for the last command. These two callbacks are mandatory and have to +be given. +The last, private_free callback, is optional. It's called in the +destructor to release any necessary data in the lowlevel driver. + +The bus instance is created via snd_hda_bus_new(). You need to pass +the card instance, the template, and the pointer to store the +resultant bus instance. + +int snd_hda_bus_new(snd_card_t *card, const struct hda_bus_template *temp, + struct hda_bus **busp); + +It returns zero if successful. A negative return value means any +error during creation. + + +Creation of Codec Instance +========================== + +Each codec chip on the board is then created on the BUS instance. +To create a codec instance, call snd_hda_codec_new(). + +int snd_hda_codec_new(struct hda_bus *bus, unsigned int codec_addr, + struct hda_codec **codecp); + +The first argument is the BUS instance, the second argument is the +address of the codec, and the last one is the pointer to store the +resultant codec instance (can be NULL if not needed). + +The codec is stored in a linked list of bus instance. You can follow +the codec list like: + + struct list_head *p; + struct hda_codec *codec; + list_for_each(p, &bus->codec_list) { + codec = list_entry(p, struct hda_codec, list); + ... + } + +The codec isn't initialized at this stage properly. The +initialization sequence is called when the controls are built later. + + +Codec Access +============ + +To access codec, use snd_codec_read() and snd_codec_write(). +snd_hda_param_read() is for reading parameters. +For writing a sequence of verbs, use snd_hda_sequence_write(). + +To retrieve the number of sub nodes connected to the given node, use +snd_hda_get_sub_nodes(). The connection list can be obtained via +snd_hda_get_connections() call. + +When an unsolicited event happens, pass the event via +snd_hda_queue_unsol_event() so that the codec routines will process it +later. + + +(Mixer) Controls +================ + +To create mixer controls of all codecs, call +snd_hda_build_controls(). It then builds the mixers and does +initialization stuff on each codec. + + +PCM Stuff +========= + +snd_hda_build_pcms() gives the necessary information to create PCM +streams. When it's called, each codec belonging to the bus stores +codec->num_pcms and codec->pcm_info fields. The num_pcms indicates +the number of elements in pcm_info array. The card driver is supposed +to traverse the codec linked list, read the pcm information in +pcm_info array, and build pcm instances according to them. + +The pcm_info array contains the following record: + +/* PCM information for each substream */ +struct hda_pcm_stream { + unsigned int substreams; /* number of substreams, 0 = not exist */ + unsigned int channels_min; /* min. number of channels */ + unsigned int channels_max; /* max. number of channels */ + hda_nid_t nid; /* default NID to query rates/formats/bps, or set up */ + u32 rates; /* supported rates */ + u64 formats; /* supported formats (SNDRV_PCM_FMTBIT_) */ + unsigned int maxbps; /* supported max. bit per sample */ + struct hda_pcm_ops ops; +}; + +/* for PCM creation */ +struct hda_pcm { + char *name; + struct hda_pcm_stream stream[2]; +}; + +The name can be passed to snd_pcm_new(). The stream field contains +the information for playback (SNDRV_PCM_STREAM_PLAYBACK = 0) and +capture (SNDRV_PCM_STREAM_CAPTURE = 1) directions. The card driver +should pass substreams to snd_pcm_new() for the number of substreams +to create. + +The channels_min, channels_max, rates and formats should be copied to +runtime->hw record. They and maxbps fields are used also to compute +the format value for the HDA codec and controller. Call +snd_hda_calc_stream_format() to get the format value. + +The ops field contains the following callback functions: + +struct hda_pcm_ops { + int (*open)(struct hda_pcm_stream *info, struct hda_codec *codec, + snd_pcm_substream_t *substream); + int (*close)(struct hda_pcm_stream *info, struct hda_codec *codec, + snd_pcm_substream_t *substream); + int (*prepare)(struct hda_pcm_stream *info, struct hda_codec *codec, + unsigned int stream_tag, unsigned int format, + snd_pcm_substream_t *substream); + int (*cleanup)(struct hda_pcm_stream *info, struct hda_codec *codec, + snd_pcm_substream_t *substream); +}; + +All are non-NULL, so you can call them safely without NULL check. + +The open callback should be called in PCM open after runtime->hw is +set up. It may override some setting and constraints additionally. +Similarly, the close callback should be called in the PCM close. + +The prepare callback should be called in PCM prepare. This will set +up the codec chip properly for the operation. The cleanup should be +called in hw_free to clean up the configuration. + +The caller should check the return value, at least for open and +prepare callbacks. When a negative value is returned, some error +occurred. + + +Proc Files +========== + +Each codec dumps the widget node information in +/proc/asound/card*/codec#* file. This information would be really +helpful for debugging. Please provide its contents together with the +bug report. + + +Power Management +================ + +It's simple: +Call snd_hda_suspend() in the PM suspend callback. +Call snd_hda_resume() in the PM resume callback. + + +Codec Preset (Patch) +==================== + +To set up and handle the codec functionality fully, each codec may +have a codec preset (patch). It's defined in struct hda_codec_preset: + + struct hda_codec_preset { + unsigned int id; + unsigned int mask; + unsigned int subs; + unsigned int subs_mask; + unsigned int rev; + const char *name; + int (*patch)(struct hda_codec *codec); + }; + +When the codec id and codec subsystem id match with the given id and +subs fields bitwise (with bitmask mask and subs_mask), the callback +patch is called. The patch callback should initialize the codec and +set the codec->patch_ops field. This is defined as below: + + struct hda_codec_ops { + int (*build_controls)(struct hda_codec *codec); + int (*build_pcms)(struct hda_codec *codec); + int (*init)(struct hda_codec *codec); + void (*free)(struct hda_codec *codec); + void (*unsol_event)(struct hda_codec *codec, unsigned int res); + #ifdef CONFIG_PM + int (*suspend)(struct hda_codec *codec, pm_message_t state); + int (*resume)(struct hda_codec *codec); + #endif + }; + +The build_controls callback is called from snd_hda_build_controls(). +Similarly, the build_pcms callback is called from +snd_hda_build_pcms(). The init callback is called after +build_controls to initialize the hardware. +The free callback is called as a destructor. + +The unsol_event callback is called when an unsolicited event is +received. + +The suspend and resume callbacks are for power management. + +Each entry can be NULL if not necessary to be called. + + +Generic Parser +============== + +When the device doesn't match with any given presets, the widgets are +parsed via th generic parser (hda_generic.c). Its support is +limited: no multi-channel support, for example. + + +Digital I/O +=========== + +Call snd_hda_create_spdif_out_ctls() from the patch to create controls +related with SPDIF out. In the patch resume callback, call +snd_hda_resume_spdif(). + + +Helper Functions +================ + +snd_hda_get_codec_name() stores the codec name on the given string. + +snd_hda_check_board_config() can be used to obtain the configuration +information matching with the device. Define the table with struct +hda_board_config entries (zero-terminated), and pass it to the +function. The function checks the modelname given as a module +parameter, and PCI subsystem IDs. If the matching entry is found, it +returns the config field value. + +snd_hda_add_new_ctls() can be used to create and add control entries. +Pass the zero-terminated array of snd_kcontrol_new_t. The same array +can be passed to snd_hda_resume_ctls() for resume. +Note that this will call control->put callback of these entries. So, +put callback should check codec->in_resume and force to restore the +given value if it's non-zero even if the value is identical with the +cached value. + +Macros HDA_CODEC_VOLUME(), HDA_CODEC_MUTE() and their variables can be +used for the entry of snd_kcontrol_new_t. + +The input MUX helper callbacks for such a control are provided, too: +snd_hda_input_mux_info() and snd_hda_input_mux_put(). See +patch_realtek.c for example. diff --git a/Documentation/sound/alsa/seq_oss.html b/Documentation/sound/alsa/seq_oss.html new file mode 100644 index 000000000000..d9776cf60c07 --- /dev/null +++ b/Documentation/sound/alsa/seq_oss.html @@ -0,0 +1,409 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> +<HTML> +<HEAD> + <TITLE>OSS Sequencer Emulation on ALSA</TITLE> +</HEAD> +<BODY> + +<CENTER> +<H1> + +<HR WIDTH="100%"></H1></CENTER> + +<CENTER> +<H1> +OSS Sequencer Emulation on ALSA</H1></CENTER> + +<HR WIDTH="100%"> +<P>Copyright (c) 1998,1999 by Takashi Iwai +<TT><A HREF="mailto:iwai@ww.uni-erlangen.de"><iwai@ww.uni-erlangen.de></A></TT> +<P>ver.0.1.8; Nov. 16, 1999 +<H2> + +<HR WIDTH="100%"></H2> + +<H2> +1. Description</H2> +This directory contains the OSS sequencer emulation driver on ALSA. Note +that this program is still in the development state. +<P>What this does - it provides the emulation of the OSS sequencer, access +via +<TT>/dev/sequencer</TT> and <TT>/dev/music</TT> devices. +The most of applications using OSS can run if the appropriate ALSA +sequencer is prepared. +<P>The following features are emulated by this driver: +<UL> +<LI> +Normal sequencer and MIDI events:</LI> + +<BR>They are converted to the ALSA sequencer events, and sent to the corresponding +port. +<LI> +Timer events:</LI> + +<BR>The timer is not selectable by ioctl. The control rate is fixed to +100 regardless of HZ. That is, even on Alpha system, a tick is always +1/100 second. The base rate and tempo can be changed in <TT>/dev/music</TT>. + +<LI> +Patch loading:</LI> + +<BR>It purely depends on the synth drivers whether it's supported since +the patch loading is realized by callback to the synth driver. +<LI> +I/O controls:</LI> + +<BR>Most of controls are accepted. Some controls +are dependent on the synth driver, as well as even on original OSS.</UL> +Furthermore, you can find the following advanced features: +<UL> +<LI> +Better queue mechanism:</LI> + +<BR>The events are queued before processing them. +<LI> +Multiple applications:</LI> + +<BR>You can run two or more applications simultaneously (even for OSS sequencer)! +However, each MIDI device is exclusive - that is, if a MIDI device is opened +once by some application, other applications can't use it. No such a restriction +in synth devices. +<LI> +Real-time event processing:</LI> + +<BR>The events can be processed in real time without using out of bound +ioctl. To switch to real-time mode, send ABSTIME 0 event. The followed +events will be processed in real-time without queued. To switch off the +real-time mode, send RELTIME 0 event. +<LI> +<TT>/proc</TT> interface:</LI> + +<BR>The status of applications and devices can be shown via <TT>/proc/asound/seq/oss</TT> +at any time. In the later version, configuration will be changed via <TT>/proc</TT> +interface, too.</UL> + +<H2> +2. Installation</H2> +Run configure script with both sequencer support (<TT>--with-sequencer=yes</TT>) +and OSS emulation (<TT>--with-oss=yes</TT>) options. A module <TT>snd-seq-oss.o</TT> +will be created. If the synth module of your sound card supports for OSS +emulation (so far, only Emu8000 driver), this module will be loaded automatically. +Otherwise, you need to load this module manually. +<P>At beginning, this module probes all the MIDI ports which have been +already connected to the sequencer. Once after that, the creation and deletion +of ports are watched by announcement mechanism of ALSA sequencer. +<P>The available synth and MIDI devices can be found in proc interface. +Run "<TT>cat /proc/asound/seq/oss</TT>", and check the devices. For example, +if you use an AWE64 card, you'll see like the following: +<PRE> OSS sequencer emulation version 0.1.8 + ALSA client number 63 + ALSA receiver port 0 + + Number of applications: 0 + + Number of synth devices: 1 + + synth 0: [EMU8000] + type 0x1 : subtype 0x20 : voices 32 + capabilties : ioctl enabled / load_patch enabled + + Number of MIDI devices: 3 + + midi 0: [Emu8000 Port-0] ALSA port 65:0 + capability write / opened none + + midi 1: [Emu8000 Port-1] ALSA port 65:1 + capability write / opened none + + midi 2: [0: MPU-401 (UART)] ALSA port 64:0 + capability read/write / opened none</PRE> +Note that the device number may be different from the information of +<TT>/proc/asound/oss-devices</TT> +or ones of the original OSS driver. Use the device number listed in <TT>/proc/asound/seq/oss</TT> +to play via OSS sequencer emulation. +<H2> +3. Using Synthesizer Devices</H2> +Run your favorite program. I've tested playmidi-2.4, awemidi-0.4.3, gmod-3.1 +and xmp-1.1.5. You can load samples via <TT>/dev/sequencer</TT> like sfxload, +too. +<P>If the lowlevel driver supports multiple access to synth devices (like +Emu8000 driver), two or more applications are allowed to run at the same +time. +<H2> +4. Using MIDI Devices</H2> +So far, only MIDI output was tested. MIDI input was not checked at all, +but hopefully it will work. Use the device number listed in <TT>/proc/asound/seq/oss</TT>. +Be aware that these numbers are mostly different from the list in +<TT>/proc/asound/oss-devices</TT>. +<H2> +5. Module Options</H2> +The following module options are available: +<UL> +<LI> +<TT>maxqlen</TT></LI> + +<BR>specifies the maximum read/write queue length. This queue is private +for OSS sequencer, so that it is independent from the queue length of ALSA +sequencer. Default value is 1024. +<LI> +<TT>seq_oss_debug</TT></LI> + +<BR>specifies the debug level and accepts zero (= no debug message) or +positive integer. Default value is 0.</UL> + +<H2> +6. Queue Mechanism</H2> +OSS sequencer emulation uses an ALSA priority queue. The +events from <TT>/dev/sequencer</TT> are processed and put onto the queue +specified by module option. +<P>All the events from <TT>/dev/sequencer</TT> are parsed at beginning. +The timing events are also parsed at this moment, so that the events may +be processed in real-time. Sending an event ABSTIME 0 switches the operation +mode to real-time mode, and sending an event RELTIME 0 switches it off. +In the real-time mode, all events are dispatched immediately. +<P>The queued events are dispatched to the corresponding ALSA sequencer +ports after scheduled time by ALSA sequencer dispatcher. +<P>If the write-queue is full, the application sleeps until a certain amount +(as default one half) becomes empty in blocking mode. The synchronization +to write timing was implemented, too. +<P>The input from MIDI devices or echo-back events are stored on read FIFO +queue. If application reads <TT>/dev/sequencer</TT> in blocking mode, the +process will be awaked. + +<H2> +7. Interface to Synthesizer Device</H2> + +<H3> +7.1. Registration</H3> +To register an OSS synthesizer device, use <TT>snd_seq_oss_synth_register</TT> +function. +<PRE>int snd_seq_oss_synth_register(char *name, int type, int subtype, int nvoices, + snd_seq_oss_callback_t *oper, void *private_data)</PRE> +The arguments <TT>name</TT>, <TT>type</TT>, <TT>subtype</TT> and +<TT>nvoices</TT> +are used for making the appropriate synth_info structure for ioctl. The +return value is an index number of this device. This index must be remembered +for unregister. If registration is failed, -errno will be returned. +<P>To release this device, call <TT>snd_seq_oss_synth_unregister function</TT>: +<PRE>int snd_seq_oss_synth_unregister(int index),</PRE> +where the <TT>index</TT> is the index number returned by register function. +<H3> +7.2. Callbacks</H3> +OSS synthesizer devices have capability for sample downloading and ioctls +like sample reset. In OSS emulation, these special features are realized +by using callbacks. The registration argument oper is used to specify these +callbacks. The following callback functions must be defined: +<PRE>snd_seq_oss_callback_t: + int (*open)(snd_seq_oss_arg_t *p, void *closure); + int (*close)(snd_seq_oss_arg_t *p); + int (*ioctl)(snd_seq_oss_arg_t *p, unsigned int cmd, unsigned long arg); + int (*load_patch)(snd_seq_oss_arg_t *p, int format, const char *buf, int offs, int count); + int (*reset)(snd_seq_oss_arg_t *p); +Except for <TT>open</TT> and <TT>close</TT> callbacks, they are allowed +to be NULL. +<P>Each callback function takes the argument type snd_seq_oss_arg_t as the +first argument. +<PRE>struct snd_seq_oss_arg_t { + int app_index; + int file_mode; + int seq_mode; + snd_seq_addr_t addr; + void *private_data; + int event_passing; +};</PRE> +The first three fields, <TT>app_index</TT>, <TT>file_mode</TT> and +<TT>seq_mode</TT> +are initialized by OSS sequencer. The <TT>app_index</TT> is the application +index which is unique to each application opening OSS sequencer. The +<TT>file_mode</TT> +is bit-flags indicating the file operation mode. See +<TT>seq_oss.h</TT> +for its meaning. The <TT>seq_mode</TT> is sequencer operation mode. In +the current version, only <TT>SND_OSSSEQ_MODE_SYNTH</TT> is used. +<P>The next two fields, <TT>addr</TT> and <TT>private_data</TT>, must be +filled by the synth driver at open callback. The <TT>addr</TT> contains +the address of ALSA sequencer port which is assigned to this device. If +the driver allocates memory for <TT>private_data</TT>, it must be released +in close callback by itself. +<P>The last field, <TT>event_passing</TT>, indicates how to translate note-on +/ off events. In <TT>PROCESS_EVENTS</TT> mode, the note 255 is regarded +as velocity change, and key pressure event is passed to the port. In <TT>PASS_EVENTS</TT> +mode, all note on/off events are passed to the port without modified. <TT>PROCESS_KEYPRESS</TT> +mode checks the note above 128 and regards it as key pressure event (mainly +for Emu8000 driver). +<H4> +7.2.1. Open Callback</H4> +The <TT>open</TT> is called at each time this device is opened by an application +using OSS sequencer. This must not be NULL. Typically, the open callback +does the following procedure: +<OL> +<LI> +Allocate private data record.</LI> + +<LI> +Create an ALSA sequencer port.</LI> + +<LI> +Set the new port address on arg->addr.</LI> + +<LI> +Set the private data record pointer on arg->private_data.</LI> +</OL> +Note that the type bit-flags in port_info of this synth port must NOT contain +<TT>TYPE_MIDI_GENERIC</TT> +bit. Instead, <TT>TYPE_SPECIFIC</TT> should be used. Also, <TT>CAP_SUBSCRIPTION</TT> +bit should NOT be included, too. This is necessary to tell it from other +normal MIDI devices. If the open procedure succeeded, return zero. Otherwise, +return -errno. +<H4> +7.2.2 Ioctl Callback</H4> +The <TT>ioctl</TT> callback is called when the sequencer receives device-specific +ioctls. The following two ioctls should be processed by this callback: +<UL> +<LI> +<TT>IOCTL_SEQ_RESET_SAMPLES</TT></LI> + +<BR>reset all samples on memory -- return 0 +<LI> +<TT>IOCTL_SYNTH_MEMAVL</TT></LI> + +<BR>return the available memory size +<LI> +<TT>FM_4OP_ENABLE</TT></LI> + +<BR>can be ignored usually</UL> +The other ioctls are processed inside the sequencer without passing to +the lowlevel driver. +<H4> +7.2.3 Load_Patch Callback</H4> +The <TT>load_patch</TT> callback is used for sample-downloading. This callback +must read the data on user-space and transfer to each device. Return 0 +if succeeded, and -errno if failed. The format argument is the patch key +in patch_info record. The buf is user-space pointer where patch_info record +is stored. The offs can be ignored. The count is total data size of this +sample data. +<H4> +7.2.4 Close Callback</H4> +The <TT>close</TT> callback is called when this device is closed by the +applicaion. If any private data was allocated in open callback, it must +be released in the close callback. The deletion of ALSA port should be +done here, too. This callback must not be NULL. +<H4> +7.2.5 Reset Callback</H4> +The <TT>reset</TT> callback is called when sequencer device is reset or +closed by applications. The callback should turn off the sounds on the +relevant port immediately, and initialize the status of the port. If this +callback is undefined, OSS seq sends a <TT>HEARTBEAT</TT> event to the +port. +<H3> +7.3 Events</H3> +Most of the events are processed by sequencer and translated to the adequate +ALSA sequencer events, so that each synth device can receive by input_event +callback of ALSA sequencer port. The following ALSA events should be implemented +by the driver: +<BR> +<TABLE BORDER WIDTH="75%" NOSAVE > +<TR NOSAVE> +<TD NOSAVE><B>ALSA event</B></TD> + +<TD><B>Original OSS events</B></TD> +</TR> + +<TR> +<TD>NOTEON</TD> + +<TD>SEQ_NOTEON +<BR>MIDI_NOTEON</TD> +</TR> + +<TR> +<TD>NOTE</TD> + +<TD>SEQ_NOTEOFF +<BR>MIDI_NOTEOFF</TD> +</TR> + +<TR NOSAVE> +<TD NOSAVE>KEYPRESS</TD> + +<TD>MIDI_KEY_PRESSURE</TD> +</TR> + +<TR NOSAVE> +<TD>CHANPRESS</TD> + +<TD NOSAVE>SEQ_AFTERTOUCH +<BR>MIDI_CHN_PRESSURE</TD> +</TR> + +<TR NOSAVE> +<TD NOSAVE>PGMCHANGE</TD> + +<TD NOSAVE>SEQ_PGMCHANGE +<BR>MIDI_PGM_CHANGE</TD> +</TR> + +<TR> +<TD>PITCHBEND</TD> + +<TD>SEQ_CONTROLLER(CTRL_PITCH_BENDER) +<BR>MIDI_PITCH_BEND</TD> +</TR> + +<TR> +<TD>CONTROLLER</TD> + +<TD>MIDI_CTL_CHANGE +<BR>SEQ_BALANCE (with CTL_PAN)</TD> +</TR> + +<TR> +<TD>CONTROL14</TD> + +<TD>SEQ_CONTROLLER</TD> +</TR> + +<TR> +<TD>REGPARAM</TD> + +<TD>SEQ_CONTROLLER(CTRL_PITCH_BENDER_RANGE)</TD> +</TR> + +<TR> +<TD>SYSEX</TD> + +<TD>SEQ_SYSEX</TD> +</TR> +</TABLE> + +<P>The most of these behavior can be realized by MIDI emulation driver +included in the Emu8000 lowlevel driver. In the future release, this module +will be independent. +<P>Some OSS events (<TT>SEQ_PRIVATE</TT> and <TT>SEQ_VOLUME</TT> events) are passed as event +type SND_SEQ_OSS_PRIVATE. The OSS sequencer passes these event 8 byte +packets without any modification. The lowlevel driver should process these +events appropriately. +<H2> +8. Interface to MIDI Device</H2> +Since the OSS emulation probes the creation and deletion of ALSA MIDI sequencer +ports automatically by receiving announcement from ALSA sequencer, the +MIDI devices don't need to be registered explicitly like synth devices. +However, the MIDI port_info registered to ALSA sequencer must include a group +name <TT>SND_SEQ_GROUP_DEVICE</TT> and a capability-bit <TT>CAP_READ</TT> or +<TT>CAP_WRITE</TT>. Also, subscription capabilities, <TT>CAP_SUBS_READ</TT> or <TT>CAP_SUBS_WRITE</TT>, +must be defined, too. If these conditions are not satisfied, the port is not +registered as OSS sequencer MIDI device. +<P>The events via MIDI devices are parsed in OSS sequencer and converted +to the corresponding ALSA sequencer events. The input from MIDI sequencer +is also converted to MIDI byte events by OSS sequencer. This works just +a reverse way of seq_midi module. +<H2> +9. Known Problems / TODO's</H2> + +<UL> +<LI> +Patch loading via ALSA instrument layer is not implemented yet.</LI> +</UL> + +</BODY> +</HTML> diff --git a/Documentation/sound/alsa/serial-u16550.txt b/Documentation/sound/alsa/serial-u16550.txt new file mode 100644 index 000000000000..c1919559d509 --- /dev/null +++ b/Documentation/sound/alsa/serial-u16550.txt @@ -0,0 +1,88 @@ + + Serial UART 16450/16550 MIDI driver + =================================== + +The adaptor module parameter allows you to select either: + + 0 - Roland Soundcanvas support (default) + 1 - Midiator MS-124T support (1) + 2 - Midiator MS-124W S/A mode (2) + 3 - MS-124W M/B mode support (3) + 4 - Generic device with multiple input support (4) + +For the Midiator MS-124W, you must set the physical M-S and A-B +switches on the Midiator to match the driver mode you select. + +In Roland Soundcanvas mode, multiple ALSA raw MIDI substreams are supported +(midiCnD0-midiCnD15). Whenever you write to a different substream, the driver +sends the nonstandard MIDI command sequence F5 NN, where NN is the substream +number plus 1. Roland modules use this command to switch between different +"parts", so this feature lets you treat each part as a distinct raw MIDI +substream. The driver provides no way to send F5 00 (no selection) or to not +send the F5 NN command sequence at all; perhaps it ought to. + +Usage example for simple serial converter: + + /sbin/setserial /dev/ttyS0 uart none + /sbin/modprobe snd-serial-u16550 port=0x3f8 irq=4 speed=115200 + +Usage example for Roland SoundCanvas with 4 MIDI ports: + + /sbin/setserial /dev/ttyS0 uart none + /sbin/modprobe snd-serial-u16550 port=0x3f8 irq=4 outs=4 + +In MS-124T mode, one raw MIDI substream is supported (midiCnD0); the outs +module parameter is automatically set to 1. The driver sends the same data to +all four MIDI Out connectors. Set the A-B switch and the speed module +parameter to match (A=19200, B=9600). + +Usage example for MS-124T, with A-B switch in A position: + + /sbin/setserial /dev/ttyS0 uart none + /sbin/modprobe snd-serial-u16550 port=0x3f8 irq=4 adaptor=1 \ + speed=19200 + +In MS-124W S/A mode, one raw MIDI substream is supported (midiCnD0); +the outs module parameter is automatically set to 1. The driver sends +the same data to all four MIDI Out connectors at full MIDI speed. + +Usage example for S/A mode: + + /sbin/setserial /dev/ttyS0 uart none + /sbin/modprobe snd-serial-u16550 port=0x3f8 irq=4 adaptor=2 + +In MS-124W M/B mode, the driver supports 16 ALSA raw MIDI substreams; +the outs module parameter is automatically set to 16. The substream +number gives a bitmask of which MIDI Out connectors the data should be +sent to, with midiCnD1 sending to Out 1, midiCnD2 to Out 2, midiCnD4 to +Out 3, and midiCnD8 to Out 4. Thus midiCnD15 sends the data to all 4 ports. +As a special case, midiCnD0 also sends to all ports, since it is not useful +to send the data to no ports. M/B mode has extra overhead to select the MIDI +Out for each byte, so the aggregate data rate across all four MIDI Outs is +at most one byte every 520 us, as compared with the full MIDI data rate of +one byte every 320 us per port. + +Usage example for M/B mode: + + /sbin/setserial /dev/ttyS0 uart none + /sbin/modprobe snd-serial-u16550 port=0x3f8 irq=4 adaptor=3 + +The MS-124W hardware's M/A mode is currently not supported. This mode allows +the MIDI Outs to act independently at double the aggregate throughput of M/B, +but does not allow sending the same byte simultaneously to multiple MIDI Outs. +The M/A protocol requires the driver to twiddle the modem control lines under +timing constraints, so it would be a bit more complicated to implement than +the other modes. + +Midiator models other than MS-124W and MS-124T are currently not supported. +Note that the suffix letter is significant; the MS-124 and MS-124B are not +compatible, nor are the other known models MS-101, MS-101B, MS-103, and MS-114. +I do have documentation (tim.mann@compaq.com) that partially covers these models, +but no units to experiment with. The MS-124W support is tested with a real unit. +The MS-124T support is untested, but should work. + +The Generic driver supports multiple input and output substreams over a single +serial port. Similar to Roland Soundcanvas mode, F5 NN is used to select the +appropriate input or output stream (depending on the data direction). +Additionally, the CTS signal is used to regulate the data flow. The number of +inputs is specified by the ins parameter. diff --git a/Documentation/sound/oss/AD1816 b/Documentation/sound/oss/AD1816 new file mode 100644 index 000000000000..14bd8f25d523 --- /dev/null +++ b/Documentation/sound/oss/AD1816 @@ -0,0 +1,84 @@ +Documentation for the AD1816(A) sound driver +============================================ + +Installation: +------------- + +To get your AD1816(A) based sound card work, you'll have to enable support for +experimental code ("Prompt for development and/or incomplete code/drivers") +and isapnp ("Plug and Play support", "ISA Plug and Play support"). Enable +"Sound card support", "OSS modules support" and "Support for AD1816(A) based +cards (EXPERIMENTAL)" in the sound configuration menu, too. Now build, install +and reboot the new kernel as usual. + +Features: +--------- + +List of features supported by this driver: +- full-duplex support +- supported audio formats: unsigned 8bit, signed 16bit little endian, + signed 16bit big endian, µ-law, A-law +- supported channels: mono and stereo +- supported recording sources: Master, CD, Line, Line1, Line2, Mic +- supports phat 3d stereo circuit (Line 3) + + +Supported cards: +---------------- + +The following cards are known to work with this driver: +- Terratec Base 1 +- Terratec Base 64 +- HP Kayak +- Acer FX-3D +- SY-1816 +- Highscreen Sound-Boostar 32 Wave 3D +- Highscreen Sound-Boostar 16 +- AVM Apex Pro card +- (Aztech SC-16 3D) +- (Newcom SC-16 3D) +- (Terratec EWS64S) + +Cards listed in brackets are not supported reliable. If you have such a card +you should add the extra parameter: + options=1 +when loading the ad1816 module via modprobe. + + +Troubleshooting: +---------------- + +First of all you should check, if the driver has been loaded +properly. + +If loading of the driver succeeds, but playback/capture fails, check +if you used the correct values for irq, dma and dma2 when loading the module. +If one of them is wrong you usually get the following error message: + +Nov 6 17:06:13 tek01 kernel: Sound: DMA (output) timed out - IRQ/DRQ config error? + +If playback/capture is too fast or to slow, you should have a look at +the clock chip of your sound card. The AD1816 was designed for a 33MHz +oscillator, however most sound card manufacturer use slightly +different oscillators as they are cheaper than 33MHz oscillators. If +you have such a card you have to adjust the ad1816_clockfreq parameter +above. For example: For a card using a 32.875MHz oscillator use +ad1816_clockfreq=32875 instead of ad1816_clockfreq=33000. + + +Updates, bugfixes and bugreports: +-------------------------------- + +As the driver is still experimental and under development, you should +watch out for updates. Updates of the driver are available on the +Internet from one of my home pages: + http://www.student.informatik.tu-darmstadt.de/~tek/projects/linux.html +or: + http://www.tu-darmstadt.de/~tek01/projects/linux.html + +Bugreports, bugfixes and related questions should be sent via E-Mail to: + tek@rbg.informatik.tu-darmstadt.de + +Thorsten Knabe <tek@rbg.informatik.tu-darmstadt.de> +Christoph Hellwig <hch@infradead.org> + Last modified: 2000/09/20 diff --git a/Documentation/sound/oss/ALS b/Documentation/sound/oss/ALS new file mode 100644 index 000000000000..d01ffbfd5808 --- /dev/null +++ b/Documentation/sound/oss/ALS @@ -0,0 +1,66 @@ +ALS-007/ALS-100/ALS-200 based sound cards +========================================= + +Support for sound cards based around the Avance Logic +ALS-007/ALS-100/ALS-200 chip is included. These chips are a single +chip PnP sound solution which is mostly hardware compatible with the +Sound Blaster 16 card, with most differences occurring in the use of +the mixer registers. For this reason the ALS code is integrated +as part of the Sound Blaster 16 driver (adding only 800 bytes to the +SB16 driver). + +To use an ALS sound card under Linux, enable the following options as +modules in the sound configuration section of the kernel config: + - 100% Sound Blaster compatibles (SB16/32/64, ESS, Jazz16) support + - FM synthesizer (YM3812/OPL-3) support + - standalone MPU401 support may be required for some cards; for the + ALS-007, when using isapnptools, it is required +Since the ALS-007/100/200 are PnP cards, ISAPnP support should probably be +compiled in. If kernel level PnP support is not included, isapnptools will +be required to configure the card before the sound modules are loaded. + +When using kernel level ISAPnP, the kernel should correctly identify and +configure all resources required by the card when the "sb" module is +inserted. Note that the ALS-007 does not have a 16 bit DMA channel and that +the MPU401 interface on this card uses a different interrupt to the audio +section. This should all be correctly configured by the kernel; if problems +with the MPU401 interface surface, try using the standalone MPU401 module, +passing "0" as the "sb" module's "mpu_io" module parameter to prevent the +soundblaster driver attempting to register the MPU401 itself. The onboard +synth device can be accessed using the "opl3" module. + +If isapnptools is used to wake up the sound card (as in 2.2.x), the settings +of the card's resources should be passed to the kernel modules ("sb", "opl3" +and "mpu401") using the module parameters. When configuring an ALS-007, be +sure to specify different IRQs for the audio and MPU401 sections - this card +requires they be different. For "sb", "io", "irq" and "dma" should be set +to the same values used to configure the audio section of the card with +isapnp. "dma16" should be explicitly set to "-1" for an ALS-007 since this +card does not have a 16 bit dma channel; if not specified the kernel will +default to using channel 5 anyway which will cause audio not to work. +"mpu_io" should be set to 0. The "io" parameter of the "opl3" module should +also agree with the setting used by isapnp. To get the MPU401 interface +working on an ALS-007 card, the "mpu401" module will be required since this +card uses separate IRQs for the audio and MPU401 sections and there is no +parameter available to pass a different IRQ to the "sb" driver (whose +inbuilt MPU401 driver would otherwise be fine). Insert the mpu401 module +passing appropriate values using the "io" and "irq" parameters. + +The resulting sound driver will provide the following capabilities: + - 8 and 16 bit audio playback + - 8 and 16 bit audio recording + - Software selection of record source (line in, CD, FM, mic, master) + - Record and playback of midi data via the external MPU-401 + - Playback of midi data using inbuilt FM synthesizer + - Control of the ALS-007 mixer via any OSS-compatible mixer programs. + Controls available are Master (L&R), Line in (L&R), CD (L&R), + DSP/PCM/audio out (L&R), FM (L&R) and Mic in (mono). + +Jonathan Woithe +jwoithe@physics.adelaide.edu.au +30 March 1998 + +Modified 2000-02-26 by Dave Forrest, drf5n@virginia.edu to add ALS100/ALS200 +Modified 2000-04-10 by Paul Laufer, pelaufer@csupomona.edu to add ISAPnP info. +Modified 2000-11-19 by Jonathan Woithe, jwoithe@physics.adelaide.edu.au + - updated information for kernel 2.4.x. diff --git a/Documentation/sound/oss/AWE32 b/Documentation/sound/oss/AWE32 new file mode 100644 index 000000000000..cb179bfeb522 --- /dev/null +++ b/Documentation/sound/oss/AWE32 @@ -0,0 +1,76 @@ + Installing and using Creative AWE midi sound under Linux. + +This documentation is devoted to the Creative Sound Blaster AWE32, AWE64 and +SB32. + +1) Make sure you have an ORIGINAL Creative SB32, AWE32 or AWE64 card. This + is important, because the driver works only with real Creative cards. + +2) The first thing you need to do is re-compile your kernel with support for + your sound card. Run your favourite tool to configure the kernel and when + you get to the "Sound" menu you should enable support for the following: + + Sound card support, + OSS sound modules, + 100% Sound Blaster compatibles (SB16/32/64, ESS, Jazz16) support, + AWE32 synth + + If your card is "Plug and Play" you will also need to enable these two + options, found under the "Plug and Play configuration" menu: + + Plug and Play support + ISA Plug and Play support + + Now compile and install the kernel in normal fashion. If you don't know + how to do this you can find instructions for this in the README file + located in the root directory of the kernel source. + +3) Before you can start playing midi files you will have to load a sound + bank file. The utility needed for doing this is called "sfxload", and it + is one of the utilities found in a package called "awesfx". If this + package is not available in your distribution you can download the AWE + snapshot from Creative Labs Open Source website: + + http://www.opensource.creative.com/snapshot.html + + Once you have unpacked the AWE snapshot you will see a "awesfx" + directory. Follow the instructions in awesfx/docs/INSTALL to install the + utilities in this package. After doing this, sfxload should be installed + as: + + /usr/local/bin/sfxload + + To enable AWE general midi synthesis you should also get the sound bank + file for general midi from: + + http://members.xoom.com/yar/synthgm.sbk.gz + + Copy it to a directory of your choice, and unpack it there. + +4) Edit /etc/modprobe.conf, and insert the following lines at the end of the + file: + + alias sound-slot-0 sb + alias sound-service-0-1 awe_wave + install awe_wave /sbin/modprobe --first-time -i awe_wave && /usr/local/bin/sfxload PATH_TO_SOUND_BANK_FILE + + You will of course have to change "PATH_TO_SOUND_BANK_FILE" to the full + path of of the sound bank file. That will enable the Sound Blaster and AWE + wave synthesis. To play midi files you should get one of these programs if + you don't already have them: + + Playmidi: http://playmidi.openprojects.net + + AWEMidi Player (drvmidi) Included in the previously mentioned AWE + snapshot. + + You will probably have to pass the "-e" switch to playmidi to have it use + your midi device. drvmidi should work without switches. + + If something goes wrong please e-mail me. All comments and suggestions are + welcome. + + Yaroslav Rosomakho (alons55@dialup.ptt.ru) + http://www.yar.opennet.ru + +Last Updated: Feb 3 2001 diff --git a/Documentation/sound/oss/AudioExcelDSP16 b/Documentation/sound/oss/AudioExcelDSP16 new file mode 100644 index 000000000000..c0f08922993b --- /dev/null +++ b/Documentation/sound/oss/AudioExcelDSP16 @@ -0,0 +1,101 @@ +Driver +------ + +Informations about Audio Excel DSP 16 driver can be found in the source +file aedsp16.c +Please, read the head of the source before using it. It contain useful +informations. + +Configuration +------------- + +The Audio Excel configuration, is now done with the standard Linux setup. +You have to configure the sound card (Sound Blaster or Microsoft Sound System) +and, if you want it, the Roland MPU-401 (do not use the Sound Blaster MPU-401, +SB-MPU401) in the main driver menu. Activate the lowlevel drivers then select +the Audio Excel hardware that you want to initialize. Check the IRQ/DMA/MIRQ +of the Audio Excel initialization: it must be the same as the SBPRO (or MSS) +setup. If the parameters are different, correct it. +I you own a Gallant's audio card based on SC-6600, activate the SC-6600 support. +If you want to change the configuration of the sound board, be sure to +check off all the configuration items before re-configure it. + +Module parameters +----------------- +To use this driver as a module, you must configure some module parameters, to +set up I/O addresses, IRQ lines and DMA channels. Some parameters are +mandatory while some others are optional. Here a list of parameters you can +use with this module: + +Name Description +==== =========== +MANDATORY +io I/O base address (0x220 or 0x240) +irq irq line (5, 7, 9, 10 or 11) +dma dma channel (0, 1 or 3) + +OPTIONAL +mss_base I/O base address for activate MSS mode (default SBPRO) + (0x530 or 0xE80) +mpu_base I/O base address for activate MPU-401 mode + (0x300, 0x310, 0x320 or 0x330) +mpu_irq MPU-401 irq line (5, 7, 9, 10 or 0) + +The /etc/modprobe.conf will have lines like this: + +options opl3 io=0x388 +options ad1848 io=0x530 irq=11 dma=3 +options aedsp16 io=0x220 irq=11 dma=3 mss_base=0x530 + +Where the aedsp16 options are the options for this driver while opl3 and +ad1848 are the corresponding options for the MSS and OPL3 modules. + +Loading MSS and OPL3 needs to pre load the aedsp16 module to set up correctly +the sound card. Installation dependencies must be written in the modprobe.conf +file: + +install ad1848 /sbin/modprobe aedsp16 && /sbin/modprobe -i ad1848 +install opl3 /sbin/modprobe aedsp16 && /sbin/modprobe -i opl3 + +Then you must load the sound modules stack in this order: +sound -> aedsp16 -> [ ad1848, opl3 ] + +With the above configuration, loading ad1848 or opl3 modules, will +automatically load all the sound stack. + +Sound cards supported +--------------------- +This driver supports the SC-6000 and SC-6600 based Gallant's sound card. +It don't support the Audio Excel DSP 16 III (try the SC-6600 code). +I'm working on the III version of the card: if someone have useful +informations about it, please let me know. +For all the non-supported audio cards, you have to boot MS-DOS (or WIN95) +activating the audio card with the MS-DOS device driver, then you have to +<ctrl>-<alt>-<del> and boot Linux. +Follow these steps: + +1) Compile Linux kernel with standard sound driver, using the emulation + you want, with the parameters of your audio card, + e.g. Microsoft Sound System irq10 dma3 +2) Install your new kernel as the default boot kernel. +3) Boot MS-DOS and configure the audio card with the boot time device + driver, for MSS irq10 dma3 in our example. +4) <ctrl>-<alt>-<del> and boot Linux. This will maintain the DOS configuration + and will boot the new kernel with sound driver. The sound driver will find + the audio card and will recognize and attach it. + +Reports on User successes +------------------------- + +> Date: Mon, 29 Jul 1996 08:35:40 +0100 +> From: Mr S J Greenaway <sjg95@unixfe.rl.ac.uk> +> To: riccardo@cdc8g5.cdc.polimi.it (Riccardo Facchetti) +> Subject: Re: Audio Excel DSP 16 initialization code +> +> Just to let you know got my Audio Excel (emulating a MSS) working +> with my original SB16, thanks for the driver! + + +Last revised: 20 August 1998 +Riccardo Facchetti +fizban@tin.it diff --git a/Documentation/sound/oss/CMI8330 b/Documentation/sound/oss/CMI8330 new file mode 100644 index 000000000000..9c439f1a6dba --- /dev/null +++ b/Documentation/sound/oss/CMI8330 @@ -0,0 +1,153 @@ +Documentation for CMI 8330 (SoundPRO) +------------------------------------- +Alessandro Zummo <azummo@ita.flashnet.it> + +( Be sure to read Documentation/sound/oss/SoundPro too ) + + +This adapter is now directly supported by the sb driver. + + The only thing you have to do is to compile the kernel sound +support as a module and to enable kernel ISAPnP support, +as shown below. + + +CONFIG_SOUND=m +CONFIG_SOUND_SB=m + +CONFIG_PNP=y +CONFIG_ISAPNP=y + + +and optionally: + + +CONFIG_SOUND_MPU401=m + + for MPU401 support. + + +(I suggest you to use "make menuconfig" or "make xconfig" + for a more comfortable configuration editing) + + + +Then you can do + + modprobe sb + +and everything will be (hopefully) configured. + +You should get something similar in syslog: + +sb: CMI8330 detected. +sb: CMI8330 sb base located at 0x220 +sb: CMI8330 mpu base located at 0x330 +sb: CMI8330 mail reports to Alessandro Zummo <azummo@ita.flashnet.it> +sb: ISAPnP reports CMI 8330 SoundPRO at i/o 0x220, irq 7, dma 1,5 + + + + +The old documentation file follows for reference +purposes. + + +How to enable CMI 8330 (SOUNDPRO) soundchip on Linux +------------------------------------------ +Stefan Laudat <Stefan.Laudat@asit.ro> + +[Note: The CMI 8338 is unrelated and is supported by cmpci.o] + + + In order to use CMI8330 under Linux you just have to use a proper isapnp.conf, a good isapnp and a little bit of patience. I use isapnp 1.17, but +you may get a better one I guess at http://www.roestock.demon.co.uk/isapnptools/. + + Of course you will have to compile kernel sound support as module, as shown below: + +CONFIG_SOUND=m +CONFIG_SOUND_OSS=m +CONFIG_SOUND_SB=m +CONFIG_SOUND_ADLIB=m +CONFIG_SOUND_MPU401=m +# Mikro$chaft sound system (kinda useful here ;)) +CONFIG_SOUND_MSS=m + + The /etc/isapnp.conf file will be: + +<snip below> + + +(READPORT 0x0203) +(ISOLATE PRESERVE) +(IDENTIFY *) +(VERBOSITY 2) +(CONFLICT (IO FATAL)(IRQ FATAL)(DMA FATAL)(MEM FATAL)) # or WARNING +(VERIFYLD N) + + +# WSS + +(CONFIGURE CMI0001/16777472 (LD 0 +(IO 0 (SIZE 8) (BASE 0x0530)) +(IO 1 (SIZE 8) (BASE 0x0388)) +(INT 0 (IRQ 7 (MODE +E))) +(DMA 0 (CHANNEL 0)) +(NAME "CMI0001/16777472[0]{CMI8330/C3D Audio Adapter}") +(ACT Y) +)) + +# MPU + +(CONFIGURE CMI0001/16777472 (LD 1 +(IO 0 (SIZE 2) (BASE 0x0330)) +(INT 0 (IRQ 11 (MODE +E))) +(NAME "CMI0001/16777472[1]{CMI8330/C3D Audio Adapter}") +(ACT Y) +)) + +# Joystick + +(CONFIGURE CMI0001/16777472 (LD 2 +(IO 0 (SIZE 8) (BASE 0x0200)) +(NAME "CMI0001/16777472[2]{CMI8330/C3D Audio Adapter}") +(ACT Y) +)) + +# SoundBlaster + +(CONFIGURE CMI0001/16777472 (LD 3 +(IO 0 (SIZE 16) (BASE 0x0220)) +(INT 0 (IRQ 5 (MODE +E))) +(DMA 0 (CHANNEL 1)) +(DMA 1 (CHANNEL 5)) +(NAME "CMI0001/16777472[3]{CMI8330/C3D Audio Adapter}") +(ACT Y) +)) + + +(WAITFORKEY) + +<end of snip> + + The module sequence is trivial: + +/sbin/insmod soundcore +/sbin/insmod sound +/sbin/insmod uart401 +# insert this first +/sbin/insmod ad1848 io=0x530 irq=7 dma=0 soundpro=1 +# The sb module is an alternative to the ad1848 (Microsoft Sound System) +# Anyhow, this is full duplex and has MIDI +/sbin/insmod sb io=0x220 dma=1 dma16=5 irq=5 mpu_io=0x330 + + + +Alma Chao <elysian@ethereal.torsion.org> suggests the following /etc/modprobe.conf: + +alias sound ad1848 +alias synth0 opl3 +options ad1848 io=0x530 irq=7 dma=0 soundpro=1 +options opl3 io=0x388 + + diff --git a/Documentation/sound/oss/CMI8338 b/Documentation/sound/oss/CMI8338 new file mode 100644 index 000000000000..387d058c3f95 --- /dev/null +++ b/Documentation/sound/oss/CMI8338 @@ -0,0 +1,85 @@ +Audio driver for CM8338/CM8738 chips by Chen-Li Tien + + +HARDWARE SUPPORTED +================================================================================ +C-Media CMI8338 +C-Media CMI8738 +On-board C-Media chips + + +STEPS TO BUILD DRIVER +================================================================================ + + 1. Backup the Config.in and Makefile in the sound driver directory + (/usr/src/linux/driver/sound). + The Configure.help provide help when you config driver in step + 4, please backup the original one (/usr/src/linux/Document) and + copy this file. + The cmpci is document for the driver in detail, please copy it + to /usr/src/linux/Document/sound so you can refer it. Backup if + there is already one. + + 2. Extract the tar file by 'tar xvzf cmpci-xx.tar.gz' in the above + directory. + + 3. Change directory to /usr/src/linux + + 4. Config cm8338 driver by 'make menuconfig', 'make config' or + 'make xconfig' command. + + 5. Please select Sound Card (CONFIG_SOUND=m) support and CMPCI + driver (CONFIG_SOUND_CMPCI=m) as modules. Resident mode not tested. + For driver option, please refer 'DRIVER PARAMETER' + + 6. Compile the kernel if necessary. + + 7. Compile the modules by 'make modules'. + + 8. Install the modules by 'make modules_install' + + +INSTALL DRIVER +================================================================================ + + 1. Before first time to run the driver, create module dependency by + 'depmod -a' + + 2. To install the driver manually, enter 'modprobe cmpci'. + + 3. Driver installation for various distributions: + + a. Slackware 4.0 + Add the 'modprobe cmpci' command in your /etc/rc.d/rc.modules + file.so you can start the driver automatically each time booting. + + b. Caldera OpenLinux 2.2 + Use LISA to load the cmpci module. + + c. RedHat 6.0 and S.u.S.E. 6.1 + Add following command in /etc/conf.modules: + + alias sound cmpci + + also visit http://www.cmedia.com.tw for installation instruction. + +DRIVER PARAMETER +================================================================================ + + Some functions for the cm8738 can be configured in Kernel Configuration + or modules parameters. Set these parameters to 1 to enable. + + mpuio: I/O ports base for MPU-401, 0 if disabled. + fmio: I/O ports base for OPL-3, 0 if disabled. + spdif_inverse:Inverse the S/PDIF-in signal, this depends on your + CD-ROM or DVD-ROM. + spdif_loop: Enable S/PDIF loop, this route S/PDIF-in to S/PDIF-out + directly. + speakers: Number of speakers used. + use_line_as_rear:Enable this if you want to use line-in as + rear-out. + use_line_as_bass:Enable this if you want to use line-in as + bass-out. + joystick: Enable joystick. You will need to install Linux joystick + driver. + diff --git a/Documentation/sound/oss/CS4232 b/Documentation/sound/oss/CS4232 new file mode 100644 index 000000000000..7d6af7a5c1c2 --- /dev/null +++ b/Documentation/sound/oss/CS4232 @@ -0,0 +1,23 @@ +To configure the Crystal CS423x sound chip and activate its DSP functions, +modules may be loaded in this order: + + modprobe sound + insmod ad1848 + insmod uart401 + insmod cs4232 io=* irq=* dma=* dma2=* + +This is the meaning of the parameters: + + io--I/O address of the Windows Sound System (normally 0x534) + irq--IRQ of this device + dma and dma2--DMA channels (DMA2 may be 0) + +On some cards, the board attempts to do non-PnP setup, and fails. If you +have problems, use Linux' PnP facilities. + +To get MIDI facilities add + + insmod opl3 io=* + +where "io" is the I/O address of the OPL3 synthesizer. This will be shown +in /proc/sys/pnp and is normally 0x388. diff --git a/Documentation/sound/oss/ESS b/Documentation/sound/oss/ESS new file mode 100644 index 000000000000..bba93b4d2def --- /dev/null +++ b/Documentation/sound/oss/ESS @@ -0,0 +1,34 @@ +Documentation for the ESS AudioDrive chips + +In 2.4 kernels the SoundBlaster driver not only tries to detect an ESS chip, it +tries to detect the type of ESS chip too. The correct detection of the chip +doesn't always succeed however, so unless you use the kernel isapnp facilities +(and you chip is pnp capable) the default behaviour is 2.0 behaviour which +means: only detect ES688 and ES1688. + +All ESS chips now have a recording level setting. This is a need-to-have for +people who want to use their ESS for recording sound. + +Every chip that's detected as a later-than-es1688 chip has a 6 bits logarithmic +master volume control. + +Every chip that's detected as a ES1887 now has Full Duplex support. Made a +little testprogram that shows that is works, haven't seen a real program that +needs this however. + +For ESS chips an additional parameter "esstype" can be specified. This controls +the (auto) detection of the ESS chips. It can have 3 kinds of values: + +-1 Act like 2.0 kernels: only detect ES688 or ES1688. +0 Try to auto-detect the chip (may fail for ES1688) +688 The chip will be treated as ES688 +1688 ,, ,, ,, ,, ,, ,, ES1688 +1868 ,, ,, ,, ,, ,, ,, ES1868 +1869 ,, ,, ,, ,, ,, ,, ES1869 +1788 ,, ,, ,, ,, ,, ,, ES1788 +1887 ,, ,, ,, ,, ,, ,, ES1887 +1888 ,, ,, ,, ,, ,, ,, ES1888 + +Because Full Duplex is supported for ES1887 you can specify a second DMA +channel by specifying module parameter dma16. It can be one of: 0, 1, 3 or 5. + diff --git a/Documentation/sound/oss/ESS1868 b/Documentation/sound/oss/ESS1868 new file mode 100644 index 000000000000..55e922f21bc0 --- /dev/null +++ b/Documentation/sound/oss/ESS1868 @@ -0,0 +1,55 @@ +Documentation for the ESS1868F AudioDrive PnP sound card + +The ESS1868 sound card is a PnP ESS1688-compatible 16-bit sound card. + +It should be automatically detected by the Linux Kernel isapnp support when you +load the sb.o module. Otherwise you should take care of: + + * The ESS1868 does not allow use of a 16-bit DMA, thus DMA 0, 1, 2, and 3 + may only be used. + + * isapnptools version 1.14 does work with ESS1868. Earlier versions might + not. + + * Sound support MUST be compiled as MODULES, not statically linked + into the kernel. + + +NOTE: this is only needed when not using the kernel isapnp support! + +For configuring the sound card's I/O addresses, IRQ and DMA, here is a +sample copy of the isapnp.conf directives regarding the ESS1868: + +(CONFIGURE ESS1868/-1 (LD 1 +(IO 0 (BASE 0x0220)) +(IO 1 (BASE 0x0388)) +(IO 2 (BASE 0x0330)) +(DMA 0 (CHANNEL 1)) +(INT 0 (IRQ 5 (MODE +E))) +(ACT Y) +)) + +(for a full working isapnp.conf file, remember the +(ISOLATE) +(IDENTIFY *) +at the beginning and the +(WAITFORKEY) +at the end.) + +In this setup, the main card I/O is 0x0220, FM synthesizer is 0x0388, and +the MPU-401 MIDI port is located at 0x0330. IRQ is IRQ 5, DMA is channel 1. + +After configuring the sound card via isapnp, to use the card you must load +the sound modules with the proper I/O information. Here is my setup: + +# ESS1868F AudioDrive initialization + +/sbin/modprobe sound +/sbin/insmod uart401 +/sbin/insmod sb io=0x220 irq=5 dma=1 dma16=-1 +/sbin/insmod mpu401 io=0x330 +/sbin/insmod opl3 io=0x388 +/sbin/insmod v_midi + +opl3 is the FM synthesizer +/sbin/insmod opl3 io=0x388 diff --git a/Documentation/sound/oss/INSTALL.awe b/Documentation/sound/oss/INSTALL.awe new file mode 100644 index 000000000000..310f42ca1e83 --- /dev/null +++ b/Documentation/sound/oss/INSTALL.awe @@ -0,0 +1,134 @@ +================================================================ + INSTALLATION OF AWE32 SOUND DRIVER FOR LINUX + Takashi Iwai <iwai@ww.uni-erlangen.de> +================================================================ + +---------------------------------------------------------------- +* Attention to SB-PnP Card Users + +If you're using PnP cards, the initialization of PnP is required +before loading this driver. You have now three options: + 1. Use isapnptools. + 2. Use in-kernel isapnp support. + 3. Initialize PnP on DOS/Windows, then boot linux by loadlin. +In this document, only the case 1 case is treated. + +---------------------------------------------------------------- +* Installation on Red Hat 5.0 Sound Driver + +Please use install-rh.sh under RedHat5.0 directory. +DO NOT USE install.sh below. +See INSTALL.RH for more details. + +---------------------------------------------------------------- +* Installation/Update by Shell Script + + 1. Become root + + % su + + 2. If you have never configured the kernel tree yet, run make config + once (to make dependencies and symlinks). + + # cd /usr/src/linux + # make xconfig + + 3. Run install.sh script + + # sh ./install.sh + + 4. Configure your kernel + + (for Linux 2.[01].x user) + # cd /usr/src/linux + # make xconfig (or make menuconfig) + + (for Linux 1.2.x user) + # cd /usr/src/linux + # make config + + Answer YES to both "lowlevel drivers" and "AWE32 wave synth" items + in Sound menu. ("lowlevel drivers" will appear only in 2.x + kernel.) + + 5. Make your kernel (and modules), and install them as usual. + + 5a. make kernel image + # make zImage + + 5b. make modules and install them + # make modules && make modules_install + + 5c. If you're using lilo, copy the kernel image and run lilo. + Otherwise, copy the kernel image to suitable directory or + media for your system. + + 6. Reboot the kernel if necessary. + - If you updated only the modules, you don't have to reboot + the system. Just remove the old sound modules here. + in + # rmmod sound.o (linux-2.0 or OSS/Free) + # rmmod awe_wave.o (linux-2.1) + + 7. If your AWE card is a PnP and not initialized yet, you'll have to + do it by isapnp tools. Otherwise, skip to 8. + + This section described only a brief explanation. For more + details, please see the AWE64-Mini-HOWTO or isapnp tools FAQ. + + 7a. If you have no isapnp.conf file, generate it by pnpdump. + Otherwise, skip to 7d. + # pnpdump > /etc/isapnp.conf + + 7b. Edit isapnp.conf file. Comment out the appropriate + lines containing desirable I/O ports, DMA and IRQs. + Don't forget to enable (ACT Y) line. + + 7c. Add two i/o ports (0xA20 and 0xE20) in WaveTable part. + ex) + (CONFIGURE CTL0048/58128 (LD 2 + # ANSI string -->WaveTable<-- + (IO 0 (BASE 0x0620)) + (IO 1 (BASE 0x0A20)) + (IO 2 (BASE 0x0E20)) + (ACT Y) + )) + + 7d. Load the config file. + CAUTION: This will reset all PnP cards! + + # isapnp /etc/isapnp.conf + + 8. Load the sound module (if you configured it as a module): + + for 2.0 kernel or OSS/Free monolithic module: + + # modprobe sound.o + + for 2.1 kernel: + + # modprobe sound + # insmod uart401 + # insmod sb io=0x220 irq=5 dma=1 dma16=5 mpu_io=0x330 + (These values depend on your settings.) + # insmod awe_wave + (Be sure to load awe_wave after sb!) + + See Documentation/sound/oss/AWE32 for + more details. + + 9. (only for obsolete systems) If you don't have /dev/sequencer + device file, make it according to Readme.linux file on + /usr/src/linux/drivers/sound. (Run a shell script included in + that file). <-- This file no longer exists in the recent kernels! + + 10. OK, load your own soundfont file, and enjoy MIDI! + + % sfxload synthgm.sbk + % drvmidi foo.mid + + 11. For more advanced use (eg. dynamic loading, virtual bank and + etc.), please read the awedrv FAQ or the instructions in awesfx + and awemidi packages. + +Good luck! diff --git a/Documentation/sound/oss/Introduction b/Documentation/sound/oss/Introduction new file mode 100644 index 000000000000..15d4fb975ac0 --- /dev/null +++ b/Documentation/sound/oss/Introduction @@ -0,0 +1,459 @@ +Introduction Notes on Modular Sound Drivers and Soundcore +Wade Hampton +2/14/2001 + +Purpose: +======== +This document provides some general notes on the modular +sound drivers and their configuration, along with the +support modules sound.o and soundcore.o. + +Note, some of this probably should be added to the Sound-HOWTO! + +Note, soundlow.o was present with 2.2 kernels but is not +required for 2.4.x kernels. References have been removed +to this. + + +Copying: +======== +none + + +History: +======== +0.1.0 11/20/1998 First version, draft +1.0.0 11/1998 Alan Cox changes, incorporation in 2.2.0 + as Documentation/sound/oss/Introduction +1.1.0 6/30/1999 Second version, added notes on making the drivers, + added info on multiple sound cards of similar types,] + added more diagnostics info, added info about esd. + added info on OSS and ALSA. +1.1.1 19991031 Added notes on sound-slot- and sound-service. + (Alan Cox) +1.1.2 20000920 Modified for Kernel 2.4 (Christoph Hellwig) +1.1.3 20010214 Minor notes and corrections (Wade Hampton) + Added examples of sound-slot-0, etc. + + +Modular Sound Drivers: +====================== + +Thanks to the GREAT work by Alan Cox (alan@lxorguk.ukuu.org.uk), + +[And Oleg Drokin, Thomas Sailer, Andrew Veliath and more than a few + others - not to mention Hannu's original code being designed well + enough to cope with that kind of chopping up](Alan) + +the standard Linux kernels support a modular sound driver. From +Alan's comments in linux/drivers/sound/README.FIRST: + + The modular sound driver patches were funded by Red Hat Software + (www.redhat.com). The sound driver here is thus a modified version of + Hannu's code. Please bear that in mind when considering the appropriate + forums for bug reporting. + +The modular sound drivers may be loaded via insmod or modprobe. +To support all the various sound modules, there are two general +support modules that must be loaded first: + + soundcore.o: Top level handler for the sound system, provides + a set of functions for registration of devices + by type. + + sound.o: Common sound functions required by all modules. + +For the specific sound modules (e.g., sb.o for the Soundblaster), +read the documentation on that module to determine what options +are available, for example IRQ, address, DMA. + +Warning, the options for different cards sometime use different names +for the same or a similar feature (dma1= versus dma16=). As a last +resort, inspect the code (search for MODULE_PARM). + +Notes: + +1. There is a new OpenSource sound driver called ALSA which is + currently under development: http://www.alsa-project.org/ + The ALSA drivers support some newer hardware that may not + be supported by this sound driver and also provide some + additional features. + +2. The commercial OSS driver may be obtained from the site: + http://www/opensound.com. This may be used for cards that + are unsupported by the kernel driver, or may be used + by other operating systems. + +3. The enlightenment sound daemon may be used for playing + multiple sounds at the same time via a single card, eliminating + some of the requirements for multiple sound card systems. For + more information, see: http://www.tux.org/~ricdude/EsounD.html + The "esd" program may be used with the real-player and mpeg + players like mpg123 and x11amp. The newer real-player + and some games even include built-in support for ESD! + + +Building the Modules: +===================== + +This document does not provide full details on building the +kernel, etc. The notes below apply only to making the kernel +sound modules. If this conflicts with the kernel's README, +the README takes precedence. + +1. To make the kernel sound modules, cd to your /usr/src/linux + directory (typically) and type make config, make menuconfig, + or make xconfig (to start the command line, dialog, or x-based + configuration tool). + +2. Select the Sound option and a dialog will be displayed. + +3. Select M (module) for "Sound card support". + +4. Select your sound driver(s) as a module. For ProAudio, Sound + Blaster, etc., select M (module) for OSS sound modules. + [thanks to Marvin Stodolsky <stodolsk@erols.com>]A + +5. Make the kernel (e.g., make bzImage), and install the kernel. + +6. Make the modules and install them (make modules; make modules_install). + +Note, for 2.5.x kernels, make sure you have the newer module-init-tools +installed or modules will not be loaded properly. 2.5.x requires an +updated module-init-tools. + + +Plug and Play (PnP: +=================== + +If the sound card is an ISA PnP card, isapnp may be used +to configure the card. See the file isapnp.txt in the +directory one level up (e.g., /usr/src/linux/Documentation). + +Also the 2.4.x kernels provide PnP capabilities, see the +file NEWS in this directory. + +PCI sound cards are highly recommended, as they are far +easier to configure and from what I have read, they use +less resources and are more CPU efficient. + + +INSMOD: +======= + +If loading via insmod, the common modules must be loaded in the +order below BEFORE loading the other sound modules. The card-specific +modules may then be loaded (most require parameters). For example, +I use the following via a shell script to load my SoundBlaster: + +SB_BASE=0x240 +SB_IRQ=9 +SB_DMA=3 +SB_DMA2=5 +SB_MPU=0x300 +# +echo Starting sound +/sbin/insmod soundcore +/sbin/insmod sound +# +echo Starting sound blaster.... +/sbin/insmod uart401 +/sbin/insmod sb io=$SB_BASE irq=$SB_IRQ dma=$SB_DMA dma16=$SB_DMA2 mpu_io=$SB_MP + +When using sound as a module, I typically put these commands +in a file such as /root/soundon.sh. + + +MODPROBE: +========= + +If loading via modprobe, these common files are automatically loaded +when requested by modprobe. For example, my /etc/modprobe.conf contains: + +alias sound sb +options sb io=0x240 irq=9 dma=3 dma16=5 mpu_io=0x300 + +All you need to do to load the module is: + + /sbin/modprobe sb + + +Sound Status: +============= + +The status of sound may be read/checked by: + cat (anyfile).au >/dev/audio + +[WWH: This may not work properly for SoundBlaster PCI 128 cards +such as the es1370/1 (see the es1370/1 files in this directory) +as they do not automatically support uLaw on /dev/audio.] + +The status of the modules and which modules depend on +which other modules may be checked by: + /sbin/lsmod + +/sbin/lsmod should show something like the following: + sb 26280 0 + uart401 5640 0 [sb] + sound 57112 0 [sb uart401] + soundcore 1968 8 [sb sound] + + +Removing Sound: +=============== + +Sound may be removed by using /sbin/rmmod in the reverse order +in which you load the modules. Note, if a program has a sound device +open (e.g., xmixer), that module (and the modules on which it +depends) may not be unloaded. + +For example, I use the following to remove my Soundblaster (rmmod +in the reverse order in which I loaded the modules): + +/sbin/rmmod sb +/sbin/rmmod uart401 +/sbin/rmmod sound +/sbin/rmmod soundcore + +When using sound as a module, I typically put these commands +in a script such as /root/soundoff.sh. + + +Removing Sound for use with OSS: +================================ + +If you get really stuck or have a card that the kernel modules +will not support, you can get a commercial sound driver from +http://www.opensound.com. Before loading the commercial sound +driver, you should do the following: + +1. remove sound modules (detailed above) +2. remove the sound modules from /etc/modprobe.conf +3. move the sound modules from /lib/modules/<kernel>/misc + (for example, I make a /lib/modules/<kernel>/misc/tmp + directory and copy the sound module files to that + directory). + + +Multiple Sound Cards: +===================== + +The sound drivers will support multiple sound cards and there +are some great applications like multitrack that support them. +Typically, you need two sound cards of different types. Note, this +uses more precious interrupts and DMA channels and sometimes +can be a configuration nightmare. I have heard reports of 3-4 +sound cards (typically I only use 2). You can sometimes use +multiple PCI sound cards of the same type. + +On my machine I have two sound cards (cs4232 and Soundblaster Vibra +16). By loading sound as modules, I can control which is the first +sound device (/dev/dsp, /dev/audio, /dev/mixer) and which is +the second. Normally, the cs4232 (Dell sound on the motherboard) +would be the first sound device, but I prefer the Soundblaster. +All you have to do is to load the one you want as /dev/dsp +first (in my case "sb") and then load the other one +(in my case "cs4232"). + +If you have two cards of the same type that are jumpered +cards or different PnP revisions, you may load the same +module twice. For example, I have a SoundBlaster vibra 16 +and an older SoundBlaster 16 (jumpers). To load the module +twice, you need to do the following: + +1. Copy the sound modules to a new name. For example + sb.o could be copied (or symlinked) to sb1.o for the + second SoundBlaster. + +2. Make a second entry in /etc/modprobe.conf, for example, + sound1 or sb1. This second entry should refer to the + new module names for example sb1, and should include + the I/O, etc. for the second sound card. + +3. Update your soundon.sh script, etc. + +Warning: I have never been able to get two PnP sound cards of the +same type to load at the same time. I have tried this several times +with the Soundblaster Vibra 16 cards. OSS has indicated that this +is a PnP problem.... If anyone has any luck doing this, please +send me an E-MAIL. PCI sound cards should not have this problem.a +Since this was originally release, I have received a couple of +mails from people who have accomplished this! + +NOTE: In Linux 2.4 the Sound Blaster driver (and only this one yet) +supports multiple cards with one module by default. +Read the file 'Soundblaster' in this directory for details. + + +Sound Problems: +=============== + +First RTFM (including the troubleshooting section +in the Sound-HOWTO). + +1) If you are having problems loading the modules (for + example, if you get device conflict errors) try the + following: + + A) If you have Win95 or NT on the same computer, + write down what addresses, IRQ, and DMA channels + those were using for the same hardware. You probably + can use these addresses, IRQs, and DMA channels. + You should really do this BEFORE attempting to get + sound working! + + B) Check (cat) /proc/interrupts, /proc/ioports, + and /proc/dma. Are you trying to use an address, + IRQ or DMA port that another device is using? + + C) Check (cat) /proc/isapnp + + D) Inspect your /var/log/messages file. Often that will + indicate what IRQ or IO port could not be obtained. + + E) Try another port or IRQ. Note this may involve + using the PnP tools to move the sound card to + another location. Sometimes this is the only way + and it is more or less trial and error. + +2) If you get motor-boating (the same sound or part of a + sound clip repeated), you probably have either an IRQ + or DMA conflict. Move the card to another IRQ or DMA + port. This has happened to me when playing long files + when I had an IRQ conflict. + +3. If you get dropouts or pauses when playing high sample + rate files such as using mpg123 or x11amp/xmms, you may + have too slow of a CPU and may have to use the options to + play the files at 1/2 speed. For example, you may use + the -2 or -4 option on mpg123. You may also get this + when trying to play mpeg files stored on a CD-ROM + (my Toshiba T8000 PII/366 sometimes has this problem). + +4. If you get "cannot access device" errors, your /dev/dsp + files, etc. may be set to owner root, mode 600. You + may have to use the command: + chmod 666 /dev/dsp /dev/mixer /dev/audio + +5. If you get "device busy" errors, another program has the + sound device open. For example, if using the Enlightenment + sound daemon "esd", the "esd" program has the sound device. + If using "esd", please RTFM the docs on ESD. For example, + esddsp <program> may be used to play files via a non-esd + aware program. + +6) Ask for help on the sound list or send E-MAIL to the + sound driver author/maintainer. + +7) Turn on debug in drivers/sound/sound_config.h (DEB, DDB, MDB). + +8) If the system reports insufficient DMA memory then you may want to + load sound with the "dmabufs=1" option. Or in /etc/conf.modules add + + preinstall sound dmabufs=1 + + This makes the sound system allocate its buffers and hang onto them. + + You may also set persistent DMA when building a 2.4.x kernel. + + +Configuring Sound: +================== + +There are several ways of configuring your sound: + +1) On the kernel command line (when using the sound driver(s) + compiled in the kernel). Check the driver source and + documentation for details. + +2) On the command line when using insmod or in a bash script + using command line calls to load sound. + +3) In /etc/modprobe.conf when using modprobe. + +4) Via Red Hat's GPL'd /usr/sbin/sndconfig program (text based). + +5) Via the OSS soundconf program (with the commercial version + of the OSS driver. + +6) By just loading the module and let isapnp do everything relevant + for you. This works only with a few drivers yet and - of course - + only with isapnp hardware. + +And I am sure, several other ways. + +Anyone want to write a linuxconf module for configuring sound? + + +Module Loading: +=============== + +When a sound card is first referenced and sound is modular, the sound system +will ask for the sound devices to be loaded. Initially it requests that +the driver for the sound system is loaded. It then will ask for +sound-slot-0, where 0 is the first sound card. (sound-slot-1 the second and +so on). Thus you can do + +alias sound-slot-0 sb + +To load a soundblaster at this point. If the slot loading does not provide +the desired device - for example a soundblaster does not directly provide +a midi synth in all cases then it will request "sound-service-0-n" where n +is + + 0 Mixer + + 2 MIDI + + 3, 4 DSP audio + + +For example, I use the following to load my Soundblaster PCI 128 +(ES 1371) card first, followed by my SoundBlaster Vibra 16 card, +then by my TV card: + +# Load the Soundblaster PCI 128 as /dev/dsp, /dev/dsp1, /dev/mixer +alias sound-slot-0 es1371 + +# Load the Soundblaster Vibra 16 as /dev/dsp2, /dev/mixer1 +alias sound-slot-1 sb +options sb io=0x240 irq=5 dma=1 dma16=5 mpu_io=0x330 + +# Load the BTTV (TV card) as /dev/mixer2 +alias sound-slot-2 bttv +alias sound-service-2-0 tvmixer + +pre-install bttv modprobe tuner ; modprobe tvmixer +pre-install tvmixer modprobe msp3400; modprobe tvaudio +options tuner debug=0 type=8 +options bttv card=0 radio=0 pll=0 + + +For More Information (RTFM): +============================ +1) Information on kernel modules: manual pages for insmod and modprobe. + +2) Information on PnP, RTFM manual pages for isapnp. + +3) Sound-HOWTO and Sound-Playing-HOWTO. + +4) OSS's WWW site at http://www.opensound.com. + +5) All the files in Documentation/sound. + +6) The comments and code in linux/drivers/sound. + +7) The sndconfig and rhsound documentation from Red Hat. + +8) The Linux-sound mailing list: sound-list@redhat.com. + +9) Enlightenment documentation (for info on esd) + http://www.tux.org/~ricdude/EsounD.html. + +10) ALSA home page: http://www.alsa-project.org/ + + +Contact Information: +==================== +Wade Hampton: (whampton@staffnet.com) + diff --git a/Documentation/sound/oss/MAD16 b/Documentation/sound/oss/MAD16 new file mode 100644 index 000000000000..865dbd848742 --- /dev/null +++ b/Documentation/sound/oss/MAD16 @@ -0,0 +1,56 @@ +(This recipe has been edited to update the configuration symbols, + and change over to modprobe.conf for 2.6) + +From: Shaw Carruthers <shaw@shawc.demon.co.uk> + +I have been using mad16 sound for some time now with no problems, current +kernel 2.1.89 + +lsmod shows: + +mad16 5176 0 +sb 22044 0 [mad16] +uart401 5576 0 [mad16 sb] +ad1848 14176 1 [mad16] +sound 61928 0 [mad16 sb uart401 ad1848] + +.config has: + +CONFIG_SOUND=m +CONFIG_SOUND_ADLIB=m +CONFIG_SOUND_MAD16=m +CONFIG_SOUND_YM3812=m + +modprobe.conf has: + +alias char-major-14-* mad16 +options sb mad16=1 +options mad16 io=0x530 irq=7 dma=0 dma16=1 && /usr/local/bin/aumix -w 15 -p 20 -m 0 -1 0 -2 0 -3 0 -i 0 + + +To get the built in mixer to work this needs to be: + +options adlib_card io=0x388 # FM synthesizer +options sb mad16=1 +options mad16 io=0x530 irq=7 dma=0 dma16=1 mpu_io=816 mpu_irq=5 && /usr/local/bin/aumix -w 15 -p 20 -m 0 -1 0 -2 0 -3 0 -i 0 + +The addition of the "mpu_io=816 mpu_irq=5" to the mad16 options line is + +------------------------------------------------------------------------ +The mad16 module in addition supports the following options: + +option: meaning: default: +joystick=0,1 disabled, enabled disabled +cdtype=0x00,0x02,0x04, disabled, Sony CDU31A, disabled + 0x06,0x08,0x0a Mitsumi, Panasonic, + Secondary IDE, Primary IDE +cdport=0x340,0x320, 0x340 + 0x330,0x360 +cdirq=0,3,5,7,9,10,11 disabled, IRQ3, ... disabled +cddma=0,5,6,7 disabled, DMA5, ... DMA5 for Mitsumi or IDE +cddma=0,1,2,3 disabled, DMA1, ... DMA3 for Sony or Panasonic +opl4=0,1 OPL3, OPL4 OPL3 + +for more details see linux/drivers/sound/mad16.c + +Rui Sousa diff --git a/Documentation/sound/oss/Maestro b/Documentation/sound/oss/Maestro new file mode 100644 index 000000000000..4a80eb3f8e00 --- /dev/null +++ b/Documentation/sound/oss/Maestro @@ -0,0 +1,123 @@ + An OSS/Lite Driver for the ESS Maestro family of sound cards + + Zach Brown, December 1999 + +Driver Status and Availability +------------------------------ + +The most recent version of this driver will hopefully always be available at + http://www.zabbo.net/maestro/ + +I will try and maintain the most recent stable version of the driver +in both the stable and development kernel lines. + +ESS Maestro Chip Family +----------------------- + +There are 3 main variants of the ESS Maestro PCI sound chip. The first +is the Maestro 1. It was originally produced by Platform Tech as the +'AGOGO'. It can be recognized by Platform Tech's PCI ID 0x1285 with +0x0100 as the device ID. It was put on some sound boards and a few laptops. +ESS bought the design and cleaned it up as the Maestro 2. This starts +their marking with the ESS vendor ID 0x125D and the 'year' device IDs. +The Maestro 2 claims 0x1968 while the Maestro 2e has 0x1978. + +The various families of Maestro are mostly identical as far as this +driver is concerned. It doesn't touch the DSP parts that differ (though +it could for FM synthesis). + +Driver OSS Behavior +-------------------- + +This OSS driver exports /dev/mixer and /dev/dsp to applications, which +mostly adhere to the OSS spec. This driver doesn't register itself +with /dev/sndstat, so don't expect information to appear there. + +The /dev/dsp device exported behaves almost as expected. Playback is +supported in all the various lovely formats. 8/16bit stereo/mono from +8khz to 48khz, and mmap()ing for playback behaves. Capture/recording +is limited due to oddities with the Maestro hardware. One can only +record in 16bit stereo. For recording the maestro uses non interleaved +stereo buffers so that mmap()ing the incoming data does not result in +a ring buffer of LRLR data. mmap()ing of the read buffers is therefore +disallowed until this can be cleaned up. + +/dev/mixer is an interface to the AC'97 codec on the Maestro. It is +worth noting that there are a variety of AC'97s that can be wired to +the Maestro. Which is used is entirely up to the hardware implementor. +This should only be visible to the user by the presence, or lack, of +'Bass' and 'Treble' sliders in the mixer. Not all AC'97s have them. + +The driver doesn't support MIDI or FM playback at the moment. Typically +the Maestro is wired to an MPU MIDI chip, but some hardware implementations +don't. We need to assemble a white list of hardware implementations that +have MIDI wired properly before we can claim to support it safely. + +Compiling and Installing +------------------------ + +With the drivers inclusion into the kernel, compiling and installing +is the same as most OSS/Lite modular sound drivers. Compilation +of the driver is enabled through the CONFIG_SOUND_MAESTRO variable +in the config system. + +It may be modular or statically linked. If it is modular it should be +installed with the rest of the modules for the kernel on the system. +Typically this will be in /lib/modules/ somewhere. 'alias sound maestro' +should also be added to your module configs (typically /etc/conf.modules) +if you're using modular OSS/Lite sound and want to default to using a +maestro chip. + +As this is a PCI device, the module does not need to be informed of +any IO or IRQ resources it should use, it devines these from the +system. Sometimes, on sucky PCs, the BIOS fails to allocated resources +for the maestro. This will result in a message like: + maestro: PCI subsystem reports IRQ 0, this might not be correct. +from the kernel. Should this happen the sound chip most likely will +not operate correctly. To solve this one has to dig through their BIOS +(typically entered by hitting a hot key at boot time) and figure out +what magic needs to happen so that the BIOS will reward the maestro with +an IRQ. This operation is incredibly system specific, so you're on your +own. Sometimes the magic lies in 'PNP Capable Operating System' settings. + +There are very few options to the driver. One is 'debug' which will +tell the driver to print minimal debugging information as it runs. This +can be collected with 'dmesg' or through the klogd daemon. + +The other, more interesting option, is 'dsps_order'. Typically at +install time the driver will only register one available /dev/dsp device +for its use. The 'dsps_order' module parameter allows for more devices +to be allocated, as a power of two. Up to 4 devices can be registered +( dsps_order=2 ). These devices act as fully distinct units and use +separate channels in the maestro. + +Power Management +---------------- + +As of version 0.14, this driver has a minimal understanding of PCI +Power Management. If it finds a valid power management capability +on the PCI device it will attempt to use the power management +functions of the maestro. It will only do this on Maestro 2Es and +only on machines that are known to function well. You can +force the use of power management by setting the 'use_pm' module +option to 1, or can disable it entirely by setting it to 0. + +When using power management, the driver does a few things +differently. It will keep the chip in a lower power mode +when the module is inserted but /dev/dsp is not open. This +allows the mixer to function but turns off the clocks +on other parts of the chip. When /dev/dsp is opened the chip +is brought into full power mode, and brought back down +when it is closed. It also powers down the chip entirely +when the module is removed or the machine is shutdown. This +can have nonobvious consequences. CD audio may not work +after a power managing driver is removed. Also, software that +doesn't understand power management may not be able to talk +to the powered down chip until the machine goes through a hard +reboot to bring it back. + +.. more details .. +------------------ + +drivers/sound/maestro.c contains comments that hopefully explain +the maestro implementation. diff --git a/Documentation/sound/oss/Maestro3 b/Documentation/sound/oss/Maestro3 new file mode 100644 index 000000000000..a113718e8034 --- /dev/null +++ b/Documentation/sound/oss/Maestro3 @@ -0,0 +1,92 @@ + An OSS/Lite Driver for the ESS Maestro3 family of sound chips + + Zach Brown, January 2001 + +Driver Status and Availability +------------------------------ + +The most recent version of this driver will hopefully always be available at + http://www.zabbo.net/maestro3/ + +I will try and maintain the most recent stable version of the driver +in both the stable and development kernel lines. + +Historically I've sucked pretty hard at actually doing that, however. + +ESS Maestro3 Chip Family +----------------------- + +The 'Maestro3' is much like the Maestro2 chip. The noted improvement +is the removal of the silicon in the '2' that did PCM mixing. All that +work is now done through a custom DSP called the ASSP, the Asynchronus +Specific Signal Processor. + +The 'Allegro' is a baby version of the Maestro3. I'm not entirely clear +on the extent of the differences, but the driver supports them both :) + +The 'Allegro' shows up as PCI ID 0x1988 and the Maestro3 as 0x1998, +both under ESS's vendor ID of 0x125D. The Maestro3 can also show up as +0x199a when hardware strapping is used. + +The chip can also act as a multi function device. The modem IDs follow +the audio multimedia device IDs. (so the modem part of an Allegro shows +up as 0x1989) + +Driver OSS Behavior +-------------------- + +This OSS driver exports /dev/mixer and /dev/dsp to applications, which +mostly adhere to the OSS spec. This driver doesn't register itself +with /dev/sndstat, so don't expect information to appear there. + +The /dev/dsp device exported behaves as expected. Playback is +supported in all the various lovely formats. 8/16bit stereo/mono from +8khz to 48khz, with both read()/write(), and mmap(). + +/dev/mixer is an interface to the AC'97 codec on the Maestro3. It is +worth noting that there are a variety of AC'97s that can be wired to +the Maestro3. Which is used is entirely up to the hardware implementor. +This should only be visible to the user by the presence, or lack, of +'Bass' and 'Treble' sliders in the mixer. Not all AC'97s have them. +The Allegro has an onchip AC'97. + +The driver doesn't support MIDI or FM playback at the moment. + +Compiling and Installing +------------------------ + +With the drivers inclusion into the kernel, compiling and installing +is the same as most OSS/Lite modular sound drivers. Compilation +of the driver is enabled through the CONFIG_SOUND_MAESTRO3 variable +in the config system. + +It may be modular or statically linked. If it is modular it should be +installed with the rest of the modules for the kernel on the system. +Typically this will be in /lib/modules/ somewhere. 'alias sound-slot-0 +maestro3' should also be added to your module configs (typically +/etc/modprobe.conf) if you're using modular OSS/Lite sound and want to +default to using a maestro3 chip. + +There are very few options to the driver. One is 'debug' which will +tell the driver to print minimal debugging information as it runs. This +can be collected with 'dmesg' or through the klogd daemon. + +One is 'external_amp', which tells the driver to attempt to enable +an external amplifier. This defaults to '1', you can tell the driver +not to bother enabling such an amplifier by setting it to '0'. + +And the last is 'gpio_pin', which tells the driver which GPIO pin number +the external amp uses (0-15), The Allegro uses 8 by default, all others 1. +If everything loads correctly and seems to be working but you get no sound, +try tweaking this value. + +Systems known to need a different value + Panasonic ToughBook CF-72: gpio_pin=13 + +Power Management +---------------- + +This driver has a minimal understanding of PCI Power Management. It will +try and power down the chip when the system is suspended, and power +it up with it is resumed. It will also try and power down the chip +when the machine is shut down. diff --git a/Documentation/sound/oss/MultiSound b/Documentation/sound/oss/MultiSound new file mode 100644 index 000000000000..e4a18bb7f73a --- /dev/null +++ b/Documentation/sound/oss/MultiSound @@ -0,0 +1,1137 @@ +#! /bin/sh +# +# Turtle Beach MultiSound Driver Notes +# -- Andrew Veliath <andrewtv@usa.net> +# +# Last update: September 10, 1998 +# Corresponding msnd driver: 0.8.3 +# +# ** This file is a README (top part) and shell archive (bottom part). +# The corresponding archived utility sources can be unpacked by +# running `sh MultiSound' (the utilities are only needed for the +# Pinnacle and Fiji cards). ** +# +# +# -=-=- Getting Firmware -=-=- +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# See the section `Obtaining and Creating Firmware Files' in this +# document for instructions on obtaining the necessary firmware +# files. +# +# +# Supported Features +# ~~~~~~~~~~~~~~~~~~ +# +# Currently, full-duplex digital audio (/dev/dsp only, /dev/audio is +# not currently available) and mixer functionality (/dev/mixer) are +# supported (memory mapped digital audio is not yet supported). +# Digital transfers and monitoring can be done as well if you have +# the digital daughterboard (see the section on using the S/PDIF port +# for more information). +# +# Support for the Turtle Beach MultiSound Hurricane architecture is +# composed of the following modules (these can also operate compiled +# into the kernel): +# +# msnd - MultiSound base (requires soundcore) +# +# msnd_classic - Base audio/mixer support for Classic, Monetery and +# Tahiti cards +# +# msnd_pinnacle - Base audio/mixer support for Pinnacle and Fiji cards +# +# +# Important Notes - Read Before Using +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# The firmware files are not included (may change in future). You +# must obtain these images from Turtle Beach (they are included in +# the MultiSound Development Kits), and place them in /etc/sound for +# example, and give the full paths in the Linux configuration. If +# you are compiling in support for the MultiSound driver rather than +# using it as a module, these firmware files must be accessible +# during kernel compilation. +# +# Please note these files must be binary files, not assembler. See +# the section later in this document for instructions to obtain these +# files. +# +# +# Configuring Card Resources +# ~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# ** This section is very important, as your card may not work at all +# or your machine may crash if you do not do this correctly. ** +# +# * Classic/Monterey/Tahiti +# +# These cards are configured through the driver msnd_classic. You must +# know the io port, then the driver will select the irq and memory resources +# on the card. It is up to you to know if these are free locations or now, +# a conflict can lock the machine up. +# +# * Pinnacle/Fiji +# +# The Pinnacle and Fiji cards have an extra config port, either +# 0x250, 0x260 or 0x270. This port can be disabled to have the card +# configured strictly through PnP, however you lose the ability to +# access the IDE controller and joystick devices on this card when +# using PnP. The included pinnaclecfg program in this shell archive +# can be used to configure the card in non-PnP mode, and in PnP mode +# you can use isapnptools. These are described briefly here. +# +# pinnaclecfg is not required; you can use the msnd_pinnacle module +# to fully configure the card as well. However, pinnaclecfg can be +# used to change the resource values of a particular device after the +# msnd_pinnacle module has been loaded. If you are compiling the +# driver into the kernel, you must set these values during compile +# time, however other peripheral resource values can be changed with +# the pinnaclecfg program after the kernel is loaded. +# +# +# *** PnP mode +# +# Use pnpdump to obtain a sample configuration if you can; I was able +# to obtain one with the command `pnpdump 1 0x203' -- this may vary +# for you (running pnpdump by itself did not work for me). Then, +# edit this file and use isapnp to uncomment and set the card values. +# Use these values when inserting the msnd_pinnacle module. Using +# this method, you can set the resources for the DSP and the Kurzweil +# synth (Pinnacle). Since Linux does not directly support PnP +# devices, you may have difficulty when using the card in PnP mode +# when it the driver is compiled into the kernel. Using non-PnP mode +# is preferable in this case. +# +# Here is an example mypinnacle.conf for isapnp that sets the card to +# io base 0x210, irq 5 and mem 0xd8000, and also sets the Kurzweil +# synth to 0x330 and irq 9 (may need editing for your system): +# +# (READPORT 0x0203) +# (CSN 2) +# (IDENTIFY *) +# +# # DSP +# (CONFIGURE BVJ0440/-1 (LD 0 +# (INT 0 (IRQ 5 (MODE +E))) (IO 0 (BASE 0x0210)) (MEM 0 (BASE 0x0d8000)) +# (ACT Y))) +# +# # Kurzweil Synth (Pinnacle Only) +# (CONFIGURE BVJ0440/-1 (LD 1 +# (IO 0 (BASE 0x0330)) (INT 0 (IRQ 9 (MODE +E))) +# (ACT Y))) +# +# (WAITFORKEY) +# +# +# *** Non-PnP mode +# +# The second way is by running the card in non-PnP mode. This +# actually has some advantages in that you can access some other +# devices on the card, such as the joystick and IDE controller. To +# configure the card, unpack this shell archive and build the +# pinnaclecfg program. Using this program, you can assign the +# resource values to the card's devices, or disable the devices. As +# an alternative to using pinnaclecfg, you can specify many of the +# configuration values when loading the msnd_pinnacle module (or +# during kernel configuration when compiling the driver into the +# kernel). +# +# If you specify cfg=0x250 for the msnd_pinnacle module, it +# automatically configure the card to the given io, irq and memory +# values using that config port (the config port is jumper selectable +# on the card to 0x250, 0x260 or 0x270). +# +# See the `msnd_pinnacle Additional Options' section below for more +# information on these parameters (also, if you compile the driver +# directly into the kernel, these extra parameters can be useful +# here). +# +# +# ** It is very easy to cause problems in your machine if you choose a +# resource value which is incorrect. ** +# +# +# Examples +# ~~~~~~~~ +# +# * MultiSound Classic/Monterey/Tahiti: +# +# modprobe soundcore +# insmod msnd +# insmod msnd_classic io=0x290 irq=7 mem=0xd0000 +# +# * MultiSound Pinnacle in PnP mode: +# +# modprobe soundcore +# insmod msnd +# isapnp mypinnacle.conf +# insmod msnd_pinnacle io=0x210 irq=5 mem=0xd8000 <-- match mypinnacle.conf values +# +# * MultiSound Pinnacle in non-PnP mode (replace 0x250 with your configuration port, +# one of 0x250, 0x260 or 0x270): +# +# insmod soundcore +# insmod msnd +# insmod msnd_pinnacle cfg=0x250 io=0x290 irq=5 mem=0xd0000 +# +# * To use the MPU-compatible Kurzweil synth on the Pinnacle in PnP +# mode, add the following (assumes you did `isapnp mypinnacle.conf'): +# +# insmod sound +# insmod mpu401 io=0x330 irq=9 <-- match mypinnacle.conf values +# +# * To use the MPU-compatible Kurzweil synth on the Pinnacle in non-PnP +# mode, add the following. Note how we first configure the peripheral's +# resources, _then_ install a Linux driver for it: +# +# insmod sound +# pinnaclecfg 0x250 mpu 0x330 9 +# insmod mpu401 io=0x330 irq=9 +# +# -- OR you can use the following sequence without pinnaclecfg in non-PnP mode: +# +# insmod soundcore +# insmod msnd +# insmod msnd_pinnacle cfg=0x250 io=0x290 irq=5 mem=0xd0000 mpu_io=0x330 mpu_irq=9 +# insmod sound +# insmod mpu401 io=0x330 irq=9 +# +# * To setup the joystick port on the Pinnacle in non-PnP mode (though +# you have to find the actual Linux joystick driver elsewhere), you +# can use pinnaclecfg: +# +# pinnaclecfg 0x250 joystick 0x200 +# +# -- OR you can configure this using msnd_pinnacle with the following: +# +# insmod soundcore +# insmod msnd +# insmod msnd_pinnacle cfg=0x250 io=0x290 irq=5 mem=0xd0000 joystick_io=0x200 +# +# +# msnd_classic, msnd_pinnacle Required Options +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# If the following options are not given, the module will not load. +# Examine the kernel message log for informative error messages. +# WARNING--probing isn't supported so try to make sure you have the +# correct shared memory area, otherwise you may experience problems. +# +# io I/O base of DSP, e.g. io=0x210 +# irq IRQ number, e.g. irq=5 +# mem Shared memory area, e.g. mem=0xd8000 +# +# +# msnd_classic, msnd_pinnacle Additional Options +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# fifosize The digital audio FIFOs, in kilobytes. If not +# specified, the default will be used. Increasing +# this value will reduce the chance of a FIFO +# underflow at the expense of increasing overall +# latency. For example, fifosize=512 will +# allocate 512kB read and write FIFOs (1MB total). +# While this may reduce dropouts, a heavy machine +# load will undoubtedly starve the FIFO of data +# and you will eventually get dropouts. One +# option is to alter the scheduling priority of +# the playback process, using `nice' or some form +# of POSIX soft real-time scheduling. +# +# calibrate_signal Setting this to one calibrates the ADCs to the +# signal, zero calibrates to the card (defaults +# to zero). +# +# +# msnd_pinnacle Additional Options +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# digital Specify digital=1 to enable the S/PDIF input +# if you have the digital daughterboard +# adapter. This will enable access to the +# DIGITAL1 input for the soundcard in the mixer. +# Some mixer programs might have trouble setting +# the DIGITAL1 source as an input. If you have +# trouble, you can try the setdigital.c program +# at the bottom of this document. +# +# cfg Non-PnP configuration port for the Pinnacle +# and Fiji (typically 0x250, 0x260 or 0x270, +# depending on the jumper configuration). If +# this option is omitted, then it is assumed +# that the card is in PnP mode, and that the +# specified DSP resource values are already +# configured with PnP (i.e. it won't attempt to +# do any sort of configuration). +# +# When the Pinnacle is in non-PnP mode, you can use the following +# options to configure particular devices. If a full specification +# for a device is not given, then the device is not configured. Note +# that you still must use a Linux driver for any of these devices +# once their resources are setup (such as the Linux joystick driver, +# or the MPU401 driver from OSS for the Kurzweil synth). +# +# mpu_io I/O port of MPU (on-board Kurzweil synth) +# mpu_irq IRQ of MPU (on-board Kurzweil synth) +# ide_io0 First I/O port of IDE controller +# ide_io1 Second I/O port of IDE controller +# ide_irq IRQ IDE controller +# joystick_io I/O port of joystick +# +# +# Obtaining and Creating Firmware Files +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# For the Classic/Tahiti/Monterey +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# Download to /tmp and unzip the following file from Turtle Beach: +# +# ftp://ftp.voyetra.com/pub/tbs/msndcl/msndvkit.zip +# +# When unzipped, unzip the file named MsndFiles.zip. Then copy the +# following firmware files to /etc/sound (note the file renaming): +# +# cp DSPCODE/MSNDINIT.BIN /etc/sound/msndinit.bin +# cp DSPCODE/MSNDPERM.REB /etc/sound/msndperm.bin +# +# When configuring the Linux kernel, specify /etc/sound/msndinit.bin and +# /etc/sound/msndperm.bin for the two firmware files (Linux kernel +# versions older than 2.2 do not ask for firmware paths, and are +# hardcoded to /etc/sound). +# +# If you are compiling the driver into the kernel, these files must +# be accessible during compilation, but will not be needed later. +# The files must remain, however, if the driver is used as a module. +# +# +# For the Pinnacle/Fiji +# ~~~~~~~~~~~~~~~~~~~~~ +# +# Download to /tmp and unzip the following file from Turtle Beach (be +# sure to use the entire URL; some have had trouble navigating to the +# URL): +# +# ftp://ftp.voyetra.com/pub/tbs/pinn/pnddk100.zip +# +# Unpack this shell archive, and run make in the created directory +# (you need a C compiler and flex to build the utilities). This +# should give you the executables conv, pinnaclecfg and setdigital. +# conv is only used temporarily here to create the firmware files, +# while pinnaclecfg is used to configure the Pinnacle or Fiji card in +# non-PnP mode, and setdigital can be used to set the S/PDIF input on +# the mixer (pinnaclecfg and setdigital should be copied to a +# convenient place, possibly run during system initialization). +# +# To generating the firmware files with the `conv' program, we create +# the binary firmware files by doing the following conversion +# (assuming the archive unpacked into a directory named PINNDDK): +# +# ./conv < PINNDDK/dspcode/pndspini.asm > /etc/sound/pndspini.bin +# ./conv < PINNDDK/dspcode/pndsperm.asm > /etc/sound/pndsperm.bin +# +# The conv (and conv.l) program is not needed after conversion and can +# be safely deleted. Then, when configuring the Linux kernel, specify +# /etc/sound/pndspini.bin and /etc/sound/pndsperm.bin for the two +# firmware files (Linux kernel versions older than 2.2 do not ask for +# firmware paths, and are hardcoded to /etc/sound). +# +# If you are compiling the driver into the kernel, these files must +# be accessible during compilation, but will not be needed later. +# The files must remain, however, if the driver is used as a module. +# +# +# Using Digital I/O with the S/PDIF Port +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# If you have a Pinnacle or Fiji with the digital daughterboard and +# want to set it as the input source, you can use this program if you +# have trouble trying to do it with a mixer program (be sure to +# insert the module with the digital=1 option, or say Y to the option +# during compiled-in kernel operation). Upon selection of the S/PDIF +# port, you should be able monitor and record from it. +# +# There is something to note about using the S/PDIF port. Digital +# timing is taken from the digital signal, so if a signal is not +# connected to the port and it is selected as recording input, you +# will find PCM playback to be distorted in playback rate. Also, +# attempting to record at a sampling rate other than the DAT rate may +# be problematic (i.e. trying to record at 8000Hz when the DAT signal +# is 44100Hz). If you have a problem with this, set the recording +# input to analog if you need to record at a rate other than that of +# the DAT rate. +# +# +# -- Shell archive attached below, just run `sh MultiSound' to extract. +# Contains Pinnacle/Fiji utilities to convert firmware, configure +# in non-PnP mode, and select the DIGITAL1 input for the mixer. +# +# +#!/bin/sh +# This is a shell archive (produced by GNU sharutils 4.2). +# To extract the files from this archive, save it to some FILE, remove +# everything before the `!/bin/sh' line above, then type `sh FILE'. +# +# Made on 1998-12-04 10:07 EST by <andrewtv@ztransform.velsoft.com>. +# Source directory was `/home/andrewtv/programming/pinnacle/pinnacle'. +# +# Existing files will *not* be overwritten unless `-c' is specified. +# +# This shar contains: +# length mode name +# ------ ---------- ------------------------------------------ +# 2046 -rw-rw-r-- MultiSound.d/setdigital.c +# 10235 -rw-rw-r-- MultiSound.d/pinnaclecfg.c +# 106 -rw-rw-r-- MultiSound.d/Makefile +# 141 -rw-rw-r-- MultiSound.d/conv.l +# 1472 -rw-rw-r-- MultiSound.d/msndreset.c +# +save_IFS="${IFS}" +IFS="${IFS}:" +gettext_dir=FAILED +locale_dir=FAILED +first_param="$1" +for dir in $PATH +do + if test "$gettext_dir" = FAILED && test -f $dir/gettext \ + && ($dir/gettext --version >/dev/null 2>&1) + then + set `$dir/gettext --version 2>&1` + if test "$3" = GNU + then + gettext_dir=$dir + fi + fi + if test "$locale_dir" = FAILED && test -f $dir/shar \ + && ($dir/shar --print-text-domain-dir >/dev/null 2>&1) + then + locale_dir=`$dir/shar --print-text-domain-dir` + fi +done +IFS="$save_IFS" +if test "$locale_dir" = FAILED || test "$gettext_dir" = FAILED +then + echo=echo +else + TEXTDOMAINDIR=$locale_dir + export TEXTDOMAINDIR + TEXTDOMAIN=sharutils + export TEXTDOMAIN + echo="$gettext_dir/gettext -s" +fi +touch -am 1231235999 $$.touch >/dev/null 2>&1 +if test ! -f 1231235999 && test -f $$.touch; then + shar_touch=touch +else + shar_touch=: + echo + $echo 'WARNING: not restoring timestamps. Consider getting and' + $echo "installing GNU \`touch', distributed in GNU File Utilities..." + echo +fi +rm -f 1231235999 $$.touch +# +if mkdir _sh01426; then + $echo 'x -' 'creating lock directory' +else + $echo 'failed to create lock directory' + exit 1 +fi +# ============= MultiSound.d/setdigital.c ============== +if test ! -d 'MultiSound.d'; then + $echo 'x -' 'creating directory' 'MultiSound.d' + mkdir 'MultiSound.d' +fi +if test -f 'MultiSound.d/setdigital.c' && test "$first_param" != -c; then + $echo 'x -' SKIPPING 'MultiSound.d/setdigital.c' '(file already exists)' +else + $echo 'x -' extracting 'MultiSound.d/setdigital.c' '(text)' + sed 's/^X//' << 'SHAR_EOF' > 'MultiSound.d/setdigital.c' && +/********************************************************************* +X * +X * setdigital.c - sets the DIGITAL1 input for a mixer +X * +X * Copyright (C) 1998 Andrew Veliath +X * +X * This program is free software; you can redistribute it and/or modify +X * it under the terms of the GNU General Public License as published by +X * the Free Software Foundation; either version 2 of the License, or +X * (at your option) any later version. +X * +X * This program is distributed in the hope that it will be useful, +X * but WITHOUT ANY WARRANTY; without even the implied warranty of +X * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +X * GNU General Public License for more details. +X * +X * You should have received a copy of the GNU General Public License +X * along with this program; if not, write to the Free Software +X * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. +X * +X ********************************************************************/ +X +#include <stdio.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/ioctl.h> +#include <sys/soundcard.h> +X +int main(int argc, char *argv[]) +{ +X int fd; +X unsigned long recmask, recsrc; +X +X if (argc != 2) { +X fprintf(stderr, "usage: setdigital <mixer device>\n"); +X exit(1); +X } +X +X if ((fd = open(argv[1], O_RDWR)) < 0) { +X perror(argv[1]); +X exit(1); +X } +X +X if (ioctl(fd, SOUND_MIXER_READ_RECMASK, &recmask) < 0) { +X fprintf(stderr, "error: ioctl read recording mask failed\n"); +X perror("ioctl"); +X close(fd); +X exit(1); +X } +X +X if (!(recmask & SOUND_MASK_DIGITAL1)) { +X fprintf(stderr, "error: cannot find DIGITAL1 device in mixer\n"); +X close(fd); +X exit(1); +X } +X +X if (ioctl(fd, SOUND_MIXER_READ_RECSRC, &recsrc) < 0) { +X fprintf(stderr, "error: ioctl read recording source failed\n"); +X perror("ioctl"); +X close(fd); +X exit(1); +X } +X +X recsrc |= SOUND_MASK_DIGITAL1; +X +X if (ioctl(fd, SOUND_MIXER_WRITE_RECSRC, &recsrc) < 0) { +X fprintf(stderr, "error: ioctl write recording source failed\n"); +X perror("ioctl"); +X close(fd); +X exit(1); +X } +X +X close(fd); +X +X return 0; +} +SHAR_EOF + $shar_touch -am 1204092598 'MultiSound.d/setdigital.c' && + chmod 0664 'MultiSound.d/setdigital.c' || + $echo 'restore of' 'MultiSound.d/setdigital.c' 'failed' + if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \ + && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then + md5sum -c << SHAR_EOF >/dev/null 2>&1 \ + || $echo 'MultiSound.d/setdigital.c:' 'MD5 check failed' +e87217fc3e71288102ba41fd81f71ec4 MultiSound.d/setdigital.c +SHAR_EOF + else + shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'MultiSound.d/setdigital.c'`" + test 2046 -eq "$shar_count" || + $echo 'MultiSound.d/setdigital.c:' 'original size' '2046,' 'current size' "$shar_count!" + fi +fi +# ============= MultiSound.d/pinnaclecfg.c ============== +if test -f 'MultiSound.d/pinnaclecfg.c' && test "$first_param" != -c; then + $echo 'x -' SKIPPING 'MultiSound.d/pinnaclecfg.c' '(file already exists)' +else + $echo 'x -' extracting 'MultiSound.d/pinnaclecfg.c' '(text)' + sed 's/^X//' << 'SHAR_EOF' > 'MultiSound.d/pinnaclecfg.c' && +/********************************************************************* +X * +X * pinnaclecfg.c - Pinnacle/Fiji Device Configuration Program +X * +X * This is for NON-PnP mode only. For PnP mode, use isapnptools. +X * +X * This is Linux-specific, and must be run with root permissions. +X * +X * Part of the Turtle Beach MultiSound Sound Card Driver for Linux +X * +X * Copyright (C) 1998 Andrew Veliath +X * +X * This program is free software; you can redistribute it and/or modify +X * it under the terms of the GNU General Public License as published by +X * the Free Software Foundation; either version 2 of the License, or +X * (at your option) any later version. +X * +X * This program is distributed in the hope that it will be useful, +X * but WITHOUT ANY WARRANTY; without even the implied warranty of +X * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +X * GNU General Public License for more details. +X * +X * You should have received a copy of the GNU General Public License +X * along with this program; if not, write to the Free Software +X * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. +X * +X ********************************************************************/ +X +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> +#include <unistd.h> +#include <asm/io.h> +#include <asm/types.h> +X +#define IREG_LOGDEVICE 0x07 +#define IREG_ACTIVATE 0x30 +#define LD_ACTIVATE 0x01 +#define LD_DISACTIVATE 0x00 +#define IREG_EECONTROL 0x3F +#define IREG_MEMBASEHI 0x40 +#define IREG_MEMBASELO 0x41 +#define IREG_MEMCONTROL 0x42 +#define IREG_MEMRANGEHI 0x43 +#define IREG_MEMRANGELO 0x44 +#define MEMTYPE_8BIT 0x00 +#define MEMTYPE_16BIT 0x02 +#define MEMTYPE_RANGE 0x00 +#define MEMTYPE_HIADDR 0x01 +#define IREG_IO0_BASEHI 0x60 +#define IREG_IO0_BASELO 0x61 +#define IREG_IO1_BASEHI 0x62 +#define IREG_IO1_BASELO 0x63 +#define IREG_IRQ_NUMBER 0x70 +#define IREG_IRQ_TYPE 0x71 +#define IRQTYPE_HIGH 0x02 +#define IRQTYPE_LOW 0x00 +#define IRQTYPE_LEVEL 0x01 +#define IRQTYPE_EDGE 0x00 +X +#define HIBYTE(w) ((BYTE)(((WORD)(w) >> 8) & 0xFF)) +#define LOBYTE(w) ((BYTE)(w)) +#define MAKEWORD(low,hi) ((WORD)(((BYTE)(low))|(((WORD)((BYTE)(hi)))<<8))) +X +typedef __u8 BYTE; +typedef __u16 USHORT; +typedef __u16 WORD; +X +static int config_port = -1; +X +static int msnd_write_cfg(int cfg, int reg, int value) +{ +X outb(reg, cfg); +X outb(value, cfg + 1); +X if (value != inb(cfg + 1)) { +X fprintf(stderr, "error: msnd_write_cfg: I/O error\n"); +X return -EIO; +X } +X return 0; +} +X +static int msnd_read_cfg(int cfg, int reg) +{ +X outb(reg, cfg); +X return inb(cfg + 1); +} +X +static int msnd_write_cfg_io0(int cfg, int num, WORD io) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_IO0_BASEHI, HIBYTE(io))) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_IO0_BASELO, LOBYTE(io))) +X return -EIO; +X return 0; +} +X +static int msnd_read_cfg_io0(int cfg, int num, WORD *io) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X +X *io = MAKEWORD(msnd_read_cfg(cfg, IREG_IO0_BASELO), +X msnd_read_cfg(cfg, IREG_IO0_BASEHI)); +X +X return 0; +} +X +static int msnd_write_cfg_io1(int cfg, int num, WORD io) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_IO1_BASEHI, HIBYTE(io))) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_IO1_BASELO, LOBYTE(io))) +X return -EIO; +X return 0; +} +X +static int msnd_read_cfg_io1(int cfg, int num, WORD *io) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X +X *io = MAKEWORD(msnd_read_cfg(cfg, IREG_IO1_BASELO), +X msnd_read_cfg(cfg, IREG_IO1_BASEHI)); +X +X return 0; +} +X +static int msnd_write_cfg_irq(int cfg, int num, WORD irq) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_IRQ_NUMBER, LOBYTE(irq))) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_IRQ_TYPE, IRQTYPE_EDGE)) +X return -EIO; +X return 0; +} +X +static int msnd_read_cfg_irq(int cfg, int num, WORD *irq) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X +X *irq = msnd_read_cfg(cfg, IREG_IRQ_NUMBER); +X +X return 0; +} +X +static int msnd_write_cfg_mem(int cfg, int num, int mem) +{ +X WORD wmem; +X +X mem >>= 8; +X mem &= 0xfff; +X wmem = (WORD)mem; +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_MEMBASEHI, HIBYTE(wmem))) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_MEMBASELO, LOBYTE(wmem))) +X return -EIO; +X if (wmem && msnd_write_cfg(cfg, IREG_MEMCONTROL, (MEMTYPE_HIADDR | MEMTYPE_16BIT))) +X return -EIO; +X return 0; +} +X +static int msnd_read_cfg_mem(int cfg, int num, int *mem) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X +X *mem = MAKEWORD(msnd_read_cfg(cfg, IREG_MEMBASELO), +X msnd_read_cfg(cfg, IREG_MEMBASEHI)); +X *mem <<= 8; +X +X return 0; +} +X +static int msnd_activate_logical(int cfg, int num) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X if (msnd_write_cfg(cfg, IREG_ACTIVATE, LD_ACTIVATE)) +X return -EIO; +X return 0; +} +X +static int msnd_write_cfg_logical(int cfg, int num, WORD io0, WORD io1, WORD irq, int mem) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X if (msnd_write_cfg_io0(cfg, num, io0)) +X return -EIO; +X if (msnd_write_cfg_io1(cfg, num, io1)) +X return -EIO; +X if (msnd_write_cfg_irq(cfg, num, irq)) +X return -EIO; +X if (msnd_write_cfg_mem(cfg, num, mem)) +X return -EIO; +X if (msnd_activate_logical(cfg, num)) +X return -EIO; +X return 0; +} +X +static int msnd_read_cfg_logical(int cfg, int num, WORD *io0, WORD *io1, WORD *irq, int *mem) +{ +X if (msnd_write_cfg(cfg, IREG_LOGDEVICE, num)) +X return -EIO; +X if (msnd_read_cfg_io0(cfg, num, io0)) +X return -EIO; +X if (msnd_read_cfg_io1(cfg, num, io1)) +X return -EIO; +X if (msnd_read_cfg_irq(cfg, num, irq)) +X return -EIO; +X if (msnd_read_cfg_mem(cfg, num, mem)) +X return -EIO; +X return 0; +} +X +static void usage(void) +{ +X fprintf(stderr, +X "\n" +X "pinnaclecfg 1.0\n" +X "\n" +X "usage: pinnaclecfg <config port> [device config]\n" +X "\n" +X "This is for use with the card in NON-PnP mode only.\n" +X "\n" +X "Available devices (not all available for Fiji):\n" +X "\n" +X " Device Description\n" +X " -------------------------------------------------------------------\n" +X " reset Reset all devices (i.e. disable)\n" +X " show Display current device configurations\n" +X "\n" +X " dsp <io> <irq> <mem> Audio device\n" +X " mpu <io> <irq> Internal Kurzweil synth\n" +X " ide <io0> <io1> <irq> On-board IDE controller\n" +X " joystick <io> Joystick port\n" +X "\n"); +X exit(1); +} +X +static int cfg_reset(void) +{ +X int i; +X +X for (i = 0; i < 4; ++i) +X msnd_write_cfg_logical(config_port, i, 0, 0, 0, 0); +X +X return 0; +} +X +static int cfg_show(void) +{ +X int i; +X int count = 0; +X +X for (i = 0; i < 4; ++i) { +X WORD io0, io1, irq; +X int mem; +X msnd_read_cfg_logical(config_port, i, &io0, &io1, &irq, &mem); +X switch (i) { +X case 0: +X if (io0 || irq || mem) { +X printf("dsp 0x%x %d 0x%x\n", io0, irq, mem); +X ++count; +X } +X break; +X case 1: +X if (io0 || irq) { +X printf("mpu 0x%x %d\n", io0, irq); +X ++count; +X } +X break; +X case 2: +X if (io0 || io1 || irq) { +X printf("ide 0x%x 0x%x %d\n", io0, io1, irq); +X ++count; +X } +X break; +X case 3: +X if (io0) { +X printf("joystick 0x%x\n", io0); +X ++count; +X } +X break; +X } +X } +X +X if (count == 0) +X fprintf(stderr, "no devices configured\n"); +X +X return 0; +} +X +static int cfg_dsp(int argc, char *argv[]) +{ +X int io, irq, mem; +X +X if (argc < 3 || +X sscanf(argv[0], "0x%x", &io) != 1 || +X sscanf(argv[1], "%d", &irq) != 1 || +X sscanf(argv[2], "0x%x", &mem) != 1) +X usage(); +X +X if (!(io == 0x290 || +X io == 0x260 || +X io == 0x250 || +X io == 0x240 || +X io == 0x230 || +X io == 0x220 || +X io == 0x210 || +X io == 0x3e0)) { +X fprintf(stderr, "error: io must be one of " +X "210, 220, 230, 240, 250, 260, 290, or 3E0\n"); +X usage(); +X } +X +X if (!(irq == 5 || +X irq == 7 || +X irq == 9 || +X irq == 10 || +X irq == 11 || +X irq == 12)) { +X fprintf(stderr, "error: irq must be one of " +X "5, 7, 9, 10, 11 or 12\n"); +X usage(); +X } +X +X if (!(mem == 0xb0000 || +X mem == 0xc8000 || +X mem == 0xd0000 || +X mem == 0xd8000 || +X mem == 0xe0000 || +X mem == 0xe8000)) { +X fprintf(stderr, "error: mem must be one of " +X "0xb0000, 0xc8000, 0xd0000, 0xd8000, 0xe0000 or 0xe8000\n"); +X usage(); +X } +X +X return msnd_write_cfg_logical(config_port, 0, io, 0, irq, mem); +} +X +static int cfg_mpu(int argc, char *argv[]) +{ +X int io, irq; +X +X if (argc < 2 || +X sscanf(argv[0], "0x%x", &io) != 1 || +X sscanf(argv[1], "%d", &irq) != 1) +X usage(); +X +X return msnd_write_cfg_logical(config_port, 1, io, 0, irq, 0); +} +X +static int cfg_ide(int argc, char *argv[]) +{ +X int io0, io1, irq; +X +X if (argc < 3 || +X sscanf(argv[0], "0x%x", &io0) != 1 || +X sscanf(argv[0], "0x%x", &io1) != 1 || +X sscanf(argv[1], "%d", &irq) != 1) +X usage(); +X +X return msnd_write_cfg_logical(config_port, 2, io0, io1, irq, 0); +} +X +static int cfg_joystick(int argc, char *argv[]) +{ +X int io; +X +X if (argc < 1 || +X sscanf(argv[0], "0x%x", &io) != 1) +X usage(); +X +X return msnd_write_cfg_logical(config_port, 3, io, 0, 0, 0); +} +X +int main(int argc, char *argv[]) +{ +X char *device; +X int rv = 0; +X +X --argc; ++argv; +X +X if (argc < 2) +X usage(); +X +X sscanf(argv[0], "0x%x", &config_port); +X if (config_port != 0x250 && config_port != 0x260 && config_port != 0x270) { +X fprintf(stderr, "error: <config port> must be 0x250, 0x260 or 0x270\n"); +X exit(1); +X } +X if (ioperm(config_port, 2, 1)) { +X perror("ioperm"); +X fprintf(stderr, "note: pinnaclecfg must be run as root\n"); +X exit(1); +X } +X device = argv[1]; +X +X argc -= 2; argv += 2; +X +X if (strcmp(device, "reset") == 0) +X rv = cfg_reset(); +X else if (strcmp(device, "show") == 0) +X rv = cfg_show(); +X else if (strcmp(device, "dsp") == 0) +X rv = cfg_dsp(argc, argv); +X else if (strcmp(device, "mpu") == 0) +X rv = cfg_mpu(argc, argv); +X else if (strcmp(device, "ide") == 0) +X rv = cfg_ide(argc, argv); +X else if (strcmp(device, "joystick") == 0) +X rv = cfg_joystick(argc, argv); +X else { +X fprintf(stderr, "error: unknown device %s\n", device); +X usage(); +X } +X +X if (rv) +X fprintf(stderr, "error: device configuration failed\n"); +X +X return 0; +} +SHAR_EOF + $shar_touch -am 1204092598 'MultiSound.d/pinnaclecfg.c' && + chmod 0664 'MultiSound.d/pinnaclecfg.c' || + $echo 'restore of' 'MultiSound.d/pinnaclecfg.c' 'failed' + if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \ + && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then + md5sum -c << SHAR_EOF >/dev/null 2>&1 \ + || $echo 'MultiSound.d/pinnaclecfg.c:' 'MD5 check failed' +366bdf27f0db767a3c7921d0a6db20fe MultiSound.d/pinnaclecfg.c +SHAR_EOF + else + shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'MultiSound.d/pinnaclecfg.c'`" + test 10235 -eq "$shar_count" || + $echo 'MultiSound.d/pinnaclecfg.c:' 'original size' '10235,' 'current size' "$shar_count!" + fi +fi +# ============= MultiSound.d/Makefile ============== +if test -f 'MultiSound.d/Makefile' && test "$first_param" != -c; then + $echo 'x -' SKIPPING 'MultiSound.d/Makefile' '(file already exists)' +else + $echo 'x -' extracting 'MultiSound.d/Makefile' '(text)' + sed 's/^X//' << 'SHAR_EOF' > 'MultiSound.d/Makefile' && +CC = gcc +CFLAGS = -O +PROGS = setdigital msndreset pinnaclecfg conv +X +all: $(PROGS) +X +clean: +X rm -f $(PROGS) +SHAR_EOF + $shar_touch -am 1204092398 'MultiSound.d/Makefile' && + chmod 0664 'MultiSound.d/Makefile' || + $echo 'restore of' 'MultiSound.d/Makefile' 'failed' + if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \ + && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then + md5sum -c << SHAR_EOF >/dev/null 2>&1 \ + || $echo 'MultiSound.d/Makefile:' 'MD5 check failed' +76ca8bb44e3882edcf79c97df6c81845 MultiSound.d/Makefile +SHAR_EOF + else + shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'MultiSound.d/Makefile'`" + test 106 -eq "$shar_count" || + $echo 'MultiSound.d/Makefile:' 'original size' '106,' 'current size' "$shar_count!" + fi +fi +# ============= MultiSound.d/conv.l ============== +if test -f 'MultiSound.d/conv.l' && test "$first_param" != -c; then + $echo 'x -' SKIPPING 'MultiSound.d/conv.l' '(file already exists)' +else + $echo 'x -' extracting 'MultiSound.d/conv.l' '(text)' + sed 's/^X//' << 'SHAR_EOF' > 'MultiSound.d/conv.l' && +%% +[ \n\t,\r] +\;.* +DB +[0-9A-Fa-f]+H { int n; sscanf(yytext, "%xH", &n); printf("%c", n); } +%% +int yywrap() { return 1; } +main() { yylex(); } +SHAR_EOF + $shar_touch -am 0828231798 'MultiSound.d/conv.l' && + chmod 0664 'MultiSound.d/conv.l' || + $echo 'restore of' 'MultiSound.d/conv.l' 'failed' + if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \ + && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then + md5sum -c << SHAR_EOF >/dev/null 2>&1 \ + || $echo 'MultiSound.d/conv.l:' 'MD5 check failed' +d2411fc32cd71a00dcdc1f009e858dd2 MultiSound.d/conv.l +SHAR_EOF + else + shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'MultiSound.d/conv.l'`" + test 141 -eq "$shar_count" || + $echo 'MultiSound.d/conv.l:' 'original size' '141,' 'current size' "$shar_count!" + fi +fi +# ============= MultiSound.d/msndreset.c ============== +if test -f 'MultiSound.d/msndreset.c' && test "$first_param" != -c; then + $echo 'x -' SKIPPING 'MultiSound.d/msndreset.c' '(file already exists)' +else + $echo 'x -' extracting 'MultiSound.d/msndreset.c' '(text)' + sed 's/^X//' << 'SHAR_EOF' > 'MultiSound.d/msndreset.c' && +/********************************************************************* +X * +X * msndreset.c - resets the MultiSound card +X * +X * Copyright (C) 1998 Andrew Veliath +X * +X * This program is free software; you can redistribute it and/or modify +X * it under the terms of the GNU General Public License as published by +X * the Free Software Foundation; either version 2 of the License, or +X * (at your option) any later version. +X * +X * This program is distributed in the hope that it will be useful, +X * but WITHOUT ANY WARRANTY; without even the implied warranty of +X * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +X * GNU General Public License for more details. +X * +X * You should have received a copy of the GNU General Public License +X * along with this program; if not, write to the Free Software +X * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. +X * +X ********************************************************************/ +X +#include <stdio.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/ioctl.h> +#include <sys/soundcard.h> +X +int main(int argc, char *argv[]) +{ +X int fd; +X +X if (argc != 2) { +X fprintf(stderr, "usage: msndreset <mixer device>\n"); +X exit(1); +X } +X +X if ((fd = open(argv[1], O_RDWR)) < 0) { +X perror(argv[1]); +X exit(1); +X } +X +X if (ioctl(fd, SOUND_MIXER_PRIVATE1, 0) < 0) { +X fprintf(stderr, "error: msnd ioctl reset failed\n"); +X perror("ioctl"); +X close(fd); +X exit(1); +X } +X +X close(fd); +X +X return 0; +} +SHAR_EOF + $shar_touch -am 1204100698 'MultiSound.d/msndreset.c' && + chmod 0664 'MultiSound.d/msndreset.c' || + $echo 'restore of' 'MultiSound.d/msndreset.c' 'failed' + if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \ + && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then + md5sum -c << SHAR_EOF >/dev/null 2>&1 \ + || $echo 'MultiSound.d/msndreset.c:' 'MD5 check failed' +c52f876521084e8eb25e12e01dcccb8a MultiSound.d/msndreset.c +SHAR_EOF + else + shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'MultiSound.d/msndreset.c'`" + test 1472 -eq "$shar_count" || + $echo 'MultiSound.d/msndreset.c:' 'original size' '1472,' 'current size' "$shar_count!" + fi +fi +rm -fr _sh01426 +exit 0 diff --git a/Documentation/sound/oss/NEWS b/Documentation/sound/oss/NEWS new file mode 100644 index 000000000000..a81e0ef72ae9 --- /dev/null +++ b/Documentation/sound/oss/NEWS @@ -0,0 +1,42 @@ +Linux 2.4 Sound Changes +2000-September-25 +Christoph Hellwig, <hch@infradead.org> + + + +=== isapnp support + +The Linux 2.4 Kernel does have reliable in-kernel isapnp support. +Some drivers (sb.o, ad1816.o awe_wave.o) do now support automatically +detecting and configuring isapnp devices. +If you have a not yet supported isapnp soundcard, mail me the content +of '/proc/isapnp' on your system and some information about your card +and its driver(s) so I can try to get isapnp working for it. + + + +=== soundcard resources on kernel commandline + +Before Linux 2.4 you had to specify the resources for sounddrivers +statically linked into the kernel at compile time +(in make config/menuconfig/xconfig). In Linux 2.4 the resources are +now specified at the boot-time kernel commandline (e.g. the lilo +'append=' line or everything that's after the kernel name in grub). +Read the Configure.help entry for your card for the parameters. + + +=== softoss is gone + +In Linux 2.4 the softoss in-kernel software synthesizer is no more aviable. +Use a user space software synthesizer like timidity instead. + + + +=== /dev/sndstat and /proc/sound are gone + +In older Linux versions those files exported some information about the +OSS/Free configuration to userspace. In Linux 2.3 they were removed because +they did not support the growing number of pci soundcards and there were +some general problems with this interface. + + diff --git a/Documentation/sound/oss/NM256 b/Documentation/sound/oss/NM256 new file mode 100644 index 000000000000..b503217488b3 --- /dev/null +++ b/Documentation/sound/oss/NM256 @@ -0,0 +1,280 @@ +======================================================= +Documentation for the NeoMagic 256AV/256ZX sound driver +======================================================= + +You're looking at version 1.1 of the driver. (Woohoo!) It has been +successfully tested against the following laptop models: + + Sony Z505S/Z505SX/Z505DX/Z505RX + Sony F150, F160, F180, F250, F270, F280, PCG-F26 + Dell Latitude CPi, CPt (various submodels) + +There are a few caveats, which is why you should read the entirety of +this document first. + +This driver was developed without any support or assistance from +NeoMagic. There is no warranty, expressed, implied, or otherwise. It +is free software in the public domain; feel free to use it, sell it, +give it to your best friends, even claim that you wrote it (but why?!) +but don't go whining to me, NeoMagic, Sony, Dell, or anyone else +when it blows up your computer. + +Version 1.1 contains a change to try and detect non-AC97 versions of +the hardware, and not install itself appropriately. It should also +reinitialize the hardware on an APM resume event, assuming that APM +was configured into your kernel. + +============ +Installation +============ + +Enable the sound drivers, the OSS sound drivers, and then the NM256 +driver. The NM256 driver *must* be configured as a module (it won't +give you any other choice). + +Next, do the usual "make modules" and "make modules_install". +Finally, insmod the soundcore, sound and nm256 modules. + +When the nm256 driver module is loaded, you should see a couple of +confirmation messages in the kernel logfile indicating that it found +the device (the device does *not* use any I/O ports or DMA channels). +Now try playing a wav file, futz with the CD-ROM if you have one, etc. + +The NM256 is entirely a PCI-based device, and all the necessary +information is automatically obtained from the card. It can only be +configured as a module in a vain attempt to prevent people from +hurting themselves. It works correctly if it shares an IRQ with +another device (it normally shares IRQ 9 with the builtin eepro100 +ethernet on the Sony Z505 laptops). + +It does not run the card in any sort of compatibility mode. It will +not work on laptops that have the SB16-compatible, AD1848-compatible +or CS4232-compatible codec/mixer; you will want to use the appropriate +compatible OSS driver with these chipsets. I cannot provide any +assistance with machines using the SB16, AD1848 or CS4232 compatible +versions. (The driver now attempts to detect the mixer version, and +will refuse to load if it believes the hardware is not +AC97-compatible.) + +The sound support is very basic, but it does include simultaneous +playback and record capability. The mixer support is also quite +simple, although this is in keeping with the rather limited +functionality of the chipset. + +There is no hardware synthesizer available, as the Losedows OPL-3 and +MIDI support is done via hardware emulation. + +Only three recording devices are available on the Sony: the +microphone, the CD-ROM input, and the volume device (which corresponds +to the stereo output). (Other devices may be available on other +models of laptops.) The Z505 series does not have a builtin CD-ROM, +so of course the CD-ROM input doesn't work. It does work on laptops +with a builtin CD-ROM drive. + +The mixer device does not appear to have any tone controls, at least +on the Z505 series. The mixer module checks for tone controls in the +AC97 mixer, and will enable them if they are available. + +============== +Known problems +============== + + * There are known problems with PCMCIA cards and the eepro100 ethernet + driver on the Z505S/Z505SX/Z505DX. Keep reading. + + * There are also potential problems with using a virtual X display, and + also problems loading the module after the X server has been started. + Keep reading. + + * The volume control isn't anywhere near linear. Sorry. This will be + fixed eventually, when I get sufficiently annoyed with it. (I doubt + it will ever be fixed now, since I've never gotten sufficiently + annoyed with it and nobody else seems to care.) + + * There are reports that the CD-ROM volume is very low. Since I do not + have a CD-ROM equipped laptop, I cannot test this (it's kinda hard to + do remotely). + + * Only 8 fixed-rate speeds are supported. This is mainly a chipset + limitation. It may be possible to support other speeds in the future. + + * There is no support for the telephone mixer/codec. There is support + for a phonein/phoneout device in the mixer driver; whether or not + it does anything is anyone's guess. (Reports on this would be + appreciated. You'll have to figure out how to get the phone to + go off-hook before it'll work, tho.) + + * This driver was not written with any cooperation or support from + NeoMagic. If you have any questions about this, see their website + for their official stance on supporting open source drivers. + +============ +Video memory +============ + +The NeoMagic sound engine uses a portion of the display memory to hold +the sound buffer. (Crazy, eh?) The NeoMagic video BIOS sets up a +special pointer at the top of video RAM to indicate where the top of +the audio buffer should be placed. + +At the present time XFree86 is apparently not aware of this. It will +thus write over either the pointer or the sound buffer with abandon. +(Accelerated-X seems to do a better job here.) + +This implies a few things: + + * Sometimes the NM256 driver has to guess at where the buffer + should be placed, especially if the module is loaded after the + X server is started. It's usually correct, but it will consistently + fail on the Sony F250. + + * Virtual screens greater than 1024x768x16 under XFree86 are + problematic on laptops with only 2.5MB of screen RAM. This + includes all of the 256AV-equipped laptops. (Virtual displays + may or may not work on the 256ZX, which has at least 4MB of + video RAM.) + +If you start having problems with random noise being output either +constantly (this is the usual symptom on the F250), or when windows +are moved around (this is the usual symptom when using a virtual +screen), the best fix is to + + * Don't use a virtual frame buffer. + * Make sure you load the NM256 module before the X server is + started. + +On the F250, it is possible to force the driver to load properly even +after the XFree86 server is started by doing: + + insmod nm256 buffertop=0x25a800 + +This forces the audio buffers to the correct offset in screen RAM. + +One user has reported a similar problem on the Sony F270, although +others apparently aren't seeing any problems. His suggested command +is + + insmod nm256 buffertop=0x272800 + +================= +Official WWW site +================= + +The official site for the NM256 driver is: + + http://www.uglx.org/sony.html + +You should always be able to get the latest version of the driver there, +and the driver will be supported for the foreseeable future. + +============== +Z505RX and IDE +============== + +There appears to be a problem with the IDE chipset on the Z505RX; one +of the symptoms is that sound playback periodically hangs (when the +disk is accessed). The user reporting the problem also reported that +enabling all of the IDE chipset workarounds in the kernel solved the +problem, tho obviously only one of them should be needed--if someone +can give me more details I would appreciate it. + +============================== +Z505S/Z505SX on-board Ethernet +============================== + +If you're using the on-board Ethernet Pro/100 ethernet support on the Z505 +series, I strongly encourage you to download the latest eepro100 driver from +Donald Becker's site: + + ftp://cesdis.gsfc.nasa.gov/pub/linux/drivers/test/eepro100.c + +There was a reported problem on the Z505SX that if the ethernet +interface is disabled and reenabled while the sound driver is loaded, +the machine would lock up. I have included a workaround that is +working satisfactorily. However, you may occasionally see a message +about "Releasing interrupts, over 1000 bad interrupts" which indicates +that the workaround is doing its job. + +================================== +PCMCIA and the Z505S/Z505SX/Z505DX +================================== + +There is also a known problem with the Sony Z505S and Z505SX hanging +if a PCMCIA card is inserted while the ethernet driver is loaded, or +in some cases if the laptop is suspended. This is caused by tons of +spurious IRQ 9s, probably generated from the PCMCIA or ACPI bridges. + +There is currently no fix for the problem that works in every case. +The only known workarounds are to disable the ethernet interface +before inserting or removing a PCMCIA card, or with some cards +disabling the PCMCIA card before ejecting it will also help the +problem with the laptop hanging when the card is ejected. + +One user has reported that setting the tcic's cs_irq to some value +other than 9 (like 11) fixed the problem. This doesn't work on my +Z505S, however--changing the value causes the cardmgr to stop seeing +card insertions and removals, cards don't seem to work correctly, and +I still get hangs if a card is inserted when the kernel is booted. + +Using the latest ethernet driver and pcmcia package allows me to +insert an Adaptec 1480A SlimScsi card without the laptop hanging, +although I still have to shut down the card before ejecting or +powering down the laptop. However, similar experiments with a DE-660 +ethernet card still result in hangs when the card is inserted. I am +beginning to think that the interrupts are CardBus-related, since the +Adaptec card is a CardBus card, and the DE-660 is not; however, I +don't have any other CardBus cards to test with. + +====== +Thanks +====== + +First, I want to thank everyone (except NeoMagic of course) for their +generous support and encouragement. I'd like to list everyone's name +here that replied during the development phase, but the list is +amazingly long. + +I will be rather unfair and single out a few people, however: + + Justin Maurer, for being the first random net.person to try it, + and for letting me login to his Z505SX to get it working there + + Edi Weitz for trying out several different versions, and giving + me a lot of useful feedback + + Greg Rumple for letting me login remotely to get the driver + functional on the 256ZX, for his assistance on tracking + down all sorts of random stuff, and for trying out Accel-X + + Zach Brown, for the initial AC97 mixer interface design + + Jeff Garzik, for various helpful suggestions on the AC97 + interface + + "Mr. Bumpy" for feedback on the Z505RX + + Bill Nottingham, for generous assistance in getting the mixer ID + code working + +================= +Previous versions +================= + +Versions prior to 0.3 (aka `noname') had problems with weird artifacts +in the output and failed to set the recording rate properly. These +problems have long since been fixed. + +Versions prior to 0.5 had problems with clicks in the output when +anything other than 16-bit stereo sound was being played, and also had +periodic clicks when recording. + +Version 0.7 first incorporated support for the NM256ZX chipset, which +is found on some Dell Latitude laptops (the CPt, and apparently +some CPi models as well). It also included the generic AC97 +mixer module. + +Version 0.75 renamed all the functions and files with slightly more +generic names. + +Note that previous versions of this document claimed that recording was +8-bit only; it actually has been working for 16-bits all along. diff --git a/Documentation/sound/oss/OPL3 b/Documentation/sound/oss/OPL3 new file mode 100644 index 000000000000..2468ff827688 --- /dev/null +++ b/Documentation/sound/oss/OPL3 @@ -0,0 +1,6 @@ +A pure OPL3 card is nice and easy to configure. Simply do + +insmod opl3 io=0x388 + +Change the I/O address in the very unlikely case this card is differently +configured diff --git a/Documentation/sound/oss/OPL3-SA b/Documentation/sound/oss/OPL3-SA new file mode 100644 index 000000000000..66a91835d918 --- /dev/null +++ b/Documentation/sound/oss/OPL3-SA @@ -0,0 +1,52 @@ +OPL3-SA1 sound driver (opl3sa.o) + +--- +Note: This howto only describes how to setup the OPL3-SA1 chip; this info +does not apply to the SA2, SA3, or SA4. +--- + +The Yamaha OPL3-SA1 sound chip is usually found built into motherboards, and +it's a decent little chip offering a WSS mode, a SB Pro emulation mode, MPU401 +and OPL3 FM Synth capabilities. + +You can enable inclusion of the driver via CONFIG_SOUND_OPL3SA1=m, or +CONFIG_SOUND_OPL3SA1=y through 'make config/xconfig/menuconfig'. + +You'll need to know all of the relevant info (irq, dma, and io port) for the +chip's WSS mode, since that is the mode the kernel sound driver uses, and of +course you'll also need to know about where the MPU401 and OPL3 ports and +IRQs are if you want to use those. + +Here's the skinny on how to load it as a module: + + modprobe opl3sa io=0x530 irq=11 dma=0 dma2=1 mpu_io=0x330 mpu_irq=5 + +Module options in detail: + + io: This is the WSS's port base. + irq: This is the WSS's IRQ. + dma: This is the WSS's DMA line. In my BIOS setup screen this was + listed as "WSS Play DMA" + dma2: This is the WSS's secondary DMA line. My BIOS calls it the + "WSS capture DMA" + + mpu_io: This is the MPU401's port base. + mpu_irq: This is the MPU401's IRQ. + +If you'd like to use the OPL3 FM Synthesizer, make sure you enable +CONFIG_SOUND_YM3812 (in 'make config'). That'll build the opl3.o module. + +Then a simple 'insmod opl3 io=0x388', and you now have FM Synth. + +You can also use the SoftOSS software synthesizer instead of the builtin OPL3. +Here's how: + +Say 'y' or 'm' to "SoftOSS software wave table engine" in make config. + +If you said yes, the software synth is available once you boot your new +kernel. + +If you chose to build it as a module, just insmod the resulting softoss2.o + +Questions? Comments? +<stiker@northlink.com> diff --git a/Documentation/sound/oss/OPL3-SA2 b/Documentation/sound/oss/OPL3-SA2 new file mode 100644 index 000000000000..d8b6d2bbada6 --- /dev/null +++ b/Documentation/sound/oss/OPL3-SA2 @@ -0,0 +1,210 @@ +Documentation for the OPL3-SA2, SA3, and SAx driver (opl3sa2.o) +--------------------------------------------------------------- + +Scott Murray, scott@spiteful.org +January 7, 2001 + +NOTE: All trade-marked terms mentioned below are properties of their + respective owners. + + +Supported Devices +----------------- + +This driver is for PnP soundcards based on the following Yamaha audio +controller chipsets: + +YMF711 aka OPL3-SA2 +YMF715 and YMF719 aka OPL3-SA3 + +Up until recently (December 2000), I'd thought the 719 to be a +different chipset, the OPL3-SAx. After an email exhange with +Yamaha, however, it turns out that the 719 is just a re-badged +715, and the chipsets are identical. The chipset detection code +has been updated to reflect this. + +Anyways, all of these chipsets implement the following devices: + +OPL3 FM synthesizer +Soundblaster Pro +Microsoft/Windows Sound System +MPU401 MIDI interface + +Note that this driver uses the MSS device, and to my knowledge these +chipsets enforce an either/or situation with the Soundblaster Pro +device and the MSS device. Since the MSS device has better +capabilities, I have implemented the driver to use it. + + +Mixer Channels +-------------- + +Older versions of this driver (pre-December 2000) had two mixers, +an OPL3-SA2 or SA3 mixer and a MSS mixer. The OPL3-SA[23] mixer +device contained a superset of mixer channels consisting of its own +channels and all of the MSS mixer channels. To simplify the driver +considerably, and to partition functionality better, the OPL3-SA[23] +mixer device now contains has its own specific mixer channels. They +are: + +Volume - Hardware master volume control +Bass - SA3 only, now supports left and right channels +Treble - SA3 only, now supports left and right channels +Microphone - Hardware microphone input volume control +Digital1 - Yamaha 3D enhancement "Wide" mixer + +All other mixer channels (e.g. "PCM", "CD", etc.) now have to be +controlled via the "MS Sound System (CS4231)" mixer. To facilitate +this, the mixer device creation order has been switched so that +the MSS mixer is created first. This allows accessing the majority +of the useful mixer channels even via single mixer-aware tools +such as "aumix". + + +Plug 'n Play +------------ + +In previous kernels (2.2.x), some configuration was required to +get the driver to talk to the card. Being the new millennium and +all, the 2.4.x kernels now support auto-configuration if ISA PnP +support is configured in. Theoretically, the driver even supports +having more than one card in this case. + +With the addition of PnP support to the driver, two new parameters +have been added to control it: + +isapnp - set to 0 to disable ISA PnP card detection + +multiple - set to 0 to disable multiple PnP card detection + + +Optional Parameters +------------------- + +Recent (December 2000) additions to the driver (based on a patch +provided by Peter Englmaier) are two new parameters: + +ymode - Set Yamaha 3D enhancement mode: + 0 = Desktop/Normal 5-12 cm speakers + 1 = Notebook PC (1) 3 cm speakers + 2 = Notebook PC (2) 1.5 cm speakers + 3 = Hi-Fi 16-38 cm speakers + +loopback - Set A/D input source. Useful for echo cancellation: + 0 = Mic Right channel (default) + 1 = Mono output loopback + +The ymode parameter has been tested and does work. The loopback +parameter, however, is untested. Any feedback on its usefulness +would be appreciated. + + +Manual Configuration +-------------------- + +If for some reason you decide not to compile ISA PnP support into +your kernel, or disabled the driver's usage of it by setting the +isapnp parameter as discussed above, then you will need to do some +manual configuration. There are two ways of doing this. The most +common is to use the isapnptools package to initialize the card, and +use the kernel module form of the sound subsystem and sound drivers. +Alternatively, some BIOS's allow manual configuration of installed +PnP devices in a BIOS menu, which should allow using the non-modular +sound drivers, i.e. built into the kernel. + +I personally use isapnp and modules, and do not have access to a PnP +BIOS machine to test. If you have such a beast, configuring the +driver to be built into the kernel should just work (thanks to work +done by David Luyer <luyer@ucs.uwa.edu.au>). You will still need +to specify settings, which can be done by adding: + +opl3sa2=<io>,<irq>,<dma>,<dma2>,<mssio>,<mpuio> + +to the kernel command line. For example: + +opl3sa2=0x370,5,0,1,0x530,0x330 + +If you are instead using the isapnp tools (as most people have been +before Linux 2.4.x), follow the directions in their documentation to +produce a configuration file. Here is the relevant excerpt I used to +use for my SA3 card from my isapnp.conf: + +(CONFIGURE YMH0800/-1 (LD 0 + +# NOTE: IO 0 is for the unused SoundBlaster part of the chipset. +(IO 0 (BASE 0x0220)) +(IO 1 (BASE 0x0530)) +(IO 2 (BASE 0x0388)) +(IO 3 (BASE 0x0330)) +(IO 4 (BASE 0x0370)) +(INT 0 (IRQ 5 (MODE +E))) +(DMA 0 (CHANNEL 0)) +(DMA 1 (CHANNEL 1)) + +Here, note that: + +Port Acceptable Range Purpose +---- ---------------- ------- +IO 0 0x0220 - 0x0280 SB base address, unused. +IO 1 0x0530 - 0x0F48 MSS base address +IO 2 0x0388 - 0x03F8 OPL3 base address +IO 3 0x0300 - 0x0334 MPU base address +IO 4 0x0100 - 0x0FFE card's own base address for its control I/O ports + +The IRQ and DMA values can be any that are considered acceptable for a +MSS. Assuming you've got isapnp all happy, then you should be able to +do something like the following (which matches up with the isapnp +configuration above): + +modprobe mpu401 +modprobe ad1848 +modprobe opl3sa2 io=0x370 mss_io=0x530 mpu_io=0x330 irq=5 dma=0 dma2=1 +modprobe opl3 io=0x388 + +See the section "Automatic Module Loading" below for how to set up +/etc/modprobe.conf to automate this. + +An important thing to remember that the opl3sa2 module's io argument is +for it's own control port, which handles the card's master mixer for +volume (on all cards), and bass and treble (on SA3 cards). + + +Troubleshooting +--------------- + +If all goes well and you see no error messages, you should be able to +start using the sound capabilities of your system. If you get an +error message while trying to insert the opl3sa2 module, then make +sure that the values of the various arguments match what you specified +in your isapnp configuration file, and that there is no conflict with +another device for an I/O port or interrupt. Checking the contents of +/proc/ioports and /proc/interrupts can be useful to see if you're +butting heads with another device. + +If you still cannot get the module to load, look at the contents of +your system log file, usually /var/log/messages. If you see the +message "opl3sa2: Unknown Yamaha audio controller version", then you +have a different chipset version than I've encountered so far. Look +for all messages in the log file that start with "opl3sa2: " and see +if they provide any clues. If you do not see the chipset version +message, and none of the other messages present in the system log are +helpful, email me some details and I'll try my best to help. + + +Automatic Module Loading +------------------------ + +Lastly, if you're using modules and want to set up automatic module +loading with kmod, the kernel module loader, here is the section I +currently use in my modprobe.conf file: + +# Sound +alias sound-slot-0 opl3sa2 +options opl3sa2 io=0x370 mss_io=0x530 mpu_io=0x330 irq=7 dma=0 dma2=3 +options opl3 io=0x388 + +That's all it currently takes to get an OPL3-SA3 card working on my +system. Once again, if you have any other problems, email me at the +address listed above. + +Scott diff --git a/Documentation/sound/oss/Opti b/Documentation/sound/oss/Opti new file mode 100644 index 000000000000..c15af3c07d46 --- /dev/null +++ b/Documentation/sound/oss/Opti @@ -0,0 +1,222 @@ +Support for the OPTi 82C931 chip +-------------------------------- +Note: parts of this README file apply also to other +cards that use the mad16 driver. + +Some items in this README file are based on features +added to the sound driver after Linux-2.1.91 was out. +By the time of writing this I do not know which official +kernel release will include these features. +Please do not report inconsistencies on older Linux +kernels. + +The OPTi 82C931 is supported in its non-PnP mode. +Usually you do not need to set jumpers, etc. The sound driver +will check the card status and if it is required it will +force the card into a mode in which it can be programmed. + +If you have another OS installed on your computer it is recommended +that Linux and the other OS use the same resources. + +Also, it is recommended that resources specified in /etc/modprobe.conf +and resources specified in /etc/isapnp.conf agree. + +Compiling the sound driver +-------------------------- +I highly recommend that you build a modularized sound driver. +This document does not cover a sound-driver which is built in +the kernel. + +Sound card support should be enabled as a module (chose m). +Answer 'm' for these items: + Generic OPL2/OPL3 FM synthesizer support (CONFIG_SOUND_ADLIB) + Microsoft Sound System support (CONFIG_SOUND_MSS) + Support for OPTi MAD16 and/or Mozart based cards (CONFIG_SOUND_MAD16) + FM synthesizer (YM3812/OPL-3) support (CONFIG_SOUND_YM3812) + +The configuration menu may ask for addresses, IRQ lines or DMA +channels. If the card is used as a module the module loading +options will override these values. + +For the OPTi 931 you can answer 'n' to: + Support MIDI in older MAD16 based cards (requires SB) (CONFIG_SOUND_MAD16_OLDCARD) +If you do need MIDI support in a Mozart or C928 based card you +need to answer 'm' to the above question. In that case you will +also need to answer 'm' to: + '100% Sound Blaster compatibles (SB16/32/64, ESS, Jazz16) support' (CONFIG_SOUND_SB) + +Go on and compile your kernel and modules. Install the modules. Run depmod -a. + +Using isapnptools +----------------- +In most systems with a PnP BIOS you do not need to use isapnp. The +initialization provided by the BIOS is sufficient for the driver +to pick up the card and continue initialization. + +If that fails, or if you have other PnP cards, you need to use isapnp +to initialize the card. +This was tested with isapnptools-1.11 but I recommend that you use +isapnptools-1.13 (or newer). Run pnpdump to dump the information +about your PnP cards. Then edit the resulting file and select +the options of your choice. This file is normally installed as +/etc/isapnp.conf. + +The driver has one limitation with respect to I/O port resources: +IO3 base must be 0x0E0C. Although isapnp allows other ports, this +address is hard-coded into the driver. + +Using kmod and autoloading the sound driver +------------------------------------------- +Comment: as of linux-2.1.90 kmod is replacing kerneld. +The config file '/etc/modprobe.conf' is used as before. + +This is the sound part of my /etc/modprobe.conf file. +Following that I will explain each line. + +alias mixer0 mad16 +alias audio0 mad16 +alias midi0 mad16 +alias synth0 opl3 +options sb mad16=1 +options mad16 irq=10 dma=0 dma16=1 io=0x530 joystick=1 cdtype=0 +options opl3 io=0x388 +install mad16 /sbin/modprobe -i mad16 && /sbin/ad1848_mixer_reroute 14 8 15 3 16 6 + +If you have an MPU daughtercard or onboard MPU you will want to add to the +"options mad16" line - eg + +options mad16 irq=5 dma=0 dma16=3 io=0x530 mpu_io=0x330 mpu_irq=9 + +To set the I/O and IRQ of the MPU. + + +Explain: + +alias mixer0 mad16 +alias audio0 mad16 +alias midi0 mad16 +alias synth0 opl3 + +When any sound device is opened the kernel requests auto-loading +of char-major-14. There is a built-in alias that translates this +request to loading the main sound module. + +The sound module in its turn will request loading of a sub-driver +for mixer, audio, midi or synthesizer device. The first 3 are +supported by the mad16 driver. The synth device is supported +by the opl3 driver. + +There is currently no way to autoload the sound device driver +if more than one card is installed. + +options sb mad16=1 + +This is left for historical reasons. If you enable the +config option 'Support MIDI in older MAD16 based cards (requires SB)' +or if you use an older mad16 driver it will force loading of the +SoundBlaster driver. This option tells the SB driver not to look +for a SB card but to wait for the mad16 driver. + +options mad16 irq=10 dma=0 dma16=1 io=0x530 joystick=1 cdtype=0 +options opl3 io=0x388 + +post-install mad16 /sbin/ad1848_mixer_reroute 14 8 15 3 16 6 + +This sets resources and options for the mad16 and opl3 drivers. +I use two DMA channels (only one is required) to enable full duplex. +joystick=1 enables the joystick port. cdtype=0 disables the cd port. +You can also set mpu_io and mpu_irq in the mad16 options for the +uart401 driver. + +This tells modprobe to run /sbin/ad1848_mixer_reroute after +mad16 is successfully loaded and initialized. The source +for ad1848_mixer_reroute is appended to the end of this readme +file. It is impossible for the sound driver to know the actual +connections to the mixer. The 3 inputs intended for cd, synth +and line-in are mapped to the generic inputs line1, line2 and +line3. This program reroutes these mixer channels to their +right names (note the right mapping depends on the actual sound +card that you use). +The numeric parameters mean: + 14=line1 8=cd - reroute line1 to the CD input. + 15=line2 3=synth - reroute line2 to the synthesizer input. + 16=line3 6=line - reroute line3 to the line input. +For reference on other input names look at the file +/usr/include/linux/soundcard.h. + +Using a joystick +----------------- +You must enable a joystick in the mad16 options. (also +in /etc/isapnp.conf if you use it). +Tested with regular analog joysticks. + +A CDROM drive connected to the sound card +----------------------------------------- +The 82C931 chip has support only for secondary ATAPI cdrom. +(cdtype=8). Loading the mad16 driver resets the C931 chip +and if a cdrom was already mounted it may cause a complete +system hang. Do not use the sound card if you have an alternative. +If you do use the sound card it is important that you load +the mad16 driver (use "modprobe mad16" to prevent auto-unloading) +before the cdrom is accessed the first time. + +Using the sound driver built-in to the kernel may help here, but... +Most new systems have a PnP BIOS and also two IDE controllers. +The IDE controller on the sound card may be needed only on older +systems (which have only one IDE controller) but these systems +also do not have a PnP BIOS - requiring isapnptools and a modularized +driver. + +Known problems +-------------- +1. See the section on "A CDROM drive connected to the sound card". + +2. On my system the codec cannot capture companded sound samples. + (eg., recording from /dev/audio). When any companded capture is + requested I get stereo-16 bit samples instead. Playback of + companded samples works well. Apparently this problem is not common + to all C931 based cards. I do not know how to identify cards that + have this problem. + +Source for ad1848_mixer_reroute.c +--------------------------------- +#include <stdio.h> +#include <fcntl.h> +#include <linux/soundcard.h> + +static char *mixer_names[SOUND_MIXER_NRDEVICES] = + SOUND_DEVICE_LABELS; + +int +main(int argc, char **argv) { + int val, from, to; + int i, fd; + + fd = open("/dev/mixer", O_RDWR); + if(fd < 0) { + perror("/dev/mixer"); + return 1; + } + + for(i = 2; i < argc; i += 2) { + from = atoi(argv[i-1]); + to = atoi(argv[i]); + + if(to == SOUND_MIXER_NONE) + fprintf(stderr, "%s: turning off mixer %s\n", + argv[0], mixer_names[to]); + else + fprintf(stderr, "%s: rerouting mixer %s to %s\n", + argv[0], mixer_names[from], mixer_names[to]); + + val = from << 8 | to; + + if(ioctl(fd, SOUND_MIXER_PRIVATE2, &val)) { + perror("AD1848 mixer reroute"); + return 1; + } + } + + return 0; +} + diff --git a/Documentation/sound/oss/PAS16 b/Documentation/sound/oss/PAS16 new file mode 100644 index 000000000000..951b3dce51b4 --- /dev/null +++ b/Documentation/sound/oss/PAS16 @@ -0,0 +1,163 @@ +Pro Audio Spectrum 16 for 2.3.99 and later +========================================= +by Thomas Molina (tmolina@home.com) +last modified 3 Mar 2001 +Acknowledgement to Axel Boldt (boldt@math.ucsb.edu) for stuff taken +from Configure.help, Riccardo Facchetti for stuff from README.OSS, +and others whose names I could not find. + +This documentation is relevant for the PAS16 driver (pas2_card.c and +friends) under kernel version 2.3.99 and later. If you are +unfamiliar with configuring sound under Linux, please read the +Sound-HOWTO, Documentation/sound/oss/Introduction and other +relevant docs first. + +The following information is relevant information from README.OSS +and legacy docs for the Pro Audio Spectrum 16 (PAS16): +================================================================== + +The pas2_card.c driver supports the following cards -- +Pro Audio Spectrum 16 (PAS16) and compatibles: + Pro Audio Spectrum 16 + Pro Audio Studio 16 + Logitech Sound Man 16 + NOTE! The original Pro Audio Spectrum as well as the PAS+ are not + and will not be supported by the driver. + +The sound driver configuration dialog +------------------------------------- + +Sound configuration starts by making some yes/no questions. Be careful +when answering to these questions since answering y to a question may +prevent some later ones from being asked. For example don't answer y to +the question about (PAS16) if you don't really have a PAS16. Sound +configuration may also be made modular by answering m to configuration +options presented. + +Note also that all questions may not be asked. The configuration program +may disable some questions depending on the earlier choices. It may also +select some options automatically as well. + + "ProAudioSpectrum 16 support", + - Answer 'y'_ONLY_ if you have a Pro Audio Spectrum _16_, + Pro Audio Studio 16 or Logitech SoundMan 16 (be sure that + you read the above list correctly). Don't answer 'y' if you + have some other card made by Media Vision or Logitech since they + are not PAS16 compatible. + NOTE! Since 3.5-beta10 you need to enable SB support (next question) + if you want to use the SB emulation of PAS16. It's also possible to + the emulation if you want to use a true SB card together with PAS16 + (there is another question about this that is asked later). + + "Generic OPL2/OPL3 FM synthesizer support", + - Answer 'y' if your card has a FM chip made by Yamaha (OPL2/OPL3/OPL4). + The PAS16 has an OPL3-compatible FM chip. + +With PAS16 you can use two audio device files at the same time. /dev/dsp (and +/dev/audio) is connected to the 8/16 bit native codec and the /dev/dsp1 (and +/dev/audio1) is connected to the SB emulation (8 bit mono only). + + +The new stuff for 2.3.99 and later +============================================================================ +The following configuration options from Documentation/Configure.help +are relevant to configuring the PAS16: + +Sound card support +CONFIG_SOUND + If you have a sound card in your computer, i.e. if it can say more + than an occasional beep, say Y. Be sure to have all the information + about your sound card and its configuration down (I/O port, + interrupt and DMA channel), because you will be asked for it. + + You want to read the Sound-HOWTO, available from + http://www.tldp.org/docs.html#howto . General information + about the modular sound system is contained in the files + Documentation/sound/oss/Introduction. The file + Documentation/sound/oss/README.OSS contains some slightly outdated but + still useful information as well. + +OSS sound modules +CONFIG_SOUND_OSS + OSS is the Open Sound System suite of sound card drivers. They make + sound programming easier since they provide a common API. Say Y or M + here (the module will be called sound.o) if you haven't found a + driver for your sound card above, then pick your driver from the + list below. + +Persistent DMA buffers +CONFIG_SOUND_DMAP + Linux can often have problems allocating DMA buffers for ISA sound + cards on machines with more than 16MB of RAM. This is because ISA + DMA buffers must exist below the 16MB boundary and it is quite + possible that a large enough free block in this region cannot be + found after the machine has been running for a while. If you say Y + here the DMA buffers (64Kb) will be allocated at boot time and kept + until the shutdown. This option is only useful if you said Y to + "OSS sound modules", above. If you said M to "OSS sound modules" + then you can get the persistent DMA buffer functionality by passing + the command-line argument "dmabuf=1" to the sound.o module. + + Say y here for PAS16. + +ProAudioSpectrum 16 support +CONFIG_SOUND_PAS + Answer Y only if you have a Pro Audio Spectrum 16, ProAudio Studio + 16 or Logitech SoundMan 16 sound card. Don't answer Y if you have + some other card made by Media Vision or Logitech since they are not + PAS16 compatible. It is not necessary to enable the separate + Sound Blaster support; it is included in the PAS driver. + + If you compile the driver into the kernel, you have to add + "pas2=<io>,<irq>,<dma>,<dma2>,<sbio>,<sbirq>,<sbdma>,<sbdma2> + to the kernel command line. + +FM Synthesizer (YM3812/OPL-3) support +CONFIG_SOUND_YM3812 + Answer Y if your card has a FM chip made by Yamaha (OPL2/OPL3/OPL4). + Answering Y is usually a safe and recommended choice, however some + cards may have software (TSR) FM emulation. Enabling FM support with + these cards may cause trouble (I don't currently know of any such + cards, however). + Please read the file Documentation/sound/oss/OPL3 if your card has an + OPL3 chip. + If you compile the driver into the kernel, you have to add + "opl3=<io>" to the kernel command line. + + If you compile your drivers into the kernel, you MUST configure + OPL3 support as a module for PAS16 support to work properly. + You can then get OPL3 functionality by issuing the command: + insmod opl3 + In addition, you must either add the following line to + /etc/modprobe.conf: + options opl3 io=0x388 + or else add the following line to /etc/lilo.conf: + opl3=0x388 + + +EXAMPLES +=================================================================== +To use the PAS16 in my computer I have enabled the following sound +configuration options: + +CONFIG_SOUND=y +CONFIG_SOUND_OSS=y +CONFIG_SOUND_TRACEINIT=y +CONFIG_SOUND_DMAP=y +CONFIG_SOUND_PAS=y +CONFIG_SOUND_SB=n +CONFIG_SOUND_YM3812=m + +I have also included the following append line in /etc/lilo.conf: +append="pas2=0x388,10,3,-1,0x220,5,1,-1 sb=0x220,5,1,-1 opl3=0x388" + +The io address of 0x388 is default configuration on the PAS16. The +irq of 10 and dma of 3 may not match your installation. The above +configuration enables PAS16, 8-bit Soundblaster and OPL3 +functionality. If Soundblaster functionality is not desired, the +following line would be appropriate: +append="pas2=0x388,10,3,-1,0,-1,-1,-1 opl3=0x388" + +If sound is built totally modular, the above options may be +specified in /etc/modprobe.conf for pas2, sb and opl3 +respectively. diff --git a/Documentation/sound/oss/PSS b/Documentation/sound/oss/PSS new file mode 100644 index 000000000000..187b9525e1f6 --- /dev/null +++ b/Documentation/sound/oss/PSS @@ -0,0 +1,41 @@ +The PSS cards and other ECHO based cards provide an onboard DSP with +downloadable programs and also has an AD1848 "Microsoft Sound System" +device. The PSS driver enables MSS and MPU401 modes of the card. SB +is not enabled since it doesn't work concurrently with MSS. + +If you build this driver as a module then the driver takes the following +parameters + +pss_io. The I/O base the PSS card is configured at (normally 0x220 + or 0x240) + +mss_io The base address of the Microsoft Sound System interface. + This is normally 0x530, but may be 0x604 or other addresses. + +mss_irq The interrupt assigned to the Microsoft Sound System + emulation. IRQ's 3,5,7,9,10,11 and 12 are available. If you + get IRQ errors be sure to check the interrupt is set to + "ISA/Legacy" in the BIOS on modern machines. + +mss_dma The DMA channel used by the Microsoft Sound System. + This can be 0, 1, or 3. DMA 0 is not available on older + machines and will cause a crash on them. + +mpu_io The MPU emulation base address. This sets the base of the + synthesizer. It is typically 0x330 but can be altered. + +mpu_irq The interrupt to use for the synthesizer. It must differ + from the IRQ used by the Microsoft Sound System port. + + +The mpu_io/mpu_irq fields are optional. If they are not specified the +synthesizer parts are not configured. + +When the module is loaded it looks for a file called +/etc/sound/pss_synth. This is the firmware file from the DOS install disks. +This fil holds a general MIDI emulation. The file expected is called +genmidi.ld on newer DOS driver install disks and synth.ld on older ones. + +You can also load alternative DSP algorithms into the card if you wish. One +alternative driver can be found at http://www.mpg123.de/ + diff --git a/Documentation/sound/oss/PSS-updates b/Documentation/sound/oss/PSS-updates new file mode 100644 index 000000000000..c84dd7597e64 --- /dev/null +++ b/Documentation/sound/oss/PSS-updates @@ -0,0 +1,88 @@ + This file contains notes for users of PSS sound cards who wish to use the +newly added features of the newest version of this driver. + + The major enhancements present in this new revision of this driver is the +addition of two new module parameters that allow you to take full advantage of +all the features present on your PSS sound card. These features include the +ability to enable both the builtin CDROM and joystick ports. + +pss_enable_joystick + + This parameter is basically a flag. A 0 will leave the joystick port +disabled, while a non-zero value would enable the joystick port. The default +setting is pss_enable_joystick=0 as this keeps this driver fully compatible +with systems that were using previous versions of this driver. If you wish to +enable the joystick port you will have to add pss_enable_joystick=1 as an +argument to the driver. To actually use the joystick port you will then have +to load the joystick driver itself. Just remember to load the joystick driver +AFTER the pss sound driver. + +pss_cdrom_port + + This parameter takes a port address as its parameter. Any available port +address can be specified to enable the CDROM port, except for 0x0 and -1 as +these values would leave the port disabled. Like the joystick port, the cdrom +port will require that an appropriate CDROM driver be loaded before you can make +use of the newly enabled CDROM port. Like the joystick port option above, +remember to load the CDROM driver AFTER the pss sound driver. While it may +differ on some PSS sound cards, all the PSS sound cards that I have seen have a +builtin Wearnes CDROM port. If this is the case with your PSS sound card you +should load aztcd with the appropriate port option that matches the port you +assigned to the CDROM port when you loaded your pss sound driver. (ex. +modprobe pss pss_cdrom_port=0x340 && modprobe aztcd aztcd=0x340) The default +setting of this parameter leaves the CDROM port disabled to maintain full +compatibility with systems using previous versions of this driver. + + Other options have also been added for the added convenience and utility +of the user. These options are only available if this driver is loaded as a +module. + +pss_no_sound + + This module parameter is a flag that can be used to tell the driver to +just configure non-sound components. 0 configures all components, a non-0 +value will only attept to configure the CDROM and joystick ports. This +parameter can be used by a user who only wished to use the builtin joystick +and/or CDROM port(s) of his PSS sound card. If this driver is loaded with this +parameter and with the parameter below set to true then a user can safely unload +this driver with the following command "rmmod pss && rmmod ad1848 && rmmod +mpu401 && rmmod sound && rmmod soundcore" and retain the full functionality of +his CDROM and/or joystick port(s) while gaining back the memory previously used +by the sound drivers. This default setting of this parameter is 0 to retain +full behavioral compatibility with previous versions of this driver. + +pss_keep_settings + + This parameter can be used to specify whether you want the driver to reset +all emulations whenever its unloaded. This can be useful for those who are +sharing resources (io ports, IRQ's, DMA's) between different ISA cards. This +flag can also be useful in that future versions of this driver may reset all +emulations by default on the driver's unloading (as it probably should), so +specifying it now will ensure that all future versions of this driver will +continue to work as expected. The default value of this parameter is 1 to +retain full behavioral compatibility with previous versions of this driver. + +pss_firmware + + This parameter can be used to specify the file containing the firmware +code so that a user could tell the driver where that file is located instead +of having to put it in a predefined location with a predefined name. The +default setting of this parameter is "/etc/sound/pss_synth" as this was the +path and filename the hardcoded value in the previous versions of this driver. + +Examples: + +# Normal PSS sound card system, loading of drivers. +# Should be specified in an rc file (ex. Slackware uses /etc/rc.d/rc.modules). + +/sbin/modprobe pss pss_io=0x220 mpu_io=0x338 mpu_irq=9 mss_io=0x530 mss_irq=10 mss_dma=1 pss_cdrom_port=0x340 pss_enable_joystick=1 +/sbin/modprobe aztcd aztcd=0x340 +/sbin/modprobe joystick + +# System using the PSS sound card just for its CDROM and joystick ports. +# Should be specified in an rc file (ex. Slackware uses /etc/rc.d/rc.modules). + +/sbin/modprobe pss pss_io=0x220 pss_cdrom_port=0x340 pss_enable_joystick=1 pss_no_sound=1 +/sbin/rmmod pss && /sbin/rmmod ad1848 && /sbin/rmmod mpu401 && /sbin/rmmod sound && /sbin/rmmod soundcore # This line not needed, but saves memory. +/sbin/modprobe aztcd aztcd=0x340 +/sbin/modprobe joystick diff --git a/Documentation/sound/oss/README.OSS b/Documentation/sound/oss/README.OSS new file mode 100644 index 000000000000..fd42b05b2f55 --- /dev/null +++ b/Documentation/sound/oss/README.OSS @@ -0,0 +1,1456 @@ +Introduction +------------ + +This file is a collection of all the old Readme files distributed with +OSS/Lite by Hannu Savolainen. Since the new Linux sound driver is founded +on it I think these information may still be interesting for users that +have to configure their sound system. + +Be warned: Alan Cox is the current maintainer of the Linux sound driver so if +you have problems with it, please contact him or the current device-specific +driver maintainer (e.g. for aedsp16 specific problems contact me). If you have +patches, contributions or suggestions send them to Alan: I'm sure they are +welcome. + +In this document you will find a lot of references about OSS/Lite or ossfree: +they are gone forever. Keeping this in mind and with a grain of salt this +document can be still interesting and very helpful. + +[ File edited 17.01.1999 - Riccardo Facchetti ] +[ Edited miroSOUND section 19.04.2001 - Robert Siemer ] + +OSS/Free version 3.8 release notes +---------------------------------- + +Please read the SOUND-HOWTO (available from sunsite.unc.edu and other Linux FTP +sites). It gives instructions about using sound with Linux. It's bit out of +date but still very useful. Information about bug fixes and such things +is available from the web page (see above). + +Please check http://www.opensound.com/pguide for more info about programming +with OSS API. + + ==================================================== +- THIS VERSION ____REQUIRES____ Linux 2.1.57 OR LATER. + ==================================================== + +Packages "snd-util-3.8.tar.gz" and "snd-data-0.1.tar.Z" +contain useful utilities to be used with this driver. +See http://www.opensound.com/ossfree/getting.html for +download instructions. + +If you are looking for the installation instructions, please +look forward into this document. + +Supported sound cards +--------------------- + +See below. + +Contributors +------------ + +This driver contains code by several contributors. In addition several other +persons have given useful suggestions. The following is a list of major +contributors. (I could have forgotten some names.) + + Craig Metz 1/2 of the PAS16 Mixer and PCM support + Rob Hooft Volume computation algorithm for the FM synth. + Mika Liljeberg uLaw encoding and decoding routines + Jeff Tranter Linux SOUND HOWTO document + Greg Lee Volume computation algorithm for the GUS and + lots of valuable suggestions. + Andy Warner ISC port + Jim Lowe, + Amancio Hasty Jr FreeBSD/NetBSD port + Anders Baekgaard Bug hunting and valuable suggestions. + Joerg Schubert SB16 DSP support (initial version). + Andrew Robinson Improvements to the GUS driver + Megens SA MIDI recording for SB and SB Pro (initial version). + Mikael Nordqvist Linear volume support for GUS and + nonblocking /dev/sequencer. + Ian Hartas SVR4.2 port + Markus Aroharju and + Risto Kankkunen Major contributions to the mixer support + of GUS v3.7. + Hunyue Yau Mixer support for SG NX Pro. + Marc Hoffman PSS support (initial version). + Rainer Vranken Initialization for Jazz16 (initial version). + Peter Trattler Initial version of loadable module support for Linux. + JRA Gibson 16 bit mode for Jazz16 (initial version) + Davor Jadrijevic MAD16 support (initial version) + Gregor Hoffleit Mozart support (initial version) + Riccardo Facchetti Audio Excel DSP 16 (aedsp16) support + James Hightower Spotting a tiny but important bug in CS423x support. + Denis Sablic OPTi 82C924 specific enhancements (non PnP mode) + Tim MacKenzie Full duplex support for OPTi 82C930. + + Please look at lowlevel/README for more contributors. + +There are probably many other names missing. If you have sent me some +patches and your name is not in the above list, please inform me. + +Sending your contributions or patches +------------------------------------- + +First of all it's highly recommended to contact me before sending anything +or before even starting to do any work. Tell me what you suggest to be +changed or what you have planned to do. Also ensure you are using the +very latest (development) version of OSS/Free since the change may already be +implemented there. In general it's a major waste of time to try to improve a +several months old version. Information about the latest version can be found +from http://www.opensound.com/ossfree. In general there is no point in +sending me patches relative to production kernels. + +Sponsors etc. +------------- + +The following companies have greatly helped development of this driver +in form of a free copy of their product: + +Novell, Inc. UnixWare personal edition + SDK +The Santa Cruz Operation, Inc. A SCO OpenServer + SDK +Ensoniq Corp, a SoundScape card and extensive amount of assistance +MediaTrix Peripherals Inc, a AudioTrix Pro card + SDK +Acer, Inc. a pair of AcerMagic S23 cards. + +In addition the following companies have provided me sufficient amount +of technical information at least some of their products (free or $$$): + +Advanced Gravis Computer Technology Ltd. +Media Vision Inc. +Analog Devices Inc. +Logitech Inc. +Aztech Labs Inc. +Crystal Semiconductor Corporation, +Integrated Circuit Systems Inc. +OAK Technology +OPTi +Turtle Beach +miro +Ad Lib Inc. ($$) +Music Quest Inc. ($$) +Creative Labs ($$$) + +If you have some problems +========================= + +Read the sound HOWTO (sunsite.unc.edu:/pub/Linux/docs/...?). +Also look at the home page (http://www.opensound.com/ossfree). It may +contain info about some recent bug fixes. + +It's likely that you have some problems when trying to use the sound driver +first time. Sound cards don't have standard configuration so there are no +good default configuration to use. Please try to use same I/O, DMA and IRQ +values for the sound card than with DOS. + +If you get an error message when trying to use the driver, please look +at /var/adm/messages for more verbose error message. + + +The following errors are likely with /dev/dsp and /dev/audio. + + - "No such device or address". + This error indicates that there are no suitable hardware for the + device file or the sound driver has been compiled without support for + this particular device. For example /dev/audio and /dev/dsp will not + work if "digitized voice support" was not enabled during "make config". + + - "Device or resource busy". Probably the IRQ (or DMA) channel + required by the sound card is in use by some other device/driver. + + - "I/O error". Almost certainly (99%) it's an IRQ or DMA conflict. + Look at the kernel messages in /var/adm/notice for more info. + + - "Invalid argument". The application is calling ioctl() + with impossible parameters. Check that the application is + for sound driver version 2.X or later. + +Linux installation +================== + +IMPORTANT! Read this if you are installing a separately + distributed version of this driver. + + Check that your kernel version works with this + release of the driver (see Readme). Also verify + that your current kernel version doesn't have more + recent sound driver version than this one. IT'S HIGHLY + RECOMMENDED THAT YOU USE THE SOUND DRIVER VERSION THAT + IS DISTRIBUTED WITH KERNEL SOURCES. + +- When installing separately distributed sound driver you should first + read the above notice. Then try to find proper directory where and how + to install the driver sources. You should not try to install a separately + distributed driver version if you are not able to find the proper way + yourself (in this case use the version that is distributed with kernel + sources). Remove old version of linux/drivers/sound directory before + installing new files. + +- To build the device files you need to run the enclosed shell script + (see below). You need to do this only when installing sound driver + first time or when upgrading to much recent version than the earlier + one. + +- Configure and compile Linux as normally (remember to include the + sound support during "make config"). Please refer to kernel documentation + for instructions about configuring and compiling kernel. File Readme.cards + contains card specific instructions for configuring this driver for + use with various sound cards. + +Boot time configuration (using lilo and insmod) +----------------------------------------------- + +This information has been removed. Too many users didn't believe +that it's really not necessary to use this method. Please look at +Readme of sound driver version 3.0.1 if you still want to use this method. + +Problems +-------- + +Common error messages: + +- /dev/???????: No such file or directory. +Run the script at the end of this file. + +- /dev/???????: No such device. +You are not running kernel which contains the sound driver. When using +modularized sound driver this error means that the sound driver is not +loaded. + +- /dev/????: No such device or address. +Sound driver didn't detect suitable card when initializing. Please look at +Readme.cards for info about configuring the driver with your card. Also +check for possible boot (insmod) time error messages in /var/adm/messages. + +- Other messages or problems +Please check http://www.opensound.com/ossfree for more info. + +Configuring version 3.8 (for Linux) with some common sound cards +================================================================ + +This document describes configuring sound cards with the freeware version of +Open Sound Systems (OSS/Free). Information about the commercial version +(OSS/Linux) and its configuration is available from +http://www.opensound.com/linux.html. Information presented here is +not valid for OSS/Linux. + +If you are unsure about how to configure OSS/Free +you can download the free evaluation version of OSS/Linux from the above +address. There is a chance that it can autodetect your sound card. In this case +you can use the information included in soundon.log when configuring OSS/Free. + + +IMPORTANT! This document covers only cards that were "known" when + this driver version was released. Please look at + http://www.opensound.com/ossfree for info about + cards introduced recently. + + When configuring the sound driver, you should carefully + check each sound configuration option (particularly + "Support for /dev/dsp and /dev/audio"). The default values + offered by these programs are not necessarily valid. + + +THE BIGGEST MISTAKES YOU CAN MAKE +================================= + +1. Assuming that the card is Sound Blaster compatible when it's not. +-------------------------------------------------------------------- + +The number one mistake is to assume that your card is compatible with +Sound Blaster. Only the cards made by Creative Technology or which have +one or more chips labeled by Creative are SB compatible. In addition there +are few sound chipsets which are SB compatible in Linux such as ESS1688 or +Jazz16. Note that SB compatibility in DOS/Windows does _NOT_ mean anything +in Linux. + +IF YOU REALLY ARE 150% SURE YOU HAVE A SOUND BLASTER YOU CAN SKIP THE REST OF +THIS CHAPTER. + +For most other "supposed to be SB compatible" cards you have to use other +than SB drivers (see below). It is possible to get most sound cards to work +in SB mode but in general it's a complete waste of time. There are several +problems which you will encounter by using SB mode with cards that are not +truly SB compatible: + +- The SB emulation is at most SB Pro (DSP version 3.x) which means that +you get only 8 bit audio (there is always an another ("native") mode which +gives the 16 bit capability). The 8 bit only operation is the reason why +many users claim that sound quality in Linux is much worse than in DOS. +In addition some applications require 16 bit mode and they produce just +noise with a 8 bit only device. +- The card may work only in some cases but refuse to work most of the +time. The SB compatible mode always requires special initialization which is +done by the DOS/Windows drivers. This kind of cards work in Linux after +you have warm booted it after DOS but they don't work after cold boot +(power on or reset). +- You get the famous "DMA timed out" messages. Usually all SB clones have +software selectable IRQ and DMA settings. If the (power on default) values +currently used by the card don't match configuration of the driver you will +get the above error message whenever you try to record or play. There are +few other reasons to the DMA timeout message but using the SB mode seems +to be the most common cause. + +2. Trying to use a PnP (Plug & Play) card just like an ordinary sound card +-------------------------------------------------------------------------- + +Plug & Play is a protocol defined by Intel and Microsoft. It lets operating +systems to easily identify and reconfigure I/O ports, IRQs and DMAs of ISA +cards. The problem with PnP cards is that the standard Linux doesn't currently +(versions 2.1.x and earlier) don't support PnP. This means that you will have +to use some special tricks (see later) to get a PnP card alive. Many PnP cards +work after they have been initialized but this is not always the case. + +There are sometimes both PnP and non-PnP versions of the same sound card. +The non-PnP version is the original model which usually has been discontinued +more than an year ago. The PnP version has the same name but with "PnP" +appended to it (sometimes not). This causes major confusion since the non-PnP +model works with Linux but the PnP one doesn't. + +You should carefully check if "Plug & Play" or "PnP" is mentioned in the name +of the card or in the documentation or package that came with the card. +Everything described in the rest of this document is not necessarily valid for +PnP models of sound cards even you have managed to wake up the card properly. +Many PnP cards are simply too different from their non-PnP ancestors which are +covered by this document. + + +Cards that are not (fully) supported by this driver +=================================================== + +See http://www.opensound.com/ossfree for information about sound cards +to be supported in future. + + +How to use sound without recompiling kernel and/or sound driver +=============================================================== + +There is a commercial sound driver which comes in precompiled form and doesn't +require recompiling of the kernel. See http://www.4Front-tech.com/oss.html for +more info. + + +Configuring PnP cards +===================== + +New versions of most sound cards use the so-called ISA PnP protocol for +soft configuring their I/O, IRQ, DMA and shared memory resources. +Currently at least cards made by Creative Technology (SB32 and SB32AWE +PnP), Gravis (GUS PnP and GUS PnP Pro), Ensoniq (Soundscape PnP) and +Aztech (some Sound Galaxy models) use PnP technology. The CS4232/4236 audio +chip by Crystal Semiconductor (Intel Atlantis, HP Pavilion and many other +motherboards) is also based on PnP technology but there is a "native" driver +available for it (see information about CS4232 later in this document). + +PnP sound cards (as well as most other PnP ISA cards) are not supported +by this version of the driver . Proper +support for them should be released during 97 once the kernel level +PnP support is available. + +There is a method to get most of the PnP cards to work. The basic method +is the following: + +1) Boot DOS so the card's DOS drivers have a chance to initialize it. +2) _Cold_ boot to Linux by using "loadlin.exe". Hitting ctrl-alt-del +works with older machines but causes a hard reset of all cards on recent +(Pentium) machines. +3) If you have the sound driver in Linux configured properly, the card should +work now. "Proper" means that I/O, IRQ and DMA settings are the same as in +DOS. The hard part is to find which settings were used. See the documentation of +your card for more info. + +Windows 95 could work as well as DOS but running loadlin may be difficult. +Probably you should "shut down" your machine to MS-DOS mode before running it. + +Some machines have a BIOS utility for setting PnP resources. This is a good +way to configure some cards. In this case you don't need to boot DOS/Win95 +before starting Linux. + +Another way to initialize PnP cards without DOS/Win95 is a Linux based +PnP isolation tool. When writing this there is a pre alpha test version +of such a tool available from ftp://ftp.demon.co.uk/pub/unix/linux/utils. The +file is called isapnptools-*. Please note that this tool is just a temporary +solution which may be incompatible with future kernel versions having proper +support for PnP cards. There are bugs in setting DMA channels in earlier +versions of isapnptools so at least version 1.6 is required with sound cards. + +Yet another way to use PnP cards is to use (commercial) OSS/Linux drivers. See +http://www.opensound.com/linux.html for more info. This is probably the way you +should do it if you don't want to spend time recompiling the kernel and +required tools. + + +Read this before trying to configure the driver +=============================================== + +There are currently many cards that work with this driver. Some of the cards +have native support while others work since they emulate some other +card (usually SB, MSS/WSS and/or MPU401). The following cards have native +support in the driver. Detailed instructions for configuring these cards +will be given later in this document. + +Pro Audio Spectrum 16 (PAS16) and compatibles: + Pro Audio Spectrum 16 + Pro Audio Studio 16 + Logitech Sound Man 16 + NOTE! The original Pro Audio Spectrum as well as the PAS+ are not + and will not be supported by the driver. + +Media Vision Jazz16 based cards + Pro Sonic 16 + Logitech SoundMan Wave + (Other Jazz based cards should work but I don't have any reports + about them). + +Sound Blasters + SB 1.0 to 2.0 + SB Pro + SB 16 + SB32/64/AWE + Configure SB32/64/AWE just like SB16. See lowlevel/README.awe + for information about using the wave table synth. + NOTE! AWE63/Gold and 16/32/AWE "PnP" cards need to be activated + using isapnptools before they work with OSS/Free. + SB16 compatible cards by other manufacturers than Creative. + You have been fooled since there are _no_ SB16 compatible + cards on the market (as of May 1997). It's likely that your card + is compatible just with SB Pro but there is also a non-SB- + compatible 16 bit mode. Usually it's MSS/WSS but it could also + be a proprietary one like MV Jazz16 or ESS ES688. OPTi + MAD16 chips are very common in so called "SB 16 bit cards" + (try with the MAD16 driver). + + ====================================================================== + "Supposed to be SB compatible" cards. + Forget the SB compatibility and check for other alternatives + first. The only cards that work with the SB driver in + Linux have been made by Creative Technology (there is at least + one chip on the card with "CREATIVE" printed on it). The + only other SB compatible chips are ESS and Jazz16 chips + (maybe ALSxxx chips too but they probably don't work). + Most other "16 bit SB compatible" cards such as "OPTi/MAD16" or + "Crystal" are _NOT_ SB compatible in Linux. + + Practically all sound cards have some kind of SB emulation mode + in addition to their native (16 bit) mode. In most cases this + (8 bit only) SB compatible mode doesn't work with Linux. If + you get it working it may cause problems with games and + applications which require 16 bit audio. Some 16 bit only + applications don't check if the card actually supports 16 bits. + They just dump 16 bit data to a 8 bit card which produces just + noise. + + In most cases the 16 bit native mode is supported by Linux. + Use the SB mode with "clones" only if you don't find anything + better from the rest of this doc. + ====================================================================== + +Gravis Ultrasound (GUS) + GUS + GUS + the 16 bit option + GUS MAX + GUS ACE (No MIDI port and audio recording) + GUS PnP (with RAM) + +MPU-401 and compatibles + The driver works both with the full (intelligent mode) MPU-401 + cards (such as MPU IPC-T and MQX-32M) and with the UART only + dumb MIDI ports. MPU-401 is currently the most common MIDI + interface. Most sound cards are compatible with it. However, + don't enable MPU401 mode blindly. Many cards with native support + in the driver have their own MPU401 driver. Enabling the standard one + will cause a conflict with these cards. So check if your card is + in the list of supported cards before enabling MPU401. + +Windows Sound System (MSS/WSS) + Even when Microsoft has discontinued their own Sound System card + they managed to make it a standard. MSS compatible cards are based on + a codec chip which is easily available from at least two manufacturers + (AD1848 by Analog Devices and CS4231/CS4248 by Crystal Semiconductor). + Currently most sound cards are based on one of the MSS compatible codec + chips. The CS4231 is used in the high quality cards such as GUS MAX, + MediaTrix AudioTrix Pro and TB Tropez (GUS MAX is not MSS compatible). + + Having a AD1848, CS4248 or CS4231 codec chip on the card is a good + sign. Even if the card is not MSS compatible, it could be easy to write + support for it. Note also that most MSS compatible cards + require special boot time initialization which may not be present + in the driver. Also, some MSS compatible cards have native support. + Enabling the MSS support with these cards is likely to + cause a conflict. So check if your card is listed in this file before + enabling the MSS support. + +Yamaha FM synthesizers (OPL2, OPL3 (not OPL3-SA) and OPL4) + Most sound cards have a FM synthesizer chip. The OPL2 is a 2 + operator chip used in the original AdLib card. Currently it's used + only in the cheapest (8 bit mono) cards. The OPL3 is a 4 operator + FM chip which provides better sound quality and/or more available + voices than the OPL2. The OPL4 is a new chip that has an OPL3 and + a wave table synthesizer packed onto the same chip. The driver supports + just the OPL3 mode directly. Most cards with an OPL4 (like + SM Wave and AudioTrix Pro) support the OPL4 mode using MPU401 + emulation. Writing a native OPL4 support is difficult + since Yamaha doesn't give information about their sample ROM chip. + + Enable the generic OPL2/OPL3 FM synthesizer support if your + card has a FM chip made by Yamaha. Don't enable it if your card + has a software (TRS) based FM emulator. + + ---------------------------------------------------------------- + NOTE! OPL3-SA is different chip than the ordinary OPL3. In addition + to the FM synth this chip has also digital audio (WSS) and + MIDI (MPU401) capabilities. Support for OPL3-SA is described below. + ---------------------------------------------------------------- + +Yamaha OPL3-SA1 + + Yamaha OPL3-SA1 (YMF701) is an audio controller chip used on some + (Intel) motherboards and on cheap sound cards. It should not be + confused with the original OPL3 chip (YMF278) which is entirely + different chip. OPL3-SA1 has support for MSS, MPU401 and SB Pro + (not used in OSS/Free) in addition to the OPL3 FM synth. + + There are also chips called OPL3-SA2, OPL3-SA3, ..., OPL3SA-N. They + are PnP chips and will not work with the OPL3-SA1 driver. You should + use the standard MSS, MPU401 and OPL3 options with these chips and to + activate the card using isapnptools. + +4Front Technologies SoftOSS + + SoftOSS is a software based wave table emulation which works with + any 16 bit stereo sound card. Due to its nature a fast CPU is + required (P133 is minimum). Although SoftOSS does _not_ use MMX + instructions it has proven out that recent processors (which appear + to have MMX) perform significantly better with SoftOSS than earlier + ones. For example a P166MMX beats a PPro200. SoftOSS should not be used + on 486 or 386 machines. + + The amount of CPU load caused by SoftOSS can be controlled by + selecting the CONFIG_SOFTOSS_RATE and CONFIG_SOFTOSS_VOICES + parameters properly (they will be prompted by make config). It's + recommended to set CONFIG_SOFTOSS_VOICES to 32. If you have a + P166MMX or faster (PPro200 is not faster) you can set + CONFIG_SOFTOSS_RATE to 44100 (kHz). However with slower systems it + recommended to use sampling rates around 22050 or even 16000 kHz. + Selecting too high values for these parameters may hang your + system when playing MIDI files with hight degree of polyphony + (number of concurrently playing notes). It's also possible to + decrease CONFIG_SOFTOSS_VOICES. This makes it possible to use + higher sampling rates. However using fewer voices decreases + playback quality more than decreasing the sampling rate. + + SoftOSS keeps the samples loaded on the system's RAM so much RAM is + required. SoftOSS should never be used on machines with less than 16 MB + of RAM since this is potentially dangerous (you may accidentally run out + of memory which probably crashes the machine). + + SoftOSS implements the wave table API originally designed for GUS. For + this reason all applications designed for GUS should work (at least + after minor modifications). For example gmod/xgmod and playmidi -g are + known to work. + + To work SoftOSS will require GUS compatible + patch files to be installed on the system (in /dos/ultrasnd/midi). You + can use the public domain MIDIA patchset available from several ftp + sites. + + ********************************************************************* + IMPORTANT NOTICE! The original patch set distributed with the Gravis + Ultrasound card is not in public domain (even though it's available from + some FTP sites). You should contact Voice Crystal (www.voicecrystal.com) + if you like to use these patches with SoftOSS included in OSS/Free. + ********************************************************************* + +PSS based cards (AD1848 + ADSP-2115 + Echo ESC614 ASIC) + Analog Devices and Echo Speech have together defined a sound card + architecture based on the above chips. The DSP chip is used + for emulation of SB Pro, FM and General MIDI/MT32. + + There are several cards based on this architecture. The most known + ones are Orchid SW32 and Cardinal DSP16. + + The driver supports downloading DSP algorithms to these cards. + + NOTE! You will have to use the "old" config script when configuring + PSS cards. + +MediaTrix AudioTrix Pro + The ATP card is built around a CS4231 codec and an OPL4 synthesizer + chips. The OPL4 mode is supported by a microcontroller running a + General MIDI emulator. There is also a SB 1.5 compatible playback mode. + +Ensoniq SoundScape and compatibles + Ensoniq has designed a sound card architecture based on the + OTTO synthesizer chip used in their professional MIDI synthesizers. + Several companies (including Ensoniq, Reveal and Spea) are selling + cards based on this architecture. + + NOTE! The SoundScape PnP is not supported by OSS/Free. Ensoniq VIVO and + VIVO90 cards are not compatible with Soundscapes so the Soundscape + driver will not work with them. You may want to use OSS/Linux with these + cards. + +OPTi MAD16 and Mozart based cards + The Mozart (OAK OTI-601), MAD16 (OPTi 82C928), MAD16 Pro (OPTi 82C929), + OPTi 82C924/82C925 (in _non_ PnP mode) and OPTi 82C930 interface + chips are used in many different sound cards, including some + cards by Reveal miro and Turtle Beach (Tropez). The purpose of these + chips is to connect other audio components to the PC bus. The + interface chip performs address decoding for the other chips. + NOTE! Tropez Plus is not MAD16 but CS4232 based. + NOTE! MAD16 PnP cards (82C924, 82C925, 82C931) are not MAD16 compatible + in the PnP mode. You will have to use them in MSS mode after having + initialized them using isapnptools or DOS. 82C931 probably requires + initialization using DOS/Windows (running isapnptools is not enough). + It's possible to use 82C931 with OSS/Free by jumpering it to non-PnP + mode (provided that the card has a jumper for this). In non-PnP mode + 82C931 is compatible with 82C930 and should work with the MAD16 driver + (without need to use isapnptools or DOS to initialize it). All OPTi + chips are supported by OSS/Linux (both in PnP and non-PnP modes). + +Audio Excel DSP16 + Support for this card was written by Riccardo Faccetti + (riccardo@cdc8g5.cdc.polimi.it). The AEDSP16 driver included in + the lowlevel/ directory. To use it you should enable the + "Additional low level drivers" option. + +Crystal CS4232 and CS4236 based cards such as AcerMagic S23, TB Tropez _Plus_ and + many PC motherboards (Compaq, HP, Intel, ...) + CS4232 is a PnP multimedia chip which contains a CS3231A codec, + SB and MPU401 emulations. There is support for OPL3 too. + Unfortunately the MPU401 mode doesn't work (I don't know how to + initialize it). CS4236 is an enhanced (compatible) version of CS4232. + NOTE! Don't ever try to use isapnptools with CS4232 since this will just + freeze your machine (due to chip bugs). If you have problems in getting + CS4232 working you could try initializing it with DOS (CS4232C.EXE) and + then booting Linux using loadlin. CS4232C.EXE loads a secret firmware + patch which is not documented by Crystal. + +Turtle Beach Maui and Tropez "classic" + This driver version supports sample, patch and program loading commands + described in the Maui/Tropez User's manual. + There is now full initialization support too. The audio side of + the Tropez is based on the MAD16 chip (see above). + NOTE! Tropez Plus is different card than Tropez "classic" and will not + work fully in Linux. You can get audio features working by configuring + the card as a CS4232 based card (above). + + +Jumpers and software configuration +================================== + +Some of the earliest sound cards were jumper configurable. You have to +configure the driver use I/O, IRQ and DMA settings +that match the jumpers. Just few 8 bit cards are fully jumper +configurable (SB 1.x/2.x, SB Pro and clones). +Some cards made by Aztech have an EEPROM which contains the +config info. These cards behave much like hardware jumpered cards. + +Most cards have jumper for the base I/O address but other parameters +are software configurable. Sometimes there are few other jumpers too. + +Latest cards are fully software configurable or they are PnP ISA +compatible. There are no jumpers on the board. + +The driver handles software configurable cards automatically. Just configure +the driver to use I/O, IRQ and DMA settings which are known to work. +You could usually use the same values than with DOS and/or Windows. +Using different settings is possible but not recommended since it may cause +some trouble (for example when warm booting from an OS to another or +when installing new hardware to the machine). + +Sound driver sets the soft configurable parameters of the card automatically +during boot. Usually you don't need to run any extra initialization +programs when booting Linux but there are some exceptions. See the +card-specific instructions below for more info. + +The drawback of software configuration is that the driver needs to know +how the card must be initialized. It cannot initialize unknown cards +even if they are otherwise compatible with some other cards (like SB, +MPU401 or Windows Sound System). + + +What if your card was not listed above? +======================================= + +The first thing to do is to look at the major IC chips on the card. +Many of the latest sound cards are based on some standard chips. If you +are lucky, all of them could be supported by the driver. The most common ones +are the OPTi MAD16, Mozart, SoundScape (Ensoniq) and the PSS architectures +listed above. Also look at the end of this file for list of unsupported +cards and the ones which could be supported later. + +The last resort is to send _exact_ name and model information of the card +to me together with a list of the major IC chips (manufactured, model) to +me. I could then try to check if your card looks like something familiar. + +There are many more cards in the world than listed above. The first thing to +do with these cards is to check if they emulate some other card or interface +such as SB, MSS and/or MPU401. In this case there is a chance to get the +card to work by booting DOS before starting Linux (boot DOS, hit ctrl-alt-del +and boot Linux without hard resetting the machine). In this method the +DOS based driver initializes the hardware to use known I/O, IRQ and DMA +settings. If sound driver is configured to use the same settings, everything +should work OK. + + +Configuring sound driver (with Linux) +===================================== + +The sound driver is currently distributed as part of the Linux kernel. The +files are in /usr/src/linux/drivers/sound/. + +**************************************************************************** +* ALWAYS USE THE SOUND DRIVER VERSION WHICH IS DISTRIBUTED WITH * +* THE KERNEL SOURCE PACKAGE YOU ARE USING. SOME ALPHA AND BETA TEST * +* VERSIONS CAN BE INSTALLED FROM A SEPARATELY DISTRIBUTED PACKAGE * +* BUT CHECK THAT THE PACKAGE IS NOT MUCH OLDER (OR NEWER) THAN THE * +* KERNEL YOU ARE USING. IT'S POSSIBLE THAT THE KERNEL/DRIVER * +* INTERFACE CHANGES BETWEEN KERNEL RELEASES WHICH MAY CAUSE SOME * +* INCOMPATIBILITY PROBLEMS. * +* * +* IN CASE YOU INSTALL A SEPARATELY DISTRIBUTED SOUND DRIVER VERSION, * +* BE SURE TO REMOVE OR RENAME THE OLD SOUND DRIVER DIRECTORY BEFORE * +* INSTALLING THE NEW ONE. LEAVING OLD FILES TO THE SOUND DRIVER * +* DIRECTORY _WILL_ CAUSE PROBLEMS WHEN THE DRIVER IS USED OR * +* COMPILED. * +**************************************************************************** + +To configure the driver, run "make config" in the kernel source directory +(/usr/src/linux). Answer "y" or "m" to the question about Sound card support +(after the questions about mouse, CD-ROM, ftape, etc. support). Questions +about options for sound will then be asked. + +After configuring the kernel and sound driver and compile the kernel +following instructions in the kernel README. + +The sound driver configuration dialog +------------------------------------- + +Sound configuration starts by making some yes/no questions. Be careful +when answering to these questions since answering y to a question may +prevent some later ones from being asked. For example don't answer y to +the first question (PAS16) if you don't really have a PAS16. Don't enable +more cards than you really need since they just consume memory. Also +some drivers (like MPU401) may conflict with your SCSI controller and +prevent kernel from booting. If you card was in the list of supported +cards (above), please look at the card specific config instructions +(later in this file) before starting to configure. Some cards must be +configured in way which is not obvious. + +So here is the beginning of the config dialog. Answer 'y' or 'n' to these +questions. The default answer is shown so that (y/n) means 'y' by default and +(n/y) means 'n'. To use the default value, just hit ENTER. But be careful +since using the default _doesn't_ guarantee anything. + +Note also that all questions may not be asked. The configuration program +may disable some questions depending on the earlier choices. It may also +select some options automatically as well. + + "ProAudioSpectrum 16 support", + - Answer 'y'_ONLY_ if you have a Pro Audio Spectrum _16_, + Pro Audio Studio 16 or Logitech SoundMan 16 (be sure that + you read the above list correctly). Don't answer 'y' if you + have some other card made by Media Vision or Logitech since they + are not PAS16 compatible. + NOTE! Since 3.5-beta10 you need to enable SB support (next question) + if you want to use the SB emulation of PAS16. It's also possible to + the emulation if you want to use a true SB card together with PAS16 + (there is another question about this that is asked later). + "Sound Blaster support", + - Answer 'y' if you have an original SB card made by Creative Labs + or a full 100% hardware compatible clone (like Thunderboard or + SM Games). If your card was in the list of supported cards (above), + please look at the card specific instructions later in this file + before answering this question. For an unknown card you may answer + 'y' if the card claims to be SB compatible. + Enable this option also with PAS16 (changed since v3.5-beta9). + + Don't enable SB if you have a MAD16 or Mozart compatible card. + + "Generic OPL2/OPL3 FM synthesizer support", + - Answer 'y' if your card has a FM chip made by Yamaha (OPL2/OPL3/OPL4). + Answering 'y' is usually a safe and recommended choice. However some + cards may have software (TSR) FM emulation. Enabling FM support + with these cards may cause trouble. However I don't currently know + such cards. + "Gravis Ultrasound support", + - Answer 'y' if you have GUS or GUS MAX. Answer 'n' if you don't + have GUS since the GUS driver consumes much memory. + Currently I don't have experiences with the GUS ACE so I don't + know what to answer with it. + "MPU-401 support (NOT for SB16)", + - Be careful with this question. The MPU401 interface is supported + by almost any sound card today. However some natively supported cards + have their own driver for MPU401. Enabling the MPU401 option with + these cards will cause a conflict. Also enabling MPU401 on a system + that doesn't really have a MPU401 could cause some trouble. If your + card was in the list of supported cards (above), please look at + the card specific instructions later in this file. + + In MOST cases this MPU401 driver should only be used with "true" + MIDI-only MPU401 professional cards. In most other cases there + is another way to get the MPU401 compatible interface of a + sound card to work. + Support for the MPU401 compatible MIDI port of SB16, ESS1688 + and MV Jazz16 cards is included in the SB driver. Use it instead + of this separate MPU401 driver with these cards. As well + Soundscape, PSS and Maui drivers include their own MPU401 + options. + + It's safe to answer 'y' if you have a true MPU401 MIDI interface + card. + "6850 UART Midi support", + - It's safe to answer 'n' to this question in all cases. The 6850 + UART interface is so rarely used. + "PSS (ECHO-ADI2111) support", + - Answer 'y' only if you have Orchid SW32, Cardinal DSP16 or some + other card based on the PSS chipset (AD1848 codec + ADSP-2115 + DSP chip + Echo ESC614 ASIC CHIP). + "16 bit sampling option of GUS (_NOT_ GUS MAX)", + - Answer 'y' if you have installed the 16 bit sampling daughtercard + to your GUS. Answer 'n' if you have GUS MAX. Enabling this option + disables GUS MAX support. + "GUS MAX support", + - Answer 'y' only if you have a GUS MAX. + "Microsoft Sound System support", + - Again think carefully before answering 'y' to this question. It's + safe to answer 'y' in case you have the original Windows Sound + System card made by Microsoft or Aztech SG 16 Pro (or NX16 Pro). + Also you may answer 'y' in case your card was not listed earlier + in this file. For cards having native support in the driver, consult + the card specific instructions later in this file. Some drivers + have their own MSS support and enabling this option will cause a + conflict. + Note! The MSS driver permits configuring two DMA channels. This is a + "nonstandard" feature and works only with very few cards (if any). + In most cases the second DMA channel should be disabled or set to + the same channel than the first one. Trying to configure two separate + channels with cards that don't support this feature will prevent + audio (at least recording) from working. + "Ensoniq Soundscape support", + - Answer 'y' if you have a sound card based on the Ensoniq SoundScape + chipset. Such cards are being manufactured at least by Ensoniq, + Spea and Reveal (note that Reveal makes other cards also). The oldest + cards made by Spea don't work properly with Linux. + Soundscape PnP as well as Ensoniq VIVO work only with the commercial + OSS/Linux version. + "MediaTrix AudioTrix Pro support", + - Answer 'y' if you have the AudioTrix Pro. + "Support for MAD16 and/or Mozart based cards", + - Answer y if your card has a Mozart (OAK OTI-601) or MAD16 + (OPTi 82C928, 82C929, 82C924/82C925 or 82C930) audio interface chip. + These chips are + currently quite common so it's possible that many no-name cards + have one of them. In addition the MAD16 chip is used in some + cards made by known manufacturers such as Turtle Beach (Tropez), + Reveal (some models) and Diamond (some recent models). + Note OPTi 82C924 and 82C925 are MAD16 compatible only in non PnP + mode (jumper selectable on many cards). + "Support for TB Maui" + - This enables TB Maui specific initialization. Works with TB Maui + and TB Tropez (may not work with Tropez Plus). + + +Then the configuration program asks some y/n questions about the higher +level services. It's recommended to answer 'y' to each of these questions. +Answer 'n' only if you know you will not need the option. + + "MIDI interface support", + - Answering 'n' disables /dev/midi## devices and access to any + MIDI ports using /dev/sequencer and /dev/music. This option + also affects any MPU401 and/or General MIDI compatible devices. + "FM synthesizer (YM3812/OPL-3) support", + - Answer 'y' here. + "/dev/sequencer support", + - Answering 'n' disables /dev/sequencer and /dev/music. + +Entering the I/O, IRQ and DMA config parameters +----------------------------------------------- + +After the above questions the configuration program prompts for the +card specific configuration information. Usually just a set of +I/O address, IRQ and DMA numbers are asked. With some cards the program +asks for some files to be used during initialization of the card. For example +many cards have a DSP chip or microprocessor which must be initialized by +downloading a program (microcode) file to the card. + +Instructions for answering these questions are given in the next section. + + +Card specific information +========================= + +This section gives additional instructions about configuring some cards. +Please refer manual of your card for valid I/O, IRQ and DMA numbers. Using +the same settings with DOS/Windows and Linux is recommended. Using +different values could cause some problems when switching between +different operating systems. + +Sound Blasters (the original ones by Creative) +--------------------------------------------- + +NOTE! Check if you have a PnP Sound Blaster (cards sold after summer 1995 + are almost certainly PnP ones). With PnP cards you should use isapnptools + to activate them (see above). + +It's possible to configure these cards to use different I/O, IRQ and +DMA settings. Since the possible/default settings have changed between various +models, you have to consult manual of your card for the proper ones. It's +a good idea to use the same values than with DOS/Windows. With SB and SB Pro +it's the only choice. SB16 has software selectable IRQ and DMA channels but +using different values with DOS and Linux is likely to cause troubles. The +DOS driver is not able to reset the card properly after warm boot from Linux +if Linux has used different IRQ or DMA values. + +The original (steam) Sound Blaster (versions 1.x and 2.x) use always +DMA1. There is no way to change it. + +The SB16 needs two DMA channels. A 8 bit one (1 or 3) is required for +8 bit operation and a 16 bit one (5, 6 or 7) for the 16 bit mode. In theory +it's possible to use just one (8 bit) DMA channel by answering the 8 bit +one when the configuration program asks for the 16 bit one. This may work +in some systems but is likely to cause terrible noise on some other systems. + +It's possible to use two SB16/32/64 at the same time. To do this you should +first configure OSS/Free for one card. Then edit local.h manually and define +SB2_BASE, SB2_IRQ, SB2_DMA and SB2_DMA2 for the second one. You can't get +the OPL3, MIDI and EMU8000 devices of the second card to work. If you are +going to use two PnP Sound Blasters, ensure that they are of different model +and have different PnP IDs. There is no way to get two cards with the same +card ID and serial number to work. The easiest way to check this is trying +if isapnptools can see both cards or just one. + +NOTE! Don't enable the SM Games option (asked by the configuration program) + if you are not 101% sure that your card is a Logitech Soundman Games + (not a SM Wave or SM16). + +SB Clones +--------- + +First of all: There are no SB16 clones. There are SB Pro clones with a +16 bit mode which is not SB16 compatible. The most likely alternative is that +the 16 bit mode means MSS/WSS. + +There are just a few fully 100% hardware SB or SB Pro compatible cards. +I know just Thunderboard and SM Games. Other cards require some kind of +hardware initialization before they become SB compatible. Check if your card +was listed in the beginning of this file. In this case you should follow +instructions for your card later in this file. + +For other not fully SB clones you may try initialization using DOS in +the following way: + + - Boot DOS so that the card specific driver gets run. + - Hit ctrl-alt-del (or use loadlin) to boot Linux. Don't + switch off power or press the reset button. + - If you use the same I/O, IRQ and DMA settings in Linux, the + card should work. + +If your card is both SB and MSS compatible, I recommend using the MSS mode. +Most cards of this kind are not able to work in the SB and the MSS mode +simultaneously. Using the MSS mode provides 16 bit recording and playback. + +ProAudioSpectrum 16 and compatibles +----------------------------------- + +PAS16 has a SB emulation chip which can be used together with the native +(16 bit) mode of the card. To enable this emulation you should configure +the driver to have SB support too (this has been changed since version +3.5-beta9 of this driver). + +With current driver versions it's also possible to use PAS16 together with +another SB compatible card. In this case you should configure SB support +for the other card and to disable the SB emulation of PAS16 (there is a +separate questions about this). + +With PAS16 you can use two audio device files at the same time. /dev/dsp (and +/dev/audio) is connected to the 8/16 bit native codec and the /dev/dsp1 (and +/dev/audio1) is connected to the SB emulation (8 bit mono only). + +Gravis Ultrasound +----------------- + +There are many different revisions of the Ultrasound card (GUS). The +earliest ones (pre 3.7) don't have a hardware mixer. With these cards +the driver uses a software emulation for synth and pcm playbacks. It's +also possible to switch some of the inputs (line in, mic) off by setting +mixer volume of the channel level below 10%. For recording you have +to select the channel as a recording source and to use volume above 10%. + +GUS 3.7 has a hardware mixer. + +GUS MAX and the 16 bit sampling daughtercard have a CS4231 codec chip which +also contains a mixer. + +Configuring GUS is simple. Just enable the GUS support and GUS MAX or +the 16 bit daughtercard if you have them. Note that enabling the daughter +card disables GUS MAX driver. + +NOTE for owners of the 16 bit daughtercard: By default the daughtercard +uses /dev/dsp (and /dev/audio). Command "ln -sf /dev/dsp1 /dev/dsp" +selects the daughter card as the default device. + +With just the standard GUS enabled the configuration program prompts +for the I/O, IRQ and DMA numbers for the card. Use the same values than +with DOS. + +With the daughter card option enabled you will be prompted for the I/O, +IRQ and DMA numbers for the daughter card. You have to use different I/O +and DMA values than for the standard GUS. The daughter card permits +simultaneous recording and playback. Use /dev/dsp (the daughtercard) for +recording and /dev/dsp1 (GUS GF1) for playback. + +GUS MAX uses the same I/O address and IRQ settings than the original GUS +(GUS MAX = GUS + a CS4231 codec). In addition an extra DMA channel may be used. +Using two DMA channels permits simultaneous playback using two devices +(dev/dsp0 and /dev/dsp1). The second DMA channel is required for +full duplex audio. +To enable the second DMA channels, give a valid DMA channel when the config +program asks for the GUS MAX DMA (entering -1 disables the second DMA). +Using 16 bit DMA channels (5,6 or 7) is recommended. + +If you have problems in recording with GUS MAX, you could try to use +just one 8 bit DMA channel. Recording will not work with one DMA +channel if it's a 16 bit one. + +Microphone input of GUS MAX is connected to mixer in little bit nonstandard +way. There is actually two microphone volume controls. Normal "mic" controls +only recording level. Mixer control "speaker" is used to control volume of +microphone signal connected directly to line/speaker out. So just decrease +volume of "speaker" if you have problems with microphone feedback. + +GUS ACE works too but any attempt to record or to use the MIDI port +will fail. + +GUS PnP (with RAM) is partially supported but it needs to be initialized using +DOS or isapnptools before starting the driver. + +MPU401 and Windows Sound System +------------------------------- + +Again. Don't enable these options in case your card is listed +somewhere else in this file. + +Configuring these cards is obvious (or it should be). With MSS +you should probably enable the OPL3 synth also since +most MSS compatible cards have it. However check that this is true +before enabling OPL3. + +Sound driver supports more than one MPU401 compatible cards at the same time +but the config program asks config info for just the first of them. +Adding the second or third MPU interfaces must be done manually by +editing sound/local.h (after running the config program). Add defines for +MPU2_BASE & MPU2_IRQ (and MPU3_BASE & MPU3_IRQ) to the file. + +CAUTION! + +The default I/O base of Adaptec AHA-1542 SCSI controller is 0x330 which +is also the default of the MPU401 driver. Don't configure the sound driver to +use 0x330 as the MPU401 base if you have a AHA1542. The kernel will not boot +if you make this mistake. + +PSS +--- + +Even the PSS cards are compatible with SB, MSS and MPU401, you must not +enable these options when configuring the driver. The configuration +program handles these options itself. (You may use the SB, MPU and MSS options +together with PSS if you have another card on the system). + +The PSS driver enables MSS and MPU401 modes of the card. SB is not enabled +since it doesn't work concurrently with MSS. The driver loads also a +DSP algorithm which is used to for the general MIDI emulation. The +algorithm file (.ld) is read by the config program and written to a +file included when the pss.c is compiled. For this reason the config +program asks if you want to download the file. Use the genmidi.ld file +distributed with the DOS/Windows drivers of the card (don't use the mt32.ld). +With some cards the file is called 'synth.ld'. You must have access to +the file when configuring the driver. The easiest way is to mount the DOS +partition containing the file with Linux. + +It's possible to load your own DSP algorithms and run them with the card. +Look at the directory pss_test of snd-util-3.0.tar.gz for more info. + +AudioTrix Pro +------------- + +You have to enable the OPL3 and SB (not SB Pro or SB16) drivers in addition +to the native AudioTrix driver. Don't enable MSS or MPU drivers. + +Configuring ATP is little bit tricky since it uses so many I/O, IRQ and +DMA numbers. Using the same values than with DOS/Win is a good idea. Don't +attempt to use the same IRQ or DMA channels twice. + +The SB mode of ATP is implemented so the ATP driver just enables SB +in the proper address. The SB driver handles the rest. You have to configure +both the SB driver and the SB mode of ATP to use the same IRQ, DMA and I/O +settings. + +Also the ATP has a microcontroller for the General MIDI emulation (OPL4). +For this reason the driver asks for the name of a file containing the +microcode (TRXPRO.HEX). This file is usually located in the directory +where the DOS drivers were installed. You must have access to this file +when configuring the driver. + +If you have the effects daughtercard, it must be initialized by running +the setfx program of snd-util-3.0.tar.gz package. This step is not required +when using the (future) binary distribution version of the driver. + +Ensoniq SoundScape +------------------ + +NOTE! The new PnP SoundScape is not supported yet. Soundscape compatible + cards made by Reveal don't work with Linux. They use older revision + of the Soundscape chipset which is not fully compatible with + newer cards made by Ensoniq. + +The SoundScape driver handles initialization of MSS and MPU supports +itself so you don't need to enable other drivers than SoundScape +(enable also the /dev/dsp, /dev/sequencer and MIDI supports). + +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +!!!!! !!!! +!!!!! NOTE! Before version 3.5-beta6 there WERE two sets of audio !!!! +!!!!! device files (/dev/dsp0 and /dev/dsp1). The first one WAS !!!! +!!!!! used only for card initialization and the second for audio !!!! +!!!!! purposes. It WAS required to change /dev/dsp (a symlink) to !!!! +!!!!! point to /dev/dsp1. !!!! +!!!!! !!!! +!!!!! This is not required with OSS versions 3.5-beta6 and later !!!! +!!!!! since there is now just one audio device file. Please !!!! +!!!!! change /dev/dsp to point back to /dev/dsp0 if you are !!!! +!!!!! upgrading from an earlier driver version using !!!! +!!!!! (cd /dev;rm dsp;ln -s dsp0 dsp). !!!! +!!!!! !!!! +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + +The configuration program asks one DMA channel and two interrupts. One IRQ +and one DMA is used by the MSS codec. The second IRQ is required for the +MPU401 mode (you have to use different IRQs for both purposes). +There were earlier two DMA channels for SoundScape but the current driver +version requires just one. + +The SoundScape card has a Motorola microcontroller which must initialized +_after_ boot (the driver doesn't initialize it during boot). +The initialization is done by running the 'ssinit' program which is +distributed in the snd-util-3.0.tar.gz package. You have to edit two +defines in the ssinit.c and then compile the program. You may run ssinit +manually (after each boot) or add it to /etc/rc.d/rc.local. + +The ssinit program needs the microcode file that comes with the DOS/Windows +driver of the card. You will need to use version 1.30.00 or later +of the microcode file (sndscape.co0 or sndscape.co1 depending on +your card model). THE OLD sndscape.cod WILL NOT WORK. IT WILL HANG YOUR +MACHINE. The only way to get the new microcode file is to download +and install the DOS/Windows driver from ftp://ftp.ensoniq.com/pub. + +Then you have to select the proper microcode file to use: soundscape.co0 +is the right one for most cards and sndscape.co1 is for few (older) cards +made by Reveal and/or Spea. The driver has capability to detect the card +version during boot. Look at the boot log messages in /var/adm/messages +and locate the sound driver initialization message for the SoundScape +card. If the driver displays string <Ensoniq Soundscape (old)>, you have +an old card and you will need to use sndscape.co1. For other cards use +soundscape.co0. New Soundscape revisions such as Elite and PnP use +code files with higher numbers (.co2, .co3, etc.). + +NOTE! Ensoniq Soundscape VIVO is not compatible with other Soundscape cards. + Currently it's possible to use it in Linux only with OSS/Linux + drivers. + +Check /var/adm/messages after running ssinit. The driver prints +the board version after downloading the microcode file. That version +number must match the number in the name of the microcode file (extension). + +Running ssinit with a wrong version of the sndscape.co? file is not +dangerous as long as you don't try to use a file called sndscape.cod. +If you have initialized the card using a wrong microcode file (sounds +are terrible), just modify ssinit.c to use another microcode file and try +again. It's possible to use an earlier version of sndscape.co[01] but it +may sound weird. + +MAD16 (Pro) and Mozart +---------------------- + +You need to enable just the MAD16 /Mozart support when configuring +the driver. _Don't_ enable SB, MPU401 or MSS. However you will need the +/dev/audio, /dev/sequencer and MIDI supports. + +Mozart and OPTi 82C928 (the original MAD16) chips don't support +MPU401 mode so enter just 0 when the configuration program asks the +MPU/MIDI I/O base. The MAD16 Pro (OPTi 82C929) and 82C930 chips have MPU401 +mode. + +TB Tropez is based on the 82C929 chip. It has two MIDI ports. +The one connected to the MAD16 chip is the second one (there is a second +MIDI connector/pins somewhere??). If you have not connected the second MIDI +port, just disable the MIDI port of MAD16. The 'Maui' compatible synth of +Tropez is jumper configurable and not connected to the MAD16 chip (the +Maui driver can be used with it). + +Some MAD16 based cards may cause feedback, whistle or terrible noise if the +line3 mixer channel is turned too high. This happens at least with Shuttle +Sound System. Current driver versions set volume of line3 low enough so +this should not be a problem. + +If you have a MAD16 card which have an OPL4 (FM + Wave table) synthesizer +chip (_not_ an OPL3), you have to append a line containing #define MAD16_OPL4 +to the file linux/drivers/sound/local.h (after running make config). + +MAD16 cards having a CS4231 codec support full duplex mode. This mode +can be enabled by configuring the card to use two DMA channels. Possible +DMA channel pairs are: 0&1, 1&0 and 3&0. + +NOTE! Cards having an OPTi 82C924/82C925 chip work with OSS/Free only in +non-PnP mode (usually jumper selectable). The PnP mode is supported only +by OSS/Linux. + +MV Jazz (ProSonic) +------------------ + +The Jazz16 driver is just a hack made to the SB Pro driver. However it works +fairly well. You have to enable SB, SB Pro (_not_ SB16) and MPU401 supports +when configuring the driver. The configuration program asks later if you +want support for MV Jazz16 based cards (after asking SB base address). Answer +'y' here and the driver asks the second (16 bit) DMA channel. + +The Jazz16 driver uses the MPU401 driver in a way which will cause +problems if you have another MPU401 compatible card. In this case you must +give address of the Jazz16 based MPU401 interface when the config +program prompts for the MPU401 information. Then look at the MPU401 +specific section for instructions about configuring more than one MPU401 cards. + +Logitech Soundman Wave +---------------------- + +Read the above MV Jazz specific instructions first. + +The Logitech SoundMan Wave (don't confuse this with the SM16 or SM Games) is +a MV Jazz based card which has an additional OPL4 based wave table +synthesizer. The OPL4 chip is handled by an on board microcontroller +which must be initialized during boot. The config program asks if +you have a SM Wave immediately after asking the second DMA channel of jazz16. +If you answer 'y', the config program will ask name of the file containing +code to be loaded to the microcontroller. The file is usually called +MIDI0001.BIN and it's located in the DOS/Windows driver directory. The file +may also be called as TSUNAMI.BIN or something else (older cards?). + +The OPL4 synth will be inaccessible without loading the microcontroller code. + +Also remember to enable SB MPU401 support if you want to use the OPL4 mode. +(Don't enable the 'normal' MPU401 device as with some earlier driver +versions (pre 3.5-alpha8)). + +NOTE! Don't answer 'y' when the driver asks about SM Games support + (the next question after the MIDI0001.BIN name). However + answering 'y' doesn't cause damage your computer so don't panic. + +Sound Galaxies +-------------- + +There are many different Sound Galaxy cards made by Aztech. The 8 bit +ones are fully SB or SB Pro compatible and there should be no problems +with them. + +The older 16 bit cards (SG Pro16, SG NX Pro16, Nova and Lyra) have +an EEPROM chip for storing the configuration data. There is a microcontroller +which initializes the card to match the EEPROM settings when the machine +is powered on. These cards actually behave just like they have jumpers +for all of the settings. Configure driver for MSS, MPU, SB/SB Pro and OPL3 +supports with these cards. + +There are some new Sound Galaxies in the market. I have no experience with +them so read the card's manual carefully. + +ESS ES1688 and ES688 'AudioDrive' based cards +--------------------------------------------- + +Support for these two ESS chips is embedded in the SB driver. +Configure these cards just like SB. Enable the 'SB MPU401 MIDI port' +if you want to use MIDI features of ES1688. ES688 doesn't have MPU mode +so you don't need to enable it (the driver uses normal SB MIDI automatically +with ES688). + +NOTE! ESS cards are not compatible with MSS/WSS so don't worry if MSS support +of OSS doesn't work with it. + +There are some ES1688/688 based sound cards and (particularly) motherboards +which use software configurable I/O port relocation feature of the chip. +This ESS proprietary feature is supported only by OSS/Linux. + +There are ES1688 based cards which use different interrupt pin assignment than +recommended by ESS (5, 7, 9/2 and 10). In this case all IRQs don't work. +At least a card called (Pearl?) Hypersound 16 supports IRQ 15 but it doesn't +work. + +ES1868 is a PnP chip which is (supposed to be) compatible with ESS1688 +probably works with OSS/Free after initialization using isapnptools. + +Reveal cards +------------ + +There are several different cards made/marketed by Reveal. Some of them +are compatible with SoundScape and some use the MAD16 chip. You may have +to look at the card and try to identify its origin. + +Diamond +------- + +The oldest (Sierra Aria based) sound cards made by Diamond are not supported +(they may work if the card is initialized using DOS). The recent (LX?) +models are based on the MAD16 chip which is supported by the driver. + +Audio Excel DSP16 +----------------- + +Support for this card is currently not functional. A new driver for it +should be available later this year. + +PCMCIA cards +------------ + +Sorry, can't help. Some cards may work and some don't. + +TI TM4000M notebooks +-------------------- + +These computers have a built in sound support based on the Jazz chipset. +Look at the instructions for MV Jazz (above). It's also important to note +that there is something wrong with the mouse port and sound at least on +some TM models. Don't enable the "C&T 82C710 mouse port support" when +configuring Linux. Having it enabled is likely to cause mysterious problems +and kernel failures when sound is used. + +miroSOUND +--------- + +The miroSOUND PCM1-pro, PCM12 and PCM20 radio has been used +successfully. These cards are based on the MAD16, OPL4, and CS4231A chips +and everything said in the section about MAD16 cards applies here, +too. The only major difference between the PCMxx and other MAD16 cards +is that instead of the mixer in the CS4231 codec a separate mixer +controlled by an on-board 80C32 microcontroller is used. Control of +the mixer takes place via the ACI (miro's audio control interface) +protocol that is implemented in a separate lowlevel driver. Make sure +you compile this ACI driver together with the normal MAD16 support +when you use a miroSOUND PCMxx card. The ACI mixer is controlled by +/dev/mixer and the CS4231 mixer by /dev/mixer1 (depends on load +time). Only in special cases you want to change something regularly on +the CS4231 mixer. + +The miroSOUND PCM12 and PCM20 radio is capable of full duplex +operation (simultaneous PCM replay and recording), which allows you to +implement nice real-time signal processing audio effect software and +network telephones. The ACI mixer has to be switched into the "solo" +mode for duplex operation in order to avoid feedback caused by the +mixer (input hears output signal). You can de-/activate this mode +through toggleing the record button for the wave controller with an +OSS-mixer. + +The PCM20 contains a radio tuner, which is also controlled by +ACI. This radio tuner is supported by the ACI driver together with the +miropcm20.o module. Also the 7-band equalizer is integrated +(limited by the OSS-design). Developement has started and maybe +finished for the RDS decoder on this card, too. You will be able to +read RadioText, the Programme Service name, Programme TYpe and +others. Even the v4l radio module benefits from it with a refined +strength value. See aci.[ch] and miropcm20*.[ch] for more details. + +The following configuration parameters have worked fine for the PCM12 +in Markus Kuhn's system, many other configurations might work, too: +CONFIG_MAD16_BASE=0x530, CONFIG_MAD16_IRQ=11, CONFIG_MAD16_DMA=3, +CONFIG_MAD16_DMA2=0, CONFIG_MAD16_MPU_BASE=0x330, CONFIG_MAD16_MPU_IRQ=10, +DSP_BUFFSIZE=65536, SELECTED_SOUND_OPTIONS=0x00281000. + +Bas van der Linden is using his PCM1-pro with a configuration that +differs in: CONFIG_MAD16_IRQ=7, CONFIG_MAD16_DMA=1, CONFIG_MAD16_MPU_IRQ=9 + +Compaq Deskpro XL +----------------- + +The builtin sound hardware of Compaq Deskpro XL is now supported. +You need to configure the driver with MSS and OPL3 supports enabled. +In addition you need to manually edit linux/drivers/sound/local.h and +to add a line containing "#define DESKPROXL" if you used +make menuconfig/xconfig. + +Others? +------- + +Since there are so many different sound cards, it's likely that I have +forgotten to mention many of them. Please inform me if you know yet another +card which works with Linux, please inform me (or is anybody else +willing to maintain a database of supported cards (just like in XF86)?). + +Cards not supported yet +======================= + +Please check the version of sound driver you are using before +complaining that your card is not supported. It's possible you are +using a driver version which was released months before your card was +introduced. + +First of all, there is an easy way to make most sound cards work with Linux. +Just use the DOS based driver to initialize the card to a known state, then use +loadlin.exe to boot Linux. If Linux is configured to use the same I/O, IRQ and +DMA numbers as DOS, the card could work. +(ctrl-alt-del can be used in place of loadlin.exe but it doesn't work with +new motherboards). This method works also with all/most PnP sound cards. + +Don't get fooled with SB compatibility. Most cards are compatible with +SB but that may require a TSR which is not possible with Linux. If +the card is compatible with MSS, it's a better choice. Some cards +don't work in the SB and MSS modes at the same time. + +Then there are cards which are no longer manufactured and/or which +are relatively rarely used (such as the 8 bit ProAudioSpectrum +models). It's extremely unlikely that such cards ever get supported. +Adding support for a new card requires much work and increases time +required in maintaining the driver (some changes need to be done +to all low level drivers and be tested too, maybe with multiple +operating systems). For this reason I have made a decision to not support +obsolete cards. It's possible that someone else makes a separately +distributed driver (diffs) for the card. + +Writing a driver for a new card is not possible if there are no +programming information available about the card. If you don't +find your new card from this file, look from the home page +(http://www.opensound.com/ossfree). Then please contact +manufacturer of the card and ask if they have (or are willing to) +released technical details of the card. Do this before contacting me. I +can only answer 'no' if there are no programming information available. + +I have made decision to not accept code based on reverse engineering +to the driver. There are three main reasons: First I don't want to break +relationships to sound card manufacturers. The second reason is that +maintaining and supporting a driver without any specs will be a pain. +The third reason is that companies have freedom to refuse selling their +products to other than Windows users. + +Some companies don't give low level technical information about their +products to public or at least their require signing a NDA. It's not +possible to implement a freeware driver for them. However it's possible +that support for such cards become available in the commercial version +of this driver (see http://www.4Front-tech.com/oss.html for more info). + +There are some common audio chipsets that are not supported yet. For example +Sierra Aria and IBM Mwave. It's possible that these architectures +get some support in future but I can't make any promises. Just look +at the home page (http://www.opensound.com/ossfree/new_cards.html) +for latest info. + +Information about unsupported sound cards and chipsets is welcome as well +as free copies of sound cards, SDKs and operating systems. + +If you have any corrections and/or comments, please contact me. + +Hannu Savolainen +hannu@opensound.com + +Personal home page: http://www.compusonic.fi/~hannu +home page of OSS/Free: http://www.opensound.com/ossfree + +home page of commercial OSS +(Open Sound System) drivers: http://www.opensound.com/oss.html diff --git a/Documentation/sound/oss/README.awe b/Documentation/sound/oss/README.awe new file mode 100644 index 000000000000..80054cd8fcde --- /dev/null +++ b/Documentation/sound/oss/README.awe @@ -0,0 +1,218 @@ +================================================================ + AWE32 Sound Driver for Linux / FreeBSD + version 0.4.3; Nov. 1, 1998 + + Takashi Iwai <iwai@ww.uni-erlangen.de> +================================================================ + +* GENERAL NOTES + +This is a sound driver extension for SoundBlaster AWE32 and other +compatible cards (AWE32-PnP, SB32, SB32-PnP, AWE64 & etc) to enable +the wave synth operations. The driver is provided for Linux 1.2.x +and 2.[012].x kernels, as well as FreeBSD, on Intel x86 and DEC +Alpha systems. + +This driver was written by Takashi Iwai <iwai@ww.uni-erlangen.de>, +and provided "as is". The original source (awedrv-0.4.3.tar.gz) and +binary packages are available on the following URL: + http://bahamut.mm.t.u-tokyo.ac.jp/~iwai/awedrv/ +Note that since the author is apart from this web site, the update is +not frequent now. + + +* NOTE TO LINUX USERS + +To enable this driver on linux-2.[01].x kernels, you need turn on +"AWE32 synth" options in sound menu when configure your linux kernel +and modules. The precise installation procedure is described in the +AWE64-Mini-HOWTO and linux-kernel/Documetation/sound/AWE32. + +If you're using PnP cards, the card must be initialized before loading +the sound driver. There're several options to do this: + - Initialize the card via ISA PnP tools, and load the sound module. + - Initialize the card on DOS, and load linux by loadlin.exe + - Use PnP kernel driver (for Linux-2.x.x) +The detailed instruction for the solution using isapnp tools is found +in many documents like above. A brief instruction is also included in +the installation document of this package. +For PnP driver project, please refer to the following URL: + http://www-jcr.lmh.ox.ac.uk/~pnp/ + + +* USING THE DRIVER + +The awedrv has several different playing modes to realize easy channel +allocation for MIDI songs. To hear the exact sound quality, you need +to obtain the extended sequencer program, drvmidi or playmidi-2.5. + +For playing MIDI files, you *MUST* load the soundfont file on the +driver previously by sfxload utility. Otherwise you'll here no sounds +at all! All the utilities and driver source packages are found in the +above URL. The sfxload program is included in the package +awesfx-0.4.3.tgz. Binary packages are available there, too. See the +instruction in each package for installation. + +Loading a soundfont file is very simple. Just execute the command + + % sfxload synthgm.sbk + +Then, sfxload transfers the file "synthgm.sbk" to the driver. +Both SF1 and SF2 formats are accepted. + +Now you can hear midi musics by a midi player. + + % drvmidi foo.mid + +If you run MIDI player after MOD player, you need to load soundfont +files again, since MOD player programs clear the previous loaded +samples by their own data. + +If you have only 512kb on the sound card, I recommend to use dynamic +sample loading via -L option of drvmidi. 2MB GM/GS soundfont file is +available in most midi files. + + % sfxload synthgm + % drvmidi -L 2mbgmgs foo.mid + +This makes a big difference (believe me)! For more details, please +refer to the FAQ list which is available on the URL above. + +The current chorus, reverb and equalizer status can be changed by +aweset utility program (included in awesfx package). Note that +some awedrv-native programs (like drvmidi and xmp) will change the +current settings by themselves. The aweset program is effective +only for other programs like playmidi. + +Enjoy. + + +* COMPILE FLAGS + +Compile conditions are defined in awe_config.h. + +[Compatibility Conditions] +The following flags are defined automatically when using installation +shell script. + +- AWE_MODULE_SUPPORT + indicates your Linux kernel supports module for each sound card + (in recent 2.1 or 2.2 kernels and unofficial patched 2.0 kernels + as distributed in the RH5.0 package). + This flag is automatically set when you're using 2.1.x kernels. + You can pass the base address and memory size via the following + module options, + io = base I/O port address (eg. 0x620) + memsize = DRAM size in kilobytes (eg. 512) + As default, AWE driver probes these values automatically. + + +[Hardware Conditions] +You DON'T have to define the following two values. +Define them only when the driver couldn't detect the card properly. + +- AWE_DEFAULT_BASE_ADDR (default: not defined) + specifies the base port address of your AWE32 card. + 0 means to autodetect the address. + +- AWE_DEFAULT_MEM_SIZE (default: not defined) + specifies the memory size of your AWE32 card in kilobytes. + -1 means to autodetect its size. + + +[Sample Table Size] +From ver.0.4.0, sample tables are allocated dynamically (except +Linux-1.2.x system), so you need NOT to touch these parameters. +Linux-1.2.x users may need to increase these values to appropriate size +if the sound card is equipped with more DRAM. + +- AWE_MAX_SF_LISTS, AWE_MAX_SAMPLES, AWE_MAX_INFOS + + +[Other Conditions] + +- AWE_ALWAYS_INIT_FM (default: not defined) + indicates the AWE driver always initialize FM passthrough even + without DRAM on board. Emu8000 chip has a restriction for playing + samples on DRAM that at least two channels must be occupied as + passthrough channels. + +- AWE_DEBUG_ON (default: defined) + turns on debugging messages if defined. + +- AWE_HAS_GUS_COMPATIBILITY (default: defined) + Enables GUS compatibility mode if defined, reading GUS patches and + GUS control commands. Define this option to use GMOD or other + GUS module players. + +- CONFIG_AWE32_MIDIEMU (default: defined) + Adds a MIDI emulation device by Emu8000 wavetable. The emulation + device can be accessed as an external MIDI, and sends the MIDI + control codes directly. XG and GS sysex/NRPN are accepted. + No MIDI input is supported. + +- CONFIG_AWE32_MIXER (default: not defined) + Adds a mixer device for AWE32 bass/treble equalizer control. + You can access this device using /dev/mixer?? (usually mixer01). + +- AWE_USE_NEW_VOLUME_CALC (default: defined) + Use the new method to calculate the volume change as compatible + with DOS/Win drivers. This option can be toggled via aweset + program, or drvmidi player. + +- AWE_CHECK_VTARGET (default: defined) + Check the current volume target value when searching for an + empty channel to allocate a new voice. This is experimentally + implemented in this version. (probably, this option doesn't + affect the sound quality severely...) + +- AWE_ALLOW_SAMPLE_SHARING (default: defined) + Allow sample sharing for differently loaded patches. + This function is available only together with awesfx-0.4.3p3. + Note that this is still an experimental option. + +- DEF_FM_CHORUS_DEPTH (default: 0x10) + The default strength to be sent to the chorus effect engine. + From 0 to 0xff. Larger numbers may often cause weird sounds. + +- DEF_FM_REVERB_DEPTH (default: 0x10) + The default strength to be sent to the reverb effect engine. + From 0 to 0xff. Larger numbers may often cause weird sounds. + + +* ACKNOWLEDGMENTS + +Thanks to Witold Jachimczyk (witek@xfactor.wpi.edu) for much advice +on programming of AWE32. Much code is brought from his AWE32-native +MOD player, ALMP. +The port of awedrv to FreeBSD is done by Randall Hopper +(rhh@ct.picker.com). +The new volume calculation routine was derived from Mark Weaver's +ADIP compatible routines. +I also thank linux-awe-ml members for their efforts +to reboot their system many times :-) + + +* TODO'S + +- Complete DOS/Win compatibility +- DSP-like output + + +* COPYRIGHT + +Copyright (C) 1996-1998 Takashi Iwai + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. diff --git a/Documentation/sound/oss/README.modules b/Documentation/sound/oss/README.modules new file mode 100644 index 000000000000..e691d74e1e5e --- /dev/null +++ b/Documentation/sound/oss/README.modules @@ -0,0 +1,106 @@ +Building a modular sound driver +================================ + + The following information is current as of linux-2.1.85. Check the other +readme files, especially README.OSS, for information not specific to +making sound modular. + + First, configure your kernel. This is an idea of what you should be +setting in the sound section: + +<M> Sound card support + +<M> 100% Sound Blaster compatibles (SB16/32/64, ESS, Jazz16) support + + I have SoundBlaster. Select your card from the list. + +<M> Generic OPL2/OPL3 FM synthesizer support +<M> FM synthesizer (YM3812/OPL-3) support + + If you don't set these, you will probably find you can play .wav files +but not .midi. As the help for them says, set them unless you know your +card does not use one of these chips for FM support. + + Once you are configured, make zlilo, modules, modules_install; reboot. +Note that it is no longer necessary or possible to configure sound in the +drivers/sound dir. Now one simply configures and makes one's kernel and +modules in the usual way. + + Then, add to your /etc/modprobe.conf something like: + +alias char-major-14-* sb +install sb /sbin/modprobe -i sb && /sbin/modprobe adlib_card +options sb io=0x220 irq=7 dma=1 dma16=5 mpu_io=0x330 +options adlib_card io=0x388 # FM synthesizer + + Alternatively, if you have compiled in kernel level ISAPnP support: + +alias char-major-14 sb +post-install sb /sbin/modprobe "-k" "adlib_card" +options adlib_card io=0x388 + + The effect of this is that the sound driver and all necessary bits and +pieces autoload on demand, assuming you use kerneld (a sound choice) and +autoclean when not in use. Also, options for the device drivers are +set. They will not work without them. Change as appropriate for your card. +If you are not yet using the very cool kerneld, you will have to "modprobe +-k sb" yourself to get things going. Eventually things may be fixed so +that this kludgery is not necessary; for the time being, it seems to work +well. + + Replace 'sb' with the driver for your card, and give it the right +options. To find the filename of the driver, look in +/lib/modules/<kernel-version>/misc. Mine looks like: + +adlib_card.o # This is the generic OPLx driver +opl3.o # The OPL3 driver +sb.o # <<The SoundBlaster driver. Yours may differ.>> +sound.o # The sound driver +uart401.o # Used by sb, maybe other cards + + Whichever card you have, try feeding it the options that would be the +default if you were making the driver wired, not as modules. You can +look at function referred to by module_init() for the card to see what +args are expected. + + Note that at present there is no way to configure the io, irq and other +parameters for the modular drivers as one does for the wired drivers.. One +needs to pass the modules the necessary parameters as arguments, either +with /etc/modprobe.conf or with command-line args to modprobe, e.g. + +modprobe sb io=0x220 irq=7 dma=1 dma16=5 mpu_io=0x330 +modprobe adlib_card io=0x388 + + recommend using /etc/modprobe.conf. + +Persistent DMA Buffers: + +The sound modules normally allocate DMA buffers during open() and +deallocate them during close(). Linux can often have problems allocating +DMA buffers for ISA cards on machines with more than 16MB RAM. This is +because ISA DMA buffers must exist below the 16MB boundary and it is quite +possible that we can't find a large enough free block in this region after +the machine has been running for any amount of time. The way to avoid this +problem is to allocate the DMA buffers during module load and deallocate +them when the module is unloaded. For this to be effective we need to load +the sound modules right after the kernel boots, either manually or by an +init script, and keep them around until we shut down. This is a little +wasteful of RAM, but it guarantees that sound always works. + +To make the sound driver use persistent DMA buffers we need to pass the +sound.o module a "dmabuf=1" command-line argument. This is normally done +in /etc/modprobe.conf like so: + +options sound dmabuf=1 + +If you have 16MB or less RAM or a PCI sound card, this is wasteful and +unnecessary. It is possible that machine with 16MB or less RAM will find +this option useful, but if your machine is so memory-starved that it +cannot find a 64K block free, you will be wasting even more RAM by keeping +the sound modules loaded and the DMA buffers allocated when they are not +needed. The proper solution is to upgrade your RAM. But you do also have +this improper solution as well. Use it wisely. + + I'm afraid I know nothing about anything but my setup, being more of a +text-mode guy anyway. If you have options for other cards or other helpful +hints, send them to me, Jim Bray, jb@as220.org, http://as220.org/jb. diff --git a/Documentation/sound/oss/README.ymfsb b/Documentation/sound/oss/README.ymfsb new file mode 100644 index 000000000000..af8a7d3a4e8e --- /dev/null +++ b/Documentation/sound/oss/README.ymfsb @@ -0,0 +1,107 @@ +Legacy audio driver for YMF7xx PCI cards. + + +FIRST OF ALL +============ + + This code references YAMAHA's sample codes and data sheets. + I respect and thank for all people they made open the informations + about YMF7xx cards. + + And this codes heavily based on Jeff Garzik <jgarzik@pobox.com>'s + old VIA 82Cxxx driver (via82cxxx.c). I also respect him. + + +DISCLIMER +========= + + This driver is currently at early ALPHA stage. It may cause serious + damage to your computer when used. + PLEASE USE IT AT YOUR OWN RISK. + + +ABOUT THIS DRIVER +================= + + This code enables you to use your YMF724[A-F], YMF740[A-C], YMF744, YMF754 + cards. When enabled, your card acts as "SoundBlaster Pro" compatible card. + It can only play 22.05kHz / 8bit / Stereo samples, control external MIDI + port. + If you want to use your card as recent "16-bit" card, you should use + Alsa or OSS/Linux driver. Of course you can write native PCI driver for + your cards :) + + +USAGE +===== + + # modprobe ymfsb (options) + + +OPTIONS FOR MODULE +================== + + io : SB base address (0x220, 0x240, 0x260, 0x280) + synth_io : OPL3 base address (0x388, 0x398, 0x3a0, 0x3a8) + dma : DMA number (0,1,3) + master_volume: AC'97 PCM out Vol (0-100) + spdif_out : SPDIF-out flag (0:disable 1:enable) + + These options will change in future... + + +FREQUENCY +========= + + When playing sounds via this driver, you will hear its pitch is slightly + lower than original sounds. Since this driver recognizes your card acts + with 21.739kHz sample rates rather than 22.050kHz (I think it must be + hardware restriction). So many players become tone deafness. + To prevent this, you should express some options to your sound player + that specify correct sample frequency. For example, to play your MP3 file + correctly with mpg123, specify the frequency like following: + + % mpg123 -r 21739 foo.mp3 + + +SPDIF OUT +========= + + With installing modules with option 'spdif_out=1', you can enjoy your + sounds from SPDIF-out of your card (if it had). + Its Fs is fixed to 48kHz (It never means the sample frequency become + up to 48kHz. All sounds via SPDIF-out also 22kHz samples). So your + digital-in capable components has to be able to handle 48kHz Fs. + + +COPYING +======= + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + + +TODO +==== + * support for multiple cards + (set the different SB_IO,MPU_IO,OPL_IO for each cards) + + * support for OPL (dmfm) : There will be no requirements... :-< + + +AUTHOR +====== + + Daisuke Nagano <breeze.nagano@nifty.ne.jp> + diff --git a/Documentation/sound/oss/SoundPro b/Documentation/sound/oss/SoundPro new file mode 100644 index 000000000000..9d4db1f29d3c --- /dev/null +++ b/Documentation/sound/oss/SoundPro @@ -0,0 +1,105 @@ +Documentation for the SoundPro CMI8330 extensions in the WSS driver (ad1848.o) +------------------------------------------------------------------------------ + +( Be sure to read Documentation/sound/oss/CMI8330 too ) + +Ion Badulescu, ionut@cs.columbia.edu +February 24, 1999 + +(derived from the OPL3-SA2 documentation by Scott Murray) + +The SoundPro CMI8330 (ISA) is a chip usually found on some Taiwanese +motherboards. The official name in the documentation is CMI8330, SoundPro +is the nickname and the big inscription on the chip itself. + +The chip emulates a WSS as well as a SB16, but it has certain differences +in the mixer section which require separate support. It also emulates an +MPU401 and an OPL3 synthesizer, so you probably want to enable support +for these, too. + +The chip identifies itself as an AD1848, but its mixer is significantly +more advanced than the original AD1848 one. If your system works with +either WSS or SB16 and you are having problems with some mixer controls +(no CD audio, no line-in, etc), you might want to give this driver a try. +Detection should work, but it hasn't been widely tested, so it might still +mis-identify the chip. You can still force soundpro=1 in the modprobe +parameters for ad1848. Please let me know if it happens to you, so I can +adjust the detection routine. + +The chip is capable of doing full-duplex, but since the driver sees it as an +AD1848, it cannot take advantage of this. Moreover, the full-duplex mode is +not achievable through the WSS interface, b/c it needs a dma16 line which is +assigned only to the SB16 subdevice (with isapnp). Windows documentation +says the user must use WSS Playback and SB16 Recording for full-duplex, so +it might be possible to do the same thing under Linux. You can try loading +up both ad1848 and sb then use one for playback and the other for +recording. I don't know if this works, b/c I haven't tested it. Anyway, if +you try it, be very careful: the SB16 mixer *mostly* works, but certain +settings can have unexpected effects. Use the WSS mixer for best results. + +There is also a PCI SoundPro chip. I have not seen this chip, so I have +no idea if the driver will work with it. I suspect it won't. + +As with PnP cards, some configuration is required. There are two ways +of doing this. The most common is to use the isapnptools package to +initialize the card, and use the kernel module form of the sound +subsystem and sound drivers. Alternatively, some BIOS's allow manual +configuration of installed PnP devices in a BIOS menu, which should +allow using the non-modular sound drivers, i.e. built into the kernel. +Since in this latter case you cannot use module parameters, you will +have to enable support for the SoundPro at compile time. + +The IRQ and DMA values can be any that are considered acceptable for a +WSS. Assuming you've got isapnp all happy, then you should be able to +do something like the following (which *must* match the isapnp/BIOS +configuration): + +modprobe ad1848 io=0x530 irq=11 dma=0 soundpro=1 +-and maybe- +modprobe sb io=0x220 irq=5 dma=1 dma16=5 + +-then- +modprobe mpu401 io=0x330 irq=9 +modprobe opl3 io=0x388 + +If all goes well and you see no error messages, you should be able to +start using the sound capabilities of your system. If you get an +error message while trying to insert the module(s), then make +sure that the values of the various arguments match what you specified +in your isapnp configuration file, and that there is no conflict with +another device for an I/O port or interrupt. Checking the contents of +/proc/ioports and /proc/interrupts can be useful to see if you're +butting heads with another device. + +If you do not see the chipset version message, and none of the other +messages present in the system log are helpful, try adding 'debug=1' +to the ad1848 parameters, email me the syslog results and I'll do +my best to help. + +Lastly, if you're using modules and want to set up automatic module +loading with kmod, the kernel module loader, here is the section I +currently use in my conf.modules file: + +# Sound +post-install sound modprobe -k ad1848; modprobe -k mpu401; modprobe -k opl3 +options ad1848 io=0x530 irq=11 dma=0 +options sb io=0x220 irq=5 dma=1 dma16=5 +options mpu401 io=0x330 irq=9 +options opl3 io=0x388 + +The above ensures that ad1848 will be loaded whenever the sound system +is being used. + +Good luck. + +Ion + +NOT REALLY TESTED: +- recording +- recording device selection +- full-duplex + +TODO: +- implement mixer support for surround, loud, digital CD switches. +- come up with a scheme which allows recording volumes for each subdevice. +This is a major OSS API change. diff --git a/Documentation/sound/oss/Soundblaster b/Documentation/sound/oss/Soundblaster new file mode 100644 index 000000000000..b288d464ba8b --- /dev/null +++ b/Documentation/sound/oss/Soundblaster @@ -0,0 +1,53 @@ +modprobe sound +insmod uart401 +insmod sb ... + +This loads the driver for the Sound Blaster and assorted clones. Cards that +are covered by other drivers should not be using this driver. + +The Sound Blaster module takes the following arguments + +io I/O address of the Sound Blaster chip (0x220,0x240,0x260,0x280) +irq IRQ of the Sound Blaster chip (5,7,9,10) +dma 8-bit DMA channel for the Sound Blaster (0,1,3) +dma16 16-bit DMA channel for SB16 and equivalent cards (5,6,7) +mpu_io I/O for MPU chip if present (0x300,0x330) + +sm_games=1 Set if you have a Logitech soundman games +acer=1 Set this to detect cards in some ACER notebooks +mwave_bug=1 Set if you are trying to use this driver with mwave (see on) +type Use this to specify a specific card type + +The following arguments are taken if ISAPnP support is compiled in + +isapnp=0 Set this to disable ISAPnP detection (use io=0xXXX etc. above) +multiple=0 Set to disable detection of multiple Soundblaster cards. + Consider it a bug if this option is needed, and send in a + report. +pnplegacy=1 Set this to be able to use a PnP card(s) along with a single + non-PnP (legacy) card. Above options for io, irq, etc. are + needed, and will apply only to the legacy card. +reverse=1 Reverses the order of the search in the PnP table. +uart401=1 Set to enable detection of mpu devices on some clones. +isapnpjump=n Jumps to slot n in the driver's PnP table. Use the source, + Luke. + +You may well want to load the opl3 driver for synth music on most SB and +clone SB devices + +insmod opl3 io=0x388 + +Using Mwave + +To make this driver work with Mwave you must set mwave_bug. You also need +to warm boot from DOS/Windows with the required firmware loaded under this +OS. IBM are being difficult about documenting how to load this firmware. + +Avance Logic ALS007 + +This card is supported; see the separate file ALS007 for full details. + +Avance Logic ALS100 + +This card is supported; setup should be as for a standard Sound Blaster 16. +The driver will identify the audio device as a "Sound Blaster 16 (ALS-100)". diff --git a/Documentation/sound/oss/Tropez+ b/Documentation/sound/oss/Tropez+ new file mode 100644 index 000000000000..b93a6b734fc0 --- /dev/null +++ b/Documentation/sound/oss/Tropez+ @@ -0,0 +1,26 @@ +From: Paul Barton-Davis <pbd@op.net> + +Here is the configuration I use with a Tropez+ and my modular +driver: + + alias char-major-14 wavefront + alias synth0 wavefront + alias mixer0 cs4232 + alias audio0 cs4232 + pre-install wavefront modprobe "-k" "cs4232" + post-install wavefront modprobe "-k" "opl3" + options wavefront io=0x200 irq=9 + options cs4232 synthirq=9 synthio=0x200 io=0x530 irq=5 dma=1 dma2=0 + options opl3 io=0x388 + +Things to note: + + the wavefront options "io" and "irq" ***MUST*** match the "synthio" + and "synthirq" cs4232 options. + + you can do without the opl3 module if you don't + want to use the OPL/[34] synth on the soundcard + + the opl3 io parameter is conventionally not adjustable. + +Please see drivers/sound/README.wavefront for more details. diff --git a/Documentation/sound/oss/VIA-chipset b/Documentation/sound/oss/VIA-chipset new file mode 100644 index 000000000000..37865234e54d --- /dev/null +++ b/Documentation/sound/oss/VIA-chipset @@ -0,0 +1,43 @@ +Running sound cards on VIA chipsets + +o There are problems with VIA chipsets and sound cards that appear to + lock the hardware solidly. Test programs under DOS have verified the + problem exists on at least some (but apparently not all) VIA boards + +o VIA have so far failed to bother to answer support mail on the subject + so if you are a VIA engineer feeling aggrieved as you read this + document go chase your own people. If there is a workaround please + let us know so we can implement it. + + +Certain patterns of ISA DMA access used for most PC sound cards cause the +VIA chipsets to lock up. From the collected reports this appears to cover a +wide range of boards. Some also lock up with sound cards under Win* as well. + +Linux implements a workaround providing your chipset is PCI and you compiled +with PCI Quirks enabled. If so you will see a message + "Activating ISA DMA bug workarounds" + +during booting. If you have a VIA PCI chipset that hangs when you use the +sound and is not generating this message even with PCI quirks enabled +please report the information to the linux-kernel list (see REPORTING-BUGS). + +If you are one of the tiny number of unfortunates with a 486 ISA/VLB VIA +chipset board you need to do the following to build a special kernel for +your board + + edit linux/include/asm-i386/dma.h + +change + +#define isa_dma_bridge_buggy (0) + +to + +#define isa_dma_bridge_buggy (1) + +and rebuild a kernel without PCI quirk support. + + +Other than this particular glitch the VIA [M]VP* chipsets appear to work +perfectly with Linux. diff --git a/Documentation/sound/oss/VIBRA16 b/Documentation/sound/oss/VIBRA16 new file mode 100644 index 000000000000..68a5a46beb88 --- /dev/null +++ b/Documentation/sound/oss/VIBRA16 @@ -0,0 +1,80 @@ +Sound Blaster 16X Vibra addendum +-------------------------------- +by Marius Ilioaea <mariusi@protv.ro> + Stefan Laudat <stefan@asit.ro> + +Sat Mar 6 23:55:27 EET 1999 + + Hello again, + + Playing with a SB Vibra 16x soundcard we found it very difficult +to setup because the kernel reported a lot of DMA errors and wouldn't +simply play any sound. + A good starting point is that the vibra16x chip full-duplex facility +is neither still exploited by the sb driver found in the linux kernel +(tried it with a 2.2.2-ac7), nor in the commercial OSS package (it reports +it as half-duplex soundcard). Oh, I almost forgot, the RedHat sndconfig +failed detecting it ;) + So, the big problem still remains, because the sb module wants a +8-bit and a 16-bit dma, which we could not allocate for vibra... it supports +only two 8-bit dma channels, the second one will be passed to the module +as a 16 bit channel, the kernel will yield about that but everything will +be okay, trust us. + The only inconvenient you may find is that you will have +some sound playing jitters if you have HDD dma support enabled - but this +will happen with almost all soundcards... + + A fully working isapnp.conf is just here: + +<snip here> + +(READPORT 0x0203) +(ISOLATE PRESERVE) +(IDENTIFY *) +(VERBOSITY 2) +(CONFLICT (IO FATAL)(IRQ FATAL)(DMA FATAL)(MEM FATAL)) # or WARNING +# SB 16 and OPL3 devices +(CONFIGURE CTL00f0/-1 (LD 0 +(INT 0 (IRQ 5 (MODE +E))) +(DMA 0 (CHANNEL 1)) +(DMA 1 (CHANNEL 3)) +(IO 0 (SIZE 16) (BASE 0x0220)) +(IO 2 (SIZE 4) (BASE 0x0388)) +(NAME "CTL00f0/-1[0]{Audio }") +(ACT Y) +)) + +# Joystick device - only if you need it :-/ + +(CONFIGURE CTL00f0/-1 (LD 1 +(IO 0 (SIZE 1) (BASE 0x0200)) +(NAME "CTL00f0/-1[1]{Game }") +(ACT Y) +)) +(WAITFORKEY) + +<end of snipping> + + So, after a good kernel modules compilation and a 'depmod -a kernel_ver' +you may want to: + +modprobe sb io=0x220 irq=5 dma=1 dma16=3 + + Or, take the hard way: + +modprobe soundcore +modprobe sound +modprobe uart401 +modprobe sb io=0x220 irq=5 dma=1 dma16=3 +# do you need MIDI? +modprobe opl3=0x388 + + Just in case, the kernel sound support should be: + +CONFIG_SOUND=m +CONFIG_SOUND_OSS=m +CONFIG_SOUND_SB=m + + Enjoy your new noisy Linux box! ;) + + diff --git a/Documentation/sound/oss/WaveArtist b/Documentation/sound/oss/WaveArtist new file mode 100644 index 000000000000..f4f3407cd818 --- /dev/null +++ b/Documentation/sound/oss/WaveArtist @@ -0,0 +1,170 @@ + + (the following is from the armlinux CVS) + + WaveArtist mixer and volume levels can be accessed via these commands: + + nn30 read registers nn, where nn = 00 - 09 for mixer settings + 0a - 13 for channel volumes + mm31 write the volume setting in pairs, where mm = (nn - 10) / 2 + rr32 write the mixer settings in pairs, where rr = nn/2 + xx33 reset all settings to default + 0y34 select mono source, y=0 = left, y=1 = right + + bits + nn 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 00 | 0 | 0 0 1 1 | left line mixer gain | left aux1 mixer gain |lmute| +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 01 | 0 | 0 1 0 1 | left aux2 mixer gain | right 2 left mic gain |mmute| +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 02 | 0 | 0 1 1 1 | left mic mixer gain | left mic | left mixer gain |dith | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 03 | 0 | 1 0 0 1 | left mixer input select |lrfg | left ADC gain | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 04 | 0 | 1 0 1 1 | right line mixer gain | right aux1 mixer gain |rmute| +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 05 | 0 | 1 1 0 1 | right aux2 mixer gain | left 2 right mic gain |test | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 06 | 0 | 1 1 1 1 | right mic mixer gain | right mic |right mixer gain |rbyps| +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 07 | 1 | 0 0 0 1 | right mixer select |rrfg | right ADC gain | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 08 | 1 | 0 0 1 1 | mono mixer gain |right ADC mux sel|left ADC mux sel | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 09 | 1 | 0 1 0 1 |loopb|left linout|loop|ADCch|TxFch|OffCD|test |loopb|loopb|osamp| +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 0a | 0 | left PCM channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 0b | 0 | right PCM channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 0c | 0 | left FM channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 0d | 0 | right FM channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 0e | 0 | left wavetable channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 0f | 0 | right wavetable channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 10 | 0 | left PCM expansion channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 11 | 0 | right PCM expansion channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 12 | 0 | left FM expansion channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + 13 | 0 | right FM expansion channel volume | +----+---+------------+-----+-----+-----+----+-----+-----+-----+-----+-----+-----+-----+ + + lmute: left mute + mmute: mono mute + dith: dithds + lrfg: + rmute: right mute + rbyps: right bypass + rrfg: + ADCch: + TxFch: + OffCD: + osamp: + + And the following diagram is derived from the description in the CVS archive: + + MIC L (mouthpiece) + +------+ + -->PreAmp>-\ + +--^---+ | + | | + r2b4-5 | +--------+ + /----*-------------------------------->5 | + | | | + | /----------------------------------->4 | + | | | | + | | /--------------------------------->3 1of5 | +---+ + | | | | mux >-->AMP>--> ADC L + | | | /------------------------------->2 | +-^-+ + | | | | | | | + Line | | | | +----+ +------+ +---+ /---->1 | r3b3-0 + ------------*->mute>--> Gain >--> | | | | + L | | | +----+ +------+ | | | *->0 | + | | | | | | +---^----+ + Aux2 | | | +----+ +------+ | | | | + ----------*--->mute>--> Gain >--> M | | r8b0-2 + L | | +----+ +------+ | | | + | | | | \------\ + Aux1 | | +----+ +------+ | | | + --------*----->mute>--> Gain >--> I | | + L | +----+ +------+ | | | + | | | | + | +----+ +------+ | | +---+ | + *------->mute>--> Gain >--> X >-->AMP>--* + | +----+ +------+ | | +-^-+ | + | | | | | + | +----+ +------+ | | r2b1-3 | + | /----->mute>--> Gain >--> E | | + | | +----+ +------+ | | | + | | | | | + | | +----+ +------+ | | | + | | /--->mute>--> Gain >--> R | | + | | | +----+ +------+ | | | + | | | | | | r9b8-9 + | | | +----+ +------+ | | | | + | | | /->mute>--> Gain >--> | | +---v---+ + | | | | +----+ +------+ +---+ /-*->0 | + DAC | | | | | | | + ------------*----------------------------------->? | +----+ + L | | | | | Mux >-->mute>--> L output + | | | | /->? | +--^-+ + | | | | | | | | + | | | /--------->? | r0b0 + | | | | | | +-------+ + | | | | | | + Mono | | | | | | +-------+ + ----------* | \---> | +----+ + | | | | | | Mix >-->mute>--> Mono output + | | | | *-> | +--^-+ + | | | | | +-------+ | + | | | | | r1b0 + DAC | | | | | +-------+ + ------------*-------------------------*--------->1 | +----+ + R | | | | | | Mux >-->mute>--> R output + | | | | +----+ +------+ +---+ *->0 | +--^-+ + | | | \->mute>--> Gain >--> | | +---^---+ | + | | | +----+ +------+ | | | | r5b0 + | | | | | | r6b0 + | | | +----+ +------+ | | | + | | \--->mute>--> Gain >--> M | | + | | +----+ +------+ | | | + | | | | | + | | +----+ +------+ | | | + | *----->mute>--> Gain >--> I | | + | | +----+ +------+ | | | + | | | | | + | | +----+ +------+ | | +---+ | + \------->mute>--> Gain >--> X >-->AMP>--* + | +----+ +------+ | | +-^-+ | + /--/ | | | | + Aux1 | +----+ +------+ | | r6b1-3 | + -------*------>mute>--> Gain >--> E | | + R | | +----+ +------+ | | | + | | | | | + Aux2 | | +----+ +------+ | | /------/ + ---------*---->mute>--> Gain >--> R | | + R | | | +----+ +------+ | | | + | | | | | | +--------+ + Line | | | +----+ +------+ | | | *->0 | + -----------*-->mute>--> Gain >--> | | | | + R | | | | +----+ +------+ +---+ \---->1 | + | | | | | | + | | | \-------------------------------->2 | +---+ + | | | | Mux >-->AMP>--> ADC R + | | \---------------------------------->3 | +-^-+ + | | | | | + | \------------------------------------>4 | r7b3-0 + | | | + \-----*-------------------------------->5 | + | +---^----+ + r6b4-5 | | + | | r8b3-5 + +--v---+ | + -->PreAmp>-/ + +------+ + MIC R (electret mic) diff --git a/Documentation/sound/oss/Wavefront b/Documentation/sound/oss/Wavefront new file mode 100644 index 000000000000..16f57ea43052 --- /dev/null +++ b/Documentation/sound/oss/Wavefront @@ -0,0 +1,339 @@ + An OSS/Free Driver for WaveFront soundcards + (Turtle Beach Maui, Tropez, Tropez Plus) + + Paul Barton-Davis, July 1998 + + VERSION 0.2.5 + +Driver Status +------------- + +Requires: Kernel 2.1.106 or later (the driver is included with kernels +2.1.109 and above) + +As of 7/22/1998, this driver is currently in *BETA* state. This means +that it compiles and runs, and that I use it on my system (Linux +2.1.106) with some reasonably demanding applications and uses. I +believe the code is approaching an initial "finished" state that +provides bug-free support for the Tropez Plus. + +Please note that to date, the driver has ONLY been tested on a Tropez +Plus. I would very much like to hear (and help out) people with Tropez +and Maui cards, since I think the driver can support those cards as +well. + +Finally, the driver has not been tested (or even compiled) as a static +(non-modular) part of the kernel. Alan Cox's good work in modularizing +OSS/Free for Linux makes this rather unnecessary. + +Some Questions +-------------- + +********************************************************************** +0) What does this driver do that the maui driver did not ? +********************************************************************** + +* can fully initialize a WaveFront card from cold boot - no DOS + utilities needed +* working patch/sample/program loading and unloading (the maui + driver didn't document how to make this work, and assumed + user-level preparation of the patch data for writing + to the board. ick.) +* full user-level access to all WaveFront commands +* for the Tropez Plus, (primitive) control of the YSS225 FX processor +* Virtual MIDI mode supported - 2 MIDI devices accessible via the + WaveFront's MPU401/UART emulation. One + accesses the WaveFront synth, the other accesses the + external MIDI connector. Full MIDI read/write semantics + for both devices. +* OSS-compliant /dev/sequencer interface for the WaveFront synth, + including native and GUS-format patch downloading. +* semi-intelligent patch management (prototypical at this point) + +********************************************************************** +1) What to do about MIDI interfaces ? +********************************************************************** + +The Tropez Plus (and perhaps other WF cards) can in theory support up +to 2 physical MIDI interfaces. One of these is connected to the +ICS2115 chip (the WaveFront synth itself) and is controlled by +MPU/UART-401 emulation code running as part of the WaveFront OS. The +other is controlled by the CS4232 chip present on the board. However, +physical access to the CS4232 connector is difficult, and it is +unlikely (though not impossible) that you will want to use it. + +An older version of this driver introduced an additional kernel config +variable which controlled whether or not the CS4232 MIDI interface was +configured. Because of Alan Cox's work on modularizing the sound +drivers, and now backporting them to 2.0.34 kernels, there seems to be +little reason to support "static" configuration variables, and so this +has been abandoned in favor of *only* module parameters. Specifying +"mpuio" and "mpuirq" for the cs4232 parameter will result in the +CS4232 MIDI interface being configured; leaving them unspecified will +leave it unconfigured (and thus unusable). + +BTW, I have heard from one Tropez+ user that the CS4232 interface is +more reliable than the ICS2115 one. I have had no problems with the +latter, and I don't have the right cable to test the former one +out. Reports welcome. + +********************************************************************** +2) Why does line XXX of the code look like this .... ? +********************************************************************** + +Either because it's not finished yet, or because you're a better coder +than I am, or because you don't understand some aspect of how the card +or the code works. + +I absolutely welcome comments, criticisms and suggestions about the +design and implementation of the driver. + +********************************************************************** +3) What files are included ? +********************************************************************** + + drivers/sound/README.wavefront -- this file + + drivers/sound/wavefront.patch -- patches for the 2.1.106 sound drivers + needed to make the rest of this work + DO NOT USE IF YOU'VE APPLIED THEM + BEFORE, OR HAVE 2.1.109 OR ABOVE + + drivers/sound/wavfront.c -- the driver + drivers/sound/ys225.h -- data declarations for FX config + drivers/sound/ys225.c -- data definitions for FX config + drivers/sound/wf_midi.c -- the "uart401" driver + to support virtual MIDI mode. + include/wavefront.h -- the header file + Documentation/sound/oss/Tropez+ -- short docs on configuration + +********************************************************************** +4) How do I compile/install/use it ? +********************************************************************** + +PART ONE: install the source code into your sound driver directory + + cd <top-of-your-2.1.106-code-base-e.g.-/usr/src/linux> + tar -zxvf <where-you-put/wavefront.tar.gz> + +PART TWO: apply the patches + + DO THIS ONLY IF YOU HAVE A KERNEL VERSION BELOW 2.1.109 + AND HAVE NOT ALREADY INSTALLED THE PATCH(ES). + + cd drivers/sound + patch < wavefront.patch + +PART THREE: configure your kernel + + cd <top of your kernel tree> + make xconfig (or whichever config option you use) + + - choose YES for Sound Support + - choose MODULE (M) for OSS Sound Modules + - choose MODULE(M) to YM3812/OPL3 support + - choose MODULE(M) for WaveFront support + - choose MODULE(M) for CS4232 support + + - choose "N" for everything else (unless you have other + soundcards you want support for) + + + make boot + . + . + . + <whatever you normally do for a kernel install> + make modules + . + . + . + make modules_install + +Here's my autoconf.h SOUND section: + +/* + * Sound + */ +#define CONFIG_SOUND 1 +#undef CONFIG_SOUND_OSS +#define CONFIG_SOUND_OSS_MODULE 1 +#undef CONFIG_SOUND_PAS +#undef CONFIG_SOUND_SB +#undef CONFIG_SOUND_ADLIB +#undef CONFIG_SOUND_GUS +#undef CONFIG_SOUND_MPU401 +#undef CONFIG_SOUND_PSS +#undef CONFIG_SOUND_MSS +#undef CONFIG_SOUND_SSCAPE +#undef CONFIG_SOUND_TRIX +#undef CONFIG_SOUND_MAD16 +#undef CONFIG_SOUND_WAVEFRONT +#define CONFIG_SOUND_WAVEFRONT_MODULE 1 +#undef CONFIG_SOUND_CS4232 +#define CONFIG_SOUND_CS4232_MODULE 1 +#undef CONFIG_SOUND_MAUI +#undef CONFIG_SOUND_SGALAXY +#undef CONFIG_SOUND_OPL3SA1 +#undef CONFIG_SOUND_SOFTOSS +#undef CONFIG_SOUND_YM3812 +#define CONFIG_SOUND_YM3812_MODULE 1 +#undef CONFIG_SOUND_VMIDI +#undef CONFIG_SOUND_UART6850 +/* + * Additional low level sound drivers + */ +#undef CONFIG_LOWLEVEL_SOUND + +************************************************************ +6) How do I configure my card ? +************************************************************ + +You need to edit /etc/modprobe.conf. Here's mine (edited to show the +relevant details): + + # Sound system + alias char-major-14-* wavefront + alias synth0 wavefront + alias mixer0 cs4232 + alias audio0 cs4232 + install wavefront /sbin/modprobe cs4232 && /sbin/modprobe -i wavefront && /sbin/modprobe opl3 + options wavefront io=0x200 irq=9 + options cs4232 synthirq=9 synthio=0x200 io=0x530 irq=5 dma=1 dma2=0 + options opl3 io=0x388 + +Things to note: + + the wavefront options "io" and "irq" ***MUST*** match the "synthio" + and "synthirq" cs4232 options. + + you can do without the opl3 module if you don't + want to use the OPL/[34] FM synth on the soundcard + + the opl3 io parameter is conventionally not adjustable. + In theory, any not-in-use IO port address would work, but + just use 0x388 and stick with the crowd. + +********************************************************************** +7) What about firmware ? +********************************************************************** + +Turtle Beach have not given me permission to distribute their firmware +for the ICS2115. However, if you have a WaveFront card, then you +almost certainly have the firmware, and if not, its freely available +on their website, at: + + http://www.tbeach.com/tbs/downloads/scardsdown.htm#tropezplus + +The file is called WFOS2001.MOT (for the Tropez+). + +This driver, however, doesn't use the pure firmware as distributed, +but instead relies on a somewhat processed form of it. You can +generate this very easily. Following an idea from Andrew Veliath's +Pinnacle driver, the following flex program will generate the +processed version: + +---- cut here ------------------------- +%option main +%% +^S[28].*\r$ printf ("%c%.*s", yyleng-1,yyleng-1,yytext); +<<EOF>> { fputc ('\0', stdout); return; } +\n {} +. {} +---- cut here ------------------------- + +To use it, put the above in file (say, ws.l) compile it like this: + + shell> flex -ows.c ws.l + shell> cc -o ws ws.c + +and then use it like this: + + ws < my-copy-of-the-oswf.mot-file > /etc/sound/wavefront.os + +If you put it somewhere else, you'll always have to use the wf_ospath +module parameter (see below) or alter the source code. + +********************************************************************** +7) How do I get it working ? +********************************************************************** + +Optionally, you can reboot with the "new" kernel (even though the only +changes have really been made to a module). + +Then, as root do: + + modprobe wavefront + +You should get something like this in /var/log/messages: + + WaveFront: firmware 1.20 already loaded. + +or + + WaveFront: no response to firmware probe, assume raw. + +then: + + WaveFront: waiting for memory configuration ... + WaveFront: hardware version 1.64 + WaveFront: available DRAM 8191k + WaveFront: 332 samples used (266 real, 13 aliases, 53 multi), 180 empty + WaveFront: 128 programs slots in use + WaveFront: 256 patch slots filled, 142 in use + +The whole process takes about 16 seconds, the longest waits being +after reporting the hardware version (during the firmware download), +and after reporting program status (during patch status inquiry). Its +shorter (about 10 secs) if the firmware is already loaded (i.e. only +warm reboots since the last firmware load). + +The "available DRAM" line will vary depending on how much added RAM +your card has. Mine has 8MB. + +To check basically functionality, use play(1) or splay(1) to send a +.WAV or other audio file through the audio portion. Then use playmidi +to play a General MIDI file. Try the "-D 0" to hear the +difference between sending MIDI to the WaveFront and using the OPL/3, +which is the default (I think ...). If you have an external synth(s) +hooked to the soundcard, you can use "-e" to route to the +external synth(s) (in theory, -D 1 should work as well, but I think +there is a bug in playmidi which prevents this from doing what it +should). + +********************************************************************** +8) What are the module parameters ? +********************************************************************** + +Its best to read wavefront.c for this, but here is a summary: + +integers: + wf_raw - if set, ignore apparent presence of firmware + loaded onto the ICS2115, reset the whole + board, and initialize it from scratch. (default = 0) + + fx_raw - if set, always initialize the YSS225 processor + on the Tropez plus. (default = 1) + + < The next 4 are basically for kernel hackers to allow + tweaking the driver for testing purposes. > + + wait_usecs - loop timer used when waiting for + status conditions on the board. + The default is 150. + + debug_default - debugging flags. See sound/wavefront.h + for WF_DEBUG_* values. Default is zero. + Setting this allows you to debug the + driver during module installation. +strings: + ospath - path to get to the pre-processed OS firmware. + (default: /etc/sound/wavefront.os) + +********************************************************************** +9) Who should I contact if I have problems? +********************************************************************** + +Just me: Paul Barton-Davis <pbd@op.net> + + diff --git a/Documentation/sound/oss/btaudio b/Documentation/sound/oss/btaudio new file mode 100644 index 000000000000..1a693e69d44b --- /dev/null +++ b/Documentation/sound/oss/btaudio @@ -0,0 +1,92 @@ + +Intro +===== + +people start bugging me about this with questions, looks like I +should write up some documentation for this beast. That way I +don't have to answer that much mails I hope. Yes, I'm lazy... + + +You might have noticed that the bt878 grabber cards have actually +_two_ PCI functions: + +$ lspci +[ ... ] +00:0a.0 Multimedia video controller: Brooktree Corporation Bt878 (rev 02) +00:0a.1 Multimedia controller: Brooktree Corporation Bt878 (rev 02) +[ ... ] + +The first does video, it is backward compatible to the bt848. The second +does audio. btaudio is a driver for the second function. It's a sound +driver which can be used for recording sound (and _only_ recording, no +playback). As most TV cards come with a short cable which can be plugged +into your sound card's line-in you probably don't need this driver if all +you want to do is just watching TV... + + +Driver Status +============= + +Still somewhat experimental. The driver should work stable, i.e. it +should'nt crash your box. It might not work as expected, have bugs, +not being fully OSS API compilant, ... + +Latest versions are available from http://bytesex.org/bttv/, the +driver is in the bttv tarball. Kernel patches might be available too, +have a look at http://bytesex.org/bttv/listing.html. + +The chip knows two different modes. btaudio registers two dsp +devices, one for each mode. They can not be used at the same time. + + +Digital audio mode +================== + +The chip gives you 16 bit stereo sound. The sample rate depends on +the external source which feeds the bt878 with digital sound via I2S +interface. There is a insmod option (rate) to tell the driver which +sample rate the hardware uses (32000 is the default). + +One possible source for digital sound is the msp34xx audio processor +chip which provides digital sound via I2S with 32 kHz sample rate. My +Hauppauge board works this way. + +The Osprey-200 reportly gives you digital sound with 44100 Hz sample +rate. It is also possible that you get no sound at all. + + +analog mode (A/D) +================= + +You can tell the driver to use this mode with the insmod option "analog=1". +The chip has three analog inputs. Consequently you'll get a mixer device +to control these. + +The analog mode supports mono only. Both 8 + 16 bit. Both are _signed_ +int, which is uncommon for the 8 bit case. Sample rate range is 119 kHz +to 448 kHz. Yes, the number of digits is correct. The driver supports +downsampling by powers of two, so you can ask for more usual sample rates +like 44 kHz too. + +With my Hauppauge I get noisy sound on the second input (mapped to line2 +by the mixer device). Others get a useable signal on line1. + + +some examples +============= + +* read audio data from btaudio (dsp2), send to es1730 (dsp,dsp1): + $ sox -w -r 32000 -t ossdsp /dev/dsp2 -t ossdsp /dev/dsp + +* read audio data from btaudio, send to esound daemon (which might be + running on another host): + $ sox -c 2 -w -r 32000 -t ossdsp /dev/dsp2 -t sw - | esdcat -r 32000 + $ sox -c 1 -w -r 32000 -t ossdsp /dev/dsp2 -t sw - | esdcat -m -r 32000 + + +Have fun, + + Gerd + +-- +Gerd Knorr <kraxel@bytesex.org> diff --git a/Documentation/sound/oss/cs46xx b/Documentation/sound/oss/cs46xx new file mode 100644 index 000000000000..88d6cf8b39f3 --- /dev/null +++ b/Documentation/sound/oss/cs46xx @@ -0,0 +1,138 @@ + +Documentation for the Cirrus Logic/Crystal SoundFusion cs46xx/cs4280 audio +controller chips (2001/05/11) + +The cs46xx audio driver supports the DSP line of Cirrus controllers. +Specifically, the cs4610, cs4612, cs4614, cs4622, cs4624, cs4630 and the cs4280 +products. This driver uses the generic ac97_codec driver for AC97 codec +support. + + +Features: + +Full Duplex Playback/Capture supported from 8k-48k. +16Bit Signed LE & 8Bit Unsigned, with Mono or Stereo supported. + +APM/PM - 2.2.x PM is enabled and functional. APM can also +be enabled for 2.4.x by modifying the CS46XX_ACPI_SUPPORT macro +definition. + +DMA playback buffer size is configurable from 16k (defaultorder=2) up to 2Meg +(defaultorder=11). DMA capture buffer size is fixed at a single 4k page as +two 2k fragments. + +MMAP seems to work well with QuakeIII, and test XMMS plugin. + +Myth2 works, but the polling logic is not fully correct, but is functional. + +The 2.4.4-ac6 gameport code in the cs461x joystick driver has been tested +with a Microsoft Sidewinder joystick (cs461x.o and sidewinder.o). This +audio driver must be loaded prior to the joystick driver to enable the +DSP task image supporting the joystick device. + + +Limitations: + +SPDIF is currently not supported. + +Primary codec support only. No secondary codec support is implemented. + + + +NOTES: + +Hercules Game Theatre XP - the EGPIO2 pin controls the external Amp, +and has been tested. +Module parameter hercules_egpio_disable set to 1, will force a 0 to EGPIODR +to disable the external amplifier. + +VTB Santa Cruz - the GPIO7/GPIO8 on the Secondary Codec control +the external amplifier for the "back" speakers, since we do not +support the secondary codec then this external amp is not +turned on. The primary codec external amplifier is supported but +note that the AC97 EAPD bit is inverted logic (amp_voyetra()). + +DMA buffer size - there are issues with many of the Linux applications +concerning the optimal buffer size. Several applications request a +certain fragment size and number and then do not verify that the driver +has the ability to support the requested configuration. +SNDCTL_DSP_SETFRAGMENT ioctl is used to request a fragment size and +number of fragments. Some applications exit if an error is returned +on this particular ioctl. Therefore, in alignment with the other OSS audio +drivers, no error is returned when a SETFRAGs IOCTL is received, but the +values passed from the app are not used in any buffer calculation +(ossfragshift/ossmaxfrags are not used). +Use the "defaultorder=N" module parameter to change the buffer size if +you have an application that requires a specific number of fragments +or a specific buffer size (see below). + +Debug Interface +--------------- +There is an ioctl debug interface to allow runtime modification of the +debug print levels. This debug interface code can be disabled from the +compilation process with commenting the following define: +#define CSDEBUG_INTERFACE 1 +There is also a debug print methodolgy to select printf statements from +different areas of the driver. A debug print level is also used to allow +additional printfs to be active. Comment out the following line in the +driver to disable compilation of the CS_DBGOUT print statements: +#define CSDEBUG 1 + +Please see the definitions for cs_debuglevel and cs_debugmask for additional +information on the debug levels and sections. + +There is also a csdbg executable to allow runtime manipulation of these +parameters. for a copy email: twoller@crystal.cirrus.com + + + +MODULE_PARMS definitions +------------------------ +MODULE_PARM(defaultorder, "i"); +defaultorder=N +where N is a value from 1 to 12 +The buffer order determines the size of the dma buffer for the driver. +under Linux, a smaller buffer allows more responsiveness from many of the +applications (e.g. games). A larger buffer allows some of the apps (esound) +to not underrun the dma buffer as easily. As default, use 32k (order=3) +rather than 64k as some of the games work more responsively. +(2^N) * PAGE_SIZE = allocated buffer size + +MODULE_PARM(cs_debuglevel, "i"); +MODULE_PARM(cs_debugmask, "i"); +cs_debuglevel=N +cs_debugmask=0xMMMMMMMM +where N is a value from 0 (no debug printfs), to 9 (maximum) +0xMMMMMMMM is a debug mask corresponding to the CS_xxx bits (see driver source). + +MODULE_PARM(hercules_egpio_disable, "i"); +hercules_egpio_disable=N +where N is a 0 (enable egpio), or a 1 (disable egpio support) + +MODULE_PARM(initdelay, "i"); +initdelay=N +This value is used to determine the millescond delay during the initialization +code prior to powering up the PLL. On laptops this value can be used to +assist with errors on resume, mostly with IBM laptops. Basically, if the +system is booted under battery power then the mdelay()/udelay() functions fail to +properly delay the required time. Also, if the system is booted under AC power +and then the power removed, the mdelay()/udelay() functions will not delay properly. + +MODULE_PARM(powerdown, "i"); +powerdown=N +where N is 0 (disable any powerdown of the internal blocks) or 1 (enable powerdown) + + +MODULE_PARM(external_amp, "i"); +external_amp=1 +if N is set to 1, then force enabling the EAPD support in the primary AC97 codec. +override the detection logic and force the external amp bit in the AC97 0x26 register +to be reset (0). EAPD should be 0 for powerup, and 1 for powerdown. The VTB Santa Cruz +card has inverted logic, so there is a special function for these cards. + +MODULE_PARM(thinkpad, "i"); +thinkpad=1 +if N is set to 1, then force enabling the clkrun functionality. +Currently, when the part is being used, then clkrun is disabled for the entire system, +but re-enabled when the driver is released or there is no outstanding open count. + diff --git a/Documentation/sound/oss/es1370 b/Documentation/sound/oss/es1370 new file mode 100644 index 000000000000..7b38b1a096a3 --- /dev/null +++ b/Documentation/sound/oss/es1370 @@ -0,0 +1,70 @@ +/proc/sound, /dev/sndstat +------------------------- + +/proc/sound and /dev/sndstat is not supported by the +driver. To find out whether the driver succeeded loading, +check the kernel log (dmesg). + + +ALaw/uLaw sample formats +------------------------ + +This driver does not support the ALaw/uLaw sample formats. +ALaw is the default mode when opening a sound device +using OSS/Free. The reason for the lack of support is +that the hardware does not support these formats, and adding +conversion routines to the kernel would lead to very ugly +code in the presence of the mmap interface to the driver. +And since xquake uses mmap, mmap is considered important :-) +and no sane application uses ALaw/uLaw these days anyway. +In short, playing a Sun .au file as follows: + +cat my_file.au > /dev/dsp + +does not work. Instead, you may use the play script from +Chris Bagwell's sox-12.14 package (available from the URL +below) to play many different audio file formats. +The script automatically determines the audio format +and does do audio conversions if necessary. +http://home.sprynet.com/sprynet/cbagwell/projects.html + + +Blocking vs. nonblocking IO +--------------------------- + +Unlike OSS/Free this driver honours the O_NONBLOCK file flag +not only during open, but also during read and write. +This is an effort to make the sound driver interface more +regular. Timidity has problems with this; a patch +is available from http://www.ife.ee.ethz.ch/~sailer/linux/pciaudio.html. +(Timidity patched will also run on OSS/Free). + + +MIDI UART +--------- + +The driver supports a simple MIDI UART interface, with +no ioctl's supported. + + +MIDI synthesizer +---------------- + +This soundcard does not have any hardware MIDI synthesizer; +MIDI synthesis has to be done in software. To allow this +the driver/soundcard supports two PCM (/dev/dsp) interfaces. +The second one goes to the mixer "synth" setting and supports +only a limited set of sampling rates (44100, 22050, 11025, 5512). +By setting lineout to 1 on the driver command line +(eg. insmod es1370 lineout=1) it is even possible on some +cards to convert the LINEIN jack into a second LINEOUT jack, thus +making it possible to output four independent audio channels! + +There is a freely available software package that allows +MIDI file playback on this soundcard called Timidity. +See http://www.cgs.fi/~tt/timidity/. + + + +Thomas Sailer +t.sailer@alumni.ethz.ch diff --git a/Documentation/sound/oss/es1371 b/Documentation/sound/oss/es1371 new file mode 100644 index 000000000000..c3151266771c --- /dev/null +++ b/Documentation/sound/oss/es1371 @@ -0,0 +1,64 @@ +/proc/sound, /dev/sndstat +------------------------- + +/proc/sound and /dev/sndstat is not supported by the +driver. To find out whether the driver succeeded loading, +check the kernel log (dmesg). + + +ALaw/uLaw sample formats +------------------------ + +This driver does not support the ALaw/uLaw sample formats. +ALaw is the default mode when opening a sound device +using OSS/Free. The reason for the lack of support is +that the hardware does not support these formats, and adding +conversion routines to the kernel would lead to very ugly +code in the presence of the mmap interface to the driver. +And since xquake uses mmap, mmap is considered important :-) +and no sane application uses ALaw/uLaw these days anyway. +In short, playing a Sun .au file as follows: + +cat my_file.au > /dev/dsp + +does not work. Instead, you may use the play script from +Chris Bagwell's sox-12.14 package (available from the URL +below) to play many different audio file formats. +The script automatically determines the audio format +and does do audio conversions if necessary. +http://home.sprynet.com/sprynet/cbagwell/projects.html + + +Blocking vs. nonblocking IO +--------------------------- + +Unlike OSS/Free this driver honours the O_NONBLOCK file flag +not only during open, but also during read and write. +This is an effort to make the sound driver interface more +regular. Timidity has problems with this; a patch +is available from http://www.ife.ee.ethz.ch/~sailer/linux/pciaudio.html. +(Timidity patched will also run on OSS/Free). + + +MIDI UART +--------- + +The driver supports a simple MIDI UART interface, with +no ioctl's supported. + + +MIDI synthesizer +---------------- + +This soundcard does not have any hardware MIDI synthesizer; +MIDI synthesis has to be done in software. To allow this +the driver/soundcard supports two PCM (/dev/dsp) interfaces. + +There is a freely available software package that allows +MIDI file playback on this soundcard called Timidity. +See http://www.cgs.fi/~tt/timidity/. + + + +Thomas Sailer +t.sailer@alumni.ethz.ch diff --git a/Documentation/sound/oss/mwave b/Documentation/sound/oss/mwave new file mode 100644 index 000000000000..858334bb46b0 --- /dev/null +++ b/Documentation/sound/oss/mwave @@ -0,0 +1,185 @@ + How to try to survive an IBM Mwave under Linux SB drivers + + ++ IBM have now released documentation of sorts and Torsten is busy + trying to make the Mwave work. This is not however a trivial task. + +---------------------------------------------------------------------------- + +OK, first thing - the IRQ problem IS a problem, whether the test is bypassed or +not. It is NOT a Linux problem, but an MWAVE problem that is fixed with the +latest MWAVE patches. So, in other words, don't bypass the test for MWAVES! + +I have Windows 95 on /dev/hda1, swap on /dev/hda2, and Red Hat 5 on /dev/hda3. + +The steps, then: + + Boot to Linux. + Mount Windows 95 file system (assume mount point = /dos95). + mkdir /dos95/linux + mkdir /dos95/linux/boot + mkdir /dos95/linux/boot/parms + + Copy the kernel, any initrd image, and loadlin to /dos95/linux/boot/. + + Reboot to Windows 95. + + Edit C:/msdos.sys and add or change the following: + + Logo=0 + BootGUI=0 + + Note that msdos.sys is a text file but it needs to be made 'unhidden', + readable and writable before it can be edited. This can be done with + DOS' "attrib" command. + + Edit config.sys to have multiple config menus. I have one for windows 95 and + five for Linux, like this: +------------ +[menu] +menuitem=W95, Windows 95 +menuitem=LINTP, Linux - ThinkPad +menuitem=LINTP3, Linux - ThinkPad Console +menuitem=LINDOC, Linux - Docked +menuitem=LINDOC3, Linux - Docked Console +menuitem=LIN1, Linux - Single User Mode +REM menudefault=W95,10 + +[W95] + +[LINTP] + +[LINDOC] + +[LINTP3] + +[LINDOC3] + +[LIN1] + +[COMMON] +FILES=30 +REM Please read README.TXT in C:\MWW subdirectory before changing the DOS= statement. +DOS=HIGH,UMB +DEVICE=C:\MWW\MANAGER\MWD50430.EXE +SHELL=c:\command.com /e:2048 +------------------- + +The important things are the SHELL and DEVICE statements. + + Then change autoexec.bat. Basically everything in there originally should be + done ONLY when Windows 95 is booted. Then you add new things specifically + for Linux. Mine is as follows + +--------------- +@ECHO OFF +if "%CONFIG%" == "W95" goto W95 + +REM +REM Linux stuff +REM +SET MWPATH=C:\MWW\DLL;C:\MWW\MWGAMES;C:\MWW\DSP +SET BLASTER=A220 I5 D1 +SET MWROOT=C:\MWW +SET LIBPATH=C:\MWW\DLL +SET PATH=C:\WINDOWS;C:\MWW\DLL; +CALL MWAVE START NOSHOW +c:\linux\boot\loadlin.exe @c:\linux\boot\parms\%CONFIG%.par + +:W95 +REM +REM Windows 95 stuff +REM +c:\toolkit\guard +SET MSINPUT=C:\MSINPUT +SET MWPATH=C:\MWW\DLL;C:\MWW\MWGAMES;C:\MWW\DSP +REM The following is used by DOS games to recognize Sound Blaster hardware. +REM If hardware settings are changed, please change this line as well. +REM See the Mwave README file for instructions. +SET BLASTER=A220 I5 D1 +SET MWROOT=C:\MWW +SET LIBPATH=C:\MWW\DLL +SET PATH=C:\WINDOWS;C:\WINDOWS\COMMAND;E:\ORAWIN95\BIN;f:\msdev\bin;e:\v30\bin.dbg;v:\devt\v30\bin;c:\JavaSDK\Bin;C:\MWW\DLL; +SET INCLUDE=f:\MSDEV\INCLUDE;F:\MSDEV\MFC\INCLUDE +SET LIB=F:\MSDEV\LIB;F:\MSDEV\MFC\LIB +win + +------------------------ + +Now build a file in c:\linux\boot\parms for each Linux config that you have. + +For example, my LINDOC3 config is for a docked Thinkpad at runlevel 3 with no +initrd image, and has a parameter file named LINDOC3.PAR in c:\linux\boot\parms: + +----------------------- +# LOADLIN @param_file image=other_image root=/dev/other +# +# Linux Console in docking station +# +c:\linux\boot\zImage.krn # First value must be filename of Linux kernel. +root=/dev/hda3 # device which gets mounted as root FS +ro # Other kernel arguments go here. +apm=off +doc=yes +3 +----------------------- + +The doc=yes parameter is an environment variable used by my init scripts, not +a kernel argument. + +However, the apm=off parameter IS a kernel argument! APM, at least in my setup, +causes the kernel to crash when loaded via loadlin (but NOT when loaded via +LILO). The APM stuff COULD be forced out of the kernel via the kernel compile +options. Instead, I got an unofficial patch to the APM drivers that allows them +to be dynamically deactivated via kernel arguments. Whatever you chose to +document, APM, it seems, MUST be off for setups like mine. + +Now make sure C:\MWW\MWCONFIG.REF looks like this: + +---------------------- +[NativeDOS] +Default=SB1.5 +SBInputSource=CD +SYNTH=FM +QSound=OFF +Reverb=OFF +Chorus=OFF +ReverbDepth=5 +ChorusDepth=5 +SBInputVolume=5 +SBMainVolume=10 +SBWaveVolume=10 +SBSynthVolume=10 +WaveTableVolume=10 +AudioPowerDriver=ON + +[FastCFG] +Show=No +HideOption=Off +----------------------------- + +OR the Default= line COULD be + +Default=SBPRO + +Reboot to Windows 95 and choose Linux. When booted, use sndconfig to configure +the sound modules and voilà - ThinkPad sound with Linux. + +Now the gotchas - you can either have CD sound OR Mixers but not both. That's a +problem with the SB1.5 (CD sound) or SBPRO (Mixers) settings. No one knows why +this is! + +For some reason MPEG3 files, when played through mpg123, sound like they +are playing at 1/8th speed - not very useful! If you have ANY insight +on why this second thing might be happening, I would be grateful. + +=========================================================== + _/ _/_/_/_/ + _/_/ _/_/ _/ + _/ _/_/ _/_/_/_/ Martin John Bartlett + _/ _/ _/ _/ (martin@nitram.demon.co.uk) +_/ _/_/_/_/ + _/ +_/ _/ + _/_/ +=========================================================== diff --git a/Documentation/sound/oss/rme96xx b/Documentation/sound/oss/rme96xx new file mode 100644 index 000000000000..87d7b7b65fa1 --- /dev/null +++ b/Documentation/sound/oss/rme96xx @@ -0,0 +1,767 @@ +Beta release of the rme96xx (driver for RME 96XX cards like the +"Hammerfall" and the "Hammerfall light") + +Important: The driver module has to be installed on a freshly rebooted system, +otherwise the driver might not be able to acquire its buffers. + +features: + + - OSS programming interface (i.e. runs with standard OSS soundsoftware) + - OSS/Multichannel interface (OSS multichannel is done by just aquiring + more than 2 channels). The driver does not use more than one device + ( yet .. this feature may be implemented later ) + - more than one RME card supported + +The driver uses a specific multichannel interface, which I will document +when the driver gets stable. (take a look at the defines in rme96xx.h, +which adds blocked multichannel formats i.e instead of +lrlrlrlr --> llllrrrr etc. + +Use the "rmectrl" programm to look at the status of the card .. +or use xrmectrl, a GUI interface for the ctrl program. + +What you can do with the rmectrl program is to set the stereo device for +OSS emulation (e.g. if you use SPDIF out). + +You do: + +./ctrl offset 24 24 + +which makes the stereo device use channels 25 and 26. + +Guenter Geiger <geiger@epy.co.at> + +copy the first part of the attached source code into rmectrl.c +and the second part into xrmectrl (or get the program from +http://gige.xdv.org/pages/soft/pages/rme) + +to compile: gcc -o rmectrl rmectrl.c +------------------------------ snip ------------------------------------ + +#include <stdio.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/ioctl.h> +#include <fcntl.h> +#include <linux/soundcard.h> +#include <math.h> +#include <unistd.h> +#include <stdlib.h> +#include "rme96xx.h" + +/* + remctrl.c + (C) 2000 Guenter Geiger <geiger@debian.org> + HP20020201 - Heiko Purnhagen <purnhage@tnt.uni-hannover.de> +*/ + +/* # define DEVICE_NAME "/dev/mixer" */ +# define DEVICE_NAME "/dev/mixer1" + + +void usage(void) +{ + fprintf(stderr,"usage: rmectrl [/dev/mixer<n>] [command [options]]\n\n"); + fprintf(stderr,"where command is one of:\n"); + fprintf(stderr," help show this help\n"); + fprintf(stderr," status show status bits\n"); + fprintf(stderr," control show control bits\n"); + fprintf(stderr," mix show mixer/offset status\n"); + fprintf(stderr," master <n> set sync master\n"); + fprintf(stderr," pro <n> set spdif out pro\n"); + fprintf(stderr," emphasis <n> set spdif out emphasis\n"); + fprintf(stderr," dolby <n> set spdif out no audio\n"); + fprintf(stderr," optout <n> set spdif out optical\n"); + fprintf(stderr," wordclock <n> set sync wordclock\n"); + fprintf(stderr," spdifin <n> set spdif in (0=optical,1=coax,2=intern)\n"); + fprintf(stderr," syncref <n> set sync source (0=ADAT1,1=ADAT2,2=ADAT3,3=SPDIF)\n"); + fprintf(stderr," adat1cd <n> set ADAT1 on internal CD\n"); + fprintf(stderr," offset <devnr> <in> <out> set dev (0..3) offset (0..25)\n"); + exit(-1); +} + + +int main(int argc, char* argv[]) +{ + int cards; + int ret; + int i; + double ft; + int fd, fdwr; + int param,orig; + rme_status_t stat; + rme_ctrl_t ctrl; + char *device; + int argidx; + + if (argc < 2) + usage(); + + if (*argv[1]=='/') { + device = argv[1]; + argidx = 2; + } + else { + device = DEVICE_NAME; + argidx = 1; + } + + fprintf(stdout,"mixer device %s\n",device); + if ((fd = open(device,O_RDONLY)) < 0) { + fprintf(stdout,"opening device failed\n"); + exit(-1); + } + + if ((fdwr = open(device,O_WRONLY)) < 0) { + fprintf(stdout,"opening device failed\n"); + exit(-1); + } + + if (argc < argidx+1) + usage(); + + if (!strcmp(argv[argidx],"help")) + usage(); + if (!strcmp(argv[argidx],"-h")) + usage(); + if (!strcmp(argv[argidx],"--help")) + usage(); + + if (!strcmp(argv[argidx],"status")) { + ioctl(fd,SOUND_MIXER_PRIVATE2,&stat); + fprintf(stdout,"stat.irq %d\n",stat.irq); + fprintf(stdout,"stat.lockmask %d\n",stat.lockmask); + fprintf(stdout,"stat.sr48 %d\n",stat.sr48); + fprintf(stdout,"stat.wclock %d\n",stat.wclock); + fprintf(stdout,"stat.bufpoint %d\n",stat.bufpoint); + fprintf(stdout,"stat.syncmask %d\n",stat.syncmask); + fprintf(stdout,"stat.doublespeed %d\n",stat.doublespeed); + fprintf(stdout,"stat.tc_busy %d\n",stat.tc_busy); + fprintf(stdout,"stat.tc_out %d\n",stat.tc_out); + fprintf(stdout,"stat.crystalrate %d (0=64k 3=96k 4=88.2k 5=48k 6=44.1k 7=32k)\n",stat.crystalrate); + fprintf(stdout,"stat.spdif_error %d\n",stat.spdif_error); + fprintf(stdout,"stat.bufid %d\n",stat.bufid); + fprintf(stdout,"stat.tc_valid %d\n",stat.tc_valid); + exit (0); + } + + if (!strcmp(argv[argidx],"control")) { + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + fprintf(stdout,"ctrl.start %d\n",ctrl.start); + fprintf(stdout,"ctrl.latency %d (0=64 .. 7=8192)\n",ctrl.latency); + fprintf(stdout,"ctrl.master %d\n",ctrl.master); + fprintf(stdout,"ctrl.ie %d\n",ctrl.ie); + fprintf(stdout,"ctrl.sr48 %d\n",ctrl.sr48); + fprintf(stdout,"ctrl.spare %d\n",ctrl.spare); + fprintf(stdout,"ctrl.doublespeed %d\n",ctrl.doublespeed); + fprintf(stdout,"ctrl.pro %d\n",ctrl.pro); + fprintf(stdout,"ctrl.emphasis %d\n",ctrl.emphasis); + fprintf(stdout,"ctrl.dolby %d\n",ctrl.dolby); + fprintf(stdout,"ctrl.opt_out %d\n",ctrl.opt_out); + fprintf(stdout,"ctrl.wordclock %d\n",ctrl.wordclock); + fprintf(stdout,"ctrl.spdif_in %d (0=optical,1=coax,2=intern)\n",ctrl.spdif_in); + fprintf(stdout,"ctrl.sync_ref %d (0=ADAT1,1=ADAT2,2=ADAT3,3=SPDIF)\n",ctrl.sync_ref); + fprintf(stdout,"ctrl.spdif_reset %d\n",ctrl.spdif_reset); + fprintf(stdout,"ctrl.spdif_select %d\n",ctrl.spdif_select); + fprintf(stdout,"ctrl.spdif_clock %d\n",ctrl.spdif_clock); + fprintf(stdout,"ctrl.spdif_write %d\n",ctrl.spdif_write); + fprintf(stdout,"ctrl.adat1_cd %d\n",ctrl.adat1_cd); + exit (0); + } + + if (!strcmp(argv[argidx],"mix")) { + rme_mixer mix; + int i; + + for (i=0; i<4; i++) { + mix.devnr = i; + ioctl(fd,SOUND_MIXER_PRIVATE1,&mix); + if (mix.devnr == i) { + fprintf(stdout,"devnr %d\n",mix.devnr); + fprintf(stdout,"mix.i_offset %2d (0-25)\n",mix.i_offset); + fprintf(stdout,"mix.o_offset %2d (0-25)\n",mix.o_offset); + } + } + exit (0); + } + +/* the control flags */ + + if (argc < argidx+2) + usage(); + + if (!strcmp(argv[argidx],"master")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("master = %d\n",val); + ctrl.master = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + + if (!strcmp(argv[argidx],"pro")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("pro = %d\n",val); + ctrl.pro = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + + if (!strcmp(argv[argidx],"emphasis")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("emphasis = %d\n",val); + ctrl.emphasis = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + + if (!strcmp(argv[argidx],"dolby")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("dolby = %d\n",val); + ctrl.dolby = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + + if (!strcmp(argv[argidx],"optout")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("optout = %d\n",val); + ctrl.opt_out = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + + if (!strcmp(argv[argidx],"wordclock")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("wordclock = %d\n",val); + ctrl.wordclock = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + + if (!strcmp(argv[argidx],"spdifin")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("spdifin = %d\n",val); + ctrl.spdif_in = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + + if (!strcmp(argv[argidx],"syncref")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("syncref = %d\n",val); + ctrl.sync_ref = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + + if (!strcmp(argv[argidx],"adat1cd")) { + int val = atoi(argv[argidx+1]); + ioctl(fd,SOUND_MIXER_PRIVATE3,&ctrl); + printf("adat1cd = %d\n",val); + ctrl.adat1_cd = val; + ioctl(fdwr,SOUND_MIXER_PRIVATE3,&ctrl); + exit (0); + } + +/* setting offset */ + + if (argc < argidx+4) + usage(); + + if (!strcmp(argv[argidx],"offset")) { + rme_mixer mix; + + mix.devnr = atoi(argv[argidx+1]); + + mix.i_offset = atoi(argv[argidx+2]); + mix.o_offset = atoi(argv[argidx+3]); + ioctl(fdwr,SOUND_MIXER_PRIVATE1,&mix); + fprintf(stdout,"devnr %d\n",mix.devnr); + fprintf(stdout,"mix.i_offset to %d\n",mix.i_offset); + fprintf(stdout,"mix.o_offset to %d\n",mix.o_offset); + exit (0); + } + + usage(); + exit (0); /* to avoid warning */ +} + + +---------------------------- <snip> -------------------------------- +#!/usr/bin/wish + +# xrmectrl +# (C) 2000 Guenter Geiger <geiger@debian.org> +# HP20020201 - Heiko Purnhagen <purnhage@tnt.uni-hannover.de> + +#set defaults "-relief ridged" +set CTRLPROG "./rmectrl" +if {$argc} { + set CTRLPROG "$CTRLPROG $argv" +} +puts "CTRLPROG $CTRLPROG" + +frame .butts +button .butts.exit -text "Exit" -command "exit" -relief ridge +#button .butts.state -text "State" -command "get_all" + +pack .butts.exit -side left +pack .butts -side bottom + + +# +# STATUS +# + +frame .status + +# Sampling Rate + +frame .status.sr +label .status.sr.text -text "Sampling Rate" -justify left +radiobutton .status.sr.441 -selectcolor red -text "44.1 kHz" -width 10 -anchor nw -variable srate -value 44100 -font times +radiobutton .status.sr.480 -selectcolor red -text "48 kHz" -width 10 -anchor nw -variable srate -value 48000 -font times +radiobutton .status.sr.882 -selectcolor red -text "88.2 kHz" -width 10 -anchor nw -variable srate -value 88200 -font times +radiobutton .status.sr.960 -selectcolor red -text "96 kHz" -width 10 -anchor nw -variable srate -value 96000 -font times + +pack .status.sr.text .status.sr.441 .status.sr.480 .status.sr.882 .status.sr.960 -side top -padx 3 + +# Lock + +frame .status.lock +label .status.lock.text -text "Lock" -justify left +checkbutton .status.lock.adat1 -selectcolor red -text "ADAT1" -anchor nw -width 10 -variable adatlock1 -font times +checkbutton .status.lock.adat2 -selectcolor red -text "ADAT2" -anchor nw -width 10 -variable adatlock2 -font times +checkbutton .status.lock.adat3 -selectcolor red -text "ADAT3" -anchor nw -width 10 -variable adatlock3 -font times + +pack .status.lock.text .status.lock.adat1 .status.lock.adat2 .status.lock.adat3 -side top -padx 3 + +# Sync + +frame .status.sync +label .status.sync.text -text "Sync" -justify left +checkbutton .status.sync.adat1 -selectcolor red -text "ADAT1" -anchor nw -width 10 -variable adatsync1 -font times +checkbutton .status.sync.adat2 -selectcolor red -text "ADAT2" -anchor nw -width 10 -variable adatsync2 -font times +checkbutton .status.sync.adat3 -selectcolor red -text "ADAT3" -anchor nw -width 10 -variable adatsync3 -font times + +pack .status.sync.text .status.sync.adat1 .status.sync.adat2 .status.sync.adat3 -side top -padx 3 + +# Timecode + +frame .status.tc +label .status.tc.text -text "Timecode" -justify left +checkbutton .status.tc.busy -selectcolor red -text "busy" -anchor nw -width 10 -variable tcbusy -font times +checkbutton .status.tc.out -selectcolor red -text "out" -anchor nw -width 10 -variable tcout -font times +checkbutton .status.tc.valid -selectcolor red -text "valid" -anchor nw -width 10 -variable tcvalid -font times + +pack .status.tc.text .status.tc.busy .status.tc.out .status.tc.valid -side top -padx 3 + +# SPDIF In + +frame .status.spdif +label .status.spdif.text -text "SPDIF In" -justify left +label .status.spdif.sr -text "--.- kHz" -anchor n -width 10 -font times +checkbutton .status.spdif.error -selectcolor red -text "Input Lock" -anchor nw -width 10 -variable spdiferr -font times + +pack .status.spdif.text .status.spdif.sr .status.spdif.error -side top -padx 3 + +pack .status.sr .status.lock .status.sync .status.tc .status.spdif -side left -fill x -anchor n -expand 1 + + +# +# CONTROL +# + +proc setprof {} { + global CTRLPROG + global spprof + exec $CTRLPROG pro $spprof +} + +proc setemph {} { + global CTRLPROG + global spemph + exec $CTRLPROG emphasis $spemph +} + +proc setnoaud {} { + global CTRLPROG + global spnoaud + exec $CTRLPROG dolby $spnoaud +} + +proc setoptical {} { + global CTRLPROG + global spoptical + exec $CTRLPROG optout $spoptical +} + +proc setspdifin {} { + global CTRLPROG + global spdifin + exec $CTRLPROG spdifin [expr $spdifin - 1] +} + +proc setsyncsource {} { + global CTRLPROG + global syncsource + exec $CTRLPROG syncref [expr $syncsource -1] +} + + +proc setmaster {} { + global CTRLPROG + global master + exec $CTRLPROG master $master +} + +proc setwordclock {} { + global CTRLPROG + global wordclock + exec $CTRLPROG wordclock $wordclock +} + +proc setadat1cd {} { + global CTRLPROG + global adat1cd + exec $CTRLPROG adat1cd $adat1cd +} + + +frame .control + +# SPDIF In & SPDIF Out + + +frame .control.spdif + +frame .control.spdif.in +label .control.spdif.in.text -text "SPDIF In" -justify left +radiobutton .control.spdif.in.input1 -text "Optical" -anchor nw -width 13 -variable spdifin -value 1 -command setspdifin -selectcolor blue -font times +radiobutton .control.spdif.in.input2 -text "Coaxial" -anchor nw -width 13 -variable spdifin -value 2 -command setspdifin -selectcolor blue -font times +radiobutton .control.spdif.in.input3 -text "Intern " -anchor nw -width 13 -variable spdifin -command setspdifin -value 3 -selectcolor blue -font times + +checkbutton .control.spdif.in.adat1cd -text "ADAT1 Intern" -anchor nw -width 13 -variable adat1cd -command setadat1cd -selectcolor blue -font times + +pack .control.spdif.in.text .control.spdif.in.input1 .control.spdif.in.input2 .control.spdif.in.input3 .control.spdif.in.adat1cd + +label .control.spdif.space + +frame .control.spdif.out +label .control.spdif.out.text -text "SPDIF Out" -justify left +checkbutton .control.spdif.out.pro -text "Professional" -anchor nw -width 13 -variable spprof -command setprof -selectcolor blue -font times +checkbutton .control.spdif.out.emphasis -text "Emphasis" -anchor nw -width 13 -variable spemph -command setemph -selectcolor blue -font times +checkbutton .control.spdif.out.dolby -text "NoAudio" -anchor nw -width 13 -variable spnoaud -command setnoaud -selectcolor blue -font times +checkbutton .control.spdif.out.optout -text "Optical Out" -anchor nw -width 13 -variable spoptical -command setoptical -selectcolor blue -font times + +pack .control.spdif.out.optout .control.spdif.out.dolby .control.spdif.out.emphasis .control.spdif.out.pro .control.spdif.out.text -side bottom + +pack .control.spdif.in .control.spdif.space .control.spdif.out -side top -fill y -padx 3 -expand 1 + +# Sync Mode & Sync Source + +frame .control.sync +frame .control.sync.mode +label .control.sync.mode.text -text "Sync Mode" -justify left +checkbutton .control.sync.mode.master -text "Master" -anchor nw -width 13 -variable master -command setmaster -selectcolor blue -font times +checkbutton .control.sync.mode.wc -text "Wordclock" -anchor nw -width 13 -variable wordclock -command setwordclock -selectcolor blue -font times + +pack .control.sync.mode.text .control.sync.mode.master .control.sync.mode.wc + +label .control.sync.space + +frame .control.sync.src +label .control.sync.src.text -text "Sync Source" -justify left +radiobutton .control.sync.src.input1 -text "ADAT1" -anchor nw -width 13 -variable syncsource -value 1 -command setsyncsource -selectcolor blue -font times +radiobutton .control.sync.src.input2 -text "ADAT2" -anchor nw -width 13 -variable syncsource -value 2 -command setsyncsource -selectcolor blue -font times +radiobutton .control.sync.src.input3 -text "ADAT3" -anchor nw -width 13 -variable syncsource -command setsyncsource -value 3 -selectcolor blue -font times +radiobutton .control.sync.src.input4 -text "SPDIF" -anchor nw -width 13 -variable syncsource -command setsyncsource -value 4 -selectcolor blue -font times + +pack .control.sync.src.input4 .control.sync.src.input3 .control.sync.src.input2 .control.sync.src.input1 .control.sync.src.text -side bottom + +pack .control.sync.mode .control.sync.space .control.sync.src -side top -fill y -padx 3 -expand 1 + +label .control.space -text "" -width 10 + +# Buffer Size + +frame .control.buf +label .control.buf.text -text "Buffer Size (Latency)" -justify left +radiobutton .control.buf.b1 -selectcolor red -text "64 (1.5 ms)" -width 13 -anchor nw -variable ssrate -value 1 -font times +radiobutton .control.buf.b2 -selectcolor red -text "128 (3 ms)" -width 13 -anchor nw -variable ssrate -value 2 -font times +radiobutton .control.buf.b3 -selectcolor red -text "256 (6 ms)" -width 13 -anchor nw -variable ssrate -value 3 -font times +radiobutton .control.buf.b4 -selectcolor red -text "512 (12 ms)" -width 13 -anchor nw -variable ssrate -value 4 -font times +radiobutton .control.buf.b5 -selectcolor red -text "1024 (23 ms)" -width 13 -anchor nw -variable ssrate -value 5 -font times +radiobutton .control.buf.b6 -selectcolor red -text "2048 (46 ms)" -width 13 -anchor nw -variable ssrate -value 6 -font times +radiobutton .control.buf.b7 -selectcolor red -text "4096 (93 ms)" -width 13 -anchor nw -variable ssrate -value 7 -font times +radiobutton .control.buf.b8 -selectcolor red -text "8192 (186 ms)" -width 13 -anchor nw -variable ssrate -value 8 -font times + +pack .control.buf.text .control.buf.b1 .control.buf.b2 .control.buf.b3 .control.buf.b4 .control.buf.b5 .control.buf.b6 .control.buf.b7 .control.buf.b8 -side top -padx 3 + +# Offset + +frame .control.offset + +frame .control.offset.in +label .control.offset.in.text -text "Offset In" -justify left +label .control.offset.in.off0 -text "dev\#0: -" -anchor nw -width 10 -font times +label .control.offset.in.off1 -text "dev\#1: -" -anchor nw -width 10 -font times +label .control.offset.in.off2 -text "dev\#2: -" -anchor nw -width 10 -font times +label .control.offset.in.off3 -text "dev\#3: -" -anchor nw -width 10 -font times + +pack .control.offset.in.text .control.offset.in.off0 .control.offset.in.off1 .control.offset.in.off2 .control.offset.in.off3 + +label .control.offset.space + +frame .control.offset.out +label .control.offset.out.text -text "Offset Out" -justify left +label .control.offset.out.off0 -text "dev\#0: -" -anchor nw -width 10 -font times +label .control.offset.out.off1 -text "dev\#1: -" -anchor nw -width 10 -font times +label .control.offset.out.off2 -text "dev\#2: -" -anchor nw -width 10 -font times +label .control.offset.out.off3 -text "dev\#3: -" -anchor nw -width 10 -font times + +pack .control.offset.out.off3 .control.offset.out.off2 .control.offset.out.off1 .control.offset.out.off0 .control.offset.out.text -side bottom + +pack .control.offset.in .control.offset.space .control.offset.out -side top -fill y -padx 3 -expand 1 + + +pack .control.spdif .control.sync .control.space .control.buf .control.offset -side left -fill both -anchor n -expand 1 + + +label .statustext -text Status -justify center -relief ridge +label .controltext -text Control -justify center -relief ridge + +label .statusspace +label .controlspace + +pack .statustext .status .statusspace .controltext .control .controlspace -side top -anchor nw -fill both -expand 1 + + +proc get_bit {output sstr} { + set idx1 [string last [concat $sstr 1] $output] + set idx1 [expr $idx1 != -1] + return $idx1 +} + +proc get_val {output sstr} { + set val [string wordend $output [string last $sstr $output]] + set val [string range $output $val [expr $val+1]] + return $val +} + +proc get_val2 {output sstr} { + set val [string wordend $output [string first $sstr $output]] + set val [string range $output $val [expr $val+2]] + return $val +} + +proc get_control {} { + global spprof + global spemph + global spnoaud + global spoptical + global spdifin + global ssrate + global master + global wordclock + global syncsource + global CTRLPROG + + set f [open "| $CTRLPROG control" r+] + set ooo [read $f 1000] + close $f +# puts $ooo + + set spprof [ get_bit $ooo "pro"] + set spemph [ get_bit $ooo "emphasis"] + set spnoaud [ get_bit $ooo "dolby"] + set spoptical [ get_bit $ooo "opt_out"] + set spdifin [ expr [ get_val $ooo "spdif_in"] + 1] + set ssrate [ expr [ get_val $ooo "latency"] + 1] + set master [ expr [ get_val $ooo "master"]] + set wordclock [ expr [ get_val $ooo "wordclock"]] + set syncsource [ expr [ get_val $ooo "sync_ref"] + 1] +} + +proc get_status {} { + global srate + global ctrlcom + + global adatlock1 + global adatlock2 + global adatlock3 + + global adatsync1 + global adatsync2 + global adatsync3 + + global tcbusy + global tcout + global tcvalid + + global spdiferr + global crystal + global .status.spdif.text + global CTRLPROG + + + set f [open "| $CTRLPROG status" r+] + set ooo [read $f 1000] + close $f +# puts $ooo + +# samplerate + + set idx1 [string last "sr48 1" $ooo] + set idx2 [string last "doublespeed 1" $ooo] + if {$idx1 >= 0} { + set fact1 48000 + } else { + set fact1 44100 + } + + if {$idx2 >= 0} { + set fact2 2 + } else { + set fact2 1 + } + set srate [expr $fact1 * $fact2] +# ADAT lock + + set val [get_val $ooo lockmask] + set adatlock1 0 + set adatlock2 0 + set adatlock3 0 + if {[expr $val & 1]} { + set adatlock3 1 + } + if {[expr $val & 2]} { + set adatlock2 1 + } + if {[expr $val & 4]} { + set adatlock1 1 + } + +# ADAT sync + set val [get_val $ooo syncmask] + set adatsync1 0 + set adatsync2 0 + set adatsync3 0 + + if {[expr $val & 1]} { + set adatsync3 1 + } + if {[expr $val & 2]} { + set adatsync2 1 + } + if {[expr $val & 4]} { + set adatsync1 1 + } + +# TC busy + + set tcbusy [get_bit $ooo "busy"] + set tcout [get_bit $ooo "out"] + set tcvalid [get_bit $ooo "valid"] + set spdiferr [expr [get_bit $ooo "spdif_error"] == 0] + +# 000=64kHz, 100=88.2kHz, 011=96kHz +# 111=32kHz, 110=44.1kHz, 101=48kHz + + set val [get_val $ooo crystalrate] + + set crystal "--.- kHz" + if {$val == 0} { + set crystal "64 kHz" + } + if {$val == 4} { + set crystal "88.2 kHz" + } + if {$val == 3} { + set crystal "96 kHz" + } + if {$val == 7} { + set crystal "32 kHz" + } + if {$val == 6} { + set crystal "44.1 kHz" + } + if {$val == 5} { + set crystal "48 kHz" + } + .status.spdif.sr configure -text $crystal +} + +proc get_offset {} { + global inoffset + global outoffset + global CTRLPROG + + set f [open "| $CTRLPROG mix" r+] + set ooo [read $f 1000] + close $f +# puts $ooo + + if { [string match "*devnr*" $ooo] } { + set ooo [string range $ooo [string wordend $ooo [string first devnr $ooo]] end] + set val [get_val2 $ooo i_offset] + .control.offset.in.off0 configure -text "dev\#0: $val" + set val [get_val2 $ooo o_offset] + .control.offset.out.off0 configure -text "dev\#0: $val" + } else { + .control.offset.in.off0 configure -text "dev\#0: -" + .control.offset.out.off0 configure -text "dev\#0: -" + } + if { [string match "*devnr*" $ooo] } { + set ooo [string range $ooo [string wordend $ooo [string first devnr $ooo]] end] + set val [get_val2 $ooo i_offset] + .control.offset.in.off1 configure -text "dev\#1: $val" + set val [get_val2 $ooo o_offset] + .control.offset.out.off1 configure -text "dev\#1: $val" + } else { + .control.offset.in.off1 configure -text "dev\#1: -" + .control.offset.out.off1 configure -text "dev\#1: -" + } + if { [string match "*devnr*" $ooo] } { + set ooo [string range $ooo [string wordend $ooo [string first devnr $ooo]] end] + set val [get_val2 $ooo i_offset] + .control.offset.in.off2 configure -text "dev\#2: $val" + set val [get_val2 $ooo o_offset] + .control.offset.out.off2 configure -text "dev\#2: $val" + } else { + .control.offset.in.off2 configure -text "dev\#2: -" + .control.offset.out.off2 configure -text "dev\#2: -" + } + if { [string match "*devnr*" $ooo] } { + set ooo [string range $ooo [string wordend $ooo [string first devnr $ooo]] end] + set val [get_val2 $ooo i_offset] + .control.offset.in.off3 configure -text "dev\#3: $val" + set val [get_val2 $ooo o_offset] + .control.offset.out.off3 configure -text "dev\#3: $val" + } else { + .control.offset.in.off3 configure -text "dev\#3: -" + .control.offset.out.off3 configure -text "dev\#3: -" + } +} + + +proc get_all {} { +get_status +get_control +get_offset +} + +# main +while {1} { + after 200 + get_all + update +} diff --git a/Documentation/sound/oss/solo1 b/Documentation/sound/oss/solo1 new file mode 100644 index 000000000000..6f53d407d027 --- /dev/null +++ b/Documentation/sound/oss/solo1 @@ -0,0 +1,70 @@ +Recording +--------- + +Recording does not work on the author's card, but there +is at least one report of it working on later silicon. +The chip behaves differently than described in the data sheet, +likely due to a chip bug. Working around this would require +the help of ESS (for example by publishing an errata sheet), +but ESS has not done so so far. + +Also, the chip only supports 24 bit addresses for recording, +which means it cannot work on some Alpha mainboards. + + +/proc/sound, /dev/sndstat +------------------------- + +/proc/sound and /dev/sndstat is not supported by the +driver. To find out whether the driver succeeded loading, +check the kernel log (dmesg). + + +ALaw/uLaw sample formats +------------------------ + +This driver does not support the ALaw/uLaw sample formats. +ALaw is the default mode when opening a sound device +using OSS/Free. The reason for the lack of support is +that the hardware does not support these formats, and adding +conversion routines to the kernel would lead to very ugly +code in the presence of the mmap interface to the driver. +And since xquake uses mmap, mmap is considered important :-) +and no sane application uses ALaw/uLaw these days anyway. +In short, playing a Sun .au file as follows: + +cat my_file.au > /dev/dsp + +does not work. Instead, you may use the play script from +Chris Bagwell's sox-12.14 package (or later, available from the URL +below) to play many different audio file formats. +The script automatically determines the audio format +and does do audio conversions if necessary. +http://home.sprynet.com/sprynet/cbagwell/projects.html + + +Blocking vs. nonblocking IO +--------------------------- + +Unlike OSS/Free this driver honours the O_NONBLOCK file flag +not only during open, but also during read and write. +This is an effort to make the sound driver interface more +regular. Timidity has problems with this; a patch +is available from http://www.ife.ee.ethz.ch/~sailer/linux/pciaudio.html. +(Timidity patched will also run on OSS/Free). + + +MIDI UART +--------- + +The driver supports a simple MIDI UART interface, with +no ioctl's supported. + + +MIDI synthesizer +---------------- + +The card has an OPL compatible FM synthesizer. + +Thomas Sailer +t.sailer@alumni.ethz.ch diff --git a/Documentation/sound/oss/sonicvibes b/Documentation/sound/oss/sonicvibes new file mode 100644 index 000000000000..84dee2e0b37d --- /dev/null +++ b/Documentation/sound/oss/sonicvibes @@ -0,0 +1,81 @@ +/proc/sound, /dev/sndstat +------------------------- + +/proc/sound and /dev/sndstat is not supported by the +driver. To find out whether the driver succeeded loading, +check the kernel log (dmesg). + + +ALaw/uLaw sample formats +------------------------ + +This driver does not support the ALaw/uLaw sample formats. +ALaw is the default mode when opening a sound device +using OSS/Free. The reason for the lack of support is +that the hardware does not support these formats, and adding +conversion routines to the kernel would lead to very ugly +code in the presence of the mmap interface to the driver. +And since xquake uses mmap, mmap is considered important :-) +and no sane application uses ALaw/uLaw these days anyway. +In short, playing a Sun .au file as follows: + +cat my_file.au > /dev/dsp + +does not work. Instead, you may use the play script from +Chris Bagwell's sox-12.14 package (available from the URL +below) to play many different audio file formats. +The script automatically determines the audio format +and does do audio conversions if necessary. +http://home.sprynet.com/sprynet/cbagwell/projects.html + + +Blocking vs. nonblocking IO +--------------------------- + +Unlike OSS/Free this driver honours the O_NONBLOCK file flag +not only during open, but also during read and write. +This is an effort to make the sound driver interface more +regular. Timidity has problems with this; a patch +is available from http://www.ife.ee.ethz.ch/~sailer/linux/pciaudio.html. +(Timidity patched will also run on OSS/Free). + + +MIDI UART +--------- + +The driver supports a simple MIDI UART interface, with +no ioctl's supported. + + +MIDI synthesizer +---------------- + +The card both has an OPL compatible FM synthesizer as well as +a wavetable synthesizer. + +I haven't managed so far to get the OPL synth running. + +Using the wavetable synthesizer requires allocating +1-4MB of physically contiguous memory, which isn't possible +currently on Linux without ugly hacks like the bigphysarea +patch. Therefore, the driver doesn't support wavetable +synthesis. + + +No support from S3 +------------------ + +I do not get any support from S3. Therefore, the driver +still has many problems. For example, although the manual +states that the chip should be able to access the sample +buffer anywhere in 32bit address space, I haven't managed to +get it working with buffers above 16M. Therefore, the card +has the same disadvantages as ISA soundcards. + +Given that the card is also very noisy, and if you haven't +already bought it, you should strongly opt for one of the +comparatively priced Ensoniq products. + + +Thomas Sailer +t.sailer@alumni.ethz.ch diff --git a/Documentation/sound/oss/ultrasound b/Documentation/sound/oss/ultrasound new file mode 100644 index 000000000000..32cd50478b36 --- /dev/null +++ b/Documentation/sound/oss/ultrasound @@ -0,0 +1,30 @@ +modprobe sound +insmod ad1848 +insmod gus io=* irq=* dma=* ... + +This loads the driver for the Gravis Ultrasound family of sound cards. + +The gus module takes the following arguments + +io I/O address of the Ultrasound card (eg. io=0x220) +irq IRQ of the Sound Blaster card +dma DMA channel for the Sound Blaster +dma16 2nd DMA channel, only needed for full duplex operation +type 1 for PnP card +gus16 1 for using 16 bit sampling daughter board +no_wave_dma Set to disable DMA usage for wavetable (see note) +db16 ??? + + +no_wave_dma option + +This option defaults to a value of 0, which allows the Ultrasound wavetable +DSP to use DMA for for playback and downloading samples. This is the same +as the old behaviour. If set to 1, no DMA is needed for downloading samples, +and allows owners of a GUS MAX to make use of simultaneous digital audio +(/dev/dsp), MIDI, and wavetable playback. + + +If you have problems in recording with GUS MAX, you could try to use +just one 8 bit DMA channel. Recording will not work with one DMA +channel if it's a 16 bit one. diff --git a/Documentation/sound/oss/vwsnd b/Documentation/sound/oss/vwsnd new file mode 100644 index 000000000000..a6ea0a1df9e4 --- /dev/null +++ b/Documentation/sound/oss/vwsnd @@ -0,0 +1,293 @@ +vwsnd - Sound driver for the Silicon Graphics 320 and 540 Visual +Workstations' onboard audio. + +Copyright 1999 Silicon Graphics, Inc. All rights reserved. + + +At the time of this writing, March 1999, there are two models of +Visual Workstation, the 320 and the 540. This document only describes +those models. Future Visual Workstation models may have different +sound capabilities, and this driver will probably not work on those +boxes. + +The Visual Workstation has an Analog Devices AD1843 "SoundComm" audio +codec chip. The AD1843 is accessed through the Cobalt I/O ASIC, also +known as Lithium. This driver programs both both chips. + +============================================================================== +QUICK CONFIGURATION + + # insmod soundcore + # insmod vwsnd + +============================================================================== +I/O CONNECTIONS + +On the Visual Workstation, only three of the AD1843 inputs are hooked +up. The analog line in jacks are connected to the AD1843's AUX1 +input. The CD audio lines are connected to the AD1843's AUX2 input. +The microphone jack is connected to the AD1843's MIC input. The mic +jack is mono, but the signal is delivered to both the left and right +MIC inputs. You can record in stereo from the mic input, but you will +get the same signal on both channels (within the limits of A/D +accuracy). Full scale on the Line input is +/- 2.0 V. Full scale on +the MIC input is 20 dB less, or +/- 0.2 V. + +The AD1843's LOUT1 outputs are connected to the Line Out jacks. The +AD1843's HPOUT outputs are connected to the speaker/headphone jack. +LOUT2 is not connected. Line out's maximum level is +/- 2.0 V peak to +peak. The speaker/headphone out's maximum is +/- 4.0 V peak to peak. + +The AD1843's PCM input channel and one of its output channels (DAC1) +are connected to Lithium. The other output channel (DAC2) is not +connected. + +============================================================================== +CAPABILITIES + +The AD1843 has PCM input and output (Pulse Code Modulation, also known +as wavetable). PCM input and output can be mono or stereo in any of +four formats. The formats are 16 bit signed and 8 bit unsigned, +u-Law, and A-Law format. Any sample rate from 4 KHz to 49 KHz is +available, in 1 Hz increments. + +The AD1843 includes an analog mixer that can mix all three input +signals (line, mic and CD) into the analog outputs. The mixer has a +separate gain control and mute switch for each input. + +There are two outputs, line out and speaker/headphone out. They +always produce the same signal, and the speaker always has 3 dB more +gain than the line out. The speaker/headphone output can be muted, +but this driver does not export that function. + +The hardware can sync audio to the video clock, but this driver does +not have a way to specify syncing to video. + +============================================================================== +PROGRAMMING + +This section explains the API supported by the driver. Also see the +Open Sound Programming Guide at http://www.opensound.com/pguide/ . +This section assumes familiarity with that document. + +The driver has two interfaces, an I/O interface and a mixer interface. +There is no MIDI or sequencer capability. + +============================================================================== +PROGRAMMING PCM I/O + +The I/O interface is usually accessed as /dev/audio or /dev/dsp. +Using the standard Open Sound System (OSS) ioctl calls, the sample +rate, number of channels, and sample format may be set within the +limitations described above. The driver supports triggering. It also +supports getting the input and output pointers with one-sample +accuracy. + +The SNDCTL_DSP_GETCAP ioctl returns these capabilities. + + DSP_CAP_DUPLEX - driver supports full duplex. + + DSP_CAP_TRIGGER - driver supports triggering. + + DSP_CAP_REALTIME - values returned by SNDCTL_DSP_GETIPTR + and SNDCTL_DSP_GETOPTR are accurate to a few samples. + +Memory mapping (mmap) is not implemented. + +The driver permits subdivided fragment sizes from 64 to 4096 bytes. +The number of fragments can be anything from 3 fragments to however +many fragments fit into 124 kilobytes. It is up to the user to +determine how few/small fragments can be used without introducing +glitches with a given workload. Linux is not realtime, so we can't +promise anything. (sigh...) + +When this driver is switched into or out of mu-Law or A-Law mode on +output, it may produce an audible click. This is unavoidable. To +prevent clicking, use signed 16-bit mode instead, and convert from +mu-Law or A-Law format in software. + +============================================================================== +PROGRAMMING THE MIXER INTERFACE + +The mixer interface is usually accessed as /dev/mixer. It is accessed +through ioctls. The mixer allows the application to control gain or +mute several audio signal paths, and also allows selection of the +recording source. + +Each of the constants described here can be read using the +MIXER_READ(SOUND_MIXER_xxx) ioctl. Those that are not read-only can +also be written using the MIXER_WRITE(SOUND_MIXER_xxx) ioctl. In most +cases, <sys/soundcard.h> defines constants SOUND_MIXER_READ_xxx and +SOUND_MIXER_WRITE_xxx which work just as well. + +SOUND_MIXER_CAPS Read-only + +This is a mask of optional driver capabilities that are implemented. +This driver's only capability is SOUND_CAP_EXCL_INPUT, which means +that only one recording source can be active at a time. + +SOUND_MIXER_DEVMASK Read-only + +This is a mask of the sound channels. This driver's channels are PCM, +LINE, MIC, CD, and RECLEV. + +SOUND_MIXER_STEREODEVS Read-only + +This is a mask of which sound channels are capable of stereo. All +channels are capable of stereo. (But see caveat on MIC input in I/O +CONNECTIONS section above). + +SOUND_MIXER_OUTMASK Read-only + +This is a mask of channels that route inputs through to outputs. +Those are LINE, MIC, and CD. + +SOUND_MIXER_RECMASK Read-only + +This is a mask of channels that can be recording sources. Those are +PCM, LINE, MIC, CD. + +SOUND_MIXER_PCM Default: 0x5757 (0 dB) + +This is the gain control for PCM output. The left and right channel +gain are controlled independently. This gain control has 64 levels, +which range from -82.5 dB to +12.0 dB in 1.5 dB steps. Those 64 +levels are mapped onto 100 levels at the ioctl, see below. + +SOUND_MIXER_LINE Default: 0x4a4a (0 dB) + +This is the gain control for mixing the Line In source into the +outputs. The left and right channel gain are controlled +independently. This gain control has 32 levels, which range from +-34.5 dB to +12.0 dB in 1.5 dB steps. Those 32 levels are mapped onto +100 levels at the ioctl, see below. + +SOUND_MIXER_MIC Default: 0x4a4a (0 dB) + +This is the gain control for mixing the MIC source into the outputs. +The left and right channel gain are controlled independently. This +gain control has 32 levels, which range from -34.5 dB to +12.0 dB in +1.5 dB steps. Those 32 levels are mapped onto 100 levels at the +ioctl, see below. + +SOUND_MIXER_CD Default: 0x4a4a (0 dB) + +This is the gain control for mixing the CD audio source into the +outputs. The left and right channel gain are controlled +independently. This gain control has 32 levels, which range from +-34.5 dB to +12.0 dB in 1.5 dB steps. Those 32 levels are mapped onto +100 levels at the ioctl, see below. + +SOUND_MIXER_RECLEV Default: 0 (0 dB) + +This is the gain control for PCM input (RECording LEVel). The left +and right channel gain are controlled independently. This gain +control has 16 levels, which range from 0 dB to +22.5 dB in 1.5 dB +steps. Those 16 levels are mapped onto 100 levels at the ioctl, see +below. + +SOUND_MIXER_RECSRC Default: SOUND_MASK_LINE + +This is a mask of currently selected PCM input sources (RECording +SouRCes). Because the AD1843 can only have a single recording source +at a time, only one bit at a time can be set in this mask. The +allowable values are SOUND_MASK_PCM, SOUND_MASK_LINE, SOUND_MASK_MIC, +or SOUND_MASK_CD. Selecting SOUND_MASK_PCM sets up internal +resampling which is useful for loopback testing and for hardware +sample rate conversion. But software sample rate conversion is +probably faster, so I don't know how useful that is. + +SOUND_MIXER_OUTSRC DEFAULT: SOUND_MASK_LINE|SOUND_MASK_MIC|SOUND_MASK_CD + +This is a mask of sources that are currently passed through to the +outputs. Those sources whose bits are not set are muted. + +============================================================================== +GAIN CONTROL + +There are five gain controls listed above. Each has 16, 32, or 64 +steps. Each control has 1.5 dB of gain per step. Each control is +stereo. + +The OSS defines the argument to a channel gain ioctl as having two +components, left and right, each of which ranges from 0 to 100. The +two components are packed into the same word, with the left side gain +in the least significant byte, and the right side gain in the second +least significant byte. In C, we would say this. + + #include <assert.h> + + ... + + assert(leftgain >= 0 && leftgain <= 100); + assert(rightgain >= 0 && rightgain <= 100); + arg = leftgain | rightgain << 8; + +So each OSS gain control has 101 steps. But the hardware has 16, 32, +or 64 steps. The hardware steps are spread across the 101 OSS steps +nearly evenly. The conversion formulas are like this, given N equals +16, 32, or 64. + + int round = N/2 - 1; + OSS_gain_steps = (hw_gain_steps * 100 + round) / (N - 1); + hw_gain_steps = (OSS_gain_steps * (N - 1) + round) / 100; + +Here is a snippet of C code that will return the left and right gain +of any channel in dB. Pass it one of the predefined gain_desc_t +structures to access any of the five channels' gains. + + typedef struct gain_desc { + float min_gain; + float gain_step; + int nbits; + int chan; + } gain_desc_t; + + const gain_desc_t gain_pcm = { -82.5, 1.5, 6, SOUND_MIXER_PCM }; + const gain_desc_t gain_line = { -34.5, 1.5, 5, SOUND_MIXER_LINE }; + const gain_desc_t gain_mic = { -34.5, 1.5, 5, SOUND_MIXER_MIC }; + const gain_desc_t gain_cd = { -34.5, 1.5, 5, SOUND_MIXER_CD }; + const gain_desc_t gain_reclev = { 0.0, 1.5, 4, SOUND_MIXER_RECLEV }; + + int get_gain_dB(int fd, const gain_desc_t *gp, + float *left, float *right) + { + int word; + int lg, rg; + int mask = (1 << gp->nbits) - 1; + + if (ioctl(fd, MIXER_READ(gp->chan), &word) != 0) + return -1; /* fail */ + lg = word & 0xFF; + rg = word >> 8 & 0xFF; + lg = (lg * mask + mask / 2) / 100; + rg = (rg * mask + mask / 2) / 100; + *left = gp->min_gain + gp->gain_step * lg; + *right = gp->min_gain + gp->gain_step * rg; + return 0; + } + +And here is the corresponding routine to set a channel's gain in dB. + + int set_gain_dB(int fd, const gain_desc_t *gp, float left, float right) + { + float max_gain = + gp->min_gain + (1 << gp->nbits) * gp->gain_step; + float round = gp->gain_step / 2; + int mask = (1 << gp->nbits) - 1; + int word; + int lg, rg; + + if (left < gp->min_gain || right < gp->min_gain) + return EINVAL; + lg = (left - gp->min_gain + round) / gp->gain_step; + rg = (right - gp->min_gain + round) / gp->gain_step; + if (lg >= (1 << gp->nbits) || rg >= (1 << gp->nbits)) + return EINVAL; + lg = (100 * lg + mask / 2) / mask; + rg = (100 * rg + mask / 2) / mask; + word = lg | rg << 8; + + return ioctl(fd, MIXER_WRITE(gp->chan), &word); + } + diff --git a/Documentation/sparc/README-2.5 b/Documentation/sparc/README-2.5 new file mode 100644 index 000000000000..806fe490a56d --- /dev/null +++ b/Documentation/sparc/README-2.5 @@ -0,0 +1,46 @@ +BTFIXUP +------- + +To build new kernels you have to issue "make image". The ready kernel +in ELF format is placed in arch/sparc/boot/image. Explanation is below. + +BTFIXUP is a unique feature of Linux/sparc among other architectures, +developed by Jakub Jelinek (I think... Obviously David S. Miller took +part, too). It allows to boot the same kernel at different +sub-architectures, such as sun4c, sun4m, sun4d, where SunOS uses +different kernels. This feature is convinient for people who you move +disks between boxes and for distrution builders. + +To function, BTFIXUP must link the kernel "in the draft" first, +analyze the result, write a special stub code based on that, and +build the final kernel with the stub (btfix.o). + +Kai Germaschewski improved the build system of the kernel in the 2.5 series +significantly. Unfortunately, the traditional way of running the draft +linking from architecture specific Makefile before the actual linking +by generic Makefile is nearly impossible to support properly in the +new build system. Therefore, the way we integrate BTFIXUP with the +build system was changed in 2.5.40. Now, generic Makefile performs +the draft linking and stores the result in file vmlinux. Architecture +specific post-processing invokes BTFIXUP machinery and final linking +in the same way as other architectures do bootstraps. + +Implications of that change are as follows. + +1. Hackers must type "make image" now, instead of just "make", in the same + way as s390 people do now. It is analogous to "make bzImage" on i386. + This does NOT affect sparc64, you continue to use "make" to build sparc64 + kernels. + +2. vmlinux is not the final kernel, so RPM builders have to adjust + their spec files (if they delivered vmlinux for debugging). + System.map generated for vmlinux is still valid. + +3. Scripts that produce a.out images have to be changed. First, if they + invoke make, they have to use "make image". Second, they have to pick up + the new kernel in arch/sparc/boot/image instead of vmlinux. + +4. Since we are compliant with Kai's build system now, make -j is permitted. + +-- Pete Zaitcev +zaitcev@yahoo.com diff --git a/Documentation/sparc/sbus_drivers.txt b/Documentation/sparc/sbus_drivers.txt new file mode 100644 index 000000000000..876195dc2aef --- /dev/null +++ b/Documentation/sparc/sbus_drivers.txt @@ -0,0 +1,272 @@ + + Writing SBUS Drivers + + David S. Miller (davem@redhat.com) + + The SBUS driver interfaces of the Linux kernel have been +revamped completely for 2.4.x for several reasons. Foremost were +performance and complexity concerns. This document details these +new interfaces and how they are used to write an SBUS device driver. + + SBUS drivers need to include <asm/sbus.h> to get access +to functions and structures described here. + + Probing and Detection + + Each SBUS device inside the machine is described by a +structure called "struct sbus_dev". Likewise, each SBUS bus +found in the system is described by a "struct sbus_bus". For +each SBUS bus, the devices underneath are hung in a tree-like +fashion off of the bus structure. + + The SBUS device structure contains enough information +for you to implement your device probing algorithm and obtain +the bits necessary to run your device. The most commonly +used members of this structure, and their typical usage, +will be detailed below. + + Here is how probing is performed by an SBUS driver +under Linux: + + static void init_one_mydevice(struct sbus_dev *sdev) + { + ... + } + + static int mydevice_match(struct sbus_dev *sdev) + { + if (some_criteria(sdev)) + return 1; + return 0; + } + + static void mydevice_probe(void) + { + struct sbus_bus *sbus; + struct sbus_dev *sdev; + + for_each_sbus(sbus) { + for_each_sbusdev(sdev, sbus) { + if (mydevice_match(sdev)) + init_one_mydevice(sdev); + } + } + } + + All this does is walk through all SBUS devices in the +system, checks each to see if it is of the type which +your driver is written for, and if so it calls the init +routine to attach the device and prepare to drive it. + + "init_one_mydevice" might do things like allocate software +state structures, map in I/O registers, place the hardware +into an initialized state, etc. + + Mapping and Accessing I/O Registers + + Each SBUS device structure contains an array of descriptors +which describe each register set. We abuse struct resource for that. +They each correspond to the "reg" properties provided by the OBP firmware. + + Before you can access your device's registers you must map +them. And later if you wish to shutdown your driver (for module +unload or similar) you must unmap them. You must treat them as +a resource, which you allocate (map) before using and free up +(unmap) when you are done with it. + + The mapping information is stored in an opaque value +typed as an "unsigned long". This is the type of the return value +of the mapping interface, and the arguments to the unmapping +interface. Let's say you want to map the first set of registers. +Perhaps part of your driver software state structure looks like: + + struct mydevice { + unsigned long control_regs; + ... + struct sbus_dev *sdev; + ... + }; + + At initialization time you then use the sbus_ioremap +interface to map in your registers, like so: + + static void init_one_mydevice(struct sbus_dev *sdev) + { + struct mydevice *mp; + ... + + mp->control_regs = sbus_ioremap(&sdev->resource[0], 0, + CONTROL_REGS_SIZE, "mydevice regs"); + if (!mp->control_regs) { + /* Failure, cleanup and return. */ + } + } + + Second argument to sbus_ioremap is an offset for +cranky devices with broken OBP PROM. The sbus_ioremap uses only +a start address and flags from the resource structure. +Therefore it is possible to use the same resource to map +several sets of registers or even to fabricate a resource +structure if driver gets physical address from some private place. +This practice is discouraged though. Use whatever OBP PROM +provided to you. + + And here is how you might unmap these registers later at +driver shutdown or module unload time, using the sbus_iounmap +interface: + + static void mydevice_unmap_regs(struct mydevice *mp) + { + sbus_iounmap(mp->control_regs, CONTROL_REGS_SIZE); + } + + Finally, to actually access your registers there are 6 +interface routines at your disposal. Accesses are byte (8 bit), +word (16 bit), or longword (32 bit) sized. Here they are: + + u8 sbus_readb(unsigned long reg) /* read byte */ + u16 sbus_readw(unsigned long reg) /* read word */ + u32 sbus_readl(unsigned long reg) /* read longword */ + void sbus_writeb(u8 value, unsigned long reg) /* write byte */ + void sbus_writew(u16 value, unsigned long reg) /* write word */ + void sbus_writel(u32 value, unsigned long reg) /* write longword */ + + So, let's say your device has a control register of some sort +at offset zero. The following might implement resetting your device: + + #define CONTROL 0x00UL + + #define CONTROL_RESET 0x00000001 /* Reset hardware */ + + static void mydevice_reset(struct mydevice *mp) + { + sbus_writel(CONTROL_RESET, mp->regs + CONTROL); + } + + Or perhaps there is a data port register at an offset of +16 bytes which allows you to read bytes from a fifo in the device: + + #define DATA 0x10UL + + static u8 mydevice_get_byte(struct mydevice *mp) + { + return sbus_readb(mp->regs + DATA); + } + + It's pretty straightforward, and clueful readers may have +noticed that these interfaces mimick the PCI interfaces of the +Linux kernel. This was not by accident. + + WARNING: + + DO NOT try to treat these opaque register mapping + values as a memory mapped pointer to some structure + which you can dereference. + + It may be memory mapped, it may not be. In fact it + could be a physical address, or it could be the time + of day xor'd with 0xdeadbeef. :-) + + Whatever it is, it's an implementation detail. The + interface was done this way to shield the driver + author from such complexities. + + Doing DVMA + + SBUS devices can perform DMA transactions in a way similar +to PCI but dissimilar to ISA, e.g. DMA masters supply address. +In contrast to PCI, however, that address (a bus address) is +translated by IOMMU before a memory access is performed and therefore +it is virtual. Sun calls this procedure DVMA. + + Linux supports two styles of using SBUS DVMA: "consistent memory" +and "streaming DVMA". CPU view of consistent memory chunk is, well, +consistent with a view of a device. Think of it as an uncached memory. +Typically this way of doing DVMA is not very fast and drivers use it +mostly for control blocks or queues. On some CPUs we cannot flush or +invalidate individual pages or cache lines and doing explicit flushing +over ever little byte in every control block would be wasteful. + +Streaming DVMA is a preferred way to transfer large amounts of data. +This process works in the following way: +1. a CPU stops accessing a certain part of memory, + flushes its caches covering that memory; +2. a device does DVMA accesses, then posts an interrupt; +3. CPU invalidates its caches and starts to access the memory. + +A single streaming DVMA operation can touch several discontiguous +regions of a virtual bus address space. This is called a scatter-gather +DVMA. + +[TBD: Why do not we neither Solaris attempt to map disjoint pages +into a single virtual chunk with the help of IOMMU, so that non SG +DVMA masters would do SG? It'd be very helpful for RAID.] + + In order to perform a consistent DVMA a driver does something +like the following: + + char *mem; /* Address in the CPU space */ + u32 busa; /* Address in the SBus space */ + + mem = (char *) sbus_alloc_consistent(sdev, MYMEMSIZE, &busa); + + Then mem is used when CPU accesses this memory and u32 +is fed to the device so that it can do DVMA. This is typically +done with an sbus_writel() into some device register. + + Do not forget to free the DVMA resources once you are done: + + sbus_free_consistent(sdev, MYMEMSIZE, mem, busa); + + Streaming DVMA is more interesting. First you allocate some +memory suitable for it or pin down some user pages. Then it all works +like this: + + char *mem = argumen1; + unsigned int size = argument2; + u32 busa; /* Address in the SBus space */ + + *mem = 1; /* CPU can access */ + busa = sbus_map_single(sdev, mem, size); + if (busa == 0) ....... + + /* Tell the device to use busa here */ + /* CPU cannot access the memory without sbus_dma_sync_single() */ + + sbus_unmap_single(sdev, busa, size); + if (*mem == 0) .... /* CPU can access again */ + + It is possible to retain mappings and ask the device to +access data again and again without calling sbus_unmap_single. +However, CPU caches must be invalidated with sbus_dma_sync_single +before such access. + +[TBD but what about writeback caches here... do we have any?] + + There is an equivalent set of functions doing the same thing +only with several memory segments at once for devices capable of +scatter-gather transfers. Use the Source, Luke. + + Examples + + drivers/net/sunhme.c + This is a complicated driver which illustrates many concepts +discussed above and plus it handles both PCI and SBUS boards. + + drivers/scsi/esp.c + Check it out for scatter-gather DVMA. + + drivers/sbus/char/bpp.c + A non-DVMA device. + + drivers/net/sunlance.c + Lance driver abuses consistent mappings for data transfer. +It is a nifty trick which we do not particularly recommend... +Just check it out and know that it's legal. + + Bad examples, do NOT use + + drivers/video/cgsix.c + This one uses result of sbus_ioremap as if it is an address. +This does NOT work on sparc64 and therefore is broken. We will +convert it at a later date. diff --git a/Documentation/sparse.txt b/Documentation/sparse.txt new file mode 100644 index 000000000000..f97841478459 --- /dev/null +++ b/Documentation/sparse.txt @@ -0,0 +1,72 @@ +Copyright 2004 Linus Torvalds +Copyright 2004 Pavel Machek <pavel@suse.cz> + +Using sparse for typechecking +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +"__bitwise" is a type attribute, so you have to do something like this: + + typedef int __bitwise pm_request_t; + + enum pm_request { + PM_SUSPEND = (__force pm_request_t) 1, + PM_RESUME = (__force pm_request_t) 2 + }; + +which makes PM_SUSPEND and PM_RESUME "bitwise" integers (the "__force" is +there because sparse will complain about casting to/from a bitwise type, +but in this case we really _do_ want to force the conversion). And because +the enum values are all the same type, now "enum pm_request" will be that +type too. + +And with gcc, all the __bitwise/__force stuff goes away, and it all ends +up looking just like integers to gcc. + +Quite frankly, you don't need the enum there. The above all really just +boils down to one special "int __bitwise" type. + +So the simpler way is to just do + + typedef int __bitwise pm_request_t; + + #define PM_SUSPEND ((__force pm_request_t) 1) + #define PM_RESUME ((__force pm_request_t) 2) + +and you now have all the infrastructure needed for strict typechecking. + +One small note: the constant integer "0" is special. You can use a +constant zero as a bitwise integer type without sparse ever complaining. +This is because "bitwise" (as the name implies) was designed for making +sure that bitwise types don't get mixed up (little-endian vs big-endian +vs cpu-endian vs whatever), and there the constant "0" really _is_ +special. + +Modify top-level Makefile to say + +CHECK = sparse -Wbitwise + +or you don't get any checking at all. + + +Where to get sparse +~~~~~~~~~~~~~~~~~~~ + +With BK, you can just get it from + + bk://sparse.bkbits.net/sparse + +and DaveJ has tar-balls at + + http://www.codemonkey.org.uk/projects/bitkeeper/sparse/ + + +Once you have it, just do + + make + make install + +as your regular user, and it will install sparse in your ~/bin directory. +After that, doing a kernel make with "make C=1" will run sparse on all the +C files that get recompiled, or with "make C=2" will run sparse on the +files whether they need to be recompiled or not (ie the latter is fast way +to check the whole tree if you have already built it). diff --git a/Documentation/specialix.txt b/Documentation/specialix.txt new file mode 100644 index 000000000000..4a4b428ce8f6 --- /dev/null +++ b/Documentation/specialix.txt @@ -0,0 +1,385 @@ + + specialix.txt -- specialix IO8+ multiport serial driver readme. + + + + Copyright (C) 1997 Roger Wolff (R.E.Wolff@BitWizard.nl) + + Specialix pays for the development and support of this driver. + Please DO contact io8-linux@specialix.co.uk if you require + support. + + This driver was developed in the BitWizard linux device + driver service. If you require a linux device driver for your + product, please contact devices@BitWizard.nl for a quote. + + This code is firmly based on the riscom/8 serial driver, + written by Dmitry Gorodchanin. The specialix IO8+ card + programming information was obtained from the CL-CD1865 Data + Book, and Specialix document number 6200059: IO8+ Hardware + Functional Specification, augmented by document number 6200088: + Merak Hardware Functional Specification. (IO8+/PCI is also + called Merak) + + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of + the License, or (at your option) any later version. + + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR + PURPOSE. See the GNU General Public License for more details. + + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, + USA. + + +Intro +===== + + +This file contains some random information, that I like to have online +instead of in a manual that can get lost. Ever misplace your Linux +kernel sources? And the manual of one of the boards in your computer? + + +Addresses and interrupts +======================== + +Address dip switch settings: +The dip switch sets bits 2-9 of the IO address. + + 9 8 7 6 5 4 3 2 + +-----------------+ + 0 | X X X X X X X | + | | = IoBase = 0x100 + 1 | X | + +-----------------+ ------ RS232 connectors ----> + + | | | + edge connector + | | | + V V V + +Base address 0x100 caused a conflict in one of my computers once. I +haven't the foggiest why. My Specialix card is now at 0x180. My +other computer runs just fine with the Specialix card at 0x100.... +The card occupies 4 addresses, but actually only two are really used. + +The PCI version doesn't have any dip switches. The BIOS assigns +an IO address. + +The driver now still autoprobes at 0x100, 0x180, 0x250 and 0x260. If +that causes trouble for you, please report that. I'll remove +autoprobing then. + +The driver will tell the card what IRQ to use, so you don't have to +change any jumpers to change the IRQ. Just use a command line +argument (irq=xx) to the insmod program to set the interrupt. + +The BIOS assigns the IRQ on the PCI version. You have no say in what +IRQ to use in that case. + +If your specialix cards are not at the default locations, you can use +the kernel command line argument "specialix=io0,irq0,io1,irq1...". +Here "io0" is the io address for the first card, and "irq0" is the +irq line that the first card should use. And so on. + +Examples. + +You use the driver as a module and have three cards at 0x100, 0x250 +and 0x180. And some way or another you want them detected in that +order. Moreover irq 12 is taken (e.g. by your PS/2 mouse). + + insmod specialix.o iobase=0x100,0x250,0x180 irq=9,11,15 + +The same three cards, but now in the kernel would require you to +add + + specialix=0x100,9,0x250,11,0x180,15 + +to the command line. This would become + + append="specialix=0x100,9,0x250,11,0x180,15" + +in your /etc/lilo.conf file if you use lilo. + +The Specialix driver is slightly odd: It allows you to have the second +or third card detected without having a first card. This has +advantages and disadvantages. A slot that isn't filled by an ISA card, +might be filled if a PCI card is detected. Thus if you have an ISA +card at 0x250 and a PCI card, you would get: + +sx0: specialix IO8+ Board at 0x100 not found. +sx1: specialix IO8+ Board at 0x180 not found. +sx2: specialix IO8+ board detected at 0x250, IRQ 12, CD1865 Rev. B. +sx3: specialix IO8+ Board at 0x260 not found. +sx0: specialix IO8+ board detected at 0xd800, IRQ 9, CD1865 Rev. B. + +This would happen if you don't give any probe hints to the driver. +If you would specify: + + specialix=0x250,11 + +you'd get the following messages: + +sx0: specialix IO8+ board detected at 0x250, IRQ 11, CD1865 Rev. B. +sx1: specialix IO8+ board detected at 0xd800, IRQ 9, CD1865 Rev. B. + +ISA probing is aborted after the IO address you gave is exhausted, and +the PCI card is now detected as the second card. The ISA card is now +also forced to IRQ11.... + + +Baud rates +========== + +The rev 1.2 and below boards use a CL-CD1864. These chips can only +do 64kbit. The rev 1.3 and newer boards use a CL-CD1865. These chips +are officially capable of 115k2. + +The Specialix card uses a 25MHz crystal (in times two mode, which in +fact is a divided by two mode). This is not enough to reach the rated +115k2 on all ports at the same time. With this clock rate you can only +do 37% of this rate. This means that at 115k2 on all ports you are +going to lose characters (The chip cannot handle that many incoming +bits at this clock rate.) (Yes, you read that correctly: there is a +limit to the number of -=bits=- per second that the chip can handle.) + +If you near the "limit" you will first start to see a graceful +degradation in that the chip cannot keep the transmitter busy at all +times. However with a central clock this slow, you can also get it to +miss incoming characters. The driver will print a warning message when +you are outside the official specs. The messages usually show up in +the file /var/log/messages . + +The specialix card cannot reliably do 115k2. If you use it, you have +to do "extensive testing" (*) to verify if it actually works. + +When "mgetty" communicates with my modem at 115k2 it reports: +got: +++[0d]ATQ0V1H0[0d][0d][8a]O[cb][0d][8a] + ^^^^ ^^^^ ^^^^ + +The three characters that have the "^^^" under them have suffered a +bit error in the highest bit. In conclusion: I've tested it, and found +that it simply DOESN'T work for me. I also suspect that this is also +caused by the baud rate being just a little bit out of tune. + +I upgraded the crystal to 66Mhz on one of my Specialix cards. Works +great! Contact me for details. (Voids warranty, requires a steady hand +and more such restrictions....) + + +(*) Cirrus logic CD1864 databook, page 40. + + +Cables for the Specialix IO8+ +============================= + +The pinout of the connectors on the IO8+ is: + + pin short direction long name + name + Pin 1 DCD input Data Carrier Detect + Pin 2 RXD input Receive + Pin 3 DTR/RTS output Data Terminal Ready/Ready To Send + Pin 4 GND - Ground + Pin 5 TXD output Transmit + Pin 6 CTS input Clear To Send + + + -- 6 5 4 3 2 1 -- + | | + | | + | | + | | + +----- -----+ + |__________| + clip + + Front view of an RJ12 connector. Cable moves "into" the paper. + (the plug is ready to plug into your mouth this way...) + + + NULL cable. I don't know who is going to use these except for + testing purposes, but I tested the cards with this cable. (It + took quite a while to figure out, so I'm not going to delete + it. So there! :-) + + + This end goes This end needs + straight into the some twists in + RJ12 plug. the wiring. + IO8+ RJ12 IO8+ RJ12 + 1 DCD white - + - - 1 DCD + 2 RXD black 5 TXD + 3 DTR/RTS red 6 CTS + 4 GND green 4 GND + 5 TXD yellow 2 RXD + 6 CTS blue 3 DTR/RTS + + + Same NULL cable, but now sorted on the second column. + + 1 DCD white - + - - 1 DCD + 5 TXD yellow 2 RXD + 6 CTS blue 3 DTR/RTS + 4 GND green 4 GND + 2 RXD black 5 TXD + 3 DTR/RTS red 6 CTS + + + + This is a modem cable usable for hardware handshaking: + RJ12 DB25 DB9 + 1 DCD white 8 DCD 1 DCD + 2 RXD black 3 RXD 2 RXD + 3 DTR/RTS red 4 RTS 7 RTS + 4 GND green 7 GND 5 GND + 5 TXD yellow 2 TXD 3 TXD + 6 CTS blue 5 CTS 8 CTS + +---- 6 DSR 6 DSR + +---- 20 DTR 4 DTR + + This is a modem cable usable for software handshaking: + It allows you to reset the modem using the DTR ioctls. + I (REW) have never tested this, "but xxxxxxxxxxxxx + says that it works." If you test this, please + tell me and I'll fill in your name on the xxx's. + + RJ12 DB25 DB9 + 1 DCD white 8 DCD 1 DCD + 2 RXD black 3 RXD 2 RXD + 3 DTR/RTS red 20 DTR 4 DTR + 4 GND green 7 GND 5 GND + 5 TXD yellow 2 TXD 3 TXD + 6 CTS blue 5 CTS 8 CTS + +---- 6 DSR 6 DSR + +---- 4 RTS 7 RTS + + I bought a 6 wire flat cable. It was colored as indicated. + Check that yours is the same before you trust me on this. + + +Hardware handshaking issues. +============================ + +The driver can be compiled in two different ways. The default +("Specialix DTR/RTS pin is RTS" is off) the pin behaves as DTR when +hardware handshaking is off. It behaves as the RTS hardware +handshaking signal when hardware handshaking is selected. + +When you use this, you have to use the appropriate cable. The +cable will either be compatible with hardware handshaking or with +software handshaking. So switching on the fly is not really an +option. + +I actually prefer to use the "Specialix DTR/RTS pin is RTS" option. +This makes the DTR/RTS pin always an RTS pin, and ioctls to +change DTR are always ignored. I have a cable that is configured +for this. + + +Ports and devices +================= + +Port 0 is the one furthest from the card-edge connector. + +Devices: + +You should make the devices as follows: + +bash +cd /dev +for i in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 \ + 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 +do + echo -n "$i " + mknod /dev/ttyW$i c 75 $i + mknod /dev/cuw$i c 76 $i +done +echo "" + +If your system doesn't come with these devices preinstalled, bug your +linux-vendor about this. They have had ample time to get this +implemented by now. + +You cannot have more than 4 boards in one computer. The card only +supports 4 different interrupts. If you really want this, contact me +about this and I'll give you a few tips (requires soldering iron).... + +If you have enough PCI slots, you can probably use more than 4 PCI +versions of the card though.... + +The PCI version of the card cannot adhere to the mechanical part of +the PCI spec because the 8 serial connectors are simply too large. If +it doesn't fit in your computer, bring back the card. + + +------------------------------------------------------------------------ + + + Fixed bugs and restrictions: + - During initialization, interrupts are blindly turned on. + Having a shadow variable would cause an extra memory + access on every IO instruction. + - The interrupt (on the card) should be disabled when we + don't allocate the Linux end of the interrupt. This allows + a different driver/card to use it while all ports are not in + use..... (a la standard serial port) + == An extra _off variant of the sx_in and sx_out macros are + now available. They don't set the interrupt enable bit. + These are used during initialization. Normal operation uses + the old variant which enables the interrupt line. + - RTS/DTR issue needs to be implemented according to + specialix' spec. + I kind of like the "determinism" of the current + implementation. Compile time flag? + == Ok. Compile time flag! Default is how Specialix likes it. + == Now a config time flag! Gets saved in your config file. Neat! + - Can you set the IO address from the lilo command line? + If you need this, bug me about it, I'll make it. + == Hah! No bugging needed. Fixed! :-) + - Cirrus logic hasn't gotten back to me yet why the CD1865 can + and the CD1864 can't do 115k2. I suspect that this is + because the CD1864 is not rated for 33MHz operation. + Therefore the CD1864 versions of the card can't do 115k2 on + all ports just like the CD1865 versions. The driver does + not block 115k2 on CD1864 cards. + == I called the Cirrus Logic representative here in Holland. + The CD1864 databook is identical to the CD1865 databook, + except for an extra warning at the end. Similar Bit errors + have been observed in testing at 115k2 on both an 1865 and + a 1864 chip. I see no reason why I would prohibit 115k2 on + 1864 chips and not do it on 1865 chips. Actually there is + reason to prohibit it on BOTH chips. I print a warning. + If you use 115k2, you're on your own. + - A spiky CD may send spurious HUPs. Also in CLOCAL??? + -- A fix for this turned out to be counter productive. + Different fix? Current behaviour is acceptable? + -- Maybe the current implementation is correct. If anybody + gets bitten by this, please report, and it will get fixed. + + -- Testing revealed that when in CLOCAL, the problem doesn't + occur. As warned for in the CD1865 manual, the chip may + send modem intr's on a spike. We could filter those out, + but that would be a cludge anyway (You'd still risk getting + a spurious HUP when two spikes occur.)..... + + + + Bugs & restrictions: + - This is a difficult card to autoprobe. + You have to WRITE to the address register to even + read-probe a CD186x register. Disable autodetection? + -- Specialix: any suggestions? + - Arbitrary baud rates are not implemented yet. + If you need this, bug me about it. + + diff --git a/Documentation/spinlocks.txt b/Documentation/spinlocks.txt new file mode 100644 index 000000000000..c2122996631e --- /dev/null +++ b/Documentation/spinlocks.txt @@ -0,0 +1,212 @@ +UPDATE March 21 2005 Amit Gud <gud@eth.net> + +Macros SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated and will be +removed soon. So for any new code dynamic initialization should be used: + + spinlock_t xxx_lock; + rwlock_t xxx_rw_lock; + + static int __init xxx_init(void) + { + spin_lock_init(&xxx_lock); + rw_lock_init(&xxx_rw_lock); + ... + } + + module_init(xxx_init); + +Reasons for deprecation + - it hurts automatic lock validators + - it becomes intrusive for the realtime preemption patches + +Following discussion is still valid, however, with the dynamic initialization +of spinlocks instead of static. + +----------------------- + +On Fri, 2 Jan 1998, Doug Ledford wrote: +> +> I'm working on making the aic7xxx driver more SMP friendly (as well as +> importing the latest FreeBSD sequencer code to have 7895 support) and wanted +> to get some info from you. The goal here is to make the various routines +> SMP safe as well as UP safe during interrupts and other manipulating +> routines. So far, I've added a spin_lock variable to things like my queue +> structs. Now, from what I recall, there are some spin lock functions I can +> use to lock these spin locks from other use as opposed to a (nasty) +> save_flags(); cli(); stuff; restore_flags(); construct. Where do I find +> these routines and go about making use of them? Do they only lock on a +> per-processor basis or can they also lock say an interrupt routine from +> mucking with a queue if the queue routine was manipulating it when the +> interrupt occurred, or should I still use a cli(); based construct on that +> one? + +See <asm/spinlock.h>. The basic version is: + + spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED; + + + unsigned long flags; + + spin_lock_irqsave(&xxx_lock, flags); + ... critical section here .. + spin_unlock_irqrestore(&xxx_lock, flags); + +and the above is always safe. It will disable interrupts _locally_, but the +spinlock itself will guarantee the global lock, so it will guarantee that +there is only one thread-of-control within the region(s) protected by that +lock. + +Note that it works well even under UP - the above sequence under UP +essentially is just the same as doing a + + unsigned long flags; + + save_flags(flags); cli(); + ... critical section ... + restore_flags(flags); + +so the code does _not_ need to worry about UP vs SMP issues: the spinlocks +work correctly under both (and spinlocks are actually more efficient on +architectures that allow doing the "save_flags + cli" in one go because I +don't export that interface normally). + +NOTE NOTE NOTE! The reason the spinlock is so much faster than a global +interrupt lock under SMP is exactly because it disables interrupts only on +the local CPU. The spin-lock is safe only when you _also_ use the lock +itself to do locking across CPU's, which implies that EVERYTHING that +touches a shared variable has to agree about the spinlock they want to +use. + +The above is usually pretty simple (you usually need and want only one +spinlock for most things - using more than one spinlock can make things a +lot more complex and even slower and is usually worth it only for +sequences that you _know_ need to be split up: avoid it at all cost if you +aren't sure). HOWEVER, it _does_ mean that if you have some code that does + + cli(); + .. critical section .. + sti(); + +and another sequence that does + + spin_lock_irqsave(flags); + .. critical section .. + spin_unlock_irqrestore(flags); + +then they are NOT mutually exclusive, and the critical regions can happen +at the same time on two different CPU's. That's fine per se, but the +critical regions had better be critical for different things (ie they +can't stomp on each other). + +The above is a problem mainly if you end up mixing code - for example the +routines in ll_rw_block() tend to use cli/sti to protect the atomicity of +their actions, and if a driver uses spinlocks instead then you should +think about issues like the above.. + +This is really the only really hard part about spinlocks: once you start +using spinlocks they tend to expand to areas you might not have noticed +before, because you have to make sure the spinlocks correctly protect the +shared data structures _everywhere_ they are used. The spinlocks are most +easily added to places that are completely independent of other code (ie +internal driver data structures that nobody else ever touches, for +example). + +---- + +Lesson 2: reader-writer spinlocks. + +If your data accesses have a very natural pattern where you usually tend +to mostly read from the shared variables, the reader-writer locks +(rw_lock) versions of the spinlocks are often nicer. They allow multiple +readers to be in the same critical region at once, but if somebody wants +to change the variables it has to get an exclusive write lock. The +routines look the same as above: + + rwlock_t xxx_lock = RW_LOCK_UNLOCKED; + + + unsigned long flags; + + read_lock_irqsave(&xxx_lock, flags); + .. critical section that only reads the info ... + read_unlock_irqrestore(&xxx_lock, flags); + + write_lock_irqsave(&xxx_lock, flags); + .. read and write exclusive access to the info ... + write_unlock_irqrestore(&xxx_lock, flags); + +The above kind of lock is useful for complex data structures like linked +lists etc, especially when you know that most of the work is to just +traverse the list searching for entries without changing the list itself, +for example. Then you can use the read lock for that kind of list +traversal, which allows many concurrent readers. Anything that _changes_ +the list will have to get the write lock. + +Note: you cannot "upgrade" a read-lock to a write-lock, so if you at _any_ +time need to do any changes (even if you don't do it every time), you have +to get the write-lock at the very beginning. I could fairly easily add a +primitive to create a "upgradeable" read-lock, but it hasn't been an issue +yet. Tell me if you'd want one. + +---- + +Lesson 3: spinlocks revisited. + +The single spin-lock primitives above are by no means the only ones. They +are the most safe ones, and the ones that work under all circumstances, +but partly _because_ they are safe they are also fairly slow. They are +much faster than a generic global cli/sti pair, but slower than they'd +need to be, because they do have to disable interrupts (which is just a +single instruction on a x86, but it's an expensive one - and on other +architectures it can be worse). + +If you have a case where you have to protect a data structure across +several CPU's and you want to use spinlocks you can potentially use +cheaper versions of the spinlocks. IFF you know that the spinlocks are +never used in interrupt handlers, you can use the non-irq versions: + + spin_lock(&lock); + ... + spin_unlock(&lock); + +(and the equivalent read-write versions too, of course). The spinlock will +guarantee the same kind of exclusive access, and it will be much faster. +This is useful if you know that the data in question is only ever +manipulated from a "process context", ie no interrupts involved. + +The reasons you mustn't use these versions if you have interrupts that +play with the spinlock is that you can get deadlocks: + + spin_lock(&lock); + ... + <- interrupt comes in: + spin_lock(&lock); + +where an interrupt tries to lock an already locked variable. This is ok if +the other interrupt happens on another CPU, but it is _not_ ok if the +interrupt happens on the same CPU that already holds the lock, because the +lock will obviously never be released (because the interrupt is waiting +for the lock, and the lock-holder is interrupted by the interrupt and will +not continue until the interrupt has been processed). + +(This is also the reason why the irq-versions of the spinlocks only need +to disable the _local_ interrupts - it's ok to use spinlocks in interrupts +on other CPU's, because an interrupt on another CPU doesn't interrupt the +CPU that holds the lock, so the lock-holder can continue and eventually +releases the lock). + +Note that you can be clever with read-write locks and interrupts. For +example, if you know that the interrupt only ever gets a read-lock, then +you can use a non-irq version of read locks everywhere - because they +don't block on each other (and thus there is no dead-lock wrt interrupts. +But when you do the write-lock, you have to use the irq-safe version. + +For an example of being clever with rw-locks, see the "waitqueue_lock" +handling in kernel/sched.c - nothing ever _changes_ a wait-queue from +within an interrupt, they only read the queue in order to know whom to +wake up. So read-locks are safe (which is good: they are very common +indeed), while write-locks need to protect themselves against interrupts. + + Linus + + diff --git a/Documentation/stable_api_nonsense.txt b/Documentation/stable_api_nonsense.txt new file mode 100644 index 000000000000..3cea13875277 --- /dev/null +++ b/Documentation/stable_api_nonsense.txt @@ -0,0 +1,193 @@ +The Linux Kernel Driver Interface +(all of your questions answered and then some) + +Greg Kroah-Hartman <greg@kroah.com> + +This is being written to try to explain why Linux does not have a binary +kernel interface, nor does it have a stable kernel interface. Please +realize that this article describes the _in kernel_ interfaces, not the +kernel to userspace interfaces. The kernel to userspace interface is +the one that application programs use, the syscall interface. That +interface is _very_ stable over time, and will not break. I have old +programs that were built on a pre 0.9something kernel that still work +just fine on the latest 2.6 kernel release. This interface is the one +that users and application programmers can count on being stable. + + +Executive Summary +----------------- +You think you want a stable kernel interface, but you really do not, and +you don't even know it. What you want is a stable running driver, and +you get that only if your driver is in the main kernel tree. You also +get lots of other good benefits if your driver is in the main kernel +tree, all of which has made Linux into such a strong, stable, and mature +operating system which is the reason you are using it in the first +place. + + +Intro +----- + +It's only the odd person who wants to write a kernel driver that needs +to worry about the in-kernel interfaces changing. For the majority of +the world, they neither see this interface, nor do they care about it at +all. + +First off, I'm not going to address _any_ legal issues about closed +source, hidden source, binary blobs, source wrappers, or any other term +that describes kernel drivers that do not have their source code +released under the GPL. Please consult a lawyer if you have any legal +questions, I'm a programmer and hence, I'm just going to be describing +the technical issues here (not to make light of the legal issues, they +are real, and you do need to be aware of them at all times.) + +So, there are two main topics here, binary kernel interfaces and stable +kernel source interfaces. They both depend on each other, but we will +discuss the binary stuff first to get it out of the way. + + +Binary Kernel Interface +----------------------- +Assuming that we had a stable kernel source interface for the kernel, a +binary interface would naturally happen too, right? Wrong. Please +consider the following facts about the Linux kernel: + - Depending on the version of the C compiler you use, different kernel + data structures will contain different alignment of structures, and + possibly include different functions in different ways (putting + functions inline or not.) The individual function organization + isn't that important, but the different data structure padding is + very important. + - Depending on what kernel build options you select, a wide range of + different things can be assumed by the kernel: + - different structures can contain different fields + - Some functions may not be implemented at all, (i.e. some locks + compile away to nothing for non-SMP builds.) + - Parameter passing of variables from function to function can be + done in different ways (the CONFIG_REGPARM option controls + this.) + - Memory within the kernel can be aligned in different ways, + depending on the build options. + - Linux runs on a wide range of different processor architectures. + There is no way that binary drivers from one architecture will run + on another architecture properly. + +Now a number of these issues can be addressed by simply compiling your +module for the exact specific kernel configuration, using the same exact +C compiler that the kernel was built with. This is sufficient if you +want to provide a module for a specific release version of a specific +Linux distribution. But multiply that single build by the number of +different Linux distributions and the number of different supported +releases of the Linux distribution and you quickly have a nightmare of +different build options on different releases. Also realize that each +Linux distribution release contains a number of different kernels, all +tuned to different hardware types (different processor types and +different options), so for even a single release you will need to create +multiple versions of your module. + +Trust me, you will go insane over time if you try to support this kind +of release, I learned this the hard way a long time ago... + + +Stable Kernel Source Interfaces +------------------------------- + +This is a much more "volatile" topic if you talk to people who try to +keep a Linux kernel driver that is not in the main kernel tree up to +date over time. + +Linux kernel development is continuous and at a rapid pace, never +stopping to slow down. As such, the kernel developers find bugs in +current interfaces, or figure out a better way to do things. If they do +that, they then fix the current interfaces to work better. When they do +so, function names may change, structures may grow or shrink, and +function parameters may be reworked. If this happens, all of the +instances of where this interface is used within the kernel are fixed up +at the same time, ensuring that everything continues to work properly. + +As a specific examples of this, the in-kernel USB interfaces have +undergone at least three different reworks over the lifetime of this +subsystem. These reworks were done to address a number of different +issues: + - A change from a synchronous model of data streams to an asynchronous + one. This reduced the complexity of a number of drivers and + increased the throughput of all USB drivers such that we are now + running almost all USB devices at their maximum speed possible. + - A change was made in the way data packets were allocated from the + USB core by USB drivers so that all drivers now needed to provide + more information to the USB core to fix a number of documented + deadlocks. + +This is in stark contrast to a number of closed source operating systems +which have had to maintain their older USB interfaces over time. This +provides the ability for new developers to accidentally use the old +interfaces and do things in improper ways, causing the stability of the +operating system to suffer. + +In both of these instances, all developers agreed that these were +important changes that needed to be made, and they were made, with +relatively little pain. If Linux had to ensure that it preserve a +stable source interface, a new interface would have been created, and +the older, broken one would have had to be maintained over time, leading +to extra work for the USB developers. Since all Linux USB developers do +their work on their own time, asking programmers to do extra work for no +gain, for free, is not a possibility. + +Security issues are also a very important for Linux. When a +security issue is found, it is fixed in a very short amount of time. A +number of times this has caused internal kernel interfaces to be +reworked to prevent the security problem from occurring. When this +happens, all drivers that use the interfaces were also fixed at the +same time, ensuring that the security problem was fixed and could not +come back at some future time accidentally. If the internal interfaces +were not allowed to change, fixing this kind of security problem and +insuring that it could not happen again would not be possible. + +Kernel interfaces are cleaned up over time. If there is no one using a +current interface, it is deleted. This ensures that the kernel remains +as small as possible, and that all potential interfaces are tested as +well as they can be (unused interfaces are pretty much impossible to +test for validity.) + + +What to do +---------- + +So, if you have a Linux kernel driver that is not in the main kernel +tree, what are you, a developer, supposed to do? Releasing a binary +driver for every different kernel version for every distribution is a +nightmare, and trying to keep up with an ever changing kernel interface +is also a rough job. + +Simple, get your kernel driver into the main kernel tree (remember we +are talking about GPL released drivers here, if your code doesn't fall +under this category, good luck, you are on your own here, you leech +<insert link to leech comment from Andrew and Linus here>.) If your +driver is in the tree, and a kernel interface changes, it will be fixed +up by the person who did the kernel change in the first place. This +ensures that your driver is always buildable, and works over time, with +very little effort on your part. + +The very good side effects of having your driver in the main kernel tree +are: + - The quality of the driver will rise as the maintenance costs (to the + original developer) will decrease. + - Other developers will add features to your driver. + - Other people will find and fix bugs in your driver. + - Other people will find tuning opportunities in your driver. + - Other people will update the driver for you when external interface + changes require it. + - The driver automatically gets shipped in all Linux distributions + without having to ask the distros to add it. + +As Linux supports a larger number of different devices "out of the box" +than any other operating system, and it supports these devices on more +different processor architectures than any other operating system, this +proven type of development model must be doing something right :) + + + +------ + +Thanks to Randy Dunlap, Andrew Morton, David Brownell, Hanna Linder, +Robert Love, and Nishanth Aravamudan for their review and comments on +early drafts of this paper. diff --git a/Documentation/stallion.txt b/Documentation/stallion.txt new file mode 100644 index 000000000000..5c4902d9a5be --- /dev/null +++ b/Documentation/stallion.txt @@ -0,0 +1,392 @@ +* NOTE - This is an unmaintained driver. Lantronix, which bought Stallion +technologies, is not active in driver maintenance, and they have no information +on when or if they will have a 2.6 driver. + +James Nelson <james4765@gmail.com> - 12-12-2004 + +Stallion Multiport Serial Driver Readme +--------------------------------------- + +Copyright (C) 1994-1999, Stallion Technologies. + +Version: 5.5.1 +Date: 28MAR99 + + + +1. INTRODUCTION + +There are two drivers that work with the different families of Stallion +multiport serial boards. One is for the Stallion smart boards - that is +EasyIO, EasyConnection 8/32 and EasyConnection 8/64-PCI, the other for +the true Stallion intelligent multiport boards - EasyConnection 8/64 +(ISA, EISA, MCA), EasyConnection/RA-PCI, ONboard and Brumby. + +If you are using any of the Stallion intelligent multiport boards (Brumby, +ONboard, EasyConnection 8/64 (ISA, EISA, MCA), EasyConnection/RA-PCI) with +Linux you will need to get the driver utility package. This contains a +firmware loader and the firmware images necessary to make the devices operate. + +The Stallion Technologies ftp site, ftp.stallion.com, will always have +the latest version of the driver utility package. + +ftp://ftp.stallion.com/drivers/ata5/Linux/ata-linux-550.tar.gz + +As of the printing of this document the latest version of the driver +utility package is 5.5.0. If a later version is now available then you +should use the latest version. + +If you are using the EasyIO, EasyConnection 8/32 or EasyConnection 8/64-PCI +boards then you don't need this package, although it does have a serial stats +display program. + +If you require DIP switch settings, EISA or MCA configuration files, or any +other information related to Stallion boards then have a look at Stallion's +web pages at http://www.stallion.com. + + + +2. INSTALLATION + +The drivers can be used as loadable modules or compiled into the kernel. +You can choose which when doing a "config" on the kernel. + +All ISA, EISA and MCA boards that you want to use need to be configured into +the driver(s). All PCI boards will be automatically detected when you load +the driver - so they do not need to be entered into the driver(s) +configuration structure. Note that kernel PCI support is required to use PCI +boards. + +There are two methods of configuring ISA, EISA and MCA boards into the drivers. +If using the driver as a loadable module then the simplest method is to pass +the driver configuration as module arguments. The other method is to modify +the driver source to add configuration lines for each board in use. + +If you have pre-built Stallion driver modules then the module argument +configuration method should be used. A lot of Linux distributions come with +pre-built driver modules in /lib/modules/X.Y.Z/misc for the kernel in use. +That makes things pretty simple to get going. + + +2.1 MODULE DRIVER CONFIGURATION: + +The simplest configuration for modules is to use the module load arguments +to configure any ISA, EISA or MCA boards. PCI boards are automatically +detected, so do not need any additional configuration at all. + +If using EasyIO, EasyConnection 8/32 ISA or MCA, or EasyConnection 8/63-PCI +boards then use the "stallion" driver module, Otherwise if you are using +an EasyConnection 8/64 ISA, EISA or MCA, EasyConnection/RA-PCI, ONboard, +Brumby or original Stallion board then use the "istallion" driver module. + +Typically to load up the smart board driver use: + + modprobe stallion + +This will load the EasyIO and EasyConnection 8/32 driver. It will output a +message to say that it loaded and print the driver version number. It will +also print out whether it found the configured boards or not. These messages +may not appear on the console, but typically are always logged to +/var/adm/messages or /var/log/syslog files - depending on how the klogd and +syslogd daemons are setup on your system. + +To load the intelligent board driver use: + + modprobe istallion + +It will output similar messages to the smart board driver. + +If not using an auto-detectable board type (that is a PCI board) then you +will also need to supply command line arguments to the modprobe command +when loading the driver. The general form of the configuration argument is + + board?=<name>[,<ioaddr>[,<addr>][,<irq>]] + +where: + + board? -- specifies the arbitrary board number of this board, + can be in the range 0 to 3. + + name -- textual name of this board. The board name is the common + board name, or any "shortened" version of that. The board + type number may also be used here. + + ioaddr -- specifies the I/O address of this board. This argument is + optional, but should generally be specified. + + addr -- optional second address argument. Some board types require + a second I/O address, some require a memory address. The + exact meaning of this argument depends on the board type. + + irq -- optional IRQ line used by this board. + +Up to 4 board configuration arguments can be specified on the load line. +Here is some examples: + + modprobe stallion board0=easyio,0x2a0,5 + +This configures an EasyIO board as board 0 at I/O address 0x2a0 and IRQ 5. + + modprobe istallion board3=ec8/64,0x2c0,0xcc000 + +This configures an EasyConnection 8/64 ISA as board 3 at I/O address 0x2c0 at +memory address 0xcc000. + + modprobe stallion board1=ec8/32-at,0x2a0,0x280,10 + +This configures an EasyConnection 8/32 ISA board at primary I/O address 0x2a0, +secondary address 0x280 and IRQ 10. + +You will probably want to enter this module load and configuration information +into your system startup scripts so that the drivers are loaded and configured +on each system boot. Typically the start up script would be something like +/etc/modprobe.conf. + + +2.2 STATIC DRIVER CONFIGURATION: + +For static driver configuration you need to modify the driver source code. +Entering ISA, EISA and MCA boards into the driver(s) configuration structure +involves editing the driver(s) source file. It's pretty easy if you follow +the instructions below. Both drivers can support up to 4 boards. The smart +card driver (the stallion.c driver) supports any combination of EasyIO and +EasyConnection 8/32 boards (up to a total of 4). The intelligent driver +supports any combination of ONboards, Brumbys, Stallions and EasyConnection +8/64 (ISA and EISA) boards (up to a total of 4). + +To set up the driver(s) for the boards that you want to use you need to +edit the appropriate driver file and add configuration entries. + +If using EasyIO or EasyConnection 8/32 ISA or MCA boards, + In drivers/char/stallion.c: + - find the definition of the stl_brdconf array (of structures) + near the top of the file + - modify this to match the boards you are going to install + (the comments before this structure should help) + - save and exit + +If using ONboard, Brumby, Stallion or EasyConnection 8/64 (ISA or EISA) +boards, + In drivers/char/istallion.c: + - find the definition of the stli_brdconf array (of structures) + near the top of the file + - modify this to match the boards you are going to install + (the comments before this structure should help) + - save and exit + +Once you have set up the board configurations then you are ready to build +the kernel or modules. + +When the new kernel is booted, or the loadable module loaded then the +driver will emit some kernel trace messages about whether the configured +boards were detected or not. Depending on how your system logger is set +up these may come out on the console, or just be logged to +/var/adm/messages or /var/log/syslog. You should check the messages to +confirm that all is well. + + +2.3 SHARING INTERRUPTS + +It is possible to share interrupts between multiple EasyIO and +EasyConnection 8/32 boards in an EISA system. To do this you must be using +static driver configuration, modifying the driver source code to add driver +configuration. Then a couple of extra things are required: + +1. When entering the board resources into the stallion.c file you need to + mark the boards as using level triggered interrupts. Do this by replacing + the "0" entry at field position 6 (the last field) in the board + configuration structure with a "1". (This is the structure that defines + the board type, I/O locations, etc. for each board). All boards that are + sharing an interrupt must be set this way, and each board should have the + same interrupt number specified here as well. Now build the module or + kernel as you would normally. + +2. When physically installing the boards into the system you must enter + the system EISA configuration utility. You will need to install the EISA + configuration files for *all* the EasyIO and EasyConnection 8/32 boards + that are sharing interrupts. The Stallion EasyIO and EasyConnection 8/32 + EISA configuration files required are supplied by Stallion Technologies + on the EASY Utilities floppy diskette (usually supplied in the box with + the board when purchased. If not, you can pick it up from Stallion's FTP + site, ftp.stallion.com). You will need to edit the board resources to + choose level triggered interrupts, and make sure to set each board's + interrupt to the same IRQ number. + +You must complete both the above steps for this to work. When you reboot +or load the driver your EasyIO and EasyConnection 8/32 boards will be +sharing interrupts. + + +2.4 USING HIGH SHARED MEMORY + +The EasyConnection 8/64-EI, ONboard and Stallion boards are capable of +using shared memory addresses above the usual 640K - 1Mb range. The ONboard +ISA and the Stallion boards can be programmed to use memory addresses up to +16Mb (the ISA bus addressing limit), and the EasyConnection 8/64-EI and +ONboard/E can be programmed for memory addresses up to 4Gb (the EISA bus +addressing limit). + +The higher than 1Mb memory addresses are fully supported by this driver. +Just enter the address as you normally would for a lower than 1Mb address +(in the driver's board configuration structure). + + + +2.5 TROUBLE SHOOTING + +If a board is not found by the driver but is actually in the system then the +most likely problem is that the I/O address is wrong. Change the module load +argument for the loadable module form. Or change it in the driver stallion.c +or istallion.c configuration structure and rebuild the kernel or modules, or +change it on the board. + +On EasyIO and EasyConnection 8/32 boards the IRQ is software programmable, so +if there is a conflict you may need to change the IRQ used for a board. There +are no interrupts to worry about for ONboard, Brumby or EasyConnection 8/64 +(ISA, EISA and MCA) boards. The memory region on EasyConnection 8/64 and +ONboard boards is software programmable, but not on the Brumby boards. + + + +3. USING THE DRIVERS + +3.1 INTELLIGENT DRIVER OPERATION + +The intelligent boards also need to have their "firmware" code downloaded +to them. This is done via a user level application supplied in the driver +utility package called "stlload". Compile this program wherever you dropped +the package files, by typing "make". In its simplest form you can then type + + ./stlload -i cdk.sys + +in this directory and that will download board 0 (assuming board 0 is an +EasyConnection 8/64 or EasyConnection/RA board). To download to an +ONboard, Brumby or Stallion do: + + ./stlload -i 2681.sys + +Normally you would want all boards to be downloaded as part of the standard +system startup. To achieve this, add one of the lines above into the +/etc/rc.d/rc.S or /etc/rc.d/rc.serial file. To download each board just add +the "-b <brd-number>" option to the line. You will need to download code for +every board. You should probably move the stlload program into a system +directory, such as /usr/sbin. Also, the default location of the cdk.sys image +file in the stlload down-loader is /usr/lib/stallion. Create that directory +and put the cdk.sys and 2681.sys files in it. (It's a convenient place to put +them anyway). As an example your /etc/rc.d/rc.S file might have the +following lines added to it (if you had 3 boards): + + /usr/sbin/stlload -b 0 -i /usr/lib/stallion/cdk.sys + /usr/sbin/stlload -b 1 -i /usr/lib/stallion/2681.sys + /usr/sbin/stlload -b 2 -i /usr/lib/stallion/2681.sys + +The image files cdk.sys and 2681.sys are specific to the board types. The +cdk.sys will only function correctly on an EasyConnection 8/64 board. Similarly +the 2681.sys image fill only operate on ONboard, Brumby and Stallion boards. +If you load the wrong image file into a board it will fail to start up, and +of course the ports will not be operational! + +If you are using the modularized version of the driver you might want to put +the modprobe calls in the startup script as well (before the download lines +obviously). + + +3.2 USING THE SERIAL PORTS + +Once the driver is installed you will need to setup some device nodes to +access the serial ports. The simplest method is to use the /dev/MAKEDEV program. +It will automatically create device entries for Stallion boards. This will +create the normal serial port devices as /dev/ttyE# where# is the port number +starting from 0. A bank of 64 minor device numbers is allocated to each board, +so the first port on the second board is port 64,etc. A set of callout type +devices may also be created. They are created as the devices /dev/cue# where # +is the same as for the ttyE devices. + +For the most part the Stallion driver tries to emulate the standard PC system +COM ports and the standard Linux serial driver. The idea is that you should +be able to use Stallion board ports and COM ports interchangeably without +modifying anything but the device name. Anything that doesn't work like that +should be considered a bug in this driver! + +If you look at the driver code you will notice that it is fairly closely +based on the Linux serial driver (linux/drivers/char/serial.c). This is +intentional, obviously this is the easiest way to emulate its behavior! + +Since this driver tries to emulate the standard serial ports as much as +possible, most system utilities should work as they do for the standard +COM ports. Most importantly "stty" works as expected and "setserial" can +also be used (excepting the ability to auto-configure the I/O and IRQ +addresses of boards). Higher baud rates are supported in the usual fashion +through setserial or using the CBAUDEX extensions. Note that the EasyIO and +EasyConnection (all types) support at least 57600 and 115200 baud. The newer +EasyConnection XP modules and new EasyIO boards support 230400 and 460800 +baud as well. The older boards including ONboard and Brumby support a +maximum baud rate of 38400. + +If you are unfamiliar with how to use serial ports, then get the Serial-HOWTO +by Greg Hankins. It will explain everything you need to know! + + + +4. NOTES + +You can use both drivers at once if you have a mix of board types installed +in a system. However to do this you will need to change the major numbers +used by one of the drivers. Currently both drivers use major numbers 24, 25 +and 28 for their devices. Change one driver to use some other major numbers, +and then modify the mkdevnods script to make device nodes based on those new +major numbers. For example, you could change the istallion.c driver to use +major numbers 60, 61 and 62. You will also need to create device nodes with +different names for the ports, for example ttyF# and cuf#. + +The original Stallion board is no longer supported by Stallion Technologies. +Although it is known to work with the istallion driver. + +Finding a free physical memory address range can be a problem. The older +boards like the Stallion and ONboard need large areas (64K or even 128K), so +they can be very difficult to get into a system. If you have 16 Mb of RAM +then you have no choice but to put them somewhere in the 640K -> 1Mb range. +ONboards require 64K, so typically 0xd0000 is good, or 0xe0000 on some +systems. If you have an original Stallion board, "V4.0" or Rev.O, then you +need a 64K memory address space, so again 0xd0000 and 0xe0000 are good. +Older Stallion boards are a much bigger problem. They need 128K of address +space and must be on a 128K boundary. If you don't have a VGA card then +0xc0000 might be usable - there is really no other place you can put them +below 1Mb. + +Both the ONboard and old Stallion boards can use higher memory addresses as +well, but you must have less than 16Mb of RAM to be able to use them. Usual +high memory addresses used include 0xec0000 and 0xf00000. + +The Brumby boards only require 16Kb of address space, so you can usually +squeeze them in somewhere. Common addresses are 0xc8000, 0xcc000, or in +the 0xd0000 range. EasyConnection 8/64 boards are even better, they only +require 4Kb of address space, again usually 0xc8000, 0xcc000 or 0xd0000 +are good. + +If you are using an EasyConnection 8/64-EI or ONboard/E then usually the +0xd0000 or 0xe0000 ranges are the best options below 1Mb. If neither of +them can be used then the high memory support to use the really high address +ranges is the best option. Typically the 2Gb range is convenient for them, +and gets them well out of the way. + +The ports of the EasyIO-8M board do not have DCD or DTR signals. So these +ports cannot be used as real modem devices. Generally, when using these +ports you should only use the cueX devices. + +The driver utility package contains a couple of very useful programs. One +is a serial port statistics collection and display program - very handy +for solving serial port problems. The other is an extended option setting +program that works with the intelligent boards. + + + +5. DISCLAIMER + +The information contained in this document is believed to be accurate and +reliable. However, no responsibility is assumed by Stallion Technologies +Pty. Ltd. for its use, nor any infringements of patents or other rights +of third parties resulting from its use. Stallion Technologies reserves +the right to modify the design of its products and will endeavour to change +the information in manuals and accompanying documentation accordingly. + diff --git a/Documentation/svga.txt b/Documentation/svga.txt new file mode 100644 index 000000000000..cd66ec836e4f --- /dev/null +++ b/Documentation/svga.txt @@ -0,0 +1,276 @@ + Video Mode Selection Support 2.13 + (c) 1995--1999 Martin Mares, <mj@ucw.cz> +-------------------------------------------------------------------------------- + +1. Intro +~~~~~~~~ + This small document describes the "Video Mode Selection" feature which +allows the use of various special video modes supported by the video BIOS. Due +to usage of the BIOS, the selection is limited to boot time (before the +kernel decompression starts) and works only on 80X86 machines. + + ** Short intro for the impatient: Just use vga=ask for the first time, + ** enter `scan' on the video mode prompt, pick the mode you want to use, + ** remember its mode ID (the four-digit hexadecimal number) and then + ** set the vga parameter to this number (converted to decimal first). + + The video mode to be used is selected by a kernel parameter which can be +specified in the kernel Makefile (the SVGA_MODE=... line) or by the "vga=..." +option of LILO (or some other boot loader you use) or by the "vidmode" utility +(present in standard Linux utility packages). You can use the following values +of this parameter: + + NORMAL_VGA - Standard 80x25 mode available on all display adapters. + + EXTENDED_VGA - Standard 8-pixel font mode: 80x43 on EGA, 80x50 on VGA. + + ASK_VGA - Display a video mode menu upon startup (see below). + + 0..35 - Menu item number (when you have used the menu to view the list of + modes available on your adapter, you can specify the menu item you want + to use). 0..9 correspond to "0".."9", 10..35 to "a".."z". Warning: the + mode list displayed may vary as the kernel version changes, because the + modes are listed in a "first detected -- first displayed" manner. It's + better to use absolute mode numbers instead. + + 0x.... - Hexadecimal video mode ID (also displayed on the menu, see below + for exact meaning of the ID). Warning: rdev and LILO don't support + hexadecimal numbers -- you have to convert it to decimal manually. + +2. Menu +~~~~~~~ + The ASK_VGA mode causes the kernel to offer a video mode menu upon +bootup. It displays a "Press <RETURN> to see video modes available, <SPACE> +to continue or wait 30 secs" message. If you press <RETURN>, you enter the +menu, if you press <SPACE> or wait 30 seconds, the kernel will boot up in +the standard 80x25 mode. + + The menu looks like: + +Video adapter: <name-of-detected-video-adapter> +Mode: COLSxROWS: +0 0F00 80x25 +1 0F01 80x50 +2 0F02 80x43 +3 0F03 80x26 +.... +Enter mode number or `scan': <flashing-cursor-here> + + <name-of-detected-video-adapter> tells what video adapter did Linux detect +-- it's either a generic adapter name (MDA, CGA, HGC, EGA, VGA, VESA VGA [a VGA +with VESA-compliant BIOS]) or a chipset name (e.g., Trident). Direct detection +of chipsets is turned off by default (see CONFIG_VIDEO_SVGA in chapter 4 to see +how to enable it if you really want) as it's inherently unreliable due to +absolutely insane PC design. + + "0 0F00 80x25" means that the first menu item (the menu items are numbered +from "0" to "9" and from "a" to "z") is a 80x25 mode with ID=0x0f00 (see the +next section for a description of mode IDs). + + <flashing-cursor-here> encourages you to enter the item number or mode ID +you wish to set and press <RETURN>. If the computer complains something about +"Unknown mode ID", it is trying to tell you that it isn't possible to set such +a mode. It's also possible to press only <RETURN> which leaves the current mode. + + The mode list usually contains a few basic modes and some VESA modes. In +case your chipset has been detected, some chipset-specific modes are shown as +well (some of these might be missing or unusable on your machine as different +BIOSes are often shipped with the same card and the mode numbers depend purely +on the VGA BIOS). + + The modes displayed on the menu are partially sorted: The list starts with +the standard modes (80x25 and 80x50) followed by "special" modes (80x28 and +80x43), local modes (if the local modes feature is enabled), VESA modes and +finally SVGA modes for the auto-detected adapter. + + If you are not happy with the mode list offered (e.g., if you think your card +is able to do more), you can enter "scan" instead of item number / mode ID. The +program will try to ask the BIOS for all possible video mode numbers and test +what happens then. The screen will be probably flashing wildly for some time and +strange noises will be heard from inside the monitor and so on and then, really +all consistent video modes supported by your BIOS will appear (plus maybe some +`ghost modes'). If you are afraid this could damage your monitor, don't use this +function. + + After scanning, the mode ordering is a bit different: the auto-detected SVGA +modes are not listed at all and the modes revealed by `scan' are shown before +all VESA modes. + +3. Mode IDs +~~~~~~~~~~~ + Because of the complexity of all the video stuff, the video mode IDs +used here are also a bit complex. A video mode ID is a 16-bit number usually +expressed in a hexadecimal notation (starting with "0x"). You can set a mode +by entering its mode directly if you know it even if it isn't shown on the menu. + +The ID numbers can be divided to three regions: + + 0x0000 to 0x00ff - menu item references. 0x0000 is the first item. Don't use + outside the menu as this can change from boot to boot (especially if you + have used the `scan' feature). + + 0x0100 to 0x017f - standard BIOS modes. The ID is a BIOS video mode number + (as presented to INT 10, function 00) increased by 0x0100. + + 0x0200 to 0x08ff - VESA BIOS modes. The ID is a VESA mode ID increased by + 0x0100. All VESA modes should be autodetected and shown on the menu. + + 0x0900 to 0x09ff - Video7 special modes. Set by calling INT 0x10, AX=0x6f05. + (Usually 940=80x43, 941=132x25, 942=132x44, 943=80x60, 944=100x60, + 945=132x28 for the standard Video7 BIOS) + + 0x0f00 to 0x0fff - special modes (they are set by various tricks -- usually + by modifying one of the standard modes). Currently available: + 0x0f00 standard 80x25, don't reset mode if already set (=FFFF) + 0x0f01 standard with 8-point font: 80x43 on EGA, 80x50 on VGA + 0x0f02 VGA 80x43 (VGA switched to 350 scanlines with a 8-point font) + 0x0f03 VGA 80x28 (standard VGA scans, but 14-point font) + 0x0f04 leave current video mode + 0x0f05 VGA 80x30 (480 scans, 16-point font) + 0x0f06 VGA 80x34 (480 scans, 14-point font) + 0x0f07 VGA 80x60 (480 scans, 8-point font) + 0x0f08 Graphics hack (see the CONFIG_VIDEO_HACK paragraph below) + + 0x1000 to 0x7fff - modes specified by resolution. The code has a "0xRRCC" + form where RR is a number of rows and CC is a number of columns. + E.g., 0x1950 corresponds to a 80x25 mode, 0x2b84 to 132x43 etc. + This is the only fully portable way to refer to a non-standard mode, + but it relies on the mode being found and displayed on the menu + (remember that mode scanning is not done automatically). + + 0xff00 to 0xffff - aliases for backward compatibility: + 0xffff equivalent to 0x0f00 (standard 80x25) + 0xfffe equivalent to 0x0f01 (EGA 80x43 or VGA 80x50) + + If you add 0x8000 to the mode ID, the program will try to recalculate +vertical display timing according to mode parameters, which can be used to +eliminate some annoying bugs of certain VGA BIOSes (usually those used for +cards with S3 chipsets and old Cirrus Logic BIOSes) -- mainly extra lines at the +end of the display. + +4. Options +~~~~~~~~~~ + Some options can be set in the source text (in arch/i386/boot/video.S). +All of them are simple #define's -- change them to #undef's when you want to +switch them off. Currently supported: + + CONFIG_VIDEO_SVGA - enables autodetection of SVGA cards. This is switched +off by default as it's a bit unreliable due to terribly bad PC design. If you +really want to have the adapter autodetected (maybe in case the `scan' feature +doesn't work on your machine), switch this on and don't cry if the results +are not completely sane. In case you really need this feature, please drop me +a mail as I think of removing it some day. + + CONFIG_VIDEO_VESA - enables autodetection of VESA modes. If it doesn't work +on your machine (or displays a "Error: Scanning of VESA modes failed" message), +you can switch it off and report as a bug. + + CONFIG_VIDEO_COMPACT - enables compacting of the video mode list. If there +are more modes with the same screen size, only the first one is kept (see above +for more info on mode ordering). However, in very strange cases it's possible +that the first "version" of the mode doesn't work although some of the others +do -- in this case turn this switch off to see the rest. + + CONFIG_VIDEO_RETAIN - enables retaining of screen contents when switching +video modes. Works only with some boot loaders which leave enough room for the +buffer. (If you have old LILO, you can adjust heap_end_ptr and loadflags +in setup.S, but it's better to upgrade the boot loader...) + + CONFIG_VIDEO_LOCAL - enables inclusion of "local modes" in the list. The +local modes are added automatically to the beginning of the list not depending +on hardware configuration. The local modes are listed in the source text after +the "local_mode_table:" line. The comment before this line describes the format +of the table (which also includes a video card name to be displayed on the +top of the menu). + + CONFIG_VIDEO_400_HACK - force setting of 400 scan lines for standard VGA +modes. This option is intended to be used on certain buggy BIOSes which draw +some useless logo using font download and then fail to reset the correct mode. +Don't use unless needed as it forces resetting the video card. + + CONFIG_VIDEO_GFX_HACK - includes special hack for setting of graphics modes +to be used later by special drivers (e.g., 800x600 on IBM ThinkPad -- see +ftp://ftp.phys.keio.ac.jp/pub/XFree86/800x600/XF86Configs/XF86Config.IBM_TP560). +Allows to set _any_ BIOS mode including graphic ones and forcing specific +text screen resolution instead of peeking it from BIOS variables. Don't use +unless you think you know what you're doing. To activate this setup, use +mode number 0x0f08 (see section 3). + +5. Still doesn't work? +~~~~~~~~~~~~~~~~~~~~~~ + When the mode detection doesn't work (e.g., the mode list is incorrect or +the machine hangs instead of displaying the menu), try to switch off some of +the configuration options listed in section 4. If it fails, you can still use +your kernel with the video mode set directly via the kernel parameter. + + In either case, please send me a bug report containing what _exactly_ +happens and how do the configuration switches affect the behaviour of the bug. + + If you start Linux from M$-DOS, you might also use some DOS tools for +video mode setting. In this case, you must specify the 0x0f04 mode ("leave +current settings") to Linux, because if you don't and you use any non-standard +mode, Linux will switch to 80x25 automatically. + + If you set some extended mode and there's one or more extra lines on the +bottom of the display containing already scrolled-out text, your VGA BIOS +contains the most common video BIOS bug called "incorrect vertical display +end setting". Adding 0x8000 to the mode ID might fix the problem. Unfortunately, +this must be done manually -- no autodetection mechanisms are available. + + If you have a VGA card and your display still looks as on EGA, your BIOS +is probably broken and you need to set the CONFIG_VIDEO_400_HACK switch to +force setting of the correct mode. + +6. History +~~~~~~~~~~ +1.0 (??-Nov-95) First version supporting all adapters supported by the old + setup.S + Cirrus Logic 54XX. Present in some 1.3.4? kernels + and then removed due to instability on some machines. +2.0 (28-Jan-96) Rewritten from scratch. Cirrus Logic 64XX support added, almost + everything is configurable, the VESA support should be much more + stable, explicit mode numbering allowed, "scan" implemented etc. +2.1 (30-Jan-96) VESA modes moved to 0x200-0x3ff. Mode selection by resolution + supported. Few bugs fixed. VESA modes are listed prior to + modes supplied by SVGA autodetection as they are more reliable. + CLGD autodetect works better. Doesn't depend on 80x25 being + active when started. Scanning fixed. 80x43 (any VGA) added. + Code cleaned up. +2.2 (01-Feb-96) EGA 80x43 fixed. VESA extended to 0x200-0x4ff (non-standard 02XX + VESA modes work now). Display end bug workaround supported. + Special modes renumbered to allow adding of the "recalculate" + flag, 0xffff and 0xfffe became aliases instead of real IDs. + Screen contents retained during mode changes. +2.3 (15-Mar-96) Changed to work with 1.3.74 kernel. +2.4 (18-Mar-96) Added patches by Hans Lermen fixing a memory overwrite problem + with some boot loaders. Memory management rewritten to reflect + these changes. Unfortunately, screen contents retaining works + only with some loaders now. + Added a Tseng 132x60 mode. +2.5 (19-Mar-96) Fixed a VESA mode scanning bug introduced in 2.4. +2.6 (25-Mar-96) Some VESA BIOS errors not reported -- it fixes error reports on + several cards with broken VESA code (e.g., ATI VGA). +2.7 (09-Apr-96) - Accepted all VESA modes in range 0x100 to 0x7ff, because some + cards use very strange mode numbers. + - Added Realtek VGA modes (thanks to Gonzalo Tornaria). + - Hardware testing order slightly changed, tests based on ROM + contents done as first. + - Added support for special Video7 mode switching functions + (thanks to Tom Vander Aa). + - Added 480-scanline modes (especially useful for notebooks, + original version written by hhanemaa@cs.ruu.nl, patched by + Jeff Chua, rewritten by me). + - Screen store/restore fixed. +2.8 (14-Apr-96) - Previous release was not compilable without CONFIG_VIDEO_SVGA. + - Better recognition of text modes during mode scan. +2.9 (12-May-96) - Ignored VESA modes 0x80 - 0xff (more VESA BIOS bugs!) +2.10 (11-Nov-96)- The whole thing made optional. + - Added the CONFIG_VIDEO_400_HACK switch. + - Added the CONFIG_VIDEO_GFX_HACK switch. + - Code cleanup. +2.11 (03-May-97)- Yet another cleanup, now including also the documentation. + - Direct testing of SVGA adapters turned off by default, `scan' + offered explicitly on the prompt line. + - Removed the doc section describing adding of new probing + functions as I try to get rid of _all_ hardware probing here. +2.12 (25-May-98)- Added support for VESA frame buffer graphics. +2.13 (14-May-99)- Minor documentation fixes. diff --git a/Documentation/sx.txt b/Documentation/sx.txt new file mode 100644 index 000000000000..cb4efa0fb5cc --- /dev/null +++ b/Documentation/sx.txt @@ -0,0 +1,294 @@ + + sx.txt -- specialix SX/SI multiport serial driver readme. + + + + Copyright (C) 1997 Roger Wolff (R.E.Wolff@BitWizard.nl) + + Specialix pays for the development and support of this driver. + Please DO contact support@specialix.co.uk if you require + support. + + This driver was developed in the BitWizard linux device + driver service. If you require a linux device driver for your + product, please contact devices@BitWizard.nl for a quote. + + (History) + There used to be an SI driver by Simon Allan. This is a complete + rewrite from scratch. Just a few lines-of-code have been snatched. + + (Sources) + Specialix document number 6210028: SX Host Card and Download Code + Software Functional Specification. + + (Copying) + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of + the License, or (at your option) any later version. + + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR + PURPOSE. See the GNU General Public License for more details. + + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, + USA. + + (Addendum) + I'd appreciate it that if you have fixes, that you send them + to me first. + + +Introduction +============ + +This file contains some random information, that I like to have online +instead of in a manual that can get lost. Ever misplace your Linux +kernel sources? And the manual of one of the boards in your computer? + + +Theory of operation +=================== + +An important thing to know is that the driver itself doesn't have the +firmware for the card. This means that you need the separate package +"sx_firmware". For now you can get the source at + + ftp://ftp.bitwizard.nl/specialix/sx_firmware_<version>.tgz + +The firmware load needs a "misc" device, so you'll need to enable the +"Support for user misc device modules" in your kernel configuration. +The misc device needs to be called "/dev/specialix_sxctl". It needs +misc major 10, and minor number 167 (assigned by HPA). The section +on creating device files below also creates this device. + +After loading the sx.o module into your kernel, the driver will report +the number of cards detected, but because it doesn't have any +firmware, it will not be able to determine the number of ports. Only +when you then run "sx_firmware" will the firmware be downloaded and +the rest of the driver initialized. At that time the sx_firmware +program will report the number of ports installed. + +In contrast with many other multi port serial cards, some of the data +structures are only allocated when the card knows the number of ports +that are connected. This means we won't waste memory for 120 port +descriptor structures when you only have 8 ports. If you experience +problems due to this, please report them: I haven't seen any. + + +Interrupts +========== + +A multi port serial card, would generate a horrendous amount of +interrupts if it would interrupt the CPU for every received +character. Even more than 10 years ago, the trick not to use +interrupts but to poll the serial cards was invented. + +The SX card allow us to do this two ways. First the card limits its +own interrupt rate to a rate that won't overwhelm the CPU. Secondly, +we could forget about the cards interrupt completely and use the +internal timer for this purpose. + +Polling the card can take up to a few percent of your CPU. Using the +interrupts would be better if you have most of the ports idle. Using +timer-based polling is better if your card almost always has work to +do. You save the separate interrupt in that case. + +In any case, it doesn't really matter all that much. + +The most common problem with interrupts is that for ISA cards in a PCI +system the BIOS has to be told to configure that interrupt as "legacy +ISA". Otherwise the card can pull on the interrupt line all it wants +but the CPU won't see this. + +If you can't get the interrupt to work, remember that polling mode is +more efficient (provided you actually use the card intensively). + + +Allowed Configurations +====================== + +Some configurations are disallowed. Even though at a glance they might +seem to work, they are known to lockup the bus between the host card +and the device concentrators. You should respect the drivers decision +not to support certain configurations. It's there for a reason. + +Warning: Seriously technical stuff ahead. Executive summary: Don't use +SX cards except configured at a 64k boundary. Skip the next paragraph. + +The SX cards can theoretically be placed at a 32k boundary. So for +instance you can put an SX card at 0xc8000-0xd7fff. This is not a +"recommended configuration". ISA cards have to tell the bus controller +how they like their timing. Due to timing issues they have to do this +based on which 64k window the address falls into. This means that the +32k window below and above the SX card have to use exactly the same +timing as the SX card. That reportedly works for other SX cards. But +you're still left with two useless 32k windows that should not be used +by anybody else. + + +Configuring the driver +====================== + +PCI cards are always detected. The driver auto-probes for ISA cards at +some sensible addresses. Please report if the auto-probe causes trouble +in your system, or when a card isn't detected. + +I'm afraid I haven't implemented "kernel command line parameters" yet. +This means that if the default doesn't work for you, you shouldn't use +the compiled-into-the-kernel version of the driver. Use a module +instead. If you convince me that you need this, I'll make it for +you. Deal? + +I'm afraid that the module parameters are a bit clumsy. If you have a +better idea, please tell me. + +You can specify several parameters: + + sx_poll: number of jiffies between timer-based polls. + + Set this to "0" to disable timer based polls. + Initialization of cards without a working interrupt + will fail. + + Set this to "1" if you want a polling driver. + (on Intel: 100 polls per second). If you don't use + fast baud rates, you might consider a value like "5". + (If you don't know how to do the math, use 1). + + sx_slowpoll: Number of jiffies between timer-based polls. + Set this to "100" to poll once a second. + This should get the card out of a stall if the driver + ever misses an interrupt. I've never seen this happen, + and if it does, that's a bug. Tell me. + + sx_maxints: Number of interrupts to request from the card. + The card normally limits interrupts to about 100 per + second to offload the host CPU. You can increase this + number to reduce latency on the card a little. + Note that if you give a very high number you can overload + your CPU as well as the CPU on the host card. This setting + is inaccurate and not recommended for SI cards (But it + works). + + sx_irqmask: The mask of allowable IRQs to use. I suggest you set + this to 0 (disable IRQs all together) and use polling if + the assignment of IRQs becomes problematic. This is defined + as the sum of (1 << irq) 's that you want to allow. So + sx_irqmask of 8 (1 << 3) specifies that only irq 3 may + be used by the SX driver. If you want to specify to the + driver: "Either irq 11 or 12 is ok for you to use", then + specify (1 << 11) | (1 << 12) = 0x1800 . + + sx_debug: You can enable different sorts of debug traces with this. + At "-1" all debugging traces are active. You'll get several + times more debugging output than you'll get characters + transmitted. + + +Baud rates +========== + +Theoretically new SXDCs should be capable of more than 460k +baud. However the line drivers usually give up before that. Also the +CPU on the card may not be able to handle 8 channels going at full +blast at that speed. Moreover, the buffers are not large enough to +allow operation with 100 interrupts per second. You'll have to realize +that the card has a 256 byte buffer, so you'll have to increase the +number of interrupts per second if you have more than 256*100 bytes +per second to transmit. If you do any performance testing in this +area, I'd be glad to hear from you... + +(Psst Linux users..... I think the Linux driver is more efficient than +the driver for other OSes. If you can and want to benchmark them +against each other, be my guest, and report your findings...... :-) + + +Ports and devices +================= + +Port 0 is the top connector on the module closest to the host +card. Oh, the ports on the SXDCs and TAs are labelled from 1 to 8 +instead of from 0 to 7, as they are numbered by linux. I'm stubborn in +this: I know for sure that I wouldn't be able to calculate which port +is which anymore if I would change that.... + + +Devices: + +You should make the device files as follows: + +#!/bin/sh +# (I recommend that you cut-and-paste this into a file and run that) +cd /dev +t=0 +mknod specialix_sxctl c 10 167 +while [ $t -lt 64 ] + do + echo -n "$t " + mknod ttyX$t c 32 $t + mknod cux$t c 33 $t + t=`expr $t + 1` +done +echo "" +rm /etc/psdevtab +ps > /dev/null + + +This creates 64 devices. If you have more, increase the constant on +the line with "while". The devices start at 0, as is customary on +Linux. Specialix seems to like starting the numbering at 1. + +If your system doesn't come with these devices pre-installed, bug your +linux-vendor about this. They should have these devices +"pre-installed" before the new millennium. The "ps" stuff at the end +is to "tell" ps that the new devices exist. + +Officially the maximum number of cards per computer is 4. This driver +however supports as many cards in one machine as you want. You'll run +out of interrupts after a few, but you can switch to polled operation +then. At about 256 ports (More than 8 cards), we run out of minor +device numbers. Sorry. I suggest you buy a second computer.... (Or +switch to RIO). + +------------------------------------------------------------------------ + + + Fixed bugs and restrictions: + - Hangup processing. + -- Done. + + - the write path in generic_serial (lockup / oops). + -- Done (Ugly: not the way I want it. Copied from serial.c). + + - write buffer isn't flushed at close. + -- Done. I still seem to lose a few chars at close. + Sorry. I think that this is a firmware issue. (-> Specialix) + + - drain hardware before changing termios + - Change debug on the fly. + - ISA free irq -1. (no firmware loaded). + - adding c8000 as a probe address. Added warning. + - Add a RAMtest for the RAM on the card.c + - Crash when opening a port "way" of the number of allowed ports. + (for example opening port 60 when there are only 24 ports attached) + - Sometimes the use-count strays a bit. After a few hours of + testing the use count is sometimes "3". If you are not like + me and can remember what you did to get it that way, I'd + appreciate an Email. Possibly fixed. Tell me if anyone still + sees this. + - TAs don't work right if you don't connect all the modem control + signals. SXDCs do. T225 firmware problem -> Specialix. + (Mostly fixed now, I think. Tell me if you encounter this!) + + Bugs & restrictions: + + - Arbitrary baud rates. Requires firmware update. (-> Specialix) + + - Low latency (mostly firmware, -> Specialix) + + + diff --git a/Documentation/sysctl/README b/Documentation/sysctl/README new file mode 100644 index 000000000000..8c3306e01d52 --- /dev/null +++ b/Documentation/sysctl/README @@ -0,0 +1,75 @@ +Documentation for /proc/sys/ kernel version 2.2.10 + (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> + +'Why', I hear you ask, 'would anyone even _want_ documentation +for them sysctl files? If anybody really needs it, it's all in +the source...' + +Well, this documentation is written because some people either +don't know they need to tweak something, or because they don't +have the time or knowledge to read the source code. + +Furthermore, the programmers who built sysctl have built it to +be actually used, not just for the fun of programming it :-) + +============================================================== + +Legal blurb: + +As usual, there are two main things to consider: +1. you get what you pay for +2. it's free + +The consequences are that I won't guarantee the correctness of +this document, and if you come to me complaining about how you +screwed up your system because of wrong documentation, I won't +feel sorry for you. I might even laugh at you... + +But of course, if you _do_ manage to screw up your system using +only the sysctl options used in this file, I'd like to hear of +it. Not only to have a great laugh, but also to make sure that +you're the last RTFMing person to screw up. + +In short, e-mail your suggestions, corrections and / or horror +stories to: <riel@nl.linux.org> + +Rik van Riel. + +============================================================== + +Introduction: + +Sysctl is a means of configuring certain aspects of the kernel +at run-time, and the /proc/sys/ directory is there so that you +don't even need special tools to do it! +In fact, there are only four things needed to use these config +facilities: +- a running Linux system +- root access +- common sense (this is especially hard to come by these days) +- knowledge of what all those values mean + +As a quick 'ls /proc/sys' will show, the directory consists of +several (arch-dependent?) subdirs. Each subdir is mainly about +one part of the kernel, so you can do configuration on a piece +by piece basis, or just some 'thematic frobbing'. + +The subdirs are about: +abi/ execution domains & personalities +debug/ <empty> +dev/ device specific information (eg dev/cdrom/info) +fs/ specific filesystems + filehandle, inode, dentry and quota tuning + binfmt_misc <Documentation/binfmt_misc.txt> +kernel/ global kernel info / tuning + miscellaneous stuff +net/ networking stuff, for documentation look in: + <Documentation/networking/> +proc/ <empty> +sunrpc/ SUN Remote Procedure Call (NFS) +vm/ memory management tuning + buffer and cache management + +These are the subdirs I have on my system. There might be more +or other subdirs in another setup. If you see another dir, I'd +really like to hear about it :-) diff --git a/Documentation/sysctl/abi.txt b/Documentation/sysctl/abi.txt new file mode 100644 index 000000000000..63f4ebcf652c --- /dev/null +++ b/Documentation/sysctl/abi.txt @@ -0,0 +1,54 @@ +Documentation for /proc/sys/abi/* kernel version 2.6.0.test2 + (c) 2003, Fabian Frederick <ffrederick@users.sourceforge.net> + +For general info : README. + +============================================================== + +This path is binary emulation relevant aka personality types aka abi. +When a process is executed, it's linked to an exec_domain whose +personality is defined using values available from /proc/sys/abi. +You can find further details about abi in include/linux/personality.h. + +Here are the files featuring in 2.6 kernel : + +- defhandler_coff +- defhandler_elf +- defhandler_lcall7 +- defhandler_libcso +- fake_utsname +- trace + +=========================================================== +defhandler_coff: +defined value : +PER_SCOSVR3 +0x0003 | STICKY_TIMEOUTS | WHOLE_SECONDS | SHORT_INODE + +=========================================================== +defhandler_elf: +defined value : +PER_LINUX +0 + +=========================================================== +defhandler_lcall7: +defined value : +PER_SVR4 +0x0001 | STICKY_TIMEOUTS | MMAP_PAGE_ZERO, + +=========================================================== +defhandler_libsco: +defined value: +PER_SVR4 +0x0001 | STICKY_TIMEOUTS | MMAP_PAGE_ZERO, + +=========================================================== +fake_utsname: +Unused + +=========================================================== +trace: +Unused + +=========================================================== diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt new file mode 100644 index 000000000000..0b62c62142cf --- /dev/null +++ b/Documentation/sysctl/fs.txt @@ -0,0 +1,150 @@ +Documentation for /proc/sys/fs/* kernel version 2.2.10 + (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> + +For general info and legal blurb, please look in README. + +============================================================== + +This file contains documentation for the sysctl files in +/proc/sys/fs/ and is valid for Linux kernel version 2.2. + +The files in this directory can be used to tune and monitor +miscellaneous and general things in the operation of the Linux +kernel. Since some of the files _can_ be used to screw up your +system, it is advisable to read both documentation and source +before actually making adjustments. + +Currently, these files are in /proc/sys/fs: +- dentry-state +- dquot-max +- dquot-nr +- file-max +- file-nr +- inode-max +- inode-nr +- inode-state +- overflowuid +- overflowgid +- super-max +- super-nr + +Documentation for the files in /proc/sys/fs/binfmt_misc is +in Documentation/binfmt_misc.txt. + +============================================================== + +dentry-state: + +From linux/fs/dentry.c: +-------------------------------------------------------------- +struct { + int nr_dentry; + int nr_unused; + int age_limit; /* age in seconds */ + int want_pages; /* pages requested by system */ + int dummy[2]; +} dentry_stat = {0, 0, 45, 0,}; +-------------------------------------------------------------- + +Dentries are dynamically allocated and deallocated, and +nr_dentry seems to be 0 all the time. Hence it's safe to +assume that only nr_unused, age_limit and want_pages are +used. Nr_unused seems to be exactly what its name says. +Age_limit is the age in seconds after which dcache entries +can be reclaimed when memory is short and want_pages is +nonzero when shrink_dcache_pages() has been called and the +dcache isn't pruned yet. + +============================================================== + +dquot-max & dquot-nr: + +The file dquot-max shows the maximum number of cached disk +quota entries. + +The file dquot-nr shows the number of allocated disk quota +entries and the number of free disk quota entries. + +If the number of free cached disk quotas is very low and +you have some awesome number of simultaneous system users, +you might want to raise the limit. + +============================================================== + +file-max & file-nr: + +The kernel allocates file handles dynamically, but as yet it +doesn't free them again. + +The value in file-max denotes the maximum number of file- +handles that the Linux kernel will allocate. When you get lots +of error messages about running out of file handles, you might +want to increase this limit. + +The three values in file-nr denote the number of allocated +file handles, the number of unused file handles and the maximum +number of file handles. When the allocated file handles come +close to the maximum, but the number of unused file handles is +significantly greater than 0, you've encountered a peak in your +usage of file handles and you don't need to increase the maximum. + +============================================================== + +inode-max, inode-nr & inode-state: + +As with file handles, the kernel allocates the inode structures +dynamically, but can't free them yet. + +The value in inode-max denotes the maximum number of inode +handlers. This value should be 3-4 times larger than the value +in file-max, since stdin, stdout and network sockets also +need an inode struct to handle them. When you regularly run +out of inodes, you need to increase this value. + +The file inode-nr contains the first two items from +inode-state, so we'll skip to that file... + +Inode-state contains three actual numbers and four dummies. +The actual numbers are, in order of appearance, nr_inodes, +nr_free_inodes and preshrink. + +Nr_inodes stands for the number of inodes the system has +allocated, this can be slightly more than inode-max because +Linux allocates them one pageful at a time. + +Nr_free_inodes represents the number of free inodes (?) and +preshrink is nonzero when the nr_inodes > inode-max and the +system needs to prune the inode list instead of allocating +more. + +============================================================== + +overflowgid & overflowuid: + +Some filesystems only support 16-bit UIDs and GIDs, although in Linux +UIDs and GIDs are 32 bits. When one of these filesystems is mounted +with writes enabled, any UID or GID that would exceed 65535 is translated +to a fixed value before being written to disk. + +These sysctls allow you to change the value of the fixed UID and GID. +The default is 65534. + +============================================================== + +super-max & super-nr: + +These numbers control the maximum number of superblocks, and +thus the maximum number of mounted filesystems the kernel +can have. You only need to increase super-max if you need to +mount more filesystems than the current value in super-max +allows you to. + +============================================================== + +aio-nr & aio-max-nr: + +aio-nr shows the current system-wide number of asynchronous io +requests. aio-max-nr allows you to change the maximum value +aio-nr can grow to. + +============================================================== diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt new file mode 100644 index 000000000000..35159176997b --- /dev/null +++ b/Documentation/sysctl/kernel.txt @@ -0,0 +1,314 @@ +Documentation for /proc/sys/kernel/* kernel version 2.2.10 + (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> + +For general info and legal blurb, please look in README. + +============================================================== + +This file contains documentation for the sysctl files in +/proc/sys/kernel/ and is valid for Linux kernel version 2.2. + +The files in this directory can be used to tune and monitor +miscellaneous and general things in the operation of the Linux +kernel. Since some of the files _can_ be used to screw up your +system, it is advisable to read both documentation and source +before actually making adjustments. + +Currently, these files might (depending on your configuration) +show up in /proc/sys/kernel: +- acct +- core_pattern +- core_uses_pid +- ctrl-alt-del +- dentry-state +- domainname +- hostname +- hotplug +- java-appletviewer [ binfmt_java, obsolete ] +- java-interpreter [ binfmt_java, obsolete ] +- l2cr [ PPC only ] +- modprobe ==> Documentation/kmod.txt +- msgmax +- msgmnb +- msgmni +- osrelease +- ostype +- overflowgid +- overflowuid +- panic +- pid_max +- powersave-nap [ PPC only ] +- printk +- real-root-dev ==> Documentation/initrd.txt +- reboot-cmd [ SPARC only ] +- rtsig-max +- rtsig-nr +- sem +- sg-big-buff [ generic SCSI device (sg) ] +- shmall +- shmmax [ sysv ipc ] +- shmmni +- stop-a [ SPARC only ] +- sysrq ==> Documentation/sysrq.txt +- tainted +- threads-max +- version + +============================================================== + +acct: + +highwater lowwater frequency + +If BSD-style process accounting is enabled these values control +its behaviour. If free space on filesystem where the log lives +goes below <lowwater>% accounting suspends. If free space gets +above <highwater>% accounting resumes. <Frequency> determines +how often do we check the amount of free space (value is in +seconds). Default: +4 2 30 +That is, suspend accounting if there left <= 2% free; resume it +if we got >=4%; consider information about amount of free space +valid for 30 seconds. + +============================================================== + +core_pattern: + +core_pattern is used to specify a core dumpfile pattern name. +. max length 64 characters; default value is "core" +. core_pattern is used as a pattern template for the output filename; + certain string patterns (beginning with '%') are substituted with + their actual values. +. backward compatibility with core_uses_pid: + If core_pattern does not include "%p" (default does not) + and core_uses_pid is set, then .PID will be appended to + the filename. +. corename format specifiers: + %<NUL> '%' is dropped + %% output one '%' + %p pid + %u uid + %g gid + %s signal number + %t UNIX time of dump + %h hostname + %e executable filename + %<OTHER> both are dropped + +============================================================== + +core_uses_pid: + +The default coredump filename is "core". By setting +core_uses_pid to 1, the coredump filename becomes core.PID. +If core_pattern does not include "%p" (default does not) +and core_uses_pid is set, then .PID will be appended to +the filename. + +============================================================== + +ctrl-alt-del: + +When the value in this file is 0, ctrl-alt-del is trapped and +sent to the init(1) program to handle a graceful restart. +When, however, the value is > 0, Linux's reaction to a Vulcan +Nerve Pinch (tm) will be an immediate reboot, without even +syncing its dirty buffers. + +Note: when a program (like dosemu) has the keyboard in 'raw' +mode, the ctrl-alt-del is intercepted by the program before it +ever reaches the kernel tty layer, and it's up to the program +to decide what to do with it. + +============================================================== + +domainname & hostname: + +These files can be used to set the NIS/YP domainname and the +hostname of your box in exactly the same way as the commands +domainname and hostname, i.e.: +# echo "darkstar" > /proc/sys/kernel/hostname +# echo "mydomain" > /proc/sys/kernel/domainname +has the same effect as +# hostname "darkstar" +# domainname "mydomain" + +Note, however, that the classic darkstar.frop.org has the +hostname "darkstar" and DNS (Internet Domain Name Server) +domainname "frop.org", not to be confused with the NIS (Network +Information Service) or YP (Yellow Pages) domainname. These two +domain names are in general different. For a detailed discussion +see the hostname(1) man page. + +============================================================== + +hotplug: + +Path for the hotplug policy agent. +Default value is "/sbin/hotplug". + +============================================================== + +l2cr: (PPC only) + +This flag controls the L2 cache of G3 processor boards. If +0, the cache is disabled. Enabled if nonzero. + +============================================================== + +osrelease, ostype & version: + +# cat osrelease +2.1.88 +# cat ostype +Linux +# cat version +#5 Wed Feb 25 21:49:24 MET 1998 + +The files osrelease and ostype should be clear enough. Version +needs a little more clarification however. The '#5' means that +this is the fifth kernel built from this source base and the +date behind it indicates the time the kernel was built. +The only way to tune these values is to rebuild the kernel :-) + +============================================================== + +overflowgid & overflowuid: + +if your architecture did not always support 32-bit UIDs (i.e. arm, i386, +m68k, sh, and sparc32), a fixed UID and GID will be returned to +applications that use the old 16-bit UID/GID system calls, if the actual +UID or GID would exceed 65535. + +These sysctls allow you to change the value of the fixed UID and GID. +The default is 65534. + +============================================================== + +panic: + +The value in this file represents the number of seconds the +kernel waits before rebooting on a panic. When you use the +software watchdog, the recommended setting is 60. + +============================================================== + +panic_on_oops: + +Controls the kernel's behaviour when an oops or BUG is encountered. + +0: try to continue operation + +1: delay a few seconds (to give klogd time to record the oops output) and + then panic. If the `panic' sysctl is also non-zero then the machine will + be rebooted. + +============================================================== + +pid_max: + +PID allocation wrap value. When the kenrel's next PID value +reaches this value, it wraps back to a minimum PID value. +PIDs of value pid_max or larger are not allocated. + +============================================================== + +powersave-nap: (PPC only) + +If set, Linux-PPC will use the 'nap' mode of powersaving, +otherwise the 'doze' mode will be used. + +============================================================== + +printk: + +The four values in printk denote: console_loglevel, +default_message_loglevel, minimum_console_loglevel and +default_console_loglevel respectively. + +These values influence printk() behavior when printing or +logging error messages. See 'man 2 syslog' for more info on +the different loglevels. + +- console_loglevel: messages with a higher priority than + this will be printed to the console +- default_message_level: messages without an explicit priority + will be printed with this priority +- minimum_console_loglevel: minimum (highest) value to which + console_loglevel can be set +- default_console_loglevel: default value for console_loglevel + +============================================================== + +printk_ratelimit: + +Some warning messages are rate limited. printk_ratelimit specifies +the minimum length of time between these messages (in jiffies), by +default we allow one every 5 seconds. + +A value of 0 will disable rate limiting. + +============================================================== + +printk_ratelimit_burst: + +While long term we enforce one message per printk_ratelimit +seconds, we do allow a burst of messages to pass through. +printk_ratelimit_burst specifies the number of messages we can +send before ratelimiting kicks in. + +============================================================== + +reboot-cmd: (Sparc only) + +??? This seems to be a way to give an argument to the Sparc +ROM/Flash boot loader. Maybe to tell it what to do after +rebooting. ??? + +============================================================== + +rtsig-max & rtsig-nr: + +The file rtsig-max can be used to tune the maximum number +of POSIX realtime (queued) signals that can be outstanding +in the system. + +rtsig-nr shows the number of RT signals currently queued. + +============================================================== + +sg-big-buff: + +This file shows the size of the generic SCSI (sg) buffer. +You can't tune it just yet, but you could change it on +compile time by editing include/scsi/sg.h and changing +the value of SG_BIG_BUFF. + +There shouldn't be any reason to change this value. If +you can come up with one, you probably know what you +are doing anyway :) + +============================================================== + +shmmax: + +This value can be used to query and set the run time limit +on the maximum shared memory segment size that can be created. +Shared memory segments up to 1Gb are now supported in the +kernel. This value defaults to SHMMAX. + +============================================================== + +tainted: + +Non-zero if the kernel has been tainted. Numeric values, which +can be ORed together: + + 1 - A module with a non-GPL license has been loaded, this + includes modules with no license. + Set by modutils >= 2.4.9 and module-init-tools. + 2 - A module was force loaded by insmod -f. + Set by modutils >= 2.4.9 and module-init-tools. + 4 - Unsafe SMP processors: SMP with CPUs not designed for SMP. + diff --git a/Documentation/sysctl/sunrpc.txt b/Documentation/sysctl/sunrpc.txt new file mode 100644 index 000000000000..ae1ecac6f85a --- /dev/null +++ b/Documentation/sysctl/sunrpc.txt @@ -0,0 +1,20 @@ +Documentation for /proc/sys/sunrpc/* kernel version 2.2.10 + (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> + +For general info and legal blurb, please look in README. + +============================================================== + +This file contains the documentation for the sysctl files in +/proc/sys/sunrpc and is valid for Linux kernel version 2.2. + +The files in this directory can be used to (re)set the debug +flags of the SUN Remote Procedure Call (RPC) subsystem in +the Linux kernel. This stuff is used for NFS, KNFSD and +maybe a few other things as well. + +The files in there are used to control the debugging flags: +rpc_debug, nfs_debug, nfsd_debug and nlm_debug. + +These flags are for kernel hackers only. You should read the +source code in net/sunrpc/ for more information. diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt new file mode 100644 index 000000000000..2f1aae32a5d9 --- /dev/null +++ b/Documentation/sysctl/vm.txt @@ -0,0 +1,104 @@ +Documentation for /proc/sys/vm/* kernel version 2.2.10 + (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> + +For general info and legal blurb, please look in README. + +============================================================== + +This file contains the documentation for the sysctl files in +/proc/sys/vm and is valid for Linux kernel version 2.2. + +The files in this directory can be used to tune the operation +of the virtual memory (VM) subsystem of the Linux kernel and +the writeout of dirty data to disk. + +Default values and initialization routines for most of these +files can be found in mm/swap.c. + +Currently, these files are in /proc/sys/vm: +- overcommit_memory +- page-cluster +- dirty_ratio +- dirty_background_ratio +- dirty_expire_centisecs +- dirty_writeback_centisecs +- max_map_count +- min_free_kbytes +- laptop_mode +- block_dump + +============================================================== + +dirty_ratio, dirty_background_ratio, dirty_expire_centisecs, +dirty_writeback_centisecs, vfs_cache_pressure, laptop_mode, +block_dump, swap_token_timeout: + +See Documentation/filesystems/proc.txt + +============================================================== + +overcommit_memory: + +This value contains a flag that enables memory overcommitment. + +When this flag is 0, the kernel attempts to estimate the amount +of free memory left when userspace requests more memory. + +When this flag is 1, the kernel pretends there is always enough +memory until it actually runs out. + +When this flag is 2, the kernel uses a "never overcommit" +policy that attempts to prevent any overcommit of memory. + +This feature can be very useful because there are a lot of +programs that malloc() huge amounts of memory "just-in-case" +and don't use much of it. + +The default value is 0. + +See Documentation/vm/overcommit-accounting and +security/commoncap.c::cap_vm_enough_memory() for more information. + +============================================================== + +overcommit_ratio: + +When overcommit_memory is set to 2, the committed address +space is not permitted to exceed swap plus this percentage +of physical RAM. See above. + +============================================================== + +page-cluster: + +The Linux VM subsystem avoids excessive disk seeks by reading +multiple pages on a page fault. The number of pages it reads +is dependent on the amount of memory in your machine. + +The number of pages the kernel reads in at once is equal to +2 ^ page-cluster. Values above 2 ^ 5 don't make much sense +for swap because we only cluster swap data in 32-page groups. + +============================================================== + +max_map_count: + +This file contains the maximum number of memory map areas a process +may have. Memory map areas are used as a side-effect of calling +malloc, directly by mmap and mprotect, and also when loading shared +libraries. + +While most applications need less than a thousand maps, certain +programs, particularly malloc debuggers, may consume lots of them, +e.g., up to one or two maps per allocation. + +The default value is 65536. + +============================================================== + +min_free_kbytes: + +This is used to force the Linux VM to keep a minimum number +of kilobytes free. The VM uses this number to compute a pages_min +value for each lowmem zone in the system. Each lowmem zone gets +a number of reserved free pages based proportionally on its size. diff --git a/Documentation/sysrq.txt b/Documentation/sysrq.txt new file mode 100644 index 000000000000..f98c2e31c143 --- /dev/null +++ b/Documentation/sysrq.txt @@ -0,0 +1,213 @@ +Linux Magic System Request Key Hacks +Documentation for sysrq.c version 1.15 +Last update: $Date: 2001/01/28 10:15:59 $ + +* What is the magic SysRq key? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +It is a 'magical' key combo you can hit which the kernel will respond to +regardless of whatever else it is doing, unless it is completely locked up. + +* How do I enable the magic SysRq key? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +You need to say "yes" to 'Magic SysRq key (CONFIG_MAGIC_SYSRQ)' when +configuring the kernel. When running a kernel with SysRq compiled in, +/proc/sys/kernel/sysrq controls the functions allowed to be invoked via +the SysRq key. By default the file contains 1 which means that every +possible SysRq request is allowed (in older versions SysRq was disabled +by default, and you were required to specifically enable it at run-time +but this is not the case any more). Here is the list of possible values +in /proc/sys/kernel/sysrq: + 0 - disable sysrq completely + 1 - enable all functions of sysrq + >1 - bitmask of allowed sysrq functions (see below for detailed function + description): + 2 - enable control of console logging level + 4 - enable control of keyboard (SAK, unraw) + 8 - enable debugging dumps of processes etc. + 16 - enable sync command + 32 - enable remount read-only + 64 - enable signalling of processes (term, kill, oom-kill) + 128 - allow reboot/poweroff + 256 - allow nicing of all RT tasks + +You can set the value in the file by the following command: + echo "number" >/proc/sys/kernel/sysrq + +Note that the value of /proc/sys/kernel/sysrq influences only the invocation +via a keyboard. Invocation of any operation via /proc/sysrq-trigger is always +allowed. + +* How do I use the magic SysRq key? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +On x86 - You press the key combo 'ALT-SysRq-<command key>'. Note - Some + keyboards may not have a key labeled 'SysRq'. The 'SysRq' key is + also known as the 'Print Screen' key. Also some keyboards cannot + handle so many keys being pressed at the same time, so you might + have better luck with "press Alt", "press SysRq", "release Alt", + "press <command key>", release everything. + +On SPARC - You press 'ALT-STOP-<command key>', I believe. + +On the serial console (PC style standard serial ports only) - + You send a BREAK, then within 5 seconds a command key. Sending + BREAK twice is interpreted as a normal BREAK. + +On PowerPC - Press 'ALT - Print Screen (or F13) - <command key>, + Print Screen (or F13) - <command key> may suffice. + +On other - If you know of the key combos for other architectures, please + let me know so I can add them to this section. + +On all - write a character to /proc/sysrq-trigger. eg: + + echo t > /proc/sysrq-trigger + +* What are the 'command' keys? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +'r' - Turns off keyboard raw mode and sets it to XLATE. + +'k' - Secure Access Key (SAK) Kills all programs on the current virtual + console. NOTE: See important comments below in SAK section. + +'b' - Will immediately reboot the system without syncing or unmounting + your disks. + +'o' - Will shut your system off (if configured and supported). + +'s' - Will attempt to sync all mounted filesystems. + +'u' - Will attempt to remount all mounted filesystems read-only. + +'p' - Will dump the current registers and flags to your console. + +'t' - Will dump a list of current tasks and their information to your + console. + +'m' - Will dump current memory info to your console. + +'v' - Dumps Voyager SMP processor info to your console. + +'0'-'9' - Sets the console log level, controlling which kernel messages + will be printed to your console. ('0', for example would make + it so that only emergency messages like PANICs or OOPSes would + make it to your console.) + +'f' - Will call oom_kill to kill a memory hog process + +'e' - Send a SIGTERM to all processes, except for init. + +'i' - Send a SIGKILL to all processes, except for init. + +'l' - Send a SIGKILL to all processes, INCLUDING init. (Your system + will be non-functional after this.) + +'h' - Will display help ( actually any other key than those listed + above will display help. but 'h' is easy to remember :-) + +* Okay, so what can I use them for? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Well, un'R'aw is very handy when your X server or a svgalib program crashes. + +sa'K' (Secure Access Key) is useful when you want to be sure there are no +trojan program is running at console and which could grab your password +when you would try to login. It will kill all programs on given console +and thus letting you make sure that the login prompt you see is actually +the one from init, not some trojan program. +IMPORTANT:In its true form it is not a true SAK like the one in :IMPORTANT +IMPORTANT:c2 compliant systems, and it should be mistook as such. :IMPORTANT + It seems other find it useful as (System Attention Key) which is +useful when you want to exit a program that will not let you switch consoles. +(For example, X or a svgalib program.) + +re'B'oot is good when you're unable to shut down. But you should also 'S'ync +and 'U'mount first. + +'S'ync is great when your system is locked up, it allows you to sync your +disks and will certainly lessen the chance of data loss and fscking. Note +that the sync hasn't taken place until you see the "OK" and "Done" appear +on the screen. (If the kernel is really in strife, you may not ever get the +OK or Done message...) + +'U'mount is basically useful in the same ways as 'S'ync. I generally 'S'ync, +'U'mount, then re'B'oot when my system locks. It's saved me many a fsck. +Again, the unmount (remount read-only) hasn't taken place until you see the +"OK" and "Done" message appear on the screen. + +The loglevel'0'-'9' is useful when your console is being flooded with +kernel messages you do not want to see. Setting '0' will prevent all but +the most urgent kernel messages from reaching your console. (They will +still be logged if syslogd/klogd are alive, though.) + +t'E'rm and k'I'll are useful if you have some sort of runaway process you +are unable to kill any other way, especially if it's spawning other +processes. + +* Sometimes SysRq seems to get 'stuck' after using it, what can I do? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +That happens to me, also. I've found that tapping shift, alt, and control +on both sides of the keyboard, and hitting an invalid sysrq sequence again +will fix the problem. (ie, something like alt-sysrq-z). Switching to another +virtual console (ALT+Fn) and then back again should also help. + +* I hit SysRq, but nothing seems to happen, what's wrong? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +There are some keyboards that send different scancodes for SysRq than the +pre-defined 0x54. So if SysRq doesn't work out of the box for a certain +keyboard, run 'showkey -s' to find out the proper scancode sequence. Then +use 'setkeycodes <sequence> 84' to define this sequence to the usual SysRq +code (84 is decimal for 0x54). It's probably best to put this command in a +boot script. Oh, and by the way, you exit 'showkey' by not typing anything +for ten seconds. + +* I want to add SysRQ key events to a module, how does it work? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In order to register a basic function with the table, you must first include +the header 'include/linux/sysrq.h', this will define everything else you need. +Next, you must create a sysrq_key_op struct, and populate it with A) the key +handler function you will use, B) a help_msg string, that will print when SysRQ +prints help, and C) an action_msg string, that will print right before your +handler is called. Your handler must conform to the protoype in 'sysrq.h'. + +After the sysrq_key_op is created, you can call the macro +register_sysrq_key(int key, struct sysrq_key_op *op_p) that is defined in +sysrq.h, this will register the operation pointed to by 'op_p' at table +key 'key', if that slot in the table is blank. At module unload time, you must +call the macro unregister_sysrq_key(int key, struct sysrq_key_op *op_p), which +will remove the key op pointed to by 'op_p' from the key 'key', if and only if +it is currently registered in that slot. This is in case the slot has been +overwritten since you registered it. + +The Magic SysRQ system works by registering key operations against a key op +lookup table, which is defined in 'drivers/char/sysrq.c'. This key table has +a number of operations registered into it at compile time, but is mutable, +and 4 functions are exported for interface to it: __sysrq_lock_table, +__sysrq_unlock_table, __sysrq_get_key_op, and __sysrq_put_key_op. The +functions __sysrq_swap_key_ops and __sysrq_swap_key_ops_nolock are defined +in the header itself, and the REGISTER and UNREGISTER macros are built from +these. More complex (and dangerous!) manipulations of the table are possible +using these functions, but you must be careful to always lock the table before +you read or write from it, and to unlock it again when you are done. (And of +course, to never ever leave an invalid pointer in the table). Null pointers in +the table are always safe :) + +If for some reason you feel the need to call the handle_sysrq function from +within a function called by handle_sysrq, you must be aware that you are in +a lock (you are also in an interrupt handler, which means don't sleep!), so +you must call __handle_sysrq_nolock instead. + +* I have more questions, who can I ask? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +You may feel free to send email to myrdraal@deathsdoor.com, and I will +respond as soon as possible. + -Myrdraal + +And I'll answer any questions about the registration system you got, also +responding as soon as possible. + -Crutcher + +* Credits +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Written by Mydraal <myrdraal@deathsdoor.com> +Updated by Adam Sulmicki <adam@cfar.umd.edu> +Updated by Jeremy M. Dolan <jmd@turbogeek.org> 2001/01/28 10:15:59 +Added to by Crutcher Dunnavant <crutcher+kernel@datastacks.com> diff --git a/Documentation/telephony/ixj.txt b/Documentation/telephony/ixj.txt new file mode 100644 index 000000000000..621024fd3a18 --- /dev/null +++ b/Documentation/telephony/ixj.txt @@ -0,0 +1,406 @@ +Linux Quicknet-Drivers-Howto +Quicknet Technologies, Inc. (www.quicknet.net) +Version 0.3.4 December 18, 1999 + +1.0 Introduction + +This document describes the first GPL release version of the Linux +driver for the Quicknet Internet PhoneJACK and Internet LineJACK +cards. More information about these cards is available at +www.quicknet.net. The driver version discussed in this document is +0.3.4. + +These cards offer nice telco style interfaces to use your standard +telephone/key system/PBX as the user interface for VoIP applications. +The Internet LineJACK also offers PSTN connectivity for a single line +Internet to PSTN gateway. Of course, you can add more than one card +to a system to obtain multi-line functionality. At this time, the +driver supports the POTS port on both the Internet PhoneJACK and the +Internet LineJACK, but the PSTN port on the latter card is not yet +supported. + +This document, and the drivers for the cards, are intended for a +limited audience that includes technically capable programmers who +would like to experiment with Quicknet cards. The drivers are +considered in ALPHA status and are not yet considered stable enough +for general, widespread use in an unlimited audience. + +That's worth saying again: + +THE LINUX DRIVERS FOR QUICKNET CARDS ARE PRESENTLY IN A ALPHA STATE +AND SHOULD NOT BE CONSIDERED AS READY FOR NORMAL WIDESPREAD USE. + +They are released early in the spirit of Internet development and to +make this technology available to innovators who would benefit from +early exposure. + +When we promote the device driver to "beta" level it will be +considered ready for non-programmer, non-technical users. Until then, +please be aware that these drivers may not be stable and may affect +the performance of your system. + + +1.1 Latest Additions/Improvements + +The 0.3.4 version of the driver is the first GPL release. Several +features had to be removed from the prior binary only module, mostly +for reasons of Intellectual Property rights. We can't release +information that is not ours - so certain aspects of the driver had to +be removed to protect the rights of others. + +Specifically, very old Internet PhoneJACK cards have non-standard +G.723.1 codecs (due to the early nature of the DSPs in those days). +The auto-conversion code to bring those cards into compliance with +todays standards is available as a binary only module to those people +needing it. If you bought your card after 1997 or so, you are OK - +it's only the very old cards that are affected. + +Also, the code to download G.728/G.729/G.729a codecs to the DSP is +available as a binary only module as well. This IP is not ours to +release. + +Hooks are built into the GPL driver to allow it to work with other +companion modules that are completely separate from this module. + +1.2 Copyright, Trademarks, Disclaimer, & Credits + +Copyright + +Copyright (c) 1999 Quicknet Technologies, Inc. Permission is granted +to freely copy and distribute this document provided you preserve it +in its original form. For corrections and minor changes contact the +maintainer at linux@quicknet.net. + +Trademarks + +Internet PhoneJACK and Internet LineJACK are registered trademarks of +Quicknet Technologies, Inc. + +Disclaimer + +Much of the info in this HOWTO is early information released by +Quicknet Technologies, Inc. for the express purpose of allowing early +testing and use of the Linux drivers developed for their products. +While every attempt has been made to be thorough, complete and +accurate, the information contained here may be unreliable and there +are likely a number of errors in this document. Please let the +maintainer know about them. Since this is free documentation, it +should be obvious that neither I nor previous authors can be held +legally responsible for any errors. + +Credits + +This HOWTO was written by: + + Greg Herlein <gherlein@quicknet.net> + Ed Okerson <eokerson@quicknet.net> + +1.3 Future Plans: You Can Help + +Please let the maintainer know of any errors in facts, opinions, +logic, spelling, grammar, clarity, links, etc. But first, if the date +is over a month old, check to see that you have the latest +version. Please send any info that you think belongs in this document. + +You can also contribute code and/or bug-fixes for the sample +applications. + + +1.4 Where to get things + +You can download the latest versions of the driver from: + +http://www.quicknet.net/develop.htm + +You can download the latest version of this document from: + +http://www.quicknet.net/develop.htm + + +1.5 Mailing List + +Quicknet operates a mailing list to provide a public forum on using +these drivers. + +To subscribe to the linux-sdk mailing list, send an email to: + + majordomo@linux.quicknet.net + +In the body of the email, type: + + subscribe linux-sdk <your-email-address> + +Please delete any signature block that you would normally add to the +bottom of your email - it tends to confuse majordomo. + +To send mail to the list, address your mail to + + linux-sdk@linux.quicknet.net + +Your message will go out to everyone on the list. + +To unsubscribe to the linux-sdk mailing list, send an email to: + + majordomo@linux.quicknet.net + +In the body of the email, type: + + unsubscribe linux-sdk <your-email-address> + + + +2.0 Requirements + +2.1 Quicknet Card(s) + +You will need at least one Internet PhoneJACK or Internet LineJACK +cards. These are ISA or PCI bus devices that use Plug-n-Play for +configuration, and use no IRQs. The driver will support up to 16 +cards in any one system, of any mix between the two types. + +Note that you will need two cards to do any useful testing alone, since +you will need a card on both ends of the connection. Of course, if +you are doing collaborative work, perhaps your friends or coworkers +have cards too. If not, we'll gladly sell them some! + + +2.2 ISAPNP + +Since the Quicknet cards are Plug-n-Play devices, you will need the +isapnp tools package to configure the cards, or you can use the isapnp +module to autoconfigure them. The former package probably came with +your Linux distribution. Documentation on this package is available +online at: + +http://mailer.wiwi.uni-marburg.de/linux/LDP/HOWTO/Plug-and-Play-HOWTO.html + +The isapnp autoconfiguration is available on the Quicknet website at: + + http://www.quicknet.net/develop.htm + +though it may be in the kernel by the time you read this. + + +3.0 Card Configuration + +If you did not get your drivers as part of the linux kernel, do the +following to install them: + + a. untar the distribution file. We use the following command: + tar -xvzf ixj-0.x.x.tgz + +This creates a subdirectory holding all the necessary files. Go to that +subdirectory. + + b. run the "ixj_dev_create" script to remove any stray device +files left in the /dev directory, and to create the new officially +designated device files. Note that the old devices were called +/dev/ixj, and the new method uses /dev/phone. + + c. type "make;make install" - this will compile and install the +module. + + d. type "depmod -av" to rebuild all your kernel version dependencies. + + e. if you are using the isapnp module to configure the cards + automatically, then skip to step f. Otherwise, ensure that you + have run the isapnp configuration utility to properly configure + the cards. + + e1. The Internet PhoneJACK has one configuration register that + requires 16 IO ports. The Internet LineJACK card has two + configuration registers and isapnp reports that IO 0 + requires 16 IO ports and IO 1 requires 8. The Quicknet + driver assumes that these registers are configured to be + contiguous, i.e. if IO 0 is set to 0x340 then IO 1 should + be set to 0x350. + + Make sure that none of the cards overlap if you have + multiple cards in the system. + + If you are new to the isapnp tools, you can jumpstart + yourself by doing the following: + + e2. go to the /etc directory and run pnpdump to get a blank + isapnp.conf file. + + pnpdump > /etc/isapnp.conf + + e3. edit the /etc/isapnp.conf file to set the IO warnings and + the register IO addresses. The IO warnings means that you + should find the line in the file that looks like this: + + (CONFLICT (IO FATAL)(IRQ FATAL)(DMA FATAL)(MEM FATAL)) # or WARNING + + and you should edit the line to look like this: + + (CONFLICT (IO WARNING)(IRQ FATAL)(DMA FATAL)(MEM FATAL)) # + or WARNING + + The next step is to set the IO port addresses. The issue + here is that isapnp does not identify all of the ports out + there. Specifically any device that does not have a driver + or module loaded by Linux will not be registered. This + includes older sound cards and network cards. We have + found that the IO port 0x300 is often used even though + isapnp claims that no-one is using those ports. We + recommend that for a single card installation that port + 0x340 (and 0x350) be used. The IO port line should change + from this: + + (IO 0 (SIZE 16) (BASE 0x0300) (CHECK)) + + to this: + + (IO 0 (SIZE 16) (BASE 0x0340) ) + + e4. if you have multiple Quicknet cards, make sure that you do + not have any overlaps. Be especially careful if you are + mixing Internet PhoneJACK and Internet LineJACK cards in + the same system. In these cases we recommend moving the + IO port addresses to the 0x400 block. Please note that on + a few machines the 0x400 series are used. Feel free to + experiment with other addresses. Our cards have been + proven to work using IO addresses of up to 0xFF0. + + e5. the last step is to uncomment the activation line so the + drivers will be associated with the port. This means the + line (immediately below) the IO line should go from this: + + # (ACT Y) + + to this: + + (ACT Y) + + Once you have finished editing the isapnp.conf file you + must submit it into the pnp driverconfigure the cards. + This is done using the following command: + + isapnp isapnp.conf + + If this works you should see a line that identifies the + Quicknet device, the IO port(s) chosen, and a message + "Enabled OK". + + f. if you are loading the module by hand, use insmod. An example +of this would look like this: + + insmod phonedev + insmod ixj dspio=0x320,0x310 xio=0,0x330 + +Then verify the module loaded by running lsmod. If you are not using a +module that matches your kernel version, you may need to "force" the +load using the -f option in the insmod command. + + insmod phonedev + insmod -f ixj dspio=0x320,0x310 xio=0,0x330 + + +If you are using isapnp to autoconfigure your card, then you do NOT +need any of the above, though you need to use depmod to load the +driver, like this: + + depmod ixj + +which will result in the needed drivers getting loaded automatically. + + g. if you are planning on using kerneld to automatically load the +module for you, then you need to edit /etc/conf.modules and add the +following lines: + + options ixj dspio=0x340 xio=0x330 ixjdebug=0 + +If you do this, then when you execute an application that uses the +module kerneld will load the module for you. Note that to do this, +you need to have your kernel set to support kerneld. You can check +for this by looking at /usr/src/linux/.config and you should see this: + + # Loadable module support + # + <snip> + CONFIG_KMOD=y + + h. if you want non-root users to be able to read and write to the +ixj devices (this is a good idea!) you should do the following: + + - decide upon a group name to use and create that group if + needed. Add the user names to that group that you wish to + have access to the device. For example, we typically will + create a group named "ixj" in /etc/group and add all users + to that group that we want to run software that can use the + ixjX devices. + + - change the permissions on the device files, like this: + + chgrp ixj /dev/ixj* + chmod 660 /dev/ixj* + +Once this is done, then non-root users should be able to use the +devices. If you have enabled autoloading of modules, then the user +should be able to open the device and have the module loaded +automatically for them. + + +4.0 Driver Installation problems. + +We have tested these drivers on the 2.2.9, 2.2.10, 2.2.12, and 2.2.13 kernels +and in all cases have eventually been able to get the drivers to load and +run. We have found four types of problems that prevent this from happening. +The problems and solutions are: + + a. A step was missed in the installation. Go back and use section 3 +as a checklist. Many people miss running the ixj_dev_create script and thus +never load the device names into the filesystem. + + b. The kernel is inconsistently linked. We have found this problem in +the Out Of the Box installation of several distributions. The symptoms +are that neither driver will load, and that the unknown symbols include "jiffy" +and "kmalloc". The solution is to recompile both the kernel and the +modules. The command string for the final compile looks like this: + + In the kernel directory: + 1. cp .config /tmp + 2. make mrproper + 3. cp /tmp/.config . + 4. make clean;make bzImage;make modules;make modules_install + +This rebuilds both the kernel and all the modules and makes sure they all +have the same linkages. This generally solves the problem once the new +kernel is installed and the system rebooted. + + c. The kernel has been patched, then unpatched. This happens when +someone decides to use an earlier kernel after they load a later kernel. +The symptoms are proceeding through all three above steps and still not +being able to load the driver. What has happened is that the generated +header files are out of sync with the kernel itself. The solution is +to recompile (again) using "make mrproper". This will remove and then +regenerate all the necessary header files. Once this is done, then you +need to install and reboot the kernel. We have not seen any problem +loading one of our drivers after this treatment. + +5.0 Known Limitations + +We cannot currently play "dial-tone" and listen for DTMF digits at the +same time using the ISA PhoneJACK. This is a bug in the 8020 DSP chip +used on that product. All other Quicknet products function normally +in this regard. We have a work-around, but it's not done yet. Until +then, if you want dial-tone, you can always play a recorded dial-tone +sound into the audio until you have gathered the DTMF digits. + + + + + + + + + + + + + + + + + diff --git a/Documentation/time_interpolators.txt b/Documentation/time_interpolators.txt new file mode 100644 index 000000000000..e3b60854fbc2 --- /dev/null +++ b/Documentation/time_interpolators.txt @@ -0,0 +1,41 @@ +Time Interpolators +------------------ + +Time interpolators are a base of time calculation between timer ticks and +allow an accurate determination of time down to the accuracy of the time +source in nanoseconds. + +The architecture specific code typically provides gettimeofday and +settimeofday under Linux. The time interpolator provides both if an arch +defines CONFIG_TIME_INTERPOLATION. The arch still must set up timer tick +operations and call the necessary functions to advance the clock. + +With the time interpolator a standardized interface exists for time +interpolation between ticks. The provided logic is highly scalable +and has been tested in SMP situations of up to 512 CPUs. + +If CONFIG_TIME_INTERPOLATION is defined then the architecture specific code +(or the device drivers - like HPET) may register time interpolators. +These are typically defined in the following way: + +static struct time_interpolator my_interpolator { + .frequency = MY_FREQUENCY, + .source = TIME_SOURCE_MMIO32, + .shift = 8, /* scaling for higher accuracy */ + .drift = -1, /* Unknown drift */ + .jitter = 0 /* time source is stable */ +}; + +void time_init(void) +{ + .... + /* Initialization of the timer *. + my_interpolator.address = &my_timer; + register_time_interpolator(&my_interpolator); + .... +} + +For more details see include/linux/timex.h and kernel/timer.c. + +Christoph Lameter <christoph@lameter.com>, October 31, 2004 + diff --git a/Documentation/tipar.txt b/Documentation/tipar.txt new file mode 100644 index 000000000000..67133baef6ef --- /dev/null +++ b/Documentation/tipar.txt @@ -0,0 +1,93 @@ + + Parallel link cable for Texas Instruments handhelds + =================================================== + + +Author: Romain Lievin +Homepage: http://lpg.ticalc.org/prj_tidev/index.html + + +INTRODUCTION: + +This is a driver for the very common home-made parallel link cable, a cable +designed for connecting TI8x/9x graphing calculators (handhelds) to a computer +or workstation (Alpha, Sparc). Given that driver is built on parport, the +parallel port abstraction layer, this driver is architecture-independent. + +It can also be used with another device plugged on the same port (such as a +ZIP drive). I have a 100MB ZIP and both of them work fine! + +If you need more information, please visit the 'TI drivers' homepage at the URL +above. + +WHAT YOU NEED: + +A TI calculator and a program capable of communicating with your calculator. + +TiLP will work for sure (since I am its developer!). yal92 may be able to use +it by changing tidev for tipar (may require some hacking...). + +HOW TO USE IT: + +You must have first compiled parport support (CONFIG_PARPORT_DEV): either +compiled in your kernel, either as a module. + +Next, (as root): + + modprobe parport + modprobe tipar + +If it is not already there (it usually is), create the device: + + mknod /dev/tipar0 c 115 0 + mknod /dev/tipar1 c 115 1 + mknod /dev/tipar2 c 115 2 + +You will have to set permissions on this device to allow you to read/write +from it: + + chmod 666 /dev/tipar[0..2] + +Now you are ready to run a linking program such as TiLP. Be sure to configure +it properly (RTFM). + +MODULE PARAMETERS: + + You can set these with: modprobe tipar NAME=VALUE + There is currently no way to set these on a per-cable basis. + + NAME: timeout + TYPE: integer + DEFAULT: 15 + DESC: Timeout value in tenth of seconds. If no data is available once this + time has expired then the driver will return with a timeout error. + + NAME: delay + TYPE: integer + DEFAULT: 10 + DESC: Inter-bit delay in micro-seconds. A lower value gives an higher data + rate but makes transmission less reliable. + +These parameters can be changed at run time by any program via ioctl(2) calls +as listed in ./include/linux/ticable.h. + +Rather than write 50 pages describing the ioctl() and so on, it is +perhaps more useful you look at ticables library (dev_link.c) that demonstrates +how to use them, and demonstrates the features of the driver. This is +probably a lot more useful to people interested in writing applications +that will be using this driver. + +QUIRKS/BUGS: + +None. + +HOW TO CONTACT US: + +You can email me at roms@lpg.ticalc.org. Please prefix the subject line +with "TIPAR: " so that I am certain to notice your message. +You can also mail JB at jb@jblache.org. He packaged these drivers for Debian. + +CREDITS: + +The code is based on tidev.c & parport.c. +The driver has been developed independently of Texas Instruments. diff --git a/Documentation/tty.txt b/Documentation/tty.txt new file mode 100644 index 000000000000..3958cf746dde --- /dev/null +++ b/Documentation/tty.txt @@ -0,0 +1,198 @@ + + The Lockronomicon + +Your guide to the ancient and twisted locking policies of the tty layer and +the warped logic behind them. Beware all ye who read on. + +FIXME: still need to work out the full set of BKL assumptions and document +them so they can eventually be killed off. + + +Line Discipline +--------------- + +Line disciplines are registered with tty_register_ldisc() passing the +discipline number and the ldisc structure. At the point of registration the +discipline must be ready to use and it is possible it will get used before +the call returns success. If the call returns an error then it won't get +called. Do not re-use ldisc numbers as they are part of the userspace ABI +and writing over an existing ldisc will cause demons to eat your computer. +After the return the ldisc data has been copied so you may free your own +copy of the structure. You must not re-register over the top of the line +discipline even with the same data or your computer again will be eaten by +demons. + +In order to remove a line discipline call tty_register_ldisc passing NULL. +In ancient times this always worked. In modern times the function will +return -EBUSY if the ldisc is currently in use. Since the ldisc referencing +code manages the module counts this should not usually be a concern. + +Heed this warning: the reference count field of the registered copies of the +tty_ldisc structure in the ldisc table counts the number of lines using this +discipline. The reference count of the tty_ldisc structure within a tty +counts the number of active users of the ldisc at this instant. In effect it +counts the number of threads of execution within an ldisc method (plus those +about to enter and exit although this detail matters not). + +Line Discipline Methods +----------------------- + +TTY side interfaces: + +close() - This is called on a terminal when the line + discipline is being unplugged. At the point of + execution no further users will enter the + ldisc code for this tty. Can sleep. + +open() - Called when the line discipline is attached to + the terminal. No other call into the line + discipline for this tty will occur until it + completes successfully. Can sleep. + +write() - A process is writing data through the line + discipline. Multiple write calls are serialized + by the tty layer for the ldisc. May sleep. + +flush_buffer() - May be called at any point between open and close. + +chars_in_buffer() - Report the number of bytes in the buffer. + +set_termios() - Called on termios structure changes. The caller + passes the old termios data and the current data + is in the tty. Called under the termios semaphore so + allowed to sleep. Serialized against itself only. + +read() - Move data from the line discipline to the user. + Multiple read calls may occur in parallel and the + ldisc must deal with serialization issues. May + sleep. + +poll() - Check the status for the poll/select calls. Multiple + poll calls may occur in parallel. May sleep. + +ioctl() - Called when an ioctl is handed to the tty layer + that might be for the ldisc. Multiple ioctl calls + may occur in parallel. May sleep. + +Driver Side Interfaces: + +receive_buf() - Hand buffers of bytes from the driver to the ldisc + for processing. Semantics currently rather + mysterious 8( + +receive_room() - Can be called by the driver layer at any time when + the ldisc is opened. The ldisc must be able to + handle the reported amount of data at that instant. + Synchronization between active receive_buf and + receive_room calls is down to the driver not the + ldisc. Must not sleep. + +write_wakeup() - May be called at any point between open and close. + The TTY_DO_WRITE_WAKEUP flag indicates if a call + is needed but always races versus calls. Thus the + ldisc must be careful about setting order and to + handle unexpected calls. Must not sleep. + + The driver is forbidden from calling this directly + from the ->write call from the ldisc as the ldisc + is permitted to call the driver write method from + this function. In such a situation defer it. + + +Locking + +Callers to the line discipline functions from the tty layer are required to +take line discipline locks. The same is true of calls from the driver side +but not yet enforced. + +Three calls are now provided + + ldisc = tty_ldisc_ref(tty); + +takes a handle to the line discipline in the tty and returns it. If no ldisc +is currently attached or the ldisc is being closed and re-opened at this +point then NULL is returned. While this handle is held the ldisc will not +change or go away. + + tty_ldisc_deref(ldisc) + +Returns the ldisc reference and allows the ldisc to be closed. Returning the +reference takes away your right to call the ldisc functions until you take +a new reference. + + ldisc = tty_ldisc_ref_wait(tty); + +Performs the same function as tty_ldisc_ref except that it will wait for an +ldisc change to complete and then return a reference to the new ldisc. + +While these functions are slightly slower than the old code they should have +minimal impact as most receive logic uses the flip buffers and they only +need to take a reference when they push bits up through the driver. + +A caution: The ldisc->open(), ldisc->close() and driver->set_ldisc +functions are called with the ldisc unavailable. Thus tty_ldisc_ref will +fail in this situation if used within these functions. Ldisc and driver +code calling its own functions must be careful in this case. + + +Driver Interface +---------------- + +open() - Called when a device is opened. May sleep + +close() - Called when a device is closed. At the point of + return from this call the driver must make no + further ldisc calls of any kind. May sleep + +write() - Called to write bytes to the device. May not + sleep. May occur in parallel in special cases. + Because this includes panic paths drivers generally + shouldn't try and do clever locking here. + +put_char() - Stuff a single character onto the queue. The + driver is guaranteed following up calls to + flush_chars. + +flush_chars() - Ask the kernel to write put_char queue + +write_room() - Return the number of characters tht can be stuffed + into the port buffers without overflow (or less). + The ldisc is responsible for being intelligent + about multi-threading of write_room/write calls + +ioctl() - Called when an ioctl may be for the driver + +set_termios() - Called on termios change, serialized against + itself by a semaphore. May sleep. + +set_ldisc() - Notifier for discipline change. At the point this + is done the discipline is not yet usable. Can now + sleep (I think) + +throttle() - Called by the ldisc to ask the driver to do flow + control. Serialization including with unthrottle + is the job of the ldisc layer. + +unthrottle() - Called by the ldisc to ask the driver to stop flow + control. + +stop() - Ldisc notifier to the driver to stop output. As with + throttle the serializations with start() are down + to the ldisc layer. + +start() - Ldisc notifier to the driver to start output. + +hangup() - Ask the tty driver to cause a hangup initiated + from the host side. [Can sleep ??] + +break_ctl() - Send RS232 break. Can sleep. Can get called in + parallel, driver must serialize (for now), and + with write calls. + +wait_until_sent() - Wait for characters to exit the hardware queue + of the driver. Can sleep + +send_xchar() - Send XON/XOFF and if possible jump the queue with + it in order to get fast flow control responses. + Cannot sleep ?? + diff --git a/Documentation/uml/UserModeLinux-HOWTO.txt b/Documentation/uml/UserModeLinux-HOWTO.txt new file mode 100644 index 000000000000..0c7b654fec99 --- /dev/null +++ b/Documentation/uml/UserModeLinux-HOWTO.txt @@ -0,0 +1,4686 @@ + User Mode Linux HOWTO + User Mode Linux Core Team + Mon Nov 18 14:16:16 EST 2002 + + This document describes the use and abuse of Jeff Dike's User Mode + Linux: a port of the Linux kernel as a normal Intel Linux process. + ______________________________________________________________________ + + Table of Contents + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 1. Introduction + + 1.1 How is User Mode Linux Different? + 1.2 Why Would I Want User Mode Linux? + + 2. Compiling the kernel and modules + + 2.1 Compiling the kernel + 2.2 Compiling and installing kernel modules + 2.3 Compiling and installing uml_utilities + + 3. Running UML and logging in + + 3.1 Running UML + 3.2 Logging in + 3.3 Examples + + 4. UML on 2G/2G hosts + + 4.1 Introduction + 4.2 The problem + 4.3 The solution + + 5. Setting up serial lines and consoles + + 5.1 Specifying the device + 5.2 Specifying the channel + 5.3 Examples + + 6. Setting up the network + + 6.1 General setup + 6.2 Userspace daemons + 6.3 Specifying ethernet addresses + 6.4 UML interface setup + 6.5 Multicast + 6.6 TUN/TAP with the uml_net helper + 6.7 TUN/TAP with a preconfigured tap device + 6.8 Ethertap + 6.9 The switch daemon + 6.10 Slip + 6.11 Slirp + 6.12 pcap + 6.13 Setting up the host yourself + + 7. Sharing Filesystems between Virtual Machines + + 7.1 A warning + 7.2 Using layered block devices + 7.3 Note! + 7.4 Another warning + 7.5 uml_moo : Merging a COW file with its backing file + + 8. Creating filesystems + + 8.1 Create the filesystem file + 8.2 Assign the file to a UML device + 8.3 Creating and mounting the filesystem + + 9. Host file access + + 9.1 Using hostfs + 9.2 hostfs as the root filesystem + 9.3 Building hostfs + + 10. The Management Console + 10.1 version + 10.2 halt and reboot + 10.3 config + 10.4 remove + 10.5 sysrq + 10.6 help + 10.7 cad + 10.8 stop + 10.9 go + + 11. Kernel debugging + + 11.1 Starting the kernel under gdb + 11.2 Examining sleeping processes + 11.3 Running ddd on UML + 11.4 Debugging modules + 11.5 Attaching gdb to the kernel + 11.6 Using alternate debuggers + + 12. Kernel debugging examples + + 12.1 The case of the hung fsck + 12.2 Episode 2: The case of the hung fsck + + 13. What to do when UML doesn't work + + 13.1 Strange compilation errors when you build from source + 13.2 UML hangs on boot after mounting devfs + 13.3 A variety of panics and hangs with /tmp on a reiserfs filesystem + 13.4 The compile fails with errors about conflicting types for 'open', 'dup', and 'waitpid' + 13.5 UML doesn't work when /tmp is an NFS filesystem + 13.6 UML hangs on boot when compiled with gprof support + 13.7 syslogd dies with a SIGTERM on startup + 13.8 TUN/TAP networking doesn't work on a 2.4 host + 13.9 You can network to the host but not to other machines on the net + 13.10 I have no root and I want to scream + 13.11 UML build conflict between ptrace.h and ucontext.h + 13.12 The UML BogoMips is exactly half the host's BogoMips + 13.13 When you run UML, it immediately segfaults + 13.14 xterms appear, then immediately disappear + 13.15 Any other panic, hang, or strange behavior + + 14. Diagnosing Problems + + 14.1 Case 1 : Normal kernel panics + 14.2 Case 2 : Tracing thread panics + 14.3 Case 3 : Tracing thread panics caused by other threads + 14.4 Case 4 : Hangs + + 15. Thanks + + 15.1 Code and Documentation + 15.2 Flushing out bugs + 15.3 Buglets and clean-ups + 15.4 Case Studies + 15.5 Other contributions + + + ______________________________________________________________________ + + 11.. IInnttrroodduuccttiioonn + + Welcome to User Mode Linux. It's going to be fun. + + + + 11..11.. HHooww iiss UUsseerr MMooddee LLiinnuuxx DDiiffffeerreenntt?? + + Normally, the Linux Kernel talks straight to your hardware (video + card, keyboard, hard drives, etc), and any programs which run ask the + kernel to operate the hardware, like so: + + + + +-----------+-----------+----+ + | Process 1 | Process 2 | ...| + +-----------+-----------+----+ + | Linux Kernel | + +----------------------------+ + | Hardware | + +----------------------------+ + + + + + The User Mode Linux Kernel is different; instead of talking to the + hardware, it talks to a `real' Linux kernel (called the `host kernel' + from now on), like any other program. Programs can then run inside + User-Mode Linux as if they were running under a normal kernel, like + so: + + + + +----------------+ + | Process 2 | ...| + +-----------+----------------+ + | Process 1 | User-Mode Linux| + +----------------------------+ + | Linux Kernel | + +----------------------------+ + | Hardware | + +----------------------------+ + + + + + + 11..22.. WWhhyy WWoouulldd II WWaanntt UUsseerr MMooddee LLiinnuuxx?? + + + 1. If User Mode Linux crashes, your host kernel is still fine. + + 2. You can run a usermode kernel as a non-root user. + + 3. You can debug the User Mode Linux like any normal process. + + 4. You can run gprof (profiling) and gcov (coverage testing). + + 5. You can play with your kernel without breaking things. + + 6. You can use it as a sandbox for testing new apps. + + 7. You can try new development kernels safely. + + 8. You can run different distributions simultaneously. + + 9. It's extremely fun. + + + + + + 22.. CCoommppiilliinngg tthhee kkeerrnneell aanndd mmoodduulleess + + + + + 22..11.. CCoommppiilliinngg tthhee kkeerrnneell + + + Compiling the user mode kernel is just like compiling any other + kernel. Let's go through the steps, using 2.4.0-prerelease (current + as of this writing) as an example: + + + 1. Download the latest UML patch from + + the download page <http://user-mode-linux.sourceforge.net/dl- + sf.html> + + In this example, the file is uml-patch-2.4.0-prerelease.bz2. + + + 2. Download the matching kernel from your favourite kernel mirror, + such as: + + ftp://ftp.ca.kernel.org/pub/kernel/v2.4/linux-2.4.0-prerelease.tar.bz2 + <ftp://ftp.ca.kernel.org/pub/kernel/v2.4/linux-2.4.0-prerelease.tar.bz2> + . + + + 3. Make a directory and unpack the kernel into it. + + + + host% + mkdir ~/uml + + + + + + + host% + cd ~/uml + + + + + + + host% + tar -xzvf linux-2.4.0-prerelease.tar.bz2 + + + + + + + 4. Apply the patch using + + + + host% + cd ~/uml/linux + + + + host% + bzcat uml-patch-2.4.0-prerelease.bz2 | patch -p1 + + + + + + + 5. Run your favorite config; `make xconfig ARCH=um' is the most + convenient. `make config ARCH=um' and 'make menuconfig ARCH=um' + will work as well. The defaults will give you a useful kernel. If + you want to change something, go ahead, it probably won't hurt + anything. + + + Note: If the host is configured with a 2G/2G address space split + rather than the usual 3G/1G split, then the packaged UML binaries + will not run. They will immediately segfault. See ``UML on 2G/2G + hosts'' for the scoop on running UML on your system. + + + + 6. Finish with `make linux ARCH=um': the result is a file called + `linux' in the top directory of your source tree. + + Make sure that you don't build this kernel in /usr/src/linux. On some + distributions, /usr/include/asm is a link into this pool. The user- + mode build changes the other end of that link, and things that include + <asm/anything.h> stop compiling. + + The sources are also available from cvs at the project's cvs page, + which has directions on getting the sources. You can also browse the + CVS pool from there. + + If you get the CVS sources, you will have to check them out into an + empty directory. You will then have to copy each file into the + corresponding directory in the appropriate kernel pool. + + If you don't have the latest kernel pool, you can get the + corresponding user-mode sources with + + + host% cvs co -r v_2_3_x linux + + + + + where 'x' is the version in your pool. Note that you will not get the + bug fixes and enhancements that have gone into subsequent releases. + + + If you build your own kernel, and want to boot it from one of the + filesystems distributed from this site, then, in nearly all cases, + devfs must be compiled into the kernel and mounted at boot time. The + exception is the SuSE filesystem. For this, devfs must either not be + in the kernel at all, or "devfs=nomount" must be on the kernel command + line. Any disagreement between the kernel and the filesystem being + booted about whether devfs is being used will result in the boot + getting no further than single-user mode. + + + If you don't want to use devfs, you can remove the need for it from a + filesystem by copying /dev from someplace, making a bunch of /dev/ubd + devices: + + + UML# for i in 0 1 2 3 4 5 6 7; do mknod ubd$i b 98 $i; done + + + + + and changing /etc/fstab and /etc/inittab to refer to the non-devfs + devices. + + + + 22..22.. CCoommppiilliinngg aanndd iinnssttaalllliinngg kkeerrnneell mmoodduulleess + + UML modules are built in the same way as the native kernel (with the + exception of the 'ARCH=um' that you always need for UML): + + + host% make modules ARCH=um + + + + + Any modules that you want to load into this kernel need to be built in + the user-mode pool. Modules from the native kernel won't work. + + You can install them by using ftp or something to copy them into the + virtual machine and dropping them into /lib/modules/`uname -r`. + + You can also get the kernel build process to install them as follows: + + 1. with the kernel not booted, mount the root filesystem in the top + level of the kernel pool: + + + host% mount root_fs mnt -o loop + + + + + + + 2. run + + + host% + make modules_install INSTALL_MOD_PATH=`pwd`/mnt ARCH=um + + + + + + + 3. unmount the filesystem + + + host% umount mnt + + + + + + + 4. boot the kernel on it + + + When the system is booted, you can use insmod as usual to get the + modules into the kernel. A number of things have been loaded into UML + as modules, especially filesystems and network protocols and filters, + so most symbols which need to be exported probably already are. + However, if you do find symbols that need exporting, let us + <http://user-mode-linux.sourceforge.net/contacts.html> know, and + they'll be "taken care of". + + + + 22..33.. CCoommppiilliinngg aanndd iinnssttaalllliinngg uummll__uuttiilliittiieess + + Many features of the UML kernel require a user-space helper program, + so a uml_utilities package is distributed separately from the kernel + patch which provides these helpers. Included within this is: + + +o port-helper - Used by consoles which connect to xterms or ports + + +o tunctl - Configuration tool to create and delete tap devices + + +o uml_net - Setuid binary for automatic tap device configuration + + +o uml_switch - User-space virtual switch required for daemon + transport + + The uml_utilities tree is compiled with: + + + host# + make && make install + + + + + Note that UML kernel patches may require a specific version of the + uml_utilities distribution. If you don't keep up with the mailing + lists, ensure that you have the latest release of uml_utilities if you + are experiencing problems with your UML kernel, particularly when + dealing with consoles or command-line switches to the helper programs + + + + + + + + + 33.. RRuunnnniinngg UUMMLL aanndd llooggggiinngg iinn + + + + 33..11.. RRuunnnniinngg UUMMLL + + It runs on 2.2.15 or later, and all 2.4 kernels. + + + Booting UML is straightforward. Simply run 'linux': it will try to + mount the file `root_fs' in the current directory. You do not need to + run it as root. If your root filesystem is not named `root_fs', then + you need to put a `ubd0=root_fs_whatever' switch on the linux command + line. + + + You will need a filesystem to boot UML from. There are a number + available for download from here <http://user-mode- + linux.sourceforge.net/dl-sf.html> . There are also several tools + <http://user-mode-linux.sourceforge.net/fs_making.html> which can be + used to generate UML-compatible filesystem images from media. + The kernel will boot up and present you with a login prompt. + + + Note: If the host is configured with a 2G/2G address space split + rather than the usual 3G/1G split, then the packaged UML binaries will + not run. They will immediately segfault. See ``UML on 2G/2G hosts'' + for the scoop on running UML on your system. + + + + 33..22.. LLooggggiinngg iinn + + + + The prepackaged filesystems have a root account with password 'root' + and a user account with password 'user'. The login banner will + generally tell you how to log in. So, you log in and you will find + yourself inside a little virtual machine. Our filesystems have a + variety of commands and utilities installed (and it is fairly easy to + add more), so you will have a lot of tools with which to poke around + the system. + + There are a couple of other ways to log in: + + +o On a virtual console + + + + Each virtual console that is configured (i.e. the device exists in + /dev and /etc/inittab runs a getty on it) will come up in its own + xterm. If you get tired of the xterms, read ``Setting up serial + lines and consoles'' to see how to attach the consoles to + something else, like host ptys. + + + + +o Over the serial line + + + In the boot output, find a line that looks like: + + + + serial line 0 assigned pty /dev/ptyp1 + + + + + Attach your favorite terminal program to the corresponding tty. I.e. + for minicom, the command would be + + + host% minicom -o -p /dev/ttyp1 + + + + + + + +o Over the net + + + If the network is running, then you can telnet to the virtual + machine and log in to it. See ``Setting up the network'' to learn + about setting up a virtual network. + + When you're done using it, run halt, and the kernel will bring itself + down and the process will exit. + + + 33..33.. EExxaammpplleess + + Here are some examples of UML in action: + + +o A login session <http://user-mode-linux.sourceforge.net/login.html> + + +o A virtual network <http://user-mode-linux.sourceforge.net/net.html> + + + + + + + + 44.. UUMMLL oonn 22GG//22GG hhoossttss + + + + + 44..11.. IInnttrroodduuccttiioonn + + + Most Linux machines are configured so that the kernel occupies the + upper 1G (0xc0000000 - 0xffffffff) of the 4G address space and + processes use the lower 3G (0x00000000 - 0xbfffffff). However, some + machine are configured with a 2G/2G split, with the kernel occupying + the upper 2G (0x80000000 - 0xffffffff) and processes using the lower + 2G (0x00000000 - 0x7fffffff). + + + + + 44..22.. TThhee pprroobblleemm + + + The prebuilt UML binaries on this site will not run on 2G/2G hosts + because UML occupies the upper .5G of the 3G process address space + (0xa0000000 - 0xbfffffff). Obviously, on 2G/2G hosts, this is right + in the middle of the kernel address space, so UML won't even load - it + will immediately segfault. + + + + + 44..33.. TThhee ssoolluuttiioonn + + + The fix for this is to rebuild UML from source after enabling + CONFIG_HOST_2G_2G (under 'General Setup'). This will cause UML to + load itself in the top .5G of that smaller process address space, + where it will run fine. See ``Compiling the kernel and modules'' if + you need help building UML from source. + + + + + + + + + + + 55.. SSeettttiinngg uupp sseerriiaall lliinneess aanndd ccoonnssoolleess + + + It is possible to attach UML serial lines and consoles to many types + of host I/O channels by specifying them on the command line. + + + You can attach them to host ptys, ttys, file descriptors, and ports. + This allows you to do things like + + +o have a UML console appear on an unused host console, + + +o hook two virtual machines together by having one attach to a pty + and having the other attach to the corresponding tty + + +o make a virtual machine accessible from the net by attaching a + console to a port on the host. + + + The general format of the command line option is device=channel. + + + + 55..11.. SSppeecciiffyyiinngg tthhee ddeevviiccee + + Devices are specified with "con" or "ssl" (console or serial line, + respectively), optionally with a device number if you are talking + about a specific device. + + + Using just "con" or "ssl" describes all of the consoles or serial + lines. If you want to talk about console #3 or serial line #10, they + would be "con3" and "ssl10", respectively. + + + A specific device name will override a less general "con=" or "ssl=". + So, for example, you can assign a pty to each of the serial lines + except for the first two like this: + + + ssl=pty ssl0=tty:/dev/tty0 ssl1=tty:/dev/tty1 + + + + + The specificity of the device name is all that matters; order on the + command line is irrelevant. + + + + 55..22.. SSppeecciiffyyiinngg tthhee cchhaannnneell + + There are a number of different types of channels to attach a UML + device to, each with a different way of specifying exactly what to + attach to. + + +o pseudo-terminals - device=pty pts terminals - device=pts + + + This will cause UML to allocate a free host pseudo-terminal for the + device. The terminal that it got will be announced in the boot + log. You access it by attaching a terminal program to the + corresponding tty: + + +o screen /dev/pts/n + + +o screen /dev/ttyxx + + +o minicom -o -p /dev/ttyxx - minicom seems not able to handle pts + devices + + +o kermit - start it up, 'open' the device, then 'connect' + + + + + + +o terminals - device=tty:tty device file + + + This will make UML attach the device to the specified tty (i.e + + + con1=tty:/dev/tty3 + + + + + will attach UML's console 1 to the host's /dev/tty3). If the tty that + you specify is the slave end of a tty/pty pair, something else must + have already opened the corresponding pty in order for this to work. + + + + + + +o xterms - device=xterm + + + UML will run an xterm and the device will be attached to it. + + + + + + +o Port - device=port:port number + + + This will attach the UML devices to the specified host port. + Attaching console 1 to the host's port 9000 would be done like + this: + + + con1=port:9000 + + + + + Attaching all the serial lines to that port would be done similarly: + + + ssl=port:9000 + + + + + You access these devices by telnetting to that port. Each active tel- + net session gets a different device. If there are more telnets to a + port than UML devices attached to it, then the extra telnet sessions + will block until an existing telnet detaches, or until another device + becomes active (i.e. by being activated in /etc/inittab). + + This channel has the advantage that you can both attach multiple UML + devices to it and know how to access them without reading the UML boot + log. It is also unique in allowing access to a UML from remote + machines without requiring that the UML be networked. This could be + useful in allowing public access to UMLs because they would be + accessible from the net, but wouldn't need any kind of network + filtering or access control because they would have no network access. + + + If you attach the main console to a portal, then the UML boot will + appear to hang. In reality, it's waiting for a telnet to connect, at + which point the boot will proceed. + + + + + + +o already-existing file descriptors - device=file descriptor + + + If you set up a file descriptor on the UML command line, you can + attach a UML device to it. This is most commonly used to put the + main console back on stdin and stdout after assigning all the other + consoles to something else: + + + con0=fd:0,fd:1 con=pts + + + + + + + + + +o Nothing - device=null + + + This allows the device to be opened, in contrast to 'none', but + reads will block, and writes will succeed and the data will be + thrown out. + + + + + + +o None - device=none + + + This causes the device to disappear. If you are using devfs, the + device will not appear in /dev. If not, then attempts to open it + will return -ENODEV. + + + + You can also specify different input and output channels for a device + by putting a comma between them: + + + ssl3=tty:/dev/tty2,xterm + + + + + will cause serial line 3 to accept input on the host's /dev/tty3 and + display output on an xterm. That's a silly example - the most common + use of this syntax is to reattach the main console to stdin and stdout + as shown above. + + + If you decide to move the main console away from stdin/stdout, the + initial boot output will appear in the terminal that you're running + UML in. However, once the console driver has been officially + initialized, then the boot output will start appearing wherever you + specified that console 0 should be. That device will receive all + subsequent output. + + + + 55..33.. EExxaammpplleess + + There are a number of interesting things you can do with this + capability. + + + First, this is how you get rid of those bleeding console xterms by + attaching them to host ptys: + + + con=pty con0=fd:0,fd:1 + + + + + This will make a UML console take over an unused host virtual console, + so that when you switch to it, you will see the UML login prompt + rather than the host login prompt: + + + con1=tty:/dev/tty6 + + + + + You can attach two virtual machines together with what amounts to a + serial line as follows: + + Run one UML with a serial line attached to a pty - + + + ssl1=pty + + + + + Look at the boot log to see what pty it got (this example will assume + that it got /dev/ptyp1). + + Boot the other UML with a serial line attached to the corresponding + tty - + + + ssl1=tty:/dev/ttyp1 + + + + + Log in, make sure that it has no getty on that serial line, attach a + terminal program like minicom to it, and you should see the login + prompt of the other virtual machine. + + + 66.. SSeettttiinngg uupp tthhee nneettwwoorrkk + + + + This page describes how to set up the various transports and to + provide a UML instance with network access to the host, other machines + on the local net, and the rest of the net. + + + As of 2.4.5, UML networking has been completely redone to make it much + easier to set up, fix bugs, and add new features. + + + There is a new helper, uml_net, which does the host setup that + requires root privileges. + + + There are currently five transport types available for a UML virtual + machine to exchange packets with other hosts: + + +o ethertap + + +o TUN/TAP + + +o Multicast + + +o a switch daemon + + +o slip + + +o slirp + + +o pcap + + The TUN/TAP, ethertap, slip, and slirp transports allow a UML + instance to exchange packets with the host. They may be directed + to the host or the host may just act as a router to provide access + to other physical or virtual machines. + + + The pcap transport is a synthetic read-only interface, using the + libpcap binary to collect packets from interfaces on the host and + filter them. This is useful for building preconfigured traffic + monitors or sniffers. + + + The daemon and multicast transports provide a completely virtual + network to other virtual machines. This network is completely + disconnected from the physical network unless one of the virtual + machines on it is acting as a gateway. + + + With so many host transports, which one should you use? Here's when + you should use each one: + + +o ethertap - if you want access to the host networking and it is + running 2.2 + + +o TUN/TAP - if you want access to the host networking and it is + running 2.4. Also, the TUN/TAP transport is able to use a + preconfigured device, allowing it to avoid using the setuid uml_net + helper, which is a security advantage. + + +o Multicast - if you want a purely virtual network and you don't want + to set up anything but the UML + + +o a switch daemon - if you want a purely virtual network and you + don't mind running the daemon in order to get somewhat better + performance + + +o slip - there is no particular reason to run the slip backend unless + ethertap and TUN/TAP are just not available for some reason + + +o slirp - if you don't have root access on the host to setup + networking, or if you don't want to allocate an IP to your UML + + +o pcap - not much use for actual network connectivity, but great for + monitoring traffic on the host + + Ethertap is available on 2.4 and works fine. TUN/TAP is preferred + to it because it has better performance and ethertap is officially + considered obsolete in 2.4. Also, the root helper only needs to + run occasionally for TUN/TAP, rather than handling every packet, as + it does with ethertap. This is a slight security advantage since + it provides fewer opportunities for a nasty UML user to somehow + exploit the helper's root privileges. + + + 66..11.. GGeenneerraall sseettuupp + + First, you must have the virtual network enabled in your UML. If are + running a prebuilt kernel from this site, everything is already + enabled. If you build the kernel yourself, under the "Network device + support" menu, enable "Network device support", and then the three + transports. + + + The next step is to provide a network device to the virtual machine. + This is done by describing it on the kernel command line. + + The general format is + + + eth <n> = <transport> , <transport args> + + + + + For example, a virtual ethernet device may be attached to a host + ethertap device as follows: + + + eth0=ethertap,tap0,fe:fd:0:0:0:1,192.168.0.254 + + + + + This sets up eth0 inside the virtual machine to attach itself to the + host /dev/tap0, assigns it an ethernet address, and assigns the host + tap0 interface an IP address. + + + + Note that the IP address you assign to the host end of the tap device + must be different than the IP you assign to the eth device inside UML. + If you are short on IPs and don't want to comsume two per UML, then + you can reuse the host's eth IP address for the host ends of the tap + devices. Internally, the UMLs must still get unique IPs for their eth + devices. You can also give the UMLs non-routable IPs (192.168.x.x or + 10.x.x.x) and have the host masquerade them. This will let outgoing + connections work, but incoming connections won't without more work, + such as port forwarding from the host. + Also note that when you configure the host side of an interface, it is + only acting as a gateway. It will respond to pings sent to it + locally, but is not useful to do that since it's a host interface. + You are not talking to the UML when you ping that interface and get a + response. + + + You can also add devices to a UML and remove them at runtime. See the + ``The Management Console'' page for details. + + + The sections below describe this in more detail. + + + Once you've decided how you're going to set up the devices, you boot + UML, log in, configure the UML side of the devices, and set up routes + to the outside world. At that point, you will be able to talk to any + other machines, physical or virtual, on the net. + + + If ifconfig inside UML fails and the network refuses to come up, run + tell you what went wrong. + + + + 66..22.. UUsseerrssppaaccee ddaaeemmoonnss + + You will likely need the setuid helper, or the switch daemon, or both. + They are both installed with the RPM and deb, so if you've installed + either, you can skip the rest of this section. + + + If not, then you need to check them out of CVS, build them, and + install them. The helper is uml_net, in CVS /tools/uml_net, and the + daemon is uml_switch, in CVS /tools/uml_router. They are both built + with a plain 'make'. Both need to be installed in a directory that's + in your path - /usr/bin is recommend. On top of that, uml_net needs + to be setuid root. + + + + 66..33.. SSppeecciiffyyiinngg eetthheerrnneett aaddddrreesssseess + + Below, you will see that the TUN/TAP, ethertap, and daemon interfaces + allow you to specify hardware addresses for the virtual ethernet + devices. This is generally not necessary. If you don't have a + specific reason to do it, you probably shouldn't. If one is not + specified on the command line, the driver will assign one based on the + device IP address. It will provide the address fe:fd:nn:nn:nn:nn + where nn.nn.nn.nn is the device IP address. This is nearly always + sufficient to guarantee a unique hardware address for the device. A + couple of exceptions are: + + +o Another set of virtual ethernet devices are on the same network and + they are assigned hardware addresses using a different scheme which + may conflict with the UML IP address-based scheme + + +o You aren't going to use the device for IP networking, so you don't + assign the device an IP address + + If you let the driver provide the hardware address, you should make + sure that the device IP address is known before the interface is + brought up. So, inside UML, this will guarantee that: + + + + UML# + ifconfig eth0 192.168.0.250 up + + + + + If you decide to assign the hardware address yourself, make sure that + the first byte of the address is even. Addresses with an odd first + byte are broadcast addresses, which you don't want assigned to a + device. + + + + 66..44.. UUMMLL iinntteerrffaaccee sseettuupp + + Once the network devices have been described on the command line, you + should boot UML and log in. + + + The first thing to do is bring the interface up: + + + UML# ifconfig ethn ip-address up + + + + + You should be able to ping the host at this point. + + + To reach the rest of the world, you should set a default route to the + host: + + + UML# route add default gw host ip + + + + + Again, with host ip of 192.168.0.4: + + + UML# route add default gw 192.168.0.4 + + + + + This page used to recommend setting a network route to your local net. + This is wrong, because it will cause UML to try to figure out hardware + addresses of the local machines by arping on the interface to the + host. Since that interface is basically a single strand of ethernet + with two nodes on it (UML and the host) and arp requests don't cross + networks, they will fail to elicit any responses. So, what you want + is for UML to just blindly throw all packets at the host and let it + figure out what to do with them, which is what leaving out the network + route and adding the default route does. + + + Note: If you can't communicate with other hosts on your physical + ethernet, it's probably because of a network route that's + automatically set up. If you run 'route -n' and see a route that + looks like this: + + + + + Destination Gateway Genmask Flags Metric Ref Use Iface + 192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 + + + + + with a mask that's not 255.255.255.255, then replace it with a route + to your host: + + + UML# + route del -net 192.168.0.0 dev eth0 netmask 255.255.255.0 + + + + + + + UML# + route add -host 192.168.0.4 dev eth0 + + + + + This, plus the default route to the host, will allow UML to exchange + packets with any machine on your ethernet. + + + + 66..55.. MMuullttiiccaasstt + + The simplest way to set up a virtual network between multiple UMLs is + to use the mcast transport. This was written by Harald Welte and is + present in UML version 2.4.5-5um and later. Your system must have + multicast enabled in the kernel and there must be a multicast-capable + network device on the host. Normally, this is eth0, but if there is + no ethernet card on the host, then you will likely get strange error + messages when you bring the device up inside UML. + + + To use it, run two UMLs with + + + eth0=mcast + + + + + on their command lines. Log in, configure the ethernet device in each + machine with different IP addresses: + + + UML1# ifconfig eth0 192.168.0.254 + + + + + + + UML2# ifconfig eth0 192.168.0.253 + + + + + and they should be able to talk to each other. + + The full set of command line options for this transport are + + + + ethn=mcast,ethernet address,multicast + address,multicast port,ttl + + + + + Harald's original README is here <http://user-mode-linux.source- + forge.net/text/mcast.txt> and explains these in detail, as well as + some other issues. + + + + 66..66.. TTUUNN//TTAAPP wwiitthh tthhee uummll__nneett hheellppeerr + + TUN/TAP is the preferred mechanism on 2.4 to exchange packets with the + host. The TUN/TAP backend has been in UML since 2.4.9-3um. + + + The easiest way to get up and running is to let the setuid uml_net + helper do the host setup for you. This involves insmod-ing the tun.o + module if necessary, configuring the device, and setting up IP + forwarding, routing, and proxy arp. If you are new to UML networking, + do this first. If you're concerned about the security implications of + the setuid helper, use it to get up and running, then read the next + section to see how to have UML use a preconfigured tap device, which + avoids the use of uml_net. + + + If you specify an IP address for the host side of the device, the + uml_net helper will do all necessary setup on the host - the only + requirement is that TUN/TAP be available, either built in to the host + kernel or as the tun.o module. + + The format of the command line switch to attach a device to a TUN/TAP + device is + + + eth <n> =tuntap,,, <IP address> + + + + + For example, this argument will attach the UML's eth0 to the next + available tap device and assign an ethernet address to it based on its + IP address + + + eth0=tuntap,,,192.168.0.254 + + + + + + + Note that the IP address that must be used for the eth device inside + UML is fixed by the routing and proxy arp that is set up on the + TUN/TAP device on the host. You can use a different one, but it won't + work because reply packets won't reach the UML. This is a feature. + It prevents a nasty UML user from doing things like setting the UML IP + to the same as the network's nameserver or mail server. + + + There are a couple potential problems with running the TUN/TAP + transport on a 2.4 host kernel + + +o TUN/TAP seems not to work on 2.4.3 and earlier. Upgrade the host + kernel or use the ethertap transport. + + +o With an upgraded kernel, TUN/TAP may fail with + + + File descriptor in bad state + + + + + This is due to a header mismatch between the upgraded kernel and the + kernel that was originally installed on the machine. The fix is to + make sure that /usr/src/linux points to the headers for the running + kernel. + + These were pointed out by Tim Robinson <timro at trkr dot net> in + <http://www.geocrawler.com/lists/3/SourceForge/597/0/> name="this uml- + user post"> . + + + + 66..77.. TTUUNN//TTAAPP wwiitthh aa pprreeccoonnffiigguurreedd ttaapp ddeevviiccee + + If you prefer not to have UML use uml_net (which is somewhat + insecure), with UML 2.4.17-11, you can set up a TUN/TAP device + beforehand. The setup needs to be done as root, but once that's done, + there is no need for root assistance. Setting up the device is done + as follows: + + +o Create the device with tunctl (available from the UML utilities + tarball) + + + + + host# tunctl -u uid + + + + + where uid is the user id or username that UML will be run as. This + will tell you what device was created. + + +o Configure the device IP (change IP addresses and device name to + suit) + + + + + host# ifconfig tap0 192.168.0.254 up + + + + + + +o Set up routing and arping if desired - this is my recipe, there are + other ways of doing the same thing + + + host# + bash -c 'echo 1 > /proc/sys/net/ipv4/ip_forward' + + host# + route add -host 192.168.0.253 dev tap0 + + + + + + + host# + bash -c 'echo 1 > /proc/sys/net/ipv4/conf/tap0/proxy_arp' + + + + + + + host# + arp -Ds 192.168.0.253 eth0 pub + + + + + Note that this must be done every time the host boots - this configu- + ration is not stored across host reboots. So, it's probably a good + idea to stick it in an rc file. An even better idea would be a little + utility which reads the information from a config file and sets up + devices at boot time. + + +o Rather than using up two IPs and ARPing for one of them, you can + also provide direct access to your LAN by the UML by using a + bridge. + + + host# + brctl addbr br0 + + + + + + + host# + ifconfig eth0 0.0.0.0 promisc up + + + + + + + host# + ifconfig tap0 0.0.0.0 promisc up + + + + + + + host# + ifconfig br0 192.168.0.1 netmask 255.255.255.0 up + + + + + + + + host# + brctl stp br0 off + + + + + + + host# + brctl setfd br0 1 + + + + + + + host# + brctl sethello br0 1 + + + + + + + host# + brctl addif br0 eth0 + + + + + + + host# + brctl addif br0 tap0 + + + + + Note that 'br0' should be setup using ifconfig with the existing IP + address of eth0, as eth0 no longer has its own IP. + + +o + + + Also, the /dev/net/tun device must be writable by the user running + UML in order for the UML to use the device that's been configured + for it. The simplest thing to do is + + + host# chmod 666 /dev/net/tun + + + + + Making it world-writeable looks bad, but it seems not to be + exploitable as a security hole. However, it does allow anyone to cre- + ate useless tap devices (useless because they can't configure them), + which is a DOS attack. A somewhat more secure alternative would to be + to create a group containing all the users who have preconfigured tap + devices and chgrp /dev/net/tun to that group with mode 664 or 660. + + + +o Once the device is set up, run UML with 'eth0=tuntap,device name' + (i.e. 'eth0=tuntap,tap0') on the command line (or do it with the + mconsole config command). + + +o Bring the eth device up in UML and you're in business. + + If you don't want that tap device any more, you can make it non- + persistent with + + + host# tunctl -d tap device + + + + + Finally, tunctl has a -b (for brief mode) switch which causes it to + output only the name of the tap device it created. This makes it + suitable for capture by a script: + + + host# TAP=`tunctl -u 1000 -b` + + + + + + + 66..88.. EEtthheerrttaapp + + Ethertap is the general mechanism on 2.2 for userspace processes to + exchange packets with the kernel. + + + + To use this transport, you need to describe the virtual network device + on the UML command line. The general format for this is + + + eth <n> =ethertap, <device> , <ethernet address> , <tap IP address> + + + + + So, the previous example + + + eth0=ethertap,tap0,fe:fd:0:0:0:1,192.168.0.254 + + + + + attaches the UML eth0 device to the host /dev/tap0, assigns it the + ethernet address fe:fd:0:0:0:1, and assigns the IP address + 192.168.0.254 to the tap device. + + + + The tap device is mandatory, but the others are optional. If the + ethernet address is omitted, one will be assigned to it. + + + The presence of the tap IP address will cause the helper to run and do + whatever host setup is needed to allow the virtual machine to + communicate with the outside world. If you're not sure you know what + you're doing, this is the way to go. + + + If it is absent, then you must configure the tap device and whatever + arping and routing you will need on the host. However, even in this + case, the uml_net helper still needs to be in your path and it must be + setuid root if you're not running UML as root. This is because the + tap device doesn't support SIGIO, which UML needs in order to use + something as a source of input. So, the helper is used as a + convenient asynchronous IO thread. + + If you're using the uml_net helper, you can ignore the following host + setup - uml_net will do it for you. You just need to make sure you + have ethertap available, either built in to the host kernel or + available as a module. + + + If you want to set things up yourself, you need to make sure that the + appropriate /dev entry exists. If it doesn't, become root and create + it as follows: + + + mknod /dev/tap <minor> c 36 <minor> + 16 + + + + + For example, this is how to create /dev/tap0: + + + mknod /dev/tap0 c 36 0 + 16 + + + + + You also need to make sure that the host kernel has ethertap support. + If ethertap is enabled as a module, you apparently need to insmod + ethertap once for each ethertap device you want to enable. So, + + + host# + insmod ethertap + + + + + will give you the tap0 interface. To get the tap1 interface, you need + to run + + + host# + insmod ethertap unit=1 -o ethertap1 + + + + + + + + 66..99.. TThhee sswwiittcchh ddaaeemmoonn + + NNoottee: This is the daemon formerly known as uml_router, but which was + renamed so the network weenies of the world would stop growling at me. + + + The switch daemon, uml_switch, provides a mechanism for creating a + totally virtual network. By default, it provides no connection to the + host network (but see -tap, below). + + + The first thing you need to do is run the daemon. Running it with no + arguments will make it listen on a default pair of unix domain + sockets. + + + If you want it to listen on a different pair of sockets, use + + + -unix control socket data socket + + + + + + If you want it to act as a hub rather than a switch, use + + + -hub + + + + + + If you want the switch to be connected to host networking (allowing + the umls to get access to the outside world through the host), use + + + -tap tap0 + + + + + + Note that the tap device must be preconfigured (see "TUN/TAP with a + preconfigured tap device", above). If you're using a different tap + device than tap0, specify that instead of tap0. + + + uml_switch can be backgrounded as follows + + + host% + uml_switch [ options ] < /dev/null > /dev/null + + + + + The reason it doesn't background by default is that it listens to + stdin for EOF. When it sees that, it exits. + + + The general format of the kernel command line switch is + + + + ethn=daemon,ethernet address,socket + type,control socket,data socket + + + + + You can leave off everything except the 'daemon'. You only need to + specify the ethernet address if the one that will be assigned to it + isn't acceptable for some reason. The rest of the arguments describe + how to communicate with the daemon. You should only specify them if + you told the daemon to use different sockets than the default. So, if + you ran the daemon with no arguments, running the UML on the same + machine with + eth0=daemon + + + + + will cause the eth0 driver to attach itself to the daemon correctly. + + + + 66..1100.. SSlliipp + + Slip is another, less general, mechanism for a process to communicate + with the host networking. In contrast to the ethertap interface, + which exchanges ethernet frames with the host and can be used to + transport any higher-level protocol, it can only be used to transport + IP. + + + The general format of the command line switch is + + + + ethn=slip,slip IP + + + + + The slip IP argument is the IP address that will be assigned to the + host end of the slip device. If it is specified, the helper will run + and will set up the host so that the virtual machine can reach it and + the rest of the network. + + + There are some oddities with this interface that you should be aware + of. You should only specify one slip device on a given virtual + machine, and its name inside UML will be 'umn', not 'eth0' or whatever + you specified on the command line. These problems will be fixed at + some point. + + + + 66..1111.. SSlliirrpp + + slirp uses an external program, usually /usr/bin/slirp, to provide IP + only networking connectivity through the host. This is similar to IP + masquerading with a firewall, although the translation is performed in + user-space, rather than by the kernel. As slirp does not set up any + interfaces on the host, or changes routing, slirp does not require + root access or setuid binaries on the host. + + + The general format of the command line switch for slirp is: + + + + ethn=slirp,ethernet address,slirp path + + + + + The ethernet address is optional, as UML will set up the interface + with an ethernet address based upon the initial IP address of the + interface. The slirp path is generally /usr/bin/slirp, although it + will depend on distribution. + + + The slirp program can have a number of options passed to the command + line and we can't add them to the UML command line, as they will be + parsed incorrectly. Instead, a wrapper shell script can be written or + the options inserted into the /.slirprc file. More information on + all of the slirp options can be found in its man pages. + + + The eth0 interface on UML should be set up with the IP 10.2.0.15, + although you can use anything as long as it is not used by a network + you will be connecting to. The default route on UML should be set to + use + + + UML# + route add default dev eth0 + + + + + slirp provides a number of useful IP addresses which can be used by + UML, such as 10.0.2.3 which is an alias for the DNS server specified + in /etc/resolv.conf on the host or the IP given in the 'dns' option + for slirp. + + + Even with a baudrate setting higher than 115200, the slirp connection + is limited to 115200. If you need it to go faster, the slirp binary + needs to be compiled with FULL_BOLT defined in config.h. + + + + 66..1122.. ppccaapp + + The pcap transport is attached to a UML ethernet device on the command + line or with uml_mconsole with the following syntax: + + + + ethn=pcap,host interface,filter + expression,option1,option2 + + + + + The expression and options are optional. + + + The interface is whatever network device on the host you want to + sniff. The expression is a pcap filter expression, which is also what + tcpdump uses, so if you know how to specify tcpdump filters, you will + use the same expressions here. The options are up to two of + 'promisc', control whether pcap puts the host interface into + promiscuous mode. 'optimize' and 'nooptimize' control whether the pcap + expression optimizer is used. + + + Example: + + + + eth0=pcap,eth0,tcp + + eth1=pcap,eth0,!tcp + + + + will cause the UML eth0 to emit all tcp packets on the host eth0 and + the UML eth1 to emit all non-tcp packets on the host eth0. + + + + 66..1133.. SSeettttiinngg uupp tthhee hhoosstt yyoouurrsseellff + + If you don't specify an address for the host side of the ethertap or + slip device, UML won't do any setup on the host. So this is what is + needed to get things working (the examples use a host-side IP of + 192.168.0.251 and a UML-side IP of 192.168.0.250 - adjust to suit your + own network): + + +o The device needs to be configured with its IP address. Tap devices + are also configured with an mtu of 1484. Slip devices are + configured with a point-to-point address pointing at the UML ip + address. + + + host# ifconfig tap0 arp mtu 1484 192.168.0.251 up + + + + + + + host# + ifconfig sl0 192.168.0.251 pointopoint 192.168.0.250 up + + + + + + +o If a tap device is being set up, a route is set to the UML IP. + + + UML# route add -host 192.168.0.250 gw 192.168.0.251 + + + + + + +o To allow other hosts on your network to see the virtual machine, + proxy arp is set up for it. + + + host# arp -Ds 192.168.0.250 eth0 pub + + + + + + +o Finally, the host is set up to route packets. + + + host# echo 1 > /proc/sys/net/ipv4/ip_forward + + + + + + + + + + + 77.. SShhaarriinngg FFiilleessyysstteemmss bbeettwweeeenn VViirrttuuaall MMaacchhiinneess + + + + + 77..11.. AA wwaarrnniinngg + + Don't attempt to share filesystems simply by booting two UMLs from the + same file. That's the same thing as booting two physical machines + from a shared disk. It will result in filesystem corruption. + + + + 77..22.. UUssiinngg llaayyeerreedd bblloocckk ddeevviicceess + + The way to share a filesystem between two virtual machines is to use + the copy-on-write (COW) layering capability of the ubd block driver. + As of 2.4.6-2um, the driver supports layering a read-write private + device over a read-only shared device. A machine's writes are stored + in the private device, while reads come from either device - the + private one if the requested block is valid in it, the shared one if + not. Using this scheme, the majority of data which is unchanged is + shared between an arbitrary number of virtual machines, each of which + has a much smaller file containing the changes that it has made. With + a large number of UMLs booting from a large root filesystem, this + leads to a huge disk space saving. It will also help performance, + since the host will be able to cache the shared data using a much + smaller amount of memory, so UML disk requests will be served from the + host's memory rather than its disks. + + + + + To add a copy-on-write layer to an existing block device file, simply + add the name of the COW file to the appropriate ubd switch: + + + ubd0=root_fs_cow,root_fs_debian_22 + + + + + where 'root_fs_cow' is the private COW file and 'root_fs_debian_22' is + the existing shared filesystem. The COW file need not exist. If it + doesn't, the driver will create and initialize it. Once the COW file + has been initialized, it can be used on its own on the command line: + + + ubd0=root_fs_cow + + + + + The name of the backing file is stored in the COW file header, so it + would be redundant to continue specifying it on the command line. + + + + 77..33.. NNoottee!! + + When checking the size of the COW file in order to see the gobs of + space that you're saving, make sure you use 'ls -ls' to see the actual + disk consumption rather than the length of the file. The COW file is + sparse, so the length will be very different from the disk usage. + Here is a 'ls -l' of a COW file and backing file from one boot and + shutdown: + host% ls -l cow.debian debian2.2 + -rw-r--r-- 1 jdike jdike 492504064 Aug 6 21:16 cow.debian + -rwxrw-rw- 1 jdike jdike 537919488 Aug 6 20:42 debian2.2 + + + + + Doesn't look like much saved space, does it? Well, here's 'ls -ls': + + + host% ls -ls cow.debian debian2.2 + 880 -rw-r--r-- 1 jdike jdike 492504064 Aug 6 21:16 cow.debian + 525832 -rwxrw-rw- 1 jdike jdike 537919488 Aug 6 20:42 debian2.2 + + + + + Now, you can see that the COW file has less than a meg of disk, rather + than 492 meg. + + + + 77..44.. AAnnootthheerr wwaarrnniinngg + + Once a filesystem is being used as a readonly backing file for a COW + file, do not boot directly from it or modify it in any way. Doing so + will invalidate any COW files that are using it. The mtime and size + of the backing file are stored in the COW file header at its creation, + and they must continue to match. If they don't, the driver will + refuse to use the COW file. + + + + + If you attempt to evade this restriction by changing either the + backing file or the COW header by hand, you will get a corrupted + filesystem. + + + + + Among other things, this means that upgrading the distribution in a + backing file and expecting that all of the COW files using it will see + the upgrade will not work. + + + + + 77..55.. uummll__mmoooo :: MMeerrggiinngg aa CCOOWW ffiillee wwiitthh iittss bbaacckkiinngg ffiillee + + Depending on how you use UML and COW devices, it may be advisable to + merge the changes in the COW file into the backing file every once in + a while. + + + + + The utility that does this is uml_moo. Its usage is + + + host% uml_moo COW file new backing file + + + + + There's no need to specify the backing file since that information is + already in the COW file header. If you're paranoid, boot the new + merged file, and if you're happy with it, move it over the old backing + file. + + + + + uml_moo creates a new backing file by default as a safety measure. It + also has a destructive merge option which will merge the COW file + directly into its current backing file. This is really only usable + when the backing file only has one COW file associated with it. If + there are multiple COWs associated with a backing file, a -d merge of + one of them will invalidate all of the others. However, it is + convenient if you're short of disk space, and it should also be + noticably faster than a non-destructive merge. + + + + + uml_moo is installed with the UML deb and RPM. If you didn't install + UML from one of those packages, you can also get it from the UML + utilities <http://user-mode-linux.sourceforge.net/dl-sf.html#UML + utilities> tar file in tools/moo. + + + + + + + + + 88.. CCrreeaattiinngg ffiilleessyysstteemmss + + + You may want to create and mount new UML filesystems, either because + your root filesystem isn't large enough or because you want to use a + filesystem other than ext2. + + + This was written on the occasion of reiserfs being included in the + 2.4.1 kernel pool, and therefore the 2.4.1 UML, so the examples will + talk about reiserfs. This information is generic, and the examples + should be easy to translate to the filesystem of your choice. + + + 88..11.. CCrreeaattee tthhee ffiilleessyysstteemm ffiillee + + dd is your friend. All you need to do is tell dd to create an empty + file of the appropriate size. I usually make it sparse to save time + and to avoid allocating disk space until it's actually used. For + example, the following command will create a sparse 100 meg file full + of zeroes. + + + host% + dd if=/dev/zero of=new_filesystem seek=100 count=1 bs=1M + + + + + + + 88..22.. AAssssiiggnn tthhee ffiillee ttoo aa UUMMLL ddeevviiccee + + Add an argument like the following to the UML command line: + + ubd4=new_filesystem + + + + + making sure that you use an unassigned ubd device number. + + + + 88..33.. CCrreeaattiinngg aanndd mmoouunnttiinngg tthhee ffiilleessyysstteemm + + Make sure that the filesystem is available, either by being built into + the kernel, or available as a module, then boot up UML and log in. If + the root filesystem doesn't have the filesystem utilities (mkfs, fsck, + etc), then get them into UML by way of the net or hostfs. + + + Make the new filesystem on the device assigned to the new file: + + + host# mkreiserfs /dev/ubd/4 + + + <----------- MKREISERFSv2 -----------> + + ReiserFS version 3.6.25 + Block size 4096 bytes + Block count 25856 + Used blocks 8212 + Journal - 8192 blocks (18-8209), journal header is in block 8210 + Bitmaps: 17 + Root block 8211 + Hash function "r5" + ATTENTION: ALL DATA WILL BE LOST ON '/dev/ubd/4'! (y/n)y + journal size 8192 (from 18) + Initializing journal - 0%....20%....40%....60%....80%....100% + Syncing..done. + + + + + Now, mount it: + + + UML# + mount /dev/ubd/4 /mnt + + + + + and you're in business. + + + + + + + + + + 99.. HHoosstt ffiillee aacccceessss + + + If you want to access files on the host machine from inside UML, you + can treat it as a separate machine and either nfs mount directories + from the host or copy files into the virtual machine with scp or rcp. + However, since UML is running on the the host, it can access those + files just like any other process and make them available inside the + virtual machine without needing to use the network. + + + This is now possible with the hostfs virtual filesystem. With it, you + can mount a host directory into the UML filesystem and access the + files contained in it just as you would on the host. + + + 99..11.. UUssiinngg hhoossttffss + + To begin with, make sure that hostfs is available inside the virtual + machine with + + + UML# cat /proc/filesystems + + + + . hostfs should be listed. If it's not, either rebuild the kernel + with hostfs configured into it or make sure that hostfs is built as a + module and available inside the virtual machine, and insmod it. + + + Now all you need to do is run mount: + + + UML# mount none /mnt/host -t hostfs + + + + + will mount the host's / on the virtual machine's /mnt/host. + + + If you don't want to mount the host root directory, then you can + specify a subdirectory to mount with the -o switch to mount: + + + UML# mount none /mnt/home -t hostfs -o /home + + + + + will mount the hosts's /home on the virtual machine's /mnt/home. + + + + 99..22.. hhoossttffss aass tthhee rroooott ffiilleessyysstteemm + + It's possible to boot from a directory hierarchy on the host using + hostfs rather than using the standard filesystem in a file. + + To start, you need that hierarchy. The easiest way is to loop mount + an existing root_fs file: + + + host# mount root_fs uml_root_dir -o loop + + + + + You need to change the filesystem type of / in etc/fstab to be + 'hostfs', so that line looks like this: + + /dev/ubd/0 / hostfs defaults 1 1 + + + + + Then you need to chown to yourself all the files in that directory + that are owned by root. This worked for me: + + + host# find . -uid 0 -exec chown jdike {} \; + + + + + Next, make sure that your UML kernel has hostfs compiled in, not as a + module. Then run UML with the boot device pointing at that directory: + + + ubd0=/path/to/uml/root/directory + + + + + UML should then boot as it does normally. + + + 99..33.. BBuuiillddiinngg hhoossttffss + + If you need to build hostfs because it's not in your kernel, you have + two choices: + + + + +o Compiling hostfs into the kernel: + + + Reconfigure the kernel and set the 'Host filesystem' option under + + + +o Compiling hostfs as a module: + + + Reconfigure the kernel and set the 'Host filesystem' option under + be in arch/um/fs/hostfs/hostfs.o. Install that in + /lib/modules/`uname -r`/fs in the virtual machine, boot it up, and + + + UML# insmod hostfs + + + + + + + + + + + + + 1100.. TThhee MMaannaaggeemmeenntt CCoonnssoollee + + + + The UML management console is a low-level interface to the kernel, + somewhat like the i386 SysRq interface. Since there is a full-blown + operating system under UML, there is much greater flexibility possible + than with the SysRq mechanism. + + + There are a number of things you can do with the mconsole interface: + + +o get the kernel version + + +o add and remove devices + + +o halt or reboot the machine + + +o Send SysRq commands + + +o Pause and resume the UML + + + You need the mconsole client (uml_mconsole) which is present in CVS + (/tools/mconsole) in 2.4.5-9um and later, and will be in the RPM in + 2.4.6. + + + You also need CONFIG_MCONSOLE (under 'General Setup') enabled in UML. + When you boot UML, you'll see a line like: + + + mconsole initialized on /home/jdike/.uml/umlNJ32yL/mconsole + + + + + If you specify a unique machine id one the UML command line, i.e. + + + umid=debian + + + + + you'll see this + + + mconsole initialized on /home/jdike/.uml/debian/mconsole + + + + + That file is the socket that uml_mconsole will use to communicate with + UML. Run it with either the umid or the full path as its argument: + + + host% uml_mconsole debian + + + + + or + + + host% uml_mconsole /home/jdike/.uml/debian/mconsole + + + + + You'll get a prompt, at which you can run one of these commands: + + +o version + + +o halt + + +o reboot + + +o config + + +o remove + + +o sysrq + + +o help + + +o cad + + +o stop + + +o go + + + 1100..11.. vveerrssiioonn + + This takes no arguments. It prints the UML version. + + + (mconsole) version + OK Linux usermode 2.4.5-9um #1 Wed Jun 20 22:47:08 EDT 2001 i686 + + + + + There are a couple actual uses for this. It's a simple no-op which + can be used to check that a UML is running. It's also a way of + sending an interrupt to the UML. This is sometimes useful on SMP + hosts, where there's a bug which causes signals to UML to be lost, + often causing it to appear to hang. Sending such a UML the mconsole + version command is a good way to 'wake it up' before networking has + been enabled, as it does not do anything to the function of the UML. + + + + 1100..22.. hhaalltt aanndd rreebboooott + + These take no arguments. They shut the machine down immediately, with + no syncing of disks and no clean shutdown of userspace. So, they are + pretty close to crashing the machine. + + + (mconsole) halt + OK + + + + + + + 1100..33.. ccoonnffiigg + + "config" adds a new device to the virtual machine. Currently the ubd + and network drivers support this. It takes one argument, which is the + device to add, with the same syntax as the kernel command line. + + + + + (mconsole) + config ubd3=/home/jdike/incoming/roots/root_fs_debian22 + + OK + (mconsole) config eth1=mcast + OK + + + + + + + 1100..44.. rreemmoovvee + + "remove" deletes a device from the system. Its argument is just the + name of the device to be removed. The device must be idle in whatever + sense the driver considers necessary. In the case of the ubd driver, + the removed block device must not be mounted, swapped on, or otherwise + open, and in the case of the network driver, the device must be down. + + + (mconsole) remove ubd3 + OK + (mconsole) remove eth1 + OK + + + + + + + 1100..55.. ssyyssrrqq + + This takes one argument, which is a single letter. It calls the + generic kernel's SysRq driver, which does whatever is called for by + that argument. See the SysRq documentation in Documentation/sysrq.txt + in your favorite kernel tree to see what letters are valid and what + they do. + + + + 1100..66.. hheellpp + + "help" returns a string listing the valid commands and what each one + does. + + + + 1100..77.. ccaadd + + This invokes the Ctl-Alt-Del action on init. What exactly this ends + up doing is up to /etc/inittab. Normally, it reboots the machine. + With UML, this is usually not desired, so if a halt would be better, + then find the section of inittab that looks like this + + + # What to do when CTRL-ALT-DEL is pressed. + ca:12345:ctrlaltdel:/sbin/shutdown -t1 -a -r now + + + + + and change the command to halt. + + + + 1100..88.. ssttoopp + + This puts the UML in a loop reading mconsole requests until a 'go' + mconsole command is received. This is very useful for making backups + of UML filesystems, as the UML can be stopped, then synced via 'sysrq + s', so that everything is written to the filesystem. You can then copy + the filesystem and then send the UML 'go' via mconsole. + + + Note that a UML running with more than one CPU will have problems + after you send the 'stop' command, as only one CPU will be held in a + mconsole loop and all others will continue as normal. This is a bug, + and will be fixed. + + + + 1100..99.. ggoo + + This resumes a UML after being paused by a 'stop' command. Note that + when the UML has resumed, TCP connections may have timed out and if + the UML is paused for a long period of time, crond might go a little + crazy, running all the jobs it didn't do earlier. + + + + + + + + + 1111.. KKeerrnneell ddeebbuuggggiinngg + + + NNoottee:: The interface that makes debugging, as described here, possible + is present in 2.4.0-test6 kernels and later. + + + Since the user-mode kernel runs as a normal Linux process, it is + possible to debug it with gdb almost like any other process. It is + slightly different because the kernel's threads are already being + ptraced for system call interception, so gdb can't ptrace them. + However, a mechanism has been added to work around that problem. + + + In order to debug the kernel, you need build it from source. See + ``Compiling the kernel and modules'' for information on doing that. + Make sure that you enable CONFIG_DEBUGSYM and CONFIG_PT_PROXY during + the config. These will compile the kernel with -g, and enable the + ptrace proxy so that gdb works with UML, respectively. + + + + + 1111..11.. SSttaarrttiinngg tthhee kkeerrnneell uunnddeerr ggddbb + + You can have the kernel running under the control of gdb from the + beginning by putting 'debug' on the command line. You will get an + xterm with gdb running inside it. The kernel will send some commands + to gdb which will leave it stopped at the beginning of start_kernel. + At this point, you can get things going with 'next', 'step', or + 'cont'. + + + There is a transcript of a debugging session here <debug- + session.html> , with breakpoints being set in the scheduler and in an + interrupt handler. + 1111..22.. EExxaammiinniinngg sslleeeeppiinngg pprroocceesssseess + + Not every bug is evident in the currently running process. Sometimes, + processes hang in the kernel when they shouldn't because they've + deadlocked on a semaphore or something similar. In this case, when + you ^C gdb and get a backtrace, you will see the idle thread, which + isn't very relevant. + + + What you want is the stack of whatever process is sleeping when it + shouldn't be. You need to figure out which process that is, which is + generally fairly easy. Then you need to get its host process id, + which you can do either by looking at ps on the host or at + task.thread.extern_pid in gdb. + + + Now what you do is this: + + +o detach from the current thread + + + (UML gdb) det + + + + + + +o attach to the thread you are interested in + + + (UML gdb) att <host pid> + + + + + + +o look at its stack and anything else of interest + + + (UML gdb) bt + + + + + Note that you can't do anything at this point that requires that a + process execute, e.g. calling a function + + +o when you're done looking at that process, reattach to the current + thread and continue it + + + (UML gdb) + att 1 + + + + + + + (UML gdb) + c + + + + + Here, specifying any pid which is not the process id of a UML thread + will cause gdb to reattach to the current thread. I commonly use 1, + but any other invalid pid would work. + + + + 1111..33.. RRuunnnniinngg dddddd oonn UUMMLL + + ddd works on UML, but requires a special kludge. The process goes + like this: + + +o Start ddd + + + host% ddd linux + + + + + + +o With ps, get the pid of the gdb that ddd started. You can ask the + gdb to tell you, but for some reason that confuses things and + causes a hang. + + +o run UML with 'debug=parent gdb-pid=<pid>' added to the command line + - it will just sit there after you hit return + + +o type 'att 1' to the ddd gdb and you will see something like + + + 0xa013dc51 in __kill () + + + (gdb) + + + + + + +o At this point, type 'c', UML will boot up, and you can use ddd just + as you do on any other process. + + + + 1111..44.. DDeebbuuggggiinngg mmoodduulleess + + gdb has support for debugging code which is dynamically loaded into + the process. This support is what is needed to debug kernel modules + under UML. + + + Using that support is somewhat complicated. You have to tell gdb what + object file you just loaded into UML and where in memory it is. Then, + it can read the symbol table, and figure out where all the symbols are + from the load address that you provided. It gets more interesting + when you load the module again (i.e. after an rmmod). You have to + tell gdb to forget about all its symbols, including the main UML ones + for some reason, then load then all back in again. + + + There's an easy way and a hard way to do this. The easy way is to use + the umlgdb expect script written by Chandan Kudige. It basically + automates the process for you. + + + First, you must tell it where your modules are. There is a list in + the script that looks like this: + set MODULE_PATHS { + "fat" "/usr/src/uml/linux-2.4.18/fs/fat/fat.o" + "isofs" "/usr/src/uml/linux-2.4.18/fs/isofs/isofs.o" + "minix" "/usr/src/uml/linux-2.4.18/fs/minix/minix.o" + } + + + + + You change that to list the names and paths of the modules that you + are going to debug. Then you run it from the toplevel directory of + your UML pool and it basically tells you what to do: + + + + + ******** GDB pid is 21903 ******** + Start UML as: ./linux <kernel switches> debug gdb-pid=21903 + + + + GNU gdb 5.0rh-5 Red Hat Linux 7.1 + Copyright 2001 Free Software Foundation, Inc. + GDB is free software, covered by the GNU General Public License, and you are + welcome to change it and/or distribute copies of it under certain conditions. + Type "show copying" to see the conditions. + There is absolutely no warranty for GDB. Type "show warranty" for details. + This GDB was configured as "i386-redhat-linux"... + (gdb) b sys_init_module + Breakpoint 1 at 0xa0011923: file module.c, line 349. + (gdb) att 1 + + + + + After you run UML and it sits there doing nothing, you hit return at + the 'att 1' and continue it: + + + Attaching to program: /home/jdike/linux/2.4/um/./linux, process 1 + 0xa00f4221 in __kill () + (UML gdb) c + Continuing. + + + + + At this point, you debug normally. When you insmod something, the + expect magic will kick in and you'll see something like: + + + + + + + + + + + + + + + + + + *** Module hostfs loaded *** + Breakpoint 1, sys_init_module (name_user=0x805abb0 "hostfs", + mod_user=0x8070e00) at module.c:349 + 349 char *name, *n_name, *name_tmp = NULL; + (UML gdb) finish + Run till exit from #0 sys_init_module (name_user=0x805abb0 "hostfs", + mod_user=0x8070e00) at module.c:349 + 0xa00e2e23 in execute_syscall (r=0xa8140284) at syscall_kern.c:411 + 411 else res = EXECUTE_SYSCALL(syscall, regs); + Value returned is $1 = 0 + (UML gdb) + p/x (int)module_list + module_list->size_of_struct + + $2 = 0xa9021054 + (UML gdb) symbol-file ./linux + Load new symbol table from "./linux"? (y or n) y + Reading symbols from ./linux... + done. + (UML gdb) + add-symbol-file /home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o 0xa9021054 + + add symbol table from file "/home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o" at + .text_addr = 0xa9021054 + (y or n) y + + Reading symbols from /home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o... + done. + (UML gdb) p *module_list + $1 = {size_of_struct = 84, next = 0xa0178720, name = 0xa9022de0 "hostfs", + size = 9016, uc = {usecount = {counter = 0}, pad = 0}, flags = 1, + nsyms = 57, ndeps = 0, syms = 0xa9023170, deps = 0x0, refs = 0x0, + init = 0xa90221f0 <init_hostfs>, cleanup = 0xa902222c <exit_hostfs>, + ex_table_start = 0x0, ex_table_end = 0x0, persist_start = 0x0, + persist_end = 0x0, can_unload = 0, runsize = 0, kallsyms_start = 0x0, + kallsyms_end = 0x0, + archdata_start = 0x1b855 <Address 0x1b855 out of bounds>, + archdata_end = 0xe5890000 <Address 0xe5890000 out of bounds>, + kernel_data = 0xf689c35d <Address 0xf689c35d out of bounds>} + >> Finished loading symbols for hostfs ... + + + + + That's the easy way. It's highly recommended. The hard way is + described below in case you're interested in what's going on. + + + Boot the kernel under the debugger and load the module with insmod or + modprobe. With gdb, do: + + + (UML gdb) p module_list + + + + + This is a list of modules that have been loaded into the kernel, with + the most recently loaded module first. Normally, the module you want + is at module_list. If it's not, walk down the next links, looking at + the name fields until find the module you want to debug. Take the + address of that structure, and add module.size_of_struct (which in + 2.4.10 kernels is 96 (0x60)) to it. Gdb can make this hard addition + for you :-): + + + + (UML gdb) + printf "%#x\n", (int)module_list module_list->size_of_struct + + + + + The offset from the module start occasionally changes (before 2.4.0, + it was module.size_of_struct + 4), so it's a good idea to check the + init and cleanup addresses once in a while, as describe below. Now + do: + + + (UML gdb) + add-symbol-file /path/to/module/on/host that_address + + + + + Tell gdb you really want to do it, and you're in business. + + + If there's any doubt that you got the offset right, like breakpoints + appear not to work, or they're appearing in the wrong place, you can + check it by looking at the module structure. The init and cleanup + fields should look like: + + + init = 0x588066b0 <init_hostfs>, cleanup = 0x588066c0 <exit_hostfs> + + + + + with no offsets on the symbol names. If the names are right, but they + are offset, then the offset tells you how much you need to add to the + address you gave to add-symbol-file. + + + When you want to load in a new version of the module, you need to get + gdb to forget about the old one. The only way I've found to do that + is to tell gdb to forget about all symbols that it knows about: + + + (UML gdb) symbol-file + + + + + Then reload the symbols from the kernel binary: + + + (UML gdb) symbol-file /path/to/kernel + + + + + and repeat the process above. You'll also need to re-enable break- + points. They were disabled when you dumped all the symbols because + gdb couldn't figure out where they should go. + + + + 1111..55.. AAttttaacchhiinngg ggddbb ttoo tthhee kkeerrnneell + + If you don't have the kernel running under gdb, you can attach gdb to + it later by sending the tracing thread a SIGUSR1. The first line of + the console output identifies its pid: + tracing thread pid = 20093 + + + + + When you send it the signal: + + + host% kill -USR1 20093 + + + + + you will get an xterm with gdb running in it. + + + If you have the mconsole compiled into UML, then the mconsole client + can be used to start gdb: + + + (mconsole) (mconsole) config gdb=xterm + + + + + will fire up an xterm with gdb running in it. + + + + 1111..66.. UUssiinngg aalltteerrnnaattee ddeebbuuggggeerrss + + UML has support for attaching to an already running debugger rather + than starting gdb itself. This is present in CVS as of 17 Apr 2001. + I sent it to Alan for inclusion in the ac tree, and it will be in my + 2.4.4 release. + + + This is useful when gdb is a subprocess of some UI, such as emacs or + ddd. It can also be used to run debuggers other than gdb on UML. + Below is an example of using strace as an alternate debugger. + + + To do this, you need to get the pid of the debugger and pass it in + with the + + + If you are using gdb under some UI, then tell it to 'att 1', and + you'll find yourself attached to UML. + + + If you are using something other than gdb as your debugger, then + you'll need to get it to do the equivalent of 'att 1' if it doesn't do + it automatically. + + + An example of an alternate debugger is strace. You can strace the + actual kernel as follows: + + +o Run the following in a shell + + + host% + sh -c 'echo pid=$$; echo -n hit return; read x; exec strace -p 1 -o strace.out' + + + + +o Run UML with 'debug' and 'gdb-pid=<pid>' with the pid printed out + by the previous command + + +o Hit return in the shell, and UML will start running, and strace + output will start accumulating in the output file. + + Note that this is different from running + + + host% strace ./linux + + + + + That will strace only the main UML thread, the tracing thread, which + doesn't do any of the actual kernel work. It just oversees the vir- + tual machine. In contrast, using strace as described above will show + you the low-level activity of the virtual machine. + + + + + + 1122.. KKeerrnneell ddeebbuuggggiinngg eexxaammpplleess + + 1122..11.. TThhee ccaassee ooff tthhee hhuunngg ffsscckk + + When booting up the kernel, fsck failed, and dropped me into a shell + to fix things up. I ran fsck -y, which hung: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Setting hostname uml [ OK ] + Checking root filesystem + /dev/fhd0 was not cleanly unmounted, check forced. + Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780. + + /dev/fhd0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY. + (i.e., without -a or -p options) + [ FAILED ] + + *** An error occurred during the file system check. + *** Dropping you to a shell; the system will reboot + *** when you leave the shell. + Give root password for maintenance + (or type Control-D for normal startup): + + [root@uml /root]# fsck -y /dev/fhd0 + fsck -y /dev/fhd0 + Parallelizing fsck version 1.14 (9-Jan-1999) + e2fsck 1.14, 9-Jan-1999 for EXT2 FS 0.5b, 95/08/09 + /dev/fhd0 contains a file system with errors, check forced. + Pass 1: Checking inodes, blocks, and sizes + Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780. Ignore error? yes + + Inode 19780, i_blocks is 1548, should be 540. Fix? yes + + Pass 2: Checking directory structure + Error reading block 49405 (Attempt to read block from filesystem resulted in short read). Ignore error? yes + + Directory inode 11858, block 0, offset 0: directory corrupted + Salvage? yes + + Missing '.' in directory inode 11858. + Fix? yes + + Missing '..' in directory inode 11858. + Fix? yes + + + + + + The standard drill in this sort of situation is to fire up gdb on the + signal thread, which, in this case, was pid 1935. In another window, + I run gdb and attach pid 1935. + + + + + ~/linux/2.3.26/um 1016: gdb linux + GNU gdb 4.17.0.11 with Linux support + Copyright 1998 Free Software Foundation, Inc. + GDB is free software, covered by the GNU General Public License, and you are + welcome to change it and/or distribute copies of it under certain conditions. + Type "show copying" to see the conditions. + There is absolutely no warranty for GDB. Type "show warranty" for details. + This GDB was configured as "i386-redhat-linux"... + + (gdb) att 1935 + Attaching to program `/home/dike/linux/2.3.26/um/linux', Pid 1935 + 0x100756d9 in __wait4 () + + + + + + + Let's see what's currently running: + + + + (gdb) p current_task.pid + $1 = 0 + + + + + + It's the idle thread, which means that fsck went to sleep for some + reason and never woke up. + + + Let's guess that the last process in the process list is fsck: + + + + (gdb) p current_task.prev_task.comm + $13 = "fsck.ext2\000\000\000\000\000\000" + + + + + + It is, so let's see what it thinks it's up to: + + + + (gdb) p current_task.prev_task.thread + $14 = {extern_pid = 1980, tracing = 0, want_tracing = 0, forking = 0, + kernel_stack_page = 0, signal_stack = 1342627840, syscall = {id = 4, args = { + 3, 134973440, 1024, 0, 1024}, have_result = 0, result = 50590720}, + request = {op = 2, u = {exec = {ip = 1350467584, sp = 2952789424}, fork = { + regs = {1350467584, 2952789424, 0 <repeats 15 times>}, sigstack = 0, + pid = 0}, switch_to = 0x507e8000, thread = {proc = 0x507e8000, + arg = 0xaffffdb0, flags = 0, new_pid = 0}, input_request = { + op = 1350467584, fd = -1342177872, proc = 0, pid = 0}}}} + + + + + + The interesting things here are the fact that its .thread.syscall.id + is __NR_write (see the big switch in arch/um/kernel/syscall_kern.c or + the defines in include/asm-um/arch/unistd.h), and that it never + returned. Also, its .request.op is OP_SWITCH (see + arch/um/include/user_util.h). These mean that it went into a write, + and, for some reason, called schedule(). + + + The fact that it never returned from write means that its stack should + be fairly interesting. Its pid is 1980 (.thread.extern_pid). That + process is being ptraced by the signal thread, so it must be detached + before gdb can attach it: + + + + + + + + + + + (gdb) call detach(1980) + + Program received signal SIGSEGV, Segmentation fault. + <function called from gdb> + The program being debugged stopped while in a function called from GDB. + When the function (detach) is done executing, GDB will silently + stop (instead of continuing to evaluate the expression containing + the function call). + (gdb) call detach(1980) + $15 = 0 + + + + + + The first detach segfaults for some reason, and the second one + succeeds. + + + Now I detach from the signal thread, attach to the fsck thread, and + look at its stack: + + + (gdb) det + Detaching from program: /home/dike/linux/2.3.26/um/linux Pid 1935 + (gdb) att 1980 + Attaching to program `/home/dike/linux/2.3.26/um/linux', Pid 1980 + 0x10070451 in __kill () + (gdb) bt + #0 0x10070451 in __kill () + #1 0x10068ccd in usr1_pid (pid=1980) at process.c:30 + #2 0x1006a03f in _switch_to (prev=0x50072000, next=0x507e8000) + at process_kern.c:156 + #3 0x1006a052 in switch_to (prev=0x50072000, next=0x507e8000, last=0x50072000) + at process_kern.c:161 + #4 0x10001d12 in schedule () at sched.c:777 + #5 0x1006a744 in __down (sem=0x507d241c) at semaphore.c:71 + #6 0x1006aa10 in __down_failed () at semaphore.c:157 + #7 0x1006c5d8 in segv_handler (sc=0x5006e940) at trap_user.c:174 + #8 0x1006c5ec in kern_segv_handler (sig=11) at trap_user.c:182 + #9 <signal handler called> + #10 0x10155404 in errno () + #11 0x1006c0aa in segv (address=1342179328, is_write=2) at trap_kern.c:50 + #12 0x1006c5d8 in segv_handler (sc=0x5006eaf8) at trap_user.c:174 + #13 0x1006c5ec in kern_segv_handler (sig=11) at trap_user.c:182 + #14 <signal handler called> + #15 0xc0fd in ?? () + #16 0x10016647 in sys_write (fd=3, + buf=0x80b8800 <Address 0x80b8800 out of bounds>, count=1024) + at read_write.c:159 + #17 0x1006d5b3 in execute_syscall (syscall=4, args=0x5006ef08) + at syscall_kern.c:254 + #18 0x1006af87 in really_do_syscall (sig=12) at syscall_user.c:35 + #19 <signal handler called> + #20 0x400dc8b0 in ?? () + + + + + + The interesting things here are : + + +o There are two segfaults on this stack (frames 9 and 14) + + +o The first faulting address (frame 11) is 0x50000800 + + (gdb) p (void *)1342179328 + $16 = (void *) 0x50000800 + + + + + + The initial faulting address is interesting because it is on the idle + thread's stack. I had been seeing the idle thread segfault for no + apparent reason, and the cause looked like stack corruption. In hopes + of catching the culprit in the act, I had turned off all protections + to that stack while the idle thread wasn't running. This apparently + tripped that trap. + + + However, the more immediate problem is that second segfault and I'm + going to concentrate on that. First, I want to see where the fault + happened, so I have to go look at the sigcontent struct in frame 8: + + + + (gdb) up + #1 0x10068ccd in usr1_pid (pid=1980) at process.c:30 + 30 kill(pid, SIGUSR1); + (gdb) + #2 0x1006a03f in _switch_to (prev=0x50072000, next=0x507e8000) + at process_kern.c:156 + 156 usr1_pid(getpid()); + (gdb) + #3 0x1006a052 in switch_to (prev=0x50072000, next=0x507e8000, last=0x50072000) + at process_kern.c:161 + 161 _switch_to(prev, next); + (gdb) + #4 0x10001d12 in schedule () at sched.c:777 + 777 switch_to(prev, next, prev); + (gdb) + #5 0x1006a744 in __down (sem=0x507d241c) at semaphore.c:71 + 71 schedule(); + (gdb) + #6 0x1006aa10 in __down_failed () at semaphore.c:157 + 157 } + (gdb) + #7 0x1006c5d8 in segv_handler (sc=0x5006e940) at trap_user.c:174 + 174 segv(sc->cr2, sc->err & 2); + (gdb) + #8 0x1006c5ec in kern_segv_handler (sig=11) at trap_user.c:182 + 182 segv_handler(sc); + (gdb) p *sc + Cannot access memory at address 0x0. + + + + + That's not very useful, so I'll try a more manual method: + + + (gdb) p *((struct sigcontext *) (&sig + 1)) + $19 = {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, + __dsh = 0, edi = 1342179328, esi = 1350378548, ebp = 1342630440, + esp = 1342630420, ebx = 1348150624, edx = 1280, ecx = 0, eax = 0, + trapno = 14, err = 4, eip = 268480945, cs = 35, __csh = 0, eflags = 66118, + esp_at_signal = 1342630420, ss = 43, __ssh = 0, fpstate = 0x0, oldmask = 0, + cr2 = 1280} + + + + The ip is in handle_mm_fault: + + + (gdb) p (void *)268480945 + $20 = (void *) 0x1000b1b1 + (gdb) i sym $20 + handle_mm_fault + 57 in section .text + + + + + + Specifically, it's in pte_alloc: + + + (gdb) i line *$20 + Line 124 of "/home/dike/linux/2.3.26/um/include/asm/pgalloc.h" + starts at address 0x1000b1b1 <handle_mm_fault+57> + and ends at 0x1000b1b7 <handle_mm_fault+63>. + + + + + + To find where in handle_mm_fault this is, I'll jump forward in the + code until I see an address in that procedure: + + + + (gdb) i line *0x1000b1c0 + Line 126 of "/home/dike/linux/2.3.26/um/include/asm/pgalloc.h" + starts at address 0x1000b1b7 <handle_mm_fault+63> + and ends at 0x1000b1c3 <handle_mm_fault+75>. + (gdb) i line *0x1000b1d0 + Line 131 of "/home/dike/linux/2.3.26/um/include/asm/pgalloc.h" + starts at address 0x1000b1d0 <handle_mm_fault+88> + and ends at 0x1000b1da <handle_mm_fault+98>. + (gdb) i line *0x1000b1e0 + Line 61 of "/home/dike/linux/2.3.26/um/include/asm/pgalloc.h" + starts at address 0x1000b1da <handle_mm_fault+98> + and ends at 0x1000b1e1 <handle_mm_fault+105>. + (gdb) i line *0x1000b1f0 + Line 134 of "/home/dike/linux/2.3.26/um/include/asm/pgalloc.h" + starts at address 0x1000b1f0 <handle_mm_fault+120> + and ends at 0x1000b200 <handle_mm_fault+136>. + (gdb) i line *0x1000b200 + Line 135 of "/home/dike/linux/2.3.26/um/include/asm/pgalloc.h" + starts at address 0x1000b200 <handle_mm_fault+136> + and ends at 0x1000b208 <handle_mm_fault+144>. + (gdb) i line *0x1000b210 + Line 139 of "/home/dike/linux/2.3.26/um/include/asm/pgalloc.h" + starts at address 0x1000b210 <handle_mm_fault+152> + and ends at 0x1000b219 <handle_mm_fault+161>. + (gdb) i line *0x1000b220 + Line 1168 of "memory.c" starts at address 0x1000b21e <handle_mm_fault+166> + and ends at 0x1000b222 <handle_mm_fault+170>. + + + + + + Something is apparently wrong with the page tables or vma_structs, so + lets go back to frame 11 and have a look at them: + + + + #11 0x1006c0aa in segv (address=1342179328, is_write=2) at trap_kern.c:50 + 50 handle_mm_fault(current, vma, address, is_write); + (gdb) call pgd_offset_proc(vma->vm_mm, address) + $22 = (pgd_t *) 0x80a548c + + + + + + That's pretty bogus. Page tables aren't supposed to be in process + text or data areas. Let's see what's in the vma: + + + (gdb) p *vma + $23 = {vm_mm = 0x507d2434, vm_start = 0, vm_end = 134512640, + vm_next = 0x80a4f8c, vm_page_prot = {pgprot = 0}, vm_flags = 31200, + vm_avl_height = 2058, vm_avl_left = 0x80a8c94, vm_avl_right = 0x80d1000, + vm_next_share = 0xaffffdb0, vm_pprev_share = 0xaffffe63, + vm_ops = 0xaffffe7a, vm_pgoff = 2952789626, vm_file = 0xafffffec, + vm_private_data = 0x62} + (gdb) p *vma.vm_mm + $24 = {mmap = 0x507d2434, mmap_avl = 0x0, mmap_cache = 0x8048000, + pgd = 0x80a4f8c, mm_users = {counter = 0}, mm_count = {counter = 134904288}, + map_count = 134909076, mmap_sem = {count = {counter = 135073792}, + sleepers = -1342177872, wait = {lock = <optimized out or zero length>, + task_list = {next = 0xaffffe63, prev = 0xaffffe7a}, + __magic = -1342177670, __creator = -1342177300}, __magic = 98}, + page_table_lock = {}, context = 138, start_code = 0, end_code = 0, + start_data = 0, end_data = 0, start_brk = 0, brk = 0, start_stack = 0, + arg_start = 0, arg_end = 0, env_start = 0, env_end = 0, rss = 1350381536, + total_vm = 0, locked_vm = 0, def_flags = 0, cpu_vm_mask = 0, swap_cnt = 0, + swap_address = 0, segments = 0x0} + + + + + + This also pretty bogus. With all of the 0x80xxxxx and 0xaffffxxx + addresses, this is looking like a stack was plonked down on top of + these structures. Maybe it's a stack overflow from the next page: + + + + (gdb) p vma + $25 = (struct vm_area_struct *) 0x507d2434 + + + + + + That's towards the lower quarter of the page, so that would have to + have been pretty heavy stack overflow: + + + + + + + + + + + + + + + (gdb) x/100x $25 + 0x507d2434: 0x507d2434 0x00000000 0x08048000 0x080a4f8c + 0x507d2444: 0x00000000 0x080a79e0 0x080a8c94 0x080d1000 + 0x507d2454: 0xaffffdb0 0xaffffe63 0xaffffe7a 0xaffffe7a + 0x507d2464: 0xafffffec 0x00000062 0x0000008a 0x00000000 + 0x507d2474: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2484: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2494: 0x00000000 0x00000000 0x507d2fe0 0x00000000 + 0x507d24a4: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d24b4: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d24c4: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d24d4: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d24e4: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d24f4: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2504: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2514: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2524: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2534: 0x00000000 0x00000000 0x507d25dc 0x00000000 + 0x507d2544: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2554: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2564: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2574: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2584: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d2594: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d25a4: 0x00000000 0x00000000 0x00000000 0x00000000 + 0x507d25b4: 0x00000000 0x00000000 0x00000000 0x00000000 + + + + + + It's not stack overflow. The only "stack-like" piece of this data is + the vma_struct itself. + + + At this point, I don't see any avenues to pursue, so I just have to + admit that I have no idea what's going on. What I will do, though, is + stick a trap on the segfault handler which will stop if it sees any + writes to the idle thread's stack. That was the thing that happened + first, and it may be that if I can catch it immediately, what's going + on will be somewhat clearer. + + + 1122..22.. EEppiissooddee 22:: TThhee ccaassee ooff tthhee hhuunngg ffsscckk + + After setting a trap in the SEGV handler for accesses to the signal + thread's stack, I reran the kernel. + + + fsck hung again, this time by hitting the trap: + + + + + + + + + + + + + + + + + Setting hostname uml [ OK ] + Checking root filesystem + /dev/fhd0 contains a file system with errors, check forced. + Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780. + + /dev/fhd0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY. + (i.e., without -a or -p options) + [ FAILED ] + + *** An error occurred during the file system check. + *** Dropping you to a shell; the system will reboot + *** when you leave the shell. + Give root password for maintenance + (or type Control-D for normal startup): + + [root@uml /root]# fsck -y /dev/fhd0 + fsck -y /dev/fhd0 + Parallelizing fsck version 1.14 (9-Jan-1999) + e2fsck 1.14, 9-Jan-1999 for EXT2 FS 0.5b, 95/08/09 + /dev/fhd0 contains a file system with errors, check forced. + Pass 1: Checking inodes, blocks, and sizes + Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780. Ignore error? yes + + Pass 2: Checking directory structure + Error reading block 49405 (Attempt to read block from filesystem resulted in short read). Ignore error? yes + + Directory inode 11858, block 0, offset 0: directory corrupted + Salvage? yes + + Missing '.' in directory inode 11858. + Fix? yes + + Missing '..' in directory inode 11858. + Fix? yes + + Untested (4127) [100fe44c]: trap_kern.c line 31 + + + + + + I need to get the signal thread to detach from pid 4127 so that I can + attach to it with gdb. This is done by sending it a SIGUSR1, which is + caught by the signal thread, which detaches the process: + + + kill -USR1 4127 + + + + + + Now I can run gdb on it: + + + + + + + + + + + + + + ~/linux/2.3.26/um 1034: gdb linux + GNU gdb 4.17.0.11 with Linux support + Copyright 1998 Free Software Foundation, Inc. + GDB is free software, covered by the GNU General Public License, and you are + welcome to change it and/or distribute copies of it under certain conditions. + Type "show copying" to see the conditions. + There is absolutely no warranty for GDB. Type "show warranty" for details. + This GDB was configured as "i386-redhat-linux"... + (gdb) att 4127 + Attaching to program `/home/dike/linux/2.3.26/um/linux', Pid 4127 + 0x10075891 in __libc_nanosleep () + + + + + + The backtrace shows that it was in a write and that the fault address + (address in frame 3) is 0x50000800, which is right in the middle of + the signal thread's stack page: + + + (gdb) bt + #0 0x10075891 in __libc_nanosleep () + #1 0x1007584d in __sleep (seconds=1000000) + at ../sysdeps/unix/sysv/linux/sleep.c:78 + #2 0x1006ce9a in stop () at user_util.c:191 + #3 0x1006bf88 in segv (address=1342179328, is_write=2) at trap_kern.c:31 + #4 0x1006c628 in segv_handler (sc=0x5006eaf8) at trap_user.c:174 + #5 0x1006c63c in kern_segv_handler (sig=11) at trap_user.c:182 + #6 <signal handler called> + #7 0xc0fd in ?? () + #8 0x10016647 in sys_write (fd=3, buf=0x80b8800 "R.", count=1024) + at read_write.c:159 + #9 0x1006d603 in execute_syscall (syscall=4, args=0x5006ef08) + at syscall_kern.c:254 + #10 0x1006af87 in really_do_syscall (sig=12) at syscall_user.c:35 + #11 <signal handler called> + #12 0x400dc8b0 in ?? () + #13 <signal handler called> + #14 0x400dc8b0 in ?? () + #15 0x80545fd in ?? () + #16 0x804daae in ?? () + #17 0x8054334 in ?? () + #18 0x804d23e in ?? () + #19 0x8049632 in ?? () + #20 0x80491d2 in ?? () + #21 0x80596b5 in ?? () + (gdb) p (void *)1342179328 + $3 = (void *) 0x50000800 + + + + + + Going up the stack to the segv_handler frame and looking at where in + the code the access happened shows that it happened near line 110 of + block_dev.c: + + + + + + + + + + (gdb) up + #1 0x1007584d in __sleep (seconds=1000000) + at ../sysdeps/unix/sysv/linux/sleep.c:78 + ../sysdeps/unix/sysv/linux/sleep.c:78: No such file or directory. + (gdb) + #2 0x1006ce9a in stop () at user_util.c:191 + 191 while(1) sleep(1000000); + (gdb) + #3 0x1006bf88 in segv (address=1342179328, is_write=2) at trap_kern.c:31 + 31 KERN_UNTESTED(); + (gdb) + #4 0x1006c628 in segv_handler (sc=0x5006eaf8) at trap_user.c:174 + 174 segv(sc->cr2, sc->err & 2); + (gdb) p *sc + $1 = {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, + __dsh = 0, edi = 1342179328, esi = 134973440, ebp = 1342631484, + esp = 1342630864, ebx = 256, edx = 0, ecx = 256, eax = 1024, trapno = 14, + err = 6, eip = 268550834, cs = 35, __csh = 0, eflags = 66070, + esp_at_signal = 1342630864, ss = 43, __ssh = 0, fpstate = 0x0, oldmask = 0, + cr2 = 1342179328} + (gdb) p (void *)268550834 + $2 = (void *) 0x1001c2b2 + (gdb) i sym $2 + block_write + 1090 in section .text + (gdb) i line *$2 + Line 209 of "/home/dike/linux/2.3.26/um/include/asm/arch/string.h" + starts at address 0x1001c2a1 <block_write+1073> + and ends at 0x1001c2bf <block_write+1103>. + (gdb) i line *0x1001c2c0 + Line 110 of "block_dev.c" starts at address 0x1001c2bf <block_write+1103> + and ends at 0x1001c2e3 <block_write+1139>. + + + + + + Looking at the source shows that the fault happened during a call to + copy_to_user to copy the data into the kernel: + + + 107 count -= chars; + 108 copy_from_user(p,buf,chars); + 109 p += chars; + 110 buf += chars; + + + + + + p is the pointer which must contain 0x50000800, since buf contains + 0x80b8800 (frame 8 above). It is defined as: + + + p = offset + bh->b_data; + + + + + + I need to figure out what bh is, and it just so happens that bh is + passed as an argument to mark_buffer_uptodate and mark_buffer_dirty a + few lines later, so I do a little disassembly: + + + + + (gdb) disas 0x1001c2bf 0x1001c2e0 + Dump of assembler code from 0x1001c2bf to 0x1001c2d0: + 0x1001c2bf <block_write+1103>: addl %eax,0xc(%ebp) + 0x1001c2c2 <block_write+1106>: movl 0xfffffdd4(%ebp),%edx + 0x1001c2c8 <block_write+1112>: btsl $0x0,0x18(%edx) + 0x1001c2cd <block_write+1117>: btsl $0x1,0x18(%edx) + 0x1001c2d2 <block_write+1122>: sbbl %ecx,%ecx + 0x1001c2d4 <block_write+1124>: testl %ecx,%ecx + 0x1001c2d6 <block_write+1126>: jne 0x1001c2e3 <block_write+1139> + 0x1001c2d8 <block_write+1128>: pushl $0x0 + 0x1001c2da <block_write+1130>: pushl %edx + 0x1001c2db <block_write+1131>: call 0x1001819c <__mark_buffer_dirty> + End of assembler dump. + + + + + + At that point, bh is in %edx (address 0x1001c2da), which is calculated + at 0x1001c2c2 as %ebp + 0xfffffdd4, so I figure exactly what that is, + taking %ebp from the sigcontext_struct above: + + + (gdb) p (void *)1342631484 + $5 = (void *) 0x5006ee3c + (gdb) p 0x5006ee3c+0xfffffdd4 + $6 = 1342630928 + (gdb) p (void *)$6 + $7 = (void *) 0x5006ec10 + (gdb) p *((void **)$7) + $8 = (void *) 0x50100200 + + + + + + Now, I look at the structure to see what's in it, and particularly, + what its b_data field contains: + + + (gdb) p *((struct buffer_head *)0x50100200) + $13 = {b_next = 0x50289380, b_blocknr = 49405, b_size = 1024, b_list = 0, + b_dev = 15872, b_count = {counter = 1}, b_rdev = 15872, b_state = 24, + b_flushtime = 0, b_next_free = 0x501001a0, b_prev_free = 0x50100260, + b_this_page = 0x501001a0, b_reqnext = 0x0, b_pprev = 0x507fcf58, + b_data = 0x50000800 "", b_page = 0x50004000, + b_end_io = 0x10017f60 <end_buffer_io_sync>, b_dev_id = 0x0, + b_rsector = 98810, b_wait = {lock = <optimized out or zero length>, + task_list = {next = 0x50100248, prev = 0x50100248}, __magic = 1343226448, + __creator = 0}, b_kiobuf = 0x0} + + + + + + The b_data field is indeed 0x50000800, so the question becomes how + that happened. The rest of the structure looks fine, so this probably + is not a case of data corruption. It happened on purpose somehow. + + + The b_page field is a pointer to the page_struct representing the + 0x50000000 page. Looking at it shows the kernel's idea of the state + of that page: + + + + (gdb) p *$13.b_page + $17 = {list = {next = 0x50004a5c, prev = 0x100c5174}, mapping = 0x0, + index = 0, next_hash = 0x0, count = {counter = 1}, flags = 132, lru = { + next = 0x50008460, prev = 0x50019350}, wait = { + lock = <optimized out or zero length>, task_list = {next = 0x50004024, + prev = 0x50004024}, __magic = 1342193708, __creator = 0}, + pprev_hash = 0x0, buffers = 0x501002c0, virtual = 1342177280, + zone = 0x100c5160} + + + + + + Some sanity-checking: the virtual field shows the "virtual" address of + this page, which in this kernel is the same as its "physical" address, + and the page_struct itself should be mem_map[0], since it represents + the first page of memory: + + + + (gdb) p (void *)1342177280 + $18 = (void *) 0x50000000 + (gdb) p mem_map + $19 = (mem_map_t *) 0x50004000 + + + + + + These check out fine. + + + Now to check out the page_struct itself. In particular, the flags + field shows whether the page is considered free or not: + + + (gdb) p (void *)132 + $21 = (void *) 0x84 + + + + + + The "reserved" bit is the high bit, which is definitely not set, so + the kernel considers the signal stack page to be free and available to + be used. + + + At this point, I jump to conclusions and start looking at my early + boot code, because that's where that page is supposed to be reserved. + + + In my setup_arch procedure, I have the following code which looks just + fine: + + + + bootmap_size = init_bootmem(start_pfn, end_pfn - start_pfn); + free_bootmem(__pa(low_physmem) + bootmap_size, high_physmem - low_physmem); + + + + + + Two stack pages have already been allocated, and low_physmem points to + the third page, which is the beginning of free memory. + The init_bootmem call declares the entire memory to the boot memory + manager, which marks it all reserved. The free_bootmem call frees up + all of it, except for the first two pages. This looks correct to me. + + + So, I decide to see init_bootmem run and make sure that it is marking + those first two pages as reserved. I never get that far. + + + Stepping into init_bootmem, and looking at bootmem_map before looking + at what it contains shows the following: + + + + (gdb) p bootmem_map + $3 = (void *) 0x50000000 + + + + + + Aha! The light dawns. That first page is doing double duty as a + stack and as the boot memory map. The last thing that the boot memory + manager does is to free the pages used by its memory map, so this page + is getting freed even its marked as reserved. + + + The fix was to initialize the boot memory manager before allocating + those two stack pages, and then allocate them through the boot memory + manager. After doing this, and fixing a couple of subsequent buglets, + the stack corruption problem disappeared. + + + + + + 1133.. WWhhaatt ttoo ddoo wwhheenn UUMMLL ddooeessnn''tt wwoorrkk + + + + + 1133..11.. SSttrraannggee ccoommppiillaattiioonn eerrrroorrss wwhheenn yyoouu bbuuiilldd ffrroomm ssoouurrccee + + As of test11, it is necessary to have "ARCH=um" in the environment or + on the make command line for all steps in building UML, including + clean, distclean, or mrproper, config, menuconfig, or xconfig, dep, + and linux. If you forget for any of them, the i386 build seems to + contaminate the UML build. If this happens, start from scratch with + + + host% + make mrproper ARCH=um + + + + + and repeat the build process with ARCH=um on all the steps. + + + See ``Compiling the kernel and modules'' for more details. + + + Another cause of strange compilation errors is building UML in + /usr/src/linux. If you do this, the first thing you need to do is + clean up the mess you made. The /usr/src/linux/asm link will now + point to /usr/src/linux/asm-um. Make it point back to + /usr/src/linux/asm-i386. Then, move your UML pool someplace else and + build it there. Also see below, where a more specific set of symptoms + is described. + + + + 1133..22.. UUMMLL hhaannggss oonn bboooott aafftteerr mmoouunnttiinngg ddeevvffss + + The boot looks like this: + + + VFS: Mounted root (ext2 filesystem) readonly. + Mounted devfs on /dev + + + + + You're probably running a recent distribution on an old machine. I + saw this with the RH7.1 filesystem running on a Pentium. The shared + library loader, ld.so, was executing an instruction (cmove) which the + Pentium didn't support. That instruction was apparently added later. + If you run UML under the debugger, you'll see the hang caused by one + instruction causing an infinite SIGILL stream. + + + The fix is to boot UML on an older filesystem. + + + + 1133..33.. AA vvaarriieettyy ooff ppaanniiccss aanndd hhaannggss wwiitthh //ttmmpp oonn aa rreeiisseerrffss ffiilleessyyss-- + tteemm + + I saw this on reiserfs 3.5.21 and it seems to be fixed in 3.5.27. + Panics preceded by + + + Detaching pid nnnn + + + + are diagnostic of this problem. This is a reiserfs bug which causes a + thread to occasionally read stale data from a mmapped page shared with + another thread. The fix is to upgrade the filesystem or to have /tmp + be an ext2 filesystem. + + + + 1133..44.. TThhee ccoommppiillee ffaaiillss wwiitthh eerrrroorrss aabboouutt ccoonnfflliiccttiinngg ttyyppeess ffoorr + ''ooppeenn'',, ''dduupp'',, aanndd ''wwaaiittppiidd'' + + This happens when you build in /usr/src/linux. The UML build makes + the include/asm link point to include/asm-um. /usr/include/asm points + to /usr/src/linux/include/asm, so when that link gets moved, files + which need to include the asm-i386 versions of headers get the + incompatible asm-um versions. The fix is to move the include/asm link + back to include/asm-i386 and to do UML builds someplace else. + + + + 1133..55.. UUMMLL ddooeessnn''tt wwoorrkk wwhheenn //ttmmpp iiss aann NNFFSS ffiilleessyysstteemm + + This seems to be a similar situation with the resierfs problem above. + Some versions of NFS seems not to handle mmap correctly, which UML + depends on. The workaround is have /tmp be non-NFS directory. + + + 1133..66.. UUMMLL hhaannggss oonn bboooott wwhheenn ccoommppiilleedd wwiitthh ggpprrooff ssuuppppoorrtt + + If you build UML with gprof support and, early in the boot, it does + this + + + kernel BUG at page_alloc.c:100! + + + + + you have a buggy gcc. You can work around the problem by removing + UM_FASTCALL from CFLAGS in arch/um/Makefile-i386. This will open up + another bug, but that one is fairly hard to reproduce. + + + + 1133..77.. ssyyssllooggdd ddiieess wwiitthh aa SSIIGGTTEERRMM oonn ssttaarrttuupp + + The exact boot error depends on the distribution that you're booting, + but Debian produces this: + + + /etc/rc2.d/S10sysklogd: line 49: 93 Terminated + start-stop-daemon --start --quiet --exec /sbin/syslogd -- $SYSLOGD + + + + + This is a syslogd bug. There's a race between a parent process + installing a signal handler and its child sending the signal. See + this uml-devel post <http://www.geocrawler.com/lists/3/Source- + Forge/709/0/6612801> for the details. + + + + 1133..88.. TTUUNN//TTAAPP nneettwwoorrkkiinngg ddooeessnn''tt wwoorrkk oonn aa 22..44 hhoosstt + + There are a couple of problems which were + <http://www.geocrawler.com/lists/3/SourceForge/597/0/> name="pointed + out"> by Tim Robinson <timro at trkr dot net> + + +o It doesn't work on hosts running 2.4.7 (or thereabouts) or earlier. + The fix is to upgrade to something more recent and then read the + next item. + + +o If you see + + + File descriptor in bad state + + + + when you bring up the device inside UML, you have a header mismatch + between the original kernel and the upgraded one. Make /usr/src/linux + point at the new headers. This will only be a problem if you build + uml_net yourself. + + + + 1133..99.. YYoouu ccaann nneettwwoorrkk ttoo tthhee hhoosstt bbuutt nnoott ttoo ootthheerr mmaacchhiinneess oonn tthhee + nneett + + If you can connect to the host, and the host can connect to UML, but + you can not connect to any other machines, then you may need to enable + IP Masquerading on the host. Usually this is only experienced when + using private IP addresses (192.168.x.x or 10.x.x.x) for host/UML + networking, rather than the public address space that your host is + connected to. UML does not enable IP Masquerading, so you will need + to create a static rule to enable it: + + + host% + iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE + + + + + Replace eth0 with the interface that you use to talk to the rest of + the world. + + + Documentation on IP Masquerading, and SNAT, can be found at + www.netfilter.org <http://www.netfilter.org> . + + + If you can reach the local net, but not the outside Internet, then + that is usually a routing problem. The UML needs a default route: + + + UML# + route add default gw gateway IP + + + + + The gateway IP can be any machine on the local net that knows how to + reach the outside world. Usually, this is the host or the local net- + work's gateway. + + + Occasionally, we hear from someone who can reach some machines, but + not others on the same net, or who can reach some ports on other + machines, but not others. These are usually caused by strange + firewalling somewhere between the UML and the other box. You track + this down by running tcpdump on every interface the packets travel + over and see where they disappear. When you find a machine that takes + the packets in, but does not send them onward, that's the culprit. + + + + 1133..1100.. II hhaavvee nnoo rroooott aanndd II wwaanntt ttoo ssccrreeaamm + + Thanks to Birgit Wahlich for telling me about this strange one. It + turns out that there's a limit of six environment variables on the + kernel command line. When that limit is reached or exceeded, argument + processing stops, which means that the 'root=' argument that UML + usually adds is not seen. So, the filesystem has no idea what the + root device is, so it panics. + + + The fix is to put less stuff on the command line. Glomming all your + setup variables into one is probably the best way to go. + + + + 1133..1111.. UUMMLL bbuuiilldd ccoonnfflliicctt bbeettwweeeenn ppttrraaccee..hh aanndd uuccoonntteexxtt..hh + + On some older systems, /usr/include/asm/ptrace.h and + /usr/include/sys/ucontext.h define the same names. So, when they're + included together, the defines from one completely mess up the parsing + of the other, producing errors like: + /usr/include/sys/ucontext.h:47: parse error before + `10' + + + + + plus a pile of warnings. + + + This is a libc botch, which has since been fixed, and I don't see any + way around it besides upgrading. + + + + 1133..1122.. TThhee UUMMLL BBooggooMMiippss iiss eexxaaccttllyy hhaallff tthhee hhoosstt''ss BBooggooMMiippss + + On i386 kernels, there are two ways of running the loop that is used + to calculate the BogoMips rating, using the TSC if it's there or using + a one-instruction loop. The TSC produces twice the BogoMips as the + loop. UML uses the loop, since it has nothing resembling a TSC, and + will get almost exactly the same BogoMips as a host using the loop. + However, on a host with a TSC, its BogoMips will be double the loop + BogoMips, and therefore double the UML BogoMips. + + + + 1133..1133.. WWhheenn yyoouu rruunn UUMMLL,, iitt iimmmmeeddiiaatteellyy sseeggffaauullttss + + If the host is configured with the 2G/2G address space split, that's + why. See ``UML on 2G/2G hosts'' for the details on getting UML to + run on your host. + + + + 1133..1144.. xxtteerrmmss aappppeeaarr,, tthheenn iimmmmeeddiiaatteellyy ddiissaappppeeaarr + + If you're running an up to date kernel with an old release of + uml_utilities, the port-helper program will not work properly, so + xterms will exit straight after they appear. The solution is to + upgrade to the latest release of uml_utilities. Usually this problem + occurs when you have installed a packaged release of UML then compiled + your own development kernel without upgrading the uml_utilities from + the source distribution. + + + + 1133..1155.. AAnnyy ootthheerr ppaanniicc,, hhaanngg,, oorr ssttrraannggee bbeehhaavviioorr + + If you're seeing truly strange behavior, such as hangs or panics that + happen in random places, or you try running the debugger to see what's + happening and it acts strangely, then it could be a problem in the + host kernel. If you're not running a stock Linus or -ac kernel, then + try that. An early version of the preemption patch and a 2.4.10 SuSE + kernel have caused very strange problems in UML. + + + Otherwise, let me know about it. Send a message to one of the UML + mailing lists - either the developer list - user-mode-linux-devel at + lists dot sourceforge dot net (subscription info) or the user list - + user-mode-linux-user at lists dot sourceforge do net (subscription + info), whichever you prefer. Don't assume that everyone knows about + it and that a fix is imminent. + + + If you want to be super-helpful, read ``Diagnosing Problems'' and + follow the instructions contained therein. + 1144.. DDiiaaggnnoossiinngg PPrroobblleemmss + + + If you get UML to crash, hang, or otherwise misbehave, you should + report this on one of the project mailing lists, either the developer + list - user-mode-linux-devel at lists dot sourceforge dot net + (subscription info) or the user list - user-mode-linux-user at lists + dot sourceforge dot net (subscription info). When you do, it is + likely that I will want more information. So, it would be helpful to + read the stuff below, do whatever is applicable in your case, and + report the results to the list. + + + For any diagnosis, you're going to need to build a debugging kernel. + The binaries from this site aren't debuggable. If you haven't done + this before, read about ``Compiling the kernel and modules'' and + ``Kernel debugging'' UML first. + + + 1144..11.. CCaassee 11 :: NNoorrmmaall kkeerrnneell ppaanniiccss + + The most common case is for a normal thread to panic. To debug this, + you will need to run it under the debugger (add 'debug' to the command + line). An xterm will start up with gdb running inside it. Continue + it when it stops in start_kernel and make it crash. Now ^C gdb and + + + If the panic was a "Kernel mode fault", then there will be a segv + frame on the stack and I'm going to want some more information. The + stack might look something like this: + + + (UML gdb) backtrace + #0 0x1009bf76 in __sigprocmask (how=1, set=0x5f347940, oset=0x0) + at ../sysdeps/unix/sysv/linux/sigprocmask.c:49 + #1 0x10091411 in change_sig (signal=10, on=1) at process.c:218 + #2 0x10094785 in timer_handler (sig=26) at time_kern.c:32 + #3 0x1009bf38 in __restore () + at ../sysdeps/unix/sysv/linux/i386/sigaction.c:125 + #4 0x1009534c in segv (address=8, ip=268849158, is_write=2, is_user=0) + at trap_kern.c:66 + #5 0x10095c04 in segv_handler (sig=11) at trap_user.c:285 + #6 0x1009bf38 in __restore () + + + + + I'm going to want to see the symbol and line information for the value + of ip in the segv frame. In this case, you would do the following: + + + (UML gdb) i sym 268849158 + + + + + and + + + (UML gdb) i line *268849158 + + + + + The reason for this is the __restore frame right above the segv_han- + dler frame is hiding the frame that actually segfaulted. So, I have + to get that information from the faulting ip. + + + 1144..22.. CCaassee 22 :: TTrraacciinngg tthhrreeaadd ppaanniiccss + + The less common and more painful case is when the tracing thread + panics. In this case, the kernel debugger will be useless because it + needs a healthy tracing thread in order to work. The first thing to + do is get a backtrace from the tracing thread. This is done by + figuring out what its pid is, firing up gdb, and attaching it to that + pid. You can figure out the tracing thread pid by looking at the + first line of the console output, which will look like this: + + + tracing thread pid = 15851 + + + + + or by running ps on the host and finding the line that looks like + this: + + + jdike 15851 4.5 0.4 132568 1104 pts/0 S 21:34 0:05 ./linux [(tracing thread)] + + + + + If the panic was 'segfault in signals', then follow the instructions + above for collecting information about the location of the seg fault. + + + If the tracing thread flaked out all by itself, then send that + backtrace in and wait for our crack debugging team to fix the problem. + + + 1144..33.. CCaassee 33 :: TTrraacciinngg tthhrreeaadd ppaanniiccss ccaauusseedd bbyy ootthheerr tthhrreeaaddss + + However, there are cases where the misbehavior of another thread + caused the problem. The most common panic of this type is: + + + wait_for_stop failed to wait for <pid> to stop with <signal number> + + + + + In this case, you'll need to get a backtrace from the process men- + tioned in the panic, which is complicated by the fact that the kernel + debugger is defunct and without some fancy footwork, another gdb can't + attach to it. So, this is how the fancy footwork goes: + + In a shell: + + + host% kill -STOP pid + + + + + Run gdb on the tracing thread as described in case 2 and do: + + + (host gdb) call detach(pid) + + + If you get a segfault, do it again. It always works the second time. + + Detach from the tracing thread and attach to that other thread: + + + (host gdb) detach + + + + + + + (host gdb) attach pid + + + + + If gdb hangs when attaching to that process, go back to a shell and + do: + + + host% + kill -CONT pid + + + + + And then get the backtrace: + + + (host gdb) backtrace + + + + + + 1144..44.. CCaassee 44 :: HHaannggss + + Hangs seem to be fairly rare, but they sometimes happen. When a hang + happens, we need a backtrace from the offending process. Run the + kernel debugger as described in case 1 and get a backtrace. If the + current process is not the idle thread, then send in the backtrace. + You can tell that it's the idle thread if the stack looks like this: + + + #0 0x100b1401 in __libc_nanosleep () + #1 0x100a2885 in idle_sleep (secs=10) at time.c:122 + #2 0x100a546f in do_idle () at process_kern.c:445 + #3 0x100a5508 in cpu_idle () at process_kern.c:471 + #4 0x100ec18f in start_kernel () at init/main.c:592 + #5 0x100a3e10 in start_kernel_proc (unused=0x0) at um_arch.c:71 + #6 0x100a383f in signal_tramp (arg=0x100a3dd8) at trap_user.c:50 + + + + + If this is the case, then some other process is at fault, and went to + sleep when it shouldn't have. Run ps on the host and figure out which + process should not have gone to sleep and stayed asleep. Then attach + to it with gdb and get a backtrace as described in case 3. + + + + + + + 1155.. TThhaannkkss + + + A number of people have helped this project in various ways, and this + page gives recognition where recognition is due. + + + If you're listed here and you would prefer a real link on your name, + or no link at all, instead of the despammed email address pseudo-link, + let me know. + + + If you're not listed here and you think maybe you should be, please + let me know that as well. I try to get everyone, but sometimes my + bookkeeping lapses and I forget about contributions. + + + 1155..11.. CCooddee aanndd DDooccuummeennttaattiioonn + + Rusty Russell <rusty at linuxcare.com.au> - + + +o wrote the HOWTO <http://user-mode- + linux.sourceforge.net/UserModeLinux-HOWTO.html> + + +o prodded me into making this project official and putting it on + SourceForge + + +o came up with the way cool UML logo <http://user-mode- + linux.sourceforge.net/uml-small.png> + + +o redid the config process + + + Peter Moulder <reiter at netspace.net.au> - Fixed my config and build + processes, and added some useful code to the block driver + + + Bill Stearns <wstearns at pobox.com> - + + +o HOWTO updates + + +o lots of bug reports + + +o lots of testing + + +o dedicated a box (uml.ists.dartmouth.edu) to support UML development + + +o wrote the mkrootfs script, which allows bootable filesystems of + RPM-based distributions to be cranked out + + +o cranked out a large number of filesystems with said script + + + Jim Leu <jleu at mindspring.com> - Wrote the virtual ethernet driver + and associated usermode tools + + Lars Brinkhoff <http://lars.nocrew.org/> - Contributed the ptrace + proxy from his own project <http://a386.nocrew.org/> to allow easier + kernel debugging + + + Andrea Arcangeli <andrea at suse.de> - Redid some of the early boot + code so that it would work on machines with Large File Support + + + Chris Emerson <http://www.chiark.greenend.org.uk/~cemerson/> - Did + the first UML port to Linux/ppc + + + Harald Welte <laforge at gnumonks.org> - Wrote the multicast + transport for the network driver + + + Jorgen Cederlof - Added special file support to hostfs + + + Greg Lonnon <glonnon at ridgerun dot com> - Changed the ubd driver + to allow it to layer a COW file on a shared read-only filesystem and + wrote the iomem emulation support + + + Henrik Nordstrom <http://hem.passagen.se/hno/> - Provided a variety + of patches, fixes, and clues + + + Lennert Buytenhek - Contributed various patches, a rewrite of the + network driver, the first implementation of the mconsole driver, and + did the bulk of the work needed to get SMP working again. + + + Yon Uriarte - Fixed the TUN/TAP network backend while I slept. + + + Adam Heath - Made a bunch of nice cleanups to the initialization code, + plus various other small patches. + + + Matt Zimmerman - Matt volunteered to be the UML Debian maintainer and + is doing a real nice job of it. He also noticed and fixed a number of + actually and potentially exploitable security holes in uml_net. Plus + the occasional patch. I like patches. + + + James McMechan - James seems to have taken over maintenance of the ubd + driver and is doing a nice job of it. + + + Chandan Kudige - wrote the umlgdb script which automates the reloading + of module symbols. + + + Steve Schmidtke - wrote the UML slirp transport and hostaudio drivers, + enabling UML processes to access audio devices on the host. He also + submitted patches for the slip transport and lots of other things. + + + David Coulson <http://davidcoulson.net> - + + +o Set up the usermodelinux.org <http://usermodelinux.org> site, + which is a great way of keeping the UML user community on top of + UML goings-on. + + +o Site documentation and updates + + +o Nifty little UML management daemon UMLd + <http://uml.openconsultancy.com/umld/> + + +o Lots of testing and bug reports + + + + + 1155..22.. FFlluusshhiinngg oouutt bbuuggss + + + + +o Yuri Pudgorodsky + + +o Gerald Britton + + +o Ian Wehrman + + +o Gord Lamb + + +o Eugene Koontz + + +o John H. Hartman + + +o Anders Karlsson + + +o Daniel Phillips + + +o John Fremlin + + +o Rainer Burgstaller + + +o James Stevenson + + +o Matt Clay + + +o Cliff Jefferies + + +o Geoff Hoff + + +o Lennert Buytenhek + + +o Al Viro + + +o Frank Klingenhoefer + + +o Livio Baldini Soares + + +o Jon Burgess + + +o Petru Paler + + +o Paul + + +o Chris Reahard + + +o Sverker Nilsson + + +o Gong Su + + +o johan verrept + + +o Bjorn Eriksson + + +o Lorenzo Allegrucci + + +o Muli Ben-Yehuda + + +o David Mansfield + + +o Howard Goff + + +o Mike Anderson + + +o John Byrne + + +o Sapan J. Batia + + +o Iris Huang + + +o Jan Hudec + + +o Voluspa + + + + + 1155..33.. BBuugglleettss aanndd cclleeaann--uuppss + + + + +o Dave Zarzycki + + +o Adam Lazur + + +o Boria Feigin + + +o Brian J. Murrell + + +o JS + + +o Roman Zippel + + +o Wil Cooley + + +o Ayelet Shemesh + + +o Will Dyson + + +o Sverker Nilsson + + +o dvorak + + +o v.naga srinivas + + +o Shlomi Fish + + +o Roger Binns + + +o johan verrept + + +o MrChuoi + + +o Peter Cleve + + +o Vincent Guffens + + +o Nathan Scott + + +o Patrick Caulfield + + +o jbearce + + +o Catalin Marinas + + +o Shane Spencer + + +o Zou Min + + + +o Ryan Boder + + +o Lorenzo Colitti + + +o Gwendal Grignou + + +o Andre' Breiler + + +o Tsutomu Yasuda + + + + 1155..44.. CCaassee SSttuuddiieess + + + +o Jon Wright + + +o William McEwan + + +o Michael Richardson + + + + 1155..55.. OOtthheerr ccoonnttrriibbuuttiioonnss + + + Bill Carr <Bill.Carr at compaq.com> made the Red Hat mkrootfs script + work with RH 6.2. + + Michael Jennings <mikejen at hevanet.com> sent in some material which + is now gracing the top of the index page <http://user-mode- + linux.sourceforge.net/index.html> of this site. + + SGI <http://www.sgi.com> (and more specifically Ralf Baechle <ralf at + uni-koblenz.de> ) gave me an account on oss.sgi.com + <http://www.oss.sgi.com> . The bandwidth there made it possible to + produce most of the filesystems available on the project download + page. + + Laurent Bonnaud <Laurent.Bonnaud at inpg.fr> took the old grotty + Debian filesystem that I've been distributing and updated it to 2.2. + It is now available by itself here. + + Rik van Riel gave me some ftp space on ftp.nl.linux.org so I can make + releases even when Sourceforge is broken. + + Rodrigo de Castro looked at my broken pte code and told me what was + wrong with it, letting me fix a long-standing (several weeks) and + serious set of bugs. + + Chris Reahard built a specialized root filesystem for running a DNS + server jailed inside UML. It's available from the download + <http://user-mode-linux.sourceforge.net/dl-sf.html> page in the Jail + Filesysems section. + + + + + + + + + + + + diff --git a/Documentation/unicode.txt b/Documentation/unicode.txt new file mode 100644 index 000000000000..4a33f81cadb1 --- /dev/null +++ b/Documentation/unicode.txt @@ -0,0 +1,175 @@ + Last update: 2005-01-17, version 1.4 + +This file is maintained by H. Peter Anvin <unicode@lanana.org> as part +of the Linux Assigned Names And Numbers Authority (LANANA) project. +The current version can be found at: + + http://www.lanana.org/docs/unicode/unicode.txt + + ------------------------ + +The Linux kernel code has been rewritten to use Unicode to map +characters to fonts. By downloading a single Unicode-to-font table, +both the eight-bit character sets and UTF-8 mode are changed to use +the font as indicated. + +This changes the semantics of the eight-bit character tables subtly. +The four character tables are now: + +Map symbol Map name Escape code (G0) + +LAT1_MAP Latin-1 (ISO 8859-1) ESC ( B +GRAF_MAP DEC VT100 pseudographics ESC ( 0 +IBMPC_MAP IBM code page 437 ESC ( U +USER_MAP User defined ESC ( K + +In particular, ESC ( U is no longer "straight to font", since the font +might be completely different than the IBM character set. This +permits for example the use of block graphics even with a Latin-1 font +loaded. + +Note that although these codes are similar to ISO 2022, neither the +codes nor their uses match ISO 2022; Linux has two 8-bit codes (G0 and +G1), whereas ISO 2022 has four 7-bit codes (G0-G3). + +In accordance with the Unicode standard/ISO 10646 the range U+F000 to +U+F8FF has been reserved for OS-wide allocation (the Unicode Standard +refers to this as a "Corporate Zone", since this is inaccurate for +Linux we call it the "Linux Zone"). U+F000 was picked as the starting +point since it lets the direct-mapping area start on a large power of +two (in case 1024- or 2048-character fonts ever become necessary). +This leaves U+E000 to U+EFFF as End User Zone. + +[v1.2]: The Unicodes range from U+F000 and up to U+F7FF have been +hard-coded to map directly to the loaded font, bypassing the +translation table. The user-defined map now defaults to U+F000 to +U+F0FF, emulating the previous behaviour. In practice, this range +might be shorter; for example, vgacon can only handle 256-character +(U+F000..U+F0FF) or 512-character (U+F000..U+F1FF) fonts. + + +Actual characters assigned in the Linux Zone +-------------------------------------------- + +In addition, the following characters not present in Unicode 1.1.4 +have been defined; these are used by the DEC VT graphics map. [v1.2] +THIS USE IS OBSOLETE AND SHOULD NO LONGER BE USED; PLEASE SEE BELOW. + +U+F800 DEC VT GRAPHICS HORIZONTAL LINE SCAN 1 +U+F801 DEC VT GRAPHICS HORIZONTAL LINE SCAN 3 +U+F803 DEC VT GRAPHICS HORIZONTAL LINE SCAN 7 +U+F804 DEC VT GRAPHICS HORIZONTAL LINE SCAN 9 + +The DEC VT220 uses a 6x10 character matrix, and these characters form +a smooth progression in the DEC VT graphics character set. I have +omitted the scan 5 line, since it is also used as a block-graphics +character, and hence has been coded as U+2500 FORMS LIGHT HORIZONTAL. + +[v1.3]: These characters have been officially added to Unicode 3.2.0; +they are added at U+23BA, U+23BB, U+23BC, U+23BD. Linux now uses the +new values. + +[v1.2]: The following characters have been added to represent common +keyboard symbols that are unlikely to ever be added to Unicode proper +since they are horribly vendor-specific. This, of course, is an +excellent example of horrible design. + +U+F810 KEYBOARD SYMBOL FLYING FLAG +U+F811 KEYBOARD SYMBOL PULLDOWN MENU +U+F812 KEYBOARD SYMBOL OPEN APPLE +U+F813 KEYBOARD SYMBOL SOLID APPLE + +Klingon language support +------------------------ + +In 1996, Linux was the first operating system in the world to add +support for the artificial language Klingon, created by Marc Okrand +for the "Star Trek" television series. This encoding was later +adopted by the ConScript Unicode Registry and proposed (but ultimately +rejected) for inclusion in Unicode Plane 1. Thus, it remains as a +Linux/CSUR private assignment in the Linux Zone. + +This encoding has been endorsed by the Klingon Language Institute. +For more information, contact them at: + + http://www.kli.org/ + +Since the characters in the beginning of the Linux CZ have been more +of the dingbats/symbols/forms type and this is a language, I have +located it at the end, on a 16-cell boundary in keeping with standard +Unicode practice. + +NOTE: This range is now officially managed by the ConScript Unicode +Registry. The normative reference is at: + + http://www.evertype.com/standards/csur/klingon.html + +Klingon has an alphabet of 26 characters, a positional numeric writing +system with 10 digits, and is written left-to-right, top-to-bottom. + +Several glyph forms for the Klingon alphabet have been proposed. +However, since the set of symbols appear to be consistent throughout, +with only the actual shapes being different, in keeping with standard +Unicode practice these differences are considered font variants. + +U+F8D0 KLINGON LETTER A +U+F8D1 KLINGON LETTER B +U+F8D2 KLINGON LETTER CH +U+F8D3 KLINGON LETTER D +U+F8D4 KLINGON LETTER E +U+F8D5 KLINGON LETTER GH +U+F8D6 KLINGON LETTER H +U+F8D7 KLINGON LETTER I +U+F8D8 KLINGON LETTER J +U+F8D9 KLINGON LETTER L +U+F8DA KLINGON LETTER M +U+F8DB KLINGON LETTER N +U+F8DC KLINGON LETTER NG +U+F8DD KLINGON LETTER O +U+F8DE KLINGON LETTER P +U+F8DF KLINGON LETTER Q + - Written <q> in standard Okrand Latin transliteration +U+F8E0 KLINGON LETTER QH + - Written <Q> in standard Okrand Latin transliteration +U+F8E1 KLINGON LETTER R +U+F8E2 KLINGON LETTER S +U+F8E3 KLINGON LETTER T +U+F8E4 KLINGON LETTER TLH +U+F8E5 KLINGON LETTER U +U+F8E6 KLINGON LETTER V +U+F8E7 KLINGON LETTER W +U+F8E8 KLINGON LETTER Y +U+F8E9 KLINGON LETTER GLOTTAL STOP + +U+F8F0 KLINGON DIGIT ZERO +U+F8F1 KLINGON DIGIT ONE +U+F8F2 KLINGON DIGIT TWO +U+F8F3 KLINGON DIGIT THREE +U+F8F4 KLINGON DIGIT FOUR +U+F8F5 KLINGON DIGIT FIVE +U+F8F6 KLINGON DIGIT SIX +U+F8F7 KLINGON DIGIT SEVEN +U+F8F8 KLINGON DIGIT EIGHT +U+F8F9 KLINGON DIGIT NINE + +U+F8FD KLINGON COMMA +U+F8FE KLINGON FULL STOP +U+F8FF KLINGON SYMBOL FOR EMPIRE + +Other Fictional and Artificial Scripts +-------------------------------------- + +Since the assignment of the Klingon Linux Unicode block, a registry of +fictional and artificial scripts has been established by John Cowan +<jcowan@reutershealth.com> and Michael Everson <everson@evertype.com>. +The ConScript Unicode Registry is accessible at: + + http://www.evertype.com/standards/csur/ + +The ranges used fall at the low end of the End User Zone and can hence +not be normatively assigned, but it is recommended that people who +wish to encode fictional scripts use these codes, in the interest of +interoperability. For Klingon, CSUR has adopted the Linux encoding. +The CSUR people are driving adding Tengwar and Cirth into Unicode +Plane 1; the addition of Klingon to Unicode Plane 1 has been rejected +and so the above encoding remains official. diff --git a/Documentation/usb/CREDITS b/Documentation/usb/CREDITS new file mode 100644 index 000000000000..01e7f857ef35 --- /dev/null +++ b/Documentation/usb/CREDITS @@ -0,0 +1,175 @@ +Credits for the Simple Linux USB Driver: + +The following people have contributed to this code (in alphabetical +order by last name). I'm sure this list should be longer, its +difficult to maintain, add yourself with a patch if desired. + + Georg Acher <acher@informatik.tu-muenchen.de> + David Brownell <dbrownell@users.sourceforge.net> + Alan Cox <alan@lxorguk.ukuu.org.uk> + Randy Dunlap <randy.dunlap@intel.com> + Johannes Erdfelt <johannes@erdfelt.com> + Deti Fliegl <deti@fliegl.de> + ham <ham@unsuave.com> + Bradley M Keryan <keryan@andrew.cmu.edu> + Greg Kroah-Hartman <greg@kroah.com> + Pavel Machek <pavel@suse.cz> + Paul Mackerras <paulus@cs.anu.edu.au> + Petko Manlolov <petkan@dce.bg> + David E. Nelson <dnelson@jump.net> + Vojtech Pavlik <vojtech@suse.cz> + Bill Ryder <bryder@sgi.com> + Thomas Sailer <sailer@ife.ee.ethz.ch> + Gregory P. Smith <greg@electricrain.com> + Linus Torvalds <torvalds@osdl.org> + Roman Weissgaerber <weissg@vienna.at> + <Kazuki.Yasumatsu@fujixerox.co.jp> + +Special thanks to: + + Inaky Perez Gonzalez <inaky@peloncho.fis.ucm.es> for starting the + Linux USB driver effort and writing much of the larger uusbd driver. + Much has been learned from that effort. + + The NetBSD & FreeBSD USB developers. For being on the Linux USB list + and offering suggestions and sharing implementation experiences. + +Additional thanks to the following companies and people for donations +of hardware, support, time and development (this is from the original +THANKS file in Inaky's driver): + + The following corporations have helped us in the development + of Linux USB / UUSBD: + + - 3Com GmbH for donating a ISDN Pro TA and supporting me + in technical questions and with test equipment. I'd never + expect such a great help. + + - USAR Systems provided us with one of their excellent USB + Evaluation Kits. It allows us to test the Linux-USB driver + for compliance with the latest USB specification. USAR + Systems recognized the importance of an up-to-date open + Operating System and supports this project with + Hardware. Thanks!. + + - Thanks to Intel Corporation for their precious help. + + - We teamed up with Cherry to make Linux the first OS with + built-in USB support. Cherry is one of the biggest keyboard + makers in the world. + + - CMD Technology, Inc. sponsored us kindly donating a CSA-6700 + PCI-to-USB Controller Board to test the OHCI implementation. + + - Due to their support to us, Keytronic can be sure that they + will sell keyboards to some of the 3 million (at least) + Linux users. + + - Many thanks to ing büro h doran [http://www.ibhdoran.com]! + It was almost impossible to get a PC backplate USB connector + for the motherboard here at Europe (mine, home-made, was + quite lousy :). Now I know where to acquire nice USB stuff! + + - Genius Germany donated a USB mouse to test the mouse boot + protocol. They've also donated a F-23 digital joystick and a + NetMouse Pro. Thanks! + + - AVM GmbH Berlin is supporting the development of the Linux + USB driver for the AVM ISDN Controller B1 USB. AVM is a + leading manufacturer for active and passive ISDN Controllers + and CAPI 2.0-based software. The active design of the AVM B1 + is open for all OS platforms, including Linux. + + - Thanks to Y-E Data, Inc. for donating their FlashBuster-U + USB Floppy Disk Drive, so we could test the bulk transfer + code. + + - Many thanks to Logitech for contributing a three axis USB + mouse. + + Logitech designs, manufactures and markets + Human Interface Devices, having a long history and + experience in making devices such as keyboards, mice, + trackballs, cameras, loudspeakers and control devices for + gaming and professional use. + + Being a recognized vendor and seller for all these devices, + they have donated USB mice, a joystick and a scanner, as a + way to acknowledge the importance of Linux and to allow + Logitech customers to enjoy support in their favorite + operating systems and all Linux users to use Logitech and + other USB hardware. + + Logitech is official sponsor of the Linux Conference on + Feb. 11th 1999 in Vienna, where we'll will present the + current state of the Linux USB effort. + + - CATC has provided means to uncover dark corners of the UHCI + inner workings with a USB Inspector. + + - Thanks to Entrega for providing PCI to USB cards, hubs and + converter products for development. + + - Thanks to ConnectTech for providing a WhiteHEAT usb to + serial converter, and the documentation for the device to + allow a driver to be written. + + - Thanks to ADMtek for providing Pegasus and Pegasus II + evaluation boards, specs and valuable advices during + the driver development. + + And thanks go to (hey! in no particular order :) + + - Oren Tirosh <orenti@hishome.net>, for standing so patiently + all my doubts'bout USB and giving lots of cool ideas. + + - Jochen Karrer <karrer@wpfd25.physik.uni-wuerzburg.de>, for + pointing out mortal bugs and giving advice. + + - Edmund Humemberger <ed@atnet.at>, for it's great work on + public relationships and general management stuff for the + Linux-USB effort. + + - Alberto Menegazzi <flash@flash.iol.it> is starting the + documentation for the UUSBD. Go for it! + + - Ric Klaren <ia_ric@cs.utwente.nl> for doing nice + introductory documents (competing with Alberto's :). + + - Christian Groessler <cpg@aladdin.de>, for it's help on those + itchy bits ... :) + + - Paul MacKerras for polishing OHCI and pushing me harder for + the iMac support, giving improvements and enhancements. + + - Fernando Herrera <fherrera@eurielec.etsit.upm.es> has taken + charge of composing, maintaining and feeding the + long-awaited, unique and marvelous UUSBD FAQ! Tadaaaa!!! + + - Rasca Gmelch <thron@gmx.de> has revived the raw driver and + pointed bugs, as well as started the uusbd-utils package. + + - Peter Dettori <dettori@ozy.dec.com> is uncovering bugs like + crazy, as well as making cool suggestions, great :) + + - All the Free Software and Linux community, the FSF & the GNU + project, the MIT X consortium, the TeX people ... everyone! + You know who you are! + + - Big thanks to Richard Stallman for creating Emacs! + + - The people at the linux-usb mailing list, for reading so + many messages :) Ok, no more kidding; for all your advises! + + - All the people at the USB Implementors Forum for their + help and assistance. + + - Nathan Myers <ncm@cantrip.org>, for his advice! (hope you + liked Cibeles' party). + + - Linus Torvalds, for starting, developing and managing Linux. + + - Mike Smith, Craig Keithley, Thierry Giron and Janet Schank + for convincing me USB Standard hubs are not that standard + and that's good to allow for vendor specific quirks on the + standard hub driver. diff --git a/Documentation/usb/URB.txt b/Documentation/usb/URB.txt new file mode 100644 index 000000000000..d59b95cc6f1b --- /dev/null +++ b/Documentation/usb/URB.txt @@ -0,0 +1,252 @@ +Revised: 2000-Dec-05. +Again: 2002-Jul-06 + + NOTE: + + The USB subsystem now has a substantial section in "The Linux Kernel API" + guide (in Documentation/DocBook), generated from the current source + code. This particular documentation file isn't particularly current or + complete; don't rely on it except for a quick overview. + + +1.1. Basic concept or 'What is an URB?' + +The basic idea of the new driver is message passing, the message itself is +called USB Request Block, or URB for short. + +- An URB consists of all relevant information to execute any USB transaction + and deliver the data and status back. + +- Execution of an URB is inherently an asynchronous operation, i.e. the + usb_submit_urb(urb) call returns immediately after it has successfully queued + the requested action. + +- Transfers for one URB can be canceled with usb_unlink_urb(urb) at any time. + +- Each URB has a completion handler, which is called after the action + has been successfully completed or canceled. The URB also contains a + context-pointer for passing information to the completion handler. + +- Each endpoint for a device logically supports a queue of requests. + You can fill that queue, so that the USB hardware can still transfer + data to an endpoint while your driver handles completion of another. + This maximizes use of USB bandwidth, and supports seamless streaming + of data to (or from) devices when using periodic transfer modes. + + +1.2. The URB structure + +Some of the fields in an URB are: + +struct urb +{ +// (IN) device and pipe specify the endpoint queue + struct usb_device *dev; // pointer to associated USB device + unsigned int pipe; // endpoint information + + unsigned int transfer_flags; // ISO_ASAP, SHORT_NOT_OK, etc. + +// (IN) all urbs need completion routines + void *context; // context for completion routine + void (*complete)(struct urb *); // pointer to completion routine + +// (OUT) status after each completion + int status; // returned status + +// (IN) buffer used for data transfers + void *transfer_buffer; // associated data buffer + int transfer_buffer_length; // data buffer length + int number_of_packets; // size of iso_frame_desc + +// (OUT) sometimes only part of CTRL/BULK/INTR transfer_buffer is used + int actual_length; // actual data buffer length + +// (IN) setup stage for CTRL (pass a struct usb_ctrlrequest) + unsigned char* setup_packet; // setup packet (control only) + +// Only for PERIODIC transfers (ISO, INTERRUPT) + // (IN/OUT) start_frame is set unless ISO_ASAP isn't set + int start_frame; // start frame + int interval; // polling interval + + // ISO only: packets are only "best effort"; each can have errors + int error_count; // number of errors + struct usb_iso_packet_descriptor iso_frame_desc[0]; +}; + +Your driver must create the "pipe" value using values from the appropriate +endpoint descriptor in an interface that it's claimed. + + +1.3. How to get an URB? + +URBs are allocated with the following call + + struct urb *usb_alloc_urb(int isoframes, int mem_flags) + +Return value is a pointer to the allocated URB, 0 if allocation failed. +The parameter isoframes specifies the number of isochronous transfer frames +you want to schedule. For CTRL/BULK/INT, use 0. The mem_flags parameter +holds standard memory allocation flags, letting you control (among other +things) whether the underlying code may block or not. + +To free an URB, use + + void usb_free_urb(struct urb *urb) + +You may not free an urb that you've submitted, but which hasn't yet been +returned to you in a completion callback. + + +1.4. What has to be filled in? + +Depending on the type of transaction, there are some inline functions +defined in <linux/usb.h> to simplify the initialization, such as +fill_control_urb() and fill_bulk_urb(). In general, they need the usb +device pointer, the pipe (usual format from usb.h), the transfer buffer, +the desired transfer length, the completion handler, and its context. +Take a look at the some existing drivers to see how they're used. + +Flags: +For ISO there are two startup behaviors: Specified start_frame or ASAP. +For ASAP set URB_ISO_ASAP in transfer_flags. + +If short packets should NOT be tolerated, set URB_SHORT_NOT_OK in +transfer_flags. + + +1.5. How to submit an URB? + +Just call + + int usb_submit_urb(struct urb *urb, int mem_flags) + +The mem_flags parameter, such as SLAB_ATOMIC, controls memory allocation, +such as whether the lower levels may block when memory is tight. + +It immediately returns, either with status 0 (request queued) or some +error code, usually caused by the following: + +- Out of memory (-ENOMEM) +- Unplugged device (-ENODEV) +- Stalled endpoint (-EPIPE) +- Too many queued ISO transfers (-EAGAIN) +- Too many requested ISO frames (-EFBIG) +- Invalid INT interval (-EINVAL) +- More than one packet for INT (-EINVAL) + +After submission, urb->status is -EINPROGRESS; however, you should never +look at that value except in your completion callback. + +For isochronous endpoints, your completion handlers should (re)submit +URBs to the same endpoint with the ISO_ASAP flag, using multi-buffering, +to get seamless ISO streaming. + + +1.6. How to cancel an already running URB? + +For an URB which you've submitted, but which hasn't been returned to +your driver by the host controller, call + + int usb_unlink_urb(struct urb *urb) + +It removes the urb from the internal list and frees all allocated +HW descriptors. The status is changed to reflect unlinking. After +usb_unlink_urb() returns with that status code, you can free the URB +with usb_free_urb(). + +There is also an asynchronous unlink mode. To use this, set the +the URB_ASYNC_UNLINK flag in urb->transfer flags before calling +usb_unlink_urb(). When using async unlinking, the URB will not +normally be unlinked when usb_unlink_urb() returns. Instead, wait +for the completion handler to be called. + + +1.7. What about the completion handler? + +The handler is of the following type: + + typedef void (*usb_complete_t)(struct urb *); + +i.e. it gets just the URB that caused the completion call. +In the completion handler, you should have a look at urb->status to +detect any USB errors. Since the context parameter is included in the URB, +you can pass information to the completion handler. + +Note that even when an error (or unlink) is reported, data may have been +transferred. That's because USB transfers are packetized; it might take +sixteen packets to transfer your 1KByte buffer, and ten of them might +have transferred succesfully before the completion is called. + + +NOTE: ***** WARNING ***** +Don't use urb->dev field in your completion handler; it's cleared +as part of giving urbs back to drivers. (Addressing an issue with +ownership of periodic URBs, which was otherwise ambiguous.) Instead, +use urb->context to hold all the data your driver needs. + +NOTE: ***** WARNING ***** +Also, NEVER SLEEP IN A COMPLETION HANDLER. These are normally called +during hardware interrupt processing. If you can, defer substantial +work to a tasklet (bottom half) to keep system latencies low. You'll +probably need to use spinlocks to protect data structures you manipulate +in completion handlers. + + +1.8. How to do isochronous (ISO) transfers? + +For ISO transfers you have to fill a usb_iso_packet_descriptor structure, +allocated at the end of the URB by usb_alloc_urb(n,mem_flags), for each +packet you want to schedule. You also have to set urb->interval to say +how often to make transfers; it's often one per frame (which is once +every microframe for highspeed devices). The actual interval used will +be a power of two that's no bigger than what you specify. + +The usb_submit_urb() call modifies urb->interval to the implemented interval +value that is less than or equal to the requested interval value. If +ISO_ASAP scheduling is used, urb->start_frame is also updated. + +For each entry you have to specify the data offset for this frame (base is +transfer_buffer), and the length you want to write/expect to read. +After completion, actual_length contains the actual transferred length and +status contains the resulting status for the ISO transfer for this frame. +It is allowed to specify a varying length from frame to frame (e.g. for +audio synchronisation/adaptive transfer rates). You can also use the length +0 to omit one or more frames (striping). + +For scheduling you can choose your own start frame or ISO_ASAP. As explained +earlier, if you always keep at least one URB queued and your completion +keeps (re)submitting a later URB, you'll get smooth ISO streaming (if usb +bandwidth utilization allows). + +If you specify your own start frame, make sure it's several frames in advance +of the current frame. You might want this model if you're synchronizing +ISO data with some other event stream. + + +1.9. How to start interrupt (INT) transfers? + +Interrupt transfers, like isochronous transfers, are periodic, and happen +in intervals that are powers of two (1, 2, 4 etc) units. Units are frames +for full and low speed devices, and microframes for high speed ones. + +Currently, after you submit one interrupt URB, that urb is owned by the +host controller driver until you cancel it with usb_unlink_urb(). You +may unlink interrupt urbs in their completion handlers, if you need to. + +After a transfer completion is called, the URB is automagically resubmitted. +THIS BEHAVIOR IS EXPECTED TO BE REMOVED!! + +Interrupt transfers may only send (or receive) the "maxpacket" value for +the given interrupt endpoint; if you need more data, you will need to +copy that data out of (or into) another buffer. Similarly, you can't +queue interrupt transfers. +THESE RESTRICTIONS ARE EXPECTED TO BE REMOVED!! + +Note that this automagic resubmission model does make it awkward to use +interrupt OUT transfers. The portable solution involves unlinking those +OUT urbs after the data is transferred, and perhaps submitting a final +URB for a short packet. + +The usb_submit_urb() call modifies urb->interval to the implemented interval +value that is less than or equal to the requested interval value. diff --git a/Documentation/usb/acm.txt b/Documentation/usb/acm.txt new file mode 100644 index 000000000000..8ef45ea8f691 --- /dev/null +++ b/Documentation/usb/acm.txt @@ -0,0 +1,138 @@ + Linux ACM driver v0.16 + (c) 1999 Vojtech Pavlik <vojtech@suse.cz> + Sponsored by SuSE +---------------------------------------------------------------------------- + +0. Disclaimer +~~~~~~~~~~~~~ + This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the Free +Software Foundation; either version 2 of the License, or (at your option) +any later version. + + This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY +or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. + + You should have received a copy of the GNU General Public License along +with this program; if not, write to the Free Software Foundation, Inc., 59 +Temple Place, Suite 330, Boston, MA 02111-1307 USA + + Should you need to contact me, the author, you can do so either by e-mail +- mail your message to <vojtech@suse.cz>, or by paper mail: Vojtech Pavlik, +Ucitelska 1576, Prague 8, 182 00 Czech Republic + + For your convenience, the GNU General Public License version 2 is included +in the package: See the file COPYING. + +1. Usage +~~~~~~~~ + The drivers/usb/class/cdc-acm.c drivers works with USB modems and USB ISDN terminal +adapters that conform to the Universal Serial Bus Communication Device Class +Abstract Control Model (USB CDC ACM) specification. + + Many modems do, here is a list of those I know of: + + 3Com OfficeConnect 56k + 3Com Voice FaxModem Pro + 3Com Sportster + MultiTech MultiModem 56k + Zoom 2986L FaxModem + Compaq 56k FaxModem + ELSA Microlink 56k + + I know of one ISDN TA that does work with the acm driver: + + 3Com USR ISDN Pro TA + + Unfortunately many modems and most ISDN TAs use proprietary interfaces and +thus won't work with this drivers. Check for ACM compliance before buying. + + The driver (with devfs) creates these devices in /dev/usb/acm: + + crw-r--r-- 1 root root 166, 0 Apr 1 10:49 0 + crw-r--r-- 1 root root 166, 1 Apr 1 10:49 1 + crw-r--r-- 1 root root 166, 2 Apr 1 10:49 2 + + And so on, up to 31, with the limit being possible to change in acm.c to up +to 256, so you can use up to 256 USB modems with one computer (you'll need +three USB cards for that, though). + + If you don't use devfs, then you can create device nodes with the same +minor/major numbers anywhere you want, but either the above location or +/dev/usb/ttyACM0 is preferred. + + To use the modems you need these modules loaded: + + usbcore.ko + uhci-hcd.ko ohci-hcd.ko or ehci-hcd.ko + cdc-acm.ko + + After that, the modem[s] should be accessible. You should be able to use +minicom, ppp and mgetty with them. + +2. Verifying that it works +~~~~~~~~~~~~~~~~~~~~~~~~~~ + The first step would be to check /proc/bus/usb/devices, it should look +like this: + +T: Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 +B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0 +D: Ver= 1.00 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 +P: Vendor=0000 ProdID=0000 Rev= 0.00 +S: Product=USB UHCI Root Hub +S: SerialNumber=6800 +C:* #Ifs= 1 Cfg#= 1 Atr=40 MxPwr= 0mA +I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub +E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=255ms +T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 +D: Ver= 1.00 Cls=02(comm.) Sub=00 Prot=00 MxPS= 8 #Cfgs= 2 +P: Vendor=04c1 ProdID=008f Rev= 2.07 +S: Manufacturer=3Com Inc. +S: Product=3Com U.S. Robotics Pro ISDN TA +S: SerialNumber=UFT53A49BVT7 +C: #Ifs= 1 Cfg#= 1 Atr=60 MxPwr= 0mA +I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=acm +E: Ad=85(I) Atr=02(Bulk) MxPS= 64 Ivl= 0ms +E: Ad=04(O) Atr=02(Bulk) MxPS= 64 Ivl= 0ms +E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=128ms +C:* #Ifs= 2 Cfg#= 2 Atr=60 MxPwr= 0mA +I: If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=02 Prot=01 Driver=acm +E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=128ms +I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=acm +E: Ad=85(I) Atr=02(Bulk) MxPS= 64 Ivl= 0ms +E: Ad=04(O) Atr=02(Bulk) MxPS= 64 Ivl= 0ms + +The presence of these three lines (and the Cls= 'comm' and 'data' classes) +is important, it means it's an ACM device. The Driver=acm means the acm +driver is used for the device. If you see only Cls=ff(vend.) then you're out +of luck, you have a device with vendor specific-interface. + +D: Ver= 1.00 Cls=02(comm.) Sub=00 Prot=00 MxPS= 8 #Cfgs= 2 +I: If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=02 Prot=01 Driver=acm +I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=acm + +In the system log you should see: + +usb.c: USB new device connect, assigned device number 2 +usb.c: kmalloc IF c7691fa0, numif 1 +usb.c: kmalloc IF c7b5f3e0, numif 2 +usb.c: skipped 4 class/vendor specific interface descriptors +usb.c: new device strings: Mfr=1, Product=2, SerialNumber=3 +usb.c: USB device number 2 default language ID 0x409 +Manufacturer: 3Com Inc. +Product: 3Com U.S. Robotics Pro ISDN TA +SerialNumber: UFT53A49BVT7 +acm.c: probing config 1 +acm.c: probing config 2 +ttyACM0: USB ACM device +acm.c: acm_control_msg: rq: 0x22 val: 0x0 len: 0x0 result: 0 +acm.c: acm_control_msg: rq: 0x20 val: 0x0 len: 0x7 result: 7 +usb.c: acm driver claimed interface c7b5f3e0 +usb.c: acm driver claimed interface c7b5f3f8 +usb.c: acm driver claimed interface c7691fa0 + +If all this seems to be OK, fire up minicom and set it to talk to the ttyACM +device and try typing 'at'. If it responds with 'OK', then everything is +working. diff --git a/Documentation/usb/auerswald.txt b/Documentation/usb/auerswald.txt new file mode 100644 index 000000000000..7ee4d8f69116 --- /dev/null +++ b/Documentation/usb/auerswald.txt @@ -0,0 +1,30 @@ + Auerswald USB kernel driver + =========================== + +What is it? What can I do with it? +================================== +The auerswald USB kernel driver connects your linux 2.4.x +system to the auerswald usb-enabled devices. + +There are two types of auerswald usb devices: +a) small PBX systems (ISDN) +b) COMfort system telephones (ISDN) + +The driver installation creates the devices +/dev/usb/auer0..15. These devices carry a vendor- +specific protocol. You may run all auerswald java +software on it. The java software needs a native +library "libAuerUsbJNINative.so" installed on +your system. This library is available from +auerswald and shipped as part of the java software. + +You may create the devices with: + mknod -m 666 /dev/usb/auer0 c 180 112 + ... + mknod -m 666 /dev/usb/auer15 c 180 127 + +Future plans +============ +- Connection to ISDN4LINUX (the hisax interface) + +The maintainer of this driver is wolfgang@iksw-muees.de diff --git a/Documentation/usb/bluetooth.txt b/Documentation/usb/bluetooth.txt new file mode 100644 index 000000000000..774f5d3835cc --- /dev/null +++ b/Documentation/usb/bluetooth.txt @@ -0,0 +1,44 @@ +INTRODUCTION + + The USB Bluetooth driver supports any USB Bluetooth device. + It currently works well with the Linux USB Bluetooth stack from Axis + (available at http://developer.axis.com/software/bluetooth/ ) and + has been rumored to work with other Linux USB Bluetooth stacks. + + +CONFIGURATION + + Currently the driver can handle up to 256 different USB Bluetooth + devices at once. + + If you are not using devfs: + The major number that the driver uses is 216 so to use the driver, + create the following nodes: + mknod /dev/ttyUB0 c 216 0 + mknod /dev/ttyUB1 c 216 1 + mknod /dev/ttyUB2 c 216 2 + mknod /dev/ttyUB3 c 216 3 + . + . + . + mknod /dev/ttyUB254 c 216 254 + mknod /dev/ttyUB255 c 216 255 + + If you are using devfs: + The devices supported by this driver will show up as + /dev/usb/ttub/{0,1,...} + + When the device is connected and recognized by the driver, the driver + will print to the system log, which node the device has been bound to. + + +CONTACT: + + If anyone has any problems using this driver, please contact me, or + join the Linux-USB mailing list (information on joining the mailing + list, as well as a link to its searchable archive is at + http://www.linux-usb.org/ ) + + +Greg Kroah-Hartman +greg@kroah.com diff --git a/Documentation/usb/dma.txt b/Documentation/usb/dma.txt new file mode 100644 index 000000000000..62844aeba69c --- /dev/null +++ b/Documentation/usb/dma.txt @@ -0,0 +1,116 @@ +In Linux 2.5 kernels (and later), USB device drivers have additional control +over how DMA may be used to perform I/O operations. The APIs are detailed +in the kernel usb programming guide (kerneldoc, from the source code). + + +API OVERVIEW + +The big picture is that USB drivers can continue to ignore most DMA issues, +though they still must provide DMA-ready buffers (see DMA-mapping.txt). +That's how they've worked through the 2.4 (and earlier) kernels. + +OR: they can now be DMA-aware. + +- New calls enable DMA-aware drivers, letting them allocate dma buffers and + manage dma mappings for existing dma-ready buffers (see below). + +- URBs have an additional "transfer_dma" field, as well as a transfer_flags + bit saying if it's valid. (Control requests also have "setup_dma" and a + corresponding transfer_flags bit.) + +- "usbcore" will map those DMA addresses, if a DMA-aware driver didn't do + it first and set URB_NO_TRANSFER_DMA_MAP or URB_NO_SETUP_DMA_MAP. HCDs + don't manage dma mappings for URBs. + +- There's a new "generic DMA API", parts of which are usable by USB device + drivers. Never use dma_set_mask() on any USB interface or device; that + would potentially break all devices sharing that bus. + + +ELIMINATING COPIES + +It's good to avoid making CPUs copy data needlessly. The costs can add up, +and effects like cache-trashing can impose subtle penalties. + +- When you're allocating a buffer for DMA purposes anyway, use the buffer + primitives. Think of them as kmalloc and kfree that give you the right + kind of addresses to store in urb->transfer_buffer and urb->transfer_dma, + while guaranteeing that no hidden copies through DMA "bounce" buffers will + slow things down. You'd also set URB_NO_TRANSFER_DMA_MAP in + urb->transfer_flags: + + void *usb_buffer_alloc (struct usb_device *dev, size_t size, + int mem_flags, dma_addr_t *dma); + + void usb_buffer_free (struct usb_device *dev, size_t size, + void *addr, dma_addr_t dma); + + For control transfers you can use the buffer primitives or not for each + of the transfer buffer and setup buffer independently. Set the flag bits + URB_NO_TRANSFER_DMA_MAP and URB_NO_SETUP_DMA_MAP to indicate which + buffers you have prepared. For non-control transfers URB_NO_SETUP_DMA_MAP + is ignored. + + The memory buffer returned is "dma-coherent"; sometimes you might need to + force a consistent memory access ordering by using memory barriers. It's + not using a streaming DMA mapping, so it's good for small transfers on + systems where the I/O would otherwise tie up an IOMMU mapping. (See + Documentation/DMA-mapping.txt for definitions of "coherent" and "streaming" + DMA mappings.) + + Asking for 1/Nth of a page (as well as asking for N pages) is reasonably + space-efficient. + +- Devices on some EHCI controllers could handle DMA to/from high memory. + Driver probe() routines can notice this using a generic DMA call, then + tell higher level code (network, scsi, etc) about it like this: + + if (dma_supported (&intf->dev, 0xffffffffffffffffULL)) + net->features |= NETIF_F_HIGHDMA; + + That can eliminate dma bounce buffering of requests that originate (or + terminate) in high memory, in cases where the buffers aren't allocated + with usb_buffer_alloc() but instead are dma-mapped. + + +WORKING WITH EXISTING BUFFERS + +Existing buffers aren't usable for DMA without first being mapped into the +DMA address space of the device. + +- When you're using scatterlists, you can map everything at once. On some + systems, this kicks in an IOMMU and turns the scatterlists into single + DMA transactions: + + int usb_buffer_map_sg (struct usb_device *dev, unsigned pipe, + struct scatterlist *sg, int nents); + + void usb_buffer_dmasync_sg (struct usb_device *dev, unsigned pipe, + struct scatterlist *sg, int n_hw_ents); + + void usb_buffer_unmap_sg (struct usb_device *dev, unsigned pipe, + struct scatterlist *sg, int n_hw_ents); + + It's probably easier to use the new usb_sg_*() calls, which do the DMA + mapping and apply other tweaks to make scatterlist i/o be fast. + +- Some drivers may prefer to work with the model that they're mapping large + buffers, synchronizing their safe re-use. (If there's no re-use, then let + usbcore do the map/unmap.) Large periodic transfers make good examples + here, since it's cheaper to just synchronize the buffer than to unmap it + each time an urb completes and then re-map it on during resubmission. + + These calls all work with initialized urbs: urb->dev, urb->pipe, + urb->transfer_buffer, and urb->transfer_buffer_length must all be + valid when these calls are used (urb->setup_packet must be valid too + if urb is a control request): + + struct urb *usb_buffer_map (struct urb *urb); + + void usb_buffer_dmasync (struct urb *urb); + + void usb_buffer_unmap (struct urb *urb); + + The calls manage urb->transfer_dma for you, and set URB_NO_TRANSFER_DMA_MAP + so that usbcore won't map or unmap the buffer. The same goes for + urb->setup_dma and URB_NO_SETUP_DMA_MAP for control requests. diff --git a/Documentation/usb/ehci.txt b/Documentation/usb/ehci.txt new file mode 100644 index 000000000000..1536b7e75134 --- /dev/null +++ b/Documentation/usb/ehci.txt @@ -0,0 +1,212 @@ +27-Dec-2002 + +The EHCI driver is used to talk to high speed USB 2.0 devices using +USB 2.0-capable host controller hardware. The USB 2.0 standard is +compatible with the USB 1.1 standard. It defines three transfer speeds: + + - "High Speed" 480 Mbit/sec (60 MByte/sec) + - "Full Speed" 12 Mbit/sec (1.5 MByte/sec) + - "Low Speed" 1.5 Mbit/sec + +USB 1.1 only addressed full speed and low speed. High speed devices +can be used on USB 1.1 systems, but they slow down to USB 1.1 speeds. + +USB 1.1 devices may also be used on USB 2.0 systems. When plugged +into an EHCI controller, they are given to a USB 1.1 "companion" +controller, which is a OHCI or UHCI controller as normally used with +such devices. When USB 1.1 devices plug into USB 2.0 hubs, they +interact with the EHCI controller through a "Transaction Translator" +(TT) in the hub, which turns low or full speed transactions into +high speed "split transactions" that don't waste transfer bandwidth. + +At this writing, this driver has been seen to work with implementations +of EHCI from (in alphabetical order): Intel, NEC, Philips, and VIA. +Other EHCI implementations are becoming available from other vendors; +you should expect this driver to work with them too. + +While usb-storage devices have been available since mid-2001 (working +quite speedily on the 2.4 version of this driver), hubs have only +been available since late 2001, and other kinds of high speed devices +appear to be on hold until more systems come with USB 2.0 built-in. +Such new systems have been available since early 2002, and became much +more typical in the second half of 2002. + +Note that USB 2.0 support involves more than just EHCI. It requires +other changes to the Linux-USB core APIs, including the hub driver, +but those changes haven't needed to really change the basic "usbcore" +APIs exposed to USB device drivers. + +- David Brownell + <dbrownell@users.sourceforge.net> + + +FUNCTIONALITY + +This driver is regularly tested on x86 hardware, and has also been +used on PPC hardware so big/little endianness issues should be gone. +It's believed to do all the right PCI magic so that I/O works even on +systems with interesting DMA mapping issues. + +Transfer Types + +At this writing the driver should comfortably handle all control, bulk, +and interrupt transfers, including requests to USB 1.1 devices through +transaction translators (TTs) in USB 2.0 hubs. But you may find bugs. + +High Speed Isochronous (ISO) transfer support is also functional, but +at this writing no Linux drivers have been using that support. + +Full Speed Isochronous transfer support, through transaction translators, +is not yet available. Note that split transaction support for ISO +transfers can't share much code with the code for high speed ISO transfers, +since EHCI represents these with a different data structure. So for now, +most USB audio and video devices can't be connected to high speed buses. + +Driver Behavior + +Transfers of all types can be queued. This means that control transfers +from a driver on one interface (or through usbfs) won't interfere with +ones from another driver, and that interrupt transfers can use periods +of one frame without risking data loss due to interrupt processing costs. + +The EHCI root hub code hands off USB 1.1 devices to its companion +controller. This driver doesn't need to know anything about those +drivers; a OHCI or UHCI driver that works already doesn't need to change +just because the EHCI driver is also present. + +There are some issues with power management; suspend/resume doesn't +behave quite right at the moment. + +Also, some shortcuts have been taken with the scheduling periodic +transactions (interrupt and isochronous transfers). These place some +limits on the number of periodic transactions that can be scheduled, +and prevent use of polling intervals of less than one frame. + + +USE BY + +Assuming you have an EHCI controller (on a PCI card or motherboard) +and have compiled this driver as a module, load this like: + + # modprobe ehci-hcd + +and remove it by: + + # rmmod ehci-hcd + +You should also have a driver for a "companion controller", such as +"ohci-hcd" or "uhci-hcd". In case of any trouble with the EHCI driver, +remove its module and then the driver for that companion controller will +take over (at lower speed) all the devices that were previously handled +by the EHCI driver. + +Module parameters (pass to "modprobe") include: + + log2_irq_thresh (default 0): + Log2 of default interrupt delay, in microframes. The default + value is 0, indicating 1 microframe (125 usec). Maximum value + is 6, indicating 2^6 = 64 microframes. This controls how often + the EHCI controller can issue interrupts. + +If you're using this driver on a 2.5 kernel, and you've enabled USB +debugging support, you'll see three files in the "sysfs" directory for +any EHCI controller: + + "async" dumps the asynchronous schedule, used for control + and bulk transfers. Shows each active qh and the qtds + pending, usually one qtd per urb. (Look at it with + usb-storage doing disk I/O; watch the request queues!) + "periodic" dumps the periodic schedule, used for interrupt + and isochronous transfers. Doesn't show qtds. + "registers" show controller register state, and + +The contents of those files can help identify driver problems. + + +Device drivers shouldn't care whether they're running over EHCI or not, +but they may want to check for "usb_device->speed == USB_SPEED_HIGH". +High speed devices can do things that full speed (or low speed) ones +can't, such as "high bandwidth" periodic (interrupt or ISO) transfers. +Also, some values in device descriptors (such as polling intervals for +periodic transfers) use different encodings when operating at high speed. + +However, do make a point of testing device drivers through USB 2.0 hubs. +Those hubs report some failures, such as disconnections, differently when +transaction translators are in use; some drivers have been seen to behave +badly when they see different faults than OHCI or UHCI report. + + +PERFORMANCE + +USB 2.0 throughput is gated by two main factors: how fast the host +controller can process requests, and how fast devices can respond to +them. The 480 Mbit/sec "raw transfer rate" is obeyed by all devices, +but aggregate throughput is also affected by issues like delays between +individual high speed packets, driver intelligence, and of course the +overall system load. Latency is also a performance concern. + +Bulk transfers are most often used where throughput is an issue. It's +good to keep in mind that bulk transfers are always in 512 byte packets, +and at most 13 of those fit into one USB 2.0 microframe. Eight USB 2.0 +microframes fit in a USB 1.1 frame; a microframe is 1 msec/8 = 125 usec. + +So more than 50 MByte/sec is available for bulk transfers, when both +hardware and device driver software allow it. Periodic transfer modes +(isochronous and interrupt) allow the larger packet sizes which let you +approach the quoted 480 MBit/sec transfer rate. + +Hardware Performance + +At this writing, individual USB 2.0 devices tend to max out at around +20 MByte/sec transfer rates. This is of course subject to change; +and some devices now go faster, while others go slower. + +The first NEC implementation of EHCI seems to have a hardware bottleneck +at around 28 MByte/sec aggregate transfer rate. While this is clearly +enough for a single device at 20 MByte/sec, putting three such devices +onto one bus does not get you 60 MByte/sec. The issue appears to be +that the controller hardware won't do concurrent USB and PCI access, +so that it's only trying six (or maybe seven) USB transactions each +microframe rather than thirteen. (Seems like a reasonable trade off +for a product that beat all the others to market by over a year!) + +It's expected that newer implementations will better this, throwing +more silicon real estate at the problem so that new motherboard chip +sets will get closer to that 60 MByte/sec target. That includes an +updated implementation from NEC, as well as other vendors' silicon. + +There's a minimum latency of one microframe (125 usec) for the host +to receive interrupts from the EHCI controller indicating completion +of requests. That latency is tunable; there's a module option. By +default ehci-hcd driver uses the minimum latency, which means that if +you issue a control or bulk request you can often expect to learn that +it completed in less than 250 usec (depending on transfer size). + +Software Performance + +To get even 20 MByte/sec transfer rates, Linux-USB device drivers will +need to keep the EHCI queue full. That means issuing large requests, +or using bulk queuing if a series of small requests needs to be issued. +When drivers don't do that, their performance results will show it. + +In typical situations, a usb_bulk_msg() loop writing out 4 KB chunks is +going to waste more than half the USB 2.0 bandwidth. Delays between the +I/O completion and the driver issuing the next request will take longer +than the I/O. If that same loop used 16 KB chunks, it'd be better; a +sequence of 128 KB chunks would waste a lot less. + +But rather than depending on such large I/O buffers to make synchronous +I/O be efficient, it's better to just queue up several (bulk) requests +to the HC, and wait for them all to complete (or be canceled on error). +Such URB queuing should work with all the USB 1.1 HC drivers too. + +In the Linux 2.5 kernels, new usb_sg_*() api calls have been defined; they +queue all the buffers from a scatterlist. They also use scatterlist DMA +mapping (which might apply an IOMMU) and IRQ reduction, all of which will +help make high speed transfers run as fast as they can. + + +TBD: Interrupt and ISO transfer performance issues. Those periodic +transfers are fully scheduled, so the main issue is likely to be how +to trigger "high bandwidth" modes. + diff --git a/Documentation/usb/error-codes.txt b/Documentation/usb/error-codes.txt new file mode 100644 index 000000000000..1e36f1661cd0 --- /dev/null +++ b/Documentation/usb/error-codes.txt @@ -0,0 +1,167 @@ +Revised: 2004-Oct-21 + +This is the documentation of (hopefully) all possible error codes (and +their interpretation) that can be returned from usbcore. + +Some of them are returned by the Host Controller Drivers (HCDs), which +device drivers only see through usbcore. As a rule, all the HCDs should +behave the same except for transfer speed dependent behaviors and the +way certain faults are reported. + + +************************************************************************** +* Error codes returned by usb_submit_urb * +************************************************************************** + +Non-USB-specific: + +0 URB submission went fine + +-ENOMEM no memory for allocation of internal structures + +USB-specific: + +-ENODEV specified USB-device or bus doesn't exist + +-ENOENT specified interface or endpoint does not exist or + is not enabled + +-ENXIO host controller driver does not support queuing of this type + of urb. (treat as a host controller bug.) + +-EINVAL a) Invalid transfer type specified (or not supported) + b) Invalid or unsupported periodic transfer interval + c) ISO: attempted to change transfer interval + d) ISO: number_of_packets is < 0 + e) various other cases + +-EAGAIN a) specified ISO start frame too early + b) (using ISO-ASAP) too much scheduled for the future + wait some time and try again. + +-EFBIG Host controller driver can't schedule that many ISO frames. + +-EPIPE Specified endpoint is stalled. For non-control endpoints, + reset this status with usb_clear_halt(). + +-EMSGSIZE (a) endpoint maxpacket size is zero; it is not usable + in the current interface altsetting. + (b) ISO packet is biger than endpoint maxpacket + (c) requested data transfer size is invalid (negative) + +-ENOSPC This request would overcommit the usb bandwidth reserved + for periodic transfers (interrupt, isochronous). + +-ESHUTDOWN The device or host controller has been disabled due to some + problem that could not be worked around. + +-EPERM Submission failed because urb->reject was set. + +-EHOSTUNREACH URB was rejected because the device is suspended. + + +************************************************************************** +* Error codes returned by in urb->status * +* or in iso_frame_desc[n].status (for ISO) * +************************************************************************** + +USB device drivers may only test urb status values in completion handlers. +This is because otherwise there would be a race between HCDs updating +these values on one CPU, and device drivers testing them on another CPU. + +A transfer's actual_length may be positive even when an error has been +reported. That's because transfers often involve several packets, so that +one or more packets could finish before an error stops further endpoint I/O. + + +0 Transfer completed successfully + +-ENOENT URB was synchronously unlinked by usb_unlink_urb + +-EINPROGRESS URB still pending, no results yet + (That is, if drivers see this it's a bug.) + +-EPROTO (*, **) a) bitstuff error + b) no response packet received within the + prescribed bus turn-around time + c) unknown USB error + +-EILSEQ (*, **) a) CRC mismatch + b) no response packet received within the + prescribed bus turn-around time + c) unknown USB error + + Note that often the controller hardware does not + distinguish among cases a), b), and c), so a + driver cannot tell whether there was a protocol + error, a failure to respond (often caused by + device disconnect), or some other fault. + +-ETIMEDOUT (**) No response packet received within the prescribed + bus turn-around time. This error may instead be + reported as -EPROTO or -EILSEQ. + + Note that the synchronous USB message functions + also use this code to indicate timeout expired + before the transfer completed. + +-EPIPE (**) Endpoint stalled. For non-control endpoints, + reset this status with usb_clear_halt(). + +-ECOMM During an IN transfer, the host controller + received data from an endpoint faster than it + could be written to system memory + +-ENOSR During an OUT transfer, the host controller + could not retrieve data from system memory fast + enough to keep up with the USB data rate + +-EOVERFLOW (*) The amount of data returned by the endpoint was + greater than either the max packet size of the + endpoint or the remaining buffer size. "Babble". + +-EREMOTEIO The data read from the endpoint did not fill the + specified buffer, and URB_SHORT_NOT_OK was set in + urb->transfer_flags. + +-ENODEV Device was removed. Often preceded by a burst of + other errors, since the hub driver does't detect + device removal events immediately. + +-EXDEV ISO transfer only partially completed + look at individual frame status for details + +-EINVAL ISO madness, if this happens: Log off and go home + +-ECONNRESET URB was asynchronously unlinked by usb_unlink_urb + +-ESHUTDOWN The device or host controller has been disabled due + to some problem that could not be worked around, + such as a physical disconnect. + + +(*) Error codes like -EPROTO, -EILSEQ and -EOVERFLOW normally indicate +hardware problems such as bad devices (including firmware) or cables. + +(**) This is also one of several codes that different kinds of host +controller use to to indicate a transfer has failed because of device +disconnect. In the interval before the hub driver starts disconnect +processing, devices may receive such fault reports for every request. + + + +************************************************************************** +* Error codes returned by usbcore-functions * +* (expect also other submit and transfer status codes) * +************************************************************************** + +usb_register(): +-EINVAL error during registering new driver + +usb_get_*/usb_set_*(): +usb_control_msg(): +usb_bulk_msg(): +-ETIMEDOUT Timeout expired before the transfer completed. + In the future this code may change to -ETIME, + whose definition is a closer match to this sort + of error. diff --git a/Documentation/usb/gadget_serial.txt b/Documentation/usb/gadget_serial.txt new file mode 100644 index 000000000000..a938c3dd13d6 --- /dev/null +++ b/Documentation/usb/gadget_serial.txt @@ -0,0 +1,332 @@ + + Linux Gadget Serial Driver v2.0 + 11/20/2004 + + +License and Disclaimer +---------------------- +This program is free software; you can redistribute it and/or +modify it under the terms of the GNU General Public License as +published by the Free Software Foundation; either version 2 of +the License, or (at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public +License along with this program; if not, write to the Free +Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, +MA 02111-1307 USA. + +This document and the the gadget serial driver itself are +Copyright (C) 2004 by Al Borchers (alborchers@steinerpoint.com). + +If you have questions, problems, or suggestions for this driver +please contact Al Borchers at alborchers@steinerpoint.com. + + +Prerequisites +------------- +Versions of the gadget serial driver are available for the +2.4 Linux kernels, but this document assumes you are using +version 2.0 or later of the gadget serial driver in a 2.6 +Linux kernel. + +This document assumes that you are familiar with Linux and +Windows and know how to configure and build Linux kernels, run +standard utilities, use minicom and HyperTerminal, and work with +USB and serial devices. It also assumes you configure the Linux +gadget and usb drivers as modules. + + +Overview +-------- +The gadget serial driver is a Linux USB gadget driver, a USB device +side driver. It runs on a Linux system that has USB device side +hardware; for example, a PDA, an embedded Linux system, or a PC +with a USB development card. + +The gadget serial driver talks over USB to either a CDC ACM driver +or a generic USB serial driver running on a host PC. + + Host + -------------------------------------- + | Host-Side CDC ACM USB Host | + | Operating | or | Controller | USB + | System | Generic USB | Driver |-------- + | (Linux or | Serial | and | | + | Windows) Driver USB Stack | | + -------------------------------------- | + | + | + | + Gadget | + -------------------------------------- | + | Gadget USB Periph. | | + | Device-Side | Gadget | Controller | | + | Linux | Serial | Driver |-------- + | Operating | Driver | and | + | System USB Stack | + -------------------------------------- + +On the device-side Linux system, the gadget serial driver looks +like a serial device. + +On the host-side system, the gadget serial device looks like a +CDC ACM compliant class device or a simple vendor specific device +with bulk in and bulk out endpoints, and it is treated similarly +to other serial devices. + +The host side driver can potentially be any ACM compliant driver +or any driver that can talk to a device with a simple bulk in/out +interface. Gadget serial has been tested with the Linux ACM driver, +the Windows usbser.sys ACM driver, and the Linux USB generic serial +driver. + +With the gadget serial driver and the host side ACM or generic +serial driver running, you should be able to communicate between +the host and the gadget side systems as if they were connected by a +serial cable. + +The gadget serial driver only provides simple unreliable data +communication. It does not yet handle flow control or many other +features of normal serial devices. + + +Installing the Gadget Serial Driver +----------------------------------- +To use the gadget serial driver you must configure the Linux gadget +side kernel for "Support for USB Gadgets", for a "USB Peripheral +Controller" (for example, net2280), and for the "Serial Gadget" +driver. All this are listed under "USB Gadget Support" when +configuring the kernel. Then rebuild and install the kernel or +modules. + +The gadget serial driver uses major number 127, for now. So you +will need to create a device node for it, like this: + + mknod /dev/ttygserial c 127 0 + +You only need to do this once. + +Then you must load the gadget serial driver. To load it as an +ACM device, do this: + + modprobe g_serial use_acm=1 + +To load it as a vendor specific bulk in/out device, do this: + + modprobe g_serial + +This will also automatically load the underlying gadget peripheral +controller driver. This must be done each time you reboot the gadget +side Linux system. You can add this to the start up scripts, if +desired. + +If gadget serial is loaded as an ACM device you will want to use +either the Windows or Linux ACM driver on the host side. If gadget +serial is loaded as a bulk in/out device, you will want to use the +Linux generic serial driver on the host side. Follow the appropriate +instructions below to install the host side driver. + + +Installing the Windows Host ACM Driver +-------------------------------------- +To use the Windows ACM driver you must have the files "gserial.inf" +and "usbser.sys" together in a folder on the Windows machine. + +The "gserial.inf" file is given here. + +-------------------- CUT HERE -------------------- +[Version] +Signature="$Windows NT$" +Class=Ports +ClassGuid={4D36E978-E325-11CE-BFC1-08002BE10318} +Provider=%LINUX% +DriverVer=08/17/2004,0.0.2.0 +; Copyright (C) 2004 Al Borchers (alborchers@steinerpoint.com) + +[Manufacturer] +%LINUX%=GSerialDeviceList + +[GSerialDeviceList] +%GSERIAL%=GSerialInstall, USB\VID_0525&PID_A4A7 + +[DestinationDirs] +DefaultDestDir=10,System32\Drivers + +[GSerialInstall] +CopyFiles=GSerialCopyFiles +AddReg=GSerialAddReg + +[GSerialCopyFiles] +usbser.sys + +[GSerialAddReg] +HKR,,DevLoader,,*ntkern +HKR,,NTMPDriver,,usbser.sys +HKR,,EnumPropPages32,,"MsPorts.dll,SerialPortPropPageProvider" + +[GSerialInstall.Services] +AddService = usbser,0x0002,GSerialService + +[GSerialService] +DisplayName = %GSERIAL_DISPLAY_NAME% +ServiceType = 1 ; SERVICE_KERNEL_DRIVER +StartType = 3 ; SERVICE_DEMAND_START +ErrorControl = 1 ; SERVICE_ERROR_NORMAL +ServiceBinary = %10%\System32\Drivers\usbser.sys +LoadOrderGroup = Base + +[Strings] +LINUX = "Linux" +GSERIAL = "Gadget Serial" +GSERIAL_DISPLAY_NAME = "USB Gadget Serial Driver" +-------------------- CUT HERE -------------------- + +The "usbser.sys" file comes with various versions of Windows. +For example, it can be found on Windows XP typically in + + C:\WINDOWS\Driver Cache\i386\driver.cab + +Or it can be found on the Windows 98SE CD in the "win98" folder +in the "DRIVER11.CAB" through "DRIVER20.CAB" cab files. You will +need the DOS "expand" program, the Cygwin "cabextract" program, or +a similar program to unpack these cab files and extract "usbser.sys". + +For example, to extract "usbser.sys" into the current directory +on Windows XP, open a DOS window and run a command like + + expand C:\WINDOWS\Driver~1\i386\driver.cab -F:usbser.sys . + +(Thanks to Nishant Kamat for pointing out this DOS command.) + +When the gadget serial driver is loaded and the USB device connected +to the Windows host with a USB cable, Windows should recognize the +gadget serial device and ask for a driver. Tell Windows to find the +driver in the folder that contains "gserial.inf" and "usbser.sys". + +For example, on Windows XP, when the gadget serial device is first +plugged in, the "Found New Hardware Wizard" starts up. Select +"Install from a list or specific location (Advanced)", then on +the next screen select "Include this location in the search" and +enter the path or browse to the folder containing "gserial.inf" and +"usbser.sys". Windows will complain that the Gadget Serial driver +has not passed Windows Logo testing, but select "Continue anyway" +and finish the driver installation. + +On Windows XP, in the "Device Manager" (under "Control Panel", +"System", "Hardware") expand the "Ports (COM & LPT)" entry and you +should see "Gadget Serial" listed as the driver for one of the COM +ports. + +To uninstall the Windows XP driver for "Gadget Serial", right click +on the "Gadget Serial" entry in the "Device Manager" and select +"Uninstall". + + +Installing the Linux Host ACM Driver +------------------------------------ +To use the Linux ACM driver you must configure the Linux host side +kernel for "Support for Host-side USB" and for "USB Modem (CDC ACM) +support". + +Once the gadget serial driver is loaded and the USB device connected +to the Linux host with a USB cable, the host system should recognize +the gadget serial device. For example, the command + + cat /proc/bus/usb/devices + +should show something like this: + +T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 5 Spd=480 MxCh= 0 +D: Ver= 2.00 Cls=02(comm.) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 +P: Vendor=0525 ProdID=a4a7 Rev= 2.01 +S: Manufacturer=Linux 2.6.8.1 with net2280 +S: Product=Gadget Serial +S: SerialNumber=0 +C:* #Ifs= 2 Cfg#= 2 Atr=c0 MxPwr= 2mA +I: If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=02 Prot=01 Driver=acm +E: Ad=83(I) Atr=03(Int.) MxPS= 8 Ivl=32ms +I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=acm +E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms + +If the host side Linux system is configured properly, the ACM driver +should be loaded automatically. The command "lsmod" should show the +"acm" module is loaded. + + +Installing the Linux Host Generic USB Serial Driver +--------------------------------------------------- +To use the Linux generic USB serial driver you must configure the +Linux host side kernel for "Support for Host-side USB", for "USB +Serial Converter support", and for the "USB Generic Serial Driver". + +Once the gadget serial driver is loaded and the USB device connected +to the Linux host with a USB cable, the host system should recognize +the gadget serial device. For example, the command + + cat /proc/bus/usb/devices + +should show something like this: + +T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 6 Spd=480 MxCh= 0 +D: Ver= 2.00 Cls=ff(vend.) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 +P: Vendor=0525 ProdID=a4a6 Rev= 2.01 +S: Manufacturer=Linux 2.6.8.1 with net2280 +S: Product=Gadget Serial +S: SerialNumber=0 +C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 2mA +I: If#= 0 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=serial +E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms + +You must explicitly load the usbserial driver with parameters to +configure it to recognize the gadget serial device, like this: + + modprobe usbserial vendor=0x0525 product=0xA4A6 + +If everything is working, usbserial will print a message in the +system log saying something like "Gadget Serial converter now +attached to ttyUSB0". + + +Testing with Minicom or HyperTerminal +------------------------------------- +Once the gadget serial driver and the host driver are both installed, +and a USB cable connects the gadget device to the host, you should +be able to communicate over USB between the gadget and host systems. +You can use minicom or HyperTerminal to try this out. + +On the gadget side run "minicom -s" to configure a new minicom +session. Under "Serial port setup" set "/dev/ttygserial" as the +"Serial Device". Set baud rate, data bits, parity, and stop bits, +to 9600, 8, none, and 1--these settings mostly do not matter. +Under "Modem and dialing" erase all the modem and dialing strings. + +On a Linux host running the ACM driver, configure minicom similarly +but use "/dev/ttyACM0" as the "Serial Device". (If you have other +ACM devices connected, change the device name appropriately.) + +On a Linux host running the USB generic serial driver, configure +minicom similarly, but use "/dev/ttyUSB0" as the "Serial Device". +(If you have other USB serial devices connected, change the device +name appropriately.) + +On a Windows host configure a new HyperTerminal session to use the +COM port assigned to Gadget Serial. The "Port Settings" will be +set automatically when HyperTerminal connects to the gadget serial +device, so you can leave them set to the default values--these +settings mostly do not matter. + +With minicom configured and running on the gadget side and with +minicom or HyperTerminal configured and running on the host side, +you should be able to send data back and forth between the gadget +side and host side systems. Anything you type on the terminal +window on the gadget side should appear in the terminal window on +the host side and vice versa. + + diff --git a/Documentation/usb/hiddev.txt b/Documentation/usb/hiddev.txt new file mode 100644 index 000000000000..cd6fb4b58e1f --- /dev/null +++ b/Documentation/usb/hiddev.txt @@ -0,0 +1,205 @@ +Care and feeding of your Human Interface Devices + +INTRODUCTION + +In addition to the normal input type HID devices, USB also uses the +human interface device protocols for things that are not really human +interfaces, but have similar sorts of communication needs. The two big +examples for this are power devices (especially uninterruptable power +supplies) and monitor control on higher end monitors. + +To support these disparite requirements, the Linux USB system provides +HID events to two separate interfaces: +* the input subsystem, which converts HID events into normal input +device interfaces (such as keyboard, mouse and joystick) and a +normalised event interface - see Documentation/input/input.txt +* the hiddev interface, which provides fairly raw HID events + +The data flow for a HID event produced by a device is something like +the following : + + usb.c ---> hid-core.c ----> hid-input.c ----> [keyboard/mouse/joystick/event] + | + | + --> hiddev.c ----> POWER / MONITOR CONTROL + +In addition, other subsystems (apart from USB) can potentially feed +events into the input subsystem, but these have no effect on the hid +device interface. + +USING THE HID DEVICE INTERFACE + +The hiddev interface is a char interface using the normal USB major, +with the minor numbers starting at 96 and finishing at 111. Therefore, +you need the following commands: +mknod /dev/usb/hiddev0 c 180 96 +mknod /dev/usb/hiddev1 c 180 97 +mknod /dev/usb/hiddev2 c 180 98 +mknod /dev/usb/hiddev3 c 180 99 +mknod /dev/usb/hiddev4 c 180 100 +mknod /dev/usb/hiddev5 c 180 101 +mknod /dev/usb/hiddev6 c 180 102 +mknod /dev/usb/hiddev7 c 180 103 +mknod /dev/usb/hiddev8 c 180 104 +mknod /dev/usb/hiddev9 c 180 105 +mknod /dev/usb/hiddev10 c 180 106 +mknod /dev/usb/hiddev11 c 180 107 +mknod /dev/usb/hiddev12 c 180 108 +mknod /dev/usb/hiddev13 c 180 109 +mknod /dev/usb/hiddev14 c 180 110 +mknod /dev/usb/hiddev15 c 180 111 + +So you point your hiddev compliant user-space program at the correct +interface for your device, and it all just works. + +Assuming that you have a hiddev compliant user-space program, of +course. If you need to write one, read on. + + +THE HIDDEV API +This description should be read in conjunction with the HID +specification, freely available from http://www.usb.org, and +conveniently linked of http://www.linux-usb.org. + +The hiddev API uses a read() interface, and a set of ioctl() calls. + +HID devices exchange data with the host computer using data +bundles called "reports". Each report is divided into "fields", +each of which can have one or more "usages". In the hid-core, +each one of these usages has a single signed 32 bit value. + +read(): +This is the event interface. When the HID device's state changes, +it performs an interrupt transfer containing a report which contains +the changed value. The hid-core.c module parses the report, and +returns to hiddev.c the individual usages that have changed within +the report. In its basic mode, the hiddev will make these individual +usage changes available to the reader using a struct hiddev_event: + + struct hiddev_event { + unsigned hid; + signed int value; + }; + +containing the HID usage identifier for the status that changed, and +the value that it was changed to. Note that the structure is defined +within <linux/hiddev.h>, along with some other useful #defines and +structures. The HID usage identifier is a composite of the HID usage +page shifted to the 16 high order bits ORed with the usage code. The +behavior of the read() function can be modified using the HIDIOCSFLAG +ioctl() described below. + + +ioctl(): +This is the control interface. There are a number of controls: + +HIDIOCGVERSION - int (read) +Gets the version code out of the hiddev driver. + +HIDIOCAPPLICATION - (none) +This ioctl call returns the HID application usage associated with the +hid device. The third argument to ioctl() specifies which application +index to get. This is useful when the device has more than one +application collection. If the index is invalid (greater or equal to +the number of application collections this device has) the ioctl +returns -1. You can find out beforehand how many application +collections the device has from the num_applications field from the +hiddev_devinfo structure. + +HIDIOCGCOLLECTIONINFO - struct hiddev_collection_info (read/write) +This returns a superset of the information above, providing not only +application collections, but all the collections the device has. It +also returns the level the collection lives in the hierarchy. +The user passes in a hiddev_collection_info struct with the index +field set to the index that should be returned. The ioctl fills in +the other fields. If the index is larger than the last collection +index, the ioctl returns -1 and sets errno to -EINVAL. + +HIDIOCGDEVINFO - struct hiddev_devinfo (read) +Gets a hiddev_devinfo structure which describes the device. + +HIDIOCGSTRING - struct struct hiddev_string_descriptor (read/write) +Gets a string descriptor from the device. The caller must fill in the +"index" field to indicate which descriptor should be returned. + +HIDIOCINITREPORT - (none) +Instructs the kernel to retrieve all input and feature report values +from the device. At this point, all the usage structures will contain +current values for the device, and will maintain it as the device +changes. Note that the use of this ioctl is unnecessary in general, +since later kernels automatically initialize the reports from the +device at attach time. + +HIDIOCGNAME - string (variable length) +Gets the device name + +HIDIOCGREPORT - struct hiddev_report_info (write) +Instructs the kernel to get a feature or input report from the device, +in order to selectively update the usage structures (in contrast to +INITREPORT). + +HIDIOCSREPORT - struct hiddev_report_info (write) +Instructs the kernel to send a report to the device. This report can +be filled in by the user through HIDIOCSUSAGE calls (below) to fill in +individual usage values in the report before sending the report in full +to the device. + +HIDIOCGREPORTINFO - struct hiddev_report_info (read/write) +Fills in a hiddev_report_info structure for the user. The report is +looked up by type (input, output or feature) and id, so these fields +must be filled in by the user. The ID can be absolute -- the actual +report id as reported by the device -- or relative -- +HID_REPORT_ID_FIRST for the first report, and (HID_REPORT_ID_NEXT | +report_id) for the next report after report_id. Without a-priori +information about report ids, the right way to use this ioctl is to +use the relative IDs above to enumerate the valid IDs. The ioctl +returns non-zero when there is no more next ID. The real report ID is +filled into the returned hiddev_report_info structure. + +HIDIOCGFIELDINFO - struct hiddev_field_info (read/write) +Returns the field information associated with a report in a +hiddev_field_info structure. The user must fill in report_id and +report_type in this structure, as above. The field_index should also +be filled in, which should be a number from 0 and maxfield-1, as +returned from a previous HIDIOCGREPORTINFO call. + +HIDIOCGUCODE - struct hiddev_usage_ref (read/write) +Returns the usage_code in a hiddev_usage_ref structure, given that +given its report type, report id, field index, and index within the +field have already been filled into the structure. + +HIDIOCGUSAGE - struct hiddev_usage_ref (read/write) +Returns the value of a usage in a hiddev_usage_ref structure. The +usage to be retrieved can be specified as above, or the user can +choose to fill in the report_type field and specify the report_id as +HID_REPORT_ID_UNKNOWN. In this case, the hiddev_usage_ref will be +filled in with the report and field information associated with this +usage if it is found. + +HIDIOCSUSAGE - struct hiddev_usage_ref (write) +Sets the value of a usage in an output report. The user fills in +the hiddev_usage_ref structure as above, but additionally fills in +the value field. + +HIDIOGCOLLECTIONINDEX - struct hiddev_usage_ref (write) +Returns the collection index associated with this usage. This +indicates where in the collection hierarchy this usage sits. + +HIDIOCGFLAG - int (read) +HIDIOCSFLAG - int (write) +These operations respectively inspect and replace the mode flags +that influence the read() call above. The flags are as follows: + + HIDDEV_FLAG_UREF - read() calls will now return + struct hiddev_usage_ref instead of struct hiddev_event. + This is a larger structure, but in situations where the + device has more than one usage in its reports with the + same usage code, this mode serves to resolve such + ambiguity. + + HIDDEV_FLAG_REPORT - This flag can only be used in conjunction + with HIDDEV_FLAG_UREF. With this flag set, when the device + sends a report, a struct hiddev_usage_ref will be returned + to read() filled in with the report_type and report_id, but + with field_index set to FIELD_INDEX_NONE. This serves as + additional notification when the device has sent a report. diff --git a/Documentation/usb/hotplug.txt b/Documentation/usb/hotplug.txt new file mode 100644 index 000000000000..f53170665f37 --- /dev/null +++ b/Documentation/usb/hotplug.txt @@ -0,0 +1,148 @@ +LINUX HOTPLUGGING + +In hotpluggable busses like USB (and Cardbus PCI), end-users plug devices +into the bus with power on. In most cases, users expect the devices to become +immediately usable. That means the system must do many things, including: + + - Find a driver that can handle the device. That may involve + loading a kernel module; newer drivers can use module-init-tools + to publish their device (and class) support to user utilities. + + - Bind a driver to that device. Bus frameworks do that using a + device driver's probe() routine. + + - Tell other subsystems to configure the new device. Print + queues may need to be enabled, networks brought up, disk + partitions mounted, and so on. In some cases these will + be driver-specific actions. + +This involves a mix of kernel mode and user mode actions. Making devices +be immediately usable means that any user mode actions can't wait for an +administrator to do them: the kernel must trigger them, either passively +(triggering some monitoring daemon to invoke a helper program) or +actively (calling such a user mode helper program directly). + +Those triggered actions must support a system's administrative policies; +such programs are called "policy agents" here. Typically they involve +shell scripts that dispatch to more familiar administration tools. + +Because some of those actions rely on information about drivers (metadata) +that is currently available only when the drivers are dynamically linked, +you get the best hotplugging when you configure a highly modular system. + + +KERNEL HOTPLUG HELPER (/sbin/hotplug) + +When you compile with CONFIG_HOTPLUG, you get a new kernel parameter: +/proc/sys/kernel/hotplug, which normally holds the pathname "/sbin/hotplug". +That parameter names a program which the kernel may invoke at various times. + +The /sbin/hotplug program can be invoked by any subsystem as part of its +reaction to a configuration change, from a thread in that subsystem. +Only one parameter is required: the name of a subsystem being notified of +some kernel event. That name is used as the first key for further event +dispatch; any other argument and environment parameters are specified by +the subsystem making that invocation. + +Hotplug software and other resources is available at: + + http://linux-hotplug.sourceforge.net + +Mailing list information is also available at that site. + + +-------------------------------------------------------------------------- + + +USB POLICY AGENT + +The USB subsystem currently invokes /sbin/hotplug when USB devices +are added or removed from system. The invocation is done by the kernel +hub daemon thread [khubd], or else as part of root hub initialization +(done by init, modprobe, kapmd, etc). Its single command line parameter +is the string "usb", and it passes these environment variables: + + ACTION ... "add", "remove" + PRODUCT ... USB vendor, product, and version codes (hex) + TYPE ... device class codes (decimal) + INTERFACE ... interface 0 class codes (decimal) + +If "usbdevfs" is configured, DEVICE and DEVFS are also passed. DEVICE is +the pathname of the device, and is useful for devices with multiple and/or +alternate interfaces that complicate driver selection. By design, USB +hotplugging is independent of "usbdevfs": you can do most essential parts +of USB device setup without using that filesystem, and without running a +user mode daemon to detect changes in system configuration. + +Currently available policy agent implementations can load drivers for +modules, and can invoke driver-specific setup scripts. The newest ones +leverage USB module-init-tools support. Later agents might unload drivers. + + +USB MODUTILS SUPPORT + +Current versions of module-init-tools will create a "modules.usbmap" file +which contains the entries from each driver's MODULE_DEVICE_TABLE. Such +files can be used by various user mode policy agents to make sure all the +right driver modules get loaded, either at boot time or later. + +See <linux/usb.h> for full information about such table entries; or look +at existing drivers. Each table entry describes one or more criteria to +be used when matching a driver to a device or class of devices. The +specific criteria are identified by bits set in "match_flags", paired +with field values. You can construct the criteria directly, or with +macros such as these, and use driver_info to store more information. + + USB_DEVICE (vendorId, productId) + ... matching devices with specified vendor and product ids + USB_DEVICE_VER (vendorId, productId, lo, hi) + ... like USB_DEVICE with lo <= productversion <= hi + USB_INTERFACE_INFO (class, subclass, protocol) + ... matching specified interface class info + USB_DEVICE_INFO (class, subclass, protocol) + ... matching specified device class info + +A short example, for a driver that supports several specific USB devices +and their quirks, might have a MODULE_DEVICE_TABLE like this: + + static const struct usb_device_id mydriver_id_table = { + { USB_DEVICE (0x9999, 0xaaaa), driver_info: QUIRK_X }, + { USB_DEVICE (0xbbbb, 0x8888), driver_info: QUIRK_Y|QUIRK_Z }, + ... + { } /* end with an all-zeroes entry */ + } + MODULE_DEVICE_TABLE (usb, mydriver_id_table); + +Most USB device drivers should pass these tables to the USB subsystem as +well as to the module management subsystem. Not all, though: some driver +frameworks connect using interfaces layered over USB, and so they won't +need such a "struct usb_driver". + +Drivers that connect directly to the USB subsystem should be declared +something like this: + + static struct usb_driver mydriver = { + .name = "mydriver", + .id_table = mydriver_id_table, + .probe = my_probe, + .disconnect = my_disconnect, + + /* + if using the usb chardev framework: + .minor = MY_USB_MINOR_START, + .fops = my_file_ops, + if exposing any operations through usbdevfs: + .ioctl = my_ioctl, + */ + } + +When the USB subsystem knows about a driver's device ID table, it's used when +choosing drivers to probe(). The thread doing new device processing checks +drivers' device ID entries from the MODULE_DEVICE_TABLE against interface and +device descriptors for the device. It will only call probe() if there is a +match, and the third argument to probe() will be the entry that matched. + +If you don't provide an id_table for your driver, then your driver may get +probed for each new device; the third parameter to probe() will be null. + + diff --git a/Documentation/usb/ibmcam.txt b/Documentation/usb/ibmcam.txt new file mode 100644 index 000000000000..ce2f21a3eac4 --- /dev/null +++ b/Documentation/usb/ibmcam.txt @@ -0,0 +1,324 @@ +README for Linux device driver for the IBM "C-It" USB video camera + +INTRODUCTION: + +This driver does not use all features known to exist in +the IBM camera. However most of needed features work well. + +This driver was developed using logs of observed USB traffic +which was produced by standard Windows driver (c-it98.sys). +I did not have data sheets from Xirlink. + +Video formats: + 128x96 [model 1] + 176x144 + 320x240 [model 2] + 352x240 [model 2] + 352x288 +Frame rate: 3 - 30 frames per second (FPS) +External interface: USB +Internal interface: Video For Linux (V4L) +Supported controls: +- by V4L: Contrast, Brightness, Color, Hue +- by driver options: frame rate, lighting conditions, video format, + default picture settings, sharpness. + +SUPPORTED CAMERAS: + +Xirlink "C-It" camera, also known as "IBM PC Camera". +The device uses proprietary ASIC (and compression method); +it is manufactured by Xirlink. See http://www.xirlink.com/ +http://www.ibmpccamera.com or http://www.c-itnow.com/ for +details and pictures. + +This very chipset ("X Chip", as marked at the factory) +is used in several other cameras, and they are supported +as well: + +- IBM NetCamera +- Veo Stingray + +The Linux driver was developed with camera with following +model number (or FCC ID): KSX-XVP510. This camera has three +interfaces, each with one endpoint (control, iso, iso). This +type of cameras is referred to as "model 1". These cameras are +no longer manufactured. + +Xirlink now manufactures new cameras which are somewhat different. +In particular, following models [FCC ID] belong to that category: + +XVP300 [KSX-X9903] +XVP600 [KSX-X9902] +XVP610 [KSX-X9902] + +(see http://www.xirlink.com/ibmpccamera/ for updates, they refer +to these new cameras by Windows driver dated 12-27-99, v3005 BETA) +These cameras have two interfaces, one endpoint in each (iso, bulk). +Such type of cameras is referred to as "model 2". They are supported +(with exception of 352x288 native mode). + +Some IBM NetCameras (Model 4) are made to generate only compressed +video streams. This is great for performance, but unfortunately +nobody knows how to decompress the stream :-( Therefore, these +cameras are *unsupported* and if you try to use one of those, all +you get is random colored horizontal streaks, not the image! +If you have one of those cameras, you probably should return it +to the store and get something that is supported. + +Tell me more about all that "model" business +-------------------------------------------- + +I just invented model numbers to uniquely identify flavors of the +hardware/firmware that were sold. It was very confusing to use +brand names or some other internal numbering schemes. So I found +by experimentation that all Xirlink chipsets fall into four big +classes, and I called them "models". Each model is programmed in +its own way, and each model sends back the video in its own way. + +Quirks of Model 2 cameras: +------------------------- + +Model 2 does not have hardware contrast control. Corresponding V4L +control is implemented in software, which is not very nice to your +CPU, but at least it works. + +This driver provides 352x288 mode by switching the camera into +quasi-352x288 RGB mode (800 Kbits per frame) essentially limiting +this mode to 10 frames per second or less, in ideal conditions on +the bus (USB is shared, after all). The frame rate +has to be programmed very conservatively. Additional concern is that +frame rate depends on brightness setting; therefore the picture can +be good at one brightness and broken at another! I did not want to fix +the frame rate at slowest setting, but I had to move it pretty much down +the scale (so that framerate option barely matters). I also noticed that +camera after first powering up produces frames slightly faster than during +consecutive uses. All this means that if you use 352x288 (which is +default), be warned - you may encounter broken picture on first connect; +try to adjust brightness - brighter image is slower, so USB will be able +to send all data. However if you regularly use Model 2 cameras you may +prefer 176x144 which makes perfectly good I420, with no scaling and +lesser demands on USB (300 Kbits per second, or 26 frames per second). + +Another strange effect of 352x288 mode is the fine vertical grid visible +on some colored surfaces. I am sure it is caused by me not understanding +what the camera is trying to say. Blame trade secrets for that. + +The camera that I had also has a hardware quirk: if disconnected, +it needs few minutes to "relax" before it can be plugged in again +(poorly designed USB processor reset circuit?) + +[Veo Stingray with Product ID 0x800C is also Model 2, but I haven't +observed this particular flaw in it.] + +Model 2 camera can be programmed for very high sensitivity (even starlight +may be enough), this makes it convenient for tinkering with. The driver +code has enough comments to help a programmer to tweak the camera +as s/he feels necessary. + +WHAT YOU NEED: + +- A supported IBM PC (C-it) camera (model 1 or 2) + +- A Linux box with USB support (2.3/2.4; 2.2 w/backport may work) + +- A Video4Linux compatible frame grabber program such as xawtv. + +HOW TO COMPILE THE DRIVER: + +You need to compile the driver only if you are a developer +or if you want to make changes to the code. Most distributions +precompile all modules, so you can go directly to the next +section "HOW TO USE THE DRIVER". + +The ibmcam driver uses usbvideo helper library (module), +so if you are studying the ibmcam code you will be led there. + +The driver itself consists of only one file in usb/ directory: +ibmcam.c. This file is included into the Linux kernel build +process if you configure the kernel for CONFIG_USB_IBMCAM. +Run "make xconfig" and in USB section you will find the IBM +camera driver. Select it, save the configuration and recompile. + +HOW TO USE THE DRIVER: + +I recommend to compile driver as a module. This gives you an +easier access to its configuration. The camera has many more +settings than V4L can operate, so some settings are done using +module options. + +To begin with, on most modern Linux distributions the driver +will be automatically loaded whenever you plug the supported +camera in. Therefore, you don't need to do anything. However +if you want to experiment with some module parameters then +you can load and unload the driver manually, with camera +plugged in or unplugged. + +Typically module is installed with command 'modprobe', like this: + +# modprobe ibmcam framerate=1 + +Alternatively you can use 'insmod' in similar fashion: + +# insmod /lib/modules/2.x.y/usb/ibmcam.o framerate=1 + +Module can be inserted with camera connected or disconnected. + +The driver can have options, though some defaults are provided. + +Driver options: (* indicates that option is model-dependent) + +Name Type Range [default] Example +-------------- -------------- -------------- ------------------ +debug Integer 0-9 [0] debug=1 +flags Integer 0-0xFF [0] flags=0x0d +framerate Integer 0-6 [2] framerate=1 +hue_correction Integer 0-255 [128] hue_correction=115 +init_brightness Integer 0-255 [128] init_brightness=100 +init_contrast Integer 0-255 [192] init_contrast=200 +init_color Integer 0-255 [128] init_color=130 +init_hue Integer 0-255 [128] init_hue=115 +lighting Integer 0-2* [1] lighting=2 +sharpness Integer 0-6* [4] sharpness=3 +size Integer 0-2* [2] size=1 + +Options for Model 2 only: + +Name Type Range [default] Example +-------------- -------------- -------------- ------------------ +init_model2_rg Integer 0..255 [0x70] init_model2_rg=128 +init_model2_rg2 Integer 0..255 [0x2f] init_model2_rg2=50 +init_model2_sat Integer 0..255 [0x34] init_model2_sat=65 +init_model2_yb Integer 0..255 [0xa0] init_model2_yb=200 + +debug You don't need this option unless you are a developer. + If you are a developer then you will see in the code + what values do what. 0=off. + +flags This is a bit mask, and you can combine any number of + bits to produce what you want. Usually you don't want + any of extra features this option provides: + + FLAGS_RETRY_VIDIOCSYNC 1 This bit allows to retry failed + VIDIOCSYNC ioctls without failing. + Will work with xawtv, will not + with xrealproducer. Default is + not set. + FLAGS_MONOCHROME 2 Activates monochrome (b/w) mode. + FLAGS_DISPLAY_HINTS 4 Shows colored pixels which have + magic meaning to developers. + FLAGS_OVERLAY_STATS 8 Shows tiny numbers on screen, + useful only for debugging. + FLAGS_FORCE_TESTPATTERN 16 Shows blue screen with numbers. + FLAGS_SEPARATE_FRAMES 32 Shows each frame separately, as + it was received from the camera. + Default (not set) is to mix the + preceding frame in to compensate + for occasional loss of Isoc data + on high frame rates. + FLAGS_CLEAN_FRAMES 64 Forces "cleanup" of each frame + prior to use; relevant only if + FLAGS_SEPARATE_FRAMES is set. + Default is not to clean frames, + this is a little faster but may + produce flicker if frame rate is + too high and Isoc data gets lost. + FLAGS_NO_DECODING 128 This flag turns the video stream + decoder off, and dumps the raw + Isoc data from the camera into + the reading process. Useful to + developers, but not to users. + +framerate This setting controls frame rate of the camera. This is + an approximate setting (in terms of "worst" ... "best") + because camera changes frame rate depending on amount + of light available. Setting 0 is slowest, 6 is fastest. + Beware - fast settings are very demanding and may not + work well with all video sizes. Be conservative. + +hue_correction This highly optional setting allows to adjust the + hue of the image in a way slightly different from + what usual "hue" control does. Both controls affect + YUV colorspace: regular "hue" control adjusts only + U component, and this "hue_correction" option similarly + adjusts only V component. However usually it is enough + to tweak only U or V to compensate for colored light or + color temperature; this option simply allows more + complicated correction when and if it is necessary. + +init_brightness These settings specify _initial_ values which will be +init_contrast used to set up the camera. If your V4L application has +init_color its own controls to adjust the picture then these +init_hue controls will be used too. These options allow you to + preconfigure the camera when it gets connected, before + any V4L application connects to it. Good for webcams. + +init_model2_rg These initial settings alter color balance of the +init_model2_rg2 camera on hardware level. All four settings may be used +init_model2_sat to tune the camera to specific lighting conditions. These +init_model2_yb settings only apply to Model 2 cameras. + +lighting This option selects one of three hardware-defined + photosensitivity settings of the camera. 0=bright light, + 1=Medium (default), 2=Low light. This setting affects + frame rate: the dimmer the lighting the lower the frame + rate (because longer exposition time is needed). The + Model 2 cameras allow values more than 2 for this option, + thus enabling extremely high sensitivity at cost of frame + rate, color saturation and imaging sensor noise. + +sharpness This option controls smoothing (noise reduction) + made by camera. Setting 0 is most smooth, setting 6 + is most sharp. Be aware that CMOS sensor used in the + camera is pretty noisy, so if you choose 6 you will + be greeted with "snowy" image. Default is 4. Model 2 + cameras do not support this feature. + +size This setting chooses one of several image sizes that are + supported by this driver. Cameras may support more, but + it's difficult to reverse-engineer all formats. + Following video sizes are supported: + + size=0 128x96 (Model 1 only) + size=1 160x120 + size=2 176x144 + size=3 320x240 (Model 2 only) + size=4 352x240 (Model 2 only) + size=5 352x288 + size=6 640x480 (Model 3 only) + + The 352x288 is the native size of the Model 1 sensor + array, so it's the best resolution the camera can + yield. The best resolution of Model 2 is 176x144, and + larger images are produced by stretching the bitmap. + Model 3 has sensor with 640x480 grid, and it works too, + but the frame rate will be exceptionally low (1-2 FPS); + it may be still OK for some applications, like security. + Choose the image size you need. The smaller image can + support faster frame rate. Default is 352x288. + +For more information and the Troubleshooting FAQ visit this URL: + + http://www.linux-usb.org/ibmcam/ + +WHAT NEEDS TO BE DONE: + +- The button on the camera is not used. I don't know how to get to it. + I know now how to read button on Model 2, but what to do with it? + +- Camera reports its status back to the driver; however I don't know + what returned data means. If camera fails at some initialization + stage then something should be done, and I don't do that because + I don't even know that some command failed. This is mostly Model 1 + concern because Model 2 uses different commands which do not return + status (and seem to complete successfully every time). + +- Some flavors of Model 4 NetCameras produce only compressed video + streams, and I don't know how to decode them. + +CREDITS: + +The code is based in no small part on the CPiA driver by Johannes Erdfelt, +Randy Dunlap, and others. Big thanks to them for their pioneering work on that +and the USB stack. + +I also thank John Lightsey for his donation of the Veo Stingray camera. diff --git a/Documentation/usb/linux.inf b/Documentation/usb/linux.inf new file mode 100644 index 000000000000..2f7217d124ff --- /dev/null +++ b/Documentation/usb/linux.inf @@ -0,0 +1,200 @@ +; MS-Windows driver config matching some basic modes of the +; Linux-USB Ethernet/RNDIS gadget firmware: +; +; - RNDIS plus CDC Ethernet ... this may be familiar as a DOCSIS +; cable modem profile, and supports most non-Microsoft USB hosts +; +; - RNDIS plus CDC Subset ... used by hardware that incapable of +; full CDC Ethernet support. +; +; Microsoft only directly supports RNDIS drivers, and bundled them into XP. +; The Microsoft "Remote NDIS USB Driver Kit" is currently found at: +; http://www.microsoft.com/whdc/hwdev/resources/HWservices/rndis.mspx + + +[Version] +Signature = "$CHICAGO$" +Class = Net +ClassGUID = {4d36e972-e325-11ce-bfc1-08002be10318} +Provider = %Linux% +Compatible = 1 +MillenniumPreferred = .ME +DriverVer = 03/30/2004,0.0.0.0 +; catalog file would be used by WHQL +;CatalogFile = Linux.cat + +[Manufacturer] +%Linux% = LinuxDevices,NT.5.1 + +[LinuxDevices] +; NetChip IDs, used by both firmware modes +%LinuxDevice% = RNDIS, USB\VID_0525&PID_a4a2 + +[LinuxDevices.NT.5.1] +%LinuxDevice% = RNDIS.NT.5.1, USB\VID_0525&PID_a4a2 + +[ControlFlags] +ExcludeFromSelect=* + +; Windows 98, Windows 98 Second Edition specific sections -------- + +[RNDIS] +DeviceID = usb8023 +MaxInstance = 512 +DriverVer = 03/30/2004,0.0.0.0 +AddReg = RNDIS_AddReg_98, RNDIS_AddReg_Common + +[RNDIS_AddReg_98] +HKR, , DevLoader, 0, *ndis +HKR, , DeviceVxDs, 0, usb8023.sys +HKR, NDIS, LogDriverName, 0, "usb8023" +HKR, NDIS, MajorNdisVersion, 1, 5 +HKR, NDIS, MinorNdisVersion, 1, 0 +HKR, Ndi\Interfaces, DefUpper, 0, "ndis3,ndis4,ndis5" +HKR, Ndi\Interfaces, DefLower, 0, "ethernet" +HKR, Ndi\Interfaces, UpperRange, 0, "ndis3,ndis4,ndis5" +HKR, Ndi\Interfaces, LowerRange, 0, "ethernet" +HKR, Ndi\Install, ndis3, 0, "RNDIS_Install_98" +HKR, Ndi\Install, ndis4, 0, "RNDIS_Install_98" +HKR, Ndi\Install, ndis5, 0, "RNDIS_Install_98" +HKR, Ndi, DeviceId, 0, "USB\VID_0525&PID_a4a2" + +[RNDIS_Install_98] +CopyFiles=RNDIS_CopyFiles_98 + +[RNDIS_CopyFiles_98] +usb8023.sys, usb8023w.sys, , 0 +rndismp.sys, rndismpw.sys, , 0 + +; Windows Millennium Edition specific sections -------------------- + +[RNDIS.ME] +DeviceID = usb8023 +MaxInstance = 512 +DriverVer = 03/30/2004,0.0.0.0 +AddReg = RNDIS_AddReg_ME, RNDIS_AddReg_Common +Characteristics = 0x84 ; NCF_PHYSICAL + NCF_HAS_UI +BusType = 15 + +[RNDIS_AddReg_ME] +HKR, , DevLoader, 0, *ndis +HKR, , DeviceVxDs, 0, usb8023.sys +HKR, NDIS, LogDriverName, 0, "usb8023" +HKR, NDIS, MajorNdisVersion, 1, 5 +HKR, NDIS, MinorNdisVersion, 1, 0 +HKR, Ndi\Interfaces, DefUpper, 0, "ndis3,ndis4,ndis5" +HKR, Ndi\Interfaces, DefLower, 0, "ethernet" +HKR, Ndi\Interfaces, UpperRange, 0, "ndis3,ndis4,ndis5" +HKR, Ndi\Interfaces, LowerRange, 0, "ethernet" +HKR, Ndi\Install, ndis3, 0, "RNDIS_Install_ME" +HKR, Ndi\Install, ndis4, 0, "RNDIS_Install_ME" +HKR, Ndi\Install, ndis5, 0, "RNDIS_Install_ME" +HKR, Ndi, DeviceId, 0, "USB\VID_0525&PID_a4a2" + +[RNDIS_Install_ME] +CopyFiles=RNDIS_CopyFiles_ME + +[RNDIS_CopyFiles_ME] +usb8023.sys, usb8023m.sys, , 0 +rndismp.sys, rndismpm.sys, , 0 + +; Windows 2000 specific sections --------------------------------- + +[RNDIS.NT] +Characteristics = 0x84 ; NCF_PHYSICAL + NCF_HAS_UI +BusType = 15 +DriverVer = 03/30/2004,0.0.0.0 +AddReg = RNDIS_AddReg_NT, RNDIS_AddReg_Common +CopyFiles = RNDIS_CopyFiles_NT + +[RNDIS.NT.Services] +AddService = USB_RNDIS, 2, RNDIS_ServiceInst_NT, RNDIS_EventLog + +[RNDIS_CopyFiles_NT] +; no rename of files on Windows 2000, use the 'k' names as is +usb8023k.sys, , , 0 +rndismpk.sys, , , 0 + +[RNDIS_ServiceInst_NT] +DisplayName = %ServiceDisplayName% +ServiceType = 1 +StartType = 3 +ErrorControl = 1 +ServiceBinary = %12%\usb8023k.sys +LoadOrderGroup = NDIS +AddReg = RNDIS_WMI_AddReg_NT + +[RNDIS_WMI_AddReg_NT] +HKR, , MofImagePath, 0x00020000, "System32\drivers\rndismpk.sys" + +; Windows XP specific sections ----------------------------------- + +[RNDIS.NT.5.1] +Characteristics = 0x84 ; NCF_PHYSICAL + NCF_HAS_UI +BusType = 15 +DriverVer = 03/30/2004,0.0.0.0 +AddReg = RNDIS_AddReg_NT, RNDIS_AddReg_Common +; no copyfiles - the files are already in place + +[RNDIS.NT.5.1.Services] +AddService = USB_RNDIS, 2, RNDIS_ServiceInst_51, RNDIS_EventLog + +[RNDIS_ServiceInst_51] +DisplayName = %ServiceDisplayName% +ServiceType = 1 +StartType = 3 +ErrorControl = 1 +ServiceBinary = %12%\usb8023.sys +LoadOrderGroup = NDIS +AddReg = RNDIS_WMI_AddReg_51 + +[RNDIS_WMI_AddReg_51] +HKR, , MofImagePath, 0x00020000, "System32\drivers\rndismp.sys" + +; Windows 2000 and Windows XP common sections -------------------- + +[RNDIS_AddReg_NT] +HKR, Ndi, Service, 0, "USB_RNDIS" +HKR, Ndi\Interfaces, UpperRange, 0, "ndis5" +HKR, Ndi\Interfaces, LowerRange, 0, "ethernet" + +[RNDIS_EventLog] +AddReg = RNDIS_EventLog_AddReg + +[RNDIS_EventLog_AddReg] +HKR, , EventMessageFile, 0x00020000, "%%SystemRoot%%\System32\netevent.dll" +HKR, , TypesSupported, 0x00010001, 7 + +; Common Sections ------------------------------------------------- + +[RNDIS_AddReg_Common] +HKR, NDI\params\NetworkAddress, ParamDesc, 0, %NetworkAddress% +HKR, NDI\params\NetworkAddress, type, 0, "edit" +HKR, NDI\params\NetworkAddress, LimitText, 0, "12" +HKR, NDI\params\NetworkAddress, UpperCase, 0, "1" +HKR, NDI\params\NetworkAddress, default, 0, " " +HKR, NDI\params\NetworkAddress, optional, 0, "1" + +[SourceDisksNames] +1=%SourceDisk%,,1 + +[SourceDisksFiles] +usb8023m.sys=1 +rndismpm.sys=1 +usb8023w.sys=1 +rndismpw.sys=1 +usb8023k.sys=1 +rndismpk.sys=1 + +[DestinationDirs] +RNDIS_CopyFiles_98 = 10, system32/drivers +RNDIS_CopyFiles_ME = 10, system32/drivers +RNDIS_CopyFiles_NT = 12 + +[Strings] +ServiceDisplayName = "USB Remote NDIS Network Device Driver" +NetworkAddress = "Network Address" +Linux = "Linux Developer Community" +LinuxDevice = "Linux USB Ethernet/RNDIS Gadget" +SourceDisk = "Ethernet/RNDIS Gadget Driver Install Disk" + diff --git a/Documentation/usb/mtouchusb.txt b/Documentation/usb/mtouchusb.txt new file mode 100644 index 000000000000..cd806bfc8b81 --- /dev/null +++ b/Documentation/usb/mtouchusb.txt @@ -0,0 +1,76 @@ +CHANGES + +- 0.3 - Created based off of scanner & INSTALL from the original touchscreen + driver on freshmeat (http://freshmeat.net/projects/3mtouchscreendriver) +- Amended for linux-2.4.18, then 2.4.19 + +- 0.5 - Complete rewrite using Linux Input in 2.6.3 + Unfortunately no calibration support at this time + +- 1.4 - Multiple changes to support the EXII 5000UC and house cleaning + Changed reset from standard USB dev reset to vendor reset + Changed data sent to host from compensated to raw coordinates + Eliminated vendor/product module params + Performed multiple successfull tests with an EXII-5010UC + +SUPPORTED HARDWARE: + + All controllers have the Vendor: 0x0596 & Product: 0x0001 + + + Controller Description Part Number + ------------------------------------------------------ + + USB Capacitive - Pearl Case 14-205 (Discontinued) + USB Capacitive - Black Case 14-124 (Discontinued) + USB Capacitive - No Case 14-206 (Discontinued) + + USB Capacitive - Pearl Case EXII-5010UC + USB Capacitive - Black Case EXII-5030UC + USB Capacitive - No Case EXII-5050UC + +DRIVER NOTES: + +Installation is simple, you only need to add Linux Input, Linux USB, and the +driver to the kernel. The driver can also be optionally built as a module. + +This driver appears to be one of possible 2 Linux USB Input Touchscreen +drivers. Although 3M produces a binary only driver available for +download, I persist in updating this driver since I would like to use the +touchscreen for embedded apps using QTEmbedded, DirectFB, etc. So I feel the +logical choice is to use Linux Imput. + +Currently there is no way to calibrate the device via this driver. Even if +the device could be calibrated, the driver pulls to raw coordinate data from +the controller. This means calibration must be performed within the +userspace. + +The controller screen resolution is now 0 to 16384 for both X and Y reporting +the raw touch data. This is the same for the old and new capacitive USB +controllers. + +Perhaps at some point an abstract function will be placed into evdev so +generic functions like calibrations, resets, and vendor information can be +requested from the userspace (And the drivers would handle the vendor specific +tasks). + +ADDITIONAL INFORMATION/UPDATES/X CONFIGURATION EXAMPLE: + +http://groomlakelabs.com/grandamp/code/microtouch/ + +TODO: + +Implement a control urb again to handle requests to and from the device +such as calibration, etc once/if it becomes available. + +DISCLAMER: + +I am not a MicroTouch/3M employee, nor have I ever been. 3M does not support +this driver! If you want touch drivers only supported within X, please go to: + +http://www.3m.com/3MTouchSystems/downloads/ + +THANKS: + +A huge thank you to 3M Touch Systems for the EXII-5010UC controllers for +testing! diff --git a/Documentation/usb/ohci.txt b/Documentation/usb/ohci.txt new file mode 100644 index 000000000000..99320d9fa523 --- /dev/null +++ b/Documentation/usb/ohci.txt @@ -0,0 +1,32 @@ +23-Aug-2002 + +The "ohci-hcd" driver is a USB Host Controller Driver (HCD) that is derived +from the "usb-ohci" driver from the 2.4 kernel series. The "usb-ohci" code +was written primarily by Roman Weissgaerber <weissg@vienna.at> but with +contributions from many others (read its copyright/licencing header). + +It supports the "Open Host Controller Interface" (OHCI), which standardizes +hardware register protocols used to talk to USB 1.1 host controllers. As +compared to the earlier "Universal Host Controller Interface" (UHCI) from +Intel, it pushes more intelligence into the hardware. USB 1.1 controllers +from vendors other than Intel and VIA generally use OHCI. + +Changes since the 2.4 kernel include + + - improved robustness; bugfixes; and less overhead + - supports the updated and simplified usbcore APIs + - interrupt transfers can be larger, and can be queued + - less code, by using the upper level "hcd" framework + - supports some non-PCI implementations of OHCI + - ... more + +The "ohci-hcd" driver handles all USB 1.1 transfer types. Transfers of all +types can be queued. That was also true in "usb-ohci", except for interrupt +transfers. Previously, using periods of one frame would risk data loss due +to overhead in IRQ processing. When interrupt transfers are queued, those +risks can be minimized by making sure the hardware always has transfers to +work on while the OS is getting around to the relevant IRQ processing. + +- David Brownell + <dbrownell@users.sourceforge.net> + diff --git a/Documentation/usb/ov511.txt b/Documentation/usb/ov511.txt new file mode 100644 index 000000000000..e1974ec8217e --- /dev/null +++ b/Documentation/usb/ov511.txt @@ -0,0 +1,289 @@ +------------------------------------------------------------------------------- +Readme for Linux device driver for the OmniVision OV511 USB to camera bridge IC +------------------------------------------------------------------------------- + +Author: Mark McClelland +Homepage: http://alpha.dyndns.org/ov511 + +INTRODUCTION: + +This is a driver for the OV511, a USB-only chip used in many "webcam" devices. +Any camera using the OV511/OV511+ and the OV6620/OV7610/20/20AE should work. +Video capture devices that use the Philips SAA7111A decoder also work. It +supports streaming and capture of color or monochrome video via the Video4Linux +API. Most V4L apps are compatible with it. Most resolutions with a width and +height that are a multiple of 8 are supported. + +If you need more information, please visit the OV511 homepage at the above URL. + +WHAT YOU NEED: + +- If you want to help with the development, get the chip's specification docs at + http://www.ovt.com/omniusbp.html + +- A Video4Linux compatible frame grabber program (I recommend vidcat and xawtv) + vidcat is part of the w3cam package: http://www.hdk-berlin.de/~rasca/w3cam/ + xawtv is available at: http://www.in-berlin.de/User/kraxel/xawtv.html + +HOW TO USE IT: + +Note: These are simplified instructions. For complete instructions see: + http://alpha.dyndns.org/ov511/install.html + +You must have first compiled USB support, support for your specific USB host +controller (UHCI or OHCI), and Video4Linux support for your kernel (I recommend +making them modules.) Make sure "Enforce bandwidth allocation" is NOT enabled. + +Next, (as root): + + modprobe usbcore + modprobe usb-uhci <OR> modprobe usb-ohci + modprobe videodev + modprobe ov511 + +If it is not already there (it usually is), create the video device: + + mknod /dev/video0 c 81 0 + +Optionally, symlink /dev/video to /dev/video0 + +You will have to set permissions on this device to allow you to read/write +from it: + + chmod 666 /dev/video + chmod 666 /dev/video0 (if necessary) + +Now you are ready to run a video app! Both vidcat and xawtv work well for me +at 640x480. + +[Using vidcat:] + + vidcat -s 640x480 -p c > test.jpg + xview test.jpg + +[Using xawtv:] + +From the main xawtv directory: + + make clean + ./configure + make + make install + +Now you should be able to run xawtv. Right click for the options dialog. + +MODULE PARAMETERS: + + You can set these with: insmod ov511 NAME=VALUE + There is currently no way to set these on a per-camera basis. + + NAME: autobright + TYPE: integer (Boolean) + DEFAULT: 1 + DESC: Brightness is normally under automatic control and can't be set + manually by the video app. Set to 0 for manual control. + + NAME: autogain + TYPE: integer (Boolean) + DEFAULT: 1 + DESC: Auto Gain Control enable. This feature is not yet implemented. + + NAME: autoexp + TYPE: integer (Boolean) + DEFAULT: 1 + DESC: Auto Exposure Control enable. This feature is not yet implemented. + + NAME: debug + TYPE: integer (0-6) + DEFAULT: 3 + DESC: Sets the threshold for printing debug messages. The higher the value, + the more is printed. The levels are cumulative, and are as follows: + 0=no debug messages + 1=init/detection/unload and other significant messages + 2=some warning messages + 3=config/control function calls + 4=most function calls and data parsing messages + 5=highly repetitive mesgs + + NAME: snapshot + TYPE: integer (Boolean) + DEFAULT: 0 + DESC: Set to 1 to enable snapshot mode. read()/VIDIOCSYNC will block until + the snapshot button is pressed. Note: enabling this mode disables + /proc/video/ov511/<minor#>/button + + NAME: cams + TYPE: integer (1-4 for OV511, 1-31 for OV511+) + DEFAULT: 1 + DESC: Number of cameras allowed to stream simultaneously on a single bus. + Values higher than 1 reduce the data rate of each camera, allowing two + or more to be used at once. If you have a complicated setup involving + both OV511 and OV511+ cameras, trial-and-error may be necessary for + finding the optimum setting. + + NAME: compress + TYPE: integer (Boolean) + DEFAULT: 0 + DESC: Set this to 1 to turn on the camera's compression engine. This can + potentially increase the frame rate at the expense of quality, if you + have a fast CPU. You must load the proper compression module for your + camera before starting your application (ov511_decomp or ov518_decomp). + + NAME: testpat + TYPE: integer (Boolean) + DEFAULT: 0 + DESC: This configures the camera's sensor to transmit a colored test-pattern + instead of an image. This does not work correctly yet. + + NAME: dumppix + TYPE: integer (0-2) + DEFAULT: 0 + DESC: Dumps raw pixel data and skips post-processing and format conversion. + It is for debugging purposes only. Options are: + 0: Disable (default) + 1: Dump raw data from camera, excluding headers and trailers + 2: Dumps data exactly as received from camera + + NAME: led + TYPE: integer (0-2) + DEFAULT: 1 (Always on) + DESC: Controls whether the LED (the little light) on the front of the camera + is always off (0), always on (1), or only on when driver is open (2). + This is not supported with the OV511, and might only work with certain + cameras (ones that actually have the LED wired to the control pin, and + not just hard-wired to be on all the time). + + NAME: dump_bridge + TYPE: integer (Boolean) + DEFAULT: 0 + DESC: Dumps the bridge (OV511[+] or OV518[+]) register values to the system + log. Only useful for serious debugging/development purposes. + + NAME: dump_sensor + TYPE: integer (Boolean) + DEFAULT: 0 + DESC: Dumps the sensor register values to the system log. Only useful for + serious debugging/development purposes. + + NAME: printph + TYPE: integer (Boolean) + DEFAULT: 0 + DESC: Setting this to 1 will dump the first 12 bytes of each isoc frame. This + is only useful if you are trying to debug problems with the isoc data + stream (i.e.: camera initializes, but vidcat hangs until Ctrl-C). Be + warned that this dumps a large number of messages to your kernel log. + + NAME: phy, phuv, pvy, pvuv, qhy, qhuv, qvy, qvuv + TYPE: integer (0-63 for phy and phuv, 0-255 for rest) + DEFAULT: OV511 default values + DESC: These are registers 70h - 77h of the OV511, which control the + prediction ranges and quantization thresholds of the compressor, for + the Y and UV channels in the horizontal and vertical directions. See + the OV511 or OV511+ data sheet for more detailed descriptions. These + normally do not need to be changed. + + NAME: lightfreq + TYPE: integer (0, 50, or 60) + DEFAULT: 0 (use sensor default) + DESC: Sets the sensor to match your lighting frequency. This can reduce the + appearance of "banding", i.e. horizontal lines or waves of light and + dark that are often caused by artificial lighting. Valid values are: + 0 - Use default (depends on sensor, most likely 60 Hz) + 50 - For European and Asian 50 Hz power + 60 - For American 60 Hz power + + NAME: bandingfilter + TYPE: integer (Boolean) + DEFAULT: 0 (off) + DESC: Enables the sensor´s banding filter exposure algorithm. This reduces + or stabilizes the "banding" caused by some artificial light sources + (especially fluorescent). You might have to set lightfreq correctly for + this to work right. As an added bonus, this sometimes makes it + possible to capture your monitor´s output. + + NAME: fastset + TYPE: integer (Boolean) + DEFAULT: 0 (off) + DESC: Allows picture settings (brightness, contrast, color, and hue) to take + effect immediately, even in the middle of a frame. This reduces the + time to change settings, but can ruin frames during the change. Only + affects OmniVision sensors. + + NAME: force_palette + TYPE: integer (Boolean) + DEFAULT: 0 (off) + DESC: Forces the palette (color format) to a specific value. If an + application requests a different palette, it will be rejected, thereby + forcing it to try others until it succeeds. This is useful for forcing + greyscale mode with a color camera, for example. Supported modes are: + 0 (Allows all the following formats) + 1 VIDEO_PALETTE_GREY (Linear greyscale) + 10 VIDEO_PALETTE_YUV420 (YUV 4:2:0 Planar) + 15 VIDEO_PALETTE_YUV420P (YUV 4:2:0 Planar, same as 10) + + NAME: backlight + TYPE: integer (Boolean) + DEFAULT: 0 (off) + DESC: Setting this flag changes the exposure algorithm for OmniVision sensors + such that objects in the camera's view (i.e. your head) can be clearly + seen when they are illuminated from behind. It reduces or eliminates + the sensor's auto-exposure function, so it should only be used when + needed. Additionally, it is only supported with the OV6620 and OV7620. + + NAME: unit_video + TYPE: Up to 16 comma-separated integers + DEFAULT: 0,0,0... (automatically assign the next available minor(s)) + DESC: You can specify up to 16 minor numbers to be assigned to ov511 devices. + For example, "unit_video=1,3" will make the driver use /dev/video1 and + /dev/video3 for the first two devices it detects. Additional devices + will be assigned automatically starting at the first available device + node (/dev/video0 in this case). Note that you cannot specify 0 as a + minor number. This feature requires kernel version 2.4.5 or higher. + + NAME: remove_zeros + TYPE: integer (Boolean) + DEFAULT: 0 (do not skip any incoming data) + DESC: Setting this to 1 will remove zero-padding from incoming data. This + will compensate for the blocks of corruption that can appear when the + camera cannot keep up with the speed of the USB bus (eg. at low frame + resolutions). This feature is always enabled when compression is on. + + NAME: mirror + TYPE: integer (Boolean) + DEFAULT: 0 (off) + DESC: Setting this to 1 will reverse ("mirror") the image horizontally. This + might be necessary if your camera has a custom lens assembly. This has + no effect with video capture devices. + + NAME: ov518_color + TYPE: integer (Boolean) + DEFAULT: 0 (off) + DESC: Enable OV518 color support. This is off by default since it doesn't + work most of the time. If you want to try it, you must also load + ov518_decomp with the "nouv=0" parameter. If you get improper colors or + diagonal lines through the image, restart your video app and try again. + Repeat as necessary. + +WORKING FEATURES: + o Color streaming/capture at most widths and heights that are multiples of 8. + o Monochrome (use force_palette=1 to enable) + o Setting/getting of saturation, contrast, brightness, and hue (only some of + them work the OV7620 and OV7620AE) + o /proc status reporting + o SAA7111A video capture support at 320x240 and 640x480 + o Compression support + o SMP compatibility + +HOW TO CONTACT ME: + +You can email me at mark@alpha.dyndns.org . Please prefix the subject line +with "OV511: " so that I am certain to notice your message. + +CREDITS: + +The code is based in no small part on the CPiA driver by Johannes Erdfelt, +Randy Dunlap, and others. Big thanks to them for their pioneering work on that +and the USB stack. Thanks to Bret Wallach for getting camera reg IO, ISOC, and +image capture working. Thanks to Orion Sky Lawlor, Kevin Moore, and Claudio +Matsuoka for their work as well. + diff --git a/Documentation/usb/proc_usb_info.txt b/Documentation/usb/proc_usb_info.txt new file mode 100644 index 000000000000..729c72d34c89 --- /dev/null +++ b/Documentation/usb/proc_usb_info.txt @@ -0,0 +1,371 @@ +/proc/bus/usb filesystem output +=============================== +(version 2003.05.30) + + +The usbfs filesystem for USB devices is traditionally mounted at +/proc/bus/usb. It provides the /proc/bus/usb/devices file, as well as +the /proc/bus/usb/BBB/DDD files. + + +**NOTE**: If /proc/bus/usb appears empty, and a host controller + driver has been linked, then you need to mount the + filesystem. Issue the command (as root): + + mount -t usbfs none /proc/bus/usb + + An alternative and more permanent method would be to add + + none /proc/bus/usb usbfs defaults 0 0 + + to /etc/fstab. This will mount usbfs at each reboot. + You can then issue `cat /proc/bus/usb/devices` to extract + USB device information, and user mode drivers can use usbfs + to interact with USB devices. + + There are a number of mount options supported by usbfs. + Consult the source code (linux/drivers/usb/core/inode.c) for + information about those options. + +**NOTE**: The filesystem has been renamed from "usbdevfs" to + "usbfs", to reduce confusion with "devfs". You may + still see references to the older "usbdevfs" name. + +For more information on mounting the usbfs file system, see the +"USB Device Filesystem" section of the USB Guide. The latest copy +of the USB Guide can be found at http://www.linux-usb.org/ + + +THE /proc/bus/usb/BBB/DDD FILES: +-------------------------------- +Each connected USB device has one file. The BBB indicates the bus +number. The DDD indicates the device address on that bus. Both +of these numbers are assigned sequentially, and can be reused, so +you can't rely on them for stable access to devices. For example, +it's relatively common for devices to re-enumerate while they are +still connected (perhaps someone jostled their power supply, hub, +or USB cable), so a device might be 002/027 when you first connect +it and 002/048 sometime later. + +These files can be read as binary data. The binary data consists +of first the device descriptor, then the descriptors for each +configuration of the device. That information is also shown in +text form by the /proc/bus/usb/devices file, described later. + +These files may also be used to write user-level drivers for the USB +devices. You would open the /proc/bus/usb/BBB/DDD file read/write, +read its descriptors to make sure it's the device you expect, and then +bind to an interface (or perhaps several) using an ioctl call. You +would issue more ioctls to the device to communicate to it using +control, bulk, or other kinds of USB transfers. The IOCTLs are +listed in the <linux/usbdevice_fs.h> file, and at this writing the +source code (linux/drivers/usb/devio.c) is the primary reference +for how to access devices through those files. + +Note that since by default these BBB/DDD files are writable only by +root, only root can write such user mode drivers. You can selectively +grant read/write permissions to other users by using "chmod". Also, +usbfs mount options such as "devmode=0666" may be helpful. + + + +THE /proc/bus/usb/devices FILE: +------------------------------- +In /proc/bus/usb/devices, each device's output has multiple +lines of ASCII output. +I made it ASCII instead of binary on purpose, so that someone +can obtain some useful data from it without the use of an +auxiliary program. However, with an auxiliary program, the numbers +in the first 4 columns of each "T:" line (topology info: +Lev, Prnt, Port, Cnt) can be used to build a USB topology diagram. + +Each line is tagged with a one-character ID for that line: + +T = Topology (etc.) +B = Bandwidth (applies only to USB host controllers, which are + virtualized as root hubs) +D = Device descriptor info. +P = Product ID info. (from Device descriptor, but they won't fit + together on one line) +S = String descriptors. +C = Configuration descriptor info. (* = active configuration) +I = Interface descriptor info. +E = Endpoint descriptor info. + +======================================================================= + +/proc/bus/usb/devices output format: + +Legend: + d = decimal number (may have leading spaces or 0's) + x = hexadecimal number (may have leading spaces or 0's) + s = string + + +Topology info: + +T: Bus=dd Lev=dd Prnt=dd Port=dd Cnt=dd Dev#=ddd Spd=ddd MxCh=dd +| | | | | | | | |__MaxChildren +| | | | | | | |__Device Speed in Mbps +| | | | | | |__DeviceNumber +| | | | | |__Count of devices at this level +| | | | |__Connector/Port on Parent for this device +| | | |__Parent DeviceNumber +| | |__Level in topology for this bus +| |__Bus number +|__Topology info tag + + Speed may be: + 1.5 Mbit/s for low speed USB + 12 Mbit/s for full speed USB + 480 Mbit/s for high speed USB (added for USB 2.0) + + +Bandwidth info: +B: Alloc=ddd/ddd us (xx%), #Int=ddd, #Iso=ddd +| | | |__Number of isochronous requests +| | |__Number of interrupt requests +| |__Total Bandwidth allocated to this bus +|__Bandwidth info tag + + Bandwidth allocation is an approximation of how much of one frame + (millisecond) is in use. It reflects only periodic transfers, which + are the only transfers that reserve bandwidth. Control and bulk + transfers use all other bandwidth, including reserved bandwidth that + is not used for transfers (such as for short packets). + + The percentage is how much of the "reserved" bandwidth is scheduled by + those transfers. For a low or full speed bus (loosely, "USB 1.1"), + 90% of the bus bandwidth is reserved. For a high speed bus (loosely, + "USB 2.0") 80% is reserved. + + +Device descriptor info & Product ID info: + +D: Ver=x.xx Cls=xx(s) Sub=xx Prot=xx MxPS=dd #Cfgs=dd +P: Vendor=xxxx ProdID=xxxx Rev=xx.xx + +where +D: Ver=x.xx Cls=xx(sssss) Sub=xx Prot=xx MxPS=dd #Cfgs=dd +| | | | | | |__NumberConfigurations +| | | | | |__MaxPacketSize of Default Endpoint +| | | | |__DeviceProtocol +| | | |__DeviceSubClass +| | |__DeviceClass +| |__Device USB version +|__Device info tag #1 + +where +P: Vendor=xxxx ProdID=xxxx Rev=xx.xx +| | | |__Product revision number +| | |__Product ID code +| |__Vendor ID code +|__Device info tag #2 + + +String descriptor info: + +S: Manufacturer=ssss +| |__Manufacturer of this device as read from the device. +| For USB host controller drivers (virtual root hubs) this may +| be omitted, or (for newer drivers) will identify the kernel +| version and the driver which provides this hub emulation. +|__String info tag + +S: Product=ssss +| |__Product description of this device as read from the device. +| For older USB host controller drivers (virtual root hubs) this +| indicates the driver; for newer ones, it's a product (and vendor) +| description that often comes from the kernel's PCI ID database. +|__String info tag + +S: SerialNumber=ssss +| |__Serial Number of this device as read from the device. +| For USB host controller drivers (virtual root hubs) this is +| some unique ID, normally a bus ID (address or slot name) that +| can't be shared with any other device. +|__String info tag + + + +Configuration descriptor info: + +C:* #Ifs=dd Cfg#=dd Atr=xx MPwr=dddmA +| | | | | |__MaxPower in mA +| | | | |__Attributes +| | | |__ConfiguratioNumber +| | |__NumberOfInterfaces +| |__ "*" indicates the active configuration (others are " ") +|__Config info tag + + USB devices may have multiple configurations, each of which act + rather differently. For example, a bus-powered configuration + might be much less capable than one that is self-powered. Only + one device configuration can be active at a time; most devices + have only one configuration. + + Each configuration consists of one or more interfaces. Each + interface serves a distinct "function", which is typically bound + to a different USB device driver. One common example is a USB + speaker with an audio interface for playback, and a HID interface + for use with software volume control. + + +Interface descriptor info (can be multiple per Config): + +I: If#=dd Alt=dd #EPs=dd Cls=xx(sssss) Sub=xx Prot=xx Driver=ssss +| | | | | | | |__Driver name +| | | | | | | or "(none)" +| | | | | | |__InterfaceProtocol +| | | | | |__InterfaceSubClass +| | | | |__InterfaceClass +| | | |__NumberOfEndpoints +| | |__AlternateSettingNumber +| |__InterfaceNumber +|__Interface info tag + + A given interface may have one or more "alternate" settings. + For example, default settings may not use more than a small + amount of periodic bandwidth. To use significant fractions + of bus bandwidth, drivers must select a non-default altsetting. + + Only one setting for an interface may be active at a time, and + only one driver may bind to an interface at a time. Most devices + have only one alternate setting per interface. + + +Endpoint descriptor info (can be multiple per Interface): + +E: Ad=xx(s) Atr=xx(ssss) MxPS=dddd Ivl=dddss +| | | | |__Interval (max) between transfers +| | | |__EndpointMaxPacketSize +| | |__Attributes(EndpointType) +| |__EndpointAddress(I=In,O=Out) +|__Endpoint info tag + + The interval is nonzero for all periodic (interrupt or isochronous) + endpoints. For high speed endpoints the transfer interval may be + measured in microseconds rather than milliseconds. + + For high speed periodic endpoints, the "MaxPacketSize" reflects + the per-microframe data transfer size. For "high bandwidth" + endpoints, that can reflect two or three packets (for up to + 3KBytes every 125 usec) per endpoint. + + With the Linux-USB stack, periodic bandwidth reservations use the + transfer intervals and sizes provided by URBs, which can be less + than those found in endpoint descriptor. + + +======================================================================= + + +If a user or script is interested only in Topology info, for +example, use something like "grep ^T: /proc/bus/usb/devices" +for only the Topology lines. A command like +"grep -i ^[tdp]: /proc/bus/usb/devices" can be used to list +only the lines that begin with the characters in square brackets, +where the valid characters are TDPCIE. With a slightly more able +script, it can display any selected lines (for example, only T, D, +and P lines) and change their output format. (The "procusb" +Perl script is the beginning of this idea. It will list only +selected lines [selected from TBDPSCIE] or "All" lines from +/proc/bus/usb/devices.) + +The Topology lines can be used to generate a graphic/pictorial +of the USB devices on a system's root hub. (See more below +on how to do this.) + +The Interface lines can be used to determine what driver is +being used for each device. + +The Configuration lines could be used to list maximum power +(in milliamps) that a system's USB devices are using. +For example, "grep ^C: /proc/bus/usb/devices". + + +Here's an example, from a system which has a UHCI root hub, +an external hub connected to the root hub, and a mouse and +a serial converter connected to the external hub. + +T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 +B: Alloc= 28/900 us ( 3%), #Int= 2, #Iso= 0 +D: Ver= 1.00 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 +P: Vendor=0000 ProdID=0000 Rev= 0.00 +S: Product=USB UHCI Root Hub +S: SerialNumber=dce0 +C:* #Ifs= 1 Cfg#= 1 Atr=40 MxPwr= 0mA +I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub +E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=255ms +T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4 +D: Ver= 1.00 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 +P: Vendor=0451 ProdID=1446 Rev= 1.00 +C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=100mA +I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub +E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=255ms +T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0 +D: Ver= 1.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 +P: Vendor=04b4 ProdID=0001 Rev= 0.00 +C:* #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=100mA +I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=mouse +E: Ad=81(I) Atr=03(Int.) MxPS= 3 Ivl= 10ms +T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0 +D: Ver= 1.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 +P: Vendor=0565 ProdID=0001 Rev= 1.08 +S: Manufacturer=Peracom Networks, Inc. +S: Product=Peracom USB to Serial Converter +C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA +I: If#= 0 Alt= 0 #EPs= 3 Cls=00(>ifc ) Sub=00 Prot=00 Driver=serial +E: Ad=81(I) Atr=02(Bulk) MxPS= 64 Ivl= 16ms +E: Ad=01(O) Atr=02(Bulk) MxPS= 16 Ivl= 16ms +E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl= 8ms + + +Selecting only the "T:" and "I:" lines from this (for example, by using +"procusb ti"), we have: + +T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2 +T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4 +I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub +T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0 +I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=mouse +T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0 +I: If#= 0 Alt= 0 #EPs= 3 Cls=00(>ifc ) Sub=00 Prot=00 Driver=serial + + +Physically this looks like (or could be converted to): + + +------------------+ + | PC/root_hub (12)| Dev# = 1 + +------------------+ (nn) is Mbps. + Level 0 | CN.0 | CN.1 | [CN = connector/port #] + +------------------+ + / + / + +-----------------------+ + Level 1 | Dev#2: 4-port hub (12)| + +-----------------------+ + |CN.0 |CN.1 |CN.2 |CN.3 | + +-----------------------+ + \ \____________________ + \_____ \ + \ \ + +--------------------+ +--------------------+ + Level 2 | Dev# 3: mouse (1.5)| | Dev# 4: serial (12)| + +--------------------+ +--------------------+ + + + +Or, in a more tree-like structure (ports [Connectors] without +connections could be omitted): + +PC: Dev# 1, root hub, 2 ports, 12 Mbps +|_ CN.0: Dev# 2, hub, 4 ports, 12 Mbps + |_ CN.0: Dev #3, mouse, 1.5 Mbps + |_ CN.1: + |_ CN.2: Dev #4, serial, 12 Mbps + |_ CN.3: +|_ CN.1: + + + ### END ### diff --git a/Documentation/usb/rio.txt b/Documentation/usb/rio.txt new file mode 100644 index 000000000000..0aa79ab0088c --- /dev/null +++ b/Documentation/usb/rio.txt @@ -0,0 +1,138 @@ +Copyright (C) 1999, 2000 Bruce Tenison +Portions Copyright (C) 1999, 2000 David Nelson +Thanks to David Nelson for guidance and the usage of the scanner.txt +and scanner.c files to model our driver and this informative file. + +Mar. 2, 2000 + +CHANGES + +- Initial Revision + + +OVERVIEW + +This README will address issues regarding how to configure the kernel +to access a RIO 500 mp3 player. +Before I explain how to use this to access the Rio500 please be warned: + +W A R N I N G: +-------------- + +Please note that this software is still under development. The authors +are in no way responsible for any damage that may occur, no matter how +inconsequential. + +It seems that the Rio has a problem when sending .mp3 with low batteries. +I suggest when the batteries are low and want to transfer stuff that you +replace it with a fresh one. In my case, what happened is I lost two 16kb +blocks (they are no longer usable to store information to it). But I don't +know if thats normal or not. It could simply be a problem with the flash +memory. + +In an extreme case, I left my Rio playing overnight and the batteries wore +down to nothing and appear to have corrupted the flash memory. My RIO +needed to be replaced as a result. Diamond tech support is aware of the +problem. Do NOT allow your batteries to wear down to nothing before +changing them. It appears RIO 500 firmware does not handle low battery +power well at all. + +On systems with OHCI controllers, the kernel OHCI code appears to have +power on problems with some chipsets. If you are having problems +connecting to your RIO 500, try turning it on first and then plugging it +into the USB cable. + +Contact information: +-------------------- + + The main page for the project is hosted at sourceforge.net in the following + address: http://rio500.sourceforge.net You can also go to the sourceforge + project page at: http://sourceforge.net/project/?group_id=1944 There is + also a mailing list: rio500-users@lists.sourceforge.net + +Authors: +------- + +Most of the code was written by Cesar Miquel <miquel@df.uba.ar>. Keith +Clayton <kclayton@jps.net> is incharge of the PPC port and making sure +things work there. Bruce Tenison <btenison@dibbs.net> is adding support +for .fon files and also does testing. The program will mostly sure be +re-written and Pete Ikusz along with the rest will re-design it. I would +also like to thank Tri Nguyen <tmn_3022000@hotmail.com> who provided use +with some important information regarding the communication with the Rio. + +ADDITIONAL INFORMATION and Userspace tools + +http://rio500.sourceforge.net/ + + +REQUIREMENTS + +A host with a USB port. Ideally, either a UHCI (Intel) or OHCI +(Compaq and others) hardware port should work. + +A Linux development kernel (2.3.x) with USB support enabled or a +backported version to linux-2.2.x. See http://www.linux-usb.org for +more information on accomplishing this. + +A Linux kernel with RIO 500 support enabled. + +'lspci' which is only needed to determine the type of USB hardware +available in your machine. + +CONFIGURATION + +Using `lspci -v`, determine the type of USB hardware available. + + If you see something like: + + USB Controller: ...... + Flags: ..... + I/O ports at .... + + Then you have a UHCI based controller. + + If you see something like: + + USB Controller: ..... + Flags: .... + Memory at ..... + + Then you have a OHCI based controller. + +Using `make menuconfig` or your preferred method for configuring the +kernel, select 'Support for USB', 'OHCI/UHCI' depending on your +hardware (determined from the steps above), 'USB Diamond Rio500 support', and +'Preliminary USB device filesystem'. Compile and install the modules +(you may need to execute `depmod -a` to update the module +dependencies). + +Add a device for the USB rio500: + `mknod /dev/usb/rio500 c 180 64` + +Set appropriate permissions for /dev/usb/rio500 (don't forget about +group and world permissions). Both read and write permissions are +required for proper operation. + +Load the appropriate modules (if compiled as modules): + + OHCI: + modprobe usbcore + modprobe usb-ohci + modprobe rio500 + + UHCI: + modprobe usbcore + modprobe usb-uhci (or uhci) + modprobe rio500 + +That's it. The Rio500 Utils at: http://rio500.sourceforge.net should +be able to access the rio500. + +BUGS + +If you encounter any problems feel free to drop me an email. + +Bruce Tenison +btenison@dibbs.net + diff --git a/Documentation/usb/se401.txt b/Documentation/usb/se401.txt new file mode 100644 index 000000000000..7b9d1c960a10 --- /dev/null +++ b/Documentation/usb/se401.txt @@ -0,0 +1,54 @@ +Linux driver for SE401 based USB cameras + +Copyright, 2001, Jeroen Vreeken + + +INTRODUCTION: + +The SE401 chip is the used in low-cost usb webcams. +It is produced by Endpoints Inc. (www.endpoints.com). +It interfaces directly to a cmos image sensor and USB. The only other major +part in a se401 based camera is a dram chip. + +The following cameras are known to work with this driver: + +Aox se401 (non-branded) cameras +Philips PVCV665 USB VGA webcam 'Vesta Fun' +Kensington VideoCAM PC Camera Model 67014 +Kensington VideoCAM PC Camera Model 67015 +Kensington VideoCAM PC Camera Model 67016 +Kensington VideoCAM PC Camera Model 67017 + + +WHAT YOU NEED: + +- USB support +- VIDEO4LINUX support + +More information about USB support for linux can be found at: +http://www.linux-usb.org + + +MODULE OPTIONS: + +When the driver is compiled as a module you can also use the 'flickerless' +option. With it exposure is limited to values that do not interfere with the +net frequency. Valid options for this option are 0, 50 and 60. (0=disable, +50=50hz, 60=60hz) + + +KNOWN PROBLEMS: + +The driver works fine with the usb-ohci and uhci host controller drivers, +the default settings also work with usb-uhci. But sending more than one bulk +transfer at a time with usb-uhci doesn't work yet. +Users of usb-ohci and uhci can safely enlarge SE401_NUMSBUF in se401.h in +order to increase the throughput (and thus framerate). + + +HELP: + +The latest info on this driver can be found at: +http://www.chello.nl/~j.vreeken/se401/ +And questions to me can be send to: +pe1rxq@amsat.org diff --git a/Documentation/usb/sn9c102.txt b/Documentation/usb/sn9c102.txt new file mode 100644 index 000000000000..cf9a1187edce --- /dev/null +++ b/Documentation/usb/sn9c102.txt @@ -0,0 +1,480 @@ + + SN9C10x PC Camera Controllers + Driver for Linux + ============================= + + - Documentation - + + +Index +===== +1. Copyright +2. Disclaimer +3. License +4. Overview and features +5. Module dependencies +6. Module loading +7. Module parameters +8. Optional device control through "sysfs" +9. Supported devices +10. How to add plug-in's for new image sensors +11. Notes for V4L2 application developers +12. Video frame formats +13. Contact information +14. Credits + + +1. Copyright +============ +Copyright (C) 2004-2005 by Luca Risolia <luca.risolia@studio.unibo.it> + + +2. Disclaimer +============= +SONiX is a trademark of SONiX Technology Company Limited, inc. +This software is not sponsored or developed by SONiX. + + +3. License +========== +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + + +4. Overview and features +======================== +This driver attempts to support the video and audio streaming capabilities of +the devices mounting the SONiX SN9C101, SN9C102 and SN9C103 PC Camera +Controllers. + +It's worth to note that SONiX has never collaborated with the author during the +development of this project, despite several requests for enough detailed +specifications of the register tables, compression engine and video data format +of the above chips. Nevertheless, these informations are no longer necessary, +becouse all the aspects related to these chips are known and have been +described in detail in this documentation. + +The driver relies on the Video4Linux2 and USB core modules. It has been +designed to run properly on SMP systems as well. + +The latest version of the SN9C10x driver can be found at the following URL: +http://www.linux-projects.org/ + +Some of the features of the driver are: + +- full compliance with the Video4Linux2 API (see also "Notes for V4L2 + application developers" paragraph); +- available mmap or read/poll methods for video streaming through isochronous + data transfers; +- automatic detection of image sensor; +- support for any window resolutions and optional panning within the maximum + pixel area of image sensor; +- image downscaling with arbitrary scaling factors from 1, 2 and 4 in both + directions (see "Notes for V4L2 application developers" paragraph); +- two different video formats for uncompressed or compressed data in low or + high compression quality (see also "Notes for V4L2 application developers" + and "Video frame formats" paragraphs); +- full support for the capabilities of many of the possible image sensors that + can be connected to the SN9C10x bridges, including, for istance, red, green, + blue and global gain adjustments and exposure (see "Supported devices" + paragraph for details); +- use of default color settings for sunlight conditions; +- dynamic I/O interface for both SN9C10x and image sensor control and + monitoring (see "Optional device control through 'sysfs'" paragraph); +- dynamic driver control thanks to various module parameters (see "Module + parameters" paragraph); +- up to 64 cameras can be handled at the same time; they can be connected and + disconnected from the host many times without turning off the computer, if + your system supports hotplugging; +- no known bugs. + + +5. Module dependencies +====================== +For it to work properly, the driver needs kernel support for Video4Linux and +USB. + +The following options of the kernel configuration file must be enabled and +corresponding modules must be compiled: + + # Multimedia devices + # + CONFIG_VIDEO_DEV=m + + # USB support + # + CONFIG_USB=m + +In addition, depending on the hardware being used, the modules below are +necessary: + + # USB Host Controller Drivers + # + CONFIG_USB_EHCI_HCD=m + CONFIG_USB_UHCI_HCD=m + CONFIG_USB_OHCI_HCD=m + +And finally: + + # USB Multimedia devices + # + CONFIG_USB_SN9C102=m + + +6. Module loading +================= +To use the driver, it is necessary to load the "sn9c102" module into memory +after every other module required: "videodev", "usbcore" and, depending on +the USB host controller you have, "ehci-hcd", "uhci-hcd" or "ohci-hcd". + +Loading can be done as shown below: + + [root@localhost home]# modprobe sn9c102 + +At this point the devices should be recognized. You can invoke "dmesg" to +analyze kernel messages and verify that the loading process has gone well: + + [user@localhost home]$ dmesg + + +7. Module parameters +==================== +Module parameters are listed below: +------------------------------------------------------------------------------- +Name: video_nr +Type: int array (min = 0, max = 64) +Syntax: <-1|n[,...]> +Description: Specify V4L2 minor mode number: + -1 = use next available + n = use minor number n + You can specify up to 64 cameras this way. + For example: + video_nr=-1,2,-1 would assign minor number 2 to the second + recognized camera and use auto for the first one and for every + other camera. +Default: -1 +------------------------------------------------------------------------------- +Name: force_munmap; +Type: bool array (min = 0, max = 64) +Syntax: <0|1[,...]> +Description: Force the application to unmap previously mapped buffer memory + before calling any VIDIOC_S_CROP or VIDIOC_S_FMT ioctl's. Not + all the applications support this feature. This parameter is + specific for each detected camera. + 0 = do not force memory unmapping" + 1 = force memory unmapping (save memory)" +Default: 0 +------------------------------------------------------------------------------- +Name: debug +Type: int +Syntax: <n> +Description: Debugging information level, from 0 to 3: + 0 = none (use carefully) + 1 = critical errors + 2 = significant informations + 3 = more verbose messages + Level 3 is useful for testing only, when only one device + is used. It also shows some more informations about the + hardware being detected. This parameter can be changed at + runtime thanks to the /sys filesystem. +Default: 2 +------------------------------------------------------------------------------- + + +8. Optional device control through "sysfs" [1] +========================================== +It is possible to read and write both the SN9C10x and the image sensor +registers by using the "sysfs" filesystem interface. + +Every time a supported device is recognized, a write-only file named "green" is +created in the /sys/class/video4linux/videoX directory. You can set the green +channel's gain by writing the desired value to it. The value may range from 0 +to 15 for SN9C101 or SN9C102 bridges, from 0 to 127 for SN9C103 bridges. +Similarly, only for SN9C103 controllers, blue and red gain control files are +available in the same directory, for which accepted values may range from 0 to +127. + +There are other four entries in the directory above for each registered camera: +"reg", "val", "i2c_reg" and "i2c_val". The first two files control the +SN9C10x bridge, while the other two control the sensor chip. "reg" and +"i2c_reg" hold the values of the current register index where the following +reading/writing operations are addressed at through "val" and "i2c_val". Their +use is not intended for end-users. Note that "i2c_reg" and "i2c_val" will not +be created if the sensor does not actually support the standard I2C protocol or +its registers are not 8-bit long. Also, remember that you must be logged in as +root before writing to them. + +As an example, suppose we were to want to read the value contained in the +register number 1 of the sensor register table - which is usually the product +identifier - of the camera registered as "/dev/video0": + + [root@localhost #] cd /sys/class/video4linux/video0 + [root@localhost #] echo 1 > i2c_reg + [root@localhost #] cat i2c_val + +Note that "cat" will fail if sensor registers cannot be read. + +Now let's set the green gain's register of the SN9C101 or SN9C102 chips to 2: + + [root@localhost #] echo 0x11 > reg + [root@localhost #] echo 2 > val + +Note that the SN9C10x always returns 0 when some of its registers are read. +To avoid race conditions, all the I/O accesses to the above files are +serialized. + +The sysfs interface also provides the "frame_header" entry, which exports the +frame header of the most recent requested and captured video frame. The header +is 12-bytes long and is appended to every video frame by the SN9C10x +controllers. As an example, this additional information can be used by the user +application for implementing auto-exposure features via software. + +The following table describes the frame header: + +Byte # Value Description +------ ----- ----------- +0x00 0xFF Frame synchronisation pattern. +0x01 0xFF Frame synchronisation pattern. +0x02 0x00 Frame synchronisation pattern. +0x03 0xC4 Frame synchronisation pattern. +0x04 0xC4 Frame synchronisation pattern. +0x05 0x96 Frame synchronisation pattern. +0x06 0x00 or 0x01 Unknown meaning. The exact value depends on the chip. +0x07 0xXX Variable value, whose bits are ff00uzzc, where ff is a + frame counter, u is unknown, zz is a size indicator + (00 = VGA, 01 = SIF, 10 = QSIF) and c stands for + "compression enabled" (1 = yes, 0 = no). +0x08 0xXX Brightness sum inside Auto-Exposure area (low-byte). +0x09 0xXX Brightness sum inside Auto-Exposure area (high-byte). + For a pure white image, this number will be equal to 500 + times the area of the specified AE area. For images + that are not pure white, the value scales down according + to relative whiteness. +0x0A 0xXX Brightness sum outside Auto-Exposure area (low-byte). +0x0B 0xXX Brightness sum outside Auto-Exposure area (high-byte). + For a pure white image, this number will be equal to 125 + times the area outside of the specified AE area. For + images that are not pure white, the value scales down + according to relative whiteness. + +The AE area (sx, sy, ex, ey) in the active window can be set by programming the +registers 0x1c, 0x1d, 0x1e and 0x1f of the SN9C10x controllers, where one unit +corresponds to 32 pixels. + +[1] The frame header has been documented by Bertrik Sikken. + + +9. Supported devices +==================== +None of the names of the companies as well as their products will be mentioned +here. They have never collaborated with the author, so no advertising. + +From the point of view of a driver, what unambiguously identify a device are +its vendor and product USB identifiers. Below is a list of known identifiers of +devices mounting the SN9C10x PC camera controllers: + +Vendor ID Product ID +--------- ---------- +0x0c45 0x6001 +0x0c45 0x6005 +0x0c45 0x6009 +0x0c45 0x600d +0x0c45 0x6024 +0x0c45 0x6025 +0x0c45 0x6028 +0x0c45 0x6029 +0x0c45 0x602a +0x0c45 0x602b +0x0c45 0x602c +0x0c45 0x6030 +0x0c45 0x6080 +0x0c45 0x6082 +0x0c45 0x6083 +0x0c45 0x6088 +0x0c45 0x608a +0x0c45 0x608b +0x0c45 0x608c +0x0c45 0x608e +0x0c45 0x608f +0x0c45 0x60a0 +0x0c45 0x60a2 +0x0c45 0x60a3 +0x0c45 0x60a8 +0x0c45 0x60aa +0x0c45 0x60ab +0x0c45 0x60ac +0x0c45 0x60ae +0x0c45 0x60af +0x0c45 0x60b0 +0x0c45 0x60b2 +0x0c45 0x60b3 +0x0c45 0x60b8 +0x0c45 0x60ba +0x0c45 0x60bb +0x0c45 0x60bc +0x0c45 0x60be + +The list above does not imply that all those devices work with this driver: up +until now only the ones that mount the following image sensors are supported; +kernel messages will always tell you whether this is the case: + +Model Manufacturer +----- ------------ +HV7131D Hynix Semiconductor, Inc. +MI-0343 Micron Technology, Inc. +PAS106B PixArt Imaging, Inc. +PAS202BCB PixArt Imaging, Inc. +TAS5110C1B Taiwan Advanced Sensor Corporation +TAS5130D1B Taiwan Advanced Sensor Corporation + +All the available control settings of each image sensor are supported through +the V4L2 interface. + +Donations of new models for further testing and support would be much +appreciated. Non-available hardware will not be supported by the author of this +driver. + + +10. How to add plug-in's for new image sensors +============================================== +It should be easy to write plug-in's for new sensors by using the small API +that has been created for this purpose, which is present in "sn9c102_sensor.h" +(documentation is included there). As an example, have a look at the code in +"sn9c102_pas106b.c", which uses the mentioned interface. + +At the moment, possible unsupported image sensors are: CIS-VF10 (VGA), +OV7620 (VGA), OV7630 (VGA). + + +11. Notes for V4L2 application developers +========================================= +This driver follows the V4L2 API specifications. In particular, it enforces two +rules: + +- exactly one I/O method, either "mmap" or "read", is associated with each +file descriptor. Once it is selected, the application must close and reopen the +device to switch to the other I/O method; + +- although it is not mandatory, previously mapped buffer memory should always +be unmapped before calling any "VIDIOC_S_CROP" or "VIDIOC_S_FMT" ioctl's. +The same number of buffers as before will be allocated again to match the size +of the new video frames, so you have to map the buffers again before any I/O +attempts on them. + +Consistently with the hardware limits, this driver also supports image +downscaling with arbitrary scaling factors from 1, 2 and 4 in both directions. +However, the V4L2 API specifications don't correctly define how the scaling +factor can be chosen arbitrarily by the "negotiation" of the "source" and +"target" rectangles. To work around this flaw, we have added the convention +that, during the negotiation, whenever the "VIDIOC_S_CROP" ioctl is issued, the +scaling factor is restored to 1. + +This driver supports two different video formats: the first one is the "8-bit +Sequential Bayer" format and can be used to obtain uncompressed video data +from the device through the current I/O method, while the second one provides +"raw" compressed video data (without frame headers not related to the +compressed data). The compression quality may vary from 0 to 1 and can be +selected or queried thanks to the VIDIOC_S_JPEGCOMP and VIDIOC_G_JPEGCOMP V4L2 +ioctl's. For maximum flexibility, both the default active video format and the +default compression quality depend on how the image sensor being used is +initialized (as described in the documentation of the API for the image sensors +supplied by this driver). + + +12. Video frame formats [1] +======================= +The SN9C10x PC Camera Controllers can send images in two possible video +formats over the USB: either native "Sequential RGB Bayer" or Huffman +compressed. The latter is used to achieve high frame rates. The current video +format may be selected or queried from the user application by calling the +VIDIOC_S_FMT or VIDIOC_G_FMT ioctl's, as described in the V4L2 API +specifications. + +The name "Sequential Bayer" indicates the organization of the red, green and +blue pixels in one video frame. Each pixel is associated with a 8-bit long +value and is disposed in memory according to the pattern shown below: + +B[0] G[1] B[2] G[3] ... B[m-2] G[m-1] +G[m] R[m+1] G[m+2] R[m+2] ... G[2m-2] R[2m-1] +... +... B[(n-1)(m-2)] G[(n-1)(m-1)] +... G[n(m-2)] R[n(m-1)] + +The above matrix also represents the sequential or progressive read-out mode of +the (n, m) Bayer color filter array used in many CCD/CMOS image sensors. + +One compressed video frame consists of a bitstream that encodes for every R, G, +or B pixel the difference between the value of the pixel itself and some +reference pixel value. Pixels are organised in the Bayer pattern and the Bayer +sub-pixels are tracked individually and alternatingly. For example, in the +first line values for the B and G1 pixels are alternatingly encoded, while in +the second line values for the G2 and R pixels are alternatingly encoded. + +The pixel reference value is calculated as follows: +- the 4 top left pixels are encoded in raw uncompressed 8-bit format; +- the value in the top two rows is the value of the pixel left of the current + pixel; +- the value in the left column is the value of the pixel above the current + pixel; +- for all other pixels, the reference value is the average of the value of the + pixel on the left and the value of the pixel above the current pixel; +- there is one code in the bitstream that specifies the value of a pixel + directly (in 4-bit resolution); +- pixel values need to be clamped inside the range [0..255] for proper + decoding. + +The algorithm purely describes the conversion from compressed Bayer code used +in the SN9C10x chips to uncompressed Bayer. Additional steps are required to +convert this to a color image (i.e. a color interpolation algorithm). + +The following Huffman codes have been found: +0: +0 (relative to reference pixel value) +100: +4 +101: -4? +1110xxxx: set absolute value to xxxx.0000 +1101: +11 +1111: -11 +11001: +20 +110000: -20 +110001: ??? - these codes are apparently not used + +[1] The Huffman compression algorithm has been reverse-engineered and + documented by Bertrik Sikken. + + +13. Contact information +======================= +The author may be contacted by e-mail at <luca.risolia@studio.unibo.it>. + +GPG/PGP encrypted e-mail's are accepted. The GPG key ID of the author is +'FCE635A4'; the public 1024-bit key should be available at any keyserver; +the fingerprint is: '88E8 F32F 7244 68BA 3958 5D40 99DA 5D2A FCE6 35A4'. + + +14. Credits +=========== +Many thanks to following persons for their contribute (listed in alphabetical +order): + +- Luca Capello for the donation of a webcam; +- Joao Rodrigo Fuzaro, Joao Limirio, Claudio Filho and Caio Begotti for the + donation of a webcam; +- Carlos Eduardo Medaglia Dyonisio, who added the support for the PAS202BCB + image sensor; +- Stefano Mozzi, who donated 45 EU; +- Bertrik Sikken, who reverse-engineered and documented the Huffman compression + algorithm used in the SN9C10x controllers and implemented the first decoder; +- Mizuno Takafumi for the donation of a webcam; +- An "anonymous" donator (who didn't want his name to be revealed) for the + donation of a webcam. diff --git a/Documentation/usb/stv680.txt b/Documentation/usb/stv680.txt new file mode 100644 index 000000000000..6448041e7a37 --- /dev/null +++ b/Documentation/usb/stv680.txt @@ -0,0 +1,55 @@ +Linux driver for STV0680 based USB cameras + +Copyright, 2001, Kevin Sisson + + +INTRODUCTION: + +STMicroelectronics produces the STV0680B chip, which comes in two +types, -001 and -003. The -003 version allows the recording and downloading +of sound clips from the camera, and allows a flash attachment. Otherwise, +it uses the same commands as the -001 version. Both versions support a +variety of SDRAM sizes and sensors, allowing for a maximum of 26 VGA or 20 +CIF pictures. The STV0680 supports either a serial or a usb interface, and +video is possible through the usb interface. + +The following cameras are known to work with this driver, although any +camera with Vendor/Product codes of 0553/0202 should work: + +Aiptek Pencam (various models) +Nisis QuickPix 2 +Radio Shack 'Kid's digital camera' (#60-1207) +At least one Trust Spycam model +Several other European brand models + +WHAT YOU NEED: + +- USB support +- VIDEO4LINUX support + +More information about USB support for linux can be found at: +http://www.linux-usb.org + + +MODULE OPTIONS: + +When the driver is compiled as a module, you can set a "swapRGB=1" +option, if necessary, for those applications that require it +(such as xawtv). However, the driver should detect and set this +automatically, so this option should not normally be used. + + +KNOWN PROBLEMS: + +The driver seems to work better with the usb-ohci than the usb-uhci host +controller driver. + +HELP: + +The latest info on this driver can be found at: +http://personal.clt.bellsouth.net/~kjsisson or at +http://stv0680-usb.sourceforge.net + +Any questions to me can be send to: kjsisson@bellsouth.net + + diff --git a/Documentation/usb/uhci.txt b/Documentation/usb/uhci.txt new file mode 100644 index 000000000000..2f25952c86c6 --- /dev/null +++ b/Documentation/usb/uhci.txt @@ -0,0 +1,165 @@ +Specification and Internals for the New UHCI Driver (Whitepaper...) + + brought to you by + + Georg Acher, acher@in.tum.de (executive slave) (base guitar) + Deti Fliegl, deti@fliegl.de (executive slave) (lead voice) + Thomas Sailer, sailer@ife.ee.ethz.ch (chief consultant) (cheer leader) + + $Id: README.uhci,v 1.1 1999/12/14 14:03:02 fliegl Exp $ + +This document and the new uhci sources can be found on + http://hotswap.in.tum.de/usb + +1. General issues + +1.1 Why a new UHCI driver, we already have one?!? + +Correct, but its internal structure got more and more mixed up by the (still +ongoing) efforts to get isochronous transfers (ISO) to work. +Since there is an increasing need for reliable ISO-transfers (especially +for USB-audio needed by TS and for a DAB-USB-Receiver build by GA and DF), +this state was a bit unsatisfying in our opinion, so we've decided (based +on knowledge and experiences with the old UHCI driver) to start +from scratch with a new approach, much simpler but at the same time more +powerful. +It is inspired by the way Win98/Win2000 handles USB requests via URBs, +but it's definitely 100% free of MS-code and doesn't crash while +unplugging an used ISO-device like Win98 ;-) +Some code for HW setup and root hub management was taken from the +original UHCI driver, but heavily modified to fit into the new code. +The invention of the basic concept, and major coding were completed in two +days (and nights) on the 16th and 17th of October 1999, now known as the +great USB-October-Revolution started by GA, DF, and TS ;-) + +Since the concept is in no way UHCI dependent, we hope that it will also be +transferred to the OHCI-driver, so both drivers share a common API. + +1.2. Advantages and disadvantages + ++ All USB transfer types work now! ++ Asynchronous operation ++ Simple, but powerful interface (only two calls for start and cancel) ++ Easy migration to the new API, simplified by a compatibility API ++ Simple usage of ISO transfers ++ Automatic linking of requests ++ ISO transfers allow variable length for each frame and striping ++ No CPU dependent and non-portable atomic memory access, no asm()-inlines ++ Tested on x86 and Alpha + +- Rewriting for ISO transfers needed + +1.3. Is there some compatibility to the old API? + +Yes, but only for control, bulk and interrupt transfers. We've implemented +some wrapper calls for these transfer types. The usbcore works fine with +these wrappers. For ISO there's no compatibility, because the old ISO-API +and its semantics were unnecessary complicated in our opinion. + +1.4. What's really working? + +As said above, CTRL and BULK already work fine even with the wrappers, +so legacy code wouldn't notice the change. +Regarding to Thomas, ISO transfers now run stable with USB audio. +INT transfers (e.g. mouse driver) work fine, too. + +1.5. Are there any bugs? + +No ;-) +Hm... +Well, of course this implementation needs extensive testing on all available +hardware, but we believe that any fixes shouldn't harm the overall concept. + +1.6. What should be done next? + +A large part of the request handling seems to be identical for UHCI and +OHCI, so it would be a good idea to extract the common parts and have only +the HW specific stuff in uhci.c. Furthermore, all other USB device drivers +should need URBification, if they use isochronous or interrupt transfers. +One thing missing in the current implementation (and the old UHCI driver) +is fair queueing for BULK transfers. Since this would need (in principle) +the alteration of already constructed TD chains (to switch from depth to +breadth execution), another way has to be found. Maybe some simple +heuristics work with the same effect. + +--------------------------------------------------------------------------- + +2. Internal structure and mechanisms + +To get quickly familiar with the internal structures, here's a short +description how the new UHCI driver works. However, the ultimate source of +truth is only uhci.c! + +2.1. Descriptor structure (QHs and TDs) + +During initialization, the following skeleton is allocated in init_skel: + + framespecific | common chain + +framelist[] +[ 0 ]-----> TD --> TD -------\ +[ 1 ]-----> TD --> TD --------> TD ----> QH -------> QH -------> QH ---> NULL + ... TD --> TD -------/ +[1023]-----> TD --> TD ------/ + + ^^ ^^ ^^ ^^ ^^ ^^ + 1024 TDs for 7 TDs for 1 TD for Start of Start of End Chain + ISO INT (2-128ms) 1ms-INT CTRL Chain BULK Chain + +For each CTRL or BULK transfer a new QH is allocated and the containing data +transfers are appended as (vertical) TDs. After building the whole QH with its +dangling TDs, the QH is inserted before the BULK Chain QH (for CTRL) or +before the End Chain QH (for BULK). Since only the QH->next pointers are +affected, no atomic memory operation is required. The three QHs in the +common chain are never equipped with TDs! + +For ISO or INT, the TD for each frame is simply inserted into the appropriate +ISO/INT-TD-chain for the desired frame. The 7 skeleton INT-TDs are scattered +among the 1024 frames similar to the old UHCI driver. + +For CTRL/BULK/ISO, the last TD in the transfer has the IOC-bit set. For INT, +every TD (there is only one...) has the IOC-bit set. + +Besides the data for the UHCI controller (2 or 4 32bit words), the descriptors +are double-linked through the .vertical and .horizontal elements in the +SW data of the descriptor (using the double-linked list structures and +operations), but SW-linking occurs only in closed domains, i.e. for each of +the 1024 ISO-chains and the 8 INT-chains there is a closed cycle. This +simplifies all insertions and unlinking operations and avoids costly +bus_to_virt()-calls. + +2.2. URB structure and linking to QH/TDs + +During assembly of the QH and TDs of the requested action, these descriptors +are stored in urb->urb_list, so the allocated QH/TD descriptors are bound to +this URB. +If the assembly was successful and the descriptors were added to the HW chain, +the corresponding URB is inserted into a global URB list for this controller. +This list stores all pending URBs. + +2.3. Interrupt processing + +Since UHCI provides no means to directly detect completed transactions, the +following is done in each UHCI interrupt (uhci_interrupt()): + +For each URB in the pending queue (process_urb()), the ACTIVE-flag of the +associated TDs are processed (depending on the transfer type +process_{transfer|interrupt|iso}()). If the TDs are not active anymore, +they indicate the completion of the transaction and the status is calculated. +Inactive QH/TDs are removed from the HW chain (since the host controller +already removed the TDs from the QH, no atomic access is needed) and +eventually the URB is marked as completed (OK or errors) and removed from the +pending queue. Then the next linked URB is submitted. After (or immediately +before) that, the completion handler is called. + +2.4. Unlinking URBs + +First, all QH/TDs stored in the URB are unlinked from the HW chain. +To ensure that the host controller really left a vertical TD chain, we +wait for one frame. After that, the TDs are physically destroyed. + +2.5. URB linking and the consequences + +Since URBs can be linked and the corresponding submit_urb is called in +the UHCI-interrupt, all work associated with URB/QH/TD assembly has to be +interrupt save. This forces kmalloc to use GFP_ATOMIC in the interrupt. diff --git a/Documentation/usb/usb-help.txt b/Documentation/usb/usb-help.txt new file mode 100644 index 000000000000..b7c324973695 --- /dev/null +++ b/Documentation/usb/usb-help.txt @@ -0,0 +1,19 @@ +usb-help.txt +2000-July-12 + +For USB help other than the readme files that are located in +Documentation/usb/*, see the following: + +Linux-USB project: http://www.linux-usb.org + mirrors at http://www.suse.cz/development/linux-usb/ + and http://usb.in.tum.de/linux-usb/ + and http://it.linux-usb.org +Linux USB Guide: http://linux-usb.sourceforge.net +Linux-USB device overview (working devices and drivers): + http://www.qbik.ch/usb/devices/ + +The Linux-USB mailing lists are: + linux-usb-users@lists.sourceforge.net for general user help + linux-usb-devel@lists.sourceforge.net for developer discussions + +### diff --git a/Documentation/usb/usb-serial.txt b/Documentation/usb/usb-serial.txt new file mode 100644 index 000000000000..f001cd93b79b --- /dev/null +++ b/Documentation/usb/usb-serial.txt @@ -0,0 +1,470 @@ +INTRODUCTION + + The USB serial driver currently supports a number of different USB to + serial converter products, as well as some devices that use a serial + interface from userspace to talk to the device. + + See the individual product section below for specific information about + the different devices. + + +CONFIGURATION + + Currently the driver can handle up to 256 different serial interfaces at + one time. + + If you are not using devfs: + The major number that the driver uses is 188 so to use the driver, + create the following nodes: + mknod /dev/ttyUSB0 c 188 0 + mknod /dev/ttyUSB1 c 188 1 + mknod /dev/ttyUSB2 c 188 2 + mknod /dev/ttyUSB3 c 188 3 + . + . + . + mknod /dev/ttyUSB254 c 188 254 + mknod /dev/ttyUSB255 c 188 255 + + If you are using devfs: + The devices supported by this driver will show up as + /dev/usb/tts/{0,1,...} + + When the device is connected and recognized by the driver, the driver + will print to the system log, which node(s) the device has been bound + to. + + +SPECIFIC DEVICES SUPPORTED + + +ConnectTech WhiteHEAT 4 port converter + + ConnectTech has been very forthcoming with information about their + device, including providing a unit to test with. + + The driver is officially supported by Connect Tech Inc. + http://www.connecttech.com + + For any questions or problems with this driver, please contact + Stuart MacDonald at stuartm@connecttech.com + + +HandSpring Visor, Palm USB, and Clié USB driver + + This driver works with all HandSpring USB, Palm USB, and Sony Clié USB + devices. + + Only when the device tries to connect to the host, will the device show + up to the host as a valid USB device. When this happens, the device is + properly enumerated, assigned a port, and then communication _should_ be + possible. The driver cleans up properly when the device is removed, or + the connection is canceled on the device. + + NOTE: + This means that in order to talk to the device, the sync button must be + pressed BEFORE trying to get any program to communicate to the device. + This goes against the current documentation for pilot-xfer and other + packages, but is the only way that it will work due to the hardware + in the device. + + When the device is connected, try talking to it on the second port + (this is usually /dev/ttyUSB1 if you do not have any other usb-serial + devices in the system.) The system log should tell you which port is + the port to use for the HotSync transfer. The "Generic" port can be used + for other device communication, such as a PPP link. + + For some Sony Clié devices, /dev/ttyUSB0 must be used to talk to the + device. This is true for all OS version 3.5 devices, and most devices + that have had a flash upgrade to a newer version of the OS. See the + kernel system log for information on which is the correct port to use. + + If after pressing the sync button, nothing shows up in the system log, + try resetting the device, first a hot reset, and then a cold reset if + necessary. Some devices need this before they can talk to the USB port + properly. + + Devices that are not compiled into the kernel can be specified with module + parameters. e.g. modprobe visor vendor=0x54c product=0x66 + + There is a webpage and mailing lists for this portion of the driver at: + http://usbvisor.sourceforge.net/ + + For any questions or problems with this driver, please contact Greg + Kroah-Hartman at greg@kroah.com + + +PocketPC PDA Driver + + This driver can be used to connect to Compaq iPAQ, HP Jornada, Casio EM500 + and other PDAs running Windows CE 3.0 or PocketPC 2002 using a USB + cable/cradle. + Most devices supported by ActiveSync are supported out of the box. + For others, please use module parameters to specify the product and vendor + id. e.g. modprobe ipaq vendor=0x3f0 product=0x1125 + + The driver presents a serial interface (usually on /dev/ttyUSB0) over + which one may run ppp and establish a TCP/IP link to the PDA. Once this + is done, you can transfer files, backup, download email etc. The most + significant advantage of using USB is speed - I can get 73 to 113 + kbytes/sec for download/upload to my iPAQ. + + This driver is only one of a set of components required to utilize + the USB connection. Please visit http://synce.sourceforge.net which + contains the necessary packages and a simple step-by-step howto. + + Once connected, you can use Win CE programs like ftpView, Pocket Outlook + from the PDA and xcerdisp, synce utilities from the Linux side. + + To use Pocket IE, follow the instructions given at + http://www.tekguru.co.uk/EM500/usbtonet.htm to achieve the same thing + on Win98. Omit the proxy server part; Linux is quite capable of forwarding + packets unlike Win98. Another modification is required at least for the + iPAQ - disable autosync by going to the Start/Settings/Connections menu + and unchecking the "Automatically synchronize ..." box. Go to + Start/Programs/Connections, connect the cable and select "usbdial" (or + whatever you named your new USB connection). You should finally wind + up with a "Connected to usbdial" window with status shown as connected. + Now start up PIE and browse away. + + If it doesn't work for some reason, load both the usbserial and ipaq module + with the module parameter "debug" set to 1 and examine the system log. + You can also try soft-resetting your PDA before attempting a connection. + + Other functionality may be possible depending on your PDA. According to + Wes Cilldhaire <billybobjoehenrybob@hotmail.com>, with the Toshiba E570, + ...if you boot into the bootloader (hold down the power when hitting the + reset button, continuing to hold onto the power until the bootloader screen + is displayed), then put it in the cradle with the ipaq driver loaded, open + a terminal on /dev/ttyUSB0, it gives you a "USB Reflash" terminal, which can + be used to flash the ROM, as well as the microP code.. so much for needing + Toshiba's $350 serial cable for flashing!! :D + NOTE: This has NOT been tested. Use at your own risk. + + For any questions or problems with the driver, please contact Ganesh + Varadarajan <ganesh@veritas.com> + + +Keyspan PDA Serial Adapter + + Single port DB-9 serial adapter, pushed as a PDA adapter for iMacs (mostly + sold in Macintosh catalogs, comes in a translucent white/green dongle). + Fairly simple device. Firmware is homebrew. + This driver also works for the Xircom/Entrgra single port serial adapter. + + Current status: + Things that work: + basic input/output (tested with 'cu') + blocking write when serial line can't keep up + changing baud rates (up to 115200) + getting/setting modem control pins (TIOCM{GET,SET,BIS,BIC}) + sending break (although duration looks suspect) + Things that don't: + device strings (as logged by kernel) have trailing binary garbage + device ID isn't right, might collide with other Keyspan products + changing baud rates ought to flush tx/rx to avoid mangled half characters + Big Things on the todo list: + parity, 7 vs 8 bits per char, 1 or 2 stop bits + HW flow control + not all of the standard USB descriptors are handled: Get_Status, Set_Feature + O_NONBLOCK, select() + + For any questions or problems with this driver, please contact Brian + Warner at warner@lothar.com + + +Keyspan USA-series Serial Adapters + + Single, Dual and Quad port adapters - driver uses Keyspan supplied + firmware and is being developed with their support. + + Current status: + The USA-18X, USA-28X, USA-19, USA-19W and USA-49W are supported and + have been pretty throughly tested at various baud rates with 8-N-1 + character settings. Other character lengths and parity setups are + presently untested. + + The USA-28 isn't yet supported though doing so should be pretty + straightforward. Contact the maintainer if you require this + functionality. + + More information is available at: + http://misc.nu/hugh/keyspan.html + + For any questions or problems with this driver, please contact Hugh + Blemings at hugh@misc.nu + + +FTDI Single Port Serial Driver + + This is a single port DB-25 serial adapter. More information about this + device and the Linux driver can be found at: + http://reality.sgi.com/bryder_wellington/ftdi_sio/ + + For any questions or problems with this driver, please contact Bill Ryder + at bryder@sgi.com + + +ZyXEL omni.net lcd plus ISDN TA + + This is an ISDN TA. Please report both successes and troubles to + azummo@towertech.it + + +Cypress M8 CY4601 Family Serial Driver + + This driver was in most part developed by Neil "koyama" Whelchel. It + has been improved since that previous form to support dynamic serial + line settings and improved line handling. The driver is for the most + part stable and has been tested on an smp machine. (dual p2) + + Chipsets supported under CY4601 family: + + CY7C63723, CY7C63742, CY7C63743, CY7C64013 + + Devices supported: + + -DeLorme's USB Earthmate (SiRF Star II lp arch) + -Cypress HID->COM RS232 adapter + + Note: Cypress Semiconductor claims no affiliation with the + the hid->com device. + + Most devices using chipsets under the CY4601 family should + work with the driver. As long as they stay true to the CY4601 + usbserial specification. + + Technical notes: + + The Earthmate starts out at 4800 8N1 by default... the driver will + upon start init to this setting. usbserial core provides the rest + of the termios settings, along with some custom termios so that the + output is in proper format and parsable. + + The device can be put into sirf mode by issuing NMEA command: + $PSRF100,<protocol>,<baud>,<databits>,<stopbits>,<parity>*CHECKSUM + $PSRF100,0,9600,8,1,0*0C + + It should then be sufficient to change the port termios to match this + to begin communicating. + + As far as I can tell it supports pretty much every sirf command as + documented online available with firmware 2.31, with some unknown + message ids. + + The hid->com adapter can run at a maximum baud of 115200bps. Please note + that the device has trouble or is incapable of raising line voltage properly. + It will be fine with null modem links, as long as you do not try to link two + together without hacking the adapter to set the line high. + + The driver is smp safe. Performance with the driver is rather low when using + it for transfering files. This is being worked on, but I would be willing to + accept patches. An urb queue or packet buffer would likely fit the bill here. + + If you have any questions, problems, patches, feature requests, etc. you can + contact me here via email: + dignome@gmail.com + (your problems/patches can alternately be submitted to usb-devel) + + +Digi AccelePort Driver + + This driver supports the Digi AccelePort USB 2 and 4 devices, 2 port + (plus a parallel port) and 4 port USB serial converters. The driver + does NOT yet support the Digi AccelePort USB 8. + + This driver works under SMP with the usb-uhci driver. It does not + work under SMP with the uhci driver. + + The driver is generally working, though we still have a few more ioctls + to implement and final testing and debugging to do. The paralled port + on the USB 2 is supported as a serial to parallel converter; in other + words, it appears as another USB serial port on Linux, even though + physically it is really a parallel port. The Digi Acceleport USB 8 + is not yet supported. + + Please contact Peter Berger (pberger@brimson.com) or Al Borchers + (alborchers@steinerpoint.com) for questions or problems with this + driver. + + +Belkin USB Serial Adapter F5U103 + + Single port DB-9/PS-2 serial adapter from Belkin with firmware by eTEK Labs. + The Peracom single port serial adapter also works with this driver, as + well as the GoHubs adapter. + + Current status: + The following have been tested and work: + Baud rate 300-230400 + Data bits 5-8 + Stop bits 1-2 + Parity N,E,O,M,S + Handshake None, Software (XON/XOFF), Hardware (CTSRTS,CTSDTR)* + Break Set and clear + Line contrl Input/Output query and control ** + + * Hardware input flow control is only enabled for firmware + levels above 2.06. Read source code comments describing Belkin + firmware errata. Hardware output flow control is working for all + firmware versions. + ** Queries of inputs (CTS,DSR,CD,RI) show the last + reported state. Queries of outputs (DTR,RTS) show the last + requested state and may not reflect current state as set by + automatic hardware flow control. + + TO DO List: + -- Add true modem contol line query capability. Currently tracks the + states reported by the interrupt and the states requested. + -- Add error reporting back to application for UART error conditions. + -- Add support for flush ioctls. + -- Add everything else that is missing :) + + For any questions or problems with this driver, please contact William + Greathouse at wgreathouse@smva.com + + +Empeg empeg-car Mark I/II Driver + + This is an experimental driver to provide connectivity support for the + client synchronization tools for an Empeg empeg-car mp3 player. + + Tips: + * Don't forget to create the device nodes for ttyUSB{0,1,2,...} + * modprobe empeg (modprobe is your friend) + * emptool --usb /dev/ttyUSB0 (or whatever you named your device node) + + For any questions or problems with this driver, please contact Gary + Brubaker at xavyer@ix.netcom.com + + +MCT USB Single Port Serial Adapter U232 + + This driver is for the MCT USB-RS232 Converter (25 pin, Model No. + U232-P25) from Magic Control Technology Corp. (there is also a 9 pin + Model No. U232-P9). More information about this device can be found at + the manufacture's web-site: http://www.mct.com.tw. + + The driver is generally working, though it still needs some more testing. + It is derived from the Belkin USB Serial Adapter F5U103 driver and its + TODO list is valid for this driver as well. + + This driver has also been found to work for other products, which have + the same Vendor ID but different Product IDs. Sitecom's U232-P25 serial + converter uses Product ID 0x230 and Vendor ID 0x711 and works with this + driver. Also, D-Link's DU-H3SP USB BAY also works with this driver. + + For any questions or problems with this driver, please contact Wolfgang + Grandegger at wolfgang@ces.ch + + +Inside Out Networks Edgeport Driver + + This driver supports all devices made by Inside Out Networks, specifically + the following models: + Edgeport/4 + Rapidport/4 + Edgeport/4t + Edgeport/2 + Edgeport/4i + Edgeport/2i + Edgeport/421 + Edgeport/21 + Edgeport/8 + Edgeport/8 Dual + Edgeport/2D8 + Edgeport/4D8 + Edgeport/8i + Edgeport/2 DIN + Edgeport/4 DIN + Edgeport/16 Dual + + For any questions or problems with this driver, please contact Greg + Kroah-Hartman at greg@kroah.com + + +REINER SCT cyberJack pinpad/e-com USB chipcard reader + + Interface to ISO 7816 compatible contactbased chipcards, e.g. GSM SIMs. + + Current status: + This is the kernel part of the driver for this USB card reader. + There is also a user part for a CT-API driver available. A site + for downloading is TBA. For now, you can request it from the + maintainer (linux-usb@sii.li). + + For any questions or problems with this driver, please contact + linux-usb@sii.li + + +Prolific PL2303 Driver + + This driver support any device that has the PL2303 chip from Prolific + in it. This includes a number of single port USB to serial + converters and USB GPS devices. Devices from Aten (the UC-232) and + IO-Data work with this driver. + + For any questions or problems with this driver, please contact Greg + Kroah-Hartman at greg@kroah.com + + +KL5KUSB105 chipset / PalmConnect USB single-port adapter + +Current status: + The driver was put together by looking at the usb bus transactions + done by Palm's driver under Windows, so a lot of functionality is + still missing. Notably, serial ioctls are sometimes faked or not yet + implemented. Support for finding out about DSR and CTS line status is + however implemented (though not nicely), so your favorite autopilot(1) + and pilot-manager -daemon calls will work. Baud rates up to 115200 + are supported, but handshaking (software or hardware) is not, which is + why it is wise to cut down on the rate used is wise for large + transfers until this is settled. + +Options supported: + If this driver is compiled as a module you can pass the following + options to it: + debug - extra verbose debugging info + (default: 0; nonzero enables) + use_lowlatency - use low_latency flag to speed up tty layer + when reading from from the device. + (default: 0; nonzero enables) + + See http://www.uuhaus.de/linux/palmconnect.html for up-to-date + information on this driver. + + +Generic Serial driver + + If your device is not one of the above listed devices, compatible with + the above models, you can try out the "generic" interface. This + interface does not provide any type of control messages sent to the + device, and does not support any kind of device flow control. All that + is required of your device is that it has at least one bulk in endpoint, + or one bulk out endpoint. + + To enable the generic driver to recognize your device, build the driver + as a module and load it by the following invocation: + insmod usbserial vendor=0x#### product=0x#### + where the #### is replaced with the hex representation of your device's + vendor id and product id. + + This driver has been successfully used to connect to the NetChip USB + development board, providing a way to develop USB firmware without + having to write a custom driver. + + For any questions or problems with this driver, please contact Greg + Kroah-Hartman at greg@kroah.com + + +CONTACT: + + If anyone has any problems using these drivers, with any of the above + specified products, please contact the specific driver's author listed + above, or join the Linux-USB mailing list (information on joining the + mailing list, as well as a link to its searchable archive is at + http://www.linux-usb.org/ ) + + +Greg Kroah-Hartman +greg@kroah.com diff --git a/Documentation/usb/usbmon.txt b/Documentation/usb/usbmon.txt new file mode 100644 index 000000000000..2f8431f92b77 --- /dev/null +++ b/Documentation/usb/usbmon.txt @@ -0,0 +1,156 @@ +* Introduction + +The name "usbmon" in lowercase refers to a facility in kernel which is +used to collect traces of I/O on the USB bus. This function is analogous +to a packet socket used by network monitoring tools such as tcpdump(1) +or Ethereal. Similarly, it is expected that a tool such as usbdump or +USBMon (with uppercase letters) is used to examine raw traces produced +by usbmon. + +The usbmon reports requests made by peripheral-specific drivers to Host +Controller Drivers (HCD). So, if HCD is buggy, the traces reported by +usbmon may not correspond to bus transactions precisely. This is the same +situation as with tcpdump. + +* How to use usbmon to collect raw text traces + +Unlike the packet socket, usbmon has an interface which provides traces +in a text format. This is used for two purposes. First, it serves as a +common trace exchange format for tools while most sophisticated formats +are finalized. Second, humans can read it in case tools are not available. + +To collect a raw text trace, execute following steps. + +1. Prepare + +Mount debugfs (it has to be enabled in your kernel configuration), and +load the usbmon module (if built as module). The second step is skipped +if usbmon is built into the kernel. + +# mount -t debugfs none_debugs /sys/kernel/debug +# modprobe usbmon + +Verify that bus sockets are present. + +[root@lembas zaitcev]# ls /sys/kernel/debug/usbmon +1s 1t 2s 2t 3s 3t 4s 4t +[root@lembas zaitcev]# + +# ls /sys/kernel + +2. Find which bus connects to the desired device + +Run "cat /proc/bus/usb/devices", and find the T-line which corresponds to +the device. Usually you do it by looking for the vendor string. If you have +many similar devices, unplug one and compare two /proc/bus/usb/devices outputs. +The T-line will have a bus number. Example: + +T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 +D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1 +P: Vendor=0557 ProdID=2004 Rev= 1.00 +S: Manufacturer=ATEN +S: Product=UC100KM V2.00 + +Bus=03 means it's bus 3. + +3. Start 'cat' + +# cat /sys/kernel/debug/usbmon/3t > /tmp/1.mon.out + +This process will be reading until killed. Naturally, the output can be +redirected to a desirable location. This is preferred, because it is going +to be quite long. + +4. Perform the desired operation on the USB bus + +This is where you do something that creates the traffic: plug in a flash key, +copy files, control a webcam, etc. + +5. Kill cat + +Usually it's done with a keyboard interrupt (Control-C). + +At this point the output file (/tmp/1.mon.out in this example) can be saved, +sent by e-mail, or inspected with a text editor. In the last case make sure +that the file size is not excessive for your favourite editor. + +* Raw text data format + +The '0t' type data consists of a stream of events, such as URB submission, +URB callback, submission error. Every event is a text line, which consists +of whitespace separated words. The number of position of words may depend +on the event type, but there is a set of words, common for all types. + +Here is the list of words, from left to right: +- URB Tag. This is used to identify URBs is normally a kernel mode address + of the URB structure in hexadecimal. +- Timestamp in microseconds, a decimal number. The timestamp's resolution + depends on available clock, and so it can be much worse than a microsecond + (if the implementation uses jiffies, for example). +- Event Type. This type refers to the format of the event, not URB type. + Available types are: S - submission, C - callback, E - submission error. +- "Pipe". The pipe concept is deprecated. This is a composite word, used to + be derived from information in pipes. It consists of three fields, separated + by colons: URB type and direction, Device address, Endpoint number. + Type and direction are encoded with two bytes in the following manner: + Ci Co Control input and output + Zi Zo Isochronous input and output + Ii Io Interrupt input and output + Bi Bo Bulk input and output + Device address and Endpoint number are decimal numbers with leading zeroes + or 3 and 2 positions, correspondingly. +- URB Status. This field makes no sense for submissions, but is present + to help scripts with parsing. In error case, it contains the error code. +- Data Length. This is the actual length in the URB. +- Data tag. The usbmon may not always capture data, even if length is nonzero. + Only if tag is '=', the data words are present. +- Data words follow, in big endian hexadecimal format. Notice that they are + not machine words, but really just a byte stream split into words to make + it easier to read. Thus, the last word may contain from one to four bytes. + The length of collected data is limited and can be less than the data length + report in Data Length word. + +Here is an example of code to read the data stream in a well known programming +language: + +class ParsedLine { + int data_len; /* Available length of data */ + byte data[]; + + void parseData(StringTokenizer st) { + int availwords = st.countTokens(); + data = new byte[availwords * 4]; + data_len = 0; + while (st.hasMoreTokens()) { + String data_str = st.nextToken(); + int len = data_str.length() / 2; + int i; + for (i = 0; i < len; i++) { + data[data_len] = Byte.parseByte( + data_str.substring(i*2, i*2 + 2), + 16); + data_len++; + } + } + } +} + +This format is obviously deficient. For example, the setup packet for control +transfers is not delivered. This will change in the future. + +Examples: + +An input control transfer to get a port status: + +d74ff9a0 2640288196 S Ci:001:00 -115 4 < +d74ff9a0 2640288202 C Ci:001:00 0 4 = 01010100 + +An output bulk transfer to send a SCSI command 0x5E in a 31-byte Bulk wrapper +to a storage device at address 5: + +dd65f0e8 4128379752 S Bo:005:02 -115 31 = 55534243 5e000000 00000000 00000600 00000000 00000000 00000000 000000 +dd65f0e8 4128379808 C Bo:005:02 0 31 > + +* Raw binary format and API + +TBD diff --git a/Documentation/usb/w9968cf.txt b/Documentation/usb/w9968cf.txt new file mode 100644 index 000000000000..18a47738d56c --- /dev/null +++ b/Documentation/usb/w9968cf.txt @@ -0,0 +1,481 @@ + + W996[87]CF JPEG USB Dual Mode Camera Chip + Driver for Linux 2.6 (basic version) + ========================================= + + - Documentation - + + +Index +===== +1. Copyright +2. Disclaimer +3. License +4. Overview +5. Supported devices +6. Module dependencies +7. Module loading +8. Module paramaters +9. Contact information +10. Credits + + +1. Copyright +============ +Copyright (C) 2002-2004 by Luca Risolia <luca.risolia@studio.unibo.it> + + +2. Disclaimer +============= +Winbond is a trademark of Winbond Electronics Corporation. +This software is not sponsored or developed by Winbond. + + +3. License +========== +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + + +4. Overview +=========== +This driver supports the video streaming capabilities of the devices mounting +Winbond W9967CF and Winbond W9968CF JPEG USB Dual Mode Camera Chips. OV681 +based cameras should be supported as well. + +The driver is divided into two modules: the basic one, "w9968cf", is needed for +the supported devices to work; the second one, "w9968cf-vpp", is an optional +module, which provides some useful video post-processing functions like video +decoding, up-scaling and colour conversions. Once the driver is installed, +every time an application tries to open a recognized device, "w9968cf" checks +the presence of the "w9968cf-vpp" module and loads it automatically by default. + +Please keep in mind that official kernels do not include the second module for +performance purposes. However it is always recommended to download and install +the latest and complete release of the driver, replacing the existing one, if +present: it will be still even possible not to load the "w9968cf-vpp" module at +all, if you ever want to. Another important missing feature of the version in +the official Linux 2.4 kernels is the writeable /proc filesystem interface. + +The latest and full-featured version of the W996[87]CF driver can be found at: +http://www.linux-projects.org. Please refer to the documentation included in +that package, if you are going to use it. + +Up to 32 cameras can be handled at the same time. They can be connected and +disconnected from the host many times without turning off the computer, if +your system supports the hotplug facility. + +To change the default settings for each camera, many parameters can be passed +through command line when the module is loaded into memory. + +The driver relies on the Video4Linux, USB and I2C core modules. It has been +designed to run properly on SMP systems as well. An additional module, +"ovcamchip", is mandatory; it provides support for some OmniVision image +sensors connected to the W996[87]CF chips; if found in the system, the module +will be automatically loaded by default (provided that the kernel has been +compiled with the automatic module loading option). + + +5. Supported devices +==================== +At the moment, known W996[87]CF and OV681 based devices are: +- Aroma Digi Pen VGA Dual Mode ADG-5000 (unknown image sensor) +- AVerMedia AVerTV USB (SAA7111A, Philips FI1216Mk2 tuner, PT2313L audio chip) +- Creative Labs Video Blaster WebCam Go (OmniVision OV7610 sensor) +- Creative Labs Video Blaster WebCam Go Plus (OmniVision OV7620 sensor) +- Lebon LDC-035A (unknown image sensor) +- Ezonics EZ-802 EZMega Cam (OmniVision OV8610C sensor) +- OmniVision OV8610-EDE (OmniVision OV8610 sensor) +- OPCOM Digi Pen VGA Dual Mode Pen Camera (unknown image sensor) +- Pretec Digi Pen-II (OmniVision OV7620 sensor) +- Pretec DigiPen-480 (OmniVision OV8610 sensor) + +If you know any other W996[87]CF or OV681 based cameras, please contact me. + +The list above does not imply that all those devices work with this driver: up +until now only webcams that have an image sensor supported by the "ovcamchip" +module work. Kernel messages will always tell you whether this is case. + +Possible external microcontrollers of those webcams are not supported: this +means that still images cannot be downloaded from the device memory. + +Furthermore, it's worth to note that I was only able to run tests on my +"Creative Labs Video Blaster WebCam Go". Donations of other models, for +additional testing and full support, would be much appreciated. + + +6. Module dependencies +====================== +For it to work properly, the driver needs kernel support for Video4Linux, USB +and I2C, and the "ovcamchip" module for the image sensor. Make sure you are not +actually using any external "ovcamchip" module, given that the W996[87]CF +driver depends on the version of the module present in the official kernels. + +The following options of the kernel configuration file must be enabled and +corresponding modules must be compiled: + + # Multimedia devices + # + CONFIG_VIDEO_DEV=m + + # I2C support + # + CONFIG_I2C=m + +The I2C core module can be compiled statically in the kernel as well. + + # OmniVision Camera Chip support + # + CONFIG_VIDEO_OVCAMCHIP=m + + # USB support + # + CONFIG_USB=m + +In addition, depending on the hardware being used, only one of the modules +below is necessary: + + # USB Host Controller Drivers + # + CONFIG_USB_EHCI_HCD=m + CONFIG_USB_UHCI_HCD=m + CONFIG_USB_OHCI_HCD=m + +And finally: + + # USB Multimedia devices + # + CONFIG_USB_W9968CF=m + + +7. Module loading +================= +To use the driver, it is necessary to load the "w9968cf" module into memory +after every other module required. + +Loading can be done this way, from root: + + [root@localhost home]# modprobe usbcore + [root@localhost home]# modprobe i2c-core + [root@localhost home]# modprobe videodev + [root@localhost home]# modprobe w9968cf + +At this point the pertinent devices should be recognized: "dmesg" can be used +to analyze kernel messages: + + [user@localhost home]$ dmesg + +There are a lot of parameters the module can use to change the default +settings for each device. To list every possible parameter with a brief +explanation about them and which syntax to use, it is recommended to run the +"modinfo" command: + + [root@locahost home]# modinfo w9968cf + + +8. Module parameters +==================== +Module parameters are listed below: +------------------------------------------------------------------------------- +Name: ovmod_load +Type: bool +Syntax: <0|1> +Description: Automatic 'ovcamchip' module loading: 0 disabled, 1 enabled. + If enabled, 'insmod' searches for the required 'ovcamchip' + module in the system, according to its configuration, and + loads that module automatically. This action is performed as + once soon as the 'w9968cf' module is loaded into memory. +Default: 1 +Note: The kernel must be compiled with the CONFIG_KMOD option + enabled for the 'ovcamchip' module to be loaded and for + this parameter to be present. +------------------------------------------------------------------------------- +Name: vppmod_load +Type: bool +Syntax: <0|1> +Description: Automatic 'w9968cf-vpp' module loading: 0 disabled, 1 enabled. + If enabled, every time an application attempts to open a + camera, 'insmod' searches for the video post-processing module + in the system and loads it automatically (if present). + The optional 'w9968cf-vpp' module adds extra image manipulation + capabilities to the 'w9968cf' module,like software up-scaling, + colour conversions and video decompression for very high frame + rates. +Default: 1 +Note: The kernel must be compiled with the CONFIG_KMOD option + enabled for the 'w9968cf-vpp' module to be loaded and for + this parameter to be present. +------------------------------------------------------------------------------- +Name: simcams +Type: int +Syntax: <n> +Description: Number of cameras allowed to stream simultaneously. + n may vary from 0 to 32. +Default: 32 +------------------------------------------------------------------------------- +Name: video_nr +Type: int array (min = 0, max = 32) +Syntax: <-1|n[,...]> +Description: Specify V4L minor mode number. + -1 = use next available + n = use minor number n + You can specify up to 32 cameras this way. + For example: + video_nr=-1,2,-1 would assign minor number 2 to the second + recognized camera and use auto for the first one and for every + other camera. +Default: -1 +------------------------------------------------------------------------------- +Name: packet_size +Type: int array (min = 0, max = 32) +Syntax: <n[,...]> +Description: Specify the maximum data payload size in bytes for alternate + settings, for each device. n is scaled between 63 and 1023. +Default: 1023 +------------------------------------------------------------------------------- +Name: max_buffers +Type: int array (min = 0, max = 32) +Syntax: <n[,...]> +Description: For advanced users. + Specify the maximum number of video frame buffers to allocate + for each device, from 2 to 32. +Default: 2 +------------------------------------------------------------------------------- +Name: double_buffer +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Hardware double buffering: 0 disabled, 1 enabled. + It should be enabled if you want smooth video output: if you + obtain out of sync. video, disable it, or try to + decrease the 'clockdiv' module parameter value. +Default: 1 for every device. +------------------------------------------------------------------------------- +Name: clamping +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Video data clamping: 0 disabled, 1 enabled. +Default: 0 for every device. +------------------------------------------------------------------------------- +Name: filter_type +Type: int array (min = 0, max = 32) +Syntax: <0|1|2[,...]> +Description: Video filter type. + 0 none, 1 (1-2-1) 3-tap filter, 2 (2-3-6-3-2) 5-tap filter. + The filter is used to reduce noise and aliasing artifacts + produced by the CCD or CMOS image sensor. +Default: 0 for every device. +------------------------------------------------------------------------------- +Name: largeview +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Large view: 0 disabled, 1 enabled. +Default: 1 for every device. +------------------------------------------------------------------------------- +Name: upscaling +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Software scaling (for non-compressed video only): + 0 disabled, 1 enabled. + Disable it if you have a slow CPU or you don't have enough + memory. +Default: 0 for every device. +Note: If 'w9968cf-vpp' is not present, this parameter is set to 0. +------------------------------------------------------------------------------- +Name: decompression +Type: int array (min = 0, max = 32) +Syntax: <0|1|2[,...]> +Description: Software video decompression: + 0 = disables decompression + (doesn't allow formats needing decompression). + 1 = forces decompression + (allows formats needing decompression only). + 2 = allows any permitted formats. + Formats supporting (de)compressed video are YUV422P and + YUV420P/YUV420 in any resolutions where width and height are + multiples of 16. +Default: 2 for every device. +Note: If 'w9968cf-vpp' is not present, forcing decompression is not + allowed; in this case this parameter is set to 2. +------------------------------------------------------------------------------- +Name: force_palette +Type: int array (min = 0, max = 32) +Syntax: <0|9|10|13|15|8|7|1|6|3|4|5[,...]> +Description: Force picture palette. + In order: + 0 = Off - allows any of the following formats: + 9 = UYVY 16 bpp - Original video, compression disabled + 10 = YUV420 12 bpp - Original video, compression enabled + 13 = YUV422P 16 bpp - Original video, compression enabled + 15 = YUV420P 12 bpp - Original video, compression enabled + 8 = YUVY 16 bpp - Software conversion from UYVY + 7 = YUV422 16 bpp - Software conversion from UYVY + 1 = GREY 8 bpp - Software conversion from UYVY + 6 = RGB555 16 bpp - Software conversion from UYVY + 3 = RGB565 16 bpp - Software conversion from UYVY + 4 = RGB24 24 bpp - Software conversion from UYVY + 5 = RGB32 32 bpp - Software conversion from UYVY + When not 0, this parameter will override 'decompression'. +Default: 0 for every device. Initial palette is 9 (UYVY). +Note: If 'w9968cf-vpp' is not present, this parameter is set to 9. +------------------------------------------------------------------------------- +Name: force_rgb +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Read RGB video data instead of BGR: + 1 = use RGB component ordering. + 0 = use BGR component ordering. + This parameter has effect when using RGBX palettes only. +Default: 0 for every device. +------------------------------------------------------------------------------- +Name: autobright +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Image sensor automatically changes brightness: + 0 = no, 1 = yes +Default: 0 for every device. +------------------------------------------------------------------------------- +Name: autoexp +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Image sensor automatically changes exposure: + 0 = no, 1 = yes +Default: 1 for every device. +------------------------------------------------------------------------------- +Name: lightfreq +Type: int array (min = 0, max = 32) +Syntax: <50|60[,...]> +Description: Light frequency in Hz: + 50 for European and Asian lighting, 60 for American lighting. +Default: 50 for every device. +------------------------------------------------------------------------------- +Name: bandingfilter +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Banding filter to reduce effects of fluorescent + lighting: + 0 disabled, 1 enabled. + This filter tries to reduce the pattern of horizontal + light/dark bands caused by some (usually fluorescent) lighting. +Default: 0 for every device. +------------------------------------------------------------------------------- +Name: clockdiv +Type: int array (min = 0, max = 32) +Syntax: <-1|n[,...]> +Description: Force pixel clock divisor to a specific value (for experts): + n may vary from 0 to 127. + -1 for automatic value. + See also the 'double_buffer' module parameter. +Default: -1 for every device. +------------------------------------------------------------------------------- +Name: backlight +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Objects are lit from behind: + 0 = no, 1 = yes +Default: 0 for every device. +------------------------------------------------------------------------------- +Name: mirror +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: Reverse image horizontally: + 0 = no, 1 = yes +Default: 0 for every device. +------------------------------------------------------------------------------- +Name: monochrome +Type: bool array (min = 0, max = 32) +Syntax: <0|1[,...]> +Description: The image sensor is monochrome: + 0 = no, 1 = yes +Default: 0 for every device. +------------------------------------------------------------------------------- +Name: brightness +Type: long array (min = 0, max = 32) +Syntax: <n[,...]> +Description: Set picture brightness (0-65535). + This parameter has no effect if 'autobright' is enabled. +Default: 31000 for every device. +------------------------------------------------------------------------------- +Name: hue +Type: long array (min = 0, max = 32) +Syntax: <n[,...]> +Description: Set picture hue (0-65535). +Default: 32768 for every device. +------------------------------------------------------------------------------- +Name: colour +Type: long array (min = 0, max = 32) +Syntax: <n[,...]> +Description: Set picture saturation (0-65535). +Default: 32768 for every device. +------------------------------------------------------------------------------- +Name: contrast +Type: long array (min = 0, max = 32) +Syntax: <n[,...]> +Description: Set picture contrast (0-65535). +Default: 50000 for every device. +------------------------------------------------------------------------------- +Name: whiteness +Type: long array (min = 0, max = 32) +Syntax: <n[,...]> +Description: Set picture whiteness (0-65535). +Default: 32768 for every device. +------------------------------------------------------------------------------- +Name: debug +Type: int +Syntax: <n> +Description: Debugging information level, from 0 to 6: + 0 = none (use carefully) + 1 = critical errors + 2 = significant informations + 3 = configuration or general messages + 4 = warnings + 5 = called functions + 6 = function internals + Level 5 and 6 are useful for testing only, when only one + device is used. +Default: 2 +------------------------------------------------------------------------------- +Name: specific_debug +Type: bool +Syntax: <0|1> +Description: Enable or disable specific debugging messages: + 0 = print messages concerning every level <= 'debug' level. + 1 = print messages concerning the level indicated by 'debug'. +Default: 0 +------------------------------------------------------------------------------- + + +9. Contact information +====================== +I may be contacted by e-mail at <luca.risolia@studio.unibo.it>. + +I can accept GPG/PGP encrypted e-mail. My GPG key ID is 'FCE635A4'. +My public 1024-bit key should be available at your keyserver; the fingerprint +is: '88E8 F32F 7244 68BA 3958 5D40 99DA 5D2A FCE6 35A4'. + + +10. Credits +========== +The development would not have proceed much further without having looked at +the source code of other drivers and without the help of several persons; in +particular: + +- the I2C interface to kernel and high-level image sensor control routines have + been taken from the OV511 driver by Mark McClelland; + +- memory management code has been copied from the bttv driver by Ralph Metzler, + Marcus Metzler and Gerd Knorr; + +- the low-level I2C read function has been written by Frederic Jouault; + +- the low-level I2C fast write function has been written by Piotr Czerczak. diff --git a/Documentation/video4linux/API.html b/Documentation/video4linux/API.html new file mode 100644 index 000000000000..4b3d8f640a4a --- /dev/null +++ b/Documentation/video4linux/API.html @@ -0,0 +1,399 @@ +<HTML><HEAD> +<TITLE>Video4Linux Kernel API Reference v0.1:19990430</TITLE> +</HEAD> +<! Revision History: > +<! 4/30/1999 - Fred Gleason (fredg@wava.com)> +<! Documented extensions for the Radio Data System (RDS) extensions > +<BODY bgcolor="#ffffff"> +<H3>Devices</H3> +Video4Linux provides the following sets of device files. These live on the +character device formerly known as "/dev/bttv". /dev/bttv should be a +symlink to /dev/video0 for most people. +<P> +<TABLE> +<TR><TH>Device Name</TH><TH>Minor Range</TH><TH>Function</TH> +<TR><TD>/dev/video</TD><TD>0-63</TD><TD>Video Capture Interface</TD> +<TR><TD>/dev/radio</TD><TD>64-127</TD><TD>AM/FM Radio Devices</TD> +<TR><TD>/dev/vtx</TD><TD>192-223</TD><TD>Teletext Interface Chips</TD> +<TR><TD>/dev/vbi</TD><TD>224-239</TD><TD>Raw VBI Data (Intercast/teletext)</TD> +</TABLE> +<P> +Video4Linux programs open and scan the devices to find what they are looking +for. Capability queries define what each interface supports. The +described API is only defined for video capture cards. The relevant subset +applies to radio cards. Teletext interfaces talk the existing VTX API. +<P> +<H3>Capability Query Ioctl</H3> +The <B>VIDIOCGCAP</B> ioctl call is used to obtain the capability +information for a video device. The <b>struct video_capability</b> object +passed to the ioctl is completed and returned. It contains the following +information +<P> +<TABLE> +<TR><TD><b>name[32]</b><TD>Canonical name for this interface</TD> +<TR><TD><b>type</b><TD>Type of interface</TD> +<TR><TD><b>channels</b><TD>Number of radio/tv channels if appropriate</TD> +<TR><TD><b>audios</b><TD>Number of audio devices if appropriate</TD> +<TR><TD><b>maxwidth</b><TD>Maximum capture width in pixels</TD> +<TR><TD><b>maxheight</b><TD>Maximum capture height in pixels</TD> +<TR><TD><b>minwidth</b><TD>Minimum capture width in pixels</TD> +<TR><TD><b>minheight</b><TD>Minimum capture height in pixels</TD> +</TABLE> +<P> +The type field lists the capability flags for the device. These are +as follows +<P> +<TABLE> +<TR><TH>Name</TH><TH>Description</TH> +<TR><TD><b>VID_TYPE_CAPTURE</b><TD>Can capture to memory</TD> +<TR><TD><b>VID_TYPE_TUNER</b><TD>Has a tuner of some form</TD> +<TR><TD><b>VID_TYPE_TELETEXT</b><TD>Has teletext capability</TD> +<TR><TD><b>VID_TYPE_OVERLAY</b><TD>Can overlay its image onto the frame buffer</TD> +<TR><TD><b>VID_TYPE_CHROMAKEY</b><TD>Overlay is Chromakeyed</TD> +<TR><TD><b>VID_TYPE_CLIPPING</b><TD>Overlay clipping is supported</TD> +<TR><TD><b>VID_TYPE_FRAMERAM</b><TD>Overlay overwrites frame buffer memory</TD> +<TR><TD><b>VID_TYPE_SCALES</b><TD>The hardware supports image scaling</TD> +<TR><TD><b>VID_TYPE_MONOCHROME</b><TD>Image capture is grey scale only</TD> +<TR><TD><b>VID_TYPE_SUBCAPTURE</b><TD>Capture can be of only part of the image</TD> +</TABLE> +<P> +The minimum and maximum sizes listed for a capture device do not imply all +that all height/width ratios or sizes within the range are possible. A +request to set a size will be honoured by the largest available capture +size whose capture is no large than the requested rectangle in either +direction. For example the quickcam has 3 fixed settings. +<P> +<H3>Frame Buffer</H3> +Capture cards that drop data directly onto the frame buffer must be told the +base address of the frame buffer, its size and organisation. This is a +privileged ioctl and one that eventually X itself should set. +<P> +The <b>VIDIOCSFBUF</b> ioctl sets the frame buffer parameters for a capture +card. If the card does not do direct writes to the frame buffer then this +ioctl will be unsupported. The <b>VIDIOCGFBUF</b> ioctl returns the +currently used parameters. The structure used in both cases is a +<b>struct video_buffer</b>. +<P> +<TABLE> +<TR><TD><b>void *base</b></TD><TD>Base physical address of the buffer</TD> +<TR><TD><b>int height</b></TD><TD>Height of the frame buffer</TD> +<TR><TD><b>int width</b></TD><TD>Width of the frame buffer</TD> +<TR><TD><b>int depth</b></TD><TD>Depth of the frame buffer</TD> +<TR><TD><b>int bytesperline</b></TD><TD>Number of bytes of memory between the start of two adjacent lines</TD> +</TABLE> +<P> +Note that these values reflect the physical layout of the frame buffer. +The visible area may be smaller. In fact under XFree86 this is commonly the +case. XFree86 DGA can provide the parameters required to set up this ioctl. +Setting the base address to NULL indicates there is no physical frame buffer +access. +<P> +<H3>Capture Windows</H3> +The capture area is described by a <b>struct video_window</b>. This defines +a capture area and the clipping information if relevant. The +<b>VIDIOCGWIN</b> ioctl recovers the current settings and the +<b>VIDIOCSWIN</b> sets new values. A successful call to <b>VIDIOCSWIN</b> +indicates that a suitable set of parameters have been chosen. They do not +indicate that exactly what was requested was granted. The program should +call <b>VIDIOCGWIN</b> to check if the nearest match was suitable. The +<b>struct video_window</b> contains the following fields. +<P> +<TABLE> +<TR><TD><b>x</b><TD>The X co-ordinate specified in X windows format.</TD> +<TR><TD><b>y</b><TD>The Y co-ordinate specified in X windows format.</TD> +<TR><TD><b>width</b><TD>The width of the image capture.</TD> +<TR><TD><b>height</b><TD>The height of the image capture.</TD> +<TR><TD><b>chromakey</b><TD>A host order RGB32 value for the chroma key.</TD> +<TR><TD><b>flags</b><TD>Additional capture flags.</TD> +<TR><TD><b>clips</b><TD>A list of clipping rectangles. <em>(Set only)</em></TD> +<TR><TD><b>clipcount</b><TD>The number of clipping rectangles. <em>(Set only)</em></TD> +</TABLE> +<P> +Clipping rectangles are passed as an array. Each clip consists of the following +fields available to the user. +<P> +<TABLE> +<TR><TD><b>x</b></TD><TD>X co-ordinate of rectangle to skip</TD> +<TR><TD><b>y</b></TD><TD>Y co-ordinate of rectangle to skip</TD> +<TR><TD><b>width</b></TD><TD>Width of rectangle to skip</TD> +<TR><TD><b>height</b></TD><TD>Height of rectangle to skip</TD> +</TABLE> +<P> +Merely setting the window does not enable capturing. Overlay capturing +(i.e. PCI-PCI transfer to the frame buffer of the video card) +is activated by passing the <b>VIDIOCCAPTURE</b> ioctl a value of 1, and +disabled by passing it a value of 0. +<P> +Some capture devices can capture a subfield of the image they actually see. +This is indicated when VIDEO_TYPE_SUBCAPTURE is defined. +The video_capture describes the time and special subfields to capture. +The video_capture structure contains the following fields. +<P> +<TABLE> +<TR><TD><b>x</b></TD><TD>X co-ordinate of source rectangle to grab</TD> +<TR><TD><b>y</b></TD><TD>Y co-ordinate of source rectangle to grab</TD> +<TR><TD><b>width</b></TD><TD>Width of source rectangle to grab</TD> +<TR><TD><b>height</b></TD><TD>Height of source rectangle to grab</TD> +<TR><TD><b>decimation</b></TD><TD>Decimation to apply</TD> +<TR><TD><b>flags</b></TD><TD>Flag settings for grabbing</TD> +</TABLE> +The available flags are +<P> +<TABLE> +<TR><TH>Name</TH><TH>Description</TH> +<TR><TD><b>VIDEO_CAPTURE_ODD</b><TD>Capture only odd frames</TD> +<TR><TD><b>VIDEO_CAPTURE_EVEN</b><TD>Capture only even frames</TD> +</TABLE> +<P> +<H3>Video Sources</H3> +Each video4linux video or audio device captures from one or more +source <b>channels</b>. Each channel can be queries with the +<b>VDIOCGCHAN</b> ioctl call. Before invoking this function the caller +must set the channel field to the channel that is being queried. On return +the <b>struct video_channel</b> is filled in with information about the +nature of the channel itself. +<P> +The <b>VIDIOCSCHAN</b> ioctl takes an integer argument and switches the +capture to this input. It is not defined whether parameters such as colour +settings or tuning are maintained across a channel switch. The caller should +maintain settings as desired for each channel. (This is reasonable as +different video inputs may have different properties). +<P> +The <b>struct video_channel</b> consists of the following +<P> +<TABLE> +<TR><TD><b>channel</b></TD><TD>The channel number</TD> +<TR><TD><b>name</b></TD><TD>The input name - preferably reflecting the label +on the card input itself</TD> +<TR><TD><b>tuners</b></TD><TD>Number of tuners for this input</TD> +<TR><TD><b>flags</b></TD><TD>Properties the tuner has</TD> +<TR><TD><b>type</b></TD><TD>Input type (if known)</TD> +<TR><TD><b>norm</b><TD>The norm for this channel</TD> +</TABLE> +<P> +The flags defined are +<P> +<TABLE> +<TR><TD><b>VIDEO_VC_TUNER</b><TD>Channel has tuners.</TD> +<TR><TD><b>VIDEO_VC_AUDIO</b><TD>Channel has audio.</TD> +<TR><TD><b>VIDEO_VC_NORM</b><TD>Channel has norm setting.</TD> +</TABLE> +<P> +The types defined are +<P> +<TABLE> +<TR><TD><b>VIDEO_TYPE_TV</b><TD>The input is a TV input.</TD> +<TR><TD><b>VIDEO_TYPE_CAMERA</b><TD>The input is a camera.</TD> +</TABLE> +<P> +<H3>Image Properties</H3> +The image properties of the picture can be queried with the <b>VIDIOCGPICT</b> +ioctl which fills in a <b>struct video_picture</b>. The <b>VIDIOCSPICT</b> +ioctl allows values to be changed. All values except for the palette type +are scaled between 0-65535. +<P> +The <b>struct video_picture</b> consists of the following fields +<P> +<TABLE> +<TR><TD><b>brightness</b><TD>Picture brightness</TD> +<TR><TD><b>hue</b><TD>Picture hue (colour only)</TD> +<TR><TD><b>colour</b><TD>Picture colour (colour only)</TD> +<TR><TD><b>contrast</b><TD>Picture contrast</TD> +<TR><TD><b>whiteness</b><TD>The whiteness (greyscale only)</TD> +<TR><TD><b>depth</b><TD>The capture depth (may need to match the frame buffer depth)</TD> +<TR><TD><b>palette</b><TD>Reports the palette that should be used for this image</TD> +</TABLE> +<P> +The following palettes are defined +<P> +<TABLE> +<TR><TD><b>VIDEO_PALETTE_GREY</b><TD>Linear intensity grey scale (255 is brightest).</TD> +<TR><TD><b>VIDEO_PALETTE_HI240</b><TD>The BT848 8bit colour cube.</TD> +<TR><TD><b>VIDEO_PALETTE_RGB565</b><TD>RGB565 packed into 16 bit words.</TD> +<TR><TD><b>VIDEO_PALETTE_RGB555</b><TD>RGV555 packed into 16 bit words, top bit undefined.</TD> +<TR><TD><b>VIDEO_PALETTE_RGB24</b><TD>RGB888 packed into 24bit words.</TD> +<TR><TD><b>VIDEO_PALETTE_RGB32</b><TD>RGB888 packed into the low 3 bytes of 32bit words. The top 8bits are undefined.</TD> +<TR><TD><b>VIDEO_PALETTE_YUV422</b><TD>Video style YUV422 - 8bits packed 4bits Y 2bits U 2bits V</TD> +<TR><TD><b>VIDEO_PALETTE_YUYV</b><TD>Describe me</TD> +<TR><TD><b>VIDEO_PALETTE_UYVY</b><TD>Describe me</TD> +<TR><TD><b>VIDEO_PALETTE_YUV420</b><TD>YUV420 capture</TD> +<TR><TD><b>VIDEO_PALETTE_YUV411</b><TD>YUV411 capture</TD> +<TR><TD><b>VIDEO_PALETTE_RAW</b><TD>RAW capture (BT848)</TD> +<TR><TD><b>VIDEO_PALETTE_YUV422P</b><TD>YUV 4:2:2 Planar</TD> +<TR><TD><b>VIDEO_PALETTE_YUV411P</b><TD>YUV 4:1:1 Planar</TD> +</TABLE> +<P> +<H3>Tuning</H3> +Each video input channel can have one or more tuners associated with it. Many +devices will not have tuners. TV cards and radio cards will have one or more +tuners attached. +<P> +Tuners are described by a <b>struct video_tuner</b> which can be obtained by +the <b>VIDIOCGTUNER</b> ioctl. Fill in the tuner number in the structure +then pass the structure to the ioctl to have the data filled in. The +tuner can be switched using <b>VIDIOCSTUNER</b> which takes an integer argument +giving the tuner to use. A struct tuner has the following fields +<P> +<TABLE> +<TR><TD><b>tuner</b><TD>Number of the tuner</TD> +<TR><TD><b>name</b><TD>Canonical name for this tuner (eg FM/AM/TV)</TD> +<TR><TD><b>rangelow</b><TD>Lowest tunable frequency</TD> +<TR><TD><b>rangehigh</b><TD>Highest tunable frequency</TD> +<TR><TD><b>flags</b><TD>Flags describing the tuner</TD> +<TR><TD><b>mode</b><TD>The video signal mode if relevant</TD> +<TR><TD><b>signal</b><TD>Signal strength if known - between 0-65535</TD> +</TABLE> +<P> +The following flags exist +<P> +<TABLE> +<TR><TD><b>VIDEO_TUNER_PAL</b><TD>PAL tuning is supported</TD> +<TR><TD><b>VIDEO_TUNER_NTSC</b><TD>NTSC tuning is supported</TD> +<TR><TD><b>VIDEO_TUNER_SECAM</b><TD>SECAM tuning is supported</TD> +<TR><TD><b>VIDEO_TUNER_LOW</b><TD>Frequency is in a lower range</TD> +<TR><TD><b>VIDEO_TUNER_NORM</b><TD>The norm for this tuner is settable</TD> +<TR><TD><b>VIDEO_TUNER_STEREO_ON</b><TD>The tuner is seeing stereo audio</TD> +<TR><TD><b>VIDEO_TUNER_RDS_ON</b><TD>The tuner is seeing a RDS datastream</TD> +<TR><TD><b>VIDEO_TUNER_MBS_ON</b><TD>The tuner is seeing a MBS datastream</TD> +</TABLE> +<P> +The following modes are defined +<P> +<TABLE> +<TR><TD><b>VIDEO_MODE_PAL</b><TD>The tuner is in PAL mode</TD> +<TR><TD><b>VIDEO_MODE_NTSC</b><TD>The tuner is in NTSC mode</TD> +<TR><TD><b>VIDEO_MODE_SECAM</b><TD>The tuner is in SECAM mode</TD> +<TR><TD><b>VIDEO_MODE_AUTO</b><TD>The tuner auto switches, or mode does not apply</TD> +</TABLE> +<P> +Tuning frequencies are an unsigned 32bit value in 1/16th MHz or if the +<b>VIDEO_TUNER_LOW</b> flag is set they are in 1/16th KHz. The current +frequency is obtained as an unsigned long via the <b>VIDIOCGFREQ</b> ioctl and +set by the <b>VIDIOCSFREQ</b> ioctl. +<P> +<H3>Audio</H3> +TV and Radio devices have one or more audio inputs that may be selected. +The audio properties are queried by passing a <b>struct video_audio</b> to <b>VIDIOCGAUDIO</b> ioctl. The +<b>VIDIOCSAUDIO</b> ioctl sets audio properties. +<P> +The structure contains the following fields +<P> +<TABLE> +<TR><TD><b>audio</b><TD>The channel number</TD> +<TR><TD><b>volume</b><TD>The volume level</TD> +<TR><TD><b>bass</b><TD>The bass level</TD> +<TR><TD><b>treble</b><TD>The treble level</TD> +<TR><TD><b>flags</b><TD>Flags describing the audio channel</TD> +<TR><TD><b>name</b><TD>Canonical name for the audio input</TD> +<TR><TD><b>mode</b><TD>The mode the audio input is in</TD> +<TR><TD><b>balance</b><TD>The left/right balance</TD> +<TR><TD><b>step</b><TD>Actual step used by the hardware</TD> +</TABLE> +<P> +The following flags are defined +<P> +<TABLE> +<TR><TD><b>VIDEO_AUDIO_MUTE</b><TD>The audio is muted</TD> +<TR><TD><b>VIDEO_AUDIO_MUTABLE</b><TD>Audio muting is supported</TD> +<TR><TD><b>VIDEO_AUDIO_VOLUME</b><TD>The volume is controllable</TD> +<TR><TD><b>VIDEO_AUDIO_BASS</b><TD>The bass is controllable</TD> +<TR><TD><b>VIDEO_AUDIO_TREBLE</b><TD>The treble is controllable</TD> +<TR><TD><b>VIDEO_AUDIO_BALANCE</b><TD>The balance is controllable</TD> +</TABLE> +<P> +The following decoding modes are defined +<P> +<TABLE> +<TR><TD><b>VIDEO_SOUND_MONO</b><TD>Mono signal</TD> +<TR><TD><b>VIDEO_SOUND_STEREO</b><TD>Stereo signal (NICAM for TV)</TD> +<TR><TD><b>VIDEO_SOUND_LANG1</b><TD>European TV alternate language 1</TD> +<TR><TD><b>VIDEO_SOUND_LANG2</b><TD>European TV alternate language 2</TD> +</TABLE> +<P> +<H3>Reading Images</H3> +Each call to the <b>read</b> syscall returns the next available image +from the device. It is up to the caller to set format and size (using +the VIDIOCSPICT and VIDIOCSWIN ioctls) and then to pass a suitable +size buffer and length to the function. Not all devices will support +read operations. +<P> +A second way to handle image capture is via the mmap interface if supported. +To use the mmap interface a user first sets the desired image size and depth +properties. Next the VIDIOCGMBUF ioctl is issued. This reports the size +of buffer to mmap and the offset within the buffer for each frame. The +number of frames supported is device dependent and may only be one. +<P> +The video_mbuf structure contains the following fields +<P> +<TABLE> +<TR><TD><b>size</b><TD>The number of bytes to map</TD> +<TR><TD><b>frames</b><TD>The number of frames</TD> +<TR><TD><b>offsets</b><TD>The offset of each frame</TD> +</TABLE> +<P> +Once the mmap has been made the VIDIOCMCAPTURE ioctl starts the +capture to a frame using the format and image size specified in the +video_mmap (which should match or be below the initial query size). +When the VIDIOCMCAPTURE ioctl returns the frame is <em>not</em> +captured yet, the driver just instructed the hardware to start the +capture. The application has to use the VIDIOCSYNC ioctl to wait +until the capture of a frame is finished. VIDIOCSYNC takes the frame +number you want to wait for as argument. +<p> +It is allowed to call VIDIOCMCAPTURE multiple times (with different +frame numbers in video_mmap->frame of course) and thus have multiple +outstanding capture requests. A simple way do to double-buffering +using this feature looks like this: +<pre> +/* setup everything */ +VIDIOCMCAPTURE(0) +while (whatever) { + VIDIOCMCAPTURE(1) + VIDIOCSYNC(0) + /* process frame 0 while the hardware captures frame 1 */ + VIDIOCMCAPTURE(0) + VIDIOCSYNC(1) + /* process frame 1 while the hardware captures frame 0 */ +} +</pre> +Note that you are <em>not</em> limited to only two frames. The API +allows up to 32 frames, the VIDIOCGMBUF ioctl returns the number of +frames the driver granted. Thus it is possible to build deeper queues +to avoid loosing frames on load peaks. +<p> +While capturing to memory the driver will make a "best effort" attempt +to capture to screen as well if requested. This normally means all +frames that "miss" memory mapped capture will go to the display. +<P> +A final ioctl exists to allow a device to obtain related devices if a +driver has multiple components (for example video0 may not be associated +with vbi0 which would cause an intercast display program to make a bad +mistake). The VIDIOCGUNIT ioctl reports the unit numbers of the associated +devices if any exist. The video_unit structure has the following fields. +<P> +<TABLE> +<TR><TD><b>video</b><TD>Video capture device</TD> +<TR><TD><b>vbi</b><TD>VBI capture device</TD> +<TR><TD><b>radio</b><TD>Radio device</TD> +<TR><TD><b>audio</b><TD>Audio mixer</TD> +<TR><TD><b>teletext</b><TD>Teletext device</TD> +</TABLE> +<P> +<H3>RDS Datastreams</H3> +For radio devices that support it, it is possible to receive Radio Data +System (RDS) data by means of a read() on the device. The data is packed in +groups of three, as follows: +<TABLE> +<TR><TD>First Octet</TD><TD>Least Significant Byte of RDS Block</TD></TR> +<TR><TD>Second Octet</TD><TD>Most Significant Byte of RDS Block +<TR><TD>Third Octet</TD><TD>Bit 7:</TD><TD>Error bit. Indicates that +an uncorrectable error occurred during reception of this block.</TD></TR> +<TR><TD> </TD><TD>Bit 6:</TD><TD>Corrected bit. Indicates that +an error was corrected for this data block.</TD></TR> +<TR><TD> </TD><TD>Bits 5-3:</TD><TD>Received Offset. Indicates the +offset received by the sync system.</TD></TR> +<TR><TD> </TD><TD>Bits 2-0:</TD><TD>Offset Name. Indicates the +offset applied to this data.</TD></TR> +</TABLE> +</BODY> +</HTML> diff --git a/Documentation/video4linux/CARDLIST.bttv b/Documentation/video4linux/CARDLIST.bttv new file mode 100644 index 000000000000..e46761c39e3f --- /dev/null +++ b/Documentation/video4linux/CARDLIST.bttv @@ -0,0 +1,121 @@ +card=0 - *** UNKNOWN/GENERIC *** +card=1 - MIRO PCTV +card=2 - Hauppauge (bt848) +card=3 - STB, Gateway P/N 6000699 (bt848) +card=4 - Intel Create and Share PCI/ Smart Video Recorder III +card=5 - Diamond DTV2000 +card=6 - AVerMedia TVPhone +card=7 - MATRIX-Vision MV-Delta +card=8 - Lifeview FlyVideo II (Bt848) LR26 / MAXI TV Video PCI2 LR26 +card=9 - IMS/IXmicro TurboTV +card=10 - Hauppauge (bt878) +card=11 - MIRO PCTV pro +card=12 - ADS Technologies Channel Surfer TV (bt848) +card=13 - AVerMedia TVCapture 98 +card=14 - Aimslab Video Highway Xtreme (VHX) +card=15 - Zoltrix TV-Max +card=16 - Prolink Pixelview PlayTV (bt878) +card=17 - Leadtek WinView 601 +card=18 - AVEC Intercapture +card=19 - Lifeview FlyVideo II EZ /FlyKit LR38 Bt848 (capture only) +card=20 - CEI Raffles Card +card=21 - Lifeview FlyVideo 98/ Lucky Star Image World ConferenceTV LR50 +card=22 - Askey CPH050/ Phoebe Tv Master + FM +card=23 - Modular Technology MM201/MM202/MM205/MM210/MM215 PCTV, bt878 +card=24 - Askey CPH05X/06X (bt878) [many vendors] +card=25 - Terratec TerraTV+ Version 1.0 (Bt848)/ Terra TValue Version 1.0/ Vobis TV-Boostar +card=26 - Hauppauge WinCam newer (bt878) +card=27 - Lifeview FlyVideo 98/ MAXI TV Video PCI2 LR50 +card=28 - Terratec TerraTV+ Version 1.1 (bt878) +card=29 - Imagenation PXC200 +card=30 - Lifeview FlyVideo 98 LR50 +card=31 - Formac iProTV, Formac ProTV I (bt848) +card=32 - Intel Create and Share PCI/ Smart Video Recorder III +card=33 - Terratec TerraTValue Version Bt878 +card=34 - Leadtek WinFast 2000/ WinFast 2000 XP +card=35 - Lifeview FlyVideo 98 LR50 / Chronos Video Shuttle II +card=36 - Lifeview FlyVideo 98FM LR50 / Typhoon TView TV/FM Tuner +card=37 - Prolink PixelView PlayTV pro +card=38 - Askey CPH06X TView99 +card=39 - Pinnacle PCTV Studio/Rave +card=40 - STB TV PCI FM, Gateway P/N 6000704 (bt878), 3Dfx VoodooTV 100 +card=41 - AVerMedia TVPhone 98 +card=42 - ProVideo PV951 +card=43 - Little OnAir TV +card=44 - Sigma TVII-FM +card=45 - MATRIX-Vision MV-Delta 2 +card=46 - Zoltrix Genie TV/FM +card=47 - Terratec TV/Radio+ +card=48 - Askey CPH03x/ Dynalink Magic TView +card=49 - IODATA GV-BCTV3/PCI +card=50 - Prolink PV-BT878P+4E / PixelView PlayTV PAK / Lenco MXTV-9578 CP +card=51 - Eagle Wireless Capricorn2 (bt878A) +card=52 - Pinnacle PCTV Studio Pro +card=53 - Typhoon TView RDS + FM Stereo / KNC1 TV Station RDS +card=54 - Lifeview FlyVideo 2000 /FlyVideo A2/ Lifetec LT 9415 TV [LR90] +card=55 - Askey CPH031/ BESTBUY Easy TV +card=56 - Lifeview FlyVideo 98FM LR50 +card=57 - GrandTec 'Grand Video Capture' (Bt848) +card=58 - Askey CPH060/ Phoebe TV Master Only (No FM) +card=59 - Askey CPH03x TV Capturer +card=60 - Modular Technology MM100PCTV +card=61 - AG Electronics GMV1 +card=62 - Askey CPH061/ BESTBUY Easy TV (bt878) +card=63 - ATI TV-Wonder +card=64 - ATI TV-Wonder VE +card=65 - Lifeview FlyVideo 2000S LR90 +card=66 - Terratec TValueRadio +card=67 - IODATA GV-BCTV4/PCI +card=68 - 3Dfx VoodooTV FM (Euro), VoodooTV 200 (USA) +card=69 - Active Imaging AIMMS +card=70 - Prolink Pixelview PV-BT878P+ (Rev.4C,8E) +card=71 - Lifeview FlyVideo 98EZ (capture only) LR51 +card=72 - Prolink Pixelview PV-BT878P+9B (PlayTV Pro rev.9B FM+NICAM) +card=73 - Sensoray 311 +card=74 - RemoteVision MX (RV605) +card=75 - Powercolor MTV878/ MTV878R/ MTV878F +card=76 - Canopus WinDVR PCI (COMPAQ Presario 3524JP, 5112JP) +card=77 - GrandTec Multi Capture Card (Bt878) +card=78 - Jetway TV/Capture JW-TV878-FBK, Kworld KW-TV878RF +card=79 - DSP Design TCVIDEO +card=80 - Hauppauge WinTV PVR +card=81 - IODATA GV-BCTV5/PCI +card=82 - Osprey 100/150 (878) +card=83 - Osprey 100/150 (848) +card=84 - Osprey 101 (848) +card=85 - Osprey 101/151 +card=86 - Osprey 101/151 w/ svid +card=87 - Osprey 200/201/250/251 +card=88 - Osprey 200/250 +card=89 - Osprey 210/220 +card=90 - Osprey 500 +card=91 - Osprey 540 +card=92 - Osprey 2000 +card=93 - IDS Eagle +card=94 - Pinnacle PCTV Sat +card=95 - Formac ProTV II (bt878) +card=96 - MachTV +card=97 - Euresys Picolo +card=98 - ProVideo PV150 +card=99 - AD-TVK503 +card=100 - Hercules Smart TV Stereo +card=101 - Pace TV & Radio Card +card=102 - IVC-200 +card=103 - Grand X-Guard / Trust 814PCI +card=104 - Nebula Electronics DigiTV +card=105 - ProVideo PV143 +card=106 - PHYTEC VD-009-X1 MiniDIN (bt878) +card=107 - PHYTEC VD-009-X1 Combi (bt878) +card=108 - PHYTEC VD-009 MiniDIN (bt878) +card=109 - PHYTEC VD-009 Combi (bt878) +card=110 - IVC-100 +card=111 - IVC-120G +card=112 - pcHDTV HD-2000 TV +card=113 - Twinhan DST + clones +card=114 - Winfast VC100 +card=115 - Teppro TEV-560/InterVision IV-560 +card=116 - SIMUS GVC1100 +card=117 - NGS NGSTV+ +card=118 - LMLBT4 +card=119 - Tekram M205 PRO +card=120 - Conceptronic CONTVFMi diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 new file mode 100644 index 000000000000..a6c82fa4de02 --- /dev/null +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -0,0 +1,35 @@ + 0 -> UNKNOWN/GENERIC + 1 -> Proteus Pro [philips reference design] [1131:2001,1131:2001] + 2 -> LifeView FlyVIDEO3000 [5168:0138,4e42:0138] + 3 -> LifeView FlyVIDEO2000 [5168:0138] + 4 -> EMPRESS [1131:6752] + 5 -> SKNet Monster TV [1131:4e85] + 6 -> Tevion MD 9717 + 7 -> KNC One TV-Station RDS / Typhoon TV Tuner RDS [1131:fe01,1894:fe01] + 8 -> Terratec Cinergy 400 TV [153B:1142] + 9 -> Medion 5044 + 10 -> Kworld/KuroutoShikou SAA7130-TVPCI + 11 -> Terratec Cinergy 600 TV [153B:1143] + 12 -> Medion 7134 [16be:0003] + 13 -> Typhoon TV+Radio 90031 + 14 -> ELSA EX-VISION 300TV [1048:226b] + 15 -> ELSA EX-VISION 500TV [1048:226b] + 16 -> ASUS TV-FM 7134 [1043:4842,1043:4830,1043:4840] + 17 -> AOPEN VA1000 POWER [1131:7133] + 18 -> BMK MPEX No Tuner + 19 -> Compro VideoMate TV [185b:c100] + 20 -> Matrox CronosPlus [102B:48d0] + 21 -> 10MOONS PCI TV CAPTURE CARD [1131:2001] + 22 -> Medion 2819/ AverMedia M156 [1461:a70b,1461:2115] + 23 -> BMK MPEX Tuner + 24 -> KNC One TV-Station DVR [1894:a006] + 25 -> ASUS TV-FM 7133 [1043:4843] + 26 -> Pinnacle PCTV Stereo (saa7134) [11bd:002b] + 27 -> Manli MuchTV M-TV002 + 28 -> Manli MuchTV M-TV001 + 29 -> Nagase Sangyo TransGear 3000TV [1461:050c] + 30 -> Elitegroup ECS TVP3XP FM1216 Tuner Card(PAL-BG,FM) [1019:4cb4] + 31 -> Elitegroup ECS TVP3XP FM1236 Tuner Card (NTSC,FM) [1019:4cb5] + 32 -> AVACS SmartTV + 33 -> AVerMedia DVD EZMaker [1461:10ff] + 34 -> LifeView FlyTV Platinum33 mini [5168:0212] diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner new file mode 100644 index 000000000000..f7bafe862ba0 --- /dev/null +++ b/Documentation/video4linux/CARDLIST.tuner @@ -0,0 +1,46 @@ +tuner=0 - Temic PAL (4002 FH5) +tuner=1 - Philips PAL_I (FI1246 and compatibles) +tuner=2 - Philips NTSC (FI1236,FM1236 and compatibles) +tuner=3 - Philips (SECAM+PAL_BG) (FI1216MF, FM1216MF, FR1216MF) +tuner=4 - NoTuner +tuner=5 - Philips PAL_BG (FI1216 and compatibles) +tuner=6 - Temic NTSC (4032 FY5) +tuner=7 - Temic PAL_I (4062 FY5) +tuner=8 - Temic NTSC (4036 FY5) +tuner=9 - Alps HSBH1 +tuner=10 - Alps TSBE1 +tuner=11 - Alps TSBB5 +tuner=12 - Alps TSBE5 +tuner=13 - Alps TSBC5 +tuner=14 - Temic PAL_BG (4006FH5) +tuner=15 - Alps TSCH6 +tuner=16 - Temic PAL_DK (4016 FY5) +tuner=17 - Philips NTSC_M (MK2) +tuner=18 - Temic PAL_I (4066 FY5) +tuner=19 - Temic PAL* auto (4006 FN5) +tuner=20 - Temic PAL_BG (4009 FR5) or PAL_I (4069 FR5) +tuner=21 - Temic NTSC (4039 FR5) +tuner=22 - Temic PAL/SECAM multi (4046 FM5) +tuner=23 - Philips PAL_DK (FI1256 and compatibles) +tuner=24 - Philips PAL/SECAM multi (FQ1216ME) +tuner=25 - LG PAL_I+FM (TAPC-I001D) +tuner=26 - LG PAL_I (TAPC-I701D) +tuner=27 - LG NTSC+FM (TPI8NSR01F) +tuner=28 - LG PAL_BG+FM (TPI8PSB01D) +tuner=29 - LG PAL_BG (TPI8PSB11D) +tuner=30 - Temic PAL* auto + FM (4009 FN5) +tuner=31 - SHARP NTSC_JP (2U5JF5540) +tuner=32 - Samsung PAL TCPM9091PD27 +tuner=33 - MT20xx universal +tuner=34 - Temic PAL_BG (4106 FH5) +tuner=35 - Temic PAL_DK/SECAM_L (4012 FY5) +tuner=36 - Temic NTSC (4136 FY5) +tuner=37 - LG PAL (newer TAPC series) +tuner=38 - Philips PAL/SECAM multi (FM1216ME MK3) +tuner=39 - LG NTSC (newer TAPC series) +tuner=40 - HITACHI V7-J180AT +tuner=41 - Philips PAL_MK (FI1216 MK) +tuner=42 - Philips 1236D ATSC/NTSC daul in +tuner=43 - Philips NTSC MK3 (FM1236MK3 or FM1236/F) +tuner=44 - Philips 4 in 1 (ATI TV Wonder Pro/Conexant) +tuner=45 - Microtune 4049 FM5 diff --git a/Documentation/video4linux/CQcam.txt b/Documentation/video4linux/CQcam.txt new file mode 100644 index 000000000000..e415e3604539 --- /dev/null +++ b/Documentation/video4linux/CQcam.txt @@ -0,0 +1,412 @@ +c-qcam - Connectix Color QuickCam video4linux kernel driver + +Copyright (C) 1999 Dave Forrest <drf5n@virginia.edu> + released under GNU GPL. + +1999-12-08 Dave Forrest, written with kernel version 2.2.12 in mind + + +Table of Contents + +1.0 Introduction +2.0 Compilation, Installation, and Configuration +3.0 Troubleshooting +4.0 Future Work / current work arounds +9.0 Sample Program, v4lgrab +10.0 Other Information + + +1.0 Introduction + + The file ../drivers/char/c-qcam.c is a device driver for the +Logitech (nee Connectix) parallel port interface color CCD camera. +This is a fairly inexpensive device for capturing images. Logitech +does not currently provide information for developers, but many people +have engineered several solutions for non-Microsoft use of the Color +Quickcam. + +1.1 Motivation + + I spent a number of hours trying to get my camera to work, and I +hope this document saves you some time. My camera will not work with +the 2.2.13 kernel as distributed, but with a few patches to the +module, I was able to grab some frames. See 4.0, Future Work. + + + +2.0 Compilation, Installation, and Configuration + + The c-qcam depends on parallel port support, video4linux, and the +Color Quickcam. It is also nice to have the parallel port readback +support enabled. I enabled these as modules during the kernel +configuration. The appropriate flags are: + + CONFIG_PRINTER M for lp.o, parport.o parport_pc.o modules + CONFIG_PNP_PARPORT M for autoprobe.o IEEE1284 readback module + CONFIG_PRINTER_READBACK M for parport_probe.o IEEE1284 readback module + CONFIG_VIDEO_DEV M for videodev.o video4linux module + CONFIG_VIDEO_CQCAM M for c-qcam.o Color Quickcam module + + With these flags, the kernel should compile and install the modules. +To record and monitor the compilation, I use: + + (make zlilo ; \ + make modules; \ + make modules_install ; + depmod -a ) &>log & + less log # then a capital 'F' to watch the progress + +But that is my personal preference. + +2.2 Configuration + + The configuration requires module configuration and device +configuration. I like kmod or kerneld process with the +/etc/modprobe.conf file so the modules can automatically load/unload as +they are used. The video devices could already exist, be generated +using MAKEDEV, or need to be created. The following sections detail +these procedures. + + +2.1 Module Configuration + + Using modules requires a bit of work to install and pass the +parameters. Understand that entries in /etc/modprobe.conf of: + + alias parport_lowlevel parport_pc + options parport_pc io=0x378 irq=none + alias char-major-81 videodev + alias char-major-81-0 c-qcam + +will cause the kmod/modprobe to do certain things. If you are +using kmod, then a request for a 'char-major-81-0' will cause +the 'c-qcam' module to load. If you have other video sources with +modules, you might want to assign the different minor numbers to +different modules. + +2.2 Device Configuration + + At this point, we need to ensure that the device files exist. +Video4linux used the /dev/video* files, and we want to attach the +Quickcam to one of these. + + ls -lad /dev/video* # should produce a list of the video devices + +If the video devices do not exist, you can create them with: + + su + cd /dev + for ii in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ; do + mknod video$ii c 81 $ii # char-major-81-[0-16] + chown root.root video$ii # owned by root + chmod 600 video$ii # read/writable by root only + done + + Lots of people connect video0 to video and bttv, but you might want +your c-qcam to mean something more: + + ln -s video0 c-qcam # make /dev/c-qcam a working file + ln -s c-qcam video # make /dev/c-qcam your default video source + + But these are conveniences. The important part is to make the proper +special character files with the right major and minor numbers. All +of the special device files are listed in ../devices.txt. If you +would like the c-qcam readable by non-root users, you will need to +change the permissions. + +3.0 Troubleshooting + + If the sample program below, v4lgrab, gives you output then +everything is working. + + v4lgrab | wc # should give you a count of characters + + Otherwise, you have some problem. + + The c-qcam is IEEE1284 compatible, so if you are using the proc file +system (CONFIG_PROC_FS), the parallel printer support +(CONFIG_PRINTER), the IEEE 1284 system,(CONFIG_PRINTER_READBACK), you +should be able to read some identification from your quickcam with + + modprobe -v parport + modprobe -v parport_probe + cat /proc/parport/PORTNUMBER/autoprobe +Returns: + CLASS:MEDIA; + MODEL:Color QuickCam 2.0; + MANUFACTURER:Connectix; + + A good response to this indicates that your color quickcam is alive +and well. A common problem is that the current driver does not +reliably detect a c-qcam, even though one is attached. In this case, + + modprobe -v c-qcam +or + insmod -v c-qcam + + Returns a message saying "Device or resource busy" Development is +currently underway, but a workaround is to patch the module to skip +the detection code and attach to a defined port. Check the +video4linux mailing list and archive for more current information. + +3.1 Checklist: + + Can you get an image? + v4lgrab >qcam.ppm ; wc qcam.ppm ; xv qcam.ppm + + Is a working c-qcam connected to the port? + grep ^ /proc/parport/?/autoprobe + + Do the /dev/video* files exist? + ls -lad /dev/video + + Is the c-qcam module loaded? + modprobe -v c-qcam ; lsmod + + Does the camera work with alternate programs? cqcam, etc? + + + + +4.0 Future Work / current workarounds + + It is hoped that this section will soon become obsolete, but if it +isn't, you might try patching the c-qcam module to add a parport=xxx +option as in the bw-qcam module so you can specify the parallel port: + + insmod -v c-qcam parport=0 + +And bypass the detection code, see ../../drivers/char/c-qcam.c and +look for the 'qc_detect' code and call. + + Note that there is work in progress to change the video4linux API, +this work is documented at the video4linux2 site listed below. + + +9.0 --- A sample program using v4lgrabber, + +This program is a simple image grabber that will copy a frame from the +first video device, /dev/video0 to standard output in portable pixmap +format (.ppm) Using this like: 'v4lgrab | convert - c-qcam.jpg' +produced this picture of me at + http://mug.sys.virginia.edu/~drf5n/extras/c-qcam.jpg + +-------------------- 8< ---------------- 8< ----------------------------- + +/* Simple Video4Linux image grabber. */ +/* + * Video4Linux Driver Test/Example Framegrabbing Program + * + * Compile with: + * gcc -s -Wall -Wstrict-prototypes v4lgrab.c -o v4lgrab + * Use as: + * v4lgrab >image.ppm + * + * Copyright (C) 1998-05-03, Phil Blundell <philb@gnu.org> + * Copied from http://www.tazenda.demon.co.uk/phil/vgrabber.c + * with minor modifications (Dave Forrest, drf5n@virginia.edu). + * + */ + +#include <unistd.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> +#include <stdio.h> +#include <sys/ioctl.h> +#include <stdlib.h> + +#include <linux/types.h> +#include <linux/videodev.h> + +#define FILE "/dev/video0" + +/* Stole this from tvset.c */ + +#define READ_VIDEO_PIXEL(buf, format, depth, r, g, b) \ +{ \ + switch (format) \ + { \ + case VIDEO_PALETTE_GREY: \ + switch (depth) \ + { \ + case 4: \ + case 6: \ + case 8: \ + (r) = (g) = (b) = (*buf++ << 8);\ + break; \ + \ + case 16: \ + (r) = (g) = (b) = \ + *((unsigned short *) buf); \ + buf += 2; \ + break; \ + } \ + break; \ + \ + \ + case VIDEO_PALETTE_RGB565: \ + { \ + unsigned short tmp = *(unsigned short *)buf; \ + (r) = tmp&0xF800; \ + (g) = (tmp<<5)&0xFC00; \ + (b) = (tmp<<11)&0xF800; \ + buf += 2; \ + } \ + break; \ + \ + case VIDEO_PALETTE_RGB555: \ + (r) = (buf[0]&0xF8)<<8; \ + (g) = ((buf[0] << 5 | buf[1] >> 3)&0xF8)<<8; \ + (b) = ((buf[1] << 2 ) & 0xF8)<<8; \ + buf += 2; \ + break; \ + \ + case VIDEO_PALETTE_RGB24: \ + (r) = buf[0] << 8; (g) = buf[1] << 8; \ + (b) = buf[2] << 8; \ + buf += 3; \ + break; \ + \ + default: \ + fprintf(stderr, \ + "Format %d not yet supported\n", \ + format); \ + } \ +} + +int get_brightness_adj(unsigned char *image, long size, int *brightness) { + long i, tot = 0; + for (i=0;i<size*3;i++) + tot += image[i]; + *brightness = (128 - tot/(size*3))/3; + return !((tot/(size*3)) >= 126 && (tot/(size*3)) <= 130); +} + +int main(int argc, char ** argv) +{ + int fd = open(FILE, O_RDONLY), f; + struct video_capability cap; + struct video_window win; + struct video_picture vpic; + + unsigned char *buffer, *src; + int bpp = 24, r, g, b; + unsigned int i, src_depth; + + if (fd < 0) { + perror(FILE); + exit(1); + } + + if (ioctl(fd, VIDIOCGCAP, &cap) < 0) { + perror("VIDIOGCAP"); + fprintf(stderr, "(" FILE " not a video4linux device?)\n"); + close(fd); + exit(1); + } + + if (ioctl(fd, VIDIOCGWIN, &win) < 0) { + perror("VIDIOCGWIN"); + close(fd); + exit(1); + } + + if (ioctl(fd, VIDIOCGPICT, &vpic) < 0) { + perror("VIDIOCGPICT"); + close(fd); + exit(1); + } + + if (cap.type & VID_TYPE_MONOCHROME) { + vpic.depth=8; + vpic.palette=VIDEO_PALETTE_GREY; /* 8bit grey */ + if(ioctl(fd, VIDIOCSPICT, &vpic) < 0) { + vpic.depth=6; + if(ioctl(fd, VIDIOCSPICT, &vpic) < 0) { + vpic.depth=4; + if(ioctl(fd, VIDIOCSPICT, &vpic) < 0) { + fprintf(stderr, "Unable to find a supported capture format.\n"); + close(fd); + exit(1); + } + } + } + } else { + vpic.depth=24; + vpic.palette=VIDEO_PALETTE_RGB24; + + if(ioctl(fd, VIDIOCSPICT, &vpic) < 0) { + vpic.palette=VIDEO_PALETTE_RGB565; + vpic.depth=16; + + if(ioctl(fd, VIDIOCSPICT, &vpic)==-1) { + vpic.palette=VIDEO_PALETTE_RGB555; + vpic.depth=15; + + if(ioctl(fd, VIDIOCSPICT, &vpic)==-1) { + fprintf(stderr, "Unable to find a supported capture format.\n"); + return -1; + } + } + } + } + + buffer = malloc(win.width * win.height * bpp); + if (!buffer) { + fprintf(stderr, "Out of memory.\n"); + exit(1); + } + + do { + int newbright; + read(fd, buffer, win.width * win.height * bpp); + f = get_brightness_adj(buffer, win.width * win.height, &newbright); + if (f) { + vpic.brightness += (newbright << 8); + if(ioctl(fd, VIDIOCSPICT, &vpic)==-1) { + perror("VIDIOSPICT"); + break; + } + } + } while (f); + + fprintf(stdout, "P6\n%d %d 255\n", win.width, win.height); + + src = buffer; + + for (i = 0; i < win.width * win.height; i++) { + READ_VIDEO_PIXEL(src, vpic.palette, src_depth, r, g, b); + fputc(r>>8, stdout); + fputc(g>>8, stdout); + fputc(b>>8, stdout); + } + + close(fd); + return 0; +} +-------------------- 8< ---------------- 8< ----------------------------- + + +10.0 --- Other Information + +Use the ../../Maintainers file, particularly the VIDEO FOR LINUX and PARALLEL +PORT SUPPORT sections + +The video4linux page: + http://roadrunner.swansea.linux.org.uk/v4l.shtml + +The video4linux2 page: + http://millennium.diads.com/bdirks/v4l2.htm + +Some web pages about the quickcams: + http://www.dkfz-heidelberg.de/Macromol/wedemann/mini-HOWTO-cqcam.html + + http://www.crynwr.com/qcpc/ QuickCam Third-Party Drivers + http://www.crynwr.com/qcpc/re.html Some Reverse Engineering + http://cse.unl.edu/~cluening/gqcam/ v4l client + http://phobos.illtel.denver.co.us/pub/qcread/ doesn't use v4l + ftp://ftp.cs.unm.edu/pub/chris/quickcam/ Has lots of drivers + http://www.cs.duke.edu/~reynolds/quickcam/ Has lots of information + + diff --git a/Documentation/video4linux/README.cpia b/Documentation/video4linux/README.cpia new file mode 100644 index 000000000000..c95e7bbc0fdf --- /dev/null +++ b/Documentation/video4linux/README.cpia @@ -0,0 +1,191 @@ +This is a driver for the CPiA PPC2 driven parallel connected +Camera. For example the Creative WebcamII is CPiA driven. + + ) [1]Peter Pregler, Linz 2000, published under the [2]GNU GPL + +--------------------------------------------------------------------------- + +USAGE: + +General: +======== + +1) Make sure you have created the video devices (/dev/video*): + +- if you have a recent MAKEDEV do a 'cd /dev;./MAKEDEV video' +- otherwise do a: + +cd /dev +mknod video0 c 81 0 +ln -s video0 video + +2) Compile the kernel (see below for the list of options to use), + configure your parport and reboot. + +3) If all worked well you should get messages similar + to the following (your versions may be different) on the console: + +V4L-Driver for Vision CPiA based cameras v0.7.4 +parport0: read2 timeout. +parport0: Multimedia device, VLSI Vision Ltd PPC2 +Parallel port driver for Vision CPiA based camera + CPIA Version: 1.20 (2.0) + CPIA PnP-ID: 0553:0002:0100 + VP-Version: 1.0 0100 + 1 camera(s) found + + +As modules: +=========== + +Make sure you have selected the following kernel options (you can +select all stuff as modules): + +The cpia-stuff is in the section 'Character devices -> Video For Linux'. + +CONFIG_PARPORT=m +CONFIG_PARPORT_PC=m +CONFIG_PARPORT_PC_FIFO=y +CONFIG_PARPORT_1284=y +CONFIG_VIDEO_DEV=m +CONFIG_VIDEO_CPIA=m +CONFIG_VIDEO_CPIA_PP=m + +For autoloading of all those modules you need to tell module-init-tools +some stuff. Add the following line to your module-init-tools config-file +(e.g. /etc/modprobe.conf or wherever your distribution does store that +stuff): + +options parport_pc io=0x378 irq=7 dma=3 +alias char-major-81 cpia_pp + +The first line tells the dma/irq channels to use. Those _must_ match +the settings of your BIOS. Do NOT simply use the values above. See +Documentation/parport.txt for more information about this. The second +line associates the video-device file with the driver. Of cause you +can also load the modules once upon boot (usually done in /etc/modules). + +Linked into the kernel: +======================= + +Make sure you have selected the following kernel options. Note that +you cannot compile the parport-stuff as modules and the cpia-driver +statically (the other way round is okay though). + +The cpia-stuff is in the section 'Character devices -> Video For Linux'. + +CONFIG_PARPORT=y +CONFIG_PARPORT_PC=y +CONFIG_PARPORT_PC_FIFO=y +CONFIG_PARPORT_1284=y +CONFIG_VIDEO_DEV=y +CONFIG_VIDEO_CPIA=y +CONFIG_VIDEO_CPIA_PP=y + +To use DMA/irq you will need to tell the kernel upon boot time the +hardware configuration of the parport. You can give the boot-parameter +at the LILO-prompt or specify it in lilo.conf. I use the following +append-line in lilo.conf: + + append="parport=0x378,7,3" + +See Documentation/parport.txt for more information about the +configuration of the parport and the values given above. Do not simply +use the values given above. + +--------------------------------------------------------------------------- +FEATURES: + +- mmap/read v4l-interface (but no overlay) +- image formats: CIF/QCIF, SIF/QSIF, various others used by isabel; + note: all sizes except CIF/QCIF are implemented by clipping, i.e. + pixels are not uploaded from the camera +- palettes: VIDEO_PALETTE_GRAY, VIDEO_PALETTE_RGB565, VIDEO_PALETTE_RGB555, + VIDEO_PALETTE_RGB24, VIDEO_PALETTE_RGB32, VIDEO_PALETTE_YUYV, + VIDEO_PALETTE_UYVY, VIDEO_PALETTE_YUV422 +- state information (color balance, exposure, ...) is preserved between + device opens +- complete control over camera via proc-interface (_all_ camera settings are + supported), there is also a python-gtk application available for this [3] +- works under SMP (but the driver is completely serialized and synchronous) + so you get no benefit from SMP, but at least it does not crash your box +- might work for non-Intel architecture, let us know about this + +--------------------------------------------------------------------------- +TESTED APPLICATIONS: + +- a simple test application based on Xt is available at [3] +- another test-application based on gqcam-0.4 (uses GTK) +- gqcam-0.6 should work +- xawtv-3.x (also the webcam software) +- xawtv-2.46 +- w3cam (cgi-interface and vidcat, e.g. you may try out 'vidcat |xv + -maxpect -root -quit +noresetroot -rmode 5 -') +- vic, the MBONE video conferencing tool (version 2.8ucl4-1) +- isabel 3R4beta (barely working, but AFAICT all the problems are on + their side) +- camserv-0.40 + +See [3] for pointers to v4l-applications. + +--------------------------------------------------------------------------- +KNOWN PROBLEMS: + +- some applications do not handle the image format correctly, you will + see strange horizontal stripes instead of a nice picture -> make sure + your application does use a supported image size or queries the driver + for the actually used size (reason behind this: the camera cannot + provide any image format, so if size NxM is requested the driver will + use a format to the closest fitting N1xM1, the application should now + query for this granted size, most applications do not). +- all the todo ;) +- if there is not enough light and the picture is too dark try to + adjust the SetSensorFPS setting, automatic frame rate adjustment + has its price +- do not try out isabel 3R4beta (built 135), you will be disappointed + +--------------------------------------------------------------------------- +TODO: + +- multiple camera support (struct camera or something) - This should work, + but hasn't been tested yet. +- architecture independence? +- SMP-safe asynchronous mmap interface +- nibble mode for old parport interfaces +- streaming capture, this should give a performance gain + +--------------------------------------------------------------------------- +IMPLEMENTATION NOTES: + +The camera can act in two modes, streaming or grabbing. Right now a +polling grab-scheme is used. Maybe interrupt driven streaming will be +used for a asynchronous mmap interface in the next major release of the +driver. This might give a better frame rate. + +--------------------------------------------------------------------------- +THANKS (in no particular order): + +- Scott J. Bertin <sbertin@mindspring.com> for cleanups, the proc-filesystem + and much more +- Henry Bruce <whb@vvl.co.uk> for providing developers information about + the CPiA chip, I wish all companies would treat Linux as seriously +- Karoly Erdei <Karoly.Erdei@risc.uni-linz.ac.at> and RISC-Linz for being + my boss ;) resp. my employer and for providing me the hardware and + allow me to devote some working time to this project +- Manuel J. Petit de Gabriel <mpetit@dit.upm.es> for providing help + with Isabel (http://isabel.dit.upm.es/) +- Bas Huisman <bhuism@cs.utwente.nl> for writing the initial parport code +- Jarl Totland <Jarl.Totland@bdc.no> for setting up the mailing list + and maintaining the web-server[3] +- Chris Whiteford <Chris@informinteractive.com> for fixes related to the + 1.02 firmware +- special kudos to all the tester whose machines crashed and/or + will crash. :) + +--------------------------------------------------------------------------- +REFERENCES + + 1. http://www.risc.uni-linz.ac.at/people/ppregler + mailto:Peter_Pregler@email.com + 2. see the file COPYING in the top directory of the kernel tree + 3. http://webcam.sourceforge.net/ diff --git a/Documentation/video4linux/README.cx88 b/Documentation/video4linux/README.cx88 new file mode 100644 index 000000000000..897ab834839a --- /dev/null +++ b/Documentation/video4linux/README.cx88 @@ -0,0 +1,69 @@ + +cx8800 release notes +==================== + +This is a v4l2 device driver for the cx2388x chip. + + +current status +============== + +video + - Basically works. + - Some minor image quality glitches. + - For now only capture, overlay support isn't completed yet. + +audio + - The chip specs for the on-chip TV sound decoder are next + to useless :-/ + - Neverless the builtin TV sound decoder starts working now, + at least for PAL-BG. Other TV norms need other code ... + FOR ANY REPORTS ON THIS PLEASE MENTION THE TV NORM YOU ARE + USING. + - Most tuner chips do provide mono sound, which may or may not + be useable depending on the board design. With the Hauppauge + cards it works, so there is mono sound available as fallback. + - audio data dma (i.e. recording without loopback cable to the + sound card) should be possible, but there is no code yet ... + +vbi + - some code present. Doesn't crash any more, but also doesn't + work yet ... + + +how to add support for new cards +================================ + +The driver needs some config info for the TV cards. This stuff is in +cx88-cards.c. If the driver doesn't work well you likely need a new +entry for your card in that file. Check the kernel log (using dmesg) +to see whenever the driver knows your card or not. There is a line +like this one: + + cx8800[0]: subsystem: 0070:3400, board: Hauppauge WinTV \ + 34xxx models [card=1,autodetected] + +If your card is listed as "board: UNKNOWN/GENERIC" it is unknown to +the driver. What to do then? + + (1) Try upgrading to the latest snapshot, maybe it has been added + meanwhile. + (2) You can try to create a new entry yourself, have a look at + cx88-cards.c. If that worked, mail me your changes as unified + diff ("diff -u"). + (3) Or you can mail me the config information. I need at least the + following informations to add the card: + + * the PCI Subsystem ID ("0070:3400" from the line above, + "lspci -v" output is fine too). + * the tuner type used by the card. You can try to find one by + trial-and-error using the tuner=<n> insmod option. If you + know which one the card has you can also have a look at the + list in CARDLIST.tuner + +Have fun, + + Gerd + +-- +Gerd Knorr <kraxel@bytesex.org> [SuSE Labs] diff --git a/Documentation/video4linux/README.ir b/Documentation/video4linux/README.ir new file mode 100644 index 000000000000..0da47a847056 --- /dev/null +++ b/Documentation/video4linux/README.ir @@ -0,0 +1,72 @@ + +infrared remote control support in video4linux drivers +====================================================== + + +basics +------ + +Current versions use the linux input layer to support infrared +remote controls. I suggest to download my input layer tools +from http://bytesex.org/snapshot/input-<date>.tar.gz + +Modules you have to load: + + saa7134 statically built in, i.e. just the driver :) + bttv ir-kbd-gpio or ir-kbd-i2c depending on your + card. + +ir-kbd-gpio and ir-kbd-i2c don't support all cards lirc supports +(yet), mainly for the reason that the code of lirc_i2c and lirc_gpio +was very confusing and I decided to basically start over from scratch. +Feel free to contact me in case of trouble. Note that the ir-kbd-* +modules work on 2.6.x kernels only through ... + + +how it works +------------ + +The modules register the remote as keyboard within the linux input +layer, i.e. you'll see the keys of the remote as normal key strokes +(if CONFIG_INPUT_KEYBOARD is enabled). + +Using the event devices (CONFIG_INPUT_EVDEV) it is possible for +applications to access the remote via /dev/input/event<n> devices. +You might have to create the special files using "/sbin/MAKEDEV +input". The input layer tools mentioned above use the event device. + +The input layer tools are nice for trouble shooting, i.e. to check +whenever the input device is really present, which of the devices it +is, check whenever pressing keys on the remote actually generates +events and the like. You can also use the kbd utility to change the +keymaps (2.6.x kernels only through). + + +using with lircd +================ + +The cvs version of the lircd daemon supports reading events from the +linux input layer (via event device). The input layer tools tarball +comes with a lircd config file. + + +using without lircd +=================== + +XFree86 likely can be configured to recognise the remote keys. Once I +simply tried to configure one of the multimedia keyboards as input +device, which had the effect that XFree86 recognised some of the keys +of my remote control and passed volume up/down key presses as +XF86AudioRaiseVolume and XF86AudioLowerVolume key events to the X11 +clients. + +It likely is possible to make that fly with a nice xkb config file, +I know next to nothing about that through. + + +Have fun, + + Gerd + +-- +Gerd Knorr <kraxel@bytesex.org> diff --git a/Documentation/video4linux/README.saa7134 b/Documentation/video4linux/README.saa7134 new file mode 100644 index 000000000000..1a446c65365e --- /dev/null +++ b/Documentation/video4linux/README.saa7134 @@ -0,0 +1,73 @@ + + +What is it? +=========== + +This is a v4l2/oss device driver for saa7130/33/34/35 based capture / TV +boards. See http://www.semiconductors.philips.com/pip/saa7134hl for a +description. + + +Status +====== + +Almost everything is working. video, sound, tuner, radio, mpeg ts, ... + +As with bttv, card-specific tweaks are needed. Check CARDLIST for a +list of known TV cards and saa7134-cards.c for the drivers card +configuration info. + + +Build +===== + +Pick up videodev + v4l2 patches from http://bytesex.org/patches/. +Configure, build, install + boot the new kernel. You'll need at least +these config options: + + CONFIG_I2C=m + CONFIG_VIDEO_DEV=m + +Type "make" to build the driver now. "make install" installs the +driver. "modprobe saa7134" should load it. Depending on the card you +might have to pass card=<nr> as insmod option, check CARDLIST for +valid choices. + + +Changes / Fixes +=============== + +Please mail me unified diffs ("diff -u") with your changes, and don't +forget to tell me what it changes / which problem it fixes / whatever +it is good for ... + + +Known Problems +============== + +* The tuner for the flyvideos isn't detected automatically and the + default might not work for you depending on which version you have. + There is a tuner= insmod option to override the driver's default. + +Card Variations: +================ + +Cards can use either of these two crystals (xtal): + - 32.11 MHz -> .audio_clock=0x187de7 + - 24.576MHz -> .audio_clock=0x200000 +(xtal * .audio_clock = 51539600) + + +Credits +======= + +andrew.stevens@philips.com + werner.leeb@philips.com for providing +saa7134 hardware specs and sample board. + + +Have fun, + + Gerd + +-- +Gerd Knorr <kraxel@bytesex.org> [SuSE Labs] diff --git a/Documentation/video4linux/Zoran b/Documentation/video4linux/Zoran new file mode 100644 index 000000000000..01425c21986b --- /dev/null +++ b/Documentation/video4linux/Zoran @@ -0,0 +1,557 @@ +Frequently Asked Questions: +=========================== +subject: unified zoran driver (zr360x7, zoran, buz, dc10(+), dc30(+), lml33) +website: http://mjpeg.sourceforge.net/driver-zoran/ + +1. What cards are supported +1.1 What the TV decoder can do an what not +1.2 What the TV encoder can do an what not +2. How do I get this damn thing to work +3. What mainboard should I use (or why doesn't my card work) +4. Programming interface +5. Applications +6. Concerning buffer sizes, quality, output size etc. +7. It hangs/crashes/fails/whatevers! Help! +8. Maintainers/Contacting +9. License + +=========================== + +1. What cards are supported + +Iomega Buz, Linux Media Labs LML33/LML33R10, Pinnacle/Miro +DC10/DC10+/DC30/DC30+ and related boards (available under various names). + +Iomega Buz: +* Zoran zr36067 PCI controller +* Zoran zr36060 MJPEG codec +* Philips saa7111 TV decoder +* Philips saa7185 TV encoder +Drivers to use: videodev, i2c-core, i2c-algo-bit, + videocodec, saa7111, saa7185, zr36060, zr36067 +Inputs/outputs: Composite and S-video +Norms: PAL, SECAM (720x576 @ 25 fps), NTSC (720x480 @ 29.97 fps) +Card number: 7 + +Linux Media Labs LML33: +* Zoran zr36067 PCI controller +* Zoran zr36060 MJPEG codec +* Brooktree bt819 TV decoder +* Brooktree bt856 TV encoder +Drivers to use: videodev, i2c-core, i2c-algo-bit, + videocodec, bt819, bt856, zr36060, zr36067 +Inputs/outputs: Composite and S-video +Norms: PAL (720x576 @ 25 fps), NTSC (720x480 @ 29.97 fps) +Card number: 5 + +Linux Media Labs LML33R10: +* Zoran zr36067 PCI controller +* Zoran zr36060 MJPEG codec +* Philips saa7114 TV decoder +* Analog Devices adv7170 TV encoder +Drivers to use: videodev, i2c-core, i2c-algo-bit, + videocodec, saa7114, adv7170, zr36060, zr36067 +Inputs/outputs: Composite and S-video +Norms: PAL (720x576 @ 25 fps), NTSC (720x480 @ 29.97 fps) +Card number: 6 + +Pinnacle/Miro DC10(new): +* Zoran zr36057 PCI controller +* Zoran zr36060 MJPEG codec +* Philips saa7110a TV decoder +* Analog Devices adv7176 TV encoder +Drivers to use: videodev, i2c-core, i2c-algo-bit, + videocodec, saa7110, adv7175, zr36060, zr36067 +Inputs/outputs: Composite, S-video and Internal +Norms: PAL, SECAM (768x576 @ 25 fps), NTSC (640x480 @ 29.97 fps) +Card number: 1 + +Pinnacle/Miro DC10+: +* Zoran zr36067 PCI controller +* Zoran zr36060 MJPEG codec +* Philips saa7110a TV decoder +* Analog Devices adv7176 TV encoder +Drivers to use: videodev, i2c-core, i2c-algo-bit, + videocodec, sa7110, adv7175, zr36060, zr36067 +Inputs/outputs: Composite, S-video and Internal +Norms: PAL, SECAM (768x576 @ 25 fps), NTSC (640x480 @ 29.97 fps) +Card number: 2 + +Pinnacle/Miro DC10(old): * +* Zoran zr36057 PCI controller +* Zoran zr36050 MJPEG codec +* Zoran zr36016 Video Front End or Fuji md0211 Video Front End (clone?) +* Micronas vpx3220a TV decoder +* mse3000 TV encoder or Analog Devices adv7176 TV encoder * +Drivers to use: videodev, i2c-core, i2c-algo-bit, + videocodec, vpx3220, mse3000/adv7175, zr36050, zr36016, zr36067 +Inputs/outputs: Composite, S-video and Internal +Norms: PAL, SECAM (768x576 @ 25 fps), NTSC (640x480 @ 29.97 fps) +Card number: 0 + +Pinnacle/Miro DC30: * +* Zoran zr36057 PCI controller +* Zoran zr36050 MJPEG codec +* Zoran zr36016 Video Front End +* Micronas vpx3225d/vpx3220a/vpx3216b TV decoder +* Analog Devices adv7176 TV encoder +Drivers to use: videodev, i2c-core, i2c-algo-bit, + videocodec, vpx3220/vpx3224, adv7175, zr36050, zr36016, zr36067 +Inputs/outputs: Composite, S-video and Internal +Norms: PAL, SECAM (768x576 @ 25 fps), NTSC (640x480 @ 29.97 fps) +Card number: 3 + +Pinnacle/Miro DC30+: * +* Zoran zr36067 PCI controller +* Zoran zr36050 MJPEG codec +* Zoran zr36016 Video Front End +* Micronas vpx3225d/vpx3220a/vpx3216b TV decoder +* Analog Devices adv7176 TV encoder +Drivers to use: videodev, i2c-core, i2c-algo-bit, + videocodec, vpx3220/vpx3224, adv7175, zr36050, zr36015, zr36067 +Inputs/outputs: Composite, S-video and Internal +Norms: PAL, SECAM (768x576 @ 25 fps), NTSC (640x480 @ 29.97 fps) +Card number: 4 + +Note: No module for the mse3000 is available yet +Note: No module for the vpx3224 is available yet +Note: use encoder=X or decoder=X for non-default i2c chips (see i2c-id.h) + +=========================== + +1.1 What the TV decoder can do an what not + +The best know TV standards are NTSC/PAL/SECAM. but for decoding a frame that +information is not enough. There are several formats of the TV standards. +And not every TV decoder is able to handle every format. Also the every +combination is supported by the driver. There are currently 11 different +tv broadcast formats all aver the world. + +The CCIR defines parameters needed for broadcasting the signal. +The CCIR has defined different standards: A,B,D,E,F,G,D,H,I,K,K1,L,M,N,... +The CCIR says not much about about the colorsystem used !!! +And talking about a colorsystem says not to much about how it is broadcast. + +The CCIR standards A,E,F are not used any more. + +When you speak about NTSC, you usually mean the standard: CCIR - M using +the NTSC colorsystem which is used in the USA, Japan, Mexico, Canada +and a few others. + +When you talk about PAL, you usually mean: CCIR - B/G using the PAL +colorsystem which is used in many Countries. + +When you talk about SECAM, you mean: CCIR - L using the SECAM Colorsystem +which is used in France, and a few others. + +There the other version of SECAM, CCIR - D/K is used in Bulgaria, China, +Slovakai, Hungary, Korea (Rep.), Poland, Rumania and a others. + +The CCIR - H uses the PAL colorsystem (sometimes SECAM) and is used in +Egypt, Libya, Sri Lanka, Syrain Arab. Rep. + +The CCIR - I uses the PAL colorsystem, and is used in Great Britain, Hong Kong, +Ireland, Nigeria, South Africa. + +The CCIR - N uses the PAL colorsystem and PAL frame size but the NTSC framerate, +and is used in Argentinia, Uruguay, an a few others + +We do not talk about how the audio is broadcast ! + +A rather good sites about the TV standards are: +http://www.sony.jp/ServiceArea/Voltage_map/ +http://info.electronicwerkstatt.de/bereiche/fernsehtechnik/frequenzen_und_normen/Fernsehnormen/ +and http://www.cabl.com/restaurant/channel.html + +Other weird things around: NTSC 4.43 is a modificated NTSC, which is mainly +used in PAL VCR's that are able to play back NTSC. PAL 60 seems to be the same +as NTSC 4.43 . The Datasheets also talk about NTSC 44, It seems as if it would +be the same as NTSC 4.43. +NTSC Combs seems to be a decoder mode where the decoder uses a comb filter +to split coma and luma instead of a Delay line. + +But I did not defiantly find out what NTSC Comb is. + +Philips saa7111 TV decoder +was introduced in 1997, is used in the BUZ and +can handle: PAL B/G/H/I, PAL N, PAL M, NTSC M, NTSC N, NTSC 4.43 and SECAM + +Philips saa7110a TV decoder +was introduced in 1995, is used in the Pinnacle/Miro DC10(new), DC10+ and +can handle: PAL B/G, NTSC M and SECAM + +Philips saa7114 TV decoder +was introduced in 2000, is used in the LML33R10 and +can handle: PAL B/G/D/H/I/N, PAL N, PAL M, NTSC M, NTSC 4.43 and SECAM + +Brooktree bt819 TV decoder +was introduced in 1996, and is used in the LML33 and +can handle: PAL B/D/G/H/I, NTSC M + +Micronas vpx3220a TV decoder +was introduced in 1996, is used in the DC30 and DC30+ and +can handle: PAL B/G/H/I, PAL N, PAL M, NTSC M, NTSC 44, PAL 60, SECAM,NTSC Comb + +=========================== + +1.2 What the TV encoder can do an what not + +The TV encoder are doing the "same" as the decoder, but in the oder direction. +You feed them digital data and the generate a Composite or SVHS signal. +For information about the colorsystems and TV norm take a look in the +TV decoder section. + +Philips saa7185 TV Encoder +was introduced in 1996, is used in the BUZ +can generate: PAL B/G, NTSC M + +Brooktree bt856 TV Encoder +was introduced in 1994, is used in the LML33 +can generate: PAL B/D/G/H/I/N, PAL M, NTSC M, PAL-N (Argentina) + +Analog Devices adv7170 TV Encoder +was introduced in 2000, is used in the LML300R10 +can generate: PAL B/D/G/H/I/N, PAL M, NTSC M, PAL 60 + +Analog Devices adv7175 TV Encoder +was introduced in 1996, is used in the DC10, DC10+, DC10 old, DC30, DC30+ +can generate: PAL B/D/G/H/I/N, PAL M, NTSC M + +ITT mse3000 TV encoder +was introduced in 1991, is used in the DC10 old +can generate: PAL , NTSC , SECAM + +The adv717x, should be able to produce PAL N. But you find nothing PAL N +specific in the the registers. Seem that you have to reuse a other standard +to generate PAL N, maybe it would work if you use the PAL M settings. + +========================== + +2. How do I get this damn thing to work + +Load zr36067.o. If it can't autodetect your card, use the card=X insmod +option with X being the card number as given in the previous section. +To have more than one card, use card=X1[,X2[,X3,[X4[..]]]] + +To automate this, add the following to your /etc/modprobe.conf: + +options zr36067 card=X1[,X2[,X3[,X4[..]]]] +alias char-major-81-0 zr36067 + +One thing to keep in mind is that this doesn't load zr36067.o itself yet. It +just automates loading. If you start using xawtv, the device won't load on +some systems, since you're trying to load modules as a user, which is not +allowed ("permission denied"). A quick workaround is to add 'Load "v4l"' to +XF86Config-4 when you use X by default, or to run 'v4l-conf -c <device>' in +one of your startup scripts (normally rc.local) if you don't use X. Both +make sure that the modules are loaded on startup, under the root account. + +=========================== + +3. What mainboard should I use (or why doesn't my card work) + +<insert lousy disclaimer here>. In short: good=SiS/Intel, bad=VIA. + +Experience tells us that people with a Buz, on average, have more problems +than users with a DC10+/LML33. Also, it tells us that people owning a VIA- +based mainboard (ktXXX, MVP3) have more problems than users with a mainboard +based on a different chipset. Here's some notes from Andrew Stevens: +-- +Here's my experience of using LML33 and Buz on various motherboards: + +VIA MVP3 + Forget it. Pointless. Doesn't work. +Intel 430FX (Pentium 200) + LML33 perfect, Buz tolerable (3 or 4 frames dropped per movie) +Intel 440BX (early stepping) + LML33 tolerable. Buz starting to get annoying (6-10 frames/hour) +Intel 440BX (late stepping) + Buz tolerable, LML3 almost perfect (occasional single frame drops) +SiS735 + LML33 perfect, Buz tolerable. +VIA KT133(*) + LML33 starting to get annoying, Buz poor enough that I have up. + +Both 440BX boards were dual CPU versions. +-- +Bernhard Praschinger later added: +-- +AMD 751 + Buz perfect-tolerable +AMD 760 + Buz perfect-tolerable +-- +In general, people on the user mailinglist won't give you much of a chance +if you have a VIA-based motherboard. They may be cheap, but sometimes, you'd +rather want to spend some more money on better boards. In general, VIA +mainboard's IDE/PCI performance will also suck badly compared to others. +You'll noticed the DC10+/DC30+ aren't mentioned anywhere in the overview. +Basically, you can assume that if the Buz works, the LML33 will work too. If +the LML33 works, the DC10+/DC30+ will work too. They're most tolerant to +different mainboard chipsets from all of the supported cards. + +If you experience timeouts during capture, buy a better mainboard or lower +the quality/buffersize during capture (see 'Concerning buffer sizes, quality, +output size etc.'). If it hangs, there's little we can do as of now. Check +your IRQs and make sure the card has its own interrupts. + +=========================== + +4. Programming interface + +This driver conforms to video4linux and video4linux2, both can be used to +use the driver. Since video4linux didn't provide adequate calls to fully +use the cards' features, we've introduced several programming extensions, +which are currently officially accepted in the 2.4.x branch of the kernel. +These extensions are known as the v4l/mjpeg extensions. See zoran.h for +details (structs/ioctls). + +Information - video4linux: +http://roadrunner.swansea.linux.org.uk/v4lapi.shtml +Documentation/video4linux/API.html +/usr/include/linux/videodev.h + +Information - video4linux/mjpeg extensions: +./zoran.h +(also see below) + +Information - video4linux2: +http://www.thedirks.org/v4l2/ +/usr/include/linux/videodev2.h +http://www.bytesex.org/v4l/ + +More information on the video4linux/mjpeg extensions, by Serguei +Miridonovi and Rainer Johanni: +-- +The ioctls for that interface are as follows: + +BUZIOC_G_PARAMS +BUZIOC_S_PARAMS + +Get and set the parameters of the buz. The user should always do a +BUZIOC_G_PARAMS (with a struct buz_params) to obtain the default +settings, change what he likes and then make a BUZIOC_S_PARAMS call. + +BUZIOC_REQBUFS + +Before being able to capture/playback, the user has to request +the buffers he is wanting to use. Fill the structure +zoran_requestbuffers with the size (recommended: 256*1024) and +the number (recommended 32 up to 256). There are no such restrictions +as for the Video for Linux buffers, you should LEAVE SUFFICIENT +MEMORY for your system however, else strange things will happen .... +On return, the zoran_requestbuffers structure contains number and +size of the actually allocated buffers. +You should use these numbers for doing a mmap of the buffers +into the user space. +The BUZIOC_REQBUFS ioctl also makes it happen, that the next mmap +maps the MJPEG buffer instead of the V4L buffers. + +BUZIOC_QBUF_CAPT +BUZIOC_QBUF_PLAY + +Queue a buffer for capture or playback. The first call also starts +streaming capture. When streaming capture is going on, you may +only queue further buffers or issue syncs until streaming +capture is switched off again with a argument of -1 to +a BUZIOC_QBUF_CAPT/BUZIOC_QBUF_PLAY ioctl. + +BUZIOC_SYNC + +Issue this ioctl when all buffers are queued. This ioctl will +block until the first buffer becomes free for saving its +data to disk (after BUZIOC_QBUF_CAPT) or for reuse (after BUZIOC_QBUF_PLAY). + +BUZIOC_G_STATUS + +Get the status of the input lines (video source connected/norm). + +For programming example, please, look at lavrec.c and lavplay.c code in +lavtools-1.2p2 package (URL: http://www.cicese.mx/~mirsev/DC10plus/) +and the 'examples' directory in the original Buz driver distribution. + +Additional notes for software developers: + + The driver returns maxwidth and maxheight parameters according to + the current TV standard (norm). Therefore, the software which + communicates with the driver and "asks" for these parameters should + first set the correct norm. Well, it seems logically correct: TV + standard is "more constant" for current country than geometry + settings of a variety of TV capture cards which may work in ITU or + square pixel format. Remember that users now can lock the norm to + avoid any ambiguity. +-- +Please note that lavplay/lavrec are also included in the MJPEG-tools +(http://mjpeg.sf.net/). + +=========================== + +5. Applications + +Applications known to work with this driver: + +TV viewing: +* xawtv +* kwintv +* probably any TV application that supports video4linux or video4linux2. + +MJPEG capture/playback: +* mjpegtools/lavtools (or Linux Video Studio) +* gstreamer +* mplayer + +General raw capture: +* xawtv +* gstreamer +* probably any application that supports video4linux or video4linux2 + +Video editing: +* Cinelerra +* MainActor +* mjpegtools (or Linux Video Studio) + +=========================== + +6. Concerning buffer sizes, quality, output size etc. + +The zr36060 can do 1:2 JPEG compression. This is really the theoretical +maximum that the chipset can reach. The driver can, however, limit compression +to a maximum (size) of 1:4. The reason for this is that some cards (e.g. Buz) +can't handle 1:2 compression without stopping capture after only a few minutes. +With 1:4, it'll mostly work. If you have a Buz, use 'low_bitrate=1' to go into +1:4 max. compression mode. + +100% JPEG quality is thus 1:2 compression in practice. So for a full PAL frame +(size 720x576). The JPEG fields are stored in YUY2 format, so the size of the +fields are 720x288x16/2 bits/field (2 fields/frame) = 207360 bytes/field x 2 = +414720 bytes/frame (add some more bytes for headers and DHT (huffman)/DQT +(quantization) tables, and you'll get to something like 512kB per frame for +1:2 compression. For 1:4 compression, you'd have frames of half this size. + +Some additional explanation by Martin Samuelsson, which also explains the +importance of buffer sizes: +-- +> Hmm, I do not think it is really that way. With the current (downloaded +> at 18:00 Monday) driver I get that output sizes for 10 sec: +> -q 50 -b 128 : 24.283.332 Bytes +> -q 50 -b 256 : 48.442.368 +> -q 25 -b 128 : 24.655.992 +> -q 25 -b 256 : 25.859.820 + +I woke up, and can't go to sleep again. I'll kill some time explaining why +this doesn't look strange to me. + +Let's do some math using a width of 704 pixels. I'm not sure whether the Buz +actually use that number or not, but that's not too important right now. + +704x288 pixels, one field, is 202752 pixels. Divided by 64 pixels per block; +3168 blocks per field. Each pixel consist of two bytes; 128 bytes per block; +1024 bits per block. 100% in the new driver mean 1:2 compression; the maximum +output becomes 512 bits per block. Actually 510, but 512 is simpler to use +for calculations. + +Let's say that we specify d1q50. We thus want 256 bits per block; times 3168 +becomes 811008 bits; 101376 bytes per field. We're talking raw bits and bytes +here, so we don't need to do any fancy corrections for bits-per-pixel or such +things. 101376 bytes per field. + +d1 video contains two fields per frame. Those sum up to 202752 bytes per +frame, and one of those frames goes into each buffer. + +But wait a second! -b128 gives 128kB buffers! It's not possible to cram +202752 bytes of JPEG data into 128kB! + +This is what the driver notice and automatically compensate for in your +examples. Let's do some math using this information: + +128kB is 131072 bytes. In this buffer, we want to store two fields, which +leaves 65536 bytes for each field. Using 3168 blocks per field, we get +20.68686868... available bytes per block; 165 bits. We can't allow the +request for 256 bits per block when there's only 165 bits available! The -q50 +option is silently overridden, and the -b128 option takes precedence, leaving +us with the equivalence of -q32. + +This gives us a data rate of 165 bits per block, which, times 3168, sums up +to 65340 bytes per field, out of the allowed 65536. The current driver has +another level of rate limiting; it won't accept -q values that fill more than +6/8 of the specified buffers. (I'm not sure why. "Playing it safe" seem to be +a safe bet. Personally, I think I would have lowered requested-bits-per-block +by one, or something like that.) We can't use 165 bits per block, but have to +lower it again, to 6/8 of the available buffer space: We end up with 124 bits +per block, the equivalence of -q24. With 128kB buffers, you can't use greater +than -q24 at -d1. (And PAL, and 704 pixels width...) + +The third example is limited to -q24 through the same process. The second +example, using very similar calculations, is limited to -q48. The only +example that actually grab at the specified -q value is the last one, which +is clearly visible, looking at the file size. +-- + +Conclusion: the quality of the resulting movie depends on buffer size, quality, +whether or not you use 'low_bitrate=1' as insmod option for the zr36060.c +module to do 1:4 instead of 1:2 compression, etc. + +If you experience timeouts, lowering the quality/buffersize or using +'low_bitrate=1 as insmod option for zr36060.o might actually help, as is +proven by the Buz. + +=========================== + +7. It hangs/crashes/fails/whatevers! Help! + +Make sure that the card has its own interrupts (see /proc/interrupts), check +the output of dmesg at high verbosity (load zr36067.o with debug=2, +load all other modules with debug=1). Check that your mainboard is favorable +(see question 2) and if not, test the card in another computer. Also see the +notes given in question 3 and try lowering quality/buffersize/capturesize +if recording fails after a period of time. + +If all this doesn't help, give a clear description of the problem including +detailed hardware information (memory+brand, mainboard+chipset+brand, which +MJPEG card, processor, other PCI cards that might be of interest), give the +system PnP information (/proc/interrupts, /proc/dma, /proc/devices), and give +the kernel version, driver version, glibc version, gcc version and any other +information that might possibly be of interest. Also provide the dmesg output +at high verbosity. See 'Contacting' on how to contact the developers. + +=========================== + +8. Maintainers/Contacting + +The driver is currently maintained by Laurent Pinchart and Ronald Bultje +(<laurent.pinchart@skynet.be> and <rbultje@ronald.bitfreak.net>). For bug +reports or questions, please contact the mailinglist instead of the developers +individually. For user questions (i.e. bug reports or how-to questions), send +an email to <mjpeg-users@lists.sf.net>, for developers (i.e. if you want to +help programming), send an email to <mjpeg-developer@lists.sf.net>. See +http://www.sf.net/projects/mjpeg/ for subscription information. + +For bug reports, be sure to include all the information as described in +the section 'It hangs/crashes/fails/whatevers! Help!'. Please make sure +you're using the latest version (http://mjpeg.sf.net/driver-zoran/). + +Previous maintainers/developers of this driver include Serguei Miridonov +<mirsev@cicese.mx>, Wolfgang Scherr <scherr@net4you.net>, Dave Perks +<dperks@ibm.net> and Rainer Johanni <Rainer@Johanni.de>. + +=========================== + +9. License + +This driver is distributed under the terms of the General Public License. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + +See http://www.gnu.org/ for more information. diff --git a/Documentation/video4linux/bttv/CONTRIBUTORS b/Documentation/video4linux/bttv/CONTRIBUTORS new file mode 100644 index 000000000000..aef49db8847d --- /dev/null +++ b/Documentation/video4linux/bttv/CONTRIBUTORS @@ -0,0 +1,25 @@ +Contributors to bttv: + +Michael Chu <mmchu@pobox.com> + AverMedia fix and more flexible card recognition + +Alan Cox <alan@redhat.com> + Video4Linux interface and 2.1.x kernel adaptation + +Chris Kleitsch + Hardware I2C + +Gerd Knorr <kraxel@cs.tu-berlin.de> + Radio card (ITT sound processor) + +bigfoot <bigfoot@net-way.net> +Ragnar Hojland Espinosa <ragnar@macula.net> + ConferenceTV card + + ++ many more (please mail me if you are missing in this list and would + like to be mentioned) + + + + diff --git a/Documentation/video4linux/bttv/Cards b/Documentation/video4linux/bttv/Cards new file mode 100644 index 000000000000..7f8c7eb70ab2 --- /dev/null +++ b/Documentation/video4linux/bttv/Cards @@ -0,0 +1,964 @@ + +Gunther Mayer's bttv card gallery (graphical version of this text file :-) +is available at: http://www.bttv-gallery.de/ + + +Supported cards: +Bt848/Bt848a/Bt849/Bt878/Bt879 cards +------------------------------------ + +All cards with Bt848/Bt848a/Bt849/Bt878/Bt879 and normal +Composite/S-VHS inputs are supported. Teletext and Intercast support +(PAL only) for ALL cards via VBI sample decoding in software. + +Some cards with additional multiplexing of inputs or other additional +fancy chips are only partially supported (unless specifications by the +card manufacturer are given). When a card is listed here it isn't +necessarily fully supported. + +All other cards only differ by additional components as tuners, sound +decoders, EEPROMs, teletext decoders ... + + +Unsupported Cards: +------------------ + +Cards with Zoran (ZR) or Philips (SAA) or ISA are not supported by +this driver. + + +MATRIX Vision +------------- + +MV-Delta +- Bt848A +- 4 Composite inputs, 1 S-VHS input (shared with 4th composite) +- EEPROM + +http://www.matrix-vision.de/ + +This card has no tuner but supports all 4 composite (1 shared with an +S-VHS input) of the Bt848A. +Very nice card if you only have satellite TV but several tuners connected +to the card via composite. + +Many thanks to Matrix-Vision for giving us 2 cards for free which made +Bt848a/Bt849 single crytal operation support possible!!! + + + +Miro/Pinnacle PCTV +------------------ + +- Bt848 + some (all??) come with 2 crystals for PAL/SECAM and NTSC +- PAL, SECAM or NTSC TV tuner (Philips or TEMIC) +- MSP34xx sound decoder on add on board + decoder is supported but AFAIK does not yet work + (other sound MUX setting in GPIO port needed??? somebody who fixed this???) +- 1 tuner, 1 composite and 1 S-VHS input +- tuner type is autodetected + +http://www.miro.de/ +http://www.miro.com/ + + +Many thanks for the free card which made first NTSC support possible back +in 1997! + + +Hauppauge Win/TV pci +-------------------- + +There are many different versions of the Hauppauge cards with different +tuners (TV+Radio ...), teletext decoders. +Note that even cards with same model numbers have (depending on the revision) +different chips on it. + +- Bt848 (and others but always in 2 crystal operation???) + newer cards have a Bt878 +- PAL, SECAM, NTSC or tuner with or without Radio support + +e.g.: + PAL: + TDA5737: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners + TSA5522: 1.4 GHz I2C-bus controlled synthesizer, I2C 0xc2-0xc3 + + NTSC: + TDA5731: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners + TSA5518: no datasheet available on Philips site +- Philips SAA5246 or SAA5284 ( or no) Teletext decoder chip + with buffer RAM (e.g. Winbond W24257AS-35: 32Kx8 CMOS static RAM) + SAA5246 (I2C 0x22) is supported +- 256 bytes EEPROM: Microchip 24LC02B or Philips 8582E2Y + with configuration information + I2C address 0xa0 (24LC02B also responds to 0xa2-0xaf) +- 1 tuner, 1 composite and (depending on model) 1 S-VHS input +- 14052B: mux for selection of sound source +- sound decoder: TDA9800, MSP34xx (stereo cards) + + +Askey CPH-Series +---------------- +Developed by TelSignal(?), OEMed by many vendors (Typhoon, Anubis, Dynalink) + + Card series: + CPH01x: BT848 capture only + CPH03x: BT848 + CPH05x: BT878 with FM + CPH06x: BT878 (w/o FM) + CPH07x: BT878 capture only + + TV standards: + CPH0x0: NTSC-M/M + CPH0x1: PAL-B/G + CPH0x2: PAL-I/I + CPH0x3: PAL-D/K + CPH0x4: SECAM-L/L + CPH0x5: SECAM-B/G + CPH0x6: SECAM-D/K + CPH0x7: PAL-N/N + CPH0x8: PAL-B/H + CPH0x9: PAL-M/M + + CPH03x was often sold as "TV capturer". + + Identifying: + 1) 878 cards can be identified by PCI Subsystem-ID: + 144f:3000 = CPH06x + 144F:3002 = CPH05x w/ FM + 144F:3005 = CPH06x_LC (w/o remote control) + 1) The cards have a sticker with "CPH"-model on the back. + 2) These cards have a number printed on the PCB just above the tuner metal box: + "80-CP2000300-x" = CPH03X + "80-CP2000500-x" = CPH05X + "80-CP2000600-x" = CPH06X / CPH06x_LC + + Askey sells these cards as "Magic TView series", Brand "MagicXpress". + Other OEM often call these "Tview", "TView99" or else. + +Lifeview Flyvideo Series: +------------------------- + The naming of these series differs in time and space. + + Identifying: + 1) Some models can be identified by PCI subsystem ID: + 1852:1852 = Flyvideo 98 FM + 1851:1850 = Flyvideo 98 + 1851:1851 = Flyvideo 98 EZ (capture only) + 2) There is a print on the PCB: + LR25 = Flyvideo (Zoran ZR36120, SAA7110A) + LR26 Rev.N = Flyvideo II (Bt848) + Rev.O = Flyvideo II (Bt878) + LR37 Rev.C = Flyvideo EZ (Capture only, ZR36120 + SAA7110) + LR38 Rev.A1= Flyvideo II EZ (Bt848 capture only) + LR50 Rev.Q = Flyvideo 98 (w/eeprom and PCI subsystem ID) + Rev.W = Flyvideo 98 (no eeprom) + LR51 Rev.E = Flyvideo 98 EZ (capture only) + LR90 = Flyvideo 2000 (Bt878) + Flyvideo 2000S (Bt878) w/Stereo TV (Package incl. LR91 daughterboard) + LR91 = Stereo daughter card for LR90 + LR97 = Flyvideo DVBS + LR99 Rev.E = Low profile card for OEM integration (only internal audio!) bt878 + LR136 = Flyvideo 2100/3100 (Low profile, SAA7130/SAA7134) + LR137 = Flyvideo DV2000/DV3000 (SAA7130/SAA7134 + IEEE1394) + LR138 Rev.C= Flyvideo 2000 (SAA7130) + or Flyvideo 3000 (SAA7134) w/Stereo TV + These exist in variations w/FM and w/Remote sometimes denoted + by suffixes "FM" and "R". + 3) You have a laptop (miniPCI card): + Product = FlyTV Platinum Mini + Model/Chip = LR212/saa7135 + + Lifeview.com.tw states (Feb. 2002): + "The FlyVideo2000 and FlyVideo2000s product name have renamed to FlyVideo98." + Their Bt8x8 cards are listed as discontinued. + Flyvideo 2000S was probably sold as Flyvideo 3000 in some contries(Europe?). + The new Flyvideo 2000/3000 are SAA7130/SAA7134 based. + + "Flyvideo II" had been the name for the 848 cards, nowadays (in Germany) + this name is re-used for LR50 Rev.W. + The Lifeview website mentioned Flyvideo III at some time, but such a card + has not yet been seen (perhaps it was the german name for LR90 [stereo]). + These cards are sold by many OEMs too. + + FlyVideo A2 (Elta 8680)= LR90 Rev.F (w/Remote, w/o FM, stereo TV by tda9821) {Germany} + Lifeview 3000 (Elta 8681) as sold by Plus(April 2002), Germany = LR138 w/ saa7134 + + +Typhoon TV card series: +----------------------- + These can be CPH, Flyvideo, Pixelview or KNC1 series. + Typhoon is the brand of Anubis. + Model 50680 got re-used, some model no. had different contents over time. + + Models: + 50680 "TV Tuner PCI Pal BG"(old,red package)=can be CPH03x(bt848) or CPH06x(bt878) + 50680 "TV Tuner Pal BG" (blue package)= Pixelview PV-BT878P+ (Rev 9B) + 50681 "TV Tuner PCI Pal I" (variant of 50680) + 50682 "TView TV/FM Tuner Pal BG" = Flyvideo 98FM (LR50 Rev.Q) + Note: The package has a picture of CPH05x (which would be a real TView) + 50683 "TV Tuner PCI SECAM" (variant of 50680) + 50684 "TV Tuner Pal BG" = Pixelview 878TV(Rev.3D) + 50686 "TV Tuner" = KNC1 TV Station + 50687 "TV Tuner stereo" = KNC1 TV Station pro + 50688 "TV Tuner RDS" (black package) = KNC1 TV Station RDS + 50689 TV SAT DVB-S CARD CI PCI (SAA7146AH, SU1278?) = "KNC1 TV Station DVB-S" + 50692 "TV/FM Tuner" (small PCB) + 50694 TV TUNER CARD RDS (PHILIPS CHIPSET SAA7134HL) + 50696 TV TUNER STEREO (PHILIPS CHIPSET SAA7134HL, MK3ME Tuner) + 50804 PC-SAT TV/Audio Karte = Techni-PC-Sat (ZORAN 36120PQC, Tuner:Alps) + 50866 TVIEW SAT RECEIVER+ADR + 50868 "TV/FM Tuner Pal I" (variant of 50682) + 50999 "TV/FM Tuner Secam" (variant of 50682) + + +Guillemot +--------- + Maxi-TV PCI (ZR36120) + Maxi TV Video 2 = LR50 Rev.Q (FI1216MF, PAL BG+SECAM) + Maxi TV Video 3 = CPH064 (PAL BG + SECAM) + +Mentor +------ + Mentor TV card ("55-878TV-U1") = Pixelview 878TV(Rev.3F) (w/FM w/Remote) + +Prolink +------- + TV cards: + PixelView Play TV pro - (Model: PV-BT878P+ REV 8E) + PixelView Play TV pro - (Model: PV-BT878P+ REV 9D) + PixelView Play TV pro - (Model: PV-BT878P+ REV 4C / 8D / 10A ) + PixelView Play TV - (Model: PV-BT848P+) + 878TV - (Model: PV-BT878TV) + + Multimedia TV packages (card + software pack): + PixelView Play TV Theater - (Model: PV-M4200) = PixelView Play TV pro + Software + PixelView Play TV PAK - (Model: PV-BT878P+ REV 4E) + PixelView Play TV/VCR - (Model: PV-M3200 REV 4C / 8D / 10A ) + PixelView Studio PAK - (Model: M2200 REV 4C / 8D / 10A ) + PixelView PowerStudio PAK - (Model: PV-M3600 REV 4E) + PixelView DigitalVCR PAK - (Model: PV-M2400 REV 4C / 8D / 10A ) + + PixelView PlayTV PAK II (TV/FM card + usb camera) PV-M3800 + PixelView PlayTV XP PV-M4700,PV-M4700(w/FM) + PixelView PlayTV DVR PV-M4600 package contents:PixelView PlayTV pro, windvr & videoMail s/w + + Further Cards: + PV-BT878P+rev.9B (Play TV Pro, opt. w/FM w/NICAM) + PV-BT878P+rev.2F + PV-BT878P Rev.1D (bt878, capture only) + + XCapture PV-CX881P (cx23881) + PlayTV HD PV-CX881PL+, PV-CX881PL+(w/FM) (cx23881) + + DTV3000 PV-DTV3000P+ DVB-S CI = Twinhan VP-1030 + DTV2000 DVB-S = Twinhan VP-1020 + + Video Conferencing: + PixelView Meeting PAK - (Model: PV-BT878P) + PixelView Meeting PAK Lite - (Model: PV-BT878P) + PixelView Meeting PAK plus - (Model: PV-BT878P+rev 4C/8D/10A) + PixelView Capture - (Model: PV-BT848P) + + PixelView PlayTV USB pro + Model No. PV-NT1004+, PV-NT1004+ (w/FM) = NT1004 USB decoder chip + SAA7113 video decoder chip + +Dynalink +-------- + These are CPH series. + +Phoebemicro +----------- + TV Master = CPH030 or CPH060 + TV Master FM = CPH050 + +Genius/Kye +---------- + Video Wonder/Genius Internet Video Kit = LR37 Rev.C + Video Wonder Pro II (848 or 878) = LR26 + +Tekram +------ + VideoCap C205 (Bt848) + VideoCap C210 (zr36120 +Philips) + CaptureTV M200 (ISA) + CaptureTV M205 (Bt848) + +Lucky Star +---------- + Image World Conference TV = LR50 Rev. Q + +Leadtek +------- + WinView 601 (Bt848) + WinView 610 (Zoran) + WinFast2000 + WinFast2000 XP + +KNC One +------- + TV-Station + TV-Station SE (+Software Bundle) + TV-Station pro (+TV stereo) + TV-Station FM (+Radio) + TV-Station RDS (+RDS) + TV Station SAT (analog satellite) + TV-Station DVB-S + + newer Cards have saa7134, but model name stayed the same? + +Provideo +-------- + PV951 or PV-951 (also are sold as: + Boeder TV-FM Video Capture Card + Titanmedia Supervision TV-2400 + Provideo PV951 TF + 3DeMon PV951 + MediaForte TV-Vision PV951 + Yoko PV951 + Vivanco Tuner Card PCI Art.-Nr.: 68404 + ) now named PV-951T + + Surveillance Series + PV-141 + PV-143 + PV-147 + PV-148 (capture only) + PV-150 + PV-151 + + TV-FM Tuner Series + PV-951TDV (tv tuner + 1394) + PV-951T/TF + PV-951PT/TF + PV-956T/TF Low Profile + PV-911 + +Highscreen +---------- + TV Karte = LR50 Rev.S + TV-Boostar = Terratec Terra TV+ Version 1.0 (Bt848, tda9821) "ceb105.pcb" + +Zoltrix +------- + Face to Face Capture (Bt848 capture only) (PCB "VP-2848") + Face To Face TV MAX (Bt848) (PCB "VP-8482 Rev1.3") + Genie TV (Bt878) (PCB "VP-8790 Rev 2.1") + Genie Wonder Pro + +AVerMedia +--------- + AVer FunTV Lite (ISA, AV3001 chipset) "M101.C" + AVerTV + AVerTV Stereo + AVerTV Studio (w/FM) + AVerMedia TV98 with Remote + AVerMedia TV/FM98 Stereo + AVerMedia TVCAM98 + TVCapture (Bt848) + TVPhone (Bt848) + TVCapture98 (="AVerMedia TV98" in USA) (Bt878) + TVPhone98 (Bt878, w/FM) + + PCB PCI-ID Model-Name Eeprom Tuner Sound Country + -------------------------------------------------------------------- + M101.C ISA ! + M108-B Bt848 -- FR1236 US (2),(3) + M1A8-A Bt848 AVer TV-Phone FM1216 -- + M168-T 1461:0003 AVerTV Studio 48:17 FM1216 TDA9840T D (1) w/FM w/Remote + M168-U 1461:0004 TVCapture98 40:11 FI1216 -- D w/Remote + M168II-B 1461:0003 Medion MD9592 48:16 FM1216 TDA9873H D w/FM + + (1) Daughterboard MB68-A with TDA9820T and TDA9840T + (2) Sony NE41S soldered (stereo sound?) + (3) Daughterboard M118-A w/ pic 16c54 and 4 MHz quartz + + US site has different drivers for (as of 09/2002): + EZ Capture/InterCam PCI (BT-848 chip) + EZ Capture/InterCam PCI (BT-878 chip) + TV-Phone (BT-848 chip) + TV98 (BT-848 chip) + TV98 With Remote (BT-848 chip) + TV98 (BT-878 chip) + TV98 With Remote (BT-878) + TV/FM98 (BT-878 chip) + AVerTV + AverTV Stereo + AVerTV Studio + + DE hat diverse Treiber fuer diese Modelle (Stand 09/2002): + TVPhone (848) mit Philips tuner FR12X6 (w/ FM radio) + TVPhone (848) mit Philips tuner FM12X6 (w/ FM radio) + TVCapture (848) w/Philips tuner FI12X6 + TVCapture (848) non-Philips tuner + TVCapture98 (Bt878) + TVPhone98 (Bt878) + AVerTV und TVCapture98 w/VCR (Bt 878) + AVerTVStudio und TVPhone98 w/VCR (Bt878) + AVerTV GO Serie (Kein SVideo Input) + AVerTV98 (BT-878 chip) + AVerTV98 mit Fernbedienung (BT-878 chip) + AVerTV/FM98 (BT-878 chip) + + VDOmate (www.averm.com.cn) = M168U ? + +Aimslab +------- + Video Highway or "Video Highway TR200" (ISA) + Video Highway Xtreme (aka "VHX") (Bt848, FM w/ TEA5757) + +IXMicro (former: IMS=Integrated Micro Solutions) +------- + IXTV BT848 (=TurboTV) + IXTV BT878 + IMS TurboTV (Bt848) + +Lifetec/Medion/Tevion/Aldi +-------------------------- + LT9306/MD9306 = CPH061 + LT9415/MD9415 = LR90 Rev.F or Rev.G + MD9592 = Avermedia TVphone98 (PCI_ID=1461:0003), PCB-Rev=M168II-B (w/TDA9873H) + MD9717 = KNC One (Rev D4, saa7134, FM1216 MK2 tuner) + MD5044 = KNC One (Rev D4, saa7134, FM1216ME MK3 tuner) + +Modular Technologies (www.modulartech.com) UK +--------------------------------------------- + MM100 PCTV (Bt848) + MM201 PCTV (Bt878, Bt832) w/ Quartzsight camera + MM202 PCTV (Bt878, Bt832, tda9874) + MM205 PCTV (Bt878) + MM210 PCTV (Bt878) (Galaxy TV, Galaxymedia ?) + +Terratec +-------- + Terra TV+ Version 1.0 (Bt848), "ceb105.PCB" printed on the PCB, TDA9821 + Terra TV+ Version 1.1 (Bt878), "LR74 Rev.E" printed on the PCB, TDA9821 + Terra TValueRadio, "LR102 Rev.C" printed on the PCB + Terra TV/Radio+ Version 1.0, "80-CP2830100-0" TTTV3 printed on the PCB, + "CPH010-E83" on the back, SAA6588T, TDA9873H + Terra TValue Version BT878, "80-CP2830110-0 TTTV4" printed on the PCB, + "CPH011-D83" on back + Terra TValue Version 1.0 "ceb105.PCB" (really identical to Terra TV+ Version 1.0) + Terra TValue New Revision "LR102 Rec.C" + Terra Active Radio Upgrade (tea5757h, saa6588t) + + LR74 is a newer PCB revision of ceb105 (both incl. connector for Active Radio Upgrade) + + Cinergy 400 (saa7134), "E877 11(S)", "PM820092D" printed on PCB + Cinergy 600 (saa7134) + +Technisat +--------- + Discos ADR PC-Karte ISA (no TV!) + Discos ADR PC-Karte PCI (probably no TV?) + Techni-PC-Sat (Sat. analog) + Rev 1.2 (zr36120, vpx3220, stv0030, saa5246, BSJE3-494A) + Mediafocus I (zr36120/zr36125, drp3510, Sat. analog + ADR Radio) + Mediafocus II (saa7146, Sat. analog) + SatADR Rev 2.1 (saa7146a, saa7113h, stv0056a, msp3400c, drp3510a, BSKE3-307A) + SkyStar 1 DVB (AV7110) = Technotrend Premium + SkyStar 2 DVB (B2C2) (=Sky2PC) + +Siemens +------- + Multimedia eXtension Board (MXB) (SAA7146, SAA7111) + +Stradis +------- + SDM275,SDM250,SDM026,SDM025 (SAA7146, IBMMPEG2): MPEG2 decoder only + +Powercolor +---------- + MTV878 + Package comes with different contents: + a) pcb "MTV878" (CARD=75) + b) Pixelview Rev. 4_ + MTV878R w/Remote Control + MTV878F w/Remote Control w/FM radio + +Pinnacle +-------- + Mirovideo PCTV (Bt848) + Mirovideo PCTV SE (Bt848) + Mirovideo PCTV Pro (Bt848 + Daughterboard for TV Stereo and FM) + Studio PCTV Rave (Bt848 Version = Mirovideo PCTV) + Studio PCTV Rave (Bt878 package w/o infrared) + Studio PCTV (Bt878) + Studio PCTV Pro (Bt878 stereo w/ FM) + Pinnacle PCTV (Bt878, MT2032) + Pinnacle PCTV Pro (Bt878, MT2032) + Pinncale PCTV Sat (bt878a, HM1821/1221) ["Conexant CX24110 with CX24108 tuner, aka HM1221/HM1811"] + Pinnacle PCTV Sat XE + + M(J)PEG capture and playback: + DC1+ (ISA) + DC10 (zr36057, zr36060, saa7110, adv7176) + DC10+ (zr36067, zr36060, saa7110, adv7176) + DC20 (ql16x24b,zr36050, zr36016, saa7110, saa7187 ...) + DC30 (zr36057, zr36050, zr36016, vpx3220, adv7176, ad1843, tea6415, miro FST97A1) + DC30+ (zr36067, zr36050, zr36016, vpx3220, adv7176) + DC50 (zr36067, zr36050, zr36016, saa7112, adv7176 (2 pcs.?), ad1843, miro FST97A1, Lattice ???) + +Lenco +----- + MXR-9565 (=Technisat Mediafocus?) + MXR-9571 (Bt848) (=CPH031?) + MXR-9575 + MXR-9577 (Bt878) (=Prolink 878TV Rev.3x) + MXTV-9578CP (Bt878) (= Prolink PV-BT878P+4E) + +Iomega +------ + Buz (zr36067, zr36060, saa7111, saa7185) + +LML +--- + LML33 (zr36067, zr36060, bt819, bt856) + +Grandtec +-------- + Grand Video Capture (Bt848) + Multi Capture Card (Bt878) + +Koutech +------- + KW-606 (Bt848) + KW-607 (Bt848 capture only) + KW-606RSF + KW-607A (capture only) + KW-608 (Zoran capture only) + +IODATA (jp) +------ + GV-BCTV/PCI + GV-BCTV2/PCI + GV-BCTV3/PCI + GV-BCTV4/PCI + GV-VCP/PCI (capture only) + GV-VCP2/PCI (capture only) + +Canopus (jp) +------- + WinDVR = Kworld "KW-TVL878RF" + +www.sigmacom.co.kr +------------------ + Sigma Cyber TV II + +www.sasem.co.kr +--------------- + Litte OnAir TV + +hama +---- + TV/Radio-Tuner Card, PCI (Model 44677) = CPH051 + +Sigma Designs +------------- + Hollywood plus (em8300, em9010, adv7175), (PCB "M340-10") MPEG DVD decoder + +Formac +------ + iProTV (Card for iMac Mezzanine slot, Bt848+SCSI) + ProTV (Bt848) + ProTV II = ProTV Stereo (Bt878) ["stereo" means FM stereo, tv is still mono] + +ATI +--- + TV-Wonder + TV-Wonder VE + +Diamond Multimedia +------------------ + DTV2000 (Bt848, tda9875) + +Aopen +----- + VA1000 Plus (w/ Stereo) + VA1000 Lite + VA1000 (=LR90) + +Intel +----- + Smart Video Recorder (ISA full-length) + Smart Video Recorder pro (ISA half-length) + Smart Video Recorder III (Bt848) + +STB +--- + STB Gateway 6000704 (bt878) + STB Gateway 6000699 (bt848) + STB Gateway 6000402 (bt848) + STB TV130 PCI + +Videologic +---------- + Captivator Pro/TV (ISA?) + Captivator PCI/VC (Bt848 bundled with camera) (capture only) + +Technotrend +------------ + TT-SAT PCI (PCB "Sat-PCI Rev.:1.3.1"; zr36125, vpx3225d, stc0056a, Tuner:BSKE6-155A + TT-DVB-Sat + revisions 1.1, 1.3, 1.5, 1.6 and 2.1 + This card is sold as OEM from: + Siemens DVB-s Card + Hauppauge WinTV DVB-S + Technisat SkyStar 1 DVB + Galaxis DVB Sat + Now this card is called TT-PCline Premium Family + TT-Budget (saa7146, bsru6-701a) + This card is sold as OEM from: + Hauppauge WinTV Nova + Satelco Standard PCI (DVB-S) + TT-DVB-C PCI + +Teles +----- + DVB-s (Rev. 2.2, BSRV2-301A, data only?) + +Remote Vision +------------- + MX RV605 (Bt848 capture only) + +Boeder +------ + PC ChatCam (Model 68252) (Bt848 capture only) + Tv/Fm Capture Card (Model 68404) = PV951 + +Media-Surfer (esc-kathrein.de) +------------------------------- + Sat-Surfer (ISA) + Sat-Surfer PCI = Techni-PC-Sat + Cable-Surfer 1 + Cable-Surfer 2 + Cable-Surfer PCI (zr36120) + Audio-Surfer (ISA Radio card) + +Jetway (www.jetway.com.tw) +-------------------------- + JW-TV 878M + JW-TV 878 = KWorld KW-TV878RF + +Galaxis +------- + Galaxis DVB Card S CI + Galaxis DVB Card C CI + Galaxis DVB Card S + Galaxis DVB Card C + Galaxis plug.in S [neuer Name: Galaxis DVB Card S CI + +Hauppauge +--------- + many many WinTV models ... + WinTV DVBs = Technotrend Premium 1.3 + WinTV NOVA = Technotrend Budget 1.1 "S-DVB DATA" + WinTV NOVA-CI "SDVBACI" + WinTV Nova USB (=Technotrend USB 1.0) + WinTV-Nexus-s (=Technotrend Premium 2.1 or 2.2) + WinTV PVR + WinTV PVR 250 + WinTV PVR 450 + + US models + 990 WinTV-PVR-350 (249USD) (iTVC15 chipset + radio) + 980 WinTV-PVR-250 (149USD) (iTVC15 chipset) + 880 WinTV-PVR-PCI (199USD) (KFIR chipset + bt878) + 881 WinTV-PVR-USB + 190 WinTV-GO + 191 WinTV-GO-FM + 404 WinTV + 401 WinTV-radio + 495 WinTV-Theater + 602 WinTV-USB + 621 WinTV-USB-FM + 600 USB-Live + 698 WinTV-HD + 697 WinTV-D + 564 WinTV-Nexus-S + + Deutsche Modelle + 603 WinTV GO + 719 WinTV Primio-FM + 718 WinTV PCI-FM + 497 WinTV Theater + 569 WinTV USB + 568 WinTV USB-FM + 882 WinTV PVR + 981 WinTV PVR 250 + 891 WinTV-PVR-USB + 541 WinTV Nova + 488 WinTV Nova-Ci + 564 WinTV-Nexus-s + 727 WinTV-DVB-c + 545 Common Interface + 898 WinTV-Nova-USB + + UK models + 607 WinTV Go + 693,793 WinTV Primio FM + 647,747 WinTV PCI FM + 498 WinTV Theater + 883 WinTV PVR + 893 WinTV PVR USB (Duplicate entry) + 566 WinTV USB (UK) + 573 WinTV USB FM + 429 Impact VCB (bt848) + 600 USB Live (Video-In 1x Comp, 1xSVHS) + 542 WinTV Nova + 717 WinTV DVB-S + 909 Nova-t PCI + 893 Nova-t USB (Duplicate entry) + 802 MyTV + 804 MyView + 809 MyVideo + 872 MyTV2Go FM + + + 546 WinTV Nova-S CI + 543 WinTV Nova + 907 Nova-S USB + 908 Nova-T USB + 717 WinTV Nexus-S + 157 DEC3000-s Standalone + USB + + Spain + 685 WinTV-Go + 690 WinTV-PrimioFM + 416 WinTV-PCI Nicam Estereo + 677 WinTV-PCI-FM + 699 WinTV-Theater + 683 WinTV-USB + 678 WinTV-USB-FM + 983 WinTV-PVR-250 + 883 WinTV-PVR-PCI + 993 WinTV-PVR-350 + 893 WinTV-PVR-USB + 728 WinTV-DVB-C PCI + 832 MyTV2Go + 869 MyTV2Go-FM + 805 MyVideo (USB) + + +Matrix-Vision +------------- + MATRIX-Vision MV-Delta + MATRIX-Vision MV-Delta 2 + MVsigma-SLC (Bt848) + +Conceptronic (.net) +------------ + TVCON FM, TV card w/ FM = CPH05x + TVCON = CPH06x + +BestData +-------- + HCC100 = VCC100rev1 + camera + VCC100 rev1 (bt848) + VCC100 rev2 (bt878) + +Gallant (www.gallantcom.com) www.minton.com.tw +----------------------------------------------- + Intervision IV-510 (capture only bt8x8) + Intervision IV-550 (bt8x8) + Intervision IV-100 (zoran) + Intervision IV-1000 (bt8x8) + +Asonic (www.asonic.com.cn) (website down) +----------------------------------------- + SkyEye tv 878 + +Hoontech +-------- + 878TV/FM + +Teppro (www.itcteppro.com.tw) +----------------------------- + ITC PCITV (Card Ver 1.0) "Teppro TV1/TVFM1 Card" + ITC PCITV (Card Ver 2.0) + ITC PCITV (Card Ver 3.0) = "PV-BT878P+ (REV.9D)" + ITC PCITV (Card Ver 4.0) + TEPPRO IV-550 (For BT848 Main Chip) + ITC DSTTV (bt878, satellite) + ITC VideoMaker (saa7146, StreamMachine sm2110, tvtuner) "PV-SM2210P+ (REV:1C)" + +Kworld (www.kworld.com.tw) +-------------------------- + PC TV Station + KWORLD KW-TV878R TV (no radio) + KWORLD KW-TV878RF TV (w/ radio) + + KWORLD KW-TVL878RF (low profile) + + KWORLD KW-TV713XRF (saa7134) + + + MPEG TV Station (same cards as above plus WinDVR Software MPEG en/decoder) + KWORLD KW-TV878R -Pro TV (no Radio) + KWORLD KW-TV878RF-Pro TV (w/ Radio) + KWORLD KW-TV878R -Ultra TV (no Radio) + KWORLD KW-TV878RF-Ultra TV (w/ Radio) + + + +JTT/ Justy Corp.http://www.justy.co.jp/ (www.jtt.com.jp website down) +--------------------------------------------------------------------- + JTT-02 (JTT TV) "TV watchmate pro" (bt848) + +ADS www.adstech.com +------------------- + Channel Surfer TV ( CHX-950 ) + Channel Surfer TV+FM ( CHX-960FM ) + +AVEC www.prochips.com +--------------------- + AVEC Intercapture (bt848, tea6320) + +NoBrand +------- + TV Excel = Australian Name for "PV-BT878P+ 8E" or "878TV Rev.3_" + +Mach www.machspeed.com +---- + Mach TV 878 + +Eline www.eline-net.com/ +----- + Eline Vision TVMaster / TVMaster FM (ELV-TVM/ ELV-TVM-FM) = LR26 (bt878) + Eline Vision TVMaster-2000 (ELV-TVM-2000, ELV-TVM-2000-FM)= LR138 (saa713x) + +Spirit http://www.spiritmodems.com.au/ +------ + Spirit TV Tuner/Video Capture Card (bt848) + +Boser www.boser.com.tw +----- + HS-878 Mini PCI Capture Add-on Card + HS-879 Mini PCI 3D Audio and Capture Add-on Card (w/ ES1938 Solo-1) + +Satelco www.citycom-gmbh.de, www.satelco.de +------- + TV-FM =KNC1 saa7134 + Standard PCI (DVB-S) = Technotrend Budget + Standard PCI (DVB-S) w/ CI + Satelco Highend PCI (DVB-S) = Technotrend Premium + + +Sensoray www.sensoray.com +-------- + Sensoray 311 (PC/104 bus) + Sensoray 611 (PCI) + +CEI (Chartered Electronics Industries Pte Ltd [CEI] [FCC ID HBY]) +--- + TV Tuner - HBY-33A-RAFFLES Brooktree Bt848KPF + Philips + TV Tuner MG9910 - HBY33A-TVO CEI + Philips SAA7110 + OKI M548262 + ST STV8438CV + Primetime TV (ISA) + acquired by Singapore Technologies + now operating as Chartered Semiconductor Manufacturing + Manufacturer of video cards is listed as: + Cogent Electronics Industries [CEI] + +AITech +------ + Wavewatcher TV (ISA) + AITech WaveWatcher TV-PCI = can be LR26 (Bt848) or LR50 (BT878) + WaveWatcher TVR-202 TV/FM Radio Card (ISA) + +MAXRON +------ + Maxron MaxTV/FM Radio (KW-TV878-FNT) = Kworld or JW-TV878-FBK + +www.ids-imaging.de +------------------ + Falcon Series (capture only) + In USA: http://www.theimagingsource.com/ + DFG/LC1 + +www.sknet-web.co.jp +------------------- + SKnet Monster TV (saa7134) + +A-Max www.amaxhk.com (Colormax, Amax, Napa) +------------------- + APAC Viewcomp 878 + +Cybertainment +------------- + CyberMail AV Video Email Kit w/ PCI Capture Card (capture only) + CyberMail Xtreme + These are Flyvideo + +VCR (http://www.vcrinc.com/) +--- + Video Catcher 16 + +Twinhan +------- + DST Card/DST-IP (bt878, twinhan asic) VP-1020 + Sold as: + KWorld DVBS Satellite TV-Card + Powercolor DSTV Satellite Tuner Card + Prolink Pixelview DTV2000 + Provideo PV-911 Digital Satellite TV Tuner Card With Common Interface ? + DST-CI Card (DVB Satellite) VP-1030 + DCT Card (DVB cable) + +MSI +--- + MSI TV@nywhere Tuner Card (MS-8876) (CX23881/883) Not Bt878 compatible. + MS-8401 DVB-S + +Focus www.focusinfo.com +----- + InVideo PCI (bt878) + +Sdisilk www.sdisilk.com/ +------- + SDI Silk 100 + SDI Silk 200 SDI Input Card + +www.euresys.com + PICOLO series + +PMC/Pace +www.pacecom.co.uk website closed + +Mercury www.kobian.com (UK and FR) + LR50 + LR138RBG-Rx == LR138 + +TEC sound (package and manuals don't have any other manufacturer info) TecSound + Though educated googling found: www.techmakers.com + TV-Mate = Zoltrix VP-8482 + +Lorenzen www.lorenzen.de +-------- + SL DVB-S PCI = Technotrend Budget PCI (su1278 or bsru version) + +Origo (.uk) www.origo2000.com + PC TV Card = LR50 + +I/O Magic www.iomagic.com +--------- + PC PVR - Desktop TV Personal Video Recorder DR-PCTV100 = Pinnacle ROB2D-51009464 4.0 + Cyberlink PowerVCR II + +Arowana +------- + TV-Karte / Poso Power TV (?) = Zoltrix VP-8482 (?) + +iTVC15 boards: +------------- +kuroutoshikou.com ITVC15 +yuan.com MPG160 PCI TV (Internal PCI MPEG2 encoder card plus TV-tuner) + +Asus www.asuscom.com + Asus TV Tuner Card 880 NTSC (low profile, cx23880) + Asus TV (saa7134) + +Hoontech +-------- +http://www.hoontech.com/korean/download/down_driver_list03.html + HART Vision 848 (H-ART Vision 848) + HART Vision 878 (H-Art Vision 878) diff --git a/Documentation/video4linux/bttv/ICs b/Documentation/video4linux/bttv/ICs new file mode 100644 index 000000000000..6b7491336967 --- /dev/null +++ b/Documentation/video4linux/bttv/ICs @@ -0,0 +1,37 @@ +all boards: + +Brooktree Bt848/848A/849/878/879: video capture chip + + + +Miro PCTV: + +Philips or Temic Tuner + + + +Hauppauge Win/TV pci (version 405): + +Microchip 24LC02B or +Philips 8582E2Y: 256 Byte EEPROM with configuration information + I2C 0xa0-0xa1, (24LC02B also responds to 0xa2-0xaf) +Philips SAA5246AGP/E: Videotext decoder chip, I2C 0x22-0x23 +TDA9800: sound decoder +Winbond W24257AS-35: 32Kx8 CMOS static RAM (Videotext buffer mem) +14052B: analog switch for selection of sound source + +PAL: +TDA5737: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners +TSA5522: 1.4 GHz I2C-bus controlled synthesizer, I2C 0xc2-0xc3 + +NTSC: +TDA5731: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners +TSA5518: no datasheet available on Philips site + + + +STB TV pci: + +??? +if you want better support for STB cards send me info! +Look at the board! What chips are on it? diff --git a/Documentation/video4linux/bttv/Insmod-options b/Documentation/video4linux/bttv/Insmod-options new file mode 100644 index 000000000000..7bb5a50b0779 --- /dev/null +++ b/Documentation/video4linux/bttv/Insmod-options @@ -0,0 +1,173 @@ + +Note: "modinfo <module>" prints various informations about a kernel +module, among them a complete and up-to-date list of insmod options. +This list tends to be outdated because it is updated manually ... + +========================================================================== + +bttv.o + the bt848/878 (grabber chip) driver + + insmod args: + card=n card type, see CARDLIST for a list. + tuner=n tuner type, see CARDLIST for a list. + radio=0/1 card supports radio + pll=0/1/2 pll settings + 0: don't use PLL + 1: 28 MHz crystal installed + 2: 35 MHz crystal installed + + triton1=0/1 for Triton1 (+others) compatibility + vsfx=0/1 yet another chipset bug compatibility bit + see README.quirks for details on these two. + + bigendian=n Set the endianness of the gfx framebuffer. + Default is native endian. + fieldnr=0/1 Count fields. Some TV descrambling software + needs this, for others it only generates + 50 useless IRQs/sec. default is 0 (off). + autoload=0/1 autoload helper modules (tuner, audio). + default is 1 (on). + bttv_verbose=0/1/2 verbose level (at insmod time, while + looking at the hardware). default is 1. + bttv_debug=0/1 debug messages (for capture). + default is 0 (off). + irq_debug=0/1 irq handler debug messages. + default is 0 (off). + gbuffers=2-32 number of capture buffers for mmap'ed capture. + default is 4. + gbufsize= size of capture buffers. default and + maximum value is 0x208000 (~2MB) + no_overlay=0 Enable overlay on broken hardware. There + are some chipsets (SIS for example) which + are known to have problems with the PCI DMA + push used by bttv. bttv will disable overlay + by default on this hardware to avoid crashes. + With this insmod option you can override this. + automute=0/1 Automatically mutes the sound if there is + no TV signal, on by default. You might try + to disable this if you have bad input signal + quality which leading to unwanted sound + dropouts. + chroma_agc=0/1 AGC of chroma signal, off by default. + adc_crush=0/1 Luminance ADC crush, on by default. + + bttv_gpio=0/1 + gpiomask= + audioall= + audiomux= + See Sound-FAQ for a detailed description. + + remap, card, radio and pll accept up to four comma-separated arguments + (for multiple boards). + +tuner.o + The tuner driver. You need this unless you want to use only + with a camera or external tuner ... + + insmod args: + debug=1 print some debug info to the syslog + type=n type of the tuner chip. n as follows: + see CARDLIST for a complete list. + pal=[bdgil] select PAL variant (used for some tuners + only, important for the audio carrier). + +tvmixer.o + registers a mixer device for the TV card's volume/bass/treble + controls (requires a i2c audio control chip like the msp3400). + + insmod args: + debug=1 print some debug info to the syslog. + devnr=n allocate device #n (0 == /dev/mixer, + 1 = /dev/mixer1, ...), default is to + use the first free one. + +tvaudio.o + new, experimental module which is supported to provide a single + driver for all simple i2c audio control chips (tda/tea*). + + insmod args: + tda8425 = 1 enable/disable the support for the + tda9840 = 1 various chips. + tda9850 = 1 The tea6300 can't be autodetected and is + tda9855 = 1 therefore off by default, if you have + tda9873 = 1 this one on your card (STB uses these) + tda9874a = 1 you have to enable it explicitly. + tea6300 = 0 The two tda985x chips use the same i2c + tea6420 = 1 address and can't be disturgished from + pic16c54 = 1 each other, you might have to disable + the wrong one. + debug = 1 print debug messages + + insmod args for tda9874a: + tda9874a_SIF=1/2 select sound IF input pin (1 or 2) + (default is pin 1) + tda9874a_AMSEL=0/1 auto-mute select for NICAM (default=0) + Please read note 3 below! + tda9874a_STD=n select TV sound standard (0..8): + 0 - A2, B/G + 1 - A2, M (Korea) + 2 - A2, D/K (1) + 3 - A2, D/K (2) + 4 - A2, D/K (3) + 5 - NICAM, I + 6 - NICAM, B/G + 7 - NICAM, D/K (default) + 8 - NICAM, L + + Note 1: tda9874a supports both tda9874h (old) and tda9874a (new) chips. + Note 2: tda9874h/a and tda9875 (which is supported separately by + tda9875.o) use the same i2c address so both modules should not be + used at the same time. + Note 3: Using tda9874a_AMSEL option depends on your TV card design! + AMSEL=0: auto-mute will switch between NICAM sound + and the sound on 1st carrier (i.e. FM mono or AM). + AMSEL=1: auto-mute will switch between NICAM sound + and the analog mono input (MONOIN pin). + If tda9874a decoder on your card has MONOIN pin not connected, then + use only tda9874_AMSEL=0 or don't specify this option at all. + For example: + card=65 (FlyVideo 2000S) - set AMSEL=1 or AMSEL=0 + card=72 (Prolink PV-BT878P rev.9B) - set AMSEL=0 only + +msp3400.o + The driver for the msp34xx sound processor chips. If you have a + stereo card, you probably want to insmod this one. + + insmod args: + debug=1/2 print some debug info to the syslog, + 2 is more verbose. + simple=1 Use the "short programming" method. Newer + msp34xx versions support this. You need this + for dbx stereo. Default is on if supported by + the chip. + once=1 Don't check the TV-stations Audio mode + every few seconds, but only once after + channel switches. + amsound=1 Audio carrier is AM/NICAM at 6.5 Mhz. This + should improve things for french people, the + carrier autoscan seems to work with FM only... + +tea6300.o - OBSOLETE (use tvaudio instead) + The driver for the tea6300 fader chip. If you have a stereo + card and the msp3400.o doesn't work, you might want to try this + one. This chip is seen on most STB TV/FM cards (usually from + Gateway OEM sold surplus on auction sites). + + insmod args: + debug=1 print some debug info to the syslog. + +tda8425.o - OBSOLETE (use tvaudio instead) + The driver for the tda8425 fader chip. This driver used to be + part of bttv.c, so if your sound used to work but does not + anymore, try loading this module. + + insmod args: + debug=1 print some debug info to the syslog. + +tda985x.o - OBSOLETE (use tvaudio instead) + The driver for the tda9850/55 audio chips. + + insmod args: + debug=1 print some debug info to the syslog. + chip=9850/9855 set the chip type. diff --git a/Documentation/video4linux/bttv/MAKEDEV b/Documentation/video4linux/bttv/MAKEDEV new file mode 100644 index 000000000000..6c29ba43b6c6 --- /dev/null +++ b/Documentation/video4linux/bttv/MAKEDEV @@ -0,0 +1,28 @@ +#!/bin/bash + +function makedev () { + + for dev in 0 1 2 3; do + echo "/dev/$1$dev: char 81 $[ $2 + $dev ]" + rm -f /dev/$1$dev + mknod /dev/$1$dev c 81 $[ $2 + $dev ] + chmod 666 /dev/$1$dev + done + + # symlink for default device + rm -f /dev/$1 + ln -s /dev/${1}0 /dev/$1 +} + +# see http://roadrunner.swansea.uk.linux.org/v4lapi.shtml + +echo "*** new device names ***" +makedev video 0 +makedev radio 64 +makedev vtx 192 +makedev vbi 224 + +#echo "*** old device names (for compatibility only) ***" +#makedev bttv 0 +#makedev bttv-fm 64 +#makedev bttv-vbi 224 diff --git a/Documentation/video4linux/bttv/Modprobe.conf b/Documentation/video4linux/bttv/Modprobe.conf new file mode 100644 index 000000000000..55f14650d8cd --- /dev/null +++ b/Documentation/video4linux/bttv/Modprobe.conf @@ -0,0 +1,11 @@ +# i2c +alias char-major-89 i2c-dev +options i2c-core i2c_debug=1 +options i2c-algo-bit bit_test=1 + +# bttv +alias char-major-81 videodev +alias char-major-81-0 bttv +options bttv card=2 radio=1 +options tuner debug=1 + diff --git a/Documentation/video4linux/bttv/Modules.conf b/Documentation/video4linux/bttv/Modules.conf new file mode 100644 index 000000000000..753f15956eb8 --- /dev/null +++ b/Documentation/video4linux/bttv/Modules.conf @@ -0,0 +1,14 @@ +# For modern kernels (2.6 or above), this belongs in /etc/modprobe.conf +# For for 2.4 kernels or earlier, this belongs in /etc/modules.conf. + +# i2c +alias char-major-89 i2c-dev +options i2c-core i2c_debug=1 +options i2c-algo-bit bit_test=1 + +# bttv +alias char-major-81 videodev +alias char-major-81-0 bttv +options bttv card=2 radio=1 +options tuner debug=1 + diff --git a/Documentation/video4linux/bttv/PROBLEMS b/Documentation/video4linux/bttv/PROBLEMS new file mode 100644 index 000000000000..8e31e9e36bf7 --- /dev/null +++ b/Documentation/video4linux/bttv/PROBLEMS @@ -0,0 +1,62 @@ +- Start capturing by pressing "c" or by selecting it via a menu! + +- Start capturing by pressing "c" or by selecting it via a menu!!! + +- The memory of some S3 cards is not recognized right: + + First of all, if you are not using XFree-3.2 or newer, upgrade AT LEAST to + XFree-3.2A! This solved the problem for most people. + + Start up X11 like this: "XF86_S3 -probeonly" and write down where the + linear frame buffer is. + If it is different to the address found by bttv install bttv like this: + "insmod bttv vidmem=0xfb0" + if the linear frame buffer is at 0xfb000000 (i.e. omit the last 5 zeros!) + + Some S3 cards even take up 64MB of memory but only report 32MB to the BIOS. + If this 64MB area overlaps the IO memory of the Bt848 you also have to + remap this. E.g.: insmod bttv vidmem=0xfb0 remap=0xfa0 + + If the video memory is found at the right place and there are no address + conflicts but still no picture (or the computer even crashes), + try disabling features of your PCI chipset in the BIOS setup. + + Frank Kapahnke <frank@kapahnke.prima.ruhr.de> also reported that problems + with his S3 868 went away when he upgraded to XFree 3.2. + + +- I still only get a black picture with my S3 card! + + Even with XFree-3.2A some people have problems with their S3 cards + (mostly with Trio 64 but also with some others) + Get the free demo version of Accelerated X from www.xinside.com and try + bttv with it. bttv seems to work with most S3 cards with Accelerated X. + + Since I do not know much (better make that almost nothing) about VGA card + programming I do not know the reason for this. + Looks like XFree does something different when setting up the video memory? + Maybe somebody can enlighten me? + Would be nice if somebody could get this to work with XFree since + Accelerated X costs more than some of the grabber cards ... + + Better linear frame buffer support for S3 cards will probably be in + XFree 4.0. + +- Grabbing is not switched off when changing consoles with XFree. + That's because XFree and some AcceleratedX versions do not send unmap + events. + +- Some popup windows (e.g. of the window manager) are not refreshed. + + Disable backing store by starting X with the option "-bs" + +- When using 32 bpp in XFree or 24+8bpp mode in AccelX 3.1 the system + can sometimes lock up if you use more than 1 bt848 card at the same time. + You will always get pixel errors when e.g. using more than 1 card in full + screen mode. Maybe we need something faster than the PCI bus ... + + +- Some S3 cards and the Matrox Mystique will produce pixel errors with + full resolution in 32-bit mode. + +- Some video cards have problems with Accelerated X 4.1 diff --git a/Documentation/video4linux/bttv/README b/Documentation/video4linux/bttv/README new file mode 100644 index 000000000000..a72f4c94fb0b --- /dev/null +++ b/Documentation/video4linux/bttv/README @@ -0,0 +1,90 @@ + +Release notes for bttv +====================== + +You'll need at least these config options for bttv: + CONFIG_I2C=m + CONFIG_I2C_ALGOBIT=m + CONFIG_VIDEO_DEV=m + +The latest bttv version is available from http://bytesex.org/bttv/ + + +Make bttv work with your card +----------------------------- + +Just try "modprobe bttv" and see if that works. + +If it doesn't bttv likely could not autodetect your card and needs some +insmod options. The most important insmod option for bttv is "card=n" +to select the correct card type. If you get video but no sound you've +very likely specified the wrong (or no) card type. A list of supported +cards is in CARDLIST.bttv + +If bttv takes very long to load (happens sometimes with the cheap +cards which have no tuner), try adding this to your modules.conf: + options i2c-algo-bit bit_test=1 + +For the WinTV/PVR you need one firmware file from the driver CD: +hcwamc.rbf. The file is in the pvr45xxx.exe archive (self-extracting +zip file, unzip can unpack it). Put it into the /etc/pvr directory or +use the firm_altera=<path> insmod option to point the driver to the +location of the file. + +If your card isn't listed in CARDLIST.bttv or if you have trouble making +audio work, you should read the Sound-FAQ. + + +Autodetecting cards +------------------- + +bttv uses the PCI Subsystem ID to autodetect the card type. lspci lists +the Subsystem ID in the second line, looks like this: + +00:0a.0 Multimedia video controller: Brooktree Corporation Bt878 (rev 02) + Subsystem: Hauppauge computer works Inc. WinTV/GO + Flags: bus master, medium devsel, latency 32, IRQ 5 + Memory at e2000000 (32-bit, prefetchable) [size=4K] + +only bt878-based cards can have a subsystem ID (which does not mean +that every card really has one). bt848 cards can't have a Subsystem +ID and therefore can't be autodetected. There is a list with the ID's +in bttv-cards.c (in case you are intrested or want to mail patches +with updates). + + +Still doesn't work? +------------------- + +I do NOT have a lab with 30+ different grabber boards and a +PAL/NTSC/SECAM test signal generator at home, so I often can't +reproduce your problems. This makes debugging very difficult for me. +If you have some knowledge and spare time, please try to fix this +yourself (patches very welcome of course...) You know: The linux +slogan is "Do it yourself". + +There is a mailing list: video4linux-list@redhat.com. +https://listman.redhat.com/mailman/listinfo/video4linux-list + +If you have trouble with some specific TV card, try to ask there +instead of mailing me directly. The chance that someone with the +same card listens there is much higher... + +For problems with sound: There are alot of different systems used +for TV sound all over the world. And there are also different chips +which decode the audio signal. Reports about sound problems ("stereo +does'nt work") are pretty useless unless you include some details +about your hardware and the TV sound scheme used in your country (or +at least the country you are living in). + + +Finally: If you mail some patches for bttv around the world (to +linux-kernel/Alan/Linus/...), please Cc: me. + + +Have fun with bttv, + + Gerd + +-- +Gerd Knorr <kraxel@bytesex.org> diff --git a/Documentation/video4linux/bttv/README.WINVIEW b/Documentation/video4linux/bttv/README.WINVIEW new file mode 100644 index 000000000000..c61cf2864287 --- /dev/null +++ b/Documentation/video4linux/bttv/README.WINVIEW @@ -0,0 +1,33 @@ + +Support for the Leadtek WinView 601 TV/FM by Jon Tombs <jon@gte.esi.us.es> + +This card is basically the same as all the rest (Bt484A, Philips tuner), +the main difference is that they have attached a programmable attenuator to 3 +GPIO lines in order to give some volume control. They have also stuck an +infra-red remote control decoded on the board, I will add support for this +when I get time (it simple generates an interrupt for each key press, with +the key code is placed in the GPIO port). + +I don't yet have any application to test the radio support. The tuner +frequency setting should work but it is possible that the audio multiplexer +is wrong. If it doesn't work, send me email. + + +- No Thanks to Leadtek they refused to answer any questions about their +hardware. The driver was written by visual inspection of the card. If you +use this driver, send an email insult to them, and tell them you won't +continue buying their hardware unless they support Linux. + +- Little thanks to Princeton Technology Corp (http://www.princeton.com.tw) +who make the audio attenuator. Their publicly available data-sheet available +on their web site doesn't include the chip programming information! Hidden +on their server are the full data-sheets, but don't ask how I found it. + +To use the driver I use the following options, the tuner and pll settings might +be different in your country + +insmod videodev +insmod i2c scan=1 i2c_debug=0 verbose=0 +insmod tuner type=1 debug=0 +insmod bttv pll=1 radio=1 card=17 + diff --git a/Documentation/video4linux/bttv/README.freeze b/Documentation/video4linux/bttv/README.freeze new file mode 100644 index 000000000000..51f8d4379a94 --- /dev/null +++ b/Documentation/video4linux/bttv/README.freeze @@ -0,0 +1,74 @@ + +If the box freezes hard with bttv ... +===================================== + +It might be a bttv driver bug. It also might be bad hardware. It also +might be something else ... + +Just mailing me "bttv freezes" isn't going to help much. This README +has a few hints how you can help to pin down the problem. + + +bttv bugs +--------- + +If some version works and another doesn't it is likely to be a driver +bug. It is very helpful if you can tell where exactly it broke +(i.e. the last working and the first broken version). + +With a hard freeze you probably doesn't find anything in the logfiles. +The only way to capture any kernel messages is to hook up a serial +console and let some terminal application log the messages. /me uses +screen. See Documentation/serial-console.txt for details on setting +up a serial console. + +Read Documentation/oops-tracing.txt to learn how to get any useful +information out of a register+stack dump printed by the kernel on +protection faults (so-called "kernel oops"). + +If you run into some kind of deadlock, you can try to dump a call trace +for each process using sysrq-t (see Documentation/sysrq.txt). ksymoops +will translate these dumps into kernel symbols too. This way it is +possible to figure where *exactly* some process in "D" state is stuck. + +I've seen reports that bttv 0.7.x crashes whereas 0.8.x works rock solid +for some people. Thus probably a small buglet left somewhere in bttv +0.7.x. I have no idea where exactly, it works stable for me and alot of +other people. But in case you have problems with the 0.7.x versions you +can give 0.8.x a try ... + + +hardware bugs +------------- + +Some hardware can't deal with PCI-PCI transfers (i.e. grabber => vga). +Sometimes problems show up with bttv just because of the high load on +the PCI bus. The bt848/878 chips have a few workarounds for known +incompatibilities, see README.quirks. + +Some folks report that increasing the pci latency helps too, +althrought I'm not sure whenever this really fixes the problems or +only makes it less likely to happen. Both bttv and btaudio have a +insmod option to set the PCI latency of the device. + +Some mainboard have problems to deal correctly with multiple devices +doing DMA at the same time. bttv + ide seems to cause this sometimes, +if this is the case you likely see freezes only with video and hard disk +access at the same time. Updating the IDE driver to get the latest and +greatest workarounds for hardware bugs might fix these problems. + + +other +----- + +If you use some binary-only yunk (like nvidia module) try to reproduce +the problem without. + +IRQ sharing is known to cause problems in some cases. It works just +fine in theory and many configurations. Neverless it might be worth a +try to shuffle around the PCI cards to give bttv another IRQ or make +it share the IRQ with some other piece of hardware. IRQ sharing with +VGA cards seems to cause trouble sometimes. I've also seen funny +effects with bttv sharing the IRQ with the ACPI bridge (and +apci-enabled kernel). + diff --git a/Documentation/video4linux/bttv/README.quirks b/Documentation/video4linux/bttv/README.quirks new file mode 100644 index 000000000000..e8edb87df711 --- /dev/null +++ b/Documentation/video4linux/bttv/README.quirks @@ -0,0 +1,83 @@ + +Below is what the bt878 data book says about the PCI bug compatibility +modes of the bt878 chip. + +The triton1 insmod option sets the EN_TBFX bit in the control register. +The vsfx insmod option does the same for EN_VSFX bit. If you have +stability problems you can try if one of these options makes your box +work solid. + +drivers/pci/quirks.c knows about these issues, this way these bits are +enabled automagically for known-buggy chipsets (look at the kernel +messages, bttv tells you). + +HTH, + + Gerd + +---------------------------- cut here -------------------------- + +Normal PCI Mode +--------------- + +The PCI REQ signal is the logical-or of the incoming function requests. +The inter-nal GNT[0:1] signals are gated asynchronously with GNT and +demultiplexed by the audio request signal. Thus the arbiter defaults to +the video function at power-up and parks there during no requests for +bus access. This is desirable since the video will request the bus more +often. However, the audio will have highest bus access priority. Thus +the audio will have first access to the bus even when issuing a request +after the video request but before the PCI external arbiter has granted +access to the Bt879. Neither function can preempt the other once on the +bus. The duration to empty the entire video PCI FIFO onto the PCI bus is +very short compared to the bus access latency the audio PCI FIFO can +tolerate. + + +430FX Compatibility Mode +------------------------ + +When using the 430FX PCI, the following rules will ensure +compatibility: + + (1) Deassert REQ at the same time as asserting FRAME. + (2) Do not reassert REQ to request another bus transaction until after + finish-ing the previous transaction. + +Since the individual bus masters do not have direct control of REQ, a +simple logical-or of video and audio requests would violate the rules. +Thus, both the arbiter and the initiator contain 430FX compatibility +mode logic. To enable 430FX mode, set the EN_TBFX bit as indicated in +Device Control Register on page 104. + +When EN_TBFX is enabled, the arbiter ensures that the two compatibility +rules are satisfied. Before GNT is asserted by the PCI arbiter, this +internal arbiter may still logical-or the two requests. However, once +the GNT is issued, this arbiter must lock in its decision and now route +only the granted request to the REQ pin. The arbiter decision lock +happens regardless of the state of FRAME because it does not know when +FRAME will be asserted (typically - each initiator will assert FRAME on +the cycle following GNT). When FRAME is asserted, it is the initiator s +responsibility to remove its request at the same time. It is the +arbiters responsibility to allow this request to flow through to REQ and +not allow the other request to hold REQ asserted. The decision lock may +be removed at the end of the transaction: for example, when the bus is +idle (FRAME and IRDY). The arbiter decision may then continue +asynchronously until GNT is again asserted. + + +Interfacing with Non-PCI 2.1 Compliant Core Logic +------------------------------------------------- + +A small percentage of core logic devices may start a bus transaction +during the same cycle that GNT is de-asserted. This is non PCI 2.1 +compliant. To ensure compatibility when using PCs with these PCI +controllers, the EN_VSFX bit must be enabled (refer to Device Control +Register on page 104). When in this mode, the arbiter does not pass GNT +to the internal functions unless REQ is asserted. This prevents a bus +transaction from starting the same cycle as GNT is de-asserted. This +also has the side effect of not being able to take advantage of bus +parking, thus lowering arbitration performance. The Bt879 drivers must +query for these non-compliant devices, and set the EN_VSFX bit only if +required. + diff --git a/Documentation/video4linux/bttv/Sound-FAQ b/Documentation/video4linux/bttv/Sound-FAQ new file mode 100644 index 000000000000..b8c9c2605ce2 --- /dev/null +++ b/Documentation/video4linux/bttv/Sound-FAQ @@ -0,0 +1,148 @@ + +bttv and sound mini howto +========================= + +There are alot of different bt848/849/878/879 based boards available. +Making video work often is not a big deal, because this is handled +completely by the bt8xx chip, which is common on all boards. But +sound is handled in slightly different ways on each board. + +To handle the grabber boards correctly, there is a array tvcards[] in +bttv-cards.c, which holds the informations required for each board. +Sound will work only, if the correct entry is used (for video it often +makes no difference). The bttv driver prints a line to the kernel +log, telling which card type is used. Like this one: + + bttv0: model: BT848(Hauppauge old) [autodetected] + +You should verify this is correct. If it isn't, you have to pass the +correct board type as insmod argument, "insmod bttv card=2" for +example. The file CARDLIST has a list of valid arguments for card. +If your card isn't listed there, you might check the source code for +new entries which are not listed yet. If there isn't one for your +card, you can check if one of the existing entries does work for you +(just trial and error...). + +Some boards have an extra processor for sound to do stereo decoding +and other nice features. The msp34xx chips are used by Hauppauge for +example. If your board has one, you might have to load a helper +module like msp3400.o to make sound work. If there isn't one for the +chip used on your board: Bad luck. Start writing a new one. Well, +you might want to check the video4linux mailing list archive first... + +Of course you need a correctly installed soundcard unless you have the +speakers connected directly to the grabber board. Hint: check the +mixer settings too. ALSA for example has everything muted by default. + + +How sound works in detail +========================= + +Still doesn't work? Looks like some driver hacking is required. +Below is a do-it-yourself description for you. + +The bt8xx chips have 32 general purpose pins, and registers to control +these pins. One register is the output enable register +(BT848_GPIO_OUT_EN), it says which pins are actively driven by the +bt848 chip. Another one is the data register (BT848_GPIO_DATA), where +you can get/set the status if these pins. They can be used for input +and output. + +Most grabber board vendors use these pins to control an external chip +which does the sound routing. But every board is a little different. +These pins are also used by some companies to drive remote control +receiver chips. Some boards use the i2c bus instead of the gpio pins +to connect the mux chip. + +As mentioned above, there is a array which holds the required +informations for each known board. You basically have to create a new +line for your board. The important fields are these two: + +struct tvcard +{ + [ ... ] + u32 gpiomask; + u32 audiomux[6]; /* Tuner, Radio, external, internal, mute, stereo */ +}; + +gpiomask specifies which pins are used to control the audio mux chip. +The corresponding bits in the output enable register +(BT848_GPIO_OUT_EN) will be set as these pins must be driven by the +bt848 chip. + +The audiomux[] array holds the data values for the different inputs +(i.e. which pins must be high/low for tuner/mute/...). This will be +written to the data register (BT848_GPIO_DATA) to switch the audio +mux. + + +What you have to do is figure out the correct values for gpiomask and +the audiomux array. If you have Windows and the drivers four your +card installed, you might to check out if you can read these registers +values used by the windows driver. A tool to do this is available +from ftp://telepresence.dmem.strath.ac.uk/pub/bt848/winutil, but it +does'nt work with bt878 boards according to some reports I received. +Another one with bt878 suport is available from +http://btwincap.sourceforge.net/Files/btspy2.00.zip + +You might also dig around in the *.ini files of the Windows applications. +You can have a look at the board to see which of the gpio pins are +connected at all and then start trial-and-error ... + + +Starting with release 0.7.41 bttv has a number of insmod options to +make the gpio debugging easier: + +bttv_gpio=0/1 enable/disable gpio debug messages +gpiomask=n set the gpiomask value +audiomux=i,j,... set the values of the audiomux array +audioall=a set the values of the audiomux array (one + value for all array elements, useful to check + out which effect the particular value has). + +The messages printed with bttv_gpio=1 look like this: + + bttv0: gpio: en=00000027, out=00000024 in=00ffffd8 [audio: off] + +en = output _en_able register (BT848_GPIO_OUT_EN) +out = _out_put bits of the data register (BT848_GPIO_DATA), + i.e. BT848_GPIO_DATA & BT848_GPIO_OUT_EN +in = _in_put bits of the data register, + i.e. BT848_GPIO_DATA & ~BT848_GPIO_OUT_EN + + + +Other elements of the tvcards array +=================================== + +If you are trying to make a new card work you might find it useful to +know what the other elements in the tvcards array are good for: + +video_inputs - # of video inputs the card has +audio_inputs - historical cruft, not used any more. +tuner - which input is the tuner +svhs - which input is svhs (all others are labeled composite) +muxsel - video mux, input->registervalue mapping +pll - same as pll= insmod option +tuner_type - same as tuner= insmod option +*_modulename - hint whenever some card needs this or that audio + module loaded to work properly. +has_radio - whenever this TV card has a radio tuner. +no_msp34xx - "1" disables loading of msp3400.o module +no_tda9875 - "1" disables loading of tda9875.o module +needs_tvaudio - set to "1" to load tvaudio.o module + +If some config item is specified both from the tvcards array and as +insmod option, the insmod option takes precedence. + + + +Good luck, + + Gerd + + +PS: If you have a new working entry, mail it to me. + +-- +Gerd Knorr <kraxel@bytesex.org> diff --git a/Documentation/video4linux/bttv/Specs b/Documentation/video4linux/bttv/Specs new file mode 100644 index 000000000000..79b9e576fe79 --- /dev/null +++ b/Documentation/video4linux/bttv/Specs @@ -0,0 +1,3 @@ +Philips http://www.Semiconductors.COM/pip/ +Conexant http://www.conexant.com/techinfo/default.asp +Micronas http://www.micronas.de/pages/product_documentation/index.html diff --git a/Documentation/video4linux/bttv/THANKS b/Documentation/video4linux/bttv/THANKS new file mode 100644 index 000000000000..2085399da7d4 --- /dev/null +++ b/Documentation/video4linux/bttv/THANKS @@ -0,0 +1,24 @@ +Many thanks to: + +- Markus Schroeder <schroedm@uni-duesseldorf.de> for information on the Bt848 + and tuner programming and his control program xtvc. + +- Martin Buck <martin-2.buck@student.uni-ulm.de> for his great Videotext + package. + +- Gerd Knorr <kraxel@cs.tu-berlin.de> for the MSP3400 support and the modular + I2C, tuner, ... support. + + +- MATRIX Vision for giving us 2 cards for free, which made support of + single crystal operation possible. + +- MIRO for providing a free PCTV card and detailed information about the + components on their cards. (E.g. how the tuner type is detected) + Without their card I could not have debugged the NTSC mode. + +- Hauppauge for telling how the sound input is selected and what components + they do and will use on their radio cards. + Also many thanks for faxing me the FM1216 data sheet. + + diff --git a/Documentation/video4linux/bttv/Tuners b/Documentation/video4linux/bttv/Tuners new file mode 100644 index 000000000000..d18fbc70c0e0 --- /dev/null +++ b/Documentation/video4linux/bttv/Tuners @@ -0,0 +1,115 @@ +1) Tuner Programming +==================== +There are some flavors of Tuner programming APIs. +These differ mainly by the bandswitch byte. + + L= LG_API (VHF_LO=0x01, VHF_HI=0x02, UHF=0x08, radio=0x04) + P= PHILIPS_API (VHF_LO=0xA0, VHF_HI=0x90, UHF=0x30, radio=0x04) + T= TEMIC_API (VHF_LO=0x02, VHF_HI=0x04, UHF=0x01) + A= ALPS_API (VHF_LO=0x14, VHF_HI=0x12, UHF=0x11) + M= PHILIPS_MK3 (VHF_LO=0x01, VHF_HI=0x02, UHF=0x04, radio=0x19) + +2) Tuner Manufacturers +====================== + +SAMSUNG Tuner identification: (e.g. TCPM9091PD27) + TCP [ABCJLMNQ] 90[89][125] [DP] [ACD] 27 [ABCD] + [ABCJLMNQ]: + A= BG+DK + B= BG + C= I+DK + J= NTSC-Japan + L= Secam LL + M= BG+I+DK + N= NTSC + Q= BG+I+DK+LL + [89]: ? + [125]: + 2: No FM + 5: With FM + [DP]: + D= NTSC + P= PAL + [ACD]: + A= F-connector + C= Phono connector + D= Din Jack + [ABCD]: + 3-wire/I2C tuning, 2-band/3-band + + These Tuners are PHILIPS_API compatible. + +Philips Tuner identification: (e.g. FM1216MF) + F[IRMQ]12[1345]6{MF|ME|MP} + F[IRMQ]: + FI12x6: Tuner Series + FR12x6: Tuner + Radio IF + FM12x6: Tuner + FM + FQ12x6: special + FMR12x6: special + TD15xx: Digital Tuner ATSC + 12[1345]6: + 1216: PAL BG + 1236: NTSC + 1246: PAL I + 1256: Pal DK + {MF|ME|MP} + MF: BG LL w/ Secam (Multi France) + ME: BG DK I LL (Multi Europe) + MP: BG DK I (Multi PAL) + MR: BG DK M (?) + MG: BG DKI M (?) + MK2 series PHILIPS_API, most tuners are compatible to this one ! + MK3 series introduced in 2002 w/ PHILIPS_MK3_API + +Temic Tuner identification: (.e.g 4006FH5) + 4[01][0136][269]F[HYNR]5 + 40x2: Tuner (5V/33V), TEMIC_API. + 40x6: Tuner 5V + 41xx: Tuner compact + 40x9: Tuner+FM compact + [0136] + xx0x: PAL BG + xx1x: Pal DK, Secam LL + xx3x: NTSC + xx6x: PAL I + F[HYNR]5 + FH5: Pal BG + FY5: others + FN5: multistandard + FR5: w/ FM radio + 3X xxxx: order number with specific connector + Note: Only 40x2 series has TEMIC_API, all newer tuners have PHILIPS_API. + +LG Innotek Tuner: + TPI8NSR11 : NTSC J/M (TPI8NSR01 w/FM) (P,210/497) + TPI8PSB11 : PAL B/G (TPI8PSB01 w/FM) (P,170/450) + TAPC-I701 : PAL I (TAPC-I001 w/FM) (P,170/450) + TPI8PSB12 : PAL D/K+B/G (TPI8PSB02 w/FM) (P,170/450) + TAPC-H701P: NTSC_JP (TAPC-H001P w/FM) (L,170/450) + TAPC-G701P: PAL B/G (TAPC-G001P w/FM) (L,170/450) + TAPC-W701P: PAL I (TAPC-W001P w/FM) (L,170/450) + TAPC-Q703P: PAL D/K (TAPC-Q001P w/FM) (L,170/450) + TAPC-Q704P: PAL D/K+I (L,170/450) + TAPC-G702P: PAL D/K+B/G (L,170/450) + + TADC-H002F: NTSC (L,175/410?; 2-B, C-W+11, W+12-69) + TADC-M201D: PAL D/K+B/G+I (L,143/425) (sound control at I2C address 0xc8) + TADC-T003F: NTSC Taiwan (L,175/410?; 2-B, C-W+11, W+12-69) + Suffix: + P= Standard phono female socket + D= IEC female socket + F= F-connector + +Other Tuners: +TCL2002MB-1 : PAL BG + DK =TUNER_LG_PAL_NEW_TAPC +TCL2002MB-1F: PAL BG + DK w/FM =PHILIPS_PAL +TCL2002MI-2 : PAL I = ?? + +ALPS Tuners: + Most are LG_API compatible + TSCH6 has ALPS_API (TSCH5 ?) + TSBE1 has extra API 05,02,08 Control_byte=0xCB Source:(1) + +Lit. +(1) conexant100029b-PCI-Decoder-ApplicationNote.pdf diff --git a/Documentation/video4linux/meye.txt b/Documentation/video4linux/meye.txt new file mode 100644 index 000000000000..2137da97552f --- /dev/null +++ b/Documentation/video4linux/meye.txt @@ -0,0 +1,130 @@ +Vaio Picturebook Motion Eye Camera Driver Readme +------------------------------------------------ + Copyright (C) 2001-2004 Stelian Pop <stelian@popies.net> + Copyright (C) 2001-2002 Alcôve <www.alcove.com> + Copyright (C) 2000 Andrew Tridgell <tridge@samba.org> + +This driver enable the use of video4linux compatible applications with the +Motion Eye camera. This driver requires the "Sony Vaio Programmable I/O +Control Device" driver (which can be found in the "Character drivers" +section of the kernel configuration utility) to be compiled and installed +(using its "camera=1" parameter). + +It can do at maximum 30 fps @ 320x240 or 15 fps @ 640x480. + +Grabbing is supported in packed YUV colorspace only. + +MJPEG hardware grabbing is supported via a private API (see below). + +Hardware supported: +------------------- + +This driver supports the 'second' version of the MotionEye camera :) + +The first version was connected directly on the video bus of the Neomagic +video card and is unsupported. + +The second one, made by Kawasaki Steel is fully supported by this +driver (PCI vendor/device is 0x136b/0xff01) + +The third one, present in recent (more or less last year) Picturebooks +(C1M* models), is not supported. The manufacturer has given the specs +to the developers under a NDA (which allows the develoment of a GPL +driver however), but things are not moving very fast (see +http://r-engine.sourceforge.net/) (PCI vendor/device is 0x10cf/0x2011). + +There is a forth model connected on the USB bus in TR1* Vaio laptops. +This camera is not supported at all by the current driver, in fact +little information if any is available for this camera +(USB vendor/device is 0x054c/0x0107). + +Driver options: +--------------- + +Several options can be passed to the meye driver using the standard +module argument syntax (<param>=<value> when passing the option to the +module or meye.<param>=<value> on the kernel boot line when meye is +statically linked into the kernel). Those options are: + + forcev4l1: force use of V4L1 API instead of V4L2 + + gbuffers: number of capture buffers, default is 2 (32 max) + + gbufsize: size of each capture buffer, default is 614400 + + video_nr: video device to register (0 = /dev/video0, etc) + +Module use: +----------- + +In order to automatically load the meye module on use, you can put those lines +in your /etc/modprobe.conf file: + + alias char-major-81 videodev + alias char-major-81-0 meye + options meye gbuffers=32 + +Usage: +------ + + xawtv >= 3.49 (<http://bytesex.org/xawtv/>) + for display and uncompressed video capture: + + xawtv -c /dev/video0 -geometry 640x480 + or + xawtv -c /dev/video0 -geometry 320x240 + + motioneye (<http://popies.net/meye/>) + for getting ppm or jpg snapshots, mjpeg video + +Private API: +------------ + + The driver supports frame grabbing with the video4linux API + (either v4l1 or v4l2), so all video4linux tools (like xawtv) + should work with this driver. + + Besides the video4linux interface, the driver has a private interface + for accessing the Motion Eye extended parameters (camera sharpness, + agc, video framerate), the shapshot and the MJPEG capture facilities. + + This interface consists of several ioctls (prototypes and structures + can be found in include/linux/meye.h): + + MEYEIOC_G_PARAMS + MEYEIOC_S_PARAMS + Get and set the extended parameters of the motion eye camera. + The user should always query the current parameters with + MEYEIOC_G_PARAMS, change what he likes and then issue the + MEYEIOC_S_PARAMS call (checking for -EINVAL). The extended + parameters are described by the meye_params structure. + + + MEYEIOC_QBUF_CAPT + Queue a buffer for capture (the buffers must have been + obtained with a VIDIOCGMBUF call and mmap'ed by the + application). The argument to MEYEIOC_QBUF_CAPT is the + buffer number to queue (or -1 to end capture). The first + call to MEYEIOC_QBUF_CAPT starts the streaming capture. + + MEYEIOC_SYNC + Takes as an argument the buffer number you want to sync. + This ioctl blocks until the buffer is filled and ready + for the application to use. It returns the buffer size. + + MEYEIOC_STILLCAPT + MEYEIOC_STILLJCAPT + Takes a snapshot in an uncompressed or compressed jpeg format. + This ioctl blocks until the snapshot is done and returns (for + jpeg snapshot) the size of the image. The image data is + available from the first mmap'ed buffer. + + Look at the 'motioneye' application code for an actual example. + +Bugs / Todo: +------------ + + - the driver could be much cleaned up by removing the v4l1 support. + However, this means all v4l1-only applications will stop working. + + - 'motioneye' still uses the meye private v4l1 API extensions. diff --git a/Documentation/video4linux/radiotrack.txt b/Documentation/video4linux/radiotrack.txt new file mode 100644 index 000000000000..2b75345f13e3 --- /dev/null +++ b/Documentation/video4linux/radiotrack.txt @@ -0,0 +1,147 @@ +NOTES ON RADIOTRACK CARD CONTROL +by Stephen M. Benoit (benoits@servicepro.com) Dec 14, 1996 +---------------------------------------------------------------------------- + +Document version 1.0 + +ACKNOWLEDGMENTS +---------------- +This document was made based on 'C' code for Linux from Gideon le Grange +(legrang@active.co.za or legrang@cs.sun.ac.za) in 1994, and elaborations from +Frans Brinkman (brinkman@esd.nl) in 1996. The results reported here are from +experiments that the author performed on his own setup, so your mileage may +vary... I make no guarantees, claims or warranties to the suitability or +validity of this information. No other documentation on the AIMS +Lab (http://www.aimslab.com/) RadioTrack card was made available to the +author. This document is offered in the hopes that it might help users who +want to use the RadioTrack card in an environment other than MS Windows. + +WHY THIS DOCUMENT? +------------------ +I have a RadioTrack card from back when I ran an MS-Windows platform. After +converting to Linux, I found Gideon le Grange's command-line software for +running the card, and found that it was good! Frans Brinkman made a +comfortable X-windows interface, and added a scanning feature. For hack +value, I wanted to see if the tuner could be tuned beyond the usual FM radio +broadcast band, so I could pick up the audio carriers from North American +broadcast TV channels, situated just below and above the 87.0-109.0 MHz range. +I did not get much success, but I learned about programming ioports under +Linux and gained some insights about the hardware design used for the card. + +So, without further delay, here are the details. + + +PHYSICAL DESCRIPTION +-------------------- +The RadioTrack card is an ISA 8-bit FM radio card. The radio frequency (RF) +input is simply an antenna lead, and the output is a power audio signal +available through a miniature phone plug. Its RF frequencies of operation are +more or less limited from 87.0 to 109.0 MHz (the commercial FM broadcast +band). Although the registers can be programmed to request frequencies beyond +these limits, experiments did not give promising results. The variable +frequency oscillator (VFO) that demodulates the intermediate frequency (IF) +signal probably has a small range of useful frequencies, and wraps around or +gets clipped beyond the limits mentioned above. + + +CONTROLLING THE CARD WITH IOPORT +-------------------------------- +The RadioTrack (base) ioport is configurable for 0x30c or 0x20c. Only one +ioport seems to be involved. The ioport decoding circuitry must be pretty +simple, as individual ioport bits are directly matched to specific functions +(or blocks) of the radio card. This way, many functions can be changed in +parallel with one write to the ioport. The only feedback available through +the ioports appears to be the "Stereo Detect" bit. + +The bits of the ioport are arranged as follows: + + MSb LSb ++------+------+------+--------+--------+-------+---------+--------+ +| VolA | VolB | ???? | Stereo | Radio | TuneA | TuneB | Tune | +| (+) | (-) | | Detect | Audio | (bit) | (latch) | Update | +| | | | Enable | Enable | | | Enable | ++------+------+------+--------+--------+-------+---------+--------+ + + +VolA . VolB [AB......] +----------- +0 0 : audio mute +0 1 : volume + (some delay required) +1 0 : volume - (some delay required) +1 1 : stay at present volume + +Stereo Detect Enable [...S....] +-------------------- +0 : No Detect +1 : Detect + + Results available by reading ioport >60 msec after last port write. + 0xff ==> no stereo detected, 0xfd ==> stereo detected. + +Radio to Audio (path) Enable [....R...] +---------------------------- +0 : Disable path (silence) +1 : Enable path (audio produced) + +TuneA . TuneB [.....AB.] +------------- +0 0 : "zero" bit phase 1 +0 1 : "zero" bit phase 2 + +1 0 : "one" bit phase 1 +1 1 : "one" bit phase 2 + + 24-bit code, where bits = (freq*40) + 10486188. + The Most Significant 11 bits must be 1010 xxxx 0x0 to be valid. + The bits are shifted in LSb first. + +Tune Update Enable [.......T] +------------------ +0 : Tuner held constant +1 : Tuner updating in progress + + +PROGRAMMING EXAMPLES +-------------------- +Default: BASE <-- 0xc8 (current volume, no stereo detect, + radio enable, tuner adjust disable) + +Card Off: BASE <-- 0x00 (audio mute, no stereo detect, + radio disable, tuner adjust disable) + +Card On: BASE <-- 0x00 (see "Card Off", clears any unfinished business) + BASE <-- 0xc8 (see "Default") + +Volume Down: BASE <-- 0x48 (volume down, no stereo detect, + radio enable, tuner adjust disable) + * wait 10 msec * + BASE <-- 0xc8 (see "Default") + +Volume Up: BASE <-- 0x88 (volume up, no stereo detect, + radio enable, tuner adjust disable) + * wait 10 msec * + BASE <-- 0xc8 (see "Default") + +Check Stereo: BASE <-- 0xd8 (current volume, stereo detect, + radio enable, tuner adjust disable) + * wait 100 msec * + x <-- BASE (read ioport) + BASE <-- 0xc8 (see "Default") + + x=0xff ==> "not stereo", x=0xfd ==> "stereo detected" + +Set Frequency: code = (freq*40) + 10486188 + foreach of the 24 bits in code, + (from Least to Most Significant): + to write a "zero" bit, + BASE <-- 0x01 (audio mute, no stereo detect, radio + disable, "zero" bit phase 1, tuner adjust) + BASE <-- 0x03 (audio mute, no stereo detect, radio + disable, "zero" bit phase 2, tuner adjust) + to write a "one" bit, + BASE <-- 0x05 (audio mute, no stereo detect, radio + disable, "one" bit phase 1, tuner adjust) + BASE <-- 0x07 (audio mute, no stereo detect, radio + disable, "one" bit phase 2, tuner adjust) + +---------------------------------------------------------------------------- diff --git a/Documentation/video4linux/w9966.txt b/Documentation/video4linux/w9966.txt new file mode 100644 index 000000000000..e7ac33a7eb06 --- /dev/null +++ b/Documentation/video4linux/w9966.txt @@ -0,0 +1,33 @@ +W9966 Camera driver, written by Jakob Kemi (jakob.kemi@telia.com) + +After a lot of work in softice & wdasm, reading .pdf-files and tiresome +trial-and-error work I've finally got everything to work. I needed vision for a +robotics project so I borrowed this camera from a friend and started hacking. +Anyway I've converted my original code from the AVR 8bit RISC C/ASM code into +a working Linux driver. + +To get it working simply configure your kernel to support +parport, ieee1284, video4linux and w9966 + +If w9966 is statically linked it will always perform aggressive probing for +the camera. If built as a module you'll have more configuration options. + +Options: + modprobe w9966.o pardev=parport0(or whatever) parmode=0 (0=auto, 1=ecp, 2=epp) +voila! + +you can also type 'modinfo -p w9966.o' for option usage +(or checkout w9966.c) + +The only thing to keep in mind is that the image format is in Y-U-Y-V format +where every two pixels take 4 bytes. In SDL (www.libsdl.org) this format +is called VIDEO_PALETTE_YUV422 (16 bpp). + +A minimal test application (with source) is available from: + http://hem.fyristorg.com/mogul/w9966.html + +The slow framerate is due to missing DMA ECP read support in the +parport drivers. I might add working EPP support later. + +Good luck! + /Jakob Kemi diff --git a/Documentation/video4linux/zr36120.txt b/Documentation/video4linux/zr36120.txt new file mode 100644 index 000000000000..4af6c52595eb --- /dev/null +++ b/Documentation/video4linux/zr36120.txt @@ -0,0 +1,159 @@ +Driver for Trust Computer Products Framegrabber, version 0.6.1 +------ --- ----- -------- -------- ------------ ------- - - - + +- ZORAN ------------------------------------------------------ + Author: Pauline Middelink <middelin@polyware.nl> + Date: 18 September 1999 +Version: 0.6.1 + +- Description ------------------------------------------------ + +Video4Linux compatible driver for an unknown brand framegrabber +(Sold in the Netherlands by TRUST Computer Products) and various +other zoran zr36120 based framegrabbers. + +The card contains a ZR36120 Multimedia PCI Interface and a Philips +SAA7110 Onechip Frontend videodecoder. There is also an DSP of +which I have forgotten the number, since i will never get that thing +to work without specs from the vendor itself. + +The SAA711x are capable of processing 6 different video inputs, +CVBS1..6 and Y1+C1, Y2+C2, Y3+C3. All in 50/60Hz, NTSC, PAL or +SECAM and delivering a YUV datastream. On my card the input +'CVBS-0' corresponds to channel CVBS2 and 'S-Video' to Y2+C2. + +I have some reports of other cards working with the mentioned +chip sets. For a list of other working cards please have a look +at the cards named in the tvcards struct in the beginning of +zr36120.c + +After some testing, I discovered that the carddesigner messed up +on the I2C interface. The Zoran chip includes 2 lines SDA and SCL +which (s)he connected reversely. So we have to clock on the SDA +and r/w data on the SCL pin. Life is fun... Each cardtype now has +a bit which signifies if you have a card with the same deficiency. + +Oh, for the completeness of this story I must mention that my +card delivers the VSYNC pulse of the SAA chip to GIRQ1, not +GIRQ0 as some other cards have. This is also incorporated in +the driver be clearing/setting the 'useirq1' bit in the tvcard +description. + +Another problems of continuous capturing data with a Zoran chip +is something nasty inside the chip. It effectively halves the +fps we ought to get... Here is the scenario: capturing frames +to memory is done in the so-called snapshot mode. In this mode +the Zoran stops after capturing a frame worth of data and wait +till the application set GRAB bit to indicate readiness for the +next frame. After detecting a set bit, the chip neatly waits +till the start of a frame, captures it and it goes back to off. +Smart ppl will notice the problem here. Its the waiting on the +_next_ frame each time we set the GRAB bit... Oh well, 12,5 fps +is still plenty fast for me. +-- update 28/7/1999 -- +Don't believe a word I just said... Proof is the output +of `streamer -t 300 -r 25 -f avi15 -o /dev/null` + ++--+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- 25/25 + +-s+-+-+-+-+-+-+-+-+-+-+-+-+-s+-+-+-+-+-+-+-+-+-+-+- + syncer: done + writer: done +(note the /dev/null is prudent here, my system is not able to + grab /and/ write 25 fps to a file... gifts welcome :) ) +The technical reasoning follows: The zoran completed the last +frame, the VSYNC goes low, and GRAB is cleared. The interrupt +routine starts to work since its VSYNC driven, and again +activates the GRAB bit. A few ms later the VSYNC (re-)rises and +the zoran starts to work on a new and freshly broadcasted frame.... + +For pointers I used the specs of both chips. Below are the URLs: + http://www.zoran.com/ftp/download/devices/pci/ZR36120/36120data.pdf + http://www-us.semiconductor.philips.com/acrobat/datasheets/SAA_7110_A_1.pdf + +The documentation has very little on absolute numbers or timings +needed for the various modes/resolutions, but there are other +programs you can borrow those from. + +------ Install -------------------------------------------- +Read the file called TODO. Note its long list of limitations. + +Build a kernel with VIDEO4LINUX enabled. Activate the +BT848 driver; we need this because we have need for the +other modules (i2c and videodev) it enables. + +To install this software, extract it into a suitable directory. +Examine the makefile and change anything you don't like. Type "make". + +After making the modules check if you have the much needed +/dev/video devices. If not, execute the following 4 lines: + mknod /dev/video c 81 0 + mknod /dev/video1 c 81 1 + mknod /dev/video2 c 81 2 + mknod /dev/video3 c 81 3 + mknod /dev/video4 c 81 4 + +After making/checking the devices do: + modprobe i2c + modprobe videodev + modprobe saa7110 (optional) + modprobe saa7111 (optional) + modprobe tuner (optional) + insmod zoran cardtype=<n> + +<n> is the cardtype of the card you have. The cardnumber can +be found in the source of zr36120. Look for tvcards. If your +card is not there, please try if any other card gives some +response, and mail me if you got a working tvcard addition. + +PS. <TVCard editors behold!) + Dont forget to set video_input to the number of inputs + you defined in the video_mux part of the tvcard definition. + Its a common error to add a channel but not incrementing + video_input and getting angry with me/v4l/linux/linus :( + +You are now ready to test the framegrabber with your favorite +video4linux compatible tool + +------ Application ---------------------------------------- + +This device works with all Video4Linux compatible applications, +given the limitations in the TODO file. + +------ API ------------------------------------------------ + +This uses the V4L interface as of kernel release 2.1.116, and in +fact has not been tested on any lower version. There are a couple +of minor differences due to the fact that the amount of data returned +with each frame varies, and no doubt there are discrepancies due to my +misunderstanding of the API. I intend to convert this driver to the +new V4L2 API when it has stabilized more. + +------ Current state -------------------------------------- + +The driver is capable of overlaying a video image in screen, and +even capable of grabbing frames. It uses the BIGPHYSAREA patch +to allocate lots of large memory blocks when tis patch is +found in the kernel, but it doesn't need it. +The consequence is that, when loading the driver as a module, +the module may tell you it's out of memory, but 'free' says +otherwise. The reason is simple; the modules wants its memory +contiguous, not fragmented, and after a long uptime there +probably isn't a fragment of memory large enough... + +The driver uses a double buffering scheme, which should really +be an n-way buffer, depending on the size of allocated framebuffer +and the requested grab-size/format. +This current version also fixes a dead-lock situation during irq +time, which really, really froze my system... :) + +Good luck. + Pauline diff --git a/Documentation/vm/balance b/Documentation/vm/balance new file mode 100644 index 000000000000..bd3d31bc4915 --- /dev/null +++ b/Documentation/vm/balance @@ -0,0 +1,93 @@ +Started Jan 2000 by Kanoj Sarcar <kanoj@sgi.com> + +Memory balancing is needed for non __GFP_WAIT as well as for non +__GFP_IO allocations. + +There are two reasons to be requesting non __GFP_WAIT allocations: +the caller can not sleep (typically intr context), or does not want +to incur cost overheads of page stealing and possible swap io for +whatever reasons. + +__GFP_IO allocation requests are made to prevent file system deadlocks. + +In the absence of non sleepable allocation requests, it seems detrimental +to be doing balancing. Page reclamation can be kicked off lazily, that +is, only when needed (aka zone free memory is 0), instead of making it +a proactive process. + +That being said, the kernel should try to fulfill requests for direct +mapped pages from the direct mapped pool, instead of falling back on +the dma pool, so as to keep the dma pool filled for dma requests (atomic +or not). A similar argument applies to highmem and direct mapped pages. +OTOH, if there is a lot of free dma pages, it is preferable to satisfy +regular memory requests by allocating one from the dma pool, instead +of incurring the overhead of regular zone balancing. + +In 2.2, memory balancing/page reclamation would kick off only when the +_total_ number of free pages fell below 1/64 th of total memory. With the +right ratio of dma and regular memory, it is quite possible that balancing +would not be done even when the dma zone was completely empty. 2.2 has +been running production machines of varying memory sizes, and seems to be +doing fine even with the presence of this problem. In 2.3, due to +HIGHMEM, this problem is aggravated. + +In 2.3, zone balancing can be done in one of two ways: depending on the +zone size (and possibly of the size of lower class zones), we can decide +at init time how many free pages we should aim for while balancing any +zone. The good part is, while balancing, we do not need to look at sizes +of lower class zones, the bad part is, we might do too frequent balancing +due to ignoring possibly lower usage in the lower class zones. Also, +with a slight change in the allocation routine, it is possible to reduce +the memclass() macro to be a simple equality. + +Another possible solution is that we balance only when the free memory +of a zone _and_ all its lower class zones falls below 1/64th of the +total memory in the zone and its lower class zones. This fixes the 2.2 +balancing problem, and stays as close to 2.2 behavior as possible. Also, +the balancing algorithm works the same way on the various architectures, +which have different numbers and types of zones. If we wanted to get +fancy, we could assign different weights to free pages in different +zones in the future. + +Note that if the size of the regular zone is huge compared to dma zone, +it becomes less significant to consider the free dma pages while +deciding whether to balance the regular zone. The first solution +becomes more attractive then. + +The appended patch implements the second solution. It also "fixes" two +problems: first, kswapd is woken up as in 2.2 on low memory conditions +for non-sleepable allocations. Second, the HIGHMEM zone is also balanced, +so as to give a fighting chance for replace_with_highmem() to get a +HIGHMEM page, as well as to ensure that HIGHMEM allocations do not +fall back into regular zone. This also makes sure that HIGHMEM pages +are not leaked (for example, in situations where a HIGHMEM page is in +the swapcache but is not being used by anyone) + +kswapd also needs to know about the zones it should balance. kswapd is +primarily needed in a situation where balancing can not be done, +probably because all allocation requests are coming from intr context +and all process contexts are sleeping. For 2.3, kswapd does not really +need to balance the highmem zone, since intr context does not request +highmem pages. kswapd looks at the zone_wake_kswapd field in the zone +structure to decide whether a zone needs balancing. + +Page stealing from process memory and shm is done if stealing the page would +alleviate memory pressure on any zone in the page's node that has fallen below +its watermark. + +pages_min/pages_low/pages_high/low_on_memory/zone_wake_kswapd: These are +per-zone fields, used to determine when a zone needs to be balanced. When +the number of pages falls below pages_min, the hysteric field low_on_memory +gets set. This stays set till the number of free pages becomes pages_high. +When low_on_memory is set, page allocation requests will try to free some +pages in the zone (providing GFP_WAIT is set in the request). Orthogonal +to this, is the decision to poke kswapd to free some zone pages. That +decision is not hysteresis based, and is done when the number of free +pages is below pages_low; in which case zone_wake_kswapd is also set. + + +(Good) Ideas that I have heard: +1. Dynamic experience should influence balancing: number of failed requests +for a zone can be tracked and fed into the balancing scheme (jalvo@mbay.net) +2. Implement a replace_with_highmem()-like replace_with_regular() to preserve +dma pages. (lkd@tantalophile.demon.co.uk) diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt new file mode 100644 index 000000000000..1b9bcd1fe98b --- /dev/null +++ b/Documentation/vm/hugetlbpage.txt @@ -0,0 +1,284 @@ + +The intent of this file is to give a brief summary of hugetlbpage support in +the Linux kernel. This support is built on top of multiple page size support +that is provided by most modern architectures. For example, i386 +architecture supports 4K and 4M (2M in PAE mode) page sizes, ia64 +architecture supports multiple page sizes 4K, 8K, 64K, 256K, 1M, 4M, 16M, +256M and ppc64 supports 4K and 16M. A TLB is a cache of virtual-to-physical +translations. Typically this is a very scarce resource on processor. +Operating systems try to make best use of limited number of TLB resources. +This optimization is more critical now as bigger and bigger physical memories +(several GBs) are more readily available. + +Users can use the huge page support in Linux kernel by either using the mmap +system call or standard SYSv shared memory system calls (shmget, shmat). + +First the Linux kernel needs to be built with CONFIG_HUGETLB_PAGE (present +under Processor types and feature) and CONFIG_HUGETLBFS (present under file +system option on config menu) config options. + +The kernel built with hugepage support should show the number of configured +hugepages in the system by running the "cat /proc/meminfo" command. + +/proc/meminfo also provides information about the total number of hugetlb +pages configured in the kernel. It also displays information about the +number of free hugetlb pages at any time. It also displays information about +the configured hugepage size - this is needed for generating the proper +alignment and size of the arguments to the above system calls. + +The output of "cat /proc/meminfo" will have output like: + +..... +HugePages_Total: xxx +HugePages_Free: yyy +Hugepagesize: zzz KB + +/proc/filesystems should also show a filesystem of type "hugetlbfs" configured +in the kernel. + +/proc/sys/vm/nr_hugepages indicates the current number of configured hugetlb +pages in the kernel. Super user can dynamically request more (or free some +pre-configured) hugepages. +The allocation( or deallocation) of hugetlb pages is posible only if there are +enough physically contiguous free pages in system (freeing of hugepages is +possible only if there are enough hugetlb pages free that can be transfered +back to regular memory pool). + +Pages that are used as hugetlb pages are reserved inside the kernel and can +not be used for other purposes. + +Once the kernel with Hugetlb page support is built and running, a user can +use either the mmap system call or shared memory system calls to start using +the huge pages. It is required that the system administrator preallocate +enough memory for huge page purposes. + +Use the following command to dynamically allocate/deallocate hugepages: + + echo 20 > /proc/sys/vm/nr_hugepages + +This command will try to configure 20 hugepages in the system. The success +or failure of allocation depends on the amount of physically contiguous +memory that is preset in system at this time. System administrators may want +to put this command in one of the local rc init file. This will enable the +kernel to request huge pages early in the boot process (when the possibility +of getting physical contiguous pages is still very high). + +If the user applications are going to request hugepages using mmap system +call, then it is required that system administrator mount a file system of +type hugetlbfs: + + mount none /mnt/huge -t hugetlbfs <uid=value> <gid=value> <mode=value> + <size=value> <nr_inodes=value> + +This command mounts a (pseudo) filesystem of type hugetlbfs on the directory +/mnt/huge. Any files created on /mnt/huge uses hugepages. The uid and gid +options sets the owner and group of the root of the file system. By default +the uid and gid of the current process are taken. The mode option sets the +mode of root of file system to value & 0777. This value is given in octal. +By default the value 0755 is picked. The size option sets the maximum value of +memory (huge pages) allowed for that filesystem (/mnt/huge). The size is +rounded down to HPAGE_SIZE. The option nr_inode sets the maximum number of +inodes that /mnt/huge can use. If the size or nr_inode options are not +provided on command line then no limits are set. For size and nr_inodes +options, you can use [G|g]/[M|m]/[K|k] to represent giga/mega/kilo. For +example, size=2K has the same meaning as size=2048. An example is given at +the end of this document. + +read and write system calls are not supported on files that reside on hugetlb +file systems. + +A regular chown, chgrp and chmod commands (with right permissions) could be +used to change the file attributes on hugetlbfs. + +Also, it is important to note that no such mount command is required if the +applications are going to use only shmat/shmget system calls. Users who +wish to use hugetlb page via shared memory segment should be a member of +a supplementary group and system admin needs to configure that gid into +/proc/sys/vm/hugetlb_shm_group. It is possible for same or different +applications to use any combination of mmaps and shm* calls. Though the +mount of filesystem will be required for using mmaps. + +******************************************************************* + +/* + * Example of using hugepage memory in a user application using Sys V shared + * memory system calls. In this example the app is requesting 256MB of + * memory that is backed by huge pages. The application uses the flag + * SHM_HUGETLB in the shmget system call to inform the kernel that it is + * requesting hugepages. + * + * For the ia64 architecture, the Linux kernel reserves Region number 4 for + * hugepages. That means the addresses starting with 0x800000... will need + * to be specified. Specifying a fixed address is not required on ppc64, + * i386 or x86_64. + * + * Note: The default shared memory limit is quite low on many kernels, + * you may need to increase it via: + * + * echo 268435456 > /proc/sys/kernel/shmmax + * + * This will increase the maximum size per shared memory segment to 256MB. + * The other limit that you will hit eventually is shmall which is the + * total amount of shared memory in pages. To set it to 16GB on a system + * with a 4kB pagesize do: + * + * echo 4194304 > /proc/sys/kernel/shmall + */ +#include <stdlib.h> +#include <stdio.h> +#include <sys/types.h> +#include <sys/ipc.h> +#include <sys/shm.h> +#include <sys/mman.h> + +#ifndef SHM_HUGETLB +#define SHM_HUGETLB 04000 +#endif + +#define LENGTH (256UL*1024*1024) + +#define dprintf(x) printf(x) + +/* Only ia64 requires this */ +#ifdef __ia64__ +#define ADDR (void *)(0x8000000000000000UL) +#define SHMAT_FLAGS (SHM_RND) +#else +#define ADDR (void *)(0x0UL) +#define SHMAT_FLAGS (0) +#endif + +int main(void) +{ + int shmid; + unsigned long i; + char *shmaddr; + + if ((shmid = shmget(2, LENGTH, + SHM_HUGETLB | IPC_CREAT | SHM_R | SHM_W)) < 0) { + perror("shmget"); + exit(1); + } + printf("shmid: 0x%x\n", shmid); + + shmaddr = shmat(shmid, ADDR, SHMAT_FLAGS); + if (shmaddr == (char *)-1) { + perror("Shared memory attach failure"); + shmctl(shmid, IPC_RMID, NULL); + exit(2); + } + printf("shmaddr: %p\n", shmaddr); + + dprintf("Starting the writes:\n"); + for (i = 0; i < LENGTH; i++) { + shmaddr[i] = (char)(i); + if (!(i % (1024 * 1024))) + dprintf("."); + } + dprintf("\n"); + + dprintf("Starting the Check..."); + for (i = 0; i < LENGTH; i++) + if (shmaddr[i] != (char)i) + printf("\nIndex %lu mismatched\n", i); + dprintf("Done.\n"); + + if (shmdt((const void *)shmaddr) != 0) { + perror("Detach failure"); + shmctl(shmid, IPC_RMID, NULL); + exit(3); + } + + shmctl(shmid, IPC_RMID, NULL); + + return 0; +} + +******************************************************************* + +/* + * Example of using hugepage memory in a user application using the mmap + * system call. Before running this application, make sure that the + * administrator has mounted the hugetlbfs filesystem (on some directory + * like /mnt) using the command mount -t hugetlbfs nodev /mnt. In this + * example, the app is requesting memory of size 256MB that is backed by + * huge pages. + * + * For ia64 architecture, Linux kernel reserves Region number 4 for hugepages. + * That means the addresses starting with 0x800000... will need to be + * specified. Specifying a fixed address is not required on ppc64, i386 + * or x86_64. + */ +#include <stdlib.h> +#include <stdio.h> +#include <unistd.h> +#include <sys/mman.h> +#include <fcntl.h> + +#define FILE_NAME "/mnt/hugepagefile" +#define LENGTH (256UL*1024*1024) +#define PROTECTION (PROT_READ | PROT_WRITE) + +/* Only ia64 requires this */ +#ifdef __ia64__ +#define ADDR (void *)(0x8000000000000000UL) +#define FLAGS (MAP_SHARED | MAP_FIXED) +#else +#define ADDR (void *)(0x0UL) +#define FLAGS (MAP_SHARED) +#endif + +void check_bytes(char *addr) +{ + printf("First hex is %x\n", *((unsigned int *)addr)); +} + +void write_bytes(char *addr) +{ + unsigned long i; + + for (i = 0; i < LENGTH; i++) + *(addr + i) = (char)i; +} + +void read_bytes(char *addr) +{ + unsigned long i; + + check_bytes(addr); + for (i = 0; i < LENGTH; i++) + if (*(addr + i) != (char)i) { + printf("Mismatch at %lu\n", i); + break; + } +} + +int main(void) +{ + void *addr; + int fd; + + fd = open(FILE_NAME, O_CREAT | O_RDWR, 0755); + if (fd < 0) { + perror("Open failed"); + exit(1); + } + + addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, fd, 0); + if (addr == MAP_FAILED) { + perror("mmap"); + unlink(FILE_NAME); + exit(1); + } + + printf("Returned address is %p\n", addr); + check_bytes(addr); + write_bytes(addr); + read_bytes(addr); + + munmap(addr, LENGTH); + close(fd); + unlink(FILE_NAME); + + return 0; +} diff --git a/Documentation/vm/locking b/Documentation/vm/locking new file mode 100644 index 000000000000..c3ef09ae3bb1 --- /dev/null +++ b/Documentation/vm/locking @@ -0,0 +1,131 @@ +Started Oct 1999 by Kanoj Sarcar <kanojsarcar@yahoo.com> + +The intent of this file is to have an uptodate, running commentary +from different people about how locking and synchronization is done +in the Linux vm code. + +page_table_lock & mmap_sem +-------------------------------------- + +Page stealers pick processes out of the process pool and scan for +the best process to steal pages from. To guarantee the existence +of the victim mm, a mm_count inc and a mmdrop are done in swap_out(). +Page stealers hold kernel_lock to protect against a bunch of races. +The vma list of the victim mm is also scanned by the stealer, +and the page_table_lock is used to preserve list sanity against the +process adding/deleting to the list. This also guarantees existence +of the vma. Vma existence is not guaranteed once try_to_swap_out() +drops the page_table_lock. To guarantee the existence of the underlying +file structure, a get_file is done before the swapout() method is +invoked. The page passed into swapout() is guaranteed not to be reused +for a different purpose because the page reference count due to being +present in the user's pte is not released till after swapout() returns. + +Any code that modifies the vmlist, or the vm_start/vm_end/ +vm_flags:VM_LOCKED/vm_next of any vma *in the list* must prevent +kswapd from looking at the chain. + +The rules are: +1. To scan the vmlist (look but don't touch) you must hold the + mmap_sem with read bias, i.e. down_read(&mm->mmap_sem) +2. To modify the vmlist you need to hold the mmap_sem with + read&write bias, i.e. down_write(&mm->mmap_sem) *AND* + you need to take the page_table_lock. +3. The swapper takes _just_ the page_table_lock, this is done + because the mmap_sem can be an extremely long lived lock + and the swapper just cannot sleep on that. +4. The exception to this rule is expand_stack, which just + takes the read lock and the page_table_lock, this is ok + because it doesn't really modify fields anybody relies on. +5. You must be able to guarantee that while holding page_table_lock + or page_table_lock of mm A, you will not try to get either lock + for mm B. + +The caveats are: +1. find_vma() makes use of, and updates, the mmap_cache pointer hint. +The update of mmap_cache is racy (page stealer can race with other code +that invokes find_vma with mmap_sem held), but that is okay, since it +is a hint. This can be fixed, if desired, by having find_vma grab the +page_table_lock. + + +Code that add/delete elements from the vmlist chain are +1. callers of insert_vm_struct +2. callers of merge_segments +3. callers of avl_remove + +Code that changes vm_start/vm_end/vm_flags:VM_LOCKED of vma's on +the list: +1. expand_stack +2. mprotect +3. mlock +4. mremap + +It is advisable that changes to vm_start/vm_end be protected, although +in some cases it is not really needed. Eg, vm_start is modified by +expand_stack(), it is hard to come up with a destructive scenario without +having the vmlist protection in this case. + +The page_table_lock nests with the inode i_mmap_lock and the kmem cache +c_spinlock spinlocks. This is okay, since the kmem code asks for pages after +dropping c_spinlock. The page_table_lock also nests with pagecache_lock and +pagemap_lru_lock spinlocks, and no code asks for memory with these locks +held. + +The page_table_lock is grabbed while holding the kernel_lock spinning monitor. + +The page_table_lock is a spin lock. + +Note: PTL can also be used to guarantee that no new clones using the +mm start up ... this is a loose form of stability on mm_users. For +example, it is used in copy_mm to protect against a racing tlb_gather_mmu +single address space optimization, so that the zap_page_range (from +vmtruncate) does not lose sending ipi's to cloned threads that might +be spawned underneath it and go to user mode to drag in pte's into tlbs. + +swap_list_lock/swap_device_lock +------------------------------- +The swap devices are chained in priority order from the "swap_list" header. +The "swap_list" is used for the round-robin swaphandle allocation strategy. +The #free swaphandles is maintained in "nr_swap_pages". These two together +are protected by the swap_list_lock. + +The swap_device_lock, which is per swap device, protects the reference +counts on the corresponding swaphandles, maintained in the "swap_map" +array, and the "highest_bit" and "lowest_bit" fields. + +Both of these are spinlocks, and are never acquired from intr level. The +locking hierarchy is swap_list_lock -> swap_device_lock. + +To prevent races between swap space deletion or async readahead swapins +deciding whether a swap handle is being used, ie worthy of being read in +from disk, and an unmap -> swap_free making the handle unused, the swap +delete and readahead code grabs a temp reference on the swaphandle to +prevent warning messages from swap_duplicate <- read_swap_cache_async. + +Swap cache locking +------------------ +Pages are added into the swap cache with kernel_lock held, to make sure +that multiple pages are not being added (and hence lost) by associating +all of them with the same swaphandle. + +Pages are guaranteed not to be removed from the scache if the page is +"shared": ie, other processes hold reference on the page or the associated +swap handle. The only code that does not follow this rule is shrink_mmap, +which deletes pages from the swap cache if no process has a reference on +the page (multiple processes might have references on the corresponding +swap handle though). lookup_swap_cache() races with shrink_mmap, when +establishing a reference on a scache page, so, it must check whether the +page it located is still in the swapcache, or shrink_mmap deleted it. +(This race is due to the fact that shrink_mmap looks at the page ref +count with pagecache_lock, but then drops pagecache_lock before deleting +the page from the scache). + +do_wp_page and do_swap_page have MP races in them while trying to figure +out whether a page is "shared", by looking at the page_count + swap_count. +To preserve the sum of the counts, the page lock _must_ be acquired before +calling is_page_shared (else processes might switch their swap_count refs +to the page count refs, after the page count ref has been snapshotted). + +Swap device deletion code currently breaks all the scache assumptions, +since it grabs neither mmap_sem nor page_table_lock. diff --git a/Documentation/vm/numa b/Documentation/vm/numa new file mode 100644 index 000000000000..4b8db1bd3b78 --- /dev/null +++ b/Documentation/vm/numa @@ -0,0 +1,41 @@ +Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com> + +The intent of this file is to have an uptodate, running commentary +from different people about NUMA specific code in the Linux vm. + +What is NUMA? It is an architecture where the memory access times +for different regions of memory from a given processor varies +according to the "distance" of the memory region from the processor. +Each region of memory to which access times are the same from any +cpu, is called a node. On such architectures, it is beneficial if +the kernel tries to minimize inter node communications. Schemes +for this range from kernel text and read-only data replication +across nodes, and trying to house all the data structures that +key components of the kernel need on memory on that node. + +Currently, all the numa support is to provide efficient handling +of widely discontiguous physical memory, so architectures which +are not NUMA but can have huge holes in the physical address space +can use the same code. All this code is bracketed by CONFIG_DISCONTIGMEM. + +The initial port includes NUMAizing the bootmem allocator code by +encapsulating all the pieces of information into a bootmem_data_t +structure. Node specific calls have been added to the allocator. +In theory, any platform which uses the bootmem allocator should +be able to to put the bootmem and mem_map data structures anywhere +it deems best. + +Each node's page allocation data structures have also been encapsulated +into a pg_data_t. The bootmem_data_t is just one part of this. To +make the code look uniform between NUMA and regular UMA platforms, +UMA platforms have a statically allocated pg_data_t too (contig_page_data). +For the sake of uniformity, the function num_online_nodes() is also defined +for all platforms. As we run benchmarks, we might decide to NUMAize +more variables like low_on_memory, nr_free_pages etc into the pg_data_t. + +The NUMA aware page allocation code currently tries to allocate pages +from different nodes in a round robin manner. This will be changed to +do concentratic circle search, starting from current node, once the +NUMA port achieves more maturity. The call alloc_pages_node has been +added, so that drivers can make the call and not worry about whether +it is running on a NUMA or UMA platform. diff --git a/Documentation/vm/overcommit-accounting b/Documentation/vm/overcommit-accounting new file mode 100644 index 000000000000..21c7b1f8f32b --- /dev/null +++ b/Documentation/vm/overcommit-accounting @@ -0,0 +1,73 @@ +The Linux kernel supports the following overcommit handling modes + +0 - Heuristic overcommit handling. Obvious overcommits of + address space are refused. Used for a typical system. It + ensures a seriously wild allocation fails while allowing + overcommit to reduce swap usage. root is allowed to + allocate slighly more memory in this mode. This is the + default. + +1 - Always overcommit. Appropriate for some scientific + applications. + +2 - Don't overcommit. The total address space commit + for the system is not permitted to exceed swap + a + configurable percentage (default is 50) of physical RAM. + Depending on the percentage you use, in most situations + this means a process will not be killed while accessing + pages but will receive errors on memory allocation as + appropriate. + +The overcommit policy is set via the sysctl `vm.overcommit_memory'. + +The overcommit percentage is set via `vm.overcommit_ratio'. + +The current overcommit limit and amount committed are viewable in +/proc/meminfo as CommitLimit and Committed_AS respectively. + +Gotchas +------- + +The C language stack growth does an implicit mremap. If you want absolute +guarantees and run close to the edge you MUST mmap your stack for the +largest size you think you will need. For typical stack usage this does +not matter much but it's a corner case if you really really care + +In mode 2 the MAP_NORESERVE flag is ignored. + + +How It Works +------------ + +The overcommit is based on the following rules + +For a file backed map + SHARED or READ-only - 0 cost (the file is the map not swap) + PRIVATE WRITABLE - size of mapping per instance + +For an anonymous or /dev/zero map + SHARED - size of mapping + PRIVATE READ-only - 0 cost (but of little use) + PRIVATE WRITABLE - size of mapping per instance + +Additional accounting + Pages made writable copies by mmap + shmfs memory drawn from the same pool + +Status +------ + +o We account mmap memory mappings +o We account mprotect changes in commit +o We account mremap changes in size +o We account brk +o We account munmap +o We report the commit status in /proc +o Account and check on fork +o Review stack handling/building on exec +o SHMfs accounting +o Implement actual limit enforcement + +To Do +----- +o Account ptrace pages (this is hard) diff --git a/Documentation/voyager.txt b/Documentation/voyager.txt new file mode 100644 index 000000000000..2749af552cdf --- /dev/null +++ b/Documentation/voyager.txt @@ -0,0 +1,95 @@ +Running Linux on the Voyager Architecture +========================================= + +For full details and current project status, see + +http://www.hansenpartnership.com/voyager + +The voyager architecture was designed by NCR in the mid 80s to be a +fully SMP capable RAS computing architecture built around intel's 486 +chip set. The voyager came in three levels of architectural +sophistication: 3,4 and 5 --- 1 and 2 never made it out of prototype. +The linux patches support only the Level 5 voyager architecture (any +machine class 3435 and above). + +The Voyager Architecture +------------------------ + +Voyager machines consist of a Baseboard with a 386 diagnostic +processor, a Power Supply Interface (PSI) a Primary and possibly +Secondary Microchannel bus and between 2 and 20 voyager slots. The +voyager slots can be populated with memory and cpu cards (up to 4GB +memory and from 1 486 to 32 Pentium Pro processors). Internally, the +voyager has a dual arbitrated system bus and a configuration and test +bus (CAT). The voyager bus speed is 40MHz. Therefore (since all +voyager cards are dual ported for each system bus) the maximum +transfer rate is 320Mb/s but only if you have your slot configuration +tuned (only memory cards can communicate with both busses at once, CPU +cards utilise them one at a time). + +Voyager SMP +----------- + +Since voyager was the first intel based SMP system, it is slightly +more primitive than the Intel IO-APIC approach to SMP. Voyager allows +arbitrary interrupt routing (including processor affinity routing) of +all 16 PC type interrupts. However it does this by using a modified +5259 master/slave chip set instead of an APIC bus. Additionally, +voyager supports Cross Processor Interrupts (CPI) equivalent to the +APIC IPIs. There are two routed voyager interrupt lines provided to +each slot. + +Processor Cards +--------------- + +These come in single, dyadic and quad configurations (the quads are +problematic--see later). The maximum configuration is 8 quad cards +for 32 way SMP. + +Quad Processors +--------------- + +Because voyager only supplies two interrupt lines to each Processor +card, the Quad processors have to be configured (and Bootstrapped) in +as a pair of Master/Slave processors. + +In fact, most Quad cards only accept one VIC interrupt line, so they +have one interrupt handling processor (called the VIC extended +processor) and three non-interrupt handling processors. + +Current Status +-------------- + +The System will boot on Mono, Dyad and Quad cards. There was +originally a Quad boot problem which has been fixed by proper gdt +alignment in the initial boot loader. If you still cannot get your +voyager system to boot, email me at: + +<J.E.J.Bottomley@HansenPartnership.com> + + +The Quad cards now support using the separate Quad CPI vectors instead +of going through the VIC mailbox system. + +The Level 4 architecture (3430 and 3360 Machines) should also work +fine. + +Dump Switch +----------- + +The voyager dump switch sends out a broadcast NMI which the voyager +code intercepts and does a task dump. + +Power Switch +------------ + +The front panel power switch is intercepted by the kernel and should +cause a system shutdown and power off. + +A Note About Mixed CPU Systems +------------------------------ + +Linux isn't designed to handle mixed CPU systems very well. In order +to get everything going you *must* make sure that your lowest +capability CPU is used for booting. Also, mixing CPU classes +(e.g. 486 and 586) is really not going to work very well at all. diff --git a/Documentation/w1/w1.generic b/Documentation/w1/w1.generic new file mode 100644 index 000000000000..eace3046a858 --- /dev/null +++ b/Documentation/w1/w1.generic @@ -0,0 +1,19 @@ +Any w1 device must be connected to w1 bus master device - for example +ds9490 usb device or w1-over-GPIO or RS232 converter. +Driver for w1 bus master must provide several functions(you can find +them in struct w1_bus_master definition in w1.h) which then will be +called by w1 core to send various commands over w1 bus(by default it is +reset and search commands). When some device is found on the bus, w1 core +checks if driver for it's family is loaded. +If driver is loaded w1 core creates new w1_slave object and registers it +in the system(creates some generic sysfs files(struct w1_family_ops in +w1_family.h), notifies any registered listener and so on...). +It is device driver's business to provide any communication method +upstream. +For example w1_therm driver(ds18?20 thermal sensor family driver) +provides temperature reading function which is bound to ->rbin() method +of the above w1_family_ops structure. +w1_smem - driver for simple 64bit memory cell provides ID reading +method. + +You can call above methods by reading appropriate sysfs files. diff --git a/Documentation/watchdog/pcwd-watchdog.txt b/Documentation/watchdog/pcwd-watchdog.txt new file mode 100644 index 000000000000..12187a33e310 --- /dev/null +++ b/Documentation/watchdog/pcwd-watchdog.txt @@ -0,0 +1,135 @@ + Berkshire Products PC Watchdog Card + Support for ISA Cards Revision A and C + Documentation and Driver by Ken Hollis <kenji@bitgate.com> + + The PC Watchdog is a card that offers the same type of functionality that + the WDT card does, only it doesn't require an IRQ to run. Furthermore, + the Revision C card allows you to monitor any IO Port to automatically + trigger the card into being reset. This way you can make the card + monitor hard drive status, or anything else you need. + + The Watchdog Driver has one basic role: to talk to the card and send + signals to it so it doesn't reset your computer ... at least during + normal operation. + + The Watchdog Driver will automatically find your watchdog card, and will + attach a running driver for use with that card. After the watchdog + drivers have initialized, you can then talk to the card using the PC + Watchdog program, available from http://ftp.bitgate.com/pcwd/. + + I suggest putting a "watchdog -d" before the beginning of an fsck, and + a "watchdog -e -t 1" immediately after the end of an fsck. (Remember + to run the program with an "&" to run it in the background!) + + If you want to write a program to be compatible with the PC Watchdog + driver, simply do the following: + +-- Snippet of code -- +/* + * Watchdog Driver Test Program + */ + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <linux/types.h> +#include <linux/watchdog.h> + +int fd; + +/* + * This function simply sends an IOCTL to the driver, which in turn ticks + * the PC Watchdog card to reset its internal timer so it doesn't trigger + * a computer reset. + */ +void keep_alive(void) +{ + int dummy; + + ioctl(fd, WDIOC_KEEPALIVE, &dummy); +} + +/* + * The main program. Run the program with "-d" to disable the card, + * or "-e" to enable the card. + */ +int main(int argc, char *argv[]) +{ + fd = open("/dev/watchdog", O_WRONLY); + + if (fd == -1) { + fprintf(stderr, "Watchdog device not enabled.\n"); + fflush(stderr); + exit(-1); + } + + if (argc > 1) { + if (!strncasecmp(argv[1], "-d", 2)) { + ioctl(fd, WDIOC_SETOPTIONS, WDIOS_DISABLECARD); + fprintf(stderr, "Watchdog card disabled.\n"); + fflush(stderr); + exit(0); + } else if (!strncasecmp(argv[1], "-e", 2)) { + ioctl(fd, WDIOC_SETOPTIONS, WDIOS_ENABLECARD); + fprintf(stderr, "Watchdog card enabled.\n"); + fflush(stderr); + exit(0); + } else { + fprintf(stderr, "-d to disable, -e to enable.\n"); + fprintf(stderr, "run by itself to tick the card.\n"); + fflush(stderr); + exit(0); + } + } else { + fprintf(stderr, "Watchdog Ticking Away!\n"); + fflush(stderr); + } + + while(1) { + keep_alive(); + sleep(1); + } +} +-- End snippet -- + + Other IOCTL functions include: + + WDIOC_GETSUPPORT + This returns the support of the card itself. This + returns in structure "PCWDS" which returns: + options = WDIOS_TEMPPANIC + (This card supports temperature) + firmware_version = xxxx + (Firmware version of the card) + + WDIOC_GETSTATUS + This returns the status of the card, with the bits of + WDIOF_* bitwise-anded into the value. (The comments + are in linux/pcwd.h) + + WDIOC_GETBOOTSTATUS + This returns the status of the card that was reported + at bootup. + + WDIOC_GETTEMP + This returns the temperature of the card. (You can also + read /dev/watchdog, which gives a temperature update + every second.) + + WDIOC_SETOPTIONS + This lets you set the options of the card. You can either + enable or disable the card this way. + + WDIOC_KEEPALIVE + This pings the card to tell it not to reset your computer. + + And that's all she wrote! + + -- Ken Hollis + (kenji@bitgate.com) + +(This documentation may be out of date. Check + http://ftp.bitgate.com/pcwd/ for the absolute latest additions.) diff --git a/Documentation/watchdog/watchdog-api.txt b/Documentation/watchdog/watchdog-api.txt new file mode 100644 index 000000000000..28388aa700c6 --- /dev/null +++ b/Documentation/watchdog/watchdog-api.txt @@ -0,0 +1,399 @@ +The Linux Watchdog driver API. + +Copyright 2002 Christer Weingel <wingel@nano-system.com> + +Some parts of this document are copied verbatim from the sbc60xxwdt +driver which is (c) Copyright 2000 Jakob Oestergaard <jakob@ostenfeld.dk> + +This document describes the state of the Linux 2.4.18 kernel. + +Introduction: + +A Watchdog Timer (WDT) is a hardware circuit that can reset the +computer system in case of a software fault. You probably knew that +already. + +Usually a userspace daemon will notify the kernel watchdog driver via the +/dev/watchdog special device file that userspace is still alive, at +regular intervals. When such a notification occurs, the driver will +usually tell the hardware watchdog that everything is in order, and +that the watchdog should wait for yet another little while to reset +the system. If userspace fails (RAM error, kernel bug, whatever), the +notifications cease to occur, and the hardware watchdog will reset the +system (causing a reboot) after the timeout occurs. + +The Linux watchdog API is a rather AD hoc construction and different +drivers implement different, and sometimes incompatible, parts of it. +This file is an attempt to document the existing usage and allow +future driver writers to use it as a reference. + +The simplest API: + +All drivers support the basic mode of operation, where the watchdog +activates as soon as /dev/watchdog is opened and will reboot unless +the watchdog is pinged within a certain time, this time is called the +timeout or margin. The simplest way to ping the watchdog is to write +some data to the device. So a very simple watchdog daemon would look +like this: + +int main(int argc, const char *argv[]) { + int fd=open("/dev/watchdog",O_WRONLY); + if (fd==-1) { + perror("watchdog"); + exit(1); + } + while(1) { + write(fd, "\0", 1); + sleep(10); + } +} + +A more advanced driver could for example check that a HTTP server is +still responding before doing the write call to ping the watchdog. + +When the device is closed, the watchdog is disabled. This is not +always such a good idea, since if there is a bug in the watchdog +daemon and it crashes the system will not reboot. Because of this, +some of the drivers support the configuration option "Disable watchdog +shutdown on close", CONFIG_WATCHDOG_NOWAYOUT. If it is set to Y when +compiling the kernel, there is no way of disabling the watchdog once +it has been started. So, if the watchdog dameon crashes, the system +will reboot after the timeout has passed. + +Some other drivers will not disable the watchdog, unless a specific +magic character 'V' has been sent /dev/watchdog just before closing +the file. If the userspace daemon closes the file without sending +this special character, the driver will assume that the daemon (and +userspace in general) died, and will stop pinging the watchdog without +disabling it first. This will then cause a reboot. + +The ioctl API: + +All conforming drivers also support an ioctl API. + +Pinging the watchdog using an ioctl: + +All drivers that have an ioctl interface support at least one ioctl, +KEEPALIVE. This ioctl does exactly the same thing as a write to the +watchdog device, so the main loop in the above program could be +replaced with: + + while (1) { + ioctl(fd, WDIOC_KEEPALIVE, 0); + sleep(10); + } + +the argument to the ioctl is ignored. + +Setting and getting the timeout: + +For some drivers it is possible to modify the watchdog timeout on the +fly with the SETTIMEOUT ioctl, those drivers have the WDIOF_SETTIMEOUT +flag set in their option field. The argument is an integer +representing the timeout in seconds. The driver returns the real +timeout used in the same variable, and this timeout might differ from +the requested one due to limitation of the hardware. + + int timeout = 45; + ioctl(fd, WDIOC_SETTIMEOUT, &timeout); + printf("The timeout was set to %d seconds\n", timeout); + +This example might actually print "The timeout was set to 60 seconds" +if the device has a granularity of minutes for its timeout. + +Starting with the Linux 2.4.18 kernel, it is possible to query the +current timeout using the GETTIMEOUT ioctl. + + ioctl(fd, WDIOC_GETTIMEOUT, &timeout); + printf("The timeout was is %d seconds\n", timeout); + +Envinronmental monitoring: + +All watchdog drivers are required return more information about the system, +some do temperature, fan and power level monitoring, some can tell you +the reason for the last reboot of the system. The GETSUPPORT ioctl is +available to ask what the device can do: + + struct watchdog_info ident; + ioctl(fd, WDIOC_GETSUPPORT, &ident); + +the fields returned in the ident struct are: + + identity a string identifying the watchdog driver + firmware_version the firmware version of the card if available + options a flags describing what the device supports + +the options field can have the following bits set, and describes what +kind of information that the GET_STATUS and GET_BOOT_STATUS ioctls can +return. [FIXME -- Is this correct?] + + WDIOF_OVERHEAT Reset due to CPU overheat + +The machine was last rebooted by the watchdog because the thermal limit was +exceeded + + WDIOF_FANFAULT Fan failed + +A system fan monitored by the watchdog card has failed + + WDIOF_EXTERN1 External relay 1 + +External monitoring relay/source 1 was triggered. Controllers intended for +real world applications include external monitoring pins that will trigger +a reset. + + WDIOF_EXTERN2 External relay 2 + +External monitoring relay/source 2 was triggered + + WDIOF_POWERUNDER Power bad/power fault + +The machine is showing an undervoltage status + + WDIOF_CARDRESET Card previously reset the CPU + +The last reboot was caused by the watchdog card + + WDIOF_POWEROVER Power over voltage + +The machine is showing an overvoltage status. Note that if one level is +under and one over both bits will be set - this may seem odd but makes +sense. + + WDIOF_KEEPALIVEPING Keep alive ping reply + +The watchdog saw a keepalive ping since it was last queried. + + WDIOF_SETTIMEOUT Can set/get the timeout + + +For those drivers that return any bits set in the option field, the +GETSTATUS and GETBOOTSTATUS ioctls can be used to ask for the current +status, and the status at the last reboot, respectively. + + int flags; + ioctl(fd, WDIOC_GETSTATUS, &flags); + + or + + ioctl(fd, WDIOC_GETBOOTSTATUS, &flags); + +Note that not all devices support these two calls, and some only +support the GETBOOTSTATUS call. + +Some drivers can measure the temperature using the GETTEMP ioctl. The +returned value is the temperature in degrees farenheit. + + int temperature; + ioctl(fd, WDIOC_GETTEMP, &temperature); + +Finally the SETOPTIONS ioctl can be used to control some aspects of +the cards operation; right now the pcwd driver is the only one +supporting thiss ioctl. + + int options = 0; + ioctl(fd, WDIOC_SETOPTIONS, options); + +The following options are available: + + WDIOS_DISABLECARD Turn off the watchdog timer + WDIOS_ENABLECARD Turn on the watchdog timer + WDIOS_TEMPPANIC Kernel panic on temperature trip + +[FIXME -- better explanations] + +Implementations in the current drivers in the kernel tree: + +Here I have tried to summarize what the different drivers support and +where they do strange things compared to the other drivers. + +acquirewdt.c -- Acquire Single Board Computer + + This driver has a hardcoded timeout of 1 minute + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns KEEPALIVEPING. GETSTATUS will return 1 if + the device is open, 0 if not. [FIXME -- isn't this rather + silly? To be able to use the ioctl, the device must be open + and so GETSTATUS will always return 1]. + +advantechwdt.c -- Advantech Single Board Computer + + Timeout that defaults to 60 seconds, supports SETTIMEOUT. + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT. + The GETSTATUS call returns if the device is open or not. + [FIXME -- silliness again?] + +eurotechwdt.c -- Eurotech CPU-1220/1410 + + The timeout can be set using the SETTIMEOUT ioctl and defaults + to 60 seconds. + + Also has a module parameter "ev", event type which controls + what should happen on a timeout, the string "int" or anything + else that causes a reboot. [FIXME -- better description] + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns CARDRESET and WDIOF_SETTIMEOUT but + GETSTATUS is not supported and GETBOOTSTATUS just returns 0. + +i810-tco.c -- Intel 810 chipset + + Also has support for a lot of other i8x0 stuff, but the + watchdog is one of the things. + + The timeout is set using the module parameter "i810_margin", + which is in steps of 0.6 seconds where 2<i810_margin<64. The + driver supports the SETTIMEOUT ioctl. + + Supports CONFIG_WATCHDOG_NOWAYOUT. + + GETSUPPORT returns WDIOF_SETTIMEOUT. The GETSTATUS call + returns some kind of timer value which ist not compatible with + the other drivers. GETBOOT status returns some kind of + hardware specific boot status. [FIXME -- describe this] + +ib700wdt.c -- IB700 Single Board Computer + + Default timeout of 30 seconds and the timeout is settable + using the SETTIMEOUT ioctl. Note that only a few timeout + values are supported. + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT. + The GETSTATUS call returns if the device is open or not. + [FIXME -- silliness again?] + +machzwd.c -- MachZ ZF-Logic + + Hardcoded timeout of 10 seconds + + Has a module parameter "action" that controls what happens + when the timeout runs out which can be 0 = RESET (default), + 1 = SMI, 2 = NMI, 3 = SCI. + + Supports CONFIG_WATCHDOG_NOWAYOUT and the magic character + 'V' close handling. + + GETSUPPORT returns WDIOF_KEEPALIVEPING, and the GETSTATUS call + returns if the device is open or not. [FIXME -- silliness + again?] + +mixcomwd.c -- MixCom Watchdog + + [FIXME -- I'm unable to tell what the timeout is] + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns WDIOF_KEEPALIVEPING, GETSTATUS returns if + the device is opened or not [FIXME -- I'm not really sure how + this works, there seems to be some magic connected to + CONFIG_WATCHDOG_NOWAYOUT] + +pcwd.c -- Berkshire PC Watchdog + + Hardcoded timeout of 1.5 seconds + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns WDIOF_OVERHEAT|WDIOF_CARDRESET and both + GETSTATUS and GETBOOTSTATUS return something useful. + + The SETOPTIONS call can be used to enable and disable the card + and to ask the driver to call panic if the system overheats. + +sbc60xxwdt.c -- 60xx Single Board Computer + + Hardcoded timeout of 10 seconds + + Does not support CONFIG_WATCHDOG_NOWAYOUT, but has the magic + character 'V' close handling. + + No bits set in GETSUPPORT + +scx200.c -- National SCx200 CPUs + + Not in the kernel yet. + + The timeout is set using a module parameter "margin" which + defaults to 60 seconds. The timeout can also be set using + SETTIMEOUT and read using GETTIMEOUT. + + Supports a module parameter "nowayout" that is initialized + with the value of CONFIG_WATCHDOG_NOWAYOUT. Also supports the + magic character 'V' handling. + +shwdt.c -- SuperH 3/4 processors + + [FIXME -- I'm unable to tell what the timeout is] + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns WDIOF_KEEPALIVEPING, and the GETSTATUS call + returns if the device is open or not. [FIXME -- silliness + again?] + +softdog.c -- Software watchdog + + The timeout is set with the module parameter "soft_margin" + which defaults to 60 seconds, the timeout is also settable + using the SETTIMEOUT ioctl. + + Supports CONFIG_WATCHDOG_NOWAYOUT + + WDIOF_SETTIMEOUT bit set in GETSUPPORT + +w83877f_wdt.c -- W83877F Computer + + Hardcoded timeout of 30 seconds + + Does not support CONFIG_WATCHDOG_NOWAYOUT, but has the magic + character 'V' close handling. + + No bits set in GETSUPPORT + +w83627hf_wdt.c -- w83627hf watchdog + + Timeout that defaults to 60 seconds, supports SETTIMEOUT. + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT. + The GETSTATUS call returns if the device is open or not. + +wdt.c -- ICS WDT500/501 ISA and +wdt_pci.c -- ICS WDT500/501 PCI + + Default timeout of 60 seconds. The timeout is also settable + using the SETTIMEOUT ioctl. + + Supports CONFIG_WATCHDOG_NOWAYOUT + + GETSUPPORT returns with bits set depending on the actual + card. The WDT501 supports a lot of external monitoring, the + WDT500 much less. + +wdt285.c -- Footbridge watchdog + + The timeout is set with the module parameter "soft_margin" + which defaults to 60 seconds. The timeout is also settable + using the SETTIMEOUT ioctl. + + Does not support CONFIG_WATCHDOG_NOWAYOUT + + WDIOF_SETTIMEOUT bit set in GETSUPPORT + +wdt977.c -- Netwinder W83977AF chip + + Hardcoded timeout of 3 minutes + + Supports CONFIG_WATCHDOG_NOWAYOUT + + Does not support any ioctls at all. + diff --git a/Documentation/watchdog/watchdog.txt b/Documentation/watchdog/watchdog.txt new file mode 100644 index 000000000000..dffda29c8799 --- /dev/null +++ b/Documentation/watchdog/watchdog.txt @@ -0,0 +1,115 @@ + Watchdog Timer Interfaces For The Linux Operating System + + Alan Cox <alan@lxorguk.ukuu.org.uk> + + Custom Linux Driver And Program Development + + +The following watchdog drivers are currently implemented: + + ICS WDT501-P + ICS WDT501-P (no fan tachometer) + ICS WDT500-P + Software Only + SA1100 Internal Watchdog + Berkshire Products PC Watchdog Revision A & C (by Ken Hollis) + + +All six interfaces provide /dev/watchdog, which when open must be written +to within a timeout or the machine will reboot. Each write delays the reboot +time another timeout. In the case of the software watchdog the ability to +reboot will depend on the state of the machines and interrupts. The hardware +boards physically pull the machine down off their own onboard timers and +will reboot from almost anything. + +A second temperature monitoring interface is available on the WDT501P cards +and some Berkshire cards. This provides /dev/temperature. This is the machine +internal temperature in degrees Fahrenheit. Each read returns a single byte +giving the temperature. + +The third interface logs kernel messages on additional alert events. + +Both software and hardware watchdog drivers are available in the standard +kernel. If you are using the software watchdog, you probably also want +to use "panic=60" as a boot argument as well. + +The wdt card cannot be safely probed for. Instead you need to pass +wdt=ioaddr,irq as a boot parameter - eg "wdt=0x240,11". + +The SA1100 watchdog module can be configured with the "sa1100_margin" +commandline argument which specifies timeout value in seconds. + +The i810 TCO watchdog modules can be configured with the "i810_margin" +commandline argument which specifies the counter initial value. The counter +is decremented every 0.6 seconds and default to 50 (30 seconds). Values can +range between 3 and 63. + +The i810 TCO watchdog driver also implements the WDIOC_GETSTATUS and +WDIOC_GETBOOTSTATUS ioctl()s. WDIOC_GETSTATUS returns the actual counter value +and WDIOC_GETBOOTSTATUS returns the value of TCO2 Status Register (see Intel's +documentation for the 82801AA and 82801AB datasheet). + +Features +-------- + WDT501P WDT500P Software Berkshire i810 TCO SA1100WD +Reboot Timer X X X X X X +External Reboot X X o o o X +I/O Port Monitor o o o X o o +Temperature X o o X o o +Fan Speed X o o o o o +Power Under X o o o o o +Power Over X o o o o o +Overheat X o o o o o + +The external event interfaces on the WDT boards are not currently supported. +Minor numbers are however allocated for it. + + +Example Watchdog Driver +----------------------- + +#include <stdio.h> +#include <unistd.h> +#include <fcntl.h> + +int main(int argc, const char *argv[]) +{ + int fd=open("/dev/watchdog",O_WRONLY); + if(fd==-1) + { + perror("watchdog"); + exit(1); + } + while(1) + { + write(fd,"\0",1); + fsync(fd); + sleep(10); + } +} + + +Contact Information + +People keep asking about the WDT watchdog timer hardware: The phone contacts +for Industrial Computer Source are: + +Industrial Computer Source +http://www.indcompsrc.com +ICS Advent, San Diego +6260 Sequence Dr. +San Diego, CA 92121-4371 +Phone (858) 677-0877 +FAX: (858) 677-0895 +> +ICS Advent Europe, UK +Oving Road +Chichester, +West Sussex, +PO19 4ET, UK +Phone: 00.44.1243.533900 + + +and please mention Linux when enquiring. + +For full information about the PCWD cards see the pcwd-watchdog.txt document. diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt new file mode 100644 index 000000000000..44b6eea60ece --- /dev/null +++ b/Documentation/x86_64/boot-options.txt @@ -0,0 +1,180 @@ +AMD64 specific boot options + +There are many others (usually documented in driver documentation), but +only the AMD64 specific ones are listed here. + +Machine check + + mce=off disable machine check + + nomce (for compatibility with i386): same as mce=off + + Everything else is in sysfs now. + +APICs + + apic Use IO-APIC. Default + + noapic Don't use the IO-APIC. + + disableapic Don't use the local APIC + + nolapic Don't use the local APIC (alias for i386 compatibility) + + pirq=... See Documentation/i386/IO-APIC.txt + + noapictimer Don't set up the APIC timer + +Early Console + + syntax: earlyprintk=vga + earlyprintk=serial[,ttySn[,baudrate]] + + The early console is useful when the kernel crashes before the + normal console is initialized. It is not enabled by + default because it has some cosmetic problems. + Append ,keep to not disable it when the real console takes over. + Only vga or serial at a time, not both. + Currently only ttyS0 and ttyS1 are supported. + Interaction with the standard serial driver is not very good. + The VGA output is eventually overwritten by the real console. + +Timing + + notsc + Don't use the CPU time stamp counter to read the wall time. + This can be used to work around timing problems on multiprocessor systems + with not properly synchronized CPUs. Only useful with a SMP kernel + + report_lost_ticks + Report when timer interrupts are lost because some code turned off + interrupts for too long. + + nmi_watchdog=NUMBER[,panic] + NUMBER can be: + 0 don't use an NMI watchdog + 1 use the IO-APIC timer for the NMI watchdog + 2 use the local APIC for the NMI watchdog using a performance counter. Note + This will use one performance counter and the local APIC's performance + vector. + When panic is specified panic when an NMI watchdog timeout occurs. + This is useful when you use a panic=... timeout and need the box + quickly up again. + + nohpet + Don't use the HPET timer. + +Idle loop + + idle=poll + Don't do power saving in the idle loop using HLT, but poll for rescheduling + event. This will make the CPUs eat a lot more power, but may be useful + to get slightly better performance in multiprocessor benchmarks. It also + makes some profiling using performance counters more accurate. + +Rebooting + + reboot=b[ios] | t[riple] | k[bd] [, [w]arm | [c]old] + bios Use the CPU reboto vector for warm reset + warm Don't set the cold reboot flag + cold Set the cold reboot flag + triple Force a triple fault (init) + kbd Use the keyboard controller. cold reset (default) + + Using warm reset will be much faster especially on big memory + systems because the BIOS will not go through the memory check. + Disadvantage is that not all hardware will be completely reinitialized + on reboot so there may be boot problems on some systems. + + reboot=force + + Don't stop other CPUs on reboot. This can make reboot more reliable + in some cases. + +Non Executable Mappings + + noexec=on|off + + on Enable(default) + off Disable + +SMP + + nosmp Only use a single CPU + + maxcpus=NUMBER only use upto NUMBER CPUs + + cpumask=MASK only use cpus with bits set in mask + +NUMA + + numa=off Only set up a single NUMA node spanning all memory. + + numa=noacpi Don't parse the SRAT table for NUMA setup + + numa=fake=X Fake X nodes and ignore NUMA setup of the actual machine. + +ACPI + + acpi=off Don't enable ACPI + acpi=ht Use ACPI boot table parsing, but don't enable ACPI + interpreter + acpi=force Force ACPI on (currently not needed) + + acpi=strict Disable out of spec ACPI workarounds. + + acpi_sci={edge,level,high,low} Set up ACPI SCI interrupt. + + acpi=noirq Don't route interrupts + +PCI + + pci=off Don't use PCI + pci=conf1 Use conf1 access. + pci=conf2 Use conf2 access. + pci=rom Assign ROMs. + pci=assign-busses Assign busses + pci=irqmask=MASK Set PCI interrupt mask to MASK + pci=lastbus=NUMBER Scan upto NUMBER busses, no matter what the mptable says. + pci=noacpi Don't use ACPI to set up PCI interrupt routing. + +IOMMU + + iommu=[size][,noagp][,off][,force][,noforce][,leak][,memaper[=order]][,merge] + [,forcesac][,fullflush][,nomerge][,noaperture] + size set size of iommu (in bytes) + noagp don't initialize the AGP driver and use full aperture. + off don't use the IOMMU + leak turn on simple iommu leak tracing (only when CONFIG_IOMMU_LEAK is on) + memaper[=order] allocate an own aperture over RAM with size 32MB^order. + noforce don't force IOMMU usage. Default. + force Force IOMMU. + merge Do SG merging. Implies force (experimental) + nomerge Don't do SG merging. + forcesac For SAC mode for masks <40bits (experimental) + fullflush Flush IOMMU on each allocation (default) + nofullflush Don't use IOMMU fullflush + allowed overwrite iommu off workarounds for specific chipsets. + soft Use software bounce buffering (default for Intel machines) + noaperture Don't touch the aperture for AGP. + + swiotlb=pages[,force] + + pages Prereserve that many 128K pages for the software IO bounce buffering. + force Force all IO through the software TLB. + +Debugging + + oops=panic Always panic on oopses. Default is to just kill the process, + but there is a small probability of deadlocking the machine. + This will also cause panics on machine check exceptions. + Useful together with panic=30 to trigger a reboot. + + kstack=N Print that many words from the kernel stack in oops dumps. + +Misc + + noreplacement Don't replace instructions with more appropiate ones + for the CPU. This may be useful on asymmetric MP systems + where some CPU have less capabilities than the others. + diff --git a/Documentation/x86_64/mm.txt b/Documentation/x86_64/mm.txt new file mode 100644 index 000000000000..662b73971a67 --- /dev/null +++ b/Documentation/x86_64/mm.txt @@ -0,0 +1,24 @@ + +<previous description obsolete, deleted> + +Virtual memory map with 4 level page tables: + +0000000000000000 - 00007fffffffffff (=47bits) user space, different per mm +hole caused by [48:63] sign extension +ffff800000000000 - ffff80ffffffffff (=40bits) guard hole +ffff810000000000 - ffffc0ffffffffff (=46bits) direct mapping of phys. memory +ffffc10000000000 - ffffc1ffffffffff (=40bits) hole +ffffc20000000000 - ffffe1ffffffffff (=45bits) vmalloc/ioremap space +... unused hole ... +ffffffff80000000 - ffffffff82800000 (=40MB) kernel text mapping, from phys 0 +... unused hole ... +ffffffff88000000 - fffffffffff00000 (=1919MB) module mapping space + +vmalloc space is lazily synchronized into the different PML4 pages of +the processes using the page fault handler, with init_level4_pgt as +reference. + +Current X86-64 implementations only support 40 bit of address space, +but we support upto 46bits. This expands into MBZ space in the page tables. + +-Andi Kleen, Jul 2004 diff --git a/Documentation/xterm-linux.xpm b/Documentation/xterm-linux.xpm new file mode 100644 index 000000000000..f469c1a18e6e --- /dev/null +++ b/Documentation/xterm-linux.xpm @@ -0,0 +1,61 @@ +/* XPM */ +/*****************************************************************************/ +/** This pixmap was made by Torsten Poulin - 1996 - torsten@diku.dk **/ +/** It was made by combining xterm-blank.xpm with **/ +/** the wonderfully cute Linux penguin mascot by Larry Ewing. **/ +/** I had to change Larry's penguin a little to make it fit. **/ +/** xterm-blank.xpm contained the following comment: **/ +/** This pixmap is kindly offered by Ion Cionca - 1992 - **/ +/** Swiss Federal Institute of Technology **/ +/** Central Computing Service **/ +/*****************************************************************************/ +static char * image_name [] = { +/**/ +"64 38 8 1", +/**/ +" s mask c none", +". c gray70", +"X c gray85", +"o c gray50", +"O c yellow", +"+ c darkolivegreen", +"@ c white", +"# c black", +" ###### ", +" ######## ", +" ########## ........................... ", +" ########### .XXXXXXXXXXXXXXXXXXXXXXXXXXX. ", +" ########### .XXXXXXXXXXXXXXXXXXXXXXXXXXXXXoo ", +" #@@@#@@@### .XX+++++++++++++++++++++++XXXXoo ", +" #@#@#@#@### .XX++++++++++++++++++++++++XXXooo ", +" #@#####@### .XX++@@+@++@+@@@@++@+++++++XXXooo ", +" ###OOO######.XX++++++++++++++++++++++++XXXoooo ", +" ##OOOOOO####.XX++@@@@+@@+@@@+++++++++++XXXoooo ", +" #O#OOO#O####.XX++++++++++++++++++++++++XXXooooo ", +" ##O###OO####.XX++@@@@@@@@@@+@@@@@++++++XXXooooo ", +" ###OOOO@#####XX++++++++++++++++++++++++XXXooooo ", +" ##@###@@@@####XX++@@@+@@@@+@@++@@@++++++XXXooooo ", +" #@@@@@@@@@@####X++++++++++++++++++++++++XXXooooo ", +" ##@@@@@@@@@@#####++@+++++++++++++++++++++XXXooooo ", +" ###@@@@@@@@@@######+++++++++++++++++++++++XXXooooo ", +" ####@@@@@@@@@@@#####+@@@@+@+@@@+@++++++++++XXXooooo ", +" ###@@@@@@@@@@@@######++++++++++++++++++++++XXXooooo ", +" ##@@@@@@@@@@@@@@#####@+@@@@++++++++++++++++XXXooooo ", +" ###@@@@@@@@@@@@@@######++++++++++++++++++++XXXXoooo ", +" ###@@@@@@@@@@@@@@######XXXXXXXXXXXXXXXXXXXXXXXXooo ", +" ###@@@@@@@@@@@@@@@######XXXXXXXXXXXXXXXXXXXXXXXooo ", +" ###@@@@@@@@@@@@@@@@#####ooooooooooooooooooooooo...oo ", +" ###@@@@@@@@@@@@@@@######.........................ooo ", +" #OO##@@@@@@@@@@@@@#######oooooooooooooooooooooooooooo ", +" #OOO##@@@@@@@@@@@#OO####O#XXXXXXXXXXXXXXXXXXXXXXXoooo.. .. ", +" ###OOOOO##@@@@@@@@@@#OOO#OOO#XXXXXXXXXXXXXX#######XXoooo . .", +" #OOOOOOOO###@@@@@@@@@#OOOOOOO#ooooooooooooooooooooXXXooo . ", +" #OOOOOOOOO###@@@@@@@@@#OOOOOOO##XXXXXXXXXXXXXXXXXooooo . ", +" #OOOOOOOOO#@@@@@@@@###OOOOOOOOO#XXXXXXXXXXXXXXXoo oooooo ", +" #OOOOOOOOO#@@@@@@@####OOOOOOOO#@@@@@@@@@@@XXXXXoo ooooo...o ", +" #OOOOOOOOOOO###########OOOOOO##XXXXXXXXXXXXXXXXoo ooXXXoo..o ", +" ##OOOOOOOOO###########OOOO##@@@@@@@@@@@@@XXXXoo oXXXXX..o ", +" ###OOOO### oXX##OOO#XXXXXXXXXXXXXXXXXXoo o.....oo ", +" #### oooo####oooooooooooooooooooo ooooooo ", +" ", +" "}; diff --git a/Documentation/zorro.txt b/Documentation/zorro.txt new file mode 100644 index 000000000000..d5829d14774a --- /dev/null +++ b/Documentation/zorro.txt @@ -0,0 +1,102 @@ + Writing Device Drivers for Zorro Devices + ---------------------------------------- + +Written by Geert Uytterhoeven <geert@linux-m68k.org> +Last revised: September 5, 2003 + + +1. Introduction +--------------- + +The Zorro bus is the bus used in the Amiga family of computers. Thanks to +AutoConfig(tm), it's 100% Plug-and-Play. + +There are two types of Zorro busses, Zorro II and Zorro III: + + - The Zorro II address space is 24-bit and lies within the first 16 MB of the + Amiga's address map. + + - Zorro III is a 32-bit extension of Zorro II, which is backwards compatible + with Zorro II. The Zorro III address space lies outside the first 16 MB. + + +2. Probing for Zorro Devices +---------------------------- + +Zorro devices are found by calling `zorro_find_device()', which returns a +pointer to the `next' Zorro device with the specified Zorro ID. A probe loop +for the board with Zorro ID `ZORRO_PROD_xxx' looks like: + + struct zorro_dev *z = NULL; + + while ((z = zorro_find_device(ZORRO_PROD_xxx, z))) { + if (!zorro_request_region(z->resource.start+MY_START, MY_SIZE, + "My explanation")) + ... + } + +`ZORRO_WILDCARD' acts as a wildcard and finds any Zorro device. If your driver +supports different types of boards, you can use a construct like: + + struct zorro_dev *z = NULL; + + while ((z = zorro_find_device(ZORRO_WILDCARD, z))) { + if (z->id != ZORRO_PROD_xxx1 && z->id != ZORRO_PROD_xxx2 && ...) + continue; + if (!zorro_request_region(z->resource.start+MY_START, MY_SIZE, + "My explanation")) + ... + } + + +3. Zorro Resources +------------------ + +Before you can access a Zorro device's registers, you have to make sure it's +not yet in use. This is done using the I/O memory space resource management +functions: + + request_mem_region() + release_mem_region() + +Shortcuts to claim the whole device's address space are provided as well: + + zorro_request_device + zorro_release_device + + +4. Accessing the Zorro Address Space +------------------------------------ + +The address regions in the Zorro device resources are Zorro bus address +regions. Due to the identity bus-physical address mapping on the Zorro bus, +they are CPU physical addresses as well. + +The treatment of these regions depends on the type of Zorro space: + + - Zorro II address space is always mapped and does not have to be mapped + explicitly using z_ioremap(). + + Conversion from bus/physical Zorro II addresses to kernel virtual addresses + and vice versa is done using: + + virt_addr = ZTWO_VADDR(bus_addr); + bus_addr = ZTWO_PADDR(virt_addr); + + - Zorro III address space must be mapped explicitly using z_ioremap() first + before it can be accessed: + + virt_addr = z_ioremap(bus_addr, size); + ... + z_iounmap(virt_addr); + + +5. References +------------- + +linux/include/linux/zorro.h +linux/include/asm-{m68k,ppc}/zorro.h +linux/include/linux/zorro_ids.h +linux/drivers/zorro +/proc/bus/zorro + |