<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/compat.h, branch v2.6.38.5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>compat: Make compat_alloc_user_space() incorporate the access_ok()</title>
<updated>2010-09-14T23:08:45+00:00</updated>
<author>
<name>H. Peter Anvin</name>
<email>hpa@linux.intel.com</email>
</author>
<published>2010-09-07T23:16:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c41d68a513c71e35a14f66d71782d27a79a81ea6'/>
<id>c41d68a513c71e35a14f66d71782d27a79a81ea6</id>
<content type='text'>
compat_alloc_user_space() expects the caller to independently call
access_ok() to verify the returned area.  A missing call could
introduce problems on some architectures.

This patch incorporates the access_ok() check into
compat_alloc_user_space() and also adds a sanity check on the length.
The existing compat_alloc_user_space() implementations are renamed
arch_compat_alloc_user_space() and are used as part of the
implementation of the new global function.

This patch assumes NULL will cause __get_user()/__put_user() to either
fail or access userspace on all architectures.  This should be
followed by checking the return value of compat_access_user_space()
for NULL in the callers, at which time the access_ok() in the callers
can also be removed.

Reported-by: Ben Hawkes &lt;hawkes@sota.gen.nz&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Acked-by: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: James Bottomley &lt;jejb@parisc-linux.org&gt;
Cc: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: &lt;stable@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
compat_alloc_user_space() expects the caller to independently call
access_ok() to verify the returned area.  A missing call could
introduce problems on some architectures.

This patch incorporates the access_ok() check into
compat_alloc_user_space() and also adds a sanity check on the length.
The existing compat_alloc_user_space() implementations are renamed
arch_compat_alloc_user_space() and are used as part of the
implementation of the new global function.

This patch assumes NULL will cause __get_user()/__put_user() to either
fail or access userspace on all architectures.  This should be
followed by checking the return value of compat_access_user_space()
for NULL in the callers, at which time the access_ok() in the callers
can also be removed.

Reported-by: Ben Hawkes &lt;hawkes@sota.gen.nz&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Acked-by: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: James Bottomley &lt;jejb@parisc-linux.org&gt;
Cc: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: &lt;stable@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Mark arguments to certain syscalls as being const</title>
<updated>2010-08-13T23:53:13+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2010-08-11T10:26:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c7887325230aec47d47a32562a6e26014a0fafca'/>
<id>c7887325230aec47d47a32562a6e26014a0fafca</id>
<content type='text'>
Mark arguments to certain system calls as being const where they should be but
aren't.  The list includes:

 (*) The filename arguments of various stat syscalls, execve(), various utimes
     syscalls and some mount syscalls.

 (*) The filename arguments of some syscall helpers relating to the above.

 (*) The buffer argument of various write syscalls.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mark arguments to certain system calls as being const where they should be but
aren't.  The list includes:

 (*) The filename arguments of various stat syscalls, execve(), various utimes
     syscalls and some mount syscalls.

 (*) The filename arguments of some syscall helpers relating to the above.

 (*) The buffer argument of various write syscalls.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>compat: factor out compat_rw_copy_check_uvector from compat_do_readv_writev</title>
<updated>2010-05-27T16:12:53+00:00</updated>
<author>
<name>Jeff Moyer</name>
<email>jmoyer@redhat.com</email>
</author>
<published>2010-05-26T21:44:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b83733639a494d5f42fa00a2506563fbd2d3015d'/>
<id>b83733639a494d5f42fa00a2506563fbd2d3015d</id>
<content type='text'>
It was reported in http://lkml.org/lkml/2010/3/8/309 that 32 bit readv and
writev AIO operations were not functioning properly.  It turns out that
the code to convert the 32bit io vectors to 64 bits was never written.
The results of that can be pretty bad, but in my testing, it mostly ended
up in generating EFAULT as we walked off the list of I/O vectors provided.

This patch set fixes the problem in my environment.  are greatly
appreciated.

This patch:

Factor out code that will be used by both compat_do_readv_writev and the
compat aio submission code paths.

Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Reported-by: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Cc: Zach Brown &lt;zach.brown@oracle.com&gt;
Cc: &lt;stable@kernel.org&gt;		[2.6.35.1]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It was reported in http://lkml.org/lkml/2010/3/8/309 that 32 bit readv and
writev AIO operations were not functioning properly.  It turns out that
the code to convert the 32bit io vectors to 64 bits was never written.
The results of that can be pretty bad, but in my testing, it mostly ended
up in generating EFAULT as we walked off the list of I/O vectors provided.

This patch set fixes the problem in my environment.  are greatly
appreciated.

This patch:

Factor out code that will be used by both compat_do_readv_writev and the
compat aio submission code paths.

Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Reported-by: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Cc: Zach Brown &lt;zach.brown@oracle.com&gt;
Cc: &lt;stable@kernel.org&gt;		[2.6.35.1]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add generic sys_old_select()</title>
<updated>2010-03-12T23:52:32+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2010-03-10T23:21:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5d0e52830e9ae09b872567f4aca3dfb5b5918079'/>
<id>5d0e52830e9ae09b872567f4aca3dfb5b5918079</id>
<content type='text'>
Add a generic implementation of the old select() syscall, which expects
its argument in a memory block and switch all architectures over to use
it.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Hirokazu Takata &lt;takata@linux-m32r.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Reviewed-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Acked-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Acked-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
Acked-by: Greg Ungerer &lt;gerg@uclinux.org&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a generic implementation of the old select() syscall, which expects
its argument in a memory block and switch all architectures over to use
it.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Hirokazu Takata &lt;takata@linux-m32r.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Reviewed-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Acked-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Acked-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
Acked-by: Greg Ungerer &lt;gerg@uclinux.org&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/compat_ioctl: support SIOCWANDEV</title>
<updated>2009-11-09T04:57:03+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2009-11-09T04:57:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7a50a240c495478179f01c9df4bd75e39cff79c7'/>
<id>7a50a240c495478179f01c9df4bd75e39cff79c7</id>
<content type='text'>
This adds compat_ioctl support for SIOCWANDEV, which has
always been missing.

The definition of struct compat_ifreq was missing an
ifru_settings fields that is needed to support SIOCWANDEV,
so add that and clean up the whitespace damage in the
struct definition.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds compat_ioctl support for SIOCWANDEV, which has
always been missing.

The definition of struct compat_ifreq was missing an
ifru_settings fields that is needed to support SIOCWANDEV,
so add that and clean up the whitespace damage in the
struct definition.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: compat: No need to define IFHWADDRLEN and IFNAMSIZ twice.</title>
<updated>2009-11-07T07:11:35+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2009-11-07T04:46:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b622d97a63ad4ce890b625c62acd1bb894592e63'/>
<id>b622d97a63ad4ce890b625c62acd1bb894592e63</id>
<content type='text'>
It's defined colloqually in linux/if.h and linux/compat.h
includes that.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's defined colloqually in linux/if.h and linux/compat.h
includes that.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>compat: add struct compat_ifreq etc to compat.h</title>
<updated>2009-11-07T04:43:57+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2009-11-06T08:09:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2dceba14ef0e62738d58777a1bd4018130d47a74'/>
<id>2dceba14ef0e62738d58777a1bd4018130d47a74</id>
<content type='text'>
In order to move socket ioctl conversion code into multiple
places in the socket code, we need a common defintion of
the data structures it uses.

Also change the name from ifreq32 to compat_ifreq to
follow the naming convention for compat.h

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to move socket ioctl conversion code into multiple
places in the socket code, we need a common defintion of
the data structures it uses.

Also change the name from ifreq32 to compat_ifreq to
follow the naming convention for compat.h

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>signals: implement sys_rt_tgsigqueueinfo</title>
<updated>2009-04-30T17:24:24+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2009-04-04T21:01:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=62ab4505e3efaf67784f84059e0fb9cedb1728ea'/>
<id>62ab4505e3efaf67784f84059e0fb9cedb1728ea</id>
<content type='text'>
sys_kill has the per thread counterpart sys_tgkill. sigqueueinfo is
missing a thread directed counterpart. Such an interface is important
for migrating applications from other OSes which have the per thread
delivery implemented.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Roland McGrath &lt;roland@redhat.com&gt;
Acked-by: Ulrich Drepper &lt;drepper@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sys_kill has the per thread counterpart sys_tgkill. sigqueueinfo is
missing a thread directed counterpart. Such an interface is important
for migrating applications from other OSes which have the per thread
delivery implemented.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Roland McGrath &lt;roland@redhat.com&gt;
Acked-by: Ulrich Drepper &lt;drepper@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Make non-compat preadv/pwritev use native register size</title>
<updated>2009-04-04T21:20:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2009-04-03T15:03:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=601cc11d054ae4b5e9b5babec3d8e4667a2cb9b5'/>
<id>601cc11d054ae4b5e9b5babec3d8e4667a2cb9b5</id>
<content type='text'>
Instead of always splitting the file offset into 32-bit 'high' and 'low'
parts, just split them into the largest natural word-size - which in C
terms is 'unsigned long'.

This allows 64-bit architectures to avoid the unnecessary 32-bit
shifting and masking for native format (while the compat interfaces will
obviously always have to do it).

This also changes the order of 'high' and 'low' to be "low first".  Why?
Because when we have it like this, the 64-bit system calls now don't use
the "pos_high" argument at all, and it makes more sense for the native
system call to simply match the user-mode prototype.

This results in a much more natural calling convention, and allows the
compiler to generate much more straightforward code.  On x86-64, we now
generate

        testq   %rcx, %rcx      # pos_l
        js      .L122   #,
        movq    %rcx, -48(%rbp) # pos_l, pos

from the C source

        loff_t pos = pos_from_hilo(pos_h, pos_l);
	...
        if (pos &lt; 0)
                return -EINVAL;

and the 'pos_h' register isn't even touched.  It used to generate code
like

        mov     %r8d, %r8d      # pos_low, pos_low
        salq    $32, %rcx       #, tmp71
        movq    %r8, %rax       # pos_low, pos.386
        orq     %rcx, %rax      # tmp71, pos.386
        js      .L122   #,
        movq    %rax, -48(%rbp) # pos.386, pos

which isn't _that_ horrible, but it does show how the natural word size
is just a more sensible interface (same arguments will hold in the user
level glibc wrapper function, of course, so the kernel side is just half
of the equation!)

Note: in all cases the user code wrapper can again be the same. You can
just do

	#define HALF_BITS (sizeof(unsigned long)*4)
	__syscall(PWRITEV, fd, iov, count, offset, (offset &gt;&gt; HALF_BITS) &gt;&gt; HALF_BITS);

or something like that.  That way the user mode wrapper will also be
nicely passing in a zero (it won't actually have to do the shifts, the
compiler will understand what is going on) for the last argument.

And that is a good idea, even if nobody will necessarily ever care: if
we ever do move to a 128-bit lloff_t, this particular system call might
be left alone.  Of course, that will be the least of our worries if we
really ever need to care, so this may not be worth really caring about.

[ Fixed for lost 'loff_t' cast noticed by Andrew Morton ]

Acked-by: Gerd Hoffmann &lt;kraxel@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of always splitting the file offset into 32-bit 'high' and 'low'
parts, just split them into the largest natural word-size - which in C
terms is 'unsigned long'.

This allows 64-bit architectures to avoid the unnecessary 32-bit
shifting and masking for native format (while the compat interfaces will
obviously always have to do it).

This also changes the order of 'high' and 'low' to be "low first".  Why?
Because when we have it like this, the 64-bit system calls now don't use
the "pos_high" argument at all, and it makes more sense for the native
system call to simply match the user-mode prototype.

This results in a much more natural calling convention, and allows the
compiler to generate much more straightforward code.  On x86-64, we now
generate

        testq   %rcx, %rcx      # pos_l
        js      .L122   #,
        movq    %rcx, -48(%rbp) # pos_l, pos

from the C source

        loff_t pos = pos_from_hilo(pos_h, pos_l);
	...
        if (pos &lt; 0)
                return -EINVAL;

and the 'pos_h' register isn't even touched.  It used to generate code
like

        mov     %r8d, %r8d      # pos_low, pos_low
        salq    $32, %rcx       #, tmp71
        movq    %r8, %rax       # pos_low, pos.386
        orq     %rcx, %rax      # tmp71, pos.386
        js      .L122   #,
        movq    %rax, -48(%rbp) # pos.386, pos

which isn't _that_ horrible, but it does show how the natural word size
is just a more sensible interface (same arguments will hold in the user
level glibc wrapper function, of course, so the kernel side is just half
of the equation!)

Note: in all cases the user code wrapper can again be the same. You can
just do

	#define HALF_BITS (sizeof(unsigned long)*4)
	__syscall(PWRITEV, fd, iov, count, offset, (offset &gt;&gt; HALF_BITS) &gt;&gt; HALF_BITS);

or something like that.  That way the user mode wrapper will also be
nicely passing in a zero (it won't actually have to do the shifts, the
compiler will understand what is going on) for the last argument.

And that is a good idea, even if nobody will necessarily ever care: if
we ever do move to a 128-bit lloff_t, this particular system call might
be left alone.  Of course, that will be the least of our worries if we
really ever need to care, so this may not be worth really caring about.

[ Fixed for lost 'loff_t' cast noticed by Andrew Morton ]

Acked-by: Gerd Hoffmann &lt;kraxel@redhat.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>preadv/pwritev: Add preadv and pwritev system calls.</title>
<updated>2009-04-03T02:05:08+00:00</updated>
<author>
<name>Gerd Hoffmann</name>
<email>kraxel@redhat.com</email>
</author>
<published>2009-04-02T23:59:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f3554f4bc69803ac2baaf7cf2aa4339e1f4b693e'/>
<id>f3554f4bc69803ac2baaf7cf2aa4339e1f4b693e</id>
<content type='text'>
This patch adds preadv and pwritev system calls.  These syscalls are a
pretty straightforward combination of pread and readv (same for write).
They are quite useful for doing vectored I/O in threaded applications.
Using lseek+readv instead opens race windows you'll have to plug with
locking.

Other systems have such system calls too, for example NetBSD, check
here: http://www.daemon-systems.org/man/preadv.2.html

The application-visible interface provided by glibc should look like
this to be compatible to the existing implementations in the *BSD family:

  ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset);
  ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset);

This prototype has one problem though: On 32bit archs is the (64bit)
offset argument unaligned, which the syscall ABI of several archs doesn't
allow to do.  At least s390 needs a wrapper in glibc to handle this.  As
we'll need a wrappers in glibc anyway I've decided to push problem to
glibc entriely and use a syscall prototype which works without
arch-specific wrappers inside the kernel: The offset argument is
explicitly splitted into two 32bit values.

The patch sports the actual system call implementation and the windup in
the x86 system call tables.  Other archs follow as separate patches.

Signed-off-by: Gerd Hoffmann &lt;kraxel@redhat.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: &lt;linux-api@vger.kernel.org&gt;
Cc: &lt;linux-arch@vger.kernel.org&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds preadv and pwritev system calls.  These syscalls are a
pretty straightforward combination of pread and readv (same for write).
They are quite useful for doing vectored I/O in threaded applications.
Using lseek+readv instead opens race windows you'll have to plug with
locking.

Other systems have such system calls too, for example NetBSD, check
here: http://www.daemon-systems.org/man/preadv.2.html

The application-visible interface provided by glibc should look like
this to be compatible to the existing implementations in the *BSD family:

  ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset);
  ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset);

This prototype has one problem though: On 32bit archs is the (64bit)
offset argument unaligned, which the syscall ABI of several archs doesn't
allow to do.  At least s390 needs a wrapper in glibc to handle this.  As
we'll need a wrappers in glibc anyway I've decided to push problem to
glibc entriely and use a syscall prototype which works without
arch-specific wrappers inside the kernel: The offset argument is
explicitly splitted into two 32bit values.

The patch sports the actual system call implementation and the windup in
the x86 system call tables.  Other archs follow as separate patches.

Signed-off-by: Gerd Hoffmann &lt;kraxel@redhat.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: &lt;linux-api@vger.kernel.org&gt;
Cc: &lt;linux-arch@vger.kernel.org&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
