<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/buffer_head.h, branch v2.6.27.13</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>fs: rename buffer trylock</title>
<updated>2008-08-05T04:56:09+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2008-08-02T10:02:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ca5de404ff036a29b25e9a83f6919c9f606c5841'/>
<id>ca5de404ff036a29b25e9a83f6919c9f606c5841</id>
<content type='text'>
Like the page lock change, this also requires name change, so convert the
raw test_and_set bitop to a trylock.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Like the page lock change, this also requires name change, so convert the
raw test_and_set bitop to a trylock.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: pagecache usage optimization for pagesize!=blocksize</title>
<updated>2008-07-28T23:30:21+00:00</updated>
<author>
<name>Hisashi Hifumi</name>
<email>hifumi.hisashi@oss.ntt.co.jp</email>
</author>
<published>2008-07-28T22:46:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8ab22b9abb5c55413802e4adc9aa6223324547c3'/>
<id>8ab22b9abb5c55413802e4adc9aa6223324547c3</id>
<content type='text'>
When we read some part of a file through pagecache, if there is a
pagecache of corresponding index but this page is not uptodate, read IO
is issued and this page will be uptodate.

I think this is good for pagesize == blocksize environment but there is
room for improvement on pagesize != blocksize environment.  Because in
this case a page can have multiple buffers and even if a page is not
uptodate, some buffers can be uptodate.

So I suggest that when all buffers which correspond to a part of a file
that we want to read are uptodate, use this pagecache and copy data from
this pagecache to user buffer even if a page is not uptodate.  This can
reduce read IO and improve system throughput.

I wrote a benchmark program and got result number with this program.

This benchmark do:

  1: mount and open a test file.

  2: create a 512MB file.

  3: close a file and umount.

  4: mount and again open a test file.

  5: pwrite randomly 300000 times on a test file.  offset is aligned
     by IO size(1024bytes).

  6: measure time of preading randomly 100000 times on a test file.

The result was:
	2.6.26
        330 sec

	2.6.26-patched
        226 sec

Arch:i386
Filesystem:ext3
Blocksize:1024 bytes
Memory: 1GB

On ext3/4, a file is written through buffer/block.  So random read/write
mixed workloads or random read after random write workloads are optimized
with this patch under pagesize != blocksize environment.  This test result
showed this.

The benchmark program is as follows:

#include &lt;stdio.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;
#include &lt;time.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/mount.h&gt;

#define LEN 1024
#define LOOP 1024*512 /* 512MB */

main(void)
{
	unsigned long i, offset, filesize;
	int fd;
	char buf[LEN];
	time_t t1, t2;

	if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) &lt; 0) {
		perror("cannot mount\n");
		exit(1);
	}
	memset(buf, 0, LEN);
	fd = open("/root/test1/testfile", O_CREAT|O_RDWR|O_TRUNC);
	if (fd &lt; 0) {
		perror("cannot open file\n");
		exit(1);
	}
	for (i = 0; i &lt; LOOP; i++)
		write(fd, buf, LEN);
	close(fd);
	if (umount("/root/test1/") &lt; 0) {
		perror("cannot umount\n");
		exit(1);
	}
	if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) &lt; 0) {
		perror("cannot mount\n");
		exit(1);
	}
	fd = open("/root/test1/testfile", O_RDWR);
	if (fd &lt; 0) {
		perror("cannot open file\n");
		exit(1);
	}

	filesize = LEN * LOOP;
	for (i = 0; i &lt; 300000; i++){
		offset = (random() % filesize) &amp; (~(LEN - 1));
		pwrite(fd, buf, LEN, offset);
	}
	printf("start test\n");
	time(&amp;t1);
	for (i = 0; i &lt; 100000; i++){
		offset = (random() % filesize) &amp; (~(LEN - 1));
		pread(fd, buf, LEN, offset);
	}
	time(&amp;t2);
	printf("%ld sec\n", t2-t1);
	close(fd);
	if (umount("/root/test1/") &lt; 0) {
		perror("cannot umount\n");
		exit(1);
	}
}

Signed-off-by: Hisashi Hifumi &lt;hifumi.hisashi@oss.ntt.co.jp&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Jan Kara &lt;jack@ucw.cz&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When we read some part of a file through pagecache, if there is a
pagecache of corresponding index but this page is not uptodate, read IO
is issued and this page will be uptodate.

I think this is good for pagesize == blocksize environment but there is
room for improvement on pagesize != blocksize environment.  Because in
this case a page can have multiple buffers and even if a page is not
uptodate, some buffers can be uptodate.

So I suggest that when all buffers which correspond to a part of a file
that we want to read are uptodate, use this pagecache and copy data from
this pagecache to user buffer even if a page is not uptodate.  This can
reduce read IO and improve system throughput.

I wrote a benchmark program and got result number with this program.

This benchmark do:

  1: mount and open a test file.

  2: create a 512MB file.

  3: close a file and umount.

  4: mount and again open a test file.

  5: pwrite randomly 300000 times on a test file.  offset is aligned
     by IO size(1024bytes).

  6: measure time of preading randomly 100000 times on a test file.

The result was:
	2.6.26
        330 sec

	2.6.26-patched
        226 sec

Arch:i386
Filesystem:ext3
Blocksize:1024 bytes
Memory: 1GB

On ext3/4, a file is written through buffer/block.  So random read/write
mixed workloads or random read after random write workloads are optimized
with this patch under pagesize != blocksize environment.  This test result
showed this.

The benchmark program is as follows:

#include &lt;stdio.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;
#include &lt;time.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/mount.h&gt;

#define LEN 1024
#define LOOP 1024*512 /* 512MB */

main(void)
{
	unsigned long i, offset, filesize;
	int fd;
	char buf[LEN];
	time_t t1, t2;

	if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) &lt; 0) {
		perror("cannot mount\n");
		exit(1);
	}
	memset(buf, 0, LEN);
	fd = open("/root/test1/testfile", O_CREAT|O_RDWR|O_TRUNC);
	if (fd &lt; 0) {
		perror("cannot open file\n");
		exit(1);
	}
	for (i = 0; i &lt; LOOP; i++)
		write(fd, buf, LEN);
	close(fd);
	if (umount("/root/test1/") &lt; 0) {
		perror("cannot umount\n");
		exit(1);
	}
	if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) &lt; 0) {
		perror("cannot mount\n");
		exit(1);
	}
	fd = open("/root/test1/testfile", O_RDWR);
	if (fd &lt; 0) {
		perror("cannot open file\n");
		exit(1);
	}

	filesize = LEN * LOOP;
	for (i = 0; i &lt; 300000; i++){
		offset = (random() % filesize) &amp; (~(LEN - 1));
		pwrite(fd, buf, LEN, offset);
	}
	printf("start test\n");
	time(&amp;t1);
	for (i = 0; i &lt; 100000; i++){
		offset = (random() % filesize) &amp; (~(LEN - 1));
		pread(fd, buf, LEN, offset);
	}
	time(&amp;t2);
	printf("%ld sec\n", t2-t1);
	close(fd);
	if (umount("/root/test1/") &lt; 0) {
		perror("cannot umount\n");
		exit(1);
	}
}

Signed-off-by: Hisashi Hifumi &lt;hifumi.hisashi@oss.ntt.co.jp&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Jan Kara &lt;jack@ucw.cz&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>remove generic_commit_write()</title>
<updated>2008-04-29T15:06:01+00:00</updated>
<author>
<name>Adrian Bunk</name>
<email>bunk@kernel.org</email>
</author>
<published>2008-04-29T07:59:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=946a57b526a16e5662235cb8f573337bc8ecdc48'/>
<id>946a57b526a16e5662235cb8f573337bc8ecdc48</id>
<content type='text'>
Remove the obsolete and no longer used generic_commit_write().

Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the obsolete and no longer used generic_commit_write().

Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>include/linux: Remove all users of FASTCALL() macro</title>
<updated>2008-02-14T00:21:18+00:00</updated>
<author>
<name>Harvey Harrison</name>
<email>harvey.harrison@gmail.com</email>
</author>
<published>2008-02-13T23:03:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b3c97528689619fc66569b30bf83d09d9929521a'/>
<id>b3c97528689619fc66569b30bf83d09d9929521a</id>
<content type='text'>
FASTCALL() is always expanded to empty, remove it.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Harvey Harrison &lt;harvey.harrison@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
FASTCALL() is always expanded to empty, remove it.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Harvey Harrison &lt;harvey.harrison@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add buffer head related helper functions</title>
<updated>2008-01-29T04:58:26+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2008-01-29T04:58:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=389d1b083c767a360ec84b27a95da06244becec8'/>
<id>389d1b083c767a360ec84b27a95da06244becec8</id>
<content type='text'>
Add buffer head related helper function bh_uptodate_or_lock and
bh_submit_read which can be used by file system

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add buffer head related helper function bh_uptodate_or_lock and
bh_submit_read which can be used by file system

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: restore nobh</title>
<updated>2007-10-16T16:42:58+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2007-10-16T08:25:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=03158cd7eb3374843de68421142ca5900df845d9'/>
<id>03158cd7eb3374843de68421142ca5900df845d9</id>
<content type='text'>
Implement nobh in new aops.  This is a bit tricky.  FWIW, nobh_truncate is
now implemented in a way that does not create blocks in sparse regions,
which is a silly thing for it to have been doing (isn't it?)

ext2 survives fsx and fsstress. jfs is converted as well... ext3
should be easy to do (but not done yet).

[akpm@linux-foundation.org: coding-style fixes]
Cc: Badari Pulavarty &lt;pbadari@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement nobh in new aops.  This is a bit tricky.  FWIW, nobh_truncate is
now implemented in a way that does not create blocks in sparse regions,
which is a silly thing for it to have been doing (isn't it?)

ext2 survives fsx and fsstress. jfs is converted as well... ext3
should be easy to do (but not done yet).

[akpm@linux-foundation.org: coding-style fixes]
Cc: Badari Pulavarty &lt;pbadari@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>With reiserfs no longer using the weird generic_cont_expand, remove it completely.</title>
<updated>2007-10-16T16:42:56+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2007-10-16T08:25:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a20fa20c549ed569885d871f689a59cfd2f6ff77'/>
<id>a20fa20c549ed569885d871f689a59cfd2f6ff77</id>
<content type='text'>
Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: new cont helpers</title>
<updated>2007-10-16T16:42:55+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2007-10-16T08:25:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=89e107877b65bf6eff1d63a1302dee9a091586f5'/>
<id>89e107877b65bf6eff1d63a1302dee9a091586f5</id>
<content type='text'>
Rework the generic block "cont" routines to handle the new aops.  Supporting
cont_prepare_write would take quite a lot of code to support, so remove it
instead (and we later convert all filesystems to use it).

write_begin gets passed AOP_FLAG_CONT_EXPAND when called from
generic_cont_expand, so filesystems can avoid the old hacks they used.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Cc: OGAWA Hirofumi &lt;hirofumi@mail.parknet.co.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rework the generic block "cont" routines to handle the new aops.  Supporting
cont_prepare_write would take quite a lot of code to support, so remove it
instead (and we later convert all filesystems to use it).

write_begin gets passed AOP_FLAG_CONT_EXPAND when called from
generic_cont_expand, so filesystems can avoid the old hacks they used.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Cc: OGAWA Hirofumi &lt;hirofumi@mail.parknet.co.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: introduce write_begin, write_end, and perform_write aops</title>
<updated>2007-10-16T16:42:55+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2007-10-16T08:25:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=afddba49d18f346e5cc2938b6ed7c512db18ca68'/>
<id>afddba49d18f346e5cc2938b6ed7c512db18ca68</id>
<content type='text'>
These are intended to replace prepare_write and commit_write with more
flexible alternatives that are also able to avoid the buffered write
deadlock problems efficiently (which prepare_write is unable to do).

[mark.fasheh@oracle.com: API design contributions, code review and fixes]
[akpm@linux-foundation.org: various fixes]
[dmonakhov@sw.ru: new aop block_write_begin fix]
Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Mark Fasheh &lt;mark.fasheh@oracle.com&gt;
Signed-off-by: Dmitriy Monakhov &lt;dmonakhov@openvz.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These are intended to replace prepare_write and commit_write with more
flexible alternatives that are also able to avoid the buffered write
deadlock problems efficiently (which prepare_write is unable to do).

[mark.fasheh@oracle.com: API design contributions, code review and fixes]
[akpm@linux-foundation.org: various fixes]
[dmonakhov@sw.ru: new aop block_write_begin fix]
Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Mark Fasheh &lt;mark.fasheh@oracle.com&gt;
Signed-off-by: Dmitriy Monakhov &lt;dmonakhov@openvz.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[FS] Implement block_page_mkwrite.</title>
<updated>2007-07-19T09:50:50+00:00</updated>
<author>
<name>David Chinner</name>
<email>dgc@sgi.com</email>
</author>
<published>2007-07-19T07:39:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5417169026c3df151adf5a65eb061278b0a72e69'/>
<id>5417169026c3df151adf5a65eb061278b0a72e69</id>
<content type='text'>
Many filesystems need a -&gt;page-mkwrite callout to correctly
set up pages that have been written to by mmap. This is especially
important when mmap is writing into holes as it allows filesystems
to correctly account for and allocate space before the mmap
write is allowed to proceed.

Protection against truncate races is provided by locking the page
and checking to see whether the page mapping is correct and whether
it is beyond EOF so we don't end up allowing allocations beyond
the current EOF or changing EOF as a result of a mmap write.

SGI-PV: 940392
SGI-Modid: 2.6.x-xfs-melb:linux:29146a

Signed-off-by: David Chinner &lt;dgc@sgi.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Tim Shimmin &lt;tes@sgi.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Many filesystems need a -&gt;page-mkwrite callout to correctly
set up pages that have been written to by mmap. This is especially
important when mmap is writing into holes as it allows filesystems
to correctly account for and allocate space before the mmap
write is allowed to proceed.

Protection against truncate races is provided by locking the page
and checking to see whether the page mapping is correct and whether
it is beyond EOF so we don't end up allowing allocations beyond
the current EOF or changing EOF as a result of a mmap write.

SGI-PV: 940392
SGI-Modid: 2.6.x-xfs-melb:linux:29146a

Signed-off-by: David Chinner &lt;dgc@sgi.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Tim Shimmin &lt;tes@sgi.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
