<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/block, branch v3.15.5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>zram: revalidate disk after capacity change</title>
<updated>2014-07-09T18:21:30+00:00</updated>
<author>
<name>Minchan Kim</name>
<email>minchan@kernel.org</email>
</author>
<published>2014-07-02T22:22:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=38c957e9372211fb4b210602557dde43d70bf57d'/>
<id>38c957e9372211fb4b210602557dde43d70bf57d</id>
<content type='text'>
commit 2e32baea46ce542c561a519414c840295b229c8f upstream.

Alexander reported mkswap on /dev/zram0 is failed if other process is
opening the block device file.

Step is as follows,

0. Reset the unused zram device.
1. Use a program that opens /dev/zram0 with O_RDWR and sleeps
   until killed.
2. While that program sleeps, echo the correct value to
   /sys/block/zram0/disksize.
3. Verify (e.g. in /proc/partitions) that the disk size is applied
   correctly. It is.
4. While that program still sleeps, attempt to mkswap /dev/zram0.
   This fails: mkswap: error: swap area needs to be at least 40 KiB

When I investigated, the size get by ioctl(fd, BLKGETSIZE64, xxx) on
mkswap to get a size of blockdev was zero although zram0 has right size by
2.

The reason is zram didn't revalidate disk after changing capacity so that
size of blockdev's inode is not uptodate until all of file is close.

This patch should fix the BUG.

Signed-off-by: Minchan Kim &lt;minchan@kernel.org&gt;
Reported-by: Alexander E. Patrakov &lt;patrakov@gmail.com&gt;
Tested-by: Alexander E. Patrakov &lt;patrakov@gmail.com&gt;
Reviewed-by: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Cc: Nitin Gupta &lt;ngupta@vflare.org&gt;
Acked-by: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2e32baea46ce542c561a519414c840295b229c8f upstream.

Alexander reported mkswap on /dev/zram0 is failed if other process is
opening the block device file.

Step is as follows,

0. Reset the unused zram device.
1. Use a program that opens /dev/zram0 with O_RDWR and sleeps
   until killed.
2. While that program sleeps, echo the correct value to
   /sys/block/zram0/disksize.
3. Verify (e.g. in /proc/partitions) that the disk size is applied
   correctly. It is.
4. While that program still sleeps, attempt to mkswap /dev/zram0.
   This fails: mkswap: error: swap area needs to be at least 40 KiB

When I investigated, the size get by ioctl(fd, BLKGETSIZE64, xxx) on
mkswap to get a size of blockdev was zero although zram0 has right size by
2.

The reason is zram didn't revalidate disk after changing capacity so that
size of blockdev's inode is not uptodate until all of file is close.

This patch should fix the BUG.

Signed-off-by: Minchan Kim &lt;minchan@kernel.org&gt;
Reported-by: Alexander E. Patrakov &lt;patrakov@gmail.com&gt;
Tested-by: Alexander E. Patrakov &lt;patrakov@gmail.com&gt;
Reviewed-by: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Cc: Nitin Gupta &lt;ngupta@vflare.org&gt;
Acked-by: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: handle parent_overlap on writes correctly</title>
<updated>2014-07-09T18:21:28+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>ilya.dryomov@inktank.com</email>
</author>
<published>2014-06-10T09:53:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8f121b315e67b553806a1babdc9ba3aefc86fa5d'/>
<id>8f121b315e67b553806a1babdc9ba3aefc86fa5d</id>
<content type='text'>
commit 9638556a276125553549fdfe349c464481ec2f39 upstream.

The following check in rbd_img_obj_request_submit()

    rbd_dev-&gt;parent_overlap &lt;= obj_request-&gt;img_offset

allows the fall through to the non-layered write case even if both
parent_overlap and obj_request-&gt;img_offset belong to the same RADOS
object.  This leads to data corruption, because the area to the left of
parent_overlap ends up unconditionally zero-filled instead of being
populated with parent data.  Suppose we want to write 1M to offset 6M
of image bar, which is a clone of foo@snap; object_size is 4M,
parent_overlap is 5M:

    rbd_data.&lt;id&gt;.0000000000000001
     ---------------------|----------------------|------------
    | should be copyup'ed | should be zeroed out | write ...
     ---------------------|----------------------|------------
   4M                    5M                     6M
                    parent_overlap    obj_request-&gt;img_offset

4..5M should be copyup'ed from foo, yet it is zero-filled, just like
5..6M is.

Given that the only striping mode kernel client currently supports is
chunking (i.e. stripe_unit == object_size, stripe_count == 1), round
parent_overlap up to the next object boundary for the purposes of the
overlap check.

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Josh Durgin &lt;josh.durgin@inktank.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9638556a276125553549fdfe349c464481ec2f39 upstream.

The following check in rbd_img_obj_request_submit()

    rbd_dev-&gt;parent_overlap &lt;= obj_request-&gt;img_offset

allows the fall through to the non-layered write case even if both
parent_overlap and obj_request-&gt;img_offset belong to the same RADOS
object.  This leads to data corruption, because the area to the left of
parent_overlap ends up unconditionally zero-filled instead of being
populated with parent data.  Suppose we want to write 1M to offset 6M
of image bar, which is a clone of foo@snap; object_size is 4M,
parent_overlap is 5M:

    rbd_data.&lt;id&gt;.0000000000000001
     ---------------------|----------------------|------------
    | should be copyup'ed | should be zeroed out | write ...
     ---------------------|----------------------|------------
   4M                    5M                     6M
                    parent_overlap    obj_request-&gt;img_offset

4..5M should be copyup'ed from foo, yet it is zero-filled, just like
5..6M is.

Given that the only striping mode kernel client currently supports is
chunking (i.e. stripe_unit == object_size, stripe_count == 1), round
parent_overlap up to the next object boundary for the purposes of the
overlap check.

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Josh Durgin &lt;josh.durgin@inktank.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: use reference counts for image requests</title>
<updated>2014-07-09T18:21:28+00:00</updated>
<author>
<name>Alex Elder</name>
<email>elder@linaro.org</email>
</author>
<published>2014-04-26T10:21:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3ab5cb07a9ce04437ba00556151ed205a189d4e6'/>
<id>3ab5cb07a9ce04437ba00556151ed205a189d4e6</id>
<content type='text'>
commit 0f2d5be792b0466b06797f637cfbb0f64dbb408c upstream.

Each image request contains a reference count, but to date it has
not actually been used.  (I think this was just an oversight.) A
recent report involving rbd failing an assertion shed light on why
and where we need to use these reference counts.

Every OSD request associated with an object request uses
rbd_osd_req_callback() as its callback function.  That function will
call a helper function (dependent on the type of OSD request) that
will set the object request's "done" flag if the object request if
appropriate.  If that "done" flag is set, the object request is
passed to rbd_obj_request_complete().

In rbd_obj_request_complete(), requests are processed in sequential
order.  So if an object request completes before one of its
predecessors in the image request, the completion is deferred.
Otherwise, if it's a completing object's "turn" to be completed, it
is passed to rbd_img_obj_end_request(), which records the result of
the operation, accumulates transferred bytes, and so on.  Next, the
successor to this request is checked and if it is marked "done",
(deferred) completion processing is performed on that request, and
so on.  If the last object request in an image request is completed,
rbd_img_request_complete() is called, which (typically) destroys
the image request.

There is a race here, however.  The instant an object request is
marked "done" it can be provided (by a thread handling completion of
one of its predecessor operations) to rbd_img_obj_end_request(),
which (for the last request) can then lead to the image request
getting torn down.  And this can happen *before* that object has
itself entered rbd_img_obj_end_request().  As a result, once it
*does* enter that function, the image request (and even the object
request itself) may have been freed and become invalid.

All that's necessary to avoid this is to properly count references
to the image requests.  We tear down an image request's object
requests all at once--only when the entire image request has
completed.  So there's no need for an image request to count
references for its object requests.  However, we don't want an
image request to go away until the last of its object requests
has passed through rbd_img_obj_callback().  In other words,
we don't want rbd_img_request_complete() to necessarily
result in the image request being destroyed, because it may
get called before we've finished processing on all of its
object requests.

So the fix is to add a reference to an image request for
each of its object requests.  The reference can be viewed
as representing an object request that has not yet finished
its call to rbd_img_obj_callback().  That is emphasized by
getting the reference right after assigning that as the image
object's callback function.  The corresponding release of that
reference is done at the end of rbd_img_obj_callback(), which
every image object request passes through exactly once.

Signed-off-by: Alex Elder &lt;elder@linaro.org&gt;
Reviewed-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0f2d5be792b0466b06797f637cfbb0f64dbb408c upstream.

Each image request contains a reference count, but to date it has
not actually been used.  (I think this was just an oversight.) A
recent report involving rbd failing an assertion shed light on why
and where we need to use these reference counts.

Every OSD request associated with an object request uses
rbd_osd_req_callback() as its callback function.  That function will
call a helper function (dependent on the type of OSD request) that
will set the object request's "done" flag if the object request if
appropriate.  If that "done" flag is set, the object request is
passed to rbd_obj_request_complete().

In rbd_obj_request_complete(), requests are processed in sequential
order.  So if an object request completes before one of its
predecessors in the image request, the completion is deferred.
Otherwise, if it's a completing object's "turn" to be completed, it
is passed to rbd_img_obj_end_request(), which records the result of
the operation, accumulates transferred bytes, and so on.  Next, the
successor to this request is checked and if it is marked "done",
(deferred) completion processing is performed on that request, and
so on.  If the last object request in an image request is completed,
rbd_img_request_complete() is called, which (typically) destroys
the image request.

There is a race here, however.  The instant an object request is
marked "done" it can be provided (by a thread handling completion of
one of its predecessor operations) to rbd_img_obj_end_request(),
which (for the last request) can then lead to the image request
getting torn down.  And this can happen *before* that object has
itself entered rbd_img_obj_end_request().  As a result, once it
*does* enter that function, the image request (and even the object
request itself) may have been freed and become invalid.

All that's necessary to avoid this is to properly count references
to the image requests.  We tear down an image request's object
requests all at once--only when the entire image request has
completed.  So there's no need for an image request to count
references for its object requests.  However, we don't want an
image request to go away until the last of its object requests
has passed through rbd_img_obj_callback().  In other words,
we don't want rbd_img_request_complete() to necessarily
result in the image request being destroyed, because it may
get called before we've finished processing on all of its
object requests.

So the fix is to add a reference to an image request for
each of its object requests.  The reference can be viewed
as representing an object request that has not yet finished
its call to rbd_img_obj_callback().  That is emphasized by
getting the reference right after assigning that as the image
object's callback function.  The corresponding release of that
reference is done at the end of rbd_img_obj_callback(), which
every image object request passes through exactly once.

Signed-off-by: Alex Elder &lt;elder@linaro.org&gt;
Reviewed-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mtip32xx: Remove dfs_parent after pci unregister</title>
<updated>2014-07-07T01:59:09+00:00</updated>
<author>
<name>Asai Thambi S P</name>
<email>asamymuthupa@micron.com</email>
</author>
<published>2014-04-17T03:30:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=438b20e970ccf13d8eedd70031e1fedc957c1ccb'/>
<id>438b20e970ccf13d8eedd70031e1fedc957c1ccb</id>
<content type='text'>
commit af5ded8ccf21627f9614afc03b356712666ed225 upstream.

In module exit, dfs_parent and it's subtree were removed before
unregistering with pci. When debugfs entry for each device is attempted
to remove in pci_remove() context, they don't exist, as dfs_parent and
its children were already ripped apart.

Modified to first unregister with pci and then remove dfs_parent.

Signed-off-by: Asai Thambi S P &lt;asamymuthupa@micron.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit af5ded8ccf21627f9614afc03b356712666ed225 upstream.

In module exit, dfs_parent and it's subtree were removed before
unregistering with pci. When debugfs entry for each device is attempted
to remove in pci_remove() context, they don't exist, as dfs_parent and
its children were already ripped apart.

Modified to first unregister with pci and then remove dfs_parent.

Signed-off-by: Asai Thambi S P &lt;asamymuthupa@micron.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mtip32xx: Increase timeout for STANDBY IMMEDIATE command</title>
<updated>2014-07-07T01:59:09+00:00</updated>
<author>
<name>Asai Thambi S P</name>
<email>asamymuthupa@micron.com</email>
</author>
<published>2014-04-17T03:27:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8e5d7e9d97682c239627e1c55b3ffa8cc09f187a'/>
<id>8e5d7e9d97682c239627e1c55b3ffa8cc09f187a</id>
<content type='text'>
commit 670a641420a3d9586eebe7429dfeec4e7ed447aa upstream.

Increased timeout for STANDBY IMMEDIATE command to 2 minutes.

Signed-off-by: Selvan Mani &lt;smani@micron.com&gt;
Signed-off-by: Asai Thambi S P &lt;asamymuthupa@micron.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 670a641420a3d9586eebe7429dfeec4e7ed447aa upstream.

Increased timeout for STANDBY IMMEDIATE command to 2 minutes.

Signed-off-by: Selvan Mani &lt;smani@micron.com&gt;
Signed-off-by: Asai Thambi S P &lt;asamymuthupa@micron.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mtip32xx: Fix ERO and NoSnoop values in PCIe upstream on AMD systems</title>
<updated>2014-07-07T01:59:09+00:00</updated>
<author>
<name>Asai Thambi S P</name>
<email>asamymuthupa@micron.com</email>
</author>
<published>2014-03-14T01:45:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=55e416900e309572eb497da2ddb43ce45f80690b'/>
<id>55e416900e309572eb497da2ddb43ce45f80690b</id>
<content type='text'>
commit d1e714db8129a1d3670e449b87719c78e2c76f9f upstream.

A hardware quirk in P320h/P420m interfere with PCIe transactions on some
AMD chipsets, making P320h/P420m unusable. This workaround is to disable
ERO and NoSnoop bits in the parent and root complex for normal
functioning of these devices

NOTE: This workaround is specific to AMD chipset with a PCIe upstream
device with device id 0x5aXX

Signed-off-by: Asai Thambi S P &lt;asamymuthupa@micron.com&gt;
Signed-off-by: Sam Bradshaw &lt;sbradshaw@micron.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d1e714db8129a1d3670e449b87719c78e2c76f9f upstream.

A hardware quirk in P320h/P420m interfere with PCIe transactions on some
AMD chipsets, making P320h/P420m unusable. This workaround is to disable
ERO and NoSnoop bits in the parent and root complex for normal
functioning of these devices

NOTE: This workaround is specific to AMD chipset with a PCIe upstream
device with device id 0x5aXX

Signed-off-by: Asai Thambi S P &lt;asamymuthupa@micron.com&gt;
Signed-off-by: Sam Bradshaw &lt;sbradshaw@micron.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>zram: correct offset usage in zram_bio_discard</title>
<updated>2014-07-01T03:13:55+00:00</updated>
<author>
<name>Weijie Yang</name>
<email>weijie.yang@samsung.com</email>
</author>
<published>2014-06-04T23:11:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e9198a0b9b519fc5e61cef1f0c5a7e1b0798658e'/>
<id>e9198a0b9b519fc5e61cef1f0c5a7e1b0798658e</id>
<content type='text'>
commit 38515c73398a4c58059ecf1087e844561b58ee0f upstream.

We want to skip the physical block(PAGE_SIZE) which is partially covered
by the discard bio, so we check the remaining size and subtract it if
there is a need to goto the next physical block.

The current offset usage in zram_bio_discard is incorrect, it will cause
its upper filesystem breakdown.  Consider the following scenario:

On some architecture or config, PAGE_SIZE is 64K for example, filesystem
is set up on zram disk without PAGE_SIZE aligned, a discard bio leads to a
offset = 4K and size=72K, normally, it should not really discard any
physical block as it partially cover two physical blocks.  However, with
the current offset usage, it will discard the second physical block and
free its memory, which will cause filesystem breakdown.

This patch corrects the offset usage in zram_bio_discard.

Signed-off-by: Weijie Yang &lt;weijie.yang@samsung.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nitin Gupta &lt;ngupta@vflare.org&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Cc: Bob Liu &lt;bob.liu@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 38515c73398a4c58059ecf1087e844561b58ee0f upstream.

We want to skip the physical block(PAGE_SIZE) which is partially covered
by the discard bio, so we check the remaining size and subtract it if
there is a need to goto the next physical block.

The current offset usage in zram_bio_discard is incorrect, it will cause
its upper filesystem breakdown.  Consider the following scenario:

On some architecture or config, PAGE_SIZE is 64K for example, filesystem
is set up on zram disk without PAGE_SIZE aligned, a discard bio leads to a
offset = 4K and size=72K, normally, it should not really discard any
physical block as it partially cover two physical blocks.  However, with
the current offset usage, it will discard the second physical block and
free its memory, which will cause filesystem breakdown.

This patch corrects the offset usage in zram_bio_discard.

Signed-off-by: Weijie Yang &lt;weijie.yang@samsung.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nitin Gupta &lt;ngupta@vflare.org&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Cc: Bob Liu &lt;bob.liu@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>block: virtio_blk: don't hold spin lock during world switch</title>
<updated>2014-07-01T03:13:52+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2014-05-30T02:49:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ed950366acc7694b514a43878b41a1392d89bd19'/>
<id>ed950366acc7694b514a43878b41a1392d89bd19</id>
<content type='text'>
commit e8edca6f7f92234202d6dd163c118ef495244d7c upstream.

Firstly, it isn't necessary to hold lock of vblk-&gt;vq_lock
when notifying hypervisor about queued I/O.

Secondly, virtqueue_notify() will cause world switch and
it may take long time on some hypervisors(such as, qemu-arm),
so it isn't good to hold the lock and block other vCPUs.

On arm64 quad core VM(qemu-kvm), the patch can increase I/O
performance a lot with VIRTIO_RING_F_EVENT_IDX enabled:
	- without the patch: 14K IOPS
	- with the patch: 34K IOPS

fio script:
	[global]
	direct=1
	bsrange=4k-4k
	timeout=10
	numjobs=4
	ioengine=libaio
	iodepth=64

	filename=/dev/vdc
	group_reporting=1

	[f1]
	rw=randread

Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Acked-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e8edca6f7f92234202d6dd163c118ef495244d7c upstream.

Firstly, it isn't necessary to hold lock of vblk-&gt;vq_lock
when notifying hypervisor about queued I/O.

Secondly, virtqueue_notify() will cause world switch and
it may take long time on some hypervisors(such as, qemu-arm),
so it isn't good to hold the lock and block other vCPUs.

On arm64 quad core VM(qemu-kvm), the patch can increase I/O
performance a lot with VIRTIO_RING_F_EVENT_IDX enabled:
	- without the patch: 14K IOPS
	- with the patch: 34K IOPS

fio script:
	[global]
	direct=1
	bsrange=4k-4k
	timeout=10
	numjobs=4
	ioengine=libaio
	iodepth=64

	filename=/dev/vdc
	group_reporting=1

	[f1]
	rw=randread

Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Acked-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>virtio_blk: fix race between start and stop queue</title>
<updated>2014-05-27T14:41:10+00:00</updated>
<author>
<name>Ming Lei</name>
<email>tom.leiming@gmail.com</email>
</author>
<published>2014-05-16T15:31:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=aa0818c6ee8d8e4772725a43550823347bc1ad30'/>
<id>aa0818c6ee8d8e4772725a43550823347bc1ad30</id>
<content type='text'>
When there isn't enough vring descriptor for adding to vq,
blk-mq will be put as stopped state until some of pending
descriptors are completed &amp; freed.

Unfortunately, the vq's interrupt may come just before
blk-mq's BLK_MQ_S_STOPPED flag is set, so the blk-mq will
still be kept as stopped even though lots of descriptors
are completed and freed in the interrupt handler. The worst
case is that all pending descriptors are freed in the
interrupt handler, and the queue is kept as stopped forever.

This patch fixes the problem by starting/stopping blk-mq
with holding vq_lock.

Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Ming Lei &lt;tom.leiming@gmail.com&gt;
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;

Conflicts:
	drivers/block/virtio_blk.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When there isn't enough vring descriptor for adding to vq,
blk-mq will be put as stopped state until some of pending
descriptors are completed &amp; freed.

Unfortunately, the vq's interrupt may come just before
blk-mq's BLK_MQ_S_STOPPED flag is set, so the blk-mq will
still be kept as stopped even though lots of descriptors
are completed and freed in the interrupt handler. The worst
case is that all pending descriptors are freed in the
interrupt handler, and the queue is kept as stopped forever.

This patch fixes the problem by starting/stopping blk-mq
with holding vq_lock.

Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Ming Lei &lt;tom.leiming@gmail.com&gt;
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;

Conflicts:
	drivers/block/virtio_blk.c
</pre>
</div>
</content>
</entry>
<entry>
<title>floppy: don't write kernel-only members to FDRAWCMD ioctl output</title>
<updated>2014-05-05T14:46:56+00:00</updated>
<author>
<name>Matthew Daley</name>
<email>mattd@bugfuzz.com</email>
</author>
<published>2014-04-28T07:05:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=2145e15e0557a01b9195d1c7199a1b92cb9be81f'/>
<id>2145e15e0557a01b9195d1c7199a1b92cb9be81f</id>
<content type='text'>
Do not leak kernel-only floppy_raw_cmd structure members to userspace.
This includes the linked-list pointer and the pointer to the allocated
DMA space.

Signed-off-by: Matthew Daley &lt;mattd@bugfuzz.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Do not leak kernel-only floppy_raw_cmd structure members to userspace.
This includes the linked-list pointer and the pointer to the allocated
DMA space.

Signed-off-by: Matthew Daley &lt;mattd@bugfuzz.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
