<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/block/blk-cgroup.h, branch v3.0.44</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>cfq-iosched: Make IO merge related stats per cpu</title>
<updated>2011-05-23T08:02:19+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2011-05-23T08:02:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=317389a7739675aa990b7e0d750a7c435f1d25d7'/>
<id>317389a7739675aa990b7e0d750a7c435f1d25d7</id>
<content type='text'>
Make BLKIO_STAT_MERGED per cpu hence gettring rid of need of taking
blkg-&gt;stats_lock.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make BLKIO_STAT_MERGED per cpu hence gettring rid of need of taking
blkg-&gt;stats_lock.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-cgroup: Make 64bit per cpu stats safe on 32bit arch</title>
<updated>2011-05-20T18:34:53+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2011-05-19T19:38:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=575969a0dd3fe65c6556bcb8f87c42303326ea55'/>
<id>575969a0dd3fe65c6556bcb8f87c42303326ea55</id>
<content type='text'>
Some of the stats are 64bit and updation will be non atomic on 32bit
architecture. Use sequence counters on 32bit arch to make reading
of stats safe.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some of the stats are 64bit and updation will be non atomic on 32bit
architecture. Use sequence counters on 32bit arch to make reading
of stats safe.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-throttle: Make dispatch stats per cpu</title>
<updated>2011-05-20T18:34:52+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2011-05-19T19:38:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5624a4e445e2ec27582984b068d7bf7f127cee10'/>
<id>5624a4e445e2ec27582984b068d7bf7f127cee10</id>
<content type='text'>
Currently we take blkg_stat lock for even updating the stats. So even if
a group has no throttling rules (common case for root group), we end
up taking blkg_lock, for updating the stats.

Make dispatch stats per cpu so that these can be updated without taking
blkg lock.

If cpu goes offline, these stats simply disappear. No protection has
been provided for that yet. Do we really need anything for that?

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we take blkg_stat lock for even updating the stats. So even if
a group has no throttling rules (common case for root group), we end
up taking blkg_lock, for updating the stats.

Make dispatch stats per cpu so that these can be updated without taking
blkg lock.

If cpu goes offline, these stats simply disappear. No protection has
been provided for that yet. Do we really need anything for that?

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-cgroup: move some fields of unaccounted_time file under right config option</title>
<updated>2011-05-20T18:34:52+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2011-05-19T19:38:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a23e68695593d00b35a6cddf8e9c9ec03505ecb9'/>
<id>a23e68695593d00b35a6cddf8e9c9ec03505ecb9</id>
<content type='text'>
cgroup unaccounted_time file is created only if CONFIG_DEBUG_BLK_CGROUP=y.
there are some fields which are out side this config option. Fix that.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cgroup unaccounted_time file is created only if CONFIG_DEBUG_BLK_CGROUP=y.
there are some fields which are out side this config option. Fix that.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup</title>
<updated>2011-05-16T13:24:08+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2011-05-16T13:24:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=70087dc38cc77ca8f46059564c00338777734762'/>
<id>70087dc38cc77ca8f46059564c00338777734762</id>
<content type='text'>
Currentlly we first map the task to cgroup and then cgroup to
blkio_cgroup. There is a more direct way to get to blkio_cgroup
from task using task_subsys_state(). Use that.

The real reason for the fix is that it also avoids a race in generic
cgroup code. During remount/umount rebind_subsystems() is called and
it can do following with and rcu protection.

cgrp-&gt;subsys[i] = NULL;

That means if somebody got hold of cgroup under rcu and then it tried
to do cgroup-&gt;subsys[] to get to blkio_cgroup, it would get NULL which
is wrong. I was running into this race condition with ltp running on a
upstream derived kernel and that lead to crash.

So ideally we should also fix cgroup generic code to wait for rcu
grace period before setting pointer to NULL. Li Zefan is not very keen
on introducing synchronize_wait() as he thinks it will slow
down moun/remount/umount operations.

So for the time being atleast fix the kernel crash by taking a more
direct route to blkio_cgroup.

One tester had reported a crash while running LTP on a derived kernel
and with this fix crash is no more seen while the test has been
running for over 6 days.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Reviewed-by: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currentlly we first map the task to cgroup and then cgroup to
blkio_cgroup. There is a more direct way to get to blkio_cgroup
from task using task_subsys_state(). Use that.

The real reason for the fix is that it also avoids a race in generic
cgroup code. During remount/umount rebind_subsystems() is called and
it can do following with and rcu protection.

cgrp-&gt;subsys[i] = NULL;

That means if somebody got hold of cgroup under rcu and then it tried
to do cgroup-&gt;subsys[] to get to blkio_cgroup, it would get NULL which
is wrong. I was running into this race condition with ltp running on a
upstream derived kernel and that lead to crash.

So ideally we should also fix cgroup generic code to wait for rcu
grace period before setting pointer to NULL. Li Zefan is not very keen
on introducing synchronize_wait() as he thinks it will slow
down moun/remount/umount operations.

So for the time being atleast fix the kernel crash by taking a more
direct route to blkio_cgroup.

One tester had reported a crash while running LTP on a derived kernel
and with this fix crash is no more seen while the test has been
running for over 6 days.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Reviewed-by: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-cgroup: Add unaccounted time to timeslice_used.</title>
<updated>2011-03-12T15:54:00+00:00</updated>
<author>
<name>Justin TerAvest</name>
<email>teravest@google.com</email>
</author>
<published>2011-03-12T15:54:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=167400d34070ebbc408dc0f447c4ddb4bf837360'/>
<id>167400d34070ebbc408dc0f447c4ddb4bf837360</id>
<content type='text'>
There are two kind of times that tasks are not charged for: the first
seek and the extra time slice used over the allocated timeslice. Both
of these exported as a new unaccounted_time stat.

I think it would be good to have this reported in 'time' as well, but
that is probably a separate discussion.

Signed-off-by: Justin TerAvest &lt;teravest@google.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are two kind of times that tasks are not charged for: the first
seek and the extra time slice used over the allocated timeslice. Both
of these exported as a new unaccounted_time stat.

I think it would be good to have this reported in 'time' as well, but
that is probably a separate discussion.

Signed-off-by: Justin TerAvest &lt;teravest@google.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-cgroup: Lower minimum weight from 100 to 10.</title>
<updated>2011-03-08T18:45:00+00:00</updated>
<author>
<name>Justin TerAvest</name>
<email>teravest@google.com</email>
</author>
<published>2011-03-08T18:45:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=df457f845e5449be2e7d96668791f789b3770ac7'/>
<id>df457f845e5449be2e7d96668791f789b3770ac7</id>
<content type='text'>
We've found that we still get good, useful isolation at weights this
low. I'd like to adjust the minimum so that any other changes can take
these values into account.

Signed-off-by: Justin TerAvest &lt;teravest@google.com&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We've found that we still get good, useful isolation at weights this
low. I'd like to adjust the minimum so that any other changes can take
these values into account.

Signed-off-by: Justin TerAvest &lt;teravest@google.com&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blkio-throttle: limit max iops value to UINT_MAX</title>
<updated>2010-10-01T19:16:41+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2010-10-01T19:16:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9355aede5a3c4975e0ba8bbfe2b9d1fd73308916'/>
<id>9355aede5a3c4975e0ba8bbfe2b9d1fd73308916</id>
<content type='text'>
- Limit max iops value to UINT_MAX and return error to user if value is more
  than that instead of accepting bigger values and truncating implicitly.

Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Limit max iops value to UINT_MAX and return error to user if value is more
  than that instead of accepting bigger values and truncating implicitly.

Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blkio: Recalculate the throttled bio dispatch time upon throttle limit change</title>
<updated>2010-10-01T12:49:49+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2010-10-01T12:49:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fe0714377ee2ca161bf2afb7773e22f15f1786d4'/>
<id>fe0714377ee2ca161bf2afb7773e22f15f1786d4</id>
<content type='text'>
o Currently any cgroup throttle limit changes are processed asynchronousy and
  the change does not take affect till a new bio is dispatched from same group.

o It might happen that a user sets a redicuously low limit on throttling.
  Say 1 bytes per second on reads. In such cases simple operations like mount
  a disk can wait for a very long time.

o Once bio is throttled, there is no easy way to come out of that wait even if
  user increases the read limit later.

o This patch fixes it. Now if a user changes the cgroup limits, we recalculate
  the bio dispatch time according to new limits.

o Can't take queueu lock under blkcg_lock, hence after the change I wake
  up the dispatch thread again which recalculates the time. So there are some
  variables being synchronized across two threads without lock and I had to
  make use of barriers. Hoping I have used barriers correctly. Any review of
  memory barrier code especially will help.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
o Currently any cgroup throttle limit changes are processed asynchronousy and
  the change does not take affect till a new bio is dispatched from same group.

o It might happen that a user sets a redicuously low limit on throttling.
  Say 1 bytes per second on reads. In such cases simple operations like mount
  a disk can wait for a very long time.

o Once bio is throttled, there is no easy way to come out of that wait even if
  user increases the read limit later.

o This patch fixes it. Now if a user changes the cgroup limits, we recalculate
  the bio dispatch time according to new limits.

o Can't take queueu lock under blkcg_lock, hence after the change I wake
  up the dispatch thread again which recalculates the time. So there are some
  variables being synchronized across two threads without lock and I had to
  make use of barriers. Hoping I have used barriers correctly. Any review of
  memory barrier code especially will help.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-cgroup: cgroup changes for IOPS limit support</title>
<updated>2010-09-16T06:42:58+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2010-09-15T21:06:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7702e8f45b0a3bb262b9366c60beb5445758d94c'/>
<id>7702e8f45b0a3bb262b9366c60beb5445758d94c</id>
<content type='text'>
o cgroup changes for IOPS throttling rules.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
o cgroup changes for IOPS throttling rules.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
