<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/net/sch_generic.h, branch v2.6.36-rc5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6</title>
<updated>2010-07-07T22:59:38+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2010-07-07T22:59:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=597e608a8492d662736c9bc6aa507dbf1cadc17d'/>
<id>597e608a8492d662736c9bc6aa507dbf1cadc17d</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>net: decreasing real_num_tx_queues needs to flush qdisc</title>
<updated>2010-07-03T04:59:07+00:00</updated>
<author>
<name>John Fastabend</name>
<email>john.r.fastabend@intel.com</email>
</author>
<published>2010-07-01T13:21:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f0796d5c73e59786d09a1e617689d1d415f2db44'/>
<id>f0796d5c73e59786d09a1e617689d1d415f2db44</id>
<content type='text'>
Reducing real_num_queues needs to flush the qdisc otherwise
skbs with queue_mappings greater then real_num_tx_queues can
be sent to the underlying driver.

The flow for this is,

dev_queue_xmit()
	dev_pick_tx()
		skb_tx_hash()  =&gt; hash using real_num_tx_queues
		skb_set_queue_mapping()
	...
	qdisc_enqueue_root() =&gt; enqueue skb on txq from hash
...
dev-&gt;real_num_tx_queues -= n
...
sch_direct_xmit()
	dev_hard_start_xmit()
		ndo_start_xmit(skb,dev) =&gt; skb queue set with old hash

skbs are enqueued on the qdisc with skb-&gt;queue_mapping set
0 &lt; queue_mappings &lt; real_num_tx_queues.  When the driver
decreases real_num_tx_queues skb's may be dequeued from the
qdisc with a queue_mapping greater then real_num_tx_queues.

This fixes a case in ixgbe where this was occurring with DCB
and FCoE. Because the driver is using queue_mapping to map
skbs to tx descriptor rings we can potentially map skbs to
rings that no longer exist.

Signed-off-by: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Tested-by: Ross Brattain &lt;ross.b.brattain@intel.com&gt;
Signed-off-by: Jeff Kirsher &lt;jeffrey.t.kirsher@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reducing real_num_queues needs to flush the qdisc otherwise
skbs with queue_mappings greater then real_num_tx_queues can
be sent to the underlying driver.

The flow for this is,

dev_queue_xmit()
	dev_pick_tx()
		skb_tx_hash()  =&gt; hash using real_num_tx_queues
		skb_set_queue_mapping()
	...
	qdisc_enqueue_root() =&gt; enqueue skb on txq from hash
...
dev-&gt;real_num_tx_queues -= n
...
sch_direct_xmit()
	dev_hard_start_xmit()
		ndo_start_xmit(skb,dev) =&gt; skb queue set with old hash

skbs are enqueued on the qdisc with skb-&gt;queue_mapping set
0 &lt; queue_mappings &lt; real_num_tx_queues.  When the driver
decreases real_num_tx_queues skb's may be dequeued from the
qdisc with a queue_mapping greater then real_num_tx_queues.

This fixes a case in ixgbe where this was occurring with DCB
and FCoE. Because the driver is using queue_mapping to map
skbs to tx descriptor rings we can potentially map skbs to
rings that no longer exist.

Signed-off-by: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Tested-by: Ross Brattain &lt;ross.b.brattain@intel.com&gt;
Signed-off-by: Jeff Kirsher &lt;jeffrey.t.kirsher@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: qdisc_reset_all_tx is calling qdisc_reset without qdisc_lock</title>
<updated>2010-07-03T04:59:07+00:00</updated>
<author>
<name>John Fastabend</name>
<email>john.r.fastabend@intel.com</email>
</author>
<published>2010-07-01T13:21:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4ef6acff83222f4496ceef7d1f0ee9e50a5bb403'/>
<id>4ef6acff83222f4496ceef7d1f0ee9e50a5bb403</id>
<content type='text'>
When calling qdisc_reset() the qdisc lock needs to be held.  In
this case there is at least one driver i4l which is using this
without holding the lock.  Add the locking here.

Signed-off-by: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Signed-off-by: Jeff Kirsher &lt;jeffrey.t.kirsher@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When calling qdisc_reset() the qdisc lock needs to be held.  In
this case there is at least one driver i4l which is using this
without holding the lock.  Add the locking here.

Signed-off-by: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Signed-off-by: Jeff Kirsher &lt;jeffrey.t.kirsher@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>act_mirred: don't clone skb when skb isn't shared</title>
<updated>2010-06-29T06:24:32+00:00</updated>
<author>
<name>Changli Gao</name>
<email>xiaosuo@gmail.com</email>
</author>
<published>2010-06-24T16:25:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=210d6de78c5d7c785fc532556cea340e517955e1'/>
<id>210d6de78c5d7c785fc532556cea340e517955e1</id>
<content type='text'>
don't clone skb when skb isn't shared

When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need
to clone a new skb. As the skb will be freed after this function returns, we
can use it freely once we get a reference to it.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
----
 include/net/sch_generic.h |   11 +++++++++--
 net/sched/act_mirred.c    |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)
Signed-off-by: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
don't clone skb when skb isn't shared

When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need
to clone a new skb. As the skb will be freed after this function returns, we
can use it freely once we get a reference to it.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
----
 include/net/sch_generic.h |   11 +++++++++--
 net/sched/act_mirred.c    |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)
Signed-off-by: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add additional lock to qdisc to increase throughput</title>
<updated>2010-06-02T12:09:29+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-06-02T12:09:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=79640a4ca6955e3ebdb7038508fa7a0cd7fa5527'/>
<id>79640a4ca6955e3ebdb7038508fa7a0cd7fa5527</id>
<content type='text'>
When many cpus compete for sending frames on a given qdisc, the qdisc
spinlock suffers from very high contention.

The cpu owning __QDISC_STATE_RUNNING bit has same priority to acquire
the lock, and cannot dequeue packets fast enough, since it must wait for
this lock for each dequeued packet.

One solution to this problem is to force all cpus spinning on a second
lock before trying to get the main lock, when/if they see
__QDISC_STATE_RUNNING already set.

The owning cpu then compete with at most one other cpu for the main
lock, allowing for higher dequeueing rate.

Based on a previous patch from Alexander Duyck. I added the heuristic to
avoid the atomic in fast path, and put the new lock far away from the
cache line used by the dequeue worker. Also try to release the busylock
lock as late as possible.

Tests with following script gave a boost from ~50.000 pps to ~600.000
pps on a dual quad core machine (E5450 @3.00GHz), tg3 driver.
(A single netperf flow can reach ~800.000 pps on this platform)

for j in `seq 0 3`; do
  for i in `seq 0 7`; do
    netperf -H 192.168.0.1 -t UDP_STREAM -l 60 -N -T $i -- -m 6 &amp;
  done
done

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Acked-by: Alexander Duyck &lt;alexander.h.duyck@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When many cpus compete for sending frames on a given qdisc, the qdisc
spinlock suffers from very high contention.

The cpu owning __QDISC_STATE_RUNNING bit has same priority to acquire
the lock, and cannot dequeue packets fast enough, since it must wait for
this lock for each dequeued packet.

One solution to this problem is to force all cpus spinning on a second
lock before trying to get the main lock, when/if they see
__QDISC_STATE_RUNNING already set.

The owning cpu then compete with at most one other cpu for the main
lock, allowing for higher dequeueing rate.

Based on a previous patch from Alexander Duyck. I added the heuristic to
avoid the atomic in fast path, and put the new lock far away from the
cache line used by the dequeue worker. Also try to release the busylock
lock as late as possible.

Tests with following script gave a boost from ~50.000 pps to ~600.000
pps on a dual quad core machine (E5450 @3.00GHz), tg3 driver.
(A single netperf flow can reach ~800.000 pps on this platform)

for j in `seq 0 3`; do
  for i in `seq 0 7`; do
    netperf -H 192.168.0.1 -t UDP_STREAM -l 60 -N -T $i -- -m 6 &amp;
  done
done

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Acked-by: Alexander Duyck &lt;alexander.h.duyck@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: QDISC_STATE_RUNNING dont need atomic bit ops</title>
<updated>2010-06-02T10:24:13+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-06-02T10:24:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=371121057607e3127e19b3fa094330181b5b031e'/>
<id>371121057607e3127e19b3fa094330181b5b031e</id>
<content type='text'>
__QDISC_STATE_RUNNING is always changed while qdisc lock is held.

We can avoid two atomic operations in xmit path, if we move this bit in
a new __state container.

Location of this __state container is carefully chosen so that fast path
only dirties one qdisc cache line.

THROTTLED bit could later be moved into this __state location too, to
avoid dirtying first qdisc cache line.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__QDISC_STATE_RUNNING is always changed while qdisc lock is held.

We can avoid two atomic operations in xmit path, if we move this bit in
a new __state container.

Location of this __state container is carefully chosen so that fast path
only dirties one qdisc cache line.

THROTTLED bit could later be moved into this __state location too, to
avoid dirtying first qdisc cache line.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Define accessors to manipulate QDISC_STATE_RUNNING</title>
<updated>2010-06-02T10:23:51+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-06-02T10:23:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bc135b23d01acf7ee926aaf75b0020c86d3869f9'/>
<id>bc135b23d01acf7ee926aaf75b0020c86d3869f9</id>
<content type='text'>
Define three helpers to manipulate QDISC_STATE_RUNNIG flag, that a
second patch will move on another location.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Define three helpers to manipulate QDISC_STATE_RUNNIG flag, that a
second patch will move on another location.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gen_estimator: deadlock fix</title>
<updated>2010-04-02T01:38:48+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-03-31T07:06:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5d944c640b4ae5f37c537acf491c2f0eb89fa0d6'/>
<id>5d944c640b4ae5f37c537acf491c2f0eb89fa0d6</id>
<content type='text'>
One of my test machine got a deadlock during "tc" sessions,
adding/deleting classes &amp; filters, using traffic estimators.

After some analysis, I believe we have a potential use after free case
in est_timer() :

spin_lock(e-&gt;stats_lock); &lt;&lt; HERE &gt;&gt;
read_lock(&amp;est_lock);
if (e-&gt;bstats == NULL)   &lt;&lt; TEST &gt;&gt;
	goto skip;

Test is done a bit late, because after estimator is killed, and before
rcu grace period elapsed, we might already have freed/reuse memory where
e-&gt;stats_locks points to (some qdisc-&gt;q.lock)

A possible fix is to respect a rcu grace period at Qdisc dismantle time.

On 64bit, sizeof(struct Qdisc) is exactly 192 bytes. Adding 16 bytes to
it (for struct rcu_head) is a problem because it might change
performance, given QDISC_ALIGNTO is 32 bytes.

This is why I also change QDISC_ALIGNTO to 64 bytes, to satisfy most
current alignment requirements.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
One of my test machine got a deadlock during "tc" sessions,
adding/deleting classes &amp; filters, using traffic estimators.

After some analysis, I believe we have a potential use after free case
in est_timer() :

spin_lock(e-&gt;stats_lock); &lt;&lt; HERE &gt;&gt;
read_lock(&amp;est_lock);
if (e-&gt;bstats == NULL)   &lt;&lt; TEST &gt;&gt;
	goto skip;

Test is done a bit late, because after estimator is killed, and before
rcu grace period elapsed, we might already have freed/reuse memory where
e-&gt;stats_locks points to (some qdisc-&gt;q.lock)

A possible fix is to respect a rcu grace period at Qdisc dismantle time.

On 64bit, sizeof(struct Qdisc) is exactly 192 bytes. Adding 16 bytes to
it (for struct rcu_head) is a problem because it might change
performance, given QDISC_ALIGNTO is 32 bytes.

This is why I also change QDISC_ALIGNTO to 64 bytes, to satisfy most
current alignment requirements.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: add head drop fifo queue</title>
<updated>2010-01-29T05:27:00+00:00</updated>
<author>
<name>Hagen Paul Pfeifer</name>
<email>hagen@jauu.net</email>
</author>
<published>2010-01-24T12:30:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=57dbb2d83d100ea601c54fe129bfde0678db5dee'/>
<id>57dbb2d83d100ea601c54fe129bfde0678db5dee</id>
<content type='text'>
This adds an additional queuing strategy, called pfifo_head_drop,
to remove the oldest skb in the case of an overflow within the queue -
the head element - instead of the last skb (tail). To remove the oldest
skb in congested situations is useful for sensor network environments
where newer packets reflect the superior information.

Reviewed-by: Florian Westphal &lt;fw@strlen.de&gt;
Acked-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Hagen Paul Pfeifer &lt;hagen@jauu.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds an additional queuing strategy, called pfifo_head_drop,
to remove the oldest skb in the case of an overflow within the queue -
the head element - instead of the last skb (tail). To remove the oldest
skb in congested situations is useful for sensor network environments
where newer packets reflect the superior information.

Reviewed-by: Florian Westphal &lt;fw@strlen.de&gt;
Acked-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Hagen Paul Pfeifer &lt;hagen@jauu.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: cleanup include/net</title>
<updated>2009-11-04T13:06:25+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2009-11-03T03:26:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fd2c3ef761fbc5e6c27fa7d40b30cda06bfcd7d8'/>
<id>fd2c3ef761fbc5e6c27fa7d40b30cda06bfcd7d8</id>
<content type='text'>
This cleanup patch puts struct/union/enum opening braces,
in first line to ease grep games.

struct something
{

becomes :

struct something {

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This cleanup patch puts struct/union/enum opening braces,
in first line to ease grep games.

struct something
{

becomes :

struct something {

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
