<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/net/team, branch v3.14.56</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>team: don't traverse port list using rcu in team_set_mac_address</title>
<updated>2015-03-18T12:31:23+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@resnulli.us</email>
</author>
<published>2015-03-04T07:36:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=67b58fb7d3e7b1e393609e14d95d02a14d7328db'/>
<id>67b58fb7d3e7b1e393609e14d95d02a14d7328db</id>
<content type='text'>
[ Upstream commit 9215f437b85da339a7dfe3db6e288637406f88b2 ]

Currently the list is traversed using rcu variant. That is not correct
since dev_set_mac_address can be called which eventually calls
rtmsg_ifinfo_build_skb and there, skb allocation can sleep. So fix this
by remove the rcu usage here.

Fixes: 3d249d4ca7 "net: introduce ethernet teaming device"
Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 9215f437b85da339a7dfe3db6e288637406f88b2 ]

Currently the list is traversed using rcu variant. That is not correct
since dev_set_mac_address can be called which eventually calls
rtmsg_ifinfo_build_skb and there, skb allocation can sleep. So fix this
by remove the rcu usage here.

Fixes: 3d249d4ca7 "net: introduce ethernet teaming device"
Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>team: fix possible null pointer dereference in team_handle_frame</title>
<updated>2015-03-18T12:31:22+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@resnulli.us</email>
</author>
<published>2015-02-23T13:02:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fdb35e5a1be06aa0d37dbdc7a987a792011c8bf9'/>
<id>fdb35e5a1be06aa0d37dbdc7a987a792011c8bf9</id>
<content type='text'>
[ Upstream commit 57e595631904c827cfa1a0f7bbd7cc9a49da5745 ]

Currently following race is possible in team:

CPU0                                        CPU1
                                            team_port_del
                                              team_upper_dev_unlink
                                                priv_flags &amp;= ~IFF_TEAM_PORT
team_handle_frame
  team_port_get_rcu
    team_port_exists
      priv_flags &amp; IFF_TEAM_PORT == 0
    return NULL (instead of port got
                 from rx_handler_data)
                                              netdev_rx_handler_unregister

The thing is that the flag is removed before rx_handler is unregistered.
If team_handle_frame is called in between, team_port_exists returns 0
and team_port_get_rcu will return NULL.
So do not check the flag here. It is guaranteed by netdev_rx_handler_unregister
that team_handle_frame will always see valid rx_handler_data pointer.

Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 57e595631904c827cfa1a0f7bbd7cc9a49da5745 ]

Currently following race is possible in team:

CPU0                                        CPU1
                                            team_port_del
                                              team_upper_dev_unlink
                                                priv_flags &amp;= ~IFF_TEAM_PORT
team_handle_frame
  team_port_get_rcu
    team_port_exists
      priv_flags &amp; IFF_TEAM_PORT == 0
    return NULL (instead of port got
                 from rx_handler_data)
                                              netdev_rx_handler_unregister

The thing is that the flag is removed before rx_handler is unregistered.
If team_handle_frame is called in between, team_port_exists returns 0
and team_port_get_rcu will return NULL.
So do not check the flag here. It is guaranteed by netdev_rx_handler_unregister
that team_handle_frame will always see valid rx_handler_data pointer.

Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin</title>
<updated>2015-01-27T16:18:53+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@resnulli.us</email>
</author>
<published>2015-01-14T17:15:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=52bf2a12aa7ee1701fa52c3206e3c0188e433aeb'/>
<id>52bf2a12aa7ee1701fa52c3206e3c0188e433aeb</id>
<content type='text'>
[ Upstream commit b0d11b42785b70e19bc6a3122eead3f7969a7589 ]

This patch is fixing a race condition that may cause setting
count_pending to -1, which results in unwanted big bulk of arp messages
(in case of "notify peers").

Consider following scenario:

count_pending == 2
   CPU0                                           CPU1
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 1)
					  schedule_delayed_work
 team_notify_peers
   atomic_add (adding 1 to count_pending)
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 1)
					  schedule_delayed_work
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 0)
   schedule_delayed_work
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to -1)

Fix this race by using atomic_dec_if_positive - that will prevent
count_pending running under 0.

Fixes: fc423ff00df3a1955441 ("team: add peer notification")
Fixes: 492b200efdd20b8fcfd  ("team: add support for sending multicast rejoins")
Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b0d11b42785b70e19bc6a3122eead3f7969a7589 ]

This patch is fixing a race condition that may cause setting
count_pending to -1, which results in unwanted big bulk of arp messages
(in case of "notify peers").

Consider following scenario:

count_pending == 2
   CPU0                                           CPU1
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 1)
					  schedule_delayed_work
 team_notify_peers
   atomic_add (adding 1 to count_pending)
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 1)
					  schedule_delayed_work
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to 0)
   schedule_delayed_work
					team_notify_peers_work
					  atomic_dec_and_test (dec count_pending to -1)

Fix this race by using atomic_dec_if_positive - that will prevent
count_pending running under 0.

Fixes: fc423ff00df3a1955441 ("team: add peer notification")
Fixes: 492b200efdd20b8fcfd  ("team: add support for sending multicast rejoins")
Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>team: avoid race condition in scheduling delayed work</title>
<updated>2014-10-15T06:36:42+00:00</updated>
<author>
<name>Joe Lawrence</name>
<email>Joe.Lawrence@stratus.com</email>
</author>
<published>2014-10-03T13:58:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5c4b226c22294a1746f386091da274016cbce394'/>
<id>5c4b226c22294a1746f386091da274016cbce394</id>
<content type='text'>
[ Upstream commit 47549650abd13d873fd2e5fc218db19e21031074 ]

When team_notify_peers and team_mcast_rejoin are called, they both reset
their respective .count_pending atomic variable. Then when the actual
worker function is executed, the variable is atomically decremented.
This pattern introduces a potential race condition where the
.count_pending rolls over and the worker function keeps rescheduling
until .count_pending decrements to zero again:

THREAD 1                           THREAD 2

========                           ========
team_notify_peers(teamX)
  atomic_set count_pending = 1
  schedule_delayed_work
                                   team_notify_peers(teamX)
                                   atomic_set count_pending = 1
team_notify_peers_work
  atomic_dec_and_test
    count_pending = 0
  (return)
                                   schedule_delayed_work
                                   team_notify_peers_work
                                   atomic_dec_and_test
                                     count_pending = -1
                                   schedule_delayed_work
                                   (repeat until count_pending = 0)

Instead of assigning a new value to .count_pending, use atomic_add to
tack-on the additional desired worker function invocations.

Signed-off-by: Joe Lawrence &lt;joe.lawrence@stratus.com&gt;
Acked-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Fixes: fc423ff00df3a19554414ee ("team: add peer notification")
Fixes: 492b200efdd20b8fcfdac87 ("team: add support for sending multicast rejoins")
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 47549650abd13d873fd2e5fc218db19e21031074 ]

When team_notify_peers and team_mcast_rejoin are called, they both reset
their respective .count_pending atomic variable. Then when the actual
worker function is executed, the variable is atomically decremented.
This pattern introduces a potential race condition where the
.count_pending rolls over and the worker function keeps rescheduling
until .count_pending decrements to zero again:

THREAD 1                           THREAD 2

========                           ========
team_notify_peers(teamX)
  atomic_set count_pending = 1
  schedule_delayed_work
                                   team_notify_peers(teamX)
                                   atomic_set count_pending = 1
team_notify_peers_work
  atomic_dec_and_test
    count_pending = 0
  (return)
                                   schedule_delayed_work
                                   team_notify_peers_work
                                   atomic_dec_and_test
                                     count_pending = -1
                                   schedule_delayed_work
                                   (repeat until count_pending = 0)

Instead of assigning a new value to .count_pending, use atomic_add to
tack-on the additional desired worker function invocations.

Signed-off-by: Joe Lawrence &lt;joe.lawrence@stratus.com&gt;
Acked-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Fixes: fc423ff00df3a19554414ee ("team: add peer notification")
Fixes: 492b200efdd20b8fcfdac87 ("team: add support for sending multicast rejoins")
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>team: fix mtu setting</title>
<updated>2014-06-26T19:15:39+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@resnulli.us</email>
</author>
<published>2014-05-29T18:46:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=03bcf77669fbe71906c92b25a450e9b1dc2b4ee4'/>
<id>03bcf77669fbe71906c92b25a450e9b1dc2b4ee4</id>
<content type='text'>
[ Upstream commit 9d0d68faea6962d62dd501cd6e71ce5cc8ed262b ]

Now it is not possible to set mtu to team device which has a port
enslaved to it. The reason is that when team_change_mtu() calls
dev_set_mtu() for port device, notificator for NETDEV_PRECHANGEMTU
event is called and team_device_event() returns NOTIFY_BAD forbidding
the change. So fix this by returning NOTIFY_DONE here in case team is
changing mtu in team_change_mtu().

Introduced-by: 3d249d4c "net: introduce ethernet teaming device"
Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Acked-by: Flavio Leitner &lt;fbl@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 9d0d68faea6962d62dd501cd6e71ce5cc8ed262b ]

Now it is not possible to set mtu to team device which has a port
enslaved to it. The reason is that when team_change_mtu() calls
dev_set_mtu() for port device, notificator for NETDEV_PRECHANGEMTU
event is called and team_device_event() returns NOTIFY_BAD forbidding
the change. So fix this by returning NOTIFY_DONE here in case team is
changing mtu in team_change_mtu().

Introduced-by: 3d249d4c "net: introduce ethernet teaming device"
Signed-off-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Acked-by: Flavio Leitner &lt;fbl@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netdevice: add queue selection fallback handler for ndo_select_queue</title>
<updated>2014-02-17T05:36:34+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-02-16T14:55:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=99932d4fc03a13bb3e94938fe25458fabc8f2fc3'/>
<id>99932d4fc03a13bb3e94938fe25458fabc8f2fc3</id>
<content type='text'>
Add a new argument for ndo_select_queue() callback that passes a
fallback handler. This gets invoked through netdev_pick_tx();
fallback handler is currently __netdev_pick_tx() as most drivers
invoke this function within their customized implementation in
case for skbs that don't need any special handling. This fallback
handler can then be replaced on other call-sites with different
queue selection methods (e.g. in packet sockets, pktgen etc).

This also has the nice side-effect that __netdev_pick_tx() is
then only invoked from netdev_pick_tx() and export of that
function to modules can be undone.

Suggested-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new argument for ndo_select_queue() callback that passes a
fallback handler. This gets invoked through netdev_pick_tx();
fallback handler is currently __netdev_pick_tx() as most drivers
invoke this function within their customized implementation in
case for skbs that don't need any special handling. This fallback
handler can then be replaced on other call-sites with different
queue selection methods (e.g. in packet sockets, pktgen etc).

This also has the nice side-effect that __netdev_pick_tx() is
then only invoked from netdev_pick_tx() and export of that
function to modules can be undone.

Suggested-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>team: Don't allow team devices to change network namespaces.</title>
<updated>2014-01-23T05:57:17+00:00</updated>
<author>
<name>Weilong Chen</name>
<email>chenweilong@huawei.com</email>
</author>
<published>2014-01-23T03:07:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=99301ba11cb72f68f4e43e5778106035bd6965c4'/>
<id>99301ba11cb72f68f4e43e5778106035bd6965c4</id>
<content type='text'>
Like bonding, team as netdevice doesn't cross netns boundaries.

Team ports and team itself live in same netns.

Signed-off-by: Weilong Chen &lt;chenweilong@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Like bonding, team as netdevice doesn't cross netns boundaries.

Team ports and team itself live in same netns.

Signed-off-by: Weilong Chen &lt;chenweilong@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>random32: add prandom_u32_max and convert open coded users</title>
<updated>2014-01-22T07:17:20+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-01-22T01:29:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f337db64af059c9a94278a8b0ab97d87259ff62f'/>
<id>f337db64af059c9a94278a8b0ab97d87259ff62f</id>
<content type='text'>
Many functions have open coded a function that returns a random
number in range [0,N-1]. Under the assumption that we have a PRNG
such as taus113 with being well distributed in [0, ~0U] space,
we can implement such a function as uword t = (n*m')&gt;&gt;32, where
m' is a random number obtained from PRNG, n the right open interval
border and t our resulting random number, with n,m',t in u32 universe.

Lets go with Joe and simply call it prandom_u32_max(), although
technically we have an right open interval endpoint, but that we
have documented. Other users can further be migrated to the new
prandom_u32_max() function later on; for now, we need to make sure
to migrate reciprocal_divide() users for the reciprocal_divide()
follow-up fixup since their function signatures are going to change.

Joint work with Hannes Frederic Sowa.

Cc: Jakub Zawadzki &lt;darkjames-ws@darkjames.pl&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Many functions have open coded a function that returns a random
number in range [0,N-1]. Under the assumption that we have a PRNG
such as taus113 with being well distributed in [0, ~0U] space,
we can implement such a function as uword t = (n*m')&gt;&gt;32, where
m' is a random number obtained from PRNG, n the right open interval
border and t our resulting random number, with n,m',t in u32 universe.

Lets go with Joe and simply call it prandom_u32_max(), although
technically we have an right open interval endpoint, but that we
have documented. Other users can further be migrated to the new
prandom_u32_max() function later on; for now, we need to make sure
to migrate reciprocal_divide() users for the reciprocal_divide()
follow-up fixup since their function signatures are going to change.

Joint work with Hannes Frederic Sowa.

Cc: Jakub Zawadzki &lt;darkjames-ws@darkjames.pl&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>team: block mtu change before it happens via NETDEV_PRECHANGEMTU</title>
<updated>2014-01-17T01:15:42+00:00</updated>
<author>
<name>Veaceslav Falico</name>
<email>vfalico@redhat.com</email>
</author>
<published>2014-01-15T23:02:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b01f236c66b214a73816a4083050f73481638a72'/>
<id>b01f236c66b214a73816a4083050f73481638a72</id>
<content type='text'>
Now it catches the NETDEV_CHANGEMTU notification, which is signaled after
the actual change happened on the device, and returns NOTIFY_BAD, so that
the change on the device is reverted.

This might be quite costly and messy, so use the new NETDEV_PRECHANGEMTU to
catch the MTU change before the actual change happens and signal that it's
forbidden to do it.

CC: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: Veaceslav Falico &lt;vfalico@redhat.com&gt;
Acked-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now it catches the NETDEV_CHANGEMTU notification, which is signaled after
the actual change happened on the device, and returns NOTIFY_BAD, so that
the change on the device is reverted.

This might be quite costly and messy, so use the new NETDEV_PRECHANGEMTU to
catch the MTU change before the actual change happens and signal that it's
forbidden to do it.

CC: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: Veaceslav Falico &lt;vfalico@redhat.com&gt;
Acked-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: core: explicitly select a txq before doing l2 forwarding</title>
<updated>2014-01-10T18:23:08+00:00</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2014-01-10T08:18:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f663dd9aaf9ed124f25f0f8452edf238f087ad50'/>
<id>f663dd9aaf9ed124f25f0f8452edf238f087ad50</id>
<content type='text'>
Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The
will cause several issues:

- NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan
  instead of lower device which misses the necessary txq synchronization for
  lower device such as txq stopping or frozen required by dev watchdog or
  control path.
- dev_hard_start_xmit() was called with NULL txq which bypasses the net device
  watchdog.
- dev_hard_start_xmit() does not check txq everywhere which will lead a crash
  when tso is disabled for lower device.

Fix this by explicitly introducing a new param for .ndo_select_queue() for just
selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also
extended to accept this parameter and dev_queue_xmit_accel() was used to do l2
forwarding transmission.

With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need
to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep
a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of
dev_queue_xmit() to do the transmission.

In the future, it was also required for macvtap l2 forwarding support since it
provides a necessary synchronization method.

Cc: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Cc: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Acked-by: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The
will cause several issues:

- NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan
  instead of lower device which misses the necessary txq synchronization for
  lower device such as txq stopping or frozen required by dev watchdog or
  control path.
- dev_hard_start_xmit() was called with NULL txq which bypasses the net device
  watchdog.
- dev_hard_start_xmit() does not check txq everywhere which will lead a crash
  when tso is disabled for lower device.

Fix this by explicitly introducing a new param for .ndo_select_queue() for just
selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also
extended to accept this parameter and dev_queue_xmit_accel() was used to do l2
forwarding transmission.

With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need
to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep
a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of
dev_queue_xmit() to do the transmission.

In the future, it was also required for macvtap l2 forwarding support since it
provides a necessary synchronization method.

Cc: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Cc: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Acked-by: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
