<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/net/ipvlan/ipvlan_core.c, branch v4.10</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>ipvlan: fix multicast processing</title>
<updated>2016-12-23T22:53:47+00:00</updated>
<author>
<name>Mahesh Bandewar</name>
<email>maheshb@google.com</email>
</author>
<published>2016-12-22T01:30:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e252536068efd1578c6e23e7323527c5e6e980bd'/>
<id>e252536068efd1578c6e23e7323527c5e6e980bd</id>
<content type='text'>
In an IPvlan setup when master is set in loopback mode e.g.

  ethtool -K eth0 set loopback on

  where eth0 is master device for IPvlan setup.

The failure is caused by the faulty logic that determines if the
packet is from TX-path vs. RX-path by just looking at the mac-
addresses on the packet while processing multicast packets.

In the loopback-mode where this crash was happening, the packets
that are sent out are reflected by the NIC and are processed on
the RX path, but mac-address check tricks into thinking this
packet is from TX path and falsely uses dev_forward_skb() to pass
packets to the slave (virtual) devices.

This patch records the path while queueing packets and eliminates
logic of looking at mac-addresses for the same decision.

------------[ cut here ]------------
kernel BUG at include/linux/skbuff.h:1737!
Call Trace:
 [&lt;ffffffff921fbbc2&gt;] dev_forward_skb+0x92/0xd0
 [&lt;ffffffffc031ac65&gt;] ipvlan_process_multicast+0x395/0x4c0 [ipvlan]
 [&lt;ffffffffc031a9a7&gt;] ? ipvlan_process_multicast+0xd7/0x4c0 [ipvlan]
 [&lt;ffffffff91cdfea7&gt;] ? process_one_work+0x147/0x660
 [&lt;ffffffff91cdff09&gt;] process_one_work+0x1a9/0x660
 [&lt;ffffffff91cdfea7&gt;] ? process_one_work+0x147/0x660
 [&lt;ffffffff91ce086d&gt;] worker_thread+0x11d/0x360
 [&lt;ffffffff91ce0750&gt;] ? rescuer_thread+0x350/0x350
 [&lt;ffffffff91ce960b&gt;] kthread+0xdb/0xe0
 [&lt;ffffffff91c05c70&gt;] ? _raw_spin_unlock_irq+0x30/0x50
 [&lt;ffffffff91ce9530&gt;] ? flush_kthread_worker+0xc0/0xc0
 [&lt;ffffffff92348b7a&gt;] ret_from_fork+0x9a/0xd0
 [&lt;ffffffff91ce9530&gt;] ? flush_kthread_worker+0xc0/0xc0

Fixes: ba35f8588f47 ("ipvlan: Defer multicast / broadcast processing to a work-queue")
Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
CC: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In an IPvlan setup when master is set in loopback mode e.g.

  ethtool -K eth0 set loopback on

  where eth0 is master device for IPvlan setup.

The failure is caused by the faulty logic that determines if the
packet is from TX-path vs. RX-path by just looking at the mac-
addresses on the packet while processing multicast packets.

In the loopback-mode where this crash was happening, the packets
that are sent out are reflected by the NIC and are processed on
the RX path, but mac-address check tricks into thinking this
packet is from TX path and falsely uses dev_forward_skb() to pass
packets to the slave (virtual) devices.

This patch records the path while queueing packets and eliminates
logic of looking at mac-addresses for the same decision.

------------[ cut here ]------------
kernel BUG at include/linux/skbuff.h:1737!
Call Trace:
 [&lt;ffffffff921fbbc2&gt;] dev_forward_skb+0x92/0xd0
 [&lt;ffffffffc031ac65&gt;] ipvlan_process_multicast+0x395/0x4c0 [ipvlan]
 [&lt;ffffffffc031a9a7&gt;] ? ipvlan_process_multicast+0xd7/0x4c0 [ipvlan]
 [&lt;ffffffff91cdfea7&gt;] ? process_one_work+0x147/0x660
 [&lt;ffffffff91cdff09&gt;] process_one_work+0x1a9/0x660
 [&lt;ffffffff91cdfea7&gt;] ? process_one_work+0x147/0x660
 [&lt;ffffffff91ce086d&gt;] worker_thread+0x11d/0x360
 [&lt;ffffffff91ce0750&gt;] ? rescuer_thread+0x350/0x350
 [&lt;ffffffff91ce960b&gt;] kthread+0xdb/0xe0
 [&lt;ffffffff91c05c70&gt;] ? _raw_spin_unlock_irq+0x30/0x50
 [&lt;ffffffff91ce9530&gt;] ? flush_kthread_worker+0xc0/0xc0
 [&lt;ffffffff92348b7a&gt;] ret_from_fork+0x9a/0xd0
 [&lt;ffffffff91ce9530&gt;] ? flush_kthread_worker+0xc0/0xc0

Fixes: ba35f8588f47 ("ipvlan: Defer multicast / broadcast processing to a work-queue")
Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
CC: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvlan: fix various issues in ipvlan_process_multicast()</title>
<updated>2016-12-23T22:53:47+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2016-12-22T02:00:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b1227d019fa98c43381ad8827baf7efbe2923ed1'/>
<id>b1227d019fa98c43381ad8827baf7efbe2923ed1</id>
<content type='text'>
1) netif_rx() / dev_forward_skb() should not be called from process
context.

2) ipvlan_count_rx() should be called with preemption disabled.

3) We should check if ipvlan-&gt;dev is up before feeding packets
to netif_rx()

4) We need to prevent device from disappearing if some packets
are in the multicast backlog.

5) One kfree_skb() should be a consume_skb() eventually

Fixes: ba35f8588f47 ("ipvlan: Defer multicast / broadcast processing to
a work-queue")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
1) netif_rx() / dev_forward_skb() should not be called from process
context.

2) ipvlan_count_rx() should be called with preemption disabled.

3) We should check if ipvlan-&gt;dev is up before feeding packets
to netif_rx()

4) We need to prevent device from disappearing if some packets
are in the multicast backlog.

5) One kfree_skb() should be a consume_skb() eventually

Fixes: ba35f8588f47 ("ipvlan: Defer multicast / broadcast processing to
a work-queue")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvlan: Introduce l3s mode</title>
<updated>2016-09-19T05:25:22+00:00</updated>
<author>
<name>Mahesh Bandewar</name>
<email>maheshb@google.com</email>
</author>
<published>2016-09-16T19:59:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4fbae7d83c98c30efcf0a2a2ac55fbb75ef5a1a5'/>
<id>4fbae7d83c98c30efcf0a2a2ac55fbb75ef5a1a5</id>
<content type='text'>
In a typical IPvlan L3 setup where master is in default-ns and
each slave is into different (slave) ns. In this setup egress
packet processing for traffic originating from slave-ns will
hit all NF_HOOKs in slave-ns as well as default-ns. However same
is not true for ingress processing. All these NF_HOOKs are
hit only in the slave-ns skipping them in the default-ns.
IPvlan in L3 mode is restrictive and if admins want to deploy
iptables rules in default-ns, this asymmetric data path makes it
impossible to do so.

This patch makes use of the l3_rcv() (added as part of l3mdev
enhancements) to perform input route lookup on RX packets without
changing the skb-&gt;dev and then uses nf_hook at NF_INET_LOCAL_IN
to change the skb-&gt;dev just before handing over skb to L4.

Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
CC: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Reviewed-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a typical IPvlan L3 setup where master is in default-ns and
each slave is into different (slave) ns. In this setup egress
packet processing for traffic originating from slave-ns will
hit all NF_HOOKs in slave-ns as well as default-ns. However same
is not true for ingress processing. All these NF_HOOKs are
hit only in the slave-ns skipping them in the default-ns.
IPvlan in L3 mode is restrictive and if admins want to deploy
iptables rules in default-ns, this asymmetric data path makes it
impossible to do so.

This patch makes use of the l3_rcv() (added as part of l3mdev
enhancements) to perform input route lookup on RX packets without
changing the skb-&gt;dev and then uses nf_hook at NF_INET_LOCAL_IN
to change the skb-&gt;dev just before handing over skb to L4.

Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
CC: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Reviewed-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvlan: Scrub skb before crossing the namespace boundry</title>
<updated>2016-07-26T04:47:26+00:00</updated>
<author>
<name>Mahesh Bandewar</name>
<email>maheshb@google.com</email>
</author>
<published>2016-07-25T21:38:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b93dd49c1a35884864027abd707889b795637f7a'/>
<id>b93dd49c1a35884864027abd707889b795637f7a</id>
<content type='text'>
The earlier patch c3aaa06d5a63 (ipvlan: scrub skb before routing
in L3 mode.) did this but only for TX path in L3 mode. This
patch extends it for both the modes for TX/RX path.

Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The earlier patch c3aaa06d5a63 (ipvlan: scrub skb before routing
in L3 mode.) did this but only for TX path in L3 mode. This
patch extends it for both the modes for TX/RX path.

Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvlan: misc changes</title>
<updated>2016-02-22T03:43:24+00:00</updated>
<author>
<name>Mahesh Bandewar</name>
<email>maheshb@google.com</email>
</author>
<published>2016-02-21T03:31:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ab5b7013db3cc637a8f19e00d71310e40db75bf6'/>
<id>ab5b7013db3cc637a8f19e00d71310e40db75bf6</id>
<content type='text'>
1. scope correction for few functions that are used in single file.
2. Adjust variables that are used in fast-path to fit into single cacheline
3. Update rcv_frame() to skip shared check for frames coming over wire

Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
1. scope correction for few functions that are used in single file.
2. Adjust variables that are used in fast-path to fit into single cacheline
3. Update rcv_frame() to skip shared check for frames coming over wire

Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvlan: scrub skb before routing in L3 mode.</title>
<updated>2016-02-22T03:43:24+00:00</updated>
<author>
<name>Mahesh Bandewar</name>
<email>maheshb@google.com</email>
</author>
<published>2016-02-21T03:31:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c3aaa06d5a63609641b7ad62ee0956f3de86c1cd'/>
<id>c3aaa06d5a63609641b7ad62ee0956f3de86c1cd</id>
<content type='text'>
Scrub skb before hitting the iptable hooks to ensure packets hit
these hooks. Set the xnet param only when the packet is crossing the
ns boundry so if the IPvlan slave and master belong to the same ns,
the param will be set to false.

Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
CC: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Scrub skb before hitting the iptable hooks to ensure packets hit
these hooks. Set the xnet param only when the packet is crossing the
ns boundry so if the IPvlan slave and master belong to the same ns,
the param will be set to false.

Signed-off-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
CC: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvlan: fix use after free of skb</title>
<updated>2015-11-17T19:39:29+00:00</updated>
<author>
<name>Sabrina Dubroca</name>
<email>sd@queasysnail.net</email>
</author>
<published>2015-11-16T21:44:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a534dc529853c69e94994aa47c1d80a03ce2c11d'/>
<id>a534dc529853c69e94994aa47c1d80a03ce2c11d</id>
<content type='text'>
ipvlan_handle_frame is a rx_handler, and when it returns a value other
than RX_HANDLER_CONSUMED (here, NET_RX_DROP aka RX_HANDLER_ANOTHER),
__netif_receive_skb_core expects that the skb still exists and will
process it further, but we just freed it.

Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ipvlan_handle_frame is a rx_handler, and when it returns a value other
than RX_HANDLER_CONSUMED (here, NET_RX_DROP aka RX_HANDLER_ANOTHER),
__netif_receive_skb_core expects that the skb still exists and will
process it further, but we just freed it.

Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvlan: fix leak in ipvlan_rcv_frame</title>
<updated>2015-11-17T19:39:28+00:00</updated>
<author>
<name>Sabrina Dubroca</name>
<email>sd@queasysnail.net</email>
</author>
<published>2015-11-16T21:34:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cf554ada0be7077906aa9a17faf151ff66e3cb8e'/>
<id>cf554ada0be7077906aa9a17faf151ff66e3cb8e</id>
<content type='text'>
Pass a **skb to ipvlan_rcv_frame so that if skb_share_check returns a
new skb, we actually use it during further processing.

It's safe to ignore the new skb in the ipvlan_xmit_* functions, because
they call ipvlan_rcv_frame with local == true, so that dev_forward_skb
is called and always takes ownership of the skb.

Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pass a **skb to ipvlan_rcv_frame so that if skb_share_check returns a
new skb, we actually use it during further processing.

It's safe to ignore the new skb in the ipvlan_xmit_* functions, because
they call ipvlan_rcv_frame with local == true, so that dev_forward_skb
is called and always takes ownership of the skb.

Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvlan: read direct ifindex instead of iflink</title>
<updated>2015-10-22T13:39:08+00:00</updated>
<author>
<name>Brenden Blanco</name>
<email>bblanco@plumgrid.com</email>
</author>
<published>2015-10-20T23:47:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=63b11e757d6dae570bc22450ec58a5b68cdf5c3c'/>
<id>63b11e757d6dae570bc22450ec58a5b68cdf5c3c</id>
<content type='text'>
In the ipv4 outbound path of an ipvlan device in l3 mode, the ifindex is
being grabbed from dev_get_iflink. This works for the physical device
case, since as the documentation of that function notes: "Physical
interfaces have the same 'ifindex' and 'iflink' values.".  However, if
the master device is a veth, and the pairs are in separate net
namespaces, the route lookup will fail with -ENODEV due to outer veth
pair being in a separate namespace from the ipvlan master/routing
namespace.

  ns0    |   ns1    |   ns2
 veth0a--|--veth0b--|--ipvl0

In ipvlan_process_v4_outbound(), a packet sent from ipvl0 in the above
configuration will pass fl.flowi4_oif == veth0a to
ip_route_output_flow(), but *net == ns1.

Notice also that ipv6 processing is not using iflink. Since there is a
discrepancy in usage, fixup both v4 and v6 case to use local dev
variable.

Tested this with l3 ipvlan on top of veth, as well as with single
physical interface in the top namespace.

Signed-off-by: Brenden Blanco &lt;bblanco@plumgrid.com&gt;
Reviewed-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Acked-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the ipv4 outbound path of an ipvlan device in l3 mode, the ifindex is
being grabbed from dev_get_iflink. This works for the physical device
case, since as the documentation of that function notes: "Physical
interfaces have the same 'ifindex' and 'iflink' values.".  However, if
the master device is a veth, and the pairs are in separate net
namespaces, the route lookup will fail with -ENODEV due to outer veth
pair being in a separate namespace from the ipvlan master/routing
namespace.

  ns0    |   ns1    |   ns2
 veth0a--|--veth0b--|--ipvl0

In ipvlan_process_v4_outbound(), a packet sent from ipvl0 in the above
configuration will pass fl.flowi4_oif == veth0a to
ip_route_output_flow(), but *net == ns1.

Notice also that ipv6 processing is not using iflink. Since there is a
discrepancy in usage, fixup both v4 and v6 case to use local dev
variable.

Tested this with l3 ipvlan on top of veth, as well as with single
physical interface in the top namespace.

Signed-off-by: Brenden Blanco &lt;bblanco@plumgrid.com&gt;
Reviewed-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Acked-by: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4, ipv6: Pass net into ip_local_out and ip6_local_out</title>
<updated>2015-10-08T11:27:02+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-10-07T21:48:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=33224b16ffccb49cf798317670389e0bfba0024c'/>
<id>33224b16ffccb49cf798317670389e0bfba0024c</id>
<content type='text'>
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
