summaryrefslogtreecommitdiff
path: root/drivers/net/vrf.c
AgeCommit message (Collapse)Author
2015-09-17netfilter: Pass net into okfnEric W. Biederman
This is immediately motivated by the bridge code that chains functions that call into netfilter. Without passing net into the okfns the bridge code would need to guess about the best expression for the network namespace to process packets in. As net is frequently one of the first things computed in continuation functions after netfilter has done it's job passing in the desired network namespace is in many cases a code simplification. To support this change the function dst_output_okfn is introduced to simplify passing dst_output as an okfn. For the moment dst_output_okfn just silently drops the struct net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17netfilter: Pass struct net into the netfilter hooksEric W. Biederman
Pass a network namespace parameter into the netfilter hooks. At the call site of the netfilter hooks the path a packet is taking through the network stack is well known which allows the network namespace to be easily and reliabily. This allows the replacement of magic code like "dev_net(state->in?:state->out)" that appears at the start of most netfilter hooks with "state->net". In almost all cases the network namespace passed in is derived from the first network device passed in, guaranteeing those paths will not see any changes in practice. The exceptions are: xfrm/xfrm_output.c:xfrm_output_resume() xs_net(skb_dst(skb)->xfrm) ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont() ip_vs_conn_net(cp) ipvs/ip_vs_xmit.c:ip_vs_send_or_cont() ip_vs_conn_net(cp) ipv4/raw.c:raw_send_hdrinc() sock_net(sk) ipv6/ip6_output.c:ip6_xmit() sock_net(sk) ipv6/ndisc.c:ndisc_send_skb() dev_net(skb->dev) not dev_net(dst->dev) ipv6/raw.c:raw6_send_hdrinc() sock_net(sk) br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev In all cases these exceptions seem to be a better expression for the network namespace the packet is being processed in then the historic "dev_net(in?in:out)". I am documenting them in case something odd pops up and someone starts trying to track down what happened. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15net: Add FIB table id to rtableDavid Ahern
Add the FIB table id to rtable to make the information available for IPv4 as it is for IPv6. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28net: Add ethernet header for pass through VRF deviceDavid Ahern
The change to use a custom dst broke tcpdump captures on the VRF device: $ tcpdump -n -i vrf10 ... 05:32:29.009362 IP 10.2.1.254 > 10.2.1.2: ICMP echo request, id 21989, seq 1, length 64 05:32:29.009855 00:00:40:01:8d:36 > 45:00:00:54:d6:6f, ethertype Unknown (0x0a02), length 84: 0x0000: 0102 0a02 01fe 0000 9181 55e5 0001 bd11 ..........U..... 0x0010: da55 0000 0000 bb5d 0700 0000 0000 1011 .U.....]........ 0x0020: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............! 0x0030: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01 0x0040: 3233 3435 3637 234567 Local packets going through the VRF device are missing an ethernet header. Fix by adding one and then stripping it off before pushing back to the IP stack. With this patch you get the expected dumps: ... 05:36:15.713944 IP 10.2.1.254 > 10.2.1.2: ICMP echo request, id 23795, seq 1, length 64 05:36:15.714160 IP 10.2.1.2 > 10.2.1.254: ICMP echo reply, id 23795, seq 1, length 64 ... Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20route: move lwtunnel state to dst_entryJiri Benc
Currently, the lwtunnel state resides in per-protocol data. This is a problem if we encapsulate ipv6 traffic in an ipv4 tunnel (or vice versa). The xmit function of the tunnel does not know whether the packet has been routed to it by ipv4 or ipv6, yet it needs the lwtstate data. Moving the lwtstate data to dst_entry makes such inter-protocol tunneling possible. As a bonus, this brings a nice diffstat. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20vrf: ndo_add|del_slave drop unnecessary checksNikolay Aleksandrov
When ndo_add|del_slave ops are used, they're taken from the respective master device's netdev ops, so if the master device is a VRF only then the VRF ops will get called thus no need to check the type of the master. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20vrf: move vrf_insert_slave so we can drop a goto labelNikolay Aleksandrov
We can simplify do_vrf_add_slave by moving vrf_insert_slave in the end of the enslaving and thus eliminate an error goto label. It always succeeds and isn't needed before that anyway. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20vrf: remove unnecessary duplicate checkNikolay Aleksandrov
The upper/lower functions already check for duplicate slaves so no need to do it again. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20vrf: don't panic on cache create failureNikolay Aleksandrov
It's pointless to panic on cache create failure when that case is handled and even more so since it's not a kernel-wide fatal problem so don't panic. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20vrf: plug skb leaksNikolay Aleksandrov
Currently whenever a packet different from ETH_P_IP is sent through the VRF device it is leaked so plug the leaks and properly drop these packets. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-18vrf: simplify the netdev notifier functionNikolay Aleksandrov
We can drop the check because if vrf_ptr is present then we must have the vrf device as a master and since we're running with rtnl it can't go away. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-18vrf: don't check for dstats and rth in uninit pathNikolay Aleksandrov
dstats and rth are always present because we fail the device registration if they can't be allocated in vrf_init() (ndo_init) so drop the unnecessary checks. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-18vrf: drop unused num_slaves memberNikolay Aleksandrov
slave_queue has a num_slaves member which is unused, drop it. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-18vrf: drop unnecessary dev refcnt changesNikolay Aleksandrov
netdev_master_upper_dev_link/unlink already do a dev_hold/put on the devices being linked, so no need to take another reference. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13net: Introduce VRF device driverDavid Ahern
This driver borrows heavily from IPvlan and teaming drivers. Routing domains (VRF-lite) are created by instantiating a VRF master device with an associated table and enslaving all routed interfaces that participate in the domain. As part of the enslavement, all connected routes for the enslaved devices are moved to the table associated with the VRF device. Outgoing sockets must bind to the VRF device to function. Standard FIB rules bind the VRF device to tables and regular fib rule processing is followed. Routed traffic through the box, is forwarded by using the VRF device as the IIF and following the IIF rule to a table that is mated with the VRF. Example: Create vrf 1: ip link add vrf1 type vrf table 5 ip rule add iif vrf1 table 5 ip rule add oif vrf1 table 5 ip route add table 5 prohibit default ip link set vrf1 up Add interface to vrf 1: ip link set eth1 master vrf1 Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>