summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2015-07-21ip_tunnel: Make ovs_tunnel_info and ovs_key_ipv4_tunnel genericThomas Graf
Rename the tunnel metadata data structures currently internal to OVS and make them generic for use by all IP tunnels. Both structures are kernel internal and will stay that way. Their members are exposed to user space through individual Netlink attributes by OVS. It will therefore be possible to extend/modify these structures without affecting user ABI. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21mpls: ip tunnel supportRoopa Prabhu
This implementation uses lwtunnel infrastructure to register hooks for mpls tunnel encaps. It picks cues from iptunnel_encaps infrastructure and previous mpls iptunnel RFC patches from Eric W. Biederman and Robert Shearman Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21lwtunnel: support dst output redirect functionRoopa Prabhu
This patch introduces lwtunnel_output function to call corresponding lwtunnels output function to xmit the packet. It adds two variants lwtunnel_output and lwtunnel_output6 for ipv4 and ipv6 respectively today. But this is subject to change when lwtstate will reside in dst or dst_metadata (as per upstream discussions). Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21ipv6: support for fib route lwtunnel encap attributesRoopa Prabhu
This patch adds support in ipv6 fib functions to parse Netlink RTA encap attributes and attach encap state data to rt6_info. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21ipv4: support for fib route lwtunnel encap attributesRoopa Prabhu
This patch adds support in ipv4 fib functions to parse user provided encap attributes and attach encap state data to fib_nh and rtable. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21lwtunnel: infrastructure for handling light weight tunnels like mplsRoopa Prabhu
Provides infrastructure to parse/dump/store encap information for light weight tunnels like mpls. Encap information for such tunnels is associated with fib routes. This infrastructure is based on previous suggestions from Eric Biederman to follow the xfrm infrastructure. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21rtnetlink: introduce new RTA_ENCAP_TYPE and RTA_ENCAP attributesRoopa Prabhu
This patch introduces two new RTA attributes to attach encap data to fib routes. Example iproute2 command to attach mpls encap data to ipv4 routes $ip route add 10.1.1.0/30 encap mpls 200 via inet 10.1.1.1 dev swp1 Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20bpf: introduce bpf_skb_vlan_push/pop() helpersAlexei Starovoitov
Allow eBPF programs attached to TC qdiscs call skb_vlan_push/pop via helper functions. These functions may change skb->data/hlen which are cached by some JITs to improve performance of ld_abs/ld_ind instructions. Therefore JITs need to recognize bpf_skb_vlan_push/pop() calls, re-compute header len and re-cache skb->data/hlen back into cpu registers. Note, skb->data/hlen are not directly accessible from the programs, so any changes to skb->data done either by these helpers or by other TC actions are safe. eBPF JIT supported by three architectures: - arm64 JIT is using bpf_load_pointer() without caching, so it's ok as-is. - x64 JIT re-caches skb->data/hlen unconditionally after vlan_push/pop calls (experiments showed that conditional re-caching is slower). - s390 JIT falls back to interpreter for now when bpf_skb_vlan_push() is present in the program (re-caching is tbd). These helpers allow more scalable handling of vlan from the programs. Instead of creating thousands of vlan netdevs on top of eth0 and attaching TC+ingress+bpf to all of them, the program can be attached to eth0 directly and manipulate vlans as necessary. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-07-17 This series contains updates to igb, ixgbe, ixgbevf, i40e, bnx2x, freescale, siena and dp83640. Jacob provides several patches to clarify the intended way to implement both SIOCSHWTSTAMP and ethtool's get_ts_info(). It is okay to support the specific filters in SIOCSHWTSTAMP by upscaling them to the generic filters. Alex Duyck provides a igb patch to pull the time stamp from the fragment before it gets added to the skb, to avoid a possible issue in which the fragment can possibly be less than IGB_RX_HDR_LEN due to the time stamp being pulled after the copybreak check. Also provides a ixgbevf patch to fold the ixgbevf_pull_tail() call into ixgbevf_add_rx_frag(), which gives the advantage that the fragment does not have to be modified after it is added to the skb. Fan provides patches for ixgbe/ixgbevf to set the receive hash type based on receive descriptor RSS type. Todd provides a fix for igb where on check for link on any media other than copper was not being detected since it was looking on the incorrect PHY page (due to the page being used gets switched before the function to check link gets executed). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20stmmac: drop custom_* fields from plat_stmmacenet_dataJoachim Eastwood
Both of these fields are unused and has been unused since they were added 3 and 5 years ago. Drop them since they are clearly not very useful. Signed-off-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20net: remove skb_frag_add_headJiri Benc
It's not used anywhere. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20switchdev: add offload_fwd_mark generator helperScott Feldman
skb->offload_fwd_mark and dev->offload_fwd_mark are 32-bit and should be unique for device and may even be unique for a sub-set of ports within device, so add switchdev helper function to generate unique marks based on port's switch ID and group_ifindex. group_ifindex would typically be the container dev's ifindex, such as the bridge's ifindex. The generator uses a global hash table to store offload_fwd_marks hashed by {switch ID, group_ifindex} key. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20net: add phys ID compare helper to test if two IDs are the sameScott Feldman
Signed-off-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20net: don't reforward packets already forwarded by offload deviceScott Feldman
Just before queuing skb for xmit on port, check if skb has been marked by switchdev port driver as already fordwarded by device. If so, drop skb. A non-zero skb->offload_fwd_mark field is set by the switchdev port driver/device on ingress to indicate the skb has already been forwarded by the device to egress ports with matching dev->skb_mark. The switchdev port driver would assign a non-zero dev->offload_skb_mark for each device port netdev during registration, for example. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20ebpf: add helper to retrieve net_cls's classid cookieDaniel Borkmann
It would be very useful to retrieve the net_cls's classid from an eBPF program to allow for a more fine-grained classification, it could be directly used or in conjunction with additional policies. I.e. docker, but also tooling such as cgexec, can easily run applications via net_cls cgroups: cgcreate -g net_cls:/foo echo 42 > foo/net_cls.classid cgexec -g net_cls:foo <prog> Thus, their respecitve classid cookie of foo can then be looked up on the egress path to apply further policies. The helper is desigend such that a non-zero value returns the cgroup id. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-20cls_cgroup: factor out classid retrievalDaniel Borkmann
Split out retrieving the cgroups net_cls classid retrieval into its own function, so that it can be reused later on from other parts of the traffic control subsystem. If there's no skb->sk, then the small helper returns 0 as well, which in cls_cgroup terms means 'could not classify'. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-17clarify implementation of ethtool's get_ts_info opJacob Keller
This patch adds some clarification about the intended way to implement both SIOCSHWTSTAMP and ethtool's get_ts_info. The HWTSTAMP API has several Rx filters which are very specific, as well as more general filters. The specific filters really only exist to support some broken hardware which can't fully implement the generic filters. This patch adds clarification that it is okay to support the specific filters in SIOCSHWTSTAMP by upscaling them to the generic filters. In addition, update the header for ethtool_ts_info to specify that drivers ought to only report the filters they support without upscaling in this manner. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Reviewed-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-07-15netlink: changes for setting and clearing protodown via netlink.Anuradha Karuppiah
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-15net core: Add protodown support.Anuradha Karuppiah
This patch introduces the proto_down flag that can be used by user space applications to notify switch drivers that errors have been detected on the device. The switch driver can react to protodown notification by doing a phys down on the associated switch port. Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/bridge/br_mdb.c Minor conflict in br_mdb.c, in 'net' we added a memset of the on-stack 'ip' variable whereas in 'net-next' we assign a new member 'vid'. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-13bridge: mdb: add vlan support for user entriesNikolay Aleksandrov
Until now all user mdb entries were added in vlan 0, this patch adds support to allow the user to specify the vlan for the entry. About the uapi change a hole in struct br_mdb_entry is used so the size and offsets are kept the same (verified with pahole and tested with older iproute2). Example: $ bridge mdb dev br0 port eth1 grp 239.0.0.1 permanent vlan 2000 dev br0 port eth1 grp 239.0.0.1 permanent vlan 200 dev br0 port eth1 grp 239.0.0.1 permanent Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Missing list head init in bluetooth hidp session creation, from Tedd Ho-Jeong An. 2) Don't leak SKB in bridge netfilter error paths, from Florian Westphal. 3) ipv6 netdevice private leak in netfilter bridging, fixed by Julien Grall. 4) Fix regression in IP over hamradio bpq encapsulation, from Ralf Baechle. 5) Fix race between rhashtable resize events and table walks, from Phil Sutter. 6) Missing validation of IFLA_VF_INFO netlink attributes, fix from Daniel Borkmann. 7) Missing security layer socket state initialization in tipc code, from Stephen Smalley. 8) Fix shared IRQ handling in boomerang 3c59x interrupt handler, from Denys Vlasenko. 9) Missing minor_idr destroy on module unload on macvtap driver, from Johannes Thumshirn. 10) Various pktgen kernel thread races, from Oleg Nesterov. 11) Fix races that can cause packets to be processed in the backlog even after a device attached to that SKB has been fully unregistered. From Julian Anastasov. 12) bcmgenet driver doesn't account packet drops vs. errors properly, fix from Petri Gynther. 13) Array index validation and off by one fix in DSA layer from Florian Fainelli * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (66 commits) can: replace timestamp as unique skb attribute ARM: dts: dra7x-evm: Prevent glitch on DCAN1 pinmux can: c_can: Fix default pinmux glitch at init can: rcar_can: unify error messages can: rcar_can: print request_irq() error code can: rcar_can: fix typo in error message can: rcar_can: print signed IRQ # can: rcar_can: fix IRQ check net: dsa: Fix off-by-one in switch address parsing net: dsa: Test array index before use net: switchdev: don't abort unsupported operations net: bcmgenet: fix accounting of packet drops vs errors cdc_ncm: update specs URL Doc: z8530book: Fix typo in API-z8530-sync-txdma-open.html net: inet_diag: always export IPV6_V6ONLY sockopt for listening sockets bridge: mdb: allow the user to delete mdb entry if there's a querier net: call rcu_read_lock early in process_backlog net: do not process device backlog during unregistration bridge: fix potential crash in __netdev_pick_tx() net: axienet: Fix devm_ioremap_resource return value check ...
2015-07-12can: replace timestamp as unique skb attributeOliver Hartkopp
Commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for overlapping CAN filters" requires the skb->tstamp to be set to check for identical CAN skbs. Without timestamping to be required by user space applications this timestamp was not generated which lead to commit 36c01245eb8 "can: fix loss of CAN frames in raw_rcv" - which forces the timestamp to be set in all CAN related skbuffs by introducing several __net_timestamp() calls. This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb() to add __net_timestamp() after skbuff creation to prevent the frame loss fixed in mainline Linux. This patch removes the timestamp dependency and uses an atomic counter to create an unique identifier together with the skbuff pointer. Btw: the new skbcnt element introduced in struct can_skb_priv has to be initialized with zero in out-of-tree drivers which are not using alloc_can{,fd}_skb() too. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-07-12Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "This update from the timer departement contains: - A series of patches which address a shortcoming in the tick broadcast code. If the broadcast device is not available or an hrtimer emulated broadcast device, some of the original assumptions lead to boot failures. I rather plugged all of the corner cases instead of only addressing the issue reported, so the change got a little larger. Has been extensivly tested on x86 and arm. - Get rid of the last holdouts using do_posix_clock_monotonic_gettime() - A regression fix for the imx clocksource driver - An update to the new state callbacks mechanism for clockevents. This is required to simplify the conversion, which will take place in 4.3" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/broadcast: Prevent NULL pointer dereference time: Get rid of do_posix_clock_monotonic_gettime cris: Replace do_posix_clock_monotonic_gettime() tick/broadcast: Unbreak CONFIG_GENERIC_CLOCKEVENTS=n build tick/broadcast: Handle spurious interrupts gracefully tick/broadcast: Check for hrtimer broadcast active early tick/broadcast: Return busy when IPI is pending tick/broadcast: Return busy if periodic mode and hrtimer broadcast tick/broadcast: Move the check for periodic mode inside state handling tick/broadcast: Prevent deep idle if no broadcast device available tick/broadcast: Make idle check independent from mode and config tick/broadcast: Sanity check the shutdown of the local clock_event tick/broadcast: Prevent hrtimer recursion clockevents: Allow set-state callbacks to be optional clocksource/imx: Define clocksource for mx27
2015-07-12Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "A single fix for a cpu hotplug race vs. interrupt descriptors: Prevent irq setup/teardown across the cpu starting/dying parts of cpu hotplug so that the starting/dying cpu has a stable view of the descriptor space. This has been an issue for all architectures in the cpu dying phase, where interrupts are migrated away from the dying cpu. In the starting phase its mostly a x86 issue vs the vector space update" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hotplug: Prevent alloc/free of irq descriptors during cpu up/down
2015-07-11Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm Pull libnvdimm fixes from Dan Williams: "1) Fixes for a handful of smatch reports (Thanks Dan C.!) and minor bug fixes (patches 1-6) 2) Correctness fixes to the BLK-mode nvdimm driver (patches 7-10). Granted these are slightly large for a -rc update. They have been out for review in one form or another since the end of May and were deferred from the merge window while we settled on the "PMEM API" for the PMEM-mode nvdimm driver (ie memremap_pmem, memcpy_to_pmem, and wmb_pmem). Now that those apis are merged we implement them in the BLK driver to guarantee that mmio aperture moves stay ordered with respect to incoming read/write requests, and that writes are flushed through those mmio-windows and platform-buffers to be persistent on media. These pass the sub-system unit tests with the updates to tools/testing/nvdimm, and have received a successful build-report from the kbuild robot (468 configs). With acks from Rafael for the touches to drivers/acpi/" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm: nfit: add support for NVDIMM "latch" flag nfit: update block I/O path to use PMEM API tools/testing/nvdimm: add mock acpi_nfit_flush_address entries to nfit_test tools/testing/nvdimm: fix return code for unimplemented commands tools/testing/nvdimm: mock ioremap_wt pmem: add maintainer for include/linux/pmem.h nfit: fix smatch "use after null check" report nvdimm: Fix return value of nvdimm_bus_init() if class_create() fails libnvdimm: smatch cleanups in __nd_ioctl sparse: fix misplaced __pmem definition
2015-07-11Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Kevin Hilman: "A fairly random colletion of fixes based on -rc1 for OMAP, sunxi and prima2 as well as a few arm64-specific DT fixes. This series also includes a late to support a new Allwinner (sunxi) SoC, but since it's rather simple and isolated to the platform-specific code, it's included it for this -rc" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: arm64: dts: add device tree for ARM SMM-A53x2 on LogicTile Express 20MG arm: dts: vexpress: add missing CCI PMU device node to TC2 arm: dts: vexpress: describe all PMUs in TC2 dts GICv3: Add ITS entry to THUNDER dts arm64: dts: Add poweroff button device node for APM X-Gene platform ARM: dts: am4372.dtsi: disable rfbi ARM: dts: am57xx-beagle-x15: Provide supply for usb2_phy2 ARM: dts: am4372: Add emif node Revert "ARM: dts: am335x-boneblack: disable RTC-only sleep" ARM: sunxi: Enable simplefb in the defconfig ARM: Remove deprecated symbol from defconfig files ARM: sunxi: Add Machine support for A33 ARM: sunxi: Introduce Allwinner H3 support Documentation: sunxi: Update Allwinner SoC documentation ARM: prima2: move to use REGMAP APIs for rtciobrg ARM: dts: atlas7: add pinctrl and gpio descriptions ARM: OMAP2+: Remove unnessary return statement from the void function, omap2_show_dma_caps memory: omap-gpmc: Fix parsing of devices
2015-07-10net: phy: Pass mdix ethtool setting through to phy driverDavid Thomson
Pass the mdix setting from ethtool down to the phy driver, to allow driver specific implementations of manually setting the polarity. Signed-off-by: David Thomson <david.thomson@alliedtelesis.co.nz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09ipv6: Nonlocal bindTom Herbert
Add support to allow non-local binds similar to how this was done for IPv4. Non-local binds are very useful in emulating the Internet in a box, etc. This add the ip_nonlocal_bind sysctl under ipv6. Testing: Set up nonlocal binding and receive routing on a host, e.g.: ip -6 rule add from ::/0 iif eth0 lookup 200 ip -6 route add local 2001:0:0:1::/64 dev lo proto kernel scope host table 200 sysctl -w net.ipv6.ip_nonlocal_bind=1 Set up routing to 2001:0:0:1::/64 on peer to go to first host ping6 -I 2001:0:0:1::1 peer-address -- to verify Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09inet: inet_twsk_deschedule factorizationEric Dumazet
inet_twsk_deschedule() calls are followed by inet_twsk_put(). Only particular case is in inet_twsk_purge() but there is no point to defer the inet_twsk_put() after re-enabling BH. Lets rename inet_twsk_deschedule() to inet_twsk_deschedule_put() and move the inet_twsk_put() inside. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09inet: simplify timewait refcountingEric Dumazet
timewait sockets have a complex refcounting logic. Once we realize it should be similar to established and syn_recv sockets, we can use sk_nulls_del_node_init_rcu() and remove inet_twsk_unhash() In particular, deferred inet_twsk_put() added in commit 13475a30b66cd ("tcp: connect() race with timewait reuse") looks unecessary : When removing a timewait socket from ehash or bhash, caller must own a reference on the socket anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09inet: remove BUG_ON() in twsk_destructor()Eric Dumazet
Kernel will crash the same if one of the pointer is NULL anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09ipv6: use flag instead of u16 for hop in inet6_skb_parmFlorian Westphal
Hop was always either 0 or sizeof(struct ipv6hdr). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09cdc_ncm: Add support for moving NDP to end of NCM frameEnrico Mioso
NCM specs are not actually mandating a specific position in the frame for the NDP (Network Datagram Pointer). However, some Huawei devices will ignore our aggregates if it is not placed after the datagrams it points to. Add support for doing just this, in a per-device configurable way. While at it, update NCM subdrivers, disabling this functionality in all of them, except in huawei_cdc_ncm where it is enabled instead. We aren't making any distinction between different Huawei NCM devices, based on what the vendor driver does. Standard NCM devices are left unaffected: if they are compliant, they should be always usable, still stay on the safe side. This change has been tested and working with a Huawei E3131 device (which works regardless of NDP position), a Huawei E3531 (also working both ways) and an E3372 (which mandates NDP to be after indexed datagrams). V1->V2: - corrected wrong NDP acronym definition - fixed possible NULL pointer dereference - patch cleanup V2->V3: - Properly account for the NDP size when writing new packets to SKB Signed-off-by: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09tcp: do not slow start when cwnd equals ssthreshYuchung Cheng
In the original design slow start is only used to raise cwnd when cwnd is stricly below ssthresh. It makes little sense to slow start when cwnd == ssthresh: especially when hystart has set ssthresh in the initial ramp, or after recovery when cwnd resets to ssthresh. Not doing so will also help reduce the buffer bloat slightly. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09tcp: add tcp_in_slow_start helperYuchung Cheng
Add a helper to test the slow start condition in various congestion control modules and other places. This is to prepare a slight improvement in policy as to exactly when to slow start. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "There is a fix for CephFS and RBD when used within containers/namespaces, and a fix for the address learning the client is supposed to do when initially talking to the Ceph cluster. There are also two patches updating MAINTAINERS. One breaks out the common Ceph code shared by fs/ceph and drivers/block/rbd.c into a separate entry with the appropriate maintainers listed. The second adds a second reference to the github tree where the Ceph client development takes place (before it is pushed to korg and then to you). The goal here is to move closer to a situation where Ilya Dryomov or one of the other maintainers can push things to you if I am unavailable. Ilya has done most of the work preparing branches for upstream recently; you should not be surprised to hear from him if I am trapped in some internet-less wasteland or hit by a bus or something. In the meantime, we'll work on getting him added to the kernel web of trust" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: MAINTAINERS: add secondary tree for ceph modules MAINTAINERS: update ceph entries libceph: treat sockaddr_storage with uninitialized family as blank libceph: enable ceph in a non-default network namespace
2015-07-09libceph: enable ceph in a non-default network namespaceIlya Dryomov
Grab a reference on a network namespace of the 'rbd map' (in case of rbd) or 'mount' (in case of ceph) process and use that to open sockets instead of always using init_net and bailing if network namespace is anything but init_net. Be careful to not share struct ceph_client instances between different namespaces and don't add any code in the !CONFIG_NET_NS case. This is based on a patch from Hong Zhiguo <zhiguohong@tencent.com>. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2015-07-09time: Get rid of do_posix_clock_monotonic_gettimeThomas Gleixner
All users gone. Remove it before we get another one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-08ipv4: add support for linkdown sysctl to netconfAndy Gospodarek
This kernel patch exports the value of the new ignore_routes_with_linkdown via netconf. v2: changes to notify userspace via netlink when sysctl values change and proposed for 'net' since this could be considered a bugfix Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Suggested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08Merge tag 'pm+acpi-4.2-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael Wysocki: "These are fixes on top of the previous PM+ACPI pull requests (including one fix for a 4.1 regression) and two commits adding _CLS-based device enumeration support to the ACPI core and the ATA subsystem that waited for the latest ACPICA changes to be merged. Specifics: - Fix for an ACPI resources management regression introduced during the 4.1 cycle (that unfortunately went into -stable) effectively reverting the bad commit along with the recent fixups on top of it and using an alternative approach to address the underlying issue (Rafael J Wysocki). - Fix for a memory leak and an incorrect return value in an error code path in the ACPI LPSS (Low-Power Subsystem) driver (Rafael J Wysocki). - Fix for a leftover dangling pointer in an error code path in the new wakeup IRQ support code (Rafael J Wysocki). - Fix to prevent infinite loops (due to errors in other places) from happening in the core generic PM domains support code (Geert Uytterhoeven). - Hibernation documentation update/clarification (Uwe Geuder). - Support for _CLS-based device enumeration in the ACPI core and in the ATA subsystem (Suravee Suthikulpanit)" * tag 'pm+acpi-4.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / wakeirq: Avoid setting power.wakeirq too hastily ata: ahci_platform: Add ACPI _CLS matching ACPI / scan: Add support for ACPI _CLS device matching PM / hibernate: clarify resume documentation PM / Domains: Avoid infinite loops in attach/detach code ACPI / LPSS: Fix up acpi_lpss_create_device() ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stage
2015-07-08Merge tag 'sirf-iobrg2regmap-for-4.2' of ↵Kevin Hilman
git://git.kernel.org/pub/scm/linux/kernel/git/baohua/linux into fixes Merge "CSR SiRFSoC rtc iobrg move to regmap for 4.2" from Barry Song: move CSR rtc iobrg read/write API to be regmap this moves to general APIs, and all drivers will be changed based on it. * tag 'sirf-iobrg2regmap-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/baohua/linux: ARM: prima2: move to use REGMAP APIs for rtciobrg
2015-07-08net_sched: act_mirred: remove spinlock in fast pathEric Dumazet
Like act_gact, act_mirred can be lockless in packet processing 1) Use percpu stats 2) update lastuse only every clock tick to avoid false sharing 3) use rcu to protect tcfm_dev 4) Remove spinlock usage, as it is no longer needed. Next step : add multi queue capability to ifb device Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.fastabend@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net_sched: act_gact: remove spinlock in fast pathEric Dumazet
Final step for gact RCU operation : 1) Use percpu stats 2) update lastuse only every clock tick to avoid false sharing 3) Remove spinlock acquisition, as it is no longer needed. Since this is the last contended lock in packet RX when tc gact is used, this gives impressive gain. My host with 8 RX queues was handling 5 Mpps before the patch, and more than 11 Mpps after patch. Tested: On receiver : dev=eth0 tc qdisc del dev $dev ingress 2>/dev/null tc qdisc add dev $dev ingress tc filter del dev $dev root pref 10 2>/dev/null tc filter del dev $dev pref 10 2>/dev/null tc filter add dev $dev est 1sec 4sec parent ffff: protocol ip prio 1 \ u32 match ip src 7.0.0.0/8 flowid 1:15 action drop Sender sends packets flood from 7/8 network Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net_sched: act_gact: use a separate packet counters for gact_determ()Eric Dumazet
Second step for gact RCU operation : We want to get rid of the spinlock protecting gact operations. Stats (packets/bytes) will soon be per cpu. gact_determ() would not work without a central packet counter, so lets add it for this mode. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net: sched: add percpu stats to actionsEric Dumazet
Reuse existing percpu infrastructure John Fastabend added for qdisc. This patch adds a new cpustats parameter to tcf_hash_create() and all actions pass false, meaning this patch should have no effect yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net: sched: extend percpu stats helpersEric Dumazet
qdisc_bstats_update_cpu() and other helpers were added to support percpu stats for qdisc. We want to add percpu stats for tc action, so this patch add common helpers. qdisc_bstats_update_cpu() is renamed to qdisc_bstats_cpu_update() qdisc_qstats_drop_cpu() is renamed to qdisc_qstats_cpu_drop() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08hotplug: Prevent alloc/free of irq descriptors during cpu up/downThomas Gleixner
When a cpu goes up some architectures (e.g. x86) have to walk the irq space to set up the vector space for the cpu. While this needs extra protection at the architecture level we can avoid a few race conditions by preventing the concurrent allocation/free of irq descriptors and the associated data. When a cpu goes down it moves the interrupts which are targeted to this cpu away by reassigning the affinities. While this happens interrupts can be allocated and freed, which opens a can of race conditions in the code which reassignes the affinities because interrupt descriptors might be freed underneath. Example: CPU1 CPU2 cpu_up/down irq_desc = irq_to_desc(irq); remove_from_radix_tree(desc); raw_spin_lock(&desc->lock); free(desc); We could protect the irq descriptors with RCU, but that would require a full tree change of all accesses to interrupt descriptors. But fortunately these kind of race conditions are rather limited to a few things like cpu hotplug. The normal setup/teardown is very well serialized. So the simpler and obvious solution is: Prevent allocation and freeing of interrupt descriptors accross cpu hotplug. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: xiao jin <jin.xiao@intel.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Borislav Petkov <bp@suse.de> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Link: http://lkml.kernel.org/r/20150705171102.063519515@linutronix.de
2015-07-07Merge branch 'acpi-scan'Rafael J. Wysocki
* acpi-scan: ata: ahci_platform: Add ACPI _CLS matching ACPI / scan: Add support for ACPI _CLS device matching
2015-07-07tick/broadcast: Unbreak CONFIG_GENERIC_CLOCKEVENTS=n buildThomas Gleixner
Making tick_broadcast_oneshot_control() independent from CONFIG_GENERIC_CLOCKEVENTS_BROADCAST broke the build for CONFIG_GENERIC_CLOCKEVENTS=n because the function is not defined there. Provide a proper stub inline. Fixes: f32dd1170511 'tick/broadcast: Make idle check independent from mode and config' Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>