Age | Commit message (Collapse) | Author |
|
Jens Axboe noticed that we were queueing &conn->work on both btaddconn
and keventd_wq.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
|
|
Upstream commit 7000d38d:
This patch is similar to nfnetlink_queue fixes. It fixes the computation
of skb size by using NLMSG_SPACE instead of NLMSG_ALIGN.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit cabaa9bf:
Size of the netlink skb was wrongly computed because the formula was using
NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for
netlink header as NLMSG_SPACE does. This was causing a failure of message
building in some cases.
On my test system, all messages for packets in range [8*k+41, 8*k+48] where k
is an integer were invalid and the corresponding packets were dropped.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit 4f4c9430:
xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never
matches on Sundays. On my host I have a rule like
iptables -A OUTPUT -m time --weekdays Sun -j REJECT
and it never matches. The problem is in localtime_2(), which uses
r->weekday = (4 + r->dse) % 7;
to map the epoch day onto a weekday in {0,...,6}. In particular this
gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never
match. xt_time_match() has
if (!(info->weekdays_match & (1 << current_time.weekday)))
return false;
and when current_time.weekday = 0, the result of the & is always
zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit 1b04ab459:
The function ebt_do_table doesn't take NF_DROP as a verdict from the targets.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit eb1197bc0:
http://bugzilla.kernel.org/show_bug.cgi?id=9920
The function skb_make_writable returns true or false.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit e2b58a67:
As reported by Tomas Simonaitis <tomas.simonaitis@gmail.com>, inserting new
data in skbs queued over {ip,ip6,nfnetlink}_queue triggers a SKB_LINEAR_ASSERT
in skb_put().
Going back through the git history, it seems this bug is present since at
least 2.6.12-rc2, probably even since the removal of skb_linearize() for
netfilter.
Linearize non-linear skbs through skb_copy_expand() when enlarging them.
Tested by Thomas, fixes bugzilla #9933.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: 21e43188f272c7fd9efc84b8244c0b1dfccaa105
Because we use shared tfm objects in order to conserve memory,
(each tfm requires 128K of vmalloc memory), BH needs to be turned
off on output as that can occur in process context.
Previously this was done implicitly by the xfrm output code.
That was lost when it became lockless. So we need to add the
BH disabling to IPComp directly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
|
|
Upstream commit: dea75bdfa57f75a7a7ec2961ec28db506c18e5db
From: Stephen Hemminger <shemminger@linux-foundation.org>
Based upon a patch by Marcel Wappler:
This patch fixes a DHCP issue of the kernel: some DHCP servers
(i.e. in the Linksys WRT54Gv5) are very strict about the contents
of the DHCPDISCOVER packet they receive from clients.
Table 5 in RFC2131 page 36 requests the fields 'ciaddr' and
'siaddr' MUST be set to '0'. These DHCP servers ignore Linux
kernel's DHCP discovery packets with these two fields set to
'255.255.255.255' (in contrast to popular DHCP clients, such as
'dhclient' or 'udhcpc'). This leads to a not booting system.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
|
|
Upstream commit: e4f8b5d4edc1edb0709531bd1a342655d5e8b98e
Various RFCs have all sorts of things to say about the CS field of the
DSCP value. In particular they try to make the distinction between
values that should be used by "user applications" and things like
routing daemons.
This seems to have influenced the CAP_NET_ADMIN check which exists for
IP_TOS socket option settings, but in fact it has an off-by-one error
so it wasn't allowing CS5 which is meant for "user applications" as
well.
Further adding to the inconsistency and brokenness here, IPV6 does not
validate the DSCP values specified for the IPV6_TCLASS socket option.
The real actual uses of these TOS values are system specific in the
final analysis, and these RFC recommendations are just that, "a
recommendation". In fact the standards very purposefully use
"SHOULD" and "SHOULD NOT" when describing how these values can be
used.
In the final analysis the only clean way to provide consistency here
is to remove the CAP_NET_ADMIN check. The alternatives just don't
work out:
1) If we add the CAP_NET_ADMIN check to ipv6, this can break existing
setups.
2) If we just fix the off-by-one error in the class comparison in
IPV4, certain DSCP values can be used in IPV6 but not IPV4 by
default. So people will just ask for a sysctl asking to
override that.
I checked several other freely available kernel trees and they
do not make any privilege checks in this area like we do. For
the BSD stacks, this goes back all the way to Stevens Volume 2
and beyond.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
|
|
Upstream commit: 9937ded8e44de8865cba1509d24eea9d350cebf0
The result of the ip_route_output is not assigned to skb. This means that
- it is leaked
- possible OOPS below dereferrencing skb->dst
- no ICMP message for this case
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
|
|
Upstream commits: 28a89453b1e8de8d777ad96fa1eef27b5d1ce074
b5c15fc004ac83b7ad280acbe0fd4bbed7e2c8d4
This is a long-standing bug in the IPsec IPv6 code that breaks
when we emit a IPsec tunnel-mode datagram packet. The problem
is that the code the emits the packet assumes the IPv6 stack
will fragment it later, but the IPv6 stack assumes that whoever
is emitting the packet is going to pre-fragment the packet.
In the long term we need to fix both sides, e.g., to get the
datagram code to pre-fragment as well as to get the IPv6 stack
to fragment locally generated tunnel-mode packet.
For now this patch does the second part which should make it
work for the IPsec host case.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
|
|
Upstream commit: d8b2a4d21e0b37b9669b202867bfef19f68f786a
There is a race in Linux kernel file net/core/dev.c, function dev_close.
The function calls function dev_deactivate, which calls function
dev_watchdog_down that deletes the watchdog timer. However, after that, a
driver can call netif_carrier_ok, which calls function
__netdev_watchdog_up that can add the watchdog timer again. Function
unregister_netdevice calls function dev_shutdown that traps the bug
!timer_pending(&dev->watchdog_timer). Moving dev_deactivate after
netif_running() has been cleared prevents function netif_carrier_on
from calling __netdev_watchdog_up and adding the watchdog timer again.
Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
|
|
Upstream commit: 12aa343add3eced38a44bdb612b35fdf634d918c
Commit a0a400d79e3dd7843e7e81baa3ef2957bdc292d0 ("[NET]: dev_mcast:
add multicast list synchronization helpers") from you introduced a new
field "da_synced" to struct dev_addr_list that is not properly
initialized to 0. So when any of the current users (8021q, macvlan,
mac80211) calls dev_mc_sync/unsync they mess the address list for both
devices.
The attached patch fixed it for me and avoid future problems.
Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
|
|
Upstream commit: b6c0632105f7d7548f1d642ba830088478d4f2b0
The bluetooth hci_conn sysfs add/del executed in the default
workqueue. If the del_conn is executed after the new add_conn with
same target, add_conn will failed with warning of "same kobject name".
Here add btaddconn & btdelconn workqueues, flush the btdelconn
workqueue in the add_conn function to avoid the issue.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: 29ffe1a5c52dae13b6efead97aab9b058f38fce4
When ip_fragment has to hit the slow path the value of skb->truesize
may go out of sync because we would have updated it without changing
the packet length. This violates the constraints on truesize.
This patch postpones the update of skb->truesize to prevent this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: 8cf8e5a67fb07f583aac94482ba51a7930dab493
Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=9825
The inet_diag_lock_handler function uses ERR_PTR to encode errors but
its callers were testing against NULL.
This only happens when the only inet_diag modular user, DCCP, is not
built into the kernel or available as a module.
Also there was a problem with not dropping the mutex lock when a handler
was not found, also fixed in this patch.
This caused an OOPS and ss would then hang on subsequent calls, as
&inet_diag_table_mutex was being left locked.
Thanks to spike at ml.yaroslavl.ru for report it after trying 'ss -d'
on a kernel that doesn't have DCCP available.
This bug was introduced in cset
d523a328fb0271e1a763e985a21f2488fd816e7e ("Fix inet_diag dead-lock
regression"), after 2.6.24-rc3, so just 2.6.24 seems to be affected.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: 2614fa59fa805cd488083c5602eb48533cdbc018
When I moved the nexthdr setting out of IPComp I accidently moved
the reading of ipch->nexthdr after the decompression. Unfortunately
this means that we'd be reading from a stale ipch pointer which
doesn't work very well.
This patch moves the reading up so that we get the correct nexthdr
value.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: b1641064a3f4a58644bc2e8edf40c025c58473b4
I made a silly typo by entering IPPROTO_IP (== 0) instead of
IPPROTO_IPIP (== 4). This broke the reception of incompressible
packets.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: c18865f39276435abb9286f9a816cb5b66c99a00
fib_info can be shared by many route prefixes but we don't want
duplicate alternative routes for a prefix+tos+priority. Last change
was not correct to check fib_treeref because it accounts usage from
other prefixes. Additionally, avoid replacement without error if new
route is same, as Joonwoo Park suggests.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: 936f6f8e1bc46834bbb3e3fa3ac13ab44f1e7ba6
Update fib_trie with some fib_hash fixes:
- check for duplicate alternative routes for prefix+tos+priority when
replacing route
- properly insert by matching tos together with priority
- fix alias walking to use list_for_each_entry_continue for insertion
and deletion when fa_head is not NULL
- copy state from fa to new_fa on replace (not a problem for now)
- additionally, avoid replacement without error if new route is same,
as Joonwoo Park suggests.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: 268bcca1e7b0d244afd07ea89cda672e61b0fc4a
Setting up a meta match causes a kernel OOPS because of uninitialized
elements in tree.
[ 37.322381] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 37.322381] IP: [<ffffffff883fc717>] :em_meta:em_meta_destroy+0x17/0x80
[ 37.322381] Call Trace:
[ 37.322381] [<ffffffff803ec83d>] tcf_em_tree_destroy+0x2d/0xa0
[ 37.322381] [<ffffffff803ecc8c>] tcf_em_tree_validate+0x2dc/0x4a0
[ 37.322381] [<ffffffff803f06d2>] nla_parse+0x92/0xe0
[ 37.322381] [<ffffffff883f9672>] :cls_basic:basic_change+0x202/0x3c0
[ 37.322381] [<ffffffff802a3917>] kmem_cache_alloc+0x67/0xa0
[ 37.322381] [<ffffffff803ea221>] tc_ctl_tfilter+0x3b1/0x580
[ 37.322381] [<ffffffff803dffd0>] rtnetlink_rcv_msg+0x0/0x260
[ 37.322381] [<ffffffff803ee944>] netlink_rcv_skb+0x74/0xa0
[ 37.322381] [<ffffffff803dffc8>] rtnetlink_rcv+0x18/0x20
[ 37.322381] [<ffffffff803ee6c3>] netlink_unicast+0x263/0x290
[ 37.322381] [<ffffffff803cf276>] __alloc_skb+0x96/0x160
[ 37.322381] [<ffffffff803ef014>] netlink_sendmsg+0x274/0x340
[ 37.322381] [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140
[ 37.322381] [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30
[ 37.322381] [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30
[ 37.322381] [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140
[ 37.322381] [<ffffffff80288611>] zone_statistics+0xb1/0xc0
[ 37.322381] [<ffffffff803c7e5e>] sys_sendmsg+0x20e/0x360
[ 37.322381] [<ffffffff803c7411>] sockfd_lookup_light+0x41/0x80
[ 37.322381] [<ffffffff8028d04b>] handle_mm_fault+0x3eb/0x7f0
[ 37.322381] [<ffffffff8020c2fb>] system_call_after_swapgs+0x7b/0x80
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: 04f217aca4d803fe72c2c54fe460d68f5233ce52
If userspace passes a unknown match index into em_meta, then
em_meta_change will return an error and the data for the match will
not be set. This then causes an null pointer dereference when the
cleanup is done in the error path via tcf_em_tree_destroy. Since the
tree structure comes kzalloc, it is initialized to NULL.
Discovered when testing a new version of tc command against an
accidental older kernel.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Upstream commit: 16ca3f913001efdb6171a2781ef41c77474e3895
In strategy_allowed_congestion_control of the 2.6.24 kernel, when
sysctl_string return 1 on success,it should call
tcp_set_allowed_congestion_control to set the allowed congestion
control.But, it don't. the sysctl_string return 1 on success,
otherwise return negative, never return 0.The patch fix the problem.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
[NETFILTER]: nf_conntrack_tcp: conntrack reopening fix
[Upstream commits b2155e7f + d0c1fd7a]
TCP connection tracking in netfilter did not handle TCP reopening
properly: active close was taken into account for one side only and
not for any side, which is fixed now. The patch includes more comments
to explain the logic how the different cases are handled.
The bug was discovered by Jeff Chua.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
This reverts commit 81100eb80add328c4d2a377326f15aa0e7236398 for the
release, to avoid the unnecessary warning noise that is only really
relevant to wireless driver developers.
The warning will probably go right back in after I cut the release, but
at least we won't unnecessarily worry users.
Acked-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
As it is ip_append_data only counts page fragments to the skb that
allocated it. As such it means that the first skb gets hit with a
4K charge even though it might have only used a fraction of it while
all subsequent skb's that use the same page gets away with no charge
at all.
This bug was exposed by the UDP accounting patch.
[ The wmem_alloc bumping needs to be moved with the truesize,
noticed by Takahiro Yasui. -DaveM ]
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
init_net is used added as a parameter to a lot of old API calls, f.e.
ip_dev_find. These calls were exported as EXPORT_SYMBOL. So, export init_net
as EXPORT_SYMBOL to keep networking API consistent.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
And as noted by Takahiro Yasui, we thus need to bump the
sk->sk_wmem_alloc at this spot as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The rfcomm tty device will possibly retain even when conn is down, and
sysfs doesn't support zombie device moving, so this patch move the tty
device before conn device is destroyed.
For the bug refered please see :
http://lkml.org/lkml/2007/12/28/87
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit "96793b482540f3a26e2188eaf75cb56b7829d3e3" (Add ICMPMsgStats
MIB (RFC 4293)) made a mistake.
In that patch, David L added a icmp_out_count() in
ip_push_pending_frames(), remove icmp_out_count() from
icmp_reply(). But he forgot to remove icmp_out_count() from
icmp_send() too. Since icmp_send and icmp_reply will call
icmp_push_reply, which will call ip_push_pending_frames, a duplicated
increment happened in icmp_send.
This patch remove the icmp_out_count from icmp_send too.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The snmp6 entry name was changed, and it broke compatibility
to RFC 2011.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
icmpv6_send() calls ip6_push_pending_frames() indirectly.
Both ip6_push_pending_frames() and icmpv6_send() increment
counter ICMP6_MIB_OUTMSGS.
This patch remove the increment from icmpv6_send.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When unregistering the rtnl_link_ops, all existing devices using
the ops are destroyed. With nested devices this may lead to a
use-after-free despite the use of for_each_netdev_safe() in case
the upper device is next in the device list and is destroyed
by the NETDEV_UNREGISTER notifier.
The easy fix is to restart scanning the device list after removing
a device. Alternatively we could add new devices to the front of
the list to avoid having dependant devices follow the device they
depend on. A third option would be to only restart scanning if
dev->iflink of the next device matches dev->ifindex of the current
one. For now this seems like the safest solution.
With this patch, the veth rtnl_link_ops unregistration can use
rtnl_link_unregister() directly since it now also handles destruction
of multiple devices at once.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Here goes an IrDA patch against your latest net-2.6 tree.
This patch fixes some af_irda memory leaks. It also checks for
irias_new_obect() return value.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 9cd40029423701c376391da59d2c6469672b4bed (Fix race between
neigh_parms_release and neightbl_fill_parms) introduced device
reference counting regressions for several people, see:
http://bugzilla.kernel.org/show_bug.cgi?id=9778
for example.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When packets are flood-forwarded to multiple output devices, the
bridge-netfilter code reuses skb->nf_bridge for each clone to store
the bridge port. When queueing packets using NFQUEUE netfilter takes
a reference to skb->nf_bridge->physoutdev, which is overwritten
when the packet is forwarded to the second port. This causes
refcount unterflows for the first device and refcount leaks for all
others. Additionally this provides incorrect data to the iptables
physdev match.
Unshare skb->nf_bridge by copying it if it is shared before assigning
the physoutdev device.
Reported, tested and based on initial patch by
Jan Christoph Nordholz <hesso@pool.math.tu-berlin.de>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We omit (or delay) sending NSes for known-to-unreachable routers (in
NUD_FAILED state) according to RFC 4191 (Default Router Preferences
and More-Specific Routes). But this is not fully compatible with RFC
4861 (Neighbor Discovery Protocol for IPv6), which does not remember
unreachability of neighbors.
So, let's avoid mixing sending algorithm of RFC 4191 and that of RFC
4861, and make the algorithm more friendly with RFC 4861 if RFC 4191
is disabled.
Issue was found by IPv6 Ready Logo Core Self_Test 1.5.0b2 (by TAHI
Project), and has been tracked down by Mitsuru Chinen
<mitch@linux.vnet.ibm.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I noticed "ip route list" was slower than "cat /proc/net/route" on a
machine with a full Internet routing table (214392 entries : Special
thanks to Robert ;) )
This is similar to problem reported in commit
d8c9283089287341c85a0a69de32c2287a990e71 ("[IPV4] ROUTE: ip_rt_dump()
is unecessary slow")
Fix is to avoid scanning the begining of fz_hash table, but directly
seek to the right offset.
Before patch :
time ip route >/tmp/ROUTE
real 0m1.285s
user 0m0.712s
sys 0m0.436s
After patch
# time ip route >/tmp/ROUTE
real 0m0.835s
user 0m0.692s
sys 0m0.124s
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
http://bugzilla.kernel.org/show_bug.cgi?id=9493
The fib allows making identical routes with 'ip route replace'.
This patch makes the fib return -EEXIST if replacement would cause duplication.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
http://bugzilla.kernel.org/show_bug.cgi?id=9493
The fib allows making identical routes with 'ip route replace'.
This patch makes the fib return -EEXIST if replacement would cause duplication.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When looking for a conflicting connection the !sk->sk_bound_dev_if
check is performed only for live sockets, but not for timewait-ed.
This is not the case for ipv4, for __inet6_lookup_established in
both ipv4 and ipv6 and for other places that check for tw-s.
Was this missed accidentally? If so, then this patch fixes it and
besides makes use if the dif variable declared in the function.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Code inspection turned up that error cases in rfkill_register() do not
call rfkill_led_trigger_unregister() even though we have already
registered.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bridge code incorrectly causes two POST_ROUTING hook invocations
for DNATed packets that end up on the same bridge device. This
happens because packets with a changed destination address are passed
to dst_output() to make them go through the neighbour output function
again to build a new destination MAC address, before they will continue
through the IP hooks simulated by bridge netfilter.
The resulting hook order is:
PREROUTING (bridge netfilter)
POSTROUTING (dst_output -> ip_output)
FORWARD (bridge netfilter)
POSTROUTING (bridge netfilter)
The deferred hooks used to abort the first POST_ROUTING invocation,
but since the only thing bridge netfilter actually really wants is
a new MAC address, we can avoid going through the IP stack completely
by simply calling the neighbour output function directly.
Tested, reported and lots of data provided by: Damien Thebault <damien.thebault@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the @helper variable that was just obtained.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RFC2464 says that the next to lowerst order bit of the first octet
of the Interface Identifier is formed by complementing
the Universal/Local bit of the EUI-64. But ip6t_eui64 uses OR not XOR.
Thanks Peter Ivancik for reporing this bug and posting a patch
for it.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Allow vlans nesting other vlans without lockdep's warnings (max. 2 levels
i.e. parent + child). Thanks to Patrick McHardy for pointing a bug in the
first version of this patch.
Reported-by: Benny Amorsen
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In dn_rt_cache_get_next(), no need to guard seq->private by a
rcu_dereference() since seq is private to the thread running this
function. Reading seq.private once (as guaranted bu rcu_dereference())
or several time if compiler really is dumb enough wont change the
result.
But we miss real spots where rcu_dereference() are needed, both in
dn_rt_cache_get_first() and dn_rt_cache_get_next()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|