summaryrefslogtreecommitdiff
path: root/net/tipc
AgeCommit message (Collapse)Author
2014-03-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/usb/r8152.c drivers/net/xen-netback/netback.c Both the r8152 and netback conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12tipc: eliminate redundant lookups in registryJon Paul Maloy
As an artefact from the native interface, the message sending functions in the port takes a port ref as first parameter, and then looks up in the registry to find the corresponding port pointer. This despite the fact that the only currently existing caller, tipc_sock, already knows this pointer. We change the signature of these functions to take a struct tipc_port* argument, and remove the redundant lookups. We also remove an unmotivated extra lookup in the function socket.c:auto_connect(), and, as the lookup functions tipc_port_deref() and ref_deref() now become unused, we remove these two functions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12tipc: align usage of variable names and macros in socketJon Paul Maloy
The practice of naming variables in TIPC is inconistent, sometimes even within the same file. In this commit we align variable names and declarations within socket.c, and function and macro names within socket.h. We also reduce the number of conversion macros to two, in order to make usage less obsure. These changes are purely cosmetic. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12tipc: eliminate redundant lockingJon Paul Maloy
The three functions tipc_portimportance(), tipc_portunreliable() and tipc_portunreturnable() and their corresponding tipc_set* functions, are all grabbing port_lock when accessing the targeted port. This is unnecessary in the current code, since these calls only are made from within socket downcalls, already protected by sock_lock. We remove the redundant locking. Also, since the functions now become trivial one-liners, we move them to port.h and make them inline. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12tipc: eliminate upcall function pointers between port and socketJon Paul Maloy
Due to the original one-to-many relation between port and user API layers, upcalls to the API have been performed via function pointers, installed in struct tipc_port at creation. Since this relation now always is one-to-one, we can instead use ordinary function calls. We remove the function pointers 'dispatcher' and ´wakeup' from struct tipc_port, and replace them with calls to the renamed functions tipc_sk_rcv() and tipc_sk_wakeup(). At the same time we change the name and signature of the functions tipc_createport() and tipc_deleteport() to reflect their new role as mere initialization/destruction functions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12tipc: aggregate port structure into socket structureJon Paul Maloy
After the removal of the tipc native API the relation between a tipc_port and its API types is strictly one-to-one, i.e, the latter can now only be a socket API. There is therefore no need to allocate struct tipc_port and struct sock independently. In this commit, we aggregate struct tipc_port into struct tipc_sock, hence saving both CPU cycles and structure complexity. There are no functional changes in this commit, except for the elimination of the separate allocation/freeing of tipc_port. All other changes are just adaptatons to the new data structure. This commit also opens up for further code simplifications and code volume reduction, something we will do in later commits. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12tipc: remove redundant 'peer_name' field in struct tipc_sockJon Paul Maloy
The field 'peer_name' in struct tipc_sock is redundant, since this information already is available from tipc_port, to which tipc_sock has a reference. We remove the field, and ensure that peer node and peer port info instead is fetched via the functions that already exist for this purpose. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12tipc: replace reference table rwlock with spinlockJon Paul Maloy
The lock for protecting the reference table is declared as an RWLOCK, although it is only used in write mode, never in read mode. We redefine it to become a spinlock. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12tipc: Convert uses of __constant_<foo> to <foo>Joe Perches
The use of __constant_<foo> has been unnecessary for quite awhile now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06tipc: don't log disabled tasklet handler errorsErik Hugne
Failure to schedule a TIPC tasklet with tipc_k_signal because the tasklet handler is disabled is not an error. It means TIPC is currently in the process of shutting down. We remove the error logging in this case. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06tipc: fix memory leak during module removalErik Hugne
When the TIPC module is removed, the tasklet handler is disabled before all other subsystems. This will cause lingering publications in the name table because the node_down tasklets responsible to clean up publications from an unreachable node will never run. When the name table is shut down, these publications are detected and an error message is logged: tipc: nametbl_stop(): orphaned hash chain detected This is actually a memory leak, introduced with commit 993b858e37b3120ee76d9957a901cca22312ffaa ("tipc: correct the order of stopping services at rmmod") Instead of just logging an error and leaking memory, we free the orphaned entries during nametable shutdown. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06tipc: drop subscriber connection id invalidationErik Hugne
When a topology server subscriber is disconnected, the associated connection id is set to zero. A check vs zero is then done in the subscription timeout function to see if the subscriber have been shut down. This is unnecessary, because all subscription timers will be cancelled when a subscriber terminates. Setting the connection id to zero is actually harmful because id zero is the identity of the topology server listening socket, and can cause a race that leads to this socket being closed instead. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06tipc: avoid to unnecessary process switch under non-block modeYing Xue
When messages are received via tipc socket under non-block mode, schedule_timeout() is called in tipc_wait_for_rcvmsg(), that is, the process of receiving messages will be scheduled once although timeout value passed to schedule_timeout() is 0. The same issue exists in accept()/wait_for_accept(). To avoid this unnecessary process switch, we only call schedule_timeout() if the timeout value is non-zero. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06tipc: fix connection refcount leakYing Xue
When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a connection instance, its reference count value is increased if it's found. But subsequently if it's found that the connection is closed, the work of sending message is not queued into its server send workqueue, and the connection reference count is not decreased. This will cause a reference count leak. To reproduce this problem, an application would need to open and closes topology server connections with high intensity. We fix this by immediately decrementing the connection reference count if a send fails due to the connection being closed. Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06tipc: allow connection shutdown callback to be invoked in advanceYing Xue
Currently connection shutdown callback function is called when connection instance is released in tipc_conn_kref_release(), and receiving packets and sending packets are running in different threads. Even if connection is closed by the thread of receiving packets, its shutdown callback may not be called immediately as the connection reference count is non-zero at that moment. So, although the connection is shut down by the thread of receiving packets, the thread of sending packets doesn't know it. Before its shutdown callback is invoked to tell the sending thread its connection has been closed, the sending thread may deliver messages by tipc_conn_sendmsg(), this is why the following error information appears: "Sending subscription event failed, no memory" To eliminate it, allow connection shutdown callback function to be called before connection id is removed in tipc_close_conn(), which makes the sending thread know the truth in time that its socket is closed so that it doesn't send message to it. We also remove the "Sending XXX failed..." error reporting for topology and config services. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/wireless/ath/ath9k/recv.c drivers/net/wireless/mwifiex/pcie.c net/ipv6/sit.c The SIT driver conflict consists of a bug fix being done by hand in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper was created (netdev_alloc_pcpu_stats()) which takes care of this. The two wireless conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-22tipc: make bearer set up in module insertion stageYing Xue
Accidentally a side effect is involved by commit 6e967adf7(tipc: relocate common functions from media to bearer). Now tipc stack handler of receiving packets from netdevices as well as netdevice notification handler are registered when bearer is enabled rather than tipc module initialization stage, but the two handlers are both unregistered in tipc module exit phase. If tipc module is inserted and then immediately removed, the following warning message will appear: "dev_remove_pack: ffffffffa0380940 not found" This is because in module insertion stage tipc stack packet handler is not registered at all, but in module exit phase dev_remove_pack() needs to remove it. Of course, dev_remove_pack() cannot find tipc protocol handler from the kernel protocol handler list so that the warning message is printed out. But if registering the two handlers is adjusted from enabling bearer phase into inserting module stage, the warning message will be eliminated. Due to this change, tipc_core_start_net() and tipc_core_stop_net() can be deleted as well. Reported-by: Wang Weidong <wangweidong1@huawei.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-22tipc: remove all enabled flags from all tipc componentsYing Xue
When tipc module is inserted, many tipc components are initialized one by one. During the initialization period, if one of them is failed, tipc_core_stop() will be called to stop all components whatever corresponding components are created or not. To avoid to release uncreated ones, relevant components have to add necessary enabled flags indicating whether they are created or not. But in the initialization stage, if one component is unsuccessfully created, we will just destroy successfully created components before the failed component instead of all components. All enabled flags defined in components, in turn, become redundant. Additionally it's also unnecessary to identify whether table.types is NULL in tipc_nametbl_stop() because name stable has been definitely created successfully when tipc_nametbl_stop() is called. Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19tipc: failed transmissions should return errorErik Hugne
When a message could not be sent out because the destination node or link could not be found, the full message size is returned from sendmsg() as if it had been sent successfully. An application will then get a false indication that it's making forward progress. This problem has existed since the initial commit in 2.6.16. We change this to return -ENETUNREACH if the message cannot be delivered due to the destination node/link being unavailable. We also get rid of the redundant tipc_reject_msg call since freeing the buffer and doing a tipc_port_iovec_reject accomplishes exactly the same thing. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/bonding/bond_3ad.h drivers/net/bonding/bond_main.c Two minor conflicts in bonding, both of which were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18tipc: align tipc function names with common naming practice in the networkYing Xue
Rename the following functions, which are shorter and more in line with common naming practice in the network subsystem. tipc_bclink_send_msg->tipc_bclink_xmit tipc_bclink_recv_pkt->tipc_bclink_rcv tipc_disc_recv_msg->tipc_disc_rcv tipc_link_send_proto_msg->tipc_link_proto_xmit link_recv_proto_msg->tipc_link_proto_rcv link_send_sections_long->tipc_link_iovec_long_xmit tipc_link_send_sections_fast->tipc_link_iovec_xmit_fast tipc_link_send_sync->tipc_link_sync_xmit tipc_link_recv_sync->tipc_link_sync_rcv tipc_link_send_buf->__tipc_link_xmit tipc_link_send->tipc_link_xmit tipc_link_send_names->tipc_link_names_xmit tipc_named_recv->tipc_named_rcv tipc_link_recv_bundle->tipc_link_bundle_rcv tipc_link_dup_send_queue->tipc_link_dup_queue_xmit link_send_long_buf->tipc_link_frag_xmit tipc_multicast->tipc_port_mcast_xmit tipc_port_recv_mcast->tipc_port_mcast_rcv tipc_port_reject_sections->tipc_port_iovec_reject tipc_port_recv_proto_msg->tipc_port_proto_rcv tipc_connect->tipc_port_connect __tipc_connect->__tipc_port_connect __tipc_disconnect->__tipc_port_disconnect tipc_disconnect->tipc_port_disconnect tipc_shutdown->tipc_port_shutdown tipc_port_recv_msg->tipc_port_rcv tipc_port_recv_sections->tipc_port_iovec_rcv release->tipc_release accept->tipc_accept bind->tipc_bind get_name->tipc_getname poll->tipc_poll send_msg->tipc_sendmsg send_packet->tipc_send_packet send_stream->tipc_send_stream recv_msg->tipc_recvmsg recv_stream->tipc_recv_stream connect->tipc_connect listen->tipc_listen shutdown->tipc_shutdown setsockopt->tipc_setsockopt getsockopt->tipc_getsockopt Above changes have no impact on current users of the functions. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17tipc: correct usage of spin_lock() vs spin_lock_bh()Jon Paul Maloy
I commit e099e86c9e24fe9aff36773600543eb31d8954d ("tipc: add node_lock protection to link lookup function") we are calling spin_lock(&node->lock) directly instead of indirectly via the tipc_node_lock(node) function. However, tipc_node_lock() is using spin_lock_bh(), not spin_lock(), something leading to unbalanced usage in one place, and a smatch warning. We fix this by consistently using tipc_node_lock()/unlock() in in the places touched by the mentioned commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17tipc: fix a loop style problemJon Paul Maloy
In commit 7d33939f475d403e79124e3143d7951dcfe8629f ("tipc: delay delete of link when failover is needed") we introduced a loop for finding and removing a link pointer in an array. The removal is done after we have left the loop, giving the impression that one may remove the wrong pointer if no matching element is found. This is not really a bug, since we know that there will always be a matching element, but it looks wrong, and causes a smatch warning. We fix this loop with this commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: add node_lock protection to link lookup functionJon Paul Maloy
In an earlier commit, ("tipc: remove links list from bearer struct") we described three issues that need to be pre-emptively resolved before we can remove tipc_net_lock. Here we resolve issue a) described in that commit: "a) In access method #2, we access the link before taking the protecting node_lock. This will not work once net_lock is gone, so we will have to change the access order. We will deal with this in a later commit in this series." Here, we change that access order, by ensuring that the function link_find_link() returns only a safe reference for finding the link, i.e., a node pointer and an index into its 'links' array, not the link pointer itself. We also change all callers of this function to first take the node lock before they can check if there still is a valid link pointer at the returned index. Since the function now returns a node pointer rather than a link pointer, we rename it to the more appropriate 'tipc_link_find_owner(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: remove bearer_lock from tipc_bearer structYing Xue
After the earlier commits ("tipc: remove 'links' list from tipc_bearer struct") and ("tipc: introduce new spinlock to protect struct link_req"), there is no longer any need to protect struct link_req or or any link list by use of bearer_lock. Furthermore, we have eliminated the need for using bearer_lock during downcalls (send) from the link to the bearer, since we have ensured that bearers always have a longer life cycle that their associated links, and always contain valid data. So, the only need now for a lock protecting bearers is for guaranteeing consistency of the bearer list itself. For this, it is sufficient, at least for the time being, to continue applying 'net_lock´ in write mode. By removing bearer_lock we also pre-empt introduction of issue b) descibed in the previous commit "tipc: remove 'links' list from tipc_bearer struct": "b) When the outer protection from net_lock is gone, taking bearer_lock and node_lock in opposite order of method 1) and 2) will become an obvious deadlock hazard". Therefore, we now eliminate the bearer_lock spinlock. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: delay delete of link when failover is neededJon Paul Maloy
When a bearer is disabled, all its attached links are deleted. Ideally, we should do link failover to redundant links on other bearers, if there are any, in such cases. This would be consistent with current behavior when a link is reset, but not deleted. However, due to the complexity involved, and the (wrongly) perceived low demand for this feature, it was never implemented until now. We mark the doomed link for deletion with a new flag, but wait until the failover process is finished before we actually delete it. With the improved link tunnelling/failover code introduced earlier in this commit series, it is now easy to identify a spot in the code where the failover is finished and it is safe to delete the marked link. Moreover, the test for the flag and the deletion can be done synchronously, and outside the most time critical data path. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: changes to general packet reception algorithmJon Paul Maloy
We change the order of checking for destination users when processing incoming packets. By placing the checks for users that may potentially replace the processed buffer, i.e., CHANGEOVER_PROTOCOL and MSG_FRAGMENTER, in a separate step before we check for the true end users, we get rid of a label and a 'goto', at the same time making the code more comprehensible and easy to follow. This commit does not change any functionality, it is just a cosmetic code reshuffle. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: rename stack variables in function tipc_link_tunnel_rcvJon Paul Maloy
After the previous redesign of the tunnel reception algorithm and functions, we finalize it by renaming a couple of stack variables in tipc_tunnel_rcv(). This makes it more consistent with the naming scheme elsewhere in this part of the code. This change is purely cosmetic, with no functional changes. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: more cleanup of tunnelling reception functionJon Paul Maloy
We simplify and slim down the code in function tipc_tunnel_rcv() No impact on the users of this function. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: change signature of tunnelling reception functionJon Paul Maloy
After the earlier commits in this series related to the function tipc_link_tunnel_rcv(), we can now go further and simplify its signature. The function now consumes all DUPLICATE packets, and only returns such ORIGINAL packets that are ready for immediate delivery, i.e., no more link level protocol processing needs to be done by the caller. As a consequence, the the caller, tipc_rcv(), does not access the link pointer after call return, and it becomes unnecessary to pass a link pointer reference in the call. Instead, we now only pass it the tunnel link's owner node, which is sufficient to find the destination link for the tunnelled packet. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: change reception of tunnelled failover packetsJon Paul Maloy
When a link is reset, and there is a redundant link available, all sender sockets will steer their subsequent traffic through the remaining link. In order to guarantee preserved packet order and cardinality during the transition, we tunnel the failing link's send queue through the remaining link before we allow any sockets to use it. In this commit, we change the algorithm for receiving failover ("ORIGINAL_MSG") packets in tipc_link_tunnel_rcv(), at the same time delegating it to a new subfuncton, tipc_link_failover_rcv(). Instead of directly returning an extracted inner packet to the packet reception loop in tipc_rcv(), we first check if it is a message fragment, in which case we append it to the reset link's fragment chain. If the fragment chain is complete, we return the whole chain instead of the individual buffer, eliminating any need for the tipc_rcv() loop to do reassembly of tunneled packets. This change makes it possible to further simplify tipc_link_tunnel_rcv(), as well as the calling tipc_rcv() loop. We will do that in later commits. It also makes it possible to identify a single spot in the code where we can tell that a failover procedure is finished, something that is useful when we are deleting links after a failover. This will also be done in a later commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: change reception of tunnelled duplicate packetsJon Paul Maloy
When a second link to a destination comes up, some sender sockets will steer their subsequent traffic through the new link. In order to guarantee preserved packet order and cardinality for those sockets, we tunnel a duplicate of the old link's send queue through the new link before we open it for regular traffic. The last arriving packet copy, on whichever link, will be dropped at the receiving end based on the original sequence number, to ensure that only one copy is delivered to the end receiver. In this commit, we change the algorithm for receiving DUPLICATE_MSG packets, at the same time delegating it to a new subfunction, tipc_link_dup_rcv(). Instead of returning an extracted inner packet to the packet reception loop in tipc_rcv(), we just add it to the receiving (new) link's deferred packet queue. The packet will then be processed by that link when it receives its first non-tunneled packet, i.e., at latest when the changeover procedure is finished. Because tipc_link_tunnel_rcv()/tipc_link_dup_rcv() now is consuming all packets of type DUPLICATE_MSG, the calling tipc_rcv() function can omit testing for this. This in turn means that the current conditional jump to the label 'protocol_check' becomes redundant, and we can remove that label. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: remove 'links' list from tipc_bearer structYing Xue
In our ongoing effort to simplify the TIPC locking structure, we see a need to remove the linked list for tipc_links in the bearer. This can be explained as follows. Currently, we have three different ways to access a link, via three different lists/tables: 1: Via a node hash table: Used by the time-critical outgoing/incoming data paths. (e.g. link_send_sections_fast() and tipc_recv_msg() ): grab net_lock(read) find node from node hash table grab node_lock select link grab bearer_lock send_msg() release bearer_lock release node lock release net_lock 2: Via a global linked list for nodes: Used by configuration commands (link_cmd_set_value()) grab net_lock(read) find node and link from global node list (using link name) grab node_lock update link release node lock release net_lock (Same locking order as above. No problem.) 3: Via the bearer's linked link list: Used by notifications from interface (e.g. tipc_disable_bearer() ) grab net_lock(write) grab bearer_lock get link ptr from bearer's link list get node from link grab node_lock delete link release node lock release bearer_lock release net_lock (Different order from above, but works because we grab the outer net_lock in write mode first, excluding all other access.) The first major goal in our simplification effort is to get rid of the "big" net_lock, replacing it with rcu-locks when accessing the node list and node hash array. This will come in a later patch series. But to get there we first need to rewrite access methods ##2 and 3, since removal of net_lock would introduce three major problems: a) In access method #2, we access the link before taking the protecting node_lock. This will not work once net_lock is gone, so we will have to change the access order. We will deal with this in a later commit in this series, "tipc: add node lock protection to link found by link_find_link()". b) When the outer protection from net_lock is gone, taking bearer_lock and node_lock in opposite order of method 1) and 2) will become an obvious deadlock hazard. This is fixed in the commit ("tipc: remove bearer_lock from tipc_bearer struct") later in this series. c) Similar to what is described in problem a), access method #3 starts with using a link pointer that is unprotected by node_lock, in order to via that pointer find the correct node struct and lock it. Before we remove net_lock, this access order must be altered. This is what we do with this commit. We can avoid introducing problem problem c) by even here using the global node list to find the node, before accessing its links. When we loop though the node list we use the own bearer identity as search criteria, thus easily finding the links that are associated to the resetting/disabling bearer. It should be noted that although this method is somewhat slower than the current list traversal, it is in no way time critical. This is only about resetting or deleting links, something that must be considered relatively infrequent events. As a bonus, we can get rid of the mutual pointers between links and bearers. After this commit, pointer dependency go in one direction only: from the link to the bearer. This commit pre-empts introduction of problem c) as described above. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: redefine 'started' flag in struct link to bitmapYing Xue
Currently, the 'started' field in struct tipc_link represents only a binary state, 'started' or 'not started'. We need it to represent more link execution states in the coming commits in this series. Hence, we rename the field to 'flags', and define the current started/non-started state to be represented by the LSB bit of that field. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: move code for deleting links from bearer.c to link.cYing Xue
We break out the code for deleting attached links in the function bearer_disable(), and define a new function named tipc_link_delete_list() to do this job. This commit incurs no functional changes, but makes the code of function bearer_disable() cleaner. It is also a preparation for a more important change to the bearer code, in a subsequent commit in this series. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: move code for resetting links from bearer.c to link.cYing Xue
We break out the code for resetting attached links in the function tipc_reset_bearer(), and define a new function named tipc_link_reset_list() to do this job. This commit incurs no functional changes, but makes the code of function tipc_reset_bearer() cleaner. It is also a preparation for a more important change to the bearer code, in a subsequent commit in this series. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: stricter behavior of message reassembly functionJon Paul Maloy
The function tipc_link_recv_fragment(struct sk_buff **buf) currently leaves the value of the input buffer pointer undefined when it returns, except when the return code indicates that the reassembly is complete. This despite the fact that it always consumes the input buffer. Here, we enforce a stricter behavior by this function, ensuring that the returned buffer pointer is non-NULL if and only if the reassembly is complete. This makes it possible to test for the buffer pointer as criteria for successful reassembly. We also rename the function to tipc_link_frag_rcv(), which is both shorter and more in line with common naming practice in the network subsystem. Apart from the new name, these changes have no impact on current users of the function, but makes it more practical for use in some planned future commits. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: explicitly include core.h in addr.hAndreas Bofjäll
The inline functions in addr.h uses tipc_own_addr which is exported by core.h, but addr.h never actually includes it. It works because it is explicitly included where this is used, but it looks a bit strange. Include core.h in addr.h explicitly to make the dependency clearer. Signed-off-by: Andreas Bofjäll <andreas.bofjall@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13tipc: fix message corruption bug for deferred packetsErik Hugne
If a packet received on a link is out-of-sequence, it will be placed on a deferred queue and later reinserted in the receive path once the preceding packets have been processed. The problem with this is that it will be subject to the buffer adjustment from link_recv_buf_validate twice. The second adjustment for 20 bytes header space will corrupt the packet. We solve this by tagging the deferred packets and bail out from receive buffer validation for packets that have already been subjected to this. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-18net: add build-time checks for msg->msg_name sizeSteffen Hurrle
This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg handler msg_name and msg_namelen logic"). DECLARE_SOCKADDR validates that the structure we use for writing the name information to is not larger than the buffer which is reserved for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR consistently in sendmsg code paths. Signed-off-by: Steffen Hurrle <steffen@hurrle.net> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16tipc: standardize recvmsg routineYing Xue
Standardize the behaviour of waiting for events in TIPC recvmsg() so that all variables of socket or port structures are protected within socket lock, allowing the process of calling recvmsg() to be woken up at appropriate time. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16tipc: standardize sendmsg routine of connected socketYing Xue
Standardize the behaviour of waiting for events in TIPC send_packet() so that all variables of socket or port structures are protected within socket lock, allowing the process of calling sendmsg() to be woken up at appropriate time. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16tipc: standardize sendmsg routine of connectionless socketYing Xue
Comparing the behaviour of how to wait for events in TIPC sendmsg() with other stacks, the TIPC implementation might be perceived as different, and sometimes even incorrect. For instance, sk_sleep() and tport->congested variables associated with socket are exposed without socket lock protection while wait_event_interruptible_timeout() accesses them. So standardizing it with similar implementation in other stacks can help us correct these errors which the process of calling sendmsg() cannot be woken up event if an expected event arrive at socket or improperly woken up although the wake condition doesn't match. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16tipc: standardize accept routineYing Xue
Comparing the behaviour of how to wait for events in TIPC accept() with other stacks, the TIPC implementation might be perceived as different, and sometimes even incorrect. As sk_sleep() and sk->sk_receive_queue variables associated with socket are not protected by socket lock, the process of calling accept() may be woken up improperly or sometimes cannot be woken up at all. After standardizing it with inet_csk_wait_for_connect routine, we can get benefits including: avoiding 'thundering herd' phenomenon, adding a timeout mechanism for accept(), coping with a pending signal, and having sk_sleep() and sk->sk_receive_queue being always protected within socket lock scope and so on. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16tipc: standardize connect routineYing Xue
Comparing the behaviour of how to wait for events in TIPC connect() with other stacks, the TIPC implementation might be perceived as different, and sometimes even incorrect. For instance, as both sock->state and sk_sleep() are directly fed to wait_event_interruptible_timeout() as its arguments, and socket lock has to be released before we call wait_event_interruptible_timeout(), the two variables associated with socket are exposed out of socket lock protection, thereby probably getting stale values so that the process of calling connect() cannot be woken up exactly even if correct event arrives or it is woken up improperly even if the wake condition is not satisfied in practice. Therefore, standardizing its behaviour with sk_stream_wait_connect routine can avoid these risks. Additionally the implementation of connect routine is simplified as a whole, allowing it to return correct values in all different cases. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14tipc: spelling fixesstephen hemminger
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2014-01-07tipc: make link start event synchronousJon Paul Maloy
When a link is created we delay the start event by launching it to be executed later in a tasklet. As we hold all the necessary locks at the moment of creation, and there is no risk of deadlock or contention, this delay serves no purpose in the current code. We remove this obsolete indirection step, and the associated function link_start(). At the same time, we rename the function tipc_link_stop() to the more appropriate tipc_link_purge_queues(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07tipc: introduce new spinlock to protect struct link_reqYing Xue
Currently, only 'bearer_lock' is used to protect struct link_req in the function disc_timeout(). This is unsafe, since the member fields 'num_nodes' and 'timer_intv' might be accessed by below three different threads simultaneously, none of them grabbing bearer_lock in the critical region: link_activate() tipc_bearer_add_dest() tipc_disc_add_dest() req->num_nodes++; tipc_link_reset() tipc_bearer_remove_dest() tipc_disc_remove_dest() req->num_nodes-- disc_update() read req->num_nodes write req->timer_intv disc_timeout() read req->num_nodes read/write req->timer_intv Without lock protection, the only symptom of a race is that discovery messages occasionally may not be sent out. This is not fatal, since such messages are best-effort anyway. On the other hand, since discovery messages are not time critical, adding a protecting lock brings no serious overhead either. So we add a new, dedicated spinlock in order to guarantee absolute data consistency in link_req objects. This also helps reduce the overall role of the bearer_lock, which we want to remove completely in a later commit series. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07tipc: remove 'has_redundant_link' flag from STATE link protocol messagesJon Paul Maloy
The flag 'has_redundant_link' is defined only in RESET and ACTIVATE protocol messages. Due to an ambiguity in the protocol specification it is currently also transferred in STATE messages. Its value is used to initialize a link state variable, 'permit_changeover', which is used to inhibit futile link failover attempts when it is known that the peer node has no working links at the moment, although the local node may still think it has one. The fact that 'has_redundant_link' incorrectly is read from STATE messages has the effect that 'permit_changeover' sometimes gets a wrong value, and permanently blocks any links from being re-established. Such failures can only occur in in dual-link systems, and are extremely rare. This bug seems to have always been present in the code. Furthermore, since commit b4b5610223f17790419b03eaa962b0e3ecf930d7 ("tipc: Ensure both nodes recognize loss of contact between them"), the 'permit_changeover' field serves no purpose any more. The task of enforcing 'lost contact' cycles at both peer endpoints is now taken by a new mechanism, using the flags WAIT_NODE_DOWN and WAIT_PEER_DOWN in struct tipc_node to abort unnecessary failover attempts. We therefore remove the 'has_redundant_link' flag from STATE messages, as well as the now redundant 'permit_changeover' variable. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>