[SCSI] fcoe: Ensure fcoe_recv_frame is always called in process context

commit 859b7b649ab58ee5cbfb761491317d5b315c1b0f introduced the ability to call fcoe_recv_frame in softirq context. While this is beneficial to performance, its not safe to do, as it breaks the serialization of access to the lport structure (i.e. when an fcoe interface is being torn down, theres no way to serialize the teardown effort with the completion of receieve operations occuring in softirq context. As a result, lport (and other) data structures can be read and modified in parallel leading to corruption. Most notable is the vport list, which is protected by a mutex, that will cause a panic if a softirq receive while said mutex is locked. Additionaly, the ema_list, discussed here: http://lists.open-fcoe.org/pipermail/devel/2012-February/011947.html Can be corrupted if a list traversal occurs in softirq context at the same time as a list delete in process context. And generally the lport state variables will not be stable, and may lead to unpredictable results. The most direct fix is to remove the bits from the above commit that allowed fcoe_recv_frame to be called in softirq context. We just force all frames to be handled by the per-cpu rx threads. This will allow the fcoe_if_destroy's use of fcoe_percpu_clean to function properly, ensuring that no frames are being received while the lport is being torn down. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reviewed-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
author: Neil Horman <nhorman@tuxdriver.com> 2012-03-09 14:49:48 -0800
committer: James Bottomley <JBottomley@Parallels.com> 2012-03-28 09:02:26 +0100
commit: 5e70c4c43e559ea6a1bf1edc0eb7d284ea7f16b4 (patch)
tree: 48cf08ae09bdf23e9a48a10821ca5223f25ab74f /drivers
parent: b08c1856b4d4295040ec72f15427588087369220 (diff)
1 files changed, 10 insertions, 17 deletions
diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index 278958157e24..22ae29520d6e 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1463,24 +1463,17 @@ static int fcoe_rcv(struct sk_buff *skb, struct net_device *netdev,
 	 * so we're free to queue skbs into it's queue.
 	 */
 
-	/* If this is a SCSI-FCP frame, and this is already executing on the
-	 * correct CPU, and the queue for this CPU is empty, then go ahead
-	 * and process the frame directly in the softirq context.
-	 * This lets us process completions without context switching from the
-	 * NET_RX softirq, to our receive processing thread, and then back to
-	 * BLOCK softirq context.
+	/*
+	 * Note: We used to have a set of conditions under which we would
+	 * call fcoe_recv_frame directly, rather than queuing to the rx list
+	 * as it could save a few cycles, but doing so is prohibited, as
+	 * fcoe_recv_frame has several paths that may sleep, which is forbidden
+	 * in softirq context.
 	 */
-	if (fh->fh_type == FC_TYPE_FCP &&
-	    cpu == smp_processor_id() &&
-	    skb_queue_empty(&fps->fcoe_rx_list)) {
-		spin_unlock_bh(&fps->fcoe_rx_list.lock);
-		fcoe_recv_frame(skb);
-	} else {
-		__skb_queue_tail(&fps->fcoe_rx_list, skb);
-		if (fps->fcoe_rx_list.qlen == 1)
-			wake_up_process(fps->thread);
-		spin_unlock_bh(&fps->fcoe_rx_list.lock);
-	}
+	__skb_queue_tail(&fps->fcoe_rx_list, skb);
+	if (fps->fcoe_rx_list.qlen == 1)
+		wake_up_process(fps->thread);
+	spin_unlock_bh(&fps->fcoe_rx_list.lock);
 
 	return 0;
 err:
author	Neil Horman <nhorman@tuxdriver.com>	2012-03-09 14:49:48 -0800
committer	James Bottomley <JBottomley@Parallels.com>	2012-03-28 09:02:26 +0100
commit	5e70c4c43e559ea6a1bf1edc0eb7d284ea7f16b4 (patch)
tree	48cf08ae09bdf23e9a48a10821ca5223f25ab74f /drivers
parent	b08c1856b4d4295040ec72f15427588087369220 (diff)