summaryrefslogtreecommitdiff
path: root/drivers/net
diff options
context:
space:
mode:
authorSaeed Mahameed <saeedm@nvidia.com>2026-03-30 22:40:15 +0300
committerJakub Kicinski <kuba@kernel.org>2026-04-01 20:10:41 -0700
commit403186400a1a6166efe7031edc549c15fee4723f (patch)
tree98b58e340a4e42b32a271663e978a8c81c19022e /drivers/net
parent10dc35f6a443d488f219d1a1e3fb8f8dac422070 (diff)
net/mlx5: Fix switchdev mode rollback in case of failure
If for some internal reason switchdev mode fails, we rollback to legacy mode, before this patch, rollback will unregister the uplink netdev and leave it unregistered causing the below kernel bug. To fix this, we need to avoid netdev unregister by setting the proper rollback flag 'MLX5_PRIV_FLAGS_SWITCH_LEGACY' to indicate legacy mode. devlink (431) used greatest stack depth: 11048 bytes left mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), \ necvfs(0), active vports(0) mlx5_core 0000:00:03.0: E-Switch: Supported tc chains and prios offload mlx5_core 0000:00:03.0: Loading uplink representor for vport 65535 mlx5_core 0000:00:03.0: mlx5_cmd_out_err:816:(pid 456): \ QUERY_HCA_CAP(0x100) op_mod(0x0) failed, \ status bad parameter(0x3), syndrome (0x3a3846), err(-22) mlx5_core 0000:00:03.0 enp0s3np0 (unregistered): Unloading uplink \ representor for vport 65535 ------------[ cut here ]------------ kernel BUG at net/core/dev.c:12070! Oops: invalid opcode: 0000 [#1] SMP NOPTI CPU: 2 UID: 0 PID: 456 Comm: devlink Not tainted 6.16.0-rc3+ \ #9 PREEMPT(voluntary) RIP: 0010:unregister_netdevice_many_notify+0x123/0xae0 ... Call Trace: [ 90.923094] unregister_netdevice_queue+0xad/0xf0 [ 90.923323] unregister_netdev+0x1c/0x40 [ 90.923522] mlx5e_vport_rep_unload+0x61/0xc6 [ 90.923736] esw_offloads_enable+0x8e6/0x920 [ 90.923947] mlx5_eswitch_enable_locked+0x349/0x430 [ 90.924182] ? is_mp_supported+0x57/0xb0 [ 90.924376] mlx5_devlink_eswitch_mode_set+0x167/0x350 [ 90.924628] devlink_nl_eswitch_set_doit+0x6f/0xf0 [ 90.924862] genl_family_rcv_msg_doit+0xe8/0x140 [ 90.925088] genl_rcv_msg+0x18b/0x290 [ 90.925269] ? __pfx_devlink_nl_pre_doit+0x10/0x10 [ 90.925506] ? __pfx_devlink_nl_eswitch_set_doit+0x10/0x10 [ 90.925766] ? __pfx_devlink_nl_post_doit+0x10/0x10 [ 90.926001] ? __pfx_genl_rcv_msg+0x10/0x10 [ 90.926206] netlink_rcv_skb+0x52/0x100 [ 90.926393] genl_rcv+0x28/0x40 [ 90.926557] netlink_unicast+0x27d/0x3d0 [ 90.926749] netlink_sendmsg+0x1f7/0x430 [ 90.926942] __sys_sendto+0x213/0x220 [ 90.927127] ? __sys_recvmsg+0x6a/0xd0 [ 90.927312] __x64_sys_sendto+0x24/0x30 [ 90.927504] do_syscall_64+0x50/0x1c0 [ 90.927687] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 90.927929] RIP: 0033:0x7f7d0363e047 Fixes: 2a4f56fbcc47 ("net/mlx5e: Keep netdev when leave switchdev for devlink set legacy only") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260330194015.53585-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'drivers/net')
-rw-r--r--drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c2
1 files changed, 2 insertions, 0 deletions
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 7a9ee36b8dca..01f6aecc4fcc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -3761,6 +3761,8 @@ int esw_offloads_enable(struct mlx5_eswitch *esw)
return 0;
err_vports:
+ /* rollback to legacy, indicates don't unregister the uplink netdev */
+ esw->dev->priv.flags |= MLX5_PRIV_FLAGS_SWITCH_LEGACY;
mlx5_esw_offloads_rep_unload(esw, MLX5_VPORT_UPLINK);
err_uplink:
esw_offloads_steering_cleanup(esw);