diff options
| author | Tariq Toukan <tariqt@nvidia.com> | 2025-12-09 14:56:16 +0200 |
|---|---|---|
| committer | Paolo Abeni <pabeni@redhat.com> | 2025-12-18 13:39:29 +0100 |
| commit | c8591decd9dbf395cb8ae398e70b0438fdd24aee (patch) | |
| tree | 24ebf094a6cf8ce6684c9b4c411ce519a44571d0 /fs/btrfs/dev-replace.c | |
| parent | 9ab89bde13e5251e1d0507e1cc426edcdfe19142 (diff) | |
net/mlx5e: Do not update BQL of old txqs during channel reconfiguration
During channel reconfiguration (e.g., ethtool private flags changes),
the driver can trigger a kernel BUG_ON in dql_completed() with the error
"kernel BUG at lib/dynamic_queue_limits.c:99".
The issue occurs in the following sequence:
During mlx5e_safe_switch_params(), old channels are deactivated via
mlx5e_deactivate_txqsq(). New channels are created and activated, taking
ownership of the netdev_queues and their BQL state.
When old channels are closed via mlx5e_close_txqsq(), there may be
pending TX descriptors (sq->cc != sq->pc) that were in-flight during the
deactivation.
mlx5e_free_txqsq_descs() frees these pending descriptors and attempts to
complete them via netdev_tx_completed_queue().
However, the BQL state (dql->num_queued and dql->num_completed) have
been reset in mlx5e_activate_txqsq and belong to the new queue owner,
leading to dql->num_queued - dql->num_completed < nbytes.
This triggers BUG_ON(count > num_queued - num_completed) in
dql_completed().
Fixes: 3b88a535a8e1 ("net/mlx5e: Defer channels closure to reduce interface down time")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-9-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Diffstat (limited to 'fs/btrfs/dev-replace.c')
0 files changed, 0 insertions, 0 deletions
