linux-toradex.git/kernel/printk/nbcon.c, branch v6.19

printk/nbcon: Restore IRQ in atomic flush after each emitted record

2025-12-15T15:18:41+00:00

The commit d5d399efff6577 ("printk/nbcon: Release nbcon consoles ownership
in atomic flush after each emitted record") prevented stall of a CPU
which lost nbcon console ownership because another CPU entered
an emergency flush.

But there is still the problem that the CPU doing the emergency flush
might cause a stall on its own.

Let's go even further and restore IRQ in the atomic flush after
each emitted record.

It is not a complete solution. The interrupts and/or scheduling might
still be blocked when the emergency atomic flush was called with
IRQs and/or scheduling disabled. But it should remove the following
lockup:

  mlx5_core 0000:03:00.0: Shutdown was called
  kvm: exiting hardware virtualization
  arm-smmu-v3 arm-smmu-v3.10.auto: CMD_SYNC timeout at 0x00000103 [hwprod 0x00000104, hwcons 0x00000102]
  smp: csd: Detected non-responsive CSD lock (#1) on CPU#4, waiting 5000000032 ns for CPU#00 do_nothing (kernel/smp.c:1057)
  smp:     csd: CSD lock (#1) unresponsive.
  [...]
  Call trace:
  pl011_console_write_atomic (./arch/arm64/include/asm/vdso/processor.h:12 drivers/tty/serial/amba-pl011.c:2540) (P)
  nbcon_emit_next_record (kernel/printk/nbcon.c:1049)
  __nbcon_atomic_flush_pending_con (kernel/printk/nbcon.c:1517)
  __nbcon_atomic_flush_pending.llvm.15488114865160659019 (./arch/arm64/include/asm/alternative-macros.h:254 ./arch/arm64/include/asm/cpufeature.h:808 ./arch/arm64/include/asm/irqflags.h:192 kernel/printk/nbcon.c:1562 kernel/printk/nbcon.c:1612)
  nbcon_atomic_flush_pending (kernel/printk/nbcon.c:1629)
  printk_kthreads_shutdown (kernel/printk/printk.c:?)
  syscore_shutdown (drivers/base/syscore.c:120)
  kernel_kexec (kernel/kexec_core.c:1045)
  __arm64_sys_reboot (kernel/reboot.c:794 kernel/reboot.c:722 kernel/reboot.c:722)
  invoke_syscall (arch/arm64/kernel/syscall.c:50)
  el0_svc_common.llvm.14158405452757855239 (arch/arm64/kernel/syscall.c:?)
  do_el0_svc (arch/arm64/kernel/syscall.c:152)
  el0_svc (./arch/arm64/include/asm/alternative-macros.h:254 ./arch/arm64/include/asm/cpufeature.h:808 ./arch/arm64/include/asm/irqflags.h:73 arch/arm64/kernel/entry-common.c:169 arch/arm64/kernel/entry-common.c:182 arch/arm64/kernel/entry-common.c:749)
  el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:820)
  el0t_64_sync (arch/arm64/kernel/entry.S:600)

In this case, nbcon_atomic_flush_pending() is called from
printk_kthreads_shutdown() with IRQs and scheduling enabled.

Note that __nbcon_atomic_flush_pending_con() is directly called also from
nbcon_device_release() where the disabled IRQs might break PREEMPT_RT
guarantees. But the atomic flush is called only in emergency or panic
situations where the latencies are irrelevant anyway.

An ultimate solution would be a touching of watchdogs. But it would hide
all problems. Let's do it later when anyone reports a stall which does
not have a better solution.

Closes: https://lore.kernel.org/r/sqwajvt7utnt463tzxgwu2yctyn5m6bjwrslsnupfexeml6hkd@v6sqmpbu3vvu
Tested-by: Breno Leitao 
Reviewed-by: John Ogness 
Link: https://patch.msgid.link/20251212124520.244483-1-pmladek@suse.com
Signed-off-by: Petr Mladek

Merge branch 'rework/write_atomic-unsafe' into for-linus

2025-12-01T13:17:04+00:00

Merge branch 'rework/suspend-fixes' into for-linus

2025-12-01T13:16:28+00:00

Merge branch 'rework/nbcon-in-kdb' into for-linus

2025-12-01T13:15:43+00:00

printk: Avoid scheduling irq_work on suspend

2025-11-19T15:01:31+00:00

Allowing irq_work to be scheduled while trying to suspend has shown
to cause problems as some architectures interpret the pending
interrupts as a reason to not suspend. This became a problem for
printk() with the introduction of NBCON consoles. With every
printk() call, NBCON console printing kthreads are woken by queueing
irq_work. This means that irq_work continues to be queued due to
printk() calls late in the suspend procedure.

Avoid this problem by preventing printk() from queueing irq_work
once console suspending has begun. This applies to triggering NBCON
and legacy deferred printing as well as klogd waiters.

Since triggering of NBCON threaded printing relies on irq_work, the
pr_flush() within console_suspend_all() is used to perform the final
flushing before suspending consoles and blocking irq_work queueing.
NBCON consoles that are not suspended (due to the usage of the
"no_console_suspend" boot argument) transition to atomic flushing.

Introduce a new global variable @console_irqwork_blocked to flag
when irq_work queueing is to be avoided. The flag is used by
printk_get_console_flush_type() to avoid allowing deferred printing
and switch NBCON consoles to atomic flushing. It is also used by
vprintk_emit() to avoid klogd waking.

Add WARN_ON_ONCE(console_irqwork_blocked) to the irq_work queuing
functions to catch any code that attempts to queue printk irq_work
during the suspending/resuming procedure.

Cc: stable@vger.kernel.org # 6.13.x because no drivers in 6.12.x
Fixes: 6b93bb41f6ea ("printk: Add non-BKL (nbcon) console basic infrastructure")
Closes: https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com
Signed-off-by: John Ogness 
Reviewed-by: Petr Mladek 
Tested-by: Sherry Sun 
Link: https://patch.msgid.link/20251113160351.113031-3-john.ogness@linutronix.de
Signed-off-by: Petr Mladek

printk: Allow printk_trigger_flush() to flush all types

2025-11-19T15:01:31+00:00

Currently printk_trigger_flush() only triggers legacy offloaded
flushing, even if that may not be the appropriate method to flush
for currently registered consoles. (The function predates the
NBCON consoles.)

Since commit 6690d6b52726 ("printk: Add helper for flush type
logic") there is printk_get_console_flush_type(), which also
considers NBCON consoles and reports all the methods of flushing
appropriate based on the system state and consoles available.

Update printk_trigger_flush() to use
printk_get_console_flush_type() to appropriately flush registered
consoles.

Suggested-by: Petr Mladek 
Signed-off-by: John Ogness 
Reviewed-by: Petr Mladek 
Link: https://lore.kernel.org/stable/20251113160351.113031-2-john.ogness%40linutronix.de
Tested-by: Sherry Sun 
Link: https://patch.msgid.link/20251113160351.113031-2-john.ogness@linutronix.de
Signed-off-by: Petr Mladek

printk: nbcon: Allow unsafe write_atomic() for panic

2025-11-07T13:30:52+00:00

There may be console drivers that have not yet figured out a way
to implement safe atomic printing (->write_atomic() callback).
These drivers could choose to only implement threaded printing
(->write_thread() callback), but then it is guaranteed that _no_
output will be printed during panic. Not even attempted.

As a result, developers may be tempted to implement unsafe
->write_atomic() callbacks and/or implement some sort of custom
deferred printing trickery to try to make it work. This goes
against the principle intention of the nbcon API as well as
endangers other nbcon drivers that are doing things correctly
(safely).

As a compromise, allow nbcon drivers to implement unsafe
->write_atomic() callbacks by providing a new console flag
CON_NBCON_ATOMIC_UNSAFE. When specified, the ->write_atomic()
callback for that console will _only_ be called during the
final "hope and pray" flush attempt at the end of a panic:
nbcon_atomic_flush_unsafe().

Signed-off-by: John Ogness 
Link: https://lore.kernel.org/lkml/b2qps3uywhmjaym4mht2wpxul4yqtuuayeoq4iv4k3zf5wdgh3@tocu6c7mj4lt
Reviewed-by: Petr Mladek 
Link: https://lore.kernel.org/all/swdpckuwwlv3uiessmtnf2jwlx3jusw6u7fpk5iggqo4t2vdws@7rpjso4gr7qp/ [1]
Link: https://lore.kernel.org/all/20251103-fix_netpoll_aa-v4-1-4cfecdf6da7c@debian.org/ [2]
Link: https://patch.msgid.link/20251027161212.334219-2-john.ogness@linutronix.de
[pmladek@suse.com: Fix build with rework/nbcon-in-kdb branch.]
Signed-off-by: Petr Mladek

printk/nbcon: Release nbcon consoles ownership in atomic flush after each emitted record

2025-10-30T11:11:33+00:00

printk() tries to flush messages with NBCON_PRIO_EMERGENCY on
nbcon consoles immediately. It might take seconds to flush all
pending lines on slow serial consoles. Note that there might be
hundreds of messages, for example:

[    3.771531][    T1] pci 0000:3e:08.1: [8086:324
** replaying previous printk message **
[    3.771531][    T1] pci 0000:3e:08.1: [8086:3246] type 00 class 0x088000 PCIe Root Complex Integrated Endpoint
[ ... more than 2000 lines, about 200kB messages ... ]
[    3.837752][    T1] pci 0000:20:01.0: Adding to iommu group 18
[    3.837851][    T
** replaying previous printk message **
[    3.837851][    T1] pci 0000:20:03.0: Adding to iommu group 19
[    3.837946][    T1] pci 0000:20:05.0: Adding to iommu group 20
[ ... more than 500 messages for iommu groups 21-590 ...]
[    3.912932][    T1] pci 0000:f6:00.1: Adding to iommu group 591
[    3.913070][    T1] pci 0000:f6:00.2: Adding to iommu group 592
[    3.913243][    T1] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    3.913245][    T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    3.913245][    T1] software IO TLB: mapped [mem 0x000000004f000000-0x0000000053000000] (64MB)
[    3.913324][    T1] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 655360 ms ovfl timer
[    3.913325][    T1] RAPL PMU: hw unit of domain package 2^-14 Joules
[    3.913326][    T1] RAPL PMU: hw unit of domain dram 2^-14 Joules
[    3.913327][    T1] RAPL PMU: hw unit of domain psys 2^-0 Joules
[    3.933486][    T1] ------------[ cut here ]------------
[    3.933488][    T1] WARNING: CPU: 2 PID: 1 at arch/x86/events/intel/uncore.c:1156 uncore_pci_pmu_register+0x15e/0x180
[    3.930291][    C0] watchdog: Watchdog detected hard LOCKUP on cpu 0
[    3.930291][    C0] Kernel panic - not syncing: Hard LOCKUP
[...]
[    3.930291][    C0] CPU: 0 UID: 0 PID: 18 Comm: pr/ttyS0 Not tainted...
[...]
[    3.930291][    C0] RIP: 0010:nbcon_reacquire_nobuf+0x11/0x50
[    3.930291][    C0] Call Trace:
[...]
[    3.930291][    C0]  
[    3.930291][    C0]  serial8250_console_write+0x16d/0x5c0
[    3.930291][    C0]  nbcon_emit_next_record+0x22c/0x250
[    3.930291][    C0]  nbcon_emit_one+0x93/0xe0
[    3.930291][    C0]  nbcon_kthread_func+0x13c/0x1c0

The are visible two takeovers of the console ownership:

  - The 1st one is triggered by the "WARNING: CPU: 2 PID: 1 at
    arch/x86/..." line printed with NBCON_PRIO_EMERGENCY.

  - The 2nd one is triggered by the "Kernel panic - not syncing:
    Hard LOCKUP" line printed with NBCON_PRIO_PANIC.

There are more than 2500 lines, at about 240kB, emitted between
the takeover and the 1st "WARNING" line in the emergency context.
This amount of pending messages had to be flushed by
nbcon_atomic_flush_pending() when WARN() printed its first line.

The atomic flush was holding the nbcon console context for too long so
that it triggered hard lockup on the CPU running the printk kthread
"pr/ttyS0". The kthread needed to reacquire the console ownership
for restoring the original serial port state in serial8250_console_write().

Prevent the hardlockup by releasing the nbcon console ownership after
each emitted record.

Note that __nbcon_atomic_flush_pending_con() used to hold the console
ownership all the time because it blocked the printk kthread. Otherwise
the kthread tried to flush the messages in parallel which caused repeated
takeovers and more replayed messages.

It is not longer a problem because the repeated takeovers are blocked
by the counter of emergency contexts, see nbcon_cpu_emergency_cnt.

Link: https://lore.kernel.org/all/aNQO-zl3k1l4ENfy@pathway.suse.cz
Reviewed-by: Andrew Murray 
Reviewed-by: John Ogness 
Link: https://patch.msgid.link/20250926124912.243464-4-pmladek@suse.com
Signed-off-by: Petr Mladek

printk/nbcon/panic: Allow printk kthread to sleep when the system is in panic

2025-10-30T11:11:33+00:00

The printk kthread might be running when there is a panic in progress.
But it is not able to acquire the console ownership any longer.

Prevent the desperate attempts to acquire the ownership and allow sleeping
in panic. It would make it behave the same as when there is any CPU
in an emergency context.

Reviewed-by: Andrew Murray 
Reviewed-by: John Ogness 
Link: https://patch.msgid.link/20250926124912.243464-3-pmladek@suse.com
[pmladek@suse.com: Rebased on top of 6.18-rc1 (panic_in_progress() moved to panic.c)]
Signed-off-by: Petr Mladek

printk/nbcon: Block printk kthreads when any CPU is in an emergency context

2025-10-30T11:10:16+00:00

In emergency contexts, printk() tries to flush messages directly even
on nbcon consoles. And it is allowed to takeover the console ownership
and interrupt the printk kthread in the middle of a message.

Only one takeover and one repeated message should be enough in most
situations. The first emergency message flushes the backlog and printk
kthreads get to sleep. Next emergency messages are flushed directly
and printk() does not wake up the kthreads.

However, the one takeover is not guaranteed. Any printk() in normal
context on another CPU could wake up the kthreads. Or a new emergency
message might be added before the kthreads get to sleep. Note that
the interrupted .write_thread() callbacks usually have to call
nbcon_reacquire_nobuf() and restore the original device setting
before checking for pending messages.

The risk of the repeated takeovers will be even bigger because
__nbcon_atomic_flush_pending_con is going to release the console
ownership after each emitted record. It will be needed to prevent
hardlockup reports on other CPUs which are busy waiting for
the context ownership, for example, by nbcon_reacquire_nobuf() or
__uart_port_nbcon_acquire().

The repeated takeovers break the output, for example:

    [ 5042.650211][ T2220] Call Trace:
    [ 5042.6511
    ** replaying previous printk message **
    [ 5042.651192][ T2220]  
    [ 5042.652160][ T2220]  kunit_run_
    ** replaying previous printk message **
    [ 5042.652160][ T2220]  kunit_run_tests+0x72/0x90
    [ 5042.653340][ T22
    ** replaying previous printk message **
    [ 5042.653340][ T2220]  ? srso_alias_return_thunk+0x5/0xfbef5
    [ 5042.654628][ T2220]  ? stack_trace_save+0x4d/0x70
    [ 5042.6553
    ** replaying previous printk message **
    [ 5042.655394][ T2220]  ? srso_alias_return_thunk+0x5/0xfbef5
    [ 5042.656713][ T2220]  ? save_trace+0x5b/0x180

A more robust solution is to block the printk kthread entirely whenever
*any* CPU enters an emergency context. This ensures that critical messages
can be flushed without contention from the normal, non-atomic printing
path.

Link: https://lore.kernel.org/all/aNQO-zl3k1l4ENfy@pathway.suse.cz
Reviewed-by: Andrew Murray 
Reviewed-by: John Ogness 
Link: https://patch.msgid.link/20250926124912.243464-2-pmladek@suse.com
[pmladek@suse.com: Added changes proposed by John Ogness]
Signed-off-by: Petr Mladek