summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide/sysctl
diff options
context:
space:
mode:
authorDmitry Torokhov <dmitry.torokhov@gmail.com>2026-04-19 18:28:57 -0700
committerDmitry Torokhov <dmitry.torokhov@gmail.com>2026-04-19 18:28:57 -0700
commitf4b369c6fe0ceaba2da2daff8c9eb415f85926dd (patch)
tree30465d0a429b2c224685b5d8e804bf053c4d129a /Documentation/admin-guide/sysctl
parentff14dafde15c11403fac61367a34fea08926e9ee (diff)
parent2ca45e57ea027fffe3350ae5e21ad9cecb0dce74 (diff)
Merge branch 'next' into for-linus
Prepare input updates for 7.1 merge window.
Diffstat (limited to 'Documentation/admin-guide/sysctl')
-rw-r--r--Documentation/admin-guide/sysctl/kernel.rst41
-rw-r--r--Documentation/admin-guide/sysctl/net.rst105
-rw-r--r--Documentation/admin-guide/sysctl/vm.rst42
3 files changed, 158 insertions, 30 deletions
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index f3ee807b5d8b..9aed74e65cf4 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -397,13 +397,14 @@ a hung task is detected.
hung_task_panic
===============
-Controls the kernel's behavior when a hung task is detected.
+When set to a non-zero value, a kernel panic will be triggered if the
+number of hung tasks found during a single scan reaches this value.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
-= =================================================
+= =======================================================
0 Continue operation. This is the default behavior.
-1 Panic immediately.
-= =================================================
+N Panic when N hung tasks are found during a single scan.
+= =======================================================
hung_task_check_count
@@ -421,6 +422,11 @@ the system boot.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
+hung_task_sys_info
+==================
+A comma separated list of extra system information to be dumped when
+hung task is detected, for example, "tasks,mem,timers,locks,...".
+Refer 'panic_sys_info' section below for more details.
hung_task_timeout_secs
======================
@@ -515,6 +521,15 @@ default), only processes with the CAP_SYS_ADMIN capability may create
io_uring instances.
+kernel_sys_info
+===============
+A comma separated list of extra system information to be dumped when
+soft/hard lockup is detected, for example, "tasks,mem,timers,locks,...".
+Refer 'panic_sys_info' section below for more details.
+
+It serves as the default kernel control knob, which will take effect
+when a kernel module calls sys_info() with parameter==0.
+
kexec_load_disabled
===================
@@ -576,6 +591,14 @@ if leaking kernel pointer values to unprivileged users is a concern.
When ``kptr_restrict`` is set to 2, kernel pointers printed using
%pK will be replaced with 0s regardless of privileges.
+For disabling these security restrictions early at boot time (and once
+for all), use the ``hash_pointers`` boot parameter instead.
+
+softlockup_sys_info & hardlockup_sys_info
+=========================================
+A comma separated list of extra system information to be dumped when
+soft/hard lockup is detected, for example, "tasks,mem,timers,locks,...".
+Refer 'panic_sys_info' section below for more details.
modprobe
========
@@ -910,8 +933,8 @@ to 'panic_print'. Possible values are:
============= ===================================================
tasks print all tasks info
mem print system memory info
-timer print timers info
-lock print locks info if CONFIG_LOCKDEP is on
+timers print timers info
+locks print locks info if CONFIG_LOCKDEP is on
ftrace print ftrace buffer
all_bt print all CPUs backtrace (if available in the arch)
blocked_tasks print only tasks in uninterruptible (blocked) state
@@ -1215,12 +1238,6 @@ that support this feature.
== ===========================================================================
-real-root-dev
-=============
-
-See Documentation/admin-guide/initrd.rst.
-
-
reboot-cmd (SPARC only)
=======================
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 2ef50828aff1..3b2ad61995d4 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -40,8 +40,8 @@ Table : Subdirectories in /proc/sys/net
bridge Bridging rose X.25 PLP layer
core General parameter tipc TIPC
ethernet Ethernet protocol unix Unix domain sockets
- ipv4 IP version 4 x25 X.25 protocol
- ipv6 IP version 6
+ ipv4 IP version 4 vsock VSOCK sockets
+ ipv6 IP version 6 x25 X.25 protocol
========= =================== = ========== ===================
1. /proc/sys/net/core - Network core options
@@ -212,6 +212,14 @@ mem_pcpu_rsv
Per-cpu reserved forward alloc cache size in page units. Default 1MB per CPU.
+bypass_prot_mem
+---------------
+
+Skip charging socket buffers to the global per-protocol memory
+accounting controlled by net.ipv4.tcp_mem, net.ipv4.udp_mem, etc.
+
+Default: 0 (off)
+
rmem_default
------------
@@ -295,24 +303,33 @@ netdev_max_backlog
Maximum number of packets, queued on the INPUT side, when the interface
receives packets faster than kernel can process them.
+qdisc_max_burst
+------------------
+
+Maximum number of packets that can be temporarily stored before
+reaching qdisc.
+
+Default: 1000
+
netdev_rss_key
--------------
-RSS (Receive Side Scaling) enabled drivers use a 40 bytes host key that is
-randomly generated.
+RSS (Receive Side Scaling) enabled drivers use a host key that
+is randomly generated.
Some user space might need to gather its content even if drivers do not
provide ethtool -x support yet.
::
myhost:~# cat /proc/sys/net/core/netdev_rss_key
- 84:50:f4:00:a8:15:d1:a7:e9:7f:1d:60:35:c7:47:25:42:97:74:ca:56:bb:b6:a1:d8: ... (52 bytes total)
+ 84:50:f4:00:a8:15:d1:a7:e9:7f:1d:60:35:c7:47:25:42:97:74:ca:56:bb:b6:a1:d8: ... (256 bytes total)
-File contains nul bytes if no driver ever called netdev_rss_key_fill() function.
+File contains all nul bytes if no driver ever called netdev_rss_key_fill()
+function.
Note:
- /proc/sys/net/core/netdev_rss_key contains 52 bytes of key,
- but most drivers only use 40 bytes of it.
+ /proc/sys/net/core/netdev_rss_key contains 256 bytes of key,
+ but many drivers only use 40 or 52 bytes of it.
::
@@ -347,9 +364,9 @@ skb_defer_max
-------------
Max size (in skbs) of the per-cpu list of skbs being freed
-by the cpu which allocated them. Used by TCP stack so far.
+by the cpu which allocated them.
-Default: 64
+Default: 128
optmem_max
----------
@@ -406,6 +423,23 @@ to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
If set to 1 (default), hash rethink is performed on listening socket.
If set to 0, hash rethink is not performed.
+txq_reselection_ms
+------------------
+
+Controls how often (in ms) a busy connected flow can select another tx queue.
+
+A resection is desirable when/if user thread has migrated and XPS
+would select a different queue. Same can occur without XPS
+if the flow hash has changed.
+
+But switching txq can introduce reorders, especially if the
+old queue is under high pressure. Modern TCP stacks deal
+well with reorders if they happen not too often.
+
+To disable this feature, set the value to 0.
+
+Default : 1000
+
gro_normal_batch
----------------
@@ -517,3 +551,54 @@ originally may have been issued in the correct sequential order.
If named_timeout is nonzero, failed topology updates will be placed on a defer
queue until another event arrives that clears the error, or until the timeout
expires. Value is in milliseconds.
+
+6. /proc/sys/net/vsock - VSOCK sockets
+--------------------------------------
+
+VSOCK sockets (AF_VSOCK) provide communication between virtual machines and
+their hosts. The behavior of VSOCK sockets in a network namespace is determined
+by the namespace's mode (``global`` or ``local``), which controls how CIDs
+(Context IDs) are allocated and how sockets interact across namespaces.
+
+ns_mode
+-------
+
+Read-only. Reports the current namespace's mode, set at namespace creation
+and immutable thereafter.
+
+Values:
+
+ - ``global`` - the namespace shares system-wide CID allocation and
+ its sockets can reach any VM or socket in any global namespace.
+ Sockets in this namespace cannot reach sockets in local
+ namespaces.
+ - ``local`` - the namespace has private CID allocation and its
+ sockets can only connect to VMs or sockets within the same
+ namespace.
+
+The init_net mode is always ``global``.
+
+child_ns_mode
+-------------
+
+Controls what mode newly created child namespaces will inherit. At namespace
+creation, ``ns_mode`` is inherited from the parent's ``child_ns_mode``. The
+initial value matches the namespace's own ``ns_mode``.
+
+Values:
+
+ - ``global`` - child namespaces will share system-wide CID allocation
+ and their sockets will be able to reach any VM or socket in any
+ global namespace.
+ - ``local`` - child namespaces will have private CID allocation and
+ their sockets will only be able to connect within their own
+ namespace.
+
+The first write to ``child_ns_mode`` locks its value. Subsequent writes of the
+same value succeed, but writing a different value returns ``-EBUSY``.
+
+Changing ``child_ns_mode`` only affects namespaces created after the change;
+it does not modify the current namespace or any existing children.
+
+A namespace with ``ns_mode`` set to ``local`` cannot change
+``child_ns_mode`` to ``global`` (returns ``-EPERM``).
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 4d71211fdad8..97e12359775c 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -41,7 +41,6 @@ Currently, these files are in /proc/sys/vm:
- extfrag_threshold
- highmem_is_dirtyable
- hugetlb_shm_group
-- laptop_mode
- legacy_va_layout
- lowmem_reserve_ratio
- max_map_count
@@ -54,6 +53,7 @@ Currently, these files are in /proc/sys/vm:
- mmap_min_addr
- mmap_rnd_bits
- mmap_rnd_compat_bits
+- movable_gigantic_pages
- nr_hugepages
- nr_hugepages_mempolicy
- nr_overcommit_hugepages
@@ -231,6 +231,8 @@ eventually gets pushed out to disk. This tunable is used to define when dirty
inode is old enough to be eligible for writeback by the kernel flusher threads.
And, it is also used as the interval to wakeup dirtytime_writeback thread.
+Setting this to zero disables periodic dirtytime writeback.
+
dirty_writeback_centisecs
=========================
@@ -363,13 +365,6 @@ hugetlb_shm_group contains group id that is allowed to create SysV
shared memory segment using hugetlb page.
-laptop_mode
-===========
-
-laptop_mode is a knob that controls "laptop mode". All the things that are
-controlled by this knob are discussed in Documentation/admin-guide/laptops/laptop-mode.rst.
-
-
legacy_va_layout
================
@@ -494,6 +489,10 @@ memory allocations.
The default value depends on CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT.
+When CONFIG_MEM_ALLOC_PROFILING_DEBUG=y, this control is read-only to avoid
+warnings produced by allocations made while profiling is disabled and freed
+when it's enabled.
+
memory_failure_early_kill
=========================
@@ -624,6 +623,33 @@ This value can be changed after boot using the
/proc/sys/vm/mmap_rnd_compat_bits tunable
+movable_gigantic_pages
+======================
+
+This parameter controls whether gigantic pages may be allocated from
+ZONE_MOVABLE. If set to non-zero, gigantic pages can be allocated
+from ZONE_MOVABLE. ZONE_MOVABLE memory may be created via the kernel
+boot parameter `kernelcore` or via memory hotplug as discussed in
+Documentation/admin-guide/mm/memory-hotplug.rst.
+
+Support may depend on specific architecture.
+
+Note that using ZONE_MOVABLE gigantic pages make memory hotremove unreliable.
+
+Memory hot-remove operations will block indefinitely until the admin reserves
+sufficient gigantic pages to service migration requests associated with the
+memory offlining process. As HugeTLB gigantic page reservation is a manual
+process (via `nodeN/hugepages/.../nr_hugepages` interfaces) this may not be
+obvious when just attempting to offline a block of memory.
+
+Additionally, as multiple gigantic pages may be reserved on a single block,
+it may appear that gigantic pages are available for migration when in reality
+they are in the process of being removed. For example if `memoryN` contains
+two gigantic pages, one reserved and one allocated, and an admin attempts to
+offline that block, this operations may hang indefinitely unless another
+reserved gigantic page is available on another block `memoryM`.
+
+
nr_hugepages
============