Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/ABI/testing/sysfs-kernel-mm-damon | 81 |
| -rw-r--r-- | Documentation/admin-guide/kernel-parameters.txt | 4 |
| -rw-r--r-- | Documentation/admin-guide/mm/damon/lru_sort.rst | 8 |
| -rw-r--r-- | Documentation/admin-guide/mm/damon/reclaim.rst | 19 |
| -rw-r--r-- | Documentation/admin-guide/mm/damon/stat.rst | 7 |
| -rw-r--r-- | Documentation/admin-guide/mm/damon/usage.rst | 108 |
| -rw-r--r-- | Documentation/admin-guide/mm/transhuge.rst | 4 |
| -rw-r--r-- | Documentation/admin-guide/sysctl/vm.rst | 2 |
| -rw-r--r-- | Documentation/mm/damon/design.rst | 78 |
| -rw-r--r-- | Documentation/mm/damon/maintainer-profile.rst | 21 |
| -rw-r--r-- | Documentation/mm/process_addrs.rst | 2 |
11 files changed, 268 insertions, 66 deletions
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-damon b/Documentation/ABI/testing/sysfs-kernel-mm-damon index 2424237ebb10..b73e6bc28ea5 100644 --- a/Documentation/ABI/testing/sysfs-kernel-mm-damon +++ b/Documentation/ABI/testing/sysfs-kernel-mm-damon @@ -84,6 +84,13 @@ Description: Writing an integer to this file sets the 'address unit' parameter of the given operations set of the context. Reading the file returns the last-written 'address unit' value. +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/pause +Date: Mar 2026 +Contact: SeongJae Park <sj@kernel.org> +Description: Writing a boolean keyword to this file sets the 'pause' request + parameter for the context. Reading the file returns the + last-written 'pause' value. + What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/monitoring_attrs/intervals/sample_us Date: Mar 2022 Contact: SeongJae Park <sj@kernel.org> @@ -322,6 +329,18 @@ Contact: SeongJae Park <sj@kernel.org> Description: Writing to and reading from this file sets and gets the goal-based effective quota auto-tuning algorithm to use. +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/quotas/fail_charge_num +Date: Mar 2026 +Contact: SeongJae Park <sj@kernel.org> +Description: Writing to and reading from this file sets and gets the + action-failed memory quota charging ratio numerator. + +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/quotas/fail_charge_denom +Date: Mar 2026 +Contact: SeongJae Park <sj@kernel.org> +Description: Writing to and reading from this file sets and gets the + action-failed memory quota charging ratio denominator. + What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/quotas/weights/sz_permil Date: Mar 2022 Contact: SeongJae Park <sj@kernel.org> @@ -377,15 +396,20 @@ Contact: SeongJae Park <sj@kernel.org> Description: Writing to and reading from this file sets and gets the low watermark of the scheme in permil. 
-What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/nr_filters -Date: Dec 2022 +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters +Date: Feb 2025 +Contact: SeongJae Park <sj@kernel.org> +Description: Directory for DAMON core layer-handled DAMOS filters. + +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/nr_filters +Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: Writing a number 'N' to this file creates the number of directories for setting filters of the scheme named '0' to - 'N-1' under the filters/ directory. + 'N-1' under the core_filters/ directory. -What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/type -Date: Dec 2022 +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/type +Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: Writing to and reading from this file sets and gets the type of the memory of the interest. 'anon' for anonymous pages, @@ -393,77 +417,78 @@ Description: Writing to and reading from this file sets and gets the type of 'addr' for address range (an open-ended interval), or 'target' for DAMON monitoring target can be written and read. -What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/memcg_path -Date: Dec 2022 +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/memcg_path +Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: If 'memcg' is written to the 'type' file, writing to and reading from this file sets and gets the path to the memory cgroup of the interest. 
-What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/addr_start -Date: Jul 2023 +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/addr_start +Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: If 'addr' is written to the 'type' file, writing to or reading from this file sets or gets the start address of the address range for the filter. -What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/addr_end -Date: Jul 2023 +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/addr_end +Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: If 'addr' is written to the 'type' file, writing to or reading from this file sets or gets the end address of the address range for the filter. -What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/min +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/min Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: If 'hugepage_size' is written to the 'type' file, writing to or reading from this file sets or gets the minimum size of the hugepage for the filter. -What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/max +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/max Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: If 'hugepage_size' is written to the 'type' file, writing to or reading from this file sets or gets the maximum size of the hugepage for the filter. 
-What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/target_idx -Date: Dec 2022 +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/damon_target_idx +Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: If 'target' is written to the 'type' file, writing to or reading from this file sets or gets the index of the DAMON monitoring target of the interest. -What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/matching -Date: Dec 2022 +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/matching +Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: Writing 'Y' or 'N' to this file sets whether the filter is for the memory of the 'type', or all except the 'type'. -What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/allow -Date: Jan 2025 +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters/<F>/allow +Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: Writing 'Y' or 'N' to this file sets whether to allow or reject applying the scheme's action to the memory that satisfies the 'type' and the 'matching' of the directory. -What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters -Date: Feb 2025 -Contact: SeongJae Park <sj@kernel.org> -Description: Directory for DAMON core layer-handled DAMOS filters. Files - under this directory works same to those of - /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters - directory. - What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/ops_filters Date: Feb 2025 Contact: SeongJae Park <sj@kernel.org> Description: Directory for DAMON operations set layer-handled DAMOS filters. 
Files under this directory works same to those of - /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters + /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/core_filters directory. +What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters +Date: Dec 2022 +Contact: SeongJae Park <sj@kernel.org> +Description: Directory for DAMOS filters. Files under this directory works + same to those of + /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/{core,ops}_filters + directory. This is deprecated. Use the core_filters and + ops_filters instead. + What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/dests/nr_dests Date: Jul 2025 Contact: SeongJae Park <sj@kernel.org> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 5a05b48d1684..00e8c4fa93b8 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2067,6 +2067,10 @@ Kernel parameters Format: nn[KMGTPE] or (node format) <node>:nn[KMGTPE][,<node>:nn[KMGTPE]] + The size must be a multiple of the gigantic page size. + When using node format, this applies to each per-node size. + Missaligned values are dropped with a warning. + Reserve a CMA area of given size and allocate gigantic hugepages using the CMA allocator. If enabled, the boot-time allocation of gigantic hugepages is skipped. diff --git a/Documentation/admin-guide/mm/damon/lru_sort.rst b/Documentation/admin-guide/mm/damon/lru_sort.rst index 14cc6b2db897..b93ca9b0853d 100644 --- a/Documentation/admin-guide/mm/damon/lru_sort.rst +++ b/Documentation/admin-guide/mm/damon/lru_sort.rst @@ -75,7 +75,7 @@ Make DAMON_LRU_SORT reads the input parameters again, except ``enabled``. Input parameters that updated while DAMON_LRU_SORT is running are not applied by default. 
Once this parameter is set as ``Y``, DAMON_LRU_SORT reads values -of parametrs except ``enabled`` again. Once the re-reading is done, this +of parameters except ``enabled`` again. Once the re-reading is done, this parameter is set as ``N``. If invalid parameters are found while the re-reading, DAMON_LRU_SORT will be disabled. @@ -246,7 +246,8 @@ monitor_region_start Start of target memory region in physical address. The start physical address of memory region that DAMON_LRU_SORT will do work -against. By default, biggest System RAM is used as the region. +against. By default, the system's entire physical memory is used as the +region. monitor_region_end ------------------ @@ -254,7 +255,8 @@ monitor_region_end End of target memory region in physical address. The end physical address of memory region that DAMON_LRU_SORT will do work -against. By default, biggest System RAM is used as the region. +against. By default, the system's entire physical memory is used as the +region. addr_unit --------- diff --git a/Documentation/admin-guide/mm/damon/reclaim.rst b/Documentation/admin-guide/mm/damon/reclaim.rst index d7a0225b4950..ec7e3e32b4ac 100644 --- a/Documentation/admin-guide/mm/damon/reclaim.rst +++ b/Documentation/admin-guide/mm/damon/reclaim.rst @@ -67,7 +67,7 @@ Make DAMON_RECLAIM reads the input parameters again, except ``enabled``. Input parameters that updated while DAMON_RECLAIM is running are not applied by default. Once this parameter is set as ``Y``, DAMON_RECLAIM reads values -of parametrs except ``enabled`` again. Once the re-reading is done, this +of parameters except ``enabled`` again. Once the re-reading is done, this parameter is set as ``N``. If invalid parameters are found while the re-reading, DAMON_RECLAIM will be disabled. @@ -85,6 +85,17 @@ identifies the region as cold, and reclaims it. 120 seconds by default. 
+autotune_monitoring_intervals +----------------------------- + +If this parameter is set as ``Y``, DAMON_RECLAIM automatically tunes DAMON's +sampling and aggregation intervals. The auto-tuning aims to capture meaningful +amount of access events in each DAMON-snapshot, while keeping the sampling +interval 5 milliseconds in minimum, and 10 seconds in maximum. Setting this as +``N`` disables the auto-tuning. + +Disabled by default. + quota_ms -------- @@ -229,7 +240,8 @@ Start of target memory region in physical address. The start physical address of memory region that DAMON_RECLAIM will do work against. That is, DAMON_RECLAIM will find cold memory regions in this region -and reclaims. By default, biggest System RAM is used as the region. +and reclaims. By default, the system's entire physical memory is used as the +region. monitor_region_end ------------------ @@ -238,7 +250,8 @@ End of target memory region in physical address. The end physical address of memory region that DAMON_RECLAIM will do work against. That is, DAMON_RECLAIM will find cold memory regions in this region -and reclaims. By default, biggest System RAM is used as the region. +and reclaims. By default, the system's entire physical memory is used as the +region. addr_unit --------- diff --git a/Documentation/admin-guide/mm/damon/stat.rst b/Documentation/admin-guide/mm/damon/stat.rst index c4b14daeb2dd..46c5dd96aa2e 100644 --- a/Documentation/admin-guide/mm/damon/stat.rst +++ b/Documentation/admin-guide/mm/damon/stat.rst @@ -89,3 +89,10 @@ percentiles of the idle time values via this read-only parameter. Reading the parameter returns 101 idle time values in milliseconds, separated by comma. Each value represents 0-th, 1st, 2nd, 3rd, ..., 99th and 100th percentile idle times. + +kdamond_pid +----------- + +PID of the DAMON thread. + +If DAMON_STAT is enabled, this becomes the PID of the worker thread. Else, -1. 
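Reviewer note on the ``kdamond_pid`` parameter added to stat.rst above: the new text says the parameter holds the worker thread's PID while DAMON_STAT is enabled, and -1 otherwise. A minimal sketch of how user space might consume that value follows. The sysfs path is an assumption (the diff names only the parameter, not where it is exposed), and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the module parameter; the diff only names the
# parameter itself, so adjust this path for the system at hand.
KDAMOND_PID_PARAM = Path("/sys/module/damon_stat/parameters/kdamond_pid")


def parse_kdamond_pid(raw: str) -> Optional[int]:
    """Return the kdamond PID, or None when the file reports -1 (disabled)."""
    pid = int(raw.strip())
    return pid if pid >= 0 else None


def read_kdamond_pid(param: Path = KDAMOND_PID_PARAM) -> Optional[int]:
    """Read and parse the parameter; None if it is missing or unreadable."""
    try:
        return parse_kdamond_pid(param.read_text())
    except (OSError, ValueError):
        return None
```

Mapping -1 to ``None`` keeps callers from passing a negative PID to other tools by accident.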
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst index 534e1199cf09..011296f1e7c2 100644 --- a/Documentation/admin-guide/mm/damon/usage.rst +++ b/Documentation/admin-guide/mm/damon/usage.rst @@ -66,11 +66,17 @@ comma (","). │ :ref:`kdamonds <sysfs_kdamonds>`/nr_kdamonds │ │ :ref:`0 <sysfs_kdamond>`/state,pid,refresh_ms │ │ │ :ref:`contexts <sysfs_contexts>`/nr_contexts - │ │ │ │ :ref:`0 <sysfs_context>`/avail_operations,operations,addr_unit + │ │ │ │ :ref:`0 <sysfs_context>`/avail_operations,operations,addr_unit, + │ │ │ │ pause │ │ │ │ │ :ref:`monitoring_attrs <sysfs_monitoring_attrs>`/ │ │ │ │ │ │ intervals/sample_us,aggr_us,update_us │ │ │ │ │ │ │ intervals_goal/access_bp,aggrs,min_sample_us,max_sample_us │ │ │ │ │ │ nr_regions/min,max + │ │ │ │ │ │ :ref:`probes <damon_usage_sysfs_probes>`/nr_probes + │ │ │ │ │ │ │ 0/filters/nr_filters + │ │ │ │ │ │ │ │ 0/type,matching,allow,path + │ │ │ │ │ │ │ │ ... + │ │ │ │ │ │ │ ... │ │ │ │ │ :ref:`targets <sysfs_targets>`/nr_targets │ │ │ │ │ │ :ref:`0 <sysfs_target>`/pid_target,obsolete_target │ │ │ │ │ │ │ :ref:`regions <sysfs_regions>`/nr_regions @@ -83,18 +89,23 @@ comma (","). 
│ │ │ │ │ │ │ │ sz/min,max │ │ │ │ │ │ │ │ nr_accesses/min,max │ │ │ │ │ │ │ │ age/min,max - │ │ │ │ │ │ │ :ref:`quotas <sysfs_quotas>`/ms,bytes,reset_interval_ms,effective_bytes,goal_tuner + │ │ │ │ │ │ │ :ref:`quotas <sysfs_quotas>`/ms,bytes,reset_interval_ms, + │ │ │ │ │ │ │ effective_bytes,goal_tuner, + │ │ │ │ │ │ │ fail_charge_num,fail_charge_denom │ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil │ │ │ │ │ │ │ │ :ref:`goals <sysfs_schemes_quota_goals>`/nr_goals │ │ │ │ │ │ │ │ │ 0/target_metric,target_value,current_value,nid,path │ │ │ │ │ │ │ :ref:`watermarks <sysfs_watermarks>`/metric,interval_us,high,mid,low │ │ │ │ │ │ │ :ref:`{core_,ops_,}filters <sysfs_filters>`/nr_filters - │ │ │ │ │ │ │ │ 0/type,matching,allow,memcg_path,addr_start,addr_end,target_idx,min,max + │ │ │ │ │ │ │ │ 0/type,matching,allow,memcg_path,addr_start,addr_end,damon_target_idx,min,max │ │ │ │ │ │ │ :ref:`dests <damon_sysfs_dests>`/nr_dests │ │ │ │ │ │ │ │ 0/id,weight │ │ │ │ │ │ │ :ref:`stats <sysfs_schemes_stats>`/nr_tried,sz_tried,nr_applied,sz_applied,sz_ops_filter_passed,qt_exceeds,nr_snapshots,max_nr_snapshots │ │ │ │ │ │ │ :ref:`tried_regions <sysfs_schemes_tried_regions>`/total_bytes │ │ │ │ │ │ │ │ 0/start,end,nr_accesses,age,sz_filter_passed + │ │ │ │ │ │ │ │ │ probes + │ │ │ │ │ │ │ │ │ │ 0/hits + │ │ │ │ │ │ │ │ │ │ ... │ │ │ │ │ │ │ │ ... │ │ │ │ │ │ ... │ │ │ │ ... @@ -194,9 +205,9 @@ details). At the moment, only one context per kdamond is supported, so only contexts/<N>/ ------------- -In each context directory, three files (``avail_operations``, ``operations`` -and ``addr_unit``) and three directories (``monitoring_attrs``, ``targets``, -and ``schemes``) exist. +In each context directory, four files (``avail_operations``, ``operations``, +``addr_unit`` and ``pause``) and three directories (``monitoring_attrs``, +``targets``, and ``schemes``) exist. 
DAMON supports multiple types of :ref:`monitoring operations <damon_design_configurable_operations_set>`, including those for virtual address @@ -214,6 +225,9 @@ reading from the ``operations`` file. ``addr_unit`` file is for setting and getting the :ref:`address unit <damon_design_addr_unit>` parameter of the operations set. +``pause`` file is for setting and getting the :ref:`pause request +<damon_design_execution_model_and_data_structures>` parameter of the context. + .. _sysfs_monitoring_attrs: contexts/<N>/monitoring_attrs/ @@ -221,8 +235,8 @@ contexts/<N>/monitoring_attrs/ Files for specifying attributes of the monitoring including required quality and efficiency of the monitoring are in ``monitoring_attrs`` directory. -Specifically, two directories, ``intervals`` and ``nr_regions`` exist in this -directory. +Specifically, three directories, ``intervals``, ``nr_regions`` and ``probes`` +exist in this directory. Under ``intervals`` directory, three files for DAMON's sampling interval (``sample_us``), aggregation interval (``aggr_us``), and update interval @@ -256,6 +270,29 @@ tuning-applied current values of the two intervals can be read from the ``sample_us`` and ``aggr_us`` files after writing ``update_tuned_intervals`` to the ``state`` file. +.. _damon_usage_sysfs_probes: + +contexts/<N>/monitoring_attrs/probes/ +------------------------------------- + +A directory for registering :ref:`data attributes monitoring +<damon_design_data_attrs_monitoring>` probes. + +In the beginning, this directory has only one file, ``nr_probes``. Writing a +number (``N``) to the file creates the number of child directories named ``0`` +to ``N-1``. Each directory represents each monitoring probe. + +In each probe directory, one directory, ``filters`` exists. The directory +contains files for installing filters for the probe, that is used to determine +the data attribute for the probe. + +In the beginning, ``filters`` directory has only one file, ``nr_filters``. 
+Writing a number (``N``) to the file creates the number of child directories +named ``0`` to ``N-1``. Each directory represents each filter and works in a +way similar to that for :ref:`DAMOS filter <sysfs_filters>`. When the filter +``type`` is ``memcg``, ``path`` file acts as ``memcg_path`` for :ref:`DAMOS +filter <sysfs_filters>`. + .. _sysfs_targets: contexts/<N>/targets/ @@ -337,7 +374,7 @@ to ``N-1``. Each directory represents each DAMON-based operation scheme. schemes/<N>/ ------------ -In each scheme directory, eight directories (``access_pattern``, ``quotas``, +In each scheme directory, nine directories (``access_pattern``, ``quotas``, ``watermarks``, ``core_filters``, ``ops_filters``, ``filters``, ``dests``, ``stats``, and ``tried_regions``) and three files (``action``, ``target_nid`` and ``apply_interval``) exist. @@ -377,9 +414,10 @@ schemes/<N>/quotas/ The directory for the :ref:`quotas <damon_design_damos_quotas>` of the given DAMON-based operation scheme. -Under ``quotas`` directory, five files (``ms``, ``bytes``, -``reset_interval_ms``, ``effective_bytes`` and ``goal_tuner``) and two -directories (``weights`` and ``goals``) exist. +Under ``quotas`` directory, seven files (``ms``, ``bytes``, +``reset_interval_ms``, ``effective_bytes``, ``goal_tuner``, ``fail_charge_num`` +and ``fail_charge_denom``) and two directories (``weights`` and ``goals``) +exist. You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and ``reset interval`` in milliseconds by writing the values to the three files, @@ -398,6 +436,13 @@ the background design of the feature and the name of the selectable algorithms. Refer to :ref:`goals directory <sysfs_schemes_quota_goals>` for the goals setup. +You can set the action-failed memory quota charging ratio by writing the +numerator and the denominator for the ratio to ``fail_charge_num`` and +``fail_charge_denom`` files, respectively. Reading those files will return the +current set values. 
Refer to :ref:`design +<damon_design_damos_quotas_failed_memory_charging_ratio>` for more details of +the ratio feature. + The time quota is internally transformed to a size quota. Between the transformed size quota and user-specified size quota, smaller one is applied. Based on the user-specified :ref:`goal <sysfs_schemes_quota_goals>`, the @@ -429,10 +474,12 @@ to ``N-1``. Each directory represents each goal and current achievement. Among the multiple feedback, the best one is used. Each goal directory contains five files, namely ``target_metric``, -``target_value``, ``current_value`` ``nid`` and ``path``. Users can set and +``target_value``, ``current_value``, ``nid``, and ``path``. Users can set and get the five parameters for the quota auto-tuning goals that specified on the :ref:`design doc <damon_design_damos_quotas_auto_tuning>` by writing to and -reading from each of the files. Note that users should further write +reading from each of the files. Because the kernel does not update +``current_value``, reading it only makes sense when ``target_metric`` is +``user_input``. Note that users should further write ``commit_schemes_quota_goals`` to the ``state`` file of the :ref:`kdamond directory <sysfs_kdamond>` to pass the feedback to DAMON. @@ -447,7 +494,7 @@ given DAMON-based operation scheme. Under the watermarks directory, five files (``metric``, ``interval_us``, ``high``, ``mid``, and ``low``) for setting the metric, the time interval between check of the metric, and the three watermarks exist. You can set and -get the five values by writing to the files, respectively. +get the five values by writing to and reading from the files, respectively. Keywords and meanings of those that can be written to the ``metric`` file are as below. @@ -455,7 +502,7 @@ as below. - none: Ignore the watermarks - free_mem_rate: System's free memory rate (per thousand) -The ``interval`` should written in microseconds unit. 
+The ``interval_us`` should be written in microseconds unit. .. _sysfs_filters: @@ -471,10 +518,10 @@ directory can be used for installing filters regardless of their handled layers. Filters that requested by ``core_filters`` and ``ops_filters`` will be installed before those of ``filters``. All three directories have same files. -Use of ``filters`` directory can make expecting evaluation orders of given -filters with the files under directory bit confusing. Users are hence -recommended to use ``core_filters`` and ``ops_filters`` directories. The -``filters`` directory could be deprecated in future. +Use of ``filters`` directory can make filters evaluation orders confusing to +expect. For this reason, ``filters`` directory is deprecated. It is still +functioning, but is scheduled for removal in the near future. Users should use +``core_filters`` and ``ops_filters`` directories instead. In the beginning, the directory has only one file, ``nr_filters``. Writing a number (``N``) to the file creates the number of child directories named ``0`` @@ -483,9 +530,9 @@ in the numeric order. Each filter directory contains nine files, namely ``type``, ``matching``, ``allow``, ``memcg_path``, ``addr_start``, ``addr_end``, ``min``, ``max`` -and ``target_idx``. To ``type`` file, you can write the type of the filter. -Refer to :ref:`the design doc <damon_design_damos_filters>` for available type -names, their meaning and on what layer those are handled. +and ``damon_target_idx``. To ``type`` file, you can write the type of the +filter. Refer to :ref:`the design doc <damon_design_damos_filters>` for +available type names, their meaning and on what layer those are handled. For ``memcg`` type, you can specify the memory cgroup of the interest by writing the path of the memory cgroup from the cgroups mount point to @@ -495,7 +542,7 @@ files, respectively. 
For ``hugepage_size`` type, you can specify the minimum and maximum size of the range (closed interval) to ``min`` and ``max`` files, respectively. For ``target`` type, you can specify the index of the target between the list of the DAMON context's monitoring targets list to -``target_idx`` file. +``damon_target_idx`` file. You can write ``Y`` or ``N`` to ``matching`` file to specify whether the filter is for memory that matches the ``type``. You can write ``Y`` or ``N`` to @@ -601,10 +648,19 @@ tried_regions/<N>/ ------------------ In each region directory, you will find five files (``start``, ``end``, -``nr_accesses``, ``age``, and ``sz_filter_passed``). Reading the files will +``nr_accesses``, ``age`` and ``sz_filter_passed``). Reading the files will show the properties of the region that corresponding DAMON-based operation scheme ``action`` has tried to be applied. +tried_regions/<N>/probes/ +------------------------- + +In each region directory, one directory (``probes``) also exists. In the +directory, subdirectories named ``0`` to ``N-1`` exists. ``N`` is the number +of installed probes. In each number-named directory, a file (``hits``) exist. +Reading the file shows the number of data attributes monitoring probe-hit +positive samples of the region. + Example ~~~~~~~ @@ -677,7 +733,7 @@ show results using tracepoint supporting tools like ``perf``. For example:: Each line of the perf script output represents each monitoring region. The first five fields are as usual other tracepoint outputs. The sixth field -(``target_id=X``) shows the ide of the monitoring target of the region. The +(``target_id=X``) shows the id of the monitoring target of the region. The seventh field (``nr_regions=X``) shows the total number of monitoring regions for the target. The eighth field (``X-Y:``) shows the start (``X``) and end (``Y``) addresses of the region in bytes. 
The ninth field (``X``) shows the diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst index 5fbc3d89bb07..76f4eb14e262 100644 --- a/Documentation/admin-guide/mm/transhuge.rst +++ b/Documentation/admin-guide/mm/transhuge.rst @@ -57,7 +57,7 @@ prominent because the size of each page isn't as huge as the PMD-sized variant and there is less memory to clear in each page fault. Some architectures also employ TLB compression mechanisms to squeeze more entries in when a set of PTEs are virtually and physically contiguous -and approporiately aligned. In this case, TLB misses will occur less +and appropriately aligned. In this case, TLB misses will occur less often. THP can be enabled system wide or restricted to certain tasks or even @@ -210,7 +210,7 @@ PMD-mappable transparent hugepage:: cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size All THPs at fault and collapse time will be added to _deferred_list, -and will therefore be split under memory presure if they are considered +and will therefore be split under memory pressure if they are considered "underused". A THP is underused if the number of zero-filled pages in the THP is above max_ptes_none (see below). It is possible to disable this behaviour by writing 0 to shrink_underused, and enable it by writing diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst index 97e12359775c..b9b0c218bfb4 100644 --- a/Documentation/admin-guide/sysctl/vm.rst +++ b/Documentation/admin-guide/sysctl/vm.rst @@ -1034,6 +1034,8 @@ min(3% of current process size, user_reserve_kbytes) of free memory. This is intended to prevent a user from starting a single memory hogging process, such that they cannot recover (kill the hog). +This setting has no effect when overcommit_memory is set to 0 or 1. + user_reserve_kbytes defaults to min(3% of the current process size, 128MB). 
If this is reduced to zero, then the user will be allowed to allocate diff --git a/Documentation/mm/damon/design.rst b/Documentation/mm/damon/design.rst index afc7d52bda2f..2da7ca0d3d17 100644 --- a/Documentation/mm/damon/design.rst +++ b/Documentation/mm/damon/design.rst @@ -19,6 +19,13 @@ types of monitoring. To know how user-space can do the configurations and start/stop DAMON, refer to :ref:`DAMON sysfs interface <sysfs_interface>` documentation. +Users can also request each context execution to be paused and resumed. When +it is paused, the kdamond does nothing other than applying online parameter +update. + +To know how user-space can pause/resume each context, refer to :ref:`DAMON +sysfs context <sysfs_context>` usage documentation. + Overall Architecture ==================== @@ -140,7 +147,7 @@ as Idle page tracking does. Address Unit ------------ -DAMON core layer uses ``unsinged long`` type for monitoring target address +DAMON core layer uses ``unsigned long`` type for monitoring target address ranges. In some cases, the address space for a given operations set could be too large to be handled with the type. ARM (32-bit) with large physical address extension is an example. For such cases, a per-operations set @@ -269,6 +276,45 @@ interval``, DAMON checks if the region's size and access frequency (``nr_accesses``) has significantly changed. If so, the counter is reset to zero. Otherwise, the counter is increased. +.. _damon_design_data_attrs_monitoring: + +Data Attributes Monitoring +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Data access pattern is only one type of data attributes. In some use cases, +users need to know more data attributes information. For example, users may +need to know how much of a given hot or cold memory region is backed by +anonymous pages, or belong to a specific cgroup. For such use case, data +attributes monitoring feature is provided. 
+ +Using the feature, users can register data attributes of their interest to the +DAMON :ref:`context <damon_design_execution_model_and_data_structures>`. The +registration is made by specifying a probe per attribute. Each of the probe +specifies a rule to determine if a given memory region has the related +attribute. The rule is constructed with multiple filters. The filters work +same to :ref:`DAMOS filters <damon_design_damos_filters>` except the supported +filter types. Currently only ``anon`` and ``memcg`` filter types are supported +for data attributes monitoring. + +If such probes are registered, DAMON executes the probes for each region's +sampling memory when it does the access :ref:`sampling +<damon_design_region_based_sampling>`. The number of samples that identified +as having the data attribute (hitting the probe) per :ref:`aggregation interval +<damon_design_monitoring>` is accounted in a per-region per-probe counter. +Users can therefore know how much of a given DAMON region has a specific data +attribute by reading the per-region per-probe probe hits counter after each +aggregation interval. + +This is a sampling based mechanism. Hence, it is lightweight but the output +may include some measurement errors. The output should be used with good +understanding of statistics. + +Another way to do this for higher accuracy is using :ref:`DAMOS filter +<damon_design_damos_filters>` with ``stat`` :ref:`action +<damon_design_damos_action>` and ``sz_ops_filter_passed`` :ref:`stat +<damon_design_damos_stat>`. This approach provides the data attributes +information in page level. But, because it is operated in page level, the +overhead is proportional to the size of the memory. Dynamic Target Space Updates Handling ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -371,7 +417,7 @@ with theoretical maximum ``nr_accesses``, which can be calculated as ``aggregation interval / sampling interval``. 
 The mechanism calculates the ratio of access events for ``aggrs`` aggregations,
-and increases or decrease the ``sampleing interval`` and ``aggregation
+and increases or decrease the ``sampling interval`` and ``aggregation
 interval`` in same ratio, if the observed access ratio is lower or higher than
 the target, respectively. The ratio of the intervals change is decided in
 proportion to the distance between current samples ratio and the target ratio.
@@ -387,7 +433,7 @@ The tuning is turned off by default, and need to be set explicitly by the user.
 As a rule of thumbs and the Parreto principle, 4% access samples ratio target is
 recommended. Note that Parreto principle (80/20 rule) has applied twice. That
 is, assumes 4% (20% of 20%) DAMON-observed access events ratio (source)
-to capture 64% (80% multipled by 80%) real access events (outcomes).
+to capture 64% (80% multiplied by 80%) real access events (outcomes).

 To know how user-space can use this feature via :ref:`DAMON sysfs interface
 <sysfs_interface>`, refer to :ref:`intervals_goal
@@ -474,6 +520,10 @@ that supports each action are as below.
    Supported by ``vaddr`` and ``fvaddr`` operations set. When
    TRANSPARENT_HUGEPAGE is disabled, the application of the action will just
    fail.
+ - ``collapse``: Call ``madvise()`` for the region with ``MADV_COLLAPSE``.
+   Supported by ``vaddr`` and ``fvaddr`` operations set. When
+   TRANSPARENT_HUGEPAGE is disabled, the application of the action will just
+   fail.
  - ``lru_prio``: Prioritize the region on its LRU lists.
    Supported by ``paddr`` operations set.
  - ``lru_deprio``: Deprioritize the region on its LRU lists.
@@ -565,6 +615,28 @@ interface <sysfs_interface>`, refer to :ref:`weights <sysfs_quotas>` part of
 the documentation.


+.. _damon_design_damos_quotas_failed_memory_charging_ratio:
+
+Action-failed Memory Charging Ratio
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+DAMOS action to a given region can fail for some subsets of the memory of the
+region. For example, if the action is ``pageout`` and the region has some
+unreclaimable pages, applying the action to the pages will fail. The amount of
+system resource that is taken for such failed action applications is usually
+different from that for successful action applications. For such cases, users
+can set different charging ratio for such failed memory. The ratio can be
+specified using ``fail_charge_num`` and ``fail_charge_denom`` parameters. The
+two parameters represent the numerator and denominator of the ratio. The
+feature is enabled only if ``fail_charge_denom`` is not zero.
+
+For example, let's suppose a DAMOS action is applied to a region of 1,000 MiB
+size. The action is successfully applied to only 700 MiB of the region.
+``fail_charge_num`` and ``fail_charge_denom`` are set to ``1`` and ``1024``,
+respectively. Then only 700 MiB and 300 KiB of size (``700 MiB + 300 MiB * 1 /
+1024``) will be charged.
+
+
 .. _damon_design_damos_quotas_auto_tuning:

 Aim-oriented Feedback-driven Auto-tuning
diff --git a/Documentation/mm/damon/maintainer-profile.rst b/Documentation/mm/damon/maintainer-profile.rst
index bcb9798a27a8..fb2fa00cc9aa 100644
--- a/Documentation/mm/damon/maintainer-profile.rst
+++ b/Documentation/mm/damon/maintainer-profile.rst
@@ -100,3 +100,24 @@ There is also a public Google `calendar
 <https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_
 that has the events. Anyone can subscribe to it. DAMON maintainer will also
 provide periodic reminders to the mailing list (damon@lists.linux.dev).
+
+AI Review
+---------
+
+For patches that are publicly posted to DAMON mailing list
+(damon@lists.linux.dev), AI reviews of the patches will be available at
+sashiko.dev. The reviews could also be sent as mails to the author of the
+patch.
+
+Patch authors are encouraged to check the AI reviews and share their opinions.
+The sharing could be done as a reply to the mail thread. Consider reducing the
+recipients list for such sharing, since some people are not really interested
+in AI reviews. As a rule of thumb, drop stable@vger.kernel.org and individuals
+except DAMON maintainer.
+
+`hkml` also provides a `feature
+<https://github.com/sjp38/hackermail/blob/master/USAGE.md#forwarding-sashikodev-statuscomments-to-mailing-list>`_
+for such sharing. Please feel free to use the feature.
+
+It is only an optional recommendation. DAMON maintainer could also ask any
+question about the AI reviews, though.
diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
index 851680ead45f..042d64d72421 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -775,7 +775,7 @@ lock, releasing or downgrading the mmap write lock also releases the VMA write
 lock so there is no :c:func:`!vma_end_write` function.

 Note that when write-locking a VMA lock, the :c:member:`!vma.vm_refcnt` is temporarily
-modified so that readers can detect the presense of a writer. The reference counter is
+modified so that readers can detect the presence of a writer. The reference counter is
 restored once the vma sequence number used for serialisation is updated.

 This ensures the semantics we require - VMA write locks provide exclusive write
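Reader's note on the "Action-failed Memory Charging Ratio" text added to design.rst in this patch: the worked example there (1,000 MiB region, 700 MiB applied, ratio 1/1024) can be checked with a small sketch. This is only an illustration of the arithmetic in the documentation, not the kernel implementation. The function name ``charged_bytes``, the integer (floor) division, and the choice to charge the whole attempted size when ``fail_charge_denom`` is zero are assumptions made for this sketch.

```python
MIB = 1024 * 1024
KIB = 1024


def charged_bytes(sz_tried, sz_applied, fail_charge_num, fail_charge_denom):
    """Return the quota charge for one action application, in bytes.

    Hypothetical helper: successful bytes are charged in full, failed bytes
    are charged at fail_charge_num / fail_charge_denom.
    """
    if not 0 <= sz_applied <= sz_tried:
        raise ValueError("sz_applied must be between 0 and sz_tried")
    if fail_charge_denom == 0:
        # The documentation says the feature is disabled in this case.
        # Assumption: the whole attempted size is then charged.
        return sz_tried
    sz_failed = sz_tried - sz_applied
    # Assumption: integer arithmetic, rounding down.
    return sz_applied + sz_failed * fail_charge_num // fail_charge_denom


if __name__ == "__main__":
    charged = charged_bytes(1000 * MIB, 700 * MIB, 1, 1024)
    # Compare against the documented figure of 700 MiB and 300 KiB.
    print(charged == 700 * MIB + 300 * KIB)
```

With the documented inputs, the 300 MiB that failed is charged as 300 KiB, which matches the "700 MiB and 300 KiB" figure in the added paragraph.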
