summaryrefslogtreecommitdiff
path: root/Documentation/x86
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2019-07-09 12:34:26 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2019-07-09 12:34:26 -0700
commite9a83bd2322035ed9d7dcf35753d3f984d76c6a5 (patch)
tree66dc466ff9aec0f9bb7f39cba50a47eab6585559 /Documentation/x86
parent7011b7e1b702cc76f9e969b41d9a95969f2aecaa (diff)
parent454f96f2b738374da4b0a703b1e2e7aed82c4486 (diff)
Merge tag 'docs-5.3' of git://git.lwn.net/linux
Pull Documentation updates from Jonathan Corbet: "It's been a relatively busy cycle for docs: - A fair pile of RST conversions, many from Mauro. These create more than the usual number of simple but annoying merge conflicts with other trees, unfortunately. He has a lot more of these waiting on the wings that, I think, will go to you directly later on. - A new document on how to use merges and rebases in kernel repos, and one on Spectre vulnerabilities. - Various improvements to the build system, including automatic markup of function() references because some people, for reasons I will never understand, were of the opinion that :c:func:``function()`` is unattractive and not fun to type. - We now recommend using sphinx 1.7, but still support back to 1.4. - Lots of smaller improvements, warning fixes, typo fixes, etc" * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits) docs: automarkup.py: ignore exceptions when seeking for xrefs docs: Move binderfs to admin-guide Disable Sphinx SmartyPants in HTML output doc: RCU callback locks need only _bh, not necessarily _irq docs: format kernel-parameters -- as code Doc : doc-guide : Fix a typo platform: x86: get rid of a non-existent document Add the RCU docs to the core-api manual Documentation: RCU: Add TOC tree hooks Documentation: RCU: Rename txt files to rst Documentation: RCU: Convert RCU UP systems to reST Documentation: RCU: Convert RCU linked list to reST Documentation: RCU: Convert RCU basic concepts to reST docs: filesystems: Remove uneeded .rst extension on toctables scripts/sphinx-pre-install: fix out-of-tree build docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/ Documentation: PGP: update for newer HW devices Documentation: Add section about CPU vulnerabilities for Spectre Documentation: platform: Delete x86-laptop-drivers.txt docs: Note that :c:func: should no longer be used ...
Diffstat (limited to 'Documentation/x86')
-rw-r--r--Documentation/x86/index.rst1
-rw-r--r--Documentation/x86/protection-keys.rst99
-rw-r--r--Documentation/x86/resctrl_ui.rst30
-rw-r--r--Documentation/x86/x86_64/5level-paging.rst2
-rw-r--r--Documentation/x86/x86_64/boot-options.rst4
-rw-r--r--Documentation/x86/x86_64/fake-numa-for-cpusets.rst2
6 files changed, 22 insertions, 116 deletions
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index ae36fc5fc649..f2de1b2d3ac7 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -19,7 +19,6 @@ x86-specific Documentation
tlb
mtrr
pat
- protection-keys
intel_mpx
amd-memory-encryption
pti
diff --git a/Documentation/x86/protection-keys.rst b/Documentation/x86/protection-keys.rst
deleted file mode 100644
index 49d9833af871..000000000000
--- a/Documentation/x86/protection-keys.rst
+++ /dev/null
@@ -1,99 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-======================
-Memory Protection Keys
-======================
-
-Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
-which is found on Intel's Skylake "Scalable Processor" Server CPUs.
-It will be avalable in future non-server parts.
-
-For anyone wishing to test or use this feature, it is available in
-Amazon's EC2 C5 instances and is known to work there using an Ubuntu
-17.04 image.
-
-Memory Protection Keys provides a mechanism for enforcing page-based
-protections, but without requiring modification of the page tables
-when an application changes protection domains. It works by
-dedicating 4 previously ignored bits in each page table entry to a
-"protection key", giving 16 possible keys.
-
-There is also a new user-accessible register (PKRU) with two separate
-bits (Access Disable and Write Disable) for each key. Being a CPU
-register, PKRU is inherently thread-local, potentially giving each
-thread a different set of protections from every other thread.
-
-There are two new instructions (RDPKRU/WRPKRU) for reading and writing
-to the new register. The feature is only available in 64-bit mode,
-even though there is theoretically space in the PAE PTEs. These
-permissions are enforced on data access only and have no effect on
-instruction fetches.
-
-Syscalls
-========
-
-There are 3 system calls which directly interact with pkeys::
-
- int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
- int pkey_free(int pkey);
- int pkey_mprotect(unsigned long start, size_t len,
- unsigned long prot, int pkey);
-
-Before a pkey can be used, it must first be allocated with
-pkey_alloc(). An application calls the WRPKRU instruction
-directly in order to change access permissions to memory covered
-with a key. In this example WRPKRU is wrapped by a C function
-called pkey_set().
-::
-
- int real_prot = PROT_READ|PROT_WRITE;
- pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
- ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
- ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
- ... application runs here
-
-Now, if the application needs to update the data at 'ptr', it can
-gain access, do the update, then remove its write access::
-
- pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
- *ptr = foo; // assign something
- pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again
-
-Now when it frees the memory, it will also free the pkey since it
-is no longer in use::
-
- munmap(ptr, PAGE_SIZE);
- pkey_free(pkey);
-
-.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
- An example implementation can be found in
- tools/testing/selftests/x86/protection_keys.c.
-
-Behavior
-========
-
-The kernel attempts to make protection keys consistent with the
-behavior of a plain mprotect(). For instance if you do this::
-
- mprotect(ptr, size, PROT_NONE);
- something(ptr);
-
-you can expect the same effects with protection keys when doing this::
-
- pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_READ);
- pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
- something(ptr);
-
-That should be true whether something() is a direct access to 'ptr'
-like::
-
- *ptr = foo;
-
-or when the kernel does the access on the application's behalf like
-with a read()::
-
- read(fd, ptr, 1);
-
-The kernel will send a SIGSEGV in both cases, but si_code will be set
-to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
-the plain mprotect() permissions are violated.
diff --git a/Documentation/x86/resctrl_ui.rst b/Documentation/x86/resctrl_ui.rst
index 225cfd4daaee..5368cedfb530 100644
--- a/Documentation/x86/resctrl_ui.rst
+++ b/Documentation/x86/resctrl_ui.rst
@@ -40,7 +40,7 @@ mount options are:
Enable the MBA Software Controller(mba_sc) to specify MBA
bandwidth in MBps
-L2 and L3 CDP are controlled seperately.
+L2 and L3 CDP are controlled separately.
RDT features are orthogonal. A particular system may support only
monitoring, only control, or both monitoring and control. Cache
@@ -118,7 +118,7 @@ related to allocation:
Corresponding region is pseudo-locked. No
sharing allowed.
-Memory bandwitdh(MB) subdirectory contains the following files
+Memory bandwidth(MB) subdirectory contains the following files
with respect to allocation:
"min_bandwidth":
@@ -209,7 +209,7 @@ All groups contain the following files:
CPUs to/from this group. As with the tasks file a hierarchy is
maintained where MON groups may only include CPUs owned by the
parent CTRL_MON group.
- When the resouce group is in pseudo-locked mode this file will
+ When the resource group is in pseudo-locked mode this file will
only be readable, reflecting the CPUs associated with the
pseudo-locked region.
@@ -342,7 +342,7 @@ For cache resources we describe the portion of the cache that is available
for allocation using a bitmask. The maximum value of the mask is defined
by each cpu model (and may be different for different cache levels). It
is found using CPUID, but is also provided in the "info" directory of
-the resctrl file system in "info/{resource}/cbm_mask". X86 hardware
+the resctrl file system in "info/{resource}/cbm_mask". Intel hardware
requires that these masks have all the '1' bits in a contiguous block. So
0x3, 0x6 and 0xC are legal 4-bit masks with two bits set, but 0x5, 0x9
and 0xA are not. On a system with a 20-bit mask each bit represents 5%
@@ -380,7 +380,7 @@ where L2 external is 10GBps (hence aggregate L2 external bandwidth is
240GBps) and L3 external bandwidth is 100GBps. Now a workload with '20
threads, having 50% bandwidth, each consuming 5GBps' consumes the max L3
bandwidth of 100GBps although the percentage value specified is only 50%
-<< 100%. Hence increasing the bandwidth percentage will not yeild any
+<< 100%. Hence increasing the bandwidth percentage will not yield any
more bandwidth. This is because although the L2 external bandwidth still
has capacity, the L3 external bandwidth is fully used. Also note that
this would be dependent on number of cores the benchmark is run on.
@@ -398,7 +398,7 @@ In order to mitigate this and make the interface more user friendly,
resctrl added support for specifying the bandwidth in MBps as well. The
kernel underneath would use a software feedback mechanism or a "Software
Controller(mba_sc)" which reads the actual bandwidth using MBM counters
-and adjust the memowy bandwidth percentages to ensure::
+and adjust the memory bandwidth percentages to ensure::
"actual bandwidth < user specified bandwidth".
@@ -418,16 +418,22 @@ L3 schemata file details (CDP enabled via mount option to resctrl)
When CDP is enabled L3 control is split into two separate resources
so you can specify independent masks for code and data like this::
- L3data:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
- L3code:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
+ L3DATA:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
+ L3CODE:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
L2 schemata file details
------------------------
-L2 cache does not support code and data prioritization, so the
-schemata format is always::
+CDP is supported at L2 using the 'cdpl2' mount option. The schemata
+format is either::
L2:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
+or
+
+ L2DATA:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
+ L2CODE:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
+
+
Memory bandwidth Allocation (default mode)
------------------------------------------
@@ -671,8 +677,8 @@ allocations can overlap or not. The allocations specifies the maximum
b/w that the group may be able to use and the system admin can configure
the b/w accordingly.
-If the MBA is specified in MB(megabytes) then user can enter the max b/w in MB
-rather than the percentage values.
+If resctrl is using the software controller (mba_sc) then user can enter the
+max b/w in MB rather than the percentage values.
::
# echo "L3:0=3;1=c\nMB:0=1024;1=500" > /sys/fs/resctrl/p0/schemata
diff --git a/Documentation/x86/x86_64/5level-paging.rst b/Documentation/x86/x86_64/5level-paging.rst
index ab88a4514163..44856417e6a5 100644
--- a/Documentation/x86/x86_64/5level-paging.rst
+++ b/Documentation/x86/x86_64/5level-paging.rst
@@ -20,7 +20,7 @@ physical address space. This "ought to be enough for anybody" ©.
QEMU 2.9 and later support 5-level paging.
Virtual memory layout for 5-level paging is described in
-Documentation/x86/x86_64/mm.txt
+Documentation/x86/x86_64/mm.rst
Enabling 5-level paging
diff --git a/Documentation/x86/x86_64/boot-options.rst b/Documentation/x86/x86_64/boot-options.rst
index 2f69836b8445..6a4285a3c7a4 100644
--- a/Documentation/x86/x86_64/boot-options.rst
+++ b/Documentation/x86/x86_64/boot-options.rst
@@ -9,7 +9,7 @@ only the AMD64 specific ones are listed here.
Machine check
=============
-Please see Documentation/x86/x86_64/machinecheck for sysfs runtime tunables.
+Please see Documentation/x86/x86_64/machinecheck.rst for sysfs runtime tunables.
mce=off
Disable machine check
@@ -89,7 +89,7 @@ APICs
Don't use the local APIC (alias for i386 compatibility)
pirq=...
- See Documentation/x86/i386/IO-APIC.txt
+ See Documentation/x86/i386/IO-APIC.rst
noapictimer
Don't set up the APIC timer
diff --git a/Documentation/x86/x86_64/fake-numa-for-cpusets.rst b/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
index a6926cd40f70..30108684ae87 100644
--- a/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
+++ b/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
@@ -18,7 +18,7 @@ For more information on the features of cpusets, see
Documentation/cgroup-v1/cpusets.rst.
There are a number of different configurations you can use for your needs. For
more information on the numa=fake command line option and its various ways of
-configuring fake nodes, see Documentation/x86/x86_64/boot-options.txt.
+configuring fake nodes, see Documentation/x86/x86_64/boot-options.rst.
For the purposes of this introduction, we'll assume a very primitive NUMA
emulation setup of "numa=fake=4*512,". This will split our system memory into