diff options
Diffstat (limited to 'Documentation')
130 files changed, 2525 insertions, 1624 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-iio-dfsdm-adc-stm32 b/Documentation/ABI/testing/sysfs-bus-iio-dfsdm-adc-stm32 new file mode 100644 index 000000000000..da9822309f07 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-iio-dfsdm-adc-stm32 @@ -0,0 +1,16 @@ +What: /sys/bus/iio/devices/iio:deviceX/in_voltage_spi_clk_freq +KernelVersion: 4.14 +Contact: arnaud.pouliquen@st.com +Description: + For audio purpose only. + Used by audio driver to set/get the spi input frequency. + This is mandatory if DFSDM is slave on SPI bus, to + provide information on the SPI clock frequency during runtime + Notice that the SPI frequency should be a multiple of sample + frequency to ensure the precision. + if DFSDM input is SPI master + Reading SPI clkout frequency, + error on writing + If DFSDM input is SPI Slave: + Reading returns value previously set. + Writing value before starting conversions.
\ No newline at end of file diff --git a/Documentation/ABI/testing/sysfs-class-led-trigger-netdev b/Documentation/ABI/testing/sysfs-class-led-trigger-netdev new file mode 100644 index 000000000000..451af6d6768c --- /dev/null +++ b/Documentation/ABI/testing/sysfs-class-led-trigger-netdev @@ -0,0 +1,45 @@ +What: /sys/class/leds/<led>/device_name +Date: Dec 2017 +KernelVersion: 4.16 +Contact: linux-leds@vger.kernel.org +Description: + Specifies the network device name to monitor. + +What: /sys/class/leds/<led>/interval +Date: Dec 2017 +KernelVersion: 4.16 +Contact: linux-leds@vger.kernel.org +Description: + Specifies the duration of the LED blink in milliseconds. + Defaults to 50 ms. + +What: /sys/class/leds/<led>/link +Date: Dec 2017 +KernelVersion: 4.16 +Contact: linux-leds@vger.kernel.org +Description: + Signal the link state of the named network device. + If set to 0 (default), the LED's normal state is off. + If set to 1, the LED's normal state reflects the link state + of the named network device. + Setting this value also immediately changes the LED state. + +What: /sys/class/leds/<led>/tx +Date: Dec 2017 +KernelVersion: 4.16 +Contact: linux-leds@vger.kernel.org +Description: + Signal transmission of data on the named network device. + If set to 0 (default), the LED will not blink on transmission. + If set to 1, the LED will blink for the milliseconds specified + in interval to signal transmission. + +What: /sys/class/leds/<led>/rx +Date: Dec 2017 +KernelVersion: 4.16 +Contact: linux-leds@vger.kernel.org +Description: + Signal reception of data on the named network device. + If set to 0 (default), the LED will not blink on reception. + If set to 1, the LED will blink for the milliseconds specified + in interval to signal reception. diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index d6d862db3b5d..bfd29bc8d37a 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -375,3 +375,19 @@ Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> Description: information about CPUs heterogeneity. cpu_capacity: capacity of cpu#. + +What: /sys/devices/system/cpu/vulnerabilities + /sys/devices/system/cpu/vulnerabilities/meltdown + /sys/devices/system/cpu/vulnerabilities/spectre_v1 + /sys/devices/system/cpu/vulnerabilities/spectre_v2 +Date: January 2018 +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> +Description: Information about CPU vulnerabilities + + The files are named after the code names of CPU + vulnerabilities. The output of those files reflects the + state of the CPUs in the system. Possible output values: + + "Not affected" CPU is not affected by the vulnerability + "Vulnerable" CPU is affected and no mitigation in effect + "Mitigation: $M" CPU is affected and mitigation $M is in effect diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index a7799c2fca28..d870b5514d15 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -186,3 +186,9 @@ Date: August 2017 Contact: "Jaegeuk Kim" <jaegeuk@kernel.org> Description: Controls sleep time of GC urgent mode + +What: /sys/fs/f2fs/<disk>/readdir_ra +Date: November 2017 +Contact: "Sheng Yong" <shengyong1@huawei.com> +Description: + Controls readahead inode block in readdir. diff --git a/Documentation/ABI/testing/sysfs-kernel-livepatch b/Documentation/ABI/testing/sysfs-kernel-livepatch index d5d39748382f..dac7e1e62a8b 100644 --- a/Documentation/ABI/testing/sysfs-kernel-livepatch +++ b/Documentation/ABI/testing/sysfs-kernel-livepatch @@ -33,6 +33,32 @@ Description: An attribute which indicates whether the patch is currently in transition. +What: /sys/kernel/livepatch/<patch>/signal +Date: Nov 2017 +KernelVersion: 4.15.0 +Contact: live-patching@vger.kernel.org +Description: + A writable attribute that allows administrator to affect the + course of an existing transition. Writing 1 sends a fake + signal to all remaining blocking tasks. The fake signal + means that no proper signal is delivered (there is no data in + signal pending structures). Tasks are interrupted or woken up, + and forced to change their patched state. + +What: /sys/kernel/livepatch/<patch>/force +Date: Nov 2017 +KernelVersion: 4.15.0 +Contact: live-patching@vger.kernel.org +Description: + A writable attribute that allows administrator to affect the + course of an existing transition. Writing 1 clears + TIF_PATCH_PENDING flag of all tasks and thus forces the tasks to + the patched or unpatched state. Administrator should not + use this feature without a clearance from a patch + distributor. Removal (rmmod) of patch modules is permanently + disabled when the feature is used. See + Documentation/livepatch/livepatch.txt for more information. + What: /sys/kernel/livepatch/<patch>/<object> Date: Nov 2014 KernelVersion: 3.19.0 diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt index 4a1cd7645d85..507775cce753 100644 --- a/Documentation/IRQ-domain.txt +++ b/Documentation/IRQ-domain.txt @@ -265,37 +265,5 @@ support other architectures, such as ARM, ARM64 etc. === Debugging === -If you switch on CONFIG_IRQ_DOMAIN_DEBUG (which depends on -CONFIG_IRQ_DOMAIN and CONFIG_DEBUG_FS), you will find a new file in -your debugfs mount point, called irq_domain_mapping. This file -contains a live snapshot of all the IRQ domains in the system: - - name mapped linear-max direct-max devtree-node - pl061 8 8 0 /smb/gpio@e0080000 - pl061 8 8 0 /smb/gpio@e1050000 - pMSI 0 0 0 /interrupt-controller@e1101000/v2m@e0080000 - MSI 37 0 0 /interrupt-controller@e1101000/v2m@e0080000 - GICv2m 37 0 0 /interrupt-controller@e1101000/v2m@e0080000 - GICv2 448 448 0 /interrupt-controller@e1101000 - -it also iterates over the interrupts to display their mapping in the -domains, and makes the domain stacking visible: - - -irq hwirq chip name chip data active type domain - 1 0x00019 GICv2 0xffff00000916bfd8 * LINEAR GICv2 - 2 0x0001d GICv2 0xffff00000916bfd8 LINEAR GICv2 - 3 0x0001e GICv2 0xffff00000916bfd8 * LINEAR GICv2 - 4 0x0001b GICv2 0xffff00000916bfd8 * LINEAR GICv2 - 5 0x0001a GICv2 0xffff00000916bfd8 LINEAR GICv2 -[...] - 96 0x81808 MSI 0x (null) RADIX MSI - 96+ 0x00063 GICv2m 0xffff8003ee116980 RADIX GICv2m - 96+ 0x00063 GICv2 0xffff00000916bfd8 LINEAR GICv2 - 97 0x08800 MSI 0x (null) * RADIX MSI - 97+ 0x00064 GICv2m 0xffff8003ee116980 * RADIX GICv2m - 97+ 0x00064 GICv2 0xffff00000916bfd8 * LINEAR GICv2 - -Here, interrupts 1-5 are only using a single domain, while 96 and 97 -are build out of a stack of three domain, each level performing a -particular function. +Most of the internals of the IRQ subsystem are exposed in debugfs by +turning CONFIG_GENERIC_IRQ_DEBUGFS on. diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html index 38d6d800761f..6c06e10bd04b 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html @@ -1097,7 +1097,8 @@ will cause the CPU to disregard the values of its counters on its next exit from idle. Finally, the <tt>rcu_qs_ctr_snap</tt> field is used to detect cases where a given operation has resulted in a quiescent state -for all flavors of RCU, for example, <tt>cond_resched_rcu_qs()</tt>. +for all flavors of RCU, for example, <tt>cond_resched()</tt> +when RCU has indicated a need for quiescent states. <h5>RCU Callback Handling</h5> @@ -1182,8 +1183,8 @@ CPU (and from tracing) unless otherwise stated. Its fields are as follows: <pre> - 1 int dynticks_nesting; - 2 int dynticks_nmi_nesting; + 1 long dynticks_nesting; + 2 long dynticks_nmi_nesting; 3 atomic_t dynticks; 4 bool rcu_need_heavy_qs; 5 unsigned long rcu_qs_ctr; @@ -1191,15 +1192,31 @@ Its fields are as follows: </pre> <p>The <tt>->dynticks_nesting</tt> field counts the -nesting depth of normal interrupts. -In addition, this counter is incremented when exiting dyntick-idle -mode and decremented when entering it. +nesting depth of process execution, so that in normal circumstances +this counter has value zero or one. +NMIs, irqs, and tracers are counted by the <tt>->dynticks_nmi_nesting</tt> +field. +Because NMIs cannot be masked, changes to this variable have to be +undertaken carefully using an algorithm provided by Andy Lutomirski. +The initial transition from idle adds one, and nested transitions +add two, so that a nesting level of five is represented by a +<tt>->dynticks_nmi_nesting</tt> value of nine. This counter can therefore be thought of as counting the number of reasons why this CPU cannot be permitted to enter dyntick-idle -mode, aside from non-maskable interrupts (NMIs). -NMIs are counted by the <tt>->dynticks_nmi_nesting</tt> -field, except that NMIs that interrupt non-dyntick-idle execution -are not counted. +mode, aside from process-level transitions. + +<p>However, it turns out that when running in non-idle kernel context, +the Linux kernel is fully capable of entering interrupt handlers that +never exit and perhaps also vice versa. +Therefore, whenever the <tt>->dynticks_nesting</tt> field is +incremented up from zero, the <tt>->dynticks_nmi_nesting</tt> field +is set to a large positive number, and whenever the +<tt>->dynticks_nesting</tt> field is decremented down to zero, +the the <tt>->dynticks_nmi_nesting</tt> field is set to zero. +Assuming that the number of misnested interrupts is not sufficient +to overflow the counter, this approach corrects the +<tt>->dynticks_nmi_nesting</tt> field every time the corresponding +CPU enters the idle loop from process context. </p><p>The <tt>->dynticks</tt> field counts the corresponding CPU's transitions to and from dyntick-idle mode, so that this counter @@ -1231,14 +1248,16 @@ in response. <tr><th> </th></tr> <tr><th align="left">Quick Quiz:</th></tr> <tr><td> - Why not just count all NMIs? - Wouldn't that be simpler and less error prone? + Why not simply combine the <tt>->dynticks_nesting</tt> + and <tt>->dynticks_nmi_nesting</tt> counters into a + single counter that just counts the number of reasons that + the corresponding CPU is non-idle? </td></tr> <tr><th align="left">Answer:</th></tr> <tr><td bgcolor="#ffffff"><font color="ffffff"> - It seems simpler only until you think hard about how to go about - updating the <tt>rcu_dynticks</tt> structure's - <tt>->dynticks</tt> field. + Because this would fail in the presence of interrupts whose + handlers never return and of handlers that manage to return + from a made-up interrupt. </font></td></tr> <tr><td> </td></tr> </table> diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html index 62e847bcdcdd..49690228b1c6 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.html +++ b/Documentation/RCU/Design/Requirements/Requirements.html @@ -581,7 +581,8 @@ This guarantee was only partially premeditated. DYNIX/ptx used an explicit memory barrier for publication, but had nothing resembling <tt>rcu_dereference()</tt> for subscription, nor did it have anything resembling the <tt>smp_read_barrier_depends()</tt> -that was later subsumed into <tt>rcu_dereference()</tt>. +that was later subsumed into <tt>rcu_dereference()</tt> and later +still into <tt>READ_ONCE()</tt>. The need for these operations made itself known quite suddenly at a late-1990s meeting with the DEC Alpha architects, back in the days when DEC was still a free-standing company. @@ -2797,7 +2798,7 @@ RCU must avoid degrading real-time response for CPU-bound threads, whether executing in usermode (which is one use case for <tt>CONFIG_NO_HZ_FULL=y</tt>) or in the kernel. That said, CPU-bound loops in the kernel must execute -<tt>cond_resched_rcu_qs()</tt> at least once per few tens of milliseconds +<tt>cond_resched()</tt> at least once per few tens of milliseconds in order to avoid receiving an IPI from RCU. <p> @@ -3128,7 +3129,7 @@ The solution, in the form of is to have implicit read-side critical sections that are delimited by voluntary context switches, that is, calls to <tt>schedule()</tt>, -<tt>cond_resched_rcu_qs()</tt>, and +<tt>cond_resched()</tt>, and <tt>synchronize_rcu_tasks()</tt>. In addition, transitions to and from userspace execution also delimit tasks-RCU read-side critical sections. diff --git a/Documentation/RCU/rcu_dereference.txt b/Documentation/RCU/rcu_dereference.txt index 1acb26b09b48..ab96227bad42 100644 --- a/Documentation/RCU/rcu_dereference.txt +++ b/Documentation/RCU/rcu_dereference.txt @@ -122,11 +122,7 @@ o Be very careful about comparing pointers obtained from Note that if checks for being within an RCU read-side critical section are not required and the pointer is never dereferenced, rcu_access_pointer() should be used in place - of rcu_dereference(). The rcu_access_pointer() primitive - does not require an enclosing read-side critical section, - and also omits the smp_read_barrier_depends() included in - rcu_dereference(), which in turn should provide a small - performance gain in some CPUs (e.g., the DEC Alpha). + of rcu_dereference(). o The comparison is against a pointer that references memory that was initialized "a long time ago." The reason diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index a08f928c8557..4259f95c3261 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -23,12 +23,10 @@ o A CPU looping with preemption disabled. This condition can o A CPU looping with bottom halves disabled. This condition can result in RCU-sched and RCU-bh stalls. -o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the - kernel without invoking schedule(). Note that cond_resched() - does not necessarily prevent RCU CPU stall warnings. Therefore, - if the looping in the kernel is really expected and desirable - behavior, you might need to replace some of the cond_resched() - calls with calls to cond_resched_rcu_qs(). +o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel + without invoking schedule(). If the looping in the kernel is + really expected and desirable behavior, you might need to add + some calls to cond_resched(). o Booting Linux using a console connection that is too slow to keep up with the boot-time console-message rate. For example, diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index df62466da4e0..a27fbfb0efb8 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -600,8 +600,7 @@ don't forget about them when submitting patches making use of RCU!] #define rcu_dereference(p) \ ({ \ - typeof(p) _________p1 = p; \ - smp_read_barrier_depends(); \ + typeof(p) _________p1 = READ_ONCE(p); \ (_________p1); \ }) diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst index b2598cc9834c..7242cbda15dd 100644 --- a/Documentation/admin-guide/kernel-parameters.rst +++ b/Documentation/admin-guide/kernel-parameters.rst @@ -109,6 +109,7 @@ parameter is applicable:: IPV6 IPv6 support is enabled. ISAPNP ISA PnP code is enabled. ISDN Appropriate ISDN support is enabled. + ISOL CPU Isolation is enabled. JOY Appropriate joystick support is enabled. KGDB Kernel debugger support is enabled. KVM Kernel Virtual Machine support is enabled. diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 6571fbfdb2a1..b98048b56ada 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -114,7 +114,6 @@ This facility can be used to prevent such uncontrolled GPE floodings. Format: <int> - Support masking of GPEs numbered from 0x00 to 0x7f. acpi_no_auto_serialize [HW,ACPI] Disable auto-serialization of AML methods @@ -223,7 +222,7 @@ acpi_sleep= [HW,ACPI] Sleep options Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig, - old_ordering, nonvs, sci_force_enable } + old_ordering, nonvs, sci_force_enable, nobl } See Documentation/power/video.txt for information on s3_bios and s3_mode. s3_beep is for debugging; it makes the PC's speaker beep @@ -239,6 +238,9 @@ sci_force_enable causes the kernel to set SCI_EN directly on resume from S1/S3 (which is against the ACPI spec, but some broken systems don't work without it). + nobl causes the internal blacklist of systems known to + behave incorrectly in some ways with respect to system + suspend and resume to be ignored (use wisely). acpi_use_timer_override [HW,ACPI] Use timer override. For some broken Nvidia NF5 boards @@ -328,11 +330,15 @@ not play well with APC CPU idle - disable it if you have APC and your system crashes randomly. - apic= [APIC,X86-32] Advanced Programmable Interrupt Controller + apic= [APIC,X86] Advanced Programmable Interrupt Controller Change the output verbosity whilst booting Format: { quiet (default) | verbose | debug } Change the amount of debugging information output when initialising the APIC and IO-APIC components. + For X86-32, this can also be used to specify an APIC + driver name. + Format: apic=driver_name + Examples: apic=bigsmp apic_extnmi= [APIC,X86] External NMI delivery setting Format: { bsp (default) | all | none } @@ -709,9 +715,6 @@ It will be ignored when crashkernel=X,high is not used or memory reserved is below 4G. - crossrelease_fullstack - [KNL] Allow to record full stack trace in cross-release - cryptomgr.notests [KNL] Disable crypto self-tests @@ -1737,7 +1740,7 @@ isapnp= [ISAPNP] Format: <RDP>,<reset>,<pci_scan>,<verbosity> - isolcpus= [KNL,SMP] Isolate a given set of CPUs from disturbance. + isolcpus= [KNL,SMP,ISOL] Isolate a given set of CPUs from disturbance. [Deprecated - use cpusets instead] Format: [flag-list,]<cpu-list> @@ -2049,9 +2052,6 @@ This tests the locking primitive's ability to transition abruptly to and from idle. - locktorture.torture_runnable= [BOOT] - Start locktorture running at boot time. - locktorture.torture_type= [KNL] Specify the locking implementation to test. @@ -2622,6 +2622,11 @@ nosmt [KNL,S390] Disable symmetric multithreading (SMT). Equivalent to smt=1. + nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2 + (indirect branch prediction) vulnerability. System may + allow data leaks with this option, which is equivalent + to spectre_v2=off. + noxsave [BUGS=X86] Disables x86 extended register state save and restore using xsave. The kernel will fallback to enabling legacy floating-point and sse state. @@ -2662,7 +2667,7 @@ Valid arguments: on, off Default: on - nohz_full= [KNL,BOOT] + nohz_full= [KNL,BOOT,SMP,ISOL] The argument is a cpu list, as described above. In kernels built with CONFIG_NO_HZ_FULL=y, set the specified list of CPUs whose tick will be stopped @@ -3094,6 +3099,12 @@ pcie_scan_all Scan all possible PCIe devices. Otherwise we only look for one device below a PCIe downstream port. + big_root_window Try to add a big 64bit memory window to the PCIe + root complex on AMD CPUs. Some GFX hardware + can resize a BAR to allow access to all VRAM. + Adding the window is slightly risky (it may + conflict with unreported devices), so this + taints the kernel. pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power Management. @@ -3282,6 +3293,21 @@ pt. [PARIDE] See Documentation/blockdev/paride.txt. + pti= [X86_64] Control Page Table Isolation of user and + kernel address spaces. Disabling this feature + removes hardening, but improves performance of + system calls and interrupts. + + on - unconditionally enable + off - unconditionally disable + auto - kernel detects whether your CPU model is + vulnerable to issues that PTI mitigates + + Not specifying this option is equivalent to pti=auto. + + nopti [X86_64] + Equivalent to pti=off + pty.legacy_count= [KNL] Number of legacy pty's. Overwrites compiled-in default number. @@ -3459,9 +3485,6 @@ the same as for rcuperf.nreaders. N, where N is the number of CPUs - rcuperf.perf_runnable= [BOOT] - Start rcuperf running at boot time. - rcuperf.perf_type= [KNL] Specify the RCU implementation to test. @@ -3595,9 +3618,6 @@ Test RCU's dyntick-idle handling. See also the rcutorture.shuffle_interval parameter. - rcutorture.torture_runnable= [BOOT] - Start rcutorture running at boot time. - rcutorture.torture_type= [KNL] Specify the RCU implementation to test. @@ -3655,7 +3675,8 @@ rdt= [HW,X86,RDT] Turn on/off individual RDT features. List is: - cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, mba. + cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp, + mba. E.g. to turn on cmt and turn off mba use: rdt=cmt,!mba @@ -3931,6 +3952,29 @@ sonypi.*= [HW] Sony Programmable I/O Control Device driver See Documentation/laptops/sonypi.txt + spectre_v2= [X86] Control mitigation of Spectre variant 2 + (indirect branch speculation) vulnerability. + + on - unconditionally enable + off - unconditionally disable + auto - kernel detects whether your CPU model is + vulnerable + + Selecting 'on' will, and 'auto' may, choose a + mitigation method at run time according to the + CPU, the available microcode, the setting of the + CONFIG_RETPOLINE configuration option, and the + compiler with which the kernel was built. + + Specific mitigations can also be selected manually: + + retpoline - replace indirect branches + retpoline,generic - google's original retpoline + retpoline,amd - AMD-specific minimal thunk + + Not specifying this option is equivalent to + spectre_v2=auto. + spia_io_base= [HW,MTD] spia_fio_base= spia_pedr= diff --git a/Documentation/admin-guide/thunderbolt.rst b/Documentation/admin-guide/thunderbolt.rst index de50a8561774..9b55952039a6 100644 --- a/Documentation/admin-guide/thunderbolt.rst +++ b/Documentation/admin-guide/thunderbolt.rst @@ -230,7 +230,7 @@ If supported by your machine this will be exposed by the WMI bus with a sysfs attribute called "force_power". For example the intel-wmi-thunderbolt driver exposes this attribute in: - /sys/devices/platform/PNP0C14:00/wmi_bus/wmi_bus-PNP0C14:00/86CCFD48-205E-4A77-9C48-2021CBEDE341/force_power + /sys/bus/wmi/devices/86CCFD48-205E-4A77-9C48-2021CBEDE341/force_power To force the power to on, write 1 to this attribute file. To disable force power, write 0 to this attribute file. diff --git a/Documentation/arm64/cpu-feature-registers.txt b/Documentation/arm64/cpu-feature-registers.txt index bd9b3faab2c4..a70090b28b07 100644 --- a/Documentation/arm64/cpu-feature-registers.txt +++ b/Documentation/arm64/cpu-feature-registers.txt @@ -110,7 +110,9 @@ infrastructure: x--------------------------------------------------x | Name | bits | visible | |--------------------------------------------------| - | RES0 | [63-48] | n | + | RES0 | [63-52] | n | + |--------------------------------------------------| + | FHM | [51-48] | y | |--------------------------------------------------| | DP | [47-44] | y | |--------------------------------------------------| diff --git a/Documentation/arm64/elf_hwcaps.txt b/Documentation/arm64/elf_hwcaps.txt index 89edba12a9e0..57324ee55ecc 100644 --- a/Documentation/arm64/elf_hwcaps.txt +++ b/Documentation/arm64/elf_hwcaps.txt @@ -158,3 +158,7 @@ HWCAP_SHA512 HWCAP_SVE Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001. + +HWCAP_ASIMDFHM + + Functionality implied by ID_AA64ISAR0_EL1.FHM == 0b0001. diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt index 304bf22bb83c..c1d520de6dfe 100644 --- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -72,6 +72,7 @@ stable kernels. | Hisilicon | Hip0{6,7} | #161010701 | N/A | | Hisilicon | Hip07 | #161600802 | HISILICON_ERRATUM_161600802 | | | | | | -| Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 | +| Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 | | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009 | | Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 | +| Qualcomm Tech. | Falkor v{1,2} | E1041 | QCOM_FALKOR_ERRATUM_1041 | diff --git a/Documentation/cgroup-v1/cgroups.txt b/Documentation/cgroup-v1/cgroups.txt index 308e5ff7207a..059f7063eea6 100644 --- a/Documentation/cgroup-v1/cgroups.txt +++ b/Documentation/cgroup-v1/cgroups.txt @@ -523,12 +523,7 @@ Accessing a task's cgroup pointer may be done in the following ways: Each subsystem should: - add an entry in linux/cgroup_subsys.h -- define a cgroup_subsys object called <name>_subsys - -If a subsystem can be compiled as a module, it should also have in its -module initcall a call to cgroup_load_subsys(), and in its exitcall a -call to cgroup_unload_subsys(). It should also set its_subsys.module = -THIS_MODULE in its .c file. +- define a cgroup_subsys object called <name>_cgrp_subsys Each subsystem may export the following methods. The only mandatory methods are css_alloc/free. Any others that are null are presumed to diff --git a/Documentation/cgroup-v1/memory.txt b/Documentation/cgroup-v1/memory.txt index cefb63639070..a4af2e124e24 100644 --- a/Documentation/cgroup-v1/memory.txt +++ b/Documentation/cgroup-v1/memory.txt @@ -524,9 +524,9 @@ Note: Only anonymous and swap cache memory is listed as part of 'rss' stat. This should not be confused with the true 'resident set size' or the amount of physical memory used by the cgroup. - 'rss + file_mapped" will give you resident set size of cgroup. + 'rss + mapped_file" will give you resident set size of cgroup. (Note: file and shmem may be shared among other cgroups. In that case, - file_mapped is accounted only when the memory cgroup is owner of page + mapped_file is accounted only when the memory cgroup is owner of page cache.) 5.3 swappiness diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt index 779211fbb69f..74cdeaed9f7a 100644 --- a/Documentation/cgroup-v2.txt +++ b/Documentation/cgroup-v2.txt @@ -53,10 +53,14 @@ v1 is available under Documentation/cgroup-v1/. 5-3-2. Writeback 5-4. PID 5-4-1. PID Interface Files - 5-5. RDMA - 5-5-1. RDMA Interface Files - 5-6. Misc - 5-6-1. perf_event + 5-5. Device + 5-6. RDMA + 5-6-1. RDMA Interface Files + 5-7. Misc + 5-7-1. perf_event + 5-N. Non-normative information + 5-N-1. CPU controller root cgroup process behaviour + 5-N-2. IO controller root cgroup process behaviour 6. Namespace 6-1. Basics 6-2. The Root and Views @@ -279,7 +283,7 @@ thread mode, the following conditions must be met. exempt from this requirement. Topology-wise, a cgroup can be in an invalid state. Please consider -the following toplogy:: +the following topology:: A (threaded domain) - B (threaded) - C (domain, just created) @@ -420,7 +424,9 @@ The root cgroup is exempt from this restriction. Root contains processes and anonymous resource consumption which can't be associated with any other cgroups and requires special treatment from most controllers. How resource consumption in the root cgroup is governed -is up to each controller. +is up to each controller (for more information on this topic please +refer to the Non-normative information section in the Controllers +chapter). Note that the restriction doesn't get in the way if there is no enabled controller in the cgroup's "cgroup.subtree_control". This is @@ -898,6 +904,13 @@ controller implements weight and absolute bandwidth limit models for normal scheduling policy and absolute bandwidth allocation model for realtime scheduling policy. +WARNING: cgroup2 doesn't yet support control of realtime processes and +the cpu controller can only be enabled when all RT processes are in +the root cgroup. Be aware that system management software may already +have placed RT processes into nonroot cgroups during the system boot +process, and these processes may need to be moved to the root cgroup +before the cpu controller can be enabled. + CPU Interface Files ~~~~~~~~~~~~~~~~~~~ @@ -1056,10 +1069,10 @@ PAGE_SIZE multiple when read back. reached the limit and allocation was about to fail. Depending on context result could be invocation of OOM - killer and retrying allocation or failing alloction. + killer and retrying allocation or failing allocation. Failed allocation in its turn could be returned into - userspace as -ENOMEM or siletly ignored in cases like + userspace as -ENOMEM or silently ignored in cases like disk readahead. For now OOM in memory cgroup kills tasks iff shortage has happened inside page fault. @@ -1184,7 +1197,7 @@ PAGE_SIZE multiple when read back. cgroups. The default is "max". Swap usage hard limit. If a cgroup's swap usage reaches this - limit, anonymous meomry of the cgroup will not be swapped out. + limit, anonymous memory of the cgroup will not be swapped out. Usage Guidelines @@ -1422,6 +1435,30 @@ through fork() or clone(). These will return -EAGAIN if the creation of a new process would cause a cgroup policy to be violated. +Device controller +----------------- + +Device controller manages access to device files. It includes both +creation of new device files (using mknod), and access to the +existing device files. + +Cgroup v2 device controller has no interface files and is implemented +on top of cgroup BPF. To control access to device files, a user may +create bpf programs of the BPF_CGROUP_DEVICE type and attach them +to cgroups. On an attempt to access a device file, corresponding +BPF programs will be executed, and depending on the return value +the attempt will succeed or fail with -EPERM. + +A BPF_CGROUP_DEVICE program takes a pointer to the bpf_cgroup_dev_ctx +structure, which describes the device access attempt: access type +(mknod/read/write) and device (type, major and minor numbers). +If the program returns 0, the attempt fails with -EPERM, otherwise +it succeeds. + +An example of BPF_CGROUP_DEVICE program may be found in the kernel +source tree in the tools/testing/selftests/bpf/dev_cgroup.c file. + + RDMA ---- @@ -1474,6 +1511,35 @@ always be filtered by cgroup v2 path. The controller can still be moved to a legacy hierarchy after v2 hierarchy is populated. +Non-normative information +------------------------- + +This section contains information that isn't considered to be a part of +the stable kernel API and so is subject to change. + + +CPU controller root cgroup process behaviour +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When distributing CPU cycles in the root cgroup each thread in this +cgroup is treated as if it was hosted in a separate child cgroup of the +root cgroup. This child cgroup weight is dependent on its thread nice +level. + +For details of this mapping see sched_prio_to_weight array in +kernel/sched/core.c file (values from this array should be scaled +appropriately so the neutral - nice 0 - value is 100 instead of 1024). + + +IO controller root cgroup process behaviour +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Root cgroup processes are hosted in an implicit leaf child node. +When distributing IO resources this implicit child node is taken into +account as if it was a normal child cgroup of the root cgroup with a +weight value of 200. + + Namespace ========= diff --git a/Documentation/circular-buffers.txt b/Documentation/circular-buffers.txt index d4628174b7c5..53e51caa3347 100644 --- a/Documentation/circular-buffers.txt +++ b/Documentation/circular-buffers.txt @@ -220,8 +220,7 @@ before it writes the new tail pointer, which will erase the item. Note the use of READ_ONCE() and smp_load_acquire() to read the opposition index. This prevents the compiler from discarding and -reloading its cached value - which some compilers will do across -smp_read_barrier_depends(). This isn't strictly needed if you can +reloading its cached value. This isn't strictly needed if you can be sure that the opposition index will _only_ be used the once. The smp_load_acquire() additionally forces the CPU to order against subsequent memory references. Similarly, smp_store_release() is used diff --git a/Documentation/device-mapper/cache-policies.txt b/Documentation/device-mapper/cache-policies.txt index d3ca8af21a31..86786d87d9a8 100644 --- a/Documentation/device-mapper/cache-policies.txt +++ b/Documentation/device-mapper/cache-policies.txt @@ -60,7 +60,7 @@ Memory usage: The mq policy used a lot of memory; 88 bytes per cache block on a 64 bit machine. -smq uses 28bit indexes to implement it's data structures rather than +smq uses 28bit indexes to implement its data structures rather than pointers. It avoids storing an explicit hit count for each block. It has a 'hotspot' queue, rather than a pre-cache, which uses a quarter of the entries (each hotspot block covers a larger area than a single @@ -84,7 +84,7 @@ resulting in better promotion/demotion decisions. Adaptability: The mq policy maintained a hit count for each cache block. For a -different block to get promoted to the cache it's hit count has to +different block to get promoted to the cache its hit count has to exceed the lowest currently in the cache. This meant it could take a long time for the cache to adapt between varying IO patterns. diff --git a/Documentation/device-mapper/cache.txt b/Documentation/device-mapper/cache.txt index cdfd0feb294e..ff0841711fd5 100644 --- a/Documentation/device-mapper/cache.txt +++ b/Documentation/device-mapper/cache.txt @@ -59,7 +59,7 @@ Fixed block size The origin is divided up into blocks of a fixed size. This block size is configurable when you first create the cache. Typically we've been using block sizes of 256KB - 1024KB. The block size must be between 64 -(32KB) and 2097152 (1GB) and a multiple of 64 (32KB). +sectors (32KB) and 2097152 sectors (1GB) and a multiple of 64 sectors (32KB). Having a fixed block size simplifies the target a lot. But it is something of a compromise. For instance, a small part of a block may be @@ -119,7 +119,7 @@ doing here to avoid migrating during those peak io moments. For the time being, a message "migration_threshold <#sectors>" can be used to set the maximum number of sectors being migrated, -the default being 204800 sectors (or 100MB). +the default being 2048 sectors (1MB). Updating on-disk metadata ------------------------- @@ -143,11 +143,6 @@ the policy how big this chunk is, but it should be kept small. Like the dirty flags this data is lost if there's a crash so a safe fallback value should always be possible. -For instance, the 'mq' policy, which is currently the default policy, -uses this facility to store the hit count of the cache blocks. If -there's a crash this information will be lost, which means the cache -may be less efficient until those hit counts are regenerated. - Policy hints affect performance, not correctness. Policy messaging diff --git a/Documentation/device-mapper/dm-raid.txt b/Documentation/device-mapper/dm-raid.txt index 32df07e29f68..390c145f01d7 100644 --- a/Documentation/device-mapper/dm-raid.txt +++ b/Documentation/device-mapper/dm-raid.txt @@ -343,5 +343,8 @@ Version History 1.11.0 Fix table line argument order (wrong raid10_copies/raid10_format sequence) 1.11.1 Add raid4/5/6 journal write-back support via journal_mode option -1.12.1 fix for MD deadlock between mddev_suspend() and md_write_start() available +1.12.1 Fix for MD deadlock between mddev_suspend() and md_write_start() available 1.13.0 Fix dev_health status at end of "recover" (was 'a', now 'A') +1.13.1 Fix deadlock caused by early md_stop_writes(). Also fix size an + state races. +1.13.2 Fix raid redundancy validation and avoid keeping raid set frozen diff --git a/Documentation/device-mapper/snapshot.txt b/Documentation/device-mapper/snapshot.txt index ad6949bff2e3..b8bbb516f989 100644 --- a/Documentation/device-mapper/snapshot.txt +++ b/Documentation/device-mapper/snapshot.txt @@ -49,6 +49,10 @@ The difference between persistent and transient is with transient snapshots less metadata must be saved on disk - they can be kept in memory by the kernel. +When loading or unloading the snapshot target, the corresponding +snapshot-origin or snapshot-merge target must be suspended. A failure to +suspend the origin target could result in data corruption. + * snapshot-merge <origin> <COW device> <persistent> <chunksize> diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt index 1699a55b7b70..4bcd4b7f79f9 100644 --- a/Documentation/device-mapper/thin-provisioning.txt +++ b/Documentation/device-mapper/thin-provisioning.txt @@ -112,9 +112,11 @@ $low_water_mark is expressed in blocks of size $data_block_size. If free space on the data device drops below this level then a dm event will be triggered which a userspace daemon should catch allowing it to extend the pool device. Only one such event will be sent. -Resuming a device with a new table itself triggers an event so the -userspace daemon can use this to detect a situation where a new table -already exceeds the threshold. + +No special event is triggered if a just resumed device's free space is below +the low water mark. However, resuming a device always triggers an +event; a userspace daemon should verify that free space exceeds the low +water mark when handling this event. A low water mark for the metadata device is maintained in the kernel and will trigger a dm event if free space on the metadata device drops below @@ -274,7 +276,8 @@ ii) Status <transaction id> <used metadata blocks>/<total metadata blocks> <used data blocks>/<total data blocks> <held metadata root> - [no_]discard_passdown ro|rw + ro|rw|out_of_data_space [no_]discard_passdown [error|queue]_if_no_space + needs_check|- transaction id: A 64-bit number used by userspace to help synchronise with metadata @@ -394,3 +397,6 @@ ii) Status If the pool has encountered device errors and failed, the status will just contain the string 'Fail'. The userspace recovery tools should then be used. + + In the case where <nr mapped sectors> is 0, there is no highest + mapped sector and the value of <highest mapped sector> is unspecified. diff --git a/Documentation/device-mapper/unstriped.txt b/Documentation/device-mapper/unstriped.txt new file mode 100644 index 000000000000..0b2a306c54ee --- /dev/null +++ b/Documentation/device-mapper/unstriped.txt @@ -0,0 +1,124 @@ +Introduction +============ + +The device-mapper "unstriped" target provides a transparent mechanism to +unstripe a device-mapper "striped" target to access the underlying disks +without having to touch the true backing block-device. It can also be +used to unstripe a hardware RAID-0 to access backing disks. + +Parameters: +<number of stripes> <chunk size> <stripe #> <dev_path> <offset> + +<number of stripes> + The number of stripes in the RAID 0. + +<chunk size> + The amount of 512B sectors in the chunk striping. + +<dev_path> + The block device you wish to unstripe. + +<stripe #> + The stripe number within the device that corresponds to physical + drive you wish to unstripe. This must be 0 indexed. + + +Why use this module? +==================== + +An example of undoing an existing dm-stripe +------------------------------------------- + +This small bash script will setup 4 loop devices and use the existing +striped target to combine the 4 devices into one. It then will use +the unstriped target ontop of the striped device to access the +individual backing loop devices. We write data to the newly exposed +unstriped devices and verify the data written matches the correct +underlying device on the striped array. + +#!/bin/bash + +MEMBER_SIZE=$((128 * 1024 * 1024)) +NUM=4 +SEQ_END=$((${NUM}-1)) +CHUNK=256 +BS=4096 + +RAID_SIZE=$((${MEMBER_SIZE}*${NUM}/512)) +DM_PARMS="0 ${RAID_SIZE} striped ${NUM} ${CHUNK}" +COUNT=$((${MEMBER_SIZE} / ${BS})) + +for i in $(seq 0 ${SEQ_END}); do + dd if=/dev/zero of=member-${i} bs=${MEMBER_SIZE} count=1 oflag=direct + losetup /dev/loop${i} member-${i} + DM_PARMS+=" /dev/loop${i} 0" +done + +echo $DM_PARMS | dmsetup create raid0 +for i in $(seq 0 ${SEQ_END}); do + echo "0 1 unstriped ${NUM} ${CHUNK} ${i} /dev/mapper/raid0 0" | dmsetup create set-${i} +done; + +for i in $(seq 0 ${SEQ_END}); do + dd if=/dev/urandom of=/dev/mapper/set-${i} bs=${BS} count=${COUNT} oflag=direct + diff /dev/mapper/set-${i} member-${i} +done; + +for i in $(seq 0 ${SEQ_END}); do + dmsetup remove set-${i} +done + +dmsetup remove raid0 + +for i in $(seq 0 ${SEQ_END}); do + losetup -d /dev/loop${i} + rm -f member-${i} +done + +Another example +--------------- + +Intel NVMe drives contain two cores on the physical device. +Each core of the drive has segregated access to its LBA range. +The current LBA model has a RAID 0 128k chunk on each core, resulting +in a 256k stripe across the two cores: + + Core 0: Core 1: + __________ __________ + | LBA 512| | LBA 768| + | LBA 0 | | LBA 256| + ---------- ---------- + +The purpose of this unstriping is to provide better QoS in noisy +neighbor environments. When two partitions are created on the +aggregate drive without this unstriping, reads on one partition +can affect writes on another partition. This is because the partitions +are striped across the two cores. When we unstripe this hardware RAID 0 +and make partitions on each new exposed device the two partitions are now +physically separated. + +With the dm-unstriped target we're able to segregate an fio script that +has read and write jobs that are independent of each other. Compared to +when we run the test on a combined drive with partitions, we were able +to get a 92% reduction in read latency using this device mapper target. + + +Example dmsetup usage +===================== + +unstriped ontop of Intel NVMe device that has 2 cores +----------------------------------------------------- +dmsetup create nvmset0 --table '0 512 unstriped 2 256 0 /dev/nvme0n1 0' +dmsetup create nvmset1 --table '0 512 unstriped 2 256 1 /dev/nvme0n1 0' + +There will now be two devices that expose Intel NVMe core 0 and 1 +respectively: +/dev/mapper/nvmset0 +/dev/mapper/nvmset1 + +unstriped ontop of striped with 4 drives using 128K chunk size +-------------------------------------------------------------- +dmsetup create raid_disk0 --table '0 512 unstriped 4 256 0 /dev/mapper/striped 0' +dmsetup create raid_disk1 --table '0 512 unstriped 4 256 1 /dev/mapper/striped 0' +dmsetup create raid_disk2 --table '0 512 unstriped 4 256 2 /dev/mapper/striped 0' +dmsetup create raid_disk3 --table '0 512 unstriped 4 256 3 /dev/mapper/striped 0' diff --git a/Documentation/devicetree/bindings/arm/arm-dsu-pmu.txt b/Documentation/devicetree/bindings/arm/arm-dsu-pmu.txt new file mode 100644 index 000000000000..6efabba530f1 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/arm-dsu-pmu.txt @@ -0,0 +1,27 @@ +* ARM DynamIQ Shared Unit (DSU) Performance Monitor Unit (PMU) + +ARM DyanmIQ Shared Unit (DSU) integrates one or more CPU cores +with a shared L3 memory system, control logic and external interfaces to +form a multicore cluster. The PMU enables to gather various statistics on +the operations of the DSU. The PMU provides independent 32bit counters that +can count any of the supported events, along with a 64bit cycle counter. +The PMU is accessed via CPU system registers and has no MMIO component. + +** DSU PMU required properties: + +- compatible : should be one of : + + "arm,dsu-pmu" + +- interrupts : Exactly 1 SPI must be listed. + +- cpus : List of phandles for the CPUs connected to this DSU instance. + + +** Example: + +dsu-pmu-0 { + compatible = "arm,dsu-pmu"; + interrupts = <GIC_SPI 02 IRQ_TYPE_LEVEL_HIGH>; + cpus = <&cpu_0>, <&cpu_1>; +}; diff --git a/Documentation/devicetree/bindings/arm/firmware/sdei.txt b/Documentation/devicetree/bindings/arm/firmware/sdei.txt new file mode 100644 index 000000000000..ee3f0ff49889 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/firmware/sdei.txt @@ -0,0 +1,42 @@ +* Software Delegated Exception Interface (SDEI) + +Firmware implementing the SDEI functions described in ARM document number +ARM DEN 0054A ("Software Delegated Exception Interface") can be used by +Linux to receive notification of events such as those generated by +firmware-first error handling, or from an IRQ that has been promoted to +a firmware-assisted NMI. + +The interface provides a number of API functions for registering callbacks +and enabling/disabling events. Functions are invoked by trapping to the +privilege level of the SDEI firmware (specified as part of the binding +below) and passing arguments in a manner specified by the "SMC Calling +Convention (ARM DEN 0028B): + + r0 => 32-bit Function ID / return value + {r1 - r3} => Parameters + +Note that the immediate field of the trapping instruction must be set +to #0. + +The SDEI_EVENT_REGISTER function registers a callback in the kernel +text to handle the specified event number. + +The sdei node should be a child node of '/firmware' and have required +properties: + + - compatible : should contain: + * "arm,sdei-1.0" : For implementations complying to SDEI version 1.x. + + - method : The method of calling the SDEI firmware. Permitted + values are: + * "smc" : SMC #0, with the register assignments specified in this + binding. + * "hvc" : HVC #0, with the register assignments specified in this + binding. +Example: + firmware { + sdei { + compatible = "arm,sdei-1.0"; + method = "smc"; + }; + }; diff --git a/Documentation/devicetree/bindings/arm/marvell/armada-37xx.txt b/Documentation/devicetree/bindings/arm/marvell/armada-37xx.txt index 51336e5fc761..35c3c3460d17 100644 --- a/Documentation/devicetree/bindings/arm/marvell/armada-37xx.txt +++ b/Documentation/devicetree/bindings/arm/marvell/armada-37xx.txt @@ -14,3 +14,22 @@ following property before the previous one: Example: compatible = "marvell,armada-3720-db", "marvell,armada3720", "marvell,armada3710"; + + +Power management +---------------- + +For power management (particularly DVFS and AVS), the North Bridge +Power Management component is needed: + +Required properties: +- compatible : should contain "marvell,armada-3700-nb-pm", "syscon"; +- reg : the register start and length for the North Bridge + Power Management + +Example: + +nb_pm: syscon@14000 { + compatible = "marvell,armada-3700-nb-pm", "syscon"; + reg = <0x14000 0x60>; +} diff --git a/Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt b/Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt index b3408cc57be6..1ae4748730a8 100644 --- a/Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt +++ b/Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt @@ -47,8 +47,8 @@ When the OS is not in control of the management interface (i.e. it's a guest), the channel nodes appear on their own, not under a management node. Required properties: -- compatible: must contain "qcom,hidma-1.0" for initial HW or "qcom,hidma-1.1" -for MSI capable HW. +- compatible: must contain "qcom,hidma-1.0" for initial HW or + "qcom,hidma-1.1"/"qcom,hidma-1.2" for MSI capable HW. - reg: Addresses for the transfer and event channel - interrupts: Should contain the event interrupt - desc-count: Number of asynchronous requests this channel can handle diff --git a/Documentation/devicetree/bindings/gpio/gpio-axp209.txt b/Documentation/devicetree/bindings/gpio/gpio-axp209.txt index a6611304dd3c..fc42b2caa06d 100644 --- a/Documentation/devicetree/bindings/gpio/gpio-axp209.txt +++ b/Documentation/devicetree/bindings/gpio/gpio-axp209.txt @@ -1,10 +1,17 @@ -AXP209 GPIO controller +AXP209 GPIO & pinctrl controller This driver follows the usual GPIO bindings found in Documentation/devicetree/bindings/gpio/gpio.txt +This driver follows the usual pinctrl bindings found in +Documentation/devicetree/bindings/pinctrl/pinctrl-bindings.txt + +This driver employs the per-pin muxing pattern. + Required properties: -- compatible: Should be "x-powers,axp209-gpio" +- compatible: Should be one of: + - "x-powers,axp209-gpio" + - "x-powers,axp813-gpio" - #gpio-cells: Should be two. The first cell is the pin number and the second is the GPIO flags. - gpio-controller: Marks the device node as a GPIO controller. @@ -28,3 +35,41 @@ axp209: pmic@34 { #gpio-cells = <2>; }; }; + +The GPIOs can be muxed to other functions and therefore, must be a subnode of +axp_gpio. + +Example: + +&axp_gpio { + gpio0_adc: gpio0-adc { + pins = "GPIO0"; + function = "adc"; + }; +}; + +&example_node { + pinctrl-names = "default"; + pinctrl-0 = <&gpio0_adc>; +}; + +GPIOs and their functions +------------------------- + +Each GPIO is independent from the other (i.e. GPIO0 in gpio_in function does +not force GPIO1 and GPIO2 to be in gpio_in function as well). + +axp209 +------ +GPIO | Functions +------------------------ +GPIO0 | gpio_in, gpio_out, ldo, adc +GPIO1 | gpio_in, gpio_out, ldo, adc +GPIO2 | gpio_in, gpio_out + +axp813 +------ +GPIO | Functions +------------------------ +GPIO0 | gpio_in, gpio_out, ldo, adc +GPIO1 | gpio_in, gpio_out, ldo diff --git a/Documentation/devicetree/bindings/gpio/renesas,gpio-rcar.txt b/Documentation/devicetree/bindings/gpio/renesas,gpio-rcar.txt index a7ac460ad657..9474138d776e 100644 --- a/Documentation/devicetree/bindings/gpio/renesas,gpio-rcar.txt +++ b/Documentation/devicetree/bindings/gpio/renesas,gpio-rcar.txt @@ -5,7 +5,7 @@ Required Properties: - compatible: should contain one or more of the following: - "renesas,gpio-r8a7743": for R8A7743 (RZ/G1M) compatible GPIO controller. - "renesas,gpio-r8a7745": for R8A7745 (RZ/G1E) compatible GPIO controller. - - "renesas,gpio-r8a7778": for R8A7778 (R-Mobile M1) compatible GPIO controller. + - "renesas,gpio-r8a7778": for R8A7778 (R-Car M1) compatible GPIO controller. - "renesas,gpio-r8a7779": for R8A7779 (R-Car H1) compatible GPIO controller. - "renesas,gpio-r8a7790": for R8A7790 (R-Car H2) compatible GPIO controller. - "renesas,gpio-r8a7791": for R8A7791 (R-Car M2-W) compatible GPIO controller. diff --git a/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt b/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt index 367c8203213b..3ac02988a1a5 100644 --- a/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt +++ b/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt @@ -22,8 +22,9 @@ Required properties for pwm-tacho node: - compatible : should be "aspeed,ast2400-pwm-tacho" for AST2400 and "aspeed,ast2500-pwm-tacho" for AST2500. -- clocks : a fixed clock providing input clock frequency(PWM - and Fan Tach clock) +- clocks : phandle to clock provider with the clock number in the second cell + +- resets : phandle to reset controller with the reset number in the second cell fan subnode format: =================== @@ -48,19 +49,14 @@ Required properties for each child node: Examples: -pwm_tacho_fixed_clk: fixedclk { - compatible = "fixed-clock"; - #clock-cells = <0>; - clock-frequency = <24000000>; -}; - pwm_tacho: pwmtachocontroller@1e786000 { #address-cells = <1>; #size-cells = <1>; #cooling-cells = <2>; reg = <0x1E786000 0x1000>; compatible = "aspeed,ast2500-pwm-tacho"; - clocks = <&pwm_tacho_fixed_clk>; + clocks = <&syscon ASPEED_CLK_APB>; + resets = <&syscon ASPEED_RESET_PWM>; pinctrl-names = "default"; pinctrl-0 = <&pinctrl_pwm0_default &pinctrl_pwm1_default>; diff --git a/Documentation/devicetree/bindings/iio/adc/sigma-delta-modulator.txt b/Documentation/devicetree/bindings/iio/adc/sigma-delta-modulator.txt new file mode 100644 index 000000000000..e9ebb8a20e0d --- /dev/null +++ b/Documentation/devicetree/bindings/iio/adc/sigma-delta-modulator.txt @@ -0,0 +1,13 @@ +Device-Tree bindings for sigma delta modulator + +Required properties: +- compatible: should be "ads1201", "sd-modulator". "sd-modulator" can be use + as a generic SD modulator if modulator not specified in compatible list. +- #io-channel-cells = <1>: See the IIO bindings section "IIO consumers". + +Example node: + + ads1202: adc@0 { + compatible = "sd-modulator"; + #io-channel-cells = <1>; + }; diff --git a/Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.txt b/Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.txt new file mode 100644 index 000000000000..911492da48f3 --- /dev/null +++ b/Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.txt @@ -0,0 +1,128 @@ +STMicroelectronics STM32 DFSDM ADC device driver + + +STM32 DFSDM ADC is a sigma delta analog-to-digital converter dedicated to +interface external sigma delta modulators to STM32 micro controllers. +It is mainly targeted for: +- Sigma delta modulators (motor control, metering...) +- PDM microphones (audio digital microphone) + +It features up to 8 serial digital interfaces (SPI or Manchester) and +up to 4 filters on stm32h7. + +Each child node match with a filter instance. + +Contents of a STM32 DFSDM root node: +------------------------------------ +Required properties: +- compatible: Should be "st,stm32h7-dfsdm". +- reg: Offset and length of the DFSDM block register set. +- clocks: IP and serial interfaces clocking. Should be set according + to rcc clock ID and "clock-names". +- clock-names: Input clock name "dfsdm" must be defined, + "audio" is optional. If defined CLKOUT is based on the audio + clock, else "dfsdm" is used. +- #interrupt-cells = <1>; +- #address-cells = <1>; +- #size-cells = <0>; + +Optional properties: +- spi-max-frequency: Requested only for SPI master mode. + SPI clock OUT frequency (Hz). This clock must be set according + to "clock" property. Frequency must be a multiple of the rcc + clock frequency. If not, SPI CLKOUT frequency will not be + accurate. + +Contents of a STM32 DFSDM child nodes: +-------------------------------------- + +Required properties: +- compatible: Must be: + "st,stm32-dfsdm-adc" for sigma delta ADCs + "st,stm32-dfsdm-dmic" for audio digital microphone. +- reg: Specifies the DFSDM filter instance used. +- interrupts: IRQ lines connected to each DFSDM filter instance. +- st,adc-channels: List of single-ended channels muxed for this ADC. + valid values: + "st,stm32h7-dfsdm" compatibility: 0 to 7. +- st,adc-channel-names: List of single-ended channel names. +- st,filter-order: SinC filter order from 0 to 5. + 0: FastSinC + [1-5]: order 1 to 5. + For audio purpose it is recommended to use order 3 to 5. +- #io-channel-cells = <1>: See the IIO bindings section "IIO consumers". + +Required properties for "st,stm32-dfsdm-adc" compatibility: +- io-channels: From common IIO binding. Used to pipe external sigma delta + modulator or internal ADC output to DFSDM channel. + This is not required for "st,stm32-dfsdm-pdm" compatibility as + PDM microphone is binded in Audio DT node. + +Required properties for "st,stm32-dfsdm-pdm" compatibility: +- #sound-dai-cells: Must be set to 0. +- dma: DMA controller phandle and DMA request line associated to the + filter instance (specified by the field "reg") +- dma-names: Must be "rx" + +Optional properties: +- st,adc-channel-types: Single-ended channel input type. + - "SPI_R": SPI with data on rising edge (default) + - "SPI_F": SPI with data on falling edge + - "MANCH_R": manchester codec, rising edge = logic 0 + - "MANCH_F": manchester codec, falling edge = logic 1 +- st,adc-channel-clk-src: Conversion clock source. + - "CLKIN": external SPI clock (CLKIN x) + - "CLKOUT": internal SPI clock (CLKOUT) (default) + - "CLKOUT_F": internal SPI clock divided by 2 (falling edge). + - "CLKOUT_R": internal SPI clock divided by 2 (rising edge). + +- st,adc-alt-channel: Must be defined if two sigma delta modulator are + connected on same SPI input. + If not set, channel n is connected to SPI input n. + If set, channel n is connected to SPI input n + 1. + +- st,filter0-sync: Set to 1 to synchronize with DFSDM filter instance 0. + Used for multi microphones synchronization. + +Example of a sigma delta adc connected on DFSDM SPI port 0 +and a pdm microphone connected on DFSDM SPI port 1: + + ads1202: simple_sd_adc@0 { + compatible = "ads1202"; + #io-channel-cells = <1>; + }; + + dfsdm: dfsdm@40017000 { + compatible = "st,stm32h7-dfsdm"; + reg = <0x40017000 0x400>; + clocks = <&rcc DFSDM1_CK>; + clock-names = "dfsdm"; + #interrupt-cells = <1>; + #address-cells = <1>; + #size-cells = <0>; + + dfsdm_adc0: filter@0 { + compatible = "st,stm32-dfsdm-adc"; + #io-channel-cells = <1>; + reg = <0>; + interrupts = <110>; + st,adc-channels = <0>; + st,adc-channel-names = "sd_adc0"; + st,adc-channel-types = "SPI_F"; + st,adc-channel-clk-src = "CLKOUT"; + io-channels = <&ads1202 0>; + st,filter-order = <3>; + }; + dfsdm_pdm1: filter@1 { + compatible = "st,stm32-dfsdm-dmic"; + reg = <1>; + interrupts = <111>; + dmas = <&dmamux1 102 0x400 0x00>; + dma-names = "rx"; + st,adc-channels = <1>; + st,adc-channel-names = "dmic1"; + st,adc-channel-types = "SPI_R"; + st,adc-channel-clk-src = "CLKOUT"; + st,filter-order = <5>; + }; + } diff --git a/Documentation/devicetree/bindings/input/hid-over-i2c.txt b/Documentation/devicetree/bindings/input/hid-over-i2c.txt index 28e8bd8b7d64..4d3da9d91de4 100644 --- a/Documentation/devicetree/bindings/input/hid-over-i2c.txt +++ b/Documentation/devicetree/bindings/input/hid-over-i2c.txt @@ -31,7 +31,7 @@ device-specific compatible properties, which should be used in addition to the - vdd-supply: phandle of the regulator that provides the supply voltage. - post-power-on-delay-ms: time required by the device after enabling its regulators - before it is ready for communication. Must be used with 'vdd-supply'. + or powering it on, before it is ready for communication. Example: diff --git a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2836-l1-intc.txt b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2836-l1-intc.txt index f320dcd6e69b..8ced1696c325 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2836-l1-intc.txt +++ b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2836-l1-intc.txt @@ -12,7 +12,7 @@ Required properties: registers - interrupt-controller: Identifies the node as an interrupt controller - #interrupt-cells: Specifies the number of cells needed to encode an - interrupt source. The value shall be 1 + interrupt source. The value shall be 2 Please refer to interrupts.txt in this directory for details of the common Interrupt Controllers bindings used by client devices. @@ -32,6 +32,6 @@ local_intc: local_intc { compatible = "brcm,bcm2836-l1-intc"; reg = <0x40000000 0x100>; interrupt-controller; - #interrupt-cells = <1>; + #interrupt-cells = <2>; interrupt-parent = <&local_intc>; }; diff --git a/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt b/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt new file mode 100644 index 000000000000..35f752706e7d --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt @@ -0,0 +1,30 @@ +Android Goldfish PIC + +Android Goldfish programmable interrupt device used by Android +emulator. + +Required properties: + +- compatible : should contain "google,goldfish-pic" +- reg : <registers mapping> +- interrupts : <interrupt mapping> + +Example for mips when used in cascade mode: + + cpuintc { + #interrupt-cells = <0x1>; + #address-cells = <0>; + interrupt-controller; + compatible = "mti,cpu-interrupt-controller"; + }; + + interrupt-controller@1f000000 { + compatible = "google,goldfish-pic"; + reg = <0x1f000000 0x1000>; + + interrupt-controller; + #interrupt-cells = <0x1>; + + interrupt-parent = <&cpuintc>; + interrupts = <0x2>; + }; diff --git a/Documentation/devicetree/bindings/leds/leds-lm3692x.txt b/Documentation/devicetree/bindings/leds/leds-lm3692x.txt new file mode 100644 index 000000000000..6c9074f84a51 --- /dev/null +++ b/Documentation/devicetree/bindings/leds/leds-lm3692x.txt @@ -0,0 +1,49 @@ +* Texas Instruments - LM3692x Highly Efficient White LED Driver + +The LM3692x is an ultra-compact, highly efficient, +white-LED driver designed for LCD display backlighting. + +The main difference between the LM36922 and LM36923 is the number of +LED strings it supports. The LM36922 supports two strings while the LM36923 +supports three strings. + +Required properties: + - compatible: + "ti,lm36922" + "ti,lm36923" + - reg : I2C slave address + - #address-cells : 1 + - #size-cells : 0 + +Optional properties: + - enable-gpios : gpio pin to enable/disable the device. + - vled-supply : LED supply + +Required child properties: + - reg : 0 + +Optional child properties: + - label : see Documentation/devicetree/bindings/leds/common.txt + - linux,default-trigger : + see Documentation/devicetree/bindings/leds/common.txt + +Example: + +led-controller@36 { + compatible = "ti,lm3692x"; + reg = <0x36>; + #address-cells = <1>; + #size-cells = <0>; + + enable-gpios = <&gpio1 28 GPIO_ACTIVE_HIGH>; + vled-supply = <&vbatt>; + + led@0 { + reg = <0>; + label = "white:backlight_cluster"; + linux,default-trigger = "backlight"; + }; +} + +For more product information please see the link below: +http://www.ti.com/lit/ds/snvsa29/snvsa29.pdf diff --git a/Documentation/devicetree/bindings/leds/leds-lp8860.txt b/Documentation/devicetree/bindings/leds/leds-lp8860.txt index aad38dd94d4b..5f0e892ad759 100644 --- a/Documentation/devicetree/bindings/leds/leds-lp8860.txt +++ b/Documentation/devicetree/bindings/leds/leds-lp8860.txt @@ -6,23 +6,39 @@ current sinks that can be controlled by a PWM input signal, a SPI/I2C master, or both. Required properties: - - compatible: + - compatible : "ti,lp8860" - - reg - I2C slave address - - label - Used for naming LEDs + - reg : I2C slave address + - #address-cells : 1 + - #size-cells : 0 Optional properties: - - enable-gpio - gpio pin to enable/disable the device. - - supply - "vled" - LED supply + - enable-gpios : gpio pin to enable (active high)/disable the device. + - vled-supply : LED supply + +Required child properties: + - reg : 0 + +Optional child properties: + - label : see Documentation/devicetree/bindings/leds/common.txt + - linux,default-trigger : + see Documentation/devicetree/bindings/leds/common.txt Example: -leds: leds@6 { +led-controller@2d { compatible = "ti,lp8860"; + #address-cells = <1>; + #size-cells = <0>; reg = <0x2d>; - label = "display_cluster"; - enable-gpio = <&gpio1 28 GPIO_ACTIVE_HIGH>; + enable-gpios = <&gpio1 28 GPIO_ACTIVE_HIGH>; vled-supply = <&vbatt>; + + led@0 { + reg = <0>; + label = "white:backlight"; + linux,default-trigger = "backlight"; + }; } For more product information please see the link below: diff --git a/Documentation/devicetree/bindings/mfd/mc13xxx.txt b/Documentation/devicetree/bindings/mfd/mc13xxx.txt index ac235fe385fc..8261ea73278a 100644 --- a/Documentation/devicetree/bindings/mfd/mc13xxx.txt +++ b/Documentation/devicetree/bindings/mfd/mc13xxx.txt @@ -130,7 +130,7 @@ ecspi@70010000 { /* ECSPI1 */ #size-cells = <0>; led-control = <0x000 0x000 0x0e0 0x000>; - sysled { + sysled@3 { reg = <3>; label = "system:red:live"; linux,default-trigger = "heartbeat"; diff --git a/Documentation/devicetree/bindings/mfd/syscon.txt b/Documentation/devicetree/bindings/mfd/syscon.txt index 8b92d4576c42..25d9e9c2fd53 100644 --- a/Documentation/devicetree/bindings/mfd/syscon.txt +++ b/Documentation/devicetree/bindings/mfd/syscon.txt @@ -16,9 +16,17 @@ Required properties: Optional property: - reg-io-width: the size (in bytes) of the IO accesses that should be performed on the device. +- hwlocks: reference to a phandle of a hardware spinlock provider node. Examples: gpr: iomuxc-gpr@20e0000 { compatible = "fsl,imx6q-iomuxc-gpr", "syscon"; reg = <0x020e0000 0x38>; + hwlocks = <&hwlock1 1>; +}; + +hwlock1: hwspinlock@40500000 { + ... + reg = <0x40500000 0x1000>; + #hwlock-cells = <1>; }; diff --git a/Documentation/devicetree/bindings/mmc/mtk-sd.txt b/Documentation/devicetree/bindings/mmc/mtk-sd.txt index 72d2a734ab85..9b8017670870 100644 --- a/Documentation/devicetree/bindings/mmc/mtk-sd.txt +++ b/Documentation/devicetree/bindings/mmc/mtk-sd.txt @@ -12,6 +12,8 @@ Required properties: "mediatek,mt8173-mmc": for mmc host ip compatible with mt8173 "mediatek,mt2701-mmc": for mmc host ip compatible with mt2701 "mediatek,mt2712-mmc": for mmc host ip compatible with mt2712 + "mediatek,mt7623-mmc", "mediatek,mt2701-mmc": for MT7623 SoC + - reg: physical base address of the controller and length - interrupts: Should contain MSDC interrupt number - clocks: Should contain phandle for the clock feeding the MMC controller diff --git a/Documentation/devicetree/bindings/mmc/tmio_mmc.txt b/Documentation/devicetree/bindings/mmc/tmio_mmc.txt index 3c6762430fd9..d8685cb83325 100644 --- a/Documentation/devicetree/bindings/mmc/tmio_mmc.txt +++ b/Documentation/devicetree/bindings/mmc/tmio_mmc.txt @@ -26,6 +26,7 @@ Required properties: "renesas,sdhi-r8a7794" - SDHI IP on R8A7794 SoC "renesas,sdhi-r8a7795" - SDHI IP on R8A7795 SoC "renesas,sdhi-r8a7796" - SDHI IP on R8A7796 SoC + "renesas,sdhi-r8a77995" - SDHI IP on R8A77995 SoC "renesas,sdhi-shmobile" - a generic sh-mobile SDHI controller "renesas,rcar-gen1-sdhi" - a generic R-Car Gen1 SDHI controller "renesas,rcar-gen2-sdhi" - a generic R-Car Gen2 or RZ/G1 diff --git a/Documentation/devicetree/bindings/mtd/fsl-quadspi.txt b/Documentation/devicetree/bindings/mtd/fsl-quadspi.txt index c34aa6f8a424..63d4d626fbd5 100644 --- a/Documentation/devicetree/bindings/mtd/fsl-quadspi.txt +++ b/Documentation/devicetree/bindings/mtd/fsl-quadspi.txt @@ -12,7 +12,7 @@ Required properties: - reg-names: Should contain the reg names "QuadSPI" and "QuadSPI-memory" - interrupts : Should contain the interrupt for the device - clocks : The clocks needed by the QuadSPI controller - - clock-names : the name of the clocks + - clock-names : Should contain the name of the clocks: "qspi_en" and "qspi". Optional properties: - fsl,qspi-has-second-chip: The controller has two buses, bus A and bus B. diff --git a/Documentation/devicetree/bindings/mtd/gpmc-onenand.txt b/Documentation/devicetree/bindings/mtd/gpmc-onenand.txt index b6e8bfd024f4..e9f01a963a0a 100644 --- a/Documentation/devicetree/bindings/mtd/gpmc-onenand.txt +++ b/Documentation/devicetree/bindings/mtd/gpmc-onenand.txt @@ -9,13 +9,14 @@ Documentation/devicetree/bindings/memory-controllers/omap-gpmc.txt Required properties: + - compatible: "ti,omap2-onenand" - reg: The CS line the peripheral is connected to - - gpmc,device-width Width of the ONENAND device connected to the GPMC + - gpmc,device-width: Width of the ONENAND device connected to the GPMC in bytes. Must be 1 or 2. Optional properties: - - dma-channel: DMA Channel index + - int-gpios: GPIO specifier for the INT pin. For inline partition table parsing (optional): @@ -35,6 +36,7 @@ Example for an OMAP3430 board: #size-cells = <1>; onenand@0 { + compatible = "ti,omap2-onenand"; reg = <0 0 0>; /* CS0, offset 0 */ gpmc,device-width = <2>; diff --git a/Documentation/devicetree/bindings/mtd/jedec,spi-nor.txt b/Documentation/devicetree/bindings/mtd/jedec,spi-nor.txt index 376fa2f50e6b..956bb046e599 100644 --- a/Documentation/devicetree/bindings/mtd/jedec,spi-nor.txt +++ b/Documentation/devicetree/bindings/mtd/jedec,spi-nor.txt @@ -13,7 +13,6 @@ Required properties: at25df321a at25df641 at26df081a - en25s64 mr25h128 mr25h256 mr25h10 @@ -33,7 +32,6 @@ Required properties: s25fl008k s25fl064k sst25vf040b - sst25wf040b m25p40 m25p80 m25p16 diff --git a/Documentation/devicetree/bindings/mtd/marvell-nand.txt b/Documentation/devicetree/bindings/mtd/marvell-nand.txt new file mode 100644 index 000000000000..c08fb477b3c6 --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/marvell-nand.txt @@ -0,0 +1,123 @@ +Marvell NAND Flash Controller (NFC) + +Required properties: +- compatible: can be one of the following: + * "marvell,armada-8k-nand-controller" + * "marvell,armada370-nand-controller" + * "marvell,pxa3xx-nand-controller" + * "marvell,armada-8k-nand" (deprecated) + * "marvell,armada370-nand" (deprecated) + * "marvell,pxa3xx-nand" (deprecated) + Compatibles marked deprecated support only the old bindings described + at the bottom. +- reg: NAND flash controller memory area. +- #address-cells: shall be set to 1. Encode the NAND CS. +- #size-cells: shall be set to 0. +- interrupts: shall define the NAND controller interrupt. +- clocks: shall reference the NAND controller clock. +- marvell,system-controller: Set to retrieve the syscon node that handles + NAND controller related registers (only required with the + "marvell,armada-8k-nand[-controller]" compatibles). + +Optional properties: +- label: see partition.txt. New platforms shall omit this property. +- dmas: shall reference DMA channel associated to the NAND controller. + This property is only used with "marvell,pxa3xx-nand[-controller]" + compatible strings. +- dma-names: shall be "rxtx". + This property is only used with "marvell,pxa3xx-nand[-controller]" + compatible strings. + +Optional children nodes: +Children nodes represent the available NAND chips. + +Required properties: +- reg: shall contain the native Chip Select ids (0-3). +- nand-rb: see nand.txt (0-1). + +Optional properties: +- marvell,nand-keep-config: orders the driver not to take the timings + from the core and leaving them completely untouched. Bootloader + timings will then be used. +- label: MTD name. +- nand-on-flash-bbt: see nand.txt. +- nand-ecc-mode: see nand.txt. Will use hardware ECC if not specified. +- nand-ecc-algo: see nand.txt. This property is essentially useful when + not using hardware ECC. Howerver, it may be added when using hardware + ECC for clarification but will be ignored by the driver because ECC + mode is chosen depending on the page size and the strength required by + the NAND chip. This value may be overwritten with nand-ecc-strength + property. +- nand-ecc-strength: see nand.txt. +- nand-ecc-step-size: see nand.txt. Marvell's NAND flash controller does + use fixed strength (1-bit for Hamming, 16-bit for BCH), so the actual + step size will shrink or grow in order to fit the required strength. + Step sizes are not completely random for all and follow certain + patterns described in AN-379, "Marvell SoC NFC ECC". + +See Documentation/devicetree/bindings/mtd/nand.txt for more details on +generic bindings. + + +Example: +nand_controller: nand-controller@d0000 { + compatible = "marvell,armada370-nand-controller"; + reg = <0xd0000 0x54>; + #address-cells = <1>; + #size-cells = <0>; + interrupts = <GIC_SPI 84 IRQ_TYPE_LEVEL_HIGH>; + clocks = <&coredivclk 0>; + + nand@0 { + reg = <0>; + label = "main-storage"; + nand-rb = <0>; + nand-ecc-mode = "hw"; + marvell,nand-keep-config; + nand-on-flash-bbt; + nand-ecc-strength = <4>; + nand-ecc-step-size = <512>; + + partitions { + compatible = "fixed-partitions"; + #address-cells = <1>; + #size-cells = <1>; + + partition@0 { + label = "Rootfs"; + reg = <0x00000000 0x40000000>; + }; + }; + }; +}; + + +Note on legacy bindings: One can find, in not-updated device trees, +bindings slightly different than described above with other properties +described below as well as the partitions node at the root of a so +called "nand" node (without clear controller/chip separation). + +Legacy properties: +- marvell,nand-enable-arbiter: To enable the arbiter, all boards blindly + used it, this bit was set by the bootloader for many boards and even if + it is marked reserved in several datasheets, it might be needed to set + it (otherwise it is harmless) so whether or not this property is set, + the bit is selected by the driver. +- num-cs: Number of chip-select lines to use, all boards blindly set 1 + to this and for a reason, other values would have failed. The value of + this property is ignored. + +Example: + + nand0: nand@43100000 { + compatible = "marvell,pxa3xx-nand"; + reg = <0x43100000 90>; + interrupts = <45>; + dmas = <&pdma 97 0>; + dma-names = "rxtx"; + #address-cells = <1>; + marvell,nand-keep-config; + marvell,nand-enable-arbiter; + num-cs = <1>; + /* Partitions (optional) */ + }; diff --git a/Documentation/devicetree/bindings/mtd/mtk-nand.txt b/Documentation/devicetree/bindings/mtd/mtk-nand.txt index 0431841de781..1c88526dedfc 100644 --- a/Documentation/devicetree/bindings/mtd/mtk-nand.txt +++ b/Documentation/devicetree/bindings/mtd/mtk-nand.txt @@ -12,8 +12,10 @@ tree nodes. The first part of NFC is NAND Controller Interface (NFI) HW. Required NFI properties: -- compatible: Should be one of "mediatek,mt2701-nfc", - "mediatek,mt2712-nfc". +- compatible: Should be one of + "mediatek,mt2701-nfc", + "mediatek,mt2712-nfc", + "mediatek,mt7622-nfc". - reg: Base physical address and size of NFI. - interrupts: Interrupts of NFI. - clocks: NFI required clocks. @@ -142,7 +144,10 @@ Example: ============== Required BCH properties: -- compatible: Should be one of "mediatek,mt2701-ecc", "mediatek,mt2712-ecc". +- compatible: Should be one of + "mediatek,mt2701-ecc", + "mediatek,mt2712-ecc", + "mediatek,mt7622-ecc". - reg: Base physical address and size of ECC. - interrupts: Interrupts of ECC. - clocks: ECC required clocks. diff --git a/Documentation/devicetree/bindings/mtd/nand.txt b/Documentation/devicetree/bindings/mtd/nand.txt index 133f3813719c..8bb11d809429 100644 --- a/Documentation/devicetree/bindings/mtd/nand.txt +++ b/Documentation/devicetree/bindings/mtd/nand.txt @@ -43,6 +43,7 @@ Optional NAND chip properties: This is particularly useful when only the in-band area is used by the upper layers, and you want to make your NAND as reliable as possible. +- nand-rb: shall contain the native Ready/Busy ids. The ECC strength and ECC step size properties define the correction capability of a controller. Together, they say a controller can correct "{strength} bit diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt index 9d733af26be7..4e4f30288c8b 100644 --- a/Documentation/devicetree/bindings/opp/opp.txt +++ b/Documentation/devicetree/bindings/opp/opp.txt @@ -45,6 +45,11 @@ Devices supporting OPPs must set their "operating-points-v2" property with phandle to a OPP table in their DT node. The OPP core will use this phandle to find the operating points for the device. +This can contain more than one phandle for power domain providers that provide +multiple power domains. That is, one phandle for each power domain. If only one +phandle is available, then the same OPP table will be used for all power domains +provided by the power domain provider. + If required, this can be extended for SoC vendor specific bindings. Such bindings should be documented as Documentation/devicetree/bindings/power/<vendor>-opp.txt and should have a compatible description like: "operating-points-v2-<vendor>". @@ -154,6 +159,14 @@ Optional properties: - status: Marks the node enabled/disabled. +- required-opp: This contains phandle to an OPP node in another device's OPP + table. It may contain an array of phandles, where each phandle points to an + OPP of a different device. It should not contain multiple phandles to the OPP + nodes in the same OPP table. This specifies the minimum required OPP of the + device(s), whose OPP's phandle is present in this property, for the + functioning of the current device at the current OPP (where this property is + present). + Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together. / { diff --git a/Documentation/devicetree/bindings/opp/ti-omap5-opp-supply.txt b/Documentation/devicetree/bindings/opp/ti-omap5-opp-supply.txt new file mode 100644 index 000000000000..832346e489a3 --- /dev/null +++ b/Documentation/devicetree/bindings/opp/ti-omap5-opp-supply.txt @@ -0,0 +1,63 @@ +Texas Instruments OMAP compatible OPP supply description + +OMAP5, DRA7, and AM57 family of SoCs have Class0 AVS eFuse registers which +contain data that can be used to adjust voltages programmed for some of their +supplies for more efficient operation. This binding provides the information +needed to read these values and use them to program the main regulator during +an OPP transitions. + +Also, some supplies may have an associated vbb-supply which is an Adaptive Body +Bias regulator which much be transitioned in a specific sequence with regards +to the vdd-supply and clk when making an OPP transition. By supplying two +regulators to the device that will undergo OPP transitions we can make use +of the multi regulator binding that is part of the OPP core described here [1] +to describe both regulators needed by the platform. + +[1] Documentation/devicetree/bindings/opp/opp.txt + +Required Properties for Device Node: +- vdd-supply: phandle to regulator controlling VDD supply +- vbb-supply: phandle to regulator controlling Body Bias supply + (Usually Adaptive Body Bias regulator) + +Required Properties for opp-supply node: +- compatible: Should be one of: + "ti,omap-opp-supply" - basic OPP supply controlling VDD and VBB + "ti,omap5-opp-supply" - OMAP5+ optimized voltages in efuse(class0)VDD + along with VBB + "ti,omap5-core-opp-supply" - OMAP5+ optimized voltages in efuse(class0) VDD + but no VBB. +- reg: Address and length of the efuse register set for the device (mandatory + only for "ti,omap5-opp-supply") +- ti,efuse-settings: An array of u32 tuple items providing information about + optimized efuse configuration. Each item consists of the following: + volt: voltage in uV - reference voltage (OPP voltage) + efuse_offseet: efuse offset from reg where the optimized voltage is stored. +- ti,absolute-max-voltage-uv: absolute maximum voltage for the OPP supply. + +Example: + +/* Device Node (CPU) */ +cpus { + cpu0: cpu@0 { + device_type = "cpu"; + + ... + + vdd-supply = <&vcc>; + vbb-supply = <&abb_mpu>; + }; +}; + +/* OMAP OPP Supply with Class0 registers */ +opp_supply_mpu: opp_supply@4a003b20 { + compatible = "ti,omap5-opp-supply"; + reg = <0x4a003b20 0x8>; + ti,efuse-settings = < + /* uV offset */ + 1060000 0x0 + 1160000 0x4 + 1210000 0x8 + >; + ti,absolute-max-voltage-uv = <1500000>; +}; diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt index 14bd9e945ff6..f3355313c020 100644 --- a/Documentation/devicetree/bindings/power/power_domain.txt +++ b/Documentation/devicetree/bindings/power/power_domain.txt @@ -40,6 +40,12 @@ Optional properties: domain's idle states. In the absence of this property, the domain would be considered as capable of being powered-on or powered-off. +- operating-points-v2 : Phandles to the OPP tables of power domains provided by + a power domain provider. If the provider provides a single power domain only + or all the power domains provided by the provider have identical OPP tables, + then this shall contain a single phandle. Refer to ../opp/opp.txt for more + information. + Example: power: power-controller@12340000 { @@ -120,4 +126,63 @@ The node above defines a typical PM domain consumer device, which is located inside a PM domain with index 0 of a power controller represented by a node with the label "power". +Optional properties: +- required-opp: This contains phandle to an OPP node in another device's OPP + table. It may contain an array of phandles, where each phandle points to an + OPP of a different device. It should not contain multiple phandles to the OPP + nodes in the same OPP table. This specifies the minimum required OPP of the + device(s), whose OPP's phandle is present in this property, for the + functioning of the current device at the current OPP (where this property is + present). + +Example: +- OPP table for domain provider that provides two domains. + + domain0_opp_table: opp-table0 { + compatible = "operating-points-v2"; + + domain0_opp_0: opp-1000000000 { + opp-hz = /bits/ 64 <1000000000>; + opp-microvolt = <975000 970000 985000>; + }; + domain0_opp_1: opp-1100000000 { + opp-hz = /bits/ 64 <1100000000>; + opp-microvolt = <1000000 980000 1010000>; + }; + }; + + domain1_opp_table: opp-table1 { + compatible = "operating-points-v2"; + + domain1_opp_0: opp-1200000000 { + opp-hz = /bits/ 64 <1200000000>; + opp-microvolt = <975000 970000 985000>; + }; + domain1_opp_1: opp-1300000000 { + opp-hz = /bits/ 64 <1300000000>; + opp-microvolt = <1000000 980000 1010000>; + }; + }; + + power: power-controller@12340000 { + compatible = "foo,power-controller"; + reg = <0x12340000 0x1000>; + #power-domain-cells = <1>; + operating-points-v2 = <&domain0_opp_table>, <&domain1_opp_table>; + }; + + leaky-device0@12350000 { + compatible = "foo,i-leak-current"; + reg = <0x12350000 0x1000>; + power-domains = <&power 0>; + required-opp = <&domain0_opp_0>; + }; + + leaky-device1@12350000 { + compatible = "foo,i-leak-current"; + reg = <0x12350000 0x1000>; + power-domains = <&power 1>; + required-opp = <&domain1_opp_1>; + }; + [1]. Documentation/devicetree/bindings/power/domain-idle-state.txt diff --git a/Documentation/devicetree/bindings/power/reset/imx-snvs-poweroff.txt b/Documentation/devicetree/bindings/power/reset/imx-snvs-poweroff.txt deleted file mode 100644 index 1b81fcd9fb72..000000000000 --- a/Documentation/devicetree/bindings/power/reset/imx-snvs-poweroff.txt +++ /dev/null @@ -1,23 +0,0 @@ -i.mx6 Poweroff Driver - -SNVS_LPCR in SNVS module can power off the whole system by pull -PMIC_ON_REQ low if PMIC_ON_REQ is connected with external PMIC. -If you don't want to use PMIC_ON_REQ as power on/off control, -please set status='disabled' to disable this driver. - -Required Properties: --compatible: "fsl,sec-v4.0-poweroff" --reg: Specifies the physical address of the SNVS_LPCR register - -Example: - snvs@20cc000 { - compatible = "fsl,sec-v4.0-mon", "simple-bus"; - #address-cells = <1>; - #size-cells = <1>; - ranges = <0 0x020cc000 0x4000>; - ..... - snvs_poweroff: snvs-poweroff@38 { - compatible = "fsl,sec-v4.0-poweroff"; - reg = <0x38 0x4>; - }; - } diff --git a/Documentation/devicetree/bindings/power/supply/bq27xxx.txt b/Documentation/devicetree/bindings/power/supply/bq27xxx.txt index 6858e1a804ad..615c1cb6889f 100644 --- a/Documentation/devicetree/bindings/power/supply/bq27xxx.txt +++ b/Documentation/devicetree/bindings/power/supply/bq27xxx.txt @@ -15,6 +15,7 @@ Required properties: * "ti,bq27520g2" - BQ27520-g2 * "ti,bq27520g3" - BQ27520-g3 * "ti,bq27520g4" - BQ27520-g4 + * "ti,bq27521" - BQ27521 * "ti,bq27530" - BQ27530 * "ti,bq27531" - BQ27531 * "ti,bq27541" - BQ27541 diff --git a/Documentation/devicetree/bindings/regulator/regulator.txt b/Documentation/devicetree/bindings/regulator/regulator.txt index 3cbf56ce66ea..2babe15b618d 100644 --- a/Documentation/devicetree/bindings/regulator/regulator.txt +++ b/Documentation/devicetree/bindings/regulator/regulator.txt @@ -42,8 +42,16 @@ Optional properties: - regulator-state-[mem/disk] node has following common properties: - regulator-on-in-suspend: regulator should be on in suspend state. - regulator-off-in-suspend: regulator should be off in suspend state. - - regulator-suspend-microvolt: regulator should be set to this voltage - in suspend. + - regulator-suspend-min-microvolt: minimum voltage may be set in + suspend state. + - regulator-suspend-max-microvolt: maximum voltage may be set in + suspend state. + - regulator-suspend-microvolt: the default voltage which regulator + would be set in suspend. This property is now deprecated, instead + setting voltage for suspend mode via the API which regulator + driver provides is recommended. + - regulator-changeable-in-suspend: whether the default voltage and + the regulator on/off in suspend can be changed in runtime. - regulator-mode: operating mode in the given suspend state. The set of possible operating modes depends on the capabilities of every hardware so the valid modes are documented on each regulator diff --git a/Documentation/devicetree/bindings/regulator/sprd,sc2731-regulator.txt b/Documentation/devicetree/bindings/regulator/sprd,sc2731-regulator.txt new file mode 100644 index 000000000000..63dc07877cd6 --- /dev/null +++ b/Documentation/devicetree/bindings/regulator/sprd,sc2731-regulator.txt @@ -0,0 +1,43 @@ +Spreadtrum SC2731 Voltage regulators + +The SC2731 integrates low-voltage and low quiescent current DCDC/LDO. +14 LDO and 3 DCDCs are designed for external use. All DCDCs/LDOs have +their own bypass (power-down) control signals. External tantalum or MLCC +ceramic capacitors are recommended to use with these LDOs. + +Required properties: + - compatible: should be "sprd,sc27xx-regulator". + +List of regulators provided by this controller. It is named according to +its regulator type, BUCK_<name> and LDO_<name>. The definition for each +of these nodes is defined using the standard binding for regulators at +Documentation/devicetree/bindings/regulator/regulator.txt. + +The valid names for regulators are: +BUCK: + BUCK_CPU0, BUCK_CPU1, BUCK_RF +LDO: + LDO_CAMA0, LDO_CAMA1, LDO_CAMMOT, LDO_VLDO, LDO_EMMCCORE, LDO_SDCORE, + LDO_SDIO, LDO_WIFIPA, LDO_USB33, LDO_CAMD0, LDO_CAMD1, LDO_CON, + LDO_CAMIO, LDO_SRAM + +Example: + regulators { + compatible = "sprd,sc27xx-regulator"; + + vddarm0: BUCK_CPU0 { + regulator-name = "vddarm0"; + regulator-min-microvolt = <400000>; + regulator-max-microvolt = <1996875>; + regulator-ramp-delay = <25000>; + regulator-always-on; + }; + + vddcama0: LDO_CAMA0 { + regulator-name = "vddcama0"; + regulator-min-microvolt = <1200000>; + regulator-max-microvolt = <3750000>; + regulator-enable-ramp-delay = <100>; + }; + ... + }; diff --git a/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt b/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt index b6a869f97715..df3bef7998fa 100644 --- a/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt +++ b/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt @@ -8,7 +8,10 @@ Main node required properties: (b) "hisilicon,hip06-sas-v2" for v2 hw in hip06 chipset (c) "hisilicon,hip07-sas-v2" for v2 hw in hip07 chipset - sas-addr : array of 8 bytes for host SAS address - - reg : Address and length of the SAS register + - reg : Contains two regions. The first is the address and length of the SAS + register. The second is the address and length of CPLD register for + SGPIO control. The second is optional, and should be set only when + we use a CPLD for directly attached disk LED control. - hisilicon,sas-syscon: phandle of syscon used for sas control - ctrl-reset-reg : offset to controller reset register in ctrl reg - ctrl-reset-sts-reg : offset to controller reset status register in ctrl reg diff --git a/Documentation/devicetree/bindings/sound/da7218.txt b/Documentation/devicetree/bindings/sound/da7218.txt index 5ca5a709b6aa..3ab9dfef38d1 100644 --- a/Documentation/devicetree/bindings/sound/da7218.txt +++ b/Documentation/devicetree/bindings/sound/da7218.txt @@ -73,7 +73,7 @@ Example: compatible = "dlg,da7218"; reg = <0x1a>; interrupt-parent = <&gpio6>; - interrupts = <11 IRQ_TYPE_LEVEL_HIGH>; + interrupts = <11 IRQ_TYPE_LEVEL_LOW>; wakeup-source; VDD-supply = <®_audio>; diff --git a/Documentation/devicetree/bindings/sound/da7219.txt b/Documentation/devicetree/bindings/sound/da7219.txt index cf61681826b6..5b54d2d045c3 100644 --- a/Documentation/devicetree/bindings/sound/da7219.txt +++ b/Documentation/devicetree/bindings/sound/da7219.txt @@ -77,7 +77,7 @@ Example: reg = <0x1a>; interrupt-parent = <&gpio6>; - interrupts = <11 IRQ_TYPE_LEVEL_HIGH>; + interrupts = <11 IRQ_TYPE_LEVEL_LOW>; VDD-supply = <®_audio>; VDDMIC-supply = <®_audio>; diff --git a/Documentation/devicetree/bindings/sound/dmic.txt b/Documentation/devicetree/bindings/sound/dmic.txt index 54c8ef6498a8..f7bf65611453 100644 --- a/Documentation/devicetree/bindings/sound/dmic.txt +++ b/Documentation/devicetree/bindings/sound/dmic.txt @@ -7,10 +7,12 @@ Required properties: Optional properties: - dmicen-gpios: GPIO specifier for dmic to control start and stop + - num-channels: Number of microphones on this DAI Example node: dmic_codec: dmic@0 { compatible = "dmic-codec"; dmicen-gpios = <&gpio4 3 GPIO_ACTIVE_HIGH>; + num-channels = <1>; }; diff --git a/Documentation/devicetree/bindings/sound/max98373.txt b/Documentation/devicetree/bindings/sound/max98373.txt new file mode 100644 index 000000000000..456cb1c59353 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/max98373.txt @@ -0,0 +1,40 @@ +Maxim Integrated MAX98373 Speaker Amplifier + +This device supports I2C. + +Required properties: + + - compatible : "maxim,max98373" + + - reg : the I2C address of the device. + +Optional properties: + + - maxim,vmon-slot-no : slot number used to send voltage information + or in inteleave mode this will be used as + interleave slot. + slot range : 0 ~ 15, Default : 0 + + - maxim,imon-slot-no : slot number used to send current information + slot range : 0 ~ 15, Default : 0 + + - maxim,spkfb-slot-no : slot number used to send speaker feedback information + slot range : 0 ~ 15, Default : 0 + + - maxim,interleave-mode : For cases where a single combined channel + for the I/V sense data is not sufficient, the device can also be configured + to share a single data output channel on alternating frames. + In this configuration, the current and voltage data will be frame interleaved + on a single output channel. + Boolean, define to enable the interleave mode, Default : false + +Example: + +codec: max98373@31 { + compatible = "maxim,max98373"; + reg = <0x31>; + maxim,vmon-slot-no = <0>; + maxim,imon-slot-no = <1>; + maxim,spkfb-slot-no = <2>; + maxim,interleave-mode; +}; diff --git a/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt b/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt index 77a57f84bed4..6df87b97f7cb 100644 --- a/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt +++ b/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt @@ -2,153 +2,143 @@ Mediatek AFE PCM controller for mt2701 Required properties: - compatible = "mediatek,mt2701-audio"; -- reg: register location and size - interrupts: should contain AFE and ASYS interrupts - interrupt-names: should be "afe" and "asys" - power-domains: should define the power domain +- clocks: Must contain an entry for each entry in clock-names + See ../clocks/clock-bindings.txt for details - clock-names: should have these clock names: "infra_sys_audio_clk", "top_audio_mux1_sel", "top_audio_mux2_sel", - "top_audio_mux1_div", - "top_audio_mux2_div", - "top_audio_48k_timing", - "top_audio_44k_timing", - "top_audpll_mux_sel", - "top_apll_sel", - "top_aud1_pll_98M", - "top_aud2_pll_90M", - "top_hadds2_pll_98M", - "top_hadds2_pll_294M", - "top_audpll", - "top_audpll_d4", - "top_audpll_d8", - "top_audpll_d16", - "top_audpll_d24", - "top_audintbus_sel", - "clk_26m", - "top_syspll1_d4", - "top_aud_k1_src_sel", - "top_aud_k2_src_sel", - "top_aud_k3_src_sel", - "top_aud_k4_src_sel", - "top_aud_k5_src_sel", - "top_aud_k6_src_sel", - "top_aud_k1_src_div", - "top_aud_k2_src_div", - "top_aud_k3_src_div", - "top_aud_k4_src_div", - "top_aud_k5_src_div", - "top_aud_k6_src_div", - "top_aud_i2s1_mclk", - "top_aud_i2s2_mclk", - "top_aud_i2s3_mclk", - "top_aud_i2s4_mclk", - "top_aud_i2s5_mclk", - "top_aud_i2s6_mclk", - "top_asm_m_sel", - "top_asm_h_sel", - "top_univpll2_d4", - "top_univpll2_d2", - "top_syspll_d5"; + "top_audio_a1sys_hp", + "top_audio_a2sys_hp", + "i2s0_src_sel", + "i2s1_src_sel", + "i2s2_src_sel", + "i2s3_src_sel", + "i2s0_src_div", + "i2s1_src_div", + "i2s2_src_div", + "i2s3_src_div", + "i2s0_mclk_en", + "i2s1_mclk_en", + "i2s2_mclk_en", + "i2s3_mclk_en", + "i2so0_hop_ck", + "i2so1_hop_ck", + "i2so2_hop_ck", + "i2so3_hop_ck", + "i2si0_hop_ck", + "i2si1_hop_ck", + "i2si2_hop_ck", + "i2si3_hop_ck", + "asrc0_out_ck", + "asrc1_out_ck", + "asrc2_out_ck", + "asrc3_out_ck", + "audio_afe_pd", + "audio_afe_conn_pd", + "audio_a1sys_pd", + "audio_a2sys_pd", + "audio_mrgif_pd"; +- assigned-clocks: list of input clocks and dividers for the audio system. + See ../clocks/clock-bindings.txt for details. +- assigned-clocks-parents: parent of input clocks of assigned clocks. +- assigned-clock-rates: list of clock frequencies of assigned clocks. + +Must be a subnode of MediaTek audsys device tree node. +See ../arm/mediatek/mediatek,audsys.txt for details about the parent node. Example: - afe: mt2701-afe-pcm@11220000 { - compatible = "mediatek,mt2701-audio"; - reg = <0 0x11220000 0 0x2000>, - <0 0x112A0000 0 0x20000>; - interrupts = <GIC_SPI 104 IRQ_TYPE_LEVEL_LOW>, - <GIC_SPI 132 IRQ_TYPE_LEVEL_LOW>; - interrupt-names = "afe", "asys"; - power-domains = <&scpsys MT2701_POWER_DOMAIN_IFR_MSC>; - clocks = <&infracfg CLK_INFRA_AUDIO>, - <&topckgen CLK_TOP_AUD_MUX1_SEL>, - <&topckgen CLK_TOP_AUD_MUX2_SEL>, - <&topckgen CLK_TOP_AUD_MUX1_DIV>, - <&topckgen CLK_TOP_AUD_MUX2_DIV>, - <&topckgen CLK_TOP_AUD_48K_TIMING>, - <&topckgen CLK_TOP_AUD_44K_TIMING>, - <&topckgen CLK_TOP_AUDPLL_MUX_SEL>, - <&topckgen CLK_TOP_APLL_SEL>, - <&topckgen CLK_TOP_AUD1PLL_98M>, - <&topckgen CLK_TOP_AUD2PLL_90M>, - <&topckgen CLK_TOP_HADDS2PLL_98M>, - <&topckgen CLK_TOP_HADDS2PLL_294M>, - <&topckgen CLK_TOP_AUDPLL>, - <&topckgen CLK_TOP_AUDPLL_D4>, - <&topckgen CLK_TOP_AUDPLL_D8>, - <&topckgen CLK_TOP_AUDPLL_D16>, - <&topckgen CLK_TOP_AUDPLL_D24>, - <&topckgen CLK_TOP_AUDINTBUS_SEL>, - <&clk26m>, - <&topckgen CLK_TOP_SYSPLL1_D4>, - <&topckgen CLK_TOP_AUD_K1_SRC_SEL>, - <&topckgen CLK_TOP_AUD_K2_SRC_SEL>, - <&topckgen CLK_TOP_AUD_K3_SRC_SEL>, - <&topckgen CLK_TOP_AUD_K4_SRC_SEL>, - <&topckgen CLK_TOP_AUD_K5_SRC_SEL>, - <&topckgen CLK_TOP_AUD_K6_SRC_SEL>, - <&topckgen CLK_TOP_AUD_K1_SRC_DIV>, - <&topckgen CLK_TOP_AUD_K2_SRC_DIV>, - <&topckgen CLK_TOP_AUD_K3_SRC_DIV>, - <&topckgen CLK_TOP_AUD_K4_SRC_DIV>, - <&topckgen CLK_TOP_AUD_K5_SRC_DIV>, - <&topckgen CLK_TOP_AUD_K6_SRC_DIV>, - <&topckgen CLK_TOP_AUD_I2S1_MCLK>, - <&topckgen CLK_TOP_AUD_I2S2_MCLK>, - <&topckgen CLK_TOP_AUD_I2S3_MCLK>, - <&topckgen CLK_TOP_AUD_I2S4_MCLK>, - <&topckgen CLK_TOP_AUD_I2S5_MCLK>, - <&topckgen CLK_TOP_AUD_I2S6_MCLK>, - <&topckgen CLK_TOP_ASM_M_SEL>, - <&topckgen CLK_TOP_ASM_H_SEL>, - <&topckgen CLK_TOP_UNIVPLL2_D4>, - <&topckgen CLK_TOP_UNIVPLL2_D2>, - <&topckgen CLK_TOP_SYSPLL_D5>; + audsys: audio-subsystem@11220000 { + compatible = "mediatek,mt2701-audsys", "syscon", "simple-mfd"; + ... + + afe: audio-controller { + compatible = "mediatek,mt2701-audio"; + interrupts = <GIC_SPI 104 IRQ_TYPE_LEVEL_LOW>, + <GIC_SPI 132 IRQ_TYPE_LEVEL_LOW>; + interrupt-names = "afe", "asys"; + power-domains = <&scpsys MT2701_POWER_DOMAIN_IFR_MSC>; + + clocks = <&infracfg CLK_INFRA_AUDIO>, + <&topckgen CLK_TOP_AUD_MUX1_SEL>, + <&topckgen CLK_TOP_AUD_MUX2_SEL>, + <&topckgen CLK_TOP_AUD_48K_TIMING>, + <&topckgen CLK_TOP_AUD_44K_TIMING>, + <&topckgen CLK_TOP_AUD_K1_SRC_SEL>, + <&topckgen CLK_TOP_AUD_K2_SRC_SEL>, + <&topckgen CLK_TOP_AUD_K3_SRC_SEL>, + <&topckgen CLK_TOP_AUD_K4_SRC_SEL>, + <&topckgen CLK_TOP_AUD_K1_SRC_DIV>, + <&topckgen CLK_TOP_AUD_K2_SRC_DIV>, + <&topckgen CLK_TOP_AUD_K3_SRC_DIV>, + <&topckgen CLK_TOP_AUD_K4_SRC_DIV>, + <&topckgen CLK_TOP_AUD_I2S1_MCLK>, + <&topckgen CLK_TOP_AUD_I2S2_MCLK>, + <&topckgen CLK_TOP_AUD_I2S3_MCLK>, + <&topckgen CLK_TOP_AUD_I2S4_MCLK>, + <&audsys CLK_AUD_I2SO1>, + <&audsys CLK_AUD_I2SO2>, + <&audsys CLK_AUD_I2SO3>, + <&audsys CLK_AUD_I2SO4>, + <&audsys CLK_AUD_I2SIN1>, + <&audsys CLK_AUD_I2SIN2>, + <&audsys CLK_AUD_I2SIN3>, + <&audsys CLK_AUD_I2SIN4>, + <&audsys CLK_AUD_ASRCO1>, + <&audsys CLK_AUD_ASRCO2>, + <&audsys CLK_AUD_ASRCO3>, + <&audsys CLK_AUD_ASRCO4>, + <&audsys CLK_AUD_AFE>, + <&audsys CLK_AUD_AFE_CONN>, + <&audsys CLK_AUD_A1SYS>, + <&audsys CLK_AUD_A2SYS>, + <&audsys CLK_AUD_AFE_MRGIF>; + + clock-names = "infra_sys_audio_clk", + "top_audio_mux1_sel", + "top_audio_mux2_sel", + "top_audio_a1sys_hp", + "top_audio_a2sys_hp", + "i2s0_src_sel", + "i2s1_src_sel", + "i2s2_src_sel", + "i2s3_src_sel", + "i2s0_src_div", + "i2s1_src_div", + "i2s2_src_div", + "i2s3_src_div", + "i2s0_mclk_en", + "i2s1_mclk_en", + "i2s2_mclk_en", + "i2s3_mclk_en", + "i2so0_hop_ck", + "i2so1_hop_ck", + "i2so2_hop_ck", + "i2so3_hop_ck", + "i2si0_hop_ck", + "i2si1_hop_ck", + "i2si2_hop_ck", + "i2si3_hop_ck", + "asrc0_out_ck", + "asrc1_out_ck", + "asrc2_out_ck", + "asrc3_out_ck", + "audio_afe_pd", + "audio_afe_conn_pd", + "audio_a1sys_pd", + "audio_a2sys_pd", + "audio_mrgif_pd"; - clock-names = "infra_sys_audio_clk", - "top_audio_mux1_sel", - "top_audio_mux2_sel", - "top_audio_mux1_div", - "top_audio_mux2_div", - "top_audio_48k_timing", - "top_audio_44k_timing", - "top_audpll_mux_sel", - "top_apll_sel", - "top_aud1_pll_98M", - "top_aud2_pll_90M", - "top_hadds2_pll_98M", - "top_hadds2_pll_294M", - "top_audpll", - "top_audpll_d4", - "top_audpll_d8", - "top_audpll_d16", - "top_audpll_d24", - "top_audintbus_sel", - "clk_26m", - "top_syspll1_d4", - "top_aud_k1_src_sel", - "top_aud_k2_src_sel", - "top_aud_k3_src_sel", - "top_aud_k4_src_sel", - "top_aud_k5_src_sel", - "top_aud_k6_src_sel", - "top_aud_k1_src_div", - "top_aud_k2_src_div", - "top_aud_k3_src_div", - "top_aud_k4_src_div", - "top_aud_k5_src_div", - "top_aud_k6_src_div", - "top_aud_i2s1_mclk", - "top_aud_i2s2_mclk", - "top_aud_i2s3_mclk", - "top_aud_i2s4_mclk", - "top_aud_i2s5_mclk", - "top_aud_i2s6_mclk", - "top_asm_m_sel", - "top_asm_h_sel", - "top_univpll2_d4", - "top_univpll2_d2", - "top_syspll_d5"; + assigned-clocks = <&topckgen CLK_TOP_AUD_MUX1_SEL>, + <&topckgen CLK_TOP_AUD_MUX2_SEL>, + <&topckgen CLK_TOP_AUD_MUX1_DIV>, + <&topckgen CLK_TOP_AUD_MUX2_DIV>; + assigned-clock-parents = <&topckgen CLK_TOP_AUD1PLL_98M>, + <&topckgen CLK_TOP_AUD2PLL_90M>; + assigned-clock-rates = <0>, <0>, <49152000>, <45158400>; + }; }; diff --git a/Documentation/devicetree/bindings/sound/mxs-audio-sgtl5000.txt b/Documentation/devicetree/bindings/sound/mxs-audio-sgtl5000.txt index 601c518eddaa..4eb980bd0287 100644 --- a/Documentation/devicetree/bindings/sound/mxs-audio-sgtl5000.txt +++ b/Documentation/devicetree/bindings/sound/mxs-audio-sgtl5000.txt @@ -1,10 +1,31 @@ * Freescale MXS audio complex with SGTL5000 codec Required properties: -- compatible: "fsl,mxs-audio-sgtl5000" -- model: The user-visible name of this sound complex -- saif-controllers: The phandle list of the MXS SAIF controller -- audio-codec: The phandle of the SGTL5000 audio codec +- compatible : "fsl,mxs-audio-sgtl5000" +- model : The user-visible name of this sound complex +- saif-controllers : The phandle list of the MXS SAIF controller +- audio-codec : The phandle of the SGTL5000 audio codec +- audio-routing : A list of the connections between audio components. + Each entry is a pair of strings, the first being the + connection's sink, the second being the connection's + source. Valid names could be power supplies, SGTL5000 + pins, and the jacks on the board: + + Power supplies: + * Mic Bias + + SGTL5000 pins: + * MIC_IN + * LINE_IN + * HP_OUT + * LINE_OUT + + Board connectors: + * Mic Jack + * Line In Jack + * Headphone Jack + * Line Out Jack + * Ext Spk Example: @@ -14,4 +35,8 @@ sound { model = "imx28-evk-sgtl5000"; saif-controllers = <&saif0 &saif1>; audio-codec = <&sgtl5000>; + audio-routing = + "MIC_IN", "Mic Jack", + "Mic Jack", "Mic Bias", + "Headphone Jack", "HP_OUT"; }; diff --git a/Documentation/devicetree/bindings/sound/nau8825.txt b/Documentation/devicetree/bindings/sound/nau8825.txt index 2f5e973285a6..d16d96839bcb 100644 --- a/Documentation/devicetree/bindings/sound/nau8825.txt +++ b/Documentation/devicetree/bindings/sound/nau8825.txt @@ -69,7 +69,7 @@ Optional properties: - nuvoton,jack-insert-debounce: number from 0 to 7 that sets debounce time to 2^(n+2) ms - nuvoton,jack-eject-debounce: number from 0 to 7 that sets debounce time to 2^(n+2) ms - - nuvoton,crosstalk-bypass: make crosstalk function bypass if set. + - nuvoton,crosstalk-enable: make crosstalk function enable if set. - clocks: list of phandle and clock specifier pairs according to common clock bindings for the clocks described in clock-names @@ -98,7 +98,7 @@ Example: nuvoton,short-key-debounce = <2>; nuvoton,jack-insert-debounce = <7>; nuvoton,jack-eject-debounce = <7>; - nuvoton,crosstalk-bypass; + nuvoton,crosstalk-enable; clock-names = "mclk"; clocks = <&tegra_car TEGRA210_CLK_CLK_OUT_2>; diff --git a/Documentation/devicetree/bindings/sound/pcm186x.txt b/Documentation/devicetree/bindings/sound/pcm186x.txt new file mode 100644 index 000000000000..1087f4855980 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/pcm186x.txt @@ -0,0 +1,42 @@ +Texas Instruments PCM186x Universal Audio ADC + +These devices support both I2C and SPI (configured with pin strapping +on the board). + +Required properties: + + - compatible : "ti,pcm1862", + "ti,pcm1863", + "ti,pcm1864", + "ti,pcm1865" + + - reg : The I2C address of the device for I2C, the chip select + number for SPI. + + - avdd-supply: Analog core power supply (3.3v) + - dvdd-supply: Digital core power supply + - iovdd-supply: Digital IO power supply + See regulator/regulator.txt for more information + +CODEC input pins: + * VINL1 + * VINR1 + * VINL2 + * VINR2 + * VINL3 + * VINR3 + * VINL4 + * VINR4 + +The pins can be used in referring sound node's audio-routing property. + +Example: + + pcm186x: audio-codec@4a { + compatible = "ti,pcm1865"; + reg = <0x4a>; + + avdd-supply = <®_3v3_analog>; + dvdd-supply = <®_3v3>; + iovdd-supply = <®_1v8>; + }; diff --git a/Documentation/devicetree/bindings/sound/renesas,rsnd.txt b/Documentation/devicetree/bindings/sound/renesas,rsnd.txt index 085bec364caf..5bed9a595772 100644 --- a/Documentation/devicetree/bindings/sound/renesas,rsnd.txt +++ b/Documentation/devicetree/bindings/sound/renesas,rsnd.txt @@ -4,7 +4,7 @@ Renesas R-Car sound * Modules ============================================= -Renesas R-Car sound is constructed from below modules +Renesas R-Car and RZ/G sound is constructed from below modules (for Gen2 or later) SCU : Sampling Rate Converter Unit @@ -197,12 +197,17 @@ Ex) [MEM] -> [SRC2] -> [CTU03] -+ sound { + #address-cells = <1>; + #size-cells = <0>; + compatible = "simple-scu-audio-card"; ... - simple-audio-card,cpu-0 { + simple-audio-card,cpu@0 { + reg = <0>; sound-dai = <&rcar_sound 0>; }; - simple-audio-card,cpu-1 { + simple-audio-card,cpu@1 { + reg = <1>; sound-dai = <&rcar_sound 1>; }; simple-audio-card,codec { @@ -334,9 +339,11 @@ Required properties: - compatible : "renesas,rcar_sound-<soctype>", fallbacks "renesas,rcar_sound-gen1" if generation1, and - "renesas,rcar_sound-gen2" if generation2 + "renesas,rcar_sound-gen2" if generation2 (or RZ/G1) "renesas,rcar_sound-gen3" if generation3 Examples with soctypes are: + - "renesas,rcar_sound-r8a7743" (RZ/G1M) + - "renesas,rcar_sound-r8a7745" (RZ/G1E) - "renesas,rcar_sound-r8a7778" (R-Car M1A) - "renesas,rcar_sound-r8a7779" (R-Car H1) - "renesas,rcar_sound-r8a7790" (R-Car H2) diff --git a/Documentation/devicetree/bindings/sound/simple-card.txt b/Documentation/devicetree/bindings/sound/simple-card.txt index 166f2290233b..17c13e74667d 100644 --- a/Documentation/devicetree/bindings/sound/simple-card.txt +++ b/Documentation/devicetree/bindings/sound/simple-card.txt @@ -140,6 +140,7 @@ sound { simple-audio-card,name = "Cubox Audio"; simple-audio-card,dai-link@0 { /* I2S - HDMI */ + reg = <0>; format = "i2s"; cpu { sound-dai = <&audio1 0>; @@ -150,6 +151,7 @@ sound { }; simple-audio-card,dai-link@1 { /* S/PDIF - HDMI */ + reg = <1>; cpu { sound-dai = <&audio1 1>; }; @@ -159,6 +161,7 @@ sound { }; simple-audio-card,dai-link@2 { /* S/PDIF - S/PDIF */ + reg = <2>; cpu { sound-dai = <&audio1 1>; }; diff --git a/Documentation/devicetree/bindings/sound/st,stm32-adfsdm.txt b/Documentation/devicetree/bindings/sound/st,stm32-adfsdm.txt new file mode 100644 index 000000000000..864f5b00b031 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/st,stm32-adfsdm.txt @@ -0,0 +1,63 @@ +STMicroelectronics Audio Digital Filter Sigma Delta modulators(DFSDM) + +The DFSDM allows PDM microphones capture through SPI interface. The Audio +interface is seems as a sub block of the DFSDM device. +For details on DFSDM bindings refer to ../iio/adc/st,stm32-dfsdm-adc.txt + +Required properties: + - compatible: "st,stm32h7-dfsdm-dai". + + - #sound-dai-cells : Must be equal to 0 + + - io-channels : phandle to iio dfsdm instance node. + +Example of a sound card using audio DFSDM node. + + sound_card { + compatible = "audio-graph-card"; + + dais = <&cpu_port>; + }; + + dfsdm: dfsdm@40017000 { + compatible = "st,stm32h7-dfsdm"; + reg = <0x40017000 0x400>; + clocks = <&rcc DFSDM1_CK>; + clock-names = "dfsdm"; + #interrupt-cells = <1>; + #address-cells = <1>; + #size-cells = <0>; + + dfsdm_adc0: filter@0 { + compatible = "st,stm32-dfsdm-dmic"; + reg = <0>; + interrupts = <110>; + dmas = <&dmamux1 101 0x400 0x00>; + dma-names = "rx"; + st,adc-channels = <1>; + st,adc-channel-names = "dmic0"; + st,adc-channel-types = "SPI_R"; + st,adc-channel-clk-src = "CLKOUT"; + st,filter-order = <5>; + + dfsdm_dai0: dfsdm-dai { + compatible = "st,stm32h7-dfsdm-dai"; + #sound-dai-cells = <0>; + io-channels = <&dfsdm_adc0 0>; + cpu_port: port { + dfsdm_endpoint: endpoint { + remote-endpoint = <&dmic0_endpoint>; + }; + }; + }; + }; + + dmic0: dmic@0 { + compatible = "dmic-codec"; + #sound-dai-cells = <0>; + port { + dmic0_endpoint: endpoint { + remote-endpoint = <&dfsdm_endpoint>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/sound/st,stm32-sai.txt b/Documentation/devicetree/bindings/sound/st,stm32-sai.txt index 1f9cd7095337..b1acc1a256ba 100644 --- a/Documentation/devicetree/bindings/sound/st,stm32-sai.txt +++ b/Documentation/devicetree/bindings/sound/st,stm32-sai.txt @@ -20,11 +20,6 @@ Required properties: Optional properties: - resets: Reference to a reset controller asserting the SAI - - st,sync: specify synchronization mode. - By default SAI sub-block is in asynchronous mode. - This property sets SAI sub-block as slave of another SAI sub-block. - Must contain the phandle and index of the sai sub-block providing - the synchronization. SAI subnodes: Two subnodes corresponding to SAI sub-block instances A et B can be defined. @@ -44,6 +39,13 @@ SAI subnodes required properties: - pinctrl-names: should contain only value "default" - pinctrl-0: see Documentation/devicetree/bindings/pinctrl/pinctrl-stm32.txt +SAI subnodes Optional properties: + - st,sync: specify synchronization mode. + By default SAI sub-block is in asynchronous mode. + This property sets SAI sub-block as slave of another SAI sub-block. + Must contain the phandle and index of the sai sub-block providing + the synchronization. + The device node should contain one 'port' child node with one child 'endpoint' node, according to the bindings defined in Documentation/devicetree/bindings/ graph.txt. diff --git a/Documentation/devicetree/bindings/sound/sun4i-i2s.txt b/Documentation/devicetree/bindings/sound/sun4i-i2s.txt index 05d7135a8d2f..b9d50d6cdef3 100644 --- a/Documentation/devicetree/bindings/sound/sun4i-i2s.txt +++ b/Documentation/devicetree/bindings/sound/sun4i-i2s.txt @@ -8,6 +8,7 @@ Required properties: - compatible: should be one of the following: - "allwinner,sun4i-a10-i2s" - "allwinner,sun6i-a31-i2s" + - "allwinner,sun8i-a83t-i2s" - "allwinner,sun8i-h3-i2s" - reg: physical base address of the controller and length of memory mapped region. @@ -23,6 +24,7 @@ Required properties: Required properties for the following compatibles: - "allwinner,sun6i-a31-i2s" + - "allwinner,sun8i-a83t-i2s" - "allwinner,sun8i-h3-i2s" - resets: phandle to the reset line for this codec diff --git a/Documentation/devicetree/bindings/sound/tas5720.txt b/Documentation/devicetree/bindings/sound/tas5720.txt index 40d94f82beb3..7481653fe8e3 100644 --- a/Documentation/devicetree/bindings/sound/tas5720.txt +++ b/Documentation/devicetree/bindings/sound/tas5720.txt @@ -6,10 +6,12 @@ audio playback. For more product information please see the links below: http://www.ti.com/product/TAS5720L http://www.ti.com/product/TAS5720M +http://www.ti.com/product/TAS5722L Required properties: -- compatible : "ti,tas5720" +- compatible : "ti,tas5720", + "ti,tas5722" - reg : I2C slave address - dvdd-supply : phandle to a 3.3-V supply for the digital circuitry - pvdd-supply : phandle to a supply used for the Class-D amp and the analog diff --git a/Documentation/devicetree/bindings/sound/tfa9879.txt b/Documentation/devicetree/bindings/sound/tfa9879.txt index 23ba522d9e2b..1620e6848436 100644 --- a/Documentation/devicetree/bindings/sound/tfa9879.txt +++ b/Documentation/devicetree/bindings/sound/tfa9879.txt @@ -6,18 +6,18 @@ Required properties: - reg : the I2C address of the device +- #sound-dai-cells : must be 0. + Example: &i2c1 { - clock-frequency = <100000>; pinctrl-names = "default"; pinctrl-0 = <&pinctrl_i2c1>; - status = "okay"; - codec: tfa9879@6c { + amp: amp@6c { #sound-dai-cells = <0>; compatible = "nxp,tfa9879"; reg = <0x6c>; - }; + }; }; diff --git a/Documentation/devicetree/bindings/sound/ti,tas6424.txt b/Documentation/devicetree/bindings/sound/ti,tas6424.txt new file mode 100644 index 000000000000..1c4ada0eef4e --- /dev/null +++ b/Documentation/devicetree/bindings/sound/ti,tas6424.txt @@ -0,0 +1,20 @@ +Texas Instruments TAS6424 Quad-Channel Audio amplifier + +The TAS6424 serial control bus communicates through I2C protocols. + +Required properties: + - compatible: "ti,tas6424" - TAS6424 + - reg: I2C slave address + - sound-dai-cells: must be equal to 0 + +Example: + +tas6424: tas6424@6a { + compatible = "ti,tas6424"; + reg = <0x6a>; + + #sound-dai-cells = <0>; +}; + +For more product information please see the link below: +http://www.ti.com/product/TAS6424-Q1 diff --git a/Documentation/devicetree/bindings/sound/tlv320aic31xx.txt b/Documentation/devicetree/bindings/sound/tlv320aic31xx.txt index 6fbba562eaa7..5b3c33bb99e5 100644 --- a/Documentation/devicetree/bindings/sound/tlv320aic31xx.txt +++ b/Documentation/devicetree/bindings/sound/tlv320aic31xx.txt @@ -22,7 +22,7 @@ Required properties: Optional properties: -- gpio-reset - gpio pin number used for codec reset +- reset-gpios - GPIO specification for the active low RESET input. - ai31xx-micbias-vg - MicBias Voltage setting 1 or MICBIAS_2_0V - MICBIAS output is powered to 2.0V 2 or MICBIAS_2_5V - MICBIAS output is powered to 2.5V @@ -30,6 +30,10 @@ Optional properties: If this node is not mentioned or if the value is unknown, then micbias is set to 2.0V. +Deprecated properties: + +- gpio-reset - gpio pin number used for codec reset + CODEC output pins: * HPL * HPR @@ -48,6 +52,7 @@ CODEC input pins: The pins can be used in referring sound node's audio-routing property. Example: +#include <dt-bindings/gpio/gpio.h> #include <dt-bindings/sound/tlv320aic31xx-micbias.h> tlv320aic31xx: tlv320aic31xx@18 { @@ -56,6 +61,8 @@ tlv320aic31xx: tlv320aic31xx@18 { ai31xx-micbias-vg = <MICBIAS_OFF>; + reset-gpios = <&gpio1 17 GPIO_ACTIVE_LOW>; + HPVDD-supply = <®ulator>; SPRVDD-supply = <®ulator>; SPLVDD-supply = <®ulator>; diff --git a/Documentation/devicetree/bindings/sound/tlv320aic3x.txt b/Documentation/devicetree/bindings/sound/tlv320aic3x.txt index ba5b45c483f5..9796c4639262 100644 --- a/Documentation/devicetree/bindings/sound/tlv320aic3x.txt +++ b/Documentation/devicetree/bindings/sound/tlv320aic3x.txt @@ -17,7 +17,7 @@ Required properties: Optional properties: -- gpio-reset - gpio pin number used for codec reset +- reset-gpios - GPIO specification for the active low RESET input. - ai3x-gpio-func - <array of 2 int> - AIC3X_GPIO1 & AIC3X_GPIO2 Functionality - Not supported on tlv320aic3104 - ai3x-micbias-vg - MicBias Voltage required. @@ -34,6 +34,10 @@ Optional properties: - AVDD-supply, IOVDD-supply, DRVDD-supply, DVDD-supply : power supplies for the device as covered in Documentation/devicetree/bindings/regulator/regulator.txt +Deprecated properties: + +- gpio-reset - gpio pin number used for codec reset + CODEC output pins: * LLOUT * RLOUT @@ -61,10 +65,14 @@ The pins can be used in referring sound node's audio-routing property. Example: +#include <dt-bindings/gpio/gpio.h> + tlv320aic3x: tlv320aic3x@1b { compatible = "ti,tlv320aic3x"; reg = <0x1b>; + reset-gpios = <&gpio1 17 GPIO_ACTIVE_LOW>; + AVDD-supply = <®ulator>; IOVDD-supply = <®ulator>; DRVDD-supply = <®ulator>; diff --git a/Documentation/devicetree/bindings/sound/tscs42xx.txt b/Documentation/devicetree/bindings/sound/tscs42xx.txt new file mode 100644 index 000000000000..2ac2f0996697 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/tscs42xx.txt @@ -0,0 +1,16 @@ +TSCS42XX Audio CODEC + +Required Properties: + + - compatible : "tempo,tscs42A1" for analog mic + "tempo,tscs42A2" for digital mic + + - reg : <0x71> for analog mic + <0x69> for digital mic + +Example: + +wookie: codec@69 { + compatible = "tempo,tscs42A2"; + reg = <0x69>; +}; diff --git a/Documentation/devicetree/bindings/sound/uniphier,evea.txt b/Documentation/devicetree/bindings/sound/uniphier,evea.txt new file mode 100644 index 000000000000..3f31b235f18b --- /dev/null +++ b/Documentation/devicetree/bindings/sound/uniphier,evea.txt @@ -0,0 +1,26 @@ +Socionext EVEA - UniPhier SoC internal codec driver + +Required properties: +- compatible : should be "socionext,uniphier-evea". +- reg : offset and length of the register set for the device. +- clock-names : should include following entries: + "evea", "exiv" +- clocks : a list of phandle, should contain an entry for each + entries in clock-names. +- reset-names : should include following entries: + "evea", "exiv", "adamv" +- resets : a list of phandle, should contain reset entries of + reset-names. +- #sound-dai-cells: should be 1. + +Example: + + codec { + compatible = "socionext,uniphier-evea"; + reg = <0x57900000 0x1000>; + clock-names = "evea", "exiv"; + clocks = <&sys_clk 41>, <&sys_clk 42>; + reset-names = "evea", "exiv", "adamv"; + resets = <&sys_rst 41>, <&sys_rst 42>, <&adamv_rst 0>; + #sound-dai-cells = <1>; + }; diff --git a/Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt b/Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt index 5bf13960f7f4..e3c48b20b1a6 100644 --- a/Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt +++ b/Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt @@ -12,24 +12,30 @@ Required properties: - "fsl,imx53-ecspi" for SPI compatible with the one integrated on i.MX53 and later Soc - reg : Offset and length of the register set for the device - interrupts : Should contain CSPI/eCSPI interrupt -- cs-gpios : Specifies the gpio pins to be used for chipselects. - clocks : Clock specifiers for both ipg and per clocks. - clock-names : Clock names should include both "ipg" and "per" See the clock consumer binding, Documentation/devicetree/bindings/clock/clock-bindings.txt -- dmas: DMA specifiers for tx and rx dma. See the DMA client binding, - Documentation/devicetree/bindings/dma/dma.txt -- dma-names: DMA request names should include "tx" and "rx" if present. -Obsolete properties: -- fsl,spi-num-chipselects : Contains the number of the chipselect +Recommended properties: +- cs-gpios : GPIOs to use as chip selects, see spi-bus.txt. While the native chip +select lines can be used, they appear to always generate a pulse between each +word of a transfer. Most use cases will require GPIO based chip selects to +generate a valid transaction. Optional properties: +- num-cs : Number of total chip selects, see spi-bus.txt. +- dmas: DMA specifiers for tx and rx dma. See the DMA client binding, +Documentation/devicetree/bindings/dma/dma.txt. +- dma-names: DMA request names, if present, should include "tx" and "rx". - fsl,spi-rdy-drctl: Integer, representing the value of DRCTL, the register controlling the SPI_READY handling. Note that to enable the DRCTL consideration, the SPI_READY mode-flag needs to be set too. Valid values are: 0 (disabled), 1 (edge-triggered burst) and 2 (level-triggered burst). +Obsolete properties: +- fsl,spi-num-chipselects : Contains the number of the chipselect + Example: ecspi@70010000 { diff --git a/Documentation/devicetree/bindings/spi/sh-msiof.txt b/Documentation/devicetree/bindings/spi/sh-msiof.txt index bdd83959019c..80710f0f0448 100644 --- a/Documentation/devicetree/bindings/spi/sh-msiof.txt +++ b/Documentation/devicetree/bindings/spi/sh-msiof.txt @@ -36,7 +36,21 @@ Required properties: Optional properties: - clocks : Must contain a reference to the functional clock. -- num-cs : Total number of chip-selects (default is 1) +- num-cs : Total number of chip selects (default is 1). + Up to 3 native chip selects are supported: + 0: MSIOF_SYNC + 1: MSIOF_SS1 + 2: MSIOF_SS2 + Hardware limitations related to chip selects: + - Native chip selects are always deasserted in + between transfers that are part of the same + message. Use cs-gpios to work around this. + - All slaves using native chip selects must use the + same spi-cs-high configuration. Use cs-gpios to + work around this. + - When using GPIO chip selects, at least one native + chip select must be left unused, as it will be + driven anyway. - dmas : Must contain a list of two references to DMA specifiers, one for transmission, and one for reception. diff --git a/Documentation/devicetree/bindings/spi/spi-meson.txt b/Documentation/devicetree/bindings/spi/spi-meson.txt index 825c39cae74a..b7f5e86fed22 100644 --- a/Documentation/devicetree/bindings/spi/spi-meson.txt +++ b/Documentation/devicetree/bindings/spi/spi-meson.txt @@ -27,7 +27,9 @@ The Meson SPICC is generic SPI controller for general purpose Full-Duplex communications with dedicated 16 words RX/TX PIO FIFOs. Required properties: - - compatible: should be "amlogic,meson-gx-spicc" on Amlogic GX SoCs. + - compatible: should be: + "amlogic,meson-gx-spicc" on Amlogic GX and compatible SoCs. + "amlogic,meson-axg-spicc" on Amlogic AXG and compatible SoCs - reg: physical base address and length of the controller registers - interrupts: The interrupt specifier - clock-names: Must contain "core" diff --git a/Documentation/devicetree/bindings/spi/spi-orion.txt b/Documentation/devicetree/bindings/spi/spi-orion.txt index df8ec31f2f07..8434a65fc12a 100644 --- a/Documentation/devicetree/bindings/spi/spi-orion.txt +++ b/Documentation/devicetree/bindings/spi/spi-orion.txt @@ -18,8 +18,17 @@ Required properties: The eight register sets following the control registers refer to chip-select lines 0 through 7 respectively. - cell-index : Which of multiple SPI controllers is this. +- clocks : pointers to the reference clocks for this device, the first + one is the one used for the clock on the spi bus, the + second one is optional and is the clock used for the + functional part of the controller + Optional properties: - interrupts : Is currently not used. +- clock-names : names of used clocks, mandatory if the second clock is + used, the name must be "core", and "axi" (the latter + is only for Armada 7K/8K). + Example: spi@10600 { diff --git a/Documentation/devicetree/bindings/spi/spi-xilinx.txt b/Documentation/devicetree/bindings/spi/spi-xilinx.txt index c7b7856bd528..7bf61efc66c8 100644 --- a/Documentation/devicetree/bindings/spi/spi-xilinx.txt +++ b/Documentation/devicetree/bindings/spi/spi-xilinx.txt @@ -2,7 +2,7 @@ Xilinx SPI controller Device Tree Bindings ------------------------------------------------- Required properties: -- compatible : Should be "xlnx,xps-spi-2.00.a" or "xlnx,xps-spi-2.00.b" +- compatible : Should be "xlnx,xps-spi-2.00.a", "xlnx,xps-spi-2.00.b" or "xlnx,axi-quad-spi-1.00.a" - reg : Physical base address and size of SPI registers map. - interrupts : Property with a value describing the interrupt number. diff --git a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt index e3c28da80cb2..977054f87563 100644 --- a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt +++ b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt @@ -2,6 +2,7 @@ Actions Semi Owl Timer Required properties: - compatible : "actions,s500-timer" for S500 + "actions,s700-timer" for S700 "actions,s900-timer" for S900 - reg : Offset and length of the register set for the device. - interrupts : Should contain the interrupts. diff --git a/Documentation/devicetree/bindings/timer/spreadtrum,sprd-timer.txt b/Documentation/devicetree/bindings/timer/spreadtrum,sprd-timer.txt new file mode 100644 index 000000000000..6d97e7d0f6e8 --- /dev/null +++ b/Documentation/devicetree/bindings/timer/spreadtrum,sprd-timer.txt @@ -0,0 +1,20 @@ +Spreadtrum timers + +The Spreadtrum SC9860 platform provides 3 general-purpose timers. +These timers can support 32bit or 64bit counter, as well as supporting +period mode or one-shot mode, and they are can be wakeup source +during deep sleep. + +Required properties: +- compatible: should be "sprd,sc9860-timer" for SC9860 platform. +- reg: The register address of the timer device. +- interrupts: Should contain the interrupt for the timer device. +- clocks: The phandle to the source clock (usually a 32.768 KHz fixed clock). + +Example: + timer@40050000 { + compatible = "sprd,sc9860-timer"; + reg = <0 0x40050000 0 0x20>; + interrupts = <GIC_SPI 26 IRQ_TYPE_LEVEL_HIGH>; + clocks = <&ext_32k>; + }; diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt index 0994bdd82cd3..f776fb804a8c 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.txt +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt @@ -347,6 +347,7 @@ tcg Trusted Computing Group tcl Toby Churchill Ltd. technexion TechNexion technologic Technologic Systems +tempo Tempo Semiconductor terasic Terasic Inc. thine THine Electronics, Inc. ti Texas Instruments diff --git a/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt b/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt new file mode 100644 index 000000000000..3de96186e92e --- /dev/null +++ b/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt @@ -0,0 +1,39 @@ +Zodiac Inflight Innovations RAVE Supervisory Processor Watchdog Bindings + +RAVE SP watchdog device is a "MFD cell" device corresponding to +watchdog functionality of RAVE Supervisory Processor. It is expected +that its Device Tree node is specified as a child of the node +corresponding to the parent RAVE SP device (as documented in +Documentation/devicetree/bindings/mfd/zii,rave-sp.txt) + +Required properties: + +- compatible: Depending on wire protocol implemented by RAVE SP + firmware, should be one of: + - "zii,rave-sp-watchdog" + - "zii,rave-sp-watchdog-legacy" + +Optional properties: + +- wdt-timeout: Two byte nvmem cell specified as per + Documentation/devicetree/bindings/nvmem/nvmem.txt + +Example: + + rave-sp { + compatible = "zii,rave-sp-rdu1"; + current-speed = <38400>; + + eeprom { + wdt_timeout: wdt-timeout@8E { + reg = <0x8E 2>; + }; + }; + + watchdog { + compatible = "zii,rave-sp-watchdog"; + nvmem-cells = <&wdt_timeout>; + nvmem-cell-names = "wdt-timeout"; + }; + } + diff --git a/Documentation/driver-api/dmaengine/provider.rst b/Documentation/driver-api/dmaengine/provider.rst index 814acb4d2294..dfc4486b5743 100644 --- a/Documentation/driver-api/dmaengine/provider.rst +++ b/Documentation/driver-api/dmaengine/provider.rst @@ -111,40 +111,36 @@ The first thing you need to do in your driver is to allocate this structure. Any of the usual memory allocators will do, but you'll also need to initialize a few fields in there: -- channels: should be initialized as a list using the +- ``channels``: should be initialized as a list using the INIT_LIST_HEAD macro for example -- src_addr_widths: +- ``src_addr_widths``: should contain a bitmask of the supported source transfer width -- dst_addr_widths: +- ``dst_addr_widths``: should contain a bitmask of the supported destination transfer width -- directions: +- ``directions``: should contain a bitmask of the supported slave directions (i.e. excluding mem2mem transfers) -- residue_granularity: +- ``residue_granularity``: + granularity of the transfer residue reported to dma_set_residue. + This can be either: - - Granularity of the transfer residue reported to dma_set_residue. - This can be either: + - Descriptor: + your device doesn't support any kind of residue + reporting. The framework will only know that a particular + transaction descriptor is done. - - Descriptor + - Segment: + your device is able to report which chunks have been transferred - - Your device doesn't support any kind of residue - reporting. The framework will only know that a particular - transaction descriptor is done. + - Burst: + your device is able to report which burst have been transferred - - Segment - - - Your device is able to report which chunks have been transferred - - - Burst - - - Your device is able to report which burst have been transferred - - - dev: should hold the pointer to the ``struct device`` associated - to your current driver instance. +- ``dev``: should hold the pointer to the ``struct device`` associated + to your current driver instance. Supported transaction types --------------------------- diff --git a/Documentation/driver-api/iio/hw-consumer.rst b/Documentation/driver-api/iio/hw-consumer.rst new file mode 100644 index 000000000000..8facce6a6733 --- /dev/null +++ b/Documentation/driver-api/iio/hw-consumer.rst @@ -0,0 +1,51 @@ +=========== +HW consumer +=========== +An IIO device can be directly connected to another device in hardware. in this +case the buffers between IIO provider and IIO consumer are handled by hardware. +The Industrial I/O HW consumer offers a way to bond these IIO devices without +software buffer for data. The implementation can be found under +:file:`drivers/iio/buffer/hw-consumer.c` + + +* struct :c:type:`iio_hw_consumer` — Hardware consumer structure +* :c:func:`iio_hw_consumer_alloc` — Allocate IIO hardware consumer +* :c:func:`iio_hw_consumer_free` — Free IIO hardware consumer +* :c:func:`iio_hw_consumer_enable` — Enable IIO hardware consumer +* :c:func:`iio_hw_consumer_disable` — Disable IIO hardware consumer + + +HW consumer setup +================= + +As standard IIO device the implementation is based on IIO provider/consumer. +A typical IIO HW consumer setup looks like this:: + + static struct iio_hw_consumer *hwc; + + static const struct iio_info adc_info = { + .read_raw = adc_read_raw, + }; + + static int adc_read_raw(struct iio_dev *indio_dev, + struct iio_chan_spec const *chan, int *val, + int *val2, long mask) + { + ret = iio_hw_consumer_enable(hwc); + + /* Acquire data */ + + ret = iio_hw_consumer_disable(hwc); + } + + static int adc_probe(struct platform_device *pdev) + { + hwc = devm_iio_hw_consumer_alloc(&iio->dev); + } + +More details +============ +.. kernel-doc:: include/linux/iio/hw-consumer.h +.. kernel-doc:: drivers/iio/buffer/industrialio-hw-consumer.c + :export: + diff --git a/Documentation/driver-api/iio/index.rst b/Documentation/driver-api/iio/index.rst index e5c3922d1b6f..7fba341bd8b2 100644 --- a/Documentation/driver-api/iio/index.rst +++ b/Documentation/driver-api/iio/index.rst @@ -15,3 +15,4 @@ Contents: buffers triggers triggered-buffers + hw-consumer diff --git a/Documentation/driver-api/pm/devices.rst b/Documentation/driver-api/pm/devices.rst index 53c1b0b06da5..1128705a5731 100644 --- a/Documentation/driver-api/pm/devices.rst +++ b/Documentation/driver-api/pm/devices.rst @@ -777,17 +777,51 @@ The driver can indicate that by setting ``DPM_FLAG_SMART_SUSPEND`` in runtime suspend at the beginning of the ``suspend_late`` phase of system-wide suspend (or in the ``poweroff_late`` phase of hibernation), when runtime PM has been disabled for it, under the assumption that its state should not change -after that point until the system-wide transition is over. If that happens, the -driver's system-wide resume callbacks, if present, may still be invoked during -the subsequent system-wide resume transition and the device's runtime power -management status may be set to "active" before enabling runtime PM for it, -so the driver must be prepared to cope with the invocation of its system-wide -resume callbacks back-to-back with its ``->runtime_suspend`` one (without the -intervening ``->runtime_resume`` and so on) and the final state of the device -must reflect the "active" status for runtime PM in that case. +after that point until the system-wide transition is over (the PM core itself +does that for devices whose "noirq", "late" and "early" system-wide PM callbacks +are executed directly by it). If that happens, the driver's system-wide resume +callbacks, if present, may still be invoked during the subsequent system-wide +resume transition and the device's runtime power management status may be set +to "active" before enabling runtime PM for it, so the driver must be prepared to +cope with the invocation of its system-wide resume callbacks back-to-back with +its ``->runtime_suspend`` one (without the intervening ``->runtime_resume`` and +so on) and the final state of the device must reflect the "active" runtime PM +status in that case. During system-wide resume from a sleep state it's easiest to put devices into the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`. -Refer to that document for more information regarding this particular issue as +[Refer to that document for more information regarding this particular issue as well as for information on the device runtime power management framework in -general. +general.] + +However, it often is desirable to leave devices in suspend after system +transitions to the working state, especially if those devices had been in +runtime suspend before the preceding system-wide suspend (or analogous) +transition. Device drivers can use the ``DPM_FLAG_LEAVE_SUSPENDED`` flag to +indicate to the PM core (and middle-layer code) that they prefer the specific +devices handled by them to be left suspended and they have no problems with +skipping their system-wide resume callbacks for this reason. Whether or not the +devices will actually be left in suspend may depend on their state before the +given system suspend-resume cycle and on the type of the system transition under +way. In particular, devices are not left suspended if that transition is a +restore from hibernation, as device states are not guaranteed to be reflected +by the information stored in the hibernation image in that case. + +The middle-layer code involved in the handling of the device is expected to +indicate to the PM core if the device may be left in suspend by setting its +:c:member:`power.may_skip_resume` status bit which is checked by the PM core +during the "noirq" phase of the preceding system-wide suspend (or analogous) +transition. The middle layer is then responsible for handling the device as +appropriate in its "noirq" resume callback, which is executed regardless of +whether or not the device is left suspended, but the other resume callbacks +(except for ``->complete``) will be skipped automatically by the PM core if the +device really can be left in suspend. + +For devices whose "noirq", "late" and "early" driver callbacks are invoked +directly by the PM core, all of the system-wide resume callbacks are skipped if +``DPM_FLAG_LEAVE_SUSPENDED`` is set and the device is in runtime suspend during +the ``suspend_noirq`` (or analogous) phase or the transition under way is a +proper system suspend (rather than anything related to hibernation) and the +device's wakeup settings are suitable for runtime PM (that is, it cannot +generate wakeup signals at all or it is allowed to wake up the system from +sleep). diff --git a/Documentation/driver-api/scsi.rst b/Documentation/driver-api/scsi.rst index 9ae03171daca..3ae337929721 100644 --- a/Documentation/driver-api/scsi.rst +++ b/Documentation/driver-api/scsi.rst @@ -224,6 +224,14 @@ mid to lowlevel SCSI driver interface .. kernel-doc:: drivers/scsi/hosts.c :export: +drivers/scsi/scsi_common.c +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +general support functions + +.. kernel-doc:: drivers/scsi/scsi_common.c + :export: + Transport classes ----------------- @@ -332,5 +340,5 @@ todo ~~~~ Parallel (fast/wide/ultra) SCSI, USB, SATA, SAS, Fibre Channel, -FireWire, ATAPI devices, Infiniband, I20, iSCSI, Parallel ports, +FireWire, ATAPI devices, Infiniband, I2O, iSCSI, Parallel ports, netlink... diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt index c180045eb43b..7c1bb3d0c222 100644 --- a/Documentation/driver-model/devres.txt +++ b/Documentation/driver-model/devres.txt @@ -384,6 +384,9 @@ RESET devm_reset_control_get() devm_reset_controller_register() +SERDEV + devm_serdev_device_open() + SLAVE DMA ENGINE devm_acpi_dma_controller_register() diff --git a/Documentation/features/debug/KASAN/arch-support.txt b/Documentation/features/debug/KASAN/arch-support.txt index f377290fe48e..3406fae833c3 100644 --- a/Documentation/features/debug/KASAN/arch-support.txt +++ b/Documentation/features/debug/KASAN/arch-support.txt @@ -35,5 +35,5 @@ | um: | TODO | | unicore32: | TODO | | x86: | ok | 64-bit only - | xtensa: | TODO | + | xtensa: | ok | ----------------------- diff --git a/Documentation/features/debug/stackprotector/arch-support.txt b/Documentation/features/debug/stackprotector/arch-support.txt index d7acd7bd3619..59a4c9ffb7f3 100644 --- a/Documentation/features/debug/stackprotector/arch-support.txt +++ b/Documentation/features/debug/stackprotector/arch-support.txt @@ -35,5 +35,5 @@ | um: | TODO | | unicore32: | TODO | | x86: | ok | - | xtensa: | TODO | + | xtensa: | ok | ----------------------- diff --git a/Documentation/filesystems/nfs/Exporting b/Documentation/filesystems/nfs/Exporting index 520a4becb75c..63889149f532 100644 --- a/Documentation/filesystems/nfs/Exporting +++ b/Documentation/filesystems/nfs/Exporting @@ -56,13 +56,25 @@ a/ A dentry flag DCACHE_DISCONNECTED which is set on any dentry that might not be part of the proper prefix. This is set when anonymous dentries are created, and cleared when a dentry is noticed to be a child of a dentry which is in the proper - prefix. - -b/ A per-superblock list "s_anon" of dentries which are the roots of - subtrees that are not in the proper prefix. These dentries, as - well as the proper prefix, need to be released at unmount time. As - these dentries will not be hashed, they are linked together on the - d_hash list_head. + prefix. If the refcount on a dentry with this flag set + becomes zero, the dentry is immediately discarded, rather than being + kept in the dcache. If a dentry that is not already in the dcache + is repeatedly accessed by filehandle (as NFSD might do), an new dentry + will be a allocated for each access, and discarded at the end of + the access. + + Note that such a dentry can acquire children, name, ancestors, etc. + without losing DCACHE_DISCONNECTED - that flag is only cleared when + subtree is successfully reconnected to root. Until then dentries + in such subtree are retained only as long as there are references; + refcount reaching zero means immediate eviction, same as for unhashed + dentries. That guarantees that we won't need to hunt them down upon + umount. + +b/ A primitive for creation of secondary roots - d_obtain_root(inode). + Those do _not_ bear DCACHE_DISCONNECTED. They are placed on the + per-superblock list (->s_roots), so they can be located at umount + time for eviction purposes. c/ Helper routines to allocate anonymous dentries, and to help attach loose directory dentries at lookup time. They are: @@ -77,7 +89,6 @@ c/ Helper routines to allocate anonymous dentries, and to help attach (such as an anonymous one created by d_obtain_alias), if appropriate. It returns NULL when the passed-in dentry is used, following the calling convention of ->lookup. - Filesystem Issues ----------------- diff --git a/Documentation/filesystems/nilfs2.txt b/Documentation/filesystems/nilfs2.txt index c0727dc36271..f2f3f8592a6f 100644 --- a/Documentation/filesystems/nilfs2.txt +++ b/Documentation/filesystems/nilfs2.txt @@ -25,8 +25,8 @@ available from the following download page. At least "mkfs.nilfs2", cleaner or garbage collector) are required. Details on the tools are described in the man pages included in the package. -Project web page: http://nilfs.sourceforge.net/ -Download page: http://nilfs.sourceforge.net/en/download.html +Project web page: https://nilfs.sourceforge.io/ +Download page: https://nilfs.sourceforge.io/en/download.html List info: http://vger.kernel.org/vger-lists.html#linux-nilfs Caveats diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt index 8caa60734647..e6a5f4912b6d 100644 --- a/Documentation/filesystems/overlayfs.txt +++ b/Documentation/filesystems/overlayfs.txt @@ -156,6 +156,40 @@ handle it in two different ways: root of the overlay. Finally the directory is moved to the new location. +There are several ways to tune the "redirect_dir" feature. + +Kernel config options: + +- OVERLAY_FS_REDIRECT_DIR: + If this is enabled, then redirect_dir is turned on by default. +- OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW: + If this is enabled, then redirects are always followed by default. Enabling + this results in a less secure configuration. Enable this option only when + worried about backward compatibility with kernels that have the redirect_dir + feature and follow redirects even if turned off. + +Module options (can also be changed through /sys/module/overlay/parameters/*): + +- "redirect_dir=BOOL": + See OVERLAY_FS_REDIRECT_DIR kernel config option above. +- "redirect_always_follow=BOOL": + See OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW kernel config option above. +- "redirect_max=NUM": + The maximum number of bytes in an absolute redirect (default is 256). + +Mount options: + +- "redirect_dir=on": + Redirects are enabled. +- "redirect_dir=follow": + Redirects are not created, but followed. +- "redirect_dir=off": + Redirects are not created and only followed if "redirect_always_follow" + feature is enabled in the kernel/module config. +- "redirect_dir=nofollow": + Redirects are not created and not followed (equivalent to "redirect_dir=off" + if "redirect_always_follow" feature is not enabled). + Non-directories --------------- diff --git a/Documentation/gpio/board.txt b/Documentation/gpio/board.txt index a0f61898d493..659bb19f5b3c 100644 --- a/Documentation/gpio/board.txt +++ b/Documentation/gpio/board.txt @@ -2,6 +2,7 @@ GPIO Mappings ============= This document explains how GPIOs can be assigned to given devices and functions. + Note that it only applies to the new descriptor-based interface. For a description of the deprecated integer-based GPIO interface please refer to gpio-legacy.txt (actually, there is no real mapping possible with the old @@ -49,7 +50,7 @@ This property will make GPIOs 15, 16 and 17 available to the driver under the power = gpiod_get(dev, "power", GPIOD_OUT_HIGH); -The led GPIOs will be active-high, while the power GPIO will be active-low (i.e. +The led GPIOs will be active high, while the power GPIO will be active low (i.e. gpiod_is_active_low(power) will be true). The second parameter of the gpiod_get() functions, the con_id string, has to be @@ -122,9 +123,14 @@ where can be NULL, in which case it will match any function. - idx is the index of the GPIO within the function. - flags is defined to specify the following properties: - * GPIOF_ACTIVE_LOW - to configure the GPIO as active-low - * GPIOF_OPEN_DRAIN - GPIO pin is open drain type. - * GPIOF_OPEN_SOURCE - GPIO pin is open source type. + * GPIO_ACTIVE_HIGH - GPIO line is active high + * GPIO_ACTIVE_LOW - GPIO line is active low + * GPIO_OPEN_DRAIN - GPIO line is set up as open drain + * GPIO_OPEN_SOURCE - GPIO line is set up as open source + * GPIO_PERSISTENT - GPIO line is persistent during + suspend/resume and maintains its value + * GPIO_TRANSITORY - GPIO line is transitory and may loose its + electrical state during suspend/resume In the future, these flags might be extended to support more properties. diff --git a/Documentation/gpio/consumer.txt b/Documentation/gpio/consumer.txt index 63e1bd1d88e3..d53e5b5cfc9c 100644 --- a/Documentation/gpio/consumer.txt +++ b/Documentation/gpio/consumer.txt @@ -66,6 +66,15 @@ for the GPIO. Values can be: * GPIOD_IN to initialize the GPIO as input. * GPIOD_OUT_LOW to initialize the GPIO as output with a value of 0. * GPIOD_OUT_HIGH to initialize the GPIO as output with a value of 1. +* GPIOD_OUT_LOW_OPEN_DRAIN same as GPIOD_OUT_LOW but also enforce the line + to be electrically used with open drain. +* GPIOD_OUT_HIGH_OPEN_DRAIN same as GPIOD_OUT_HIGH but also enforce the line + to be electrically used with open drain. + +The two last flags are used for use cases where open drain is mandatory, such +as I2C: if the line is not already configured as open drain in the mappings +(see board.txt), then open drain will be enforced anyway and a warning will be +printed that the board configuration needs to be updated to match the use case. Both functions return either a valid GPIO descriptor, or an error code checkable with IS_ERR() (they will never return a NULL pointer). -ENOENT will be returned @@ -184,7 +193,7 @@ A driver can also query the current direction of a GPIO: int gpiod_get_direction(const struct gpio_desc *desc) -This function will return either GPIOF_DIR_IN or GPIOF_DIR_OUT. +This function returns 0 for output, 1 for input, or an error code in case of error. Be aware that there is no default direction for GPIOs. Therefore, **using a GPIO without setting its direction first is illegal and will result in undefined @@ -240,59 +249,71 @@ that can't be accessed from hardIRQ handlers, these calls act the same as the spinlock-safe calls. -Active-low State and Raw GPIO Values ------------------------------------- -Device drivers like to manage the logical state of a GPIO, i.e. the value their -device will actually receive, no matter what lies between it and the GPIO line. -In some cases, it might make sense to control the actual GPIO line value. The -following set of calls ignore the active-low property of a GPIO and work on the -raw line value: - - int gpiod_get_raw_value(const struct gpio_desc *desc) - void gpiod_set_raw_value(struct gpio_desc *desc, int value) - int gpiod_get_raw_value_cansleep(const struct gpio_desc *desc) - void gpiod_set_raw_value_cansleep(struct gpio_desc *desc, int value) - int gpiod_direction_output_raw(struct gpio_desc *desc, int value) - -The active-low state of a GPIO can also be queried using the following call: - - int gpiod_is_active_low(const struct gpio_desc *desc) - -Note that these functions should only be used with great moderation ; a driver -should not have to care about the physical line level. - - -The active-low property ------------------------ - -As a driver should not have to care about the physical line level, all of the +The active low and open drain semantics +--------------------------------------- +As a consumer should not have to care about the physical line level, all of the gpiod_set_value_xxx() or gpiod_set_array_value_xxx() functions operate with -the *logical* value. With this they take the active-low property into account. -This means that they check whether the GPIO is configured to be active-low, +the *logical* value. With this they take the active low property into account. +This means that they check whether the GPIO is configured to be active low, and if so, they manipulate the passed value before the physical line level is driven. +The same is applicable for open drain or open source output lines: those do not +actively drive their output high (open drain) or low (open source), they just +switch their output to a high impedance value. The consumer should not need to +care. (For details read about open drain in driver.txt.) + With this, all the gpiod_set_(array)_value_xxx() functions interpret the -parameter "value" as "active" ("1") or "inactive" ("0"). The physical line +parameter "value" as "asserted" ("1") or "de-asserted" ("0"). The physical line level will be driven accordingly. -As an example, if the active-low property for a dedicated GPIO is set, and the -gpiod_set_(array)_value_xxx() passes "active" ("1"), the physical line level +As an example, if the active low property for a dedicated GPIO is set, and the +gpiod_set_(array)_value_xxx() passes "asserted" ("1"), the physical line level will be driven low. To summarize: -Function (example) active-low property physical line -gpiod_set_raw_value(desc, 0); don't care low -gpiod_set_raw_value(desc, 1); don't care high -gpiod_set_value(desc, 0); default (active-high) low -gpiod_set_value(desc, 1); default (active-high) high -gpiod_set_value(desc, 0); active-low high -gpiod_set_value(desc, 1); active-low low - -Please note again that the set_raw/get_raw functions should be avoided as much -as possible, especially by drivers which should not care about the actual -physical line level and worry about the logical value instead. +Function (example) line property physical line +gpiod_set_raw_value(desc, 0); don't care low +gpiod_set_raw_value(desc, 1); don't care high +gpiod_set_value(desc, 0); default (active high) low +gpiod_set_value(desc, 1); default (active high) high +gpiod_set_value(desc, 0); active low high +gpiod_set_value(desc, 1); active low low +gpiod_set_value(desc, 0); default (active high) low +gpiod_set_value(desc, 1); default (active high) high +gpiod_set_value(desc, 0); open drain low +gpiod_set_value(desc, 1); open drain high impedance +gpiod_set_value(desc, 0); open source high impedance +gpiod_set_value(desc, 1); open source high + +It is possible to override these semantics using the *set_raw/'get_raw functions +but it should be avoided as much as possible, especially by system-agnostic drivers +which should not need to care about the actual physical line level and worry about +the logical value instead. + + +Accessing raw GPIO values +------------------------- +Consumers exist that need to manage the logical state of a GPIO line, i.e. the value +their device will actually receive, no matter what lies between it and the GPIO +line. + +The following set of calls ignore the active-low or open drain property of a GPIO and +work on the raw line value: + + int gpiod_get_raw_value(const struct gpio_desc *desc) + void gpiod_set_raw_value(struct gpio_desc *desc, int value) + int gpiod_get_raw_value_cansleep(const struct gpio_desc *desc) + void gpiod_set_raw_value_cansleep(struct gpio_desc *desc, int value) + int gpiod_direction_output_raw(struct gpio_desc *desc, int value) + +The active low state of a GPIO can also be queried using the following call: + + int gpiod_is_active_low(const struct gpio_desc *desc) + +Note that these functions should only be used with great moderation; a driver +should not have to care about the physical line level or open drain semantics. Access multiple GPIOs with a single function call diff --git a/Documentation/gpio/driver.txt b/Documentation/gpio/driver.txt index d8de1c7de85a..3392a0fd4c23 100644 --- a/Documentation/gpio/driver.txt +++ b/Documentation/gpio/driver.txt @@ -88,6 +88,10 @@ ending up in the pin control back-end "behind" the GPIO controller, usually closer to the actual pins. This way the pin controller can manage the below listed GPIO configurations. +If a pin controller back-end is used, the GPIO controller or hardware +description needs to provide "GPIO ranges" mapping the GPIO line offsets to pin +numbers on the pin controller so they can properly cross-reference each other. + GPIOs with debounce support --------------------------- diff --git a/Documentation/gpio/sysfs.txt b/Documentation/gpio/sysfs.txt index aeab01aa4d00..6cdeab8650cd 100644 --- a/Documentation/gpio/sysfs.txt +++ b/Documentation/gpio/sysfs.txt @@ -1,6 +1,17 @@ GPIO Sysfs Interface for Userspace ================================== +THIS ABI IS DEPRECATED, THE ABI DOCUMENTATION HAS BEEN MOVED TO +Documentation/ABI/obsolete/sysfs-gpio AND NEW USERSPACE CONSUMERS +ARE SUPPOSED TO USE THE CHARACTER DEVICE ABI. THIS OLD SYSFS ABI WILL +NOT BE DEVELOPED (NO NEW FEATURES), IT WILL JUST BE MAINTAINED. + +Refer to the examples in tools/gpio/* for an introduction to the new +character device ABI. Also see the userspace header in +include/uapi/linux/gpio.h + +The deprecated sysfs ABI +------------------------ Platforms which use the "gpiolib" implementors framework may choose to configure a sysfs user interface to GPIOs. This is different from the debugfs interface, since it provides control over GPIO direction and diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst index 2e7ee0313c1c..e94d3ac2bdd0 100644 --- a/Documentation/gpu/i915.rst +++ b/Documentation/gpu/i915.rst @@ -341,10 +341,7 @@ GuC GuC-specific firmware loader ---------------------------- -.. kernel-doc:: drivers/gpu/drm/i915/intel_guc_loader.c - :doc: GuC-specific firmware loader - -.. kernel-doc:: drivers/gpu/drm/i915/intel_guc_loader.c +.. kernel-doc:: drivers/gpu/drm/i915/intel_guc_fw.c :internal: GuC-based command submission diff --git a/Documentation/hwmon/lm25066 b/Documentation/hwmon/lm25066 index 3fa6bf820c88..51b32aa203a8 100644 --- a/Documentation/hwmon/lm25066 +++ b/Documentation/hwmon/lm25066 @@ -8,11 +8,6 @@ Supported chips: Datasheets: http://www.ti.com/lit/gpn/lm25056 http://www.ti.com/lit/gpn/lm25056a - * TI LM25063 - Prefix: 'lm25063' - Addresses scanned: - - Datasheet: - To be announced * National Semiconductor LM25066 Prefix: 'lm25066' Addresses scanned: - @@ -42,7 +37,7 @@ Description ----------- This driver supports hardware monitoring for National Semiconductor / TI LM25056, -LM25063, LM25066, LM5064, and LM5066/LM5066I Power Management, Monitoring, +LM25066, LM5064, and LM5066/LM5066I Power Management, Monitoring, Control, and Protection ICs. The driver is a client driver to the core PMBus driver. Please see @@ -74,12 +69,8 @@ in1_input Measured input voltage. in1_average Average measured input voltage. in1_min Minimum input voltage. in1_max Maximum input voltage. -in1_crit Critical high input voltage (LM25063 only). -in1_lcrit Critical low input voltage (LM25063 only). in1_min_alarm Input voltage low alarm. in1_max_alarm Input voltage high alarm. -in1_lcrit_alarm Input voltage critical low alarm (LM25063 only). -in1_crit_alarm Input voltage critical high alarm. (LM25063 only). in2_label "vmon" in2_input Measured voltage on VAUX pin @@ -94,16 +85,12 @@ in3_input Measured output voltage. in3_average Average measured output voltage. in3_min Minimum output voltage. in3_min_alarm Output voltage low alarm. -in3_highest Historical minimum output voltage (LM25063 only). -in3_lowest Historical maximum output voltage (LM25063 only). curr1_label "iin" curr1_input Measured input current. curr1_average Average measured input current. curr1_max Maximum input current. -curr1_crit Critical input current (LM25063 only). curr1_max_alarm Input current high alarm. -curr1_crit_alarm Input current critical high alarm (LM25063 only). power1_label "pin" power1_input Measured input power. @@ -113,11 +100,6 @@ power1_alarm Input power alarm power1_input_highest Historical maximum power. power1_reset_history Write any value to reset maximum power history. -power2_label "pout". LM25063 only. -power2_input Measured output power. -power2_max Maximum output power limit. -power2_crit Critical output power limit. - temp1_input Measured temperature. temp1_max Maximum temperature. temp1_crit Critical high temperature. diff --git a/Documentation/hwmon/max31785 b/Documentation/hwmon/max31785 index 45fb6093dec2..270c5f865261 100644 --- a/Documentation/hwmon/max31785 +++ b/Documentation/hwmon/max31785 @@ -17,8 +17,9 @@ management with temperature and remote voltage sensing. Various fan control features are provided, including PWM frequency control, temperature hysteresis, dual tachometer measurements, and fan health monitoring. -For dual rotor fan configuration, the MAX31785 exposes the slowest rotor of the -two in the fan[1-4]_input attributes. +For dual-rotor configurations the MAX31785A exposes the second rotor tachometer +readings in attributes fan[5-8]_input. By contrast the MAX31785 only exposes +the slowest rotor measurement, and does so in the fan[1-4]_input attributes. Usage Notes ----------- @@ -31,7 +32,9 @@ Sysfs attributes fan[1-4]_alarm Fan alarm. fan[1-4]_fault Fan fault. -fan[1-4]_input Fan RPM. +fan[1-8]_input Fan RPM. On the MAX31785A, inputs 5-8 correspond to the + second rotor of fans 1-4 +fan[1-4]_target Fan input target in[1-6]_crit Critical maximum output voltage in[1-6]_crit_alarm Output voltage critical high alarm @@ -44,6 +47,12 @@ in[1-6]_max_alarm Output voltage high alarm in[1-6]_min Minimum output voltage in[1-6]_min_alarm Output voltage low alarm +pwm[1-4] Fan target duty cycle (0..255) +pwm[1-4]_enable 0: Full-speed + 1: Manual PWM control + 2: Automatic PWM (tach-feedback RPM fan-control) + 3: Automatic closed-loop (temp-feedback fan-control) + temp[1-11]_crit Critical high temperature temp[1-11]_crit_alarm Chip temperature critical high alarm temp[1-11]_input Measured temperature diff --git a/Documentation/hwmon/w83773g b/Documentation/hwmon/w83773g new file mode 100644 index 000000000000..4cc6c0b8257f --- /dev/null +++ b/Documentation/hwmon/w83773g @@ -0,0 +1,33 @@ +Kernel driver w83773g +==================== + +Supported chips: + * Nuvoton W83773G + Prefix: 'w83773g' + Addresses scanned: I2C 0x4c and 0x4d + Datasheet: https://www.nuvoton.com/resource-files/W83773G_SG_DatasheetV1_2.pdf + +Authors: + Lei YU <mine260309@gmail.com> + +Description +----------- + +This driver implements support for Nuvoton W83773G temperature sensor +chip. This chip implements one local and two remote sensors. +The chip also features offsets for the two remote sensors which get added to +the input readings. The chip does all the scaling by itself and the driver +therefore reports true temperatures that don't need any user-space adjustments. +Temperature is measured in degrees Celsius. +The chip is wired over I2C/SMBus and specified over a temperature +range of -40 to +125 degrees Celsius (for local sensor) and -40 to +127 +degrees Celsius (for remote sensors). +Resolution for both the local and remote channels is 0.125 degree C. + +The chip supports only temperature measurement. The driver exports +the temperature values via the following sysfs files: + +temp[1-3]_input +temp[2-3]_fault +temp[2-3]_offset +update_interval diff --git a/Documentation/input/multi-touch-protocol.rst b/Documentation/input/multi-touch-protocol.rst index 8035868c56bc..b51751a0cd5d 100644 --- a/Documentation/input/multi-touch-protocol.rst +++ b/Documentation/input/multi-touch-protocol.rst @@ -269,10 +269,11 @@ ABS_MT_ORIENTATION The orientation of the touching ellipse. The value should describe a signed quarter of a revolution clockwise around the touch center. The signed value range is arbitrary, but zero should be returned for an ellipse aligned with - the Y axis of the surface, a negative value when the ellipse is turned to - the left, and a positive value when the ellipse is turned to the - right. When completely aligned with the X axis, the range max should be - returned. + the Y axis (north) of the surface, a negative value when the ellipse is + turned to the left, and a positive value when the ellipse is turned to the + right. When aligned with the X axis in the positive direction, the range + max should be returned; when aligned with the X axis in the negative + direction, the range -max should be returned. Touch ellipsis are symmetrical by default. For devices capable of true 360 degree orientation, the reported orientation must exceed the range max to diff --git a/Documentation/kbuild/kconfig-language.txt b/Documentation/kbuild/kconfig-language.txt index 262722d8867b..c4a293a03c33 100644 --- a/Documentation/kbuild/kconfig-language.txt +++ b/Documentation/kbuild/kconfig-language.txt @@ -200,10 +200,14 @@ module state. Dependency expressions have the following syntax: <expr> ::= <symbol> (1) <symbol> '=' <symbol> (2) <symbol> '!=' <symbol> (3) - '(' <expr> ')' (4) - '!' <expr> (5) - <expr> '&&' <expr> (6) - <expr> '||' <expr> (7) + <symbol1> '<' <symbol2> (4) + <symbol1> '>' <symbol2> (4) + <symbol1> '<=' <symbol2> (4) + <symbol1> '>=' <symbol2> (4) + '(' <expr> ')' (5) + '!' <expr> (6) + <expr> '&&' <expr> (7) + <expr> '||' <expr> (8) Expressions are listed in decreasing order of precedence. @@ -214,10 +218,13 @@ Expressions are listed in decreasing order of precedence. otherwise 'n'. (3) If the values of both symbols are equal, it returns 'n', otherwise 'y'. -(4) Returns the value of the expression. Used to override precedence. -(5) Returns the result of (2-/expr/). -(6) Returns the result of min(/expr/, /expr/). -(7) Returns the result of max(/expr/, /expr/). +(4) If value of <symbol1> is respectively lower, greater, lower-or-equal, + or greater-or-equal than value of <symbol2>, it returns 'y', + otherwise 'n'. +(5) Returns the value of the expression. Used to override precedence. +(6) Returns the result of (2-/expr/). +(7) Returns the result of min(/expr/, /expr/). +(8) Returns the result of max(/expr/, /expr/). An expression can have a value of 'n', 'm' or 'y' (or 0, 1, 2 respectively for calculations). A menu entry becomes visible when its diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt index ecdb18104ab0..1ae2de758c08 100644 --- a/Documentation/livepatch/livepatch.txt +++ b/Documentation/livepatch/livepatch.txt @@ -72,8 +72,7 @@ example, they add a NULL pointer or a boundary check, fix a race by adding a missing memory barrier, or add some locking around a critical section. Most of these changes are self contained and the function presents itself the same way to the rest of the system. In this case, the functions might -be updated independently one by one. (This can be done by setting the -'immediate' flag in the klp_patch struct.) +be updated independently one by one. But there are more complex fixes. For example, a patch might change ordering of locking in multiple functions at the same time. Or a patch @@ -125,12 +124,6 @@ safe to patch tasks: b) Patching CPU-bound user tasks. If the task is highly CPU-bound then it will get patched the next time it gets interrupted by an IRQ. - c) In the future it could be useful for applying patches for - architectures which don't yet have HAVE_RELIABLE_STACKTRACE. In - this case you would have to signal most of the tasks on the - system. However this isn't supported yet because there's - currently no way to patch kthreads without - HAVE_RELIABLE_STACKTRACE. 3. For idle "swapper" tasks, since they don't ever exit the kernel, they instead have a klp_update_patch_state() call in the idle loop which @@ -138,27 +131,16 @@ safe to patch tasks: (Note there's not yet such an approach for kthreads.) -All the above approaches may be skipped by setting the 'immediate' flag -in the 'klp_patch' struct, which will disable per-task consistency and -patch all tasks immediately. This can be useful if the patch doesn't -change any function or data semantics. Note that, even with this flag -set, it's possible that some tasks may still be running with an old -version of the function, until that function returns. +Architectures which don't have HAVE_RELIABLE_STACKTRACE solely rely on +the second approach. It's highly likely that some tasks may still be +running with an old version of the function, until that function +returns. In this case you would have to signal the tasks. This +especially applies to kthreads. They may not be woken up and would need +to be forced. See below for more information. -There's also an 'immediate' flag in the 'klp_func' struct which allows -you to specify that certain functions in the patch can be applied -without per-task consistency. This might be useful if you want to patch -a common function like schedule(), and the function change doesn't need -consistency but the rest of the patch does. - -For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user -must set patch->immediate which causes all tasks to be patched -immediately. This option should be used with care, only when the patch -doesn't change any function or data semantics. - -In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE -may be allowed to use per-task consistency if we can come up with -another way to patch kthreads. +Unless we can come up with another way to patch kthreads, architectures +without HAVE_RELIABLE_STACKTRACE are not considered fully supported by +the kernel livepatching. The /sys/kernel/livepatch/<patch>/transition file shows whether a patch is in transition. Only a single patch (the topmost patch on the stack) @@ -176,8 +158,31 @@ If a patch is in transition, this file shows 0 to indicate the task is unpatched and 1 to indicate it's patched. Otherwise, if no patch is in transition, it shows -1. Any tasks which are blocking the transition can be signaled with SIGSTOP and SIGCONT to force them to change their -patched state. - +patched state. This may be harmful to the system though. +/sys/kernel/livepatch/<patch>/signal attribute provides a better alternative. +Writing 1 to the attribute sends a fake signal to all remaining blocking +tasks. No proper signal is actually delivered (there is no data in signal +pending structures). Tasks are interrupted or woken up, and forced to change +their patched state. + +Administrator can also affect a transition through +/sys/kernel/livepatch/<patch>/force attribute. Writing 1 there clears +TIF_PATCH_PENDING flag of all tasks and thus forces the tasks to the patched +state. Important note! The force attribute is intended for cases when the +transition gets stuck for a long time because of a blocking task. Administrator +is expected to collect all necessary data (namely stack traces of such blocking +tasks) and request a clearance from a patch distributor to force the transition. +Unauthorized usage may cause harm to the system. It depends on the nature of the +patch, which functions are (un)patched, and which functions the blocking tasks +are sleeping in (/proc/<pid>/stack may help here). Removal (rmmod) of patch +modules is permanently disabled when the force feature is used. It cannot be +guaranteed there is no task sleeping in such module. It implies unbounded +reference count if a patch module is disabled and enabled in a loop. + +Moreover, the usage of force may also affect future applications of live +patches and cause even more harm to the system. Administrator should first +consider to simply cancel a transition (see above). If force is used, reboot +should be planned and no more live patches applied. 3.1 Adding consistency model support to new architectures --------------------------------------------------------- @@ -216,13 +221,6 @@ few options: a good backup option for those architectures which don't have reliable stack traces yet. -In the meantime, patches for such architectures can bypass the -consistency model by setting klp_patch.immediate to true. This option -is perfectly fine for patches which don't change the semantics of the -patched functions. In practice, this is usable for ~90% of security -fixes. Use of this option also means the patch can't be unloaded after -it has been disabled. - 4. Livepatch module =================== @@ -278,9 +276,6 @@ into three levels: only for a particular object ( vmlinux or a kernel module ). Note that kallsyms allows for searching symbols according to the object name. - There's also an 'immediate' flag which, when set, patches the - function immediately, bypassing the consistency model safety checks. - + struct klp_object defines an array of patched functions (struct klp_func) in the same object. Where the object is either vmlinux (NULL) or a module name. @@ -299,9 +294,6 @@ into three levels: symbols are found. The only exception are symbols from objects (kernel modules) that have not been loaded yet. - Setting the 'immediate' flag applies the patch to all tasks - immediately, bypassing the consistency model safety checks. - For more details on how the patch is applied on a per-task basis, see the "Consistency model" section. @@ -316,14 +308,12 @@ section "Livepatch life-cycle" below for more details about these two operations. Module removal is only safe when there are no users of the underlying -functions. The immediate consistency model is not able to detect this. The -code just redirects the functions at the very beginning and it does not -check if the functions are in use. In other words, it knows when the -functions get called but it does not know when the functions return. -Therefore it cannot be decided when the livepatch module can be safely -removed. This is solved by a hybrid consistency model. When the system is -transitioned to a new patch state (patched/unpatched) it is guaranteed that -no task sleeps or runs in the old code. +functions. This is the reason why the force feature permanently disables +the removal. The forced tasks entered the functions but we cannot say +that they returned back. Therefore it cannot be decided when the +livepatch module can be safely removed. When the system is successfully +transitioned to a new patch state (patched/unpatched) without being +forced it is guaranteed that no task sleeps or runs in the old code. 5. Livepatch life-cycle @@ -337,19 +327,12 @@ First, the patch is applied only when all patched symbols for already loaded objects are found. The error handling is much easier if this check is done before particular functions get redirected. -Second, the immediate consistency model does not guarantee that anyone is not -sleeping in the new code after the patch is reverted. This means that the new -code needs to stay around "forever". If the code is there, one could apply it -again. Therefore it makes sense to separate the operations that might be done -once and those that need to be repeated when the patch is enabled (applied) -again. - -Third, it might take some time until the entire system is migrated -when a more complex consistency model is used. The patch revert might -block the livepatch module removal for too long. Therefore it is useful -to revert the patch using a separate operation that might be called -explicitly. But it does not make sense to remove all information -until the livepatch module is really removed. +Second, it might take some time until the entire system is migrated with +the hybrid consistency model being used. The patch revert might block +the livepatch module removal for too long. Therefore it is useful to +revert the patch using a separate operation that might be called +explicitly. But it does not make sense to remove all information until +the livepatch module is really removed. 5.1. Registration @@ -435,6 +418,9 @@ Information about the registered patches can be found under /sys/kernel/livepatch. The patches could be enabled and disabled by writing there. +/sys/kernel/livepatch/<patch>/signal and /sys/kernel/livepatch/<patch>/force +attributes allow administrator to affect a patching operation. + See Documentation/ABI/testing/sysfs-kernel-livepatch for more details. diff --git a/Documentation/locking/crossrelease.txt b/Documentation/locking/crossrelease.txt deleted file mode 100644 index bdf1423d5f99..000000000000 --- a/Documentation/locking/crossrelease.txt +++ /dev/null @@ -1,874 +0,0 @@ -Crossrelease -============ - -Started by Byungchul Park <byungchul.park@lge.com> - -Contents: - - (*) Background - - - What causes deadlock - - How lockdep works - - (*) Limitation - - - Limit lockdep - - Pros from the limitation - - Cons from the limitation - - Relax the limitation - - (*) Crossrelease - - - Introduce crossrelease - - Introduce commit - - (*) Implementation - - - Data structures - - How crossrelease works - - (*) Optimizations - - - Avoid duplication - - Lockless for hot paths - - (*) APPENDIX A: What lockdep does to work aggresively - - (*) APPENDIX B: How to avoid adding false dependencies - - -========== -Background -========== - -What causes deadlock --------------------- - -A deadlock occurs when a context is waiting for an event to happen, -which is impossible because another (or the) context who can trigger the -event is also waiting for another (or the) event to happen, which is -also impossible due to the same reason. - -For example: - - A context going to trigger event C is waiting for event A to happen. - A context going to trigger event A is waiting for event B to happen. - A context going to trigger event B is waiting for event C to happen. - -A deadlock occurs when these three wait operations run at the same time, -because event C cannot be triggered if event A does not happen, which in -turn cannot be triggered if event B does not happen, which in turn -cannot be triggered if event C does not happen. After all, no event can -be triggered since any of them never meets its condition to wake up. - -A dependency might exist between two waiters and a deadlock might happen -due to an incorrect releationship between dependencies. Thus, we must -define what a dependency is first. A dependency exists between them if: - - 1. There are two waiters waiting for each event at a given time. - 2. The only way to wake up each waiter is to trigger its event. - 3. Whether one can be woken up depends on whether the other can. - -Each wait in the example creates its dependency like: - - Event C depends on event A. - Event A depends on event B. - Event B depends on event C. - - NOTE: Precisely speaking, a dependency is one between whether a - waiter for an event can be woken up and whether another waiter for - another event can be woken up. However from now on, we will describe - a dependency as if it's one between an event and another event for - simplicity. - -And they form circular dependencies like: - - -> C -> A -> B - - / \ - \ / - ---------------- - - where 'A -> B' means that event A depends on event B. - -Such circular dependencies lead to a deadlock since no waiter can meet -its condition to wake up as described. - -CONCLUSION - -Circular dependencies cause a deadlock. - - -How lockdep works ------------------ - -Lockdep tries to detect a deadlock by checking dependencies created by -lock operations, acquire and release. Waiting for a lock corresponds to -waiting for an event, and releasing a lock corresponds to triggering an -event in the previous section. - -In short, lockdep does: - - 1. Detect a new dependency. - 2. Add the dependency into a global graph. - 3. Check if that makes dependencies circular. - 4. Report a deadlock or its possibility if so. - -For example, consider a graph built by lockdep that looks like: - - A -> B - - \ - -> E - / - C -> D - - - where A, B,..., E are different lock classes. - -Lockdep will add a dependency into the graph on detection of a new -dependency. For example, it will add a dependency 'E -> C' when a new -dependency between lock E and lock C is detected. Then the graph will be: - - A -> B - - \ - -> E - - / \ - -> C -> D - \ - / / - \ / - ------------------ - - where A, B,..., E are different lock classes. - -This graph contains a subgraph which demonstrates circular dependencies: - - -> E - - / \ - -> C -> D - \ - / / - \ / - ------------------ - - where C, D and E are different lock classes. - -This is the condition under which a deadlock might occur. Lockdep -reports it on detection after adding a new dependency. This is the way -how lockdep works. - -CONCLUSION - -Lockdep detects a deadlock or its possibility by checking if circular -dependencies were created after adding each new dependency. - - -========== -Limitation -========== - -Limit lockdep -------------- - -Limiting lockdep to work on only typical locks e.g. spin locks and -mutexes, which are released within the acquire context, the -implementation becomes simple but its capacity for detection becomes -limited. Let's check pros and cons in next section. - - -Pros from the limitation ------------------------- - -Given the limitation, when acquiring a lock, locks in a held_locks -cannot be released if the context cannot acquire it so has to wait to -acquire it, which means all waiters for the locks in the held_locks are -stuck. It's an exact case to create dependencies between each lock in -the held_locks and the lock to acquire. - -For example: - - CONTEXT X - --------- - acquire A - acquire B /* Add a dependency 'A -> B' */ - release B - release A - - where A and B are different lock classes. - -When acquiring lock A, the held_locks of CONTEXT X is empty thus no -dependency is added. But when acquiring lock B, lockdep detects and adds -a new dependency 'A -> B' between lock A in the held_locks and lock B. -They can be simply added whenever acquiring each lock. - -And data required by lockdep exists in a local structure, held_locks -embedded in task_struct. Forcing to access the data within the context, -lockdep can avoid racy problems without explicit locks while handling -the local data. - -Lastly, lockdep only needs to keep locks currently being held, to build -a dependency graph. However, relaxing the limitation, it needs to keep -even locks already released, because a decision whether they created -dependencies might be long-deferred. - -To sum up, we can expect several advantages from the limitation: - - 1. Lockdep can easily identify a dependency when acquiring a lock. - 2. Races are avoidable while accessing local locks in a held_locks. - 3. Lockdep only needs to keep locks currently being held. - -CONCLUSION - -Given the limitation, the implementation becomes simple and efficient. - - -Cons from the limitation ------------------------- - -Given the limitation, lockdep is applicable only to typical locks. For -example, page locks for page access or completions for synchronization -cannot work with lockdep. - -Can we detect deadlocks below, under the limitation? - -Example 1: - - CONTEXT X CONTEXT Y CONTEXT Z - --------- --------- ---------- - mutex_lock A - lock_page B - lock_page B - mutex_lock A /* DEADLOCK */ - unlock_page B held by X - unlock_page B - mutex_unlock A - mutex_unlock A - - where A and B are different lock classes. - -No, we cannot. - -Example 2: - - CONTEXT X CONTEXT Y - --------- --------- - mutex_lock A - mutex_lock A - wait_for_complete B /* DEADLOCK */ - complete B - mutex_unlock A - mutex_unlock A - - where A is a lock class and B is a completion variable. - -No, we cannot. - -CONCLUSION - -Given the limitation, lockdep cannot detect a deadlock or its -possibility caused by page locks or completions. - - -Relax the limitation --------------------- - -Under the limitation, things to create dependencies are limited to -typical locks. However, synchronization primitives like page locks and -completions, which are allowed to be released in any context, also -create dependencies and can cause a deadlock. So lockdep should track -these locks to do a better job. We have to relax the limitation for -these locks to work with lockdep. - -Detecting dependencies is very important for lockdep to work because -adding a dependency means adding an opportunity to check whether it -causes a deadlock. The more lockdep adds dependencies, the more it -thoroughly works. Thus Lockdep has to do its best to detect and add as -many true dependencies into a graph as possible. - -For example, considering only typical locks, lockdep builds a graph like: - - A -> B - - \ - -> E - / - C -> D - - - where A, B,..., E are different lock classes. - -On the other hand, under the relaxation, additional dependencies might -be created and added. Assuming additional 'FX -> C' and 'E -> GX' are -added thanks to the relaxation, the graph will be: - - A -> B - - \ - -> E -> GX - / - FX -> C -> D - - - where A, B,..., E, FX and GX are different lock classes, and a suffix - 'X' is added on non-typical locks. - -The latter graph gives us more chances to check circular dependencies -than the former. However, it might suffer performance degradation since -relaxing the limitation, with which design and implementation of lockdep -can be efficient, might introduce inefficiency inevitably. So lockdep -should provide two options, strong detection and efficient detection. - -Choosing efficient detection: - - Lockdep works with only locks restricted to be released within the - acquire context. However, lockdep works efficiently. - -Choosing strong detection: - - Lockdep works with all synchronization primitives. However, lockdep - suffers performance degradation. - -CONCLUSION - -Relaxing the limitation, lockdep can add additional dependencies giving -additional opportunities to check circular dependencies. - - -============ -Crossrelease -============ - -Introduce crossrelease ----------------------- - -In order to allow lockdep to handle additional dependencies by what -might be released in any context, namely 'crosslock', we have to be able -to identify those created by crosslocks. The proposed 'crossrelease' -feature provoides a way to do that. - -Crossrelease feature has to do: - - 1. Identify dependencies created by crosslocks. - 2. Add the dependencies into a dependency graph. - -That's all. Once a meaningful dependency is added into graph, then -lockdep would work with the graph as it did. The most important thing -crossrelease feature has to do is to correctly identify and add true -dependencies into the global graph. - -A dependency e.g. 'A -> B' can be identified only in the A's release -context because a decision required to identify the dependency can be -made only in the release context. That is to decide whether A can be -released so that a waiter for A can be woken up. It cannot be made in -other than the A's release context. - -It's no matter for typical locks because each acquire context is same as -its release context, thus lockdep can decide whether a lock can be -released in the acquire context. However for crosslocks, lockdep cannot -make the decision in the acquire context but has to wait until the -release context is identified. - -Therefore, deadlocks by crosslocks cannot be detected just when it -happens, because those cannot be identified until the crosslocks are -released. However, deadlock possibilities can be detected and it's very -worth. See 'APPENDIX A' section to check why. - -CONCLUSION - -Using crossrelease feature, lockdep can work with what might be released -in any context, namely crosslock. - - -Introduce commit ----------------- - -Since crossrelease defers the work adding true dependencies of -crosslocks until they are actually released, crossrelease has to queue -all acquisitions which might create dependencies with the crosslocks. -Then it identifies dependencies using the queued data in batches at a -proper time. We call it 'commit'. - -There are four types of dependencies: - -1. TT type: 'typical lock A -> typical lock B' - - Just when acquiring B, lockdep can see it's in the A's release - context. So the dependency between A and B can be identified - immediately. Commit is unnecessary. - -2. TC type: 'typical lock A -> crosslock BX' - - Just when acquiring BX, lockdep can see it's in the A's release - context. So the dependency between A and BX can be identified - immediately. Commit is unnecessary, too. - -3. CT type: 'crosslock AX -> typical lock B' - - When acquiring B, lockdep cannot identify the dependency because - there's no way to know if it's in the AX's release context. It has - to wait until the decision can be made. Commit is necessary. - -4. CC type: 'crosslock AX -> crosslock BX' - - When acquiring BX, lockdep cannot identify the dependency because - there's no way to know if it's in the AX's release context. It has - to wait until the decision can be made. Commit is necessary. - But, handling CC type is not implemented yet. It's a future work. - -Lockdep can work without commit for typical locks, but commit step is -necessary once crosslocks are involved. Introducing commit, lockdep -performs three steps. What lockdep does in each step is: - -1. Acquisition: For typical locks, lockdep does what it originally did - and queues the lock so that CT type dependencies can be checked using - it at the commit step. For crosslocks, it saves data which will be - used at the commit step and increases a reference count for it. - -2. Commit: No action is reauired for typical locks. For crosslocks, - lockdep adds CT type dependencies using the data saved at the - acquisition step. - -3. Release: No changes are required for typical locks. When a crosslock - is released, it decreases a reference count for it. - -CONCLUSION - -Crossrelease introduces commit step to handle dependencies of crosslocks -in batches at a proper time. - - -============== -Implementation -============== - -Data structures ---------------- - -Crossrelease introduces two main data structures. - -1. hist_lock - - This is an array embedded in task_struct, for keeping lock history so - that dependencies can be added using them at the commit step. Since - it's local data, it can be accessed locklessly in the owner context. - The array is filled at the acquisition step and consumed at the - commit step. And it's managed in circular manner. - -2. cross_lock - - One per lockdep_map exists. This is for keeping data of crosslocks - and used at the commit step. - - -How crossrelease works ----------------------- - -It's the key of how crossrelease works, to defer necessary works to an -appropriate point in time and perform in at once at the commit step. -Let's take a look with examples step by step, starting from how lockdep -works without crossrelease for typical locks. - - acquire A /* Push A onto held_locks */ - acquire B /* Push B onto held_locks and add 'A -> B' */ - acquire C /* Push C onto held_locks and add 'B -> C' */ - release C /* Pop C from held_locks */ - release B /* Pop B from held_locks */ - release A /* Pop A from held_locks */ - - where A, B and C are different lock classes. - - NOTE: This document assumes that readers already understand how - lockdep works without crossrelease thus omits details. But there's - one thing to note. Lockdep pretends to pop a lock from held_locks - when releasing it. But it's subtly different from the original pop - operation because lockdep allows other than the top to be poped. - -In this case, lockdep adds 'the top of held_locks -> the lock to acquire' -dependency every time acquiring a lock. - -After adding 'A -> B', a dependency graph will be: - - A -> B - - where A and B are different lock classes. - -And after adding 'B -> C', the graph will be: - - A -> B -> C - - where A, B and C are different lock classes. - -Let's performs commit step even for typical locks to add dependencies. -Of course, commit step is not necessary for them, however, it would work -well because this is a more general way. - - acquire A - /* - * Queue A into hist_locks - * - * In hist_locks: A - * In graph: Empty - */ - - acquire B - /* - * Queue B into hist_locks - * - * In hist_locks: A, B - * In graph: Empty - */ - - acquire C - /* - * Queue C into hist_locks - * - * In hist_locks: A, B, C - * In graph: Empty - */ - - commit C - /* - * Add 'C -> ?' - * Answer the following to decide '?' - * What has been queued since acquire C: Nothing - * - * In hist_locks: A, B, C - * In graph: Empty - */ - - release C - - commit B - /* - * Add 'B -> ?' - * Answer the following to decide '?' - * What has been queued since acquire B: C - * - * In hist_locks: A, B, C - * In graph: 'B -> C' - */ - - release B - - commit A - /* - * Add 'A -> ?' - * Answer the following to decide '?' - * What has been queued since acquire A: B, C - * - * In hist_locks: A, B, C - * In graph: 'B -> C', 'A -> B', 'A -> C' - */ - - release A - - where A, B and C are different lock classes. - -In this case, dependencies are added at the commit step as described. - -After commits for A, B and C, the graph will be: - - A -> B -> C - - where A, B and C are different lock classes. - - NOTE: A dependency 'A -> C' is optimized out. - -We can see the former graph built without commit step is same as the -latter graph built using commit steps. Of course the former way leads to -earlier finish for building the graph, which means we can detect a -deadlock or its possibility sooner. So the former way would be prefered -when possible. But we cannot avoid using the latter way for crosslocks. - -Let's look at how commit steps work for crosslocks. In this case, the -commit step is performed only on crosslock AX as real. And it assumes -that the AX release context is different from the AX acquire context. - - BX RELEASE CONTEXT BX ACQUIRE CONTEXT - ------------------ ------------------ - acquire A - /* - * Push A onto held_locks - * Queue A into hist_locks - * - * In held_locks: A - * In hist_locks: A - * In graph: Empty - */ - - acquire BX - /* - * Add 'the top of held_locks -> BX' - * - * In held_locks: A - * In hist_locks: A - * In graph: 'A -> BX' - */ - - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - It must be guaranteed that the following operations are seen after - acquiring BX globally. It can be done by things like barrier. - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - - acquire C - /* - * Push C onto held_locks - * Queue C into hist_locks - * - * In held_locks: C - * In hist_locks: C - * In graph: 'A -> BX' - */ - - release C - /* - * Pop C from held_locks - * - * In held_locks: Empty - * In hist_locks: C - * In graph: 'A -> BX' - */ - acquire D - /* - * Push D onto held_locks - * Queue D into hist_locks - * Add 'the top of held_locks -> D' - * - * In held_locks: A, D - * In hist_locks: A, D - * In graph: 'A -> BX', 'A -> D' - */ - acquire E - /* - * Push E onto held_locks - * Queue E into hist_locks - * - * In held_locks: E - * In hist_locks: C, E - * In graph: 'A -> BX', 'A -> D' - */ - - release E - /* - * Pop E from held_locks - * - * In held_locks: Empty - * In hist_locks: D, E - * In graph: 'A -> BX', 'A -> D' - */ - release D - /* - * Pop D from held_locks - * - * In held_locks: A - * In hist_locks: A, D - * In graph: 'A -> BX', 'A -> D' - */ - commit BX - /* - * Add 'BX -> ?' - * What has been queued since acquire BX: C, E - * - * In held_locks: Empty - * In hist_locks: D, E - * In graph: 'A -> BX', 'A -> D', - * 'BX -> C', 'BX -> E' - */ - - release BX - /* - * In held_locks: Empty - * In hist_locks: D, E - * In graph: 'A -> BX', 'A -> D', - * 'BX -> C', 'BX -> E' - */ - release A - /* - * Pop A from held_locks - * - * In held_locks: Empty - * In hist_locks: A, D - * In graph: 'A -> BX', 'A -> D', - * 'BX -> C', 'BX -> E' - */ - - where A, BX, C,..., E are different lock classes, and a suffix 'X' is - added on crosslocks. - -Crossrelease considers all acquisitions after acqiuring BX are -candidates which might create dependencies with BX. True dependencies -will be determined when identifying the release context of BX. Meanwhile, -all typical locks are queued so that they can be used at the commit step. -And then two dependencies 'BX -> C' and 'BX -> E' are added at the -commit step when identifying the release context. - -The final graph will be, with crossrelease: - - -> C - / - -> BX - - / \ - A - -> E - \ - -> D - - where A, BX, C,..., E are different lock classes, and a suffix 'X' is - added on crosslocks. - -However, the final graph will be, without crossrelease: - - A -> D - - where A and D are different lock classes. - -The former graph has three more dependencies, 'A -> BX', 'BX -> C' and -'BX -> E' giving additional opportunities to check if they cause -deadlocks. This way lockdep can detect a deadlock or its possibility -caused by crosslocks. - -CONCLUSION - -We checked how crossrelease works with several examples. - - -============= -Optimizations -============= - -Avoid duplication ------------------ - -Crossrelease feature uses a cache like what lockdep already uses for -dependency chains, but this time it's for caching CT type dependencies. -Once that dependency is cached, the same will never be added again. - - -Lockless for hot paths ----------------------- - -To keep all locks for later use at the commit step, crossrelease adopts -a local array embedded in task_struct, which makes access to the data -lockless by forcing it to happen only within the owner context. It's -like how lockdep handles held_locks. Lockless implmentation is important -since typical locks are very frequently acquired and released. - - -================================================= -APPENDIX A: What lockdep does to work aggresively -================================================= - -A deadlock actually occurs when all wait operations creating circular -dependencies run at the same time. Even though they don't, a potential -deadlock exists if the problematic dependencies exist. Thus it's -meaningful to detect not only an actual deadlock but also its potential -possibility. The latter is rather valuable. When a deadlock occurs -actually, we can identify what happens in the system by some means or -other even without lockdep. However, there's no way to detect possiblity -without lockdep unless the whole code is parsed in head. It's terrible. -Lockdep does the both, and crossrelease only focuses on the latter. - -Whether or not a deadlock actually occurs depends on several factors. -For example, what order contexts are switched in is a factor. Assuming -circular dependencies exist, a deadlock would occur when contexts are -switched so that all wait operations creating the dependencies run -simultaneously. Thus to detect a deadlock possibility even in the case -that it has not occured yet, lockdep should consider all possible -combinations of dependencies, trying to: - -1. Use a global dependency graph. - - Lockdep combines all dependencies into one global graph and uses them, - regardless of which context generates them or what order contexts are - switched in. Aggregated dependencies are only considered so they are - prone to be circular if a problem exists. - -2. Check dependencies between classes instead of instances. - - What actually causes a deadlock are instances of lock. However, - lockdep checks dependencies between classes instead of instances. - This way lockdep can detect a deadlock which has not happened but - might happen in future by others but the same class. - -3. Assume all acquisitions lead to waiting. - - Although locks might be acquired without waiting which is essential - to create dependencies, lockdep assumes all acquisitions lead to - waiting since it might be true some time or another. - -CONCLUSION - -Lockdep detects not only an actual deadlock but also its possibility, -and the latter is more valuable. - - -================================================== -APPENDIX B: How to avoid adding false dependencies -================================================== - -Remind what a dependency is. A dependency exists if: - - 1. There are two waiters waiting for each event at a given time. - 2. The only way to wake up each waiter is to trigger its event. - 3. Whether one can be woken up depends on whether the other can. - -For example: - - acquire A - acquire B /* A dependency 'A -> B' exists */ - release B - release A - - where A and B are different lock classes. - -A depedency 'A -> B' exists since: - - 1. A waiter for A and a waiter for B might exist when acquiring B. - 2. Only way to wake up each is to release what it waits for. - 3. Whether the waiter for A can be woken up depends on whether the - other can. IOW, TASK X cannot release A if it fails to acquire B. - -For another example: - - TASK X TASK Y - ------ ------ - acquire AX - acquire B /* A dependency 'AX -> B' exists */ - release B - release AX held by Y - - where AX and B are different lock classes, and a suffix 'X' is added - on crosslocks. - -Even in this case involving crosslocks, the same rule can be applied. A -depedency 'AX -> B' exists since: - - 1. A waiter for AX and a waiter for B might exist when acquiring B. - 2. Only way to wake up each is to release what it waits for. - 3. Whether the waiter for AX can be woken up depends on whether the - other can. IOW, TASK X cannot release AX if it fails to acquire B. - -Let's take a look at more complicated example: - - TASK X TASK Y - ------ ------ - acquire B - release B - fork Y - acquire AX - acquire C /* A dependency 'AX -> C' exists */ - release C - release AX held by Y - - where AX, B and C are different lock classes, and a suffix 'X' is - added on crosslocks. - -Does a dependency 'AX -> B' exist? Nope. - -Two waiters are essential to create a dependency. However, waiters for -AX and B to create 'AX -> B' cannot exist at the same time in this -example. Thus the dependency 'AX -> B' cannot be created. - -It would be ideal if the full set of true ones can be considered. But -we can ensure nothing but what actually happened. Relying on what -actually happens at runtime, we can anyway add only true ones, though -they might be a subset of true ones. It's similar to how lockdep works -for typical locks. There might be more true dependencies than what -lockdep has detected in runtime. Lockdep has no choice but to rely on -what actually happens. Crossrelease also relies on it. - -CONCLUSION - -Relying on what actually happens, lockdep can avoid adding false -dependencies. diff --git a/Documentation/locking/locktorture.txt b/Documentation/locking/locktorture.txt index a2ef3a929bf1..6a8df4cd19bf 100644 --- a/Documentation/locking/locktorture.txt +++ b/Documentation/locking/locktorture.txt @@ -57,11 +57,6 @@ torture_type Type of lock to torture. By default, only spinlocks will o "rwsem_lock": read/write down() and up() semaphore pairs. -torture_runnable Start locktorture at boot time in the case where the - module is built into the kernel, otherwise wait for - torture_runnable to be set via sysfs before starting. - By default it will begin once the module is loaded. - ** Torture-framework (RCU + locking) ** diff --git a/Documentation/md/raid5-ppl.txt b/Documentation/md/raid5-ppl.txt index 127072b09363..bfa092589e00 100644 --- a/Documentation/md/raid5-ppl.txt +++ b/Documentation/md/raid5-ppl.txt @@ -39,6 +39,7 @@ case the behavior is the same as in plain raid5. PPL is available for md version-1 metadata and external (specifically IMSM) metadata arrays. It can be enabled using mdadm option --consistency-policy=ppl. -Currently, volatile write-back cache should be disabled on all member drives -when using PPL. Otherwise it cannot guarantee consistency in case of power -failure. +There is a limitation of maximum 64 disks in the array for PPL. It allows to +keep data structures and implementation simple. RAID5 arrays with so many disks +are not likely due to high risk of multiple disks failure. Such restriction +should not be a real life limitation. diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 479ecec80593..a863009849a3 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -227,17 +227,20 @@ There are some minimal guarantees that may be expected of a CPU: (*) On any given CPU, dependent memory accesses will be issued in order, with respect to itself. This means that for: - Q = READ_ONCE(P); smp_read_barrier_depends(); D = READ_ONCE(*Q); + Q = READ_ONCE(P); D = READ_ONCE(*Q); the CPU will issue the following memory operations: Q = LOAD P, D = LOAD *Q - and always in that order. On most systems, smp_read_barrier_depends() - does nothing, but it is required for DEC Alpha. The READ_ONCE() - is required to prevent compiler mischief. Please note that you - should normally use something like rcu_dereference() instead of - open-coding smp_read_barrier_depends(). + and always in that order. However, on DEC Alpha, READ_ONCE() also + emits a memory-barrier instruction, so that a DEC Alpha CPU will + instead issue the following memory operations: + + Q = LOAD P, MEMORY_BARRIER, D = LOAD *Q, MEMORY_BARRIER + + Whether on DEC Alpha or not, the READ_ONCE() also prevents compiler + mischief. (*) Overlapping loads and stores within a particular CPU will appear to be ordered within that CPU. This means that for: @@ -1815,7 +1818,7 @@ The Linux kernel has eight basic CPU memory barriers: GENERAL mb() smp_mb() WRITE wmb() smp_wmb() READ rmb() smp_rmb() - DATA DEPENDENCY read_barrier_depends() smp_read_barrier_depends() + DATA DEPENDENCY READ_ONCE() All memory barriers except the data dependency barriers imply a compiler @@ -2864,7 +2867,10 @@ access depends on a read, not all do, so it may not be relied on. Other CPUs may also have split caches, but must coordinate between the various cachelets for normal memory accesses. The semantics of the Alpha removes the -need for coordination in the absence of memory barriers. +need for hardware coordination in the absence of memory barriers, which +permitted Alpha to sport higher CPU clock rates back in the day. However, +please note that smp_read_barrier_depends() should not be used except in +Alpha arch-specific code and within the READ_ONCE() macro. CACHE COHERENCY VS DMA diff --git a/Documentation/mtd/spi-nor.txt b/Documentation/mtd/spi-nor.txt index 548d6306ebca..da1fbff5a24c 100644 --- a/Documentation/mtd/spi-nor.txt +++ b/Documentation/mtd/spi-nor.txt @@ -60,3 +60,6 @@ The main API is spi_nor_scan(). Before you call the hook, a driver should initialize the necessary fields for spi_nor{}. Please see drivers/mtd/spi-nor/spi-nor.c for detail. Please also refer to fsl-quadspi.c when you want to write a new driver for a SPI NOR controller. +Another API is spi_nor_restore(), this is used to restore the status of SPI +flash chip such as addressing mode. Call it whenever detach the driver from +device or reboot the system. diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index 66e620866245..7d4b15977d61 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -9,6 +9,7 @@ Contents: batman-adv kapi z8530book + msg_zerocopy .. only:: subproject @@ -16,4 +17,3 @@ Contents: ======= * :ref:`genindex` - diff --git a/Documentation/networking/msg_zerocopy.rst b/Documentation/networking/msg_zerocopy.rst index 77f6d7e25cfd..291a01264967 100644 --- a/Documentation/networking/msg_zerocopy.rst +++ b/Documentation/networking/msg_zerocopy.rst @@ -72,6 +72,10 @@ this flag, a process must first signal intent by setting a socket option: if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one))) error(1, errno, "setsockopt zerocopy"); +Setting the socket option only works when the socket is in its initial +(TCP_CLOSED) state. Trying to set the option for a socket returned by accept(), +for example, will lead to an EBUSY error. In this case, the option should be set +to the listening socket and it will be inherited by the accepted sockets. Transmission ------------ diff --git a/Documentation/perf/arm_dsu_pmu.txt b/Documentation/perf/arm_dsu_pmu.txt new file mode 100644 index 000000000000..d611e15f5add --- /dev/null +++ b/Documentation/perf/arm_dsu_pmu.txt @@ -0,0 +1,28 @@ +ARM DynamIQ Shared Unit (DSU) PMU +================================== + +ARM DynamIQ Shared Unit integrates one or more cores with an L3 memory system, +control logic and external interfaces to form a multicore cluster. The PMU +allows counting the various events related to the L3 cache, Snoop Control Unit +etc, using 32bit independent counters. It also provides a 64bit cycle counter. + +The PMU can only be accessed via CPU system registers and are common to the +cores connected to the same DSU. Like most of the other uncore PMUs, DSU +PMU doesn't support process specific events and cannot be used in sampling mode. + +The DSU provides a bitmap for a subset of implemented events via hardware +registers. There is no way for the driver to determine if the other events +are available or not. Hence the driver exposes only those events advertised +by the DSU, in "events" directory under : + + /sys/bus/event_sources/devices/arm_dsu_<N>/ + +The user should refer to the TRM of the product to figure out the supported events +and use the raw event code for the unlisted events. + +The driver also exposes the CPUs connected to the DSU instance in "associated_cpus". + + +e.g usage : + + perf stat -a -e arm_dsu_0/cycles/ diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt index 704cd36079b8..8eaf9ee24d43 100644 --- a/Documentation/power/pci.txt +++ b/Documentation/power/pci.txt @@ -994,6 +994,17 @@ into D0 going forward), but if it is in runtime suspend in pci_pm_thaw_noirq(), the function will set the power.direct_complete flag for it (to make the PM core skip the subsequent "thaw" callbacks for it) and return. +Setting the DPM_FLAG_LEAVE_SUSPENDED flag means that the driver prefers the +device to be left in suspend after system-wide transitions to the working state. +This flag is checked by the PM core, but the PCI bus type informs the PM core +which devices may be left in suspend from its perspective (that happens during +the "noirq" phase of system-wide suspend and analogous transitions) and next it +uses the dev_pm_may_skip_resume() helper to decide whether or not to return from +pci_pm_resume_noirq() early, as the PM core will skip the remaining resume +callbacks for the device during the transition under way and will set its +runtime PM status to "suspended" if dev_pm_may_skip_resume() returns "true" for +it. + 3.2. Device Runtime Power Management ------------------------------------ In addition to providing device power management callbacks PCI device drivers diff --git a/Documentation/power/regulator/machine.txt b/Documentation/power/regulator/machine.txt index 757e3b53dc11..eff4dcaaa252 100644 --- a/Documentation/power/regulator/machine.txt +++ b/Documentation/power/regulator/machine.txt @@ -23,16 +23,12 @@ struct regulator_consumer_supply { e.g. for the machine above static struct regulator_consumer_supply regulator1_consumers[] = { -{ - .dev_name = "dev_name(consumer B)", - .supply = "Vcc", -},}; + REGULATOR_SUPPLY("Vcc", "consumer B"), +}; static struct regulator_consumer_supply regulator2_consumers[] = { -{ - .dev = "dev_name(consumer A"), - .supply = "Vcc", -},}; + REGULATOR_SUPPLY("Vcc", "consumer A"), +}; This maps Regulator-1 to the 'Vcc' supply for Consumer B and maps Regulator-2 to the 'Vcc' supply for Consumer A. @@ -78,20 +74,20 @@ static struct regulator_init_data regulator2_data = { Finally the regulator devices must be registered in the usual manner. static struct platform_device regulator_devices[] = { -{ - .name = "regulator", - .id = DCDC_1, - .dev = { - .platform_data = ®ulator1_data, + { + .name = "regulator", + .id = DCDC_1, + .dev = { + .platform_data = ®ulator1_data, + }, }, -}, -{ - .name = "regulator", - .id = DCDC_2, - .dev = { - .platform_data = ®ulator2_data, + { + .name = "regulator", + .id = DCDC_2, + .dev = { + .platform_data = ®ulator2_data, + }, }, -}, }; /* register regulator 1 device */ platform_device_register(®ulator_devices[0]); diff --git a/Documentation/process/kernel-enforcement-statement.rst b/Documentation/process/kernel-enforcement-statement.rst index b3170671a1df..bfa6a78103d8 100644 --- a/Documentation/process/kernel-enforcement-statement.rst +++ b/Documentation/process/kernel-enforcement-statement.rst @@ -118,6 +118,7 @@ we might work for today, have in the past, or will in the future. - Mike Marshall - Chris Mason - Paul E. McKenney + - Arnaldo Carvalho de Melo - David S. Miller - Ingo Molnar - Kuninori Morimoto diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt index 71653584cd03..7df567eaea1a 100644 --- a/Documentation/thermal/cpu-cooling-api.txt +++ b/Documentation/thermal/cpu-cooling-api.txt @@ -26,39 +26,16 @@ the user. The registration APIs returns the cooling device pointer. clip_cpus: cpumask of cpus where the frequency constraints will happen. 1.1.2 struct thermal_cooling_device *of_cpufreq_cooling_register( - struct device_node *np, const struct cpumask *clip_cpus) + struct cpufreq_policy *policy) This interface function registers the cpufreq cooling device with the name "thermal-cpufreq-%x" linking it with a device tree node, in order to bind it via the thermal DT code. This api can support multiple instances of cpufreq cooling devices. - np: pointer to the cooling device device tree node - clip_cpus: cpumask of cpus where the frequency constraints will happen. + policy: CPUFreq policy. -1.1.3 struct thermal_cooling_device *cpufreq_power_cooling_register( - const struct cpumask *clip_cpus, u32 capacitance, - get_static_t plat_static_func) - -Similar to cpufreq_cooling_register, this function registers a cpufreq -cooling device. Using this function, the cooling device will -implement the power extensions by using a simple cpu power model. The -cpus must have registered their OPPs using the OPP library. - -The additional parameters are needed for the power model (See 2. Power -models). "capacitance" is the dynamic power coefficient (See 2.1 -Dynamic power). "plat_static_func" is a function to calculate the -static power consumed by these cpus (See 2.2 Static power). - -1.1.4 struct thermal_cooling_device *of_cpufreq_power_cooling_register( - struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance, - get_static_t plat_static_func) - -Similar to cpufreq_power_cooling_register, this function register a -cpufreq cooling device with power extensions using the device tree -information supplied by the np parameter. - -1.1.5 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) +1.1.3 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) This interface function unregisters the "thermal-cpufreq-%x" cooling device. @@ -67,20 +44,14 @@ information supplied by the np parameter. 2. Power models The power API registration functions provide a simple power model for -CPUs. The current power is calculated as dynamic + (optionally) -static power. This power model requires that the operating-points of +CPUs. The current power is calculated as dynamic power (static power isn't +supported currently). This power model requires that the operating-points of the CPUs are registered using the kernel's opp library and the `cpufreq_frequency_table` is assigned to the `struct device` of the cpu. If you are using CONFIG_CPUFREQ_DT then the `cpufreq_frequency_table` should already be assigned to the cpu device. -The `plat_static_func` parameter of `cpufreq_power_cooling_register()` -and `of_cpufreq_power_cooling_register()` is optional. If you don't -provide it, only dynamic power will be considered. - -2.1 Dynamic power - The dynamic power consumption of a processor depends on many factors. For a given processor implementation the primary factors are: @@ -119,79 +90,3 @@ mW/MHz/uVolt^2. Typical values for mobile CPUs might lie in range from 100 to 500. For reference, the approximate values for the SoC in ARM's Juno Development Platform are 530 for the Cortex-A57 cluster and 140 for the Cortex-A53 cluster. - - -2.2 Static power - -Static leakage power consumption depends on a number of factors. For a -given circuit implementation the primary factors are: - -- Time the circuit spends in each 'power state' -- Temperature -- Operating voltage -- Process grade - -The time the circuit spends in each 'power state' for a given -evaluation period at first order means OFF or ON. However, -'retention' states can also be supported that reduce power during -inactive periods without loss of context. - -Note: The visibility of state entries to the OS can vary, according to -platform specifics, and this can then impact the accuracy of a model -based on OS state information alone. It might be possible in some -cases to extract more accurate information from system resources. - -The temperature, operating voltage and process 'grade' (slow to fast) -of the circuit are all significant factors in static leakage power -consumption. All of these have complex relationships to static power. - -Circuit implementation specific factors include the chosen silicon -process as well as the type, number and size of transistors in both -the logic gates and any RAM elements included. - -The static power consumption modelling must take into account the -power managed regions that are implemented. Taking the example of an -ARM processor cluster, the modelling would take into account whether -each CPU can be powered OFF separately or if only a single power -region is implemented for the complete cluster. - -In one view, there are others, a static power consumption model can -then start from a set of reference values for each power managed -region (e.g. CPU, Cluster/L2) in each state (e.g. ON, OFF) at an -arbitrary process grade, voltage and temperature point. These values -are then scaled for all of the following: the time in each state, the -process grade, the current temperature and the operating voltage. -However, since both implementation specific and complex relationships -dominate the estimate, the appropriate interface to the model from the -cpu cooling device is to provide a function callback that calculates -the static power in this platform. When registering the cpu cooling -device pass a function pointer that follows the `get_static_t` -prototype: - - int plat_get_static(cpumask_t *cpumask, int interval, - unsigned long voltage, u32 &power); - -`cpumask` is the cpumask of the cpus involved in the calculation. -`voltage` is the voltage at which they are operating. The function -should calculate the average static power for the last `interval` -milliseconds. It returns 0 on success, -E* on error. If it -succeeds, it should store the static power in `power`. Reading the -temperature of the cpus described by `cpumask` is left for -plat_get_static() to do as the platform knows best which thermal -sensor is closest to the cpu. - -If `plat_static_func` is NULL, static power is considered to be -negligible for this platform and only dynamic power is considered. - -The platform specific callback can then use any combination of tables -and/or equations to permute the estimated value. Process grade -information is not passed to the model since access to such data, from -on-chip measurement capability or manufacture time data, is platform -specific. - -Note: the significance of static power for CPUs in comparison to -dynamic power is highly dependent on implementation. Given the -potential complexity in implementation, the importance and accuracy of -its inclusion when using cpu cooling devices should be assessed on a -case by case basis. - diff --git a/Documentation/usb/gadget-testing.txt b/Documentation/usb/gadget-testing.txt index 441a4b9b666f..5908a21fddb6 100644 --- a/Documentation/usb/gadget-testing.txt +++ b/Documentation/usb/gadget-testing.txt @@ -693,7 +693,7 @@ such specification consists of a number of lines with an inverval value in each line. The rules stated above are best illustrated with an example: # mkdir functions/uvc.usb0/control/header/h -# cd functions/uvc.usb0/control/header/h +# cd functions/uvc.usb0/control/ # ln -s header/h class/fs # ln -s header/h class/ss # mkdir -p functions/uvc.usb0/streaming/uncompressed/u/360p diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 57d3ee9e4bde..fc3ae951bc07 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -3403,6 +3403,52 @@ invalid, if invalid pages are written to (e.g. after the end of memory) or if no page table is present for the addresses (e.g. when using hugepages). +4.108 KVM_PPC_GET_CPU_CHAR + +Capability: KVM_CAP_PPC_GET_CPU_CHAR +Architectures: powerpc +Type: vm ioctl +Parameters: struct kvm_ppc_cpu_char (out) +Returns: 0 on successful completion + -EFAULT if struct kvm_ppc_cpu_char cannot be written + +This ioctl gives userspace information about certain characteristics +of the CPU relating to speculative execution of instructions and +possible information leakage resulting from speculative execution (see +CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754). The information is +returned in struct kvm_ppc_cpu_char, which looks like this: + +struct kvm_ppc_cpu_char { + __u64 character; /* characteristics of the CPU */ + __u64 behaviour; /* recommended software behaviour */ + __u64 character_mask; /* valid bits in character */ + __u64 behaviour_mask; /* valid bits in behaviour */ +}; + +For extensibility, the character_mask and behaviour_mask fields +indicate which bits of character and behaviour have been filled in by +the kernel. If the set of defined bits is extended in future then +userspace will be able to tell whether it is running on a kernel that +knows about the new bits. + +The character field describes attributes of the CPU which can help +with preventing inadvertent information disclosure - specifically, +whether there is an instruction to flash-invalidate the L1 data cache +(ori 30,30,0 or mtspr SPRN_TRIG2,rN), whether the L1 data cache is set +to a mode where entries can only be used by the thread that created +them, whether the bcctr[l] instruction prevents speculation, and +whether a speculation barrier instruction (ori 31,31,0) is provided. + +The behaviour field describes actions that software should take to +prevent inadvertent information disclosure, and thus describes which +vulnerabilities the hardware is subject to; specifically whether the +L1 data cache should be flushed when returning to user mode from the +kernel, and whether a speculation barrier should be placed between an +array bounds check and the array access. + +These fields use the same bit definitions as the new +H_GET_CPU_CHARACTERISTICS hypercall. + 5. The kvm_run structure ------------------------ diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt index 89fff7d611cc..0b3a1148f9f0 100644 --- a/Documentation/vm/zswap.txt +++ b/Documentation/vm/zswap.txt @@ -98,5 +98,25 @@ request is made for a page in an old zpool, it is uncompressed using its original compressor. Once all pages are removed from an old zpool, the zpool and its compressor are freed. +Some of the pages in zswap are same-value filled pages (i.e. contents of the +page have same value or repetitive pattern). These pages include zero-filled +pages and they are handled differently. During store operation, a page is +checked if it is a same-value filled page before compressing it. If true, the +compressed length of the page is set to zero and the pattern or same-filled +value is stored. + +Same-value filled pages identification feature is enabled by default and can be +disabled at boot time by setting the "same_filled_pages_enabled" attribute to 0, +e.g. zswap.same_filled_pages_enabled=0. It can also be enabled and disabled at +runtime using the sysfs "same_filled_pages_enabled" attribute, e.g. + +echo 1 > /sys/module/zswap/parameters/same_filled_pages_enabled + +When zswap same-filled page identification is disabled at runtime, it will stop +checking for the same-value filled pages during store operation. However, the +existing pages which are marked as same-value filled pages remain stored +unchanged in zswap until they are either loaded or invalidated. + A debugfs interface is provided for various statistic about pool size, number -of pages stored, and various counters for the reasons pages are rejected. +of pages stored, same-value filled pages and various counters for the reasons +pages are rejected. diff --git a/Documentation/w1/masters/w1-gpio b/Documentation/w1/masters/w1-gpio index af5d3b4aa851..623961d9e83f 100644 --- a/Documentation/w1/masters/w1-gpio +++ b/Documentation/w1/masters/w1-gpio @@ -8,17 +8,27 @@ Description ----------- GPIO 1-wire bus master driver. The driver uses the GPIO API to control the -wire and the GPIO pin can be specified using platform data. +wire and the GPIO pin can be specified using GPIO machine descriptor tables. +It is also possible to define the master using device tree, see +Documentation/devicetree/bindings/w1/w1-gpio.txt Example (mach-at91) ------------------- +#include <linux/gpio/machine.h> #include <linux/w1-gpio.h> +static struct gpiod_lookup_table foo_w1_gpiod_table = { + .dev_id = "w1-gpio", + .table = { + GPIO_LOOKUP_IDX("at91-gpio", AT91_PIN_PB20, NULL, 0, + GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN), + }, +}; + static struct w1_gpio_platform_data foo_w1_gpio_pdata = { - .pin = AT91_PIN_PB20, - .is_open_drain = 1, + .ext_pullup_enable_pin = -EINVAL, }; static struct platform_device foo_w1_device = { @@ -30,4 +40,5 @@ static struct platform_device foo_w1_device = { ... at91_set_GPIO_periph(foo_w1_gpio_pdata.pin, 1); at91_set_multi_drive(foo_w1_gpio_pdata.pin, 1); + gpiod_add_lookup_table(&foo_w1_gpiod_table); platform_device_register(&foo_w1_device); diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt index 6851854cf69d..756fd76b78a6 100644 --- a/Documentation/x86/intel_rdt_ui.txt +++ b/Documentation/x86/intel_rdt_ui.txt @@ -7,15 +7,24 @@ Tony Luck <tony.luck@intel.com> Vikas Shivappa <vikas.shivappa@intel.com> This feature is enabled by the CONFIG_INTEL_RDT Kconfig and the -X86 /proc/cpuinfo flag bits "rdt", "cqm", "cat_l3" and "cdp_l3". +X86 /proc/cpuinfo flag bits: +RDT (Resource Director Technology) Allocation - "rdt_a" +CAT (Cache Allocation Technology) - "cat_l3", "cat_l2" +CDP (Code and Data Prioritization ) - "cdp_l3", "cdp_l2" +CQM (Cache QoS Monitoring) - "cqm_llc", "cqm_occup_llc" +MBM (Memory Bandwidth Monitoring) - "cqm_mbm_total", "cqm_mbm_local" +MBA (Memory Bandwidth Allocation) - "mba" To use the feature mount the file system: - # mount -t resctrl resctrl [-o cdp] /sys/fs/resctrl + # mount -t resctrl resctrl [-o cdp[,cdpl2]] /sys/fs/resctrl mount options are: "cdp": Enable code/data prioritization in L3 cache allocations. +"cdpl2": Enable code/data prioritization in L2 cache allocations. + +L2 and L3 CDP are controlled seperately. RDT features are orthogonal. A particular system may support only monitoring, only control, or both monitoring and control. diff --git a/Documentation/x86/pti.txt b/Documentation/x86/pti.txt new file mode 100644 index 000000000000..5cd58439ad2d --- /dev/null +++ b/Documentation/x86/pti.txt @@ -0,0 +1,186 @@ +Overview +======== + +Page Table Isolation (pti, previously known as KAISER[1]) is a +countermeasure against attacks on the shared user/kernel address +space such as the "Meltdown" approach[2]. + +To mitigate this class of attacks, we create an independent set of +page tables for use only when running userspace applications. When +the kernel is entered via syscalls, interrupts or exceptions, the +page tables are switched to the full "kernel" copy. When the system +switches back to user mode, the user copy is used again. + +The userspace page tables contain only a minimal amount of kernel +data: only what is needed to enter/exit the kernel such as the +entry/exit functions themselves and the interrupt descriptor table +(IDT). There are a few strictly unnecessary things that get mapped +such as the first C function when entering an interrupt (see +comments in pti.c). + +This approach helps to ensure that side-channel attacks leveraging +the paging structures do not function when PTI is enabled. It can be +enabled by setting CONFIG_PAGE_TABLE_ISOLATION=y at compile time. +Once enabled at compile-time, it can be disabled at boot with the +'nopti' or 'pti=' kernel parameters (see kernel-parameters.txt). + +Page Table Management +===================== + +When PTI is enabled, the kernel manages two sets of page tables. +The first set is very similar to the single set which is present in +kernels without PTI. This includes a complete mapping of userspace +that the kernel can use for things like copy_to_user(). + +Although _complete_, the user portion of the kernel page tables is +crippled by setting the NX bit in the top level. This ensures +that any missed kernel->user CR3 switch will immediately crash +userspace upon executing its first instruction. + +The userspace page tables map only the kernel data needed to enter +and exit the kernel. This data is entirely contained in the 'struct +cpu_entry_area' structure which is placed in the fixmap which gives +each CPU's copy of the area a compile-time-fixed virtual address. + +For new userspace mappings, the kernel makes the entries in its +page tables like normal. The only difference is when the kernel +makes entries in the top (PGD) level. In addition to setting the +entry in the main kernel PGD, a copy of the entry is made in the +userspace page tables' PGD. + +This sharing at the PGD level also inherently shares all the lower +layers of the page tables. This leaves a single, shared set of +userspace page tables to manage. One PTE to lock, one set of +accessed bits, dirty bits, etc... + +Overhead +======== + +Protection against side-channel attacks is important. But, +this protection comes at a cost: + +1. Increased Memory Use + a. Each process now needs an order-1 PGD instead of order-0. + (Consumes an additional 4k per process). + b. The 'cpu_entry_area' structure must be 2MB in size and 2MB + aligned so that it can be mapped by setting a single PMD + entry. This consumes nearly 2MB of RAM once the kernel + is decompressed, but no space in the kernel image itself. + +2. Runtime Cost + a. CR3 manipulation to switch between the page table copies + must be done at interrupt, syscall, and exception entry + and exit (it can be skipped when the kernel is interrupted, + though.) Moves to CR3 are on the order of a hundred + cycles, and are required at every entry and exit. + b. A "trampoline" must be used for SYSCALL entry. This + trampoline depends on a smaller set of resources than the + non-PTI SYSCALL entry code, so requires mapping fewer + things into the userspace page tables. The downside is + that stacks must be switched at entry time. + c. Global pages are disabled for all kernel structures not + mapped into both kernel and userspace page tables. This + feature of the MMU allows different processes to share TLB + entries mapping the kernel. Losing the feature means more + TLB misses after a context switch. The actual loss of + performance is very small, however, never exceeding 1%. + d. Process Context IDentifiers (PCID) is a CPU feature that + allows us to skip flushing the entire TLB when switching page + tables by setting a special bit in CR3 when the page tables + are changed. This makes switching the page tables (at context + switch, or kernel entry/exit) cheaper. But, on systems with + PCID support, the context switch code must flush both the user + and kernel entries out of the TLB. The user PCID TLB flush is + deferred until the exit to userspace, minimizing the cost. + See intel.com/sdm for the gory PCID/INVPCID details. + e. The userspace page tables must be populated for each new + process. Even without PTI, the shared kernel mappings + are created by copying top-level (PGD) entries into each + new process. But, with PTI, there are now *two* kernel + mappings: one in the kernel page tables that maps everything + and one for the entry/exit structures. At fork(), we need to + copy both. + f. In addition to the fork()-time copying, there must also + be an update to the userspace PGD any time a set_pgd() is done + on a PGD used to map userspace. This ensures that the kernel + and userspace copies always map the same userspace + memory. + g. On systems without PCID support, each CR3 write flushes + the entire TLB. That means that each syscall, interrupt + or exception flushes the TLB. + h. INVPCID is a TLB-flushing instruction which allows flushing + of TLB entries for non-current PCIDs. Some systems support + PCIDs, but do not support INVPCID. On these systems, addresses + can only be flushed from the TLB for the current PCID. When + flushing a kernel address, we need to flush all PCIDs, so a + single kernel address flush will require a TLB-flushing CR3 + write upon the next use of every PCID. + +Possible Future Work +==================== +1. We can be more careful about not actually writing to CR3 + unless its value is actually changed. +2. Allow PTI to be enabled/disabled at runtime in addition to the + boot-time switching. + +Testing +======== + +To test stability of PTI, the following test procedure is recommended, +ideally doing all of these in parallel: + +1. Set CONFIG_DEBUG_ENTRY=y +2. Run several copies of all of the tools/testing/selftests/x86/ tests + (excluding MPX and protection_keys) in a loop on multiple CPUs for + several minutes. These tests frequently uncover corner cases in the + kernel entry code. In general, old kernels might cause these tests + themselves to crash, but they should never crash the kernel. +3. Run the 'perf' tool in a mode (top or record) that generates many + frequent performance monitoring non-maskable interrupts (see "NMI" + in /proc/interrupts). This exercises the NMI entry/exit code which + is known to trigger bugs in code paths that did not expect to be + interrupted, including nested NMIs. Using "-c" boosts the rate of + NMIs, and using two -c with separate counters encourages nested NMIs + and less deterministic behavior. + + while true; do perf record -c 10000 -e instructions,cycles -a sleep 10; done + +4. Launch a KVM virtual machine. +5. Run 32-bit binaries on systems supporting the SYSCALL instruction. + This has been a lightly-tested code path and needs extra scrutiny. + +Debugging +========= + +Bugs in PTI cause a few different signatures of crashes +that are worth noting here. + + * Failures of the selftests/x86 code. Usually a bug in one of the + more obscure corners of entry_64.S + * Crashes in early boot, especially around CPU bringup. Bugs + in the trampoline code or mappings cause these. + * Crashes at the first interrupt. Caused by bugs in entry_64.S, + like screwing up a page table switch. Also caused by + incorrectly mapping the IRQ handler entry code. + * Crashes at the first NMI. The NMI code is separate from main + interrupt handlers and can have bugs that do not affect + normal interrupts. Also caused by incorrectly mapping NMI + code. NMIs that interrupt the entry code must be very + careful and can be the cause of crashes that show up when + running perf. + * Kernel crashes at the first exit to userspace. entry_64.S + bugs, or failing to map some of the exit code. + * Crashes at first interrupt that interrupts userspace. The paths + in entry_64.S that return to userspace are sometimes separate + from the ones that return to the kernel. + * Double faults: overflowing the kernel stack because of page + faults upon page faults. Caused by touching non-pti-mapped + data in the entry code, or forgetting to switch to kernel + CR3 before calling into C functions which are not pti-mapped. + * Userspace segfaults early in boot, sometimes manifesting + as mount(8) failing to mount the rootfs. These have + tended to be TLB invalidation issues. Usually invalidating + the wrong PCID, or otherwise missing an invalidation. + +1. https://gruss.cc/files/kaiser.pdf +2. https://meltdownattack.com/meltdown.pdf diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt index 3448e675b462..ea91cb61a602 100644 --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -1,6 +1,4 @@ -<previous description obsolete, deleted> - Virtual memory map with 4 level page tables: 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm @@ -14,13 +12,17 @@ ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB) ... unused hole ... ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB) ... unused hole ... + vaddr_end for KASLR +fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping +fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks ... unused hole ... ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space ... unused hole ... ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0 -ffffffffa0000000 - ffffffffff5fffff (=1526 MB) module mapping space (variable) -ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls +ffffffffa0000000 - [fixmap start] (~1526 MB) module mapping space (variable) +[fixmap start] - ffffffffff5fffff kernel-internal fixmap range +ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole Virtual memory map with 5 level page tables: @@ -29,26 +31,31 @@ Virtual memory map with 5 level page tables: hole caused by [56:63] sign extension ff00000000000000 - ff0fffffffffffff (=52 bits) guard hole, reserved for hypervisor ff10000000000000 - ff8fffffffffffff (=55 bits) direct mapping of all phys. memory -ff90000000000000 - ff91ffffffffffff (=49 bits) hole -ff92000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space +ff90000000000000 - ff9fffffffffffff (=52 bits) LDT remap for PTI +ffa0000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space (12800 TB) ffd2000000000000 - ffd3ffffffffffff (=49 bits) hole ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB) ... unused hole ... ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB) ... unused hole ... + vaddr_end for KASLR +fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping +... unused hole ... ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks ... unused hole ... ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space ... unused hole ... ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0 -ffffffffa0000000 - ffffffffff5fffff (=1526 MB) module mapping space -ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls +ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space +[fixmap start] - ffffffffff5fffff kernel-internal fixmap range +ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole Architecture defines a 64-bit virtual address. Implementations can support less. Currently supported are 48- and 57-bit virtual addresses. Bits 63 -through to the most-significant implemented bit are set to either all ones -or all zero. This causes hole between user space and kernel addresses. +through to the most-significant implemented bit are sign extended. +This causes hole between user space and kernel addresses if you interpret them +as unsigned. The direct mapping covers all memory in the system up to the highest memory address (this means in some cases it can also include PCI memory @@ -58,19 +65,15 @@ vmalloc space is lazily synchronized into the different PML4/PML5 pages of the processes using the page fault handler, with init_top_pgt as reference. -Current X86-64 implementations support up to 46 bits of address space (64 TB), -which is our current limit. This expands into MBZ space in the page tables. - We map EFI runtime services in the 'efi_pgd' PGD in a 64Gb large virtual memory window (this size is arbitrary, it can be raised later if needed). The mappings are not part of any other kernel PGD and are only available during EFI runtime calls. -The module mapping space size changes based on the CONFIG requirements for the -following fixmap section. - Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all physical memory, vmalloc/ioremap space and virtual memory map are randomized. Their order is preserved but their base will be offset early at boot time. --Andi Kleen, Jul 2004 +Be very careful vs. KASLR when changing anything here. The KASLR address +range must not overlap with anything except the KASAN shadow area, which is +correct as KASAN disables KASLR. diff --git a/Documentation/xtensa/mmu.txt b/Documentation/xtensa/mmu.txt index 5de8715d5bec..318114de63f3 100644 --- a/Documentation/xtensa/mmu.txt +++ b/Documentation/xtensa/mmu.txt @@ -69,19 +69,10 @@ Default MMUv2-compatible layout. | Userspace | 0x00000000 TASK_SIZE +------------------+ 0x40000000 +------------------+ -| Page table | 0x80000000 -+------------------+ 0x80400000 +| Page table | XCHAL_PAGE_TABLE_VADDR 0x80000000 XCHAL_PAGE_TABLE_SIZE +------------------+ -| KMAP area | PKMAP_BASE PTRS_PER_PTE * -| | DCACHE_N_COLORS * -| | PAGE_SIZE -| | (4MB * DCACHE_N_COLORS) -+------------------+ -| Atomic KMAP area | FIXADDR_START KM_TYPE_NR * -| | NR_CPUS * -| | DCACHE_N_COLORS * -| | PAGE_SIZE -+------------------+ FIXADDR_TOP 0xbffff000 +| KASAN shadow map | KASAN_SHADOW_START 0x80400000 KASAN_SHADOW_SIZE ++------------------+ 0x8e400000 +------------------+ | VMALLOC area | VMALLOC_START 0xc0000000 128MB - 64KB +------------------+ VMALLOC_END @@ -92,6 +83,17 @@ Default MMUv2-compatible layout. | remap area 2 | +------------------+ +------------------+ +| KMAP area | PKMAP_BASE PTRS_PER_PTE * +| | DCACHE_N_COLORS * +| | PAGE_SIZE +| | (4MB * DCACHE_N_COLORS) ++------------------+ +| Atomic KMAP area | FIXADDR_START KM_TYPE_NR * +| | NR_CPUS * +| | DCACHE_N_COLORS * +| | PAGE_SIZE ++------------------+ FIXADDR_TOP 0xcffff000 ++------------------+ | Cached KSEG | XCHAL_KSEG_CACHED_VADDR 0xd0000000 128MB +------------------+ | Uncached KSEG | XCHAL_KSEG_BYPASS_VADDR 0xd8000000 128MB @@ -109,19 +111,10 @@ Default MMUv2-compatible layout. | Userspace | 0x00000000 TASK_SIZE +------------------+ 0x40000000 +------------------+ -| Page table | 0x80000000 -+------------------+ 0x80400000 +| Page table | XCHAL_PAGE_TABLE_VADDR 0x80000000 XCHAL_PAGE_TABLE_SIZE +------------------+ -| KMAP area | PKMAP_BASE PTRS_PER_PTE * -| | DCACHE_N_COLORS * -| | PAGE_SIZE -| | (4MB * DCACHE_N_COLORS) -+------------------+ -| Atomic KMAP area | FIXADDR_START KM_TYPE_NR * -| | NR_CPUS * -| | DCACHE_N_COLORS * -| | PAGE_SIZE -+------------------+ FIXADDR_TOP 0x9ffff000 +| KASAN shadow map | KASAN_SHADOW_START 0x80400000 KASAN_SHADOW_SIZE ++------------------+ 0x8e400000 +------------------+ | VMALLOC area | VMALLOC_START 0xa0000000 128MB - 64KB +------------------+ VMALLOC_END @@ -132,6 +125,17 @@ Default MMUv2-compatible layout. | remap area 2 | +------------------+ +------------------+ +| KMAP area | PKMAP_BASE PTRS_PER_PTE * +| | DCACHE_N_COLORS * +| | PAGE_SIZE +| | (4MB * DCACHE_N_COLORS) ++------------------+ +| Atomic KMAP area | FIXADDR_START KM_TYPE_NR * +| | NR_CPUS * +| | DCACHE_N_COLORS * +| | PAGE_SIZE ++------------------+ FIXADDR_TOP 0xaffff000 ++------------------+ | Cached KSEG | XCHAL_KSEG_CACHED_VADDR 0xb0000000 256MB +------------------+ | Uncached KSEG | XCHAL_KSEG_BYPASS_VADDR 0xc0000000 256MB @@ -150,19 +154,10 @@ Default MMUv2-compatible layout. | Userspace | 0x00000000 TASK_SIZE +------------------+ 0x40000000 +------------------+ -| Page table | 0x80000000 -+------------------+ 0x80400000 +| Page table | XCHAL_PAGE_TABLE_VADDR 0x80000000 XCHAL_PAGE_TABLE_SIZE +------------------+ -| KMAP area | PKMAP_BASE PTRS_PER_PTE * -| | DCACHE_N_COLORS * -| | PAGE_SIZE -| | (4MB * DCACHE_N_COLORS) -+------------------+ -| Atomic KMAP area | FIXADDR_START KM_TYPE_NR * -| | NR_CPUS * -| | DCACHE_N_COLORS * -| | PAGE_SIZE -+------------------+ FIXADDR_TOP 0x8ffff000 +| KASAN shadow map | KASAN_SHADOW_START 0x80400000 KASAN_SHADOW_SIZE ++------------------+ 0x8e400000 +------------------+ | VMALLOC area | VMALLOC_START 0x90000000 128MB - 64KB +------------------+ VMALLOC_END @@ -173,6 +168,17 @@ Default MMUv2-compatible layout. | remap area 2 | +------------------+ +------------------+ +| KMAP area | PKMAP_BASE PTRS_PER_PTE * +| | DCACHE_N_COLORS * +| | PAGE_SIZE +| | (4MB * DCACHE_N_COLORS) ++------------------+ +| Atomic KMAP area | FIXADDR_START KM_TYPE_NR * +| | NR_CPUS * +| | DCACHE_N_COLORS * +| | PAGE_SIZE ++------------------+ FIXADDR_TOP 0x9ffff000 ++------------------+ | Cached KSEG | XCHAL_KSEG_CACHED_VADDR 0xa0000000 512MB +------------------+ | Uncached KSEG | XCHAL_KSEG_BYPASS_VADDR 0xc0000000 512MB |