<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/ioport.h, branch v5.10-rc2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>kernel/resource: make iomem_resource implicit in release_mem_region_adjustable()</title>
<updated>2020-10-16T18:11:18+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2020-10-16T03:09:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cb8e3c8b4f45e4ed8987a581956dc9c3827a5bcf'/>
<id>cb8e3c8b4f45e4ed8987a581956dc9c3827a5bcf</id>
<content type='text'>
"mem" in the name already indicates the root, similar to
release_mem_region() and devm_request_mem_region().  Make it implicit.
The only single caller always passes iomem_resource, other parents are not
applicable.

Suggested-by: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200916073041.10355-1-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
"mem" in the name already indicates the root, similar to
release_mem_region() and devm_request_mem_region().  Make it implicit.
The only single caller always passes iomem_resource, other parents are not
applicable.

Suggested-by: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200916073041.10355-1-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System RAM resources</title>
<updated>2020-10-16T18:11:18+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2020-10-16T03:08:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9ca6551ee24368a4d2b09566ea4d10fe87860379'/>
<id>9ca6551ee24368a4d2b09566ea4d10fe87860379</id>
<content type='text'>
Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.

This can quickly result in a lot of memory resources, whereby the actual
resource boundaries are not of interest (e.g., it might be relevant for
DIMMs, exposed via /proc/iomem to user space).  We really want to merge
added resources in this scenario where possible.

Let's provide a flag (MEMHP_MERGE_RESOURCE) to specify that a resource
either created within add_memory*() or passed via add_memory_resource()
shall be marked mergeable and merged with applicable siblings.

To implement that, we need a kernel/resource interface to mark selected
System RAM resources mergeable (IORESOURCE_SYSRAM_MERGEABLE) and trigger
merging.

Note: We really want to merge after the whole operation succeeded, not
directly when adding a resource to the resource tree (it would break
add_memory_resource() and require splitting resources again when the
operation failed - e.g., due to -ENOMEM).

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Wei Liu &lt;wei.liu@kernel.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Roger Pau Monné &lt;roger.pau@citrix.com&gt;
Cc: Julien Grall &lt;julien@xen.org&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Wei Yang &lt;richardw.yang@linux.intel.com&gt;
Cc: Anton Blanchard &lt;anton@ozlabs.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras.c@gmail.com&gt;
Cc: Libor Pechacek &lt;lpechacek@suse.cz&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: "Oliver O'Halloran" &lt;oohall@gmail.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Link: https://lkml.kernel.org/r/20200911103459.10306-6-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.

This can quickly result in a lot of memory resources, whereby the actual
resource boundaries are not of interest (e.g., it might be relevant for
DIMMs, exposed via /proc/iomem to user space).  We really want to merge
added resources in this scenario where possible.

Let's provide a flag (MEMHP_MERGE_RESOURCE) to specify that a resource
either created within add_memory*() or passed via add_memory_resource()
shall be marked mergeable and merged with applicable siblings.

To implement that, we need a kernel/resource interface to mark selected
System RAM resources mergeable (IORESOURCE_SYSRAM_MERGEABLE) and trigger
merging.

Note: We really want to merge after the whole operation succeeded, not
directly when adding a resource to the resource tree (it would break
add_memory_resource() and require splitting resources again when the
operation failed - e.g., due to -ENOMEM).

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Wei Liu &lt;wei.liu@kernel.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Roger Pau Monné &lt;roger.pau@citrix.com&gt;
Cc: Julien Grall &lt;julien@xen.org&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Wei Yang &lt;richardw.yang@linux.intel.com&gt;
Cc: Anton Blanchard &lt;anton@ozlabs.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras.c@gmail.com&gt;
Cc: Libor Pechacek &lt;lpechacek@suse.cz&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: "Oliver O'Halloran" &lt;oohall@gmail.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Link: https://lkml.kernel.org/r/20200911103459.10306-6-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED</title>
<updated>2020-10-16T18:11:18+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2020-10-16T03:08:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7cf603d17d9bddbda90c424b6f30c7bc2e6f48f2'/>
<id>7cf603d17d9bddbda90c424b6f30c7bc2e6f48f2</id>
<content type='text'>
IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is
always set to 0 by hardware.  This is far from beautiful (and confusing),
and the bit only applies to SYSRAM.  So let's move it out of the
bus-specific (PnP) defined bits.

We'll add another SYSRAM specific bit soon.  If we ever need more bits for
other purposes, we can steal some from "desc", or reshuffle/regroup what
we have.

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Wei Yang &lt;richardw.yang@linux.intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Anton Blanchard &lt;anton@ozlabs.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Julien Grall &lt;julien@xen.org&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras.c@gmail.com&gt;
Cc: Libor Pechacek &lt;lpechacek@suse.cz&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: "Oliver O'Halloran" &lt;oohall@gmail.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Roger Pau Monné &lt;roger.pau@citrix.com&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Cc: Wei Liu &lt;wei.liu@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200911103459.10306-3-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is
always set to 0 by hardware.  This is far from beautiful (and confusing),
and the bit only applies to SYSRAM.  So let's move it out of the
bus-specific (PnP) defined bits.

We'll add another SYSRAM specific bit soon.  If we ever need more bits for
other purposes, we can steal some from "desc", or reshuffle/regroup what
we have.

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Wei Yang &lt;richardw.yang@linux.intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Anton Blanchard &lt;anton@ozlabs.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Julien Grall &lt;julien@xen.org&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras.c@gmail.com&gt;
Cc: Libor Pechacek &lt;lpechacek@suse.cz&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: "Oliver O'Halloran" &lt;oohall@gmail.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Roger Pau Monné &lt;roger.pau@citrix.com&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Cc: Wei Liu &lt;wei.liu@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200911103459.10306-3-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/resource: make release_mem_region_adjustable() never fail</title>
<updated>2020-10-16T18:11:17+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2020-10-16T03:08:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ec62d04e3fdc4ba3a7912cd7f6da1a4e787a0d75'/>
<id>ec62d04e3fdc4ba3a7912cd7f6da1a4e787a0d75</id>
<content type='text'>
Patch series "selective merging of system ram resources", v4.

Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.

This can quickly result in a lot of memory resources, whereby the actual
resource boundaries are not of interest (e.g., it might be relevant for
DIMMs, exposed via /proc/iomem to user space).  We really want to merge
added resources in this scenario where possible.

Resources are effectively stored in a list-based tree.  Having a lot of
resources not only wastes memory, it also makes traversing that tree more
expensive, and makes /proc/iomem explode in size (e.g., requiring
kexec-tools to manually merge resources when creating a kdump header.  The
current kexec-tools resource count limit does not allow for more than
~100GB of memory with a memory block size of 128MB on x86-64).

Let's allow to selectively merge system ram resources by specifying a new
flag for add_memory*().  Patch #5 contains a /proc/iomem example.  Only
tested with virtio-mem.

This patch (of 8):

Let's make sure splitting a resource on memory hotunplug will never fail.
This will become more relevant once we merge selected System RAM resources
- then, we'll trigger that case more often on memory hotunplug.

In general, this function is already unlikely to fail.  When we remove
memory, we free up quite a lot of metadata (memmap, page tables, memory
block device, etc.).  The only reason it could really fail would be when
injecting allocation errors.

All other error cases inside release_mem_region_adjustable() seem to be
sanity checks if the function would be abused in different context - let's
add WARN_ON_ONCE() in these cases so we can catch them.

[natechancellor@gmail.com: fix use of ternary condition in release_mem_region_adjustable]
  Link: https://lkml.kernel.org/r/20200922060748.2452056-1-natechancellor@gmail.com
  Link: https://github.com/ClangBuiltLinux/linux/issues/1159

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Wei Yang &lt;richardw.yang@linux.intel.com&gt;
Cc: Anton Blanchard &lt;anton@ozlabs.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Julien Grall &lt;julien@xen.org&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras.c@gmail.com&gt;
Cc: Libor Pechacek &lt;lpechacek@suse.cz&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: "Oliver O'Halloran" &lt;oohall@gmail.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Roger Pau Monn &lt;roger.pau@citrix.com&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Cc: Wei Liu &lt;wei.liu@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200911103459.10306-2-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "selective merging of system ram resources", v4.

Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.

This can quickly result in a lot of memory resources, whereby the actual
resource boundaries are not of interest (e.g., it might be relevant for
DIMMs, exposed via /proc/iomem to user space).  We really want to merge
added resources in this scenario where possible.

Resources are effectively stored in a list-based tree.  Having a lot of
resources not only wastes memory, it also makes traversing that tree more
expensive, and makes /proc/iomem explode in size (e.g., requiring
kexec-tools to manually merge resources when creating a kdump header.  The
current kexec-tools resource count limit does not allow for more than
~100GB of memory with a memory block size of 128MB on x86-64).

Let's allow to selectively merge system ram resources by specifying a new
flag for add_memory*().  Patch #5 contains a /proc/iomem example.  Only
tested with virtio-mem.

This patch (of 8):

Let's make sure splitting a resource on memory hotunplug will never fail.
This will become more relevant once we merge selected System RAM resources
- then, we'll trigger that case more often on memory hotunplug.

In general, this function is already unlikely to fail.  When we remove
memory, we free up quite a lot of metadata (memmap, page tables, memory
block device, etc.).  The only reason it could really fail would be when
injecting allocation errors.

All other error cases inside release_mem_region_adjustable() seem to be
sanity checks if the function would be abused in different context - let's
add WARN_ON_ONCE() in these cases so we can catch them.

[natechancellor@gmail.com: fix use of ternary condition in release_mem_region_adjustable]
  Link: https://lkml.kernel.org/r/20200922060748.2452056-1-natechancellor@gmail.com
  Link: https://github.com/ClangBuiltLinux/linux/issues/1159

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Wei Yang &lt;richardw.yang@linux.intel.com&gt;
Cc: Anton Blanchard &lt;anton@ozlabs.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Julien Grall &lt;julien@xen.org&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras.c@gmail.com&gt;
Cc: Libor Pechacek &lt;lpechacek@suse.cz&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: "Oliver O'Halloran" &lt;oohall@gmail.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Roger Pau Monn &lt;roger.pau@citrix.com&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Cc: Wei Liu &lt;wei.liu@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200911103459.10306-2-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc</title>
<updated>2020-06-07T17:59:32+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-06-07T17:59:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9aa900c8094dba7a60dc805ecec1e9f720744ba1'/>
<id>9aa900c8094dba7a60dc805ecec1e9f720744ba1</id>
<content type='text'>
Pull char/misc driver updates from Greg KH:
 "Here is the large set of char/misc driver patches for 5.8-rc1

  Included in here are:

   - habanalabs driver updates, loads

   - mhi bus driver updates

   - extcon driver updates

   - clk driver updates (approved by the clock maintainer)

   - firmware driver updates

   - fpga driver updates

   - gnss driver updates

   - coresight driver updates

   - interconnect driver updates

   - parport driver updates (it's still alive!)

   - nvmem driver updates

   - soundwire driver updates

   - visorbus driver updates

   - w1 driver updates

   - various misc driver updates

  In short, loads of different driver subsystem updates along with the
  drivers as well.

  All have been in linux-next for a while with no reported issues"

* tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (233 commits)
  habanalabs: correctly cast u64 to void*
  habanalabs: initialize variable to default value
  extcon: arizona: Fix runtime PM imbalance on error
  extcon: max14577: Add proper dt-compatible strings
  extcon: adc-jack: Fix an error handling path in 'adc_jack_probe()'
  extcon: remove redundant assignment to variable idx
  w1: omap-hdq: print dev_err if irq flags are not cleared
  w1: omap-hdq: fix interrupt handling which did show spurious timeouts
  w1: omap-hdq: fix return value to be -1 if there is a timeout
  w1: omap-hdq: cleanup to add missing newline for some dev_dbg
  /dev/mem: Revoke mappings when a driver claims the region
  misc: xilinx-sdfec: convert get_user_pages() --&gt; pin_user_pages()
  misc: xilinx-sdfec: cleanup return value in xsdfec_table_write()
  misc: xilinx-sdfec: improve get_user_pages_fast() error handling
  nvmem: qfprom: remove incorrect write support
  habanalabs: handle MMU cache invalidation timeout
  habanalabs: don't allow hard reset with open processes
  habanalabs: GAUDI does not support soft-reset
  habanalabs: add print for soft reset due to event
  habanalabs: improve MMU cache invalidation code
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull char/misc driver updates from Greg KH:
 "Here is the large set of char/misc driver patches for 5.8-rc1

  Included in here are:

   - habanalabs driver updates, loads

   - mhi bus driver updates

   - extcon driver updates

   - clk driver updates (approved by the clock maintainer)

   - firmware driver updates

   - fpga driver updates

   - gnss driver updates

   - coresight driver updates

   - interconnect driver updates

   - parport driver updates (it's still alive!)

   - nvmem driver updates

   - soundwire driver updates

   - visorbus driver updates

   - w1 driver updates

   - various misc driver updates

  In short, loads of different driver subsystem updates along with the
  drivers as well.

  All have been in linux-next for a while with no reported issues"

* tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (233 commits)
  habanalabs: correctly cast u64 to void*
  habanalabs: initialize variable to default value
  extcon: arizona: Fix runtime PM imbalance on error
  extcon: max14577: Add proper dt-compatible strings
  extcon: adc-jack: Fix an error handling path in 'adc_jack_probe()'
  extcon: remove redundant assignment to variable idx
  w1: omap-hdq: print dev_err if irq flags are not cleared
  w1: omap-hdq: fix interrupt handling which did show spurious timeouts
  w1: omap-hdq: fix return value to be -1 if there is a timeout
  w1: omap-hdq: cleanup to add missing newline for some dev_dbg
  /dev/mem: Revoke mappings when a driver claims the region
  misc: xilinx-sdfec: convert get_user_pages() --&gt; pin_user_pages()
  misc: xilinx-sdfec: cleanup return value in xsdfec_table_write()
  misc: xilinx-sdfec: improve get_user_pages_fast() error handling
  nvmem: qfprom: remove incorrect write support
  habanalabs: handle MMU cache invalidation timeout
  habanalabs: don't allow hard reset with open processes
  habanalabs: GAUDI does not support soft-reset
  habanalabs: add print for soft reset due to event
  habanalabs: improve MMU cache invalidation code
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/memory_hotplug: introduce add_memory_driver_managed()</title>
<updated>2020-06-05T02:06:23+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2020-06-04T23:48:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7b7b27214bba1966772f9213cd2d8e5d67f8487f'/>
<id>7b7b27214bba1966772f9213cd2d8e5d67f8487f</id>
<content type='text'>
Patch series "mm/memory_hotplug: Interface to add driver-managed system
ram", v4.

kexec (via kexec_load()) can currently not properly handle memory added
via dax/kmem, and will have similar issues with virtio-mem.  kexec-tools
will currently add all memory to the fixed-up initial firmware memmap.  In
case of dax/kmem, this means that - in contrast to a proper reboot - how
that persistent memory will be used can no longer be configured by the
kexec'd kernel.  In case of virtio-mem it will be harmful, because that
memory might contain inaccessible pieces that require coordination with
hypervisor first.

In both cases, we want to let the driver in the kexec'd kernel handle
detecting and adding the memory, like during an ordinary reboot.
Introduce add_memory_driver_managed().  More on the samentics are in patch
#1.

In the future, we might want to make this behavior configurable for
dax/kmem- either by configuring it in the kernel (which would then also
allow to configure kexec_file_load()) or in kexec-tools by also adding
"System RAM (kmem)" memory from /proc/iomem to the fixed-up initial
firmware memmap.

More on the motivation can be found in [1] and [2].

[1] https://lkml.kernel.org/r/20200429160803.109056-1-david@redhat.com
[2] https://lkml.kernel.org/r/20200430102908.10107-1-david@redhat.com

This patch (of 3):

Some device drivers rely on memory they managed to not get added to the
initial (firmware) memmap as system RAM - so it's not used as initial
system RAM by the kernel and the driver is under control.  While this is
the case during cold boot and after a reboot, kexec is not aware of that
and might add such memory to the initial (firmware) memmap of the kexec
kernel.  We need ways to teach kernel and userspace that this system ram
is different.

For example, dax/kmem allows to decide at runtime if persistent memory is
to be used as system ram.  Another future user is virtio-mem, which has to
coordinate with its hypervisor to deal with inaccessible parts within
memory resources.

We want to let users in the kernel (esp. kexec) but also user space
(esp. kexec-tools) know that this memory has different semantics and
needs to be handled differently:
1. Don't create entries in /sys/firmware/memmap/
2. Name the memory resource "System RAM ($DRIVER)" (exposed via
   /proc/iomem) ($DRIVER might be "kmem", "virtio_mem").
3. Flag the memory resource IORESOURCE_MEM_DRIVER_MANAGED

/sys/firmware/memmap/ [1] represents the "raw firmware-provided memory
map" because "on most architectures that firmware-provided memory map is
modified afterwards by the kernel itself".  The primary user is kexec on
x86-64.  Since commit d96ae5309165 ("memory-hotplug: create
/sys/firmware/memmap entry for new memory"), we add all hotplugged memory
to that firmware memmap - which makes perfect sense for traditional memory
hotplug on x86-64, where real HW will also add hotplugged DIMMs to the
firmware memmap.  We replicate what the "raw firmware-provided memory map"
looks like after hot(un)plug.

To keep things simple, let the user provide the full resource name instead
of only the driver name - this way, we don't have to manually
allocate/craft strings for memory resources.  Also use the resource name
to make decisions, to avoid passing additional flags.  In case the name
isn't "System RAM", it's special.

We don't have to worry about firmware_map_remove() on the removal path.
If there is no entry, it will simply return with -EINVAL.

We'll adapt dax/kmem in a follow-up patch.

[1] https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-memmap

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Link: http://lkml.kernel.org/r/20200508084217.9160-1-david@redhat.com
Link: http://lkml.kernel.org/r/20200508084217.9160-3-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "mm/memory_hotplug: Interface to add driver-managed system
ram", v4.

kexec (via kexec_load()) can currently not properly handle memory added
via dax/kmem, and will have similar issues with virtio-mem.  kexec-tools
will currently add all memory to the fixed-up initial firmware memmap.  In
case of dax/kmem, this means that - in contrast to a proper reboot - how
that persistent memory will be used can no longer be configured by the
kexec'd kernel.  In case of virtio-mem it will be harmful, because that
memory might contain inaccessible pieces that require coordination with
hypervisor first.

In both cases, we want to let the driver in the kexec'd kernel handle
detecting and adding the memory, like during an ordinary reboot.
Introduce add_memory_driver_managed().  More on the samentics are in patch
#1.

In the future, we might want to make this behavior configurable for
dax/kmem- either by configuring it in the kernel (which would then also
allow to configure kexec_file_load()) or in kexec-tools by also adding
"System RAM (kmem)" memory from /proc/iomem to the fixed-up initial
firmware memmap.

More on the motivation can be found in [1] and [2].

[1] https://lkml.kernel.org/r/20200429160803.109056-1-david@redhat.com
[2] https://lkml.kernel.org/r/20200430102908.10107-1-david@redhat.com

This patch (of 3):

Some device drivers rely on memory they managed to not get added to the
initial (firmware) memmap as system RAM - so it's not used as initial
system RAM by the kernel and the driver is under control.  While this is
the case during cold boot and after a reboot, kexec is not aware of that
and might add such memory to the initial (firmware) memmap of the kexec
kernel.  We need ways to teach kernel and userspace that this system ram
is different.

For example, dax/kmem allows to decide at runtime if persistent memory is
to be used as system ram.  Another future user is virtio-mem, which has to
coordinate with its hypervisor to deal with inaccessible parts within
memory resources.

We want to let users in the kernel (esp. kexec) but also user space
(esp. kexec-tools) know that this memory has different semantics and
needs to be handled differently:
1. Don't create entries in /sys/firmware/memmap/
2. Name the memory resource "System RAM ($DRIVER)" (exposed via
   /proc/iomem) ($DRIVER might be "kmem", "virtio_mem").
3. Flag the memory resource IORESOURCE_MEM_DRIVER_MANAGED

/sys/firmware/memmap/ [1] represents the "raw firmware-provided memory
map" because "on most architectures that firmware-provided memory map is
modified afterwards by the kernel itself".  The primary user is kexec on
x86-64.  Since commit d96ae5309165 ("memory-hotplug: create
/sys/firmware/memmap entry for new memory"), we add all hotplugged memory
to that firmware memmap - which makes perfect sense for traditional memory
hotplug on x86-64, where real HW will also add hotplugged DIMMs to the
firmware memmap.  We replicate what the "raw firmware-provided memory map"
looks like after hot(un)plug.

To keep things simple, let the user provide the full resource name instead
of only the driver name - this way, we don't have to manually
allocate/craft strings for memory resources.  Also use the resource name
to make decisions, to avoid passing additional flags.  In case the name
isn't "System RAM", it's special.

We don't have to worry about firmware_map_remove() on the removal path.
If there is no entry, it will simply return with -EINVAL.

We'll adapt dax/kmem in a follow-up patch.

[1] https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-memmap

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Link: http://lkml.kernel.org/r/20200508084217.9160-1-david@redhat.com
Link: http://lkml.kernel.org/r/20200508084217.9160-3-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>/dev/mem: Revoke mappings when a driver claims the region</title>
<updated>2020-05-27T09:10:05+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2020-05-21T21:06:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3234ac664a870e6ea69ae3a57d824cd7edbeacc5'/>
<id>3234ac664a870e6ea69ae3a57d824cd7edbeacc5</id>
<content type='text'>
Close the hole of holding a mapping over kernel driver takeover event of
a given address range.

Commit 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
introduced CONFIG_IO_STRICT_DEVMEM with the goal of protecting the
kernel against scenarios where a /dev/mem user tramples memory that a
kernel driver owns. However, this protection only prevents *new* read(),
write() and mmap() requests. Established mappings prior to the driver
calling request_mem_region() are left alone.

Especially with persistent memory, and the core kernel metadata that is
stored there, there are plentiful scenarios for a /dev/mem user to
violate the expectations of the driver and cause amplified damage.

Teach request_mem_region() to find and shoot down active /dev/mem
mappings that it believes it has successfully claimed for the exclusive
use of the driver. Effectively a driver call to request_mem_region()
becomes a hole-punch on the /dev/mem device.

The typical usage of unmap_mapping_range() is part of
truncate_pagecache() to punch a hole in a file, but in this case the
implementation is only doing the "first half" of a hole punch. Namely it
is just evacuating current established mappings of the "hole", and it
relies on the fact that /dev/mem establishes mappings in terms of
absolute physical address offsets. Once existing mmap users are
invalidated they can attempt to re-establish the mapping, or attempt to
continue issuing read(2) / write(2) to the invalidated extent, but they
will then be subject to the CONFIG_IO_STRICT_DEVMEM checking that can
block those subsequent accesses.

Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Fixes: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/159009507306.847224.8502634072429766747.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Close the hole of holding a mapping over kernel driver takeover event of
a given address range.

Commit 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
introduced CONFIG_IO_STRICT_DEVMEM with the goal of protecting the
kernel against scenarios where a /dev/mem user tramples memory that a
kernel driver owns. However, this protection only prevents *new* read(),
write() and mmap() requests. Established mappings prior to the driver
calling request_mem_region() are left alone.

Especially with persistent memory, and the core kernel metadata that is
stored there, there are plentiful scenarios for a /dev/mem user to
violate the expectations of the driver and cause amplified damage.

Teach request_mem_region() to find and shoot down active /dev/mem
mappings that it believes it has successfully claimed for the exclusive
use of the driver. Effectively a driver call to request_mem_region()
becomes a hole-punch on the /dev/mem device.

The typical usage of unmap_mapping_range() is part of
truncate_pagecache() to punch a hole in a file, but in this case the
implementation is only doing the "first half" of a hole punch. Namely it
is just evacuating current established mappings of the "hole", and it
relies on the fact that /dev/mem establishes mappings in terms of
absolute physical address offsets. Once existing mmap users are
invalidated they can attempt to re-establish the mapping, or attempt to
continue issuing read(2) / write(2) to the invalidated extent, but they
will then be subject to the CONFIG_IO_STRICT_DEVMEM checking that can
block those subsequent accesses.

Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Fixes: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/159009507306.847224.8502634072429766747.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/efi: EFI soft reservation to E820 enumeration</title>
<updated>2019-11-07T14:44:14+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2019-11-07T01:43:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=262b45ae3ab4bf8e2caf1fcfd0d8307897519630'/>
<id>262b45ae3ab4bf8e2caf1fcfd0d8307897519630</id>
<content type='text'>
UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
interpretation of the EFI Memory Types as "reserved for a specific
purpose".

The proposed Linux behavior for specific purpose memory is that it is
reserved for direct-access (device-dax) by default and not available for
any kernel usage, not even as an OOM fallback.  Later, through udev
scripts or another init mechanism, these device-dax claimed ranges can
be reconfigured and hot-added to the available System-RAM with a unique
node identifier. This device-dax management scheme implements "soft" in
the "soft reserved" designation by allowing some or all of the
reservation to be recovered as typical memory. This policy can be
disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with
efi=nosoftreserve.

This patch introduces 2 new concepts at once given the entanglement
between early boot enumeration relative to memory that can optionally be
reserved from the kernel page allocator by default. The new concepts
are:

- E820_TYPE_SOFT_RESERVED: Upon detecting the EFI_MEMORY_SP
  attribute on EFI_CONVENTIONAL memory, update the E820 map with this
  new type. Only perform this classification if the
  CONFIG_EFI_SOFT_RESERVE=y policy is enabled, otherwise treat it as
  typical ram.

- IORES_DESC_SOFT_RESERVED: Add a new I/O resource descriptor for
  a device driver to search iomem resources for application specific
  memory. Teach the iomem code to identify such ranges as "Soft Reserved".

Note that the comment for do_add_efi_memmap() needed refreshing since it
seemed to imply that the efi map might overflow the e820 table, but that
is not an issue as of commit 7b6e4ba3cb1f "x86/boot/e820: Clean up the
E820_X_MAX definition" that removed the 128 entry limit for
e820__range_add().

A follow-on change integrates parsing of the ACPI HMAT to identify the
node and sub-range boundaries of EFI_MEMORY_SP designated memory. For
now, just identify and reserve memory of this type.

Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Reviewed-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
interpretation of the EFI Memory Types as "reserved for a specific
purpose".

The proposed Linux behavior for specific purpose memory is that it is
reserved for direct-access (device-dax) by default and not available for
any kernel usage, not even as an OOM fallback.  Later, through udev
scripts or another init mechanism, these device-dax claimed ranges can
be reconfigured and hot-added to the available System-RAM with a unique
node identifier. This device-dax management scheme implements "soft" in
the "soft reserved" designation by allowing some or all of the
reservation to be recovered as typical memory. This policy can be
disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with
efi=nosoftreserve.

This patch introduces 2 new concepts at once given the entanglement
between early boot enumeration relative to memory that can optionally be
reserved from the kernel page allocator by default. The new concepts
are:

- E820_TYPE_SOFT_RESERVED: Upon detecting the EFI_MEMORY_SP
  attribute on EFI_CONVENTIONAL memory, update the E820 map with this
  new type. Only perform this classification if the
  CONFIG_EFI_SOFT_RESERVE=y policy is enabled, otherwise treat it as
  typical ram.

- IORES_DESC_SOFT_RESERVED: Add a new I/O resource descriptor for
  a device driver to search iomem resources for application specific
  memory. Teach the iomem code to identify such ranges as "Soft Reserved".

Note that the comment for do_add_efi_memmap() needed refreshing since it
seemed to imply that the efi map might overflow the e820 table, but that
is not an issue as of commit 7b6e4ba3cb1f "x86/boot/e820: Clean up the
E820_X_MAX definition" that removed the 128 entry limit for
e820__range_add().

A follow-on change integrates parsing of the ACPI HMAT to identify the
node and sub-range boundaries of EFI_MEMORY_SP designated memory. For
now, just identify and reserve memory of this type.

Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Reviewed-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>resource: add a not device managed request_free_mem_region variant</title>
<updated>2019-08-20T12:39:41+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2019-08-18T09:05:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0c385190392d8c7128fb7517b3c676e19c7b8808'/>
<id>0c385190392d8c7128fb7517b3c676e19c7b8808</id>
<content type='text'>
Factor out the guts of devm_request_free_mem_region so that we can
implement both a device managed and a manually release version as tiny
wrappers around it.

Link: https://lore.kernel.org/r/20190818090557.17853-2-hch@lst.de
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Tested-by: Bharata B Rao &lt;bharata@linux.ibm.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Factor out the guts of devm_request_free_mem_region so that we can
implement both a device managed and a manually release version as tiny
wrappers around it.

Link: https://lore.kernel.org/r/20190818090557.17853-2-hch@lst.de
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Tested-by: Bharata B Rao &lt;bharata@linux.ibm.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma</title>
<updated>2019-07-15T02:42:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-15T02:42:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fec88ab0af9706b2201e5daf377c5031c62d11f7'/>
<id>fec88ab0af9706b2201e5daf377c5031c62d11f7</id>
<content type='text'>
Pull HMM updates from Jason Gunthorpe:
 "Improvements and bug fixes for the hmm interface in the kernel:

   - Improve clarity, locking and APIs related to the 'hmm mirror'
     feature merged last cycle. In linux-next we now see AMDGPU and
     nouveau to be using this API.

   - Remove old or transitional hmm APIs. These are hold overs from the
     past with no users, or APIs that existed only to manage cross tree
     conflicts. There are still a few more of these cleanups that didn't
     make the merge window cut off.

   - Improve some core mm APIs:
       - export alloc_pages_vma() for driver use
       - refactor into devm_request_free_mem_region() to manage
         DEVICE_PRIVATE resource reservations
       - refactor duplicative driver code into the core dev_pagemap
         struct

   - Remove hmm wrappers of improved core mm APIs, instead have drivers
     use the simplified API directly

   - Remove DEVICE_PUBLIC

   - Simplify the kconfig flow for the hmm users and core code"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (42 commits)
  mm: don't select MIGRATE_VMA_HELPER from HMM_MIRROR
  mm: remove the HMM config option
  mm: sort out the DEVICE_PRIVATE Kconfig mess
  mm: simplify ZONE_DEVICE page private data
  mm: remove hmm_devmem_add
  mm: remove hmm_vma_alloc_locked_page
  nouveau: use devm_memremap_pages directly
  nouveau: use alloc_page_vma directly
  PCI/P2PDMA: use the dev_pagemap internal refcount
  device-dax: use the dev_pagemap internal refcount
  memremap: provide an optional internal refcount in struct dev_pagemap
  memremap: replace the altmap_valid field with a PGMAP_ALTMAP_VALID flag
  memremap: remove the data field in struct dev_pagemap
  memremap: add a migrate_to_ram method to struct dev_pagemap_ops
  memremap: lift the devmap_enable manipulation into devm_memremap_pages
  memremap: pass a struct dev_pagemap to -&gt;kill and -&gt;cleanup
  memremap: move dev_pagemap callbacks into a separate structure
  memremap: validate the pagemap type passed to devm_memremap_pages
  mm: factor out a devm_request_free_mem_region helper
  mm: export alloc_pages_vma
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull HMM updates from Jason Gunthorpe:
 "Improvements and bug fixes for the hmm interface in the kernel:

   - Improve clarity, locking and APIs related to the 'hmm mirror'
     feature merged last cycle. In linux-next we now see AMDGPU and
     nouveau to be using this API.

   - Remove old or transitional hmm APIs. These are hold overs from the
     past with no users, or APIs that existed only to manage cross tree
     conflicts. There are still a few more of these cleanups that didn't
     make the merge window cut off.

   - Improve some core mm APIs:
       - export alloc_pages_vma() for driver use
       - refactor into devm_request_free_mem_region() to manage
         DEVICE_PRIVATE resource reservations
       - refactor duplicative driver code into the core dev_pagemap
         struct

   - Remove hmm wrappers of improved core mm APIs, instead have drivers
     use the simplified API directly

   - Remove DEVICE_PUBLIC

   - Simplify the kconfig flow for the hmm users and core code"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (42 commits)
  mm: don't select MIGRATE_VMA_HELPER from HMM_MIRROR
  mm: remove the HMM config option
  mm: sort out the DEVICE_PRIVATE Kconfig mess
  mm: simplify ZONE_DEVICE page private data
  mm: remove hmm_devmem_add
  mm: remove hmm_vma_alloc_locked_page
  nouveau: use devm_memremap_pages directly
  nouveau: use alloc_page_vma directly
  PCI/P2PDMA: use the dev_pagemap internal refcount
  device-dax: use the dev_pagemap internal refcount
  memremap: provide an optional internal refcount in struct dev_pagemap
  memremap: replace the altmap_valid field with a PGMAP_ALTMAP_VALID flag
  memremap: remove the data field in struct dev_pagemap
  memremap: add a migrate_to_ram method to struct dev_pagemap_ops
  memremap: lift the devmap_enable manipulation into devm_memremap_pages
  memremap: pass a struct dev_pagemap to -&gt;kill and -&gt;cleanup
  memremap: move dev_pagemap callbacks into a separate structure
  memremap: validate the pagemap type passed to devm_memremap_pages
  mm: factor out a devm_request_free_mem_region helper
  mm: export alloc_pages_vma
  ...
</pre>
</div>
</content>
</entry>
</feed>
