<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/kernel/dma/Kconfig, branch v5.10-rc1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>dma-mapping: merge &lt;linux/dma-contiguous.h&gt; into &lt;linux/dma-map-ops.h&gt;</title>
<updated>2020-10-06T05:07:04+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-09-11T08:56:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0b1abd1fb7efafc25231c54a67c6fbb3d3127efd'/>
<id>0b1abd1fb7efafc25231c54a67c6fbb3d3127efd</id>
<content type='text'>
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cma: decrease CMA_ALIGNMENT lower limit to 2</title>
<updated>2020-10-06T05:06:54+00:00</updated>
<author>
<name>Paul Cercueil</name>
<email>paul@crapouillou.net</email>
</author>
<published>2020-09-30T10:28:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0de327969b61a245e3a47b60009eae73fe513cef'/>
<id>0de327969b61a245e3a47b60009eae73fe513cef</id>
<content type='text'>
On an embedded system with a tiny (1 MiB) CMA area for video memory, and
a simple enough video pipeline, we can decrease the CMA_ALIGNMENT by a
factor of 2 to avoid wasting memory, as all the allocations for video
buffers will be of the exact same size (dictated by the size of the
screen).

Signed-off-by: Paul Cercueil &lt;paul@crapouillou.net&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On an embedded system with a tiny (1 MiB) CMA area for video memory, and
a simple enough video pipeline, we can decrease the CMA_ALIGNMENT by a
factor of 2 to avoid wasting memory, as all the allocations for video
buffers will be of the exact same size (dictated by the size of the
screen).

Signed-off-by: Paul Cercueil &lt;paul@crapouillou.net&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-mapping: remove dma_cache_sync</title>
<updated>2020-09-25T04:20:46+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-09-01T11:28:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5a84292271402cffe0717bc58e2ad9a0c7977272'/>
<id>5a84292271402cffe0717bc58e2ad9a0c7977272</id>
<content type='text'>
All users are gone now, remove the API.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt; (MIPS part)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All users are gone now, remove the API.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt; (MIPS part)
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-mapping: add (back) arch_dma_mark_clean for ia64</title>
<updated>2020-09-11T07:10:17+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-08-17T14:41:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=abdaf11ac18925ce8cc229e62e35b342d548ece2'/>
<id>abdaf11ac18925ce8cc229e62e35b342d548ece2</id>
<content type='text'>
Add back a hook to optimize dcache flushing after reading executable
code using DMA.  This gets ia64 out of the business of pretending to
be dma incoherent just for this optimization.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add back a hook to optimize dcache flushing after reading executable
code using DMA.  This gets ia64 out of the business of pretending to
be dma incoherent just for this optimization.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-mapping: fix DMA_OPS dependencies</title>
<updated>2020-09-11T07:09:41+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-08-29T08:40:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ef1a85b6ca093abb00fbc38ea55fd9ae08ab05ef'/>
<id>ef1a85b6ca093abb00fbc38ea55fd9ae08ab05ef</id>
<content type='text'>
Driver that select DMA_OPS need to depend on HAS_DMA support to
work.  The vop driver was missing that dependency, so add it, and also
add a another depends in DMA_OPS itself.  That won't fix the issue due
to how the Kconfig dependencies work, but at least produce a warning
about unmet dependencies.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Driver that select DMA_OPS need to depend on HAS_DMA support to
work.  The vop driver was missing that dependency, so add it, and also
add a another depends in DMA_OPS itself.  That won't fix the issue due
to how the Kconfig dependencies work, but at least produce a warning
about unmet dependencies.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-contiguous: provide the ability to reserve per-numa CMA</title>
<updated>2020-09-01T07:19:28+00:00</updated>
<author>
<name>Barry Song</name>
<email>song.bao.hua@hisilicon.com</email>
</author>
<published>2020-08-23T23:03:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b7176c261cdbced87bed9562577333150ed05b01'/>
<id>b7176c261cdbced87bed9562577333150ed05b01</id>
<content type='text'>
Right now, drivers like ARM SMMU are using dma_alloc_coherent() to get
coherent DMA buffers to save their command queues and page tables. As
there is only one default CMA in the whole system, SMMUs on nodes other
than node0 will get remote memory. This leads to significant latency.

This patch provides per-numa CMA so that drivers like SMMU can get local
memory. Tests show localizing CMA can decrease dma_unmap latency much.
For instance, before this patch, SMMU on node2  has to wait for more than
560ns for the completion of CMD_SYNC in an empty command queue; with this
patch, it needs 240ns only.

A positive side effect of this patch would be improving performance even
further for those users who are worried about performance more than DMA
security and use iommu.passthrough=1 to skip IOMMU. With local CMA, all
drivers can get local coherent DMA buffers.

Also, this patch changes the default CONFIG_CMA_AREAS to 19 in NUMA. As
1+CONFIG_CMA_AREAS should be quite enough for most servers on the market
even they enable both hugetlb_cma and pernuma_cma.
2 numa nodes: 2(hugetlb) + 2(pernuma) + 1(default global cma) = 5
4 numa nodes: 4(hugetlb) + 4(pernuma) + 1(default global cma) = 9
8 numa nodes: 8(hugetlb) + 8(pernuma) + 1(default global cma) = 17

Signed-off-by: Barry Song &lt;song.bao.hua@hisilicon.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Right now, drivers like ARM SMMU are using dma_alloc_coherent() to get
coherent DMA buffers to save their command queues and page tables. As
there is only one default CMA in the whole system, SMMUs on nodes other
than node0 will get remote memory. This leads to significant latency.

This patch provides per-numa CMA so that drivers like SMMU can get local
memory. Tests show localizing CMA can decrease dma_unmap latency much.
For instance, before this patch, SMMU on node2  has to wait for more than
560ns for the completion of CMD_SYNC in an empty command queue; with this
patch, it needs 240ns only.

A positive side effect of this patch would be improving performance even
further for those users who are worried about performance more than DMA
security and use iommu.passthrough=1 to skip IOMMU. With local CMA, all
drivers can get local coherent DMA buffers.

Also, this patch changes the default CONFIG_CMA_AREAS to 19 in NUMA. As
1+CONFIG_CMA_AREAS should be quite enough for most servers on the market
even they enable both hugetlb_cma and pernuma_cma.
2 numa nodes: 2(hugetlb) + 2(pernuma) + 1(default global cma) = 5
4 numa nodes: 4(hugetlb) + 4(pernuma) + 1(default global cma) = 9
8 numa nodes: 8(hugetlb) + 8(pernuma) + 1(default global cma) = 17

Signed-off-by: Barry Song &lt;song.bao.hua@hisilicon.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sh-for-5.9' of git://git.libc.org/linux-sh</title>
<updated>2020-08-16T01:50:32+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-16T01:50:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5bbec3cfe376ed0014d9456a9be11d5ed75d587b'/>
<id>5bbec3cfe376ed0014d9456a9be11d5ed75d587b</id>
<content type='text'>
Pull arch/sh updates from Rich Felker:
 "Cleanup, SECCOMP_FILTER support, message printing fixes, and other
  changes to arch/sh"

* tag 'sh-for-5.9' of git://git.libc.org/linux-sh: (34 commits)
  sh: landisk: Add missing initialization of sh_io_port_base
  sh: bring syscall_set_return_value in line with other architectures
  sh: Add SECCOMP_FILTER
  sh: Rearrange blocks in entry-common.S
  sh: switch to copy_thread_tls()
  sh: use the generic dma coherent remap allocator
  sh: don't allow non-coherent DMA for NOMMU
  dma-mapping: consolidate the NO_DMA definition in kernel/dma/Kconfig
  sh: unexport register_trapped_io and match_trapped_io_handler
  sh: don't include &lt;asm/io_trapped.h&gt; in &lt;asm/io.h&gt;
  sh: move the ioremap implementation out of line
  sh: move ioremap_fixed details out of &lt;asm/io.h&gt;
  sh: remove __KERNEL__ ifdefs from non-UAPI headers
  sh: sort the selects for SUPERH alphabetically
  sh: remove -Werror from Makefiles
  sh: Replace HTTP links with HTTPS ones
  arch/sh/configs: remove obsolete CONFIG_SOC_CAMERA*
  sh: stacktrace: Remove stacktrace_ops.stack()
  sh: machvec: Modernize printing of kernel messages
  sh: pci: Modernize printing of kernel messages
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull arch/sh updates from Rich Felker:
 "Cleanup, SECCOMP_FILTER support, message printing fixes, and other
  changes to arch/sh"

* tag 'sh-for-5.9' of git://git.libc.org/linux-sh: (34 commits)
  sh: landisk: Add missing initialization of sh_io_port_base
  sh: bring syscall_set_return_value in line with other architectures
  sh: Add SECCOMP_FILTER
  sh: Rearrange blocks in entry-common.S
  sh: switch to copy_thread_tls()
  sh: use the generic dma coherent remap allocator
  sh: don't allow non-coherent DMA for NOMMU
  dma-mapping: consolidate the NO_DMA definition in kernel/dma/Kconfig
  sh: unexport register_trapped_io and match_trapped_io_handler
  sh: don't include &lt;asm/io_trapped.h&gt; in &lt;asm/io.h&gt;
  sh: move the ioremap implementation out of line
  sh: move ioremap_fixed details out of &lt;asm/io.h&gt;
  sh: remove __KERNEL__ ifdefs from non-UAPI headers
  sh: sort the selects for SUPERH alphabetically
  sh: remove -Werror from Makefiles
  sh: Replace HTTP links with HTTPS ones
  arch/sh/configs: remove obsolete CONFIG_SOC_CAMERA*
  sh: stacktrace: Remove stacktrace_ops.stack()
  sh: machvec: Modernize printing of kernel messages
  sh: pci: Modernize printing of kernel messages
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-mapping: consolidate the NO_DMA definition in kernel/dma/Kconfig</title>
<updated>2020-08-15T02:05:17+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-07-14T12:18:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=846f9e1fb9b9a2c4eecefa2fdbfb51e975a7d532'/>
<id>846f9e1fb9b9a2c4eecefa2fdbfb51e975a7d532</id>
<content type='text'>
Have a single definition that architetures can select.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Rich Felker &lt;dalias@libc.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Have a single definition that architetures can select.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Rich Felker &lt;dalias@libc.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-debug: remove debug_dma_assert_idle() function</title>
<updated>2020-08-14T22:22:43+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-14T22:22:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5848dc5b1b76d83599dcec1b39f188a7cbdca7e2'/>
<id>5848dc5b1b76d83599dcec1b39f188a7cbdca7e2</id>
<content type='text'>
This remoes the code from the COW path to call debug_dma_assert_idle(),
which was added many years ago.

Google shows that it hasn't caught anything in the 6+ years we've had it
apart from a false positive, and Hugh just noticed how it had a very
unfortunate spinlock serialization in the COW path.

He fixed that issue the previous commit (a85ffd59bd36: "dma-debug: fix
debug_dma_assert_idle(), use rcu_read_lock()"), but let's see if anybody
even notices when we remove this function entirely.

NOTE! We keep the dma tracking infrastructure that was added by the
commit that introduced it.  Partly to make it easier to resurrect this
debug code if we ever deside to, and partly because that tracking by pfn
and offset looks quite reasonable.

The problem with this debug code was simply that it was expensive and
didn't seem worth it, not that it was wrong per se.

Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This remoes the code from the COW path to call debug_dma_assert_idle(),
which was added many years ago.

Google shows that it hasn't caught anything in the 6+ years we've had it
apart from a false positive, and Hugh just noticed how it had a very
unfortunate spinlock serialization in the COW path.

He fixed that issue the previous commit (a85ffd59bd36: "dma-debug: fix
debug_dma_assert_idle(), use rcu_read_lock()"), but let's see if anybody
even notices when we remove this function entirely.

NOTE! We keep the dma tracking infrastructure that was added by the
commit that introduced it.  Partly to make it easier to resurrect this
debug code if we ever deside to, and partly because that tracking by pfn
and offset looks quite reasonable.

The problem with this debug code was simply that it was expensive and
didn't seem worth it, not that it was wrong per se.

Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-mapping: add a dma_ops_bypass flag to struct device</title>
<updated>2020-07-19T07:29:29+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-03-23T17:19:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d35834c64820c7ef397f8a244061d4450720540e'/>
<id>d35834c64820c7ef397f8a244061d4450720540e</id>
<content type='text'>
Several IOMMU drivers have a bypass mode where they can use a direct
mapping if the devices DMA mask is large enough.  Add generic support
to the core dma-mapping code to do that to switch those drivers to
a common solution.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Reviewed-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Several IOMMU drivers have a bypass mode where they can use a direct
mapping if the devices DMA mask is large enough.  Add generic support
to the core dma-mapping code to do that to switch those drivers to
a common solution.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Reviewed-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
</pre>
</div>
</content>
</entry>
</feed>
