<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/dmaengine.h, branch tegra-9.12.15</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Merge branch 'dmaengine' into async-tx-next</title>
<updated>2009-09-09T00:55:21+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-09-09T00:55:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bbb20089a3275a19e475dbc21320c3742e3ca423'/>
<id>bbb20089a3275a19e475dbc21320c3742e3ca423</id>
<content type='text'>
Conflicts:
	crypto/async_tx/async_xor.c
	drivers/dma/ioat/dma_v2.h
	drivers/dma/ioat/pci.c
	drivers/md/raid5.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	crypto/async_tx/async_xor.c
	drivers/dma/ioat/dma_v2.h
	drivers/dma/ioat/pci.c
	drivers/md/raid5.c
</pre>
</div>
</content>
</entry>
<entry>
<title>dmaengine: kill tx_list</title>
<updated>2009-09-09T00:53:04+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-09-09T00:53:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0803172778901e24a75ab074798d98c2b7411559'/>
<id>0803172778901e24a75ab074798d98c2b7411559</id>
<content type='text'>
The tx_list attribute of struct dma_async_tx_descriptor is common to
most, but not all dma driver implementations.  None of the upper level
code (dmaengine/async_tx) uses it, so allow drivers to implement it
locally if they need it.  This saves sizeof(struct list_head) bytes for
drivers that do not manage descriptors with a linked list (e.g.: ioatdma
v2,3).

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;


</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The tx_list attribute of struct dma_async_tx_descriptor is common to
most, but not all dma driver implementations.  None of the upper level
code (dmaengine/async_tx) uses it, so allow drivers to implement it
locally if they need it.  This saves sizeof(struct list_head) bytes for
drivers that do not manage descriptors with a linked list (e.g.: ioatdma
v2,3).

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>dmaengine, async_tx: support alignment checks</title>
<updated>2009-09-09T00:42:53+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-09-09T00:42:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=83544ae9f3991bfc7d5e0fe9a3008cd05a8d57b7'/>
<id>83544ae9f3991bfc7d5e0fe9a3008cd05a8d57b7</id>
<content type='text'>
Some engines have transfer size and address alignment restrictions.  Add
a per-operation alignment property to struct dma_device that the async
routines and dmatest can use to check alignment capabilities.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some engines have transfer size and address alignment restrictions.  Add
a per-operation alignment property to struct dma_device that the async
routines and dmatest can use to check alignment capabilities.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dmaengine: cleanup unused transaction types</title>
<updated>2009-09-09T00:42:52+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-09-09T00:42:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9308add6ea4fedeba37b0d7c4630a542bd34f214'/>
<id>9308add6ea4fedeba37b0d7c4630a542bd34f214</id>
<content type='text'>
No drivers currently implement these operation types, so they can be
deleted.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No drivers currently implement these operation types, so they can be
deleted.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dmaengine, async_tx: add a "no channel switch" allocator</title>
<updated>2009-09-09T00:42:51+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-09-09T00:42:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=138f4c359d23d2ec38d18bd70dd9613ae515fe93'/>
<id>138f4c359d23d2ec38d18bd70dd9613ae515fe93</id>
<content type='text'>
Channel switching is problematic for some dmaengine drivers as the
architecture precludes separating the -&gt;prep from -&gt;submit.  In these
cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify
the async_tx allocator to only return channels that support all of the
required asynchronous operations.

For example MD_RAID456=y selects support for asynchronous xor, xor
validate, pq, pq validate, and memcpy.  When
ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these
capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to
quickly locate compatible channels with the guarantee that dependency
chains will remain on one channel.  When
ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select
channels that lead to operation chains that need to cross channel
boundaries using the async_tx channel switch capability.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Channel switching is problematic for some dmaengine drivers as the
architecture precludes separating the -&gt;prep from -&gt;submit.  In these
cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify
the async_tx allocator to only return channels that support all of the
required asynchronous operations.

For example MD_RAID456=y selects support for asynchronous xor, xor
validate, pq, pq validate, and memcpy.  When
ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these
capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to
quickly locate compatible channels with the guarantee that dependency
chains will remain on one channel.  When
ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select
channels that lead to operation chains that need to cross channel
boundaries using the async_tx channel switch capability.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dmaengine: add fence support</title>
<updated>2009-09-09T00:42:50+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-09-09T00:42:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0403e3827788d878163f9ef0541b748b0f88ca5d'/>
<id>0403e3827788d878163f9ef0541b748b0f88ca5d</id>
<content type='text'>
Some engines optimize operation by reading ahead in the descriptor chain
such that descriptor2 may start execution before descriptor1 completes.
If descriptor2 depends on the result from descriptor1 then a fence is
required (on descriptor2) to disable this optimization.  The async_tx
api could implicitly identify dependencies via the 'depend_tx'
parameter, but that would constrain cases where the dependency chain
only specifies a completion order rather than a data dependency.  So,
provide an ASYNC_TX_FENCE to explicitly identify data dependencies.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some engines optimize operation by reading ahead in the descriptor chain
such that descriptor2 may start execution before descriptor1 completes.
If descriptor2 depends on the result from descriptor1 then a fence is
required (on descriptor2) to disable this optimization.  The async_tx
api could implicitly identify dependencies via the 'depend_tx'
parameter, but that would constrain cases where the dependency chain
only specifies a completion order rather than a data dependency.  So,
provide an ASYNC_TX_FENCE to explicitly identify data dependencies.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'md-raid6-accel' into ioat3.2</title>
<updated>2009-09-09T00:42:29+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-09-09T00:42:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f9dd2134374c8de6b911e2b8652c6c9622eaa658'/>
<id>f9dd2134374c8de6b911e2b8652c6c9622eaa658</id>
<content type='text'>
Conflicts:
	include/linux/dmaengine.h
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	include/linux/dmaengine.h
</pre>
</div>
</content>
</entry>
<entry>
<title>async_tx: add support for asynchronous GF multiplication</title>
<updated>2009-08-30T02:09:27+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-07-14T19:20:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b2f46fd8ef3dff2ab30f31126833f78b7480283a'/>
<id>b2f46fd8ef3dff2ab30f31126833f78b7480283a</id>
<content type='text'>
[ Based on an original patch by Yuri Tikhonov ]

This adds support for doing asynchronous GF multiplication by adding
two additional functions to the async_tx API:

 async_gen_syndrome() does simultaneous XOR and Galois field
    multiplication of sources.

 async_syndrome_val() validates the given source buffers against known P
    and Q values.

When a request is made to run async_pq against more than the hardware
maximum number of supported sources we need to reuse the previous
generated P and Q values as sources into the next operation.  Care must
be taken to remove Q from P' and P from Q'.  For example to perform a 5
source pq op with hardware that only supports 4 sources at a time the
following approach is taken:

p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))

p' = p + q + q + src4 = p + src4
q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4

Note: 4 is the minimum acceptable maxpq otherwise we punt to
synchronous-software path.

The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
sources (in the above manner) and fill the remaining slots up to maxpq
with the new sources/coefficients.

Note1: Some devices have native support for P+Q continuation and can skip
this extra work.  Devices with this capability can advertise it with
dma_set_maxpq.  It is up to each driver how to handle the
DMA_PREP_CONTINUE flag.

Note2: The api supports disabling the generation of P when generating Q,
this is ignored by the synchronous path but is implemented by some dma
devices to save unnecessary writes.  In this case the continuation
algorithm is simplified to only reuse Q as a source.

Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
Signed-off-by: Yuri Tikhonov &lt;yur@emcraft.com&gt;
Signed-off-by: Ilya Yanok &lt;yanok@emcraft.com&gt;
Reviewed-by: Andre Noll &lt;maan@systemlinux.org&gt;
Acked-by: Maciej Sosnowski &lt;maciej.sosnowski@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Based on an original patch by Yuri Tikhonov ]

This adds support for doing asynchronous GF multiplication by adding
two additional functions to the async_tx API:

 async_gen_syndrome() does simultaneous XOR and Galois field
    multiplication of sources.

 async_syndrome_val() validates the given source buffers against known P
    and Q values.

When a request is made to run async_pq against more than the hardware
maximum number of supported sources we need to reuse the previous
generated P and Q values as sources into the next operation.  Care must
be taken to remove Q from P' and P from Q'.  For example to perform a 5
source pq op with hardware that only supports 4 sources at a time the
following approach is taken:

p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))

p' = p + q + q + src4 = p + src4
q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4

Note: 4 is the minimum acceptable maxpq otherwise we punt to
synchronous-software path.

The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
sources (in the above manner) and fill the remaining slots up to maxpq
with the new sources/coefficients.

Note1: Some devices have native support for P+Q continuation and can skip
this extra work.  Devices with this capability can advertise it with
dma_set_maxpq.  It is up to each driver how to handle the
DMA_PREP_CONTINUE flag.

Note2: The api supports disabling the generation of P when generating Q,
this is ignored by the synchronous path but is implemented by some dma
devices to save unnecessary writes.  In this case the continuation
algorithm is simplified to only reuse Q as a source.

Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: David Woodhouse &lt;David.Woodhouse@intel.com&gt;
Signed-off-by: Yuri Tikhonov &lt;yur@emcraft.com&gt;
Signed-off-by: Ilya Yanok &lt;yanok@emcraft.com&gt;
Reviewed-by: Andre Noll &lt;maan@systemlinux.org&gt;
Acked-by: Maciej Sosnowski &lt;maciej.sosnowski@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>async_tx: add sum check flags</title>
<updated>2009-08-30T02:09:26+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-08-30T02:09:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ad283ea4a3ce82cda2efe33163748a397b31b1eb'/>
<id>ad283ea4a3ce82cda2efe33163748a397b31b1eb</id>
<content type='text'>
Replace the flat zero_sum_result with a collection of flags to contain
the P (xor) zero-sum result, and the soon to be utilized Q (raid6 reed
solomon syndrome) zero-sum result.  Use the SUM_CHECK_ namespace instead
of DMA_ since these flags will be used on non-dma-zero-sum enabled
platforms.

Reviewed-by: Andre Noll &lt;maan@systemlinux.org&gt;
Acked-by: Maciej Sosnowski &lt;maciej.sosnowski@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace the flat zero_sum_result with a collection of flags to contain
the P (xor) zero-sum result, and the soon to be utilized Q (raid6 reed
solomon syndrome) zero-sum result.  Use the SUM_CHECK_ namespace instead
of DMA_ since these flags will be used on non-dma-zero-sum enabled
platforms.

Reviewed-by: Andre Noll &lt;maan@systemlinux.org&gt;
Acked-by: Maciej Sosnowski &lt;maciej.sosnowski@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ioatdma: fix "ioatdma frees DMA memory with wrong function"</title>
<updated>2009-05-12T21:41:47+00:00</updated>
<author>
<name>Maciej Sosnowski</name>
<email>maciej.sosnowski@intel.com</email>
</author>
<published>2009-04-23T10:31:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4f005dbe5584fe54c9f6d6d4f0acd3fb29be84da'/>
<id>4f005dbe5584fe54c9f6d6d4f0acd3fb29be84da</id>
<content type='text'>
as reported by Alexander Beregalov &lt;a.beregalov@gmail.com&gt;

ioatdma 0000:00:08.0: DMA-API: device driver frees DMA memory with
wrong function [device address=0x000000007f76f800] [size=2000 bytes]
[map
ped as single] [unmapped as page]

The ioatdma driver was unmapping all regions
(either allocated as page or single) using unmap_page.
This patch lets dma driver recognize if unmap_single or unmap_page should be used.
It introduces two new dma control flags:
DMA_COMPL_SRC_UNMAP_SINGLE and DMA_COMPL_DEST_UNMAP_SINGLE.
They should be set to indicate dma driver to do dma-unmapping as single
(first one for the source, tha latter for the destination).
If respective flag is not set, the driver assumes dma-unmapping as page.

Signed-off-by: Maciej Sosnowski &lt;maciej.sosnowski@intel.com&gt;
Reported-by: Alexander Beregalov &lt;a.beregalov@gmail.com&gt;
Tested-by: Alexander Beregalov &lt;a.beregalov@gmail.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
as reported by Alexander Beregalov &lt;a.beregalov@gmail.com&gt;

ioatdma 0000:00:08.0: DMA-API: device driver frees DMA memory with
wrong function [device address=0x000000007f76f800] [size=2000 bytes]
[map
ped as single] [unmapped as page]

The ioatdma driver was unmapping all regions
(either allocated as page or single) using unmap_page.
This patch lets dma driver recognize if unmap_single or unmap_page should be used.
It introduces two new dma control flags:
DMA_COMPL_SRC_UNMAP_SINGLE and DMA_COMPL_DEST_UNMAP_SINGLE.
They should be set to indicate dma driver to do dma-unmapping as single
(first one for the source, tha latter for the destination).
If respective flag is not set, the driver assumes dma-unmapping as page.

Signed-off-by: Maciej Sosnowski &lt;maciej.sosnowski@intel.com&gt;
Reported-by: Alexander Beregalov &lt;a.beregalov@gmail.com&gt;
Tested-by: Alexander Beregalov &lt;a.beregalov@gmail.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
