<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/gpu/host1x, branch v7.0-rc5</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>Convert more 'alloc_obj' cases to default GFP_KERNEL arguments</title>
<updated>2026-02-22T04:03:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T04:03:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=32a92f8c89326985e05dce8b22d3f0aa07a3e1bd'/>
<id>32a92f8c89326985e05dce8b22d3f0aa07a3e1bd</id>
<content type='text'>
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>host1x: Convert to bus methods</title>
<updated>2026-01-13T11:26:32+00:00</updated>
<author>
<name>Uwe Kleine-König</name>
<email>u.kleine-koenig@baylibre.com</email>
</author>
<published>2025-12-10T08:31:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=1f61d735b8595f80ae5cfbbe11390a49b7f45911'/>
<id>1f61d735b8595f80ae5cfbbe11390a49b7f45911</id>
<content type='text'>
The callbacks .probe(), .remove() and .shutdown() for device_drivers
should go away. So migrate to bus methods. There are two differences
that need addressing:

 - The bus remove callback returns void while the driver remove callback
   returns int (the actual value is ignored by the core).
 - The bus shutdown callback is also called for unbound devices, so an
   additional check for dev-&gt;driver != NULL is needed.

Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Tested-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt; # tegra20 tegra-video
Reviewed-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patch.msgid.link/dd55d034c68953268ea416aa5c13e41b158fcbb4.1765355236.git.u.kleine-koenig@baylibre.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The callbacks .probe(), .remove() and .shutdown() for device_drivers
should go away. So migrate to bus methods. There are two differences
that need addressing:

 - The bus remove callback returns void while the driver remove callback
   returns int (the actual value is ignored by the core).
 - The bus shutdown callback is also called for unbound devices, so an
   additional check for dev-&gt;driver != NULL is needed.

Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Tested-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt; # tegra20 tegra-video
Reviewed-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patch.msgid.link/dd55d034c68953268ea416aa5c13e41b158fcbb4.1765355236.git.u.kleine-koenig@baylibre.com
</pre>
</div>
</content>
</entry>
<entry>
<title>host1x: Make remove callback return void</title>
<updated>2026-01-13T11:25:38+00:00</updated>
<author>
<name>Uwe Kleine-König</name>
<email>u.kleine-koenig@baylibre.com</email>
</author>
<published>2025-12-10T08:31:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=ba3588410cedb1696cfe56ebefcc4401c6d0bb36'/>
<id>ba3588410cedb1696cfe56ebefcc4401c6d0bb36</id>
<content type='text'>
The return value of struct device_driver::remove is ignored by the core
(see device_remove() in drivers/base/dd.c). So it doesn't make sense to
let the host1x remove callback return an int just to ignore it later.

So make the callback return void. All current implementors return 0, so
they are easily converted.

Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Tested-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt; # tegra20 tegra-video
Reviewed-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patch.msgid.link/d364fd4ec043d36ee12e46eaef98c57658884f63.1765355236.git.u.kleine-koenig@baylibre.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The return value of struct device_driver::remove is ignored by the core
(see device_remove() in drivers/base/dd.c). So it doesn't make sense to
let the host1x remove callback return an int just to ignore it later.

So make the callback return void. All current implementors return 0, so
they are easily converted.

Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Tested-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt; # tegra20 tegra-video
Reviewed-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patch.msgid.link/d364fd4ec043d36ee12e46eaef98c57658884f63.1765355236.git.u.kleine-koenig@baylibre.com
</pre>
</div>
</content>
</entry>
<entry>
<title>gpu: host1x: Syncpoint interrupt performance optimization</title>
<updated>2025-11-14T17:27:19+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2025-09-17T01:48:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bfe68975768a983d4e59c7fb465301999b863a7e'/>
<id>bfe68975768a983d4e59c7fb465301999b863a7e</id>
<content type='text'>
Optimize performance of syncpoint interrupt handling by reading
the status register in 64-bit chunks when possible, and skipping
processing when the read value is zero.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patch.msgid.link/20250917-host1x-syncpt-irq-perf-v2-1-736ef69b1347@nvidia.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Optimize performance of syncpoint interrupt handling by reading
the status register in 64-bit chunks when possible, and skipping
processing when the read value is zero.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://patch.msgid.link/20250917-host1x-syncpt-irq-perf-v2-1-736ef69b1347@nvidia.com
</pre>
</div>
</content>
</entry>
<entry>
<title>gpu: host1x: Use dev_err_probe() in probe path</title>
<updated>2025-09-11T17:12:53+00:00</updated>
<author>
<name>Akhilesh Patil</name>
<email>akhilesh@ee.iitb.ac.in</email>
</author>
<published>2025-08-22T06:14:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b4505b6ad9444c8c517240c74f8b9cbbba94a86b'/>
<id>b4505b6ad9444c8c517240c74f8b9cbbba94a86b</id>
<content type='text'>
Use dev_err_probe() helper as recommended by core driver model in
drivers/base/core.c to handle deferred probe error. Improve code
consistency and debuggability using standard helper.

Signed-off-by: Akhilesh Patil &lt;akhilesh@ee.iitb.ac.in&gt;
Tested-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Acked-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/aKgKrCxUvP9Sw0YI@bhairav-test.ee.iitb.ac.in
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use dev_err_probe() helper as recommended by core driver model in
drivers/base/core.c to handle deferred probe error. Improve code
consistency and debuggability using standard helper.

Signed-off-by: Akhilesh Patil &lt;akhilesh@ee.iitb.ac.in&gt;
Tested-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Acked-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/aKgKrCxUvP9Sw0YI@bhairav-test.ee.iitb.ac.in
</pre>
</div>
</content>
</entry>
<entry>
<title>gpu: host1x: Allow loading tegra-drm without enabled engines</title>
<updated>2025-09-11T16:56:38+00:00</updated>
<author>
<name>Vamsee Vardhan Thummala</name>
<email>vthummala@nvidia.com</email>
</author>
<published>2025-07-08T12:18:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fab823d82ee50c7953bd67d152e35e0bddaeee86'/>
<id>fab823d82ee50c7953bd67d152e35e0bddaeee86</id>
<content type='text'>
Add support to register host1x devices without requiring subdevices.
This ensures syncpoint functionality remains available even when engine
subdevices are not present.

Add softdep for tegra-drm to make userspace interface available
without module autoloading through device tree entries.

Signed-off-by: Vamsee Vardhan Thummala &lt;vthummala@nvidia.com&gt;
[mperttunen@nvidia.com: some rewording]
Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/20250708-host1x-allow-no-subdevs-v1-1-93c66c251f03@nvidia.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add support to register host1x devices without requiring subdevices.
This ensures syncpoint functionality remains available even when engine
subdevices are not present.

Add softdep for tegra-drm to make userspace interface available
without module autoloading through device tree entries.

Signed-off-by: Vamsee Vardhan Thummala &lt;vthummala@nvidia.com&gt;
[mperttunen@nvidia.com: some rewording]
Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/20250708-host1x-allow-no-subdevs-v1-1-93c66c251f03@nvidia.com
</pre>
</div>
</content>
</entry>
<entry>
<title>gpu: host1x: Wait prefences outside MLOCK</title>
<updated>2025-09-11T16:56:35+00:00</updated>
<author>
<name>Mikko Perttunen</name>
<email>mperttunen@nvidia.com</email>
</author>
<published>2025-07-08T11:25:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=63d47cc6eeb27fa0f5b2d9e2e9b950d728b6ca24'/>
<id>63d47cc6eeb27fa0f5b2d9e2e9b950d728b6ca24</id>
<content type='text'>
The current submission opcode sequence first takes the engine MLOCK,
and then switches to HOST1X class to wait prefences. This is fine
while we only use a single channel per engine and there is no
virtualization, since jobs are serialized on that one channel anyway.
However, when that assumption doesn't hold, we are keeping the
engine locked while not running anything on it while waiting for
prefences to complete.

To resolve this, execute wait commands in the beginning of the job
outside the engine MLOCK. We still take the HOST1X MLOCK because
recent hardware requires register opcodes to be executed within some
MLOCK, but the hardware also allows unlimited channels to take the
HOST1X MLOCK at the same time.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/20250708-host1x-wait-prefences-outside-mlock-v1-1-13e98044e35a@nvidia.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current submission opcode sequence first takes the engine MLOCK,
and then switches to HOST1X class to wait prefences. This is fine
while we only use a single channel per engine and there is no
virtualization, since jobs are serialized on that one channel anyway.
However, when that assumption doesn't hold, we are keeping the
engine locked while not running anything on it while waiting for
prefences to complete.

To resolve this, execute wait commands in the beginning of the job
outside the engine MLOCK. We still take the HOST1X MLOCK because
recent hardware requires register opcodes to be executed within some
MLOCK, but the hardware also allows unlimited channels to take the
HOST1X MLOCK at the same time.

Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/20250708-host1x-wait-prefences-outside-mlock-v1-1-13e98044e35a@nvidia.com
</pre>
</div>
</content>
</entry>
<entry>
<title>gpu: host1x: Fix race in syncpt alloc/free</title>
<updated>2025-09-11T16:56:32+00:00</updated>
<author>
<name>Mainak Sen</name>
<email>msen@nvidia.com</email>
</author>
<published>2025-07-07T09:17:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c7d393267c497502fa737607f435f05dfe6e3d9b'/>
<id>c7d393267c497502fa737607f435f05dfe6e3d9b</id>
<content type='text'>
Fix race condition between host1x_syncpt_alloc()
and host1x_syncpt_put() by using kref_put_mutex()
instead of kref_put() + manual mutex locking.

This ensures no thread can acquire the
syncpt_mutex after the refcount drops to zero
but before syncpt_release acquires it.
This prevents races where syncpoints could
be allocated while still being cleaned up
from a previous release.

Remove explicit mutex locking in syncpt_release
as kref_put_mutex() handles this atomically.

Signed-off-by: Mainak Sen &lt;msen@nvidia.com&gt;
Fixes: f5ba33fb9690 ("gpu: host1x: Reserve VBLANK syncpoints at initialization")
Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/20250707-host1x-syncpt-race-fix-v1-1-28b0776e70bc@nvidia.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix race condition between host1x_syncpt_alloc()
and host1x_syncpt_put() by using kref_put_mutex()
instead of kref_put() + manual mutex locking.

This ensures no thread can acquire the
syncpt_mutex after the refcount drops to zero
but before syncpt_release acquires it.
This prevents races where syncpoints could
be allocated while still being cleaned up
from a previous release.

Remove explicit mutex locking in syncpt_release
as kref_put_mutex() handles this atomically.

Signed-off-by: Mainak Sen &lt;msen@nvidia.com&gt;
Fixes: f5ba33fb9690 ("gpu: host1x: Reserve VBLANK syncpoints at initialization")
Signed-off-by: Mikko Perttunen &lt;mperttunen@nvidia.com&gt;
Signed-off-by: Thierry Reding &lt;treding@nvidia.com&gt;
Link: https://lore.kernel.org/r/20250707-host1x-syncpt-race-fix-v1-1-28b0776e70bc@nvidia.com
</pre>
</div>
</content>
</entry>
</feed>
