<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/gpu/drm/i915/intel_ringbuffer.h, branch v3.4.2</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>drm/i915: Record the tail at each request and use it to estimate the head</title>
<updated>2012-02-15T13:26:03+00:00</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2012-02-15T11:25:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a71d8d94525e8fd855c0466fb586ae1cb008f3a2'/>
<id>a71d8d94525e8fd855c0466fb586ae1cb008f3a2</id>
<content type='text'>
By recording the location of every request in the ringbuffer, we know
that in order to retire the request the GPU must have finished reading
it and so the GPU head is now beyond the tail of the request. We can
therefore provide a conservative estimate of where the GPU is reading
from in order to avoid having to read back the ring buffer registers
when polling for space upon starting a new write into the ringbuffer.

A secondary effect is that this allows us to convert
intel_ring_buffer_wait() to use i915_wait_request() and so consolidate
upon the single function to handle the complicated task of waiting upon
the GPU. A necessary precaution is that we need to make that wait
uninterruptible to match the existing conditions as all the callers of
intel_ring_begin() have not been audited to handle ERESTARTSYS
correctly.

By using a conservative estimate for the head, and always processing all
outstanding requests first, we prevent a race condition between using
the estimate and direct reads of I915_RING_HEAD which could result in
the value of the head going backwards, and the tail overflowing once
again. We are also careful to mark any request that we skip over in
order to free space in ring as consumed which provides a
self-consistency check.

Given sufficient abuse, such as a set of unthrottled GPU bound
cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on
Sandy Bridge (i5-2520m):
  firefox-paintball  18927ms -&gt; 15646ms: 1.21x speedup
  firefox-fishtank   12563ms -&gt; 11278ms: 1.11x speedup
which is a mild consolation for the performance those traces achieved from
exploiting the buggy autoreported head.

v2: Add a few more comments and make request-&gt;tail a conservative
estimate as suggested by Daniel Vetter.

Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
[danvet: resolve conflicts with retirement defering and the lack of
the autoreport head removal (that will go in through -fixes).]
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
By recording the location of every request in the ringbuffer, we know
that in order to retire the request the GPU must have finished reading
it and so the GPU head is now beyond the tail of the request. We can
therefore provide a conservative estimate of where the GPU is reading
from in order to avoid having to read back the ring buffer registers
when polling for space upon starting a new write into the ringbuffer.

A secondary effect is that this allows us to convert
intel_ring_buffer_wait() to use i915_wait_request() and so consolidate
upon the single function to handle the complicated task of waiting upon
the GPU. A necessary precaution is that we need to make that wait
uninterruptible to match the existing conditions as all the callers of
intel_ring_begin() have not been audited to handle ERESTARTSYS
correctly.

By using a conservative estimate for the head, and always processing all
outstanding requests first, we prevent a race condition between using
the estimate and direct reads of I915_RING_HEAD which could result in
the value of the head going backwards, and the tail overflowing once
again. We are also careful to mark any request that we skip over in
order to free space in ring as consumed which provides a
self-consistency check.

Given sufficient abuse, such as a set of unthrottled GPU bound
cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on
Sandy Bridge (i5-2520m):
  firefox-paintball  18927ms -&gt; 15646ms: 1.21x speedup
  firefox-fishtank   12563ms -&gt; 11278ms: 1.11x speedup
which is a mild consolation for the performance those traces achieved from
exploiting the buggy autoreported head.

v2: Add a few more comments and make request-&gt;tail a conservative
estimate as suggested by Daniel Vetter.

Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
[danvet: resolve conflicts with retirement defering and the lack of
the autoreport head removal (that will go in through -fixes).]
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/i915: switch ring-&gt;id to be a real id</title>
<updated>2012-01-29T16:32:58+00:00</updated>
<author>
<name>Daniel Vetter</name>
<email>daniel.vetter@ffwll.ch</email>
</author>
<published>2011-12-14T12:57:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=96154f2faba540281073243d61108d1705d19c6d'/>
<id>96154f2faba540281073243d61108d1705d19c6d</id>
<content type='text'>
... and add a helpr function for the places where we want a flag.

This way we can use ring-&gt;id to index into arrays.

v2: Resurrect the missing beautification-space Chris Wilson noted.
I'm moving this space around because I'll reuse ring_str in the next
patch.

Reviewed-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Reviewed-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Reviewed-by: Eugeni Dodonov &lt;eugeni.dodonov@intel.com&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... and add a helpr function for the places where we want a flag.

This way we can use ring-&gt;id to index into arrays.

v2: Resurrect the missing beautification-space Chris Wilson noted.
I'm moving this space around because I'll reuse ring_str in the next
patch.

Reviewed-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Reviewed-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Reviewed-by: Eugeni Dodonov &lt;eugeni.dodonov@intel.com&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/i915: Dumb down the semaphore logic</title>
<updated>2011-09-21T21:52:41+00:00</updated>
<author>
<name>Ben Widawsky</name>
<email>ben@bwidawsk.net</email>
</author>
<published>2011-09-15T03:32:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c8c99b0f0dea1ced5d0e10cdb9143356cc16b484'/>
<id>c8c99b0f0dea1ced5d0e10cdb9143356cc16b484</id>
<content type='text'>
While I think the previous code is correct, it was hard to follow and
hard to debug. Since we already have a ring abstraction, might as well
use it to handle the semaphore updates and compares.

I don't expect this code to make semaphores better or worse, but you
never know...

v2:
Remove magic per Keith's suggestions.
Ran Daniel's gem_ring_sync_loop test on this.

v3:
Ignored one of Keith's suggestions.

v4:
Removed some bloat per Daniel's recommendation.

Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Keith Packard &lt;keithp@keithp.com&gt;
Signed-off-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Signed-off-by: Keith Packard &lt;keithp@keithp.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While I think the previous code is correct, it was hard to follow and
hard to debug. Since we already have a ring abstraction, might as well
use it to handle the semaphore updates and compares.

I don't expect this code to make semaphores better or worse, but you
never know...

v2:
Remove magic per Keith's suggestions.
Ran Daniel's gem_ring_sync_loop test on this.

v3:
Ignored one of Keith's suggestions.

v4:
Removed some bloat per Daniel's recommendation.

Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Keith Packard &lt;keithp@keithp.com&gt;
Signed-off-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Signed-off-by: Keith Packard &lt;keithp@keithp.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Drivers: i915: Fix all space related issues.</title>
<updated>2011-09-20T01:01:47+00:00</updated>
<author>
<name>Akshay Joshi</name>
<email>me@akshayjoshi.com</email>
</author>
<published>2011-08-16T19:34:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=0206e353a0416ad63ce07f53c807c2c725633b87'/>
<id>0206e353a0416ad63ce07f53c807c2c725633b87</id>
<content type='text'>
Various issues involved with the space character were generating
warnings in the checkpatch.pl file. This patch removes most of those
warnings.

Signed-off-by: Akshay Joshi &lt;me@akshayjoshi.com&gt;
Signed-off-by: Keith Packard &lt;keithp@keithp.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Various issues involved with the space character were generating
warnings in the checkpatch.pl file. This patch removes most of those
warnings.

Signed-off-by: Akshay Joshi &lt;me@akshayjoshi.com&gt;
Signed-off-by: Keith Packard &lt;keithp@keithp.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/i915/ringbuffer: Idling requires waiting for the ring to be empty</title>
<updated>2011-07-12T17:35:45+00:00</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2011-07-12T17:03:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=a94919eaddaa3fede1df8563ce4d761a75374645'/>
<id>a94919eaddaa3fede1df8563ce4d761a75374645</id>
<content type='text'>
...which is measured by the size and not the amount of space remaining.

Waiting upon size-8, did one of two things. In the common case with more
than 8 bytes available to write into the ring, it would return
immediately. Otherwise, it would timeout given the impossible condition
of waiting for more space than is available in the ring, leading to
warnings such as:

[drm:intel_cleanup_ring_buffer] *ERROR* failed to quiesce render ring
whilst cleaning up: -16

Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Reviewed-by: Eric Anholt &lt;eric@anholt.net&gt;
Signed-off-by: Keith Packard &lt;keithp@keithp.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
...which is measured by the size and not the amount of space remaining.

Waiting upon size-8, did one of two things. In the common case with more
than 8 bytes available to write into the ring, it would return
immediately. Otherwise, it would timeout given the impossible condition
of waiting for more space than is available in the ring, leading to
warnings such as:

[drm:intel_cleanup_ring_buffer] *ERROR* failed to quiesce render ring
whilst cleaning up: -16

Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Reviewed-by: Eric Anholt &lt;eric@anholt.net&gt;
Signed-off-by: Keith Packard &lt;keithp@keithp.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/i915: proper use of forcewake</title>
<updated>2011-05-10T20:56:45+00:00</updated>
<author>
<name>Ben Widawsky</name>
<email>ben@bwidawsk.net</email>
</author>
<published>2011-04-25T18:22:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b7287d8054d219b3009f7ca82edf24f89fd363e5'/>
<id>b7287d8054d219b3009f7ca82edf24f89fd363e5</id>
<content type='text'>
Moved the macros around to properly do reads and writes for the given
GPU. This is to address special requirements for gen6 (SNB) reads and
writes.

Registers in the range 0-0x40000 on gen6 platforms require special
handling. Instead of relying on the callers to pick the registers
correctly, move the logic into the read and write functions.

Signed-off-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Moved the macros around to properly do reads and writes for the given
GPU. This is to address special requirements for gen6 (SNB) reads and
writes.

Registers in the range 0-0x40000 on gen6 platforms require special
handling. Instead of relying on the callers to pick the registers
correctly, move the logic into the read and write functions.

Signed-off-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/1915: ringbuffer wait for idle function</title>
<updated>2011-05-10T20:56:40+00:00</updated>
<author>
<name>Ben Widawsky</name>
<email>ben@bwidawsk.net</email>
</author>
<published>2011-03-20T01:14:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=96f298aa9c9fc9b7c8a2ebaf8c195d178f570e09'/>
<id>96f298aa9c9fc9b7c8a2ebaf8c195d178f570e09</id>
<content type='text'>
Added a new function which waits for the ringbuffer space to be equal to
(total - 8). This is the empty condition of the ringbuffer, and
equivalent to head==tail.

Also modified two users of this functionality elsewhere in the code.

Signed-off-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Added a new function which waits for the ringbuffer space to be equal to
(total - 8). This is the empty condition of the ringbuffer, and
equivalent to head==tail.

Also modified two users of this functionality elsewhere in the code.

Signed-off-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'drm-intel-fixes' into drm-intel-next</title>
<updated>2011-03-07T12:35:15+00:00</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2011-03-07T12:32:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=47ae63e0c2e5fdb582d471dc906eb29be94c732f'/>
<id>47ae63e0c2e5fdb582d471dc906eb29be94c732f</id>
<content type='text'>
Apply the trivial conflicting regression fixes, but keep GPU semaphores
enabled.

Conflicts:
	drivers/gpu/drm/i915/i915_drv.h
	drivers/gpu/drm/i915/i915_gem_execbuffer.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Apply the trivial conflicting regression fixes, but keep GPU semaphores
enabled.

Conflicts:
	drivers/gpu/drm/i915/i915_drv.h
	drivers/gpu/drm/i915/i915_gem_execbuffer.c
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/i915: Do not overflow the MMADDR write FIFO</title>
<updated>2011-03-06T09:07:46+00:00</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2011-03-04T19:22:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=91355834646328e7edc6bd25176ae44bcd7386c7'/>
<id>91355834646328e7edc6bd25176ae44bcd7386c7</id>
<content type='text'>
Whilst the GT is powered down (rc6), writes to MMADDR are placed in a
FIFO by the System Agent. This is a limited resource, only 64 entries, of
which 20 are reserved for Display and PCH writes, and so we must take
care not to queue up too many writes. To avoid this, there is counter
which we can poll to ensure there are sufficient free entries in the
fifo.

"Issuing a write to a full FIFO is not supported; at worst it could
result in corruption or a system hang."

Reported-and-Tested-by: Matt Turner &lt;mattst88@gmail.com&gt;
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34056
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Whilst the GT is powered down (rc6), writes to MMADDR are placed in a
FIFO by the System Agent. This is a limited resource, only 64 entries, of
which 20 are reserved for Display and PCH writes, and so we must take
care not to queue up too many writes. To avoid this, there is counter
which we can poll to ensure there are sufficient free entries in the
fifo.

"Issuing a write to a full FIFO is not supported; at worst it could
result in corruption or a system hang."

Reported-and-Tested-by: Matt Turner &lt;mattst88@gmail.com&gt;
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34056
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/i915: Refine tracepoints</title>
<updated>2011-02-07T14:59:18+00:00</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2011-02-03T11:57:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=db53a302611c06bde01851f61fa0675a84ca018c'/>
<id>db53a302611c06bde01851f61fa0675a84ca018c</id>
<content type='text'>
A lot of minor tweaks to fix the tracepoints, improve the outputting for
ftrace, and to generally make the tracepoints useful again. It is a start
and enough to begin identifying performance issues and gaps in our
coverage.

Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A lot of minor tweaks to fix the tracepoints, improve the outputting for
ftrace, and to generally make the tracepoints useful again. It is a start
and enough to begin identifying performance issues and gaps in our
coverage.

Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
