<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/arc/include/asm/delay.h, branch v4.10</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>ARC: udelay: fix inline assembler by adding LP_COUNT to clobber list</title>
<updated>2017-01-24T18:54:24+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2017-01-24T18:23:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=36425cd67052e3becf325fd4d3ba5691791ef7e4'/>
<id>36425cd67052e3becf325fd4d3ba5691791ef7e4</id>
<content type='text'>
commit 3c7c7a2fc8811bc ("ARC: Don't use "+l" inline asm constraint")
modified the inline assembly to setup LP_COUNT register manually and NOT
rely on gcc to do it (with the +l inline assembler contraint hint, now
being retired in the compiler)

However the fix was flawed as we didn't add LP_COUNT to asm clobber list,
meaning gcc doesn't know that LP_COUNT or zero-delay-loops are in action
in the inline asm.

This resulted in some fun - as nested ZOL loops were being generared

| mov lp_count,250000 ;16 # tmp235,
| lp .L__GCC__LP14 #		&lt;======= OUTER LOOP (gcc generated)
|   .L14:
|   ld r2, [r5] # MEM[(volatile u32 *)prephitmp_43], w
|   dmb 1
|   breq r2, -1, @.L21 #, w,,
|   bbit0 r2,1,@.L13 # w,,
|   ld r4,[r7] ;25 # loops_per_jiffy, loops_per_jiffy
|   mpymu r3,r4,r6 #, loops_per_jiffy, tmp234
|
|   mov lp_count, r3 #		 &lt;====== INNER LOOP (from inline asm)
|   lp 1f
| 	 nop
|   1:
|   nop_s
| .L__GCC__LP14: ; loop end, start is @.L14 #,

This caused issues with drivers relying on sane behaviour of udelay
friends.

With LP_COUNT added to clobber list, gcc doesn't generate the outer
loop in say above case.

Addresses STAR 9001146134

Reported-by: Joao Pinto &lt;jpinto@synopsys.com&gt;
Fixes: 3c7c7a2fc8811bc ("ARC: Don't use "+l" inline asm constraint")
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 3c7c7a2fc8811bc ("ARC: Don't use "+l" inline asm constraint")
modified the inline assembly to setup LP_COUNT register manually and NOT
rely on gcc to do it (with the +l inline assembler contraint hint, now
being retired in the compiler)

However the fix was flawed as we didn't add LP_COUNT to asm clobber list,
meaning gcc doesn't know that LP_COUNT or zero-delay-loops are in action
in the inline asm.

This resulted in some fun - as nested ZOL loops were being generared

| mov lp_count,250000 ;16 # tmp235,
| lp .L__GCC__LP14 #		&lt;======= OUTER LOOP (gcc generated)
|   .L14:
|   ld r2, [r5] # MEM[(volatile u32 *)prephitmp_43], w
|   dmb 1
|   breq r2, -1, @.L21 #, w,,
|   bbit0 r2,1,@.L13 # w,,
|   ld r4,[r7] ;25 # loops_per_jiffy, loops_per_jiffy
|   mpymu r3,r4,r6 #, loops_per_jiffy, tmp234
|
|   mov lp_count, r3 #		 &lt;====== INNER LOOP (from inline asm)
|   lp 1f
| 	 nop
|   1:
|   nop_s
| .L__GCC__LP14: ; loop end, start is @.L14 #,

This caused issues with drivers relying on sane behaviour of udelay
friends.

With LP_COUNT added to clobber list, gcc doesn't generate the outer
loop in say above case.

Addresses STAR 9001146134

Reported-by: Joao Pinto &lt;jpinto@synopsys.com&gt;
Fixes: 3c7c7a2fc8811bc ("ARC: Don't use "+l" inline asm constraint")
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: Don't use "+l" inline asm constraint</title>
<updated>2016-11-28T17:17:32+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2016-11-24T01:43:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3c7c7a2fc8811bc7097479f69acf2527693d7562'/>
<id>3c7c7a2fc8811bc7097479f69acf2527693d7562</id>
<content type='text'>
Apparenty this is coming in the way of gcc fix which inhibits the usage
of LP_COUNT as a gpr.

Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Apparenty this is coming in the way of gcc fix which inhibits the usage
of LP_COUNT as a gpr.

Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARCv2: Adhere to Zero Delay loop restriction</title>
<updated>2015-06-22T08:36:56+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2013-10-07T12:40:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8922bc3058abbe5deaf887147e26531750ce7513'/>
<id>8922bc3058abbe5deaf887147e26531750ce7513</id>
<content type='text'>
Branch insn can't be scheduled as last insn of Zero Overhead loop

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Branch insn can't be scheduled as last insn of Zero Overhead loop

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: Fix __udelay calculation</title>
<updated>2013-09-05T05:01:12+00:00</updated>
<author>
<name>Mischa Jonker</name>
<email>mjonker@synopsys.com</email>
</author>
<published>2013-08-30T09:56:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7efd0da2d17360e1cef91507dbe619db0ee2c691'/>
<id>7efd0da2d17360e1cef91507dbe619db0ee2c691</id>
<content type='text'>
Cast usecs to u64, to ensure that the (usecs * 4295 * HZ)
multiplication is 64 bit.

Initially, the (usecs * 4295 * HZ) part was done as a 32 bit
multiplication, with the result casted to 64 bit. This led to some bits
falling off, causing a "DMA initialization error" in the stmmac Ethernet
driver, due to a premature timeout.

Signed-off-by: Mischa Jonker &lt;mjonker@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cast usecs to u64, to ensure that the (usecs * 4295 * HZ)
multiplication is 64 bit.

Initially, the (usecs * 4295 * HZ) part was done as a 32 bit
multiplication, with the result casted to 64 bit. This led to some bits
falling off, causing a "DMA initialization error" in the stmmac Ethernet
driver, due to a premature timeout.

Signed-off-by: Mischa Jonker &lt;mjonker@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: Timers/counters/delay management</title>
<updated>2013-02-11T14:30:39+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2013-01-18T09:42:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d8005e6b95268cbb50db3773d5f180c32a9434fe'/>
<id>d8005e6b95268cbb50db3773d5f180c32a9434fe</id>
<content type='text'>
ARC700 includes 2 in-core 32bit timers TIMER0 and TIMER1.
Both have exactly same capabilies.

* programmable to count from TIMER&lt;n&gt;_CNT to TIMER&lt;n&gt;_LIMIT
* for count 0 and LIMIT ~1, provides a free-running counter by
    auto-wrapping when limit is reached.
* optionally interrupt when LIMIT is reached (oneshot event semantics)
* rearming the interrupt provides periodic semantics
* run at CPU clk

ARC Linux uses TIMER0 for clockevent (periodic/oneshot) and TIMER1 for
clocksource (free-running clock).

Newer cores provide RTSC insn which gives a 64bit cpu clk snapshot hence
is more apt for clocksource when available.

SMP poses a bit of challenge for global timekeeping clocksource /
sched_clock() backend:
 -TIMER1 based local clocks are out-of-sync hence can't be used
  (thus we default to jiffies based cs as well as sched_clock() one/both
  of which platform can override with it's specific hardware assist)
 -RTSC is only allowed in SMP if it's cross-core-sync (Kconfig glue
  ensures that) and thus usable for both requirements.

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ARC700 includes 2 in-core 32bit timers TIMER0 and TIMER1.
Both have exactly same capabilies.

* programmable to count from TIMER&lt;n&gt;_CNT to TIMER&lt;n&gt;_LIMIT
* for count 0 and LIMIT ~1, provides a free-running counter by
    auto-wrapping when limit is reached.
* optionally interrupt when LIMIT is reached (oneshot event semantics)
* rearming the interrupt provides periodic semantics
* run at CPU clk

ARC Linux uses TIMER0 for clockevent (periodic/oneshot) and TIMER1 for
clocksource (free-running clock).

Newer cores provide RTSC insn which gives a 64bit cpu clk snapshot hence
is more apt for clocksource when available.

SMP poses a bit of challenge for global timekeeping clocksource /
sched_clock() backend:
 -TIMER1 based local clocks are out-of-sync hence can't be used
  (thus we default to jiffies based cs as well as sched_clock() one/both
  of which platform can override with it's specific hardware assist)
 -RTSC is only allowed in SMP if it's cross-core-sync (Kconfig glue
  ensures that) and thus usable for both requirements.

Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
