<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/arch/x86/kernel/tsc.c, branch v3.2.18</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>sched/x86: Fix overflow in cyc2ns_offset</title>
<updated>2012-04-13T15:33:50+00:00</updated>
<author>
<name>Salman Qazi</name>
<email>sqazi@google.com</email>
</author>
<published>2012-03-10T00:41:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=193bc3a0209aececb61262bd432e82710c2d499a'/>
<id>193bc3a0209aececb61262bd432e82710c2d499a</id>
<content type='text'>
commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 upstream.

When a machine boots up, the TSC generally gets reset.  However,
when kexec is used to boot into a kernel, the TSC value would be
carried over from the previous kernel.  The computation of
cycns_offset in set_cyc2ns_scale is prone to an overflow, if the
machine has been up more than 208 days prior to the kexec.  The
overflow happens when we multiply *scale, even though there is
enough room to store the final answer.

We fix this issue by decomposing tsc_now into the quotient and
remainder of division by CYC2NS_SCALE_FACTOR and then performing
the multiplication separately on the two components.

Refactor code to share the calculation with the previous
fix in __cycles_2_ns().

Signed-off-by: Salman Qazi &lt;sqazi@google.com&gt;
Acked-by: John Stultz &lt;john.stultz@linaro.org&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: john stultz &lt;johnstul@us.ibm.com&gt;
Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 upstream.

When a machine boots up, the TSC generally gets reset.  However,
when kexec is used to boot into a kernel, the TSC value would be
carried over from the previous kernel.  The computation of
cycns_offset in set_cyc2ns_scale is prone to an overflow, if the
machine has been up more than 208 days prior to the kexec.  The
overflow happens when we multiply *scale, even though there is
enough room to store the final answer.

We fix this issue by decomposing tsc_now into the quotient and
remainder of division by CYC2NS_SCALE_FACTOR and then performing
the multiplication separately on the two components.

Refactor code to share the calculation with the previous
fix in __cycles_2_ns().

Signed-off-by: Salman Qazi &lt;sqazi@google.com&gt;
Acked-by: John Stultz &lt;john.stultz@linaro.org&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: john stultz &lt;johnstul@us.ibm.com&gt;
Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86, tsc: Skip refined tsc calibration on systems with reliable TSC</title>
<updated>2012-04-02T16:53:09+00:00</updated>
<author>
<name>Alok Kataria</name>
<email>akataria@vmware.com</email>
</author>
<published>2012-02-22T02:19:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=94e75cfe69ac1d91bae5998ac0f21e355ce93a8b'/>
<id>94e75cfe69ac1d91bae5998ac0f21e355ce93a8b</id>
<content type='text'>
commit 57779dc2b3b75bee05ef5d1ada47f615f7a13932 upstream.

While running the latest Linux as guest under VMware in highly
over-committed situations, we have seen cases when the refined TSC
algorithm fails to get a valid tsc_start value in
tsc_refine_calibration_work from multiple attempts. As a result the
kernel keeps on scheduling the tsc_irqwork task for later. Subsequently
after several attempts when it gets a valid start value it goes through
the refined calibration and either bails out or uses the new results.
Given that the kernel originally read the TSC frequency from the
platform, which is the best it can get, I don't think there is much
value in refining it.

So  for systems which get the TSC frequency from the platform we
should skip the refined tsc algorithm.

We can use the TSC_RELIABLE cpu cap flag to detect this, right now it is
set only on VMware and for Moorestown Penwell both of which have there
own TSC calibration methods.

Signed-off-by: Alok N Kataria &lt;akataria@vmware.com&gt;
Cc: John Stultz &lt;johnstul@us.ibm.com&gt;
Cc: Dirk Brandewie &lt;dirk.brandewie@gmail.com&gt;
Cc: Alan Cox &lt;alan@linux.intel.com&gt;
[jstultz: Reworked to simply not schedule the refining work,
rather then scheduling the work and bombing out later]
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 57779dc2b3b75bee05ef5d1ada47f615f7a13932 upstream.

While running the latest Linux as guest under VMware in highly
over-committed situations, we have seen cases when the refined TSC
algorithm fails to get a valid tsc_start value in
tsc_refine_calibration_work from multiple attempts. As a result the
kernel keeps on scheduling the tsc_irqwork task for later. Subsequently
after several attempts when it gets a valid start value it goes through
the refined calibration and either bails out or uses the new results.
Given that the kernel originally read the TSC frequency from the
platform, which is the best it can get, I don't think there is much
value in refining it.

So  for systems which get the TSC frequency from the platform we
should skip the refined tsc algorithm.

We can use the TSC_RELIABLE cpu cap flag to detect this, right now it is
set only on VMware and for Moorestown Penwell both of which have there
own TSC calibration methods.

Signed-off-by: Alok N Kataria &lt;akataria@vmware.com&gt;
Cc: John Stultz &lt;johnstul@us.ibm.com&gt;
Cc: Dirk Brandewie &lt;dirk.brandewie@gmail.com&gt;
Cc: Alan Cox &lt;alan@linux.intel.com&gt;
[jstultz: Reworked to simply not schedule the refining work,
rather then scheduling the work and bombing out later]
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branches 'x86-detect-hyper-for-linus', 'x86-fpu-for-linus', 'x86-kexec-for-linus', 'x86-platform-for-linus', 'x86-quirks-for-linus', 'x86-tsc-for-linus' and 'x86-smpboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip</title>
<updated>2011-07-23T17:38:21+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-07-23T17:38:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=b4db920c7f524b2cd0f5ae7efbbbbfd2c76a27da'/>
<id>b4db920c7f524b2cd0f5ae7efbbbbfd2c76a27da</id>
<content type='text'>
* 'x86-detect-hyper-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, hyper: Change hypervisor detection order

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-32, fpu: Fix DNA exception during check_fpu()

* 'x86-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  kexec, x86: Fix incorrect jump back address if not preserving context

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, config: Introduce an INTEL_MID configuration

* 'x86-quirks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, quirks: Use pci_dev-&gt;revision

* 'x86-tsc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: tsc: Remove unneeded DMI-based blacklisting

* 'x86-smpboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, boot: Wait for boot cpu to show up if nr_cpus limit is about to hit
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'x86-detect-hyper-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, hyper: Change hypervisor detection order

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-32, fpu: Fix DNA exception during check_fpu()

* 'x86-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  kexec, x86: Fix incorrect jump back address if not preserving context

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, config: Introduce an INTEL_MID configuration

* 'x86-quirks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, quirks: Use pci_dev-&gt;revision

* 'x86-tsc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: tsc: Remove unneeded DMI-based blacklisting

* 'x86-smpboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, boot: Wait for boot cpu to show up if nr_cpus limit is about to hit
</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64: Move vread_tsc and vread_hpet into the vDSO</title>
<updated>2011-07-15T00:57:05+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@mit.edu</email>
</author>
<published>2011-07-14T10:47:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=98d0ac38ca7b1b7a552c9a2359174ff84decb600'/>
<id>98d0ac38ca7b1b7a552c9a2359174ff84decb600</id>
<content type='text'>
The vsyscall page now consists entirely of trap instructions.

Cc: John Stultz &lt;johnstul@us.ibm.com&gt;
Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Link: http://lkml.kernel.org/r/637648f303f2ef93af93bae25186e9a1bea093f5.1310639973.git.luto@mit.edu
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The vsyscall page now consists entirely of trap instructions.

Cc: John Stultz &lt;johnstul@us.ibm.com&gt;
Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Link: http://lkml.kernel.org/r/637648f303f2ef93af93bae25186e9a1bea093f5.1310639973.git.luto@mit.edu
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>clocksource: Replace vread with generic arch data</title>
<updated>2011-07-13T18:23:12+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@mit.edu</email>
</author>
<published>2011-07-13T13:24:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=433bd805e5fd2c731b3a9025b034f066272d336e'/>
<id>433bd805e5fd2c731b3a9025b034f066272d336e</id>
<content type='text'>
The vread field was bloating struct clocksource everywhere except
x86_64, and I want to change the way this works on x86_64, so let's
split it out into per-arch data.

Cc: x86@kernel.org
Cc: Clemens Ladisch &lt;clemens@ladisch.de&gt;
Cc: linux-ia64@vger.kernel.org
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: John Stultz &lt;johnstul@us.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Link: http://lkml.kernel.org/r/3ae5ec76a168eaaae63f08a2a1060b91aa0b7759.1310563276.git.luto@mit.edu
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The vread field was bloating struct clocksource everywhere except
x86_64, and I want to change the way this works on x86_64, so let's
split it out into per-arch data.

Cc: x86@kernel.org
Cc: Clemens Ladisch &lt;clemens@ladisch.de&gt;
Cc: linux-ia64@vger.kernel.org
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: John Stultz &lt;johnstul@us.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Link: http://lkml.kernel.org/r/3ae5ec76a168eaaae63f08a2a1060b91aa0b7759.1310563276.git.luto@mit.edu
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: tsc: Remove unneeded DMI-based blacklisting</title>
<updated>2011-06-01T06:19:51+00:00</updated>
<author>
<name>Tero Roponen</name>
<email>tero.roponen@gmail.com</email>
</author>
<published>2011-05-31T07:24:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=df049672dddde4a2fdacf63fb32eb80146e26841'/>
<id>df049672dddde4a2fdacf63fb32eb80146e26841</id>
<content type='text'>
The blacklist was added in response to my bug report
(http://lkml.org/lkml/2006/1/19/362) and has never
contained more than the one entry describing my old
now dead ThinkPad 380XD laptop. As found out later
(http://lkml.org/lkml/2007/11/29/50), this special
treatment has been unnecessary for a long time, so
it can be removed.

Signed-off-by: Tero Roponen &lt;tero.roponen@gmail.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The blacklist was added in response to my bug report
(http://lkml.org/lkml/2006/1/19/362) and has never
contained more than the one entry describing my old
now dead ThinkPad 380XD laptop. As found out later
(http://lkml.org/lkml/2007/11/29/50), this special
treatment has been unnecessary for a long time, so
it can be removed.

Signed-off-by: Tero Roponen &lt;tero.roponen@gmail.com&gt;
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64: Move vread_tsc into a new file with sensible options</title>
<updated>2011-05-24T12:51:29+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@MIT.EDU</email>
</author>
<published>2011-05-23T13:31:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=44259b1abfaa8bb819d25d41d71e8e33e25dd36a'/>
<id>44259b1abfaa8bb819d25d41d71e8e33e25dd36a</id>
<content type='text'>
vread_tsc is short and hot, and it's userspace code so the usual
reasons to enable -pg and turn off sibling calls don't apply.

(OK, turning off sibling calls has no effect.  But it might
someday...)

As an added benefit, tsc.c is profilable now.

Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Borislav Petkov &lt;bp@amd64.org&gt;
Link: http://lkml.kernel.org/r/%3C99c6d7f5efa3ccb65b4ac6eb443e1ab7bad47d7b.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vread_tsc is short and hot, and it's userspace code so the usual
reasons to enable -pg and turn off sibling calls don't apply.

(OK, turning off sibling calls has no effect.  But it might
someday...)

As an added benefit, tsc.c is profilable now.

Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Borislav Petkov &lt;bp@amd64.org&gt;
Link: http://lkml.kernel.org/r/%3C99c6d7f5efa3ccb65b4ac6eb443e1ab7bad47d7b.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64: Don't generate cmov in vread_tsc</title>
<updated>2011-05-24T12:51:28+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@MIT.EDU</email>
</author>
<published>2011-05-23T13:31:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3729db5ca2b2000c660e5a5d0eb68b1053212cab'/>
<id>3729db5ca2b2000c660e5a5d0eb68b1053212cab</id>
<content type='text'>
vread_tsc checks whether rdtsc returns something less than
cycle_last, which is an extremely predictable branch.  GCC likes
to generate a cmov anyway, which is several cycles slower than
a predicted branch.  This saves a couple of nanoseconds.

Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Borislav Petkov &lt;bp@amd64.org&gt;
Link: http://lkml.kernel.org/r/%3C561280649519de41352fcb620684dfb22bad6bac.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vread_tsc checks whether rdtsc returns something less than
cycle_last, which is an extremely predictable branch.  GCC likes
to generate a cmov anyway, which is several cycles slower than
a predicted branch.  This saves a couple of nanoseconds.

Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Borislav Petkov &lt;bp@amd64.org&gt;
Link: http://lkml.kernel.org/r/%3C561280649519de41352fcb620684dfb22bad6bac.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64: Remove unnecessary barrier in vread_tsc</title>
<updated>2011-05-24T12:51:28+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@MIT.EDU</email>
</author>
<published>2011-05-23T13:31:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=057e6a8c660e95c3f4e7162e00e2fee1fc90c50d'/>
<id>057e6a8c660e95c3f4e7162e00e2fee1fc90c50d</id>
<content type='text'>
RDTSC is completely unordered on modern Intel and AMD CPUs.  The
Intel manual says that lfence;rdtsc causes all previous instructions
to complete before the tsc is read, and the AMD manual says to use
mfence;rdtsc to do the same thing.

From a decent amount of testing [1] this is enough to make rdtsc
be ordered with respect to subsequent loads across a wide variety
of CPUs.

On Sandy Bridge (i7-2600), this improves a loop of
clock_gettime(CLOCK_MONOTONIC) by more than 5 ns/iter.

[1] https://lkml.org/lkml/2011/4/18/350

Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Borislav Petkov &lt;bp@amd64.org&gt;
Link: http://lkml.kernel.org/r/%3C1c158b9d74338aa5361f96dd473d0e6a58235302.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
RDTSC is completely unordered on modern Intel and AMD CPUs.  The
Intel manual says that lfence;rdtsc causes all previous instructions
to complete before the tsc is read, and the AMD manual says to use
mfence;rdtsc to do the same thing.

From a decent amount of testing [1] this is enough to make rdtsc
be ordered with respect to subsequent loads across a wide variety
of CPUs.

On Sandy Bridge (i7-2600), this improves a loop of
clock_gettime(CLOCK_MONOTONIC) by more than 5 ns/iter.

[1] https://lkml.org/lkml/2011/4/18/350

Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Borislav Petkov &lt;bp@amd64.org&gt;
Link: http://lkml.kernel.org/r/%3C1c158b9d74338aa5361f96dd473d0e6a58235302.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86-64: Clean up vdso/kernel shared variables</title>
<updated>2011-05-24T12:51:28+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@MIT.EDU</email>
</author>
<published>2011-05-23T13:31:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=8c49d9a74bac5ea3f18480307057241b808fcc0c'/>
<id>8c49d9a74bac5ea3f18480307057241b808fcc0c</id>
<content type='text'>
Variables that are shared between the vdso and the kernel are
currently a bit of a mess.  They are each defined with their own
magic, they are accessed differently in the kernel, the vsyscall page,
and the vdso, and one of them (vsyscall_clock) doesn't even really
exist.

This changes them all to use a common mechanism.  All of them are
delcared in vvar.h with a fixed address (validated by the linker
script).  In the kernel (as before), they look like ordinary
read-write variables.  In the vsyscall page and the vdso, they are
accessed through a new macro VVAR, which gives read-only access.

The vdso is now loaded verbatim into memory without any fixups.  As a
side bonus, access from the vdso is faster because a level of
indirection is removed.

While we're at it, pack jiffies and vgetcpu_mode into the same
cacheline.

Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Borislav Petkov &lt;bp@amd64.org&gt;
Link: http://lkml.kernel.org/r/%3C7357882fbb51fa30491636a7b6528747301b7ee9.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Variables that are shared between the vdso and the kernel are
currently a bit of a mess.  They are each defined with their own
magic, they are accessed differently in the kernel, the vsyscall page,
and the vdso, and one of them (vsyscall_clock) doesn't even really
exist.

This changes them all to use a common mechanism.  All of them are
delcared in vvar.h with a fixed address (validated by the linker
script).  In the kernel (as before), they look like ordinary
read-write variables.  In the vsyscall page and the vdso, they are
accessed through a new macro VVAR, which gives read-only access.

The vdso is now loaded verbatim into memory without any fixups.  As a
side bonus, access from the vdso is faster because a level of
indirection is removed.

While we're at it, pack jiffies and vgetcpu_mode into the same
cacheline.

Signed-off-by: Andy Lutomirski &lt;luto@mit.edu&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Borislav Petkov &lt;bp@amd64.org&gt;
Link: http://lkml.kernel.org/r/%3C7357882fbb51fa30491636a7b6528747301b7ee9.1306156808.git.luto%40mit.edu%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
