<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/drivers/edac/mce_amd.c, branch v5.0</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>EDAC, amd64: Add Hygon Dhyana support</title>
<updated>2018-09-27T16:38:26+00:00</updated>
<author>
<name>Pu Wen</name>
<email>puwen@hygon.cn</email>
</author>
<published>2018-09-27T14:31:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c4a3e94641449362ee970f521a2cdb0e8cd08690'/>
<id>c4a3e94641449362ee970f521a2cdb0e8cd08690</id>
<content type='text'>
Add support for Hygon Dhyana CPU to EDAC.

Signed-off-by: Pu Wen &lt;puwen@hygon.cn&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: mchehab@kernel.org
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: thomas.lendacky@amd.com
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/9d71061301177822bc55b3bfd44f91057458d886.1537533369.git.puwen@hygon.cn
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add support for Hygon Dhyana CPU to EDAC.

Signed-off-by: Pu Wen &lt;puwen@hygon.cn&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: mchehab@kernel.org
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: thomas.lendacky@amd.com
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/9d71061301177822bc55b3bfd44f91057458d886.1537533369.git.puwen@hygon.cn
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type</title>
<updated>2018-02-21T16:00:54+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2018-02-21T10:18:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=68627a697c195937672ce07683094c72b1174786'/>
<id>68627a697c195937672ce07683094c72b1174786</id>
<content type='text'>
Currently, bank 4 is reserved on Fam17h, so we chose not to initialize
bank 4 in the smca_banks array. This means that when we check if a bank
is initialized, like during boot or resume, we will see that bank 4 is
not initialized and try to initialize it.

This will cause a call trace, when resuming from suspend, due to
rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue
an IPI but we're running with interrupts disabled. This triggers:

  WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0
  ...

Reserved banks will be read-as-zero, so their MCA_IPID register will be
zero. So, like the smca_banks array, the threshold_banks array will not
have an entry for a reserved bank since all its MCA_MISC* registers will
be zero.

Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0.

Use the "Reserved" type when checking if a bank is reserved. It's
possible that other bank numbers may be reserved on future systems.

Don't try to find the block address on reserved banks.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt; # 4.14.x
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, bank 4 is reserved on Fam17h, so we chose not to initialize
bank 4 in the smca_banks array. This means that when we check if a bank
is initialized, like during boot or resume, we will see that bank 4 is
not initialized and try to initialize it.

This will cause a call trace, when resuming from suspend, due to
rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue
an IPI but we're running with interrupts disabled. This triggers:

  WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0
  ...

Reserved banks will be read-as-zero, so their MCA_IPID register will be
zero. So, like the smca_banks array, the threshold_banks array will not
have an entry for a reserved bank since all its MCA_MISC* registers will
be zero.

Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0.

Use the "Reserved" type when checking if a bank is reserved. It's
possible that other bank numbers may be reserved on future systems.

Don't try to find the block address on reserved banks.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt; # 4.14.x
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC, mce_amd: Get rid of local var in amd_filter_mce()</title>
<updated>2017-08-21T15:59:38+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-07-25T09:09:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=398443471f1698003832972637d5508a0c0809e0'/>
<id>398443471f1698003832972637d5508a0c0809e0</id>
<content type='text'>
... and use the macro for that.

No functionality change.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... and use the macro for that.

No functionality change.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC, mce_amd: Get rid of most struct cpuinfo_x86 uses</title>
<updated>2017-08-21T15:54:57+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-07-25T09:07:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=f3c0891c2feafa008cc35d6ffb8cf593df66c867'/>
<id>f3c0891c2feafa008cc35d6ffb8cf593df66c867</id>
<content type='text'>
struct mce.cpuid contains CPUID(1).EAX which contains family, model and
stepping and thus has enough information for our purposes. Thus get rid
of some external dependencies which are not really needed.

No functionality change.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
struct mce.cpuid contains CPUID(1).EAX which contains family, model and
stepping and thus has enough information for our purposes. Thus get rid
of some external dependencies which are not really needed.

No functionality change.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC, mce_amd: Rename decode_smca_errors() to decode_smca_error()</title>
<updated>2017-08-21T15:44:09+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-07-25T08:44:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=4ab1784b48b384e52a6539ab10201fed7a3127f5'/>
<id>4ab1784b48b384e52a6539ab10201fed7a3127f5</id>
<content type='text'>
Singular fits better because it decodes a single error.

No functionality change.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Singular fits better because it decodes a single error.

No functionality change.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC, mce_amd: Use cpu_to_node() to find the node ID</title>
<updated>2017-07-17T05:01:08+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2017-03-20T20:26:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fbe63acf62f57f8e51adae602c5ce1025002d5ee'/>
<id>fbe63acf62f57f8e51adae602c5ce1025002d5ee</id>
<content type='text'>
Using the homegrown amd_get_nb_id() to find a node ID on AMD was fine
while the L3 to node mapping was 1:1. And Zen topology broke this. So
let's start slowly moving away from it and use the topology interfaces
instead.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: http://lkml.kernel.org/r/1490041614-90057-2-git-send-email-Yazen.Ghannam@amd.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using the homegrown amd_get_nb_id() to find a node ID on AMD was fine
while the L3 to node mapping was 1:1. And Zen topology broke this. So
let's start slowly moving away from it and use the topology interfaces
instead.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: x86-ml &lt;x86@kernel.org&gt;
Link: http://lkml.kernel.org/r/1490041614-90057-2-git-send-email-Yazen.Ghannam@amd.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC, mce_amd: Fix typo in SMCA error description</title>
<updated>2017-06-12T17:03:55+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2017-06-12T16:58:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bdf1bf1744355d83cd44e160f2d3dcc28df5140b'/>
<id>bdf1bf1744355d83cd44e160f2d3dcc28df5140b</id>
<content type='text'>
Fix typo in "poison consumption" error description.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/1497286703-62853-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix typo in "poison consumption" error description.

Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/1497286703-62853-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2017-02-20T20:47:44+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-02-20T20:47:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=60c906bab124a0627fba04c9ca5e61bba4747c0c'/>
<id>60c906bab124a0627fba04c9ca5e61bba4747c0c</id>
<content type='text'>
Pull RAS updates from Ingo Molnar:
 "The main changes in this cycle were:

  - Assign notifier chain priorities for all RAS related handlers to
    make the ordering explicit (Borislav Petkov)

  - Improve the AMD MCA banks sysfs output (Yazen Ghannam)

  - Various cleanups and restructuring of the x86 RAS code (Borislav
    Petkov)"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority
  x86/ras: Get rid of mce_process_work()
  EDAC/mce/amd: Dump TSC value
  EDAC/mce/amd: Unexport amd_decode_mce()
  x86/ras/amd/inj: Change dependency
  x86/ras: Flip the TSC-adding logic
  x86/ras/amd: Make sysfs names of banks more user-friendly
  x86/ras/therm_throt: Do not log a fake MCE for thermal events
  x86/ras/inject: Make it depend on X86_LOCAL_APIC=y
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull RAS updates from Ingo Molnar:
 "The main changes in this cycle were:

  - Assign notifier chain priorities for all RAS related handlers to
    make the ordering explicit (Borislav Petkov)

  - Improve the AMD MCA banks sysfs output (Yazen Ghannam)

  - Various cleanups and restructuring of the x86 RAS code (Borislav
    Petkov)"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority
  x86/ras: Get rid of mce_process_work()
  EDAC/mce/amd: Dump TSC value
  EDAC/mce/amd: Unexport amd_decode_mce()
  x86/ras/amd/inj: Change dependency
  x86/ras: Flip the TSC-adding logic
  x86/ras/amd: Make sysfs names of banks more user-friendly
  x86/ras/therm_throt: Do not log a fake MCE for thermal events
  x86/ras/inject: Make it depend on X86_LOCAL_APIC=y
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC, mce_amd: Print IPID and Syndrome on a separate line</title>
<updated>2017-02-16T14:39:32+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>Yazen.Ghannam@amd.com</email>
</author>
<published>2017-02-15T20:56:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=75bf2f6478cab9b0c1d7f5f674a765d1e2ad530e'/>
<id>75bf2f6478cab9b0c1d7f5f674a765d1e2ad530e</id>
<content type='text'>
Currently, the IPID and Syndrome are printed on the same line as the
Address. There are cases when we can have a valid Syndrome but not a
valid Address.

For example, the MCA_SYND register can be used to hold more detailed
error info that the hardware folks can use. It's not just DRAM ECC
syndromes. There are some error types that aren't related to memory that
may have valid syndromes, like some errors related to links in the Data
Fabric, etc.

In these cases, the IPID and Syndrome are not printed at the same log
level as the rest of the stanza, so users won't see them on the console.

Console:
  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Dmesg:
  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  , Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Print the IPID first and on a new line. The IPID should always be
printed on SMCA systems. The Syndrome will then be printed with the IPID
and at the same log level when valid:

  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  [Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Signed-off-by: Yazen Ghannam &lt;Yazen.Ghannam@amd.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, the IPID and Syndrome are printed on the same line as the
Address. There are cases when we can have a valid Syndrome but not a
valid Address.

For example, the MCA_SYND register can be used to hold more detailed
error info that the hardware folks can use. It's not just DRAM ECC
syndromes. There are some error types that aren't related to memory that
may have valid syndromes, like some errors related to links in the Data
Fabric, etc.

In these cases, the IPID and Syndrome are not printed at the same log
level as the rest of the stanza, so users won't see them on the console.

Console:
  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Dmesg:
  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  , Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Print the IPID first and on a new line. The IPID should always be
printed on SMCA systems. The Syndrome will then be printed with the IPID
and at the same log level when valid:

  [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
  [Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000
  [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Signed-off-by: Yazen Ghannam &lt;Yazen.Ghannam@amd.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC, mce_amd: Give more context to deferred error message</title>
<updated>2017-01-28T12:03:29+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>Yazen.Ghannam@amd.com</email>
</author>
<published>2017-01-24T22:32:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=67d7fd306ef2ef1ba5cdb8ce2dfde1339d4c8136'/>
<id>67d7fd306ef2ef1ba5cdb8ce2dfde1339d4c8136</id>
<content type='text'>
Users may not be familiar with the concept of deferred errors. There is
no action for users to take on this type of error, so give more context
in the error message to make this more clear.

Signed-off-by: Yazen Ghannam &lt;Yazen.Ghannam@amd.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/1485297149-13733-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Users may not be familiar with the concept of deferred errors. There is
no action for users to take on this type of error, so give more context
in the error message to make this more clear.

Signed-off-by: Yazen Ghannam &lt;Yazen.Ghannam@amd.com&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/1485297149-13733-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
