<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/include/linux/edac.h, branch v5.1-rc1</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>EDAC: Drop per-memory controller buses</title>
<updated>2018-11-13T20:55:24+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2018-11-06T11:35:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=861e6ed667c83d64a42b0db41a22d6b4de4e913f'/>
<id>861e6ed667c83d64a42b0db41a22d6b4de4e913f</id>
<content type='text'>
... and use the single edac_subsys object returned from
subsys_system_register(). The idea is to have a single bus
and multiple devices on it.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Mauro Carvalho Chehab &lt;mchehab+samsung@kernel.org&gt;
CC: Aristeu Rozanski Filho &lt;arozansk@redhat.com&gt;
CC: Greg KH &lt;gregkh@linuxfoundation.org&gt;
CC: Justin Ernst &lt;justin.ernst@hpe.com&gt;
CC: linux-edac &lt;linux-edac@vger.kernel.org&gt;
CC: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
CC: Russ Anderson &lt;rja@hpe.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20180926152752.GG5584@zn.tnic
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... and use the single edac_subsys object returned from
subsys_system_register(). The idea is to have a single bus
and multiple devices on it.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Mauro Carvalho Chehab &lt;mchehab+samsung@kernel.org&gt;
CC: Aristeu Rozanski Filho &lt;arozansk@redhat.com&gt;
CC: Greg KH &lt;gregkh@linuxfoundation.org&gt;
CC: Justin Ernst &lt;justin.ernst@hpe.com&gt;
CC: linux-edac &lt;linux-edac@vger.kernel.org&gt;
CC: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
CC: Russ Anderson &lt;rja@hpe.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20180926152752.GG5584@zn.tnic
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC: Raise the maximum number of memory controllers</title>
<updated>2018-09-27T05:52:05+00:00</updated>
<author>
<name>Justin Ernst</name>
<email>justin.ernst@hpe.com</email>
</author>
<published>2018-09-25T14:34:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=6b58859419554fb824e09cfdd73151a195473cbc'/>
<id>6b58859419554fb824e09cfdd73151a195473cbc</id>
<content type='text'>
We observe an oops in the skx_edac module during boot:

  EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
  EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
  EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
  ...
  EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
  EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
  EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
  Too many memory controllers: 16
  EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0

We observe there are two memory controllers per socket, with a limit
of 16. Raise the maximum number of memory controllers from 16 to 2 *
MAX_NUMNODES (1024).

[ bp: This is just a band-aid fix until we've sorted out the whole issue
  with the bus_type association and handling in EDAC and can get rid of
  this arbitrary limit. ]

Signed-off-by: Justin Ernst &lt;justin.ernst@hpe.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Russ Anderson &lt;russ.anderson@hpe.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/20180925143449.284634-1-justin.ernst@hpe.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We observe an oops in the skx_edac module during boot:

  EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
  EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
  EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
  ...
  EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
  EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
  EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
  Too many memory controllers: 16
  EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0

We observe there are two memory controllers per socket, with a limit
of 16. Raise the maximum number of memory controllers from 16 to 2 *
MAX_NUMNODES (1024).

[ bp: This is just a band-aid fix until we've sorted out the whole issue
  with the bus_type association and handling in EDAC and can get rid of
  this arbitrary limit. ]

Signed-off-by: Justin Ernst &lt;justin.ernst@hpe.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Russ Anderson &lt;russ.anderson@hpe.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/20180925143449.284634-1-justin.ernst@hpe.com
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC, ghes: Use CPER module handles to locate DIMMs</title>
<updated>2018-09-22T16:35:40+00:00</updated>
<author>
<name>Fan Wu</name>
<email>wufan@codeaurora.org</email>
</author>
<published>2018-09-19T01:59:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c798c88f3962ddff89c7aa818986caeecd46ab4c'/>
<id>c798c88f3962ddff89c7aa818986caeecd46ab4c</id>
<content type='text'>
Use SMBIOS module handle type 17, on platforms which provide valid
ones, to locate the corresponding DIMM and thus have per-DIMM error
counter updates.

Signed-off-by: Fan Wu &lt;wufan@codeaurora.org&gt;
[ Massage commit message. ]
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tyler Baicar &lt;baicar.tyler@gmail.com&gt;
Reviewed-by: James Morse &lt;james.morse@arm.com&gt;
Tested-by: Toshi Kani &lt;toshi.kani@hpe.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: baicar.tyler@gmail.com
Cc: john.garry@huawei.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: shiju.jose@huawei.com
Cc: tanxiaofei@huawei.com
Cc: wanghuiqiang@huawei.com
Link: http://lkml.kernel.org/r/1537322340-1860-1-git-send-email-wufan@codeaurora.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use SMBIOS module handle type 17, on platforms which provide valid
ones, to locate the corresponding DIMM and thus have per-DIMM error
counter updates.

Signed-off-by: Fan Wu &lt;wufan@codeaurora.org&gt;
[ Massage commit message. ]
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tyler Baicar &lt;baicar.tyler@gmail.com&gt;
Reviewed-by: James Morse &lt;james.morse@arm.com&gt;
Tested-by: Toshi Kani &lt;toshi.kani@hpe.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: baicar.tyler@gmail.com
Cc: john.garry@huawei.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: shiju.jose@huawei.com
Cc: tanxiaofei@huawei.com
Cc: wanghuiqiang@huawei.com
Link: http://lkml.kernel.org/r/1537322340-1860-1-git-send-email-wufan@codeaurora.org
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC: Add new memory type for non-volatile DIMMs</title>
<updated>2018-03-14T11:32:06+00:00</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2018-03-12T18:24:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=001f86137d3fca3c9002beaa7609c666715ebc70'/>
<id>001f86137d3fca3c9002beaa7609c666715ebc70</id>
<content type='text'>
There are now non-volatile versions of DIMMs. Add a new entry to "enum
mem_type" and a new string in edac_mem_types[].

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Aristeu Rozanski &lt;aris@redhat.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jean Delvare &lt;jdelvare@suse.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Qiuxu Zhuo &lt;qiuxu.zhuo@intel.com&gt;
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/20180312182430.10335-3-tony.luck@intel.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are now non-volatile versions of DIMMs. Add a new entry to "enum
mem_type" and a new string in edac_mem_types[].

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Aristeu Rozanski &lt;aris@redhat.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jean Delvare &lt;jdelvare@suse.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Qiuxu Zhuo &lt;qiuxu.zhuo@intel.com&gt;
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/20180312182430.10335-3-tony.luck@intel.com
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC: Get rid of mci-&gt;mod_ver</title>
<updated>2017-07-17T11:42:48+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-06-29T10:00:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c54182ec0e157988f0cafd1e8d37b68ab4210f87'/>
<id>c54182ec0e157988f0cafd1e8d37b68ab4210f87</id>
<content type='text'>
It is a write-only variable so get rid of it.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Robert Richter &lt;rric@kernel.org&gt;
Acked-by: Michal Simek &lt;michal.simek@xilinx.com&gt;
Acked-by: Thor Thayer &lt;thor.thayer@linux.intel.com&gt;
Acked-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Mark Gross &lt;mark.gross@intel.com&gt;
Cc: Tim Small &lt;tim@buttersideup.com&gt;
Cc: Ranganathan Desikan &lt;ravi@jetztechnologies.com&gt;
Cc: "Arvind R." &lt;arvino55@gmail.com&gt;
Cc: Jason Baron &lt;jbaron@akamai.com&gt;
Cc: "Sören Brinkmann" &lt;soren.brinkmann@xilinx.com&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: David Daney &lt;david.daney@cavium.com&gt;
Cc: Loc Ho &lt;lho@apm.com&gt;
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is a write-only variable so get rid of it.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Robert Richter &lt;rric@kernel.org&gt;
Acked-by: Michal Simek &lt;michal.simek@xilinx.com&gt;
Acked-by: Thor Thayer &lt;thor.thayer@linux.intel.com&gt;
Acked-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Mark Gross &lt;mark.gross@intel.com&gt;
Cc: Tim Small &lt;tim@buttersideup.com&gt;
Cc: Ranganathan Desikan &lt;ravi@jetztechnologies.com&gt;
Cc: "Arvind R." &lt;arvino55@gmail.com&gt;
Cc: Jason Baron &lt;jbaron@akamai.com&gt;
Cc: "Sören Brinkmann" &lt;soren.brinkmann@xilinx.com&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: David Daney &lt;david.daney@cavium.com&gt;
Cc: Loc Ho &lt;lho@apm.com&gt;
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC: Rename report status accessors</title>
<updated>2017-04-10T15:15:02+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-02-04T17:10:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bffc7dece92edd0b6445b76a378e2fa9e324c7ed'/>
<id>bffc7dece92edd0b6445b76a378e2fa9e324c7ed</id>
<content type='text'>
Change them to have the edac_ prefix.

No functionality change.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change them to have the edac_ prefix.

No functionality change.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC: Delete edac_stub.c</title>
<updated>2017-04-10T15:14:48+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-02-04T16:42:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=fee27d7d97886515a60cce38b4152b7f5b5a21fc'/>
<id>fee27d7d97886515a60cce38b4152b7f5b5a21fc</id>
<content type='text'>
Move the remaining functionality to edac_mc.c. Convert "edac_report=" to
a module parameter.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the remaining functionality to edac_mc.c. Convert "edac_report=" to
a module parameter.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC: Remove edac_err_assert</title>
<updated>2017-04-10T15:14:21+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-01-26T17:25:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=d3116a0837261405e0febb8043fe7040c8ebccb4'/>
<id>d3116a0837261405e0febb8043fe7040c8ebccb4</id>
<content type='text'>
... and the glue around it. It is not needed anymore.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... and the glue around it. It is not needed anymore.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>EDAC: Get rid of edac_handlers</title>
<updated>2017-04-10T15:14:17+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-01-26T15:49:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=97bb6c17ad5a0892beb45070dfe8c7d6d0e5326e'/>
<id>97bb6c17ad5a0892beb45070dfe8c7d6d0e5326e</id>
<content type='text'>
Use mc_devices list instead to check whether we have EDAC driver
instances successfully registered with EDAC core.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use mc_devices list instead to check whether we have EDAC driver
instances successfully registered with EDAC core.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/nmi, EDAC: Get rid of DRAM error reporting thru PCI SERR NMI</title>
<updated>2017-04-10T15:13:48+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2017-01-25T19:30:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=db47d5f856467ce0dd3af7e20a33df3d901266df'/>
<id>db47d5f856467ce0dd3af7e20a33df3d901266df</id>
<content type='text'>
Apparently, some machines used to report DRAM errors through a PCI SERR
NMI. This is why we have a call into EDAC in the NMI handler. See

  c0d121720220 ("drivers/edac: add new nmi rescan").

From looking at the patch above, that's two drivers: e752x_edac.c and
e7xxx_edac.c. Now, I wanna say those are old machines which are probably
decommissioned already.

Tony says that "[t]he newest CPU supported by either of those drivers
is the Xeon E7520 (a.k.a. "Nehalem") released in Q1'2010. Possibly some
folks are still using these ... but people that hold onto h/w for 7
years generally cling to old s/w too ... so I'd guess it unlikely that
we will get complaints for breaking these in upstream."

So even if there is a small number still in use, we did load EDAC with
edac_op_state == EDAC_OPSTATE_POLL by default (we still do, in fact)
which means a default EDAC setup without any parameters supplied on the
command line or otherwise would never even log the error in the NMI
handler because we're polling by default:

  inline int edac_handler_set(void)
  {
         if (edac_op_state == EDAC_OPSTATE_POLL)
                 return 0;

         return atomic_read(&amp;edac_handlers);
  }

So, long story short, I'd like to get rid of that nastiness called
edac_stub.c and confine all the EDAC drivers solely to drivers/edac/. If
we ever have to do stuff like that again, it should be notifiers we're
using and not some insanity like this one.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Apparently, some machines used to report DRAM errors through a PCI SERR
NMI. This is why we have a call into EDAC in the NMI handler. See

  c0d121720220 ("drivers/edac: add new nmi rescan").

From looking at the patch above, that's two drivers: e752x_edac.c and
e7xxx_edac.c. Now, I wanna say those are old machines which are probably
decommissioned already.

Tony says that "[t]he newest CPU supported by either of those drivers
is the Xeon E7520 (a.k.a. "Nehalem") released in Q1'2010. Possibly some
folks are still using these ... but people that hold onto h/w for 7
years generally cling to old s/w too ... so I'd guess it unlikely that
we will get complaints for breaking these in upstream."

So even if there is a small number still in use, we did load EDAC with
edac_op_state == EDAC_OPSTATE_POLL by default (we still do, in fact)
which means a default EDAC setup without any parameters supplied on the
command line or otherwise would never even log the error in the NMI
handler because we're polling by default:

  inline int edac_handler_set(void)
  {
         if (edac_op_state == EDAC_OPSTATE_POLL)
                 return 0;

         return atomic_read(&amp;edac_handlers);
  }

So, long story short, I'd like to get rid of that nastiness called
edac_stub.c and confine all the EDAC drivers solely to drivers/edac/. If
we ever have to do stuff like that again, it should be notifiers we're
using and not some insanity like this one.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
