Age | Commit message (Collapse) | Author |
|
[ Upstream commit 25b223ddfe2a557307c05fe673e09d94ae950877 ]
The peripherals' RAS functionality only exist on the Arria10 SoCFPGA.
The Cyclone5 initialization generates EDAC warnings when the peripherals
aren't found in the device tree. Fix by checking for Arria10 in the init
functions.
Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1491415262-5018-1-git-send-email-thor.thayer@linux.intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b399151cb48db30ad1e0e93dd40d68c6d007b637 upstream.
x86_mask is a confusing name which is hard to associate with the
processor's stepping.
Additionally, correct an indent issue in lib/cpu.c.
Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com>
[ Updated it to more recent kernels. ]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 544e92581a2ac44607d7cc602c6b54d18656f56d upstream.
Fix an uninitialized variable warning in the Octeon EDAC driver, as seen
in MIPS cavium_octeon_defconfig builds since v4.14 with Codescape GNU
Tools 2016.05-03:
drivers/edac/octeon_edac-lmc.c In function ‘octeon_lmc_edac_poll_o2’:
drivers/edac/octeon_edac-lmc.c:87:24: warning: ‘((long unsigned int*)&int_reg)[1]’ may \
be used uninitialized in this function [-Wmaybe-uninitialized]
if (int_reg.s.sec_err || int_reg.s.ded_err) {
^
Iinitialise the whole int_reg variable to zero before the conditional
assignments in the error injection case.
Signed-off-by: James Hogan <jhogan@kernel.org>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Fixes: 1bc021e81565 ("EDAC: Octeon: Add error injection support")
Link: http://lkml.kernel.org/r/20171113161206.20990-1-james.hogan@mips.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a8c8261425649da58bdf08221570e5335ad33a31 ]
In the i5000 and i5400 drivers, the NRECMEMB register is defined as a
16-bit value, which results in wrong shifts in the code, as reported by
sparse.
In the datasheets ([1], section 3.9.22.20 and [2], section 3.9.22.21),
this register is a 32-bit register. A u32 value for the register fixes
the wrong shifts warnings and matches the datasheet.
Also fix the mask to access to the CAS bits [27:16] in the i5000 driver.
[1]: https://www.intel.com/content/dam/doc/datasheet/5000p-5000v-5000z-chipset-memory-controller-hub-datasheet.pdf
[2]: https://www.intel.se/content/dam/doc/datasheet/5400-chipset-memory-controller-hub-datasheet.pdf
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170629005729.8478-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e61555c29c28a4a3b6ba6207f4a0883ee236004d ]
The MTR_DRAM_WIDTH macro returns the data width. It is sometimes used
as if it returned a boolean true if the width if 8. Fix the tests where
MTR_DRAM_WIDTH is misused.
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170309011809.8340-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a8e9b186f153a44690ad0363a56716e7077ad28c ]
Add missing break statement in order to prevent the code from falling
through.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20171016174029.GA19757@embeddedor.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2287c63643f0f52d9d5452b9dc4079aec0889fe8 ]
We should save the return code from probe_one_instance() so that it can
be returned from the module init function. Otherwise, we'll be returning
the -ENOMEM from above.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1484322741-41884-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1bd9900b8301fc505f032c90ea487824cf824e99 ]
Match one of the devices in amd64_cpuids[] before loading the module.
This is an additional sanity check against users trying to load
amd64_edac_mod on unsupported systems.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1485537863-2707-9-git-send-email-Yazen.Ghannam@amd.com
[ Get rid of err_ret label, make it a bit more readable this way. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 75bf2f6478cab9b0c1d7f5f674a765d1e2ad530e ]
Currently, the IPID and Syndrome are printed on the same line as the
Address. There are cases when we can have a valid Syndrome but not a
valid Address.
For example, the MCA_SYND register can be used to hold more detailed
error info that the hardware folks can use. It's not just DRAM ECC
syndromes. There are some error types that aren't related to memory that
may have valid syndromes, like some errors related to links in the Data
Fabric, etc.
In these cases, the IPID and Syndrome are not printed at the same log
level as the rest of the stanza, so users won't see them on the console.
Console:
[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2
Dmesg:
[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
, Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2
Print the IPID first and on a new line. The IPID should always be
printed on SMCA systems. The Syndrome will then be printed with the IPID
and at the same log level when valid:
[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
[Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull EDAC updates from Borislav Petkov:
"A lot of movement in the EDAC tree this time around, coarse summary
below:
- Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO
buffers (Thor Thayer)
- split the memory controller part out of mpc85xx and share it with a
new Freescale ARM Layerscape driver (York Sun)
- amd64_edac fixes (Yazen Ghannam)
- misc cleanups, refactoring and fixes all over the place"
* tag 'edac_for_4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (37 commits)
EDAC, altera: Add IRQ Flags to disable IRQ while handling
EDAC, altera: Correct EDAC IRQ error message
EDAC, amd64: Autoload module using x86_cpu_id
EDAC, sb_edac: Remove NULL pointer check on array pci_tad
EDAC: Remove NO_IRQ from powerpc-only drivers
EDAC, fsl_ddr: Fix error return code in fsl_mc_err_probe()
EDAC, fsl_ddr: Add entry to MAINTAINERS
EDAC: Move Doug Thompson to CREDITS
EDAC, I3000: Orphan driver
EDAC, fsl_ddr: Replace simple_strtoul() with kstrtoul()
EDAC, layerscape: Add Layerscape EDAC support
EDAC, fsl_ddr: Fix IRQ dispose warning when module is removed
EDAC, fsl_ddr: Add support for little endian
EDAC, fsl_ddr: Add missing DDR DRAM types
EDAC, fsl_ddr: Rename macros and names
EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx
EDAC, mpc85xx: Replace printk() with pr_* format
EDAC, mpc85xx: Drop setting/clearing RFXE bit in HID1
EDAC, altera: Rename MC trigger to common name
EDAC, altera: Rename device trigger to common name
...
|
|
Add the IRQF_ONESHOT and IRQF_TRIGGER_HIGH flags to disable the IRQ
while executing the IRQ handler. Remove the IRQF_SHARED because these
are not shared IRQs in the domain. Exposed when flooding IRQs.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1474582419-7053-2-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Correct the error message sent out in the case of a single bit error IRQ
allocation.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1474582419-7053-1-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Reinstate driver autoloading now that PCI dependency is gone.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1473984445-1726-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Bank 4 is reserved on family 0x17 and shouldn't generate any MCE
records. However, broken hardware and software is not something unheard
of so warn about bank 4 errors. They shouldn't be coming from bank 4
naturally but users can still use mce_amd_inj to simulate errors from it
for testing purposed.
Also, avoid special handling in the injector mce_amd_inj like it is
being done on the older families.
[ bp: Rewrite commit message and merge into one patch. Use boot_cpu_data. ]
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Link: http://lkml.kernel.org/r/1473384591-5323-1-git-send-email-Yazen.Ghannam@amd.com
Link: http://lkml.kernel.org/r/1473384591-5323-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The MCA_SYND and MCA_IPID registers contain valuable information and
should be included in MCE output. The MCA_SYND register contains
syndrome and other error information, and the MCA_IPID register will
uniquely identify the MCA bank's type without having to rely on system
software.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472680624-34221-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Scalable MCA defines a number of IP types. An MCA bank on an SMCA
system is defined as one of these IP types. A bank's type is uniquely
identified by the combination of the HWID and MCATYPE values read from
its MCA_IPID register.
Add the required tables in order to be able to lookup error descriptions
based on a bank's type and the error's extended error code.
[ bp: Align comments, simplify a bit. ]
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472741832-1690-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The error descriptions defined for Fam17h can be reused for other SMCA
systems, so their names should reflect this.
Change f17h prefix to smca for error descriptions.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472673994-12235-4-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Add missing SMCA error descriptions to the error descriptions arrays.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472673994-12235-3-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Print SyndV bit status and print the raw value of the MCA_SYND register.
Further decoding of the syndrome from struct mce.synd can be done in
other places where appropriate, e.g. DRAM ECC.
Boris: make the error stanza more compact by putting the error address
and syndrome on the same line:
[Hardware Error]: Corrected error, no action required.
[Hardware Error]: CPU:2 (17:0:0) MC4_STATUS[-|CE|-|PCC|AddrV|-|-|SyndV|CECC]: 0x96204100001e0117
[Hardware Error]: Error Addr: 0x000000007f4c52e3, Syndrome: 0x0000000000000000
[Hardware Error]: Invalid IP block specified.
[Hardware Error]: cache level: L3/GEN, tx: DATA, mem-tx: RD
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1467633035-32080-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
pvt->pci_tad is a NUM_CHANNELS array of struct pci_dev pointers and
hence cannot be NULL, so the NULL pointer check on pci_tad is redundant.
Remove it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20160908083801.14766-1-colin.king@canonical.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
We'd like to eventually remove NO_IRQ on powerpc, so remove usages of it
from powerpc-only drivers.
The pdata structs are kzalloc'ed, so we don't need to initialise those
to 0, we can just drop the assignments entirely.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/1473674436-19467-1-git-send-email-mpe@ellerman.id.au
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Return negative error code from the edac_mc_add_mc() error handling case
instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: York Sun <york.sun@nxp.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1473350284-26482-1-git-send-email-weiyj.lk@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Replace obsolete simple_strtoul() with kstrtoul().
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1471990593-27536-1-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Add DDR EDAC driver for ARM-based compatible controllers. Both
big-endian and little-endian are supported, as specified in device tree.
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1471990465-27443-1-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
When compiled as a module, removing it causes kernel warnings
when irq_dispose_mapping() is called. Instead of calling
irq_of_parse_and_map(), use platform_get_irq() to acquire the IRQ
number.
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: morbidrsa@gmail.com
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-8-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Get endianness from device tree. Both big endian and little endian are
supported. Default to big endian for backwards compatibility to MPC85xx.
Signed-off-by: York Sun <york.sun@nxp.com>
Acked-by: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: morbidrsa@gmail.com
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-7-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
The compatible DDR controllers may support DDR, DDR2, DDR3, DDR4 DRAM.
An individual controller doesn't support all of them. The EDAC driver
reads SDRAM_CFG to determine which mode is configured.
Add DDR4 and drop the defines used only in the mtype assignment.
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: morbidrsa@gmail.com
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-6-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Use FSL-specific prefix for macros, variables and functions.
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-5-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
The mpc85xx-compatible DDR controllers are used on ARM-based SoCs too.
Carve out the DDR part from the mpc85xx EDAC driver in preparation to
support both architectures.
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470946525-3410-1-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Replace printk() with pr_err/pr_warn/pr_info macros.
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-3-git-send-email-york.sun@nxp.com
[ Boris: unbreak strings for easier greppability. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
On e500v1, read fault exception enable (RFXE) controls whether assertion
of core_fault_in causes a machine check interrupt. Assertion of
core_fault_in can result from uncorrectable data error, such as an L2
multi-bit ECC error. It can also occur from a system error if logic on
the integrated device signals a fault for nonfatal errors. RFXE bit is
cleared out of reset, and should be left clear for normal operation.
Assertion of core_fault_in does not cause a machine check.
RFXE is set specifically for RIO (Rapid IO) and PCI for book E to catch
the errors by machine check. With this bit set, the EDAC driver can't
get the interrupt in case of uncorrectable error. So this bit is cleared
in favor of EDAC. However, the benefit of catching such uncorrectable
error doesn't outweigh the other errors which may hang the system.
Besides, e500v2 has different errors masked by RFXE, and e500mc doesn't
support this bit. It is more reasonable to leave RFXE as is in the EDAC
driver, and leave the uncorrectable errors triggering machine check for
e500v1.
Suggested-by: Scott Wood <oss@buserror.net>
Signed-off-by: York Sun <york.sun@nxp.com>
Cc: Johannes Thumshirn <morbidrsa@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: oss@buserror.net
Cc: stuart.yoder@nxp.com
Link: http://lkml.kernel.org/r/1470779760-16483-2-git-send-email-york.sun@nxp.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Rename the Memory Controller debug trigger to the same common name as
the EDAC devices.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1471622666-15197-3-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
The L2 and OCRAM devices have different ecc trigger names than the other
EDAC devices (FIFO peripherals). Make them all the same and remove the
character array from the device structure.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1471622666-15197-2-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
This is an entirely new driver instead of yet another set of patches
to sb_edac.c because:
1) Mapping from PCI devices to socket/memory controller is significantly
different. Skylake scatters devices on a socket across a number of
PCI buses.
2) There is an extra level of interleaving via the "mcroute" register
that would be a little messy to squeeze into the old driver.
3) Validation is getting too expensive. Changes to sb_edac need to
be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and
Knights Landing.
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
According to the reference manual of MPC8572 and T4240, bit 31 of
PEX_ERR_CAP_STAT is W1C (write 1 to clear).
Add the corresponding write to PEX_ERR_CAP_STAT in order to fix the PCIe
error capture.
Tested on a T4240 processor.
Signed-off-by: Tillmann Heidsieck <theidsieck@leenox.de>
Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20160815190849.29327-1-theidsieck@leenox.de
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Replace the deprecated create_singlethread_workqueue() with
alloc_ordered_workqueue() with WQ_MEM_RECLAIM. This is the identity
conversion.
It's not recommended to stall it from memory pressure. Hence,
WQ_MEM_RECLAIM has been set to ensure forward progress under memory
pressure.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20160813164124.GA9077@Karyakshetra
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Fix the following sparse warning:
drivers/edac/altera_edac.c:1649:23: warning:
symbol 'a10_eccmgr_ic_ops' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Reviewed-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: lkml <linux-kernel@vger.kernel.org>
Link: http://lkml.kernel.org/r/1470836667-11822-1-git-send-email-weiyj.lk@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Add Altera Arria10 SD-MMC FIFO memory EDAC support. The SD-MMC is a
dual port RAM implementation which is different than any of the other
peripherals and therefore requires additional code.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1470753653-23465-3-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Fam15hMod60h systems are using the channel decode of Fam15hMod30h which
gives incorrect results. Fam15hMod60h systems should use the generic
channel decode method plus a couple more cases.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1470236355-30039-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Add Altera Arria10 QSPI FIFO memory support.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1468512408-5156-9-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Add Altera Arria10 USB FIFO memory support.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1468512408-5156-8-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Add Altera Arria10 DMA FIFO memory support.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1468512408-5156-7-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Add Altera Arria10 NAND FIFO memory support.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: dinguyen@opensource.altera.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1468512408-5156-6-git-send-email-tthayer@opensource.altera.com
[ Reformat loop in altr_edac_a10_probe() for better readability. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
On Intel Xeon Phi Knights Landing processor family the channels of the
memory controller have untypical arrangement - MC0 is mapped to CH3,4,5
and MC1 is mapped to CH0,1,2. This causes the EDAC driver to report the
channel name incorrectly.
We missed this change earlier, so the code already contains similar
comment, but the translation function is incorrect.
Without this patch:
errors in DIMM_A and DIMM_D were reported in DIMM_D
errors in DIMM_B and DIMM_E were reported in DIMM_E
errors in DIMM_C and DIMM_F were reported in DIMM_F
Correct this.
Hubert Chrzaniuk:
- rebased to 4.8
- comments and code cleanup
Fixes: d0cdf9003140 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support")
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: lukasz.anaczkowski@intel.com
Cc: lukasz.odzioba@intel.com
Cc: mchehab@kernel.org
Cc: <stable@vger.kernel.org> # v4.5..
Link: http://lkml.kernel.org/r/1469231089-22837-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
[ Boris: Simplify a bit by removing char mc. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Pull EDAC updates from Borislav Petkov:
"This last cycle, Thor was busy adding Arria10 eth FIFO support to the
altera_edac driver along with other improvements. We have two
cleanups/fixes too.
Summary:
- Altera Arria10 ethernet FIFO buffer support (Thor Thayer)
- Minor cleanups"
* tag 'edac_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
ARM: dts: Add Arria10 Ethernet EDAC devicetree entry
EDAC, altera: Add Arria10 Ethernet EDAC support
EDAC, altera: Add Arria10 ECC memory init functions
Documentation: dt: socfpga: Add Arria10 Ethernet binding
EDAC, altera: Drop some ifdeffery
EDAC, altera: Add panic flag check to A10 IRQ
EDAC, altera: Check parent status for Arria10 EDAC block
EDAC, altera: Make all private data structures static
EDAC: Correct channel count limit
EDAC, amd64_edac: Init opstate at the proper time during init
EDAC, altera: Handle Arria10 SDRAM child node
EDAC, altera: Add ECC Manager IRQ controller support
Documentation: dt: socfpga: Add interrupt-controller to ecc-manager
|
|
In commit 2c1ea4c700af ("EDAC, sb_edac: Use cpu family/model in driver
detection") I broke Knights Landing because I failed to notice that it
called a wrapper macro "sbridge_get_all_devices_knl" instead of
"sbridge_get_all_devices" like all the other types.
Now that we include the processor type in the pci_id_table structure we
can skip the wrappers and just have the sbridge_get_all_devices() check
the type to decide whether to allow duplicate devices and controllers to
have registers spread across buses.
Fixes: 2c1ea4c700af ("EDAC, sb_edac: Use cpu family/model in driver detection")
Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add Altera Arria10 Ethernet FIFO memory EDAC support. Update to support
a common compatibility string for all Ethernet FIFOs in the DT.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-8-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
In preparation for additional memory module ECCs, add the memory
initialization functions and helpers.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-7-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Make the IRQ and check_deps() functions available to all the memory
buffers by moving them outside of the OCRAM only area.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-5-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
In preparation for additional memory module ECCs, the IRQ function will
check a panic flag before doing a kernel panic on double bit errors.
OCRAM uncorrectable errors cause a panic because sleep/resume functions
and FPGA contents during sleep are stored in OCRAM.
ECCs on peripheral FIFO buffers will not cause a kernel panic on DBERRs
because the packet can be retried and therefore recovered.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1466603939-7526-3-git-send-email-tthayer@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
|