summaryrefslogtreecommitdiff
path: root/drivers/edac
AgeCommit message (Collapse)Author
2017-05-25EDAC, sb_edac: Carve out dimm-populating loopBorislav Petkov
... to slim down get_dimm_config(). No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Fix mod_nameBorislav Petkov
It is called "sb_edac.c" now. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Assign EDAC memory controller per h/w controllerQiuxu Zhuo
Tony pointed out: "currently the driver pretends there is one big 8-channel memory controller per socket instead of 2 4-channel controllers. This is fine with all memory controller populated with symmetrical DIMM configurations, but runs into difficulties on asymmetrical setups". Restructure the driver to assign an EDAC memory controller to each real h/w memory controller to resolve the issue. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000731.87793-1-qiuxu.zhuo@intel.com [ Break some lines at convenient points. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Don't use "Socket#" in the memory controller nameTony Luck
EDAC assigns logical memory controller numbers in the order that we find memory controllers, which depends on which PCI bus they are on. Some systems end up with MC0 on socket0, others (e.g Haswell) have MC0 on socket3. All this is made more confusing for users because we use the string "Socket" while generating names for memory controllers, but the number that we attach there is the memory controller number. E.g. EDAC MC0: Giving out device to module sbridge_edac.c controller Haswell Socket#0: DEV 0000:ff:12.0 (INTERRUPT) Change the names to say "SrcID#%d" (where the number we use is read from the h/w associated with the memory controller instead of some logical number internal to the EDAC driver). New message: EDAC MC0: Giving out device to module sbridge_edac.c controller Haswell SrcID#3: DEV 0000:ff:12.0 (INTERRUPT) Reported-by: Andrey Korolyov <andrey@xdel.ru> Reported-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000603.87748-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Classify PCI-IDs by topologyQiuxu Zhuo
Each of the PCI device IDs belongs to a CPU socket, or to one of the integrated memory controllers. Provide an enum to specify the domain of each, and distinguish the resource number in each domain: the number of the PCI device IDs per integrated memory controller/socket, and the number of integrated memory controllers per socket. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000533.87704-1-qiuxu.zhuo@intel.com [ Realign pci_dev_descr_knl members. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-24EDAC, altera: Constify irq_domain_opsTobias Klauser
struct irq_domain_ops is not modified, so it can be made const. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Cc: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170524133505.1233-1-tklauser@distanz.ch Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-03EDAC, amd64: Fix reporting of Chip Select sizes on Fam17hYazen Ghannam
The wrong index into the csbases/csmasks arrays was being passed to the function to compute the chip select sizes, which resulted in the wrong size being computed. Address that so that the correct values are computed and printed. Also, redo how we calculate the number of pages in a CS row. Reported-by: Benjamin Bennett <benbennett@gmail.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: <stable@vger.kernel.org> # 4.10.x Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1493313114-11260-1-git-send-email-Yazen.Ghannam@amd.com [ Remove unneeded integer math comment, minor cleanups. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-27EDAC, ghes: Do not enable it by defaultBorislav Petkov
Leave it to the user to decide whether to enable this or not. Otherwise, platform-specific drivers won't initialize (currently, EDAC supports only a single platform driver loaded). Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Rename report status accessorsBorislav Petkov
Change them to have the edac_ prefix. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Delete edac_stub.cBorislav Petkov
Move the remaining functionality to edac_mc.c. Convert "edac_report=" to a module parameter. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Update Kconfig help textBorislav Petkov
Remove the old URLs. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Remove EDAC_MM_EDACBorislav Petkov
Move all the EDAC core functionality behind CONFIG_EDAC and get rid of that indirection. Update defconfigs which had it. While at it, fix dependencies such that EDAC depends on RAS for the tracepoints. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: linux-edac@vger.kernel.org
2017-04-10EDAC: Issue tracepoint only when it is definedBorislav Petkov
... and this happens only when CONFIG_RAS is enabled. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Move edac_op_state to edac_mc.cBorislav Petkov
... as part of moving stuff away from edac_stub.c Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Remove edac_err_assertBorislav Petkov
... and the glue around it. It is not needed anymore. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Get rid of edac_handlersBorislav Petkov
Use mc_devices list instead to check whether we have EDAC driver instances successfully registered with EDAC core. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10x86/nmi, EDAC: Get rid of DRAM error reporting thru PCI SERR NMIBorislav Petkov
Apparently, some machines used to report DRAM errors through a PCI SERR NMI. This is why we have a call into EDAC in the NMI handler. See c0d121720220 ("drivers/edac: add new nmi rescan"). From looking at the patch above, that's two drivers: e752x_edac.c and e7xxx_edac.c. Now, I wanna say those are old machines which are probably decommissioned already. Tony says that "[t]the newest CPU supported by either of those drivers is the Xeon E7520 (a.k.a. "Nehalem") released in Q1'2010. Possibly some folks are still using these ... but people that hold onto h/w for 7 years generally cling to old s/w too ... so I'd guess it unlikely that we will get complaints for breaking these in upstream." So even if there is a small number still in use, we did load EDAC with edac_op_state == EDAC_OPSTATE_POLL by default (we still do, in fact) which means a default EDAC setup without any parameters supplied on the command line or otherwise would never even log the error in the NMI handler because we're polling by default: inline int edac_handler_set(void) { if (edac_op_state == EDAC_OPSTATE_POLL) return 0; return atomic_read(&edac_handlers); } So, long story short, I'd like to get rid of that nastiness called edac_stub.c and confine all the EDAC drivers solely to drivers/edac/. If we ever have to do stuff like that again, it should be notifiers we're using and not some insanity like this one. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com>
2017-04-10EDAC, highbank: Align Makefile directivesBorislav Petkov
... like the rest of the file. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-07EDAC, thunderx: Remove unused codeSergey Temerkhanov
Remove unused code reserved for upcoming CPUs. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David Daney <david.daney@cavium.com> Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170406113834.17153-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-07EDAC, thunderx: Change LMC index calculationSergey Temerkhanov
Shift the node number by 3 bits instead of 8 allowing proper functioning with default EDAC_MAX_MCS. Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David Daney <david.daney@cavium.com> Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170406113755.17082-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-06EDAC, altera: Fix peripheral warnings for Cyclone5Thor Thayer
The peripherals' RAS functionality only exist on the Arria10 SoCFPGA. The Cyclone5 initialization generates EDAC warnings when the peripherals aren't found in the device tree. Fix by checking for Arria10 in the init functions. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1491415262-5018-1-git-send-email-thor.thayer@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-05EDAC, thunderx: Fix L2C MCI interrupt disableJan Glauber
Fix a typo that disabled the MCI interrupts using the wrong bitmask. Signed-off-by: Jan Glauber <jglauber@cavium.com> Cc: David Daney <david.daney@cavium.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170405102739.6301-1-jglauber@cavium.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-27EDAC, thunderx: Add Cavium ThunderX EDAC driverSergey Temerkhanov
Add support for Cavium ThunderX EDAC capable on-chip peripherals, namely the DRAM controller (LMC), cache coherent processor interconnect (CCPI) and level 2 cache blocks (L2C-TAD, L2C-MCI, L2C-CBC) Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David.Daney@cavium.com Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170324222837.60583-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-26EDAC, pnd2_edac: Fix reported DIMM numberQiuxu Zhuo
DIMM number passed to edac_mc_handle_error() was accidentally hardcoded to zero. Pass in the correct daddr->dimm value. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-23EDAC, pnd2_edac: Fix !EDAC_DEBUG buildBorislav Petkov
Provide debugfs function stubs when EDAC_DEBUG is not enabled so that we don't fail the build: drivers/edac/pnd2_edac.c: In function ‘pnd2_init’: drivers/edac/pnd2_edac.c:1521:2: error: implicit declaration of function ‘setup_pnd2_debug’ [-Werror=implicit-function-declaration] setup_pnd2_debug(); ^ drivers/edac/pnd2_edac.c: In function ‘pnd2_exit’: drivers/edac/pnd2_edac.c:1529:2: error: implicit declaration of function ‘teardown_pnd2_debug’ [-Werror=implicit-function-declaration] teardown_pnd2_debug(); ^ Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-23EDAC: Select DEBUG_FSBorislav Petkov
The debugfs.c functionality relies on DEBUG_FS so select it. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-16EDAC, pnd2_edac: Add new EDAC driver for Intel SoC platformsTony Luck
Initial target for this driver is the Intel Apollo Lake platform and Denverton micro-server, they use the same internal memory controller IP called Pondicherry2. Memory controller registers are not in PCI config space like earlier Intel memory controllers. For Apollo Lake platform they are accessed via a "side-band" interface, for Denverton micro-server they are access via PCI config space and memory map I/O. This driver is for Apollo Lake and Denverton, but only the Denverton is fully enabled while we wait for the sideband driver. Apollo lake driver and initial cut at Denverton driver by Tony Luck. Extensive cleanup, refactoring and basic verification by Qiuxu Zhuo. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170308174539.14432-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-09EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macroJérémy Lefaure
The MTR_DRAM_WIDTH macro returns the data width. It is sometimes used as if it returned a boolean true if the width if 8. Fix the tests where MTR_DRAM_WIDTH is misused. Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170309011809.8340-1-jeremy.lefaure@lse.epita.fr Signed-off-by: Borislav Petkov <bp@suse.de>
2017-03-06EDAC, xgene: Fix wrongly spelled "procesing"Colin Ian King
Fix spelling mistake in dev_err message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Loc Ho <lho@apm.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170223002609.9440-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-02-20Merge branch 'ras-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "The main changes in this cycle were: - Assign notifier chain priorities for all RAS related handlers to make the ordering explicit (Borislav Petkov) - Improve the AMD MCA banks sysfs output (Yazen Ghannam) - Various cleanups and restructuring of the x86 RAS code (Borislav Petkov)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority x86/ras: Get rid of mce_process_work() EDAC/mce/amd: Dump TSC value EDAC/mce/amd: Unexport amd_decode_mce() x86/ras/amd/inj: Change dependency x86/ras: Flip the TSC-adding logic x86/ras/amd: Make sysfs names of banks more user-friendly x86/ras/therm_throt: Do not log a fake MCE for thermal events x86/ras/inject: Make it depend on X86_LOCAL_APIC=y
2017-02-16EDAC, mce_amd: Print IPID and Syndrome on a separate lineYazen Ghannam
Currently, the IPID and Syndrome are printed on the same line as the Address. There are cases when we can have a valid Syndrome but not a valid Address. For example, the MCA_SYND register can be used to hold more detailed error info that the hardware folks can use. It's not just DRAM ECC syndromes. There are some error types that aren't related to memory that may have valid syndromes, like some errors related to links in the Data Fabric, etc. In these cases, the IPID and Syndrome are not printed at the same log level as the rest of the stanza, so users won't see them on the console. Console: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Dmesg: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b , Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002 [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Print the IPID first and on a new line. The IPID should always be printed on SMCA systems. The Syndrome will then be printed with the IPID and at the same log level when valid: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b [Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000 [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-02-14EDAC, amd64: Bump driver versionBorislav Petkov
Last time we did that was when we enabled Bulldozer. Now, we enabled Zen so it is only natural ... :-) Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
2017-02-09EDAC, fsl_ddr: Make locally used symbols staticWei Yongjun
Fix the following sparse warnings: drivers/edac/fsl_ddr_edac.c:148:1: warning: symbol 'dev_attr_inject_data_hi' was not declared. Should it be static? drivers/edac/fsl_ddr_edac.c:150:1: warning: symbol 'dev_attr_inject_data_lo' was not declared. Should it be static? drivers/edac/fsl_ddr_edac.c:152:1: warning: symbol 'dev_attr_inject_ctrl' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170209150424.15124-1-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-02-03EDAC, mpc85xx: Add T2080 l2-cache supportChris Packham
The L2 cache controller on the T2080 SoC has similar capabilities to the others already supported by the mpc85xx_edac driver. Add it to the list of compatible devices. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: devicetree@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170201231624.28843-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC, amd64: Add x86cpuid sanity check during initYazen Ghannam
Match one of the devices in amd64_cpuids[] before loading the module. This is an additional sanity check against users trying to load amd64_edac_mod on unsupported systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-9-git-send-email-Yazen.Ghannam@amd.com [ Get rid of err_ret label, make it a bit more readable this way. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC, amd64: Don't treat ECC disabled as failureYazen Ghannam
Having ECC disabled on a node doesn't necessarily mean that it's disabled for the entire system. So let's return a non-failing code when ECC is disabled on a node. This way we can skip initialization for the node but still continue with the remaining nodes. After probing all instances, make sure we have at least one MC device allocated. This issue is seen and fix tested on Fam15h and Fam17h MCM systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-8-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC: Add routine to check if MC devices list is emptyYazen Ghannam
We need to know if any MC devices have been allocated. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-7-git-send-email-Yazen.Ghannam@amd.com [ Prettify text. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC, amd64: Remove unused printing macrosYazen Ghannam
amd64_{debug,notice} don't have any users, so remove them. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-6-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC, amd64: Rework messages in ecc_enabled()Yazen Ghannam
Print the node number when informing that DRAM ECC is disabled so that we can show which nodes have DRAM ECC disabled. Also, print more detailed system information as edac_dbg(), so as to not bother general users. Switch amd64_notice to amd64_info to match the message above it. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-5-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC, amd64: Move global code out of instance functionsYazen Ghannam
We have a few functions that register/unregister an ECC error decoding routine. These functions are called when we init/remove instances. However, they are global and so don't need to be registered/unregistered multiple times. So move them out of the init/remove instance functions and into the module init/exit routines. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-4-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC, amd64: Free unused memory when init_one_instance() failsYazen Ghannam
Jump to memory freeing routines when init_one_instance() fails. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-3-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC, mce_amd: Give more context to deferred error messageYazen Ghannam
Users may not be familiar with the concept of deferred errors. There is no action for users to take on this type of error, so give more context in the error message to make this more clear. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-26EDAC, i7300: Test for the second channel properlyBorislav Petkov
REDMEMB[17] is the ECC_Locator bit, which, when set, identifies the CS[3:2] as the simbols in error. And thus the second channel. The macro computing it was wrong so get rid of it (it was used at one place only) and get rid of the conditional too. Generates better code this way anyway. Signed-off-by: Borislav Petkov <bp@suse.de> Reported-by: David Binderman <dcb314@hotmail.com> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-01-24x86/ras, EDAC, acpi: Assign MCE notifier handlers a priorityBorislav Petkov
Assign all notifiers on the MCE decode chain a priority so that they get called in the correct order. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24EDAC/mce/amd: Dump TSC valueBorislav Petkov
Dump the TSC value of the time when the MCE got logged. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-8-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24EDAC/mce/amd: Unexport amd_decode_mce()Borislav Petkov
It is not used outside of the driver anymore. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-7-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-23EDAC, sb_edac: Get rid of ->show_interleave_mode()Nicolas Iooss
Function sbridge_register_mci() sets pvt->info.show_interleave_mode to knl_show_interleave_mode() on Knight's Landing and show_interleave_mode() anywhere else. Merge show_interleave_mode() and knl_show_interleave_mode() in a single implementation and use it without an indirect function pointer. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170122172806.10412-1-nicolas.iooss_linux@m4x.org [ Call it get_intlv_mode_str(). ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-19EDAC: Expose per-DIMM error counts in sysfsAaron Miller
The old csrowX sysfs directories have per-csrow error counters, but the new dimmX directories do not currently expose error counts. EDAC already keeps these counts, add them to sysfs so per-DIMM counts are still available when CONFIG_EDAC_LEGACY_SYSFS=n. Signed-off-by: Aaron Miller <aaronmiller@fb.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20161103220153.3997328-1-aaronmiller@fb.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-16EDAC, amd64: Save and return err code from probe_one_instance()Yazen Ghannam
We should save the return code from probe_one_instance() so that it can be returned from the module init function. Otherwise, we'll be returning the -ENOMEM from above. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1484322741-41884-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-16EDAC, i82975x: Add ioremap_nocache() error handlingArvind Yadav
If ioremap_nocache() fails, it will return NULL. Which will then cause a NULL-pointer dereference. Handle the returned value properly. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1484549092-11349-1-git-send-email-arvind.yadav.cs@gmail.com [ Boris: massage commit message and improve error message. ] Signed-off-by: Borislav Petkov <bp@suse.de>