summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-06-25ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelockVineet Gupta
A quad core SMP build could get into hardware livelock with concurrent LLOCK/SCOND. Workaround that by adding a PREFETCHW which is serialized by SCU (System Coherency Unit). It brings the cache line in Exclusive state and makes others invalidate their lines. This gives enough time for winner to complete the LLOCK/SCOND, before others can get the line back. The prefetchw in the ll/sc loop is not nice but this is the only software workaround for current version of RTL. Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25ARC: Reduce bitops lines of code using macrosVineet Gupta
No semantical changes ! Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25ARCv2: barriersVineet Gupta
ARCv2 based HS38 cores are weakly ordered and thus explicit barriers for kernel proper. SMP barrier is provided by DMB instruction which also guarantees local barrier hence used as backend of smp_*mb() as well as *mb() APIs Also hookup barriers into MMIO accessors to avoid ordering issues in IO Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25arch: conditionally define smp_{mb,rmb,wmb}Vineet Gupta
That way arches can define the minimal versions and still #include asm-generic for defaults (vs. defining defaults in arch code) See new barrier.h in arc for usage ! Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25ARC: add smp barriers around atomics per Documentation/atomic_ops.txtVineet Gupta
- arch_spin_lock/unlock were lacking the ACQUIRE/RELEASE barriers Since ARCv2 only provides load/load, store/store and all/all, we need the full barrier - LLOCK/SCOND based atomics, bitops, cmpxchg, which return modified values were lacking the explicit smp barriers. - Non LLOCK/SCOND varaints don't need the explicit barriers since that is implicity provided by the spin locks used to implement the critical section (the spin lock barriers in turn are also fixed in this commit as explained above Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25ARC: add compiler barrier to LLSC based cmpxchgVineet Gupta
When auditing cmpxchg call sites, Chuck noted that gcc was optimizing away some of the desired LDs. | do { | new = old = *ipi_data_ptr; | new |= 1U << msg; | } while (cmpxchg(ipi_data_ptr, old, new) != old); was generating to below | 8015cef8: ld r2,[r4,0] <-- First LD | 8015cefc: bset r1,r2,r1 | | 8015cf00: llock r3,[r4] <-- atomic op | 8015cf04: brne r3,r2,8015cf10 | 8015cf08: scond r1,[r4] | 8015cf0c: bnz 8015cf00 | | 8015cf10: brne r3,r2,8015cf00 <-- Branch doesn't go to orig LD Although this was fixed by adding a ACCESS_ONCE in this call site, it seems safer (for now at least) to add compiler barrier to LLSC based cmpxchg Reported-by: Chuck Jordan <cjordan@synopsys,com> Cc: <stable@vger.kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: SMP: intc: IDU 2nd level intc for dynamic IRQ distributionVineet Gupta
Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: SMP: clocksource: Enable Global Real Time counterVineet Gupta
Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: SMP: ARConnect debug/robustnessVineet Gupta
- Handle possible interrupt coalescing from MCIP - chk if prev IPI ack before sending new Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: SMP: Support ARConnect (MCIP) for Inter-Core-Interrupts et alVineet Gupta
Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARC: make plat_smp_ops weak to allow over-ridesVineet Gupta
This allows platforms to provide their own cpu wakeup routines as well as IPI send / clear backends, while allowing a SMP kernel w/o any such backend to build/boot Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: clocksource: Introduce 64bit local RTC counterVineet Gupta
Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: extable: Enable sorting at build timeVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: Adhere to Zero Delay loop restrictionVineet Gupta
Branch insn can't be scheduled as last insn of Zero Overhead loop Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: optimised string/mem lib routinesClaudiu Zissulescu
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: MMUv4: support aliasing icache configVineet Gupta
This is also default for AXS103 release Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: MMUv4: cache programming model changesVineet Gupta
Caveats about cache flush on ARCv2 based cores - dcache is PIPT so paddr is sufficient for cache maintenance ops (no need to setup PTAG reg - icache is still VIPT but only aliasing configs need PTAG setup So basically this is departure from MMU-v3 which always need vaddr in line ops registers (DC_IVDL, DC_FLDL, IC_IVIL) but paddr in DC_PTAG, IC_PTAG respectively. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: MMUv4: TLB programming Model changesVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: STAR 9000814690: Really Re-enable interrupts to avoid deadlocksVineet Gupta
The issue was, on HS when interrupt is taken, IRQ_ACT is set and that is NOT cleared unless we do RTIE (or manually clear it). Linux interrupt handling has top and bottom halves. Latter lead to softirqs (which can reschedule) AND expect interrupts to be REALLY re-enabled which was NOT happening for us since we only SETI, dont clear IRQ_ACT So we can have a state when both cores have taken interrupt (IRQ_ACT set), get rescheduled, both send IPI and wait in CSD lock which will never be cleared as cores can't take the pending IPI IRQ due to existing IRQ_ACT set. So local_irq_enable() now drops the IRQ_ACT.act bit to re-enable IRQs. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: STAR 9000808988: signals involving Delay SlotVineet Gupta
Reported by Anton as LTP:munmap01 failing with Illegal Instruction Exception. --------------------->8-------------------------------------- mmap2(NULL, 24576, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x200d2000 munmap(0x200d2000, 24576) = 0 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x200d2000} --- potentially unexpected fatal signal 4. Path: /munmap01 CPU: 0 PID: 61 Comm: munmap01 Not tainted 3.13.0-g5d5c46d9a556 #8 task: 9f1a8000 ti: 9f154000 task.ti: 9f154000 [ECR ]: 0x00020100 => Illegal Insn [EFA ]: 0x0001354c [BLINK ]: 0x200515d4 [ERET ]: 0x1354c @off 0x1354c in [/munmap01] VMA: 0x00010000 to 0x00018000 [STAT32]: 0x800802c0 ... --------------------->8-------------------------------------- The issue was 1. munmap01 accessed unmapped memory (on purpose) with signal handler installed for SIGSEGV 2. The faulting instruction happened to be in Delay Slot 00011864 <main>: 11908: bl.d 13284 <tst_resm> 1190c: stb r16,[r2] 3. kernel sets up the reg file for signal handler and correctly clears the DE bit in pt_regs->status32 placeholder 4. However RESTORE_CALLEE_SAVED_USER macro is not adjusted for ARCv2, and it over-writes the above with orig/stale value of status32 5. After RTIE, userspace signal handler executes a non branch instruction with DE bit set, triggering Illegal Instruction Exception. Reported-by: Anton Kolesov <akolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: STAR 9000793984: Handle return from intr to Delay SlotVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: Support for ARCv2 ISA and HS38x coresVineet Gupta
The notable features are: - SMP configurations of upto 4 cores with coherency - Optional L2 Cache and IO-Coherency - Revised Interrupt Architecture (multiple priorites, reg banks, auto stack switch, auto regfile save/restore) - MMUv4 (PIPT dcache, Huge Pages) - Instructions for * 64bit load/store: LDD, STD * Hardware assisted divide/remainder: DIV, REM * Function prologue/epilogue: ENTER_S, LEAVE_S * IRQ enable/disable: CLRI, SETI * pop count: FFS, FLS * SETcc, BMSKN, XBFU... Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: [intc] HS38 core interrupt controllerVineet Gupta
Cc: Jason Cooper <jason@lakedaemon.net> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARC: uncached base is hard constant for ARC, don't save itVineet Gupta
ioremap already uses the hard define, just make sure BCR value matches that Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: intc: split into ARCompact ISA specific, common bitsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: Make way for pt_regs != user_regs_structVineet Gupta
These have been register compatible so far. However ARCv2 mandates different pt_regs layout (due to h/w auto save). To keep pt_regs same for both, we start by removing the assumption - used mainly for block copies between the 2 structs in signal handling and ptrace Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: [arcompact] simplify SWITCH_TO_KERNEL_STKVineet Gupta
Previously this macro was overloaded with stack switching, saving SP at right slot in pt_regs, saving/setup of r25 and setting SP baseline to where pt_regs->sp is saved (vs. bottom of pt_regs) Now it only does SP switch, and leaves SP pointing to bottom of pt_regs. r25 saving is no longer done here to allow for future reordering of regfile in pt_regs w/o touching this macro Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: use single EXCEPTION_PROLOGUEVineet Gupta
Returning from pure kernel mode and exception mode use the same code anyways. Remove one the duplicate blocks Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: micro-optimize Trap handlerVineet Gupta
Elide the need to re-read ECR in Trap handler by ensuring that EXCEPTION_PROLOGUE does that at the very end just before returning to Trap handler ARCv2 EXCEPTION_PROLOGUE already did that, so same for ARcompact and the common trap handler adjusted to use cached ECR Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: move some code around for cache locality in return pathVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: split into ARCompact ISA specific, common bitsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: Ensure that restore_regs is local to compilation unitVineet Gupta
This fixes the possible link/relo errors, since restore_regs will be provided by ISA code, but called from ARC common code. The .L prefix reassures binutils that it will be in same compilation unit. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: comments cleanupVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: Trap handler to use r10 for syscall vs. brkpt decisionVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: FAKE_RET_FROM_EXCPN can always use r9Vineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: confine EXCEPTION_* macros to one fileVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: canonical'ize EXCEPTION_{PROLOGUE,EPILOGUE}Vineet Gupta
-EXCEPTION_EPILOGUE introduced -EXCEPTION_PROLOGUE now also includes reg file saving Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: Introduce INTERRUPT_{PROLOGUE,EPILOGUE}Vineet Gupta
-common'ize macros for level 1 and level 2 interrupts Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: entry.S: common'ize scrtach reg freeup in intr + exceptionsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: untangle cache flush loopVineet Gupta
- Remove the ifdef'ery and write distinct versions for each mmu ver even if there is some code duplication Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: cacheflush: No need to retain DC_CTRL from __before_dc_op()Vineet Gupta
That is because __after_dc_op() already reads it for status check, so it is better anyways to use that "newer" value. Also reduces the clutter in callers for passing from/to these routines. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: cacheflush: move some code around, delete old commentsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: mm/cache_arc700.c -> mm/cache.cVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: [axs101] Add missing __init annotationsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: [axs101] STAR 9000799830: Fix SD cards supportAlexey Brodkin
As DW Mobile Storage databook says it's required to use "Hold Register" if card is enumerated in SDR12 or SDR25 modes. It means we need to act in the same way as in Altera's Socfpga implementation - set "use hold reg" bit in commad. Note that for upstream proper solution would be to remove dw_mci_pltfm_prepare_command() at all and set the bit right in dw_mci_prepare_command() for all platforms. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: [axs101] Tweak DDR port aperture mappings for performanceVineet Gupta
Route all MB originated traffic to DDR Port 1 and keep Port 0 for CPU traffic only Basic system parameters -------------------------------------------------------------------------------------- Host OS Description Mhz tlb cache mem scal pages line par load bytes ----------------- ------------- --------------------------------------- ---- ----- ----- ------ ---- axs101-sd-2-new-f Linux 3.13.0+ axs101-sd-2-new-fw-old-img-rerun 739 8 32 1.1100 1 axs101-sd-3-arc-3 Linux 3.13.9+ axs101-sd-3-arc-3.13-tip-regression 735 8 32 1.1000 1 axs101-sd-9-diffe Linux 3.13.11 axs101-sd-9-different-tweak 740 8 32 1.0000 1 Processor, Processes - times in microseconds - smaller is better ------------------------------------------------------------------------------ Host OS Mhz null null open slct sig sig fork exec sh call I/O stat clos TCP inst hndl proc proc proc --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- axs101-sd Linux 3.13.0+ 739 0.50 0.88 5.38 14.6 34.1 0.92 5.18 2135 6555 12.K axs101-sd Linux 3.13.9+ 735 0.50 0.90 5.89 19.2 81.4 0.94 4.08 2560 8559 15.K axs101-sd Linux 3.13.11 740 0.50 0.88 4.45 17.8 34.4 0.94 3.25 2052 6493 12.K ^^^^ ^^^^ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: [axs101] support early 8250 uartVineet Gupta
Earlycon calculates UART clock as "BASE_BAUD * 16". In case of ARC "BASE_BAUD" is calculated dynamically in runtime, basically it is an alias to arc_early_base_baud(), which in turn just does "arc_base_baud/16". 8250 UART on AXS/SDP board uses 33.3MHz clock source which is set in "arc_base_baud" with this change. Additional compatibility string "snps,arc-sdp" is introduced as well because there're different flavours of AXS boards but they all share the same motherboard and so it's possible to re-use the same code for motherbord even if CPU daughterboard changes. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: [axs101] Add support for AXS101 SDP (software development platform)Alexey Brodkin
The AXS10x platforms consist of a mainboard with peripherals, on which several daughter cards can be placed. The daughter cards typically contain a CPU and memory. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: mm: document system mem map clearlyVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: stack unwinder to bail if PC is not kernel modeVineet Gupta
Currently, it doesn't invoke the callback but continues to unwind Also while at it - simplify the code a bit Signed-off-by: Vineet Gupta <vgupta@synopsys.com>