summaryrefslogtreecommitdiff
path: root/arch/arc/mm/cache.c
AgeCommit message (Collapse)Author
2017-05-02ARCv2: mm: Merge 2 updates to DC_CTRL for region flushVineet Gupta
Region Flush has a weird programming model. 1. Flush or Invalidate is selected by DC_CTRL.RGN_OP 2 Flush-n-Invalidate is done by DC_CTRL.IM Given the code structuring before, case #2 above was generating two seperate updates to DC_CTRL which was pointless. | 80a342b0 <__dma_cache_wback_inv_l1>: | 80a342b0: clri r4 | 80a342b4: lr r2,[dc_ctrl] | 80a342b8: bset_s r2,r2,0x6 | 80a342ba: sr r2,[dc_ctrl] <-- FIRST | | 80a342be: bmskn r3,r0,0x5 | | 80a342c2: lr r2,[dc_ctrl] | 80a342c6: and r2,r2,0xfffff1ff | 80a342ce: bset_s r2,r2,0x9 | 80a342d0: sr r2,[dc_ctrl] <-- SECOND | | 80a342d4: add_s r1,r1,0x3f | 80a342d6: bmsk_s r0,r0,0x5 | 80a342d8: add_s r0,r0,r1 | 80a342da: add_s r0,r0,r3 | 80a342dc: sr r0,[78] | 80a342e0: sr r3,[77] |... |... So move setting of DC_CTRL.RGN_OP into __before_dc_op() and combine with any other update. | 80b63324 <__dma_cache_wback_inv_l1>: | 80b63324: clri r3 | 80b63328: lr r2,[dc_ctrl] | 80b6332c: and r2,r2,0xfffff1ff | 80b63334: or r2,r2,576 | 80b63338: sr r2,[dc_ctrl] | | 80b6333c: add_s r1,r1,0x3f | 80b6333e: bmskn r2,r0,0x5 | 80b63342: add_s r0,r0,r1 | 80b63344: sr r0,[78] | 80b63348: sr r2,[77] Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-05-02ARCv2: mm: Implement cache region flush operationsVineet Gupta
These are more efficient than the per-line ops Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-05-02ARC: mm: Move full_page computation into cache version agnostic wrapperVineet Gupta
This reduces code duplication in each of cache version specific handlers Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-03-30ARCv2: SLC: Make sure busy bit is set properly on SLC flushingAlexey Brodkin
As reported in STAR 9001165532, an SLC control reg read (for checking busy state) right after SLC invalidate command may incorrectly return NOT busy causing software to NOT spin-wait while operation is underway. (and for some reason this only happens if L1 cache is also disabled - as required by IOC programming model) Suggested workaround is to do an additional Control Reg read, which ensures the 2nd read gets the right status. Cc: stable@vger.kernel.org #4.10 Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> [vgupta: reworte changelog a bit] Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18ARC: Revert "ARC: mm: IOC: Don't enable IOC by default"Vineet Gupta
The programming model has been fixed with prev patches so re-enable it by default This reverts commit 23cb1f644019bac49d87b4dd7c1eac0569cc4f53. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18ARC: mm: split arc_cache_init to allow __init reaping of bulkVineet Gupta
arc_cache_init() is called for each core so can't be tagged __init. However bulk of it is only executed by master core and thus is candidate for __init reaping. So split it up to allow that. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18ARCv2: IOC: Use actual memory size to setup aperture sizeVineet Gupta
vs. fixed 512M before. But this still assumes that all of memory is under IOC which may not be true for the SoC. Improve that later when this becomes a real issue, by specifying this from DT. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18ARCv2: IOC: Adhere to progamming model guidelines to avoid DMA corruptionVineet Gupta
On AXS103 release bitfiles, DMA data corruptions were seen because IOC setup was not following the recommended way in documentation. Flipping IOC on when caches are enabled or coherency transactions are in flight, might cause some of the memory operations to not observe coherency as expected. So strictly follow the programming model recommendations as documented in comment header above arc_ioc_setup() Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18ARCv2: IOC: refactor the IOC and SLC operations into own functionsVineet Gupta
- Move IOC setup into arc_ioc_setup() - Move SLC disabling into arc_slc_disable() Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-04ARC: mmu: clarify the MMUv3 programming modelVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-12-19ARC: mm: arc700: Don't assume 2 colours for aliasing VIPT dcacheVineet Gupta
An ARC700 customer reported linux boot crashes when upgrading to bigger L1 dcache (64K from 32K). Turns out they had an aliasing VIPT config and current code only assumed 2 colours, while theirs had 4. So default to 4 colours and complain if there are fewer. Ideally this needs to be a Kconfig option, but heck that's too much of hassle for a single user. Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-12-19ARC: mm: No need to save cache version in @cpuinfoVineet Gupta
Historical MMU revisions have been paired with Cache revision updates which are captured in MMU and Cache Build Configuration Registers respectively. This was used in boot code to check for configurations mismatches, speically in simulations (such as running with non existent caches, non pairing MMU and Cache version etc). This can instead be inferred from other cache params such as line size. So remove @ver from post processed @cpuinfo which could be used later to save soem other interesting info. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-11-28ARC: mm: IOC: Don't enable IOC by defaultVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-10-28ARCv2: boot log: print IOC exists as well as enabled statusVineet Gupta
Previously we would not print the case when IOC existed but was not enabled. And while at it, reduce one line off boot printing by consolidating the Peripheral address space and IO-Coherency which in a way applies to them Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-10-24ARCv2: IOC: use @ioc_enable not @ioc_exist where intendedVineet Gupta
if user disables IOC from debugger at startup (by clearing @ioc_enable), @ioc_exists is cleared too. This means boot prints don't capture the fact that IOC was present but disabled which could be misleading. So invert how we use @ioc_enable and @ioc_exists and make it more canonical. @ioc_exists represent whether hardware is present or not and stays same whether enabled or not. @ioc_enable is still user driven, but will be auto-disabled if IOC hardware is not present, i.e. if @ioc_exist=0. This is opposite to what we were doing before, but much clearer. This means @ioc_enable is now the "exported" toggle in rest of code such as dma mapping API. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30ARCv2: Support dynamic peripheral address space in HS38 rel 3.0 coresVineet Gupta
HS release 3.0 provides for even more flexibility in specifying the volatile address space for mapping peripherals. With HS 2.1 @start was made flexible / programmable - with HS 3.0 even @end can be setup (vs. fixed to 0xFFFF_FFFF before). So add code to reflect that and while at it remove an unused struct defintion Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-08-10ARC: Elide redundant setup of DMA callbacksVineet Gupta
For resources shared by all cores such as SLC and IOC, only the master core needs to do any setups / enabling / disabling etc. Cc: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-05-30Fix typosAndrea Gelmini
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-19ARCv2: ioremap: Support dynamic peripheral address spaceVineet Gupta
The peripheral address space is architectural address window which is uncached and typically used to wire up peripherals. For ARC700 cores (ARCompact ISA based) this was fixed to 1GB region 0xC000_0000 - 0xFFFF_FFFF. For ARCv2 based HS38 cores the start address is flexible and can be 0xC, 0xD, 0xE, 0xF 000_000 by programming AUX_NON_VOLATILE_LIMIT reg (typically done in bootloader) Further in cas of PAE, the physical address can extend beyond 4GB so need to confine this check, otherwise all pages beyond 4GB will be treated as uncached Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19ARC: dma: ioremap: use phys_addr_t consistenctly in code pathsVineet Gupta
To support dma in physical memory beyond 4GB with PAE40 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-11ARC: Fix misspellings in comments.Adam Buchbinder
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-01-15mm: differentiate page_mapped() from page_mapcount() for compound pagesKirill A. Shutemov
Let's define page_mapped() to be true for compound pages if any sub-pages of the compound page is mapped (with PMD or PTE). On other hand page_mapcount() return mapcount for this particular small page. This will make cases like page_get_anon_vma() behave correctly once we allow huge pages to be mapped with PTE. Most users outside core-mm should use page_mapcount() instead of page_mapped(). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-29ARC: mm: PAE40 supportVineet Gupta
This is the first working implementation of 40-bit physical address extension on ARCv2. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: PAE40: switch to using phys_addr_t for physical addressesVineet Gupta
That way a single flip of phys_addr_t to 64 bit ensures all places dealing with physical addresses get correct data Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28ARC: mm: preps ahead of HIGHMEM supportVineet Gupta
Before we plug in highmem support, some of code needs to be ready for it - copy_user_highpage() needs to be using the kmap_atomic API - mk_pte() can't assume page_address() - do_page_fault() can't assume VMALLOC_END is end of kernel vaddr space Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17ARC: boot log: move helper macros to header for reuseVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-21ARC: Eliminate some ARCv2 specific code for ARCompact buildVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20ARCv2: IOC: Allow boot time disableAlexey Brodkin
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20ARCv2: SLC: Allow boot time disableVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20ARCv2: Support IO Coherency and permutations involving L1 and L2 cachesAlexey Brodkin
In case of ARCv2 CPU there're could be following configurations that affect cache handling for data exchanged with peripherals via DMA: [1] Only L1 cache exists [2] Both L1 and L2 exist, but no IO coherency unit [3] L1, L2 caches and IO coherency unit exist Current implementation takes care of [1] and [2]. Moreover support of [2] is implemented with run-time check for SLC existence which is not super optimal. This patch introduces support of [3] and rework of DMA ops usage. Instead of doing run-time check every time a particular DMA op is executed we'll have 3 different implementations of DMA ops and select appropriate one during init. As for IOC support for it we need: [a] Implement empty DMA ops because IOC takes care of cache coherency with DMAed data [b] Route dma_alloc_coherent() via dma_alloc_noncoherent() This is required to make IOC work in first place and also serves as optimization as LD/ST to coherent buffers can be srviced from caches w/o going all the way to memory Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> [vgupta: -Added some comments about IOC gains -Marked dma ops as static, -Massaged changelog a bit] Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-06ARCv2: guard SLC DMA ops with spinlockAlexey Brodkin
SLC maintenance ops need to be serialized by software as there is no inherent buffering / quequing of aux commands. It can silently ignore a new aux operation if previous one is still ongoing (SLC_CTRL_BUSY) So gaurd the SLC op using a spin lock The spin lock doesn't seem to be contended even in heavy workloads such as iperf. On FPGA @ 75 MHz. [1] Before this change: ============================================================ # iperf -c 10.42.0.1 ------------------------------------------------------------ Client connecting to 10.42.0.1, TCP port 5001 TCP window size: 43.8 KByte (default) ------------------------------------------------------------ [ 3] local 10.42.0.110 port 38935 connected with 10.42.0.1 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0-10.0 sec 48.4 MBytes 40.6 Mbits/sec ============================================================ [2] After this change: ============================================================ # iperf -c 10.42.0.1 ------------------------------------------------------------ Client connecting to 10.42.0.1, TCP port 5001 TCP window size: 43.8 KByte (default) ------------------------------------------------------------ [ 3] local 10.42.0.243 port 60248 connected with 10.42.0.1 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0-10.0 sec 47.5 MBytes 39.8 Mbits/sec # iperf -c 10.42.0.1 ------------------------------------------------------------ Client connecting to 10.42.0.1, TCP port 5001 TCP window size: 43.8 KByte (default) ------------------------------------------------------------ [ 3] local 10.42.0.243 port 60249 connected with 10.42.0.1 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0-10.0 sec 54.9 MBytes 46.0 Mbits/sec ============================================================ Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: arc-linux-dev@synopsys.com Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25ARCv2: SLC: Handle explcit flush for DMA ops (w/o IO-coherency)Vineet Gupta
L2 cache on ARCHS processors is called SLC (System Level Cache) For working DMA (in absence of hardware assisted IO Coherency) we need to manage SLC explicitly when buffers transition between cpu and controllers. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: MMUv4: support aliasing icache configVineet Gupta
This is also default for AXS103 release Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22ARCv2: MMUv4: cache programming model changesVineet Gupta
Caveats about cache flush on ARCv2 based cores - dcache is PIPT so paddr is sufficient for cache maintenance ops (no need to setup PTAG reg - icache is still VIPT but only aliasing configs need PTAG setup So basically this is departure from MMU-v3 which always need vaddr in line ops registers (DC_IVDL, DC_FLDL, IC_IVIL) but paddr in DC_PTAG, IC_PTAG respectively. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: untangle cache flush loopVineet Gupta
- Remove the ifdef'ery and write distinct versions for each mmu ver even if there is some code duplication Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: cacheflush: No need to retain DC_CTRL from __before_dc_op()Vineet Gupta
That is because __after_dc_op() already reads it for status check, so it is better anyways to use that "newer" value. Also reduces the clutter in callers for passing from/to these routines. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: cacheflush: move some code around, delete old commentsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: mm/cache_arc700.c -> mm/cache.cVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>