summaryrefslogtreecommitdiff
path: root/fs/btrfs/zlib.c
AgeCommit message (Collapse)Author
2017-11-01btrfs: allow to set compression level for zlibDavid Sterba
Preliminary support for setting compression level for zlib, the following works: $ mount -o compess=zlib # default $ mount -o compess=zlib0 # same $ mount -o compess=zlib9 # level 9, slower sync, less data $ mount -o compess=zlib1 # level 1, faster sync, more data $ mount -o remount,compress=zlib3 # level set by remount The compress-force works the same as compress'. The level is visible in the same format in /proc/mounts. Level set via file property does not work yet. Required patch: "btrfs: prepare for extensions in compression options" Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspaceDavid Sterba
The compression workspace buffers are larger than a page so we use vmalloc, unconditionally. This is not always necessary as there might be contiguous memory available. Let's use the kvmalloc helpers that will try kmalloc first and fallback to vmalloc. For that they require GFP_KERNEL flags. As we now have the alloc_workspace calls protected by memalloc_nofs in the critical contexts, we can safely use GFP_KERNEL. Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19btrfs: switch kmallocs to GFP_KERNEL in lzo/zlib alloc_workspaceDavid Sterba
As alloc_workspace is now protected by memalloc_nofs where needed, we can switch the kmalloc to use GFP_KERNEL. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19btrfs: reduce arguments for decompress_bio opsAnand Jain
struct compressed_bio pointer can be used instead. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: derive maximum output size in the compression implementationDavid Sterba
The value of max_out can be calculated from the parameters passed to the compressors, which is number of pages and the page size, and we don't have to needlessly pass it around. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: merge nr_pages input and output parameter in compress_pagesDavid Sterba
The parameter saying how many pages can be allocated at maximum can be merged with the output page counter, to save some stack space. The compression implementation will sink the parameter to a local variable so everything works as before. The nr_pages variables can also be simply merged in compress_file_range into one. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: merge length input and output parameter in compress_pagesDavid Sterba
The length parameter is basically duplicated for input and output in the top level caller of the compress_pages chain. We can simply use one variable for that and reduce stack consumption. The compression implementation will sink the parameter to a local variable so everything works as before. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30btrfs: use bio iterators for the decompression handlersChristoph Hellwig
Pass the full bio to the decompression routines and use bio iterators to iterate over the data in the bio. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30btrfs: Call kunmap if zlib_inflateInit2 failsNick Terrell
If zlib_inflateInit2 fails, the input page is never unmapped. Add a call to kunmap when it fails. Signed-off-by: Nick Terrell <nickrterrell@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: convert printk(KERN_* to use pr_* callsJeff Mahoney
This patch converts printk(KERN_* style messages to use the pr_* versions. One side effect is that anything that was KERN_DEBUG is now automatically a dynamic debug message. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-16btrfs: constify structs with op functions or static definitionsDavid Sterba
There are some op tables that can be easily made const, similarly the sysfs feature and raid tables. This is motivated by PaX CONSTIFY plugin. Signed-off-by: David Sterba <dsterba@suse.cz>
2014-11-30btrfs: zero out left over bytes after processing compression streamsChris Mason
Don Bailey noticed that our page zeroing for compression at end-io time isn't complete. This reworks a patch from Linus to push the zeroing into the zlib and lzo specific functions instead of trying to handle the corners inside btrfs_decompress_buf2page Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reported-by: Don A. Bailey <donb@securitymouse.com> cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-17btrfs compression: merge inflate and deflate z_streamsSergey Senozhatsky
`struct workspace' used for zlib compression contains two zlib z_stream-s: `def_strm' used in zlib_compress_pages(), and `inf_strm' used in zlib_decompress/zlib_decompress_biovec(). None of these functions use `inf_strm' and `def_strm' simultaniously, meaning that for every compress/decompress operation we need only one z_stream (out of two available). `inf_strm' and `def_strm' are different in size of ->workspace. For inflate stream we vmalloc() zlib_inflate_workspacesize() bytes, for deflate stream - zlib_deflate_workspacesize() bytes. On my system zlib returns the following workspace sizes, correspondingly: 42312 and 268104 (+ guard pages). Keep only one `z_stream' in `struct workspace' and use it for both compression and decompression. Hence, instead of vmalloc() of two z_stream->worskpace-s, allocate only one of size: max(zlib_deflate_workspacesize(), zlib_inflate_workspacesize()) Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-09-17btrfs: use DIV_ROUND_UP instead of open-coded variantsDavid Sterba
The form (value + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT is equivalent to (value + PAGE_CACHE_SIZE - 1) / PAGE_CACHE_SIZE The rest is a simple subsitution, no difference in the generated assembly code. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2014-07-03btrfs: use E2BIG instead of EIO if compression does not helpDavid Sterba
Return codes got updated in 60e1975acb48fc3d74a3422b21dde74c977ac3d5 (btrfs: return errno instead of -1 from compression) lzo wrapper returns E2BIG in this case, do the same for zlib. Signed-off-by: David Sterba <dsterba@suse.cz>
2014-06-09btrfs: return errno instead of -1 from compressionZach Brown
The compression layer seems to have been built to return -1 and have callers make up errors that make sense. This isn't great because there are different errors that originate down in the compression layer. Let's return real negative errnos from the compression layer so that callers can pass on the error without having to guess what happened. ENOMEM for allocation failure, E2BIG when compression exceeds the uncompressed input, and EIO for everything else. This helps a future path return errors from btrfs_decompress(). Signed-off-by: Zach Brown <zab@redhat.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28Btrfs: convert printk to btrfs_ and fix BTRFS prefixFrank Holton
Convert all applicable cases of printk and pr_* to the btrfs_* macros. Fix all uses of the BTRFS prefix. Signed-off-by: Frank Holton <fholton@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2012-10-09btrfs: fix message printingDaniel J Blueman
Fix various messages to include newline and module prefix. Signed-off-by: Daniel J Blueman <daniel@quora.org>
2012-03-20btrfs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2011-03-22zlib: slim down zlib_deflate() workspace when possibleJim Keniston
Instead of always creating a huge (268K) deflate_workspace with the maximum compression parameters (windowBits=15, memLevel=8), allow the caller to obtain a smaller workspace by specifying smaller parameter values. For example, when capturing oops and panic reports to a medium with limited capacity, such as NVRAM, compression may be the only way to capture the whole report. In this case, a small workspace (24K works fine) is a win, whether you allocate the workspace when you need it (i.e., during an oops or panic) or at boot time. I've verified that this patch works with all accepted values of windowBits (positive and negative), memLevel, and compression level. Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David Miller <davem@davemloft.net> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-22btrfs: Extract duplicate decompress codeLi Zefan
Add a common function to copy decompressed data from working buffer to bio pages. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
2010-12-22btrfs: Allow to add new compression algorithmLi Zefan
Make the code aware of compression type, instead of always assuming zlib compression. Also make the zlib workspace function as common code for all compression types. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
2010-12-22btrfs: Fix error handling in zlibLi Zefan
Return failure if alloc_page() fails to allocate memory, and the upper code will just give up compression. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
2010-12-22btrfs: Fix bugs in zlib workspaceLi Zefan
- Fix a race that can result in alloc_workspace > cpus. - Fix to check num_workspace after wakeup. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
2010-10-29Btrfs: cleanup warnings from gcc 4.6 (nonbugs)Andi Kleen
These are all the cases where a variable is set, but not read which are not bugs as far as I can see, but simply leftovers. Still needs more review. Found by gcc 4.6's new warnings Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-08-07Btrfs: correct error-handling zlib error handlingJulia Lawall
find_zlib_workspace returns an ERR_PTR value in an error case instead of NULL. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @match exists@ expression x, E; statement S1, S2; @@ x = find_zlib_workspace(...) ... when != x = E ( * if (x == NULL || ...) S1 else S2 | * if (x == NULL && ...) S1 else S2 ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-01-05Btrfs: Fix checkpatch.pl warningsChris Mason
There were many, most are fixed now. struct-funcs.c generates some warnings but these are bogus. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-12-02Btrfs: make things static and include the right headersChristoph Hellwig
Shut up various sparse warnings about symbols that should be either static or have their declarations in scope. Signed-off-by: Christoph Hellwig <hch@lst.de>
2008-11-11Btrfs: Fix compile warnings on 32 bit machinesChris Mason
Simple casting here and there to fix things up. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-11-06Btrfs: Optimize compressed writeback and readsChris Mason
When reading compressed extents, try to put pages into the page cache for any pages covered by the compressed extent that readpages didn't already preload. Add an async work queue to handle transformations at delayed allocation processing time. Right now this is just compression. The workflow is: 1) Find offsets in the file marked for delayed allocation 2) Lock the pages 3) Lock the state bits 4) Call the async delalloc code The async delalloc code clears the state lock bits and delalloc bits. It is important this happens before the range goes into the work queue because otherwise it might deadlock with other work queue items that try to lock those extent bits. The file pages are compressed, and if the compression doesn't work the pages are written back directly. An ordered work queue is used to make sure the inodes are written in the same order that pdflush or writepages sent them down. This changes extent_write_cache_pages to let the writepage function update the wbc nr_written count. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-10-29Btrfs: Add zlib compression supportChris Mason
This is a large change for adding compression on reading and writing, both for inline and regular extents. It does some fairly large surgery to the writeback paths. Compression is off by default and enabled by mount -o compress. Even when the -o compress mount option is not used, it is possible to read compressed extents off the disk. If compression for a given set of pages fails to make them smaller, the file is flagged to avoid future compression attempts later. * While finding delalloc extents, the pages are locked before being sent down to the delalloc handler. This allows the delalloc handler to do complex things such as cleaning the pages, marking them writeback and starting IO on their behalf. * Inline extents are inserted at delalloc time now. This allows us to compress the data before inserting the inline extent, and it allows us to insert an inline extent that spans multiple pages. * All of the in-memory extent representations (extent_map.c, ordered-data.c etc) are changed to record both an in-memory size and an on disk size, as well as a flag for compression. From a disk format point of view, the extent pointers in the file are changed to record the on disk size of a given extent and some encoding flags. Space in the disk format is allocated for compression encoding, as well as encryption and a generic 'other' field. Neither the encryption or the 'other' field are currently used. In order to limit the amount of data read for a single random read in the file, the size of a compressed extent is limited to 128k. This is a software only limit, the disk format supports u64 sized compressed extents. In order to limit the ram consumed while processing extents, the uncompressed size of a compressed extent is limited to 256k. This is a software only limit and will be subject to tuning later. Checksumming is still done on compressed extents, and it is done on the uncompressed version of the data. This way additional encodings can be layered on without having to figure out which encoding to checksum. Compression happens at delalloc time, which is basically singled threaded because it is usually done by a single pdflush thread. This makes it tricky to spread the compression load across all the cpus on the box. We'll have to look at parallel pdflush walks of dirty inodes at a later time. Decompression is hooked into readpages and it does spread across CPUs nicely. Signed-off-by: Chris Mason <chris.mason@oracle.com>