summaryrefslogtreecommitdiff
path: root/fs/ext4
AgeCommit message (Collapse)Author
2016-05-17Merge branch 'work.preadv2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs cleanups from Al Viro: "More cleanups from Christoph" * 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: nfsd: use RWF_SYNC fs: add RWF_DSYNC aand RWF_SYNC ceph: use generic_write_sync fs: simplify the generic_write_sync prototype fs: add IOCB_SYNC and IOCB_DSYNC direct-io: remove the offset argument to dio_complete direct-io: eliminate the offset argument to ->direct_IO xfs: eliminate the pos variable in xfs_file_dio_aio_write filemap: remove the pos argument to generic_file_direct_write filemap: remove pos variables in generic_file_read_iter
2016-05-17Merge branch 'ovl-fixes' into for-linusAl Viro
Backmerge to resolve a conflict in ovl_lookup_real(); "ovl_lookup_real(): use lookup_one_len_unlocked()" instead, but it was too late in the cycle to rebase.
2016-05-12ext4: switch to ->iterate_shared()Al Viro
Note that we need relax_dir() equivalent for directories locked shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02Merge getxattr prototype change into work.lookupsAl Viro
The rest of work.xattr stuff isn't needed for this branch
2016-05-01fs: simplify the generic_write_sync prototypeChristoph Hellwig
The kiocb already has the new position, so use that. The only interesting case is AIO, where we currently don't bother updating ki_pos. We're about to free the kiocb after we're done, so we might as well update it to make everyone's life simpler. While we're at it also return the bytes written argument passed in if we were successful so that the boilerplate error switch code in the callers can go away. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-01fs: add IOCB_SYNC and IOCB_DSYNCChristoph Hellwig
This will allow us to do per-I/O sync file writes, as required by a lot of fileservers or storage targets. XXX: Will need a few additional audits for O_DSYNC Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-01direct-io: eliminate the offset argument to ->direct_IOChristoph Hellwig
Including blkdev_direct_IO and dax_do_io. It has to be ki_pos to actually work, so eliminate the superflous argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-12ext4/fscrypto: avoid RCU lookup in d_revalidateJaegeuk Kim
As Al pointed, d_revalidate should return RCU lookup before using d_inode. This was originally introduced by: commit 34286d666230 ("fs: rcu-walk aware d_revalidate method"). Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: stable <stable@vger.kernel.org>
2016-04-10xattr_handler: pass dentry and inode as separate arguments of ->get()Al Viro
... and do not assume they are already attached to each other Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-10Revert "ext4: allow readdir()'s of large empty directories to be interrupted"Linus Torvalds
This reverts commit 1028b55bafb7611dda1d8fed2aeca16a436b7dff. It's broken: it makes ext4 return an error at an invalid point, causing the readdir wrappers to write the the position of the last successful directory entry into the position field, which means that the next readdir will now return that last successful entry _again_. You can only return fatal errors (that terminate the readdir directory walk) from within the filesystem readdir functions, the "normal" errors (that happen when the readdir buffer fills up, for example) happen in the iterorator where we know the position of the actual failing entry. I do have a very different patch that does the "signal_pending()" handling inside the iterator function where it is allowable, but while that one passes all the sanity checks, I screwed up something like four times while emailing it out, so I'm not going to commit it today. So my track record is not good enough, and the stars will have to align better before that one gets committed. And it would be good to get some review too, of course, since celestial alignments are always an iffy debugging model. IOW, let's just revert the commit that caused the problem for now. Reported-by: Greg Thelen <gthelen@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-10don't bother with ->d_inode->i_sb - it's always equal to ->d_sbAl Viro
... and neither can ever be NULL Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-07Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "These changes contains a fix for overlayfs interacting with some (badly behaved) dentry code in various file systems. These have been reviewed by Al and the respective file system mtinainers and are going through the ext4 tree for convenience. This also has a few ext4 encryption bug fixes that were discovered in Android testing (yes, we will need to get these sync'ed up with the fs/crypto code; I'll take care of that). It also has some bug fixes and a change to ignore the legacy quota options to allow for xfstests regression testing of ext4's internal quota feature and to be more consistent with how xfs handles this case" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: ignore quota mount options if the quota feature is enabled ext4 crypto: fix some error handling ext4: avoid calling dquot_get_next_id() if quota is not enabled ext4: retry block allocation for failed DIO and DAX writes ext4: add lockdep annotations for i_data_sem ext4: allow readdir()'s of large empty directories to be interrupted btrfs: fix crash/invalid memory access on fsync when using overlayfs ext4 crypto: use dget_parent() in ext4_d_revalidate() ext4: use file_dentry() ext4: use dget_parent() in ext4_file_open() nfs: use file_dentry() fs: add file_dentry() ext4 crypto: don't let data integrity writebacks fail with ENOMEM ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()
2016-04-04mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usageKirill A. Shutemov
Mostly direct substitution with occasional adjustment or removing outdated comments. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-03ext4: ignore quota mount options if the quota feature is enabledTheodore Ts'o
Previously, ext4 would fail the mount if the file system had the quota feature enabled and quota mount options (used for the older quota setups) were present. This broke xfstests, since xfs silently ignores the usrquote and grpquota mount options if they are specified. This commit changes things so that we are consistent with xfs; having the mount options specified is harmless, so no sense break users by forbidding them. Cc: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-04-02ext4 crypto: fix some error handlingDan Carpenter
We should be testing for -ENOMEM but the minus sign is missing. Fixes: c9af28fdd449 ('ext4 crypto: don't let data integrity writebacks fail with ENOMEM') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-04-01ext4: avoid calling dquot_get_next_id() if quota is not enabledTheodore Ts'o
This should be fixed in the quota layer so we can test with the quota mutex held, but for now, we need this to avoid tests from crashing the kernel aborting the regression test suite. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-04-01ext4: retry block allocation for failed DIO and DAX writesJan Kara
Currently if block allocation for DIO or DAX write fails due to ENOSPC, we just returned it to userspace. However these ENOSPC errors can be transient because the transaction freeing blocks has not yet committed. This demonstrates as failures of generic/102 test when the filesystem is mounted with 'dax' mount option. Fix the problem by properly retrying the allocation in case of ENOSPC error in get blocks functions used for direct IO. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
2016-04-01ext4: add lockdep annotations for i_data_semTheodore Ts'o
With the internal Quota feature, mke2fs creates empty quota inodes and quota usage tracking is enabled as soon as the file system is mounted. Since quotacheck is no longer preallocating all of the blocks in the quota inode that are likely needed to be written to, we are now seeing a lockdep false positive caused by needing to allocate a quota block from inside ext4_map_blocks(), while holding i_data_sem for a data inode. This results in this complaint: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->i_data_sem); lock(&s->s_dquot.dqio_mutex); lock(&ei->i_data_sem); lock(&s->s_dquot.dqio_mutex); Google-Bug-Id: 27907753 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2016-03-31posix_acl: Inode acl caching fixesAndreas Gruenbacher
When get_acl() is called for an inode whose ACL is not cached yet, the get_acl inode operation is called to fetch the ACL from the filesystem. The inode operation is responsible for updating the cached acl with set_cached_acl(). This is done without locking at the VFS level, so another task can call set_cached_acl() or forget_cached_acl() before the get_acl inode operation gets to calling set_cached_acl(), and then get_acl's call to set_cached_acl() results in caching an outdate ACL. Prevent this from happening by setting the cached ACL pointer to a task-specific sentinel value before calling the get_acl inode operation. Move the responsibility for updating the cached ACL from the get_acl inode operations to get_acl(). There, only set the cached ACL if the sentinel value hasn't changed. The sentinel values are chosen to have odd values. Likewise, the value of ACL_NOT_CACHED is odd. In contrast, ACL object pointers always have an even value (ACLs are aligned in memory). This allows to distinguish uncached ACLs values from ACL objects. In addition, switch from guarding inode->i_acl and inode->i_default_acl upates by the inode->i_lock spinlock to using xchg() and cmpxchg(). Filesystems that do not want ACLs returned from their get_acl inode operations to be cached must call forget_cached_acl() to prevent the VFS from doing so. (Patch written by Al Viro and Andreas Gruenbacher.) Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-30ext4: allow readdir()'s of large empty directories to be interruptedTheodore Ts'o
If a directory has a large number of empty blocks, iterating over all of them can take a long time, leading to scheduler warnings and users getting irritated when they can't kill a process in the middle of one of these long-running readdir operations. Fix this by adding checks to ext4_readdir() and ext4_htree_fill_tree(). Reported-by: Benjamin LaHaise <bcrl@kvack.org> Google-Bug-Id: 27880676 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-26ext4 crypto: use dget_parent() in ext4_d_revalidate()Theodore Ts'o
This avoids potential problems caused by a race where the inode gets renamed out from its parent directory and the parent directory is deleted while ext4_d_revalidate() is running. Fixes: 28b4c263961c Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2016-03-26ext4: use file_dentry()Miklos Szeredi
EXT4 may be used as lower layer of overlayfs and accessing f_path.dentry can lead to a crash. Fix by replacing direct access of file->f_path.dentry with the file_dentry() accessor, which will always return a native object. Reported-by: Daniel Axtens <dja@axtens.net> Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Fixes: ff978b09f973 ("ext4 crypto: move context consistency check to ext4_file_open()") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> # v4.5
2016-03-26ext4: use dget_parent() in ext4_file_open()Miklos Szeredi
In f_op->open() lock on parent is not held, so there's no guarantee that parent dentry won't go away at any time. Even after this patch there's no guarantee that 'dir' will stay the parent of 'inode', but at least it won't be freed while being used. Fixes: ff978b09f973 ("ext4 crypto: move context consistency check to ext4_file_open()") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: <stable@vger.kernel.org> # v4.5
2016-03-26ext4 crypto: don't let data integrity writebacks fail with ENOMEMTheodore Ts'o
We don't want the writeback triggered from the journal commit (in data=writeback mode) to cause the journal to abort due to generic_writepages() returning an ENOMEM error. In addition, if fsync() fails with ENOMEM, most applications will probably not do the right thing. So if we are doing a data integrity sync, and ext4_encrypt() returns ENOMEM, we will submit any queued I/O to date, and then retry the allocation using GFP_NOFAIL. Google-Bug-Id: 27641567 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-22ext4: in ext4_dir_llseek, check syscall bitness directlyAndy Lutomirski
ext4 treats directory offsets differently for 32-bit and 64-bit callers. Check the caller type using in_compat_syscall, not is_compat_task. This changes behavior on SPARC slightly. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()Theodore Ts'o
We aren't checking to see if the in-inode extended attribute is corrupted before we try to expand the inode's extra isize fields. This can lead to potential crashes caused by the BUG_ON() check in ext4_xattr_shift_entries(). Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-21Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF and quota updates from Jan Kara: "This contains a rewrite of UDF handling of filename encoding to fix remaining overflow issues from Andrew Gabbasov and quota changes to support new Q_[X]GETNEXTQUOTA quotactl for VFS quota formats" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Fix possible GPF due to uninitialised pointers ext4: Make Q_GETNEXTQUOTA work for quota in hidden inodes quota: Forbid Q_GETQUOTA and Q_GETNEXTQUOTA for frozen filesystem quota: Fix possible races during quota loading ocfs2: Implement get_next_id() quota_v2: Implement get_next_id() for V2 quota format quota: Add support for ->get_nextdqblk() for VFS quota udf: Merge linux specific translation into CS0 conversion function udf: Remove struct ustr as non-needed intermediate storage udf: Use separate buffer for copying split names udf: Adjust UDF_NAME_LEN to better reflect actual restrictions udf: Join functions for UTF8 and NLS conversions udf: Parameterize output length in udf_put_filename quota: Allow Q_GETQUOTA for frozen filesystem quota: Fixup comments about return value of Q_[X]GETNEXTQUOTA
2016-03-21Merge tag 'xfs-for-linus-4.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs Pull xfs updates from Dave Chinner: "There's quite a lot in this request, and there's some cross-over with ext4, dax and quota code due to the nature of the changes being made. As for the rest of the XFS changes, there are lots of little things all over the place, which add up to a lot of changes in the end. The major changes are that we've reduced the size of the struct xfs_inode by ~100 bytes (gives an inode cache footprint reduction of >10%), the writepage code now only does a single set of mapping tree lockups so uses less CPU, delayed allocation reservations won't overrun under random write loads anymore, and we added compile time verification for on-disk structure sizes so we find out when a commit or platform/compiler change breaks the on disk structure as early as possible. Change summary: - error propagation for direct IO failures fixes for both XFS and ext4 - new quota interfaces and XFS implementation for iterating all the quota IDs in the filesystem - locking fixes for real-time device extent allocation - reduction of duplicate information in the xfs and vfs inode, saving roughly 100 bytes of memory per cached inode. - buffer flag cleanup - rework of the writepage code to use the generic write clustering mechanisms - several fixes for inode flag based DAX enablement - rework of remount option parsing - compile time verification of on-disk format structure sizes - delayed allocation reservation overrun fixes - lots of little error handling fixes - small memory leak fixes - enable xfsaild freezing again" * tag 'xfs-for-linus-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (66 commits) xfs: always set rvalp in xfs_dir2_node_trim_free xfs: ensure committed is initialized in xfs_trans_roll xfs: borrow indirect blocks from freed extent when available xfs: refactor delalloc indlen reservation split into helper xfs: update freeblocks counter after extent deletion xfs: debug mode forced buffered write failure xfs: remove impossible condition xfs: check sizes of XFS on-disk structures at compile time xfs: ioends require logically contiguous file offsets xfs: use named array initializers for log item dumping xfs: fix computation of inode btree maxlevels xfs: reinitialise per-AG structures if geometry changes during recovery xfs: remove xfs_trans_get_block_res xfs: fix up inode32/64 (re)mount handling xfs: fix format specifier , should be %llx and not %llu xfs: sanitize remount options xfs: convert mount option parsing to tokens xfs: fix two memory leaks in xfs_attr_list.c error paths xfs: XFS_DIFLAG2_DAX limited by PAGE_SIZE xfs: dynamically switch modes when XFS_DIFLAG2_DAX is set/cleared ...
2016-03-17Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Performance improvements in SEEK_DATA and xattr scalability improvements, plus a lot of clean ups and bug fixes" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (38 commits) ext4: clean up error handling in the MMP support jbd2: do not fail journal because of frozen_buffer allocation failure ext4: use __GFP_NOFAIL in ext4_free_blocks() ext4: fix compile error while opening the macro DOUBLE_CHECK ext4: print ext4 mount option data_err=abort correctly ext4: fix NULL pointer dereference in ext4_mark_inode_dirty() ext4: drop unneeded BUFFER_TRACE in ext4_delete_inline_entry() ext4: fix misspellings in comments. jbd2: fix FS corruption possibility in jbd2_journal_destroy() on umount path ext4: more efficient SEEK_DATA implementation ext4: cleanup handling of bh->b_state in DAX mmap ext4: return hole from ext4_map_blocks() ext4: factor out determining of hole size ext4: fix setting of referenced bit in ext4_es_lookup_extent() ext4: remove i_ioend_count ext4: simplify io_end handling for AIO DIO ext4: move trans handling and completion deferal out of _ext4_get_block ext4: rename and split get blocks functions ext4: use i_mutex to serialize unaligned AIO DIO ext4: pack ioend structure better ...
2016-03-17Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "Here is the crypto update for 4.6: API: - Convert remaining crypto_hash users to shash or ahash, also convert blkcipher/ablkcipher users to skcipher. - Remove crypto_hash interface. - Remove crypto_pcomp interface. - Add crypto engine for async cipher drivers. - Add akcipher documentation. - Add skcipher documentation. Algorithms: - Rename crypto/crc32 to avoid name clash with lib/crc32. - Fix bug in keywrap where we zero the wrong pointer. Drivers: - Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver. - Add PIC32 hwrng driver. - Support BCM6368 in bcm63xx hwrng driver. - Pack structs for 32-bit compat users in qat. - Use crypto engine in omap-aes. - Add support for sama5d2x SoCs in atmel-sha. - Make atmel-sha available again. - Make sahara hashing available again. - Make ccp hashing available again. - Make sha1-mb available again. - Add support for multiple devices in ccp. - Improve DMA performance in caam. - Add hashing support to rockchip" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) crypto: qat - remove redundant arbiter configuration crypto: ux500 - fix checks of error code returned by devm_ioremap_resource() crypto: atmel - fix checks of error code returned by devm_ioremap_resource() crypto: qat - Change the definition of icp_qat_uof_regtype hwrng: exynos - use __maybe_unused to hide pm functions crypto: ccp - Add abstraction for device-specific calls crypto: ccp - CCP versioning support crypto: ccp - Support for multiple CCPs crypto: ccp - Remove check for x86 family and model crypto: ccp - memset request context to zero during import lib/mpi: use "static inline" instead of "extern inline" lib/mpi: avoid assembler warning hwrng: bcm63xx - fix non device tree compatibility crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode. crypto: qat - The AE id should be less than the maximal AE number lib/mpi: Endianness fix crypto: rockchip - add hash support for crypto engine in rk3288 crypto: xts - fix compile errors crypto: doc - add skcipher API documentation crypto: doc - update AEAD AD handling ...
2016-03-13ext4: clean up error handling in the MMP supportvikram.jadhav07
There is memory leak as both caller function kmmpd() and callee read_mmp_block() not releasing bh_check (i.e buffer_head). Given patch fixes this problem. [ Additional changes suggested by Andreas Dilger -- TYT ] Signed-off-by: Jadhav Vikram <vikramjadhavpucsd2007@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-13ext4: use __GFP_NOFAIL in ext4_free_blocks()Konstantin Khlebnikov
This might be unexpected but pages allocated for sbi->s_buddy_cache are charged to current memory cgroup. So, GFP_NOFS allocation could fail if current task has been killed by OOM or if current memory cgroup has no free memory left. Block allocator cannot handle such failures here yet. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-13ext4: fix compile error while opening the macro DOUBLE_CHECKAihua Zhang
the error is: fs/ext4/mballoc.c:475:43: error: 'struct ext4_group_info' has no member named 'bb_bitmap'. so, the definition of macro DOUBLE_CHECK should before 'struct ext4_group_info', I fixed it, and I moved the macro AGGRESSIVE_CHECK together, because I think they shoule be together. Signed-off-by: Aihua Zhang <zhangaihua1@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-12ext4: print ext4 mount option data_err=abort correctlyAles Novak
If data_err=abort option is specified for an ext3/ext4 mount, /proc/mounts does show it as "(null)". This is caused by token2str() returning NULL for Opt_data_err_abort (due to its pattern containing '='). Signed-off-by: Ales Novak <alnovak@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-12ext4: fix NULL pointer dereference in ext4_mark_inode_dirty()Eryu Guan
ext4_reserve_inode_write() in ext4_mark_inode_dirty() could fail on error (e.g. EIO) and iloc.bh can be NULL in this case. But the error is ignored in the following "if" condition and ext4_expand_extra_isize() might be called with NULL iloc.bh set, which triggers NULL pointer dereference. This is uncovered by commit 8b4953e13f4c ("ext4: reserve code points for the project quota feature"), which enlarges the ext4_inode size, and run the following script on new kernel but with old mke2fs: #/bin/bash mnt=/mnt/ext4 devname=ext4-error dev=/dev/mapper/$devname fsimg=/home/fs.img trap cleanup 0 1 2 3 9 15 cleanup() { umount $mnt >/dev/null 2>&1 dmsetup remove $devname losetup -d $backend_dev rm -f $fsimg exit 0 } rm -f $fsimg fallocate -l 1g $fsimg backend_dev=`losetup -f --show $fsimg` devsize=`blockdev --getsz $backend_dev` good_tab="0 $devsize linear $backend_dev 0" error_tab="0 $devsize error $backend_dev 0" dmsetup create $devname --table "$good_tab" mkfs -t ext4 $dev mount -t ext4 -o errors=continue,strictatime $dev $mnt dmsetup load $devname --table "$error_tab" && dmsetup resume $devname echo 3 > /proc/sys/vm/drop_caches ls -l $mnt exit 0 [ Patch changed to simplify the function a tiny bit. -- Ted ] Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-10ext4: drop unneeded BUFFER_TRACE in ext4_delete_inline_entry()Geliang Tang
BUFFER_TRACE info "call ext4_handle_dirty_metadata" doesn't match the code, so drop it. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-09ext4: fix misspellings in comments.Adam Buchbinder
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-09ext4: more efficient SEEK_DATA implementationJan Kara
Using SEEK_DATA in a huge sparse file can easily lead to sotflockups as ext4_seek_data() iterates hole block-by-block. Fix the problem by using returned hole size from ext4_map_blocks() and thus skip the hole in one go. Update also SEEK_HOLE implementation to follow the same pattern as SEEK_DATA to make future maintenance easier. Furthermore we add cond_resched() to both ext4_seek_data() and ext4_seek_hole() to avoid softlockups in case evil user creates huge fragmented file and we have to go through lots of extents. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-09ext4: cleanup handling of bh->b_state in DAX mmapJan Kara
ext4_dax_mmap_get_block() updates bh->b_state directly instead of using ext4_update_bh_state(). This is mostly a cosmetic issue since DAX code always passes on-stack buffer_head but clean this up to make code more uniform. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-09ext4: return hole from ext4_map_blocks()Jan Kara
Currently, ext4_map_blocks() just returns 0 when it finds a hole and allocation is not requested. However we have all the information available to tell how large the hole actually is and there are callers of ext4_map_blocks() which would save some block-by-block hole iteration if they knew this information. So fill in struct ext4_map_blocks even for holes with the information we have. We keep returning 0 for holes to maintain backward compatibility of the function. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-09ext4: factor out determining of hole sizeJan Kara
ext4_ext_put_gap_in_cache() determines hole size in the extent tree, then trims this with possible delayed allocated blocks, and inserts the result into the extent status tree. Factor out determination of the size of the hole in the extent tree as we will need this information in ext4_ext_map_blocks() as well. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-09Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fix from Ted Ts'o: "This fixes a regression which crept in v4.5-rc5" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: iterate over buffer heads correctly in move_extent_per_page()
2016-03-09ext4: fix setting of referenced bit in ext4_es_lookup_extent()Jan Kara
We were setting referenced bit on the extent structure we return from ext4_es_lookup_extent() which is just a private structure on stack. Thus setting had no effect. Set the bit in the structure in the status tree instead. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-09ext4: iterate over buffer heads correctly in move_extent_per_page()Eryu Guan
In commit bcff24887d00 ("ext4: don't read blocks from disk after extents being swapped") bh is not updated correctly in the for loop and wrong data has been written to disk. generic/324 catches this on sub-page block size ext4. Fixes: bcff24887d00 ("ext4: don't read blocks from disk after extentsbeing swapped") Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-08ext4: remove i_ioend_countJan Kara
Remove counter of pending io ends as it is unused. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-08ext4: simplify io_end handling for AIO DIOJan Kara
When mapping blocks for direct IO, we allocate io_end structure before mapping blocks and store pointer to it in the inode. This creates a requirement that any AIO DIO using io_end must be protected by i_mutex. This created problems in the past with dioread_nolock mode which was corrupting io_end pointers. Also io_end is allocated unnecessarily in case where we don't need to convert any extents (which is a common case for example when overwriting file). We fix the problem by allocating io_end only once we return unwritten extent from block mapping function for AIO DIO (so we can save some pointless io_end allocations) and we pass pointer to it in bh->b_private which generic DIO code later passes to our end IO callback. That way we remove any need for global pointer to io_end structure and thus fix the races. The downside of this change is that the checking for unwritten IO in flight in ext4_extents_can_be_merged() is more racy since we now increment i_unwritten / set EXT4_STATE_DIO_UNWRITTEN only after dropping i_data_sem. However the check has been racy already before because ext4_writepages() already increment i_unwritten after dropping i_data_sem and reserved blocks save us from hitting ENOSPC in the worst case. Signed-off-by: Jan Kara <jack@suse.cz>
2016-03-08ext4: move trans handling and completion deferal out of _ext4_get_blockJan Kara
There is no need to handle starting of a transaction and deferal of DIO completion in _ext4_get_block() function. We can move this out to get block functions for direct IO that need it. That way we can add stricter checks verifying things work as we expect. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-08ext4: rename and split get blocks functionsJan Kara
Rename ext4_get_blocks_write() to ext4_get_blocks_unwritten() to better describe what it does. Also split out get blocks functions for direct IO. Later we move functionality from _ext4_get_blocks() there. There's no functional change in this patch. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-08ext4: use i_mutex to serialize unaligned AIO DIOJan Kara
Currently we've used hashed aio_mutex to serialize unaligned AIO DIO. However the code cleanups that happened after 2011 when the lock was introduced made aio_mutex acquired at almost the same places where we already have exclusion using i_mutex. So just use i_mutex for the exclusion of unaligned AIO DIO. The change moves waiting for pending unwritten extent conversion under i_mutex. That makes special handling of O_APPEND writes unnecessary and also avoids possible livelocking of unaligned AIO DIO with aligned one (nothing was preventing contiguous stream of aligned AIO DIOs to let unaligned AIO DIO wait forever). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>