path: root/fs/ext4
Age | Commit message | Author
2020-12-17 | ext4: fix deadlock with fs freezing and EA inodes | Jan Kara
Xattr code using inodes with large xattr data can end up dropping the last inode reference (and thus deleting the inode) from places like ext4_xattr_set_entry(). That function is called with a transaction started, and so ext4_evict_inode() can deadlock against fs freezing like:

  CPU1                                    CPU2

  removexattr()                           freeze_super()
    vfs_removexattr()
      ext4_xattr_set()
        handle = ext4_journal_start()
        ...
        ext4_xattr_set_entry()
          iput(old_ea_inode)
            ext4_evict_inode(old_ea_inode)
                                            sb->s_writers.frozen = SB_FREEZE_FS;
                                            sb_wait_write(sb, SB_FREEZE_FS);
                                            ext4_freeze()
                                              jbd2_journal_lock_updates()
                                                -> blocks waiting for all
                                                   handles to stop
              sb_start_intwrite()
                -> blocks as sb is already in SB_FREEZE_FS state

Generally it is advisable to delete inodes from a separate transaction, as it can consume quite some credits; however, in this case it would be quite clumsy, and furthermore the credits for inode deletion are quite limited and already accounted for. So just tweak ext4_evict_inode() to avoid freeze protection if we have a transaction already started and thus it is not really needed anyway.

Cc: stable@vger.kernel.org Fixes: dec214d00e0d ("ext4: xattr inode deduplication") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201127110649.24730-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
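The approach described in this commit message (skip freeze protection when a transaction is already running) might look roughly like the userspace sketch below. All names here (demo_sb, demo_start_intwrite(), demo_evict_inode(), the handle_active parameter) are hypothetical stand-ins for illustration, not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a superblock; it only counts how often
 * internal-write freeze protection is taken and released. */
struct demo_sb {
	int intwrite_started;
	int intwrite_ended;
};

static void demo_start_intwrite(struct demo_sb *sb) { sb->intwrite_started++; }
static void demo_end_intwrite(struct demo_sb *sb) { sb->intwrite_ended++; }

/*
 * Sketch of the eviction path. handle_active models "this task already
 * has a journal handle running". In that case the running handle
 * already holds off the freezer, so taking freeze protection again is
 * unnecessary, and it is the step that could block.
 */
static void demo_evict_inode(struct demo_sb *sb, bool handle_active)
{
	bool freeze_protected = false;

	if (!handle_active) {
		demo_start_intwrite(sb);
		freeze_protected = true;
	}

	/* ... delete the inode inside a transaction ... */

	/* Release only what this call acquired. */
	if (freeze_protected)
		demo_end_intwrite(sb);
}
```

Tracking the acquisition in a local flag keeps the release path symmetric regardless of which branch was taken.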
2020-12-17 | ext4: make fast_commit.h byte identical with e2fsprogs/fast_commit.h | Harshad Shirwadkar
This patch makes fast_commit.h byte by byte identical with e2fsprogs/fast_commit.h. This will help us ensure that there are no on-disk format inconsistencies between e2fsck and kernel ext4. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201120202232.2240293-1-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-17 | ext4: fix fall-through warnings for Clang | Gustavo A. R. Silva
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of just letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/03497331f088a938d7a728e7a689bd7953139429.1605896059.git.gustavoars@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-17 | ext4: add docs about fast commit idempotence | Harshad Shirwadkar
Fast commit on-disk format is designed such that the replay of these tags can be idempotent. This patch adds documentation, in the form of code comments and kernel docs, that describes these characteristics. This patch also adds a TODO item needed to ensure kernel fast commit replay idempotence. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201119232822.1860882-1-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
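To illustrate what idempotent replay means in general, here is a minimal sketch in which each log record stores the resulting state rather than a delta, so replaying the log a second time changes nothing. The record layout and all demo_* names are hypothetical; this is not the ext4 fast commit tag format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory inode with just two fields. */
struct demo_inode {
	unsigned long size;
	unsigned int nlink;
};

enum demo_tag { DEMO_TAG_SET_SIZE, DEMO_TAG_SET_NLINK };

/* A record names the field and the value it should end up with. */
struct demo_record {
	enum demo_tag tag;
	unsigned long value;
};

/*
 * Apply every record in order. Because records assign absolute values
 * ("size is now 4096") instead of deltas ("size grew by 4096"),
 * applying the same log twice yields the same final state.
 */
static void demo_replay(struct demo_inode *inode,
			const struct demo_record *log, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		switch (log[i].tag) {
		case DEMO_TAG_SET_SIZE:
			inode->size = log[i].value;
			break;
		case DEMO_TAG_SET_NLINK:
			inode->nlink = (unsigned int)log[i].value;
			break;
		}
	}
}
```

A delta-style record such as "increment nlink" would not have this property: a crash during replay followed by a second replay would count the increment twice.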
2020-12-17 | ext4: remove the unused EXT4_CURRENT_REV macro | Kaixu Xia
There are no callers of the EXT4_CURRENT_REV macro, so remove it. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/1605164202-31120-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-17 | ext4: fix an IS_ERR() vs NULL check | Dan Carpenter
The ext4_find_extent() function never returns NULL; it returns error pointers. Fixes: 44059e503b03 ("ext4: fast commit recovery path") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201023112232.GB282278@mwanda Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
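The difference between a NULL check and an error-pointer check can be sketched in userspace as follows. The demo_err_ptr(), demo_ptr_err() and demo_is_err() helpers are simplified stand-ins written for this example (modelled on the kernel's ERR_PTR(), PTR_ERR() and IS_ERR()), and demo_find() is a hypothetical lookup, not ext4_find_extent().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in the top of the address range. */
static inline void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static int demo_value = 42;

/* Hypothetical lookup: reports failure with an error pointer,
 * never with NULL. */
static int *demo_find(int key)
{
	if (key < 0)
		return demo_err_ptr(-EINVAL);
	return &demo_value;
}

/* Returns 0 and stores the value in *out, or a negative errno. */
static int demo_lookup(int key, int *out)
{
	int *p = demo_find(key);

	if (demo_is_err(p))	/* a "!p" test here would never fire */
		return (int)demo_ptr_err(p);
	*out = *p;
	return 0;
}
```

With a NULL check in place of demo_is_err(), the failing case would go on to dereference the encoded error value as if it were a valid pointer.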
2020-12-17 | ext4: check for invalid block size early when mounting a file system | Theodore Ts'o
Check for valid block size directly by validating s_log_block_size; we were doing this in two places. First, by calculating blocksize via BLOCK_SIZE << s_log_block_size, and then checking that the blocksize was valid. And then secondly, by checking s_log_block_size directly. The first check is not reliable, and can trigger an UBSAN warning if s_log_block_size on a maliciously corrupted superblock is greater than 22. This is harmless, since the second test will correctly reject the maliciously fuzzed file system, but to make syzbot shut up, and because the two checks are duplicative in any case, delete the blocksize check, and move the s_log_block_size check earlier in ext4_fill_super(). Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: syzbot+345b75652b1d24227443@syzkaller.appspotmail.com
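The idea of validating the on-disk log value before shifting can be sketched as below. The demo_* names are hypothetical, and the limits (block sizes from 1 KiB to 64 KiB, with the field stored as log2 of the block size minus 10) are stated here as assumptions rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limits: log2 of the smallest and largest block size. */
#define DEMO_MIN_BLOCK_LOG_SIZE 10u
#define DEMO_MAX_BLOCK_LOG_SIZE 16u

/*
 * Returns the block size in bytes, or 0 if the on-disk field is out of
 * range. The range check comes first, so the shift count is always
 * small enough to be well defined, even for a corrupted field.
 */
static uint32_t demo_block_size(uint32_t log_block_size)
{
	if (log_block_size >
	    DEMO_MAX_BLOCK_LOG_SIZE - DEMO_MIN_BLOCK_LOG_SIZE)
		return 0;
	return UINT32_C(1) << (log_block_size + DEMO_MIN_BLOCK_LOG_SIZE);
}
```

Computing the shifted value first and validating the result afterwards is the ordering that lets an out-of-range field reach the shift.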
2020-12-17 | ext4: fix a memory leak of ext4_free_data | Chunguang Xu
When freeing metadata, we will create an ext4_free_data and insert it into the pending free list. After the current transaction is committed, the object will be freed. ext4_mb_free_metadata() will check whether the area to be freed overlaps with the pending free list. If true, return directly. At this time, ext4_free_data is leaked. Fortunately, the probability of this problem is small, since it only occurs if the file system is corrupted such that a block is claimed by more than one inode and those inodes are deleted within a single jbd2 transaction. Signed-off-by: Chunguang Xu <brookxu@tencent.com> Link: https://lore.kernel.org/r/1604764698-4269-8-git-send-email-brookxu@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
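The leak pattern described here (an early return on overlap that forgets to release the newly allocated entry) can be sketched in userspace like this. The list layout and all demo_* names are hypothetical and much simpler than the structure ext4 uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical pending-free entry covering [start, start + count). */
struct demo_free_data {
	unsigned long start;
	unsigned long count;
	struct demo_free_data *next;
};

static struct demo_free_data *demo_new(unsigned long start,
				       unsigned long count)
{
	struct demo_free_data *e = malloc(sizeof(*e));

	if (e) {
		e->start = start;
		e->count = count;
		e->next = NULL;
	}
	return e;
}

static bool demo_overlaps(const struct demo_free_data *a,
			  const struct demo_free_data *b)
{
	return a->start < b->start + b->count &&
	       b->start < a->start + a->count;
}

/*
 * Takes ownership of entry. Returns true if it was queued. On overlap
 * the entry is rejected, and it must be freed here because the caller
 * no longer owns it; omitting the free() is the leak.
 */
static bool demo_queue_free(struct demo_free_data **pending,
			    struct demo_free_data *entry)
{
	if (!entry)
		return false;
	for (const struct demo_free_data *e = *pending; e; e = e->next) {
		if (demo_overlaps(e, entry)) {
			free(entry);
			return false;
		}
	}
	entry->next = *pending;
	*pending = entry;
	return true;
}
```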
2020-12-16 | Merge tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block updates from Jens Axboe:

"Another series of killing more code than what is being added, again thanks to Christoph's relentless cleanups and tech debt tackling.

This contains:
- blk-iocost improvements (Baolin Wang)
- part0 iostat fix (Jeffle Xu)
- Disable iopoll for split bios (Jeffle Xu)
- block tracepoint cleanups (Christoph Hellwig)
- Merging of struct block_device and hd_struct (Christoph Hellwig)
- Rework/cleanup of how block device sizes are updated (Christoph Hellwig)
- Simplification of gendisk lookup and removal of block device aliasing (Christoph Hellwig)
- Block device ioctl cleanups (Christoph Hellwig)
- Removal of bdget()/blkdev_get() as exported API (Christoph Hellwig)
- Disk change rework, avoid ->revalidate_disk() (Christoph Hellwig)
- sbitmap improvements (Pavel Begunkov)
- Hybrid polling fix (Pavel Begunkov)
- bvec iteration improvements (Pavel Begunkov)
- Zone revalidation fixes (Damien Le Moal)
- blk-throttle limit fix (Yu Kuai)
- Various little fixes"

* tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block: (126 commits)
  blk-mq: fix msec comment from micro to milli seconds
  blk-mq: update arg in comment of blk_mq_map_queue
  blk-mq: add helper allocating tagset->tags
  Revert "block: Fix a lockdep complaint triggered by request queue flushing"
  nvme-loop: use blk_mq_hctx_set_fq_lock_class to set loop's lock class
  blk-mq: add new API of blk_mq_hctx_set_fq_lock_class
  block: disable iopoll for split bio
  block: Improve blk_revalidate_disk_zones() checks
  sbitmap: simplify wrap check
  sbitmap: replace CAS with atomic and
  sbitmap: remove swap_lock
  sbitmap: optimise sbitmap_deferred_clear()
  blk-mq: skip hybrid polling if iopoll doesn't spin
  blk-iocost: Factor out the base vrate change into a separate function
  blk-iocost: Factor out the active iocgs' state check into a separate function
  blk-iocost: Move the usage ratio calculation to the correct place
  blk-iocost: Remove unnecessary advance declaration
  blk-iocost: Fix some typos in comments
  blktrace: fix up a kerneldoc comment
  block: remove the request_queue to argument request based tracepoints
  ...
2020-12-16 | Merge tag 'linux-kselftest-kunit-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest | Linus Torvalds
Pull Kunit updates from Shuah Khan:
- documentation update and fix to kunit_tool to parse diagnostic messages correctly from David Gow
- Support for Parameterized Testing and fs/ext4 test updates to use KUnit parameterized testing feature from Arpitha Raghunandan
- Helper to derive file names depending on --build_dir argument from Andy Shevchenko

* tag 'linux-kselftest-kunit-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  fs: ext4: Modify inode-test.c to use KUnit parameterized testing feature
  kunit: Support for Parameterized Testing
  kunit: kunit_tool: Correctly parse diagnostic messages
  Documentation: kunit: provide guidance for testing many inputs
  kunit: Introduce get_file_path() helper
2020-12-14 | Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt | Linus Torvalds
Pull fscrypt updates from Eric Biggers:

"This release there are some fixes for longstanding problems, as well as some cleanups:
- Fix a race condition where a duplicate filename could be created in an encrypted directory if a syscall that creates a new filename raced with the directory's encryption key being added.
- Allow deleting files that use an unsupported encryption policy.
- Simplify the locking for 'struct fscrypt_master_key'.
- Remove kernel-internal constants from the UAPI header.

As usual, all these patches have been in linux-next with no reported issues, and I've tested them with xfstests"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: allow deleting files with unsupported encryption policy
  fscrypt: unexport fscrypt_get_encryption_info()
  fscrypt: move fscrypt_require_key() to fscrypt_private.h
  fscrypt: move body of fscrypt_prepare_setattr() out-of-line
  fscrypt: introduce fscrypt_prepare_readdir()
  ext4: don't call fscrypt_get_encryption_info() from dx_show_leaf()
  ubifs: remove ubifs_dir_open()
  f2fs: remove f2fs_dir_open()
  ext4: remove ext4_dir_open()
  fscrypt: simplify master key locking
  fscrypt: remove unnecessary calls to fscrypt_require_key()
  ubifs: prevent creating duplicate encrypted filenames
  f2fs: prevent creating duplicate encrypted filenames
  ext4: prevent creating duplicate encrypted filenames
  fscrypt: add fscrypt_is_nokey_name()
  fscrypt: remove kernel-internal constants from UAPI header
2020-12-09 | ext4: delete nonsensical (commented-out) code inside ext4_xattr_block_set() | Chunguang Xu
Signed-off-by: Chunguang Xu <brookxu@tencent.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1604764698-4269-7-git-send-email-brookxu@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-09 | ext4: update ext4_data_block_valid related comments | Chunguang Xu
Since ext4_data_block_valid() has been renamed to ext4_inode_block_valid(), the related comments need to be updated. Signed-off-by: Chunguang Xu <brookxu@tencent.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1604764698-4269-5-git-send-email-brookxu@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-03 | ext4: simplify the code of mb_find_order_for_block | Chunguang Xu
The code of mb_find_order_for_block is a bit obscure, but we can simplify it with mb_find_buddy() to make the code more concise. Signed-off-by: Chunguang Xu <brookxu@tencent.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1604764698-4269-3-git-send-email-brookxu@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-03 | ext4: remove redundant mb_regenerate_buddy() | Chunguang Xu
After this patch (163a203), if an abnormal bitmap is detected, we will mark the group as corrupt, and we will not use this group in the future. Therefore, it should be meaningless to regenerate the buddy bitmap of this group. It might be better to delete it. Signed-off-by: Chunguang Xu <brookxu@tencent.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1604764698-4269-2-git-send-email-brookxu@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-03 | ext4: use ASSERT() to replace J_ASSERT() | Chunguang Xu
There are currently multiple forms of assertion, such as J_ASSERT(). J_ASSERT() is provided for the jbd module, which is a public module. Maybe we should use a custom ASSERT() like other file systems, such as xfs, which would be better. Signed-off-by: Chunguang Xu <brookxu@tencent.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1604764698-4269-1-git-send-email-brookxu@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-03 | ext4: print quota journalling mode on (re-)mount | Roman Anufriev
Right now, it is hard to understand which quota journalling type is enabled: you need to be quite familiar with kernel code and trace it or really understand what different combinations of fs flags/mount options lead to. This patch adds printing of current quota journalling mode on each mount/remount, thus making it easier to check it at a glance/in autotests. The semantics is similar to ext4 data journalling modes:
* journalled - quota configured, journalling will be enabled
* writeback - quota configured, journalling won't be enabled
* none - quota isn't configured
* disabled - kernel compiled without CONFIG_QUOTA feature
Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/1603336860-16153-2-git-send-email-dotdot@yandex-team.ru Signed-off-by: Roman Anufriev <dotdot@yandex-team.ru> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
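The four modes listed in this commit message map naturally onto a small decision function. The sketch below assumes three boolean inputs and a hypothetical demo_quota_mode() name; how the kernel actually derives these conditions from fs flags and mount options is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Map the three conditions from the commit message onto the mode
 * string that would be printed at mount time.
 */
static const char *demo_quota_mode(bool quota_compiled_in,
				   bool quota_configured,
				   bool quota_journalled)
{
	if (!quota_compiled_in)
		return "disabled";	/* built without quota support */
	if (!quota_configured)
		return "none";		/* quota isn't configured */
	return quota_journalled ? "journalled" : "writeback";
}
```

The order of the checks matters: "disabled" takes precedence over everything else, and the journalled/writeback distinction only applies once quota is configured.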
2020-12-03 | ext4: add helpers for checking whether quota can be enabled/is journalled | Roman Anufriev
Right now, there are several places where we check whether fs is capable of enabling quota or if quota is journalled with quite long and non-self-descriptive condition statements. This patch wraps these statements into helpers for better readability and easier usage. Link: https://lore.kernel.org/r/1603336860-16153-1-git-send-email-dotdot@yandex-team.ru Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Roman Anufriev <dotdot@yandex-team.ru> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-03 | ext4: remove redundant assignment of variable ex | Colin Ian King
Variable ex is assigned a value that is never read; the assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20201021132326.148052-1-colin.king@canonical.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-03 | ext4: remove the null check of bio_vec page | Xianting Tian
bv_page can't be NULL in a valid bio_vec, so we can remove the NULL check, as we did in other places when calling bio_for_each_segment_all() to go through all bio_vec of a bio. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Xianting Tian <tian.xianting@h3c.com> Link: https://lore.kernel.org/r/20201020082201.34257-1-tian.xianting@h3c.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-03 | ext4: remove redundant operation that set bh to NULL | Kaixu Xia
The out_fail branch path doesn't release the bh and the second bh is valid only in the for statement, so we don't need to set them to NULL. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/1603194069-17557-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-02 | fscrypt: Have filesystems handle their d_ops | Daniel Rosenberg
This shifts the responsibility of setting up dentry operations from fscrypt to the individual filesystems, allowing them to have their own operations while still setting fscrypt's d_revalidate as appropriate. Most filesystems can just use generic_set_encrypted_ci_d_ops, unless they have their own specific dentry operations as well. That operation will set the minimal d_ops required under the circumstances. Since the fscrypt d_ops are set later on, we must set all d_ops there, since we cannot adjust those later on. This should not result in any change in behavior. Signed-off-by: Daniel Rosenberg <drosen@google.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-02 | fscrypt: introduce fscrypt_prepare_readdir() | Eric Biggers
The last remaining use of fscrypt_get_encryption_info() from filesystems is for readdir (->iterate_shared()). Every other call is now in fs/crypto/ as part of some other higher-level operation. We need to add a new argument to fscrypt_get_encryption_info() to indicate whether the encryption policy is allowed to be unrecognized or not. Doing this is easier if we can work with high-level operations rather than direct filesystem use of fscrypt_get_encryption_info(). So add a function fscrypt_prepare_readdir() which wraps the call to fscrypt_get_encryption_info() for the readdir use case. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-02 | ext4: don't call fscrypt_get_encryption_info() from dx_show_leaf() | Eric Biggers
The call to fscrypt_get_encryption_info() in dx_show_leaf() is too low in the call tree; fscrypt_get_encryption_info() should have already been called when starting the directory operation. And indeed, it already is. Moreover, the encryption key is guaranteed to already be available because dx_show_leaf() is only called when adding a new directory entry. And even if the key wasn't available, dx_show_leaf() uses fscrypt_fname_disk_to_usr() which knows how to create a no-key name. So for the above reasons, and because it would be desirable to stop exporting fscrypt_get_encryption_info() directly to filesystems, remove the call to fscrypt_get_encryption_info() from dx_show_leaf(). Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-02 | ext4: remove ext4_dir_open() | Eric Biggers
Since encrypted directories can be opened and searched without their key being available, and each readdir and ->lookup() tries to set up the key, trying to set up the key in ->open() too isn't really useful. Just remove it so that directories don't need an ->open() method anymore, and so that we eliminate a use of fscrypt_get_encryption_info() (which I'd like to stop exporting to filesystems). Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-02 | fs: ext4: Modify inode-test.c to use KUnit parameterized testing feature | Arpitha Raghunandan
Modify fs/ext4/inode-test.c to use the parameterized testing feature of KUnit. Signed-off-by: Arpitha Raghunandan <98.arpi@gmail.com> Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Marco Elver <elver@google.com> Tested-by: David Gow <davidgow@google.com> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Iurii Zaikin <yzaikin@google.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-12-01 | block: switch partition lookup to use struct block_device | Christoph Hellwig
Use struct block_device to look up partitions on a disk. This removes all usage of struct hd_struct from the I/O path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: Chao Yu <yuchao0@huawei.com> [f2fs] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-01 | fs: simplify freeze_bdev/thaw_bdev | Christoph Hellwig
Store the frozen superblock in struct block_device to avoid the awkward interface that can return a sb only used as a cookie, an ERR_PTR or NULL. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Chao Yu <yuchao0@huawei.com> [f2fs] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-24 | ext4: prevent creating duplicate encrypted filenames | Eric Biggers
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to create a duplicate filename in an encrypted directory by creating a file concurrently with adding the directory's encryption key. Fix this bug on ext4 by rejecting no-key dentries in ext4_add_entry(). Note that the duplicate check in ext4_find_dest_de() sometimes prevented this bug. However in many cases it didn't, since ext4_find_dest_de() doesn't examine every dentry. Fixes: 4461471107b7 ("ext4 crypto: enable filename encryption") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201118075609.120337-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-11-19 | ext4: fix bogus warning in ext4_update_dx_flag() | Jan Kara
The idea of the warning in ext4_update_dx_flag() is that we should warn when we are clearing EXT4_INODE_INDEX on a filesystem with metadata checksums enabled since after clearing the flag, checksums for internal htree nodes will become invalid. So there's no need to warn (or actually do anything) when EXT4_INODE_INDEX is not set. Link: https://lore.kernel.org/r/20201118153032.17281-1-jack@suse.cz Fixes: 48a34311953d ("ext4: fix checksum errors with indexed dirs") Reported-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-11-19 | ext4: drop fast_commit from /proc/mounts | Theodore Ts'o
The options in /proc/mounts must be valid mount options --- and fast_commit is not a mount option. Otherwise, command sequences like this will fail:

    # mount /dev/vdc /vdc
    # mkdir -p /vdc/phoronix_test_suite /pts
    # mount --bind /vdc/phoronix_test_suite /pts
    # mount -o remount,nodioread_nolock /pts
    mount: /pts: mount point not mounted or bad option.

And in the system logs, you'll find:

    EXT4-fs (vdc): Unrecognized mount option "fast_commit" or missing value

Fixes: 995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-11 | Revert "ext4: fix superblock checksum calculation race" | Theodore Ts'o
This reverts commit acaa532687cdc3a03757defafece9c27aa667546, which can result in ext4_superblock_csum_set() trying to sleep while a spinlock is being held. For more discussion of this issue, please see: https://lore.kernel.org/r/000000000000f50cb705b313ed70@google.com Reported-by: syzbot+7a4ba6a239b91a126c28@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-11 | ext4: handle dax mount option collision | Harshad Shirwadkar
Mount options dax=inode and dax=never collided with fast_commit and journal checksum. Redefine the mount flags to remove the collision. Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Fixes: 9cb20f94afcd2 ("fs/ext4: Make DAX mount option a tri-state") Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201111183209.447175-1-harshads@google.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
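A collision of this kind means two flag definitions share a bit, so setting one option silently sets the other. A small check for that property could look like the sketch below; demo_flags_disjoint() and the example values are hypothetical and do not reproduce the ext4 mount flag definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if no two flag values in the array share a bit. The
 * running mask "seen" accumulates every bit claimed so far; a flag
 * that intersects it collides with an earlier one.
 */
static bool demo_flags_disjoint(const unsigned long *flags, size_t n)
{
	unsigned long seen = 0;

	for (size_t i = 0; i < n; i++) {
		if (seen & flags[i])
			return false;
		seen |= flags[i];
	}
	return true;
}
```

Running such a check over a table of flag values, for instance in a unit test, is one way to catch a reused bit before it reaches users.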
2020-11-07 | ext4: fix sparse warnings in fast_commit code | Theodore Ts'o
Add missing __acquire() and __releases() annotations, and make fc_ineligible_reasons[] static, as it is not used outside of fs/ext4/fast_commit.c. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: cleanup fast commit mount options | Harshad Shirwadkar
Drop the no_fc mount option that disables fast commit even if it was enabled at mkfs time. Move the fc_debug_force mount option under ifdef EXT4_DEBUG to annotate that this is strictly for debugging and testing purposes and should not be used in production. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-23-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: make s_mount_flags modifications atomic | Harshad Shirwadkar
Fast commit file system states are recorded in sbi->s_mount_flags. Fast commit expects these bit manipulations to be atomic. This patch adds helpers to make those modifications atomic. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-21-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
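Atomic flag helpers of the kind this commit describes can be sketched in userspace with C11 atomics. The struct and the demo_* helper names are hypothetical; in-kernel code would normally use bit operations such as set_bit(), clear_bit() and test_bit() instead of <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-superblock info structure. */
struct demo_sb_info {
	atomic_ulong mount_flags;
};

/* Each helper is a single atomic read-modify-write (or load), so two
 * threads updating different bits cannot lose each other's update. */
static void demo_set_mount_flag(struct demo_sb_info *sbi, unsigned int bit)
{
	atomic_fetch_or(&sbi->mount_flags, 1UL << bit);
}

static void demo_clear_mount_flag(struct demo_sb_info *sbi, unsigned int bit)
{
	atomic_fetch_and(&sbi->mount_flags, ~(1UL << bit));
}

static bool demo_test_mount_flag(struct demo_sb_info *sbi, unsigned int bit)
{
	return (atomic_load(&sbi->mount_flags) >> bit) & 1UL;
}
```

A plain `flags |= mask` is a separate load, OR and store; if two CPUs interleave those steps on the same word, one of the updates can be overwritten, which is what the atomic helpers avoid.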
2020-11-06 | ext4: issue fsdev cache flush before starting fast commit | Harshad Shirwadkar
If the journal dev is different from fsdev, issue a cache flush before committing fast commit blocks to disk. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201106035911.1942128-20-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: disable fast commit with data journalling | Harshad Shirwadkar
Fast commits don't work with data journalling. This patch disables the fast commit support when data journalling is turned on. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201106035911.1942128-19-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: fix inode dirty check in case of fast commits | Harshad Shirwadkar
In case of fast commits, determine if the inode is dirty by checking if the inode is on the fast commit list. This also helps us get rid of the ext4_inode_info.i_fc_committed_subtid field. Reported-by: Andrea Righi <andrea.righi@canonical.com> Tested-by: Andrea Righi <andrea.righi@canonical.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-18-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: remove unnecessary fast commit calls from ext4_file_mmap | Harshad Shirwadkar
Remove unnecessary calls to ext4_fc_start_update() and ext4_fc_stop_update() from ext4_file_mmap(). Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-17-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: mark buf dirty before submitting fast commit buffer | Harshad Shirwadkar
Mark the fast commit buffer as dirty before submission. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-16-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: fix code documentatioon | Harshad Shirwadkar
Add a TODO as a reminder to fix REQ_FUA | REQ_PREFLUSH for fast commit buffers. Also, fix a typo in the top-level comment in fast_commit.c. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-15-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: dedpulicate the code to wait on inode that's being committed | Harshad Shirwadkar
This patch deduplicates the code that implements waiting on an inode that's being committed. That code is moved into a new function. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-14-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | jbd2: don't pass tid to jbd2_fc_end_commit_fallback() | Harshad Shirwadkar
In jbd2_fc_end_commit_fallback(), we know which tid to commit. There's no need for caller to pass it. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-10-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: clean up the JBD2 API that initializes fast commits | Harshad Shirwadkar
This patch removes the jbd2_fc_init() API and its related functions to simplify enabling fast commits. With this change, the number of fast commit blocks to use is solely determined by the JBD2 layer. So, we move the default value for the minimum number of fast commit blocks from ext4/fast_commit.h to include/linux/jbd2.h. However, whether or not to use fast commits is determined by the file system. The file system just sets the fast commit feature using jbd2_journal_set_features(). The JBD2 layer then determines how many blocks to use for fast commits (based on the value found in the JBD2 superblock). Note that the JBD2 feature flag of fast commits is just an indication that there are fast commit blocks present on disk. It doesn't tell the JBD2 layer whether the file system intends to use fast commits or not. That's why we blindly clear the fast commit flag in journal_reset() after the recovery is done. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-7-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | jbd2: rename j_maxlen to j_total_len and add jbd2_journal_max_txn_bufs | Harshad Shirwadkar
The on-disk superblock field sb->s_maxlen represents the total size of the journal including the fast commit area and is no longer the maximum number of blocks available for a transaction. The maximum number of blocks available to a transaction is reduced by the number of fast commit blocks. So, this patch renames j_maxlen to j_total_len to better represent its intent. Also, it adds a function to calculate the maximum number of bufs available for a transaction. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-6-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: fixup ext4_fc_track_* functions' signature | Harshad Shirwadkar
Firstly, pass handle to all ext4_fc_track_* functions and use the transaction id found in handle->h_transaction->h_tid for tracking fast commit updates. Secondly, don't pass inode to ext4_fc_track_link/create/unlink functions. inode can be found inside these functions as d_inode(dentry). However, the rename path is an exception. That's because in that case, we need an inode that's not the same as d_inode(dentry). To handle that, add a couple of low-level wrapper functions that take inode and dentry as arguments. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-5-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: drop redundant calls ext4_fc_track_range | Harshad Shirwadkar
ext4_fc_track_range() should only be called when blocks are added or removed from an inode. So, the only places from where we need to call this function are ext4_map_blocks(), punch hole, collapse / zero range, truncate. Remove all the other redundant calls to this function. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-4-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: mark fc ineligible if inode gets evictied due to mem pressure | Harshad Shirwadkar
If an inode gets evicted due to memory pressure, we have to remove it from the fast commit list. However, that inode may have uncommitted changes that fast commits will lose. So, just fall back to full commits in this case. Also, rename the fast commit ineligibility reason from "EXT4_FC_REASON_MEM" to "EXT4_FC_REASON_MEM_NOMEM" for better expression. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-3-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 | ext4: describe fast_commit feature flags | Harshad Shirwadkar
Fast commit feature has flags in the file system as well as in JBD2. The meaning of fast commit feature flags can get confusing. Update docs and code to add more documentation about it. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201106035911.1942128-2-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>