summaryrefslogtreecommitdiff
path: root/fs/btrfs
AgeCommit message (Collapse)Author
2016-09-26btrfs: extend btrfs_set_extent_delalloc and its friends to support in-band ↵Qu Wenruo
dedupe and subpage size patchset Extend btrfs_set_extent_delalloc() and extent_clear_unlock_delalloc() parameters for both in-band dedupe and subpage sector size patchset. This should reduce conflict of both patchset and the effort to rebase them. Cc: Chandan Rajendra <chandan@linux.vnet.ibm.com> Cc: David Sterba <dsterba@suse.cz> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: add dynamic debug supportJeff Mahoney
We can re-use the dynamic debugging descriptor to make use of the dynamic debugging mechanism but still use our own printk interface. Defining the DEBUG macro works as it did before. When it's defined, all of the messages default to print. We can also enable all debug messages at boot or module-load time using the 'dyndbg' and 'btrfs.dyndbg' options. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: Fix warning "variable ‘gen’ set but not used"Luis Henriques
Variable 'gen' in reada_for_search() is not used since commit 58dc4ce43251 ("btrfs: remove unused parameter from readahead_tree_block"). This patch simply removes this variable. Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: Fix warning "variable ‘blocksize’ set but not used"Luis Henriques
Variable 'blocksize' in reada_walk_down() is not used since commit d3e46fea1b1e ("btrfs: sink blocksize parameter to readahead_tree_block"). This patch simply removes this variable. Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: let btrfs_delete_unused_bgs() to clean relocated bgsNaohiro Aota
Currently, btrfs_relocate_chunk() is removing relocated BG by itself. But the work can be done by btrfs_delete_unused_bgs() (and it's better since it trim the BG). Let's dedupe the code. While btrfs_delete_unused_bgs() is already hitting the relocated BG, it skip the BG since the BG has "ro" flag set (to keep balancing BG intact). On the other hand, btrfs cannot drop "ro" flag here to prevent additional writes. So this patch make use of "removed" flag. btrfs_delete_unused_bgs() now detect the flag to distinguish whether a read-only BG is relocating or not. Signed-off-by: Naohiro Aota <naohiro.aota@hgst.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26Btrfs: bail out if block group has different mixed flagLiu Bo
Currently we allow inconsistence about mixed flag (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA). We'd get ENOSPC if block group has mixed flag and btrfs doesn't. If that happens, we have one space_info with mixed flag and another space_info only with BTRFS_BLOCK_GROUP_METADATA, and global_block_rsv.space_info points to the latter one, but all bytes from block_group contributes to the mixed space_info, thus all the allocation will fail with ENOSPC. This adds a check for the above case. Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> [ updated message ] Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26Btrfs: fix memory leak in reading btree blocksLiu Bo
So we can read a btree block via readahead or intentional read, and we can end up with a memory leak when something happens as follows, 1) readahead starts to read block A but does not wait for read completion, 2) btree_readpage_end_io_hook finds that block A is corrupted, and it needs to clear all block A's pages' uptodate bit. 3) meanwhile an intentional read kicks in and checks block A's pages' uptodate to decide which page needs to be read. 4) when some pages have the uptodate bit during 3)'s check so 3) doesn't count them for eb->io_pages, but they are later cleared by 2) so we has to readpage on the page, we get the wrong eb->io_pages which results in a memory leak of this block. This fixes the problem by firstly getting all pages's locking and then checking pages' uptodate bit. t1(readahead) t2(readahead endio) t3(the following read) read_extent_buffer_pages end_bio_extent_readpage for pg in eb: for page 0,1,2 in eb: if pg is uptodate: btree_readpage_end_io_hook(pg) num_reads++ if uptodate: eb->io_pages = num_reads SetPageUptodate(pg) _______________ for pg in eb: for page 3 in eb: read_extent_buffer_pages if pg is NOT uptodate: btree_readpage_end_io_hook(pg) for pg in eb: __extent_read_full_page(pg) sanity check reports something wrong if pg is uptodate: clear_extent_buffer_uptodate(eb) num_reads++ for pg in eb: eb->io_pages = num_reads ClearPageUptodate(page) _______________ for pg in eb: if pg is NOT uptodate: __extent_read_full_page(pg) So t3's eb->io_pages is not consistent with the number of pages it's reading, and during endio(), atomic_dec_and_test(&eb->io_pages) will get a negative number so that we're not able to free the eb. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26Btrfs: remove BUG() in raid56Liu Bo
This BUG() has been triggered by a fuzz testing image, which contains an invalid chunk type, ie. a single stripe chunk has the raid6 type. Btrfs can handle this gracefully by returning -EIO, so besides using btrfs_warn to give us more debugging information rather than a single BUG(), we can return error properly. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: fix check_shared for fiemap ioctlLu Fengqi
Only in the case of different root_id or different object_id, check_shared identified extent as the shared. However, If a extent was referred by different offset of same file, it should also be identified as shared. In addition, check_shared's loop scale is at least n^3, so if a extent has too many references, even causes soft hang up. First, add all delayed_ref to the ref_tree and calculate the unqiue_refs, if the unique_refs is greater than one, return BACKREF_FOUND_SHARED. Then individually add the on-disk reference(inline/keyed) to the ref_tree and calculate the unique_refs of the ref_tree to check if the unique_refs is greater than one.Because once there are two references to return SHARED, so the time complexity is close to the constant. Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: create example debugfs file only in debugging buildDavid Sterba
Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26btrfs: fix perms on demonstration debugfs interfaceEric Sandeen
btrfs provides a helpful demonstration of how to export a global variable via debugfs; however, it is unique among other debugfs files in that it is world-writable, which causes some concern to people who are not familiar with its purpose. Fix it so that it is only user-writable. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-26Btrfs: fix memory leak of block group cacheLiu Bo
While processing delayed refs, we may update block group's statistics and attach it to cur_trans->dirty_bgs, and later writing dirty block groups will process the list, which happens during btrfs_commit_transaction(). For whatever reason, the transaction is aborted and dirty_bgs is not processed in cleanup_transaction(), we end up with memory leak of these dirty block group cache. Since btrfs_start_dirty_block_groups() doesn't make it go to the commit critical section, this also adds the cleanup work inside it. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-23Merge branch 'for-linus-4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "Josef fixed a problem when quotas are enabled with his latest ENOSPC rework, and Jeff added more checks into the subvol ioctls to avoid tripping up lookup_one_len" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: ensure that file descriptor used with subvol ioctls is a dir Btrfs: handle quota reserve failure properly
2016-09-22fs: Give dentry to inode_change_ok() instead of inodeJan Kara
inode_change_ok() will be resposible for clearing capabilities and IMA extended attributes and as such will need dentry. Give it as an argument to inode_change_ok() instead of an inode. Also rename inode_change_ok() to setattr_prepare() to better relect that it does also some modifications in addition to checks. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2016-09-22posix_acl: Clear SGID bit when setting file permissionsJan Kara
When file permissions are modified via chmod(2) and the user is not in the owning group or capable of CAP_FSETID, the setgid bit is cleared in inode_change_ok(). Setting a POSIX ACL via setxattr(2) sets the file permissions as well as the new ACL, but doesn't clear the setgid bit in a similar way; this allows to bypass the check in chmod(2). Fix that. References: CVE-2016-7097 Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2016-09-21btrfs: ensure that file descriptor used with subvol ioctls is a dirJeff Mahoney
If the subvol/snapshot create/destroy ioctls are passed a regular file with execute permissions set, we'll eventually Oops while trying to do inode->i_op->lookup via lookup_one_len. This patch ensures that the file descriptor refers to a directory. Fixes: cb8e70901d (Btrfs: Fix subvolume creation locking rules) Fixes: 76dda93c6a (Btrfs: add snapshot/subvolume destroy ioctl) Cc: <stable@vger.kernel.org> #v2.6.29+ Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-09-21Btrfs: handle quota reserve failure properlyJosef Bacik
btrfs/022 was spitting a warning for the case that we exceed the quota. If we fail to make our quota reservation we need to clean up our data space reservation. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Tested-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-09-16btrfs: use filemap_check_errors()Miklos Szeredi
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Cc: Chris Mason <clm@fb.com>
2016-09-14block, dm-crypt, btrfs: Introduce bio_flags()Bart Van Assche
Introduce the bio_flags() macro. Ensure that the second argument of bio_set_op_attrs() only contains flags and no operation. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Chris Mason <clm@fb.com> (maintainer:BTRFS FILE SYSTEM) Cc: Josef Bacik <jbacik@fb.com> (maintainer:BTRFS FILE SYSTEM) Cc: Mike Snitzer <snitzer@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@hgst.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-09Merge branch 'for-linus-4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I'm not proud of how long it took me to track down that one liner in btrfs_sync_log(), but the good news is the patches I was trying to blame for these problems were actually fine (sorry Filipe)" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns btrfs: do not decrease bytes_may_use when replaying extents
2016-09-07Merge branch 'for-chris' of ↵Chris Mason
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.8
2016-09-06btrfs: introduce tickets_id to determine whether asynchronous metadata ↵Wang Xiaoguang
reclaim work makes progress In btrfs_async_reclaim_metadata_space(), we use ticket's address to determine whether asynchronous metadata reclaim work is making progress. ticket = list_first_entry(&space_info->tickets, struct reserve_ticket, list); if (last_ticket == ticket) { flush_state++; } else { last_ticket = ticket; flush_state = FLUSH_DELAYED_ITEMS_NR; if (commit_cycles) commit_cycles--; } But indeed it's wrong, we should not rely on local variable's address to do this check, because addresses may be same. In my test environment, I dd one 168MB file in a 256MB fs, found that for this file, every time wait_reserve_ticket() called, local variable ticket's address is same, For above codes, assume a previous ticket's address is addrA, last_ticket is addrA. Btrfs_async_reclaim_metadata_space() finished this ticket and wake up it, then another ticket is added, but with the same address addrA, now last_ticket will be same to current ticket, then current ticket's flush work will start from current flush_state, not initial FLUSH_DELAYED_ITEMS_NR, which may result in some enospc issues(I have seen this in my test machine). Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-06Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returnsChris Mason
We use a btrfs_log_ctx structure to pass information into the tree log commit, and get error values out. It gets added to a per log-transaction list which we walk when things go bad. Commit d1433debe added an optimization to skip waiting for the log commit, but didn't take root_log_ctx out of the list. This patch makes sure we remove things before exiting. Signed-off-by: Chris Mason <clm@fb.com> Fixes: d1433debe7f4346cf9fc0dafc71c3137d2a97bc4 cc: stable@vger.kernel.org # 3.15+
2016-09-05btrfs: do not decrease bytes_may_use when replaying extentsWang Xiaoguang
When replaying extents, there is no need to update bytes_may_use in btrfs_alloc_logged_file_extent(), otherwise it'll trigger a WARN_ON about bytes_may_use. Fixes: ("btrfs: update btrfs_space_info's bytes_may_use timely") Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-03Merge branch 'for-linus-4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I'm still prepping a set of fixes for btrfs fsync, just nailing down a hard to trigger memory corruption. For now, these are tested and ready." * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: fix one bug that process may endlessly wait for ticket in wait_reserve_ticket() Btrfs: fix endless loop in balancing block groups Btrfs: kill invalid ASSERT() in process_all_refs()
2016-09-01btrfs: fix one bug that process may endlessly wait for ticket in ↵Wang Xiaoguang
wait_reserve_ticket() If can_overcommit() in btrfs_calc_reclaim_metadata_size() returns true, btrfs_async_reclaim_metadata_space() will not reclaim metadata space, just return directly and also forget to wake up process which are waiting for their tickets, so these processes will wait endlessly. Fstests case generic/172 with mount option "-o compress=lzo" have revealed this bug in my test machine. Here if we have tickets to handle, we must handle them first. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-01Btrfs: fix endless loop in balancing block groupsLiu Bo
Qgroup function may overwrite the saved error 'err' with 0 in case quota is not enabled, and this ends up with a endless loop in balance because we keep going back to balance the same block group. It really should use 'ret' instead. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-01Btrfs: kill invalid ASSERT() in process_all_refs()Josef Bacik
Suppose you have the following tree in snap1 on a file system mounted with -o inode_cache so that inode numbers are recycled └── [ 258] a └── [ 257] b and then you remove b, rename a to c, and then re-create b in c so you have the following tree └── [ 258] c └── [ 257] b and then you try to do an incremental send you will hit ASSERT(pending_move == 0); in process_all_refs(). This is because we assume that any recycling of inodes will not have a pending change in our path, which isn't the case. This is the case for the DELETE side, since we want to remove the old file using the old path, but on the create side we could have a pending move and need to do the normal pending rename dance. So remove this ASSERT() and put a comment about why we ignore pending_move. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-08-26Merge branch 'for-linus-4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "We've queued up a few different fixes in here. These range from enospc corners to fsync and quota fixes, and a few targeted at error handling for corrupt metadata/fuzzing" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix lockdep warning on deadlock against an inode's log mutex Btrfs: detect corruption when non-root leaf has zero item Btrfs: check btree node's nritems btrfs: don't create or leak aliased root while cleaning up orphans Btrfs: fix em leak in find_first_block_group btrfs: do not background blkdev_put() Btrfs: clarify do_chunk_alloc()'s return value btrfs: fix fsfreeze hang caused by delayed iputs deal btrfs: update btrfs_space_info's bytes_may_use timely btrfs: divide btrfs_update_reserved_bytes() into two functions btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() btrfs: qgroup: Fix qgroup incorrectness caused by log replay btrfs: relocation: Fix leaking qgroups numbers on data extents btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent() btrfs: waiting on qgroup rescan should not always be interruptible btrfs: properly track when rescan worker is running btrfs: flush_space: treat return value of do_chunk_alloc properly Btrfs: add ASSERT for block group's memory leak btrfs: backref: Fix soft lockup in __merge_refs function Btrfs: fix memory leak of reloc_root
2016-08-25Btrfs: fix lockdep warning on deadlock against an inode's log mutexFilipe Manana
Commit 44f714dae50a ("Btrfs: improve performance on fsync against new inode after rename/unlink"), which landed in 4.8-rc2, introduced a possibility for a deadlock due to double locking of an inode's log mutex by the same task, which lockdep reports with: [23045.433975] ============================================= [23045.434748] [ INFO: possible recursive locking detected ] [23045.435426] 4.7.0-rc6-btrfs-next-34+ #1 Not tainted [23045.436044] --------------------------------------------- [23045.436044] xfs_io/3688 is trying to acquire lock: [23045.436044] (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] but task is already holding lock: [23045.436044] (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] other info that might help us debug this: [23045.436044] Possible unsafe locking scenario: [23045.436044] CPU0 [23045.436044] ---- [23045.436044] lock(&ei->log_mutex); [23045.436044] lock(&ei->log_mutex); [23045.436044] *** DEADLOCK *** [23045.436044] May be due to missing lock nesting notation [23045.436044] 3 locks held by xfs_io/3688: [23045.436044] #0: (&sb->s_type->i_mutex_key#15){+.+...}, at: [<ffffffffa035f2ae>] btrfs_sync_file+0x14e/0x425 [btrfs] [23045.436044] #1: (sb_internal#2){.+.+.+}, at: [<ffffffff8118446b>] __sb_start_write+0x5f/0xb0 [23045.436044] #2: (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] stack backtrace: [23045.436044] CPU: 4 PID: 3688 Comm: xfs_io Not tainted 4.7.0-rc6-btrfs-next-34+ #1 [23045.436044] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [23045.436044] 0000000000000000 ffff88022f5f7860 ffffffff8127074d ffffffff82a54b70 [23045.436044] ffffffff82a54b70 ffff88022f5f7920 ffffffff81092897 ffff880228015d68 [23045.436044] 0000000000000000 ffffffff82a54b70 ffffffff829c3f00 ffff880228015d68 [23045.436044] Call Trace: [23045.436044] [<ffffffff8127074d>] dump_stack+0x67/0x90 [23045.436044] [<ffffffff81092897>] __lock_acquire+0xcbb/0xe4e [23045.436044] [<ffffffff8109155f>] ? mark_lock+0x24/0x201 [23045.436044] [<ffffffff8109179a>] ? mark_held_locks+0x5e/0x74 [23045.436044] [<ffffffff81092de0>] lock_acquire+0x12f/0x1c3 [23045.436044] [<ffffffff81092de0>] ? lock_acquire+0x12f/0x1c3 [23045.436044] [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffff814a51a4>] mutex_lock_nested+0x77/0x3a7 [23045.436044] [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffffa039705e>] ? btrfs_release_delayed_node+0xb/0xd [btrfs] [23045.436044] [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs] [23045.436044] [<ffffffff810a0ed1>] ? vprintk_emit+0x453/0x465 [23045.436044] [<ffffffffa0385a61>] btrfs_log_inode+0x66e/0xc95 [btrfs] [23045.436044] [<ffffffffa03c084d>] log_new_dir_dentries+0x26c/0x359 [btrfs] [23045.436044] [<ffffffffa03865aa>] btrfs_log_inode_parent+0x4a6/0x628 [btrfs] [23045.436044] [<ffffffffa0387552>] btrfs_log_dentry_safe+0x5a/0x75 [btrfs] [23045.436044] [<ffffffffa035f464>] btrfs_sync_file+0x304/0x425 [btrfs] [23045.436044] [<ffffffff811acaf4>] vfs_fsync_range+0x8c/0x9e [23045.436044] [<ffffffff811acb22>] vfs_fsync+0x1c/0x1e [23045.436044] [<ffffffff811acc79>] do_fsync+0x31/0x4a [23045.436044] [<ffffffff811ace99>] SyS_fsync+0x10/0x14 [23045.436044] [<ffffffff814a88e5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [23045.436044] [<ffffffff8108f039>] ? trace_hardirqs_off_caller+0x3f/0xaa An example reproducer for this is: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/dir $ touch /mnt/dir/foo $ sync $ mv /mnt/dir/foo /mnt/dir/bar $ touch /mnt/dir/foo $ xfs_io -c "fsync" /mnt/dir/bar This is because while logging the inode of file bar we end up logging its parent directory (since its inode has an unlink_trans field matching the current transaction id due to the rename operation), which in turn logs the inodes for all its new dentries, so that the new inode for the new file named foo gets logged which in turn triggered another logging attempt for the inode we are fsync'ing, since that inode had an old name that corresponds to the name of the new inode. So fix this by ensuring that when logging the inode for a new dentry that has a name matching an old name of some other inode, we don't log again the original inode that we are fsync'ing. Fixes: 44f714dae50a ("Btrfs: improve performance on fsync against new inode after rename/unlink") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25Btrfs: detect corruption when non-root leaf has zero itemLiu Bo
Right now we treat leaf which has zero item as a valid one because we could have an empty tree, that is, a root that is also a leaf without any item, however, in the same case but when the leaf is not a root, we can end up with hitting the BUG_ON(1) in btrfs_extend_item() called by setup_inline_extent_backref(). This makes us check the situation as a corruption if leaf is not its own root. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25Btrfs: check btree node's nritemsLiu Bo
When btree node (level = 1) has nritems which equals to zero, we can end up with panic due to insert_ptr()'s BUG_ON(slot > nritems); where slot is 1 and nritems is 0, as copy_for_split() calls insert_ptr(.., path->slots[1] + 1, ...); A invalid value results in the whole mess, this adds the check for btree's node nritems so that we stop reading block when when something is wrong. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: don't create or leak aliased root while cleaning up orphansJeff Mahoney
commit 909c3a22da3 (Btrfs: fix loading of orphan roots leading to BUG_ON) avoids the BUG_ON but can add an aliased root to the dead_roots list or leak the root. Since we've already been loading roots into the radix tree, we should use it before looking the root up on disk. Cc: <stable@vger.kernel.org> # 4.5 Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25Btrfs: fix em leak in find_first_block_groupJosef Bacik
We need to call free_extent_map() on the em we look up. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: do not background blkdev_put()Anand Jain
At the end of unmount/dev-delete, if the device exclusive open is not actually closed, then there might be a race with another program in the userland who is trying to open the device in exclusive mode and it may fail for eg: unmount /btrfs; fsck /dev/x btrfs dev del /dev/x /btrfs; fsck /dev/x so here background blkdev_put() is not a choice Signed-off-by: Anand Jain <Anand.Jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25Btrfs: clarify do_chunk_alloc()'s return valueLiu Bo
Function start_transaction() can return ERR_PTR(1) when flush is BTRFS_RESERVE_FLUSH_LIMIT, so the call graph is start_transaction (return ERR_PTR(1)) -> btrfs_block_rsv_add (return 1) -> reserve_metadata_bytes (return 1) -> flush_space (return 1) -> do_chunk_alloc (return 1) With BTRFS_RESERVE_FLUSH_LIMIT, if flush_space is already on the flush_state of ALLOC_CHUNK and it successfully allocates a new chunk, then instead of trying to reserve space again, reserve_metadata_bytes returns 1 immediately. Eventually the callers who call start_transaction() usually just do the IS_ERR() check which ERR_PTR(1) can pass, then it'll get a panic when dereferencing a pointer which is ERR_PTR(1). The following patch fixes the above problem. "btrfs: flush_space: treat return value of do_chunk_alloc properly" https://patchwork.kernel.org/patch/7778651/ This add comments to clarify do_chunk_alloc()'s return value. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: fix fsfreeze hang caused by delayed iputs dealWang Xiaoguang
When running fstests generic/068, sometimes we got below deadlock: xfs_io D ffff8800331dbb20 0 6697 6693 0x00000080 ffff8800331dbb20 ffff88007acfc140 ffff880034d895c0 ffff8800331dc000 ffff880032d243e8 fffffffeffffffff ffff880032d24400 0000000000000001 ffff8800331dbb38 ffffffff816a9045 ffff880034d895c0 ffff8800331dbba8 Call Trace: [<ffffffff816a9045>] schedule+0x35/0x80 [<ffffffff816abab2>] rwsem_down_read_failed+0xf2/0x140 [<ffffffff8118f5e1>] ? __filemap_fdatawrite_range+0xd1/0x100 [<ffffffff8134f978>] call_rwsem_down_read_failed+0x18/0x30 [<ffffffffa06631fc>] ? btrfs_alloc_block_rsv+0x2c/0xb0 [btrfs] [<ffffffff810d32b5>] percpu_down_read+0x35/0x50 [<ffffffff81217dfc>] __sb_start_write+0x2c/0x40 [<ffffffffa067f5d5>] start_transaction+0x2a5/0x4d0 [btrfs] [<ffffffffa067f857>] btrfs_join_transaction+0x17/0x20 [btrfs] [<ffffffffa068ba34>] btrfs_evict_inode+0x3c4/0x5d0 [btrfs] [<ffffffff81230a1a>] evict+0xba/0x1a0 [<ffffffff812316b6>] iput+0x196/0x200 [<ffffffffa06851d0>] btrfs_run_delayed_iputs+0x70/0xc0 [btrfs] [<ffffffffa067f1d8>] btrfs_commit_transaction+0x928/0xa80 [btrfs] [<ffffffffa0646df0>] btrfs_freeze+0x30/0x40 [btrfs] [<ffffffff81218040>] freeze_super+0xf0/0x190 [<ffffffff81229275>] do_vfs_ioctl+0x4a5/0x5c0 [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70 [<ffffffff810038cf>] ? syscall_trace_enter_phase1+0x11f/0x140 [<ffffffff81229409>] SyS_ioctl+0x79/0x90 [<ffffffff81003c12>] do_syscall_64+0x62/0x110 [<ffffffff816acbe1>] entry_SYSCALL64_slow_path+0x25/0x25 >From this warning, freeze_super() already holds SB_FREEZE_FS, but btrfs_freeze() will call btrfs_commit_transaction() again, if btrfs_commit_transaction() finds that it has delayed iputs to handle, it'll start_transaction(), which will try to get SB_FREEZE_FS lock again, then deadlock occurs. The root cause is that in btrfs, sync_filesystem(sb) does not make sure all metadata is updated. There still maybe some codes adding delayed iputs, see below sample race window: CPU1 | CPU2 |-> freeze_super() | |-> sync_filesystem(sb); | | |-> cleaner_kthread() | | |-> btrfs_delete_unused_bgs() | | |-> btrfs_remove_chunk() | | |-> btrfs_remove_block_group() | | |-> btrfs_add_delayed_iput() | | |-> sb->s_writers.frozen = SB_FREEZE_FS; | |-> sb_wait_write(sb, SB_FREEZE_FS); | | acquire SB_FREEZE_FS lock. | | | |-> btrfs_freeze() | |-> btrfs_commit_transaction() | |-> btrfs_run_delayed_iputs() | | will handle delayed iputs, | | that means start_transaction() | | will be called, which will try | | to get SB_FREEZE_FS lock. | To fix this issue, introduce a "int fs_frozen" to record internally whether fs has been frozen. If fs has been frozen, we can not handle delayed iputs. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add comment to btrfs_freeze ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: update btrfs_space_info's bytes_may_use timelyWang Xiaoguang
This patch can fix some false ENOSPC errors, below test script can reproduce one false ENOSPC error: #!/bin/bash dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=128 dev=$(losetup --show -f fs.img) mkfs.btrfs -f -M $dev mkdir /tmp/mntpoint mount $dev /tmp/mntpoint cd /tmp/mntpoint xfs_io -f -c "falloc 0 $((64*1024*1024))" testfile Above script will fail for ENOSPC reason, but indeed fs still has free space to satisfy this request. Please see call graph: btrfs_fallocate() |-> btrfs_alloc_data_chunk_ondemand() | bytes_may_use += 64M |-> btrfs_prealloc_file_range() |-> btrfs_reserve_extent() |-> btrfs_add_reserved_bytes() | alloc_type is RESERVE_ALLOC_NO_ACCOUNT, so it does not | change bytes_may_use, and bytes_reserved += 64M. Now | bytes_may_use + bytes_reserved == 128M, which is greater | than btrfs_space_info's total_bytes, false enospc occurs. | Note, the bytes_may_use decrease operation will be done in | end of btrfs_fallocate(), which is too late. Here is another simple case for buffered write: CPU 1 | CPU 2 | |-> cow_file_range() |-> __btrfs_buffered_write() |-> btrfs_reserve_extent() | | | | | | | | | ..... | |-> btrfs_check_data_free_space() | | | | |-> extent_clear_unlock_delalloc() | In CPU 1, btrfs_reserve_extent()->find_free_extent()-> btrfs_add_reserved_bytes() do not decrease bytes_may_use, the decrease operation will be delayed to be done in extent_clear_unlock_delalloc(). Assume in this case, btrfs_reserve_extent() reserved 128MB data, CPU2's btrfs_check_data_free_space() tries to reserve 100MB data space. If 100MB > data_sinfo->total_bytes - data_sinfo->bytes_used - data_sinfo->bytes_reserved - data_sinfo->bytes_pinned - data_sinfo->bytes_readonly - data_sinfo->bytes_may_use btrfs_check_data_free_space() will try to allcate new data chunk or call btrfs_start_delalloc_roots(), or commit current transaction in order to reserve some free space, obviously a lot of work. But indeed it's not necessary as long as decreasing bytes_may_use timely, we still have free space, decreasing 128M from bytes_may_use. To fix this issue, this patch chooses to update bytes_may_use for both data and metadata in btrfs_add_reserved_bytes(). For compress path, real extent length may not be equal to file content length, so introduce a ram_bytes argument for btrfs_reserve_extent(), find_free_extent() and btrfs_add_reserved_bytes(), it's becasue bytes_may_use is increased by file content length. Then compress path can update bytes_may_use correctly. Also now we can discard RESERVE_ALLOC_NO_ACCOUNT, RESERVE_ALLOC and RESERVE_FREE. As we know, usually EXTENT_DO_ACCOUNTING is used for error path. In run_delalloc_nocow(), for inode marked as NODATACOW or extent marked as PREALLOC, we also need to update bytes_may_use, but can not pass EXTENT_DO_ACCOUNTING, because it also clears metadata reservation, so here we introduce EXTENT_CLEAR_DATA_RESV flag to indicate btrfs_clear_bit_hook() to update btrfs_space_info's bytes_may_use. Meanwhile __btrfs_prealloc_file_range() will call btrfs_free_reserved_data_space() internally for both sucessful and failed path, btrfs_prealloc_file_range()'s callers does not need to call btrfs_free_reserved_data_space() any more. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: divide btrfs_update_reserved_bytes() into two functionsWang Xiaoguang
This patch divides btrfs_update_reserved_bytes() into btrfs_add_reserved_bytes() and btrfs_free_reserved_bytes(), and next patch will extend btrfs_add_reserved_bytes()to fix some false ENOSPC error, please see later patch for detailed info. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster()Wang Xiaoguang
In prealloc_file_extent_cluster(), btrfs_check_data_free_space() uses wrong file offset for reloc_inode, it uses cluster->start and cluster->end, which indeed are extent's bytenr. The correct value should be cluster->[start|end] minus block group's start bytenr. start bytenr cluster->start | | extent | extent | ...| extent | |----------------------------------------------------------------| | block group reloc_inode | Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: qgroup: Fix qgroup incorrectness caused by log replayQu Wenruo
When doing log replay at mount time(after power loss), qgroup will leak numbers of replayed data extents. The cause is almost the same of balance. So fix it by manually informing qgroup for owner changed extents. The bug can be detected by btrfs/119 test case. Cc: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-and-Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: relocation: Fix leaking qgroups numbers on data extentsQu Wenruo
This patch fixes a REGRESSION introduced in 4.2, caused by the big quota rework. When balancing data extents, qgroup will leak all its numbers for relocated data extents. The relocation is done in the following steps for data extents: 1) Create data reloc tree and inode 2) Copy all data extents to data reloc tree And commit transaction 3) Create tree reloc tree(special snapshot) for any related subvolumes 4) Replace file extent in tree reloc tree with new extents in data reloc tree And commit transaction 5) Merge tree reloc tree with original fs, by swapping tree blocks For 1)~4), since tree reloc tree and data reloc tree doesn't count to qgroup, everything is OK. But for 5), the swapping of tree blocks will only info qgroup to track metadata extents. If metadata extents contain file extents, qgroup number for file extents will get lost, leading to corrupted qgroup accounting. The fix is, before commit transaction of step 5), manually info qgroup to track all file extents in data reloc tree. Since at commit transaction time, the tree swapping is done, and qgroup will account these data extents correctly. Cc: Mark Fasheh <mfasheh@suse.de> Reported-by: Mark Fasheh <mfasheh@suse.de> Reported-by: Filipe Manana <fdmanana@gmail.com> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent()Qu Wenruo
Refactor btrfs_qgroup_insert_dirty_extent() function, to two functions: 1. btrfs_qgroup_insert_dirty_extent_nolock() Almost the same with original code. For delayed_ref usage, which has delayed refs locked. Change the return value type to int, since caller never needs the pointer, but only needs to know if they need to free the allocated memory. 2. btrfs_qgroup_insert_dirty_extent() The more encapsulated version. Will do the delayed_refs lock, memory allocation, quota enabled check and other things. The original design is to keep exported functions to minimal, but since more btrfs hacks exposed, like replacing path in balance, we need to record dirty extents manually, so we have to add such functions. Also, add comment for both functions, to info developers how to keep qgroup correct when doing hacks. Cc: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-and-Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: waiting on qgroup rescan should not always be interruptibleJeff Mahoney
We wait on qgroup rescan completion in three places: file system shutdown, the quota disable ioctl, and the rescan wait ioctl. If the user sends a signal while we're waiting, we continue happily along. This is expected behavior for the rescan wait ioctl. It's racy in the shutdown path but mostly works due to other unrelated synchronization points. In the quota disable path, it Oopses the kernel pretty much immediately. Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: properly track when rescan worker is runningJeff Mahoney
The qgroup_flags field is overloaded such that it reflects the on-disk status of qgroups and the runtime state. The BTRFS_QGROUP_STATUS_FLAG_RESCAN flag is used to indicate that a rescan operation is in progress, but if the file system is unmounted while a rescan is running, the rescan operation is paused. If the file system is then mounted read-only, the flag will still be present but the rescan operation will not have been resumed. When we go to umount, btrfs_qgroup_wait_for_completion will see the flag and interpret it to mean that the rescan worker is still running and will wait for a completion that will never come. This patch uses a separate flag to indicate when the worker is running. The locking and state surrounding the qgroup rescan worker needs a lot of attention beyond this patch but this is enough to avoid a hung umount. Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by; Jeff Mahoney <jeffm@suse.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: flush_space: treat return value of do_chunk_alloc properlyAlex Lyakas
do_chunk_alloc returns 1 when it succeeds to allocate a new chunk. But flush_space will not convert this to 0, and will also return 1. As a result, reserve_metadata_bytes will think that flush_space failed, and may potentially return this value "1" to the caller (depends how reserve_metadata_bytes was called). The caller will also treat this as an error. For example, btrfs_block_rsv_refill does: int ret = -ENOSPC; ... ret = reserve_metadata_bytes(root, block_rsv, num_bytes, flush); if (!ret) { block_rsv_add_bytes(block_rsv, num_bytes, 0); return 0; } return ret; So it will return -ENOSPC. Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25Btrfs: add ASSERT for block group's memory leakLiu Bo
This adds several ASSERT()' s to report memory leak of block group cache. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: backref: Fix soft lockup in __merge_refs functionQu Wenruo
When over 1000 file extents refers to one extent, find_parent_nodes() will be obviously slow, due to the O(n^2)~O(n^3) loops inside __merge_refs(). The following ftrace shows the cubic growth of execution time: 256 refs 5) + 91.768 us | __add_keyed_refs.isra.12 [btrfs](); 5) 1.447 us | __add_missing_keys.isra.13 [btrfs](); 5) ! 114.544 us | __merge_refs [btrfs](); 5) ! 136.399 us | __merge_refs [btrfs](); 512 refs 6) ! 279.859 us | __add_keyed_refs.isra.12 [btrfs](); 6) 3.164 us | __add_missing_keys.isra.13 [btrfs](); 6) ! 442.498 us | __merge_refs [btrfs](); 6) # 2091.073 us | __merge_refs [btrfs](); and 1024 refs 7) ! 368.683 us | __add_keyed_refs.isra.12 [btrfs](); 7) 4.810 us | __add_missing_keys.isra.13 [btrfs](); 7) # 2043.428 us | __merge_refs [btrfs](); 7) * 18964.23 us | __merge_refs [btrfs](); And sort them into the following char: (Unit: us) ------------------------------------------------------------------------ Trace function | 256 ref | 512 refs | 1024 refs | ------------------------------------------------------------------------ __add_keyed_refs | 91 | 249 | 368 | __add_missing_keys | 1 | 3 | 4 | __merge_refs 1st call | 114 | 442 | 2043 | __merge_refs 2nd call | 136 | 2091 | 18964 | ------------------------------------------------------------------------ We can see the that __add_keyed_refs() grows almost in linear behavior. And __add_missing_keys() in this case doesn't change much or takes much time. While for the 1st __merge_refs() it's square growth for the 2nd __merge_refs() call it's cubic growth. It's no doubt that merge_refs() will take a long long time to execute if the number of refs continues its grows. So add a cond_resced() into the loop of __merge_refs(). Although this will solve the problem of soft lockup, we need to use the new rb_tree based structure introduced by Lu Fengqi to really solve the long execution time. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25Btrfs: fix memory leak of reloc_rootLiu Bo
When some critical errors occur and FS would be flipped into RO, if we have an on-going balance, we can end up with a memory leak of root->reloc_root since btrfs_drop_snapshots() bails out without freeing reloc_root at the very early start. However, we're not able to free reloc_root in btrfs_drop_snapshots() because its caller, merge_reloc_roots(), still needs to access it to cleanup reloc_root's rbtree. This makes us free reloc_root when we're going to free fs/file roots. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-10Merge branch 'for-linus-4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "Some fixes for btrfs send/recv and fsync from Filipe and Robbie Ko. Bonus points to Filipe for already having xfstests in place for many of these" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: remove unused function btrfs_add_delayed_qgroup_reserve() Btrfs: improve performance on fsync against new inode after rename/unlink Btrfs: be more precise on errors when getting an inode from disk Btrfs: send, don't bug on inconsistent snapshots Btrfs: send, avoid incorrect leaf accesses when sending utimes operations Btrfs: send, fix invalid leaf accesses due to incorrect utimes operations Btrfs: send, fix warning due to late freeing of orphan_dir_info structures Btrfs: incremental send, fix premature rmdir operations Btrfs: incremental send, fix invalid paths for rename operations Btrfs: send, add missing error check for calls to path_loop() Btrfs: send, fix failure to move directories with the same name around Btrfs: add missing check for writeback errors on fsync