Age  Commit message  Author
2018-05-28  btrfs: take the last remnants of ->d_fsdata use out  Al Viro
[spotted while going through ->d_fsdata handling around d_splice_alias(); don't really care which tree that goes through] The only thing even looking at ->d_fsdata in there (since 2012) had been kfree(dentry->d_fsdata) in btrfs_dentry_delete(). Which, incidentally, is all btrfs_dentry_delete() does. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Do super block verification before writing it to disk  Qu Wenruo
There are already 2 reports about strangely corrupted super blocks, where the csum still matches but extra garbage gets slipped into the super block.

The corruption looks like:

------
superblock: bytenr=65536, device=/dev/sdc1
---------------------------------------------------------
csum_type               41700 (INVALID)
csum                    0x3b252d3a [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
...
incompat_flags          0x5b22400000000169
                        ( MIXED_BACKREF |
                          COMPRESS_LZO |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA |
                          unknown flag: 0x5b22400000000000 )
...
------

Or

------
superblock: bytenr=65536, device=/dev/mapper/x
---------------------------------------------------------
csum_type               35355 (INVALID)
csum_size               32
csum                    0xf0dbeddd [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
...
incompat_flags          0x176d200000000169
                        ( MIXED_BACKREF |
                          COMPRESS_LZO |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA |
                          unknown flag: 0x176d200000000000 )
------

Obviously, csum_type and incompat_flags contain garbage, but the csum still matches, which means the kernel calculated the csum from corrupted super block memory. After manually fixing these values, the filesystem is completely healthy, with no problem exposed by btrfs check.

Although the cause is still unknown, at least detect the corruption and prevent it from spreading further.

Both reports have the same symptoms: a 4-byte overwrite at offset 192 of the superblock. The superblock structure is not allocated or freed and stays in memory for the whole filesystem lifetime, so it's not a use-after-free kind of error on someone else's leaked page. A vague hint at the probable cause is that the reports mention other system freezes related to graphics card drivers.

Reported-by: Ken Swenson <flat@imo.uto.moe> Reported-by: Ben Parsons <9parsonsb@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add brief analysis of the reports ] Signed-off-by: David Sterba <dsterba@suse.com>
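As a rough illustration of the idea in this commit, the sketch below rejects a super block copy whose fields hold values no valid filesystem could have, before it would be checksummed and written. The struct, the constants and the `demo_` function names are hypothetical stand-ins, not the kernel's definitions, and the set of supported flag bits is an assumption chosen only so that the flag values from the reports fail the check.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for the in-memory super block. */
struct demo_super {
	uint16_t csum_type;
	uint64_t incompat_flags;
};

/* Assumptions for illustration: one known csum type, ten known flag bits. */
#define DEMO_CSUM_TYPE_MAX      0u
#define DEMO_INCOMPAT_SUPPORTED 0x3ffULL

/* Core check that mount-time and write-time callers could share. */
static int demo_validate_super(const struct demo_super *sb)
{
	if (sb->csum_type > DEMO_CSUM_TYPE_MAX)
		return -EINVAL;
	if (sb->incompat_flags & ~DEMO_INCOMPAT_SUPPORTED)
		return -EINVAL;
	return 0;
}

/* Write path: refuse to persist a copy that fails validation. */
static int demo_write_super(const struct demo_super *sb)
{
	int ret = demo_validate_super(sb);

	if (ret < 0)
		return ret;	/* abort rather than write garbage to disk */
	/* ... checksum and submit the block here ... */
	return 0;
}
```

With these assumed limits, a super block carrying `incompat_flags` of 0x169 passes, while either of the reported values (csum_type 41700, or the flags with the unknown high bits) is refused with -EINVAL.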
2018-05-28  btrfs: Refactor btrfs_check_super_valid  Qu Wenruo
Refactor btrfs_check_super_valid:
1) Rename it to btrfs_validate_mount_super(). Now it's more obvious when the function should be called.
2) Extract the core check routine into validate_super(). The later write-time check can reuse it, and if needed, we could also use validate_super() to check each super block.
3) Add more comments about btrfs_validate_mount_super(), mostly about what it doesn't check and when it should be called.
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ rename to validate_super ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Move btrfs_check_super_valid() to avoid forward declaration  Qu Wenruo
Move btrfs_check_super_valid() before its single caller to avoid forward declaration. Though such code motion is not recommended as it pollutes git history, in this case the following patches would need to add new forward declarations for static functions that we want to avoid. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from populate_free_space_tree  Nikolay Borisov
This function always takes a transaction handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
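The fs_info cleanups in this series all follow the same pattern, which can be sketched as below. The `demo_` structures and functions are hypothetical miniatures, not the kernel's types; the only property assumed from the commit messages is that the transaction handle carries a pointer to the fs_info.

```c
#include <stddef.h>

/* Hypothetical miniature versions of the kernel structures. */
struct demo_fs_info {
	unsigned int nodesize;
};

struct demo_trans_handle {
	struct demo_fs_info *fs_info;	/* the handle already references fs_info */
};

/* Before: fs_info is passed next to a handle that already contains it. */
static unsigned int demo_nodesize_old(struct demo_trans_handle *trans,
				      struct demo_fs_info *fs_info)
{
	(void)trans;
	return fs_info->nodesize;
}

/* After: the redundant argument is gone and fs_info is taken from trans. */
static unsigned int demo_nodesize_new(struct demo_trans_handle *trans)
{
	struct demo_fs_info *fs_info = trans->fs_info;

	return fs_info->nodesize;
}
```

Dropping the argument shortens every call site and removes the possibility of passing an fs_info that does not match the transaction.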
2018-05-28  btrfs: Remove fs_info argument from add_to_free_space_tree  Nikolay Borisov
This function takes a transaction handle which already contains a reference to the fs_info. So use it and remove the extra function argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from remove_from_free_space_tree  Nikolay Borisov
This function already takes a transaction handle which holds a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from __remove_from_free_space_tree  Nikolay Borisov
This function takes a transaction handle which holds a reference to fs_info. So use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from remove_free_space_extent  Nikolay Borisov
This function takes a transaction handle which already has a reference to the fs_info. Use it and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from add_free_space_extent  Nikolay Borisov
This function always takes a transaction handle which references the fs_info structure. So use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from modify_free_space_bitmap  Nikolay Borisov
This function already takes a transaction which has a reference to the fs_info. So use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from update_free_space_extent_count  Nikolay Borisov
This function already takes a transaction handle which has a reference to the fs_info. So use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info parameter from convert_free_space_to_extents  Nikolay Borisov
This function always takes a transaction handle which contains a reference to fs_info. So use that and kill the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from convert_free_space_to_bitmaps  Nikolay Borisov
This function already takes a transaction handle which contains a reference to fs_info. So use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info parameter from remove_block_group_free_space  Nikolay Borisov
This function always takes a trans handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from add_new_free_space  Nikolay Borisov
This function also takes a btrfs_block_group_cache which contains a reference to the fs_info. So use that and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info parameter from add_new_free_space_info  Nikolay Borisov
This function already takes trans handle from where fs_info can be referenced. Remove the redundant parameter. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from __add_to_free_space_tree  Nikolay Borisov
This function already takes a transaction handle which contains a reference to fs_info. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from __add_block_group_free_space  Nikolay Borisov
This function already takes a transaction handle which has a reference to the fs_info. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove fs_info argument from add_block_group_free_space  Nikolay Borisov
We also pass in a transaction handle which has a reference to the fs_info. Just remove the extraneous argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Make btrfs_init_dummy_trans initialize trans' fs_info field  Nikolay Borisov
This will be necessary for future cleanups which remove the fs_info argument from some freespace tree functions. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Add assert in __btrfs_del_delalloc_inode  Nikolay Borisov
The invariant is that when nr_delalloc_inodes is 0 then the root mustn't have any inodes on its delalloc inodes list. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
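The invariant can be expressed in a few lines. This is only a sketch: `demo_root` is a hypothetical reduction in which a plain length counter stands in for the root's delalloc inode list, and the standard `assert()` stands in for the kernel's assertion macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of the root: a counter plus a stand-in list. */
struct demo_root {
	size_t nr_delalloc_inodes;
	size_t list_len;	/* length of the delalloc inode list */
};

static bool demo_list_empty(const struct demo_root *root)
{
	return root->list_len == 0;
}

/* Remove one inode from the list and keep the counter in sync. */
static void demo_del_delalloc_inode(struct demo_root *root)
{
	if (demo_list_empty(root))
		return;

	root->list_len--;
	root->nr_delalloc_inodes--;

	/* Invariant: a zero counter implies an empty list. */
	if (root->nr_delalloc_inodes == 0)
		assert(demo_list_empty(root));
}
```

If the counter and the list ever drift apart, the assertion fires at the point where the counter reaches zero instead of letting the mismatch surface later.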
2018-05-28  btrfs: incremental send, improve rmdir performance for large directory  Robbie Ko
Currently when checking if a directory can be deleted, we always check if all its children have been processed.

Example: a directory with 2,000,000 files was deleted
  original: 1994m57.071s
  patch:    1m38.554s

[FIX]
Instead of checking all children on all calls to can_rmdir(), we keep track of the directory index offset of the child last checked in the last call to can_rmdir(), and then use it as the starting point for future calls to can_rmdir().

Signed-off-by: Robbie Ko <robbieko@synology.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
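The resume-from-last-offset idea can be sketched as follows. Everything here is an assumption made for illustration: the children live in a sorted array rather than a b-tree, a binary search stands in for the tree search that jumps to the saved offset, and the `visited` counter exists only to show how much work each call does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical child entry, ordered by its directory index. */
struct demo_child {
	uint64_t dir_index;
	bool processed;
};

struct demo_dir {
	struct demo_child *children;	/* sorted by dir_index */
	size_t nr_children;
	uint64_t last_dir_index_offset;	/* where the previous check stopped */
	size_t visited;			/* instrumentation for this sketch */
};

/* First position whose dir_index is >= key. */
static size_t demo_lower_bound(const struct demo_dir *dir, uint64_t key)
{
	size_t lo = 0, hi = dir->nr_children;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (dir->children[mid].dir_index < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Returns true once every child from the saved offset on is processed. */
static bool demo_can_rmdir(struct demo_dir *dir)
{
	size_t i = demo_lower_bound(dir, dir->last_dir_index_offset);

	for (; i < dir->nr_children; i++) {
		dir->visited++;
		if (!dir->children[i].processed) {
			/* Remember the blocker so the next call resumes here. */
			dir->last_dir_index_offset = dir->children[i].dir_index;
			return false;
		}
	}
	return true;
}
```

Children before the saved offset were already seen as processed by an earlier call, so they are never rescanned; repeated calls therefore cost roughly one pass over the directory in total rather than one pass per call.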
2018-05-28  btrfs: incremental send, move allocation until it's needed in orphan_dir_info  Robbie Ko
Move the allocation after the search when it's clear that the new entry will be added. Signed-off-by: Robbie Ko <robbieko@synology.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: split delayed ref head initialization and addition  Nikolay Borisov
add_delayed_ref_head really performed 2 independent operations - initialising the ref head and adding it to a list. Now that the init part is in a separate function let's complete the separation between both operations. This results in a much simpler interface for add_delayed_ref_head since the function now deals solely with either adding the newly initialised delayed ref head or merging it into an existing delayed ref head. The function signature is vastly simplified as well, since 5 arguments are dropped. The only other thing worth mentioning is that, due to this split, the WARN_ON catching reinit of an existing head changes. In this patch the condition is extended such that qrecord && head_ref->qgroup_ref_root && head_ref->qgroup_reserved is added. This is done because the two qgroup_* prefixed members are set only if both ref_root and reserved are passed. So functionally it's equivalent to the old WARN_ON and allows removing the two args from add_delayed_ref_head. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
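The shape of this split, one function that only fills in a head and another that only inserts or merges it, might look like the sketch below. The `demo_` types are hypothetical: a small fixed array stands in for the real container of heads, and a head is reduced to three fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced delayed ref head. */
struct demo_ref_head {
	uint64_t bytenr;
	uint64_t num_bytes;
	int ref_mod;
};

#define DEMO_MAX_HEADS 8

/* Stand-in for the shared structure that tracks all heads. */
struct demo_ref_root {
	struct demo_ref_head *heads[DEMO_MAX_HEADS];
	size_t nr;
};

/* Step 1: pure initialization, touches no shared state. */
static void demo_init_ref_head(struct demo_ref_head *head, uint64_t bytenr,
			       uint64_t num_bytes, int ref_mod)
{
	head->bytenr = bytenr;
	head->num_bytes = num_bytes;
	head->ref_mod = ref_mod;
}

/*
 * Step 2: add an already initialized head, or merge it into an existing
 * head for the same bytenr. Returns the head that ends up being tracked,
 * or NULL if the demo table is full.
 */
static struct demo_ref_head *demo_add_ref_head(struct demo_ref_root *root,
					       struct demo_ref_head *head)
{
	for (size_t i = 0; i < root->nr; i++) {
		if (root->heads[i]->bytenr == head->bytenr) {
			root->heads[i]->ref_mod += head->ref_mod;
			return root->heads[i];
		}
	}
	if (root->nr == DEMO_MAX_HEADS)
		return NULL;
	root->heads[root->nr++] = head;
	return head;
}
```

Because the second function no longer needs the values used to fill in the head, its parameter list shrinks to the container and the head itself, which mirrors the argument reduction described in the commit message.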
2018-05-28  btrfs: Use init_delayed_ref_head in add_delayed_ref_head  Nikolay Borisov
Use the newly introduced function when initialising the head_ref in add_delayed_ref_head. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Introduce init_delayed_ref_head  Nikolay Borisov
add_delayed_ref_head implements the logic to both initialize a head_ref structure as well as perform the necessary operations to add it to the delayed ref machinery. This has resulted in a very cumbersome interface with loads of parameters and code, which at first glance, looks very unwieldy. Begin untangling it by first extracting the initialization-only code into its own function. It's more or less a verbatim copy of the first part of add_delayed_ref_head. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Open-code add_delayed_data_ref  Nikolay Borisov
Now that the initialization part and the critical section code have been split it's a lot easier to open code add_delayed_data_ref. Do so in the following manner: 1. The common init function is put immediately after memory-to-be-initialized is allocated, followed by the specific data ref initialization. 2. The only piece of code that remains in the critical section is insert_delayed_ref call. 3. Tracing and memory freeing code is moved outside of the critical section. No functional changes, just an overall shorter critical section. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Open-code add_delayed_tree_ref  Nikolay Borisov
Now that the initialization part and the critical section code have been split it's a lot easier to open code add_delayed_tree_ref. Do so in the following manner: 1. The common init code is put immediately after memory-to-be-initialized is allocated, followed by the ref-specific member initialization. 2. The only piece of code that remains in the critical section is insert_delayed_ref call. 3. Tracing and memory freeing code is put outside of the critical section as well. The only real change here is an overall shorter critical section when dealing with delayed tree refs. From functional point of view - the code is unchanged. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Use init_delayed_ref_common in add_delayed_data_ref  Nikolay Borisov
Use the newly introduced helper and remove the duplicate code. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Use init_delayed_ref_common in add_delayed_tree_ref  Nikolay Borisov
Use the newly introduced common helper. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Factor out common delayed refs init code  Nikolay Borisov
The majority of the init code for struct btrfs_delayed_ref_node is duplicated in add_delayed_data_ref and add_delayed_tree_ref. Factor out the common bits in init_delayed_ref_common. This function is going to be used in future patches to clean that up. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: return original error code when failing from option parsing  Chengguang Xu
It's not good to overwrite -ENOMEM using -EINVAL when failing from mount option parsing, so just return original error code. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
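A small sketch of the difference this makes for callers. The parser, the accepted option strings and the `simulate_oom` switch are all hypothetical; the point is only that the caller forwards whatever error the parser produced instead of replacing it with -EINVAL.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical option parser: returns 0 on success, -ENOMEM if the working
 * copy cannot be allocated, -EINVAL for an option it does not know.
 * simulate_oom forces the allocation failure path for demonstration.
 */
static int demo_parse_options(const char *opts, int simulate_oom)
{
	char *copy;
	int ret = 0;

	if (simulate_oom)
		return -ENOMEM;

	copy = malloc(strlen(opts) + 1);
	if (!copy)
		return -ENOMEM;
	strcpy(copy, opts);

	if (strcmp(copy, "ro") != 0 && strcmp(copy, "rw") != 0)
		ret = -EINVAL;

	free(copy);
	return ret;
}

/* Caller: propagate the parser's error code unchanged. */
static int demo_mount(const char *opts, int simulate_oom)
{
	int ret = demo_parse_options(opts, simulate_oom);

	if (ret)
		return ret;	/* not a blanket -EINVAL */
	return 0;
}
```

With the blanket -EINVAL, an out-of-memory condition would be reported to the user as a bad mount option; forwarding the code keeps the two failure modes distinguishable.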
2018-05-28  btrfs: remove redundant btrfs_balance_control::fs_info  David Sterba
The fs_info is always available from the context so we don't need to store it in the structure. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: qgroup: Allow trace_btrfs_qgroup_account_extent() to record its transid  Qu Wenruo
When debugging a quota rescan race, sometimes btrfs rescan could account some old (committed) leaf and then re-account the newly committed leaf in the next generation. This race needs the extra transid to locate, so add @transid to trace_btrfs_qgroup_account_extent() for such debugging. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: send: fix spelling mistake: "send_in_progres" -> "send_in_progress"  Colin Ian King
Trivial fix to spelling mistake of function name in btrfs_err message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove devid parameter from btrfs_rmap_block  Nikolay Borisov
This function is used in only one place and devid argument is always passed 0. So just remove it, similarly to how it was removed in the userspace code. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: trace: Allow trace_qgroup_update_counters() to record old rfer/excl value  Qu Wenruo
The original trace_qgroup_update_counters() only records the qgroup id and its reference count change. That is good enough to debug qgroup accounting changes, but when a rescan race is involved, it's pretty hard to distinguish which modification belongs to which rescan. So add old_rfer and old_excl trace output to help distinguish different rescan instances. (Each rescan instance should reset its qgroup->rfer to 0.) For the trace event parameters, it just changes from u64 qgroup_id to struct btrfs_qgroup *qgroup, so the number of parameters is not changed at all. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Unexport btrfs_alloc_delalloc_work  Nikolay Borisov
It's used only in inode.c so it makes no sense to have it exported. Also move the definition of btrfs_delalloc_work to inode.c since it's used only in this file. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove delayed_iput member from btrfs_delalloc_work  Nikolay Borisov
When allocating a delalloc work we are always setting delay_iput to 0. So remove the delay_iput member of btrfs_delalloc_work, and as a result also remove it as a parameter from btrfs_alloc_delalloc_work since it's not used anymore. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove delay_iput parameter from __start_delalloc_inodes  Nikolay Borisov
It's always set to 0 so remove it. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> [ rename to start_delalloc_inodes ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove delayed_iput parameter from btrfs_start_delalloc_inodes  Nikolay Borisov
It's always set to 0, so just remove it and collapse the constant value to the only function we are passing it. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: Remove delayed_iput parameter of btrfs_start_delalloc_roots  Nikolay Borisov
This parameter was introduced alongside the function in eb73c1b7cea7 ("Btrfs: introduce per-subvolume delalloc inode list") to avoid deadlocks since this function was used in the transaction commit path. However, commit 8d875f95da43 ("btrfs: disable strict file flushes for renames and truncates") removed that usage, rendering the parameter obsolete. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: do reverse path readahead in btrfs_shrink_device  Gu Jinxiang
In btrfs_shrink_device, before btrfs_search_slot, path->reada is set to READA_FORWARD. But I think READA_BACK is correct, since: 1. key.offset is set to (u64)-1 2. after btrfs_search_slot, btrfs_previous_item is called. So, to read ahead the previous items, READA_BACK is the correct one. Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: trace: Add trace points for unused block groups  Qu Wenruo
This patch adds the following trace events:
1) btrfs_remove_block_group: for the btrfs_remove_block_group() function. Triggered when a block group is really removed.
2) btrfs_add_unused_block_group: triggered when a block group is added to the unused_bgs list.
3) btrfs_skip_unused_block_group: triggered when an unused block group is not deleted.
These trace events are pretty handy for debugging cases related to block group auto removal.
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: trace: Remove unnecessary fs_info parameter for btrfs__reserve_extent event class  Qu Wenruo
fs_info can be extracted from btrfs_block_group_cache, and all btrfs_block_group_cache is created by btrfs_create_block_group_cache() with fs_info initialized, no need to worry about NULL pointer dereference. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: remove unused fs_info parameter  Gu Jinxiang
Since the commit c6100a4b4e3d ("Btrfs: replace tree->mapping with tree->private_data"), parameter fs_info in alloc_reloc_control is not used. So remove it. Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: move btrfs_raid_mindev_error values to btrfs_raid_attr table  Anand Jain
Add a new member struct btrfs_raid_attr::mindev_error so that btrfs_raid_array can maintain the error code to return if the minimum number of devices condition is not met while trying to delete a device in the given raid. And so we can drop btrfs_raid_mindev_error. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: move btrfs_raid_group values to btrfs_raid_attr table  Anand Jain
Add a new member struct btrfs_raid_attr::bg_flag so that btrfs_raid_array can maintain the bit map flag of the raid type, and so we can drop btrfs_raid_group. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28  btrfs: move btrfs_raid_type_names values to btrfs_raid_attr table  Anand Jain
Add a new member struct btrfs_raid_attr::raid_name so that btrfs_raid_array can maintain the name of the raid type, and so we can drop btrfs_raid_type_names. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
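Taken together, the three btrfs_raid_attr commits fold several parallel arrays into one attribute table indexed by raid type. A minimal sketch of that layout follows; the member names raid_name, bg_flag and mindev_error come from the commit messages, while the `demo_` enum, the flag bit and the error value are placeholders and not the kernel's actual constants.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of raid types, used as table indexes. */
enum demo_raid_type {
	DEMO_RAID_RAID1,
	DEMO_RAID_SINGLE,
	DEMO_NR_RAID_TYPES
};

/* Placeholder values for illustration only. */
#define DEMO_BG_RAID1              (1ULL << 4)
#define DEMO_ERR_RAID1_MIN_NOT_MET 1

/* One row per raid type instead of one array per property. */
struct demo_raid_attr {
	const char *raid_name;	/* printable name of the profile */
	uint64_t bg_flag;	/* block group flag bit of the profile */
	int mindev_error;	/* error if the minimum device count is not met */
};

static const struct demo_raid_attr demo_raid_array[DEMO_NR_RAID_TYPES] = {
	[DEMO_RAID_RAID1] = {
		.raid_name = "raid1",
		.bg_flag = DEMO_BG_RAID1,
		.mindev_error = DEMO_ERR_RAID1_MIN_NOT_MET,
	},
	[DEMO_RAID_SINGLE] = {
		.raid_name = "single",
		.bg_flag = 0,
		.mindev_error = 0,
	},
};

/* Lookup helper: every property of a type now comes from the same row. */
static const char *demo_raid_name(enum demo_raid_type type)
{
	if (type >= DEMO_NR_RAID_TYPES)
		return NULL;
	return demo_raid_array[type].raid_name;
}
```

Keeping all properties of a profile in one row means a new profile is added in a single place, and the separate lookup arrays can be dropped as the commits describe.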