summaryrefslogtreecommitdiff
path: root/fs/xfs/scrub
AgeCommit message (Collapse)Author
2018-06-08xfs: move various type verifiers to common fileDave Chinner
New verification functions like xfs_verify_fsbno() and xfs_verify_agino() are spread across multiple files and different header files. They really don't fit cleanly into the places they've been put, and have wider scope than the current header includes. Move the type verifiers to a new file in libxfs (xfs-types.c) and the prototypes to xfs_types.h where they will be visible to all the code that uses the types. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-06xfs: convert to SPDX license tagsDave Chinner
Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/ This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command: for f in `git grep -l "GNU General" fs/xfs/` ; do echo $f cat $f | awk -f hdr.awk > $f.new mv -f $f.new $f done And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows: $ cat hdr.awk BEGIN { hdr = 1.0 tag = "GPL-2.0" str = "" } /^ \* This program is free software/ { hdr = 2.0; next } /any later version./ { tag = "GPL-2.0+" next } /^ \*\// { if (hdr > 0.0) { print "// SPDX-License-Identifier: " tag print str print $0 str="" hdr = 0.0 next } print $0 next } /^ \* / { if (hdr > 1.0) next if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 next } /^ \*/ { if (hdr > 0.0) next print $0 next } // { if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 } END { } $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01xfs: fix xfs_rtalloc_rec unitsDarrick J. Wong
All the realtime allocation functions deal with space on the rtdev in units of realtime extents. However, struct xfs_rtalloc_rec confusingly uses the word 'block' in the name, even though they're really extents. Fix the naming problem and fix all the unit handling problems in the two existing users. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2018-05-30xfs: repair superblocksDarrick J. Wong
If one of the backup superblocks is found to differ seriously from superblock 0, write out a fresh copy from the in-core sb. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30xfs: add helpers to attach quotas to inodesDarrick J. Wong
Add a helper routine to attach quota information to inodes that are about to undergo repair. If that fails, we need to schedule a quotacheck for the next mount but allow the corrupted metadata repair to continue. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30xfs: recover AG btree roots from rmap dataDarrick J. Wong
Add a helper function to help us recover btree roots from the rmap data. Callers pass in a list of rmap owner codes, buffer ops, and magic numbers. We iterate the rmap records looking for owner matches, and then read the matching blocks to see if the magic number & uuid match. If so, we then read-verify the block, and if that passes then we retain a pointer to the block with the highest level, assuming that by the end of the call we will have found the root. This will be used to reset the AGF/AGI btree root fields during their rebuild procedures. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30xfs: add helpers to dispose of old btree blocks after a repairDarrick J. Wong
Now that we've plumbed in the ability to construct a list of dead btree blocks following a repair, add more helpers to dispose of them. This is done by examining the rmapbt -- if the btree was the only owner we can free the block, otherwise it's crosslinked and we can only remove the rmapbt record. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30xfs: add helpers to collect and sift btree block pointers during repairDarrick J. Wong
Add some helpers to assemble a list of fs block extents. Generally, repair functions will iterate the rmapbt to make a list (1) of all extents owned by the nominal owner of the metadata structure; then they will iterate all other structures with the same rmap owner to make a list (2) of active blocks; and finally we have a subtraction function to subtract all the blocks in (2) from (1), with the result that (1) is now a list of blocks that were owned by the old btree and must be disposed. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30xfs: add helpers to allocate and initialize fresh btree rootsDarrick J. Wong
Add a pair of helper functions to allocate and initialize fresh btree roots. The repair functions will use these as part of recreating corrupted metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2018-05-30xfs: add helpers to deal with transaction allocation and rollingDarrick J. Wong
For repairs, we need to reserve at least as many blocks as we think we're going to need to rebuild the data structure, and we're going to need some helpers to roll transactions while maintaining locks on the AG headers so that other threads cannot wander into the middle of a repair. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2018-05-30xfs: grab the per-ag structure whenever relevantDarrick J. Wong
Grab and hold the per-AG data across a scrub run whenever relevant. This helps us avoid repeated trips through rcu and the radix tree in the repair code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15xfs: implement the metadata repair ioctl flagDarrick J. Wong
Plumb in the pieces necessary to make the "scrub" subfunction of the scrub ioctl actually work. This means that we make the IFLAG_REPAIR flag to the scrub ioctl actually do something, and we add an errortag knob so that xfstests can force the kernel to rebuild a metadata structure even if there's nothing wrong with it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15xfs: create tracepoints for online repairDarrick J. Wong
These tracepoints will be used to debug the online repair routines. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15xfs: hoist xfs_scrub_agfl_walk to libxfs as xfs_agfl_walkDarrick J. Wong
This function is basically a generic AGFL block iterator, so promote it to libxfs ahead of online repair wanting to use it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: avoid ABBA deadlock when scrubbing parent pointersDarrick J. Wong
In normal operation, the XFS convention is to take an inode's iolock and then allocate a transaction. However, when scrubbing parent inodes this is inverted -- we allocated the transaction to do the scrub, and now we're trying to grab the parent's iolock. This can lead to ABBA deadlocks: some thread grabbed the parent's iolock and is waiting for space for a transaction while our parent scrubber is sitting on a transaction trying to get the parent's iolock. Therefore, convert all iolock attempts to use trylock; if that fails, they can use the existing mechanisms to back off and try again. The ABBA deadlock didn't happen with a non-repair scrub because the transactions don't reserve any space, but repair scrubs require reservation in order to update metadata. However, any other concurrent metadata update (e.g. directory create in the parent) could also induce this deadlock with the parent scrubber. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: scrub the data fork of the realtime inodesDarrick J. Wong
The realtime bitmap and summary inodes live on the metadata device, so we can scrub their data forks with the regular scrubbers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: quota scrub should use bmapbtd scrubberDarrick J. Wong
Replace the quota scrubber's open-coded data fork scrubber with a redirected call to the bmapbtd scrubber. This strengthens the quota scrub to include all the cross-referencing that it does. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: don't continue scrub if already corruptDarrick J. Wong
If we've already decided that something is corrupt, we might as well abort all the loops and exit as quickly as possible. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: superblock scrub should use short-lived buffersDarrick J. Wong
Secondary superblocks are rarely used, so create a helper to read a given non-primary AG's superblock and ensure that it won't stick around hogging memory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: skip scrub xref if corruption already notedDarrick J. Wong
Don't bother looking for cross-referencing problems if the metadata is already corrupt or we've already found a cross-referencing problem. Since we added a helper function for flags testing, convert existing users to use it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: refactor scrub transaction allocation functionDarrick J. Wong
Since the transaction allocation helper is about to become more complex, move it to common.c and remove the redundant parameters. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: btree scrub should check minrecsDarrick J. Wong
Strengthen the btree block header checks to detect the number of records being less than the btree type's minimum record count. Certain blocks are allowed to violate this constraint -- specifically any btree block at the top of the tree can have fewer than minrecs records. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: clean up scrub usage of KM_NOFSDarrick J. Wong
All scrub code runs in transaction context, which means that memory allocations are automatically run in PF_MEMALLOC_NOFS context. It's therefore unnecessary to pass in KM_NOFS to allocation routines, so clean them all out. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: avoid ilock games in the quota scrubberDarrick J. Wong
Refactor the quota scrubber to take the quotaofflock and grab the quota inode in the setup function so that we can treat quota in the same "scrub in the context of this inode" (i.e. sc->ip) manner as we treat any other inode. We do have to drop the quota inode's ILOCK_EXCL to use dqiterate, but since dquots have their own individual locks the ILOCK wasn't helping us anyway. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15xfs: refactor dquot iterationDarrick J. Wong
Create a helper function to iterate all the dquots of a given type in the system, and refactor the dquot scrub to use it. This will get more use in the quota repair code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-05-10xfs: refactor XFS_QMOPT_DQNEXT out of existenceDarrick J. Wong
There's only one caller of DQNEXT and its semantics can be moved into a separate function, so create the function and get rid of the flag. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-03-23xfs: xfs_scrub_iallocbt_xref_rmap_inodes should use xref_set_corruptDarrick J. Wong
In xfs_scrub_iallocbt_xref_rmap_inodes we're checking inodes against rmap records, so we should use xfs_scrub_btree_xref_set_corrupt if we encounter discrepancies here so that we know that it's a cross referencing error, not necessarily a corruption in the inobt itself. The userspace xfs_scrub program will try to repair outright corruptions in the agi/inobt prior to phase 3 so that the inode scan will proceed. If only a cross-referencing error is noted, the repair program defers the repair attempt until it can check the other space metadata at least once. It is therefore essential that the inobt scrubber can correctly distinguish between corruptions and "unable to cross-reference something else with this inobt". The same reasoning applies to "xfs: record inode buf errors as a xref error in inobt scrubber". Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-03-23xfs: flag inode corruption if parent ptr doesn't get us a real inodeDarrick J. Wong
If a directory's parent inode pointer doesn't point to an inode, the directory should be flagged as corrupt. Enable IGET_UNTRUSTED here so that _iget will return -EINVAL if the inobt does not confirm that the inode is present and allocated and we can flag the directory corruption. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-03-23xfs: move inode extent size hint validation to libxfsDarrick J. Wong
Extent size hint validation is used by scrub to decide if there's an error, and it will be used by repair to decide to remove the hint. Since these use the same validation functions, move them to libxfs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-03-23xfs: record inode buf errors as a xref error in inobt scrubberDarrick J. Wong
During the inode btree scrubs we try to confirm the freemask bits against the inode records. If the inode buffer read fails, this is a cross-referencing error, not a corruption of the inode btree itself. Use the xref_process_error call here. Found via core.version middlebit fuzz in xfs/415. The userspace xfs_scrub program will try to repair outright corruptions in the agi/inobt prior to phase 3 so that the inode scan will proceed. If only a cross-referencing error is noted, the repair program defers the repair attempt until it can check the other space metadata at least once. It is therefore essential that the inobt scrubber can correctly distinguish between corruptions and "unable to cross-reference something else with this inobt". Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-03-23xfs: remove xfs_buf parameter from inode scrub methodsDarrick J. Wong
Now that we no longer do raw inode buffer scrubbing, the bp parameter is no longer used anywhere we're dealing with an inode, so remove it and all the useless NULL parameters that go with it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-03-23xfs: inode scrubber shouldn't bother with raw checksDarrick J. Wong
The inode scrubber tries to _iget the inode prior to running checks. If that _iget call fails with corruption errors that's an automatic fail, regardless of whether it was the inode buffer read verifier, the ifork verifier, or the ifork formatter that errored out. Therefore, get rid of the raw mode scrub code because it's not needed. Found by trying to fix some test failures in xfs/379 and xfs/415. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-03-23xfs: bmap scrubber should do rmap xref with bmap for sparse filesDarrick J. Wong
When we're scanning an extent mapping inode fork, ensure that every rmap record for this ifork has a corresponding bmbt record too. This (mostly) provides the ability to cross-reference rmap records with bmap data. The rmap scrubber cannot do the xref on its own because that requires taking an ilock with the agf lock held, which violates our locking order rules (inode, then agf). Note that we only do this for forks that are in btree format due to the increased complexity; or forks that should have data but suspiciously have zero extents because the inode could have just had its iforks zapped by the inode repair code and now we need to reclaim the old extents. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-03-14xfs: merge _xfs_log_force and xfs_log_forceChristoph Hellwig
Switch to a single interface for flushing the whole log, which gives consistent trace point coverage, and removes the unused log_flushed argument for the previous _xfs_log_force callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-03-11xfs: convert XFS_AGFL_SIZE to a helper functionDave Chinner
The AGFL size calculation is about to get more complex, so lets turn the macro into a function first and remove the macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> [darrick: forward port to newer kernel, simplify the helper] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-02-22xfs: use memset to initialize xfs_scrub_agfl_infoEric Sandeen
Apparently different gcc versions have competing and incompatible notions of how to initialize at declaration, so just give up and fall back to the time-tested memset(). Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-01-29xfs: don't clobber inobt/finobt cursors when xref with rmapDarrick J. Wong
Even if we can't use the inobt/finobt cursors to count the number of inode btree blocks, we are never allowed to clobber the cursor of the btree being checked, so don't do this. Found by fuzzing level = ones in xfs/364. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-01-29xfs: make tracepoint inode number format consistentDarrick J. Wong
Fix all the inode number formats to be consistently (0x%llx) in all trace point definitions. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-01-17xfs: check that br_blockcount doesn't overflowDarrick J. Wong
xfs_bmbt_irec.br_blockcount is declared as xfs_filblks_t, which is an unsigned 64-bit integer. Though the bmbt helpers will never set a value larger than 2^21 (since the underlying on-disk extent record has a length field that is only 21 bits wide), we should be a little defensive about checking that a bmbt record doesn't exceed what we're expecting or overflow into the next AG. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: directory scrubber must walk through data block to offsetDarrick J. Wong
In xfs_scrub_dir_rec, we must walk through the directory block entries to arrive at the offset given by the hash structure. If we blindly trust the hash address, we can end up midway into a directory entry and stray outside the block. Found by lastbit fuzzing lents[3].address in xfs/390 with KASAN enabled. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: don't iunlock unlocked inodesDarrick J. Wong
Don't iunlock an unlocked inode, which can happen if the parent pointer scrubber bails out with sc->ip unlocked while trying to grab the parent directory inode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-01-17xfs: scrub in-core metadataDarrick J. Wong
Whenever we load a buffer, explicitly re-call the structure verifier to ensure that memory isn't corrupting things. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference the block mappings when possibleDarrick J. Wong
Use an inode's block mappings to cross-reference inode block counters. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference the realtime bitmapDarrick J. Wong
While we're scrubbing various btrees, cross-reference the records with the other metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference refcount btree during scrubDarrick J. Wong
During metadata btree scrub, we should cross-reference with the reference counts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference the rmapbt data with the refcountbtDarrick J. Wong
Cross reference the refcount data with the rmap data to check that the number of rmaps for a given block match the refcount of that block, and that CoW blocks (which are owned entirely by the refcountbt) are tracked as well. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference reverse-mapping btreeDarrick J. Wong
When scrubbing various btrees, we should cross-reference the records with the reverse mapping btree and ensure that traversing the btree finds the same number of blocks that the rmapbt thinks are owned by that btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference inode btrees during scrubDarrick J. Wong
Cross-reference the inode btrees with the other metadata when we scrub the filesystem. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference bnobt records with cntbtDarrick J. Wong
Scrub should make sure that each bnobt record has a corresponding cntbt record. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-01-17xfs: cross-reference with the bnobtDarrick J. Wong
When we're scrubbing various btrees, cross-reference the records with the bnobt to ensure that we don't also think the space is free. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>