summaryrefslogtreecommitdiff
path: root/fs/xfs/xfs_log.c
AgeCommit message (Collapse)Author
2016-03-07xfs: use named array initializers for log item dumpingDarrick J. Wong
Use named array initializers for the string arrays used to dump log items, rather than depending on the order being maintained correctly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-10xfs: remove XFS_BUF_ZEROFLAGS macroDave Chinner
The places where we use this macro already clear unnecessary IO flags (e.g. through xfs_bwrite()) or never have unexpected IO flags set on them in the first place (e.g. iclog buffers). Remove the macro from these locations, and where necessary clear only the specific flags that are conditional in the current buffer context. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-10xfs: remove XBF_WRITE flag wrapper macrosDave Chinner
They only set/clear/check a flag, no need for obfuscating this with a macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-10xfs: remove XBF_ASYNC flag wrapper macrosDave Chinner
They only set/clear/check a flag, no need for obfuscating this with a macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-10xfs: remove XBF_DONE flag wrapper macrosDave Chinner
They only set/clear/check a flag, no need for obfuscating this with a macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-01-05Merge branch 'xfs-misc-fixes-for-4.5' into for-nextDave Chinner
2016-01-05xfs: debug mode log record crc error injectionBrian Foster
XFS now uses CRC verification over a limited section of the log to detect torn writes prior to a crash. This is difficult to test directly due to the timing and hardware requirements to cause a short write. Add a mechanism to inject CRC errors into log records to facilitate testing torn write detection during log recovery. This mechanism is dangerous and can result in filesystem corruption. Thus, it is only available in DEBUG mode for testing/development purposes. Set a non-zero value to the following sysfs entry to enable error injection: /sys/fs/xfs/<dev>/log/log_badcrc_factor Once enabled, XFS intentionally writes an invalid CRC to a log record at some random point in the future based on the provided frequency. The filesystem immediately shuts down once the record has been written to the physical log to prevent metadata writeback (e.g., AIL insertion) once the log write completes. This helps reasonably simulate a torn write to the log as the affected record must be safe to discard. The next mount after the intentional shutdown requires log recovery and should detect and recover from the torn write. Note again that this _will_ result in data loss or worse. For testing and development purposes only! Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-01-04xfs: fix log ticket type printingDarrick J. Wong
Update the log ticket reservation type printing code to reflect all the types of log tickets, to avoid incorrect debug output and avoid running off the end of the array. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-10-12Merge branch 'xfs-logging-fixes' into for-nextDave Chinner
2015-10-12xfs: per-filesystem stats counter implementationBill O'Donnell
This patch modifies the stats counting macros and the callers to those macros to properly increment, decrement, and add-to the xfs stats counts. The counts for global and per-fs stats are correctly advanced, and cleared by writing a "1" to the corresponding clear file. global counts: /sys/fs/xfs/stats/stats per-fs counts: /sys/fs/xfs/sda*/stats/stats global clear: /sys/fs/xfs/stats/stats_clear per-fs clear: /sys/fs/xfs/sda*/stats/stats_clear [dchinner: cleaned up macro variables, removed CONFIG_FS_PROC around stats structures and macros. ] Signed-off-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-10-12xfs: avoid null *src in memcpy call in xlog_writeEric Sandeen
The gcc undefined behavior sanitizer caught this; surely any sane memcpy implementation will no-op if size == 0, but behavior with a *src of NULL is technically undefined (declared nonnull), so avoid it here. We are actually in this situation frequently via xlog_commit_record(), because: struct xfs_log_iovec reg = { .i_addr = NULL, .i_len = 0, .i_type = XLOG_REG_TYPE_COMMIT, }; Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-10-12xfs: validate metadata LSNs against log on v5 superblocksBrian Foster
Since the onset of v5 superblocks, the LSN of the last modification has been included in a variety of on-disk data structures. This LSN is used to provide log recovery ordering guarantees (e.g., to ensure an older log recovery item is not replayed over a newer target data structure). While this works correctly from the point a filesystem is formatted and mounted, userspace tools have some problematic behaviors that defeat this mechanism. For example, xfs_repair historically zeroes out the log unconditionally (regardless of whether corruption is detected). If this occurs, the LSN of the filesystem is reset and the log is now in a problematic state with respect to on-disk metadata structures that might have a larger LSN. Until either the log catches up to the highest previously used metadata LSN or each affected data structure is modified and written out without incident (which resets the metadata LSN), log recovery is susceptible to filesystem corruption. This problem is ultimately addressed and repaired in the associated userspace tools. The kernel is still responsible to detect the problem and notify the user that something is wrong. Check the superblock LSN at mount time and fail the mount if it is invalid. From that point on, trigger verifier failure on any metadata I/O where an invalid LSN is detected. This results in a filesystem shutdown and guarantees that we do not log metadata changes with invalid LSNs on disk. Since this is a known issue with a known recovery path, present a warning to instruct the user how to recover. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-08-19Merge branch 'xfs-efi-rework' into for-nextDave Chinner
2015-08-19xfs: checksum log record ext headers based on record sizeBrian Foster
The first 4 bytes of every basic block in the physical log is stamped with the current lsn. To support this mechanism, the log record header (first block of each new log record) contains space for the original first byte of each log record block before it is replaced with the lsn. The log record header has space for 32k worth of blocks. The version 2 log adds new extended record headers for each additional 32k worth of blocks beyond what is supported by the record header. The log record checksum incorporates the log record header, the extended headers and the record payload. xlog_cksum() checksums the extended headers based on log->l_iclog_heads, which specifies the number of extended headers in a log record based on the log buffer size mount option. The log buffer size is variable, however, and thus means the checksum can be calculated differently based on how a filesystem is mounted. This is problematic if a filesystem crashes and recovery occurs on a subsequent mount using a different log buffer size. For example, crash an active filesystem that is mounted with the default (32k) logbsize, attempt remount/recovery using '-o logbsize=64k' and the mount fails on or warns about log checksum failures. To avoid this problem, update xlog_cksum() to calculate the checksum based on the size of the log buffer according to the log record. The size is already included in the h_size field of the log record header and thus is available at log recovery time. Extended log record headers are also only written when the log record is large enough to require them. This makes checksum calculation of log records consistent with the extended record header mechanism as well as how on-disk records are checksummed with various log buffer size mount options. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-08-19xfs: don't leave EFIs on AIL on mount failureBrian Foster
Log recovery occurs in two phases at mount time. In the first phase, EFIs and EFDs are processed and potentially cancelled out. EFIs without EFD objects are inserted into the AIL for processing and recovery in the second phase. xfs_mountfs() runs various other operations between the phases and is thus subject to failure. If failure occurs after the first phase but before the second, pending EFIs sit on the AIL, pin it and cause the mount to hang. Update the mount sequence to ensure that pending EFIs are cancelled in the event of failure. Add a recovery cancellation mechanism to iterate the AIL and cancel all EFI items when requested. Plumb cancellation support through the log mount finish helper and update xfs_mountfs() to invoke cancellation in the event of failure after recovery has started. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-07-29xfs: Use consistent logging message prefixesJoe Perches
The second and subsequent lines of multi-line logging messages are not prefixed with the same information as the first line. Separate messages with newlines into multiple calls to ensure consistent prefixing and allow easier grep use. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-23Merge branch 'xfs-misc-fixes-for-4.2-3' into for-nextDave Chinner
2015-06-22xfs: remove xfs_caddr_tChristoph Hellwig
Just use char pointers directly instead of the confusing typedef to a pointer type. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-22xfs: use void pointers in log validation helpersChristoph Hellwig
Compared to char pointers this saves us a lot of casting effort. Also add another local variable to make the code easier to read. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-22xfs: remove __psint_t and __psunsigned_tChristoph Hellwig
Replace uses of __psint_t with the proper uintptr_t and ptrdiff_t types, and remove the defintions of __psint_t and __psunsigned_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-04xfs: fix xfs_log_done interfaceChristoph Hellwig
Instead of the confusing flags argument pass a boolean flag to indicate if we want to release or regrant a log reservation. Also ensure that xfs_log_done always drop the reference on the log ticket, to both simplify the code and make the logic in xfs_trans_roll easier to understand. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22Merge branch 'xfs-sb-logging-rework' into for-nextDave Chinner
Conflicts: fs/xfs/xfs_mount.c
2015-01-22xfs: consolidate superblock logging functionsDave Chinner
We now have several superblock loggin functions that are identical except for the transaction reservation and whether it shoul dbe a synchronous transaction or not. Consolidate these all into a single function, a single reserveration and a sync flag and call it xfs_sync_sb(). Also, xfs_mod_sb() is not really a modification function - it's the operation of logging the superblock buffer. hence change the name of it to reflect this. Note that we have to change the mp->m_update_flags that are passed around at mount time to a boolean simply to indicate a superblock update is needed. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-24xfs: remove extra newlines from xfs messagesEric Sandeen
xfs_warn() and friends add a newline by default, but some messages add another one. Particularly for the failing write message below, this can waste a lot of console real estate! Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-24xfs: initialize log buf I/O completion wq on log allocBrian Foster
Log buffer I/O completion passes through the high priority m_log_workqueue rather than the default metadata buffer workqueue. The log buffer wq is initialized at I/O submission time. The log buffers are reused once initialized, however, so this is not necessary. Initialize the log buffer I/O completion workqueue pointers once when the log is allocated and log buffers initialized rather than on every log buffer I/O submission. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-04Merge branch 'xfs-misc-fixes-for-3.19-2' into for-nextDave Chinner
Conflicts: fs/xfs/xfs_iops.c
2014-12-04xfs: split metadata and log buffer completion to separate workqueuesBrian Foster
XFS traditionally sends all buffer I/O completion work to a single workqueue. This includes metadata buffer completion and log buffer completion. The log buffer completion requires a high priority queue to prevent stalls due to log forces getting stuck behind other queued work. Rather than continue to prioritize all buffer I/O completion due to the needs of log completion, split log buffer completion off to m_log_workqueue and move the high priority flag from m_buf_workqueue to m_log_workqueue. Add a b_ioend_wq wq pointer to xfs_buf to allow completion workqueue customization on a per-buffer basis. Initialize b_ioend_wq to m_buf_workqueue by default in the generic buffer I/O submission path. Finally, override the default wq with the high priority m_log_workqueue in the log buffer I/O submission path. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28Merge branch 'xfs-consolidate-format-defs' into for-nextDave Chinner
2014-11-28xfs: move most of xfs_sb.h to xfs_format.hChristoph Hellwig
More on-disk format consolidation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28xfs: merge xfs_ag.h into xfs_format.hChristoph Hellwig
More on-disk format consolidation. A few declarations that weren't on-disk format related move into better suitable spots. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28xfs: allow lazy sb counter sync during filesystem freeze sequenceBrian Foster
The expectation since the introduction the lazy superblock counters is that the counters are synced and superblock logged appropriately as part of the filesystem freeze sequence. This does not occur, however, due to the logic in xfs_fs_writable() that prevents progress when the fs is in any state other than SB_UNFROZEN. While this is a bug, it has not been exposed to date because the last thing XFS does during freeze is dirty the log. The log recovery process recalculates the counters from AGI/AGF metadata to ensure everything is correct. Therefore should a crash occur while an fs is frozen, the subsequent log recovery puts everything back in order. See the following commit for reference: 92821e2b [XFS] Lazy Superblock Counters We might not always want to rely on dirtying the log on a frozen fs. Modify xfs_log_sbcount() to proceed when the filesystem is freezing but not once the freeze process has completed. Modify xfs_fs_writable() to accept the minimum freeze level for which modifications should be blocked to support various codepaths. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-10-02xfs: introduce xfs_buf_submit[_wait]Dave Chinner
There is a lot of cookie-cutter code that looks like: if (shutdown) handle buffer error xfs_buf_iorequest(bp) error = xfs_buf_iowait(bp) if (error) handle buffer error spread through XFS. There's significant complexity now in xfs_buf_iorequest() to specifically handle this sort of synchronous IO pattern, but there's all sorts of nasty surprises in different error handling code dependent on who owns the buffer references and the locks. Pull this pattern into a single helper, where we can hide all the synchronous IO warts and hence make the error handling for all the callers much saner. This removes the need for a special extra reference to protect IO completion processing, as we can now hold a single reference across dispatch and waiting, simplifying the sync IO smeantics and error handling. In doing this, also rename xfs_buf_iorequest to xfs_buf_submit and make it explicitly handle on asynchronous IO. This forces all users to be switched specifically to one interface or the other and removes any ambiguity between how the interfaces are to be used. It also means that xfs_buf_iowait() goes away. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-10-02xfs: xfs_buf_ioend and xfs_buf_iodone_work duplicate functionalityDave Chinner
We do some work in xfs_buf_ioend, and some work in xfs_buf_iodone_work, but much of that functionality is the same. This work can all be done in a single function, leaving xfs_buf_iodone just a wrapper to determine if we should execute it by workqueue or directly. hence rename xfs_buf_iodone_work to xfs_buf_ioend(), and add a new xfs_buf_ioend_async() for places that need async processing. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-10-02xfs: force the log before shutting downDave Chinner
When we have marked the filesystem for shutdown, we want to prevent any further buffer IO from being submitted. However, we currently force the log after marking the filesystem as shut down, hence allowing IO to the log *after* we have marked both the filesystem and the log as in an error state. Clean this up by forcing the log before we mark the filesytem with an error. This replaces the pure CIL flush that we currently have which works around this same issue (i.e the CIL can't be flushed once the shutdown flags are set) and hence enables us to clean up the logic substantially. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-08-04Merge branch 'xfs-misc-fixes-3.17-2' into for-nextDave Chinner
2014-08-04xfs: catch buffers written without verifiers attachedDave Chinner
We recently had a bug where buffers were slipping through log recovery without any verifier attached to them. This was resulting in on-disk CRC mismatches for valid data. Add some warning code to catch this occurrence so that we catch such bugs during development rather than not being aware they exist. Note that we cannot do this verification unconditionally as non-CRC filesystems don't always attach verifiers to the buffers being written. e.g. during log recovery we cannot identify all the different types of buffers correctly on non-CRC filesystems, so we can't attach the correct verifiers in all cases and so we don't attach any. Hence we don't want on non-CRC filesystems to avoid spamming the logs with false indications. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-15xfs: add xlog sysfs kobject and attribute handlersBrian Foster
Embed a kobject into the xfs log data structure (xlog). This creates a 'log' subdirectory for every XFS mount instance in sysfs. The lifecycle of the log kobject is tied to the lifecycle of the log. Also define a set of generic attribute handlers associated with the log kobject in preparation for the addition of attributes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-25xfs: global error sign conversionDave Chinner
Convert all the errors the core XFs code to negative error signs like the rest of the kernel and remove all the sign conversion we do in the interface layers. Errors for conversion (and comparison) found via searches like: $ git grep " E" fs/xfs $ git grep "return E" fs/xfs $ git grep " E[A-Z].*;$" fs/xfs Negation points found via searches like: $ git grep "= -[a-z,A-Z]" fs/xfs $ git grep "return -[a-z,A-D,F-Z]" fs/xfs $ git grep " -[a-z].*;" fs/xfs [ with some bits I missed from Brian Foster ] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-22xfs: Nuke XFS_ERROR macroEric Sandeen
XFS_ERROR was designed long ago to trap return values, but it's not runtime configurable, it's not consistently used, and we can do similar error trapping with ftrace scripts and triggers from userspace. Just nuke XFS_ERROR and associated bits. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-10Merge branch 'xfs-misc-fixes-3-for-3.16' into for-nextDave Chinner
2014-06-06xfs: kill xfs_buf_geterror()Dave Chinner
Most of the callers are just calling ASSERT(!xfs_buf_geterror()) which means they are checking for bp->b_error == 0. If bp is null in this case, we will assert fail, and hence it's no different in result to oopsing because of a null bp. In some cases, errors have already been checked for or the function returning the buffer can't return a buffer with an error, so it's just a redundant assert. Either way, the assert can either be removed. The other two non-assert callers can just test for a buffer and error properly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-05-15Merge branch 'xfs-misc-fixes-1-for-3.16' into for-nextDave Chinner
2014-05-07xfs: don't sleep in xlog_cil_force_lsn on shutdownDave Chinner
Reports of a shutdown hang when fsyncing a directory have surfaced, such as this: [ 3663.394472] Call Trace: [ 3663.397199] [<ffffffff815f1889>] schedule+0x29/0x70 [ 3663.402743] [<ffffffffa01feda5>] xlog_cil_force_lsn+0x185/0x1a0 [xfs] [ 3663.416249] [<ffffffffa01fd3af>] _xfs_log_force_lsn+0x6f/0x2f0 [xfs] [ 3663.429271] [<ffffffffa01a339d>] xfs_dir_fsync+0x7d/0xe0 [xfs] [ 3663.435873] [<ffffffff811df8c5>] do_fsync+0x65/0xa0 [ 3663.441408] [<ffffffff811dfbc0>] SyS_fsync+0x10/0x20 [ 3663.447043] [<ffffffff815fc7d9>] system_call_fastpath+0x16/0x1b If we trigger a shutdown in xlog_cil_push() from xlog_write(), we will never wake waiters on the current push sequence number, so anything waiting in xlog_cil_force_lsn() for that push sequence number to come up will not get woken and hence stall the shutdown. Fix this by ensuring we call wake_up_all(&cil->xc_commit_wait) in the push abort handling, in the log shutdown code when waking all waiters, and adding a shutdown check in the sequence completion wait loops to ensure they abort when a wakeup due to a shutdown occurs. Reported-by: Boris Ranto <branto@redhat.com> Reported-by: Eric Sandeen <esandeen@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-05-05xfs: fully support v5 format filesystemsDave Chinner
We have had this code in the kernel for over a year now and have shaken all the known issues out of the code over the past few releases. It's now time to remove the experimental warnings during mount and fully support the new filesystem format in production systems. Remove the experimental warning, and add a version number to the initial "mounting filesystem" message to tell use what type of filesystem is being mounted. Also, remove the temporary inode cluster size output at mount time now we know that this code works fine. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-17xfs: unmount does not wait for shutdown during unmountDave Chinner
And interesting situation can occur if a log IO error occurs during the unmount of a filesystem. The cases reported have the same signature - the update of the superblock counters fails due to a log write IO error: XFS (dm-16): xfs_do_force_shutdown(0x2) called from line 1170 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa08a44a1 XFS (dm-16): Log I/O Error Detected. Shutting down filesystem XFS (dm-16): Unable to update superblock counters. Freespace may not be correct on next mount. XFS (dm-16): xfs_log_force: error 5 returned. XFS (¿-¿¿¿): Please umount the filesystem and rectify the problem(s) It can be seen that the last line of output contains a corrupt device name - this is because the log and xfs_mount structures have already been freed by the time this message is printed. A kernel oops closely follows. The issue is that the shutdown is occurring in a separate IO completion thread to the unmount. Once the shutdown processing has started and all the iclogs are marked with XLOG_STATE_IOERROR, the log shutdown code wakes anyone waiting on a log force so they can process the shutdown error. This wakes up the unmount code that is doing a synchronous transaction to update the superblock counters. The unmount path now sees all the iclogs are marked with XLOG_STATE_IOERROR and so never waits on them again, knowing that if it does, there will not be a wakeup trigger for it and we will hang the unmount if we do. Hence the unmount runs through all the remaining code and frees all the filesystem structures while the xlog_iodone() is still processing the shutdown. When the log shutdown processing completes, xfs_do_force_shutdown() emits the "Please umount the filesystem and rectify the problem(s)" message, and xlog_iodone() then aborts all the objects attached to the iclog. An iclog that has already been freed.... The real issue here is that there is no serialisation point between the log IO and the unmount. We have serialisations points for log writes, log forces, reservations, etc, but we don't actually have any code that wakes for log IO to fully complete. We do that for all other types of object, so why not iclogbufs? Well, it turns out that we can easily do this. We've got xfs_buf handles, and that's what everyone else uses for IO serialisation. i.e. bp->b_sema. So, lets hold iclogbufs locked over IO, and only release the lock in xlog_iodone() when we are finished with the buffer. That way before we tear down the iclog, we can lock and unlock the buffer to ensure IO completion has finished completely before we tear it down. Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Mike Snitzer <snitzer@redhat.com> Tested-by: Bob Mastors <bob.mastors@solidfire.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2013-11-06xfs: trace AIL manipulationsDave Chinner
I debugging a log tail issue on a RHEL6 kernel, I added these trace points to trace log items being added, moved and removed in the AIL and how that affected the log tail LSN that was written to the log. They were very helpful in that they immediately identified the cause of the problem being seen. Hence I'd like to always have them available for use. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-30xfs: fix possible NULL dereference in xlog_verify_iclogGeyslan G. Bem
In xlog_verify_iclog a debug check of the incore log buffers prints an error if icptr is null and then goes on to dereference the pointer regardless. Convert this to an assert so that the intention is clear. This was reported by Coverty. Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2013-10-23xfs: decouple inode and bmap btree header filesDave Chinner
Currently the xfs_inode.h header has a dependency on the definition of the BMAP btree records as the inode fork includes an array of xfs_bmbt_rec_host_t objects in it's definition. Move all the btree format definitions from xfs_btree.h, xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to xfs_format.h to continue the process of centralising the on-disk format definitions. With this done, the xfs inode definitions are no longer dependent on btree header files. The enables a massive culling of unnecessary includes, with close to 200 #include directives removed from the XFS kernel code base. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-23xfs: decouple log and transaction headersDave Chinner
xfs_trans.h has a dependency on xfs_log.h for a couple of structures. Most code that does transactions doesn't need to know anything about the log, but this dependency means that they have to include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header files and clean up the includes to be in dependency order. In doing this, remove the direct include of xfs_trans_reserve.h from xfs_trans.h so that we remove the dependency between xfs_trans.h and xfs_mount.h. Hence the xfs_trans.h include can be moved to the indicate the actual dependencies other header files have on it. Note that these are kernel only header files, so this does not translate to any userspace changes at all. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-23xfs: create a shared header file for format-related informationDave Chinner
All of the buffer operations structures are needed to be exported for xfs_db, so move them all to a common location rather than spreading them all over the place. They are verifying the on-disk format, so while xfs_format.h might be a good place, it is not part of the on disk format. Hence we need to create a new header file that we centralise these related definitions. Start by moving the bffer operations structures, and then also move all the other definitions that have crept into xfs_log_format.h and xfs_format.h as there was no other shared header file to put them in. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>