summaryrefslogtreecommitdiff
path: root/fs/xfs/xfs_iomap.h
AgeCommit message (Collapse)Author
2017-09-26xfs: update i_size after unwritten conversion in dio completionEryu Guan
Since commit d531d91d6990 ("xfs: always use unwritten extents for direct I/O writes"), we start allocating unwritten extents for all direct writes to allow appending aio in XFS. But for dio writes that could extend file size we update the in-core inode size first, then convert the unwritten extents to real allocations at dio completion time in xfs_dio_write_end_io(). Thus a racing direct read could see the new i_size and find the unwritten extents first and read zeros instead of actual data, if the direct writer also takes a shared iolock. Fix it by updating the in-core inode size after the unwritten extent conversion. To do this, introduce a new boolean argument to xfs_iomap_write_unwritten() to tell if we want to update in-core i_size or not. Suggested-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eryu Guan <eguan@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-02-06xfs: introduce xfs_aligned_fsb_countChristoph Hellwig
Factor a helper to calculate the extent-size aligned block out of the iomap code, so that it can be reused by the upcoming reflink dio code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30iomap: constify struct iomap_opsChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-10-05xfs: create a separate cow extent size hint for the allocatorDarrick J. Wong
Create a per-inode extent size allocator hint for copy-on-write. This hint is separate from the existing extent size hint so that CoW can take advantage of the fragmentation-reducing properties of extent size hints without disabling delalloc for regular writes. The extent size hint that's fed to the allocator during a copy on write operation is the greater of the cowextsize and regular extsize hint. During reflink, if we're sharing the entire source file to the entire destination file and the destination file doesn't already have a cowextsize hint, propagate the source file's cowextsize hint to the destination file. Furthermore, zero the bulkstat buffer prior to setting the fields so that we don't copy kernel memory contents into userspace. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2016-10-04xfs: support allocating delayed extents in CoW forkDarrick J. Wong
Modify xfs_bmap_add_extent_delay_real() so that we can convert delayed allocation extents in the CoW fork to real allocations, and wire this up all the way back to xfs_iomap_write_allocate(). In a subsequent patch, we'll modify the writepage handler to call this. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2016-09-19xfs: rewrite and optimize the delalloc write pathChristoph Hellwig
Currently xfs_iomap_write_delay does up to lookups in the inode extent tree, which is rather costly especially with the new iomap based write path and small write sizes. But it turns out that the low-level xfs_bmap_search_extents gives us all the information we need in the regular delalloc buffered write path: - it will return us an extent covering the block we are looking up if it exists. In that case we can simply return that extent to the caller and are done - it will tell us if we are beyoned the last current allocated block with an eof return parameter. In that case we can create a delalloc reservation and use the also returned information about the last extent in the file as the hint to size our delalloc reservation. - it can tell us that we are writing into a hole, but that there is an extent beyoned this hole. In this case we can create a delalloc reservation that covers the requested size (possible capped to the next existing allocation). All that can be done in one single routine instead of bouncing up and down a few layers. This reduced the CPU overhead of the block mapping routines and also simplified the code a lot. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-08-17xfs: (re-)implement FIEMAP_FLAG_XATTRChristoph Hellwig
Use a special read-only iomap_ops implementation to support fiemap on the attr fork. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-06-21xfs: implement iomap based buffered write pathChristoph Hellwig
Convert XFS to use the new iomap based multipage write path. This involves implementing the ->iomap_begin and ->iomap_end methods, and switching the buffered file write, page_mkwrite and xfs_iozero paths to the new iomap helpers. With this change __xfs_get_blocks will never be used for buffered writes, and the code handling them can be removed. Based on earlier code from Dave Chinner. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-06-21xfs: make xfs_bmbt_to_iomap available outside of xfs_pnfs.cChristoph Hellwig
And ensure it works for RT subvolume files an set the block device, both of which will be needed to be able to use the function in the buffered write path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09xfs: pass a 64-bit count argument to xfs_iomap_write_unwrittenChristoph Hellwig
The code is already ready for it, and the pnfs layout commit code expects to be able to pass a larger than 32-bit argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2013-10-01xfs: get rid of count from xfs_iomap_write_allocate()Jie Liu
Get rid of function variable count from xfs_iomap_write_allocate() as it is unused. Additionally, checkpatch warn me of the following for this change: WARNING: extern prototypes should be avoided in .h files +extern int xfs_iomap_write_allocate(struct xfs_inode *, xfs_off_t, So this patch also remove all extern function prototypes at xfs_iomap.h to suppress it to make this code style in consistent manner in this file. Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2010-12-16xfs: kill xfs_iomapChristoph Hellwig
Opencode the xfs_iomap code in it's two callers. The overlap of passed flags already was minimal and will be further reduced in the next patch. As a side effect the BMAPI_* flags for xfs_bmapi and the IO_* flags for I/O end processing are merged into a single set of flags, which should be a bit more descriptive of the operation we perform. Also improve the tracing by giving each caller it's own type set of tracepoints. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-07-26xfs: do not use emums for flags used in tracingChristoph Hellwig
The tracing code can't print flags defined as enums. Most flags that we want to print are defines as macros already, but move the few remaining ones over to make the trace output more useful. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-07-26xfs: small cleanups for xfs_iomap / __xfs_get_blocksChristoph Hellwig
Remove the flags argument to __xfs_get_blocks as we can easily derive it from the direct argument, and remove the unused BMAPI_MMAP flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-05-19xfs: mark xfs_iomap_write_ helpers staticChristoph Hellwig
And also drop a useless argument to xfs_iomap_write_direct. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-19xfs: move I/O type flags into xfs_aops.cChristoph Hellwig
The IOMAP_ flags are now only used inside xfs_aops.c for extent probing and I/O completion tracking, so more them here, and rename them to IO_* as there's no mapping involved at all. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-19xfs: kill struct xfs_iomapChristoph Hellwig
Now that struct xfs_iomap contains exactly the same units as struct xfs_bmbt_irec we can just use the latter directly in the aops code. Replace the missing IOMAP_NEW flag with a new boolean output parameter to xfs_iomap. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-19xfs: report iomap_bn in block baseChristoph Hellwig
Report the iomap_bn field of struct xfs_iomap in terms of filesystem blocks instead of in terms of bytes. Shift the byte conversions into the caller, and replace the IOMAP_DELAY and IOMAP_HOLE flag checks with checks for HOLESTARTBLOCK and DELAYSTARTBLOCK. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-19xfs: remove iomap_deltaChristoph Hellwig
The iomap_delta field in struct xfs_iomap just contains the difference between the offset passed to xfs_iomap and the iomap_offset. Just calculate it in the only caller that cares. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-05-19xfs: remove iomap_targetChristoph Hellwig
Instead of using the iomap_target field in struct xfs_iomap and the IOMAP_REALTIME flag just use the already existing xfs_find_bdev_for_inode helper. There's some fallout as we need to pass the inode in a few more places, which we also use to sanitize some calling conventions. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2009-12-14xfs: event tracing supportChristoph Hellwig
Convert the old xfs tracing support that could only be used with the out of tree kdb and xfsidbg patches to use the generic event tracer. To use it make sure CONFIG_EVENT_TRACING is enabled and then enable all xfs trace channels by: echo 1 > /sys/kernel/debug/tracing/events/xfs/enable or alternatively enable single events by just doing the same in one event subdirectory, e.g. echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable or set more complex filters, etc. In Documentation/trace/events.txt all this is desctribed in more detail. To reads the events do a cat /sys/kernel/debug/tracing/trace Compared to the last posting this patch converts the tracing mostly to the one tracepoint per callsite model that other users of the new tracing facility also employ. This allows a very fine-grained control of the tracing, a cleaner output of the traces and also enables the perf tool to use each tracepoint as a virtual performance counter, allowing us to e.g. count how often certain workloads git various spots in XFS. Take a look at http://lwn.net/Articles/346470/ for some examples. Also the btree tracing isn't included at all yet, as it will require additional core tracing features not in mainline yet, I plan to deliver it later. And the really nice thing about this patch is that it actually removes many lines of code while adding this nice functionality: fs/xfs/Makefile | 8 fs/xfs/linux-2.6/xfs_acl.c | 1 fs/xfs/linux-2.6/xfs_aops.c | 52 - fs/xfs/linux-2.6/xfs_aops.h | 2 fs/xfs/linux-2.6/xfs_buf.c | 117 +-- fs/xfs/linux-2.6/xfs_buf.h | 33 fs/xfs/linux-2.6/xfs_fs_subr.c | 3 fs/xfs/linux-2.6/xfs_ioctl.c | 1 fs/xfs/linux-2.6/xfs_ioctl32.c | 1 fs/xfs/linux-2.6/xfs_iops.c | 1 fs/xfs/linux-2.6/xfs_linux.h | 1 fs/xfs/linux-2.6/xfs_lrw.c | 87 -- fs/xfs/linux-2.6/xfs_lrw.h | 45 - fs/xfs/linux-2.6/xfs_super.c | 104 --- fs/xfs/linux-2.6/xfs_super.h | 7 fs/xfs/linux-2.6/xfs_sync.c | 1 fs/xfs/linux-2.6/xfs_trace.c | 75 ++ fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++ fs/xfs/linux-2.6/xfs_vnode.h | 4 fs/xfs/quota/xfs_dquot.c | 110 --- fs/xfs/quota/xfs_dquot.h | 21 fs/xfs/quota/xfs_qm.c | 40 - fs/xfs/quota/xfs_qm_syscalls.c | 4 fs/xfs/support/ktrace.c | 323 --------- fs/xfs/support/ktrace.h | 85 -- fs/xfs/xfs.h | 16 fs/xfs/xfs_ag.h | 14 fs/xfs/xfs_alloc.c | 230 +----- fs/xfs/xfs_alloc.h | 27 fs/xfs/xfs_alloc_btree.c | 1 fs/xfs/xfs_attr.c | 107 --- fs/xfs/xfs_attr.h | 10 fs/xfs/xfs_attr_leaf.c | 14 fs/xfs/xfs_attr_sf.h | 40 - fs/xfs/xfs_bmap.c | 507 +++------------ fs/xfs/xfs_bmap.h | 49 - fs/xfs/xfs_bmap_btree.c | 6 fs/xfs/xfs_btree.c | 5 fs/xfs/xfs_btree_trace.h | 17 fs/xfs/xfs_buf_item.c | 87 -- fs/xfs/xfs_buf_item.h | 20 fs/xfs/xfs_da_btree.c | 3 fs/xfs/xfs_da_btree.h | 7 fs/xfs/xfs_dfrag.c | 2 fs/xfs/xfs_dir2.c | 8 fs/xfs/xfs_dir2_block.c | 20 fs/xfs/xfs_dir2_leaf.c | 21 fs/xfs/xfs_dir2_node.c | 27 fs/xfs/xfs_dir2_sf.c | 26 fs/xfs/xfs_dir2_trace.c | 216 ------ fs/xfs/xfs_dir2_trace.h | 72 -- fs/xfs/xfs_filestream.c | 8 fs/xfs/xfs_fsops.c | 2 fs/xfs/xfs_iget.c | 111 --- fs/xfs/xfs_inode.c | 67 -- fs/xfs/xfs_inode.h | 76 -- fs/xfs/xfs_inode_item.c | 5 fs/xfs/xfs_iomap.c | 85 -- fs/xfs/xfs_iomap.h | 8 fs/xfs/xfs_log.c | 181 +---- fs/xfs/xfs_log_priv.h | 20 fs/xfs/xfs_log_recover.c | 1 fs/xfs/xfs_mount.c | 2 fs/xfs/xfs_quota.h | 8 fs/xfs/xfs_rename.c | 1 fs/xfs/xfs_rtalloc.c | 1 fs/xfs/xfs_rw.c | 3 fs/xfs/xfs_trans.h | 47 + fs/xfs/xfs_trans_buf.c | 62 - fs/xfs/xfs_vnodeops.c | 8 70 files changed, 2151 insertions(+), 2592 deletions(-) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2009-04-06xfs: remove xfs_flush_spaceDave Chinner
The only thing we need to do now when we get an ENOSPC condition during delayed allocation reservation is flush all the other inodes with delalloc blocks on them and retry without EOF preallocation. Remove the unneeded mess that is xfs_flush_space() and just call xfs_flush_inodes() directly from xfs_iomap_write_delay(). Also, change the location of the retry label to avoid trying to do EOF preallocation because we don't want to do that at ENOSPC. This enables us to remove the BMAPI_SYNC flag as it is no longer used. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-03-29xfs: fix various typosMalcolm Parsons
Signed-off-by: Malcolm Parsons <malcolm.parsons@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2008-02-07[XFS] kill unnessecary ioops indirectionLachlan McIlroy
Currently there is an indirection called ioops in the XFS data I/O path. Various functions are called by functions pointers, but there is no coherence in what this is for, and of course for XFS itself it's entirely unused. This patch removes it instead and significantly reduces source and binary size of XFS while making maintaince easier. SGI-PV: 970841 SGI-Modid: xfs-linux-melb:xfs-kern:29737a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07[XFS] kill BMAPI_UNWRITTENChristoph Hellwig
There is no reason to go through xfs_iomap for the BMAPI_UNWRITTEN because it has nothing in common with the other cases. Instead check for the shutdown filesystem in xfs_end_bio_unwritten and perform a direct call to xfs_iomap_write_unwritten (which should be renamed to something more sensible one day) SGI-PV: 970241 SGI-Modid: xfs-linux-melb:xfs-kern:29681a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07[XFS] kill BMAPI_DEVICEChristoph Hellwig
There is no reason to go into the iomap machinery just to get the right block device for an inode. Instead look at the realtime flag in the inode and grab the right device from the mount structure. I created a new helper, xfs_find_bdev_for_inode instead of opencoding it because I plan to use it in other places in the future. SGI-PV: 970240 SGI-Modid: xfs-linux-melb:xfs-kern:29680a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16[XFS] Kill unused IOMAP_EOF flagChristoph Hellwig
SGI-PV: 968563 SGI-Modid: xfs-linux-melb:xfs-kern:29705a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-05-08[XFS] Fix to prevent the notorious 'NULL files' problem after a crash.Lachlan McIlroy
The problem that has been addressed is that of synchronising updates of the file size with writes that extend a file. Without the fix the update of a file's size, as a result of a write beyond eof, is independent of when the cached data is flushed to disk. Often the file size update would be written to the filesystem log before the data is flushed to disk. When a system crashes between these two events and the filesystem log is replayed on mount the file's size will be set but since the contents never made it to disk the file is full of holes. If some of the cached data was flushed to disk then it may just be a section of the file at the end that has holes. There are existing fixes to help alleviate this problem, particularly in the case where a file has been truncated, that force cached data to be flushed to disk when the file is closed. If the system crashes while the file(s) are still open then this flushing will never occur. The fix that we have implemented is to introduce a second file size, called the in-memory file size, that represents the current file size as viewed by the user. The existing file size, called the on-disk file size, is the one that get's written to the filesystem log and we only update it when it is safe to do so. When we write to a file beyond eof we only update the in- memory file size in the write operation. Later when the I/O operation, that flushes the cached data to disk completes, an I/O completion routine will update the on-disk file size. The on-disk file size will be updated to the maximum offset of the I/O or to the value of the in-memory file size if the I/O includes eof. SGI-PV: 958522 SGI-Modid: xfs-linux-melb:xfs-kern:28322a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2005-11-25[XFS] Fix potential overflow in xfs_iomap_t delta for very large extentsEric Sandeen
SGI-PV: 945311 SGI-Modid: xfs-linux-melb:xfs-kern:201708a Signed-off-by: Eric Sandeen <sandeen@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02[XFS] Update license/copyright notices to match the prefered SGINathan Scott
boilerplate. SGI-PV: 913862 SGI-Modid: xfs-linux:xfs-kern:23903a Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-05-05[XFS] Cleanup use of loff_t vs xfs_off_t in the core code.Nathan Scott
SGI Modid: xfs-linux:xfs-kern:22378a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Christoph Hellwig <hch@sgi.com>
2005-05-05[XFS] Use the right offset when ensuring a delayed allocate conversion has ↵Nathan Scott
covered the offset originally requested. Can cause data corruption when multiple processes are performing writeout on different areas of the same file. Quite difficult to hit though. SGI Modid: xfs-linux:xfs-kern:22377a Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Christoph Hellwig <hch@sgi.com> .
2005-04-16Linux-2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!