summaryrefslogtreecommitdiff
path: root/fs/xfs/xfs_inode.c
AgeCommit message (Collapse)Author
2009-08-12xfs: check for dinode realtime flag corruptionChristoph Hellwig
Ramon tested XFS with a modified version of fsfuzzer and hit a NULL pointer dereference in __xfs_get_blocks due to the RT device target pointer being NULL. To fix this reject inode with the realtime bit set on a a filesystem without an RT subvolume during inode read. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reported-by: Ramon de Carvalho Valle <ramon@risesecurity.org> Tested-by: Ramon de Carvalho Valle <ramon@risesecurity.org> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-06-10xfs: use generic Posix ACL codeChristoph Hellwig
This patch rips out the XFS ACL handling code and uses the generic fs/posix_acl.c code instead. The ondisk format is of course left unchanged. This also introduces the same ACL caching all other Linux filesystems do by adding pointers to the acl and default acl in struct xfs_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-04-29xfs_file_last_byte() needs to acquire ilockLachlan McIlroy
We had some systems crash with this stack: [<a00000010000cb20>] ia64_leave_kernel+0x0/0x280 [<a00000021291ca00>] xfs_bmbt_get_startoff+0x0/0x20 [xfs] [<a0000002129080b0>] xfs_bmap_last_offset+0x210/0x280 [xfs] [<a00000021295b010>] xfs_file_last_byte+0x70/0x1a0 [xfs] [<a00000021295b200>] xfs_itruncate_start+0xc0/0x1a0 [xfs] [<a0000002129935f0>] xfs_inactive_free_eofblocks+0x290/0x460 [xfs] [<a000000212998fb0>] xfs_release+0x1b0/0x240 [xfs] [<a0000002129ad930>] xfs_file_release+0x70/0xa0 [xfs] [<a000000100162ea0>] __fput+0x1a0/0x420 [<a000000100163160>] fput+0x40/0x60 The problem here is that xfs_file_last_byte() does not acquire the inode lock and can therefore race with another thread that is modifying the extext list. While xfs_bmap_last_offset() is trying to lookup what was the last extent some extents were merged and the extent list shrunk so the index we lookup is now beyond the end of the extent list and potentially in a freed buffer. Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-01-19xfs: sanity check attr fork sizeChristoph Hellwig
Recently we have quite a few kerneloops reports about dereferencing a NULL if_data in the attribute fork. From looking over the code this can only happen if we pass a 0 size argument to xfs_iformat_local. This implies some sort of corruption and in fact the only mailinglist report about this from earlier this year was after a powerfail presumably on a system with write cache and without barriers. Add a quick sanity check for the attr fork size in xfs_iformat to catch these early and without an oops. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-16[XFS] Remove the rest of the macro-to-function indirections.Eric Sandeen
Remove the last of the macros-defined-to-static-functions. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22[XFS] Remove XFS_BUF_SHUT() and friendsLachlan McIlroy
Code does nothing so remove it. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-11[XFS] resync headers with libxfsChristoph Hellwig
- xfs_sb.h add the XFS_SB_VERSION2_PARENTBIT features2 that has been around in userspace for some time - xfs_inode.h: move a few things out of __KERNEL__ that are needed by userspace - xfs_mount.h: only include xfs_sync.h under __KERNEL__ - xfs_inode.c: minor whitespace fixup. I accidentaly changes this when importing this file for use by userspace. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-10[XFS] Remove unused tracing codeLachlan McIlroy
None of this code appears to be used anywhere so remove it. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-04move vn_iowait / vn_iowake into xfs_aops.cChristoph Hellwig
The whole machinery to wait on I/O completion is related to the I/O path and should be there instead of in xfs_vnode.c. Also give the functions more descriptive names. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04cleanup the inode reclaim pathChristoph Hellwig
Merge xfs_iextract and xfs_idestroy into xfs_ireclaim as they are never called individually. Also rewrite most comments in this area as they were severly out of date. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] move inode allocation out xfs_ireadChristoph Hellwig
Allocate the inode in xfs_iget_cache_miss and pass it into xfs_iread. This simplifies the error handling and allows xfs_iread to be shared with userspace which already uses these semantics. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] kill the XFS_IMAP_BULKSTAT flagChristoph Hellwig
Just pass down the XFS_IGET_* flags all the way down to xfs_imap instead of translating them mid-way. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] embededd struct xfs_imap into xfs_inodeChristoph Hellwig
Most uses of struct xfs_imap are to map and inode to a buffer. To avoid copying around the inode location information we should just embedd a strcut xfs_imap into the xfs_inode. To make sure it doesn't bloat an inode the im_len is changed to a ushort, which is fine as that's what the users exepect anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] merge xfs_imap into xfs_dilocateChristoph Hellwig
xfs_imap is the only caller of xfs_dilocate and doesn't add any significant value. Merge the two functions and document the various cases we have for inode cluster lookup in the new xfs_imap. Also remove the unused im_agblkno and im_ioffset fields from struct xfs_imap while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] remove dead code for old inode item recoveryChristoph Hellwig
We have removed the support for old-style inode items a while ago and xlog_recover_do_inode_trans is now only called for XFS_LI_INODE items. That means we can remove the call to xfs_imap there and with it the XFS_IMAP_LOOKUP that is set by all other callers. We can also mark xfs_imap static now. (First sent on October 21st) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] stop using xfs_itobp in xfs_ireadChristoph Hellwig
The only caller of xfs_itobp that doesn't have i_blkno setup is now the initial inode read. It needs access to the whole xfs_imap so using xfs_inotobp is not an option. Instead opencode the buffer lookup in xfs_iread and kill all the functionality for the initial map from xfs_itobp. (First sent on October 21st) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] kill XFS_DINODE_VERSION_ definesChristoph Hellwig
These names don't add any value at all over just using the numerical values. (First sent on October 9th) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] kill xfs_dinode_core_tChristoph Hellwig
Now that we have a separate xfs_icdinode_t for the in-core inode which gets logged there is no need anymore for the xfs_dinode vs xfs_dinode_core split - the fact that part of the structure gets logged through the inode log item and a small part not can better be described in a comment. All sizeof operations on the dinode_core either really wanted the icdinode and are switched to that one, or had already added the size of the agi unlinked list pointer. Later both will be replaced with helpers once we get the larger CRC-enabled dinode. Removing the data and attribute fork unions also has the advantage that xfs_dinode.h doesn't need to pull in every header under the sun. While we're at it also add some more comments describing the dinode structure. (First sent on October 7th) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01[XFS] factor out xfs_read_agi helperChristoph Hellwig
Add a helper to read the AGI header and perform basic verification. Based on hunks from a larger patch from Dave Chinner. (First sent on Juli 23rd) Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-11-17[XFS] Fix double free of log ticketsDave Chinner
When an I/O error occurs during an intermediate commit on a rolling transaction, xfs_trans_commit() will free the transaction structure and the related ticket. However, the duplicate transaction that gets used as the transaction continues still contains a pointer to the ticket. Hence when the duplicate transaction is cancelled and freed, we free the ticket a second time. Add reference counting to the ticket so that we hold an extra reference to the ticket over the transaction commit. We drop the extra reference once we have checked that the transaction commit did not return an error, thus avoiding a double free on commit error. Credit to Nick Piggin for tripping over the problem. SGI-PV: 989741 Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30[XFS] free partially initialized inodes using destroy_inodeChristoph Hellwig
To make sure we free the security data inodes need to be freed using the proper VFS helper (which we also need to export for this). We mark these inodes bad so we can skip the flush path for them. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32398a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30[XFS] stop using xfs_itobp in xfs_bulkstatChristoph Hellwig
xfs_bulkstat only wants the dinode, offset and buffer from a given inode number. Instead of using xfs_itobp on a fake inode which is complicated and currently leads to leaks of the security data just use xfs_inotobp which is designed to do exactly the kind of lookup xfs_bulkstat wants. The only thing that's missing in xfs_inotobp is a flags paramter that let's us pass down XFS_IMAP_BULKSTAT, but that can easily added. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32397a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30[XFS] Finish removing the mount pointer from the AIL APIDavid Chinner
Change all the remaining AIL API functions that are passed struct xfs_mount pointers to pass pointers directly to the struct xfs_ail being used. With this conversion, all external access to the AIL is via the struct xfs_ail. Hence the operation and referencing of the AIL is almost entirely independent of the xfs_mount that is using it - it is now much more tightly tied to the log and the items it is tracking in the log than it is tied to the xfs_mount. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32353a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] Move the AIL lock into the struct xfs_ailDavid Chinner
Bring the ail lock inside the struct xfs_ail. This means the AIL can be entirely manipulated via the struct xfs_ail rather than needing both the struct xfs_mount and the struct xfs_ail. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32350a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] Allow 64 bit machines to avoid the AIL lock during flushesDavid Chinner
When copying lsn's from the log item to the inode or dquot flush lsn, we currently grab the AIL lock. We do this because the LSN is a 64 bit quantity and it needs to be read atomically. The lock is used to guarantee atomicity for 32 bit platforms. Make the LSN copying a small function, and make the function used conditional on BITS_PER_LONG so that 64 bit machines don't need to take the AIL lock in these places. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32349a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] kill deleted inodes listDavid Chinner
Now that the deleted inodes list is unused, kill it. This also removes the i_reclaim list head from the xfs_inode, shrinking it by two pointers. SGI-PV: 988142 SGI-Modid: xfs-linux-melb:xfs-kern:32334a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] Combine the XFS and Linux inodesDavid Chinner
To avoid issues with different lifecycles of XFS and Linux inodes, embedd the linux inode inside the XFS inode. This means that the linux inode has the same lifecycle as the XFS inode, even when it has been released by the OS. XFS inodes don't live much longer than this (a short stint in reclaim at most), so there isn't significant memory usage penalties here. Version 3 o kill xfs_icount() Version 2 o remove unused commented out code from xfs_iget(). o kill useless cast in VFS_I() SGI-PV: 988141 SGI-Modid: xfs-linux-melb:xfs-kern:32323a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] Always use struct xfs_btree_block instead of short / longformChristoph Hellwig
structures. Always use the generic xfs_btree_block type instead of the short / long structures. Add XFS_BTREE_SBLOCK_LEN / XFS_BTREE_LBLOCK_LEN defines for the length of a short / long form block. The rationale for this is that we will grow more btree block header variants to support CRCs and other RAS information, and always accessing them through the same datatype with unions for the short / long form pointers makes implementing this much easier. SGI-PV: 988146 SGI-Modid: xfs-linux-melb:xfs-kern:32300a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30[XFS] cleanup btree record / key / ptr addressing macros.Christoph Hellwig
Replace the generic record / key / ptr addressing macros that use cpp token pasting with simpler macros that do the job for just one given btree type. The new macros lose the cur argument and thus can be used outside the core btree code, but also gain an xfs_mount * argument to allow for checking the CRC flag in the near future. Note that many of these macros aren't actually used in the kernel code, but only in userspace (mostly in xfs_repair). SGI-PV: 988146 SGI-Modid: xfs-linux-melb:xfs-kern:32295a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30[XFS] Cleanup maxrecs calculation.Christoph Hellwig
Clean up the way the maximum and minimum records for the btree blocks are calculated. For the alloc and inobt btrees all the values are pre-calculated in xfs_mount_common, and we switch the current loop around the ugly generic macros that use cpp token pasting to generate type names to two small helpers in normal C code. For the bmbt and bmdr trees these helpers also exist, but can be called during runtime, too. Here we also kill various macros dealing with them and inline the logic into the get_minrecs / get_maxrecs / get_dmaxrecs methods in xfs_bmap_btree.c. Note that all these new helpers take an xfs_mount * argument which will be needed to determine the size of a btree block once we add support for extended btree blocks with CRCs and other RAS information. SGI-PV: 988146 SGI-Modid: xfs-linux-melb:xfs-kern:32292a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30[XFS] Remove xfs_iflush_all and clean up xfs_finish_reclaim_all()David Chinner
xfs_iflush_all() walks the m_inodes list to find inodes that need reclaiming. We already have such a list - the m_del_inodes list. Replace xfs_iflush_all() with a call to xfs_finish_reclaim_all() and clean up xfs_finish_reclaim_all() to handle the different flush modes now needed. Originally based on a patch from Christoph Hellwig. Version 3 o rediff against new linux-2.6/xfs_sync.c code Version 2 o revert xfs_syncsub() inode reclaim behaviour back to original code o xfs_quiesce_fs() should use XFS_IFLUSH_DELWRI_ELSE_ASYNC, not XFS_IFLUSH_ASYNC, to prevent change of behaviour. SGI-PV: 988139 SGI-Modid: xfs-linux-melb:xfs-kern:32284a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] Wait for all I/O on truncate to zero file sizeLachlan McIlroy
It's possible to have outstanding xfs_ioend_t's queued when the file size is zero. This can happen in the direct I/O path when a direct I/O write fails due to ENOSPC. In this case the xfs_ioend_t will still be queued (ie xfs_end_io_direct() does not know that the I/O failed so can't force the xfs_ioend_t to be flushed synchronously). When we truncate a file on unlink we don't know to wait for these xfs_ioend_ts and we can have a use-after-free situation if the inode is reclaimed before the xfs_ioend_t is finally processed. As was suggested by Dave Chinner lets wait for all I/Os to complete when truncating the file size to zero. SGI-PV: 981668 SGI-Modid: xfs-linux-melb:xfs-kern:32216a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] make btree tracing genericChristoph Hellwig
Make the existing bmap btree tracing generic so that it applies to all btree types. Some fragments lifted from a patch by Dave Chinner. SGI-PV: 985583 SGI-Modid: xfs-linux-melb:xfs-kern:32187a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Bill O'Donnell <billodo@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30[XFS] Use xfs_idestroy() to cleanup an inode.Lachlan McIlroy
SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31927a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30[XFS] Make use of the init-once slab optimisation.David Chinner
To avoid having to initialise some fields of the XFS inode on every allocation, we can use the slab init-once feature to initialise them. All we have to guarantee is that when we free the inode, all it's entries are in the initial state. Add asserts where possible to ensure debug kernels check this initial state before freeing and after allocation. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31925a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-26[XFS] Remove xfs_iext_irec_compact_full()Lachlan McIlroy
Yet another bug was found in xfs_iext_irec_compact_full() and while the source of the bug was found it wasn't an easy task to track it down because the conditions are very difficult to reproduce. A HUGE thank-you goes to Russell Cattelan and Eric Sandeen for their significant effort in tracking down the source of this corruption. xfs_iext_irec_compact_full() and xfs_iext_irec_compact_pages() are almost identical - they both compact indirect extent lists by moving extents from subsequent buffers into earlier ones. xfs_iext_irec_compact_pages() only moves extents if all of the extents in the next buffer will fit into the empty space in the buffer before it. xfs_iext_irec_compact_full() will go a step further and move part of the next buffer if all the extents wont fit. It will then shift the remaining extents in the next buffer up to the start of the buffer. The bug here was that we did not update er_extoff and this caused extent list corruption. It does not appear that this extra functionality gains us much. Calling xfs_iext_irec_compact_pages() instead will do a good enough job at compacting the indirect list and will be quicker too. For the case in xfs_iext_indirect_to_direct() the total number of extents in the indirect list will fit into one buffer so we will never need the extra functionality of xfs_iext_irec_compact_full() there. Also xfs_iext_irec_compact_pages() doesn't need to do a memmove() (the buffers will never overlap) so we don't want the performance hit that can incur. SGI-PV: 987159 SGI-Modid: xfs-linux-melb:xfs-kern:32166a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
2008-09-26[XFS] Fix extent list corruption in xfs_iext_irec_compact_full().Lachlan McIlroy
If we don't move all the records from the next buffer into the current buffer then we need to update the er_extoff field of the next buffer as we shift the remaining records to the start of the buffer. SGI-PV: 987159 SGI-Modid: xfs-linux-melb:xfs-kern:32165a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Russell Cattelan <cattelan@thebarn.com>
2008-08-14CRED: Introduce credential access wrappersDavid Howells
The patches that are intended to introduce copy-on-write credentials for 2.6.28 require abstraction of access to some fields of the task structure, particularly for the case of one task accessing another's credentials where RCU will have to be observed. Introduced here are trivial no-op versions of the desired accessors for current and other tasks so that other subsystems can start to be converted over more easily. Wrappers are introduced into a new header (linux/cred.h) for UID/GID, EUID/EGID, SUID/SGID, FSUID/FSGID, cap_effective and current's subscribed user_struct. These wrappers are macros because the ordering between header files mitigates against making them inline functions. linux/cred.h is #included from linux/sched.h. Further, XFS is modified such that it no longer defines and uses parameterised versions of current_fs[ug]id(), thus getting rid of the namespace collision otherwise incurred. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2008-08-13[XFS] Use KM_NOFS for debug trace buffersLachlan McIlroy
Use KM_NOFS to prevent recursion back into the filesystem which can cause deadlocks. In the case of xfs_iread() we hold the lock on the inode cluster buffer while allocating memory for the trace buffers. If we recurse back into XFS to flush data that may require a transaction to allocate extents which needs log space. This can deadlock with the xfsaild thread which can't push the tail of the log because it is trying to get the inode cluster buffer lock. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31838a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>
2008-08-13[XFS] update timestamp in xfs_ialloc manuallyChristoph Hellwig
In xfs_ialloc we just want to set all timestamps to the current time. We don't need to mark the inode dirty like xfs_ichgtime does, and we don't need nor want the opimizations in xfs_ichgtime that I will introduce in the next patch. So just opencode the timestamp update in xfs_ialloc, and remove the new unused XFS_ICHGTIME_ACC case in xfs_ichgtime. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31825a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13[XFS] replace inode flush semaphore with a completionDavid Chinner
Use the new completion flush code to implement the inode flush lock. Removes one of the final users of semaphores in the XFS code base. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31817a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13[XFS] use get_unaligned_* helpersHarvey Harrison
SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31813a Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13[XFS] sanitize xfs_initialize_vnodeChristoph Hellwig
Sanitize setting up the Linux indode. Setting up the xfs_inode <-> inode link is opencoded in xfs_iget_core now because that's the only place it needs to be done, xfs_initialize_vnode is renamed to xfs_setup_inode and loses all superflous paramaters. The check for I_NEW is removed because it always is true and the di_mode check moves into xfs_iget_core because it's only needed there. xfs_set_inodeops and xfs_revalidate_inode are merged into xfs_setup_inode and the whole things is moved into xfs_iops.c where it belongs. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31782a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13[XFS] kill bhv_vnode_tChristoph Hellwig
All remaining bhv_vnode_t instance are in code that's more or less Linux specific. (Well, for xfs_acl.c that could be argued, but that code is on the removal list, too). So just do an s/bhv_vnode_t/struct inode/ over the whole tree. We can clean up variable naming and some useless helpers later. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31781a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13[XFS] remove some easy bhv_vnode_t instancesChristoph Hellwig
In various places we can just move a VFS_I call into the argument list of called functions/macros instead of having a local bhv_vnode_t. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31776a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13[XFS] Use KM_NOFS for incore inode extent tree allocation V2David Chinner
If we allow incore extent tree allocations to recurse into the filesystem under memory pressure, new delayed allocations through xfs_iomap_write_delay() can deadlock on themselves if memory reclaim tries to write back dirty pages from that inode. It will deadlock in xfs_iomap_write_allocate() trying to take the ilock we already hold. This can also show up as complex ABBA deadlocks when multiple threads are triggering memory reclaim when trying to allocate extents. The main cause of this is the fact that delayed allocation is not done in a transaction, so KM_NOFS is not automatically added to the allocations to prevent this recursion. Mark all allocations done for the incore inode extent tree as KM_NOFS to ensure they never recurse back into the filesystem. Version 2: o KM_NOFS implies KM_SLEEP, so just use KM_NOFS SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31726a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13[XFS] Kill shouty XFS_ITOV() macroDavid Chinner
Replace XFS_ITOV() with the new VFS_I() inline. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31724a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13[XFS] kill shouty XFS_ITOV_NULL macroDavid Chinner
Replace XFS_ITOV_NULL() with the new VFS_I() inline. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31722a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] fix extent corruption in xfs_iext_irec_compact_full()Lachlan McIlroy
This function is used to compact the indirect extent list by moving extents from one page to the previous to fill them up. After we move some extents to an earlier page we need to shuffle the remaining extents to the start of the page. The actual bug here is the second argument to memmove() needs to index past the extents, that were copied to the previous page, and move the remaining extents. For pages that are already full (ie ext_avail == 0) the compaction code has no net effect so don't do it. SGI-PV: 983337 SGI-Modid: xfs-linux-melb:xfs-kern:31332a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28[XFS] make inode reclaim wait for log I/O to completeLachlan McIlroy
During a forced shutdown a xfs inode can be destroyed before log I/O involving that inode is complete. We need to wait for the inode to be unpinned before tearing it down. Version 2 cleans up the code a bit by relying on xfs_iflush() to do the unpinning and forced shutdown check. SGI-PV: 981240 SGI-Modid: xfs-linux-melb:xfs-kern:31326a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com>