summaryrefslogtreecommitdiff
path: root/fs/gfs2/super.c
AgeCommit message (Collapse)Author
2008-06-27[GFS2] Clean up the glock coreSteven Whitehouse
This patch implements a number of cleanups to the core of the GFS2 glock code. As a result a lot of code is removed. It looks like a really big change, but actually a large part of this patch is either removing or moving existing code. There are some new bits too though, such as the new run_queue() function which is considerably streamlined. Highlights of this patch include: o Fixes a cluster coherency bug during SH -> EX lock conversions o Removes the "glmutex" code in favour of a single bit lock o Removes the ->go_xmote_bh() for inodes since it was duplicating ->go_lock() o We now only use the ->lm_lock() function for both locks and unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED) o The fast path is considerably shortly, giving performance gains especially with lock_nolock o The glock_workqueue is now used for all the callbacks from the DLM which allows us to simplify the lock_dlm module (see following patch) o The way is now open to make further changes such as eliminating the two threads (gfs2_glockd and gfs2_scand) in favour of a more efficient scheme. This patch has undergone extensive testing with various test suites so it should be pretty stable by now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>
2008-04-10[GFS2] fix GFP_KERNEL misusesJosef Bacik
There are several places where GFP_KERNEL allocations happen under a glock, which will result in hangs if we're under memory pressure and go to re-enter the fs in order to flush stuff out. This patch changes the culprits to GFS_NOFS to keep this problem from happening. Thank you, Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31[GFS2] Streamline indirect pointer tree height calculationSteven Whitehouse
This patch improves the calculation of the tree height in order to reduce the number of operations which are carried out on each call to gfs2_block_map. In the common case, we now make a single comparison, rather than calculating the required tree height from scratch each time. Also in the case that the tree does need some extra height, we start from the current height rather from zero when we work out what the new height ought to be. In addition the di_height field is moved into the inode proper and reduced in size to a u8 since the value must be between 0 and GFS2_MAX_META_HEIGHT (10). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Initialize extent_list earlierBob Peterson
Here is a patch for the latest upstream GFS2 code: The journal extent map needs to be initialized sooner than it currently is. Otherwise failed mount attempts (e.g. not enough journals, etc.) may panic trying to access the uninitialized list. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Eliminate the no longer needed sd_statfs_mutexBob Peterson
This patch eliminates the unneeded sd_statfs_mutex mutex but preserves the ordering as discussed. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Journal extent mappingBob Peterson
This patch saves a little time when gfs2 writes to the journals by keeping a mapping between logical and physical blocks on disk. That's better than constantly looking up indirect pointers in buffers, when the journals are several levels of indirection (which they typically are). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Don't periodically update the jindexSteven Whitehouse
We only care about the content of the jindex in two cases, one is when we mount the fs and the other is when we need to recover another journal. In both cases we have to update the jindex anyway, so there is no point in updating it periodically between times, so this removes it to simplify gfs2_logd. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Remove "reclaim limit"Steven Whitehouse
This call to reclaim glocks is not needed, and in particular we don't want it in the fast path for locking glocks. The limit was entirely arbitrary anyway and we can't expect users to adjust things like this, the remaining code will do the right thing on its own. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Remove unused variablesSteven Whitehouse
These haven't been used for some time, remove them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Remove useless i_cache from inodesSteven Whitehouse
The i_cache was designed to keep references to the indirect blocks used during block mapping so that they didn't have to be looked up continually. The idea failed because there are too many places where the i_cache needs to be freed, and this has in the past been the cause of many bugs. In addition there was no performance benefit being gained since the disk blocks in question were cached anyway. So this patch removes it in order to simplify the code to prepare for other changes which would otherwise have had to add further support for this feature. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (51 commits) [DLM] block dlm_recv in recovery transition [DLM] don't overwrite castparam if it's NULL [GFS2] Get superblock a different way [GFS2] Don't try to remove buffers that don't exist [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed [GFS2] Data corruption fix [GFS2] Clean up journaled data writing [GFS2] GFS2: chmod hung - fix race in thread creation [DLM] Make dlm_sendd cond_resched more [GFS2] Move inode deletion out of blocking_cb [GFS2] flocks from same process trip kernel BUG at fs/gfs2/glock.c:1118! [GFS2] Clean up gfs2_trans_add_revoke() [GFS2] Use slab operations for all gfs2_bufdata allocations [GFS2] Replace revoke structure with bufdata structure [GFS2] Fix ordering of dirty/journal for ordered buffer unstuffing [GFS2] Clean up ordered write code [GFS2] Move pin/unpin into lops.c, clean up locking [GFS2] Don't mark jdata dirty in gfs2_unstuffer_page() [GFS2] Introduce gfs2_remove_from_ail [GFS2] Correct lock ordering in unlink ...
2007-10-12Fix up more bio falloutAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-10[GFS2] Reduce number of gfs2_scand processes to oneSteven Whitehouse
We only need a single gfs2_scand process rather than the one per filesystem which we had previously. As a result the parameter determining the frequency of gfs2_scand runs becomes a module parameter rather than a mount parameter as it was before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10Drop 'size' argument from bio_endio and bi_end_ioNeilBrown
As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-09[GFS2] Fix sign problem in quota/statfs and cleanup _host structuresSteven Whitehouse
This patch fixes some sign issues which were accidentally introduced into the quota & statfs code during the endianess annotation process. Also included is a general clean up which moves all of the _host structures out of gfs2_ondisk.h (where they should not have been to start with) and into the places where they are actually used (often only one place). Also those _host structures which are not required any more are removed entirely (which is the eventual plan for all of them). The conversion routines from ondisk.c are also moved into the places where they are actually used, which for almost every one, was just one single place, so all those are now static functions. This also cleans up the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__. The net result is a reduction of about 100 lines of code, many functions now marked static plus the bug fixes as mentioned above. For good measure I ran the code through sparse after making these changes to check that there are no warnings generated. This fixes Red Hat bz #239686 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Clean up inode number handlingSteven Whitehouse
This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07[GFS2] Remove unused variableSteven Whitehouse
Remove an unused variable. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Remove local exclusive glock modeSteven Whitehouse
Here is a patch for GFS2 to remove the local exclusive flag. In the places it was used, mutex's are always held earlier in the call path, so it appears redundant in the LM_ST_SHARED case. Also, the GFS2 holders were setting local exclusive in any case where the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock code where the flag was tested have been replaced with tests for the lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well as globally exclusive). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Remove the "greedy" function from glock.[ch]Steven Whitehouse
The "greedy" code was an attempt to retain glocks for a minimum length of time when they relate to mmap()ed files. The current implementation of this feature is not, however, ideal in that it required allocating memory in order to do this and its overly complicated. It also misses the mark by ignoring the other I/O operations which are just as likely to suffer from the same problem. So the plan is to remove this now and then add the functionality back as part of the glock state machine at a later date (and thus take into account all the possible users of this feature) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Remove max_atomic_write tunableSteven Whitehouse
This removes an unused sysfs tunable parameter. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Clean up/speed up readdirSteven Whitehouse
This removes the extra filldir callback which gfs2 was using to enclose an attempt at readahead for inodes during readdir. The code was too complicated and also hurts performance badly in the case that the getdents64/readdir call isn't being followed by stat() and it wasn't even getting it right all the time when it was. As a result, on my test box an "ls" of a directory containing 250000 files fell from about 7mins (freshly mounted, so nothing cached) to between about 15 to 25 seconds. When the directory content was cached, the time taken fell from about 3mins to about 4 or 5 seconds. Interestingly in the cached case, running "ls -l" once reduced the time taken for subsequent runs of "ls" to about 6 secs even without this patch. Now it turns out that there was a special case of glocks being used for prefetching the metadata, but because of the timeouts for these locks (set to 10 secs) the metadata was being timed out before it was being used and this the prefetch code was constantly trying to prefetch the same data over and over. Calling "ls -l" meant that the inodes were brought into memory and once the inodes are cached, the glocks are not disposed of until the inodes are pushed out of the cache, thus extending the lifetime of the glocks, and thus bringing down the time for subsequent runs of "ls" considerably. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Add a comment about reading the super blockSteven Whitehouse
The comment explains why we use the bio functions to read the super block. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Srinivasa Ds <srinivasa@in.ibm.com>
2006-11-30[GFS2] Mount problem with the GFS2 codeSrinivasa Ds
While mounting the gfs2 filesystem,our test team had a problem and we got this error message. ======================================================= GFS2: fsid=: Trying to join cluster "lock_nolock", "dasde1" GFS2: fsid=dasde1.0: Joined cluster. Now mounting FS... GFS2: not a GFS2 filesystem GFS2: fsid=dasde1.0: can't read superblock: -22 ========================================================================== On debugging further we found that problem is while reading the super block(gfs2_read_super) and comparing the magic number in it. When I replace the submit_bio() call(present in gfs2_read_super) with the sb_getblk() and ll_rw_block(), mount operation succeded. On further analysis we found that before calling submit_bio(), bio->bi_sector was set to "sector" variable. This "sector" variable has the same value of bh->b_blocknr(block number). Hence there is a need to multiply this valuwith (blocksize >> 9)(9 because,sector size 2^9,samething happens in ll_rw_block also, before calling submit_bio()). So I have developed the patch which solves this problem. Please let me know your comments. ================================================================ Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Simplify glops functionsSteven Whitehouse
The go_sync callback took two flags, but one of them was set on every call, so this patch removes once of the flags and makes the previously conditional operations (on this flag), unconditional. The go_inval callback took three flags, each of which was set on every call to it. This patch removes the flags and makes the operations unconditional, which makes the logic rather more obvious. Two now unused flags are also removed from incore.h. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] split and annotate gfs2_statfs_changeAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] split and annotate gfs2_log_headAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] split gfs2_sbAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02[GFS2] Remove duplicate sb reading codeSteven Whitehouse
For some reason we had two different sets of code for reading in the superblock. This removes one of them in favour of the other. Also we don't need the temporary buffer for the sb since we already have one in the gfs2 sb itself. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-25[GFS2/DLM] Fix trailing whitespaceSteven Whitehouse
As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-19[GFS2] Export lm_interface to kernel headersFabio Massimo Di Nitto
lm_interface.h has a few out of the tree clients such as GFS1 and userland tools. Right now, these clients keeps a copy of the file in their build tree that can go out of sync. Move lm_interface.h to include/linux, export it to userland and clean up fs/gfs2 to use the new location. Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04[GFS2] Change all types to uX styleSteven Whitehouse
This makes all fixed size types have consistent names. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04[GFS2] Align all labels against LH sideSteven Whitehouse
This makes everything consistent. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04[GFS2] Tidy up bmap/inode codeSteven Whitehouse
As per Jan Engelhardt's third set of comments, this make various code style changes and moves the structures from format.h into super.c, which was the only place that format.h was actually used. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-01[GFS2] Update copyright, tidy up incore.hSteven Whitehouse
As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this updates the copyright message to say "version" in full rather than "v.2". Also incore.h has been updated to remove forward structure declarations which are not required. The gfs2_quota_lvb structure has now had endianess annotations added to it. Also quota.c has been updated so that we now store the lvb data locally in endian independant format to avoid needing a structure in host endianess too. As a result the endianess conversions are done as required at various points and thus the conversion routines in lvb.[ch] are no longer required. I've moved the one remaining constant in lvb.h thats used into lm.h and removed the unused lvb.[ch]. I have not changed the HIF_ constants. That is left to a later patch which I hope will unify the gh_flags and gh_iflags fields of the struct gfs2_holder. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-27[GFS2] Fix bug in super block reading codeSteven Whitehouse
This gets the argument to submit_bio() correct. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-27[GFS2] Use a bio to read the superblockSteven Whitehouse
This means that we don't need to create a special inode just to contain a struct address_space in order to read a single disk block. Instead we read the disk block directly. Its slightly faster, and uses slightly less memory, but the real reason for doing this is that it removes a special case from the glock code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-14[GFS2] Fix unlinked file handlingSteven Whitehouse
This patch fixes the way we have been dealing with unlinked, but still open files. It removes all limits (other than memory for inodes, as per every other filesystem) on numbers of these which we can support on GFS2. It also means that (like other fs) its the responsibility of the last process to close the file to deallocate the storage, rather than the person who did the unlinking. Note that with GFS2, those two events might take place on different nodes. Also there are a number of other changes: o We use the Linux inode subsystem as it was intended to be used, wrt allocating GFS2 inodes o The Linux inode cache is now the point which we use for local enforcement of only holding one copy of the inode in core at once (previous to this we used the glock layer). o We no longer use the unlinked "special" file. We just ignore it completely. This makes unlinking more efficient. o We now use the 4th block allocation state. The previously unused state is used to track unlinked but still open inodes. o gfs2_inoded is no longer needed o Several fields are now no longer needed (and removed) from the in core struct gfs2_inode o Several fields are no longer needed (and removed) from the in core superblock There are a number of future possible optimisations and clean ups which have been made possible by this patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18[GFS2] Update copyright date to 2006Steven Whitehouse
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18[GFS2] Remove semaphore.h from C filesSteven Whitehouse
We no longer use semaphores, everything has been converted to mutex or rwsem, so we don't need to include this header any more. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28[GFS2] [-mm patch] fs/gfs2/: possible cleanupsAdrian Bunk
This patch contains the following possible cleanups: - make needlessly global code static - #if 0 unused functions - remove the following global function that was both unused and unimplemented: - super.c: gfs2_do_upgrade() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-26[GFS2] Remove GL_NEVER_RECURSE flagSteven Whitehouse
There is no point in keeping this flag since recursion is not now allowed for any glock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-28[GFS2] Further updates to dir and logging codeSteven Whitehouse
This reduces the size of the directory code by about 3k and gets readdir() to use the functions which were introduced in the previous directory code update. Two memory allocations are merged into one. Eliminates zeroing of some buffers which were never used before they were initialised by other data. There is still scope for further improvement in the directory code. On the logging side, a hand created mutex has been replaced by a standard Linux mutex in the log allocation code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-20[GFS2] Fix bug in directory code and tidy upSteven Whitehouse
Due to a typo, the dir leaf split operation was (for the first split in a directory) writing the new hash vaules at the wrong offset. This is now fixed. Also some other tidy ups are included: - We use GFS2's hash function for dentries (see ops_dentry.c) so that we don't have to keep recalculating the hash values. - A lot of common code is eliminated between the various directory lookup routines. - Better error checking on directory lookup (previously different routines checked for different errors) - The leaf split operation has a couple of redundant operations removed from it, so it should be faster. There is still further scope for further clean ups in the directory code, and readdir in particular could do with slimming down a bit. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-01[GFS2] Tidy up mount code.Steven Whitehouse
We no longer lookup ".gfs2_admin" in the root directory in order to find it, but instead use the inode number given in the superblock. Both the root directory and the admin directory are now looked up using the same routine, so the redundant code is removed. Also, there is no longer a reference to the root inode in the GFS2 super block. When required this can be retreived via sb->s_root->d_inode instead. Assuming that we introduce a metadata filesystem type for GFS, then this is a first step towards that goal. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27[GFS2] Macros removal in gfs2.hSteven Whitehouse
As suggested by Pekka Enberg <penberg@cs.helsinki.fi>. The DIV_RU macro is renamed DIV_ROUND_UP and and moved to kernel.h The other macros are gone from gfs2.h as (although not requested by Pekka Enberg) are a number of included header file which are now included individually. The inode number comparison function is now an inline function. The DT2IF and IF2DT may be addressed in a future patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27[GFS2] 80 Column audit of GFS2Steven Whitehouse
Requested by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27[GFS2] Audit printk and kmallocSteven Whitehouse
All printk calls now have KERN_ set where required and a couple of kmalloc(), memset(.., 0, ...) calls changed to kzalloc(). This is in response to comments from: Pekka Enberg <penberg@cs.helsinki.fi> and Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-21[GFS2] Use mutices rather than semaphoresSteven Whitehouse
As well as a number of minor bug fixes, this patch changes GFS to use mutices rather than semaphores. This results in better information in case there are any locking problems. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-13[GFS2] Fix for root inode ref count bugSteven Whitehouse
Umount is now working correctly again. The bug was due to not getting an extra ref count when mounting the fs. We should have bumped it by two (once for the internal pointer to the root inode from the super block and once for the inode hanging off the dcache entry for root). Also this patch tidys up the code dealing with looking up and creating inodes. We now pass Linux inodes (with gfs2_inodes attached) rather than the other way around and this reduces code duplication in various places. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30[GFS2] Add gfs2_internal_read()Steven Whitehouse
Add the new external read function. Its temporarily in jdata.c even though the protoype is in ops_file.h - this will change shortly. The current implementation will change to a page cache one when that happens. In order to effect the above changes, the various internal inodes now have Linux inodes attached to them. We keep the references to the Linux inodes, rather than the gfs2_inodes in the super block. In order to get everything to work correctly I've had to reorder the init sequence on mount (which I should probably have done earlier when .gfs2_admin was made visible). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>