From 92e3cc91d8f51ce64a8b7c696377180953dd316e Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 9 Oct 2020 14:11:58 +0100 Subject: afs: Fix rapid cell addition/removal by not using RCU on cells tree There are a number of problems that are being seen by the rapidly mounting and unmounting an afs dynamic root with an explicit cell and volume specified (which should probably be rejected, but that's a separate issue): What the tests are doing is to look up/create a cell record for the name given and then tear it down again without actually using it to try to talk to a server. This is repeated endlessly, very fast, and the new cell collides with the old one if it's not quick enough to reuse it. It appears (as suggested by Hillf Danton) that the search through the RB tree under a read_seqbegin_or_lock() under RCU conditions isn't safe and that it's not blocking the write_seqlock(), despite taking two passes at it. He suggested that the code should take a ref on the cell it's attempting to look at - but this shouldn't be necessary until we've compared the cell names. It's possible that I'm missing a barrier somewhere. However, using an RCU search for this is overkill, really - we only need to access the cell name in a few places, and they're places where we're may end up sleeping anyway. Fix this by switching to an R/W semaphore instead. Additionally, draw the down_read() call inside the function (renamed to afs_find_cell()) since all the callers were taking the RCU read lock (or should've been[*]). [*] afs_probe_cell_name() should have been, but that doesn't appear to be involved in the bug reports. The symptoms of this look like: general protection fault, probably for non-canonical address 0xf27d208691691fdb: 0000 [#1] PREEMPT SMP KASAN KASAN: maybe wild-memory-access in range [0x93e924348b48fed8-0x93e924348b48fedf] ... RIP: 0010:strncasecmp lib/string.c:52 [inline] RIP: 0010:strncasecmp+0x5f/0x240 lib/string.c:43 afs_lookup_cell_rcu+0x313/0x720 fs/afs/cell.c:88 afs_lookup_cell+0x2ee/0x1440 fs/afs/cell.c:249 afs_parse_source fs/afs/super.c:290 [inline] ... Fixes: 989782dcdc91 ("afs: Overhaul cell database management") Reported-by: syzbot+459a5dce0b4cb70fd076@syzkaller.appspotmail.com Signed-off-by: David Howells cc: Hillf Danton cc: syzkaller-bugs@googlegroups.com --- fs/afs/super.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) (limited to 'fs/afs/super.c') diff --git a/fs/afs/super.c b/fs/afs/super.c index b552357b1d13..0be99016ecfb 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -634,9 +634,7 @@ static int afs_init_fs_context(struct fs_context *fc) ctx->net = afs_net(fc->net_ns); /* Default to the workstation cell. */ - rcu_read_lock(); - cell = afs_lookup_cell_rcu(ctx->net, NULL, 0); - rcu_read_unlock(); + cell = afs_find_cell(ctx->net, NULL, 0); if (IS_ERR(cell)) cell = NULL; ctx->cell = cell; -- cgit v1.2.3 From 88c853c3f5c0a07c5db61b494ee25152535cfeee Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 23 Jul 2019 11:24:59 +0100 Subject: afs: Fix cell refcounting by splitting the usage counter Management of the lifetime of afs_cell struct has some problems due to the usage counter being used to determine whether objects of that type are in use in addition to whether anyone might be interested in the structure. This is made trickier by cell objects being cached for a period of time in case they're quickly reused as they hold the result of a setup process that may be slow (DNS lookups, AFS RPC ops). Problems include the cached root volume from alias resolution pinning its parent cell record, rmmod occasionally hanging and occasionally producing assertion failures. Fix this by splitting the count of active users from the struct reference count. Things then work as follows: (1) The cell cache keeps +1 on the cell's activity count and this has to be dropped before the cell can be removed. afs_manage_cell() tries to exchange the 1 to a 0 with the cells_lock write-locked, and if successful, the record is removed from the net->cells. (2) One struct ref is 'owned' by the activity count. That is put when the active count is reduced to 0 (final_destruction label). (3) A ref can be held on a cell whilst it is queued for management on a work queue without confusing the active count. afs_queue_cell() is added to wrap this. (4) The queue's ref is dropped at the end of the management. This is split out into a separate function, afs_manage_cell_work(). (5) The root volume record is put after a cell is removed (at the final_destruction label) rather then in the RCU destruction routine. (6) Volumes hold struct refs, but aren't active users. (7) Both counts are displayed in /proc/net/afs/cells. There are some management function changes: (*) afs_put_cell() now just decrements the refcount and triggers the RCU destruction if it becomes 0. It no longer sets a timer to have the manager do this. (*) afs_use_cell() and afs_unuse_cell() are added to increase and decrease the active count. afs_unuse_cell() sets the management timer. (*) afs_queue_cell() is added to queue a cell with approprate refs. There are also some other fixes: (*) Don't let /proc/net/afs/cells access a cell's vllist if it's NULL. (*) Make sure that candidate cells in lookups are properly destroyed rather than being simply kfree'd. This ensures the bits it points to are destroyed also. (*) afs_dec_cells_outstanding() is now called in cell destruction rather than at "final_destruction". This ensures that cell->net is still valid to the end of the destructor. (*) As a consequence of the previous two changes, move the increment of net->cells_outstanding that was at the point of insertion into the tree to the allocation routine to correctly balance things. Fixes: 989782dcdc91 ("afs: Overhaul cell database management") Signed-off-by: David Howells --- fs/afs/super.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) (limited to 'fs/afs/super.c') diff --git a/fs/afs/super.c b/fs/afs/super.c index 0be99016ecfb..e72c223f831d 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -294,7 +294,7 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) cellnamesz, cellnamesz, cellname ?: ""); return PTR_ERR(cell); } - afs_put_cell(ctx->net, ctx->cell); + afs_unuse_cell(ctx->net, ctx->cell); ctx->cell = cell; } @@ -389,8 +389,8 @@ static int afs_validate_fc(struct fs_context *fc) _debug("switch to alias"); key_put(ctx->key); ctx->key = NULL; - cell = afs_get_cell(ctx->cell->alias_of); - afs_put_cell(ctx->net, ctx->cell); + cell = afs_use_cell(ctx->cell->alias_of); + afs_unuse_cell(ctx->net, ctx->cell); ctx->cell = cell; goto reget_key; } @@ -508,7 +508,7 @@ static struct afs_super_info *afs_alloc_sbi(struct fs_context *fc) if (ctx->dyn_root) { as->dyn_root = true; } else { - as->cell = afs_get_cell(ctx->cell); + as->cell = afs_use_cell(ctx->cell); as->volume = afs_get_volume(ctx->volume, afs_volume_trace_get_alloc_sbi); } @@ -521,7 +521,7 @@ static void afs_destroy_sbi(struct afs_super_info *as) if (as) { struct afs_net *net = afs_net(as->net_ns); afs_put_volume(net, as->volume, afs_volume_trace_put_destroy_sbi); - afs_put_cell(net, as->cell); + afs_unuse_cell(net, as->cell); put_net(as->net_ns); kfree(as); } @@ -607,7 +607,7 @@ static void afs_free_fc(struct fs_context *fc) afs_destroy_sbi(fc->s_fs_info); afs_put_volume(ctx->net, ctx->volume, afs_volume_trace_put_free_fc); - afs_put_cell(ctx->net, ctx->cell); + afs_unuse_cell(ctx->net, ctx->cell); key_put(ctx->key); kfree(ctx); } -- cgit v1.2.3 From dca54a7bbb8ca9148ae10d60c66c926e222a9c4b Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 13 Oct 2020 20:51:59 +0100 Subject: afs: Add tracing for cell refcount and active user count Add a tracepoint to log the cell refcount and active user count and pass in a reason code through various functions that manipulate these counters. Additionally, a helper function, afs_see_cell(), is provided to log interesting places that deal with a cell without actually doing any accounting directly. Signed-off-by: David Howells --- fs/afs/super.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) (limited to 'fs/afs/super.c') diff --git a/fs/afs/super.c b/fs/afs/super.c index e72c223f831d..ac4e3ed4e9bd 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -294,7 +294,8 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) cellnamesz, cellnamesz, cellname ?: ""); return PTR_ERR(cell); } - afs_unuse_cell(ctx->net, ctx->cell); + afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_parse); + afs_see_cell(cell, afs_cell_trace_see_source); ctx->cell = cell; } @@ -389,8 +390,9 @@ static int afs_validate_fc(struct fs_context *fc) _debug("switch to alias"); key_put(ctx->key); ctx->key = NULL; - cell = afs_use_cell(ctx->cell->alias_of); - afs_unuse_cell(ctx->net, ctx->cell); + cell = afs_use_cell(ctx->cell->alias_of, + afs_cell_trace_use_fc_alias); + afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_fc); ctx->cell = cell; goto reget_key; } @@ -508,7 +510,7 @@ static struct afs_super_info *afs_alloc_sbi(struct fs_context *fc) if (ctx->dyn_root) { as->dyn_root = true; } else { - as->cell = afs_use_cell(ctx->cell); + as->cell = afs_use_cell(ctx->cell, afs_cell_trace_use_sbi); as->volume = afs_get_volume(ctx->volume, afs_volume_trace_get_alloc_sbi); } @@ -521,7 +523,7 @@ static void afs_destroy_sbi(struct afs_super_info *as) if (as) { struct afs_net *net = afs_net(as->net_ns); afs_put_volume(net, as->volume, afs_volume_trace_put_destroy_sbi); - afs_unuse_cell(net, as->cell); + afs_unuse_cell(net, as->cell, afs_cell_trace_unuse_sbi); put_net(as->net_ns); kfree(as); } @@ -607,7 +609,7 @@ static void afs_free_fc(struct fs_context *fc) afs_destroy_sbi(fc->s_fs_info); afs_put_volume(ctx->net, ctx->volume, afs_volume_trace_put_free_fc); - afs_unuse_cell(ctx->net, ctx->cell); + afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_fc); key_put(ctx->key); kfree(ctx); } @@ -634,7 +636,7 @@ static int afs_init_fs_context(struct fs_context *fc) ctx->net = afs_net(fc->net_ns); /* Default to the workstation cell. */ - cell = afs_find_cell(ctx->net, NULL, 0); + cell = afs_find_cell(ctx->net, NULL, 0, afs_cell_trace_use_fc); if (IS_ERR(cell)) cell = NULL; ctx->cell = cell; -- cgit v1.2.3