Age | Commit message (Collapse) | Author |
|
When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a
connection instance, its reference count value is increased if
it's found. But subsequently if it's found that the connection is
closed, the work of sending message is not queued into its server
send workqueue, and the connection reference count is not decreased.
This will cause a reference count leak. To reproduce this problem,
an application would need to open and closes topology server
connections with high intensity.
We fix this by immediately decrementing the connection reference
count if a send fails due to the connection being closed.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently connection shutdown callback function is called when
connection instance is released in tipc_conn_kref_release(), and
receiving packets and sending packets are running in different
threads. Even if connection is closed by the thread of receiving
packets, its shutdown callback may not be called immediately as
the connection reference count is non-zero at that moment. So,
although the connection is shut down by the thread of receiving
packets, the thread of sending packets doesn't know it. Before
its shutdown callback is invoked to tell the sending thread its
connection has been closed, the sending thread may deliver
messages by tipc_conn_sendmsg(), this is why the following error
information appears:
"Sending subscription event failed, no memory"
To eliminate it, allow connection shutdown callback function to
be called before connection id is removed in tipc_close_conn(),
which makes the sending thread know the truth in time that its
socket is closed so that it doesn't send message to it. We also
remove the "Sending XXX failed..." error reporting for topology
and config services.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When tipc module is inserted, many tipc components are initialized
one by one. During the initialization period, if one of them is
failed, tipc_core_stop() will be called to stop all components
whatever corresponding components are created or not. To avoid to
release uncreated ones, relevant components have to add necessary
enabled flags indicating whether they are created or not.
But in the initialization stage, if one component is unsuccessfully
created, we will just destroy successfully created components before
the failed component instead of all components. All enabled flags
defined in components, in turn, become redundant. Additionally it's
also unnecessary to identify whether table.types is NULL in
tipc_nametbl_stop() because name stable has been definitely created
successfully when tipc_nametbl_stop() is called.
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When creation of TIPC internal server socket fails,
we get an oops with the following dump:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc]
PGD 13719067 PUD 12008067 PMD 0
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: tipc(+)
CPU: 4 PID: 4340 Comm: insmod Not tainted 3.10.0+ #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
task: ffff880014360000 ti: ffff88001374c000 task.ti: ffff88001374c000
RIP: 0010:[<ffffffffa0011f49>] [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc]
RSP: 0018:ffff88001374dc98 EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff880012ac09d8 RCX: 0000000000000000
RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff880014360000
RBP: ffff88001374dcb8 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0016fa0
R13: ffffffffa0017010 R14: ffffffffa0017010 R15: ffff880012ac09d8
FS: 0000000000000000(0000) GS:ffff880016600000(0063) knlGS:00000000f76668d0
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000012227000 CR4: 00000000000006e0
Stack:
ffff88001374dcb8 ffffffffa0016fa0 0000000000000000 0000000000000001
ffff88001374dcf8 ffffffffa0012922 ffff88001374dce8 00000000ffffffea
ffffffffa0017100 0000000000000000 ffff8800134241a8 ffffffffa0017150
Call Trace:
[<ffffffffa0012922>] tipc_server_stop+0xa2/0x1b0 [tipc]
[<ffffffffa0009995>] tipc_subscr_stop+0x15/0x20 [tipc]
[<ffffffffa00130f5>] tipc_core_stop+0x1d/0x33 [tipc]
[<ffffffffa001f0d4>] tipc_init+0xd4/0xf8 [tipc]
[<ffffffffa001f000>] ? 0xffffffffa001efff
[<ffffffff8100023f>] do_one_initcall+0x3f/0x150
[<ffffffff81082f4d>] ? __blocking_notifier_call_chain+0x7d/0xd0
[<ffffffff810cc58a>] load_module+0x11aa/0x19c0
[<ffffffff810c8d60>] ? show_initstate+0x50/0x50
[<ffffffff8190311c>] ? retint_restore_args+0xe/0xe
[<ffffffff810cce79>] SyS_init_module+0xd9/0x110
[<ffffffff8190dc65>] sysenter_dispatch+0x7/0x1f
Code: 6c 24 70 4c 89 ef e8 b7 04 8f e1 8b 73 04 4c 89 e7 e8 7c 9e 32 e1 41 83 ac 24
b8 00 00 00 01 4c 89 ef e8 eb 0a 8f e1 48 8b 43 08 <4c> 8b 68 20 4d 8d a5 48 03 00
00 4c 89 e7 e8 04 05 8f e1 4c 89
RIP [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc]
RSP <ffff88001374dc98>
CR2: 0000000000000020
---[ end trace b02321f40e4269a3 ]---
We have the following call chain:
tipc_core_start()
ret = tipc_subscr_start()
ret = tipc_server_start(){
server->enabled = 1;
ret = tipc_open_listening_sock()
}
I.e., the server->enabled flag is unconditionally set to 1, whatever
the return value of tipc_open_listening_sock().
This causes a crash when tipc_core_start() tries to clean up
resources after a failed initialization:
if (ret == failed)
tipc_subscr_stop()
tipc_server_stop(){
if (server->enabled)
tipc_close_conn(){
NULL reference of con->sock-sk
OOPS!
}
}
To avoid this, tipc_server_start() should only set server->enabled
to 1 in case of a succesful socket creation. In case of failure, it
should release all allocated resources before returning.
Problem introduced in commit c5fa7b3cf3cb22e4ac60485fc2dc187fe012910f
("tipc: introduce new TIPC server infrastructure") in v3.11-rc1.
Note that it won't be seen often; it takes a module load under memory
constrained conditions in order to trigger the failure condition.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
TIPC has two internal servers, one providing a subscription
service for topology events, and another providing the
configuration interface. These servers have previously been running
in BH context, accessing the TIPC-port (aka native) API directly.
Apart from these servers, even the TIPC socket implementation is
partially built on this API.
As this API may simultaneously be called via different paths and in
different contexts, a complex and costly lock policiy is required
in order to protect TIPC internal resources.
To eliminate the need for this complex lock policiy, we introduce
a new, generic service API that uses kernel sockets for message
passing instead of the native API. Once the toplogy and configuration
servers are converted to use this new service, all code pertaining
to the native API can be removed. This entails a significant
reduction in code amount and complexity, and opens up for a complete
rework of the locking policy in TIPC.
The new service also solves another problem:
As the current topology server works in BH context, it cannot easily
be blocked when sending of events fails due to congestion. In such
cases events may have to be silently dropped, something that is
unacceptable. Therefore, the new service keeps a dedicated outbound
queue receiving messages from BH context. Once messages are
inserted into this queue, we will immediately schedule a work from a
special workqueue. This way, messages/events from the topology server
are in reality sent in process context, and the server can block
if necessary.
Analogously, there is a new workqueue for receiving messages. Once a
notification about an arriving message is received in BH context, we
schedule a work from the receive workqueue to do the job of
receiving the message in process context.
As both sending and receive messages are now finished in processes,
subscribed events cannot be dropped any more.
As of this commit, this new server infrastructure is built, but
not actually yet called by the existing TIPC code, but since the
conversion changes required in order to use it are significant,
the addition is kept here as a separate commit.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|