Age | Commit message (Collapse) | Author |
|
An earlier patch changed lookup side to also net_eq() namespaces after
obtaining a reference on the conntrack, so a single kmemcache can be used.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
We already include netns address in the hash, so we only need to use
net_eq in find_appropriate_src and can then put all entries into
same table.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Will be needed soon when we place all in the same hash table.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
TC_ACT_STOLEN is used when ingress traffic is mirred/redirected
to say ifb.
Packet is not dropped, but consumed.
Only TC_ACT_SHOT is a clear indication something went wrong.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On small embedded routers, one wants to control maximal amount of
memory used by fq_codel, instead of controlling number of packets or
bytes, since GRO/TSO make these not practical.
Assuming skb->truesize is accurate, we have to keep track of
skb->truesize sum for skbs in queue.
This patch adds a new TCA_FQ_CODEL_MEMORY_LIMIT attribute.
I chose a default value of 32 MBytes, which looks reasonable even
for heavy duty usages. (Prior fq_codel users should not be hurt
when they upgrade their kernels)
Two fields are added to tc_fq_codel_qd_stats to report :
- Current memory usage
- Number of drops caused by memory limits
# tc qd replace dev eth1 root est 1sec 4sec fq_codel memory_limit 4M
..
# tc -s -d qd sh dev eth1
qdisc fq_codel 8008: root refcnt 257 limit 10240p flows 1024
quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
Sent 2083566791363 bytes 1376214889 pkt (dropped 4994406, overlimits 0
requeues 21705223)
rate 9841Mbit 812549pps backlog 3906120b 376p requeues 21705223
maxpacket 68130 drop_overlimit 4994406 new_flow_count 28855414
ecn_mark 0 memory_used 4190048 drop_overmemory 4994406
new_flows_len 1 old_flows_len 177
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Dave Täht <dave.taht@gmail.com>
Cc: Sebastian Möller <moeller0@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add an implementation of Qualcomm's IPC router protocol, used to
communicate with service providing remote processors.
Signed-off-by: Courtney Cavin <courtney.cavin@sonymobile.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
[bjorn: Cope with 0 being a valid node id and implement RTM_NEWADDR]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
UDP tunnel segmentation code relies on the inner offsets being set for
an UDP tunnel GSO packet, but the inner *_complete() functions will
set the inner offsets only if 'encapsulation' is set before calling
them. Currently, udp_gro_complete() sets 'encapsulation' only after
the inner *_complete() functions are done. This causes the inner
offsets having invalid values after udp_gro_complete() returns, which
in turn will make it impossible to properly segment the packet in case
it needs to be forwarded, which would be visible to the user either as
invalid packets being sent or as packet loss.
This patch fixes this by setting skb's 'encapsulation' in
udp_gro_complete() before calling into the inner complete functions,
and by making each possible UDP tunnel gro_complete() callback set the
inner_mac_header to the beginning of the tunnel payload.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Reviewed-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The setting of the UDP tunnel GSO type is already performed by
udp[46]_gro_complete().
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I forgot that ip_send_unicast_reply() is not BH safe (yet).
Disabling preemption before calling it was not a good move.
Fixes: c10d9310edf5 ("tcp: do not assume TCP code is non preemptible")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andres Lagar-Cavilla <andreslc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
allow cls_bpf and act_bpf programs access skb->data and skb->data_end pointers.
The bpf helpers that change skb->data need to update data_end pointer as well.
The verifier checks that programs always reload data, data_end pointers
after calls to such bpf helpers.
We cannot add 'data_end' pointer to struct qdisc_skb_cb directly,
since it's embedded as-is by infiniband ipoib, so wrapper struct is needed.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tables have to exist for VRFs to function. Ensure they exist
when VRF device is created.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Responses for packets to unused ports are getting lost with L3 domains.
IPv4 has ip_send_unicast_reply for sending TCP responses which accounts
for L3 domains; update the IPv6 counterpart tcp_v6_send_response.
For icmp the L3 master check needs to be moved up in icmp6_send
to properly respond to UDP packets to a port with no listener.
Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With the newly introduced helper functions the skb pulling is hidden
in the checksumming function - and undone before returning to the
caller.
The IGMP and MLD query parsing functions in the bridge still
assumed that the skb is pointing to the beginning of the IGMP/MLD
message while it is now kept at the beginning of the IPv4/6 header.
If there is a querier somewhere else, then this either causes
the multicast snooping to stay disabled even though it could be
enabled. Or, if we have the querier enabled too, then this can
create unnecessary IGMP / MLD query messages on the link.
Fixing this by taking the offset between IP and IGMP/MLD header into
account, too.
Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next
Simon Horman says:
====================
Second Round of IPVS Updates for v4.7
please consider these enhancements to the IPVS. They allow its DoS
mitigation strategy effective in conjunction with the SIP persistence
engine.
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
We already include netns address in the hash and compare the netns pointers
during lookup, so even if namespaces have overlapping addresses entries
will be spread across the expectation table.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
DoS protection policy that deletes connections to avoid out of memory is
currently not effective for SIP-pe plus OPS-mode for two reasons:
1) connection templates (holding SIP call-id) are always skipped in
ip_vs_random_dropentry()
2) in_pkts counter (used by drop_entry algorithm) is not incremented
for connection templates
This patch addresses such problems with the following changes:
a) connection templates associated (via their dest) to virtual-services
configured in OPS mode are included in ip_vs_random_dropentry()
monitoring. This applies to SIP-pe over UDP (which requires OPS mode),
but is more general principle: when OPS is controlled by templates
memory can be used only by templates themselves, since OPS conns are
deleted after packet is forwarded.
b) OPS connections, if controlled by a template, cause increment of
in_pkts counter of their template. This is already happening but only
in case director is in master-slave mode (see ip_vs_sync_conn()).
Signed-off-by: Marco Angaroni <marcoangaroni@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
get_bridge_ifindices() is used from the old "deviceless" bridge ioctl
calls which aren't called with rtnl held. The comment above says that it is
called with rtnl but that is not really the case.
Here's a sample output from a test ASSERT_RTNL() which I put in
get_bridge_ifindices and executed "brctl show":
[ 957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30)
[ 957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G W O
4.6.0-rc4+ #157
[ 957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[ 957.423009] 0000000000000000 ffff880058adfdf0 ffffffff8138dec5
0000000000000400
[ 957.423009] ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32
0000000000000001
[ 957.423009] 00007ffec1a444b0 0000000000000400 ffff880053c19130
0000000000008940
[ 957.423009] Call Trace:
[ 957.423009] [<ffffffff8138dec5>] dump_stack+0x85/0xc0
[ 957.423009] [<ffffffffa05ead32>]
br_ioctl_deviceless_stub+0x212/0x2e0 [bridge]
[ 957.423009] [<ffffffff81515beb>] sock_ioctl+0x22b/0x290
[ 957.423009] [<ffffffff8126ba75>] do_vfs_ioctl+0x95/0x700
[ 957.423009] [<ffffffff8126c159>] SyS_ioctl+0x79/0x90
[ 957.423009] [<ffffffff8163a4c0>] entry_SYSCALL_64_fastpath+0x23/0xc1
Since it only reads bridge ifindices, we can use rcu to safely walk the net
device list. Also remove the wrong rtnl comment above.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The peer may be expecting a reply having sent a request and then done a
shutdown(SHUT_WR), so tearing down the whole socket at this point seems
wrong and breaks for me with a client which does a SHUT_WR.
Looking at other socket family's stream_recvmsg callbacks doing a shutdown
here does not seem to be the norm and removing it does not seem to have
had any adverse effects that I can see.
I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact
on the vmci transport.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Andy King <acking@vmware.com>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: Adit Ranadive <aditr@vmware.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, we support set names of up to 16 bytes, get this aligned
with the maximum length we can use in ipset to make it easier when
considering migration to nf_tables.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The dprintf() and duprintf() functions are enabled at compile time,
these days we have better runtime debugging through pr_debug() and
static keys.
On top of this, this debugging is so old that I don't expect anyone
using this anymore, so let's get rid of this.
IP_NF_ASSERT() is still left in place, although this needs that
NETFILTER_DEBUG is enabled, I think these assertions provide useful
context information when reading the code.
Note that ARP_NF_ASSERT() has been removed as there is no user of
this.
Kill also DEBUG_ALLOW_ALL and a couple of pr_error() and pr_debug()
spots that are inconsistently placed in the code.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
If the protocol is not natively supported, this assigns generic protocol
tracker so we can always assume a valid pointer after these calls.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
|
|
This patch introduces nf_ct_resolve_clash() to resolve race condition on
conntrack insertions.
This is particularly a problem for connection-less protocols such as
UDP, with no initial handshake. Two or more packets may race to insert
the entry resulting in packet drops.
Another problematic scenario are packets enqueued to userspace via
NFQUEUE after the raw table, that make it easier to trigger this
race.
To resolve this, the idea is to reset the conntrack entry to the one
that won race. Packet and bytes counters are also merged.
The 'insert_failed' stats still accounts for this situation, after
this patch, the drop counter is bumped whenever we drop packets, so we
can watch for unresolved clashes.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Introduce a helper function to update conntrack counters.
__nf_ct_kill_acct() was unnecessarily subtracting skb_network_offset()
that is expected to be zero from the ipv4/ipv6 hooks.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Remove unnecessary check for non-nul pointer in destroy_conntrack()
given that __nf_ct_l4proto_find() returns the generic protocol tracker
if the protocol is not supported.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When iterating, skip conntrack entries living in a different netns.
We could ignore netns and kill some other non-assured one, but it
has two problems:
- a netns can kill non-assured conntracks in other namespace
- we would start to 'over-subscribe' the affected/overlimit netns.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
We already include netns address in the hash and compare the netns pointers
during lookup, so even if namespaces have overlapping addresses entries
will be spread across the table.
Assuming 64k bucket size, this change saves 0.5 mbyte per namespace on a
64bit system.
NAT bysrc and expectation hash is still per namespace, those will
changed too soon.
Future patch will also make conntrack object slab cache global again.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Once we place all conntracks into a global hash table we want them to be
spread across entire hash table, even if namespaces have overlapping ip
addresses.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Once we place all conntracks in the same hash table we must also compare
the netns pointer to skip conntracks that belong to a different namespace.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The iteration process is lockless, so we test if the conntrack object is
eligible for printing (e.g. is AF_INET) after obtaining the reference
count.
Once we put all conntracks into same hash table we might see more
entries that need to be skipped.
So add a helper and first perform the test in a lockless fashion
for fast skip.
Once we obtain the reference count, just repeat the check.
Note that this refactoring also includes a missing check for unconfirmed
conntrack entries due to slab rcu object re-usage, so they need to be
skipped since they are not part of the listing.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This prepares for upcoming change that places all conntracks into a
single, global table. For this to work we will need to also compare
net pointer during lookup. To avoid open-coding such check use the
nf_ct_key_equal helper and then later extend it to also consider net_eq.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Once we place all conntracks into same table iteration becomes more
costly because the table contains conntracks that we are not interested
in (belonging to other netns).
So don't bother scanning if the current namespace has no entries.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When resizing the conntrack hash table at runtime via
echo 42 > /sys/module/nf_conntrack/parameters/hashsize, we are racing with
the conntrack lookup path -- reads can happen in parallel and nothing
prevents readers from observing a the newly allocated hash but the old
size (or vice versa).
So access to hash[bucket] can trigger OOB read access in case the table got
expanded and we saw the new size but the old hash pointer (or it got shrunk
and we got new hash ptr but the size of the old and larger table):
kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN
CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.6.0-rc2+ #107
[..]
Call Trace:
[<ffffffff822c3d6a>] ? nf_conntrack_tuple_taken+0x12a/0xe90
[<ffffffff822c3ac1>] ? nf_ct_invert_tuplepr+0x221/0x3a0
[<ffffffff8230e703>] get_unique_tuple+0xfb3/0x2760
Use generation counter to obtain the address/length of the same table.
Also add a synchronize_net before freeing the old hash.
AFAICS, without it we might access ct_hash[bucket] after ct_hash has been
freed, provided that lockless reader got delayed by another event:
CPU1 CPU2
seq_begin
seq_retry
<delay> resize occurs
free oldhash
for_each(oldhash[size])
Note that resize is only supported in init_netns, it took over 2 minutes
of constant resizing+flooding to produce the warning, so this isn't a
big problem in practice.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
No need to disable BH here anymore:
stats are switched to _ATOMIC variant (== this_cpu_inc()), which
nowadays generates same code as the non _ATOMIC NF_STAT, at least on x86.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Conntrack labels are currently sized depending on the iptables
ruleset, i.e. if we're asked to test or set bits 1, 2, and 65 then we
would allocate enough room to store at least bit 65.
However, with nft, the input is just a register with arbitrary runtime
content.
We therefore ask for the upper ceiling we currently have, which is
enough room to store 128 bits.
Alternatively, we could alter nf_connlabel_replace to increase
net->ct.label_words at run time, but since 128 bits is not that
big we'd only save sizeof(long) so it doesn't seem worth it for now.
This follows a similar approach that xtables 'connlabel'
match uses, so when user inputs
ct label set bar
then we will set the bit used by the 'bar' label and leave the rest alone.
This is done by passing the sreg content to nf_connlabels_replace
as both value and mask argument.
Labels (bits) already set thus cannot be re-set to zero, but
this is not supported by xtables connlabel match either.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
percpu_counter only have protection against preemption.
TCP stack uses them possibly from BH, so we need BH protection
in contexts that could be run in process context
Fixes: c10d9310edf5 ("tcp: do not assume TCP code is non preemptible")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__inet_twsk_hashdance() might be called from process context,
better block BH before acquiring bind hash and established locks
Fixes: c10d9310edf5 ("tcp: do not assume TCP code is non preemptible")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tcp_snd_una_update() and tcp_rcv_nxt_update() call
u64_stats_update_begin() either from process context or BH handler.
This triggers a lockdep splat on 32bit & SMP builds.
We could add u64_stats_update_begin_bh() variant but this would
slow down 32bit builds with useless local_disable_bh() and
local_enable_bh() pairs, since we own the socket lock at this point.
I add sock_owned_by_me() helper to have proper lockdep support
even on 64bit builds, and new u64_stats_update_begin_raw()
and u64_stats_update_end_raw methods.
Fixes: c10d9310edf5 ("tcp: do not assume TCP code is non preemptible")
Reported-by: Fabio Estevam <festevam@gmail.com>
Diagnosed-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2016-05-04
1) The flowcache can hit an OOM condition if too
many entries are in the gc_list. Fix this by
counting the entries in the gc_list and refuse
new allocations if the value is too high.
2) The inner headers are invalid after a xfrm transformation,
so reset the skb encapsulation field to ensure nobody tries
access the inner headers. Otherwise tunnel devices stacked
on top of xfrm may build the outer headers based on wrong
informations.
3) Add pmtu handling to vti, we need it to report
pmtu informations for local generated packets.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Antonio Quartulli says:
====================
pull request: batman-adv 20160504
In this pull request you have:
- two changes to the MAINTAINERS file where one marks our mailing list
as moderated and the other adds a missing documentation file
- kernel-doc fixes
- code refactoring and various cleanups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The stack object “map” has a total size of 32 bytes. Its last 4
bytes are padding generated by compiler. These padding bytes are
not initialized and sent out via “nla_put”.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The stack object “info” has a total size of 12 bytes. Its last byte
is padding which is not initialized and leaked via “put_cmsg”.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
previous patches removed all direct accesses to dev->trans_start,
so change the netif_trans_update helper to update trans_start of
netdev queue 0 instead and then remove trans_start from struct net_device.
AFAICS a lot of the netif_trans_update() invocations are now useless
because they occur in ndo_start_xmit and driver doesn't set LLTX
(i.e. stack already took care of the update).
As I can't test any of them it seems better to just leave them alone.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace all trans_start updates with netif_trans_update helper.
change was done via spatch:
struct net_device *d;
@@
- d->trans_start = jiffies
+ netif_trans_update(d)
Compile tested only.
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: linux-xtensa@linux-xtensa.org
Cc: linux1394-devel@lists.sourceforge.net
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: MPT-FusionLinux.pdl@broadcom.com
Cc: linux-scsi@vger.kernel.org
Cc: linux-can@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Cc: linux-hams@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: linux-bluetooth@vger.kernel.org
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ipv6 gre implementation was cleaned up to share more code
with the ipv4 version, but it can be enabled even when NET_IPGRE_DEMUX
is disabled, resulting in a link error:
net/built-in.o: In function `gre_rcv':
:(.text+0x17f5d0): undefined reference to `gre_parse_header'
ERROR: "gre_parse_header" [net/ipv6/ip6_gre.ko] undefined!
This adds a Kconfig dependency to prevent that now invalid
configuration.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 308edfdf1563 ("gre6: Cleanup GREv6 receive path, call common GRE functions")
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For ipgre interfaces in collect metadata mode, receive also traffic with
encapsulated Ethernet headers. The lwtunnel users are supposed to sort this
out correctly. This allows to have mixed Ethernet + L3-only traffic on the
same lwtunnel interface. This is the same way as VXLAN-GPE behaves.
To keep backwards compatibility and prevent any surprises, gretap interfaces
have priority in receiving packets with Ethernet headers.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This will allow to make the pull dependent on the tunnel type.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The call to gre_parse_header is either followed by iptunnel_pull_header, or
in the case of ICMP error path, the actual header is not accessed at all.
In the first case, iptunnel_pull_header will call pskb_may_pull anyway and
it's pointless to do it twice. The only difference is what call will fail
with what error code but the net effect is still the same in all call sites.
In the second case, pskb_may_pull is pointless, as skb->data is at the outer
IP header and not at the GRE header.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change makes it so that we will strip the TSO_MANGLEID bit if TSO is
not present. This way we will also handle ECN correctly of TSO is not
present.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|