diff options
author | David S. Miller <davem@davemloft.net> | 2016-11-13 22:41:25 -0500 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2016-11-13 22:41:25 -0500 |
commit | 7d384846b9987f7b611357adf3cdfecfdcf0c402 (patch) | |
tree | 5bff8b40efeea1a177e73905e6190ad6180bacb0 /include | |
parent | 8d419324efeb1ac9b99a226643859a6636bc2aec (diff) | |
parent | eb1a6bdc28268afa964e4c9f3399961dff9fd691 (diff) |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains a second batch of Netfilter updates for
your net-next tree. This includes a rework of the core hook
infrastructure that improves Netfilter performance by ~15% according to
synthetic benchmarks. Then, a large batch with ipset updates, including
a new hash:ipmac set type, via Jozsef Kadlecsik. This also includes a
couple of assorted updates.
Regarding the core hook infrastructure rework to improve performance,
using this simple drop-all packets ruleset from ingress:
nft add table netdev x
nft add chain netdev x y { type filter hook ingress device eth0 priority 0\; }
nft add rule netdev x y drop
And generating traffic through Jesper Brouer's
samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh script using -i
option. perf report shows nf_tables calls in its top 10:
17.30% kpktgend_0 [nf_tables] [k] nft_do_chain
15.75% kpktgend_0 [kernel.vmlinux] [k] __netif_receive_skb_core
10.39% kpktgend_0 [nf_tables_netdev] [k] nft_do_chain_netdev
I'm measuring here an improvement of ~15% in performance with this
patchset, so we got +2.5Mpps more. I have used my old laptop Intel(R)
Core(TM) i5-3320M CPU @ 2.60GHz 4-cores.
This rework contains more specifically, in strict order, these patches:
1) Remove compile-time debugging from core.
2) Remove obsolete comments that predate the rcu era. These days it is
well known that a Netfilter hook always runs under rcu_read_lock().
3) Remove threshold handling, this is only used by br_netfilter too.
We already have specific code to handle this from br_netfilter,
so remove this code from the core path.
4) Deprecate NF_STOP, as this is only used by br_netfilter.
5) Place nf_state_hook pointer into xt_action_param structure, so
this structure fits into one single cacheline according to pahole.
This also implicit affects nftables since it also relies on the
xt_action_param structure.
6) Move state->hook_entries into nf_queue entry. The hook_entries
pointer is only required by nf_queue(), so we can store this in the
queue entry instead.
7) use switch() statement to handle verdict cases.
8) Remove hook_entries field from nf_hook_state structure, this is only
required by nf_queue, so store it in nf_queue_entry structure.
9) Merge nf_iterate() into nf_hook_slow() that results in a much more
simple and readable function.
10) Handle NF_REPEAT away from the core, so far the only client is
nf_conntrack_in() and we can restart the packet processing using a
simple goto to jump back there when the TCP requires it.
This update required a second pass to fix fallout, fix from
Arnd Bergmann.
11) Set random seed from nft_hash when no seed is specified from
userspace.
12) Simplify nf_tables expression registration, in a much smarter way
to save lots of boiler plate code, by Liping Zhang.
13) Simplify layer 4 protocol conntrack tracker registration, from
Davide Caratti.
14) Missing CONFIG_NF_SOCKET_IPV4 dependency for udp4_lib_lookup, due
to recent generalization of the socket infrastructure, from Arnd
Bergmann.
15) Then, the ipset batch from Jozsef, he describes it as it follows:
* Cleanup: Remove extra whitespaces in ip_set.h
* Cleanup: Mark some of the helpers arguments as const in ip_set.h
* Cleanup: Group counter helper functions together in ip_set.h
* struct ip_set_skbinfo is introduced instead of open coded fields
in skbinfo get/init helper funcions.
* Use kmalloc() in comment extension helper instead of kzalloc()
because it is unnecessary to zero out the area just before
explicit initialization.
* Cleanup: Split extensions into separate files.
* Cleanup: Separate memsize calculation code into dedicated function.
* Cleanup: group ip_set_put_extensions() and ip_set_get_extensions()
together.
* Add element count to hash headers by Eric B Munson.
* Add element count to all set types header for uniform output
across all set types.
* Count non-static extension memory into memsize calculation for
userspace.
* Cleanup: Remove redundant mtype_expire() arguments, because
they can be get from other parameters.
* Cleanup: Simplify mtype_expire() for hash types by removing
one level of intendation.
* Make NLEN compile time constant for hash types.
* Make sure element data size is a multiple of u32 for the hash set
types.
* Optimize hash creation routine, exit as early as possible.
* Make struct htype per ipset family so nets array becomes fixed size
and thus simplifies the struct htype allocation.
* Collapse same condition body into a single one.
* Fix reported memory size for hash:* types, base hash bucket structure
was not taken into account.
* hash:ipmac type support added to ipset by Tomasz Chilinski.
* Use setup_timer() and mod_timer() instead of init_timer()
by Muhammad Falak R Wani, individually for the set type families.
16) Remove useless connlabel field in struct netns_ct, patch from
Florian Westphal.
17) xt_find_table_lock() doesn't return ERR_PTR() anymore, so simplify
{ip,ip6,arp}tables code that uses this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'include')
-rw-r--r-- | include/linux/netfilter.h | 58 | ||||
-rw-r--r-- | include/linux/netfilter/ipset/ip_set.h | 136 | ||||
-rw-r--r-- | include/linux/netfilter/ipset/ip_set_bitmap.h | 2 | ||||
-rw-r--r-- | include/linux/netfilter/ipset/ip_set_comment.h | 11 | ||||
-rw-r--r-- | include/linux/netfilter/ipset/ip_set_counter.h | 75 | ||||
-rw-r--r-- | include/linux/netfilter/ipset/ip_set_skbinfo.h | 46 | ||||
-rw-r--r-- | include/linux/netfilter/ipset/ip_set_timeout.h | 4 | ||||
-rw-r--r-- | include/linux/netfilter/x_tables.h | 48 | ||||
-rw-r--r-- | include/linux/netfilter_ingress.h | 4 | ||||
-rw-r--r-- | include/net/netfilter/nf_conntrack_l4proto.h | 18 | ||||
-rw-r--r-- | include/net/netfilter/nf_queue.h | 1 | ||||
-rw-r--r-- | include/net/netfilter/nf_tables.h | 36 | ||||
-rw-r--r-- | include/net/netfilter/nf_tables_core.h | 33 | ||||
-rw-r--r-- | include/net/netns/conntrack.h | 1 | ||||
-rw-r--r-- | include/uapi/linux/netfilter.h | 2 |
15 files changed, 259 insertions, 216 deletions
diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h index abc7fdcb9eb1..69230140215b 100644 --- a/include/linux/netfilter.h +++ b/include/linux/netfilter.h @@ -49,13 +49,11 @@ struct sock; struct nf_hook_state { unsigned int hook; - int thresh; u_int8_t pf; struct net_device *in; struct net_device *out; struct sock *sk; struct net *net; - struct nf_hook_entry __rcu *hook_entries; int (*okfn)(struct net *, struct sock *, struct sk_buff *); }; @@ -82,9 +80,8 @@ struct nf_hook_entry { }; static inline void nf_hook_state_init(struct nf_hook_state *p, - struct nf_hook_entry *hook_entry, unsigned int hook, - int thresh, u_int8_t pf, + u_int8_t pf, struct net_device *indev, struct net_device *outdev, struct sock *sk, @@ -92,13 +89,11 @@ static inline void nf_hook_state_init(struct nf_hook_state *p, int (*okfn)(struct net *, struct sock *, struct sk_buff *)) { p->hook = hook; - p->thresh = thresh; p->pf = pf; p->in = indev; p->out = outdev; p->sk = sk; p->net = net; - RCU_INIT_POINTER(p->hook_entries, hook_entry); p->okfn = okfn; } @@ -152,23 +147,20 @@ void nf_unregister_sockopt(struct nf_sockopt_ops *reg); extern struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS]; #endif -int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state); +int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state, + struct nf_hook_entry *entry); /** - * nf_hook_thresh - call a netfilter hook + * nf_hook - call a netfilter hook * * Returns 1 if the hook has allowed the packet to pass. The function * okfn must be invoked by the caller in this case. Any other return * value indicates the packet has been consumed by the hook. */ -static inline int nf_hook_thresh(u_int8_t pf, unsigned int hook, - struct net *net, - struct sock *sk, - struct sk_buff *skb, - struct net_device *indev, - struct net_device *outdev, - int (*okfn)(struct net *, struct sock *, struct sk_buff *), - int thresh) +static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net, + struct sock *sk, struct sk_buff *skb, + struct net_device *indev, struct net_device *outdev, + int (*okfn)(struct net *, struct sock *, struct sk_buff *)) { struct nf_hook_entry *hook_head; int ret = 1; @@ -185,24 +177,16 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned int hook, if (hook_head) { struct nf_hook_state state; - nf_hook_state_init(&state, hook_head, hook, thresh, - pf, indev, outdev, sk, net, okfn); + nf_hook_state_init(&state, hook, pf, indev, outdev, + sk, net, okfn); - ret = nf_hook_slow(skb, &state); + ret = nf_hook_slow(skb, &state, hook_head); } rcu_read_unlock(); return ret; } -static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net, - struct sock *sk, struct sk_buff *skb, - struct net_device *indev, struct net_device *outdev, - int (*okfn)(struct net *, struct sock *, struct sk_buff *)) -{ - return nf_hook_thresh(pf, hook, net, sk, skb, indev, outdev, okfn, INT_MIN); -} - /* Activate hook; either okfn or kfree_skb called, unless a hook returns NF_STOLEN (in which case, it's up to the hook to deal with the consequences). @@ -221,19 +205,6 @@ static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net, */ static inline int -NF_HOOK_THRESH(uint8_t pf, unsigned int hook, struct net *net, struct sock *sk, - struct sk_buff *skb, struct net_device *in, - struct net_device *out, - int (*okfn)(struct net *, struct sock *, struct sk_buff *), - int thresh) -{ - int ret = nf_hook_thresh(pf, hook, net, sk, skb, in, out, okfn, thresh); - if (ret == 1) - ret = okfn(net, sk, skb); - return ret; -} - -static inline int NF_HOOK_COND(uint8_t pf, unsigned int hook, struct net *net, struct sock *sk, struct sk_buff *skb, struct net_device *in, struct net_device *out, int (*okfn)(struct net *, struct sock *, struct sk_buff *), @@ -242,7 +213,7 @@ NF_HOOK_COND(uint8_t pf, unsigned int hook, struct net *net, struct sock *sk, int ret; if (!cond || - ((ret = nf_hook_thresh(pf, hook, net, sk, skb, in, out, okfn, INT_MIN)) == 1)) + ((ret = nf_hook(pf, hook, net, sk, skb, in, out, okfn)) == 1)) ret = okfn(net, sk, skb); return ret; } @@ -252,7 +223,10 @@ NF_HOOK(uint8_t pf, unsigned int hook, struct net *net, struct sock *sk, struct struct net_device *in, struct net_device *out, int (*okfn)(struct net *, struct sock *, struct sk_buff *)) { - return NF_HOOK_THRESH(pf, hook, net, sk, skb, in, out, okfn, INT_MIN); + int ret = nf_hook(pf, hook, net, sk, skb, in, out, okfn); + if (ret == 1) + ret = okfn(net, sk, skb); + return ret; } /* Call setsockopt() */ diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h index 83b9a2e0d8d4..8e42253e5d4d 100644 --- a/include/linux/netfilter/ipset/ip_set.h +++ b/include/linux/netfilter/ipset/ip_set.h @@ -79,10 +79,12 @@ enum ip_set_ext_id { IPSET_EXT_ID_MAX, }; +struct ip_set; + /* Extension type */ struct ip_set_ext_type { /* Destroy extension private data (can be NULL) */ - void (*destroy)(void *ext); + void (*destroy)(struct ip_set *set, void *ext); enum ip_set_extension type; enum ipset_cadt_flags flag; /* Size and minimal alignment */ @@ -92,17 +94,6 @@ struct ip_set_ext_type { extern const struct ip_set_ext_type ip_set_extensions[]; -struct ip_set_ext { - u64 packets; - u64 bytes; - u32 timeout; - u32 skbmark; - u32 skbmarkmask; - u32 skbprio; - u16 skbqueue; - char *comment; -}; - struct ip_set_counter { atomic64_t bytes; atomic64_t packets; @@ -122,6 +113,15 @@ struct ip_set_skbinfo { u32 skbmarkmask; u32 skbprio; u16 skbqueue; + u16 __pad; +}; + +struct ip_set_ext { + struct ip_set_skbinfo skbinfo; + u64 packets; + u64 bytes; + char *comment; + u32 timeout; }; struct ip_set; @@ -252,6 +252,10 @@ struct ip_set { u8 flags; /* Default timeout value, if enabled */ u32 timeout; + /* Number of elements (vs timeout) */ + u32 elements; + /* Size of the dynamic extensions (vs timeout) */ + size_t ext_size; /* Element data size */ size_t dsize; /* Offsets to extensions in elements */ @@ -268,7 +272,7 @@ ip_set_ext_destroy(struct ip_set *set, void *data) */ if (SET_WITH_COMMENT(set)) ip_set_extensions[IPSET_EXT_ID_COMMENT].destroy( - ext_comment(data, set)); + set, ext_comment(data, set)); } static inline int @@ -294,104 +298,6 @@ ip_set_put_flags(struct sk_buff *skb, struct ip_set *set) return nla_put_net32(skb, IPSET_ATTR_CADT_FLAGS, htonl(cadt_flags)); } -static inline void -ip_set_add_bytes(u64 bytes, struct ip_set_counter *counter) -{ - atomic64_add((long long)bytes, &(counter)->bytes); -} - -static inline void -ip_set_add_packets(u64 packets, struct ip_set_counter *counter) -{ - atomic64_add((long long)packets, &(counter)->packets); -} - -static inline u64 -ip_set_get_bytes(const struct ip_set_counter *counter) -{ - return (u64)atomic64_read(&(counter)->bytes); -} - -static inline u64 -ip_set_get_packets(const struct ip_set_counter *counter) -{ - return (u64)atomic64_read(&(counter)->packets); -} - -static inline void -ip_set_update_counter(struct ip_set_counter *counter, - const struct ip_set_ext *ext, - struct ip_set_ext *mext, u32 flags) -{ - if (ext->packets != ULLONG_MAX && - !(flags & IPSET_FLAG_SKIP_COUNTER_UPDATE)) { - ip_set_add_bytes(ext->bytes, counter); - ip_set_add_packets(ext->packets, counter); - } - if (flags & IPSET_FLAG_MATCH_COUNTERS) { - mext->packets = ip_set_get_packets(counter); - mext->bytes = ip_set_get_bytes(counter); - } -} - -static inline void -ip_set_get_skbinfo(struct ip_set_skbinfo *skbinfo, - const struct ip_set_ext *ext, - struct ip_set_ext *mext, u32 flags) -{ - mext->skbmark = skbinfo->skbmark; - mext->skbmarkmask = skbinfo->skbmarkmask; - mext->skbprio = skbinfo->skbprio; - mext->skbqueue = skbinfo->skbqueue; -} -static inline bool -ip_set_put_skbinfo(struct sk_buff *skb, struct ip_set_skbinfo *skbinfo) -{ - /* Send nonzero parameters only */ - return ((skbinfo->skbmark || skbinfo->skbmarkmask) && - nla_put_net64(skb, IPSET_ATTR_SKBMARK, - cpu_to_be64((u64)skbinfo->skbmark << 32 | - skbinfo->skbmarkmask), - IPSET_ATTR_PAD)) || - (skbinfo->skbprio && - nla_put_net32(skb, IPSET_ATTR_SKBPRIO, - cpu_to_be32(skbinfo->skbprio))) || - (skbinfo->skbqueue && - nla_put_net16(skb, IPSET_ATTR_SKBQUEUE, - cpu_to_be16(skbinfo->skbqueue))); -} - -static inline void -ip_set_init_skbinfo(struct ip_set_skbinfo *skbinfo, - const struct ip_set_ext *ext) -{ - skbinfo->skbmark = ext->skbmark; - skbinfo->skbmarkmask = ext->skbmarkmask; - skbinfo->skbprio = ext->skbprio; - skbinfo->skbqueue = ext->skbqueue; -} - -static inline bool -ip_set_put_counter(struct sk_buff *skb, struct ip_set_counter *counter) -{ - return nla_put_net64(skb, IPSET_ATTR_BYTES, - cpu_to_be64(ip_set_get_bytes(counter)), - IPSET_ATTR_PAD) || - nla_put_net64(skb, IPSET_ATTR_PACKETS, - cpu_to_be64(ip_set_get_packets(counter)), - IPSET_ATTR_PAD); -} - -static inline void -ip_set_init_counter(struct ip_set_counter *counter, - const struct ip_set_ext *ext) -{ - if (ext->bytes != ULLONG_MAX) - atomic64_set(&(counter)->bytes, (long long)(ext->bytes)); - if (ext->packets != ULLONG_MAX) - atomic64_set(&(counter)->packets, (long long)(ext->packets)); -} - /* Netlink CB args */ enum { IPSET_CB_NET = 0, /* net namespace */ @@ -431,6 +337,8 @@ extern size_t ip_set_elem_len(struct ip_set *set, struct nlattr *tb[], size_t len, size_t align); extern int ip_set_get_extensions(struct ip_set *set, struct nlattr *tb[], struct ip_set_ext *ext); +extern int ip_set_put_extensions(struct sk_buff *skb, const struct ip_set *set, + const void *e, bool active); static inline int ip_set_get_hostipaddr4(struct nlattr *nla, u32 *ipaddr) @@ -546,10 +454,8 @@ bitmap_bytes(u32 a, u32 b) #include <linux/netfilter/ipset/ip_set_timeout.h> #include <linux/netfilter/ipset/ip_set_comment.h> - -int -ip_set_put_extensions(struct sk_buff *skb, const struct ip_set *set, - const void *e, bool active); +#include <linux/netfilter/ipset/ip_set_counter.h> +#include <linux/netfilter/ipset/ip_set_skbinfo.h> #define IP_SET_INIT_KEXT(skb, opt, set) \ { .bytes = (skb)->len, .packets = 1, \ diff --git a/include/linux/netfilter/ipset/ip_set_bitmap.h b/include/linux/netfilter/ipset/ip_set_bitmap.h index 5e4662a71e01..366d6c0ea04f 100644 --- a/include/linux/netfilter/ipset/ip_set_bitmap.h +++ b/include/linux/netfilter/ipset/ip_set_bitmap.h @@ -6,8 +6,8 @@ #define IPSET_BITMAP_MAX_RANGE 0x0000FFFF enum { + IPSET_ADD_STORE_PLAIN_TIMEOUT = -1, IPSET_ADD_FAILED = 1, - IPSET_ADD_STORE_PLAIN_TIMEOUT, IPSET_ADD_START_STORED_TIMEOUT, }; diff --git a/include/linux/netfilter/ipset/ip_set_comment.h b/include/linux/netfilter/ipset/ip_set_comment.h index 8d0248525957..8e2bab1e8e90 100644 --- a/include/linux/netfilter/ipset/ip_set_comment.h +++ b/include/linux/netfilter/ipset/ip_set_comment.h @@ -20,13 +20,14 @@ ip_set_comment_uget(struct nlattr *tb) * The kadt functions don't use the comment extensions in any way. */ static inline void -ip_set_init_comment(struct ip_set_comment *comment, +ip_set_init_comment(struct ip_set *set, struct ip_set_comment *comment, const struct ip_set_ext *ext) { struct ip_set_comment_rcu *c = rcu_dereference_protected(comment->c, 1); size_t len = ext->comment ? strlen(ext->comment) : 0; if (unlikely(c)) { + set->ext_size -= sizeof(*c) + strlen(c->str) + 1; kfree_rcu(c, rcu); rcu_assign_pointer(comment->c, NULL); } @@ -34,16 +35,17 @@ ip_set_init_comment(struct ip_set_comment *comment, return; if (unlikely(len > IPSET_MAX_COMMENT_SIZE)) len = IPSET_MAX_COMMENT_SIZE; - c = kzalloc(sizeof(*c) + len + 1, GFP_ATOMIC); + c = kmalloc(sizeof(*c) + len + 1, GFP_ATOMIC); if (unlikely(!c)) return; strlcpy(c->str, ext->comment, len + 1); + set->ext_size += sizeof(*c) + strlen(c->str) + 1; rcu_assign_pointer(comment->c, c); } /* Used only when dumping a set, protected by rcu_read_lock_bh() */ static inline int -ip_set_put_comment(struct sk_buff *skb, struct ip_set_comment *comment) +ip_set_put_comment(struct sk_buff *skb, const struct ip_set_comment *comment) { struct ip_set_comment_rcu *c = rcu_dereference_bh(comment->c); @@ -58,13 +60,14 @@ ip_set_put_comment(struct sk_buff *skb, struct ip_set_comment *comment) * of the set data anymore. */ static inline void -ip_set_comment_free(struct ip_set_comment *comment) +ip_set_comment_free(struct ip_set *set, struct ip_set_comment *comment) { struct ip_set_comment_rcu *c; c = rcu_dereference_protected(comment->c, 1); if (unlikely(!c)) return; + set->ext_size -= sizeof(*c) + strlen(c->str) + 1; kfree_rcu(c, rcu); rcu_assign_pointer(comment->c, NULL); } diff --git a/include/linux/netfilter/ipset/ip_set_counter.h b/include/linux/netfilter/ipset/ip_set_counter.h new file mode 100644 index 000000000000..bb6fba480118 --- /dev/null +++ b/include/linux/netfilter/ipset/ip_set_counter.h @@ -0,0 +1,75 @@ +#ifndef _IP_SET_COUNTER_H +#define _IP_SET_COUNTER_H + +/* Copyright (C) 2015 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#ifdef __KERNEL__ + +static inline void +ip_set_add_bytes(u64 bytes, struct ip_set_counter *counter) +{ + atomic64_add((long long)bytes, &(counter)->bytes); +} + +static inline void +ip_set_add_packets(u64 packets, struct ip_set_counter *counter) +{ + atomic64_add((long long)packets, &(counter)->packets); +} + +static inline u64 +ip_set_get_bytes(const struct ip_set_counter *counter) +{ + return (u64)atomic64_read(&(counter)->bytes); +} + +static inline u64 +ip_set_get_packets(const struct ip_set_counter *counter) +{ + return (u64)atomic64_read(&(counter)->packets); +} + +static inline void +ip_set_update_counter(struct ip_set_counter *counter, + const struct ip_set_ext *ext, + struct ip_set_ext *mext, u32 flags) +{ + if (ext->packets != ULLONG_MAX && + !(flags & IPSET_FLAG_SKIP_COUNTER_UPDATE)) { + ip_set_add_bytes(ext->bytes, counter); + ip_set_add_packets(ext->packets, counter); + } + if (flags & IPSET_FLAG_MATCH_COUNTERS) { + mext->packets = ip_set_get_packets(counter); + mext->bytes = ip_set_get_bytes(counter); + } +} + +static inline bool +ip_set_put_counter(struct sk_buff *skb, const struct ip_set_counter *counter) +{ + return nla_put_net64(skb, IPSET_ATTR_BYTES, + cpu_to_be64(ip_set_get_bytes(counter)), + IPSET_ATTR_PAD) || + nla_put_net64(skb, IPSET_ATTR_PACKETS, + cpu_to_be64(ip_set_get_packets(counter)), + IPSET_ATTR_PAD); +} + +static inline void +ip_set_init_counter(struct ip_set_counter *counter, + const struct ip_set_ext *ext) +{ + if (ext->bytes != ULLONG_MAX) + atomic64_set(&(counter)->bytes, (long long)(ext->bytes)); + if (ext->packets != ULLONG_MAX) + atomic64_set(&(counter)->packets, (long long)(ext->packets)); +} + +#endif /* __KERNEL__ */ +#endif /* _IP_SET_COUNTER_H */ diff --git a/include/linux/netfilter/ipset/ip_set_skbinfo.h b/include/linux/netfilter/ipset/ip_set_skbinfo.h new file mode 100644 index 000000000000..29d7ef2bc3fa --- /dev/null +++ b/include/linux/netfilter/ipset/ip_set_skbinfo.h @@ -0,0 +1,46 @@ +#ifndef _IP_SET_SKBINFO_H +#define _IP_SET_SKBINFO_H + +/* Copyright (C) 2015 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#ifdef __KERNEL__ + +static inline void +ip_set_get_skbinfo(struct ip_set_skbinfo *skbinfo, + const struct ip_set_ext *ext, + struct ip_set_ext *mext, u32 flags) +{ + mext->skbinfo = *skbinfo; +} + +static inline bool +ip_set_put_skbinfo(struct sk_buff *skb, const struct ip_set_skbinfo *skbinfo) +{ + /* Send nonzero parameters only */ + return ((skbinfo->skbmark || skbinfo->skbmarkmask) && + nla_put_net64(skb, IPSET_ATTR_SKBMARK, + cpu_to_be64((u64)skbinfo->skbmark << 32 | + skbinfo->skbmarkmask), + IPSET_ATTR_PAD)) || + (skbinfo->skbprio && + nla_put_net32(skb, IPSET_ATTR_SKBPRIO, + cpu_to_be32(skbinfo->skbprio))) || + (skbinfo->skbqueue && + nla_put_net16(skb, IPSET_ATTR_SKBQUEUE, + cpu_to_be16(skbinfo->skbqueue))); +} + +static inline void +ip_set_init_skbinfo(struct ip_set_skbinfo *skbinfo, + const struct ip_set_ext *ext) +{ + *skbinfo = ext->skbinfo; +} + +#endif /* __KERNEL__ */ +#endif /* _IP_SET_SKBINFO_H */ diff --git a/include/linux/netfilter/ipset/ip_set_timeout.h b/include/linux/netfilter/ipset/ip_set_timeout.h index 1d6a935c1ac5..bfb3531fd88a 100644 --- a/include/linux/netfilter/ipset/ip_set_timeout.h +++ b/include/linux/netfilter/ipset/ip_set_timeout.h @@ -40,7 +40,7 @@ ip_set_timeout_uget(struct nlattr *tb) } static inline bool -ip_set_timeout_expired(unsigned long *t) +ip_set_timeout_expired(const unsigned long *t) { return *t != IPSET_ELEM_PERMANENT && time_is_before_jiffies(*t); } @@ -63,7 +63,7 @@ ip_set_timeout_set(unsigned long *timeout, u32 value) } static inline u32 -ip_set_timeout_get(unsigned long *timeout) +ip_set_timeout_get(const unsigned long *timeout) { return *timeout == IPSET_ELEM_PERMANENT ? 0 : jiffies_to_msecs(*timeout - jiffies)/MSEC_PER_SEC; diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 2ad1a2b289b5..cd4eaf8df445 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -4,6 +4,7 @@ #include <linux/netdevice.h> #include <linux/static_key.h> +#include <linux/netfilter.h> #include <uapi/linux/netfilter/x_tables.h> /* Test a struct->invflags and a boolean for inequality */ @@ -17,14 +18,9 @@ * @target: the target extension * @matchinfo: per-match data * @targetinfo: per-target data - * @net network namespace through which the action was invoked - * @in: input netdevice - * @out: output netdevice + * @state: pointer to hook state this packet came from * @fragoff: packet is a fragment, this is the data offset * @thoff: position of transport header relative to skb->data - * @hook: hook number given packet came from - * @family: Actual NFPROTO_* through which the function is invoked - * (helpful when match->family == NFPROTO_UNSPEC) * * Fields written to by extensions: * @@ -38,15 +34,47 @@ struct xt_action_param { union { const void *matchinfo, *targinfo; }; - struct net *net; - const struct net_device *in, *out; + const struct nf_hook_state *state; int fragoff; unsigned int thoff; - unsigned int hooknum; - u_int8_t family; bool hotdrop; }; +static inline struct net *xt_net(const struct xt_action_param *par) +{ + return par->state->net; +} + +static inline struct net_device *xt_in(const struct xt_action_param *par) +{ + return par->state->in; +} + +static inline const char *xt_inname(const struct xt_action_param *par) +{ + return par->state->in->name; +} + +static inline struct net_device *xt_out(const struct xt_action_param *par) +{ + return par->state->out; +} + +static inline const char *xt_outname(const struct xt_action_param *par) +{ + return par->state->out->name; +} + +static inline unsigned int xt_hooknum(const struct xt_action_param *par) +{ + return par->state->hook; +} + +static inline u_int8_t xt_family(const struct xt_action_param *par) +{ + return par->state->pf; +} + /** * struct xt_mtchk_param - parameters for match extensions' * checkentry functions diff --git a/include/linux/netfilter_ingress.h b/include/linux/netfilter_ingress.h index 33e37fb41d5d..2dc3b49b804a 100644 --- a/include/linux/netfilter_ingress.h +++ b/include/linux/netfilter_ingress.h @@ -26,10 +26,10 @@ static inline int nf_hook_ingress(struct sk_buff *skb) if (unlikely(!e)) return 0; - nf_hook_state_init(&state, e, NF_NETDEV_INGRESS, INT_MIN, + nf_hook_state_init(&state, NF_NETDEV_INGRESS, NFPROTO_NETDEV, skb->dev, NULL, NULL, dev_net(skb->dev), NULL); - return nf_hook_slow(skb, &state); + return nf_hook_slow(skb, &state, e); } static inline void nf_hook_ingress_init(struct net_device *dev) diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h index de629f1520df..2152b70626d5 100644 --- a/include/net/netfilter/nf_conntrack_l4proto.h +++ b/include/net/netfilter/nf_conntrack_l4proto.h @@ -125,14 +125,24 @@ struct nf_conntrack_l4proto *nf_ct_l4proto_find_get(u_int16_t l3proto, void nf_ct_l4proto_put(struct nf_conntrack_l4proto *p); /* Protocol pernet registration. */ +int nf_ct_l4proto_pernet_register_one(struct net *net, + struct nf_conntrack_l4proto *proto); +void nf_ct_l4proto_pernet_unregister_one(struct net *net, + struct nf_conntrack_l4proto *proto); int nf_ct_l4proto_pernet_register(struct net *net, - struct nf_conntrack_l4proto *proto); + struct nf_conntrack_l4proto *proto[], + unsigned int num_proto); void nf_ct_l4proto_pernet_unregister(struct net *net, - struct nf_conntrack_l4proto *proto); + struct nf_conntrack_l4proto *proto[], + unsigned int num_proto); /* Protocol global registration. */ -int nf_ct_l4proto_register(struct nf_conntrack_l4proto *proto); -void nf_ct_l4proto_unregister(struct nf_conntrack_l4proto *proto); +int nf_ct_l4proto_register_one(struct nf_conntrack_l4proto *proto); +void nf_ct_l4proto_unregister_one(struct nf_conntrack_l4proto *proto); +int nf_ct_l4proto_register(struct nf_conntrack_l4proto *proto[], + unsigned int num_proto); +void nf_ct_l4proto_unregister(struct nf_conntrack_l4proto *proto[], + unsigned int num_proto); /* Generic netlink helpers */ int nf_ct_port_tuple_to_nlattr(struct sk_buff *skb, diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h index 2280cfe86c56..09948d10e38e 100644 --- a/include/net/netfilter/nf_queue.h +++ b/include/net/netfilter/nf_queue.h @@ -12,6 +12,7 @@ struct nf_queue_entry { unsigned int id; struct nf_hook_state state; + struct nf_hook_entry *hook; u16 size; /* sizeof(entry) + saved route keys */ /* extra space to store route keys */ diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 5031e072567b..3295fb85bff6 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -14,27 +14,43 @@ struct nft_pktinfo { struct sk_buff *skb; - struct net *net; - const struct net_device *in; - const struct net_device *out; - u8 pf; - u8 hook; bool tprot_set; u8 tprot; /* for x_tables compatibility */ struct xt_action_param xt; }; +static inline struct net *nft_net(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->net; +} + +static inline unsigned int nft_hook(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->hook; +} + +static inline u8 nft_pf(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->pf; +} + +static inline const struct net_device *nft_in(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->in; +} + +static inline const struct net_device *nft_out(const struct nft_pktinfo *pkt) +{ + return pkt->xt.state->out; +} + static inline void nft_set_pktinfo(struct nft_pktinfo *pkt, struct sk_buff *skb, const struct nf_hook_state *state) { pkt->skb = skb; - pkt->net = pkt->xt.net = state->net; - pkt->in = pkt->xt.in = state->in; - pkt->out = pkt->xt.out = state->out; - pkt->hook = pkt->xt.hooknum = state->hook; - pkt->pf = pkt->xt.family = state->pf; + pkt->xt.state = state; } static inline void nft_set_pktinfo_proto_unspec(struct nft_pktinfo *pkt, diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h index 00f4f6b1b1ba..862373d4ea9d 100644 --- a/include/net/netfilter/nf_tables_core.h +++ b/include/net/netfilter/nf_tables_core.h @@ -1,12 +1,18 @@ #ifndef _NET_NF_TABLES_CORE_H #define _NET_NF_TABLES_CORE_H +extern struct nft_expr_type nft_imm_type; +extern struct nft_expr_type nft_cmp_type; +extern struct nft_expr_type nft_lookup_type; +extern struct nft_expr_type nft_bitwise_type; +extern struct nft_expr_type nft_byteorder_type; +extern struct nft_expr_type nft_payload_type; +extern struct nft_expr_type nft_dynset_type; +extern struct nft_expr_type nft_range_type; + int nf_tables_core_module_init(void); void nf_tables_core_module_exit(void); -int nft_immediate_module_init(void); -void nft_immediate_module_exit(void); - struct nft_cmp_fast_expr { u32 data; enum nft_registers sreg:8; @@ -25,24 +31,6 @@ static inline u32 nft_cmp_fast_mask(unsigned int len) extern const struct nft_expr_ops nft_cmp_fast_ops; -int nft_cmp_module_init(void); -void nft_cmp_module_exit(void); - -int nft_range_module_init(void); -void nft_range_module_exit(void); - -int nft_lookup_module_init(void); -void nft_lookup_module_exit(void); - -int nft_dynset_module_init(void); -void nft_dynset_module_exit(void); - -int nft_bitwise_module_init(void); -void nft_bitwise_module_exit(void); - -int nft_byteorder_module_init(void); -void nft_byteorder_module_exit(void); - struct nft_payload { enum nft_payload_bases base:8; u8 offset; @@ -62,7 +50,4 @@ struct nft_payload_set { extern const struct nft_expr_ops nft_payload_fast_ops; extern struct static_key_false nft_trace_enabled; -int nft_payload_module_init(void); -void nft_payload_module_exit(void); - #endif /* _NET_NF_TABLES_CORE_H */ diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h index e469e85de3f9..3d06d94d2e52 100644 --- a/include/net/netns/conntrack.h +++ b/include/net/netns/conntrack.h @@ -91,7 +91,6 @@ struct netns_ct { struct nf_ip_net nf_ct_proto; #if defined(CONFIG_NF_CONNTRACK_LABELS) unsigned int labels_used; - u8 label_words; #endif }; #endif diff --git a/include/uapi/linux/netfilter.h b/include/uapi/linux/netfilter.h index d93f949d1d9a..7550e9176a54 100644 --- a/include/uapi/linux/netfilter.h +++ b/include/uapi/linux/netfilter.h @@ -13,7 +13,7 @@ #define NF_STOLEN 2 #define NF_QUEUE 3 #define NF_REPEAT 4 -#define NF_STOP 5 +#define NF_STOP 5 /* Deprecated, for userspace nf_queue compatibility. */ #define NF_MAX_VERDICT NF_STOP /* we overload the higher bits for encoding auxiliary data such as the queue |