summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-04-25ixgbe/fm10k: Only support macvlan offload for types that support destination ↵Alexander Duyck
filtering Both the ixgbe and fm10k drivers support destination filtering. Instead of adding a ton of complexity to support either source or passthru mode we can instead just avoid offloading them for now. Doing this we avoid leaking packets into interfaces that aren't meant to receive them. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25macvlan: Provide function for interfaces to release HW offloadAlexander Duyck
This patch provides a basic function to allow a lower device to disable macvlan offload if it was previously enabled on a given macvlan. The idea here is to allow for recovery from failure should the lowerdev run out of resources. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25macvlan: Add function to test for destination filtering supportAlexander Duyck
This patch adds a function indicating if a given macvlan can fully supports destination filtering, especially as it relates to unicast traffic. For those macvlan interfaces that do not support destination filtering such passthru or source mode filtering we should not be enabling offload support. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25macvlan: macvlan_count_rx shouldn't be static inline AND externAlexander Duyck
It doesn't make sense to define macvlan_count_rx as a static inline and then add a forward declaration after that as an extern. I am dropping the extern declaration since it seems like it is something that likely got missed when the function was made an inline. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25ixgbe/fm10k: Drop tracking stats for macvlan broadcast/multicastAlexander Duyck
Drop dead code now that we shouldn't be receiving broadcast or multicast frames on the queues associated to the macvlan netdev. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25macvlan: Use software path for offloaded local, broadcast, and multicast trafficAlexander Duyck
This change makes it so that we use a software path for packets that are going to be locally switched between two macvlan interfaces on the same device. In addition we resort to software replication of broadcast and multicast packets instead of offloading that to hardware. The general idea is that using the device for east/west traffic local to the system is extremely inefficient. We can only support up to whatever the PCIe limit is for any given device so this caps us at somewhere around 20G for devices supported by ixgbe. This is compounded even further when you take broadcast and multicast into account as a single 10G port can come to a crawl as a packet is replicated up to 60+ times in some cases. In order to get away from that I am implementing changes so that we handle broadcast/multicast replication and east/west local traffic all in software. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25macvlan: Rename fwd_priv to accel_priv and add accessor functionAlexander Duyck
This change renames the fwd_priv member to accel_priv as this more accurately reflects the actual purpose of this value. In addition I am adding an accessor which will allow us to further abstract this in the future if needed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25ixgbe: Drop support for macvlan specific unicast listsAlexander Duyck
Drop the code for handling macvlan specific unicast lists. It isn't needed since we don't take any efforts to maintain it when we bring the interface up and it takes the slow path anyway. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-25bpf, x64: fix JIT emission for dead codeGianluca Borello
Commit 2a5418a13fcf ("bpf: improve dead code sanitizing") replaced dead code with a series of ja-1 instructions, for safety. That made JIT compilation much more complex for some BPF programs. One instance of such programs is, for example: bool flag = false ... /* A bunch of other code */ ... if (flag) do_something() In some cases llvm is not able to remove at compile time the code for do_something(), so the generated BPF program ends up with a large amount of dead instructions. In one specific real life example, there are two series of ~500 and ~1000 dead instructions in the program. When the verifier replaces them with a series of ja-1 instructions, it causes an interesting behavior at JIT time. During the first pass, since all the instructions are estimated at 64 bytes, the ja-1 instructions end up being translated as 5 bytes JMP instructions (0xE9), since the jump offsets become increasingly large (> 127) as each instruction gets discovered to be 5 bytes instead of the estimated 64. Starting from the second pass, the first N instructions of the ja-1 sequence get translated into 2 bytes JMPs (0xEB) because the jump offsets become <= 127 this time. In particular, N is defined as roughly 127 / (5 - 2) ~= 42. So, each further pass will make the subsequent N JMP instructions shrink from 5 to 2 bytes, making the image shrink every time. This means that in order to have the entire program converge, there need to be, in the real example above, at least ~1000 / 42 ~= 24 passes just for translating the dead code. If we add this number to the passes needed to translate the other non dead code, it brings such program to 40+ passes, and JIT doesn't complete. Ultimately the userspace loader fails because such BPF program was supposed to be part of a prog array owner being JITed. While it is certainly possible to try to refactor such programs to help the compiler remove dead code, the behavior is not really intuitive and it puts further burden on the BPF developer who is not expecting such behavior. To make things worse, such programs are working just fine in all the kernel releases prior to the ja-1 fix. A possible approach to mitigate this behavior consists into noticing that for ja-1 instructions we don't really need to rely on the estimated size of the previous and current instructions, we know that a -1 BPF jump offset can be safely translated into a 0xEB instruction with a jump offset of -2. Such fix brings the BPF program in the previous example to complete again in ~9 passes. Fixes: 2a5418a13fcf ("bpf: improve dead code sanitizing") Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25Merge branch 'bpf-optimize-neg-sums'Daniel Borkmann
Jakub Kicinski says: ==================== This set adds an optimization run to the NFP jit to turn ADD and SUB instructions with negative immediate into the opposite operation with a positive immediate. NFP can fit small immediates into the instructions but it can't ever fit negative immediates. Addition of small negative immediates is quite common in BPF programs for stack address calculations, therefore this optimization gives us non-negligible savings in instruction count (up to 4%). ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25nfp: bpf: optimize comparisons to negative constantsJakub Kicinski
Comparison instruction requires a subtraction. If the constant is negative we are more likely to fit it into a NFP instruction directly if we change the sign and use addition. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25nfp: bpf: tabularize generations of compare operationsJakub Kicinski
There are quite a few compare instructions now, use a table to translate BPF instruction code to NFP instruction parameters instead of parameterizing helpers. This saves LOC and makes future extensions easier. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25nfp: bpf: optimize add/sub of a negative constantJakub Kicinski
NFP instruction set can fit small immediates into the instruction. Negative integers, however, will never fit because they will have highest bit set. If we swap the ALU op between ADD and SUB and negate the constant we have a better chance of fitting small negative integers into the instruction itself and saving one or two cycles. immed[gprB_21, 0xfffffffc] alu[gprA_4, gprA_4, +, gprB_21], gpr_wrboth immed[gprB_21, 0xffffffff] alu[gprA_5, gprA_5, +carry, gprB_21], gpr_wrboth now becomes: alu[gprA_4, gprA_4, -, 4], gpr_wrboth alu[gprA_5, gprA_5, -carry, 0], gpr_wrboth Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25nfp: bpf: remove double spaceJakub Kicinski
Whitespace cleanup - remove double space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25bpf: clear the ip_tunnel_info.William Tu
The percpu metadata_dst might carry the stale ip_tunnel_info and cause incorrect behavior. When mixing tests using ipv4/ipv6 bpf vxlan and geneve tunnel, the ipv6 tunnel info incorrectly uses ipv4's src ip addr as its ipv6 src address, because the previous tunnel info does not clean up. The patch zeros the fields in ip_tunnel_info. Signed-off-by: William Tu <u9012063@gmail.com> Reported-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2018-04-24Merge branch 'userns-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns bug fix from Eric Biederman: "Just a small fix to properly set the return code on error" * 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: commoncap: Handle memory allocation failure.
2018-04-25bpf: reduce runtime of test_sockmap testsJohn Fastabend
When test_sockmap was running outside of selftests and was not being run by build bots it was reasonable to spend significant amount of time running various tests. The number of tests is high because many different I/O iterators are run. However, now that test_sockmap is part of selftests rather than iterate through all I/O sides only test a minimal set of min/max values along with a few "normal" I/O ops. Also remove the long running tests. They can be run from other test frameworks on a regular cadence. This significanly reduces runtime of test_sockmap. Before: $ time sudo ./test_sockmap > /dev/null real 4m47.521s user 0m0.370s sys 0m3.131s After: $ time sudo ./test_sockmap > /dev/null real 0m0.514s user 0m0.104s sys 0m0.430s The CLI is still available for users that want to test the long running tests that do the larger send/recv tests. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25Merge branch 'bpf-sockmap-selftests'Daniel Borkmann
John Fastabend says: ==================== This series moves ./samples/sockmap into BPF selftests. There are a few good reasons to do this. First, by pushing this into selftests the tests will be run automatically. Second, sockmap was not really a sample of anything anymore, but rather a large set of tests. Note: There are three recent fixes outstanding against bpf branch that can be detected occasionally by the automated tests here. https://patchwork.ozlabs.org/patch/903138/ https://patchwork.ozlabs.org/patch/903139/ https://patchwork.ozlabs.org/patch/903140/ ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25bpf: sockmap, remove samples programJohn Fastabend
The BPF sample sockmap is redundant now that equivelant tests exist in the BPF selftests. Lets remove this sample and only keep the selftest version that will be run as part of the selftest suite. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25bpf: sockmap, add selftestsJohn Fastabend
This adds a new test program test_sockmap which is the old sample sockmap program. By moving the sample program here we can now run it as part of the self tests suite. To support this a populate_progs() routine is added to load programs and maps which was previously done with load_bpf_file(). This is needed because self test libs do not provide a similar routine. Also we now use the cgroup_helpers routines to manage cgroup use instead of manually creating one and supplying it to the CLI. Notice we keep the CLI around though because it is useful for dbg and specialized testing. To run use ./test_sockmap and the result should be, Summary 660 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25bpf: sockmap, add a set of tests to run by defaultJohn Fastabend
If no options are passed to sockmap after this patch we run a set of tests using various options and sendmsg/sendpage sizes. This replaces the sockmap_test.sh script. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25bpf: sockmap, code sockmap_test in CJohn Fastabend
By moving sockmap_test from shell script into C we can run it directly from selftests, but we can also push the input/output around in proper structures. However, keep the CLI options around because they are useful for debugging when a paticular pattern of msghdr or sockmap options trips up the sockmap code path. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-25tools/bpf: remove test_sock_addr from TEST_GEN_PROGSYonghong Song
Since test_sock_addr is not supposed to run by itself, remove it from TEST_GEN_PROGS and add it to TEST_GEN_PROGS_EXTENDED. This way, run_tests will not run test_sock_addr. The corresponding test to run is test_sock_addr.sh. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24selftests: bpf: update .gitignore with missing fileAnders Roxell
Fixes: c0fa1b6c3efc ("bpf: btf: Add BTF tests") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix rtnl deadlock in ipvs, from Julian Anastasov. 2) s390 qeth fixes from Julian Wiedmann (control IO completion stalls, bad MAC address update sequence, request side races on command IO timeouts). 3) Handle seq_file overflow properly in l2tp, from Guillaume Nault. 4) Fix VLAN priority mappings in cpsw driver, from Ivan Khoronzhuk. 5) Packet scheduler ife action fixes (malformed TLV lengths, etc.) from Alexander Aring. 6) Fix out of bounds access in tcp md5 option parser, from Jann Horn. 7) Missing netlink attribute policies in rtm_ipv6_policy table, from Eric Dumazet. 8) Missing socket address length checks in l2tp and pppoe connect, from Guillaume Nault. 9) Fix netconsole over team and bonding, from Xin Long. 10) Fix race with AF_PACKET socket state bitfields, from Willem de Bruijn. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (51 commits) ice: Fix insufficient memory issue in ice_aq_manage_mac_read sfc: ARFS filter IDs net: ethtool: Add missing kernel doc for FEC parameters packet: fix bitfield update race ice: Do not check INTEVENT bit for OICR interrupts ice: Fix incorrect comment for action type ice: Fix initialization for num_nodes_added igb: Fix the transmission mode of queue 0 for Qav mode ixgbevf: ensure xdp_ring resources are free'd on error exit team: fix netconsole setup over team amd-xgbe: Only use the SFP supported transceiver signals amd-xgbe: Improve KR auto-negotiation and training amd-xgbe: Add pre/post auto-negotiation phy hooks pppoe: check sockaddr length in pppoe_connect() l2tp: check sockaddr length in pppol2tp_connect() net: phy: marvell: clear wol event before setting it ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave tcp: don't read out-of-bounds opsize ibmvnic: Clean actual number of RX or TX pools ...
2018-04-24Merge branch 'bpf-map-val-as-key'Daniel Borkmann
Paul Chaignon says: ==================== Currently, helpers that expect ARG_PTR_TO_MAP_KEY and ARG_PTR_TO_MAP_VALUE can only access stack and packet memory. This patchset allows these helpers to directly access map values by passing registers of type PTR_TO_MAP_VALUE. The first patch changes the verifier; the second adds new test cases. The first three versions of this patchset were sent on the iovisor-dev mailing list only. Changelogs: Changes in v5: - Refactor using check_helper_mem_access. Changes in v4: - Rebase. Changes in v3: - Bug fixes. - Negative test cases. Changes in v2: - Additional test cases for adjusted maps. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24tools/bpf: add verifier tests for accesses to map valuesPaul Chaignon
This patch adds new test cases for accesses to map values from map helpers. Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24bpf: allow map helpers access to map values directlyPaul Chaignon
Helpers that expect ARG_PTR_TO_MAP_KEY and ARG_PTR_TO_MAP_VALUE can only access stack and packet memory. Allow these helpers to directly access map values by passing registers of type PTR_TO_MAP_VALUE. This change removes the need for an extra copy to the stack when using a map value to perform a second map lookup, as in the following: struct bpf_map_def SEC("maps") infobyreq = { .type = BPF_MAP_TYPE_HASHMAP, .key_size = sizeof(struct request *), .value_size = sizeof(struct info_t), .max_entries = 1024, }; struct bpf_map_def SEC("maps") counts = { .type = BPF_MAP_TYPE_HASHMAP, .key_size = sizeof(struct info_t), .value_size = sizeof(u64), .max_entries = 1024, }; SEC("kprobe/blk_account_io_start") int bpf_blk_account_io_start(struct pt_regs *ctx) { struct info_t *info = bpf_map_lookup_elem(&infobyreq, &ctx->di); u64 *count = bpf_map_lookup_elem(&counts, info); (*count)++; } Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24Merge branch 'bpf-xfrm-states'Daniel Borkmann
Eyal Birger says: ==================== This patchset adds support for fetching XFRM state information from an eBPF program called from TC. The first patch introduces a helper for fetching an XFRM state from the skb's secpath. The XFRM state is modeled using a new virtual struct which contains the SPI, peer address, and reqid values of the state; This struct can be extended in the future to provide additional state information. The second patch adds a test example in test_tunnel_bpf.sh. The sample validates the correct extraction of state information by the eBPF program. v3: - Kept SPI and peer IPv4 address in state in network byte order following suggestion from Alexei Starovoitov v2: - Fixed two comments by Daniel Borkmann: - disallow reserved flags in helper call - avoid compiling in helper code when CONFIG_XFRM is off ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24samples/bpf: extend test_tunnel_bpf.sh with xfrm state testEyal Birger
Add a test for fetching xfrm state parameters from a tc program running on ingress. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24bpf: add helper for getting xfrm statesEyal Birger
This commit introduces a helper which allows fetching xfrm state parameters by eBPF programs attached to TC. Prototype: bpf_skb_get_xfrm_state(skb, index, xfrm_state, size, flags) skb: pointer to skb index: the index in the skb xfrm_state secpath array xfrm_state: pointer to 'struct bpf_xfrm_state' size: size of 'struct bpf_xfrm_state' flags: reserved for future extensions The helper returns 0 on success. Non zero if no xfrm state at the index is found - or non exists at all. struct bpf_xfrm_state currently includes the SPI, peer IPv4/IPv6 address and the reqid; it can be further extended by adding elements to its end - indicating the populated fields by the 'size' argument - keeping backwards compatibility. Typical usage: struct bpf_xfrm_state x = {}; bpf_skb_get_xfrm_state(skb, 0, &x, sizeof(x), 0); ... Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-24liquidio: Swap VF representor Tx and Rx statisticsSrinivas Jampala
Swap VF representor tx and rx interface statistics since it is a virtual switchdev port and tx for VM should be rx for VF representor and vice-versa. Signed-off-by: Srinivas Jampala <srinivasa.jampala@cavium.com> Acked-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24net/ipv6: fix LOCKDEP issue in rt6_remove_exception_rt()Eric Dumazet
rt6_remove_exception_rt() is called under rcu_read_lock() only. We lock rt6_exception_lock a bit later, so we do not hold rt6_exception_lock yet. Fixes: 8a14e46f1402 ("net/ipv6: Fix missing rcu dereferences on from") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: David Ahern <dsahern@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-04-24 This series contains fixes to ixgbevf, igb and ice drivers. Colin Ian King fixes the return value on error for the new XDP support that went into ixgbevf for 4.17. Vinicius provides a fix for queue 0 for igb, which was not receiving all the credits it needed when QAV mode was enabled. Anirudh provides several fixes for the new ice driver, starting with properly initializing num_nodes_added to zero. Fixed up a code comment to better reflect what is really going on in the code. Fixed how to detect if an OICR interrupt has occurred to a more reliable method. Md Fahad fixes the ice driver to allocate the right amount of memory when reading and storing the devices MAC addresses. The device can have up to 2 MAC addresses (LAN and WoL), while WoL is currently not supported, we need to ensure it can be properly handled when support is added. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24net/tls: remove redundant second null check on sgoutColin Ian King
A duplicated null check on sgout is redundant as it is known to be already true because of the identical earlier check. Remove it. Detected by cppcheck: net/tls/tls_sw.c:696: (warning) Identical inner 'if' condition is always true. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24fsl/fman_port: remove redundant check on port->rev_info.majorColin Ian King
The check port->rev_info.major >= 6 is being performed twice, thus the inner second check is always true and is redundant, hence it can be removed. Detected by cppcheck. drivers/net/ethernet/freescale/fman/fman_port.c:1394]: (warning) Identical inner 'if' condition is always true. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ice: Fix insufficient memory issue in ice_aq_manage_mac_readMd Fahad Iqbal Polash
For the MAC read operation, the device can return up to two (LAN and WoL) MAC addresses. Without access to adequate memory, the device will return an error. Fixed this by allocating the right amount of memory. Also, logic to detect and copy the LAN MAC address into the port_info structure has been added. Note that the WoL MAC address is ignored currently as the WoL feature isn't supported yet. Fixes: dc49c7723676 ("ice: Get MAC/PHY/link info and scheduler topology") Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-24qed: Fix copying 2 stringsDenis Bolotin
The strscpy() was a recent fix (net: qed: use correct strncpy() size) to prevent passing the length of the source buffer to strncpy() and guarantee null termination. It misses the goal of overwriting only the first 3 characters in "???_BIG_RAM" and "???_RAM" while keeping the rest of the string. Use strncpy() with the length of 3, without null termination. Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24sfc: ARFS filter IDsEdward Cree
Associate an arbitrary ID with each ARFS filter, allowing to properly query for expiry. The association is maintained in a hash table, which is protected by a spinlock. v3: fix build warnings when CONFIG_RFS_ACCEL is disabled (thanks lkp-robot). v2: fixed uninitialised variable (thanks davem and lkp-robot). Fixes: 3af0f34290f6 ("sfc: replace asynchronous filter operations") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24Merge branch 'ipconfig-NTP-server-support-bug-fixes-documentation-improvements'David S. Miller
Chris Novakovic says: ==================== ipconfig: NTP server support, bug fixes, documentation improvements This series (against net-next) makes various improvements to ipconfig: - Patch #1 correctly documents the behaviour of parameter 4 in the "ip=" and "nfsaddrs=" command line parameter. - Patch #2 tidies up the printk()s for reporting configured name servers. - Patch #3 fixes a bug in autoconfiguration via BOOTP whereby the IP addresses of IEN-116 name servers are requested from the BOOTP server, rather than those of DNS name servers. - Patch #4 requests the number of DNS servers specified by CONF_NAMESERVERS_MAX when autoconfiguring via BOOTP, rather than hardcoding it to 2. - Patch #5 fully documents the contents and format of /proc/net/pnp in Documentation/filesystems/nfs/nfsroot.txt. - Patch #6 fixes a bug whereby bogus information is written to /proc/net/pnp when ipconfig is not used. - Patch #7 creates a new procfs directory for ipconfig-related configuration reports at /proc/net/ipconfig. - Patch #8 allows for NTP servers to be configured (manually on the kernel command line or automatically via DHCP), enabling systems with an NFS root filesystem to synchronise their clock before mounting their root filesystem. NTP server IP addresses are written to /proc/net/ipconfig/ntp_servers. Changes from v1: - David requested that a new directory /proc/net/ipconfig be created to contain ipconfig-related configuration reports, which is implemented in the new patch #7. NTP server IPs are now written to this directory instead of /proc/net/ntp in the new patch #8. - Cong and David both requested that the modification to CREDITS be dropped. This patch has been removed from the series. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ipconfig: Write NTP server IPs to /proc/net/ipconfig/ntp_serversChris Novakovic
Distributed filesystems are most effective when the server and client clocks are synchronised. Embedded devices often use NFS for their root filesystem but typically do not contain an RTC, so the clocks of the NFS server and the embedded device will be out-of-sync when the root filesystem is mounted (and may not be synchronised until late in the boot process). Extend ipconfig with the ability to export IP addresses of NTP servers it discovers to /proc/net/ipconfig/ntp_servers. They can be supplied as follows: - If ipconfig is configured manually via the "ip=" or "nfsaddrs=" kernel command line parameters, one NTP server can be specified in the new "<ntp0-ip>" parameter. - If ipconfig is autoconfigured via DHCP, request DHCP option 42 in the DHCPDISCOVER message, and record the IP addresses of up to three NTP servers sent by the responding DHCP server in the subsequent DHCPOFFER message. ipconfig will only write the NTP server IP addresses it discovers to /proc/net/ipconfig/ntp_servers, one per line (in the order received from the DHCP server, if DHCP autoconfiguration is used); making use of these NTP servers is the responsibility of a user space process (e.g. an initrd/initram script that invokes an NTP client before mounting an NFS root filesystem). Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ipconfig: Create /proc/net/ipconfig directoryChris Novakovic
To allow ipconfig to report IP configuration details to user space processes without cluttering /proc/net, create a new subdirectory /proc/net/ipconfig. All files containing IP configuration details should be written to this directory. Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ipconfig: Correctly initialise ic_nameserversChris Novakovic
ic_nameservers, which stores the list of name servers discovered by ipconfig, is initialised (i.e. has all of its elements set to NONE, or 0xffffffff) by ic_nameservers_predef() in the following scenarios: - before the "ip=" and "nfsaddrs=" kernel command line parameters are parsed (in ip_auto_config_setup()); - before autoconfiguring via DHCP or BOOTP (in ic_bootp_init()), in order to clear any values that may have been set after parsing "ip=" or "nfsaddrs=" and are no longer needed. This means that ic_nameservers_predef() is not called when neither "ip=" nor "nfsaddrs=" is specified on the kernel command line. In this scenario, every element in ic_nameservers remains set to 0x00000000, which is indistinguishable from ANY and causes pnp_seq_show() to write the following (bogus) information to /proc/net/pnp: #MANUAL nameserver 0.0.0.0 nameserver 0.0.0.0 nameserver 0.0.0.0 This is potentially problematic for systems that blindly link /etc/resolv.conf to /proc/net/pnp. Ensure that ic_nameservers is also initialised when neither "ip=" nor "nfsaddrs=" are specified by calling ic_nameservers_predef() in ip_auto_config(), but only when ip_auto_config_setup() was not called earlier. This causes the following to be written to /proc/net/pnp, and is consistent with what gets written when ipconfig is configured manually but no name servers are specified on the kernel command line: #MANUAL Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ipconfig: Document /proc/net/pnpChris Novakovic
Fully document the format used by the /proc/net/pnp file written by ipconfig, explain where its values originate from, and clarify that the tertiary name server IP and DNS domain name are only written to the file when autoconfiguration is used. Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ipconfig: BOOTP: Request CONF_NAMESERVERS_MAX name serversChris Novakovic
When ipconfig is autoconfigured via BOOTP, the request packet initialised by ic_bootp_init_ext() always allocates 8 bytes for the name server option, limiting the BOOTP server to responding with at most 2 name servers even though ipconfig in fact supports an arbitrary number of name servers (as defined by CONF_NAMESERVERS_MAX, which is currently 3). Only request name servers in the request packet if CONF_NAMESERVERS_MAX is positive (to comply with [1, §3.8]), and allocate enough space in the packet for CONF_NAMESERVERS_MAX name servers to indicate the maximum number we can accept in response. [1] RFC 2132, "DHCP Options and BOOTP Vendor Extensions": https://tools.ietf.org/rfc/rfc2132.txt Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ipconfig: BOOTP: Don't request IEN-116 name serversChris Novakovic
When ipconfig is autoconfigured via BOOTP, the request packet initialised by ic_bootp_init_ext() allocates 8 bytes for tag 5 ("Name Server" [1, §3.7]), but tag 5 in the response isn't processed by ic_do_bootp_ext(). Instead, allocate the 8 bytes to tag 6 ("Domain Name Server" [1, §3.8]), which is processed by ic_do_bootp_ext(), and appears to have been the intended tag to request. This won't cause any breakage for existing users, as tag 5 responses provided by BOOTP servers weren't being processed anyway. [1] RFC 2132, "DHCP Options and BOOTP Vendor Extensions": https://tools.ietf.org/rfc/rfc2132.txt Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ipconfig: Tidy up reporting of name serversChris Novakovic
Commit 5e953778a2aab04929a5e7b69f53dc26e39b079e ("ipconfig: add nameserver IPs to kernel-parameter ip=") adds the IP addresses of discovered name servers to the summary printed by ipconfig when configuration is complete. It appears the intention in ip_auto_config() was to print the name servers on a new line (especially given the spacing and lack of comma before "nameserver0="), but they're actually printed on the same line as the NFS root filesystem configuration summary: [ 0.686186] IP-Config: Complete: [ 0.686226] device=eth0, hwaddr=xx:xx:xx:xx:xx:xx, ipaddr=10.0.0.2, mask=255.255.255.0, gw=10.0.0.1 [ 0.686328] host=test, domain=example.com, nis-domain=(none) [ 0.686386] bootserver=10.0.0.1, rootserver=10.0.0.1, rootpath= nameserver0=10.0.0.1 This makes it harder to read and parse ipconfig's output. Instead, print the name servers on a separate line: [ 0.791250] IP-Config: Complete: [ 0.791289] device=eth0, hwaddr=xx:xx:xx:xx:xx:xx, ipaddr=10.0.0.2, mask=255.255.255.0, gw=10.0.0.1 [ 0.791407] host=test, domain=example.com, nis-domain=(none) [ 0.791475] bootserver=10.0.0.1, rootserver=10.0.0.1, rootpath= [ 0.791476] nameserver0=10.0.0.1 Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24ipconfig: Document setting of NIS domain nameChris Novakovic
ic_do_bootp_ext() is responsible for parsing the "ip=" and "nfsaddrs=" kernel parameters. If a "." character is found in parameter 4 (the client's hostname), everything before the first "." is used as the hostname, and everything after it is used as the NIS domain name (but not necessarily the DNS domain name). Document this behaviour in Documentation/filesystems/nfs/nfsroot.txt, as it is not made explicit. Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-24net: ethtool: Add missing kernel doc for FEC parametersFlorian Fainelli
While adding support for ethtool::get_fecparam and set_fecparam, kernel doc for these functions was missed, add those. Fixes: 1a5f3da20bd9 ("net: ethtool: add support for forward error correction modes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>